|  | # SPIR-V translation of shader input and output variables | 
|  |  | 
|  | WGSL [MR 1315](https://github.com/gpuweb/gpuweb/issues/1315) changed WGSL so | 
|  | that pipeline inputs and outputs are handled similar to HLSL: | 
|  |  | 
|  | - Shader pipeline inputs are the WGSL entry point function arguments. | 
|  | - Shader pipeline outputs are the WGSL entry point return value. | 
|  |  | 
|  | Note: In both cases, a struct may be used to pack multiple values together. | 
|  | In that case, I/O specific attributes appear on struct members at the struct declaration. | 
|  |  | 
|  | Resource variables, e.g. buffers, samplers, and textures, are still declared | 
|  | as variables at module scope. | 
|  |  | 
|  | ## Vulkan SPIR-V today | 
|  |  | 
|  | SPIR-V for Vulkan models inputs and outputs as module-scope variables in | 
|  | the Input and Output storage classes, respectively. | 
|  |  | 
|  | The `OpEntryPoint` instruction has a list of module-scope variables that must | 
|  | be a superset of all the input and output variables that are statically | 
|  | accessed in the shader call tree. | 
|  | From SPIR-V 1.4 onward, all interface variables that might be statically accessed | 
|  | must appear on that list. | 
|  | So that includes all resource variables that might be statically accessed | 
|  | by the shader call tree. | 
|  |  | 
|  | ## Translation scheme for SPIR-V to WGSL | 
|  |  | 
|  | A translation scheme from SPIR-V to WGSL is as follows: | 
|  |  | 
|  | Each SPIR-V entry point maps to a set of Private variables proxying the | 
|  | inputs and outputs, and two functions: | 
|  |  | 
|  | - An inner function with no arguments or return values, and whose body | 
|  | is the same as the original SPIR-V entry point. | 
|  | - Original input variables are mapped to pseudo-in Private variables | 
|  | with the same store types, but no other attributes or properties copied. | 
|  | In Vulkan, Input variables don't have initalizers. | 
|  | - Original output variables are mapped to pseudo-out Private variables | 
|  | with the same store types and optional initializer, but no other attributes | 
|  | or properties are copied. | 
|  | - A wrapper entry point function whose arguments correspond in type, location | 
|  | and builtin attributes the original input variables, and whose return type is | 
|  | a structure containing members correspond in type, location, and builtin | 
|  | attributes to the original output variables. | 
|  | The body of the wrapper function the following phases: | 
|  | - Copy formal parameter values into pseudo-in variables. | 
|  | - Insert a bitcast if the WGSL builtin variable has different signedness | 
|  | from the SPIR-V declared type. | 
|  | - Execute the inner function. | 
|  | - Copy pseudo-out variables into the return structure. | 
|  | - Insert a bitcast if the WGSL builtin variable has different signedness | 
|  | from the SPIR-V declared type. | 
|  | - Return the return structure. | 
|  |  | 
|  | - Replace uses of the the original input/output variables to the pseudo-in and | 
|  | pseudo-out variables, respectively. | 
|  | - Remap pointer-to-Input with pointer-to-Private | 
|  | - Remap pointer-to-Output with pointer-to-Private | 
|  |  | 
|  | We are not concerned with the cost of extra copying input/output values. | 
|  | First, the pipeline inputs/outputs tend to be small. | 
|  | Second, we expect the backend compiler in the driver will be able to see | 
|  | through the copying and optimize the result. | 
|  |  | 
|  | ### Example | 
|  |  | 
|  |  | 
|  | ```glsl | 
|  | #version 450 | 
|  |  | 
|  | layout(location = 0) out vec4 frag_colour; | 
|  | layout(location = 0) in vec4 the_colour; | 
|  |  | 
|  | void bar() { | 
|  | frag_colour = the_colour; | 
|  | } | 
|  |  | 
|  | void main() { | 
|  | bar(); | 
|  | } | 
|  | ``` | 
|  |  | 
|  | Current translation, through SPIR-V, SPIR-V reader, WGSL writer: | 
|  |  | 
|  | ```groovy | 
|  | @location(0) var<out> frag_colour : vec4<f32>; | 
|  | @location(0) var<in> the_colour : vec4<f32>; | 
|  |  | 
|  | fn bar_() -> void { | 
|  | const x_14 : vec4<f32> = the_colour; | 
|  | frag_colour = x_14; | 
|  | return; | 
|  | } | 
|  |  | 
|  | @fragment | 
|  | fn main() -> void { | 
|  | bar_(); | 
|  | return; | 
|  | } | 
|  | ``` | 
|  |  | 
|  | Proposed translation, through SPIR-V, SPIR-V reader, WGSL writer: | 
|  |  | 
|  | ```groovy | 
|  | // 'in' variables are now 'private'. | 
|  | var<private> frag_colour : vec4<f32>; | 
|  | var<private> the_colour : vec4<f32>; | 
|  |  | 
|  | fn bar_() -> void { | 
|  | // Accesses to the module-scope variables do not change. | 
|  | // This is a big simplifying advantage. | 
|  | const x_14 : vec4<f32> = the_colour; | 
|  | frag_colour = x_14; | 
|  | return; | 
|  | } | 
|  |  | 
|  | fn main_inner() -> void { | 
|  | bar_(); | 
|  | return; | 
|  | } | 
|  |  | 
|  | // Declare a structure type to collect the return values. | 
|  | struct main_result_type { | 
|  | @location(0) frag_color : vec4<f32>; | 
|  | }; | 
|  |  | 
|  | @fragment | 
|  | fn main( | 
|  |  | 
|  | // 'in' variables are entry point parameters | 
|  | @location(0) the_color_arg : vec4<f32> | 
|  |  | 
|  | ) -> main_result_type { | 
|  |  | 
|  | // Save 'in' arguments to 'private' variables. | 
|  | the_color = the_color_arg; | 
|  |  | 
|  | // Initialize 'out' variables. | 
|  | // Use the zero value, since no initializer was specified. | 
|  | frag_color = vec4<f32>(); | 
|  |  | 
|  | // Invoke the original entry point. | 
|  | main_inner(); | 
|  |  | 
|  | // Collect outputs into a structure and return it. | 
|  | var result : main_outer_result_type; | 
|  | result.frag_color = frag_color; | 
|  | return result; | 
|  | } | 
|  | ``` | 
|  |  | 
|  | Alternately, we could emit the body of the original entry point at | 
|  | the point of invocation. | 
|  | However that is more complex because the original entry point function | 
|  | may return from multiple locations, and we would like to have only | 
|  | a single exit path to construct and return the result value. | 
|  |  | 
|  | ### Handling fragment discard | 
|  |  | 
|  | In SPIR-V `OpKill` causes immediate termination of the shader. | 
|  | Is the shader obligated to write its outputs when `OpKill` is executed? | 
|  |  | 
|  | The Vulkan fragment operations are as follows: | 
|  | (see [6. Fragment operations](https://www.khronos.org/registry/vulkan/specs/1.2/html/vkspec.html#fragops)). | 
|  |  | 
|  | * Scissor test | 
|  | * Sample mask test | 
|  | * Fragment shading | 
|  | * Multisample coverage | 
|  | * Depth bounds test | 
|  | * Stencil test | 
|  | * Depth test | 
|  | * Sample counting | 
|  | * Coverage reduction | 
|  |  | 
|  | After that, the fragment results are used to update output attachments, including | 
|  | colour, depth, and stencil attachments. | 
|  |  | 
|  | Vulkan says: | 
|  |  | 
|  | > If a fragment operation results in all bits of the coverage mask being 0, | 
|  | > the fragment is discarded, and no further operations are performed. | 
|  | > Fragments can also be programmatically discarded in a fragment shader by executing one of | 
|  | > | 
|  | >     OpKill. | 
|  |  | 
|  | I interpret this to mean that the outputs of a discarded fragment are ignored. | 
|  |  | 
|  | Therefore, `OpKill` does not require us to modify the basic scheme from the previous | 
|  | section. | 
|  |  | 
|  | The `OpDemoteToHelperInvocationEXT` | 
|  | instruction is an alternative way to throw away a fragment, but which | 
|  | does not immediately terminate execution of the invocation. | 
|  | It is introduced in the [`SPV_EXT_demote_to_helper_invocation](http://htmlpreview.github.io/?https://github.com/KhronosGroup/SPIRV-Registry/blob/master/extensions/EXT/SPV_EXT_demote_to_helper_invocation.html) | 
|  | extension.  WGSL does not have this feature, but we expect it will be introduced by a | 
|  | future WGSL extension.  The same analysis applies to demote-to-helper.  When introduced, | 
|  | it will not affect translation of pipeline outputs. | 
|  |  | 
|  | ### Handling depth-replacing mode | 
|  |  | 
|  | A Vulkan fragment shader must write to the fragment depth builtin if and only if it | 
|  | has a `DepthReplacing` execution mode. Otherwise behaviour is undefined. | 
|  |  | 
|  | We will ignore the case where the SPIR-V shader writes to the `FragDepth` builtin | 
|  | and then discards the fragment. | 
|  | This is justified because "no further operations" are performed by the pipeline | 
|  | after the fragment is discarded, and that includes writing to depth output attachments. | 
|  |  | 
|  | Assuming the shader is valid, no special translation is required. | 
|  |  | 
|  | ### Handling output sample mask | 
|  |  | 
|  | By the same reasoning as for depth-replacing, it is ok to incidentally not write | 
|  | to the sample-mask builtin variable when the fragment is discarded. | 
|  |  | 
|  | ### Handling clip distance and cull distance | 
|  |  | 
|  | Most builtin variables are scalars or vectors. | 
|  | However, the `ClipDistance` and `CullDistance` builtin variables are arrays of 32-bit float values. | 
|  | Each entry defines a clip half-plane (respectively cull half-plane) | 
|  | A Vulkan implementation must support array sizes of up to 8 elements. | 
|  |  | 
|  | How prevalent are shaders that use these features? | 
|  | These variables are supported when Vulkan features `shaderClipDistance` and `shaderCullDistance` | 
|  | are supported. | 
|  | According to gpuinfo.org as of this writing, those | 
|  | Vulkan features appear to be nearly universally supported on Windows devices (>99%), | 
|  | but by only 70% on Android. | 
|  | It appears that Qualcomm devices support them, but Mali devices do not (e.g. Mali-G77). | 
|  |  | 
|  | The proposed translation scheme forces a copy of each array from private | 
|  | variables into the return value of a vertex shader, or into a private | 
|  | variable of a fragment shader. | 
|  | In addition to the register pressure, there may be a performance degradation | 
|  | due to the bulk copying of data. | 
|  |  | 
|  | We think this is an acceptable tradeoff for the gain in usability and | 
|  | consistency with other pipeline inputs and outputs. | 
|  |  | 
|  | ## Translation scheme for WGSL AST to SPIR-V | 
|  |  | 
|  | To translate from the WGSL AST to SPIR-V, do the following: | 
|  |  | 
|  | - Each entry point formal parameter is mapped to a SPIR-V `Input` variable. | 
|  | - Struct and array inputs may have to be broken down into individual variables. | 
|  | - The return of the entry point is broken down into fields, with one | 
|  | `Output` variable per field. | 
|  | - In the above, builtins must be separated from user attributes. | 
|  | - Builtin attributes are moved to the corresponding variable. | 
|  | - Location and interpolation attributes are moved to the corresponding | 
|  | variables. | 
|  | - This translation relies on the fact that pipeline inputs and pipeline | 
|  | outputs are IO-shareable types. IO-shareable types are always storable, | 
|  | and can be the store type of input/output variables. | 
|  | - Input function parameters will be automatically initialized by the system | 
|  | as part of setting up the pipeline inputs to the entry point. | 
|  | - Replace each return statement in the entry point with a code sequence | 
|  | which writes the return value components to the synthesized output variables, | 
|  | and then executes an `OpReturn` (without value). | 
|  |  | 
|  | This translation is sufficient even for fragment shaders with discard. | 
|  | In that case, outputs will be ignored because downstream pipeline | 
|  | operations will not be performed. | 
|  | This is the same rationale as for translation from SPIR-V to WGSL AST. |