Add translation scheme for HLSL-style input/outputs
Add design doc describing how to translate from Vulkan-style SPIR-V
inputs/outputs to WGSL-style inputs as shader entry point parameters,
and outputs as the shader return value.
Change-Id: If0dc5f07542ccf2db95ed648a34f73a06e20a5a4
Reviewed-on: https://dawn-review.googlesource.com/c/tint/+/38961
Reviewed-by: Alan Baker <alanbaker@google.com>
Commit-Queue: David Neto <dneto@google.com>
diff --git a/docs/spirv-input-output-variables.md b/docs/spirv-input-output-variables.md
new file mode 100644
index 0000000..4814dda
--- /dev/null
+++ b/docs/spirv-input-output-variables.md
@@ -0,0 +1,264 @@
+# SPIR-V translation of shader input and output variables
+
+WGSL [MR 1315](https://github.com/gpuweb/gpuweb/issues/1315) changed WGSL so
+that pipeline inputs and outputs are handled similarly to HLSL:
+
+- Shader pipeline inputs are the WGSL entry point function arguments.
+- Shader pipeline outputs are the WGSL entry point return value.
+
+Note: In both cases, a struct may be used to pack multiple values together.
+In that case, I/O-specific attributes appear on struct members at the struct declaration.
+
+Resource variables, e.g. buffers, samplers, and textures, are still declared
+as variables at module scope.
+
+## Vulkan SPIR-V today
+
+SPIR-V for Vulkan models inputs and outputs as module-scope variables in
+the Input and Output storage classes, respectively.
+
+The `OpEntryPoint` instruction has a list of module-scope variables that must
+be a superset of all the input and output variables that are statically
+accessed in the shader call tree.
+From SPIR-V 1.4 onward, all interface variables that might be statically accessed
+must appear on that list.
+That includes all resource variables that might be statically accessed
+by the shader call tree.
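+
+As a rough sketch, the interface of a fragment shader with one input and one
+output, both at location 0, might look like this in SPIR-V assembly.
+The IDs (`%main`, `%frag_colour`, `%the_colour`, and the type names) are
+chosen for illustration only, and the function body is omitted.
+
+```asm
+               OpCapability Shader
+               OpMemoryModel Logical GLSL450
+               ; The entry point lists the interface variables it may access.
+               OpEntryPoint Fragment %main "main" %frag_colour %the_colour
+               OpExecutionMode %main OriginUpperLeft
+               OpDecorate %frag_colour Location 0
+               OpDecorate %the_colour Location 0
+       %void = OpTypeVoid
+     %voidfn = OpTypeFunction %void
+      %float = OpTypeFloat 32
+    %v4float = OpTypeVector %float 4
+ %ptr_out_v4 = OpTypePointer Output %v4float
+  %ptr_in_v4 = OpTypePointer Input %v4float
+               ; Inputs and outputs are module-scope variables.
+%frag_colour = OpVariable %ptr_out_v4 Output
+ %the_colour = OpVariable %ptr_in_v4 Input
+```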
+
+## Translation scheme for SPIR-V to WGSL
+
+A translation scheme from SPIR-V to WGSL is as follows:
+
+Each SPIR-V entry point maps to a set of Private variables proxying the
+inputs and outputs, and two functions:
+
+- An inner function with no arguments or return values, and whose body
+ is the same as the original SPIR-V entry point.
+- Original input variables are mapped to pseudo-in Private variables
+ with the same store types, but no other attributes or properties are copied.
+- Original output variables are mapped to pseudo-out Private variables
+ with the same store types, but no other attributes or properties are copied.
+- A wrapper entry point function whose arguments correspond in type, location,
+  and builtin attributes to the original input variables, and whose return type is
+  a structure containing members that correspond in type, location, and builtin
+  attributes to the original output variables.
+  The body of the wrapper function has the following phases:
+  - Copy formal parameter values into pseudo-in variables.
+  - Use stores to initialize pseudo-out variables:
+    - If the original variable had an initializer, store that value.
+    - Otherwise, store a zero value for the store type.
+  - Execute the inner function.
+  - Copy pseudo-out variables into the return structure.
+  - Return the return structure.
+
+- Replace uses of the original input/output variables with the pseudo-in and
+  pseudo-out variables, respectively.
+- Remap pointer-to-Input to pointer-to-Private.
+- Remap pointer-to-Output to pointer-to-Private.
+
+We are not concerned with the cost of the extra copying of input/output values.
+First, the pipeline inputs/outputs tend to be small.
+Second, we expect the backend compiler in the driver will be able to see
+through the copying and optimize the result.
+
+### Example
+
+
+```glsl
+ #version 450
+
+ layout(location = 0) out vec4 frag_colour;
+ layout(location = 0) in vec4 the_colour;
+
+ void bar() {
+   frag_colour = the_colour;
+ }
+
+ void main() {
+   bar();
+ }
+```
+
+Current translation, through SPIR-V, SPIR-V reader, WGSL writer:
+
+```groovy
+ [[location(0)]] var<out> frag_colour : vec4<f32>;
+ [[location(0)]] var<in> the_colour : vec4<f32>;
+
+ fn bar_() -> void {
+   const x_14 : vec4<f32> = the_colour;
+   frag_colour = x_14;
+   return;
+ }
+
+ [[stage(fragment)]]
+ fn main() -> void {
+   bar_();
+   return;
+ }
+```
+
+Proposed translation, through SPIR-V, SPIR-V reader, WGSL writer:
+
+```groovy
+ // 'in' and 'out' variables are now 'private'.
+ var<private> frag_colour : vec4<f32>;
+ var<private> the_colour : vec4<f32>;
+
+ fn bar_() -> void {
+   // Accesses to the module-scope variables do not change.
+   // This is a big simplifying advantage.
+   const x_14 : vec4<f32> = the_colour;
+   frag_colour = x_14;
+   return;
+ }
+
+ fn main_inner() -> void {
+   bar_();
+   return;
+ }
+
+ // Declare a structure type to collect the return values.
+ struct main_result_type {
+   [[location(0)]] frag_colour : vec4<f32>;
+ };
+
+ [[stage(fragment)]]
+ fn main(
+
+   // 'in' variables are entry point parameters
+   [[location(0)]] the_colour_arg : vec4<f32>
+
+ ) -> main_result_type {
+
+   // Save 'in' arguments to 'private' variables.
+   the_colour = the_colour_arg;
+
+   // Initialize 'out' variables.
+   // Use the zero value, since no initializer was specified.
+   frag_colour = vec4<f32>();
+
+   // Invoke the original entry point.
+   main_inner();
+
+   // Collect outputs into a structure and return it.
+   var result : main_result_type;
+   result.frag_colour = frag_colour;
+   return result;
+ }
+```
+
+Alternatively, we could emit the body of the original entry point at
+the point of invocation.
+However, that is more complex because the original entry point function
+may return from multiple locations, and we would like to have only
+a single exit path to construct and return the result value.
+
+### Handling fragment discard
+
+In SPIR-V, `OpKill` causes immediate termination of the shader.
+Is the shader obligated to write its outputs when `OpKill` is executed?
+
+The Vulkan fragment operations are as follows
+(see [6. Fragment operations](https://www.khronos.org/registry/vulkan/specs/1.2/html/vkspec.html#fragops)):
+
+* Scissor test
+* Sample mask test
+* Fragment shading
+* Multisample coverage
+* Depth bounds test
+* Stencil test
+* Depth test
+* Sample counting
+* Coverage reduction
+
+After that, the fragment results are used to update output attachments, including
+colour, depth, and stencil attachments.
+
+Vulkan says:
+
+> If a fragment operation results in all bits of the coverage mask being 0,
+> the fragment is discarded, and no further operations are performed.
+> Fragments can also be programmatically discarded in a fragment shader by executing one of
+>
+> OpKill.
+
+I interpret this to mean that the outputs of a discarded fragment are ignored.
+
+Therefore, `OpKill` does not require us to modify the basic scheme from the previous
+section.
+
+The `OpDemoteToHelperInvocationEXT`
+instruction is an alternative way to throw away a fragment, but one that
+does not immediately terminate execution of the invocation.
+It is introduced in the [`SPV_EXT_demote_to_helper_invocation`](http://htmlpreview.github.io/?https://github.com/KhronosGroup/SPIRV-Registry/blob/master/extensions/EXT/SPV_EXT_demote_to_helper_invocation.html)
+extension. WGSL does not have this feature, but we expect it will be introduced by a
+future WGSL extension. The same analysis applies to demote-to-helper. When introduced,
+it will not affect translation of pipeline outputs.
+
+### Handling depth-replacing mode
+
+A Vulkan fragment shader must write to the fragment depth builtin if and only if it
+has a `DepthReplacing` execution mode. Otherwise, behaviour is undefined.
+
+We will ignore the case where the SPIR-V shader writes to the `FragDepth` builtin
+and then discards the fragment.
+This is justified because "no further operations" are performed by the pipeline
+after the fragment is discarded, and that includes writing to depth output attachments.
+
+Assuming the shader is valid, no special translation is required.
+
+### Handling output sample mask
+
+By the same reasoning as for depth-replacing, it is ok to incidentally not write
+to the sample-mask builtin variable when the fragment is discarded.
+
+### Handling clip distance and cull distance
+
+Most builtin variables are scalars or vectors.
+However, the `ClipDistance` and `CullDistance` builtin variables are arrays of 32-bit float values.
+Each entry defines a clip half-plane (respectively cull half-plane).
+A Vulkan implementation must support array sizes of up to 8 elements.
+
+How prevalent are shaders that use these features?
+These variables are supported when Vulkan features `shaderClipDistance` and `shaderCullDistance`
+are supported.
+According to gpuinfo.org as of this writing, those
+Vulkan features appear to be nearly universally supported on Windows devices (>99%),
+but on only 70% of Android devices.
+It appears that Qualcomm devices support them, but Mali devices do not (e.g. Mali-G77).
+
+The proposed translation scheme forces a copy of each array from private
+variables into the return value of a vertex shader, or into a private
+variable of a fragment shader.
+In addition to the register pressure, there may be a performance degradation
+due to the bulk copying of data.
+
+We think this is an acceptable tradeoff for the gain in usability and
+consistency with other pipeline inputs and outputs.
+
+## Translation scheme for WGSL AST to SPIR-V
+
+To translate from the WGSL AST to SPIR-V, do the following:
+
+- Each entry point formal parameter is mapped to a SPIR-V `Input` variable.
+  - Struct and array inputs may have to be broken down into individual variables.
+- The return of the entry point is broken down into fields, with one
+  `Output` variable per field.
+- In the above, builtins must be separated from user attributes.
+  - Builtin attributes are moved to the corresponding variable.
+  - Location and interpolation attributes are moved to the corresponding
+    variables.
+- This translation relies on the fact that pipeline inputs and pipeline
+ outputs are IO-shareable types. IO-shareable types are always storable,
+ and can be the store type of input/output variables.
+- Input function parameters will be automatically initialized by the system
+ as part of setting up the pipeline inputs to the entry point.
+- Replace each return statement in the entry point with a code sequence
+ which writes the return value components to the synthesized output variables,
+ and then executes an `OpReturn` (without value).
+
+This translation is sufficient even for fragment shaders with discard.
+In that case, outputs will be ignored because downstream pipeline
+operations will not be performed.
+This is the same rationale as for translation from SPIR-V to WGSL AST.
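+
+A minimal sketch of the result, assuming a hypothetical fragment entry point
+that takes one `[[location(0)]]` parameter of type `vec4<f32>` and returns a
+struct with one `[[location(0)]]` member holding the same value.
+All IDs (`%colour_in`, `%colour_out`, and so on) are made up for illustration.
+
+```asm
+               OpCapability Shader
+               OpMemoryModel Logical GLSL450
+               OpEntryPoint Fragment %main "main" %colour_in %colour_out
+               OpExecutionMode %main OriginUpperLeft
+               OpDecorate %colour_in Location 0
+               OpDecorate %colour_out Location 0
+       %void = OpTypeVoid
+     %voidfn = OpTypeFunction %void
+      %float = OpTypeFloat 32
+    %v4float = OpTypeVector %float 4
+  %ptr_in_v4 = OpTypePointer Input %v4float
+ %ptr_out_v4 = OpTypePointer Output %v4float
+               ; Synthesized from the entry point parameter.
+  %colour_in = OpVariable %ptr_in_v4 Input
+               ; Synthesized from the return struct member.
+ %colour_out = OpVariable %ptr_out_v4 Output
+       %main = OpFunction %void None %voidfn
+      %entry = OpLabel
+      %value = OpLoad %v4float %colour_in
+               ; The WGSL return statement becomes a store to each output
+               ; variable, followed by OpReturn without a value.
+               OpStore %colour_out %value
+               OpReturn
+               OpFunctionEnd
+```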