Blame - docs/tint/spirv-reader-overview.md - dawn

blob: dc9031bf7dda952c920f9a80e3a5d890e89c0051 [file] [log] [blame] [view]

David Neto	0e6d95b	2022-11-01 03:54:42 +0000	[diff] [blame]	1	# Converting SPIR-V to WGSL
				2
				3	This document describes the challenges in converting SPIR-V into WGSL.
				4
				5	Note: Unless otherwise specified, the namespace for C++ code is
dan sinclair	eb66431	2023-07-27 21:29:56 +0000	[diff] [blame^]	6	`tint::spirv::reader::`.
David Neto	0e6d95b	2022-11-01 03:54:42 +0000	[diff] [blame]	7
				8	## Overall flow
				9
				10	1. Validate the SPIR-V input.
				11
				12	The SPIR-V module (binary blob) is validated against rules for
				13	Vulkan 1.1, using the SPIRV-Tools validator.
				14
				15	This allows the rest of the flow to ignore invalid inputs.
				16	However, the SPIR-V might still be rejected in a later step because:
				17
				18	- it uses features unavailable in WGSL, or
				19	- the SPIR-V Reader is insufficiently smart, or
				20	- the translated program tries to do something rejected by WGSL's rules
				21	(which are checked by Tint's Resolver).
				22
				23	2. Load the SPIR-V binary into an in-memory representation.
				24
				25	The SPIR-V reader uses the in-memory representation of the SPIR-V
				26	module defined by the SPIRV-Tools optimizer. That provides
				27	convenient representation of basic structures such as:
				28
				29	- instructions
				30	- types
				31	- constants
				32	- functions
				33	- basic blocks
				34
				35	and provides analyses for:
				36
				37	- relating definitions to uses (spvtools::opt::analysis::DefUseMgr)
				38	- types (spvtools::opt::analysis:TypeManager)
				39	- constants (spvtools::opt::analysis:ConstantManager)
				40
				41	Note: The SPIR-V is not modified by the SPIR-V Reader.
				42
				43	3. Translate the SPIR-V module into Tint's AST.
				44
				45	The AST is valid for WGSL except for some small exceptions which are
				46	cleaned up by transformations.
				47
				48	4. Post-process the AST to make it valid for WGSL.
				49
				50	Example:
				51	- Rewrite strided arrays and matrices (remove `@stride` attribute)
				52	- Rewrite atomic functions
				53	- Remove unreachable statements, to satisfy WGSL's behaviour analysis.
				54
				55
				56	## Overcoming mismatches between SPIR-V and WGSL
				57
				58	### Remapping builtin inputs and outputs
				59
				60	SPIR-V for Vulkan models builtin inputs and outputs as variables
				61	in Input and Output storage classes.
				62
				63	WGSL builtin inputs are parameters to the entry point, and
				64	builtin outputs are result values of the entry point.
				65
				66	See [spirv-input-output-variables.md](spirv-input-output-variables.md)
				67
				68	### We only care about `gl_Position` from `gl_PerVertex`
				69
				70	Glslang SPIR-V output for a vertex shader has a `gl_PerVertex`
				71	output variable with four members:
				72
				73	- `gl_Position`
				74	- `gl_PointSize`
				75	- `gl_ClipDistance`
				76	- `gl_CullDistance`
				77
				78	WGSL only supports the `position` builtin variable.
				79
				80	The SPIR-V Reader has a bunch of carveouts so it only generates the
				81	position variable. In partcular, it tracks which expressions are actually
				82	accesses into the per-vertex variable, and ignores accesses to other
				83	parts of the structure, and remaps accesses of the position member.
				84
				85	### `gl_PointSize` must be 1.0
				86
				87	It's a WGSL rule. SPIR-V is more flexible, and the SPIR-V Reader
				88	checks that any assignment to (the equivalent of) `gl_PointSize`
				89	must the constant value 1.0.
				90
				91	### Remapping sample mask inputs and outputs
				92
				93	There's some shenanigans here I don't recall.
				94	See the SkipReason enum.
				95
				96	### Integer signedness
				97
				98	In SPIR-V, the instruction determines the signedness of an operation,
				99	not the types of its operands.
				100
				101	For example:
				102
				103	%uint = OpTypeInt 32 0 ; u32 type
				104	%int = OpTypeInt 32 1 ; i32 type
				105
				106	%int_1 = OpConstant %int 1 ; WGSL 1i
				107	%uint_2 = OpConstant %uint 2 ; WGSL 2u
				108
				109	; You can mix signs of an operand, and the instruction
				110	; tells you the result type.
				111	%sum_uint = OpIAdd %uint %int %int_1 %uint_2
				112	%sum_int = OpIAdd %int %int %int_1 %uint_2
				113
				114	However, WGSL arithmetic tends to require the operands and
				115	result type for an operation to all have the same signedness.
				116
				117	So the above might translate to WGSL as:
				118
				119	let sum_uint: u32 = bitcast<u32>(1i) + 2u;
				120	let sum_int: i32 = 1i + bitcast<i32>(2u);
				121
				122	See:
				123	* ParserImpl::RectifyOperandSignedness
				124	* ParserImpl::RectifySecondOperandSignedness
				125	* ParserImpl::RectifyForcedResultType
				126
				127	### Translating textures and samplers
				128
				129	SPIR-V textures and samplers are module-scope variables
				130	in UniformConstant storage class.
				131	These map directly to WGSL variables.
				132
				133	For a sampled-image operation, SPIR-V will:
				134	- load the image value from a texture variable
				135	- load the sampler value from a sampler variable
				136	- form a "sampled image" value using `SpvOpSampledImage`
				137	- then use that sampled image value in a image operation
				138	such as `SpvOpImageSampleImplicitLod`
				139
				140	For an image operation that is not a sampled-image operation
				141	(e.g. OpImageLoad or OpImageWrite), then the steps are similar
				142	except without a sampler (clearly), and without invoking
				143	`OpSampledImage`.
				144
				145	In contrast to the SPIR-V code pattern, the WGSL builtin requires
				146	the texture and sampler value to be passed in as separate parameters.
				147	Secondly, they are passed in by value, by naming the variables
				148	themselves and relying on WGSL's "Load Rule" to pass the handle
				149	value into the callee.
				150
				151	When the SPIR-V Reader translates a texture builtin, it traces
				152	backward through the `OpSampledImage` operation (if any),
				153	back through the load, and all the way back to the `OpVariable`
				154	declaration. It does this for both the image/texture variable and
				155	the sampler variable (if applicable). It then uses the names
				156	of those variables as the corresponding arguments to the WGSL
				157	texture builtin.
				158
				159	### Passing textures and samplers into helper functions
				160
David Neto	0e6d95b	2022-11-01 03:54:42 +0000	[diff] [blame]	161	Glslang generates SPIR-V where texture and sampler formal parameters
				162	are as pointer-to-UniformConstant.
				163
				164	WGSL models them as passing texture and sampler values themselves,
				165	conceptually as opaque handles. This is similar to GLSL, but unlike
				166	SPIR-V.
				167
David Neto	24c8440	2022-12-09 19:53:27 +0000	[diff] [blame]	168	To support textures and samplers as arguments to user-defined functions,
				169	we extend the tracing logic so it knows to bottom out at OpFunctionParameter.
				170
				171	Also, code that generates function declarations now understands formal
				172	parameters declared as a pointer to uniform-constant as
				173	well as direct image and sampler values.
David Neto	0e6d95b	2022-11-01 03:54:42 +0000	[diff] [blame]	174
				175	Example GLSL compute shader:
				176
				177	#version 450
				178
				179	layout(set=0,binding=0) uniform texture2D im;
				180	layout(set=0,binding=1) uniform sampler s;
				181
				182	vec4 helper(texture2D imparam, sampler sparam) {
				183	return texture(sampler2D(imparam,sparam),vec2(0));
				184	}
				185
				186	void main() {
				187	vec4 v = helper(im,s);
				188	}
				189
				190	SPIR-V generated by Glslang (Shaderc's glslc):
				191
				192	; SPIR-V
				193	; Version: 1.0
				194	; Generator: Google Shaderc over Glslang; 10
				195	; Bound: 32
				196	; Schema: 0
				197	OpCapability Shader
				198	%1 = OpExtInstImport "GLSL.std.450"
				199	OpMemoryModel Logical GLSL450
				200	OpEntryPoint GLCompute %main "main"
				201	OpExecutionMode %main LocalSize 1 1 1
				202	OpSource GLSL 450
				203	OpSourceExtension "GL_GOOGLE_cpp_style_line_directive"
				204	OpSourceExtension "GL_GOOGLE_include_directive"
				205	OpName %main "main"
				206	OpName %helper_t21_p1_ "helper(t21;p1;"
				207	OpName %imparam "imparam"
				208	OpName %sparam "sparam"
				209	OpName %v "v"
				210	OpName %im "im"
				211	OpName %s "s"
				212	OpDecorate %im DescriptorSet 0
				213	OpDecorate %im Binding 0
				214	OpDecorate %s DescriptorSet 0
				215	OpDecorate %s Binding 1
				216	%void = OpTypeVoid
				217	%3 = OpTypeFunction %void
				218	%float = OpTypeFloat 32
				219	%7 = OpTypeImage %float 2D 0 0 0 1 Unknown
				220	%_ptr_UniformConstant_7 = OpTypePointer UniformConstant %7
				221	%9 = OpTypeSampler
				222	%_ptr_UniformConstant_9 = OpTypePointer UniformConstant %9
				223	%v4float = OpTypeVector %float 4
				224	%12 = OpTypeFunction %v4float %_ptr_UniformConstant_7 %_ptr_UniformConstant_9
				225	%19 = OpTypeSampledImage %7
				226	%v2float = OpTypeVector %float 2
				227	%float_0 = OpConstant %float 0
				228	%23 = OpConstantComposite %v2float %float_0 %float_0
				229	%_ptr_Function_v4float = OpTypePointer Function %v4float
				230	%im = OpVariable %_ptr_UniformConstant_7 UniformConstant
				231	%s = OpVariable %_ptr_UniformConstant_9 UniformConstant
				232	%main = OpFunction %void None %3
				233	%5 = OpLabel
				234	%v = OpVariable %_ptr_Function_v4float Function
				235	%31 = OpFunctionCall %v4float %helper_t21_p1_ %im %s
				236	OpStore %v %31
				237	OpReturn
				238	OpFunctionEnd
				239	%helper_t21_p1_ = OpFunction %v4float None %12
				240	%imparam = OpFunctionParameter %_ptr_UniformConstant_7
				241	%sparam = OpFunctionParameter %_ptr_UniformConstant_9
				242	%16 = OpLabel
				243	%17 = OpLoad %7 %imparam
				244	%18 = OpLoad %9 %sparam
				245	%20 = OpSampledImage %19 %17 %18
				246	%24 = OpImageSampleExplicitLod %v4float %20 %23 Lod %float_0
				247	OpReturnValue %24
				248	OpFunctionEnd
				249
				250	What the SPIR-V Reader currently generates:
				251
				252	@group(0) @binding(0) var im : texture_2d<f32>;
				253
				254	@group(0) @binding(1) var s : sampler;
				255
David Neto	0e6d95b	2022-11-01 03:54:42 +0000	[diff] [blame]	256	fn helper_t21_p1_(imparam : texture_2d<f32>, sparam : sampler) -> vec4<f32> {
				257	let x_24 : vec4<f32> = textureSampleLevel(imparam, sparam, vec2<f32>(0.0f, 0.0f), 0.0f);
				258	return x_24;
				259	}
				260
				261	fn main_1() {
				262	var v : vec4<f32>;
				263	let x_31 : vec4<f32> = helper_t21_p1_(im, s);
				264	v = x_31;
				265	return;
				266	}
				267
				268	@compute @workgroup_size(1i, 1i, 1i)
				269	fn main() {
				270	main_1();
				271	}
				272
				273	### Dimensionality mismatch in texture builtins
				274
				275	Vulkan SPIR-V is fairly forgiving in the dimensionality
				276	of input coordinates and result values of texturing operations.
				277	There is some localized rewriting of values to satisfy the overloads
				278	of WGSL's texture builtin functions.
				279
				280	### Reconstructing structured control flow
				281
				282	This is subtle.
				283
				284	- Use structural dominance (but we didn't have the name at the time).
				285	See SPIR-V 1.6 Rev 2 for updated definitions.
				286	- See the big comment at the start of reader/spirv/function.cc
				287	- See internal presentations.
				288
				289	Basically:
				290	* Compute a "structured order" for structurally reachable basic blocks.
				291	* Traversing in structured order, use a stack-based algorithn to
				292	identify intervals of blocks corresponding to structured constructs.
				293	For example, loop construct, continue construct, if-selection,
				294	switch-selection, and case-construct. Constructs can be nested,
				295	hence the need for a stack. This is akin to "drawing braces"
				296	around statements, to form block-statements that will appear in
				297	the output. This step performs some validation, which may now be
				298	redundant with the SPIRV-Tools validator. This is defensive
				299	programming, and some tests skip use of the SPIRV-Tools validator.
				300	* Traversing in structured order, identify structured exits from the
				301	constructs identified in the previous step. This determines what
				302	control flow edges correspond to `break`, `continue`, and `return`,
				303	as needed.
				304	* Traversing in structured order, generate statements for instructions.
				305	This uses a stack corresponding to nested constructs. The kind of
				306	each construct being entered or exited determines emission of control
				307	flow constructs (WGSL's `if`, `loop`, `continuing`, `switch`, `case`).
				308
				309	### Preserving execution order
				310
				311	An instruction inside a SPIR-V instruction is one of:
				312
				313	- control flow: see the previous section
				314	- combinatorial: think of this as an ALU operation, i.e. the effect
				315	is purely to evaluate a result value from the values of its operands.
				316	It has no side effects, and is not affected by external state such
				317	as memory or the actions of other invocations in its subgroup.
				318	Examples: arithmetic, OpCopyObject
				319	- interacts with memory or other invocations in some way.
				320	Examples: load, store, atomics, barriers, (subgroup operations when we
				321	get them)
				322	- function calls: functions are not analyzed to see if they are pure,
				323	so we assume function calls are non-combinatorial.
				324
				325	To preserve execution order, all non-combinatorial instructions must
				326	be translated as their own separate statement. For example, an OpStore
				327	maps to an assignment statement.
				328
				329	However, combinatorial instructions can be emitted at any point
				330	in evaluation, provided data flow constraints are satisfied: input
				331	values are available, and such that the resulting value is generated
				332	in time for consumption by downstream uses.
				333
				334	The SPIR-V Reader uses a heuristic to choose when to emit combinatorial
				335	values:
				336	- if a combinatorial expression only has one use, and
				337	- its use is in the same structured construct as its definition, then
				338	- emit the expression at the place where it is consumed.
				339
				340	Otherwise, make a `let` declaration for the value.
				341
				342	Why:
				343	- If a value has many uses, then computing it once can save effort.
				344	Preserve that choice if it was made by an upstream optimizing compiler.
				345	- If a value is consumed in a different structured construct, then the
				346	site of its consumption may be inside a loop, and we don't want to
				347	sink the computation into the loop, thereby causing spurious extra
				348	evaluation.
				349
				350	This heuristic generates halfway-readable code, greatly reducing the
				351	varbosity of code in the common case.
				352
				353	### Hoisting and phis
				354
				355	SPIR-V uses SSA (static single assignment). The key requirement is
				356	that the definition of a value must dominate its uses.
				357
				358	WGSL uses lexical scoping.
				359
				360	It is easy enough for a human or an optimizing compiler to generate
				361	SSA cases which do not map cleanly to a lexically scoped value.
				362
				363	Example pseudo-GLSL:
				364
				365	void main() {
				366	if (cond) {
				367	const uint x = 1;
				368	} else {
				369	return;
				370	}
				371	const uint y = x; // x's definition dominates this use.
				372	}
				373
				374	This isn't valid GLSL and its analog would not be a valid WGSL
				375	program because x is used outside the scope of its declaration.
				376
				377	Additionally, SSA uses `phi` nodes to transmit values from predecessor
				378	basic blocks that would otherwise not be visible (because the
				379	parent does not dominate the consuming basic block). An example
				380	is sending the updated value of a loop induction variable back to
				381	the top of the loop.
				382
				383	The SPIR-V reader handles these cases by tracking:
				384	- where a value definition occurs
				385	- the span of basic blocks, in structured order, where there
				386	are uses of the value.
				387
				388	If the uses of a value span structured contructs which are not
				389	contained by the construct containing the definition (or
				390	if the value is a `phi` node), then we "hoist" the value
				391	into a variable:
				392
				393	- create a function-scope variable at the top of the structured
				394	construct that spans all the uses, so that all the uses
				395	are in scope of that variable declaration.
				396
				397	- for a non-phi: generate an assignment to that variable corresponding
				398	to the value definition in the original SPIR-V.
				399
				400	- for a phi: generate an assigment to that variable at the end of
				401	each predecessor block for that phi, assigning the value to be
				402	transmitted from that phi.
				403
				404	This scheme works for values which can be the stored in a variable.
				405
				406	It does not work for pointers. However, we don't think we need
				407	to solve this case any time soon as it is uncommon or hard/impossible
				408	to generate via standard tooling.
				409	See https://crbug.com/tint/98 and https://crbug.com/tint/837
				410
				411	## Mapping types
				412
				413	SPIR-V has a recursive type system. Types are defined, given result IDs,
				414	before any functions are defined, and before any constant values using
				415	the corresponding types.
				416
				417	WGSL also has a recursive type system. However, except for structure types,
				418	types are spelled inline at their uses.
				419
				420	## Texture and sampler types
				421
				422	SPIR-V image types map to WGSL types, but the WGSL type is determined
				423	more by usage (what operations are performed on it) than by declaration.
				424
				425	For example, Vulkan ignores the "Depth" operand of the image type
				426	declaration (OpTypeImage).
				427	See [16.1 Image Operations Overview](https://registry.khronos.org/vulkan/specs/1.3/html/vkspec.html#_image_operations_overview).
				428	Instead, we must infer that a texture is a depth texture because
				429	it is used by image instructions using a depth-reference, e.g.
				430	OpImageSampleDrefImplicitLod vs. OpImageSampleImplicitLod.
				431
				432	Similarly, SPIR-V only has one sampler type. The use of the
				433	sampler determines whether it maps to a WGSL `sampler` or
				434	`sampler_comparison` (for depth sampling).
				435
				436	The SPIR-V Reader scans uses of each texture and sampler
				437	in the module to infer the appropriate target WGSL type.
				438	See ParserImpl::RegisterHandleUsage
				439
				440	In Vulkan SPIR-V it is possible to use the same sampler for regular
				441	sampling and depth-reference sampling. In this case the SPIR-V Reader
				442	will infer a depth texture, but then the generated program will fail WGSL
				443	validation.
				444
				445	For example, this GLSL fragment shader:
				446
				447	#version 450
				448
				449	layout(set=1,binding=0) uniform texture2D tInput;
				450	layout(set=1,binding=1) uniform sampler s;
				451
				452	void main() {
				453	vec4 v = texture(sampler2D(tInput,s),vec2(0));
				454	float f = texture(sampler2DShadow(tInput,s),vec3(0));
				455	}
				456
				457	Converts to this WGSL shader:
				458
				459	@group(1) @binding(0) var tInput : texture_depth_2d;
				460
				461	@group(1) @binding(1) var s : sampler_comparison;
				462
				463	fn main_1() {
				464	var v : vec4<f32>;
				465	var f : f32;
				466	let x_23 : vec4<f32> = vec4<f32>(textureSample(tInput, s, vec2<f32>(0.0f, 0.0f)), 0.0f, 0.0f, 0.0f);
				467	v = x_23;
				468	let x_34 : f32 = textureSampleCompare(tInput, s, vec3<f32>(0.0f, 0.0f, 0.0f).xy, vec3<f32>(0.0f, 0.0f, 0.0f).z);
				469	f = x_34;
				470	return;
				471	}
				472
				473	@fragment
				474	fn main() {
				475	main_1();
				476	}
				477
				478	But then this fails validation:
				479
				480	error: no matching call to textureSample(texture_depth_2d, sampler_comparison, vec2<f32>)
				481	15 candidate functions: ...
				482
				483	## References and pointers
				484
				485	SPIR-V has a pointer type.
				486
				487	A SPIR-V pointer type corresponds to a WGSL memory view. WGSL has two
				488	memory view types: a reference type, and a pointer type.
				489
				490	See [spirv-ptr-ref.md](spirv-ptr-ref.md) for details on the translation.
				491
				492	## Mapping buffer types
				493
				494	Vulkan SPIR-V expresses a Uniform Buffer Object (UBO), or
				495	a WGSL 'uniform buffer' as:
				496
				497	- an OpVariable in Uniform storage class
				498	- its pointee type (store type) is a Block-decorated structure type
				499
				500	Vulkan SPIR-V has two ways to express a Shader Storage Buffer Object (SSBO),
				501	or a WGSL 'storage buffer' as either deprecated-style:
				502
				503	- an OpVariable in Uniform storage class
				504	- its pointee type (store type) is a BufferBlock-decorated structure type
				505
				506	or as new-style:
				507
				508	- an OpVariable in StorageBuffer storage class
				509	- its pointee type (store type) is a Block-decorated structure type
				510
				511	Deprecated-style storage buffer was the only option in un-extended
				512	Vulkan 1.0. It is generated by tools that want to generate code for
				513	the broadest reach. This includes DXC.
				514
				515	New-style storage buffer requires the use of the `OpExtension
				516	"SPV_KHR_storage_buffer_storage_class"` or SPIR-V 1.3 or later
				517	(Vulkan 1.1 or later).
				518
				519	Additionally, a storage buffer in SPIR-V may be marked as NonWritable.
				520	Perhaps surprisingly, this is typically done by marking all the
				521	members of the top-level (Buffer)Block-decorated structure as NonWritable.
				522	(This is the common translation because that's what Glslang does.)
				523
				524	Translation of uniform buffers is straightforward.
				525
				526	However, the SPIR-V Reader must support both the deprecated and the new
				527	styles of storage buffers.
				528
				529	Additionally:
				530	- a storage buffer with all NonWritable members is translated with `read`
				531	access mode. This becomes a part of its WGSL reference type (and hence
				532	corresponding pointer type).
				533	- a storage buffer without all NonWritable members is translated with
				534	an explicit `read_write` access mode. This becomes a part of its
				535	WGSL reference type (and hence corresponding pointer type).
				536
				537	Note that read-only vs. read-write is a property of the pointee-type in SPIR-V,
				538	but in WGSL it's part of the reference type (not the store type).
				539
				540	To handle this mismatch, the SPIR-V Reader has bookkeeping to map
				541	each pointer value (inside a function) back to through to the originating
				542	variable. This originating variable may be a buffer variable which then
				543	tells us which address space and access mode to use for a locally-defined
				544	pointer value.
				545
				546	Since baseline SPIR-V does not allow passing pointers to buffers into
				547	user-defined helper functions, we don't need to handle this buffer type
				548	remapping into function formal parameters.
				549
				550	## Mapping OpArrayLength
				551
				552	The OpArrayLength instruction takes a pointer to the enclosing
				553	structure (the pointee type of the storage buffer variable).
				554
				555	But the WGSL arrayLength builtin variable takes a pointer to the
				556	member inside that structure.
				557
				558	A small local adjustment is sufficient here.