Blame - docs/tint/ir.md - dawn

blob: fb589527768f60a51b6dc99d24afaf4ceda2e403 [file] [log] [blame] [view]

dan sinclair	eee8d88	2022-10-28 01:22:58 +0000	[diff] [blame]	1	# Intermediate Representation
				2
				3	As Tint has grown the number of transforms on the AST has grown. This
				4	growth has lead to several issues:
				5
				6	1. Transforms rebuild the AST and SEM which causes slowness
				7	1. Transforming in AST can be difficult as the AST is hard to work with
				8
				9	In order to address these goals, an IR is being introduced into Tint.
				10	The IR is mutable, it holds the needed state in order to be transformed.
				11	The IR is also translatable back into AST. It will be possible to
				12	generate an AST, convert to IR, transform, and then rebuild a new AST.
				13	This round-trip ability provides a few features:
				14
				15	1. Easy to integrate into current system by replacing AST transforms
				16	piecemeal
				17	1. Easier to test as the resulting AST can be emitted as WGSL and
				18	compared.
				19
				20	The IR helps with the complexity of the AST transforms by limiting the
				21	representations seen in the IR form. For example, instead of `for`,
				22	`while` and `loop` constructs there is a single `loop` construct.
Ben Clayton	c98d57d	2023-01-24 14:59:43 +0000	[diff] [blame]	23	`alias` and `const_assert` nodes are not emitted into IR. Dead code is
dan sinclair	eee8d88	2022-10-28 01:22:58 +0000	[diff] [blame]	24	eliminated during the IR construction.
				25
				26	As the IR can convert into AST, we could potentially simplify the
				27	SPIRV-Reader by generating IR directly. The IR is closer to what SPIR-V
				28	looks like, so maybe a simpler transform.
				29
				30	## Design
				31
				32	The IR breaks down into two fundamental pieces, the control flow and the
				33	expression lists. While these can be thought of as separate pieces they
				34	are linked in that the control flow blocks contain the expression lists.
				35	A control flow block may use the result of an expression as the
				36	condition.
				37
				38	The IR works together with the AST/SEM. There is an underlying
				39	assumption that the source `Program` will live as long as the IR. The IR
				40	holds pointers to data from the `Program`. This includes things like SEM
				41	types, variables, statements, etc.
				42
				43	Transforming from AST to IR and back to AST is a lossy operation.
				44	The resulting AST when converting back will not be the same as the
				45	AST being provided. (e.g. all `for`, `while` and `loop` constructs coming
				46	in will become `while` loops going out). This is intentional as it
				47	greatly simplifies the number of things to consider in the IR. For
				48	instance:
				49
				50	* No `alias` nodes
Ben Clayton	c98d57d	2023-01-24 14:59:43 +0000	[diff] [blame]	51	* No `const_assert` nodes
dan sinclair	eee8d88	2022-10-28 01:22:58 +0000	[diff] [blame]	52	* All loops become `while` loops
				53	* `if` statements may all become `if/else`
				54
				55	### Code Structure
				56	The code is contained in the `src/tint/ir` folder and is broken down
				57	into several classes. Note, the IR is a Tint _internal_ representation
				58	and these files should _never_ appear in the public API.
				59
				60	#### Builder
				61	The `Builder` class provides useful helper routines for creating IR
				62	content. The Builder owns an `ir::Module`, it can be created with an
				63	existing Module by moving it into the builder. The Module is moved from
				64	the builder when it is complete.
				65
				66	#### Module
				67	The top level of the IR is the `Module`. The module stores a list of
				68	`functions`, `entry_points`, allocators and various other bits of
				69	information needed by the IR. The `Module` also contains a pointer to
				70	the `Program` which the IR was created from. The `Program` must outlive
				71	the `Module`.
				72
				73	The `Module` provides two methods from moving two and from a `Program`.
				74	The `Module::FromProgram` static method will take a `Program` and
				75	construct an `ir::Module` from the contents. The resulting module class
				76	then has a `ToProgram` method which will construct a new `Program` from
				77	the `Module` contents.
				78
				79	#### BuilderImpl
				80	The `BuilderImpl` is internally used by the `Module` to do the
				81	conversion from a `Program` to a `Module`. This class should not be used
				82	outside the `src/tint/ir` folder.
				83
				84	### Transforms
				85	Similar to the AST a transform system is available for IR. The transform
				86	has the same setup as the AST (and inherits from the same base transform
				87	class.)
				88
				89	Note, not written yet.
				90
				91	### Scoping
				92	The IR flattens scopes. This also means that the IR will rename shadow
				93	variables to be uniquely named in the larger scoped block.
				94
				95	For an example of flattening:
				96
				97	```
				98	{
				99	var x = 1;
				100	{
				101	var y = 2;
				102	}
				103	}
				104	```
				105
				106	becomes:
				107
				108	```
				109	{
				110	var x = 1;
				111	var y = 2;
				112	}
				113	```
				114
				115	For an example of shadowing:
				116
				117	```
				118	{
				119	var x = 1;
				120	if (true) {
				121	var x = 2;
				122	}
				123	}
				124	```
				125
				126	becomes:
				127
				128	```
				129	{
				130	var x = 1;
				131	if true {
				132	var x_1 = 2;
				133	}
				134	}
				135	```
				136
				137	### Control Flow Blocks
				138
				139	At the top level, the AST is broken into a series of control flow nodes.
				140	There are a limited set of flow nodes as compared to AST:
				141
				142	1. Block
				143	1. Function
				144	1. If statement
				145	1. Loop statement
				146	1. Switch statement
				147	1. Terminator
				148
				149	As the IR is built a stack of control flow blocks is maintained. The
				150	stack contains `function`, `loop`, `if` and `switch` control flow
				151	blocks. A `function` is always the bottom element in the flow control
				152	stack.
				153
				154	The current instruction block is tracked. The tracking is reset to
				155	`nullptr` when a branch happens. This is used in the statement processing
				156	in order to eliminate dead code. If the current block does not exist, or
				157	has a branch target, then no further instructions can be added, which
				158	means all control flow has branched and any subsequent statements can be
				159	disregarded.
				160
				161	Note, this does have the effect that the inspector _must_ be run to
				162	retrieve the module interface before converting to IR. This is because
				163	phony assignments in dead code add variables into the interface.
				164
				165	```
				166	var<storage> b;
				167
				168	fn a() {
				169	return;
				170	_ = b; // This pulls b into the module interface but would be
				171	// dropped due to dead code removal.
				172	}
				173	```
				174
				175	#### Control Flow Block
				176	A block is the simplest control flow node. It contains the instruction
				177	lists for a given linear section of codes. A block only has one branch
				178	statement which always happens at the end of the block. Note, the branch
				179	statement is implicit, it doesn't show up in the expression list but is
				180	encoded in the `branch_target`.
				181
				182	In almost every case a block does not branch to another block. It will
				183	always branch to another control flow node. The exception to this rule
				184	is blocks branching to the function end block.
				185
				186	#### Control Flow Function
				187	A function control flow block has two targets associated with it, the
				188	`start_target` and the `end_target`. Function flow starts at the
				189	`start_target` and ends just before the `end_target`. The `end_target`
				190	is always a terminator, it just marks the end of the function
				191	(a return is a branch to the function `end_target`).
				192
				193	#### Control Flow If
				194	The if flow node is an `if-else` structure. There are no `else-if`
				195	entries, they get moved into the `else` of the `if`. The if control flow
				196	node has three targets, the `true_target`, `false_target` and possibly a
				197	`merge_target`.
				198
				199	The `merge_target` is possibly `nullptr`. This can happen if both
				200	branches of the `if` call `return` for instance as the internal branches
				201	would jump to the function `end_target`.
				202
				203	In all cases, the if node will have a `true_target` and a
				204	`false_target`, the target block maybe just a branch to the
				205	`merge_target` in the case where that branch of the if was empty.
				206
				207	#### Control Flow Loop
				208	All of the loop structures in AST merge down to a single loop control
				209	flow node. The loop contains the `start_target`, `continuing_target` and
				210	a `merge_target`.
				211
				212	In the case of a loop, the `merge_target` always exists, but may
				213	actually not exist in the control flow. The target is created in order
				214	to have a branch for `continue` to branch too, but if the loop body does
				215	a `return` then control flow may jump over that block completely.
				216
				217	The chain of blocks from the `start_target`, as long as it does not
				218	`break` or `return` will branch to the `continuing_target`. The
				219	`continuing_target` will possibly branch to the `merge_target` and will
				220	branch to the `start_target` for the loop.
				221
				222	A while loop is decomposed as listed in the WGSL spec:
				223
				224	```
				225	while (a < b) {
				226	c += 1;
				227	}
				228	```
				229
				230	becomes:
				231
				232	```
				233	loop {
				234	if (!(a < b)) {
				235	break;
				236	}
				237	c += 1;
				238	}
				239	```
				240
				241	A for loop is decomposed as listed in the WGSL spec:
				242	```
				243	for (var i = 0; i < 10; i++) {
				244	c += 1;
				245	}
				246	```
				247
				248	becomes:
				249
				250	```
				251	var i = 0;
				252	loop {
				253	if (!(i < 10)) {
				254	break;
				255	}
				256
				257	c += 1;
				258
				259	continuing {
				260	i++;
				261	}
				262	}
				263	```
				264
				265	#### Control Flow Switch
				266	The switch control flow has a target block for each of the
				267	`case/default` labels along with a `merge_target`. The `merge_target`
				268	while existing, maybe outside the control flow if all of the `case`
				269	branches `return`. The target exists in order to provide a `break`
				270	target.
				271
				272	#### Control Flow Terminator
				273	The terminator control flow is only used as the `end_target` of a
				274	function. It does not contain instructions and is only used as a marker
				275	for the exit of a function.
				276
				277	### Expression Lists.
				278	Note, this section isn't fully formed as this has not been written at
				279	this point.
				280
				281	The expression lists are all in SSA form. The SSA variables will keep
				282	pointers back to the source AST variables in order for us to not require
				283	PHI nodes and to make it easier to move back out of SSA form.
				284
				285	#### Expressions
				286	All expressions in IR are single operations. There are no complex
				287	expressions. Any complex expression in the AST is broke apart into the
				288	simpler single operation components.
				289
				290	```
				291	var a = b + c - (4 * k);
				292	```
				293
				294	becomes:
				295
				296	```
				297	%t0 = b + c
				298	%t1 = 4 * k
				299	%v0 = %t0 - %t1
				300	```
				301
				302	This also means that many of the short forms `i += 1`, `i++` get
				303	expanded into the longer form of `i = i + 1`.
				304
				305	##### Short-Circuit Expressions
				306	The short-circuit expressions (e.g. `a && b`) will be convert into an
				307	`if` structure control flow.
				308
				309	```
				310	let c = a() && b()
				311	```
				312
				313	becomes
				314
				315	```
				316	let c = a();
				317	if (c) {
				318	c = b();
				319	}
				320	```
				321
				322	#### Registers
				323	There are several types of registers used in the SSA form.
				324
				325	1. Constant Register
				326	1. Temporary Register
				327	1. Variable Register
				328	1. Return Register
				329	1. Function Argument Register
				330
				331	##### Constant Register
				332	The constant register `%c` holds a constant value. All values in IR are
				333	concrete, there are no abstract values as materialization has already
				334	happened. Each constant register holds a single constant value (e.g.
				335	`3.14`) and a pointee to the type (maybe? If needed.)
				336
				337	##### Temporary Register
				338	The temporary register `%t` hold the results of a simple operation. The
				339	temporaries are created as complex expressions are broken down into
				340	pieces. The temporary register tracks the usage count for the register.
				341	This allows a portion of a calculation to be pulled out when rebuilding
				342	AST as a common calculation. If the temporary is used once it can be
				343	re-combine back into a large expression.
				344
				345	##### Variable Register
				346	The variable register `%v` potentially holds a pointer back to source
				347	variables. So, while each value is written only once, if the pointer
				348	back to an AST variable exists we can rebuild the variable that value
				349	was originally created from and can assign back when converting to AST.
				350
				351	##### Return Register
				352	Each function has a return register `%r` where the return value will be
				353	stored before the final block branches to the `end_target`.
				354
				355	##### Function Argument Register
				356	The function argument registers `%a` are used to store the values being
				357	passed into a function call.
				358
				359	#### Type Information
				360	The IR shares type information with the SEM. The types are the same, but
				361	they may exist in different block allocations. The SEM types will be
				362	re-used if they exist, but if the IR needs to create a new type it will
				363	be created in the IRs type block allocator.
				364
				365	#### Loads / Stores and Deref
				366	Note, have not thought about this. We should probably have explicit
				367	load/store operations injected in the right spot, but don't know yet.
				368
				369	## Alternatives
				370	Instead of going to a custom IR there are several possible other roads
				371	that could be travelled.
				372
				373	### Mutable AST
				374	Tint originally contained a mutable AST. This was converted to immutable
				375	in order to allow processing over multiple threads and for safety
				376	properties. Those desires still hold, the AST is public API, and we want
				377	it to be as safe as possible, so keeping it immutable provides that
				378	guarantee.
				379
				380	### Multiple Transforms With One Program Builder
				381	Instead of generating an immutable AST after each transform, running
				382	multiple transforms on the single program builder would remove some of
				383	the performance penalties of going to and from immutable AST. While this
				384	is true, the transforms use a combination of AST and SEM information.
				385	When they transform they _do not_ create new SEM information. That
				386	means, after a given transform, the SEM is out of date. In order to
				387	re-generate the SEM the resolver needs to be rerun. Supporting this
				388	would require being very careful on what transforms run together and
				389	how they modify the AST.
				390
				391	### Adopt An Existing IR
				392	There are already several IRs in the while, Mesa has NIR, LLVM has
				393	LLVM IR. There are others, adopting one of those would remove the
				394	requirements of writing and maintaining our own IR. While that is true,
				395	there are several downsides to this re-use. The IRs are internal to the
				396	library, so the API isn't public, LLVM IR changes with each iteration of
				397	LLVM. This would require us to adapt the AST -> IR -> AST transform for
				398	each modification of the IR.
				399
				400	They also end up being lower level then is strictly useful for us. While
				401	the IR in Tint is a simplified form, we still have to be able to go back
				402	to the high level structured form in order to emit the resulting HLSL,
				403	MSL, GLSL, etc. (Only SPIR-V is a good match for the lowered IR form).
				404	This transformation back is not a direction other IRs maybe interested
				405	in so may have lost information, or require re-determining (determining
				406	variables from SSA and PHI nodes for example).
				407
				408	Other technical reasons are the maintenance of BUILD.gn and CMake files
				409	in order to integrate into our build systems, along with resulting
				410	binary size questions from pulling in external systems.
				411