blob: 0357c7a897049f22170fce629208a2feb59978d6 [file] [log] [blame] [view]
dan sinclaireee8d882022-10-28 01:22:58 +00001# Intermediate Representation
2
3As Tint has grown the number of transforms on the AST has grown. This
4growth has lead to several issues:
5
61. Transforms rebuild the AST and SEM which causes slowness
71. Transforming in AST can be difficult as the AST is hard to work with
8
dan sinclair14440bc2023-07-11 19:12:29 +00009In order to address these goals, an IR was introduced into Tint. The IR
10is mutable, it holds the needed state in order to be transformed. The IR
11is also translatable back into AST. It will be possible to generate an
12AST, convert to IR, transform, and then rebuild a new AST. This
13round-trip ability provides a few features:
dan sinclaireee8d882022-10-28 01:22:58 +000014
151. Easy to integrate into current system by replacing AST transforms
16 piecemeal
171. Easier to test as the resulting AST can be emitted as WGSL and
18 compared.
19
20The IR helps with the complexity of the AST transforms by limiting the
dan sinclair14440bc2023-07-11 19:12:29 +000021representations in the IR form. For example, instead of `for`, `while`
22and `loop` constructs there is a single `loop` construct. `alias` and
23`const_assert` nodes are not emitted into IR. Dead code maybe eliminated
24during IR construction.
dan sinclaireee8d882022-10-28 01:22:58 +000025
26As the IR can convert into AST, we could potentially simplify the
27SPIRV-Reader by generating IR directly. The IR is closer to what SPIR-V
28looks like, so maybe a simpler transform.
29
dan sinclair14440bc2023-07-11 19:12:29 +000030
dan sinclaireee8d882022-10-28 01:22:58 +000031## Design
32
dan sinclair14440bc2023-07-11 19:12:29 +000033The IR is composed of three primary concepts, `blocks`, `values`, and
34`instructions`.
dan sinclaireee8d882022-10-28 01:22:58 +000035
dan sinclair14440bc2023-07-11 19:12:29 +000036The `blocks` provide lists of `instructions` which are executed
37sequentially. An `instruction` may contain `blocks`, `value` operands
38and `value` results. A `value` is a block argument, function parameter,
39constant result, or a result computed by an `instruction`.
40
41Each `block` ends in a `terminator` expression. There are a few blocks
42which maybe empty, `if-false block`, `loop-initializer` and
43`loop-continuing`. Those blocks are optional. The `terminators` may hold
44`values` which are returned out of the block being terminated.
45
46Control flow is handled through `ControlInstructions`. These are the
47`if`, `switch` and `loop` instructions. Each control instruction
48contains `blocks` for the various control flow paths.
dan sinclaireee8d882022-10-28 01:22:58 +000049
50Transforming from AST to IR and back to AST is a lossy operation.
dan sinclair14440bc2023-07-11 19:12:29 +000051The resulting AST when converting back may not be the same as the
52AST being provided. This transformation is intentional as it greatly
53simplifies the number of things to consider in the IR. For instance:
dan sinclaireee8d882022-10-28 01:22:58 +000054
55* No `alias` nodes
Ben Claytonc98d57d2023-01-24 14:59:43 +000056* No `const_assert` nodes
dan sinclair14440bc2023-07-11 19:12:29 +000057
dan sinclaireee8d882022-10-28 01:22:58 +000058
59### Code Structure
60The code is contained in the `src/tint/ir` folder and is broken down
61into several classes. Note, the IR is a Tint _internal_ representation
62and these files should _never_ appear in the public API.
63
dan sinclair14440bc2023-07-11 19:12:29 +000064
dan sinclaireee8d882022-10-28 01:22:58 +000065#### Builder
66The `Builder` class provides useful helper routines for creating IR
dan sinclair14440bc2023-07-11 19:12:29 +000067content. The Builder references an `ir::Module`.
68
dan sinclaireee8d882022-10-28 01:22:58 +000069
70#### Module
71The top level of the IR is the `Module`. The module stores a list of
dan sinclair14440bc2023-07-11 19:12:29 +000072`functions`, constant manager, type manager, allocators and various other
73bits of information needed by the IR. The `module` _may_ contain a
74disassembled representation of the module if the disassembler has been
75called.
dan sinclaireee8d882022-10-28 01:22:58 +000076
dan sinclair14440bc2023-07-11 19:12:29 +000077A free `ir::FromProgram` method is provided to convert from a
78`Program` to an `ir::Module`. A similar `ToProgram` free method is
79provided for the reverse conversion.
dan sinclaireee8d882022-10-28 01:22:58 +000080
dan sinclaireee8d882022-10-28 01:22:58 +000081
82### Transforms
83Similar to the AST a transform system is available for IR. The transform
84has the same setup as the AST (and inherits from the same base transform
85class.)
86
dan sinclair97c37272023-07-24 17:11:53 +000087The IR transforms live in `src/tint/lang/core/ir/transform`. These transforms are
dan sinclair14440bc2023-07-11 19:12:29 +000088for use by the various IR generator backends.
dan sinclaireee8d882022-10-28 01:22:58 +000089
dan sinclair14440bc2023-07-11 19:12:29 +000090Unlike with the AST transforms, the IR transforms know which transforms
91must have run before themselves. The transform manager will then verify
92that all the required transforms for a given transform have executed.
dan sinclaireee8d882022-10-28 01:22:58 +000093
dan sinclaireee8d882022-10-28 01:22:58 +000094
dan sinclair14440bc2023-07-11 19:12:29 +000095### Validation
96The IR contains a validator. The validator checks for common errors
97encountered when building IR. The validator is _not_ run in production
98as any error is an internal coding error. The validator is executed by
99the generators before starting, and after each transform executes in
100debug mode.
dan sinclaireee8d882022-10-28 01:22:58 +0000101
dan sinclaireee8d882022-10-28 01:22:58 +0000102
dan sinclair14440bc2023-07-11 19:12:29 +0000103### Naming
104The instructions are internal to the IR, and the values in the IR do not
105contain names. The IR module has a symbol table which can be used to
106retrieve names for the instructions and values but those names may not
107be the names provided originally by the AST.
dan sinclaireee8d882022-10-28 01:22:58 +0000108
dan sinclaireee8d882022-10-28 01:22:58 +0000109
dan sinclair14440bc2023-07-11 19:12:29 +0000110### Control Flow
dan sinclaireee8d882022-10-28 01:22:58 +0000111
dan sinclair14440bc2023-07-11 19:12:29 +0000112#### Block
113A block contains the instruction lists for a section of code. A block
114always ends in a terminator instruction.
dan sinclaireee8d882022-10-28 01:22:58 +0000115
dan sinclair14440bc2023-07-11 19:12:29 +0000116The instructions in a block can be walked in linear scoped fashion. The
117control instructions do not need to be entered to walk a blocks
118instructions.
dan sinclaireee8d882022-10-28 01:22:58 +0000119
dan sinclaireee8d882022-10-28 01:22:58 +0000120
dan sinclair14440bc2023-07-11 19:12:29 +0000121#### Control Instructions
122The control instructions provide containers for other blocks. These
123include the `if`, `loop` and `switch` instruction. A control instruction
124is a self contained scope on top of the containing block scope.
dan sinclaireee8d882022-10-28 01:22:58 +0000125
dan sinclaireee8d882022-10-28 01:22:58 +0000126
dan sinclair14440bc2023-07-11 19:12:29 +0000127##### Control Instruction -- If
128The if instruction is an `if-else` structure. There are no `else-if`
129entries, they get moved into the `else` of the `if`. The if instruction
130has two internal blocks, the `True` and `False` blocks.
dan sinclaireee8d882022-10-28 01:22:58 +0000131
dan sinclair14440bc2023-07-11 19:12:29 +0000132A if instruction will _always_ have a `True` target. The `False` target
133maybe empty.
dan sinclaireee8d882022-10-28 01:22:58 +0000134
dan sinclair14440bc2023-07-11 19:12:29 +0000135The sub-blocks under the `if` can be terminated with `ExitIf`,
136`ExitSwitch`, `ExitLoop`, `Continue` or `Return`.
dan sinclaireee8d882022-10-28 01:22:58 +0000137
dan sinclaireee8d882022-10-28 01:22:58 +0000138
dan sinclair14440bc2023-07-11 19:12:29 +0000139#### Control Instruction -- Loop
140All of the loop structures in AST merge down to a single loop
141instruction. The loop contains the `Body`, which is required and
142optional `Initializer` and `Continuing` blocks.
dan sinclaireee8d882022-10-28 01:22:58 +0000143
dan sinclair14440bc2023-07-11 19:12:29 +0000144The `loop` body can be terminated with a `ExitLoop`, `Continue`,
145`NextIteration` or `Return`.
dan sinclaireee8d882022-10-28 01:22:58 +0000146
dan sinclair14440bc2023-07-11 19:12:29 +0000147The `loop` continuing can be terminated with a `NextIteration`, or a
148`BreakIf` terminator.
dan sinclaireee8d882022-10-28 01:22:58 +0000149
150A while loop is decomposed as listed in the WGSL spec:
151
152```
153while (a < b) {
154 c += 1;
155}
156```
157
158becomes:
159
160```
161loop {
162 if (!(a < b)) {
163 break;
164 }
165 c += 1;
166}
167```
168
169A for loop is decomposed as listed in the WGSL spec:
170```
171for (var i = 0; i < 10; i++) {
172 c += 1;
173}
174```
175
176becomes:
177
178```
179var i = 0;
180loop {
181 if (!(i < 10)) {
182 break;
183 }
184
185 c += 1;
186
187 continuing {
188 i++;
189 }
190}
191```
192
dan sinclair14440bc2023-07-11 19:12:29 +0000193#### Control Instruction -- Switch
194The switch instruction has a block for each of the `case/default` labels.
dan sinclaireee8d882022-10-28 01:22:58 +0000195
dan sinclair14440bc2023-07-11 19:12:29 +0000196The `switch` case blocks can be terminated with an `ExitSwitch`,
197`Continue`, or `Return`.
dan sinclaireee8d882022-10-28 01:22:58 +0000198
dan sinclaireee8d882022-10-28 01:22:58 +0000199
200#### Expressions
201All expressions in IR are single operations. There are no complex
202expressions. Any complex expression in the AST is broke apart into the
203simpler single operation components.
204
205```
206var a = b + c - (4 * k);
207```
208
209becomes:
210
211```
dan sinclair14440bc2023-07-11 19:12:29 +0000212%0:i32 = add %b, %c
213%1:i32 = mul 4u, %k
214%2:i32 = sub %0, %1
215%v:ptr<function, i32, read_write> = var %2
dan sinclaireee8d882022-10-28 01:22:58 +0000216```
217
218This also means that many of the short forms `i += 1`, `i++` get
219expanded into the longer form of `i = i + 1`.
220
dan sinclair14440bc2023-07-11 19:12:29 +0000221
dan sinclaireee8d882022-10-28 01:22:58 +0000222##### Short-Circuit Expressions
dan sinclair14440bc2023-07-11 19:12:29 +0000223The short-circuit expressions (e.g. `a && b`) are convert into an `if`
224structure control flow.
dan sinclaireee8d882022-10-28 01:22:58 +0000225
226```
227let c = a() && b()
228```
229
230becomes
231
232```
233let c = a();
234if (c) {
235 c = b();
236}
237```
238
dan sinclair14440bc2023-07-11 19:12:29 +0000239#### Values
240There are several types of values used in the SSA form.
dan sinclaireee8d882022-10-28 01:22:58 +0000241
dan sinclair14440bc2023-07-11 19:12:29 +00002421. `Constant` Value
2431. `InstructionResult` Value
2441. `Function` Value
2451. `FunctionParam` Value
2461. `BlockParam` Value
dan sinclaireee8d882022-10-28 01:22:58 +0000247
dan sinclaireee8d882022-10-28 01:22:58 +0000248
dan sinclair14440bc2023-07-11 19:12:29 +0000249##### Constant Value
250All values in IR are concrete, there are no abstract values as
251materialization has already happened. Each constant holds a lower level
dan sinclair97c37272023-07-24 17:11:53 +0000252`constant::Value` from the `src/tint/lang/core/ir/constant` system.
dan sinclair14440bc2023-07-11 19:12:29 +0000253
254
255##### InstructionResult Value
256The `InstructionResult` is the result of an instruction. The
dan sinclaireee8d882022-10-28 01:22:58 +0000257temporaries are created as complex expressions are broken down into
dan sinclair14440bc2023-07-11 19:12:29 +0000258pieces. The result tracks the usage for the value so you can determine
259which instructions use this value. The `InstructionResult` also points
260back to its `Source` instruction.
dan sinclaireee8d882022-10-28 01:22:58 +0000261
dan sinclaireee8d882022-10-28 01:22:58 +0000262
dan sinclair14440bc2023-07-11 19:12:29 +0000263### Function Value
264All `Functions` are values in the IR. This allows the function to be
265provided as a Value argument to things like the `Call` instructions.
dan sinclaireee8d882022-10-28 01:22:58 +0000266
dan sinclaireee8d882022-10-28 01:22:58 +0000267
dan sinclair14440bc2023-07-11 19:12:29 +0000268##### FunctionParam Value
269The function param values are used to store information on the values
270being passed into a function call.
dan sinclaireee8d882022-10-28 01:22:58 +0000271
dan sinclair14440bc2023-07-11 19:12:29 +0000272### BlockParam Values
273The `BlockParam` values are used to pass information for a
274`MultiInBlock` (e.g. Loop body, and loop continuing).
dan sinclaireee8d882022-10-28 01:22:58 +0000275
276## Alternatives
277Instead of going to a custom IR there are several possible other roads
278that could be travelled.
279
280### Mutable AST
281Tint originally contained a mutable AST. This was converted to immutable
282in order to allow processing over multiple threads and for safety
283properties. Those desires still hold, the AST is public API, and we want
284it to be as safe as possible, so keeping it immutable provides that
285guarantee.
286
287### Multiple Transforms With One Program Builder
288Instead of generating an immutable AST after each transform, running
289multiple transforms on the single program builder would remove some of
290the performance penalties of going to and from immutable AST. While this
291is true, the transforms use a combination of AST and SEM information.
292When they transform they _do not_ create new SEM information. That
293means, after a given transform, the SEM is out of date. In order to
294re-generate the SEM the resolver needs to be rerun. Supporting this
295would require being very careful on what transforms run together and
296how they modify the AST.
297
298### Adopt An Existing IR
299There are already several IRs in the while, Mesa has NIR, LLVM has
300LLVM IR. There are others, adopting one of those would remove the
301requirements of writing and maintaining our own IR. While that is true,
302there are several downsides to this re-use. The IRs are internal to the
303library, so the API isn't public, LLVM IR changes with each iteration of
304LLVM. This would require us to adapt the AST -> IR -> AST transform for
305each modification of the IR.
306
307They also end up being lower level then is strictly useful for us. While
308the IR in Tint is a simplified form, we still have to be able to go back
309to the high level structured form in order to emit the resulting HLSL,
310MSL, GLSL, etc. (Only SPIR-V is a good match for the lowered IR form).
311This transformation back is not a direction other IRs maybe interested
312in so may have lost information, or require re-determining (determining
313variables from SSA and PHI nodes for example).
314
315Other technical reasons are the maintenance of BUILD.gn and CMake files
316in order to integrate into our build systems, along with resulting
317binary size questions from pulling in external systems.
318