As Tint has grown the number of transforms on the AST has grown. This growth has lead to several issues:
In order to address these goals, an IR was introduced into Tint. The IR is mutable, it holds the needed state in order to be transformed. The IR is also translatable back into AST. It will be possible to generate an AST, convert to IR, transform, and then rebuild a new AST. This round-trip ability provides a few features:
The IR helps with the complexity of the AST transforms by limiting the representations in the IR form. For example, instead of for
, while
and loop
constructs there is a single loop
construct. alias
and const_assert
nodes are not emitted into IR. Dead code maybe eliminated during IR construction.
As the IR can convert into AST, we could potentially simplify the SPIRV-Reader by generating IR directly. The IR is closer to what SPIR-V looks like, so maybe a simpler transform.
The IR is composed of three primary concepts, blocks
, values
, and instructions
.
The blocks
provide lists of instructions
which are executed sequentially. An instruction
may contain blocks
, value
operands and value
results. A value
is a block argument, function parameter, constant result, or a result computed by an instruction
.
Each block
ends in a terminator
expression. There are a few blocks which maybe empty, if-false block
, loop-initializer
and loop-continuing
. Those blocks are optional. The terminators
may hold values
which are returned out of the block being terminated.
Control flow is handled through ControlInstructions
. These are the if
, switch
and loop
instructions. Each control instruction contains blocks
for the various control flow paths.
Transforming from AST to IR and back to AST is a lossy operation. The resulting AST when converting back may not be the same as the AST being provided. This transformation is intentional as it greatly simplifies the number of things to consider in the IR. For instance:
alias
nodesconst_assert
nodesThe code is contained in the src/tint/ir
folder and is broken down into several classes. Note, the IR is a Tint internal representation and these files should never appear in the public API.
The Builder
class provides useful helper routines for creating IR content. The Builder references an ir::Module
.
The top level of the IR is the Module
. The module stores a list of functions
, constant manager, type manager, allocators and various other bits of information needed by the IR. The module
may contain a disassembled representation of the module if the disassembler has been called.
A free ir::FromProgram
method is provided to convert from a Program
to an ir::Module
. A similar ToProgram
free method is provided for the reverse conversion.
Similar to the AST a transform system is available for IR. The transform has the same setup as the AST (and inherits from the same base transform class.)
The IR transforms live in src/tint/ir/transform
. These transforms are for use by the various IR generator backends.
Unlike with the AST transforms, the IR transforms know which transforms must have run before themselves. The transform manager will then verify that all the required transforms for a given transform have executed.
The IR contains a validator. The validator checks for common errors encountered when building IR. The validator is not run in production as any error is an internal coding error. The validator is executed by the generators before starting, and after each transform executes in debug mode.
The instructions are internal to the IR, and the values in the IR do not contain names. The IR module has a symbol table which can be used to retrieve names for the instructions and values but those names may not be the names provided originally by the AST.
A block contains the instruction lists for a section of code. A block always ends in a terminator instruction.
The instructions in a block can be walked in linear scoped fashion. The control instructions do not need to be entered to walk a blocks instructions.
The control instructions provide containers for other blocks. These include the if
, loop
and switch
instruction. A control instruction is a self contained scope on top of the containing block scope.
The if instruction is an if-else
structure. There are no else-if
entries, they get moved into the else
of the if
. The if instruction has two internal blocks, the True
and False
blocks.
A if instruction will always have a True
target. The False
target maybe empty.
The sub-blocks under the if
can be terminated with ExitIf
, ExitSwitch
, ExitLoop
, Continue
or Return
.
All of the loop structures in AST merge down to a single loop instruction. The loop contains the Body
, which is required and optional Initializer
and Continuing
blocks.
The loop
body can be terminated with a ExitLoop
, Continue
, NextIteration
or Return
.
The loop
continuing can be terminated with a NextIteration
, or a BreakIf
terminator.
A while loop is decomposed as listed in the WGSL spec:
while (a < b) { c += 1; }
becomes:
loop { if (!(a < b)) { break; } c += 1; }
A for loop is decomposed as listed in the WGSL spec:
for (var i = 0; i < 10; i++) { c += 1; }
becomes:
var i = 0; loop { if (!(i < 10)) { break; } c += 1; continuing { i++; } }
The switch instruction has a block for each of the case/default
labels.
The switch
case blocks can be terminated with an ExitSwitch
, Continue
, or Return
.
All expressions in IR are single operations. There are no complex expressions. Any complex expression in the AST is broke apart into the simpler single operation components.
var a = b + c - (4 * k);
becomes:
%0:i32 = add %b, %c %1:i32 = mul 4u, %k %2:i32 = sub %0, %1 %v:ptr<function, i32, read_write> = var %2
This also means that many of the short forms i += 1
, i++
get expanded into the longer form of i = i + 1
.
The short-circuit expressions (e.g. a && b
) are convert into an if
structure control flow.
let c = a() && b()
becomes
let c = a(); if (c) { c = b(); }
There are several types of values used in the SSA form.
Constant
ValueInstructionResult
ValueFunction
ValueFunctionParam
ValueBlockParam
ValueAll values in IR are concrete, there are no abstract values as materialization has already happened. Each constant holds a lower level constant::Value
from the src/tint/ir/constant
system.
The InstructionResult
is the result of an instruction. The temporaries are created as complex expressions are broken down into pieces. The result tracks the usage for the value so you can determine which instructions use this value. The InstructionResult
also points back to its Source
instruction.
All Functions
are values in the IR. This allows the function to be provided as a Value argument to things like the Call
instructions.
The function param values are used to store information on the values being passed into a function call.
The BlockParam
values are used to pass information for a MultiInBlock
(e.g. Loop body, and loop continuing).
Instead of going to a custom IR there are several possible other roads that could be travelled.
Tint originally contained a mutable AST. This was converted to immutable in order to allow processing over multiple threads and for safety properties. Those desires still hold, the AST is public API, and we want it to be as safe as possible, so keeping it immutable provides that guarantee.
Instead of generating an immutable AST after each transform, running multiple transforms on the single program builder would remove some of the performance penalties of going to and from immutable AST. While this is true, the transforms use a combination of AST and SEM information. When they transform they do not create new SEM information. That means, after a given transform, the SEM is out of date. In order to re-generate the SEM the resolver needs to be rerun. Supporting this would require being very careful on what transforms run together and how they modify the AST.
There are already several IRs in the while, Mesa has NIR, LLVM has LLVM IR. There are others, adopting one of those would remove the requirements of writing and maintaining our own IR. While that is true, there are several downsides to this re-use. The IRs are internal to the library, so the API isn't public, LLVM IR changes with each iteration of LLVM. This would require us to adapt the AST -> IR -> AST transform for each modification of the IR.
They also end up being lower level then is strictly useful for us. While the IR in Tint is a simplified form, we still have to be able to go back to the high level structured form in order to emit the resulting HLSL, MSL, GLSL, etc. (Only SPIR-V is a good match for the lowered IR form). This transformation back is not a direction other IRs maybe interested in so may have lost information, or require re-determining (determining variables from SSA and PHI nodes for example).
Other technical reasons are the maintenance of BUILD.gn and CMake files in order to integrate into our build systems, along with resulting binary size questions from pulling in external systems.