# Devices

In Dawn the `Device` is a "god object" that contains a lot of facilities useful for the whole object graph that descends from it.
There are a number of facilities common to all backends that live in the frontend, as well as backend-specific facilities.
Examples of frontend facilities are the management of content-less object caches and the toggle management.
Examples of backend facilities are GPU memory allocators and the backing API function pointer table.

## Frontend facilities

### Error Handling

Dawn (dawn_native) uses the error handling facilities defined in [Error.h](../src/dawn_native/Error.h) to handle errors robustly.
With `DAWN_TRY`, errors bubble up all the way to, and are "consumed" by, the entry point that was called by the application.
Error consumption uses `Device::ConsumeError`, which exposes errors via the WebGPU "error scopes" and can also influence the device lifecycle by notifying of a device loss or triggering one.

See [Error.h](../src/dawn_native/Error.h) for more information about using errors.

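The bubbling pattern can be illustrated with a minimal, self-contained sketch. Everything in it (`MaybeErrorSketch`, `SKETCH_TRY`, `DeviceSketch` and its methods) is a hypothetical stand-in invented for illustration; it is not the actual contents of `Error.h`, and the real `DAWN_TRY` and `Device::ConsumeError` may differ in types and behavior.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for an error-or-success value: empty means success.
using MaybeErrorSketch = std::optional<std::string>;

// Hypothetical analogue of DAWN_TRY: return early if the expression failed.
#define SKETCH_TRY(expr)                       \
    do {                                       \
        MaybeErrorSketch sketchError = (expr); \
        if (sketchError.has_value()) {         \
            return sketchError;                \
        }                                      \
    } while (0)

struct DeviceSketch {
    // Errors that reached an entry point. A real device would route them
    // to error scopes or to device-loss handling instead.
    std::vector<std::string> consumedErrors;

    // Analogue of an error-consuming call: returns true if there was an error.
    bool ConsumeError(MaybeErrorSketch maybeError) {
        if (!maybeError.has_value()) {
            return false;
        }
        consumedErrors.push_back(*maybeError);
        return true;
    }

    MaybeErrorSketch ValidateSize(int size) {
        if (size < 0) {
            return std::string("size must not be negative");
        }
        return std::nullopt;
    }

    // Internal function: any error from validation bubbles up to the caller.
    MaybeErrorSketch CreateBufferInternal(int size) {
        SKETCH_TRY(ValidateSize(size));
        return std::nullopt;
    }

    // Entry point called by the application: errors stop here.
    bool APICreateBuffer(int size) {
        return !ConsumeError(CreateBufferInternal(size));
    }
};
```

In this sketch only the entry point decides what to do with an error; internal functions simply propagate it, which is the property the `DAWN_TRY` description above relies on.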
### Device Lifecycle

The device lifecycle is a bit more complicated than that of other objects in Dawn, for multiple reasons:

- The device initialization creates facilities in both the backend and the frontend, which can fail.
  When a device fails to initialize, it should still be possible to destroy it without crashing.
- Execution of commands on the GPU must be finished before the device can be destroyed (because there's no one to "DeleteWhenUnused" the device).
- On creation a device might want to run some GPU commands (like initializing zero-buffers), which must be completed before it is destroyed.
- A device can become "disconnected" when a TDR or hot-unplug happens.
  In this case, destruction of the device doesn't need to wait on GPU commands to finish because they just disappeared.

There is a state machine `State` defined in [Device.h](../src/dawn_native/Device.h) that controls all of the above.
The most common state is `Alive`, when there are potentially GPU commands executing.

Initialization of a device looks like the following:

- `DeviceBase::DeviceBase` is called and does mostly nothing except setting `State` to `BeingCreated` (and setting the initial toggles).
- `backend::Device::Initialize` creates things like the underlying device and other objects that don't require running GPU commands.
- It then calls `DeviceBase::Initialize`, which enables the `DeviceBase` facilities and sets the `State` to `Alive`.
- Optionally, `backend::Device::Initialize` can now enqueue GPU commands for its initialization.
- The device is ready to be used by the application!

While the device is `Alive`, the backend can notify it that it has been disconnected, in which case it jumps directly to the `Disconnected` state.
Internal errors, or a call to `LoseForTesting`, can also disconnect the device. In that case commands are still running in the underlying API, so the frontend finishes all commands (with `WaitForIdleForDestruction`) and prevents any new commands from being enqueued (by setting the state to `BeingDisconnected`).
After this the device is set to the `Disconnected` state.
If an `Alive` device is destroyed, then a flow similar to that of `LoseForTesting` happens.

All this ensures that during destruction or forceful disconnect of the device, it properly gets to the `Disconnected` state with no commands executing on the GPU.
After disconnecting, the frontend calls `backend::Device::DestroyImpl` so that the backend can properly free driver objects.

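The transitions described in this section can be summarized in a small sketch. The four state names come from the text; the class `DeviceLifecycleSketch`, its methods, and the pending-command counter are assumptions made for illustration and do not reflect the real `DeviceBase` interface.

```cpp
#include <cassert>

enum class State { BeingCreated, Alive, BeingDisconnected, Disconnected };

class DeviceLifecycleSketch {
  public:
    State GetState() const { return mState; }
    int GetPendingCommands() const { return mPendingCommands; }
    bool AreDriverObjectsFreed() const { return mDriverObjectsFreed; }

    // End of initialization: GPU commands may be in flight from now on.
    void Initialize() {
        assert(mState == State::BeingCreated);
        mState = State::Alive;
    }

    // Returns false when the device does not accept commands.
    bool Enqueue() {
        if (mState != State::Alive) {
            return false;
        }
        ++mPendingCommands;
        return true;
    }

    // The backend reports a TDR or hot-unplug: pending commands are simply gone.
    void OnBackendDisconnect() {
        mPendingCommands = 0;
        mState = State::Disconnected;
        FreeDriverObjects();
    }

    // Internal error, loss for testing, or destruction of the device.
    void Disconnect() {
        if (mState == State::Alive) {
            mState = State::BeingDisconnected;  // No new commands from here on.
            WaitForIdle();
        }
        // A device still in BeingCreated has nothing to wait for.
        mState = State::Disconnected;
        FreeDriverObjects();
    }

  private:
    // Stand-in for waiting until the GPU has finished all commands.
    void WaitForIdle() { mPendingCommands = 0; }
    // Stand-in for the backend releasing its driver objects.
    void FreeDriverObjects() { mDriverObjectsFreed = true; }

    State mState = State::BeingCreated;
    int mPendingCommands = 0;
    bool mDriverObjectsFreed = false;
};
```

Note that `Disconnect` is safe to call on a device that never reached `Alive`, which mirrors the requirement that a device that failed to initialize can still be destroyed without crashing.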
### Toggles

Toggles are booleans that control code paths inside of Dawn, like lazy-clearing resources or using D3D12 render passes.
They aren't just booleans close to the code path they control, because embedders of Dawn like Chromium want to be able to surface what toggles are used by a device (like in about:gpu).

Toggles are to be used for any optional code path in Dawn, including:

- Workarounds for driver bugs.
- Disabling select parts of the validation or robustness.
- Enabling limitations that help with testing.
- Using more advanced or optional backend API features.

Toggles can be queried using `DeviceBase::IsToggleEnabled`:
```
bool useRenderPass = device->IsToggleEnabled(Toggle::UseD3D12RenderPass);
```

Toggles are defined in a table in [Toggles.cpp](../src/dawn_native/Toggles.cpp) that also includes their name and description.
The name can be used to force a toggle to be enabled or, conversely, to force it to be disabled.
This is particularly useful in tests so that the two sides of a code path can be tested (for example using D3D12 render passes and not).

Here's an example of a test that is run in the D3D12 backend both in the default configuration and with the D3D12 render passes forcibly disabled.
```
DAWN_INSTANTIATE_TEST(RenderPassTest,
                      D3D12Backend(),
                      D3D12Backend({}, {"use_d3d12_render_pass"}));
// The {} is the list of force enabled toggles, {"..."} the force disabled ones.
```

The initialization order of toggles looks as follows:

- The toggle overrides from the device descriptor are applied.
- The frontend device default toggles are applied (unless already overridden).
- The backend device default toggles are applied (unless already overridden) using `DeviceBase::SetToggle`.
- The backend device can ignore overridden toggles if it can't support them by using `DeviceBase::ForceSetToggle`.

Forcing toggles should only be done when there is no "safe" option for the toggle.
This is to avoid crashes during testing when the tests try to use both sides of a toggle.
For toggles that are safe to enable, like workarounds, the tests can run against the base configuration and with the toggle enabled.
For toggles that are safe to disable, like using more advanced backing API features, the tests can run against the base configuration and with the toggle disabled.

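The precedence rules in the initialization order can be sketched with two bitsets: one recording the value of each toggle and one recording whether it has already been decided. The method names `SetToggle`, `ForceSetToggle` and `IsToggleEnabled` echo the names used in the text, but the class `ToggleStateSketch`, the enum `ToggleSketch`, `ApplyOverride`, and all signatures are assumptions for illustration only.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical toggle list; the real one lives in the toggle table.
enum class ToggleSketch : std::size_t { LazyClear, UseD3D12RenderPass, Count };

class ToggleStateSketch {
  public:
    // Step 1: overrides coming from the device descriptor always apply.
    void ApplyOverride(ToggleSketch toggle, bool enabled) {
        ForceSetToggle(toggle, enabled);
    }

    // Steps 2 and 3: defaults apply only if nothing decided the toggle yet.
    void SetToggle(ToggleSketch toggle, bool enabled) {
        if (!mIsSet[Index(toggle)]) {
            ForceSetToggle(toggle, enabled);
        }
    }

    // Step 4: the backend overrules everything, including user overrides.
    void ForceSetToggle(ToggleSketch toggle, bool enabled) {
        mIsSet[Index(toggle)] = true;
        mEnabled[Index(toggle)] = enabled;
    }

    bool IsToggleEnabled(ToggleSketch toggle) const {
        return mEnabled[Index(toggle)];
    }

  private:
    static constexpr std::size_t kCount =
        static_cast<std::size_t>(ToggleSketch::Count);

    static std::size_t Index(ToggleSketch toggle) {
        return static_cast<std::size_t>(toggle);
    }

    std::bitset<kCount> mEnabled;  // Current value of each toggle.
    std::bitset<kCount> mIsSet;    // Whether the toggle was already decided.
};
```

With this layout, a default applied after a descriptor override is silently ignored, while a forced value replaces whatever was there, which is why forcing is reserved for toggles with no safe alternative.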
### Immutable object caches

A number of WebGPU objects are immutable once created, and can be expensive to create, like pipelines.
`DeviceBase` contains caches for these objects so that creating an identical object a second time is essentially free.
Caching also makes it possible to compare objects such as `BindGroupLayouts` by pointer, since two BGLs are equal if and only if they are the same object.

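A minimal sketch of such a deduplicating cache follows. The types `LayoutDescriptorSketch`, `LayoutSketch` and `LayoutCacheSketch` are hypothetical, and the sketch keeps cached objects alive for the lifetime of the cache, which is a simplification and not necessarily how Dawn manages object lifetimes.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Hypothetical descriptor: the full content that defines a layout.
struct LayoutDescriptorSketch {
    std::vector<int> bindingTypes;

    bool operator<(const LayoutDescriptorSketch& other) const {
        return bindingTypes < other.bindingTypes;
    }
};

// Hypothetical immutable object created from a descriptor.
struct LayoutSketch {
    LayoutDescriptorSketch descriptor;
};

class LayoutCacheSketch {
  public:
    // Returns the existing object for an identical descriptor, or creates one.
    std::shared_ptr<LayoutSketch> GetOrCreate(const LayoutDescriptorSketch& descriptor) {
        auto it = mCache.find(descriptor);
        if (it != mCache.end()) {
            return it->second;  // Cache hit: no new object is created.
        }
        auto layout = std::make_shared<LayoutSketch>(LayoutSketch{descriptor});
        mCache.emplace(descriptor, layout);
        return layout;
    }

    std::size_t Size() const { return mCache.size(); }

  private:
    std::map<LayoutDescriptorSketch, std::shared_ptr<LayoutSketch>> mCache;
};
```

Because identical descriptors always map to the same object, comparing two returned pointers is enough to decide whether two layouts are equal.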
### Format Tables

The frontend has a `Format` structure that represents all the information that is known about a particular WebGPU format for this Device based on the enabled features.
Formats are precomputed at device initialization and can be queried from a WebGPU format either assuming the format is a valid enum, or in a safe manner that doesn't make this assumption.
A reference to these formats can be stored persistently as they have the same lifetime as the `Device`.

Formats also have an "index" so that backends can create parallel tables for internal information about formats, like what they translate to in the backing API.

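The two lookup styles and the parallel backend table can be sketched as follows. All names here (`TextureFormatSketch`, `FormatSketch`, `FormatTableSketch`, the lookup methods, and the backend codes) are made up for illustration; in particular, having the safe lookup return a null pointer for unsupported formats is an assumption, not a description of Dawn's behavior.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical subset of formats, numbered densely from zero.
enum class TextureFormatSketch : std::uint32_t { R8Unorm = 0, RGBA8Unorm = 1, BC1RGBAUnorm = 2 };
constexpr std::size_t kFormatCountSketch = 3;

struct FormatSketch {
    TextureFormatSketch format;
    bool isSupported;   // Depends on the features enabled on the device.
    std::size_t index;  // Position usable in parallel backend tables.
};

class FormatTableSketch {
  public:
    // Precomputes every entry once, at "device initialization".
    explicit FormatTableSketch(bool textureCompressionEnabled) {
        for (std::size_t i = 0; i < kFormatCountSketch; ++i) {
            mFormats[i] = FormatSketch{static_cast<TextureFormatSketch>(i), true, i};
        }
        mFormats[2].isSupported = textureCompressionEnabled;
    }

    // Fast path: the caller guarantees that `format` is a valid enum value.
    const FormatSketch& GetValidFormat(TextureFormatSketch format) const {
        std::size_t index = static_cast<std::size_t>(format);
        assert(index < kFormatCountSketch);
        return mFormats[index];
    }

    // Safe path: accepts any raw value, returns nullptr if invalid or unsupported.
    const FormatSketch* GetFormatOrNull(std::uint32_t rawFormat) const {
        if (rawFormat >= kFormatCountSketch || !mFormats[rawFormat].isSupported) {
            return nullptr;
        }
        return &mFormats[rawFormat];
    }

  private:
    std::array<FormatSketch, kFormatCountSketch> mFormats;
};

// Hypothetical backend table, parallel to the frontend one (codes are made up).
constexpr std::array<int, kFormatCountSketch> kBackendFormatCodesSketch = {10, 20, 30};

inline int ToBackendFormatSketch(const FormatSketch& format) {
    return kBackendFormatCodesSketch[format.index];
}
```

The returned references and pointers stay valid for as long as the table exists, which corresponds to formats sharing the lifetime of the `Device`.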
### Object factory

Like WebGPU's device object, `DeviceBase` is a factory with methods to create all kinds of other WebGPU objects.
WebGPU has some objects that aren't created from the device, like the texture view, but in Dawn these creations also go through `DeviceBase` so that there is a single factory for each backend.