This feature adds support for efficiently reading and writing per-pixel data that's local to a render pass, with correctly synchronized accesses. On “explicit” tiler GPUs this additional data is allocated in tile storage similarly to attachments, but on other GPUs the same functionality can be implemented using “fragment shader interlock” and read-write storage textures.
Here is the design doc for this feature.
A couple of new concepts are added:
- A pixel_local storage class in WGSL that can be used with var<pixel_local> pls : u32/MyStruct, which declares the pixel local data in the shader. It requires a WGSL enable, see chromium_experimental_framebuffer_fetch.
- Pixel local storage descriptors for both wgpu::PipelineLayoutDescriptor and wgpu::RenderPassDescriptor that describe the layout of the PLS inside a render pass, and are part of pipeline / render pass compatibility. These descriptors contain both the total size of the PLS (totalPixelLocalStorageSize) and the storage attachments with their offsets inside the PLS.
- A wgpu::TextureUsage::StorageAttachment that allows a texture to be used as a storage attachment in a render pass.
- The wgpu::RenderPassEncoder::PixelLocalStorageBarrier method for use in the non-coherent PLS extension.
The feature comes in two flavors depending on coherency. The coherent version automatically synchronizes fragment shader accesses to the PLS so that no data race happens (as if there were a critical section between the first and last use of the PLS in an invocation), and enforces that fragment shaders execute in API order (two fragment invocations from two triangles of the same draw happen in the order of the triangles in the draw). The non-coherent version allows racy access to the PLS during the whole render pass, but provides a PixelLocalStorageBarrier() method that prevents races between fragment invocations before and after the barrier. In particular, there is no way to prevent racy access to the PLS in the same draw with the non-coherent version of the feature.
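To make the non-coherent ordering concrete, here is a minimal sketch of where the barrier sits between two draws. It does not use the real Dawn headers: MockRenderPassEncoder, its single-argument Draw, and RecordTwoDraws are hypothetical stand-ins that only record call order. Only the method name PixelLocalStorageBarrier comes from this document.

```cpp
#include <cstdint>
#include <string>
#include <vector>

namespace sketch_barrier {

// Mock encoder that only records the order of calls.
// It is NOT wgpu::RenderPassEncoder; Draw() is simplified to one argument.
class MockRenderPassEncoder {
  public:
    void Draw(uint32_t vertexCount) {
        mCommands.push_back("Draw(" + std::to_string(vertexCount) + ")");
    }
    void PixelLocalStorageBarrier() { mCommands.push_back("PixelLocalStorageBarrier"); }
    const std::vector<std::string>& Commands() const { return mCommands; }

  private:
    std::vector<std::string> mCommands;
};

// Non-coherent flavor: PLS accesses of the second draw are ordered after those of
// the first draw by the barrier. Overlapping fragments inside a single draw can
// still race, since no barrier can be placed within a draw.
inline std::vector<std::string> RecordTwoDraws() {
    MockRenderPassEncoder pass;
    pass.Draw(3);                     // first draw touches the PLS
    pass.PixelLocalStorageBarrier();  // separates accesses before and after
    pass.Draw(3);                     // second draw touches the same pixels
    return pass.Commands();
}

}  // namespace sketch_barrier
```

With the coherent flavor the barrier call would presumably be unnecessary, since accesses are synchronized automatically.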
    // Add to wgpu::FeatureName
    wgpu::FeatureName::PixelLocalStorageNonCoherent
    wgpu::FeatureName::PixelLocalStorageCoherent

    // Add to wgpu::TextureUsage
    wgpu::TextureUsage::StorageAttachment

    // Can be chained to a RenderPassDescriptor
    struct wgpu::RenderPassPixelLocalStorage : wgpu::ChainedStruct {
      uint64_t totalPixelLocalStorageSize;
      size_t storageAttachmentCount;
      wgpu::RenderPassStorageAttachment* storageAttachments;
    };
    struct wgpu::RenderPassStorageAttachment : wgpu::ChainedStruct {
      wgpu::LoadOp loadOp;
      wgpu::StoreOp storeOp;
      wgpu::Color clearValue;
      wgpu::TextureView storage;
      uint64_t offset;
    };

    // Can be chained to a PipelineLayoutDescriptor
    struct wgpu::PipelineLayoutPixelLocalStorage : wgpu::ChainedStruct {
      uint64_t totalPixelLocalStorageSize;
      size_t storageAttachmentCount;
      wgpu::PipelineLayoutStorageAttachment* storageAttachments;
    };
    struct wgpu::PipelineLayoutStorageAttachment : wgpu::ChainedStruct {
      wgpu::TextureFormat format;
      uint64_t offset;
    };

    // Used for non-coherent.
    wgpu::RenderPassEncoder::PixelLocalStorageBarrier();
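As a rough illustration of how the pipeline-layout side of the API might be filled in, the sketch below describes a PLS made of two 4-byte slots. It uses simplified stand-in types in a hypothetical sketch_layout namespace rather than the real Dawn headers, and omits the chaining to wgpu::PipelineLayoutDescriptor. The field names follow the listing above; ExampleAttachments and MakeExampleLayout are assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch_layout {

// Simplified stand-ins mirroring the fields listed above (NOT the real Dawn types).
enum class TextureFormat { R32Uint, R32Sint, R32Float };

struct PipelineLayoutStorageAttachment {
    TextureFormat format;
    uint64_t offset;  // Byte offset of this attachment inside the PLS.
};

struct PipelineLayoutPixelLocalStorage {
    uint64_t totalPixelLocalStorageSize;
    size_t storageAttachmentCount;
    const PipelineLayoutStorageAttachment* storageAttachments;
};

// Two 4-byte slots: a u32 value at offset 0 and an f32 value at offset 4.
// The array has static storage so the pointer in the descriptor stays valid.
inline const std::vector<PipelineLayoutStorageAttachment>& ExampleAttachments() {
    static const std::vector<PipelineLayoutStorageAttachment> kAttachments = {
        {TextureFormat::R32Uint, 0},
        {TextureFormat::R32Float, 4},
    };
    return kAttachments;
}

inline PipelineLayoutPixelLocalStorage MakeExampleLayout() {
    const auto& attachments = ExampleAttachments();
    PipelineLayoutPixelLocalStorage pls{};
    pls.totalPixelLocalStorageSize = 8;  // 2 slots * 4 bytes each
    pls.storageAttachmentCount = attachments.size();
    pls.storageAttachments = attachments.data();
    return pls;
}

}  // namespace sketch_layout
```

A render pass using a pipeline with this layout would be expected to describe the same total size and offsets in its wgpu::RenderPassPixelLocalStorage, adding the load/store operations and texture views for each attachment.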
- Only the R32Uint, R32Sint and R32Float texture formats can specify wgpu::TextureUsage::StorageAttachment.
- StorageAttachment textures must be single-sampled.
- In a PixelLocalStorage descriptor: […]
- […] StorageAttachment […] .BeginRenderPass).
- The PLS definition between a pipeline and a render pass must match.
- A render pipeline must use a pixel_local storage class that's compatible with its layout's PLS descriptor.
- […] pixel_local block.
- […] u32 WGSL type.
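Some of the rules above can be sketched as plain predicates. The code below is not Dawn's validation code: the sketch_validation namespace, PlsLayout, AttachmentSlot, and all function names are hypothetical. Treating "must match" as equality of total size, attachment count, and per-attachment format and offset (in order) is an assumption, since this document does not spell out the comparison.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch_validation {

// Stand-in enum; RGBA8Unorm is included only as an example of a rejected format.
enum class TextureFormat { R32Uint, R32Sint, R32Float, RGBA8Unorm };

// Only the three 32-bit single-channel formats may be storage attachments.
inline bool IsStorageAttachmentFormat(TextureFormat format) {
    switch (format) {
        case TextureFormat::R32Uint:
        case TextureFormat::R32Sint:
        case TextureFormat::R32Float:
            return true;
        default:
            return false;
    }
}

// StorageAttachment textures must also be single-sampled.
inline bool IsValidStorageAttachmentTexture(TextureFormat format, uint32_t sampleCount) {
    return IsStorageAttachmentFormat(format) && sampleCount == 1;
}

struct AttachmentSlot {
    TextureFormat format;
    uint64_t offset;
};

struct PlsLayout {
    uint64_t totalSize;
    std::vector<AttachmentSlot> attachments;
};

// ASSUMPTION: two PLS definitions "match" when sizes, counts, and each
// attachment's format and offset are equal, compared in order.
inline bool PlsLayoutsMatch(const PlsLayout& pipeline, const PlsLayout& renderPass) {
    if (pipeline.totalSize != renderPass.totalSize ||
        pipeline.attachments.size() != renderPass.attachments.size()) {
        return false;
    }
    for (size_t i = 0; i < pipeline.attachments.size(); ++i) {
        if (pipeline.attachments[i].format != renderPass.attachments[i].format ||
            pipeline.attachments[i].offset != renderPass.attachments[i].offset) {
            return false;
        }
    }
    return true;
}

}  // namespace sketch_validation
```

For the render pass side, the format of each slot would come from the texture view bound as the storage attachment rather than from an explicit format field.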