If a shader uses an input attachment and doesn't do layered rendering,
but the image view is of type `MTLTextureType2DArray`, Metal's
validation layer will complain that the texture type doesn't match what
the shader expects. This change makes the texture types line up.
When the extent covers both the source and destination images
completely, we can use the copy method on `MTLBlitCommandEncoder` that
copies multiple slices at once. This should reduce CPU overhead and
command buffer memory usage.
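
The "covers both images completely" test can be sketched roughly as
follows. This is only an illustration: `Offset3D`, `Extent3D`,
`coversWholeImage()` and `canCopyWholeSlices()` are hypothetical
stand-ins written for this note, not the names used in the actual
change.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for VkOffset3D / VkExtent3D so the sketch compiles alone.
struct Offset3D { int32_t x, y, z; };
struct Extent3D { uint32_t width, height, depth; };

// True when a region starting at `offset` with size `extent` spans the whole image.
inline bool coversWholeImage(Offset3D offset, Extent3D extent, Extent3D imageExtent) {
    return offset.x == 0 && offset.y == 0 && offset.z == 0 &&
           extent.width == imageExtent.width &&
           extent.height == imageExtent.height &&
           extent.depth == imageExtent.depth;
}

// True when the copy region covers both the source and the destination
// completely, which is the condition under which a multi-slice copy is usable.
inline bool canCopyWholeSlices(Offset3D srcOffset, Offset3D dstOffset, Extent3D extent,
                               Extent3D srcImageExtent, Extent3D dstImageExtent) {
    return coversWholeImage(srcOffset, extent, srcImageExtent) &&
           coversWholeImage(dstOffset, extent, dstImageExtent);
}
```

Any region with a nonzero offset or a smaller extent on either side
fails the check and would take the per-slice path instead.
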
Combine MoltenVKSPIRVToMSLConverter and MoltenVKGLSLToSPIRVConverter
frameworks into a single MoltenVKShaderConverter framework.
Update corresponding directory structures, symlinks, scripts, and build paths.
Update MoltenVK code to use new framework name for headers.
Add symlinks in API-Samples demo to support legacy
MoltenVKGLSLToSPIRVConverter header paths.
In addition to simplifying shader converter code and build management, the
use of only one shader converter framework fixes a race condition within Xcode,
prior to Xcode 12, when multiple targets use the same dependency XCFramework.
When calculating the vertices, we need to use the render area's
extent--but only if the implementation supports constraining the render
area using `renderTargetWidth` and `renderTargetHeight`. Otherwise, the
quad will be stretched and/or squashed because of the render area
constraint.
Fall through to the 2D case, so all the special handling for 2D is used
for 1D as well. Also, make sure 1D doesn't report multisampling or
support for 420 subsampled formats. There is no
`MTLTextureType1DMultisample` anyway.
Also, clear the `VkImageFormatProperties` struct if the format is not
supported with the given parameters. Some tests seem to expect this.
We don't want to do this for stencil attachment views, because we use
the original packed depth/stencil format in render pipelines, and
Metal's validation layer for some reason doesn't consider packed formats
and their corresponding stencil view formats to match. So only do this
if the image view usage includes `SAMPLED` or `INPUT_ATTACHMENT`.
If the image has a format that supports atomic access, or can be cast to
a format which supports atomic access, then use a texel buffer,
regardless of the memory type. If we can't use the `MTLBuffer` from the
device memory, then create our own.
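
As a rough sketch of the decision described above (the enum and the
function name are hypothetical, and the three inputs are modeled as
plain booleans rather than real format or memory queries):

```cpp
#include <cassert>

// Hypothetical result type: where the texel buffer's storage comes from.
enum class TexelBufferBacking {
    NotRequired,         // this rule does not ask for a texel buffer
    DeviceMemoryBuffer,  // reuse the MTLBuffer owned by the device memory
    OwnBuffer            // the image creates its own buffer
};

// The memory type is deliberately not an input: the rule applies regardless of it.
inline TexelBufferBacking chooseTexelBufferBacking(bool formatSupportsAtomics,
                                                   bool castableToAtomicFormat,
                                                   bool deviceMemoryBufferUsable) {
    if (!formatSupportsAtomics && !castableToAtomicFormat) {
        return TexelBufferBacking::NotRequired;
    }
    return deviceMemoryBufferUsable ? TexelBufferBacking::DeviceMemoryBuffer
                                    : TexelBufferBacking::OwnBuffer;
}
```

`NotRequired` only means that atomic access does not force a texel
buffer; other code paths may still choose one for unrelated reasons.
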
For #1027.
We really don't want to use the `VkFormat` for the whole image. That has
a block size set on it, which causes the size to be reduced by a factor
of two or even four, in the case of a 420 format whose pitch exceeds the
required alignment. If instead we use the plane `MTLPixelFormat`, we can
sidestep that.
Under Metal, `GBGR422` and `BGRG422` formats don't support linear
textures, mipmapping, or multisampled, arrayed, 1D, 3D, or cube
images. Many of these don't make sense for multiplanar images, either,
so I've disabled them there as well. Vulkan also forbids creating buffer
views in a chroma subsampled format, which we can't do on Metal anyway
due to linear textures not supporting this. Finally, Vulkan forbids
blitting between chroma subsampled formats.
Don't advertise GBGR/BGRG formats greater than 8 bits. Metal has no
corresponding public pixel format.
Make sure `samplerYcbcrProperties.combinedImageSamplerDescriptorCount`
is at least 1. According to the Vulkan spec:
> `combinedImageSamplerDescriptorCount` is a number between 1 and the
> number of planes in the format.
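
A minimal sketch of the clamp, assuming the count is derived from a
plane count that can come back as 0 for some formats (the helper name
`descriptorCountForPlanes()` is hypothetical):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Never report fewer than one descriptor, whatever plane count was computed.
inline uint32_t descriptorCountForPlanes(uint32_t planeCount) {
    return std::max<uint32_t>(planeCount, 1u);
}
```
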
For single-plane formats, use the directly mapped `MTLPixelFormat`
instead of trying to guess the correct format. This is important for
`G8B8G8R8_422_UNORM` and `B8G8R8G8_422_UNORM`, since these shouldn't map
to `RGBA8Unorm`.
Don't adjust the extent for plane 0. Plane 0 is never subsampled, even
with GBGR/BGRG formats. The subsampled R/B components are instead
interleaved in these formats with the fully sampled G. This fixes a
validation error creating a GBGR/BGRG texture with an odd size.
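
The rule can be sketched like this. `Extent2D`, `planeExtent()` and
the `chromaDivisor` parameter are hypothetical names for illustration,
and the plain integer division assumes that images with separate
chroma planes have dimensions divisible by the divisor:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for VkExtent2D.
struct Extent2D { uint32_t width, height; };

// chromaDivisor is, for example, {2, 2} for 420 and {2, 1} for 422 subsampling.
inline Extent2D planeExtent(Extent2D imageExtent, uint32_t planeIndex, Extent2D chromaDivisor) {
    if (planeIndex == 0) {
        return imageExtent;  // plane 0 is never subsampled, so leave it alone
    }
    return { imageExtent.width / chromaDivisor.width,
             imageExtent.height / chromaDivisor.height };
}
```

With this shape, an odd-sized GBGR/BGRG image keeps its full extent,
because those single-plane formats only ever ask for plane 0.
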
Don't warn when there's no `MTLPixelFormat` for a multiplanar format.
These deliberately have no `MTLPixelFormat`, because there is no single
`MTLTexture` corresponding to the entire image.
The extra checks in `MVKImage` are to ensure that the
`dEQP-VK.api.invariance.random` test doesn't crash.
Only set `MTLTextureUsagePixelFormatView` if this bit is set. This
should reduce the usage of this bit, which disables lossless compression
on Apple GPUs, in many cases.
We continue to set it anyway for `VK_IMAGE_USAGE_TRANSFER_SRC_BIT`; this
is because we create those texture views on the application's behalf, to
implement `vkCopyImage()` where the source and destination formats do
not agree. We also continue to set it for depth/stencil formats; Metal
requires it in order to use only the stencil aspect in a view.
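
Put together, the conditions above amount to something like the
following sketch. The function name is hypothetical, and the bit this
message refers to is modeled as a plain boolean (`gatingBitSet`)
because the text does not name it:

```cpp
#include <cassert>

// Returns whether MTLTextureUsagePixelFormatView should be requested.
inline bool needsPixelFormatView(bool gatingBitSet,
                                 bool hasTransferSrcUsage,
                                 bool isDepthStencilFormat) {
    return gatingBitSet              // the application asked for it
        || hasTransferSrcUsage       // we create views ourselves for format-mismatched copies
        || isDepthStencilFormat;     // needed for stencil-only views
}
```

Only images for which all three conditions are false skip the usage
bit.
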
If this bit is set, add `MTLTextureUsage` bits for features supported by
all possible view formats, not just the image format. Happily, Metal's
API validator layer is OK with this.
This is needed for the mutable format tests, which use this bit with a
format that doesn't support writing, then cast the image to a format
which does.
I've turned on the `Resolve` cap for stencil-only formats, even though
no version of the Metal Feature Set tables lists them as supporting
multisample resolve. Obviously, if they couldn't be resolve
destinations, the stencil-resolve filter that was introduced in Metal
2.1 wouldn't work. I don't know if the platforms and feature sets where
I've turned the bit on are accurate, though. Wider testing is needed.
Because Apple families 1 and 2 don't support depth/stencil resolve at
all, I've disabled the extension for those families. Since sample-zero
resolution is a required feature of Vulkan 1.2, this means we won't be
able to support 1.2 on those devices. If there's demand, we could
possibly have a compute pass which does sample-zero resolution.
We were looking in the wrong chain for the
`VkImagePlaneMemoryRequirementsInfo` struct. We were also failing to
call through to the `MVKImageMemoryBinding` to get dedicated
requirements, which broke dedicated memory.
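
For illustration, a generic `pNext` chain lookup looks roughly like
the sketch below. `BaseInStructure` is a stand-in mirroring the layout
of `VkBaseInStructure`, `findInChain()` is a hypothetical helper, and
the numeric `sType` values in the usage note are made up. In Vulkan,
`VkImagePlaneMemoryRequirementsInfo` is chained onto the
`VkImageMemoryRequirementsInfo2` input structure, so a search that
starts from any other chain head never finds it.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in with the same leading members as VkBaseInStructure.
struct BaseInStructure {
    uint32_t sType;
    const BaseInStructure* pNext;
};

// Walks the chain starting at `chainHead`; returns the first structure
// whose sType matches, or nullptr if the chain does not contain one.
inline const BaseInStructure* findInChain(const void* chainHead, uint32_t sType) {
    for (auto* s = static_cast<const BaseInStructure*>(chainHead); s != nullptr; s = s->pNext) {
        if (s->sType == sType) {
            return s;
        }
    }
    return nullptr;
}
```

Searching from the input structure finds the plane info; searching
from the output structure returns `nullptr`, which is the kind of
silent miss that the wrong-chain bug produces.
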
Clearly, I failed to properly review the patch which refactored
`MVKImage` for `VK_KHR_sampler_ycbcr_conversion`.
Apple assured me that, starting in 10.15.5, linear textures and texture
buffers can be created from buffers with `Shared` storage. I have tested
this and can confirm that it works at least on Big Sur, and probably on
Catalina as well.
In an indirect draw, we wouldn't set the pipeline for the vertex stage
on the command encoder, because we had already switched encoders. This
caused havoc down the line, as we wound up running the "convert indirect
buffers" pipeline again, which smashed memory randomly--possibly causing
the GPU to crash, and bringing down the `WindowServer` with it!
Why is it so easy to bring down the `WindowServer` like this?