We can't rely on those enums not being re-`#define`d, because they're
only re-`#define`d like that in MVKPixelFormats.mm.
Signed-off-by: Chip Davis <cdavis@codeweavers.com>
The game NieR: Automata (via DXVK) attempts to create a 1680x1050
swapchain on a 1600x900 window. It then attempts to render to this
swapchain with a 1680x1050 framebuffer. But we created the textures at
1600x900, matching the window. The render area is thus too big, which
triggers a Metal validation failure. Apparently, DXVK doesn't check the
surface caps before creating the Vulkan swapchain.
Rather than expecting the swapchain to be the same size as the layer, we
can actually support any swapchain size, up to the maximum size of a
texture supported by the device. The system will just scale the texture
when rendering it if it doesn't match the layer size. If the sizes don't
match up, we return `VK_SUBOPTIMAL_KHR`, instead of
`VK_ERROR_OUT_OF_DATE_KHR`, indicating that presentation is still
possible, but performance may suffer. This is good enough to let the
game continue under the validation layers.
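As a rough sketch of the size check described above (not the actual MoltenVK code), the logic could look like the following. The names `Extent2D`, `PresentStatus`, `extentIsSupported`, and `presentStatus` are hypothetical stand-ins; a real build would use `VkExtent2D` and the `VkResult` codes from the Vulkan headers.

```cpp
#include <cstdint>

// Hypothetical stand-ins for VkExtent2D and the VkResult codes named above.
struct Extent2D { uint32_t width; uint32_t height; };
enum class PresentStatus { Success, Suboptimal };

// A swapchain extent is acceptable as long as each dimension fits in the
// largest 2D texture the device supports (maxTextureDim is assumed to be
// queried from the device elsewhere).
bool extentIsSupported(Extent2D swapchain, uint32_t maxTextureDim) {
    return swapchain.width  > 0 && swapchain.width  <= maxTextureDim &&
           swapchain.height > 0 && swapchain.height <= maxTextureDim;
}

// When the sizes differ, presentation still works because the system scales
// the texture, so report "suboptimal" rather than "out of date".
PresentStatus presentStatus(Extent2D swapchain, Extent2D layer) {
    bool sameSize = swapchain.width == layer.width &&
                    swapchain.height == layer.height;
    return sameSize ? PresentStatus::Success : PresentStatus::Suboptimal;
}
```

With the sizes from the game above, `presentStatus({1680, 1050}, {1600, 900})` yields `PresentStatus::Suboptimal`.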
This really needs a corresponding change to the CTS, because it
currently assumes that we can't do this.
Signed-off-by: Chip Davis <cdavis@codeweavers.com>
MVKPixelFormats revert testing for stencil feedback and add Mac Catalyst test.
Revert mvkOSVersionIsAtLeast(mac, ios) to remove test for Mac Catalyst version.
MVKRenderPass revert testing for MTLStorageModeMemoryless for iOS.
If a command buffer fails in Metal, its `status` becomes
`MTLCommandBufferStatusError`, and its `error` property is populated. In
Vulkan, command buffer failure usually triggers device loss. This is
because most drivers can't guarantee that resources weren't affected in
an undefined fashion. We can't make that guarantee, either, so when a
Metal command buffer has an error, we mark the device lost and log the
error. All waits are immediately signaled; waits requested by the client,
as opposed to internal implementation waits, immediately return
`VK_ERROR_DEVICE_LOST`.
The app may be able to recreate the logical device and reconstruct its
state. However, some Metal errors are severe enough that all subsequent
command buffers will fail--in those cases, the physical device becomes
lost as well, indicating that the device cannot be recreated.
Only certain commands are allowed to report device loss. These commands
test for device loss before continuing, and will immediately return if
the device is lost.
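A minimal sketch of this device-loss bookkeeping, assuming a simple atomic flag per device. `PhysicalDeviceState`, `LogicalDeviceState`, `onCommandBufferError`, and `beginCommand` are hypothetical names for illustration, not MoltenVK's classes, and `Result::DeviceLost` stands in for `VK_ERROR_DEVICE_LOST`.

```cpp
#include <atomic>

enum class Result { Success, DeviceLost };

// Hypothetical stand-in for the physical device's loss state.
struct PhysicalDeviceState {
    std::atomic<bool> lost{false};
};

// Hypothetical stand-in for the logical device.
class LogicalDeviceState {
public:
    explicit LogicalDeviceState(PhysicalDeviceState& phys) : _phys(phys) {}

    // Called when a command buffer completes with an error. A severe error
    // (one after which later command buffers will also fail) marks the
    // physical device lost too.
    void onCommandBufferError(bool severe) {
        _lost.store(true);
        if (severe) { _phys.lost.store(true); }
    }

    bool isLost() const { return _lost.load(); }

    // Commands that are allowed to report device loss test the flag first
    // and return immediately if it is set.
    Result beginCommand() const {
        return isLost() ? Result::DeviceLost : Result::Success;
    }

private:
    PhysicalDeviceState& _phys;
    std::atomic<bool> _lost{false};
};
```

How an error is classified as severe is left out here; that decision depends on the specific Metal error.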
This extension allows the subgroup size to vary between draw/dispatch
calls, and even allows clients to declare that full subgroups must
always be dispatched. It corresponds better to how Metal actually works.
No support for declaring a required subgroup size, unfortunately.
Define MVK_MACCAT build macro and use it to conditionally compile code to align
with build features and capabilities of Mac Catalyst platform on macOS 11.0+.
Treat Mac Catalyst as minor variation of macOS 11.0.
Update documentation.
Currently only support Mac Catalyst on macOS 11.0+, to avoid complexities of
deselecting iOS features and capabilities for Mac Catalyst on previous macOS versions.
Mac Catalyst (and Simulators) require use of XCFrameworks.
Currently unable to generate a dylib for Mac Catalyst.
There's a lot of redundant pointer comparison and version checking in
that function. Use a macro to reduce the redundancy.
Advertise `VK_KHR_shader_subgroup_extended_types` on iOS. I forgot to do
this in #1159.
Fix reversed condition in `MVKTimelineSemaphoreEmulated`, which probably
caused some test failures.
Don't increment the `MTLEvent` binary semaphore's counter unless a
command buffer is actually present to schedule a wait on.
Defer signal operations when a swapchain image is not yet available.
That way, the correct value will be signaled when the image is ready,
instead of causing GPU lockups and timeouts.
When a drawable is presented, immediately mark it available, instead of
waiting until the command buffer finishes. Otherwise, the wrong
semaphore could be signaled when an image is used twice in a row.
This should fix the problems using `MTLEvent` binary semaphores with
presentation, which were preventing us from enabling them by default when
available.
Special thanks to @apayen, whose
[idea](https://github.com/KhronosGroup/MoltenVK/issues/803) and
[change](a4ac715975)
were the basis for this.
On systems not supporting this, the subgroup size is set to 1.
Make sure the subgroup size is fixed in the shader, at least until we
implement `VK_EXT_subgroup_size_control`.
According to the Metal feature set tables, SIMD-group reduction is only
supported on Mac family 2 GPUs and Apple family 7 GPUs. Previously, we
were exposing it on all Mac GPUs.
Quadgroup permutation functions are supported on all Apple GPUs starting
from family 4. We use them for regular group non-uniform ops as well, so
these GPUs are considered to have a subgroup size of 4. On Mac, it's a bit
more complicated. The 2.1 tables say that all Mac GPUs support this, but
the 3.0 and 4.0 tables say that only family 2 supports quadgroup ops.
I've allowed quad ops on family 1 for now.
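The subgroup size selection described here might be sketched as follows. `GpuCaps` and `subgroupSize` are hypothetical names, and using the hardware SIMD-group width when SIMD-group permutation is available is an assumption of this sketch; the text above only states the quad case (size 4) and the unsupported case (size 1).

```cpp
#include <cstdint>

// Hypothetical capability summary; the fields are assumed to be filled in
// from the device's GPU family elsewhere.
struct GpuCaps {
    bool simdPermute;    // SIMD-group permutation functions available
    bool quadPermute;    // quadgroup permutation functions available
    uint32_t simdWidth;  // hardware SIMD-group width (assumption of this sketch)
};

uint32_t subgroupSize(const GpuCaps& caps) {
    if (caps.simdPermute) { return caps.simdWidth; } // assumption: full SIMD-group
    if (caps.quadPermute) { return 4; }              // quad ops emulate subgroup ops
    return 1;                                        // no subgroup support at all
}
```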
Unfortunately, my testing shows that SIMD-group functions don't work in
fragment shaders on Mojave, so no fragment shader support until Metal 3.
Update SPIRV-Cross to pull in changes needed for all this.
This was actually added in iOS 13, but it wasn't present in the betas.
Since the betas also didn't support family 6, this leads me to suspect
that `min_lod_clamp()` requires family 6. So to be safe, only enable the
feature on family 6.
Update SPIRV-Cross to pull in the changes needed for this.
Apple family 7 GPUs (A14) on iOS support multisample layered rendering,
as well as sampler border colors and the mirror clamp to edge sampler
address mode.
We know the actual threadgroup size, because it is declared in the
shader. Therefore, we know the total count of threads per threadgroup,
which is simply the product of the threadgroup size in all three
dimensions. This is necessary if Metal picks a size lower than the app
is expecting. At least one game (NieR: Automata) needs this to work
correctly.
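The computation is just a product of the three declared dimensions. A small sketch, with `ThreadgroupSize` and `totalThreadsPerThreadgroup` as hypothetical names:

```cpp
#include <cstdint>

// Threadgroup (workgroup) size as declared in the shader.
struct ThreadgroupSize { uint32_t x; uint32_t y; uint32_t z; };

// Total thread count is the product of the size in all three dimensions.
constexpr uint32_t totalThreadsPerThreadgroup(ThreadgroupSize s) {
    return s.x * s.y * s.z;
}
```

For example, a shader declaring an 8x8x1 threadgroup has 64 threads per threadgroup.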
Metal's validation layer doesn't like it when we render to both a 2D and
a 3D texture with layered rendering. Work around this by not setting the
`renderTargetArrayLength` if it would be one and we are rendering to
mixed 2D and 3D textures.
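The workaround condition can be sketched like this. `AttachmentMix` and `arrayLengthToSet` are hypothetical names, and returning 0 to mean "leave `renderTargetArrayLength` unset" is a convention of this sketch only.

```cpp
#include <cstdint>

// Which texture types appear among the render pass attachments.
struct AttachmentMix { bool has2D; bool has3D; };

// Returns the value to assign to renderTargetArrayLength, or 0 to signal
// that it should be left unset (a convention of this sketch).
uint32_t arrayLengthToSet(uint32_t layerCount, AttachmentMix mix) {
    bool mixed = mix.has2D && mix.has3D;
    // Skip the assignment only when it would be one and the attachments mix
    // 2D and 3D textures, which is what Metal's validation layer rejects.
    if (layerCount == 1 && mixed) { return 0; }
    return layerCount;
}
```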
If the pipeline sets `[[render_target_array_index]]`, we're hosed either
way. Perhaps this is why Vulkan has a special flag,
`VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT`, intended for rendering to 3D
textures.
This is now supported in MSL 2.3. Support varies by device; devices that
support this return `YES` from `supportsPullModelInterpolation`. Based
on my testing, AMD devices do not yet support this, and Intel devices
do. Apple GPUs probably also support this, in order to support OpenGL on
top.
Update SPIRV-Cross to pull in the changes needed for this.
They can now be used as blit destinations and be cleared with a render
pass.
This also fixes a bug where we were incorrectly validating that the
format supported shader writes if the image were linear, even on
iOS/tvOS.
Add macros to set MTLPixelFormatASTC_*_HDR to MTLPixelFormatInvalid on tvOS.
Use MTLRenderPipelineDescriptor::inputPrimitiveTopologyMVK instead of native method.
Starting in macOS 10.15, all desktop GPUs support 256 KiB offsets. Apple
GPUs, however, do not support this until family 7. It isn't clear which
limit applies on Apple Silicon Macs with family 5 or 6 GPUs, so I have
left the Apple Silicon limit at 64 KiB for now.
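A sketch of the resulting limit selection, assuming macOS 10.15 or later for desktop GPUs (the limit on earlier systems is not covered here). `GpuKind` and `maxOffsetBytes` are hypothetical names.

```cpp
#include <cstdint>

enum class GpuKind { Desktop, Apple };

constexpr uint32_t kKiB = 1024;

// Assumes macOS 10.15+ when kind is Desktop. appleFamily is only consulted
// for Apple GPUs; families below 7 conservatively keep the 64 KiB limit.
uint32_t maxOffsetBytes(GpuKind kind, uint32_t appleFamily) {
    if (kind == GpuKind::Desktop) { return 256 * kKiB; }
    return appleFamily >= 7 ? 256 * kKiB : 64 * kKiB;
}
```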
Don't pass descriptor type to MVKDescriptor functions, which already know their
descriptor type, and remove redundant switch tests in each subclass.
Move sanity tests for correct descriptor type to callers.