Define MVK_MACCAT build macro and use it to conditionally compile code to align
with build features and capabilities of Mac Catalyst platform on macOS 11.0+.
Treat Mac Catalyst as minor variation of macOS 11.0.
Update documentation.
Currently only support Mac Catalyst on macOS 11.0+, to avoid complexities of
deselecting iOS features and capabilities for Mac Catalyst on previous macOS versions.
Mac Catalyst (and Simulators) require use of XCFrameworks.
Currently unable to generate a dylib for Mac Catalyst.
There's a lot of redundant pointer comparison and version checking in
that function. Use a macro to reduce the redundancy.
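
A minimal sketch of the kind of macro described here, assuming the function walks a list of known entry points and, for each one, compares a pointer and checks a minimum API version. The names `procA`, `procB`, `isProcUsable`, and `RETURN_IF_PROC` are hypothetical and are not taken from the actual change.

```cpp
#include <cstdint>

// Hypothetical entry points; distinct bodies keep their addresses distinct.
int procA() { return 1; }
int procB() { return 2; }
using ProcAddr = int (*)();

// Expands to one pointer comparison plus one version check, so the
// function body below lists each entry point exactly once.
#define RETURN_IF_PROC(fn, minVersion)      \
    if (proc == &fn) {                      \
        return apiVersion >= (minVersion);  \
    }

// Reports whether a known entry point is usable at the given API version.
bool isProcUsable(ProcAddr proc, uint32_t apiVersion) {
    RETURN_IF_PROC(procA, 1)
    RETURN_IF_PROC(procB, 2)
    return false;  // unknown pointer
}

#undef RETURN_IF_PROC
```

Adding another entry point then costs one line rather than a repeated `if` block.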
Advertise `VK_KHR_shader_subgroup_extended_types` on iOS. I forgot to do
this in #1159.
Fix reversed condition in `MVKTimelineSemaphoreEmulated`, which probably
caused some test failures.
Don't increment the `MTLEvent` binary semaphore's counter unless a
command buffer is actually present to schedule a wait on.
Defer signal operations when a swapchain image is not yet available.
That way, the correct value will be signaled when the image is ready,
instead of causing GPU lockups and timeouts.
When a drawable is presented, immediately mark it available, instead of
waiting until the command buffer finishes. Otherwise, the wrong
semaphore could be signaled when an image is used twice in a row.
This should fix the problems using `MTLEvent` binary semaphores with
presentation, which were preventing us from enabling them by default when
available.
Special thanks to @apayen, whose
[idea](https://github.com/KhronosGroup/MoltenVK/issues/803) and
[change](a4ac715975)
were the basis for this.
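
The deferral described above could be sketched roughly as follows. This is a simplified illustration, not the actual implementation: `SwapchainImageSketch` and its methods are hypothetical, and a plain callback stands in for the real semaphore signal.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a swapchain image that tracks availability.
class SwapchainImageSketch {
public:
    // Signal immediately if the image is available; otherwise queue the
    // signal so it runs once the image becomes available.
    void signalWhenAvailable(std::function<void()> signal) {
        if (available_) {
            signal();
        } else {
            pending_.push_back(std::move(signal));
        }
    }

    // Called as soon as the drawable is presented, rather than when the
    // command buffer finishes. Runs any deferred signals in order.
    void makeAvailable() {
        available_ = true;
        auto pending = std::move(pending_);
        pending_.clear();
        for (auto& signal : pending) {
            signal();
        }
    }

    // Called when the image is handed out again.
    void acquire() { available_ = false; }

private:
    bool available_ = false;
    std::vector<std::function<void()>> pending_;
};
```

In this sketch a signal requested before `makeAvailable()` is held back and fires exactly once, after the image has been marked available.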
On systems not supporting this, the subgroup size is set to 1.
Make sure the subgroup size is fixed in the shader, at least until we
implement `VK_EXT_subgroup_size_control`.
According to the Metal feature set tables, SIMD-group reduction is only
supported on Mac family 2 GPUs and Apple family 7 GPUs. Previously, we
were exposing it on all Mac GPUs.
Quadgroup permutation functions are supported on all Apple GPUs starting
from family 4. We use them for regular group non-uniform ops as well, so
such GPUs are considered to have a subgroup size of 4. On Mac, it's a
bit more complicated. The 2.1 tables say that all Mac GPUs support
quadgroup permutation, but the 3.0 and 4.0 tables say that only family 2
supports quadgroup ops. I've allowed quad ops on family 1 for now.
Unfortunately, my testing shows that SIMD-group functions don't work in
fragment shaders on Mojave, so no fragment shader support until Metal 3.
Update SPIRV-Cross to pull in changes needed for all this.
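
As a rough sketch of the subgroup size selection described above: the function name and parameters are hypothetical, and it assumes that SIMD-group support takes precedence over quadgroup-only support and that the SIMD width is supplied by the caller.

```cpp
#include <cstdint>

// Hypothetical helper: picks the subgroup size to report for a device.
//   hasSimdGroups - device supports SIMD-group functions
//   hasQuadGroups - device supports quadgroup permutation functions
//   simdWidth     - SIMD-group width, supplied by the caller (assumption)
uint32_t subgroupSize(bool hasSimdGroups, bool hasQuadGroups, uint32_t simdWidth) {
    if (hasSimdGroups) { return simdWidth; }  // full SIMD-group support
    if (hasQuadGroups) { return 4; }          // quad ops only: a quad is the subgroup
    return 1;                                 // neither: each thread is its own subgroup
}
```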
This was actually added in iOS 13, but it wasn't present in the betas.
Since the betas also didn't support family 6, this leads me to suspect
that `min_lod_clamp()` requires family 6. So to be safe, only enable the
feature on family 6.
Update SPIRV-Cross to pull in the changes needed for this.
Apple family 7 GPUs (A14) on iOS support multisample layered rendering,
as well as sampler border colors and the mirror clamp to edge sampler
address mode.
We know the actual threadgroup size, because it is declared in the
shader. Therefore, we know the total count of threads per threadgroup,
which is simply the product of the threadgroup size in all three
dimensions. This is necessary if Metal picks a size lower than the app
is expecting. At least one game (NieR: Automata) needs this to work
correctly.
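
The computation itself is just a product of the three declared dimensions. A minimal sketch, with a hypothetical `ThreadgroupSize` struct standing in for the size declared in the shader:

```cpp
#include <cstdint>

// Hypothetical stand-in for the threadgroup size declared in the shader.
struct ThreadgroupSize {
    uint32_t width;
    uint32_t height;
    uint32_t depth;
};

// Total threads per threadgroup: the product of all three dimensions.
constexpr uint32_t threadsPerThreadgroup(ThreadgroupSize size) {
    return size.width * size.height * size.depth;
}
```

Because the result comes from the declared size, it does not depend on whatever size Metal picks at dispatch time.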
Metal's validation layer doesn't like it when we render to both a 2D and
a 3D texture with layered rendering. Work around this by not setting the
`renderTargetArrayLength` if it would be one and we are rendering to
mixed 2D and 3D textures.
If the pipeline sets `[[render_target_array_index]]`, we're hosed either
way. Perhaps this is why Vulkan has a special flag,
`VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT`, intended for rendering to 3D
textures.
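
The workaround amounts to a small decision, sketched below. The function and its parameters are hypothetical, and the return value 0 is used here only as this sketch's marker for "leave `renderTargetArrayLength` unset".

```cpp
#include <cstdint>

// Hypothetical helper: returns the value to assign to
// renderTargetArrayLength, or 0 (this sketch's marker) to leave it unset.
uint32_t chooseRenderTargetArrayLength(uint32_t layerCount,
                                       bool has2DAttachment,
                                       bool has3DAttachment) {
    const bool mixed = has2DAttachment && has3DAttachment;
    // With a single layer and mixed 2D/3D attachments, skip the setting
    // so Metal's validation layer has nothing to complain about.
    if (layerCount == 1 && mixed) {
        return 0;
    }
    return layerCount;
}
```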
This is now supported in MSL 2.3. Support varies by device; devices that
support this return `YES` from `supportsPullModelInterpolation`. Based
on my testing, AMD devices do not yet support this, and Intel devices
do. Apple GPUs probably also support this, in order to support OpenGL on
top.
Update SPIRV-Cross to pull in the changes needed for this.
They can now be used as blit destinations and be cleared with a render
pass.
This also fixes a bug where we were incorrectly validating that the
format supported shader writes if the image was linear, even on
iOS/tvOS.
Add macros to set MTLPixelFormatASTC_*_HDR to MTLPixelFormatInvalid on tvOS.
Use MTLRenderPipelineDescriptor::inputPrimitiveTopologyMVK instead of native method.
Starting in macOS 10.15, all desktop GPUs support 256 KiB offsets. Apple
GPUs, however, do not support this until family 7. It isn't clear which
limit applies on Apple Silicon Macs with family 5 or 6 GPUs, so I have
left the Apple Silicon limit at 64 KiB for now.
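
A sketch of how the limit selection might look. The `GpuInfo` struct and `maxOffsetBytes` function are hypothetical, and the 64 KiB fallback for desktop GPUs before macOS 10.15 is an assumption; the text above states that value only for Apple Silicon.

```cpp
#include <cstdint>

// Hypothetical description of a device; not MoltenVK's actual types.
struct GpuInfo {
    bool isAppleGpu;        // true for Apple GPUs, including Apple Silicon Macs
    int  appleFamily;       // Apple GPU family number; ignored for desktop GPUs
    bool isMacOS10_15Plus;  // true when running macOS 10.15 or later
};

constexpr uint32_t kKiB = 1024;

// Picks the maximum supported offset, in bytes.
uint32_t maxOffsetBytes(const GpuInfo& gpu) {
    if (gpu.isAppleGpu) {
        // Apple GPUs get the larger limit only from family 7 onward.
        return (gpu.appleFamily >= 7) ? 256 * kKiB : 64 * kKiB;
    }
    // Assumption: desktop GPUs before macOS 10.15 keep the 64 KiB limit.
    return gpu.isMacOS10_15Plus ? 256 * kKiB : 64 * kKiB;
}
```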
Don't pass descriptor type to MVKDescriptor functions, which already know their
descriptor type, and remove redundant switch tests in each subclass.
Move sanity tests for correct descriptor type to callers.