1314 Commits

Author SHA1 Message Date
Bill Hollings
fbc3600787 Support Xcode 12.3 and remove Travis CI ref from project. 2021-01-21 12:27:37 -05:00
Malcolm Bechard
526779ad66 fix incorrect behavior for MVKCmdResolveImage
fix incorrectly changing first resolve layer's src/dst base layer,
as well as an out-of-bounds access to the mtlResolveSlices array
2021-01-18 22:58:07 -05:00
Mark Reid
40cd45d4d7 Support immutableSamplers with sampler arrays, fixes #1181 2021-01-09 19:57:39 -08:00
Chip Davis
2eb0fe6947 MVKRenderPass: Only use MTLStorageModeMemoryless where available.
Fixes #1197.
2021-01-05 22:44:55 -06:00
Chip Davis
930525f289 MVKRenderPass: Use a non-trivial granularity for TBDR GPUs.
This parameter is intended to indicate the optimal granularity for the
render area. For TBDR GPUs, this will be the tile size. IMR GPUs
continue to use 1x1.

Apple GPUs support tile sizes of 16x16, 32x16, and 32x32. But, we can't
read the tile size used for a Metal render pass until a render command
encoder has been created. So for now, hardcode 32x32 for TBDR GPUs.
2020-12-31 13:18:11 -06:00
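
The granularity rule described in this commit can be sketched roughly as follows. This is only an illustration, not the MoltenVK code: `Extent2D`, `renderAreaGranularity`, and `alignUp` are hypothetical names, and the 32x32 value is the hard-coded tile size the message mentions.

```cpp
#include <cstdint>

// Hypothetical stand-in for VkExtent2D so the sketch builds without Vulkan headers.
struct Extent2D { uint32_t width; uint32_t height; };

// Hypothetical helper: TBDR GPUs report a hard-coded 32x32 tile,
// IMR GPUs keep reporting 1x1.
constexpr Extent2D renderAreaGranularity(bool isTileBasedDeferred) {
    return isTileBasedDeferred ? Extent2D{32, 32} : Extent2D{1, 1};
}

// Rounds a render-area dimension up to the next multiple of the granularity.
constexpr uint32_t alignUp(uint32_t value, uint32_t granularity) {
    return ((value + granularity - 1) / granularity) * granularity;
}
```

An app that wants an optimal render area would round its extent up with something like `alignUp(height, renderAreaGranularity(true).height)`.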
Bill Hollings
34dc378371
Merge pull request #1191 from cdavis5e/apple-gpu-barriers
Don't use barriers in render passes on Apple GPUs.
2020-12-31 09:40:17 -05:00
Chip Davis
d22d9e3d17 Don't use barriers in render passes on Apple GPUs.
Apple GPUs don't support them, and in fact Metal's validation layer will
complain if you try to use them.
2020-12-30 15:09:53 -06:00
Bill Hollings
e9bc3b3c62
Merge pull request #1193 from cdavis5e/memoryless-host-accessibility
MVKDeviceMemory: Don't consider Memoryless host-accessible on macOS/tvOS.
2020-12-30 10:11:24 -05:00
Bill Hollings
c985bbf4c3
Merge pull request #1192 from cdavis5e/memoryless-rt-actions
MVKRenderPass: Don't use Load/Store actions for memoryless.
2020-12-30 10:06:09 -05:00
Bill Hollings
4f0d370166
Merge pull request #1190 from cdavis5e/render-rgb9e5-mask
MVKGraphicsPipeline: Fix color write mask with RGB9E5 RTs.
2020-12-30 09:52:29 -05:00
Bill Hollings
326e872a65
Merge pull request #1189 from cdavis5e/sync-create-texture
MVKImagePlane: When sync'ing, create the texture if it doesn't exist.
2020-12-30 09:48:34 -05:00
Chip Davis
4d6b92bb9c MVKDeviceMemory: Don't consider Memoryless host-accessible on macOS/tvOS.
I missed this when I added support for memoryless on macOS.
2020-12-29 21:21:02 -06:00
Chip Davis
aa65392027 MVKRenderPass: Don't use Load/Store actions for memoryless.
Memoryless textures cannot use `Load`/`Store` actions, because there is
no memory to load from or store to. So don't use these actions, even if
we're not rendering to the whole thing.

Multiple subpasses might be a problem. Tessellation and indirect
multiview *will* be problems, because these require us to interrupt the
render pass in order to do compute--and that causes us to use `Store`
unconditionally.

One option, which I've mentioned before, is using tile shaders for these
cases. But I haven't looked seriously into that yet. The other involves
a subtle distinction between Metal's `MTLStorageModeMemoryless` and
Vulkan's `VK_MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT`: the former *never*
commits memory; while the latter doesn't commit memory *at first*, but
may do so later. If we find that we're going to need to `Store` to a
`LAZILY_ALLOCATED` image, then what we can do is replace the
`Memoryless` texture with one with a real backing store in `Private`
memory. This change does not do that yet. It'll require some more
thought.

As for multiple subpasses, I eventually want to look into optimizing
render passes by shuffling the subpasses around to minimize the need to
load and store attachments from/to memory, which TBDR GPUs absolutely
hate. That should help with this problem, too.
2020-12-29 21:19:18 -06:00
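
A minimal sketch of the rule in this commit, assuming that `DontCare` is the substitute for the forbidden actions. The enums and function names below are hypothetical stand-ins, not Metal or MoltenVK types.

```cpp
// Hypothetical stand-ins for Metal's load and store actions.
enum class LoadAction { DontCare, Load, Clear };
enum class StoreAction { DontCare, Store };

// A memoryless attachment has no backing memory to read from,
// so a requested Load is downgraded (assumption: to DontCare).
constexpr LoadAction loadActionFor(LoadAction requested, bool isMemoryless) {
    return (isMemoryless && requested == LoadAction::Load) ? LoadAction::DontCare
                                                           : requested;
}

// Likewise there is nothing to write back to, so Store is downgraded.
constexpr StoreAction storeActionFor(StoreAction requested, bool isMemoryless) {
    return (isMemoryless && requested == StoreAction::Store) ? StoreAction::DontCare
                                                             : requested;
}
```

`Clear` is left alone in this sketch, since clearing does not read from memory.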
Chip Davis
e6a8409b31 MVKGraphicsPipeline: Fix color write mask with RGB9E5 RTs.
Metal does not allow color write masks with this format where some but
not all of the channels are disabled. Either all must be enabled or none
must be enabled. Presumably, this is because of the shared exponent.

This is just good enough to stop the validation layer from violently
terminating the program. To implement this properly requires using
framebuffer fetch, with a change to SPIRV-Cross. Luckily, the only GPUs
that support `RGB9E5` rendering also support framebuffer fetch.
Honestly, I don't understand why Apple's drivers don't do this.
2020-12-29 21:15:14 -06:00
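
The all-or-nothing constraint could be handled along these lines. The policy shown (any enabled channel promotes the mask to all channels) is an assumption made for illustration; the commit message does not say which way a partial mask is resolved. The mask constants and `adjustWriteMask` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical 4-bit channel mask (one bit per channel); not the Metal enum values.
constexpr uint32_t kWriteMaskNone = 0x0;
constexpr uint32_t kWriteMaskAll  = 0xF;

// For RGB9E5 targets, a partial mask is not allowed. Assumed policy:
// if any channel is enabled, enable all of them; otherwise enable none.
constexpr uint32_t adjustWriteMask(uint32_t requested, bool isRGB9E5) {
    return !isRGB9E5 ? requested
                     : ((requested & kWriteMaskAll) != 0 ? kWriteMaskAll
                                                         : kWriteMaskNone);
}
```

As the message notes, this only keeps validation quiet; honoring a partial mask exactly would need framebuffer fetch.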
Chip Davis
f875055ecd MVKImagePlane: When sync'ing, create the texture if it doesn't exist.
Before we can upload the texture, we need to make sure there is a
texture to upload to.
2020-12-29 21:13:43 -06:00
Chip Davis
1bbab6151a MVKPixelFormats: Enable RenderTarget usage for linear textures on Apple GPUs.
I forgot to do this when I added the `renderLinearTexture` feature.
2020-12-29 21:12:14 -06:00
Bill Hollings
94a81177cb Update MoltenVK to version 1.1.2.
Update VK_MVK_MOLTENVK_SPEC_VERSION to 30.
Update What's New document.
2020-12-26 13:45:10 -05:00
Jan Sikorski
4440a64d83 Config: Added setting for fastMathEnabled Metal Compiler option.
Set it to false by default, as it creates visual glitches in at least one game,
caused by depth fighting between a depth and a color render pass.
2020-12-14 17:41:28 +01:00
Bill Hollings
1ccc0ab7b5 Update dependency libraries to match Vulkan SDK 1.2.162.
Fix Mac Catalyst build failure, plus several build warnings on other platforms.
Update What's New document.
2020-12-08 21:31:39 -05:00
Chip Davis
13b1840ad0 MVKImage: Fix compilation with Xcode 11.
We can't rely on those enums not being re-`#define`d, because they're
only re-`#define`d like that in MVKPixelFormats.mm.

Signed-off-by: Chip Davis <cdavis@codeweavers.com>
2020-12-04 02:35:13 -06:00
Chip Davis
891654d5d0 MVKSwapchain: Allow images whose size doesn't match the CAMetalLayer.
The game NieR: Automata (via DXVK) attempts to create a 1680x1050
swapchain on a 1600x900 window. It then attempts to render to this
swapchain with a 1680x1050 framebuffer. But we created the textures at
1600x900, matching the window. The render area is thus too big, which
triggers a Metal validation failure. Apparently, DXVK doesn't check the
surface caps before creating the Vulkan swapchain.

Rather than expecting the swapchain to be the same size as the layer, we
can actually support any swapchain size, up to the maximum size of a
texture supported by the device. The system will just scale the texture
when rendering it if it doesn't match the layer size. If the sizes don't
match up, we return `VK_SUBOPTIMAL_KHR`, instead of
`VK_ERROR_OUT_OF_DATE_KHR`, indicating that presentation is still
possible, but performance may suffer. This is good enough to let the
game continue under the validation layers.

This really needs a corresponding change to the CTS, because it
currently assumes that we can't do this.

Signed-off-by: Chip Davis <cdavis@codeweavers.com>
2020-12-03 10:18:03 -06:00
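
The behavior described above might be sketched like this. All names are hypothetical stand-ins (the real code deals in `VkResult` values and `CAMetalLayer` sizes), and the maximum texture dimension is passed in as a plain parameter.

```cpp
#include <cstdint>

// Hypothetical stand-ins for VkExtent2D and the relevant VkResult values.
struct Extent2D { uint32_t width; uint32_t height; };
enum class PresentResult { Success, Suboptimal };

constexpr bool sameSize(Extent2D a, Extent2D b) {
    return a.width == b.width && a.height == b.height;
}

// Any non-empty extent up to the device's maximum texture size is accepted.
constexpr bool isSwapchainExtentSupported(Extent2D e, uint32_t maxTextureDimension) {
    return e.width > 0 && e.height > 0 &&
           e.width <= maxTextureDimension && e.height <= maxTextureDimension;
}

// A size mismatch is reported as suboptimal rather than out of date,
// because the system can still scale the image to the layer.
constexpr PresentResult presentStatus(Extent2D swapchain, Extent2D layer) {
    return sameSize(swapchain, layer) ? PresentResult::Success
                                      : PresentResult::Suboptimal;
}
```

With the sizes from the NieR: Automata case, `presentStatus({1680, 1050}, {1600, 900})` yields `PresentResult::Suboptimal`.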
Chip Davis
f74356e51c Re-enable MTLEvent-based binary semaphores.
I haven't seen any problems yet. It passes all the tests--even the
`signal_order` tests. I have tried it with some games, and it seems OK
there, too.
2020-12-02 20:13:43 -06:00
Bill Hollings
6110c349ce
Merge pull request #1168 from billhollings/mac-catalyst
Support Mac Catalyst on macOS 11.0+
2020-12-02 19:51:52 -05:00
Bill Hollings
7f62362db3 Fixes from review for Mac Catalyst on macOS 11.0+.
MVKPixelFormats revert testing for stencil feedback and add Mac Catalyst test.
Revert mvkOSVersionIsAtLeast(mac, ios) to remove test for Mac Catalyst version.
MVKRenderPass revert testing for MTLStorageModeMemoryless for iOS.
2020-12-02 16:22:38 -05:00
Chip Davis
cbc2443ce4 Handle device loss.
If a command buffer fails in Metal, its `status` becomes
`MTLCommandBufferStatusError`, and its `error` property is populated. In
Vulkan, command buffer failure usually triggers device loss. This is
because most drivers can't guarantee that resources weren't affected in
an undefined fashion. We can't make that guarantee, either, so when a
Metal command buffer has an error, mark the device lost. The error is
also logged. All waits are immediately signaled; those that were client
requests instead of internal implementation details immediately return
`VK_ERROR_DEVICE_LOST`.

The app may be able to recreate the logical device and reconstruct its
state. However, some Metal errors are severe enough that all subsequent
command buffers will fail--in those cases, the physical device becomes
lost as well, indicating that the device cannot be recreated.

Only certain commands are allowed to report device loss. These commands
test for device loss before continuing, and will immediately return if
the device is lost.
2020-12-02 11:57:39 -06:00
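
A rough sketch of the device-loss flow described in this commit, under the assumption that a simple atomic flag is enough to model it. `Device`, `onCommandBufferError`, and `waitIdle` are hypothetical names, not the MoltenVK classes or the Vulkan entry points.

```cpp
#include <atomic>

// Hypothetical stand-in for the relevant VkResult values.
enum class Result { Success, ErrorDeviceLost };

class Device {
public:
    // Called when a command buffer completes with an error status.
    // An unrecoverable error also marks the physical device as lost.
    void onCommandBufferError(bool unrecoverable) {
        deviceLost_.store(true);
        if (unrecoverable) { physicalDeviceLost_.store(true); }
    }

    // Commands allowed to report device loss check the flag first
    // and return immediately if it is set.
    Result waitIdle() const {
        return deviceLost_.load() ? Result::ErrorDeviceLost : Result::Success;
    }

    // The logical device can be recreated only if the physical device survived.
    bool canRecreate() const { return !physicalDeviceLost_.load(); }

private:
    std::atomic<bool> deviceLost_{false};
    std::atomic<bool> physicalDeviceLost_{false};
};
```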
Chip Davis
e0e5d3ce28 Support the VK_EXT_subgroup_size_control extension.
This extension allows the subgroup size to vary between draw/dispatch
calls, and even allows clients to declare that full subgroups must
always be dispatched. It corresponds better to how Metal actually works.

No support for declaring a required subgroup size, unfortunately.
2020-12-02 09:38:58 -06:00
Bill Hollings
ab34b8c6d4 Remove spurious debug log message. 2020-12-02 08:07:39 -05:00
Bill Hollings
1ec58c9a92 Support building for Mac Catalyst on macOS 11.0+.
Define MVK_MACCAT build macro and use it to conditionally compile code to align
with build features and capabilities of Mac Catalyst platform on macOS 11.0+.
Treat Mac Catalyst as minor variation of macOS 11.0.
Update documentation.

Currently only support Mac Catalyst on macOS 11.0+, to avoid complexities of
deselecting iOS features and capabilities for Mac Catalyst on previous macOS versions.

Mac Catalyst (and Simulators) require use of XCFrameworks.
Currently unable to generate a dylib for Mac Catalyst.
2020-12-01 19:26:15 -05:00
Chip Davis
1801c5b80b MVKExtensions: Clean up mvkIsSupportedOnPlatform().
There's a lot of redundant pointer comparison and version checking in
that function. Use a macro to reduce the redundancy.

Advertise `VK_KHR_shader_subgroup_extended_types` on iOS. I forgot to do
this in #1159.
2020-11-30 17:01:57 -06:00
Chip Davis
898c03d77f MVKSync: Miscellaneous fixes.
Fix reversed condition in `MVKTimelineSemaphoreEmulated`, which probably
caused some test failures.

Don't increment the `MTLEvent` binary semaphore's counter unless a
command buffer is actually present to schedule a wait on.

Defer signal operations when a swapchain image is not yet available.
That way, the correct value will be signaled when the image is ready,
instead of causing GPU lockups and timeouts.

When a drawable is presented, immediately mark it available, instead of
waiting until the command buffer finishes. Otherwise, the wrong
semaphore could be signaled when an image is used twice in a row.

This should fix the problems using `MTLEvent` binary semaphores with
presentation, which was preventing us from enabling them by default when
available.

Special thanks to @apayen, whose
[idea](https://github.com/KhronosGroup/MoltenVK/issues/803) and
[change](a4ac715975)
were the basis for this.
2020-11-28 11:18:20 -06:00
Bill Hollings
242edbdc0b
Merge pull request #1164 from js6i/master
MVKImage: Avoid swizzling storage and/or attachment image views.
2020-11-28 11:43:32 -05:00
Bill Hollings
2b8bc02106
Merge pull request #1162 from cdavis5e/min-sample-shading
MVKGraphicsPipeline: Handle minSampleShading.
2020-11-28 11:40:53 -05:00
Jan Sikorski
597e746012 MVKImage: Avoid swizzling storage and/or attachment image views.
Detect more swizzle patterns that are equivalent to identity, so that Metal
does not disable usage capabilities that we want.
2020-11-27 17:10:46 +01:00
Chip Davis
85416e297e MVKGraphicsPipeline: Handle minSampleShading.
If it is enabled and nonzero, then force sample-rate shading in the
fragment shader.
2020-11-26 14:36:53 -06:00
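
The condition in this commit amounts to a small predicate. The function name is hypothetical; the two parameters mirror the `sampleShadingEnable` and `minSampleShading` fields of Vulkan's multisample state.

```cpp
// Sample-rate shading is forced only when the feature is enabled
// and the requested minimum fraction is nonzero.
constexpr bool forceSampleRateShading(bool sampleShadingEnable, float minSampleShading) {
    return sampleShadingEnable && minSampleShading > 0.0f;
}
```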
Chip Davis
346c532648 MVKPhysicalDevice: Enable texture swizzle on all Apple GPUs.
I misread the table when I added this.
2020-11-26 14:34:34 -06:00
Chip Davis
8e11c41c40 MVKPhysicalDevice: Correct subgroup properties.
On systems not supporting this, the subgroup size is set to 1.

Make sure the subgroup size is fixed in the shader, at least until we
implement `VK_EXT_subgroup_size_control`.

According to the Metal feature set tables, SIMD-group reduction is only
supported on Mac family 2 GPUs and Apple family 7 GPUs. Previously, we
were exposing these on all Mac GPUs.

Quadgroup permutation is supported on all Apple GPUs starting from
family 4. We use them for regular group non-uniform ops as well, so
these are considered to have a subgroup size of 4. On Mac, it's a bit
more complicated. The 2.1 tables say that all Mac GPUs support this, but
the 3.0 and 4.0 tables say that only family 2 supports quadgroup ops.
I've allowed quad ops on family 1 for now.

Unfortunately, my testing shows that SIMD-group functions don't work in
fragment shaders on Mojave, so no fragment shader support until Metal 3.

Update SPIRV-Cross to pull in changes needed for all this.
2020-11-25 12:02:37 -06:00
michael(jf.lai)
fee3e5da3a Fix macOS 10.15.5 with Xcode 12.1 compilation error. 2020-11-20 00:39:42 +08:00
Bill Hollings
883b96f611
Merge pull request #1153 from js6i/master
MVKBufferView: Avoid triggering bytesPerRow validation warning.
2020-11-17 09:38:02 -05:00
Jan Sikorski
8173a9717d MVKBufferView: Avoid triggering bytesPerRow validation warning. 2020-11-17 14:26:38 +01:00
Chip Davis
3b33c9ce05 MVKPhysicalDevice: Enable shaderResourceMinLod on iOS.
This was actually added in iOS 13, but it wasn't present in the betas.
Since the betas also didn't support family 6, this leads me to suspect
that `min_lod_clamp()` requires family 6. So to be safe, only enable the
feature on family 6.

Update SPIRV-Cross to pull in the changes needed for this.
2020-11-16 20:47:31 -06:00
Bill Hollings
4e0abab7db
Merge pull request #1151 from cdavis5e/ios-family-7
Enable some family 7 features on iOS.
2020-11-16 21:41:22 -05:00
Bill Hollings
3b71eb0b2a
Merge pull request #1149 from cdavis5e/ios-3d-compressed
MVKPhysicalDevice: Enable 3D compressed textures on iOS/tvOS.
2020-11-16 21:29:59 -05:00
Chip Davis
4f8613c650 Enable some family 7 features on iOS.
Apple family 7 GPUs (A14) on iOS support multisample layered rendering,
as well as sampler border colors and the mirror clamp to edge sampler
address mode.
2020-11-16 14:46:07 -06:00
Chip Davis
c826d58946 MVKComputePipeline: Override max threads per threadgroup.
We know the actual threadgroup size, because it is declared in the
shader. Therefore, we know the total count of threads per threadgroup,
which is simply the product of the threadgroup size in all three
dimensions. This is necessary if Metal picks a size lower than the app
is expecting. At least one game (NieR: Automata) needs this to work
correctly.
2020-11-15 20:03:53 -06:00
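
The thread count mentioned above is just the product of the three declared dimensions. A minimal sketch, with `WorkgroupSize` and `totalThreadsPerThreadgroup` as hypothetical names:

```cpp
#include <cstdint>

// Hypothetical stand-in for the threadgroup size declared in the shader.
struct WorkgroupSize { uint32_t x; uint32_t y; uint32_t z; };

// Total threads per threadgroup: the product of all three dimensions.
constexpr uint32_t totalThreadsPerThreadgroup(WorkgroupSize size) {
    return size.x * size.y * size.z;
}
```

For a shader declaring an 8x8x4 threadgroup, this gives 256 threads, which would then be used as the pipeline's maximum instead of whatever lower value Metal might pick.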
Chip Davis
0dc92be38f MVKPhysicalDevice: Enable 3D compressed textures on iOS/tvOS.
Forbid ETC2 and EAC 3D textures on all platforms. Apple GPUs do not
support 3D for those.
2020-11-15 13:56:15 -06:00
Bill Hollings
1772790f46
Merge pull request #1147 from cdavis5e/mac-apple-family-7
MVKPhysicalDevice: Enable Apple family 7 features on macOS.
2020-11-15 14:35:42 -05:00
Chip Davis
a63861c732
Merge pull request #1146 from TheSpydog/dont-cull-clears
Disable culling for the duration of vkCmdClearAttachments
2020-11-14 18:17:41 -06:00
Caleb Cornett
2c33c69e22 Set viewport and scissor rect to the framebuffer extents before clearing. And minor style fixes. 2020-11-14 16:27:39 -05:00
Caleb Cornett
85c6e453b2 Set fill mode, depth bias, viewport, and scissor states before clearing attachments 2020-11-14 15:32:37 -05:00
Chip Davis
6a00362b7b MVKPhysicalDevice: Enable Apple family 7 features on macOS.
The M1 chip supports up to GPU family 7. The enum for this was added to
the SDK in the final version of Xcode 12.2; now we can use it.
2020-11-13 18:10:02 -06:00