moltenvk

Author	SHA1	Message	Date
Bill Hollings	fbc3600787	Support Xcode 12.3 and remove Travis CI ref from project.	2021-01-21 12:27:37 -05:00
Malcolm Bechard	526779ad66	Fix incorrect behavior for MVKCmdResolveImage. Fix incorrectly changing the first resolve layer's src/dst base layer, as well as an out-of-bounds access to the mtlResolveSlices array.	2021-01-18 22:58:07 -05:00
Mark Reid	40cd45d4d7	Support immutableSamplers with sampler arrays, fixes #1181	2021-01-09 19:57:39 -08:00
Chip Davis	2eb0fe6947	MVKRenderPass: Only use MTLStorageModeMemoryless where available. Fixes #1197.	2021-01-05 22:44:55 -06:00
Chip Davis	930525f289	MVKRenderPass: Use a non-trivial granularity for TBDR GPUs. This parameter is intended to indicate the optimal granularity for the render area. For TBDR GPUs, this will be the tile size. IMR GPUs continue to use 1x1. Apple GPUs support tile sizes of 16x16, 32x16, and 32x32. But, we can't read the tile size used for a Metal render pass until a render command encoder has been created. So for now, hardcode 32x32 for TBDR GPUs.	2020-12-31 13:18:11 -06:00
Bill Hollings	34dc378371	Merge pull request #1191 from cdavis5e/apple-gpu-barriers Don't use barriers in render passes on Apple GPUs.	2020-12-31 09:40:17 -05:00
Chip Davis	d22d9e3d17	Don't use barriers in render passes on Apple GPUs. Apple GPUs don't support them, and in fact Metal's validation layer will complain if you try to use them.	2020-12-30 15:09:53 -06:00
Bill Hollings	e9bc3b3c62	Merge pull request #1193 from cdavis5e/memoryless-host-accessibility MVKDeviceMemory: Don't consider Memoryless host-accessible on macOS/tvOS.	2020-12-30 10:11:24 -05:00
Bill Hollings	c985bbf4c3	Merge pull request #1192 from cdavis5e/memoryless-rt-actions MVKRenderPass: Don't use Load/Store actions for memoryless.	2020-12-30 10:06:09 -05:00
Bill Hollings	4f0d370166	Merge pull request #1190 from cdavis5e/render-rgb9e5-mask MVKGraphicsPipeline: Fix color write mask with RGB9E5 RTs.	2020-12-30 09:52:29 -05:00
Bill Hollings	326e872a65	Merge pull request #1189 from cdavis5e/sync-create-texture MVKImagePlane: When sync'ing, create the texture if it doesn't exist.	2020-12-30 09:48:34 -05:00
Chip Davis	4d6b92bb9c	MVKDeviceMemory: Don't consider Memoryless host-accessible on macOS/tvOS. I missed this when I added support for memoryless on macOS.	2020-12-29 21:21:02 -06:00
Chip Davis	aa65392027	MVKRenderPass: Don't use Load/Store actions for memoryless. Memoryless textures cannot use `Load`/`Store` actions, because there is no memory to load from or store to. So don't use these actions, even if we're not rendering to the whole thing. Multiple subpasses might be a problem. Tessellation and indirect multiview will be problems, because these require us to interrupt the render pass in order to do compute--and that causes us to use `Store` unconditionally. One option, which I've mentioned before, is using tile shaders for these cases. But I haven't looked seriously into that yet. The other involves a subtle distinction between Metal's `MTLStorageModeMemoryless` and Vulkan's `VK_MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT`: the former never commits memory; while the latter doesn't commit memory at first, but may do so later. If we find that we're going to need to `Store` to a `LAZILY_ALLOCATED` image, then what we can do is replace the `Memoryless` texture with one with a real backing store in `Private` memory. This change does not do that yet. It'll require some more thought. As for multiple subpasses, I eventually want to look into optimizing render passes by shuffling the subpasses around to minimize the need to load and store attachments from/to memory, which TBDR GPUs absolutely hate. That should help with this problem, too.	2020-12-29 21:19:18 -06:00
Chip Davis	e6a8409b31	MVKGraphicsPipeline: Fix color write mask with RGB9E5 RTs. Metal does not allow color write masks with this format where some but not all of the channels are disabled. Either all must be enabled or none must be enabled. Presumably, this is because of the shared exponent. This is just good enough to stop the validation layer from violently terminating the program. To implement this properly requires using framebuffer fetch, with a change to SPIRV-Cross. Luckily, the only GPUs that support `RGB9E5` rendering also support framebuffer fetch. Honestly, I don't understand why Apple's drivers don't do this.	2020-12-29 21:15:14 -06:00
Chip Davis	f875055ecd	MVKImagePlane: When sync'ing, create the texture if it doesn't exist. Before we can upload the texture, we need to make sure there is a texture to upload to.	2020-12-29 21:13:43 -06:00
Chip Davis	1bbab6151a	MVKPixelFormats: Enable RenderTarget usage for linear textures on Apple GPUs. I forgot to do this when I added the `renderLinearTexture` feature.	2020-12-29 21:12:14 -06:00
Bill Hollings	94a81177cb	Update MoltenVK to version 1.1.2. Update VK_MVK_MOLTENVK_SPEC_VERSION to 30. Update What's New document.	2020-12-26 13:45:10 -05:00
Jan Sikorski	4440a64d83	Config: Added setting for fastMathEnabled Metal Compiler option. Set it to false by default, as it creates visual glitches in at least one game, caused by depth fighting between a depth and a color render pass.	2020-12-14 17:41:28 +01:00
Bill Hollings	1ccc0ab7b5	Update dependency libraries to match Vulkan SDK 1.2.162. Fix Mac Catalyst build failure, plus several build warnings on other platforms. Update What's New document.	2020-12-08 21:31:39 -05:00
Chip Davis	13b1840ad0	MVKImage: Fix compilation with Xcode 11. We can't rely on those enums not being re-`#define`d, because they're only re-`#define`d like that in MVKPixelFormats.mm. Signed-off-by: Chip Davis <cdavis@codeweavers.com>	2020-12-04 02:35:13 -06:00
Chip Davis	891654d5d0	MVKSwapchain: Allow images whose size doesn't match the CAMetalLayer. The game NieR: Automata (via DXVK) attempts to create a 1680x1050 swapchain on a 1600x900 window. It then attempts to render to this swapchain with a 1680x1050 framebuffer. But we created the textures at 1600x900, matching the window. The render area is thus too big, which triggers a Metal validation failure. Apparently, DXVK doesn't check the surface caps before creating the Vulkan swapchain. Rather than expecting the swapchain to be the same size as the layer, we can actually support any swapchain size, up to the maximum size of a texture supported by the device. The system will just scale the texture when rendering it if it doesn't match the layer size. If the sizes don't match up, we return `VK_SUBOPTIMAL_KHR`, instead of `VK_ERROR_OUT_OF_DATE_KHR`, indicating that presentation is still possible, but performance may suffer. This is good enough to let the game continue under the validation layers. This really needs a corresponding change to the CTS, because it currently assumes that we can't do this. Signed-off-by: Chip Davis <cdavis@codeweavers.com>	2020-12-03 10:18:03 -06:00
Chip Davis	f74356e51c	Re-enable MTLEvent-based binary semaphores. I haven't seen any problems yet. It passes all the tests--even the `signal_order` tests. I have tried it with some games, and it seems OK there, too.	2020-12-02 20:13:43 -06:00
Bill Hollings	6110c349ce	Merge pull request #1168 from billhollings/mac-catalyst Support Mac Catalyst on macOS 11.0+	2020-12-02 19:51:52 -05:00
Bill Hollings	7f62362db3	Fixes from review for Mac Catalyst on macOS 11.0+. MVKPixelFormats revert testing for stencil feedback and add Mac Catalyst test. Revert mvkOSVersionIsAtLeast(mac, ios) to remove test for Mac Catalyst version. MVKRenderPass revert testing for MTLStorageModeMemoryless for iOS.	2020-12-02 16:22:38 -05:00
Chip Davis	cbc2443ce4	Handle device loss. If a command buffer fails in Metal, its `status` becomes `MTLCommandBufferStatusError`, and its `error` property is populated. In Vulkan, command buffer failure usually triggers device loss. This is because most drivers can't guarantee that resources weren't affected in an undefined fashion. We can't make that guarantee, either, so when a Metal command buffer has an error, mark the device lost. The error is also logged. All waits are immediately signaled; those that were client requests instead of internal implementation details immediately return `VK_ERROR_DEVICE_LOST`. The app may be able to recreate the logical device and reconstruct its state. However, some Metal errors are severe enough that all subsequent command buffers will fail--in those cases, the physical device becomes lost as well, indicating that the device cannot be recreated. Only certain commands are allowed to report device loss. These commands test for device loss before continuing, and will immediately return if the device is lost.	2020-12-02 11:57:39 -06:00
Chip Davis	e0e5d3ce28	Support the VK_EXT_subgroup_size_control extension. This extension allows the subgroup size to vary between draw/dispatch calls, and even allows clients to declare that full subgroups must always be dispatched. It corresponds better to how Metal actually works. No support for declaring a required subgroup size, unfortunately.	2020-12-02 09:38:58 -06:00
Bill Hollings	ab34b8c6d4	Remove spurious debug log message.	2020-12-02 08:07:39 -05:00
Bill Hollings	1ec58c9a92	Support building for Mac Catalyst on macOS 11.0+. Define MVK_MACCAT build macro and use it to conditionally compile code to align with build features and capabilities of Mac Catalyst platform on macOS 11.0+. Treat Mac Catalyst as minor variation of macOS 11.0. Update documentation. Currently only support Mac Catalyst on macOS 11.0+, to avoid complexities of deselecting iOS features and capabilities for Mac Catalyst on previous macOS versions. Mac Catalyst (and Simulators) require use of XCFrameworks. Currently unable to generate a dylib for Mac Catalyst.	2020-12-01 19:26:15 -05:00
Chip Davis	1801c5b80b	MVKExtensions: Clean up mvkIsSupportedOnPlatform(). There's a lot of redundant pointer comparison and version checking in that function. Use a macro to reduce the redundancy. Advertise `VK_KHR_shader_subgroup_extended_types` on iOS. I forgot to do this in #1159.	2020-11-30 17:01:57 -06:00
Chip Davis	898c03d77f	MVKSync: Miscellaneous fixes. Fix reversed condition in `MVKTimelineSemaphoreEmulated`, which probably caused some test failures. Don't increment the `MTLEvent` binary semaphore's counter unless a command buffer is actually present to schedule a wait on. Defer signal operations when a swapchain image is not yet available. That way, the correct value will be signaled when the image is ready, instead of causing GPU lockups and timeouts. When a drawable is presented, immediately mark it available, instead of waiting until the command buffer finishes. Otherwise, the wrong semaphore could be signaled when an image is used twice in a row. This should fix the problems using `MTLEvent` binary semaphores with presentation, which was preventing us from enabling them by default when available. Special thanks to @apayen, whose [idea](https://github.com/KhronosGroup/MoltenVK/issues/803) and [change](`a4ac715975`) were the basis for this.	2020-11-28 11:18:20 -06:00
Bill Hollings	242edbdc0b	Merge pull request #1164 from js6i/master MVKImage: Avoid swizzling storage and/or attachment image views.	2020-11-28 11:43:32 -05:00
Bill Hollings	2b8bc02106	Merge pull request #1162 from cdavis5e/min-sample-shading MVKGraphicsPipeline: Handle minSampleShading.	2020-11-28 11:40:53 -05:00
Jan Sikorski	597e746012	MVKImage: Avoid swizzling storage and/or attachment image views. Detect more swizzle patterns that are equivalent to identity, so that Metal does not disable usage capabilities that we want.	2020-11-27 17:10:46 +01:00
Chip Davis	85416e297e	MVKGraphicsPipeline: Handle minSampleShading. If it is enabled and nonzero, then force sample-rate shading in the fragment shader.	2020-11-26 14:36:53 -06:00
Chip Davis	346c532648	MVKPhysicalDevice: Enable texture swizzle on all Apple GPUs. I misread the table when I added this.	2020-11-26 14:34:34 -06:00
Chip Davis	8e11c41c40	MVKPhysicalDevice: Correct subgroup properties. On systems not supporting this, the subgroup size is set to 1. Make sure the subgroup size is fixed in the shader, at least until we implement `VK_EXT_subgroup_size_control`. According to the Metal feature set tables, SIMD-group reduction is only supported on Mac family 2 GPUs and Apple family 7 GPUs. Previously, we were exposing these on all Mac GPUs. Quadgroup permutation is supported on all Apple GPUs starting from family 4. We use them for regular group non-uniform ops as well, so these are considered to have a subgroup size of 4. On Mac, it's a bit more complicated. The 2.1 tables say that all Mac GPUs support this, but the 3.0 and 4.0 tables say that only family 2 supports quadgroup ops. I've allowed quad ops on family 1 for now. Unfortunately, my testing shows that SIMD-group functions don't work in fragment shaders on Mojave, so no fragment shader support until Metal 3. Update SPIRV-Cross to pull in changes needed for all this.	2020-11-25 12:02:37 -06:00
michael(jf.lai)	fee3e5da3a	Fix macOS 10.15.5 with Xcode 12.1 compilation error.	2020-11-20 00:39:42 +08:00
Bill Hollings	883b96f611	Merge pull request #1153 from js6i/master MVKBufferView: Avoid triggering bytesPerRow validation warning.	2020-11-17 09:38:02 -05:00
Jan Sikorski	8173a9717d	MVKBufferView: Avoid triggering bytesPerRow validation warning.	2020-11-17 14:26:38 +01:00
Chip Davis	3b33c9ce05	MVKPhysicalDevice: Enable shaderResourceMinLod on iOS. This was actually added in iOS 13, but it wasn't present in the betas. Since the betas also didn't support family 6, this leads me to suspect that `min_lod_clamp()` requires family 6. So to be safe, only enable the feature on family 6. Update SPIRV-Cross to pull in the changes needed for this.	2020-11-16 20:47:31 -06:00
Bill Hollings	4e0abab7db	Merge pull request #1151 from cdavis5e/ios-family-7 Enable some family 7 features on iOS.	2020-11-16 21:41:22 -05:00
Bill Hollings	3b71eb0b2a	Merge pull request #1149 from cdavis5e/ios-3d-compressed MVKPhysicalDevice: Enable 3D compressed textures on iOS/tvOS.	2020-11-16 21:29:59 -05:00
Chip Davis	4f8613c650	Enable some family 7 features on iOS. Apple family 7 GPUs (A14) on iOS support multisample layered rendering, as well as sampler border colors and the mirror clamp to edge sampler address mode.	2020-11-16 14:46:07 -06:00
Chip Davis	c826d58946	MVKComputePipeline: Override max threads per threadgroup. We know the actual threadgroup size, because it is declared in the shader. Therefore, we know the total count of threads per threadgroup, which is simply the product of the threadgroup size in all three dimensions. This is necessary if Metal picks a size lower than the app is expecting. At least one game (NieR: Automata) needs this to work correctly.	2020-11-15 20:03:53 -06:00
Chip Davis	0dc92be38f	MVKPhysicalDevice: Enable 3D compressed textures on iOS/tvOS. Forbid ETC2 and EAC 3D textures on all platforms. Apple GPUs do not support 3D for those.	2020-11-15 13:56:15 -06:00
Bill Hollings	1772790f46	Merge pull request #1147 from cdavis5e/mac-apple-family-7 MVKPhysicalDevice: Enable Apple family 7 features on macOS.	2020-11-15 14:35:42 -05:00
Chip Davis	a63861c732	Merge pull request #1146 from TheSpydog/dont-cull-clears Disable culling for the duration of vkCmdClearAttachments	2020-11-14 18:17:41 -06:00
Caleb Cornett	2c33c69e22	Set viewport and scissor rect to the framebuffer extents before clearing. And minor style fixes.	2020-11-14 16:27:39 -05:00
Caleb Cornett	85c6e453b2	Set fill mode, depth bias, viewport, and scissor states before clearing attachments	2020-11-14 15:32:37 -05:00
Chip Davis	6a00362b7b	MVKPhysicalDevice: Enable Apple family 7 features on macOS. The M1 chip supports up to GPU family 7. The enum for this was added to the SDK in the final version of Xcode 12.2; now we can use it.	2020-11-13 18:10:02 -06:00
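
Three of the commit messages above (930525f289, c826d58946, 597e746012) state small rules that are easy to express in code. The following is a minimal C++ sketch of those rules only. Every type and function name in it is hypothetical and is not taken from the MoltenVK sources, and the swizzle check shows just one plausible "equivalent to identity" pattern, since the commit message does not list the patterns it detects.

```cpp
#include <cstdint>

// All names below are hypothetical illustrations, not MoltenVK identifiers.

struct Extent2D { uint32_t width, height; };
struct ThreadgroupSize { uint32_t width, height, depth; };

// 930525f289: the render area granularity is the tile size on TBDR GPUs
// (hardcoded to 32x32, per the commit message) and 1x1 on IMR GPUs.
constexpr Extent2D renderAreaGranularity(bool isTileBasedDeferred) {
    return isTileBasedDeferred ? Extent2D{32, 32} : Extent2D{1, 1};
}

// c826d58946: the total thread count per threadgroup is the product of the
// threadgroup size declared in the shader across all three dimensions.
constexpr uint32_t totalThreadsPerThreadgroup(ThreadgroupSize s) {
    return s.width * s.height * s.depth;
}

// 597e746012: a component mapping behaves like the identity swizzle if every
// component is either "Identity" or explicitly names its own channel.
// (Assumption: this is one such pattern; the commit may detect others.)
enum class Swizzle { Identity, Zero, One, R, G, B, A };
struct ComponentMapping { Swizzle r, g, b, a; };

constexpr bool mapsToSelf(Swizzle s, Swizzle self) {
    return s == Swizzle::Identity || s == self;
}

constexpr bool isEquivalentToIdentity(ComponentMapping m) {
    return mapsToSelf(m.r, Swizzle::R) && mapsToSelf(m.g, Swizzle::G) &&
           mapsToSelf(m.b, Swizzle::B) && mapsToSelf(m.a, Swizzle::A);
}
```

For example, a shader that declares an 8x8x4 threadgroup gives `totalThreadsPerThreadgroup({8, 8, 4}) == 256`, and a mapping of `{R, Identity, B, A}` is treated as identity while `{G, Identity, B, A}` is not.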

1314 Commits