1392 Commits

Author SHA1 Message Date
Chip Davis
e1ac50c07e MVKMTLBufferAllocation: Mark temp buffers as volatile.
They are not expected to be useful beyond the commands that use them,
but they take up memory nonetheless. This is exactly the use case
purgeability was designed for. Tell the system that it's OK to reclaim
their memory if necessary.

Doing this every time the buffer is used will cause the purgeable state
to be reset from `MTLPurgeableStateEmpty`, in case the system really did
reclaim their memory.

In accordance with Apple's advice, lock the pages for the buffer when
loading it, so the memory isn't pulled out from under us.
2021-02-01 16:15:17 -06:00
Bill Hollings
e5e1f65dfc Update Xcode build settings to cover Xcode 12.4. 2021-02-01 13:50:42 -05:00
Chip Davis
55d206e058 MVKCommandEncoder: Use the temp buffer mechanism for visibility buffers.
Add support for "dedicated" temp buffers, where instead of allocating a
big buffer and carving regions out of it, a unique buffer is returned
for each allocation request. This is necessary for visibility buffers,
because the offset passed to `-[MTLRenderCommandEncoder
setVisibilityResultMode:offset:]` cannot exceed an
implementation-defined value, currently 256k less 8 bytes for Mac family
2 on Catalina and up, and on Apple family 7; and 64k less 8 bytes
otherwise.
2021-02-01 10:41:29 -06:00
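The offset limit quoted in this commit could be captured in a small helper. This is a minimal sketch, not MoltenVK's code: the `GpuInfo` struct and both function names are hypothetical, and reading "k" as 1024 bytes is an assumption.

```cpp
#include <cstdint>

// Hypothetical description of the GPU; not a MoltenVK type.
struct GpuInfo {
    bool macFamily2OnCatalinaOrLater;
    bool appleFamily7;
};

// Largest offset accepted for the visibility result buffer, per the commit
// message: 256k less 8 bytes on the newer families, 64k less 8 bytes otherwise.
// ASSUMPTION: "k" means 1024 bytes.
constexpr uint64_t maxVisibilityOffset(GpuInfo gpu) {
    return ((gpu.macFamily2OnCatalinaOrLater || gpu.appleFamily7) ? 256u * 1024u
                                                                  : 64u * 1024u) - 8u;
}

// A region carved out of a big shared buffer may sit past the limit; a
// dedicated buffer always starts at offset 0 and so always passes this check.
constexpr bool offsetIsUsable(GpuInfo gpu, uint64_t offset) {
    return offset <= maxVisibilityOffset(gpu);
}
```

Under these assumptions, a sub-allocation at offset 100000 would be rejected on the older families, which is the situation the dedicated buffers avoid.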
Bill Hollings
a69982ef38 Add guard code for Xcode 11. 2021-01-29 17:54:50 -05:00
Bill Hollings
4083dd1229 Merge branch 'master' of https://github.com/billhollings/MoltenVK into fastmath 2021-01-29 17:30:23 -05:00
Chip Davis
2bfb5ed77b Fix build on Xcode 11.
I missed this.
2021-01-29 14:53:03 -06:00
Bill Hollings
b80a620afc
Merge pull request #1223 from cdavis5e/occlusion-query-rewrite
MVKQueryPool: Totally rework the way occlusion queries work.
2021-01-29 14:58:52 -05:00
Chip Davis
24c1b7276e MVKPhysicalDevice: Require Mac family 2 for quad-scope permutation.
According to the Metal Feature Set Tables, only family 2 supports
quad-scope permutation. We've been seeing issues with SIMD-group
functions on family 1 hardware, so for now I'm moving quad-group
permutation to family 2.
2021-01-28 22:07:43 -06:00
Chip Davis
592cec58fd MVKPhysicalDevice: Require Mac family 2 for render without attachments.
We've seen reports that rendering without attachments doesn't work on
family 1 GPUs. Disable it for them.
2021-01-28 22:07:43 -06:00
Chip Davis
8e8edbadb1 MVKQueryPool: Totally rework the way occlusion queries work.
Instead of having Metal directly write to the query pool's internal
storage, we'll have it write to a temp buffer whose lifetime is tied to
the command buffer. The temp buffer's contents are then accumulated to
all queries that were activated.

This last step is particularly important for queries that span multiple
render passes. Since Metal resets the query counter at a render pass
boundary, this means that, up until now, only the last draw counted
toward the query. Data from the others were lost. By using this temp
buffer and accumulating the results to the query storage, the counter
will correctly count draws from all render passes inside the query
bounds.

This will also fix problems using multiple query pools, particularly
with large query pool support on, in a single render pass. Because Metal
requires us to set the visibility results buffer at render pass start
time, we couldn't use multiple query pools inside a single render pass.
Using a single temp buffer bypasses this problem.

Also, don't make queries available to the host unless they became
available to the device first. That way, a query that is immediately
reset during command buffer execution will properly report that the
query is unavailable. This fixes the remaining dEQP-VK.query_pool.*
tests. Fix some bugs that shook out of this.
2021-01-28 16:15:26 -06:00
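The accumulation step described in this commit can be illustrated with a toy model. It is only a sketch under stated assumptions: `accumulateQuery` and the idea of one temp-buffer slot per render pass are hypothetical simplifications, not the actual MVKQueryPool implementation.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical model: each render pass inside the query bounds writes its own
// count into a temp-buffer slot, because Metal restarts the counter at every
// render pass boundary. Adding the slots to the query's storage keeps the
// contribution of every pass instead of only the last one.
template <std::size_t N>
constexpr uint64_t accumulateQuery(uint64_t queryStorage,
                                   std::array<uint64_t, N> tempSlots) {
    for (std::size_t i = 0; i < N; ++i) {
        queryStorage += tempSlots[i];
    }
    return queryStorage;
}
```

With three passes reporting 10, 0 and 5, the accumulated result is 15, whereas keeping only the last pass would report 5.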
Bill Hollings
3e20e1a137 Support compiling MSL with position invariance if indicated in SPIRV shader.
Add SPIRVToMSLConversionResults::isPositionInvariant to query
position invariance from SPIR-V.
MVKDevice::getMTLCompileOptions() takes into consideration the need to preserve invariance.
MVKShaderModule compiles MSL to preserve invariance if required by the shader.
2021-01-28 16:46:49 -05:00
Bill Hollings
2343c0267b Enable MSL ffast-math compilation option by default.
Support querying SignedZeroInfNanPreserve execution mode
from SPIR-V to disable fast-math for individual shaders.
Clean up namespace references in SPIRVToMSLConverter.cpp.
2021-01-28 08:05:33 -05:00
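The per-shader opt-out described in this commit amounts to a simple decision. The sketch below is illustrative only: `ShaderInfo` and `useFastMath` are hypothetical names, standing in for whatever the SPIR-V conversion results and the configuration actually expose.

```cpp
// Hypothetical per-shader flag, assumed to be reported by the SPIR-V conversion
// when the shader declares the SignedZeroInfNanPreserve execution mode.
struct ShaderInfo {
    bool needsSignedZeroInfNanPreserve;
};

// Fast-math is used only when the global setting allows it and the individual
// shader has not asked for signed zero, infinity and NaN to be preserved.
constexpr bool useFastMath(bool configFastMathEnabled, ShaderInfo shader) {
    return configFastMathEnabled && !shader.needsSignedZeroInfNanPreserve;
}
```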
Bill Hollings
696de7a4e7 Make MVKConfiguration access global, ignoring provided VkInstance.
MVKConfiguration access is now global, and the VkInstance provided in the
vkGet/Set/MoltenVKConfigurationMVK() functions is ignored. This allows these
functions to be provided with a VkInstance object that originates from a
different Vulkan layer than MoltenVK, without risking breaking the API.

MVKConfiguration extended to cover all MoltenVK environment variables.

Move all environment variable declarations to MVKEnvironment.h.
Add MVKEnvironment.cpp to define config functions.
Clean up .m files to use MVKCommonEnvironment.h instead of MVKEnvironment.h.
2021-01-26 17:59:13 -05:00
Bill Hollings
6c74fbf1b5
Merge pull request #1217 from billhollings/shaderInt64
Advertise support for shaderInt64 feature.
2021-01-25 22:05:43 -05:00
Bill Hollings
2fc9fcb079 Support shaderInt64 feature only on minimum MSL 2.3 and higher GPUs. 2021-01-25 21:40:29 -05:00
Bill Hollings
f6374fe8bc
Merge pull request #1219 from billhollings/mtl-fence-semaphore
For Vulkan semaphores, prefer using MTLFence over MTLEvent.
2021-01-25 15:05:10 -05:00
Bill Hollings
64a273fb23 For Vulkan semaphores, prefer using MTLFence over MTLEvent,
and add documentation for VK_SEMAPHORE_TYPE_TIMELINE.
2021-01-25 14:35:58 -05:00
Bill Hollings
f123cb4b19 Advertise macOS M1 GPU as VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU. 2021-01-25 14:01:07 -05:00
Bill Hollings
57763529b1 Advertise support for shaderInt64 feature.
Add MVKDevice::mslVersionIsAtLeast() to support tests for MSL version.
Also bump MSL support level of standalone MoltenVKShaderConverter tool.
2021-01-25 13:08:41 -05:00
Bill Hollings
c757f3012c
Merge pull request #1213 from billhollings/minor-updates
Minor non-behavior-changing updates and clean-up
2021-01-22 15:08:30 -05:00
Bill Hollings
01d6feea07
Merge pull request #1211 from billhollings/non-functional-admin-changes
Non functional admin and documentation changes.
2021-01-22 08:47:16 -05:00
Bill Hollings
3ec61667a6 Update MVKSmallVector constructors, and remove unnecessary or obsolete code.
Allow the MVKSmallVector constructor to size with default values.
Remove obsolete MVKVector, which was long ago replaced with MVKSmallVector.
Remove unnecessary concrete implementations of template functions that are
used only within a single compilation unit.
2021-01-21 18:54:20 -05:00
Bill Hollings
b87b91f144 Remove unused member variables.
Remove MVKGPUCaptureScope::_queue.
Remove MVKQueue::_nextMTLCmdBuffID.
2021-01-21 17:46:29 -05:00
Bill Hollings
e1b3585413 Add MVK_APPLE_SILICON macro definition.
Use MVK_APPLE_SILICON instead of
MVK_IOS_OR_TVOS || MVK_MACOS_APPLE_SILICON or
MVK_MACOS_APPLE_SILICON || MVK_IOS_OR_TVOS.
2021-01-21 16:53:17 -05:00
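A combined macro of this kind might be defined roughly as follows. This is a sketch rather than the contents of the real header: the fallback `0` definitions are assumptions added so the fragment compiles on its own, and `kAppleSilicon` is a hypothetical name.

```cpp
// ASSUMPTION: fallback values so this fragment stands alone. In the real
// headers these would be derived from the target platform.
#ifndef MVK_IOS_OR_TVOS
#   define MVK_IOS_OR_TVOS 0
#endif
#ifndef MVK_MACOS_APPLE_SILICON
#   define MVK_MACOS_APPLE_SILICON 0
#endif

// One name for the condition that was previously spelled out in two orders.
#define MVK_APPLE_SILICON (MVK_IOS_OR_TVOS || MVK_MACOS_APPLE_SILICON)

// Hypothetical constant showing the macro used as an ordinary expression.
constexpr bool kAppleSilicon = MVK_APPLE_SILICON;
```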
Bill Hollings
4a65c293c6 Make logging functions globally accessible to ease their use in debugging activities.
Include MVKLogging.h in MVKEnvironment.h, and remove references elsewhere.
2021-01-21 16:21:13 -05:00
Bill Hollings
28c514d03b Add strings for all current VkResult values. 2021-01-21 15:45:23 -05:00
Bill Hollings
015031c955 Update copyright notices to year 2021 and Xcode build settings check to Xcode 12.3. 2021-01-21 13:37:07 -05:00
Bill Hollings
fbc3600787 Support Xcode 12.3 and remove Travis CI ref from project. 2021-01-21 12:27:37 -05:00
Malcolm Bechard
2235458838 further fix for 526779ad66 2021-01-19 18:17:22 -05:00
Malcolm Bechard
526779ad66 fix incorrect behavior for MVKCmdResolveImage
fix incorrectly changing first resolve layer's src/dst base layer,
as well as an out-of-bounds access to the mtlResolveSlices array
2021-01-18 22:58:07 -05:00
Mark Reid
40cd45d4d7 Support immutableSamplers with sampler arrays, fixes #1181 2021-01-09 19:57:39 -08:00
Chip Davis
2eb0fe6947 MVKRenderPass: Only use MTLStorageModeMemoryless where available.
Fixes #1197.
2021-01-05 22:44:55 -06:00
Chip Davis
930525f289 MVKRenderPass: Use a non-trivial granularity for TBDR GPUs.
This parameter is intended to indicate the optimal granularity for the
render area. For TBDR GPUs, this will be the tile size. IMR GPUs
continue to use 1x1.

Apple GPUs support tile sizes of 16x16, 32x16, and 32x32. But, we can't
read the tile size used for a Metal render pass until a render command
encoder has been created. So for now, hardcode 32x32 for TBDR GPUs.
2020-12-31 13:18:11 -06:00
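The granularity choice described in this commit reduces to a two-way selection. The following is a minimal sketch: `Extent2D` is a local stand-in for `VkExtent2D`, and `renderAreaGranularity` is a hypothetical name.

```cpp
#include <cstdint>

// Local stand-in for VkExtent2D so the sketch needs no Vulkan headers.
struct Extent2D {
    uint32_t width;
    uint32_t height;
};

// TBDR GPUs report a hardcoded 32x32 tile, since the real tile size is not
// known until a render command encoder exists. IMR GPUs keep 1x1.
constexpr Extent2D renderAreaGranularity(bool isTileBasedDeferred) {
    return isTileBasedDeferred ? Extent2D{32, 32} : Extent2D{1, 1};
}
```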
Bill Hollings
34dc378371
Merge pull request #1191 from cdavis5e/apple-gpu-barriers
Don't use barriers in render passes on Apple GPUs.
2020-12-31 09:40:17 -05:00
Chip Davis
d22d9e3d17 Don't use barriers in render passes on Apple GPUs.
Apple GPUs don't support them, and in fact Metal's validation layer will
complain if you try to use them.
2020-12-30 15:09:53 -06:00
Bill Hollings
e9bc3b3c62
Merge pull request #1193 from cdavis5e/memoryless-host-accessibility
MVKDeviceMemory: Don't consider Memoryless host-accessible on macOS/tvOS.
2020-12-30 10:11:24 -05:00
Bill Hollings
c985bbf4c3
Merge pull request #1192 from cdavis5e/memoryless-rt-actions
MVKRenderPass: Don't use Load/Store actions for memoryless.
2020-12-30 10:06:09 -05:00
Bill Hollings
4f0d370166
Merge pull request #1190 from cdavis5e/render-rgb9e5-mask
MVKGraphicsPipeline: Fix color write mask with RGB9E5 RTs.
2020-12-30 09:52:29 -05:00
Bill Hollings
326e872a65
Merge pull request #1189 from cdavis5e/sync-create-texture
MVKImagePlane: When sync'ing, create the texture if it doesn't exist.
2020-12-30 09:48:34 -05:00
Chip Davis
4d6b92bb9c MVKDeviceMemory: Don't consider Memoryless host-accessible on macOS/tvOS.
I missed this when I added support for memoryless on macOS.
2020-12-29 21:21:02 -06:00
Chip Davis
aa65392027 MVKRenderPass: Don't use Load/Store actions for memoryless.
Memoryless textures cannot use `Load`/`Store` actions, because there is
no memory to load from or store to. So don't use these actions, even if
we're not rendering to the whole thing.

Multiple subpasses might be a problem. Tessellation and indirect
multiview *will* be problems, because these require us to interrupt the
render pass in order to do compute--and that causes us to use `Store`
unconditionally.

One option, which I've mentioned before, is using tile shaders for these
cases. But I haven't looked seriously into that yet. The other involves
a subtle distinction between Metal's `MTLStorageModeMemoryless` and
Vulkan's `VK_MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT`: the former *never*
commits memory; while the latter doesn't commit memory *at first*, but
may do so later. If we find that we're going to need to `Store` to a
`LAZILY_ALLOCATED` image, then what we can do is replace the
`Memoryless` texture with one with a real backing store in `Private`
memory. This change does not do that yet. It'll require some more
thought.

As for multiple subpasses, I eventually want to look into optimizing
render passes by shuffling the subpasses around to minimize the need to
load and store attachments from/to memory, which TBDR GPUs absolutely
hate. That should help with this problem, too.
2020-12-29 21:19:18 -06:00
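The rule for memoryless attachments can be sketched as a filter over the requested actions. The enums and function names below are hypothetical stand-ins, not Metal's or MoltenVK's types, and substituting `DontCare` for the forbidden actions is an assumption about how the restriction would be applied.

```cpp
// Stand-ins for the Metal load and store actions.
enum class LoadAction { DontCare, Load, Clear };
enum class StoreAction { DontCare, Store };

// A memoryless texture has no backing memory to read from, so a requested
// Load is dropped. ASSUMPTION: it is replaced by DontCare.
constexpr LoadAction loadActionFor(LoadAction requested, bool memoryless) {
    return (memoryless && requested == LoadAction::Load) ? LoadAction::DontCare
                                                         : requested;
}

// Likewise there is nothing to store to, so Store is never used.
constexpr StoreAction storeActionFor(StoreAction requested, bool memoryless) {
    return memoryless ? StoreAction::DontCare : requested;
}
```

A `Clear` load action needs no prior contents, so the sketch lets it pass through unchanged.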
Chip Davis
e6a8409b31 MVKGraphicsPipeline: Fix color write mask with RGB9E5 RTs.
Metal does not allow color write masks with this format where some but
not all of the channels are disabled. Either all must be enabled or none
must be enabled. Presumably, this is because of the shared exponent.

This is just good enough to stop the validation layer from violently
terminating the program. To implement this properly requires using
framebuffer fetch, with a change to SPIRV-Cross. Luckily, the only GPUs
that support `RGB9E5` rendering also support framebuffer fetch.
Honestly, I don't understand why Apple's drivers don't do this.
2020-12-29 21:15:14 -06:00
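The all-or-none restriction on `RGB9E5` write masks can be expressed as a predicate. This is a sketch only: the bit values are illustrative rather than Metal's, and widening a partial mask to all channels is an assumption, since the commit message does not say which way a partial mask is resolved.

```cpp
#include <cstdint>

// Illustrative channel bits; not the values Metal uses.
constexpr uint32_t kRed = 1, kGreen = 2, kBlue = 4;
constexpr uint32_t kAllRGB = kRed | kGreen | kBlue;

// True when the mask enables either every channel or none of them.
constexpr bool isAllOrNone(uint32_t mask) {
    return (mask & kAllRGB) == 0 || (mask & kAllRGB) == kAllRGB;
}

// ASSUMPTION: a partial mask is widened to all channels. A proper fix, per the
// commit message, would need framebuffer fetch instead.
constexpr uint32_t sanitizeRGB9E5Mask(uint32_t mask) {
    return isAllOrNone(mask) ? (mask & kAllRGB) : kAllRGB;
}
```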
Chip Davis
f875055ecd MVKImagePlane: When sync'ing, create the texture if it doesn't exist.
Before we can upload the texture, we need to make sure there is a
texture to upload to.
2020-12-29 21:13:43 -06:00
Chip Davis
1bbab6151a MVKPixelFormats: Enable RenderTarget usage for linear textures on Apple GPUs.
I forgot to do this when I added the `renderLinearTexture` feature.
2020-12-29 21:12:14 -06:00
Bill Hollings
94a81177cb Update MoltenVK to version 1.1.2.
Update VK_MVK_MOLTENVK_SPEC_VERSION to 30.
Update What's New document.
2020-12-26 13:45:10 -05:00
Jan Sikorski
4440a64d83 Config: Added setting for fastMathEnabled Metal Compiler option.
Set it to false by default, as it creates visual glitches in at least one game,
caused by depth fighting between a depth and a color render pass.
2020-12-14 17:41:28 +01:00
Bill Hollings
1ccc0ab7b5 Update dependency libraries to match Vulkan SDK 1.2.162.
Fix Mac Catalyst build failure, plus several build warnings on other platforms.
Update What's New document.
2020-12-08 21:31:39 -05:00
Chip Davis
13b1840ad0 MVKImage: Fix compilation with Xcode 11.
We can't rely on those enums not being re-`#define`d, because they're
only re-`#define`d like that in MVKPixelFormats.mm.

Signed-off-by: Chip Davis <cdavis@codeweavers.com>
2020-12-04 02:35:13 -06:00
Chip Davis
891654d5d0 MVKSwapchain: Allow images whose size doesn't match the CAMetalLayer.
The game NieR: Automata (via DXVK) attempts to create a 1680x1050
swapchain on a 1600x900 window. It then attempts to render to this
swapchain with a 1680x1050 framebuffer. But we created the textures at
1600x900, matching the window. The render area is thus too big, which
triggers a Metal validation failure. Apparently, DXVK doesn't check the
surface caps before creating the Vulkan swapchain.

Rather than expecting the swapchain to be the same size as the layer, we
can actually support any swapchain size, up to the maximum size of a
texture supported by the device. The system will just scale the texture
when rendering it if it doesn't match the layer size. If the sizes don't
match up, we return `VK_SUBOPTIMAL_KHR`, instead of
`VK_ERROR_OUT_OF_DATE_KHR`, indicating that presentation is still
possible, but performance may suffer. This is good enough to let the
game continue under the validation layers.

This really needs a corresponding change to the CTS, because it
currently assumes that we can't do this.

Signed-off-by: Chip Davis <cdavis@codeweavers.com>
2020-12-03 10:18:03 -06:00
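The result selection described in this commit can be sketched as a size comparison. `Extent2D`, `PresentStatus` and `statusFor` are hypothetical stand-ins: `Suboptimal` plays the role of `VK_SUBOPTIMAL_KHR`, and the sketch ignores every other condition that affects the real return value.

```cpp
#include <cstdint>

// Local stand-in for VkExtent2D.
struct Extent2D {
    uint32_t width;
    uint32_t height;
};

// Stand-ins for VK_SUCCESS and VK_SUBOPTIMAL_KHR.
enum class PresentStatus { Success, Suboptimal };

// A size mismatch no longer makes the swapchain out of date. Presentation
// still works because the system scales the texture, so it is only reported
// as suboptimal.
constexpr PresentStatus statusFor(Extent2D swapchain, Extent2D layer) {
    return (swapchain.width == layer.width && swapchain.height == layer.height)
               ? PresentStatus::Success
               : PresentStatus::Suboptimal;
}
```

For the case in the commit message, a 1680x1050 swapchain on a 1600x900 layer yields `Suboptimal`.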
Chip Davis
f74356e51c Re-enable MTLEvent-based binary semaphores.
I haven't seen any problems yet. It passes all the tests--even the
`signal_order` tests. I have tried it with some games, and it seems OK
there, too.
2020-12-02 20:13:43 -06:00