1210 Commits

Author SHA1 Message Date
Bill Hollings
697e8627cf
Merge pull request #1009 from cdavis5e/external-fence
Add basic support for VK_KHR_external_fence{,_capabilities}.
2020-09-05 21:58:24 -04:00
Chip Davis
0d4b087f3d MVKCommandBuffer: Fix a crash on starting a query outside a render pass.
This was introduced by #1006.

Fixes #1007.
2020-09-04 22:30:10 -05:00
Chip Davis
e6424654e3 Add basic support for VK_KHR_external_fence{,_capabilities}.
Like with `VK_KHR_device_group` and `VK_KHR_external_memory`, this just
adds the groundwork needed to support future extensions; it provides no
actual support for external fences.

We should be able to easily support `VK_KHR_external_fence_fd`, by using
a POSIX semaphore. Since the fence FDs produced by that extension are
opaque, only supporting `close(2)` and `dup(2)`, we shouldn't have to
worry about portable programs poking the FD in weird ways. Hopefully.

Other types of external fences we might support include GCD semaphores
(`dispatch_semaphore_t`) and Mach semaphores (`semaphore_t`). I really
think we want support for GCD semaphores, because that's the most likely
object we're going to see passed between processes on Darwin given GCD's
built-in support for XPC.

I have deliberately omitted mention of these extensions from the user
guide. `VK_KHR_external_memory` was not mentioned in there, presumably
because no external memory types are actually supported.

Also, add missing `vkGetInstanceProcAddr()` entry for
`vkGetPhysicalDeviceExternalBufferPropertiesKHR()`. We have the
function, and we export the extension's name string. We might as well
make it available via `vkGetInstanceProcAddr()`.
2020-09-04 13:16:54 -05:00
Chip Davis
34930eaf5b Support the VK_KHR_multiview extension.
Originally, Metal did not support this directly, and still largely
doesn't on GPUs other than Apple family 6. Therefore, this
implementation uses vertex instancing to draw the needed views. To
support the Vulkan requirement that only the layers for the enabled
views are loaded and stored in a multiview render pass, this
implementation uses multiple Metal render passes for multiple "clumps"
of enabled views.

For indirect draws, as with tessellation, we must adjust the draw
parameters at execution time to account for the extra views, so we need
to use deferred store actions here. Without them, tracking the state
becomes too involved.

If the implementation doesn't support either layered rendering or
deferred store actions, multiview render passes are instead unrolled and
rendered one view at a time. This will enable us to support the
extension even on older devices and OSes, but at the cost of additional
command buffer memory and (possibly) worse performance.

Eventually, we should consider using vertex amplification to accelerate
this, particularly since indirect multiview draws are terrible and
currently require a compute pass to adjust the instance count. Also,
instanced drawing in itself is terrible due to its subpar performance.
But, since vertex amplification on family 6 only supports two views,
when `VK_KHR_multiview` mandates a minimum of 6, we'll still need to use
instancing to support more than two views.

I have tested this extensively against the CTS. I'm very confident in
its correctness. The only failing tests are
`dEQP-VK.multiview.queries.*`, due to our inadequate implementation of
timestamp queries; and `dEQP-VK.multiview.depth.*`, due to what I assume
is a bug in the way Metal handles arrayed packed depth/stencil textures,
and which may only be a problem on Mojave. I need to test this on
Catalina and Big Sur.

Update SPIRV-Cross to pull in some fixes necessary for this to work.

Fixes #347.
2020-09-03 17:14:46 -05:00
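
As a rough illustration of the "clumps" idea in the multiview commit above, the sketch below splits a view mask into runs of consecutive enabled views, so that each run could be rendered by its own Metal render pass. Treating a clump as a contiguous run of set bits is an assumption made for this example, and `splitViewMaskIntoClumps` is a hypothetical name, not a MoltenVK function.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (not MoltenVK API): splits a multiview view mask into
// runs of consecutive enabled views. Each pair is (first view, view count).
// Assumption: a "clump" is a contiguous run of set bits in the mask.
std::vector<std::pair<uint32_t, uint32_t>> splitViewMaskIntoClumps(uint32_t viewMask) {
    std::vector<std::pair<uint32_t, uint32_t>> clumps;
    uint32_t view = 0;
    while (view < 32) {
        if ((viewMask >> view) & 1u) {
            // Start of a run: advance until the first disabled view.
            uint32_t start = view;
            while (view < 32 && ((viewMask >> view) & 1u)) { ++view; }
            clumps.emplace_back(start, view - start);
        } else {
            ++view;
        }
    }
    return clumps;
}
```

For a mask with views 0, 1, 2, 5 and 6 enabled (`0x67`), this yields two clumps: one starting at view 0 with three views, and one starting at view 5 with two views.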
Bill Hollings
4e45ddfbe7 Merge branch 'master' of https://github.com/billhollings/MoltenVK into xcode12 2020-09-02 11:21:35 -04:00
Jan Sikorski
3c3683ebef MVKImagePlane: account for swapchain images in get/releaseMTLTexture()
Delegate to MVKImage's getMTLTexture() method to use the overridden one
for swapchain images.
Also, the _mtlTextureViews member should be cleared to prevent accidental
reuse of released views.
2020-09-02 12:17:10 +02:00
Bill Hollings
aa496c8789
Merge pull request #1001 from cdavis5e/null-draws
MVKCmdDraw: Don't encode commands that draw zero vertices.
2020-09-01 16:55:03 -04:00
Bill Hollings
a52098d29a
Merge pull request #1000 from cdavis5e/clear-vertex-count
MVKCmdClearAttachments: Fix vertex count for clearing multiple layers.
2020-09-01 16:47:02 -04:00
Bill Hollings
dbb0b58671
Merge pull request #999 from cdavis5e/null-clear-atts
MVKCmdClearAttachments: Don't attempt to clear unused attachments.
2020-09-01 16:19:44 -04:00
Bill Hollings
56df7d61d7 Remove MoltenVK fat libraries and frameworks and use XCFramework instead.
- Delete fat library and framework scripts and templates.
- MoltenVK build package now only includes one XCFramework, and separate platform dylibs.
- Modify fetchDependencies and Makefile targets to not build fat libraries,
  and to build simulators separately from platforms instead.
- Script package_moltenvk.sh now copies dylibs for all built platforms.
- Consolidate package_all.sh and delete package_one_os.sh.
- Swap names of copy_lib_to_staging.sh and copy_to_staging.sh scripts.
- Cube demo now uses MoltenVK as XCFramework, and supports Simulator builds.
- Hologram demo now uses MoltenVK as dylibs from new packaging location.
- API-Samples demo now uses MoltenVK as XCFramework.
- Update documentation.
2020-09-01 14:39:46 -04:00
Chip Davis
245286f57c MVKCmdDraw: Don't encode commands that draw zero vertices.
Nothing will be drawn in that case. Nothing would've been drawn anyway,
but Metal's validation layer complains if you issue a draw command with
zero vertices or instances.

We unfortunately cannot do anything about indirect draws, since we won't
know how many vertices to draw until execute time.
2020-08-31 21:18:02 -05:00
Chip Davis
864a6a6b94 MVKCmdClearAttachments: Fix vertex count for clearing multiple layers.
Prior to this, we were assuming one layer per clear rect. When that
assumption was broken, we wound up smashing the stack. This will be
more important once multiview support lands, since all views must be
cleared by the clear command.
2020-08-31 21:16:33 -05:00
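
The vertex-count fix above can be sketched as follows. This is only an illustration under stated assumptions: each layer of each clear rect is assumed to be drawn as one quad of two triangles (six vertices), and `ClearRect` is a hypothetical stand-in that mirrors only the layer fields of `VkClearRect`.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the layer fields of VkClearRect.
struct ClearRect {
    uint32_t baseArrayLayer;
    uint32_t layerCount;
};

// Assumption for this sketch: one quad (two triangles) per layer per rect.
constexpr uint32_t kVerticesPerQuad = 6;

// Counts the vertices needed when every layer of every rect gets its own quad,
// rather than assuming a single layer per rect.
uint32_t clearVertexCount(const std::vector<ClearRect>& rects) {
    uint32_t count = 0;
    for (const ClearRect& rect : rects) {
        count += kVerticesPerQuad * rect.layerCount;
    }
    return count;
}
```

Sizing a fixed vertex array for one layer per rect and then writing `layerCount` quads into it is the kind of overrun the commit describes.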
Chip Davis
2d7717b708 MVKCmdClearAttachments: Don't attempt to clear unused attachments.
The attachment index in the `VkClearAttachment` struct is an index into
the current subpass attachment array, not the render pass attachment
array; so it's not enough to check `VkClearAttachment` for
`VK_ATTACHMENT_UNUSED`. We also need to check the current subpass.

If no attachments can be cleared, don't encode a command to the Metal
command buffer.

Perhaps I should bring back `recordBeginRenderPass()` and
`recordEndRenderPass()` and keep track of the current subpass; then we
can avoid recording the command entirely if the intersection of the sets
"attachments to clear" and "used attachments" is null.
2020-08-31 21:12:51 -05:00
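
A minimal sketch of the two-level check described above, using hypothetical types rather than MoltenVK's own classes. `kAttachmentUnused` has the same value as `VK_ATTACHMENT_UNUSED` (`~0U`), and `Subpass` here models only the color attachment references of the current subpass.

```cpp
#include <cstdint>
#include <vector>

// Same value as VK_ATTACHMENT_UNUSED in the Vulkan headers.
constexpr uint32_t kAttachmentUnused = ~0u;

// Hypothetical subpass model: colorAttachments[i] is the render pass
// attachment index referenced by subpass color slot i, or kAttachmentUnused.
struct Subpass {
    std::vector<uint32_t> colorAttachments;
};

// Returns true only if the clear targets a subpass slot that actually
// references a render pass attachment.
bool isColorClearNeeded(const Subpass& subpass, uint32_t colorAttachment) {
    if (colorAttachment == kAttachmentUnused) { return false; }                  // check the clear struct
    if (colorAttachment >= subpass.colorAttachments.size()) { return false; }    // out-of-range slot
    return subpass.colorAttachments[colorAttachment] != kAttachmentUnused;       // check the subpass
}
```

If this returns false for every requested attachment, no command needs to be encoded to the Metal command buffer.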
Chip Davis
e2ba444aa6 Use deferred store actions instead of tracking multi-pass draws.
The current code does not handle multiple subpasses, nor does it handle
secondary command buffers. Handling subpasses was easy enough. The
problem came with secondary command buffers. Tracking them became
extremely complicated, particularly since pipelines may be set either
inside or outside a render pass, and further, a pipeline set in one
buffer might be used in another.

I then realized a simpler and more elegant solution: Metal's deferred
store actions feature. This allows you to defer setting the store action
for a render pass until encoding time. This is exactly what we need,
since we won't know what store action we actually want until we start
encoding draws. This solution should now work with multiple subpasses
and secondary command buffers, with much less code.
2020-08-31 21:09:11 -05:00
Bill Hollings
536d6cf0d4 Add MoltenVK XCFramework.
Add package_moltenvk_xcframework.sh.
Rename package_shader_converter.sh to package_shader_converter_xcframework.sh.
2020-08-27 23:43:06 -04:00
Bill Hollings
5f6dd8fe81 Fix build errors on Simulator not supporting MTLDrawable present time options. 2020-08-25 13:04:52 -04:00
Bill Hollings
dd59aea71f Use external libraries as XCFrameworks.
Exclude arm64 architectures on macOS and Simulators.
Exclude arm64e architectures on iOS and tvOS.
Stop building fat libraries for external libraries.
Remove package_ext_libs.sh script.
Don't include Headers in ext lib XCFrameworks because of Xcode12 bug in using them.
2020-08-20 15:24:35 -04:00
Jan Sikorski
7c99bb9430 Initialize tessellation related variables conditionally in indirect draws 2020-08-14 12:38:33 +02:00
Chip Davis
07c2054cc8 MVKCommandBuffer: Don't set renderTargetArrayLength on devices that don't support it.
It's a hard validation error to do so. We originally had a check for
this, but it was erroneously completely removed in #988, instead of
being limited to `renderTargetArrayLength`.

Fixes #991.
2020-08-13 15:20:57 -05:00
Bill Hollings
fe89118435 Support Xcode 12 settings. 2020-08-13 13:13:30 -04:00
Bill Hollings
24610ca6b7 Merge branch 'master' of https://github.com/KhronosGroup/MoltenVK into xcode12 2020-08-12 18:53:53 -04:00
Bill Hollings
f3e938fafa
Merge pull request #990 from billhollings/master
Re-add support for bitcode generation on iOS and tvOS.
2020-08-12 17:41:36 -04:00
Bill Hollings
d4b5df532e Re-add support for bitcode generation on iOS and tvOS.
Set BITCODE_GENERATION_MODE build setting in all Xcode projects.
create_dylib.sh support BITCODE_GENERATION_MODE.
2020-08-11 20:18:50 -04:00
Jan Sikorski
77cbe56101 Detect support for having no attachments in a render pass
When we do have support, don't create a dummy texture to attach.
2020-08-11 10:56:33 +02:00
Bill Hollings
2594bef1f6
Merge pull request #982 from jherico/dtk
Enable building on xcode 12 beta
2020-08-07 09:52:00 -04:00
Jan Sikorski
c3254cb546 Ensure the base texture is created when creating a view texture 2020-08-07 12:24:22 +02:00
Jan Sikorski
babefe6608 Always set renderTarget* properties in MTLRenderPassDescriptor
A shader might still use array index even if array length is 1
causing a validation warning if renderTargetArrayLength is not set.
2020-08-07 11:39:17 +02:00
Jan Sikorski
8f8395dcdf Enable MTLRenderPassDescriptor renderTargetWidth and renderTargetHeight API
on macOS, as it's available on 10.15 and newer.
2020-08-07 11:39:11 +02:00
Bradley Austin Davis
df672f99bb Enable building on xcode 12 beta 2020-08-05 18:50:13 -07:00
Bill Hollings
5126aaed3e Update MoltenVK version to 1.0.45.
Update What's New document.
2020-08-05 19:58:59 -04:00
Bill Hollings
7708b08c10
Merge pull request #983 from cdavis5e/multi-patch-workgroup
Process multiple patches per workgroup in a tessellation control shader.
2020-08-05 18:51:34 -04:00
Chip Davis
3db2cbff6b Process multiple patches per workgroup in a tessellation control shader.
This should hopefully reduce underutilization of the GPU, especially on
GPUs where the thread execution width is greater than the number of
control points.

This also eliminates the extra invocations previously needed to read the
varyings from the vertex shader into the tessellation shader. The number
of threads per workgroup is now lcm(SIMD-size, output control points).
This should ensure we always process a whole number of patches per
workgroup, and further reduce underutilization of the GPU's SIMD units.

To avoid complexity handling indices in the tessellation control shader,
I've also changed the way vertex shaders for tessellation are handled.
They are now compute kernels using Metal's support for vertex-style
stage input. This lets us always emit vertices into the buffer in order
of vertex shader execution. Now we no longer have to deal with indexing
in the tessellation control shader, nor do we always have to duplicate
the index buffer to insert gaps. This also fixes a long-standing issue
where if an index were greater than the number of vertices to draw, the
vertex shader would wind up writing outside the buffer, and the vertex
would be lost.
2020-08-05 16:03:25 -05:00
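
To make the lcm(SIMD-size, output control points) sizing above concrete, here is a small sketch using `std::lcm` from C++17. The function and struct names are hypothetical, and the patch count assumes one thread per output control point; device limits on threads per workgroup are not modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>

// Hypothetical result type for this sketch.
struct WorkgroupShape {
    uint32_t threads;  // threads per workgroup
    uint32_t patches;  // whole patches processed per workgroup
};

// Threads per workgroup = lcm(SIMD width, output control points), so the
// workgroup always covers a whole number of SIMD groups and whole patches.
// Assumption: one thread per output control point.
WorkgroupShape tessControlWorkgroupShape(uint32_t simdWidth, uint32_t outputControlPoints) {
    assert(simdWidth > 0 && outputControlPoints > 0);
    uint32_t threads = std::lcm(simdWidth, outputControlPoints);
    return { threads, threads / outputControlPoints };
}
```

With a SIMD width of 32 and 3 output control points, this gives 96 threads covering 32 patches; with 4 control points it gives 32 threads covering 8 patches.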
Jan Sikorski
8f52521b1d Flush source image of a transfer
Since on macOS textures cannot be resident in shared (host-coherent) memory,
they need to be flushed before making the copy, to ensure that the modified
data is transferred.
2020-08-05 15:34:10 +02:00
Bill Hollings
4609416ef2
Merge pull request #980 from mbechard/master
dynamicOffsets are supposed to be ordered by binding index
2020-07-30 12:06:12 -04:00
Malcolm Bechard
5874616d78 dynamicOffsets are supposed to be ordered by binding index
not by layout element index. Fixes #978
2020-07-28 17:25:53 -04:00
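
The ordering rule in the fix above can be illustrated with a short sketch. It assigns each dynamic binding its starting index into the dynamic offsets array by walking bindings in binding-number order, regardless of the order in which they appear in the layout. `LayoutBinding` and `firstDynamicOffsetIndex` are hypothetical names for this example, not MoltenVK types.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical, simplified view of a descriptor set layout element.
struct LayoutBinding {
    uint32_t binding;          // binding number declared in the layout
    uint32_t descriptorCount;  // array size of the binding
    bool isDynamic;            // true for dynamic uniform/storage buffers
};

// Maps each dynamic binding number to the index of its first dynamic offset.
// Offsets are consumed in binding-number order, then by array element.
std::map<uint32_t, uint32_t> firstDynamicOffsetIndex(std::vector<LayoutBinding> bindings) {
    std::sort(bindings.begin(), bindings.end(),
              [](const LayoutBinding& a, const LayoutBinding& b) { return a.binding < b.binding; });
    std::map<uint32_t, uint32_t> result;
    uint32_t next = 0;
    for (const LayoutBinding& b : bindings) {
        if (!b.isDynamic) { continue; }
        result[b.binding] = next;
        next += b.descriptorCount;  // one offset per array element
    }
    return result;
}
```

For a layout listing bindings 5, 0 (two elements), 3 (not dynamic) and 2 in that order, binding 0 takes offsets 0 and 1, binding 2 takes offset 2, and binding 5 takes offset 3.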
Bill Hollings
2c7734eda8 Fix issue where expected buffer-sizes buffer not bound to Metal compute encoder.
MVKComputeResourcesCommandEncoderState update buffer-size value before buffer
bindings are encoded into Metal and are no longer marked as dirty.
2020-07-28 15:38:39 -04:00
Bill Hollings
23ea3c9bf7 vkCmdBlitImage() return error if scaling or inverting to linear image on macOS. 2020-07-27 16:00:42 -04:00
Bill Hollings
f7a1c87c71 Update pipeline cache to latest CompilerMSL::Options struct content.
SPIRVToMSLConversionOptions compare instances using memcmp(CompilerMSL::Options).
Update What's New document.
2020-07-27 15:02:56 -04:00
Bill Hollings
6b25c816b5
Merge pull request #976 from cdavis5e/pipeline-sample-mask
MVKPipeline: Pass the pipeline sample mask, if present, to SPIRV-Cross.
2020-07-27 10:31:44 -04:00
Bill Hollings
d2e5ff7df5
Merge pull request #975 from cdavis5e/image-robustness
MVKDevice: Support the VK_EXT_image_robustness extension.
2020-07-27 10:21:31 -04:00
Bill Hollings
484e1dffdb
Merge pull request #972 from BearishSun/various-fixes
Various fixes
2020-07-27 10:04:00 -04:00
Chip Davis
cda8a2cf44 MVKPipeline: Pass the pipeline sample mask, if present, to SPIRV-Cross.
SPIRV-Cross can now AND the `gl_SampleMask` output with an additional
fixed mask, presumably from the pipeline. Use this new functionality to
implement pipeline sample mask handling.

Special thanks to Tomek Pontika and Corentin Wallez of Google for
graciously contributing their implementation to SPIRV-Cross.

Update SPIRV-Cross to pull in the change necessary for this.
2020-07-24 15:15:11 -05:00
Chip Davis
b1803ea5d7 MVKDevice: Support the VK_EXT_image_robustness extension.
This extension provides weaker guarantees than `VK_EXT_robustness2` and
its `robustImageAccess2` feature. Metal easily meets those guarantees,
with no action on our part necessary.
2020-07-24 15:13:58 -05:00
Marko Pintera
e2f4828f42 Less error-prone way of calculating bytes per layer 2020-07-24 09:58:17 +02:00
Marko Pintera
6da2fbb353 Don't over-allocate memory for higher mip levels of 3D textures 2020-07-23 15:06:49 +02:00
Marko Pintera
f13fb48905 Allocate correctly sized images 2020-07-23 15:05:44 +02:00
Marko Pintera
d351a7a173 Use correct offset when calculating overlap between image and device memory ranges 2020-07-23 15:03:48 +02:00
Marko Pintera
0b6a5db049 Don't assign MTLTextureUsageRenderTarget to linear textures 2020-07-23 15:02:26 +02:00
Bill Hollings
1f68d5fc2a Fix intermittent concurrent shader specialization race condition.
MVKShaderLibrary::getMTLFunction() synchronize and refactor release of Metal objects.
Make use of existing autorelease pool instead of discrete retain/release.
Wrap entire specialization operation in @synchronized() to guard against
Metal internals not coping with multiple simultaneous specializations.
2020-07-22 17:48:19 -04:00
Bill Hollings
c0103fd008 Remove redundant validation check for 2D image views on 3D images. 2020-07-22 14:52:06 -04:00