Like with `VK_KHR_device_group` and `VK_KHR_external_memory`, this just
adds the groundwork needed to support future extensions; it provides no
actual support for external fences.
We should be able to easily support `VK_KHR_external_fence_fd`, by using
a POSIX semaphore. Since the fence FDs produced by that extension are
opaque, only supporting `close(2)` and `dup(2)`, we shouldn't have to
worry about portable programs poking the FD in weird ways. Hopefully.
Other types of external fences we might support include GCD semaphores
(`dispatch_semaphore_t`) and Mach semaphores (`semaphore_t`). I really
think we want support for GCD semaphores, because that's the most likely
object we're going to see passed between processes on Darwin given GCD's
built-in support for XPC.
I have deliberately omitted mention of these extensions from the user
guide. `VK_KHR_external_memory` was not mentioned in there, presumably
because no actual external memory types are actually supported.
Also, add missing `vkGetInstanceProcAddr()` entry for
`vkGetPhysicalDeviceExternalBufferPropertiesKHR()`. We have the
function, and we export the extension's name string. We might as well
make it available via `vkGetInstanceProcAddr()`.
Originally, Metal did not support this directly, and still largely
doesn't on GPUs other than Apple family 6. Therefore, this
implementation uses vertex instancing to draw the needed views. To
support the Vulkan requirement that only the layers for the enabled
views are loaded and stored in a multiview render pass, this
implementation uses multiple Metal render passes for multiple "clumps"
of enabled views.
For indirect draws, as with tessellation, we must adjust the draw
parameters at execution time to account for the extra views, so we need
to use deferred store actions here. Without them, tracking the state
becomes too involved.
If the implementation doesn't support either layered rendering or
deferred store actions, multiview render passes are instead unrolled and
rendered one view at a time. This will enable us to support the
extension even on older devices and OSes, but at the cost of additional
command buffer memory and (possibly) worse performance.
Eventually, we should consider using vertex amplification to accelerate
this, particularly since indirect multiview draws are terrible and
currently require a compute pass to adjust the instance count. Also,
instanced drawing in itself is terrible due to its subpar performance.
But, since vertex amplification on family 6 only supports two views,
when `VK_KHR_multiview` mandates a minimum of 6, we'll still need to use
instancing to support more than two views.
I have tested this extensively against the CTS. I'm very confident in
its correctness. The only failing tests are
`dEQP-VK.multiview.queries.*`, due to our inadequate implementation of
timestamp queries; and `dEQP-VK.multiview.depth.*`, due to what I assume
is a bug in the way Metal handles arrayed packed depth/stencil textures,
and which may only be a problem on Mojave. I need to test this on
Catalina and Big Sur.
Update SPIRV-Cross to pull in some fixes necessary for this to work.
Fixes#347.
- Delete fat library and framework scripts and templates.
- MoltenVK build package now only includes one XCFramework, and separate platform dylibs.
- Modify fetchDependencies and Makefile targets to not build fat libraries,
and to build simulators separately than platforms instead.
- Script package_moltenvk.sh now copies dylibs for all built platforms.
- Consolidate package_all.sh and delete package_one_os.sh.
- Swap names of copy_lib_to_staging.sh and copy_to_staging.sh scripts.
- Cube demo now uses MoltenVK as XCFramework, and support Simulator builds.
- Hologram demo now uses MoltenVK as dylibs from new packaging location.
- API-Samples demo now uses MoltenVK as XCFramework.
- Update documentation.
fetchDependencies support option to skip all library builds.
fetchDependencies avoid sync locks if not building in parallel.
fetchDependencies build glslang headers.
Update ExternalRevisions/README.md glslang build integration section.
Update What's New.
SPIRV-Cross can now AND the `gl_SampleMask` output with an additional
fixed mask, presumably from the pipeline. Use this new functionality to
implement pipeline sample mask handling.
Special thanks to Tomek Pontika and Corentin Wallez of Google for
graciously contributing their implementation to SPIRV-Cross.
Update SPIRV-Cross to pull in the change necessary for this.
This extension provides weaker guarantees than `VK_EXT_robustness2` and
its `robustImageAccess2` feature. Metal easily meets those guarantees,
with no action on our part necessary.
MVKShaderLibrary::getMTLFunction() synchronize and refactor release of Metal objects.
Make use of existing autorelease pool instead of discrete retain/release.
Wrap entire specialization operation in @synchronized() to guard against
Metal internals not coping with multiple simultaneous specializations.
Log ext lib build steps to provide user feedback.
Add package_ext_libs_finish.sh script to separate packaging
libraries from building them, to avoid build database conflicts.
Travis only build macOS ext libs.
We only support the `robustImageAccess2` feature for now. Metal's
guarantees around out-of-bounds accesses to textures give us this for
free. `nullDescriptor` should also be possible to implement, but it
needs testing. Null image descriptors will probably just work, but null
buffer pointers probably not. We may need help, either from SPIRV-Cross
or a pass in SPIRV-Tools. `robustBufferAccess2` definitely needs an
assist from SPIRV-Tools. All three are useful for Direct3D compatibility
layers (DXVK, wined3d).
Currently, when a mapping is initiated for host-coherent device memory, existing
image contents are copied from the MTLTexture to host memory. However, if
host-coherent device memory is already mapped in a long-standing memory mapping,
changes to MTLTexture content are not reflected to the host memory.
This change adds that capability to open memory mappings. When the image pipeline
barrier is applied to an image that is attached to a host-coherent device memory
that currently has an open memory mapping, the MTLTexture contents are copied to
the open host memory binding.
Add MVKMappedMemoryRange to retrieve the currently mapped range in an MVKDeviceMemory.
MVKImage cleanup use of isHostCoherent(). Let it be same as MVKDeviceMemory
and don't test for it from macOS because it is unnecessary.
Mark MVKResource::bindDeviceMemory() as virtual.
Remove use of !! from mvkIsAnyFlagEnabled().
Update What's New document for VK_KHR_sampler_ycbcr_conversion.
Report format properties only based on the primary MTLPixelFormat
of a VkFormat, not any possible substitution MTLPixelFormat.
Update MoltenVK version to 1.0.43 and VK_MVK_MOLTENVK_SPEC_VERSION to 26.
Add MVKTranslatedVertexBinding to describe a translated vertex binding, track these in
MVKGraphicsPipeline, and add a Metal vertex layout for each additional translated binding.
MVKGraphicsResourcesCommandEncoderState query MVKGraphicsPipeline for translated bindings
associated with existing bindings and binds the same MTLBuffer multiple times with
different offsets.
The link contains a space between the right square bracket and subsequent left round bracket causing the link to display incorrectly in GitHub's markdown reader.
Crash happened because MVKPhysicalDevices created inline in MVKSmallVector<, 2>
using emplace_back(). Crash occurred when attempting to reallocate memory in
MVKSmallVector, and move inline MVKPhysicalDevice instances. Revert to tracking
MVKPhysicalDevice as pointers, which are easy to move after memory reallocation.
Review and optimize all uses of MVKSmallVector::emplace_back().
Set VkPhysicalDeviceLimits::maxSamplerLodBias to zero since it's unsupported.
Return error and use VK_POLYGON_MODE_LINE for unsupported VK_POLYGON_MODE_POINT.
Functions and functionality supported, but don't currently do anything
until Metal-friendly enumerations added to VkExternalMemoryHandleTypeFlagBits.
Updated What's New document.
Capability functions and functionality supported, but don't currently do anything
until Metal-friendly enumerations added to VkExternalMemoryHandleTypeFlagBits.
Rename MVKPhysicalDevice::getPhysicalDeviceMemoryProperties() to
getMemoryProperties() for consistency.
Updated What's New document.
Added MVKQueuePerformance::frameInterval performance tracker.
Added MVKPerformanceTracker::latestDuration to track duration of most recent activity.
Swapchain performance can be retrieved with other activity performance through
vkGetPerformanceStatisticsMVK().
Performance logging of all activities can be performed periodically
on a frame-count basis, or inline as the activity occurs.
Add MVK_CONFIG_PERFORMANCE_LOGGING_INLINE env var to enable/disable
logging of performance of each activity when it happens.
Removed vkGetSwapchainPerformanceMVK() and MVKSwapchainPerformance from API.
Updated VK_MVK_MOLTENVK_SPEC_VERSION to 25.
Updated MoltenVK version to 1.0.42.
Add MVKPresentableSwapchainImage and MVKPeerSwapchainImage subclasses to
MVKSwapchainImage, with MVKPresentableSwapchainImage instances created inside
swapchain, and MVKPeerSwapchainImage instances created using vkCreateImage().
MVKPeerSwapchainImage retrieve and share CAMetalDrawable from corresponding
MVKPresentableSwapchainImage.
Fix the make install build command to overwrite the existing framework
in the system framework library.
Update README.md to clarify the instructions for using make install.