The [MTLDevice sampleTimestamps:gpuTimestamp:] function turns out to be
synchronized with other queue activities, and can block GPU execution
if it is called between MTLCommandBuffer submissions. On non-Apple-Silicon
devices, it was called before and after every vkQueueSubmit() submission,
to track the correlation between GPU and CPU timestamps, and was delaying
the start of GPU work on the next submission (on Apple Silicon, both
CPU & GPU timestamps are specified in nanoseconds, and the call was bypassed).
Move timestamp correlation from vkQueueSubmit() to
vkGetPhysicalDeviceProperties(), where it is used to update
VkPhysicalDeviceLimits::timestampPeriod on non-Apple-Silicon devices.
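The correlation described above amounts to dividing elapsed CPU time by elapsed GPU ticks between two sample pairs. The following is a minimal sketch only, not the MoltenVK code: `TimestampSample` and `computeTimestampPeriod()` are hypothetical names, and the CPU values are assumed to have already been converted to nanoseconds.

```cpp
#include <cstdint>

// Hypothetical pair of correlated samples, such as one call to
// -[MTLDevice sampleTimestamps:gpuTimestamp:] might yield.
// Assumption: the CPU value has already been converted to nanoseconds.
struct TimestampSample {
    uint64_t cpuNanos;  // CPU timestamp, in nanoseconds
    uint64_t gpuTicks;  // GPU timestamp, in device-specific ticks
};

// Returns nanoseconds per GPU tick, which is the unit of
// VkPhysicalDeviceLimits::timestampPeriod. If the samples are not
// strictly increasing, the supplied fallback is returned unchanged.
inline float computeTimestampPeriod(const TimestampSample& earlier,
                                    const TimestampSample& later,
                                    float fallbackPeriod) {
    if (later.gpuTicks <= earlier.gpuTicks || later.cpuNanos <= earlier.cpuNanos) {
        return fallbackPeriod;
    }
    double cpuDelta = double(later.cpuNanos - earlier.cpuNanos);
    double gpuDelta = double(later.gpuTicks - earlier.gpuTicks);
    return float(cpuDelta / gpuDelta);
}
```

Sampling only when properties are queried keeps the potentially blocking call out of the submission path, at the cost of a less frequently refreshed period.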
Delegate MVKPhysicalDevice::getProperties(VkPhysicalDeviceProperties2*)
to MVKPhysicalDevice::getProperties(VkPhysicalDeviceProperties*), plus
minimize wasted effort if pNext is empty (unrelated).
Move the declaration of several MVKPhysicalDevice member structs to
potentially reduce member spacing (unrelated).
- Remove Xcode 11 build from GitHub CI.
- Leave MVK_XCODE_12 guards in place to allow devs to possibly continue to
attempt to build existing MoltenVK code using Xcode 11, even though it's
not officially supported. Such devs may have to add their own additional
MVK_XCODE_12 guards for any Xcode 12 API features added after this change.
- Remove visionOS from multi-platform builds because it
requires Xcode 15+ and will abort a multi-platform build.
- Define TARGET_OS_XR for older SDKs.
- A number of SDK deprecation warnings remain when building for visionOS.
These cannot be removed without significant refactoring.
- Build visionOS dependencies for Release build by default.
- Fix local variable initialization warning (unrelated).
To reduce complexity and repetitive copy-pasted spaghetti code,
the design approach here was to implement triangle fan conversion on
MVKCmdDrawIndexedIndirect, as the most general of the draw commands,
and then populate and invoke a synthetic MVKCmdDrawIndexedIndirect
command from the other draw commands.
- Rename pipeline factory shader cmdDrawIndexedIndirectMultiviewConvertBuffers()
to cmdDrawIndexedIndirectConvertBuffers(), and in addition to the original support
for modifying indirect content to support multiview, add support for
converting triangle fan indirect content and indexes to triangle list.
- Modify MVKCmdDrawIndexedIndirect to track need to convert triangle fans
to triangle list, and invoke kernel function when needed.
- Modify MVKCmdDraw, MVKCmdDrawIndexed, and MVKCmdDrawIndirect to populate
and invoke a synthetic MVKCmdDrawIndexedIndirect command to convert triangle
fans to triangle lists.
- Add pipeline factory shader cmdDrawIndirectPopulateIndexes() to convert
non-indexed indirect content to indexed indirect content.
- MVKCmdDrawIndexedIndirect add support for zero divisor vertex buffers
potentially coming from MVKCmdDraw and MVKCmdDrawIndexed.
- Rename pipeline factory shader cmdDrawIndexedIndirectConvertBuffers()
to cmdDrawIndexedIndirectTessConvertBuffers() so it will be invoked from
MVKCommandEncodingPool::getCmdDrawIndirectTessConvertBuffersMTLComputePipelineState()
(unrelated).
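For illustration, the index transformation behind the triangle fan conversion can be sketched on the CPU as follows. This is not the Metal kernel itself: `convertTriangleFanToList()` is a hypothetical helper, and the vertex order (shared vertex last) is assumed from the Vulkan triangle fan definition. When no indexes are supplied, implicit indexes 0..vertexCount-1 are generated first, which is the role the text assigns to cmdDrawIndirectPopulateIndexes().

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Expands a triangle fan into a triangle list.
// If fanIndices is empty, the draw is treated as non-indexed and the
// implicit indexes 0..vertexCount-1 are used instead.
inline std::vector<uint32_t> convertTriangleFanToList(const std::vector<uint32_t>& fanIndices,
                                                      uint32_t vertexCount) {
    std::vector<uint32_t> fan = fanIndices;
    if (fan.empty()) {
        fan.resize(vertexCount);
        for (uint32_t i = 0; i < vertexCount; ++i) { fan[i] = i; }
    }

    std::vector<uint32_t> list;
    if (fan.size() < 3) { return list; }  // Not enough vertices for a triangle

    // A fan of N vertices yields N-2 triangles, each sharing fan[0].
    list.reserve((fan.size() - 2) * 3);
    for (size_t i = 1; i + 1 < fan.size(); ++i) {
        list.push_back(fan[i]);
        list.push_back(fan[i + 1]);
        list.push_back(fan[0]);
    }
    return list;
}
```

Routing every draw variant through one indexed conversion means only this single mapping has to be maintained.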
This just provides support for the `SPV_KHR_non_semantic_info`
extension, which supports extended instruction sets that do not affect
the semantics of a SPIR-V shader (e.g. debug info). SPIRV-Cross already
handles these instruction sets, so no additional work is required on our
part to support this extension.
This extension has a direct Metal equivalent in the
`-[MTLDevice sampleTimestamps:gpuTimestamp:]` method. However, that
method returns CPU timestamps in the Mach absolute time domain, which is
*not* that of `CLOCK_MONOTONIC_RAW` but of `CLOCK_UPTIME_RAW`. The
function that corresponds to `CLOCK_MONOTONIC_RAW` is
`mach_continuous_time()`. Therefore, this implementation uses the
`mach_continuous_time()` function for the CPU timestamp. Perhaps we
should lobby the WG for `VK_TIME_DOMAIN_CLOCK_UPTIME_RAW_EXT`.
This turned out to be a little bit more involved than I had hoped. But,
with this, we can now use the `VK_FORMAT_A4R4G4B4_UNORM_PACK16` and
`VK_FORMAT_A4B4G4R4_UNORM_PACK16` formats from shaders, use them as blit
sources, and even clear them. Storage images and render targets of these
formats aren't supported, however. To support the latter would require
the insertion of a swizzle into the fragment shader before returning.
The former cannot be reasonably supported.
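As a rough illustration of why a swizzle is involved, the two packed formats differ only in which 4-bit field holds red and which holds blue. The sketch below decodes both layouts on the CPU. The bit positions follow the Vulkan packed-format naming (first-named component in the most significant bits); `RGBA4` and the helper functions are hypothetical and are not part of MoltenVK.

```cpp
#include <cstdint>

struct RGBA4 { uint8_t r, g, b, a; };  // Each component is in the range 0..15

// VK_FORMAT_A4R4G4B4_UNORM_PACK16: A in bits 12-15, R in 8-11, G in 4-7, B in 0-3.
inline RGBA4 unpackA4R4G4B4(uint16_t texel) {
    return RGBA4{ uint8_t((texel >> 8) & 0xF), uint8_t((texel >> 4) & 0xF),
                  uint8_t(texel & 0xF),        uint8_t((texel >> 12) & 0xF) };
}

// VK_FORMAT_A4B4G4R4_UNORM_PACK16: A in bits 12-15, B in 8-11, G in 4-7, R in 0-3.
inline RGBA4 unpackA4B4G4R4(uint16_t texel) {
    return RGBA4{ uint8_t(texel & 0xF),        uint8_t((texel >> 4) & 0xF),
                  uint8_t((texel >> 8) & 0xF), uint8_t((texel >> 12) & 0xF) };
}

// The swizzle that maps one interpretation onto the other: swap red and blue.
inline RGBA4 swapRedBlue(RGBA4 c) {
    return RGBA4{ c.b, c.g, c.r, c.a };
}
```

Reads can apply such a swizzle when sampling, whereas a render target would need the inverse applied to the fragment output, which is the extra shader work mentioned above.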
As of macOS Big Sur and iOS/tvOS 14, the `discard_fragment()` function
in MSL is defined to have demote semantics; that is, fragment shader
output is discarded, but the fragment shader thread continues to run as
a helper invocation. This is very useful for Direct3D emulation, since
this is the semantic that HLSL `discard` has.
Signed-off-by: Chip Davis <chip@holochip.com>
- [MTLDrawable presentAtTime:] syncs to display vsync. To support
VK_PRESENT_MODE_IMMEDIATE_KHR while using VkPresentTimeGOOGLE::presentID,
only call presentAtTime: if VkPresentTimeGOOGLE::desiredPresentTime has
been explicitly set to a non-zero value.
- Clarify initially clearing MVKImagePresentInfo to all zeros.
- Only log performance stats on FPS logging if logging style is explicitly
set to MVK_CONFIG_ACTIVITY_PERFORMANCE_LOGGING_STYLE_FRAME_COUNT (unrelated).
The same compute encoder is used across dispatches and other commands,
which may override compute state, and end up breaking subsequent dispatches.
- Mark compute encoding state dirty when the following commands,
which use Metal compute encoders, are issued:
- vkCmdCopyBuffer()
- vkCmdClearColorImage()
- vkCmdClearDepthStencilImage()
- vkCmdFillBuffer()
- vkCmdCopyQueryPoolResults()
- MVKCommandEncoder move marking compute state dirty from
endCurrentMetalEncoding() to getMTLComputeEncoder().
- For efficiency, don't prematurely force end of query copy compute encoder
used on renderpass end, in case compute dispatches follow.
- Update MoltenVK to 1.2.5 (unrelated).
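The dirty-marking approach above can be sketched with a small mock. All names here (`MockComputeEncoder`, `ComputeStateTracker`, `encodeInternalCommand()`) are hypothetical stand-ins and not the MoltenVK classes. The point is only that an internal command that borrows the shared encoder must mark the app's compute state dirty so the next dispatch re-binds it.

```cpp
#include <cstdint>

// Hypothetical stand-in for a shared Metal compute encoder.
struct MockComputeEncoder {
    uint32_t boundPipeline = 0;
    uint32_t bindCount = 0;
    void setPipeline(uint32_t pipeline) { boundPipeline = pipeline; ++bindCount; }
};

// Tracks the app's compute pipeline and re-binds it only when dirty.
class ComputeStateTracker {
public:
    void setPipeline(uint32_t pipeline) { _pipeline = pipeline; _isDirty = true; }
    void markDirty() { _isDirty = true; }

    // Called before each dispatch. Skips redundant binds when state is clean.
    void encode(MockComputeEncoder& encoder) {
        if (!_isDirty) { return; }
        encoder.setPipeline(_pipeline);
        _isDirty = false;
    }

private:
    uint32_t _pipeline = 0;
    bool _isDirty = true;
};

// Simulates an internal command (for example a buffer fill) that borrows the
// shared encoder, binds its own pipeline, and then marks app state dirty.
inline void encodeInternalCommand(MockComputeEncoder& encoder,
                                  ComputeStateTracker& appState,
                                  uint32_t internalPipeline) {
    encoder.setPipeline(internalPipeline);
    appState.markDirty();
}
```

Without the markDirty() call, the next dispatch would run with the internal command's pipeline still bound, which is the breakage described above.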
Advertise the VK_KHR_depth_stencil_resolve extension on early iOS devices,
since VK_RESOLVE_MODE_SAMPLE_ZERO_BIT is supported on all devices,
even if other resolve modes are not. This is consistent with
Vulkan 1.2 mandatory support for VK_RESOLVE_MODE_SAMPLE_ZERO_BIT.
- MTLDevice registryID is not constant across OS reboots,
which is not conformant with deviceUUID requirements.
- Replace with combination of MTLDevice location, locationNumber,
peerGroupID, and peerIndex, which should define uniqueness,
and should be constant across OS reboots.
- Populate deviceLUID from MTLDevice registryID.
- Report error, but do not fail on request for timestamp query pool
that is too large for MTLCounterSampleBuffer.
- Change reported error to VK_ERROR_OUT_OF_DEVICE_MEMORY and clarify
text of error reported when timestamp query pool is too large.
- Clarify error reported for occlusion query pool errors (unrelated).
- Make MVKDevice::enableFeatures() functions into templates to pass struct type.
- Add mvkGetAddressOfFirstMember() to retrieve the address of the first member of
a struct, taking into consideration whether the struct has a Vulkan pNext member.
- Add mvk::getTypeName() and mvk::getOrdinalSuffix() string functions.
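One way a helper like mvkGetAddressOfFirstMember() could be written is sketched below. This is an assumption about the technique, not the MoltenVK source: the struct types, the `HasPNext` trait, and `getAddressOfFirstMember()` are hypothetical, and the sketch assumes the first content member starts immediately after pNext with no extra padding.

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for a plain struct and a Vulkan-style chained struct.
struct FeaturesNoChain { uint32_t featureA; uint32_t featureB; };
struct FeaturesChained { int32_t sType; void* pNext; uint32_t featureA; uint32_t featureB; };

// Trait that detects whether a struct type has a member named pNext.
template <typename T, typename = void>
struct HasPNext : std::false_type {};
template <typename T>
struct HasPNext<T, decltype(void(std::declval<T&>().pNext))> : std::true_type {};

// Chained struct: the first content member follows the pNext pointer.
template <typename S>
const void* firstMemberAddress(const S& s, std::true_type) {
    return reinterpret_cast<const char*>(&s.pNext) + sizeof(s.pNext);
}

// Plain struct: the first member is at the start of the struct.
template <typename S>
const void* firstMemberAddress(const S& s, std::false_type) {
    return &s;
}

// Selects the right overload at compile time based on the struct type.
template <typename S>
const void* getAddressOfFirstMember(const S& s) {
    return firstMemberAddress(s, HasPNext<S>{});
}
```

Passing the struct type as a template parameter is what lets the pNext check be resolved at compile time rather than by inspecting sType at run time.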
- Build one universal build, instead of per-platform.
- Upload this single build artifact to GitHub.
- Upgrade to v3 of action dependencies to remove Node.js deprecation warnings.
- Avoid use of deprecated set-output GitHub action command.
- Use macOS 13 and Xcode 14.3.
- README.md document access to binary artifacts.
- MVKPresentableSwapchainImage::presentCAMetalDrawable() and
addPresentedHandler() pass MVKImagePresentInfo by value instead
of reference, to avoid callbacks colliding with tracked
MVKImagePresentInfos being cleared when
MVKQueuePresentSurfaceSubmission is destroyed after it is run.
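The lifetime issue above can be illustrated with a simplified sketch, in which a callback keeps its own copy of the present info instead of referring to storage owned by the submission. `PresentInfo` and `PresentSubmission` are hypothetical stand-ins, not the MoltenVK types.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical, simplified stand-in for MVKImagePresentInfo.
struct PresentInfo {
    uint32_t presentID;
    uint64_t desiredPresentTime;
};

// Hypothetical stand-in for a submission that owns the tracked present infos.
struct PresentSubmission {
    std::vector<PresentInfo> infos;

    // Returns a handler that captures a copy of the present info, so it stays
    // valid after the submission clears its infos or is destroyed.
    std::function<uint32_t()> makePresentedHandler(size_t index) const {
        PresentInfo copy = infos[index];
        return [copy]() { return copy.presentID; };
    }
};
```

Capturing a reference to `infos[index]` instead would leave the handler reading freed storage once the submission is gone. Copying a small struct is a low price for avoiding that.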
Also undeprecate the original vkGet/SetMoltenVKConfigurationMVK().
In expectation of the upcoming VK_EXT_layer_settings extension, it is felt that
adding these additional functions at this time would be confusing to app devs.
- Reinstate VK_MVK_moltenvk extension, but log warning message when it is enabled.
- Add vkGetMoltenVKConfiguration2MVK() and vkSetMoltenVKConfiguration2MVK()
to set config without passing a dummy VkInstance, and deprecate
vkGetMoltenVKConfigurationMVK() and vkSetMoltenVKConfigurationMVK().
The VK_MVK_moltenvk extension has never been brought inside Vulkan, and
the functions have never been supported by the Vulkan Loader and Layers.
Most of the functionality has long been replaced by the official
VK_metal_objects extension.
- Remove VK_MVK_moltenvk as an advertised extension.
- Refactor vk_mvk_moltenvk.h header file into separate header files:
- mvk_config.h - Valid public config functions
- mvk_private_api.h - Valid development debugging functions used with care
- mvk_deprecated_api.h - Formally deprecated functions.
- Retain skeleton vk_mvk_moltenvk.h header file for legacy compatibility only.
- Update documentation and header comments to explain changes.
- MVKRenderSubpass add separate getDepthFormat() & getStencilFormat(),
and isDepthAttachmentUsed() & isStencilAttachmentUsed() and use
instead of testing pixel format for depth and stencil components.
- Add MVKRenderingAttachmentIterator class to consistently iterate,
and take actions, on the attachments in VkRenderingInfo to create
synthetic MVKRenderPass and extract image views and clear colors.
- Remove mvkCreateRenderPass() and mvkCreateFramebuffer() in favor
of additional constructors, and remove mvkGetDepthStencilFormat() in
favor of retrieving formats for separate depth and stencil attachments.
- MVKRenderpass constructors reorganize order of adding attachments and
subpasses, and connecting the two.
- Rename MVKRenderPassAttachment to MVKAttachmentDescription.
- MVKPipeline reorganize member variables to minimize gaps in content
and remove unnecessary _isRasterizingDepthStencil member var (unrelated).
- MVKDevice track VkBuffers marked with VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT.
- Add SPIRVToMSLConversionResultInfo::usesPhysicalStorageBufferAddressesCapability
to detect and track shaders that use PhysicalStorageBufferAddresses capability,
and track such shader stages within pipeline.
- MVKResourcesCommandEncoderState encode usage of VkBuffers marked with
VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT when pipeline uses
PhysicalStorageBufferAddresses capability.
- Rename MVKResourcesCommandEncoderState::encodeArgumentBufferResourceUsage()
to encodeResourceUsage().
- MVKDevice move some functions to public scope and remove friend classes.
- MVKDeviceMemory ensure _vkMemAllocFlags is always initialized (unrelated).
- Rename MVKFoundation template method contains() to mvkContains() (unrelated).
- Add MVK_XCODE_14_3 macro to compile for iOS/tvOS 16.4 and above.
- Add support for BC compression on iOS/tvOS 16.4 and above where supported.
- Consolidate MVKPixelFormats::modifyMTLFormatCapabilities(mtlDev)
and centralize querying MTLDevice format methods for all platforms.
- Fix memory leak when waiting on timeline semaphores.
- For correctness, set VkPhysicalDeviceLimits::lineWidthGranularity to 1.
- Update MoltenVK to version 1.2.4.
- Update Whats_New.md document with recent changes.
- Cleanup VkPhysicalDeviceShaderAtomicFloatFeaturesEXT enablement and documentation.
- Cleanup VkPhysicalDevicePipelineCreationCacheControlFeaturesEXT enablement.
- Expand MVK_CONFIG_TRACE_VULKAN_CALLS to log thread ID only if requested.
- Add MVKCompressor template class, and mvkCompress() & mvkDecompress()
functions to support general data compression.
- Add MVKConfiguration::shaderSourceCompressionAlgorithm and
env var MVK_CONFIG_SHADER_COMPRESSION_ALGORITHM to support
compressing MSL shader source code held in a pipeline cache.
- Add MVKShaderCompilationPerformance::mslCompress and mslDecompress
to allow performance of MSL compression to be tracked and queried.
- Add support for logging performance stats accumulated in a VkDevice,
when it is destroyed. Good for CTS testing.
- Change MVKConfiguration::logActivityPerformanceInline boolean to
activityPerformanceLoggingStyle enumeration value.
- Add MVK_CONFIG_ACTIVITY_PERFORMANCE_LOGGING_STYLE environment variable and
build setting to set MVKConfiguration::activityPerformanceLoggingStyle value.
- Don't retain converted MSL source code in MVKShaderModule.
- Add SPIRVToMSLConversionResult and GLSLToSPIRVConversionResult
structures to capture all feedback from shader conversions.
- Fix crash when VkCommandBufferInheritanceInfo::renderPass is VK_NULL_HANDLE.
- Do not clear attachments when dynamic rendering is resumed.
- Allow ending dynamic rendering to trigger next multiview pass if needed.
- Move deciding to begin next multiview pass to MVKCommandEncoder.
- Fix premature caching of occlusion query results during tessellation rendering.
Tessellation ends Metal renderpass for compute control and eval stages.
Wait until end of Metal renderpass after rasterization stage.
- vkCmdCopyQueryPoolResults(): Fix loss of queries when query
count is not a multiple of GPU threadgroup execution width.
- Disable occlusion recording while clearing attachments or render area.
- MVKCmdClearAttachments improve labelling of MTLDebugGroup to better
distinguish clearing renderpass render area from vkCmdClearAttachments()
in an Xcode GPU capture (unrelated but helpful during debugging).
- MVKCmdClearAttachments re-order member variables to
optimize memory requirements (unrelated).
- MVKCommandBuffer remove unused renderpass tracking functions (unrelated).
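For the vkCmdCopyQueryPoolResults() fix above, one plausible way queries are lost is a threadgroup count computed by truncating division. Whether that was the actual cause is an assumption, not something the text states. The sketch below contrasts truncation with rounding up, using hypothetical helper names.

```cpp
#include <cstdint>

// Rounds up so that every query is covered by a thread, even when queryCount
// is not a multiple of the threadgroup execution width.
inline uint32_t threadgroupCountFor(uint32_t queryCount, uint32_t executionWidth) {
    if (executionWidth == 0) { return 0; }
    return queryCount / executionWidth + (queryCount % executionWidth != 0 ? 1 : 0);
}

// Truncating division, shown only for contrast: trailing queries are dropped.
inline uint32_t truncatedThreadgroupCount(uint32_t queryCount, uint32_t executionWidth) {
    if (executionWidth == 0) { return 0; }
    return queryCount / executionWidth;
}
```

When rounding up, the kernel must also bounds-check the thread index against the query count, since the last threadgroup may contain threads past the end.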