This has caused us nothing but trouble. The code to build up the vertex
descriptor is fragile; we can rip that out now.
Also, make sure to positively identify per-patch blocks as per-patch.
For those, the individual members have the `Patch` decoration.
Update SPIRV-Cross to pull in the changes needed for this.
Fixes 66 tests in the CTS.
MSL: Handle descriptor aliasing of raw buffer descriptors.
MSL: Do not attempt to alias push constants.
MSL: Report unsupported 64-bit atomics.
MSL: Add more keywords to reserved set.
It is always legal in Vulkan to read a builtin, particularly
`BuiltInPosition`, even if it wasn't written by the previous stage. The
CTS tests that this scenario works in the driver.
Update SPIRV-Cross to pull in a change required for this.
Fixes 8 CTS tests under `dEQP-VK.pipeline.*.no_position`. (Eight other
tests worked solely by accident without this change.)
- Update to latest SPIRV-Cross to support `SPV_KHR_physical_storage_buffer`
for `VK_KHR_buffer_device_address` and `VK_EXT_buffer_device_address`
- Add support for `VK_EXT_buffer_device_address` extension.
- Advertise support for `VK_KHR_buffer_device_address`
and `VK_EXT_buffer_device_address` on macOS 12.5.
- Add appropriate extension reporting and enablement for
`VkPhysicalDeviceBufferDeviceAddressFeatures`,
`VkPhysicalDeviceBufferDeviceAddressFeaturesEXT`, and
`VkPhysicalDeviceFragmentShaderBarycentricFeaturesKHR`.
- Support reading `VkMemoryAllocateFlagsInfo` to identify memory allocations that
need to support buffer pointer access (in case needed in future on non-shared memory).
- Update `Whats_New.md` and `MoltenVK_Runtime_UserGuide` documents.
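
The `VkMemoryAllocateFlagsInfo` lookup mentioned above amounts to walking the `pNext` chain of the allocation info. A minimal sketch, using stand-in types so it compiles without the Vulkan headers: every name prefixed with `Sketch` or `SKETCH_` is hypothetical, the enum and flag values are placeholders rather than the real Vulkan values, and real code would use the definitions from `vulkan.h`.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in definitions (hypothetical). Real code would use VkStructureType,
// VkBaseInStructure and VkMemoryAllocateFlagsInfo from the Vulkan headers.
enum SketchStructureType : uint32_t {
    SKETCH_TYPE_MEMORY_ALLOCATE_INFO = 1,        // placeholder value
    SKETCH_TYPE_MEMORY_ALLOCATE_FLAGS_INFO = 2,  // placeholder value
};

constexpr uint32_t SKETCH_ALLOCATE_DEVICE_ADDRESS_BIT = 0x2;  // placeholder value

// Common header shared by every structure in a pNext chain.
struct SketchBaseInStructure {
    SketchStructureType sType;
    const SketchBaseInStructure* pNext;
};

// Stand-in for VkMemoryAllocateFlagsInfo: chain header first, then the flags.
struct SketchMemoryAllocateFlagsInfo {
    SketchBaseInStructure base;
    uint32_t flags;
};

// Returns true if any flags structure in the chain requests device-address access.
bool needsBufferDeviceAddress(const SketchBaseInStructure* first) {
    for (const SketchBaseInStructure* node = first; node != nullptr; node = node->pNext) {
        if (node->sType == SKETCH_TYPE_MEMORY_ALLOCATE_FLAGS_INFO) {
            // The header is the first member, so the pointers are interconvertible.
            auto* info = reinterpret_cast<const SketchMemoryAllocateFlagsInfo*>(node);
            return (info->flags & SKETCH_ALLOCATE_DEVICE_ADDRESS_BIT) != 0;
        }
    }
    return false;  // no flags structure chained: nothing special requested
}
```

The result could be recorded per allocation and consulted later if non-shared memory ever needs special handling; how it is stored is left open here.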
mvk::getShaderOutputs() in SPIRVReflection.h support flattening nested structures.
MoltenVKShaderConverter tool support loading tessellation shader files.
MoltenVKShaderConverter tool update to MSL 2.4 by default.
Remove use of deprecated MTLCreateSystemDefaultDevice().
Update to latest version of SPIRV-Cross.
Do not use MTLEvent for VkSemaphore under Rosetta2.
Remove compile test for MVK_MACOS_APPLE_SILICON and MVK_APPLE_SILICON when testing
for Apple GPU families, to allow x86 builds to test for Apple GPU under Rosetta2.
Simplify identifying M1 GPU. All M1 SoCs currently support the A14 (Apple7) GPU.
Support compiling MSL 2.4 in runtime pipelines and MoltenVKShaderConverterTool.
Fix issue where MSL 2.3 was only available on Apple Silicon, even on macOS.
Update to latest SPIRV-Cross (unrelated to Rosetta2).
Support maximum point primitive size of 511.
Update to latest SPIRV-Cross version to add support
for OpSpecConstantOp ops OpQuantizeToF16 and OpSRem.
Update MoltenVK version to 1.1.6.
MSL: Support row-major transpose when storing matrix from constant RHS matrix.
MSL: Fix casting in constant expressions with different sizes.
MSL: Fix duplicate gl_Position outputs when gl_Position defined but unused.
Add link to Vulkan SDK Getting Started doc to README.md and
MoltenVK_Runtime_UserGuide.md documents.
Add GitHub CI badge to README.md, and remove Travis CI badge.
Make document notices of use of Markdown into comments
so they are invisible when using a Markdown reader.
On systems not supporting this, the subgroup size is set to 1.
Make sure the subgroup size is fixed in the shader, at least until we
implement `VK_EXT_subgroup_size_control`.
According to the Metal feature set tables, SIMD-group reduction is only
supported on Mac family 2 GPUs and Apple family 7 GPUs. Previously, we
were exposing these on all Mac GPUs.
Quadgroup permutation is supported on all Apple GPUs starting from
family 4. We use them for regular group non-uniform ops as well, so
these are considered to have a subgroup size of 4. On Mac, it's a bit
more complicated. The 2.1 tables say that all Mac GPUs support this, but
the 3.0 and 4.0 tables say that only family 2 supports quadgroup ops.
I've allowed quad ops on family 1 for now.
Unfortunately, my testing shows that SIMD-group functions don't work in
fragment shaders on Mojave, so no fragment shader support until Metal 3.
Update SPIRV-Cross to pull in changes needed for all this.
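
The subgroup size rules described above (full SIMD-group support, quadgroup-only support treated as size 4, otherwise size 1) can be sketched as a small selection function. The `SubgroupCaps` structure and its field names are hypothetical, and `simdWidth` is assumed to be supplied by the caller from whatever the device reports.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical capability summary; the field names are illustrative only.
struct SubgroupCaps {
    bool simdPermute;    // SIMD-group permutation functions are available
    bool quadPermute;    // quadgroup permutation functions are available
    uint32_t simdWidth;  // threads per SIMD-group, as reported for the device
};

// Picks the fixed subgroup size to report and bake into the shader.
uint32_t pickSubgroupSize(const SubgroupCaps& caps) {
    if (caps.simdPermute) {
        assert(caps.simdWidth > 0);
        return caps.simdWidth;  // full SIMD-group support
    }
    if (caps.quadPermute) {
        return 4;               // quadgroup ops stand in for subgroup ops
    }
    return 1;                   // no support: each thread is its own subgroup
}
```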
This was actually added in iOS 13, but it wasn't present in the betas.
Since the betas also didn't support family 6, this leads me to suspect
that `min_lod_clamp()` requires family 6. So to be safe, only enable the
feature on family 6.
Update SPIRV-Cross to pull in the changes needed for this.
This is now supported in MSL 2.3. Support varies by device; devices that
support this return `YES` from `supportsPullModelInterpolation`. Based
on my testing, AMD devices do not yet support this, and Intel devices
do. Apple GPUs probably also support this, in order to support OpenGL on
top.
Update SPIRV-Cross to pull in the changes needed for this.
MVKDevice track enabled VkPhysicalDeviceInlineUniformBlockFeaturesEXT features.
Disable prefilled MTLCommandBuffers if update after binding enabled.
Update to latest SPIRV-Cross that includes support for unsized arrays.
Originally, Metal did not support this directly, and still largely
doesn't on GPUs other than Apple family 6. Therefore, this
implementation uses vertex instancing to draw the needed views. To
support the Vulkan requirement that only the layers for the enabled
views are loaded and stored in a multiview render pass, this
implementation uses multiple Metal render passes for multiple "clumps"
of enabled views.
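
As an illustration of the "clumps" idea, the sketch below splits a view mask into runs of consecutive enabled views, one run per Metal render pass. It assumes a clump is exactly such a run of adjacent set bits; `ViewClump` and `splitViewMaskIntoClumps` are hypothetical names, not the functions used in the implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one clump of enabled views.
struct ViewClump {
    uint32_t firstView;  // index of the first enabled view in the run
    uint32_t viewCount;  // number of consecutive enabled views in the run
};

// Splits a 32-bit view mask into runs of adjacent set bits, lowest view first.
std::vector<ViewClump> splitViewMaskIntoClumps(uint32_t viewMask) {
    std::vector<ViewClump> clumps;
    uint32_t view = 0;
    while (view < 32) {
        if ((viewMask >> view) & 1u) {
            uint32_t start = view;
            // Extend the run while the following views are also enabled.
            while (view < 32 && ((viewMask >> view) & 1u)) { ++view; }
            clumps.push_back({start, view - start});
        } else {
            ++view;
        }
    }
    return clumps;
}
```

For a view mask of `0xB` (views 0, 1 and 3 enabled), this yields two clumps: views 0 to 1, and view 3 alone. Only the layers covered by a clump would then be loaded and stored by its render pass.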
For indirect draws, as with tessellation, we must adjust the draw
parameters at execution time to account for the extra views, so we need
to use deferred store actions here. Without them, tracking the state
becomes too involved.
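
The adjustment for indirect draws can be pictured on the CPU as follows. `DrawIndirectArgs` mirrors the field layout of Metal's `MTLDrawPrimitivesIndirectArguments`; the function names are hypothetical, and the modulo/divide mapping from the expanded instance index back to a view and an instance is an assumption for illustration, not necessarily what the generated shaders do.

```cpp
#include <cassert>
#include <cstdint>

// Same field order as Metal's MTLDrawPrimitivesIndirectArguments.
struct DrawIndirectArgs {
    uint32_t vertexCount;
    uint32_t instanceCount;
    uint32_t vertexStart;
    uint32_t baseInstance;
};

// Each original instance is drawn once per view in the current clump.
void expandForViews(DrawIndirectArgs& args, uint32_t viewCount) {
    args.instanceCount *= viewCount;
}

// Hypothetical result of splitting an expanded instance index.
struct ViewAndInstance {
    uint32_t viewOffset;  // view index relative to the first view of the clump
    uint32_t instance;    // instance index the application originally asked for
};

// Assumed mapping: views vary fastest within each original instance.
ViewAndInstance splitInstanceId(uint32_t expandedInstance, uint32_t viewCount) {
    assert(viewCount > 0);
    return {expandedInstance % viewCount, expandedInstance / viewCount};
}
```

Because the real arguments live in a GPU buffer that is only known at execution time, the multiplication has to happen on the GPU, which is why a compute pass is involved.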
If the implementation doesn't support either layered rendering or
deferred store actions, multiview render passes are instead unrolled and
rendered one view at a time. This will enable us to support the
extension even on older devices and OSes, but at the cost of additional
command buffer memory and (possibly) worse performance.
Eventually, we should consider using vertex amplification to accelerate
this, particularly since indirect multiview draws are terrible and
currently require a compute pass to adjust the instance count. Also,
instanced drawing in itself is terrible due to its subpar performance.
But, since vertex amplification on family 6 only supports two views,
when `VK_KHR_multiview` mandates a minimum of 6, we'll still need to use
instancing to support more than two views.
I have tested this extensively against the CTS. I'm very confident in
its correctness. The only failing tests are
`dEQP-VK.multiview.queries.*`, due to our inadequate implementation of
timestamp queries; and `dEQP-VK.multiview.depth.*`, due to what I assume
is a bug in the way Metal handles arrayed packed depth/stencil textures,
and which may only be a problem on Mojave. I need to test this on
Catalina and Big Sur.
Update SPIRV-Cross to pull in some fixes necessary for this to work.
Fixes #347.
- Delete fat library and framework scripts and templates.
- MoltenVK build package now only includes one XCFramework, and separate platform dylibs.
- Modify fetchDependencies and Makefile targets to not build fat libraries,
and to build simulators separately from platforms instead.
- Script package_moltenvk.sh now copies dylibs for all built platforms.
- Consolidate package_all.sh and delete package_one_os.sh.
- Swap names of copy_lib_to_staging.sh and copy_to_staging.sh scripts.
- Cube demo now uses MoltenVK as XCFramework, and supports Simulator builds.
- Hologram demo now uses MoltenVK as dylibs from new packaging location.
- API-Samples demo now uses MoltenVK as XCFramework.
- Update documentation.
fetchDependencies support option to skip all library builds.
fetchDependencies avoid sync locks if not building in parallel.
fetchDependencies build glslang headers.
Update ExternalRevisions/README.md glslang build integration section.
Update What's New.
SPIRV-Cross can now AND the `gl_SampleMask` output with an additional
fixed mask, presumably from the pipeline. Use this new functionality to
implement pipeline sample mask handling.
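
The effect of the fixed mask can be sketched in one line of C++. The function name is hypothetical; the sketch assumes a single 32-bit mask word, and a shader that does not write `gl_SampleMask` is treated as if it had written all ones.

```cpp
#include <cassert>
#include <cstdint>

// A sample is written only if both the shader and the pipeline enable it.
uint32_t effectiveSampleMask(uint32_t shaderSampleMask, uint32_t pipelineSampleMask) {
    return shaderSampleMask & pipelineSampleMask;
}
```

With a shader mask of `0xF` and a pipeline mask of `0x5`, only samples 0 and 2 remain enabled.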
Special thanks to Tomek Pontika and Corentin Wallez of Google for
graciously contributing their implementation to SPIRV-Cross.
Update SPIRV-Cross to pull in the change necessary for this.