MSL: Support row-major transpose when storing matrix from constant RHS matrix.
MSL: Fix casting in constant expressions with different sizes.
MSL: Fix duplicate gl_Position outputs when gl_Position defined but unused.
Add link to Vulkan SDK Getting Started doc to README.md and
MoltenVK_Runtime_UserGuide.md documents.
Add GitHub CI badge to README.md, and remove Travis CI badge.
Turn the documents' notices about the use of Markdown into comments
so they are invisible when using a Markdown reader.
On systems not supporting this, the subgroup size is set to 1.
Make sure the subgroup size is fixed in the shader, at least until we
implement `VK_EXT_subgroup_size_control`.
According to the Metal feature set tables, SIMD-group reduction is only
supported on Mac family 2 GPUs and Apple family 7 GPUs. Previously, we
were exposing these on all Mac GPUs.
Quadgroup permutation is supported on all Apple GPUs starting from
family 4. We use them for regular group non-uniform ops as well, so
these are considered to have a subgroup size of 4. On Mac, it's a bit
more complicated. The 2.1 tables say that all Mac GPUs support this, but
the 3.0 and 4.0 tables say that only family 2 supports quadgroup ops.
I've allowed quad ops on family 1 for now.
Unfortunately, my testing shows that SIMD-group functions don't work in
fragment shaders on Mojave, so no fragment shader support until Metal 3.
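The family rules described above can be summarized in a small sketch. This is only an illustration of the text, not the MoltenVK code: `GpuKind`, `supportsSimdReduction`, `supportsQuadPermute` and `subgroupSize` are hypothetical names, treating later families as also supported is an assumption, and `hasSimdGroupOps` and `simdWidth` are inputs the caller is assumed to obtain elsewhere.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification of the two GPU lines mentioned in the text.
enum class GpuKind { Apple, Mac };

// SIMD-group reduction: Mac family 2 and Apple family 7 (later families assumed).
bool supportsSimdReduction(GpuKind kind, int family) {
    return kind == GpuKind::Mac ? family >= 2 : family >= 7;
}

// Quadgroup permutation: Apple family 4 and up; Mac family 1 is allowed "for now".
bool supportsQuadPermute(GpuKind kind, int family) {
    return kind == GpuKind::Mac ? family >= 1 : family >= 4;
}

// Subgroup size to report: the SIMD width when SIMD-group ops are usable,
// 4 when only quadgroup ops are available, and 1 when neither is supported.
uint32_t subgroupSize(bool hasSimdGroupOps, bool hasQuadPermute, uint32_t simdWidth) {
    assert(simdWidth > 0);
    if (hasSimdGroupOps) return simdWidth;
    if (hasQuadPermute) return 4;
    return 1;
}
```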
Update SPIRV-Cross to pull in changes needed for all this.
This was actually added in iOS 13, but it wasn't present in the betas.
Since the betas also didn't support family 6, this leads me to suspect
that `min_lod_clamp()` requires family 6. So to be safe, only enable the
feature on family 6.
Update SPIRV-Cross to pull in the changes needed for this.
This is now supported in MSL 2.3. Support varies by device; devices that
support this return `YES` from `supportsPullModelInterpolation`. Based
on my testing, AMD devices do not yet support this, and Intel devices
do. Apple GPUs probably also support this, in order to support OpenGL on
top.
Update SPIRV-Cross to pull in the changes needed for this.
MVKDevice tracks enabled VkPhysicalDeviceInlineUniformBlockFeaturesEXT features.
Disable prefilled MTLCommandBuffers if update after binding enabled.
Update to latest SPIRV-Cross that includes support for unsized arrays.
Originally, Metal did not support this directly, and still largely
doesn't on GPUs other than Apple family 6. Therefore, this
implementation uses vertex instancing to draw the needed views. To
support the Vulkan requirement that only the layers for the enabled
views are loaded and stored in a multiview render pass, this
implementation uses multiple Metal render passes for multiple "clumps"
of enabled views.
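A minimal sketch of how a view mask might be split into "clumps", assuming a clump means a maximal run of consecutive enabled view bits. `ViewClump` and `splitIntoClumps` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One contiguous run of enabled views (hypothetical helper type).
struct ViewClump {
    uint32_t firstView;
    uint32_t viewCount;
};

// Split a 32-bit view mask into maximal runs of consecutive set bits.
std::vector<ViewClump> splitIntoClumps(uint32_t viewMask) {
    std::vector<ViewClump> clumps;
    uint32_t view = 0;
    while (viewMask != 0) {
        // Skip views that are not enabled.
        while ((viewMask & 1u) == 0) { viewMask >>= 1; ++view; }
        // Consume the run of enabled views that starts here.
        uint32_t first = view;
        while ((viewMask & 1u) != 0) { viewMask >>= 1; ++view; }
        clumps.push_back({first, view - first});
    }
    assert(view <= 32);
    return clumps;
}
```

Under this reading, a mask of `0x0D` (views 0, 2 and 3) yields two clumps, one covering view 0 and one covering views 2 and 3, and each clump would get its own Metal render pass.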
For indirect draws, as with tessellation, we must adjust the draw
parameters at execution time to account for the extra views, so we need
to use deferred store actions here. Without them, tracking the state
becomes too involved.
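As a rough illustration of the parameter adjustment, the sketch below multiplies the instance count by the number of views and shows one possible way to recover the view and the original instance from the expanded index. `DrawArgs` is a local stand-in that mirrors the fields of `VkDrawIndirectCommand`; the helper names and the modulo/divide ordering are assumptions, not something the text above specifies.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-in mirroring the fields of Vulkan's VkDrawIndirectCommand.
struct DrawArgs {
    uint32_t vertexCount;
    uint32_t instanceCount;
    uint32_t firstVertex;
    uint32_t firstInstance;
};

// Each application instance is drawn once per view in the current clump.
DrawArgs expandForViews(DrawArgs args, uint32_t viewCount) {
    assert(viewCount > 0);
    args.instanceCount *= viewCount;
    return args;
}

struct ViewAndInstance {
    uint32_t view;
    uint32_t instance;
};

// Assumed decoding: consecutive expanded instances cycle through the views.
ViewAndInstance decodeInstance(uint32_t expandedInstance, uint32_t firstView, uint32_t viewCount) {
    assert(viewCount > 0);
    return {firstView + expandedInstance % viewCount, expandedInstance / viewCount};
}
```

For an indirect draw, the same multiplication has to happen on the GPU, because the instance count is not known when the command buffer is encoded.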
If the implementation doesn't support either layered rendering or
deferred store actions, multiview render passes are instead unrolled and
rendered one view at a time. This will enable us to support the
extension even on older devices and OSes, but at the cost of additional
command buffer memory and (possibly) worse performance.
Eventually, we should consider using vertex amplification to accelerate
this, particularly since indirect multiview draws are terrible and
currently require a compute pass to adjust the instance count. Also,
instanced drawing in itself is terrible due to its subpar performance.
But, since vertex amplification on family 6 only supports two views,
when `VK_KHR_multiview` mandates a minimum of 6, we'll still need to use
instancing to support more than two views.
I have tested this extensively against the CTS. I'm very confident in
its correctness. The only failing tests are
`dEQP-VK.multiview.queries.*`, due to our inadequate implementation of
timestamp queries; and `dEQP-VK.multiview.depth.*`, due to what I assume
is a bug in the way Metal handles arrayed packed depth/stencil textures,
and which may only be a problem on Mojave. I need to test this on
Catalina and Big Sur.
Update SPIRV-Cross to pull in some fixes necessary for this to work.
Fixes #347.
- Delete fat library and framework scripts and templates.
- MoltenVK build package now only includes one XCFramework, and separate platform dylibs.
- Modify fetchDependencies and Makefile targets to not build fat libraries,
and to build simulators separately from platforms instead.
- Script package_moltenvk.sh now copies dylibs for all built platforms.
- Consolidate package_all.sh and delete package_one_os.sh.
- Swap names of copy_lib_to_staging.sh and copy_to_staging.sh scripts.
- Cube demo now uses MoltenVK as XCFramework, and supports Simulator builds.
- Hologram demo now uses MoltenVK as dylibs from new packaging location.
- API-Samples demo now uses MoltenVK as XCFramework.
- Update documentation.
fetchDependencies supports option to skip all library builds.
fetchDependencies avoids sync locks if not building in parallel.
fetchDependencies builds glslang headers.
Update ExternalRevisions/README.md glslang build integration section.
Update What's New.
SPIRV-Cross can now AND the `gl_SampleMask` output with an additional
fixed mask, presumably from the pipeline. Use this new functionality to
implement pipeline sample mask handling.
Special thanks to Tomek Pontika and Corentin Wallez of Google for
graciously contributing their implementation to SPIRV-Cross.
Update SPIRV-Cross to pull in the change necessary for this.
Metal is picky about interface matching. If the types of a vertex output
and its corresponding fragment input don't match, down to the number of
vector components, it fails pipeline compilation. To support cases where
the number of components in the fragment input is less than the
corresponding vertex output, we need to fix up the fragment shader to
accept the extra components.
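The fix-up can be pictured as widening each fragment input to the component count of the matching vertex output. The sketch below models that with a map from location to component count; `ComponentCounts` and `widenFragmentInputs` are hypothetical names, and the real change is applied to the generated shader rather than to a table like this.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Interface location -> number of vector components (1 to 4).
using ComponentCounts = std::map<uint32_t, uint32_t>;

// Widen each fragment input that is narrower than the vertex output
// at the same location. Inputs with no matching output are left alone.
ComponentCounts widenFragmentInputs(const ComponentCounts& vertexOutputs,
                                    const ComponentCounts& fragmentInputs) {
    ComponentCounts fixed = fragmentInputs;
    for (auto& entry : fixed) {
        assert(entry.second >= 1 && entry.second <= 4);
        auto out = vertexOutputs.find(entry.first);
        if (out != vertexOutputs.end() && out->second > entry.second) {
            entry.second = out->second;
        }
    }
    return fixed;
}
```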
Add Scripts/packagePregenSpirvToolsHeaders script to automate packaging SPIRV-Tools
headers in support of the fetchDependencies --skip-spirv-tools-build option.
Update Docs/Whats_New.md.
Update Cereal archive structs to match latest MoltenVK and SPIRV-Cross structs.
Fix recent error that caused pipeline cache data to be ignored during loading.
Update to Vulkan-Headers version 1.2.135.
Update to latest SPIRV-Cross version.
Update Whats_New.md document.
Add SPIRVShaderOutput::isUsed retrieved from shader reflection.
mvk::sizeOfOutput() returns zero if output var is not used.
Update to latest SPIRV-Cross version.
Add `MVK_CONFIG_TEXTURE_1D_AS_2D` environment variable, enabled by default.
Modify 1D warning messages to recommend use of `MVK_CONFIG_TEXTURE_1D_AS_2D`.
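A minimal sketch of an environment variable that is enabled by default, assuming the value is read as an integer and any non-zero value enables the feature. `parseBoolSetting` and `readBoolEnv` are hypothetical helpers, and the parsing rule is an assumption rather than the documented behavior.

```cpp
#include <cassert>
#include <cstdlib>

// Interpret a setting string: unset or empty keeps the default,
// otherwise any non-zero integer enables it (assumed rule).
bool parseBoolSetting(const char* value, bool defaultValue) {
    if (value == nullptr || *value == '\0') return defaultValue;
    return std::strtol(value, nullptr, 10) != 0;
}

// Read a boolean setting from the process environment.
bool readBoolEnv(const char* name, bool defaultValue) {
    assert(name != nullptr);
    return parseBoolSetting(std::getenv(name), defaultValue);
}
```

With these helpers, `readBoolEnv("MVK_CONFIG_TEXTURE_1D_AS_2D", true)` stays enabled unless the variable is explicitly set to `0`.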
Update to latest version of SPIRV-Cross.
Align pipeline cache contents to latest CompilerMSL::Options structure.
Clean up code signing on demo Xcode projects.
Remove obsolescence log message for vkCreateMacOSSurfaceMVK()
and vkCreateIOSSurfaceMVK() functions.
Fix test for alignment of invalid pixel formats.
Update dependency libraries to match Vulkan SDK 1.1.121.
Update to renaming of VK_INTEL_shader_integer_functions2
enums and structs in latest Vulkan headers.
Update Whats_New.md document.
This extension allows fragment shaders to delineate critical sections
where pairs of invocations may not execute simultaneously. In Metal, the
nearest equivalent functionality is raster order groups, so the
extension is implemented on top of them.
Update SPIRV-Cross to pull in SPIR-V support for this new extension.
Largely minimal for now. Much of it, particularly most of the
interactions with `VK_KHR_swapchain`, was already implemented
previously. The only interesting bits are the `vkCmdDispatchBase()`
command, and the ability to create arbitrary swapchain images and bind
them to swapchain memory, which requires the use of the previously
implemented `VK_KHR_bind_memory2` extension. Most everything else can be
safely ignored for now.
Non-zero dispatch bases use the compute stage-input region to pass the
dispatch base group to the shader, which must manually adjust the
`WorkgroupId` and `GlobalInvocationId` builtins, since Metal does not do
this for us. I have tested that this approach works well--at least, well
enough to pass the CTS.
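The adjustment itself is simple arithmetic, modeled here on the CPU as a sketch. `UInt3` and the two helper names are hypothetical; the shader-side code generated by SPIRV-Cross may look different.

```cpp
#include <cassert>
#include <cstdint>

// Small three-component vector standing in for a shader uint3.
struct UInt3 {
    uint32_t x, y, z;
};

// WorkgroupId as Vulkan expects it: Metal's group id offset by the base group.
UInt3 adjustedWorkgroupId(UInt3 metalGroupId, UInt3 baseGroup) {
    return {metalGroupId.x + baseGroup.x,
            metalGroupId.y + baseGroup.y,
            metalGroupId.z + baseGroup.z};
}

// GlobalInvocationId: the base group shifts every invocation by
// baseGroup * localSize in each dimension.
UInt3 adjustedGlobalInvocationId(UInt3 metalGlobalId, UInt3 baseGroup, UInt3 localSize) {
    assert(localSize.x > 0 && localSize.y > 0 && localSize.z > 0);
    return {metalGlobalId.x + baseGroup.x * localSize.x,
            metalGlobalId.y + baseGroup.y * localSize.y,
            metalGlobalId.z + baseGroup.z * localSize.z};
}
```

For example, with a local size of 8 in x and a base group of 2, the invocation Metal reports as global id 5 becomes global id 21.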
Because of the ability to bind arbitrary images to swapchain memory,
I've sucked the guts out of `MVKSwapchainImage` and into `MVKSwapchain`
itself. Availability and drawable management are now performed by the
swapchain object. `MVKSwapchainImage` is now just a specialized kind of
image, created when requested with a `VkImageCreateSwapchainInfoKHR`
structure.
Update SPIRV-Cross so we can support the `vkCmdDispatchBase()` command.
One more step towards Vulkan 1.1.
Update to latest external dependency libraries.
Rename components of VK_INTEL_shader_integer_functions2 to match 1.1.114 Vulkan spec.
Update What's New document.