MVKDescriptorSetLayoutBinding no longer uses MTLTextureType to create
MTLArgumentEncoder, which is not needed for descriptor set argument encoding.
Remove obsolete MSLResourceBinding::outMTLTextureType and
SPIRVToMSLConversionConfiguration::getMTLTextureType().
Support either one MTLArgumentEncoder per descriptor set,
or one MTLArgumentEncoder per combination of pipeline-stage/descriptor set.
Add MVKPhysicalDeviceMetalFeatures::descriptorSetArgumentBuffers,
and MVKDeviceTrackingMixin::isUsingDescriptorSetMetalArgumentBuffers()
and isUsingPipelineStageMetalArgumentBuffers() to track this.
Create a separate MTLArgumentEncoder for each shader stage from
MTLFunction instead of MTLDevice, and track per-stage in MVKPipeline.
Add MVKMTLArgumentEncoder to track MTLArgumentEncoders in pipelines and desc set layouts.
Add SPIRVToMSLConversionResults::activeDescriptorSets to get desc sets used by a shader.
In SPIRV-Cross make padded Metal argument buffer descriptors
for buffers a pointer to float instead of pointer to void.
Update to latest version of SPIRV-Cross to use arg buffer padding feature.
Enable argument buffer support only when one MTLArgumentEncoder per
descriptor set can be used (macOS 10.16 and later or Intel GPUs).
Move dynamic offsets from the Metal argument buffer to an implicit shader buffer.
Add SPIRVToMSLConversionConfiguration::dynamicBufferDescriptors to track dynamic
buffer descriptors, and add SPIRVToMSLConversionResults::needsDynamicOffsetBuffer
to indicate shader needs an implicit buffer to track these dynamic offsets.
Add MVKShaderStageResourceBinding::dynamicOffsetBufferIndex to track a
descriptor's index into the implicit dynamic offsets buffer.
MVKPipelineLayout change how per-descriptor set offsets are calculated and tracked
when using Metal argument buffers, so that resource indexes do not accumulate
across descriptor sets, but dynamic offset buffer indexes do accumulate.
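The offset rule described above can be sketched as follows. This is only an illustration: SetCounts, SetOffsets and computeSetOffsets() are hypothetical names, not MoltenVK types, and the behavior of the non-argument-buffer branch (resource indexes accumulating across sets) is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-set counts, for illustration only.
struct SetCounts  { uint32_t resourceCount; uint32_t dynamicBufferCount; };

// Hypothetical per-set starting offsets computed by the pipeline layout.
struct SetOffsets { uint32_t resourceIndexBase; uint32_t dynamicOffsetIndexBase; };

std::vector<SetOffsets> computeSetOffsets(const std::vector<SetCounts>& sets,
                                          bool usingArgumentBuffers) {
    std::vector<SetOffsets> offsets;
    offsets.reserve(sets.size());
    uint32_t nextResourceIndex = 0;       // running total of resources
    uint32_t nextDynamicOffsetIndex = 0;  // running total of dynamic buffers
    for (const SetCounts& s : sets) {
        // With argument buffers, each set's resource indexes restart at zero,
        // because each set lives in its own argument buffer. The dynamic
        // offset indexes keep accumulating, because all sets share a single
        // implicit dynamic offsets buffer.
        offsets.push_back({usingArgumentBuffers ? 0u : nextResourceIndex,
                           nextDynamicOffsetIndex});
        nextResourceIndex      += s.resourceCount;
        nextDynamicOffsetIndex += s.dynamicBufferCount;
    }
    return offsets;
}
```

For three sets with (4, 1), (3, 2) and (5, 0) resources and dynamic buffers, the argument buffer case gives resource bases 0, 0, 0 and dynamic offset bases 0, 1, 3.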
MVKPipelineLayout rearrange order of implicit buffer indexes to give more commonly used
implicit buffers lower indexes to help reduce risk of exhausting discrete buffer bindings.
usage not tracked accurately across shader stages.
Remove tracking of descriptor usage in MVKPipeline.
Don't bind argument buffer to command encoder if descriptor set unused by shader stage.
MVKDescriptor::encodeToMetalArgumentBuffer() remove unused shader stage argument.
Remove SPIRVToMSLConversionConfiguration::isResourceUsed().
Support SPIRV-Cross CompilerMSL::Options::pad_argument_buffer_resources and
MSLResourceBinding::base_type.
Consolidate mvkPopulateShaderConverterContext() and set
MSLResourceBinding::base_type from VkDescriptorType.
Add MVKDescriptorSetLayout::initForMetalArgumentBufferUse() to measure size of Metal
argument buffer needed for descriptor set by creating an ephemeral MTLArgumentEncoder
from the descriptor bindings and measuring its length.
MVKDescriptorPool::initMetalArgumentBuffer() measure size of a single MTLBuffer
to use for all descriptor sets in the pool.
MVKDescriptorPool use VkDescriptorPoolInlineUniformBlockCreateInfoEXT to determine
descriptor count for VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT.
Set max size of temporary inline block buffer allocations from
MVKDevice::_pMetalFeatures->dynamicMTLBufferSize.
Set MTLSamplerDescriptor::supportArgumentBuffers property when appropriate.
Use specific resource indices of MVKShaderStageResourceBinding
when assigning indices to argument buffer elements.
MVKShaderStageResourceBinding expand elements to uint32_t
to support larger numbers of resource arguments.
Fix calculation of descriptor set Metal argument buffer size requirements.
Streamline MVKResourcesCommandEncoderState::encodeToMetalArgumentBuffer(),
and add fix for GPU capturing of Metal arg buffers for Xcode 12.
Update MSLResourceBinding::outMTLTextureType and outIsUsedByShader from shader
only if resource binding is for the correct shader stage, to allow outIsUsedByShader
to be carried across multiple-stage shader compilations.
Remove MVKDescriptorSet::populateMetalArgumentBufferBinding().
Remove MVKSamplerDescriptorMixin::getMetalArgumentBufferSamplerIndexOffset().
Add MVKCommandEncoderState::getDevice().
shader cache lookup hits in MVKShaderLibraryCache.
Rename MSLResourceBinding::mtlTextureType, MSLResourceBinding::isUsedByShader, and
MSLShaderInput::isUsedByShader to add a prefix out*, to clarify that those variables
are output from the conversion process, instead of input to the conversion process.
Don't use MSLResourceBinding::outMTLTextureType when looking up cached shaders.
MVKPipeline track descriptors used by shaders.
Update resources as dirty at start of Metal render pass or compute encoder.
Add MVKCommandEncoder::beginMetalComputeEncoding() to mark a new Metal compute encoder.
MVKResourcesCommandEncoderState track resource usage that needs to be encoded.
Add MVKResourcesCommandEncoderState::encodeArgumentBufferResourceUsage().
Add MVKCommandEncoderState::beginMetalComputeEncoding() to mark compute
state dirty when a MTLComputeEncoder is created.
Add SPIRVToMSLConversionConfiguration::isResourceUsed().
MVKBitArray add ability to retain contents when resizing, and clear bit during getBit().
Add MVKBitArray getBit() option to clear bit.
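A minimal sketch of the two behaviors described above: retaining contents on resize, and a getBit() that can clear the bit it reads. BitArraySketch and its parameter names are hypothetical stand-ins and do not reproduce the actual MVKBitArray interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for illustration; not the actual MVKBitArray.
class BitArraySketch {
public:
    explicit BitArraySketch(size_t bitCount = 0) { resize(bitCount); }

    // Resizes the array. Existing bits are kept only if retainContents is true.
    void resize(size_t bitCount, bool retainContents = false) {
        if (!retainContents) { _words.clear(); }
        _words.resize((bitCount + 63) / 64, 0);
        // Drop stale bits past the new end, so they cannot reappear on regrowth.
        size_t tail = bitCount % 64;
        if (tail != 0) { _words.back() &= (uint64_t(1) << tail) - 1; }
        _bitCount = bitCount;
    }

    void setBit(size_t index) {
        assert(index < _bitCount);
        _words[index / 64] |= uint64_t(1) << (index % 64);
    }

    // Returns the bit value, optionally clearing it in the same call, which
    // suits "consume the dirty flag" style usage.
    bool getBit(size_t index, bool shouldClear = false) {
        assert(index < _bitCount);
        uint64_t mask = uint64_t(1) << (index % 64);
        bool value = (_words[index / 64] & mask) != 0;
        if (shouldClear) { _words[index / 64] &= ~mask; }
        return value;
    }

    size_t size() const { return _bitCount; }

private:
    std::vector<uint64_t> _words;
    size_t _bitCount = 0;
};
```

Reading and clearing in one call avoids a separate lookup when a set bit means "this element still needs to be processed".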
Add MVKPhysicalDeviceMetalFeatures::argumentBuffers.
Add MVKConfiguration::useMetalArgumentBuffers and
MVK_CONFIG_USE_METAL_ARGUMENT_BUFFERS env var.
MVKPhysicalDevice and MVKDeviceTrackingMixin add isUsingMetalArgumentBuffers().
Populate VkPhysicalDeviceDescriptorIndexingPropertiesEXT with larger Tier2
limits when supported.
Populate pipelineCacheUUID with feature flag if Metal arg buffers are used.
SPIRVToMSLConversionConfiguration add content to extract shader texture types
and mark discrete descriptor sets.
External libraries in particular make liberal use of assert() calls, which bypass
error catching, sometimes causing crashes rather than catching errors and moving on.
Remove SPIRV-Cross/ qualifier from include references to SPIRV-Cross header files.
Remove glslang/ qualifier from include references to glslang header files.
This change allows easier integration with app build scripts.
Add SPIRVToMSLConversionResults::isPositionInvariant to query
position invariance from SPIR-V.
MVKDevice::getMTLCompileOptions() takes into consideration need to preserve invariance.
MVKShaderModule compile MSL to preserve invariance if required by shader.
Support querying SignedZeroInfNanPreserve execution mode
from SPIR-V to disable fast-math for individual shaders.
Clean up namespace references in SPIRVToMSLConverter.cpp.
MVKDevice track enabled VkPhysicalDeviceInlineUniformBlockFeaturesEXT features.
Disable prefilled MTLCommandBuffers if update after binding enabled.
Update to latest SPIRV-Cross that includes support for unsized arrays.
Combine MoltenVKSPIRVToMSLConverter and MoltenVKGLSLToSPIRVConverter
frameworks into a single MoltenVKShaderConverter framework.
Update corresponding directory structures, symlinks, scripts, and build paths.
Update MoltenVK code to use new framework name for headers.
Add symlinks in API-Samples demo to support legacy
MoltenVKGLSLToSPIRVConverter header paths.
In addition to simplifying shader converter code and build management, the
use of only one shader converter framework fixes a race condition within Xcode,
prior to Xcode 12, when multiple targets use the same dependency XCFramework.
Remove EXCLUDED_ARCHS from all Xcode projects to allow fat platform libraries to be built.
Script copy_lib_to_staging.sh no longer breaks fat libraries into single-architecture
libraries, and simply copies fat file to XCFramework staging area.
This permits support for arm64 on macOS, and arm64e on iOS and tvOS.
Creating a Simulator dylib containing both x86_64 and arm64 (Apple Silicon)
architectures is not currently supported by Xcode, so Simulator dylibs are skipped.
Originally, Metal did not support this directly, and still largely
doesn't on GPUs other than Apple family 6. Therefore, this
implementation uses vertex instancing to draw the needed views. To
support the Vulkan requirement that only the layers for the enabled
views are loaded and stored in a multiview render pass, this
implementation uses multiple Metal render passes for multiple "clumps"
of enabled views.
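As an illustration of the "clumps" idea, the sketch below splits a view mask into runs of enabled views. It assumes a clump is a maximal run of consecutive enabled view bits, which the text above does not state explicitly. ViewClump and splitViewMaskIntoClumps() are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one clump: a run of consecutive enabled views,
// assumed here to map to one Metal render pass.
struct ViewClump { uint32_t firstView; uint32_t viewCount; };

std::vector<ViewClump> splitViewMaskIntoClumps(uint32_t viewMask) {
    std::vector<ViewClump> clumps;
    uint32_t view = 0;
    while (viewMask != 0) {
        // Skip disabled views.
        while ((viewMask & 1u) == 0) { viewMask >>= 1; ++view; }
        // Consume one run of enabled views.
        uint32_t first = view;
        while ((viewMask & 1u) != 0) { viewMask >>= 1; ++view; }
        clumps.push_back({first, view - first});
    }
    return clumps;
}
```

Under this assumption, a view mask of 0xB (views 0, 1 and 3 enabled) yields two clumps: views 0 to 1, and view 3 alone.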
For indirect draws, as with tessellation, we must adjust the draw
parameters at execution time to account for the extra views, so we need
to use deferred store actions here. Without them, tracking the state
becomes too involved.
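The draw parameter adjustment can be pictured with the CPU-side sketch below. It assumes that each view is drawn as an extra set of instances, so the instance count is multiplied by the view count, and that the shader recovers the view and the original instance by modulo and division. Both the mapping and the names are assumptions for illustration; the real adjustment for indirect draws happens on the GPU, and base-instance handling is ignored here.

```cpp
#include <cassert>
#include <cstdint>

// Same field order as VkDrawIndirectCommand.
struct DrawArgs {
    uint32_t vertexCount;
    uint32_t instanceCount;
    uint32_t firstVertex;
    uint32_t firstInstance;
};

// Hypothetical: expand the instance count so every instance is drawn once per view.
DrawArgs adjustForMultiview(DrawArgs args, uint32_t viewCount) {
    args.instanceCount *= viewCount;
    return args;
}

struct ViewAndInstance { uint32_t viewIndex; uint32_t instanceIndex; };

// Hypothetical: recover the view and the original instance from the expanded
// instance index, assuming views are interleaved within each instance.
ViewAndInstance decodeExpandedInstance(uint32_t expandedInstance, uint32_t viewCount) {
    assert(viewCount > 0);
    return {expandedInstance % viewCount, expandedInstance / viewCount};
}
```

With 4 instances and 2 views, the adjusted draw covers 8 instances, and expanded instance 5 decodes to view 1 of original instance 2.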
If the implementation doesn't support either layered rendering or
deferred store actions, multiview render passes are instead unrolled and
rendered one view at a time. This will enable us to support the
extension even on older devices and OSes, but at the cost of additional
command buffer memory and (possibly) worse performance.
Eventually, we should consider using vertex amplification to accelerate
this, particularly since indirect multiview draws are terrible and
currently require a compute pass to adjust the instance count. Also,
instanced drawing in itself is terrible due to its subpar performance.
But, since vertex amplification on family 6 only supports two views,
when `VK_KHR_multiview` mandates a minimum of 6, we'll still need to use
instancing to support more than two views.
I have tested this extensively against the CTS. I'm very confident in
its correctness. The only failing tests are
`dEQP-VK.multiview.queries.*`, due to our inadequate implementation of
timestamp queries; and `dEQP-VK.multiview.depth.*`, due to what I assume
is a bug in the way Metal handles arrayed packed depth/stencil textures,
and which may only be a problem on Mojave. I need to test this on
Catalina and Big Sur.
Update SPIRV-Cross to pull in some fixes necessary for this to work.
Fixes #347.
- Delete fat library and framework scripts and templates.
- MoltenVK build package now only includes one XCFramework, and separate platform dylibs.
- Modify fetchDependencies and Makefile targets to not build fat libraries,
and to build simulators separately from platforms instead.
- Script package_moltenvk.sh now copies dylibs for all built platforms.
- Consolidate package_all.sh and delete package_one_os.sh.
- Swap names of copy_lib_to_staging.sh and copy_to_staging.sh scripts.
- Cube demo now uses MoltenVK as XCFramework, and supports Simulator builds.
- Hologram demo now uses MoltenVK as dylibs from new packaging location.
- API-Samples demo now uses MoltenVK as XCFramework.
- Update documentation.
Create shader converters as XCFrameworks.
Don't create shader converters as fat libs, dylibs, or regular frameworks.
Rename create_xcframework.sh to create_xcframework_func.sh.
Use separate MoltenVK packaging scripts for one or all OS's.
Add package_one_os.sh.
Remove package_shader_converter_lib.sh.
Remove redundant GLSL shader converter dependencies in MoltenVK packaging targets.
Exclude arm64 architectures on macOS and Simulators.
Exclude arm64e architectures on iOS and tvOS.
Stop building fat libraries for external libraries.
Remove package_ext_libs.sh script.
Don't include Headers in ext lib XCFrameworks because of Xcode 12 bug in using them.
This should hopefully reduce underutilization of the GPU, especially on
GPUs where the thread execution width is greater than the number of
control points.
This also eliminates the extra invocations previously needed to read the
varyings from the vertex shader into the tessellation shader. The number
of threads per workgroup is now lcm(SIMD-size, output control points).
This should ensure we always process a whole number of patches per
workgroup, and further reduce underutilization of the GPU's SIMD units.
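The workgroup sizing rule above can be checked with a small worked example. The helper names are hypothetical, and the sketch ignores any clamping to the device's maximum threads per threadgroup, which a real implementation would presumably need.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>  // std::lcm (C++17)

// Hypothetical helper: threads per workgroup for the tessellation control kernel.
uint32_t tessControlWorkgroupSize(uint32_t simdWidth, uint32_t outputControlPoints) {
    assert(simdWidth > 0 && outputControlPoints > 0);
    return std::lcm(simdWidth, outputControlPoints);
}

// Whole number of patches handled by one workgroup of that size.
uint32_t patchesPerWorkgroup(uint32_t simdWidth, uint32_t outputControlPoints) {
    return tessControlWorkgroupSize(simdWidth, outputControlPoints) / outputControlPoints;
}
```

For a SIMD width of 32 and 3 output control points, lcm(32, 3) = 96 threads: exactly 32 patches and exactly 3 full SIMD groups, with no idle lanes. With 4 control points the size is 32 threads, or 8 patches in one SIMD group.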
To avoid complexity handling indices in the tessellation control shader,
I've also changed the way vertex shaders for tessellation are handled.
They are now compute kernels using Metal's support for vertex-style
stage input. This lets us always emit vertices into the buffer in order
of vertex shader execution. Now we no longer have to deal with indexing
in the tessellation control shader, nor do we always have to duplicate
the index buffer to insert gaps. This also fixes a long-standing issue
where if an index were greater than the number of vertices to draw, the
vertex shader would wind up writing outside the buffer, and the vertex
would be lost.
SPIRV-Cross can now AND the `gl_SampleMask` output with an additional
fixed mask, presumably from the pipeline. Use this new functionality to
implement pipeline sample mask handling.
Special thanks to Tomek Pontika and Corentin Wallez of Google for
graciously contributing their implementation to SPIRV-Cross.
Update SPIRV-Cross to pull in the change necessary for this.
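The effect of the fixed mask is a plain bitwise AND with the mask the shader writes. The sketch below shows only that arithmetic; effectiveSampleMask() is a hypothetical name, and the assumption that the fixed mask comes from the pipeline's multisample state is not confirmed by the text above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: combine the shader's gl_SampleMask output with a fixed
// mask assumed to come from the pipeline. A sample is written only if both
// masks enable it.
uint32_t effectiveSampleMask(uint32_t shaderSampleMask, uint32_t pipelineSampleMask) {
    return shaderSampleMask & pipelineSampleMask;
}
```

For example, a shader mask of 0xF combined with a pipeline mask of 0x5 leaves only samples 0 and 2 enabled.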
Metal is picky about interface matching. If the types of a vertex output
and its corresponding fragment input don't match, down to the number of
vector components, it fails pipeline compilation. To support cases where
the number of components in the fragment input is less than the
corresponding vertex output, we need to fix up the fragment shader to
accept the extra components.
Create fat builds of static, dynamic & framework libraries if both iOS
and simulator versions have been created from separate manual Xcode builds.
Refactor scripts for creating fat libraries to reuse across projects.
Add MVK_BUILT_PROD_DIR to replace use of BUILT_PRODUCTS_DIR in most scripts
to allow flexibility across per-platform compilation.