moltenvk

Author	SHA1	Message	Date
Chip Davis	93ee0300a9	MVKCommandEncoder: Set store override actions before finalizing draw state. Otherwise, they could be left unset when we switch to compute in order to set up the state for the draw call.	2020-09-08 18:52:38 -05:00
Bill Hollings	8260f44762	Merge pull request #1017 from cdavis5e/merge-pipeline-cache-owner-fix MVKShaderLibraryCache: Fix owner of merged MVKShaderLibraries.	2020-09-08 16:54:13 -04:00
Bill Hollings	79a15b1776	Merge pull request #1016 from cdavis5e/free-descriptor-sets MVKDescriptorPool: Only free descriptor sets it knows about.	2020-09-08 16:37:45 -04:00
Chip Davis	28b5f8c37e	MVKShaderLibraryCache: Fix owner of merged MVKShaderLibraries. When a pipeline cache was merged into another pipeline cache, we would create new `MVKShaderLibrary` objects for each one contained in the source. The objects would be exact copies of the originals... including their owner, which could be destroyed after the pipeline caches were merged. Fix the owner in the new objects to prevent a dangling reference.	2020-09-08 13:35:02 -05:00
Chip Davis	8a30aeadbe	MVKDescriptorPool: Only free descriptor sets it knows about. Fixes a crash in `dEQP-VK.api.null_handle.free_descriptor_sets`.	2020-09-08 13:33:46 -05:00
Chip Davis	0cf2bfd1d2	Implement the vkEnumerateInstanceVersion() function. We're Vulkan 1.1 now!	2020-09-08 13:22:17 -05:00
Chip Davis	a775263888	Implement the vkGetDeviceQueue2() function. This function was introduced with protected memory. Since we don't support that, right now it does nothing that `vkGetDeviceQueue()` did not already do. Despite that, I've added a method to `MVKDevice`, because this is an extensible function analogous to e.g. `vkGetPhysicalDeviceFeatures2()`.	2020-09-08 13:22:17 -05:00
Chip Davis	78963db6cc	Export core names of Vulkan 1.1 calls promoted from extensions. The functions are now defined under their core names. To avoid code bloat, I've defined the suffixed names as aliases of the core names. Both symbols will be globally defined with the same value, and in the dylib both will be exported. Fix the default API version when none is given. Zero is the same as `VK_API_VERSION_1_0`. Prior to this, we were overwriting it with zero if no app info were given, or if it were zero in the app info. It wasn't important before, but now that we gate API availability on maximum Vulkan version, we need to make sure it's a valid version.	2020-09-08 13:22:17 -05:00
Chip Davis	16db5bfe63	MVKDevice: Fill in protected memory info structs. We can't support this feature on top of Metal with the API available to us, but we have to fill in the structures for Vulkan 1.1.	2020-09-08 13:22:17 -05:00
Chip Davis	742a2f2951	MVKDevice: Fill in feature struct for VK_KHR_shader_draw_parameters. It's actually from Vulkan 1.1, but we'll soon support that.	2020-09-08 13:22:17 -05:00
Chip Davis	09bcd534d9	Add basic support for VK_KHR_external_semaphore{,_capabilities}. Also a non-functional base for future extensions. We can't implement it anyway until all remaining bugs in `MTLEvent`-based semaphores are fixed. This is the last of the extensions that was promoted to core for Vulkan 1.1. We're almost there!	2020-09-05 21:05:54 -05:00
Bill Hollings	697e8627cf	Merge pull request #1009 from cdavis5e/external-fence Add basic support for VK_KHR_external_fence{,_capabilities}.	2020-09-05 21:58:24 -04:00
Chip Davis	0d4b087f3d	MVKCommandBuffer: Fix a crash on starting a query outside a render pass. This was introduced by #1006. Fixes #1007.	2020-09-04 22:30:10 -05:00
Chip Davis	e6424654e3	Add basic support for VK_KHR_external_fence{,_capabilities}. Like with `VK_KHR_device_group` and `VK_KHR_external_memory`, this just adds the groundwork needed to support future extensions; it provides no actual support for external fences. We should be able to easily support `VK_KHR_external_fence_fd`, by using a POSIX semaphore. Since the fence FDs produced by that extension are opaque, only supporting `close(2)` and `dup(2)`, we shouldn't have to worry about portable programs poking the FD in weird ways. Hopefully. Other types of external fences we might support include GCD semaphores (`dispatch_semaphore_t`) and Mach semaphores (`semaphore_t`). I really think we want support for GCD semaphores, because that's the most likely object we're going to see passed between processes on Darwin given GCD's built-in support for XPC. I have deliberately omitted mention of these extensions from the user guide. `VK_KHR_external_memory` was not mentioned in there, presumably because no actual external memory types are actually supported. Also, add missing `vkGetInstanceProcAddr()` entry for `vkGetPhysicalDeviceExternalBufferPropertiesKHR()`. We have the function, and we export the extension's name string. We might as well make it available via `vkGetInstanceProcAddr()`.	2020-09-04 13:16:54 -05:00
Chip Davis	34930eaf5b	Support the VK_KHR_multiview extension. Originally, Metal did not support this directly, and still largely doesn't on GPUs other than Apple family 6. Therefore, this implementation uses vertex instancing to draw the needed views. To support the Vulkan requirement that only the layers for the enabled views are loaded and stored in a multiview render pass, this implementation uses multiple Metal render passes for multiple "clumps" of enabled views. For indirect draws, as with tessellation, we must adjust the draw parameters at execution time to account for the extra views, so we need to use deferred store actions here. Without them, tracking the state becomes too involved. If the implementation doesn't support either layered rendering or deferred store actions, multiview render passes are instead unrolled and rendered one view at a time. This will enable us to support the extension even on older devices and OSes, but at the cost of additional command buffer memory and (possibly) worse performance. Eventually, we should consider using vertex amplification to accelerate this, particularly since indirect multiview draws are terrible and currently require a compute pass to adjust the instance count. Also, instanced drawing in itself is terrible due to its subpar performance. But, since vertex amplification on family 6 only supports two views, when `VK_KHR_multiview` mandates a minimum of 6, we'll still need to use instancing to support more than two views. I have tested this extensively against the CTS. I'm very confident in its correctness. The only failing tests are `dEQP-VK.multiview.queries.`, due to our inadequate implementation of timestamp queries; and `dEQP-VK.multiview.depth.`, due to what I assume is a bug in the way Metal handles arrayed packed depth/stencil textures, and which may only be a problem on Mojave. I need to test this on Catalina and Big Sur. Update SPIRV-Cross to pull in some fixes necessary for this to work. Fixes #347.	2020-09-03 17:14:46 -05:00
Jan Sikorski	3c3683ebef	MVKImagePlane: account for swapchain images in get/releaseMTLTexture(). Delegate to MVKImage's getMTLTexture() method to use the overridden one for swapchain images. Also, the _mtlTextureViews member should be cleared to prevent accidental reuse of released views.	2020-09-02 12:17:10 +02:00
Bill Hollings	aa496c8789	Merge pull request #1001 from cdavis5e/null-draws MVKCmdDraw: Don't encode commands that draw zero vertices.	2020-09-01 16:55:03 -04:00
Bill Hollings	a52098d29a	Merge pull request #1000 from cdavis5e/clear-vertex-count MVKCmdClearAttachments: Fix vertex count for clearing multiple layers.	2020-09-01 16:47:02 -04:00
Bill Hollings	dbb0b58671	Merge pull request #999 from cdavis5e/null-clear-atts MVKCmdClearAttachments: Don't attempt to clear unused attachments.	2020-09-01 16:19:44 -04:00
Chip Davis	245286f57c	MVKCmdDraw: Don't encode commands that draw zero vertices. Nothing will be drawn in that case. Nothing would've been drawn anyway, but Metal's validation layer complains if you issue a draw command with zero vertices or instances. We unfortunately cannot do anything about indirect draws, since we won't know how many vertices to draw until execute time.	2020-08-31 21:18:02 -05:00
Chip Davis	864a6a6b94	MVKCmdClearAttachments: Fix vertex count for clearing multiple layers. Prior to this, we were assuming one layer per clear rect. When that assumption was broken, we wound up smashing the stack. This will be more important once multiview support lands, since all views must be cleared by the clear command.	2020-08-31 21:16:33 -05:00
Chip Davis	2d7717b708	MVKCmdClearAttachments: Don't attempt to clear unused attachments. The attachment index in the `VkClearAttachment` struct is an index into the current subpass attachment array, not the render pass attachment array; so it's not enough to check `VkClearAttachment` for `VK_ATTACHMENT_UNUSED`. We also need to check the current subpass. If no attachments can be cleared, don't encode a command to the Metal command buffer. Perhaps I should bring back `recordBeginRenderPass()` and `recordEndRenderPass()` and keep track of the current subpass; then we can avoid recording the command entirely if the intersection of the sets "attachments to clear" and "used attachments" is null.	2020-08-31 21:12:51 -05:00
Chip Davis	e2ba444aa6	Use deferred store actions instead of tracking multi-pass draws. The current code does not handle multiple subpasses, nor does it handle secondary command buffers. Handling subpasses was easy enough. The problem came with secondary command buffers. Tracking them became extremely complicated, particularly since pipelines may be set either inside or outside a render pass, and further, a pipeline set in one buffer might be used in another. I then realized a simpler and more elegant solution: Metal's deferred store actions feature. This allows you to defer setting the store action for a render pass until encoding time. This is exactly what we need, since we won't know what store action we actually want until we start encoding draws. This solution should now work with multiple subpasses and secondary command buffers, with much less code.	2020-08-31 21:09:11 -05:00
Jan Sikorski	7c99bb9430	Initialize tessellation related variables conditionally in indirect draws	2020-08-14 12:38:33 +02:00
Chip Davis	07c2054cc8	MVKCommandBuffer: Don't set renderTargetArrayLength on devices that don't support it. It's a hard validation error to do so. We originally had a check for this, but it was erroneously completely removed in #988, instead of being limited to `renderTargetArrayLength`. Fixes #991.	2020-08-13 15:20:57 -05:00
Bill Hollings	f3e938fafa	Merge pull request #990 from billhollings/master Re-add support for bitcode generation on iOS and tvOS.	2020-08-12 17:41:36 -04:00
Bill Hollings	d4b5df532e	Re-add support for bitcode generation on iOS and tvOS. Set BITCODE_GENERATION_MODE build setting in all Xcode projects. create_dylib.sh support BITCODE_GENERATION_MODE.	2020-08-11 20:18:50 -04:00
Jan Sikorski	77cbe56101	Detect support for having no attachments in a render pass. When we do have support, don't create a dummy texture to attach.	2020-08-11 10:56:33 +02:00
Jan Sikorski	c3254cb546	Ensure the base texture is created when creating a view texture	2020-08-07 12:24:22 +02:00
Jan Sikorski	babefe6608	Always set renderTarget* properties in MTLRenderPassDescriptor. A shader might still use the array index even if the array length is 1, causing a validation warning if renderTargetArrayLength is not set.	2020-08-07 11:39:17 +02:00
Jan Sikorski	8f8395dcdf	Enable MTLRenderPassDescriptor renderTargetWidth and renderTargetHeight API on macOS, as it's available on 10.15 and newer.	2020-08-07 11:39:11 +02:00
Bill Hollings	5126aaed3e	Update MoltenVK version to 1.0.45. Update What's New document.	2020-08-05 19:58:59 -04:00
Bill Hollings	7708b08c10	Merge pull request #983 from cdavis5e/multi-patch-workgroup Process multiple patches per workgroup in a tessellation control shader.	2020-08-05 18:51:34 -04:00
Chip Davis	3db2cbff6b	Process multiple patches per workgroup in a tessellation control shader. This should hopefully reduce underutilization of the GPU, especially on GPUs where the thread execution width is greater than the number of control points. This also eliminates the extra invocations previously needed to read the varyings from the vertex shader into the tessellation shader. The number of threads per workgroup is now lcm(SIMD-size, output control points). This should ensure we always process a whole number of patches per workgroup, and further reduce underutilization of the GPU's SIMD units. To avoid complexity handling indices in the tessellation control shader, I've also changed the way vertex shaders for tessellation are handled. They are now compute kernels using Metal's support for vertex-style stage input. This lets us always emit vertices into the buffer in order of vertex shader execution. Now we no longer have to deal with indexing in the tessellation control shader, nor do we always have to duplicate the index buffer to insert gaps. This also fixes a long-standing issue where if an index were greater than the number of vertices to draw, the vertex shader would wind up writing outside the buffer, and the vertex would be lost.	2020-08-05 16:03:25 -05:00
Jan Sikorski	8f52521b1d	Flush source image of a transfer. Since on macOS textures cannot be resident in shared (host-coherent) memory, they need to be flushed before making the copy, to ensure that the modified data is transferred.	2020-08-05 15:34:10 +02:00
Bill Hollings	4609416ef2	Merge pull request #980 from mbechard/master dynamicOffsets are supposed to be ordered by binding index	2020-07-30 12:06:12 -04:00
Malcolm Bechard	5874616d78	dynamicOffsets are supposed to be ordered by binding index not by layout element index. Fixes #978	2020-07-28 17:25:53 -04:00
Bill Hollings	2c7734eda8	Fix issue where expected buffer-sizes buffer not bound to Metal compute encoder. MVKComputeResourcesCommandEncoderState update buffer-size value before buffer bindings are encoded into Metal and are no longer marked as dirty.	2020-07-28 15:38:39 -04:00
Bill Hollings	23ea3c9bf7	vkCmdBlitImage() return error if scaling or inverting to linear image on macOS.	2020-07-27 16:00:42 -04:00
Bill Hollings	f7a1c87c71	Update pipeline cache to latest CompilerMSL::Options struct content. SPIRVToMSLConversionOptions compare instances using memcmp(CompilerMSL::Options). Update What's New document.	2020-07-27 15:02:56 -04:00
Bill Hollings	6b25c816b5	Merge pull request #976 from cdavis5e/pipeline-sample-mask MVKPipeline: Pass the pipeline sample mask, if present, to SPIRV-Cross.	2020-07-27 10:31:44 -04:00
Bill Hollings	d2e5ff7df5	Merge pull request #975 from cdavis5e/image-robustness MVKDevice: Support the VK_EXT_image_robustness extension.	2020-07-27 10:21:31 -04:00
Bill Hollings	484e1dffdb	Merge pull request #972 from BearishSun/various-fixes Various fixes	2020-07-27 10:04:00 -04:00
Chip Davis	cda8a2cf44	MVKPipeline: Pass the pipeline sample mask, if present, to SPIRV-Cross. SPIRV-Cross can now AND the `gl_SampleMask` output with an additional fixed mask, presumably from the pipeline. Use this new functionality to implement pipeline sample mask handling. Special thanks to Tomek Pontika and Corentin Wallez of Google for graciously contributing their implementation to SPIRV-Cross. Update SPIRV-Cross to pull in the change necessary for this.	2020-07-24 15:15:11 -05:00
Chip Davis	b1803ea5d7	MVKDevice: Support the VK_EXT_image_robustness extension. This extension provides weaker guarantees than `VK_EXT_robustness2` and its `robustImageAccess2` feature. Metal easily meets those guarantees, with no action on our part necessary.	2020-07-24 15:13:58 -05:00
Marko Pintera	e2f4828f42	Less error prone way of calculating bytes per layer	2020-07-24 09:58:17 +02:00
Marko Pintera	6da2fbb353	Don't over-allocate memory for higher mip levels of 3D textures	2020-07-23 15:06:49 +02:00
Marko Pintera	f13fb48905	Allocate correctly sized images	2020-07-23 15:05:44 +02:00
Marko Pintera	d351a7a173	Use correct offset when calculating overlap between image and device memory ranges	2020-07-23 15:03:48 +02:00
Marko Pintera	0b6a5db049	Don't assign MTLTextureUsageRenderTarget to linear textures	2020-07-23 15:02:26 +02:00
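
The workgroup sizing rule stated in commit 3db2cbff6b above, threads per workgroup = lcm(SIMD-size, output control points), can be illustrated with a small sketch. This is not MoltenVK code: the function names `threads_per_workgroup` and `patches_per_workgroup` are hypothetical, the sketch assumes one thread per output control point, and it ignores any device limit on threads per threadgroup that a real implementation would presumably have to respect.

```python
from math import gcd


def threads_per_workgroup(simd_width: int, output_control_points: int) -> int:
    """Return lcm(simd_width, output_control_points).

    Hypothetical helper: the commit message only states the lcm rule;
    the name and signature here are assumptions for illustration.
    """
    if simd_width <= 0 or output_control_points <= 0:
        raise ValueError("both arguments must be positive")
    # lcm(a, b) = a * b / gcd(a, b); the division is exact for integers.
    return simd_width * output_control_points // gcd(simd_width, output_control_points)


def patches_per_workgroup(simd_width: int, output_control_points: int) -> int:
    """Number of whole patches covered by one workgroup.

    Assumes one thread per output control point. Because the workgroup
    size is a multiple of the control-point count, this is always exact.
    """
    return threads_per_workgroup(simd_width, output_control_points) // output_control_points


if __name__ == "__main__":
    for simd, points in [(32, 3), (32, 4), (64, 6)]:
        threads = threads_per_workgroup(simd, points)
        patches = patches_per_workgroup(simd, points)
        print(f"SIMD {simd}, {points} control points: "
              f"{threads} threads, {patches} patches per workgroup")
```

Because the result is a multiple of both inputs, every workgroup fills whole SIMD groups and covers a whole number of patches, which is the utilization property the commit message describes.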


1062 Commits