dynarmic

Author	SHA1	Message	Date
Lioncash	6c877ff8db	emit_x64_vector: Make EmitVectorUnsignedSaturatedAccumulateSigned() internally linked Given this is just an internal helper function, it can be marked static.	2018-09-16 08:16:54 +01:00
Lioncash	4b5926dcab	perf_map: Use std::string_view instead of std::string for PerfMapRegister() We can just use a non-owning view into a string in this case instead of potentially allocating a std::string instance.	2018-09-16 08:16:43 +01:00
MerryMage	74459479b9	A64: Implement SQRDMULH (vector), vector variant	2018-09-15 14:04:42 +01:00
MerryMage	03b80f2ebe	A64: Implement SQDMULL (vector), vector variant	2018-09-15 13:38:37 +01:00
MerryMage	4a2c5962c7	IR: Add VectorSignedSaturatedDoublingMultiplyLong	2018-09-15 13:38:17 +01:00
MerryMage	59dc33ef12	emit_x64_vector: Changes to VectorSignedSaturatedDoublingMultiply * Return both the upper and lower parts of the multiply if required * SSE2 does not support the pmuldq instruction, do sign correction to an unsigned result instead * Improve port utilisation where possible (punpck instructions were a bottleneck)	2018-09-15 09:55:25 +01:00
MerryMage	bbaebeb217	IR: Implement Vector{Signed,Unsigned}Multiply{16,32}	2018-09-15 09:55:25 +01:00
Lioncash	baac5a8810	backend_x64/a64_interface: Re-enable the constant folding pass This was disabled for debugging, but never re-enabled. Just to be sure, testing was done downstream in yuzu to make sure this didn't happen to break anything (which seems to be the case).	2018-09-14 12:14:19 +01:00
MerryMage	e78ca1947b	emit_x64_vector_floating_point: Hardware FMA implementation for RSqrtStepFused	2018-09-12 21:01:06 +01:00
MerryMage	8a5ae9a366	emit_x64_vector_floating_point: Hardware FMA implementation of FPVectorRecipStepFused	2018-09-12 20:45:39 +01:00
MerryMage	39818f98e8	emit_x64_floating_point: Hardware FMA implementation of FPRSqrtStepFused	2018-09-12 16:10:18 +01:00
MerryMage	3d0a0b432b	emit_x64_floating_point: Hardware FMA implementation of FPRecipStepFused{32,64}	2018-09-12 14:58:09 +01:00
MerryMage	2293dff6d8	emit_x64_vector: SSE implementation of VectorSignedSaturatedAccumulateUnsigned{8,16,32}	2018-09-11 19:57:31 +01:00
Lioncash	2047683777	emit_x64_vector: Correct static asserts for < 64-bit type checks in saturated accumulate fallbacks I had initially meant to use BitSize() here, not sizeof()	2018-09-11 07:08:32 +01:00
MerryMage	55e9e401aa	emit_x64_vector: EmitVectorSignedSaturatedAccumulateUnsigned64: SSE implementation	2018-09-10 22:39:30 +01:00
MerryMage	1076651426	emit_x64_vector: Simplify fpsr_qc related code Move the bool conversion into A64JitState::GetFpsr so we don't have to continuously pay the cost of conversion for every saturation instruction.	2018-09-10 21:24:07 +01:00
Lioncash	4039030234	A64: Implement CLZ's vector variant	2018-09-10 18:30:40 +01:00
Lioncash	0bb908fb53	ir: Add opcodes for vector CLZ operations We can optimize these cases further with the use of a fair bit of shuffling via pshufb and the use of masks, but given the uncommon use of this instruction, I wouldn't consider it to be beneficial in terms of amount of code to be worth it over a simple manageable naive solution like this. If we ever do hit a case where vectorized CLZ happens to be a bottleneck, then we can revisit this. At least with AVX-512CD, this can be done with a single instruction for the 32-bit word case.	2018-09-10 18:30:40 +01:00
MerryMage	3b13259630	A64/translate: VectorZeroUpper for V(64) stores Ensures correctness.	2018-09-09 19:59:02 +01:00
MerryMage	1931d44495	simd_two_register_misc: FNEG (vector) with Q == 0 had dirty upper	2018-09-09 19:55:37 +01:00
Lioncash	a0790f02d0	emit_x64_vector: Remove unnecessary [[maybe_unused]] attributes These were unintentionally left in when introducing SUQADD and USQADD	2018-09-09 19:30:14 +01:00
Lioncash	b0e1eb5a15	A64: Implement USQADD's scalar and vector variants	2018-09-09 17:06:03 +01:00
Lioncash	28424c7ad1	ir: Add opcodes for unsigned saturated accumulations of signed values	2018-09-09 17:06:03 +01:00
Lioncash	9923ea0b71	A64: Implement SUQADD's scalar and vector variants	2018-09-09 17:06:03 +01:00
Lioncash	4c0adbb7f1	ir: Add opcodes for signed saturated accumulations of unsigned values	2018-09-09 17:06:03 +01:00
Lioncash	799bfed2df	A64: Implement SMLAL{2}, SMLSL{2}, UMLAL{2}, and UMLSL{2}'s vector by-element variants We can simply modify the general function made for SMULL{2} and UMULL{2}'s by-element variants to also handle the other multiply-based by-element variants.	2018-09-09 13:55:40 +01:00
Lioncash	94451ec321	A64: Implement UMULL{2}'s vector by-element variant	2018-09-09 13:55:40 +01:00
Lioncash	45867deac9	A64: Implement SMULL{2}'s vector by-element variant	2018-09-09 13:55:40 +01:00
Lioncash	02357939ac	ir/value: Replace includes with forward declarations enum classes are still considered complete types when forward declared (as the compiler knows the exact size of the type from the declaration alone). The only difference in this case being that the members of the enum class aren't visible. Given we don't use the members within this header in any way, we can simply forward declare them here and remove the inclusions.	2018-09-09 09:04:22 +01:00
Lioncash	450f721df5	ir/cond: Migrate to C++17 nested namespace specifiers	2018-09-09 09:03:42 +01:00
Lioncash	e649988cd6	CMakeLists: Add missing cond.h header to file listing Allows the file to show up within IDEs more easily.	2018-09-09 09:03:42 +01:00
Lioncash	d20e7694dd	A64: Implement URSQRTE	2018-09-09 00:37:28 +01:00
Lioncash	4f3bde5f12	ir: Add opcodes for performing unsigned reciprocal square root estimates	2018-09-09 00:37:28 +01:00
Lioncash	cfeeaec1c6	A64: Implement URECPE	2018-09-09 00:37:28 +01:00
Lioncash	622b60efd6	ir: Add opcodes for unsigned reciprocal estimate	2018-09-09 00:37:28 +01:00
Lioncash	d17599af40	Update Xbyak to 5.71 Merge commit 'f7c26e9f7ace572f440b80b0e71625295755c38b'	2018-09-08 17:09:25 -04:00
Lioncash	f7c26e9f7a	Squashed 'externals/xbyak/' changes from 671fc805..1de435ed 1de435ed bf uses Label class 613922bd add Label L() for convenience 43e15583 fix typo 93579ee6 add protect-re.cpp 60004b5c fix url of protect-re.cpp 348b2709 fix typo of doc f34f6ed5 update manual 232110be update test 82b78bf0 add setProtectMode dd8b290f put warning message if pageSize != 4096 64775ca2 a little refactoring 7c3e7b85 fix wrong VSIB encoding with idx >= 16 git-subtree-dir: externals/xbyak git-subtree-split: 1de435ed04c8e74775804da944d176baf0ce56e2	2018-09-08 16:52:55 -04:00
Lioncash	8782b69c93	travis: Make macOS build with Xcode 9.4.1 Builds against the latest release version of the Xcode toolchain	2018-09-08 13:21:58 +01:00
Lioncash	b575b23ea9	A64: Implement SQNEG's scalar and vector variant	2018-09-08 11:23:32 +01:00
Lioncash	06062a91c5	A64: Add opcodes for signed saturating negations	2018-09-08 11:23:32 +01:00
Lioncash	1c40579de5	emit_x64_vector: Simplify "position == 0" case for EmitVectorExtract() In the event position is zero, we can just treat it as a NOP, given there's no need to move the data.	2018-09-08 11:23:32 +01:00
Lioncash	e335050886	emit_x64_vector: Simplify "position == 0" case for EmitVectorExtractLower() In the event position == 0, we can just treat it as a simple movq, clearing the upper half of the XMM register. This also makes that case use only one register.	2018-09-08 11:23:32 +01:00
Lioncash	8b13421bac	A64: Implement SQDMULH's by-element scalar variant	2018-09-08 11:23:32 +01:00
Lioncash	9122a6e19e	A64: Implement SQDMULH's by-element vector variant	2018-09-08 11:23:32 +01:00
MerryMage	176e60ebb1	backend/x64: Do not clear fast_dispatch_table if not enabled There is no need to pay for the cost of setting a large block of memory if we're not using it.	2018-09-08 11:23:32 +01:00
MerryMage	959446573f	A64: Implement FastDispatchHint	2018-09-07 22:07:44 +01:00
MerryMage	2be95f2b3b	A32: Implement FastDispatchHint	2018-09-07 22:07:44 +01:00
MerryMage	96f23acd00	ir/terminal: Add FastDispatchHint	2018-09-07 21:29:47 +01:00
Lioncash	f5ca9e9e4a	A64: Implement SQDMULH's scalar variant	2018-09-06 20:35:43 +01:00
Lioncash	af8bea59d5	ir: Add opcodes for scalar signed saturated doubling multiplies	2018-09-06 20:35:43 +01:00
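
Several of the commits listed above add SQDMULH variants and the IR opcodes for signed saturated doubling multiplies. As a reading aid, the scalar semantics for 16-bit elements can be sketched roughly as follows. This is a minimal reference sketch, not dynarmic's actual implementation: the names `SaturatedResult` and `DoublingMultiplyHigh16` are hypothetical, and the `saturated` flag only stands in for the sticky FPSR.QC bit that the log mentions as `fpsr_qc`.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper type: the result plus a flag standing in for FPSR.QC.
struct SaturatedResult {
    std::int16_t value;
    bool saturated;
};

// Hypothetical reference model of a 16-bit signed saturating doubling
// multiply returning the high half: sat16((2 * a * b) >> 16).
constexpr SaturatedResult DoublingMultiplyHigh16(std::int16_t a, std::int16_t b) {
    // Widen first so the doubled product cannot overflow.
    const std::int64_t doubled = 2 * static_cast<std::int64_t>(a) * static_cast<std::int64_t>(b);

    // Take the high half. Right-shifting a negative value is
    // implementation-defined before C++20; this sketch assumes the usual
    // arithmetic shift performed by mainstream compilers.
    const std::int64_t high = doubled >> 16;

    constexpr std::int64_t max = std::numeric_limits<std::int16_t>::max();
    constexpr std::int64_t min = std::numeric_limits<std::int16_t>::min();

    if (high > max) {
        return {static_cast<std::int16_t>(max), true};
    }
    if (high < min) {
        return {static_cast<std::int16_t>(min), true};
    }
    return {static_cast<std::int16_t>(high), false};
}

// Small self-check showing the one input pair that saturates.
inline void CheckDoublingMultiplyHigh16() {
    const SaturatedResult r = DoublingMultiplyHigh16(INT16_MIN, INT16_MIN);
    assert(r.value == INT16_MAX);
    assert(r.saturated);
}
```

With 16-bit inputs, the only pair that overflows is `INT16_MIN * INT16_MIN`, so the lower clamp is kept purely for symmetry. A vectorised backend would apply the same per-lane rule across a whole register.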


1579 Commits