1700 Commits

Author SHA1 Message Date
Lioncash
f69893345f ir: Add opcodes for unsigned saturating left shifts 2018-09-19 12:13:22 +01:00
Lioncash
7148e661f6 A64/translate/impl: Make signatures consistent for unimplemented by-element SIMD variants
Makes them all consistent, so it isn't necessary to change the
prototypes over when implementing them.
2018-09-19 12:11:35 +01:00
Lioncash
fdde4ca363 A64: Implement BRK
Currently, we can just implement this as part of the exception
interface, similar to how it's done for the A32 interface with BKPT.
2018-09-19 07:09:27 +01:00
Lioncash
b1490db0e9 A64/imm: Add full range of comparison operators to Imm template
Makes the comparison interface consistent by providing all of the
relevant members. This also modifies the comparison operators to take
the Imm instance by value, as it's really only a u32 under the covers,
and it's cheaper to shuffle around a u32 than a 64-bit pointer address.
2018-09-19 07:09:00 +01:00
MerryMage
1ec40ef6ed IR: Add fbits argument to FPVectorFrom{Signed,Unsigned}Fixed 2018-09-18 21:46:17 +01:00
MerryMage
d6d5e986c2 A64: Implement SCVTF, UCVTF (scalar, fixed-point) 2018-09-18 21:08:06 +01:00
MerryMage
6513595c09 opcodes.inc: Align columns to a tabstop of 4 2018-09-18 20:37:03 +01:00
MerryMage
6b0d2b529e IR: Add fbits argument to FixedToFP-related opcodes 2018-09-18 20:36:37 +01:00
Lioncash
c4b383124f A64: Implement SQSHL's vector immediate variant 2018-09-18 18:22:03 +01:00
Lioncash
e0d8d2d855 A64: Implement SQSHL's vector register variant 2018-09-18 18:22:03 +01:00
Lioncash
532762582b ir: Add opcodes for left signed saturated shifts 2018-09-18 18:22:03 +01:00
Lioncash
9705252968 branch: Make variables const where applicable 2018-09-18 17:45:49 +01:00
Lioncash
650946e41c move_wide: Make variables const where applicable 2018-09-18 17:45:49 +01:00
Lioncash
62b3a6dcfb load_store_register_unprivileged: Make variables const where applicable 2018-09-18 17:45:49 +01:00
Lioncash
3add1c7b3f load_store_register_immediate: Place conditional bodies on their own line
Makes the conditionals visually consistent with the rest of the
codebase.
2018-09-18 17:45:49 +01:00
Lioncash
2fc4088a74 load_store_load_literal: Make variables const where applicable 2018-09-18 17:45:49 +01:00
Lioncash
b2c146259f data_processing_logical: Move datasize declarations after early-exit conditionals
While we're at it, make variables const where applicable.
2018-09-18 17:45:49 +01:00
Lioncash
028028f9eb data_processing_conditional_select: Make variables const where applicable
Makes CSEL's function consistent with all of the others.
2018-09-18 17:45:49 +01:00
Lioncash
c66042da57 data_processing_addsub: Move datasize declarations after early-exit conditionals
While we're at it, also make relevant variables const where applicable
2018-09-18 17:45:49 +01:00
Lioncash
6bc546e1f7 data_processing_bitfield: Move datasize variables after early-exit conditionals
Moves the declaration of datasize to the scope that it's used within.
This also takes the opportunity to apply const where applicable, and
make early-exits all vertically consistent with one another.
2018-09-17 18:14:16 +01:00
Lioncash
2aad5fa87f A64: Implement CLS's vector variant
Leverages CLZ like the integral variant does.
2018-09-17 18:14:00 +01:00
Lioncash
6c877ff8db emit_x64_vector: Make EmitVectorUnsignedSaturatedAccumulateSigned() internally linked
Given this is just an internal helper function, it can be marked static.
2018-09-16 08:16:54 +01:00
Lioncash
4b5926dcab perf_map: Use std::string_view instead of std::string for PerfMapRegister()
We can just use a non-owning view into a string in this case instead of
potentially allocating a std::string instance.
2018-09-16 08:16:43 +01:00
MerryMage
74459479b9 A64: Implement SQRDMULH (vector), vector variant 2018-09-15 14:04:42 +01:00
MerryMage
03b80f2ebe A64: Implement SQDMULL (vector), vector variant 2018-09-15 13:38:37 +01:00
MerryMage
4a2c5962c7 IR: Add VectorSignedSaturatedDoublingMultiplyLong 2018-09-15 13:38:17 +01:00
MerryMage
59dc33ef12 emit_x64_vector: Changes to VectorSignedSaturatedDoublingMultiply
* Return both the upper and lower parts of the multiply if required
* SSE2 does not support the pmuldq instruction, do sign correction to an unsigned result instead
* Improve port utilisation where possible (punpck instructions were a bottleneck)
2018-09-15 09:55:25 +01:00
MerryMage
bbaebeb217 IR: Implement Vector{Signed,Unsigned}Multiply{16,32} 2018-09-15 09:55:25 +01:00
Lioncash
baac5a8810 backend_x64/a64_interface: Re-enable the constant folding pass
This was disabled for debugging, but never re-enabled. Just to be sure,
testing was done downstream in yuzu to make sure this didn't happen to
break anything (which seems to be the case).
2018-09-14 12:14:19 +01:00
MerryMage
e78ca1947b emit_x64_vector_floating_point: Hardware FMA implementation for RSqrtStepFused 2018-09-12 21:01:06 +01:00
MerryMage
8a5ae9a366 emit_x64_vector_floating_point: Hardware FMA implementation of FPVectorRecipStepFused 2018-09-12 20:45:39 +01:00
MerryMage
39818f98e8 emit_x64_floating_point: Hardware FMA implementation of FPRSqrtStepFused 2018-09-12 16:10:18 +01:00
MerryMage
3d0a0b432b emit_x64_floating_point: Hardware FMA implementation of FPRecipStepFused{32,64} 2018-09-12 14:58:09 +01:00
MerryMage
2293dff6d8 emit_x64_vector: SSE implementation of VectorSignedSaturatedAccumulateUnsigned{8,16,32} 2018-09-11 19:57:31 +01:00
Lioncash
2047683777 emit_x64_vector: Correct static asserts for < 64-bit type checks in saturated accumulate fallbacks
I had initially meant to use BitSize() here, not sizeof()
2018-09-11 07:08:32 +01:00
MerryMage
55e9e401aa emit_x64_vector: EmitVectorSignedSaturatedAccumulateUnsigned64: SSE implementation 2018-09-10 22:39:30 +01:00
MerryMage
1076651426 emit_x64_vector: Simplify fpsr_qc related code
Move the bool conversion into A64JitState::GetFpsr so we don't have to continuously
pay the cost of conversion for every saturation instruction.
2018-09-10 21:24:07 +01:00
Lioncash
4039030234 A64: Implement CLZ's vector variant 2018-09-10 18:30:40 +01:00
Lioncash
0bb908fb53 ir: Add opcodes for vector CLZ operations
We can optimize these cases further for with the use of a fair bit of
shuffling via pshufb and the use of masks, but given the uncommon use of
this instruction, I wouldn't consider it to be beneficial in terms of
amount of code to be worth it over a simple manageable naive solution
like this.

If we ever do hit a case where vectorized CLZ happens to be a
bottleneck, then we can revisit this. At least with AVX-512CD, this can
be done with a single instruction for the 32-bit word case.
2018-09-10 18:30:40 +01:00
MerryMage
3b13259630 A64/translate: VectorZeroUpper for V(64) stores
Ensures correctness.
2018-09-09 19:59:02 +01:00
MerryMage
1931d44495 simd_two_register_misc: FNEG (vector) with Q == 0 had dirty upper 2018-09-09 19:55:37 +01:00
Lioncash
a0790f02d0 emit_x64_vector: Remove unnecessary [[maybe_unused]] attributes
These were unintentionally left in when introducing SUQADD and USQADD
2018-09-09 19:30:14 +01:00
Lioncash
b0e1eb5a15 A64: Implement USQADD's scalar and vector variants 2018-09-09 17:06:03 +01:00
Lioncash
28424c7ad1 ir: Add opcodes form unsigned saturated accumulations of signed values 2018-09-09 17:06:03 +01:00
Lioncash
9923ea0b71 A64: Implement SUQADD's scalar and vector variants 2018-09-09 17:06:03 +01:00
Lioncash
4c0adbb7f1 ir: Add opcodes for signed saturated accumulations of unsigned values 2018-09-09 17:06:03 +01:00
Lioncash
799bfed2df A64: Implement SMLAL{2}, SMLSL{2}, UMLAL{2}, and UMLSL{2}'s vector by-element variants
We can simply modify the general function made for SMULL{2} and
UMULL{2}'s by-element variants to also handle the other multiply-based
by-element variants.
2018-09-09 13:55:40 +01:00
Lioncash
94451ec321 A64: Implement UMULL{2}'s vector by-element variant 2018-09-09 13:55:40 +01:00
Lioncash
45867deac9 A64: Implement SMULL{2}'s vector by-element variant 2018-09-09 13:55:40 +01:00
Lioncash
02357939ac ir/value: Replace includes with forward declarations
enum classes are still considered complete types when forward declared
(as the compiler knows the exact size of the type from the declaration
alone). The only difference in this case being that the members of the
enum class aren't visible. Given we don't use the members within this
header in any way, we can simply forward declare them here and remove
the inclusions.
2018-09-09 09:04:22 +01:00