Lioncash
8f8527407c
emit_x64_vector: SSSE3 variant of EmitVectorCountLeadingZeros8()
...
pshufb lyfe
2018-09-23 10:13:08 +01:00
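The commit message does not spell out the approach. One common way to use pshufb for a per-byte count of leading zeros is to treat it as a 16-entry table lookup on each nibble. The scalar sketch below models that idea for a single byte; the names `clz4_table` and `CountLeadingZeros8` are illustrative assumptions, not the project's code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Leading-zero count of every 4-bit value. A pshufb-based version would hold
// this table in a vector register and look up all 16 lanes at once.
constexpr std::array<std::uint8_t, 16> clz4_table = {4, 3, 2, 2, 1, 1, 1, 1,
                                                     0, 0, 0, 0, 0, 0, 0, 0};

// Scalar model for one byte (hypothetical helper).
constexpr std::uint8_t CountLeadingZeros8(std::uint8_t value) {
    const std::uint8_t hi = value >> 4;
    const std::uint8_t lo = value & 0x0F;
    // If the high nibble is non-zero it alone decides the result; otherwise
    // all 4 high bits are zero and the low nibble contributes the rest.
    return hi != 0 ? clz4_table[hi]
                   : static_cast<std::uint8_t>(4 + clz4_table[lo]);
}
```

In a vector version the two lookups and the select would each be done across the whole register rather than with a branch.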
Mat M
be05e75818
Merge pull request #397 from VelocityRa/dec-shift-fix
...
decoders: Cast to correctly-sized type before shifting
2018-09-22 19:17:36 -04:00
VelocityRa
bc328fc645
decoders: Cast to correctly-sized type before shifting
...
Fixes decoding for 64-bit instructions.
This does not affect any currently supported ARM version (all of their
instructions are 32 bits long or shorter); it future-proofs the decoders
should such an architecture be supported.
2018-09-22 22:10:29 +03:00
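A minimal sketch of the class of bug this commit title describes, assuming a decoder that builds bit masks for a templated opcode type. `BitMask` and `OpcodeType` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Builds a mask with only bit `bit_index` set (hypothetical helper).
template <typename OpcodeType>
constexpr OpcodeType BitMask(std::size_t bit_index) {
    // `1 << bit_index` would shift a plain int, which is undefined once
    // bit_index reaches the width of int. Casting first makes the shift
    // happen in the opcode's own width.
    return static_cast<OpcodeType>(1) << bit_index;
}
```

With a 32-bit opcode type both spellings behave the same for bits 0 to 30, which is why the problem would only surface with a wider instruction encoding.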
MerryMage
9c3d2d104d
a64_emit_x64: Lowercase PAGE_SIZE
...
PAGE_SIZE is defined as a macro by musl.
2018-09-22 18:54:49 +01:00
MerryMage
f538d29be7
emit_x64_vector_floating_point: SSE4.1 implementation of EmitFPVectorToFixed
2018-09-22 18:47:15 +01:00
MerryMage
1603a6e9f8
emit_x64_vector_floating_point: EmitFPVectorRoundInt: Use FCODE
2018-09-22 16:19:54 +01:00
MerryMage
2e1ccaff53
emit_x64_vector: AVX implementation for EmitVectorCountLeadingZeros8
2018-09-22 16:08:23 +01:00
MerryMage
555bfdacf9
emit_x64_vector: SSE implementation of EmitVectorCountLeadingZeros16
2018-09-22 13:05:47 +01:00
Lioncash
71c2589662
externals: Update Xbyak to 5.73
...
Merge commit '1ec1b2febb1eac58cf98384dd8ba977a74be7054' into xbyak
2018-09-19 17:55:45 -04:00
Lioncash
1ec1b2febb
Squashed 'externals/xbyak/' changes from 1de435ed..42462ef9
...
42462ef9 use evex encoding for vpslld/vpslldq/vpsraw/...(reg, mem, imm);
da9117a9 update version of readme.md
d35f4fb7 fix the encoding of vinsertps for disp8N
git-subtree-dir: externals/xbyak
git-subtree-split: 42462ef922893f0d3f2156d005fa27ba6898498b
2018-09-19 17:54:43 -04:00
MerryMage
171d11659d
A64: Implement SCVTF, UCVTF (vector, fixed-point), scalar variant
2018-09-19 20:11:14 +01:00
MerryMage
f221bb0095
emit_x64_floating_point: Reduce fallback LUT code in EmitFPToFixed
2018-09-19 20:11:14 +01:00
MerryMage
eb123e2a74
A64: Implement FCVTZS, FCVTZU, UCVTF, SCVTF (vector, fixed-point), vector variant
2018-09-19 19:47:28 +01:00
Lioncash
487d37a4a1
A64: Implement UQSHL's vector immediate and register variants
2018-09-19 12:13:22 +01:00
Lioncash
f69893345f
ir: Add opcodes for unsigned saturating left shifts
2018-09-19 12:13:22 +01:00
Lioncash
7148e661f6
A64/translate/impl: Make signatures consistent for unimplemented by-element SIMD variants
...
Makes them all consistent, so it isn't necessary to change the
prototypes over when implementing them.
2018-09-19 12:11:35 +01:00
Lioncash
fdde4ca363
A64: Implement BRK
...
Currently, we can just implement this as part of the exception
interface, similar to how it's done for the A32 interface with BKPT.
2018-09-19 07:09:27 +01:00
Lioncash
b1490db0e9
A64/imm: Add full range of comparison operators to Imm template
...
Makes the comparison interface consistent by providing all of the
relevant members. This also modifies the comparison operators to take
the Imm instance by value, as it's really only a u32 under the covers,
and it's cheaper to shuffle around a u32 than a 64-bit pointer address.
2018-09-19 07:09:00 +01:00
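As a rough illustration of the interface described above, here is a simplified stand-in for an immediate wrapper with the full set of comparison operators, each taking its operand by value. The class layout and the `Value()` accessor are assumptions made for this sketch, not the project's actual definition.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified stand-in for an immediate of `bit_size` bits.
template <std::size_t bit_size>
class Imm {
public:
    constexpr explicit Imm(std::uint32_t v) : value(v) {}
    constexpr std::uint32_t Value() const { return value; }

    // The object is only a u32, so passing it by value is at least as
    // cheap as passing a reference to it.
    constexpr bool operator==(Imm other) const { return value == other.value; }
    constexpr bool operator!=(Imm other) const { return value != other.value; }
    constexpr bool operator<(Imm other) const { return value < other.value; }
    constexpr bool operator<=(Imm other) const { return value <= other.value; }
    constexpr bool operator>(Imm other) const { return value > other.value; }
    constexpr bool operator>=(Imm other) const { return value >= other.value; }

private:
    std::uint32_t value;
};
```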
MerryMage
1ec40ef6ed
IR: Add fbits argument to FPVectorFrom{Signed,Unsigned}Fixed
2018-09-18 21:46:17 +01:00
MerryMage
d6d5e986c2
A64: Implement SCVTF, UCVTF (scalar, fixed-point)
2018-09-18 21:08:06 +01:00
MerryMage
6513595c09
opcodes.inc: Align columns to a tabstop of 4
2018-09-18 20:37:03 +01:00
MerryMage
6b0d2b529e
IR: Add fbits argument to FixedToFP-related opcodes
2018-09-18 20:36:37 +01:00
Lioncash
c4b383124f
A64: Implement SQSHL's vector immediate variant
2018-09-18 18:22:03 +01:00
Lioncash
e0d8d2d855
A64: Implement SQSHL's vector register variant
2018-09-18 18:22:03 +01:00
Lioncash
532762582b
ir: Add opcodes for left signed saturated shifts
2018-09-18 18:22:03 +01:00
Lioncash
9705252968
branch: Make variables const where applicable
2018-09-18 17:45:49 +01:00
Lioncash
650946e41c
move_wide: Make variables const where applicable
2018-09-18 17:45:49 +01:00
Lioncash
62b3a6dcfb
load_store_register_unprivileged: Make variables const where applicable
2018-09-18 17:45:49 +01:00
Lioncash
3add1c7b3f
load_store_register_immediate: Place conditional bodies on their own line
...
Makes the conditionals visually consistent with the rest of the
codebase.
2018-09-18 17:45:49 +01:00
Lioncash
2fc4088a74
load_store_load_literal: Make variables const where applicable
2018-09-18 17:45:49 +01:00
Lioncash
b2c146259f
data_processing_logical: Move datasize declarations after early-exit conditionals
...
While we're at it, make variables const where applicable.
2018-09-18 17:45:49 +01:00
Lioncash
028028f9eb
data_processing_conditional_select: Make variables const where applicable
...
Makes CSEL's function consistent with all of the others.
2018-09-18 17:45:49 +01:00
Lioncash
c66042da57
data_processing_addsub: Move datasize declarations after early-exit conditionals
...
While we're at it, also make relevant variables const where applicable.
2018-09-18 17:45:49 +01:00
Lioncash
6bc546e1f7
data_processing_bitfield: Move datasize variables after early-exit conditionals
...
Moves the declaration of datasize to the scope that it's used within.
This also takes the opportunity to apply const where applicable, and
make early-exits all vertically consistent with one another.
2018-09-17 18:14:16 +01:00
Lioncash
2aad5fa87f
A64: Implement CLS's vector variant
...
Leverages CLZ like the integral variant does.
2018-09-17 18:14:00 +01:00
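The relationship between CLS and CLZ can be sketched per element in scalar code. This follows the usual definition of counting leading sign bits (XOR each bit with its upper neighbour, then count leading zeros); the function names are illustrative, and the exact IR sequence used by the commit is not shown here.

```cpp
#include <cassert>
#include <cstdint>

// Portable count of leading zeros for a 32-bit value (returns 32 for 0).
constexpr std::uint32_t CountLeadingZeros32(std::uint32_t value) {
    std::uint32_t count = 0;
    for (std::uint32_t mask = 0x80000000u; mask != 0 && (value & mask) == 0; mask >>= 1) {
        ++count;
    }
    return count;
}

// Number of bits below the sign bit that are equal to the sign bit.
constexpr std::uint32_t CountLeadingSignBits32(std::uint32_t value) {
    // Arithmetic shift right by one, written without relying on signed shifts.
    const std::uint32_t shifted = (value >> 1) | (value & 0x80000000u);
    // A set bit in (value ^ shifted) marks a position that differs from the
    // bit above it. Bit 31 is always clear, hence the final "- 1".
    return CountLeadingZeros32(value ^ shifted) - 1;
}
```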
Lioncash
6c877ff8db
emit_x64_vector: Make EmitVectorUnsignedSaturatedAccumulateSigned() internally linked
...
Given this is just an internal helper function, it can be marked static.
2018-09-16 08:16:54 +01:00
Lioncash
4b5926dcab
perf_map: Use std::string_view instead of std::string for PerfMapRegister()
...
We can just use a non-owning view into a string in this case instead of
potentially allocating a std::string instance.
2018-09-16 08:16:43 +01:00
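A small sketch of why a non-owning view helps here. `FormatPerfMapLine` is a hypothetical function, not the project's `PerfMapRegister()`; it assumes the common perf map line layout of hexadecimal start address, hexadecimal size, and symbol name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <string_view>

// Accepts the name as a view, so callers passing a string literal or a
// substring do not have to construct a temporary std::string first.
std::string FormatPerfMapLine(std::uintptr_t start, std::size_t size, std::string_view name) {
    std::ostringstream line;
    line << std::hex << start << ' ' << size << ' ' << name << '\n';
    return line.str();
}
```

Callers holding a `std::string` can still pass it directly, since it converts to `std::string_view` implicitly.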
MerryMage
74459479b9
A64: Implement SQRDMULH (vector), vector variant
2018-09-15 14:04:42 +01:00
MerryMage
03b80f2ebe
A64: Implement SQDMULL (vector), vector variant
2018-09-15 13:38:37 +01:00
MerryMage
4a2c5962c7
IR: Add VectorSignedSaturatedDoublingMultiplyLong
2018-09-15 13:38:17 +01:00
MerryMage
59dc33ef12
emit_x64_vector: Changes to VectorSignedSaturatedDoublingMultiply
...
* Return both the upper and lower parts of the multiply if required
* SSE2 does not support the pmuldq instruction, so do sign correction on an unsigned result instead
* Improve port utilisation where possible (punpck instructions were a bottleneck)
2018-09-15 09:55:25 +01:00
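The sign correction mentioned in the second bullet can be illustrated in scalar form: the upper half of a signed 32x32 multiply can be recovered from the unsigned product by subtracting one operand for each negative input. This is a sketch of the arithmetic identity only, with a hypothetical function name, and not the vectorised code from the commit.

```cpp
#include <cassert>
#include <cstdint>

// Upper 32 bits of the signed 64-bit product a * b, computed from an
// unsigned multiply plus a correction.
constexpr std::int32_t SignedMultiplyHigh32(std::int32_t a, std::int32_t b) {
    const std::uint32_t ua = static_cast<std::uint32_t>(a);
    const std::uint32_t ub = static_cast<std::uint32_t>(b);
    std::uint32_t high = static_cast<std::uint32_t>((static_cast<std::uint64_t>(ua) * ub) >> 32);
    // Reinterpreting a negative a as unsigned adds 2^32 to it, which adds
    // ub to the upper half of the product; undo that (and likewise for b).
    if (a < 0) high -= ub;
    if (b < 0) high -= ua;
    return static_cast<std::int32_t>(high);
}
```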
MerryMage
bbaebeb217
IR: Implement Vector{Signed,Unsigned}Multiply{16,32}
2018-09-15 09:55:25 +01:00
Lioncash
baac5a8810
backend_x64/a64_interface: Re-enable the constant folding pass
...
This was disabled for debugging, but never re-enabled. Just to be sure,
testing was done downstream in yuzu to confirm that re-enabling it doesn't
break anything (it doesn't appear to).
2018-09-14 12:14:19 +01:00
MerryMage
e78ca1947b
emit_x64_vector_floating_point: Hardware FMA implementation for RSqrtStepFused
2018-09-12 21:01:06 +01:00
MerryMage
8a5ae9a366
emit_x64_vector_floating_point: Hardware FMA implementation of FPVectorRecipStepFused
2018-09-12 20:45:39 +01:00
MerryMage
39818f98e8
emit_x64_floating_point: Hardware FMA implementation of FPRSqrtStepFused
2018-09-12 16:10:18 +01:00
MerryMage
3d0a0b432b
emit_x64_floating_point: Hardware FMA implementation of FPRecipStepFused{32,64}
2018-09-12 14:58:09 +01:00
MerryMage
2293dff6d8
emit_x64_vector: SSE implementation of VectorSignedSaturatedAccumulateUnsigned{8,16,32}
2018-09-11 19:57:31 +01:00
Lioncash
2047683777
emit_x64_vector: Correct static asserts for < 64-bit type checks in saturated accumulate fallbacks
...
I had initially meant to use BitSize() here, not sizeof().
2018-09-11 07:08:32 +01:00
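The difference matters because `sizeof` reports bytes while the check is about bits. The sketch below uses a hypothetical `BitSize<T>()` helper (the project's own helper may be declared differently) to show how a "narrower than 64 bits" check behaves under each spelling.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in: width of T in bits.
template <typename T>
constexpr std::size_t BitSize() {
    return sizeof(T) * CHAR_BIT;
}

// Intended check: true only for element types narrower than 64 bits.
template <typename T>
constexpr bool IsNarrowerThan64Bits() {
    return BitSize<T>() < 64;
}

// sizeof(std::uint64_t) is 8, so a check written as `sizeof(T) < 64`
// passes for every integer type and guards nothing.
static_assert(sizeof(std::uint64_t) < 64, "byte count, always below 64");
static_assert(IsNarrowerThan64Bits<std::uint32_t>(), "32-bit elements are accepted");
static_assert(!IsNarrowerThan64Bits<std::uint64_t>(), "64-bit elements are rejected");
```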
MerryMage
55e9e401aa
emit_x64_vector: EmitVectorSignedSaturatedAccumulateUnsigned64: SSE implementation
2018-09-10 22:39:30 +01:00