MerryMage
2a8de5f733
a64_emit_x64: Clear exclusive state in EmitA64CallSupervisor
...
The kernel would have to execute an ERET instruction to return to
userland; this clears exclusive state.
2020-04-22 20:46:18 +01:00
MerryMage
57f7c7e1b0
Implement global exclusive monitor
2020-04-22 20:46:18 +01:00
MerryMage
85234338d3
a64_emit_x64: Simplify EmitExclusiveWrite
2020-04-22 20:46:18 +01:00
Lioncash
5ce187a54e
ir: Add opcodes for floating-point vector equalities
2020-04-22 20:46:18 +01:00
Lioncash
cf188448d4
emit_x64_vector: Vectorize fallback case in EmitVectorMultiply64()
...
Gets rid of the need to perform a fallback.
2020-04-22 20:46:18 +01:00
Lioncash
954deff2d4
emit_x64_vector: Add break to final case in EmitVectorRoundingHalvingAddUnsigned()
...
This doesn't alter behavior but does make the code better if anything
else is ever added to this function in the future.
2020-04-22 20:46:18 +01:00
Lioncash
bc718c5b28
ir: Add opcodes for performing rounding halving adds
2020-04-22 20:46:18 +01:00
Lioncash
054549da35
emit_x64_vector: Simplify AVX-512 codepath in EmitVectorMultiply64
...
I realized I introduced a helper for simple AVX operation emitting, so
use that instead of writing it all out long-form.
2020-04-22 20:46:18 +01:00
Lioncash
6de5ed96e5
emit_x64_vector: Emit VPMULLQ in EmitVectorMultiply64 on AVX-512{DQ, VL} capable CPUs
...
Shortens code-gen down to a single instruction in the 64-bit path.
2020-04-22 20:46:18 +01:00
Lioncash
1e10017f4b
ir: Add opcodes for signed absolute differences
2020-04-22 20:46:17 +01:00
Lioncash
3f6c529da2
ir: Add opcode to perform the vector conversion S64->F64
...
Unfortunately x86 prior to AVX-512 doesn't really give us any convenient instruction to do the work for us
2020-04-22 20:46:17 +01:00
Lioncash
44a5f8095a
ir: Add opcodes for performing vector halving subtracts
2020-04-22 20:46:17 +01:00
Lioncash
b312d28295
ir: Add an opcode for doing an SM4 lookup table query
2020-04-22 20:46:17 +01:00
Lioncash
27a6d5f6ce
emit_x64_vector: Use VPOPCNTB in EmitVectorPopulationCount() if AVX-512 BITALG is available
2020-04-22 20:46:17 +01:00
Lioncash
089096948a
ir: Add opcodes for performing halving adds
2020-04-22 20:46:17 +01:00
Lioncash
3d00dd63b4
emit_x64_vector: Emit VPMINSQ and VPMINUQ for 64-bit vector min operations if AVX-512VL is available
2020-04-22 20:46:17 +01:00
Lioncash
b97b71b8aa
emit_x64_vector: Emit VPMAXSQ and VPMAXUQ for 64-bit vector max operations if AVX-512VL is available
2020-04-22 20:46:17 +01:00
Lioncash
033e400df0
emit_x64_vector_floating_point: Deduplicate accurate NaN handling code
...
Allows the code to both be used from the 32 bit and 64 bit operations without duplicating code.
2020-04-22 20:46:17 +01:00
Lioncash
0f067b7330
emit_x64_vector: Emit VPABSQ in EmitVectorAbs() for the 64-bit case if AVX-512VL is available
2020-04-22 20:46:17 +01:00
Lioncash
d4ee878cbd
emit_x64_vector: Use VPSRAQ in EmitVectorArithmeticShiftRight64() if AVX-512VL is available
2020-04-22 20:46:17 +01:00
Lioncash
51e4f1d9db
emit_x64_vector: Vectorize fallback path of EmitVectorMaxS32()
2020-04-22 20:46:17 +01:00
Lioncash
c692ccdd6d
emit_x64_vector: Vectorize fallback path of EmitVectorMaxS8()
2020-04-22 20:46:17 +01:00
Lioncash
b194313d8c
emit_x64_vector: Vectorize fallback path in EmitVectorMinU32()
2020-04-22 20:46:17 +01:00
Lioncash
7ceda6d919
emit_x64_vector: Vectorize fallback path in EmitVectorMinU16()
2020-04-22 20:46:17 +01:00
Lioncash
cda85a1da0
emit_x64_vector: Vectorize fallback path in EmitVectorMinS32()
2020-04-22 20:46:17 +01:00
Lioncash
6e08eed210
emit_x64_vector: Vectorize fallback path in EmitVectorMinS8()
2020-04-22 20:46:17 +01:00
Lioncash
0fb6dce689
emit_x64_vector: Remove unnecessary if constexpr expression in LogicalVShift
...
This can simply be merged with the previous one.
2020-04-22 20:46:17 +01:00
Lioncash
5b71b1337b
emit_x64_vector: Avoid left shift of negative value in LogicalVShift
...
Now that we handle the signed variants, we also have to be careful about left shifts with negative values,
as this is considered undefined behavior.
2020-04-22 20:46:17 +01:00
Lioncash
9954d28868
a64_jitstate: Zero SP and PC on construction of A64JitState
...
Given we zero out/reset everything else in the struct, do the same for these members to keep initialization consistent
2020-04-22 20:46:17 +01:00
Lioncash
4efbd40ea4
backend_x64/callback: Default virtual destructor in the cpp file
...
Prevents the vtable being generated in each translation unit that includes the header (and silences -Wweak-vtables warnings)
2020-04-22 20:46:17 +01:00
Lioncash
edd0b5c8c7
a32_interface/a64_interface: Change reinterpret_casts to static_casts in GetCurrentBlock thunks
...
It's well-defined to static_cast a void* to its proper type.
2020-04-22 20:46:17 +01:00
Lioncash
21974ee57e
backend_x64/ir: Amend generic LogicalVShift() template to also handle signed variants
...
Also adds IR opcodes to dispatch said variants
2020-04-22 20:46:17 +01:00
Lioncash
9fc89f0a0e
emit_x64_vector_floating_point: Use arrays for retrieving size instead of hardcoding the size
...
Similar changes were done in emit_x64_vector, but these were missed.
2020-04-22 20:46:17 +01:00
Lioncash
af28e89a13
emit_x64_vector: Vectorize fallback path in EmitVectorMaxU16()
2020-04-22 20:46:17 +01:00
Lioncash
0d20423ad5
emit_x64_vector: Vectorize non-SSE4.1 fallback path for VectorMultiply32()
2020-04-22 20:46:17 +01:00
Lioncash
d70ee7c0d1
emit_x64_vector: Use VBPROADCAST where applicable and available
...
Uses the instruction that does what it says in its name if available. Allows avoiding the use
of a scratch register in EmitVectorBroadcast8() and EmitVectorBroadcastLower8()'s SSSE3 path.
2020-04-22 20:46:17 +01:00
Lioncash
26d77c6f09
ir: Add opcodes for performing vector deinterleaving
2020-04-22 20:46:17 +01:00
Lioncash
87ca63699f
emit_x64_vector: Emit PMAXUD in EmitVectorMaxU32 on SSE4.1-capable CPUs
2020-04-22 20:46:17 +01:00
Lioncash
f17702f608
emit_x64_vector: Emit PMINUD in EmitVectorMinU32 on SSE4.1-capable CPUs
2020-04-22 20:46:17 +01:00
Lioncash
596a8dd1dd
emit_x64_vector: Emit PMINSD in EmitVectorMinS32 on SSE4.1-capable CPUs
...
Provides a better alternative to a fallback operation.
2020-04-22 20:46:17 +01:00
Lioncash
75fd4eaaaa
emit_x64_vector: Get rid of some magic numbers in loop bounds
2020-04-22 20:46:17 +01:00
Lioncash
7b80ac25eb
emit_x64_vector: Generify variable shift functions
2020-04-22 20:46:17 +01:00
Lioncash
38fa984b53
IR: Add opcode for packed word->f32 conversions
2020-04-22 20:46:16 +01:00
Lioncash
64b1f2d468
ir: Add opcode for reversing bits in a vector
2020-04-22 20:46:15 +01:00
Lioncash
e33dcce14a
ir: Add opcodes for performing vector absolute values
2020-04-22 20:46:15 +01:00
MerryMage
3472f371df
IR: Implement VectorExtract, VectorExtractLower IR instructions
2020-04-22 20:46:15 +01:00
MerryMage
5c47f03888
A64: Implement FMUL (vector)
2020-04-22 20:46:15 +01:00
Lioncash
ad5cf584ce
ir: Add opcodes for performing vector unsigned absolute differences
2020-04-22 20:46:15 +01:00
Lioncash
701f43d61e
IR: Add opcodes for interleaving upper-order bytes/halfwords/words/doublewords
...
I should have added this when I introduced the functions for interleaving
low-order equivalents for consistency in the interface.
2020-04-22 20:46:15 +01:00
Lioncash
3985f7bf84
emit_x64_data_processing: Deduplicate some code in zero-extension functions
...
EmitZeroExtendByteToLong() can be implemented in terms of EmitZeroExtendByteToWord() and
EmitZeroExtendHalfToLong() can be implemented in terms of EmitZeroExtendHalfToWord().
2020-04-22 20:46:15 +01:00