Lioncash
7cab5c2e03
emit_x64_vector: Vectorize fallback path in EmitVectorMinS32()
2018-05-02 17:13:41 +01:00
Lioncash
7fa0d2765a
emit_x64_vector: Vectorize fallback path in EmitVectorMinS8()
2018-05-02 17:13:41 +01:00
Lioncash
73477f04dd
emit_x64_vector: Remove unnecessary if constexpr expression in LogicalVShift
...
This can simply be merged with the previous one.
2018-05-01 19:31:48 +01:00
Lioncash
f90a476546
emit_x64_vector: Avoid left shift of negative value in LogicalVShift
...
Now that we handle the signed variants, we also have to be careful about left shifts with negative values,
as this is considered undefined behavior.
2018-05-01 19:31:48 +01:00
Lioncash
e08c1bdae1
a64_jitstate: Zero SP and PC on construction of A64JitState
...
Given we zero out/reset everything else in the struct, do the same for these members to keep initialization consistent.
2018-05-01 19:31:05 +01:00
Lioncash
e33ba25d8a
backend_x64/callback: Default virtual destructor in the cpp file
...
Prevents the vtable from being emitted in each translation unit that includes the header (and silences -Wweak-vtables warnings).
2018-05-01 19:30:56 +01:00
Lioncash
83e20fc690
a32_interface/a64_interface: Change reinterpret_casts to static_casts in GetCurrentBlock thunks
...
It's well-defined to static_cast a void* to its proper type.
2018-05-01 19:30:25 +01:00
Lioncash
8aa9a47a6a
A64: Implement SSHL (scalar)
2018-04-30 22:41:17 +01:00
Lioncash
8818d76212
A64: Implement SSHL (vector)
2018-04-30 22:41:17 +01:00
Lioncash
3e3ce37eb8
backend_x64/ir: Amend generic LogicalVShift() template to also handle signed variants
...
Also adds IR opcodes to dispatch said variants.
2018-04-30 22:41:17 +01:00
Lioncash
4cb09c827d
emit_x64_vector_floating_point: Use arrays for retrieving size instead of hardcoding the size
...
Similar changes were done in emit_x64_vector, but these were missed.
2018-04-30 22:40:25 +01:00
Lioncash
5e20e5a44a
emit_x64_vector: Vectorize fallback path in EmitVectorMaxU16()
2018-04-28 19:19:06 +01:00
Lioncash
fea1e6ca1f
A64: Implement CMTST's scalar variant
2018-04-28 19:17:38 +01:00
Lioncash
f8a96dbe5f
emit_x64_vector: Vectorize non-SSE4.1 fallback path for VectorMultiply32()
2018-04-28 19:17:18 +01:00
Lioncash
43a3873dd2
emit_x64_vector: Use VPBROADCAST where applicable and available
...
Uses the instruction that does what it says in its name if available. Allows avoiding the use
of a scratch register in EmitVectorBroadcast8() and EmitVectorBroadcastLower8()'s SSSE3 path.
2018-04-26 22:28:09 +01:00
Lioncash
2a35b2a46a
A64: Implement UZP1 and UZP2
2018-04-26 08:49:51 +01:00
Lioncash
9c2550ac72
ir: Add opcodes for performing vector deinterleaving
2018-04-26 08:49:51 +01:00
Lioncash
4765503fc6
A64: Implement FNEG (half-precision)
2018-04-26 08:49:03 +01:00
MerryMage
a00a605a4b
README: Add usage example
2018-04-25 22:25:39 +01:00
Lioncash
96be76296e
A64: Implement USHL (scalar)
2018-04-24 08:15:00 +01:00
Lioncash
8261911bd8
A64: Implement FNEG (vector)
2018-04-24 08:14:31 +01:00
Lioncash
633e8e3ecd
A64: Implement RSUBHN/RSUBHN2
2018-04-23 21:08:43 +01:00
Lioncash
4dec013b09
A64: Implement RADDHN/RADDHN2
2018-04-23 21:08:43 +01:00
Lioncash
7fb04ccb36
A64: Implement XAR
2018-04-23 16:05:40 +01:00
Lioncash
db29d68a2d
simd_two_register_misc: Factor out common comparison code
...
Gets rid of a tiny bit of duplicated code.
2018-04-23 16:04:58 +01:00
Lioncash
1692e26c2e
A64: Implement CMLE (zero)'s vector variant
2018-04-23 16:04:58 +01:00
Lioncash
21936a82aa
A64: Implement CMTST (vector)
2018-04-23 16:04:40 +01:00
Lioncash
4f27764191
A64: Implement ADDHN{2} and SUBHN{2}
2018-04-21 08:58:16 +01:00
Lioncash
1791114ab1
translate: zero extend result in Vpart when storing to lower part of vector
2018-04-21 08:58:16 +01:00
Lioncash
e96bfd1133
emit_x64_vector: Emit PMAXUD in EmitVectorMaxU32 on SSE4.1-capable CPUs
2018-04-20 23:34:32 +01:00
Lioncash
f85e2681fc
emit_x64_vector: Emit PMINUD in EmitVectorMinU32 on SSE4.1-capable CPUs
2018-04-20 23:34:32 +01:00
Lioncash
2fb3ddd2cd
emit_x64_vector: Emit PMINSD in EmitVectorMinS32 on SSE4.1-capable CPUs
...
Provides a better alternative to a fallback operation.
2018-04-20 23:34:32 +01:00
Lioncash
be48d44132
emit_x64_vector: Get rid of some magic numbers in loop bounds
2018-04-20 18:54:29 +01:00
Lioncash
a3c8ab61ea
emit_x64_vector: Generify variable shift functions
2018-04-20 18:54:14 +01:00
Lioncash
f6e624e9ea
A64: Implement CMLE (zero)'s scalar variant
2018-04-20 17:31:07 +01:00
Lioncash
41a3e87c15
A64: Implement CMLT (zero)'s scalar single/double-precision variant
2018-04-20 15:48:50 +01:00
Lioncash
51912ca6ab
A64: Implement SHA512H2
2018-04-20 07:29:26 +01:00
Lioncash
4655f78ec2
A64: Implement SHA512H
2018-04-20 07:29:26 +01:00
Lioncash
3a52275611
A64: Handle S32->F32 case for SCVTF (vector)
2018-04-19 21:09:42 +01:00
Lioncash
230e954e5c
IR: Add opcode for packed word->f32 conversions
2018-04-19 21:09:42 +01:00
Lioncash
f24cfba5c2
A64: Implement SHA512SU1
2018-04-19 19:51:31 +01:00
Lioncash
e3412d9779
A64: Implement SHA512SU0
2018-04-19 19:51:31 +01:00
Lioncash
cc76802990
A64: Implement SHA256H and SHA256H2
2018-04-19 19:50:17 +01:00
MerryMage
b5585baefb
A64: Implement SCVTF (vector, integer), scalar variant
2018-04-19 19:48:45 +01:00
MerryMage
badc7ac467
impl: Reorganize scalar two-register misc instructions
2018-04-19 19:48:45 +01:00
Lioncash
d53decf9e1
A64: Implement SHA256SU1
2018-04-19 08:40:55 +01:00
Lioncash
62be988507
simd_two_register_misc: Add missing zeroing of the vector for CMGT and CMLT
2018-04-19 08:39:56 +01:00
Lioncash
cd5fee6746
A64: Implement CMGE (zero)'s vector variant
2018-04-19 08:39:56 +01:00
Lioncash
da99e1fdaa
A64: Implement MLS (by element)
2018-04-19 08:39:25 +01:00
Lioncash
8920238d59
A64: Implement MUL (by element)
2018-04-19 08:39:25 +01:00