Lioncash
5e20e5a44a
emit_x64_vector: Vectorize fallback path in EmitVectorMaxU16()
2018-04-28 19:19:06 +01:00
Lioncash
fea1e6ca1f
A64: Implement CMTST's scalar variant
2018-04-28 19:17:38 +01:00
Lioncash
f8a96dbe5f
emit_x64_vector: Vectorize non-SSE4.1 fallback path for VectorMultiply32()
2018-04-28 19:17:18 +01:00
Lioncash
43a3873dd2
emit_x64_vector: Use VPBROADCAST where applicable and available
...
Uses the instruction that does what it says in its name if available. Allows avoiding the use
of a scratch register in EmitVectorBroadcast8() and EmitVectorBroadcastLower8()'s SSSE3 path.
2018-04-26 22:28:09 +01:00
Lioncash
2a35b2a46a
A64: Implement UZP1 and UZP2
2018-04-26 08:49:51 +01:00
Lioncash
9c2550ac72
ir: Add opcodes for performing vector deinterleaving
2018-04-26 08:49:51 +01:00
Lioncash
4765503fc6
A64: Implement FNEG (half-precision)
2018-04-26 08:49:03 +01:00
MerryMage
a00a605a4b
README: Add usage example
2018-04-25 22:25:39 +01:00
Lioncash
96be76296e
A64: Implement USHL (scalar)
2018-04-24 08:15:00 +01:00
Lioncash
8261911bd8
A64: Implement FNEG (vector)
2018-04-24 08:14:31 +01:00
Lioncash
633e8e3ecd
A64: Implement RSUBHN/RSUBHN2
2018-04-23 21:08:43 +01:00
Lioncash
4dec013b09
A64: Implement RADDHN/RADDHN2
2018-04-23 21:08:43 +01:00
Lioncash
7fb04ccb36
A64: Implement XAR
2018-04-23 16:05:40 +01:00
Lioncash
db29d68a2d
simd_two_register_misc: Factor out common comparison code
...
Gets rid of a tiny bit of duplicated code.
2018-04-23 16:04:58 +01:00
Lioncash
1692e26c2e
A64: Implement CMLE (zero)'s vector variant
2018-04-23 16:04:58 +01:00
Lioncash
21936a82aa
A64: Implement CMTST (vector)
2018-04-23 16:04:40 +01:00
Lioncash
4f27764191
A64: Implement ADDHN{2} and SUBHN{2}
2018-04-21 08:58:16 +01:00
Lioncash
1791114ab1
translate: zero extend result in Vpart when storing to lower part of vector
2018-04-21 08:58:16 +01:00
Lioncash
e96bfd1133
emit_x64_vector: Emit PMAXUD in EmitVectorMaxU32 on SSE4.1-capable CPUs
2018-04-20 23:34:32 +01:00
Lioncash
f85e2681fc
emit_x64_vector: Emit PMINUD in EmitVectorMinU32 on SSE4.1-capable CPUs
2018-04-20 23:34:32 +01:00
Lioncash
2fb3ddd2cd
emit_x64_vector: Emit PMINSD in EmitVectorMinS32 on SSE4.1-capable CPUs
...
Provides a better alternative to a fallback operation.
2018-04-20 23:34:32 +01:00
Lioncash
be48d44132
emit_x64_vector: Get rid of some magic numbers in loop bounds
2018-04-20 18:54:29 +01:00
Lioncash
a3c8ab61ea
emit_x64_vector: Generify variable shift functions
2018-04-20 18:54:14 +01:00
Lioncash
f6e624e9ea
A64: Implement CMLE (zero)'s scalar variant
2018-04-20 17:31:07 +01:00
Lioncash
41a3e87c15
A64: Implement CMLT (zero)'s scalar single/double-precision variant
2018-04-20 15:48:50 +01:00
Lioncash
51912ca6ab
A64: Implement SHA512H2
2018-04-20 07:29:26 +01:00
Lioncash
4655f78ec2
A64: Implement SHA512H
2018-04-20 07:29:26 +01:00
Lioncash
3a52275611
A64: Handle S32->F32 case for SCVTF (vector)
2018-04-19 21:09:42 +01:00
Lioncash
230e954e5c
IR: Add opcode for packed word->f32 conversions
2018-04-19 21:09:42 +01:00
Lioncash
f24cfba5c2
A64: Implement SHA512SU1
2018-04-19 19:51:31 +01:00
Lioncash
e3412d9779
A64: Implement SHA512SU0
2018-04-19 19:51:31 +01:00
Lioncash
cc76802990
A64: Implement SHA256H and SHA256H2
2018-04-19 19:50:17 +01:00
MerryMage
b5585baefb
A64: Implement SCVTF (vector, integer), scalar variant
2018-04-19 19:48:45 +01:00
MerryMage
badc7ac467
impl: Reorganize scalar two-register misc instructions
2018-04-19 19:48:45 +01:00
Lioncash
d53decf9e1
A64: Implement SHA256SU1
2018-04-19 08:40:55 +01:00
Lioncash
62be988507
simd_two_register_misc: Add missing zeroing of the vector for CMGT and CMLT
2018-04-19 08:39:56 +01:00
Lioncash
cd5fee6746
A64: Implement CMGE (zero)'s vector variant
2018-04-19 08:39:56 +01:00
Lioncash
da99e1fdaa
A64: Implement MLS (by element)
2018-04-19 08:39:25 +01:00
Lioncash
8920238d59
A64: Implement MUL (by element)
2018-04-19 08:39:25 +01:00
MerryMage
68d6a1276b
A64: Implement MLA (by element)
2018-04-19 00:04:52 +01:00
Lioncash
7340c36ae0
A64: Implement ABS (scalar)
2018-04-19 00:03:08 +01:00
Lioncash
c196c73e17
A64: Implement SHA256SU0
2018-04-19 00:00:39 +01:00
Lioncash
21790dcf98
CMake: Make FindUnicorn introduce a unicorn target
...
Makes the find module do all the work of properly setting up the target instead of needing to do it in the main CMakeLists file.
2018-04-18 23:59:54 +01:00
Lioncash
cb3b885025
A64: Implement SHA1M
2018-04-16 07:47:22 +01:00
Lioncash
4381ded969
A64: Implement SHA1P
2018-04-16 07:47:22 +01:00
Lioncash
dc5c2508a5
A64: Implement scalar variants of CMEQ, CMGT, and CMGE zero comparison instructions
...
These can trivially use the ScalarCompare helper function.
2018-04-15 13:38:42 +01:00
Lioncash
3a71801f10
A64: Implement scalar variant of NEG
2018-04-15 13:37:07 +01:00
Lioncash
f8e387f13f
simd: Relocate REV16, REV32 and REV64 vector variants to the proper file
...
These aren't scalar instruction variants.
2018-04-15 13:37:07 +01:00
Lioncash
e23f6b666e
A64: Implement CMEQ (register, scalar)
2018-04-15 11:31:20 +01:00
Lioncash
5f879af788
A64: Implement CMHS (register, scalar)
2018-04-15 11:31:20 +01:00