1443 Commits

Author SHA1 Message Date
Lioncash
2c8c83c2f1
constant_propagation_pass: Add 64-bit variants of shifts to the pass
These optimizations can also apply to the 64-bit variants of the shift
opcodes; we just need to check if the instruction has an associated
pseudo-op before performing the 32-bit variant's specifics.

While we're at it, we can also relocate the code to its own function
like the rest of the cases to keep organization consistent.
2018-10-12 14:50:21 -04:00
Lioncash
b55b7a5c7f
constant_propagation_pass: Fold division operations where applicable
We can fold division operations if:

1. The divisor is zero, in which case we can replace the result with zero
(as this is how ARM platforms expect it).
2. Both values are known, in which case we can just do the operation and
store the result.
3. The divisor is 1, in which case we can just return the other operand.
2018-10-09 16:03:47 -04:00
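The three division rules in b55b7a5c7f can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `Operand`, `FoldKind`, `FoldResult`, and `FoldUnsignedDivide` are hypothetical names, and only the unsigned 64-bit case is modelled.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for an IR operand: engaged only when the value is a known constant.
using Operand = std::optional<std::uint64_t>;

enum class FoldKind { Constant, UseDividend, NotFolded };

struct FoldResult {
    FoldKind kind;
    std::uint64_t value;  // meaningful only when kind == FoldKind::Constant
};

// Applies the rules in the order they are listed in the commit message.
FoldResult FoldUnsignedDivide(const Operand& dividend, const Operand& divisor) {
    if (divisor && *divisor == 0) {
        return {FoldKind::Constant, 0};  // rule 1: division by zero yields zero
    }
    if (dividend && divisor) {
        return {FoldKind::Constant, *dividend / *divisor};  // rule 2: both operands known
    }
    if (divisor && *divisor == 1) {
        return {FoldKind::UseDividend, 0};  // rule 3: x / 1 is just x
    }
    return {FoldKind::NotFolded, 0};
}
```

Checking the zero divisor first matters in this sketch: it keeps the host from ever evaluating a division by zero in the "both known" branch.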
Merry
c3a1558dfc
Merge pull request #406 from lioncash/mul
constant_propagation_pass: Fold Mul32 and Mul64 cases where applicable
2018-10-09 15:11:36 +01:00
Merry
caea0460ca
Merge pull request #405 from lioncash/inst
a64: Add ARMv8.4+ instructions encodings to the encoding table
2018-10-08 08:35:04 +01:00
Lioncash
a5d1d27c8c
constant_propagation_pass: deduplicate common 32/64 bit checking for results in folding functions
It's common for a folding operation to apply to both the 32-bit and
64-bit variant of the same opcode, which leads to checking which kind of
result we need to store the value as. This moves it to its own function,
so that we don't need to duplicate it in various functions.
2018-10-07 22:16:40 -04:00
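The deduplication described in a5d1d27c8c amounts to choosing the result width in one place. A minimal sketch, assuming a hypothetical `Immediate` variant and a hypothetical `MakeResult` helper (neither name comes from the commit):

```cpp
#include <cstdint>
#include <variant>

// Hypothetical immediate type: either a 32-bit or a 64-bit constant.
using Immediate = std::variant<std::uint32_t, std::uint64_t>;

// Picks the result width once, so each folding function does not repeat the check.
Immediate MakeResult(bool is_32_bit, std::uint64_t value) {
    if (is_32_bit) {
        return static_cast<std::uint32_t>(value);  // keep only the low 32 bits
    }
    return value;
}
```

A folding function can then compute everything in 64 bits and hand the value to the helper, regardless of which opcode variant it is handling.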
Lioncash
a0ea3a566a
constant_propagation_pass: Fold Mul32 and Mul64 cases where applicable
Multiplication operations can currently be folded if:

1. Both arguments are known constant values
2. Either operand is zero (in which case the result is also zero)
3. Either operand is one (in which case the result is the non-one
operand).
2018-10-07 22:16:01 -04:00
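The multiplication rules in a0ea3a566a could look something like the sketch below. The types and the `FoldMultiply` function are hypothetical and cover only the 64-bit case; a 32-bit variant would additionally truncate the constant result.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for an IR operand: engaged only when the value is a known constant.
using Operand = std::optional<std::uint64_t>;

enum class FoldKind { Constant, UseLeft, UseRight, NotFolded };

struct FoldResult {
    FoldKind kind;
    std::uint64_t value;  // meaningful only when kind == FoldKind::Constant
};

FoldResult FoldMultiply(const Operand& lhs, const Operand& rhs) {
    if (lhs && rhs) {
        return {FoldKind::Constant, *lhs * *rhs};  // rule 1: wraps modulo 2^64
    }
    if ((lhs && *lhs == 0) || (rhs && *rhs == 0)) {
        return {FoldKind::Constant, 0};  // rule 2: anything times zero is zero
    }
    if (lhs && *lhs == 1) {
        return {FoldKind::UseRight, 0};  // rule 3: 1 * x is x
    }
    if (rhs && *rhs == 1) {
        return {FoldKind::UseLeft, 0};   // rule 3: x * 1 is x
    }
    return {FoldKind::NotFolded, 0};
}
```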
Lioncash
62a2ba7981
a64: Add ARMv8.4+ instructions encodings to the encoding table
Keeps the table up to date with the ARM specification.
2018-10-06 02:27:50 -04:00
Lioncash
f1d907c980
constant_propagation_pass: Fold SignExtend{Type}ToLong opcodes if possible 2018-10-05 21:06:57 -04:00
Lioncash
4f03ca65a0
constant_propagation_pass: Fold SignExtend{Type}ToWord opcodes if possible 2018-10-05 21:06:57 -04:00
Lioncash
f47c5e4ede
constant_propagation_pass: Fold ZeroExtend{Type}ToLong opcodes if possible
These are equivalent to the ZeroExtendXToWord variants, so we can
trivially do this as well.
2018-10-05 21:06:57 -04:00
Lioncash
e0eec323f2
constant_propagation_pass: Combine zero-extension folding code into its own function
Separates the behavior from the actual switch statement and gets rid of
duplication, now that we can use the general GetImmediateAsU64()
function.
2018-10-05 21:06:54 -04:00
Lioncash
8f262d485a
ir/value: Add IsSignedImmediate() and IsUnsignedImmediate() functions to Value's interface
This allows testing against arbitrary values while also simultaneously
eliminating the need to check IsImmediate() all the time in expressions.
2018-10-04 12:33:23 -04:00
Lioncash
01aa5096bf
ir/value: Add a GetImmediateAsS64() function
Provides a signed analogue to GetImmediateAsU64() for consistency with
both integral classes when it comes to signed/unsigned.
2018-10-04 12:33:19 -04:00
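To illustrate what a signed analogue of GetImmediateAsU64() has to do differently, here is a small sketch. It assumes a hypothetical `Immediate` variant and free functions `AsU64`/`AsS64`; the real Value class is not shown in this log and may be organised quite differently.

```cpp
#include <cstdint>
#include <type_traits>
#include <variant>

// Hypothetical immediate storage: one alternative per supported width.
using Immediate = std::variant<std::uint8_t, std::uint16_t, std::uint32_t, std::uint64_t>;

// Unsigned read: zero-extends the stored value to 64 bits.
std::uint64_t AsU64(const Immediate& imm) {
    return std::visit([](auto v) { return static_cast<std::uint64_t>(v); }, imm);
}

// Signed read: reinterprets the stored value at its own width, then sign-extends.
std::int64_t AsS64(const Immediate& imm) {
    return std::visit(
        [](auto v) -> std::int64_t {
            using Signed = std::make_signed_t<decltype(v)>;
            return static_cast<Signed>(v);
        },
        imm);
}
```

The same 8-bit pattern 0xFF therefore reads as 255 through the unsigned accessor and as -1 through the signed one.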
Lioncash
d3263fd605
ir/value: Add an IsZero() member function to Value's interface
By far, one of the most common things to check for is whether or not a
value is zero, as it typically allows folding away unnecessary
operations (other close contenders that can help with eliding operations are 1 and -1).

So instead of requiring a check for an immediate and then actually
retrieving the integral value and checking it, we can wrap it within a
function to make it more convenient.
2018-10-04 05:04:22 -04:00
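The convenience that d3263fd605 describes can be shown with a toy class. The member names IsImmediate(), GetImmediateAsU64(), and IsZero() appear in this log, but the class body below is a hypothetical miniature and not the project's actual Value type.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical miniature of an IR value: it either wraps a known constant or is unknown.
class Value {
public:
    Value() = default;  // non-immediate: nothing is known at translation time
    explicit Value(std::uint64_t imm) : imm_(imm) {}

    bool IsImmediate() const { return imm_.has_value(); }

    // Precondition: IsImmediate() is true.
    std::uint64_t GetImmediateAsU64() const { return *imm_; }

    // Wraps the "is it an immediate, and is that immediate zero" check in one call.
    bool IsZero() const { return IsImmediate() && GetImmediateAsU64() == 0; }

private:
    std::optional<std::uint64_t> imm_;
};
```

Callers can then write `if (operand.IsZero())` instead of spelling out both conditions at every folding site.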
Merry
ba7a5b6c98
Merge pull request #401 from lioncash/folding
constant_propagation_pass: Fold &, |, ^, and ~ operations where applicable
2018-10-02 21:22:58 +01:00
Lioncash
7868411de5
constant_propagation_pass: Fold NOT operations 2018-10-01 18:53:54 -04:00
Lioncash
0d1f9841d1
constant_propagation_pass: Fold OR operations 2018-10-01 18:53:54 -04:00
Lioncash
142978041d
constant_propagation_pass: Fold AND operations 2018-10-01 18:53:54 -04:00
Lioncash
be3ba545e7
ir/value: Add member function to check whether or not all bits of a contained value are set
This is useful when we wish to know if a contained value is something
like 0xFFFFFFFF, as this helps perform constant folding. For example the
operation: x & 0xFFFFFFFF can be folded to just x in the 32-bit case.
2018-10-01 18:53:47 -04:00
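The "all bits set" check from be3ba545e7 and the identity it enables can be sketched as below. `AllOnes`, `HasAllBitsSet`, and `And32` are hypothetical helper names, and the width is passed explicitly here, which may not match how the member function added by the commit works.

```cpp
#include <cstdint>

// Mask with the low bit_width bits set (bit_width is expected to be in 1..64).
constexpr std::uint64_t AllOnes(unsigned bit_width) {
    return bit_width >= 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << bit_width) - 1;
}

// True when value is exactly the all-ones pattern for the given width.
constexpr bool HasAllBitsSet(std::uint64_t value, unsigned bit_width) {
    return value == AllOnes(bit_width);
}

constexpr std::uint32_t And32(std::uint32_t x, std::uint32_t mask) {
    return x & mask;
}

// 0xFFFFFFFF is all-ones for 32 bits, so AND-ing with it leaves x unchanged.
static_assert(HasAllBitsSet(0xFFFFFFFFu, 32));
static_assert(And32(0x12345678u, 0xFFFFFFFFu) == 0x12345678u);
```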
MerryMage
4e6848d1c9 A32/ir_emitter: Bugfix: ExceptionRaised was producing incorrect PC
Use actual PC and not pipelined PC.
2018-09-30 19:39:52 +01:00
Lioncash
f5233bfc69
constant_propagation_pass: Fold EOR operations
It's possible to fold cases of exclusive OR operations if they can be
known to be an identity operation, or if both operands happen to be known
immediates, in which case we can just store the result of the
exclusive-OR directly.
2018-09-29 03:59:15 -04:00
Lioncash
41ba9fd7bc value: Move ImmediateToU64() to be a part of Value's interface
This'll make it slightly nicer to do basic constant folding for 32-bit
and 64-bit variants of the same IR opcode type. By that, I mean it's
possible to inspect immediate values without a bunch of conditional
checks beforehand to verify that it's possible to call GetU32(),
GetU64(), etc.
2018-09-28 22:19:11 +01:00
MerryMage
c6a6271f86 reg_alloc: Emit AVX instructions where able
Smaller codesize.
2018-09-28 21:12:48 +01:00
MerryMage
aedd32aa20 abi: Emit AVX instructions where able
Smaller codesize.
2018-09-28 21:12:17 +01:00
MerryMage
f2d9337663 a64_exclusive_monitor: Loosen memory ordering requirements
It is not necessary to be as strict as it was.
2018-09-27 16:21:56 +01:00
MerryMage
14dd45eed9 Fix VShift terminology
An arithmetic shift is by definition a signed shift, and a logical shift is by definition an unsigned shift.

- Rename VectorLogicalVShiftS* -> VectorArithmeticVShift*
- Rename VectorLogicalVShiftU* -> VectorLogicalVShift*
2018-09-23 10:50:39 +01:00
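The terminology fixed in 14dd45eed9 comes down to what fills the vacated high bits on a right shift. The scalar sketch below only illustrates that distinction; the function names are hypothetical, and the vector opcodes themselves are not modelled.

```cpp
#include <cstdint>

// Logical (unsigned) right shift: vacated high bits are filled with zeros.
// amount is expected to be in 0..31.
constexpr std::uint32_t LogicalShiftRight32(std::uint32_t value, unsigned amount) {
    return value >> amount;
}

// Arithmetic (signed) right shift: vacated high bits are filled with copies of the sign bit.
// Written on the unsigned bit pattern so the result does not depend on how the
// compiler treats right shifts of negative signed integers.
constexpr std::uint32_t ArithmeticShiftRight32(std::uint32_t value, unsigned amount) {
    const std::uint32_t shifted = value >> amount;
    const std::uint32_t sign_fill = (value & 0x80000000u) ? ~(0xFFFFFFFFu >> amount) : 0u;
    return shifted | sign_fill;
}
```

For a value with the top bit clear the two functions agree; they differ only when the sign bit is set.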
MerryMage
88554c4f04 emit_x64_vector: AVX512 implementation of EmitVectorLogicalVShiftS16 2018-09-23 10:41:41 +01:00
MerryMage
ab4e316b24 emit_x64_vector: AVX512 implementation of EmitVectorLogicalVShiftS64 2018-09-23 10:41:30 +01:00
MerryMage
0ea84f34fe emit_x64_vector: AVX2 implementation of EmitVectorLogicalVShiftS32 2018-09-23 10:41:10 +01:00
MerryMage
c77a2c5bab emit_x64_vector: AVX512 implementation of EmitVectorLogicalVShiftU16() 2018-09-23 10:14:20 +01:00
MerryMage
e9441fd561 emit_x64_vector: AVX2 implementation of EmitVectorLogicalVShiftU64() 2018-09-23 10:14:20 +01:00
MerryMage
0e9c33cbf3 emit_x64_vector: AVX2 implementation of EmitVectorLogicalVShiftU32() 2018-09-23 10:14:20 +01:00
Lioncash
8f8527407c emit_x64_vector: SSSE3 variant of EmitVectorCountLeadingZeros8()
pshufb lyfe
2018-09-23 10:13:08 +01:00
VelocityRa
bc328fc645 decoders: Cast to correctly-sized type before shifting
Fixes decoding for 64-bit instructions

Does not apply to any currently supported ARM versions (since all of
their instructions are 32 bits long or shorter); it's for future-proofing
should such an arch be supported.
2018-09-22 22:10:29 +03:00
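The class of bug addressed by bc328fc645 can be illustrated with a mask helper. This is a sketch under assumptions: `BitMask` is a hypothetical function, and the actual decoder code changed by the commit is not shown in this log.

```cpp
#include <cstddef>
#include <cstdint>

// Builds a single-bit mask of type OpcodeT.
// The literal 1 has type int, so writing `1 << bit_position` performs the shift at
// int width and is undefined for positions of 32 or more on typical platforms.
// Casting to the opcode type first makes the shift happen at the intended width.
template <typename OpcodeT>
constexpr OpcodeT BitMask(std::size_t bit_position) {
    return static_cast<OpcodeT>(static_cast<OpcodeT>(1) << bit_position);
}

static_assert(BitMask<std::uint32_t>(31) == 0x80000000u);
static_assert(BitMask<std::uint64_t>(40) == 0x10000000000ULL);
```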
MerryMage
9c3d2d104d a64_emit_x64: Lowercase PAGE_SIZE
PAGE_SIZE is defined as a macro by musl.
2018-09-22 18:54:49 +01:00
MerryMage
f538d29be7 emit_x64_vector_floating_point: SSE4.1 implementation of EmitFPVectorToFixed 2018-09-22 18:47:15 +01:00
MerryMage
1603a6e9f8 emit_x64_vector_floating_point: EmitFPVectorRoundInt: Use FCODE 2018-09-22 16:19:54 +01:00
MerryMage
2e1ccaff53 emit_x64_vector: AVX implementation for EmitVectorCountLeadingZeros8 2018-09-22 16:08:23 +01:00
MerryMage
555bfdacf9 emit_x64_vector: SSE implementation of EmitVectorCountLeadingZeros16 2018-09-22 13:05:47 +01:00
MerryMage
171d11659d A64: Implement SCVTF, UCVTF (vector, fixed-point), scalar variant 2018-09-19 20:11:14 +01:00
MerryMage
f221bb0095 emit_x64_floating_point: Reduce fallback LUT code in EmitFPToFixed 2018-09-19 20:11:14 +01:00
MerryMage
eb123e2a74 A64: Implement FCVTZS, FCVTZU, UCVTF, SCVTF (vector, fixed-point), vector variant 2018-09-19 19:47:28 +01:00
Lioncash
487d37a4a1 A64: Implement UQSHL's vector immediate and register variants 2018-09-19 12:13:22 +01:00
Lioncash
f69893345f ir: Add opcodes for unsigned saturating left shifts 2018-09-19 12:13:22 +01:00
Lioncash
7148e661f6 A64/translate/impl: Make signatures consistent for unimplemented by-element SIMD variants
Makes them all consistent, so it isn't necessary to change the
prototypes over when implementing them.
2018-09-19 12:11:35 +01:00
Lioncash
fdde4ca363 A64: Implement BRK
Currently, we can just implement this as part of the exception
interface, similar to how it's done for the A32 interface with BKPT.
2018-09-19 07:09:27 +01:00
Lioncash
b1490db0e9 A64/imm: Add full range of comparison operators to Imm template
Makes the comparison interface consistent by providing all of the
relevant members. This also modifies the comparison operators to take
the Imm instance by value, as it's really only a u32 under the covers,
and it's cheaper to shuffle around a u32 than a 64-bit pointer address.
2018-09-19 07:09:00 +01:00
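A full set of by-value comparison operators, as described in b1490db0e9, might look like the sketch below. The class is a hypothetical miniature of an immediate wrapper: its members, the `Raw()` accessor, and the lack of range checking are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical miniature of a fixed-width immediate wrapper around a u32.
template <std::size_t bit_size>
class Imm {
public:
    constexpr explicit Imm(std::uint32_t value) : value_(value) {}

    constexpr std::uint32_t Raw() const { return value_; }

    // Operands are taken by value: the object is only as large as a u32,
    // so copying it is at least as cheap as passing a reference.
    friend constexpr bool operator==(Imm a, Imm b) { return a.value_ == b.value_; }
    friend constexpr bool operator!=(Imm a, Imm b) { return !(a == b); }
    friend constexpr bool operator<(Imm a, Imm b) { return a.value_ < b.value_; }
    friend constexpr bool operator<=(Imm a, Imm b) { return !(b < a); }
    friend constexpr bool operator>(Imm a, Imm b) { return b < a; }
    friend constexpr bool operator>=(Imm a, Imm b) { return !(a < b); }

private:
    std::uint32_t value_;
};

static_assert(sizeof(Imm<5>) == sizeof(std::uint32_t));
```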
MerryMage
1ec40ef6ed IR: Add fbits argument to FPVectorFrom{Signed,Unsigned}Fixed 2018-09-18 21:46:17 +01:00
MerryMage
d6d5e986c2 A64: Implement SCVTF, UCVTF (scalar, fixed-point) 2018-09-18 21:08:06 +01:00
MerryMage
6513595c09 opcodes.inc: Align columns to a tabstop of 4 2018-09-18 20:37:03 +01:00