Lioncash
7627fa389b
A64: Enable FMOV (general) for half-precision floating point
...
This just transfers values between vector registers and general-purpose
registers with no conversions performed, so this is trivial to add
support for half-precision to.
2019-03-22 15:18:12 -04:00
Merry
004adfe844
Merge pull request #455 from lioncash/sqrdmulh-scalar
...
A64: Implement SQRDMULH and SQDMULL's scalar indexed variants
2019-03-22 12:43:53 +00:00
Merry
a3a41a91cd
Merge pull request #454 from lioncash/sqrdmulh
...
A64: Implement SQRDMULH and SQDMULL{2}'s vector indexed element variants
2019-03-21 19:14:30 +00:00
Lioncash
5c6becc402
A64: Implement SQRDMULH's scalar indexed element variant
2019-03-20 23:26:55 -04:00
Lioncash
2d78390d43
A64: Implement SQDMULL{2}'s scalar indexed element variant
2019-03-20 23:05:07 -04:00
Lioncash
3238864878
simd_scalar_x_indexed_element: Factor out index and Vm argument construction
...
This will be useful in the implementations of SQRDMULH and SQDMULL{2} as
well.
2019-03-20 22:28:46 -04:00
Lioncash
996a618643
simd_vector_x_indexed_element: Deduplicate index and Vm operand construction
2019-03-20 16:21:48 -04:00
Lioncash
328211b0c5
A64: Implement SQDMULL{2}'s by-element variant
2019-03-20 15:36:27 -04:00
Lioncash
b4ca6b67d1
A64: Implement SQRDMULH's by-index vector variant
2019-03-20 14:05:41 -04:00
Merry
6e8b7d27ec
Merge pull request #452 from lioncash/frecpx
...
A64: Implement FRECPX's half-precision floating-point variant
2019-03-10 20:43:55 +00:00
Lioncash
f8ad1819ab
A64: Implement FRECPX's half-precision floating point variant
2019-03-09 20:08:01 -05:00
Lioncash
db67a42244
frontend/ir/ir_emitter: Amend FPRecipExponent to handle half-precision floating point
2019-03-09 20:08:01 -05:00
Lioncash
8036a54a74
frontend/ir/value: Add U16U32U64 type to represent floating point types
2019-03-09 20:08:01 -05:00
Lioncash
1e0933907a
common/fp/op/FPRecipExponent: Add half-precision floating point specialization
2019-03-09 20:07:53 -05:00
Lioncash
5117997cde
common/fp/unpacked: Correct edge-cases within FPUnpack for half-precision floating point
...
This corrects one case where floating-point exceptions could be set when
they're not supposed to be.
This also corrects a case where values were being treated as NaNs when
they weren't supposed to be.
2019-03-09 19:21:16 -05:00
Merry
d3e242b4af
Merge pull request #451 from lioncash/unpck
...
common/fp: Minor adjustments for half-precision floating point support
2019-03-09 16:05:42 +00:00
Lioncash
1b66e43094
common/fp/process_nan: Add half-precision instantiations for NaN processing functions
2019-03-09 02:26:58 -05:00
Lioncash
2f7f75ff09
common/fp/unpacked: Add half-precision instantiation of FPRoundBase
2019-03-09 02:26:47 -05:00
Lioncash
74e230de9d
common/fp/unpacked: Handle half-precision unpacking in FPUnpackBase
2019-03-09 01:19:55 -05:00
Lioncash
9c96c4c9fc
common/fp/unpacked: Adjust FPUnpack to operate like ARM pseudocode
...
This function is defined as always disabling the AHP bit in the fpcr
before performing any operations.
At the same time, rename the original FPUnpack function to FPUnpackBase
to match the pseudocode in the ARM reference manual.
2019-03-09 00:08:12 -05:00
Merry
c535f4bf89
Merge pull request #448 from lioncash/saturate
...
A64: Implement SQSHRN, SQSHRUN, and UQSHRN's scalar variants
2019-03-08 18:50:18 +00:00
Merry
b3d6f40dbb
Merge pull request #449 from lioncash/hp
...
common/fp/info: Add specialization of FPInfo for half-precision floating point
2019-03-08 18:49:18 +00:00
Merry
d2fbcc6dde
Merge pull request #450 from lioncash/cv
...
common/fp/unpacked: Add FPRoundCV and FPUnpackCV
2019-03-08 12:05:52 +00:00
Lioncash
6dad779547
common/fp/unpacked: Add FPRoundCV
...
Corresponds to the equivalent pseudocode within the ARMv8 reference
manual. This will be necessary for supporting half-precision
floating-point.
This also makes use of it within FPConvert
2019-03-08 06:24:54 -05:00
Lioncash
deca5e5dae
common/fp/unpacked: Add FPUnpackCV
...
Adds a template function that performs the same behavior as in the ARM
pseudocode, and utilizes it in FPConvert, which will be necessary for
half-float support.
2019-03-08 06:20:54 -05:00
Lioncash
d1a1434af6
common/fp/info: Add specialization of FPInfo for half-precision floating point
...
Puts the necessary info struct in place for further use.
2019-03-08 03:50:48 -05:00
Lioncash
203678bd9e
A64: Implement SQSHRN, SQSHRUN, and UQSHRN's scalar variants
...
These can just be implemented in terms of the vector variants for the
time being.
2019-03-08 03:23:56 -05:00
Lioncash
fb3847b044
A64: Amend prototypes of some SIMD scalar shift by immediate opcodes
...
These take a vector for a destination.
2019-03-08 00:20:24 -05:00
Merry
40339b1278
Merge pull request #447 from lioncash/flag
...
A64: Implement CFINV, RMIF, AXFlag and XAFlag
2019-03-07 16:17:13 +00:00
Lioncash
8d115ae80d
ir_opt/a64_get_set_elimination_pass: Add handling for NZCV raw get and set operations
2019-03-07 03:57:18 -05:00
Lioncash
5b66ae2a5a
A64: Implement AXFlag and XAFlag
2019-03-06 18:39:19 -05:00
Lioncash
080d163d39
A64: Implement RMIF
2019-03-06 18:39:19 -05:00
Lioncash
9cd84c5149
A64: Implement CFINV
2019-03-06 18:39:12 -05:00
Merry
04f09eb644
Merge pull request #442 from lioncash/fcvtxn
...
A64: Implement scalar and vector variants of FCVTXN
2019-03-06 20:27:59 +00:00
Lioncash
27af30d7c3
ir: Add A64-specific opcodes for getting and setting raw NZCV values
...
This will be necessary to implement the flag manipulation and flag
format instructions.
2019-03-06 14:17:27 -05:00
Lioncash
cdfdd95e63
A64: Implement the vector version of FCVTXN
2019-03-06 12:05:24 -05:00
Lioncash
e3ba07971e
A64: Implement the scalar version of FCVTXN
2019-03-06 12:05:24 -05:00
Lioncash
f44cafe3ca
frontend/ir/ir_emitter: Alter parameters of FPDoubleToSingle() and FPSingleToDouble() to pass along desired rounding mode
...
This will be necessary to special-case the non-IEEE Von Neumann rounding
to odd rounding mode.
2019-03-06 12:05:20 -05:00
Merry
d0ae8dcf97
Merge pull request #446 from lioncash/sqshl
...
A64: Implement scalar variants of SQSHL (register) and UQSHL (register)
2019-03-06 14:14:41 +00:00
Merry
948984bc94
Merge pull request #445 from lioncash/sqrt
...
A64: Implement single and double-precision vector variant of FSQRT
2019-03-06 14:14:21 +00:00
Merry
768a2a6973
Merge pull request #443 from lioncash/flag
...
A64: Rearrange flag format/manipulation instructions
2019-03-06 14:13:59 +00:00
Merry
a5ce4c7831
Merge pull request #441 from lioncash/constexpr
...
common/bit_util: Mark a few functions as constexpr
2019-03-05 19:57:12 +00:00
Merry
fa1671b655
Merge pull request #440 from lioncash/include
...
common/fp: Remove unnecessary includes
2019-03-05 19:56:58 +00:00
Merry
ecd544098b
Merge pull request #444 from lioncash/interpret
...
A64: Fall back to interpreting for FCADD and FCMLA half-precision variants
2019-03-05 18:51:50 +00:00
Lioncash
4429a0c8e5
A64: Implement UQSHL (register)'s scalar variant
...
This can be implemented in terms of the vector variant.
2019-03-04 14:26:25 -05:00
Lioncash
a3eeb334fa
A64: Implement SQSHL (register)'s scalar variant
...
We can implement this in terms of the vector variant.
2019-03-04 14:26:20 -05:00
Lioncash
16e6a8f343
A64: Implement single and double-precision vector variant of FSQRT
2019-03-04 13:37:27 -05:00
Lioncash
3e5a52cc66
frontend/ir: Add opcodes for vector square roots
2019-03-04 13:24:36 -05:00
Lioncash
d365d4083a
A64: Fall back to interpreting for FCADD and FCMLA half-precision variants
...
Rather than straight-up treating them as undefined, we can fall back to an
interpreter in this case.
2019-03-04 12:23:57 -05:00
Lioncash
36eba63641
A64: Rearrange flag format/manipulation instructions
...
Gives these instructions better categorical labeling.
2019-03-04 12:08:49 -05:00