Summary:
Writing support for three ACLE functions:
unsigned int __cls(uint32_t x)
unsigned int __clsl(unsigned long x)
unsigned int __clsll(uint64_t x)
CLS stands for "Count number of leading sign bits".
In AArch64, these two intrinsics can be translated into the 'cls'
instruction directly. In AArch32, on the other hand, this functionality
is achieved by implementing it in terms of clz (count number of leading
zeros).
Reviewers: compnerd
Reviewed By: compnerd
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D69250
This test is broken by design. Clang codegen tests should not depend
on llvm middle-end behaviour, they should *only* test clang codegen.
Yet this test runs whole optimization pipeline.
I've really tried to fix it, but there isn't just a few things
that depend on passes, but everything there does.
llvm-svn: 372015
The jcvt intrinsic defined in ACLE [1] is available when ARM_FEATURE_JCVT is defined.
This change introduces the AArch64 intrinsic, wires it up to the instruction and a new clang builtin function.
The __ARM_FEATURE_JCVT macro is now defined when an Armv8.3-A or higher target is used.
I've implemented the target detection logic in Clang so that this feature is enabled for architectures from armv8.3-a onwards (so -march=armv8.4-a also enables this, for example).
make check-all didn't show any new failures.
[1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics
Differential Revision: https://reviews.llvm.org/D64495
llvm-svn: 366197
As per the discussion on D58375, we disable test that have optimizations under
the new PM. This patch adds -fno-experimental-new-pass-manager to RUNS that:
- Already run with optimizations (-O1 or higher) that were missed in D58375.
- Explicitly test new PM behavior along side some new PM RUNS, but are missing
this flag if new PM is enabled by default.
- Specify -O without the number. Based on getOptimizationLevel(), it seems the
default is 2, and the IR appears to be the same when changed to -O2, so
update the test to explicitly say -O2 and provide -fno-experimental-new-pass-manager`.
Differential Revision: https://reviews.llvm.org/D63156
llvm-svn: 364066
Implemented the remaining integer data processing intrinsics from
the ARM ACLE v2.1 spec, such as parallel arithemtic and DSP style
multiplications.
Differential Revision: https://reviews.llvm.org/D32282
llvm-svn: 302131
These two intrinsics are defined in arm_acle.h.
__rev16l needs to rotate by 16 bits, bit it was actually rotating by 2 bits.
For AArch64, where long is 64 bits, this would still be wrong.
__rev16ll was incorrect, it reversed the bytes in each 32-bit word, rather than
each 16-bit halfword. The correct implementation is to apply __rev16 to the top
and bottom words of the 64-bit value.
For AArch32 targets, these get compiled down to the hardware rev16 instruction
at -O1 and above. For AArch64 targets, the 64-bit ones get compiled to two
32-bit rev16 instructions, because there is not currently a pattern for the
64-bit rev16 instruction.
Differential Revision: http://reviews.llvm.org/D14609
llvm-svn: 253211
in section 10.1, __arm_{w,r}sr{,p,64}.
This includes arm_acle.h definitions with builtins and codegen to support
these, the intrinsics are implemented by generating read/write_register calls
which get appropriately lowered in the backend based on the register string
provided. SemaChecking is also implemented to fault invalid parameters.
Differential Revision: http://reviews.llvm.org/D9697
llvm-svn: 239737
Summary:
ACLE 2.0 section 9.2 defines the following "miscellaneous data processing intrinsics": `__clz`, `__cls`, `__ror`, `__rev`, `__rev16`, `__revsh` and `__rbit`.
`__clz` has already been implemented in the arm_acle.h header file. The rest are not supported yet. This patch completes ACLE data processing intrinsics.
Reviewers: t.p.northover, rengolin
Reviewed By: rengolin
Subscribers: aemerson, mroth, llvm-commits
Differential Revision: http://reviews.llvm.org/D4983
llvm-svn: 216658
1. Revert "Add default feature for CPUs on AArch64 target in Clang"
at r210625. Then, all enabled feature will by passed explicitly by
-target-feature in -cc1 option.
2. Get "-mfpu" deprecated.
3. Implement support of "-march". Usage is:
-march=armv8-a+[no]feature
For instance, "-march=armv8-a+neon+crc+nocrypto". Here "armv8-a" is
necessary, and CPU names are not acceptable. Candidate features are
fp, neon, crc and crypto. Where conflicting feature modifiers are
specified, the right-most feature is used.
4. Implement support of "-mtune". Usage is:
-march=CPU_NAME
For instance, "-march=cortex-a57". This option will ONLY get
micro-architectural feature enabled specifying to target CPU,
like "+zcm" and "+zcz" for cyclone. Any architectural features
WON'T be modified.
5. Change usage of "-mcpu" to "-mcpu=CPU_NAME+[no]feature", which is
an alias to "-march={feature of CPU_NAME}+[no]feature" and
"-mtune=CPU_NAME" together. Where this option is used in conjunction
with -march or -mtune, those options take precedence over the
appropriate part of this option.
llvm-svn: 213353
Summary: This patch introduces ACLE header file, implementing extensions that can be directly mapped to existing Clang intrinsics. It implements for both AArch32 and AArch64.
Reviewers: t.p.northover, compnerd, rengolin
Reviewed By: compnerd, rengolin
Subscribers: rnk, echristo, compnerd, aemerson, mroth, cfe-commits
Differential Revision: http://reviews.llvm.org/D4296
llvm-svn: 211962