We currently support them only in AArch64. The NEON Reference,
however, says they are 'ARMv7, ARMv8' intrinsics.
Differential Revision: https://reviews.llvm.org/D47447
llvm-svn: 334361
This reverts commit 65243b6d19143cb7a03f68df0169dcb63e8b4632.
Seems like it's not a flake. It might have something to do with
the '*' character being in a command line.
llvm-svn: 334356
There were a few linux compilation failures, but other than that
I think this was just a flake that caused the tests to fail. I'm
going to resubmit and see if the failures go away, if not I'll
revert again.
llvm-svn: 334355
This reverts commit 10d2e88e87150a35dc367ba30716189d2af26774.
This is causing some test failures for some reason, reverting
while I investigate.
llvm-svn: 334354
This function was internal to Program.inc, but I've needed this
on several occasions when I've had to use CreateProcess without
llvm's sys::Execute functions. In doing so, I noticed that the
function was written using unsafe C-string access and was pretty
hard to understand / make sense of, so I've also re-written the
functions to use more modern LLVM constructs.
llvm-svn: 334353
This is a recommit of r333506, which was reverted in r333518.
The original commit message is below.
In r325551 many calls of malloc/calloc/realloc were replaces with calls of
their safe counterparts defined in the namespace llvm. There functions
generate crash if memory cannot be allocated, such behavior facilitates
handling of out of memory errors on Windows.
If the result of *alloc function were checked for success, the function was
not replaced with the safe variant. In these cases the calling function made
the error handling, like:
T *NewElts = static_cast<T*>(malloc(NewCapacity*sizeof(T)));
if (NewElts == nullptr)
report_bad_alloc_error("Allocation of SmallVector element failed.");
Actually knowledge about the function where OOM occurred is useless. Moreover
having a single entry point for OOM handling is convenient for investigation
of memory problems. This change removes custom OOM errors handling and
replaces them with calls to functions `llvm::safe_*alloc`.
Declarations of `safe_*alloc` are moved to a separate include file, to avoid
cyclic dependency in SmallVector.h
Differential Revision: https://reviews.llvm.org/D47440
llvm-svn: 334344
SmallSet forwards to SmallPtrSet for pointer types. SmallPtrSet supports iteration, but a normal SmallSet doesn't. So if it wasn't for the forwarding, this wouldn't work.
These places were found by hiding the begin/end methods in the SmallSet forwarding
llvm-svn: 334343
I think we assume poison, not undef, for certain transforms we
currently do. In any case, we should clarify the language here.
(This sort of conversion is undefined behavior according to the C
and C++ standards. And in practice, hardware implementations handle
overflow inconsistently, so it would be difficult to define the
result here.)
Differential Revision: https://reviews.llvm.org/D47851
llvm-svn: 334326
We need to clarify the language here. I think poison makes more sense
than undef, since it's an undefined operation rather than uninitialized
memory. I don't think anything depends on the difference at the moment,
though.
Differential Revision: https://reviews.llvm.org/D47859
llvm-svn: 334325
It looks like this got left in by accident in r289794; I can't think of
any reason this check would be necessary. (Maybe it was meant to be a
check that the AND has one use? But we check that a few lines earlier.)
Differential Revision: https://reviews.llvm.org/D47921
llvm-svn: 334322
An expression like
(zext i2 {(trunc i32 (1 + %B) to i2),+,1}<%while.body> to i32)
will become zero exactly when the nested value becomes zero in its type.
Strip injective operations from the input value in howFarToZero to make
the value simpler.
Differential Revision: https://reviews.llvm.org/D47951
llvm-svn: 334318
Summary:
If we can use comdats, then we can make it so that the global metadata
is thrown away if the prevailing definition of the global was
uninstrumented. I have only tested this on COFF targets, but in theory,
there is no reason that we cannot also do this for ELF.
This will allow us to re-enable string merging with ASan on Windows,
reducing the binary size cost of ASan on Windows.
Reviewers: eugenis, vitalybuka
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D47841
llvm-svn: 334313
Extension to D46954 (PR37426), this patch adds support for v8i16/v16i16 rotations in a similar manner - the conversion of the shift/rotate amount to a multiplication factor and the use of PMULLW to shift left and PMULHUW (ISD::MULHU) to shift the wrapped bits back around to be ORd together.
Differential Revision: https://reviews.llvm.org/D47822
llvm-svn: 334309
Summary: This patch originated from D46562 and is a proper subset, with some issues addressed for fsub.
Reviewers: spatel, hfinkel, wristow, arsenm
Reviewed By: spatel
Subscribers: wdng
Differential Revision: https://reviews.llvm.org/D47910
llvm-svn: 334306
This patch moves the recipe-creation functions out of
LoopVectorizationPlanner, which should do the high-level
orchestration of the transformations.
Reviewers: dcaballe, rengolin, hsaito, Ayal
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D47595
llvm-svn: 334305
As detailed on Agner's Microarchitecture doc (21.8 AMD Bobcat and Jaguar pipeline - Dependency-breaking instructions), these instructions are dependency breaking and fast-path zero the destination register (and appropriate EFLAGS bits).
llvm-svn: 334303
Same fix as rL334110: I noticed while working on zero-idiom + dependency-breaking support (PR36671) that most of our binary instruction schedule tests were reusing the same src registers, which would cause the tests to fail once we enable scalar zero-idiom support on btver2.
llvm-svn: 334302
AMDGPU inline assembler support i16, half and i128 typed variables in constraints, but they were reported as error.
Needed to fix https://github.com/RadeonOpenCompute/ROCm/issues/341,
e.g. to be able to load with global_load_dwordx4 to a 128bit integer variable
Differential Revision: https://reviews.llvm.org/D44920
llvm-svn: 334301
Summary:
`%ret = add nuw i8 %x, C`
From [[ https://llvm.org/docs/LangRef.html#add-instruction | langref ]]:
nuw and nsw stand for “No Unsigned Wrap” and “No Signed Wrap”,
respectively. If the nuw and/or nsw keywords are present,
the result value of the add is a poison value if unsigned
and/or signed overflow, respectively, occurs.
So if `C` is `-1`, `%x` can only be `0`, and the result is always `-1`.
I'm not sure we want to use `KnownBits`/`LVI` here, because there is
exactly one possible value (all bits set, `-1`), so some other pass
should take care of replacing the known-all-ones with constant `-1`.
The `test/Transforms/InstCombine/set-lowbits-mask-canonicalize.ll` change *is* confusing.
What happening is, before this: (omitting `nuw` for simplicity)
1. First, InstCombine D47428/rL334127 folds `shl i32 1, %NBits`) to `shl nuw i32 -1, %NBits`
2. Then, InstSimplify D47883/rL334222 folds `shl nuw i32 -1, %NBits` to `-1`,
3. `-1` is inverted to `0`.
But now:
1. *This* InstSimplify fold `%ret = add nuw i32 %setbit, -1` -> `-1` happens first,
before InstCombine D47428/rL334127 fold could happen.
Thus we now end up with the opposite constant,
and it is all good: https://rise4fun.com/Alive/OA9https://rise4fun.com/Alive/sldC
Was mentioned in D47428 review.
Follow-up for D47883.
Reviewers: spatel, craig.topper
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D47908
llvm-svn: 334298
Summary:
The function `llvm::sys::commandLineFitsWithinSystemLimits` appears to be overestimating the system limits. This issue was discovered while attempting to enable response files in the Swift compiler. When the compiler submits its frontend jobs, those jobs are subjected to the system limits on command line length. `commandLineFitsWithinSystemLimits` is used to determine if the job's arguments need to be wrapped in a response file. There are some cases where the argument size for the job passes `commandLineFitsWithinSystemLimits`, but actually exceeds the real system limit, and the job fails.
`clang` also uses this function to decide whether or not to wrap it's job arguments in response files. See: https://github.com/llvm-mirror/clang/blob/master/lib/Driver/Driver.cpp#L1341. Clang will also fail for response files who's size falls within a certain range. I wrote a script that should find a failure point for `clang++`. All that is needed to run it is Python 2.7, and a simple "hello world" program for `test.cc`. It should run on Linux and on macOS. The script is available here: https://gist.github.com/dabelknap/71bd083cd06b91c5b3cef6a7f4d3d427. When it hits a failure point, you should see a `clang: error: unable to execute command: posix_spawn failed: Argument list too long`.
The proposed solution is to mirror the behavior of `xargs` in `commandLinefitsWithinSystemLimits`. `xargs` defaults to 128k for the command line length size (See: https://fossies.org/dox/findutils-4.6.0/buildcmd_8c_source.html#l00551). It adjusts this depending on the value of `ARG_MAX`.
Reviewers: alexfh
Reviewed By: alexfh
Subscribers: llvm-commits
Tags: #clang
Patch by Austin Belknap!
Differential Revision: https://reviews.llvm.org/D47795
llvm-svn: 334295
NFC here, this just raises some platform specific ifdef hackery
out of a class and creates proper platform-independent typedefs
for the relevant things. This allows these typedefs to be
reused in other places without having to reinvent this preprocessor
logic.
llvm-svn: 334294