GNU ld doesn't support `--no-pic-executable`.
`-p` has been removed from likely the only use case (Linux kernel) for over 2.5 years: https://git.kernel.org/linus/091bb549f7722723b284f63ac665e2aedcf9dec9
`--no-add-needed` was the pre-binutils-2.23 spelling for `--no-copy-dt-needed-entries`.
The legacy alias is irrelevant in 2021.
This extends `optimizeCompareInstr` to continue the backwards search
when it reached the beginning of a basic block. If there is a single
predecessor block then we can just continue the search in that block and
mark the EFLAGS register as live-in.
Differential Revision: https://reviews.llvm.org/D110862
This changes the first part of `optimizeCompareInstr` being split into a
loop with a forward scan for cases that re-use zero flags from a
producer in case of compare with zero and a backward scan for finding an
instruction equivalent to a compare.
The code now uses a single backward scan searching for the next
instructions that reads or writes EFLAGS.
Also:
- Add comments giving examples for the 3 cases handled.
- Check `MI` which contains the result of the zero-compare cases,
instead of re-checking `IsCmpZero`.
- Tweak coding style in some loops.
- Add new MIR based tests that test the optimization in isolation.
This also removes a check for flag readers in situations like this:
```
= SUB32rr %0, %1, implicit-def $eflags
... we no longer stop when there are $eflag users here
CMP32rr %0, %1 ; will be removed
...
```
Differential Revision: https://reviews.llvm.org/D110857
The LangRef clearly states that branching on a undef or poison value is immediate undefined behavior, but historically, we have not been consistent about implementing that interpretation in the optimizer. Historically, we used (in some cases) a more relaxed model which essentially looked for provable UB along both paths which was control dependent on the condition. However, we've never been 100% consistent here. For instance SCEV uses the strong model for increments which form AddRecs (and only addrecs).
At the moment, the last big blocker for finally making this switch is enabling the fix landed in D106041. Loop unswitching (in it's classic form) is incorrect as it creates many "branch on poisons" when unswitching conditions originally unreachable within the loop.
This change adds a flag to value tracking which allows to easily test the optimization potential of treating branch on poison as immediate UB. It's intended to help ease work on getting us finally through this transition and avoid multiple independent rediscovers of the same issues.
Differential Revision: https://reviews.llvm.org/D112026
Fixes a crash observed by oss-fuzz in 39934. Issue at hand is that code expects a pattern match on m_Mul to imply the operand is a mul instruction, however mul constexprs are also valid here.
This is msan/dfsan data which does not need waste cache
of other sanitizers.
Depends on D111614.
Differential Revision: https://reviews.llvm.org/D111615
This patch fixes a bug in implementation `mergeSymbolIds` where symbol
identifiers were not unique after merging them. Asserts for checking uniqueness
before and after the merge are also added. The asserts checking uniqueness
after the merge fail without the fix on existing test cases.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D111958
As this API is now internally offset-based, we can accept a starting
offset and remove the need to create a temporary bitcast+gep
sequence to perform an offset load. The API now mirrors the
ConstantFoldLoadFromConst() API.
The test was removed temporarily in a539a847c9 to aid switching the RPC API in use in the LLJITWithRemoteDebugging example.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D110649
The functionality of this method is already covered by
computeExitCountExhaustively() in a more general fashion. It was
added at a time when exhaustive exit count calculation did not
support constant folding loads yet. I double checked that dropping
this code causes no binary changes in test-suite.
Differential Revision: https://reviews.llvm.org/D112343
This removes duplication and makes nesting more clear.
It also reduces the amount of changes necessary for exposing future options.
Differential revision: https://reviews.llvm.org/D112344
This patch adds a polynomial approximation that matches the
approximation in Eigen.
Note that the approximation only applies to vectorized inputs;
the scalar rsqrt is left unmodified.
The approximation is protected with a flag since it emits an AVX2
intrinsic (generated via the X86Vector). This is the only reasonably
clean way that I could find to generate the exact approximation that
I wanted (i.e. an identical one to Eigen's).
I considered two alternatives:
1. Introduce a Rsqrt intrinsic in LLVM, which doesn't exist yet.
I believe this is because there is no definition of Rsqrt that
all backends could agree on, since hardware instructions that
implement it have widely varying degrees of precision.
This is something that the standard could mandate, but Rsqrt is
not part of IEEE754, so I don't think this option is feasible.
2. Emit fdiv(1.0, sqrt) with fast math flags to allow reciprocal
transformations. Although portable, this doesn't allow us
to generate exactly the code we want; it is the LLVM backend,
and not MLIR, who controls what code is generated based on the
target CPU.
Reviewed By: ezhulenev
Differential Revision: https://reviews.llvm.org/D112192
The `clang` target used in the line below is only generated with `LLVM_ENABLE_PROJECTS=clang`.
Without this change, running `ninja clang` will fail with:
```
ninja: error: unknown target 'clang', did you mean 'clean'?
```
Reviewed By: xgupta
Differential Revision: https://reviews.llvm.org/D112257
Commit 8fa3e8fa14 added an implicit REP prefix to all VIA PadLock
instructions, but GNU as doesn't add one to xstore, only all the others.
This resulted in a kernel panic regression in FreeBSD upon updating to
LLVM 11 (https://bugs.freebsd.org/259218) which includes the commit in
question. This partially reverts that commit.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D112355
We currently only test the encoding of xstore but none of the other
instructions, which should all have their implicit REP prefix be
verified as working.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D112354
The clang-aarch64-full-2stage buildbot is complaining about a
warning with three instances in f18 code (none modified recently).
The warning is for using the | bitwise OR operator on bool operands.
In one instance, the bitwise operator was being used instead of the
logical || operator in order to avoid short-circuting. The fix
requires using some temporary variables. In the other two instances,
the bitwise operator seemed more idiomatic in context, but can be
replaced without harm with the logical operator.
Pushing without review as confidence is high and nobody wants
a buildbot to stay sad for long.
Change buffer_unique_ostream's constructor to call
raw_ostream::SetUnbuffered() on its owned stream. Otherwise,
buffer_unique_ostream's destructor could cause the owned stream to
temporarily allocate a buffer only to be immediately flushed.
Also add some tests for buffer_ostream and buffer_unique_ostream. Use
the same naming scheme as other raw_ostream-related tests (e.g.,
`raw_ostreamTest` for the fixture, `raw_ostream_test.cpp` for the
filename).
(I considered changing buffer_ostream in the same way (calling
SetUnbuffered on the referenced stream), but that seemed like overreach
since the client may have more things to write.)
(I considered merging buffer_ostream and buffer_unique_ostream into a
single class (with a `raw_ostream&` and a `std::unique_ptr` that is only
sometimes used), but that makes the class bigger and the small amount of
code deduplication seems uncompelling.)
Differential Revision: https://reviews.llvm.org/D110369
The 'A' edit descriptor once served as a form of raw I/O of bytes
to/from variables that weren't of type CHARACTER (which itself
didn't exist until F'77). This usage was especially common for
output of numeric variables that had been initialized with Hollerith.
Differential Revision: https://reviews.llvm.org/D112346
NAMELIST input can contain array subscripts with triplet notation.
The calculation of the default effective stride for the constructed
array descriptor was simply incorrect after the first dimension.
Differential Revision: https://reviews.llvm.org/D112347
A build-time check in a template class instantiation was applying
a test that's meaningful only for numeric types.
Differential Revision: https://reviews.llvm.org/D112345
A CHARACTER variable used as an output format may contain
unquoted tab characters, which are treated as if they had
been quoted. This is an extension supported by all other
Fortran compilers to which I have access.
Differential Revision: https://reviews.llvm.org/D112350
ExternalFileUnit::BeginReadingRecord() must be called at least once
during an external formatted READ statement before FinishReadingRecord().
In the case of a formatted external READ with no data items, the call
to finish processing of the format (which might have lingering control
items that need doing) was taking place before the call to BeginReadingRecord
from ExternalIoStatementState::EndIoStatement. Add a call to
BeginReadingRecord on this path.
Differential Revision: https://reviews.llvm.org/D112351
Implemented by patching python config instead of modifying all
the tests so that -generic and XFAIL work as usual. Expectation is for
this to be reverted once the old runtime is deleted.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D112225