Summary: This change provides a common optimization path for both Unsafe and FMF driven optimization for this fsub fold adding reassociation, as it the flag that most closely represents the translation
Reviewers: spatel, wristow, arsenm
Reviewed By: spatel
Subscribers: wdng
Differential Revision: https://reviews.llvm.org/D50195
llvm-svn: 339357
This patch introduces tablegen class MCStatement.
Currently, an MCStatement can be either a return statement, or a switch
statement.
```
MCStatement:
MCReturnStatement
MCOpcodeSwitchStatement
```
A MCReturnStatement expands to a return statement, and the boolean expression
associated with the return statement is described by a MCInstPredicate.
An MCOpcodeSwitchStatement is a switch statement where the condition is a check
on the machine opcode. It allows the definition of multiple checks, as well as a
default case. More details on the grammar implemented by these two new
constructs can be found in the diff for TargetInstrPredicates.td.
This patch makes it easier to read the body of auto-generated TargetInstrInfo
predicates.
In future, I plan to reuse/extend the MCStatement grammar to describe more
complex target hooks. For now, this is just a first step (mostly a minor
cosmetic change to polish the new predicates framework).
Differential Revision: https://reviews.llvm.org/D50457
llvm-svn: 339352
Summary:
Instead of just printing the current "False is not True, ..." message when we
fail to run a certain command, this patch also adds the actual command output or
error output that we received to the assertion message.
Reviewers: davide
Reviewed By: davide
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D50492
llvm-svn: 339351
Summary:
The interface to get size and spill size of a register
was moved from MCRegisterInfo to TargetRegisterInfo over
a year ago. Afaik the old interface has bee around
to give out-of-tree targets a chance to adapt to the
new interface.
One problem with the old MCRegisterClass::PhysRegSize was that
it represented the size of a register as "size in bits" / 8.
So a register had to be a multiple of eight bits wide for the
size to be correct (and the byte size for the target needed to
be eight bits).
Reviewers: kparzysz, qcolombet
Reviewed By: kparzysz
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D47199
llvm-svn: 339350
link.exe ignores REL32 relocations on 32-bit x86, as well as relocations
against non-function symbols such as labels. This makes lld do the same.
Differential Revision: https://reviews.llvm.org/D50430
llvm-svn: 339345
As a part of attempting to clean up the way attributes are
printed, this patch adds an operator << to the diagnostics/
partialdiagnostics so that ParsedAttr can be sent directly.
This patch also rewrites a large amount* of the times when
ParsedAttr was printed using its IdentifierInfo object instead
of being printed itself.
*"a large amount" == "All I could find".
llvm-svn: 339344
This test relies on communicating with debugserver via an unnamed (pre-opened)
pipe, but macOS's version of debugserver doesn't seem to support that mode of
operation. So disable the test for now.
llvm-svn: 339343
Summary:
Currently we consider one forward declared RecordDecl and another with a
definition equal. We have to do the same in case of enums.
Reviewers: a_sidorin, r.stahl, xazax.hun
Subscribers: rnkovacs, dkrupp, cfe-commits
Differential Revision: https://reviews.llvm.org/D50444
llvm-svn: 339336
As discussed on D41794, we have many cases where we fail to combine shuffles as the input operands have other uses.
This patch permits these shuffles to be combined as long as they don't introduce additional variable shuffle masks, which should reduce instruction dependencies and allow the total number of shuffles to still drop without increasing the constant pool.
However, this may mean that some memory folds may no longer occur, and on pre-AVX require the occasional extra register move.
This also exposes some poor PMULDQ/PMULUDQ codegen which was doing unnecessary upper/lower calculations which will in fact fold to zero/undef - the fix will be added in a followup commit.
Differential Revision: https://reviews.llvm.org/D50328
llvm-svn: 339335
This is a larger patch. This relocation has irregular immediate
masks that require a lookup to find the correct mask.
Differential Revision: https://reviews.llvm.org/D50450
llvm-svn: 339332
Summary:
This allows implementations like different symbol indexes to know what
the current active file is. For example, some customized index implementation
might decide to only return results for some files.
Reviewers: ilya-biryukov
Reviewed By: ilya-biryukov
Subscribers: javed.absar, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D50446
llvm-svn: 339320
As for Linux with its getrandom's syscall, giving the possibility to fill buffer with native call for good quality but falling back to /dev/urandom in worst case similarly.
Reviewers: vitalybuka, krytarowski
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D48804
llvm-svn: 339318
Summary:
Windows does not allow globals to be initialised to point to globals in
another DLL. Exported globals may be referenced only from code. Work
around this by creating an initialiser that runs in early library
initialisation and sets the isa pointer.
Reviewers: rjmccall
Reviewed By: rjmccall
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D50436
llvm-svn: 339317
According to PTX ISA .volatile has the same memory synchronization
semantics as .relaxed.sys, so it can be used to implement monotonic
atomic loads and stores. This is important for OpenMP's atomic
construct where
- 'read's and 'write's are lowered to atomic loads and stores, and
- an update of float or double types are lowered into a cmpxchg loop.
(Note that PTX could do better because it has atom.add.f{32,64} but
LLVM's atomicrmw instruction only allows integer types.)
Higher levels of atomicity (like acquire and release) need additional
synchronization properties which were added with PTX ISA 6.0 / sm_70.
So using these instructions still results in an error.
Differential Revision: https://reviews.llvm.org/D50391
llvm-svn: 339316
This pseudo-instruction is similar to la but uses PC-relative addressing
unconditionally. This is, la is only different to lla when using -fPIC. This
pseudo-instruction seems often forgotten in several specs but it is definitely
mentioned in binutils opcodes/riscv-opc.c. The semantics are defined both in
page 37 of the "RISC-V Reader" book but also in function macro found in
gas/config/tc-riscv.c.
This is a very first step towards adding PIC support for Linux in the RISC-V
backend.
The lla pseudo-instruction expands to a sequence of auipc + addi with a couple
of pc-rel relocations where the second points to the first one. This is
described in
https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md#pc-relative-symbol-addresses
For now, this patch only introduces support of that pseudo instruction at the
assembler parser.
Differential Revision: https://reviews.llvm.org/D49661
llvm-svn: 339314
We upstreamed the export of isl_val_2exp, to the official cpp bindings.
In this process, we concluded that pow2 is a better and more widely used
name for this functionality. Hence, both the official isl-cpp bindings
and our derived variant use now the term pow2.
llvm-svn: 339312
The main interesting case is a fence in an otherwise dead loop or one containing only arithmetic. This can happen as a result of DSE or other transforms from seemingly reasonable initial IR.
llvm-svn: 339310
Summary: DenseMap's operator[] performs an insertion if the entry isn't found. The second phase of ConstantMerge isn't trying to insert anything: it's just looking to see if the first phased performed an insertion. Use find instead, avoiding insertion of every single global initializer in the map of constants. This has the side-effect of making all entries in CMap non-null (because only global declarations would have null initializers, and that would be a bug).
Subscribers: dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D50476
llvm-svn: 339309
Changes the default Windows target triple returned by
GetHostTriple.cmake from the old environment names (which we wanted to
move away from) to newer, normalized ones. This also requires updating
all tests to use the new systems names in constraints.
Differential Revision: https://reviews.llvm.org/D47381
llvm-svn: 339307
Summary: I noticed that my code wasn't going deep into the loop vectorizer code so added another pass that makes it go further.
Reviewers: morehouse, kcc
Reviewed By: morehouse
Subscribers: cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D50482
llvm-svn: 339305
After https://reviews.llvm.org/D48800, shrink.test started failing on
x86_64h architecture.
Looking into this, the optimization pass is too eager to unroll the loop
on x86_64h, possibly leading to worse coverage data.
Alternative solutions include not unrolling the loop when fuzzing, or
disabling this test on that architecture.
Differential Revision: https://reviews.llvm.org/D50484
llvm-svn: 339303
Adding all libcall symbols to the link can have undesired consequences.
For example, the libgcc implementation of __sync_val_compare_and_swap_8
on 32-bit ARM pulls in an .init_array entry that aborts the program if
the Linux kernel does not support 64-bit atomics, which would prevent
the program from running even if it does not use 64-bit atomics.
This change makes it so that we only add libcall symbols to the
link before LTO if we have to, i.e. if the symbol's definition is in
bitcode. Any other required libcall symbols will be added to the link
after LTO when we add the LTO object file to the link.
Differential Revision: https://reviews.llvm.org/D50475
llvm-svn: 339301
isNegatibleForFree() should not matter here (as the test diffs show)
because it's always a win to replace an fsub+fadd with fneg. The
problem in D50195 persists because either (1) we are doing these
folds in the wrong order or (2) we're missing another fold for fadd.
llvm-svn: 339299
I don't know if it's possible to expose this diff in a test,
but we should always try simplifications (no new nodes created)
before more complicated transforms for efficiency (similar to
what we do in IR).
llvm-svn: 339298