Command lines with multiple `-arch` arguments expand into multiple entries in the compilation database. However, the file writes are not appending, meaning subsequent writes end up overwriting the previous ones, resulting in garbled output.
This patch fixes that by always appending to the file.
rdar://90165004
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D121997
This patch introduces the new -fdriver-only flag which instructs Clang to only execute the driver logic without running individual jobs. In a way, this is very similar to -###, with the following differences:
* it doesn't automatically print all jobs,
* it doesn't avoid side effects (e.g. it will generate compilation database when -MJ is specified).
This flag will be useful in testing D121997.
Reviewed By: dexonsmith, egorzhdan
Differential Revision: https://reviews.llvm.org/D127408
[flang]Add support for do concurrent
Upstreaming from fir-dev on https://github.com/flang-compiler/f18-llvm-project
Support for concurrent execution in do-loops.
A selection of tests are also added.
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D127240
These two functions are described in RVV intrinsics doc
to read/write RVV CSRs. This matches what GCC does.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D125875
Another issue unearthed by D127115
We take a long time to canonicalize an insert_vector_elt chain before being able to convert it into a build_vector - even if they are already in ascending insertion order, we fold the nodes one at a time into the build_vector 'seed', leaving plenty of time for other folds to alter it (in particular recognising when they come from extract_vector_elt resulting in a shuffle_vector that is much harder to fold with).
D127115 makes this particularly difficult as we're almost guaranteed to have the lost the sequence before all possible insertions have been folded.
This patch proposes to begin at the last insertion and attempt to collect all the (oneuse) insertions right away and create the build_vector before its too late.
Differential Revision: https://reviews.llvm.org/D127595
Unfortunately, it's not just constant expressions that can trap,
we might also have a trapping constant expression nested inside
a constant aggregate.
Perform the check during phi folding on Constant rather than
ConstantExpr, and extend the Constant::mayTrap() implementation
to also recursive into ConstantAggregates, not just ConstantExprs.
Fixes https://github.com/llvm/llvm-project/issues/49839.
This patch allows the same implicit conversions for vector-scalar
operations in SVE that are allowed for NEON.
Depends on D126377
Reviewed By: c-rhodes
Differential Revision: https://reviews.llvm.org/D126380
According D125377, we order STP Q's by ascending address. While on some
targets, paired 128 bit loads and stores are slow, so the STP will split
into STRQ and STUR, so I hope these stores will also be ordered.
Also add subtarget feature ascend-store-address to control the aggressive order.
Reviewed By: dmgreen, fhahn
Differential Revision: https://reviews.llvm.org/D126700
Currently the a AAPCS compliant frame record is not always created for
functions when it should. Although a consistent frame record might not
be required in some cases, there are still scenarios where applications
may want to make use of the call hierarchy made available trough it.
In order to enable the use of AAPCS compliant frame records whilst keep
backwards compatibility, this patch introduces a new command-line option
(`-mframe-chain=[none|aapcs|aapcs+leaf]`) for Aarch32 and Thumb backends.
The option allows users to explicitly select when to use it, and is also
useful to ensure the extra overhead introduced by the frame records is
only introduced when necessary, in particular for Thumb targets.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D125094
Simiarly to what's done on both ARM's and AArch64's frame lowering code,
this updates Thumb1FrameLowering to use the FrameDestroy Machine
Instruction flag to identify instructions inserted as part of the epilog
instead of relying on assumptions about specific machine instructions.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D126285
When pushing an operation across a phi node, we should avoid doing
so across a loop backedge. This is generally non-profitable, because
it does not reduce the number of times the operation is executed,
and could lead to an infinite combine loop.
The code was already guarding against this, but using an
insufficiently strong condition, which did not cover the case where
the operation was originally outside the loop (in which case the
transform moves the operation from outside the loop into the loop,
which is particularly undesirable).
Differential Revision: https://reviews.llvm.org/D127499
With opaque pointers, we end up merging these GEPs and dropping
the inrange attribute (in the last two cases). This did not happen
previously, because typed pointers use less powerful GEP folding logic.
I'm a bit unsure whether this is something we need to be concerned
about or not. I believe that generally our stance is that we should
perform folds even if this requires losing poison-generating flags
like inrange.
We can either a) accept this as-is, b) try to inhibit folding if it
requires dropping inrange or c) try to fold to poison if we know
that inrange is going to be violated.
For now, we accept it as-is.
Differential Revision: https://reviews.llvm.org/D127503
On 64-bit X86, 0x66 operand-size override prefix will change the size of
the instruction operand, e.g. from 32 bits to 16 bits, but it will not
modify the size of the displacement operand used for memory addressing,
which will always be 32 bits.
Reviewed By: skan, rafauler
Differential Revision: https://reviews.llvm.org/D126726
Just matter of enabling the config option.
(Also changed the platform of the input test file to macOS, since that's
the default that we specify in the `%lld` substitution. The conflict was
causing errors when linking with LTO.)
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D127600
D125496 changed the string layout on windows. Change `__is_long_` and `__size_` back to using `unsigned char` to fix the issue.
Reviewed By: Mordante, #libc
Spies: jloser, libcxx-commits, ayzhao
Differential Revision: https://reviews.llvm.org/D127566
This patch allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits in cases where the source operand has other uses, enabling us to peek through the shifted value if we don't demand all the bits/elts.
This helps with several of the regressions from D125836
These methods don't access any state from RISCVInstrInfo. Make them
free functions in the RISCV namespace.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D127583
If the LHS/RHS selection operands can be cheaply concatenated back together then replace 2 x 128-bit selection nodes with 1 x 256-bit node
Addresses the regression introduced in the bug fix from rGd5af6a38082b39ae520a328e44dc29ebcb036bb2
REAPPLIED with for bug identified in rGea8fb3b60196