This addresses a regression where pretty much all C++ compilations using
-frounding-math now fail, due to rounding being performed in constexpr
function definitions in the standard library.
This follows the "manifestly constant evaluated" approach described in
https://reviews.llvm.org/D87528#2270676 -- evaluations that are required
to succeed at compile time are permitted even in regions with dynamic
rounding modes, as are (unfortunately) the evaluation of the
initializers of local variables of const integral types.
Differential Revision: https://reviews.llvm.org/D89360
After investigation by @asbirlea, the issue that caused the
revert appears to be an issue in the original source, rather
than a problem with the compiler.
This patch enables MemorySSA DSE again.
This reverts commit 915310bf14.
This test was failing in our CI environment, because Jenkins mounts the workspaces into Docker containers using their full path, i.e. /home/jenkins/workspaces/llvm-build.
We've seen permission denied errors because /home/jenkins is mounted with root permissions and the default cache directory under Linux is $HOME/.cache.
The fix is to explicitly provide the -fmodules-cache-path, which the other tests already seem to provide.
Reviewed By: akyrtzi
Differential Revision: https://reviews.llvm.org/D89453
- The goal of this patch is improve option compatible with RISCV-V GCC,
-mcpu support on GCC side will sent patch in next few days.
- -mtune only affect the pipeline model and non-arch/extension related
target feature, e.g. instruction fusion; in td file it called
TuneFeatures, which is introduced by X86 back-end[1].
- -mtune accept all valid option for -mcpu and extra alias processor
option, e.g. `generic`, `rocket` and `sifive-7-series`, the purpose is
option compatible with RISCV-V GCC.
- Processor alias for -mtune will resolve according the current target arch,
rv32 or rv64, e.g. `rocket` will resolve to `rocket-rv32` or `rocket-rv64`.
- Interaction between -mcpu and -mtune:
* -mtune has higher priority than -mcpu for pipeline model and
TuneFeatures.
[1] https://reviews.llvm.org/D85165
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D89025
folding to not constant folding.
Constant folding of ICEs is done as a GCC compatibility measure, but new
code was picking it up, presumably by accident, due to the bad default.
While here, also switch the flag from a bool to an enum to make it more
obvious what it means at call sites. This highlighted a couple of places
where our behavior is different between C++11 and C++14 due to switching
from checking for an ICE to checking for a converted constant
expression (where there is no 'fold' codepath).
This patch adds -f[no-]split-cold-code CC1 options to clang. This allows
the splitting pass to be toggled on/off. The current method of passing
`-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose
correctly (say, with `-O0` or `-Oz`).
To implement the -fsplit-cold-code option, an attribute is applied to
functions to indicate that they may be considered for splitting. This
removes some complexity from the old/new PM pipeline builders, and
behaves as expected when LTO is enabled.
Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org>
Differential Revision: https://reviews.llvm.org/D57265
Reviewed By: Aditya Kumar, Vedant Kumar
Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar
rL131311 added `asm()` support for builtin functions, but `asm()` for builtins with
specialized emitting (e.g. memcpy, various math functions) still do not work.
This patch makes these functions work for `asm()` and `#pragma redefine_extname`.
glibc uses `asm()` to redirect internal libc function calls to hidden aliases.
Limitation: such a function is a builtin in clang, but will not be recognized as
a libcall in optimization passes because Clang does not annotate the renamed
function as a libcall. In GCC -O1 or above, `abs` can be optimized out but we can't.
Additionally, we cannot redirect `__builtin_sin` to `real_sin` in the following example:
double sin(double x) asm("real_sin");
double f(double d) { return __builtin_sin(d); }
---
According to @rsmith, the following three statements cannot be simultaneously true:
(1) The frontend function foo has known, builtin semantics X.
(2) The symbol foo has known, builtin semantics X.
(3) It's not correct to lower a call to the frontend function foo to the symbol foo.
People do want (1) (if it is profitable to expand a memcpy, do it).
This also means that people do not want to add -fno-builtin-memcpy.
People do want (3): that is why they use asm("__GI_memcpy") in the first place.
So unfortunately we make a compromise by not refuting (2) (see the limitation above).
For most libcalls, there is a small loss because compilers don't synthesize them.
For the few glibc cares about, it uses `asm("memcpy = __GI_memcpy");` to make
the assembly level redirection.
(Changing function names (e.g. `__memcpy`) is a hit to ergonomics which is not acceptable).
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D88712
This reverts commits 683b308c07 and
8487bfd4e9.
We will go for a more restricted approach that does not give freedom to
everyone to change ABIs on whichever platform.
See the discussion on https://reviews.llvm.org/D85802.
Prototype the newly proposed load_lane instructions, as specified in
https://github.com/WebAssembly/simd/pull/350. Since these instructions are not
available to origin trial users on Chrome stable, make them opt-in by only
selecting them from intrinsics rather than normal ISel patterns. Since we only
need rough prototypes to measure performance right now, this commit does not
implement all the load and store patterns that would be necessary to make full
use of the offset immediate. However, the full suite of offset tests is included
to make it easy to track improvements in the future.
Since these are the first instructions to have a memarg immediate as well as an
additional immediate, the disassembler needed some additional hacks to be able
to parse them correctly. Making that code more principled is left as future
work.
Differential Revision: https://reviews.llvm.org/D89366
Previously we failed to convert 'p' from array/function to pointer type,
and to represent the load of 'p' in the AST. The latter causes problems
for constant evaluation.
callee in constant evaluation.
We previously made a deep copy of function parameters of class type when
passing them, resulting in the destructor for the parameter applying to
the original argument value, ignoring any modifications made in the
function body. This also meant that the 'this' pointer of the function
parameter could be observed changing between the caller and the callee.
This change completely reimplements how we model function parameters
during constant evaluation. We now model them roughly as if they were
variables living in the caller, albeit with an artificially reduced
scope that covers only the duration of the function call, instead of
modeling them as temporaries in the caller that we partially "reparent"
into the callee at the point of the call. This brings some minor
diagnostic improvements, as well as significantly reduced stack usage
during constant evaluation.
This implements the flag proposed in RFC http://lists.llvm.org/pipermail/cfe-dev/2020-August/066437.html.
The goal is to add a way to override the default target C++ ABI through
a compiler flag. This makes it easier to test and transition between different
C++ ABIs through compile flags rather than build flags.
In this patch:
- Store `-fc++-abi=` in a LangOpt. This isn't stored in a
CodeGenOpt because there are instances outside of codegen where Clang
needs to know what the ABI is (particularly through
ASTContext::createCXXABI), and we should be able to override the
target default if the flag is provided at that point.
- Expose the existing ABIs in TargetCXXABI as values that can be passed
through this flag.
- Create a .def file for these ABIs to make it easier to check flag
values.
- Add an error for diagnosing bad ABI flag values.
Differential Revision: https://reviews.llvm.org/D85802
clang --target arm-none-eabi --print-libgcc-file-name --rtlib=compiler-rt
used to print `/path/to/lib/clang/version/lib/libclang_rt.builtins-arm.a`
but should print `/path/to/lib/clang/version/lib/baremetal/libclang_rt.builtins-arm.a`.
Similarly, --target armv7m-none-eabi should print libclang_rt.builtins-armv7m.a
This matches the compiler-rt file name used at link time in the
baremetal driver.
Reviewed By: manojgupta
Differential Revision: https://reviews.llvm.org/D89327
Summary:
This patch does the following:
1. Make InitTargetOptionsFromCodeGenFlags() accepts Triple as a
parameter, because some options' default value is triple dependant.
2. DataSections is turned on by default on AIX for llc.
3. Test cases change accordingly because of the default behaviour change.
4. Clang Driver passes in -fdata-sections by default on AIX.
Reviewed By: MaskRay, DiggerLin
Differential Revision: https://reviews.llvm.org/D88737
During the import of attributes we forgot to set the spelling list
index. This caused a segfault when we wanted to traverse the AST
(e.g. by the dump() method).
Differential Revision: https://reviews.llvm.org/D89318
During the import of FormatAttrs we forgot to import the type (e.g
`__scanf__`) of the attribute. This caused a segfault when we wanted to
traverse the AST (e.g. by the dump() method).
Differential Revision: https://reviews.llvm.org/D89319
callee in constant evaluation.
We previously made a deep copy of function parameters of class type when
passing them, resulting in the destructor for the parameter applying to
the original argument value, ignoring any modifications made in the
function body. This also meant that the 'this' pointer of the function
parameter could be observed changing between the caller and the callee.
This change completely reimplements how we model function parameters
during constant evaluation. We now model them roughly as if they were
variables living in the caller, albeit with an artificially reduced
scope that covers only the duration of the function call, instead of
modeling them as temporaries in the caller that we partially "reparent"
into the callee at the point of the call. This brings some minor
diagnostic improvements, as well as significantly reduced stack usage
during constant evaluation.
callee in constant evaluation.
We previously made a deep copy of function parameters of class type when
passing them, resulting in the destructor for the parameter applying to
the original argument value, ignoring any modifications made in the
function body. This also meant that the 'this' pointer of the function
parameter could be observed changing between the caller and the callee.
This change completely reimplements how we model function parameters
during constant evaluation. We now model them roughly as if they were
variables living in the caller, albeit with an artificially reduced
scope that covers only the duration of the function call, instead of
modeling them as temporaries in the caller that we partially "reparent"
into the callee at the point of the call. This brings some minor
diagnostic improvements, as well as significantly reduced stack usage
during constant evaluation.
AIX has different layout dumping format from other itanium ABIs.
And for these two cases, use regex to match AIX format.
Differential Revision: https://reviews.llvm.org/D89064
Change EmitAsmStmt() to
- Not tie physregs with the "+r" constraint, but instead add the hard
register as an input constraint. This makes "+r" and "=r":"r" look the same
in the output.
Background: Macro intensive user code may contain inline assembly
statements with multiple operands constrained to the same physreg. Such a
case (with the operand constraints "+r" : "r") currently triggers the
TwoAddressInstructionPass assertion against any extra use of a tied
register. Furthermore, TwoAddress will insert a COPY to that physreg even
though isel has already done so (for the non-tied use), which may lead to a
second redundant instruction currently. A simple fix for this is to not
emit tied physreg uses in the first place for the "+r" constraint, which is
what this patch does.
- Give an error on multiple outputs to the same physical register.
This should be reported and this is also what GCC does.
Review: Ulrich Weigand, Aaron Ballman, Jennifer Yu, Craig Topper
Differential Revision: https://reviews.llvm.org/D87279
This patch resumes the work of D16586.
According to the AAPCS, volatile bit-fields should
be accessed using containers of the widht of their
declarative type. In such case:
```
struct S1 {
short a : 1;
}
```
should be accessed using load and stores of the width
(sizeof(short)), where now the compiler does only load
the minimum required width (char in this case).
However, as discussed in D16586,
that could overwrite non-volatile bit-fields, which
conflicted with C and C++ object models by creating
data race conditions that are not part of the bit-field,
e.g.
```
struct S2 {
short a;
int b : 16;
}
```
Accessing `S2.b` would also access `S2.a`.
The AAPCS Release 2020Q2
(https://documentation-service.arm.com/static/5efb7fbedbdee951c1ccf186?token=)
section 8.1 Data Types, page 36, "Volatile bit-fields -
preserving number and width of container accesses" has been
updated to avoid conflict with the C++ Memory Model.
Now it reads in the note:
```
This ABI does not place any restrictions on the access widths of bit-fields where the container
overlaps with a non-bit-field member or where the container overlaps with any zero length bit-field
placed between two other bit-fields. This is because the C/C++ memory model defines these as being
separate memory locations, which can be accessed by two threads simultaneously. For this reason,
compilers must be permitted to use a narrower memory access width (including splitting the access into
multiple instructions) to avoid writing to a different memory location. For example, in
struct S { int a:24; char b; }; a write to a must not also write to the location occupied by b, this requires at least two
memory accesses in all current Arm architectures. In the same way, in struct S { int a:24; int:0; int b:8; };,
writes to a or b must not overwrite each other.
```
I've updated the patch D16586 to follow such behavior by verifying that we
only change volatile bit-field access when:
- it won't overlap with any other non-bit-field member
- we only access memory inside the bounds of the record
- avoid overlapping zero-length bit-fields.
Regarding the number of memory accesses, that should be preserved, that will
be implemented by D67399.
Reviewed By: ostannard
Differential Revision: https://reviews.llvm.org/D72932
Emit the equivalent integer reduction intrinsics in IR instead of expanding to shuffle+arithmetic sequences.
The fadd/fmul reductions might be trickier as they assume a similar bisection reduction while the generic intrinsics assume a sequential reduction (intel docs are ambiguous on the correct approach) - I'm not sure if we want to always tag them with reassoc? Anyway, that issue can wait until a separate fp patch along with the fmin/fmax reductions.
Differential Revision: https://reviews.llvm.org/D87604
References to different declarations of the same entity aren't different
values, so shouldn't have different representations.
Recommit of e6393ee813, most recently
reverted in 9a33f027ac due to a bug caused
by ObjCInterfaceDecls not propagating availability attributes along
their redeclaration chains; that bug was fixed in
e2d4174e9c.
chain for ObjCInterfaceDecls.
Only one such declaration can actually have attributes (the definition,
if any), but generally we assume that we can look for InheritedAttrs on
the most recent declaration.
They can get stale at use time because of updates from other recursive
specializations. Instead, rely on the existence of previous declarations to add
the specialization.
Differential Revision: https://reviews.llvm.org/D87853