This reverts these two commits
[InstCombine] Turn (extractelement <1 x i64/double> (bitcast (x86_mmx))) into a single bitcast from x86_mmx to i64/double.
[InstCombine] Don't transform bitcasts between x86_mmx and v1i64 into insertelement/extractelement
We're seeing at least one internal test failure related to a
bitcast that was previously before an inline assembly block
containing emms being placed after it. This leads to the mmx
state ending up not empty after the emms. IR has no way to
make any specific guarantees about this. Reverting these patches
to get back to previous behavior which at least worked for this
test.
Summary: This should allow further simplifications, but it's a first step.
Reviewers: teemperor, jingham, friss
Subscribers: lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D70983
Summary:
Or, rather, don't accidentally forget to pass it.
This is aimed to solve the problem discussed in [this thread](http://lists.llvm.org/pipermail/llvm-dev/2019-November/136890.html), and to fix [a year-old bug](https://bugs.llvm.org/show_bug.cgi?id=38468).
TL;DR: when building libunwind for ARM Linux, we **need** libunwind to be built with the `-funwind-tables` flag, because, well ARM EHABI needs unwind info produced by this flag. Without the flag all the procedures in libunwind are marked `.cantunwind`, which causes all sorts of bad things. From `_Unwind_Backtrace` not working, to C++ exceptions not being caught (which is the aforementioned bug is about).
Previously, this flag was not added because the CMake check `add_compile_flags_if_supported(-funwind-tables)` produced a false negative. Why? With this flag, the compiler generates calls to the `__aeabi_unwind_cpp_pr0` symbol, which is defined in libunwind itself and obviously is not available at configure time, before libunwind is built. This led to failure at link time during the CMake check. We handle this by disabling the linker for CMake checks in linbunwind.
Also, this patch introduces a lit feature `libunwind-arm-ehabi`, which is used to mark the `signal_frame.pass.cpp` test as unsupported (as was advised by @miyuki in D70397).
Reviewers: peter.smith, phosek, EricWF, compnerd, jroelofs, saugustine, miyuki, jfb
Subscribers: mgorny, kristof.beyls, christof, libcxx-commits, miyuki
Tags: #libc
Differential Revision: https://reviews.llvm.org/D70815
'python' means Python 2 on some platforms while Python 3 on others.
'python3' is Python 3 only. Python 2.7 End of Life is set to January 1,
2020. Getting rid of Python 2 support reduces maintenance burden.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D70730
Emit a gap region beginning where the switch body begins. This sets line
execution counts in the areas between non-overlapping cases to 0.
This also removes some special handling of the first case in a switch:
these are now treated like any other case.
This does not resolve an outstanding issue with case statement regions
that do not end when a region is terminated. But it should address
llvm.org/PR44011.
Differential Revision: https://reviews.llvm.org/D70571
Problem: `FILECHECK_OPTS` was implemented so that a test runner, such
as a bot, can specify FileCheck debugging options, such as
`-dump-input=fail`. However, some existing test suites have FileCheck
calls that already specify `-dump-input=fail` or `-dump-input=always`.
Without this patch, such tests fail under such a test runner because
FileCheck doesn't accept multiple occurrences of `-dump-input`.
Solution: This patch permits multiple occurrences of `-dump-input` by
assigning precedence to its values in the following descending order:
`help`, `always`, `fail`, and `never`. That is, any occurrence of
`help` always obtains help, and otherwise the behavior is similar to
`-v` vs. `-vv` in that the option specifying the greatest verbosity
has precedence.
Rationale: My justification for the new behavior is as follows. I
have not experienced use cases where, either as a test runner or as a
test author, I want to **limit** the permitted debugging verbosity
(except as a test author in FileCheck's or lit's test suites where the
FileCheck debugging output itself is under test, but the solution
there is `env FILECHECK_OPTS=`, and I imagine we should use the same
solution anywhere else this need might occur). Of course, as either a
test runner or test author, it is useful to **increase** debugging
verbosity.
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D70784
https://reviews.llvm.org/D70922
This adds a hook to allow targets to define exactly what extension
operation should be performed for widening constants. This handles cases
like widening i1 true which would end up becoming -1 which affects code
quality during combines.
Additionally, in order to stay consistent with how DAG is promoting
constants, we now signextend for byte sized types and zero extend
otherwise (by default). Targets can of course override this if
necessary.
Currently, when dumping the AST to JSON, the presumed file is what is included
when dumping a source location. This patch changes the behavior to instead dump
the actual file, and only dump a presumed file name when it differs from the
actual file.
This also corrects an issue with the test script generator that would prevent
it from working on Windows due to file permissions issues.
Fix PR40816: avoid considering scalar-with-predication instructions as also
uniform-after-vectorization.
Instructions identified as "scalar with predication" will be "vectorized" using
a replicating region. If such instructions are also optimized as "uniform after
vectorization", namely when only the first of VF lanes is used, such a
replicating region becomes erroneous - only the first instance of the region can
and should be formed. Fix such cases by not considering such instructions as
"uniform after vectorization".
Differential Revision: https://reviews.llvm.org/D70298
As it can be seen from accompanying cleanup, it is not unheard of
to write `~Known.Zero` meaning "what maximal value can this KnownBits
produce". But i think `~Known.Zero` isn't *that* self-explanatory,
as compared to a method with a name.
Note that not all `~Known.Zero` places were cleaned up,
only those where this arguably improves things.
When dealing with system libraries which are absolute paths, use the
absolute path rather than the `-l` option. This ensures that the system
library can be properly linked against. This is needed to enable using
proper link dependencies in CMake.
Summary:
In order to be compliant with tcmalloc's extension ownership
determination function, we have to expose a function that will
say if a chunk was allocated by us.
As to whether or not this has security consequences: someone
able to call this function repeatedly could use it to determine
secrets (cookie) or craft a valid header. So this should not be
exposed directly to untrusted user input.
Add related tests.
Additionally clang-format caught a few things to change.
Reviewers: hctim, pcc, cferris, eugenis, vitalybuka
Subscribers: JDevlieghere, jfb, #sanitizers, llvm-commits
Tags: #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D70908
Summary:
Make SLPVectorize to recognize homogeneous aggregates like
`{<2 x float>, <2 x float>}`, `{{float, float}, {float, float}}`,
`[2 x {float, float}]` and so on.
It's a follow-up of https://reviews.llvm.org/D70068.
Merged `findBuildVector()` and `findBuildAggregate()` to
one `findBuildAggregate()` function making it recursive
to recognize multidimensional aggregates. Aggregates required
to be homogeneous.
Reviewers: RKSimon, ABataev, dtemirbulatov, spatel, vporpo
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70587
SYCL is single source offload programming model relying on compiler to
separate device code (i.e. offloaded to an accelerator) from the code
executed on the host.
Here is code example of the SYCL program to demonstrate compiler
outlining work:
```
int foo(int x) { return ++x; }
int bar(int x) { throw std::exception("CPU code only!"); }
...
using namespace cl::sycl;
queue Q;
buffer<int, 1> a(range<1>{1024});
Q.submit([&](handler& cgh) {
auto A = a.get_access<access::mode::write>(cgh);
cgh.parallel_for<init_a>(range<1>{1024}, [=](id<1> index) {
A[index] = index[0] + foo(42);
});
}
...
```
SYCL device compiler must compile lambda expression passed to
cl::sycl::handler::parallel_for method and function foo called from this
lambda expression for an "accelerator". SYCL device compiler also must
ignore bar function as it's not required for offloaded code execution.
This patch adds the sycl_kernel attribute, which is used to mark code
passed to cl::sycl::handler::parallel_for as "accelerated code".
Attribute must be applied to function templates which parameters include
at least "kernel name" and "kernel function object". These parameters
will be used to establish an ABI between the host application and
offloaded part.
Reviewers: jlebar, keryell, Naghasan, ABataev, Anastasia, bader, aaron.ballman, rjmccall, rsmith
Reviewed By: keryell, bader
Subscribers: mgorny, OlegM, ArturGainullin, agozillon, aaron.ballman, ebevhan, Anastasia, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60455
Signed-off-by: Alexey Bader <alexey.bader@intel.com>
The naming used by editline for the history operations is counter
intuitive to how it's used in lldb for the REPL.
- The H_PREV operation returns the previous element in the history,
which is newer than the current one.
- The H_NEXT operation returns the next element in the history, which
is older than the current one.
This exposed itself as a bug in the REPL where the behavior of up- and
down-arrow was inverted. This wasn't immediately obvious because of how
we save the current "live" entry.
This patch fixes the bug and introduces and enum to wrap the editline
operations that match the semantics of lldb.
Differential revision: https://reviews.llvm.org/D70932
Since lambdas are represented by callable objects, we add
generic addr space for implicit object parameter in call
operator.
Any lambda variable declared in __constant addr space
(which is not convertible to generic) fails to compile with
a diagnostic. To support constant addr space we need to
add a way to qualify the lambda call operators.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69938
To ensure a reproducer works correctly, the version of LLDB used for
capture and replay must match. Right now the reproducer already contains
the LLDB version. However, this is purely informative. LLDB will happily
replay a reproducer generated with a different version of LLDB, which
can cause subtle differences.
This patch adds a version check which compares the current LLDB version
with the one in the reproducer. If the version doesn't match, LLDB will
refuse to replay. It also adds an escape hatch to make it possible to
still replay the reproducer without having to mess with the recorded
version. This might prove useful when you know two versions of LLDB
match, even though the version string doesn't. This behavior is
triggered by passing a new flag -reproducer-skip-version-check to the
lldb driver.
Differential revision: https://reviews.llvm.org/D70934
This patch adds intrinsics for SVE gather loads from memory addresses generated by a vector base plus immediate index:
* @llvm.aarch64.sve.ld1.gather.imm
This intrinsics maps 1-1 to the corresponding SVE instruction (example for half-words):
* ld1h { z0.d }, p0/z, [z0.d, #16]
Committed on behalf of Andrzej Warzynski (andwar)
Reviewers: sdesmalen, huntergr, kmclaughlin, eli.friedman, rengolin, rovka, dancgr, mgudim, efriedma
Reviewed By: sdesmalen
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70806
Summary:
[libomptarget] Build a minimal deviceRTL for amdgcn
The CMakeLists.txt file is functionally identical to the one used in the aomp fork.
Whitespace changes were made based on nvptx/CMakeLists.txt, plus the
copyright notice updated to match (Greg was the original author so would
like his sign off on that here).
This change will build a small subset of the deviceRTL if an appropriate toolchain is
available, e.g. a local install of rocm. Support.h is moved from nvptx as a dependency
of debug.h.
Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers
Reviewed By: jdoerfert
Subscribers: jfb, Hahnfeld, jvesely, mgorny, openmp-commits
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D70414
The DebugVariable class is a class declared in LiveDebugValues.cpp which
is used to uniquely identify a single variable, using its source
variable, inline location, and fragment info to do so. This patch moves
this class into DebugInfoMetadata.h, making it available in a much
broader scope.
Summary:
This fixes the memory leak in bec37c3fc7
and re-delivers the reverted patch.
In this patch the DDG DAG is sorted topologically to put the
nodes in the graph in the order that would satisfy all
dependencies. This helps transformations that would like to
generate code based on the DDG. Since the DDG is a DAG a
reverse-post-order traversal would give us the topological
ordering. This patch also sorts the basic blocks passed to
the builder based on program order to ensure that the
dependencies are computed in the correct direction.
Authored By: bmahjour
Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert
Reviewed By: Meinersbur
Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto, ppc-slack
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70609
This is a follow-up requested in comments for D70826.
It changes the message from
"section X has a sh_offset (Y) + sh_size (Z) that cannot be represented"
to
"section X has a sh_offset (Y) + sh_size (Z) that is greater than the file size (0xABC)"
when section's sh_offset + sh_size overruns a file buffer.
Differential revision: https://reviews.llvm.org/D70893
This was hard-coded to "clang". This change allows it to to be used on
processes other than clang (such as lld).
This gets reported as clang-10 on Linux and clang.exe on Windows so
adapted test to accommodate this.
Differential Revision: https://reviews.llvm.org/D70950