This was first reviewed in https://reviews.llvm.org/D46776 and
landed in r332299, but got reverted because it broke the PS4
bots.
https://reviews.llvm.org/D50410 fixed this, and then this
change was re-reviewed at https://reviews.llvm.org/D50515 and
relanded in r341329. It got reverted due to causing MSan issues.
However, nobody wrote down the error message and the bot link
is dead, so I'm relanding this to capture the MSan error.
I'll then either fix it, or copy it somewhere and revert if
fixing looks difficult.
llvm-svn: 359580
r359527 already merged some of that to the GN build,
but it was missing some bits as well.
The check-clangd target works (at least for now) differently than all
the other check-foo targets, see https://reviews.llvm.org/D61187
For that reason, there's no gni file and the generated lit configs are
not (yet?) added to llvm-lit/BUILD.gn.
llvm-svn: 359570
This is required for using PGO on Windows but isn't in the Windows
release packages. Windows packages are built with
LLVM_INSTALL_TOOLCHAIN_ONLY so only includes llvm "tools" listed here.
Differential Revision: https://reviews.llvm.org/D61317
llvm-svn: 359569
We don't have this restriction in IR, so it should not be here
either simply out of consistency. Code that wants to handle FP
exceptions is expected to use the 'strict' variants of these
nodes.
We don't get the frem case because frem by 0.0 produces NaN (invalid),
and that's the remaining check here (so the removed check for frem
was dead code AFAIK).
This is the only place in SDAG that uses "HasFPExceptions", so I
think we should remove that entirely as a follow-up patch.
llvm-svn: 359566
For clang-cl self hosts in VS2015 environment this was reporting: "Host
Clang must have at least -fms-compatibility-version=19.00.24213.1, your
version is 9.0.0".
This check fires as CMake detects the simulated environment as _MSC_VER
1900, which is truncated. This makes it less than the required
19.00.24213.1.
Differential revision: https://reviews.llvm.org/D61188
llvm-svn: 359556
Summary:
MemorySSA keeps internal pointers of AA and DT.
If these get invalidated, so should MemorySSA.
Reviewers: george.burgess.iv, chandlerc
Subscribers: jlebar, Prazek, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61043
........
This was causing windows build bot failures
llvm-svn: 359555
The code in this test is not vectorized by SLP because its operand reordering cannot look beyond the immediate predecessors.
This will get fixed in a follow-up patch that introduces the look-ahead operand reordering heuristic.
Committed on behalf of @vporpo (Vasileios Porpodas)
Differential Revision: https://reviews.llvm.org/D61283
llvm-svn: 359553
This is a fix for https://bugs.llvm.org/show_bug.cgi?id=41371.
Currently, it is possible to break the sh_link field of the dynamic relocation section
by removing the section it refers to. The patch fixes an issue and adds 2 test cases.
One of them shows that it does not seem possible to break the sh_info field.
I added an assert to verify this.
Differential revision: https://reviews.llvm.org/D60825
llvm-svn: 359552
This implements TargetTransformInfo method getMemcpyCost, which estimates the
number of instructions to which a memcpy instruction expands to.
Differential Revision: https://reviews.llvm.org/D59787
llvm-svn: 359547
Current LLVM uses pxor+pinsrb on SSE4+ for INSERT_VECTOR_ELT(ZeroVec, 0, Elt) insead of much simpler movd.
INSERT_VECTOR_ELT(ZeroVec, 0, Elt) is idiomatic construct which is used e.g. for _mm_cvtsi32_si128(Elt) and for lowest element initialization in _mm_set_epi32.
So such inefficient lowering leads to significant performance digradations in ceratin cases switching from SSSE3 to SSE4.
https://bugs.llvm.org/show_bug.cgi?id=41512
Here INSERT_VECTOR_ELT(ZeroVec, 0, Elt) is simply converted to SCALAR_TO_VECTOR(Elt) when applicable since latter is closer match to desired behavior and always efficiently lowered to movd and alike.
Committed on behalf of @Serge_Preis (Serge Preis)
Differential Revision: https://reviews.llvm.org/D60852
llvm-svn: 359545
This was a local static funtion in SelectionDAG, which I've promoted to
TargetLowering so that I can reuse it to estimate the cost of a memory
operation in D59787.
Differential Revision: https://reviews.llvm.org/D59766
llvm-svn: 359543
Bail out on function arguments/returns with types aggregating an
unsupported type. This fixes cases where we would happily and
incorrectly lower functions taking e.g. [1 x i64] parameters, when we
don't even support plain i64 yet.
llvm-svn: 359540
The MachineFunction wasn't used in getOptimalMemOpType, but more importantly,
this allows reuse of findOptimalMemOpLowering that is calling getOptimalMemOpType.
This is the groundwork for the changes in D59766 and D59787, that allows
implementation of TTI::getMemcpyCost.
Differential Revision: https://reviews.llvm.org/D59785
llvm-svn: 359537
Summary:
When a variable goes into scope several times within a single function
or when two variables from different scopes share a stack slot it may
be incorrect to poison such scoped locals at the beginning of the
function.
In the former case it may lead to false negatives (see
https://github.com/google/sanitizers/issues/590), in the latter - to
incorrect reports (because only one origin remains on the stack).
If Clang emits lifetime intrinsics for such scoped variables we insert
code poisoning them after each call to llvm.lifetime.start().
If for a certain intrinsic we fail to find a corresponding alloca, we
fall back to poisoning allocas for the whole function, as it's now
impossible to tell which alloca was missed.
The new instrumentation may slow down hot loops containing local
variables with lifetime intrinsics, so we allow disabling it with
-mllvm -msan-handle-lifetime-intrinsics=false.
Reviewers: eugenis, pcc
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60617
llvm-svn: 359536
The PrologEpilogInserter need to insert a DW_OP_deref_size before
prepending a memory location expression to an already implicit
expression to avoid having the existing expression act on the memory
address instead of the value behind it.
The reason for using DW_OP_deref_size and not plain DW_OP_deref is that
big-endian targets need to read the right size as simply truncating a
larger read would yield the wrong result (LSB bytes are not at the lower
address).
This re-commit fixes issues reported in the first one. Namely deref was
inserted under wrong conditions and additionally the deref_size argument
was incorrectly encoded.
Differential Revision: https://reviews.llvm.org/D59687
llvm-svn: 359535
Do not combine (trunc adde(X, Y, Carry)) into (adde trunc(X), trunc(Y), Carry),
if adde is not legal for the target. Even it's at type-legalize phase.
Because adde is special and will not be legalized at operation-legalize phase later.
This fixes: PR40922
https://bugs.llvm.org/show_bug.cgi?id=40922
Differential Revision: https://reviews.llvm.org//D60854
llvm-svn: 359532
Summary:
With this change, cl::ResetAllOptionOccurrences() clears
cl::list just like cl::opt, allowing users to call
cl::ParseCommandLineOptions() multiple times without interference from
previous calls.
Reviewers: rnk
Reviewed By: rnk
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61234
llvm-svn: 359522
Background: A definition generator can be attached to a JITDylib to generate
new definitions in response to queries. For example: a generator that forwards
calls to dlsym can map symbols from a dynamic library into the JIT process on
demand.
If definition generation fails then the generator should be able to return an
error. This allows the JIT API to distinguish between the case where a
generator does not provide a definition, and the case where it was not able to
determine whether it provided a definition due to an error.
The immediate motivation for this is cross-process symbol lookups: If the
remote-lookup generator is attached to a JITDylib early in the search list, and
if a generator failure is misinterpreted as "no definition in this JITDylib" then
lookup may continue and bind to a different definition in a later JITDylib, which
is a bug.
llvm-svn: 359521
Summary:
MemorySSA keeps internal pointers of AA and DT.
If these get invalidated, so should MemorySSA.
Reviewers: george.burgess.iv, chandlerc
Subscribers: jlebar, Prazek, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61043
llvm-svn: 359519
Summary:
This patch adds support for __builtin_dcbf for PPC.
__builtin_dcbf copies the contents of a modified block from the data cache
to main memory and flushes the copy from the data cache.
Differential revision: https://reviews.llvm.org/D59843
llvm-svn: 359517
lld-link used to write PDB files that DIA couldn't recover natvis
files from if:
- The global strings table was > 64kiB
- There were at least 3 natvis files
The cause was that the hash function for the /src/headerblock stream
was incorrect: It needs to be truncated to 16 bit.
If the global strings table was <= 64kiB, truncating to 16 bit is a
no-op, so this wasn't needed for small programs.
If there are only 1 or 2 natvis files, then the growth strategy in
HashTable::grow() would mean the hash table would have 2 buckets (for 1
natvis file) or 4 buckets (for 4 natvis files), and since the hash
function is used modulo number of buckets, and since 2 and 4 divide
0x10000, the missing `% 0x10000` is a no-op there too. For 3 natvis
files, the hash table grows to 6 buckets, which has a factor that's not
common with 0x10000 and the difference starts to matter.
Fixes PR41626.
Differential Revision: https://reviews.llvm.org/D61277
llvm-svn: 359515
LLJITBuilder and LLLazyJITBuilder construct LLJIT and LLLazyJIT instances
respectively. Over time these will allow more configurable options to be
added while remaining easy to use in the default case, which for default
in-process JITing is now:
auto J = ExitOnErr(LLJITBuilder.create());
llvm-svn: 359511
Summary:
For ThinLTOCodegenerator, it has an option to save the object file
outputs into a directory which is essential for debug info. Tools like lldb
and dsymutil will look for these object files for debug info.
On Darwin platform, you can link fat binaries with one single clang
driver invocation like:
$ clang -arch x86_64 -arch i386 -Wl,-object_path_lto,$TMPDIR ...
Unfornately, the output object files for one architecture is going to
overwrite the previous ones and one architecture slice will end up with
no debug info. One example for this is to turn on ThinLTO for sanitizer
dylibs in compiler-rt project.
To fix the issue, add the name for the architecture into the name of the
output object file.
rdar://problem/35482935
Reviewers: tejohnson, bd1976llvm, dexonsmith, JDevlieghere
Reviewed By: dexonsmith
Subscribers: mehdi_amini, aprantl, inglorion, eraman, hiraditya, jkorous, dang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60924
llvm-svn: 359508
The WebAssembly backend needs to know the signatures of all runtime
libcall functions. This adds the signature for __stack_chk_fail which was
previously missing.
Also, make the error message for a missing libcall include the name of
the function.
Differential Revision: https://reviews.llvm.org/D59521
Reviewed By: sbc100
llvm-svn: 359505
Change the PPCISelLowering.cpp function that decides to avoid update form in
favor of partial vector loads to know about newer load types and to not be
confused by the chain operand.
Differential Revision: https://reviews.llvm.org/D60102
llvm-svn: 359504
This was falling back and gives us a reason to create a selectIntrinsic function
which we would need eventually anyway. Update arm64-crypto.ll to show that we
correctly select it.
Also factor out the code for finding an intrinsic ID.
llvm-svn: 359501
On mingw/i686, local labels don't start with a leading period.
Also escape the leading period, as it previously could match
any char.
Differential Revision: https://reviews.llvm.org/D61254
llvm-svn: 359497
This is necessary since SVN r330706, as tail merging can include
CFI instructions since then.
This fixes PR40322 and PR40012.
Differential Revision: https://reviews.llvm.org/D61252
llvm-svn: 359496
Multiple targets in the same output directory can use the same
target_output_name. The typical example of that is having a shared
and a static library of the same, e.g. libc++.so and libc++.a.
When that's the case, the object files produced for each target
are going to conflict. Using the label_name avoids this conflict
since labels are guaranteed to be unique within a single BUILD.gn
file which corresponds to a single output directory.
Differential Revision: https://reviews.llvm.org/D60329
llvm-svn: 359494
Add target shuffle decoding to isHorizontalBinOp as well as ISD::VECTOR_SHUFFLE support.
This does mean we can go through bitcasts so we need to bitcast the extracted args to ensure they are the correct type
Fixes PR39936 and should help with PR39920/PR39921
Differential Revision: https://reviews.llvm.org/D61245
llvm-svn: 359491