The case with 2 variables is more complicated than the case where
we eliminate the shuffle entirely because a shuffle with an undef
mask element creates an undef result.
I'm not aware of any current analysis/transform that recognizes that
undef propagating to a div/rem/shift, but we have to guard against
the possibility.
llvm-svn: 336668
Summary:
Many operators in <compare> were _defined_ with the proper visibility attribute,
but they were _declared_ without any. This is not a problem until we change the
definition of _LIBCPP_INLINE_VISIBILITY to something that requires the
declaration to be decorated.
I also marked `strong_equality::operator weak_equality()` as
`_LIBCPP_INLINE_VISIBILITY`, since it seems like it had been forgotten.
This came up while trying to get rid of `__attribute__((__always_inline__))`
in favor of `__attribute__((internal_linkage))`.
Reviewers: EricWF, mclow.lists
Subscribers: christof, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D49104
llvm-svn: 336665
when building with an IDE so that header files show up in the UI.
This massively improves the development workflow in IDEs.
To implement this a new function `compiler_rt_process_sources(...)` has
been added that adds header files to the list of sources when the
generator is an IDE. For non-IDE generators (e.g. Ninja/Makefile) no
changes are made to the list of source files.
The function can be passed a list of headers via the
`ADDITIONAL_HEADERS` argument. For each runtime library a list of
explicit header files has been added and passed via
`ADDITIONAL_HEADERS`. For `tsan` and `sanitizer_common` a list of
headers was already present but it was stale and has been updated
to reflect the current state of the source tree.
The original version of this patch used file globbing (`*.{h,inc,def}`)
to find the headers but the approach was changed due to this being a
CMake anti-pattern (if the list of headers changes CMake won't
automatically re-generate if globbing is used).
The LLVM repo contains a similar function named `llvm_process_sources()`
but we don't use it here for several reasons:
* It depends on the `LLVM_ENABLE_OPTION` cache variable which is
not set in standalone compiler-rt builds.
* We would have to `include(LLVMProcessSources)` which I'd like to
avoid because it would include a bunch of stuff we don't need.
Differential Revision: https://reviews.llvm.org/D48422
llvm-svn: 336663
An explicit untied use is not sufficient to maintain liveness of a
register redefined in a predicated instruction. For example
%1 = COPY %0
...
%1 = A2_paddif %2, %1, 1
could become
$r1 = COPY $r0
...
$r1 = A2_paddif $p0, $r1, 1
and later
$r1 = COPY $r0 ;; this is not really dead!
...
$r1 = A2_paddif $p0, $r0, 1
llvm-svn: 336662
Summary:
Original patch by Kuba Mracek
The %T lit expansion expands to a common directory shared between all
the tests in the same directory, which is unexpected and unintuitive,
and more importantly, it's been a source of subtle race conditions and
flaky tests. In https://reviews.llvm.org/D35396, it was agreed that it
would be best to simply ban %T and only keep %t, which is unique to each
test. When a test needs a temporary directory, it can just create one
using mkdir %t.
This patch removes %T in compiler-rt.
Differential Revision: https://reviews.llvm.org/D48618
llvm-svn: 336661
Summary:
Reproducer and errors:
https://bugs.llvm.org/show_bug.cgi?id=37878
lookupModule was falling back to loadSubdirectoryModuleMaps when it couldn't
find ModuleName in (proper) search paths. This was causing iteration over all
files in the search path subdirectories for example "/usr/include/foobar" in
bugzilla case.
Users don't expect Clang to load modulemaps in subdirectories implicitly, and
also the disk access is not cheap.
if (AllowExtraModuleMapSearch) true with ObjC with @import ModuleName.
Reviewers: rsmith, aprantl, bruno
Subscribers: cfe-commits, teemperor, v.g.vassilev
Differential Revision: https://reviews.llvm.org/D48367
llvm-svn: 336660
Summary:
Fixed two cases of where PHI nodes need to be updated by lowerswitch.
When lowerswitch find out that the switch default branch is not
reachable it remove the old default and replace it with the most
popular block from the cases, but it forget to update the PHI
nodes in the default block.
The PHI nodes also need to be updated when the switch is replaced
with a single branch.
Reviewers: hans, reames, arsenm
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D47203
llvm-svn: 336659
Parsing invalid UTF-8 input is now a parse error.
Creating JSON values from invalid UTF-8 now triggers an assertion, and
(in no-assert builds) substitutes the unicode replacement character.
Strings retrieved from json::Value are always valid UTF-8.
llvm-svn: 336657
As suggested by @efriedma on D48975, this patch separates the BuildDiv/Pow2 style optimizations from the rest of the visitSDIV/visitUDIV to make it easier to reuse the combines and will allow us to avoid some rather nasty node recursive combining in visitREM.
llvm-svn: 336656
In this setup, skip adding all the default windows import libraries,
if linking to windowsapp (which replaces them, when targeting the
windows store/UWP api subset).
With GCC, the same is achieved by using a custom spec file, but
since clang doesn't use spec files, we have to allow other means of
overriding what default libraries to use (without going all the
way to using -nostdlib, which would exclude everything). The same
approach, in detecting certain user specified libraries and omitting
others from the defaults, was already used in SVN r314138.
Differential Revision: https://reviews.llvm.org/D49059
llvm-svn: 336655
Since SVN r314138, we check if the user has specified any particular
alternative msvcrt/ucrt version, and skip the default -lmsvcrt
in those cases.
In addition to the existing names checked, we should also treat
a plain -lucrt in the same way, mingw-w64 has now added a separate
import library named libucrt.a, in addition to libucrtbase.a.
Differential Revision: https://reviews.llvm.org/D49054
llvm-svn: 336654
Future symbol insertions can potentially change the type of these
symbols - keep pointers to the base class to reflect this, and
use dynamic casts to inspect them before using as the subclass
type.
This fixes crashes that were possible before, by touching these
symbols that now are populated as e.g. a DefinedRegular, via
the old pointers with DefinedImportThunk type.
Differential Revision: https://reviews.llvm.org/D48953
llvm-svn: 336652
Changes:
- Remove static assertion on size of a structure, fails on systems where
pointers aren't 8 bytes.
- Use size_t instead of deducing type of arguments to
`nearest_boundary`.
Follow-up to D48653.
llvm-svn: 336648
switch unswitching.
The core problem was that the way we handled unswitching trivial exit
edges through the default successor of a switch. For some reason
I thought the right way to do this was to add a block containing
unreachable and point the default successor at this block. In
retrospect, this has an amazing number of problems.
The first issue is the one that this pass has always worked around -- we
have to *detect* such edges and avoid unswitching them again. This
seemed pretty easy really. You juts look for an edge to a block
containing unreachable. However, this pattern is woefully unsound. So
many things can break it. The amazing thing is that I found a test case
where *simple-loop-unswitch itself* breaks this! When we do
a *non-trivial* unswitch of a switch we will end up splitting this exit
edge. The result will be a default successor that is an exit and
terminates in ... a perfectly normal branch. So the first test case that
I started trying to fix is added to the nontrivial test cases. This is
a ridiculous example that did just amazing things previously. With just
unswitch, it would create 10+ copies of this stuff stamped out. But if
you combine it *just right* with a bunch of other passes (like
simplify-cfg, loop rotate, and some LICM) you can get it to do this
infinitely. Or at least, I never got it to finish. =[
This, in turn, uncovered another related issue. When we are manipulating
these switches after doing a trivial unswitch we never correctly updated
PHI nodes to reflect our edits. As soon as I started changing how these
edges were managed, it became obvious there were more issues that
I couldn't realistically leave unaddressed, so I wrote more test cases
around PHI updates here and ensured all of that works now.
And this, in turn, required some adjustment to how we collect and manage
the exit successor when it is the default successor. That showed a clear
bug where we failed to include it in our search for the outer-most loop
reached by an unswitched exit edge. This was actually already tested and
the test case didn't work. I (wrongly) thought that was due to SCEV
failing to analyze the switch. In fact, it was just a simple bug in the
code that skipped the default successor. While changing this, I handled
it correctly and have updated the test to reflect that we now get
precise SCEV analysis of trip counts for the outer loop in one of these
cases.
llvm-svn: 336646
This patch adds fast-isel tests for the IR patterns produced for truncation
intrinsics in rC336643.
Differential Revision: https://reviews.llvm.org/D48822
llvm-svn: 336645
Summary:
We found a bug while working on a benchmark for the profiling mode which
manifests as a segmentation fault in the profiling handler's
implementation. This change adds unit tests which replicate the
issues in isolation.
We've tracked this down as a bug in the implementation of the Freelist
in the `xray::Array` type. This happens when we trim the array by a
number of elements, where we've been incorrectly assigning pointers for
the links in the freelist of chunk nodes. We've taken the chance to add
more debug-only assertions to the code path and allow us to verify these
assumptions in debug builds.
In the process, we also took the opportunity to use iterators to
implement both `front()` and `back()` which exposes a bug in the
iterator decrement operation. In particular, when we decrement past a
chunk size boundary, we end up moving too far back and reaching the
`SentinelChunk` prematurely.
This change unblocks us to allow for contributing the non-crashing
version of the benchmarks in the test-suite as well.
Reviewers: kpw
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D48653
llvm-svn: 336644
This patch lowers the _mm[256|512]_cvtepi{64|32|16}_epi{32|16|8} intrinsics to
native IR in cases where the result's length is less than 128 bits.
The resulting IR for 256-bit inputs is folded into VPMOV instructions, while for
128-bit inputs the vpshufb (or, in the 64-to-32-bit case, vinsertps)
instructions are generated instead
Differential Revision: https://reviews.llvm.org/D48712
llvm-svn: 336643
Now that rL336250 has landed, we should prefer 2 immediate shifts + a shuffle blend over performing a multiply. Despite the increase in instructions, this is quicker (especially for slow v4i32 multiplies), avoid loads and constant pool usage. It does mean however that we increase register pressure. The code size will go up a little but by less than what we save on the constant pool data.
This patch also adds support for v16i16 to the BLEND(SHIFT(v,c1),SHIFT(v,c2)) combine, and also prevents blending on pre-SSE41 shifts if it would introduce extra blend masks/constant pool usage.
Differential Revision: https://reviews.llvm.org/D48936
llvm-svn: 336642
We're missing the EVEX equivalents of these patterns and seem to get along fine.
I think we end up with X86vzload for the obvious IR cases that would produce this DAG.
llvm-svn: 336638
The rounding mode is checked in CGBuiltin.cpp to generate the correct intrinsic call.
Making this switch switchs the masking to use the i8 bitcast to <8 x i1> and extract i1 version of the IR for the mask. Previously we ended up with a scalar 'and' plus an icmp.
llvm-svn: 336637
BindingDecls have null type until their initializer is processed, so we can't
assume that a correction candidate has non-null type.
rdar://41559582
llvm-svn: 336634
Summary:
Some tests already make use of OS feature names, e.g. 'linux' and 'freebsd',
but they are not actually currently set by lit.
Reviewers: pcc, eugenis
Reviewed By: eugenis
Subscribers: emaste, krytarowski, delcypher, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D49115
llvm-svn: 336633
Functions that are a sub-Decl of a record were hashed differently than other
functions. This change keeps the AddFunctionDecl function and the hash of
records now calls this function. In addition, AddFunctionDecl has an option
to perform a hash as if the body was absent, which is required for some
checks after loading modules. Additional logic prevents multiple error
message from being printed.
llvm-svn: 336632
I believe the only way to test this functionality is to create extremely
large object files and attempt to create a .gdb_index that is greater
than 4 GiB. But I think that's too much for most environments and buildbots,
so I'm commiting this without a test that actually triggers the new
error condition.
llvm-svn: 336631
Privacy annotations shouldn't have to appear in the first
comma-delimited string in order to be recognized. Also, they should be
ignored if they are preceded or followed by non-whitespace characters.
rdar://problem/40706280
llvm-svn: 336629
This will convert the i8 mask argument to <8 x i1> and extract an i1 and then emit a select instruction. This replaces the '(__U & 1)" and ternary operator used in some of intrinsics. The old sequence was lowered to a scalar and and compare. The new sequence uses an i1 vector that will interoperate better with other mask intrinsics.
This removes the need to handle div_ss/sd specially in CGBuiltin.cpp. A follow up patch will add the GCCBuiltin name back in llvm and remove the custom handling.
I made some adjustments to legacy move_ss/sd intrinsics which we reused here to do a simpler extract and insert instead of 2 extracts and two inserts or a shuffle.
llvm-svn: 336622
This is prep for DWARF v5 range list emission. Emission of a single range list is moved
to a static helper function.
Reviewer: jdevlieghere
Differential Revision: https://reviews.llvm.org/D49098
llvm-svn: 336621
Previously, we didn't create multiple consecutive bitmaps.
Added a test to catch this bug too.
Differential Revision: https://reviews.llvm.org/D49107
llvm-svn: 336620