Summary:
This patch aims at condensing the hardware CRC32 feature detection and making
it slightly more effective on Android.
The following changes are included:
- remove the `CPUFeature` enum, and get rid of one level of nesting of
functions: we only used CRC32, so we just implement and use
`hasHardwareCRC32`;
- allow for a weak `getauxval`: the Android toolchain is compiled at API level
14 for Android ARM, meaning no `getauxval` at compile time, yet we will run
on API level 27+ devices. The `/proc/self/auxv` fallback can work but is
worthless for a process like `init` where the proc filesystem doesn't exist
yet. If a weak `getauxval` doesn't exist, then fallback.
- couple of extra corrections.
Reviewers: alekseyshl
Reviewed By: alekseyshl
Subscribers: kubamracek, aemerson, srhines, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D40322
llvm-svn: 318859
The default limit is 1000000 but it can be configured with a cache
policy. The motivation is that some filesystems (notably ext4) have
a limit on the number of files that can be contained in a directory
(separate from the inode limit).
Differential Revision: https://reviews.llvm.org/D40327
llvm-svn: 318857
Power has a weak consistency model so we need memory barriers to
make writes (both from runtime and from user code) available for
all threads.
Differential Revision: https://reviews.llvm.org/D40175
llvm-svn: 318848
We have just fixed the codegen of omp_is_initial_device() to reliably work
when offloading to the same device, see commit r316001. This fixes the
failing tests that were the reason why we disabled the library for 5.0.
Differential Revision: https://reviews.llvm.org/D39052
llvm-svn: 318847
SITargetLowering::LowerCall uses dummy pointer info for byval argument, which causes
flat load instead of buffer load.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D40040
llvm-svn: 318844
This is a bit annoying because LLVM regexes are always mutable to store
errors. Assert that there are never errors and fix broken hardcoded
regexes.
llvm-svn: 318840
As a side effect, the .debug_line section will be dumped in physical
order, rather than in the order that compile units refer to their
associated portions of the .debug_line section. These are probably
always the same order anyway, and no tests noticed the difference.
Differential Revision: https://reviews.llvm.org/D39854
llvm-svn: 318839
In the future the compiler will analyze whether the OpenMP
runtime needs to be (fully) initialized and avoid that overhead
if possible. The functions already take an argument to transfer
that information to the runtime, so pass in the default value 1.
(This is needed for binary compatibility with libomptarget-nvptx
currently being upstreamed.)
Differential Revision: https://reviews.llvm.org/D40354
llvm-svn: 318836
Summary:
This bug seems to have gone unnoticed because critical cases with LDS
instructions are eliminated by the peephole optimizer.
However, equivalent situations arise with buffer loads and stores
as well, so this fixes regressions since r317751 ("AMDGPU: Merge
S_BUFFER_LOAD_DWORD_IMM into x2, x4").
Fixes at least:
KHR-GL45.shader_storage_buffer_object.basic-operations-case1-cs
KHR-GL45.cull_distance.functional
piglit tes-input-gl_ClipDistance.shader_test
... and probably more
Change-Id: I0e371536288eb8e6afeaa241a185266fd45d129d
Reviewers: arsenm, mareko, rampitec
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D40303
llvm-svn: 318829
The MIPS GOT section has a number of local entries based on the number of pages
needed for output sections referenced by GOT page relocations. The number is
recorded in the DT_MIPS_LOCAL_GOTNO dynamic section tag. However, the dynamic tag
is added before assignAddresses has been called, meaning that any section size used
to calculate the value will not include size modifications caused by, for example,
linker scripts and thunks.
This change moves the calculation of DT_MIPS_LOCAL_GOTNO until writeTo, by which
time the output section sizes have been finalized.
Reviewers: ruiu, rafael
Differential Revision: https://reviews.llvm.org/D39493
llvm-svn: 318828
Since i1 is a legal type, this:
NumBytes = Op1->getMemoryVT().getSizeInBits() >> 3;
is wrong and should be instead
NumBytes = Op0->getMemoryVT().getStoreSize();
There seems to be more places where this should be fixed outside DAGCombiner.
Review: Hal Finkel
https://bugs.llvm.org/show_bug.cgi?id=35366
llvm-svn: 318824
This makes the fact that X86 needs an explicit mask output not part of the type constraint for the ISD::MSCATTER.
This also gives the X86ISD::MGATHER/MSCATTER nodes a common base class simplifying the address selection code in X86ISelDAGToDAG.cpp
llvm-svn: 318823
RecordKeeper::getDef() is a hot place, it shows up in profiling
and it creates std::string instance for each search in RecordMap
though RecordKeeper::RecordMap can use StringRef as a key
instead to avoid that. Patch do that change.
Differential revision: https://reviews.llvm.org/D40170
llvm-svn: 318822
Now we consistently represent the mask result without relying on isel ignoring it.
We now have a more general SDNode and type constraints to represent these nodes in isel patterns. This allows us to present both both vXi1 and XMM/YMM mask types with a single set of constraints.
llvm-svn: 318821
Given loops `L1` and `L2` with AddRecs `AR1` and `AR2` varying in them respectively.
When identifying loop disposition of `AR2` w.r.t. `L1`, we only say that it is varying if
`L1` contains `L2`. But there is also a possible situation where `L1` and `L2` are
consecutive sibling loops within the parent loop. In this case, `AR2` is also varying
w.r.t. `L1`, but we don't correctly identify it.
It can lead, for exaple, to attempt of incorrect folding. Consider:
AR1 = {a,+,b}<L1>
AR2 = {c,+,d}<L2>
EXAR2 = sext(AR1)
MUL = mul AR1, EXAR2
If we incorrectly assume that `EXAR2` is invariant w.r.t. `L1`, we can end up trying to
construct something like: `{a * {c,+,d}<L2>,+,b * {c,+,d}<L2>}<L1>`, which is incorrect
because `AR2` is not available on entrance of `L1`.
Both situations "`L1` contains `L2`" and "`L1` preceeds sibling loop `L2`" can be handled
with one check: "header of `L1` dominates header of `L2`". This patch replaces the old
insufficient check with this one.
Differential Revision: https://reviews.llvm.org/D39453
llvm-svn: 318819
After the dataflow algorithm proves that an argument is constant,
it replaces it value with the integer constant and drops the lattice
value associated to the DEF.
e.g. in the example we have @f() that's called twice:
call @f(undef, ...)
call @f(2, ...)
`undef` MEET 2 = 2 so we replace the argument and all its uses with
the constant 2.
Shortly after, tryToReplaceWithConstantRange() tries to get the lattice
value for the argument we just replaced, causing an assertion.
This function is a little peculiar as it runs when we're doing replacement
and not as part of the solver but still queries the solver.
The fix is that of checking whether we replaced the value already and
get a temporary lattice value for the constant.
Thanks to Zhendong Su for the report!
Fixes PR35357.
llvm-svn: 318817
The support for relax relocations is dependent on the linker and
different toolchains within the same compiler can be using different
linkers some of which may or may not support relax relocations.
Give toolchains the option to control whether they want to use relax
relocations in addition to the existing (global) build system option.
Differential Revision: https://reviews.llvm.org/D39831
llvm-svn: 318816
In a17cd7c641c34b6c4bd4845a4d4fb590cb6c238c Marshall added assert(true) to the vector<bool>::size tests, which break on C1XX:
D:\Contest\gl0qojfu.5pe\src\qa\vc\libs\libcxx\upstream\test\std\containers\sequences\vector.bool\size.pass.cpp(62): error C2220: warning treated as error - no 'object' file generated
d:\contest\gl0qojfu.5pe\src\qa\vc\libs\libcxx\upstream\test\std\containers\sequences\vector.bool\size.pass.cpp(33) : warning C6326: Potential comparison of a constant with another constant.
d:\contest\gl0qojfu.5pe\src\qa\vc\libs\libcxx\upstream\test\std\containers\sequences\vector.bool\size.pass.cpp(52) : warning C6326: Potential comparison of a constant with another constant.
The corresponding test for vector::size asserts assert(c.size() == 3);, so I changed it to do that here.
llvm-svn: 318812
One can't replace vsscanf(_l) with a sscanf(_l) that doesn't
take a va_list.
This has been untouched since it was added in SVN r140728, so
apparently it hasn't been used since. One reason for this mistake
originally might have been that there was no _vsscanf_l until MSVC
2015.
Since it's unused, just remove this define.
Differential Revision: https://reviews.llvm.org/D40323
llvm-svn: 318810
Isl does not allow generating isl_ast_expr from an isl_pw_aff that has an
empty domain (i.e. has no pieces). We already detected the case if the
isl_pw_aff comes with an empty domain.
isl_ast_build also considers the domain empty if it is disjoint with the
parameter context (e.g. parameters values that we exclude by runtime
versioning).
Intersect the access relation domain with the parameter context to
also detect such practically empty access domains. The effective
pointer used in the generated code is unimportand because it will never
be executed.
This fixes llvm.org/PR35362
llvm-svn: 318806
Change the representation of COFF comdats so that a COFF linker
is able to accurately resolve comdats between IR and native object
files. Specifically, apply name mangling to comdat names consistently
with native object files, and do not export comdats with an internal
leader because they do not affect symbol resolution.
Differential Revision: https://reviews.llvm.org/D40278
llvm-svn: 318805