This is similar to how we generate the VEX tables.
More fixes are still needed for the instructions that use EVEX.b (broadcast and embedded rounding).
llvm-svn: 316294
The missed canonicalization/optimization in the motivating test from PR34471 leads to very different codegen:
int switcher(int x) {
switch(x) {
case 17: return 17;
case 19: return 19;
case 42: return 42;
default: break;
}
return 0;
}
int comparator(int x) {
if (x == 17) return 17;
if (x == 19) return 19;
if (x == 42) return 42;
return 0;
}
For the first example, we use a bit-test optimization to avoid a series of compare-and-branch:
https://godbolt.org/g/BivDsw
Differential Revision: https://reviews.llvm.org/D39011
llvm-svn: 316293
In order to identify the copy deduction candidate, I considered two approaches:
- attempt to determine whether an implicit guide is a copy deduction candidate by checking certain properties of its subsituted parameter during overload-resolution.
- using one of the many bits (WillHaveBody) from FunctionDecl (that CXXDeductionGuideDecl inherits from) that are otherwise irrelevant for deduction guides
After some brittle gymnastics w the first strategy, I settled on the second, although to avoid confusion and to give that bit a better name, i turned it into a member of an anonymous union.
Given this identification 'bit', the tweak to overload resolution was a simple reordering of the deduction guide checks (in SemaOverload.cpp::isBetterOverloadCandidate), in-line with Jason Merrill's p0620r0 drafting which made it into the working paper. Concordant with that, I made sure the copy deduction candidate is always added.
References:
See https://bugs.llvm.org/show_bug.cgi?id=34970
See http://wg21.link/p0620r0
llvm-svn: 316292
This patch implements dynamic stack (re-)alignment for 16-bit Thumb. When
targeting processors, which support only the 16-bit Thumb instruction set
the compiler ignores the alignment attributes of automatic variables and may
silently generate incorrect code.
Differential revision: https://reviews.llvm.org/D38143
llvm-svn: 316289
The pass scans the function to find instruction chains that define
registers in the same domain (closures).
It then calculates the cost of converting the closure to another domain.
If found profitable, the instructions are converted to instructions in
the other domain and the register classes are changed accordingly.
This commit adds the pass infrastructure and a simple conversion from
the GPR domain to the Mask domain.
Differential Revision:
https://reviews.llvm.org/D37251
Change-Id: Ic2cf1d76598110401168326d411128ae2580a604
llvm-svn: 316288
By assuming that mergeable input sections are smaller than 4 GiB,
lld's memory usage when linking clang with debug info drops from
2.788 GiB to 2.019 GiB (measured by valgrind, and that does not include
memory space for mmap'ed files). I think that's a reasonable assumption
given such a large RAM savings, so this patch.
According to valgrind, gold needs 3.54 GiB of RAM to do the same thing.
NB: This patch does not introduce a limitation on the size of
output sections. You can still create sections larger than 4 GiB.
llvm-svn: 316280
o) Add a 'Location' class that represents the four properties of a
physical location
o) Enhance 'SourceLocation' to provide 'expansion' and 'spelling'
locations, maintaining backwards compatibility with existing code by
forwarding the four properties to 'expansion'.
o) Update the implementation to use 'clang_getExpansionLocation'
instead of the deprecated 'clang_getInstantiationLocation', which
has been present since 2011.
o) Update the implementation of 'clang_getSpellingLocation' to actually
obtain spelling location instead of file location.
llvm-svn: 316278
This introduces a new operand type to encode the whether the index register should be XMM/YMM/ZMM. And new code to fixup the results created by readSIB.
This has the nice effect of removing a bunch of code that hard coded the name of every GATHER and SCATTER instruction to map the index type.
This fixes PR32807.
llvm-svn: 316273
Summary:
As Mattias Eriksson has reported in PR35009, in C, for enums, the underlying type should
be used when checking for the tautological comparison, unlike C++, where the enumerator
values define the value range. So if not in CPlusPlus mode, use the enum underlying type.
Also, i have discovered a problem (a crash) when evaluating tautological-ness of the following comparison:
```
enum A { A_a = 0 };
if (a < 0) // expected-warning {{comparison of unsigned enum expression < 0 is always false}}
return 0;
```
This affects both the C and C++, but after the first fix, only C++ code was affected.
That was also fixed, while preserving (i think?) the proper diagnostic output.
And while there, attempt to enhance the test coverage.
Yes, some tests got moved around, sorry about that :)
Fixes PR35009
Reviewers: aaron.ballman, rsmith, rjmccall
Reviewed By: aaron.ballman
Subscribers: Rakete1111, efriedma, materi, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D39122
llvm-svn: 316268
Neither of these cases really require a temporary APInt outside the loop. For the ConstantDataSequential case the APInt will never be larger than 64-bits so its fine to just call getElementAsAPInt. For ConstantVector we can get the APInt by reference and only make a copy where the inversion is needed.
llvm-svn: 316265
Some API calls accept 'NULL' instead of a char array (e.g. the second
argument to 'clang_ParseTranslationUnit'). For Python 3 compatibility,
all strings are passed through 'c_interop_string' which expects to
receive only 'bytes' or 'str' objects. This change extends this
behavior to additionally allow 'None' to be supplied.
A test case was added which breaks in Python 3, and is fixed by this
change. All the test cases pass in both, Python 2 and Python 3.
llvm-svn: 316264
The way that splitInnerLoopHeader splits blocks requires that
the induction PHI will be the first PHI in the inner loop
header. This makes sure that is actually the case when there
are both IV and reduction phis.
Differential Revision: https://reviews.llvm.org/D38682
llvm-svn: 316261
We don't need to do any additional recursion, we just need to analyze the APInt stored in the node. This matches what the ValueTracking versions do for IR.
llvm-svn: 316256
Summary:
We shouldn't recurse any further but it doesn't mean we shouldn't be able to give the known bits for a constant. The caller would probably like that we always return the right answer for a constant RHS. This matches what InstCombine does in this case.
I don't have a test case because this showed up while trying to revive D31724.
Reviewers: RKSimon, spatel
Reviewed By: RKSimon
Subscribers: arsenm, llvm-commits
Differential Revision: https://reviews.llvm.org/D38967
llvm-svn: 316255
Summary: __multi3 is not available on x86 (32-bit). Setting lib call name for MULI_128 to nullptr forces DAGTypeLegalizer::ExpandIntRes_MUL to generate instructions for 128-bit multiply instead of a call to an undefined function. This fixes PR20871 though it may be worth looking at why licm and indvars combine to generate 65-bit multiplies in that test.
Patch by Riyaz V Puthiyapurayil
Reviewers: craig.topper, schweitz
Reviewed By: craig.topper, schweitz
Subscribers: RKSimon, llvm-commits
Differential Revision: https://reviews.llvm.org/D38668
llvm-svn: 316254
This originally started out here in dev, but I moved it to another
file when it became clear this wouldn't work on non-Windows.
Unfortunately I forgot to remove it from this file. Test is still
live, just in another source file.
llvm-svn: 316247
To get MS-style inline assembly, we need to link in the various
backends. Some other clang tools already do this, and this issue
has been raised with clang-tidy several times, indicating there
is sufficient desire to make this work.
Differential Revision: https://reviews.llvm.org/D38549
llvm-svn: 316246
constant expressions.
We permit array-to-pointer decay on such arrays, but disallow pointer
arithmetic (since we do not know whether it will have defined behavior).
This is based on r311970 and r301822 (the former by me and the latter by Robert
Haberlach). Between then and now, two things have changed: we have committee
feedback indicating that this is indeed the right direction, and the code
broken by this change has been fixed.
This is necessary in C++17 to continue accepting certain forms of non-type
template argument involving arrays of unknown bound.
llvm-svn: 316245
Summary:
Without this, the launching of the test inferior may fail if it depends
on some component of the environment (most likely LD_LIBRARY_PATH). This
makes sure we propagate the environment variable to the inferior
process.
Reviewers: eugene
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D39010
llvm-svn: 316244