This is a first step for generating SSE rsqrt instructions for
reciprocal square root calcs when fast-math is allowed.
For now, be conservative and only enable this for AMD btver2
where performance improves significantly - for example, 29%
on llvm/projects/test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c
(if we convert the data type to single-precision float).
This patch adds a two constant version of the Newton-Raphson
refinement algorithm to DAGCombiner that can be selected by any target
via a parameter returned by getRsqrtEstimate()..
See PR20900 for more details:
http://llvm.org/bugs/show_bug.cgi?id=20900
Differential Revision: http://reviews.llvm.org/D5658
llvm-svn: 220570
Summary:
No functional change yet, it's just an object replacement for an enum.
It will allow us to gather ABI information in a single place so that we can
start testing for properties of the ABI's instead of the ABI itself.
For example we will eventually be able to use:
ABI.MinStackAlignmentInBytes()
instead of:
(isABI_N32() || isABI_N64()) ? 16 : 8
which is clearer and more maintainable.
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://reviews.llvm.org/D3341
llvm-svn: 220568
Summary:
This allows us to easily identify them in the backend which in turn allows us
to handle them correctly for big-endian targets (where they must be shifted
into the upper bits of the register).
Depends on D5961
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: cfe-commits, theraven
Differential Revision: http://reviews.llvm.org/D5962
llvm-svn: 220566
Summary:
Ensure all integral/enumeration types are appropriately annotated with
signext/zeroext. In particular, i32 now has these attributes when using the
N32/N64 ABI. This paves the way for accurately representing the way the
N32/N64 ABI's promotes integer arguments to i64.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: cfe-commits, theraven
Differential Revision: http://reviews.llvm.org/D5961
llvm-svn: 220563
This fixes a crash in the RecursiveASTVisitor on such code
__typeof__(struct F*) var[invalid];
The UnderlyingTInfo of a TypeOfTypeLoc was left uninitialized when
created from ASTContext::getTrivialTypeSourceInfo
This lead to a crash in RecursiveASTVisitor when trying to access it.
llvm-svn: 220562
Summary:
i32 is always promoted to i64 so it no longer makes sense to assign i32 to
registers.
Reviewers: vmedic
Reviewed By: vmedic
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5964
llvm-svn: 220561
Summary:
Most structs were fixed by r218451 but those of between >32-bits and
<64-bits remained broken since they were not marked with [ASZ]ExtUpper.
This patch fixes the remaining cases by using
CCPromoteToUpperBitsInType<i64> on i64's in addition to i32 and smaller.
Reviewers: vmedic
Reviewed By: vmedic
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5963
llvm-svn: 220556
'char' is unsigned on all ARM and Thumb architectures. Clang gets this
right for ARM, and for thumb when using and arm triple and the -mthumb
option, but gets it wrong for thumb triples. This fixes that.
llvm-svn: 220555
This fixes a miscompilation in the AArch64 fast-isel which was
triggered when a branch is based on an icmp with condition eq or ne,
and type i1, i8 or i16. The cbz instruction compares the whole 32-bit
register, so values with the bottom 1, 8 or 16 bits clear would cause
the wrong branch to be taken.
llvm-svn: 220553
This is a very basic toolchain. It supports cross-compiling Windows (primarily
inspired by the WoA target). It is meant to use clang with the LLVM IAS and a
binutils ld-compatible interface for the linker (eventually to be lld). It does
not perform any "standard" GCC lookup, nor does it perform any special
adjustments given that it is expected to be used in an environment where the
user is using MSVCRT (and as such Visual Studio headers) and the Windows SDK.
The primary runtime library is expected to be compiler-rt and the C++
implementation to be libc++.
It also expects that a sysroot has been setup given the usual Unix semantics
(standard C headers in /usr/include, all the import libraries available in
/usr/lib). It also expects that an entry point stub is present in /usr/lib
(crtbegin.obj for executables, crtbeginS.obj for shared libraries).
The entry point stub is responsible for running any GNU constructors.
llvm-svn: 220546
This is asm/diasm-only support, similar to AVX.
For ISeling the register variant, they are no different from 213 other than
whether the multiplication or the addition operand is destructed.
For ISeling the memory variant, i.e. to fold a load, they are no different
than the 132 variant. The addition operand (op3) in both cases can come from
memory. Again the ony difference is which operand is destructed.
There could be a post-RA pass that would convert a 213 or 132 into a 231.
Part of <rdar://problem/17082571>
llvm-svn: 220540
This multiclass generates the different forms: 213, 231, 132 in AVX.
132 in AVX512 is a separate class but I am planning to use this same
multiclass to generate 231 relying on the nice the null_frag trick from AVX to
disable codegen pattern for 231.
No functionality change, no change in X86.td.expanded except for the different
instruction definition names.
llvm-svn: 220539
Although the current method is valid up till python 3.3 (which is not supported)
this seems to be a clearer way of checking for linux and moves the tests towards
python 3 compatibility.
llvm-svn: 220535
Although the current method is valid up till python 3.3 (which is not supported)
this seems to be a clearer way of checking for linux and moves the tests towards
python 3 compatibility.
llvm-svn: 220534
This adds support for legalization of instructions of the form:
[fp_conv] <1 x i1> %op to <1 x double>
where fp_conv is one of fpto[us]i, [us]itofp. This used to assert
because they were simply missing from the vector operand scalarizer.
A similar problem arose in r190830, with trunc instead.
Fixes PR20778.
Differential Revision: http://reviews.llvm.org/D5810
llvm-svn: 220533
Clang would previously assert on the following code when targeting MinGW:
struct __declspec(dllimport) S {
virtual ~S();
};
S::~S() {}
Because ~S is a key function and the class is dllimport, we would try to emit a
strong definition of the vtable, with dllimport - which is a conflict. We
should not emit strong vtable definitions for imported classes.
Differential Revision: http://reviews.llvm.org/D5944
llvm-svn: 220532
x86's CMPXCHG -> EFLAGS consumer wasn't being recorded as a real EFLAGS
dependency because it was represented by a pair of CopyFromReg(EFLAGS) ->
CopyToReg(EFLAGS) nodes. ScheduleDAG was expecting the source to be an
implicit-def on the instruction, where the result numbers in the DAG and the
Uses list in TableGen matched up precisely.
The Copy notation seems much more robust, so this patch extends ScheduleDAG
rather than refactoring x86.
Should fix PR20376.
llvm-svn: 220529
While refactoring this code I was confused by both the name I had
introduced (addNonArgumentVariable... but it has all this logic to
handle argument numbering and keep things in order?) and by the
redundancy. Seems when I fixed the misordered inlined argument handling,
I didn't realize it was mostly redundant with the argument ordering code
(which I may've also written, I'm not sure). So let's just rely on the
more general case.
The only oddity in output this produces is that it means when we emit
all the variables for the current function, we don't track when we've
finished the argument variables and are about to start the local
variables and insert DW_AT_unspecified_parameters (for varargs
functions) there. Instead it ends up after the local variables, scopes,
etc. But this isn't invalid and doesn't cause DWARF consumers problems
that I know of... so we'll just go with that because it makes the code
nice & simple.
(though, let's see what the buildbots have to say about this - *crosses
fingers*)
There will be some cleanup commits to follow to remove the now trivial
wrappers, etc.
llvm-svn: 220527
Currently, when --serialize-diagnostics is passed this only includes
the diagnostics from clang -cc1, and driver diagnostics are
dropped. This causes issues for tools that use the serialized
diagnostics, since stderr is lost and these diagnostics aren't seen at
all.
We handle this by merging the diagnostics from the CC1 process and the
driver diagnostics into a single file when the driver invokes CC1.
Fixes rdar://problem/10585062
llvm-svn: 220525