Otherwise it gets linked in by one of the dependencies of shared
libraries which may be too late and we end up with weird crashes in
std::call_once().
Differential Revision: http://reviews.llvm.org/D21478
llvm-svn: 273302
The basic structure is that once a list record goes over 64K, the last
subrecord of the list is an LF_INDEX record that refers to the next
record. Because the type record graph must be toplogically sorted, this
means we have to emit them in reverse order. We build the type record in
order of declaration, so this means that if we don't want extra copies,
we need to detect when we were about to split a record, and leave space
for a continuation subrecord that will point to the eventual split
top-level record.
Also adds dumping support for these records.
Next we should make sure that large method overload lists work properly.
llvm-svn: 273294
r273271 changed the RUN line of the regression test to use
-march=cyclone instead of -mtriple=aarch64-none-none.
This caused a change in the output syntax for the ext
instruction, causing the test to fail. Change this test
back to using -mtriple=aarch64-none-none.
llvm-svn: 273286
Summary:
Fix the computation of the offsets present in the scopetable when using the
SEH (__except_handler4).
This patch added an intrinsic to track the position of the allocation on the
stack of the EHGuard. This position is needed when producing the ScopeTable.
```
struct _EH4_SCOPETABLE {
DWORD GSCookieOffset;
DWORD GSCookieXOROffset;
DWORD EHCookieOffset;
DWORD EHCookieXOROffset;
_EH4_SCOPETABLE_RECORD ScopeRecord[1];
};
struct _EH4_SCOPETABLE_RECORD {
DWORD EnclosingLevel;
long (*FilterFunc)();
union {
void (*HandlerAddress)();
void (*FinallyFunc)();
};
};
```
The code to generate the EHCookie is added in `X86WinEHState.cpp`.
Which is adding these instructions when using SEH4.
```
Lfunc_begin0:
# BB#0: # %entry
pushl %ebp
movl %esp, %ebp
pushl %ebx
pushl %edi
pushl %esi
subl $28, %esp
movl %ebp, %eax <<-- Loading FramePtr
movl %esp, -36(%ebp)
movl $-2, -16(%ebp)
movl $L__ehtable$use_except_handler4_ssp, %ecx
xorl ___security_cookie, %ecx
movl %ecx, -20(%ebp)
xorl ___security_cookie, %eax <<-- XOR FramePtr and Cookie
movl %eax, -40(%ebp) <<-- Storing EHGuard
leal -28(%ebp), %eax
movl $__except_handler4, -24(%ebp)
movl %fs:0, %ecx
movl %ecx, -28(%ebp)
movl %eax, %fs:0
movl $0, -16(%ebp)
calll _may_throw_or_crash
LBB1_1: # %cont
movl -28(%ebp), %eax
movl %eax, %fs:0
addl $28, %esp
popl %esi
popl %edi
popl %ebx
popl %ebp
retl
```
And the corresponding offset is computed:
```
Luse_except_handler4_ssp$parent_frame_offset = -36
.p2align 2
L__ehtable$use_except_handler4_ssp:
.long -2 # GSCookieOffset
.long 0 # GSCookieXOROffset
.long -40 # EHCookieOffset <<----
.long 0 # EHCookieXOROffset
.long -2 # ToState
.long _catchall_filt # FilterFunction
.long LBB1_2 # ExceptionHandler
```
Clang is not yet producing function using SEH4, but it's a work in progress.
This patch is a step toward having a valid implementation of SEH4.
Unfortunately, it is not yet fully working. The EH registration block is not
allocated at the right offset on the stack.
Reviewers: rnk, majnemer
Subscribers: llvm-commits, chrisha
Differential Revision: http://reviews.llvm.org/D21231
llvm-svn: 273281
Summary:
Code generation for Cortex-A72/Cortex-A73 was accidentally changed
by r271555, which was a NFCI. The isCortexA57() predicate was not true
for Cortex-A72/Cortex-A73 before r271555 (since it was checking the CPU
string). Because Cortex-A72/Cortex-A73 inherit all features from Cortex-A57,
all decisions previously guarded by isCortexA57() are now taken.
This change restores the behaviour before r271555 by adding separate
ProcA72/ProcA73, which have the required features to preserve code
generation.
Reviewers: kristof.beyls, aadg, mcrosier, rengolin
Subscribers: mcrosier, llvm-commits, aemerson, t.p.northover, MatzeB, rengolin
Differential Revision: http://reviews.llvm.org/D21182
llvm-svn: 273277
Summary:
We have switched to using features for all heuristics, but
the tests for these are still using -mcpu, which means we
are not directly testing the features.
This converts at least some of the existing regression tests
to use the new features.
This still leaves the following features untested:
merge-narrow-ld
predictable-select-expensive
alternate-sextload-cvt-f32-pattern
disable-latency-sched-heuristic
Reviewers: mcrosier, t.p.northover, rengolin
Subscribers: MatzeB, aemerson, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D21288
llvm-svn: 273271
The large offset is being tested on Windows 10 (which has larger usable
virtual address space than Windows 8 or earlier)
Patch by: Wei Wang
Differential Revision: http://reviews.llvm.org/D21523
llvm-svn: 273269
When you have a map holding a unique_ptr, hold a reference to the raw
pointer instead of the unique pointer. The unique_ptr will be moved on
rehash.
llvm-svn: 273268
Summary:
canCombineSinCosLibcall() would previously combine sin+cos into sincos for
GNUX32/GNUEABI/GNUEABIHF regardless of whether UnsafeFPMath were set or not.
However, GNU would only combine them for UnsafeFPMath because sincos does not
set errno like sin and cos do. It seems likely that this is an oversight.
Reviewers: t.p.northover
Subscribers: t.p.northover, aemerson, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D21431
llvm-svn: 273259
It did not handle correctly cases without GEP.
The following loop wasn't vectorized:
for (int i=0; i<len; i++)
*to++ = *from++;
I use getPtrStride() to find Stride for memory access and return 0 is the Stride is not 1 or -1.
Differential revision: http://reviews.llvm.org/D20789
llvm-svn: 273257
Summary:
Using isOutOfOrder makes the code more clear.
Reviewers: rengolin, atrick, hfinkel.
Differential Revision: http://reviews.llvm.org/D21548
llvm-svn: 273255
This reverts commit r273019.
From email I sent to list:
> I don't think this makes sense. Either the linker you're using supports
> this feature, or it doesn't. Having it enabled for llc if your linker
> doesn't support it is not fun.
>
> Further note that this also affects basically all other code using llvm
> libraries -- other than Clang, which explicitly sets it back to false by
> default, unless you set the ENABLE_X86_RELAX_RELOCATIONS cmake flag to
> true.
>
> If you want to enable the relax mode across all llvm tools in some
> circumstances, I think it should be via moving the cmake flag from clang
> down into llvm.
>
> I'm going to revert this commit, since I both think it intrinsically
> doesn't make sense to do this, and because it's breaking some of our
> tools.
llvm-svn: 273245
This patch makes us perform interprocedural analysis on functions that
don't have internal linkage. It also removes a test that should've been
deleted in an earlier commit (since other tests now cover everything
that the newly-removed test covers).
Patch by Jia Chen.
Differential Revision: http://reviews.llvm.org/D21513
llvm-svn: 273229
Works around a bug (PR28216) in Clang's MS mangling of templates with
partial specializations.
This mismatch was introduced in about six months ago in r256656.
llvm-svn: 273223
The main difference is that StubDynamicNoPIC is gone. The
dynamic-no-pic mode as the name implies is simply not pic. It is just
conservative about what it assumes to be dso local.
llvm-svn: 273222
This patch adds function summaries, so that we don't need to recompute
various properties about function parameters/return values at each
callsite of a function. It also adds many interprocedural tests for
CFLAA.
Patch by Jia Chen.
Differential Revision: http://reviews.llvm.org/D21475#inline-182390
llvm-svn: 273219
The BSWAP of vector types is quite efficiently implemented using vector shuffles on SSE/AVX targets, we should reflect the typical cost of this to encourage vectorization.
Differential Revision: http://reviews.llvm.org/D21521
llvm-svn: 273217
Fix for PR27726 - sitofp i64 to fp128 was loading the merged load i64 to a x87 register preventing legalization for conversion to fp128.
Added 32-bit tests for fp128 cast/conversions.
llvm-svn: 273210
Darwin added support in its Xcode 8.0 tools (released in the beta) for universal
files where offsets and sizes for the objects are 64-bits to allow support for
objects contained in universal files to be larger then 4gb. The change is very
straight forward. There is a new magic number that differs by one bit, much
like the 64-bit Mach-O files. Then there is a new structure that follow the
fat_header that has the same layout but with the offset and size fields using
64-bit values instead of 32-bit values.
rdar://26899493
llvm-svn: 273207
There is a known intended race here. This is a follow-up to r264805,
which disabled tsan instrumentation for updates to instrprof counters.
For more background on this please see the discussion in D18164.
llvm-svn: 273202
By moving this transform to InstSimplify from InstCombine, we sidestep the problem/question
raised by PR27869:
https://llvm.org/bugs/show_bug.cgi?id=27869
...where InstCombine turns an icmp+zext into a shift causing us to miss the fold.
Credit to David Majnemer for a draft patch of the changes to InstructionSimplify.cpp.
Differential Revision: http://reviews.llvm.org/D21512
llvm-svn: 273200
Summary: Inliner needs ACT when calling InlineFunction. Instead of nullptr, we need to pass it in from SampleProfileLoader
Reviewers: davidxl
Subscribers: eraman, vsk, danielcdh, llvm-commits
Differential Revision: http://reviews.llvm.org/D21205
llvm-svn: 273199