For global reg lvalue - use regular store through global register.
For simple lvalue - use simple atomic store.
For bitfields, vector element, extended vector elements - the original value of the whole storage (for vector elements) or of some aligned value (for bitfields) is atomically read, the part of this value for the given lvalue is modified and then use atomic compare-and-exchange operation to try to atomically write modified value (if it was not modified).
Also, changes in this patch fix the bug for '#pragma omp atomic read' applied to extended vector elements.
Differential Revision: http://reviews.llvm.org/D7369
llvm-svn: 230736
The __finally emission block tries to be clever by removing unused continuation
edges if there's an unconditional jump out of the __finally block. With
exception edges, the EH continuation edge isn't always unused though and we'd
crash in a few places.
Just don't be clever. That makes the IR for __finally blocks a bit longer in
some cases (hence small and behavior-preserving changes to existing tests), but
it makes no difference in general and it fixes the last crash from PR22553.
http://reviews.llvm.org/D7918
llvm-svn: 230697
Currently, the NaN values emitted for MIPS architectures do not cover
non-IEEE754-2008 compliant case. This change fixes the issue.
Patch by Vladimir Radosavljevic.
Differential Revision: http://reviews.llvm.org/D7882
llvm-svn: 230653
Original CL description:
Produce less broken basic block sequences for __finally blocks.
The way cleanups (such as PerformSEHFinally) get emitted is that codegen
generates some initialization code, then calls the cleanup's Emit() with the
insertion point set to a good place, then the cleanup is supposed to emit its
stuff, and then codegen might tack in a jump or similar to where the insertion
point is after the cleanup.
The PerformSEHFinally cleanup tries to just stash away the block it's supposed
to codegen into, and then does codegen later, into that stashed block. However,
after codegen'ing the __finally block, it used to set the insertion point to
the finally's continuation block (where the __finally cleanup goes when its body
is completed after regular, non-exceptional control flow). That's not correct,
as that block can (and generally does) already ends in a jump. Instead,
remember the insertion point that was current before the __finally got emitted,
and restore that.
Fixes two of the crashes in PR22553.
llvm-svn: 230503
The way cleanups (such as PerformSEHFinally) get emitted is that codegen
generates some initialization code, then calls the cleanup's Emit() with the
insertion point set to a good place, then the cleanup is supposed to emit its
stuff, and then codegen might tack in a jump or similar to where the insertion
point is after the cleanup.
The PerformSEHFinally cleanup tries to just stash away the block it's supposed
to codegen into, and then does codegen later, into that stashed block. However,
after codegen'ing the __finally block, it used to set the insertion point to
the finally's continuation block (where the __finally cleanup goes when its body
is completed after regular, non-exceptional control flow). That's not correct,
as that block can (and generally does) already ends in a jump. Instead,
remember the insertion point that was current before the __finally got emitted,
and restore that.
Fixes two of the crashes in PR22553.
llvm-svn: 230460
This is a necessary prerequisite for debugging with modules.
The .pcm files become containers that hold the serialized AST which allows
us to store debug information in the module file that can be shared by all
object files that were built importing the module.
This reapplies r230044 with a fixed configure+make build and updated
dependencies and testcase requirements. Over the last iteration this
version adds
- missing target requirements for testcases that specify an x86 triple,
- a missing clangCodeGen.a dependency to libClang.a in the make build.
rdar://problem/19104245
llvm-svn: 230423
The backend should now be able to handle all AAPCS rules based on argument
type, which means Clang no longer has to duplicate the register-counting logic
and the CodeGen can be significantly simplified.
llvm-svn: 230349
MSVC does not support C99 _Complex.
ICC, however, does support it on windows x86_64, and treats it, for purposes of parameter passing, as equivalent to a struct containing two fields (for the real and imaginary part).
Differential Revision: http://reviews.llvm.org/D7825
llvm-svn: 230315
llvm.eh.sjlj.setjmp / llvm.eh.sjlj.longjmp, if the backend is known to
support them outside the Exception Handling context. The default
handling in LLVM codegen doesn't work and will create incorrect code.
The ARM backend on the other hand will assert if the intrinsics are
used.
llvm-svn: 230255
For now -funique-section-names is the default, so no change in default behavior.
The total .o size in a build of llvm and clang goes from 241687775 to 230649031
bytes if -fno-unique-section-names is used.
llvm-svn: 230031
Summary:
The definition for _mm256_insert_epi64 was taking an int, which would get
truncated before being inserted in the vector.
Original patch by Joshua Magee!
Reviewers: bruno, craig.topper
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D7179
llvm-svn: 229811
Not all targets generate 'store atomic' instructions for
'_Atomic(_Complex int)'. Some targets use the __atomic_store builtin instead.
This commit makes the test accept either one.
llvm-svn: 229676
This is a patch for PR22563 ( http://llvm.org/bugs/show_bug.cgi?id=22563 ).
We were not correctly unwrapping a single 256-bit AVX vector that was defined as an array of 1 inside a struct.
We would generate a <4 x float> param/return value instead of <8 x float> and lose half of the vector.
Differential Revision: http://reviews.llvm.org/D7614
llvm-svn: 229408
For #pragma comment(linker, ...) MSVC expects the comment string to be quoted, but for #pragma comment(lib, ...) the compiler itself quotes the library name.
Since this distinction disappears by the time the directive reaches the backend, move quoting for the "lib" version to the frontend.
Differential Revision: http://reviews.llvm.org/D7653
llvm-svn: 229376
Bools are a little tricky, they are i8 in memory and must be coerced
back to i1 before further operations can be performed on them.
This fixes PR22577.
llvm-svn: 229204
The first change won't touch GEPOperators such as these, but the update
script only identifies them by the leading '(' after getelementptr or
'getelementptr inbounds', so update this test to at least have those
features to allow auto-migrating.
llvm-svn: 229198
The /volatile:ms semantics turn volatile loads and stores into atomic
acquire and release operations. This distinction is important because
volatile memory operations do not form a happens-before relationship
with non-atomic memory. This means that a volatile store is not
sufficient for implementing a mutex unlock routine.
Differential Revision: http://reviews.llvm.org/D7580
llvm-svn: 229082
Summary:
This patch installs an InlineAsmDiagnosticsHandler to avoid the crash
report when the input is bitcode and the bitcode contains invalid inline
assembly. The handler will simply print the same error message that will
print from the backend.
Add CHECK in test-case
Reviewers: echristo, rafael
Reviewed By: rafael
Subscribers: rafael, cfe-commits
Differential Revision: http://reviews.llvm.org/D7568
llvm-svn: 228898
a non-uniqueable temporary node that is only turned into a permanent
unique or distinct node after it is finished.
Otherwise an intermediate node may get accidentally uniqued with another
node as illustrated by the testcase.
Paired commit with LLVM.
llvm-svn: 228855
Also removed unused builtins.
Original patch by Andrea Di Biagio!
Reviewers: craig.topper, nadav
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D7199
llvm-svn: 228481
modifiers on them. If we have a matching output constraint with
an early clobber make sure we don't propagate that to the input
constraint.
llvm-svn: 228422
After r228258, Clang started emitting C++ EH IR that LLVM wasn't ready
to deal with, even when exceptions were disabled with /EHs-. This time,
make /EHs- turn off -fexceptions while still emitting exceptional
constructs in functions using __try. Since Sema rejects C++ exception
handling constructs before CodeGen, landingpads should only appear in
such functions as the result of a __try.
llvm-svn: 228329
Previously we would simply double-emit the body of the __finally block,
but that doesn't work when it contains any kind of Decl, which we can't
double emit.
This fixes that by emitting the block once and branching into a shared
code region and then branching back out.
llvm-svn: 228222
Summary:
Named registers with the constraint "=&r" currently lose the early clobber flag
and turn into "=r" when converted to LLVM-IR. This patch correctly passes it on.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D7346
llvm-svn: 228143
Create a new TargetCodeGenInfo for Windows on ARM to permit annotating the
functions with stack-probe-size (for /Gs and -mstack-probe-support) for
generating the stack probe necessary for Windows targets. This will be used by
the backend when lowering the frame to generate the stack probe appropriately.
llvm-svn: 227641
On targets which use the MSVCRT, setjmp is a macro which expands to
_setjmp or _setjmpex.
_setjmp and _setjmpex have a secret, hidden argument which is not listed
in the function prototype on X64 and WoA. This hidden argument always
seems to be the frame pointer.
_setjmpex isn't used on X86, _setjmp is magically replaced with a call
to _setjmp3. The second argument is zero for 'normal' setjmp/longjmp
pairs, otherwise it is a count of additional variadic arguments. This
is used when setjmp appears inside of a try or __try.
It is not safe to use a pointer to setjmp because _setjmp, _setjmpex and
_setmp3 are not compatible with setjmp.
llvm-svn: 227426
Summary:
It was used for interoperability with PNaCl's calling conventions, but
it's no longer needed.
Also Remove NaCl*ABIInfo which just existed to delegate to either the portable
or native ABIInfo, and remove checkCallingConvention which was now a no-op
override.
Reviewers: jvoung
Subscribers: jfb, llvm-commits
Differential Revision: http://reviews.llvm.org/D7206
llvm-svn: 227362
The backend won't run LowerExpect on -O0. In a debug LTO build, this results in llvm.expect intrinsics being in the LTO IR which doesn't know how to optimize them.
Thanks to Chandler for the suggestion and review.
Differential revision: http://reviews.llvm.org/D7183
llvm-svn: 227135
The lowering looks a lot like normal EH lowering, with the exception
that the exceptions are caught by executing filter expression code
instead of matching typeinfo globals. The filter expressions are
outlined into functions which are used in landingpad clauses where
typeinfo would normally go.
Major aspects that still need work:
- Non-call exceptions in __try bodies won't work yet. The plan is to
outline the __try block in the frontend to keep things simple.
- Filter expressions cannot use local variables until capturing is
implemented.
- __finally blocks will not run after exceptions. Fixing this requires
work in the LLVM SEH preparation pass.
The IR lowering looks like this:
// C code:
bool safe_div(int n, int d, int *r) {
__try {
*r = normal_div(n, d);
} __except(_exception_code() == EXCEPTION_INT_DIVIDE_BY_ZERO) {
return false;
}
return true;
}
; LLVM IR:
define i32 @filter(i8* %e, i8* %fp) {
%ehptrs = bitcast i8* %e to i32**
%ehrec = load i32** %ehptrs
%code = load i32* %ehrec
%matches = icmp eq i32 %code, i32 u0xC0000094
%matches.i32 = zext i1 %matches to i32
ret i32 %matches.i32
}
define i1 zeroext @safe_div(i32 %n, i32 %d, i32* %r) {
%rr = invoke i32 @normal_div(i32 %n, i32 %d)
to label %normal unwind to label %lpad
normal:
store i32 %rr, i32* %r
ret i1 1
lpad:
%ehvals = landingpad {i8*, i32} personality i32 (...)* @__C_specific_handler
catch i8* bitcast (i32 (i8*, i8*)* @filter to i8*)
%ehptr = extractvalue {i8*, i32} %ehvals, i32 0
%sel = extractvalue {i8*, i32} %ehvals, i32 1
%filter_sel = call i32 @llvm.eh.seh.typeid.for(i8* bitcast (i32 (i8*, i8*)* @filter to i8*))
%matches = icmp eq i32 %sel, %filter_sel
br i1 %matches, label %eh.except, label %eh.resume
eh.except:
ret i1 false
eh.resume:
resume
}
Reviewers: rjmccall, rsmith, majnemer
Differential Revision: http://reviews.llvm.org/D5607
llvm-svn: 226760
It fails on Windows due to another temporary being emitted first, so the
LLVM internal renaming scheme gives out the name
__block_descriptor_tmp1.
llvm-svn: 226757
Currently we emit DeferredDeclsToEmit in reverse order. This patch changes that.
The advantages of the change are that
* The output order is a bit closer to the source order. The change to
test/CodeGenCXX/pod-member-memcpys.cpp is a good example.
* If we decide to deffer more, it will not cause as large changes in the
estcases as it would without this patch.
llvm-svn: 226751
Analogous to AVX2, these need to be implemented as macros to properly
propagate the immediate index operand.
Part of <rdar://problem/17688758>
llvm-svn: 226496
Summary:
This fixes MultiSource/Applications/lemon on big-endian N32 by correcting the
handling of the argument to wait(). glibc defines it as a transparent union of
void* and int*. Such unions are passed according to the rules of the first
member so the argument must be passed as if it were a void* (sign extended from
i32 to i64) and not as a union (shifted to the upper bits of an i64).
wait() already behaves correctly on big-endian O32 and N64 since the union is
already the same size as an argument slot.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6963
llvm-svn: 225981
These are implemented with __builtin_shufflevector just like AVX.
We have some tests on the LLVM side to assert that these shufflevectors do
indeed generate the corresponding unpck instruction.
Part of <rdar://problem/17688758>
llvm-svn: 225922
Summary:
The Mips ABI's treat pointers in the same way as integers. They are
sign-extended to 32-bit for O32, and 64-bit for N32/N64. This doesn't matter
for O32 and N64 where pointers are already the correct width but it does matter
for big-endian N32, where pointers are 32-bit and need promoting.
The caller side is already passing pointers correctly. This patch corrects the
callee.
Reviewers: vmedic, atanasyan
Reviewed By: atanasyan
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6812
llvm-svn: 225782
Introduce the following -fsanitize-recover flags:
- -fsanitize-recover=<list>: Enable recovery for selected checks or
group of checks. It is forbidden to explicitly list unrecoverable
sanitizers here (that is, "address", "unreachable", "return").
- -fno-sanitize-recover=<list>: Disable recovery for selected checks or
group of checks.
- -f(no-)?sanitize-recover is now a synonym for
-f(no-)?sanitize-recover=undefined,integer and will soon be deprecated.
These flags are parsed left to right, and mask of "recoverable"
sanitizer is updated accordingly, much like what we do for -fsanitize= flags.
-fsanitize= and -fsanitize-recover= flag families are independent.
CodeGen change: If there is a single UBSan handler function, responsible
for implementing multiple checks, which have different recoverable setting,
then we emit two handler calls instead of one:
the first one for the set of "unrecoverable" checks, another one - for
set of "recoverable" checks. If all checks implemented by a handler have the
same recoverability setting, then the generated code will be the same.
llvm-svn: 225719
The llvm IR until recently had no support for comdats. This was a problem when
targeting C++ on ELF/COFF as just using weak linkage would cause quite a bit of
dead bits to remain on the executable (unless -ffunction-sections,
-fdata-sections and --gc-sections were used).
To fix the problem, llvm's codegen will just assume that any weak or linkonce
that is not in an explicit comdat should be output in one with the same name as
the global.
This unfortunately breaks cases like pr19848 where a weak symbol is not
xpected to be part of any comdat.
Now that we have explicit comdats in the IR, we can finally get both cases
right.
This first patch just makes clang give explicit comdats to GlobalValues where
t is allowed to.
A followup patch to llvm will then stop implicitly producing comdats.
llvm-svn: 225705
Between this behavior and that fixed by r225083/r225000, I'll take the
latter over the former for now, but I'm immediately working on
understanding/addressing this behavior too.
(the fact that the code change in r225083 caused this change in behavior
is a bit troubling anyway - given that it looks & claims to be just a
preformance thing)
llvm-svn: 225086
This still lower to the same intrinsics as before.
This is preparation for bounds checking the immediate on the avx version of the builtin so we don't pass illegal immediates into the backend. Since SSE uses a smaller size immediate its not possible to bounds check when using a shared builtin. Rather than creating a clang specific builtin for the different immediate, I decided (after consulting with Chandler) that it was better to match gcc.
llvm-svn: 224879