Original CL description:
Produce less broken basic block sequences for __finally blocks.
The way cleanups (such as PerformSEHFinally) get emitted is that codegen
generates some initialization code, then calls the cleanup's Emit() with the
insertion point set to a good place, then the cleanup is supposed to emit its
stuff, and then codegen might tack in a jump or similar to where the insertion
point is after the cleanup.
The PerformSEHFinally cleanup tries to just stash away the block it's supposed
to codegen into, and then does codegen later, into that stashed block. However,
after codegen'ing the __finally block, it used to set the insertion point to
the finally's continuation block (where the __finally cleanup goes when its body
is completed after regular, non-exceptional control flow). That's not correct,
as that block can (and generally does) already ends in a jump. Instead,
remember the insertion point that was current before the __finally got emitted,
and restore that.
Fixes two of the crashes in PR22553.
llvm-svn: 230503
The way cleanups (such as PerformSEHFinally) get emitted is that codegen
generates some initialization code, then calls the cleanup's Emit() with the
insertion point set to a good place, then the cleanup is supposed to emit its
stuff, and then codegen might tack in a jump or similar to where the insertion
point is after the cleanup.
The PerformSEHFinally cleanup tries to just stash away the block it's supposed
to codegen into, and then does codegen later, into that stashed block. However,
after codegen'ing the __finally block, it used to set the insertion point to
the finally's continuation block (where the __finally cleanup goes when its body
is completed after regular, non-exceptional control flow). That's not correct,
as that block can (and generally does) already ends in a jump. Instead,
remember the insertion point that was current before the __finally got emitted,
and restore that.
Fixes two of the crashes in PR22553.
llvm-svn: 230460
This is a necessary prerequisite for debugging with modules.
The .pcm files become containers that hold the serialized AST which allows
us to store debug information in the module file that can be shared by all
object files that were built importing the module.
This reapplies r230044 with a fixed configure+make build and updated
dependencies and testcase requirements. Over the last iteration this
version adds
- missing target requirements for testcases that specify an x86 triple,
- a missing clangCodeGen.a dependency to libClang.a in the make build.
rdar://problem/19104245
llvm-svn: 230423
The backend should now be able to handle all AAPCS rules based on argument
type, which means Clang no longer has to duplicate the register-counting logic
and the CodeGen can be significantly simplified.
llvm-svn: 230349
MSVC does not support C99 _Complex.
ICC, however, does support it on windows x86_64, and treats it, for purposes of parameter passing, as equivalent to a struct containing two fields (for the real and imaginary part).
Differential Revision: http://reviews.llvm.org/D7825
llvm-svn: 230315
llvm.eh.sjlj.setjmp / llvm.eh.sjlj.longjmp, if the backend is known to
support them outside the Exception Handling context. The default
handling in LLVM codegen doesn't work and will create incorrect code.
The ARM backend on the other hand will assert if the intrinsics are
used.
llvm-svn: 230255
For now -funique-section-names is the default, so no change in default behavior.
The total .o size in a build of llvm and clang goes from 241687775 to 230649031
bytes if -fno-unique-section-names is used.
llvm-svn: 230031
Summary:
The definition for _mm256_insert_epi64 was taking an int, which would get
truncated before being inserted in the vector.
Original patch by Joshua Magee!
Reviewers: bruno, craig.topper
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D7179
llvm-svn: 229811
Not all targets generate 'store atomic' instructions for
'_Atomic(_Complex int)'. Some targets use the __atomic_store builtin instead.
This commit makes the test accept either one.
llvm-svn: 229676
This is a patch for PR22563 ( http://llvm.org/bugs/show_bug.cgi?id=22563 ).
We were not correctly unwrapping a single 256-bit AVX vector that was defined as an array of 1 inside a struct.
We would generate a <4 x float> param/return value instead of <8 x float> and lose half of the vector.
Differential Revision: http://reviews.llvm.org/D7614
llvm-svn: 229408
For #pragma comment(linker, ...) MSVC expects the comment string to be quoted, but for #pragma comment(lib, ...) the compiler itself quotes the library name.
Since this distinction disappears by the time the directive reaches the backend, move quoting for the "lib" version to the frontend.
Differential Revision: http://reviews.llvm.org/D7653
llvm-svn: 229376
Bools are a little tricky, they are i8 in memory and must be coerced
back to i1 before further operations can be performed on them.
This fixes PR22577.
llvm-svn: 229204
The first change won't touch GEPOperators such as these, but the update
script only identifies them by the leading '(' after getelementptr or
'getelementptr inbounds', so update this test to at least have those
features to allow auto-migrating.
llvm-svn: 229198
The /volatile:ms semantics turn volatile loads and stores into atomic
acquire and release operations. This distinction is important because
volatile memory operations do not form a happens-before relationship
with non-atomic memory. This means that a volatile store is not
sufficient for implementing a mutex unlock routine.
Differential Revision: http://reviews.llvm.org/D7580
llvm-svn: 229082
Summary:
This patch installs an InlineAsmDiagnosticsHandler to avoid the crash
report when the input is bitcode and the bitcode contains invalid inline
assembly. The handler will simply print the same error message that will
print from the backend.
Add CHECK in test-case
Reviewers: echristo, rafael
Reviewed By: rafael
Subscribers: rafael, cfe-commits
Differential Revision: http://reviews.llvm.org/D7568
llvm-svn: 228898
a non-uniqueable temporary node that is only turned into a permanent
unique or distinct node after it is finished.
Otherwise an intermediate node may get accidentally uniqued with another
node as illustrated by the testcase.
Paired commit with LLVM.
llvm-svn: 228855
Also removed unused builtins.
Original patch by Andrea Di Biagio!
Reviewers: craig.topper, nadav
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D7199
llvm-svn: 228481
modifiers on them. If we have a matching output constraint with
an early clobber make sure we don't propagate that to the input
constraint.
llvm-svn: 228422
After r228258, Clang started emitting C++ EH IR that LLVM wasn't ready
to deal with, even when exceptions were disabled with /EHs-. This time,
make /EHs- turn off -fexceptions while still emitting exceptional
constructs in functions using __try. Since Sema rejects C++ exception
handling constructs before CodeGen, landingpads should only appear in
such functions as the result of a __try.
llvm-svn: 228329
Previously we would simply double-emit the body of the __finally block,
but that doesn't work when it contains any kind of Decl, which we can't
double emit.
This fixes that by emitting the block once and branching into a shared
code region and then branching back out.
llvm-svn: 228222
Summary:
Named registers with the constraint "=&r" currently lose the early clobber flag
and turn into "=r" when converted to LLVM-IR. This patch correctly passes it on.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D7346
llvm-svn: 228143
Create a new TargetCodeGenInfo for Windows on ARM to permit annotating the
functions with stack-probe-size (for /Gs and -mstack-probe-support) for
generating the stack probe necessary for Windows targets. This will be used by
the backend when lowering the frame to generate the stack probe appropriately.
llvm-svn: 227641
On targets which use the MSVCRT, setjmp is a macro which expands to
_setjmp or _setjmpex.
_setjmp and _setjmpex have a secret, hidden argument which is not listed
in the function prototype on X64 and WoA. This hidden argument always
seems to be the frame pointer.
_setjmpex isn't used on X86, _setjmp is magically replaced with a call
to _setjmp3. The second argument is zero for 'normal' setjmp/longjmp
pairs, otherwise it is a count of additional variadic arguments. This
is used when setjmp appears inside of a try or __try.
It is not safe to use a pointer to setjmp because _setjmp, _setjmpex and
_setmp3 are not compatible with setjmp.
llvm-svn: 227426
Summary:
It was used for interoperability with PNaCl's calling conventions, but
it's no longer needed.
Also Remove NaCl*ABIInfo which just existed to delegate to either the portable
or native ABIInfo, and remove checkCallingConvention which was now a no-op
override.
Reviewers: jvoung
Subscribers: jfb, llvm-commits
Differential Revision: http://reviews.llvm.org/D7206
llvm-svn: 227362
The backend won't run LowerExpect on -O0. In a debug LTO build, this results in llvm.expect intrinsics being in the LTO IR which doesn't know how to optimize them.
Thanks to Chandler for the suggestion and review.
Differential revision: http://reviews.llvm.org/D7183
llvm-svn: 227135
The lowering looks a lot like normal EH lowering, with the exception
that the exceptions are caught by executing filter expression code
instead of matching typeinfo globals. The filter expressions are
outlined into functions which are used in landingpad clauses where
typeinfo would normally go.
Major aspects that still need work:
- Non-call exceptions in __try bodies won't work yet. The plan is to
outline the __try block in the frontend to keep things simple.
- Filter expressions cannot use local variables until capturing is
implemented.
- __finally blocks will not run after exceptions. Fixing this requires
work in the LLVM SEH preparation pass.
The IR lowering looks like this:
// C code:
bool safe_div(int n, int d, int *r) {
__try {
*r = normal_div(n, d);
} __except(_exception_code() == EXCEPTION_INT_DIVIDE_BY_ZERO) {
return false;
}
return true;
}
; LLVM IR:
define i32 @filter(i8* %e, i8* %fp) {
%ehptrs = bitcast i8* %e to i32**
%ehrec = load i32** %ehptrs
%code = load i32* %ehrec
%matches = icmp eq i32 %code, i32 u0xC0000094
%matches.i32 = zext i1 %matches to i32
ret i32 %matches.i32
}
define i1 zeroext @safe_div(i32 %n, i32 %d, i32* %r) {
%rr = invoke i32 @normal_div(i32 %n, i32 %d)
to label %normal unwind to label %lpad
normal:
store i32 %rr, i32* %r
ret i1 1
lpad:
%ehvals = landingpad {i8*, i32} personality i32 (...)* @__C_specific_handler
catch i8* bitcast (i32 (i8*, i8*)* @filter to i8*)
%ehptr = extractvalue {i8*, i32} %ehvals, i32 0
%sel = extractvalue {i8*, i32} %ehvals, i32 1
%filter_sel = call i32 @llvm.eh.seh.typeid.for(i8* bitcast (i32 (i8*, i8*)* @filter to i8*))
%matches = icmp eq i32 %sel, %filter_sel
br i1 %matches, label %eh.except, label %eh.resume
eh.except:
ret i1 false
eh.resume:
resume
}
Reviewers: rjmccall, rsmith, majnemer
Differential Revision: http://reviews.llvm.org/D5607
llvm-svn: 226760
It fails on Windows due to another temporary being emitted first, so the
LLVM internal renaming scheme gives out the name
__block_descriptor_tmp1.
llvm-svn: 226757
Currently we emit DeferredDeclsToEmit in reverse order. This patch changes that.
The advantages of the change are that
* The output order is a bit closer to the source order. The change to
test/CodeGenCXX/pod-member-memcpys.cpp is a good example.
* If we decide to deffer more, it will not cause as large changes in the
estcases as it would without this patch.
llvm-svn: 226751
Analogous to AVX2, these need to be implemented as macros to properly
propagate the immediate index operand.
Part of <rdar://problem/17688758>
llvm-svn: 226496
Summary:
This fixes MultiSource/Applications/lemon on big-endian N32 by correcting the
handling of the argument to wait(). glibc defines it as a transparent union of
void* and int*. Such unions are passed according to the rules of the first
member so the argument must be passed as if it were a void* (sign extended from
i32 to i64) and not as a union (shifted to the upper bits of an i64).
wait() already behaves correctly on big-endian O32 and N64 since the union is
already the same size as an argument slot.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6963
llvm-svn: 225981
These are implemented with __builtin_shufflevector just like AVX.
We have some tests on the LLVM side to assert that these shufflevectors do
indeed generate the corresponding unpck instruction.
Part of <rdar://problem/17688758>
llvm-svn: 225922
Summary:
The Mips ABI's treat pointers in the same way as integers. They are
sign-extended to 32-bit for O32, and 64-bit for N32/N64. This doesn't matter
for O32 and N64 where pointers are already the correct width but it does matter
for big-endian N32, where pointers are 32-bit and need promoting.
The caller side is already passing pointers correctly. This patch corrects the
callee.
Reviewers: vmedic, atanasyan
Reviewed By: atanasyan
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6812
llvm-svn: 225782
Introduce the following -fsanitize-recover flags:
- -fsanitize-recover=<list>: Enable recovery for selected checks or
group of checks. It is forbidden to explicitly list unrecoverable
sanitizers here (that is, "address", "unreachable", "return").
- -fno-sanitize-recover=<list>: Disable recovery for selected checks or
group of checks.
- -f(no-)?sanitize-recover is now a synonym for
-f(no-)?sanitize-recover=undefined,integer and will soon be deprecated.
These flags are parsed left to right, and mask of "recoverable"
sanitizer is updated accordingly, much like what we do for -fsanitize= flags.
-fsanitize= and -fsanitize-recover= flag families are independent.
CodeGen change: If there is a single UBSan handler function, responsible
for implementing multiple checks, which have different recoverable setting,
then we emit two handler calls instead of one:
the first one for the set of "unrecoverable" checks, another one - for
set of "recoverable" checks. If all checks implemented by a handler have the
same recoverability setting, then the generated code will be the same.
llvm-svn: 225719
The llvm IR until recently had no support for comdats. This was a problem when
targeting C++ on ELF/COFF as just using weak linkage would cause quite a bit of
dead bits to remain on the executable (unless -ffunction-sections,
-fdata-sections and --gc-sections were used).
To fix the problem, llvm's codegen will just assume that any weak or linkonce
that is not in an explicit comdat should be output in one with the same name as
the global.
This unfortunately breaks cases like pr19848 where a weak symbol is not
xpected to be part of any comdat.
Now that we have explicit comdats in the IR, we can finally get both cases
right.
This first patch just makes clang give explicit comdats to GlobalValues where
t is allowed to.
A followup patch to llvm will then stop implicitly producing comdats.
llvm-svn: 225705
Between this behavior and that fixed by r225083/r225000, I'll take the
latter over the former for now, but I'm immediately working on
understanding/addressing this behavior too.
(the fact that the code change in r225083 caused this change in behavior
is a bit troubling anyway - given that it looks & claims to be just a
preformance thing)
llvm-svn: 225086
This still lower to the same intrinsics as before.
This is preparation for bounds checking the immediate on the avx version of the builtin so we don't pass illegal immediates into the backend. Since SSE uses a smaller size immediate its not possible to bounds check when using a shared builtin. Rather than creating a clang specific builtin for the different immediate, I decided (after consulting with Chandler) that it was better to match gcc.
llvm-svn: 224879
The lit.cfg files only add .cpp to suffixes, so these tests used to never run,
oops. (Also tweak to of these tests in minor ways to make the actually pass.)
llvm-svn: 224718
Fixed assertion on type checking for arguments and parameters on function call if arguments are pointers to VLA
Differential Revision: http://reviews.llvm.org/D6655
llvm-svn: 224504
use clang -cc1 matching the front end and backend. Fix up a couple
of tests that were testing aapcs for arm-linux-gnu.
The test that removes the aapcs abi calling convention removes
them because the default triple matches what the backend uses
for the calling convention there and so it doesn't need to be
explicitly stated - see the code in TargetInfo.cpp.
llvm-svn: 224491
For MSVC compatibility, add the `__emit' builtin. This is used in the Windows
SDK headers, and must therefore be implemented as a builtin rather than an
intrinsic.
The `__emit' builtin provides a mechanism to emit a 16-bit opcode instruction
into the stream. The value must be a compile time constant expression. No
guarantees are made about the CPU and memory states after the execution of the
instruction.
Due to the unchecked nature of the builtin, only support this on Windows on ARM.
llvm-svn: 224438
Summary:
Because GCC doesn't use $1 for code generation, inline assembly code can use $1 without having to add it to the clobbers list.
LLVM, on the other hand, does not shy away from using $1, and this can cause conflicts with inline assembly which assumes GCC-like code generation.
A solution to this problem is to make Clang automatically clobber $1 for all MIPS inline assembly.
This is not the optimal solution, but it seems like a necessary compromise, for now.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6638
llvm-svn: 224428
Currently clang fires assertions on x86-64 on any atomic operations for long double operands. Patch fixes codegen for such operations.
Differential Revision: http://reviews.llvm.org/D6499
llvm-svn: 224230
having OptimizeNone remove them again, just don't add them in the
first place if the function already has OptimizeNone.
Note that MinSize can still appear due to attributes on different
declarations; a future patch will address that.
llvm-svn: 224047
Summary:
When -fsanitize-address-field-padding=1 is present
don't emit memcpy for copy constructor.
Thanks Nico for the extra test case.
Test Plan: regression tests
Reviewers: thakis, rsmith
Reviewed By: rsmith
Subscribers: rsmith, cfe-commits
Differential Revision: http://reviews.llvm.org/D6515
llvm-svn: 223563
is for each machine. Fix up darwin tests that were testing for
aapcs on armv7-ios when the actual ABI is apcs.
Should be no user visible change without -cc1.
llvm-svn: 223429
ARM ABI specifies that all the libcalls use soft FP ABI
(even hard FP binaries). These days clang emits _mulsc3 / _muldc3
calls with default (C) calling convention which would be translated
into AAPCS_VFP LLVM calling and thus the result of complex
multiplication will be bogus.
Introduce a way for a target to specify explicitly calling
convention for libcalls. Right now this is temporary correctness
fix. Ultimately, we'll end with intrinsic for complex
multiplication and all calling convention decisions for libcalls
will be put into backend.
llvm-svn: 223123
Now that LLVM can count the registers needed to implement AAPCS rules, we don't
need to duplicate that logic here. This means we can drop the explicit padding
and also use more natural types in many cases (e.g. "struct { float arr[3]; }"
used to end up as "[2 x double]" to avoid holes on the stack.
The one wrinkle is that AAPCS va_arg was also using the register counting
machinery. But the local replacement isn't too bad.
llvm-svn: 222904
Cygwin and MinGW fail to conform to the underlying system's structure passing
ABI. Make the check more precise to ensure that we correctly generate code for
the itanium environment.
llvm-svn: 222626
"global-init", "global-init-src" and "global-init-type" were originally
used to blacklist entities in ASan init-order checker. However, they
were never documented, and later were replaced by "=init" category.
Old blacklist entries should be converted as follows:
* global-init:foo -> global:foo=init
* global-init-src:bar -> src:bar=init
* global-init-type:baz -> type:baz=init
llvm-svn: 222401
This reverts commit r222144. Commit r222142 is being reverted due to
a spec2006/gcc execution-time regression.
Update mips-varargs test as well.
llvm-svn: 222397
Summary:
With this patch, passing a va_list to another function and reading 10 int's from
it works correctly on a big-endian target.
Based on a pair of patches by David Chisnall, one of which I've reworked
for the current trunk.
Reviewers: theraven, atanasyan
Reviewed By: theraven, atanasyan
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6248
llvm-svn: 222339
Summary:
This distinguishes between -fpic and -fPIC now, with the additions in LLVM for
PIC level support.
Test Plan: No regressions
Reviewers: echristo, rafael
Reviewed By: rafael
Subscribers: rnk, emaste, llvm-commits
Differential Revision: http://reviews.llvm.org/D5400
llvm-svn: 222227
used inside blocks. It fixes a crash in naming code
for __func__ etc. when used in a block declared globally.
It also brings back old naming convention for
predefined expression which was broken. rdar://18961148
llvm-svn: 222065
Summary:
Ok, here is somewhat addition to D6217 aiming to preserve old darwin behavior wrt the typedefed types. The actual change to SemaChecking turned out to be pretty gross, in particular:
1. We need to extract the typedef'ed type for proper diagnostics
2. We need to walk over paren expressions as well
Reviewers: chandlerc, rsmith
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6256
llvm-svn: 222044
VSX makes the "vector long long" and "vector double" types available.
This patch enables the vec_perm interface for these types. The same
builtin is generated regardless of the specified type, so no
additional work or testing is needed in the back end. Tests are added
to ensure this builtin is generated by the front end.
llvm-svn: 221988
This patch adds builtin support for xvdivdp and xvdivsp, along with a
new test case. The builtins are accessed using vec_div in altivec.h.
Builtins are listed (mostly) alphabetically there, so inserting these
changed the line numbers for deprecation warnings tested in
test/Headers/altivec-intrin.c.
There is a companion patch for LLVM.
llvm-svn: 221984
Summary:
Consider the following nifty 1 liner: (0 ? csqrtl(2.0f) : sqrtl(2.0f)). One can easily obtain such code from e.g. tgmath. Right now it produces an assertion because we fail to do the promotion real => _Complex real.
The case was properly handled previously (old handleOtherComplexFloatConversion routine), but was forgotten in the current version. This seems to be about fallout from r219557
Reviewers: chandlerc, rsmith
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6217
llvm-svn: 221821
This patch enables the vec_vsx_ld and vec_vsx_st intrinsics for
PowerPC, which provide programmer access to the lxvd2x, lxvw4x,
stxvd2x, and stxvw4x instructions.
New code in altivec.h defines these in terms of new builtins, which
are themselves defined in BuiltinsPPC.def. The builtins are converted
to LLVM intrinsics in CGBuiltin.cpp. Additional code is added to
builtins-ppc-vsx.c to verify the correct generation of the intrinsics.
Note that I moved the other VSX builtins so all VSX builtins will be
alphabetical in their own section in BuiltinsPPC.def.
There is a companion patch for LLVM.
llvm-svn: 221768
Summary: If we've added poisoned paddings to a type do not emit memcpy for operator=.
Test Plan: regression tests.
Reviewers: majnemer, rsmith
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6160
llvm-svn: 221739
Summary:
This change makes CodeGenFunction::EmitCheck() take several
conditions that needs to be checked (all of them need to be true),
together with sanitizer kinds these checks are for. This would allow
to split one call into UBSan runtime into several calls in case
different sanitizer kinds would have different recoverability
settings.
Tests should be fixed accordingly, I'm working on it.
Test Plan: regression test suite.
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6219
llvm-svn: 221716
Homogeneous aggregates on AAPCS_VFP ARM need to be passed *without* being
flattened (e.g. [2 x float] rather than "float, float") for various weird ABI
reasons. However, this isn't the case for anything else; further, we know at
the ABIArgInfo::getDirect callsites whether this flattening is allowed.
So, we can get more unified ARM code, with a simpler Clang, by just using that
knowledge directly.
llvm-svn: 221559
mingw64's headers implement fabs by calling __builtin_fabs, so using the
library call results in an infinite loop. If the backend legalizes
@llvm.fabs as a call to fabs later, things should work out, as the crt
provides a definition.
llvm-svn: 221206
It turns out that MinGW never dllimports of exports inline functions.
This means that code compiled with Clang would fail to link with
MinGW-compiled libraries since we might try to import functions that
are not imported.
To fix this, make Clang never dllimport inline functions when targeting
MinGW.
llvm-svn: 221154
The most complex aspect of the convention is the handling of homogeneous
vector and floating point aggregates. Reuse the homogeneous aggregate
classification code that we use on PPC64 and ARM for this.
This convention also has a C mangling, and we apparently implement that
in both Clang and LLVM.
Reviewed By: majnemer
Differential Revision: http://reviews.llvm.org/D6063
llvm-svn: 221006
Now that we have initial support for VSX, we can begin adding
intrinsics for programmer access to VSX instructions. This patch
performs the necessary enablement in the front end, and tests it by
implementing intrinsics for minimum and maximum using the vector
double data type.
The main change in the front end is to no longer disallow "vector" and
"double" in the same declaration (lib/Sema/DeclSpec.cpp), but "vector"
and "long double" must still be disallowed. The new intrinsics are
accessed via vec_max and vec_min with changes in
lib/Headers/altivec.h. Note that for v4f32, we already access
corresponding VMX builtins, but with VSX enabled we should use the
forms that allow all 64 vector registers.
The new built-ins are defined in include/clang/Basic/BuiltinsPPC.def.
I've added a new test in test/CodeGen/builtins-ppc-vsx.c that is
similar to, but much smaller than, builtins-ppc-altivec.c. This
allows us to test VSX IR generation without duplicating CHECK lines
for the existing bazillion Altivec tests.
Since vector double is now legal when VSX is available, I've modified
the error message, and changed where we test for it and for vector
long double, since the target machine isn't visible in the old place.
This serendipitously removed a not-pertinent warning about 'long'
being deprecated when used with 'vector', when "vector long double" is
encountered and we just want to issue an error. The existing tests
test/Parser/altivec.c and test/Parser/cxx-altivec.cpp have been
updated accordingly, and I've added test/Parser/vsx.c to verify that
"vector double" is now legitimate with VSX enabled.
There is a companion patch for LLVM.
llvm-svn: 220989
Summary:
When we are adding field paddings for asan even an empty dtor has to remain in the code,
so we ignore -mconstructor-aliases if the paddings are going to be added.
Test Plan: added a test
Reviewers: rsmith, rnk, rafael
Reviewed By: rafael
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6038
llvm-svn: 220986
Reuse the PPC64 HVA detection algorithm for ARM and AArch64. This is a
nice code deduplication, since they are roughly identical. A few virtual
method extension points are needed to understand how big an HVA can be
and what element types it can have for a given architecture.
Also make the record expansion code work in the presence of non-virtual
bases.
Reviewed By: uweigand, asl
Differential Revision: http://reviews.llvm.org/D6045
llvm-svn: 220972
The Windows NT SDK uses __readfsdword and declares it as a compiler provided
builtin (#pragma intrinsic(__readfsword). Because intrin.h is not referenced
by winnt.h, it is not possible to provide an out-of-line definition for the
intrinsic. Provide a proper compiler builtin definition.
llvm-svn: 220859
Following the NVVM IR specifications, arguments of aggregate type should be
passed on the stack without splitting (byval).
http://reviews.llvm.org/D6020
Patch by Jacques Pienaar.
llvm-svn: 220854
As discussed in bug 21398, PowerPC ABI code needs to consider C++ base
classes when classifying a class as homogeneous aggregate (or not) for
ABI purposes.
llvm-svn: 220852
An updated implemnentation of VLA types capturing based on previously committed solution for Lambdas.
This version captures the whole VLA type instead of particular variables which are part of VLA size expression and allows to use previusly calculated size of VLA type in captured regions. Required for OpenMP.
Differential Revision: http://reviews.llvm.org/D5099
llvm-svn: 220850
Summary:
We should avoid a tail padding not only if the last field
has zero size but also if the last field is a struct with a flexible array.
If/when http://reviews.llvm.org/D5478 is committed,
this will also handle the case of structs with zero-sized arrays.
Reviewers: majnemer, rsmith
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D5924
llvm-svn: 220708
Wire it through everywhere we have support for fastcall, essentially.
This allows us to parse the MSVC "14" CTP headers, but we will
miscompile them because LLVM doesn't support __vectorcall yet.
Reviewed By: Aaron Ballman
Differential Revision: http://reviews.llvm.org/D5808
llvm-svn: 220573
Summary:
This allows us to easily identify them in the backend which in turn allows us
to handle them correctly for big-endian targets (where they must be shifted
into the upper bits of the register).
Depends on D5961
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: cfe-commits, theraven
Differential Revision: http://reviews.llvm.org/D5962
llvm-svn: 220566
Summary:
Ensure all integral/enumeration types are appropriately annotated with
signext/zeroext. In particular, i32 now has these attributes when using the
N32/N64 ABI. This paves the way for accurately representing the way the
N32/N64 ABI's promotes integer arguments to i64.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: cfe-commits, theraven
Differential Revision: http://reviews.llvm.org/D5961
llvm-svn: 220563
When SanitizerBlacklist decides if the SourceLocation is blacklisted,
we need to first turn it into a SpellingLoc before fetching the filename
and scanning "src:" entries. Otherwise we will fail to fecth the
correct filename for function definitions coming from macro expansion.
llvm-svn: 220403
This reverts commit r220169 which reverted r220153. However, it also
contains additional changes:
- We may need to add padding *after* we've packed the struct. This
occurs when the aligned next field offset is greater than the new
field's offset. When this occurs, we make the struct packed.
*However*, once packed the next field offset might be less than the
new feild's offset. It is in this case that we might further pad the
struct.
- We would pad structs which were perfectly sized! This behavior is
immensely old. This behavior came from blindly subtracting
NextFieldOffsetInChars from RecordSize. This doesn't take into
account the fact that the struct might have a greater overall
alignment than the last field.
llvm-svn: 220175
This commit caused two tests in LNT to regress. I'm able to reproduce on
any platform and will send reproduction steps to the original commit
log. This should restore the LNT bots that have been failing.
llvm-svn: 220169
a NaN-test prior to the call to the library function.
This should automatically make fastmath (including just non-NaNs) able to avoid
the expensive libcalls and also open the door to more advanced folding in LLVM
based on the rules for complex math.
Two important notes to remember: first is that this isn't yet a proper
limited range mode, it's still just improving the unlimited range mode.
Also, it isn't really perfecet w.r.t. what an unlimited range mode
should be doing because it isn't quite handling the flags produced by
all the operations in the way desirable for that mode, but then neither
is compiler-rt's libcall. When the compiler-rt libcall is improved to
carefully manage flags, the code emitted here should be improved
correspondingly. And it is still a long-term desirable thing to add
a limited range mode to Clang that would be able to use direct math
without library calls here.
Special thanks to Steve Canon for the careful review on this patch and
teaching me about these issues. =D
Differential Revision: http://reviews.llvm.org/D5756
llvm-svn: 220167
Before, ConstStructBuilder::AppendBytes would check packed constraints
prior to padding being added before the field's offset. However, adding
this padding might force our struct to be packed. Because we wouldn't
check *after* adding padding, ConstStructBuilder would be in an
inconsistent state leading to a crash.
This fixes PR21300.
llvm-svn: 220153
This commit changes the way we blacklist global variables in ASan.
Now the global is excluded from instrumentation (either regular
bounds checking, or initialization-order checking) if:
1) Global is explicitly blacklisted by its mangled name.
This part is left unchanged.
2) SourceLocation of a global is in blacklisted source file.
This changes the old behavior, where instead of looking at the
SourceLocation of a variable we simply considered llvm::Module
identifier. This was wrong, as identifier may not correspond to
the file name, and we incorrectly disabled instrumentation
for globals coming from #include'd files.
3) Global is blacklisted by type.
Now we build the type of a global variable using Clang machinery
(QualType::getAsString()), instead of llvm::StructType::getName().
After this commit, the active users of ASan blacklist files
may have to revisit them (this is a backwards-incompatible change).
llvm-svn: 220097
This commit changes the way we blacklist functions in ASan, TSan,
MSan and UBSan. We used to treat function as "blacklisted"
and turned off instrumentation in it in two cases:
1) Function is explicitly blacklisted by its mangled name.
This part is not changed.
2) Function is located in llvm::Module, whose identifier is
contained in the list of blacklisted sources. This is completely
wrong, as llvm::Module may not correspond to the actual source
file function is defined in. Also, function can be defined in
a header, in which case user had to blacklist the .cpp file
this header was #include'd into, not the header itself.
Such functions could cause other problems - for instance, if the
header was included in multiple source files, compiled
separately and linked into a single executable, we could end up
with both instrumented and non-instrumented version of the same
function participating in the same link.
After this change we will make blacklisting decision based on
the SourceLocation of a function definition. If a function is
not explicitly defined in the source file, (for example, the
function is compiler-generated and responsible for
initialization/destruction of a global variable), then it will
be blacklisted if the corresponding global variable is defined
in blacklisted source file, and will be instrumented otherwise.
After this commit, the active users of blacklist files may have
to revisit them. This is a backwards-incompatible change, but
I don't think it's possible or makes sense to support the
old incorrect behavior.
I plan to make similar change for blacklisting GlobalVariables
(which is ASan-specific).
llvm-svn: 219997
Summary:
The general approach is to add extra paddings after every field
in AST/RecordLayoutBuilder.cpp, then add code to CTORs/DTORs that poisons the paddings
(CodeGen/CGClass.cpp).
Everything is done under the flag -fsanitize-address-field-padding.
The blacklist file (-fsanitize-blacklist) allows to avoid the transformation
for given classes or source files.
See also https://code.google.com/p/address-sanitizer/wiki/IntraObjectOverflow
Test Plan: run SPEC2006 and some of the Chromium tests with -fsanitize-address-field-padding
Reviewers: samsonov, rnk, rsmith
Reviewed By: rsmith
Subscribers: majnemer, cfe-commits
Differential Revision: http://reviews.llvm.org/D5687
llvm-svn: 219961
They cannot be written to, so marking them const makes sense and may improve
optimisation.
As a side-effect, SectionInfos has to be moved from Sema to ASTContext.
It also fixes this problem, that occurs when compiling ATL:
warning LNK4254: section 'ATL' (C0000040) merged into '.rdata' (40000040) with different attributes
The ATL headers are putting variables in a special section that's marked
read-only. However, Clang currently can't model that read-onlyness in the IR.
But, by making the variables const, the section does become read-only, and
the linker warning is avoided.
Differential Revision: http://reviews.llvm.org/D5812
llvm-svn: 219960
CodeGen wouldn't mark the aliasee as thread_local if the aliasee was a
tentative definition.
Even if the definition was already emitted, it would never mark the
alias as thread_local.
This fixes PR21288.
llvm-svn: 219859
Thumb1 has legitimate reasons for preferring 32-bit alignment of types
i1/i8/i16, since the 16-bit encoding of "add rD, sp, #imm" requires #imm to be
a multiple of 4. However, this is a trade-off betweem code size and RAM usage;
the DataLayout string is not the best place to represent it even if desired.
So this patch removes the extra Thumb requirements, hopefully making ARM and
Thumb completely compatible in this respect.
llvm-svn: 219735
Before, ARM and Thumb mode code had different preferred alignments, which could
lead to some rather unexpected results. There's justification for reducing it
from the default 64-bits (wasted space), but I don't think there is for going
below 32-bits.
There's no actual ABI change here, just to reassure people.
llvm-svn: 219720
This addresses a regression introduced with SVN r219393. A block may be
contained within another block. In such a scenario, we would end up within a
BlockDecl, which is not a NamedDecl (as the names are synthesised). The cast to
a NamedDecl of the DeclContext would then assert as the types are unrelated.
Restore the mangling behaviour to that prior to SVN r219393. If the current
block is contained within a BlockDecl, walk up to the parent DeclContext,
recursively, until we have a non-BlockDecl. This is expected to be a NamedDecl.
Add in a couple of asserts to ensure that the assumption that we only encounter
a block within a NamedDecl or a BlockDecl.
llvm-svn: 219696
Previously loop hints such as #pragma loop vectorize_width(#) required a constant. This patch allows a constant expression to be used as well. Such as a non-type template parameter or an expression (2 * c + 1).
Reviewed by Richard Smith
llvm-svn: 219589
and !=) to support mixed complex and real operand types.
This requires removing an assert from SemaChecking, and adding support
both to the constant evaluator and the code generator to synthesize the
imaginary part when needed. This seemed somewhat cleaner than having
just the comparison operators force real-to-complex conversions.
I've added test cases for these operations. I'm really terrified that
there were *no* tests in-tree which exercised this.
This turned up when trying to build R after my change to the complex
type lowering.
llvm-svn: 219570
for complex math.
This should fix the windows build bots that started having trouble here
and generally fix complex libcall emission on targets which use sret for
complex data types. It also makes the code a bit simpler (despite
calling into a much more complex bucket of code).
llvm-svn: 219565
operators where one type is a C complex type, and to emit both the
efficient and correct implementation for complex arithmetic according to
C11 Annex G using this extra information.
For both multiply and divide the old code was writing a long-hand
reduced version of the math without any of the special handling of inf
and NaN recommended by the standard here. Instead of putting more
complexity here, this change does what GCC does which is to emit
a libcall for the fully general case.
However, the old code also failed to do the proper minimization of the
set of operations when there was a mixed complex and real operation. In
those cases, C provides a spec for much more minimal operations that are
valid. Clang now emits the exact suggested operations. This change isn't
*just* about performance though, without minimizing these operations, we
again lose the correct handling of infinities and NaNs. It is critical
that this happen in the frontend based on assymetric type operands to
complex math operations.
The performance implications of this change aren't trivial either. I've
run a set of benchmarks in Eigen, an open source mathematics library
that makes heavy use of complex. While a few have slowed down due to the
libcall being introduce, most sped up and some by a huge amount: up to
100% and 140%.
In order to make all of this work, also match the algorithm in the
constant evaluator to the one in the runtime library. Currently it is
a broken port of the simplifications from C's Annex G to the long-hand
formulation of the algorithm.
Splitting this patch up is very hard because none of this works without
the AST change to preserve non-complex operands. Sorry for the enormous
change.
Follow-up changes will include support for sinking the libcalls onto
cold paths in common cases and fastmath improvements to allow more
aggressive backend folding.
Differential Revision: http://reviews.llvm.org/D5698
llvm-svn: 219557
Make it possible to pass NULL through variadic functions on 64-bit
Windows targets. The Visual C++ headers define NULL to 0, when they
should define it to 0LL on Win64 so that NULL is a pointer-sized
integer.
Fixes PR20949.
Reviewers: thakis, rsmith
Differential Revision: http://reviews.llvm.org/D5480
llvm-svn: 219456
We already add the align parameter attribute for function parameters that have
the align_value attribute (or those with a typedef type having that attribute),
which is an important special case, but does not handle pointers with value
alignment assumptions that come into scope in any other way. To handle the
general case, emit an @llvm.assume-based alignment assumption whenever we load
the pointer-typed lvalue of an align_value-attributed variable (except for
function parameters, which we already deal with at entry).
I'll also note that this is more general than Intel's described support in:
https://software.intel.com/en-us/articles/data-alignment-to-assist-vectorization
which states that the compiler inserts __assume_aligned directives in response
to align_value-attributed variables only for function parameters and for the
initializers of local variables. I think that we can make the optimizer deal
with this more-general scheme (which could lead to a lot of calls to
@llvm.assume inside of loop bodies, for example), but if not, I'll rework this
to be less aggressive.
llvm-svn: 219052
This reverts commit r218917, effectively reapplying r218913. Original
commit message follows.
--
Update debug info testcases for an LLVM metadata schema change to fold
metadata constant operands into a single `MDString`.
Part of PR17891.
llvm-svn: 219011
This test includes stdint.h (via stdatomic.h), which might include system
headers (and that might not work, depending on the system configuration).
Attempting to fix llvm-clang-lld-x86_64-debian-fast.
llvm-svn: 218960
Adds a Clang-specific implementation of C11's stdatomic.h header. On systems,
such as FreeBSD, where a stdatomic.h header is already provided, we defer to
that header instead (using our __has_include_next technology). Otherwise, we
provide an implementation in terms of our __c11_atomic_* intrinsics (that were
created for this purpose).
C11 7.1.4p1 requires function declarations for atomic_thread_fence,
atomic_signal_fence, atomic_flag_test_and_set,
atomic_flag_test_and_set_explicit, and atomic_flag_clear, and requires that
they have external linkage. Accordingly, we provide these declarations, but if
a user elides the shadowing macros and uses them, then they must have a libc
(or similar) that actually provides definitions.
atomic_flag is implemented using _Bool as the underlying type. This is
consistent with the implementation provided by FreeBSD and also GCC 4.9 (at
least when __GCC_ATOMIC_TEST_AND_SET_TRUEVAL == 1).
Patch by Richard Smith (rebased and slightly edited by me -- Richard said I
should drive at this point).
llvm-svn: 218957
Update debug info testcases for an LLVM metadata schema change to fold
metadata constant operands into a single `MDString`.
Part of PR17891.
llvm-svn: 218913
This adds support for the align_value attribute. This attribute is supported by
Intel's compiler (versions 14.0+), and several of my HPC users have requested
support in Clang. It specifies an alignment assumption on the values to which a
pointer points, and is used by numerical libraries to encourage efficient
generation of vector code.
Of course, we already have an aligned attribute that can specify enhanced
alignment for a type, so why is this additional attribute important? The
problem is that if you want to specify that an input array of T is, say,
64-byte aligned, you could try this:
typedef double aligned_double attribute((aligned(64)));
void foo(aligned_double *P) {
double x = P[0]; // This is fine.
double y = P[1]; // What alignment did those doubles have again?
}
the access here to P[1] causes problems. P was specified as a pointer to type
aligned_double, and any object of type aligned_double must be 64-byte aligned.
But if P[0] is 64-byte aligned, then P[1] cannot be, and this access causes
undefined behavior. Getting round this problem requires a lot of awkward
casting and hand-unrolling of loops, all of which is bad.
With the align_value attribute, we can accomplish what we'd like in a well
defined way:
typedef double *aligned_double_ptr attribute((align_value(64)));
void foo(aligned_double_ptr P) {
double x = P[0]; // This is fine.
double y = P[1]; // This is fine too.
}
This attribute does not create a new type (and so it not part of the type
system), and so will only "propagate" through templates, auto, etc. by
optimizer deduction after inlining. This seems consistent with Intel's
implementation (thanks to Alexey for confirming the various Intel-compiler
behaviors).
As a final note, I would have chosen to call this aligned_value, not
align_value, for better naming consistency with the aligned attribute, but I
think it would be more useful to users to adopt Intel's name.
llvm-svn: 218910
Prior to GCC 4.4, __sync_fetch_and_nand was implemented as:
{ tmp = *ptr; *ptr = ~tmp & value; return tmp; }
but this was changed in GCC 4.4 to be:
{ tmp = *ptr; *ptr = ~(tmp & value); return tmp; }
in response to this change, support for sync_fetch_and_nand (and
sync_nand_and_fetch) was removed in r99522 in order to avoid miscompiling code
depending on the old semantics. However, at this point:
1. Many years have passed, and the amount of code relying on the old
semantics is likely smaller.
2. Through the work of many contributors, all LLVM backends have been updated
such that "atomicrmw nand" provides the newer GCC 4.4+ semantics (this process
was complete July of 2014 (added to the release notes in r212635).
3. The lack of this intrinsic is now a needless impediment to porting codes
from GCC to Clang (I've now seen several examples of this).
It is true, however, that we still set GNUC_MINOR to 2 (corresponding to GCC
4.2). To compensate for this, and to address the original concern regarding
code relying on the old semantics, I've added a warning that specifically
details the fact that the semantics have changed and that we provide the newer
semantics.
Fixes PR8842.
llvm-svn: 218905
In addition to __builtin_assume_aligned, GCC also supports an assume_aligned
attribute which specifies the alignment (and optional offset) of a function's
return value. Here we implement support for the assume_aligned attribute by making
use of the @llvm.assume intrinsic.
llvm-svn: 218500
AFAICT the semantics of frem match libm's fmod.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 218488
Fixes PR21027. The MIDL compiler produces code that does this.
If we wanted to improve the warning, I think we could do this:
void __stdcall f(); // Don't warn without -Wstrict-prototypes.
void g() {
f(); // Might warn, the user probably meant for f to take no args.
f(1, 2, 3); // Warn, we have no idea what args f takes.
f(1); // Error, this is insane, one of these calls is broken.
}
Reviewers: thakis
Differential Revision: http://reviews.llvm.org/D5481
llvm-svn: 218394
Summary:
Vectors are normally 16-byte aligned, however the O32 ABI enforces a
maximum alignment of 8-bytes since the base of the stack is 8-byte aligned.
Previously, this was enforced on the caller side, but not on the callee side.
This fixes the output of OpenCL's printf when given vectors.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: llvm-commits, pekka.jaaskelainen
Differential Revision: http://reviews.llvm.org/D5433
llvm-svn: 218248
Summary:
This fixes PR20023. In order to implement this scoping rule, we piggy
back on the existing LabelDecl machinery, by creating LabelDecl's that
will carry the "internal" name of the inline assembly label, which we
will rewrite the asm label to.
Reviewers: rnk
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D4589
llvm-svn: 218230
According to lore, we used to verifier-fail on:
void __thiscall f();
int main() { f(1); }
So that's fixed now. System headers use prototype-less __stdcall functions,
so make that a warning that's DefaultError -- then it fires on regular code
but is suppressed in system headers.
Since it's used in system headers, we have codegen tests for this; massage
them slightly so that they still compile.
llvm-svn: 218166
They are added to adxintrin.h but outside __ADX__ block.
These intrinics generates adc and sbb correspondingly that were available before ADX
llvm-svn: 218118
Summary:
This patch implements a new UBSan check, which verifies
that function arguments declared to be nonnull with __attribute__((nonnull))
are actually nonnull in runtime.
To implement this check, we pass FunctionDecl to CodeGenFunction::EmitCallArgs
(where applicable) and if function declaration has nonnull attribute specified
for a certain formal parameter, we compare the corresponding RValue to null as
soon as it's calculated.
Test Plan: regression test suite
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: cfe-commits, rnk
Differential Revision: http://reviews.llvm.org/D5082
llvm-svn: 217389
This makes use of the recently-added @llvm.assume intrinsic to implement a
__builtin_assume(bool) intrinsic (to provide additional information to the
optimizer). This hooks up __assume in MS-compatibility mode to mirror
__builtin_assume (the semantics have been intentionally kept compatible), and
implements GCC's __builtin_assume_aligned as assume((p - o) & mask == 0). LLVM
now contains special logic to deal with assumptions of this form.
llvm-svn: 217349
made the 8-bit masks actually 8-bit arguments to these intrinsics.
These builtins are a mess. Many were missing the I qualifier which
I added where obviously correct. Most aren't tested, but I've updated
the relevant tests. I've tried to catch all the things that should
become 'c' in this round.
It's also frustrating because the set of these is really ad-hoc and
doesn't really map that cleanly to the set supported by either GCC or
LLVM. Oh well...
llvm-svn: 217311
This patch adds support for the 32bit numeric max/min and directed round-to-integral NEON intrinsics that were added as part of v8, along with unit tests.
Patch by Graham Hunter!
llvm-svn: 217242
For naked functions with parameters, Clang would still emit stores in the prologue
that would clobber the stack, because LLVM doesn't set up a stack frame. (This
shows up in -O0 compiles, because the stores are optimized away otherwise.)
For example:
__attribute__((naked)) int f(int x) {
asm("movl $42, %eax");
asm("retl");
}
Would result in:
_Z1fi:
movl 12(%esp), %eax
movl %eax, (%esp) <--- Oops.
movl $42, %eax
retl
Differential Revision: http://reviews.llvm.org/D5183
llvm-svn: 217198
If control falls off the end of a function after an __asm block, MSVC
assumes that the inline assembly filled the EAX and possibly EDX
registers with an appropriate return value. This functionality is used
in inline functions returning 64-bit integers in system headers, so we
need some amount of compatibility.
This is implemented in Clang by adding extra output constraints to every
inline asm block, and storing the resulting output registers into the
return value slot. If we see an asm block somewhere in the function
body, we emit a normal epilogue instead of marking the end of the
function with a return type unreachable.
Normal returns in functions not using this functionality will overwrite
the return value slot, and in most cases LLVM should be able to
eliminate the dead stores.
Fixes PR17201.
Reviewed By: majnemer
Differential Revision: http://reviews.llvm.org/D5177
llvm-svn: 217187
Summary:
This allows us to easily find them in the backend after the aggregates have
been lowered to other types. This is important on big-endian targets using
the N32/N64 ABI's since these ABI's must shift small structures into the
upper bits of the register.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5005
llvm-svn: 217160
Summary:
They are returned indirectly which causes the other arguments to move to
the next argument slot.
With this, utils/ABITest does not discover any failing cases in the first
500 attempts on big/little endian for O32. Previously some of these failed.
Also tested N32/N64 little endian (big endian has other known issues) with
no issues.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: atanasyan, cfe-commits
Differential Revision: http://reviews.llvm.org/D4811
llvm-svn: 217147
Using the intrinsic allows the SelectionDAGBuilder to turn this call
into the FABS Node and also the intrinsic is something the vectorizer knows
how to vectorize.
This patch also sets the readnone attribute on this call, which should
enable additional optmizations.
llvm-svn: 217042
Previously, EnterStructPointerForCoercedAccess used Alloc size when determining how to convert. This was problematic, because there were situations were the alloc size was larger than the store size. For example, if the first element of a structure were i24 and the destination type were i32, the old code would generate a GEP and a load i24. The code should compare store sizes to ensure the whole object is loaded. I have attached a test case.
This patch modifies the output of arm64-be-bitfield.c test case, but the new IR seems to be equivalent, and after -O3, the compiler generates identical ARM assembly. (asr x0, x0, #54)
Patch by Thomas Jablin!
llvm-svn: 216722
Summary:
We did a great job getting this wrong:
- We messed up which LLVM IR types to use for arguments and return values.
The optimized libcalls use integer types for values.
Clang attempted to use the IR type which corresponds to the value
passed in instead of using an appropriately sized integer type. This
would result in violations of the ABI for, as an example, floating
point types.
- We didn't bother recording the result of the atomic libcall in the
destination memory.
Instead, call the functions with arguments matching the type of the
libcall prototype's parameters.
This fixes PR20780.
Differential Revision: http://reviews.llvm.org/D5098
llvm-svn: 216714
Summary:
The current implementation of asan cookie is incorrect:
we add nosanitize metadata to the cookie load, but the metadata may be lost
and we will instrument the load from poisoned memory.
This change replaces the load with a call to __asan_load_cxx_array_cookie (r216692)
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D5111
llvm-svn: 216702
For the following code:
__declspec(dllimport) int f(int x);
int user(int x) {
return f(x);
}
int f(int x) { return 1; }
Clang will drop the dllimport attribute in the AST, but CodeGen would have
already put it on the LLVM::Function, and that would never get updated.
(The same thing happens for global variables.)
This makes Clang check dropped DLL attribute case each time the LLVM object
is referenced.
This isn't perfect, because we will still get it wrong if the function is
never referenced by codegen after the attribute is dropped, but this handles
the common cases and makes us not fail in the verifier.
llvm-svn: 216699
Summary:
ACLE 2.0 section 9.2 defines the following "miscellaneous data processing intrinsics": `__clz`, `__cls`, `__ror`, `__rev`, `__rev16`, `__revsh` and `__rbit`.
`__clz` has already been implemented in the arm_acle.h header file. The rest are not supported yet. This patch completes ACLE data processing intrinsics.
Reviewers: t.p.northover, rengolin
Reviewed By: rengolin
Subscribers: aemerson, mroth, llvm-commits
Differential Revision: http://reviews.llvm.org/D4983
llvm-svn: 216658