is evaluated in a condition expression and then
dereferenced to envoke the block. This is
pr15663 and I applied a slight variation of the
patch with a test case. (patch is from
Arthur O'Dwyer). Also // rdar://14085217
llvm-svn: 183471
X86's 'y' inline assembly constraint represents an MMX register, this change
prevents Clang from hitting an assertion when passed an incompatible type to
deal with.
llvm-svn: 183467
This could actually be implemented with the LLVM IR va_arg instruction,
but it doesn't seem to offer any advantages over accessing the va_list
pointer directly.
Using the va_list pointer directly makes it possible to perform type
coercion directly from the argument array, and the va_list updates are
exposed to the optimizers.
llvm-svn: 183292
Type coercion for argument passing is equivalent to storing the source
type and loading the destination type from the same pointer. On
big-endian targets, this means that the high bits of integers are
preserved.
This patch fixes the CoerceIntOrPtrToIntOrPtr() function on big-endian
targets by inserting the required shift instructions to preserve the
high bits instead of the low bits.
This is used by SparcABIInfo when passing small structs in the high bits
of registers.
llvm-svn: 183291
The 'inreg' attribute can also be applied to function return values in
LLVM IR. The SPARC v9 backend is using the flag when returning structs
containing 32-bit floats.
llvm-svn: 183290
The text of this diagnostic was unnecessarily specific to the current ARM
implementation of validateConstraintModifier, and it gave a potentially
confusing suggestion for fixing the problem. The ARM-specific issue is not
a big deal since that is the only target that currently does any checking of
asm operand modifiers, but until my change in 183172 it was still wrong for
output operands in the way that it referred to the value being truncated when
put into a register, since output operands are retrieved from the registers
instead of being put into them. The bigger problem is that its suggestion to
"use a modifier" is wrong and confusing in the case where a "q" modifier is
incorrectly used with an "r" constraint. In that case, the solution might
well be to remove the modifier or perhaps change the constraint. It's better
to just leave the diagnostic message more generic.
llvm-svn: 183209
We're getting reports of this warning getting triggered in cases where it
is not adding any value. There is no asm operand modifier that you can use
to silence it, and there's really nothing wrong with having an LDRB, for
example, with a "char" output.
llvm-svn: 183172
For integer types of sizes 1, 2, 4 and 8, libcompiler-rt (and libgcc)
provide atomic functions that pass parameters by value and return
results directly.
libgcc and libcompiler-rt only provide optimized libcalls for
__atomic_fetch_*, as generic libcalls on non-integer types would make
little sense. This means that we can finally make __atomic_fetch_* work
on architectures for which we don't provide these operations as builtins
(e.g. ARM).
This should fix the dreaded "cannot compile this atomic library call
yet" error that would pop up once every while.
llvm-svn: 183033
The coercion type serves two purposes:
1. Pad structs to a multiple of 64 bits, so they are passed
'left-aligned' in registers.
2. Expose aligned floating point elements as first-level elements, so
the code generator knows to pass them in floating point registers.
We also compute the InReg flag which indicates that the struct contains
aligned 32-bit floats. This flag is used by the code generator to pick
the right registers.
llvm-svn: 182753
- All integer arguments smaller than 64 bits are extended.
- Large structs are passed indirectly, not using 'byval'.
- Structs up to 32 bytes in size are returned in registers.
Some things are not implemented yet:
- EmitVAArg can be implemented in terms of the va_arg instruction.
- When structs are passed in registers, float members require special
handling because they are passed in the floating point registers.
- Structs are left-aligned when passed in registers. This may require
padding.
llvm-svn: 182745
This removes a FIXME in CodeGenModule::SetLLVMFunctionAttributesForDefinition.
When a function is declared cold we can now generate the IR attribute in
addition to marking the function to be optimized for size.
I tried adding a separate CHECK in the existing test, but it was
failing. I suppose CHECK matches one line exactly once? This would be
a problem if the attributes are listed in a different order, though they
seem to be sorted.
llvm-svn: 182666
selectany only applies to externally visible global variables. It has
the effect of making the data weak_odr.
The MSDN docs suggest that unused definitions can only be dropped at
linktime, so Clang uses weak instead of linkonce. MSVC optimizes away
references to constant selectany data, so it must assume that there is
only one definition, hence weak_odr.
Reviewers: espindola
Differential Revision: http://llvm-reviews.chandlerc.com/D814
llvm-svn: 182266
These intrinsics use the __builtin_shuffle() function to extract the
low and high half, respectively, of a 128-bit NEON vector. Currently,
they're defined to use bitcasts to simplify the emitter, so we get code
like:
uint16x4_t vget_low_u32(uint16x8_t __a) {
return (uint32x2_t) __builtin_shufflevector((int64x2_t) __a,
(int64x2_t) __a,
0);
}
While this works, it results in those bitcasts going all the way through
to the IR, resulting in code like:
%1 = bitcast <8 x i16> %in to <2 x i64>
%2 = shufflevector <2 x i64> %1, <2 x i64> undef, <1 x i32>
%zeroinitializer
%3 = bitcast <1 x i64> %2 to <4 x i16>
We can instead easily perform the operation directly on the input vector
like:
uint16x4_t vget_low_u16(uint16x8_t __a) {
return __builtin_shufflevector(__a, __a, 0, 1, 2, 3);
}
Not only is that much easier to read on its own, it also results in
cleaner IR like:
%1 = shufflevector <8 x i16> %in, <8 x i16> undef,
<4 x i32> <i32 0, i32 1, i32 2, i32 3>
This is both easier to read and easier for the back end to reason
about effectively since the operation is obfuscating the source with
bitcasts.
rdar://13894163
llvm-svn: 181865
Current gcc's produce an error if __clear_cache is anything but
__clear_cache(char *a, char *b);
It looks like we had just implemented a gcc bug that is now fixed.
llvm-svn: 181784
EmitCapturedStmt creates a captured struct containing all of the captured
variables, and then emits a call to the outlined function. This is similar in
principle to EmitBlockLiteral.
GenerateCapturedFunction actually produces the outlined function. It is based
on GenerateBlockFunction, but is much simpler. The function type is determined
by the parameters that are in the CapturedDecl.
Some changes have been added to this patch that were reviewed as part of the
serialization patch and moving the parameters to the captured decl.
Differential Revision: http://llvm-reviews.chandlerc.com/D640
llvm-svn: 181536
This patch then adds all the usual platform-specific pieces for SystemZ:
driver support, basic target info, register names and constraints,
ABI info and vararg support. It also adds new tests to verify pre-defined
macros and inline asm, and updates a test for the minimum alignment change.
This version of the patch incorporates feedback from reviews by
Eric Christopher and John McCall. Thanks to all reviewers!
Patch by Richard Sandiford.
llvm-svn: 181211
Un-break the gdb buildbot.
- Use the debug location of the return expression for the cleanup code
if the return expression is trivially evaluatable, regardless of the
number of stop points in the function.
- Ensure that any EH code in the cleanup still gets the line number of
the closing } of the lexical scope.
- Added a testcase with EH in the cleanup.
rdar://problem/13442648
llvm-svn: 181056
the actual parser and support arbitrary id-expressions.
We're actually basically set up to do arbitrary expressions here
if we wanted to.
Assembly operands permit things like A::x to be written regardless
of language mode, which forces us to embellish the evaluation
context logic somewhat. The logic here under template instantiation
is incorrect; we need to preserve the fact that an expression was
unevaluated. Of course, template instantiation in general is fishy
here because we have no way of delaying semantic analysis in the
MC parser. It's all just fishy.
I've also fixed the serialization of MS asm statements.
This commit depends on an LLVM commit.
llvm-svn: 180976
side because we need an inline asm diagnostics handler in place. Unfortunately,
we emit a .s file because we need to build the SelectionDAG to hit the backend
issue.
rdar://13446483
llvm-svn: 180874
We switch the order of offset and field type to make TBAAStructType node
(name, parent node, offset) similar to scalar TBAA node (name, parent node).
llvm-svn: 180653
For ms structs, zero-length bitfields following non-bitfield members are
completely ignored, we should not increase the field index.
Before the fix, we will have an assertion failure.
llvm-svn: 180038
Also,
- abstract out the indirect/in memory/in registers decisions into the CGCXXABI
- fix handling of empty struct arguments for '-cxx-abi microsoft'
- add/fix tests
llvm-svn: 179681
The SPARC v8 and SPARC v8 architectures are very similar, so use a base
class to share most information between them.
Include operating systems with known SPARC v9 ports.
Also fix two issues with the SPARC v8 data layout string: SPARC v8 is a
big endian target with a 64-bit aligned stack.
llvm-svn: 179596
For struct-path aware TBAA, we used to use scalar type node as the scalar tag,
which has an incompatible format with the struct path tag. We now use the same
format: base type, access type and offset.
We also uniformize the scalar type node and the struct type node: name, a list
of pairs (offset + pointer to MDNode). For scalar type, we have a single pair.
These are to make implementaiton of aliasing rules easier.
llvm-svn: 179335
however, it doesn't do that unless we're optimizing. Change
that and haul out to a helper function. Also make this a driver
test appropriate rather than an assembly test.
llvm-svn: 178606
- Add head 'prfchwintrin.h' to define '_m_prefetchw' which is mapped to
LLVM/clang prefetch builtin
- Add option '-mprfchw' to enable PRFCHW feature and pre-define '__PRFCHW__'
macro
llvm-svn: 178041
isIncompleteType() returns true or false for template types depending on whether
the type is instantiated yet. In this context, that's arbitrary. The better way
to check for a complete type is RequireCompleteType().
Thanks to Eli Friedman for noticing this!
<rdar://problem/12700799>
llvm-svn: 177768
emit function names in .gcda files by default, and the flag turns that off!
Rename the flag to make it match what it actually does. This keeps the default
format compatible with gcc 4.2.
Also add a test for this flag.
llvm-svn: 177475
it wasn't taking into account that the float should be truncated *before* the
range check happens. Thus (unsigned)-0.99 and (unsigned char)255.9 have defined
behavior and should not be trapped.
llvm-svn: 177362
'R' An address that can be sued in a non-macro load or store.
Including missing positive test case and fixed typo for r176453.
Thanks to Richard Smith for catching this!
Jack
llvm-svn: 176506
- Generate atomicrmw operations in most of the cases when it's sensible to do
so.
- Don't crash in several common cases (and hopefully don't crash in more of
them).
- Add some better tests.
We now generate significantly better code for things like:
_Atomic(int) x;
...
x++;
On MIPS, this now generates a 4-instruction ll/sc loop, where previously it
generated about 30 instructions in two nested loops. On x86-64, we generate a
single lock incl, instead of a lock cmpxchgl loop (one instruction instead of
ten).
llvm-svn: 176420
These are two related changes (one in llvm, one in clang).
LLVM:
- rename address_safety => sanitize_address (the enum value is the same, so we preserve binary compatibility with old bitcode)
- rename thread_safety => sanitize_thread
- rename no_uninitialized_checks -> sanitize_memory
CLANG:
- add __attribute__((no_sanitize_address)) as a synonym for __attribute__((no_address_safety_analysis))
- add __attribute__((no_sanitize_thread))
- add __attribute__((no_sanitize_memory))
for S in address thread memory
If -fsanitize=S is present and __attribute__((no_sanitize_S)) is not
set llvm attribute sanitize_S
llvm-svn: 176076
My previous attempt was extremely deficient, allowing more volatiles
to be introduced and not even checking all of the ones that are
present.
This attempt doesn't try to keep track of the values stored or offsets
within particular objects, just that the correct objects are accessed
in a correctly volatile manner throughout.
llvm-svn: 174700
AArch64 handles aggFct's return struct slightly differently, leading
to an extra memcpy. This test is fortunately only concerned about
volatile copies, so we can modify the grep text to filter it.
llvm-svn: 174621
Only some ABIs require the "signext" attribute on parameters. On most
platforms, however, it's a useful test so it's best not to limit it to some
random arbitrary platform.
llvm-svn: 174619
This addresses several (not all) debug info tests that use explicit metadata
numbers. Wherever the same number appeared more than once in a test I used
a named match to ensure the same number appeared in all those cases (this may
still be overly constraining test cases as they may not have actually cared
about that relationship). For one-off numbers I just replaced them with an
unnamed regex.
This may underconstrain poorly written test cases that were interested in
checking that certain metadata nodes were related but didn't actually match
on all the related nodes numbers.
llvm-svn: 174247
Prior to the patch, Clang does not properly promote types when a complex
integer operand is combined with an integer via a binary operator, or when
one is assigned to the other in either order. This patch detects when
promotion is needed (and permissible) and generates the necessary code.
The test assmes no target has the same size operands for "char" and
"long long," and that no target performs arithmetic on char operands without
extending them to a larger format first. If there are any targets for
which this is not the case, they should be XFAILed.
llvm-svn: 174181
Introduces these negation forms explicitly and uses them to control a new
"altivec" target feature for PowerPC. This allows avoiding generating
Altivec instructions on processors that support Altivec.
The new test case verifies that the Altivec "lvx" instruction is not
used when -fno-altivec is present on the command line.
llvm-svn: 174140
In cooperation with the LLVM patch, this should implement all scalar front-end
parts of the C and C++ ABIs for AArch64.
This patch excludes the NEON support also reviewed due to an outbreak of
batshit insanity in our legal department. That will be committed soon bringing
the changes to precisely what has been approved.
Further reviews would be gratefully received.
llvm-svn: 174055
Indents were given the color blue when outputting with color.
AST dumping now looks like this:
Node
|-Node
| `-Node
`-Node
`-Node
Compared to the previous:
(Node
(Node
(Node))
(Node
(Node)))
llvm-svn: 174022
implementation; this is much more inline with the original implementation
(i.e., pre-ubsan) and does not require run-time library support.
The trapping implementation can be invoked using either '-fcatch-undefined-behavior'
or '-fsanitize=undefined-trap -fsanitize-undefined-trap-on-error', with the latter
being preferred. Eventually, the -fcatch-undefined-behavior' flag will be removed.
llvm-svn: 173848
One of the gotchas (see changes to CodeGenFunction) was due to the fix in
r139416 (for PR10829). This only worked previously because the top level
lexical block would set the location to the end of the function, the debug
location would be updated (as per r139416), the location would be set to
the end of the function again (but that would no-op, since it was the same
as the previous location), then the return instruction would be emitted using
the debug location.
Once the top level lexical block was no longer emitted, the end-of-function
location change was causing the debug loc to be updated, regressing that bug.
llvm-svn: 173593
Title: [PR9027] volatile struct bug: member is not loaded at -O;
This is caused by last flag passed to @llvm.memcpy being false,
not honoring that aggregate has at least one 'volatile' data member
(even though aggregate itself has not been qualified as 'volatile'.
As a result, optimization optimizes away the memcpy altogether.
Patch review by John MaCall (I still need to fix up a test though).
llvm-svn: 173535
This is a missing piece for C99 conformance.
This patch handles UCNs by adding a '\\' case to LexTokenInternal and
LexIdentifier -- if we see a backslash, we tentatively try to read in a UCN.
If the UCN is not syntactically well-formed, we fall back to the old
treatment: a backslash followed by an identifier beginning with 'u' (or 'U').
Because the spelling of an identifier with UCNs still has the UCN in it, we
need to convert that to UTF-8 in Preprocessor::LookUpIdentifierInfo.
Of course, valid code that does *not* use UCNs will see only a very minimal
performance hit (checks after each identifier for non-ASCII characters,
checks when converting raw_identifiers to identifiers that they do not
contain UCNs, and checks when getting the spelling of an identifier that it
does not contain a UCN).
This patch also adds basic support for actual UTF-8 in the source. This is
treated almost exactly the same as UCNs except that we consider stray
Unicode characters to be mistakes and offer a fixit to remove them.
llvm-svn: 173369
the 64-bit PowerPC ELF ABI.
The ABI requires that the real and imaginary parts of a complex argument
each occupy their own doubleword. Arguments smaller than 8 bytes are
right-adjusted within the doubleword.
Clang expects EmitVAARG() to return a pointer to a structure in which
the real and imaginary parts are packed adjacently in memory. To accomplish
this, we generate code to load the code appropriately from the varargs
location and pack the values into a temporary variable in the form Clang
expects, returning a pointer to that structure.
The test case demonstrates correct code generation for all "small" complex
types on PPC64: int, short, char, and float.
llvm-svn: 172438
with respect to the lower "left-hand-side bitwidth" bits, even when negative);
see OpenCL spec 6.3j. This patch both implements this behaviour in the code
generator and "constant folding" bits of Sema, and also prevents tests
to detect undefinedness in terms of the weaker C99 or C++ specifications
from being applied.
llvm-svn: 171755
When we are visiting the extern declaration of 'i' in
static int i = 99;
int foo() {
extern int i;
return i;
}
We should not try to handle it as if it was an function static. That is, we
must consider the written storage class.
Fixing this then exposes that the assert in EmitGlobalVarDeclLValue and the
if leading to its call are not completely accurate. They were passing before
because the second decl was marked as having external storage. I changed them
to check the linkage, which I find easier to understand.
Last but not least, there is something strange going on with cuda and opencl.
My guess is that the linkage computation for these languages needs to be
audited, but I didn't want to change that in this patch so I just updated
the storage classes to keep the current behavior.
Thanks to Reed Kotler for reporting this.
llvm-svn: 170827
PR 14529 was opened because neither Clang or LLVM was expanding
calls to creal* or cimag* into instructions that just load the
respective complex field. After some discussion, it was not
considered realistic to do this in LLVM because of the platform
specific way complex types are expanded. Thus a way to solve
this in Clang was pursued. GCC does a similar expansion.
This patch adds the feature to Clang by making the creal* and
cimag* functions library builtins and modifying the builtin code
generator to look for the new builtin types.
llvm-svn: 170455