This patch addresses the first stage of PR17891 by folding constant
arguments together into a single MDString. Integers are stringified and
a `\0` character is used as a separator.
Part of PR17891.
Note: I've attached my testcases upgrade scripts to the PR. If I've
just broken your out-of-tree testcases, they might help.
llvm-svn: 218914
Update debug info testcases for an LLVM metadata schema change to fold
metadata constant operands into a single `MDString`.
Part of PR17891.
llvm-svn: 218913
elements as well as integer elements in order to form simpler shuffle
patterns.
This is the primary reason why we were failing to match some of the
2-and-2 floating point shuffles such as PR21140. Even after fixing this
we need to support some extra patterns in the backend in order to match
the resulting X86ISD::UNPCKL nodes into the correct instructions. This
commit should fix PR21140 and includes more comprehensive testing of
insertion patterns in v4 shuffles.
Not all of the added tests are beautiful. For example, we don't have
clever instructions to insert-via-load in the integer domain. There are
also some places where we aren't sufficiently cunning with our use of
movq and movd, but that's future work.
llvm-svn: 218911
This adds support for the align_value attribute. This attribute is supported by
Intel's compiler (versions 14.0+), and several of my HPC users have requested
support in Clang. It specifies an alignment assumption on the values to which a
pointer points, and is used by numerical libraries to encourage efficient
generation of vector code.
Of course, we already have an aligned attribute that can specify enhanced
alignment for a type, so why is this additional attribute important? The
problem is that if you want to specify that an input array of T is, say,
64-byte aligned, you could try this:
typedef double aligned_double attribute((aligned(64)));
void foo(aligned_double *P) {
double x = P[0]; // This is fine.
double y = P[1]; // What alignment did those doubles have again?
}
the access here to P[1] causes problems. P was specified as a pointer to type
aligned_double, and any object of type aligned_double must be 64-byte aligned.
But if P[0] is 64-byte aligned, then P[1] cannot be, and this access causes
undefined behavior. Getting round this problem requires a lot of awkward
casting and hand-unrolling of loops, all of which is bad.
With the align_value attribute, we can accomplish what we'd like in a well
defined way:
typedef double *aligned_double_ptr attribute((align_value(64)));
void foo(aligned_double_ptr P) {
double x = P[0]; // This is fine.
double y = P[1]; // This is fine too.
}
This attribute does not create a new type (and so it not part of the type
system), and so will only "propagate" through templates, auto, etc. by
optimizer deduction after inlining. This seems consistent with Intel's
implementation (thanks to Alexey for confirming the various Intel-compiler
behaviors).
As a final note, I would have chosen to call this aligned_value, not
align_value, for better naming consistency with the aligned attribute, but I
think it would be more useful to users to adopt Intel's name.
llvm-svn: 218910
When unsafe-fp-math is enabled, we can turn sqrt(X) * sqrt(X) into X.
This can happen in the real world when calculating x ** 3/2. This occurs
in test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c.
Differential Revision: http://reviews.llvm.org/D5584
llvm-svn: 218906
Prior to GCC 4.4, __sync_fetch_and_nand was implemented as:
{ tmp = *ptr; *ptr = ~tmp & value; return tmp; }
but this was changed in GCC 4.4 to be:
{ tmp = *ptr; *ptr = ~(tmp & value); return tmp; }
in response to this change, support for sync_fetch_and_nand (and
sync_nand_and_fetch) was removed in r99522 in order to avoid miscompiling code
depending on the old semantics. However, at this point:
1. Many years have passed, and the amount of code relying on the old
semantics is likely smaller.
2. Through the work of many contributors, all LLVM backends have been updated
such that "atomicrmw nand" provides the newer GCC 4.4+ semantics (this process
was complete July of 2014 (added to the release notes in r212635).
3. The lack of this intrinsic is now a needless impediment to porting codes
from GCC to Clang (I've now seen several examples of this).
It is true, however, that we still set GNUC_MINOR to 2 (corresponding to GCC
4.2). To compensate for this, and to address the original concern regarding
code relying on the old semantics, I've added a warning that specifically
details the fact that the semantics have changed and that we provide the newer
semantics.
Fixes PR8842.
llvm-svn: 218905
floating point and integer domains.
Merge the AVX2 test into it and add an extra RUN line. Generate clean
FileCheck statements with my script. Remove the now merged AVX2 tests.
llvm-svn: 218903
Added tests and impl to make sure the following errors are reported:
* Notifying a created thread that we are already tracking.
* Notifying a thread death for a thread we don't know about.
llvm-svn: 218900
This check looks for if statements and loops: for, range-for, while and
do-while, and verifies that the inside statements are inside braces '{}'.
If not, proposes to add braces around them.
Example:
if (condition)
action();
becomes
if (condition) {
action();
}
This check ought to be used with the -format option, so that the braces be
well-formatted.
Patch by Marek Kurdej!
http://reviews.llvm.org/D5395
llvm-svn: 218898
Now that ThreadStateCoordinator errors out on threads in unexpected states,
it has enough information to know which threads need stop requests fired
when we want to do a deferred callback on a thread's behalf. This change
adds a new method, CallAfterRunningThreadsStop(...), which no longer
takes a set of thread ids that require stop requests. It's much harder
to misuse this method and (with newer error logic) it's harder to
correctly use the original method. Expect the original method that takes
the set of thread ids to stop to disappear in the near future.
Adds several tests for CallAfterRunningThreadsStop().
llvm-svn: 218897
The mergeByContent attribute on DefinedAtoms triggers the symbol table to
coalesce atoms with the exact same content. The problem is that atoms can also
have a required custom section. The coalescing should never change the custom
section of an atom.
The fix is to only consider to atoms to have the same content if their
sectionChoice() and customSectionName() attributes match.
llvm-svn: 218893
Every time we were adding or removing an expression when generating a
coverage mapping we were doing a linear search to try and deduplicate
the list. The indices in the list are important, so we can't just
replace it by a DenseMap entirely, but an auxilliary DenseMap for fast
lookup massively improves the performance issues I was seeing here.
llvm-svn: 218892
When the flag is given, the command prints out the COFF import table.
Currently only the import table directory will be printed.
I'm going to make another patch to print out the imported symbols.
The implementation of import directory entry iterator in
COFFObjectFile.cpp was buggy. This patch fixes that too.
http://reviews.llvm.org/D5569
llvm-svn: 218891
This patch fixes the codesigning of debugserver on OSX when built with
cmake. Without this you get this error when debugging:
error: process launch failed: unable to locate debugserver
Note: you also need to set LLDB_DEBUGSERVER_PATH to point to your built debugserver.
e.g. export LLDB_DEBUGSERVER_PATH=`pwd`/bin/debugserver
Change by dawn@burble.org.
Tested on MacOSX 10.9.5 and Xcode 6.1 Beta using cmake/ninja.
Verified no build break on Linux Ubuntu cmake/ninja and Xcode 6.1 canonical build.
llvm-svn: 218890
Summary:
Currently, with struct my_struct { int x; method_ptr y; };
a call to foo(my_struct s) may end up dropping the last 4 bytes
of the method pointer for x86_64 NaCl and x32.
When checking Has64BitPointers, also check if the method pointer
straddles an eightbyte boundary and classify Hi as well as Lo if needed.
Test Plan: test/CodeGenCXX/x86_64-arguments-nacl-x32.cpp
Reviewers: dschuff, pavel.v.chupin
Subscribers: jfb
Differential Revision: http://reviews.llvm.org/D5555
llvm-svn: 218889
When I was preparing r218879 for commit, I removed an early return
that I decided was just noise. It wasn't. This is r218879 no-crash
edition.
This reverts commit r218881, reapplying r218879.
llvm-svn: 218887
This also forbids the json importer to access other memory locations
than the original instruction as we to reuse the alignment of the
original load/store.
Differential Revision: http://reviews.llvm.org/D5560
llvm-svn: 218883
The Terms vector here represented a polynomial of of all possible
counters, and is used to simplify expressions when generating coverage
mapping. There are a few problems with this:
1. Keeping the vector as a member is wasteful, since we clear it every
time we use it.
2. Most expressions refer to a subset of the counters, so we end up
iterating over a large number of zeros doing nothing a lot of the
time.
This updates the user of the vector to store the terms locally, and
uses a sort and combine approach so that we only operate on counters
that are actually used in a given expression. For small cases this
makes very little difference, but in cases with a very large number of
counted regions this is a significant performance fix.
llvm-svn: 218879
The LoopAnnotator doesn't annotate only loops any more, thus it is
called ScopAnnotator from now on.
This also removes unnecessary polly:: namespace tags.
llvm-svn: 218878
The command line flag -polly-annotate-alias-scopes controls whether or not
Polly annotates alias scopes in the new SCoP (default ON). This can improve
later optimizations as the new SCoP is basically an alias free environment for
them.
llvm-svn: 218877
Added tests to verify that the coordinator signals an error if
the given thread to resume is unknown, and if the thread is through to
be running already.
Modified resume handling code to match tests.
llvm-svn: 218872
* Checks for any 'clean' arg, setting clean-only mode.
* If clean-only mode, skip the make.
* Adds the equivalent of os.devnull in a way that supports the file-like 'write' operation.
* Uses a null sysout/syserr for the clean step so we don't see noise from cleaning.
* Now always runs the clean step after a real test run so we don't have cruft left over after a test run.
llvm-svn: 218871
As the title says... also, fix all the ARM asm return sequences so that they
work on processors without the BX instruction.
http://reviews.llvm.org/D5314
llvm-svn: 218869
My commit rL216160 introduced a bug PR21014: IndVars widens code 'for (i = ; i < ...; i++) arr[ CONST - i]' into 'for (i = ; i < ...; i++) arr[ i - CONST]'
thus inverting index expression. This patch fixes it.
Thanks to Jörg Sonnenberger for pointing.
Differential Revision: http://reviews.llvm.org/D5576
llvm-svn: 218867
Summary:
Changed cmake/config-ix.cmake to add support for different MIPS architectures: mips, mipsel, mips64, mips64el
In profile code there is no target based dependencies, so just enabling mips flag does the work.
Patch by Mohit Bhakkad
Reviewers: dsanders, void, petarj, kcc, samsonov
Reviewed By: samsonov
Subscribers: llvm-commits, farazs, kumarsukhani
Differential Revision: http://reviews.llvm.org/D4880
llvm-svn: 218866
Summary: Commit r218863 broke this test case. This patch fixes it
by updating the expected output line. Should've been updated with
the original patch but for some reason it didn't fail during my
local make check.
Change-Id: I89ed28b37f67c34d1a5d28a3e47ae33d9a82a98f
llvm-svn: 218864