This matters for example in following matrix multiply:
int **mmult(int rows, int cols, int **m1, int **m2, int **m3) {
int i, j, k, val;
for (i=0; i<rows; i++) {
for (j=0; j<cols; j++) {
val = 0;
for (k=0; k<cols; k++) {
val += m1[i][k] * m2[k][j];
}
m3[i][j] = val;
}
}
return(m3);
}
Taken from the test-suite benchmark Shootout.
We estimate the cost of the multiply to be 2 while we generate 9 instructions
for it and end up being quite a bit slower than the scalar version (48% on my
machine).
Also, properly differentiate between avx1 and avx2. On avx-1 we still split the
vector into 2 128bits and handle the subvector muls like above with 9
instructions.
Only on avx-2 will we have a cost of 9 for v4i64.
I changed the test case in test/Transforms/LoopVectorize/X86/avx1.ll to use an
add instead of a mul because with a mul we now no longer vectorize. I did
verify that the mul would be indeed more expensive when vectorized with 3
kernels:
for (i ...)
r += a[i] * 3;
for (i ...)
m1[i] = m1[i] * 3; // This matches the test case in avx1.ll
and a matrix multiply.
In each case the vectorized version was considerably slower.
radar://13304919
llvm-svn: 176403
Inlining brought a few "null pointer use" false positives, which occur because
the callee defensively checks if a pointer is NULL, whereas the caller knows
that the pointer cannot be NULL in the context of the given call.
This is a first attempt to silence these warnings by tracking the symbolic value
along the execution path in the BugReporter. The new visitor finds the node
in which the symbol was first constrained to NULL. If the node belongs to
a function on the active stack, the warning is reported, otherwise, it is
suppressed.
There are several areas for follow up work, for example:
- How do we differentiate the cases where the first check is followed by
another one, which does happen on the active stack?
Also, this only silences a fraction of null pointer use warnings. For example, it
does not do anything for the cases where NULL was assigned inside a callee.
llvm-svn: 176402
The LoopVectorizer often runs multiple times on the same function due to inlining.
When this happens the loop vectorizer often vectorizes the same loops multiple times, increasing code size and adding unneeded branches.
With this patch, the vectorizer during vectorization puts metadata on scalar loops and marks them as 'already vectorized' so that it knows to ignore them when it sees them a second time.
PR14448.
llvm-svn: 176399
In LLVM, -pedantic is not set unless LLVM_ENABLE_PEDANTIC is set.
However, Clang's CMakeLists.txt unilaterally adds -pedantic to the run
line, so we need to disable -Wnested-anon-types explicitly.
llvm-svn: 176393
Calculate "can branch" using the MC API's rather than our hand-rolled regex'es.
As extra credit, allow setting the disassembly flavor for x86 based architectures to intel or att.
<rdar://problem/11319574>
<rdar://problem/9329275>
llvm-svn: 176392
Fix the way resources are counted. I'm taking some time to cleanup the
way MachineScheduler handles in-order machine resources. Eventually
we'll need more PPC/Atom test cases in tree.
llvm-svn: 176390
Previously we were assuming that we'd never ask for the sub-region bindings
of a bitfield, since a bitfield cannot have subregions. However,
unification of code paths has made that assumption invalid. While we could
take advantage of this by just checking for the single possible binding,
it's probably better to do the right thing, so that if/when we someday
support unions we'll do the right thing there, too.
This fixes a handful of false positives in analyzing LLVM.
<rdar://problem/13325522>
llvm-svn: 176388
The sys::fs::is_directory() check is unnecessary because, if the filename is
a directory, the function will fail anyway with the same error code returned.
Remove the check to avoid an unnecessary stat call.
Someone needs to review on windows and see if the check is necessary there or not.
llvm-svn: 176386
This patch eliminates the need to emit a constant move instruction when this
pattern is matched:
(select (setgt a, Constant), T, F)
The pattern above effectively turns into this:
(conditional-move (setlt a, Constant + 1), F, T)
llvm-svn: 176384
extra/test/cpp11-migrate/Makefile was using the same tmp file for generating
lit.site.cfg for two different directories. Parallelism caused conflicts so now
using differently named temp files.
llvm-svn: 176379
detail.
The was this test was written, it was relying on an implementation detail
(fixups) and hence was very brittle (relying, among other things, on the
exact ordering of statistics printed by MC).
The test was rewritten to check a more observable output difference. While it
doesn't cover 100% of the things the original test covered, it's a good
practice to write regression tests this way. If we want to check that
internal details and invariants hold, such tests should be expressed as unit
tests.
llvm-svn: 176377
Previously we would check the syntax of the file before we transform
it, but that's redundant since it'll be checked as part of the
transformation. Remove that check completely.
We also had an unconditional syntax check after transforming. This
is only really useful to debug cpp11-migrate, since users will end
up compiling the transformed source anyways, and the transformations
*should* never introduce a failure. Made this an option, accessible
via "-final-syntax-check".
Resolves PR 15380.
llvm-svn: 176376
and use it to keep from doing the OS Plugin UpdateThreadList while destroying, since
if that does anything that requires the API lock it may deadlock against whoever is
running the Process::Destroy.
<rdar://problem/13308627>
llvm-svn: 176375
The make (all) target takes care of creating lit configs and auto-generating
tests. The problem with the original 'lit.site.cfg' target is it's not
recursive and doesn't fully create everything necessary for testing
clang-tools-extra.
llvm-svn: 176374
Autoconf make (all) now properly recurses from tools/extra/Makefile into
tools/extra/test/Makefile and tools/extra/test/cpp11-migrate/Makefile. The
'all' target is responsible for creating lit config files and autogenerating
tests. Subsequent 'check-all' targets will properly work.
Re-enabling UseAuto/iterator.cpp test.
General clean-up of clang-tools-extra makefiles; removing dead targets and
removing duplicated pieces of llvm/Makefile.rules.
llvm-svn: 176373
This moves the actual replacement code into a separate
function. There is still a bit of code duplication to
go from macros to expansion areas, but that code will
need to be fixed anyways to resolve bugs around macro
replacement.
Reviewed by: Tareq Siraj, Edwin Vane
llvm-svn: 176372
Most map types have an operator[] that inserts a new element if the key
isn't found, then returns a reference to the value slot so that you can
assign into it. However, if the value type is a pointer, it will be
initialized to null. This is usually no problem.
However, if the user /knows/ the map contains a value for a particular key,
they may just use it immediately:
// From ClangSACheckersEmitter.cpp
recordGroupMap[group]->Checkers
In this case the analyzer reports a null dereference on the path where the
key is not in the map, even though the user knows that path is impossible
here. They could silence the warning by adding an assertion, but that means
splitting up the expression and introducing a local variable. (Note that
the analyzer has no way of knowing that recordGroupMap[group] will return
the same reference if called twice in a row!)
We already have logic that says a null dereference has a high chance of
being a false positive if the null came from an inlined function. This
patch simply extends that to references whose rvalues are null as well,
silencing several false positives in LLVM.
<rdar://problem/13239854>
llvm-svn: 176371
This reduces the time actually spent doing string to ID conversion and shows a 10% improvement in compile time for a particularly bad case that involves ARM Neon intrinsics (these have many overloads).
Patch by Jean-Luc Duprat!
llvm-svn: 176365
- ISD::SHL/SRL/SRA must have either both scalar or both vector operands
but TLI.getShiftAmountTy() so far only return scalar type. As a
result, backend logic assuming that breaks.
- Rename the original TLI.getShiftAmountTy() to
TLI.getScalarShiftAmountTy() and re-define TLI.getShiftAmountTy() to
return target-specificed scalar type or the same vector type as the
1st operand.
- Fix most TICG logic assuming TLI.getShiftAmountTy() a simple scalar
type.
llvm-svn: 176364
dispatch code. As far as I can tell the thumb2 code is behaving as expected.
I was able to compile and run the associated test case for both arm and thumb1.
rdar://13066352
llvm-svn: 176363
In builder type call, we indent to the laster function calls.
However, for the last element of such a call, we don't need to do
so, as that normally just wastes space and does not increase
readability.
Before:
aaaaaa->aaaaaa->aaaaaa( // break
aaaaaa);
aaaaaaaaaaaaaaaaaaaaa->aaaaaaaaaaaaaaaaaaaaaa
->aaaaaaaaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa);
After:
aaaaaa->aaaaaa->aaaaaa( // break
aaaaaa);
aaaaaaaaaaaaaaaaaaaaa->aaaaaaaaaaaaaaaaaaaaaa->aaaaaaaaaaaaaaaaaaaaaaaaaa(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa);
llvm-svn: 176352