r248010 changed the -debug output to use short ids, but did not
similarly modify the graph printer. Change to be consistent, for ease of
cross-reference.
llvm-svn: 251465
With more complex structures in C++ lambdas and JavaScript function
literals, the old value was simply too small. However, this is a
temporary solution; I need to look at this more closely a) to find a
fundamentally better approach and b) to look at whether the more recent
usage of NoLineBreak makes us visit things in an unfortunate order,
where clang-format wastes many states in dead ends.
llvm-svn: 251463
CatchReturnInst has side-effects: it runs a destructor. This destructor
could conceivably run forever/call exit/etc. and should not be removed.
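As a hedged illustration (not code from this change; it assumes the destructor in question is that of the caught exception object, which the MSVC runtime destroys when the handler returns):
  struct Noisy {
    ~Noisy();  // could run forever, call exit(), etc.
  };

  void handler() {
    try {
      throw Noisy();
    } catch (Noisy) {
      // ...
    }  // the catchret leaving this handler destroys the caught object,
       // i.e. it runs ~Noisy(), so it must not be deleted as dead code
  }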
llvm-svn: 251461
The idea behind this patch is to expose the meat of
LLDB's Python infrastructure (test suite, scripts, etc.)
as a single package. This makes reusability and code
sharing among sub-packages easy.
Differential Revision: http://reviews.llvm.org/D14131
llvm-svn: 251460
If the user configured clang with a custom GCC toolchain, it will take precedence over what ToolChainTest.cpp expects to evaluate.
This is fixed here by passing --gcc-toolchain= to the driver in order to override any user-defined GCC toolchain.
llvm-svn: 251459
We used automated tools to update our IR to its current syntax in commit
21f77df7 (r247378). While it correctly updated the CHECK lines in our
compatibility tests, the IR should have remained untouched. This commit
fixes the syntax errors.
llvm-svn: 251458
This code was modifying the cursor and then expecting the editline
API call to see the effect for the next operation. This misuses the
API; newer versions of editline break on this code, which this change fixes.
llvm-svn: 251457
Use 10 bits to represent calling convention IDs instead of 13, and
update the bitcode compatibility tests accordingly. We now error out in
the bitcode reader when we see bad calling convention IDs.
Thanks to rnk and dexonsmith for feedback!
Differential Revision: http://reviews.llvm.org/D13826
llvm-svn: 251452
AliasSetTracker does not need to convert the access mode to ModRefAccess if the
newly visited UnknownInst has only 'REF' ModRef info with respect to the
existing pointers in the sets.
Patch by Andrew Zhogin!
llvm-svn: 251451
This is a usage of the IR-level fast-math-flags now that they are propagated to SDNodes.
This was originally part of D8900.
Removing the global 'enable-unsafe-fp-math' checks will require auto-upgrade and
possibly other changes.
Differential Revision: http://reviews.llvm.org/D9708
llvm-svn: 251450
The analyzer assumes that system functions will not free memory or modify their
arguments in other ways, so we assume that arguments do not escape when
such functions are called. However, this may lead to false-positive leak errors. For
example, in code like this where the pointers added to the rb_tree are freed
later on:
struct alarm_event *e = calloc(1, sizeof(*e));
<snip>
rb_tree_insert_node(&alarm_tree, e);
Add a heuristic to assume that calls to system functions taking void*
arguments allow for pointer escape.
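A self-contained sketch of the pattern (illustrative only; system_register_node is a hypothetical stand-in for a system-header function taking a void* argument):
  #include <cstdlib>

  struct alarm_event { int id; };

  // Hypothetical system-header declaration.
  void system_register_node(void *node);

  void schedule_alarm() {
    alarm_event *e = static_cast<alarm_event *>(calloc(1, sizeof(*e)));
    if (!e)
      return;
    // With the heuristic, the analyzer assumes e may escape through the
    // void* parameter and no longer reports a leak when we return.
    system_register_node(e);
  }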
llvm-svn: 251449
Currently, when ASan detects a bug, by default it only prints the text
of the report to stderr. This patch changes that behavior and writes the full
text of the report to syslog before we terminate the process. It also calls
os_trace (Activity Tracing available on OS X and iOS) with a message saying
that the report is available in syslog. This is useful, because this message
will be shown in the crash log.
For this to work, the patch makes sure we store the full report into
error_message_buffer unconditionally, and it also strips out ANSI escape
sequences from the report (they are used when producing colored reports).
I've initially tried to log to syslog during printing, which is done on Android
right now. The advantage is that if we crash during error reporting or the
produced error does not go through ScopedInErrorReport, we would still get a
(partial) message in the syslog. However, that solution is very problematic on
OS X. One issue is that the logging routine uses GCD, which may spawn a new
thread on its behalf. In many cases, the reporting logic locks threadRegistry,
which leads to deadlocks.
Reviewed at http://reviews.llvm.org/D13452
llvm-svn: 251447
This will tag all mmapped memory the sanitizers use with "Performance tool data"
when viewed in vmmap. (Even though sanitizers are not performance tools, it's
the best available match and better than having the unidentified objects.)
http://reviews.llvm.org/D13609
llvm-svn: 251445
A PHI on a catchpad might be used by both edges out of the catchpad,
feeding back into a loop. In this case, just use the insertion point.
Anything more clever would require new basic blocks or PHI placement.
llvm-svn: 251442
1. Make the warning more strict in C mode. r172696 added code to suppress
warnings from macro expansions in system headers, which checks
`SourceMgr.isMacroBodyExpansion(E->IgnoreParens()->getExprLoc())`. Consider
this snippet:
  #define FOO(x) (x)
  void f(int a) {
    FOO(a);
  }
In C, the line `FOO(a)` is an `ImplicitCastExpr(ParenExpr(DeclRefExpr))`,
while it's just a `ParenExpr(DeclRefExpr)` in C++. So in C++,
`E->IgnoreParens()` returns the `DeclRefExpr` and the check tests the
SourceLoc of `a`. In C, the `ImplicitCastExpr` has the effect of checking the
SourceLoc of `FOO`, which is a macro body expansion, which causes the
diagnostic to be skipped. It looks unintentional that clang does different
things for C and C++ here, so use `IgnoreParenImpCasts` instead of
`IgnoreParens` here. This has the effect of the warning firing more often
than previously in C code – it now fires as often as it fires in C++ code.
2. Suppress the warning if it would warn on `UNREFERENCED_PARAMETER`.
`UNREFERENCED_PARAMETER` is a commonly used macro on Windows and it happens
to uselessly trigger -Wunused-value. As discussed in the thread
"rfc: winnt.h's UNREFERENCED_PARAMETER() vs clang's -Wunused-value" on
cfe-dev, fix this by special-casing this specific macro. (This costs a string
comparison and some fast-path lexing per warning, but the warning is emitted
rarely. It fires once in Windows.h itself, so this code runs at least once
per TU that includes Windows.h, but it doesn't run hundreds of times.)
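For reference, a minimal reproduction of the case being special-cased (the macro body below matches the usual winnt.h definition, reproduced here from memory):
  #define UNREFERENCED_PARAMETER(P) (P)

  void callback(int reserved) {
    // Expands to the expression statement "(reserved);", whose value is
    // unused; after this change it no longer triggers -Wunused-value.
    UNREFERENCED_PARAMETER(reserved);
  }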
http://reviews.llvm.org/D13969
llvm-svn: 251441
This recommits r250719, which was reverted because of a failure in SPEC2000.gcc
caused by an incorrect insert point for the new wider load.
Convert two halfword loads into a single 32-bit word load with bitfield extract
instructions. For example:
  ldrh w0, [x2]
  ldrh w1, [x2, #2]
becomes
  ldr  w0, [x2]
  ubfx w1, w0, #16, #16
  and  w0, w0, #0xffff
llvm-svn: 251438
Summary: This is a follow-on to post-review comments on revision r248276.
Patch by Scott Egerton.
Reviewers: vkalintiris, dsanders
Subscribers: joerg, rengolin, cfe-commits
Differential Revision: http://reviews.llvm.org/D13100
llvm-svn: 251430
When optimization is disabled, the edge weights stored in MBB won't be used, so we don't have to store them. Currently, this is done by adding successors with a default weight of 0; if all successors have default weights, the weight list will be empty. But an empty weight list doesn't necessarily mean optimization is disabled (as is stated several times in MachineBasicBlock.cpp): it may also mean that all successors simply have default weights.
We should discourage using default weights when adding successors, because it is very easy for users to forget to set the correct edge weights instead of using default ones (one exception is when the MBB has only one successor). In order to detect such usages, it is better to differentiate the use of default weights from the case when optimization is disabled.
In this patch, a new interface addSuccessorWithoutWeight(MBB*) is created for use when optimization is disabled. In this case, MBB will try to maintain an empty weight list, but it cannot guarantee this, as many uses of addSuccessor() do not check whether optimization is disabled. It can guarantee, however, that if optimization is enabled, the weight list always has the same size as the successor list.
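An illustrative sketch of the intended calling convention (not code from the patch; the OptLevel check is a stand-in for however a caller detects that optimization is disabled):
  // Illustrative fragment only.
  if (OptLevel == CodeGenOpt::None)
    MBB->addSuccessorWithoutWeight(SuccMBB);  // keeps the weight list empty
  else
    MBB->addSuccessor(SuccMBB, Weight);       // explicit edge weight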
Differential revision: http://reviews.llvm.org/D13963
llvm-svn: 251429
Summary:
This change could be way off-piste; I'm looking for any feedback on whether it's an acceptable approach.
It never seems to be a problem to gobble up as many reduction values as can be found, and then to attempt to reduce the resulting tree. Some of the workloads I'm looking at have been aggressively unrolled by hand, and by selecting reduction widths that are not constrained by a vector register size, it becomes possible to profitably vectorize. My test case shows such an unrolling, which SLP was not vectorizing (on either ARM or X86) before this patch but does vectorize with it.
I measure no significant compile time impact of this change when combined with D13949 and D14063. There are also no significant performance regressions on ARM/AArch64 in SPEC or LNT.
The more principled approach I thought of was to generate several candidate trees and use the cost model to pick the cheapest one. That seemed like quite a big design change (the algorithms seem very much one-shot), and would likely be costly in compile time. This seemed to do the job at very little cost, but I'm worried I've misunderstood something!
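For illustration only (not the patch's test case), the kind of hand-unrolled reduction this is about, where the natural reduction width of eight 32-bit values exceeds a 128-bit vector register:
  int sum8(const int *a) {
    // Eight reduction values gobbled up into a single reduction tree.
    return ((a[0] + a[1]) + (a[2] + a[3])) +
           ((a[4] + a[5]) + (a[6] + a[7]));
  }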
Reviewers: nadav, jmolloy
Subscribers: mssimpso, llvm-commits, aemerson
Differential Revision: http://reviews.llvm.org/D14116
llvm-svn: 251428
Linking options for a particular file depend on the option that specifies the file.
Currently there are two:
* -mlink-bitcode-file links in the complete content of the specified file.
* -mlink-cuda-bitcode links in only the symbols needed by the current TU.
  Linked symbols are internalized. This bitcode linking mode is used to
  link device-specific bitcode provided by CUDA.
Files are linked in the order they are specified on the command line.
-mlink-cuda-bitcode replaces the -fcuda-uses-libdevice flag.
Differential Revision: http://reviews.llvm.org/D13913
llvm-svn: 251427
Summary:
Currently, when the SLP vectorizer considers whether a phi is part of a reduction, it dismisses phis whose incoming blocks are not the same as the block containing the phi. For the patterns I'm looking at, extending this rule to allow phis whose incoming block is a containing loop latch allows me to vectorize certain workloads.
There is no significant compile-time impact, and combined with D13949, no performance improvement measured in ARM/AArch64 in any of SPEC2000, SPEC2006 or LNT.
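An illustrative sketch (not the patch's test case) of a reduction whose header phi gets its back-edge value from a loop latch that is a different block than the one containing the phi, because the loop body contains control flow:
  int sum_positive(const int *a, int n) {
    int sum = 0;
    for (int i = 0; i < n; ++i)
      if (a[i] > 0)   // the branch splits the body into multiple blocks
        sum += a[i];
    return sum;
  }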
Reviewers: jmolloy, mcrosier, nadav
Subscribers: mssimpso, nadav, aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D14063
llvm-svn: 251425
Summary:
Certain workloads, in particular sum-of-absdiff loops, can be vectorized using SLP if it can treat select instructions as reduction values.
The test case is a bit awkward. The AArch64 cost model needs some tuning to not be so pessimistic about selects. I've had to tweak the SLP threshold here.
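For illustration, a generic sum-of-absdiff loop of the kind described (not the patch's test case); the abs pattern becomes a select that can now act as a reduction value:
  int sad(const unsigned char *a, const unsigned char *b, int n) {
    int sum = 0;
    for (int i = 0; i < n; ++i) {
      int d = a[i] - b[i];
      sum += d < 0 ? -d : d;  // compiles to a select feeding the reduction
    }
    return sum;
  }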
Reviewers: jmolloy, mzolotukhin, spatel, nadav
Subscribers: nadav, mssimpso, aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D13949
llvm-svn: 251424
When emitting a remark for a conditional branch annotation, the remark
uses the line location information of the conditional branch in the
message. In some cases, that information is unavailable and the
optimization would segfault. I'm still not sure whether this is a bug or
working as intended (WAI), but the optimizer should not die because of this.
llvm-svn: 251420
The existing behavior was correct on Darwin, which is probably the
platform it was written for.
Before this change, we would rewrite "align 8" to ".align 3" and then
fail to make it through the integrated assembler because 3 is not a
power of 2.
Differential Revision: http://reviews.llvm.org/D14120
llvm-svn: 251418
For CloudABI's toolchain I have a symlink that goes from <target>-ar and
<target>-ranlib to LLVM's ar binary, to mimic GNU Binutils' naming
scheme. The problem is that if we're targeting ARM64, the name of the
ranlib executable is aarch64-unknown-cloudabi-ranlib. This already
contains the string "ar".
Let's move the "ranlib" test above the "ar" test. It's not that likely
that we're going to see operating systems or hardware architectures that
are called "ranlib".
Reviewed by: rafael
Differential Revision: http://reviews.llvm.org/D14123
llvm-svn: 251413
Summary:
We've had a lot of discussion in the past about the meaningful and useful default behaviors for the llvm-shlib tool. The original implementation was heavily geared toward Apple's use, and I think that was wrong. This patch seeks to correct that.
I've removed the LLVM_DYLIB_EXPORT_ALL variable and made libLLVM export everything by default.
I've also added a new target that is only built on Darwin for libLLVM-C as a library that re-exports the LLVM-C API. This library is not built on Linux because ELF doesn't support re-exporting libraries the way Mach-O does.
Reviewers: chapuni, resistor, bogner, axw
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13842
llvm-svn: 251411