One performs: (X == 13 | X == 14) -> X-13 <u 2
The other: (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1
The problem is that there are certain values of C1 and C2 that
trigger both transforms but the first one blocks out the second,
this generates suboptimal code.
Reordering the transforms should be better in every case and
allows us to do interesting stuff like turn:
%shr = lshr i32 %X, 4
%and = and i32 %shr, 15
%add = add i32 %and, -14
%tobool = icmp ne i32 %add, 0
into:
%and = and i32 %X, 240
%tobool = icmp ne i32 %and, 224
llvm-svn: 179493
We do not only need to understand that 'k * p' is a parameter expression, but
also need to store this expression in the set of parameters. Before this patch
we wrongly stored the two individual parameters %k and %p.
Reported by: Sebastian Pop <spop@codeaurora.org>
llvm-svn: 179485
Invalid redeclarations of valid explicit declarations shouldn't
take the same path as redeclarations of implicit declarations,
and invalid local extern declarations shouldn't foul things up
for everybody else.
llvm-svn: 179482
two new options –msingle-float and –mdouble-float. These options can be
used simultaneously with float ABI selection options (-mfloat-abi,
-mhard-float, -msoft-float). They mark whether a floating-point
coprocessor supports double-precision operations.
llvm-svn: 179481
SDNodes and MachineOperands get target flags representing the %hi() and
%lo() assembly annotations that eventually become relocations.
Also define flags to be used by the 64-bit code models.
llvm-svn: 179468
- Do not add symbols with no names
- Make sure that symbols from ELF symbol tables know that the byte size is correct. Previously the symbols would calculate their sizes by looking for the next symbol and take symbols that had zero size and make them have invalid sizes.
- Added the ability to dump raw ELF symbols by adding a Dump method to ELFSymbol
Also removed some unused code from lldb_private::Symtab.
llvm-svn: 179466
Leaving MFCR has having unmodeled side effects is not enough to prevent
unwanted instruction reordering post-RA. We could probably apply a stronger
barrier attribute, but there is a better way: Add all (not just the first) CR
to be spilled as live-in to the entry block, and add all CRs to the MFCR
instruction as implicitly killed.
Unfortunately, I don't have a small test case.
llvm-svn: 179465
Two reasons for that:
* the declaration is not used. the LLDB_SOURCE_DIR is provided as the first argument in the script ($1) (called SRC_ROOT in the source code)
* add_custom_command is quoting the first argument of the command. Usually, it is the script itself (and then the full path to the script) but, here, it is the declaration of a variable.
It was failing with:
cd "/llvm-toolchain-3.3~svn179457/build-llvm/tools/lldb/scripts" && "SRCROOT=/llvm-toolchain-3.3~svn179457/tools/lldb" /llvm-toolchain-3.3~svn179457/tools/lldb/scripts/build-swig-wrapper-classes.sh /llvm-toolchain-3.3~svn179457/tools/lldb /llvm-toolchain-3.3~svn179457/build-llvm/tools/lldb/scripts /llvm-toolchain-3.3~svn179457/build-llvm/tools/lldb/scripts /llvm-toolchain-3.3~svn179457/build-llvm -m
/bin/sh: 1: SRCROOT=/llvm-toolchain-3.3~svn179457/tools/lldb: not found
llvm-svn: 179459
This is basically the same fix in three different places. We use a set to avoid
walking the whole tree of a big ConstantExprs multiple times.
For example: (select cmp, (add big_expr 1), (add big_expr 2))
We don't want to visit big_expr twice here, it may consist of thousands of
nodes.
The testcase exercises this by creating an insanely large ConstantExprs out of
a loop. It's questionable if the optimizer should ever create those, but this
can be triggered with real C code. Fixes PR15714.
llvm-svn: 179458
For functions that need to spill CRs, and have dynamic stack allocations, the
value of the SP during the restore is not what it was during the save, and so
we need to use the FP in these cases (as for all of the other spills and
restores, but the CR restore has a special code path because its reserved slot,
like the link register, is specified directly relative to the adjusted SP).
llvm-svn: 179457
The initial values were arbitrary. I want them to be more
conservative. This represents the number of latency cycles hidden by
OOO execution. In practice, I think it should be within a small factor
of the complex floating point operation latency so the scheduler can
make some attempt to hide latency even for smallish blocks.
These are by no means the best values, just a starting point for
tuning heuristics. Some benchmarks such as TSVC run faster with this
lower value for SandyBridge. I haven't run anything on Haswell, but
it's shouldn't be 2x SB.
llvm-svn: 179450
The register allocator expects minimal physreg live ranges. Schedule
physreg copies accordingly. This is slightly tricky when they occur in
the middle of the scheduling region. For now, this is handled by
rescheduling the copy when its associated instruction is
scheduled. Eventually we may instead bundle them, but only if we can
preserve the bundles as parallel copies during regalloc.
llvm-svn: 179449
I need to handle this for the test case in my following scheduler
commit.
Work is already under way to redesign the mechanism for node order
propagation because this case by case approach is unmaintainable.
llvm-svn: 179448