- Instead of computing a bunch of buckets of different flag types, just do an
incremental link resolving conflicts as they arise.
- This also has the advantage of making the link result deterministic and not
dependent on map iteration order.
llvm-svn: 172634
AT_producer. Which includes clang's version information so we can tell
which version of the compiler was used.
This is the first of two steps to allow us to do that. This is the llvm-mc
change to provide a method to set the AT_producer string. The second step,
coming soon to a clang near you, will have the clang driver pass the value
of getClangFullVersion() via an flag when invoking the integrated assembler
on assembly source files.
rdar://12955296
llvm-svn: 172630
In r143502, we renamed getHostTriple() to getDefaultTargetTriple()
as part of work to allow the user to supply a different default
target triple at configure time. This change also affected the JIT.
However, it is inappropriate to use the default target triple in the
JIT in most circumstances because this will not necessarily match
the current architecture used by the process, leading to illegal
instruction and other such errors at run time.
Introduce the getProcessTriple() function for use in the JIT and
its clients, and cause the JIT to use it. On architectures with a
single bitness, the host and process triples are identical. On other
architectures, the host triple represents the architecture of the
host CPU, while the process triple represents the architecture used
by the host CPU to interpret machine code within the current process.
For example, when executing 32-bit code on a 64-bit Linux machine,
the host triple may be 'x86_64-unknown-linux-gnu', while the process
triple may be 'i386-unknown-linux-gnu'.
This fixes JIT for the 32-on-64-bit (and vice versa) build on non-Apple
platforms.
Differential Revision: http://llvm-reviews.chandlerc.com/D254
llvm-svn: 172627
Hope you are feeling better.
The Mips RDHWR (Read Hardware Register) instruction was not
tested for assembler or dissassembler consumption. This patch
adds that functionality.
Contributer: Vladimir Medic
llvm-svn: 172579
using the DW_FORM_GNU_addr_index and a separate .debug_addr section which
stays in the executable and is fully linked.
Sneak in two other small changes:
a) Print out the debug_str_offsets.dwo section.
b) Change form we're expecting the entries in the debug_str_offsets.dwo
section to take from ULEB128 to U32.
Add tests for all of this in the fission-cu.ll test.
llvm-svn: 172578
some optimization opportunities (in the enclosing supper-expressions).
rule 1. (-0.0 - X ) * Y => -0.0 - (X * Y)
if expression "-0.0 - X" has only one reference.
rule 2. (0.0 - X ) * Y => -0.0 - (X * Y)
if expression "0.0 - X" has only one reference, and
the instruction is marked "noSignedZero".
2. Eliminate negation (The compiler was already able to handle these
opt if the 0.0s are replaced with -0.0.)
rule 3: (0.0 - X) * (0.0 - Y) => X * Y
rule 4: (0.0 - X) * C => X * -C
if the expr is flagged "noSignedZero".
3.
Rule 5: (X*Y) * X => (X*X) * Y
if X!=Y and the expression is flagged with "UnsafeAlgebra".
The purpose of this transformation is two-fold:
a) to form a power expression (of X).
b) potentially shorten the critical path: After transformation, the
latency of the instruction Y is amortized by the expression of X*X,
and therefore Y is in a "less critical" position compared to what it
was before the transformation.
4. Remove the InstCombine code about simplifiying "X * select".
The reasons are following:
a) The "select" is somewhat architecture-dependent, therefore the
higher level optimizers are not able to precisely predict if
the simplification really yields any performance improvement
or not.
b) The "select" operator is bit complicate, and tends to obscure
optimization opportunities. It is btter to keep it as low as
possible in expr tree, and let CodeGen to tackle the optimization.
llvm-svn: 172551
Test was failing for clang-native-arm-cortex-a9 build-bot configuration.
The reason for the failure was the test was using hardcoded names.
The attached patch fixes this failure by replacing the hard-coded variables
names with pattern-matched variable names.
Patch by Manish Verma, ARM
llvm-svn: 172534
we need to generate a N64 compound relocation
R_MIPS_GPREL_32/R_MIPS_64/R_MIPS_NONE.
The bug was exposed by the SingleSourcetest case
DuffsDevice.c.
Contributer: Jack Carter
llvm-svn: 172496
---------------------------------------------------------------------------
C_A: reassociation is allowed
C_R: reciprocal of a constant C is appropriate, which means
- 1/C is exact, or
- reciprocal is allowed and 1/C is neither a special value nor a denormal.
-----------------------------------------------------------------------------
rule1: (X/C1) / C2 => X / (C2*C1) (if C_A)
=> X * (1/(C2*C1)) (if C_A && C_R)
rule 2: X*C1 / C2 => X * (C1/C2) if C_A
rule 3: (X/Y)/Z = > X/(Y*Z) (if C_A && at least one of Y and Z is symbolic value)
rule 4: Z/(X/Y) = > (Z*Y)/X (similar to rule3)
rule 5: C1/(X*C2) => (C1/C2) / X (if C_A)
rule 6: C1/(X/C2) => (C1*C2) / X (if C_A)
rule 7: C1/(C2/X) => (C1/C2) * X (if C_A)
llvm-svn: 172488
The included test case is derived from one of the GCC compatibility tests.
The problem arises after the selection DAG has been converted to type-legalized
form. The combiner first sees a 64-bit load that can be converted into a
pre-increment form. The original load feeds into a SRL that isolates the
upper 32 bits of the loaded doubleword. This looks like an opportunity for
DAGCombiner::ReduceLoadWidth() to replace the 64-bit load with a 32-bit load.
However, this transformation is not valid, as the replacement load is not
a pre-increment load. The pre-increment load produces an extra result,
which feeds a subsequent add instruction. The replacement load only has
one result value, and this value is propagated to all uses of the pre-
increment load, including the add. Because the add is looking for the
second result value as its operand, it ends up attempting to add a constant
to a token chain, resulting in a crash.
So the patch simply disables this transformation for any load with more than
two result values.
llvm-svn: 172480
The reason that this occurs is that tail calling objc_autorelease eventually
tail calls -[NSObject autorelease] which supports fast autorelease. This can
cause us to violate the semantic gaurantees of __autoreleasing variables that
assignment to an __autoreleasing variables always yields an object that is
placed into the innermost autorelease pool.
The fix included in this patch works by:
1. In the peephole optimization function OptimizeIndividualFunctions, always
remove tail call from objc_autorelease.
2. Whenever we convert to/from an objc_autorelease, set/unset the tail call
keyword as appropriate.
*NOTE* I also handled the case where objc_autorelease is converted in
OptimizeReturns to an autoreleaseRV which still violates the ARC semantics. I
will be removing that in a later patch and I wanted to make sure that the tree
is in a consistent state vis-a-vis ARC always.
Additionally some test cases are provided and all tests that have tail call marked
objc_autorelease keywords have been modified so that tail call has been removed.
*NOTE* One test fails due to a separate bug that I am going to commit soon. Thus
I marked the check line TMP: instead of CHECK: so make check does not fail.
llvm-svn: 172287
register names in the standalone assembler llvm-mc.
Registers such as $A1 can represent either a 32 or
64 bit register based on the instruction using it.
In addition, based on the abi, $T0 can represent different
32 bit registers.
The problem is resolved by the Mips specific AsmParser
td definitions changing to work together. Many cases of
RegisterClass parameters are now RegisterOperand.
Contributer: Vladimir Medic
llvm-svn: 172284
the target if it supports the different CAST types. We didn't do this
on X86 because of the different register sizes and types, but on ARM
this makes sense.
llvm-svn: 172245
- recognize string "{memory}" in the MI generation
- mark as mayload/maystore when there's a memory clobber constraint.
PR14859.
Patch by Krzysztof Parzyszek
llvm-svn: 172228
We don't have a detailed analysis on which values are vectorized and which stay scalars in the vectorized loop so we use
another method. We look at reduction variables, loads and stores, which are the only ways to get information in and out
of loop iterations. If the data types are extended and truncated then the cost model will catch the cost of the vector
zext/sext/trunc operations.
llvm-svn: 172178
Messages:
Converted test case trivial_codegen_tailcall.ll to use FileCheck.
Converted test return_constant.ll to use FileCheck instead of grep.
Converted test reorder_load.ll to use FileCheck instead of grep.
Converted test intervening-inst.ll to use FileCheck instead of grep.
llvm-svn: 172171
This fixes va_start/va_copy of a va_list field which happens to not
be laid out at a 16-byte boundary.
Differential Revision: http://llvm-reviews.chandlerc.com/D276
llvm-svn: 172128