Commit Graph

17831 Commits

Author SHA1 Message Date
Chandler Carruth 21eb4e96c2 Teach the rewriting of memcpy calls to support subvector copies.
This also cleans up a bit of the memcpy call rewriting by sinking some
irrelevant code further down and making the call-emitting code a bit
more concrete.

Previously, memcpy of a subvector would actually miscompile (!!!) the
copy into a single vector element copy. I have no idea how this ever
worked. =/ This is the memcpy half of PR14478 which we probably weren't
noticing previously because it didn't actually assert.

The rewrite relies on the newly refactored insert- and extractVector
functions to do the heavy lifting, and those are the same as used for
loads and stores which makes the test coverage a bit more meaningful
here.

llvm-svn: 170338
2012-12-17 14:51:24 +00:00
Richard Osborne 51bf1b269a Add instruction encodings for PEEK and ENDIN.
Previously these were marked with the wrong format.

llvm-svn: 170334
2012-12-17 14:23:54 +00:00
Chandler Carruth cacda256a1 Fix a secondary bug I introduced while fixing the first part of PR14478.
The first half of fixing this bug was actually in r170328, but was
entirely coincidental. It did however get me to realize the nature of
the bug, and adapt the test case to test more interesting behavior. In
turn, that uncovered the rest of the bug which I've fixed here.

This should fix two new asserts that showed up in the vectorize nightly
tester.

llvm-svn: 170333
2012-12-17 14:03:01 +00:00
Richard Osborne 041071c558 Add instruction encodings / disassembly support for rus instructions.
llvm-svn: 170330
2012-12-17 13:50:04 +00:00
Richard Osborne e405e58639 Add instruction encodings for ZEXT and SEXT.
Previously these were marked with the wrong format.

llvm-svn: 170327
2012-12-17 13:20:37 +00:00
Richard Osborne 3a0d5cc314 Add instruction encodings / disassembly support for 2r instructions.
llvm-svn: 170323
2012-12-17 12:29:31 +00:00
Richard Osborne 016967e4ff Add instruction encodings / disassembly support for 0r instructions.
llvm-svn: 170322
2012-12-17 12:26:29 +00:00
Craig Topper f924a58af1 Add rest of BMI/BMI2 instructions to the folding tables as well as popcnt and lzcnt.
llvm-svn: 170304
2012-12-17 05:02:29 +00:00
Chandler Carruth ccca504f3a Fix the first part of PR14478: memset now works.
PR14478 highlights a serious problem in SROA that simply wasn't being
exercised due to a lack of vector input code mixed with C-library
function calls. Part of SROA was written carefully to handle subvector
accesses via memset and memcpy, but the rewriter never grew support for
this. Fixing it required refactoring the subvector access code in other
parts of SROA so it could be shared, and then fixing the splat formation
logic and using subvector insertion (this patch).

The PR isn't quite fixed yet, as memcpy is still broken in the same way.
I'm starting on that series of patches now.

Hopefully this will be enough to bring the bullet benchmark back to life
with the bb-vectorizer enabled, but that may require fixing memcpy as
well.

llvm-svn: 170301
2012-12-17 04:07:37 +00:00
Richard Osborne c5287b8889 Add tests for disassembly of 1r XCore instructions.
llvm-svn: 170295
2012-12-16 18:06:30 +00:00
Reed Kotler aee4d5d194 This patch is needed to make c++ exceptions work for mips16.
Mips16 is really a processor decoding mode (ala thumb 1) and in the same
program, mips16 and mips32 functions can exist and can call each other.

If a jal type instruction encounters an address with the lower bit set, then
the processor switches to mips16 mode (if it is not already in it). If the
lower bit is not set, then it switches to mips32 mode.

The linker knows which functions are mips16 and which are mips32.
When relocation is performed on code labels, this lower order bit is
set if the code label is a mips16 code label.

In general this works just fine, however when creating exception handling
tables and dwarf, there are cases where you don't want this lower order
bit added in.

This has been traditionally distinguished in gas assembly source by using a
different syntax for the label.

lab1:      ; this will cause the lower order bit to be added
lab2=.     ; this will not cause the lower order bit to be added

In some cases, it does not matter because in dwarf and debug tables
the difference of two labels is used and in that case the lower order
bits subtract each other out.

To fix this, I have added to mcstreamer the notion of a debuglabel.
The default is for label and debug label to be the same. So calling
EmitLabel and EmitDebugLabel produce the same result.

For various reasons, there is only one set of labels that needs to be
modified for the mips exceptions to work. These are the "$eh_func_beginXXX" 
labels.

Mips overrides the debug label suffix from ":" to "=." .

This initial patch fixes exceptions. More changes most likely
will be needed to DwarfCFException to make all of this work
for actual debugging. These changes will be to emit debug labels in some
places where a simple label is emitted now.

Some historical discussion on this from gcc can be found at:
http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00623.html
http://gcc.gnu.org/ml/gcc-patches/2008-11/msg01273.html 

llvm-svn: 170279
2012-12-16 04:00:45 +00:00
Benjamin Kramer b16ccde7a4 X86: Add a couple of target-specific dag combines that turn VSELECTS into psubus if possible.
We match the pattern "x >= y ? x-y : 0" into "subus x, y" and two special cases
if y is a constant. DAGCombiner canonicalizes those so we first have to undo the
canonicalization for those cases. The pattern occurs in gzip when the loop
vectorizer is enabled. Part of PR14613.

llvm-svn: 170273
2012-12-15 16:47:44 +00:00
Chandler Carruth c50394fcfa Add a corollary test for PR14572. We got this code path correct already.
llvm-svn: 170271
2012-12-15 09:31:54 +00:00
Chandler Carruth 067edd342f Relax an overly aggressive assert to fix PR14572.
The alloca width is based on the alloc size, not the type size.

llvm-svn: 170270
2012-12-15 09:26:06 +00:00
Reed Kotler 5fdeb21249 This code implements most of mips16 hardfloat as it is done by gcc.
In this case, essentially it is soft float with different library routines.
The next step will be to make this fully interoperational with mips32 floating
point and that requires creating stubs for functions with signatures that
contain floating point types.

I have a more sophisticated design for mips16 hardfloat which I hope to
implement at a later time that directly does floating point without the need
for function calls.

The mips16 encoding has no floating point instructions so one needs to
switch to mips32 mode to execute floating point instructions.

llvm-svn: 170259
2012-12-15 00:20:05 +00:00
Kevin Enderby 06aa3eb8ce Make sure the alternate PC+imm syntax of LDR instruction with a small
immediate generates the narrow version.  Needed when doing round-trip
assemble/disassemble testing using the alternate syntax that specifies
'pc' directly.

llvm-svn: 170255
2012-12-14 23:04:25 +00:00
Michael Ilseman e2754dc887 Add back FoldOpIntoPhi optimizations with fix. Included test cases to help catch these errors and to test the presence of the optimization itself
llvm-svn: 170248
2012-12-14 22:08:26 +00:00
Nadav Rotem 8487537bdb TypeLegalizer: Do not generate target specific nodes with illegal types, because we cant type-legalize them.
llvm-svn: 170245
2012-12-14 21:20:37 +00:00
Nadav Rotem aa3e2a907e Fix a crash in ValueTracking on vectors of pointers.
llvm-svn: 170240
2012-12-14 20:43:49 +00:00
Bill Schmidt a4f898448c This patch removes some nondeterminism from direct object file output
for TLS dynamic models on 64-bit PowerPC ELF.  The default sort routine
for relocations only sorts on the r_offset field; but with TLS, there
can be two relocations with the same r_offset.  For PowerPC, this patch
sorts secondarily on descending r_type, which matches the behavior
expected by the linker.

llvm-svn: 170237
2012-12-14 20:28:38 +00:00
Shuxin Yang f8e9a5a061 rdar://12753946
Implement rule : "x * (select cond 1.0, 0.0) -> select cond x, 0.0"

llvm-svn: 170226
2012-12-14 18:46:06 +00:00
Bill Schmidt 9f0b4ec0f5 This patch improves the 64-bit PowerPC InitialExec TLS support by providing
for a wider range of GOT entries that can hold thread-relative offsets.
This matches the behavior of GCC, which was not documented in the PPC64 TLS
ABI.  The ABI will be updated with the new code sequence.

Former sequence:

  ld 9,x@got@tprel(2)
  add 9,9,x@tls

New sequence:

  addis 9,2,x@got@tprel@ha
  ld 9,x@got@tprel@l(9)
  add 9,9,x@tls

Note that a linker optimization exists to transform the new sequence into
the shorter sequence when appropriate, by replacing the addis with a nop
and modifying the base register and relocation type of the ld.

llvm-svn: 170209
2012-12-14 17:02:38 +00:00
Evgeniy Stepanov 49175b237d [msan] Origin stores and loads do not need explicit alignment.
Origin address is always 4 byte aligned, and the access type is always i32.

llvm-svn: 170199
2012-12-14 13:43:11 +00:00
David Blaikie 37fefc3f8d Debug Info: add support to mark member variables as artificial
This is the LLVM portion of r170154.

llvm-svn: 170156
2012-12-13 22:43:07 +00:00
NAKAMURA Takumi 38d2b2442f Revert r170020, "Simplify negated bit test", for now.
This assumes (1 << n) is always not zero. Consider n is greater than word size.
Although I know it is undefined, this transforms undefined behavior hidden.

This led clang unexpected behavior with some failures. I will investigate to fix undefined shl in clang.

llvm-svn: 170128
2012-12-13 14:28:16 +00:00
Akira Hatanaka cf9a61b6ee [mips] Do not copy GOT address to register $gp if the function being called has
internal linkage.

llvm-svn: 170092
2012-12-13 03:17:29 +00:00
Eli Bendersky f9be4c8b47 Make this Lit config file a bit slimmer
llvm-svn: 170083
2012-12-13 02:03:46 +00:00
Evan Cheng bf0baa9de7 Fix a bug in DAGCombiner::MatchBSwapHWord. Make sure the node has operands before referencing them. rdar://12868039
llvm-svn: 170078
2012-12-13 01:34:32 +00:00
Quentin Colombet c0dba2035a Take into account minimize size attribute in the inliner.
Better controls the inlining of functions when the caller function has MinSize attribute.
Basically, when the caller function has this attribute, we do not "force" the inlining
of callee functions carrying the InlineHint attribute (i.e., functions defined with
inline keyword)

llvm-svn: 170065
2012-12-13 01:05:25 +00:00
Nadav Rotem 36510f7194 Teach the cost model about the optimization in r169904: Truncation of induction variables costs the same as scalar trunc.
llvm-svn: 170051
2012-12-13 00:21:03 +00:00
Jakub Staszak c6ecd7deba Fix typo, which prevent test from being check.
llvm-svn: 170025
2012-12-12 21:10:56 +00:00
Jakub Staszak a3619d31d8 unHECKify test fixed by Jacob in r159003.
llvm-svn: 170023
2012-12-12 20:58:42 +00:00
David Majnemer 5226aa94ce Simplify negated bit test
llvm-svn: 170020
2012-12-12 20:48:54 +00:00
Evan Cheng b7d3d03bf9 Fix a logic bug in inline expansion of memcpy / memset with an overlapping
load / store pair. It's not legal to use a wider load than the size of
the remaining bytes if it's the first pair of load / store.

llvm-svn: 170018
2012-12-12 20:43:23 +00:00
Jakub Staszak 0a74fc8d6c unHECKify test. It was fixed by Chris in 2009.
llvm-svn: 170017
2012-12-12 20:43:00 +00:00
Bill Schmidt 25ffd19502 The ordering of two relocations on the same instruction is apparently not
predictable when compiled on at least one non-PowerPC host.  Source of
nondeterminism not apparent.  Restrict the test to build on PowerPC hosts
for now while looking into the issue further.

llvm-svn: 170016
2012-12-12 20:29:20 +00:00
Jakub Staszak aee9cca331 Fix typo in test-case.
llvm-svn: 170015
2012-12-12 20:29:06 +00:00
Jakub Staszak 67bf76ebbc Fix typo.
llvm-svn: 170006
2012-12-12 19:47:04 +00:00
Nadav Rotem d0bb22bba3 LoopVectorizer: Use the "optsize" attribute to decide if we are allowed to increase the function size.
llvm-svn: 170004
2012-12-12 19:29:45 +00:00
Bill Schmidt 24b8dd6eb7 This patch implements local-dynamic TLS model support for the 64-bit
PowerPC target.  This is the last of the four models, so we now have 
full TLS support.

This is mostly a straightforward extension of the general dynamic model.
I had to use an additional Chain operand to tie ADDIS_DTPREL_HA to the
register copy following ADDI_TLSLD_L; otherwise everything above the
ADDIS_DTPREL_HA appeared dead and was removed.

As before, there are new test cases to test the assembly generation, and
the relocations output during integrated assembly.  The expected code
gen sequence can be read in test/CodeGen/PowerPC/tls-ld.ll.

There are a couple of things I think can be done more efficiently in the
overall TLS code, so there will likely be a clean-up patch forthcoming;
but for now I want to be sure the functionality is in place.

Bill

llvm-svn: 170003
2012-12-12 19:29:35 +00:00
Alexey Samsonov 3d43b63a6e Improve debug info generated with enabled AddressSanitizer.
When ASan replaces <alloca instruction> with
<offset into a common large alloca>, it should also patch
llvm.dbg.declare calls and replace debug info descriptors to mark
that we've replaced alloca with a value that stores an address
of the user variable, not the user variable itself.

See PR11818 for more context.

llvm-svn: 169984
2012-12-12 14:31:53 +00:00
NAKAMURA Takumi be230b8fdb llvm/test/CodeGen/X86/atom-bypass-slow-division.ll: Fix possible typo(s) in CHECK-NOT lines.
Found by Alexander Zinenko, thanks!

llvm-svn: 169978
2012-12-12 13:34:20 +00:00
NAKAMURA Takumi cae5321a3b llvm/test/CodeGen/X86/atom-bypass-slow-division.ll: Rename symbols, s/test_/Test/g, not to mismatch "CHECK(-NOT): test".
llvm-svn: 169977
2012-12-12 13:34:14 +00:00
NAKAMURA Takumi 69d1405e48 llvm/test/CodeGen/X86/store_op_load_fold.ll: Fix typo, s/CHECK_NEXT/CHECK-NEXT/
llvm-svn: 169957
2012-12-12 01:41:01 +00:00
NAKAMURA Takumi 01ac65af00 llvm/test/CodeGen/X86/store_op_load_fold.ll: Add explicit triple.
llvm-svn: 169956
2012-12-12 01:40:56 +00:00
Manman Ren 82751a105c DAGCombine: clamp hi bit in APInt::getBitsSet to avoid assertion
rdar://12838504

llvm-svn: 169951
2012-12-12 01:13:50 +00:00
Evan Cheng 04e5518783 Avoid using lossy load / stores for memcpy / memset expansion. e.g.
f64 load / store on non-SSE2 x86 targets.

llvm-svn: 169944
2012-12-12 00:42:09 +00:00
Shuxin Yang 81b3678564 - Fix a problematic way in creating all-the-1 APInt.
- Propagate "exact" bit of [l|a]shr instruction.

llvm-svn: 169942
2012-12-12 00:29:03 +00:00
Michael Ilseman bb6f691b01 Added a slew of SimplifyInstruction floating-point optimizations, many of which take advantage of fast-math flags. Test cases included.
fsub X, +0 ==> X
  fsub X, -0 ==> X, when we know X is not -0
  fsub +/-0.0, (fsub -0.0, X) ==> X
  fsub nsz +/-0.0, (fsub +/-0.0, X) ==> X
  fsub nnan ninf X, X ==> 0.0
  fadd nsz X, 0 ==> X
  fadd [nnan ninf] X, (fsub [nnan ninf] 0, X) ==> 0
    where nnan and ninf have to occur at least once somewhere in this expression
  fmul X, 1.0 ==> X

llvm-svn: 169940
2012-12-12 00:27:46 +00:00
Nadav Rotem f707bf4ca3 PR14574. Fix a bug in the code that calculates the mask the converted PHIs in if-conversion.
llvm-svn: 169916
2012-12-11 21:30:14 +00:00