Commit Graph

60512 Commits

Author SHA1 Message Date
Benjamin Kramer dd67654af6 Reassociate: Avoid iterator invalidation.
OpndPtrs stored pointers into the Opnd vector that became invalid when the
vector grows. Store indices instead. Sadly I only have a large testcase that
only triggers under valgrind, so I didn't include it.

llvm-svn: 178793
2013-04-04 21:15:42 +00:00
Richard Osborne 0c12d1851e [XCore] Add bru instruction.
llvm-svn: 178783
2013-04-04 20:05:35 +00:00
Richard Osborne f18d95f756 [XCore] The RRegs register class is a superset of GRRegs.
At the time when the XCore backend was added there were some issues with
with overlapping register classes but these all seem to be fixed now.
Describing the register classes correctly allow us to get rid of a
codegen only instruction (LDAWSP_lru6_RRegs) and it means we can
disassemble ru6 instructions that use registers above r11.

llvm-svn: 178782
2013-04-04 19:57:46 +00:00
Jakob Stoklund Olesen 299475e0c6 Avoid high-latency false CPSR dependencies even for tMOVSi.
The Thumb2SizeReduction pass avoids false CPSR dependencies, except it
still aggressively creates tMOVi8 instructions because they are so
common.

Avoid creating false CPSR dependencies even for tMOVi8 instructions when
the the CPSR flags are known to have high latency. This allows integer
computation to overlap floating point computations.

Also process blocks in a reverse post-order and propagate high-latency
flags to successors.

<rdar://problem/13468102>

llvm-svn: 178773
2013-04-04 18:25:36 +00:00
Eli Bendersky fc186358f2 Formatting
llvm-svn: 178771
2013-04-04 18:03:41 +00:00
Vincent Lejeune bcbb13d691 R600: Use a mask for offsets when encoding instructions
llvm-svn: 178763
2013-04-04 14:00:09 +00:00
Vincent Lejeune 8e377fdba6 R600: Fix wrong address when substituting ENDIF
llvm-svn: 178762
2013-04-04 14:00:03 +00:00
Vincent Lejeune c44fa99719 R600: Take export into account when computing cf address
llvm-svn: 178761
2013-04-04 13:59:59 +00:00
Jakob Stoklund Olesen 8cfaffaade Add SPARC v9 support for select on 64-bit compares.
This requires v9 cmov instructions using the %xcc flags instead of the
%icc flags.

Still missing:
- Select floats on %xcc flags.
- Select i64 on %fcc flags.

llvm-svn: 178737
2013-04-04 03:08:00 +00:00
Manman Ren 5a15c9ed9f Debug Info: according to DWARF 2, FORM_ref_addr the same size as an address on
the target system.

It was hard-coded to 4 bytes before. I can't get llvm to generate a
ref_addr on a reasonably sized testing case.

rdar://problem/13559431

llvm-svn: 178722
2013-04-04 00:22:54 +00:00
Michael Gottesman 21a4ed3227 Refactored out the helper method FindPredecessorAutoreleaseWithSafePath from ObjCARCOpt::OptimizeReturns.
Now ObjCARCOpt::OptimizeReturns is easy to read and reason about.

llvm-svn: 178715
2013-04-03 23:39:14 +00:00
Michael Gottesman 6908db148b Refactored out the helper function FindPredecessorRetainWithSafePath from ObjCARCOpt::OptimizeReturns.
llvm-svn: 178714
2013-04-03 23:16:05 +00:00
Michael Gottesman c2d5bf5c53 Small cleanups.
Cleaned up trailing whitespace and added extra slashes in front of a
function level comment so that it follow the convention of having 3
slashes.

llvm-svn: 178712
2013-04-03 23:07:45 +00:00
Michael Gottesman 54dc7fdefb Refactored out a part of ObjCARCOpt::OptimizeReturns into its own method HasSafePathToPredecessorCall.
llvm-svn: 178710
2013-04-03 23:04:28 +00:00
Michael Gottesman 0a1748bb8c Removed an old comment.
llvm-svn: 178709
2013-04-03 23:04:24 +00:00
Michael Gottesman 43e7e00a68 Clean up arc annotations by moving the top/bottom BB annotations into conditional macros that no-op in Release mode instead of #ifdef sections of the code.
This is to follow the example of the DEBUG macro.

llvm-svn: 178705
2013-04-03 22:41:59 +00:00
Arnold Schwaighofer e9b5016411 X86 cost model: Vector shifts are expensive in most cases
The default logic does not correctly identify costs of casts because they are
marked as custom on x86.

For some cases, where the shift amount is a scalar we would be able to generate
better code. Unfortunately, when this is the case the value (the splat) will get
hoisted out of the loop, thereby making it invisible to ISel.

radar://13130673
radar://13537826

llvm-svn: 178703
2013-04-03 21:46:05 +00:00
Vincent Lejeune c3d3f9b66e R600: Fix last ALU of a clause being emitted in a separate clause
llvm-svn: 178675
2013-04-03 18:24:47 +00:00
Aaron Ballman 5e6d20524a Ensuring that both bits are set, and not just a combination of one or the other.
llvm-svn: 178674
2013-04-03 18:00:22 +00:00
Hal Finkel b0c810ff6d Cleanup PPC reciprocal-estimate functionality
Incorporating review feedback from Bill Schmidt on r178617. No functionality
change intended.

llvm-svn: 178672
2013-04-03 17:44:56 +00:00
Vincent Lejeune 80031d9fc4 R600: Factorize maximum alu per clause in a single location
llvm-svn: 178667
2013-04-03 16:49:34 +00:00
Aaron Ballman edc03c660c Testing for Visual Studio 2010 SP1 or greater before calling the _xgetbv intrinsic. This also fixes a minor code formatting issue.
llvm-svn: 178666
2013-04-03 16:28:24 +00:00
Vincent Lejeune b6d6c0d458 R600: Simplify data structure and add DEBUG to R600ControlFlowFinalizer
llvm-svn: 178665
2013-04-03 16:24:09 +00:00
Vincent Lejeune 9931298b30 R600: Consider KILLGT as an ALU instruction
Mesa does not override llvm behavior wrt KILLGT anymore so llvm
has to handle KILLGT on its own.

llvm-svn: 178664
2013-04-03 16:24:04 +00:00
Eli Bendersky b35a211f61 Measure time that IR parsing took as part of the -time-passes measurement.
llvm-svn: 178662
2013-04-03 15:33:45 +00:00
Hal Finkel 7ac4592e97 PPC: Enable FRES and FRSQRTE on the default PPC64 description
I discussed this with Bill Schmidt on IRC, and it was decided that this is a
safe and reasonable default.

llvm-svn: 178659
2013-04-03 14:40:18 +00:00
Hal Finkel 0c6d21933a PPC: Add a FIXME regarding the non-working fma+fneg Altivec pattern
llvm-svn: 178658
2013-04-03 14:40:16 +00:00
Hal Finkel 2ed21a8ca6 Remove some obsolete PowerPC/README entries
llvm-svn: 178657
2013-04-03 14:25:55 +00:00
Ulrich Weigand 084ff8e891 More direct types in PowerPC AltiVec intrinsics.
This patch follows up on work done by Bill Schmidt in r178277,
and replaces most of the remaining uses of VRRC in ISEL DAG patterns.

The resulting .inc files are identical except for comments, so
no change in code generation is expected.

llvm-svn: 178656
2013-04-03 14:08:13 +00:00
Bill Schmidt 92e26646bc Fix PR15632: No support for ppcf128 floating-point remainder on PowerPC.
For this we need to use a libcall.  Previously LLVM didn't implement
libcall support for frem, so I've added it in the usual
straightforward manner.  A test case from the bug report is included.

llvm-svn: 178639
2013-04-03 13:05:44 +00:00
Tim Northover 5816ca117b AArch64: implement ETMv4 trace system registers.
llvm-svn: 178637
2013-04-03 12:31:29 +00:00
Aaron Ballman 5f7c680fdc Second pass at addressing PR15351 by explicitly checking for AVX support
when getting the host processor information.  It emits a .byte sequence on GNUC compilers to work around lack of xgetbv support with older assemblers, and resolves a comment typo found in the previous patch.

llvm-svn: 178636
2013-04-03 12:25:06 +00:00
Timur Iskhodzhanov f4e0665e56 Fix SRet for thiscall in i686-pc-win32
llvm-svn: 178634
2013-04-03 11:27:54 +00:00
Tim Northover 5b097a735f AArch64: switch patterns to be type-based rather than RegClass-based
It's a bit of churn in the blame log, but I think there are real benefits to
the newer system so I'm making the change in one go.

llvm-svn: 178633
2013-04-03 11:19:16 +00:00
Eric Christopher 14c2067ca1 Fix grammar.
llvm-svn: 178624
2013-04-03 05:29:58 +00:00
Eric Christopher 5590949f29 Remove ZeroOrMore from the option description. We don't need it here.
llvm-svn: 178623
2013-04-03 05:26:07 +00:00
Jakob Stoklund Olesen d9bbdfd3cc Add 64-bit compare + branch for SPARC v9.
The same compare instruction is used for 32-bit and 64-bit compares. It
sets two different sets of flags: icc and xcc.

This patch adds a conditional branch instruction using the xcc flags for
64-bit compares.

llvm-svn: 178621
2013-04-03 04:41:44 +00:00
Hal Finkel b00fc87608 Remove some unsupported-feature comments from PPC.td
These refer to the reciprocal estimate support recently committed.

llvm-svn: 178618
2013-04-03 04:03:58 +00:00
Hal Finkel 2e10331057 Use PPC reciprocal estimates with Newton iteration in fast-math mode
When unsafe FP math operations are enabled, we can use the fre[s] and
frsqrte[s] instructions, which generate reciprocal (sqrt) estimates, together
with some Newton iteration, in order to quickly generate floating-point
division and sqrt results. All of these instructions are separately optional,
and so each has its own feature flag (except for the Altivec instructions,
which are covered under the existing Altivec flag). Doing this is not only
faster than using the IEEE-compliant fdiv/fsqrt instructions, but allows these
computations to be pipelined with other computations in order to hide their
overall latency.

I've also added a couple of missing fnmsub patterns which turned out to be
missing (but are necessary for good code generation of the Newton iterations).
Altivec needs a similar fix, but that will probably be more complicated because
fneg is expanded for Altivec's v4f32.

llvm-svn: 178617
2013-04-03 04:01:11 +00:00
Rafael Espindola b9b7ae0c78 Fix the fde encoding used by mips to match gas.
This finally fixes the encoding. The patch also
* Removes eh-frame.ll. It was an unnecessary .ll to .o test that was checking
  the wrong value.
* Merge fde-reloc.s and eh-frame.s into a single test, since the only difference
  was the run lines.
* Don't blindly test the content of the entire .eh_frame section. It makes it
  hard to anyone actually fixing a bug and hitting a difference in a binary
  blob. Instead, use a CHECK for each field and document what is being checked.

llvm-svn: 178615
2013-04-03 03:13:19 +00:00
Aaron Ballman 9c0f0af54f Rolling back the AVX support patch due to breaking a gcc 4.6 build bot that doesn't understand the xgetbv instruction for some reason. Will revisit when time permits.
llvm-svn: 178614
2013-04-03 03:11:39 +00:00
Michael Gottesman b8c8836594 Remove an optimization where we were changing an objc_autorelease into an objc_autoreleaseReturnValue.
The semantics of ARC implies that a pointer passed into an objc_autorelease
must live until some point (potentially down the stack) where an
autorelease pool is popped. On the other hand, an
objc_autoreleaseReturnValue just signifies that the object must live
until the end of the given function at least.

Thus objc_autorelease is stronger than objc_autoreleaseReturnValue in
terms of the semantics of ARC* implying that performing the given
strength reduction without any knowledge of how this relates to
the autorelease pool pop that is further up the stack violates the
semantics of ARC.

*Even though objc_autoreleaseReturnValue if you know that no RV
optimization will occur is more computationally expensive.

llvm-svn: 178612
2013-04-03 02:57:24 +00:00
Michael Gottesman 624243914f Improved comment. No functionality change.
llvm-svn: 178605
2013-04-03 01:57:16 +00:00
Aaron Ballman 56be6ba5e4 Attempting to fix the build on older GCC versions.
llvm-svn: 178604
2013-04-03 01:39:37 +00:00
Aaron Ballman 6bc0dfc7bd This patch addresses PR15351 by explicitly checking for AVX support
when getting the host processor information.

llvm-svn: 178598
2013-04-03 00:33:32 +00:00
Eric Christopher e2fbc67e81 Formatting.
llvm-svn: 178589
2013-04-02 23:06:40 +00:00
Akira Hatanaka 023c678a0d [mips] Small update to the implementation of eh.return for Mips.
This patch initializes t9 to the handler address, but only if the relocation
model is pic. This handles the case where handler to which eh.return jumps 
points to the start of the function.

Patch by Sasa Stankovic.

llvm-svn: 178588
2013-04-02 23:02:07 +00:00
Eric Christopher 6476f908b3 Support and test template arguments for unions.
llvm-svn: 178586
2013-04-02 22:55:56 +00:00
Eric Christopher 17dd8f07c6 Reformat arguments.
llvm-svn: 178585
2013-04-02 22:55:52 +00:00
Akira Hatanaka 2ffc5734e7 [mips] Expand pseudo multiply/divide instructions in MipsCodeEmitter.cpp.
This patch fixes the following two tests which have been failing on
llvm-mips-linux builder since r178403:

LLVM :: Analysis/Profiling/load-branch-weights-ifs.ll
LLVM :: Analysis/Profiling/load-branch-weights-loops.ll

llvm-svn: 178584
2013-04-02 22:53:58 +00:00
Jakob Stoklund Olesen aeb69a5481 Allow MachineTraceMetrics to be used when the model has no resources.
It it still possible to extract information from itineraries, for
example.

llvm-svn: 178582
2013-04-02 22:27:45 +00:00
Chad Rosier 8a24466f69 [ms-inline asm] Add support for parsing variables with namespace alias
qualifiers.

This patch only adds support for parsing these identifiers in the
X86AsmParser.  The front-end interface isn't capable of looking up
these identifiers at this point in time.  The end result is the
compiler now errors during object file emission, rather than at
parse time.  Test case coming shortly.
Part of rdar://13499009 and PR13340

llvm-svn: 178566
2013-04-02 20:02:33 +00:00
Bill Schmidt 3581cd4b4c Fix PR15630: Replace faulty stdcx. with stwcx.
When doing a partword atomic operation, a lwarx was being paired with
a stdcx. instead of a stwcx. when compiling for a 64-bit target.  The
target has nothing to do with it in this case; we always need a stwcx.

Thanks to Kai Nacke for reporting the problem.

llvm-svn: 178559
2013-04-02 18:37:08 +00:00
Jakob Stoklund Olesen 8fbfc59164 Don't attempt MTM heuristics without a scheduling model present.
This should fix the PPC buildbots.

llvm-svn: 178558
2013-04-02 18:26:45 +00:00
Jakob Stoklund Olesen 3ca14772d0 Count processor resources individually in MachineTraceMetrics.
The new instruction scheduling models provide information about the
number of cycles consumed on each processor resource. This makes it
possible to estimate ILP more accurately than simply counting
instructions / issue width.

The functions getResourceDepth() and getResourceLength() now identify
the limiting processor resource, and return a cycle count based on that.

This gives more precise resource information, particularly in traces
that use one resource a lot more than others.

llvm-svn: 178553
2013-04-02 17:49:51 +00:00
Chad Rosier 7925d280ff [fast-isel] Use the correct API to disable FastLowerArguments for Win64.
llvm-svn: 178549
2013-04-02 16:31:41 +00:00
Arnold Schwaighofer d6c6e868b2 DAGCombiner: Merge store/loads when we have extload/truncstores
This is helps on architectures where i8,i16 are not legal but we have byte, and
short loads/stores. Allowing us to merge copies like the one below on ARM.

copy(char *a, char *b, int n) {
 do {
   int t0 = a[0];
   int t1 = a[1];
   b[0] = t0;
   b[1] = t1;

radar://13536387

llvm-svn: 178546
2013-04-02 15:58:51 +00:00
Justin Holewinski a922c7e90e [NVPTX] Fix a few style issues in NVVMReflect
llvm-svn: 178536
2013-04-02 12:37:11 +00:00
Bill Wendling 88d06c3b2d Use a worklist to avoid a sneaky iterator invalidation.
The iterator could be invalidated when it's recursively deleting a whole bunch
of constant expressions in a constant initializer.

Note: This was only reproducible if `opt' was run on a `.bc' file. If `opt' was
run on a `.ll' file, it wouldn't crash. This is why the test first pushes the
`.ll' file through `llvm-as' before feeding it to `opt'.

PR15440

llvm-svn: 178531
2013-04-02 08:16:45 +00:00
Jakob Stoklund Olesen 8eabc3ffde Add 64-bit load and store instructions.
There is only a few new instructions, the rest is handled with patterns.

llvm-svn: 178528
2013-04-02 04:09:28 +00:00
Jakob Stoklund Olesen 917e07f095 Basic 64-bit ALU operations.
SPARC v9 extends all ALU instructions to 64 bits, so we simply need to
add patterns to use them for both i32 and i64 values.

llvm-svn: 178527
2013-04-02 04:09:23 +00:00
Jakob Stoklund Olesen bddb20eeef Materialize 64-bit immediates.
The last resort pattern produces 6 instructions, and there are still
opportunities for materializing some immediates in fewer instructions.

llvm-svn: 178526
2013-04-02 04:09:17 +00:00
Jakob Stoklund Olesen c1d1a4816e Add 64-bit shift instructions.
SPARC v9 defines new 64-bit shift instructions. The 32-bit shift right
instructions are still usable as zero and sign extensions.

This adds new F3_Sr and F3_Si instruction formats that probably should
be used for the 32-bit shifts as well. They don't really encode an
simm13 field.

llvm-svn: 178525
2013-04-02 04:09:12 +00:00
Jakob Stoklund Olesen 739d722ef7 Add predicates for distinguishing 32-bit and 64-bit modes.
The 'sparc' architecture produces 32-bit code while 'sparcv9' produces
64-bit code.

It is also possible to run 32-bit code using SPARC v9 instructions with:

  llc -march=sparc -mattr=+v9

llvm-svn: 178524
2013-04-02 04:09:06 +00:00
Jakob Stoklund Olesen 0b21f35aca Add support for 64-bit calling convention.
This is far from complete, but it is enough to make it possible to write
test cases using i64 arguments.

Missing features:
- Floating point arguments.
- Receiving arguments on the stack.
- Calls.

llvm-svn: 178523
2013-04-02 04:09:02 +00:00
Jakob Stoklund Olesen 5ad3b35377 Add an I64Regs register class for 64-bit registers.
We are going to use the same registers for 32-bit and 64-bit values, but
in two different register classes. The I64Regs register class has a
larger spill size and alignment.

The addition of an i64 register class confuses TableGen's type
inference, so it is necessary to clarify the type of some immediates and
the G0 register.

In 64-bit mode, pointers are i64 and should use the I64Regs register
class. Implement getPointerRegClass() to dynamically provide the pointer
register class depending on the subtarget. Use ptr_rc and iPTR for
memory operands.

Finally, add the i64 type to the IntRegs register class. This register
class is not used to hold i64 values, I64Regs is for that. The type is
required to appease TableGen's type checking in output patterns like this:

  def : Pat<(add i64:$a, i64:$b), (ADDrr $a, $b)>;

SPARC v9 uses the same ADDrr instruction for i32 and i64 additions, and
TableGen doesn't know to check the type of register sub-classes.

llvm-svn: 178522
2013-04-02 04:08:54 +00:00
Hal Finkel 93d75ea08a Fix typo in PPCISelLowering
Thanks to Bill Schmidt for finding this in review of r178480.

llvm-svn: 178521
2013-04-02 03:29:51 +00:00
Andrew Trick e1d88cfb57 The divide unit is not pipeline, but it is still buffered.
Buffered means a later divide may be executed out-of-order while a
prior divide is sitting (buffered) in a reservation station.

You can tell it's not pipelined, because operations that use it
reserve it for more than one cycle:

def : WriteRes<WriteIDiv, [HWPort0, HWDivider]> {
  let Latency = 25;
  let ResourceCycles = [1, 10];
}

We don't currently distinguish between an unpipeline operation and one
that is split into multiple micro-ops requiring the same unit. Except
that the later may have NumMicroOps > 1 if they also consume
issue/dispatch resources.

llvm-svn: 178519
2013-04-02 01:58:47 +00:00
NAKAMURA Takumi fd98f7f2b6 Target/R600: Fix CMake build to add missing files.
llvm-svn: 178508
2013-04-01 22:05:58 +00:00
Jack Carter 9423f507b1 Mips direct object exception handling regression
Revision 177141 caused a regression in all but
mips64 little endian. That is because none of the
other Mips targets had test cases checking the 
contents of the .eh_frame section. This patch fixes
both the llvm code and adds an assembler test case 
to include the current 4 flavors.

The test cases unfortunately rely on llvm-objdump. A
preferable method would be to use a pretty printer output
such as what readelf -wf <elf_file> would give.

I also changed the name of the test case to correct a typo.

llvm-svn: 178506
2013-04-01 21:55:15 +00:00
Vincent Lejeune bfaa63a6db R600: Add support for native control flow
llvm-svn: 178505
2013-04-01 21:48:05 +00:00
Vincent Lejeune ace6f7351e R600/SI: Share code recording ShaderTypeAttribute between generations
llvm-svn: 178504
2013-04-01 21:47:53 +00:00
Vincent Lejeune f43bc57b66 R600: Emit CF_ALU and use true kcache register.
llvm-svn: 178503
2013-04-01 21:47:42 +00:00
Eli Bendersky e60fc2f676 Fix top-comment header and some indentation
llvm-svn: 178492
2013-04-01 19:47:56 +00:00
Hal Finkel 3f88d08974 Fix a bad assert in PPCTargetLowering
llvm-svn: 178489
2013-04-01 18:42:58 +00:00
Shuxin Yang 6662fd0f15 Correct assertion condition
llvm-svn: 178484
2013-04-01 18:13:05 +00:00
Arnold Schwaighofer 6752366ed7 Merge load/store sequences with adresses: base + index + offset
We would also like to merge sequences that involve a variable index like in the
example below.

    int index = *idx++
    int i0 = c[index+0];
    int i1 = c[index+1];
    b[0] = i0;
    b[1] = i1;

By extending the parsing of the base pointer to handle dags that contain a
base, index, and offset we can handle examples like the one above.

The dag for the code above will look something like:

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i8 load %index))))

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i32 add (i32 signextend (i8 load %index))
                                         (i32 1)))))

The code that parses the tree ignores the intermediate sign extensions. However,
if there is a sign extension it needs to be on all indexes.

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (add (i8 load %index)
                                     (i8 1))))
 vs

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i32 add (i32 signextend (i8 load %index))
                                         (i32 1)))))
radar://13536387

llvm-svn: 178483
2013-04-01 18:12:58 +00:00
Hal Finkel f6d45f2379 Add more PPC floating-point conversion instructions
The P7 and A2 have additional floating-point conversion instructions which
allow a direct two-instruction sequence (plus load/store) to convert from all
combinations (signed/unsigned i32/i64) <--> (float/double) (on previous cores,
only some combinations were directly available).

llvm-svn: 178480
2013-04-01 17:52:07 +00:00
Hal Finkel 39caf9f5ec Use ImmToIdxMap.count in PPCRegisterInfo
Code improvement suggested by Jakob (in review of r178450). No functionality
change intended.

llvm-svn: 178473
2013-04-01 17:02:06 +00:00
Hal Finkel 290376dd78 Add the PPC popcntw instruction
The popcntw instruction is available whenever the popcntd instruction is
available, and performs a separate popcnt on the lower and upper 32-bits.
Ignoring the high-order count, this can be used for the 32-bit input case
(saving on the explicit zero extension otherwise required to use popcntd).

llvm-svn: 178470
2013-04-01 15:58:15 +00:00
Nadav Rotem be79a7ac7a Add support for vector data types in the LLVM interpreter.
Patch by:
Veselov, Yuri <Yuri.Veselov@intel.com>

llvm-svn: 178469
2013-04-01 15:53:30 +00:00
Hal Finkel 60c7510711 Treat PPCISD::STFIWX like the memory opcode that it is
PPCISD::STFIWX is really a memory opcode, and so it should come after
FIRST_TARGET_MEMORY_OPCODE, and we should use DAG.getMemIntrinsicNode to create
nodes using it.

No functionality change intended (although there could be optimization benefits
from preserving the MMO information).

llvm-svn: 178468
2013-04-01 15:37:53 +00:00
Duncan Sands fee96f832d Remove unused typedef.
llvm-svn: 178462
2013-04-01 13:46:15 +00:00
Arnold Schwaighofer 6793aebb84 ARM Scheduler Model: Add resources instructions, map resources in subtargets
Reapply r177968:
After commit 178074 we can now have undefined scheduler variants.

Move the CortexA9 resources into the CortexA9 SchedModel namespace. Define
resource mappings under the CortexA9 SchedModel. Define resources and mappings
for the SwiftModel.

Incooperate Andrew's feedback.

llvm-svn: 178460
2013-04-01 13:07:05 +00:00
Benjamin Kramer 52ceb44331 X86TTI: Add accurate costs for itofp operations, based on the actual instruction counts.
llvm-svn: 178459
2013-04-01 10:23:49 +00:00
Joe Abbey bc6f4baea9 Whitespace cleanup
llvm-svn: 178454
2013-04-01 02:28:07 +00:00
Vincent Lejeune 53f3525d35 R600: Emit native instructions for tex
llvm-svn: 178452
2013-03-31 19:33:04 +00:00
Duncan Sands e1aa194aab There is no longer any need to silence this compiler warning as the warning has
been turned off globally.

llvm-svn: 178451
2013-03-31 17:44:09 +00:00
Hal Finkel 8540f7771c Cleanup ImmToIdxMap and noImmForm in PPCRegisterInfo
ImmToIdxMap should be a DenseMap (not a std::map) because there
is no ordering requirement. Also, we don't need a separate list
of instructions for noImmForm in eliminateFrameIndex, because this
list is essentially the complement of the keys in ImmToIdxMap.

No functionality change intended.

llvm-svn: 178450
2013-03-31 14:43:31 +00:00
Benjamin Kramer b60633fb87 X86: Promote sitofp <8 x i16> to <8 x i32> when AVX is available.
A vector sext + sitofp is a lot cheaper than 8 scalar conversions.

llvm-svn: 178448
2013-03-31 12:49:15 +00:00
Hal Finkel beb296bea1 Add the PPC lfiwax instruction
This instruction is available on modern PPC64 CPUs, and is now used
to improve the SINT_TO_FP lowering (by eliminating the need for the
separate sign extension instruction and decreasing the amount of
needed stack space).

llvm-svn: 178446
2013-03-31 10:12:51 +00:00
Hal Finkel e53429a13e Cleanup PPC(64) i32 -> float/double conversion
The existing SINT_TO_FP code for i32 -> float/double conversion was disabled
because it relied on broken EXTSW_32/STD_32 instruction definitions. The
original intent had been to enable these 64-bit instructions to be used on CPUs
that support them even in 32-bit mode.  Unfortunately, this form of lying to
the infrastructure was buggy (as explained in the FIXME comment) and had
therefore been disabled.

This re-enables this functionality, using regular DAG nodes, but only when
compiling in 64-bit mode. The old STD_32/EXTSW_32 definitions (which were dead)
are removed.

llvm-svn: 178438
2013-03-31 01:58:02 +00:00
Benjamin Kramer 9335443236 DAGCombine: visitXOR can replace a node without returning it, bail out in that case.
Fixes the crash reported in PR15608.

llvm-svn: 178429
2013-03-30 21:28:18 +00:00
Benjamin Kramer 9c9e0a2c04 Change '@SECREL' suffix to GAS-compatible '@SECREL32'.
'@SECREL' is what is used by the Microsoft assembler, but GNU as expects '@SECREL32'.
With the patch, the MC-generated code works fine in combination with a recent GNU as (2.23.51.20120920 here).

Patch by David Nadlinger!
Differential Revision: http://llvm-reviews.chandlerc.com/D429

llvm-svn: 178427
2013-03-30 16:21:50 +00:00
Benjamin Kramer a73cc5eead Put private class into an anonmyous namespace.
llvm-svn: 178420
2013-03-30 15:23:08 +00:00
Justin Holewinski 59fd8ba5f5 [NVPTX] Remove support for SM < 2.0. This was never fully supported anyway.
llvm-svn: 178417
2013-03-30 14:29:30 +00:00
Justin Holewinski b94bd05b95 [NVPTX] Add NVVMReflect pass to allow compile-time selection of
specific code paths.

This allows us to write code like:

  if (__nvvm_reflect("FOO"))
    // Do something
  else
    // Do something else

and compile into a library, then give "FOO" a value at kernel
compile-time so the check becomes a no-op.

llvm-svn: 178416
2013-03-30 14:29:25 +00:00
Justin Holewinski 0497ab142d [NVPTX] Run clang-format on all NVPTX sources.
Hopefully this resolves any outstanding style issues and gives us
an automated way of ensuring we conform to the style guidelines.

llvm-svn: 178415
2013-03-30 14:29:21 +00:00
Shuxin Yang 7b0c94e207 Implement XOR reassociation. It is based on following rules:
rule 1: (x | c1) ^ c2 => (x & ~c1) ^ (c1^c2),
     only useful when c1=c2
  rule 2: (x & c1) ^ (x & c2) = (x & (c1^c2))
  rule 3: (x | c1) ^ (x | c2) = (x & c3) ^ c3 where c3 = c1 ^ c2
  rule 4: (x | c1) ^ (x & c2) => (x & c3) ^ c1, where c3 = ~c1 ^ c2

 It reduces an application's size (in terms of # of instructions) by 8.9%.
 Reviwed by Pete Cooper. Thanks a lot!

 rdar://13212115  

llvm-svn: 178409
2013-03-30 02:15:01 +00:00
Akira Hatanaka b3c1847b30 [mips] Add patterns for DSP indexed load instructions.
llvm-svn: 178408
2013-03-30 02:14:45 +00:00
Akira Hatanaka b1457304cc [mips] Define reg+imm load/store pattern templates.
llvm-svn: 178407
2013-03-30 02:01:48 +00:00
Akira Hatanaka fb221c197d [mips] Fix DSP instructions to have explicit accumulator register operands.
Check that instruction selection can select multiply-add/sub DSP instructions
from a pattern that doesn't have intrinsics.

llvm-svn: 178406
2013-03-30 01:58:00 +00:00
Akira Hatanaka 33c060480d Remove unused variables.
llvm-svn: 178405
2013-03-30 01:46:28 +00:00
Akira Hatanaka 9efcd76c2c [mips] Move the code which does dag-combine for multiply-add/sub nodes to
derived class MipsSETargetLowering.

We shouldn't be generating madd/msub nodes if target is Mips16, since Mips16
doesn't have support for multipy-add/sub instructions.

llvm-svn: 178404
2013-03-30 01:42:24 +00:00
Akira Hatanaka be8612f6f4 [mips] Fix definitions of multiply, multiply-add/sub and divide instructions.
The new instructions have explicit register output operands and use table-gen
patterns instead of C++ code to do instruction selection.

Mips16's instructions are unaffected by this change.

llvm-svn: 178403
2013-03-30 01:36:35 +00:00
Akira Hatanaka f0ea500c14 [mips] Remove function getFPBranchCodeFromCond. Rename invertFPCondCodeAdd.
llvm-svn: 178396
2013-03-30 01:16:38 +00:00
Akira Hatanaka d5a0e096bc Fix indentation.
llvm-svn: 178395
2013-03-30 01:15:17 +00:00
Akira Hatanaka 28721bd7dd [mips] Add mips-specific nodes which will be used to select multiply and divide
instructions.

llvm-svn: 178394
2013-03-30 01:14:04 +00:00
Akira Hatanaka 3a34d14745 [mips] Implement getRepRegClassFor in MipsSETargetLowering. This function is
called in several places in ScheduleDAGRRList.cpp.

llvm-svn: 178393
2013-03-30 01:12:05 +00:00
Akira Hatanaka cd77e15cfb [mips] Fix MipsSEInstrInfo::copyPhysReg, loadRegFromStack and storeRegToStack
to handle accumulator registers.

llvm-svn: 178392
2013-03-30 01:08:05 +00:00
Akira Hatanaka 3b70145184 [mips] Expand pseudo load, store and copy instructions right before
callee-saved scan.

The code makes use of register's scavenger's capability to spill multiple
registers.

llvm-svn: 178391
2013-03-30 01:04:11 +00:00
Akira Hatanaka c8d85025a0 [mips] Define pseudo instructions for spilling and copying accumulator
registers.

llvm-svn: 178390
2013-03-30 00:54:52 +00:00
Eric Christopher 4887c8f4ff Use SmallVectorImpl instead of SmallVector at the uses.
llvm-svn: 178386
2013-03-29 23:34:06 +00:00
Jean-Luc Duprat 89fe247094 SmallVector and SmallPtrSet allocations now power-of-two aligned.
This time tested on both OSX and Linux.

llvm-svn: 178377
2013-03-29 22:07:12 +00:00
Michael Gottesman 3b8f877860 Add clang.arc.used to ModuleHasARC so ARC always runs if said call is present in a module.
clang.arc.used is an interesting call for ARC since ObjCARCContract
needs to run to remove said intrinsic to avoid a linker error (since the
call does not exist).

llvm-svn: 178369
2013-03-29 21:15:23 +00:00
Jyotsna Verma add82b3c75 Hexagon: Add emitFrameIndexDebugValue function to emit debug information.
llvm-svn: 178368
2013-03-29 21:09:53 +00:00
Eric Christopher 9c8414f84a Use 12 as the magic number for our abbreviation data and our
die values. A lot of DIEs have 10 attributes in C++ code (example
clang), none had more than 12. Seems like a good default.

llvm-svn: 178366
2013-03-29 20:23:06 +00:00
Eric Christopher 6be35037b5 Move the construction of the skeleton compile unit after the
entire original compile unit has been constructed.

llvm-svn: 178365
2013-03-29 20:23:02 +00:00
Hal Finkel f8ac57e289 Implement FRINT lowering on PPC using frin
Like nearbyint, rint can be implemented on PPC using the frin instruction. The
complication comes from the fact that rint needs to set the FE_INEXACT flag
when the result does not equal the input value (and frin does not do that). As
a result, we use a custom inserter which, after the rounding, compares the
rounded value with the original, and if they differ, explicitly sets the XX bit
in the FPSCR register (which corresponds to FE_INEXACT).

Once LLVM has better modeling of the floating-point environment we should be
able to (often) eliminate this extra complexity.

llvm-svn: 178362
2013-03-29 19:41:55 +00:00
Akira Hatanaka 7b8b9b9abf [mips] Define a function which returns the GPR register class.
llvm-svn: 178359
2013-03-29 19:17:42 +00:00
Matt Arsenault 19f773be37 Build fixes for STLPort + GCC
llvm-svn: 178356
2013-03-29 18:48:45 +00:00
Matt Arsenault 2080ecd107 Fix loop style
llvm-svn: 178355
2013-03-29 18:48:42 +00:00
Benjamin Kramer 70671b9937 Remove the old CodePlacementOpt pass.
It was superseded by MachineBlockPlacement and disabled by default since LLVM 3.1.

llvm-svn: 178349
2013-03-29 17:14:24 +00:00
Nadav Rotem 6036f581aa Fix a typo
llvm-svn: 178346
2013-03-29 16:34:23 +00:00
Jyotsna Verma 26226cea4b Hexagon: Disable DwarfUsesInlineInfoSection flag.
llvm-svn: 178345
2013-03-29 15:46:12 +00:00
Hal Finkel c20a08d25b Add PPC FP rounding instructions fri[mnpz]
These instructions are available on the P5x (and later) and on the A2. They
implement the standard floating-point rounding operations (floor, trunc, etc.).
One caveat: frin (round to nearest) does not implement "ties to even", and so
is only enabled in fast-math mode.

llvm-svn: 178337
2013-03-29 08:57:48 +00:00
Rafael Espindola de65751493 Revert "Fix allocations of SmallVector and SmallPtrSet so they are more prone to"
This reverts commit 617330909f0c26a3f2ab8601a029b9bdca48aa61.

It broke the bots:

/home/clangbuild2/clang-ppc64-2/llvm.src/unittests/ADT/SmallVectorTest.cpp:150: PushPopTest
/home/clangbuild2/clang-ppc64-2/llvm.src/unittests/ADT/SmallVectorTest.cpp:118: Failure
Value of: v[i].getValue()
  Actual: 0
Expected: value
Which is: 2

llvm-svn: 178334
2013-03-29 07:11:21 +00:00
Jean-Luc Duprat 67ce1472b4 Fix allocations of SmallVector and SmallPtrSet so they are more prone to
being power-of-two sized.

llvm-svn: 178332
2013-03-29 05:45:22 +00:00
Michael Gottesman 60f6b28c58 Removed trailing whitespace.
llvm-svn: 178329
2013-03-29 05:13:07 +00:00
Akira Hatanaka f05e9ad59f [mips] Change type of accumulator registers to Untyped. Add two more accumulator
register classes for Mips64 and DSP-ASE.

No functionality changes.

llvm-svn: 178328
2013-03-29 03:27:21 +00:00
Akira Hatanaka 465faccafa [mips] Define overloaded versions of storeRegToStack and loadRegFromStack.
No functionality changes.

llvm-svn: 178327
2013-03-29 02:14:12 +00:00
Akira Hatanaka 11184e4c8c [mips] Add parameter Alignment to MipsFrameLowering's constructor.
No functionality changes.

llvm-svn: 178326
2013-03-29 01:51:04 +00:00
Jack Carter 311246c6d5 [Mips Assembler] Add support for OR macro with imediate opperand
Mips assembler supports macros that allows the OR instruction 
to have an immediate parameter. This patch adds an instruction 
alias that converts this macro into a Mips ORI instruction. 

Contributer: Vladimir Medic
llvm-svn: 178316
2013-03-28 23:45:13 +00:00
Michael Liao a486a11dcf Add support of RDSEED defined in AVX2 extension
llvm-svn: 178314
2013-03-28 23:41:26 +00:00
Michael Liao 5fff5c7b26 Enhance boolean simplification to handle 16-/64-bit RDRAND
- RDRAND always clears the destination value when a random value is not
  available (i.e. CF == 0). This value is truncated or zero-extended as
  the false boolean value to be returned. Boolean simplification needs
  to skip this 'zext' or 'trunc' node.

llvm-svn: 178312
2013-03-28 23:38:52 +00:00
Michael Liao 96b42608ab Skip moving call address loading into callseq when targets prefer register indirect call.
To enable a load of a call address to be folded with that call, this
load is moved from outside of callseq into callseq. Such a moving
adds a non-glued node (that load) into a glued sequence. This non-glue
load is only removed when DAG selection folds them into a memory form
call instruction. When such instruction selection is disabled, it breaks
DAG schedule.

To prevent that, such moving is disabled when target favors register
indirect call.

Previous workaround disabling CALL32m/CALL64m insn selection is removed.

llvm-svn: 178308
2013-03-28 23:13:21 +00:00
Michael Gottesman ba64859e6e Removed dead code from ObjCARCOpts relating to tracking objc_retainBlocks through the ARC Dataflow analysis. By the time we get to the ARC dataflow analysis, any objc_retainBlock calls are not optimizable.
llvm-svn: 178306
2013-03-28 23:08:44 +00:00
Chad Rosier dbac025d84 [fast-isel] Add a preemptive fix for the case where we fail to materialize an
immediate in a register.  I don't believe this should ever fail, but I see no
harm in trying to make this code bullet proof.

I've added an assert to ensure my assumtion is correct.  If the assertion fires
something is wrong and we should fix it, rather then just silently fall back to
SelectionDAG isel.

llvm-svn: 178305
2013-03-28 23:04:47 +00:00
Jack Carter e1d85d55e6 [Mips Assembler] Add alias definitions for jal
Mips assembler allows following to be used as aliased instructions:
jal $rs for jalr $rs
jal $rd,$rd for jalr $rd,$rs

This patch provides alias definitions in td files and test cases to show the usage.

Contributer: Vladimir Medic
llvm-svn: 178304
2013-03-28 23:02:21 +00:00
Nadav Rotem ff8c45529c Add the X86 FMAs to the scheduling model.
llvm-svn: 178303
2013-03-28 22:54:45 +00:00
Bill Wendling 85722f48da Minor simplification.
Go ahead and use the full path for both the .gcno and .gcda files.

llvm-svn: 178302
2013-03-28 22:40:08 +00:00
Nadav Rotem e7b6a8aa8c Add the Haswell machine model.
llvm-svn: 178301
2013-03-28 22:34:46 +00:00
Nadav Rotem a20ec3164e Remove the unused port from the SandyBridge machine model
llvm-svn: 178300
2013-03-28 22:32:41 +00:00
Michael Liao c93fe7f8b2 Add ADX CPUID detection
llvm-svn: 178299
2013-03-28 22:29:53 +00:00
Eric Christopher 6c75232cf0 These two are default in the constructor for MCAsmInfo.
llvm-svn: 178293
2013-03-28 21:37:18 +00:00
Timur Iskhodzhanov a2fd5fdd7a Make Win32 put the SRet address into EAX, fixes PR15556
llvm-svn: 178291
2013-03-28 21:30:04 +00:00
Hal Finkel 22e41c411e Only enable 64-bit bswap DAG combines for PPC64
Compiling in 32-bit mode on a P7 would assert after 64-bit DAG combines were
added for bswap with load/store. This is because these combines are really only
valid in 64-bit mode, regardless of the CPU (and this was not being checked).

llvm-svn: 178286
2013-03-28 20:23:46 +00:00
Michael Gottesman 49f9885a2a Non optimizable objc_retainBlock calls are not forwarding.
Since we handle optimizable objc_retainBlocks through strength reduction
in OptimizableIndividualCalls, we know that all code after that point
will only see non-optimizable objc_retainBlock calls. IsForwarding is
only called by functions after that point, so it is ok to just classify
objc_retainBlock as non-forwarding.

<rdar://problem/13249661>.

llvm-svn: 178285
2013-03-28 20:11:30 +00:00
Michael Gottesman 158fdf699e [ObjCARC] Strength reduce objc_retainBlock -> objc_retain if the objc_retainBlock is optimizable.
If an objc_retainBlock has the copy_on_escape metadata attached to it
AND if the block pointer argument only escapes down the stack, we are
allowed to strength reduce the objc_retainBlock to to an objc_retain and
thus optimize it.

Current there is logic in the ARC data flow analysis to handle
this case which is complicated and involved making distinctions in
between objc_retainBlock and objc_retain in certain places and
considering them the same in others.

This patch simplifies said code by:

1. Performing the strength reduction in the initial ARC peephole
analysis (ObjCARCOpts::OptimizeIndividualCalls).

2. Changes the ARC dataflow analysis (which runs after the peephole
analysis) to consider all objc_retainBlock calls to not be optimizable
(since if the call was optimizable, we would have strength reduced it
already).

This patch leaves in the infrastructure in the ARC dataflow analysis to
handle this case, which due to 2 will just be dead code. I am doing this
on purpose to separate the removal of the old code from the testing of
the new code.

<rdar://problem/13249661>.

llvm-svn: 178284
2013-03-28 20:11:19 +00:00
Jyotsna Verma a46059b74d Hexagon: Replace switch-case in isDotNewInst with TSFlags.
llvm-svn: 178281
2013-03-28 19:44:04 +00:00