Commit Graph

54235 Commits

Author SHA1 Message Date
NAKAMURA Takumi 7bec74112d Unix/Process.inc: Give more useful random seed to srand. Workaround for PR12743.
llvm-svn: 156252
2012-05-06 08:24:24 +00:00
NAKAMURA Takumi 54acb28882 Support/Process: Move llvm::sys::Process::GetRandomNumber() from Process.cpp to Unix/Process.inc.
FIXME: GetRandomNumber() is not implemented in Win32.
llvm-svn: 156251
2012-05-06 08:24:18 +00:00
Chris Lattner 9322ba824c reapply my patch, with a fix for an off-by-one error. Turned out to be a lot
of work for a drive-by fix :)

llvm-svn: 156246
2012-05-05 22:17:32 +00:00
Chris Lattner 64f65d33df revert my patches, which are causing problems.
llvm-svn: 156245
2012-05-05 22:11:04 +00:00
Chris Lattner cd60bc491e refactor some code to expose column numbers more and make diagnostic printing slightly more efficient.
llvm-svn: 156243
2012-05-05 21:39:51 +00:00
Jim Grosbach 7ce129268e Nuke a few dead remnants of the CBE.
llvm-svn: 156241
2012-05-05 17:45:12 +00:00
Daniel Dunbar d5f82d92f3 [Support] Add missing include.
llvm-svn: 156240
2012-05-05 16:49:11 +00:00
Daniel Dunbar 58ed0c6c09 [Support] Fix up comments.
llvm-svn: 156239
2012-05-05 16:39:22 +00:00
Daniel Dunbar 3f0fa19bc4 [Support] Rewrite sys::fs::unique_file to not be stupid with /dev/urandom.
- Just use sys::Process::GetRandomNumber instead of having two poor
   implementations.
 - This is ~70 times (!) faster on my OS X machine.

llvm-svn: 156238
2012-05-05 16:36:24 +00:00
Daniel Dunbar b57ddd4e29 [Support] Add sys::Process::GetRandomNumber().
- Primitive API, but we rarely have need for random numbers.

llvm-svn: 156237
2012-05-05 16:36:20 +00:00
Benjamin Kramer 047d7ca0b1 CodeGenPrepare: Add a transform to turn selects into branches in some cases.
This came up when a change in block placement formed a cmov and slowed down a
hot loop by 50%:

	ucomisd	(%rdi), %xmm0
	cmovbel	%edx, %esi

cmov is a really bad choice in this context because it doesn't get branch
prediction. If we emit it as a branch, an out-of-order CPU can do a better job
(if the branch is predicted right) and avoid waiting for the slow load+compare
instruction to finish. Of course it won't help if the branch is unpredictable,
but those are really rare in practice.

This patch uses a dumb conservative heuristic, it turns all cmovs that have one
use and a direct memory operand into branches. cmovs usually save some code
size, so we disable the transform in -Os mode. In-Order architectures are
unlikely to benefit as well, those are included in the
"predictableSelectIsExpensive" flag.

It would be better to reuse branch probability info here, but BPI doesn't
support select instructions currently. It would make sense to use the same
heuristics as the if-converter pass, which does the opposite direction of this
transform.


Test suite shows a small improvement here and there on corei7-level machines,
but the actual results depend a lot on the used microarchitecture. The
transformation is currently disabled by default and available by passing the
-enable-cgp-select2branch flag to the code generator.

Thanks to Chandler for the initial test case to him and Evan Cheng for providing
me with comments and test-suite numbers that were more stable than mine :)

llvm-svn: 156234
2012-05-05 12:49:22 +00:00
Benjamin Kramer e31f31e5c0 Add a new target hook "predictableSelectIsExpensive".
This will be used to determine whether it's profitable to turn a select into a
branch when the branch is likely to be predicted.

Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM.

I'm not entirely happy with the name of this flag, suggestions welcome ;)

llvm-svn: 156233
2012-05-05 12:49:14 +00:00
Benjamin Kramer a25a61b9e8 NVPTX: Initialize the UseF32FTZ flag.
llvm-svn: 156232
2012-05-05 11:22:02 +00:00
Stepan Dyatkovskiy cb2a1a34e2 Small fix in InstCombineCasts.cpp. Restored "alloca + bitcast" reducing for case when alloca's size is calculated within the "add/sub/... nsw".
Also added fix to 2011-06-13-nsw-alloca.ll test.

llvm-svn: 156231
2012-05-05 07:09:40 +00:00
Eric Christopher de9e92ed9b Typo.
llvm-svn: 156226
2012-05-05 01:16:06 +00:00
Jakob Stoklund Olesen e326ed33a8 Make sure findRepresentativeClass picks the widest super-register.
We want the representative register class to contain the largest
super-registers available. This makes the function less sensitive to the
register class numbering.

llvm-svn: 156220
2012-05-04 22:53:28 +00:00
Jakob Stoklund Olesen e89496fe63 Remove extra comma in debug output.
llvm-svn: 156219
2012-05-04 22:53:26 +00:00
David Blaikie 891d0a3d20 Fix warnings in release build.
This fixes a couple of Clang warnings in release builds of LLVM:

* Missing return in ISelLowering
* Unused variable in NVPTXutil.cpp

llvm-svn: 156216
2012-05-04 22:34:16 +00:00
Kevin Enderby cabbae653e Tweak to the fix in r156212, as with the change in removing the shift the
SignExtend32<22>(Val<<1) also needs to change to SignExtend32<21>(Val) .

llvm-svn: 156213
2012-05-04 22:09:52 +00:00
Kevin Enderby 8ce1ada1be Fix a bug in the ARM disassembler for wide branch conditional instructions
where the symbolic operand's displacement was incorrectly shifted left by 1.
rdar://11387046

llvm-svn: 156212
2012-05-04 22:02:27 +00:00
Chandler Carruth cd3464ee22 Fix a Clang warning in the new NVPTX backend:
In file included from ../lib/Target/NVPTX/VectorElementize.cpp:53:
../lib/Target/NVPTX/NVPTX.h:44:3: warning: default label in switch which covers all enumeration values [-Wcovered-switch-default]
  default: assert(0 && "Unknown condition code");
  ^
1 warning generated.

The prevailing pattern in LLVM is to not use a default label, and instead to
use llvm_unreachable to denote that the switch in fact covers all return paths
from the function.

llvm-svn: 156209
2012-05-04 21:35:49 +00:00
Chandler Carruth 6781821c01 Teach the code extractor how to extract a sequence of blocks from
RegionInfo's RegionNode. This mirrors the logic for automating the
extraction from a Loop.

llvm-svn: 156208
2012-05-04 21:33:30 +00:00
Chandler Carruth 8880325a92 Rename the Region::block_iterator to Region::block_node_iterator, and
add a new Region::block_iterator which actually iterates over the basic
blocks of the region.

The old iterator, now call 'block_node_iterator' iterates over
RegionNodes which contain a single basic block. This works well with the
GraphTraits-based iterator design, however most users actually want an
iterator over the BasicBlocks inside these RegionNodes. Now the
'block_iterator' is a wrapper which exposes exactly this interface.
Internally it uses the block_node_iterator to walk all nodes which are
single basic blocks, but transparently unwraps the basic block to make
user code simpler.

While this patch is a bit of a wash, most of the updates are to internal
users, not external users of the RegionInfo. I have an accompanying
patch to Polly that is a strict simplification of every user of this
interface, and I'm working on a pass that also wants the same simplified
interface.

This patch alone should have no functional impact.

llvm-svn: 156202
2012-05-04 20:55:23 +00:00
Justin Holewinski ae556d3ef7 This patch adds a new NVPTX back-end to LLVM which supports code generation for NVIDIA PTX 3.0. This back-end will (eventually) replace the current PTX back-end, while maintaining compatibility with it.
The new target machines are:

nvptx (old ptx32) => 32-bit PTX
nvptx64 (old ptx64) => 64-bit PTX

The sources are based on the internal NVIDIA NVPTX back-end, and
contain more functionality than the current PTX back-end currently
provides.

NV_CONTRIB

llvm-svn: 156196
2012-05-04 20:18:50 +00:00
Sebastian Pop 2420e8b7d5 Added missing CMN case in Thumb2SizeReduction pass so that LLVM emits 16-bits encoding of CMN instructions.
llvm-svn: 156195
2012-05-04 19:53:56 +00:00
Preston Gurd d6c440cd4c Adds Intel Atom scheduling latencies to X86InstrSystem.td.
llvm-svn: 156194
2012-05-04 19:26:37 +00:00
Matt Beaumont-Gay e82ab6baa7 Pacify GCC's -Wreturn-type
llvm-svn: 156189
2012-05-04 18:34:27 +00:00
Chandler Carruth 14316fcf7d Factor the computation of input and output sets into a public interface
of the CodeExtractor utility. This allows speculatively computing input
and output sets to measure the likely size impact of the code
extraction.

These sets cannot be reused sadly -- we mutate the function prior to
forming the final sets used by the actual extraction.

The interface has been revamped slightly to make it easier to use
correctly by making the interface const and sinking the computation of
the number of exit blocks into the full extraction function and away
from the rest of this logic which just computed two output parameters.

llvm-svn: 156168
2012-05-04 11:20:27 +00:00
Chandler Carruth 44e13911bc Rather than trying to gracefully handle input sequences with repeated
blocks, assert that this doesn't happen. We don't want to bother trying
to support this call pattern as it isn't necessary.

llvm-svn: 156167
2012-05-04 11:17:06 +00:00
Chandler Carruth 0a570552d1 Fix a goof with my previous commit by completely returning when we
detect an in-eligible block rather than just breaking out of the loop.

llvm-svn: 156166
2012-05-04 11:14:19 +00:00
Chandler Carruth 2f5d0191f7 Hoist a safety assert from the extraction method into the construction
of the extractor itself.

llvm-svn: 156164
2012-05-04 10:26:45 +00:00
Chandler Carruth 0fde00150d Move the CodeExtractor utility to a dedicated header file / source file,
and expose it as a utility class rather than as free function wrappers.

The simple free-function interface works well for the bugpoint-specific
pass's uses of code extraction, but in an upcoming patch for more
advanced code extraction, they simply don't expose a rich enough
interface. I need to expose various stages of the process of doing the
code extraction and query information to decide whether or not to
actually complete the extraction or give up.

Rather than build up a new predicate model and pass that into these
functions, just take the class that was actually implementing the
functions and lift it up into a proper interface that can be used to
perform code extraction. The interface is cleaned up and re-documented
to work better in a header. It also is now setup to accept the blocks to
be extracted in the constructor rather than in a method.

In passing this essentially reverts my previous commit here exposing
a block-level query for eligibility of extraction. That is no longer
necessary with the more rich interface as clients can query the
extraction object for eligibility directly. This will reduce the number
of walks of the input basic block sequence by quite a bit which is
useful if this enters the normal optimization pipeline.

llvm-svn: 156163
2012-05-04 10:18:49 +00:00
Hans Wennborg aea412008e Make ARM and Mips use TargetMachine::getTLSModel()
This moves the logic for selecting a TLS model to a single place,
instead of the previous three (ARM, Mips, and X86 which already
uses this function).

llvm-svn: 156162
2012-05-04 09:40:39 +00:00
Craig Topper bdd2e34b1f Fix some loops to match coding standards. No functional change intended.
llvm-svn: 156159
2012-05-04 06:39:13 +00:00
Craig Topper d4d3237bb8 Fix up some spacing. No functional change.
llvm-svn: 156158
2012-05-04 06:18:33 +00:00
Craig Topper e2ae413746 Simplify broadcast lowering code. No functional change intended.
llvm-svn: 156157
2012-05-04 05:49:51 +00:00
Craig Topper 42f2182366 Allow v16i16 and v32i8 shuffles to be rewritten as narrower shuffles.
llvm-svn: 156156
2012-05-04 04:44:49 +00:00
Bill Wendling fa0ebcd1b0 Add 'landingpad' instructions to the list of instructions to ignore.
Also combine the code in the 'assert' statement.

llvm-svn: 156155
2012-05-04 04:22:32 +00:00
Craig Topper 59063c0a3d Simplify shuffle narrowing code a bit. No functional change intended.
llvm-svn: 156154
2012-05-04 04:08:44 +00:00
Jakob Stoklund Olesen 796e5272ab Remove the SubRegClasses field from RegisterClass descriptions.
This information in now computed by TableGen.

llvm-svn: 156152
2012-05-04 03:30:34 +00:00
Jakob Stoklund Olesen 75fbe90839 Use SuperRegClassIterator for findRepresentativeClass().
The masks returned by SuperRegClassIterator are computed automatically
by TableGen. This is better than depending on the manually specified
SuperRegClasses.

llvm-svn: 156147
2012-05-04 02:19:22 +00:00
Jakob Stoklund Olesen 34a8f13e5f Initialize SparcInstrInfo before SparcTargetLowering.
The TargetLowering construction needs to use a valid TargetRegisterInfo
instance.

llvm-svn: 156146
2012-05-04 02:16:39 +00:00
Jakob Stoklund Olesen 57c7050675 Add a SuperRegClassIterator class.
This iterator class provides a more abstract interface to the (Idx,
Mask) lists of super-registers for a register class. The layout of the
tables shouldn't be exposed to clients.

llvm-svn: 156144
2012-05-04 01:48:29 +00:00
Chandler Carruth da7513a834 A pile of long over-due refactorings here. There are some very, *very*
minor behavior changes with this, but nothing I have seen evidence of in
the wild or expect to be meaningful. The real goal is unifying our logic
and simplifying the interfaces. A summary of the changes follows:

- Make 'callIsSmall' actually accept a callsite so it can handle
  intrinsics, and simplify callers appropriately.
- Nuke a completely bogus declaration of 'callIsSmall' that was still
  lurking in InlineCost.h... No idea how this got missed.
- Teach the 'isInstructionFree' about the various more intelligent
  'free' heuristics that got added to the inline cost analysis during
  review and testing. This mostly surrounds int->ptr and ptr->int casts.
- Switch most of the interesting parts of the inline cost analysis that
  were essentially computing 'is this instruction free?' to use the code
  metrics routine instead. This way we won't keep duplicating logic.

All of this is motivated by the desire to allow other passes to compute
a roughly equivalent 'cost' metric for a particular basic block as the
inline cost analysis. Sadly, re-using the same analysis for both is
really messy because only the actual inline cost analysis is ever going
to go to the contortions required for simplification, SROA analysis,
etc.

llvm-svn: 156140
2012-05-04 00:58:03 +00:00
Jakob Stoklund Olesen 2f460ae3b4 Use a shared implementation of getMatchingSuperRegClass().
TargetRegisterClass now gives access to the necessary tables.

llvm-svn: 156122
2012-05-03 22:49:04 +00:00
Kevin Enderby 914223010c Fix issues with the ARM bl and blx thumb instructions and the J1 and J2 bits
for the assembler and disassembler.  Which were not being set/read correctly
for offsets greater than 22 bits in some cases.

Changes to lib/Target/ARM/ARMAsmBackend.cpp from Gideon Myles!

llvm-svn: 156118
2012-05-03 22:41:56 +00:00
Chandler Carruth a46e62424b Factor the logic for testing whether a basic block is viable for code
extraction into a public interface. Also clean it up and apply it more
consistently such that we check for landing pads *anywhere* in the
extracted code, not just in single-block extraction.

This will be used to guide decisions in passes that are planning to
eventually perform a round of code extraction.

llvm-svn: 156114
2012-05-03 22:26:53 +00:00
Nuno Lopes d4cf35d775 remove calls to calloc if the allocated memory is not used (it was already being done for malloc)
fix a few typos found by Chad in my previous commit

llvm-svn: 156110
2012-05-03 22:08:19 +00:00
Sirish Pande f8e5e3c072 Support for target dependent Hexagon VLIW packetizer.
This patch creates and optimizes packets as per Hexagon ISA rules.

llvm-svn: 156109
2012-05-03 21:52:53 +00:00
Nuno Lopes d2b71e7fa9 add support for calloc to objectsize lowering
llvm-svn: 156102
2012-05-03 21:19:58 +00:00