Commit Graph

103552 Commits

Author SHA1 Message Date
Simon Pilgrim b079c8b35b [X86][SSE] Change memop fragment to inherit from vec128load with local alignment controls
First possible step towards merging SSE/AVX memory folding pattern fragments.

Also allows us to remove the duplicate non-temporal load logic.

Differential Revision: https://reviews.llvm.org/D33902

llvm-svn: 305184
2017-06-12 10:01:27 +00:00
Craig Topper 69fead95c7 [AVX-512] Add VPCONFLICT and VPLZCNT to load folding tables.
llvm-svn: 305180
2017-06-12 04:57:31 +00:00
Yaron Keren 7d46392124 Address http://bugs.llvm.org/pr32207 by making BannerPrinted local to runOnSCC and skipping banner for function declarations.
Reviewed By: Mehdi AMINI

Differential Revision: https://reviews.llvm.org/D34086

llvm-svn: 305179
2017-06-12 02:18:50 +00:00
Sanjay Patel dcbfbb11d9 [x86] use vperm2f128 rather than vinsertf128 when there's a chance to fold a 32-byte load
I was looking closer at the x86 test diffs in D33866, and the first change seems like it 
shouldn't happen in the first place. So this patch will resolve that.

Using Agner's tables and AMD docs, vperm2f128 and vinsertf128 have identical timing for 
any given CPU model, so we should be able to interchange those without affecting perf. 
But as we can see in some of the diffs here, using vperm2f128 allows load folding, so 
we should take that opportunity to reduce code size and register pressure.

A secondary advantage is making AVX1 and AVX2 codegen more similar. Given that vperm2f128 
was introduced with AVX1, we should be selecting it in all of the same situations that we 
would with AVX2. If there's some reason that an AVX1 CPU would not want to use this 
instruction, that should be fixed up in a later pass.

Differential Revision: https://reviews.llvm.org/D33938

llvm-svn: 305171
2017-06-11 21:18:58 +00:00
Xinliang David Li 7ed6cd32ea [PartialInlining] Support shrinkwrap life_range markers
Differential Revision: http://reviews.llvm.org/D33847

llvm-svn: 305170
2017-06-11 20:46:05 +00:00
Simon Pilgrim 516938452f Fix unused variable warning on non-debug EXPENSIVE_CHECKS builds
llvm-svn: 305163
2017-06-11 12:49:29 +00:00
Amaury Sechet 2127452ff7 [DAGCombine] Make sure we check the ResNo from UADDO before combining
Summary: UADDO has 2 result, and one must check the result no before doing any kind of combine. Without it, the transform is invalid.

Reviewers: joerg

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34088

llvm-svn: 305162
2017-06-11 11:36:38 +00:00
Davide Italiano 83122058cf [MemorySSA] preservesAll() implies preserves<MemorySSA>(). NFCI.
llvm-svn: 305160
2017-06-11 01:05:45 +00:00
David Blaikie a91885a08c dwarfdump: Handle relocs to zlib (.zdebug*) compressed sections
llvm-svn: 305152
2017-06-10 19:32:50 +00:00
Vedant Kumar c6e9e3007b Fix a ubsan failure introduced by r305092
lib/Object/WindowsResource.cpp:578:3: runtime error: store to
misaligned address 0x7fa09aedebbe for type 'unsigned int', which
requires 4 byte alignment
0x7fa09aedebbe: note: pointer points here
00 00 03 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00
            ^

llvm-svn: 305149
2017-06-10 18:07:24 +00:00
Geoff Berry 3cca1da20c [EarlyCSE] Add option to use MemorySSA for function simplification run of EarlyCSE (off by default).
Summary:
Use MemorySSA for memory dependency checking in the EarlyCSE pass at the
start of the function simplification portion of the pipeline.  We rely
on the fact that GVNHoist runs just after this pass of EarlyCSE to
amortize the MemorySSA construction cost since GVNHoist uses MemorySSA
and EarlyCSE preserves it.

This is turned off by default.  A follow-up change will turn it on to
allow for easier reversion in case it breaks something.

llvm-svn: 305146
2017-06-10 15:20:03 +00:00
Galina Kistanova 038f9854ec Added llvm_unreachable as ReportError cannot be specified as noreturn.
llvm-svn: 305143
2017-06-10 07:50:14 +00:00
Wei Ding 7c3e5115a5 AMDGPU : Fix ISA Version Definitions.
Differential Revision: http://reviews.llvm.org/D28531

llvm-svn: 305137
2017-06-10 03:53:19 +00:00
Andrew Kaylor 647025f9e1 [InstSimplify] Don't constant fold or DCE calls that are marked nobuiltin
Differential Revision: https://reviews.llvm.org/D33737

llvm-svn: 305132
2017-06-09 23:18:11 +00:00
Sanjay Patel 2843cad435 [CGP] add a reference to DataLayout in MemCmpExpansion; NFCI
We're currently passing endian-ness around as a param (and not uniformly),
so this eliminates the need for that. I'd like to add a constant fold
call too, and that requires a DL.

llvm-svn: 305129
2017-06-09 23:01:05 +00:00
I-Jui (Ray) Sung 21fde385fa [AArch64] Add fallback in FastISel fp16 conversions
Summary:
- Fix assertion failures on F16 to/from int types in FastISel by falling
  back to regular ISel
- Add a testcase of various conversion cases with FastISel (-O0)

Reviewers: kristof.beyls, jmolloy, SjoerdMeijer

Reviewed By: SjoerdMeijer

Subscribers: SjoerdMeijer, llvm-commits, srhines, pirama, aemerson, rengolin, javed.absar, kristof.beyls

Differential Revision: https://reviews.llvm.org/D33734

llvm-svn: 305127
2017-06-09 22:40:50 +00:00
Craig Topper 7ad13f259f [LVI] Fix spelling error in comment. NFC
llvm-svn: 305115
2017-06-09 21:21:17 +00:00
Craig Topper 6dd9dcf26e [LVI] Const correct and rename the LVILatticeVal parameter to getPredicateResult. NFC
Previously it was non-const reference named Result which would tend to make someone think that it was an outparam when really its an input.

llvm-svn: 305114
2017-06-09 21:18:16 +00:00
Zachary Turner 3226fe95bb [pdb] Support CoffSymbolRVA debug subsection.
llvm-svn: 305108
2017-06-09 20:46:52 +00:00
Yaxun Liu 6455b0dbf3 [SROA] Fix APInt size when load/store have different address space
Currently there is a bug in SROA::presplitLoadsAndStores which causes assertion in
GEPOperator::accumulateConstantOffset.

Basically it does not consider the situation that the pointer operand of load or store
may be in a non-zero address space and its size may be different from the size of
a pointer in address space 0.

This patch fixes assertion when compiling Blender Cycles kernels for amdgpu backend.

Diffferential Revision: https://reviews.llvm.org/D33298

llvm-svn: 305107
2017-06-09 20:46:29 +00:00
Keno Fischer 5329174cb1 [Sink] Fix predicate in legality check
Summary:
isSafeToSpeculativelyExecute is the wrong predicate to use here.
All that checks for is whether it is safe to hoist a value due to
unaligned/un-dereferencable accesses. However, not only are we doing
sinking rather than hoisting, our concern is that the location
we're loading from may have been modified. Instead forbid sinking
any load across a critical edge.

Reviewers: majnemer

Subscribers: davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D33179

llvm-svn: 305102
2017-06-09 19:31:10 +00:00
Stanislav Mekhanoshin 1a61ab8172 [AMDGPU] Add intrinsics for alignbit and alignbyte instructions
Differential Revision: https://reviews.llvm.org/D34046

llvm-svn: 305098
2017-06-09 19:03:00 +00:00
Zachary Turner 7e62cd17d6 Allow VarStreamArray to use stateful extractors.
Previously extractors tried to be stateless with any additional
context information needed in order to parse items being passed
in via the extraction method.  This led to quite cumbersome
implementation challenges and awkwardness of use.  This patch
brings back support for stateful extractors, making the
implementation and usage simpler.

llvm-svn: 305093
2017-06-09 17:54:36 +00:00
Eric Beckmann d9de6389fc Implement COFF emission for parsed Windows Resource ( .res) files.
Summary: Add the WindowsResourceCOFFWriter class for producing the final COFF after all parsing is done.

Reviewers: hiraditya!, zturner, ruiu

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34020

llvm-svn: 305092
2017-06-09 17:34:30 +00:00
Simon Pilgrim 3d37b1a277 [X86][SSE] Add support for PACKSS nodes to faux shuffle extraction
If the inputs won't saturate during packing then we can treat the PACKSS as a truncation shuffle

llvm-svn: 305091
2017-06-09 17:29:52 +00:00
Craig Topper 31ce4ec2fd [LazyValueInfo] Don't run the more complex predicate handling code for EQ and NE in getPredicateResult
Summary:
Unless I'm mistaken, the special handling for EQ/NE should cover everything and there is no reason to fallthrough to the more complex code. For that matter I'm not sure there's any reason to special case EQ/NE other than avoiding creating temporary ConstantRanges.

This patch moves the complex code into an else so we only do it when we are handling a predicate other than EQ/NE.

Reviewers: anna, reames, resistor, Farhana

Reviewed By: anna

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34000

llvm-svn: 305086
2017-06-09 16:16:20 +00:00
Krzysztof Parzyszek 7aca2fd830 [Hexagon] Fixes and updates to the selection patterns
- Add some missing patterns.
- Use C4_cmplte in branch patterns.
- Fix signedness of immediate operand in M2_accii.

llvm-svn: 305085
2017-06-09 15:26:21 +00:00
Zvi Rackover 3a2c4b48bb SelectionDAG: Remove deleted nodes from legalized set to avoid clash with newly created nodes
Summary:
During DAG legalization loop in SelectionDAG::Legalize(),
bookkeeping of the SDNodes that were already legalized is implemented
with SmallPtrSet (LegalizedNodes). This kind of set stores only pointers
to objects, not the objects themselves. Unfortunately, if SDNode is
deleted during legalization for some reason, LegalizedNodes set is not
informed about this fact. This wouldn’t be so bad, if SelectionDAG wouldn’t reuse
space deallocated after deletion of unused nodes, for creation of new
ones. Because of this, new nodes, created during legalization often can
have pointers identical to ones that have been previously legalized,
added to the LegalizedNodes set, and deleted afterwards. This in turn
causes, that newly created nodes, sharing the same pointer as deleted
old ones, are present in LegalizedNodes *already at the moment of
creation*, so we never call Legalize on them.
The fix facilitates the fact, that DAG notifies listeners about each
modification. I have registered DAGNodeDeletedListener inside
SelectionDAG::Legalize, with a callback function that removes any
pointer of any deleted SDNode from the LegalizedNodes set. With this
modification, LegalizeNodes set does not contain pointers to nodes that
were deleted, so newly created nodes can always be inserted to it, even
if they share pointers with old deleted nodes.

Patch by pawel.szczerbuk@intel.com

The issue this patch addresses causes failures in an out-of-tree target,
and i was not able to create a reproducer for an in-tree target, hence
there is no test-case.

Reviewers: delena, spatel, RKSimon, hfinkel, davide, qcolombet

Reviewed By: delena

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33891

llvm-svn: 305084
2017-06-09 14:53:45 +00:00
Simon Dardis 212cccb2f4 Reland "[SelectionDAG] Enable target specific vector scalarization of calls and returns"
By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown,
backends can request that LLVM to scalarize vector types for calls
and returns.

The MIPS vector ABI requires that vector arguments and returns are passed in
integer registers. With SelectionDAG's new hooks, the MIPS backend can now
handle LLVM-IR with vector types in calls and returns. E.g.
'call @foo(<4 x i32> %4)'.

Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for
calls and returns if vector types were not legal. If vector types were legal,
a single 128bit vector argument would be assigned to a single 32 bit / 64 bit
integer register.

By teaching the MIPS backend to inspect the original types, it can now
implement the MIPS vector ABI which requires a particular method of
scalarizing vectors.

Previously, the MIPS backend relied on clang to scalarize types such as "call
@foo(<4 x float> %a) into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3,
i32 inreg %4)".

This patch enables the MIPS backend to take either form for vector types.

The previous version of this patch had a "conditional move or jump depends on
uninitialized value".

Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur

Differential Revision: https://reviews.llvm.org/D27845

llvm-svn: 305083
2017-06-09 14:37:08 +00:00
Sanjay Patel 70db424601 [SimplifyLibCalls] fix formatting; NFC
llvm-svn: 305081
2017-06-09 14:22:03 +00:00
Sanjay Patel fef83e8fb9 [ValueTracking] fix typo; NFC
llvm-svn: 305080
2017-06-09 14:21:18 +00:00
David Stuttard 82618baa0f [AMDGPU] Fix for issue in alloca to vector promotion pass
Summary:
Alloca promotion pass not dealing with non-canonical input

Added some additional checks so the pass simply backs-off forms it can't deal with (non-canonical)

Also added some test cases in non-canonical form to check that it no longer crashes

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D31710

llvm-svn: 305079
2017-06-09 14:16:22 +00:00
Javed Absar 9e1ff8654f [ARM] Custom machine-scheduler. NFCI.
This patch creates a customised machine-scheduler for ARM targets,
so that subsequently DAG mutations etc can be added.
Reviewed by: hahn, rengolin, rovka. 
Differential Revision: https://reviews.llvm.org/D34039

llvm-svn: 305078
2017-06-09 14:07:21 +00:00
Nirav Dave 670109d89a [MC] Fix compiler crash in AsmParser::Lex
When an empty comment is present in an assembly file, the compiler will crash because it checks the first character for '\n' or '\r'.
The fix consists of also checking if the string is empty before accessing the *front* method of the StringRef.
A test is included for the x86 target, but this issue is reproducible with other targets as well.

Patch by Alexandru Guduleasa!

Reviewers: niravd, grosbach, llvm-commits

Reviewed By: niravd

Differential Revision: https://reviews.llvm.org/D33993

llvm-svn: 305077
2017-06-09 14:04:03 +00:00
Krzysztof Parzyszek 7881415510 [Hexagon] Add LLVM header to HexagonPatterns.td
llvm-svn: 305074
2017-06-09 13:30:58 +00:00
Serge Rogatch 85427c0da7 [XRay] Fix computation of function size subject to XRay threshold
Summary:
Currently XRay compares its threshold against `Function::size()` . However, `Function::size()` returns the number of basic blocks (as I understand, such as cycle bodies, if/else bodies, switch-case bodies, etc.), rather than the number of instructions.

The name of the parameter `-fxray-instruction-threshold=N`, as well as XRay documentation at http://llvm.org/docs/XRay.html , suggests that instructions should be counted, rather than the number of basic blocks.

I see two options:
1. Count the number of MachineInstr`s in MachineFunction : this gives better  estimate for the number of assembly instructions on the target. So a user can check in disassembly that the threshold works more or less correctly.
2. Count the number of Instruction`s in a Function : AFAIK, this gives correct number of IR instructions, which the user can check in IR listing. However, this number may be far (several times for small functions) from the number of assembly instructions finally emitted.

Option 1 is implemented in this patch because I think that having the closer estimate for the number of assembly instructions emitted is more important than to have a clear definition of the metric.

Reviewers: dberris, rengolin

Reviewed By: dberris

Subscribers: llvm-commits, iid_iunknown

Differential Revision: https://reviews.llvm.org/D34027

llvm-svn: 305072
2017-06-09 13:23:23 +00:00
Nirav Dave 43a4d8122f Prevent RemoveDeadNodes from deleted already deleted node.
This prevents against assertion errors like PR32659 which occur from a
replacement deleting a node after it's been added to the list argument
of RemoveDeadNodes. The specific failure from PR32659 does not
currently happen, but it is still potentially possible. The underlying
cause is that the callers of the change dfunction builds up a list of
nodes to delete after having moved their uses and it possible that a
move of a later node will cause a previously deleted nodes to be
deleted.

Reviewers: bkramer, spatel, davide

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33731

llvm-svn: 305070
2017-06-09 12:57:35 +00:00
Oliver Stannard ad0973557c [ARM] Add scheduling info for VFMS
The scalar VFMS instructions did not have scheduling information attached (but
VFMA did), which was causing assertion failures with the Cortex-A57 scheduling
model and -fp-contract=fast.

Differential Revision: https://reviews.llvm.org/D34040

llvm-svn: 305064
2017-06-09 09:19:09 +00:00
Stefan Maksimovic add20f8f17 Test commit: remove whitespace
llvm-svn: 305059
2017-06-09 07:57:05 +00:00
David Blaikie 4a60d370e8 bugpoint: disabling symbolication of bugpoint-executed programs
Initial implementation - needs similar work/testing for other tools
bugpoint invokes (llc, lli I think, maybe more).

Alternatively (as suggested by chandlerc@) an environment variable could
be used. This would allow the option to pass transparently through user
scripts, pass to compilers if they happened to be LLVM-ish, etc.

I worry a bit about using cl::opt in the crash handling code - LLVM
might crash early, perhaps before the cl::opt is properly initialized?
Or at least before arguments have been parsed?

 - should be OK since it defaults to "pretty", so if the crash is very
 early in opt parsing, etc, then crash reports will still be symbolized.

I shyed away from doing this with an environment variable when I
realized that would require copying the existing environment and
appending the env variable of interest. But it seems there's no existing
LLVM API for accessing the environment (even the Support tests for
process launching have their own ifdefs for getting the environment). It
could be added, but seemed like a higher bar/untested codepath to
actually add environment variables.

Most importantly, this reduces the runtime of test/BugPoint/metadata.ll
in a split-dwarf Debug build from 1m34s to 6.5s by avoiding a lot of
symbolication. (this wasn't a problem for non-split-dwarf builds only
because the executable was too large to map into memory (due to bugpoint
setting a 400MB memory (including address space - not sure why? Going to
remove that) limit on the child process) so symbolication would fail
fast & wouldn't spend all that time parsing DWARF, etc)

Reviewers: chandlerc, dannyb

Differential Revision: https://reviews.llvm.org/D33804

llvm-svn: 305056
2017-06-09 07:29:03 +00:00
Serguei Katkov 38414b57f9 [IndVars] Add an option to be able to disable LFTR
This change adds an option disable-lftr to be able to disable Linear Function Test Replace optimization.
By default option is off so current behavior is not changed.

Reviewers: reames, sanjoy, wmi, andreadb, apilipenko
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33979

llvm-svn: 305055
2017-06-09 06:11:59 +00:00
George Burgess IV a20352e13e [LoopVectorize] Don't preserve nsw/nuw flags on shrunken ops.
If we're shrinking a binary operation, it may be the case that the new
operations wraps where the old didn't. If this happens, the behavior
should be well-defined. So, we can't always carry wrapping flags with us
when we shrink operations.

If we do, we get incorrect optimizations in cases like:

void foo(const unsigned char *from, unsigned char *to, int n) {
  for (int i = 0; i < n; i++)
    to[i] = from[i] - 128;
}

which gets optimized to:

void foo(const unsigned char *from, unsigned char *to, int n) {
  for (int i = 0; i < n; i++)
    to[i] = from[i] | 128;
}

Because:
- InstCombine turned `sub i32 %from.i, 128` into
  `add nuw nsw i32 %from.i, 128`.
- LoopVectorize vectorized the add to be `add nuw nsw <16 x i8>` with a
  vector full of `i8 128`s
- InstCombine took advantage of the fact that the newly-shrunken add
  "couldn't wrap", and changed the `add` to an `or`.

InstCombine seems happy to figure out whether we can add nuw/nsw on its
own, so I just decided to drop the flags. There are already a number of
places in LoopVectorize where we rely on InstCombine to clean up.

llvm-svn: 305053
2017-06-09 03:56:15 +00:00
David Blaikie cb9327b02d Inliner: Don't touch indirect calls
Other comments/implications are that this isn't intended behavior (nor
perserved/reimplemented in the new inliner) & complicates fixing the
'inlining' of trivially dead calls without consulting the cost function
first.

llvm-svn: 305052
2017-06-09 03:29:20 +00:00
Rui Ueyama 365d4d0000 Fix -Wunused-variable.
llvm-svn: 305051
2017-06-09 03:26:45 +00:00
Craig Topper a420562257 [InstCombine] Pass a proper context instruction to all of the calls into InstSimplify
Summary: This matches the behavior we already had for compares and makes us consistent everywhere.

Reviewers: dberlin, hfinkel, spatel

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33604

llvm-svn: 305049
2017-06-09 03:21:29 +00:00
Bob Haarman fdf499bf2d [codeview] use 32-bit integer for RelocOffset in DebugLinesSubsection
Summary:
RelocOffset is a 32-bit value, but we previously truncated it to 16 bits.

Fixes PR33335.

Reviewers: zturner, hiraditya!

Reviewed By: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33968

llvm-svn: 305043
2017-06-09 01:18:10 +00:00
Zachary Turner 28c22c83e3 [pdb] Don't crash on unknown debug subsections.
More and more unknown debug subsection kinds are being discovered
so we should make it possible to dump these and display the
bytes.

llvm-svn: 305041
2017-06-09 00:53:59 +00:00
Saleem Abdulrasool 1f62f57b37 sink DebugCompressionType into MC for exposing to clang
This is a preparatory change to expose the debug compression style to
clang.  It requires exposing the enumeration and passing the actual
value through to the backend from the frontend in actual value form
rather than a boolean that selects the GNU style of debug info
compression.

Minor tweak to the ELF Object Writer to use a variable for re-used
values.  Add an assertion that debug information format is one of the
two currently known types if debug information is being compressed.

llvm-svn: 305038
2017-06-09 00:40:19 +00:00
Zachary Turner deb391309c [CodeView] Support remaining debug subsection types
This adds support for Symbols, StringTable, and FrameData subsection
types.  Even though these subsections rarely if ever appear in a PDB
file (they are usually in object files), there's no theoretical reason
why they *couldn't* appear in a PDB.  The real issue though is that in
order to add support for dumping and writing them (which will be useful
for object files), we need a way to test them.  And since there is no
support for reading and writing them to / from object files yet, making
PDB support them is the best way to both add support for the underlying
format and add support for tests at the same time.  Later, when we go
to add support for reading / writing them from object files, we'll need
only minimal changes in the underlying read/write code.

llvm-svn: 305037
2017-06-09 00:28:08 +00:00
Zachary Turner 1bf7762049 [llvm-pdbdump] Support native ordering of subsections in raw mode.
This is the same change for the YAML Output style applied to the
raw output style.  Previously we would queue up all subsections
until every one had been read, and then output them in a pre-
determined order.  This was because some subsections need to be
read first in order to properly dump later subsections.  This
patch allows them to be dumped in the order they appear.

Differential Revision: https://reviews.llvm.org/D34015

llvm-svn: 305034
2017-06-08 23:49:01 +00:00