Commit Graph

155193 Commits

Author SHA1 Message Date
Balaram Makam 7c79470c71 [ARM/AARCH64] Make test MachineBranchProb.ll more robust and re-enable for ARM/AArch64
Summary: Make test robust enough to not fail due to CFG changes and re-enable for ARM/AArch64.

Reviewers: rovka, fhahn

Reviewed By: fhahn

Subscribers: fhahn, aemerson, rengolin, mcrosier, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D38590

llvm-svn: 315002
2017-10-05 18:33:34 +00:00
Reid Kleckner 7344282c36 [X86] Simplify X86 epilogue frame size calculation, NFC
Sink the insertion of "pop ebp" out of the frame size calculation
branches. They all check for HasFP.

Our handling of CLEANUPRET and CATCHRET was equivalent, both are
funclets and use the same frame size. We can eliminate the CLEANUPRET
case.

Hoist the hasFP(MF) query into a local bool.

Rename TargetMBB to CatchRetTarget to be more descriptive.

Eliminate the Optional<unsigned> RetOpcode local, now that it has one
use.

It's only a net savings of 10 lines, but hopefully it's *slightly* more
readable.

llvm-svn: 315000
2017-10-05 18:27:08 +00:00
Davide Italiano c8708e59e8 [PassManager] Improve the interaction between -O2 and ThinLTO.
Run GDCE slightly later so that we don't have to repeat it
twice when preparing for Thin. Thanks to Mehdi for the suggestion.

llvm-svn: 314999
2017-10-05 18:23:25 +00:00
Davide Italiano ff829cea8b [PassManager] Run global optimizations after the inliner.
The inliner performs some kind of dead code elimination as it goes,
but there are cases that are not really caught by it. We might
at some point consider teaching the inliner about them, but it
is OK for now to run GlobalOpt + GlobalDCE in tandem as their
benefits generally outweight the cost, making the whole pipeline
faster.

This fixes PR34652.

Differential Revision: https://reviews.llvm.org/D38154

llvm-svn: 314997
2017-10-05 18:06:37 +00:00
Matthew Simpson 49ee814996 [SparsePropagation] Move member definitions to header (NFC)
AbstractLatticeFunction and SparseSolver are class templates parameterized by a
lattice value, so we need to move these member functions over to the header.

Differential Revision: https://reviews.llvm.org/D38561

llvm-svn: 314996
2017-10-05 18:03:30 +00:00
Petar Jovanovic 65f10246bb [mips] implement .set dspr2 directive
Implement .set dspr2 directive with appropriate feature bits. This
directive is a counterpart of -mattr=dspr2 command line option with the
exception that it does not influence elf header flags.

Patch by Milos Stojanovic.

Differential Revision: https://reviews.llvm.org/D38537

llvm-svn: 314994
2017-10-05 17:40:32 +00:00
Matt Arsenault 2d3f8f333d AMDGPU: Set v2i32 any_extend to expand
llvm-svn: 314993
2017-10-05 17:38:30 +00:00
Krzysztof Parzyszek 9f3e88ae64 [RDF] Simplify construction of maximal registers
The old algoritm was not correct, although it worked most of the time.
Avoid the complex reachability analysis and simply calculate the maximal
registers out of the set of all referenced registers.

llvm-svn: 314991
2017-10-05 17:12:49 +00:00
Rong Xu 289da65698 [ProfileData] Fix data racing in merging indexed profiles
There is data racing to the static variable RecordIndex in index profile reader
when merging in multiple threads. Make it a member variable in
IndexedInstrProfReader to fix this.

Differential Revision: https://reviews.llvm.org/D38431

llvm-svn: 314990
2017-10-05 17:05:20 +00:00
Artur Pilipenko 7b15254c8f [X86] Fix chains update when lowering BUILD_VECTOR to a vector load
The code which lowers BUILD_VECTOR of consecutive loads into a single vector
load doesn't update chains properly. As a result the vector load can be
reordered with the store to the same location.

The current code in EltsFromConsecutiveLoads only updates the chain following
the first load. The fix is to update the chains following all the loads
comprising the vector.

This is a fix for PR10114.

Reviewed By: niravd

Differential Revision: https://reviews.llvm.org/D38547

llvm-svn: 314988
2017-10-05 16:28:21 +00:00
Konstantin Zhuravlyov aa0835a7ab AMDGPU: Add and set AMDGPU-specific e_flags
Differential Revision: https://reviews.llvm.org/D38556

llvm-svn: 314987
2017-10-05 16:19:18 +00:00
Ayal Zaks c9e0f886e5 [LV] Fix PR34743 - handle casts that sink after interleaved loads
When ignoring a load that participates in an interleaved group, make sure to
move a cast that needs to sink after it.

Testcase derived from reproducer of PR34743.

Differential Revision: https://reviews.llvm.org/D38338

llvm-svn: 314986
2017-10-05 15:45:14 +00:00
Clement Courbet 922e5bc698 Revert "Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion."""
broken test on windows

This reverts commit c91479518344fd1fc071c5bd5848f6eb83e53dca.

llvm-svn: 314985
2017-10-05 14:42:06 +00:00
Sanjay Patel f11b5b4f87 revert r314698 - [InstCombine] remove one-use restriction for icmp (shr exact X, C1), C2 --> icmp X, (C2<<C1)
There is a bot failure that appears to be related to this change:
http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost-neon/builds/2117

...so reverting to confirm that and attempting to keep the bot green while investigating.

llvm-svn: 314984
2017-10-05 14:26:15 +00:00
Javed Absar fc500041bb [TablgeGen] : Tidy up CodeGenSchedule. NFC.
Reviewed by: @MatzeB
Differential Revision: https://reviews.llvm.org/D38534

llvm-svn: 314982
2017-10-05 13:27:43 +00:00
Ayal Zaks fc3f7a4f0c [LV] Fix PR34711 - widen instruction ranges when sinking casts
Instead of trying to keep LastWidenRecipe updated after creating each recipe,
have tryToWiden() retrieve the last recipe of the current VPBasicBlock and check
if it's a VPWidenRecipe when attempting to extend its range. This ensures that
such extensions, optimized to maintain the original instruction order, do so
only when the instructions are to maintain their relative order. The latter does
not always hold, e.g., when a cast needs to sink to unravel first order
recurrence (r306884).

Testcase derived from reproducer of PR34711.

Differential Revision: https://reviews.llvm.org/D38339

llvm-svn: 314981
2017-10-05 12:41:49 +00:00
Clement Courbet 4cafbb9b5e Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion.""
llvm-svn: 314980
2017-10-05 12:39:57 +00:00
Simon Dardis 51a7ae2a29 [mips] Place certain 64 bit FPU instructions in their own decoder namespace
Previously, instructions that were defined to use the FGR64 register class
were associated with the Mips64 table which was incorrect.

Reviewers: nitesh.jain, atanasyan

Differential Revision: https://reviews.llvm.org/D38454

llvm-svn: 314976
2017-10-05 10:27:37 +00:00
Karl-Johan Karlsson 8d8d201c17 [DebugInfo] Insert DEBUG_VALUEs after each register redefinition
Summary:
When reinserting debug values after register allocation, make sure to
insert debug values after each redefinition of debug value register in
the slot index range. The reason for this is that DwarfDebug will end
the range of a debug variable when the physical reg is defined. For
instructions with e.g. tied operands this result in prematurely ended
debug range.

This resolves pr34545

Patch by Karl-Johan Karlsson and Bjorn Pettersson

Reviewers: rnk, aprantl

Reviewed By: rnk

Subscribers: bjope, llvm-commits

Differential Revision: https://reviews.llvm.org/D38229

llvm-svn: 314974
2017-10-05 08:37:31 +00:00
George Rimar b074fbcb48 [MC] - llvm-mc hangs on non-english characters.
Currently llvm-mc just hangs inside infinite loop
while trying to parse file which has ".section .с" inside,
where section name is non-english character.
Patch fixes the issue.

In this patch I also moved content of non-english-characters.s
to test/MC/AsmParser/Inputs folder  so that non-english-characters.s
becomes a single testcase for all invalid inputs containing non-english
symbols. That is convinent because llvm-mc otherwise tries
to parse and tokenize the whole testcase file with tools invocations and
it is harder to isolate the issue.

Differential revision: https://reviews.llvm.org/D38545

llvm-svn: 314973
2017-10-05 08:15:55 +00:00
Clement Courbet 6603fc0e7b Revert "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion."
Breaks
clang-stage1-cmake-RA-incremental/llvm/test/Transforms/MergeICmps/X86/tuple-four-int8.ll

This reverts commit 3038c459d67f8898ffa295d54a013b280690abfa.

llvm-svn: 314972
2017-10-05 08:03:39 +00:00
Craig Topper 17b0c78447 [InstCombine] Fix a vector splat handling bug in transformZExtICmp.
We were using an i1 type and then zero extending to a vector. Instead just create the 0/1 directly as a ConstantInt with the correct type. No need to ask ConstantExpr to zero extend for us.

This bug is a bit tricky to hit because it requires us to visit a zext of an icmp that would normally be simplified to true/false, but that icmp hasnt' been visited yet. In the test case this zext and icmp were created by visiting a udiv and due to worklist ordering we got to the zext first.

Fixes PR34841.

llvm-svn: 314971
2017-10-05 07:59:11 +00:00
Clement Courbet 902eef32eb [MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion.
Summary: This is to avoid e.g. merging two cheap icmps if the target is not going to expand to something nice later.

Reviewers: dberlin, spatel

Subscribers: davide, nemanjai

Differential Revision: https://reviews.llvm.org/D38232

llvm-svn: 314970
2017-10-05 07:49:09 +00:00
Mikael Holmen 0ec1d25d33 Minor refactoring regarding Cast::isNoopCast(), NFC
Summary:
FastISel::hasTrivialKill() was the only user of the "IntPtrTy" version of
Cast::isNoopCast(). According to review comments in D37894 we could instead
use the "DataLayout" version of the method, and thus get rid of the
"IntPtrTy" versions of isNoopCast() completely.

With the above done, the remaining isNoopCast() could then be simplified
a bit more.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D38497

llvm-svn: 314969
2017-10-05 07:07:09 +00:00
Dean Michael Berris 0a465d7a01 [XRay][tools] Support arg1 logging entries in the basic logging mode
Summary:
The arg1 logging handler changed in compiler-rt to start writing a
different type for entries encountered when logging the first argument
of XRay-instrumented functions. This change allows the trace loader to
support reading these record types as well as prepare for when the
basic (naive) mode implementation starts writing down the argument
payloads.

Without this change, binaries with arg1 logging support enabled start
writing unreadable logs for any of the XRay tracing tools.

Reviewers: pelikan

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38550

llvm-svn: 314967
2017-10-05 05:18:17 +00:00
Sean Fertile df8d998602 Enabling new pass manager in LTO (and thinLTO) link step.
Adds the option 'new-pass-manager' to the gold pluggin to enable using the
new pass manager during the lto/thinlto link step.

Patch by Graham Yiu.

 Differential Revision: https://reviews.llvm.org/D38517

llvm-svn: 314963
2017-10-05 01:48:42 +00:00
Xinliang David Li 04ab11a08a Revert r314928 to investigate thinLTO bootstrap failure
llvm-svn: 314961
2017-10-05 01:40:13 +00:00
Eugene Zelenko 60433b682f [X86] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 314953
2017-10-05 00:33:50 +00:00
Matt Arsenault f48e5c9ce5 AMDGPU: Add comment about clamps
llvm-svn: 314952
2017-10-05 00:13:20 +00:00
Matt Arsenault aafff87dda AMDGPU: Do not fold clamp instructions when sources are different
Patch by hakzsam (Samuel Pitoiset)

llvm-svn: 314951
2017-10-05 00:13:17 +00:00
Craig Topper 7a93092399 [InstCombine] Improve support for ashr in foldICmpAndShift
We can support ashr similar to lshr, if we know that none of the shifted in bits are used. In that case SimplifyDemandedBits would normally convert it to lshr. But that conversion doesn't happen if the shift has additional users.

Differential Revision: https://reviews.llvm.org/D38521

llvm-svn: 314945
2017-10-04 23:06:13 +00:00
Matt Arsenault 9ab1fa6803 AMDGPU: Fix not accounting for instruction size in bundles
These were counted as 0. Fixes branch limit exceeded errors
in some large programs.

llvm-svn: 314944
2017-10-04 22:59:12 +00:00
Konstantin Zhuravlyov 8684f7b4f9 AMDGPU: Correctly set EI_OSABI based on the os
Differential Revision: https://reviews.llvm.org/D38555

llvm-svn: 314943
2017-10-04 22:44:13 +00:00
Adrian Prantl b4a67907b7 clang-format file.
llvm-svn: 314942
2017-10-04 22:26:19 +00:00
Adrian Prantl 617a007b7c delete commented out code.
llvm-svn: 314941
2017-10-04 22:26:19 +00:00
Sanjoy Das 005b88c0a6 Do not call Loop::getName on possibly dead loops
This fixes PR34832.

llvm-svn: 314938
2017-10-04 22:02:27 +00:00
Xin Tong d8d97972de [MachineBlockPlacement] Make sure PreferredLoopExit is cleared everytime new loop is processed
Summary: Rotate on exit that actually exits the current loop.

Reviewers: davidxl, danielcdh, iteratee, chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38563

llvm-svn: 314937
2017-10-04 21:39:25 +00:00
Hans Wennborg 899809d531 Fix a -Wparentheses warning. NFC.
llvm-svn: 314936
2017-10-04 21:14:07 +00:00
Justin Lebar f84f7c7467 Convert an APInt to int64_t properly in TTI::getGEPCost().
Summary:
If the pointer width is 32 bits and the calculated GEP offset is
negative, we call APInt::getLimitedValue(), which does a
*zero*-extension of the offset.  That's wrong -- we should do an sext.

Fixes a bug introduced in rL314362 and found by Evgeny Astigeevich.

Reviewers: efriedma

Subscribers: sanjoy, javed.absar, llvm-commits, eastig

Differential Revision: https://reviews.llvm.org/D38557

llvm-svn: 314935
2017-10-04 20:47:33 +00:00
Marcello Maggioni df3e71e037 [LoopDeletion] Move deleteDeadLoop to to LoopUtils. NFC
llvm-svn: 314934
2017-10-04 20:42:46 +00:00
Rafael Espindola 8c0ff9508d Bring r314809 back.
But now include a check for CPU_COUNT so we still build on 10 year old
versions of glibc.

Original message:

Use sched_getaffinity instead of std:🧵:hardware_concurrency.

The issue with std:🧵:hardware_concurrency is that it forwards
to libc and some implementations (like glibc) don't take thread
affinity into consideration.

With this change a llvm program that can execute in only 2 cores will
use 2 threads, even if the machine has 32 cores.

This makes benchmarking a lot easier, but should also help if someone
doesn't want to use all cores for compilation for example.

llvm-svn: 314931
2017-10-04 20:27:01 +00:00
Sanjay Patel 4c33d5213b [SimplifyCFG] put the optional assumption cache pointer in the options struct; NFCI
This is a follow-up to https://reviews.llvm.org/D38138. 

I fixed the capitalization of some functions because we're changing those
lines anyway and that helped verify that we weren't accidentally dropping 
any options by using default param values.

llvm-svn: 314930
2017-10-04 20:26:25 +00:00
Xinliang David Li 7a73757358 Recommit r314561 after fixing msan build failure
(trial 2) Incoming val defined by terminator instruction which
also requires bitcasts can not be handled.

llvm-svn: 314928
2017-10-04 20:17:55 +00:00
Guozhi Wei eb301875b8 [TargetTransformInfo] Check if function pointer is valid before calling isLoweredToCall
Function isLoweredToCall can only accept non-null function pointer, but a function pointer can be null for indirect function call. So check it before calling isLoweredToCall from getInstructionLatency.

Differential Revision: https://reviews.llvm.org/D38204

llvm-svn: 314927
2017-10-04 20:14:08 +00:00
Jun Bum Lim d40e03c2d8 Recommit : Use the basic cost if a GEP is not used as addressing mode
Recommitting r314517 with the fix for handling ConstantExpr.

Original commit message:
  Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing
  mode in the target. However, since it doesn't check its actual users, it will
  return FREE even in cases where the GEP cannot be folded away as a part of
  actual addressing mode. For example, if an user of the GEP is a call
  instruction taking the GEP as a parameter, then the GEP may not be folded in
  isel.

llvm-svn: 314923
2017-10-04 18:33:52 +00:00
Daniel Neilson bef94bcbae Revert D38481 due to missing cmake check for CPU_COUNT
Summary:
This reverts D38481. The change breaks systems with older versions of glibc. It
injects a use of CPU_COUNT() from sched.h without checking to ensure that the
function exists first.

Reviewers:

Subscribers:

llvm-svn: 314922
2017-10-04 18:19:03 +00:00
Simon Pilgrim 9edbe110e8 [X86][AVX] Improve (i8 bitcast (v8i1 x)) handling for v8i64/v8f64 512-bit vector compare results.
AVX1/AVX2 targets were missing a chance to use vmovmskps for v8f32/v8i32 results for bool vector bitcasts

llvm-svn: 314921
2017-10-04 18:00:42 +00:00
Krzysztof Parzyszek 4697ddeea4 [Hexagon] Add a member Subtarget to HexagonInstrInfo, NFC
llvm-svn: 314920
2017-10-04 18:00:15 +00:00
Hans Wennborg 2a6c9adb2f Revert r314886 "[X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)"
It broke the Chromium / SQLite build; see PR34830.

> Summary:
>    1/  Operand folding during complex pattern matching for LEAs has been
>        extended, such that it promotes Scale to accommodate similar operand
>        appearing in the DAG.
>        e.g.
>          T1 = A + B
>          T2 = T1 + 10
>          T3 = T2 + A
>        For above DAG rooted at T3, X86AddressMode will no look like
>          Base = B , Index = A , Scale = 2 , Disp = 10
>
>    2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
>        so that if there is an opportunity then complex LEAs (having 3 operands)
>        could be factored out.
>        e.g.
>          leal 1(%rax,%rcx,1), %rdx
>          leal 1(%rax,%rcx,2), %rcx
>        will be factored as following
>          leal 1(%rax,%rcx,1), %rdx
>          leal (%rdx,%rcx)   , %edx
>
>    3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
>       thus avoiding creation of any complex LEAs within a loop.
>
> Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy
>
> Reviewed By: lsaba
>
> Subscribers: jmolloy, spatel, igorb, llvm-commits
>
>     Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 314919
2017-10-04 17:54:06 +00:00
Jake Ehrlich 084400bad9 [llvm-objcopy] Fix major layout bugs in llvm-objcopy
Somehow a few massive errors slipped though the cracks of testing.

1. The code in Segment::finalize was left over from the old layout
algorithm. In certain situations this would cause very strange issues
with segment layout. For instance in the shift-segments.test case it
would cause the second segment to have the same offset as the first.

2. In debugging this I discovered another issue. Namely section alignment
was not being computed based on Section->Align but instead
Section->Offset which is bizarre and makes no sense. I have no clue how
it worked in the first place. This issue is also fixed

3. Fixing #2 exposed a bug where things were not being written past the end
of the file that technically should have been. This was because in
certain cases (like overlapping-segments) the end of the file wouldn't
always be bumped if the offset could be chosen relative to an existing
segment that already had it's offset chosen. For fully nested segments
this is fine but for overlapping segments this leaves the end of the
file short. So I changed how the offset is bumped when looping though
segments.

Differential Revision: https://reviews.llvm.org/D38436

llvm-svn: 314918
2017-10-04 17:44:42 +00:00
Jakub Kuderski 68fc036e1d [Dominators] Take fast path when applying <=1 updates
Summary:
This patch teaches `DT.applyUpdates` to take the fast when applying zero or just one update and makes it not run the internal batch updater machinery.

With this patch, it should no longer make sense to have a special check in user's code that checks the update sequence size before applying them, e.g.
```
if (!MyUpdates.empty())
  DT.applyUpdates(MyUpdates);
```
or
```
if (MyUpdates.size() == 1)
  if (...)
    DT.insertEdge(...)
  else
    DT.deleteEdge(...)
```

Reviewers: dberlin, brzycki, davide, grosser, sanjoy

Reviewed By: dberlin, davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38541

llvm-svn: 314917
2017-10-04 17:32:55 +00:00
Simon Pilgrim b47b3f2564 [X86][SSE] Add support for lowering v8i16 binary shuffles to PACKSS/PACKUS
Missed in D38472

llvm-svn: 314916
2017-10-04 17:31:28 +00:00
Francis Ricci 954b94f5df [test] Fix append_path in the empty case
Summary:
normpath() was being called on an empty string and appended to
the environment variable in the case where the environment variable
was unset. This led to ":." being appended to the path, since
normpath() of an empty string is '.', presumably to represent cwd.

Reviewers: zturner, sqlbyme, modocache

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38542

llvm-svn: 314915
2017-10-04 17:30:28 +00:00
Craig Topper 6fb55716e9 [X86] Redefine MOVSS/MOVSD instructions to take VR128 regclass as input instead of FR32/FR64
This patch redefines the MOVSS/MOVSD instructions to take VR128 as its second input. This allows the MOVSS/SD->BLEND commute to work without requiring a COPY to be inserted.

This should fix PR33079

Overall this looks to be an improvement in the generated code. I haven't checked the EXPENSIVE_CHECKS build but I'll do that and update with results.

Differential Revision: https://reviews.llvm.org/D38449

llvm-svn: 314914
2017-10-04 17:20:12 +00:00
Balaram Makam 45ff83ce1e "[ARM] Mark flaky test MachineBranchProb.ll unsupported again for ARM/AArch64"
r314857 changed the CFG that resulted in the flaky test MachineBranchProb.ll to
fail the bots again. Marking it as unsupported for ARM/AArch64 again until we
find the cause.

llvm-svn: 314912
2017-10-04 16:45:24 +00:00
Yonghong Song 09b01b3555 bpf: fix an insn encoding issue for neg insn
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 314911
2017-10-04 16:11:52 +00:00
Adam Nemet 6c381b7a2e [OptRemark] Move YAML writing to IR
Before the patch this was in Analysis.  Moving it to IR and making it implicit
part of LLVMContext::diagnose allows the full opt-remark facility to be used
outside passes e.g. the pass manager.  Jessica is planning to use this to
report function size after each pass.  The same could be used for time
reports.

Tested with BUILD_SHARED_LIBS=On.

llvm-svn: 314909
2017-10-04 15:18:11 +00:00
Adam Nemet f1bea0a6ab Also update MachineORE after r314874.
llvm-svn: 314908
2017-10-04 15:18:07 +00:00
Sanjay Patel 9366b9c53b [InstCombine] add 'exact' variants of all tests; NFC
We can likely remove most of these as redundant in the near future, 
but I'm trying to make sure I don't introduce any regressions with D38514.

llvm-svn: 314907
2017-10-04 15:17:25 +00:00
Clement Courbet 98eaa88357 [NFC] clang-format lib/Transforms/Scalar/MergeICmps.cpp
llvm-svn: 314906
2017-10-04 15:13:52 +00:00
Simon Pilgrim 46a366ccb7 [X86][SSE] Early out from ComputeNumSignBitsForTargetNode. NFCI.
Early out from vector shift by immediates that will exceed eltsize - don't bother making an unnecessary ComputeNumSignBits recursive call.

llvm-svn: 314903
2017-10-04 13:41:26 +00:00
Simon Pilgrim bd5d2f0284 [X86][SSE] Add support for lowering unary shuffles to PACKSS/PACKUS
Extension to D38472

llvm-svn: 314901
2017-10-04 13:12:08 +00:00
George Rimar 3040da03e0 [gold-plugin] - Fix compilation after LLVM update (r314883). NFC.
llvm-svn: 314899
2017-10-04 11:00:30 +00:00
Dylan McKay 8dd702c1cd [AVR] Implement LPMWRdZ pseudo-instruction's expansion.
FIXME: implementation is mostly copy-pasted from LDWRdPtr, so we should
refactor a bit and unify the two

Patch by Gerdo Erdi.

llvm-svn: 314898
2017-10-04 10:37:22 +00:00
Dylan McKay 3f71f1c91e [AVR] Factor out mayLoad in tablegen patterns
Patch by Gergo Erdi.

llvm-svn: 314897
2017-10-04 10:36:07 +00:00
Dylan McKay d00f9c1ef1 [AVR] Elaborate LDWRdPtr into `ld r, X++; ld r+1, X`
Patch by Gergo Erdi.

llvm-svn: 314896
2017-10-04 10:33:36 +00:00
Dylan McKay 39069208d5 [AVR] Insert JMP for long branches
Previously, on long branches (relative jumps of >4 kB), an assertion
failure was hit, as AVRInstrInfo::insertIndirectBranch was not
implemented. Despite its name, it is called by the branch relaxator
for *all* unconditional jumps.

Patch by Thomas Backman.

llvm-svn: 314891
2017-10-04 09:51:28 +00:00
Dylan McKay c4b002bf5a [AVR] Fix displacement overflow for LDDW/STDW
In some cases, the code generator attempts to generate instructions such as:

lddw r24, Y+63

which expands to:

ldd r24, Y+63
ldd r25, Y+64 # Oops! This is actually ld r25, Y in the binary

This commit limits the first offset to 62, and thus the second to 63.
It also updates some asserts in AVRExpandPseudoInsts.cpp, including for
INW and OUTW, which appear to be unused.

Patch by Thomas Backman.

llvm-svn: 314890
2017-10-04 09:51:21 +00:00
Oliver Stannard 878216dd05 [ARM] Add diag string for movw/movt immediates in assembly
This adds diagnostics for invalid immediate operands to the MOVW and MOVT
instructions (ARM and Thumb).

Differential revision: https://reviews.llvm.org/D31879

llvm-svn: 314888
2017-10-04 09:24:54 +00:00
Oliver Stannard 5a7aae3a80 [ARM, Asm] Change grammar of immediate operand diagnostics
Currently, our diagnostics for assembly operands are not consistent.
Some start with (for example) "immediate operand must be ...",
and some with "operand must be an immediate ...". I think the latter
form is preferable for a few reasons:
* It's unambiguous that it is referring to the expected type of operand, not
  the type the user provided. For example, the user could provide an register
  operand, and get a message taking about an operand is if it is already an
  immediate, just not in the accepted range.
* It allows us to have a consistent style once we add diagnostics for operands
  that could take two forms, for example a label or pc-relative memory operand.

Differential revision: https://reviews.llvm.org/D36689

llvm-svn: 314887
2017-10-04 09:18:07 +00:00
Jatin Bhateja 3c29bacd43 [X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)
Summary:
   1/  Operand folding during complex pattern matching for LEAs has been
       extended, such that it promotes Scale to accommodate similar operand
       appearing in the DAG.
       e.g.
         T1 = A + B
         T2 = T1 + 10
         T3 = T2 + A
       For above DAG rooted at T3, X86AddressMode will no look like
         Base = B , Index = A , Scale = 2 , Disp = 10

   2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
       so that if there is an opportunity then complex LEAs (having 3 operands)
       could be factored out.
       e.g.
         leal 1(%rax,%rcx,1), %rdx
         leal 1(%rax,%rcx,2), %rcx
       will be factored as following
         leal 1(%rax,%rcx,1), %rdx
         leal (%rdx,%rcx)   , %edx

   3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
      thus avoiding creation of any complex LEAs within a loop.

Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy

Reviewed By: lsaba

Subscribers: jmolloy, spatel, igorb, llvm-commits

    Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 314886
2017-10-04 09:02:10 +00:00
Sean Eveson ea9dceed77 [llvm-cov] Fix showing title when filtering and not outputting to a directory
Differential Revision: https://reviews.llvm.org/D38507

llvm-svn: 314885
2017-10-04 08:54:37 +00:00
George Rimar 099960d322 [MC] - Don't assert when non-english characters are used.
I found that llvm-mc does not like non-english characters even in comments,
which it tries to tokenize.

Problem happens because of functions like isdigit(), isalnum() which takes
int argument and expects it is not negative.
But at the same time MCParser uses char* to store input buffer poiner, char has signed value,
so it is possible to pass negative value to one of functions from above and
that triggers an assert. 
Testcase for demonstration is provided.

To fix the issue helper functions were introduced in StringExtras.h

Differential revision: https://reviews.llvm.org/D38461

llvm-svn: 314883
2017-10-04 08:50:08 +00:00
Mikael Holmen a1a3f5c5e6 Recommit [UnreachableBlockElim] Use COPY if PHI input is undef
This time invoking llc with "-march=x86-64" in the testcase, so we don't assume
the default target is x86.

Summary:
If we have

    %vreg0<def> = PHI %vreg2<undef>, <BB#0>, %vreg3, <BB#2>; GR32:%vreg0,%vreg2,%vreg3
    %vreg3<def,tied1> = ADD32ri8 %vreg0<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg3,%vreg0

then we can't just change %vreg0 into %vreg3, since %vreg2 is actually
undef. We would have to also copy the undef flag to be able to change the
register.

Instead we deal with this case like other cases where we can't just
replace the register: we insert a COPY. The code creating the COPY already
copied all flags from the PHI input, so the undef flag will be transferred
as it should.

Reviewers: kparzysz

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38235

llvm-svn: 314882
2017-10-04 07:42:45 +00:00
Max Kazantsev 8aacef6cae [IRCE] Temporarily disable unsigned latch conditions by default
We have found some corner cases connected to range intersection where IRCE makes
a bad thing when the latch condition is unsigned. The fix for that will go as a follow up.
This patch temporarily disables IRCE for unsigned latch conditions until the issue is fixed.

The unsigned latch conditions were introduced to IRCE by rL310027.

Differential Revision: https://reviews.llvm.org/D38529

llvm-svn: 314881
2017-10-04 06:53:22 +00:00
Mikael Holmen 75b1992f78 Revert r314879 "[UnreachableBlockElim] Use COPY if PHI input is undef"
Build-bots broke on the new testcase. I'll investigate and fix.

llvm-svn: 314880
2017-10-04 06:39:22 +00:00
Mikael Holmen 65eb2f394c [UnreachableBlockElim] Use COPY if PHI input is undef
Summary:
If we have

    %vreg0<def> = PHI %vreg2<undef>, <BB#0>, %vreg3, <BB#2>; GR32:%vreg0,%vreg2,%vreg3
    %vreg3<def,tied1> = ADD32ri8 %vreg0<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg3,%vreg0

then we can't just change %vreg0 into %vreg3, since %vreg2 is actually
undef. We would have to also copy the undef flag to be able to change the
register.

Instead we deal with this case like other cases where we can't just
replace the register: we insert a COPY. The code creating the COPY already
copied all flags from the PHI input, so the undef flag will be transferred
as it should.

Reviewers: kparzysz

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38235

llvm-svn: 314879
2017-10-04 06:06:31 +00:00
Martin Storsjo e14145dcb0 [X86] Fix using the SJLJ jump table on x86_64
The previous version didn't work if the jump table base address didn't
fit in 32 bit, since it was encoded as an immediate offset. And in case
the jump table is encoded as 32 bit label differences, we need to
load and add them to the table base first.

This solves the first half of the issues mentioned in PR34720.

Also fix some of the errors pointed out by -verify-machineinstrs, by
using GR32_NOSPRegClass.

Differential Revision: https://reviews.llvm.org/D38333

llvm-svn: 314876
2017-10-04 05:12:10 +00:00
Adam Nemet f31b1f310c Move verbosity check for remarks to the diag handler
Test needs some slight adjustment because we no longer check the existence of
BFI but rather that the actual hotness is set on the remark.  If entry_count
is not set getBlockProfileCount returns None.

llvm-svn: 314874
2017-10-04 04:26:23 +00:00
Tim Shen 83fd6a1243 [FuzzerUtil] Partially revert D38481 on FuzzerUtil
This is because lib/Fuzzer doesn't really depend on llvm infrastucture.
It's not easy to access the llvm hardware_concurrency here.

Differential Reivision: https://reviews.llvm.org/D38481

llvm-svn: 314870
2017-10-04 01:05:34 +00:00
Adrian Prantl f01b3f3f8b Add a manpage for llvm-dwarfdump.
llvm-svn: 314863
2017-10-03 23:46:57 +00:00
Rui Ueyama 15b8327963 Simplify multikey_qsort function.
This function implements the three-way radix quicksort algorithm.
This patch simplifies the implementation by using MutableArrayRef.

llvm-svn: 314858
2017-10-03 23:12:01 +00:00
Balaram Makam e0c43152b5 [AArch64] Use LateSimplifyCFG after expanding atomic operations.
Summary:
After r308422 we defer optimizations that can destroy loop canonical forms to
LateSimplifyCFG. Running LateSimplifyCFG after expanding atomic operations
can exploit more control-flow opportunities.

Reviewers: mcrosier, t.p.northover, efriedma

Reviewed By: efriedma

Subscribers: aemerson, rengolin, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D38262

llvm-svn: 314857
2017-10-03 22:39:24 +00:00
Adrian Prantl e5b93f2e34 llvm-dwarfdump: implement the --regex option in combination with --name.
llvm-svn: 314855
2017-10-03 22:08:22 +00:00
Konstantin Zhuravlyov 22bc039c89 AMDGPU: Expand setcc for v2f32 and v4f32
llvm-svn: 314853
2017-10-03 21:45:01 +00:00
Konstantin Zhuravlyov 908fa90b51 AMDGPU: Expand setcc for v2i32 and v4i32
llvm-svn: 314852
2017-10-03 21:31:24 +00:00
Konstantin Zhuravlyov 3696352d85 AMDGPU/Docs: Follow up on review feedback in https://reviews.llvm.org/D38387
llvm-svn: 314848
2017-10-03 21:18:03 +00:00
Jakub Kuderski 4dc477dc93 [Dominators] Make eraseNode invalidate DFS numbers
This patch makes DT::eraseNode mark DFSInfo as invalid.
Not marking it as invalid leads to DFS numbers getting corrupted
and failing VerifyDFSNumbers check.

This patch also makes children iterator const (NFC).

llvm-svn: 314847
2017-10-03 21:17:48 +00:00
Konstantin Zhuravlyov 0aa94d314c AMDGPU: Add ELFOSABI_AMDGPU_MESA3D
Differential Revision: https://reviews.llvm.org/D38387

llvm-svn: 314846
2017-10-03 21:14:14 +00:00
Reid Kleckner 33cbbbc62f [X86] Remove dead declaration convertArgMovsToPushes, NFC
This was dead when it landed in r252578. We have this functionality, if
not for stack probe calls, but for regular calls in
X86CallFrameOptimization.cpp.

llvm-svn: 314845
2017-10-03 21:12:18 +00:00
Rafael Espindola 476a7f9293 Pre-compute the tail of the archive
An archive looks like

<header>
<symbol table>
<tail>

The symbol table refers to offsets in the tail. A complication is that
we would like to support symbol tables that use 64 bit offsets if it
turns out that any of the offsets is too big.

This patch changes the archive writer to first compute the tail. We
cannot just compute one big StringRef since that would require reading
every member upfront, but we can represent it as a series of
StringRefs.

Having done that it is much easier to compute the symbol table and all
offsets are computed before it is written. With this if there is an
accounting problem it will show up with a regular symbol table, not
just when a 64 bit one is needed.

llvm-svn: 314844
2017-10-03 20:59:43 +00:00
Konstantin Zhuravlyov a952b44ed5 AMDGPU: Add ELFOSABI_AMDGPU_PAL
llvm-svn: 314843
2017-10-03 20:54:07 +00:00
Reid Kleckner bc66947433 Refactor DIBuilder dbg intrinsic insertion, NFC
Both dbg.declare and dbg.value insertion had duplicate code for the two
overloads with different insertion point conventions.

llvm-svn: 314839
2017-10-03 20:36:40 +00:00
Sanjay Patel 389b7cedc3 [InstCombine] add tests for icmp gt/lt (shr X, C1), C2; NFC
Surprisingly, we have zero coverage for these patterns.

Many of these are handled in InstSimplify, but it's not obvious
what the rule for folding each case should be, so I've just
stamped out everything.

It should be possible to fold every case, but currently, we
miss these:

int ashr_slt(int x) {
  return (x >> 1) < 1; 
}

int ashr_sgt(int x) {
  return (x >> 1) > 0; 
}

https://godbolt.org/g/aB2hLE

llvm-svn: 314837
2017-10-03 20:34:20 +00:00
Jessica Paquette acc15e1265 [MachineOutliner] Fix off-by-one in cost model
This commit does two things. Firstly, it cleans up some of the benefit
calculation wrt outlined functions and candidates. Secondly, it fixes an
off-by-one bug in the cost model which was caused by the benefit value of
an OutlinedFunction and Candidate differing by 1. It updates the remarks test
to reflect this change.

llvm-svn: 314836
2017-10-03 20:32:55 +00:00
Stefan Pintilie e1d7547237 [PowerPC] Revert P9 scheduling model to incomplete
Partially revert a previous change from commit: https://llvm.org/svn/llvm-project/llvm/trunk@314026
The previous change caused regressions on Power 9.

llvm-svn: 314835
2017-10-03 20:27:30 +00:00
Craig Topper df63b96811 [InstCombine] Use isSignBitCheck to simplify an if statement. Directly create new sign bit compares instead of manipulating the constant. NFCI
Since we no longer had the direct constant compares, manipulating the constant seemeded less clear.

llvm-svn: 314830
2017-10-03 19:14:23 +00:00
Tim Renouf 72800f0436 [AMDGPU] implemented pal metadata
Summary:
For the amdpal OS type:

We write an AMDGPU_PAL_METADATA record in the .note section in the ELF
(or as an assembler directive). It contains key=value pairs of 32 bit
ints. It is a merge of metadata from codegen of the shaders, and
metadata provided by the frontend as _amdgpu_pal_metadata IR metadata.
Where both sources have a key=value with the same key, the two values
are ORed together.

This .note record is part of the amdpal ABI and will be documented in
docs/AMDGPUUsage.rst in a future commit.

Eventually the amdpal OS type will stop generating the .AMDGPU.config
section once the frontend has safely moved over to using the .note
records above instead of .AMDGPU.config.

Reviewers: arsenm, nhaehnle, dstuttard

Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D37753

llvm-svn: 314829
2017-10-03 19:03:52 +00:00
Alexander Timofeev 4651396584 [AMDGPU] Avoid predicated execution of the basic blocks containing scalar
instructions.

Differential revision: https://reviews.llvm.org/D38293

llvm-svn: 314828
2017-10-03 18:55:36 +00:00
Hans Wennborg dc8d6f2527 Fix -Wcovered-switch-default warnings from r314821
llvm-svn: 314826
2017-10-03 18:44:12 +00:00
Hans Wennborg ab2177edf7 Revert r314817 "[dwarfdump] Add -lookup option"
The test fails on Linux; see follow-up email on the llvm-commits list.

> Add the option to lookup an address in the debug information and print
> out the file, function, block and line table details.
>
> Differential revision: https://reviews.llvm.org/D38409

This also reverts the follow-up r314818:

> [test] Fix llvm-dwarfdump/cmdline.test
>
> Fixes test/tools/llvm-dwarfdump/cmdline.test

llvm-svn: 314825
2017-10-03 18:39:13 +00:00
Hans Wennborg 9a9048e19f Revert r314806 "[SLP] Vectorize jumbled memory loads."
All the buildbots are red, e.g.
http://lab.llvm.org:8011/builders/clang-cmake-aarch64-lld/builds/2436/

> Summary:
> This patch tries to vectorize loads of consecutive memory accesses, accessed
> in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
> which was reverted back due to some basic issue with representing the 'use mask' of
> jumbled accesses.
>
> This patch fixes the mask representation by recording the 'use mask' in the usertree entry.
>
> Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df
>
> Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh
>
> Reviewed By: Ayal
>
> Subscribers: hans, mzolotukhin
>
> Differential Revision: https://reviews.llvm.org/D36130

llvm-svn: 314824
2017-10-03 18:32:29 +00:00
Reid Kleckner 494cfd48ef Fix expectations in MC wasm init-fini-array test
llvm-svn: 314823
2017-10-03 18:30:38 +00:00
Reid Kleckner b4569de739 Implement David Blaikie's suggestion for comparison operators
llvm-svn: 314822
2017-10-03 18:30:11 +00:00
Hans Wennborg 660531085a CodeView: Provide a .def file with the register ids
The list of register ids was previously written out in a couple of dirrent
places. This puts it in a .def file and also adds a few more registers (e.g.
the x87 regs) which should lead to more readable dumps, but I didn't include
the whole list since that seems unnecessary.

X86_MC::initLLVMToSEHAndCVRegMapping is pretty ugly, but at least it's not
relying on magic constants anymore. The TODO of using tablegen still stands.

Differential revision: https://reviews.llvm.org/D38480

llvm-svn: 314821
2017-10-03 18:27:22 +00:00
Reid Kleckner 04e25e00b7 [DebugInfo] Correctly coalesce DBG_VALUEs that mix direct and indirect values
Summary:
This should fix a regression introduced by r313786, which switched from
MachineInstr::isIndirectDebugValue() to checking if operand 1 is an
immediate. I didn't have a test case for it until now.

A single UserValue, which approximates a user variable, may have many
DBG_VALUE instructions that disagree about whether the variable is in
memory or in a virtual register. This will become much more common once
we have llvm.dbg.addr, but you can construct such a test case manually
today with llvm.dbg.value.

Before this change, we would get two UserValues: one for direct and one
for indirect DBG_VALUE instructions describing the same variable. If we
build separate interval maps for direct and indirect locations, we will
end up accidentally coalescing identical DBG_VALUE intervals that need
to remain separate because they are broken up by intervals of the
opposite direct-ness.

Reviewers: aprantl

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37932

llvm-svn: 314819
2017-10-03 17:59:02 +00:00
Jonas Devlieghere 1821a7b3cb [test] Fix llvm-dwarfdump/cmdline.test
Fixes test/tools/llvm-dwarfdump/cmdline.test

llvm-svn: 314818
2017-10-03 17:28:37 +00:00
Jonas Devlieghere f998c501b6 [dwarfdump] Add -lookup option
Add the option to lookup an address in the debug information and print
out the file, function, block and line table details.

Differential revision: https://reviews.llvm.org/D38409

llvm-svn: 314817
2017-10-03 17:10:21 +00:00
Simon Pilgrim 261a6c29ea [X86] Add non-SSE tests for PR15215 as well
llvm-svn: 314815
2017-10-03 17:04:36 +00:00
Geoff Berry fabedbad11 Revert "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding""
This reverts commit r314729.

Another bug has been encountered in an out-of-tree target reported by Quentin.

llvm-svn: 314814
2017-10-03 16:59:13 +00:00
Simon Pilgrim 46a804cfd9 [X86][SSE] Add bool vector extraction test cases from PR15215
llvm-svn: 314813
2017-10-03 16:56:57 +00:00
Rafael Espindola 6e182fbab4 Use sched_getaffinity instead of std:🧵:hardware_concurrency.
The issue with std:🧵:hardware_concurrency is that it forwards
to libc and some implementations (like glibc) don't take thread
affinity into consideration.

With this change a llvm program that can execute in only 2 cores will
use 2 threads, even if the machine has 32 cores.

This makes benchmarking a lot easier, but should also help if someone
doesn't want to use all cores for compilation for example.

llvm-svn: 314809
2017-10-03 16:25:15 +00:00
Dehao Chen ea523ddb1b Revert the change that accidentally went in r314806.
llvm-svn: 314807
2017-10-03 15:50:42 +00:00
Mohammad Shahid 1d5422f27f [SLP] Vectorize jumbled memory loads.
Summary:
This patch tries to vectorize loads of consecutive memory accesses, accessed
in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
which was reverted back due to some basic issue with representing the 'use mask' of
jumbled accesses.

This patch fixes the mask representation by recording the 'use mask' in the usertree entry.

Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df

Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh

Reviewed By: Ayal

Subscribers: hans, mzolotukhin

Differential Revision: https://reviews.llvm.org/D36130

llvm-svn: 314806
2017-10-03 15:28:48 +00:00
Jakub Kuderski 5f081b1c0b [Dominators] Don't use default parameter in lambda
... to make GCC buildbots happy.

llvm-svn: 314805
2017-10-03 14:51:31 +00:00
Oliver Stannard 0d5c792223 [ARM] Use table-gen'd assembly operand diags in ARM asm parser
This switches the ARM AsmParser to use assembly operand diagnostics from
tablegen, rather than a switch statement on the ARMMatchResultTy. It
moves the existing diagnostic strings to tablegen, but adds no new ones,
so this is NFC except for one diagnostic string that had an off-by-1 error
in the hand-written switch statement.

Differential revision: https://reviews.llvm.org/D31607

llvm-svn: 314804
2017-10-03 14:38:52 +00:00
Oliver Stannard 41dfac3eb2 [AsmParser] Add DiagnosticString to AsmOperands in tablegen
This adds a DiagnosticString member to the AsmOperand tablegen class, so
that the diagnostic text to be used when an assembly operand is
incorrect can be stored in the tablegen description of the operand,
rather than in a separate switch statement in the AsmParser.

If DiagnosticString is used for any operands, tablegen will emit a
getMatchKindDiag function, to map from diagnostic enums to strings.

Differential revision: https://reviews.llvm.org/D31606

llvm-svn: 314803
2017-10-03 14:34:57 +00:00
Jakub Kuderski ce2e36e63e [Dominators] Add DFS number verification
Summary:
This patch teaches the DominatorTree verifier to check DFS In/Out numbers which are used to answer dominance queries.
DFS number verification is done in O(nlogn), so it shouldn't add much overhead on top of the O(n^3) sibling property verification.
This check should detect errors like the one spotted in PR34466 and related bug reports.

The patch also cleans up the DFS calculation a bit, as all constructed trees should have a single root now.

I see 2 new test failures when running check-all after this change:

```
Failing Tests (2):
    Polly :: Isl/CodeGen/OpenMP/reference-argument-from-non-affine-region.ll
    Polly :: Isl/CodeGen/OpenMP/two-parallel-loops-reference-outer-indvar.ll

```
which seem to happen just after `Create LLVM-IR from SCoPs` -- I XFAILed them in r314800.

Reviewers: dberlin, grosser, davide, zhendongsu, bollu

Reviewed By: dberlin

Subscribers: nandini12396, bollu, Meinersbur, brzycki, llvm-commits

Differential Revision: https://reviews.llvm.org/D38331

llvm-svn: 314801
2017-10-03 14:33:41 +00:00
Oliver Stannard 55114fd9f0 [ARM, Asm] Use correct source location for register tokens
tryParseRegister advances the lexer, so we need to take copies of the start and
end locations of the register operand before calling it.

Previously, the caret in the diagnostic pointer to the comma after the r0
operand in the test, rather than the start of the operand.

Differential revision: https://reviews.llvm.org/D31537

llvm-svn: 314799
2017-10-03 14:30:58 +00:00
Simon Dardis 055192ccd3 [mips] Enable spilling and reloading of the dsp register set.
The dsp register class is an alias of the gpr register class, so
we have to define instructions for spilling and reloading.

Reviewers: atanasyan

Differential Revision: https://reviews.llvm.org/D38038

llvm-svn: 314798
2017-10-03 13:45:49 +00:00
John Brawn 736bf0020f [CGP] Make optimizeMemoryInst capable of handling multiple AddrModes
Currently optimizeMemoryInst requires that all of the AddrModes it sees are
identical. This patch makes it capable of tracking multiple AddrModes, so long
as they differ in at most one field.

This patch does nothing by itself, but later patches will make use of it to
insert or reuse phi or select instructions for the differing fields.

Differential Revision: https://reviews.llvm.org/D38278

llvm-svn: 314795
2017-10-03 13:08:22 +00:00
John Brawn eb83c7554e [CGP] In optimizeMemoryInst handle select similarly to phi
This lets us optimize away selects that perform the same address computation in
two different ways and is also the first step towards being able to handle
selects between two different, but compatible, address computations.

Differential Revision: https://reviews.llvm.org/D38242

llvm-svn: 314794
2017-10-03 13:04:15 +00:00
Oliver Stannard 68aa7de517 [ARM, Asm] Fix ubsan failure caused by out-of-range enum value
In this code, we use ~0U as a sentinel value for any operand class that doesn't
have a user-friendly error message, but this value isn't in range of the
MatchClassKind enum, so we need to ensure it does not get passed to isSubclass.

llvm-svn: 314793
2017-10-03 12:45:18 +00:00
Simon Pilgrim cf99d069c3 [X86][SSE] Add support for decoding PACKSS/PACKUS shuffles masks with UNDEF
llvm-svn: 314792
2017-10-03 12:41:39 +00:00
Oliver Stannard 5daee987fd [ARM, Asm] Remove dead code causing MSan failure.
r314779 caused ErrorInfo to be red uninitialised, but also made this code dead,
so it can just be removed.

llvm-svn: 314791
2017-10-03 12:28:28 +00:00
Simon Pilgrim f5f291d129 [X86][SSE] Add support for lowering shuffles to PACKSS/PACKUS
If the upper bits of a truncation shuffle patterns have at least the minimum number of sign/zero bits on their inputs then we can safely use PACKSS/PACKUS as shuffles.

Partial fix for https://bugs.llvm.org/show_bug.cgi?id=34773

Differential Revision: https://reviews.llvm.org/D38472

llvm-svn: 314788
2017-10-03 12:01:31 +00:00
Evgeny Astigeevich d3558b5d5e [InlineCost, NFC] Extract code dealing with inbounds GEPs from visitGetElementPtr into a function
The code responsible for analysis of inbounds GEPs is extracted into a separate
function: CallAnalyzer::canFoldInboundsGEP. With the patch SROA
enabling/disabling code is localized at one place instead of spreading across
the code of CallAnalyzer::visitGetElementPtr.

Differential Revision: https://reviews.llvm.org/D38233

llvm-svn: 314787
2017-10-03 12:00:40 +00:00
Sam Clegg b2b019f727 [WebAssembly] MC: Support for init_array and fini_array
Differential Revision: https://reviews.llvm.org/D37757

llvm-svn: 314783
2017-10-03 11:20:28 +00:00
Sean Eveson d932b2d763 [llvm-cov] Hide files with no coverage from the index when filtering by name
Differential Revision: https://reviews.llvm.org/D38457

llvm-svn: 314782
2017-10-03 11:05:28 +00:00
Bjorn Pettersson 90cc1b53d0 [DebugInfo] Handle endianness when moving debug info for split integer values (reapplied)
Summary:
Take the target's endianness into account when splitting the
debug information in DAGTypeLegalizer::SetExpandedInteger.

This patch fixes so that, for big-endian targets, the fragment
expression corresponding to the high part of a split integer
value is placed at offset 0, in order to correctly represent
the memory address order.

I have attached a PPC32 reproducer where the resulting DWARF
pieces for a 64-bit integer were incorrectly reversed.

Original patch was reverted due to using -stop-after=isel in
the test case (but that is only working when AMDGPU target
is included in the llc build). The test case has now been
updated to use -stop-before=expand-isel-pseudos instead.

Patch by: dstenb

Reviewers: JDevlieghere, aprantl, dblaikie

Reviewed By: JDevlieghere, aprantl, dblaikie

Subscribers: nemanjai

Differential Revision: https://reviews.llvm.org/D38172

llvm-svn: 314781
2017-10-03 11:03:02 +00:00
Oliver Stannard e093bad472 [ARM] Use new assembler diags for ARM
This converts the ARM AsmParser to use the new assembly matcher error
reporting mechanism, which allows errors to be reported for multiple
instruction encodings when it is ambiguous which one the user intended
to use.

By itself this doesn't improve many error messages, because we don't have
diagnostic text for most operand types, but as we add that then this will allow
more of those diagnostic strings to be used when they are relevant.

Differential revision: https://reviews.llvm.org/D31530

llvm-svn: 314779
2017-10-03 10:26:11 +00:00
Simon Pilgrim d87af9a1c0 Remove unused variable. NFCI.
llvm-svn: 314778
2017-10-03 10:01:02 +00:00
Simon Pilgrim 640fbf5132 [X86][SSE] Add support for shuffle combining from PACKSS/PACKUS
Mentioned in D38472

llvm-svn: 314777
2017-10-03 09:54:03 +00:00
Simon Pilgrim 19d535e75b [X86][SSE] Add support for PACKSS/PACKUS constant folding
Pulled out of D38472

llvm-svn: 314776
2017-10-03 09:41:00 +00:00
Javed Absar e485b143ea [MiSched] - Simplify ProcResEntry access
Reviewed by: @MatzeB
Differential Revision: https://reviews.llvm.org/D38447

llvm-svn: 314775
2017-10-03 09:35:04 +00:00
Oliver Stannard 65f7bc5bf0 [Assembler] Report multiple near misses for invalid instructions
The current table-generated assembly instruction matcher returns a
64-bit error code when matching fails. Since multiple instruction
encodings with the same mnemonic can fail for different reasons, it uses
some heuristics to decide which message is important.

This heuristic does not work well for targets that have many encodings
with the same mnemonic but different operands, or which have different
versions of instructions controlled by subtarget features, as it is hard
to know which encoding the user was intending to use.

Instead of trying to improve the heuristic in the table-generated
matcher, this patch changes it to report a list of near-miss encodings.
This list contains an entry for each encoding with the correct mnemonic,
but with exactly one thing preventing it from being valid. This thing
could be a single invalid operand, a missing target feature or a failed
target-specific validation function.

The target-specific assembly parser can then report an error message
giving multiple options for instruction variants that the user may have
been trying to use. For example, I am working on a patch to use this for
ARM, which can give this error for an invalid instruction for ARMv6-M:

  <stdin>:8:3: error: invalid instruction, multiple near-miss encodings found
    adds r0, r1, #0x8
    ^
  <stdin>:8:3: note: for one encoding: instruction requires: thumb2
    adds r0, r1, #0x8
    ^
  <stdin>:8:16: note: for one encoding: expected an integer in range [0, 7]
    adds r0, r1, #0x8
                 ^
  <stdin>:8:16: note: for one encoding: expected a register in range [r0, r7]
    adds r0, r1, #0x8
                 ^

This also allows the target-specific assembly parser to apply its own
heuristics to suppress some errors. For example, the error "instruction
requires: arm-mode" is never going to be useful when targeting an
M-profile architecture (which does not have ARM mode).

This patch just adds the target-independent mechanism for doing this,
all targets still use the old mechanism. I've added a bit in the
AsmParser tablegen class to allow targets to switch to this new
mechanism. To use this, the target-specific assembly parser will have to
be modified for the change in signature of MatchInstructionImpl, and to
report errors based on the list of near-misses.

Differential revision: https://reviews.llvm.org/D27620

llvm-svn: 314774
2017-10-03 09:33:12 +00:00
Sjoerd Meijer 7a22a4948f ISel type legalization: add debug messages. NFCI.
This adds some more debug messages to the type legalizer and functions
like PromoteNode, ExpandNode, ExpandLibCall in an attempt to make
the debug messages a little bit more informative and useful.

Differential Revision: https://reviews.llvm.org/D38450

llvm-svn: 314773
2017-10-03 08:54:15 +00:00
Alex Bradbury bb89b2b628 [llvm-readobj][RISCV] Pretty-print RISCV e_flags
llvm-svn: 314772
2017-10-03 08:41:59 +00:00
Alex Bradbury 3ce3e6326c [RISCV] Add missed test case for r314770
Differential Revision: https://reviews.llvm.org/D38311
Patch by https://reviews.llvm.org/D38311

llvm-svn: 314771
2017-10-03 08:03:14 +00:00
Alex Bradbury 1fb1a480b5 [RISCV] Parse RISC-V eflags in ObjectYAML
Differential Revision: https://reviews.llvm.org/D38311
Patch by Chih-Mao Chen.

llvm-svn: 314770
2017-10-03 08:00:47 +00:00
Hiroshi Inoue 224661d94b [trivial] fix format, NFC
llvm-svn: 314769
2017-10-03 07:28:58 +00:00
Shoaib Meenai 4fefcd6dad [ObjectYAML] Handle SHF_COMPRESSED
This was previously being silently dropped by obj2yaml and caused
parsing errors with yaml2obj.

Differential Revision: https://reviews.llvm.org/D38490

llvm-svn: 314768
2017-10-03 06:35:55 +00:00
Martin Storsjo 1e54738676 [X86] Provide the LSDA pointer with RIP relative addressing if necessary
This makes sure the LSDA pointer isn't truncated to 32 bit.

Make LowerINTRINSIC_WO_CHAIN a member function instead of a static
function, so that it can use the getGlobalWrapperKind method.

This solves the second half of the issues mentioned in PR34720.

Differential Revision: https://reviews.llvm.org/D38343

llvm-svn: 314767
2017-10-03 06:29:58 +00:00
Mikael Holmen 6efe507e42 [Lint] Avoid failed assertion by fetching the proper pointer type
Summary:
When checking if a constant expression is a noop cast we fetched the
IntPtrType by doing DL->getIntPtrType(V->getType())). However, there can
be cases where V doesn't return a pointer, and then getIntPtrType()
triggers an assertion.

Now we pass DataLayout to isNoopCast so the method itself can determine
what the IntPtrType is.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D37894

llvm-svn: 314763
2017-10-03 06:03:49 +00:00
Craig Topper 8ed1aa91bd [InstCombine] Change a bunch of methods to take APInts by reference instead of pointer.
This allows us to remove a bunch of dereferences and only have a few dereferences at the call sites.

llvm-svn: 314762
2017-10-03 05:31:07 +00:00
Craig Topper 664c4d0190 [InstCombine] Replace an equality compare of two APInt pointers with a compare of the APInts themselves.
Apparently this works by virtue of the fact that the pointers are pointers to the APInts stored inside of the ConstantInt objects. But I really don't think we should be relying on that.

llvm-svn: 314761
2017-10-03 04:55:04 +00:00
Quentin Colombet c2f3cea608 [Legalizer] Add support for G_OR NarrowScalar.
Legalize bitwise OR:
 A = BinOp<Ty> B, C
into:
 B1, ..., BN = G_UNMERGE_VALUES B
 C1, ..., CN = G_UNMERGE_VALUES C
 A1 = BinOp<Ty/N> B1, C2
 ...
 AN = BinOp<Ty/N> BN, CN
 A = G_MERGE_VALUES A1, ..., AN

llvm-svn: 314760
2017-10-03 04:53:56 +00:00
Craig Topper 360c816cbb [X86] Add AVX512 check lines to the cost model truncate test.
llvm-svn: 314758
2017-10-03 03:47:34 +00:00
Rui Ueyama bfd7515408 Rewrite a function so that it doesn't use pointers to pointers. NFC.
Previous code was a bit puzzling because of its use of pointers.
In this patch, we pass a vector and its offsets, instead of pointers to
vector elements.

llvm-svn: 314756
2017-10-03 03:09:05 +00:00
Peter Collingbourne d5fc37cc88 LTO: Improve error reporting when adding a cache entry.
Move error handling code next to the code that returns the error,
and change the error message in order to distinguish it from a similar
error message elsewhere in this file.

llvm-svn: 314745
2017-10-03 00:44:21 +00:00
Daniel Berlin 67fbb351e3 SparseSolver: Rename getOrInitValueState to getValueState, matching what SCCP calls it
llvm-svn: 314744
2017-10-03 00:26:21 +00:00
Matt Arsenault 90c7593a75 AMDGPU: Remove global isGCN predicates
These are problematic because they apply to everything,
and can easily clobber whatever more specific predicate
you are trying to add to a function.

Currently instructions use SubtargetPredicate/PredicateControl
to apply this to patterns applied to an instruction definition,
but not to free standing Pats. Add a wrapper around Pat
so the special PredicateControls requirements can be appended
to the final predicate list like how Mips does it.

llvm-svn: 314742
2017-10-03 00:06:41 +00:00
Haicheng Wu 25f6c196d7 [InstSimplify] teach SimplifySelectInst() to fold more vector selects
Call ConstantFoldSelectInstruction() to fold cases like below

select <2 x i1><i1 true, i1 false>, <2 x i8> <i8 0, i8 1>, <2 x i8> <i8 2, i8 3>

All operands are constants and the condition has mixed true and false conditions.

Differential Revision: https://reviews.llvm.org/D38369

llvm-svn: 314741
2017-10-02 23:43:52 +00:00
Davide Italiano c48d1c8519 [PassManager] Retire cl::opt that have been set for a while. NFCI.
llvm-svn: 314740
2017-10-02 23:39:20 +00:00
Tim Shen 59465d29f8 [PowerPC] Revert r314666.
See https://reviews.llvm.org/D38172.

I tried to XFAIL it, but sometimes XPASS triggers the bot. Simply
revert it.

llvm-svn: 314739
2017-10-02 23:20:06 +00:00
Daniel Berlin 4d825bcf09 Template the sparse propagation solver instead of using void pointers
Summary:
This avoids using void * as the type of the lattice value and ugly casts needed to make that happen.
(If folks want to use references, etc, they can use a reference_wrapper).

Reviewers: davide, mssimpso

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D38476

llvm-svn: 314734
2017-10-02 22:49:49 +00:00
Tim Shen 7b8928c729 [PowerPC] Temporarily disable the test introduced by r314666
See https://reviews.llvm.org/D38172 for details.

llvm-svn: 314732
2017-10-02 22:40:32 +00:00
Geoff Berry bfc5fb4571 Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Issues addressed since original review:
- Avoid bug in regalloc greedy/machine verifier when forwarding to use
  in an instruction that re-defines the same virtual register.
- Fixed bug when forwarding to use in EarlyClobber instruction slot.
- Fixed incorrect forwarding to register definitions that showed up in
  explicit_uses() iterator (e.g. in INLINEASM).
- Moved removal of dead instructions found by
  LiveIntervals::shrinkToUses() outside of loop iterating over
  instructions to avoid instructions being deleted while pointed to by
  iterator.
- Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
- The pass no longer forwards COPYs to physical register uses, since
  doing so can break code that implicitly relies on the physical
  register number of the use.
- The pass no longer forwards COPYs to undef uses, since doing so
  can break the machine verifier by creating LiveRanges that don't
  end on a use (since the undef operand is not considered a use).

  [MachineCopyPropagation] Extend pass to do COPY source forwarding

  This change extends MachineCopyPropagation to do COPY source forwarding.

  This change also extends the MachineCopyPropagation pass to be able to
  be run during register allocation, after physical registers have been
  assigned, but before the virtual registers have been re-written, which
  allows it to remove virtual register COPY LiveIntervals that become dead
  through the forwarding of all of their uses.

llvm-svn: 314729
2017-10-02 22:01:37 +00:00
Michael Liao 70356574bb Remove trailing whitespace to trigger re-cmaking
llvm-svn: 314728
2017-10-02 21:54:38 +00:00
Craig Topper 36e21399f4 [X86] Run dos2unix on two disassembler tests.
llvm-svn: 314727
2017-10-02 21:46:58 +00:00
Amjad Aboud 8ef85a088e [X86][NFC] Add X86CmovConverterPass to the pass registry.
Differential Revision: https://reviews.llvm.org/D38355

llvm-svn: 314726
2017-10-02 21:46:37 +00:00
Adrian Prantl 46e006c497 llvm-dwarfdump: support the --ignore-case option.
llvm-svn: 314723
2017-10-02 21:21:09 +00:00
Michael Liao c6004d0371 Remove dead file.
llvm-svn: 314720
2017-10-02 21:00:52 +00:00
Konstantin Zhuravlyov 121125f20d Add ELFOSABI_FIRST_ARCH, ELFOSABI_LAST_ARCH and start using those in llvm-readobj
Differential Revision: https://reviews.llvm.org/D38418

llvm-svn: 314717
2017-10-02 20:49:58 +00:00
Matt Arsenault c6baa85fc6 AMDGPU: Fix typos
llvm-svn: 314715
2017-10-02 20:31:18 +00:00
Matt Arsenault b866d10127 AMDGPU: Fix potentially incorrectly matching check lines
These check lines are supposed to make sure the new d16
load instructions aren't used, but the expected instruction
name is a prefix of the incorrect instruction name.

llvm-svn: 314714
2017-10-02 20:31:16 +00:00
Sanjay Patel 4eb046172f [InstCombine] auto-generate complete checks; NFC
llvm-svn: 314712
2017-10-02 20:16:59 +00:00
Sanjay Patel 5ad573616e [InstCombine] add icmp (shr X, Y), 0 test; NFC
llvm-svn: 314710
2017-10-02 20:07:15 +00:00
Hans Wennborg db861e7a81 Fix two header comments. NFC.
llvm-svn: 314709
2017-10-02 19:48:28 +00:00
Walter Lee 35b09cbd42 Add support for Myriad ma2x8x series of CPUs
Summary: Also add support for some older Myriad CPUs that were missing.

Reviewers: jyknight

Subscribers: fedor.sergeev

Differential Revision: https://reviews.llvm.org/D37552

llvm-svn: 314705
2017-10-02 18:50:48 +00:00
Adrian Prantl a8b2ddbde4 Move the stripping of invalid debug info from the Verifier to AutoUpgrade.
This came out of a recent discussion on llvm-dev
(https://reviews.llvm.org/D38042). Currently the Verifier will strip
the debug info metadata from a module if it finds the dbeug info to be
malformed. This feature is very valuable since it allows us to improve
the Verifier by making it stricter without breaking bcompatibility,
but arguable the Verifier pass should not be modifying the IR. This
patch moves the stripping of broken debug info into AutoUpgrade
(UpgradeDebugInfo to be precise), which is a much better location for
this since the stripping of malformed (i.e., produced by older, buggy
versions of Clang) is a (harsh) form of AutoUpgrade.

This change is mostly NFC in nature, the one big difference is the
behavior when LLVM module passes are introducing malformed debug
info. Prior to this patch, a NoAsserts build would have printed a
warning and stripped the debug info, after this patch the Verifier
will report a fatal error. I believe this behavior is actually more
desirable anyway.

Differential Revision: https://reviews.llvm.org/D38184

llvm-svn: 314699
2017-10-02 18:31:29 +00:00
Sanjay Patel 6e17c00a88 [InstCombine] remove one-use restriction for icmp (shr exact X, C1), C2 --> icmp X, (C2<<C1)
llvm-svn: 314698
2017-10-02 18:26:44 +00:00
Sanjay Patel ce86ab3ed7 [InstCombine] add icmp (lshr X, C1), C2 test; NFC
llvm-svn: 314696
2017-10-02 18:16:17 +00:00
Dehao Chen f464627f28 Update getMergedLocation to check the instruction type and merge properly.
Summary: If the merged instruction is call instruction, we need to set the scope to the closes common scope between 2 locations, otherwise it will cause trouble when the call is getting inlined.

Reviewers: dblaikie, aprantl

Reviewed By: dblaikie, aprantl

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D37877

llvm-svn: 314694
2017-10-02 18:13:14 +00:00
Hans Wennborg ea89ff7c25 CodeView symbol dumper: use symbolic names for registers
https://reviews.llvm.org/D38469

llvm-svn: 314690
2017-10-02 17:44:47 +00:00
Stanislav Mekhanoshin 6deb212522 Eliminate ftrunc if source is know to be rounded
Differential Revision: https://reviews.llvm.org/D38421

llvm-svn: 314688
2017-10-02 16:57:07 +00:00
Jonas Devlieghere f91dc28b7b [dwarfdump] Add -show-form
This enables printing of DWARF form types after the DWARF attribute
types.

Differential revision: https://reviews.llvm.org/D38459

llvm-svn: 314685
2017-10-02 16:02:04 +00:00
Simon Pilgrim 65bde8806f [X86][SSE] Add PACKSS/PACKUS constant folding tests
llvm-svn: 314682
2017-10-02 15:43:26 +00:00
Simon Pilgrim ec528b2cb6 Regenerate test (missing broadcast constant comments). NFCI.
Still avoiding the floating point comments to prevent linux/windows discrepancies.

llvm-svn: 314681
2017-10-02 15:22:35 +00:00
Simon Pilgrim 050f60370c Regenerate test (missing broadcast constant comments). NFCI.
llvm-svn: 314680
2017-10-02 15:21:14 +00:00
Simon Pilgrim 037f6d10d5 Regenerate test. NFCI.
llvm-svn: 314679
2017-10-02 15:16:30 +00:00
Sanjay Patel 232669d6b3 use range-for-loops; NFCI
llvm-svn: 314676
2017-10-02 15:02:06 +00:00
Coby Tayree 01e5320c48 [AsmParser] Support GAS's .print directive
Differential Revision: https://reviews.llvm.org/D38448

llvm-svn: 314674
2017-10-02 14:36:31 +00:00
Sanjay Patel 7998d37076 remove duplicate comments, reposition related functions; NFC
llvm-svn: 314669
2017-10-02 14:03:17 +00:00
Bjorn Pettersson 8e978c0151 [X86][SSE] Fix -Wsign-compare problems introduced in r314658
The refactoring in
"[X86][SSE] Add createPackShuffleMask helper function. NFCI."
resulted in warning when compiling the code (seen in build bots).

This patch restores some types from int to unsigned to avoid
those warnings.

llvm-svn: 314667
2017-10-02 12:46:38 +00:00
Bjorn Pettersson 839775a277 [Debug info] Handle endianness when moving debug info for split integer values
Summary:
Take the target's endianness into account when splitting the
debug information in DAGTypeLegalizer::SetExpandedInteger.

This patch fixes so that, for big-endian targets, the fragment
expression corresponding to the high part of a split integer
value is placed at offset 0, in order to correctly represent
the memory address order.

I have attached a PPC32 reproducer where the resulting DWARF
pieces for a 64-bit integer were incorrectly reversed.

Patch by: dstenb

Reviewers: JDevlieghere, aprantl, dblaikie

Reviewed By: JDevlieghere, aprantl, dblaikie

Subscribers: nemanjai

Differential Revision: https://reviews.llvm.org/D38172

llvm-svn: 314666
2017-10-02 12:46:32 +00:00
Simon Pilgrim e2e27aff9b [X86][SSE] Add createPackShuffleMask helper function. NFCI.
llvm-svn: 314658
2017-10-02 10:12:51 +00:00
Simon Pilgrim c04c7443ea [X86][SSE] matchBinaryVectorShuffle - add support for different src/dst value shuffle types
Preparation for support for combining to PACKSS/PACKUS

llvm-svn: 314656
2017-10-02 09:45:08 +00:00
Hiroshi Inoue dcedd66b00 [PowerPC] support ZERO_EXTEND in tryBitPermutation
This patch add a support of ISD::ZERO_EXTEND in PPCDAGToDAGISel::tryBitPermutation to increase the opportunity to use rotate-and-mask by reordering ZEXT and ANDI.
Since tryBitPermutation stops analyzing nodes if it hits a ZEXT node while traversing SDNodes, we want to avoid ZEXT between two nodes that can be folded into a rotate-and-mask instruction.

For example, we allow these nodes

      t9: i32 = add t7, Constant:i32<1>
    t11: i32 = and t9, Constant:i32<255>
  t12: i64 = zero_extend t11
t14: i64 = shl t12, Constant:i64<2>

to be folded into a rotate-and-mask instruction.
Such case often happens in array accesses with logical AND operation in the index, e.g. array[i & 0xFF];

Differential Revision: https://reviews.llvm.org/D37514

llvm-svn: 314655
2017-10-02 09:24:00 +00:00
Simon Pilgrim 3bbbf31590 Fix typo in comment. NFCI.
llvm-svn: 314653
2017-10-02 09:10:50 +00:00
Simon Pilgrim e575651370 [X86] Cleanup uses of computeKnownBits by using MaskedValueIsZero helper instead. NFCI.
llvm-svn: 314652
2017-10-02 09:08:45 +00:00
Michael Zuckerman e4084f6bdb [X86][LLVM]Expanding Supports lowerInterleaved{store|load}() in X86InterleavedAccess (VF64 stride 3-4)
I continue to support different VF interleaved and in this pass for this patch,
I added the vf64 stride3 support for both load and store.
I also added support fot the stride4 store.

Reviewers:
1. zvi
2. dorit
3. igorb
4. guyblank

Differential Revision: https://reviews.llvm.org/D37687

Change-Id: I3d238efedf217d1768b348d710de1efa2f19d27b
llvm-svn: 314651
2017-10-02 07:35:25 +00:00
Craig Topper d37625859a [X86] Fix copy pasto in X86FastISel::fastEmitInst_rrrr.
The 4th operand was not being constrained and the third operand was being constrained twice.

llvm-svn: 314648
2017-10-02 05:46:53 +00:00
Craig Topper bb7866162c [X86] Use a bool flag instead of assigning an unsigned to two different values that we only use in an equality comparison.
llvm-svn: 314647
2017-10-02 05:46:52 +00:00
Craig Topper c05c390a7c [X86] Use _NOREX MOVZX instructions for some patterns even in 32-bit mode.
This unifies the patterns between both modes. This should be effectively NFC since all the available registers in 32-bit mode statisfy this constraint.

llvm-svn: 314643
2017-10-02 00:44:50 +00:00
Ron Lieberman 9bcdd80b66 [Hexagon] Check vector elements for equivalence in the HexagonVectorLoopCarriedReuse pass
If the two instructions being compared for equivalence have corresponding operands
    that are integer constants, then check their values to determine equivalence.
    
    Patch by Suyog Sarda!
    

llvm-svn: 314642
2017-10-02 00:34:07 +00:00
Ron Lieberman f90493d220 [Hexagon] Patch to Extract i1 element from vector of i1
This patch extracts 1 element from vector consisting
of elements of size 1 bit at given index.

llvm-svn: 314641
2017-10-02 00:16:15 +00:00
Craig Topper 6e025a3ecc [InstCombine] Use APInt for all the math in foldICmpDivConstant
Summary: This currently uses ConstantExpr to do its math, but as noted in a TODO it can all be done directly on APInt.

Reviewers: spatel, majnemer

Reviewed By: majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38440

llvm-svn: 314640
2017-10-01 23:53:54 +00:00
Craig Topper c20b46da2f [X86] Change register&memory TEST instructions from MRMSrcMem to MRMDstMem
Summary:
Intel documentation shows the memory operand as the first operand. But we currently treat it as the second operand. Conceptually the order doesn't matter since it doesn't write memory. We have aliases to parse with the operands in either order and the isel matching is commutable.

For the register&register form order does matter for the assembly parser. PR22995 was previously filed and fixed by changing the register&register form from MRMSrcReg to MRMDestReg to match gas. Ideally the memory form should match by using MRMDestMem.

I believe this supercedes D38025 which was trying to switch the register&register form back to pre-PR22995.

Reviewers: aymanmus, RKSimon, zvi

Reviewed By: aymanmus

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38120

llvm-svn: 314639
2017-10-01 23:53:53 +00:00
Craig Topper 00230604d3 [X86] Remove a couple unnecessary COPY_TO_REGCLASS from some output patterns where the instruction already produces the correct register class.
llvm-svn: 314638
2017-10-01 23:53:50 +00:00