Commit Graph

117315 Commits

Author SHA1 Message Date
Aditya Kumar a27014b851 Improve static analysis of cold basic blocks
Differential Revision: https://reviews.llvm.org/D52704

Reviewers: sebpop, tejohnson, brzycki, SirishP
Reviewed By: sebpop

llvm-svn: 343663
2018-10-03 06:21:05 +00:00
Aditya Kumar 9e20ade72a Add support for new pass manager
Modified the testcases to use both pass managers
Use single commandline flag for both pass managers.

Differential Revision: https://reviews.llvm.org/D52708
Reviewers: sebpop, tejohnson, brzycki, SirishP
Reviewed By: tejohnson, brzycki

llvm-svn: 343662
2018-10-03 05:55:20 +00:00
Fangrui Song 3d76d36059 [AMDGPU] Rename pass "isel" to "amdgpu-isel"
Summary: The AMDGPU target specific pass "isel" is a misleading name.

Reviewers: tstellar, echristo, javed.absar, arsenm

Reviewed By: arsenm

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D52759

llvm-svn: 343659
2018-10-03 03:38:22 +00:00
Matt Arsenault 635d479322 AMDGPU: Always run AMDGPUAlwaysInline
Even if calls are enabled, it still needs to be run
for forcing inline of functions that use LDS.

llvm-svn: 343657
2018-10-03 02:47:25 +00:00
Matt Arsenault 0f83d66ae7 Add atomicrmw operation to error messages
llvm-svn: 343656
2018-10-03 02:37:15 +00:00
Daniel Sanders 34eac35a60 Add the missing new files from r343654
llvm-svn: 343655
2018-10-03 02:21:30 +00:00
Daniel Sanders c973ad1878 Re-commit: [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64
Summary: Depends on D45541

Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar, aemerson

Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45543

The previous commit failed portions of the test-suite on GreenDragon due to
duplicate COPY instructions and iterator invalidation. Both issues have now
been fixed. To assist with this, a helper (cloneVirtualRegister) has been added
to MachineRegisterInfo that can be used to get another register that has the same
type and class/bank as an existing one.

llvm-svn: 343654
2018-10-03 02:12:17 +00:00
Thomas Lively 9075cd607d [WebAssembly] any_true and all_true intrinsics and instructions
Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52755

llvm-svn: 343649
2018-10-03 00:19:39 +00:00
Stanislav Mekhanoshin 1821513e2f [AMDGPU] Assert in getOpSize() there are no sub-dword subregs
Differential Revision: https://reviews.llvm.org/D52769

llvm-svn: 343648
2018-10-03 00:00:41 +00:00
Matt Arsenault b02ba99e91 IR: Move AtomicRMW string names into class
This will be used to improve error messages in a future commit.

llvm-svn: 343647
2018-10-02 23:44:11 +00:00
Sam Clegg b2486f118d [WebAssembly] Stop generating helper functions in WebAssemblyLowerEmscriptenEHSjLj
Previously we were creating weakly defined helper function in
each translation unit:

-  setThrew
-  setTempRet0

Instead we now assume these will be provided at link time.  In
emscripten they are provided in compiler-rt:
 https://github.com/kripken/emscripten/pull/7203

Additionally we previously created three global variable which are
also now required to exist at link time instead.

- __THREW__
- _threwValue
- __tempRet0

Differential Revision: https://reviews.llvm.org/D49208

llvm-svn: 343640
2018-10-02 22:12:15 +00:00
Aaron Smith da0602c154 [CodeView] Only add the Scoped flag for an enum type when it has an immediate function scope to match MSVC
Reviewers: rnk, zturner, llvm-commits

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D52706

llvm-svn: 343627
2018-10-02 20:28:15 +00:00
Aaron Smith 802b033d78 [CodeView] Emit function options for subprogram and member functions
Summary:
Use the newly added DebugInfo (DI) Trivial flag, which indicates if a C++ record is trivial or not, to determine Codeview::FunctionOptions.

Clang and MSVC generate slightly different Codeview for C++ records. For example, here is the C++ code for a class with a defaulted ctor,

       class C {
       public:
         C() = default;
       };

Clang will produce a LF for the defaulted ctor while MSVC does not. For more details, refer to FIXMEs in the test cases in "function-options.ll" included with this set of changes.


Reviewers: zturner, rnk, llvm-commits, aleksandr.urakov

Reviewed By: rnk

Subscribers: Hui, JDevlieghere

Differential Revision: https://reviews.llvm.org/D45123

llvm-svn: 343626
2018-10-02 20:21:05 +00:00
Matt Morehouse 4b1ec17fb0 Revert "X86, AArch64, ARM: Do not attach debug location to spill/reload instructions"
This reverts r343520 due to breakage of HWASan tests on Android.

llvm-svn: 343616
2018-10-02 18:35:44 +00:00
Craig Topper 49225d0915 [X86][Disassembler] Add bizarro versions of the MOVSXD instruction that sign extend from a GR32 to GR32 or GR16.
The 0x63 opcodes in 64-bit mode have a fixed source size of 32-bits, but the destination size is controlled by REX.W and the 0x66 opsize prefix. This instruction is normally used with a REX.W prefix which provides desired behavior. The other encodings are interpretted as valid by the processor, but aren't useful.

This patch makes us recognize them for the disassembler to match objdump.

llvm-svn: 343614
2018-10-02 18:16:19 +00:00
Daniel Sanders 74de21d06f [globalisel][verifier] Run the MachineVerifier from IRTranslator onwards
-verify-machineinstrs inserts the MachineVerifier after every MachineInstr-based
pass. However, GlobalISel creates MachineInstr-based passes earlier than DAGISel
and the corresponding verifiers are not being added. This patch fixes that.

If GlobalISel triggers the fallback path then the MIR can be left in a bad
state that is going to be cleared by ResetMachineFunctions. In this situation
verifying between GlobalISel passes will prevent the fallback path from
recovering from this. As a result, we bail out of verifying a function if the
FailedISel attribute is present.

llvm-svn: 343613
2018-10-02 17:56:58 +00:00
Reid Kleckner d5e4ec74e3 [codeview] Fix 32-bit x86 variable locations in realigned stack frames
Add the .cv_fpo_stackalign directive so that we can define $T0, or the
VFRAME virtual register, with it. This was overlooked in the initial
implementation because unlike MSVC, we push CSRs before allocating stack
space, so this value is only needed to describe local variable
locations. Variables that the compiler now addresses via ESP are instead
described as being stored at offsets from VFRAME, which for us is ESP
after alignment in the prologue.

This adds tests that show that we use the VFRAME register properly in
our S_DEFRANGE records, and that we emit the correct FPO data to define
it.

Fixes PR38857

llvm-svn: 343603
2018-10-02 16:43:52 +00:00
Simon Pilgrim 860cb5c071 [X86][Btver2] Fix BLENDV and AESDEC schedules
Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343597
2018-10-02 15:13:18 +00:00
Krzysztof Parzyszek 528aff3372 [Hexagon] Fix extracting subvectors of non-HVX vNi1
Patch by Brendon Cahoon.

llvm-svn: 343596
2018-10-02 15:05:43 +00:00
Diogo N. Sampaio eb9ca5ab18 [ARM] Emmit data symbol for constant pool data
The ARM elf emitter would omit printing data
symbol when constant data. This patch
overrides the emitFill method as to enforce that
the symbol is correctly printed.

Differential revision: https://reviews.llvm.org/D52737

llvm-svn: 343594
2018-10-02 14:55:48 +00:00
Simon Pilgrim 201bbe3993 [X86] Remove unnecessary BT(C/R/S)m(i/r) scheduler overrides
Some SchedAlias remain due to some badly setup RMW tags - but at least the overrides are all removed

llvm-svn: 343586
2018-10-02 13:11:59 +00:00
Simon Pilgrim 271bcb9397 [X86] Add APInt constant assembly printer helper
llvm-svn: 343577
2018-10-02 11:32:33 +00:00
Oliver Stannard c41902807e [AArch64][v8.5A] Add Memory Tagging instructions
This adds new instructions to manipluate tagged pointers, and to load
and store the tags associated with memory.

Patch by Pablo Barrio, David Spickett and Oliver Stannard!

Differential revision: https://reviews.llvm.org/D52490

llvm-svn: 343572
2018-10-02 10:04:39 +00:00
Oliver Stannard 2a5fcba94b [AArch64][v8.5A] Add Memory Tagging system registers
This adds new system registers introduced by the Memory Tagging
extension.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52488

llvm-svn: 343571
2018-10-02 09:54:35 +00:00
Oliver Stannard 4493f421ac [AArch64][v8.5A] Add MTE system instructions
The Memory Tagging Extension adds system instructions for data cache
maintenance, implemented as new operands to the DC instruction.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52487

llvm-svn: 343570
2018-10-02 09:48:43 +00:00
David Green 1e44c3b62c [InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A
This is an attempt to get out of a local-minimum that instcombine currently
gets stuck in. We essentially combine two optimisations at once, ~a - ~b = b-a
and min(~a, ~b) = ~max(a, b), only doing the transform if the result is at
least neutral. This involves using IsFreeToInvert, which has been expanded a
little to include selects that can be easily inverted.

This is trying to fix PR35875, using the ideas from Sanjay. It is a large
improvement to one of our rgb to cmy kernels.

Differential Revision: https://reviews.llvm.org/D52177

llvm-svn: 343569
2018-10-02 09:48:34 +00:00
Oliver Stannard 85de54090e [AArch64][v8.5A] Add MTE as an optional AArch64 extension
This adds the memory tagging extension, which is an optional extension
introduced in v8.5A. The new instructions and registers will be added by
subsequent patches.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52486

llvm-svn: 343563
2018-10-02 09:36:28 +00:00
Simon Pilgrim ad23f270db [X86] Standardize floating point assembly comments
Consistently try to use APFloat::toString for floating point constant comments to get rid of differences between Constant / ConstantDataSequential values - it should help stop some of the linux-windows buildbot failures matching NaN/INF etc. as well.

Differential Revision: https://reviews.llvm.org/D52702

llvm-svn: 343562
2018-10-02 09:08:51 +00:00
Matt Arsenault ab41193312 AMDGPU: Expand atomicrmw nand in IR
llvm-svn: 343559
2018-10-02 03:50:56 +00:00
Thomas Lively 6f77811a21 [WebAssembly] Restore slashes in SIMD conversion names
Summary: Depends on D52372 and D52442.

Reviewers: aheejin, dschuff, aardappel

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52512

llvm-svn: 343558
2018-10-02 01:52:21 +00:00
Craig Topper d616d33a96 [SimplifyCFG] Use Value::hasNUses instead of 'getNumUses() =='. NFCI
getNumUses is linear in the number of uses. Since we're looking for a specific use count, we can use hasNUses which will stop as soon as it determines there are more than N uses instead of walking all of them.

llvm-svn: 343550
2018-10-01 23:09:52 +00:00
Craig Topper 90c0a0621c [SimplifyCFG] Update comments that refer to CondBB to say ThenBB instead. NFC
There is no variable in this function named CondBB, but there is one named ThenBB and I believe the comments are all refering to it.

llvm-svn: 343548
2018-10-01 22:56:11 +00:00
Zachary Turner a67765ac8d [PDB] Add support for more kinds of PDB Sym Tags.
DIA SDK is returning several new sym tag types, so we update
the enumeration and printing code to support these.

llvm-svn: 343547
2018-10-01 22:39:19 +00:00
Daniel Sanders 33f42f97af Revert: r343521 and r343541: [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64
There's a strange assertion on two of the Green Dragon bots that goes away when
this is reverted. The assertion is in RegBankAlloc and if it is this commit then
-verify-machine-instrs should have caught it earlier in the pipeline.

llvm-svn: 343546
2018-10-01 22:32:08 +00:00
Reid Kleckner 8d7c421a70 [codeview] Simplify S_DEFRANGE emission code, NFC
These assembler directives are still pretty unreadable and it would be
nice to clean them up at some point.

llvm-svn: 343544
2018-10-01 22:25:49 +00:00
Reid Kleckner 9ea2c01264 [codeview] Emit S_FRAMEPROC and use S_DEFRANGE_FRAMEPOINTER_REL
Summary:
Before this change, LLVM would always describe locals on the stack as
being relative to some specific register, RSP, ESP, EBP, ESI, etc.
Variables in stack memory are pretty common, so there is a special
S_DEFRANGE_FRAMEPOINTER_REL symbol for them. This change uses it to
reduce the size of our debug info.

On top of the size savings, there are cases on 32-bit x86 where local
variables are addressed from ESP, but ESP changes across the function.
Unlike in DWARF, there is no FPO data to describe the stack adjustments
made to push arguments onto the stack and pop them off after the call,
which makes it hard for the debugger to find the local variables in
frames further up the stack.

To handle this, CodeView has a special VFRAME register, which
corresponds to the $T0 variable set by our FPO data in 32-bit.  Offsets
to local variables are instead relative to this value.

This is part of PR38857.

Reviewers: hans, zturner, javed.absar

Subscribers: aprantl, hiraditya, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D52217

llvm-svn: 343543
2018-10-01 21:59:45 +00:00
Reid Kleckner 7a6966ec27 Fix the Windows build in GlobalISel
Clang-cl was complaining about some sort of constexpr narrowing bug:

C:\src\llvm-project\llvm\lib\CodeGen\GlobalISel\CombinerHelper.cpp(136,31):  error: non-constant-expression cannot be narrowed from type 'llvm::TargetOpcode::(anonymous enum at C:\src\llvm-project\llvm\include\llvm/CodeGen/TargetOpcodes.h:22:1)' to 'unsigned int' in initializer list [-Wc++11-narrowing]
                              unsigned(MI.getOpcode()) == unsigned(TargetOpcode::G_LOAD)
                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
C:\src\llvm-project\llvm\lib\CodeGen\GlobalISel\CombinerHelper.cpp(136,31):  note: insert an explicit cast to silence this issue
                              unsigned(MI.getOpcode()) == unsigned(TargetOpcode::G_LOAD)
                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                              static_cast<unsigned int>(

llvm-svn: 343541
2018-10-01 21:39:39 +00:00
Craig Topper 42cd8cd862 Recommit r343499 "[X86] Enable load folding in the test shrinking code"
Original message:
This patch adds load folding support to the test shrinking code. This was noticed missing in the review for D52669

llvm-svn: 343540
2018-10-01 21:35:28 +00:00
Craig Topper f06a57fc89 Recommit r343498 "[X86] Improve test instruction shrinking when the sign flag is used and the output of the and is truncated."
This includes a fix to prevent i16 compares with i32/i64 ands from being shrunk if bit 15 of the and is set and the sign bit is used.

Original commit message:
Currently we skip looking through truncates if the sign flag is used. But that's overly restrictive.

It's safe to look through the truncate as long as we ensure one of the 3 things when we shrink. Either the MSB of the mask at the shrunken size isn't set. If the mask bit is set then either the shrunk size needs to be equal to the compare size or the sign

There are still missed opportunities to shrink a load and fold it in here. This will be fixed in a future patch.

llvm-svn: 343539
2018-10-01 21:35:26 +00:00
Stefan Pintilie 5d32a86f44 [PowerPC] Folding XForm to DForm loads requires alignment for some DForm loads.
Going from XForm Load to DSForm Load requires that the immediate be 4 byte
aligned.
If we are not aligned we must leave the load as LDX (XForm).
This bug is causing a compile-time failure in the benchmark h264ref.

Differential Revision: https://reviews.llvm.org/D51988

llvm-svn: 343525
2018-10-01 20:16:27 +00:00
Eric Christopher dcf1d97c5c Temporarily revert "[GVNHoist] Re-enable GVNHoist by default"
This reverts commit r342387 as it's showing significant performance
regressions in a number of benchmarks. Followed up with the
committer and original thread with an example and will get performance
numbers before recommitting.

llvm-svn: 343522
2018-10-01 18:57:08 +00:00
Daniel Sanders 9659bfda5a [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64
Summary: Depends on D45541

Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar, aemerson

Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45543

llvm-svn: 343521
2018-10-01 18:56:47 +00:00
Matthias Braun 3e081703c3 X86, AArch64, ARM: Do not attach debug location to spill/reload instructions
Spill/reload instructions are artificially generated by the compiler and
have no relation to the original source code. So the best thing to do is
not attach any debug location to them (instead of just taking the next
debug location we find on following instructions).

Differential Revision: https://reviews.llvm.org/D52125

llvm-svn: 343520
2018-10-01 18:56:39 +00:00
Craig Topper e072934d28 Revert r343499 and r343498. X86 test improvements
There's a subtle bug in the handling of truncate from i32/i64 to i32 without minsize.

I'll be adding more test cases and trying to find a fix.

llvm-svn: 343516
2018-10-01 18:40:44 +00:00
Krzysztof Parzyszek 6d569a2cc4 [Hexagon] Remove incorrect pattern for swiz
The pattern had a couple of problems:
- It was checking for loads of bytes in the reverse order to what it
  should have been looking for.
- It would replace loads of bytes with a load of a word without making
  sure that the alignment was correct.

Thanks to Eli Friedman for pointing it out.

llvm-svn: 343514
2018-10-01 18:24:40 +00:00
Stanislav Mekhanoshin ae8bd6d9b5 [AMDGPU] Fixed SIInstrInfo::getOpSize to handle subregs
Currently it returns incorrect operand size for a target independet
node such as COPY if operand is a register with subreg. Instead of
correct subreg size it returns a size of the whole superreg.

Differential Revision: https://reviews.llvm.org/D52736

llvm-svn: 343508
2018-10-01 18:00:02 +00:00
Zachary Turner a5e3e02602 [PDB] Add support for dumping Typedef records.
These work a little differently because they are actually in
the globals stream and are treated as symbol records, even though
DIA presents them as types.  So this also adds the necessary
infrastructure to cache records that live somewhere other than
the TPI stream as well.

llvm-svn: 343507
2018-10-01 17:55:38 +00:00
Zachary Turner 5c1873b213 [PDB] Add support for parsing VFTable Shape records.
This allows them to be returned from the native API.

llvm-svn: 343506
2018-10-01 17:55:16 +00:00
Matthias Braun 7159daa68e MIRParser: Check that instructions only reference DILocation metadata
llvm-svn: 343505
2018-10-01 17:50:52 +00:00
Wouter van Oortmerssen 0c83c3ff38 [WebAssembly] Fixed AsmParser not allowing instructions with /
Summary:
The AsmParser Lexer regards these as a seperate token.
Here we expand the instruction name with them if they are
adjacent (no whitespace).

Tested: the basic-assembly.s test case has one case with a / in it.
The currently are also instructions with : in them, which we intend
to rename rather than fix them here.

Reviewers: tlively, dschuff

Subscribers: sbc100, jgravelle-google, aheejin, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52442

llvm-svn: 343501
2018-10-01 17:20:31 +00:00
Craig Topper aa84e1bba2 [X86] Enable load folding in the test shrinking code
This patch adds load folding support to the test shrinking code. This was noticed missing in the review for D52669

Differential Revision: https://reviews.llvm.org/D52699

llvm-svn: 343499
2018-10-01 17:10:50 +00:00
Craig Topper 2b587ad071 [X86] Improve test instruction shrinking when the sign flag is used and the output of the and is truncated
Currently we skip looking through truncates if the sign flag is used. But that's overly restrictive.

It's safe to look through the truncate as long as we ensure one of the 3 things when we shrink. Either the MSB of the mask at the shrunken size isn't set. If the mask bit is set then either the shrunk size needs to be equal to the compare size or the sign flag needs to be unused.

There are still missed opportunities to shrink a load and fold it in here. This will be fixed in a future patch.

Differential Revision: https://reviews.llvm.org/D52669

llvm-svn: 343498
2018-10-01 17:10:45 +00:00
Simon Pilgrim e0d2019052 [X86][Btver2] Fix BT(C|R|S)mr & BT(C|R|S)mi schedule latency + uop counts
Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343494
2018-10-01 16:31:30 +00:00
Matthias Braun 004fe6bf83 DAGCombiner: StoreMerging: Fix bad index calculating when adjusting mismatching vector types
This fixes a case of bad index calculation when merging mismatching
vector types. This changes the existing code to just use the existing
extract_{subvector|element} and a bitcast (instead of bitcast first and
then newly created extract_xxx) so we don't need to adjust any indices
in the first place.

rdar://44584718

Differential Revision: https://reviews.llvm.org/D52681

llvm-svn: 343493
2018-10-01 16:25:50 +00:00
Simon Pilgrim 683e35527b [X86] Create schedule classes for BT(C|R|S)mi and BT(C|R|S)mr instructions
llvm-svn: 343490
2018-10-01 16:12:44 +00:00
Evandro Menezes 55b9a5395b [AArch64] Refactor cheap cost model
Refactor the order in `TII::isAsCheapAsAMove()` to ease future development
and maintenance.  Practically NFC.

llvm-svn: 343489
2018-10-01 16:11:19 +00:00
Simon Pilgrim 4334912c1c [X86] Remove unnecessary BTmi/BTmr scheduler overrides
llvm-svn: 343487
2018-10-01 15:01:00 +00:00
Jesper Antonsson c954b86391 [InstCombine] Handle vector compares in foldGEPIcmp(), take 2
Summary:
This is a continuation of the fix for PR34627 "InstCombine assertion at vector gep/icmp folding". (I just realized bugpoint had fuzzed the original test for me, so I had fixed another trigger of the same assert in adjacent code in InstCombine.)

This patch avoids optimizing an icmp (to look only at the base pointers) when the resulting icmp would have a different type.

The patch adds a testcase and also cleans up and shrinks the pre-existing test for the adjacent assert trigger.

Reviewers: lebedev.ri, majnemer, spatel

Reviewed By: lebedev.ri

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52494

llvm-svn: 343486
2018-10-01 14:59:25 +00:00
Simon Pilgrim 6ddc4e821c [X86][Btver2] Fix BTmr schedule uop counts
Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343484
2018-10-01 14:42:16 +00:00
Sanjay Patel 31b07198f1 [InstCombine] try to convert vector insert+extract to trunc; 2nd try
This was originally committed at rL343407, but reverted at 
rL343458 because it crashed trying to handle a case where
the destination type is FP. This version of the patch adds
a check for that possibility. Tests added at rL343480.

Original commit message:

This transform is requested for the backend in:
https://bugs.llvm.org/show_bug.cgi?id=39016
...but I figured it was worth doing in IR too, and it's probably
easier to implement here, so that's this patch.

In the simplest case, we are just truncating a scalar value. If the
extract index doesn't correspond to the LSBs of the scalar, then we
have to shift-right before the truncate. Endian-ness makes this tricky,
but hopefully the ASCII-art helps visualize the transform.

Differential Revision: https://reviews.llvm.org/D52439

llvm-svn: 343482
2018-10-01 14:40:00 +00:00
Simon Pilgrim 43737a3df4 [X86] Create schedule classes for BTmi and BTmr instructions
llvm-svn: 343478
2018-10-01 14:23:37 +00:00
Robert Widmann abda7ee8e7 [LLVM-C] Add an accessor for the kind of a Metadata Node
Summary: Allows for retrieving the type of a metadata node.  Has the added benefit of ensuring that the C and C++ kind APIs stay in sync as a failure to add a corresponding LLVMMetadataKind will result in the switch in the accessor being semantically malformed.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52693

llvm-svn: 343469
2018-10-01 13:15:09 +00:00
Simon Pilgrim a982236e59 [X86][Btver2] Fix masked load schedule
JFPU01 resource usage should match JFPX

Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343468
2018-10-01 13:12:05 +00:00
Hans Wennborg a60aa91374 Revert r343407 "[InstCombine] try to convert vector insert+extract to trunc"
This caused Chromium builds to fail with "Illegal Trunc" assertion.
See https://crbug.com/890723 for repro.

> This transform is requested for the backend in:
> https://bugs.llvm.org/show_bug.cgi?id=39016
> ...but I figured it was worth doing in IR too, and it's probably
> easier to implement here, so that's this patch.
>
> In the simplest case, we are just truncating a scalar value. If the
> extract index doesn't correspond to the LSBs of the scalar, then we
> have to shift-right before the truncate. Endian-ness makes this tricky,
> but hopefully the ASCII-art helps visualize the transform.
>
> Differential Revision: https://reviews.llvm.org/D52439

llvm-svn: 343458
2018-10-01 12:07:45 +00:00
Alexander Timofeev b048fa3344 [AMDGPU] Divergence driven instruction selection. Shift operations.
Summary: This change enables VOP3 shifts to be explicitly selected
         dependent on the divergence.

Differential Revision: https://reviews.llvm.org/D52559

Reviewers: rampitec
llvm-svn: 343455
2018-10-01 11:06:35 +00:00
Andrea Di Biagio 24ea163007 [X86][BtVer2] Teach how to identify zero-idiom VPERM2F128rr instructions.
This patch adds another variant class to identify zero-idiom VPERM2F128rr
instructions.

On Jaguar, a VPERM wih bit 3 and 7 of the mask set, is a zero-idiom.

Differential Revision: https://reviews.llvm.org/D52663

llvm-svn: 343452
2018-10-01 10:35:13 +00:00
Florian Hahn 8600fee52e Recommit r343308: [LoopInterchange] Turn into a loop pass.
llvm-svn: 343450
2018-10-01 09:59:48 +00:00
Clement Courbet a933fb237e [X86][Sched] Update scheduling information for VZEROALL on HWS, BDW, SKX, SNB.
Summary:
    While looking at PR35606, I found out that the scheduling info is incorrect.

    One can check that it's really a P5+P6 and not a 2*P56 with:
    echo -e 'vzeroall\nvandps %xmm1, %xmm2, %xmm3' | ./bin/llvm-exegesis -mode=uops -snippets-file=-
    (vandps executes on P5 only)

    Reviewers: craig.topper, RKSimon

    Subscribers: llvm-commits

    Differential Revision: https://reviews.llvm.org/D52541

llvm-svn: 343447
2018-10-01 08:37:48 +00:00
Clement Courbet dac60b9837 [X86][Sched] Add pfm uop counter definitions for SNB,BDW,SKX.
llvm-svn: 343446
2018-10-01 08:37:37 +00:00
Carlos Alberto Enciso 81d8ef2196 [DebugInfo][Dexter] Incorrect DBG_VALUE after MCP dead copy instruction removal.
When MachineCopyPropagation eliminates a dead 'copy', its associated debug information becomes invalid. as the recorded register has been removed.  It causes the debugger to display wrong variable value.

Differential Revision: https://reviews.llvm.org/D52614

llvm-svn: 343445
2018-10-01 08:14:44 +00:00
Craig Topper 67d9dbdbdd [X86] Stop X86DomainReassignment from creating copies between GR8/GR16 physical registers and k-registers.
We can only copy between a k-register and a GR32/GR64 register.

This patch detects that the copy will be illegal and prevents the domain reassignment from happening for that closure.

This probably isn't the best fix, and we should probably figure out how to handle this correctly.

Fixes PR38803.

llvm-svn: 343443
2018-10-01 07:08:41 +00:00
Lang Hames deb3640d95 [ORC] Pass Symbols to ExecutionSession::lookup by value, potentially saving a
copy.

llvm-svn: 343442
2018-10-01 04:59:10 +00:00
Lang Hames 47d0a37704 [ORC] Add convenience methods for creating DynamicLibraryFallbackGenerators for
libraries on disk, and for the current process.

Avoids more boilerplate during JIT construction.

llvm-svn: 343430
2018-10-01 00:59:28 +00:00
Craig Topper 1d1dca6a6f [X86] Change an llvm_unreachable to a report_fatal_error so the optimizer will stop making us reach the other report_fatal_error in this function.
There's a conditional report_fatal_error just above this llvm_unreachable. The optimizer when seeing the unreachable removes the conditional and just makes any other error trigger the existing report_fatal_error.

llvm-svn: 343428
2018-09-30 23:43:30 +00:00
Lang Hames 71d781c434 [ORC] Add an 'intern' method to ExecutionEngine for interning symbol names.
This cuts down on boilerplate by reducing 'ES.getSymbolStringPool().intern(...)'
to 'ES.intern(...)'.

llvm-svn: 343427
2018-09-30 23:18:24 +00:00
Fangrui Song 3507c6e884 Use the container form llvm::sort(C, ...)
There are a few leftovers in rL343163 which span two lines. This commit
changes these llvm::sort(C.begin(), C.end, ...) to llvm::sort(C, ...)

llvm-svn: 343426
2018-09-30 22:31:29 +00:00
Simon Pilgrim f21083870d [X86] Fix scheduler class for BTmi instructions
This wasn't treated as a folded load instruction

llvm-svn: 343424
2018-09-30 20:19:16 +00:00
Lang Hames d435ce4343 [ORC] Extract and tidy up JITTargetMachineBuilder, add unit test.
(1) Adds comments for the API.

(2) Removes the setArch method: This is redundant: the setArchStr method on the
    triple should be used instead.

(3) Turns EmulatedTLS on by default. This matches EngineBuilder's behavior.

llvm-svn: 343423
2018-09-30 19:12:23 +00:00
Craig Topper 99ad2a5723 [X86] Copy memrefs when folding a load for division instruction selection.
llvm-svn: 343419
2018-09-30 17:47:18 +00:00
Bjorn Pettersson c2fc53ac90 [PHIElimination] Lower a PHI node with only undef uses as IMPLICIT_DEF
Summary:
The lowering of PHI nodes used to detect if all inputs originated
from IMPLICIT_DEF's. If so the PHI node was replaced by an
IMPLICIT_DEF. Now we also consider undef uses when checking the
inputs. So if all inputs are implicitly defined or undef we
lower the PHI to an IMPLICIT_DEF. This makes
PHIElimination::LowerPHINode more consistent as it checks
both implicit and undef properties at later stages.

Reviewers: MatzeB, tstellar

Reviewed By: MatzeB

Subscribers: jvesely, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D52558

llvm-svn: 343417
2018-09-30 17:26:58 +00:00
Bjorn Pettersson 4af7f57bdf [PHIElimination] Update the regression test for PR16508
Summary:
When PR16508 was solved (in rL185363) a regression test was
added as test/CodeGen/PowerPC/2013-07-01-PHIElimBug.ll.
I discovered that the test case no longer reproduced the
scenario from PR16508. This problem could have been amended
by adding an extra RUN line with "-O1" (or possibly "-O0"),
but instead I added a mir-reproducer
  test/CodeGen/PowerPC/2013-07-01-PHIElimBug.mir
to get a reproducer that is less sensitive to changes in
earlier passes (including O-level).

While being at it I also corrected a code comment in
PHIElimination::EliminatePHINodes that has been incorrect
since the related bugfix from rL185363.

Reviewers: MatzeB, hfinkel

Reviewed By: MatzeB

Subscribers: nemanjai, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D52553

llvm-svn: 343416
2018-09-30 17:23:21 +00:00
Simon Pilgrim 4f5693ac8d [X86][Btver2] Fix PCmpIStrI/PCmpIStrM schedules
Missing JFPU0 pipe and double JFPU1 pipe (to match JVALU1) resources

Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343413
2018-09-30 16:38:38 +00:00
Zachary Turner 518cb2d560 [PDB] Add native support for dumping array types.
llvm-svn: 343412
2018-09-30 16:19:18 +00:00
Simon Pilgrim 9cec221a1c [X86][BtVer2] Add the ability to add additional uops for folded instructions
Some instructions take an extra load uop - but not consistently.....

llvm-svn: 343410
2018-09-30 15:58:56 +00:00
Sanjay Patel 1e0f1f645a [InstCombine] try to convert vector insert+extract to trunc
This transform is requested for the backend in:
https://bugs.llvm.org/show_bug.cgi?id=39016
...but I figured it was worth doing in IR too, and it's probably 
easier to implement here, so that's this patch.

In the simplest case, we are just truncating a scalar value. If the 
extract index doesn't correspond to the LSBs of the scalar, then we 
have to shift-right before the truncate. Endian-ness makes this tricky, 
but hopefully the ASCII-art helps visualize the transform.

Differential Revision: https://reviews.llvm.org/D52439

llvm-svn: 343407
2018-09-30 14:34:01 +00:00
Sanjay Patel 26c119a9c2 [InstCombine] allow lengthening of insertelement to eliminate shuffles
As noted in post-commit comments for D52548, the limitation on 
increasing vector length can be applied by opcode.
As a first step, this patch only allows insertelement to be
widened because that has no logical downsides for IR and has 
little risk of pessimizing codegen.

This may cause PR39132 to go into hiding during a full compile,
but that bug is not fixed.

llvm-svn: 343406
2018-09-30 13:50:42 +00:00
Simon Pilgrim 818cfc40ff [DAG] Don't perform SINT_TO_FP<->UINT_TO_FP custom conversion after legalization
The SINT_TO_FP<->UINT_TO_FP combines for non-negative integers should only occur for legal ops once LegalOperations = true

No test case to hand, noticed when investigating PR38226 + PR38970

llvm-svn: 343405
2018-09-30 12:46:42 +00:00
Craig Topper 1709829fed [X86] Disable BMI BEXTR in X86DAGToDAGISel::matchBEXTRFromAnd unless we're on compiling for a CPU with single uop BEXTR
Summary:
This function turns (X >> C1) & C2 into a BMI BEXTR or TBM BEXTRI instruction. For BMI BEXTR we have to materialize an immediate into a register to feed to the BEXTR instruction.

The BMI BEXTR instruction is 2 uops on Intel CPUs. It looks like on SKL its one port 0/6 uop and one port 1/5 uop. Despite what Agner's tables say. I know one of the uops is a regular shift uop so it would have to go through the port 0/6 shifter unit. So that's the same or worse execution wise than the shift+and which is one 0/6 uop and one 0/1/5/6 uop. The move immediate into register is an additional 0/1/5/6 uop.

For now I've limited this transform to AMD CPUs which have a single uop BEXTR. If may also might make sense if we can fold a load or if the and immediate is larger than 32-bits and can't be encoded as a sign extended 32-bit value or if LICM or CSE can hoist the move immediate and share it. But we'd need to look more carefully at that. In the regression I looked at it doesn't look load folding or large immediates were occurring so the regression isn't caused by the loss of those. So we could try to be smarter here if we find a compelling case.

Reviewers: RKSimon, spatel, lebedev.ri, andreadb

Reviewed By: RKSimon

Subscribers: llvm-commits, andreadb, RKSimon

Differential Revision: https://reviews.llvm.org/D52570

llvm-svn: 343399
2018-09-30 03:01:46 +00:00
Lang Hames 98440293fb [ORC] Add partitioning support to CompileOnDemandLayer2.
CompileOnDemandLayer2 now supports user-supplied partition functions (the
original CompileOnDemandLayer already supported these).

Partition functions are called with the list of requested global values
(i.e. global values that currently have queries waiting on them) and have an
opportunity to select extra global values to materialize at the same time.

Also adds testing infrastructure for the new feature to lli.

llvm-svn: 343396
2018-09-29 23:49:57 +00:00
Lang Hames c3053e41bf [ORC] Clear SymbolToDefinitionMap when materializing a MaterializationUnit.
The map is inaccessible at this point, so we may as well reclaim the memory
early.

llvm-svn: 343395
2018-09-29 23:49:56 +00:00
Zachary Turner 6ca6a03c51 [PDB] Better native API support for pointers.
We didn't properly detect when a pointer was a member
pointer, and when that was the case we were not
properly returning class parent info.  This caused
member pointers to render incorrectly in pretty mode.
However, we didn't even have pretty tests for pointers
in native mode, so those are also added now to ensure
this.

llvm-svn: 343393
2018-09-29 23:28:19 +00:00
Simon Pilgrim a2efe82b81 [X86] SimplifyDemandedVectorEltsForTargetNode - remove identity target shuffles before simplifying inputs
By removing demanded target shuffles that simplify to zero/undef/identity before simplifying its inputs we improve chances of further simplification, as only the immediate parent user of the combined is added back to the work list - this still doesn't help us if its passed through other ops though (bitcasts....).

llvm-svn: 343390
2018-09-29 18:15:26 +00:00
Simon Pilgrim a93407fadf [X86][SSE] LowerScalarImmediateShift - remove 32-bit vXi64 special case handling.
This is all handled generally by getTargetConstantBitsFromNode now

llvm-svn: 343387
2018-09-29 17:36:22 +00:00
Simon Pilgrim b5737007cd Fix signed/unsigned mismatch warning. NFCI.
llvm-svn: 343385
2018-09-29 17:11:19 +00:00
Simon Pilgrim d633e290c8 [X86] getTargetConstantBitsFromNode - add support for rearranging constant bits via shuffles
Exposed an issue that recursive calls to getTargetConstantBitsFromNode don't handle changes to EltSizeInBits yet.

llvm-svn: 343384
2018-09-29 17:01:55 +00:00
Simon Pilgrim ae34ae12ef [X86][SSE] LowerScalarImmediateShift - use getTargetConstantBitsFromNode to get immediate data
Don't just attempt to find a splat build vector.

First step towards getting rid of all the 32-bit special case code.

llvm-svn: 343383
2018-09-29 16:40:35 +00:00
Sanjay Patel 54d31ef87e [InstCombine] fix formatting in vector evaluators; NFC
We need to alter the functionality as shown in D52548.

llvm-svn: 343379
2018-09-29 15:05:24 +00:00
Simon Pilgrim a731940c60 [X86] getTargetConstantBitsFromNode - fix self-move assertions from gcc builds due to rL343375
llvm-svn: 343377
2018-09-29 14:51:09 +00:00
Simon Pilgrim 22d51014af [X86] getTargetConstantBitsFromNode - add support for peeking through ISD::EXTRACT_SUBVECTOR
llvm-svn: 343375
2018-09-29 14:17:32 +00:00
Simon Pilgrim aa77033a6b [X86][SSE] Fixed issue with v2i64 variable shifts on 32-bit targets
The shift amount might have peeked through a extract_subvector, altering the number of vector elements in the 'Amt' variable - so we were incorrectly calculating the ratio when peeking through bitcasts, resulting in incorrectly detecting splats.

llvm-svn: 343373
2018-09-29 13:25:22 +00:00
Heejin Ahn 5e174e7474 Fix comment indentation in addLandingPad
rL343018 messed up the comment indentation while moving it.

llvm-svn: 343371
2018-09-29 09:22:25 +00:00
Vitaly Buka 0509070811 [cxx2a] Fix warning triggered by r343285
llvm-svn: 343369
2018-09-29 02:17:12 +00:00
Lang Hames 2afc22ed9e [ORC] Make MaterializationResponsibility::getRequestedSymbols() const.
This makes it available for use in IRTransformLayer2::TransformFunction
instances (since a const MaterializationResponsibility& parameter was
added in r343365).

llvm-svn: 343367
2018-09-28 22:03:17 +00:00
Lang Hames 3e709d5f78 [ORC] Add more utilities to aid debugging output.
(1) A const accessor for the LLVMContext held by a ThreadSafeContext.

(2) A const accessor for the ThreadSafeModules held by an IRMaterializationUnit.

(3) A const MaterializationResponsibility reference to IRTransformLayer2's
    transform function. This makes IRTransformLayer2 useful for JIT debugging
    (since it can inspect JIT state through the responsibility argument) as well
    as program transformations.

llvm-svn: 343365
2018-09-28 21:49:53 +00:00
Thomas Lively d47b5c7bed [ValueTracking] Allow select patterns to work on FP vectors
Summary:
This CL allows constant vectors of floats to be recognized as non-NaN
and non-zero in select patterns. This change makes
`matchSelectPattern` more powerful generally, but was motivated
specifically because I wanted fminnan and fmaxnan to be created for
vector versions of the scalar patterns they are created for.

Tested with check-all on all targets. A testcase in the WebAssembly
backend that tests the non-nan codepath is in an upcoming CL.

Reviewers: aheejin, dschuff

Subscribers: sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52324

llvm-svn: 343364
2018-09-28 21:36:43 +00:00
Robert Widmann e63a12ccbe [LLVM-C] Add an accessor for the "value type" of a global
Summary: Before this, there was no reasonable way to retrieve the type of a global value (most notably, a function) that was created with  the C API.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52659

llvm-svn: 343363
2018-09-28 20:54:29 +00:00
Heejin Ahn ec3d65b870 [WebAssembly] Fix memory leak on WasmEHFuncInfo
Summary: WasmEHFuncInfo objects were not being properly deleted.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52582

llvm-svn: 343362
2018-09-28 20:54:04 +00:00
Eli Friedman 5ab09a684f [ARM] Fix correctness checks in promoteToConstantPool.
Correctly check for relocations in the constant to promote. And don't
allow promoting a constant multiple times.

This partially fixes https://bugs.llvm.org//show_bug.cgi?id=32780 ;
it's not a complete fix because we also need to prevent
ARMConstantIslands from cloning the constant.

(-arm-promote-constant is currently off by default, and it stays off
with this patch. I'll look into turning it on again when all the known
issues are fixed.)

Differential Revision: https://reviews.llvm.org/D51472

llvm-svn: 343361
2018-09-28 20:27:31 +00:00
Eli Friedman bb993be56b [ARM] Use preferred alignment for constants in promoteToConstantPool.
This mostly affects IR generated by non-clang frontends because clang
generally sets the alignment of globals explicitly.

Fixes https://bugs.llvm.org//show_bug.cgi?id=32394 .

(-arm-promote-constant is currently off by default, and it stays off
with this patch. I'll look into turning it on again when all the known
issues are fixed.)

Differential Revision: https://reviews.llvm.org/D51469

llvm-svn: 343359
2018-09-28 20:21:51 +00:00
Lang Hames b62f73420b [ORC] Narrow a cast: the block guarded by the condition only handles
GlobalVariables, not all GlobalValues.

llvm-svn: 343358
2018-09-28 20:16:16 +00:00
Evandro Menezes fc1852ff1c [AArch64] Split zero cycle feature more granularly
Split the `zcz` feature into specific ones got GP and FP registers, `zcz-gp`
and `zcz-fp`, respectively, while retaining the original feature option to
mean both.

Differential revision: https://reviews.llvm.org/D52621

llvm-svn: 343354
2018-09-28 19:05:09 +00:00
David Bolvansky 8e90bad63d [DAGCombiner] [NFC] Improve X div/rem 1 fold
Reviewers: spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52661

llvm-svn: 343349
2018-09-28 18:40:30 +00:00
Luke Cheeseman 10981cc884 Revert r343317
- asan buildbots are breaking and I need to investigate the issue

llvm-svn: 343341
2018-09-28 17:01:50 +00:00
whitequark 29b2980159 Revert "[LLVM-C] Add bindings for addCoroutinePassesToExtensionPoints"
This reverts commit c4baf7c2f06ff5459c4f5998ce980346e72bff97.

Broke the bots, and should really be in Transforms/Coroutines
instead.

llvm-svn: 343337
2018-09-28 16:45:18 +00:00
whitequark 937afbc365 [LLVM-C] Add bindings for addCoroutinePassesToExtensionPoints
Summary: This patch adds bindings to C and Go for addCoroutinePassesToExtensionPoints, which is used to add coroutine passes to the correct locations in PassManagerBuilder.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: mehdi_amini, modocache, llvm-commits

Differential Revision: https://reviews.llvm.org/D51642

llvm-svn: 343336
2018-09-28 16:38:11 +00:00
Robert Widmann d22ee9461f [LLVM-C] Fix broken build bots
Summary: Fix broken bots caused by the merge of D51522.

Reviewers: whitequark

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52657

llvm-svn: 343334
2018-09-28 16:02:26 +00:00
Robert Widmann 9cba4eced8 [LLVM-C] Add more debug information accessors to GlobalObject and Instruction
Summary: Adds missing debug information accessors to GlobalObject.  This puts the finishing touches on cloning debug info in the echo tests.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: aprantl, JDevlieghere, llvm-commits, harlanhaskins

Differential Revision: https://reviews.llvm.org/D51522

llvm-svn: 343330
2018-09-28 15:35:18 +00:00
Sanjay Patel 242f90fe82 [InstCombine] don't propagate wider shufflevector arguments to predecessors
InstCombine would propagate shufflevector insts that had wider output vectors onto 
predecessors, which would sometimes push undef's onto the divisor of a div/rem and 
result in bad codegen.

I've fixed this by just banning propagating shufflevector back if the result of 
the shufflevector is wider than the input vectors.

Patch by: @sheredom (Neil Henning)

Differential Revision: https://reviews.llvm.org/D52548

llvm-svn: 343329
2018-09-28 15:24:41 +00:00
Lang Hames 604769943a [ORC] Remove some dead code.
llvm-svn: 343327
2018-09-28 15:13:41 +00:00
Aditya Nandakumar 1cbb057142 [GISel]: Remove an incorrect assert in CallLowering
https://reviews.llvm.org/D51147

Asserting if any extend of vectors should be up to the target's
legalizer/target specific code not in CallLowering.

reviewed by : dsanders.

llvm-svn: 343325
2018-09-28 15:08:49 +00:00
Lang Hames 46f40a7128 [ORC] Improve debugging output for ORC.
(1) Print debugging output under a session lock to avoid garbled messages when
compiling on multiple threads.

(2) Name MaterializationUnits, add an ostream operator for them, and so they can
be easily referenced in debugging output, and have that ostream operator
optionally print code/data/hidden symbols provided by that materialization unit
based on command line options.

llvm-svn: 343323
2018-09-28 15:03:11 +00:00
Simon Pilgrim 428c1196d8 [X86][Btver2] PSUBS/PSUBUS instructions are zero-idioms
Noticed during llvm-exegesis tests, the PSUBS/PSUBUS instructions have the same zero-idiom behaviour to PSUB

llvm-svn: 343321
2018-09-28 14:20:42 +00:00
Luke Cheeseman 21f2955bb2 Reapply changes reverted by r343235
- Add fix so that all code paths that create DWARFContext
  with an ObjectFile initialise the target architecture in the context
- Add an assert that the Arch is known in the Dwarf CallFrameString method

llvm-svn: 343317
2018-09-28 13:37:27 +00:00
Petar Jovanovic ff1bc621a0 [MIPS GlobalISel] Lower i64 arguments
Lower integer arguments larger then 32 bits for MIPS32.
setMostSignificantFirst is used in order for G_UNMERGE_VALUES and
G_MERGE_VALUES to always hold registers in same order, regardless of
endianness.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D52409

llvm-svn: 343315
2018-09-28 13:28:47 +00:00
Simon Pilgrim 66da1ed29d [X86][Btver2] CVTSS2I/CVTSD2I - add missing JFPU0 pipe
We issue JFPU1->JSTC then JFPU0->JFPA then -> JALU0 (integer pipe)

Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343314
2018-09-28 13:19:22 +00:00
Simon Pilgrim 17e5981ebf [X86][Btver2] Fix BSF/BSR schedule
Double throughput to account for 2 pipes + fix BSF's latency/uop counts

Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343311
2018-09-28 10:26:48 +00:00
Florian Hahn 8d72ecc36f Revert r343308: [LoopInterchange] Turn into a loop pass.
llvm-svn: 343310
2018-09-28 10:20:07 +00:00
Florian Hahn 0694c159f7 [LoopInterchange] Turn into a loop pass.
This patch turns LoopInterchange into a loop pass. It now only
considers top-level loops and tries to move the innermost loop to the
optimal position within the loop nest. By only looking at top-level
loops, we might miss a few opportunities the function pass would get
(e.g. if we have a loop nest of 3 loops, in the function pass
we might process loops at level 1 and 2 and move the inner most loop to
level 1, and then we process loops at levels 0, 1, 2 and interchange
again, because we now have a different inner loop). But I think it would
be better to handle such cases by picking the best inner loop from the
start and avoid re-visiting the same loops again.

The biggest advantage of it being a function pass is that it interacts
nicely with the other loop passes. Without this patch, there are some
performance regressions on AArch64 with loop interchanging enabled,
where no loops were interchanged, but we missed out on some other loop
optimizations.

It also removes the SimplifyCFG run. We are just changing branches, so
the CFG should not be more complicated, besides the additional 'unique'
preheaders this pass might create.


Reviewers: chandlerc, efriedma, mcrosier, javed.absar, xbolva00

Reviewed By: xbolva00

Differential Revision: https://reviews.llvm.org/D51702

llvm-svn: 343308
2018-09-28 09:45:50 +00:00
David Spickett ea605913be [ARM] Allow execute only code on Cortex-m23
The NoMovt feature prevents the use of MOVW/MOVT
instructions on Cortex-M23 for performance reasons.
These instructions are required for execute only code
so NoMovt should be disabled when that option is enabled.

Differential Revision: https://reviews.llvm.org/D52551

llvm-svn: 343302
2018-09-28 08:55:19 +00:00
David Spickett a799fe40dc Remove extra whitespace. NFC. (test commit)
llvm-svn: 343301
2018-09-28 08:45:28 +00:00
Oliver Stannard 5f34e9e265 [ARM][v8.5A] Add speculation barriers SSBB and PSSBB
This adds two new barrier instructions which can be used to restrict
speculative execution of load instructions.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52484

llvm-svn: 343300
2018-09-28 08:27:56 +00:00
Simon Pilgrim 280af1c7f0 [X86][BtVer2] Fix PHMINPOS schedule resources typo
PHMINPOS can run on either JFPU pipe

llvm-svn: 343299
2018-09-28 08:21:39 +00:00
Hiroshi Inoue 69bfa40200 [CodeGen] fix broken successor probability in MBB dump
When printing successor probabilities for a MBB, a human readable value is sometimes shown as 200.0%.
The human readable output is based on getProbabilityIterator, which returns 0xFFFFFFFF for getNumerator() and 0x80000000 for getDenominator() for unknown BranchProbability.
By using getSuccProbability as we do for the non-human readable part, we can avoid this problem.

Differential Revision: https://reviews.llvm.org/D52605

llvm-svn: 343297
2018-09-28 05:27:32 +00:00
Craig Topper bb50c38635 [ScalarizeMaskedMemIntrin] Use MinAlign to calculate alignment for the scalar load/stores to handle element types that are byte-sized but not powers of 2.
This pass doesn't handle non-byte sized types correctly at all, but at least we can make byte sized types work.

llvm-svn: 343294
2018-09-28 03:35:37 +00:00
Aaron Smith 757274f9b2 [pdb] Simplify the code by replacing a few string conversions with calls to invokeBstrMethod()
Reviewers: aleksandr.urakov, zturner, llvm-commits

Reviewed By: zturner

Differential Revision: https://reviews.llvm.org/D52624

llvm-svn: 343291
2018-09-28 02:32:07 +00:00
Lang Hames 4328ea3443 [ORC] clang-format the ThreadSafeModule code.
Evidently I forgot to do this before committing r343055.

llvm-svn: 343288
2018-09-28 01:41:33 +00:00
Craig Topper fdf4c76ca0 [ScalarizeMaskedMemIntrin] Fix the alignment calculation for the scalar stores of a masked store expansion.
It should be the minimum of the original alignment and the scalar size.

llvm-svn: 343284
2018-09-28 01:06:13 +00:00
Craig Topper 8b4f0e1b8c [ScalarizeMaskedMemIntrin] Ensure the mask is a vector of ConstantInts before generating the expansion without control flow.
Its possible the mask itself or one of the elements is a ConstantExpr and we shouldn't optimize in that case.

llvm-svn: 343278
2018-09-27 22:31:42 +00:00
Craig Topper 10ec021621 [ScalarizeMaskedMemIntrin] Use cast instead of dyn_cast checked by an assert. Consistently make use of the element type variable we already have. NFCI
cast will take care of asserting internally.

llvm-svn: 343277
2018-09-27 22:31:40 +00:00
Derek Schuff 70ce1af9fa WebAssembly: Rename GetSignature to GetLibcallSignature [NFC]
llvm-svn: 343275
2018-09-27 22:20:33 +00:00
Craig Topper 6911bfe263 [ScalarizeMaskedMemIntrin] When expanding masked gathers, start with the passthru vector and insert the new load results into it.
Previously we started with undef and did a final merge with the passthru at the end.

llvm-svn: 343273
2018-09-27 21:28:59 +00:00
Craig Topper 7d234d6628 [ScalarizeMaskedMemIntrin] When expanding masked loads, start with the passthru value and insert each conditional load result over their element.
Previously we started with undef and did one final merge at the end with a select.

llvm-svn: 343271
2018-09-27 21:28:52 +00:00
Craig Topper dfc0f289fa [ScalarizeMaskedMemIntrin] Handle the case where the mask is an all zero vector.
This shouldn't really happen in practice I hope, but we tried to handle other constant cases. We missed this one because we checked for ConstantVector without realizing that zero becomes ConstantAggregateZero instead.

So instead just check for Constant and use getAggregateElement which will do the dirty work for us.

llvm-svn: 343270
2018-09-27 21:28:46 +00:00
Craig Topper dfe460db57 [ScalarizeMaskedMemIntrin] Remove some temporary variables that are only used by a single if condition.
llvm-svn: 343268
2018-09-27 21:28:41 +00:00
Craig Topper 49dad8b8af [ScalarizeMaskedMemIntrin] Cleanup comments. NFC
llvm-svn: 343267
2018-09-27 21:28:39 +00:00
Lang Hames 8b81395db9 [ORC] Add definition for IRLayer::setCloneToNewContextOnEmit, use it to set the
flag to true in LLJIT when running in multithreaded mode.

The IRLayer::setCloneToNewContextOnEmit method sets a flag within the IRLayer
that causes modules added to that layer to be moved to a new context (by
serializing to/from a memory buffer) when they are emitted. This allows modules
that were all loaded on the same context to be compiled in parallel.

llvm-svn: 343266
2018-09-27 21:13:07 +00:00
Konstantin Zhuravlyov 5f1b8181ad AMDGPU: Split HasExt into HasExtDPP/SDWA/SDWA9
llvm-svn: 343264
2018-09-27 20:49:00 +00:00
Konstantin Zhuravlyov 9da26b20da AMDGPU: Split VOP2Inst into VOP2Inst_e32/e64/sdwa
llvm-svn: 343259
2018-09-27 19:46:41 +00:00
Lang Hames c5192f751a [ORC] Coalesce all of ORC's symbol renaming / linkage-promotion utilities into
one SymbolLinkagePromoter utility.

SymbolLinkagePromoter renames anonymous and private symbols, and bumps all
linkages to at least global/hidden-visibility. Modules whose symbols have been
promoted by this utility can be decomposed into sub-modules without introducing
link errors. This is used by the CompileOnDemandLayer to extract single-function
modules for lazy compilation.

llvm-svn: 343257
2018-09-27 19:27:20 +00:00
Konstantin Zhuravlyov 7d424aae13 AMDGPU/NFC: Simplify VOP_MAC_F16/F32
llvm-svn: 343254
2018-09-27 19:24:05 +00:00
Stanislav Mekhanoshin b080adfc0c [AMDGPU] Fold copy (copy vgpr)
This allows to reduce a number of used VGPRs in some cases.

Differential Revision: https://reviews.llvm.org/D52577

llvm-svn: 343249
2018-09-27 18:55:20 +00:00
Craig Topper 0423681d4a [ScalarizeMaskedMemIntrin] Don't emit 'icmp eq i1 %x, 1' to check mask values. That's just %x so use that directly.
Had we emitted this IR earlier, InstCombine would have removed icmp so I'm going to assume using the i1 directly would be considered canonical.

llvm-svn: 343244
2018-09-27 18:01:48 +00:00
Simon Pilgrim 2a64d393ea [X86] Remove BT/BTC/BTR/BTS rr/ri overrides
llvm-svn: 343241
2018-09-27 17:29:13 +00:00
Simon Pilgrim 86c7b07ecd [X86][Btver2] (V)MPSADBW instructions take 3uops not 1
llvm-svn: 343238
2018-09-27 17:13:57 +00:00
Luke Cheeseman 8e5676b1aa Revert r343192 as an ubsan build is currently failing
llvm-svn: 343235
2018-09-27 16:47:30 +00:00
Simon Pilgrim dd744f158a [X86][Btver2] BTC/BTR/BTS instructions take 2uops not 1
llvm-svn: 343234
2018-09-27 16:39:52 +00:00
Simon Pilgrim 29cf499bca [X86] Split BT and BTC/BTR/BTS scheduler classes
llvm-svn: 343233
2018-09-27 16:24:42 +00:00
Simon Pilgrim 06ccc9d998 [Sparc] EXPENSIVE_CHECKS now passes all machine verifier errors (PR27461)
Now that D51487 has landed, the last machine verifier tests that failed EXPENSIVE_CHECKS builds have now been fixed/removed, so we can remove @MatzeB 's isMachineVerifierClean() hack for sparc targets.

Differential Revision: https://reviews.llvm.org/D52612

llvm-svn: 343232
2018-09-27 16:21:35 +00:00
Oliver Stannard 2721e6f0ed [AArch64] Refactor immediate details out of add/sub tblgen class (NFCI)
Bits [23-22] are used in Add and Sub to specify the shift. The value of the
shift field must be 0x; values of 1x are unallocated. MTE adds some instructions
that use such encodings, and this patch refactors the Add/Sub class so that
another class could derive from this one to implement other encodings and other
formats of bitfields.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52489

llvm-svn: 343231
2018-09-27 16:19:04 +00:00
Oliver Stannard a4f68bf4ad [AArch64][v8.5A] Add speculation barriers SSBB and PSSBB
This adds two new barrier instructions which can be used to restrict
speculative execution of load instructions.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52483

llvm-svn: 343229
2018-09-27 16:09:05 +00:00
Sanjay Patel c3f50ff92e [InstCombine] Without infinites, fold (C / X) < 0.0 --> (X < 0)
When C is not zero and infinites are not allowed (C / X) > 0 is a sign
test. Depending on the sign of C, the predicate must be swapped.

E.g.:
  foo(double X) {
    if ((-2.0 / X) <= 0) ...
  }
 =>
  foo(double X) {
    if (X >= 0) ...
  }

Patch by: @marels (Martin Elshuber)

Differential Revision: https://reviews.llvm.org/D51942

llvm-svn: 343228
2018-09-27 15:59:24 +00:00
Simon Pilgrim c2a88ea64e [X86][Btver2] BLSI/BLSMSK/BLSR instructions take 2uops not 1 (same as TZCNT)
llvm-svn: 343227
2018-09-27 14:57:57 +00:00
Teresa Johnson f24136f17a [WPD] Fix incorrect devirtualization after indirect call promotion
Summary:
Add a dominance check to ensure that the possible devirtualizable
call is actually dominated by the type test/checked load intrinsic being
analyzed. With PGO, after indirect call promotion is performed during
the compile step, followed by inlining, we may have a type test in the
promoted and inlined sequence that allows an indirect call in that
sequence to be devirtualized. That indirect call (inserted by inlining
after promotion) will share the same vtable pointer as the fallback
indirect call that cannot be devirtualized.

Before this patch the code was incorrectly devirtualizing the fallback
indirect call.

See the new test and the example described there for more details.

Reviewers: pcc, vitalybuka

Subscribers: mehdi_amini, Prazek, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D52514

llvm-svn: 343226
2018-09-27 14:55:32 +00:00
Oliver Stannard a9a5eee169 [AArch64][v8.5A] Add Branch Target Identification instructions
This adds new instructions used by the Branch Target Identification
feature. When this is enabled, these are the only instructions which can
be targeted by indirect branch instructions.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52485

llvm-svn: 343225
2018-09-27 14:54:33 +00:00
Oliver Stannard 8459d34e82 [AArch64][v8.5A] Add speculation restriction system registers
This adds some new system registers which can be used to restrict
certain types of speculative execution.

Patch by Pablo Barrio and David Spickett!

Differential revision: https://reviews.llvm.org/D52482

llvm-svn: 343218
2018-09-27 14:05:46 +00:00
Oliver Stannard dc837e3f1f [AArch64][v8.5A] Add Armv8.5-A random number instructions
This adds two new system registers, used to generate random numbers.

This is an optional extension to v8.5-A, and will be controlled by the
"+rng" modifier of the -march= and -mcpu= options.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52481

llvm-svn: 343217
2018-09-27 14:01:40 +00:00
Oliver Stannard 6930b12d53 [AArch64][v8.5A] Add Armv8.5-A "DC CVADP" instruction
This adds a new variant of the DC system instruction for persistent
memory.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52480

llvm-svn: 343216
2018-09-27 13:53:35 +00:00
Oliver Stannard 224428c06a [AArch64][v8.5A] Add prediction invalidation instructions to AArch64
This adds new system instructions which act as barriers to speculative
execution based on earlier execution within a particular execution
context.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52479

llvm-svn: 343214
2018-09-27 13:47:40 +00:00
Oliver Stannard 382c935c42 [ARM][v8.5A] Add speculation barrier to ARM & Thumb instruction sets
This is a new barrier which limits speculative execution of the
instructions following it.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52477

llvm-svn: 343213
2018-09-27 13:41:14 +00:00
Oliver Stannard e481f1d95a [AArch64][v8.5A] Add speculation barrier to AArch64 instruction set
This is a new barrier which limits speculative execution of the
instructions following it.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52476

llvm-svn: 343211
2018-09-27 13:39:06 +00:00
Daniel Cederman 0c05bdea2b [Sparc] Remove the support for builtin setjmp/longjmp
Summary: It is currently broken and for Sparc there is not much benefit
in using a builtin version compared to a library version. Both versions
needs to store the same four values in setjmp and flush the register
windows in longjmp. If the need for a builtin setjmp/longjmp arises there
is an improved implementation available at https://reviews.llvm.org/D50969.

Reviewers: jyknight, joerg, venkatra

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D51487

llvm-svn: 343210
2018-09-27 13:32:54 +00:00
Oliver Stannard ddb7d46aa5 [AArch64][v8.5A] Add FRINT[32,64][Z,X] instructions
These are some new variants of the "Floating-point Round to Integral"
family of instructions, which round to the nearest floating-point value
which fits in a 32- or 64-bit integer.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52475

llvm-svn: 343209
2018-09-27 13:32:06 +00:00
Daniel Cederman b35d3a2733 [Sparc] Add unimp alias
Summary: Use 0 as the default immediate for the UNIMP instruction.
This matches the behavior in gas.

Reviewers: jyknight, venkatra

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D51526

llvm-svn: 343203
2018-09-27 12:34:53 +00:00
Daniel Cederman c1968ba5d3 [Sparc] Add support for the partial write PSR instruction
Summary:
Partial write %PSR (WRPSR) is a SPARC V8e option that allows WRPSR
instructions to only affect the %PSR.ET field. It is supported by
the GR740 and GR716.

Reviewers: jyknight, venkatra

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D48644

llvm-svn: 343202
2018-09-27 12:34:48 +00:00
Simon Pilgrim 98f503a326 [X86][Btver2] TZCNT instructions take 2uops not 1
llvm-svn: 343200
2018-09-27 12:28:47 +00:00
Nemanja Ivanovic a59096759d [PowerPC] [NFC] Refactor code for printing register operands
We have an unfortunate situation in our back end where we have to keep pairs of
functions synchronized. Needless to say that this is not an ideal situation as
it is very difficult to enforce. Even without bugs, it's annoying to have to do
the same thing in two places.

This patch just refactors the code so that the two pairs of those functions that
pertain to printing register operands are unified:
  - stripRegisterPrefix() - this just removes the letter prefixes from registers
    for the InstrPrinter and AsmPrinter. This patch provides this as a static
    member of PPCRegisterInfo
  - Handling of PPCII::UseVSXReg - there are 3 places where we do something
    special for instructions with that flag set. Each of those places does its
    own checking of this flag and implements code customization. Any changes to
    how we print/encode VSX/VMX registers require modifying all 3 places. This
    patch unifies this into a static function in PPCInstrInfo that returns the
    register number adjusted as needed.

Differential revision: https://reviews.llvm.org/D52467

llvm-svn: 343195
2018-09-27 11:49:47 +00:00
Simon Pilgrim 7e4f154e79 [X86][Btver2] Add uops counter for exegesis reports
llvm-svn: 343194
2018-09-27 11:40:26 +00:00
Luke Cheeseman f6844b307a Reapply changes reverted in r343114, lldb patch to follow shortly
llvm-svn: 343192
2018-09-27 10:39:20 +00:00
Hans Wennborg 5008d9014b Revert r342942 "[MachineCopyPropagation] Reimplement CopyTracker in terms of register units"
It seems to have broken several targets, see comments on the llvm-commits thread.

> Change the copy tracker to keep a single map of register units instead
> of 3 maps of registers. This gives a very significant compile time
> performance improvement to the pass. I measured a 30-40% decrease in
> time spent in MCP on x86 and AArch64 and much more significant
> improvements on out of tree targets with more registers.
>
> Differential Revision: https://reviews.llvm.org/D52374

llvm-svn: 343189
2018-09-27 09:59:27 +00:00
Oliver Stannard 31af178f4a [AArch64][v8.5A] Add PSTATE manipulation instructions XAFlag and AXFlag
These new instructions manipluate the NZCV bits, to convert between the
regular Arm floating-point comare format and an alternative format.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52473

llvm-svn: 343187
2018-09-27 09:11:27 +00:00
Simon Atanasyan e58c45a695 [mips] Add support MIPS r6 Debian triples
Debian uses different triples for MIPS r6 and paths. Here we use SubArch
to determine whether it is r6, if we found `r6' in CPU section of triple.

These new triples include:
  mipsisa32r6-linux-gnu
  mipsisa32r6el-linux-gnu
  mipsisa64r6-linux-gnuabi64
  mipsisa64r6el-linux-gnuabi64
  mipsisa64r6-linux-gnuabin32
  mipsisa64r6el-linux-gnuabin32

Patch by YunQiang Su.

Differential revision: https://reviews.llvm.org/D50857

llvm-svn: 343185
2018-09-27 08:51:18 +00:00
Lang Hames b57c9c4a6e [ORC] Use ExecutionSession's pre-constructed main JITDylib in LLJIT.
As of r342086 ExecutionSession automatically creates a 'main' JITDylib, so
there is no need for LLJIT to create its own.

llvm-svn: 343167
2018-09-27 04:19:32 +00:00
Fangrui Song 0cac726a00 llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...)
Summary: The convenience wrapper in STLExtras is available since rL342102.

Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb

Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D52573

llvm-svn: 343163
2018-09-27 02:13:45 +00:00
Yury Delendik b3857e4d35 [WebAssembly] Fix MRI.hasOneNonDBGUse assert in WebAssemblyRegStackify pass
Summary:
The OneUseDominatesOtherUses in the WebAssemblyRegStackify not properly validates register use using hasOneUse. Since we added/modified DBG_VALUE the assert started catching valid cases.

See also https://reviews.llvm.org/D49034#1247200

Fix verified by running the wasm waterfall.

Reviewed By: dschuff

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D49034

llvm-svn: 343154
2018-09-26 23:49:21 +00:00
Florian Hahn 6feb637124 [LoopInterchange] Preserve LCSSA.
This patch extends LoopInterchange to move LCSSA to the right place
after interchanging. This is required for LoopInterchange to become a
function pass.

An alternative to the manual moving of the PHIs, we could also re-form
the LCSSA phis for a set of interchanged loops, but that's more
expensive.

Reviewers: efriedma, mcrosier, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D52154

llvm-svn: 343132
2018-09-26 19:34:25 +00:00
Tom Stellard 344475fce5 AMDGPU/SI: Change predicate to isCIOnly for 32-bit imm s_buffer_load* patterns
Summary:
This is essentially NFC, because the complex pattern used for these patterns
will fail on non-CI, but this makes the pattern consistent with other CI
smrd patterns.  It is also a performance improvement, because the pattern
will now fail earlier on non-CI.

Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D52469

llvm-svn: 343125
2018-09-26 16:53:36 +00:00
Lang Hames f0a3fd885d Reapply r343058 with a fix for -DLLVM_ENABLE_THREADS=OFF.
Modifies lit to add a 'thread_support' feature that can be used in lit test
REQUIRES clauses. The thread_support flag is set if -DLLVM_ENABLE_THREADS=ON
and unset if -DLLVM_ENABLE_THREADS=OFF. The lit flag is used to disable the
multiple-compile-threads-basic.ll testcase when threading is disabled.

llvm-svn: 343122
2018-09-26 16:26:59 +00:00
Simon Pilgrim b0189289bf [DAG] SelectionDAGLegalize::ExpandLegalINT_TO_FP - use getFPExtendOrRound helper. NFCI.
Handles SrcVT == DstVT as well.

llvm-svn: 343121
2018-09-26 16:24:07 +00:00
Oliver Stannard 2905937435 [AArch64] Extend single-operand FP insns to match Arm ARM (NFCI)
The Armv8.3-A reference manual defines floating-point data-processing
instructions with one source operand to have an opcode of 6 bits
[20:15]. The current class in tablegen, BaseSingleOperandFPData, only
allows [18:15]. This was ok because [20:19] could only be '00', with
other encodings unallocated. Armv8.5-A brings in the FRINT group of
instructions which use other values for these bits.

This patch refactors the existing class a bit to allow using the full 6
bits of the opcode, as defined in the Arm ARM.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52474

llvm-svn: 343120
2018-09-26 15:42:47 +00:00
Luke Cheeseman 77aaa22081 Revert r343112 as CallFrameString API change has broken lldb builds
llvm-svn: 343114
2018-09-26 14:48:03 +00:00
Oliver Stannard c5d192b611 [AArch64] Refactor instructions that write PSTATE (NFCI)
Reuse some code in preparation for the v8.5A XAFlag/AXFlag instructions,
which shares part of the encoding of the MSR-immediate.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52472

llvm-svn: 343113
2018-09-26 14:42:59 +00:00
Luke Cheeseman 03ad8812f5 [AArch64] - Return address signing dwarf support
- Reapply r343089 with a fix for DebugInfo/Sparc/gnu-window-save.ll

llvm-svn: 343112
2018-09-26 14:30:29 +00:00
Oliver Stannard 89b1604935 [AArch64][AsmParser] Show name of missing feature for system instructions
Parsing of the system instructions (IC, DC, AT and TLBI) uses this
function to show the required architecture when the operand is valid,
but the architecture is not enabled. Armv8.5A adds a few different
system instructions as part of optional features, so we need to extend
it to show individual features, not just base architectures.

This is NFC for now, but will be used by three different features added
in v8.5A, and will be tested by them.

Patch by David Spickett!

Differential revision: https://reviews.llvm.org/D52478

llvm-svn: 343109
2018-09-26 13:52:27 +00:00
Francis Visoiu Mistrih 6acaa18afc [CodeGen] Always print register ties in MI::dump()
It was the case when calling MO::dump(), but MI::dump() was still
depending on hasComplexRegisterTies().

The MIR output is not affected.

llvm-svn: 343107
2018-09-26 13:33:09 +00:00
Fedor Sergeev a43fd9522d [PassTiming] cleaning up legacy PassTimingInfo interface. NFCI.
During D51276 discussion it was decided that legacy PassTimingInfo
interface can not be reused for new pass manager's implementation
of -time-passes.

This is a cleanup in preparation for D51276 to make legacy interface
as concise as possible, moving the PassTimingInfo from the header
into the anonymous legacy namespace in .cpp.

It is rather close to a revert of rL340872 in a sense that it hides
the interface and gets rid of templates. However as compared to
a complete revert it resides in a different translation unit and has
an additional pass-instance counting funcitonality (PassIDCountMap).

Reviewers: philip.pfaffe
Differential Revision: https://reviews.llvm.org/D52356

llvm-svn: 343104
2018-09-26 13:01:43 +00:00
Hans Wennborg 00b88bbcaf Revert r343089 "[AArch64] - Return address signing dwarf support"
This caused the DebugInfo/Sparc/gnu-window-save.ll test to fail.

> Functions that have signed return addresses need additional dwarf support:
> - After signing the LR, and before authenticating it, the LR register is in a
>   state the is unusable by a debugger or unwinder
> - To account for this a new directive, .cfi_negate_ra_state, is added
> - This directive says the signed state of the LR register has now changed,
>   i.e. unsigned -> signed or signed -> unsigned
> - This directive has the same CFA code as the SPARC directive GNU_window_save
>   (0x2d), adding a macro to account for multiply defined codes
> - This patch matches the gcc implementation of this support:
>   https://patchwork.ozlabs.org/patch/800271/
>
> Differential Revision: https://reviews.llvm.org/D50136

llvm-svn: 343103
2018-09-26 12:57:45 +00:00
Oliver Stannard 7c3c4baa3f [ARM/AArch64][v8.5A] Add Armv8.5-A target
This patch allows targeting Armv8.5-A, adding the architecture to
tablegen and setting the options to be identical to Armv8.4-A for the
time being. Subsequent patches will add support for the different
features included in the Armv8.5-A Reference Manual.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52470

llvm-svn: 343102
2018-09-26 12:48:21 +00:00
Simon Pilgrim e2437689a8 [DAG] ExpandLegalINT_TO_FP - pull out repeated getValueType() call. NFCI.
llvm-svn: 343101
2018-09-26 12:42:19 +00:00
Hiroshi Inoue 20982f0995 [PowerPC] optimize conditional branch on CRSET/CRUNSET
This patch adds a check to optimize conditional branch (BC and BCn) based on a constant set by CRSET or CRUNSET.
Other optimizers, such as block placement, may generate such code and hence
I do this at the very end of the optimization in pre-emit peephole pass.

A conditional branch based on a constant is eliminated or converted into unconditional branch. 
Also CRSET/CRUNSET is eliminated if the condition code register is not used
by instruction other than the branch to be optimized.

Differential Revision: https://reviews.llvm.org/D52345

llvm-svn: 343100
2018-09-26 12:32:45 +00:00
Hans Wennborg 20b5abe23b Revert r343058 "[ORC] Add support for multithreaded compiles to LLJIT and LLLazyJIT."
This doesn't work well in builds configured with LLVM_ENABLE_THREADS=OFF,
causing the following assert when running
ExecutionEngine/OrcLazy/multiple-compile-threads-basic.ll:

  lib/ExecutionEngine/Orc/Core.cpp:1748: Expected<llvm::JITEvaluatedSymbol>
  llvm::orc::lookup(const llvm::orc::JITDylibList &, llvm::orc::SymbolStringPtr):
  Assertion `ResultMap->size() == 1 && "Unexpected number of results"' failed.

> LLJIT and LLLazyJIT can now be constructed with an optional NumCompileThreads
> arguments. If this is non-zero then a thread-pool will be created with the
> given number of threads, and compile tasks will be dispatched to the thread
> pool.
>
> To enable testing of this feature, two new flags are added to lli:
>
> (1) -compile-threads=N (N = 0 by default) controls the number of compile threads
> to use.
>
> (2) -thread-entry can be used to execute code on additional threads. For each
> -thread-entry argument supplied (multiple are allowed) a new thread will be
> created and the given symbol called. These additional thread entry points are
> called after static constructors are run, but before main.

llvm-svn: 343099
2018-09-26 12:15:23 +00:00