Commit Graph

120378 Commits

Author SHA1 Message Date
Teresa Johnson 4bdf82ce79 [SamplePGO] Minor efficiency improvement in samplePGO ICP
Summary:
When attaching prof metadata to promoted direct calls in SamplePGO
mode, no need to construct and use a SmallVector to pass a single count
to the ArrayRef parameter, we can simply use a brace-enclosed init list.

This made a small but consistent improvement for a ThinLTO backend
compile I was measuring.

Reviewers: wmi

Subscribers: mehdi_amini, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57706

llvm-svn: 353123
2019-02-05 00:18:38 +00:00
Matt Arsenault 81511e5428 GlobalISel: Implement narrowScalar for select
Don't handle vector conditions.

I think this can be merged in the future with
fewerElementsVectorSelect, although this becomes slightly tricky with
a vector condition.

llvm-svn: 353122
2019-02-05 00:13:44 +00:00
Matt Arsenault 24f14993e8 GlobalISel: Combine g_extract with g_merge_values
Try to use the underlying source registers.

This enables legalization in more cases where some irregular
operations are widened and others narrowed.

This seems to make the test_combines_2 AArch64 test worse, since the
MERGE_VALUES has multiple uses. Since this should be required for
legalization, a hasOneUse check is probably inappropriate (or maybe
should only be used if the merge is legal?).

llvm-svn: 353121
2019-02-04 23:41:59 +00:00
Evandro Menezes 98f356cd74 Revert "[PATCH] [TargetLibraryInfo] Update run time support for Windows"
This reverts accidental commit ff5527718d.

llvm-svn: 353118
2019-02-04 23:34:50 +00:00
Evandro Menezes ff5527718d [PATCH] [TargetLibraryInfo] Update run time support for Windows
It seems that the run time for Windows has changed and supports more math
functions than before.  Since LLVM requires at least VS2015, I assume that
this is the run time that would be redistributed with programs built with
Clang.  Thus, I based this update on the header file `math.h` that
accompanies it.

This patch addresses the PR40541.  Unfortunately, I have no access to a
Windows development environment to validate it.

llvm-svn: 353114
2019-02-04 23:29:41 +00:00
Matt Arsenault 1f795e2c2a GlobalISel: Enforce operand types for constants
A number of of tests were using imm operands, not cimm. Since CSE
relies on the exact ConstantInt* pointer used, and implicit
conversions are generally evil, also enforce the bitsize of the types.

llvm-svn: 353113
2019-02-04 23:29:31 +00:00
Matt Arsenault f2a26339e2 GlobalISel: Verify g_select
Factor the common vector element consistency check many instructions
need out, although this makes the error messages worse.

llvm-svn: 353112
2019-02-04 23:29:16 +00:00
Matt Arsenault 46f9c6cf0b MachineVerifier: Move verification of G_* instructions to function
llvm-svn: 353111
2019-02-04 23:29:11 +00:00
Sam Clegg 313f9f54f5 [WebAssembly] MC: Mark more function aliases as functions
Aliases of functions are now marked as function symbols even if
they are bitcast to some other other non-function type.
This is important for WebAssembly where object and function
symbols can't alias each other.

Fixes PR38866

Differential Revision: https://reviews.llvm.org/D57538

llvm-svn: 353109
2019-02-04 23:07:34 +00:00
Matt Arsenault 8a59b1919c MIR: Validate LLT types when parsing
llvm-svn: 353107
2019-02-04 22:59:56 +00:00
Matt Arsenault 3d6a49b0b9 GlobalISel: Fix not calling observer when legalizing bitcount ops
This was hiding bugs from never legalizing the source type.

llvm-svn: 353102
2019-02-04 22:26:33 +00:00
Matt Arsenault cba0c6d0c9 AMDGPU: Don't rematerialize mov with implicit operands
This was pulling the mov used for register indexing on gfx9 out of the
loop.

llvm-svn: 353101
2019-02-04 22:26:21 +00:00
Julian Lettner 29ac3a5b82 [SanitizerCoverage] Clang crashes if user declares `__sancov_lowest_stack` variable
Summary:
If the user declares or defines `__sancov_lowest_stack` with an
unexpected type, then `getOrInsertGlobal` inserts a bitcast and the
following cast fails:
```
Constant *SanCovLowestStackConstant =
       M.getOrInsertGlobal(SanCovLowestStackName, IntptrTy);
SanCovLowestStack = cast<GlobalVariable>(SanCovLowestStackConstant);
```

This variable is a SanitizerCoverage implementation detail and the user
should generally never have a need to access it, so we emit an error
now.

rdar://problem/44143130

Reviewers: morehouse

Differential Revision: https://reviews.llvm.org/D57633

llvm-svn: 353100
2019-02-04 22:06:30 +00:00
Nicolai Haehnle a69146e67e [InstCombine] Cleanup the TFE/LWE check in AMDGPU SimplifyDemanded
Summary:
The fix added in r352904 is not quite correct, or rather misleading:

1. When the texfailctrl (TFC) argument was non-constant, the fix assumed
   non-TFE/LWE, which is incorrect.

2. Regardless, this code path cannot even be hit for correct
   TFE/LWE-enabled calls, because those return a struct. Added
   a test case for those for completeness.

Change-Id: I92d314dbc67a2670f6d7adaab765ef45f56a49cf

Reviewers: hliao, dstuttard, arsenm

Subscribers: kzhuravl, jvesely, wdng, yaxunl, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57681

llvm-svn: 353097
2019-02-04 21:24:19 +00:00
Craig Topper c45e39b35f [CodeGen][ARC][SystemZ][WebAssembly] Use MachineInstr::isInlineAsm in more places instead of just comparing opcode. NFCI
I'm looking at adding a second INLINEASM opcode for better modeling asm-goto
as a terminator. Using the existing predicate will reduce teh number of
places that will need to use the new opcode.

llvm-svn: 353095
2019-02-04 21:24:13 +00:00
Philip Pfaffe 0ee6a933ce [NewPM][MSan] Add Options Handling
Summary: This patch enables passing options to msan via the passes pipeline, e.e., -passes=msan<recover;kernel;track-origins=4>.

Reviewers: chandlerc, fedor.sergeev, leonardchan

Subscribers: hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57640

llvm-svn: 353090
2019-02-04 21:02:49 +00:00
Wolfgang Pieb 90d856cd5f [DEBUGINFO] Reposting r352642: Handle restore instructions in LiveDebugValues
The LiveDebugValues pass recognizes spills but not restores, which can
cause large gaps in location information for some variables, depending
on control flow. This patch make LiveDebugValues recognize restores and
generate appropriate DBG_VALUE instructions.

This patch was posted previously with r352642 and reverted in r352666 due
to buildbot errors. A missing return statement was the cause for the 
failures.

Reviewers: aprantl, NicolaPrica

Differential Revision: https://reviews.llvm.org/D57271

llvm-svn: 353089
2019-02-04 20:42:45 +00:00
Scott Linder d19d197221 [AMDGPU] Support emitting GOT relocations for function calls
Differential Revision: https://reviews.llvm.org/D57416

llvm-svn: 353083
2019-02-04 20:00:07 +00:00
Michael Kruse 70560a0a2c [WarnMissedTransforms] Do not warn about already vectorized loops.
LoopVectorize adds llvm.loop.isvectorized, but leaves
llvm.loop.vectorize.enable. Do not consider such a loop for user-forced
vectorization since vectorization already happened -- by prioritizing
llvm.loop.isvectorized except for TM_SuppressedByUser.

Fixes http://llvm.org/PR40546

Differential Revision: https://reviews.llvm.org/D57542

llvm-svn: 353082
2019-02-04 19:55:59 +00:00
Matt Arsenault 8121ec26c0 GlobalISel: Fix CSE handling of buildConstant
This fixes two problems with CSE done in buildConstant. First, this
would hit an assert when used with a vector result type. Solve this by
allowing CSE on the vector elements, but not on the result vector for
now.

Second, this was also performing the CSE based on the input
ConstantInt pointer. The underlying buildConstant could potentially
convert the constant depending on the result type, giving in a
different ConstantInt*. Stop allowing the APInt and ConstantInt forms
from automatically casting to the result type to avoid any similar
problems in the future.

llvm-svn: 353077
2019-02-04 19:15:50 +00:00
Heejin Ahn 18c56a0762 [WebAssembly] clang-tidy (NFC)
Summary:
This patch fixes clang-tidy warnings on wasm-only files.
The list of checks used is:
`-*,clang-diagnostic-*,llvm-*,misc-*,-misc-unused-parameters,readability-identifier-naming,modernize-*`
(LLVM's default .clang-tidy list is the same except it does not have
`modernize-*`. But I've seen in multiple CLs in LLVM the modernize style
was recommended and code was fixed based on the style, so I added it as
well.)

The common fixes are:
- Variable names start with an uppercase letter
- Function names start with a lowercase letter
- Use `auto` when you use casts so the type is evident
- Use inline initialization for class member variables
- Use `= default` for empty constructors / destructors
- Use `using` in place of `typedef`

Reviewers: sbc100, tlively, aardappel

Subscribers: dschuff, sunfish, jgravelle-google, yurydelendik, kripken, MatzeB, mgorny, rupprecht, llvm-commits

Differential Revision: https://reviews.llvm.org/D57500

llvm-svn: 353075
2019-02-04 19:13:39 +00:00
Roman Lebedev b7ecc9b624 [X86] X86DAGToDAGISel::matchBitExtract(): prepare 'control' in 32 bits
Summary:
Noticed while looking at D56052.
```
  // The 'control' of BEXTR has the pattern of:
  // [15...8 bit][ 7...0 bit] location
  // [ bit count][     shift] name
  // I.e. 0b000000011'00000001 means  (x >> 0b1) & 0b11
```
I.e. we do not care about any of the bits aside from the low 16 bits.
So there is no point in doing the `slh`,`or` in 64 bits,
let's just do everything in 32 bits, and anyext if needed.

We could do that in 16 even, but we intentionally don't
zext to i16 (longer encoding IIRC),
so i'm guessing the same applies here.

Reviewers: craig.topper, andreadb, RKSimon

Reviewed By: craig.topper

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D56715

llvm-svn: 353073
2019-02-04 19:04:26 +00:00
David Callahan fd3e7a9320 Adjust cardinality of internal inliner thresholds
Summary:
While compiling openJDK11 (also other workloads), some make files would pass both  CFLAGS  and LDFLAGS at link step ; resulting in duplicate options on the command line when one is using LTO and trying to influence the inliner. Most of the internal flags are ZeroOrMore, this diff changes the remaining ones.

Reviewers: david2050, twoh, modocache

Reviewed By: twoh

Subscribers: mehdi_amini, dexonsmith, eraman, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57537

Patch by: Abdoul-Kader Keita

llvm-svn: 353071
2019-02-04 18:46:25 +00:00
Craig Topper 8ea72a8201 [X86] Add ST0 as an implicit def/use of x87 load/store instructions during FP stackifying.
These instructions implicitly operate on ST0, but we don't currently add that information to the MachineInstr. We also don't add it the tablegen definitions either.

For the most part this doesn't cause any problems because the stackifying occurs after register allocation. All the instructions are marked as having side effects so the postRA scheduler won't reorder them amongst themselves.

But nothing stops inline assembly using X87 instructions from being reordered around other x87 instructions if that inline assembly wasn't marked volatile.

The two test cases I've identified so far in PR40539 involve loads and stores used to set up the inline assembly or capture the results of the inline assembly ending up in the wrong order.

This patch adds implicit ST0 uses/defs to the load/store instructions to prevent this from happening.

I plan to fix all of the FP instructions, but the binops are bit trickier to get right. So I've chosen fixing the known test cases as a good first step.

I think we also need to update the tablegen descriptions so MS inline assembly infers the right clobbers, but I haven't checked that yet.

Differential Revision: https://reviews.llvm.org/D57644

llvm-svn: 353070
2019-02-04 18:43:55 +00:00
Matt Arsenault 0723828675 GlobalISel: Fix moreElementsToNextPow2
This was completely broken. The condition was inverted, and changed
the element type for vectors of pointers.

Fixes bug 40592.

llvm-svn: 353069
2019-02-04 18:42:24 +00:00
Wouter van Oortmerssen 0b3cf247c4 [WebAssembly] Make segment/size/type directives optional in asm
Summary:
These were "boilerplate" that repeated information already present
in .functype and end_function, that needed to be repeated to Please
the particular way our object writing works, and missing them would
generate errors.

Instead, we generate the information for these automatically so the
user can concern itself with writing more canonical wasm functions
that always work as expected.

Reviewers: dschuff, sbc100

Subscribers: jgravelle-google, aheejin, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D57546

llvm-svn: 353067
2019-02-04 18:03:11 +00:00
Jessica Paquette 834bded9d6 Revert "[GlobalISel] Add IRTranslator support for G_FFLOOR"
This reverts commit 8bbd570fd5205a04d88d2e5513a6e4adbd028039.

Apparently adding ffloor breaks AMDGPU somehow, so I need to back this out
while I look into it.

llvm-svn: 353064
2019-02-04 17:32:43 +00:00
Sam Clegg d1152a267c [WebAssembly] Rename relocations from R_WEBASSEMBLY_ to R_WASM_
See https://github.com/WebAssembly/tool-conventions/pull/95.

This is less typing and IMHO more readable, and it also fits with
our naming around the binary format which tends to use the short name.
e.g.

include/llvm/BinaryFormat/Wasm.h
tools/llvm-objdump/WasmDump.cpp
etc..

Differential Revision: https://reviews.llvm.org/D57611

llvm-svn: 353062
2019-02-04 17:28:46 +00:00
Craig Topper bf7593ec4a [X86] Print all register forms of x87 fadd/fsub/fdiv/fmul as having two arguments where on is %st.
All of these instructions consume one encoded register and the other register is %st. They either write the result to %st or the encoded register. Previously we printed both arguments when the encoded register was written. And we printed one argument when the result was written to %st. For the stack popping forms the encoded register is always the destination and we didn't print both operands. This was inconsistent with gcc and objdump and just makes the output assembly code harder to read.

This patch changes things to always print both operands making us consistent with gcc and objdump. The parser should still be able to handle the single register forms just as it did before. This also matches the GNU assembler behavior.

llvm-svn: 353061
2019-02-04 17:28:18 +00:00
Leonard Chan 68d428e578 [Intrinsic] Unsigned Fixed Point Multiplication Intrinsic
Add an intrinsic that takes 2 unsigned integers with the scale of them
provided as the third argument and performs fixed point multiplication on
them.

This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.

Differential Revision: https://reviews.llvm.org/D55625

llvm-svn: 353059
2019-02-04 17:18:11 +00:00
Jessica Paquette 73158e7201 [GlobalISel] Add IRTranslator support for G_FFLOOR
Follow-up to https://reviews.llvm.org/D57484

Adds G_FFLOOR to translateKnownIntrinsic and update arm64-irtranslator.ll.

Differential Revision: https://reviews.llvm.org/D57485

llvm-svn: 353058
2019-02-04 17:15:34 +00:00
Sanjay Patel c00bdab4c8 [CGP] use IRBuilder to simplify code
This is no-functional-change-intended although there could
be intermediate variations caused by a difference in the
debug info produced by setting that from the builder's 
insertion point. 

I'm updating the IR test file associated with this code just
to show that the naming differences from using the builder
are visible.

The motivation for adding a helper function is that we are
likely to extend this code to deal with other overflow ops.

llvm-svn: 353056
2019-02-04 16:30:46 +00:00
James Henderson 9652652a32 [CommandLine] Don't print empty sentinel values from EnumValN lists in help text
In order to make an option value truly optional, both the ValueOptional
attribute and an empty-named value are required. Prior to this change,
this empty-named value appears in the command-line help text:

-some-option - some help text
  =v1        - description 1
  =v2        - description 2
  =          -

This change improves the help text for these sort of options in a number
of ways:

1) ValueOptional options with an empty-named value now print their help
   text twice: both without and then with '=<value>' after the name. The
   latter version then lists the allowed values after it.
2) Empty-named values with no help text in ValueOptional options are not
   listed in the permitted values.

-some-option         - some help text
-some-option=<value> - some help text
  =v1                - description 1
  =v2                - description 2

3) Otherwise empty-named options are printed as =<empty> rather than
   simply '='.
4) Option values without help text do not have the '-' separator
   printed.

-some-option=<value> - some help text
  =v1                - description 1
  =v2
  =<empty>           - description

It also tweaks the llvm-symbolizer -functions help text to not print a
trailing ':' as that looks bad combined with 1) above.

This is mostly a reland of r353048 which in turn was a reland of
r352750.

Reviewed by: ruiu, thopre, mstorsjo

Differential Revision: https://reviews.llvm.org/D57030

llvm-svn: 353053
2019-02-04 16:17:57 +00:00
Simon Pilgrim 6e5350a367 [X86][SSE] SimplifyDemandedBitsForTargetNode - PCMPGT(0,X) sign mask
For PCMPGT(0, X) patterns where we only demand the sign bit (e.g. BLENDV or MOVMSK) then we can use X directly.

Differential Revision: https://reviews.llvm.org/D57667

llvm-svn: 353051
2019-02-04 15:43:36 +00:00
James Henderson c9e6861a76 Revert r353048.
It was causing unexpected unit test failures on build bots.

llvm-svn: 353050
2019-02-04 15:09:58 +00:00
James Henderson d90b5a2e51 [CommandLine] Don't print empty sentinel values from EnumValN lists in help text
In order to make an option value truly optional, both the ValueOptional
attribute and an empty-named value are required. Prior to this change,
this empty-named value appears in the command-line help text:

-some-option - some help text
  =v1        - description 1
  =v2        - description 2
  =          -

This change improves the help text for these sort of options in a number
of ways:

1) ValueOptional options with an empty-named value now print their help
   text twice: both without and then with '=<value>' after the name. The
   latter version then lists the allowed values after it.
2) Empty-named values with no help text in ValueOptional options are not
   listed in the permitted values.

-some-option         - some help text
-some-option=<value> - some help text
  =v1                - description 1
  =v2                - description 2

3) Otherwise empty-named options are printed as =<empty> rather than
   simply '='.
4) Option values without help text do not have the '-' separator
   printed.

-some-option=<value> - some help text
  =v1                - description 1
  =v2
  =<empty>           - description

It also tweaks the llvm-symbolizer -functions help text to not print a
trailing ':' as that looks bad combined with 1) above.

This is mostly a reland of r352750.

Reviewed by: ruiu, thopre, mstorsjo

Differential Revision: https://reviews.llvm.org/D57030

llvm-svn: 353048
2019-02-04 14:48:33 +00:00
Matt Arsenault 56edf3f344 GlobalISel: Fix formatting of debug output
There was a missing space before the instruction name, and the newline
is redundant since MI::print by default adds one.

llvm-svn: 353046
2019-02-04 14:05:33 +00:00
Matt Arsenault 10547230f3 AMDGPU/GlobalISel: Legalize select for v4s16
Also add some more select tests to help show future legalization
changes.

llvm-svn: 353045
2019-02-04 14:04:52 +00:00
Simon Pilgrim a536b89fe0 [DAGCombine] Add ADD(SUB,SUB) combines
Noticed while investigating PR40483, and fixes the basic test case from the bug - but not a more general case.

We're pretty weak at dealing with ADD/SUB combines compared to the SimplifyAssociativeOrCommutative/SimplifyUsingDistributiveLaws abilities that InstCombine can manage.

llvm-svn: 353044
2019-02-04 13:44:49 +00:00
Andrea Di Biagio edbf06a767 [AsmPrinter] Remove hidden flag -print-schedule.
This patch removes hidden codegen flag -print-schedule effectively reverting the
logic originally committed as r300311
(https://llvm.org/viewvc/llvm-project?view=revision&revision=300311).

Flag -print-schedule was originally introduced by r300311 to address PR32216
(https://bugs.llvm.org/show_bug.cgi?id=32216). That bug was about adding "Better
testing of schedule model instruction latencies/throughputs".

These days, we can use llvm-mca to test scheduling models. So there is no longer
a need for flag -print-schedule in LLVM. The main use case for PR32216 is
now addressed by llvm-mca.
Flag -print-schedule is mainly used for debugging purposes, and it is only
actually used by x86 specific tests. We already have extensive (latency and
throughput) tests under "test/tools/llvm-mca" for X86 processor models. That
means, most (if not all) existing -print-schedule tests for X86 are redundant.

When flag -print-schedule was first added to LLVM, several files had to be
modified; a few APIs gained new arguments (see for example method
MCAsmStreamer::EmitInstruction), and MCSubtargetInfo/TargetSubtargetInfo gained
a couple of getSchedInfoStr() methods.

Method getSchedInfoStr() had to originally work for both MCInst and
MachineInstr. The original implmentation of getSchedInfoStr() introduced a
subtle layering violation (reported as PR37160 and then fixed/worked-around by
r330615).
In retrospect, that new API could have been designed more optimally. We can
always query MCSchedModel to get the latency and throughput. More importantly,
the "sched-info" string should not have been generated by the subtarget.
Note, r317782 fixed an issue where "print-schedule" didn't work very well in the
presence of inline assembly. That commit is also reverted by this change.

Differential Revision: https://reviews.llvm.org/D57244

llvm-svn: 353043
2019-02-04 12:51:26 +00:00
Simon Pilgrim 9899967464 Use auto for dyn_cast case to save a line. NFCI.
llvm-svn: 353041
2019-02-04 12:32:39 +00:00
David Green b4f36a2196 [ARM] Mark 255 and 65535 as cheap for Thumb1 "And"
This prevents Constant Hoisting from pulling the constant out of the block,
allowing us to still produce LDRH/UXTH nodes. LDRB/UXTB (255) is already cheap
by the default getIntImmCost, but I've added it for clarity.

Differential Revision: https://reviews.llvm.org/D57671

llvm-svn: 353040
2019-02-04 11:58:48 +00:00
Max Kazantsev 56b57e3f53 [NFC] Make a check in GuardWidening more obvious
llvm-svn: 353038
2019-02-04 10:41:17 +00:00
Max Kazantsev 09802f41cc [NFC] Rename variables to reflect the actual status of GuardWidening
llvm-svn: 353036
2019-02-04 10:31:18 +00:00
Max Kazantsev 13ab5cbb64 [NFC] Remove redundant parameters for better readability
llvm-svn: 353034
2019-02-04 10:20:51 +00:00
Max Kazantsev 65970aa24d [NFC] Replace equivalent condition for better readability
llvm-svn: 353032
2019-02-04 09:55:18 +00:00
Clement Courbet 1bb0e5ccfb [SelectionDAG] Add a BaseIndexOffset::print() method for debugging.
llvm-svn: 353028
2019-02-04 09:30:43 +00:00
Max Kazantsev 437ee05885 [SCEV] Do not bother creating separate SCEVUnknown for unreachable nodes
Currently, SCEV creates SCEVUnknown for every node of unreachable code. If we
have a huge amounts of such code, we will be littering SE with these nodes. We could
just state that they all are undef and save some memory.

Differential Revision: https://reviews.llvm.org/D57567
Reviewed By: sanjoy

llvm-svn: 353017
2019-02-04 05:04:19 +00:00
Craig Topper b5e945c260 Recommit r352660 "[X86] Mark EMMS and FEMMS as clobbering MM0-7 and ST0-7."
We now print ST0 as 'st' when generating the clobber list for MS inline assembly in clang. This matches what the gcc reg name list expects.

Original commit message:

This fixes the test case in PR35982 by preventing MMX instructions that read MM0-7 from being moved below EMMS/FEMMS by the post RA scheduler.

Though as discussed in bugzilla, this is not a complete fix. There is still the possibility of reordering in IR or by the pre-RA scheduler.

Differential Revision: https://reviews.llvm.org/D57298

llvm-svn: 353016
2019-02-04 04:44:20 +00:00
Craig Topper 7a2944efe1 [X86] Print %st(0) as %st when its implicit to the instruction. Continue printing it as %st(0) when its encoded in the instruction.
This is a step back from the change I made in r352985. This appears to be more consistent with gcc and objdump behavior.

llvm-svn: 353015
2019-02-04 04:15:10 +00:00
Craig Topper f77b858dc3 Revert r352985 "[X86] Print %st(0) as %st to match what gcc inline asm uses as the clobber name to make MS inline asm work correctly"
Looking into gcc and objdump behavior more this was overly aggressive. If the register is encoded in the instruction we should print %st(0), if its implicit we should print %st.

I'll be making a more directed change in a future patch.

llvm-svn: 353013
2019-02-04 04:15:02 +00:00
Davide Italiano 73929c4d24 [LoopIdiomRecognize] @llvm.dbg values shouldn't affect the transformation.
Summary: PR40564

Reviewers: aprantl, rnk

Subscribers: llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57629

llvm-svn: 353007
2019-02-03 20:33:20 +00:00
Sanjay Patel 84ceae6048 [CGP] adjust target constraints for forming uaddo
There are 2 changes visible here:
1. There's no reason to limit this transform based on number
   of condition registers. That diff allows PPC to produce 
   slightly better (dot-instructions should be generally good) 
   code.
   Note: someone that cares about PPC codegen might want to 
   look closer at that output because it seems like we could
   still improve this.

2. We (probably?) should not bother trying to form uaddo (or
   other overflow ops) when there's no target support for such
   an op. This goes beyond checking whether the op is expanded
   because both PPC and AArch64 show better codegen for standard
   types regardless of whether the op is legal/custom.

llvm-svn: 353001
2019-02-03 17:53:09 +00:00
Simon Pilgrim 1fce5a8b75 [X86][AVX] Support shuffle combining for VBROADCAST with smaller vector sources
getTargetShuffleMask can only do this safely if we're extracting the lowest subvector from a vector of the same result type.

llvm-svn: 352999
2019-02-03 16:51:33 +00:00
Simon Pilgrim 18b73a655b [X86][AVX] Support shuffle combining for VPMOVZX with smaller vector sources
llvm-svn: 352997
2019-02-03 16:10:18 +00:00
Simon Pilgrim a2a3e5b811 [X86][AVX] More aggressively simplify BROADCAST source operand
Aim to use scalar source or lowest 128-bit vector directly.

We're still missing some VZMOVL_LOAD combines.

llvm-svn: 352994
2019-02-03 14:39:41 +00:00
Sanjay Patel 00fcc74e50 [CGP] refactor optimizeCmpExpression (NFCI)
This is not truly NFC because we are bailing out without
a TLI now. That should not be a real concern though because
there should be a TLI in any real-world scenario. 

That seems better than passing around a pointer and then 
checking it for null-ness all over the place.

The motivation is to fix what appears to be an unintended
restriction on the uaddo transform - 
hasMultipleConditionRegisters() shouldn't be reason to limit 
the transform.

llvm-svn: 352988
2019-02-03 13:48:03 +00:00
Philip Pfaffe 9438585fe4 [DA][NewPM] Handle transitive dependencies in the new-pm version of DA
Summary:
The analysis result of DA caches pointers to AA, SCEV, and LI, but it
never checks for their invalidation. Fix that.

Reviewers: chandlerc, dmgreen, bogner

Reviewed By: dmgreen

Subscribers: hiraditya, bollu, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D56381

llvm-svn: 352986
2019-02-03 12:25:41 +00:00
Craig Topper 5a570dd437 [X86] Print %st(0) as %st to match what gcc inline asm uses as the clobber name to make MS inline asm work correctly
Summary:
When calculating clobbers for MS style inline assembly we fail if the asm clobbers stack top because we print st(0) and try to pass it through the gcc register name check. This was found with when I attempted to make a emms/femms clobber all ST registers. If you use emms/femms in MS inline asm we would try to use st(0) as the clobber name but clang would think that wasn't a valid clobber name.

This also matches what objdump disassembly prints. It's also what is printed by gcc -S.

Reviewers: RKSimon, rnk, efriedma, spatel, andreadb, lebedev.ri

Reviewed By: rnk

Subscribers: eraman, gbedwell, lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D57621

llvm-svn: 352985
2019-02-03 07:53:39 +00:00
Craig Topper 950ca192f6 [X86] Lower ISD::UADDO to use the Z flag instead of C flag when the RHS is a constant 1 to encourage INC formation.
Summary:
Add an additional combine to combineCarryThroughADD to reverse it back to the C flag to avoid regressions.

I believe this catches the cases that D57547 got.

Reviewers: RKSimon, spatel

Reviewed By: spatel

Subscribers: javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57637

llvm-svn: 352984
2019-02-03 07:25:06 +00:00
Fangrui Song b21ed3c57a [AMDGPU] Fix -Wunused-variable after rL352978
llvm-svn: 352982
2019-02-03 03:51:52 +00:00
Dmitry Venikov aaa709f2ec [InstSimplify] Missed optimization in math expression: log10(pow(10.0,x)) == x, log2(pow(2.0,x)) == x
Summary: This patch enables folding following instructions under -ffast-math flag: log10(pow(10.0,x)) -> x, log2(pow(2.0,x)) -> x

Reviewers: hfinkel, spatel, efriedma, craig.topper, zvi, majnemer, lebedev.ri

Reviewed By: spatel, lebedev.ri

Subscribers: lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D41940

llvm-svn: 352981
2019-02-03 03:48:30 +00:00
Matt Arsenault 888aa5dedd GlobalISel: Implement widenScalar for G_UNMERGE_VALUES
For the scalar case only.

Also move the similar G_MERGE_VALUES handling to a separate function
and cleanup to make them look more similar.

llvm-svn: 352979
2019-02-03 00:07:33 +00:00
Matt Arsenault 0e5d856eb8 GlobalISel: Implement widenScalar for G_EXTRACT vector sources
Handle the basic element extract case.

llvm-svn: 352978
2019-02-02 23:56:00 +00:00
Matt Arsenault eb2603cfb2 AMDGPU/GlobalISel: Avoid reporting illegal extloads as legal
This avoids breaking a test in a future commit.

llvm-svn: 352977
2019-02-02 23:39:13 +00:00
Matt Arsenault 58f9d3df97 AMDGPU/GlobalISel: Legalize icmp for pointer types
llvm-svn: 352976
2019-02-02 23:35:15 +00:00
Matt Arsenault 2065c94dd3 AMDGPU/GlobalISel: Legalize constant for pointer types
llvm-svn: 352975
2019-02-02 23:33:49 +00:00
Matt Arsenault 2491f82679 AMDGPU/GlobalISel: Legalize select for pointer types
llvm-svn: 352974
2019-02-02 23:31:50 +00:00
Matt Arsenault cbaada6bc1 GlobalISel: Legalization for inttoptr/ptrtoint
llvm-svn: 352973
2019-02-02 23:29:55 +00:00
Simon Pilgrim dbf302c9f1 [X86][AVX] Enable INSERT_SUBVECTOR(SRC0, SHUFFLE(SRC1)) shuffle combining
Push the insert_subvector up through the shuffle operands to help find more cross-lane shuffles.

The is exposes a couple of minor issues that will be fixed shortly:
Missed broadcast folds - we have a mixture of vzext_load lengths that need cleaning up
combine-sdiv.ll - AVX1 SimplifyDemandedVectorElts failure (hits max depth due to a couple of extra bitcasts).

llvm-svn: 352963
2019-02-02 18:08:04 +00:00
Simon Pilgrim bd42f97946 [SDAG] Add SDNode/SDValue getConstantOperandAPInt helper. NFCI.
We already have the getConstantOperandVal helper which returns a uint64_t, but along comes the fuzzer and inserts a i128 -1 constant or something and the whole thing asserts.......

I've updated a few obvious cases, and tried to make use of the const reference where possible, but there's more to do. A number of existing oss-fuzz tickets should be fixed if we start using APInt and perform value clamping where necessary.

llvm-svn: 352961
2019-02-02 17:35:06 +00:00
Florian Hahn dd2ef0af46 [LCSSA] Handle case with single new PHI faster.
If there is only a single available value, all uses must be dominated by
the single value and there is no need to search for a reaching
definition.

This drastically speeds up LCSSA in some cases. For the test case
from PR37202, it speeds up LCSSA construction by 4 times.

Time-passes without this patch for test case from PR37202:

    Total Execution Time: 29.9285 seconds (29.9276 wall clock)

    ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
    5.2786 ( 17.7%)   0.0021 (  1.2%)   5.2806 ( 17.6%)   5.2808 ( 17.6%)  Unswitch loops
    4.3739 ( 14.7%)   0.0303 ( 18.1%)   4.4042 ( 14.7%)   4.4042 ( 14.7%)  Loop-Closed SSA Form Pass
    4.2658 ( 14.3%)   0.0192 ( 11.5%)   4.2850 ( 14.3%)   4.2851 ( 14.3%)  Loop-Closed SSA Form Pass #2
    2.2307 (  7.5%)   0.0013 (  0.8%)   2.2320 (  7.5%)   2.2318 (  7.5%)  Loop Invariant Code Motion
    2.0888 (  7.0%)   0.0012 (  0.7%)   2.0900 (  7.0%)   2.0897 (  7.0%)  Unroll loops
    1.6761 (  5.6%)   0.0013 (  0.8%)   1.6774 (  5.6%)   1.6774 (  5.6%)  Value Propagation
    1.3686 (  4.6%)   0.0029 (  1.8%)   1.3716 (  4.6%)   1.3714 (  4.6%)  Induction Variable Simplification
    1.1457 (  3.8%)   0.0010 (  0.6%)   1.1468 (  3.8%)   1.1468 (  3.8%)  Loop-Closed SSA Form Pass #4
    1.1384 (  3.8%)   0.0005 (  0.3%)   1.1389 (  3.8%)   1.1389 (  3.8%)  Loop-Closed SSA Form Pass #6
    1.1360 (  3.8%)   0.0027 (  1.6%)   1.1387 (  3.8%)   1.1387 (  3.8%)  Loop-Closed SSA Form Pass #5
    1.1331 (  3.8%)   0.0010 (  0.6%)   1.1341 (  3.8%)   1.1340 (  3.8%)  Loop-Closed SSA Form Pass #3

Time passes with this patch

  Total Execution Time: 19.2802 seconds (19.2813 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   4.4234 ( 23.2%)   0.0038 (  2.0%)   4.4272 ( 23.0%)   4.4273 ( 23.0%)  Unswitch loops
   2.3828 ( 12.5%)   0.0020 (  1.1%)   2.3848 ( 12.4%)   2.3847 ( 12.4%)  Unroll loops
   1.8714 (  9.8%)   0.0020 (  1.1%)   1.8734 (  9.7%)   1.8735 (  9.7%)  Loop Invariant Code Motion
   1.7973 (  9.4%)   0.0022 (  1.2%)   1.7995 (  9.3%)   1.8003 (  9.3%)  Value Propagation
   1.4010 (  7.3%)   0.0033 (  1.8%)   1.4043 (  7.3%)   1.4044 (  7.3%)  Induction Variable Simplification
   0.9978 (  5.2%)   0.0244 ( 13.1%)   1.0222 (  5.3%)   1.0224 (  5.3%)  Loop-Closed SSA Form Pass #2
   0.9611 (  5.0%)   0.0257 ( 13.8%)   0.9868 (  5.1%)   0.9868 (  5.1%)  Loop-Closed SSA Form Pass
   0.5856 (  3.1%)   0.0015 (  0.8%)   0.5871 (  3.0%)   0.5869 (  3.0%)  Unroll loops #2
   0.4132 (  2.2%)   0.0012 (  0.7%)   0.4145 (  2.1%)   0.4143 (  2.1%)  Loop Invariant Code Motion #3

Reviewers: efriedma, davide, mzolotukhin

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D57033

llvm-svn: 352960
2019-02-02 15:26:05 +00:00
Florian Hahn 509b48a64a [LCSSA] Add expensive verification of LCSSA form for sub-loops.
This assertion makes sure all sub-loops are in LCSSA form before
bringing their parent in LCSSA form. This precondition was added to
formLCSSA in D56848.

Reviewers: davide, efriedma, mzolotukhin

Reviewed By: davide

Differential Revision: https://reviews.llvm.org/D56921

llvm-svn: 352958
2019-02-02 14:42:27 +00:00
Yonghong Song fa3654008b [BPF] [BTF] Process FileName with absolute path correctly
In IR, sometimes the following attributes for DIFile may be
generated:
  filename: /home/yhs/test.c
  directory: /tmp
The /tmp may represent the working directory of the compilation
process.

In such cases, since filename is with absolute path,
the directory should be ignored by BTF. The filename alone is
enough to get the source.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 352952
2019-02-02 05:54:59 +00:00
Julian Lettner f82d8924ef [ASan] Do not instrument other runtime functions with `__asan_handle_no_return`
Summary:
Currently, ASan inserts a call to `__asan_handle_no_return` before every
`noreturn` function call/invoke. This is unnecessary for calls to other
runtime funtions. This patch changes ASan to skip instrumentation for
functions calls marked with `!nosanitize` metadata.

Reviewers: TODO

Differential Revision: https://reviews.llvm.org/D57489

llvm-svn: 352948
2019-02-02 02:05:16 +00:00
Mandeep Singh Grang 2be4eabb6f [AutoUpgrade] Fix AutoUpgrade for x86.seh.recoverfp
Summary: This fixes the bug in https://reviews.llvm.org/D56747#inline-502711.

Reviewers: efriedma

Reviewed By: efriedma

Subscribers: javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57614

llvm-svn: 352945
2019-02-02 01:32:48 +00:00
Yonghong Song 329010e1b5 Revert "[BPF] [BTF] Process FileName with absolute path correctly"
This reverts commit r352939.

Some tests failed. Revert to unblock others.

llvm-svn: 352941
2019-02-01 23:49:52 +00:00
Mandeep Singh Grang dc1e778369 [AArch64] Fix unused variable [NFC]
llvm-svn: 352940
2019-02-01 23:42:34 +00:00
Yonghong Song 5233fb8f5e [BPF] [BTF] Process FileName with absolute path correctly
In IR, sometimes the following attributes for DIFile may be
generated:
  filename: /home/yhs/test.c
  directory: /tmp
The /tmp may represent the working directory of the compilation
process.

In such cases, since filename is with absolute path,
the directory should be ignored by BTF. The filename alone is
enough to get the source.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 352939
2019-02-01 23:23:17 +00:00
Philip Reames 00056ed0e6 [CodeGen] Be as conservative about atomic accesses as for volatile
Background: At the moment, we record the AtomicOrdering of an access in the MMO, but also mark any atomic access as volatile in SelectionDAG. I'm working towards separating that. See https://reviews.llvm.org/D57601 for context.

Update all usages of isVolatile in lib/CodeGen to preserve behaviour once atomic MMOs stop being also volatile. This is NFC in it's current form, but is essential for correctness once we make that final change.

It useful to keep in mind that AtomicSDNode is not a parent of LoadSDNode, StoreSDNode, or LSBaseSDNode. As a result, any call to isVolatile on one of those static types doesn't need a companion isAtomic check.  We should probably adjust that class hierarchy long term, but for now, that seperation is useful.

I'm deliberately being conservative about handling. I want the change to stop adding volatile to be NFC itself, and then will work through places where we can be less conservative for atomics one by one in separate changes w/tests.

Differential Revision: https://reviews.llvm.org/D57596

llvm-svn: 352937
2019-02-01 22:58:52 +00:00
Dan Gohman f726e4454c [WebAssembly] Add codegen support for the import_field attribute
This adds the LLVM side of https://reviews.llvm.org/D57602 -- the
import_field attribute. See that patch for details.

Differential Revision: https://reviews.llvm.org/D57603

llvm-svn: 352931
2019-02-01 22:27:34 +00:00
Mandeep Singh Grang 70d484d94e [COFF, ARM64] Fix localaddress to handle stack realignment and variable size objects
Summary: This fixes using the correct stack registers for SEH when stack realignment is needed or when variable size objects are present.

Reviewers: rnk, efriedma, ssijaric, TomTan

Reviewed By: rnk, efriedma

Subscribers: javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D57183

llvm-svn: 352923
2019-02-01 21:41:33 +00:00
Simon Pilgrim e95550f508 [X86][AVX] Add VMOVDDUP-VPBROADCASTQ execution domain mapping
Noticed in D57514.

Differential Revision: https://reviews.llvm.org/D57519

llvm-svn: 352922
2019-02-01 21:41:30 +00:00
Jordan Rupprecht 835df27f85 [DebugInfo] Don't use realpath when looking up debug binary locations.
Summary:
Using realpath makes assumptions about build systems that do not always hold true. The debug binary referred to from the .gnu_debuglink should exist in the same directory (or in a .debug directory, etc.), but the files may only exist as symlinks to a differently named files elsewhere, and using realpath causes that lookup to fail.

This was added in r189250, and this is basically a revert + regression test case.

Reviewers: dblaikie, samsonov, jhenderson

Reviewed By: dblaikie

Subscribers: llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57609

llvm-svn: 352916
2019-02-01 21:04:16 +00:00
James Y Knight 291f791ef1 [opaque pointer types] Pass function type for CallBase::setCalledFunction.
Differential Revision: https://reviews.llvm.org/D57174

llvm-svn: 352914
2019-02-01 20:44:54 +00:00
James Y Knight 7716075a17 [opaque pointer types] Pass value type to GetElementPtr creation.
This cleans up all GetElementPtr creation in LLVM to explicitly pass a
value type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57173

llvm-svn: 352913
2019-02-01 20:44:47 +00:00
James Y Knight 14359ef1b6 [opaque pointer types] Pass value type to LoadInst creation.
This cleans up all LoadInst creation in LLVM to explicitly pass the
value type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57172

llvm-svn: 352911
2019-02-01 20:44:24 +00:00
James Y Knight d9e85a0861 [opaque pointer types] Pass function types to InvokeInst creation.
This cleans up all InvokeInst creation in LLVM to explicitly pass a
function type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57171

llvm-svn: 352910
2019-02-01 20:43:34 +00:00
James Y Knight 7976eb5838 [opaque pointer types] Pass function types to CallInst creation.
This cleans up all CallInst creation in LLVM to explicitly pass a
function type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57170

llvm-svn: 352909
2019-02-01 20:43:25 +00:00
Michael Liao 8b323f53eb [InstCombine] Extra null-checking on TFE/LWE support
- If that operand is not ConstantInt, skip enabling TFE/LWE.

Differential Revision: https://reviews.llvm.org/D57539

llvm-svn: 352904
2019-02-01 19:53:44 +00:00
Roland Froese 7f29195c3f test commit (add blank line) NFC
llvm-svn: 352897
2019-02-01 18:55:43 +00:00
Wolfgang Pieb 58513b7761 [DWARF v5] Fix DWARF emitter and consumer to produce/expect a uleb for a location description's length.
Reviewer: davide, JDevliegere

Differential Revision: https://reviews.llvm.org/D57550

llvm-svn: 352889
2019-02-01 17:11:58 +00:00
Tim Corringham fa3e4e5b53 [AMDGPU] Fix for vector element insertion
Summary:
Incorrect code was generated when lowering insertelement operations
for vectors with 8 or 16 bit elements.  The value being inserted was
not adjusted for the position of the element within the 32 bit word
and so only the low element within each 32 bit word could receive
the intended value.

Fixed by simply replicating the value to each element of a
congruent vector before the mask and or operation used to
update the intended element.

A number of affected LIT tests have been updated appropriately.

before the mask & or into the intended

Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: llvm-commits, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57588

llvm-svn: 352885
2019-02-01 16:51:09 +00:00
Sanjay Patel 6502b1444d [SDAG] improve variable names; NFC
The version of FoldConstantArithmetic() that takes arbitrary nodes
was confusingly naming those nodes as constants when they might
not be; also "Cst" reads like "Cast".

llvm-svn: 352884
2019-02-01 16:06:53 +00:00
Simon Pilgrim 85184017e9 [X86][SSE] Use PSLLDQ/PSRLDQ to mask out zeroable ends of a shuffle
As suggested on PR40318, this patch uses PSLLDQ/PSRLDQ to lower shuffles to zero out the ends of a vector, leaving a sequential inner section.

For pre-SSSE3 we do this for shuffles with zeros at either end (requiring up to 3 shifts), but once PSHUFB is available I've limited this to shuffles with a single zeroable end (2 shifts).

Differential Revision: https://reviews.llvm.org/D56784

llvm-svn: 352883
2019-02-01 16:02:12 +00:00
Sanjay Patel 0279b5b0b8 [TargetLowering] try harder to determine undef elements of vector binops
This might be the start of tracking all vector element constants generally if we take it to its 
logical conclusion, but let's stop here and make sure this is correct/beneficial so far.

The affected tests require a convoluted path before they get simplified currently because we 
don't call SimplifyDemandedVectorElts() from binops directly and don't modify the binop operands 
directly in SimplifyDemandedVectorElts().

That's why the tests all have a trailing shuffle to induce a chain reaction of transforms. So 
something like this is happening:

1. Improve the knowledge of undefs in the binop via a SimplifyDemandedVectorElts() call that 
   originates from a shuffle.
2. Transfer that undef knowledge back to the shuffle mask user as more undef lanes.
3. Combine the modified shuffle by calling SimplifyDemandedVectorElts() again.
4. Translate the improved shuffle mask as undemanded lanes of build vector constants causing 
   those to become full undef constants.
5. Simplify the binop now that it has a full undef operand.

As we can see from the unchanged 'and' and 'or' tests, tracking undefs alone isn't a full solution. 
We would need to track zero and all-ones constants to improve those opcodes. We'd probably need to 
track NaN for FP ops too (assuming we don't have fast-math-flags set).

Differential Revision: https://reviews.llvm.org/D57066

llvm-svn: 352880
2019-02-01 15:35:12 +00:00
Simon Pilgrim 1a529f58f9 [X86][AVX] Combine INSERT_SUBVECTOR(SRC0, BITCAST(SHUFFLE(EXTRACT_SUBVECTOR(SRC1)))
Enable peeking through one use bitcasts to the subvector shuffle.

This still depends on the subvector being the same scalar-size but D57514 has already helped with the more tricky patterns

llvm-svn: 352879
2019-02-01 15:31:01 +00:00
Sanjay Patel fbcbac7174 [InstCombine] reduce duplicate code; NFC
An unused variable problem was introduced with rL352870 
and stubbed out with rL352871, but we can make a better
fix by actually using the local variable in code rather 
than just the assert.  

llvm-svn: 352873
2019-02-01 14:37:49 +00:00
Fangrui Song 8495aabec2 [InstCombine] Fix -Wunused-variable when -DLLVM_ENABLE_ASSERTIONS=off
llvm-svn: 352871
2019-02-01 14:22:02 +00:00
Sanjay Patel be23a91fcd [InstCombine] try to reduce x86 addcarry to generic uaddo intrinsic
If we can reduce the x86-specific intrinsic to the generic op, it allows existing 
simplifications and value tracking folds. AFAICT, this always results in identical 
x86 codegen in the non-reduced case...which should be true because we semi-generically 
(too aggressively IMO) convert to llvm.uadd.with.overflow in CGP, so the DAG/isel must 
already combine/lower this intrinsic as expected.

This isn't quite what was requested in:
https://bugs.llvm.org/show_bug.cgi?id=40486
...but we want to have these kinds of folds early for efficiency and to enable greater 
simplifications. For the case in the bug report where we have:
_addcarry_u64(0, ahi, 0, &ahi)
...this gets completely simplified away in IR.

Differential Revision: https://reviews.llvm.org/D57453

llvm-svn: 352870
2019-02-01 14:14:47 +00:00
Adhemerval Zanella b3ccc5550d [AArch64] Optimize floating point materialization
This patch changes isFPImmLegal to return if the value can be enconded
as the immediate operand of a logical instruction besides checking if
for immediate field for fmov.

This optimizes some floating point materization, inclusive values
used on isinf lowering.

Reviewed By: rengolin, efriedma, evandro

Differential Revision: https://reviews.llvm.org/D57044

llvm-svn: 352866
2019-02-01 12:26:06 +00:00
Roman Lebedev 7857215f8e [X86][BdVer2] Transfer delays from the integer to the floating point unit.
Summary:
I'm unable to find this number in the "AMD SOG for family 15h".
llvm-exegesis measures the latencies of these instructions as `2`,
which matches the latencies specified in "AMD SOG for family 15h".

However if we look at Agner, Microarchitecture, "AMD Bulldozer, Piledriver,
Steamroller and Excavator pipeline", "Data delay between different execution
domains", the int->ivec transfer is listed as `8`..`10`cy of additional latency.

Also, Agner's "Instruction tables", for Piledriver, lists their latencies as `12`,
which is consistent with `2cy` from exegesis / AMD SOG + `10cy` transfer delay.

Additional data point comes from the fact that Agner's "Instruction tables",
for Jaguar, lists their latencies as `8`; and "AMD SOG for family 16h" does
state the `+6cy` int->ivec delay, which is consistent with instr latency of `1` or `2`.

Reviewers: andreadb, RKSimon, craig.topper

Reviewed By: andreadb

Subscribers: gbedwell, courbet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57300

llvm-svn: 352861
2019-02-01 11:15:13 +00:00
Yevgeny Rouban 15b17d0a7c Provide reason messages for unviable inlining
InlineCost's isInlineViable() is changed to return InlineResult
instead of bool. This provides messages for failure reasons and
allows to get more specific messages for cases where callsites
are not viable for inlining.

Reviewed By: xbolva00, anemet

Differential Revision: https://reviews.llvm.org/D57089

llvm-svn: 352849
2019-02-01 10:44:43 +00:00
James Henderson 212833ce76 Revert r352750.
This was causing a build bot failure:
http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/15346/

llvm-svn: 352848
2019-02-01 10:38:40 +00:00
Oliver Stannard bac11518cd [CodeGen] Don't scavenge non-saved regs in exception throwing functions
Previously, LiveRegUnits was assuming that if a block has no successors
and does not return, then no registers are live at the end of it
(because the end of the block is unreachable). This was causing the
register scavenger to use callee-saved registers to materialise stack
frame addresses without saving them in the prologue. This would normally
be fine, because the end of the block is unreachable, but this is not
legal if the block ends by throwing a C++ exception. If this happens,
the scratch register will be modified, but its previous value won't be
preserved, so it doesn't get restored by the exception unwinder.

Differential revision: https://reviews.llvm.org/D57381

llvm-svn: 352844
2019-02-01 09:23:51 +00:00
Yevgeny Rouban 4cdd783955 [SLPVectorizer] Get rid of IndexQueue array from vectorizeStores. NFCI.
Indices are checked as they are generated. No need to fill the whole array of indices.

Differential Revision: https://reviews.llvm.org/D57144

llvm-svn: 352839
2019-02-01 06:44:08 +00:00
Alex Bradbury 7539fa2c2d [RISCV] Implement RV64D codegen
This patch:
* Adds necessary RV64D codegen patterns
* Modifies CC_RISCV so it will properly handle f64 types (with soft float ABI)

Note that in general there is no reason to try to select fcvt.w[u].d rather than fcvt.l[u].d for i32 conversions because fptosi/fptoui produce poison if the input won't fit into the target type.

Differential Revision: https://reviews.llvm.org/D53237

llvm-svn: 352833
2019-02-01 03:53:30 +00:00
Alex Bradbury 32b77383ec [SelectionDAG] Support promotion of the FPOWI integer operand
For targets where i32 is not a legal type (e.g. 64-bit RISC-V),
LegalizeIntegerTypes must promote the integer operand of ISD::FPOWI. As this
is a signed value, this should be sign-extended.

This patch enables all tests in test/CodeGen/RISCVfloat-intrinsics.ll for
RV64, as prior to this patch that file couldn't be compiled for RV64 due to an
assertion when performing codegen for fpowi.

Differential Revision: https://reviews.llvm.org/D54574

llvm-svn: 352832
2019-02-01 03:46:28 +00:00
James Y Knight 13680223b9 [opaque pointer types] Add a FunctionCallee wrapper type, and use it.
Recommit r352791 after tweaking DerivedTypes.h slightly, so that gcc
doesn't choke on it, hopefully.

Original Message:
The FunctionCallee type is effectively a {FunctionType*,Value*} pair,
and is a useful convenience to enable code to continue passing the
result of getOrInsertFunction() through to EmitCall, even once pointer
types lose their pointee-type.

Then:
- update the CallInst/InvokeInst instruction creation functions to
  take a Callee,
- modify getOrInsertFunction to return FunctionCallee, and
- update all callers appropriately.

One area of particular note is the change to the sanitizer
code. Previously, they had been casting the result of
`getOrInsertFunction` to a `Function*` via
`checkSanitizerInterfaceFunction`, and storing that. That would report
an error if someone had already inserted a function declaraction with
a mismatching signature.

However, in general, LLVM allows for such mismatches, as
`getOrInsertFunction` will automatically insert a bitcast if
needed. As part of this cleanup, cause the sanitizer code to do the
same. (It will call its functions using the expected signature,
however they may have been declared.)

Finally, in a small number of locations, callers of
`getOrInsertFunction` actually were expecting/requiring that a brand
new function was being created. In such cases, I've switched them to
Function::Create instead.

Differential Revision: https://reviews.llvm.org/D57315

llvm-svn: 352827
2019-02-01 02:28:03 +00:00
Kostya Serebryany a78a44d480 [sanitizer-coverage] prune trace-cmp instrumentation for CMP isntructions that feed into the backedge branch. Instrumenting these CMP instructions is almost always useless (and harmful) for fuzzing
llvm-svn: 352818
2019-01-31 23:43:00 +00:00
Matt Arsenault 50d6579bac GlobalISel: Fix MMO creation with non-power-of-2 mem size
It should probably just be mandatory for getTgtMemIntrinsic to return
the alignment.

llvm-svn: 352817
2019-01-31 23:41:23 +00:00
Thomas Lively 9a48438832 [WebAssembly] Fix a regression selecting negative build_vector lanes
Summary:
The custom lowering introduced in rL352592 creates build_vector nodes
with negative i32 operands, but these operands did not meet the value
range constraints necessary to match build_vector nodes. This CL fixes
the issue by removing the unnecessary constraints.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish

Differential Revision: https://reviews.llvm.org/D57481

llvm-svn: 352813
2019-01-31 23:22:39 +00:00
Alex Bradbury d834d8301d [RISCV] Add RV64F codegen support
This requires a little extra work due tothe fact i32 is not a legal type. When
call lowering happens post-legalisation (e.g. when an intrinsic was inserted
during legalisation). A bitcast from f32 to i32 can't be introduced. This is
similar to the challenges with RV32D. To handle this, we introduce
target-specific DAG nodes that perform bitcast+anyext for f32->i64 and
trunc+bitcast for i64->f32.

Differential Revision: https://reviews.llvm.org/D53235

llvm-svn: 352807
2019-01-31 22:48:38 +00:00
Sam Clegg c0affde863 [WebAssembly] MC: Fix for outputing wasm object to /dev/null
Subscribers: dschuff, jgravelle-google, aheejin, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D57479

llvm-svn: 352806
2019-01-31 22:38:22 +00:00
Richard Trieu 8f6182f7f6 [Hexagon] Rename textually included file from .h to .inc
llvm-svn: 352802
2019-01-31 21:58:42 +00:00
James Y Knight fadf25068e Revert "[opaque pointer types] Add a FunctionCallee wrapper type, and use it."
This reverts commit f47d6b38c7 (r352791).

Seems to run into compilation failures with GCC (but not clang, where
I tested it). Reverting while I investigate.

llvm-svn: 352800
2019-01-31 21:51:58 +00:00
Alina Sbirlea e271889291 [EarlyCSE & MSSA] Cleanup special handling for removing MemoryAccesses.
Summary: Moving special handling to MemorySSAUpdater in D57199.

Reviewers: gberry, george.burgess.iv

Subscribers: sanjoy, jlebar, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D57200

llvm-svn: 352794
2019-01-31 21:12:41 +00:00
Thomas Lively 88058d4e1e [WebAssembly] Add bulk memory target feature
Summary: Also clean up some preexisting target feature code.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, jfb

Differential Revision: https://reviews.llvm.org/D57495

llvm-svn: 352793
2019-01-31 21:02:19 +00:00
Guozhi Wei 0bed9e0453 [DAGCombine] Avoid CombineZExtLogicopShiftLoad if there is free ZEXT
This patch fixes pr39098.

For the attached test case, CombineZExtLogicopShiftLoad can optimize it to

t25: i64 = Constant<1099511627775>
t35: i64 = Constant<0>
  t0: ch = EntryToken
t57: i64,ch = load<(load 4 from `i40* undef`, align 8), zext from i32> t0, undef:i64, undef:i64
    t58: i64 = srl t57, Constant:i8<1>
  t60: i64 = and t58, Constant:i64<524287>
t29: ch = store<(store 5 into `i40* undef`, align 8), trunc to i40> t57:1, t60, undef:i64, undef:i64

But later visitANDLike transforms it to

t25: i64 = Constant<1099511627775>
t35: i64 = Constant<0>
  t0: ch = EntryToken
t57: i64,ch = load<(load 4 from `i40* undef`, align 8), zext from i32> t0, undef:i64, undef:i64
      t61: i32 = truncate t57
    t63: i32 = srl t61, Constant:i8<1>
  t64: i32 = and t63, Constant:i32<524287>
t65: i64 = zero_extend t64
    t58: i64 = srl t57, Constant:i8<1>
  t60: i64 = and t58, Constant:i64<524287>
t29: ch = store<(store 5 into `i40* undef`, align 8), trunc to i40> t57:1, t60, undef:i64, undef:i64

And it triggers CombineZExtLogicopShiftLoad again, causes a dead loop.

Both forms should generate same instructions, CombineZExtLogicopShiftLoad generated IR looks cleaner. But it looks more difficult to prevent visitANDLike to do the transform, so I prevent CombineZExtLogicopShiftLoad to do the transform if the ZExt is free.

Differential Revision: https://reviews.llvm.org/D57491

llvm-svn: 352792
2019-01-31 20:46:42 +00:00
James Y Knight f47d6b38c7 [opaque pointer types] Add a FunctionCallee wrapper type, and use it.
The FunctionCallee type is effectively a {FunctionType*,Value*} pair,
and is a useful convenience to enable code to continue passing the
result of getOrInsertFunction() through to EmitCall, even once pointer
types lose their pointee-type.

Then:
- update the CallInst/InvokeInst instruction creation functions to
  take a Callee,
- modify getOrInsertFunction to return FunctionCallee, and
- update all callers appropriately.

One area of particular note is the change to the sanitizer
code. Previously, they had been casting the result of
`getOrInsertFunction` to a `Function*` via
`checkSanitizerInterfaceFunction`, and storing that. That would report
an error if someone had already inserted a function declaraction with
a mismatching signature.

However, in general, LLVM allows for such mismatches, as
`getOrInsertFunction` will automatically insert a bitcast if
needed. As part of this cleanup, cause the sanitizer code to do the
same. (It will call its functions using the expected signature,
however they may have been declared.)

Finally, in a small number of locations, callers of
`getOrInsertFunction` actually were expecting/requiring that a brand
new function was being created. In such cases, I've switched them to
Function::Create instead.

Differential Revision: https://reviews.llvm.org/D57315

llvm-svn: 352791
2019-01-31 20:35:56 +00:00
Alina Sbirlea 240a90a57e [MemorySSA] Extend removeMemoryAccess API to optimize MemoryPhis.
Summary:
EarlyCSE needs to optimize MemoryPhis after an access is removed and has
special handling for it. This should be handled by MemorySSA instead.
The default remains that MemoryPhis are *not* optimized after an access
is removed.

Reviewers: george.burgess.iv

Subscribers: sanjoy, jlebar, llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D57199

llvm-svn: 352787
2019-01-31 20:13:47 +00:00
Nirav Dave b792299d83 [DAG][SystemZ] Define unwrapAddress for PCREL_WRAPPER.
Summary:
Like with X86, this allows better DAG-level alias analysis and
alignment inference for wrapped addresses.

Reviewers: jonpa, uweigand

Reviewed By: uweigand

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D57407

llvm-svn: 352786
2019-01-31 19:58:34 +00:00
Nirav Dave 4061b44057 [DAG] Aggressively cleanup dangling node in CombineZExtLogicopShiftLoad.
While dangling nodes will eventually be pruned when they are
considered, leaving them disables combines requiring single-use.

Reviewers: Carrot, spatel, craig.topper, RKSimon, efriedma

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D57520

llvm-svn: 352784
2019-01-31 19:35:14 +00:00
Leonard Chan ae527ac603 [Intrinsic] Expand SMULFIX to MUL, MULH[US], or [US]MUL_LOHI on vector arguments
r zero scale SMULFIX, expand into MUL which produces better code for X86.

For vector arguments, expand into MUL if SMULFIX is provided with a zero scale.
Otherwise, expand into MULH[US] or [US]MUL_LOHI.

Differential Revision: https://reviews.llvm.org/D56987

llvm-svn: 352783
2019-01-31 19:15:37 +00:00
Craig Topper a8f0745440 Revert "[X86] Mark EMMS and FEMMS as clobbering MM0-7 and ST0-7."
This is causing a failure in chromium

llvm-svn: 352782
2019-01-31 19:05:22 +00:00
Philip Reames ede49ddff5 Lower widenable_conditions in CGP
This ensures that if we make it to the backend w/o lowering widenable_conditions first, that we generate correct code. Doing it in CGP - instead of isel - let's us fold control flow before hitting block local instruction selection.

Differential Revision: https://reviews.llvm.org/D57473

llvm-svn: 352779
2019-01-31 18:45:46 +00:00
Simon Pilgrim 00cefe1158 Trim trailing whitespace. NFCI.
llvm-svn: 352775
2019-01-31 17:49:25 +00:00
Simon Pilgrim eb6aef6db3 [X86][AVX] Fold concat(broadcast(x),broadcast(x)) -> broadcast(x)
Differential Revision: https://reviews.llvm.org/D57514

llvm-svn: 352774
2019-01-31 17:48:35 +00:00
Simon Pilgrim d04a2d2d5e [X86][AVX] insert_subvector(bitcast(v), bitcast(s), c1) -> bitcast(insert_subvector(v,s,c2))
Similar to what we already do in DAGCombiner, but this version also handles bitcasts from types with different scalar sizes, which x86 is better at handling.

Differential Revision: https://reviews.llvm.org/D57514

llvm-svn: 352773
2019-01-31 17:38:10 +00:00
Craig Topper c1892ec15a [CallSite removal] Remove CallSite uses from InstCombine.
Reviewers: chandlerc

Reviewed By: chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D57494

llvm-svn: 352771
2019-01-31 17:23:29 +00:00
Teresa Johnson f59242e5ff Recommit "[ThinLTO] Rename COMDATs for COFF when promoting/renaming COMDAT leader"
Recommit of r352763 with fix for use after free.

llvm-svn: 352770
2019-01-31 17:18:11 +00:00
Teresa Johnson 4877715ee6 Revert "[ThinLTO] Rename COMDATs for COFF when promoting/renaming COMDAT leader"
This reverts commit r352763.

Causing a couple bot failures, root cause pointed to by sanitizer bot:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/28909/steps/annotate/logs/stdio

Use after free. I understand the issue but will revert and test with fix
before recommitting.

llvm-svn: 352768
2019-01-31 16:46:14 +00:00
Teresa Johnson 992b53fd16 [ThinLTO] Rename COMDATs for COFF when promoting/renaming COMDAT leader
Summary:
COFF requires that COMDAT name match that of the leader. When we promote
and rename an internal leader in ThinLTO due to an import, ensure we
subsequently rename the associated COMDAT. Similar to D31963 which did
this during ThinLTO module splitting.

Fixes PR40414.

Reviewers: pcc, inglorion

Subscribers: mehdi_amini, dexonsmith, dmajor, llvm-commits

Differential Revision: https://reviews.llvm.org/D57395

llvm-svn: 352763
2019-01-31 16:00:15 +00:00
Simon Pilgrim 63f3383ece [X86][AVX] Fold broadcast(bitcast(src)) -> bitcast(broadcast(src))
llvm-svn: 352751
2019-01-31 14:04:07 +00:00
James Henderson 140f75f625 [CommandLine] Improve help text for cl::values style options
In order to make an option value truly optional, both the ValueOptional
and an empty-named value are required. This empty-named value appears in
the command-line help text, which is not ideal.

This change improves the help text for these sort of options in a number
of ways:
1) ValueOptional options with an empty-named value now print their help
text twice: both without and then with '=<value>' after the name. The
latter version then lists the allowed values after it.
2) Empty-named values with no help text in ValueOptional options are not
listed in the permitted values.
3) Otherwise empty-named options are printed as =<empty> rather than
simply '='.
4) Option values without help text do not have the '-' separator
printed.

It also tweaks the llvm-symbolizer -functions help text to not print a
trailing ':' as that looks bad combined with 1) above.

Reviewed by: thopre, ruiu

Differential Revision: https://reviews.llvm.org/D57030

llvm-svn: 352750
2019-01-31 13:58:48 +00:00
Simon Pilgrim a001008a09 [X86] combineExtractWithShuffle - more aggressively peek through bitcasts
Fixes regression introduced by rL352743

llvm-svn: 352745
2019-01-31 11:55:30 +00:00
Simon Pilgrim b96a2c7fed [X86][AVX] Enable AVX1 broadcasts in shuffle combining
Enables 32/64-bit scalar load broadcasts on AVX1 targets

The extractelement-load.ll regression will be fixed shortly in a followup commit.

llvm-svn: 352743
2019-01-31 11:41:10 +00:00
Simon Pilgrim 51c2efc104 [X86][AVX] Fold vt1 concat_vectors(vt2 undef, vt2 broadcast(x)) --> vt1 broadcast(x)
If we're not inserting the broadcast into the lowest subvector then we can avoid the insertion by just performing a larger broadcast.

Avoids a regression when we enable AVX1 broadcasts in shuffle combining

llvm-svn: 352742
2019-01-31 11:15:05 +00:00
Max Kazantsev f392bc846f Default lowering for experimental.widenable.condition
Introduces a pass that provides default lowering strategy for the
`experimental.widenable.condition` intrinsic, replacing all its uses with
`i1 true`.

Differential Revision: https://reviews.llvm.org/D56096
Reviewed By: reames

llvm-svn: 352739
2019-01-31 09:10:17 +00:00
Yevgeny Rouban ae29857d64 Test commit. NFCI.
llvm-svn: 352738
2019-01-31 08:49:20 +00:00
Sjoerd Meijer f222259c3c [ARM] Thumb2: ConstantMaterializationCost
Constants can also be materialised using the negated value and a MVN, and this
case seem to have been missed for Thumb2. To check the constant materialisation
costs, we now call getT2SOImmVal twice, once for the original constant and then
also for its negated value, and this function checks if the constant can both
be splatted or rotated.

This was revealed by a test that optimises for minsize: instead of a LDR
literal pool load and having a literal pool entry, just a MVN with an immediate
is smaller (and also faster).

Differential Revision: https://reviews.llvm.org/D57327

llvm-svn: 352737
2019-01-31 08:38:06 +00:00
Sjoerd Meijer f7cc34cae8 [SelectionDAG] Codesize: don't expand SHIFT to SHIFT_PARTS
And instead just generate a libcall. My motivating example on ARM was a simple:
  
  shl i64 %A, %B

for which the code bloat is quite significant. For other targets that also
accept __int128/i128 such as AArch64 and X86, it is also beneficial for these
cases to generate a libcall when optimising for minsize. On these 64-bit targets,
the 64-bits shifts are of course unaffected because the SHIFT/SHIFT_PARTS
lowering operation action is not set to custom/expand.

Differential Revision: https://reviews.llvm.org/D57386

llvm-svn: 352736
2019-01-31 08:07:30 +00:00
Dmitry Venikov 8817658836 [InstCombine] Missed optimization in math expression: simplify calls exp functions
Summary: This patch enables folding following expressions under -ffast-math flag: exp(X) * exp(Y) -> exp(X + Y), exp2(X) * exp2(Y) -> exp2(X + Y). Motivation: https://bugs.llvm.org/show_bug.cgi?id=35594

Reviewers: hfinkel, spatel, efriedma, lebedev.ri

Reviewed By: spatel, lebedev.ri

Subscribers: lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D41342

llvm-svn: 352730
2019-01-31 06:28:10 +00:00
Max Kazantsev b37419ef66 [SCEV] Prohibit SCEV transformations for huge SCEVs
Currently SCEV attempts to limit transformations so that they do not work with
big SCEVs (that may take almost infinite compile time). But for this, it uses heuristics
such as recursion depth and number of operands, which do not give us a guarantee
that we don't actually have big SCEVs. This situation is still possible, though it is not
likely to happen. However, the bug PR33494 showed a bunch of simple corner case
tests where we still produce huge SCEVs, even not reaching big recursion depth etc.

This patch introduces a concept of 'huge' SCEVs. A SCEV is huge if its expression
size (intoduced in D35989) exceeds some threshold value. We prohibit optimizing
transformations if any of SCEVs we are dealing with is huge. This gives us a reliable
check that we don't spend too much time working with them.

As the next step, we can possibly get rid of old limiting mechanisms, such as recursion
depth thresholds.

Differential Revision: https://reviews.llvm.org/D35990
Reviewed By: reames

llvm-svn: 352728
2019-01-31 06:19:25 +00:00
Richard Trieu 108b892939 Add namespace to some types.
llvm-svn: 352725
2019-01-31 04:33:11 +00:00
David L. Jones d81f23071c Revert "Reapply "[CGP] Check for existing inttotpr before creating new one""
This change reverts r351626.

The changes in r351626 cause quadratic work in several cases. (See r351626 thread on llvm-commits for details.)

llvm-svn: 352722
2019-01-31 03:28:46 +00:00
Matt Arsenault c7bce739ad GlobalISel: Handle odd splits in fewerElementsVector for load/store
llvm-svn: 352720
2019-01-31 02:46:05 +00:00
Matt Arsenault d1bfc8d0c3 GlobalISel: Implement narrowScalar for bswap
llvm-svn: 352719
2019-01-31 02:34:03 +00:00
Matt Arsenault cf4db733d8 GlobalISel: Don't call changingInstruction before giving up
llvm-svn: 352718
2019-01-31 02:22:39 +00:00
Matt Arsenault d5684f76e0 GlobalISel: Allow bitcount ops to have different result type
For AMDGPU the result is always 32-bit for 64-bit inputs.

llvm-svn: 352717
2019-01-31 02:09:57 +00:00
Matt Arsenault 8db2001d52 GlobalISel: Use helper function for MMO splitting
Also fix an alignment bug getMachineMemOperand. If the
tracked value is null, the offset isn't tracked so the
base alignment needs to be reduced.

llvm-svn: 352716
2019-01-31 01:49:58 +00:00
Matt Arsenault 2a64598ef2 GlobalISel: Fix creating MMOs with align 0
llvm-svn: 352712
2019-01-31 01:38:47 +00:00
Thomas Lively 9510adafe6 [LegalizeVectorTypes] Allow illegal indices when splitting extract_vector_elt
Summary:
Fixes PR40267, in which the removed assertion was triggering on
perfectly valid IR. As far as I can tell, constant out of bounds
indices should be allowed when splitting extract_vector_elt, since
they will simply be propagated as out of bounds indices in the
resulting split vector and handled appropriately elsewhere.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya

Differential Revision: https://reviews.llvm.org/D57471

llvm-svn: 352702
2019-01-31 00:35:37 +00:00
Craig Topper 49c4c68919 [LegalizeTypes] Use report_fatal_error instead of llvm_unreachable in the default case of some type legalization handlers that can be reached with intrinsics with result or operands that aren't legal types.
These can be triggered by mistakenly using a 64-bit mode only intrinsics with a -mtriple=i686. Using report_fatal_error gives a better experience for this mistake in release builds instead of probably crashing.

We already do this for some of the vector type legalization handles.

llvm-svn: 352699
2019-01-31 00:04:48 +00:00
Craig Topper 8bdc203d4b [X86] Remove handling of ISD::INTRINSIC_WO_CHAIN in ReplaceNodeResults.
I believe this was there to handle avx512bw intrinsics that returned i64 type in 32-bit mode. But all those intrinsics have since been changed to v64i1 results or replaced with generic IR.

llvm-svn: 352698
2019-01-31 00:04:46 +00:00
Zachary Turner 3c35f774de [RuntimeDyld] Don't try to allocate sections with align 0.
ELF sections allow 0 for the alignment, which is specified to
be the same as 1.  However many clients do not expect this and
will behave poorly in the presence of a 0-aligned section (for
example by trying to modulo something by the section alignment).
We can be more polite by making sure that we always pass a
non-zero value to clients.

Differential Revision: https://reviews.llvm.org/D57482

llvm-svn: 352694
2019-01-30 23:52:32 +00:00
Jessica Paquette 84bedac7e9 [GlobalISel][AArch64] Select G_FEXP
This teaches the legalizer to handle G_FEXP in AArch64. As a result, it also
allows us to select G_FEXP.

It...

- Updates the legalizer-info tests
- Adds a test for legalizing exp
- Updates the existing fp tests to show that we can now select G_FEXP

https://reviews.llvm.org/D57483

llvm-svn: 352692
2019-01-30 23:46:15 +00:00
Amara Emerson 13311e5274 [GlobalISel][LegalizerHelper] Add some missing MI change observer calls.
No test as it's a preventative fix.

llvm-svn: 352691
2019-01-30 23:42:46 +00:00
Chen Zheng be589423d8 [PowerPC] delete no more needed workaround for readsRegister() in PowerPC
Differential Revision: https://reviews.llvm.org/D57439

llvm-svn: 352689
2019-01-30 23:18:38 +00:00
Matt Arsenault 547a83b4eb MIR: Reject non-power-of-4 alignments in MMO parsing
llvm-svn: 352686
2019-01-30 23:09:28 +00:00
Jessica Paquette 10f59405ae [GlobalISel][AArch64] Select G_FABS
This adds instruction selection support for G_FABS in AArch64. It also updates
the existing basic FP tests, adds a selection test for G_FABS.

https://reviews.llvm.org/D57418

llvm-svn: 352684
2019-01-30 22:54:21 +00:00
Sam Clegg 19e8befabb [WebAssembly] MC: Use WritePatchableLEB helper function. NFC.
Subscribers: dschuff, jgravelle-google, aheejin, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D57477

llvm-svn: 352683
2019-01-30 22:47:35 +00:00
Heejin Ahn 0bb9865011 [WebAssembly] Restore stack pointer right after catch instruction
Summary:
After the staack is unwound due to a thrown exxception,
`__stack_pointer` global can point to an invalid address. So
a `global.set` to restore `__stack_pointer` should be inserted right
after `catch` instruction.

But after r352598 the `global.set` instruction is inserted not right
after `catch` but after `block` - `br-on-exn` - `end_block` -
`extract_exception` sequence. This CL fixes it.

While doing that, we can actually move ReplacePhysRegs pass after
LateEHPrepare and merge EHRestoreStackPointer pass into LateEHPrepare,
and now placing `global.set` to `__stack_pointer` right after `catch` is
much easier. Otherwise it is hard to guarantee that `global.set` is
still right after `catch` and not touched with other transformations, in
which case we have to do something to hoist it.

Reviewers: dschuff

Subscribers: mgorny, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D57421

llvm-svn: 352681
2019-01-30 22:44:45 +00:00
Sanjay Patel 9ab23101a8 [DAGCombiner] sub X, 0/1 --> add X, 0/-1
This extends the existing transform for:
add X, 0/1 --> sub X, 0/-1
...to allow the sibling subtraction fold.

This pattern could regress with the proposed change in D57401.

llvm-svn: 352680
2019-01-30 22:41:35 +00:00
Jessica Paquette 0154bd1385 [GlobalISel][AArch64] Add instruction selection support for @llvm.log2
This teaches GlobalISel to emit a RTLib call for @llvm.log2 when it encounters
it.

It updates the existing floating point tests to show that we don't fall back on
the intrinsic, and select the correct instructions. It also adds a legalizer
test for G_FLOG2.

https://reviews.llvm.org/D57357

llvm-svn: 352673
2019-01-30 21:16:04 +00:00
Jessica Paquette 22457f8e9b [GlobalISel][AArch64] Add instruction selection support for @llvm.sqrt
This teaches the legalizer about G_FSQRT in AArch64. Also adds a legalizer
test for G_FSQRT, a selection test for it, and updates existing floating point
tests.

https://reviews.llvm.org/D57361

llvm-svn: 352671
2019-01-30 21:03:52 +00:00
Jessica Paquette b147e7d853 [GlobalISel] Add IRTranslator support for @llvm.sqrt -> G_FSQRT
Follow-up commit to https://reviews.llvm.org/D57359. (r352668)

This adds IRTranslator support for recognising a @llvm.sqrt intrinsic and
translating it into a G_FSQRT.

https://reviews.llvm.org/D57360

llvm-svn: 352670
2019-01-30 20:58:14 +00:00
Wolfgang Pieb facd052e16 Reverting r352642 - Handle restore instructions in LiveDebugValues - as it's causing
assertions on some buildbots.

llvm-svn: 352666
2019-01-30 20:37:14 +00:00
Erik Pilkington 600e9deacf Add a 'dynamic' parameter to the objectsize intrinsic
This is meant to be used with clang's __builtin_dynamic_object_size.
When 'true' is passed to this parameter, the intrinsic has the
potential to be folded into instructions that will be evaluated
at run time. When 'false', the objectsize intrinsic behaviour is
unchanged.

rdar://32212419

Differential revision: https://reviews.llvm.org/D56761

llvm-svn: 352664
2019-01-30 20:34:35 +00:00
Craig Topper 22b3de5b51 [X86] Mark EMMS and FEMMS as clobbering MM0-7 and ST0-7.
This fixes the test case in PR35982 by preventing MMX instructions that read MM0-7 from being moved below EMMS/FEMMS by the post RA scheduler.

Though as discussed in bugzilla, this is not a complete fix. There is still the possibility of reordering in IR or by the pre-RA scheduler.

Differential Revision: https://reviews.llvm.org/D57298

llvm-svn: 352660
2019-01-30 19:57:01 +00:00
Philip Reames c71e996aed SimplifyDemandedVectorElts for all intrinsics
The point is that this simplifies integration of new intrinsics into SimplifiedDemandedVectorElts, and ensures we don't miss any existing ones.

This is intended to be NFC-ish, but as seen from the diffs, can produce slightly different output.  This is due to order of transforms w/in instcombine resulting in two slightly different fixed points.  That's something we should fix, but isn't a problem w/this patch per se.

Differential Revision: https://reviews.llvm.org/D57398

llvm-svn: 352653
2019-01-30 19:21:11 +00:00
Wolfgang Pieb 5590a4355f [DEBUGINFO] Handle restore instructions in LiveDebugValues
The LiveDebugValues pass recognizes spills but not restores, which can 
cause large gaps in location information for some variables, depending
on control flow. This patch make LiveDebugValues recognize restores and
generate appropriate DBG_VALUE instructions. 

Reviewers: aprantl, NicolaPrica

Differential Revision: https://reviews.llvm.org/D57271

llvm-svn: 352642
2019-01-30 18:34:07 +00:00
Matt Arsenault dc8258c4aa GlobalISel: Add assert that legalize mutation makes sense
I've repeatedly encountered bugs resulting from custom legalize
mutations returning nonsense legalize results, such as increasing the
number of elements for FewerElements. Add an assert function to make
sure the type to mutate to is consistent with the legalize action.

llvm-svn: 352636
2019-01-30 17:52:23 +00:00
Matt Arsenault 4c0409e9c7 AMDGPU: Stop generating unused intrinsic .inc files
llvm-svn: 352635
2019-01-30 17:25:37 +00:00
Simon Pilgrim 317fad5921 [X86][AVX] Prefer to combine shuffle to broadcasts whenever possible
This is the first step towards improving broadcast support on AVX1 targets.

llvm-svn: 352634
2019-01-30 16:19:19 +00:00
Max Kazantsev 365021cc15 Properly use DT.verify in LoopSimplifyCFG
llvm-svn: 352621
2019-01-30 12:32:19 +00:00
Max Kazantsev 34eeeec3ae Enable IRCE for narrow latch by defailt
llvm-svn: 352619
2019-01-30 11:25:12 +00:00
Shiva Chen 5af037f1e9 [RISCV] Insert R_RISCV_ALIGN relocation type and Nops for code alignment when linker relaxation enabled
Linker relaxation may change code size. We need to fix up the alignment
of alignment directive in text section by inserting Nops and R_RISCV_ALIGN
relocation type. So then linker could satisfy the alignment by removing Nops.

To do this:

1. Add shouldInsertExtraNopBytesForCodeAlign target hook to calculate
   the Nops we need to insert.

2. Add shouldInsertFixupForCodeAlign target hook to insert
   R_RISCV_ALIGN fixup type.

Differential Revision: https://reviews.llvm.org/D47755

llvm-svn: 352616
2019-01-30 11:16:59 +00:00
Aleksandr Urakov d17f6ab61b [NativePDB] Fix access to both old & new fpo data entries from dbi stream
Summary:
This patch fixes access to fpo streams in native pdb from DbiStream and makes
code consistent with DbiStreamBuilder.

Patch By: leonid.mashinskiy

Reviewers: zturner, aleksandr.urakov

Reviewed By: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D56725

llvm-svn: 352615
2019-01-30 10:40:45 +00:00
Craig Topper 11133d2531 [X86] Remove unnecessary code from the top of handleCompareFP in X86FloatingPoint.cpp.
There were checks to ensure some tables were sorted, but those tables aren't used by this function. The same tables are checked in the function that does use them. Maybe this was copy/pasted?

llvm-svn: 352609
2019-01-30 08:04:06 +00:00
Craig Topper 594f76aea2 [X86] Remove a couple places where we unnecessarily pass 0 to the EmitPriority of some FP instruction aliases. NFC
As far as I can tell we already won't emit these aliases due to an operand count check in the tablegen code. Removing these because I couldn't make sense of the inconsistency between fadd and fmul from reading the code.

I checked the AsmMatcher and AsmWriter files before and after this change and there were no differences.

llvm-svn: 352608
2019-01-30 07:33:24 +00:00
Craig Topper 9dfe9b086e [X86] Add FPSW as a Def on some FP instructions that were missing it.
llvm-svn: 352607
2019-01-30 07:08:44 +00:00
Hiroshi Inoue c437f310a5 [NFC] fix trivial typos in comments
llvm-svn: 352602
2019-01-30 05:26:31 +00:00
Matt Arsenault dc6c78596b GlobalISel: Implement fewerElementsVector for select
llvm-svn: 352601
2019-01-30 04:19:31 +00:00
Matt Arsenault f6cab16258 AMDGPU/GlobalISel: Fix clamping shifts with 16-bit insts
llvm-svn: 352599
2019-01-30 03:36:25 +00:00
Heejin Ahn d6f487863d [WebAssembly] Exception handling: Switch to the new proposal
Summary:
This switches the EH implementation to the new proposal:
https://github.com/WebAssembly/exception-handling/blob/master/proposals/Exceptions.md
(The previous proposal was
 https://github.com/WebAssembly/exception-handling/blob/master/proposals/old/Exceptions.md)

- Instruction changes
  - Now we have one single `catch` instruction that returns a except_ref
    value
  - `throw` now can take variable number of operations
  - `rethrow` does not have 'depth' argument anymore
  - `br_on_exn` queries an except_ref to see if it matches the tag and
    branches to the given label if true.
  - `extract_exception` is a pseudo instruction that simulates popping
    values from wasm stack. This is to make `br_on_exn`, a very special
    instruction, work: `br_on_exn` puts values onto the stack only if it
    is taken, and the # of values can vay depending on the tag.

- Now there's only one `catch` per `try`, this patch removes all special
  handling for terminate pad with a call to `__clang_call_terminate`.
  Before it was the only case there are two catch clauses (a normal
  `catch` and `catch_all` per `try`).

- Make `rethrow` act as a terminator like `throw`. This splits BB after
  `rethrow` in WasmEHPrepare, and deletes an unnecessary `unreachable`
  after `rethrow` in LateEHPrepare.

- Now we stop at all catchpads (because we add wasm `catch` instruction
  that catches all exceptions), this creates new
  `findWasmUnwindDestinations` function in SelectionDAGBuilder.

- Now we use `br_on_exn` instrution to figure out if an except_ref
  matches the current tag or not, LateEHPrepare generates this sequence
  for catch pads:
```
  catch
  block i32
  br_on_exn $__cpp_exception
  end_block
  extract_exception
```

- Branch analysis for `br_on_exn` in WebAssemblyInstrInfo

- Other various misc. changes to switch to the new proposal.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D57134

llvm-svn: 352598
2019-01-30 03:21:57 +00:00
Matt Arsenault 6d8e1b456a GlobalISel: Use appropriate extension for legalizing select conditions
llvm-svn: 352597
2019-01-30 02:57:43 +00:00
Zi Xuan Wu fec749ff5d [PowerPC] [NFC] Create a helper function to copy register to particular register class at PPCFastISel
Make copy register code as common function as following.

unsigned copyRegToRegClass(const TargetRegisterClass *ToRC, unsigned SrcReg, unsigned Flag = 0, unsigned SubReg = 0);

Differential Revision: https://reviews.llvm.org/D57368

llvm-svn: 352596
2019-01-30 02:56:22 +00:00
Matt Arsenault 045bc9a4a6 GlobalISel: Support narrowScalar for uneven loads
llvm-svn: 352594
2019-01-30 02:35:38 +00:00
Thomas Lively 079816efb7 [WebAssembly] Optimize BUILD_VECTOR lowering for size
Summary:
Implements custom lowering logic that finds the optimal value for the
initial splat of the vector and either uses it or uses v128.const if
it is available and if it would produce smaller code. This logic
replaces large TableGen ISEL patterns that would lower all non-splat
BUILD_VECTORs into a splat followed by a fixed number of replace_lane
instructions. This CL fixes PR39685.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D56633

llvm-svn: 352592
2019-01-30 02:23:29 +00:00
Matt Arsenault ccefbbd0f0 GlobalISel: Handle some odd splits in fewerElementsVector
Also add some quick hacks to AMDGPU legality for the tests.

llvm-svn: 352591
2019-01-30 02:22:13 +00:00
Matt Arsenault 92c5001136 GlobalISel: Handle more cases for widenScalar for G_STORE
llvm-svn: 352585
2019-01-30 02:04:31 +00:00
Chen Zheng ca26039cc7 [PowerPC] more opportunity for converting reg+reg to reg+imm
Differential Revision: https://reviews.llvm.org/D57314

llvm-svn: 352583
2019-01-30 01:57:01 +00:00
Matt Arsenault ccb810fb54 GlobalISel: Verify memory size for load/store
llvm-svn: 352578
2019-01-30 01:10:42 +00:00
George Burgess IV 179f6baa45 Remove a redundant space from an error message; NFC
llvm-svn: 352576
2019-01-30 00:28:56 +00:00
Sam Clegg c7d2e5f154 [WebAssembly] Add missing SymbolRef update from rL352551
This change broke some MC tests which are now fixed.

Differential Revision: https://reviews.llvm.org/D57424

llvm-svn: 352573
2019-01-30 00:15:48 +00:00
Thomas Lively 74c12ceacb [WebAssembly] Lower SCALAR_TO_VECTOR to splats
Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish

Differential Revision: https://reviews.llvm.org/D57269

llvm-svn: 352568
2019-01-29 23:44:48 +00:00
Matt Arsenault 3de9a96174 GlobalISel: Fix unused variable warning in release builds
llvm-svn: 352565
2019-01-29 23:38:42 +00:00
Craig Topper 0b5e6b11c3 [IR] Use CallBase to reduce code duplication. NFC
Noticed in the asm-goto patch. Callbr needs to go here too. One cast and call is better than 3.

Differential Revision: https://reviews.llvm.org/D57295

llvm-svn: 352563
2019-01-29 23:31:54 +00:00
Matt Arsenault d45b03bb81 GlobalISel: Verify pointer casts
Not sure if the old AArch64 tests should be just
deleted or not.

llvm-svn: 352562
2019-01-29 23:29:00 +00:00