Commit Graph

174949 Commits

Author SHA1 Message Date
Heejin Ahn 7b7a4ef3d3 [WebAssembly] Add a comment about why v128.const test was disabled (NFC)
llvm-svn: 353236
2019-02-05 23:01:41 +00:00
Sanjay Patel cddb1e5469 [InstCombine] limit extracting shuffle transform based on uses
As discussed in D53037, this can lead to worse codegen, and we
don't generally expect the backend to be able to optimize
arbitrary shuffles. If there's only one use of the 1st shuffle,
that means it's getting removed, so that should always be
safe.

llvm-svn: 353235
2019-02-05 22:58:45 +00:00
Heejin Ahn 1b8df42712 [WebAssembly] Disable a v128.const test line temporarily
r353131 caused failures in v128.const test for clang-ppc64be-linux-lnt
and clang-s390x-linux bots. This temporarily disables that line until
it is fixed.

llvm-svn: 353234
2019-02-05 22:47:29 +00:00
Sanjay Patel 0272b44ea2 [InstCombine] split shuffle test to show extra use constraint; NFC
As discussed in D53037, this transform can cause codegen problems
if the 1st shuffle has multiple uses.

llvm-svn: 353233
2019-02-05 22:46:13 +00:00
Rong Xu ce10d5ead4 [PGO] Use a function for creating variable for profile file name. NFC.
Factored out the code for creating variable for profile file name to
a function.

llvm-svn: 353230
2019-02-05 22:34:45 +00:00
Petar Jovanovic b33f00f508 [elfabi] Fix the type of the variable formated for error output
Change the format type of Dyn.SONameOffset to PRIx64 since it is a uint64_t.
The problem was detected on mips builds, where it was printing junk values
and causing test failure.

Patch by Milos Stojanovic.

Differential Revision: https://reviews.llvm.org/D57676

llvm-svn: 353225
2019-02-05 22:23:46 +00:00
Aditya Nandakumar fef7619b05 [NFC][GlobalISel]: Add a convenience method to MachineInstrBuilder to simplify getOperand(i).getReg()
https://reviews.llvm.org/D57608

It's a common pattern in GISel to have a MachineInstrBuilder from which we get various regs
(commonly MIB->getOperand(0).getReg()). This adds a helper method and the above can be
replaced with MIB.getReg(0).

llvm-svn: 353223
2019-02-05 22:14:40 +00:00
Craig Topper 4562f420cd [X86] Regenerate tests missed in r353061. NFC
We now print the implicit %st register on these instruction, but since they occur at the end of the line, FileCheck didn't see they were missing.

llvm-svn: 353222
2019-02-05 21:47:42 +00:00
Reid Kleckner f38bc4fc99 [MC] Don't error on numberless .file directives on MachO
Summary:
Before r349976, MC ignored such directives when producing an object file
and asserted when re-producing textual assembly output. I turned this
assertion into a hard error in both cases in r349976, but this makes it
unnecessarily difficult to write a single assembly file that supports
both MachO and other object formats that support .file. A user reported
this as PR40578, and we decided to go back to ignoring the directive.

Fixes PR40578

Reviewers: mstorsjo

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57772

llvm-svn: 353218
2019-02-05 21:14:09 +00:00
Matt Davis 0d0e9c08a4 [llvm-readobj] Display sections that do not belong to a segment in the section-mapping
Summary:
The following patch adds the "None" line to the section to segment mapping dump.
That line lists the sections that do not belong to any segment.
I realize that this change differs from GNU readelf which does not display the latter information.

I'd rather not add this "feature" under a command line option.  I think that might introduce confusion, since users would have to
make an additional decision as to if they want to see all of the section-to-segment map or just a subset of it.

Another option is to only print the "None" line if the `--section-mapping` option is passed; however,
that might also introduce some confusion, because the section-to-segment map would be different between`--program-headers`
and the `--section-mapping` output.  While the difference is just the "None" line, it seems that if we choose to display
the segment-to-section mapping, then we should always display the whole map including the sections
that do not belong to segments.

```
Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02     .interp .note.ABI-tag .gnu.hash
   03     .init_array .fini_array .dynamic
   04     .dynamic
   05     .note.ABI-tag
   06     .eh_frame_hdr
   07
   08     .init_array .fini_array .dynamic .got
   None   .comment .symtab .strtab .shstrtab <--- THIS LINE
```

Reviewers: grimar, rupprecht, jhenderson, espindola

Reviewed By: rupprecht

Subscribers: khemant, emaste, arichardson, llvm-commits

Differential Revision: https://reviews.llvm.org/D57700

llvm-svn: 353217
2019-02-05 21:01:01 +00:00
Thomas Lively 315056692d [WebAssembly] Lower memmove to memory.copy
Summary: The lowering is identical to the memcpy lowering.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57727

llvm-svn: 353216
2019-02-05 20:57:40 +00:00
Evandro Menezes e5bb58b115 [TargetLibraryInfo] Regroup run time functions for Windows (NFC)
Regroup supported and unsupported functions by precision and C standard.

llvm-svn: 353213
2019-02-05 20:24:21 +00:00
Matt Arsenault 5b3084e3ab Move some llvm-mc tests where they belong
llvm-svn: 353211
2019-02-05 20:12:48 +00:00
Matt Arsenault a3d0c5adaf GlobalISel: Verify G_GEP
llvm-svn: 353209
2019-02-05 20:04:12 +00:00
Scott Linder e2c5847414 [AMDGPU] Consider XOR in waterfall loop as a terminator
Ensure the XOR in the waterfall loop for indirect addressing is considered a terminator.

Differential Revision: https://reviews.llvm.org/D57703

llvm-svn: 353207
2019-02-05 19:50:32 +00:00
Alexey Bataev f3a9150324 [DEBUG_INFO][NVPTX] Generate DW_AT_address_class to get the values in debugger.
Summary:
According to
https://docs.nvidia.com/cuda/archive/10.0/ptx-writers-guide-to-interoperability/index.html#cuda-specific-dwarf,
the compiler should emit the DW_AT_address_class attribute for all
variable and parameter. It means, that DW_AT_address_class attribute
should be used in the non-standard way to support compatibility with the
cuda-gdb debugger.
Clang is able to generate the information about the variable address
class. This information is emitted as the expression sequence
`DW_OP_constu <DWARF Address Space> DW_OP_swap DW_OP_xderef`. The patch
tries to find all such expressions and transform them into
`DW_AT_address_class <DWARF Address Space>` if target is NVPTX and the debugger is gdb.
If the expression is not found, then default values are used. For the
local variables <DWARF Address Space> is set to ADDR_local_space(6), for
the globals <DWARF Address Space> is set to ADDR_global_space(5). The
values are taken from the table in the same section 5.2. CUDA-Specific
DWARF Definitions.

Reviewers: echristo, probinson

Subscribers: jholewinski, aprantl, llvm-commits

Differential Revision: https://reviews.llvm.org/D57157

llvm-svn: 353203
2019-02-05 19:33:47 +00:00
Matt Arsenault a3f9b71c09 AMDGPU: Fix assert on trunc from bitcast of build_vector
The v2i64 argument is lowered to a bitcast of v4i32 build_vector.
This would then attempt to use the i32-element as the source of the
vector truncate. This really would need to collect 2 elements from the
build_vector to produce the intended truncate.

llvm-svn: 353202
2019-02-05 19:23:57 +00:00
Simon Pilgrim b0afc69435 [X86][SSE] Disable ZERO_EXTEND shuffle combining
rL352997 enabled ZERO_EXTEND from non-shuffle-able value types. I've disabled it for now to fix a regression identified by @asbirlea until I can fix this properly.

llvm-svn: 353198
2019-02-05 19:15:48 +00:00
Petar Jovanovic 40a7f63c37 [PGO] Fix the type of the formated variable
Change the format type of Value to PRIu64 since it is a uint64_t.
The problem was detected on mips boards building 32-bit binaries,
where it was printing junk values and causing test failure.

Patch by Milos Stojanovic.

Differential Revision: https://reviews.llvm.org/D57583

llvm-svn: 353194
2019-02-05 18:09:28 +00:00
Robert Widmann d5444ccf17 [LLVM-C] Add Bindings to GlobalIFunc
Summary:
Adds the standard gauntlet of accessors for global indirect functions and updates the echo test.

Now it would be nice to have a target abstraction so one could know if they have access to a suitable ELF linker and runtime.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D56177

llvm-svn: 353193
2019-02-05 18:05:44 +00:00
Anton Korobeynikov b26134bf92 Enable integrated assembler on MSP430 by default.
Patch by Kristina Bessonova!

Differential Revision: https://reviews.llvm.org/D56787

llvm-svn: 353192
2019-02-05 18:01:45 +00:00
Oliver Stannard 78dc38ec94 [AArch64][Outliner] Don't outline BTI instructions
We can't outline BTI instructions, because they need to be the very first
instruction executed after an indirect call or branch. If we outline them, then
an indirect call might go to the branch to the outlined function, which will
fault.

Differential revision: https://reviews.llvm.org/D57753

llvm-svn: 353190
2019-02-05 17:21:57 +00:00
Simon Pilgrim 822d2e35e7 [X86][AVX] Attempt to combine shuffles to subvector broadcast load
llvm-svn: 353189
2019-02-05 17:02:49 +00:00
Matt Arsenault 86504fb20e AArch64/GlobalISel: Don't clamp from 2 to 2
This is equivalent to clampMaxNumElements, but saves a check.

llvm-svn: 353188
2019-02-05 16:57:18 +00:00
Sam Clegg d9c9dc036c [WebAssembly] Object: Remove redundant method. NFC.
Differential Revision: https://reviews.llvm.org/D57719

llvm-svn: 353183
2019-02-05 16:30:21 +00:00
Simon Pilgrim ec5a6761e5 [X86][AVX] Add PR34041 subvector broadcast test cases
llvm-svn: 353182
2019-02-05 16:18:30 +00:00
Sanjay Patel b696b771bf [CGP] add test for unsigned subtract of 1 with overflow; NFC
llvm-svn: 353179
2019-02-05 15:27:40 +00:00
Sanjay Patel 237e208f16 [AArch64][x86] add tests for unsigned subtract with overflow; NFC
llvm-svn: 353178
2019-02-05 15:26:42 +00:00
Nico Weber 50be01149c gn build: BUILD.gn files for clang-tidy and clang-apply-replacements
Patch from Mirko Bonadei <mbonadei@webrtc.org>!

Differential Revision: https://reviews.llvm.org/D57329

llvm-svn: 353177
2019-02-05 15:14:38 +00:00
Krasimir Georgiev 12971803c4 Fix typo in comment, NFCI
llvm-svn: 353176
2019-02-05 15:00:56 +00:00
Nico Weber 697f914dff gn build: Merge r353072
llvm-svn: 353175
2019-02-05 14:47:36 +00:00
Thomas Preud'homme a5e233bf79 Recommit: Detect incorrect FileCheck variable CLI definition
Summary:
While the backend code of FileCheck relies on definition of variable
from the command-line to have an equal sign '=' and a variable name
before that, the frontend does not actually enforce it. This leads to
FileCheck crashing when invoked with invalid syntax for the -D option.

This patch adds the missing validation in the frontend. It also makes
the -D option an AlwaysPrefix option to be able to detect -D=FOO as
being a define without variable and -D as missing its value.

Copyright:
- Linaro (changes in version 2 of revision D55940)
- GraphCore (changes in later versions)

Reviewers: jdenny

Subscribers: JonChesterfield, hiraditya, kristina, probinson,
llvm-commits

Differential Revision: https://reviews.llvm.org/D55940

llvm-svn: 353173
2019-02-05 14:17:28 +00:00
Thomas Preud'homme f929a0f81b Recommit: Add support for prefix-only CLI options
Summary:
Add support for options that always prefix their value, giving an error
if the value is in the next argument or if the option is given a value
assignment (ie. opt=val). This is the desired behavior for the -D option
of FileCheck for instance.

Copyright:
- Linaro (changes in version 2 of revision D55940)
- GraphCore (changes in later versions and introduced when creating
  D56549)

Reviewers: jdenny

Subscribers: llvm-commits, probinson, kristina, hiraditya,
JonChesterfield

Differential Revision: https://reviews.llvm.org/D56549

llvm-svn: 353172
2019-02-05 14:17:16 +00:00
Simon Pilgrim 8450ad17a9 [X86][SSE] Rename SimplifyDemandedVectorElts BLENDV tests
I'm going to be adding SimplifyDemandedBits tests shortly.

llvm-svn: 353171
2019-02-05 14:11:50 +00:00
Andrea Di Biagio 4bce783ee3 [MCA] Moved the logic that updates register dependencies from DispatchStage to RegisterFile. NFC
DispatchStage should always delegate to an object of class RegisterFile the task
of updating data dependencies.  ReadState and WriteState objects should not be
modified directly by DispatchStage.
This patch also renames stage IS_AVAILABLE to IS_DISPATCHED.

llvm-svn: 353170
2019-02-05 14:11:41 +00:00
Serge Guelton cad6336675 gn build: Fix Python 3 write_vcsrevision script compatibility
Trivial fix: decode was not called for all subprocess.check_output calls.

Commited on behalf of Andrew Boyarshin

Differential Revision: https://reviews.llvm.org/D57505

llvm-svn: 353168
2019-02-05 13:01:12 +00:00
Simon Pilgrim 62af24cc93 [X86][SSE] Add SimplifyDemandedVectorElts support for X86ISD::BLENDV
llvm-svn: 353165
2019-02-05 12:27:29 +00:00
Simon Pilgrim bd3adbb899 [X86][SSE] Add tests showing missing SimplifyDemandedVectorElts support for X86ISD::BLENDV
llvm-svn: 353164
2019-02-05 12:18:34 +00:00
Andrea Di Biagio 998a925e0e [MCA] Simplify the logic in method WriteState::addUser. NFCI
In some cases, it is faster to just grow the set of 'Users' rather than
performing a llvm::find_if every time a new user is added to
the set. No functional change intended.

llvm-svn: 353162
2019-02-05 11:36:55 +00:00
Jeremy Morse 84ca706be1 [DebugInfo][NFCI] Split salvageDebugInfo into helper functions
Some use cases are appearing where salvaging is needed that does not
correspond to an instruction being deleted -- for example an instruction
being sunk, or a Value not being available in a block being isel'd.

Enable more fine grained control over how salavging occurs by splitting
the logic into helper functions, separating things that are specific to
working on DbgVariableIntrinsics from those specific to interpreting IR
and building DIExpressions.

Differential Revision: https://reviews.llvm.org/D57696

llvm-svn: 353156
2019-02-05 11:11:28 +00:00
Hans Wennborg 7279895454 Fix format string in bindings/go/llvm/ir_test.go (PR40561)
The test started failing for me recently. I don't see any changes around
this code, so maybe it's my local go version that changed or something.

The error seems real to me: we're trying to print an Attribute with %d.
The test talks about "attribute masks" I'm not sure what that refers to,
but I suppose we could print the raw pointer value, since that's
what the test seems to be comparing.

Differential revision: https://reviews.llvm.org/D57672

llvm-svn: 353155
2019-02-05 11:01:54 +00:00
Simon Pilgrim 9e595e3663 [X86][AVX] Attempt to share broadcasts of different widths (PR39454)
If we have broadcasts of different vector widths, keep the longest vector width and extract subvectors for the shorter vectors (which should be free).

Differential Revision: https://reviews.llvm.org/D57663

llvm-svn: 353154
2019-02-05 10:58:43 +00:00
Simon Pilgrim fbb3086fc8 [CostModel][X86] Add UMUL fixed point cost tests
llvm-svn: 353153
2019-02-05 10:55:38 +00:00
Florian Hahn 3b251963c3 [CGP] Add support for sinking operands to their users, if they are free.
This patch improves code generation for some AArch64 ACLE intrinsics. It adds
support to CGP to duplicate and sink operands to their user, if they can be
folded into a target instruction, like zexts and sub into usubl. It adds a
TargetLowering hook shouldSinkOperands, which looks at the operands of
instructions to see if sinking is profitable.

I decided to add a new target hook, as for the sinking to be profitable,
at least on AArch64, we have to look at multiple operands of an
instruction, instead of looking at the users of a zext for example.

The sinking is done in CGP, because it works around an instruction
selection limitation. If instruction selection is not limited to a
single basic block, this patch should not be needed any longer.

Alternatively this could be done in the LoopSink pass, which tries to
undo LICM for instructions in blocks that are not executed frequently.

Note that we do not force the operands to sink to have a single user,
because we duplicate them before sinking. Therefore this is only
desirable if they really can be done for free. Additionally we could
consider the impact on live ranges later on.

This should fix https://bugs.llvm.org/show_bug.cgi?id=40025.

As for performance, we have internal code that uses intrinsics and can
be speed up by 10% by this change.

Reviewers: SjoerdMeijer, t.p.northover, samparker, efriedma, RKSimon, spatel

Reviewed By: samparker

Differential Revision: https://reviews.llvm.org/D57377

llvm-svn: 353152
2019-02-05 10:27:40 +00:00
Diana Picus e24b104a11 [ARM GlobalISel] Support G_GEP for Thumb2
Same as ARM, but use a different opcode in the instruction selection.

llvm-svn: 353151
2019-02-05 10:21:37 +00:00
Dan Liew de5220ed5e Previously if the user configured their build but then changed
LLVM_ENABLED_PROJECT and reconfigured it had no effect on what
projects were actually built. This was very confusing behaviour. The
reason for this is that the value of the `LLVM_TOOL_<PROJECT>_BUILD`
variables are already set.

The problem here is that we have two sources of truth:

* The projects listed in LLVM_ENABLE_PROJECTS.
* The projects enabled/disabled with LLVM_TOOL_<PROJECT>_BUILD.

At configure time we have no real way of knowing which source of truth
the user wants so we apply the following heuristic:

If the user ever sets `LLVM_ENABLE_PROJECTS` in the CMakeCache then that
is used as the single source of truth and we force the
`LLVM_TOOL_<PROJECT>_BUILD` CMake cache variables to have the
appropriate values that match the contents of the
`LLVM_ENABLE_PROJECTS`. If the user never sets `LLVM_ENABLE_PROJECTS`
then they can continue to use and set the `LLVM_TOOL_<PROJECT>_BUILD`
variables as the "source of truth".

The problem with this approach is that if the user ever tries to use
both `LLVM_ENABLE_PROJECTS` and `LLVM_TOOL_<PROJECT>_BUILD` for the same
build directory then any user set value for `LLVM_TOOL_<PROJECT>_BUILD`
variables will get overwriten, likely without the user noticing.

Hopefully the above shouldn't matter in practice because the
LLVM_TOOL_<PROJECT>_BUILD variables are not documented, but
LLVM_ENABLE_PROJECTS is.

We should probably deprecate the `LLVM_TOOL_<PROJECT>_BUILD`
variables at some point by turning them into to regular CMake
variables that don't live in the CMake cache.

Differential Revision: https://reviews.llvm.org/D57535

llvm-svn: 353148
2019-02-05 08:47:28 +00:00
Hiroshi Inoue 02a2bb2f54 [NFC] fix trivial typos in comments
llvm-svn: 353147
2019-02-05 08:30:48 +00:00
Clement Courbet f3da6abf0f [DAG][NFC] Add unit tests.
In preparation for D57541.

llvm-svn: 353144
2019-02-05 08:00:17 +00:00
Clement Courbet 17b51b655e [DAG] BaseIndexOffset: FrameIndexSDNodes with the same FrameIndex compare equal.
Reviewers: niravd

Subscribers: arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57692

llvm-svn: 353143
2019-02-05 07:36:20 +00:00
Craig Topper f86eb00f12 [X86] Connect the default fpsr and dirflag clobbers in inline assembly to the registers we have defined for them.
Summary:
We don't currently map these constraints to physical register numbers so they don't make it to the MachineIR representation of inline assembly.

This could have problems for proper dependency tracking in the machine schedulers though I don't have a test case that shows that.

Reviewers: rnk

Reviewed By: rnk

Subscribers: eraman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57641

llvm-svn: 353141
2019-02-05 06:13:06 +00:00
Peter Collingbourne 6141b037a9 gn build: Upgrade to NDK r19.
NDK r19 includes a sysroot that can be used directly by the compiler
without creating a standalone toolchain, so we just need a handful
of flags to point Clang there.

Differential Revision: https://reviews.llvm.org/D57733

llvm-svn: 353139
2019-02-05 05:10:19 +00:00
Craig Topper 31259c52ff [X86] Add test case from PR40529. NFC
llvm-svn: 353138
2019-02-05 04:48:23 +00:00
Max Kazantsev d5e595b7a6 [LSR] Check SCEV on isZero() after extend. PR40514
When LSR first adds SCEVs to BaseRegs, it only does it if `isZero()` has
returned false. In the end, in invocation of `InsertFormula`, it asserts that
all values there are still not zero constants. However between these two
points, it makes some transformations, in particular extends them to wider
type.

SCEV does not give us guarantee that if `S` is not a constant zero, then
`sext(S)` is also not a constant zero. It might have missed some optimizing
transforms when it was calculating `S` and then made them when it took `sext`.
For example, it may happen if previously optimizing transforms were limited
by depth or somehow else.

This patch adds a bailout when we may end up with a zero SCEV after extension.

Differential Revision: https://reviews.llvm.org/D57565
Reviewed By: samparker

llvm-svn: 353136
2019-02-05 04:30:37 +00:00
Teresa Johnson b0bf530fb5 [SamplePGO] More pipeline changes when flattened profile used in ThinLTO postlink
Summary:
Follow on to D54819/r351476.

We also don't need to perform extra InstCombine pass when we aren't
loading the sample profile in the ThinLTO backend because we have a
flattened sample profile.

Additionally, for consistency and clarity, when we aren't reloading the
sample profile, perform ICP in the same location as non-sample PGO
backends. To this end I have moved the ICP invocation for non-SamplePGO
ThinLTO down into buildModuleSimplificationPipeline (partly addresses
the FIXME where we were previously setting this up).

Reviewers: wmi

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57705

llvm-svn: 353135
2019-02-05 04:09:19 +00:00
Richard Trieu a9354b2f33 Fix narrowing issue from r353129
llvm-svn: 353134
2019-02-05 02:26:03 +00:00
Heejin Ahn e37ba2cb96 [WebAssembly] Fix indentation after adding IsCanonical property (NFC)
llvm-svn: 353132
2019-02-05 01:59:49 +00:00
Wouter van Oortmerssen 1a91cb0402 [WebAssembly] Make disassembler always emit most canonical name.
Summary:
There are a few instructions that all map to the same opcode, so
when disassembling, we have to pick one. That was just the first one
before (the except_ref variant in the case of "call"), now it is the
one marked as IsCanonical in tablegen, or failing that, the shortest
name (which is typically the "canonical" one).

Also introduced a canonical "end" instruction for this purpose.

Reviewers: dschuff, tlively

Subscribers: sbc100, jgravelle-google, aheejin, llvm-commits, sunfish

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57713

llvm-svn: 353131
2019-02-05 01:19:45 +00:00
Wei Mi 4901f371a2 [SamplePGO][NFC] Minor improvement to replace a temporary vector with a
brace-enclosed init list.

Differential Revision: https://reviews.llvm.org/D57726

llvm-svn: 353129
2019-02-05 00:57:50 +00:00
Matt Arsenault 2bf74ec8c5 GlobalISel: Fix verifier crashing on non-register operands
Also correct the wording of error on subregisters.

llvm-svn: 353128
2019-02-05 00:53:22 +00:00
Thomas Lively d99af23765 [WebAssembly] memory.copy
Summary: Depends on D57495.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish

Differential Revision: https://reviews.llvm.org/D57498

llvm-svn: 353127
2019-02-05 00:49:55 +00:00
Matt Arsenault 7f09fd6b04 GlobalISel: Consolidate load/store legalization
The fewerElementsVectors implementation for load/stores
handles the scalar reduction case just as well, so drop
the redundant code in narrowScalar. This also introduces
support for narrowing irregular size breakdowns for
scalars.

llvm-svn: 353125
2019-02-05 00:26:12 +00:00
Craig Topper d4e37afe45 [DAGCombiner] Discard pointer info when combining extract_vector_elt of a vector load when the index isn't constant
Summary:
If the index isn't constant, this transform inserts a multiply and an add on the index to calculating the base pointer for a scalar load. But we still create a memory operand with an offset of 0 and the size of the scalar access. But the access is really to an unknown offset within the original access size.

This can cause the machine scheduler to incorrectly calculate dependencies between this load and other accesses. In the case we saw, there was a 32 byte vector store that was split into two 16 byte stores, one with offset 0 and one with offset 16. The size of the memory operand for both was 16. The scheduler correctly detected the alias with the offset 0 store, but not the offset 16 store.

This patch discards the pointer info so we don't incorrectly detect aliasing. I wasn't sure if we could keep using the original offset and size without risking some other transform on the load changing the size.

I tried to reduce a test case, but there's still a lot of memory operations needed to get the scheduler to do the bad reordering. So it looked pretty fragile to maintain.

Reviewers: efriedma

Reviewed By: efriedma

Subscribers: arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57616

llvm-svn: 353124
2019-02-05 00:22:23 +00:00
Teresa Johnson 4bdf82ce79 [SamplePGO] Minor efficiency improvement in samplePGO ICP
Summary:
When attaching prof metadata to promoted direct calls in SamplePGO
mode, no need to construct and use a SmallVector to pass a single count
to the ArrayRef parameter, we can simply use a brace-enclosed init list.

This made a small but consistent improvement for a ThinLTO backend
compile I was measuring.

Reviewers: wmi

Subscribers: mehdi_amini, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57706

llvm-svn: 353123
2019-02-05 00:18:38 +00:00
Matt Arsenault 81511e5428 GlobalISel: Implement narrowScalar for select
Don't handle vector conditions.

I think this can be merged in the future with
fewerElementsVectorSelect, although this becomes slightly tricky with
a vector condition.

llvm-svn: 353122
2019-02-05 00:13:44 +00:00
Matt Arsenault 24f14993e8 GlobalISel: Combine g_extract with g_merge_values
Try to use the underlying source registers.

This enables legalization in more cases where some irregular
operations are widened and others narrowed.

This seems to make the test_combines_2 AArch64 test worse, since the
MERGE_VALUES has multiple uses. Since this should be required for
legalization, a hasOneUse check is probably inappropriate (or maybe
should only be used if the merge is legal?).

llvm-svn: 353121
2019-02-04 23:41:59 +00:00
Julian Lettner 98b9f5b4b3 [Sanitizers] UBSan unreachable incompatible with Kernel ASan
Summary:
This is a follow up for https://reviews.llvm.org/D57278. The previous
revision should have also included Kernel ASan.

rdar://problem/40723397

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D57711

llvm-svn: 353120
2019-02-04 23:37:50 +00:00
Sam Clegg ae28be3a8a [llvm-readobj] Fix readobj test expectation broken in rL353109. NFC.
llvm-svn: 353119
2019-02-04 23:36:38 +00:00
Evandro Menezes 98f356cd74 Revert "[PATCH] [TargetLibraryInfo] Update run time support for Windows"
This reverts accidental commit ff5527718d.

llvm-svn: 353118
2019-02-04 23:34:50 +00:00
Evandro Menezes d016763774 [ADT] Refactor the Windows query functions (NFC)
Increase reuse in the query functions for Windows.

llvm-svn: 353117
2019-02-04 23:34:38 +00:00
Evandro Menezes ff5527718d [PATCH] [TargetLibraryInfo] Update run time support for Windows
It seems that the run time for Windows has changed and supports more math
functions than before.  Since LLVM requires at least VS2015, I assume that
this is the run time that would be redistributed with programs built with
Clang.  Thus, I based this update on the header file `math.h` that
accompanies it.

This patch addresses the PR40541.  Unfortunately, I have no access to a
Windows development environment to validate it.

llvm-svn: 353114
2019-02-04 23:29:41 +00:00
Matt Arsenault 1f795e2c2a GlobalISel: Enforce operand types for constants
A number of of tests were using imm operands, not cimm. Since CSE
relies on the exact ConstantInt* pointer used, and implicit
conversions are generally evil, also enforce the bitsize of the types.

llvm-svn: 353113
2019-02-04 23:29:31 +00:00
Matt Arsenault f2a26339e2 GlobalISel: Verify g_select
Factor the common vector element consistency check many instructions
need out, although this makes the error messages worse.

llvm-svn: 353112
2019-02-04 23:29:16 +00:00
Matt Arsenault 46f9c6cf0b MachineVerifier: Move verification of G_* instructions to function
llvm-svn: 353111
2019-02-04 23:29:11 +00:00
Sam Clegg 313f9f54f5 [WebAssembly] MC: Mark more function aliases as functions
Aliases of functions are now marked as function symbols even if
they are bitcast to some other other non-function type.
This is important for WebAssembly where object and function
symbols can't alias each other.

Fixes PR38866

Differential Revision: https://reviews.llvm.org/D57538

llvm-svn: 353109
2019-02-04 23:07:34 +00:00
Matt Arsenault 8a59b1919c MIR: Validate LLT types when parsing
llvm-svn: 353107
2019-02-04 22:59:56 +00:00
Sanjay Patel 738e17b5df [CGP] fix bogus test names/comments; NFC
Inverted operand 0 and operand 1.

llvm-svn: 353106
2019-02-04 22:37:05 +00:00
Sam Clegg 3fd2462d03 [llvm-readobj] Report more WebAssembly symbol info
Differential Revision: https://reviews.llvm.org/D57695

llvm-svn: 353104
2019-02-04 22:27:46 +00:00
Sanjay Patel 1f9e23e3cc [CGP] add tests for usubo; NFC
llvm-svn: 353103
2019-02-04 22:27:08 +00:00
Matt Arsenault 3d6a49b0b9 GlobalISel: Fix not calling observer when legalizing bitcount ops
This was hiding bugs from never legalizing the source type.

llvm-svn: 353102
2019-02-04 22:26:33 +00:00
Matt Arsenault cba0c6d0c9 AMDGPU: Don't rematerialize mov with implicit operands
This was pulling the mov used for register indexing on gfx9 out of the
loop.

llvm-svn: 353101
2019-02-04 22:26:21 +00:00
Julian Lettner 29ac3a5b82 [SanitizerCoverage] Clang crashes if user declares `__sancov_lowest_stack` variable
Summary:
If the user declares or defines `__sancov_lowest_stack` with an
unexpected type, then `getOrInsertGlobal` inserts a bitcast and the
following cast fails:
```
Constant *SanCovLowestStackConstant =
       M.getOrInsertGlobal(SanCovLowestStackName, IntptrTy);
SanCovLowestStack = cast<GlobalVariable>(SanCovLowestStackConstant);
```

This variable is a SanitizerCoverage implementation detail and the user
should generally never have a need to access it, so we emit an error
now.

rdar://problem/44143130

Reviewers: morehouse

Differential Revision: https://reviews.llvm.org/D57633

llvm-svn: 353100
2019-02-04 22:06:30 +00:00
David Major 1137fce9e9 gn build: Windows: use a more standard format for PDB filenames
The current build was producing names like llvm-undname.exe.pdb, which looks unusual to me at least. This switches them to the more common llvm-undname.pdb style.

Differential Revision: https://reviews.llvm.org/D57613

llvm-svn: 353099
2019-02-04 21:27:38 +00:00
David Major d1934853a8 gn build: Revert r353094 (bad merge)
llvm-svn: 353098
2019-02-04 21:25:13 +00:00
Nicolai Haehnle a69146e67e [InstCombine] Cleanup the TFE/LWE check in AMDGPU SimplifyDemanded
Summary:
The fix added in r352904 is not quite correct, or rather misleading:

1. When the texfailctrl (TFC) argument was non-constant, the fix assumed
   non-TFE/LWE, which is incorrect.

2. Regardless, this code path cannot even be hit for correct
   TFE/LWE-enabled calls, because those return a struct. Added
   a test case for those for completeness.

Change-Id: I92d314dbc67a2670f6d7adaab765ef45f56a49cf

Reviewers: hliao, dstuttard, arsenm

Subscribers: kzhuravl, jvesely, wdng, yaxunl, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57681

llvm-svn: 353097
2019-02-04 21:24:19 +00:00
Craig Topper 4ca0b850e0 [X86] Add test case for report_fatal_error added in r352699.
r352699 replaced an llvm_unreachable with a report_fatal_error. This patch adds a test case for it.

llvm-svn: 353096
2019-02-04 21:24:15 +00:00
Craig Topper c45e39b35f [CodeGen][ARC][SystemZ][WebAssembly] Use MachineInstr::isInlineAsm in more places instead of just comparing opcode. NFCI
I'm looking at adding a second INLINEASM opcode for better modeling asm-goto
as a terminator. Using the existing predicate will reduce teh number of
places that will need to use the new opcode.

llvm-svn: 353095
2019-02-04 21:24:13 +00:00
David Major 1469ff417b gn build: Windows: use a more standard format for PDB filenames
The current build was producing names like llvm-undname.exe.pdb, which looks unusual to me at least. This switches them to the more common llvm-undname.pdb style.

Differential Revision: https://reviews.llvm.org/D57613

llvm-svn: 353094
2019-02-04 21:20:25 +00:00
David Major 3c659cb267 gn build: Windows: write PDBs when is_debug
Without /DEBUG, the /Zi doesn't on its own create PDB files.

And since ninja runs multiple compilations in parallel, we need /FS to prevent contention on PDBs.

Differential Revision: https://reviews.llvm.org/D57612

llvm-svn: 353093
2019-02-04 21:13:43 +00:00
Aditya Nandakumar 9b6b9a5791 [Tablegen][DAG]: Fix build breakage when LLVM_ENABLE_DAGISEL_COV=1
LLVM_ENABLE_DAGISEL_COV can be used to instrument DAGISel tablegen
selection code to show which patterns along with Complex patterns were
used when selecting instructions. Unfortunately this is turned off by
default and was broken but never tested.
This required a simple fix (missing new line) to get it to build again.

llvm-svn: 353091
2019-02-04 21:06:24 +00:00
Philip Pfaffe 0ee6a933ce [NewPM][MSan] Add Options Handling
Summary: This patch enables passing options to msan via the passes pipeline, e.e., -passes=msan<recover;kernel;track-origins=4>.

Reviewers: chandlerc, fedor.sergeev, leonardchan

Subscribers: hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57640

llvm-svn: 353090
2019-02-04 21:02:49 +00:00
Wolfgang Pieb 90d856cd5f [DEBUGINFO] Reposting r352642: Handle restore instructions in LiveDebugValues
The LiveDebugValues pass recognizes spills but not restores, which can
cause large gaps in location information for some variables, depending
on control flow. This patch make LiveDebugValues recognize restores and
generate appropriate DBG_VALUE instructions.

This patch was posted previously with r352642 and reverted in r352666 due
to buildbot errors. A missing return statement was the cause for the 
failures.

Reviewers: aprantl, NicolaPrica

Differential Revision: https://reviews.llvm.org/D57271

llvm-svn: 353089
2019-02-04 20:42:45 +00:00
Scott Linder d19d197221 [AMDGPU] Support emitting GOT relocations for function calls
Differential Revision: https://reviews.llvm.org/D57416

llvm-svn: 353083
2019-02-04 20:00:07 +00:00
Michael Kruse 70560a0a2c [WarnMissedTransforms] Do not warn about already vectorized loops.
LoopVectorize adds llvm.loop.isvectorized, but leaves
llvm.loop.vectorize.enable. Do not consider such a loop for user-forced
vectorization since vectorization already happened -- by prioritizing
llvm.loop.isvectorized except for TM_SuppressedByUser.

Fixes http://llvm.org/PR40546

Differential Revision: https://reviews.llvm.org/D57542

llvm-svn: 353082
2019-02-04 19:55:59 +00:00
Matt Arsenault 22309c8701 GlobalISel: Fix CheckMachineFunction passing if ReadCheckFile files
This could be tested, but the FileCheck library spams the error
message to the console.

llvm-svn: 353081
2019-02-04 19:53:22 +00:00
Matt Arsenault f3a46d0ae9 GlobalISel: Allow constructing SrcOp/DstOp from MachineOperand
llvm-svn: 353080
2019-02-04 19:53:19 +00:00
Matt Arsenault d7fa13c121 GlobalISel: Fix parameter name in documentation
llvm-svn: 353078
2019-02-04 19:16:58 +00:00
Matt Arsenault 8121ec26c0 GlobalISel: Fix CSE handling of buildConstant
This fixes two problems with CSE done in buildConstant. First, this
would hit an assert when used with a vector result type. Solve this by
allowing CSE on the vector elements, but not on the result vector for
now.

Second, this was also performing the CSE based on the input
ConstantInt pointer. The underlying buildConstant could potentially
convert the constant depending on the result type, giving in a
different ConstantInt*. Stop allowing the APInt and ConstantInt forms
from automatically casting to the result type to avoid any similar
problems in the future.

llvm-svn: 353077
2019-02-04 19:15:50 +00:00
Heejin Ahn 18c56a0762 [WebAssembly] clang-tidy (NFC)
Summary:
This patch fixes clang-tidy warnings on wasm-only files.
The list of checks used is:
`-*,clang-diagnostic-*,llvm-*,misc-*,-misc-unused-parameters,readability-identifier-naming,modernize-*`
(LLVM's default .clang-tidy list is the same except it does not have
`modernize-*`. But I've seen in multiple CLs in LLVM the modernize style
was recommended and code was fixed based on the style, so I added it as
well.)

The common fixes are:
- Variable names start with an uppercase letter
- Function names start with a lowercase letter
- Use `auto` when you use casts so the type is evident
- Use inline initialization for class member variables
- Use `= default` for empty constructors / destructors
- Use `using` in place of `typedef`

Reviewers: sbc100, tlively, aardappel

Subscribers: dschuff, sunfish, jgravelle-google, yurydelendik, kripken, MatzeB, mgorny, rupprecht, llvm-commits

Differential Revision: https://reviews.llvm.org/D57500

llvm-svn: 353075
2019-02-04 19:13:39 +00:00
Jordan Rupprecht 2e862c7555 [llvm-objcopy][NFC] simplify an error return
llvm-svn: 353074
2019-02-04 19:09:20 +00:00
Roman Lebedev b7ecc9b624 [X86] X86DAGToDAGISel::matchBitExtract(): prepare 'control' in 32 bits
Summary:
Noticed while looking at D56052.
```
  // The 'control' of BEXTR has the pattern of:
  // [15...8 bit][ 7...0 bit] location
  // [ bit count][     shift] name
  // I.e. 0b000000011'00000001 means  (x >> 0b1) & 0b11
```
I.e. we do not care about any of the bits aside from the low 16 bits.
So there is no point in doing the `slh`,`or` in 64 bits,
let's just do everything in 32 bits, and anyext if needed.

We could do that in 16 even, but we intentionally don't
zext to i16 (longer encoding IIRC),
so i'm guessing the same applies here.

Reviewers: craig.topper, andreadb, RKSimon

Reviewed By: craig.topper

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D56715

llvm-svn: 353073
2019-02-04 19:04:26 +00:00
Matt Arsenault b3e86709dc GlobalISel: Improve gtest usage
Don't unnecessarily use ASSERT_*, and print the MachineFunction
on failure.

llvm-svn: 353072
2019-02-04 18:58:27 +00:00
David Callahan fd3e7a9320 Adjust cardinality of internal inliner thresholds
Summary:
While compiling openJDK11 (also other workloads), some make files would pass both  CFLAGS  and LDFLAGS at link step ; resulting in duplicate options on the command line when one is using LTO and trying to influence the inliner. Most of the internal flags are ZeroOrMore, this diff changes the remaining ones.

Reviewers: david2050, twoh, modocache

Reviewed By: twoh

Subscribers: mehdi_amini, dexonsmith, eraman, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57537

Patch by: Abdoul-Kader Keita

llvm-svn: 353071
2019-02-04 18:46:25 +00:00
Craig Topper 8ea72a8201 [X86] Add ST0 as an implicit def/use of x87 load/store instructions during FP stackifying.
These instructions implicitly operate on ST0, but we don't currently add that information to the MachineInstr. We also don't add it the tablegen definitions either.

For the most part this doesn't cause any problems because the stackifying occurs after register allocation. All the instructions are marked as having side effects so the postRA scheduler won't reorder them amongst themselves.

But nothing stops inline assembly using X87 instructions from being reordered around other x87 instructions if that inline assembly wasn't marked volatile.

The two test cases I've identified so far in PR40539 involve loads and stores used to set up the inline assembly or capture the results of the inline assembly ending up in the wrong order.

This patch adds implicit ST0 uses/defs to the load/store instructions to prevent this from happening.

I plan to fix all of the FP instructions, but the binops are bit trickier to get right. So I've chosen fixing the known test cases as a good first step.

I think we also need to update the tablegen descriptions so MS inline assembly infers the right clobbers, but I haven't checked that yet.

Differential Revision: https://reviews.llvm.org/D57644

llvm-svn: 353070
2019-02-04 18:43:55 +00:00
Matt Arsenault 0723828675 GlobalISel: Fix moreElementsToNextPow2
This was completely broken. The condition was inverted, and changed
the element type for vectors of pointers.

Fixes bug 40592.

llvm-svn: 353069
2019-02-04 18:42:24 +00:00
Jordan Rupprecht 5745c5f54f [llvm-objcopy][NFC] Use StringSaver for --keep-global-symbols
Summary: Use StringSaver/BumpPtrAlloc when parsing lines from --keep-global-symbols files. This allows us to consistently use StringRef for driver options, which avoids copying the full strings for each object copied, as well as simplifies part of D57517.

Reviewers: jhenderson, evgeny777, alexshap

Subscribers: jakehehrlich

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57617

llvm-svn: 353068
2019-02-04 18:38:00 +00:00
Wouter van Oortmerssen 0b3cf247c4 [WebAssembly] Make segment/size/type directives optional in asm
Summary:
These were "boilerplate" that repeated information already present
in .functype and end_function, that needed to be repeated to Please
the particular way our object writing works, and missing them would
generate errors.

Instead, we generate the information for these automatically so the
user can concern itself with writing more canonical wasm functions
that always work as expected.

Reviewers: dschuff, sbc100

Subscribers: jgravelle-google, aheejin, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D57546

llvm-svn: 353067
2019-02-04 18:03:11 +00:00
Jessica Paquette 92834ffcbf Revert "[GlobalISel] Introduce a generic floating point floor opcode, G_FFLOOR"
This reverts commit b05ecba6d687fcb3078509220c67458bf1d77a2e.

Apparently adding floor breaks AMDGPU somehow, so I have to back this out
while I look into it.

llvm-svn: 353065
2019-02-04 17:32:47 +00:00
Jessica Paquette 834bded9d6 Revert "[GlobalISel] Add IRTranslator support for G_FFLOOR"
This reverts commit 8bbd570fd5205a04d88d2e5513a6e4adbd028039.

Apparently adding ffloor breaks AMDGPU somehow, so I need to back this out
while I look into it.

llvm-svn: 353064
2019-02-04 17:32:43 +00:00
Nico Weber 11256b213e gn build: Merge r352944
llvm-svn: 353063
2019-02-04 17:32:36 +00:00
Sam Clegg d1152a267c [WebAssembly] Rename relocations from R_WEBASSEMBLY_ to R_WASM_
See https://github.com/WebAssembly/tool-conventions/pull/95.

This is less typing and IMHO more readable, and it also fits with
our naming around the binary format which tends to use the short name.
e.g.

include/llvm/BinaryFormat/Wasm.h
tools/llvm-objdump/WasmDump.cpp
etc..

Differential Revision: https://reviews.llvm.org/D57611

llvm-svn: 353062
2019-02-04 17:28:46 +00:00
Craig Topper bf7593ec4a [X86] Print all register forms of x87 fadd/fsub/fdiv/fmul as having two arguments where on is %st.
All of these instructions consume one encoded register and the other register is %st. They either write the result to %st or the encoded register. Previously we printed both arguments when the encoded register was written. And we printed one argument when the result was written to %st. For the stack popping forms the encoded register is always the destination and we didn't print both operands. This was inconsistent with gcc and objdump and just makes the output assembly code harder to read.

This patch changes things to always print both operands making us consistent with gcc and objdump. The parser should still be able to handle the single register forms just as it did before. This also matches the GNU assembler behavior.

llvm-svn: 353061
2019-02-04 17:28:18 +00:00
Sam Clegg d00c02c0f9 [WebAssembly] Remove redundant namespaces qualifiers. NFC.
Differential Revision: https://reviews.llvm.org/D57610

llvm-svn: 353060
2019-02-04 17:26:22 +00:00
Leonard Chan 68d428e578 [Intrinsic] Unsigned Fixed Point Multiplication Intrinsic
Add an intrinsic that takes 2 unsigned integers with the scale of them
provided as the third argument and performs fixed point multiplication on
them.

This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.

Differential Revision: https://reviews.llvm.org/D55625

llvm-svn: 353059
2019-02-04 17:18:11 +00:00
Jessica Paquette 73158e7201 [GlobalISel] Add IRTranslator support for G_FFLOOR
Follow-up to https://reviews.llvm.org/D57484

Adds G_FFLOOR to translateKnownIntrinsic and update arm64-irtranslator.ll.

Differential Revision: https://reviews.llvm.org/D57485

llvm-svn: 353058
2019-02-04 17:15:34 +00:00
Jessica Paquette 616a1fb492 [GlobalISel] Introduce a generic floating point floor opcode, G_FFLOOR
This introduces a generic opcode for floating point floor, working towards
selecting @llvm.floor.

Differential Revision: https://reviews.llvm.org/D57484

llvm-svn: 353057
2019-02-04 17:10:55 +00:00
Sanjay Patel c00bdab4c8 [CGP] use IRBuilder to simplify code
This is no-functional-change-intended although there could
be intermediate variations caused by a difference in the
debug info produced by setting that from the builder's 
insertion point. 

I'm updating the IR test file associated with this code just
to show that the naming differences from using the builder
are visible.

The motivation for adding a helper function is that we are
likely to extend this code to deal with other overflow ops.

llvm-svn: 353056
2019-02-04 16:30:46 +00:00
James Henderson 9652652a32 [CommandLine] Don't print empty sentinel values from EnumValN lists in help text
In order to make an option value truly optional, both the ValueOptional
attribute and an empty-named value are required. Prior to this change,
this empty-named value appears in the command-line help text:

-some-option - some help text
  =v1        - description 1
  =v2        - description 2
  =          -

This change improves the help text for these sort of options in a number
of ways:

1) ValueOptional options with an empty-named value now print their help
   text twice: both without and then with '=<value>' after the name. The
   latter version then lists the allowed values after it.
2) Empty-named values with no help text in ValueOptional options are not
   listed in the permitted values.

-some-option         - some help text
-some-option=<value> - some help text
  =v1                - description 1
  =v2                - description 2

3) Otherwise empty-named options are printed as =<empty> rather than
   simply '='.
4) Option values without help text do not have the '-' separator
   printed.

-some-option=<value> - some help text
  =v1                - description 1
  =v2
  =<empty>           - description

It also tweaks the llvm-symbolizer -functions help text to not print a
trailing ':' as that looks bad combined with 1) above.

This is mostly a reland of r353048 which in turn was a reland of
r352750.

Reviewed by: ruiu, thopre, mstorsjo

Differential Revision: https://reviews.llvm.org/D57030

llvm-svn: 353053
2019-02-04 16:17:57 +00:00
Simon Pilgrim 6e5350a367 [X86][SSE] SimplifyDemandedBitsForTargetNode - PCMPGT(0,X) sign mask
For PCMPGT(0, X) patterns where we only demand the sign bit (e.g. BLENDV or MOVMSK) then we can use X directly.

Differential Revision: https://reviews.llvm.org/D57667

llvm-svn: 353051
2019-02-04 15:43:36 +00:00
James Henderson c9e6861a76 Revert r353048.
It was causing unexpected unit test failures on build bots.

llvm-svn: 353050
2019-02-04 15:09:58 +00:00
James Henderson d90b5a2e51 [CommandLine] Don't print empty sentinel values from EnumValN lists in help text
In order to make an option value truly optional, both the ValueOptional
attribute and an empty-named value are required. Prior to this change,
this empty-named value appears in the command-line help text:

-some-option - some help text
  =v1        - description 1
  =v2        - description 2
  =          -

This change improves the help text for these sort of options in a number
of ways:

1) ValueOptional options with an empty-named value now print their help
   text twice: both without and then with '=<value>' after the name. The
   latter version then lists the allowed values after it.
2) Empty-named values with no help text in ValueOptional options are not
   listed in the permitted values.

-some-option         - some help text
-some-option=<value> - some help text
  =v1                - description 1
  =v2                - description 2

3) Otherwise empty-named options are printed as =<empty> rather than
   simply '='.
4) Option values without help text do not have the '-' separator
   printed.

-some-option=<value> - some help text
  =v1                - description 1
  =v2
  =<empty>           - description

It also tweaks the llvm-symbolizer -functions help text to not print a
trailing ':' as that looks bad combined with 1) above.

This is mostly a reland of r352750.

Reviewed by: ruiu, thopre, mstorsjo

Differential Revision: https://reviews.llvm.org/D57030

llvm-svn: 353048
2019-02-04 14:48:33 +00:00
Matt Arsenault 56edf3f344 GlobalISel: Fix formatting of debug output
There was a missing space before the instruction name, and the newline
is redundant since MI::print by default adds one.

llvm-svn: 353046
2019-02-04 14:05:33 +00:00
Matt Arsenault 10547230f3 AMDGPU/GlobalISel: Legalize select for v4s16
Also add some more select tests to help show future legalization
changes.

llvm-svn: 353045
2019-02-04 14:04:52 +00:00
Simon Pilgrim a536b89fe0 [DAGCombine] Add ADD(SUB,SUB) combines
Noticed while investigating PR40483, and fixes the basic test case from the bug - but not a more general case.

We're pretty weak at dealing with ADD/SUB combines compared to the SimplifyAssociativeOrCommutative/SimplifyUsingDistributiveLaws abilities that InstCombine can manage.

llvm-svn: 353044
2019-02-04 13:44:49 +00:00
Andrea Di Biagio edbf06a767 [AsmPrinter] Remove hidden flag -print-schedule.
This patch removes hidden codegen flag -print-schedule effectively reverting the
logic originally committed as r300311
(https://llvm.org/viewvc/llvm-project?view=revision&revision=300311).

Flag -print-schedule was originally introduced by r300311 to address PR32216
(https://bugs.llvm.org/show_bug.cgi?id=32216). That bug was about adding "Better
testing of schedule model instruction latencies/throughputs".

These days, we can use llvm-mca to test scheduling models. So there is no longer
a need for flag -print-schedule in LLVM. The main use case for PR32216 is
now addressed by llvm-mca.
Flag -print-schedule is mainly used for debugging purposes, and it is only
actually used by x86 specific tests. We already have extensive (latency and
throughput) tests under "test/tools/llvm-mca" for X86 processor models. That
means, most (if not all) existing -print-schedule tests for X86 are redundant.

When flag -print-schedule was first added to LLVM, several files had to be
modified; a few APIs gained new arguments (see for example method
MCAsmStreamer::EmitInstruction), and MCSubtargetInfo/TargetSubtargetInfo gained
a couple of getSchedInfoStr() methods.

Method getSchedInfoStr() had to originally work for both MCInst and
MachineInstr. The original implmentation of getSchedInfoStr() introduced a
subtle layering violation (reported as PR37160 and then fixed/worked-around by
r330615).
In retrospect, that new API could have been designed more optimally. We can
always query MCSchedModel to get the latency and throughput. More importantly,
the "sched-info" string should not have been generated by the subtarget.
Note, r317782 fixed an issue where "print-schedule" didn't work very well in the
presence of inline assembly. That commit is also reverted by this change.

Differential Revision: https://reviews.llvm.org/D57244

llvm-svn: 353043
2019-02-04 12:51:26 +00:00
Simon Pilgrim fb222aa319 [X86] Add a couple of missed ADD combine tests
Noticed while investigating PR40483

llvm-svn: 353042
2019-02-04 12:37:38 +00:00
Simon Pilgrim 9899967464 Use auto for dyn_cast case to save a line. NFCI.
llvm-svn: 353041
2019-02-04 12:32:39 +00:00
David Green b4f36a2196 [ARM] Mark 255 and 65535 as cheap for Thumb1 "And"
This prevents Constant Hoisting from pulling the constant out of the block,
allowing us to still produce LDRH/UXTH nodes. LDRB/UXTB (255) is already cheap
by the default getIntImmCost, but I've added it for clarity.

Differential Revision: https://reviews.llvm.org/D57671

llvm-svn: 353040
2019-02-04 11:58:48 +00:00
David Green 75d79f472e [ARM] Add testcases for D57671. NFC
llvm-svn: 353039
2019-02-04 11:50:14 +00:00
Max Kazantsev 56b57e3f53 [NFC] Make a check in GuardWidening more obvious
llvm-svn: 353038
2019-02-04 10:41:17 +00:00
Dmitry Venikov 3643cbbf9c Commit tests for changes in revision D41608
llvm-svn: 353037
2019-02-04 10:32:07 +00:00
Max Kazantsev 09802f41cc [NFC] Rename variables to reflect the actual status of GuardWidening
llvm-svn: 353036
2019-02-04 10:31:18 +00:00
Clement Courbet be16b80056 [llvm-objcopy][NFC] Fix trailing semicolon warning.
llvm-svn: 353035
2019-02-04 10:24:42 +00:00
Max Kazantsev 13ab5cbb64 [NFC] Remove redundant parameters for better readability
llvm-svn: 353034
2019-02-04 10:20:51 +00:00
Max Kazantsev 65970aa24d [NFC] Replace equivalent condition for better readability
llvm-svn: 353032
2019-02-04 09:55:18 +00:00
Clement Courbet 1bb0e5ccfb [SelectionDAG] Add a BaseIndexOffset::print() method for debugging.
llvm-svn: 353028
2019-02-04 09:30:43 +00:00
Roman Lebedev bd84b139b0 [llvm-exegesis] Cut run time of analysis mode by another -35% (*sic*) (YamlContext::getRegNo())
Summary:

Together with the previous patch, it's an -90% improvement,
or roughly -96% improvement if you look starting with rL347204

```
$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-bew.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-bew.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-bew.html'

 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-bew.html' (9 runs):

           1483.18 msec task-clock                #    0.999 CPUs utilized            ( +-  0.10% )
                68      context-switches          #   46.085 M/sec                    ( +- 22.62% )
                 0      cpu-migrations            #    0.000 K/sec
             11641      page-faults               # 7850.880 M/sec                    ( +-  0.62% )
        5943246799      cycles                    # 4008184.428 GHz                   ( +-  0.10% )  (83.28%)
         442869514      stalled-cycles-frontend   #    7.45% frontend cycles idle     ( +-  0.41% )  (83.29%)
        1443375663      stalled-cycles-backend    #   24.29% backend cycles idle      ( +-  0.47% )  (33.43%)
        7714006752      instructions              #    1.30  insn per cycle
                                                  #    0.19  stalled cycles per insn  ( +-  0.07% )  (50.17%)
        1977242936      branches                  # 1333472193.855 M/sec              ( +-  0.07% )  (66.79%)
          32624220      branch-misses             #    1.65% of all branches          ( +-  0.18% )  (83.34%)

           1.48438 +- 0.00143 seconds time elapsed  ( +-  0.10% )
```
```
$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-newer.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-newer.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-newer.html'

 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-newer.html' (9 runs):

            963.28 msec task-clock                #    0.999 CPUs utilized            ( +-  0.37% )
                12      context-switches          #   12.695 M/sec                    ( +- 52.79% )
                 0      cpu-migrations            #    0.000 K/sec
             11599      page-faults               # 12046.971 M/sec                   ( +-  0.59% )
        3860122322      cycles                    # 4009359.596 GHz                   ( +-  0.37% )  (83.19%)
         380300669      stalled-cycles-frontend   #    9.85% frontend cycles idle     ( +-  0.34% )  (83.30%)
        1071910340      stalled-cycles-backend    #   27.77% backend cycles idle      ( +-  1.30% )  (33.51%)
        4773418224      instructions              #    1.24  insn per cycle
                                                  #    0.22  stalled cycles per insn  ( +-  0.15% )  (50.17%)
        1106990316      branches                  # 1149787979.919 M/sec              ( +-  0.11% )  (66.80%)
          23632231      branch-misses             #    2.13% of all branches          ( +-  0.18% )  (83.33%)

           0.96389 +- 0.00356 seconds time elapsed  ( +-  0.37% )
```
```
$ sha512sum /tmp/clusters-*
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf  /tmp/clusters-bew.html
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf  /tmp/clusters-newer.html
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf  /tmp/clusters-old.html
```

Reviewers: courbet, gchatelet

Reviewed By: courbet

Subscribers: tschuett, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57658

llvm-svn: 353025
2019-02-04 09:12:25 +00:00
Roman Lebedev 5b94fe9623 [llvm-exegesis] Cut run time of analysis mode by -84% (*sic*) (YamlContext::getInstrOpcode())
Summary:
```
$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-old.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-old.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-old.html'

 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-old.html' (9 runs):

           9465.46 msec task-clock                #    1.000 CPUs utilized            ( +-  0.05% )
                60      context-switches          #    6.363 M/sec                    ( +- 79.45% )
                 0      cpu-migrations            #    0.000 K/sec
             11364      page-faults               # 1200.697 M/sec                    ( +-  0.60% )
       37935623543      cycles                    # 4008083.912 GHz                   ( +-  0.05% )  (83.32%)
        2371625356      stalled-cycles-frontend   #    6.25% frontend cycles idle     ( +-  0.37% )  (83.32%)
        8476077875      stalled-cycles-backend    #   22.34% backend cycles idle      ( +-  0.18% )  (33.36%)
       41822439158      instructions              #    1.10  insn per cycle
                                                  #    0.20  stalled cycles per insn  ( +-  0.02% )  (50.03%)
       11607658944      branches                  # 1226405861.486 M/sec              ( +-  0.01% )  (66.69%)
         210864633      branch-misses             #    1.82% of all branches          ( +-  0.06% )  (83.34%)

           9.46636 +- 0.00441 seconds time elapsed  ( +-  0.05% )
```
```
$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-bew.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-bew.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-bew.html'

 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-bew.html' (9 runs):

           1480.66 msec task-clock                #    1.000 CPUs utilized            ( +-  0.19% )
                13      context-switches          #    8.483 M/sec                    ( +- 83.10% )
                 0      cpu-migrations            #    0.075 M/sec                    ( +-100.00% )
             11596      page-faults               # 7834.247 M/sec                    ( +-  0.59% )
        5933732194      cycles                    # 4008977.535 GHz                   ( +-  0.19% )  (83.22%)
         438111928      stalled-cycles-frontend   #    7.38% frontend cycles idle     ( +-  0.37% )  (83.25%)
        1454969705      stalled-cycles-backend    #   24.52% backend cycles idle      ( +-  0.94% )  (33.53%)
        7724218604      instructions              #    1.30  insn per cycle
                                                  #    0.19  stalled cycles per insn  ( +-  0.07% )  (50.14%)
        1979796413      branches                  # 1337599858.945 M/sec              ( +-  0.06% )  (66.74%)
          32641638      branch-misses             #    1.65% of all branches          ( +-  0.18% )  (83.31%)

           1.48128 +- 0.00284 seconds time elapsed  ( +-  0.19% )

$ sha512sum /tmp/clusters-*
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf  /tmp/clusters-bew.html
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf  /tmp/clusters-old.html
```

Reviewers: courbet, gchatelet

Reviewed By: courbet

Subscribers: tschuett, llvm-commits, RKSimon

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57657

llvm-svn: 353024
2019-02-04 09:12:21 +00:00
Roman Lebedev 1a0d595f15 [llvm-exegesis] Throughput support in analysis mode
Summary:
D57000 / [[ https://bugs.llvm.org/show_bug.cgi?id=37698 | PR37698 ]] added support for measuring of the inverse throughput.
But the support for the analysis was not added.
This attempts to fix that. (analysis done o bdver2 / piledriver)

First, small-scale experiment:
```
$ ./bin/llvm-exegesis -num-repetitions=10000 -mode=inverse_throughput -opcode-name=BSF64rr
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-d0acdd.o
---
mode:            inverse_throughput
key:
  instructions:
    - 'BSF64rr RAX RDX'
  config:          ''
  register_initial_values:
    - 'RDX=0x0'
cpu_name:        bdver2
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: inverse_throughput, value: 3.0278, per_snippet_value: 3.0278 }
error:           ''
info:            instruction has no tied variables picking Uses different from defs
assembled_snippet: 48BA0000000000000000480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2C3
...
```
If we plug `bsfq	%r12, %r10` into llvm-mca:
https://godbolt.org/z/ZtOyhJ
```
Dispatch Width:    4
uOps Per Cycle:    3.00
IPC:               0.50
Block RThroughput: 2.0
```
So RThroughput mismatch exists.

Now, let's upscale and analyse:
{F8207148}
`$ ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html`:
{F8207172}
{F8207197}
And if we now look at https://www.agner.org/optimize/instruction_tables.pdf,
`Reciprocal throughput` for `BSF r,r` is listed as `3`.
Yay?

Reviewers: courbet, gchatelet

Reviewed By: courbet

Subscribers: tschuett, RKSimon, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57647

llvm-svn: 353023
2019-02-04 09:12:17 +00:00
Roman Lebedev dc78bc277d [llvm-exegesis] deserializeMCInst(): bump SmallVector small size up to 16
Summary:
... from 8.
`VALIGNDZ128rmbik XMM0 XMM0 K1 XMM3 RDI i_0x1  i_0x0  i_0x1` instruction already has 9 components.
It does not matter much in terms of performance, but avoiding allocation seems to come with low cost here..

Reviewers: courbet, gchatelet

Reviewed By: courbet

Subscribers: tschuett, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57654

llvm-svn: 353022
2019-02-04 09:12:13 +00:00
Roman Lebedev 21193f4b7e [llvm-exegesis] Don't default to running&dumping all analyses to '-'
Summary:
Up until the point i have looked in the source, i didn't even understood that
i can disable 'cluster' output. I have always silenced it via ` &> /dev/null`.
(And hoped it wasn't contributing much of the run time.)

While i expect that it has it's use-cases i never once needed it so far.
If i forget to silence it, console is completely flooded with that output.

How about not expecting users to opt-out of analyses,
but to explicitly specify the analyses that should be performed?

Reviewers: courbet, gchatelet

Reviewed By: courbet

Subscribers: tschuett, RKSimon, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57648

llvm-svn: 353021
2019-02-04 09:12:08 +00:00
Max Kazantsev 437ee05885 [SCEV] Do not bother creating separate SCEVUnknown for unreachable nodes
Currently, SCEV creates SCEVUnknown for every node of unreachable code. If we
have a huge amounts of such code, we will be littering SE with these nodes. We could
just state that they all are undef and save some memory.

Differential Revision: https://reviews.llvm.org/D57567
Reviewed By: sanjoy

llvm-svn: 353017
2019-02-04 05:04:19 +00:00
Craig Topper b5e945c260 Recommit r352660 "[X86] Mark EMMS and FEMMS as clobbering MM0-7 and ST0-7."
We now print ST0 as 'st' when generating the clobber list for MS inline assembly in clang. This matches what the gcc reg name list expects.

Original commit message:

This fixes the test case in PR35982 by preventing MMX instructions that read MM0-7 from being moved below EMMS/FEMMS by the post RA scheduler.

Though as discussed in bugzilla, this is not a complete fix. There is still the possibility of reordering in IR or by the pre-RA scheduler.

Differential Revision: https://reviews.llvm.org/D57298

llvm-svn: 353016
2019-02-04 04:44:20 +00:00
Craig Topper 7a2944efe1 [X86] Print %st(0) as %st when its implicit to the instruction. Continue printing it as %st(0) when its encoded in the instruction.
This is a step back from the change I made in r352985. This appears to be more consistent with gcc and objdump behavior.

llvm-svn: 353015
2019-02-04 04:15:10 +00:00
Craig Topper 145ccb0eb9 [X86] Regenerate test to drop 'End function' comments some other other regex updates.
llvm-svn: 353014
2019-02-04 04:15:04 +00:00
Craig Topper f77b858dc3 Revert r352985 "[X86] Print %st(0) as %st to match what gcc inline asm uses as the clobber name to make MS inline asm work correctly"
Looking into gcc and objdump behavior more this was overly aggressive. If the register is encoded in the instruction we should print %st(0), if its implicit we should print %st.

I'll be making a more directed change in a future patch.

llvm-svn: 353013
2019-02-04 04:15:02 +00:00
Saleem Abdulrasool 764727d92e tests: loosen restriction
The MachO tests can run on any target, but require that the x86 backend
is available.  Broaden the coverage of the test.

llvm-svn: 353012
2019-02-04 00:09:32 +00:00
Sunil Srivastava b33b410283 Compute the correct symbol size in llvm-nm even without --print-size
In llvm-nm, the symbol size was being computed only with --print-size option,
even though it was being printed in other cases, such as with --format=posix.

This patch simply removes the guard, so that the size is computed
independently of the later decision to print it or not.

Fixes PR39997.

Differential Revision: https://reviews.llvm.org/D57599

llvm-svn: 353011
2019-02-03 22:40:01 +00:00
Davide Italiano 1002ab3d1c [docs] Recommend assertions when testing.
Pointed out by Shoaib Meenai.

llvm-svn: 353008
2019-02-03 20:37:13 +00:00
Davide Italiano 73929c4d24 [LoopIdiomRecognize] @llvm.dbg values shouldn't affect the transformation.
Summary: PR40564

Reviewers: aprantl, rnk

Subscribers: llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57629

llvm-svn: 353007
2019-02-03 20:33:20 +00:00
Simon Pilgrim 135413d381 [NFC] Make vector types legal in UREM test
As discussed in D50222, this changes the vector types in tests required for that revision to ones legal for X86.

Patch by @hermord (Dmytro Shynkevych)

Differential Revision: https://reviews.llvm.org/D56372

llvm-svn: 353004
2019-02-03 19:38:15 +00:00
Sanjay Patel f1314b6603 [PowerPC] adjust test for uaddo change in rL353001
We don't need a mtctr/bctr for this test now; a regular
conditional branch is fine.

llvm-svn: 353002
2019-02-03 18:10:16 +00:00
Sanjay Patel 84ceae6048 [CGP] adjust target constraints for forming uaddo
There are 2 changes visible here:
1. There's no reason to limit this transform based on number
   of condition registers. That diff allows PPC to produce 
   slightly better (dot-instructions should be generally good) 
   code.
   Note: someone that cares about PPC codegen might want to 
   look closer at that output because it seems like we could
   still improve this.

2. We (probably?) should not bother trying to form uaddo (or
   other overflow ops) when there's no target support for such
   an op. This goes beyond checking whether the op is expanded
   because both PPC and AArch64 show better codegen for standard
   types regardless of whether the op is legal/custom.

llvm-svn: 353001
2019-02-03 17:53:09 +00:00
Simon Pilgrim 1fce5a8b75 [X86][AVX] Support shuffle combining for VBROADCAST with smaller vector sources
getTargetShuffleMask can only do this safely if we're extracting the lowest subvector from a vector of the same result type.

llvm-svn: 352999
2019-02-03 16:51:33 +00:00
Sanjay Patel 837552fe9f [PatternMatch] add special-case uaddo matching for increment-by-one (2nd try)
This is the most important uaddo problem mentioned in PR31754:
https://bugs.llvm.org/show_bug.cgi?id=31754
...but that was overcome in x86 codegen with D57637.

That patch also corrects the inc vs. add regressions seen with the  previous attempt at this.

Still, we want to make this matcher complete, so we can potentially canonicalize the pattern 
even if it's an 'add 1' operation.
Pattern matching, however, shouldn't assume that we have canonicalized IR, so we match 4 
commuted variants of uaddo.

There's also a test with a crazy type to show that the existing CGP transform based on this 
matcher is not limited by target legality checks.

I'm not sure if the Hexagon diff means the test is no longer testing what it intended to
test, but that should be solvable in a follow-up.

Differential Revision: https://reviews.llvm.org/D57516

llvm-svn: 352998
2019-02-03 16:16:48 +00:00
Simon Pilgrim 18b73a655b [X86][AVX] Support shuffle combining for VPMOVZX with smaller vector sources
llvm-svn: 352997
2019-02-03 16:10:18 +00:00
Simon Pilgrim a2a3e5b811 [X86][AVX] More aggressively simplify BROADCAST source operand
Aim to use scalar source or lowest 128-bit vector directly.

We're still missing some VZMOVL_LOAD combines.

llvm-svn: 352994
2019-02-03 14:39:41 +00:00
Sanjay Patel 3d6ecfc078 [x86] add CGP uaddo test with weird type; NFC
There's probably no reason to try this transform
for an obviously unsupported op.

llvm-svn: 352993
2019-02-03 14:22:43 +00:00
Sanjay Patel b961bd68f0 [CGP] move test file to prevent bot failures
The test specifiies the triple, so it needs to be in the
x86 directory in case a bot has been configured without
the x86 target.

llvm-svn: 352992
2019-02-03 14:19:45 +00:00
Sanjay Patel 00fcc74e50 [CGP] refactor optimizeCmpExpression (NFCI)
This is not truly NFC because we are bailing out without
a TLI now. That should not be a real concern though because
there should be a TLI in any real-world scenario. 

That seems better than passing around a pointer and then 
checking it for null-ness all over the place.

The motivation is to fix what appears to be an unintended
restriction on the uaddo transform - 
hasMultipleConditionRegisters() shouldn't be reason to limit 
the transform.

llvm-svn: 352988
2019-02-03 13:48:03 +00:00
Sanjay Patel 359a97310b [PowerPC] add tests for saturating add; NFC
This is copied from the existing test files for x86/AArch.

llvm-svn: 352987
2019-02-03 12:42:54 +00:00
Philip Pfaffe 9438585fe4 [DA][NewPM] Handle transitive dependencies in the new-pm version of DA
Summary:
The analysis result of DA caches pointers to AA, SCEV, and LI, but it
never checks for their invalidation. Fix that.

Reviewers: chandlerc, dmgreen, bogner

Reviewed By: dmgreen

Subscribers: hiraditya, bollu, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D56381

llvm-svn: 352986
2019-02-03 12:25:41 +00:00
Craig Topper 5a570dd437 [X86] Print %st(0) as %st to match what gcc inline asm uses as the clobber name to make MS inline asm work correctly
Summary:
When calculating clobbers for MS style inline assembly we fail if the asm clobbers stack top because we print st(0) and try to pass it through the gcc register name check. This was found with when I attempted to make a emms/femms clobber all ST registers. If you use emms/femms in MS inline asm we would try to use st(0) as the clobber name but clang would think that wasn't a valid clobber name.

This also matches what objdump disassembly prints. It's also what is printed by gcc -S.

Reviewers: RKSimon, rnk, efriedma, spatel, andreadb, lebedev.ri

Reviewed By: rnk

Subscribers: eraman, gbedwell, lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D57621

llvm-svn: 352985
2019-02-03 07:53:39 +00:00
Craig Topper 950ca192f6 [X86] Lower ISD::UADDO to use the Z flag instead of C flag when the RHS is a constant 1 to encourage INC formation.
Summary:
Add an additional combine to combineCarryThroughADD to reverse it back to the C flag to avoid regressions.

I believe this catches the cases that D57547 got.

Reviewers: RKSimon, spatel

Reviewed By: spatel

Subscribers: javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57637

llvm-svn: 352984
2019-02-03 07:25:06 +00:00
Fangrui Song b21ed3c57a [AMDGPU] Fix -Wunused-variable after rL352978
llvm-svn: 352982
2019-02-03 03:51:52 +00:00
Dmitry Venikov aaa709f2ec [InstSimplify] Missed optimization in math expression: log10(pow(10.0,x)) == x, log2(pow(2.0,x)) == x
Summary: This patch enables folding following instructions under -ffast-math flag: log10(pow(10.0,x)) -> x, log2(pow(2.0,x)) -> x

Reviewers: hfinkel, spatel, efriedma, craig.topper, zvi, majnemer, lebedev.ri

Reviewed By: spatel, lebedev.ri

Subscribers: lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D41940

llvm-svn: 352981
2019-02-03 03:48:30 +00:00
Matt Arsenault 888aa5dedd GlobalISel: Implement widenScalar for G_UNMERGE_VALUES
For the scalar case only.

Also move the similar G_MERGE_VALUES handling to a separate function
and cleanup to make them look more similar.

llvm-svn: 352979
2019-02-03 00:07:33 +00:00
Matt Arsenault 0e5d856eb8 GlobalISel: Implement widenScalar for G_EXTRACT vector sources
Handle the basic element extract case.

llvm-svn: 352978
2019-02-02 23:56:00 +00:00
Matt Arsenault eb2603cfb2 AMDGPU/GlobalISel: Avoid reporting illegal extloads as legal
This avoids breaking a test in a future commit.

llvm-svn: 352977
2019-02-02 23:39:13 +00:00
Matt Arsenault 58f9d3df97 AMDGPU/GlobalISel: Legalize icmp for pointer types
llvm-svn: 352976
2019-02-02 23:35:15 +00:00
Matt Arsenault 2065c94dd3 AMDGPU/GlobalISel: Legalize constant for pointer types
llvm-svn: 352975
2019-02-02 23:33:49 +00:00
Matt Arsenault 2491f82679 AMDGPU/GlobalISel: Legalize select for pointer types
llvm-svn: 352974
2019-02-02 23:31:50 +00:00
Matt Arsenault cbaada6bc1 GlobalISel: Legalization for inttoptr/ptrtoint
llvm-svn: 352973
2019-02-02 23:29:55 +00:00
Craig Topper cf07b097c6 [X86] Add another test case for PR40539. NFC
llvm-svn: 352967
2019-02-02 22:01:41 +00:00
Simon Pilgrim dbf302c9f1 [X86][AVX] Enable INSERT_SUBVECTOR(SRC0, SHUFFLE(SRC1)) shuffle combining
Push the insert_subvector up through the shuffle operands to help find more cross-lane shuffles.

The is exposes a couple of minor issues that will be fixed shortly:
Missed broadcast folds - we have a mixture of vzext_load lengths that need cleaning up
combine-sdiv.ll - AVX1 SimplifyDemandedVectorElts failure (hits max depth due to a couple of extra bitcasts).

llvm-svn: 352963
2019-02-02 18:08:04 +00:00
Simon Pilgrim bd42f97946 [SDAG] Add SDNode/SDValue getConstantOperandAPInt helper. NFCI.
We already have the getConstantOperandVal helper which returns a uint64_t, but along comes the fuzzer and inserts a i128 -1 constant or something and the whole thing asserts.......

I've updated a few obvious cases, and tried to make use of the const reference where possible, but there's more to do. A number of existing oss-fuzz tickets should be fixed if we start using APInt and perform value clamping where necessary.

llvm-svn: 352961
2019-02-02 17:35:06 +00:00
Florian Hahn dd2ef0af46 [LCSSA] Handle case with single new PHI faster.
If there is only a single available value, all uses must be dominated by
the single value and there is no need to search for a reaching
definition.

This drastically speeds up LCSSA in some cases. For the test case
from PR37202, it speeds up LCSSA construction by 4 times.

Time-passes without this patch for test case from PR37202:

    Total Execution Time: 29.9285 seconds (29.9276 wall clock)

    ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
    5.2786 ( 17.7%)   0.0021 (  1.2%)   5.2806 ( 17.6%)   5.2808 ( 17.6%)  Unswitch loops
    4.3739 ( 14.7%)   0.0303 ( 18.1%)   4.4042 ( 14.7%)   4.4042 ( 14.7%)  Loop-Closed SSA Form Pass
    4.2658 ( 14.3%)   0.0192 ( 11.5%)   4.2850 ( 14.3%)   4.2851 ( 14.3%)  Loop-Closed SSA Form Pass #2
    2.2307 (  7.5%)   0.0013 (  0.8%)   2.2320 (  7.5%)   2.2318 (  7.5%)  Loop Invariant Code Motion
    2.0888 (  7.0%)   0.0012 (  0.7%)   2.0900 (  7.0%)   2.0897 (  7.0%)  Unroll loops
    1.6761 (  5.6%)   0.0013 (  0.8%)   1.6774 (  5.6%)   1.6774 (  5.6%)  Value Propagation
    1.3686 (  4.6%)   0.0029 (  1.8%)   1.3716 (  4.6%)   1.3714 (  4.6%)  Induction Variable Simplification
    1.1457 (  3.8%)   0.0010 (  0.6%)   1.1468 (  3.8%)   1.1468 (  3.8%)  Loop-Closed SSA Form Pass #4
    1.1384 (  3.8%)   0.0005 (  0.3%)   1.1389 (  3.8%)   1.1389 (  3.8%)  Loop-Closed SSA Form Pass #6
    1.1360 (  3.8%)   0.0027 (  1.6%)   1.1387 (  3.8%)   1.1387 (  3.8%)  Loop-Closed SSA Form Pass #5
    1.1331 (  3.8%)   0.0010 (  0.6%)   1.1341 (  3.8%)   1.1340 (  3.8%)  Loop-Closed SSA Form Pass #3

Time passes with this patch

  Total Execution Time: 19.2802 seconds (19.2813 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   4.4234 ( 23.2%)   0.0038 (  2.0%)   4.4272 ( 23.0%)   4.4273 ( 23.0%)  Unswitch loops
   2.3828 ( 12.5%)   0.0020 (  1.1%)   2.3848 ( 12.4%)   2.3847 ( 12.4%)  Unroll loops
   1.8714 (  9.8%)   0.0020 (  1.1%)   1.8734 (  9.7%)   1.8735 (  9.7%)  Loop Invariant Code Motion
   1.7973 (  9.4%)   0.0022 (  1.2%)   1.7995 (  9.3%)   1.8003 (  9.3%)  Value Propagation
   1.4010 (  7.3%)   0.0033 (  1.8%)   1.4043 (  7.3%)   1.4044 (  7.3%)  Induction Variable Simplification
   0.9978 (  5.2%)   0.0244 ( 13.1%)   1.0222 (  5.3%)   1.0224 (  5.3%)  Loop-Closed SSA Form Pass #2
   0.9611 (  5.0%)   0.0257 ( 13.8%)   0.9868 (  5.1%)   0.9868 (  5.1%)  Loop-Closed SSA Form Pass
   0.5856 (  3.1%)   0.0015 (  0.8%)   0.5871 (  3.0%)   0.5869 (  3.0%)  Unroll loops #2
   0.4132 (  2.2%)   0.0012 (  0.7%)   0.4145 (  2.1%)   0.4143 (  2.1%)  Loop Invariant Code Motion #3

Reviewers: efriedma, davide, mzolotukhin

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D57033

llvm-svn: 352960
2019-02-02 15:26:05 +00:00
Florian Hahn 509b48a64a [LCSSA] Add expensive verification of LCSSA form for sub-loops.
This assertion makes sure all sub-loops are in LCSSA form before
bringing their parent in LCSSA form. This precondition was added to
formLCSSA in D56848.

Reviewers: davide, efriedma, mzolotukhin

Reviewed By: davide

Differential Revision: https://reviews.llvm.org/D56921

llvm-svn: 352958
2019-02-02 14:42:27 +00:00
Craig Topper d3107a7b85 [X86][SSE]: Adding full coverage of MC encoding tests for the SSE isa sets.<NFC>
Summary:
NFC.
Adding MC regressions tests to cover all the SSE ISA sets as follows:
SSE, SSE2, SSE3, SSE4, SSE42, SSEMXCSR, SSE_PREFETCH, SSSE3

This patch is part of a larger task to cover MC encoding of all X86 ISA Sets.
See revision: https://reviews.llvm.org/D39952

Patch by Gadi Haber and Wang Tianqing

Reviewers: RKSimon, zvi, craig.topper, AndreiGrischenko, gadi.haber, LuoYuanke

Reviewed By: craig.topper

Subscribers: jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D40387

llvm-svn: 352955
2019-02-02 06:21:54 +00:00
JF Bastien 115b64b3d1 Revert "Bump minimum toolchain version"
Reverting D57264 again, it looks like we're down to two bots that need fixing:

polly-amd64-linux
polly-arm-linux

They both have old versions of libstdc++ and recent clang.

llvm-svn: 352954
2019-02-02 06:01:12 +00:00
Yonghong Song fa3654008b [BPF] [BTF] Process FileName with absolute path correctly
In IR, sometimes the following attributes for DIFile may be
generated:
  filename: /home/yhs/test.c
  directory: /tmp
The /tmp may represent the working directory of the compilation
process.

In such cases, since filename is with absolute path,
the directory should be ignored by BTF. The filename alone is
enough to get the source.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 352952
2019-02-02 05:54:59 +00:00
JF Bastien c9a69acc7e Bump minimum toolchain version
Summary:
The RFC on moving past C++11 got good traction:
  http://lists.llvm.org/pipermail/llvm-dev/2019-January/129452.html

This patch therefore bumps the toolchain versions according to our policy:
  llvm.org/docs/DeveloperPolicy.html#toolchain

Subscribers: mgorny, jkorous, dexonsmith, llvm-commits, mehdi_amini, jyknight, rsmith, chandlerc, smeenai, hans, reames, lattner, lhames, erichkeane

Differential Revision: https://reviews.llvm.org/D57264

llvm-svn: 352951
2019-02-02 05:15:34 +00:00
Alexander Shaposhnikov d078b4e484 [llvm-objcopy] Temporarily limit one test to darwin
Some triples in llvm-mc appear to be unavailable on some buildbots.
To please those buildbots we temporarily limit the test to darwin 
(where the required triple is guranteed to be available)
until we find the right solution.

llvm-svn: 352950
2019-02-02 05:01:00 +00:00
Julian Lettner f82d8924ef [ASan] Do not instrument other runtime functions with `__asan_handle_no_return`
Summary:
Currently, ASan inserts a call to `__asan_handle_no_return` before every
`noreturn` function call/invoke. This is unnecessary for calls to other
runtime funtions. This patch changes ASan to skip instrumentation for
functions calls marked with `!nosanitize` metadata.

Reviewers: TODO

Differential Revision: https://reviews.llvm.org/D57489

llvm-svn: 352948
2019-02-02 02:05:16 +00:00
Alexander Shaposhnikov 7d53675b70 [llvm-objcopy] Fix triples in macho tests.
Update triples used by the macho tests to fix some buildbots.

llvm-svn: 352947
2019-02-02 02:04:09 +00:00
Mandeep Singh Grang 2be4eabb6f [AutoUpgrade] Fix AutoUpgrade for x86.seh.recoverfp
Summary: This fixes the bug in https://reviews.llvm.org/D56747#inline-502711.

Reviewers: efriedma

Reviewed By: efriedma

Subscribers: javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57614

llvm-svn: 352945
2019-02-02 01:32:48 +00:00
Alexander Shaposhnikov d911ed10af [llvm-objcopy] Add ability to copy MachO object files
This diff implements first bits for copying (without modification) MachO object files.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D54674

llvm-svn: 352944
2019-02-02 00:38:07 +00:00
Yonghong Song 329010e1b5 Revert "[BPF] [BTF] Process FileName with absolute path correctly"
This reverts commit r352939.

Some tests failed. Revert to unblock others.

llvm-svn: 352941
2019-02-01 23:49:52 +00:00
Mandeep Singh Grang dc1e778369 [AArch64] Fix unused variable [NFC]
llvm-svn: 352940
2019-02-01 23:42:34 +00:00
Yonghong Song 5233fb8f5e [BPF] [BTF] Process FileName with absolute path correctly
In IR, sometimes the following attributes for DIFile may be
generated:
  filename: /home/yhs/test.c
  directory: /tmp
The /tmp may represent the working directory of the compilation
process.

In such cases, since filename is with absolute path,
the directory should be ignored by BTF. The filename alone is
enough to get the source.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 352939
2019-02-01 23:23:17 +00:00
Philip Reames 00056ed0e6 [CodeGen] Be as conservative about atomic accesses as for volatile
Background: At the moment, we record the AtomicOrdering of an access in the MMO, but also mark any atomic access as volatile in SelectionDAG. I'm working towards separating that. See https://reviews.llvm.org/D57601 for context.

Update all usages of isVolatile in lib/CodeGen to preserve behaviour once atomic MMOs stop being also volatile. This is NFC in it's current form, but is essential for correctness once we make that final change.

It useful to keep in mind that AtomicSDNode is not a parent of LoadSDNode, StoreSDNode, or LSBaseSDNode. As a result, any call to isVolatile on one of those static types doesn't need a companion isAtomic check.  We should probably adjust that class hierarchy long term, but for now, that seperation is useful.

I'm deliberately being conservative about handling. I want the change to stop adding volatile to be NFC itself, and then will work through places where we can be less conservative for atomics one by one in separate changes w/tests.

Differential Revision: https://reviews.llvm.org/D57596

llvm-svn: 352937
2019-02-01 22:58:52 +00:00
Evandro Menezes 39ad187ec9 [InstCombine] Refactor test checks (NFC)
llvm-svn: 352935
2019-02-01 22:52:05 +00:00
Philip Reames 05fc7edf62 [Test] Update file w/update_test_checks.py to make a follow on change obvious
llvm-svn: 352932
2019-02-01 22:30:51 +00:00
Dan Gohman f726e4454c [WebAssembly] Add codegen support for the import_field attribute
This adds the LLVM side of https://reviews.llvm.org/D57602 -- the
import_field attribute. See that patch for details.

Differential Revision: https://reviews.llvm.org/D57603

llvm-svn: 352931
2019-02-01 22:27:34 +00:00
Mandeep Singh Grang 70d484d94e [COFF, ARM64] Fix localaddress to handle stack realignment and variable size objects
Summary: This fixes using the correct stack registers for SEH when stack realignment is needed or when variable size objects are present.

Reviewers: rnk, efriedma, ssijaric, TomTan

Reviewed By: rnk, efriedma

Subscribers: javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D57183

llvm-svn: 352923
2019-02-01 21:41:33 +00:00
Simon Pilgrim e95550f508 [X86][AVX] Add VMOVDDUP-VPBROADCASTQ execution domain mapping
Noticed in D57514.

Differential Revision: https://reviews.llvm.org/D57519

llvm-svn: 352922
2019-02-01 21:41:30 +00:00
Scott Linder afc24ed21a [AMDGPU] Mark test functions with hidden visibility
Prepare for future patch which affects codegen for calls to preemptible
functions.

Differential Revision: https://reviews.llvm.org/D57605

llvm-svn: 352920
2019-02-01 21:23:28 +00:00
Jordan Rupprecht 614dd19869 [DebugInfo] Fix mkdir use in test
llvm-svn: 352918
2019-02-01 21:14:21 +00:00
Evandro Menezes d91776a433 [InstCombine] Expand Windows test (NFC)
Run checks for Win32 as well.

llvm-svn: 352917
2019-02-01 21:14:10 +00:00
Jordan Rupprecht 835df27f85 [DebugInfo] Don't use realpath when looking up debug binary locations.
Summary:
Using realpath makes assumptions about build systems that do not always hold true. The debug binary referred to from the .gnu_debuglink should exist in the same directory (or in a .debug directory, etc.), but the files may only exist as symlinks to a differently named files elsewhere, and using realpath causes that lookup to fail.

This was added in r189250, and this is basically a revert + regression test case.

Reviewers: dblaikie, samsonov, jhenderson

Reviewed By: dblaikie

Subscribers: llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57609

llvm-svn: 352916
2019-02-01 21:04:16 +00:00
James Y Knight 291f791ef1 [opaque pointer types] Pass function type for CallBase::setCalledFunction.
Differential Revision: https://reviews.llvm.org/D57174

llvm-svn: 352914
2019-02-01 20:44:54 +00:00