Commit Graph

61340 Commits

Author SHA1 Message Date
Simon Pilgrim 6b10fde69b [CostModel][X86] Add min/max reduction costs for all SSE targets
The original costs stopped at SSE42, I've added conservative estimates for everything down to SSE1/SSE2 and moved some of the SSE42 costs to SSE41 (really only the addition of PCMPGT makes any difference).

I've also added missing vXi8 costs (we use PHMINPOSUW for i8/i16 for scarily quick results) and 256-bit vector costs for AVX1.

llvm-svn: 360528
2019-05-11 17:12:52 +00:00
Simon Pilgrim e4c5b6d9bd [X86][SSE] Add SimplifyDemandedVectorElts HADD/HSUB handling.
Still missing PHADDW/PHSUBW tests because PEXTRW doesn't call SimplifyDemandedVectorElts

llvm-svn: 360526
2019-05-11 16:07:12 +00:00
Craig Topper c9d7484aa3 [X86] Add CMOV_FR32X/CMOV_FR64X pseudo instructions. Use them in fast isel to fix a machine verifier error after adding test cases.
Fast isel picks the FR32X/FR64X register classes when lowering pseudo select, but it didn't have the right opcode to go with it.

llvm-svn: 360524
2019-05-11 16:00:28 +00:00
Simon Pilgrim 8039e838c6 [MC][X86] Add test cases from PR14056
llvm-svn: 360521
2019-05-11 15:51:14 +00:00
Simon Pilgrim 4871a3057e [X86][SSE] Tweaked HADD/HSUB SimplifyDemandedVectorElts
Try to ensure we LHS and RHS test coverage

llvm-svn: 360519
2019-05-11 14:47:54 +00:00
Simon Pilgrim 1db0cc9e1b [X86][SSE] Add integer HADD/HSUB SimplifyDemandedVectorElts tests
llvm-svn: 360518
2019-05-11 14:08:34 +00:00
Simon Pilgrim 67ad4c2f27 [X86][SSE] Add HADD/HSUB SimplifyDemandedVectorElts tests
Shows missed opportunities to simplify args.

Will add integer HADD/HSUB tests in a future commit.

llvm-svn: 360517
2019-05-11 12:46:38 +00:00
Craig Topper 31f7adb94f [X86] Don't emit MOVNTDQA loads from fast-isel without SSE4.1.
We were checking for SSE4.1 for FP types, but not integer 128-bit types.

Fixes PR41837.

llvm-svn: 360512
2019-05-11 04:19:33 +00:00
Craig Topper bdef12df8d [X86] Add a test case for idempotent atomic operations with speculative load hardening. Fix an additional issue found by the test.
This test covers the fix from r360475 as well.

llvm-svn: 360511
2019-05-11 04:00:27 +00:00
Lang Hames b0cecfc907 [JITLink][MachO] Mark atoms in sections 'no-dead-strip' set live by default.
If a MachO section has the no-dead-strip attribute set then its atoms should
be preserved, regardless of whether they're public or referenced elsewhere in
the object.

llvm-svn: 360477
2019-05-10 22:24:37 +00:00
Reid Kleckner 7eb6b5ffc3 [COFF] Fix .bss section size bug in obj2yaml / yaml2obj
We need to serialize SizeOfRawData through even when there is no data,
as in a .bss section.

Fixes PR41836

llvm-svn: 360473
2019-05-10 21:53:44 +00:00
Mircea Trofin ff3bed0e61 Skip over prefetches
Summary: Skip over prefetches when assigning debug info to instructions with memory operands. This way, the debug info is stable after instrumenting a binary with prefetches, allowing for iterative profiling and instrumentation.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: aprantl, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61789

llvm-svn: 360471
2019-05-10 21:27:55 +00:00
Nikita Popov 9f7537bd48 [SDAG] Recursively legalize both vector mulo results
Split out from D61692 per RKSimon's suggestion. Vector op
legalization will automatically recursively legalize the returned
SDValue, but we need to take care of the other results ourselves.
Otherwise it will end up getting legalized only during op
legalization, by which point it might be too late (though I'm not
aware of any specific cases right now).

There are codegen differences because expansion occurs earlier now
and we don't get a DAGCombiner run in between.

Differential Revision: https://reviews.llvm.org/D61744

llvm-svn: 360470
2019-05-10 20:42:48 +00:00
Teresa Johnson 37b80122bd [ThinLTO] Auto-hide prevailing linkonce_odr only when all copies eligible
Summary:
We hit undefined references building with ThinLTO when one source file
contained explicit instantiations of a template method (weak_odr) but
there were also implicit instantiations in another file (linkonce_odr),
and the latter was the prevailing copy. In this case the symbol was
marked hidden when the prevailing linkonce_odr copy was promoted to
weak_odr. It led to unsats when the resulting shared library was linked
with other code that contained a reference (expecting to be resolved due
to the explicit instantiation).

Add a CanAutoHide flag to the GV summary to allow the thin link to
identify when all copies are eligible for auto-hiding (because they were
all originally linkonce_odr global unnamed addr), and only do the
auto-hide in that case.

Most of the changes here are due to plumbing the new flag through the
bitcode and llvm assembly, and resulting test changes. I augmented the
existing auto-hide test to check for this situation.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, dexonsmith, arphaman, dang, llvm-commits, steven_wu, wmi

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59709

llvm-svn: 360466
2019-05-10 20:08:24 +00:00
Cameron McInally e75412ab47 Add InstCombine::visitFNeg(...)
Differential Revision: https://reviews.llvm.org/D61784

llvm-svn: 360461
2019-05-10 20:01:04 +00:00
Nikita Popov e99486dc11 [CVP] Add tests for urem, sdiv, srem ranges; NFC
We currently don't calcuate result ranges for these binary operators.

llvm-svn: 360460
2019-05-10 19:36:38 +00:00
David Blaikie 7598b71488 DebugInfo: Only move types out of type units if they're named or type united
Follow up to r359122, after a bug was reported in it - the original
change too aggressively tried to move related types out of type units,
which included unnamed types (like array types) which can't reasonably
be declared-but-not-defined.

A step beyond that is that some types in type units can be anonymous, if
they are types with a name for linkage purposes (eg: "typedef struct { }
x;"). So ensure those don't get turned into plain declarations (without
signatures) because, lacking names, they can't be resolved to the
definition.

[Also include a fix for llvm-dwarfdump/libDebugInfoDWARF to pretty print
types in type units]

llvm-svn: 360458
2019-05-10 19:15:29 +00:00
Paul Robinson 8273fdc2a4 Replace 'REQUIRES: nozlib' with '!zlib' because we don't need two ways
to say the same thing.

llvm-svn: 360455
2019-05-10 18:47:39 +00:00
Paul Robinson 0c55985bbb Replace 'REQUIRES: not_?san' with 'UNSUPPORTED: ?san' as that better
expresses the intent of the exclusion.

llvm-svn: 360449
2019-05-10 18:08:02 +00:00
Nikita Popov d74b871504 [CVP] Add tests for abs and nabs spf; NFC
One half of the bound is already computed correctly for these
tests, the other isn't.

llvm-svn: 360445
2019-05-10 17:39:50 +00:00
Fangrui Song 9529c563eb [MC][ELF] Copy top 3 bits of st_other to .symver aliases
On PowerPC64 ELFv2 ABI, the top 3 bits of st_other encode the local
entry offset. A versioned symbol alias created by .symver should copy
the bits from the source symbol.

This partly fixes PR41048. A full fix needs tracking of .set assignments
and updating st_other fields when finish() is called, see D56586.

Patch by Alfredo Dal'Ava Júnior

Differential Revision: https://reviews.llvm.org/D59436

llvm-svn: 360442
2019-05-10 17:09:25 +00:00
Momchil Velikov c396f09ce9 Adjust MachineScheduler to use ProcResource counts
This fix allows the scheduler to take into account the number of instances of
each ProcResource specified. Previously a declaration in a scheduler of
ProcResource<1> would be treated identically to a declaration of
ProcResource<2>. Now the hazard recognizer would report a hazard only after all
of the resource instances are busy.

Patch by Jackson Woodruff and Momchil Velikov.

Differential Revision: https://reviews.llvm.org/D51160

llvm-svn: 360441
2019-05-10 16:54:32 +00:00
Fangrui Song 6150407951 [llvm-objdump] Print st_other
Add support for ".hidden" ".internal" ".protected" and " 0x%02x" for
other st_other bits used by some architectures.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D61718

llvm-svn: 360439
2019-05-10 16:24:57 +00:00
Nemanja Ivanovic 34dc3aca40 Pull r360426 as it is breaking the build bots.
llvm-svn: 360437
2019-05-10 16:03:22 +00:00
Robert Lougher 986b6b86bb [X86] Avoid SFB - Fix inconsistent codegen with/without debug info
Fixes https://bugs.llvm.org/show_bug.cgi?id=40969

The functions findPotentiallyBlockedCopies and buildCopy are currently not
accounting for the presence of debug instructions. In the former this results
in the optimization not being trigerred, and in the latter results in
inconsistent codegen.

This patch enables the optimization to be performed in a debug build and
ensures the codegen is consistent with non-debug builds.

Patch by Chris Dawson.

Differential Revision: https://reviews.llvm.org/D61680

llvm-svn: 360436
2019-05-10 15:55:06 +00:00
Simon Pilgrim a0b1518a4a [X86][SSE] Add getHopForBuildVector vector splitting
If we only use the lower xmm of a ymm hop, then extract the xmm's (for free), perform the xmm hop and then insert back into a ymm (for free).

Fixes some of the regressions noted in D61782

llvm-svn: 360435
2019-05-10 15:46:04 +00:00
Nemanja Ivanovic 7a41cd5b88 Another attempt to fix the build bot breaks after r360426
The test case checks were produced by the update_test_checks.py
scripts and I assumed that is sufficient. However, the behaviour
is different with different default target triples. Specify the
triple explicitly in the test case.

If this doesn't clean up the build bot breaks, I'll remove the test
case until I can get to the bottom of why the behaviour on build bots
is different from my machine.

llvm-svn: 360434
2019-05-10 15:44:56 +00:00
Nemanja Ivanovic 0f991c65f2 Fix build break after r360426
llvm-svn: 360433
2019-05-10 15:11:40 +00:00
Michael Liao b284414a1b [InferAddressSpaces] Enhance the handling of cosntexpr.
Summary:
- Constant expressions may not be added in strict postorder as the
  forward instruction scan order. Thus, for a constant express (CE0), if
  its operand (CE1) is used in an previous instruction, they are not in
  postorder. However, different from
  `cloneInstructionWithNewAddressSpace`,
  `cloneConstantExprWithNewAddressSpace` doesn't bookkeep uninferred
  instructions for later resolving. That results in failure of inferring
  constant address.
- This patch adds the support to infer constant expression operand
  recursively, since there won't be loop, if that operand is another
  constant expression.

Reviewers: arsenm

Subscribers: jholewinski, jvesely, wdng, nhaehnle, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61760

llvm-svn: 360431
2019-05-10 14:57:42 +00:00
Lei Huang 1ac6e9636c [PowerPC] custom lower `v2f64 fpext v2f32`
Reduces scalarization overhead via custom lowering of v2f64 fpext v2f32.

eg. For the following IR
  %0 = load <2 x float>, <2 x float>* %Ptr, align 8
  %1 = fpext <2 x float> %0 to <2 x double>
  ret <2 x double> %1

Pre custom lowering:
  ld r3, 0(r3)
  mtvsrd f0, r3
  xxswapd vs34, vs0
  xscvspdpn f0, vs0
  xxsldwi vs1, vs34, vs34, 3
  xscvspdpn f1, vs1
  xxmrghd vs34, vs0, vs1

After custom lowering:
  lfd f0, 0(r3)
  xxmrghw vs0, vs0, vs0
  xvcvspdp vs34, vs0

Differential Revision: https://reviews.llvm.org/D57857

llvm-svn: 360429
2019-05-10 14:04:06 +00:00
Nemanja Ivanovic cfc89896e0 [Pass Pipeline][NFC] Add a test prior to committing D61726
This patch just adds a test case to show the differences in code emitted
by opt before and after https://reviews.llvm.org/D61726.

llvm-svn: 360426
2019-05-10 13:47:00 +00:00
Cameron McInally a67e387de8 Pre-commit InstCombine::visitFNeg(...) test.
llvm-svn: 360424
2019-05-10 13:18:57 +00:00
James Henderson 5772e02bd0 [llvm-objcopy] Add additional testing for various cases
This patch adds a number of tests to test various cases not covered by
existing tests. All of them work correctly, with no need to change
llvm-objcopy itself, although some do indicate possible areas for
improvement.

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D61727

llvm-svn: 360422
2019-05-10 12:58:52 +00:00
Tim Northover 6c1e3f9493 SelectionDAG: accommodate atomic floating stores.
We were applying a pointer truncation to floating types, which crashed LLVM.
That is Not A Good Thing(TM).

llvm-svn: 360421
2019-05-10 11:23:04 +00:00
Fangrui Song c8e68253de [Object] Fix macho-invalid.test
llvm-svn: 360420
2019-05-10 10:47:30 +00:00
Jeremy Morse a2b780b731 [DebugInfo] Use zero linenos for debug intrinsics when promoting dbg.declare
In certain circumstances, optimizations pick line numbers from debug
intrinsic instructions as the new location for altered instructions. This
is problematic because the line number of a debugging intrinsic is
meaningless (it doesn't produce any machine instruction), only the scope
information is valid. The result can be the line number of a variable
declaration "leaking" into real code from debugging intrinsics, making the
line table un-necessarily jumpy, and potentially different with / without
variable locations.

Fix this by using zero line numbers when promoting dbg.declare intrinsics
into dbg.values: this is safe for debug intrinsics as their line numbers
are meaningless, and reduces the scope for damage / misleading stepping
when optimizations pick locations from the wrong place.

Differential Revision: https://reviews.llvm.org/D59272

llvm-svn: 360415
2019-05-10 10:03:41 +00:00
Sam Clegg ea38ac5ba3 [WebAssembly] Don't assume that strongly defined symbols are DSO-local
The current PIC model for WebAssembly is more like ELF in that it
allows symbol interposition.

This means that more functions end up being addressed via the GOT
and fewer directly added to the wasm table.

One effect is a reduction in the number of wasm table entries similar
to the previous attempt in https://reviews.llvm.org/D61539 which was
reverted.

Differential Revision: https://reviews.llvm.org/D61772

llvm-svn: 360402
2019-05-10 01:52:08 +00:00
Mircea Trofin 5c31c05fbd [llvm] X86DiscriminateMemOps: insert debug info when missing
Reviewers: davidxl

Reviewed By: davidxl

Subscribers: aprantl, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61735

llvm-svn: 360396
2019-05-10 00:12:51 +00:00
Stanislav Mekhanoshin 64196850f0 [AMDGPU] Pattern for v_xor3_b32
This also allows three op patterns to use increased constant bus
limit of GFX10.

Differential Revision: https://reviews.llvm.org/D61763

llvm-svn: 360395
2019-05-10 00:09:01 +00:00
Philip Reames bd588dfd59 [X86] Improve lowering of idemptotent RMW operations
The current lowering uses an mfence. mfences are substaintially higher latency than the locked operations originally requested, but we do want to avoid contention on the original cache line. As such, use a locked instruction on a cache line assumed to be thread local.

Differential Revision: https://reviews.llvm.org/D58632

llvm-svn: 360393
2019-05-09 23:23:42 +00:00
Lang Hames 112967833e [JITLink] Fixed a signedness bug when processing X86_64_RELOC_SUBTRACTOR.
Subtractor relocation addends are signed, so we need to read them via signed
int pointers. Accidentally treating 32-bit addends as unsigned leads to
out-of-range errors when we try to add very large (>INT32_MAX) bogus addends.

llvm-svn: 360392
2019-05-09 23:17:41 +00:00
Bill Wendling 6ee7f31484 Add ".dword" directive
Summary:
The ".dword" directive is a synonym for ".xword" and is used used
by klibc, a minimalistic libc subset for initramfs.

Reviewers: t.p.northover, nickdesaulniers

Reviewed By: nickdesaulniers

Subscribers: nickdesaulniers, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61719

llvm-svn: 360381
2019-05-09 21:57:44 +00:00
Caroline Tice abf25745b3 llvm-dwarfdump: Add dwo parsing to --statistics.
Add check for, and parsing of, .dwo files to Statistics.cpp; create a new getNon
SkeletonUnitDie function for DWARFUnit.h

Reviewers: dblaikie

Differential Revision: https://review.llvm.org/D61755

llvm-svn: 360380
2019-05-09 21:53:33 +00:00
Simon Pilgrim 93bfa5af48 [X86][SSE] Fold add(shuffle(),shuffle()) to hadd on 'slow' targets (PR39920)
As reported on PR39920, "slow horizontal ops" targets tend to internally expand to 2*shuffle+add/sub - so if we can reduce 2*shuffle+add/sub to a hadd/sub then we should do it - similar port usage but reduced instruction count.

This works out in most cases, although the "PR22377" regression in vector-shuffle-combining.ll is annoying - going from 2*shuffle+add+shuffle to hadd+2*shuffle - I've opened PR41813 to cover this.

Differential Revision: https://reviews.llvm.org/D61308

llvm-svn: 360360
2019-05-09 17:45:01 +00:00
Andrea Di Biagio 4e62554bfa [MCA] Add support for nested and overlapping region markers
This patch fixes PR41523
https://bugs.llvm.org/show_bug.cgi?id=41523

Regions can now nest/overlap provided that they have different names.
Anonymous regions cannot overlap.

Region end markers must specify the region name. The only exception is for when
there is only one user-defined region; in that particular case, the region end
marker doesn't need to specify a name.

Incorrect region end markers are no longer ignored. Instead, the tool reports an
error and we exit with an error code.

Added test cases to verify the new diagnostic error messages.

Updated the llvm-mca docs to reflect this feature change.

Differential Revision: https://reviews.llvm.org/D61676

llvm-svn: 360351
2019-05-09 15:18:09 +00:00
Pavel Labath dcdb3c6650 MinidumpYAML: add support for the ThreadList stream
Summary:
The implementation is a pretty straightforward extension of the pattern
used for (de)serializing the ModuleList stream. Since there are other
streams which use the same format (MemoryList and MemoryList64, at
least). I tried to generalize the code a bit so that adding future
streams of this type can be done with less code.

Reviewers: amccarth, jhenderson, clayborg

Subscribers: markmentovai, lldb-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61423

llvm-svn: 360350
2019-05-09 15:13:53 +00:00
Roman Lebedev 9db0e72570 [X86] AMD Piledriver (BdVer2): major cleanup (mainly inverse throughput)
I've started this cleanup more several times now, but got sidetracked
elsewhere, e.g. by llvm-exegesis problems. Not this time, finally!

This is mainly cleaning up the inverse throughput values,
and a few latencies/uops, based on the llvm-exegesis measured values.

Though this is not complete by any means,
there's certainly more cleanup to be done.

The performance numbers (i've only checked by RawSpeed benchmark) aren't
really surprising - overall this *slightly* (< -1%) improves perf.

llvm-svn: 360341
2019-05-09 13:54:51 +00:00
Sanjay Patel 012adfbb96 [LoopVectorizer] fix test file to not run the entire -O3 pipeline
This test file has a long history of edits from changes outside
of vectorization, and it would happen again with the proposal in
D61726.

End-to-end testing shouldn't be happening in a test file that is
specifically checking for vector masked load/store ops.
Larger-scale testing goes in PhaseOrdering or the test-suite.

I've hopefully preserved the intent by taking what was completely
unoptimized IR in some tests and passing that through the -O1
pipeline. That becomes the input IR, and now we just run the loop
vectorizer and verify that the vector masked ops are produced as
expected.

llvm-svn: 360340
2019-05-09 13:43:22 +00:00
Fangrui Song bc1c6a0b44 [llvm-nm] Fix handling of symbol types 't' 'd' 'r'
This restores part of r359311 that was reverted by r359830.

Rewrite the symbol types to fix several issues.

Notable difference is that the type of __init_array_start changes from
't' to 'd'.

GNU nm used to mark ELF symbols relative to .init_array as 't'
https://sourceware.org/bugzilla/show_bug.cgi?id=24505 (before 2.33)
because ".init" is the prefix. The bug was copied by r287803.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D61551

llvm-svn: 360339
2019-05-09 12:43:37 +00:00
Nemanja Ivanovic 80808ed0f6 [PowerPC][NFC] Add test for D60506 to show differences in code-gen
Differential revision: https://reviews.llvm.org/D61723

llvm-svn: 360338
2019-05-09 12:26:39 +00:00