Commit Graph

395683 Commits

Author SHA1 Message Date
Eli Friedman 1f62af6346 [AArch64][SelectionDAG] Support passing/returning scalable vectors with unusual types.
This adds handling for two cases:

1. A scalable vector where the element type is promoted.
2. A scalable vector where the element count is odd (or more generally,
   not divisble by the element count of the part type).

(Some element types still don't work; for example, <vscale x 2 x i128>,
or <vscale x 2 x fp128>.)

Differential Revision: https://reviews.llvm.org/D105591
2021-08-02 15:53:16 -07:00
modimo b40a2a533a [clang] Add support for optional flag -fnew-infallible to restrict exception propagation
The declaration for the global new function in C++ is generated in the compiler front-end. When examining exception propagation, we found that this is the largest root throw site propagator requiring unwind code to be generated for callers up the stack. Allowing this to be handled immediately with termination stops upward propagation and leads to significantly less landing pads generated. This in turns leads to a performance and .text size win.

With `-fnew-infallible` this annotates the declaration with `throw()` and `__attribute__((returns_nonnull))`.  `throw()` allows the compiler to assume exceptions do not propagate out of new and eliminate it as a root throw site. Note that the definition of global new is user-replaceable so users should ensure that the one used follows these semantics.

Measuring internally, we're seeing at 0.5% CPU win in one of our large internal FB workload. Measuring on clang self-build (cd0a1226b5) we get:

thinlto/

        "dwarfehprepare.NumCleanupLandingPadsRemaining": 153494,
        "dwarfehprepare.NumNoUnwind": 26309,
thinlto_newinfallible/

        "dwarfehprepare.NumCleanupLandingPadsRemaining": 143660,
        "dwarfehprepare.NumNoUnwind": 28744,

a 1-143660/153494 = 6.4% reduction in landing pads and a 28744/26309 = 9.3% increase in the number of nounwind functions.

Testing:
ninja check-all
new test case to make sure these attributes are added correctly to global new.

Reviewed By: urnathan

Differential Revision: https://reviews.llvm.org/D105225
2021-08-02 15:45:06 -07:00
Vedant Kumar 3b0a9e7b39 [profile] Move assertIsZero to InstrProfilingUtil.c
... and rename it to 'warnIfNonZero' to better-reflect what it actually
does.

The goal is to minimize the amount of logic that's conditionally
compiled under '#if __APPLE__'.
2021-08-02 15:25:09 -07:00
Aart Bik 52c87e0437 [mlir][sparse] use consistent type for COO object and sparse tensor storage
There was a slightly mismatch between the double COO and actual numerical
type in the final sparse tensor storage (due to external formats always
using double). This minor revision removes that inconsistency by using a
properly typed COO and casting during the "add" method instead. This also
prepares alternative ways of initializing the COO object.

Reviewed By: gussmith23

Differential Revision: https://reviews.llvm.org/D107310
2021-08-02 15:24:43 -07:00
Mitch Phillips 65e9d7efb0 Improve UBSan documentation
Add more checks, info on -fno-sanitize=..., and reference to 5/2021 UBSan Oracle blog.

Authored By: DianeMeirowitz
Reviewed By: hctim

Differential Revision: https://reviews.llvm.org/D106908
2021-08-02 15:10:21 -07:00
Roman Lebedev 6f6e9a867f
[BasicTTIImpl][LoopUnroll] getUnrollingPreferences(): emit ORE remark when advising against unrolling due to a call in a loop
I'm not sure this is the best way to approach this,
but the situation is rather not very detectable unless we explicitly call it out when refusing to advise to unroll.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D107271
2021-08-03 00:57:26 +03:00
Roman Lebedev 4ba3326f17
[InstCombine] `vector_reduce_{or,and}(?ext(<n x i1>))` --> `?ext(vector_reduce_{or,and}(<n x i1>))` (PR51259)
This allows the expansion logic to actually trigger if the argument
was extended from i1 element type, like the rest of the reductions expect.

Alive2 agrees:
https://alive2.llvm.org/ce/z/wcfews (or zext)
https://alive2.llvm.org/ce/z/FCXNFx (or sext)
https://alive2.llvm.org/ce/z/f26zUY (and zext)
https://alive2.llvm.org/ce/z/jprViN (and sext)
2021-08-03 00:54:35 +03:00
Roman Lebedev cdb0dfdffa
[NFC][InstCombine] Add tests for or reduction w/ i1 element type (PR51259) 2021-08-03 00:54:35 +03:00
Roman Lebedev a22449336e
[NFC][InstCombine] Add tests for and reduction w/ i1 element type (PR51259) 2021-08-03 00:54:35 +03:00
Jessica Paquette bd13c8e610 [AArch64][GlobalISel] Emit extloads for ZExt/SExt values in assignValueToAddress
When a value is expected to be extended, we should emit an extended load rather
than a normal G_LOAD.

Add checklines to arm64-abi.ll which show that we now emit the correct loads.

For ease of comparison: https://godbolt.org/z/8WvY6EfdE

Differential Revision: https://reviews.llvm.org/D107313
2021-08-02 14:48:44 -07:00
Roman Lebedev 554fc9ad0a
[InstCombine] `vector_reduce_smax(?ext(<n x i1>))` --> `?ext(vector_reduce_{and,or}(<n x i1>))` (PR51259)
Alive2 agrees:
https://alive2.llvm.org/ce/z/3oqir9 (self)
https://alive2.llvm.org/ce/z/6cuI5m (zext)
https://alive2.llvm.org/ce/z/4FL8rD (sext)

We already handle `vector_reduce_and(<n x i1>)`,
so let's just combine into the already-handled pattern
and let the existing fold do the rest.
2021-08-03 00:29:06 +03:00
Roman Lebedev d7482a2bde
[NFC][InstCombine] Add tests for smax reduction w/ i1 element type (PR51259) 2021-08-03 00:29:06 +03:00
Roman Lebedev f47b7b6d10
[InstCombine] `vector_reduce_smin(?ext(<n x i1>))` --> `?ext(vector_reduce_{or,and}(<n x i1>))` (PR51259)
Alive2 agrees:
https://alive2.llvm.org/ce/z/noXtZ8 (self)
https://alive2.llvm.org/ce/z/JNrN6C (zext)
https://alive2.llvm.org/ce/z/58snuN (sext)

We already handle `vector_reduce_and(<n x i1>)`,
so let's just combine into the already-handled pattern
and let the existing fold do the rest.
2021-08-03 00:29:06 +03:00
Roman Lebedev 4551a41847
[NFC][InstCombine] Add tests for smin reduction w/ i1 element type (PR51259) 2021-08-03 00:29:06 +03:00
Vitaly Buka ecc2c9ba45 [sanitizer] Add callbacks for epoll_pwait2
Depends on D107207.

Differential Revision: https://reviews.llvm.org/D107209
2021-08-02 14:14:19 -07:00
Vitaly Buka f6f724c02e [sanitizer] Fix __sanitizer_syscall_post_epoll_wait
Syscall return number of initialized events which
needs to be used for unposoning.

Differential Revision: https://reviews.llvm.org/D107207
2021-08-02 14:14:18 -07:00
Nikita Popov c7770574f9 Revert "[unroll] Move multiple exit costing into consumer pass [NFC]"
This reverts commit 76940577e4.

This causes Transforms/LoopUnroll/ARM/multi-blocks.ll to fail.
2021-08-02 22:23:34 +02:00
Chang-Sun Lin, Jr b58eda39eb [ValueTracking] Fix computeConstantRange to use "may" instead of "always" semantics for llvm.assume
ValueTracking should allow for value ranges that may satisfy
llvm.assume, instead of restricting the ranges only to values that
will always satisfy the condition.

Differential Revision: https://reviews.llvm.org/D107298
2021-08-02 22:20:17 +02:00
Eli Friedman 739efad3f6 [AArch64] Regenerate fp16 tests. 2021-08-02 13:05:16 -07:00
Roman Lebedev b9b7162b8b
[InstCombine] `vector_reduce_umax(?ext(<n x i1>))` --> `?ext(vector_reduce_or(<n x i1>))` (PR51259)
Alive2 agrees:
https://alive2.llvm.org/ce/z/NbBaeT (self)
https://alive2.llvm.org/ce/z/iEaig4 (zext)
https://alive2.llvm.org/ce/z/meGb3y (sext)

We already handle `vector_reduce_and(<n x i1>)`,
so let's just combine into the already-handled pattern
and let the existing fold do the rest.
2021-08-02 23:02:23 +03:00
Roman Lebedev 9d179ee331
[NFC][InstCombine] Add tests for umax reduction w/ i1 element type (PR51259) 2021-08-02 23:02:22 +03:00
Roman Lebedev 0c13798056
[InstCombine] `vector_reduce_umin(?ext(<n x i1>))` --> `?ext(vector_reduce_and(<n x i1>))` (PR51259)
Alive2 agrees:
https://alive2.llvm.org/ce/z/XxUScW (self)
https://alive2.llvm.org/ce/z/3usTF- (zext)
https://alive2.llvm.org/ce/z/GVxwQz (sext)

We already handle `vector_reduce_and(<n x i1>)`,
so let's just combine into the already-handled pattern
and let the existing fold do the rest.
2021-08-02 23:02:22 +03:00
Roman Lebedev 7888cfe7ef
[NFC][InstCombine] Add tests for umin reduction w/ i1 element type (PR51259) 2021-08-02 23:02:22 +03:00
Simon Pilgrim 317d70ea91 [SLP][X86] Add fmuladd test coverage 2021-08-02 20:59:12 +01:00
Philip Reames 76940577e4 [unroll] Move multiple exit costing into consumer pass [NFC]
This aligns the multiple exit costing with all the other cost decisions.  Note that UnrollAndJam, which is the only other caller of the original home of this code, unconditionally bails out of multiple exit loops.
2021-08-02 12:46:23 -07:00
Alex Lorenz f575f37182 [clang][darwin] Add support for the -mtargetos= option to the driver
The new -mtargetos= option is a replacement for the existing, OS-specific options
like -miphoneos-version-min=. This allows us to introduce support for new darwin OSes
easier as they won't require the use of a new option. The older options will be
deprecated and the use of the new option will be encouraged instead.

Differential Revision: https://reviews.llvm.org/D106316
2021-08-02 12:45:40 -07:00
Eric Leese 437e37dd55 [nfc] [lldb] Support moving support files instead of copy
Split from D100299.

Reviewed By: jankratochvil

Differential Revision: https://reviews.llvm.org/D107165
2021-08-02 21:43:34 +02:00
Nikita Popov 380b8a603c [DFAJumpThreading] Use SmallPtrSet for Visited (NFC)
This set is only used for contains checks, so there is no need to
use std::set.
2021-08-02 21:30:25 +02:00
Hedin Garca 2ab18d57d7 [libc] Add differential and performance targets for sqrtf
Comparing the runtime of the sqrt functions from LLVM libc with the system libc:
|function       |perf - LLVM libc          |perf - MSVCRT
|sqrtf - Windows|44.05 sec (44051715500 ns)| 417.84 sec (417843359900 ns) = 6.96 mins

|function       |perf - LLVM libc          |perf - glibc
|sqrtf - Linux  |30.48 sec (30479458632 ns)|43.72 sec (43716901527 ns)

By running the differential test:
|function       |diff
|sqrtf - Windows|0 differing results
|sqrtf - Linux  |0 differing results

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D107229
2021-08-02 19:29:48 +00:00
Nikita Popov 3f7aea1a37 [DFAJumpThreading] Use insert return value (NFC)
Rather than find + insert. Also use range based for loop.
2021-08-02 21:21:21 +02:00
Eugene Zhulenev b537c5b414 [mlir] Async: clone constants into async.execute functions and parallel compute functions
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D107007
2021-08-02 12:17:41 -07:00
Nikita Popov 84602f98c6 [DFAJumpThreading] Remove unnecessary includes (NFC)
This file uses neither unordered_map nor unordered_set.
2021-08-02 21:13:30 +02:00
Nikita Popov e97524cba2 [DFAJumpThreading] Mark DT as preserved in LegacyPM
It is marked as preserved in NewPM, but not LegacyPM.
2021-08-02 21:13:30 +02:00
Eric Leese ea9706626c [test] [lldb] Use filename instead of index in test
In some environments this test could fail if start.S has its own DWARF
CompileUnit or similar are included before the DWARF CompileUnit for the
file.

This change makes the test independent of the index of the compile unit,
instead checking the filename.

Reviewed By: herhut, jankratochvil

Differential Revision: https://reviews.llvm.org/D107300
2021-08-02 21:12:57 +02:00
Roman Lebedev 469793efa7
[InstCombine] `vector_reduce_mul(?ext(<n x i1>))` --> `zext(vector_reduce_and(<n x i1>))` (PR51259)
Alive2 agrees:
https://alive2.llvm.org/ce/z/PDansB (self)
https://alive2.llvm.org/ce/z/55D-Xc (zext)
https://alive2.llvm.org/ce/z/LxG3-r (sext)

We already handle `vector_reduce_and(<n x i1>)`,
so let's just combine into the already-handled pattern
and let the existing fold do the rest.
2021-08-02 21:57:51 +03:00
Roman Lebedev 8baea41570
[NFC][InstCombine] Add tests for mul reduction w/ i1 element type (PR51259) 2021-08-02 21:57:51 +03:00
Philip Reames 9016beaa24 [unrollruntime] Pull out a helper function for readability and eventual reuse [nfc] 2021-08-02 11:47:27 -07:00
Jon Chesterfield 0c3dafd9ed Add Johannes to CODE_OWNERS for openmp offloading
Agreed on llvm-dev in May 2021
2021-08-02 19:45:47 +01:00
Andrzej Warzynski ad2e830fe2 [flang][nfc] Add a regression test for #50993
https://bugs.llvm.org/show_bug.cgi?id=50993 was effectively fixed in
https://reviews.llvm.org/D106727. This patch adds a regression
test specifically for the use case reported in 50993.

Differential Revision: https://reviews.llvm.org/D107260
2021-08-02 18:21:23 +00:00
Paulo Matos 245f2ee647 Revert "[WebAssembly] Add new pass to lower int/ptr conversions of reftypes"
This reverts commit ce1c59dea6.
2021-08-02 20:12:25 +02:00
Nico Weber 82dc463bb3 [lldb] Get rid of HAVE_SIGACTION
The .cpp file uses SIGNAL_POLLING_UNSUPPORTED to guard the call
to sigaction, so use it in the .h file too. (LLVM also calls
sigaction without a guard on non-Windows.)

No behavior change.

Differential Revision: https://reviews.llvm.org/D107255
2021-08-02 20:11:35 +02:00
Nico Weber 3555880f10 [gn build] (manually) port 5c2b48fdb0 2021-08-02 20:10:04 +02:00
Scott Linder 635c5ba45b [AMDGPU][HIP] Switch default DWARF version to 5
Another attempt at changing this default, now that tooling has greater
support for DWARF 5.

Differential Revision: https://reviews.llvm.org/D107190
2021-08-02 18:04:01 +00:00
Philip Reames ebc4c4e3b0 [unroll] Add clarifying comment
The option to not preserve LCSSA is in fact not tested at all in upstream.  I was tempted to just remove the code entirely, but realized I didn't need to for my actual goal.
2021-08-02 10:44:56 -07:00
peter klausler c4a65434d8 [flang] Symbol representation for dummy SubprogramDetails
Dummy procedures can be defined as subprograms with explicit
interfaces, e.g.

  subroutine subr(dummy)
    interface
      subroutine dummy(x)
        real :: x
      end subroutine
    end interface
    ! ...
  end subroutine

but the symbol table had no means of marking such symbols as dummy
arguments, so predicates like IsDummy(dummy) would fail.  Add an
isDummy_ flag to SubprogramNameDetails, analogous to the corresponding
flag in EntityDetails, and set/test it as needed.

Differential Revision: https://reviews.llvm.org/D106697
2021-08-02 10:44:27 -07:00
Alexander Yermolovich 5a865b0b1e [DWARF] Don't process .debug_info relocations for DWO Context
When we build with split dwarf in single mode the .o files that contain both "normal" debug sections and dwo sections, along with relocaiton sections for "normal" debug sections.
When we create DWARF context in DWARFObjInMemory we process relocations and store them in the map for .debug_info, etc section.
For DWO Context we also do it for non dwo dwarf sections. Which I believe is not necessary. This leads to a lot of memory being wasted. We observed 70GB extra memory being used.

I went with context sensitive approach, flag is passed in. I am not sure if it's always safe not to process relocations for regular debug sections if Obj contains .dwo sections.
If it is alternatvie might be just to scan, in constructor, sections and if there are .dwo sections not to process regular debug ones.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D106624
2021-08-02 10:41:47 -07:00
Paulo Matos ce1c59dea6 [WebAssembly] Add new pass to lower int/ptr conversions of reftypes
Add new pass LowerRefTypesIntPtrConv to generate trap
instruction for an inttoptr and ptrtoint of a reference type instead
of erroring, since calling these instructions on non-integral pointers
has been since allowed (see ac81cb7e6).

Differential Revision: https://reviews.llvm.org/D107102
2021-08-02 19:40:00 +02:00
Florian Hahn bb725c9803
[VPlan] Use defined and ops VPValues to print VPInterleaveRecipe.
This patch updates VPInterleaveRecipe::print to print the actual defined
VPValues for load groups and the store VPValue operands for store
groups.

The IR references may become outdated while transforming the VPlan and
the defined and stored VPValues always are up-to-date.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D107223
2021-08-02 18:36:36 +01:00
Chris Lattner 07548b8324 [PatternRewriter] Disable copy/assign operators.
We had a [bad bug](69655864ee) over in CIRCT
caused by accidentally passing around PatternRewriter
by value.  There is no reason to support copy/assignment
of the pattern rewriter, so disable it.

Differential Revision: https://reviews.llvm.org/D107232
2021-08-02 10:26:33 -07:00
Roman Lebedev 1e801439be
[InstCombine] `xor` reduction w/ i1 elt type is a parity check
For i1 element type, `xor` and `add` are interchangeable
(https://alive2.llvm.org/ce/z/e77hhQ), so we should treat it just like
an `add` reduction and consistently transform them both:
https://alive2.llvm.org/ce/z/MjCm5W (self)
https://alive2.llvm.org/ce/z/kgqF4M (skipped zext)
https://alive2.llvm.org/ce/z/pgy3HP (skipped sext)

Though, let's emit the IR that is similar to the one we produce for
`vector_reduce_add(<n x i1>)`.

See https://bugs.llvm.org/show_bug.cgi?id=51259
2021-08-02 20:21:37 +03:00