Commit Graph

395825 Commits

Author SHA1 Message Date
Krishna 946fd4ea65 Revert "[InstCombine] Remove nnan requirement for transformation to fabs from select"
This reverts commit 6180ce2e2a.
2021-08-03 18:08:11 +05:30
Krishna d99260641b [InstCombine] Fold phi ( inttoptr/ptrtoint x ) to phi (x)
The inttoptr/ptrtoint roundtrip optimization is not always correct.
We are working towards removing this optimization and adding support to specific cases where this optimization works.

In this patch, we focus on phi-node operands with inttoptr casts.
We know that ptrtoint( inttoptr( ptrtoint x) ) is same as ptrtoint (x).
So, we want to remove this roundtrip cast which goes through phi-node.

Reviewed By: aqjune

Differential Revision: https://reviews.llvm.org/D106289
2021-08-03 17:52:59 +05:30
Krishna 6180ce2e2a [InstCombine] Remove nnan requirement for transformation to fabs from select
In this patch, the "nnan" requirement is removed for the canonicalization of select with fcmp to fabs.
(i) FSub logic: Remove check for nnan flag presence in fsub. Example: https://alive2.llvm.org/ce/z/751svg (fsub).
(ii) FNeg logic: Remove check for the presence of nnan and nsz flag in fneg. Example: https://alive2.llvm.org/ce/z/a_fsdp (fneg).

Differential Revision: https://reviews.llvm.org/D106872
2021-08-03 17:52:58 +05:30
Aaron Ballman 80c17bb298 This feature is not in Clang 13 and only has partial support 2021-08-03 08:21:15 -04:00
Simon Pilgrim d3917bbfc6 [X86] Add title comment to separate the "CPU Families" features from the other subtarget features. NFCI.
Hopefully we can get rid of these some day...
2021-08-03 12:53:57 +01:00
Dmitry Vyukov e72ad3c19a tsan: use semaphores for thread creation synchronization
We currently use ad-hoc spin waiting to synchronize thread creation
and thread start both ways. But spinning tend to degrade ungracefully
under high contention (lots of threads are created at the same time).
Use semaphores for synchronization instead.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D107337
2021-08-03 13:47:01 +02:00
Corentin Jabot 977bdf6f44 Make simple requirements starting with requires ill-formed in in requirement body
This patch implements P2092

Simple requirements in requirement body shall not start with requires.
A warning was already in place so we just turn this warning into an error.

In addition, we add tests to make sure typename is optional in
requirement-parameter-list as per the same paper.
2021-08-03 07:42:29 -04:00
Simon Pilgrim 43ff058e78 [llvm-objcopy] IHexELFBuilder::addDataSections - fix evaluation ordering static analyzer warning
As detailed on https://pvs-studio.com/en/blog/posts/cpp/0771/ and raised on D62583, the SecNo++ increment is not guaranteed to occur before the second use of SecNo in the same addSection() call.

This patch pulls out the increment (just for clarity) and replaces the second use of SecNo with a constant zero value (we're using stable_sort so the value isn't critical).

Differential Revision: https://reviews.llvm.org/D107273
2021-08-03 12:16:59 +01:00
Guillaume Chatelet e4dee76224 [libc] Allow benchmarking several implementations at the same time.
Next step is to generate an archive with all implementations and a header listing them all.

Differential Revision: https://reviews.llvm.org/D107336
2021-08-03 10:53:11 +00:00
Dmitry Vyukov 559426ae76 tsan: use Tid/StackID types in MBlock
Replace more raw types with Tid/StackID typedefs.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D107335
2021-08-03 12:43:02 +02:00
Adam Czachorowski 08128fe705 [clang] Make member var invalid when static initializer is invalid.
Previously we would show an error, but keep the member, and also the
CXXRrecordDecl, valid. This could lead to crashes when attempting to
access the record layout or size.

Differential Revision: https://reviews.llvm.org/D105478
2021-08-03 11:52:52 +02:00
Kiran Chandramohan 59989d68ba [MLIR][OpenMP] Add support for critical construct
This patch adds the critical construct to the OpenMP dialect. The
implementation models the definition in 2.17.1 of the OpenMP 5 standard.
A name and hint can be specified. The name is a global entity or has
external linkage, it is modelled as a FlatSymbolRefAttr. Hint is
modelled as an integer enum attribute.
Also lowering to LLVM IR using the OpenMP IRBuilder.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D107135
2021-08-03 10:50:21 +01:00
Fraser Cormack cba6aab971 [RISCV] Support simple fractional steps in matching VID sequences
This patch extends the optimization of VID-sequence BUILD_VECTORs
introduced in D104921 to include simple fractional steps composed of a
separated integer numerator and denominator.

A notable limitation in this sequence detection is that only sequences
with steps N/1 or 1/D are found, meaning that the step between elements
and the frequency with which it changes is consistent across the whole
sequence. Fractional steps such as 2/3 won't be matched as those would
involve more complex tracking of state or some level of backtracking.

As is stands, however, this patch is sufficient to match common
interleave-type shuffle indices, for example matching `<0,0,1,1>` (or
commonly `<0,u,1,u>` or `<u,0,u,1>`) to an index sequence divided by 2.

While the optimization is relatively `undef`-tolerant, due to greedy
pattern-matching there even are some simple patterns which confuse the
sequence detection into identifying either a suboptimal sequence or no
sequence at all.

Currently only fractional-step sequences identified as having a
power-of-two denominator are actually lowered to RVV instructions. This
is to avoid introducing divisions into the generated code.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D106533
2021-08-03 10:38:24 +01:00
Frederik Gossen f0008a4cf4 [MLIR] Add `getI8Type` to `OpBuilder`
Differential Revision: https://reviews.llvm.org/D107332
2021-08-03 11:42:39 +02:00
Jason Molenda 0d8cd4e2d5 [AArch64InstPrinter] Change printAddSubImm to comment imm value when shifted
Add a comment when there is a shifted value,
    add x9, x0, #291, lsl #12 ; =1191936
but not when the immediate value is unshifted,
    subs x9, x0, #256 ; =256
when the comment adds nothing additional to the reader.

Differential Revision: https://reviews.llvm.org/D107196
2021-08-03 02:28:46 -07:00
Vladislav Vinogradov 9b50844fd7 [mlir] Fix delayed object interfaces registration
Store both interfaceID and objectID as key for interface registration callback.
Otherwise the implementation allows to register only one external model per one object in the single dialect.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D107274
2021-08-03 12:21:55 +03:00
Krasimir Georgiev 4f4f278305 [clang-format] don't break between function and function name in JS
The patch https://reviews.llvm.org/D105964 (58494c856a)
updated detection of function declaration names. It had the unfortunate
consequence that it started breaking between `function` and the function
name in some cases in JavaScript code.

This patch addresses this.

Reviewed By: MyDeveloperDay, owenpan

Differential Revision: https://reviews.llvm.org/D107267
2021-08-03 11:18:21 +02:00
Dmitry Vyukov d77b476c19 tsan: avoid extra call indirection in unaligned access functions
Currently unaligned access functions are defined in tsan_interface.cpp
and do a real call to MemoryAccess. This means we have a real call
and no read/write constant propagation.

Unaligned memory access can be quite hot for some programs
(observed on some compression algorithms with ~90% of unaligned accesses).

Move them to tsan_interface_inl.h to avoid the additional call
and enable constant propagation.
Also reorder the actual store and memory access handling for
__sanitizer_unaligned_store callbacks to enable tail calling
in MemoryAccess.

Depends on D107282.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D107283
2021-08-03 11:12:49 +02:00
Esme-Yi 69396896fb [llvm-readobj][XCOFF] Fix the error dumping for the first
item of StringTable.

Summary: For the string table in XCOFF, the first 4 bytes
contains the length of the string table, so we should
print the string entries from fifth bytes. This patch
also adds tests for llvm-readobj dumping the string
table.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D105522
2021-08-03 09:08:58 +00:00
Dmitry Vyukov 18c6ed2f0f tsan: add AccessVptr
Add AccessVptr access type.
For now it's converted to the same thr->is_vptr_access,
but later it will be passed directly to ReportRace
and will enable efficient tail calling in MemoryAccess function
(currently __tsan_vptr_update/__tsan_vptr_read can't use
tail calls in MemoryAccess because of the trailing assignment
to thr->is_vptr_access).

Depends on D107276.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D107282
2021-08-03 11:03:36 +02:00
Dmitry Vyukov 831910c5c4 tsan: new MemoryAccess interface
Currently we have MemoryAccess function that accepts
"bool kAccessIsWrite, bool kIsAtomic" and 4 wrappers:
MemoryRead/MemoryWrite/MemoryReadAtomic/MemoryWriteAtomic.

Such scheme with bool flags is not particularly scalable/extendable.
Because of that we did not have Read/Write wrappers for UnalignedMemoryAccess,
and "true, false" or "false, true" at call sites is not very readable.

Moreover, the new tsan runtime will introduce more flags
(e.g. move "freed" and "vptr access" to memory acccess flags).
We can't have 16 wrappers and each flag also takes whole
64-bit register for non-inlined calls.

Introduce AccessType enum that contains bit mask of
read/write, atomic/non-atomic, and later free/non-free,
vptr/non-vptr.
Such scheme is more scalable, more readble, more efficient
(don't consume multiple registers for these flags during calls)
and allows to cover unaligned and range variations of memory
access functions as well.

Also switch from size log to just size.
The new tsan runtime won't have the limitation of supporting
only 1/2/4/8 access sizes, so we don't need the logarithms.

Also add an inline thunk that converts the new interface to the old one.
For inlined calls it should not add any overhead because
all flags/size can be computed as compile time.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D107276
2021-08-03 11:03:23 +02:00
David Sherwood 0156f91f3b [NFC] Rename enable-strict-reductions to force-ordered-reductions
I'm renaming the flag because a future patch will add a new
enableOrderedReductions() TTI interface and so the meaning of this
flag will change to be one of forcing the target to enable/disable
them. Also, since other places in LoopVectorize.cpp use the word
'Ordered' instead of 'strict' I changed the flag to match.

Differential Revision: https://reviews.llvm.org/D107264
2021-08-03 09:33:01 +01:00
Cullen Rhodes a02bbeeae7 [AArch64][AsmParser] NFC: Use helpers in matrix tile list parsing 2021-08-03 08:13:01 +00:00
Jay Foad 40202b13b2 [AMDGPU] Legalize operands of V_ADDC_U32_e32 and friends
These instructions have an implicit use of vcc which counts towards the
constant bus limit. Pre gfx10 this means that the explicit operands
cannot be sgprs. Use the custom inserter hook to call legalizeOperands
to enforce that restriction.

Fixes https://bugs.llvm.org/show_bug.cgi?id=51217

Differential Revision: https://reviews.llvm.org/D106868
2021-08-03 09:04:52 +01:00
Martin Storsjö ce49fd024b [clang] [MinGW] Let the last of -mconsole/-mwindows have effect
Don't just check for the existence of one, but check which one was
specified last, if any.

This fixes https://llvm.org/PR51296.

Differential Revision: https://reviews.llvm.org/D107261
2021-08-03 10:55:44 +03:00
Martin Storsjö b7fb5b54a9 [LLD] [MinGW] Support both "--opt value" and "--opt=value" for more options
This does the same fix as D107237 but for a couple more options,
converting all remaining cases of such options to accept both
forms, for consistency. This fixes building e.g. openldap, which
uses --image-base=<value>.

Differential Revision: https://reviews.llvm.org/D107253
2021-08-03 10:55:44 +03:00
Florian Mayer 150395c2bc [hwasan] report failing thread for invalid free.
Reviewed By: hctim

Differential Revision: https://reviews.llvm.org/D107270
2021-08-03 08:53:53 +01:00
Paulo Matos d3a0a65bf0 Reland: "[WebAssembly] Add new pass to lower int/ptr conversions of reftypes"
Add new pass LowerRefTypesIntPtrConv to generate debugtrap
instruction for an inttoptr and ptrtoint of a reference type instead
of erroring, since calling these instructions on non-integral pointers
has been since allowed (see ac81cb7e6).

Differential Revision: https://reviews.llvm.org/D107102
2021-08-03 09:20:51 +02:00
Vitaly Buka 735da5f5ad [NFC][sanitizer] Add static to internal functions 2021-08-03 00:12:36 -07:00
Kim-Anh Tran 1dfc13cf54 Test commit to check commit access 2021-08-03 08:51:38 +02:00
Uday Bondhugula 3d63d1a390 [MILR][NFC] Silence clang-tidy warning in AffineOps.cpp
Silence clang-tidy warning in AffineOps.cpp due to the inability to see
through the typeswitch. NFC.

Differential Revision: https://reviews.llvm.org/D106125
2021-08-03 11:54:28 +05:30
Chirag Khandelwal 77ebfba68b [Flang][Openmp] Upgrade TASKGROUP construct to 5.0.
In OMP 5.0 specification clause-list with
* task_reduction
* allocate
were allowed on taskgroup construct.

Fix XFAIL - omp-taskloop01.f90.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D93373
2021-08-03 10:27:47 +05:30
jacquesguan 7900ee0b61 [RISCV] Teach VSETVLI insertion to merge the unused VSETVLI with the one need to be insert after it.
If a vsetvli instruction is not compatible with the next vector instruction,
and there is no other things that may update or use VL/VTYPE, we could merge
it with the next vsetvli instruction that should be insert for the vector
instruction.

This commit only merge VTYPE with the former vsetvli instruction which has
the same VL.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D106857
2021-08-03 12:06:59 +08:00
jacquesguan ed80458834 [RISCV][test] Precommit tests for VSETVLI insertion improvement (D106857).
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D106865
2021-08-03 12:06:59 +08:00
luxufan 0023caf952 [RuntimeDyldChecker] Delete comparision of integers of different signs 2021-08-03 11:38:25 +08:00
luxufan f4e418ac1e [RuntimeDyldChecker] Support offset in decode_operand expr
In RISCV's relocations, some relocations are comprised of two relocation types. For example, R_RISCV_PCREL_HI20 and R_RISCV_PCREL_LO12_I compose a PC relative relocation. In general the compiler will set a label in the position of R_RISCV_PCREL_HI20. So, to test the R_RISCV_PCREL_LO12_I relocation, we need decode instruction at position of the label points to R_RISCV_PCREL_HI20 plus 4 (the size of a riscv non-compress instruction).

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D105528
2021-08-03 11:25:51 +08:00
Matthias Springer 18d10fbe87 [mlir][affine] addLowerOrUpperBound: Make map+operand composing optional
There are cases in which it is not desirable to fully compose the bound map with the operands when adding lower/upper bounds to a `FlatAffineConstraints`.

E.g., this is the case when bounds should be expressed in terms of the operands only (and not the operands' dependencies). This also makes `addLowerOrUpperBound` useable together with operands that are defined through semi-affine expressions.

Differential Revision: https://reviews.llvm.org/D107221
2021-08-03 11:31:00 +09:00
Matthias Springer fef4708472 [mlir][affine] addLowerOrUpperBound: Disallow pos among boundOperands
Bounds such as `dim_{pos} <= c_1 * dim_x + ...` where `x == pos` are invalid. `addLowerOrUpperBound` previously added an incorrect inequality to the set. Such cases are now explicitly rejected.

Differential Revision: https://reviews.llvm.org/D107220
2021-08-03 11:18:47 +09:00
Matthias Springer 3a41ff4883 [mlir][SCF] Peel scf.for loops for even step divison
Add ForLoopBoundSpecialization pass, which specializes scf.for loops into a "main loop" where `step` divides the iteration space evenly and into an scf.if that handles the last iteration.

This transformation is useful for vectorization and loop tiling. E.g., when vectorizing loads/stores, programs will spend most of their time in the main loop, in which only unmasked loads/stores are used. Only the in the last iteration (scf.if), slower masked loads/stores are used.

Subsequent commits will apply this transformation in the SparseDialect and in Linalg's loop tiling.

Differential Revision: https://reviews.llvm.org/D105804
2021-08-03 10:21:38 +09:00
wlei 6da9241aab [llvm-profgen] Refactor PerfReader to allow different types of perf scripts
In order to support different types of perf scripts, this change tried to refactor `PerfReader` by adding the base class `PerfReaderBase` and current HybridPerfReader is derived from it for CS profile generation. Common functions like, passMM2PEvents, extract_lbrs, extract_callstack, etc. can be reused.

Next step is to add LBR only reader(for non-CS profile) and aggregated perf scripts reader(do a pre-aggregation of scripts).

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D107014
2021-08-02 17:18:47 -07:00
Vitaly Buka 9205143f07 [NFC][tsan] clang-format two files 2021-08-02 16:28:26 -07:00
Shimin Cui 7ce98cf56e [GlobalOpt] Fix the assert for stored once non-pointer to global address
This is to fix the assert @bjope reported due to the code change of https://reviews.llvm.org/D106589. The test case from @bjope is also included.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D107302
2021-08-02 19:23:29 -04:00
Eli Friedman 1f62af6346 [AArch64][SelectionDAG] Support passing/returning scalable vectors with unusual types.
This adds handling for two cases:

1. A scalable vector where the element type is promoted.
2. A scalable vector where the element count is odd (or more generally,
   not divisble by the element count of the part type).

(Some element types still don't work; for example, <vscale x 2 x i128>,
or <vscale x 2 x fp128>.)

Differential Revision: https://reviews.llvm.org/D105591
2021-08-02 15:53:16 -07:00
modimo b40a2a533a [clang] Add support for optional flag -fnew-infallible to restrict exception propagation
The declaration for the global new function in C++ is generated in the compiler front-end. When examining exception propagation, we found that this is the largest root throw site propagator requiring unwind code to be generated for callers up the stack. Allowing this to be handled immediately with termination stops upward propagation and leads to significantly less landing pads generated. This in turns leads to a performance and .text size win.

With `-fnew-infallible` this annotates the declaration with `throw()` and `__attribute__((returns_nonnull))`.  `throw()` allows the compiler to assume exceptions do not propagate out of new and eliminate it as a root throw site. Note that the definition of global new is user-replaceable so users should ensure that the one used follows these semantics.

Measuring internally, we're seeing at 0.5% CPU win in one of our large internal FB workload. Measuring on clang self-build (cd0a1226b5) we get:

thinlto/

        "dwarfehprepare.NumCleanupLandingPadsRemaining": 153494,
        "dwarfehprepare.NumNoUnwind": 26309,
thinlto_newinfallible/

        "dwarfehprepare.NumCleanupLandingPadsRemaining": 143660,
        "dwarfehprepare.NumNoUnwind": 28744,

a 1-143660/153494 = 6.4% reduction in landing pads and a 28744/26309 = 9.3% increase in the number of nounwind functions.

Testing:
ninja check-all
new test case to make sure these attributes are added correctly to global new.

Reviewed By: urnathan

Differential Revision: https://reviews.llvm.org/D105225
2021-08-02 15:45:06 -07:00
Vedant Kumar 3b0a9e7b39 [profile] Move assertIsZero to InstrProfilingUtil.c
... and rename it to 'warnIfNonZero' to better-reflect what it actually
does.

The goal is to minimize the amount of logic that's conditionally
compiled under '#if __APPLE__'.
2021-08-02 15:25:09 -07:00
Aart Bik 52c87e0437 [mlir][sparse] use consistent type for COO object and sparse tensor storage
There was a slightly mismatch between the double COO and actual numerical
type in the final sparse tensor storage (due to external formats always
using double). This minor revision removes that inconsistency by using a
properly typed COO and casting during the "add" method instead. This also
prepares alternative ways of initializing the COO object.

Reviewed By: gussmith23

Differential Revision: https://reviews.llvm.org/D107310
2021-08-02 15:24:43 -07:00
Mitch Phillips 65e9d7efb0 Improve UBSan documentation
Add more checks, info on -fno-sanitize=..., and reference to 5/2021 UBSan Oracle blog.

Authored By: DianeMeirowitz
Reviewed By: hctim

Differential Revision: https://reviews.llvm.org/D106908
2021-08-02 15:10:21 -07:00
Roman Lebedev 6f6e9a867f
[BasicTTIImpl][LoopUnroll] getUnrollingPreferences(): emit ORE remark when advising against unrolling due to a call in a loop
I'm not sure this is the best way to approach this,
but the situation is rather not very detectable unless we explicitly call it out when refusing to advise to unroll.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D107271
2021-08-03 00:57:26 +03:00
Roman Lebedev 4ba3326f17
[InstCombine] `vector_reduce_{or,and}(?ext(<n x i1>))` --> `?ext(vector_reduce_{or,and}(<n x i1>))` (PR51259)
This allows the expansion logic to actually trigger if the argument
was extended from i1 element type, like the rest of the reductions expect.

Alive2 agrees:
https://alive2.llvm.org/ce/z/wcfews (or zext)
https://alive2.llvm.org/ce/z/FCXNFx (or sext)
https://alive2.llvm.org/ce/z/f26zUY (and zext)
https://alive2.llvm.org/ce/z/jprViN (and sext)
2021-08-03 00:54:35 +03:00
Roman Lebedev cdb0dfdffa
[NFC][InstCombine] Add tests for or reduction w/ i1 element type (PR51259) 2021-08-03 00:54:35 +03:00