Commit Graph

378176 Commits

Author SHA1 Message Date
Kazu Hirata d44ca0cf2f [CodeGen] Forward-declare TargetMachine (NFC)
InstrEmitter.h needs TargetMachine but relies on a forward declaration
of TargetMachine in MachineOperand.h.  This patch adds a forward
declaration right in InstrEmitter.h.

While we are at it, this patch removes the one in MachineOperand.h,
where it is unnecessary.
2021-01-24 12:18:54 -08:00
Craig Topper 116177afcc [RISCV] Use SRLIWPat in the PACKUW pattern.
This makes the code more tolerant if we ever change SimplifyDemandedBits
to not remove 1s from the lsbs of a contiguous mask.
2021-01-24 10:41:58 -08:00
Jon Chesterfield e5e448aafa [libomptarget][cuda] Fix build, change missed from D95274 2021-01-24 18:30:04 +00:00
Shilei Tian cfd978d5d3 [OpenMP] Fixed test environment of `check-libomptarget-nvptx`
D95161 removed the option `--libomptarget-nvptx-path`, which is used in
the tests for `libomptarget-nvptx`.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95293
2021-01-24 13:18:33 -05:00
Nikita Popov 8b9df70bf7 [Utils] Use NoAliasScopeDeclInst in a few more places (NFC)
In the cloning infrastructure, only track an MDNode mapping,
without explicitly storing the Metadata mapping, same as is done
during inlining. This makes things slightly simpler.
2021-01-24 16:24:11 +01:00
David Green 4cc94b7313 [CostModel] Tests for showing the cost of intrinsics from the vectorizer. NFC 2021-01-24 14:47:15 +00:00
Florian Hahn f959d8195d
[LTO] Move DisableVerify setting to LTOCodeGenerator class (NFC).
To simplify the transition to using LTOBackend, move DisableVerify to
the LTOCodeGenerator class, like most/all other options.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D95223
2021-01-24 14:14:40 +00:00
Sanjay Patel 77adbe6a8c [SLP] fix fast-math requirements for fmin/fmax reductions
a6f0221276 enabled intersection of FMF on reduction instructions,
so it is safe to ease the check here.

There is still some room to improve here - it looks like we
have nearly duplicate flags propagation logic inside of the
LoopUtils helper but it is limited targets that do not form
reduction intrinsics (they form the shuffle expansion).
2021-01-24 08:55:56 -05:00
David Zarzycki 1bc8daba4f Fix x86 exegesis tests after c042aff886
In c042aff886, unused FileCheck prefixes became an error, which exposed some testing bugs in four exegesis tests. I've tried my best to either fix the testing bugs, or expand the testing to cover more scenarios.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D95287
2021-01-24 08:51:06 -05:00
David Green 06ab7953e9 [AArch64] Saturating add cost tests. NFC 2021-01-24 13:49:17 +00:00
Jeroen Dobbelaere dcc7706fcf [InstCombine] Remove unused llvm.experimental.noalias.scope.decl
A @llvm.experimental.noalias.scope.decl is only useful if there is !alias.scope and !noalias metadata that uses the declared scope.
When that is not the case for at least one of the two, the intrinsic call can as well be removed.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D95141
2021-01-24 13:55:50 +01:00
Jeroen Dobbelaere 659c7bcde6 [LoopRotate] Use llvm.experimental.noalias.scope.decl for duplicating noalias metadata as needed
Similar to D92887, LoopRotation also needs duplicate the noalias scopes when rotating a `@llvm.experimental.noalias.scope.decl` across a block boundary.
This is based on the version from the Full Restrict paches (D68511).

The problem it fixes also showed up in Transforms/Coroutines/ex5.ll after D93040 (when enabling strict checking with -verify-noalias-scope-decl-dom).

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D94306
2021-01-24 13:53:13 +01:00
Jeroen Dobbelaere 774629641b [LoopUnroll] Use llvm.experimental.noalias.scope.decl for duplicating noalias metadata as needed
This is a fix for https://bugs.llvm.org/show_bug.cgi?id=39282. Compared to D90104, this version is based on part of the full restrict patched (D68484) and uses the `@llvm.experimental.noalias.scope.decl` intrinsic to track the location where !noalias and !alias.scope scopes have been introduced. This allows us to only duplicate the scopes that are really needed.

Notes:
- it also includes changes and tests from D90104

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D92887
2021-01-24 13:48:20 +01:00
Lang Hames b3d7e761e3 [examples] Fix "Target does not support MC emission!" in HowToUseJIT example.
Patch by Shivam Gupta. Thanks Shivam!

Differential Revision: https://reviews.llvm.org/D92280
2021-01-24 22:11:54 +11:00
Jon Chesterfield c3074d48d3 [libomptarget][nvptx] Replace cuda atomic primitives with clang intrinsics
[libomptarget][nvptx] Replace cuda atomic primitives with clang intrinsics

Tested by diff of IR generated for target_impl.cu before and after. NFC. Part
of removing deviceRTL build time dependency on cuda SDK.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D95294
2021-01-24 10:59:15 +00:00
Nikita Popov 5d12b976b0 [ValueTracking] Don't assume readonly function will return
This is similar to D94106, but for the
isGuaranteedToTransferExecutionToSuccessor() helper. We should not
assume that readonly functions will return, as this is only true for
mustprogress functions (in which case we already infer willreturn).
As with the DCE change, for now continue assuming that readonly
intrinsics will return, as not all target intrinsics have been
annotated yet.

Differential Revision: https://reviews.llvm.org/D95288
2021-01-24 10:40:21 +01:00
Craig Topper c50457f3e4 [RISCV] Make the code in MatchSLLIUW ignore the lower bits of the AND mask where the shift has guaranteed zeros.
This avoids being dependent on SimplifyDemandedBits having cleared
those bits.

It could make sense to teach SimplifyDemandedBits to keep all
lower bits 1 in an AND mask when possible. This could be
implemented with slli+srli in the general case rather than
needing to materialize the constant.
2021-01-24 00:34:45 -08:00
Lang Hames 45ad6fac6a [JITLink] Use edge kind names for fixups in EHFrameEdgeFixer.
Previously FDE field names were used, but the fixup kind used for a field can
vary based on the pointer encoding.

This change will improve readability / maintainability when EH-frame support is
added to JITLink/ELF.
2021-01-24 15:38:04 +11:00
Ben Shi 2a4acf3ea8 [AVR] Optimize 8-bit int shift
Reviewed By: dylanmckay

Differential Revision: https://reviews.llvm.org/D90678
2021-01-24 11:04:37 +08:00
Michael Kruse b890fafe67 [OpenMPIRBuilder] Silence compiler warning. NFC.
Address the compiler warning
OMPIRBuilder.cpp:1232:27: comparison of integers of different signs: 'size_t' (aka 'unsigned long') and 'int' [-Wsign-compare]
2021-01-23 21:00:37 -06:00
Michael Kruse b7dee667b6 [OpenMPIRBuilder] Implement tileLoops.
The  tileLoops method implements the code generation part of the tile directive introduced in OpenMP 5.1. It takes a list of loops forming a loop nest, tiles it, and returns the CanonicalLoopInfo representing the generated loops.

The implementation takes n CanonicalLoopInfos, n tile size Values and returns 2*n new CanonicalLoopInfos. The input CanonicalLoopInfos are invalidated and BBs not reused in the new loop nest removed from the function.

In a modified version of D76342, I was able to correctly compile and execute a tiled loop nest.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D92974
2021-01-23 19:39:29 -06:00
Craig Topper c7d5d8fa33 [RISCV] Group some Zbs isel patterns together and remove a stale comment. NFC 2021-01-23 16:45:05 -08:00
Zbigniew Sarbinowski 92bb81aac1 [SystemZ][ZOS] Provide PATH_MAX macro for libcxx
Defining PATH_MAX to _XOPEN_PATH_MAX which is the closest macro available on z/OS.
Note that this value is 1024 which is 4 times smaller from same macro on Linux.

Reviewed By: #libc, ldionne, hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D92110
2021-01-24 00:29:39 +00:00
Craig Topper 998057ec06 [RISCV] Add isel patterns to remove masks on SLO/SRO shift amounts. 2021-01-23 15:57:41 -08:00
Craig Topper 5a73daf907 [RISCV] Add test cases for SRO/SLO with shift amounts masked to bitwidth-1. NFC
The sro/slo instructions ignore extra bits in the shift amount,
so we can ignore the mask just like we do for sll, srl, and sra.
2021-01-23 15:45:51 -08:00
Craig Topper d2927f786e [RISCV] Add isel patterns to remove (and X, 31) from sllw/srlw/sraw shift amounts.
We try to do this during DAG combine with SimplifyDemandedBits,
but it fails if there are multiple nodes using the AND. For
example, multiple shifts using the same shift amount.
2021-01-23 15:08:18 -08:00
Jon Chesterfield dc70c56be5 [libomptarget][amdgpu][nfc] Update comments
[libomptarget][amdgpu][nfc] Update comments

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95295
2021-01-23 22:53:58 +00:00
Stella Laurenzo 52586c46b0 [mlir][CAPI] Add result type inference to the CAPI.
* Adds a flag to MlirOperationState to enable result type inference using the InferTypeOpInterface.
* I chose this level of implementation for a couple of reasons:
  a) In the creation flow is naturally where generated and custom builder code will be invoking such a thing
  b) it is a bit more efficient to share the data structure and unpacking vs having a standalone entry-point
  c) we can always decide to expose more of these interfaces with first-class APIs, but that doesn't preclude that we will always want to use this one in this way (and less API surface area for common things is better for API stability and evolution).
* I struggled to find an appropriate way to test it since we don't link the test dialect into anything CAPI accessible at present. I opted instead for one of the simplest ops I found in a regular dialect which implements the interface.
* This does not do any trait-based type selection. That will be left to generated tablegen wrappers.

Differential Revision: https://reviews.llvm.org/D95283
2021-01-23 14:30:51 -08:00
Roman Lebedev 6f2753273e
[NFC][SimplifyCFG] Extract CloneInstructionsIntoPredecessorBlockAndUpdateSSAUses() out of PerformBranchToCommonDestFolding()
To be used in PerformValueComparisonIntoPredecessorFolding()
2021-01-24 00:54:55 +03:00
Roman Lebedev 67f9c87a65
[NFC][SimplifyCFG] Perform early-continue in FoldValueComparisonIntoPredecessors() per-pred loop 2021-01-24 00:54:54 +03:00
Roman Lebedev a4e6c2e647
[NFC][SimplifyCFG] Extract PerformValueComparisonIntoPredecessorFolding() out of FoldValueComparisonIntoPredecessors()
Less nested code is much easier to follow and modify.
2021-01-24 00:54:54 +03:00
Nikita Popov c83cff45c7 [IR] Add NoAliasScopeDeclInst (NFC)
Add an intrinsic type class to represent the
llvm.experimental.noalias.scope.decl intrinsic, to make code
working with it a bit nicer by hiding the metadata extraction
from view.
2021-01-23 22:40:32 +01:00
Arthur Eubanks c37dd3b6d5 [NewPM][opt] Make -enable-new-pm default to LLVM_ENABLE_NEW_PASS_MANAGER
This is controlled by the ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER CMake flag.

https://lists.llvm.org/pipermail/llvm-dev/2021-January/147993.html

Reviewed By: ychen, asbirlea

Differential Revision: https://reviews.llvm.org/D95254
2021-01-23 12:36:09 -08:00
Arthur Eubanks a22ba5afc8 [test] Pin dead-calls-willreturn.ll to legacy PM
The new PM inliner does not delete dead calls.
2021-01-23 12:35:36 -08:00
Jon Chesterfield 78b0630b72 [libomptarget][cuda] Call v2 functions explicitly
[libomptarget][cuda] Call v2 functions explicitly

rtl.cpp calls functions like cuMemFree that are replaced by a macro
in cuda.h with cuMemFree_v2. This patch changes the source to use
the v2 names consistently.

See also D95104, D95155 for the idea. Alternatives are to use a mixture,
e.g. call the macro names and explictly dlopen the _v2 names, or to keep
the current status where the symbols are replaced by macros in both files

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95274
2021-01-23 20:33:13 +00:00
Nikita Popov cd3d80eace [PhaseOrdering] Add tests for PR44461 and PR48844 (NFC)
In both cases, optimization is prevented because
"br X == C || X == C2" is converted into a switch. In one case
loop rotation is blocked, in the other vectorization.
2021-01-23 21:24:54 +01:00
Nikita Popov 5c62d66131 [SimplifyCFG] Regenerate test checks (NFC) 2021-01-23 21:24:54 +01:00
Shilei Tian 5ad038aafa [Clang][OpenMP][NVPTX] Replace `libomptarget-nvptx-path` with `libomptarget-nvptx-bc-path`
D94700 removed the static library so we no longer need to pass
`-llibomptarget-nvptx` to `nvlink`. Since the bitcode library is the only device
runtime for now, instead of emitting a warning when it is not found, an error
should be raised. We also set a new option `libomptarget-nvptx-bc-path` to let
user choose which bitcode library is being used.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D95161
2021-01-23 14:42:38 -05:00
Kazu Hirata e4847a7fcf Revert "[Target] Use llvm::append_range (NFC)"
This reverts commit cc7a238286.

The X86WinEHState.cpp hunk seems to break certain builds.
2021-01-23 11:25:27 -08:00
Mark de Wever 99d5fad7a5 [libc++] Remove invalid C++20 code from a test.
During the review of D91986 it has been discovered the in C++11
deprecated `throw()` exception specification has been removed in
C++20. Removed the part of the test code using this feature.
2021-01-23 20:10:17 +01:00
Florian Hahn 166d40f2ed
[FuzzMutate] Add mutator to modify instruction flags.
This patch adds a new InstModificationIRStrategy to mutate flags/options
for instructions. For example, it may add or remove nuw/nsw flags from
add, mul, sub, shl instructions or change the predicate for icmp
instructions.

Subtle changes such as those mentioned above should lead to a more
interesting range of inputs. The presence or absence of overflow flags
can expose subtle bugs, for example.

Reviewed By: bogner

Differential Revision: https://reviews.llvm.org/D94905
2021-01-23 19:05:20 +00:00
Michael Kruse 3b9677e1ec [Polly] Track defined behavior for PHI predecessor computation.
ZoneAlgorithms's computePHI relies on being provided with consistent a
schedule to compute the statement prodecessors of a statement containing
PHINodes. Otherwise unexpected results such as PHI nodes with multiple
predecessors can occur which would result in problems in the
algorithms expecting consistent data.

In the added test case, statement instances are scrubbed from the
SCoP their execution would result in undefined behavior (Due to a nsw
overflow). As already being undefined behavior in LLVM-IR, neither
AssumedContext nor InvalidContext are updated, giving computePHI no
means to avoid these cases.

Intoduce a new SCoP property, the DefinedBehaviorContext, that among
the runtime-checked conditions, also tracks the assumptions not needing
a runtime check, in particular those affecting the assumed control flow.
This replaces the manual combination of the 3 other contexts that was
already done in computePHI and setNewAccessRelation. Currently, the only
additional assumption is that loop induction variables will nsw flag for
not wrap, but potentially more can be added. Use in
hasFeasibleRuntimeContext, isl::ast_build and gisting are other
potential uses.

To limit computational complexity, the DefinedBehaviorContext is not
availabe if it grows too large (atm hardcoded to 8 disjuncts).

Possible other fixes include bailing out in computePHI when
inconsistencies are detected, choose an arbitrary value for inconsistent
cases (since it is undefined behavior anyways), or make the code
receiving the result from ComputePHI handle inconsistent data. All of
them reduce the quality of implementation having to bail out more often
and disabling the ability to assert on actually wrong results.

This fixes llvm.org/PR48783.
2021-01-23 13:03:49 -06:00
Michael Kruse 02e8a5ad3c [Polly] Allow param sets for dumpPw(). 2021-01-23 13:03:48 -06:00
Michael Kruse de0457a013 [Polly] Clean up hasFeasibleRuntimeContext. 2021-01-23 13:03:48 -06:00
Michael Kruse a5b895110f [Polly] Gist new access relations using the SCoP context.
This simplifies the access relations.
2021-01-23 13:03:48 -06:00
Kazu Hirata 1238378f18 [llvm] Use pop_back_val (NFC) 2021-01-23 10:56:33 -08:00
Kazu Hirata cc7a238286 [Target] Use llvm::append_range (NFC) 2021-01-23 10:56:31 -08:00
Kazu Hirata 2f1ffa94d7 [llvm] Forward-declare ICFLoopSafetyInfo (NFC)
LoopUtils.h needs ICFLoopSafetyInfo but relies on a forward
declaration of ICFLoopSafetyInfo in IVDescriptors.h.  This patch adds
a forward declaration right in LoopUtils.h.

While we are at it, this patch removes the one in IVDescriptors.h,
where it is unnecessary.
2021-01-23 10:56:30 -08:00
Florian Hahn d60b74c28a
[InstCombine] Set MadeIRChange in replaceInstUsesWith.
Some utilities used by InstCombine, like SimplifyLibCalls, may add new
instructions and replace the uses of a call, but return nullptr because
the inserted call produces multiple results.

Previously, the replaced library calls would get removed by
InstCombine's deleter, but after
292077072e this may not happen, if the
willreturn attribute is missing.

As a work-around, update replaceInstUsesWith to set MadeIRChange, if it
replaces any uses. This catches the cases where it is used as replacer
by utilities used by InstCombine and seems useful in general; updating
uses will modify the IR.

This fixes an expensive-check failure when replacing
@__sinpif/@__cospifi with @__sincospif_sret.
2021-01-23 17:52:59 +00:00
Mark de Wever a8e06361dd [libc++] Implements concept destructible
Implements parts of:
- P0898R3 Standard Library Concepts
- P1754 Rename concepts to standard_case for C++20, while we still can

Reviewed By: ldionne, miscco, #libc

Differential Revision: https://reviews.llvm.org/D91004
2021-01-23 18:17:25 +01:00