Commit Graph

360182 Commits

Author SHA1 Message Date
mydeveloperday 7a1bcf9f9a [polly] NFC clang-format change following D83564 2020-07-12 18:58:53 +01:00
Craig Topper f8f007e378 [X86] Consistently use 128 as the PSHUFB/VPPERM index for zero
Bit 7 of the index controls zeroing, the other bits are ignored when bit 7 is set. Shuffle lowering was using 128 and shuffle combining was using 255. Seems like we should be consistent.

This patch changes shuffle combining to use 128 to match lowering.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D83587
2020-07-12 10:52:43 -07:00
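The zeroing rule described in f8f007e378 can be illustrated with a small model. This is only a sketch of the 128-bit PSHUFB byte-selection semantics, not code from the patch; the function name `pshufb` and the sample values are made up for illustration.

```python
def pshufb(src, mask):
    """Model a 16-byte PSHUFB: each mask byte selects a source byte or zero."""
    assert len(src) == 16 and len(mask) == 16
    out = []
    for m in mask:
        if m & 0x80:
            # Bit 7 set: the result byte is zeroed, the other bits are ignored.
            out.append(0)
        else:
            # Bit 7 clear: the low 4 bits index into the source vector.
            out.append(src[m & 0x0F])
    return out

src = list(range(100, 116))
mask_128 = [0, 128, 2, 128] + [128] * 12
mask_255 = [0, 255, 2, 255] + [255] * 12
result_128 = pshufb(src, mask_128)
result_255 = pshufb(src, mask_255)
```

Both masks produce the same result, which is why either constant works and the patch only needs to pick one for consistency.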
Craig Topper 04013a07ac [X86] Fix two places that appear to misuse peekThroughOneUseBitcasts
peekThroughOneUseBitcasts checks the use count of the operand of the bitcast, not the bitcast itself. So I think that means we need to do any outside hasOneUse checks before calling the function, not after.

I was working on another patch where I misused the function and did a very quick audit to see if there were other similar mistakes.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D83598
2020-07-12 10:52:43 -07:00
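The pitfall described in 04013a07ac can be reproduced with a toy model. The `Node` class and the snake_case helper below are hypothetical stand-ins for the SelectionDAG types, written only to mirror the behaviour the commit message describes.

```python
class Node:
    """Toy DAG node: an opcode, an optional single operand, and a use count."""

    def __init__(self, opcode, operand=None):
        self.opcode = opcode
        self.operand = operand
        self.uses = 0
        if operand is not None:
            operand.uses += 1

    def has_one_use(self):
        return self.uses == 1


def peek_through_one_use_bitcasts(v):
    # Checks the use count of the bitcast's operand, not of the bitcast itself.
    while v.opcode == "bitcast" and v.operand.has_one_use():
        v = v.operand
    return v


load = Node("load")
cast = Node("bitcast", load)   # the load has exactly one use: the bitcast
user_a = Node("and", cast)     # the bitcast itself has two users
user_b = Node("or", cast)

peeked = peek_through_one_use_bitcasts(cast)
checked_after = peeked.has_one_use()    # True, but says nothing about `cast`
checked_before = cast.has_one_use()     # False: the bitcast has two users
```

A one-use check made after peeking passes here even though the bitcast has two users, so the check on the outer value has to happen before the call.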
mydeveloperday 65dc97b79e [clang-format] PR46609 clang-format does not obey `PointerAlignment: Right` for ellipsis in declarator for pack
Summary:
https://bugs.llvm.org/show_bug.cgi?id=46609

Ensure `*...` obeys the left/middle/right rules of pointer alignment

Reviewed By: curdeius

Differential Revision: https://reviews.llvm.org/D83564
2020-07-12 18:44:26 +01:00
Ayal Zaks 82a5157ff1 [LV] Fixing versioning-for-unit-stride of loops with small trip count
This patch fixes D81345 and PR46652.

If a loop with a small trip count is compiled w/o -Os/-Oz, Loop Access Analysis
still generates runtime checks for unit strides that will version the loop.

In such cases, the loop vectorizer should either re-run the analysis or bail-out
from vectorizing the loop, as done prior to D81345. The latter is applied for
now as the former requires refactoring.

Differential Revision: https://reviews.llvm.org/D83470
2020-07-12 19:51:47 +03:00
Yonghong Song 152a9fef1b BPF: permit .maps section variables with typedef type
Currently, when llvm sees a global variable in the .maps section,
it ensures that its type is a struct type. The pointee types
of the structure members are then evaluated further.
In normal cases, the pointee type would be skipped.

Although this is what all current bpf programs do,
it is a little bit restrictive. For example, it is legitimate
for users to have:
typedef struct { int key_size; int value_size; } __map_t;
__map_t map __attribute__((section(".maps")));

This patch lifts this restriction, so a typedef of
a struct type is also allowed for .maps section variables.
To avoid creating unnecessary fixup entries when traversal
starts with a typedef/struct type, the new implementation
first traverses all map struct members and then traverses
the typedef/struct type. This way, in the internal BTFDebug
implementation, no fixup entries are generated.

Two new unit tests are added for typedef and const
struct in .maps section. Also tested with kernel bpf selftests.

Differential Revision: https://reviews.llvm.org/D83638
2020-07-12 09:42:25 -07:00
Ayke van Laethem 69e60c9dc7 [LLD][ELF][AVR] Implement the missing relocation types
Implements the missing relocation types for AVR target.
The results have been cross-checked with binutils.

Original patch by LemonBoy. Some changes by me.

Differential Revision: https://reviews.llvm.org/D78741
2020-07-12 18:18:54 +02:00
Nikita Popov d589372704 [SCCP] Extend nonnull metadata test (NFC) 2020-07-12 17:48:32 +02:00
Fangrui Song be9f363704 [AVRInstPrinter] printOperand: support llvm-objdump --print-imm-hex
Differential Revision: https://reviews.llvm.org/D83634
2020-07-12 08:14:52 -07:00
Rahul Joshi 032810f589 [NFC] Fix comment style in MLIR unittests to conform to LLVM coding standards.
Differential Revision: https://reviews.llvm.org/D83632
2020-07-12 07:27:02 -07:00
Sanjay Patel 39009a8245 [DAGCombiner] tighten fast-math constraints for fma fold
fadd (fma A, B, (fmul C, D)), E --> fma A, B, (fma C, D, E)

This is only allowed when "reassoc" is present on the fadd.

As discussed in D80801, this transform goes beyond
what is allowed by "contract" FMF (-ffp-contract=fast).
That is because we are fusing the trailing add of 'E' with a
multiply, but without "reassoc", the code mandates that the
products A*B and C*D are added together before adding in 'E'.

I've added this example to the LangRef to try to clarify the
meaning of "contract". If that seems reasonable, we should
probably do something similar for the clang docs because
there does not appear to be any formal spec for the behavior
of -ffp-contract=fast.

Differential Revision: https://reviews.llvm.org/D82499
2020-07-12 08:51:49 -04:00
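A small numeric experiment shows why the fold in 39009a8245 is not value-preserving. This is a sketch only: the `fma` helper emulates a fused multiply-add with exact rational arithmetic and a single final rounding, and the operand values are chosen by hand for illustration.

```python
from fractions import Fraction


def fma(a, b, c):
    """Emulated fused multiply-add: a*b + c computed exactly, rounded once."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


A = B = 2.0 ** -30
C = D = 1.0 + 2.0 ** -27
E = -1.0

# fadd (fma A, B, (fmul C, D)), E : the product C*D is rounded before E is added.
original = fma(A, B, C * D) + E

# fma A, B, (fma C, D, E) : E is fused with the unrounded product C*D.
folded = fma(A, B, fma(C, D, E))
```

With these inputs `original` is `2.0 ** -26`, while `folded` keeps the low-order bits that the separate multiply rounded away, so the two expressions are not interchangeable unless reassociation is permitted.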
Ten Tzen 66f1dcd872 [Windows SEH] Fix the frame-ptr of a nested-filter within a _finally
This change fixed a SEH bug (exposed by test58 & test61 in MSVC test xcpt4u.c);
when an Except-filter is located inside a finally, the frame-pointer generated today
via intrinsic @llvm.eh.recoverfp is the frame-pointer of the immediate
parent _finally, not the frame-ptr of the outermost host function.

The fix is to retrieve the Establisher's frame-pointer that was previously saved in
the parent's frame.
The prolog of a filter inside a _finally should look like the code below:

%0 = call i8* @llvm.eh.recoverfp(i8* bitcast (@"?fin$0@0@main@@"), i8*%frame_pointer)
%1 = call i8* @llvm.localrecover(i8* bitcast (@"?fin$0@0@main@@"), i8*%0, i32 0)
%2 = bitcast i8* %1 to i8**
%3 = load i8*, i8** %2, align 8

Differential Revision: https://reviews.llvm.org/D77982
2020-07-12 01:37:56 -07:00
Nikita Popov 6634aef71f [SCCP] Add test for predicate info condition handling (NFC) 2020-07-12 10:13:10 +02:00
Zequan Wu 77272d177a [COFF] Fix endianness of .llvm.call-graph-profile section data 2020-07-11 20:49:26 -07:00
Fangrui Song d1bcddb5c1 [llvm-objdump][test] Move tests after dc4a6f5db4
Move RISCV/ to ELF/RISCV/ as well.
2020-07-11 16:45:05 -07:00
kuter 4dbe82eef3 [Attributor] Introduce attribute seed allow list. 2020-07-12 02:25:33 +03:00
Nikita Popov 6792069a3f [NewGVN] Regenerate test checks (NFC) 2020-07-11 22:51:49 +02:00
Michael Liao b8409c03ed Fix `-Wreturn-type` warning. NFC. 2020-07-11 16:20:41 -04:00
Mehdi Amini 44b0b7cf66 Fix one memory leak in the MLIRParser by using std::unique_ptr to hold the new block pointer
This is NFC when there is no parsing error.

Differential Revision: https://reviews.llvm.org/D83619
2020-07-11 20:05:37 +00:00
Mehdi Amini 3b04af4d84 Fix some memory leak in MLIRContext with respect to registered types/attributes interfaces
Differential Revision: https://reviews.llvm.org/D83618
2020-07-11 20:05:29 +00:00
Craig Topper 47872adf6a [X86] Add test cases for missed opportunities to use vpternlog due to a bitcast between the logic ops.
These test cases fail to use vpternlog because the AND was converted
to a blend shuffle and then converted back to AND during shuffle lowering.
This results in the AND having a different type than it started with.
This prevents our custom matching logic from seeing the two logic ops.
2020-07-11 12:54:52 -07:00
Stephen Neuendorffer d8c35031a3 [examples] fix ExceptionDemo
Code didn't compile in a release build.  Guard debug output with
ifndef NDEBUG.

Differential Revision: https://reviews.llvm.org/D83628
2020-07-11 12:38:27 -07:00
clementval 8f183d9f3d [openmp] Remove unused variable in DirectiveEmitter 2020-07-11 12:59:52 -04:00
Johannes Doerfert 5937434677 [OpenMP] Silence unused symbol warning with proper ifdefs 2020-07-11 11:57:42 -05:00
Yaxun (Sam) Liu 5d2c3e031a Fix regression due to test hip-version.hip
Added RocmInstallationDetector to Darwin and MinGW.

Fixed duplicate ROCm detector in ROCm toolchain.
2020-07-11 12:45:29 -04:00
Valentin Clement 6e42a417ba [flang][openmp] Check clauses allowed semantic with tablegen generated map
Summary:
This patch enables the generation of clause enum sets for semantic checks in Flang through
tablegen. The enum sets and the directive-to-sets map are generated by the new tablegen infrastructure for OpenMP
and other directive languages.
The semantic checks for OpenMP are modified to use this newly generated map.

Reviewers: DavidTruby, sscalpone, kiranchandramohan, ichoyjx, jdoerfert

Reviewed By: DavidTruby, ichoyjx

Subscribers: mgorny, yaxunl, hiraditya, guansong, sstefan1, aaron.ballman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83326
2020-07-11 12:45:12 -04:00
Yash Jain 102828249c [MLIR] Parallelize affine.for op to 1-D affine.parallel op
Introduce pass to convert parallel affine.for op into 1-D affine.parallel op.
Run using --affine-parallelize. Removes test-detect-parallel: pass for checking
parallel affine.for ops.

Signed-off-by: Yash Jain <yash.jain@polymagelabs.com>

Differential Revision: https://reviews.llvm.org/D83193
2020-07-11 21:33:25 +05:30
Michael Liao 81db614411 Fix `-Wunused-variable` warnings. NFC. 2020-07-11 10:09:44 -04:00
Michael Liao 0b4cf802fa [fix-irreducible] Skip unreachable predecessors.
Summary:
- Skip unreachable predecessors during header detection in SCC. Such
  unreachable blocks can be generated by the switch lowering pass in
  corner cases or by other frontends. Even though they could be removed
  through CFG simplification, we should skip them during header
  detection.

Reviewers: sameerds

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83562
2020-07-11 10:08:44 -04:00
sstefan1 850b150cff [Attributor][NFC] Add more debug output for deleted functions 2020-07-11 14:26:08 +02:00
Christudasan Devadasan d7a05698ef [AMDGPU] Move LowerSwitch pass to CodeGenPrepare.
It is possible that LowerSwitch pass leaves certain blocks
unreachable from the entry. If not removed, these dead blocks
can cause undefined behavior in the subsequent passes.
It caused a crash in the AMDGPU backend after the instruction
selection when a PHI node has its incoming values coming from
these unreachable blocks.

In the AMDGPU pass flow, the last invocation of UnreachableBlockElim
precedes the point where LowerSwitch is currently placed, so the
opportunity to eliminate these blocks is missed.
This patch ensures that the LowerSwitch pass gets inserted earlier
to make use of the existing unreachable block elimination pass.

Reviewed By: sameerds, arsenm

Differential Revision: https://reviews.llvm.org/D83584
2020-07-11 16:33:38 +05:30
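The effect that d7a05698ef relies on, removing blocks unreachable from the entry and the PHI inputs that come from them, can be sketched as follows. This is a toy model over dictionaries, not the UnreachableBlockElim pass; the function name, block names and value names are hypothetical.

```python
def eliminate_unreachable(entry, successors, phis):
    """Drop blocks unreachable from `entry` and prune PHI inputs from them."""
    live = {entry}
    worklist = [entry]
    while worklist:
        block = worklist.pop()
        for succ in successors.get(block, ()):
            if succ not in live:
                live.add(succ)
                worklist.append(succ)
    # Keep only live blocks; successors of a live block are live by construction.
    kept_successors = {b: list(s) for b, s in successors.items() if b in live}
    # Remove PHI incoming values whose predecessor block was deleted.
    kept_phis = {
        b: {pred: val for pred, val in incoming.items() if pred in live}
        for b, incoming in phis.items()
        if b in live
    }
    return kept_successors, kept_phis


successors = {"entry": ["merge"], "dead": ["merge"], "merge": []}
phis = {"merge": {"entry": "%a", "dead": "%b"}}
cfg, pruned = eliminate_unreachable("entry", successors, phis)
```

After the cleanup the PHI in `merge` no longer has an incoming value from the dead block, which is the situation the commit describes as crashing the backend when left in place.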
Alexey Lapshin f7907e9d22 [TRE] allow TRE for non-capturing calls.
The current implementation of Tail Recursion Elimination has a very restrictive
prerequisite, AllCallsAreTailCalls, i.e. it requires that no function
call receives a pointer to the local stack. Generally, function calls that
receive a pointer to the local stack but do not capture it should not
break TRE. This fix allows us to do TRE if it is proved that no pointer
to the local stack escapes.

Reviewed by: efriedma

Differential Revision: https://reviews.llvm.org/D82085
2020-07-11 14:01:48 +03:00
Roman Lebedev 4500db8c59 Revert "Reland "[InstCombine] Lower infinite combine loop detection thresholds"""
And there's a new hit: https://bugs.llvm.org/show_bug.cgi?id=46680
This reverts commit 7103c87596.
2020-07-11 13:53:24 +03:00
Nico Weber 09a95f51fb [gn build] (manually) merge 943660fd15 2020-07-11 06:44:28 -04:00
Nathan James 35af6f11e0 Reland Fix gn build after 943660f 2020-07-11 11:42:05 +01:00
Nathan James 8fb91dfeed Revert "Fix gn builds after 943660fd1"
This reverts commit 4abdcdb45e.
2020-07-11 10:45:17 +01:00
Nathan James 4abdcdb45e Fix gn builds after 943660fd1 2020-07-11 10:42:57 +01:00
Nathan James c3bdc9814d [clang-tidy] Reworked enum options handling(again)
Reland b9306fd after fixing the issue causing mac builds to fail unittests.

Following on from D77085, I was never happy with passing a mapping to the option get/store functions. This patch addresses this by using explicit specializations to handle the serializing and deserializing of enum options.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D82188
2020-07-11 10:13:20 +01:00
Johannes Doerfert dce6bc18c4 [OpenMP][FIX] remove unused variable and long if-else chain
MSVC throws an error if you use "too many" if-else in a row:
  `Frontend/OpenMP/OMPKinds.def(570): fatal error C1061: compiler limit:
    blocks nested too deeply`
We work around it now...
2020-07-11 02:37:57 -05:00
Mehdi Amini c44702bcdf Remove unused variable `KMPC_KERNEL_PARALLEL_WORK_FN_PTR_ARG_NO` (NFC)
This fixes a compiler warning.
2020-07-11 07:17:28 +00:00
Johannes Doerfert 5b0581aedc [OpenMP] Replace function pointer uses in GPU state machine
In non-SPMD mode we create state-machine-like code to identify the
parallel region the GPU worker threads should execute next. The
identification uses the parallel region function pointer as that allows
it to work even if the kernel (=target region) and the parallel region
are in separate TUs. However, taking the address of a function comes
with various downsides. With this patch we will identify the most common
situation and replace the function pointer use with a dummy global
symbol (for identification purposes only). That means, if the parallel
region is only called from a single target region (or kernel), we do not
use the function pointer of the parallel region to identify it but a new
global symbol.

Fixes PR46450.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D83271
2020-07-11 01:44:00 -05:00
Johannes Doerfert 624d34afff [OpenMP] Compute a proper module slice for the CGSCC pass
The module slice describes which functions we can analyze and transform
while working on an SCC as part of the CGSCC OpenMPOpt pass. So far, we
simply restricted it to the SCC. In a follow up we will need to have a
bigger scope which is why this patch introduces a proper identification
of the module slice. In short, everything that has a transitive
reference to a function in the SCC or is transitively referenced by one
is fair game.

Reviewed By: sstefan1

Differential Revision: https://reviews.llvm.org/D83270
2020-07-11 01:44:00 -05:00
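The module slice described in 624d34afff reads as a two-way reachability query over the reference graph. The sketch below follows only the wording of the commit message; the function names and the sample graph are hypothetical and the actual OpenMPOpt implementation may differ.

```python
def reachable(start, edges):
    """Return every node reachable from `start` by following `edges`."""
    seen = set(start)
    stack = list(start)
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def module_slice(scc, references):
    # Build the reverse map so "is referenced by" edges can be walked too.
    referenced_by = {}
    for src, dsts in references.items():
        for dst in dsts:
            referenced_by.setdefault(dst, set()).add(src)
    # Everything transitively referenced by the SCC, plus everything that
    # transitively references a function in the SCC.
    return reachable(scc, references) | reachable(scc, referenced_by)


references = {
    "main": {"kernel"},
    "kernel": {"helper"},
    "helper": set(),
    "unrelated": {"other"},
    "other": set(),
}
slice_ = module_slice({"kernel"}, references)
```

In this example the slice for the SCC `{"kernel"}` contains `main`, `kernel` and `helper`, while `unrelated` and `other` stay outside it.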
Johannes Doerfert e8039ad4de [OpenMP] Identify GPU kernels (aka. OpenMP target regions)
We now identify GPU kernels, that is entry points into the GPU code.
These kernels (can) correspond to OpenMP target regions. With this patch
we identify and on request print them via remarks.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D83269
2020-07-11 01:44:00 -05:00
Johannes Doerfert 54bd3751ce [OpenMP][NFC] Add convenient helper and early exit check 2020-07-11 00:51:51 -05:00
Johannes Doerfert b726c55709 [OpenMP][NFC] Fix some typos 2020-07-11 00:51:51 -05:00
Johannes Doerfert c98699582a [OpenMP][NFC] Remove unused (always fixed) arguments
There are various runtime calls in the device runtime with unused, or
always fixed, arguments. This is bad for all sorts of reasons. Clean up
two before as we match them in OpenMPOpt now.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D83268
2020-07-11 00:51:51 -05:00
Eric Christopher 256e4d46a6 Fix signed vs unsigned comparison warnings a different way. 2020-07-10 22:52:50 -07:00
Johannes Doerfert b5667d00e0 [OpenMP][CUDA] Fix std::complex in GPU regions
The old way worked to some degree for C++-mode but in C mode we actually
tried to introduce variants of macros (e.g., isinf). To make both modes
work reliably we get rid of those extra variants and directly use NVIDIA
intrinsics in the complex implementation. While this has to be revisited
as we add other GPU targets which want to reuse the code, it should be
fine for now.

Reviewed By: tra, JonChesterfield, yaxunl

Differential Revision: https://reviews.llvm.org/D83591
2020-07-11 00:40:05 -05:00
Jonas Devlieghere 8ee225744f [lldb/Test] Fix missing yaml2obj in Xcode standalone build.
Rather than trying to find the yaml2obj from dotest we should pass it in
like we do for dsymutil and FileCheck.
2020-07-10 21:34:56 -07:00
Yaxun (Sam) Liu 849d4405f5 [HIP] Fix rocm detection
Do not detect device library by default in rocm detector.
Only detect device library in Rocm and HIP toolchain.

Separate detection of HIP runtime and Rocm device library.

Detect rocm path by version file in host toolchains.

Also added detecting rocm version and printing rocm
installation path and version with -v.

Fixed include path and device library detection for
ROCm 3.5.

Added --hip-version option. Renamed --hip-device-lib-path
to --rocm-device-lib-path.

Fixed default value for -fhip-new-launch-api.

Added default -std option for HIP.

Differential Revision: https://reviews.llvm.org/D82930
2020-07-10 23:20:15 -04:00