Commit Graph

397473 Commits

Author SHA1 Message Date
Fangrui Song ba6e15d8cc [TargetMachine] Move COFF special case for ExternalSymbolSDNode from shouldAssumeDSOLocal to X86Subtarget
Intended to be NFC. ARM/AArch64 don't appear to need adjustment.

TargetMachine::shouldAssumeDSOLocal is expected to be very simple, ideally
matching isDSOLocal(). The IR producers are expected to set dso_local correctly.
(While some may think this function can make producers' work easier, the
function is really not in a good position to set dso_local. See the various
special cases we duplicate from clang CodeGenModule.cpp.)

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D108514
2021-08-23 13:54:40 -07:00
Artem Belevich 4c40c03b39 Fixed doc build. 2021-08-23 13:45:36 -07:00
River Riddle 4e103a12d9 [mlir] Add support for VariadicOfVariadic operands
This revision adds native ODS support for VariadicOfVariadic operand
groups. An example of this is the SwitchOp, which has a variadic number
of nested operand ranges for each of the case statements, where the
number of case statements is variadic. Builtin ODS support allows for
generating proper accessors for the nested operand ranges, builder
support, and declarative format support. VariadicOfVariadic operands
are supported by providing a segment attribute to use to store the
operand groups, mapping similarly to the AttrSizedOperand trait
(but with a user defined attribute name).

`build` methods for VariadicOfVariadic operand expect inputs of the
form `ArrayRef<ValueRange>`. Accessors for the variadic ranges
return a new `OperandRangeRange` type, which represents a
contiguous range of `OperandRange`. In the declarative assembly
format, VariadicOfVariadic operands and types are by default
formatted as a comma delimited list of value lists:
`(<value>, <value>), (), (<value>)`.

Differential Revision: https://reviews.llvm.org/D107774
2021-08-23 20:32:31 +00:00
Artem Belevich ce4545db1d [CUDA] Bump the latest supported CUDA version to 11.4.
This should reduce the amount of noise issued by clang for the recent-ish CUDA
versions.

Clang still does not support all the features offered by NVCC, but is expected
to handle CUDA headers and produce binaries for all GPUs supported by NVCC.

Differential Revision: https://reviews.llvm.org/D108248
2021-08-23 13:24:49 -07:00
Artem Belevich 3db8e486e5 [CUDA] Improve CUDA version detection and diagnostics.
Always use cuda.h to detect CUDA version. It's a more universal approach
compared to version.txt which is no longer present in recent CUDA versions.

Split the 'unknown CUDA version' warning in two:

* when detected CUDA version is partially supported by clang. It's expected to
work in general, at the feature parity with the latest supported CUDA
version. and may be missing support for the new features/instructions/GPU
variants. Clang will issue a warning.

* when detected version is new. Recent CUDA versions have been working with
clang reasonably well, and will likely to work similarly to the partially
supported ones above. Or it may not work at all. Clang will issue a warning and
proceed as if the latest known CUDA version was detected.

Differential Revision: https://reviews.llvm.org/D108247
2021-08-23 13:24:48 -07:00
Artem Belevich 49d982d8cb [CUDA] Add support for CUDA-11.4
Differential Revision: https://reviews.llvm.org/D108239
2021-08-23 13:24:46 -07:00
Artem Belevich 0060fffc82 [CUDA] Bump default GPU architecture to sm_35.
It's the oldest GPU architecture currently supported by all CUDA versions clang
can use.

Differential Revision: https://reviews.llvm.org/D108235
2021-08-23 13:24:45 -07:00
Simon Pilgrim 10c982e0b3 Revert rG1c9bec727ab5c53fa060560dc8d346a911142170 : [InstCombine] Fold (gep (oneuse(gep Ptr, Idx0)), Idx1) -> (gep Ptr, (add Idx0, Idx1)) (PR51069)
Reverted (manually due to merge conflicts) while regressions reported on PR51540 are investigated

As noticed on D106352, after we've folded "(select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0))" if the inner Ptr was also a (now one use) gep we could then merge the geps, using the sum of the indices instead.

I've limited this to basic 2-op geps - a more general case further down InstCombinerImpl.visitGetElementPtrInst doesn't have the one-use limitation but only creates the add if it can be created via SimplifyAddInst.

https://alive2.llvm.org/ce/z/f8pLfD (Thanks Roman!)

Differential Revision: https://reviews.llvm.org/D106450
2021-08-23 21:09:26 +01:00
David Green 50f4ae58eb [AArch64] Correct store ReadAdrBase operand
It appears that the Read operand for stores was being placed on the
first operand (the stored value) not the address base. This adds a
ReadST for the stored value operand, allowing the ReadAdrBase to
correctly act upon the address.

Differential Revision: https://reviews.llvm.org/D108287
2021-08-23 21:07:55 +01:00
David Green 955c9437fd [AArch64] Add Scheduling tests for Load/Store ReadAdv operands. 2021-08-23 21:07:55 +01:00
MaheshRavishankar 4aeeb91a92 [mlir][Linalg] Allow all build methods of Structured ops to specify additional attributes.
Differential Revision: https://reviews.llvm.org/D108338
2021-08-23 13:06:34 -07:00
Nikita Popov 19dc02e99f [MergeICmps] Allow sinking past non-load/store
This is a followup to D106591. MergeICmps currently only allows
sinking the loads past either instructions that don't write to
memory at all, or simple loads/stores that don't modify the memory
the loads access.

The "simple loads/stores" part of this check doesn't seem necessary
to me -- AA isModRef() already accurately models any operation
that may clobber the memory. For example, in the adjusted test case
the transform is still fine if the call to @foo() isn't readonly,
but inaccessiblememonly -- in both cases, the call cannot modify
the loaded memory.

Differential Revision: https://reviews.llvm.org/D108517
2021-08-23 22:03:49 +02:00
River Riddle da12d88b1c [mlir][NFC] Add inlineRegion overloads that take a block iterator insert position
This allows for inlining into an empty block or to the beginning of a block. NFC as the existing implementations now foward to this overload.

Differential Revision: https://reviews.llvm.org/D108572
2021-08-23 19:49:53 +00:00
Alina Sbirlea e8723abf43 [DSE] Check post-dominance for malloc+memset->calloc transform.
Aiming to address the regression discussed in
https://reviews.llvm.org/D103009.

Differential Revision: https://reviews.llvm.org/D108485
2021-08-23 12:39:51 -07:00
Louis Dionne 2540c77360 [libc++][NFC] Reindent error message 2021-08-23 15:34:51 -04:00
Andrei Elovikov f5c2889488 [NFC][clang] Use X86 Features declaration from X86TargetParser
...instead of redeclaring them in clang's own X86Target.def. They were already
required to be in sync (IIUC), so no reason to maintain two identical lists.

Reviewed By: erichkeane, craig.topper

Differential Revision: https://reviews.llvm.org/D108151
2021-08-23 12:30:28 -07:00
Jon Chesterfield 842f875c8b [openmp] Use llvm GridValues from devicertl
Add include path to the cmakefiles and set the target_impl enums
from the llvm constants instead of copying the values.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D108391
2021-08-23 20:25:24 +01:00
Stanislav Mekhanoshin 401a45c61b Fix late rematerialization operands check
D106408 enables rematerialization of instructions with virtual
register uses. That has uncovered the bug in the allUsesAvailableAt
implementation: https://bugs.llvm.org/show_bug.cgi?id=51516.

In the majority of cases canRematerializeAt() called to check if
an instruction can be rematerialized before the given UseIdx.
However, SplitEditor::enterIntvAtEnd() calls it to rematerialize
an instruction at the end of a block passing LIS.getMBBEndIdx()
into the check. In the testcase from the bug it has attempted to
rematerialize ADDXri after STRXui in bb.17. The use operand %55
of the ADD is killed by the STRX but that is undetected by the check
because it adjusts passed UseIdx to the reg slot, before the kill.
The value is dead at the index passed to the check however.

This change uses a later of passed UseIdx and its reg slot. This
shall be correct because if are checking an availability of operands
before an instruction that instruction cannot be the one defining
these operands. If we are checking for late rematerialization we
are really interested if operands live past the instruction.

The bug is not exploitable without D106408 but needed to reland
reverted D106408.

Differential Revision: https://reviews.llvm.org/D108475
2021-08-23 12:23:58 -07:00
Zarko Todorovski b575bbd0c7 [PowerPC][AIX] Set the HasAlloca flag in the AIX Traceback Table only if R31 is used as a frame pointer
After c063946476 usage of R31 doesn't necessarily mean
that alloca is used. The `TracebackTable::IsAllocaUsedMask` flag should be set only
when R31 is used as a frame pointer.

On AIX the `function calls alloca' bit seems to be set whenever R31 is
set up as a frame pointer, even when there is no alloca call.

Reviewed By: lkail

Differential Revision: https://reviews.llvm.org/D108141
2021-08-23 15:20:41 -04:00
Sanjay Patel 5d7d2f0d2e [InstCombine] improve efficiency of isFreeToInvert
This is NFC-intended when viewed from outside the pass.
I was trying to make sure that we don't infinite loop
in subtract combines and noticed that we handle the
non-canonical forms of add/sub here, but it should
not be necessary. Coding it this way seems slightly
clearer than mixing all 4 patterns as before.
2021-08-23 14:56:14 -04:00
River Riddle e4635e6328 [mlir][FoldUtils] Ensure the created constant dominates the replaced op
This revision fixes a bug where an operation would get replaced with
a pre-existing constant that didn't dominate it. This can occur when
a pattern inserts operations to be folded at the beginning of the
constants insertion block. This revision fixes the bug by moving the
existing constant before the replaced operation in such cases. This is
fine because if a constant didn't already exist, a new one would have
been inserted before this operation anyways.

Differential Revision: https://reviews.llvm.org/D108498
2021-08-23 18:48:24 +00:00
Alex Langford 23c19395c0 [lldb][NFC] Remove unused method RichManglingContext::IsFunction 2021-08-23 11:45:55 -07:00
Krzysztof Drewniak 469172f3f4 [MLIR][Docs] Fix broken link to tuple type rationale
Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D108135
2021-08-23 18:35:36 +00:00
Alfonso Gregory 9cdd4ea06f [libc][NFC] Add explicit casts to ctype functions
Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D106902
2021-08-23 18:17:20 +00:00
Greg Clayton e100a41bbe Fix fallback code that gets decl file + line.
When a function has no line table, but does have debug info (DW_TAG_subprogram), we fall back to creating a line table with a single line entry that has the start address of the function and the source file and line of the function declaration. The bug in this code was that we might have a DW_TAG_subprogram that uses a DW_AT_specification or DW_AT_abstract_origin that points to another DIE, and that DIE might be in another compile unit. The bug was we were grabbing the file index value from the DIE, and that index could be from the other DIE in another compile unit that has its own and compleltely different file table, so we might be using a file index from one compile unit with the file table from another. This was causing a crash in llvm-gsymuil when run against dSYM files. dsymutil, the Apple DWARF linker, will often unique types and can end up with more absolute references across different compile units.

The fix is to use the DWARFDie::getDeclFile(...) accessor as it does fetch this information correctly.

Differential Revision: https://reviews.llvm.org/D108497
2021-08-23 11:06:15 -07:00
Jessica Paquette a2c8e17658 [AArch64][GlobalISel] Add regbankselect support for G_LLROUND
Same as G_LROUND: destination should always be a GPR, source should always be
a FPR.

Differential Revision: https://reviews.llvm.org/D108566
2021-08-23 10:32:20 -07:00
Chris Bieneman 43de869d77 Implement #pragma clang restrict_expansion
This patch adds `#pragma clang restrict_expansion ` to enable flagging
macros as unsafe for header use. This is to allow macros that may have
ABI implications to be avoided in headers that have ABI stability
promises.

Using macros in headers (particularly public headers) can cause a
variety of issues relating to ABI and modules. This new pragma logs
warnings when using annotated macros outside the main source file.

This warning is added under a new diagnostics group -Wpedantic-macros

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D107095
2021-08-23 09:46:38 -07:00
Jessica Paquette fe51f9098b [AArch64][GlobalISel] Legalize G_LLROUND for s64 + s32
Same as G_LROUND.

Also add a TODO for full fp16 legalization.

Differential Revision: https://reviews.llvm.org/D108564
2021-08-23 09:45:23 -07:00
Jessica Paquette 6760e2a7bc [GlobalISel] Translate @llvm.llround.* -> G_LLROUND
Translate it using `IRTranslator::translateSimpleIntrinsic`.

Differential Revision: https://reviews.llvm.org/D108563
2021-08-23 09:42:53 -07:00
Jon Chesterfield c2574e63ff [openmp][nfc] Refactor GridValues
Remove redundant fields and replace pointer with virtual function

Of fourteen fields, three are dead and four can be computed from the
remainder. This leaves a couple of currently dead fields in place as
they are expected to be used from the deviceRTL shortly. Two of the
fields that can be computed are only used from codegen and require a
log2() implementation so are inlined into codegen instead.

This change leaves the new methods in the same location in the struct
as the previous fields for convenience at review.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D108380
2021-08-23 16:19:11 +01:00
Florian Hahn 7872074f22
[InstCombine] Add reduced sub/negate test from PR51584. 2021-08-23 15:47:22 +01:00
Florian Hahn 9577fac0fd
Revert "[InstCombine] generalize subtract with 'not' operands"
This reverts commit 3aa009cc87.

The reverted commit causes an infinite loop in instcombine. See PR51584.
2021-08-23 15:47:21 +01:00
Jinsong Ji 628eaa4cf7 [InstrProfiling] Add AIX triple to platform test
We found that AIX was not covered in most of the InstrProfiling tests.
So we are trying to enable the tests gradually.

This is to add AIX triple to platform tests to make sure the
registrations are OK.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D108490
2021-08-23 14:31:24 +00:00
Alexander Potapenko cdb391698b [tsan] Do not include <stdatomic.h> from sanitize-thread-disable.c
Looks like non-x86 bots are unhappy with inclusion of <stdatomic.h>
e.g.:

clang-armv7-vfpv3-2stage - https://lab.llvm.org/buildbot/#/builders/182/builds/626
clang-ppc64le-linux - https://lab.llvm.org/buildbot/#/builders/76/builds/3619
llvm-clang-win-x-armv7l - https://lab.llvm.org/buildbot/#/builders/60/builds/4514

It seems to be unnecessary, just remove it and replace atomic_load()
calls with dereferences of _Atomic*.

Differential Revision: https://reviews.llvm.org/D108555
2021-08-23 16:21:43 +02:00
Krasimir Georgiev f3671a688d [clang-format] break after the closing paren of a TypeScript decoration
This fixes up a regression we found from
https://reviews.llvm.org/D107267: in specific contexts, clang-format
stopped breaking after the `)` in TypeScript decorations. There were no test cases covering this, so I added one.

Reviewed By: MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D108538
2021-08-23 15:52:14 +02:00
Peyton, Jonathan L d39d3a327b [OpenMP][test] fix omp_get_wtime.c test to be more accommodating
The omp_get_wtime.c test fails intermittently if the recorded times are
off by too much which can happen when many tests are run in parallel.

Instead of failing if one timing is a little off, take average of 100
timings minus the 10 worst.

Differential Revision: https://reviews.llvm.org/D108488
2021-08-23 08:13:42 -05:00
Simon Pilgrim f77174d4b8 [X86] Add unaligned partial load test
Shows LoadedSlice::canMergeExpensiveCrossRegisterBankCopy failure to merge unaligned dereferencable loads.

Another candidate for PR45116
2021-08-23 14:13:08 +01:00
Andy Wingo 4fb0c08342 [clang][CodeGen] GetDefaultAlignTempAlloca uses preferred alignment
This function was defaulting to use the ABI alignment for the LLVM
type.  Here we change to use the preferred alignment.  This will allow
unification with GetTempAlloca, which if alignment isn't specified, uses
the preferred alignment.

Differential Revision: https://reviews.llvm.org/D108450
2021-08-23 14:55:58 +02:00
Andy Wingo 8da70fed70 [clang][NFC] Tighten up code for GetGlobalVarAddressSpace
The LangAS local is only used in the OpenCL case; move its decl
inwards.

Differential Revision: https://reviews.llvm.org/D108449
2021-08-23 14:55:58 +02:00
Andy Wingo d3d4d98576 [clang][NFC] GetOrCreateLLVMGlobal takes LangAS
Pass a LangAS instead of a target address space to
GetOrCreateLLVMGlobal, to remove a place where the frontend assumes that
target address space 0 is special.

Differential Revision: https://reviews.llvm.org/D108445
2021-08-23 14:55:58 +02:00
Matthias Springer bc194a5bb5 [mlir][SCF] Do not peel loops inside partial iterations
Do not apply loop peeling to loops that are contained in the partial iteration of an already peeled loop. This is to avoid code explosion when dealing with large loop nests. Can be controlled with a new pass option `skip-partial`.

Differential Revision: https://reviews.llvm.org/D108542
2021-08-23 21:35:46 +09:00
Chuanqi Xu 2556f58148 [FuncSpec] Don't specialize function which are easy to inline
It would waste time to specialize a function which would inline finally.
This patch did two things:

- Don't specialize functions which are always-inline.
- Don't spescialize functions whose lines of code are less than threshold
(100 by default).

For spec2017int, this patch could reduce the number of specialized
functions by 33%. Then the compile time didn't increase for every
benchmark.

Reviewed By: SjoerdMeijer, xbolva00, snehasish

Differential Revision: https://reviews.llvm.org/D107897
2021-08-23 19:20:21 +08:00
Alexander Potapenko 8300d52e8c [tsan] Add support for disable_sanitizer_instrumentation attribute
Unlike __attribute__((no_sanitize("thread"))), this one will cause TSan
to skip the entire function during instrumentation.

Depends on https://reviews.llvm.org/D108029

Differential Revision: https://reviews.llvm.org/D108202
2021-08-23 12:38:33 +02:00
Simon Pilgrim 4554b5bcf5 [X86][AVX] Add PR13310 test coverage
Show failure to fold scaled-index into gather/scatter scale operands
2021-08-23 11:34:13 +01:00
Florian Hahn d024a01511
Recommit "[LoopVectorize][AArch64] Enable ordered reductions by default for AArch64"
This reverts the revert ab9296f13b.

The issue causing the revert should be fixed in 9baed023b4.
2021-08-23 11:25:27 +01:00
Jay Foad 7a967d9011 [AMDGPU] Try to fix a GCC 11 warning
Apparently GCC 11 was warning:
AMDGPURegisterBankInfo.cpp:2543:33: warning: enumerated and non-enumerated type in conditional expression [-Wextra]
2021-08-23 10:51:37 +01:00
Cullen Rhodes fb82b836b7 [AArch64][SME] Support NEON scalar FP instructions in streaming mode
The following scalar FP instructions are legal in streaming mode:

  0101 1110 xx1x xxxx 11x1 11xx xxxx xxxx # FMULX/FRECPS/FRSQRTS (scalar)
  0101 1110 x10x xxxx 00x1 11xx xxxx xxxx # FMULX/FRECPS/FRSQRTS (scalar, FP16)
  01x1 1110 1x10 0001 11x1 10xx xxxx xxxx # FRECPE/FRSQRTE/FRECPX (scalar)
  01x1 1110 1111 1001 11x1 10xx xxxx xxxx # FRECPE/FRSQRTE/FRECPX (scalar, FP16)

Predicate them on `HasNEONorStreamingSVE`. Full list of affected
instructions:

  FMULX16, FMULX32, FMULX64, FRECPS16, FRECPS32, FRECPS64, FRSQRTS16,
  FRSQRTS32, FRSQRTS64, FRECPEv1f16, FRECPEv1i32, FRECPEv1i64, FRECPXv1f16,
  FRECPXv1i32, FRECPXv1i64, FRSQRTEv1f16, FRSQRTEv1i32, FRSQRTEv1i64

Depends on D107902.

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06/SIMD-FP-Instructions

Execution of NEON instructions that are illegal in streaming mode will
cause a trap or exception. Using FMULX [1] as an example, this check is
at the top of the pseudocode:

  if elements == 1 then
      CheckFPEnabled64();
  else
      CheckFPAdvSIMDEnabled64();

For the legal scalar variants it calls `CheckFPEnabled64`, whereas for the
illegal vector variants it calls `CheckFPAdvSIMDEnabled64` which traps.

This is useful for observing which instructions are/aren't legal
in streaming mode.

[1] https://developer.arm.com/documentation/ddi0602/2021-06/SIMD-FP-Instructions/FMULX--Floating-point-Multiply-extended-

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D108039
2021-08-23 08:48:34 +00:00
Cullen Rhodes cf3c6cca9f [AArch64][SME] Add predicate for NEON support in streaming mode
Split out from D107903 to remove dependency for D108039 and D108279.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D108293
2021-08-23 08:48:33 +00:00
Michael Kruse 955b91c19c [Polly] Never consider non-SCoP blocks as error blocks.
Code outside the SCoP will be executed recardless of the code versioning
runtime check introduced by CodeGeneration. Assumption made based on
that these are never executed in Polly-optimized code does not hold.

This fixes the miscompilation of MultiSource/Applications/lambda-0.1.3
2021-08-23 01:04:01 -05:00
Siva Chandra Reddy 8e488c3cc0 [libc] Add a multi-waiter mutex test.
A corresponding adjustment to mtx_lock has also been made.
2021-08-23 05:47:30 +00:00