Commit Graph

418076 Commits

Author SHA1 Message Date
Marek Kurdej 126b37a713 [clang-format] Correctly recognize arrays in template parameter list.
Fixes https://github.com/llvm/llvm-project/issues/54245.

Reviewed By: MyDeveloperDay, HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D121584
2022-03-15 11:33:13 +01:00
Ivan Butygin 9f864a5447 [mlir][gpu] Introduce gpu.global_id op
Introduce OpenCL-style global_id op and corresponding spirv lowering.

Differential Revision: https://reviews.llvm.org/D121548
2022-03-15 13:25:50 +03:00
Ivan Butygin 7b0e041df8 [mlir][spirv] Add AssumeTrueKHROp
Differential Revision: https://reviews.llvm.org/D121601
2022-03-15 13:03:45 +03:00
Matthias Springer 05e0495f1d [mlir][bufferize][NFC] Deallocate all buffers at the end of bufferization
This makes bufferization more modular. This is in preparation of future refactorings.

Differential Revision: https://reviews.llvm.org/D121362
2022-03-15 17:53:53 +09:00
Nikita Popov 875782bd9e [OpenMPOpt] Avoid pointer element type access during region merging
Hardcode the function type as ParallelTask, which is the guaranteed
pointee type of this runtime function argument (if pointee types
exist). The elimination of the callee bitcast is left for InstCombine.

Differential Revision: https://reviews.llvm.org/D120885
2022-03-15 09:52:46 +01:00
Matthias Springer 9597b16aa9 [mlir][bufferize][NFC] Split BufferizationState into AnalysisState/BufferizationState
Differential Revision: https://reviews.llvm.org/D121361
2022-03-15 17:35:47 +09:00
Jean Perier 83b0d0f964 [flang] fulfill -Msave/-fno-automatic in main programs too
`semantics::IsSaved()` was not applying -Msave/-fno-automatic for main programs.
This caused issues since lowering relies on it to allocate static
variables. This did not match nvfortran/gfortran behaviors where
-fno-automatic/-Msave control the static allocation of scalars in
main programs.

Some program may rely on main program scalars to be statically allocated in
bss (and therefore initialized to zero) with -Msave/-fno-automatic
flags.

Differential Revision: https://reviews.llvm.org/D121603
2022-03-15 09:33:07 +01:00
Matthias Springer 76b1601001 [mlir][bufferize] Fix config not passed to greedy rewriter
Also add a TODO to switch to a custom walk instead of the GreedyPatternRewriter, which should be more efficient. (The bufferization pattern is guaranteed to apply only a single time for every op, so a simple walk should suffice.)

We currently specify a top-to-bottom walk order. This is important because other walk orders could introduce additional casts and/or buffer copies. These canonicalize away again, but it is more efficient to never generate them in the first place.

Note: A few of these canonicalizations are not yet implemented.

Differential Revision: https://reviews.llvm.org/D121518
2022-03-15 17:32:38 +09:00
Siva Chandra Reddy 1ceb007939 [libc][Obvious] Fix typo in CMake file. 2022-03-15 08:32:05 +00:00
Jean Perier a69cb78242 [flang] Hanlde COMPLEX 2/3/10 in runtime TypeCode(cat, kind)
Type codes for COMPLEX kinds 2, 3, and 10 were added in https://reviews.llvm.org/D117336
but handling for these kinds in TypeCode(cat, kind) has not been added
yet.

Differential Revision: https://reviews.llvm.org/D121587
2022-03-15 09:26:14 +01:00
Fangrui Song 252bc2b9f5 [MachineLICM] Simplify code and avoid adding nullptr values to ParentMap. NFC 2022-03-15 01:24:01 -07:00
Florian Hahn ca1b2fc9fb
[LV] Remove LoopVectorBody from InnerLoopVectorizer. (NFCI)
Update places still referencing LoopVectorBody to use the vector loop to
get the vector loop header. This is needed to move vector loop
code-generation to VPlan completely, which in turn is needed to model
pre-header & exit blocks in VPlan as well.
2022-03-15 08:22:31 +00:00
River Riddle bbfec2a1b0 [mlir] Remove the deprecated ODS Op verifier/parser/printer code blocks
These have been deprecated for ~1 month now and can be removed.

Differential Revision: https://reviews.llvm.org/D121090
2022-03-15 01:17:30 -07:00
Stanislav Gatev 092a530ca1 [clang][dataflow] Model the behavior of non-standard optional constructors
Model nullopt, inplace, value, and conversion constructors.

Reviewed-by: ymandel, xazax.hun, gribozavr2

Differential Revision: https://reviews.llvm.org/D121602
2022-03-15 08:13:13 +00:00
Adrian Kuegel fd8fe3bab6 [mlir][Bazel] Adjust build file to account for new td files. 2022-03-15 09:05:07 +01:00
Qiu Chaofan 300e1293de [PowerPC] Disable perfect shuffle by default
We are going to remove the old 'perfect shuffle' optimization since it
brings performance penalty in hot loop around vectors. For example, in
following loop sharing the same mask:

  %v.1 = shufflevector ... <0,1,2,3,8,9,10,11,16,17,18,19,24,25,26,27>
  %v.2 = shufflevector ... <0,1,2,3,8,9,10,11,16,17,18,19,24,25,26,27>

The generated instructions will be `vmrglw-vmrghw-vmrglw-vmrghw` instead
of `vperm-vperm`. In some large loop cases, this causes 20%+ performance
penalty.

The original attempt to resolve this is to pre-record masks of every
shufflevector operation in DAG, but that is somewhat complex and brings
unnecessary computation (to scan all nodes) in optimization. Here we
disable it by default. There're indeed some cases becoming worse after
this, which will be fixed in a more careful way in future patches.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D121082
2022-03-15 15:52:24 +08:00
River Riddle 23e3cbe24a [mlir] Refactor how parser/printers are specified for AttrDef/TypeDef
There is currently an awkwardly complex set of rules for how a
parser/printer is generated for AttrDef/TypeDef. It can change depending on if a
mnemonic was specified, if there are parameters, if using the assemblyFormat, if
individual parser/printer code blocks were specified, etc. This commit refactors
this to make what the attribute/type wants more explicit, and to better align
with how formats are specified for operations.

Firstly, the parser/printer code blocks are removed in favor of a
`hasCustomAssemblyFormat` bit field. This aligns with the operation format
specification (and is nice to remove code blocks from ODS).

This commit also adds a requirement to explicitly set `assemblyFormat` or
`hasCustomAssemblyFormat` when the mnemonic is set and the attr/type
has no parameters. This removes the weird implicit matrix of behavior,
and also encourages the author to make a conscious choice of either C++
or declarative format instead of implicitly opting them into the C++
format (we should be pushing towards declarative when possible).

Differential Revision: https://reviews.llvm.org/D121505
2022-03-15 00:42:31 -07:00
River Riddle 84d2549e82 [mlir] Rewrite and modernize the documentation for defining Attributes/Types
The current documentation is super old, crusty, and at times wrong. This commit
rewrites the documentation to focus on the TableGen declarative definition,
expounds on various components, and moves the doc out of Tutorials/ and into
a new top level `AttributesAndTypes.md` doc. As part of this, the AttrDef/TypeDef
documentation in OpDefinitions.md is removed.

Differential Revision: https://reviews.llvm.org/D120011
2022-03-15 00:19:52 -07:00
River Riddle 1d7120c69a [mlir] Split out AttrDef/TypeDef and pattern constructs from OpBase.td
OpBase.td has formed into a huge monolith of all ODS constructs. This
commits starts to rectify that by splitting out some constructs to their
own .td files.

Differential Revision: https://reviews.llvm.org/D118636
2022-03-15 00:18:03 -07:00
Mogball 4767e26775 [mlir][ods] Add support for custom directive in attr/type formats
This patch adds support for custom directives in attribute and type formats. Custom directives dispatch calls to user-defined parser and printer functions.

For example, the assembly format "custom<Foo>($foo, ref($bar))" expects a function with the signature

```
LogicalResult parseFoo(AsmParser &parser, FailureOr<FooT> &foo, BarT bar);
void printFoo(AsmPrinter &printer, FooT foo, BarT bar);
```

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D120944
2022-03-15 07:15:15 +00:00
esmeyi 6143ec2961 [NFC][XCOFF] Refactor and format XCOFFObjectWriter.cpp.
Reviewed By: jhenderson, DiggerLin

Differential Revision: https://reviews.llvm.org/D120858
2022-03-15 02:40:50 -04:00
Fangrui Song cce3521020 [llvm-objcopy] Simplify CompressedSection creation. NFC
Remove Expected<CompressedSection> factory functions in favor of constructors
now that zlib::compress returns void (D121512).

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D121644
2022-03-14 23:15:15 -07:00
Fangrui Song 1caee67dac [MC][test] Add more .loc directives to improve portability with older zlib
Make .debug_line so larger so that MC will more assuredly compress .debug_line
(it doesn't compress a section if compressed content is not smaller).
2022-03-14 22:33:08 -07:00
Chris Lattner 2ef95efb41 Revert "[mlirTranslateMain] Add a customization callback."
This reverts commit f18d6af7e9.
This patch is a more controversial than I expected, it is
better to revert while the discussion continues.  xref this
thread:
https://discourse.llvm.org/t/doc-mlir-translate-mlir-opt/60751/

xref this phab patch: https://reviews.llvm.org/D120970

Differential Revision: https://reviews.llvm.org/D121668
2022-03-14 22:04:46 -07:00
Thomas Raoux 6d007e0278 [mlir][nvvm] Fix bug in ldmatrix intrinsic conversion
The ldmatrix intrinsic trans option was inverted.

Bug found by @christopherbate!

Differential Revision: https://reviews.llvm.org/D121666
2022-03-15 05:04:09 +00:00
Jonas Devlieghere 0aaf480be9
[lldb] Cleanup MacOSX platform headers (NFC)
While working on dde487e547 I noticed that the MacOSX platforms were
in need of some love. This patch cleans up the headers:

 - Move platforms into the lldb_private namespace.
 - Remove lldb_private:: prefixes to improve readability.
 - Fix header includes and use forward declarations (iwyu).
 - Fix formatting
2022-03-14 22:01:05 -07:00
Keith Smiley cb22d71806 [clang] Fix DIFile directory root on Windows
On unix systems this logic would not separate the file and directory of
the DIFile unless they shared more components at the start than just the
root path character. The logic to do this was unix specific so it didn't
work on Windows. Now we check if the entire root_path is the same as
what you were going to set as the Dir and use the full filepath in that
case.

Differential Revision: https://reviews.llvm.org/D111579
2022-03-14 20:07:01 -07:00
Keith Smiley 6541d3e979 [test] Add lit helper for windows paths
This adds 2 new lit helpers `%{fs-src-root}` and `%{fs-sep}`, these
allow writing tests that correctly handle slashes on Windows. In the
case of tests like clang/test/CodeGen/debug-prefix-map.c, these are
unable to correctly test behavior on both platforms, unless they fork
and add OS requirements, because the relevant logic hits host specific
codepaths like checking if paths are absolute.

Differential Revision: https://reviews.llvm.org/D111457
2022-03-14 20:05:55 -07:00
Sam Clegg 2481adb59c [WebAssembly] Fix asan issue from https://reviews.llvm.org/D121349 2022-03-14 19:57:50 -07:00
Ruiling Song 98dd390573 AMDGPU: Use removeAllRegUnitsForPhysReg()
I met the issue here when working on something else.
Actually we have already reserved EXEC, but it looks
like the register coalescer is causing the sub-register
of EXEC appears in LiveIntervals. I have not looked
deeper why register coalscer have such behavior, but
removeAllRegUnitsForPhysReg() is the right way.

Reviewed By: critson, foad, arsenm

Differential Revision: https://reviews.llvm.org/D117014
2022-03-15 10:28:27 +08:00
Petr Hosek 1b6ff3f4f8 [CMake][Fuchsia] Use correct architecture for iossim
We should be building iossim for x86_64, not arm64.

Differential Revision: https://reviews.llvm.org/D121659
2022-03-14 19:21:09 -07:00
Jez Ng ceff23c6e3 [lld-macho] -flat_namespace for dylibs should make all externs interposable
All references to interposable symbols can be redirected at runtime to
point to a different symbol definition (with the same name). For
example, if both dylib A and B define symbol _foo, and we load A before
B at runtime, then all references to _foo within dylib B will point to
the definition in dylib A.

ld64 makes all extern symbols interposable when linking with
`-flat_namespace`.

TODO 1: Support `-interposable` and `-interposable_list`, which should
just be a matter of parsing those CLI flags and setting the
`Defined::interposable` bit.

TODO 2: Set Reloc::FinalDefinitionInLinkageUnit correctly with this info
(we are currently not setting it at all, so we're erring on the
conservative side, but we should help the LTO backend generate more
optimal code.)

Reviewed By: modimo, MaskRay

Differential Revision: https://reviews.llvm.org/D119294
2022-03-14 22:18:32 -04:00
Jez Ng 7f3ddf8443 [lld-macho][nfc] Allow Defined symbols to be placed in binding sections
Previously, we only allowed this for DylibSymbols. However, in order to
properly support `-flat_namespace` as well as `-interposable`, we need
to allow this for Defined symbols too. Therefore we hoist the
`lazyBindOffset` and the `stubsHelperIndex` into the parent Symbol
class.

The actual change to support interposition under `-flat_namespace` is in
{D119294}; the NFC changes here have been split out for easier review.

Perf regression isn't stat sig on my 3.2 GHz 16-Core Intel Xeon W linking
chromium_framework:

             base           diff           difference (95% CI)
  sys_time   1.227 ± 0.021  1.234 ± 0.031  [  -0.3% ..   +1.5%]
  user_time  3.665 ± 0.036  3.674 ± 0.035  [  -0.2% ..   +0.7%]
  wall_time  4.596 ± 0.055  4.609 ± 0.064  [  -0.3% ..   +0.9%]
  samples    34             47

Max RSS regression is barely stat sig:

           base                           diff                           difference (95% CI)
  time     1003664356.324 ± 15404053.912  1010380403.613 ± 10578309.455  [  +0.0% ..   +1.3%]
  samples  37                             31

Reviewed By: modimo

Differential Revision: https://reviews.llvm.org/D121351
2022-03-14 22:18:32 -04:00
Owen Pan 0a0cc3c58a [clang-format] Don't unwrap lines preceded by line comments
Fixes #53495

Differential Revision: https://reviews.llvm.org/D121576
2022-03-14 19:16:29 -07:00
Joseph Huber f1388b616a [OpenMP][Fix] Fix test failing after patch 2022-03-14 21:51:38 -04:00
Joseph Huber 670438e55d [OpenMP][Fix] Add offloading kind to AMDGPU libraries
Summary:
A previous patch added the offloading kind to the triple format we used.
I forgot to update the line where we add the AMDGPU libraries.
2022-03-14 21:18:19 -04:00
LLVM GN Syncbot e28ace8a97 [gn build] Port 9c542a5a4e 2022-03-15 00:51:57 +00:00
Julian Lettner 9c542a5a4e Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO
For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.

Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.

Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`).  This escape hatch will be
removed in the future.

Differential Revision: https://reviews.llvm.org/D121327
2022-03-14 17:51:18 -07:00
Joseph Huber 24ebdb6c25 [CUDA] Add CUDA fatbinary magic
Nvidia uses fatbinaries to bundle all of their device code. This patch
adds the magic number "0x50ed55ba" used in their propeitary format to
the list of magic identifies. This is technically undocumented and could
unlikely be changed by Nvidia in the future.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D120932
2022-03-14 20:08:31 -04:00
Joseph Huber 23d885b3a2 [OpenMP][NFC] Refactor new driver to be more general
This path refactors the new driver to be less dependent on OpenMP. This
is done in preparation for the new driver to be able to handle other
offloading kinds and compile them together.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D120934
2022-03-14 20:08:29 -04:00
Joseph Huber 9f89769cd7 [Clang] Add offload kind to embedded offload object
This patch adds the offload kind to the embedded section name in
preparation for offloading to different kinda like CUDA or HIP.

Depends on D120288

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D120271
2022-03-14 20:08:27 -04:00
Joseph Huber 06b336c4cd [OpenMP] Implement dense map info for device file
This patch implements a DenseMap info struct for the device file type.
This is used to help grouping device files that have the same triple and
architecture. Because of this the filename, which will always be unique
for each file, is not used.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D120288
2022-03-14 20:08:26 -04:00
Joseph Huber 806bbc49dc [OpenMP] Try to embed offloading objects after codegen
Currently we use the `-fembed-offload-object` option to embed a binary
file into the host as a named section. This is currently only used as a
codegen action, meaning we only handle this option correctly when the
input is a bitcode file. This patch adds the same handling to embed an
offloading object after we complete code generation. This allows us to
embed the object correctly if the input file is source or bitcode.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D120270
2022-03-14 20:08:24 -04:00
Stanislav Mekhanoshin c4500de255 [AMDGPU] gfx940: disable OP_SEL on V_DOT instructions
Differential Revision: https://reviews.llvm.org/D121634
2022-03-14 17:02:00 -07:00
Vy Nguyen 0d5e27623a Reland "[lld-macho] Avoid using bump-alloc in TrieBuider""
This reverts commit ee7a286cd3.
2022-03-14 19:33:13 -04:00
Stanislav Mekhanoshin 8dd3d1cf1f [AMDGPU] Add symbolic names for gfx940 HWREGs
The namespaces of HWREGs is now overlapping with gfx10. Thus the
patch is longer than necessary to just support new names. It also
need to handle proper error messages, i.e. to issue a "specified
hardware register is not supported on this GPU" message.

This may need a major refactoring in the future.

Differential Revision: https://reviews.llvm.org/D121418
2022-03-14 16:13:33 -07:00
Andrew Browne dbf8c00b09 [DFSan] Remove trampolines to unblock opaque pointers. (Reland with fix)
https://github.com/llvm/llvm-project/issues/54172

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D121250
2022-03-14 16:03:25 -07:00
Stanislav Mekhanoshin 23499103f7 [AMDGPU] Support for gfx940 flat lds opcodes
Differential Revision: https://reviews.llvm.org/D121414
2022-03-14 15:46:19 -07:00
Stanislav Mekhanoshin 1f53f20fc1 [AMDGPU] Support gfx940 v_lshl_add_u64 instruction
Differential Revision: https://reviews.llvm.org/D121401
2022-03-14 15:45:42 -07:00
Ryan Prichard 659029302d [ARM] __cxa_end_cleanup: avoid clobbering r4
The fix for D111703 clobbered r4 both to:
 - Save/restore the original lr.
 - Load the address of _Unwind_Resume for LIBCXXABI_BAREMETAL.

This patch saves and restores lr without clobbering any extra
registers.

For LIBCXXABI_BAREMETAL, it is still necessary to clobber one extra
register to hold the address of _Unwind_Resume, but it seems better to
use ip/r12 (intended for linker veneers/trampolines) than r4 for this
purpose.

The function also clobbers r0 for the _Unwind_Resume function's
parameter, but that is unavoidable.

Reviewed By: danielkiss, logan, MaskRay

Differential Revision: https://reviews.llvm.org/D121432
2022-03-14 15:44:35 -07:00