Commit Graph

391760 Commits

Author SHA1 Message Date
Joseph Huber 44feacc736 [OpenMP] Change remaining globalization from an analysis remark to missed
After landing the globalization optimizations, the precense of globalization on
the device that was not put in shared or stack memory is a failed optimization
with performance consequences so it should indicate a missed remark.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D104735
2021-06-22 16:52:06 -04:00
Kadir Cetinkaya 544d20eab6
[clangd] Dont index ObjCCategoryDecls for completion
They are already provided by Sema, deserializing from preamble if need
be. Moreover category names are meaningless outside interface/implementation
context, hence they were only causing noise.

Differential Revision: https://reviews.llvm.org/D104540
2021-06-22 22:42:25 +02:00
Aart Bik 36b66ab9ed [mlir][sparse] add support for "simply dynamic" sparse tensor expressions
Slowly we are moving toward full support of sparse tensor *outputs*. First
step was support for all-dense annotated "sparse" tensors. This step adds
support for truly sparse tensors, but only for operations in which the values
of a tensor change, but not the nonzero structure (this was refered to as
"simply dynamic" in the [Bik96] thesis).

Some background text was posted on discourse:
https://llvm.discourse.group/t/sparse-tensors-in-mlir/3389/25

Reviewed By: gussmith23

Differential Revision: https://reviews.llvm.org/D104577
2021-06-22 13:37:32 -07:00
David Tenty 7942ebdf01 [clang] Add cc1 option for dumping layout for all complete types
This change adds an option which, in addition to dumping the record
layout as is done by -fdump-record-layouts, causes us to compute the
layout for all complete record types (rather than the as-needed basis
which is usually done by clang), so that we will dump them as well.
This is useful if we are looking for layout differences across large
code bases without needing to instantiate every type we are interested in.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D104484
2021-06-22 16:27:26 -04:00
Joseph Huber 422adaa879 [OpenMP] Add thread limit environment variable support to plugins
The OpenMP 5.1 standard defines the environment variable
`OMP_TEAMS_THREAD_LIMIT` to limit the number of threads that will be run in a
single block. This patch adds support for this into the AMDGPU and CUDA
plugins.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D103923
2021-06-22 16:25:40 -04:00
Louis Dionne e35677c07c [libc++] NFC: Remove unused c++98 Lit feature 2021-06-22 16:24:43 -04:00
River Riddle 87e59e47e9 [mlir] Remove the Identifier ThreadLocalCache from MLIRContext
This used to be important for reducing lock contention when accessing identifiers, but
the cost of the cache can be quite large if parsing in a multi-threaded context. After
D104167, the win of keeping a cache is not worth the cost.

Differential Revision: https://reviews.llvm.org/D104737
2021-06-22 19:56:05 +00:00
River Riddle e4e31e19bb [mlir][OpGen] Cache Identifiers for known attribute names in AbstractOperation.
Operations currently rely on the string name of attributes during attribute lookup/removal/replacement, in build methods, and more. This unfortunately means that some of the most used APIs in MLIR require string comparisons, additional hashing(+mutex locking) to construct Identifiers, and more. This revision remedies this by caching identifiers for all of the attributes of the operation in its corresponding AbstractOperation. Just updating the autogenerated usages brings up to a 15% reduction in compile time, greatly reducing the cost of interacting with the attributes of an operation. This number can grow even higher as we use these methods in handwritten C++ code.

Methods for accessing these cached identifiers are exposed via `<attr-name>AttrName` methods on the derived operation class. Moving forward, users should generally use these methods over raw strings when an attribute name is necessary.

Differential Revision: https://reviews.llvm.org/D104167
2021-06-22 19:56:05 +00:00
Reid Kleckner 5bcbc7ee52 Add regression test for maybeMangle issue
This was crbug.com/1222724, which caused D104529 to be reverted. The
new test fails when D104529 is reapplied locally.
2021-06-22 12:55:25 -07:00
Geoffrey Martin-Noble 4aeb2e60df Introduce a Bazel build configuration
This patch introduces configuration for a Bazel BUILD in a side
directory in the monorepo.

This is following the approval of
https://github.com/llvm/llvm-www/blob/main/proposals/LP0002-BazelBuildConfiguration.md

As detailed in the README, the Bazel BUILD is not supported
by the community in general, and is maintained only by interested
parties. It follows the requirements of the LLVM peripheral tier:
https://llvm.org/docs/SupportPolicy.html#peripheral-tier.

This is largely copied from https://github.com/google/llvm-bazel,
with a few filepath tweaks and the addition of the README.

Reviewed By: echristo, keith, dblaikie, kuhar

Differential Revision: https://reviews.llvm.org/D90352
2021-06-22 12:47:43 -07:00
Vitali Lovich 64cf5eba06 [clang-format] Add new LambdaBodyIndentation option
Currently the lambda body indents relative to where the lambda signature is located. This instead lets the user
choose to align the lambda body relative to the parent scope that contains the lambda declaration. Thus:

someFunction([] {
  lambdaBody();
});

will always have the same indentation of the body even when the lambda signature goes on a new line:

someFunction(
    [] {
  lambdaBody();
});

whereas before lambdaBody would be indented 6 spaces.

Differential Revision: https://reviews.llvm.org/D102706
2021-06-22 21:46:16 +02:00
Petr Hosek 21c008d5a5 Revert "[cmake] [compiler-rt] Call llvm_setup_rpath() when adding shared libraries."
This reverts commit 78fd93e039 as
a follow up to D91099.
2021-06-22 12:42:39 -07:00
Nico Weber 356d6b7b8a [gn build] manually port c747b7d1d9 more (config.osx_sysroot) 2021-06-22 15:33:52 -04:00
Nico Weber dedeb66191 Make lit configs relocatable again after c747b7d1d9
See https://reviews.llvm.org/D77184 for background.
2021-06-22 15:27:32 -04:00
Bill Wendling 46db43240f [llvm-diff] Explicitly check ConstantArrays
Global initializers may be ConstantArrays. They need to be checked
explicitly, because different-yet-still-equivalent type names may be
used for each, and/or a GEP instruction may appear in one.
2021-06-22 12:23:38 -07:00
Bill Wendling ab6002871d [llvm-diff] Add support for diffing the callbr instruction
The only wrinkle is that we can't process the "blockaddress" arguments
of the callbr until the blocks have been equated. So we force them to be
"unified" before checking.

This was left out when the callbr instruction was added.

Differential Revision: https://reviews.llvm.org/D104606
2021-06-22 12:23:37 -07:00
Nikita Popov ae1093921f Revert "[compiler-rt] Make use of undefined symbols configurable"
This reverts commit ed7086ad46.
This reverts commit b9792638b0.

This breaks cmake with message:

    CMake Error at llvm-project/compiler-rt/CMakeLists.txt:449:
      Parse error.  Expected "(", got newline with text "
2021-06-22 21:20:20 +02:00
Nikita Popov 7bb7fa12e7 [OpaquePtr] Support changing load type in InstCombine
When the load type is changed to ptr, we need the load pointer type
to also be ptr, because it's not allowed to create a pointer to an
opaque pointer. This is achieved by adjusting the getPointerTo() API
to return an opaque pointer for an opaque pointer base type.

Differential Revision: https://reviews.llvm.org/D104718
2021-06-22 21:16:15 +02:00
Sami Tolvanen 33c9438f11 Revert "ThinLTO: Fix inline assembly references to static functions with CFI"
This reverts commit 4474958d3a.

Breaks check-llvm on Mac.
2021-06-22 12:10:58 -07:00
Joseph Huber bc768aac2e [OpenMP] Remove OpenMP CUDA Target Parallel compiler flag
Summary:
The changes introduced in D97680 turns this command line option into a no-op so
it can be removed entirely.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D102940
2021-06-22 15:10:19 -04:00
Christopher Di Bella e4ec613083 [libcxx][doc] corrects LWG links in the One Ranges section 2021-06-22 19:00:23 +00:00
Petr Hosek ed7086ad46 [CMake] Fix the option declaration
This addresses build issue introduced in
b9792638b0.
2021-06-22 11:58:26 -07:00
Christopher Di Bella e7091da10b [libcxx][docs] updates the ranges status paper
* indicates whether work has been started or completed
* consolidates content that was split for dependency reasons (iff
  everything has been merged)
* makes things a lot more fine-grained
* turns sub-CSVs into lists
* puts links into description section and removes patch column
* adds links to c++draft on occasion

These changes heavily prioritise the the reader of the generated HTML
file, not the source.

Differential Revision: https://reviews.llvm.org/D103295
2021-06-22 18:54:59 +00:00
Petr Hosek b9792638b0 [compiler-rt] Make use of undefined symbols configurable
We want to disable the use of undefined symbols on Fuchsia, but there
are cases where it might be desirable so may it configurable.

Differential Revision: https://reviews.llvm.org/D104728
2021-06-22 11:49:31 -07:00
Petr Hosek fa5f425209 [compiler-rt][CMake] Drop flags that are set by default for Fuchsia
-Wl,-z,now is set by the Fuchsia driver, -Wl,-z,relro is the default
in LLD.
2021-06-22 11:49:30 -07:00
Akira Hatanaka f4c06bcb67 [CodeGen] Don't create fake FunctionDecls when generating block/byref
copy/dispose helper functions

We found out that these fake functions would cause clang to crash if the
changes proposed in https://reviews.llvm.org/D98799 were made.

Differential Revision: https://reviews.llvm.org/D104082
2021-06-22 11:42:53 -07:00
Joseph Huber ca1560da72 [OpenMP][NFC] Add new optimizations to OpenMPOpt comment header
Summary:
Adds mentions to the new globalization optimizations added to the OpenMPOpt comment header.
2021-06-22 14:40:31 -04:00
Joseph Huber b54ccab509 [Attributor] Add an option to increase the max number of iterations
Right now the Attributor defaults to 32 fixed point iterations unless it is set
explicitly by a command line flag. This patch allows this to be configured when
the attributor instance is created. The maximum is then increased in OpenMPOpt
if the target is a kernel. This is because the globalization analysis can result
in larger iteration counts due to many dependent instances running at once.

Depends on D102444

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D104416
2021-06-22 14:38:25 -04:00
Reid Kleckner 8d84751ac4 Revert "[LLD] [COFF] Avoid doing repeated fuzzy symbol lookup for each iteration. NFC."
This reverts commit e1adf90826.

This appears to affect the way that C++ mangled symbols appear in the
import library when using a .def file that names a C++ free function
with no name decoration. I will follow up with a reduced test case
shortly.
2021-06-22 11:35:14 -07:00
Fangrui Song 948016228f Improve clang -Wframe-larger-than= diagnostic
Match the style in D104667.

This commit is for non-LTO diagnostics, while D104667 is for LTO and llc diagnostics.
2021-06-22 11:20:49 -07:00
Sanjay Patel b1f6ef92ec [InstCombine] reduce code duplication for FP min/max with casts fold; NFC 2021-06-22 14:15:04 -04:00
Sanjay Patel bfd172999b [InstCombine][test] add tests for FP min/max with negated op; NFC 2021-06-22 14:15:04 -04:00
Sanjay Patel 4e78bd3836 [InstCombine][test] add tests for FP min/max with negated op; NFC 2021-06-22 14:15:04 -04:00
Joseph Huber 30e36c9b3c [Attributor] Add interface to emit remarks in Attributor
Summary:
This patch adds support for the Attributor to emit remarks on behalf of some
other pass. The attributor can now optionally take a callback function that
returns an OptimizationRemarkEmitter object when given a Function pointer. If
this is availible then a remark will be emitted for the corresponding pass
name.

Depends on D102197

Reviewed By: sstefan1 thegameg

Differential Revision: https://reviews.llvm.org/D102444
2021-06-22 14:12:46 -04:00
David Green 015c27caa2 [ARM] Change some Gather/Scatter interface types to Instructions. NFC
These returned Values are cast to an Instruction already, this just
cleans up the interface a little to match the expected types.
2021-06-22 19:11:39 +01:00
Raphael Isemann 709f8186a4 [lldb] Add missing string include to lldb-server's main 2021-06-22 19:49:10 +02:00
Louis Dionne 87dbe6c4ef [libc++] NFC: Add missing all.h to the modulemap 2021-06-22 13:47:41 -04:00
Matt Arsenault 39f8a792f0 AMDGPU: Try to eliminate clearing of high bits of 16-bit instructions
These used to consistently be zeroed pre-gfx9, but gfx9 made the
situation complicated since now some still do and some don't. This
also manages to pick up a few cases that the pattern fails to optimize
away.

We handle some cases with instruction patterns, but some get
through. In particular this improves the integer cases.
2021-06-22 13:42:49 -04:00
Arthur O'Dwyer 317e92a3e8 [libc++] Enable `explicit` conversion operators, even in C++03 mode.
C++03 didn't support `explicit` conversion operators;
but Clang's C++03 mode does, as an extension, so we can use it.
This lets us make the conversion explicit in `std::function` (even in '03),
and remove some silly metaprogramming in `std::basic_ios`.

Drive-by improvements to the tests for these operators, in addition
to making sure all these tests also run in `c++03` mode.

Differential Revision: https://reviews.llvm.org/D104682
2021-06-22 13:35:59 -04:00
Matt Arsenault 2e120920ac AMDGPU: Add baseline test for instructions zeroing high bits 2021-06-22 13:27:39 -04:00
Joseph Huber 7d69da71dd [OpenMP] Enable HeapToStack conversion in OpenMPOpt for new RTL globalization calls
Summary:
The changes to globalization introduced in D97680 introduce a large amount of overhead by default. The old globalization method would always ignore globalization code if executing in SPMD mode. This wasn't strictly correct as data sharing is still possible in SPMD mode. The new interface is correct but introduces globalization code even when unnecessary. This optimization will use the existing HeapToStack transformation in the attributor to allow for unneeded globalization to be replaced with thread-private stack memory. This is done using the newly introduced library instances for the RTL functions added in D102087.

Depends on D97818

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D102197
2021-06-22 13:23:05 -04:00
Joseph Huber 2662351e3b [OpenMP] Add new OpenMP globalization functions to library info
Summary:
The changes to globalization introduced in D97680 created two new functions to
push / pop shareably memory on the GPU, __kmpc_alloc_shared and
__kmpc_free_shared. This patch adds these new runtime functions to the
library info so they can be used by the HeapToStack attributor interface. This
optimization replaces malloc / free pairs with stack memory if legal.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D102087
2021-06-22 13:23:05 -04:00
Patrick Holland d03736455c [MCA] [In-order pipeline] Fix for 0 latency instruction causing assertion to fail.
0 latency instructions now get processed and retired properly within the in-order pipeline. Had to fix a bug within TimelineView.cpp as well that would show up when a 0 latency instruction was the first instruction in the source.

Differential Revision: https://reviews.llvm.org/D104675
2021-06-22 10:18:39 -07:00
Matt Arsenault 9ad8a1f6fb AMDGPU: Fix high 16-bit optimization on gfx9
We can do this optimization in the majority of cases, but we currently
don't have a way to do it. We do not track/model which instructions
have which behavior, the control bit to change the high bit behavior,
or making use of preserved bits at all. This is a bit fuzzy since we
don't know precisely how the source instruction will be lowered, but
that only really matters in one case (for fma_mixlo).

We do need to fixup some of these cases after selection, but the
pattern helps eliminate many of these zexts.
2021-06-22 13:16:45 -04:00
LLVM GN Syncbot 805e1a5896 [gn build] Port 40d6d2c49d 2021-06-22 17:03:46 +00:00
Sami Tolvanen 4474958d3a ThinLTO: Fix inline assembly references to static functions with CFI
Create an internal alias with the original name for static functions
that are renamed in promoteInternals to avoid breaking inline
assembly references to them.

Link: https://github.com/ClangBuiltLinux/linux/issues/1354

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D104058
2021-06-22 10:01:55 -07:00
zhijian bd240b3d77 [AIX][XCOFF] generate eh_info when vector registers are saved according to the traceback table.
Summary:

generate eh_info when vector registers are saved according to the traceback table.

struct eh_info_t {

unsigned version;       /* EH info version 0 */
#if defined(64BIT)

char _pad[4];           /* padding */
#endif

unsigned long lsda;     /* Pointer to Language Specific Data Area */
unsigned long personality; /* Pointer to the personality routine */
};

the value of lsda and personality is zero when the number of vector registers saved is large zero and there is not personality of the function

Reviewers: Jason Liu
Differential Revision: https://reviews.llvm.org/D103651
2021-06-22 13:01:31 -04:00
Stanislav Mekhanoshin d797a7f8da [AMDGPU] Use performOptimizedStructLayout for LDS sort
This gives better packing.

Differential Revision: https://reviews.llvm.org/D104331
2021-06-22 09:58:10 -07:00
Fangrui Song f53d791520 Improve the diagnostic of DiagnosticInfoResourceLimit (and warn-stack-size in particular)
Before: `warning: stack size limit exceeded (888) in main`
After: `warning: stack frame size (888) exceeds limit (100) in function 'main'` (the -Wframe-larger-than limit will be mentioned)

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D104667
2021-06-22 09:55:20 -07:00
zoecarver 40d6d2c49d [libcxx][ranges] Add `ranges::iter_swap`.
Differential Revision: https://reviews.llvm.org/D102809
2021-06-22 09:52:40 -07:00