Commit Graph

392715 Commits

Author SHA1 Message Date
Arthur O'Dwyer e5fbe9f315 [libc++] graph_header_deps.py: Detect files that include themselves.
This wasn't happening before, which led to one slipping in.
2021-06-30 17:37:43 -04:00
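A minimal sketch of the kind of check the graph_header_deps.py commit above describes, not the script's actual code. The regex, the function names, and the sample headers are all assumptions made for illustration.

```python
import re
from typing import Dict, List

# Hypothetical pattern: matches both #include "x" and #include <x>,
# allowing whitespace between '#' and 'include'.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def list_includes(source: str) -> List[str]:
    """Return the names that appear in #include directives of `source`."""
    return INCLUDE_RE.findall(source)


def find_self_includes(headers: Dict[str, str]) -> List[str]:
    """Return the headers whose own text includes the header itself."""
    return sorted(name for name, text in headers.items()
                  if name in list_includes(text))


# Made-up header contents, keyed by header name.
headers = {
    "vector": "#include <__config>\n#include <vector>\n",
    "__config": "// no includes\n",
    "map": "#include <__config>\n#  include <vector>\n",
}
print(find_self_includes(headers))
```

Only direct self-inclusion is reported here; longer include cycles would need a graph traversal.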
Artem Belevich cab5f89cfd [Clang] allow overriding -fbasic-block-sections
We should not error out on non-x86 targets if `-fbasic-block-sections=none` is in effect.

Also, filter it out for GPU-side compilations, as we do with other options not
supported on the GPU.

Differential Revision: https://reviews.llvm.org/D105226
2021-06-30 14:32:08 -07:00
Richard Smith ef227b32b6 Add dumping support for RequiresExpr.
In passing, fix an ast-print bug that inserted a spurious extra `;`
after a concept definition.
2021-06-30 14:27:19 -07:00
Jon Chesterfield d86b0073cf [libomptarget][amdgpu][nfc] Fix build warnings, drop some headers
Removes the stdarg header, drops uses of iostream, and fixes some format string errors.
Also changes a C style struct to C++ style to avoid a warning from clang.

Reviewed By: pdhaliwal

Differential Revision: https://reviews.llvm.org/D104923
2021-06-30 22:23:36 +01:00
Matt Arsenault a601b308d9 GlobalISel: Lower non-byte loads and stores
Previously we didn't preserve the memory type and had to blindly
interpret a number of bytes. Now that non-byte memory accesses are
representable, we can handle these correctly.

Ported from DAG version (minus some weird special case i1 legality
checking which I don't fully understand, and we don't have a way to
query for)

For now, this is NFC and the test changes are placeholders. Since the
legality queries are still relying on byte-flattened memory sizes, the
legalizer can't actually see these non-byte accesses. This keeps this
change self-contained without merging it with the larger patch to
switch to LLT memory queries.
2021-06-30 17:05:50 -04:00
Matt Arsenault 748e0b07dc GlobalISel: Preserve memory type when reducing load/store width 2021-06-30 17:05:29 -04:00
Matt Arsenault d6270125fc AMDGPU/GlobalISel: Remove some problematic testcases
These testcases are a bit nonsensical and won't be handled correctly
for a long time. Remove them to unblock load/store legalization work.
2021-06-30 17:05:29 -04:00
Jonas Paulsson 7aef99351a [MCStreamer] Move emission of attributes section into MCELFStreamer
Enable the emission of a GNU attributes section by reusing the code for
emitting the ARM build attributes section.

The GNU attributes follow the exact same section format as the ARM
BuildAttributes section, so this can be factored out and reused for GNU
attributes generally.

The immediate motivation for this is to emit a GNU attributes section for the
vector ABI on SystemZ (https://reviews.llvm.org/D105067).

Review: Logan Chien, Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D102894
2021-06-30 16:00:27 -05:00
Aleksandr Platonov a62579fc00 [clangd][nfc] Show more information in logs when compiler instance prepare fails
Without this patch, clangd silently handles a failure to prepare the compiler instance, and only the LSP error "Invalid AST" can be found in the logs.
E.g. the cause of https://github.com/clangd/clangd/issues/734 is impossible to understand without verbose logs or with the background index disabled.
This patch adds more information to the logs to help explain such failures.

Logs without this patch:
```
E[...] Could not build a preamble for file test.cpp version 1
```

Logs with this patch:
```
E[...] Could not build a preamble for file test.cpp version 1: CreateTargetInfo() return null
..
E[...] Failed to prepare a compiler instance: unknown target ABI 'lp64'
```

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D104056
2021-06-30 21:58:33 +01:00
Matt Arsenault fae05692a3 CodeGen: Print/parse LLTs in MachineMemOperands
This will currently accept the old number of bytes syntax, and convert
it to a scalar. This should be removed in the near future (I think I
converted all of the tests already, but likely missed a few).

Not sure what the exact syntax and policy should be. We can continue
printing the number of bytes for non-generic instructions to avoid
test churn and only allow non-scalar types for generic instructions.

This will currently print the LLT in parentheses, but accept parsing
the existing integers and implicitly converting to scalar. The
parentheses are a bit ugly, but the parser logic seems unable to deal
without either parentheses or some keyword to indicate the start of a
type.
2021-06-30 16:54:13 -04:00
Siva Chandra 578a4cfe19 [libc][NFC] Clear all exceptions in exception_flags_test before raising another.
This is because raising some exceptions can raise other ones. For
example, raising FE_OVERFLOW can raise FE_INEXACT. So, we need to clear all
exceptions if we want a clean slate.
2021-06-30 13:48:07 -07:00
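The reasoning in the exception_flags_test commit above can be illustrated with a toy model of the exception status word. This is a sketch only: the bit values are hypothetical (real FE_* constants are platform-specific), and the class is not part of any libc.

```python
# Hypothetical bit values chosen for the illustration.
FE_INEXACT = 0x01
FE_OVERFLOW = 0x02
FE_DIVBYZERO = 0x04
FE_ALL_EXCEPT = FE_INEXACT | FE_OVERFLOW | FE_DIVBYZERO


class ToyFEnv:
    """Toy floating-point status word."""

    def __init__(self):
        self.status = 0

    def raise_except(self, excepts):
        self.status |= excepts
        # Modelled side effect: raising overflow also raises inexact.
        if excepts & FE_OVERFLOW:
            self.status |= FE_INEXACT

    def clear_except(self, excepts):
        self.status &= ~excepts

    def test_except(self, excepts):
        return self.status & excepts


env = ToyFEnv()
env.raise_except(FE_OVERFLOW)
env.clear_except(FE_OVERFLOW)            # clears only the flag that was raised explicitly
env.raise_except(FE_DIVBYZERO)
stale = env.test_except(FE_ALL_EXCEPT)   # FE_INEXACT is left over from the overflow

env.clear_except(FE_ALL_EXCEPT)          # clean slate before raising the next one
env.raise_except(FE_DIVBYZERO)
clean = env.test_except(FE_ALL_EXCEPT)
```

In the first sequence the leftover inexact bit would make a test for "only FE_DIVBYZERO is set" fail, which is why clearing everything first is the safer pattern.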
Martin Storsjö bf6770f9bd [CMake] Don't use -Bsymbolic-functions for MinGW targets
This is an ELF specific option which isn't supported for Windows/MinGW
targets, even if the MinGW linker otherwise uses an ld.bfd like linker
interface.

Differential Revision: https://reviews.llvm.org/D105148
2021-06-30 22:54:26 +03:00
Valentin Churavy 9762f12c6c [Orc] Run the examples as part of the tests
Enable the Orc C-Bindings for testing.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D104637
2021-06-30 21:45:16 +02:00
Valentin Churavy 69e0f790e0 [Orc] Fix name of LLVMOrcIRTransformLayerSetTransform
In https://reviews.llvm.org/D103855 we added access to IRTransformLayer, but I
just noticed that the function name is following the wrong pattern.

Differential Revision: https://reviews.llvm.org/D104840
2021-06-30 21:43:34 +02:00
Shilei Tian 24a36ce58b [OpenMP][Offloading] Replace all calls to `isSPMDMode` with `__kmpc_is_spmd_exec_mode`
In our ongoing work, we are using `AbstractAttributor` to deduce the execution model
of device functions, and potentially remove unnecessary function calls to
`__kmpc_is_spmd_exec_mode`. The current device runtime mixes uses of
`isSPMDMode` and `__kmpc_is_spmd_exec_mode`, although `__kmpc_is_spmd_exec_mode`
simply calls `isSPMDMode`. All functions starting with `__kmpc` are C
functions, which have no name mangling, so they are more optimization
friendly. In this patch, we simply replace all calls to `isSPMDMode` with
`__kmpc_is_spmd_exec_mode` to pave the way for the optimization.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D105211
2021-06-30 15:39:57 -04:00
Jon Roelofs a642872476 [GISel] Support llvm.memcpy.inline
Differential revision: https://reviews.llvm.org/D105072
2021-06-30 12:39:05 -07:00
Suraj Sudhir 2eb7bbbe65 [mlir][tosa] Use 3D tensors in tosa.matmul
Signed-off-by: Suraj Sudhir <suraj.sudhir@arm.com>

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D105213
2021-06-30 12:22:52 -07:00
Florian Hahn e6d22d0174 [BasicAA] Use separate scale variable for GCD.
Use separate variable for adjusted scale used for GCD computations. This
fixes an issue where we incorrectly determined that all indices are
non-negative and returned noalias because of that.

Follow up to 91fa3565da.
2021-06-30 20:04:39 +01:00
Florian Hahn f4ea6531e6 [BasicAA] Add test for incorrectly inferring noalias due to scale sign.
This patch adds a test where we currently incorrectly determine noalias,
because the sign of Scale is adjusted after 91fa3565da.
2021-06-30 19:57:29 +01:00
LLVM GN Syncbot ec74192f52 [gn build] Port 381ded345b 2021-06-30 18:49:16 +00:00
Nico Weber 51c3e3f80c [gn build] (manually) port f617ab1044 (DoublerPlugin) 2021-06-30 14:49:06 -04:00
Philip Reames f0693bc0ae autogen two tests for ease of update 2021-06-30 11:47:36 -07:00
Stanislav Mekhanoshin 381ded345b [AMDGPU] Add S_MOV_B64_IMM_PSEUDO for wide constants
This is to allow 64 bit constant rematerialization. If a constant
is split into two separate moves initializing sub0 and sub1, as is done
now, RA cannot rematerialize a 64 bit register.

This gives 10-20% uplift in a set of huge apps heavily using double
precision math.

Fixes: SWDEV-292645

Differential Revision: https://reviews.llvm.org/D104874
2021-06-30 11:45:38 -07:00
Xun Li 822b92aae4 [Coroutines] Add the newly generated SCCs back to the CGSCC work queue after CoroSplit actually happened
Relevant discussion can be found at: https://lists.llvm.org/pipermail/llvm-dev/2021-January/148197.html
In the existing design, an SCC that contains a coroutine will go through the following passes:
Inliner -> CoroSplitPass (fake) -> FunctionSimplificationPipeline -> Inliner -> CoroSplitPass (real) -> FunctionSimplificationPipeline

The first CoroSplitPass doesn't do anything other than putting the SCC back to the queue so that the entire pipeline can repeat.
As you can see, we run Inliner twice on the SCC consecutively without doing any real split, which is unnecessary and likely unintended.
What we really wanted is this:
Inliner -> FunctionSimplificationPipeline -> CoroSplitPass -> FunctionSimplificationPipeline
(note that we don't really need to run Inliner again on the ramp function after split).

Hence the way we do it here is to move CoroSplitPass to the end of the CGSCC pipeline, run it once for real, insert the newly generated SCCs (the clones) back into the pipeline so that they can be optimized, and also add a function simplification pipeline after CoroSplit to optimize the post-split ramp function.

This approach also conforms to how the new pass manager works instead of relying on an ad hoc post-split cleanup, making it ready for a full switch to the new pass manager eventually.

By looking at some of the changes to the tests, we can already observe that this change allows more optimizations to be applied to coroutines.

Reviewed By: aeubanks, ChuanqiXu

Differential Revision: https://reviews.llvm.org/D95807
2021-06-30 11:38:14 -07:00
Ahmed Taei 2c4f5690ab Add linalg.batch_matvec named op
Similarly to batch_matmul, the outermost dim is a batching dim
    and this op does |b| matrix-vector products:
    C[b, i] = sum_k(A[b, i, k] * B[b, k])

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D104739
2021-06-30 11:37:21 -07:00
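The formula in the batch_matvec commit above, C[b, i] = sum_k(A[b, i, k] * B[b, k]), can be written out as a plain-Python reference. This is a sketch of the arithmetic only, using nested lists; it is not the linalg op definition, and the function name and sample data are made up.

```python
def batch_matvec(A, B):
    """Reference semantics: C[b][i] = sum over k of A[b][i][k] * B[b][k].

    A is a list of |b| matrices, B is a list of |b| vectors.
    """
    return [
        [sum(a * x for a, x in zip(row, vec)) for row in mat]
        for mat, vec in zip(A, B)
    ]


A = [[[1, 2], [3, 4]],     # batch 0: 2x2 matrix
     [[1, 0], [0, 1]]]     # batch 1: identity
B = [[1, 1],               # batch 0 vector
     [5, 7]]               # batch 1 vector
C = batch_matvec(A, B)
print(C)  # [[3, 7], [5, 7]]
```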
Fangrui Song 03051f7ac8 [ELF] Preserve section order within an INSERT AFTER command
For
```
SECTIONS {
  text.0 : {}
  text.1 : {}
  text.2 : {}
} INSERT AFTER .data;
```

the current order is `.data text.2 text.1 text.0`. It makes more sense to
preserve the specified order and thus improve compatibility with GNU ld.

For
```
SECTIONS { text.0 : {} } INSERT AFTER .data;
SECTIONS { text.3 : {} } INSERT AFTER .data;
```

GNU ld somehow collects sections with `INSERT AFTER .data` together (IMO
inconsistent) but I think it makes more sense to execute the commands in order
and get `.data text.3 text.0` instead.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D105158
2021-06-30 11:35:50 -07:00
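The orderings discussed in the INSERT AFTER commit above can be reproduced with a small list model. This is a sketch, not lld's implementation: the surrounding `.text`/`.bss` sections, the function names, and the "insert one at a time" explanation for the old order are assumptions.

```python
def insert_after(layout, anchor, sections):
    """Insert `sections` as one block right after `anchor`, keeping their order."""
    pos = layout.index(anchor) + 1
    return layout[:pos] + list(sections) + layout[pos:]


def insert_each_individually(layout, anchor, sections):
    """One way to get the old order: every section is placed directly after
    the anchor on its own, so later sections end up in front of earlier ones."""
    for section in sections:
        layout = insert_after(layout, anchor, [section])
    return layout


layout = [".text", ".data", ".bss"]
block = ["text.0", "text.1", "text.2"]

old_order = insert_each_individually(layout, ".data", block)
new_order = insert_after(layout, ".data", block)

# Two separate commands, executed in order.
two_commands = insert_after(insert_after(layout, ".data", ["text.0"]),
                            ".data", ["text.3"])
```

Here `old_order` reproduces `.data text.2 text.1 text.0`, `new_order` keeps the order as written, and `two_commands` gives `.data text.3 text.0`, matching the message.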
Leonard Chan 9b0ddc2662 [clang][Fuchsia] Remove relative-vtables multilibs
As of D102374, relative vtables are enabled on Fuchsia by default, so we don't need any of the RV multilibs.

Differential revision: https://reviews.llvm.org/D105145
2021-06-30 11:21:37 -07:00
David Green cd76f43b49 [ARM] Set the immediate cost of GEP operands to 0
This prevents constant gep operands from being hoisted by the Constant
Hoisting pass, leaving them to CodegenPrepare which can usually do a
better job at splitting large offsets. This can, in general, improve
performance and decrease codesize, especially for v6m where many
constants have a high cost.

Differential Revision: https://reviews.llvm.org/D104877
2021-06-30 19:19:03 +01:00
Michael Liao 4339d3bd84 Fix shared build. 2021-06-30 14:04:16 -04:00
zhijian 9a9e6189d7 [AIX][XCOFF][BUG-Fixed] need to switch back to text section after emitting a dummy eh structure
Summary:

This follows up on the patch https://reviews.llvm.org/D103651, "[AIX][XCOFF] generate eh_info when vector registers are saved according to the traceback table."

Generating eh_info switches to another section; once that is done, we need to switch back to the text section.

Reviewers: Jason Liu
Differential Revision: https://reviews.llvm.org/105195
2021-06-30 13:56:37 -04:00
Simon Pilgrim 59fa435ea6 [X86] Canonicalize SGT/UGT compares with constants to use SGE/UGE to reduce the number of EFLAGs reads. (PR48760)
This demonstrates a possible fix for PR48760 - for compares with constants, canonicalize the SGT/UGT condition code to use SGE/UGE which should reduce the number of EFLAGs bits we need to read.

As discussed on PR48760, some EFLAG bits are treated independently which can require additional uops to merge together for certain CMOVcc/SETcc/etc. modes.

I've limited this to cases where the constant increment doesn't result in a larger encoding or additional i64 constant materializations.

Differential Revision: https://reviews.llvm.org/D101074
2021-06-30 18:46:50 +01:00
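The canonicalization in the X86 commit above relies on `x > C` being the same as `x >= C + 1` whenever `C + 1` does not overflow. The motivation is that, on x86, the "greater" conditions also read ZF, while the "greater or equal" conditions do not. A brute-force check over 8-bit values, as a sketch (the function names are made up, and the encoding-size restrictions mentioned in the message are not modelled):

```python
def signed_rewrite_holds(bits=8):
    """x >s C  ==  x >=s C + 1 for every C below the signed maximum."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return all((x > c) == (x >= c + 1)
               for c in range(lo, hi)          # excludes C == SMAX
               for x in range(lo, hi + 1))


def unsigned_rewrite_holds(bits=8):
    """x >u C  ==  x >=u C + 1 for every C below the unsigned maximum."""
    hi = (1 << bits) - 1
    return all((x > c) == (x >= c + 1)
               for c in range(0, hi)           # excludes C == UMAX
               for x in range(0, hi + 1))


print(signed_rewrite_holds(), unsigned_rewrite_holds())
```

The excluded constants are the ones where the increment would wrap; for those the rewrite is not valid.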
Sanjay Patel c7b658aeb5 [InstCombine] fold icmp of offset value with constant
There must be a better way to describe this pattern in words?
(X + C2) >u C --> X <s -C2 (if C == C2 + SMAX)

This could be extended to handle the more general (non-constant)
pattern too:
https://alive2.llvm.org/ce/z/rdfNFP

  define i1 @src(i8 %a, i8 %c1) {
    %t = add i8 %a, %c1
    %c2 = add i8 %c1, 127 ; SMAX
    %ov = icmp ugt i8 %t, %c2
    ret i1 %ov
  }

  define i1 @tgt(i8 %a, i8 %c1) {
    %neg_c1 = sub i8 0, %c1
    %ov = icmp slt i8 %a, %neg_c1
    ret i1 %ov
  }

The pattern was noticed as a by-product of D104932.
2021-06-30 13:37:31 -04:00
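The fold in the InstCombine commit above, (X + C2) >u C --> X <s -C2 when C == C2 + SMAX, can be checked exhaustively for i8. The following is a sketch that mirrors the `@src` and `@tgt` functions from the message with wrapping 8-bit arithmetic; the helper names are made up.

```python
BITS = 8
MASK = (1 << BITS) - 1
SMAX = (1 << (BITS - 1)) - 1


def to_signed(v):
    """Interpret the low BITS bits of v as a two's complement value."""
    v &= MASK
    return v - (1 << BITS) if v > SMAX else v


def src(x, c2):
    """icmp ugt (add x, c2), (add c2, SMAX), with wrapping adds."""
    return ((x + c2) & MASK) > ((c2 + SMAX) & MASK)


def tgt(x, c2):
    """icmp slt x, (sub 0, c2)."""
    return to_signed(x) < to_signed(-c2)


fold_holds = all(src(x, c2) == tgt(x, c2)
                 for x in range(1 << BITS)
                 for c2 in range(1 << BITS))
print(fold_holds)
```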
Sanjay Patel 36bd25db3d [InstCombine][test] add tests for icmp with constant and offset; NFC 2021-06-30 13:37:31 -04:00
Siva Chandra Reddy 230df8a419 [libc] Allow reading and writing __FE_DENORM if available on x86_64.
Some libcs define __FE_DENORM on x86_64. This change allows reading the
bits corresponding to that non-standard exception.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D105004
2021-06-30 17:32:24 +00:00
Siva Chandra Reddy 804dc3dcf2 [libc] Clear all exceptions before setting in fesetexceptflag.
Previously, exceptions from the flag were being added. This patch
changes it such that only the exceptions in the flag will be set.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D105085
2021-06-30 17:29:48 +00:00
Philip Reames 0c2f40f916 [instcombine] Precommit tests for umin(a,b) ne/eq 0 fold 2021-06-30 10:29:19 -07:00
Siva Chandra Reddy 9474ddc3ac [libc] Fix feclearexcept for x86_64.
Previously, feclearexcept cleared all exceptions irrespective of the
argument. This change brings it in line with the aarch64 flavors wherein
only those exceptions listed in the argument will be cleared.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D105081
2021-06-30 17:28:06 +00:00
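The difference described in the feclearexcept commit above comes down to one mask operation on the status word. A sketch with hypothetical bit values (real FE_* constants are platform-specific) and made-up function names:

```python
# Hypothetical bit values chosen for the illustration.
FE_INVALID, FE_DIVBYZERO, FE_OVERFLOW = 0x01, 0x02, 0x04


def clear_everything(status, excepts):
    """Behaviour described as the old one: the argument is ignored."""
    return 0


def clear_listed(status, excepts):
    """Behaviour described as the fix: only the listed exceptions are cleared."""
    return status & ~excepts


status = FE_INVALID | FE_OVERFLOW
after_old = clear_everything(status, FE_OVERFLOW)   # FE_INVALID is lost as well
after_fix = clear_listed(status, FE_OVERFLOW)       # FE_INVALID survives
```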
Philip Reames c4fc2cb5b2 [instcombine] umin(x, 1) == zext(x != 0)
We already implemented this for the select form, but the intrinsic form was missing.  Note that this doesn't change poison behavior as 1 is non-poison, and the optimized form is still poison exactly when x is.
2021-06-30 10:20:01 -07:00
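The identity in the instcombine commit above, umin(x, 1) == zext(x != 0), is easy to confirm by enumeration. A sketch over all unsigned 8-bit values (the helper names are made up; poison is not modelled):

```python
def umin(a, b):
    """Unsigned minimum; inputs are already non-negative here."""
    return min(a, b)


def zext_ne_zero(x):
    """zext(icmp ne x, 0): 1 when x is non-zero, else 0."""
    return int(x != 0)


fold_holds = all(umin(x, 1) == zext_ne_zero(x) for x in range(256))
print(fold_holds)
```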
Tomas Matheson f617ab1044 [NPM] Resolve llvmGetPassPluginInfo to the plugin being loaded
Dynamically loaded plugins for the new pass manager are initialised by
calling llvmGetPassPluginInfo. This is defined as a weak symbol so that
it is continually redefined by each plugin that is loaded. When loading
a plugin from a shared library, the intention is that
llvmGetPassPluginInfo will be resolved to the definition in the most
recent plugin. However, using a global search for this resolution can
fail in situations where multiple plugins are loaded.

Currently:

* If a plugin does not define llvmGetPassPluginInfo, then it will be
  silently resolved to the previous plugin's definition.

* If loading the same plugin twice with another in between, e.g. plugin
  A/plugin B/plugin A, then the second load of plugin A will resolve to
  llvmGetPassPluginInfo in plugin B.

* The previous case can also occur when a dynamic library defines both
  NPM and legacy plugins; the legacy plugins are loaded first and then
  with `-fplugin=A -fpass-plugin=B -fpass-plugin=A`: A will be loaded as
  a legacy plugin and define llvmGetPassPluginInfo; B will be loaded
  and redefine it; and finally when A is loaded as an NPM plugin it will
  be resolved to the definition from B.

Instead of searching globally, restrict the symbol lookup to the library
that is currently being loaded.

Differential Revision: https://reviews.llvm.org/D104916
2021-06-30 18:11:28 +01:00
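The failure modes listed in the NPM plugin commit above can be reproduced with a toy symbol table. This is a deliberately simplified model chosen to match the symptoms in the message; it is not how a real dynamic linker resolves symbols, and the class and method names are hypothetical.

```python
class ToyLoader:
    """Toy model: libraries are remembered in the order they were first loaded."""

    def __init__(self):
        self.loaded = []  # list of (library name, {symbol: definition})

    def load(self, name, symbols):
        # Loading an already-loaded library does not change the search order.
        if all(n != name for n, _ in self.loaded):
            self.loaded.append((name, symbols))

    def lookup_global(self, symbol):
        """Search every loaded library, most recently loaded first."""
        for _, symbols in reversed(self.loaded):
            if symbol in symbols:
                return symbols[symbol]
        return None

    def lookup_in(self, name, symbol):
        """Restrict the lookup to the named library."""
        for n, symbols in self.loaded:
            if n == name:
                return symbols.get(symbol)
        return None


SYM = "llvmGetPassPluginInfo"
loader = ToyLoader()
loader.load("A", {SYM: "info from A"})
loader.load("B", {SYM: "info from B"})
loader.load("A", {SYM: "info from A"})      # A / B / A

global_for_a = loader.lookup_global(SYM)    # resolves to B's definition
local_for_a = loader.lookup_in("A", SYM)    # resolves to A's definition

loader.load("C", {})                        # C does not define the symbol
global_for_c = loader.lookup_global(SYM)    # silently falls back to B
local_for_c = loader.lookup_in("C", SYM)    # nothing found, so the error is visible
```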
Yaxun (Sam) Liu 434bd5bf54 [AMDGPU] Add builtin functions image_bvh_intersect_ray
Reviewed by: Stanislav Mekhanoshin, Matt Arsenault

Differential Revision: https://reviews.llvm.org/D104946
2021-06-30 13:10:47 -04:00
Nico Weber f6db88535c [gn build] add dep needed after b56e5f8a10 2021-06-30 12:58:59 -04:00
Nico Weber b56e5f8a10 [clangd] Unbreak mac build after 0c96a92d86
That commit removed the include of Features.inc from ClangdLSPServer.h,
but ClangdMain.cpp relied on this include to pull in Features.inc for
the #if at the bottom of Transport.h.

Since the include is needed in Transport.h, just add it there
directly.
2021-06-30 12:53:38 -04:00
Fangrui Song 7b06bfc49e [ELF] -pie: produce dynamic relocations for absolute relocations referencing undef weak
See the comment for my understanding of -no-pie and -shared expectation.
-no-pie has freedom on choices. We choose dynamic relocations to be consistent
with the handling of GOT-generating relocations.

Note: GNU ld has arch-varying behaviors and its x86 -pie has a very
complex rule:
if there is at least one GOT-generating or PLT-generating relocation and
-z dynamic-undefined-weak (enabled by default) is in effect, generate a
dynamic relocation.

We don't emulate its rule.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D105164
2021-06-30 09:43:28 -07:00
David Goldman 570984204f [clangd] Fix highlighting for implicit ObjC property refs
Objective-C lets you use the `self.prop` syntax as sugar for both
`[self prop]` and `[self setProp:]`, but clangd previously did not
provide a semantic token for `prop`.

Now, we provide a semantic token, treating it like a normal property
except it's backed by a `ObjCMethodDecl` instead of a
`ObjCPropertyDecl`.

Differential Revision: https://reviews.llvm.org/D104117
2021-06-30 12:31:50 -04:00
Caroline Tice 05915400b7 [lldb] Replace SVE_PT* macros in NativeRegisterContextLinux_arm64.{cpp,h} with their equivalent definitions in LinuxPTraceDefines_arm64sve.h
Commit 090306fc80 (August 2020) changed most of the arm64 SVE_PT*
macros, but apparently did not make the changes in the
NativeRegisterContextLinux_arm64.* files (or those files were pulled
over from someplace else after that commit). This change replaces the
macros in NativeRegisterContextLinux_arm64.cpp with the replacement
definitions in LinuxPTraceDefines_arm64sve.h. It also includes
LinuxPTraceDefines_arm64sve.h in NativeRegisterContextLinux_arm64.h.

Differential Revision: https://reviews.llvm.org/D104826
2021-06-30 09:26:20 -07:00
thomasraoux 0298f2cfb1 [mlir] Fix wrong type in WmmaConstantOpToNVVMLowering
InsertElement takes a scalar integer attribute, not an array of integers.

Differential Revision: https://reviews.llvm.org/D105174
2021-06-30 09:10:02 -07:00
thomasraoux 4392841949 [mlir][VectorToGPU] Support converting vector.broadcast to MMA op
Differential Revision: https://reviews.llvm.org/D105175
2021-06-30 09:08:55 -07:00
LLVM GN Syncbot 0596f7d828 [gn build] Port 0c96a92d86 2021-06-30 15:57:43 +00:00
Jeremy Morse 4955544162 [LiveDebugValues][InstrRef][1/2] Recover more clobbered variable locations
In various circumstances, when we clobber a register there may be
alternative locations that the value is live in. The classic example would
be a value loaded from the stack, and then clobbered: the value is still
available on the stack. InstrRefBasedLDV was coping with this at block
starts where it's forced to pick a location, however it wasn't searching
for alternative locations when values were clobbered.

This patch notifies the "Transfer Tracker" object when clobbers occur, and
it's able to find alternatives and issue DBG_VALUEs for that location. See:
the added test.

Differential Revision: https://reviews.llvm.org/D88405
2021-06-30 16:56:25 +01:00
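The idea in the LiveDebugValues commit above, picking an alternative location when the current one is clobbered, can be sketched with a small tracker. This is a toy model and not InstrRefBasedLDV: the class, its methods, and the location names `$rax` and `stack[8]` are all hypothetical.

```python
class ToyTracker:
    """Tracks which locations hold each value and where each variable lives."""

    def __init__(self):
        self.holders = {}    # value id -> list of locations holding that value
        self.variables = {}  # variable name -> (value id, current location)
        self.emitted = []    # DBG_VALUE-like records: (variable, location or None)

    def define(self, value, loc):
        self.holders.setdefault(value, []).append(loc)

    def bind(self, variable, value, loc):
        self.variables[variable] = (value, loc)
        self.emitted.append((variable, loc))

    def clobber(self, loc):
        # The location no longer holds any tracked value.
        for locs in self.holders.values():
            if loc in locs:
                locs.remove(loc)
        # Variables living in the clobbered location move to an alternative
        # location of the same value, or become unavailable (None).
        for variable, (value, current) in list(self.variables.items()):
            if current == loc:
                alternatives = self.holders.get(value, [])
                new_loc = alternatives[0] if alternatives else None
                self.variables[variable] = (value, new_loc)
                self.emitted.append((variable, new_loc))


tracker = ToyTracker()
tracker.define("v1", "stack[8]")    # the value lives on the stack
tracker.define("v1", "$rax")        # and is loaded into a register
tracker.bind("x", "v1", "$rax")     # variable x is described by the register
tracker.clobber("$rax")             # register is overwritten, stack copy remains
```

After the clobber, `x` is re-described by the stack slot instead of being dropped, which is the "classic example" from the message.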
Joseph Huber ecabc6684f [OpenMP] Change analysis remarks to not emit on cold functions
The remarks will trigger on some functions that are marked cold, such as the
`__muldc3` intrinsic functions. Change the remarks to avoid these functions.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D105196
2021-06-30 11:54:24 -04:00