Commit Graph

369083 Commits

Author SHA1 Message Date
Luqman Aden 0778cad9f3 Fix style warnings. 2020-10-14 19:34:31 -07:00
Luqman Aden 8b70d527d7 [LLD] Set alignment as part of Characteristics in TLS table.
Differential Revision: https://reviews.llvm.org/D88637
2020-10-14 19:34:31 -07:00
Tony b3a38bc2dc [AMDGPU] Correct typos in SIMemoryLegalizer.cpp comments 2020-10-15 02:07:56 +00:00
Petr Hosek 220de1f32a Revert "[CMake] Avoid accidental C++ standard library dependency in sanitizers"
This reverts commit 287c318690 which broke
sanitizer tests that use C++ standard library.
2020-10-14 18:44:09 -07:00
Petr Hosek 287c318690 [CMake] Avoid accidental C++ standard library dependency in sanitizers
While sanitizers don't use C++ standard library, we could still end
up accidentally including or linking it just by the virtue of using
the C++ compiler. Pass -nostdinc++ and -nostdlib++ to avoid these
accidental dependencies.

Differential Revision: https://reviews.llvm.org/D88922
2020-10-14 18:26:56 -07:00
Richard Smith f7f2e4261a PR47805: Use a single object for a function parameter in the caller and
callee in constant evaluation.

We previously made a deep copy of function parameters of class type when
passing them, resulting in the destructor for the parameter applying to
the original argument value, ignoring any modifications made in the
function body. This also meant that the 'this' pointer of the function
parameter could be observed changing between the caller and the callee.

This change completely reimplements how we model function parameters
during constant evaluation. We now model them roughly as if they were
variables living in the caller, albeit with an artificially reduced
scope that covers only the duration of the function call, instead of
modeling them as temporaries in the caller that we partially "reparent"
into the callee at the point of the call. This brings some minor
diagnostic improvements, as well as significantly reduced stack usage
during constant evaluation.
2020-10-14 17:43:51 -07:00
Kazushi (Jam) Marukawa 94c18d91d2 [VE] Add vector load/store instructions
Add vector registers and vector load/store instructions.  Add
regression tests for vector load/store instructions too.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D89183
2020-10-15 09:26:55 +09:00
Dave Lee 4cb4db11ee Revert "[ASTImporter] Fix crash caused by unset AttributeSpellingListIndex"
This broke the GreenDragon build, due to the following error while running
TestImportBuiltinFileID:

```
Ignored/unknown shouldn't get here
UNREACHABLE executed at tools/clang/include/clang/Sema/AttrSpellingListIndex.inc:13!
```

See http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/24213/

This reverts commit 73c6beb2f7.
This reverts https://reviews.llvm.org/D89318
2020-10-14 17:21:56 -07:00
Kazushi (Jam) Marukawa 8e7b108e80 [VE] Change to expand SHL_PARTS/SRA_PARTS/SRL_PARTS
VE doesn't have SHL_PARTS/SRA_PARTS/SRL_PARTS instructions, so need
to expand them.  Add regression tests too.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D89396
2020-10-15 09:04:34 +09:00
Amara Emerson 78ccb0359d [AArch64][GlobalISel] Don't use explicit zero registers for compare results.
These cause problems for later optimizations, just using an unused vreg like
SelectionDAG generates better code in the end, and obviates the need for some
GISel specific flag optimizations.

Differential Revision: https://reviews.llvm.org/D89419
2020-10-14 16:49:33 -07:00
Reid Kleckner 8b6d1c0467 [ADT] Use alignas + sizeof for inline storage, NFC
AlignedCharArrayUnion is really only needed to handle the "union" case
when we need memory of suitable size and alignment for multiple types.
SmallVector only needs storage for one type, so use that directly.
2020-10-14 16:16:02 -07:00
Leonard Chan 8487bfd4e9 [clang][NFC] Change diagnostic to start with lowercase letter 2020-10-14 15:48:29 -07:00
Ben Hamilton e7b4feea8e [Format/ObjC] Add NS_SWIFT_NAME() and CF_SWIFT_NAME() to WhitespaceSensitiveMacros
The argument passed to the preprocessor macros `NS_SWIFT_NAME(x)` and
`CF_SWIFT_NAME(x)` is stringified before passing to
`__attribute__((swift_name("x")))`.

ClangFormat didn't know about this stringification, so its custom parser
tried to parse the argument(s) passed to the macro as if they were
normal function arguments.

That means ClangFormat currently incorrectly inserts whitespace
between `NS_SWIFT_NAME` arguments with colons and dots, so:

```
extern UIWindow *MainWindow(void) NS_SWIFT_NAME(getter:MyHelper.mainWindow());
```

becomes:

```
extern UIWindow *MainWindow(void) NS_SWIFT_NAME(getter : MyHelper.mainWindow());
```

which clang treats as a parser error:

```
error: 'swift_name' attribute has invalid identifier for context name [-Werror,-Wswift-name-attribute]
```

Thankfully, D82620 recently added the ability to treat specific macros
as "whitespace sensitive", meaning their arguments are implicitly
treated as strings (so whitespace is not added anywhere inside).

This diff adds `NS_SWIFT_NAME` and `CF_SWIFT_NAME` to
`WhitespaceSensitiveMacros` so their arguments are implicitly treated
as whitespace-sensitive.

Test Plan:
  New tests added. Ran tests with:
  % ninja FormatTests && ./tools/clang/unittests/Format/FormatTests

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D89425
2020-10-14 15:42:51 -06:00
Adrian Prantl 0ff9116b36 Register TargetCXXABI.def as a textual header 2020-10-14 14:20:39 -07:00
MaheshRavishankar de2568aab8 [mlir][Linalg] Rethink fusion of linalg ops with reshape ops.
The current fusion on tensors fuses reshape ops with generic ops by
linearizing the indexing maps of the fused tensor in the generic
op. This has some limitations
- It only works for static shapes
- The resulting indexing map has a linearization that would be
  potentially prevent fusion later on (for ex. tile + fuse).

Instead, try to fuse the reshape consumer (producer) with generic op
producer (consumer) by expanding the dimensionality of the generic op
when the reshape is expanding (folding).  This approach conflicts with
the linearization approach. The expansion method is used instead of
the linearization method.

Further refactoring that changes the fusion on tensors to be a
collection of patterns.

Differential Revision: https://reviews.llvm.org/D89002
2020-10-14 13:50:31 -07:00
Benjamin Kramer 633f9fcb82 Make header self-contained. NFC. 2020-10-14 22:03:19 +02:00
Duncan P. N. Exon Smith d758f79e5d clang/Basic: Replace ContentCache::getBuffer with Optional semantics
Remove `ContentCache::getBuffer`, which always returned a
dereferenceable `MemoryBuffer*` and had a `bool*Invalid` out parameter,
and replace it with:

- `ContentCache::getBufferOrNone`, which returns
  `Optional<MemoryBufferRef>`. This is the new API that consumers should
  use. Later it could be renamed to `getBuffer`, but intentionally using
  a different name to root out any unexpected callers.
- `ContentCache::getBufferPointer`, which returns `MemoryBuffer*` with
  "optional" semantics. This is `private` to avoid growing callers and
  `SourceManager` has temporarily been made a `friend` to access it.
  Later paches will update the transitive callers to not need a raw
  pointer, and eventually this will be deleted.

No functionality change intended here.

Differential Revision: https://reviews.llvm.org/D89348
2020-10-14 15:55:18 -04:00
Snehasish Kumar 24bf6ff4e0 [llvm] Update default cutoff threshold for machine function splitter.
Based on internal testing at Google we found that setting the profile
summary cutoff threshold to 999950 yields the best results in terms of
itlb and icache metrics (as observed on Intel CPUs).

*default* = Split out code if no profile count available for block
*size-%*  = The fraction of bytes split out of .text and .text.hot
*itlb*    = Misses per kilo instructions (MPKI) for itlb
*icache*  = Misses per kilo instructions (MPKI) for L1 icache

Search1

| cutoff  | size-%  | itlb      | icache  |
|---------|---------|-----------|---------|
| default | 42.5861 | 0.0822151 | 2.46363 |
|  999999 | 44.9350 | 0.0767194 | 2.44416 |
|  999950 | 50.0660 |  0.075744 |  2.4091 |
|  999500 | 56.9158 |  0.082564 |  2.4188 |
|  995000 | 63.8625 | 0.0814927 | 2.42832 |
|  990000 | 71.7314 |  0.106906 | 2.57785 |

Search2

| cutoff  | size-% | itlb     | icache  |
|---------|--------|----------|---------|
| default | 2.8845 | 0.626712 | 4.73245 |
|  999999 | 3.3291 | 0.602309 | 4.70045 |
|  999950 | 3.8577 | 0.587842 | 4.71632 |
|  999500 | 4.4170 |  0.63577 | 4.68351 |
|  995000 | 5.1020 | 0.657969 | 4.82272 |
|  990000 | 5.7153 | 0.719122 | 5.39496 |

Differential Revision: https://reviews.llvm.org/D89085
2020-10-14 12:48:10 -07:00
Sean Silva dd378739d7 [mlir] Fix some style comments from D89268
That change was a pure move, so split out the stylistic changes into
this patch.

Differential Revision: https://reviews.llvm.org/D89272
2020-10-14 12:39:16 -07:00
Sean Silva 9a14cb53cb [mlir][bufferize] Rename BufferAssignment* to Bufferize*
Part of the refactor discussed in:
https://llvm.discourse.group/t/what-is-the-strategy-for-tensor-memref-conversion-bufferization/1938/17

Differential Revision: https://reviews.llvm.org/D89271
2020-10-14 12:39:16 -07:00
Sean Silva 1cca0f323e [mlir] Refactor code out of BufferPlacement.cpp
Now BufferPlacement.cpp doesn't depend on Bufferize.h.

Part of the refactor discussed in:
https://llvm.discourse.group/t/what-is-the-strategy-for-tensor-memref-conversion-bufferization/1938/17

Differential Revision: https://reviews.llvm.org/D89268
2020-10-14 12:39:16 -07:00
Sean Silva 6b30fb7653 [mlir] Rename ShapeTypeConversion to ShapeBufferize
Once we have tensor_to_memref ops suitable for type materializations,
this pass can be split into a generic type conversion pattern.

Part of the refactor discussed in:
https://llvm.discourse.group/t/what-is-the-strategy-for-tensor-memref-conversion-bufferization/1938/17

Differential Revision: https://reviews.llvm.org/D89258
2020-10-14 12:39:16 -07:00
Sean Silva 9ca97cde85 [mlir] Linalg refactor for using "bufferize" terminology.
Part of the refactor discussed in:
https://llvm.discourse.group/t/what-is-the-strategy-for-tensor-memref-conversion-bufferization/1938/17

Differential Revision: https://reviews.llvm.org/D89261
2020-10-14 12:39:15 -07:00
Leonard Chan 683b308c07 [clang] Add -fc++-abi= flag for specifying which C++ ABI to use
This implements the flag proposed in RFC http://lists.llvm.org/pipermail/cfe-dev/2020-August/066437.html.

The goal is to add a way to override the default target C++ ABI through
a compiler flag. This makes it easier to test and transition between different
C++ ABIs through compile flags rather than build flags.

In this patch:
- Store `-fc++-abi=` in a LangOpt. This isn't stored in a
  CodeGenOpt because there are instances outside of codegen where Clang
  needs to know what the ABI is (particularly through
  ASTContext::createCXXABI), and we should be able to override the
  target default if the flag is provided at that point.
- Expose the existing ABIs in TargetCXXABI as values that can be passed
  through this flag.
  - Create a .def file for these ABIs to make it easier to check flag
    values.
  - Add an error for diagnosing bad ABI flag values.

Differential Revision: https://reviews.llvm.org/D85802
2020-10-14 12:31:21 -07:00
Snehasish Kumar 77638a5343 [llvm] Set the default for -bbsections-cold-text-prefix to .text.split.
After using this for a while, we find that it is generally useful to
have it set to .text.split. by default, removing the need for an
additional -mllvm option.

Differential Revision: https://reviews.llvm.org/D88997
2020-10-14 12:16:36 -07:00
Guozhi Wei adfb541501 [MBP] Add whole chain to BlockFilterSet instead of individual BB
Currently we add individual BB to BlockFilterSet if its frequency satisfies

LoopFreq / Freq <= LoopToColdBlockRatio

LoopFreq is edge frequency from outside to loop header.
LoopToColdBlockRatio is a command line parameter.

It doesn't make sense since we always layout whole chain, not individual BBs.

It may also cause a tricky problem. Sometimes it is possible that the LoopFreq
of an inner loop is smaller than LoopFreq of outer loop. So a BB can be in
BlockFilterSet of inner loop, but not in BlockFilterSet of outer loop,
like .cold in the test case. So it is added to the chain of inner loop. When
work on the outer loop, .cold is not added to BlockFilterSet, so the edge to
successor .problem is not counted in UnscheduledPredecessors of .problem chain.
But other blocks in the inner loop are added BlockFilterSet, so the whole inner
loop chain can be layout, and markChainSuccessors is called to decrease
UnscheduledPredecessors of following chains. markChainSuccessors calls
markBlockSuccessors for every BB, even it is not in BlockFilterSet, like .cold,
so .problem chain's UnscheduledPredecessors is decreased, but this edge was not
counted on in fillWorkLists, so .problem chain's UnscheduledPredecessors
becomes 0 when it still has an unscheduled predecessor .pred! And it causes
problems in following various successor BB selection algorithms.

Differential Revision: https://reviews.llvm.org/D89088
2020-10-14 11:55:10 -07:00
Pavel Labath a1ab2b773b [lldb] More memory allocation test fixes
XFAIL nodefaultlib.cpp on darwin - the test does not pass there

XFAIL TestGdbRemoteMemoryAllocation on windows - memory is allocated
with incorrect permissions
2020-10-14 20:43:47 +02:00
Andrzej Warzynski 42e89ab2a6 [flang] Fix CMake bug in the definition of flang-new
Recent patch that improved Flang's compatibility with respect to how LLVM
dynamic libraries should be linked (and specified in CMake recipes),
introduced a bug in the definition of `flang-new`:
  * https://reviews.llvm.org/D87893
More specifically, `add_flang_tool` does not support the
`LINK_COMPONENTS` CMake argument.  Instead, one should set
`LLVM_LINK_COMPONENTS` before calling `add_flang_tool`.

This patch reverts the change for `flang-new` from
https://reviews.llvm.org/D87893, and instead:
  * sets `LLVM_LINK_COMPONENTS`
  * calls `clang_target_link_libraries` to add Clang dependencies

Differential Revision: https://reviews.llvm.org/D89403
2020-10-14 19:24:10 +01:00
Justin Lebar e9ac1869a8
Preserve param alignment in NVPTXLowerArgs pass.
NVPTXLowerArgs works as follows.

  * Create a regular alloca with alignment identical to arg.
  * Copy arg from param space (and ASC'ing it from generic AS first) to
    the alloca (it's still in generic AS).
  * Replace loads of arg with loads of alloca.

The bug here is that we did not preserve the arg's alignment when
loading from the alloca.

The impact of this bug is that sometimes param loads would be lowered as
a series of u8 loads, because we're incorrectly assuming everything has
alignment 1.

Differential Revision: https://reviews.llvm.org/D89404
2020-10-14 11:15:30 -07:00
rdzhabarov 008c0ea6a4 [DDR] Introduce implicit equality check for the source pattern operands with the same name.
This CL allows user to specify the same name for the operands in the source pattern which implicitly enforces equality on operands with the same name.
E.g., Pat<(OpA $a, $b, $a) ... > would create a matching rule for checking equality for the first and the last operands. Equality of the operands is enforced at any depth, e.g., OpA ($a, $b, OpB($a, $c, OpC ($a))).

Example usage: Pat<(Reshape $arg0, (Shape $arg0)), (replaceWithValue $arg0)>

Note, this feature only covers operands but not attributes.
Current use cases are based on the operand equality and explicitly add the constraint into the pattern. Attribute equality will be worked out on the different CL.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D89254
2020-10-14 11:05:13 -07:00
Michał Górny ff30bff136 [lldb] [Process/FreeBSDRemote] Support YMM reg via PT_*XSTATE
Add a framework for reading/writing extended register sets via
PT_GETXSTATE/PT_GETXSTATE_INFO/PT_SETXSTATE, and use it to support
YMM0..YMM15.  The code is prepared to handle arbitrary XSAVE extensions,
including correct offset handling.

This fixes Shell/Register/*ymm* tests.

Differential Revision: https://reviews.llvm.org/D89193
2020-10-14 19:56:46 +02:00
Krzysztof Parzyszek 670cd3c6e3 [Hexagon] Generate better splat code on v62+ 2020-10-14 12:55:20 -05:00
Jacques Pienaar f4ad76deb8 [mlir] More changes to avoid args now inserted.NFC
Migrates a bit more from the old/to be deprecated form.
2020-10-14 10:47:45 -07:00
Christopher Di Bella 18432bea76 [Driver]: fix compiler-rt path when printing libgcc for baremetal
clang --target arm-none-eabi --print-libgcc-file-name --rtlib=compiler-rt
used to print `/path/to/lib/clang/version/lib/libclang_rt.builtins-arm.a`
but should print `/path/to/lib/clang/version/lib/baremetal/libclang_rt.builtins-arm.a`.
Similarly, --target armv7m-none-eabi should print libclang_rt.builtins-armv7m.a
This matches the compiler-rt file name used at link time in the
baremetal driver.

Reviewed By: manojgupta

Differential Revision: https://reviews.llvm.org/D89327
2020-10-14 10:29:35 -07:00
Craig Topper 2949baec3c [X86] Add test case to demonstrate a Log2_32_Ceil that can just be Log2_32 in SimplifySetCC ctpop combine.
This combine can look through (trunc (ctpop X)). When doing this
it tries to make sure the trunc doesn't lose any information
from the ctpop. It does this by checking that the truncated type
has more bits that Log2_32_Ceil of the ctpop type. The Ceil is
unnecessary and pessimizes non-power of 2 types.

For example, ctpop of i256 requires 9 bits to represent the max
value of 256. But ctpop of i255 only requires 8 bits to represent
the max result of 255. Log2_32_Ceil of 256 and 255 both return 8
while Log2_32 returns 8 for 256 and 7 for 255.
2020-10-14 10:22:51 -07:00
Simon Pilgrim 60ba9233d1 Revert rG25a97c3a43d7 - "[InstCombine] visitCallInst - retain undefs in vector funnel shift amounts"
This reverts commit 25a97c3a43.

We have other constant folds that fold undef funnel shift amounts to 0 - so we need to be consistent.

If we end up with regressions where we lose a splat shift amount pattern we'll have to investigate other canonicalizations, but matchFunnelShift currently protects us from that.
2020-10-14 18:14:37 +01:00
Konstantin Zhuravlyov 3fdf3b1539 AMDGPU: Update AMDHSA code object version handling
Differential Revision: https://reviews.llvm.org/D89076
2020-10-14 13:04:27 -04:00
Matt Arsenault 6a9484f4bf InstCombine: Fix losing load properties in copy-constant-to-alloca
Preserve the alignment and metadata. Atomic loads are skipped for
this, but pass along the properties for consistency.
2020-10-14 12:55:25 -04:00
Matt Arsenault 6da31fa4a6 InstCombine: Fix infinite loop in copy-constant-to-alloca transform
This was broken by 16295d521e, when
instructions started being handled and not just constant
expressions. This was re-inserting an equivalent bitcast to the
original memcpy operand, which made a non-functional IR change on
every iteration.

This also fixes a secondary problem where it was inserting
addrspacecasts which may not have been legal (i.e. it changed the
source address space). Start visiting all pointer users and fail out
if we can't process them. Also start handling the relevant memory
intrinsic users. These cases can be dealt with by running
InferAddressSpaces separately.
2020-10-14 12:55:25 -04:00
Louis Dionne 0728b67b27 [libc++] Mark two tests as unsupported in C++03
This was dropped when I split the tests into individual source files
to make sure they would actually run (in 2908eb20ba).
2020-10-14 12:42:11 -04:00
Florian Hahn 93f6c6b79c Recommit "[VPlan] Use VPValue def for VPMemoryInstructionRecipe."
This reverts the revert commit 710aceb645
and includes a fix for a memsan failure.

Original message:

    This patch turns VPMemoryInstructionRecipe into a VPValue and uses it
    during VPlan construction and codegeneration instead of the plain IR
    reference where possible.
2020-10-14 17:41:23 +01:00
Simon Pilgrim b967b9a711 [CodeGen] Move x86 specific ms intrinsic tests into x86 target subfolder. NFCI. 2020-10-14 17:37:26 +01:00
Mark Schimmel 8e570abf10 Polly - specify address space when creating a pointer to a vector type
Polly incorrectly dropped the address space specified for a load instruction when it vectorized the code.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D88907
2020-10-14 11:17:15 -05:00
Kadir Cetinkaya fc2fb60bab
[clangd] clang-format TweakTests, NFC 2020-10-14 18:14:27 +02:00
Irina Dobrescu 65b9b9aa50 Add Allocate Clause to MLIR Parallel Operation Definition
Differential Revision: https://reviews.llvm.org/D87684
2020-10-14 17:13:48 +01:00
Louis Dionne 4212533961 [libc++] Use ADDITIONAL_COMPILE_FLAGS instead of #define for _LIBCPP_DEBUG 2020-10-14 12:02:37 -04:00
Louis Dionne 2908eb20ba [libc++] Split off debug tests that were missed by ce1365f8f7 into test/libcxx
Also, some tests had multiple death tests in them, so split them into
separate tests instead. The second death test would obviously never
get run, because the first one would kill the program before.
2020-10-14 12:02:37 -04:00
jasonliu f85bcc21dd [AIX] Turn -fdata-sections on by default in Clang
Summary:

This patch does the following:
1. Make InitTargetOptionsFromCodeGenFlags() accepts Triple as a
 parameter, because some options' default value is triple dependant.
2. DataSections is turned on by default on AIX for llc.
3. Test cases change accordingly because of the default behaviour change.
4. Clang Driver passes in -fdata-sections by default on AIX.

Reviewed By: MaskRay, DiggerLin

Differential Revision: https://reviews.llvm.org/D88737
2020-10-14 15:58:31 +00:00
Simon Pilgrim 89657b3a3b [InstCombine] narrowRotate - canonicalize to OR(SHL,LSHR). NFCI.
Match the canonicalization code that was added to matchFunnelShift at rG02295e6d1a15
2020-10-14 16:45:00 +01:00
Mircea Trofin c8fcffe775 [NFC][MC] Use MCRegister in Machine{Sink|Pipeliner}.cpp
Differential Revision: https://reviews.llvm.org/D89328
2020-10-14 08:42:17 -07:00