Commit Graph

356397 Commits

Author SHA1 Message Date
Uday Bondhugula 0f6999af88 [MLIR] Update linalg.conv lowering to use affine load in the absence of padding
Update linalg to affine lowering for convop to use affine load for input
whenever there is no padding. It had always been using std.loads because
max in index functions (needed for non-zero padding if not materializing
zeros) couldn't be represented in the non-zero padding cases.

In the future, the non-zero padding case could also be made to use
affine - either by materializing or using affine.execute_region. The
latter approach will not impact the scf/std output obtained after
lowering out affine.

Differential Revision: https://reviews.llvm.org/D81191
2020-06-05 12:28:30 +05:30
Jonas Devlieghere 70ad03d938 Revert "Set the captures on a CXXRecordDecl representing a lambda closure type"
This reverts commit c13dd74e31.
2020-06-04 23:45:36 -07:00
Jonas Devlieghere df53f09056 Revert "PR46209: properly determine whether a copy assignment operator is"
This reverts commit c57f8a3a20.
2020-06-04 23:45:36 -07:00
Jan Kratochvil 7fc6d36d48 [nfc] [lldb] clang-format #include files order 2020-06-05 08:28:06 +02:00
Fangrui Song 78702dec3b [Driver] Migrate some -f/-fno options to use OptInFFlag and OptOutFFlag
Also assign OptInFFlag and OptOutFFlag to f_Group.
2020-06-04 23:25:19 -07:00
Max Kazantsev 80cb25cbd5 Revert "[InstCombine][NFC] Factor out constant check"
This reverts commit 9bdb918890.

This refactoring proved to not be useful.
2020-06-05 12:00:44 +07:00
Xing GUO 929edd8bd2 [DWARFYAML][debug_aranges] Replace InitialLength with Format and Length.
This patch addresses the comment in [D80972](https://reviews.llvm.org/D80972#inline-744217).

Before this patch, the initial length field of .debug_aranges section should be declared as:

```
## 32-bit DWARF
debug_aranges:
  - Length:
      TotalLength: 0x20
    Version: 2
    ...

## 64-bit DWARF
debug_aranges:
  - Length:
      TotalLength:   0xffffffff
      TotalLength64: 0x20
    Version: 2
    ...
```

After this patch:

```
## 32-bit DWARF
debug_aranges:
  - [[Format:  DWARF32]] ## Optional
    Length:  0x20
    Version: 2
    ...

## 64-bit DWARF
debug_aranges:
  - Format:  DWARF64
    Length:  0x20
    Version: 2
```

Current implementation of generating DWARF64 .debug_aranges section is buggy. A follow-up patch will improve it and add test cases for DWARF64.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D81063
2020-06-05 12:16:44 +08:00
Vitaly Buka 3c32af58f6 [StackSafety,NFC] Ignore callee declarations
It's going to fail FunctionInfo lookup anyway.
2020-06-04 20:55:50 -07:00
Petr Hosek d76e62fdb7 [AddressSanitizer] Don't use weak linkage for __{start,stop}_asan_globals
It should not be necessary to use weak linkage for these. Doing so
implies interposablity and thus PIC generates indirections and
dynamic relocations, which are unnecessary and suboptimal. Aside
from this, ASan instrumentation never introduces GOT indirection
relocations where there were none before--only new absolute relocs
in RELRO sections for metadata, which are less problematic for
special linkage situations that take pains to avoid GOT generation.

Patch By: mcgrathr

Differential Revision: https://reviews.llvm.org/D80605
2020-06-04 20:18:35 -07:00
Fangrui Song e5158b5273 [Driver] Migrate some -f/-fno options to use OptInFFlag and OptOutFFlag 2020-06-04 19:33:14 -07:00
Richard Smith c57f8a3a20 PR46209: properly determine whether a copy assignment operator is
trivial.

We previously took a shortcut by assuming that if a subobject had a
trivial copy assignment operator (with a few side-conditions), we would
always invoke it, and could avoid going through overload resolution.
That turns out to not be correct in the presenve of ref-qualifiers (and
also won't be the case for copy-assignments with requires-clauses
either). Use the same logic for lazy declaration of copy-assignments
that we use for all other special member functions.
2020-06-04 19:19:01 -07:00
Richard Smith c13dd74e31 Set the captures on a CXXRecordDecl representing a lambda closure type
before marking it complete.

No functionality change intended.
2020-06-04 19:19:01 -07:00
Philip Reames 4c735439fd [Statepoint] Migrate a few tests to gc-live bundle format and fix assert
The assert was missed in 0e7c7705, migrating the test revealed the problem.
2020-06-04 18:15:58 -07:00
Vedant Kumar 198762680e [LiveDebugValues] Cache LexicalScopes::getMachineBasicBlocks, NFCI
Summary:
Cache the results from getMachineBasicBlocks in LexicalScopes to speed
up UserValueScopes::dominates queries.  This replaces the caching done
in UserValueScopes. Compared to the old caching method, this reduces
memory traffic when a VarLoc is copied (e.g. when a VarLocMap grows),
and enables caching across basic blocks.

When compiling sqlite 3.5.7 (CTMark version), this patch reduces the
number of calls to getMachineBasicBlocks from 10,207 to 1,093. I also
measured a small compile-time reduction (~ 0.1% of total wall time, on
average, on my machine).

As a drive-by, I made the DebugLoc in UserValueScopes a const reference
to cut down on MetadataTracking traffic.

Reviewers: jmorse, Orlando, aprantl, nikic

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80957
2020-06-04 16:58:45 -07:00
River Riddle c0cd1f1c5c [mlir] Refactor BoolAttr to be a special case of IntegerAttr
This simplifies a lot of handling of BoolAttr/IntegerAttr. For example, a lot of places currently have to handle both IntegerAttr and BoolAttr. In other places, a decision is made to pick one which can lead to surprising results for users. For example, DenseElementsAttr currently uses BoolAttr for i1 even if the user initialized it with an Array of i1 IntegerAttrs.

Differential Revision: https://reviews.llvm.org/D81047
2020-06-04 16:41:24 -07:00
Mircea Trofin fa42620afb [docs] Referenced llvm workflow in HowToAddABuilder
Reviewers: gkistanova, dblaikie

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81046
2020-06-04 16:39:11 -07:00
Dan Gohman 072192d54a [WebAssembly] Fix a testcase to be independent of the sysroot default
As a followup to D62922, add a sysroot command-line option to this test
to ensure that the output is independent of any default sysroot options,
and adjust the reactor test to be more consistent with the command test.
2020-06-04 16:04:44 -07:00
Nicolas Vasilache 3463d9835b [mlir][Linalg] Add a hoistViewAllocOps helper function
This revision adds a helper function to hoist alloc/dealloc pairs and
alloca op out of immediately enclosing scf::ForOp if both conditions are true:
   1. all operands are defined outside the loop.
   2. all uses are ViewLikeOp or DeallocOp.

This is now considered Linalg-specific and will be generalized on a per-need basis.

Differential Revision: https://reviews.llvm.org/D81152
2020-06-04 18:59:03 -04:00
Jan Korous a95c08db12 [Analyzer][NoUncountedMembersChecker] Fix crash for C structs
Fixes https://bugs.llvm.org/show_bug.cgi?id=46177
Fixes second bug reported in https://bugs.llvm.org/show_bug.cgi?id=46142
2020-06-04 15:57:19 -07:00
Philip Reames 3d40c75189 [Statepoint] Switch RS4GC to using gc-live bundle form
Now that we have an operand based form for the GC arguments to a statepoint intrinsic, update RS4GC to use it and update tests to reflect. This is pretty straight forward. I nearly landed without review, but figured a second set of eyes didn't hurt.

Differential Revision: https://reviews.llvm.org/D81121
2020-06-04 15:49:11 -07:00
Petr Hosek b16ed493dd [Fuchsia] Rely on linker switch rather than dead code ref for profile runtime
Follow the model used on Linux, where the clang driver passes the
linker a -u switch to force the profile runtime to be linked in,
rather than having every TU emit a dead function with a reference.

Differential Revision: https://reviews.llvm.org/D79835
2020-06-04 15:47:05 -07:00
Petr Hosek e1ab90001a Revert "[Fuchsia] Rely on linker switch rather than dead code ref for profile runtime"
This reverts commit d510542174 since
it broke several bots.
2020-06-04 15:44:10 -07:00
Julian Lettner 284934fbc1 Make linter happy 2020-06-04 15:14:48 -07:00
Vedant Kumar 24660ea11c [docs] HowToUpdateDebugInfo: Minor cleanups
- Change the reference to salvageDebugInfoOrUndef to salvageDebugInfo
  (in accordance with https://reviews.llvm.org/D78369).

- Reorganize a few sections in preparation for an upcoming change that
  attempts to specify rules for updating debug locations.

- Fix some intra-document links.

- Some spelling / wording fixes.
2020-06-04 14:56:01 -07:00
Yuanfang Chen f9ea86eaa1 [Docs] Add the entry for `Advanced builds` in UserGuide.rst
Also add a link to it from ThinLTO.rst.
2020-06-04 14:52:51 -07:00
Alexey Bataev 4e3d4622b1 Fix undefined behaviour when trying to deref nullptr. 2020-06-04 17:52:06 -04:00
Craig Topper 3ad8fbd205 [Reassociate] Teach ConvertShiftToMul to preserve nsw flag if the shift amount is not bitwidth - 1.
Multiply and shl have different signed overflow behavior in
some cases. But it looks like we should be ok as long as the
shift amount is less than bitwidth - 1.

Alive2: http://volta.cs.utah.edu:8080/z/MM4WZP

Differential Revision: https://reviews.llvm.org/D81189
2020-06-04 14:51:34 -07:00
Matt Arsenault 1657f0ebc2 AMDGPU: Fix overriding global FP atomic feature predicates
Global TableGen let override blocks are pretty dangerous and override
any local special cases. In this case, the broader HasFlatGlobalInsts
was overriding the more specific predicate for
FeatureAtomicFaddInsts. Make sure HasFlatGlobalInsts is implied by
FeatureAtomicFaddInsts, and make sure the right predicate is used.

One issue with independently setting the subtarget features on
incompatible targets is all of the encoding families do not define all
opcodes. This will hit an assert on gfx10 for example, since we set
the encoding independently based on the generation and not based on a
feature.
2020-06-04 17:50:38 -04:00
Matt Arsenault 651c36b508 AMDGPU: Select strict_fmul 2020-06-04 17:49:00 -04:00
Matt Arsenault 483d4daa5e AMDGPU: Select strict_fma
Like with strict_fadd, the legalization is scalarizing the v4f16 when
it should split.
2020-06-04 17:49:00 -04:00
Matt Arsenault ae26c064ce AMDGPU: Select strict_fadd 2020-06-04 17:49:00 -04:00
Diego Caballero 5c990d6994 [mlir] Add support for bf16 to StandardToLLVM conversion
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D81127
2020-06-04 14:36:36 -07:00
Matt Arsenault b71f574e7f AMDGPU: Add test for fdiv nofpexcept preservation
This logically belongs with 89d48ccabe,
but this order was needed to avoid regressions before adding
mayRaiseFPExceptions to relevant instructions.
2020-06-04 17:35:27 -04:00
Matt Arsenault d259668731 AMDGPU: Set mayRaiseFPException
This may be missing a few overrides to set it off still in some
special cases. Since the flags set during selection should now be
reliably preserved, this should not change codegen for non-strictfp
functions.
2020-06-04 17:35:27 -04:00
Sanjay Patel 192cb71836 [InstCombine] avoid crashing on select-shuffle detection
As mentioned in the post-commit comments of D81013 -
the mask check API has to assume the shuffle is
not length-changing, but we have not ruled that out
in this code. Use the ShuffleVectorInst call instead.
2020-06-04 17:27:14 -04:00
Petr Hosek d510542174 [Fuchsia] Rely on linker switch rather than dead code ref for profile runtime
Follow the model used on Linux, where the clang driver passes the
linker a -u switch to force the profile runtime to be linked in,
rather than having every TU emit a dead function with a reference.

Patch By: mcgrathr

Differential Revision: https://reviews.llvm.org/D79835
2020-06-04 14:25:19 -07:00
Shilei Tian a014fbbc21 [OpenMP] Improve D2D memcpy to use more efficient driver API
Summary:
In current implementation, D2D memcpy is first to copy data back to host and then
copy from host to device. This is very efficient if the device supports D2D
memcpy, like CUDA.

In this patch, D2D memcpy will first try to use native supported driver API. If
it fails, fall back to original way. It is worth noting that D2D memcpy in this
scenerio contains two ideas:
- Same devices: this is the D2D memcpy in the CUDA context.
- Different devices: this is the PeerToPeer memcpy in the CUDA context.
My implementation merges this two parts. It chooses the best API according to
the source device and destination device.

Reviewers: jdoerfert, AndreyChurbanov, grokos

Reviewed By: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D80649
2020-06-04 16:59:06 -04:00
Yaxun (Sam) Liu 263390d4f5 [CUDA][HIP] Fix implicit HD function resolution
recommit e03394c6a6 with fix

When implicit HD function calls a function in device compilation,
if one candidate is an implicit HD function, current resolution rule is:

D wins over HD and H
HD and H are equal

this caused regression when there is an otherwise worse D candidate

This patch changes that to

D, HD and H are all equal

The rationale is that we already know for host compilation there is already
a valid candidate in HD and H candidates that will not cause error. Allowing
HD and H gives us a fall back candidate that will not cause error. If D wins,
that means D has to be a better match otherwise, therefore D should also
be a valid candidate that will not cause error. In this way, we can guarantee
no regression.

Differential Revision: https://reviews.llvm.org/D80450
2020-06-04 16:54:52 -04:00
Matt Arsenault 54a8a8d509 AMDGPU: Fix using unencodable instructions in tests
There are a number of MIR tests using instructions on subtargets where
they don't really exist. These are some of the easy cases that don't
require splitting up test functions.
2020-06-04 16:50:19 -04:00
Matt Arsenault fe0d5121fa AMDGPU/GlobalISel: Fix making LDS FP atomics legal on SI/CI 2020-06-04 16:50:19 -04:00
Matt Arsenault 16acc12e1d AMDGPU/GlobalISel: Fix trying to use wave32 for gfx9 test 2020-06-04 16:50:19 -04:00
Alexey Bataev bd1c03d7b7 [OPENMP50]Codegen for inscan reductions in worksharing directives.
Summary:
Implemented codegen for reduction clauses with inscan modifiers in
worksharing constructs.

Emits the code for the directive with inscan reductions.
The code is the following:
```
size num_iters = <num_iters>;
<type> buffer[num_iters];
for (i: 0..<num_iters>) {
  <input phase>;
  buffer[i] = red;
}
for (int k = 0; k != ceil(log2(num_iters)); ++k)
for (size cnt = last_iter; cnt >= pow(2, k); --k)
  buffer[i] op= buffer[i-pow(2,k)];
for (0..<num_iters>) {
  red = InclusiveScan ? buffer[i] : buffer[i-1];
  <scan phase>;
}
```

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, arphaman, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79948
2020-06-04 16:29:33 -04:00
Thomas Lively a07c08f74f [WebAssembly] Lower llvm.debugtrap properly
Summary:
Unlike normal traps, debug traps are allowed to return and can have
additional instructions in the same basic block. Without explicit
backend support for debug traps, they are lowered in ISel as normal
traps. Since normal traps are lowered in the WebAssembly backend to
the UNREACHABLE instruction, which is a terminator, using debug traps
could lead to invalid MBBs when there are additional instructions
after the trap. This patch fixes the issue by lowering debug traps to
a new version of the UNREACHABLE instruction, DEBUG_UNREACHABLE, that
is not a terminator.

An alternative approach would have been to make UNREACHABLE not a
terminator, but that breaks a large number of tests. In particular, it
would require removing the traps inserted after noreturn calls to
@llvm.wasm.throw because otherwise the terminator throw would be
followed by a non-terminator UNREACHABLE and we would be back to
having invalid MBBs. Overall the approach in this patch seems simpler.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81055
2020-06-04 13:25:10 -07:00
Pete Steinfeld 1746c8ed26 [flang] Fixed crash on forward referenced `len` parameter
Summary:
Using a forward reference to define a `len` parameter causes a crash.
The underlying cause was that a previously declared type had an
erroneous expression for its `LEN` param value.  When this expression
was referenced to evaluate a subsequent expression, bad things happened.
I fixed this by putting in code to detect this case.

Reviewers: tskeith, klausler, DavidTruby

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80593
2020-06-04 13:12:11 -07:00
Huihui Zhang 42048ff972 [NFC] Move test vscale-factor-out-constant.ll to AArch64 sub-directory.
Vscale scalable vector is specific to AArch64 target.

Bring back 'uglygep' check.
2020-06-04 12:55:28 -07:00
Eric Schweitz baa12ddb6f [flang] Add the conversions for types.
Part of lowering is to convert the front-end types to their FIR dialect representations.  These conversions are done by here in the ConvertType module.

proactively update the code to conform better with LLVM coding conventions

Differential Revision: https://reviews.llvm.org/D81034
2020-06-04 12:54:50 -07:00
aartbik c19fae507e [mlir] [VectorOps] Add missing comments to CreateMaskOp lowering
Summary: Add missing comment to CreateMask. Fixed typo in ConstantMask comment.

Reviewers: nicolasvasilache, rriddle, reidtatge, ftynse

Reviewed By: ftynse

Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul

Tags: #mlir

Differential Revision: https://reviews.llvm.org/D81125
2020-06-04 12:50:47 -07:00
Florian Hahn 714e84be46 [SemaOverload] Use iterator_range to iterate over VectorTypes (NFC).
We can simplify the code a bit by using iterator_range instead of
plain iterators. Matrix type support here (added in 6f6e91d193)
already uses an iterator_range.

Reviewers: rjmccall, arphaman, jfb, Bigcheese

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D81138
2020-06-04 20:47:16 +01:00
Dmitri Gribenko a180d5409f AST Matchers test: use arrays instead of vectors
Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81180
2020-06-04 21:40:30 +02:00
Valentin Clement 3d9bb031d1 [flang] avoid GCC < 8 compiler failure after D80794
Summary:
Patch D80794 remove the custom flags for release build for flang.
This leads to build failure with GCC < 8. This patch add upperbound check
in order to avoid the -Werror=array-bounds to trigger a build failure.

```
/home/4vn/versioning/llvm-project/flang/lib/Decimal/big-radix-floating-point.h:183:29: error: array subscript is above array bounds [-Werror=array-bounds]
           digit_[j] = digit_[j + remove];
                       ~~~~~~^
/home/4vn/versioning/llvm-project/flang/lib/Decimal/big-radix-floating-point.h:183:29: error: array subscript is above array bounds [-Werror=array-bounds]
           digit_[j] = digit_[j + remove];
                       ~~~~~~^
/home/4vn/versioning/llvm-project/flang/lib/Decimal/big-radix-floating-point.h:183:29: error: array subscript is above array bounds [-Werror=array-bounds]
           digit_[j] = digit_[j + remove];
                       ~~~~~~^
/home/4vn/versioning/llvm-project/flang/lib/Decimal/big-radix-floating-point.h:183:29: error: array subscript is above array bounds [-Werror=array-bounds]
           digit_[j] = digit_[j + remove];
                       ~~~~~~^
/home/4vn/versioning/llvm-project/flang/lib/Decimal/big-radix-floating-point.h:183:29: error: array subscript is above array bounds [-Werror=array-bounds]
           digit_[j] = digit_[j + remove];
```

```
/home/4vn/versioning/llvm-project/flang/include/flang/Evaluate/integer.h:809:28: error: array subscript is above array bounds [-Werror=array-bounds]
               xy += product[to];
                     ~~~~~~~^
/home/4vn/versioning/llvm-project/flang/include/flang/Evaluate/integer.h:810:22: error: array subscript is above array bounds [-Werror=array-bounds]
               product[to] = xy & partMask;
               ~~~~~~~^
/home/4vn/versioning/llvm-project/flang/include/flang/Evaluate/integer.h:809:28: error: array subscript is above array bounds [-Werror=array-bounds]
               xy += product[to];
                     ~~~~~~~^
```

Reviewers: DavidTruby, sscalpone, jdoerfert

Reviewed By: DavidTruby

Subscribers: llvm-commits

Tags: #llvm, #flang

Differential Revision: https://reviews.llvm.org/D81179
2020-06-04 14:48:39 -04:00