Commit Graph

389032 Commits

Author SHA1 Message Date
Rafael Auler a33687ec58 [RuntimeDyld] Add allowStubs/allowZeroSyms
This patch introduces functionality used by BOLT when
re-linking the final binary. It adds to MemoryManager a new member
function allowStubAllocation to control whether this MemoryManager
supports increasing code size with stubs or not. Since BOLT can
rewrite some files in-place, it needs to avoid stub insertion done
by the linker. This patch also introduces allowsZeroSymbols to the
JITSymbolResolver class, enabling us to finish a link successfully
even when some symbols resolve to the value zero. When rewriting a
binary, sometimes we do need to resolve a target to zero in case
the input binary calls address zero and we want to be bug
compatible. We also expose reassignSectionAddress as it is used by
BOLT.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D97898
2021-05-18 11:35:27 -07:00
peter klausler 8cd199b85f [flang] Accept OPEN(ACCESS='APPEND') legacy extension even without warnings enabled
My earlier patch to accept ACCESS='APPEND' only worked when warnings
were enabled; fix it.

Differential Revision: https://reviews.llvm.org/D102653
2021-05-18 11:32:52 -07:00
Nikita Popov e81334a754 [LICM] Remove MaybePromotable set (PR50367)
The MaybePromotable set keeps track of loads/stores for which
promotion was not attempted yet. Normally, any load/stores that
are promoted in the current iteration will be removed from this
set, because they naturally MustAlias with the promoted value.
However, if the source program has UB with metadata claiming that
a store is NoAlias, while it is actually MustAlias, and multiple
different pointers are promoted in the same iteration, it can
happen that a store is removed that is still in the MaybePromotable
set, causing a use-after-free.

While this could be fixed by explicitly invalidating values in
MaybePromotable in the LoopPromoter, I'm going with the more
radical option of dropping the set entirely here and check all
load/stores on each promotion iteration. As promotion, and especially
repeated promotion, are quite rare, this doesn't seem to have any
impact on compile-time.

Fixes https://bugs.llvm.org/show_bug.cgi?id=50367.
2021-05-18 20:26:01 +02:00
peter klausler 5e1421b22f [flang] Implement MATMUL in the runtime
Define an API for the transformational intrinsic function MATMUL,
implement it, and add some basic unit tests.  The large number of
possible argument type combinations are covered by a set of
generalized templates that are instantiated for each valid
pair of possible argument types.

Places where BLAS-2/3 routines could be called for acceleration
are marked with TODOs.  Handling for other special cases (e.g.,
known-shape 3x3 matrices and vectors) are deferred.

Some minor tweaks were made to the recent related implementation
of DOT_PRODUCT to reflect lessons learned.

Differential Revision: https://reviews.llvm.org/D102652
2021-05-18 10:59:52 -07:00
Fangrui Song 2919222d80 [Driver] Delete -mimplicit-it=
This is a GNU as and Clang cc1as option, not a GCC option.
Users should specify `-Wa,-mimplicit-it=` instead.

Note: mixing the -m option and the -Wa, option doesn't work
`-Wa,-mimplicit-it=never -mimplicit-it=always` =>
`clang (LLVM option parsing): for the --arm-implicit-it option: may only occur zero or one times!`

Reviewed By: nickdesaulniers, raj.khem

Differential Revision: https://reviews.llvm.org/D102568
2021-05-18 10:57:24 -07:00
Nico Weber b4ead2c37b [lld/mac] Correctly set nextdefsym
In LC_DYSYMTAB, private externs were still emitted as exported symbols instead
of as locals.

Fixes PR50373. See bug for details.

Differential Revision: https://reviews.llvm.org/D102662
2021-05-18 13:53:55 -04:00
Chris Lattner 855b42ddd0 [IntegerAttr] Add helpers for working with LLVM's APSInt type.
The FIRRTL dialect in CIRCT uses inherently signful types, and APSInt
is the best way to model that.  Add a couple of helpers that make it
easier to work with an IntegerAttr that carries a sign.

This follows the example of getZExt() and getSExt() which assert when
the underlying type of the attribute is unexpected.  In this case
we assert fail when the underlying type of the attribute is signless.

This is strictly additive, so it is NFC.  It is tested in the CIRCT
repo.

Differential Revision: https://reviews.llvm.org/D102701
2021-05-18 10:51:52 -07:00
Arthur Eubanks 5781f9a743 [NFC] Format PassesBindingsTests CMake like other unittests 2021-05-18 10:40:07 -07:00
Sanjay Patel 6d949a9c8f [InstCombine] restrict funnel shift match to avoid miscompile
As noted in the post-commit discussion for:
https://reviews.llvm.org/rGabd7529625a73f405e40a63dcc446c41d51a219e

...that change exposed a logic hole that allows a miscompile
if the shift amount could exceed the narrow width:
https://alive2.llvm.org/ce/z/-i_CiM
https://alive2.llvm.org/ce/z/NaYz28

The restriction isn't necessary for a rotate (same operand for
both shifts), so we should adjust the matching for the shift
value as a follow-up enhancement:
https://alive2.llvm.org/ce/z/ahuuQb
2021-05-18 13:32:07 -04:00
Arthur Eubanks 0b031eeefa [test] Speculative fix for bots (round 2)
Bot has error "Failed to create target from default triple: Unable to
find target for this triple (no targets are registered)", likely because
we only initialized the native target, not the registered target if it's
different.

https://lab.llvm.org/buildbot/#/builders/86/builds/13664
2021-05-18 10:26:28 -07:00
Arthur Eubanks 16cbc80e72 [gn build] Rename PassesBindingsTests and add it to unittests 2021-05-18 10:26:00 -07:00
Sanjay Patel e81f09f8f8 [InstCombine] add tests for funnel shift miscompile; NFC 2021-05-18 13:18:39 -04:00
Chris Lattner 3043be9d2d [IR] Add a Location to BlockArgument.
This adds the ability to specify a location when creating BlockArguments.
Notably Value::getLoc() will return this correctly, which makes diagnostics
more precise (e.g. the example in test-legalize-type-conversion.mlir).

This is currently optional to avoid breaking any existing code - if
absent, the BlockArgument defaults to using the location of its enclosing
operation (preserving existing behavior).

The bulk of this change is plumbing location tracking through the parser
and printer to make sure it can round trip (in -mlir-print-debuginfo
mode).  This is complete for generic operations, but requires manual
adoption for custom ops.

I added support for function-like ops to round trip their argument
locations - they print correctly, but when parsing the locations are
dropped on the floor.  I intend to fix this, but it will require more
invasive plumbing through "function_like_impl" stuff so I think it
best to split it out to its own patch.

Differential Revision: https://reviews.llvm.org/D102567
2021-05-18 10:18:04 -07:00
Arthur Eubanks c3530e75ce Revert "[test] Speculative fix for bots"
This reverts commit 5c291482ec.

unittests/Passes/CMakeFiles/PassesBindingsTests.dir/PassBuilderBindingsTest.cpp.o: In function `PassBuilderCTest::SetUp()':
PassBuilderBindingsTest.cpp:(.text._ZN16PassBuilderCTest5SetUpEv[_ZN16PassBuilderCTest5SetUpEv]+0x28): undefined reference to `LLVMInitializeARMTargetInfo'
2021-05-18 10:12:51 -07:00
Simon Pilgrim 99c0f16ea4 [X86] Use Skylake Server model for x86-64-v4 so we have full instruction coverage
The x86-64-v4 generic cpu arch supports AVX512BW/DQ/CD/VLX which isn't covered by the Haswell model, use the SkylakeServer model instead which is a lot closer to what the arch represents.

Differential Revision: https://reviews.llvm.org/D102553
2021-05-18 18:06:40 +01:00
Arthur Eubanks 5c291482ec [test] Speculative fix for bots
Bot has error "Failed to create target from default triple: Unable to
find target for this triple (no targets are registered)", likely because
we only initialized the native target, not the registered target if it's
different.

https://lab.llvm.org/buildbot/#/builders/86/builds/13664
2021-05-18 10:01:38 -07:00
Arthur Eubanks 85f8698eb9 [gn build] Add target for PassesBindingsTest 2021-05-18 10:01:19 -07:00
Jessica Paquette 58c57e1b5f [AArch64][GlobalISel] Prefer mov for s32->s64 G_ZEXT
We can use an ORRWrs (mov) + SUBREG_TO_REG rather than a UBFX for G_ZEXT on
s32->s64.

This closer matches what SDAG does, and is likely more power efficient etc.

(Also fixed up arm64-rev.ll which had a fallback check line which was entirely
useless.)

Simple example: https://godbolt.org/z/h1jKKdx5c

Differential Revision: https://reviews.llvm.org/D102656
2021-05-18 10:00:00 -07:00
Roman Lebedev 75ea0abaae
[X86] AMD Zen 3: fix MULX modelling - don't forget about WriteIMulH (PR50387)
Otherwise lack thereof will be caught by a defensive check during
scheduling, and we'll crash.

I've literally never seen this syntax before..
2021-05-18 19:58:04 +03:00
Vinayaka Bandishti a3917d3670 [MLIR][Affine] Privatize certain escaping memrefs
During affine loop fusion, create private memrefs for escaping memrefs
too under the conditions that:
-- the source is not removed after fusion, and
-- the destination does not write to the memref.

This creates more fusion opportunities as illustrated in the test case.

Reviewed By: bondhugula, ayzhuang

Differential Revision: https://reviews.llvm.org/D102604
2021-05-18 22:23:02 +05:30
Aaron Ballman ccbac06a07 Speculatively fix failing tests from 6381664580
This was causing some Mac-specific build failures:
http://45.33.8.238/macm1/9739/step_7.txt
http://45.33.8.238/mac/31615/step_7.txt

As best I can tell with psychic debugging, the /Users/blah path to the
source file is being treated as a macro undef with the clang-cl driver.
This splits the filename off explicitly so hopefully the rest of the
command line arguments will be read properly.
2021-05-18 12:44:58 -04:00
Jessica Paquette 892497c806 [GlobalISel] Simplify G_ICMP to true/false when the result is known
Use existing KnownBits helpers from KnownBits.h to simplify G_ICMPs.

E.g.

x == x -> true
x != x -> false
load(x) > 1 -> true (when the load is known to be greater than 1)

And so on.

Differential Revision: https://reviews.llvm.org/D102542
2021-05-18 09:26:41 -07:00
Sergey Dmitriev 8998a8aa97 [clang-offload-bundler] Add sections and set section flags using one llvm-objcopy invocation
llvm-objcopy has been changed to support adding a section and updating section flags
in one run (D90438), so we can now change clang-offload-bundler to run llvm-objcopy
tool only once when creating fat object.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D102670
2021-05-18 08:44:41 -07:00
Lang Hames 9e5f3dd9db [ORC-RT] Add apply_tuple utility.
This is a substitute for std::apply, which we can't use until we move to c++17.

apply_tuple will be used in upcoming the upcoming wrapper-function utils code.
2021-05-18 08:44:15 -07:00
Lang Hames bd6c93c004 [ORC-RT] Add compiler abstraction header for the ORC runtime.
This header provides helper macros to insulate the rest of the ORC runtime from
compiler specifics.
2021-05-18 08:44:15 -07:00
Lang Hames c42580bf20 [ORC] Don't try to obtain a ref to a non-existent buffer. 2021-05-18 08:44:15 -07:00
Aaron Ballman 6381664580 Introduce SYCL 2020 mode
Currently, we have support for SYCL 1.2.1 (also known as SYCL 2017).
This patch introduces the start of support for SYCL 2020 mode, which is
the latest SYCL standard available at (https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html).
This sets the default SYCL to be 2020 in the driver, and introduces the
notion of a "default" version (set to 2020) when cc1 is in SYCL mode
but there was no explicit -sycl-std= specified on the command line.
2021-05-18 10:34:14 -04:00
Tim Northover ba1509da7b Recommit X86: support Swift Async context
This adds support to the X86 backend for the newly committed swiftasync
function parameter. If such a (pointer) parameter is present it gets stored
into an augmented frame record (populated in IR, but generally containing
enhanced backtrace for coroutines using lots of tail calls back and forth).

The context frame is identical to AArch64 (primarily so that unwinders etc
don't get extra complexity). Specfically, the new frame record is [AsyncCtx,
%rbp, ReturnAddr], and its presence is signalled by bit 60 of the stored %rbp
being set to 1. %rbp still points to the frame pointer in memory for backwards
compatibility (only partial on x86, but OTOH the weird AsyncCtx before the rest
of the record is because of x86).

Recommited with a fix for unwind info when i386 pc-rel thunks are
adjacent to a prologue.
2021-05-18 15:19:05 +01:00
Jinsong Ji 7d6449322e [DebugInfo][test] Check specific func name to ignore codegen differences
We use `CHECK-LABEL: define` to divide input stream into functions,
this works well on most platforms.

But there are cases that some platforms (eg: AIX) may have different
codegen , especially for global constructor and descructors.

On AIX, the codegen will have two more functions: __dtor_b,
__finalize_b, which will fail the test.

The fix is to use specific function name so that we can safely ignore
those unrelated codegen differences.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102654
2021-05-18 14:03:27 +00:00
Raphael Isemann 82f248d234 [ADT] Remove StringRef::withNullAsEmpty
A long time ago LLDB wanted to start using StringRef instead of
C-Strings/ConstString but was blocked by the StringRef(const char *) ctor
asserting that the C-string isn't a nullptr. To workaround this, D24697
introduced a special function called withNullAsEmpty and that's what LLDB (and
only LLDB) started to use to build StringRefs from C-strings.

A bit later it seems that withNullAsEmpty was declared too awkward to use and
instead the assert in the StringRef constructor got removed (see D24904). The
rest of LLDB was then converted to StringRef by just calling the now perfectly
usable implicit constructor.

However, it seems that the original approach with withNullAsEmpty was never
touched again since then and now just exists as a function in StringRef that
is only used in a few places in LLDB.

I removed the few uses of withNullAsEmpty in D102597 and this patch removes
the function itself. Calling the implicit StringRef(const char *) constructor
is the preferred way of doing this today.

Reviewed By: lattner

Differential Revision: https://reviews.llvm.org/D102599
2021-05-18 15:45:09 +02:00
Roman Lebedev 3cc3960766
[X86] AMD Zen 3: cap LoopMicroOpBufferSize to workaround PR50384 (quadratic IndVars runtime)
While i would like to keep the right value here,
i would also like to be able to actually compile
e.g. vanilla test-suite.

256 is a pretty random guess, it should be pretty good enough
for serious loops, but small enough to result in tolerant
compile times for certain edge cases.

https://bugs.llvm.org/show_bug.cgi?id=50384
2021-05-18 15:56:57 +03:00
Kristina Bessonova 9f4f012c10 [libcxx][test] Attempt to make debug mode tests more bulletproof
The problem with debug mode tests is that it isn't known which particular
_LIBCPP_ASSERT causes the test to exit, and as shown by
https://reviews.llvm.org/D100029 and 2908eb20ba it might be not the
expected one.

The patch adds TEST_LIBCPP_ASSERT_FAILURE macro that allows checking
_LIBCPP_ASSERT message to ensure we caught an expected failure.

Reviewed By: Quuxplusone, ldionne

Differential Revision: https://reviews.llvm.org/D100595
2021-05-18 14:52:34 +02:00
David Sherwood 38e2359a11 [NFC] Removed unused VFInfo comparison operator 2021-05-18 13:32:24 +01:00
Martin Storsjö dd7575ba44 [LLD] [MinGW] Pass the canExitEarly parameter through properly
The MinGW driver passed a hardcoded true to this parameter
since 6f4e255219, but when the MinGW driver got the
canExitEarly parameter for consistency in b11386f9be, this
call was missed so it wasn't passed on properly.
2021-05-18 15:09:07 +03:00
Simon Pilgrim 560b709abe [X86][AVX] Cleanup AVX2 vector integer truncation costs
Noticed while investigating PR50364, the truncation costs for v4i64->v4i16/v4i8 and v8i32->v8i8 were way too optimistic for a shuffle sequence that usually matches the AVX1 codegen (they matched AVX512 numbers which have actual truncation instructions!).
2021-05-18 13:07:29 +01:00
Nico Weber 095c520fb4 [lld/mac] Propagate -(un)exported_symbol(s_list) to privateExtern in Driver
That way, it's done only once instead of every time shouldExportSymbol() is
called.

Possibly a bit faster:

    % ministat at_main at_symtodo
    x at_main
    + at_symtodo
        N           Min           Max        Median           Avg        Stddev
    x  30     3.9732189      4.114846      4.024621     4.0304692   0.037022865
    +  30       3.93766     4.0510042     3.9973931      3.991469   0.028472565
    Difference at 95.0% confidence
            -0.0390002 +/- 0.0170714
            -0.967635% +/- 0.423559%
            (Student's t, pooled s = 0.0330256)

In other runs with n=30 it makes no perf difference, so maybe it's just noise.
But being able to quickly and conveniently answer "is this symbol exported?"
is useful for fixing PR50373 and for implementing -dead_strip, so this seems
like a good change regardless.

No behavior change.

Differential Revision: https://reviews.llvm.org/D102661
2021-05-18 07:42:58 -04:00
Florian Hahn fff84d3a2e
[LV] Add test which sinks a load a across an aliasing store. 2021-05-18 12:25:57 +01:00
Simon Pilgrim f79f04ac0c [CostModel][X86] Add scalar truncation cost checks
Ensure these are all zero
2021-05-18 12:24:59 +01:00
Simon Pilgrim 07fea1ef2d [CostModel][X86] Add missing check prefixes from cast.ll
We have checks for these but no actual RUNs were using them
2021-05-18 12:20:19 +01:00
Simon Pilgrim 81606ab856 Revert rGd70cbd1ce9b426f2c7e0e0f900769bbcbb300a95 "[AMDGPU] Regenerate wave32.ll tests"
This is failing on buildbots but not locally - not sure why
2021-05-18 12:15:38 +01:00
Simon Pilgrim d70cbd1ce9 [AMDGPU] Regenerate wave32.ll tests
Keep the manual GFX10DEFWAVE checks for VGPRBlocks
2021-05-18 12:03:31 +01:00
Alexey Bader 2ab513cd3e [SYCL] Enable `opencl_global_[host,device]` attributes for SYCL
Differential Revision: https://reviews.llvm.org/D100396
2021-05-18 10:27:35 +03:00
Nicolas Vasilache f8dbd61074 [mlir][Linalg] Drop spuriously long matmul_column_major benchmark 2021-05-18 10:07:19 +00:00
Ole Strohm 642d2f000b [OpenCL] Fix initialization of __constant constructors without arguments
This fixes the initialization of objects in the __constant
address space that occurs when declaring the object.

Fixes part of PR42566

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D102248
2021-05-18 10:59:53 +01:00
James Henderson 20e1577d13 [lld] Add a feature for each lld variant when use_lld is called
This allows tests to detect whether to run or not, dependent on which
LLD version is required for the test.

Reviewed by: thopre

Differential Revision: https://reviews.llvm.org/D101997
2021-05-18 10:51:27 +01:00
James Henderson a1e6565855 [lit] Stop using PATH to lookup clang/lld/lldb unless requested
This patch stops lit from looking on the PATH for clang, lld and other
users of use_llvm_tool (currently only the debuginfo-tests) unless the
call explicitly requests to opt into using the PATH. When not opting in,
tests will only look in the build directory.

See the mailing list thread starting from
https://lists.llvm.org/pipermail/llvm-dev/2021-May/150421.html.

See the review for details of why decisions were made about when still
to use the PATH.

Reviewed by: thopre

Differential Revision: https://reviews.llvm.org/D102630
2021-05-18 10:43:33 +01:00
Adrian Kuegel fa765a0944 [mlir] Add folder for complex.ReOp and complex.ImOp.
Now that complex constants are supported, we can also fold.

Differential Revision: https://reviews.llvm.org/D102616
2021-05-18 11:27:23 +02:00
Jay Foad 092a3ce569 [AMDGPU] Fix typo in comment 2021-05-18 10:15:49 +01:00
Benjamin Kramer 3f3642a763 [CodeGen] Avoid unused variable warning in Release builds. NFCI. 2021-05-18 11:09:12 +02:00
Neal (nealsid) e89b60fcfc Update MSVC version number in preprocessor check
Passing template parameter packs to std::map doesn't work in VS 2017/2019, so this updates the preprocessor version check to use an alternate version in VS2019, as well.

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D102260
2021-05-18 10:04:39 +01:00