Commit Graph

143434 Commits

Author SHA1 Message Date
Craig Topper 0b50fa9945 [FaultsMaps][llvm-objdump] Move FaultMapParser to Object/. Remove CodeGen dependency from llvm-objdump
FaultsMapParser lived in CodeGen and was forcing llvm-objdump to
link CodeGen and everything CodeGen depends on.

This was previously attempted in r240364 to fix a link failure.
The CodeGen dependency was independently added to fix the same
link failure, and that ended up being kept.

Removing the dependency seems like the correct layering for
llvm-objdump.

Reviewed By: MaskRay, jhenderson

Differential Revision: https://reviews.llvm.org/D95414
2021-01-27 10:39:59 -08:00
Craig Topper 04570e98c8 [RISCV] Group the legal vector types into lists we can iterator over in the RISCVISelLowering constructor
Remove the RISCVVMVTs namespace because I don't think it provides
a lot of value. If we change the mappings we'd likely have to add
or remove things from the list anyway.

Add a wrapper around addRegisterClass that can determine the
register class from the fixed size of the type.

Reviewed By: frasercrmck, rogfer01

Differential Revision: https://reviews.llvm.org/D95491
2021-01-27 10:20:12 -08:00
Florian Hahn 28410d17f5
[LoopUtils] Pass SCEVExpander instead SE to addRuntimeChecks.
This gives the user control over which expander to use, which in turn
allows the user to decide what to do with the expanded instructions.

Used in D75980.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D94295
2021-01-27 17:36:19 +00:00
Paul C. Anagnostopoulos f3449ed607 [TableGen] [DetailedRecords] Print record name that is null string as ""
Differential Revision: https://reviews.llvm.org/D95312

Add a test for the backend.
2021-01-27 10:41:46 -05:00
Simon Pilgrim 5ded5ab78f ExecutionDomainFix.cpp - use const refs in for-range loops. NFCI.
Avoid unnecessary copies. Reported by clang-tidy.
2021-01-27 15:39:32 +00:00
Simon Pilgrim 30829a27ca [Support] CommandLine.cpp - Fix clang-tidy namespace comment warnings. NFCI.
Ensure namespace braces have the correct comment with them
2021-01-27 15:39:31 +00:00
Simon Pilgrim b3718eee0e [Support] Fix clang-tidy auto warnings. NFCI.
Use auto pointer/reference to fix llvm-qualified-auto remarks.
2021-01-27 15:39:31 +00:00
Roman Lebedev 51a25846c1
[CodeGen] SafeStack: preserve DominatorTree if it is avaliable
While this is mostly NFC right now, because only ARM happens
to run this pass with DomTree available before it,
and required after it, more backends will be affected once
the SimplifyCFG's switch for domtree preservation is flipped,
and DwarfEHPrepare also preserves the domtree.
2021-01-27 18:32:35 +03:00
Roman Lebedev 4de3bdd65f
[NFC] StackProtector: be consistent and to initialize DominatorTreeWrapperPass
We already ask for it, so it might be good to ensure that it is
actually initialized before us. Doesn't seem to matter in practice though.
2021-01-27 18:32:35 +03:00
Jeremy Morse ef0dcb5063 [DWARF] Create subprogram's DIE in DISubprogram's unit
This is a fix for PR48790. Over in D70350, subprogram DIEs were permitted
to be shared between CUs. However, the creation of a subprogram DIE can be
triggered early, from other CUs. The subprogram definition is then created
in one CU, and when the function is actually emitted children are attached
to the subprogram that expect to be in another CU. This breaks internal CU
references in the children.

Fix this by redirecting the creation of subprogram DIEs in
getOrCreateContextDIE to the CU specified by it's DISubprogram definition.
This ensures that the subprogram DIE is always created in the correct CU.

Differential Revision: https://reviews.llvm.org/D94976
2021-01-27 12:36:14 +00:00
Mindong Chen 00fcc03687 [SCEV] Fix incorrect loop exit count analysis.
In computeLoadConstantCompareExitLimit, the addrec used to compute the
exit count should be from the loop which the exiting block belongs to.

Reviewed by: mkazantsev

Differential Revision: https://reviews.llvm.org/D92367
2021-01-27 19:36:05 +08:00
Sjoerd Meijer 48ecba350e [MachineLICM][MachineSink] Move SinkIntoLoop to MachineSink.
This moves SinkIntoLoop from MachineLICM to MachineSink. The motivation for
this work is that hoisting is a canonicalisation transformation, but we do not
really have a good story to sink instructions back if that is better, e.g. to
reduce live-ranges, register pressure and spilling. This has been discussed a
few times on the list, the latest thread is:

https://lists.llvm.org/pipermail/llvm-dev/2020-December/147184.html

There it was pointed out that we have the LoopSink IR pass, but that works on
IR, lacks register pressure informatiom, and is focused on profile guided
optimisations, and then we have MachineLICM and MachineSink that both perform
sinking. MachineLICM is more about hoisting and CSE'ing of hoisted
instructions. It also contained a very incomplete and disabled-by-default
SinkIntoLoop feature, which we now move to MachineSink.

Getting loop-sinking to do something useful is going to be at least a 3-step
approach:

1) This is just moving the code and is almost a NFC, but contains a bug fix.
This uses helper function `isLoopInvariant` that was factored out in D94082 and
added to MachineLoop.
2) A first functional change to make loop-sink a little bit less restrictive,
which it really is at the moment, is the change in D94308. This lets it do
more (alias) analysis using functions in MachineSink, making it a bit more
powerful. Nothing changes much: still off by default. But it shows that
MachineSink is a better home for this, and it starts using its functionality
like `hasStoreBetween`, and in the next step we can use `isProfitableToSinkTo`.
3) This is the going to be he interesting step: decision making when and how
many instructions to sink. This will be driven by the register pressure, and
deciding if reducing live-ranges and loop sinking will help in better
performance.
4) Once we are happy with 3), this should be enabled by default, that should be
the end goal of this exercise.

Differential Revision: https://reviews.llvm.org/D93694
2021-01-27 10:49:56 +00:00
David Green 0175cd00a1 [AArch64] Add vector saturating add intrinsic costs
This adds sadd.sat, uadd.sat, ssub.sat and usub.sat costs for AArch64,
similar to how they were recently added for ARM.

Differential Revision: https://reviews.llvm.org/D95292
2021-01-27 10:38:32 +00:00
Fraser Cormack 9a75a808c2 [RISCV] Fix a codegen crash in getSetCCResultType
This patch fixes some crashes coming from
`RISCVISelLowering::getSetCCResultType`, which would occasionally return
an EVT constructed from an invalid MVT, which has a null Type pointer.

The attached test shows this happening currently for some fixed-length
vectors, which hit this issue when the V extension was enabled, even
though they're not legal types under the V extension. The fix was also
pre-emptively extended to scalable vectors which can't be represented as
an MVT, even though a test case couldn't be found for them.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D95434
2021-01-27 10:22:54 +00:00
David Green 9e2768a3d9 [ARM] Add neon FP16 scalar_to_vector patterns.
This adds some simple fp16 scalar_to_vector patterns, preventing a
selection failure if this came up.

Differential Revision: https://reviews.llvm.org/D95427
2021-01-27 09:59:15 +00:00
Cassie Jones 40f6599c20 [AArch64][GlobalISel] Make G_SADDE and G_SSUBE legal
This makes G_SADDE and G_SSUBE legal in preparation for further work
legalizing overflowing operations. It's fine that they don't have an
instruction selector implementation yet, because G_UADDE and G_USUBE are
already legal on AArch64 without an instruction selector implementation. This
completes the set of G_[SU]{ADD,SUB}[EO] operations on AArch64.

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D95325
2021-01-27 04:36:17 -05:00
Alexey Bader f96767368f Fix an error about implicit fallthrough during self build - new tag for ittapi.
A fix has been implemented in the ittap repo to fix an error about implicit fallthrough in a switch that was occurring during self build.
A new tag has been created for that fix. This is to update the tag.

Reviewed By: bader

Differential Revision: https://reviews.llvm.org/D95462

Patch by Zahira Ammarguellat.
2021-01-27 08:55:52 +03:00
Kazu Hirata 657f5b9743 [MemorySSA] Use ListSeparator (NFC) 2021-01-26 20:00:18 -08:00
Kazu Hirata 6bde085366 [AMDGPU] Forward-declare TargetRegisterClass (NFC)
AMDGPUInstructionSelector.h needs TargetRegisterClass but relies on a
forward declaration of TargetRegisterClass in InstructionSelector.h.
This patch adds a forward declaration right in
AMDGPUInstructionSelector.h.

While we are at it, this patch removes the one in
InstructionSelector.h, where it is unnecessary.
2021-01-26 20:00:16 -08:00
Petr Hosek bb9eb19829 Support for instrumenting only selected files or functions
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.

Differential Revision: https://reviews.llvm.org/D94820
2021-01-26 17:13:34 -08:00
Nico Weber f3c9687a4f llvm-lib: Pull error printing code out of two functions
Slightly changes the output in error code, but no behavior change in
normal use. This is for preparation for using these two functions
elsewhere.
2021-01-26 19:13:30 -05:00
Duncan P. N. Exon Smith 2f721476d1 Frontend: Simplify handling of non-seeking streams in CompilerInstance, NFC
Add a new `raw_pwrite_ostream` variant, `buffer_unique_ostream`, which
is like `buffer_ostream` but with unique ownership of the stream it's
wrapping. Use this in CompilerInstance to simplify the ownership of
non-seeking output streams, avoiding logic sprawled around to deal with
them specially.

This also simplifies future work to encapsulate output files in a
different class.

Differential Revision: https://reviews.llvm.org/D93260
2021-01-26 15:20:43 -08:00
Jessica Paquette f36007e811 [GlobalISel] Implement computeKnownBits for G_SEXT_INREG
Just use the existing `Known.sextInReg` implementation.

- Update KnownBitsTest.cpp.
- Update combine-redundant-and.mir for a more concrete example.

Differential Revision: https://reviews.llvm.org/D95484
2021-01-26 15:01:38 -08:00
Adrian Prantl 0554541b44 Salvage debug info for function arguments in coro-split funclets.
This patch improves the availability for variables stored in the
coroutine frame by emitting an alloca to hold the pointer to the frame
object and rewriting dbg.declare intrinsics to point inside the frame
object using salvaged DIExpressions. Finally, a new alloca is created
in the funclet to hold the FramePtr pointer to ensure that it is
available throughout the entire function at -O0.

This path also effectively reverts D90772. The testcase updates
highlight nicely how every removed CHECK for a dbg.value is preceded
by a new CHECK for a dbg.declare.

Thanks to JunMa, Yifeng, and Bruno for their thoughtful reviews!

Differential Revision: https://reviews.llvm.org/D93497

rdar://71866936
2021-01-26 15:01:26 -08:00
Zhuojia Shen 8cef45517e [ARM] Fix STRT/STRHT/STRBT input/output operands.
STRT, STRHT, and STRBT are store instructions and their source register
$Rt should be treated as an input operand instead of an output operand.
This should fix things (e.g., liveness tracking in LivePhysRegs) if
these instructions were used in CodeGen.

Differential Revision: https://reviews.llvm.org/D95074
2021-01-26 14:00:58 -08:00
Bjorn Pettersson a9bd3d37bd [NewPM] Add ExtraVectorizerPasses support
As it looks like NewPM generally is using SimpleLoopUnswitch
instead of LoopUnswitch, this patch also use SimpleLoopUnswitch
in the ExtraVectorizerPasses sequence (compared with LegacyPM
which use the LoopUnswitch pass).

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D95457
2021-01-26 22:59:10 +01:00
Valery N Dmitriev 716b9dd0d8 [InstCombine] Preserve FMF for powi simplifications.
Differential Revision: https://reviews.llvm.org/D95455
2021-01-26 13:26:06 -08:00
Craig Topper 74784a5aa4 [X86] In shrinkAndImmediate, place the new constant into the topological sort.
Revert the change to use APInt::isSignedIntN from
5ff5cf8e05.

Its clear that the games we were playing to avoid the topological
sort aren't working. So just fix it once and for all.

Fixes PR48888.
2021-01-26 13:18:04 -08:00
Amara Emerson cbed865e1e [GlobalISel][IRTranslator] Ignore the llvm.experimental.noalias.scope.decl intrinsic.
These don't generate any code.
2021-01-26 13:04:11 -08:00
Haowei Wu 15313f64be [llvm-elfabi] Support ELF file that lacks .gnu.hash section
Before this change, when reading ELF file, elfabi determines number of
entries in .dynsym by reading the .gnu.hash section. This change makes
elfabi read section headers directly first. This change allows elfabi
works on ELF files which do not have .gnu.hash sections.

Differential Revision: https://reviews.llvm.org/D93362
2021-01-26 12:31:52 -08:00
Fangrui Song 34b60d8a56 Add -fbinutils-version= to gate ELF features on the specified binutils version
There are two use cases.

Assembler
We have accrued some code gated on MCAsmInfo::useIntegratedAssembler().  Some
features are supported by latest GNU as, but we have to use
MCAsmInfo::useIntegratedAs() because the newer versions have not been widely
adopted (e.g. SHF_LINK_ORDER 'o' and 'unique' linkage in 2.35, --compress-debug-sections= in 2.26).

Linker
We want to use features supported only by LLD or very new GNU ld, or don't want
to work around older GNU ld. We currently can't represent that "we don't care
about old GNU ld".  You can find such workarounds in a few other places, e.g.
Mips/MipsAsmprinter.cpp PowerPC/PPCTOCRegDeps.cpp X86/X86MCInstrLower.cpp
AArch64 TLS workaround for R_AARCH64_TLSLD_MOVW_DTPREL_* (PR ld/18276),
R_AARCH64_TLSLE_LDST8_TPREL_LO12 (https://bugs.llvm.org/show_bug.cgi?id=36727 https://sourceware.org/bugzilla/show_bug.cgi?id=22969)

Mixed SHF_LINK_ORDER and non-SHF_LINK_ORDER components (supported by LLD in D84001;
GNU ld feature request https://sourceware.org/bugzilla/show_bug.cgi?id=16833 may take a while before available).
This feature allows to garbage collect some unused sections (e.g. fragmented .gcc_except_table).

This patch adds `-fbinutils-version=` to clang and `-binutils-version` to llc.
It changes one codegen place in SHF_MERGE to demonstrate its usage.
`-fbinutils-version=2.35` means the produced object file does not care about GNU
ld<2.35 compatibility. When `-fno-integrated-as` is specified, the produced
assembly can be consumed by GNU as>=2.35, but older versions may not work.

`-fbinutils-version=none` means that we can use all ELF features, regardless of
GNU as/ld support.

Both clang and llc need `parseBinutilsVersion`. Such command line parsing is
usually implemented in `llvm/lib/CodeGen/CommandFlags.cpp` (LLVMCodeGen),
however, ClangCodeGen does not depend on LLVMCodeGen. So I add
`parseBinutilsVersion` to `llvm/lib/Target/TargetMachine.cpp` (LLVMTarget).

Differential Revision: https://reviews.llvm.org/D85474
2021-01-26 12:28:23 -08:00
Petr Hosek 1e634f3952 Revert "Support for instrumenting only selected files or functions"
This reverts commit 4edf35f11a because
the test fails on Windows bots.
2021-01-26 12:25:28 -08:00
Austin Kerbow 2291bd137d [AMDGPU] Update subtarget features for new target ID support
Support for XNACK and SRAMECC is not static on some GPUs. We must be able
to differentiate between different scenarios for these dynamic subtarget
features.

The possible settings are:

- Unsupported: The GPU has no support for XNACK/SRAMECC.
- Any: Preference is unspecified. Use conservative settings that can run anywhere.
- Off: Request support for XNACK/SRAMECC Off
- On: Request support for XNACK/SRAMECC On

GCNSubtarget will track the four options based on the following criteria. If
the subtarget does not support XNACK/SRAMECC we say the setting is
"Unsupported". If no subtarget features for XNACK/SRAMECC are requested we
must support "Any" mode. If the subtarget features XNACK/SRAMECC exist in the
feature string when initializing the subtarget, the settings are "On/Off".

The defaults are updated to be conservatively correct, meaning if no setting
for XNACK or SRAMECC is explicitly requested, defaults will be used which
generate code that can be run anywhere. This corresponds to the "Any" setting.

Differential Revision: https://reviews.llvm.org/D85882
2021-01-26 11:25:51 -08:00
Petr Hosek 4edf35f11a Support for instrumenting only selected files or functions
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.

Differential Revision: https://reviews.llvm.org/D94820
2021-01-26 11:11:39 -08:00
Adhemerval Zanella dad55c2218 [ARM] [ELF] Fix ARMMaterializeGV for Indirect calls
Recent shouldAssumeDSOLocal changes (introduced by 961f31d8ad)
do not take in consideration the relocation model anymore.  The ARM
fast-isel pass uses the function return to set whether a global symbol
is loaded indirectly or not, and without the expected information
llvm now generates an extra load for following code:

```
$ cat test.ll
@__asan_option_detect_stack_use_after_return = external global i32
define dso_local i32 @main(i32 %argc, i8** %argv) #0 {
entry:
  %0 = load i32, i32* @__asan_option_detect_stack_use_after_return,
align 4
  %1 = icmp ne i32 %0, 0
  br i1 %1, label %2, label %3

2:
  ret i32 0

3:
  ret i32 1
}

attributes #0 = { noinline optnone }

$ lcc test.ll -o -
[...]
main:
        .fnstart
[...]
        movw    r0, :lower16:__asan_option_detect_stack_use_after_return
        movt    r0, :upper16:__asan_option_detect_stack_use_after_return
        ldr     r0, [r0]
        ldr     r0, [r0]
        cmp     r0, #0
[...]
```

And without 'optnone' it produces:
```
[...]
main:
        .fnstart
[...]
        movw    r0, :lower16:__asan_option_detect_stack_use_after_return
        movt    r0, :upper16:__asan_option_detect_stack_use_after_return
        ldr     r0, [r0]
        clz     r0, r0
        lsr     r0, r0, #5
        bx      lr

[...]
```

This triggered a lot of invalid memory access in sanitizers for
arm-linux-gnueabihf.  I checked this patch both a stage1 built with
gcc and a stage2 bootstrap and it fixes all the Linux sanitizers
issues.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D95379
2021-01-26 15:57:55 -03:00
Craig Topper f9d7f77267 [RISCV] Have customLegalizeToWOp truncate to the original type instead of i32 now that we use it for i8/i16 as well.
239cfbccb0 add support for legalizing
i8/i16 UDIV/UREM/SDIV to use *W instructions. So we need to truncate
to i8/i16 if we're legalizing one of those.
2021-01-26 10:50:03 -08:00
Matt Arsenault 5f9707b796 AMDGPU: Fix redundant FP spilling/assert in some functions
If a function has stack objects, and a call, we require an FP. If we
did not initially have any stack objects, and only introduced them
during PrologEpilogInserter for CSR VGPR spills, SILowerSGPRSpills
would end up spilling the FP register as if it were a normal
register. This would result in an assert in a debug build, or
redundant handling of the FP register in a release build.

Try to predict that we will have an FP later, although this is ugly.
2021-01-26 13:01:45 -05:00
Matt Arsenault 92d1195b5f AMDGPU: Add assertion to determineCalleeSaves
Make sure this isn't getting called multiple times. I was surprised we
were modifying the function here, which I think is a bit questionable.
2021-01-26 13:01:45 -05:00
Sanjay Patel 09b1c56366 [LoopUtils] do not initialize Cmp predicate unnecessarily; NFC
The switch must set the predicate correctly; anything else
should lead to unreachable/assert.

I'm trying to fix FMF propagation here and the callers,
so this is a preliminary cleanup.
2021-01-26 11:22:51 -05:00
Simon Pilgrim f82cff31d3 [AMDGPU] HSAMD::fromString - replace std::string arg with StringRef. NFCI.
Removes an unnecessary chain of StringRef -> std::string -> StringRef conversions
2021-01-26 16:09:39 +00:00
Simon Pilgrim ee3da8958a [AMDGPU] Fix null-dereference static analysis warnings. NFCI.
Avoid repeated calls to isZeroValue() and check for a null pointer before dereferencing a dyn_cast<>.
2021-01-26 15:43:59 +00:00
Matt Arsenault 551a69e418 AMDGPU: Clear IsSSA property in SIFormMemoryClauses
Fixes verifier error when writing MIR testcases
2021-01-26 10:40:41 -05:00
Florian Hahn 1272f16d14
[LoopUnswitch] Avoid partially unswitching too aggressively.
This patch adds additional checks to avoid partial unswitching
in cases where it won't be profitable, e.g. because the path directly
exits the loop anyways.
2021-01-26 15:18:41 +00:00
Mirko Brkusanin 608ac62540 [AMDGPU] Fix use of HasModifiers in VopProfile
HasModifiers should be true if at least one modifier is used.
This should make the use of this field bit more consistent.

Differential Revision: https://reviews.llvm.org/D94795
2021-01-26 15:21:11 +01:00
Florian Hahn 35b3989a30
[Passes] Run peeling as part of simple/full loop unrolling.
Loop peeling removes conditions from loop bodies that become invariant
after a small number of iterations. When triggered, this leads to fewer
compares and possibly PHIs in loop bodies, enabling further
optimizations. The current cost-model of loop peeling should be quite
conservative/safe, i.e. only peel if a condition in the loop becomes
known after peeling.

For example, see PR47671, where loop peeling enables vectorization by
removing a PHI the vectorizer does not understand. Granted, the
loop-vectorizer could also be taught about constant PHIs, but loop
peeling is likely to enable other optimizations as well.

This has an impact on quite a few benchmarks from
MultiSource/SPEC2000/SPEC2006 on X86 with -O3 -flto, for example

    Same hash: 186 (filtered out)
    Remaining: 51
    Metric: loop-vectorize.LoopsVectorized

    Program                                        base   patch  diff
     test-suite...ve-susan/automotive-susan.test     8.00   9.00 12.5%
     test-suite...nal/skidmarks10/skidmarks.test    35.00  31.00 -11.4%
     test-suite...lications/sqlite3/sqlite3.test    41.00  43.00  4.9%
     test-suite...s/ASC_Sequoia/AMGmk/AMGmk.test    25.00  26.00  4.0%
     test-suite...006/450.soplex/450.soplex.test    88.00  89.00  1.1%
     test-suite...TimberWolfMC/timberwolfmc.test   120.00 119.00 -0.8%
     test-suite.../CINT2006/403.gcc/403.gcc.test   215.00 216.00  0.5%
     test-suite...006/447.dealII/447.dealII.test   957.00 958.00  0.1%
     test-suite...ternal/HMMER/hmmcalibrate.test    75.00  75.00  0.0%

    Same hash: 186 (filtered out)
    Remaining: 51
    Metric: loop-vectorize.LoopsAnalyzed

    Program                                        base    patch   diff
     test-suite...ks/Prolangs-C/agrep/agrep.test   440.00  434.00  -1.4%
     test-suite...nal/skidmarks10/skidmarks.test   312.00  308.00  -1.3%
     test-suite...marks/7zip/7zip-benchmark.test   6399.00 6323.00 -1.2%
     test-suite...lications/minisat/minisat.test   134.00  135.00   0.7%
     test-suite...rks/FreeBench/pifft/pifft.test   295.00  297.00   0.7%
     test-suite...TimberWolfMC/timberwolfmc.test   1879.00 1869.00 -0.5%
     test-suite...pplications/treecc/treecc.test   689.00  691.00   0.3%
     test-suite...T2000/300.twolf/300.twolf.test   1593.00 1597.00  0.3%
     test-suite.../Benchmarks/Bullet/bullet.test   1394.00 1392.00 -0.1%
     test-suite...ications/JM/ldecod/ldecod.test   1431.00 1429.00 -0.1%
     test-suite...6/464.h264ref/464.h264ref.test   2229.00 2230.00  0.0%
     test-suite...lications/sqlite3/sqlite3.test   2590.00 2589.00 -0.0%
     test-suite...ications/JM/lencod/lencod.test   2732.00 2733.00  0.0%
     test-suite...006/453.povray/453.povray.test   3395.00 3394.00 -0.0%

Note the -11% regression in number of loops vectorized for skidmarks. I
suspect this corresponds to the fact that those loops are gone now (see
the reduction in number of loops analyzed by LV).

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D88471
2021-01-26 13:52:30 +00:00
Lang Hames 4dc110a4b8 [ORC] Attempt to auto-claim responsibility for weak defs in ObjectLinkingLayer.
Compilers may insert new definitions during compilation, E.g. EH personality
function pointers, or named constant pool entries. This commit causes
ObjectLinkingLayer to attempt to claim responsibility for all weak definitions
in objects as they're linked. This is always safe (first claimant for each
symbol is granted responsibility, subsequent claims are rejected without error)
and prevents compiler-injected symbols from being dead-stripped (which they
will be if they remain unclaimed by anyone).

This change was motivated by errors seen by an out-of-tree client while testing
eh-frame support in JITLink ELF/x86-64: IR containing exceptions didn't define
DW.ref.__gxx_personality_v0 (since it's added by CodeGen), and this caused
DW.ref.__gxx_personality_v0 to be dead-stripped leading to linker failures.

No test case yet: We won't have a way to test in-tree until we enable JITLink
for lli on Linux.
2021-01-27 00:09:40 +11:00
Lang Hames 476abdb562 [ORC] Fix debug logging message. 2021-01-26 23:52:44 +11:00
Lang Hames b3e0135a6f [JITLink][ELF/x86-64] When building PLT stub, use -4 offset for PCRel32.
This is required for ELF where PCRel32 doesn't implicitly subtract 4.

No test case yet: I haven't figured out a good way to test stub
generation -- this may required extensions to jitlink-check.
2021-01-26 23:51:58 +11:00
Dmitry Preobrazhensky 745064e36b [AMDGPU][MC] Refactored exp tgt handling
Summary:
- Separated tgt encoding from parsing;
- Separated tgt decoding from printing;
- Improved errors handling;
- Disabled leading zeroes in index. The following code is no longer accepted: exp pos00 v3, v2, v1, v0

Reviewers: arsenm, rampitec, foad

Differential Revision: https://reviews.llvm.org/D95216
2021-01-26 14:54:15 +03:00
Georgii Rymar d5e48f1347 [yaml2obj][obj2yaml] - Improve how we set/dump the sh_entsize field.
We already set the `sh_entsize` field in a single place
for all non-implicit sections.

This patch reorders the logic slightly and with it
we finally have the only one place where the `sh_entsize` is set.

obj2yaml will not dump the `EntSize` key for `SHT_DYNSYM/SHT_SYMTAB` sections anymore,
when the value of `sh_entsize` is equal to `sizeof(Elf_Sym)`

Note that this also seems revealed an issue in llvm-objcopy:
Previously yaml2obj set the `sh_entsize` for the `.symtab` section to 0x18,
now we it sets it for `SHT_SYMTAB` sections, i.e. by type.
But the `llvm-objcopy/ELF/only-keep-debug.test` has a `.symtab` section of type `SHT_STRTAB`,
and now yaml2obj sets the `sh_entsize` to 0 for it.
I had to update the corresponding check lines for `ES`, but the behavior of
`llvm-objcopy` should be fixed instead I think.
I've added a TODO and a comment.

Differential revision: https://reviews.llvm.org/D95364
2021-01-26 13:33:02 +03:00