This diff adds support for SHT_GROUP sections to llvm-objcopy.
Some sections are interrelated and comprise a group.
For example, a definition of an inline function might require,
in addition to the section containing its instructions,
a read-only data section containing literals referenced inside the function.
A section of the type SHT_GROUP contains the indices of the group members,
therefore, it needs to be updated whenever the indices change.
Similarly, the fields sh_link, sh_info should be recalculated as well.
[Resubmit r328012 with the proper handling of endianness]
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D43996
llvm-svn: 328143
Summary:
External functions appearing as indirect call targets could not be
found in the SymTab, and the value:counter record was represented,
in the text format, using an empty string for the name. This would
then cause a silent parsing error when reading.
This CL:
- adds explicit support for such functions
- fixes the places where we would not propagate errors when reading
- addresses a performance issue due to eager resorting of the SymTab.
Reviewers: xur, eraman, davidxl
Reviewed By: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44717
llvm-svn: 328132
With this patch, the "instruction dispatched" event now provides information
related to the number of microarchitectural registers used in each register
file. Similarly, the "instruction retired" event is now able to tell how may
registers are freed in each register file.
Currently, the BackendStatistics view is the only consumer of register
usage/pressure information. BackendStatistics uses that info to print out a few
general statistics (i.e. max number of mappings used; total mapping created).
Before this patch, the BackendStatistics was forced to query the Backend to
obtain the register pressure information.
This helps removes that dependency. Now views are completely independent from
the Backend. As a consequence, it should be easier to address PR36663 and
further modularize the pipeline.
Added a couple of test cases in the BtVer2 specific directory.
llvm-svn: 328129
Summary:
Instantiating def's and defm's needs to perform the following steps:
- for defm's, clone multiclass def prototypes and subsitute template args
- for def's and defm's, add subclass definitions, substituting template
args
- clone the record based on foreach loops and substitute loop iteration
variables
- override record variables based on the global 'let' stack
- resolve the record name (this should be simple, but unfortunately it's
not due to existing .td files relying on rather silly implementation
details)
- for def(m)s in multiclasses, add the unresolved record as a multiclass
prototype
- for top-level def(m)s, resolve all internal variable references and add
them to the record keeper and any active defsets
This change streamlines how we go through these steps, by having both
def's and defm's feed into a single addDef() method that handles foreach,
final resolve, and routing the record to the right place.
This happens to make foreach inside of multiclasses work, as the new
test case demonstrates. Previously, foreach inside multiclasses was not
forbidden by the parser, but it was de facto broken.
Another side effect is that the order of "instantiated from" notes in error
messages is reversed, as the modified test case shows. This is arguably
clearer, since the initial error message ends up pointing directly to
whatever triggered the error, and subsequent notes will point to increasingly
outer layers of multiclasses. This is consistent with how C++ compilers
report nested #includes and nested template instantiations.
Change-Id: Ica146d0db2bc133dd7ed88054371becf24320447
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D44478
llvm-svn: 328117
The pipeliner needs to remove instructions from the SlotIndexes
structure when they are deleted. Otherwise, the SlotIndexes map
has stale data, and an assert will occur when adding new
instructions.
This patch also changes the pipeliner to make the back-edge of
a loop carried dependence 1 cycle. The 1 cycle latency is added
to the anti-dependence that represents the back-edge. This
changes eliminates a couple of hacks added to the pipeliner to
handle the latency of the back-edge. It is needed to correctly
pipeline the test case for the sub-register elimination pass.
llvm-svn: 328113
This patch also includes extensive tests targeted at select and br+fcmp IR
inputs. A sequence of br+fcmp required support for FPR32 registers to be added
to RISCVInstrInfo::storeRegToStackSlot and
RISCVInstrInfo::loadRegFromStackSlot.
llvm-svn: 328104
Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
MemCpyOpt pass to cease using:
1) The old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific
alignments through the new API.
2) The old IRBuilder CreateMemCpy/CreateMemMove single-alignment APIs in favour of the new
API that allows setting source and destination alignments independently.
We also add a few tests to fill gaps in the testing of this pass.
Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278,
rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774,
rL324781, rL324784, rL324955, rL324960, rL325816, rL327398, rL327421 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.
Reference
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.htmlhttp://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
llvm-svn: 328097
Summary:
If the operands of a udiv/urem can be proved to fit within a smaller
power-of-two-sized type, reduce the width of the udiv/urem.
Backed out for causing performance regressions. Re-landing
because we've determined that these regressions were noise.
Original Differential Revision: https://reviews.llvm.org/D44102
llvm-svn: 328096
Summary:
When building the selection DAG we sometimes need to postpone
the handling of a dbg.value until the value it should refer to
is created. This is done by using the DanglingDebugInfoMap.
In the past this map has been limited to hold one dangling
dbg.value per value. This patch removes that restriction.
Reviewers: aprantl, rnk, probinson, vsk
Reviewed By: aprantl
Subscribers: Ka-Ka, llvm-commits, JDevlieghere
Tags: #debug-info
Differential Revision: https://reviews.llvm.org/D44610
llvm-svn: 328084
Also restrict to port 0 and 1 for SkylakeClient. It looks like the scheduler models don't account for client not having a full vector ALU on port 5 like server.
Fixes PR36808.
llvm-svn: 328061
Most basic possible test for the logic used by LICM.
Also contains a speculative build fix for compiles which complain about a definition of a stuct K; followed by a declaration as class K;
llvm-svn: 328058
The default thread model for wasm is single, and in this mode thread-local
global variables can be lowered identically to non-thread-local variables.
Differential Revision: https://reviews.llvm.org/D44703
llvm-svn: 328049
The inline assembly generated for the ARC autorelease elision marker
must have a funclet token if it's emitted inside a funclet, otherwise
the inline assembly (and all subsequent code in the funclet) will be
marked unreachable by WinEHPrepare.
Note that this only applies for the non-O0 case, since at O0, clang
emits the autorelease elision marker itself rather than deferring to the
backend. The fix for clang is handled in a separate change.
Differential Revision: https://reviews.llvm.org/D44641
llvm-svn: 328042
Mingw uses the same stack protector functions as GCC provides
on other platforms as well.
Patch by Valentin Churavy!
Differential Revision: https://reviews.llvm.org/D27296
llvm-svn: 328039
term sections from .o files to look to see if the pointers have a relocation
entry and if so print the symbol name from the relocation entry. If not fall
back to the existing code and use the pointer value to look up that value
in the symbol table.
rdar://38337506
llvm-svn: 328037
It uses the MC framework and the tablegen matcher to do the
heavy lifting. Can handle both explicit and implicit locals
(-disable-wasm-explicit-locals). Comes with a small regression
test.
This is a first basic implementation that can parse most llvm .s
output and round-trips most instructions succesfully, but in order
to keep the commit small, does not address all issues.
There are a fair number of mismatches between what MC / assembly
matcher think a "CPU" should look like and what WASM provides,
some already have workarounds in this commit (e.g. the way it
deals with register operands) and some that require further work.
Some of that further work may involve changing what the
Disassembler outputs (and what s2wasm parses), so are probably
best left to followups.
Some known things missing:
- Many directives are ignored and not emitted.
- Vararg calls are parsed but extra args not emitted.
- Loop signatures are likely incorrect.
- $drop= is not emitted.
- Disassembler does not output SIMD types correctly, so assembler
can't test them.
Patch by Wouter van Oortmerssen
Differential Revision: https://reviews.llvm.org/D44329
llvm-svn: 328028
I'm not entirely sure these hacks are still needed. If you remove the hacks completely, the name of the library call that gets generated doesn't match the grep the test previously had. So the test wasn't really checking anything.
If the hack is still needed it belongs in PPC specific code. I believe the FP_TO_SINT code here is the only place in the tree where a FP_ROUND_INREG node is created today. And I don't think its even being used correctly because the legalization returned a BUILD_PAIR with the same value twice. That doesn't seem right to me. By moving the code entirely to PPC we can avoid creating the FP_ROUND_INREG at all.
I replaced the grep in the existing test with full checks generated by hacking update_llc_test_check.py to support ppc32 just long enough to generate it.
Differential Revision: https://reviews.llvm.org/D44061
llvm-svn: 328017
Registers E[A-D]X, E[SD]I, E[BS]P, and EIP have 16-bit subregisters
that cover the low halves of these registers. This change adds artificial
subregisters for the high halves in order to differentiate (in terms of
register units) between the 32- and the low 16-bit registers.
This patch contains parts that aim to preserve the calculated register
pressure. This is in order to preserve the current codegen (minimize the
impact of this patch). The approach of having artificial subregisters
could be used to fix PR23423, but the pressure calculation would need
to be changed.
Differential Revision: https://reviews.llvm.org/D43353
llvm-svn: 328016
As suggested in the original review (https://reviews.llvm.org/D44524), use an annotation style printer instead.
Note: The switch from -analyze to -disable-output in tests was driven by the fact that seems to be the idiomatic style used in annoation passes. I tried to keep both working, but the old style pass API for printers really doesn't make this easy. It invokes (runOnFunction, print(Module)) repeatedly. I decided the extra state wasn't worth it given the old pass manager is going away soonish anyway.
llvm-svn: 328015
This diff adds support for SHT_GROUP sections to llvm-objcopy.
Some sections are interrelated and comprise a group.
For example, a definition of an inline function might require,
in addition to the section containing its instructions,
a read-only data section containing literals referenced inside the function.
A section of the type SHT_GROUP contains the indices of the group members,
therefore, it needs to be updated whenever the indices change.
Similarly, the fields sh_link, sh_info should be recalculated as well.
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D43996
llvm-svn: 328012
This way we can support address-space specific variants without explicitly
encoding the space in the name of the intrinsic. Less intrinsics to deal with ->
less boilerplate.
Added a bit of tablegen magic to match/replace an intrinsics with a pointer
argument in particular address space with the space-specific instruction
variant.
Updated tests to use non-default address spaces.
Differential Revision: https://reviews.llvm.org/D43268
llvm-svn: 328006
Many of our loop passes make use of so called "must execute" or "guaranteed to execute" facts to prove the legality of code motion. The basic notion is that we know (by assumption) an instruction didn't fault at it's original location, so if the location we move it to is strictly post dominated by the original, then we can't have introduced a new fault.
At the moment, the testing for this logic is somewhat adhoc and done mostly through LICM. Since I'm working on that code, I want to improve the testing. This patch is the first step in that direction. It doesn't actually test the variant used by the loop passes - I need to move that to the Analysis library first - but instead exercises an alternate implementation used by SCEV. (I plan on merging both implementations.)
Note: I'll be replacing the printing logic within this with an annotation based version in the near future. Anna suggested this in review, and it seems like a strictly better format.
Differential Revision: https://reviews.llvm.org/D44524
llvm-svn: 328004
TopReadyCycle and BotReadyCycle were off by one cycle when an SU is either
the first instruction or the last instruction in a packet.
Patch by Ikhlas Ajbar.
llvm-svn: 328000
Summary:
Currently X-Ray Instrumentation pass has a dependency on MachineLoopInfo
(and thus on MachineDominatorTree as well) and we have to compute them
even if X-Ray is not used. This patch changes it to a lazy computation
to save compile time by avoiding these redundant computations.
Reviewers: dberris, kubamracek
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D44666
llvm-svn: 327999
Summary:
Added a flag -no-dwarf-pub-sections, which allows to disable
emission of DWARF public sections.
Reviewers: probinson, echristo
Subscribers: aprantl, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D44385
llvm-svn: 327994
Summary: Fix a bug in entry block shuffled to middle of the chain.
Reviewers: davide, courbet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44642
llvm-svn: 327971
The test is taken from
https://github.com/avr-rust/rust/issues/57
The originally implementation of struct return lowering was made in
r325474.
Patch by Peter Nimmervoll
llvm-svn: 327967
Summary:
It turned out to be error-prone to expect the callers to handle that - better to
leave the decision to this routine and make the required data to be explicitly
passed to the function.
This handles the case that was missed in the r322473 and fixes the assert
mentioned in PR36524.
Reviewers: dorit, mssimpso, Ayal, dcaballe
Reviewed By: dcaballe
Subscribers: Ka-Ka, hiraditya, dneilson, hsaito, llvm-commits
Differential Revision: https://reviews.llvm.org/D43812
llvm-svn: 327960
In these cases, both parameters and return values are passed
as a pointer to a stack allocation.
MSVC doesn't use the f80 data type at all, while it is used
for long doubles on mingw.
Normally, this part of the calling convention is handled
within clang, but for intrinsics that are lowered to libcalls,
it may need to be handled within llvm as well.
Differential Revision: https://reviews.llvm.org/D44592
llvm-svn: 327957