Summary:
The pfm counters are now in the ExegesisTarget rather than the
MCSchedModel (PR39165).
This also compresses the pfm counter tables (PR37068).
Reviewers: RKSimon, gchatelet
Subscribers: mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D52932
llvm-svn: 345243
Multiply a is complex operation so just because some bit of the output isn't used doesn't mean that bit of the input isn't used.
We might able to bound it, but it will require some more thought.
llvm-svn: 345241
GNU readelf tool prints hex value of the ELF header flags field and the
flags names. This change adds the same functionality to llvm-readobj.
Now llvm-readobj can print MIPS and RISCV flags.
New GNUStyle::printFlags() method is a copy of ScopedPrinter::printFlags()
routine. Probably we can escape code duplication and / or simplify the
printFlags() method. But it's a task for separate commit.
Differential revision: https://reviews.llvm.org/D52027
llvm-svn: 345238
Summary: Fixes part of the problem reported in bug 39275.
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits, alexcrichton
Differential Revision: https://reviews.llvm.org/D53542
llvm-svn: 345230
This makes the offsets larger (since they are further from the base
address) but those are in the .dwo - and allows removing addresses and
relocations from the .o file.
This could be built into the AddressPool more fundamentally, perhaps -
when you ask for an AddressPool entry you could say "or give me some
other entry and an offset I need to use" - though what to do about
situations where the first use of an address in a section is not the
earliest address in that section... is tricky.
At least with range addresses we can be fairly sure we've seen the
earliest address first because we see the start address for the
function.
llvm-svn: 345224
Summary:
Currently when assigning depths 'rethrow' does not take the whole
control flow stack into accounts but only considers EH pad stacks. When
assigning depth immmediates to rethrows, in normal cases it is done
correctly but when a rethrow instruction throws up to a caller, i.e., we
convert a pseudo RETHROW_TO_CALLER instruction to a rethrow, it
mistakenly compute the whole stack depth.
Reviewers: dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D53619
llvm-svn: 345223
Summary:
Changing the node type in lowering was violating assumptions made in
the DAG combiner, so don't change the node type any more. This fixes
one of the issues reported in bug 39275.
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits, alexcrichton
Differential Revision: https://reviews.llvm.org/D53537
llvm-svn: 345221
Instead of using the MOVGOT64r pseudo, use the existing
MO_PIC_BASE_OFFSET support on symbol operands. Now I don't have to
create a "scratch register operand" for the pseudo to use, and the
register allocator can make better decisions.
Fixes some X86 verifier errors tracked in PR27481.
llvm-svn: 345219
Summary:
Changes all uses of minnan/maxnan to minimum/maximum
globally. These names emphasize that the semantic difference between
these operations is more than just NaN-propagation.
Reviewers: arsenm, aheejin, dschuff, javed.absar
Subscribers: jholewinski, sdardis, wdng, sbc100, jgravelle-google, jrtc27, atanasyan, llvm-commits
Differential Revision: https://reviews.llvm.org/D53112
llvm-svn: 345218
In this diff we introduce dispatch mechanism based on
the type of the input (archive, object file, raw binary)
and the format (coff, elf, macho).
We also move the ELF-specific code into the namespace llvm::objcopy::elf.
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D53311
llvm-svn: 345217
'ignore-non-existent-contents' stopped working after r342232 in a way
that the actual attribute value isn't used and it works as if it is
always `true`.
Common use case for VFS iteration is iterating through files in umbrella
directories for modules. Ability to detect if some VFS entries point to
non-existing files is nice but non-critical. Instead of adding back
support for `'ignore-non-existent-contents': false` I am removing the
attribute, because such scenario isn't used widely enough and stricter
checks don't provide enough value to justify the maintenance.
Change is done both in LLVM and Clang, corresponding Clang commit is r345212.
rdar://problem/45176119
Reviewers: bruno
Reviewed By: bruno
Subscribers: hiraditya, dexonsmith, sammccall, cfe-commits
Differential Revision: https://reviews.llvm.org/D53228
llvm-svn: 345213
The current splitting algorithm works in three stages:
1) Identify cold blocks, then
2) Use forward/backward propagation to mark hot blocks, then
3) Grow a SESE region of blocks *outside* of the set of hot blocks and
start outlining.
While testing this pass on Apple internal frameworks I noticed that some
kinds of control flow (e.g. loops) are never outlined, even though they
unconditionally lead to / follow cold blocks. I noticed two other issues
related to how cold regions are identified:
- An inconsistency can arise in the internal state of the hotness
propagation stage, as a block may end up in both the ColdBlocks set
and the HotBlocks set. Further inconsistencies can arise as these sets
do not match what's in ProfileSummaryInfo.
- It isn't necessary to limit outlining to single-exit regions.
This patch teaches the splitting algorithm to identify maximal cold
regions and outline them. A maximal cold region is defined as the set of
blocks post-dominated by a cold sink block, or dominated by that sink
block. This approach can successfully outline loops in the cold path. As
a side benefit, it maintains less internal state than the current
approach.
Due to a limitation in CodeExtractor, blocks within the maximal cold
region which aren't dominated by a single entry point (a so-called "max
ancestor") are filtered out.
Results:
- X86 (LNT + -Os + externals): 134KB of TEXT were outlined compared to
47KB pre-patch, or a ~3x improvement. Did not see a performance impact
across two runs.
- AArch64 (LNT + -Os + externals + Apple-internal benchmarks): 149KB
of TEXT were outlined. Ditto re: performance impact.
- Outlining results improve marginally in the internal frameworks I
tested.
Follow-ups:
- Outline more than once per function, outline large single basic
blocks, & try to remove unconditional branches in outlined functions.
Differential Revision: https://reviews.llvm.org/D53627
llvm-svn: 345209
(Relands r344930, reverted in r344935, and now hopefully fixed for
Windows.)
While this change specifically targets FileCheck, it affects any tool
using the same SourceMgr facilities.
Previously, -color was documented in FileCheck's -help output, but
-color had no effect. Now, -color obeys its documentation: it forces
colors to be used in FileCheck diagnostics even when stderr is not a
terminal.
-color is especially helpful when combined with FileCheck's -v, which
can produce a long series of diagnostics that you might wish to pipe
to a pager, such as less -R. The WithColor extensions here will also
help to clean up color usage in FileCheck's annotated dump of input,
which is proposed in D52999.
Reviewed By: JDevlieghere, zturner
Differential Revision: https://reviews.llvm.org/D53419
llvm-svn: 345202
Until now, we've only checked whether merging stores would cause a cycle via
the value argument, but the address and indexed offset arguments are also
capable of creating cycles in some situations.
The addresses are all base+offset with notionally the same base, but the base
SDNode may still be different (e.g. via an indexed load in one case, and an
ISD::ADD elsewhere). This allows cycles to creep in if one of these sources
depends on another.
The indexed offset is usually undef (representing a non-indexed store), but on
some architectures (e.g. 32-bit ARM-mode ARM) it can be an arbitrary value,
again allowing dependency cycles to creep in.
llvm-svn: 345200
It's possible to do a tail call to a stack argument. LLVM already
calculates the right stack offset to call through.
Fixes the sibcall* and musttail* verifier failures tracked at PR27481.
llvm-svn: 345197
Summary:
This renames the IsParsingMSInlineAsm member variable of AsmLexer to
LexMasmIntegers and moves it up to MCAsmLexer. This is the only behavior
controlled by that variable. I added a public setter, so that it can be
set from outside or from the llvm-mc command line. We may need to
arrange things so that users can get this behavior from clang, but
that's future work.
I also put additional hex literal lexing functionality under this flag
to fix PR32973. It appears that this hex literal parsing wasn't intended
to be enabled in non-masm-style blocks.
Now, masm integers (0b1101 and 0ABCh) work in __asm blocks from clang,
but 0b label references work when using .intel_syntax in standalone .s
files.
However, 0b label references will *not* work from __asm blocks in clang.
They will work from GCC inline asm blocks, which it sounds like is
important for Crypto++ as mentioned in PR36144.
Essentially, we only lex masm literals for inline asm blobs that use
intel syntax. If the .intel_syntax directive is used inside a gnu-style
inline asm statement, masm literals will not be lexed, which is
compatible with gas and llvm-mc standalone .s assembly.
This fixes PR36144 and PR32973.
Reviewers: Gerolf, avt77
Subscribers: eraman, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D53535
llvm-svn: 345189
I'm not sure all the microarchitectural tuning flags that have been added to IVBFeatures are relevant for KNL. Separating will allow us to see and audit them. There might even be some simplification opportunities in the Sandy Bridge through Icelake inheritance line without KNL using the same chain.
llvm-svn: 345183
Add X86 SimplifyDemandedBitsForTargetNode and use it to simplify PMULDQ/PMULUDQ target nodes.
This enables us to repeatedly simplify the node's arguments after the previous approach had to be reverted due to PR39398.
Differential Revision: https://reviews.llvm.org/D53643
llvm-svn: 345182
Summary:
The current default of appending "_"+entry block label to the new
extracted cold function breaks demangling. Change the deliminator from
"_" to "." to enable demangling. Because the header block label will
be empty for release compile code, use "extracted" after the "." when
the label is empty.
Additionally, add a mechanism for the client to pass in an alternate
suffix applied after the ".", and have the hot cold split pass use
"cold."+Count, where the Count is currently 1 but can be used to
uniquely number multiple cold functions split out from the same function
with D53588.
Reviewers: sebpop, hiraditya
Subscribers: llvm-commits, erik.pilkington
Differential Revision: https://reviews.llvm.org/D53534
llvm-svn: 345178
The BKPT instruction is specified to cause a software breakpoint,
and at least on Linux results in a SIGTRAP. This makes it more
suitable for implementing debugtrap than TRAP (aka UDF #254), which
is specified to cause an undefined instruction exception and results
in a SIGILL on Linux.
Moreover, BKPT is not marked as a terminator, which is not only
consistent with the IR instruction but allows the analyzeBlock
function to correctly analyze a basic block containing the instruction,
which fixes an assertion failure in the machine block placement pass
previously triggered by the included test case.
Because BKPT is only supported starting with ARMv5T, we continue to
use UDF #254 when targeting v4T.
Differential Revision: https://reviews.llvm.org/D53614
llvm-svn: 345171
This will allow other generators of LLVM IR to use the auto-vectorizer
without having to change that flag.
Note: on its own, this patch will enable auto-vectorization on Hexagon
in all cases, regardless of the -fvectorize flag. There is a companion
clang patch that together with this one forms an NFC for clang users.
llvm-svn: 345169
This patch brings back the MOV64r0 pseudo instruction for zeroing a 64-bit register. This replaces the SUBREG_TO_REG MOV32r0 sequence we use today. Post register allocation we will rewrite the MOV64r0 to a 32-bit xor with an implicit def of the 64-bit register similar to what we do for the various XMM/YMM/ZMM zeroing pseudos.
My main motivation is to enable the spill optimization in foldMemoryOperandImpl. As we were seeing some code that repeatedly did "xor eax, eax; store eax;" to spill several registers with a new xor for each store. With this optimization enabled we get a store of a 0 immediate instead of an xor. Though I admit the ideal solution would be one xor where there are multiple spills. I don't believe we have a test case that shows this optimization in here. I'll see if I can try to reduce one from the code were looking at.
There's definitely some other machine CSE(and maybe other passes) behavior changes exposed by this patch. So it seems like there might be some other deficiencies in SUBREG_TO_REG handling.
Differential Revision: https://reviews.llvm.org/D52757
llvm-svn: 345165
Non-uniform division/remainder handling was added back at D49248/D50765 - so share the 'mul+sub' costs that already exist for uniform cases.
llvm-svn: 345164
A lifetime end intrinsic between a tail call and the return should not
prevent the call from being tail call optimized.
Differential Revision: https://reviews.llvm.org/D53519
llvm-svn: 345163
Also, removed the initialization of vectors used for processor resource masks.
Support function 'computeProcResourceMasks()' already calls method resize on
those vectors.
No functional change intended.
llvm-svn: 345161
Summary: This function was performing two hash lookups when a new function type was requested: first checking if it exists and second to insert it. This patch updates the function to perform a single hash lookup in this case by updating the value in the hash table in-place in case the function type was not there before.
Reviewers: bkramer
Reviewed By: bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D53471
llvm-svn: 345151
The original patch was committed here:
rL344609
...and reverted:
rL344612
...because it did not properly check/test data types before calling
ComputeNumSignBits().
The tests that caused bot failures for the previous commit are
over-reaching front-end tests that run the entire -O optimizer
pipeline:
Clang :: CodeGen/builtins-systemz-zvector.c
Clang :: CodeGen/builtins-systemz-zvector2.c
I've added a negative test here to ensure coverage for that case.
The new early exit check also tests the type of the 'B' parameter,
so we don't waste time on matching if either value is unsuitable.
Original commit message:
This is part of solving PR37549:
https://bugs.llvm.org/show_bug.cgi?id=37549
The patterns shown here are a special case of something
that we already convert to select. Using ComputeNumSignBits()
catches that case (but not the more complicated motivating
patterns yet).
The backend has hooks/logic to convert back to logic ops
if that's better for the target.
llvm-svn: 345149
Added begin()/end() methods to allow the usage of SourceMgr in foreach loops.
With this change, method getMCInstFromIndex() (as well as a couple of other
methods) are now redundant, and can be removed from the public interface.
llvm-svn: 345147
This work is to avoid regressions when we seperate FNeg from the FSub IR instruction.
Differential Revision: https://reviews.llvm.org/D53205
llvm-svn: 345146
Summary:
If the target does not support `.asciz` and `.ascii` directives, the
strings are represented as bytes and each byte is placed on the new line
as a separate byte directive `.b8 <data>`. NVPTX target allows to
represent the vector of the data of the same type as a vector, where
values are separated using `,` symbol: `.b8 <data1>,<data2>,...`. This
allows to reduce the size of the final PTX file. Ptxas tool includes ptx
files into the resulting binary object, so reducing the size of the PTX
file is important.
Reviewers: tra, jlebar, echristo
Subscribers: jholewinski, llvm-commits
Differential Revision: https://reviews.llvm.org/D45822
llvm-svn: 345142
On Windows at least, llvm-strings was crashing if it encountered bytes
that mapped to negative chars, as it was passing these into
std::isgraph and std::isblank functions, resulting in undefined
behaviour. On debug builds using MSVC, these functions verfiy that the
value passed in is representable as an unsigned char. Since the char is
promoted to an int, a value greater than 127 would turn into a negative
integer value, and fail the check. Using the llvm::isPrint function is
sufficient to solve the issue.
Reviewed by: ruiu, mstorsjo
Differential Revision: https://reviews.llvm.org/D53509
llvm-svn: 345137
A new class named InstructionError has been added to Support.h in order to
improve the error reporting from class InstrBuilder.
The llvm-mca driver is responsible for handling InstructionError objects, and
printing them out to stderr.
The goal of this patch is to remove all the remaining error handling logic from
the library code.
In particular, this allows us to:
- Simplify the logic in InstrBuilder by removing a needless dependency from
MCInstrPrinter.
- Centralize all the error halding logic in a new function named 'runPipeline'
(see llvm-mca.cpp).
This is also a first step towards generalizing class InstrBuilder, so that in
future, we will be able to reuse its logic to also "lower" MachineInstr to
mca::Instruction objects.
Differential Revision: https://reviews.llvm.org/D53585
llvm-svn: 345129
masked-interleaving is enabled
Enable interleave-groups under fold-tail scenario for Opt for size compilation;
D50480 added support for vectorizing loops of arbitrary trip-count without a
remiander, which in turn makes everything in the loop conditional, including
interleave-groups if any. It therefore invalidated all interleave-groups
because we didn't have support for vectorizing predicated interleaved-groups
at the time. In the meantime, D53011 introduced this support, so we don't
have to invalidate interleave-groups when masked-interleaved support is enabled.
Reviewers: Ayal, hsaito, dcaballe, fhahn
Reviewed By: hsaito
Differential Revision: https://reviews.llvm.org/D53559
llvm-svn: 345115
LSR reassociates constants as unfolded offsets when the constants fit as
immediate add operands, which currently prevents such constants from being
combined later with loop invariant registers.
This patch modifies GenerateCombinations() to generate a second formula which
includes the unfolded offset in the combined loop-invariant register.
Differential Revision: https://reviews.llvm.org/D51861
llvm-svn: 345114
This B/W VPTEST instructions are only available with AVX512BW. But lowering should prevent any byte or word elements from getting to isel so this can't be exposed.
llvm-svn: 345112
This patch adds support for dumping the unwind info from ARM64 COFF object
files.
Differential Revision: https://reviews.llvm.org/D53264
llvm-svn: 345108
A global alias may use indices which are not considered in bounds. In
such a case, accessing the base object will fail as it only peers
through inbounds accesses. This pattern is used by the swift compiler
to create references to preceeding members in the type metadata. This
would cause the code generation to fail when targeting a platform that
used ELF as the object file format. Be conservative and fail the
read-only check if we run into an alias that we cannot peer through.
llvm-svn: 345107
On GNU/Hurd, llvm-config is returning bogus value, such as:
$ llvm-config-6.0 --includedir
/usr/include
while it should be:
$ llvm-config-6.0 --includedir
/usr/lib/llvm-6.0/include
This is because getMainExecutable does not get the actual installation
path. On GNU/Hurd, /proc/self/exe is indeed a symlink to the path that
was used to start the program, and not the eventual binary file. Llvm's
getMainExecutable thus needs to run realpath over it to get the actual
place where llvm was installed (/usr/lib/llvm-6.0/bin/llvm-config), and
not /usr/bin/llvm-config-6.0. This will not change the result on Linux,
where /proc/self/exe already points to the eventual file.
Patch by Samuel Thibault!
While making changes here, I reformatted this block a bit to reduce
indentation and match 2 space indent style.
Differential Revision: https://reviews.llvm.org/D53557
llvm-svn: 345104
in the same round of SCC update.
In https://reviews.llvm.org/rL309784, inline history is added to prevent
infinite inlining across multiple run of inliner and SCC update, but the
history will only be kept when new SCC is actually generated during SCC update.
We found a case that SCC can be split and then merge into itself in the same
round of SCC update, so the same SCC will be pop out from UR.CWorklist and
then added back immediately, without any new SCC generated, that is why the
existing patch cannot catch the infinite inline case.
What the patch does is even if no new SCC is generated, if only the current
SCC appears in UR.CWorklist again, then keep the inline history.
Differential Revision: https://reviews.llvm.org/D52915
llvm-svn: 345103
When implementing memset's today we often see this pattern:
$x0 = MOV 0xXYXYXYXYXYXYXYXY
store $x0, ...
$w1 = MOV 0xXYXYXYXY
store $w1, ...
We first create a 64bit constant in a 64bit register with all bytes the
same and then create a 32bit constant with all bytes the same in a 32bit
register. In many targets we could just access the lower byte of the
64bit register instead.
- Ideally this would be handled by the ConstantHoist pass but it runs
too early when memset isn't expanded yet.
- The memset expansion code already had this optimization implemented,
however SelectionDAG constantfolding would constantfold the
"trunc(bigconstnat)" pattern to "smallconstant".
- This patch makes the memset expansion mark the constant as Opaque and
stop DAGCombiner from constant folding in this situation. (Similar to
how ConstantHoisting marks things as Opaque to avoid folding
ADD/SUB/etc.)
Differential Revision: https://reviews.llvm.org/D53181
llvm-svn: 345102
Summary:
Fix the new PM to only perform hot cold splitting once during ThinLTO,
by skipping it in the pre-link phase.
This was already fixed in the old PM by the move of the hot cold split
pass later (after the early return when PrepareForThinLTO) by r344869.
Reviewers: vsk, sebpop, hiraditya
Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D53611
llvm-svn: 345096
Summary:
This is a revised version of D41474.
When the debug location is parsed in BitcodeReader::parseFunction, the
scope and inlinedAt MDNodes are obtained via MDLoader->getMDNodeFwdRefOrNull(),
which will create a forward ref if they were not yet loaded.
Specifically, if one of these MDNodes is in the module level metadata
block, and this is during ThinLTO importing, that metadata block is
lazily loaded.
Most places in that invoke getMDNodeFwdRefOrNull have a corresponding call
to resolveForwardRefsAndPlaceholders which will take care of resolving them.
E.g. places that call getMetadataFwdRefOrLoad, or at the end of parsing a
function-level metadata block, or at the end of the initial lazy load of
module level metadata in order to handle invocations of getMDNodeFwdRefOrNull
for named metadata and global object attachments. However, the calls for
the scope/inlinedAt of debug locations are not backed by any such call to
resolveForwardRefsAndPlaceholders.
To fix this, change the scope and inlinedAt parsing to instead use
getMetadataFwdRefOrLoad, which will ensure the forward refs to lazily
loaded metadata are resolved.
Fixes PR35472.
Reviewers: dexonsmith, Sunil_Srivastava, vsk
Subscribers: inglorion, eraman, steven_wu, sebpop, mehdi_amini, dmikulin, vsk, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D53596
llvm-svn: 345095
Summary:
This patch will print out {Counter, Skip, StopAfter} info of all passes which have DebugCounter set at destruction.
It can be used to monitor how many times does certain transformation happen in a pass, and also help check if -debug-counter option is set correctly.
Please refer to this [[ http://lists.llvm.org/pipermail/llvm-dev/2018-July/124722.html | thread ]] for motivation.
Reviewers: george.burgess.iv, davide, greened
Reviewed By: greened
Subscribers: kristina, llozano, mgorny, llvm-commits, mgrang
Differential Revision: https://reviews.llvm.org/D50031
llvm-svn: 345085
Using -diff and -verbose together doesn't work today. We should audit
where these two options interact and fix them. In the meantime we error
out when the user try to specify both.
llvm-svn: 345084
Clearing LargeOffsetGEPMap at the end fixes a bug where if a large
offset GEP is in a dead basic block, we fail an assertion when trying
to delete the block due to the asserting VH in LargeOffsetGEPMap.
Differential Revision: https://reviews.llvm.org/D53464
llvm-svn: 345082
Doesn't build on Windows. The call to 'lookup' is ambiguous. Clang and
MSVC agree, anyway.
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/787
C:\b\slave\clang-x64-windows-msvc\build\llvm.src\unittests\ExecutionEngine\Orc\CoreAPIsTest.cpp(315): error C2668: 'llvm::orc::ExecutionSession::lookup': ambiguous call to overloaded function
C:\b\slave\clang-x64-windows-msvc\build\llvm.src\include\llvm/ExecutionEngine/Orc/Core.h(823): note: could be 'llvm::Expected<llvm::JITEvaluatedSymbol> llvm::orc::ExecutionSession::lookup(llvm::ArrayRef<llvm::orc::JITDylib *>,llvm::orc::SymbolStringPtr)'
C:\b\slave\clang-x64-windows-msvc\build\llvm.src\include\llvm/ExecutionEngine/Orc/Core.h(817): note: or 'llvm::Expected<llvm::JITEvaluatedSymbol> llvm::orc::ExecutionSession::lookup(const llvm::orc::JITDylibSearchList &,llvm::orc::SymbolStringPtr)'
C:\b\slave\clang-x64-windows-msvc\build\llvm.src\unittests\ExecutionEngine\Orc\CoreAPIsTest.cpp(315): note: while trying to match the argument list '(initializer list, llvm::orc::SymbolStringPtr)'
llvm-svn: 345078
In the new scheme the client passes a list of (JITDylib&, bool) pairs, rather
than a list of JITDylibs. For each JITDylib the boolean indicates whether or not
to match against non-exported symbols (true means that they should be found,
false means that they should not). The MatchNonExportedInJD and MatchNonExported
parameters on lookup are removed.
The new scheme is more flexible, and easier to understand.
This patch also updates JITDylib search orders to be lists of (JITDylib&, bool)
pairs to match the new lookup scheme. Error handling is also plumbed through
the LLJIT class to allow regression tests to fail predictably when a lookup from
a lazy call-through fails.
llvm-svn: 345077
Add a list of benchmarks, applications and algorithms which are under
discussion to be added to the test-suite.
The initial list includes the the benchmarks mentioned at
https://llvm.org/PR34216, missing SPEC benchmarks, some image processing
algorithms and a few others. The bug tracker only allows adding to the
discussion, not removing, commenting, adding details to individual
benchmarks.
The first proposal was to add these benchmark into the test-suite
repository, but after a discussion, adding it to llvm/docs/Proposals
seem more appropriate. One advantage is that llvm.org will have a
browsable web page with these suggestions.
Suggested-by: Hal Finkel
Differential Revision: https://reviews.llvm.org/D46714
llvm-svn: 345074
Outlined code is cold by assumption, so it makes sense to optimize it
for minimal code size rather than performance.
After r344869 moved the splitting pass to the end of the IR pipeline,
this does not result in much of a code size reduction. This is probably
because a comparatively small number backend transforms make use of the
MinSize hint.
Running LNT on x86_64, I see that 33/1020 binaries shrink for a total of
919 bytes of TEXT reduction. I didn't measure a significant performance
impact.
Differential Revision: https://reviews.llvm.org/D53518
llvm-svn: 345072
We can't add the MULDQ node back to the worklist after the demanded bits change has been committed in case the node has been removed entirely. This will have to wait until we have SimplifyDemandedBitsForTargetNode.
llvm-svn: 345070
Summary:
GNU strip supports both `-s` and `-S` as aliases for `--strip-all` and `--strip-debug`, respectfully.
As part of this, it turns out that strip/objcopy were accepting case insensitive command line args. I'm not sure if there was an explicit reason for this. The only others uses of this are llvm-cvtres/llvm-mt/llvm-lib, which are all tools specific for windows support. Forcing case sensitivity allows both aliases to exist, but seems like a good idea anyway.
And as a surprise test case adjustment, the llvm-strip unit test was running with `-keep=unavailable_symbol`, despite `keep` not be a valid flag for strip. This is because there is a flag `-K` which, when case insensitivity is permitted, allows it to be interpreted as `-K` = `eep=unavailable_symbol` (e.g. to allow `-Kfoo` == `--keep-symbol=foo`).
Reviewers: jakehehrlich, jhenderson, alexshap
Reviewed By: jakehehrlich
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D53163
llvm-svn: 345068
As suggested on D53258, this patch move the CTPOP expansion code from SelectionDAGLegalize to TargetLowering to allow it to be reused by the VectorLegalizer.
Proper vector support will be added by D53258.
llvm-svn: 345066
As suggested on D53258, this patch shares common CTLZ expansion code between VectorLegalizer and SelectionDAGLegalize by putting it in TargetLowering.
Extension to D53474
llvm-svn: 345060
Summary:
Some targets have very long encodings and uint64_t isn't sufficient. uint128_t
isn't portable so such targets need to use an object instead.
There is one catch with this at the moment, no string of bits extracted
from the encoding may exceeed 64-bits. Fields are still permitted to
exceed 64-bits so long as they aren't one contiguous string of bits. If
this proves to be a problem then we can modify the generation of
fieldFromInstruction() calls to account for it but for now I've added an
assertion for this.
InsnType must either be integral or an APInt-like object that must:
* Have a static const max_size_in_bits equal to the number of bits in the encoding.
* be default-constructible and copy-constructible
* be constructible from a uint64_t (this is the key area the interface deviates
from APInt since this constructor does not take the bit width)
* be constructible from an APInt (this can be private)
* be convertible to uint64_t
* Support the ~, &,, ==, !=, and |= operators with other objects of the same type
* Support shift (<<, >>) with signed and unsigned integers on the RHS
* Support put (<<) to raw_ostream&
Reviewers: bogner, charukcs
Subscribers: nhaehnle, llvm-commits
Differential Revision: https://reviews.llvm.org/D52100
llvm-svn: 345056
Add support to allow bit-casting from f128 to i128 and then
extracting 64 bits from the result.
Differential Revision: https://reviews.llvm.org/D49507
llvm-svn: 345053
The initial motivation is that we want to remove the
fneg API because that would silently fail if we add
an actual fneg instruction to IR. The same would be
true for the integer ops, so we might as well get rid
of these too.
We have a newer 'match' API that makes checking for
these patterns simpler. It also works with vectors
that may include undef elements in constants.
If any out-of-tree users need updating, they can model
their code changes on these commits:
rL345050
rL345043
rL345042
rL345041
rL345036
rL345030
llvm-svn: 345052