The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.
In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.
This change, in concert with outlining optional branches, allows
triangle shaped code to perform much better, especially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.
Issue from previous rollback fixed, and a new test was added for that
case as well.
Differential revision: https://reviews.llvm.org/D18226
llvm-svn: 283274
Summary:
These are analogous to the existing LLVMConstExactSDiv and LLVMBuildExactSDiv
functions.
Reviewers: deadalnix, majnemer
Subscribers: majnemer, llvm-commits
Differential Revision: https://reviews.llvm.org/D25259
llvm-svn: 283269
Summary:
Attempting to fix PR30384.
Take the same approach as in compiler_rt and add a simplified version of __get_cpuid_max.
Including cpuid.h is no longer needed.
Reviewers: echristo, joerg
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D24597
llvm-svn: 283265
The motivation for the change is that we can't have pseudo-global settings for
codegen living in TargetOptions because that doesn't work with LTO.
Ideally, these reciprocal attributes will be moved to the instruction-level via
FMF, metadata, or something else. But making them function attributes is at least
an improvement over the current state.
The ingredients of this patch are:
Remove the reciprocal estimate command-line debug option.
Add TargetRecip to TargetLowering.
Remove TargetRecip from TargetOptions.
Clean up the TargetRecip implementation to work with this new scheme.
Set the default reciprocal settings in TargetLoweringBase (everything is off).
Update the PowerPC defaults, users, and tests.
Update the x86 defaults, users, and tests.
Note that if this patch needs to be reverted, the related clang patch checked in
at r283251 should be reverted too.
Differential Revision: https://reviews.llvm.org/D24816
llvm-svn: 283252
load commands that use the MachO::encryption_info_command and
MachO::encryption_info_command types, which are not used in llvm libObject
code but are used in llvm tool code.
This includes just LC_ENCRYPTION_INFO and
LC_ENCRYPTION_INFO_64 load commands.
llvm-svn: 283250
Make LIT_COMMAND configurable, use source tree only when actually
available and extend the default search to other common executable names
'lit.py' and 'lit', in order to increase uniformity between all LLVM
projects and support using installed lit.
Changing the conditional used to determine whether in-tree or external
lit is being used covers the case when LLVM_MAIN_SRC_DIR is defined but
does not exist (anymore). In this case, the function falls back to
looking for installed lit rather than attempting to use a non-existing
path. The same conditional is used in clang already.
Making LIT_COMMAND a cache variable in case the source tree variant is
used serves two purposes. Firstly, it increases uniformity between
the two branches since find_program() implicitly makes LIT_COMMAND
a cache variable. Secondly, it allows overriding the lit executable used
to run the tests when the LLVM source tree is provided. Gentoo is
planning to use this to use installed (and byte-compiled) lit instead of
re-compiling it in every LLVM project.
Extending default search is meant to increase uniformity between
different LLVM projects. The 'lit.py' name is already used by a few of
them, and 'lit' is the name used by utils/lit/setup.py when installing.
Differential Revision: https://reviews.llvm.org/D25076
llvm-svn: 283247
This adds support for CaseLower, CasesLower, StartsWithLower, and
EndsWithLower.
Differential revision: https://reviews.llvm.org/D24686
llvm-svn: 283244
AArch64InstrInfo::shouldScheduleAdjacent() determines whether two
instructions can benefit from macroop fusion on Apple CPUs. The list
turned out to be incomplete:
- the "rr" variants of the instructions were missing
- even the "rs" variants can have shift value == 0 and behave like the
"rr" variants
This also splits the MacropFusion target feature into
ArithmeticBccFusion and ArithmeticCbzFusion.
Differential Revision: https://reviews.llvm.org/D25142
llvm-svn: 283243
The purpose of the YAML diagnostic output file is to collect information on
optimizations performed, or not performed, for later processing by tools that
help users (and compiler developers) understand how code was optimized. As
such, the diagnostics that appear in the file should not be coupled to what a
user might want to see summarized for them as the compiler runs, and in fact,
because the user likely does not know what optimization diagnostics their tools
might want to use, the user cannot provide a useful filter regardless. As such,
we shouldn't filter the diagnostics going to the output file.
Differential Revision: https://reviews.llvm.org/D25224
llvm-svn: 283236
CMake requires that all targets expressed as dependencies exist, so we can't have intrinsics_gen in LLVM_COMMON_DEPENDS when it is written out, otherwise projects building out of tree will have CMake errors.
llvm-svn: 283234
IntegerType::MAX_INT_BITS is apparently not in sync with Type::SubclassData
size. This patch fixes this.
Differential Revision: https://reviews.llvm.org/D24814
llvm-svn: 283215
This patch corresponds to review:
The newly added VSX D-Form (register + offset) memory ops target the upper half
of the VSX register set. The existing ones target the lower half. In order to
unify these and have the ability to target all the VSX registers using D-Form
operations, this patch defines Pseudo-ops for the loads/stores which are
expanded post-RA. The expansion then chooses the correct opcode based on the
register that was allocated for the operation.
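The post-RA choice described above can be sketched in Python. This is only an illustration: the opcode names are hypothetical placeholders rather than the backend's real opcodes, and it assumes the 64 VSX registers are numbered 0-63 with 0-31 forming the lower half.

```python
# Hypothetical opcode names, used only for illustration.
LOWER_HALF_OPCODE = "DFORM_LOAD_LOWER"
UPPER_HALF_OPCODE = "DFORM_LOAD_UPPER"


def expand_pseudo_load(vsx_reg):
    """Pick a concrete opcode for a pseudo load once its register is known."""
    if not 0 <= vsx_reg < 64:
        raise ValueError("VSX register number out of range")
    # Assumption: registers 0-31 are the lower half, 32-63 the upper half.
    return LOWER_HALF_OPCODE if vsx_reg < 32 else UPPER_HALF_OPCODE
```

Because the decision is deferred until after register allocation, the allocator is free to pick any VSX register for the pseudo-op.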
llvm-svn: 283212
Treat soft-float as unsupported for fast-isel. Additionally, ensure we check
that lowering f32 arguments also considers the case of soft-float mode.
Reviewers: ehostunreach, vkalintiris, zoran.jovanovic
Differential Review: https://reviews.llvm.org/D24505
llvm-svn: 283209
Previously the code would access invalid memory and could crash.
This patch fixes the issue.
Differential revision: https://reviews.llvm.org/D25187
llvm-svn: 283204
The SMULO/UMULO DAG nodes, when not directly supported by the target,
expand to a multiplication twice as wide. If the resulting
type is not legal, an __mul?i3 intrinsic is used. Since the type is
not legal, the legalizer cannot directly call the intrinsic with
the wide arguments; instead, it "pre-lowers" them by splitting them
in halves.
The "pre-lowering" code in essence made assumptions about
the calling convention, specifically that i(N*2) values will be
split into two iN values and passed in consecutive registers in
little-endian order. This, naturally, breaks on a big-endian system,
such as our OR1K out-of-tree backend.
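To illustrate why the order of the halves matters, here is a small Python sketch of splitting a wide value into two halves. The function name and the rule that a big-endian target wants the high half first are assumptions made for the example; the real ordering is decided by the target's calling convention.

```python
def split_wide(value, half_bits, big_endian):
    """Split a 2*half_bits-wide integer into two half_bits-wide parts.

    Returns the parts in the order they would occupy consecutive
    registers under the assumed ordering rule.
    """
    mask = (1 << half_bits) - 1
    lo = value & mask
    hi = (value >> half_bits) & mask
    # Assumption: little-endian passes the low half first, big-endian the high half.
    return (hi, lo) if big_endian else (lo, hi)
```

Code that always emits `(lo, hi)` is only correct for the little-endian case, which is the assumption the "pre-lowering" code had baked in.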
Thanks to James Miller <james@aatch.net> for help in debugging.
Differential Revision: https://reviews.llvm.org/D25223
llvm-svn: 283203
When given a broken input object found using AFL,
getExtendedSymbolTableIndex() crashed because ShndxTable
was empty, as the object does not contain an SHT_SYMTAB_SHNDX section.
Differential revision: https://reviews.llvm.org/D25189
llvm-svn: 283196
This fixes the inconsistency of the fp denormal option names: in LLVM this was
DenormalType, but in Clang this is DenormalMode which seems better.
Differential Revision: https://reviews.llvm.org/D24906
llvm-svn: 283192
This patch corresponds to review:
https://reviews.llvm.org/D23155
This patch removes the VSHRC register class (based on D20310) and adds
exploitation of the Power9 sub-word integer loads into VSX registers as well
as vector sign extensions.
The new instructions are useful for a few purposes:
Int to Fp conversions of 1 or 2-byte values loaded from memory
Building vectors of 1 or 2-byte integers with values loaded from memory
Storing individual 1 or 2-byte elements from integer vectors
This patch implements all of those uses.
llvm-svn: 283190
Reintroduce versioning of shared libraries via SOVERSION, addressing
the issues with the previous design, since Gentoo is relying
on shared-split install of LLVM. The SOVERSIONs were originally
introduced in r229720 for all libraries, and removed in r252093 in favor
of custom SONAME. As far as I understand, the major concern with the old
versioning was that the used versions were incompatible with ldconfig.
Having considered that, this commit introduces SOVERSIONs with the
following considerations:
1. SOVERSIONs are formed of major & minor version concatenated -- i.e.
for 4.0 it's .so.40. This matches the common practice where the first
version number indicates ABI breakage, and therefore fixes the issues
with ldconfig. Additionally, VERSION with the remaining version
components appended is used, however this is not strictly necessary.
2. The versioning is only applied to libraries with no explicit SONAME
specified -- i.e. it won't apply to libLLVM but only to the split
libraries. It will also apply to libraries installed by the subprojects.
3. The versioning is only done on *nix systems, Darwin excluded. This
matches the current use of SONAME.
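The naming rule from point 1 can be sketched as follows. This is a Python illustration of the string that results, not the CMake logic itself, and the library name used in the test is just an example.

```python
def soversion(major, minor):
    """Concatenate major and minor version, e.g. 4 and 0 give '40'."""
    return "{}{}".format(major, minor)


def versioned_library_name(lib, major, minor):
    """Build the versioned file name for a split shared library."""
    return "{}.so.{}".format(lib, soversion(major, minor))
```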
Differential Revision: https://reviews.llvm.org/D24757
llvm-svn: 283189
Use separate doctrees between different Sphinx builders in order to
prevent race condition issues due to multiple Sphinx instances accessing
the same doctree cache in parallel.
Bug: https://llvm.org/bugs/show_bug.cgi?id=23781
Differential Revision: https://reviews.llvm.org/D23755
llvm-svn: 283188
Summary:
The minimum version of Python necessary to run the LLVM test suite is
2.7. Code to work around Python 2.5 and lower isn't necessary.
Reviewers: ddunbar, echristo, delcypher, beanz
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D25209
llvm-svn: 283169
Slightly improves the precision of GlobalsAA in certain situations, and
makes the behavior of optimization passes more predictable.
Differential Revision: https://reviews.llvm.org/D24104
llvm-svn: 283165
The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.
In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.
This change, in concert with outlining optional branches, allows
triangle shaped code to perform much better, especially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.
llvm-svn: 283164
We now build MemorySSA in its ctor, instead of waiting until the user
calls MemorySSA::getWalker. This silently changed our unittests, since
we add BasicAA to AAResults *after* constructing MemorySSA (...but
before calling MemorySSA::getWalker).
None of them broke because we do most of our "did this get optimized
correctly?" tests in .ll files.
llvm-svn: 283158
WebAssembly has officially switched from being an AST to being a stack
machine. Update various bits of terminology and README.md entries
accordingly.
llvm-svn: 283154
Summary:
optparse is deprecated in Python 2.7, which is the minimum version of
Python required to run the LLVM test suite. Replace its usage in lit
with argparse, optparse's 2.7 replacement module.
argparse has several benefits over optparse, but this commit does not
make use of those benefits yet. Instead, it simply uses the new API,
and attempts to keep the number of changes to a minimum.
Confirmed that lit's test suite, as well as LLVM's regression test suite,
still pass with these changes.
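A minimal sketch of what such a like-for-like port looks like with argparse. The options shown are illustrative and are not claimed to be lit's actual option set.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="example-runner")
    # With optparse this would have been parser.add_option(...);
    # argparse uses add_argument with largely the same keywords.
    parser.add_argument("-j", "--threads", dest="num_threads", type=int,
                        default=1, help="number of testing threads")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="show more output")
    # Positional arguments are declared explicitly in argparse rather
    # than being returned as a separate leftover list.
    parser.add_argument("test_paths", nargs="*", help="tests to run")
    return parser


opts = build_parser().parse_args(["-j", "4", "-v", "a.ll"])
```

The main visible difference is that `parse_args` returns a single namespace, so the positional arguments live alongside the options instead of in a second return value.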
Patch By Brian Gesiak!
Reviewers: ddunbar, echristo, beanz, delcypher
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D25173
llvm-svn: 283152
This is to avoid problems with win32 + ELF which surprisingly happens a
lot in practice: If a user just specifies -march on the commandline the
object format changes along with the architecture to ELF in many
instances while the OS stays with the default/host OS.
llvm-svn: 283151
This avoids llc using the host's OS/vendor as defaults and triggering
unwanted behaviour in the tests. This should deal with the buildbot
breakages on Windows after r283140.
llvm-svn: 283149
Refactor the code so that the same function can be used for all
instructions with all the same operands for up to 3 operands.
This is going to be useful for cast instructions.
NFC.
llvm-svn: 283144
Each shadow only represents data flow that is restricted to its reaching
def. Propagating more than that could lead to spurious register liveness,
resulting in extra (incorrect) block live-ins.
llvm-svn: 283143
Windows has no GOT relocations the way ELF/Darwin do. Some people use
x86_64-pc-win32-macho to build EFI firmware; do not produce GOT
relocations for this target.
Differential Revision: https://reviews.llvm.org/D24627
llvm-svn: 283140
Summary: The LoopSink pass uses some common functions in LICM. This patch refactors the LICM code to make it usable by the LoopSink pass (https://reviews.llvm.org/D22778).
Reviewers: davidxl, danielcdh, hfinkel, chandlerc
Subscribers: hfinkel, llvm-commits
Differential Revision: https://reviews.llvm.org/D24168
llvm-svn: 283134
Splitting the edge is nontrivial because of the landing pad, and we would
currently assert trying to do it.
Differential Revision: https://reviews.llvm.org/D24680
llvm-svn: 283129
It should forward to deregisterEHFramesInProcess by default, not
registerEHFramesInProcess.
No test case: I haven't come up with a good way to unit test EH frame
registration yet.
llvm-svn: 283123
This should fix:
https://llvm.org/bugs/show_bug.cgi?id=30433
There are a couple of open questions about the codegen:
1. Should we let scalar ops be scalars and avoid vector constant loads/splats?
2. Should we have a pass to combine constants such as the inverted pair that we have here?
Differential Revision: https://reviews.llvm.org/D25165
llvm-svn: 283119
If the llvm. prefix is dropped other parts of llvm don't see this as
an intrinsic. This means that the number of regular symbols depends
on the context the module is loaded into, which causes LTO to abort.
Fixes PR30509.
llvm-svn: 283117
Retrying after buildbot reset.
To lex hash directives we peek ahead to find component tokens, create a
unified token, and unlex the peeked tokens so the parser does not need
to parse the tokens then. Make sure we do not lex another hash
directive during peek operation.
This fixes PR28921.
Reviewers: rnk, loladiro
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D24839
llvm-svn: 283111
Summary: Added 6 new target hooks for the vectorizer in order to filter types, handle size constraints and decide how to split chains.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm, mzolotukhin, wdng, llvm-commits, nhaehnle
Differential Revision: https://reviews.llvm.org/D24727
llvm-svn: 283099
library call to __aeabi_uidivmod. This is an improved implementation of
r280808, see also D24133, that got reverted because isel was stuck in a loop.
That was caused by the optimisation incorrectly triggering on i64 ints, which
shouldn't happen because there is no 64bit hwdiv support; that put isel's type
legalization and this optimisation in a loop. A native ARM compiler and testing
now shows that this is fixed.
Patch mostly by Pablo Barrio.
Differential Revision: https://reviews.llvm.org/D25077
llvm-svn: 283098
search loop, by Andrey Tischenko
PR27136 shows a failure to hoist a constant out of a loop. This test is used
as a starting point to fix the failure: it shows the current state of codegen
and reveals what should be fixed.
Differential Revision: https://reviews.llvm.org/D25097
llvm-svn: 283091
Summary:
lit's `OneCommandFileTest` class implements an abstract method that
raises if called. However, it raises by referencing an undefined
symbol. Instead, raise explicitly by throwing a `NotImplementedError`.
This is clearer, and appeases Python linters.
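A minimal sketch of the pattern, using hypothetical class and method names rather than lit's own:

```python
class AbstractTestSketch(object):
    """Hypothetical base class; the names are illustrative only."""

    def create_temp_input(self, tmp, test):
        # Raise explicitly instead of referencing an undefined symbol,
        # which would surface as a confusing NameError.
        raise NotImplementedError("subclasses must override create_temp_input")


class ConcreteTestSketch(AbstractTestSketch):
    def create_temp_input(self, tmp, test):
        tmp.append(test)
```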
Patch By Brian Gesiak!
Reviewers: ddunbar, echristo, beanz
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D25170
llvm-svn: 283090
Summary:
In Python, `None` is a singleton, so checking whether a variable is
`None` may be done with `is` or `is not`. This has a slight advantage
over equality comparisons `== None` and `!= None`, since `__eq__` may
be overridden in Python to produce sometimes unexpected results.
Using `is None` and `is not None` is also recommended practice in
https://www.python.org/dev/peps/pep-0008:
> Comparisons to singletons like `None` should always be done with `is` or
> `is not`, never the equality operators.
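A small example of the kind of surprise an overridden `__eq__` can cause. The class is contrived for illustration.

```python
class AlwaysEqual(object):
    """Contrived class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


value = AlwaysEqual()

# The equality check is answered by AlwaysEqual.__eq__, so it says "equal".
equality_says_none = (value == None)  # noqa: E711

# The identity check cannot be overridden and gives the right answer.
identity_says_none = (value is None)
```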
Patch by Brian Gesiak!
Reviewers: ddunbar, echristo, beanz
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D25168
llvm-svn: 283088
The PPC branch-selection pass, which performs branch relaxation, needs to
account for the padding that might be introduced to satisfy block alignment
requirements. We were assuming that the first block was at offset zero (i.e.
had the alignment of the function itself), but under the ELFv2 ABI, a global
entry function prologue is added to the first block, and it is a
two-instruction sequence (i.e. eight-bytes long). If the function has 16-byte
alignment, the fact that the first block is eight bytes offset from the start
of the function is relevant to calculating where padding will be added in
between later blocks.
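The effect can be seen in a simplified layout model. The sketch below is not the pass itself: it assumes offsets are measured from a suitably aligned function start, and the block sizes and alignments in the usage note are made up.

```python
def padding_before(offset, alignment):
    """Bytes of padding needed so that offset becomes a multiple of alignment."""
    return (-offset) % alignment


def block_starts(sizes, alignments, first_block_offset=0):
    """Return the start offset of each block, inserting alignment padding."""
    offset = first_block_offset
    starts = []
    for size, align in zip(sizes, alignments):
        offset += padding_before(offset, align)
        starts.append(offset)
        offset += size
    return starts
```

With a 12-byte first block followed by a block that needs 16-byte alignment, `block_starts([12, 8], [4, 16], 0)` places the second block at offset 16, while `block_starts([12, 8], [4, 16], 8)` places it at offset 32. An eight-byte shift of the first block therefore changes both the padding and every later offset that branch relaxation relies on.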
Unfortunately, I don't have a small test case.
llvm-svn: 283086
I don't know for sure that we truly need this, but it's the only vector load that isn't rematerializable. Making it consistent allows it to not be a special case in the td files.
llvm-svn: 283083
This was first landed in rL283058 and subsequently reverted since a
change this depends on (rL283057) was buggy and had to be reverted.
llvm-svn: 283079
This change teaches getEquivalentICmp to be smarter about generating
ICMP_NE and ICMP_EQ predicates.
An earlier version of this change was landed as rL283057 which had a
use-after-free bug. This new version has a fix for that bug, and a (C++
unittests/) test case that would have triggered it in rL283057.
llvm-svn: 283078
Preemptively scrubbing these to avoid a bot fail as in PR30443:
https://llvm.org/bugs/show_bug.cgi?id=30443
I'm nearly done with a patch to fix these cases, so not trying very
hard to do better for the temporary win.
I plan to use better checks than what the script produces for the vectorized cases.
llvm-svn: 283072
To allow broadcast loads of a non-zero'th vector element, lowerVectorShuffleAsBroadcast can replace a load with a new load with an adjusted address, but unfortunately we weren't ensuring that the new load respected the same dependencies.
This patch adds a TokenFactor and updates all dependencies of the old load to reference the new load instead.
Bug found during internal testing.
Differential Revision: https://reviews.llvm.org/D25039
llvm-svn: 283070
They've broken the sanitizer-bootstrap bots. Reverting while I investigate.
Original commit messages:
r283057: "[ConstantRange] Make getEquivalentICmp smarter"
r283058: "[SCEV] Rely on ConstantRange instead of custom logic; NFCI"
llvm-svn: 283062
This change enables soft-float for PowerPC64, and also makes soft-float disable
all vector instruction sets for both 32-bit and 64-bit modes. This latter part
is necessary because the PPC backend canonicalizes many Altivec vector types to
floating-point types, and so soft-float breaks scalarization support for many
operations. Both for embedded targets and for operating-system kernels desiring
soft-float support, it seems reasonable that disabling hardware floating-point
also disables vector instructions (embedded targets without hardware floating
point support are unlikely to have Altivec, etc. and operating system kernels
desiring not to use floating-point registers to lower syscall cost are unlikely
to want to use vector registers either). If someone needs this to work, we'll
need to change the fact that we promote many Altivec operations to act on
v4f32. To make it possible to disable Altivec when soft-float is enabled,
hardware floating-point support needs to be expressed as a positive feature,
like the others, and not a negative feature, because target features cannot
have dependencies on the disabling of some other feature. So +soft-float has
now become -hard-float.
Fixes PR26970.
llvm-svn: 283060
As per the PE COFF spec (section 8.3, Import Name Type)
Offset: 18 Size 2 bits Name: Type
Offset: 20 Size 3 bits Name: Name Type
Offset: 20 added based on 18+2
Partially committed as rL279069
Differential Revision: https://reviews.llvm.org/D23540
llvm-svn: 283055
Now we can commute to BLENDPD/BLENDPS on SSE41+ targets if necessary, so simplify the combine matching where we can.
This required me to add a couple of scalar math movsd/movss fold patterns that hadn't been needed in the past.
llvm-svn: 283038
Instead of selecting between MOVSD/MOVSS and BLENDPD/BLENDPS at shuffle lowering by subtarget this will help us select the instruction based on actual commutation requirements.
We could possibly add BLENDPD/BLENDPS -> MOVSD/MOVSS commutation and MOVSD/MOVSS memory folding using a similar approach if it proves useful
I avoided adding AVX512 handling as I'm not sure when we should be making use of VBLENDPD/VBLENDPS on EVEX targets
llvm-svn: 283037
Revert the change in r283029 (and the fixup in r283033) due to buildbot
breakage. The fixup is ineffective for the bots that do not force clean
build since the wrong value is already cached in CMakeCache.txt.
Reverting it should result in the cache variable being removed
and therefore it should be possible to re-introduce it after all
buildbots build this revision.
llvm-svn: 283036
Make LIT_COMMAND configurable, use source tree only when actually
available and extend the default search to other common executable names
'lit.py' and 'lit', in order to increase uniformity between all LLVM
projects and support using installed lit.
Changing the conditional used to determine whether in-tree or external
lit is being used covers the case when LLVM_MAIN_SRC_DIR is defined but
does not exist (anymore). In this case, the function falls back to
looking for installed lit rather than attempting to use a non-existing
path. The same conditional is used in clang already.
Making LIT_COMMAND a cache variable in case the source tree variant is
used serves two purposes. Firstly, it increases uniformity between
the two branches since find_program() implicitly makes LIT_COMMAND
a cache variable. Secondly, it allows overriding the lit executable used
to run the tests when the LLVM source tree is provided. Gentoo is
planning to use this to use installed (and byte-compiled) lit instead of
re-compiling it in every LLVM project.
Extending default search is meant to increase uniformity between
different LLVM projects. The 'lit.py' name is already used by a few of
them, and 'lit' is the name used by utils/lit/setup.py when installing.
Differential Revision: https://reviews.llvm.org/D25076
llvm-svn: 283029
Install the OCaml interface .mli files. Those files were most likely
omitted because they are input files for the compiled .cmi files.
However, installing them is reasonable since -- unlike .cmi files --
they are human-readable.
The issue was originally spotted by @jpdeplaix.
Differential Revision: https://reviews.llvm.org/D25128
llvm-svn: 283028
It got disconnected during the cmake conversion. For Miscompilation.cpp,
it was purely advisory for the user and the ToolRunner.cpp version was
trying to compensate for libs and bins in the same directory, which
hasn't been the case for a very long time.
llvm-svn: 283022
-Remove OptForSize. Not all of the backend follows the same rules for creating broadcasts and there is no conflicting pattern.
-Don't stop selecting VEX VMOVDDUP when AVX512 is supported. We need VLX for EVEX VMOVDDUP.
-Only use VMOVDDUP for v2i64 broadcasts if AVX2 is not supported.
llvm-svn: 283020
To lex hash directives we peek ahead to find component tokens, create a
unified token, and unlex the peeked tokens so the parser does not need
to parse the tokens then. Make sure we do not lex another hash
directive during peek operation.
This fixes PR28921.
Reviewers: rnk, loladiro
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D24839
llvm-svn: 282992
The binder is in a specific section that "reverses" the edges in a
regular dead-stripping: the binder is live as long as a global it
references is live.
This is a big hammer that prevents LLVM from dead-stripping these,
while still allowing linker dead-stripping (with special knowledge
of the section).
Differential Revision: https://reviews.llvm.org/D24673
llvm-svn: 282988
We don't need to have singleton ValueMapping on their own, we can just
reuse one of the elements of the 3-ops mapping.
This allows even more code sharing.
NFC.
llvm-svn: 282959
When we create a PDB file using PDBFileBuilder, the information
in the superblock, such as the size of the resulting file, is not
available.
Previously, PDBFileBuilder::initialize took a superblock assuming
that all the members of the struct are correct. That is useful when
you want to restore the exact information from a YAML file, but
that's probably the only use case in which that is useful.
When we are creating a PDB file on the fly, we have to backfill the
members.
This patch redefines PDBFileBuilder::initialize to take only a
block size. Now all the other members are left as default values,
so that they'll be updated when commit() is called.
Differential Revision: https://reviews.llvm.org/D25108
llvm-svn: 282944
WritableStream needs the exact file size to open a file, but
until we fix the final layout of a PDB file, we don't know the
size of the file.
This patch changes the parameter type of PDBFileBuilder::commit
to solve that chicken-and-egg problem. Now the function opens
a file after fixing the layout, so it can create a file with the
exact size.
Differential Revision: https://reviews.llvm.org/D25107
llvm-svn: 282940
We can't use Jcc to leave a Win64 function in general, because that
confuses the unwinder. However, for "leaf" functions, that is, functions
where the return address is always on top of the stack and which don't
have unwind info, it's OK.
Differential Revision: https://reviews.llvm.org/D24836
llvm-svn: 282920
Summary:
In the case below, %Result.i19 is defined between coro.save and coro.suspend and used after coro.suspend. We need to correctly place such a value into the coroutine frame.
```
%save = call token @llvm.coro.save(i8* null)
%Result.i19 = getelementptr inbounds %"struct.lean_future<int>::Awaiter", %"struct.lean_future<int>::Awaiter"* %ref.tmp7, i64 0, i32 0
%suspend = call i8 @llvm.coro.suspend(token %save, i1 false)
switch i8 %suspend, label %exit [
i8 0, label %await.ready
i8 1, label %exit
]
await.ready:
%val = load i32, i32* %Result.i19
```
Reviewers: majnemer
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D24418
llvm-svn: 282902
Summary:
Without the fix, if there was a function inlined into the coroutine with debug information, CloneFunctionInto(NewF, &F, VMap, /*ModuleLevelChanges=*/true, Returns); would duplicate all of the debug information including the DICompileUnit.
We now use VMap to indicate that debug metadata for a File, Unit and FunctionType should not be duplicated when we create clones that will become f.resume, f.destroy and f.cleanup.
Reviewers: majnemer
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D24417
llvm-svn: 282899
Summary: Not all coro.subfn.addr intrinsics can be eliminated in CoroElide through devirtualization. Those that remain need to be lowered in CoroCleanup.
Reviewers: majnemer
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D24412
llvm-svn: 282897
Add a OCAML_INSTALL_PATH variable that can be used to control
the install path for OCaml libraries. The new variable defaults to
${OCAML_STDLIB_PATH}, i.e. the OCaml library path obtained from
the OCaml compiler. Install libraries into "llvm" subdirectory.
This fixes two issues:
1. OCaml library directories differ between systems, and 'lib/ocaml' is
incorrect e.g. on amd64 Gentoo where OCaml is installed
in 'lib64/ocaml'. Therefore, obtain the library path from the OCaml
compiler using 'ocamlc -where' (which is already used to set
OCAML_STDLIB_PATH), which is the method used commonly in OCaml packages.
2. The top-level directory is reserved for the standard library, and has
precedence over local directory in search path. As a result, OCaml
preferred the files installed along with previous LLVM version over the
source tree when building a new version, resulting in two versions being
mixed during the build. The new layout is used commonly by other OCaml
packages, and findlib is able to find the LLVM libraries successfully.
Bug: https://bugs.gentoo.org/559134
Bug: https://bugs.gentoo.org/559624
Differential Revision: https://reviews.llvm.org/D24354
llvm-svn: 282895
Summary: Debug info should *not* affect optimization decisions. This patch updates loop unroller cost model to make it not affected by debug info.
Reviewers: davidxl, mzolotukhin
Subscribers: haicheng, llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D25098
llvm-svn: 282894
Register stackification currently checks VNInfo for changes. Make that
more accurate by testing each intervening instruction for any other defs
to the same virtual register.
Patch by Jacob Gravelle
Differential Revision: https://reviews.llvm.org/D24942
llvm-svn: 282886
Summary:
This patch is adding the support for a shadow memory with
dynamically allocated address range.
The compiler-rt needs to export a symbol containing the shadow
memory range.
This is required to support ASAN on 64-bit Windows.
Reviewers: kcc, rnk, vitalybuka
Subscribers: zaks.anna, kubabrecka, dberris, llvm-commits, chrisha
Differential Revision: https://reviews.llvm.org/D23354
llvm-svn: 282881
The CL was originally failing due to the use of some C++14
specific features, so I've removed those. Hopefully this will
satisfy the bots.
llvm-svn: 282867
When building the steps for scalar induction variables, we previously attempted
to determine if all the scalar users of the induction variable were uniform. If
they were, we would only emit the step corresponding to vector lane zero. This
optimization was too aggressive. We generally don't know the entire set of
induction variable users that will be scalar. We have
isScalarAfterVectorization, but this is only a conservative estimate of the
instructions that will be scalarized. Thus, an induction variable may have
scalar users that aren't already known to be scalar. To avoid emitting unused
steps, we can only check that the induction variable is uniform. This should
fix PR30542.
Reference: https://llvm.org/bugs/show_bug.cgi?id=30542
llvm-svn: 282863
Summary:
This change adds the AVR assembly instruction printer.
No tests are included in this patch. I have left them downstream so we can
add them once `llc` successfully runs (there are very few components left
to upstream before this).
Reviewers: arsenm, kparzysz
Subscribers: wdng, beanz, mgorny
Differential Revision: https://reviews.llvm.org/D25028
llvm-svn: 282854
Summary:
Previously, when allocating unspillable live ranges, we would never
attempt to split. We would always bail out and try last ditch graph
recoloring.
This patch changes this by attempting to split all live intervals before
performing recoloring.
This fixes LLVM bug PR14879.
I can't add test cases for any backends other than AVR because none of
them have small enough register classes to trigger the bug.
Reviewers: qcolombet
Subscribers: MatzeB
Differential Revision: https://reviews.llvm.org/D25070
llvm-svn: 282852
When LLVM_INSTALL_TOOLCHAIN_ONLY is used and LLVM_TOOLCHAIN_TOOLS
contains a tool which is a symlink, the tool was ignored. This used to
work but was broken in r282510.
Differential Revision: https://reviews.llvm.org/D25067
llvm-svn: 282844
I'm not completely sure what this method does or why all the 256-bit VTs returned VR128RegClass when the comments on the method definition say it should return the largest super register class. I just figured AVX-512 should be similar.
llvm-svn: 282836
If AVX512 is disabled, the registers should already be marked reserved. Pattern predicates and register classes on instructions should take care of most of the rest. Loads/stores and physical register copies for XMM16-31 and YMM16-31 without VLX have already been taken care of.
I'm a little unclear why this changed the register allocation of the SSE2 run of the sad.ll test, but the registers selected appear to be valid after this change.
llvm-svn: 282835
Summary:
We don't want to decay hot callsites to import chains of hot
callsites. The same mechanism is used in LIPO.
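A minimal sketch of the idea, assuming the import threshold is scaled by a per-level factor as the importer walks down a call chain. The function name and the percentages below are illustrative assumptions, not the values or API used by the patch.

```cpp
// Illustrative decay percentages; both names and values are assumptions.
constexpr unsigned ColdDecayPercent = 70; // threshold shrinks at each level
constexpr unsigned HotDecayPercent = 100; // hot callsites keep the threshold

// Compute the threshold to use for callees reached through this callsite.
unsigned nextImportThreshold(unsigned Threshold, bool IsHotCallsite) {
  unsigned Percent = IsHotCallsite ? HotDecayPercent : ColdDecayPercent;
  return Threshold * Percent / 100;
}
```

Under these assumptions a chain of hot callsites keeps its full threshold at every level, while a cold chain shrinks geometrically.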
Reviewers: tejohnson, eraman, mehdi_amini
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D24976
llvm-svn: 282833
For some reason there are both of these available, except
for scalar 64-bit compares, which only have u64. I'm not sure
why there are both (I'm guessing it's for the one-bit inputs we
don't use), but for consistency, always use the
unsigned one.
llvm-svn: 282832
Summary:
This lets people link against LLVM and their own version of the UTF
library.
I determined this only affects llvm, clang, lld, and lldb by running
$ git grep -wl 'UTF[0-9]\+\|\bConvertUTF\bisLegalUTF\|getNumBytesFor' | cut -f 1 -d '/' | sort | uniq
clang
lld
lldb
llvm
Tested with
ninja lldb
ninja check-clang check-llvm check-lld
(ninja check-lldb doesn't complete for me with or without this patch.)
Reviewers: rnk
Subscribers: klimek, beanz, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D24996
llvm-svn: 282822
This uses a TableGen'ed-like structure for all 3-operand instructions.
The output of the RegBankSelect pass should be identical, but the
RegisterBankInfo will do fewer dynamic allocations.
llvm-svn: 282817
(Recommit after making sure IsVerbose gets properly initialized in
DiagnosticInfoOptimizationBase. See previous commit that takes care of
this.)
OptimizationRemarkAnalysis directly takes the role of the report that is
generated by LAA.
Then we need the magic to be able to turn an LAA remark into an LV
remark. This is done via a new OptimizationRemark ctor.
llvm-svn: 282813
enumerate allows you to iterate over a range by pairing the
iterator's value with its index in the enumeration. This gives
you most of the benefits of using a for loop while still allowing
the range syntax.
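To make the behaviour concrete, here is a small self-contained sketch of an enumerate-style adaptor. It is not LLVM's implementation; the `Enumerator` class and the `Index`/`Value` field names are assumptions chosen for illustration.

```cpp
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Minimal stand-in: wraps a range and pairs each element with its index.
template <typename Range> class Enumerator {
  Range &R;

public:
  // What the loop variable sees: the position and a reference to the element.
  struct Item {
    std::size_t Index;
    decltype(*std::begin(std::declval<Range &>())) Value;
  };

  class iterator {
    using Inner = decltype(std::begin(std::declval<Range &>()));
    std::size_t Index;
    Inner It;

  public:
    iterator(std::size_t Index, Inner It) : Index(Index), It(It) {}
    Item operator*() const { return Item{Index, *It}; }
    iterator &operator++() {
      ++Index;
      ++It;
      return *this;
    }
    // Only the underlying iterator decides when the loop ends.
    bool operator!=(const iterator &RHS) const { return It != RHS.It; }
  };

  explicit Enumerator(Range &R) : R(R) {}
  iterator begin() { return iterator(0, std::begin(R)); }
  iterator end() { return iterator(0, std::end(R)); } // index unused here
};

template <typename Range> Enumerator<Range> enumerate(Range &R) {
  return Enumerator<Range>(R);
}
```

In a loop such as `for (auto X : enumerate(V))`, `X.Index` is the position and `X.Value` refers to the element itself, so writes through `X.Value` modify the underlying range.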
llvm-svn: 282804
Also, make foldSelectExtConst() a member of InstCombiner, remove
unnecessary parameters from its interface, and group visitSelectInst
helpers together in the header file.
llvm-svn: 282796
load command that uses the MachO::entry_point_command type, which is
not used in llvm libObject code but is used in llvm tool code.
This includes just the LC_MAIN load command.
llvm-svn: 282766
OptimizationRemarkAnalysis directly takes the role of the report that is
generated by LAA.
Then we need the magic to be able to turn an LAA remark into an LV
remark. This is done via a new OptimizationRemark ctor.
llvm-svn: 282758
Instead of producing a mapping for all the operands, we only generate a
mapping for the definition. Indeed, the other operands are not
constrained by the instruction and thus, we should leave the choice to
the actual definition to do the right thing.
In practice this is almost NFC, but with one advantage. We will have only
one instance of OperandsMapping for each copy and phi that map to one
register bank instead of one different instance for each different
number of operands for each copy and phi.
llvm-svn: 282756
The VS debugger doesn't appear to understand the 0x68 or 0x69 type
indices, which were probably intended for use on a platform where a C
'int' is 8 bits. So, use the character types instead. Clang was already
using the character types because '[u]int8_t' is usually defined in
terms of 'char'.
See the Rust issue for screenshots of what VS does:
https://github.com/rust-lang/rust/issues/36646
Fixes PR30552
llvm-svn: 282739
load command that uses the MachO::source_version_command type, which is
not used in llvm libObject code but is used in llvm tool code.
This includes just the LC_SOURCE_VERSION load command.
llvm-svn: 282736
Summary:
Not a tuned-up heuristic, but even this small heuristic gives about a
+0.10% improvement on SPEC 2006.
Reviewers: tejohnson, mehdi_amini, eraman
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D24940
llvm-svn: 282733
The last one remaining after which emitAnalysis can be removed is when
we convert the LAA's report to a vectorization report. This requires
converting LAA to the new interface first.
llvm-svn: 282726
The shuffle mask decodes have a large amount of repeated code extracting/splitting mask values from Constant data.
This patch pulls all of this duplicated code into a single helper function to identify undef elements and combine/split constant integer data into the requested shuffle mask elements.
Updated PSHUFB/VPERMIL/VPERMIL2/VPPERM decoders to use it (VPERMV/VPERMV3 could be converted as well in the future).
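A rough sketch of what such a helper might look like, assuming constant elements are repacked in little-endian bit order and that an output element counts as undef only when every bit feeding it is undef. The `Elt` type, the `repackMaskElts` name, and the undef rule are assumptions for illustration, not the code in this patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mask element: raw bits plus an undef flag.
struct Elt {
  uint64_t Bits;
  bool Undef;
};

// Repack SrcBits-wide elements into DstBits-wide elements (both <= 64),
// treating element 0 as the least significant part of the data.
// Undef bits are read as zero; an output element is undef only if all of
// the bits it covers are undef (an assumption made for this sketch).
std::vector<Elt> repackMaskElts(const std::vector<Elt> &Src, unsigned SrcBits,
                                unsigned DstBits) {
  // Flatten the source into per-bit value and undef arrays.
  std::vector<bool> Bits, Undef;
  for (const Elt &E : Src)
    for (unsigned B = 0; B < SrcBits; ++B) {
      Bits.push_back(!E.Undef && ((E.Bits >> B) & 1));
      Undef.push_back(E.Undef);
    }

  // Regroup the flat bits into elements of the requested width.
  std::vector<Elt> Dst;
  for (std::size_t Base = 0; Base + DstBits <= Bits.size(); Base += DstBits) {
    Elt Out{0, true};
    for (unsigned B = 0; B < DstBits; ++B) {
      if (Bits[Base + B])
        Out.Bits |= uint64_t(1) << B;
      if (!Undef[Base + B])
        Out.Undef = false;
    }
    Dst.push_back(Out);
  }
  return Dst;
}
```

The same routine handles both directions: splitting one 16-bit constant into two 8-bit mask elements, or combining two 8-bit constants into one 16-bit element.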
llvm-svn: 282720