Computing the reaching definition in forwardTree() can take a long time
if the coefficients are large. When the forwarding is
carried-out (doIt==true), forwardTree() must execute entirely or not at
all to get a consistent output, which means we cannot just allow
out-of-quota errors to happen in the middle of the processing.
We introduce the class IslQuotaScope which allows to opt-in code that is
conformant and has been tested with out-of-quota events. In case of
ForwardOpTree, out-of-quota is allowed during the operand tree
examination, but not during the transformation. The same forwardTree()
recursion is used for examination and execution, meaning that the
reaching definition has already been computed in the examination tree
walk and cached for reuse in the transformation tree walk.
This should fix the time-out of grtestutils.ll of the asop buildbot. If
the compilation still takes too long, we can reduce the max-operations
allows for -polly-optree.
Differential Revision: https://reviews.llvm.org/D37984
llvm-svn: 313690
Summary: Introduces the llvm-cfi-verify tool to llvm. Includes the design document (docs/CFIVerify.rst). Current implementation of the tool is simply a disassembler that identifies and prints the indirect control flow instructions.
Reviewers: vlad.tsyrklevich
Reviewed By: vlad.tsyrklevich
Patch by Mitch Phillips
Subscribers: llvm-commits, kcc, pcc, mgorny
Differential Revision: https://reviews.llvm.org/D37937
llvm-svn: 313688
Unreachable blocks in the machine instr representation are these
weird empty blocks with no successors.
The MIR printer used to not print empty lists of successors. However,
the MIR parser now treats non-printed list of successors as "please
guess it for me". As a result, the parser tries to guess the list of
successors and given the block is empty, just assumes it falls through
the next block (if any).
For instance, the following test case used to fail the verifier.
The MIR printer would print
entry
/ \
true (def) false (no list of successors)
|
split.true (use)
The MIR parser would understand this:
entry
/ \
true (def) false
| / <-- invalid edge
split.true (use)
Because of the invalid edge, we get the "def does not
dominate all uses" error.
The fix consists in printing empty successor lists, so that the parser
knows what to do for unreachable blocks.
rdar://problem/34022159
llvm-svn: 313685
The recently behavior in the code that these tests were meant to be checking will be ammended as soon as a suitable change can be properly reviewed.
llvm-svn: 313684
I didn't initialize a pointer to be nullptr that I needed to.
This change adds support for nested and even overlapping segments. This means
that PT_PHDR, PT_GNU_RELRO, PT_TLS, and PT_DYNAMIC can be supported properly.
Differential Revision: https://reviews.llvm.org/D36558
llvm-svn: 313682
CieRecord is a struct containing a CIE and FDEs, but oftentimes the
struct itself is named `Cie` which caused some confusion. This patch
renames them `CieRecords` or `Rec`.
llvm-svn: 313681
The ARM docs suggest in examples that the flags can have either case, and there
are applications in the wild that (libopencm3, for example) that expect to be
able to use the uppercase spelling.
https://reviews.llvm.org/D37953
llvm-svn: 313680
This reverts r313640, originally r313400, one more time for essentially
the same issue. My BitVector of spilled location numbers isn't working
because we coalesce identical DBG_VALUE locations as we rewrite them,
invalidating the location numbers used to index the BitVector.
llvm-svn: 313679
Summary: In the ThinLTO compilation, if a function is inlined in the profiling binary, we need to inline it before annotation. If the callee is not available in the primary module, a first step is needed to import that callee function. For the current implementation, if the call is an indirect call, which has been promoted to >1 targets and inlined, SamplePGO will only import one target with the largest sample count. This patch fixed the bug to import all targets instead.
Reviewers: tejohnson, davidxl
Reviewed By: tejohnson
Subscribers: sanjoy, llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D36637
llvm-svn: 313678
Summary:
As requested by Sam McCall:
> Enums (not new I guess). Typical case: if (enum < 0 || enum > MAX)
> The warning strongly suggests that the enum < 0 check has no effect
> (for enums with nonnegative ranges).
> Clang doesn't seem to optimize such checks out though, and they seem
> likely to catch bugs in some cases. Yes, only if there's UB elsewhere,
> but I assume not optimizing out these checks indicates a deliberate
> decision to stay somewhat compatible with a technically-incorrect
> mental model.
> If this is the case, should we move these to a
> -Wtautological-compare-enum subcategory?
Reviewers: rjmccall, rsmith, aaron.ballman, sammccall, bkramer, djasper
Reviewed By: aaron.ballman
Subscribers: jroelofs, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D37629
llvm-svn: 313677
Summary:
There is no benefit in having the 4-byte alignment, and removing this
restriction can save a lot of space for some applications.
Reviewers: asl, awygle
Reviewed By: awygle
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36165
llvm-svn: 313676
When the value specified for n in ordered(n) is larger than the number of loops a segmentation fault can occur in one of two ways when attempting to print out a diagnostic for an associated depend(sink : vec):
1) The iteration vector vec contains less than n items
2) The iteration vector vec contains a variable that is not a loop control variable
This patch addresses both of these issues.
Differential Revision: https://reviews.llvm.org/D38049
llvm-svn: 313675
The generated DAG isel file currently makes use of formatted_raw_ostream primarily for generating a hierarchical representation while also skipping over the initial comment that contains the current index.
It was reported in D37957 that this formatting might be slow due to the need to keep track of column numbers by monitoring all the written data for new lines.
This patch attempts to rewrite the emitter to make use of simpler formatting mechanisms to generate a fairly similar output. The main difference is that the number in the index comment is now right justified and padded with spaces inside the comment. Previously we appended the spaces after the comment.
Differential Revision: https://reviews.llvm.org/D37966
llvm-svn: 313674
The pre-RA scheduler does load/store clustering, but post-RA
scheduler undoes it. Add mutation to prevent it.
Differential Revision: https://reviews.llvm.org/D38014
llvm-svn: 313670
SystemZTargetLowering::combineSTORE contains code to transform a
combination of STORE + BSWAP into a STRV type instruction.
This transformation is correct for regular stores, but not for
truncating stores. The routine neglected to check for that case.
Fixes a miscompilation of llvm-objcopy with clang, which caused
test suite failures in the SystemZ multistage build bot.
llvm-svn: 313669
This reverts commit SVN r313654. Seems that it is triggering an
assertion on Windows specifically. Revert until I can build on Windows
and look into what is happening there.
llvm-svn: 313668
Inputs should be placed local to the test (or possibly in a common
parent? I think we do that in some places - but the only common parent
between these two directories is 'test' which seems a bit overly broad).
llvm-svn: 313662
This change adds a test that checks the an error is produced when a hexagon
specific reserved section index is used but e_machine is not EM_HEXAGON.
Differential Revision: https://reviews.llvm.org/D38017
llvm-svn: 313661
Add some member types to MachineValueTypeSet::const_iterator so that
iterator_traits can work with it.
Improve TableGen performance of -gen-dag-isel (motivated by X86 backend)
The introduction of parameterized register classes in r313271 caused the
matcher generation code in TableGen to run much slower, particularly so
in the unoptimized (debug) build. This patch recovers some of the lost
performance.
Summary of changes:
- Cache the set of legal types in TypeInfer::getLegalTypes. The contents
of this set do not change.
- Add LLVM_ATTRIBUTE_ALWAYS_INLINE to several small functions. Normally
this would not be necessary, but in the debug build TableGen is not
optimized, so this helps a little bit.
- Add an early exit from TypeSetByHwMode::operator== for the case when
one or both arguments are "simple", i.e. only have one mode. This
saves some time in GenerateVariants.
- Finally, replace the underlying storage type in TypeSetByHwMode::SetType
with MachineValueTypeSet based on std::array instead of std::set.
This significantly reduces the number of memory allocation calls.
I've done a number of experiments with the underlying type of InfoByHwMode.
The type is a map, and for targets that do not use the parameterization,
this map has only one entry. The best (unoptimized) performance, somewhat
surprisingly came from std::map, followed closely by std::unordered_map.
DenseMap was the slowest by a large margin.
Various hand-crafted solutions (emulating enough of the map interface
not to make sweeping changes to the users) did not yield any observable
improvements.
llvm-svn: 313660
When symbolizing large binaries, parsing every CU in a DWP file is a
significant performance penalty. Instead, use the index to only load the
CUs that are needed.
llvm-svn: 313659
Summary: Fix the bug when promoted call return type mismatches with the promoted function, we should not try to inline it. Otherwise it may lead to compiler crash.
Reviewers: davidxl, tejohnson, eraman
Reviewed By: tejohnson
Subscribers: llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D38018
llvm-svn: 313658
This change adds support for nested and even overlapping segments. This means
that PT_PHDR, PT_GNU_RELRO, PT_TLS, and PT_DYNAMIC can be supported properly.
Differential Revision: https://reviews.llvm.org/D36558
llvm-svn: 313656
The main change is to avoid setting the process state as running when
debugging core/minidumps (details in the bug). Also included a few small,
related fixes around how the errors propagate in this case.
Fixed the FreeBSD/Windows break: the intention was to keep
Process::WillResume() and Process::DoResume() "in-sync", but this had the
unfortunate consequence of breaking Process sub-classes which don't override
WillResume().
The safer approach is to keep Process::WillResume() untouched and only
override it in the minidump and core implementations.
patch by lemo
Bug: https://bugs.llvm.org/show_bug.cgi?id=34532
Differential Revision: https://reviews.llvm.org/D37651
llvm-svn: 313655
Add support for the R_AARCH64_ABS{16,32} relocations in the execution
engine. This is primarily used for DWARF debug information relocations
and needed by the LLVM JIT to support JITing for lldb.
Patch by Alex Langford!
llvm-svn: 313654
The motivation is to be able to check sources outside the current
directory. See D31363 for example usage.
Differential Revision: https://reviews.llvm.org/D37859
llvm-svn: 313648
The introduction of parameterized register classes in r313271 caused the
matcher generation code in TableGen to run much slower, particularly so
in the unoptimized (debug) build. This patch recovers some of the lost
performance.
Summary of changes:
- Cache the set of legal types in TypeInfer::getLegalTypes. The contents
of this set do not change.
- Add LLVM_ATTRIBUTE_ALWAYS_INLINE to several small functions. Normally
this would not be necessary, but in the debug build TableGen is not
optimized, so this helps a little bit.
- Add an early exit from TypeSetByHwMode::operator== for the case when
one or both arguments are "simple", i.e. only have one mode. This
saves some time in GenerateVariants.
- Finally, replace the underlying storage type in TypeSetByHwMode::SetType
with MachineValueTypeSet based on std::array instead of std::set.
This significantly reduces the number of memory allocation calls.
I've done a number of experiments with the underlying type of InfoByHwMode.
The type is a map, and for targets that do not use the parameterization,
this map has only one entry. The best (unoptimized) performance, somewhat
surprisingly came from std::map, followed closely by std::unordered_map.
DenseMap was the slowest by a large margin.
Various hand-crafted solutions (emulating enough of the map interface
not to make sweeping changes to the users) did not yield any observable
improvements.
llvm-svn: 313647
Given a linker script that ends in
.some_sec { ...} ;
__stack_start = .;
. = . + 0x2000;
__stack_end = .;
lld would put orphan sections like .comment before __stack_end,
corrupting the intended meaning.
The reason we don't normally move orphans past assignments to . is to
avoid breaking
rx_sec : { *(rx_sec) }
. = ALIGN(0x1000);
/* The RW PT_LOAD starts here*/
but in this case, there is nothing after and it seems safer to put the
orphan section last. This seems to match bfd's behavior and is
convenient for writing linker scripts that care about the layout of
SHF_ALLOC sections, but not of any non SHF_ALLOC sections.
llvm-svn: 313646
Similar to what we do for X86ISD::SHRUNKBLEND just turn X86ISD::SELECT into ISD::VSELECT. This allows us to remove the duplicated TRUNC patterns.
Differential Revision: https://reviews.llvm.org/D38022
llvm-svn: 313644
After speaking with the libcxx owners, they agreed that this is
a bug in the bot that needs to be fixed by the bot owners, and
the CMake changes are correct.
llvm-svn: 313643
This causes a linker error because of duplicate symbol since
ReportDeadlySignal is defined both in sanitizer_common_libcdep and
sanitizer_fuchsia.
Differential Revision: https://reviews.llvm.org/D37952
llvm-svn: 313641