I think there are some destruction ordering issues here. The
ShouldDelete map seems to be getting destroyed before the shared_ptr
deleter lambda accesses it. In any case, this avoids inserting elements
into the map during shutdown.
llvm-svn: 306736
This patch verifies the number of atoms, the validity of the form for each atom, as well as the validity of the
hashdata. For hashdata, we're verifying that the hashdata offset is correct and that the offset in the .debug_info for
each DIE in the hashdata is also valid.
llvm-svn: 306735
The style guide states that the explicit `inline`
should not be used with inline methods. classof is
very common inline method with a fair amount on
inconsistency:
$ git grep classof ./include | grep inline | wc -l
230
$ git grep classof ./include | grep -v inline | wc -l
257
I chose to target this method rather the larger change
since this method is easily cargo-culted (I did it at
least once). I considered doing the larger change and
removing all occurrences but that would be a much larger
change.
Differential Revision: https://reviews.llvm.org/D33906
llvm-svn: 306731
Summary:
This patch adds an additional level of verification - it checks parent and sibling properties of a tree. By definition, every tree with these two properties is a dominator tree.
It is possible to run those check by running llvm with `-verify-dom-info=1`.
Bootstrapping clang and building the llvm test suite with this option enabled doesn't yield any errors.
Reviewers: dberlin, sanjoy, chandlerc
Reviewed By: dberlin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34482
llvm-svn: 306711
Summary:
Indices for GEPs that index into a struct type should always be
constants. This added more checks in `collectConstantCandidates:` which make
sure constants for GEP pointer type are not hoisted.
This fixed Bug https://bugs.llvm.org/show_bug.cgi?id=33538
Reviewers: ributzka, rnk
Reviewed By: ributzka
Subscribers: efriedma, llvm-commits, srhines, javed.absar, pirama
Differential Revision: https://reviews.llvm.org/D34576
llvm-svn: 306704
Requires callers to directly associate relocations with a DataExtractor
used to read data from a DWARF section, which helps a callee not make
assumptions about which section it is reading.
This is the next step in reducing DWARFFormValue's dependence on DWARFUnit.
Differential Revision: https://reviews.llvm.org/D34704
llvm-svn: 306699
In LLVM IR the following code:
%r = urem <ty> %t, %b
is equivalent to:
%q = udiv <ty> %t, %b
%s = mul <ty> nuw %q, %b
%r = sub <ty> nuw %t, %q ; (t / b) * b + (t % b) = t
As UDiv, Mul and Sub are already supported by SCEV, URem can be
implemented with minimal effort this way.
Note: While SRem and SDiv are also related this way, SCEV does not
provides SDiv yet.
llvm-svn: 306695
Relanding after restricting equalBaseIndex to not erroneuosly consider
a FrameIndices stemming from alloca from being comparable as its
offset is set post-selectionDAG.
Pull FrameIndex comparision reasoning from DAGCombiner::isAlias to
general BaseIndexOffset.
llvm-svn: 306688
For networking-type bpf program, it often needs to access
packet data. A context data structure is provided to the bpf
programs with two fields:
u32 data;
u32 data_end;
User can access these two fields with ctx->data and ctx->data_end.
During program verification process, the kernel verifier modifies
the bpf program with loading of actual pointer value from kernel
data structure.
r = ctx->data ===> r = actual data start ptr
r = ctx->data_end ===> r = actual data end ptr
A typical program accessing ctx->data like
char *data_ptr = (char *)(long)ctx->data
will result in a 32-bit load followed by a zero extension.
Such an operation is combined into a single LDW in DAG combiner
as bpf LDW does zero extension automatically.
In cases like the below (which can be a result of global value numbering
and partial redundancy elimination before insn selection):
B1:
u32 a = load-32-bit &ctx->data
u64 pa = zext a
...
B2:
u32 b = load-32-bit &ctx->data
u64 pb = zext b
...
B3:
u32 m = PHI(a, b)
u64 pm = zext m
In B3, "pm = zext m" cannot be removed, which although is legal
from compiler perspective, will generate incorrect code after
kernel verification.
This patch recognizes this pattern and traces through PHI node
to see whether the operand of "zext m" is defined with LDWs or not.
If it is, the "zext m" itself can be removed.
The patch also recognizes the pattern where the load and use of
the load value not in the same basic block, where truncate operation
may be removed as well.
The patch handles 1-byte, 2-byte and 4-byte truncation.
Two test cases are added to verify the transformation happens properly
for the above code pattern.
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 306685
This patch fixes a verification error with -verify-machineinstrs while expanding __tls_get_addr by not creating ADJCALLSTACKUP and ADJCALLSTACKDOWN if there is another ADJCALLSTACKUP in this basic block since nesting ADJCALLSTACKUP/ADJCALLSTACKDOWN is not allowed.
Here, ADJCALLSTACKUP and ADJCALLSTACKDOWN are created as a fence for instruction scheduling to avoid _tls_get_addr is scheduled before mflr in the prologue (https://bugs.llvm.org//show_bug.cgi?id=25839). So if another ADJCALLSTACKUP exists before _tls_get_addr, we do not need to create a new ADJCALLSTACKUP.
Differential Revision: https://reviews.llvm.org/D34347
llvm-svn: 306678
Because of mistake introduced in r306517,
wrong variable ("name" instead of "Name") was used
in error message.
As a result it reported section name instead of
relocation name.
This file still needs cleanup to match LLVM coding style
and more tests I think.
llvm-svn: 306677
The changes are a result of discussion of https://reviews.llvm.org/D33685.
It solves the following problem:
1. We can inform getGEPCost about simplified indices to help it with
calculating the cost. But getGEPCost does not take into account the
context which GEPs are used in.
2. We have getUserCost which can take the context into account but we cannot
inform about simplified indices.
With the changes getUserCost will have access to additional information
as getGEPCost has.
The one parameter getUserCost is also provided.
Differential Revision: https://reviews.llvm.org/D34057
llvm-svn: 306674
The difference from the previous version is the use of decltype, as the
implementation of std::result_of in libc++ did not work correctly for
variadic function like open(2).
Original summary:
This function retries an operation if it was interrupted by a signal
(failed with EINTR). It's inspired by the TEMP_FAILURE_RETRY macro in
glibc, but I've turned that into a template function. I've also added a
fail-value argument, to enable the function to be used with e.g.
fopen(3), which is documented to fail for any reason that open(2) can
fail (which includes EINTR).
The main user of this function will be lldb, but there were also a
couple of uses within llvm that I could simplify using this function.
Reviewers: zturner, silvas, joerg
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D33895
llvm-svn: 306671
Summary:
Support vector type G_MERGE_VALUES selection. For now G_MERGE_VALUES marked as legal for any type, so nothing to do in legalizer.
Split from https://reviews.llvm.org/D33665
Reviewers: qcolombet, t.p.northover, zvi, guyblank
Reviewed By: guyblank
Subscribers: rovka, kristof.beyls, guyblank, llvm-commits
Differential Revision: https://reviews.llvm.org/D33958
llvm-svn: 306665
Summary:
TBB and THH allow using a Thumb GPR or the PC as destination operand.
A few machine verifier failures where due to those instructions not
expecting PC as destination operand.
Add -verify-machineinstrs to test/CodeGen/ARM/jump-table-tbh.ll to add
test coverage even if expensive checks are disabled.
Reviewers: MatzeB, t.p.northover, jmolloy
Reviewed By: MatzeB
Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D34610
llvm-svn: 306654
Examining a large profile example, it seems relatively few records have
non-empty IndirectCall and MemOP data, so indirecting these through a
unique_ptr (non-null only when they are non-empty) Reduces memory usage
on this particular example from 14GB to 10GB according to valgrind's
massif.
I suspect it'd still be worth moving InstrProfWriter to its own data
structure that had Counts and the indirected IndirectCall+MemOP, and did
not include the Name, Hash, or Error fields. This would reduce the size
of this dominant data structure by half of this new, lower amount.
(Name(2), Hash(1), Error(1) ~= Counts(vector, 3), ValueProfData
(unique_ptr, 1))
-> From code review feedback, might actually refactor InstrProfRecord
itself to have a sub-struct with all the counts, and use that from
InstrProfWriter, rather than InstrProfWriter owning its own data
structure for this.
Reviewers: davidxl
Differential Revision: https://reviews.llvm.org/D34694
llvm-svn: 306631
This reverts commit d4c7e9fc63c10dbab0c30186ef8575474a704496.
This is done in order to address the failure of CrWinClangLLD etc. bots.
These throw an error of "side-by-side configuration is incorrect" during
compilation, which sounds suspiciously related to these manifest
changes.
Revert "Switch external cvtres.exe for llvm's own resource library."
This reverts commit 71fe8ef283a9dab9a3f21432c98466cbc23990d1.
llvm-svn: 306618
Summary:
As discussed on the mailing list it is legal to propagate TBAA to loads/stores
from/to smaller regions of a larger load tagged with TBAA. Do so for
(load->extractvalue)=>(gep->load) and similar foldings.
Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D31954
llvm-svn: 306615
Instead of creating symbols directly in the findChildren methods of the native
symbol implementations, they will rely on the NativeSession to act as a factory
for these types. This lets NativeSession cache the NativeRawSymbols in its
new symbol cache and makes that cache the source of unique IDs for the symbols.
Right now, this affects only NativeCompilandSymbols. There's no external
change yet, so I think the existing tests are still sufficient. Coming soon
are patches to extend this to built-in types and enums.
llvm-svn: 306610
Given no NaNs and no signed zeroes it folds:
(fmul X, (select (fcmp X > 0.0), -1.0, 1.0)) -> (fneg (fabs X))
(fmul X, (select (fcmp X > 0.0), 1.0, -1.0)) -> (fabs X)
Differential Revision: https://reviews.llvm.org/D34579
llvm-svn: 306592
r306381 caused PR33613, by reversing the order in which insertelements were
generated per unroll part. This patch fixes PR33613 by retraining this order,
placing each set of insertelements per part immediately after the last scalar
being packed for this part. Includes a test case derived from PR33613.
Reference: https://bugs.llvm.org/show_bug.cgi?id=33613
Differential Revision: https://reviews.llvm.org/D34760
llvm-svn: 306575
Some conditional branch instructions generated by this pass are checking
the wrong condition code. The instructions TBZ and TBNZ are transformed
into B.GE and B.LT instead of B.PL and B.MI respectively. They should
only be checking the Negative bit.
Differential Revision: https://reviews.llvm.org/D34743
llvm-svn: 306550
The current heuristic in isProfitableToIfCvt assumes we have a branch predictor,
and so gives the wrong answer in some cases when we don't. This patch adds a
subtarget feature to indicate that a subtarget has no branch predictor, and
changes the heuristic in isProfitableToiIfCvt when it's present. This gives a
slight overall improvement in a set of embedded benchmarks on Cortex-M4 and
Cortex-M33.
Differential Revision: https://reviews.llvm.org/D34398
llvm-svn: 306547
Summary:
I was testing using this expansion logic in other cases besides
NVPTX, and found some runtime failures due to the lack of a check
for a zero length memcpy/memset before the loop. There is already
such a check in the memmove expansion code though.
Reviewers: hfinkel
Subscribers: jholewinski, wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D34707
llvm-svn: 306541
CFI instructions that set appropriate cfa offset and cfa register are now
inserted in emitEpilogue() in X86FrameLowering.
Majority of the changes in this patch:
1. Ensure that CFI instructions do not affect code generation.
2. Enable maintaining correct information about cfa offset and cfa register
in a function when basic blocks are reordered, merged, split, duplicated.
These changes are target independent and described below.
Changed CFI instructions so that they:
1. are duplicable
2. are not counted as instructions when tail duplicating or tail merging
3. can be compared as equal
Add information to each MachineBasicBlock about cfa offset and cfa register
that are valid at its entry and exit (incoming and outgoing CFI info). Add
support for updating this information when basic blocks are merged, split,
duplicated, created. Add a verification pass (CFIInfoVerifier) that checks
that outgoing cfa offset and register of predecessor blocks match incoming
values of their successors.
Incoming and outgoing CFI information is used by a late pass
(CFIInstrInserter) that corrects CFA calculation rule for a basic block if
needed. That means that additional CFI instructions get inserted at basic
block beginning to correct the rule for calculating CFA. Having CFI
instructions in function epilogue can cause incorrect CFA calculation rule
for some basic blocks. This can happen if, due to basic block reordering,
or the existence of multiple epilogue blocks, some of the blocks have wrong
cfa offset and register values set by the epilogue block above them.
Patch by Violeta Vukobrat.
Differential Revision: https://reviews.llvm.org/D18046
llvm-svn: 306529
The original patch was an improvement to IR ValueTracking on non-negative
integers. It has been checked in to trunk (D18777, r284022). But was disabled by
default due to performance regressions.
Perf impact has improved. The patch would be enabled by default.
Reviewers: reames
Differential Revision: https://reviews.llvm.org/D34101
Patch by: Olga Chupina <olga.chupina@intel.com>
llvm-svn: 306528
Summary:
This commit allows matchSelectPattern to recognize clamp of float
arguments in the presence of FMF the same way as already done for
integers.
This case is a little different though. With integers, given the
min/max pattern is recognized, DAGBuilder starts selecting MIN/MAX
"automatically". That is not the case for float, because for them only
full FMINNAN/FMINNUM/FMAXNAN/FMAXNUM ISD nodes exist and they do care
about NaNs. On the other hand, some backends (e.g. X86) have only
FMIN/FMAX nodes that do not care about NaNS and the former NAN/NUM
nodes are illegal thus selection is not happening. So I decided to do
such kind of transformation in IR (InstCombiner) instead of
complicating the logic in the backend.
Reviewers: spatel, jmolloy, majnemer, efriedma, craig.topper
Reviewed By: efriedma
Subscribers: hiraditya, javed.absar, n.bozhenov, llvm-commits
Patch by Andrei Elovikov <andrei.elovikov@intel.com>
Differential Revision: https://reviews.llvm.org/D33186
llvm-svn: 306525
With fix in include folder character case:
#include "llvm/Codegen/AsmPrinter.h" -> #include "llvm/CodeGen/AsmPrinter.h"
Original commit message:
Change introduces error reporting policy for DWARFContextInMemory.
New callback provided by client is able to handle error on it's
side and return Halt or Continue.
That allows to either keep current behavior when parser prints all errors
but continues parsing object or implement something very different, like
stop parsing on a first error and report an error in a client style.
Differential revision: https://reviews.llvm.org/D34328
llvm-svn: 306517
The benchmarking summarized in
http://lists.llvm.org/pipermail/llvm-dev/2017-May/113525.html showed
this is beneficial for a wide range of cores.
As is to be expected, quite a few small adaptations are needed to the
regressions tests, as the difference in scheduling results in:
- Quite a few small instruction schedule differences.
- A few changes in register allocation decisions caused by different
instruction schedules.
- A few changes in IfConversion decisions, due to a difference in
instruction schedule and/or the estimated cost of a branch mispredict.
llvm-svn: 306514
Change introduces error reporting policy for DWARFContextInMemory.
New callback provided by client is able to handle error on it's
side and return Halt or Continue.
That allows to either keep current behavior when parser prints all errors
but continues parsing object or implement something very different, like
stop parsing on a first error and report an error in a client style.
Differential revision: https://reviews.llvm.org/D34328
llvm-svn: 306512
A slightly more efficient way to get constant, we avoid resolving in getSCEV and excessive
invocations, and we don't create a ConstantInt if 'true' branch is taken.
Differential Revision: https://reviews.llvm.org/D34672
llvm-svn: 306503
That is pretty common for clang to produce code like
(shl %x, (and %amt, 31)). In this situation we can still perform
trunc (shl) into shl (trunc) conversion given the known value
range of shift amount.
Differential Revision: https://reviews.llvm.org/D34723
llvm-svn: 306499
When simplifying an instruction that has been re-mapped, it should never
simplify to an instruction in the original function. In the edge case
where we are inlining a function into itself, the existing code led to
incorrect behavior. Replace the incorrect code with an assert verifying
that we never expect simplification to produce an instruction in the old
function, unless the functions are the same.
Differential Revision: https://reviews.llvm.org/D33850
llvm-svn: 306495
Summary:
This is the llvm part of the initial implementation to support Windows ARM64 COFF format.
I will gradually add more functionality in subsequent patches.
Reviewers: ruiu, rnk, t.p.northover, compnerd
Reviewed By: ruiu, compnerd
Subscribers: aemerson, mgorny, javed.absar, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D34705
llvm-svn: 306490
As noted in D34071, there are some IR optimization opportunities that could be
handled by normal IR passes if this expansion wasn't happening so late in CGP.
Regardless of that, it seems wasteful to knowingly produce suboptimal IR here,
so I'm proposing this change:
%s = sub i32 %x, %y
%r = icmp ne %s, 0
=>
%r = icmp ne %x, %y
Changing the predicate to 'eq' mimics what InstCombine would do, so that's just
an efficiency improvement if we decide this expansion should happen sooner.
The fact that the PowerPC backend doesn't eliminate the 'subf.' might be
something for PPC folks to investigate separately.
Differential Revision: https://reviews.llvm.org/D34416
llvm-svn: 306471
Without this check, COPY instructions can actually be one of the generic casts
in disguise. That's confusing and bad.
At some point during ISel this restriction has to be relaxed since the fully
selected instructions will usually use COPY for those purposes. Right now I
think it's possible that relaxation occurs during RegBankSelect (hence the
change there). I'm not convinced that's where it belongs long-term though.
llvm-svn: 306470
This patch enables significant performance enhancements to the
Cavium ThunderX2T99 LLVM backend, as observed by running SPEC2K6,
by adding more detailed scheduling information.
Related Bugzilla bug: http://bugs.llvm.org/show_bug.cgi?id=32562
Patch by: steleman
Differential Revision: https://reviews.llvm.org/D31801
llvm-svn: 306462
The overal size of the data section (including BSS)
is otherwise not included in the wasm binary.
Differential Revision: https://reviews.llvm.org/D34657
llvm-svn: 306459
The check to see if we can propagate the nsw flag used m_ConstantInt(uint64_t*&) which doesn't work with splat vectors and has a restriction that the bitwidth of the ConstantInt must be 64-bits are less.
This patch changes it to use m_APInt to remove both these issues
Differential Revision: https://reviews.llvm.org/D34699
llvm-svn: 306457
BlockAddress are only valid within their function context, which does not
interact well with CodeExtractor. Detect this case and prevent it.
Differential Revision: https://reviews.llvm.org/D33839
llvm-svn: 306448
Depending on the compare code that can be either an argument of
sext or negate of it. This helps to avoid v_cndmask_b64 instruction
for sext. A reversed value can be further simplified and folded into
its parent comparison if possible.
Differential Revision: https://reviews.llvm.org/D34545
llvm-svn: 306446