This commit adds a wrapper for std::distance() which works with ranges.
As it would be a common case to write `distance(predecessors(BB))`, this
also introduces `pred_size()` and `succ_size()` helpers to make that
easier to write.
Differential Revision: https://reviews.llvm.org/D46668
llvm-svn: 332057
length excluding the table header. Instead it must encode the contribution length minus the length
field itself.
Reviewer: JDevliegehere
Differential Revision: https://reviews.llvm.org/D45922
llvm-svn: 332030
Accessing the members of a large data structures needs a lot of GEPs which
usually have large offsets due to the size of the underlying data structure. If
the offsets are too large to fit into the r+i addressing mode, these GEPs cannot
be sunk to their users' blocks and many extra registers are needed then to carry
the values of these GEPs.
This patch tries to split a large data struct starting from %base like the
following.
Before:
BB0:
%base =
BB1:
%gep0 = gep %base, off0
%gep1 = gep %base, off1
%gep2 = gep %base, off2
BB2:
%load1 = load %gep0
%load2 = load %gep1
%load3 = load %gep2
After:
BB0:
%base =
%new_base = gep %base, off0
BB1:
%new_gep0 = %new_base
%new_gep1 = gep %new_base, off1 - off0
%new_gep2 = gep %new_base, off2 - off0
BB2:
%load1 = load i32, i32* %new_gep0
%load2 = load i32, i32* %new_gep1
%load3 = load i32, i32* %new_gep2
In the above example, the struct is split into two parts. The first part still
starts from %base and the second part starts from %new_base. After the
splitting, %new_gep1 and %new_gep2 have smaller offsets and then can be sunk to
BB2 and folded into their users.
The algorithm to split data structure is simple and very similar to the work of
merging SExts. First, it collects GEPs that have large offsets when iterating
the blocks. Second, it splits the underlying data structures and updates the
collected GEPs to use smaller offsets.
Differential Revision: https://reviews.llvm.org/D42759
llvm-svn: 332015
Summary:
The combine in rebuildSetCC may be combined to another
node leaving our references stale. Keep a handle on
it to avoid stale references.
Fixes PR36602.
Reviewers: dbabokin, RKSimon, eli.friedman, davide
Subscribers: hiraditya, uabelho, JesperAntonsson, qcolombet, llvm-commits
Differential Revision: https://reviews.llvm.org/D46404
llvm-svn: 331985
Reviewed by: dblaikie, JDevlieghere, espindola
Differential Revision: https://reviews.llvm.org/D44560
Summary:
The .debug_line parser previously reported errors by printing to stderr and
return false. This is not particularly helpful for clients of the library code,
as it prevents them from handling the errors in a manner based on the calling
context. This change switches to using llvm::Error and callbacks to indicate
what problems were detected during parsing, and has updated clients to handle
the errors in a location-specific manner. In general, this means that they
continue to do the same thing to external users. Below, I have outlined what
the known behaviour changes are, relating to this change.
There are two levels of "errors" in the new error mechanism, to broadly
distinguish between different fail states of the parser, since not every
failure will prevent parsing of the unit, or of subsequent unit. Malformed
table errors that prevent reading the remainder of the table (reported by
returning them) and other minor issues representing problems with parsing that
do not prevent attempting to continue reading the table (reported by calling a
specified callback funciton). The only example of this currently is when the
last sequence of a unit is unterminated. However, I think it would be good to
change the handling of unrecognised opcodes to report as minor issues as well,
rather than just printing to the stream if --verbose is used (this would be a
subsequent change however).
I have substantially extended the DwarfGenerator to be able to handle
custom-crafted .debug_line sections, allowing for comprehensive unit-testing
of the parser code. For now, I am just adding unit tests to cover the basic
error reporting, and positive cases, and do not currently intend to test every
part of the parser, although the framework should be sufficient to do so at a
later point.
Known behaviour changes:
- The dump function in DWARFContext now does not attempt to read subsequent
tables when searching for a specific offset, if the unit length field of a
table before the specified offset is a reserved value.
- getOrParseLineTable now returns a useful Error if an invalid offset is
encountered, rather than simply a nullptr.
- The parse functions no longer use `WithColor::warning` directly to report
errors, allowing LLD to call its own warning function.
- The existing parse error messages have been updated to not specifically
include "warning" in their message, allowing consumers to determine what
severity the problem is.
- If the line table version field appears to have a value less than 2, an
informative error is returned, instead of just false.
- If the line table unit length field uses a reserved value, an informative
error is returned, instead of just false.
- Dumping of .debug_line.dwo sections is now implemented the same as regular
.debug_line sections.
- Verbose dumping of .debug_line[.dwo] sections now prints the prologue, if
there is a prologue error, just like non-verbose dumping.
As a helper for the generator code, I have re-added emitInt64 to the
AsmPrinter code. This previously existed, but was removed way back in r100296,
presumably because it was dead at the time.
This change also requires a change to LLD, which will be committed separately.
llvm-svn: 331971
The second source operand of G_SHL, G_ASHR, and G_LSHR must preserve its
value as a (small) unsigned integer, therefore its incorrect to widen it
in any way but by zero extending it.
G_SHL was using G_ANYEXT and G_ASHR - G_SEXT (which is correct for their
destination and first source operands, but not the "number of bits to
shift" operand).
Generally, shifts aren't as similar to regular binary operations as it
might seem, for instance, they aren't commutative nor associative and
the second source operand usually requires a special treatment.
Reviewers: bogner, javed.absar, aivchenk, rovka
Reviewed By: bogner
Subscribers: igorb, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D46413
llvm-svn: 331926
Previously if !LegalOperations we would blindly call getBitcast and hope that getNode would constant fold it. But if the conversion is between a vector and a scalar, getNode has no simplification.
This means we would just get back the original N. We would then return that N which would make the caller of visitBITCAST think that we used CombineTo and did our own worklist management. This prevents target specific optimizations from being called for vector/scalar bitcasts until after legal operations.
llvm-svn: 331896
The previous value of 8192 resulted in severe compile time hits in
some pathological cases.
rdar://39781410
Differential Revision: https://reviews.llvm.org/D46581
llvm-svn: 331888
Reverting this to see if the clang-cmake-aarch64-global-isel and
clang-cmake-aarch64-quick bots are failing because of this commit.
We know it wasn't r331819.
llvm-svn: 331846
Because we create a new kind of debug instruction, DBG_LABEL, we need to
check all passes which use isDebugValue() to check MachineInstr is debug
instruction or not. When expelling debug instructions, we should expel
both DBG_VALUE and DBG_LABEL. So, I create a new function,
isDebugInstr(), in MachineInstr to check whether the MachineInstr is
debug instruction or not.
This patch has no new test case. I have run regression test and there is
no difference in regression test.
Differential Revision: https://reviews.llvm.org/D45342
Patch by Hsiangkai Wang.
llvm-svn: 331844
In order to convert LLVM IR to MachineInstr, we need a new TargetOpcode,
DBG_LABEL, to ‘lower’ intrinsic llvm.dbg.label. The patch
creates this new TargetOpcode and convert intrinsic llvm.dbg.label to
MachineInstr through SelectionDAG.
In SelectionDAG, debug information is stored in SDDbgInfo. We create a
new data member of SDDbgInfo for labels and use the new data member,
SDDbgLabel, to create DBG_LABEL MachineInstr.
The new DBG_LABEL MachineInstr uses label metadata from LLVM IR as its
parameter. So, the backend could get metadata information of labels from
DBG_LABEL MachineInstr.
Differential Revision: https://reviews.llvm.org/D45341
Patch by Hsiangkai Wang.
llvm-svn: 331842
In order to set breakpoints on labels and list source code around
labels, we need collect debug information for labels, i.e., label
name, the function label belong, line number in the file, and the
address label located. In order to keep these information in LLVM
IR and to allow backend to generate debug information correctly.
We create a new kind of metadata for labels, DILabel. The format
of DILabel is
!DILabel(scope: !1, name: "foo", file: !2, line: 3)
We hope to keep debug information as much as possible even the
code is optimized. So, we create a new kind of intrinsic for label
metadata to avoid the metadata is eliminated with basic block.
The intrinsic will keep existing if we keep it from optimized out.
The format of the intrinsic is
llvm.dbg.label(metadata !1)
It has only one argument, that is the DILabel metadata. The
intrinsic will follow the label immediately. Backend could get the
label metadata through the intrinsic's parameter.
We also create DIBuilder API for labels to be used by Frontend.
Frontend could use createLabel() to allocate DILabel objects, and use
insertLabel() to insert llvm.dbg.label intrinsic in LLVM IR.
Differential Revision: https://reviews.llvm.org/D45024
Patch by Hsiangkai Wang.
llvm-svn: 331841
Fix a silly mistake in my pre-commit changes for r331816. It should check what
opcode the insn is before extracting the operands.
NFC at the moment since the caller already checked the opcode.
llvm-svn: 331820
Refactoring LegalizerHelper::widenScalar member function reducing its
size by approximately a factor of 2 and (hopefuly) making it more
straightforward and regular by introducing widenScalarSrc and
widenScalarDst helper methods.
The new widenScalar* methods mutate the instructions in place instead
of recreating them from scratch and removing the originals. The
compile time implications of this were measured on sqlite3
amalgamation, targeting AArch64 in -O0:
LegalizerHelper::widenScalar: > 25% faster
Legalizer::runOnMachineFunction: ~ 4.0 - 4.5% faster
Also adding MachineOperand::setCImm and refactoring out
MachineIRBuilder::recordInsertion methods to make the change possible.
Reviewers: aditya_nandakumar, bogner, javed.absar, t.p.northover, ab, dsanders, arsenm
Reviewed By: aditya_nandakumar
Subscribers: wdng, rovka, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D46414
llvm-svn: 331819
Before SVN r244158, codeview debug info was emitted always
emitted for msvc if debug info was enabled, but that commit
added a module flag.
Since it's still restricted by the flag, we can allow it
for any target if the user requests it, not only msvc (and
windows-itanium, added in SVN r287567).
Add a test for emitting it for a mingw target.
Differential Revision: https://reviews.llvm.org/D46303
llvm-svn: 331809
CodeGenPrepare pass move extension instructions close to load instructions in different BB, so they can be combined later. But the extension instructions can't move through logical and shift instructions in current implementation. This patch enables this enhancement, so we can eliminate more extension instructions.
Differential Revision: https://reviews.llvm.org/D45537
llvm-svn: 331783
Making sure we don't truncate / extend pointers, don't try to change
vector topology or bitcast vectors to scalars or back, and most
importantly, don't extend to a smaller type or truncate to a large
one.
Reviewers: qcolombet t.p.northover aditya_nandakumar
Reviewed By: qcolombet
Subscribers: rovka, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D46490
llvm-svn: 331718
Every generic machine instruction must have generic virtual registers
only, that is, have a low-level type attached to each operand.
Previously MachineVerifier would catch a type missing on an operand
only if the previous operand for the the same type index exists and
have a type attached to it and it will report it as a type mismatch.
This is incosistent behaviour and a misleading error message.
This commit makes sure MachineVerifier explicitly checks that the
types are there for every operand and if not provides a
straightforward error message.
Reviewers: qcolombet t.p.northover bogner ab
Reviewed By: qcolombet
Subscribers: rovka, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D46455
llvm-svn: 331694
This is an NFC pre-commit for the following "Checking that generic
instrs have LLTs on all vregs" commit.
This overloads MachineOperand::print to make it possible to print LLTs
with standalone machine operands.
This also overloads MachineVerifier::print(...MachineOperand...) with
an optional LLT using the newly introduced MachineOperand::print
variant; no actual calls added.
This also refactors MachineVerifier::visitMachineInstrBefore in the
parts dealing with all generic instructions (checking Selected
property, LLTs, and phys regs).
llvm-svn: 331693
Summary:
Split off from D46031.
The previous patch, D46493, completely disabled unfolding in case of immediates.
But we can do better:
{F6120274} {F6120277}
https://rise4fun.com/Alive/xJS
Reviewers: spatel, craig.topper
Reviewed By: spatel
Subscribers: andreadb, llvm-commits
Differential Revision: https://reviews.llvm.org/D46494
llvm-svn: 331685
Summary:
Split off from D46031.
In masked merge case, this degrades IPC by decreasing instruction count.
{F6108777}
The next patch should be able to recover and improve this.
This also affects the transform @spatel have added in D27489 / rL289738,
and the test coverage for X86 was missing.
But after i have added it, and looked at the changes in MCA, i'm somewhat confused.
{F6093591} {F6093592} {F6093593}
I'd say this regression is an improvement, since `IPC` increased in that case?
Reviewers: spatel, craig.topper
Reviewed By: spatel
Subscribers: andreadb, llvm-commits, spatel
Differential Revision: https://reviews.llvm.org/D46493
llvm-svn: 331684
Summary:
getNode optimizes (ext (trunc x)) to x and the dbgvalue node on trunc is lost. The fix calls transferDbgValues to add the dbgvalue to x.
Add DebugInfo/AArch64/dbg-value-i16.ll
Patch by Sejong Oh!
Reviewers: aprantl, javed.absar, llvm-commits, vsk
Reviewed By: aprantl, vsk
Subscribers: kristof.beyls, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D46348
llvm-svn: 331665
Instead of enabling it for non NDEBUG builds, use -verify-cfiinstrs to
run verifier in CFIInstrInserter. It defaults to false.
Differential Revision: https://reviews.llvm.org/D46444
llvm-svn: 331635
Summary:
Split off form D46031.
It seems we don't want to transform the pattern if the `xor`'s are actually `not`'s.
In vector case, this breaks `andnpd` / `vandnps` patterns.
That being said, we may want to re-visit this `not` handling, maybe in D46073.
Reviewers: spatel, craig.topper, javed.absar
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46492
llvm-svn: 331595
Summary:
The current code cannot handle register class names like 'i32', which is
a valid register class name in WebAssembly. This patch removes special
handling for integer/scalar/pointer type parsing and treats them as
normal identifiers.
Reviewers: thegameg
Subscribers: jfb, dschuff, sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D45948
llvm-svn: 331586
Inspired by r331508, I did a grep and found these.
Mostly just change from dyn_cast to cast. Some cases also showed a dyn_cast result being converted to bool, so those I changed to isa.
llvm-svn: 331577
Summary: Providing the glue to map SDNode fast math sub flags to MachineInstr fast math sub flags.
Reviewers: spatel, arsenm, wristow
Reviewed By: spatel
Subscribers: wdng
Differential Revision: https://reviews.llvm.org/D46447
llvm-svn: 331567
Summary:
When checking if an instruction stores to a given frame index, check
that the instruction can write to memory before looking at the memory
operands list to avoid e.g. DBG_VALUE instructions that reference a
frame index preventing a load from that index from being hoisted.
Reviewers: dblaikie, MatzeB, qcolombet, reames, javed.absar
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D46284
llvm-svn: 331549
Summary: Adding support for Fast flags in the SDNode to leverage fast math sub flag usage.
Reviewers: spatel, arsenm, jbhateja, hfinkel, escha, qcolombet, echristo, wristow, javed.absar
Reviewed By: spatel
Subscribers: llvm-commits, rampitec, nhaehnle, tstellar, FarhanaAleen, nemanjai, javed.absar, jbhateja, hfinkel, wdng
Differential Revision: https://reviews.llvm.org/D45710
llvm-svn: 331547
Summary:
Added a helper method in RegsForValue to get a list with
all the <RegNumber, RegSize> pairs that we want to iterate
over in SelectionDAGBuilder::EmitFuncArgumentDbgValue and
in SelectionDAGBuilder::visitIntrinsicCall.
Reviewers: vsk
Reviewed By: vsk
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46360
llvm-svn: 331510
Don't assume the alias of a defined reg is always already in the set.
As the test case in https://bugs.llvm.org/show_bug.cgi?id=36587 discovered,
it is wrong to assume that all the aliases of the defined register in the
*current function* is already present in the UsedPhysRegsMask.
This patch changes this so that any definition in the current function of a
phys-reg always results in all its aliases inserted into the set of defined
registers.
Review: Quentin Colombet
https://reviews.llvm.org/D45157
llvm-svn: 331509
Summary:
Using a set is unnecessary here an in some cases (see e.g. PR37277)
takes significant amount of time to just insert values into it. In this
particular case all we need is just to check if we find the block we are
looking for or not.
Reviewers: davide
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D46411
llvm-svn: 331502
Summary:
constrainOperandRegClass() currently fails if it tries to constrain the
register class of an operand that is defeined with an unallocatable register
class. This patch resolves this by adding a target callback to compute
register constriants in this case.
This is required by the AMDGPU because many of its instructions have source opreands
defined with the unallocatable register classe VS_32 which is a union of two allocatable
register classes VGPR_32 and SReg_32.
Reviewers: dsanders, aditya_nandakumar
Reviewed By: aditya_nandakumar
Subscribers: rovka, kristof.beyls, tpr, llvm-commits
Differential Revision: https://reviews.llvm.org/D45991
llvm-svn: 331485
Summary:
This reverts SVN r331441 (reapplies r331337), together with a fix
in to handle an already existing fragment expression in the
dbg.value that must be fragmented due to a split PHI node.
This should solve the problem seen in PR37321, which was the
reason for the revert of r331337.
The situation in PR37321 is that we have a PHI node like this
%u.sroa = phi i80 [ %u.sroa.x, %if.x ],
[ %u.sroa.y, %if.y ],
[ %u.sroa.z, %if.z ]
and a dbg.value like this
call void @llvm.dbg.value(metadata i80 %u.sroa,
metadata !13,
metadata !DIExpression(DW_OP_LLVM_fragment, 0, 80))
The phi node is split into three 32-bit PHI nodes
%30:gr32 = PHI %11:gr32, %bb.4, %14:gr32, %bb.5, %27:gr32, %bb.8
%31:gr32 = PHI %12:gr32, %bb.4, %15:gr32, %bb.5, %28:gr32, %bb.8
%32:gr32 = PHI %13:gr32, %bb.4, %16:gr32, %bb.5, %29:gr32, %bb.8
but since the original value only is 80 bits we need to adjust the size
of the last fragment expression, and with this patch we get
DBG_VALUE debug-use %30:gr32, debug-use $noreg, !"u", !DIExpression(DW_OP_LLVM_fragment, 0, 32)
DBG_VALUE debug-use %31:gr32, debug-use $noreg, !"u", !DIExpression(DW_OP_LLVM_fragment, 32, 32)
DBG_VALUE debug-use %32:gr32, debug-use $noreg, !"u", !DIExpression(DW_OP_LLVM_fragment, 64, 16)
Reviewers: vsk, aprantl, mstorsjo
Reviewed By: aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46384
llvm-svn: 331464
Summary:
Machine Instruction flags for fast math support and MIR print support
Reviewers: spatel, arsenm
Reviewed By: arsenm
Subscribers: wdng
Differential Revision: https://reviews.llvm.org/D45781
llvm-svn: 331417
The size of an object cannot be less than the emitted size of all the
contained elements. This would cause an overflow in padding size
calculation. Add an assert to catch this.
Patch by Suyog Sarda.
llvm-svn: 331376
Summary:
This is a follow up to rL331182. A PHI node can be split up into
several MIR PHI nodes when being selected. When there is a
dbg.value intrinsic that uses the result of such a PHI node we
need to select several DBG_VALUE instructions, with fragment
expressions, in order to do a correct selection.
Reviewers: rnk, aprantl, vsk
Reviewed By: vsk
Subscribers: mattd, llvm-commits, JDevlieghere, aprantl, gbedwell, rnk
Tags: #debug-info
Differential Revision: https://reviews.llvm.org/D46329
llvm-svn: 331337
The logic for this combine is almost identical to the logic for a
(sext (sextload x)) combine.
This commit factors out the logic so it can be shared by both combines,
and corrects the SDLoc assigned in the zext version of the combine.
Prior to this patch, for the given test case, we would apply the
location associated with the udiv instruction to instructions which
perform the load.
Part of: llvm.org/PR37262
llvm-svn: 331303
Prior to this patch, for the given test case, we would apply the
location associated with the sdiv instruction to instructions which
perform the load.
Part of: llvm.org/PR37262.
Differential Revision: https://reviews.llvm.org/D46222
llvm-svn: 331302
In DAGCombiner, we try to simplify this pattern:
([s|z]ext (load ...))
Conceptually, a new extload which is created while splitting the load
should have the same debug location as the load.
Making this change affects the IROrder of the new load, causing some
test case churn.
In practice, the new location is never different from the location of
the [s|z]ext, at least not during check-llvm or a stage2 build.
Part of: llvm.org/PR37262
Differential Revision: https://reviews.llvm.org/D46156
llvm-svn: 331301
Setting the right SDLoc on a newly-created zextload fixes a line table
bug which resulted in non-linear stepping behavior.
Several backend tests contained CHECK lines which relied on the IROrder
inherited from the wrong SDLoc. This patch breaks that dependence where
feasbile and regenerates test cases where not.
In some cases, changing a node's IROrder may alter register allocation
and spill behavior. This can affect performance. I have chosen not to
prevent this by applying a "known good" IROrder to SDLocs, as this may
hide a more general bug in the scheduler, or cause regressions on other
test inputs.
rdar://33755881, Part of: llvm.org/PR37262
Differential Revision: https://reviews.llvm.org/D45995
llvm-svn: 331300
This is a follow-up to r331272.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done
https://reviews.llvm.org/D46290
llvm-svn: 331275
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46290
llvm-svn: 331272
This appears to have some issues associated with the file directive output
causing multiple global symbols with the name "file" to be emitted into a
startup section. I'm investigating more specific causes and working with the
original author.
This reverts commit r330271.
Also Revert "[DEBUGINFO, NVPTX] Add the test for the debug info of the local"
This reverts commit r330592 and the follow up of 330779 as the testcase is dependent upon r330271.
llvm-svn: 331237
Dead defs were being removed from the live set (in stepForward), but
registers clobbered by regmasks weren't (more specifically, they were
actually removed by removeRegsInMask, but then they were added back in).
llvm-svn: 331219
Teach AsmParser to check with Assembler for when evaluating constant
expressions. This improves the handing of preprocessor expressions
that must be resolved at parse time. This idiom can be found as
assembling-time assertion checks in source-level assemblers. Note that
this relies on the MCStreamer to keep sufficient tabs on Section /
Fragment information which the MCAsmStreamer does not. As a result the
textual output may fail where the equivalent object generation would
pass. This can most easily be resolved by folding the MCAsmStreamer
and MCObjectStreamer together which is planned for in a separate
patch.
Currently, this feature is only enabled for assembly input, keeping IR
compilation consistent between assembly and object generation.
Reviewers: echristo, rnk, probinson, espindola, peter.smith
Reviewed By: peter.smith
Subscribers: eraman, peter.smith, arichardson, jyknight, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D45164
llvm-svn: 331218
No need to waste space nor number MBBs differently if MF gets recreated.
Reviewers: qcolombet, stoklund, t.p.northover, bogner, javed.absar
Reviewed By: qcolombet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46078
llvm-svn: 331213
There are two separate fixes here:
* The lowering code for non-extending loads should report UnableToLegalize instead of emitting the same instruction.
* The target should not be requesting lowering of non-extending loads.
llvm-svn: 331201
See r331124 for how I made a list of files missing the include.
I then ran this Python script:
for f in open('filelist.txt'):
f = f.strip()
fl = open(f).readlines()
found = False
for i in xrange(len(fl)):
p = '#include "llvm/'
if not fl[i].startswith(p):
continue
if fl[i][len(p):] > 'Config':
fl.insert(i, '#include "llvm/Config/llvm-config.h"\n')
found = True
break
if not found:
print 'not found', f
else:
open(f, 'w').write(''.join(fl))
and then looked through everything with `svn diff | diffstat -l | xargs -n 1000 gvim -p`
and tried to fix include ordering and whatnot.
No intended behavior change.
llvm-svn: 331184
Summary:
This patch will introduce copying of DBG_VALUE instructions
from an otherwise empty basic block to predecessor/successor
blocks in case the empty block is eliminated/bypassed. It
is currently only done in one identified situation in the
BranchFolding pass, before optimizing on empty block.
It can be seen as a light variant of the propagation done
by the LiveDebugValues pass, which unfortunately is executed
after the BranchFolding pass.
We only propagate (copy) DBG_VALUE instructions in a limited
number of situations:
a) If the empty BB is the only predecessor of a successor
we can copy the DBG_VALUE instruction to the beginning of
the successor (because the DBG_VALUE instruction is always
part of the flow between the blocks).
b) If the empty BB is the only successor of a predecessor
we can copy the DBG_VALUE instruction to the end of the
predecessor (because the DBG_VALUE instruction is always
part of the flow between the blocks). In this case we add
the DBG_VALUE just before the first terminator (assuming
that the terminators do not impact the DBG_VALUE).
A future solution, to handle more situations, could perhaps
be to run the LiveDebugValues pass before branch folding?
This fix is related to PR37234. It is expected to resolve
the problem seen, when applied together with the fix in
SelectionDAG from here: https://reviews.llvm.org/D46129
Reviewers: #debug-info, aprantl, rnk
Reviewed By: #debug-info, aprantl
Subscribers: ormris, gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D46184
llvm-svn: 331183
Summary:
When building the selection DAG at ISel all PHI nodes are
selected and lowered to Machine Instruction PHI nodes before
we start to create any SDNodes. So there are no SDNodes for
values produced by the PHI nodes.
In the past when selecting a dbg.value intrinsic that uses
the value produced by a PHI node we have been handling such
dbg.value intrinsics as "dangling debug info". I.e. we have
not created a SDDbgValue node directly, because there is
no existing SDNode for the PHI result, instead we deferred
the creationg of a SDDbgValue until we found the first use
of the PHI result.
The old solution had a couple of flaws. The position of the
selected DBG_VALUE instruction would end up quite late in a
basic block, and for example not directly after the PHI node
as in the LLVM IR input. And in case there were no use at all
in the basic block the dbg.value could be dropped completely.
This patch introduces a new VREG kind of SDDbgValue nodes.
It is similar to a SDNODE kind of node, but it refers directly
to a virtual register and not a SDNode. When we do selection
for a dbg.value that is using the result of a PHI node we
can do a lookup of the virtual register directly (as it already
is determined for the PHI node) and create a SDDbgValue node
immediately instead of delaying the selection until we find a
use.
This should fix a problem with losing debug info at ISel
as seen in PR37234 (https://bugs.llvm.org/show_bug.cgi?id=37234).
It does not resolve PR37234 completely, because the debug info
is dropped later on in the BranchFolder (see D46184).
Reviewers: #debug-info, aprantl
Reviewed By: #debug-info, aprantl
Subscribers: rnk, gbedwell, aprantl, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D46129
llvm-svn: 331182
Summary:
Previously, a extending load was represented at (G_*EXT (G_LOAD x)).
This had a few drawbacks:
* G_LOAD had to be legal for all sizes you could extend from, even if
registers didn't naturally hold those sizes.
* All sizes you could extend from had to be allocatable just in case the
extend went missing (e.g. by optimization).
* At minimum, G_*EXT and G_TRUNC had to be legal for these sizes. As we
improve optimization of extends and truncates, this legality requirement
would spread without considerable care w.r.t when certain combines were
permitted.
* The SelectionDAG importer required some ugly and fragile pattern
rewriting to translate patterns into this style.
This patch begins changing the representation to:
* (G_[SZ]EXTLOAD x)
* (G_LOAD x) any-extends when MMO.getSize() * 8 < ResultTy.getSizeInBits()
which resolves these issues by allowing targets to work entirely in their
native register sizes, and by having a more direct translation from
SelectionDAG patterns.
This patch introduces the new generic instructions and new variation on
G_LOAD and adds lowering for them to convert back to the existing
representations.
Depends on D45466
Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, aemerson, javed.absar
Reviewed By: aemerson
Subscribers: aemerson, kristof.beyls, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D45540
llvm-svn: 331115
This commit makes it so that if you outline a def of some register, then the
call instruction created by the outliner actually reflects that the register
is defined by the call. It also makes it so that outlined functions don't
have the TracksLiveness property.
Outlined calls shouldn't break liveness assumptions that someone might make.
This also un-XFAILs the noredzone test, and updates the calls test.
llvm-svn: 331095
Summary:
D42479 (rL329525) enabled SDIV combine for pow2 non-splat vector
dividers. But when there is a 1 in a vector, the instruction sequence to
be generated involves shifting a value by the number of its bit widths,
which is undefined
(c64f4dbfe3/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (L6000-L6006)).
Especially, in architectures that do not support vector instructions,
each of element in a vector will be computed separately using scalar
operations, and then the resulting value will be undef for '1' values
in a vector.
(All 1's vector is fine; only vectors mixed with 1 and others will be
affected.)
Reviewers: RKSimon, jgravelle-google
Subscribers: jfb, dschuff, sbc100, jgravelle-google, llvm-commits
Differential Revision: https://reviews.llvm.org/D46161
llvm-svn: 331092
For local variables the first DW_OP_deref is consumed by turning the
location kind into a memeory location, but that only makes sense for
values that are in a register to begin with, which cannot happen for
global variables that are attached to a symbol.
rdar://problem/39741860
This reapplies r330970 after fixing an uncovered bug in r331086 and
working around the situation caused by it.
llvm-svn: 331090
Now local value sinking only scans and numbers instructions added
between the current flush point and the last flush point. This ensures
that ISel is overall linear in the size of the BB.
Fixes PR37010 and re-enables local value sinking by default.
llvm-svn: 331087
Some of the bots were failing in a different way to the others. These were
unable to compare tuples. Fix this by changing to a struct, thereby avoiding
the quirks of tuples.
llvm-svn: 331081
Extend the live-in check for all aliased registers so that we can
allow sinking Copy instructions when only implicit def is in successor's
live-in.
llvm-svn: 331072
Summary:
Currently only the memory size is supported but others can be added as
needed.
narrowScalar for G_LOAD and G_STORE now correctly update the
MachineMemOperand and will refuse to legalize atomics since those need more
careful expansions to maintain atomicity.
Reviewers: ab, aditya_nandakumar, bogner, rtereshin, aemerson, javed.absar
Reviewed By: aemerson
Subscribers: aemerson, rovka, kristof.beyls, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D45466
llvm-svn: 331071
The main goal of this change is to make it much easier to track which
rules are actually covered by Testgen'erated regression tests.
Reviewers: aemerson, dsanders
Differential Revision: https://reviews.llvm.org/D46095
llvm-svn: 330988
For local variables the first DW_OP_deref is consumed by turning the
location kind into a memeory location, but that only makes sense for
values that are in a register to begin with, which cannot happen for
global variables that are attached to a symbol.
rdar://problem/39741860
llvm-svn: 330970
As noted, the attribute name is subject to change once we have
the clang side implemented, but it's clear that we need some
kind of attribute-based predication here based on the discussion
for:
rL330437
llvm-svn: 330951
As discussed in the post-review comments for rL330437,
we need to guard this fold to allow existing code to
keep working with the undefined behavior that they've
come to rely on.
That would mean duplicating more code than we already
have, so let's fix that first.
llvm-svn: 330947
Debug var, expr and loc were only supported for non-fixed stack objects.
This patch adds the following fields to the "fixedStack:" entries, and
renames the ones from "stack:" to:
* debug-info-variable
* debug-info-expression
* debug-info-location
Differential Revision: https://reviews.llvm.org/D46032
llvm-svn: 330859
We were previously prefering ZEXTLOAD over EXTLOAD if it is legal. This triggers during X86's promotion of i16->i32. Not sure about other targets.
Using ZEXTLOAD can prevent folding it to SEXTLOAD later if we were to promote a sign extended operand like we would need for SRA. However, X86 doesn't currently promote i16 SRA. I was looking into doing that which is how I found this issue.
This is also blocking our ability to fold 4 byte aligned EXTLOADs with "loadi32". This is what caused most of the test changes here.
Differential Revision: https://reviews.llvm.org/D45585#inline-402825
llvm-svn: 330781
Previously, _any_ store or load instruction was considered to be
operating on a spill if it had a frameindex as an operand, and thus
was fair game for optimisations such as "StackSlotColoring". This
usually works, except on architectures where spills can be partially
restored, for example on X86 where a spilt vector can have a single
component loaded (zeroing the rest of the target register). This can be
mis-interpreted and the zero extension unsoundly eliminated, see
pr30821.
To avoid this, this commit optionally provides the caller to
isLoadFromStackSlot and isStoreToStackSlot with the number of bytes
spilt/loaded by the given instruction. Optimisations can then determine
that a full spill followed by a partial load (or vice versa), for
example, cannot necessarily be commuted.
Patch by Jeremy Morse!
Differential Revision: https://reviews.llvm.org/D44782
llvm-svn: 330778
This patch aims to provide correct dwarf unwind information in function
epilogue for X86.
It consists of two parts. The first part inserts CFI instructions that set
appropriate cfa offset and cfa register in emitEpilogue() in
X86FrameLowering. This part is X86 specific.
The second part is platform independent and ensures that:
* CFI instructions do not affect code generation (they are not counted as
instructions when tail duplicating or tail merging)
* Unwind information remains correct when a function is modified by
different passes. This is done in a late pass by analyzing information
about cfa offset and cfa register in BBs and inserting additional CFI
directives where necessary.
Added CFIInstrInserter pass:
* analyzes each basic block to determine cfa offset and register are valid
at its entry and exit
* verifies that outgoing cfa offset and register of predecessor blocks match
incoming values of their successors
* inserts additional CFI directives at basic block beginning to correct the
rule for calculating CFA
Having CFI instructions in function epilogue can cause incorrect CFA
calculation rule for some basic blocks. This can happen if, due to basic
block reordering, or the existence of multiple epilogue blocks, some of the
blocks have wrong cfa offset and register values set by the epilogue block
above them.
CFIInstrInserter is currently run only on X86, but can be used by any target
that implements support for adding CFI instructions in epilogue.
Patch by Violeta Vukobrat.
Differential Revision: https://reviews.llvm.org/D42848
llvm-svn: 330706
Summary:
The pass is supposed to scalarize such intrinsics if the target does not support
them natively, so if the scalarization does not happen instruction selection
crashes due to inability to lower these intrinsics.
Reviewers: andrew.w.kaylor, craig.topper
Reviewed By: andrew.w.kaylor
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45947
llvm-svn: 330700
Summary:
This is [[ https://bugs.llvm.org/show_bug.cgi?id=37104 | PR37104 ]].
[[ https://bugs.llvm.org/show_bug.cgi?id=6773 | PR6773 ]] will introduce an IR canonicalization that is likely bad for the end assembly.
Previously, `andl`+`andn`/`andps`+`andnps` / `bic`/`bsl` would be generated. (see `@out`)
Now, they would no longer be generated (see `@in`).
So we need to make sure that they are still generated.
If the mask is constant, we do nothing. InstCombine should have unfolded it.
Else, i use `hasAndNot()` TLI hook.
For now, only handle scalars.
https://rise4fun.com/Alive/bO6
----
I *really* don't like the code i wrote in `DAGCombiner::unfoldMaskedMerge()`.
It is super fragile. Is there something like IR Pattern Matchers for this?
Reviewers: spatel, craig.topper, RKSimon, javed.absar
Reviewed By: spatel
Subscribers: andreadb, courbet, kristof.beyls, javed.absar, rengolin, nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D45733
llvm-svn: 330646
This helps debug issues where selection-dag assigns the wrong location
to an instruction.
Differential Revision: https://reviews.llvm.org/D45913
llvm-svn: 330618
Summary:
This just refactors the lowering of the atomic memory intrinsics to more
closely match the code patterns used in the lowering of the non-atomic
memory intrinsics. Specifically, we encapsulate the lowering in
SelectionDAG::getAtomicMem*() functions rather than embedding
the code directly in the SelectionDAGBuilder code.
llvm-svn: 330603
In certain cases, the compiler might try to merge __stack_chk_guard with
another global variable. (Or someone could theoretically define
__stack_chk_guard as an alias.) In that case, make sure we don't crash.
Differential Revision: https://reviews.llvm.org/D45746
llvm-svn: 330495
This was originally committed at rL328921 and reverted at rL329920 to
investigate failures in Chrome. This time I've added to the ReleaseNotes
to warn users of the potential of exposing UB and let me repeat that
here for more exposure:
Optimization of floating-point casts is improved. This may cause surprising
results for code that is relying on undefined behavior. Code sanitizers can
be used to detect affected patterns such as this:
int main() {
float x = 4294967296.0f;
x = (float)((int)x);
printf("junk in the ftrunc: %f\n", x);
return 0;
}
$ clang -O1 ftrunc.c -fsanitize=undefined ; ./a.out
ftrunc.c:5:15: runtime error: 4.29497e+09 is outside the range of
representable values of type 'int'
junk in the ftrunc: 0.000000
Original commit message:
fptosi / fptoui round towards zero, and that's the same behavior as ISD::FTRUNC,
so replace a pair of casts with the equivalent node. We don't have to account for
special cases (NaN, INF) because out-of-range casts are undefined.
Differential Revision: https://reviews.llvm.org/D44909
llvm-svn: 330437