Use getAPIntValue() directly - this is mainly a best practice style issue to help prevent fuzz tests blowing up when a i12345 (or whatever) is generated.
Use getConstantOperandVal/getConstantOperandAPInt wrappers where possible.
llvm-svn: 371315
```
.type foo,@gnu_indirect_function
.set foo,foo_resolver
.set foo2,foo
.set foo3,foo2
```
The types of foo2 and foo3 should be STT_GNU_IFUNC, but we currently
resolve them to the type of foo_resolver. This patch fixes it.
Differential Revision: https://reviews.llvm.org/D67206
Patch by Senran Zhang
llvm-svn: 371312
Summary:
Normally TargetLowering::expandFixedPointMul would handle
SMULFIXSAT with scale zero by using an SMULO to compute the
product and determine if saturation is needed (if overflow
happened). But if SMULO isn't custom/legal it falls through
and uses the same technique, using MULHS/SMUL_LOHI, as used
for non-zero scales.
Problem was that when checking for overflow (handling saturation)
when not using MULO we did not expect to find a zero scale. So
we ended up in an assertion when doing
APInt::getLowBitsSet(VTSize, Scale - 1)
This patch fixes the problem by adding a new special case for
how saturation is computed when scale is zero.
Reviewers: RKSimon, bevinh, leonardchan, spatel
Reviewed By: RKSimon
Subscribers: wuzish, nemanjai, hiraditya, MaskRay, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67071
llvm-svn: 371309
Summary:
Add an intrinsic that takes 2 unsigned integers with
the scale of them provided as the third argument and
performs fixed point multiplication on them. The
result is saturated and clamped between the largest and
smallest representable values of the first 2 operands.
This is a part of implementing fixed point arithmetic
in clang where some of the more complex operations
will be implemented as intrinsics.
Patch by: leonardchan, bjope
Reviewers: RKSimon, craig.topper, bevinh, leonardchan, lebedev.ri, spatel
Reviewed By: leonardchan
Subscribers: ychen, wuzish, nemanjai, MaskRay, jsji, jdoerfert, Ka-Ka, hiraditya, rjmccall, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57836
llvm-svn: 371308
Fix for https://bugs.llvm.org/show_bug.cgi?id=43230.
When creating PSHUFLW from a repeated shuffle mask, we have to apply
the checks to the repeated mask, not the original one. For the test
case from PR43230 the inspected part of the original mask is all undef.
Differential Revision: https://reviews.llvm.org/D67314
llvm-svn: 371307
This addresses the issue mentioned on D19867. When we simplify
with.overflow instructions in CVP, we leave behind extractvalue
of insertvalue sequences that LVI no longer understands. This
means that we can not simplify any instructions based on the
with.overflow anymore (until some over pass like InstCombine
cleans them up).
This patch extends LVI extractvalue handling by calling
SimplifyExtractValueInst (which doesn't do anything more than
constant folding + looking through insertvalue) and using the block
value of the simplification.
A possible alternative would be to do something similar to
SimplifyIndVars, where we instead directly try to replace
extractvalue users of the with.overflow. This would need some
additional structural changes to CVP, as it's currently not legal
to remove anything but the current instruction -- we'd have to
introduce a worklist with instructions scheduled for deletion or similar.
Differential Revision: https://reviews.llvm.org/D67035
llvm-svn: 371306
Summary:
The value operand in DW_OP_plus_uconst/DW_OP_constu value can be
large (it uses uint64_t as representation internally in LLVM).
This means that in the uint64_t to int conversions, previously done
by DwarfExpression::addMachineRegExpression, could lose information.
Also, the negation done in "-Offset" was undefined behavior in case
Offset was exactly INT_MIN.
To avoid the above problems, we now avoid transformation like
[Reg, DW_OP_plus_uconst, Offset] --> [DW_OP_breg, Offset]
and
[Reg, DW_OP_constu, Offset, DW_OP_plus] --> [DW_OP_breg, Offset]
when Offset > INT_MAX.
And we avoid to transform
[Reg, DW_OP_constu, Offset, DW_OP_minus] --> [DW_OP_breg,-Offset]
when Offset > INT_MAX+1.
The patch also adjusts DwarfCompileUnit::constructVariableDIEImpl
to make sure that "DW_OP_constu, Offset, DW_OP_minus" is used
instead of "DW_OP_plus_uconst, Offset" when creating DIExpressions
with negative frame index offsets.
Notice that this might just be the tip of the iceberg. There
are lots of fishy handling related to these constants. I think both
DIExpression::appendOffset and DIExpression::extractIfOffset may
trigger undefined behavior for certain values.
Reviewers: sdesmalen, rnk, JDevlieghere
Reviewed By: JDevlieghere
Subscribers: jholewinski, aprantl, hiraditya, ychen, uabelho, llvm-commits
Tags: #debug-info, #llvm
Differential Revision: https://reviews.llvm.org/D67263
llvm-svn: 371304
This currently triggers undefined behavior if executed with an
ubsan build. It is just a precommit of the test case to show that
we got a problem.
Fix is proposed in https://reviews.llvm.org/D67263 and plan is to
commit the fix directly after this patch.
llvm-svn: 371303
Summary:
This patch introduces initial `AAValueSimplify` which simplifies a value in a context.
example
- (for function returned) If all the return values are the same and constant, then we can replace callsite returned with the constant.
- If an internal function takes the same value(constant) as an argument in the callsite, then we can replace the argument with that constant.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66967
llvm-svn: 371291
Summary:
This is the first change to enable the TLI to be built per-function so
that -fno-builtin* handling can be migrated to use function attributes.
See discussion on D61634 for background. This is an enabler for fixing
handling of these options for LTO, for example.
This change should not affect behavior, as the provided function is not
yet used to build a specifically per-function TLI, but rather enables
that migration.
Most of the changes were very mechanical, e.g. passing a Function to the
legacy analysis pass's getTLI interface, or in Module level cases,
adding a callback. This is similar to the way the per-function TTI
analysis works.
There was one place where we were looking for builtins but not in the
context of a specific function. See FindCXAAtExit in
lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround
could provide the wrong behavior in some corner cases. Suggestions
welcome.
Reviewers: chandlerc, hfinkel
Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66428
llvm-svn: 371284
version after r371273.
Also fix a minor issue in r371273 that only surfaced after template
instantiation from LLVM's use of the demangler.
llvm-svn: 371274
Despite the fact that the localizer's original motivation was to fix horrendous
constant spilling at -O0, shortening live ranges still has net benefits even
with optimizations enabled.
On an -Os build of CTMark, doing this improves code size by 0.5% geomean.
There are a few regressions, bullet increasing in size by 0.5%. One example from
bullet where code size increased slightly was due to GlobalISel actually now
generating the same code as SelectionDAG. So we actually have an opportunity
in future to implement better heuristics for localization and therefore be
*better* than SDAG in some cases. In relation to other optimizations though that
one is relatively minor.
Differential Revision: https://reviews.llvm.org/D67303
llvm-svn: 371266
Add the new method `LibCallSimplifier::substituteInParent()` that calls
`LibCallSimplifier::replaceAllUsesWith()' and
`LibCallSimplifier::eraseFromParent()` back to back, simplifying the
resulting code.
llvm-svn: 371264
Summary:
There's an unspoken invariant of callbr that the list of BlockAddress
Constants in the "function args" list match the BasicBlocks in the
"other labels" list. (This invariant is being added to the LangRef in
https://reviews.llvm.org/D67196).
When modifying the any of the indirect destinations of a callbr
instruction (possible jump targets), we need to update the function
arguments if the argument is a BlockAddress whose BasicBlock refers to
the indirect destination BasicBlock being replaced. Otherwise, many
transforms that modify successors will end up violating that invariant.
A recent change to the arm64 Linux kernel exposed this bug, which
prevents the kernel from booting.
I considered maintaining a mapping from indirect destination BasicBlock
to argument operand BlockAddress, but this ends up being a one to
potentially many (though usually one) mapping. Also, the list of
arguments to a function (or more typically inline assembly) ends up
being less than 10. The implementation is significantly simpler to just
rescan the full list of arguments. Because of the one to potentially
many relationship, the full arg list must be scanned (we can't stop at
the first instance).
Thanks to the following folks that reported the issue and helped debug
it:
* Nathan Chancellor
* Will Deacon
* Andrew Murray
* Craig Topper
Link: https://bugs.llvm.org/show_bug.cgi?id=43222
Link: https://github.com/ClangBuiltLinux/linux/issues/649
Link: https://lists.infradead.org/pipermail/linux-arm-kernel/2019-September/678330.html
Reviewers: craig.topper, chandlerc
Reviewed By: craig.topper
Subscribers: void, javed.absar, kristof.beyls, hiraditya, llvm-commits, nathanchance, srhines
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67252
llvm-svn: 371262
Trying to minimize the features we need to manipulate when this
is updated for D67259.
The VBMI is interesting because it enables some improved combining
for truncates.
I enabled fast-variable-shuffle because all the CPUs we're going
to add implicitly enable it. So they can share check lines.
llvm-svn: 371261
It was pointed out that I had hard-coded PlatformKind. This is rectifying that.
Differential Revision: https://reviews.llvm.org/D67255
llvm-svn: 371248
Test verified that we could compile an empty module and produce an XCOFF
object file. Newer tests superssed this coverage, its safe to remove.
llvm-svn: 371247
ORC-RPC batches calls by default, and the channel's send method must be called
to transfer any buffered calls to the remote. The call to send was missing on
responses and blocking calls in the SingleThreadedRPCEndpoint. This patch adds
the necessary calls and modifies the RPC unit test to check for them.
llvm-svn: 371245
The llvm-jitlink utility now accepts a '-slab-allocate <size>' option. If given,
llvm-jitlink will use a slab-based memory manager rather than the default
InProcessMemoryManager. Using a slab allocator will allow reliable testing of
future locality based optimizations (e.g. PLT and GOT elimination) in JITLink.
The <size> argument is a number, optionally followed by a units specifier (Kb,
Mb, or Gb). If the units are not given then the number is assumed to be in Kb.
llvm-svn: 371244
We can use a MOVSX16 here then rely on FixupBWInst to change to
MOVSX32 if the upper bits are dead. With a special case to
not promote if it could be turned into CBW.
Then we can rely on X86MCInstLower to turn the MOVSX into CBW
very late if register allocation worked out.
Using MOVSX gives an opportunity to use the MOVSX as a both a
copy and a sign extend since the input and output register aren't
tied together.
Differential Revision: https://reviews.llvm.org/D67192
llvm-svn: 371243
We can rely on X86FixupBWInsts to turn these into MOVZX32. This
simplifies a follow up commit to use MOVSX for i8 sdivrem with
a late optimization to use CBW when register allocation works out.
llvm-svn: 371242
Extend the common/local-common testing for object files to also verify the
symbol table now that the needed functionality has landed in llvm-readobj.
Differential Revision: https://reviews.llvm.org/D66944
llvm-svn: 371237
The IRBuilder doesn't know that the two floating point to integer instructions
have constrained equivalents. This patch adds the support by building on
the strict FP mode now present in the IRBuilder.
Reviewed by: John McCall
Approved by: John McCall
Differential Revision: https://reviews.llvm.org/D67291
llvm-svn: 371235
In order to keep remarks around, we need to make them tied to a string
table.
Users then can delete the parser and rely on the string table to keep
the memory of the strings alive and deduplicated.
llvm-svn: 371233
-tailcallopt requires that we perform different stack adjustments than with
sibling calls. For example, the `@caller_to0_from8` function in
test/CodeGen/AArch64/tail-call.ll requires that we adjust SP. Without
-tailcallopt, this adjustment does not happen. With it, however, it is expected.
So, to ensure that adding sibling call support doesn't break -tailcallopt,
make CallLowering always fall back on possible tail calls when -tailcallopt
is passed in.
Update test/CodeGen/AArch64/tail-call.ll with a GlobalISel line to make sure
that we don't differ from the SDAG implementation at any point.
Differential Revision: https://reviews.llvm.org/D67245
llvm-svn: 371227
Summary:
This isn't an important optimization at all... We're already doing:
pow(x, 0.0) -> 1.0
My patch merely teaches instcombine that -0.0 does the same.
However, doing this fixes an AMAZING bug! Compile this program:
extern "C" double pow(double, double);
double boom(double base) {
return pow(base, -0.0);
}
With:
clang++ ~/Desktop/fast-math.cpp -ffast-math -O2 -S
And clang will crash with a signal. Wow, fast math is so fast it ICEs the
compiler! Arguably, the generated math is infinitely fast.
What's actually happening is that we recurse infinitely in getPow. In debug we
hit its assertion:
assert(Exp != 0 && "Incorrect exponent 0 not handled");
We avoid this entire mess if we instead recognize that an exponent of positive
and negative zero yield 1.0.
A separate commit, r371221, fixed the same problem. This only contains the added
tests.
<rdar://problem/54598300>
Reviewers: scanon
Subscribers: hiraditya, jkorous, dexonsmith, ributzka, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67248
llvm-svn: 371224
This patch sinks add/mul(shufflevector(insertelement())) into the basic block in which they are used so that they can then be selected together.
This is useful for various MVE instructions, such as vmla and others that take R registers.
Loop tests have been added to the vmla test file to make sure vmlas are generated in loops.
Differential revision: https://reviews.llvm.org/D66295
llvm-svn: 371218
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: nemanjai, javed.absar, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, s.egerton, pzheng, ychen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67267
llvm-svn: 371212
Summary:
This was discovered while introducing the llvm::Align type.
The original setMinFunctionAlignment used to take alignment as log2, looking at the comment it seems like instructions are to be 2-bytes aligned and not 4-bytes aligned.
Reviewers: uweigand
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67271
llvm-svn: 371204
This is a fix for https://bugs.llvm.org/show_bug.cgi?id=40785.
llvm-readelf does not print the st_value of the symbol when
st_value has any non-visibility bits set.
This patch:
* Aligns "Ndx" row for the default and a new cases.
(it was 1 space character off for the case when "PROTECTED" visibility was printed)
* Prints "[<other>: 0x??]" for symbols which has an additional st_other bits set.
In compare with GNU, this logic is a bit simpler and seems to be more consistent.
For MIPS GNU can print named flags, though can't print a mix of them:
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 0 NOTYPE GLOBAL DEFAULT [OPTIONAL] UND a1
2: 00000000 0 NOTYPE GLOBAL DEFAULT [MIPS PLT] UND a2
3: 00000000 0 NOTYPE GLOBAL DEFAULT [MIPS PIC] UND a3
4: 00000000 0 NOTYPE GLOBAL DEFAULT [MICROMIPS] UND a4
5: 00000000 0 NOTYPE GLOBAL DEFAULT [MIPS16] UND a5
6: 00000000 0 NOTYPE GLOBAL DEFAULT [<other>: c] UND b1
7: 00000000 0 NOTYPE GLOBAL DEFAULT [<other>: 28] UND b2
On PPC64 it can print a localentry value that is encoded in the high bits of st_other
63: 0000000000000850 208 FUNC GLOBAL DEFAULT [<localentry>: 8] 12
We chose to print the raw st_other field, prefixed with '0x'.
Differential revision: https://reviews.llvm.org/D67094
llvm-svn: 371201
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: jyknight, sdardis, nemanjai, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, s.egerton, pzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67229
llvm-svn: 371200
This patch allows the DFAPacketizer to be queried after a packet is formed to work out which
resources were allocated to the packetized instructions.
This is particularly important for targets that do their own bundle packing - it's not
sufficient to know simply that instructions can share a packet; which slots are used is
also required for encoding.
This extends the emitter to emit a side-table containing resource usage diffs for each
state transition. The packetizer maintains a set of all possible resource states in its
current state. After packetization is complete, all remaining resource states are
possible packetization strategies.
The sidetable is only ~500K for Hexagon, but the extra tracking is disabled by default
(most uses of the packetizer like MachinePipeliner don't care and don't need the extra
maintained state).
Differential Revision: https://reviews.llvm.org/D66936
llvm-svn: 371198
If a stack spill location is overwritten by another spill instruction,
any variable locations pointing at that slot should be terminated. We
cannot rely on spills always being restored to registers or variable
locations being moved by a DBG_VALUE: the register allocator is entitled
to spill a value and then forget about it when it goes out of liveness.
To address this, scan for memory writes to spill locations, even those we
don't consider to be normal "spills". isSpillInstruction and
isLocationSpill distinguish the two now. After identifying spill
overwrites, terminate the open range, and insert a $noreg DBG_VALUE for
that variable.
Differential Revision: https://reviews.llvm.org/D66941
llvm-svn: 371193
Summary:
This fixes poor scheduling in a function containing a barrier and a few
load instructions.
Without this fix, ScheduleDAGInstrs::buildSchedGraph adds an artificial
edge in the dependency graph from the barrier instruction to the exit
node representing live-out latency, with a latency of about 500 cycles.
Because of this it thinks the critical path through the graph also has
a latency of about 500 cycles. And because of that it does not think
that any of the load instructions are on the critical path, so it
schedules them with no regard for their (80 cycle) latency, which gives
poor results.
Reviewers: arsenm, dstuttard, tpr, nhaehnle
Subscribers: kzhuravl, jvesely, wdng, yaxunl, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67218
llvm-svn: 371192
`struct Elf*_Shdr` has a field `sh_offset`, named `ShOffset` in
llvm::ELFYAML::Section. Rename SHOffset (e_shoff) to SHOff to prevent confusion.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D67254
llvm-svn: 371185
The MVE and LOB extensions of Armv8.1m can be combined to enable
'tail predication' which removes the need for a scalar remainder
loop after vectorization. Lane predication is performed implicitly
via a system register. The effects of predication is described in
Section B5.6.3 of the Armv8.1-m Arch Reference Manual, the key points
being:
- For vector operations that perform reduction across the vector and
produce a scalar result, whether the value is accumulated or not.
- For non-load instructions, the predicate flags determine if the
destination register byte is updated with the new value or if the
previous value is preserved.
- For vector store instructions, whether the store occurs or not.
- For vector load instructions, whether the value that is loaded or
whether zeros are written to that element of the destination
register.
This patch implements a pass that takes a hardware loop, containing
masked vector instructions, and converts it something that resembles
an MVE tail predicated loop. Currently, if we had code generation,
we'd generate a loop in which the VCTP would generate the predicate
and VPST would then setup the value of VPR.PO. The loads and stores
would be placed in VPT blocks so this is not tail predication, but
normal VPT predication with the predicate based upon a element
counting induction variable. Further work needs to be done to finally
produce a true tail predicated loop.
Because only the loads and stores are predicated, in both the LLVM IR
and MIR level, we will restrict support to only lane-wise operations
(no horizontal reductions). We will perform a final check on MIR
during loop finalisation too.
Another restriction, specific to MVE, is that all the vector
instructions need operate on the same number of elements. This is
because predication is performed at the byte level and this is set
on entry to the loop, or by the VCTP instead.
Differential Revision: https://reviews.llvm.org/D65884
llvm-svn: 371179
Summary:
Fix a bug of not update the jump table and recommit it again.
In `block-placement` pass, it will create some patterns for unconditional we can do the simple early retrun.
But the `early-ret` pass is before `block-placement`, we don't want to run it again.
This patch is to do the simple early return to optimize the blocks at the last of `block-placement`.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D63972
llvm-svn: 371177
The
;CHECK: bb
;CHECK-NEXT: %namedVReg1353:_(p0) = COPY $d0
parts of the test case failed when the tests were placed in a directory
including "bb" in the path, since the full path of the file is then
output in the
; ModuleID = '/repo/bb/
line which the CHECK matched on and then the CHECK-NEXT failed.
llvm-svn: 371171
Summary: It says [[ http://www.sco.com/developers/gabi/latest/ch4.eheader.html | here ]] that if there are no program headers than e_phoff should be 0, but currently it is always set after the header. GNU's `readelf` (but not `llvm-readelf`) complains about this: `readelf: Warning: possibly corrupt ELF header - it has a non-zero program header offset, but no program headers`.
Reviewers: jhenderson, grimar, MaskRay, rupprecht
Reviewed By: jhenderson, grimar, MaskRay
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67054
llvm-svn: 371162
Passing INT64_MIN to MCInstPrinter::formatHex triggers undefined
behavior because the negation of -9223372036854775808 cannot be
represented in type 'int64_t' (aka 'long long'). This patch puts a
workaround in place to just print the hex value directly.
A possible alternative involves using a small helper functions that uses
(implementation) defined conversions to achieve the desirable value:
static int64_t helper(int64_t V) {
auto U = static_cast<uint64_t>(V);
return V < 0 ? -U : U;
}
The underlying problem is that MCInstPrinter::formatHex(int64_t) returns
a format_object<int64_t> and should really return a
format_object<uint64_t>. However, that's not possible because formatImm
needs to be able to print both as decimal (where a signed is required)
and hex (where we'd prefer to always have an unsigned).
format_object<int64_t> formatImm(int64_t Value) const {
return PrintImmHex ? formatHex(Value) : formatDec(Value);
}
Differential revision: https://reviews.llvm.org/D67236
llvm-svn: 371159
See http://lists.llvm.org/pipermail/llvm-dev/2019-February/130583.html
and D60242 for the lld partition feature.
This patch:
* Teaches yaml2obj to parse the 3 section types.
* Teaches llvm-readobj/llvm-readelf to dump the 3 section types.
There is no test for SHT_LLVM_DEPENDENT_LIBRARIES in llvm-readobj. Add
it as well.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D67228
llvm-svn: 371157
This was only using the correct register constraints if this was the
final result instruction. If the extract was a sub instruction of the
result, it would attempt to use GIR_ConstrainSelectedInstOperands on a
COPY, which won't work. Move the handling to
createAndImportSubInstructionRenderer so it works correctly.
I don't fully understand why runOnPattern and
createAndImportSubInstructionRenderer both need to handle these
special cases, and constrain them with slightly different methods. If
I remove the runOnPattern handling, it does break the constraint when
the final result instruction is EXTRACT_SUBREG.
llvm-svn: 371150
The same stack is loaded for each workitem ID, and each use. Nothing
prevents you from creating multiple fixed stack objects with the same
offsets, so this was creating a load for each unique frame index,
despite them being the same offset. Re-use the same frame index so the
loads are CSEable.
llvm-svn: 371148
Properly check if NewAAInfo conflicts with AAInfo.
Update local variable and alias set that a change occured when a conflict is found.
Resolves PR42969.
llvm-svn: 371139
Summary:
Here we try to avoid issues with "explicit branch" with SimplifyBranchOnICmpChain
which can check on undef. Msan by design reports branches on uninitialized
memory and undefs, so we have false report here.
In general msan does not like when we convert
```
// If at least one of them is true we can MSAN is ok if another is undefs
if (a || b)
return;
```
into
```
// If 'a' is undef MSAN will complain even if 'b' is true
if (a)
return;
if (b)
return;
```
Example
Before optimization we had something like this:
```
while (true) {
bool maybe_undef = doStuff();
while (true) {
char c = getChar();
if (c != 10 && c != 13)
continue
break;
}
// we know that c == 10 || c == 13 if we get here,
// so msan know that branch is not affected by maybe_undef
if (maybe_undef || c == 10 || c == 13)
continue;
return;
}
```
SimplifyBranchOnICmpChain will convert that into
```
while (true) {
bool maybe_undef = doStuff();
while (true) {
char c = getChar();
if (c != 10 && c != 13)
continue;
break;
}
// however msan will complain here:
if (maybe_undef)
continue;
// we know that c == 10 || c == 13, so either way we will get continue
switch(c) {
case 10: continue;
case 13: continue;
}
return;
}
```
Reviewers: eugenis, efriedma
Reviewed By: eugenis, efriedma
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67205
llvm-svn: 371138
Approximately 30% of the time was spent in the std::vector
constructor. In one testcase this pushes the scheduler to being the
second slowest pass.
I'm not sure I understand why these vector are necessary. The default
scheduler initCandidate seems to use some pre-existing vectors for the
pressure.
llvm-svn: 371136
This commit includes the following changes: Adds a Getting Involved section under Community. Moves the Development Process section under Community. Moves Sphinx Quickstart Template and How to submit an LLVM bug report from User Guides section to Getting Involved.
llvm-svn: 371127
This patch reuses the MIR vreg renamer from the MIRCanonicalizerPass to cleanup
names of vregs in a MIR file for MIR test authors. I found it useful when
writing a regression test for a globalisel failure I encountered recently and
thought it might be useful for other folks as well.
Differential Revision: https://reviews.llvm.org/D67209
llvm-svn: 371121
Now that we look through copies, it's possible to visit registers that
have a register class constraint but not a type constraint. Avoid looking
through copies when this occurs as the SrcReg won't be able to determine
it's bit width or any known bits.
Along the same lines, if the initial query is on a register that doesn't
have a type constraint then the result is a default-constructed KnownBits,
that is, a 1-bit fully-unknown value.
llvm-svn: 371116
Recommit basic sibling call lowering (https://reviews.llvm.org/D67189)
The issue was that if you have a return type other than void, call lowering
will emit COPYs to get the return value after the call.
Disallow sibling calls other than ones that return void for now. Also
proactively disable swifterror tail calls for now, since there's a similar issue
with COPYs there.
Update call-translator-tail-call.ll to include test cases for each of these
things.
llvm-svn: 371114
The code was incorrectly counting the number of identical instructions,
and therefore tried to predicate an instruction which should not have
been predicated. This could have various effects: a compiler crash,
an assembler failure, a miscompile, or just generating an extra,
unnecessary instruction.
Instead of depending on TargetInstrInfo::removeBranch, which only
works on analyzable branches, just remove all branch instructions.
Fixes https://bugs.llvm.org/show_bug.cgi?id=43121 and
https://bugs.llvm.org/show_bug.cgi?id=41121 .
Differential Revision: https://reviews.llvm.org/D67203
llvm-svn: 371111
As noted in PR43197, we can use test+add+cmov+sra to implement
signed division by a power of 2.
This is based off the similar version in AArch64, but I've
adjusted it to use target independent nodes where AArch64 uses
target specific CMP and CSEL nodes. I've also blocked INT_MIN
as the transform isn't valid for that.
I've limited this to i32 and i64 on 64-bit targets for now and only
when CMOV is supported. i8 and i16 need further investigation to be
sure they get promoted to i32 well.
I adjusted a few tests to enable cmov to demonstrate the new
codegen. I also changed twoaddr-coalesce-3.ll to 32-bit mode
without cmov to avoid perturbing the scenario that is being
set up there.
Differential Revision: https://reviews.llvm.org/D67087
llvm-svn: 371104
A follow-up for r329011.
This may be changed to produce @llvm.sub.with.overflow in a later patch,
but for now just make things more consistent overall.
A few observations stem from this:
* There does not seem to be a similar one-instruction fold for uadd-overflow
* I'm not sure we'll want to canonicalize `B u> A` as `usub.with.overflow`,
so since the `icmp` here no longer refers to `sub`,
reconstructing `usub.with.overflow` will be problematic,
and will likely require standalone pass (similar to DivRemPairs).
https://rise4fun.com/Alive/Zqs
Name: (A - B) u> A --> B u> A
%t0 = sub i8 %A, %B
%r = icmp ugt i8 %t0, %A
=>
%r = icmp ugt i8 %B, %A
Name: (A - B) u<= A --> B u<= A
%t0 = sub i8 %A, %B
%r = icmp ule i8 %t0, %A
=>
%r = icmp ule i8 %B, %A
Name: C u< (C - D) --> C u< D
%t0 = sub i8 %C, %D
%r = icmp ult i8 %C, %t0
=>
%r = icmp ult i8 %C, %D
Name: C u>= (C - D) --> C u>= D
%t0 = sub i8 %C, %D
%r = icmp uge i8 %C, %t0
=>
%r = icmp uge i8 %C, %D
llvm-svn: 371101
Summary:
This is a simple change that allows easy iterator semantics for symbols held in interface file.
Not being used, so harmless change right now, but will be once TBD-v4 is submitted.
Reviewers: ributzka, steven_wu
Reviewed By: ributzka
Subscribers: javed.absar, kristof.beyls, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67204
llvm-svn: 371097
Updates the links on the homepage by moving the User Guides, Programming Documentation, and Subsystem Documentation sections to separate pages. Also changes "Overview" to "About" at the top of the LLVM Docs homepage. This work is part of the Google Season of Docs project.
llvm-svn: 371096
If we have:
bb5:
br i1 %arg3, label %bb6, label %bb7
bb6:
%tmp = getelementptr inbounds i32, i32* %arg1, i64 2
store i32 3, i32* %tmp, align 4
br label %bb9
bb7:
%tmp8 = getelementptr inbounds i32, i32* %arg1, i64 2
store i32 3, i32* %tmp8, align 4
br label %bb9
bb9: ; preds = %bb4, %bb6, %bb7
...
We can't sink stores directly into bb9.
This patch creates new BB that is successor of %bb6 and %bb7
and sinks stores into that block.
SplitFooterBB is the parameter to the pass that controls
that behavior.
Change-Id: I7fdf50a772b84633e4b1b860e905bf7e3e29940f
Differential: https://reviews.llvm.org/D66234
llvm-svn: 371089
Summary:
Avoid visiting an instruction more than once by using a map.
This is similar to https://reviews.llvm.org/rL361416.
Reviewers: davidxl
Reviewed By: davidxl
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67198
llvm-svn: 371086
A number of inline assembly constraints are currently supported by LLVM, but rejected as invalid by Clang:
Target independent constraints:
s: An integer constant, but allowing only relocatable values
ARM specific constraints:
j: An immediate integer between 0 and 65535 (valid for MOVW)
x: A 32, 64, or 128-bit floating-point/SIMD register: s0-s15, d0-d7, or q0-q3
N: An immediate integer between 0 and 31 (Thumb1 only)
O: An immediate integer which is a multiple of 4 between -508 and 508. (Thumb1 only)
This patch adds support to Clang for the missing constraints along with some checks to ensure that the constraints are used with the correct target and Thumb mode, and that immediates are within valid ranges (at least where possible). The constraints are already implemented in LLVM, but just a couple of minor corrections to checks (V8M Baseline includes MOVW so should work with 'j', 'N' and 'O' shouldn't be valid in Thumb2) so that Clang and LLVM are in line with each other and the documentation.
Differential Revision: https://reviews.llvm.org/D65863
Change-Id: I18076619e319bac35fbb60f590c069145c9d9a0a
llvm-svn: 371079
As discussed on D64551 and PR43227, we don't correctly handle cases where the base load has a non-zero byte offset.
Until we can properly handle this, we must bail from EltsFromConsecutiveLoads.
llvm-svn: 371078
Linkers (ld.bfd/gold/lld) place the section header table at the very
end. This allows tools to strip it, which is optional in executable/shared objects.
In addition, if we add or section, the size of the section header table
will change. Placing the section header table in the end keeps section
offsets unchanged.
yaml2obj currently places the section header table immediately after the
program header. Follow what linkers do to make offset updating easier.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D67221
llvm-svn: 371074
D62179 introduced a regression. llvm-readelf lose the ability to dump the dynamic symbols
when there is .dynamic section with a DT_SYMTAB, but there are no program headers:
https://reviews.llvm.org/D62179#1652778
Below is a program flow before the D62179 change:
1) Find SHT_DYNSYM.
2) Find there is no PT_DYNAMIC => don't try to parse it.
3) Print dynamic symbols using information about them found on step (1).
And after the change it became:
1) Find SHT_DYNSYM.
2) Find there is no PT_DYNAMIC => find SHT_DYNAMIC.
3) Parse dynamic table, but fail to handle the DT_SYMTAB because of the absence of the PT_LOAD. Report the "Virtual address is not in any segment" error.
This patch fixes the issue. For doing this it checks that the value of DT_SYMTAB was
mapped to a segment. If not - it ignores it.
Differential revision: https://reviews.llvm.org/D67078
llvm-svn: 371071
This attempts to just fix the creation of VPT blocks, fixing up the iterating,
which instructions are considered in the bundle, and making sure that we do not
overrun the end of the block.
Differential Revision: https://reviews.llvm.org/D67219
llvm-svn: 371064
G_FENCE comes form fence instruction. For MIPS fence is generated in
AtomicExpandPass when atomic instruction gets surrounded with fence
instruction when needed.
G_FENCE arguments don't have LLT, because of that there is no job for
legalizer and regbankselect. Instruction select G_FENCE for MIPS32.
Differential Revision: https://reviews.llvm.org/D67181
llvm-svn: 371056
Select G_INTRINSIC_W_SIDE_EFFECTS for Intrinsic::trap for MIPS32
via legalizeIntrinsic.
Differential Revision: https://reviews.llvm.org/D67180
llvm-svn: 371055
Instead of returning structure by value clang usually adds pointer
to that structure as an argument. Pointers don't require special
handling no matter the SRet flag. Remove unsuccessful exit from
lowerCall for arguments with SRet flag if they are pointers.
Differential Revision: https://reviews.llvm.org/D67179
llvm-svn: 371054
This adds support for basic sibling call lowering in AArch64. The intent here is
to only handle tail calls which do not change the ABI (hence, sibling calls.)
At this point, it is very restricted. It does not handle
- Vararg calls.
- Calls with outgoing arguments.
- Calls whose calling conventions differ from the caller's calling convention.
- Tail/sibling calls with BTI enabled.
This patch adds
- `AArch64CallLowering::isEligibleForTailCallOptimization`, which is equivalent
to the same function in AArch64ISelLowering.cpp (albeit with the restrictions
above.)
- `mayTailCallThisCC` and `canGuaranteeTCO`, which are identical to those in
AArch64ISelLowering.cpp.
- `getCallOpcode`, which is exactly what it sounds like.
Tail/sibling calls are lowered by checking if they pass target-independent tail
call positioning checks, and checking if they satisfy
`isEligibleForTailCallOptimization`. If they do, then a tail call instruction is
emitted instead of a normal call. If we have a sibling call (which is always the
case in this patch), then we do not emit any stack adjustment operations. When
we go to lower a return, we check if we've already emitted a tail call. If so,
then we skip the return lowering.
For testing, this patch
- Adds call-translator-tail-call.ll to test which tail calls we currently lower,
which ones we don't, and which ones we shouldn't.
- Updates branch-target-enforcement-indirect-calls.ll to show that we fall back
as expected.
Differential Revision: https://reviews.llvm.org/D67189
........
This fails on EXPENSIVE_CHECKS builds due to a -verify-machineinstrs test failure in CodeGen/AArch64/dllimport.ll
llvm-svn: 371051
Handle the remaining cases also by handling asm goto in
SystemZInstrInfo::getBranchInfo().
Review: Ulrich Weigand
https://reviews.llvm.org/D67151
llvm-svn: 371048
Fixes clang static-analyzer warning.
Technically the MachineInstr *Sub might still be null if we're comparing zero (IsCmpZero == true), although this probably won't happen as SrcReg2 is probably == 0.
llvm-svn: 371047
Summary:
This patch renames functions that takes or returns alignment as log2, this patch will help with the transition to llvm::Align.
The renaming makes it explicit that we deal with log(alignment) instead of a power of two alignment.
A few renames uncovered dubious assignments:
- `MirParser`/`MirPrinter` was expecting powers of two but `MachineFunction` and `MachineBasicBlock` were using deal with log2(align). This patch fixes it and updates the documentation.
- `MachineBlockPlacement` exposes two flags (`align-all-blocks` and `align-all-nofallthru-blocks`) supposedly interpreted as power of two alignments, internally these values are interpreted as log2(align). This patch updates the documentation,
- `MachineFunctionexposes` exposes `align-all-functions` also interpreted as power of two alignment, internally this value is interpreted as log2(align). This patch updates the documentation,
Reviewers: lattner, thegameg, courbet
Subscribers: dschuff, arsenm, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, Jim, s.egerton, llvm-commits, courbet
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65945
llvm-svn: 371045
-ftime-trace could break flame-graph assumptions on Windows, with an
inner scope overrunning outer scopes. This was due to the way that times
were truncated. Changed this so time_points for the flame-graph are
truncated instead of durations, preserving the relative order of event
starts and ends.
I have tried to retain the extra precision for the totals, which count
thousands or millions of events.
Added assert to check this property holds in future.
Fixes PR43043
Differential Revision: https://reviews.llvm.org/D66411
llvm-svn: 371039
After r361885, realPathFromHandle() ends up getting called on the working
directory on each Clang invocation. This unveiled that the code didn't work for
paths on network shares.
For example, if one maps the local dir c:\src\tmp to x:
net use x: \\localhost\c$\tmp
and run e.g. "clang -c foo.cc" in x:\, realPathFromHandle will get
\\?\UNC\localhost\c$\src\tmp\ back from GetFinalPathNameByHandleW, and would
strip off the initial \\?\ prefix, ending up with a path that doesn't work.
This patch makes the prefix stripping a little smarter to handle this case.
Differential revision: https://reviews.llvm.org/D67166
llvm-svn: 371035
In D62809 I accidentally added "ELFState<ELFT> &State" as the
first parameter to two methods. There is no reason for having that.
I removed this argument and also moved finalizeStrings declaration to
remove an excessive 'private:' tag.
Differential revision: https://reviews.llvm.org/D67157
llvm-svn: 371033
Fix: added missing return "return 0;"
Original commit message:
This eliminates one of the error(1) call in this lib.
It is different from the others because happens on a fields mapping stage
and can be easily fixed.
Differential revision: https://reviews.llvm.org/D67150
llvm-svn: 371030
This eliminates one of the error(1) call in this lib.
It is different from the others because happens on a fields mapping stage
and can be easily fixed.
Differential revision: https://reviews.llvm.org/D67150
llvm-svn: 371023
As DW_AT_rnglists_base points after the header and headers have
different sizes for DWARF32 and DWARF64, we have to use the format
of the CU to adjust the offset correctly in order to extract
the referenced range list table.
The patch also changes the type of RangeSectionBase because in DWARF64
it is 8-bytes long.
Differential Revision: https://reviews.llvm.org/D67098
llvm-svn: 371016
In the review process for some of the refactoring of MIRCanonicalizationPass it
was noted that some of the tests didn't have verifier enabled. Enabling here.
llvm-svn: 371005
This adds support for basic sibling call lowering in AArch64. The intent here is
to only handle tail calls which do not change the ABI (hence, sibling calls.)
At this point, it is very restricted. It does not handle
- Vararg calls.
- Calls with outgoing arguments.
- Calls whose calling conventions differ from the caller's calling convention.
- Tail/sibling calls with BTI enabled.
This patch adds
- `AArch64CallLowering::isEligibleForTailCallOptimization`, which is equivalent
to the same function in AArch64ISelLowering.cpp (albeit with the restrictions
above.)
- `mayTailCallThisCC` and `canGuaranteeTCO`, which are identical to those in
AArch64ISelLowering.cpp.
- `getCallOpcode`, which is exactly what it sounds like.
Tail/sibling calls are lowered by checking if they pass target-independent tail
call positioning checks, and checking if they satisfy
`isEligibleForTailCallOptimization`. If they do, then a tail call instruction is
emitted instead of a normal call. If we have a sibling call (which is always the
case in this patch), then we do not emit any stack adjustment operations. When
we go to lower a return, we check if we've already emitted a tail call. If so,
then we skip the return lowering.
For testing, this patch
- Adds call-translator-tail-call.ll to test which tail calls we currently lower,
which ones we don't, and which ones we shouldn't.
- Updates branch-target-enforcement-indirect-calls.ll to show that we fall back
as expected.
Differential Revision: https://reviews.llvm.org/D67189
llvm-svn: 370996
Moving MIRCanonicalizerPass vreg renaming code to MIRVRegNamerUtils so that it
can be reused in another pass (ie planing to write a standalone mir-namer pass).
I'm going to write a mir-namer pass so that next time someone has to author a
test in MIR, they can use it to cleanup the naming and make it more readable by
having the numbered vregs swapped out with named vregs.
Differential Revision: https://reviews.llvm.org/D67114
llvm-svn: 370985
In mingw environments, resources are normally compiled to resource
object files directly, instead of letting the linker convert them to
COFF format.
Since some time, GCC supports the notion of a default manifest object.
When invoking the linker, GCC looks for the default manifest object
file, and if found in the expected path, it is added to linker commands.
The default manifest is one that indicates support for the latest known
versions of windows, to implicitly unlock the modern behaviours of certain
APIs.
Not all mingw/gcc distributions include this file, but e.g. in msys2,
the default manifest object is distributed in a separate package (which
can be but might not always be installed).
This means that even if user projects only use one single resource
object file, the linker can end up with two resource object files,
and thus needs to support merging them.
The default manifest has a language id of zero, and GNU ld has got
logic for dropping a manifest with a zero language id, if there's
another manifest present with a nonzero language id. If there are
multiple manifests with a nonzero language id, the merging process
errors out.
Differential Revision: https://reviews.llvm.org/D66825
llvm-svn: 370974
This patch merges the sancov module and funciton passes into one module pass.
The reason for this is because we ran into an out of memory error when
attempting to run asan fuzzer on some protobufs (pc.cc files). I traced the OOM
error to the destructor of SanitizerCoverage where we only call
appendTo[Compiler]Used which calls appendToUsedList. I'm not sure where precisely
in appendToUsedList causes the OOM, but I am able to confirm that it's calling
this function *repeatedly* that causes the OOM. (I hacked sancov a bit such that
I can still create and destroy a new sancov on every function run, but only call
appendToUsedList after all functions in the module have finished. This passes, but
when I make it such that appendToUsedList is called on every sancov destruction,
we hit OOM.)
I don't think the OOM is from just adding to the SmallSet and SmallVector inside
appendToUsedList since in either case for a given module, they'll have the same
max size. I suspect that when the existing llvm.compiler.used global is erased,
the memory behind it isn't freed. I could be wrong on this though.
This patch works around the OOM issue by just calling appendToUsedList at the
end of every module run instead of function run. The same amount of constants
still get added to llvm.compiler.used, abd we make the pass usage and logic
simpler by not having any inter-pass dependencies.
Differential Revision: https://reviews.llvm.org/D66988
llvm-svn: 370971
When using llvm-rtdyld to execute code, -show-times will now show the time
taken to load the object files, apply relocations, and execute the
rtdyld-linked code.
llvm-svn: 370968
Summary:
- `__wasm_init_memory` is now the WebAssembly start function instead
of being called from `__wasm_call_ctors` or called directly by the
runtime.
- Adds a new synthetic data symbol `__wasm_init_memory_flag` that is
atomically incremented from zero to one by the thread responsible
for initializing memory.
- All threads now unconditionally perform data.drop on all passive
segments.
- Removes --passive-segments and --active-segments flags and controls
segment type based on --shared-memory instead. The deleted flags
were only present to ameliorate the upgrade path in Emscripten.
Reviewers: sbc100, aheejin
Subscribers: dschuff, jgravelle-google, sunfish, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65783
llvm-svn: 370965
This patch adds the ability to encode and decode InlineInfo objects and adds test coverage. Error handling is introduced in the encoding and decoding which will be used from here on out for remaining patches.
Differential Revision: https://reviews.llvm.org/D66600
llvm-svn: 370936
Since an add instruction must produce an unused carry out, this
requires additional SGPRs. This can be avoided by keeping the entire
offset computation in SGPRs. If one SGPR is still available, this only
costs one extra mov. If none are available, the entire computation can
be done in place and reversed.
This does assume the use is a VGPR operand. This was already assumed,
and we currently only select frame indexes to VALU instructions. This
should probably be fixed at some point to handle more possible MIR.
llvm-svn: 370929
Summary:
Instead of building attributes for internal functions which we do not
update as long as we assume they are dead, we now do not create
attributes until we assume the internal function to be live. This
improves the number of required iterations, as well as the number of
required updates, in real code. On our tests, the results are mixed.
Reviewers: sstefan1, uenoku
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66914
llvm-svn: 370924
Summary:
We create attributes on-demand so we need to check the white list
on-demand. This also unifies the location at which we create,
initialize, and eventually invalidate new abstract attributes.
The tests show mixed results, a few more call site attributes are
determined which can cause more iterations.
Reviewers: uenoku, sstefan1
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66913
llvm-svn: 370922
This partially adds support for patterns with REG_SEQUENCE. The source
patterns are now accepted, but the pattern is still rejected due to
missing support for the instruction renderer.
llvm-svn: 370920
Summary:
Before we tried to rule out non-exact definitions early but that lead to
on-demand attributes created for them anyway. As a consequence we needed
to look at the definition in the initialize of each attribute again.
This patch centralized this lookup and tightens the condition under
which we give up on non-exact definitions.
Reviewers: uenoku, sstefan1
Subscribers: hiraditya, bollu, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67115
llvm-svn: 370917
As commented on D65712, the getIRPosition methods weren't correctly being overridden.
Differential Revision: https://reviews.llvm.org/D67170
llvm-svn: 370914
SROA pass processes debug info incorrecly if applied twice.
Specifically, after SROA works first time, instcombine converts dbg.declare
intrinsics into dbg.value. Inlining creates new opportunities for SROA,
so it is called again. This time it does not handle correctly previously
inserted dbg.value intrinsics.
Differential Revision: https://reviews.llvm.org/D64595
llvm-svn: 370906
Apologies, due to a git SNAFU this fix (dump doesn't exist and silence unused variables) stayed in my index rather than applying to rL370893.
llvm-svn: 370894
This is the beginnings of a reimplementation of ModuloScheduleExpander. It works
by generating a single-block correct pipelined kernel and then peeling out the
prolog and epilogs.
This patch implements kernel generation as well as a validator that will
confirm the number of phis added is the same as the ModuloScheduleExpander.
Prolog and epilog peeling will come in a different patch.
Differential Revision: https://reviews.llvm.org/D67081
llvm-svn: 370893
the test is building a 64-bit executable, so the addresses should be
64-bit too. The test was still passing even with smaller address size,
but it was hitting the "unexpected end of data" error sooner than it
should.
llvm-svn: 370882
When comparing variable locations, LiveDebugValues currently considers only
the machine location, ignoring any DIExpression applied to it. This is a
problem because that DIExpression can do pretty much anything to the machine
location, for example dereferencing it.
This patch adds DIExpressions to that comparison; now variables based on the
same register/memory-location but with different expressions will compare
differently, and be dropped if we attempt to merge them between blocks. This
reduces variable coverage-range a little, but only because we were producing
broken locations.
Differential Revision: https://reviews.llvm.org/D66942
llvm-svn: 370877
Breaks BUILD_SHARED_LIBS build, introduces cycles in library dependency
graphs. (clangInterp depends on clangAST which depends on clangInterp)
This reverts r370839, which is an yet another recommit of D64146.
llvm-svn: 370874
Tested on VS2017 and VS2019 llvm/clang builds with WX enabled - its no longer necessary to disable this warning.
Differential Revision: https://reviews.llvm.org/D67103
llvm-svn: 370871
On release builds, 'MI' isn't used by anything (it's already inserted into a
block by BuildMI), while on non-release builds it's used by a LLVM_DEBUG
statement. Mark as explicitly used to avoid the warning.
llvm-svn: 370870
Summary:
While fixing the handling of some error cases, r370363 introduced new
problems -- assertion failures due to unchecked errors (my excuse is that a very
early version of that patch used Optional<T> instead of Expected).
This patch adds proper handling of parsing errors encountered when
dumping location lists from inside DWARF DIEs, and adds a bunch of
additional tests.
I reorder the arguments of the location list dumping functions to make
them consistent, and also be able to dump the two kinds of location
lists generically.
Reviewers: JDevlieghere, dblaikie, probinson
Subscribers: aprantl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67102
llvm-svn: 370868
Tested on VS2017 and VS2019 llvm/clang builds with WX enabled - its no longer necessary to disable this warning.
Differential Revision: https://reviews.llvm.org/D67047
llvm-svn: 370866
PT_GNU_STACK is used in an llvm-objcopy test.
I plan to use PT_GNU_RELRO in a patch to improve nested segment
processing in llvm-objcopy (PR42963).
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D67146
llvm-svn: 370857
"Section" can refer to the type llvm::objcopy:🧝:Section or the
variable name. Rename it to "Sec" for clarity. "Sec" is already used a
lot, so this change improves consistency as well.
Also change `auto` to `const SectionBase` for readability.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D67143
llvm-svn: 370852
For any unpaired muls, we accumulate them as an input to the
reduction. Check the type of the mul and perform a sext if the
existing accumlator input type is not the same.
Differential Revision: https://reviews.llvm.org/D66993
llvm-svn: 370851
Summary: Previously module pass printer pass prints the banner even when the module doesn't include any function provided with `-filter-print-funcs` option. This introduced a lot of noise, especailly with ThinLTO. This diff addresses the issue and makes the banner printed only when the module includes functions in `-filter-print-funcs` list.
Reviewers: fedor.sergeev
Subscribers: mehdi_amini, hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66560
llvm-svn: 370849
Similar to the issue with G_ZEXT that was fixed earlier, this is a quick
to fall back if the source type is not exactly half of the dest type.
Fixes the clang-cmake-aarch64-lld bot build.
llvm-svn: 370847
Summary:
This patch introduces the skeleton of the constexpr interpreter,
capable of evaluating a simple constexpr functions consisting of
if statements. The interpreter is described in more detail in the
RFC. Further patches will add more features.
Reviewers: Bigcheese, jfb, rsmith
Subscribers: bruno, uenoku, ldionne, Tyker, thegameg, tschuett, dexonsmith, mgorny, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D64146
llvm-svn: 370839
This reverts r370525 (git commit 0bb1630685)
Also reverts r370543 (git commit 185ddc08ee)
The approach I took only works for functions marked `noreturn`. In
general, a call that is not known to be noreturn may be followed by
unreachable for other reasons. For example, there could be multiple call
sites to a function that throws sometimes, and at some call sites, it is
known to always throw, so it is followed by unreachable. We need to
insert an `int3` in these cases to pacify the Windows unwinder.
I think this probably deserves its own standalone, Win64-only fixup pass
that runs after block placement. Implementing that will take some time,
so let's revert to TrapUnreachable in the mean time.
llvm-svn: 370829
Summary:
This removes all string constants for function names and compares
functions by string directly when needed. Many of these constants are
used only once or twice so the benefit of defining them separately is
not very clear, and this actually fixes a bug.
When we already have a `malloc` declaration which is an alias to
something else within the module,
```
@malloc = weak hidden alias i8* (i32), i8* (i32)* @dlmalloc
```
(this happens compiling with emscripten with `-s WASM_OBJECT_FILES=0`
because all bc files are merged before being fed into `wasm-ld` which
runs the backend optimizations as LTO)
`Module::getFunction("malloc")` in `canLongjmp` returns `nullptr`
because `Module::getFunction` dyncasts pointer into `Function`, but the
alias is a `GlobalValue` but not a `Function`. This makes `canLongjmp`
return false for `malloc` in this case, and we end up adding a lot of
longjmp handling code around malloc. This is not only a code size
increase but actually a bug because `malloc` is used in the entry block
when preparing for setjmp tables for emscripten sjlj handling, and this
makes initial setjmp preparation, which has to happen in the entry
block, move to another split block, and this interferes with SSA update
later.
This also adds two more functions, `getTempRet0` and `setTempRet0`, in
the list of not longjmp-able functions.
Fixes https://github.com/emscripten-core/emscripten/issues/8935.
Reviewers: sbc100
Subscribers: mehdi_amini, jgravelle-google, hiraditya, sunfish, dexonsmith, dschuff, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67129
llvm-svn: 370828
Add a mode in which profile read errors are not immediately treated as
fatal. In this mode, merging makes forward progress and reports failure
only if no inputs can be read.
Differential Revision: https://reviews.llvm.org/D66985
llvm-svn: 370827
The check needs to validate a counter offset before performing pointer
arithmetic with the (potentially corrupt) offset.
Found by UBSan's pointer overflow check.
rdar://54843625
Differential Revision: https://reviews.llvm.org/D66979
llvm-svn: 370826
When I dug into this, it turns out to be *much* more involved than I'd realized and doesn't actually simplify anything.
The general purpose of the leader table is that we want to find the most-dominating definition quickly. The problem for equivalance folding is slightly different; we want to find the most dominating *value* whose definition block dominates our use quickly.
To make this change, we'd end up having to restructure the leader table (either the sorting thereof, or maybe even introducing multiple leader tables per value) and that complexity is just not worth it.
llvm-svn: 370824
Now that we have the infrastructure to support s128 types as parameters
we can expand these to libcalls.
Differential Revision: https://reviews.llvm.org/D66185
llvm-svn: 370823