Summary:
The existing detection of a format member function has a couple of deficiencies:
- the member function does not get detected if one calls formatv with an lvalue,
because the template parameter gets deduced as T&, which fails the is_class
check.
- it also did not work if the function was called with a const variable because
the template parameter would get deduced as const T&, again failing the
is_class check.
This fixes the problem by stripping the references in the uses_format_member
template, to make sure the type is correctly detected as class. It also provides
specializations of the has_FormatMember template for const and non-const members
of the types in order to enable declaring the format member as a "const"
function. I have added tests that verify that formatv can be now called in these
scenarios. As some scenarios could not be verified at runtime (e.g. making sure
that calling a non-const format member on a const object does *not* compile), I
have also added some static_asserts which test the behaviour of the template
classes used internally by formatv().
Reviewers: zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D27525
llvm-svn: 289040
This allows clients to register an AsmCommentConsumer with the MCAsmLexer,
which receives a callback each time a comment is parsed.
Differential Revision: https://reviews.llvm.org/D27511
llvm-svn: 289036
Most importantly, we need to hash the relocation model, otherwise we can
end up trying to link non-PIC object files into PIEs or DSOs.
Differential Revision: https://reviews.llvm.org/D27556
llvm-svn: 289024
The relocations for `DIEEntry::EmitValue` were wrong for Win64
(emitting FK_Data_4 instead of FK_SecRel_4). This corrects that
oversight so that the DWARF data is correct in Win64 COFF files.
Fixes PR15393.
Patch by Jameson Nash <jameson@juliacomputing.com> based on a patch
by David Majnemer.
Differential Revision: https://reviews.llvm.org/D21731
llvm-svn: 289013
The only tests we have for the DWARF parser are the tests that use llvm-dwarfdump and expect output from textual dumps.
More DWARF parser modification are coming in the next few weeks and I wanted to add tests that can verify that we can encode and decode all form types, as well as test some other basic DWARF APIs where we ask DIE objects for their children and siblings.
DwarfGenerator.cpp was added in the lib/CodeGen directory. This file contains the code necessary to easily create DWARF for tests:
dwarfgen::Generator DG;
Triple Triple("x86_64--");
bool success = DG.init(Triple, Version);
if (!success)
return;
dwarfgen::CompileUnit &CU = DG.addCompileUnit();
dwarfgen::DIE CUDie = CU.getUnitDIE();
CUDie.addAttribute(DW_AT_name, DW_FORM_strp, "/tmp/main.c");
CUDie.addAttribute(DW_AT_language, DW_FORM_data2, DW_LANG_C);
dwarfgen::DIE SubprogramDie = CUDie.addChild(DW_TAG_subprogram);
SubprogramDie.addAttribute(DW_AT_name, DW_FORM_strp, "main");
SubprogramDie.addAttribute(DW_AT_low_pc, DW_FORM_addr, 0x1000U);
SubprogramDie.addAttribute(DW_AT_high_pc, DW_FORM_addr, 0x2000U);
dwarfgen::DIE IntDie = CUDie.addChild(DW_TAG_base_type);
IntDie.addAttribute(DW_AT_name, DW_FORM_strp, "int");
IntDie.addAttribute(DW_AT_encoding, DW_FORM_data1, DW_ATE_signed);
IntDie.addAttribute(DW_AT_byte_size, DW_FORM_data1, 4);
dwarfgen::DIE ArgcDie = SubprogramDie.addChild(DW_TAG_formal_parameter);
ArgcDie.addAttribute(DW_AT_name, DW_FORM_strp, "argc");
// ArgcDie.addAttribute(DW_AT_type, DW_FORM_ref4, IntDie);
ArgcDie.addAttribute(DW_AT_type, DW_FORM_ref_addr, IntDie);
StringRef FileBytes = DG.generate();
MemoryBufferRef FileBuffer(FileBytes, "dwarf");
auto Obj = object::ObjectFile::createObjectFile(FileBuffer);
EXPECT_TRUE((bool)Obj);
DWARFContextInMemory DwarfContext(*Obj.get());
This code is backed by the AsmPrinter code that emits DWARF for the actual compiler.
While adding unit tests it was discovered that DIEValue that used DIEEntry as their values had bugs where DW_FORM_ref1, DW_FORM_ref2, DW_FORM_ref8, and DW_FORM_ref_udata forms were not supported. These are all now supported. Added support for DW_FORM_string so we can emit inlined C strings.
Centralized the code to unique abbreviations into a new DIEAbbrevSet class and made both the dwarfgen::Generator and the llvm::DwarfFile classes use the new class.
Fixed comments in the llvm::DIE class so that the Offset is known to be the compile/type unit offset.
DIEInteger now supports more DW_FORM values.
There are also unit tests that cover:
Encoding and decoding all form types and values
Encoding and decoding all reference types (DW_FORM_ref1, DW_FORM_ref2, DW_FORM_ref4, DW_FORM_ref8, DW_FORM_ref_udata, DW_FORM_ref_addr) including cross compile unit references with that go forward one compile unit and backward on compile unit.
Differential Revision: https://reviews.llvm.org/D27326
llvm-svn: 289010
Replace @progbits in the section directive with %progbits, because "@" starts a comment on arm/thumb.
Use b.w branch instruction.
Use .thumb_function and .thumb_set for proper arm/thumb interwork. This way jumptable entry addresses on thumb have bit 0 set (correctly). This does not affect CFI check math, because the address of the jumptable start also has that bit set.
This does not work on thumbv5, because it does not support b.w, and the linker would not insert a veneer (trampoline?) to extend the range of b.n. We may need to do full-range plt-style jumptables on thumbv54, which are 12 bytes per entry. Another option is "push lr; bl; pop pc" (4 bytes) but that needs unwinding instructions, etc.
Differential Revision: https://reviews.llvm.org/D27499
llvm-svn: 289008
This abstracts the code for emitting DWARF binary from the DWARFYAML types into reusable interfaces that could be used by ELF and COFF.
llvm-svn: 288990
The fix committed in r288851 doesn't cover all the cases.
In particular, if we have an instruction with side effects
which has a no non-dbg use not depending on the bits, we still
perform RAUW destroying the dbg.value's first argument.
Prevent metadata from being replaced here to avoid the issue.
Differential Revision: https://reviews.llvm.org/D27534
llvm-svn: 288987
ConstantExpr instances were emitting code into the current block rather than
the entry block. This meant they didn't necessarily dominate all uses, which is
clearly wrong.
llvm-svn: 288985
Since DWARF formatting is agnostic to the object file it is stored in, it doesn't make sense for this to be in the MachOYAML implementation. Pulling it into its own namespace means we could modify the ELF and COFF YAML tools to emit DWARF as well.
In a follow-up patch I will better abstract this in obj2yaml and yaml2obj so that the DWARF bits in the tools can be re-used too.
llvm-svn: 288984
Having to ask the MIRBuilder for the current function is a little awkward, and
I'm intending to improve how that's threaded through anyway.
llvm-svn: 288983
MachineIRBuilder had weird before/after and beginning/end flags for the insert
point. Unfortunately the non-default means that instructions will be inserted
in reverse order which is almost never what anyone wants.
Really, I think we just want (like IRBuilder has) the ability to insert at any
C++ iterator-style point (i.e. before any instruction or before MBB.end()). So
this fixes MIRBuilders to behave like IRBuilders in this respect.
llvm-svn: 288980
If we don't skip over DEBUG_VALUEs, we get differences between -g and non-g
code.
This fixes PR31242.
Differential Revision: https://reviews.llvm.org/D27485
llvm-svn: 288965
The second operand of an "ri" instruction may be an immediate, but it may
also be a globalvariable, so we should make any assumptions.
This fixes PR31271.
Differential Revision: https://reviews.llvm.org/D27481
llvm-svn: 288964
The tests that already work are folded in InstSimplify, so those
tests should be redundant and we can remove them if they don't
seem worthwhile for completeness.
llvm-svn: 288957
This is now performed more generally by the target shuffle combine code.
Already covered by tests that were originally added in D7666/rL229480 to support combineVectorZext (or VectorZextCombine as it was known then....).
Differential Revision: https://reviews.llvm.org/D27510
llvm-svn: 288918
This patch attempts to scalarize the operand expressions of predicated
instructions if they were conditionally executed in the original loop. After
scalarization, the expressions will be sunk inside the blocks created for the
predicated instructions. The transformation essentially performs
un-if-conversion on the operands.
The cost model has been updated to determine if scalarization is profitable. It
compares the cost of a vectorized instruction, assuming it will be
if-converted, to the cost of the scalarized instruction, assuming that the
instructions corresponding to each vector lane will be sunk inside a predicated
block, possibly avoiding execution. If it's more profitable to scalarize the
entire expression tree feeding the predicated instruction, the expression will
be scalarized; otherwise, it will be vectorized. We only consider the cost of
the entire expression to accurately estimate the cost of the required
insertelement and extractelement instructions.
Differential Revision: https://reviews.llvm.org/D26083
llvm-svn: 288909
In the case of a fully redundant load LI dominated by an equivalent load V, GVN
should always preserve the original debug location of V. Otherwise, we risk to
introduce an incorrect stepping.
If V has debug info, then clearly it should not be modified. If V has a null
debugloc, then it is still potentially incorrect to propagate LI's debugloc
because LI may not post-dominate V.
Differential Revision: https://reviews.llvm.org/D27468
llvm-svn: 288903
We are being inconsistent with these instructions (and all their variants.....) with a random mix of them using the default float domain.
Differential Revision: https://reviews.llvm.org/D27419
llvm-svn: 288902
The non-constant pool version of DecodeVPERMIL2PMask was not offsetting correctly for the second input. I've updated the code to match the implementation in the constant-pool version.
Annoyingly this bug was hidden for so long as it's tricky to combine to useful variable shuffle masks that don't become constant-pool entries.
llvm-svn: 288898
When a function F is inlined, InlineFunction extends the debug location of every
instruction inlined from F by adding an InlinedAt.
However, if an instruction has a 'null' debug location, InlineFunction would
propagate the callsite debug location to it. This behavior existed since
revision 210459.
Revision 210459 was originally committed specifically to workaround the lack of
debug information for instructions inlined from intrinsic functions (which are
usually declared with attributes `__always_inline__, __nodebug__`).
The problem with revision 210459 is that it doesn't make any sort of distinction
between instructions inlined from a 'nodebug' function and instructions which
are inlined from a function built with debug info. This issue may lead to
incorrect stepping in the debugger.
This patch works under the assumption that a nodebug function does not have a
DISubprogram. When a function F is inlined into another function G,
InlineFunction checks if F has debug info associated with it.
For nodebug functions, the InlineFunction logic is unchanged (i.e. it would
still propagate the callsite debugloc to the inlined instructions). Otherwise,
InlineFunction no longer propagates the callsite debug location.
Differential Revision: https://reviews.llvm.org/D27462
llvm-svn: 288895
I believe this is the cause of the failure, but have not been able to confirm. Note that this is a speculative fix; I'm still waiting for a full build to finish as I synced and ended up doing a clean build which takes 20+ minutes on my machine.
llvm-svn: 288886
Requesting metadata for a global is a relatively expensive operation as it
involves a map lookup, but it's one that we need to do relatively frequently in
this pass to collect the list of type metadata nodes associated with a global.
This change improves the performance of type metadata queries by prebuilding
data structures that keep the global together with its list of type metadata,
and changing the pass to use that data structure wherever we were previously
passing global references around.
This change also eliminates some O(N^2) behavior by collecting the list of
globals associated with each type identifier during the first pass over the
list of globals rather than visiting each global to compute that list every
time we add a new type identifier.
Reduces pass runtime on a module containing Chrome's vtables from over 60s
to 0.9s.
Differential Revision: https://reviews.llvm.org/D27484
llvm-svn: 288859
On some platforms (like MSP430) the second element of the result
structure for SMULO/UMULO may have a shorter type than the one
returned by SetCC. We need to truncate it to the right type, or
else some incorrect code may be generated later on.
This fixes issue https://github.com/rust-lang/rust/issues/37829
Patch by Vadzim Dambrouski!
Differential Revision: https://reviews.llvm.org/D27154
llvm-svn: 288857
As Eli noted in the post-commit thread for r288833, the use of
swapOperands() may not be allowed in InstSimplify, so I'm
removing those calls here pending further review.
The swap mutates the icmp, and there doesn't appear to be precedent
for instruction mutation in InstSimplify.
I didn't actually have any tests for those cases, so I'm adding
a few here.
llvm-svn: 288855
BDCE has two phases:
1. It asks SimplifyDemandedBits if all the bits of an instruction are dead, and if so,
replaces all its uses with the constant zero.
2. Then, it asks SimplifyDemandedBits again if the instruction is really dead
(no side effects etc..) and if so, eliminates it.
Now, in 1) if all the bits of an instruction are dead, we may end up replacing a dbg use:
%call = tail call i32 (...) @g() #4, !dbg !15
tail call void @llvm.dbg.value(metadata i32 %call, i64 0, metadata !8, metadata !16), !dbg !17
->
%call = tail call i32 (...) @g() #4, !dbg !15
tail call void @llvm.dbg.value(metadata i32 0, i64 0, metadata !8, metadata !16), !dbg !17
but not eliminating the call because it may have arbitrary side effects.
In other words, we lose some debug informations.
This patch fixes the problem making sure that BDCE does nothing with the instruction if
it has side effects and no non-dbg uses.
Differential Revision: https://reviews.llvm.org/D27471
llvm-svn: 288851
Summary:
If we write an immediate to a VGPR and then copy the VGPR to an
SGPR, we can replace the copy with a S_MOV_B32 sgpr, imm, rather than
moving the copy to the SALU.
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D27272
llvm-svn: 288849
We were rounding size in bits down rather than up, leading to 0-sized slots for
i1 (assert!) and bugs for other types not byte-aligned.
llvm-svn: 288848
Summary:
Prefer expansions such as: pmullw,pmulhw,unpacklwd,unpackhwd over pmulld.
On Silvermont [source: Optimization Reference Manual]:
PMULLD has a throughput of 1/11 [instruction/cycles].
PMULHUW/PMULHW/PMULLW have a throughput of 1/2 [instruction/cycles].
Fixes pr31202.
Analysis of this issue was done by Fahana Aleen.
Reviewers: wmi, delena, mkuper
Subscribers: RKSimon, llvm-commits
Differential Revision: https://reviews.llvm.org/D27203
llvm-svn: 288844
Handle the case where a sign extension has ended up being split into separate stages (typically to get around vector legal ops) and a zext + sext_in_reg gets inserted.
Differential Revision: https://reviews.llvm.org/D27461
llvm-svn: 288842
All of these (and a few more) are already handled by InstCombine,
but we shouldn't have to wait until then to simplify these because
they're cheap to deal with here in InstSimplify.
This is the 'and' sibling of the earlier 'or' patch:
https://reviews.llvm.org/rL288833
llvm-svn: 288841
There were two problems:
+ AArch64 was reusing random data from its binary op tables, which is
complete nonsense for G_SEQUENCE.
+ Even when AArch64 gave up and said it couldn't handle G_SEQUENCE,
the generic code asserted.
llvm-svn: 288836
It'll almost immediately fail because it always tries to half/double the size
until it finds a legal one. Unfortunately, this triggers an assertion
preventing the DAG fallback from being possible.
llvm-svn: 288834
All of these (and a few more) are already handled by InstCombine,
but we shouldn't have to wait until then to simplify these because
they're cheap to deal with here in InstSimplify.
llvm-svn: 288833
These are OpenBSD specific program headers.
OpenBSD commit:
d39116912b
It is required for fixing PR31288.
Differential revision: https://reviews.llvm.org/D27456
llvm-svn: 288831
Summary:
This is NFC but prevents assertions when PartialMappingIdx is tablegen-erated.
The assumptions were:
1) FirstGPR is 0
2) FirstGPR is the first of the First* enumerators.
GPR32 is changed to 1 to demonstrate that assumption #1 is fixed. #2 will
be covered by a subsequent patch that tablegen-erates information and swaps
the order of GPR and FPR as a side effect.
Depends on D27336
Reviewers: ab, t.p.northover, qcolombet
Subscribers: aemerson, rengolin, vkalintiris, dberris, rovka, llvm-commits
Differential Revision: https://reviews.llvm.org/D27337
llvm-svn: 288812
When we see a non flag-setting instruction for which only the flag-setting
version is available in Thumb1, we should give a better error message than
"invalid instruction".
Differential Revision: https://reviews.llvm.org/D27414
llvm-svn: 288805
Check if a build_vector node includes a repeated constant pattern and replace it with a broadcast of that pattern.
For example:
"build_vector <0, 1, 2, 3, 0, 1, 2, 3>" would be replaced by "broadcast <0, 1, 2, 3>"
Differential Revision: https://reviews.llvm.org/D26802
llvm-svn: 288804
This is the final patch in the series of patches that improves
BUILD_VECTOR handling on PowerPC. This adds a few peephole optimizations
to remove redundant instructions. It also adds a large test case which
encompasses a large set of code patterns that build vectors - this test
case was the motivator for this series of patches.
Differential Revision: https://reviews.llvm.org/D26066
llvm-svn: 288800
a hilarious bug and fix it.
We somehow were never verifying the RefSCCs newly formed when
splitting an existing one apart, and when verifying them we weren't
really checking the SCC indices mapping effectively.
If we had been, it would have been blindingly obvious that right after
putting something int `RC.SCCs` we should update `RC.SCCIndices` instead
of `SCCIndices` which we were about to clear and rebuild anyways. =[
Anyways, this is thoroughly covered by existing tests now that we
actually verify things properly.
llvm-svn: 288795
Summary: This patch makes sure FirstCSPop and MBBI never point to DBG_VALUE instructions, which affected the code generated.
Reviewers: mkuper, aprantl, MatzeB
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D27343
llvm-svn: 288794
This pattern turned a vector sqrt/rcp/rsqrt operation of sse_load_f32/f64 into the the scalar instruction for the operation and put undef into the upper bits. For correctness, the resulting code should still perform the sqrt/rcp/rsqrt on the upper bits after the load is extended since that's what the operation asked for. Particularly in the case where the upper bits are 0, in that case we need calculate the sqrt/rcp/rsqrt of the zeroes and keep the result in the upper-bits. This implies we should be using the packed instruction still.
The only test case for this pattern is one I just added so there was no coverage of this.
llvm-svn: 288784
This occurs due to a pattern that uses sse_load_f32/f64 with vector sqrt/rcp/rsqrt operations and turns them into scalar instructions. Perhaps for the case were the upper bits come from undef this is ok. I believe a (vzmovl load64) would do the same thing but those seems to become vzload instead and selectScalarSSELoad doesn't handle that today. In that case we should be performing the vector operation on the zeros in the upper bits which is not equivalent to using a scalar instruction.
I will remove this pattern in a follow up patch. There appears to be no other test content for it.
llvm-svn: 288783
The intrinsics are supposed to pass the upper bits straight through to their output register. This means we need to make sure we still perform the 128-bit load to get those upper bits to pass to give to the instruction since the memory form of the instruction only reads 32 or 64 bits.
llvm-svn: 288781
The sqrtsd instruction only loads 64-bits and writes bits 63:0 with the sqrt result. Bits 127:64 are preserved in the destination register. The semantics of the intrinsic indicate bits 127:64 should come from the intrinsic argument which in this case is a 128-bit load. So the generated code should have a 128-bit load and use a register form of sqrtsd.
llvm-svn: 288780
The intrinsic takes one argument, the lower bits are affected by the operation and the upper bits should be passed through. The instruction itself takes two operands, the high bits of the first operand are passed through and the low bits of the second operand are modified by the operation. To match this to the intrinsic we should pass the single intrinsic input to both operands.
I had to remove the stack folding test for these instructions since they depended on the incorrect behavior. The same register is now used for both inputs so the load can't be folded.
llvm-svn: 288779
This patch adds the starting support for encoding data from the MachO __DWARF segment. The first section supported is the __debug_str section because it is the simplest.
llvm-svn: 288774
Summary:
This patch removes the scalar logical operation alias instructions. We can just use reg class copies and use the normal packed instructions instead. This removes the need for putting these instructions in the execution domain fixing tables as was done recently.
I removed the loadf64_128 and loadf32_128 patterns as DAG combine creates a narrower load for (extractelt (loadv4f32)) before we ever get to isel.
I plan to add similar patterns for AVX512DQ in a future commit to allow use of the larger register class when available.
Reviewers: spatel, delena, zvi, RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D27401
llvm-svn: 288771
It is kinda crazy to have llvm/include and llvm/lib/Target in the include path for every tablegen invocation for every tablegen-like tool.
This patch removes those flags from the tablgen function that is called everywhere by instead creating a variable LLVM_TABLEGEN_FLAGS which is setup in the LLVM source directories.
This removes TableGen.cmake's dependency on LLVM_MAIN_SRC_DIR, and LLVM_MAIN_INCLUDE_DIR.
llvm-svn: 288770
Integers are expressed in the lattice via constant ranges. They can never be represented by constants or not-constants; those are reserved for non-integer types. This code has been dead for literaly years.
llvm-svn: 288767
This completes a small series of patches to hide the stateful updates of LVILatticeVal from the consuming code. The only remaining stateful API is mergeIn.
llvm-svn: 288765
Summary:
We recently introduced a feature that enforce at link-time that the
LLVM headers used by a clients are matching the ABI setting of the
LLVM library linked to.
However for clients that are using only headers from ADT and promise
they won't call into LLVM, this is forcing to link libSupport. This
new flag is intended to provide a way to configure LLVM with this
promise for such client.
Reviewers: bob.wilson, compnerd
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D27432
llvm-svn: 288754
The structured CFG is just an aid to inserting exec
mask modification instructions, once that is done
we don't really need it anymore. We also
do not analyze blocks with terminators that
modify exec, so this should only be impacting
true branches.
llvm-svn: 288744
Summary:
r288722 introduced a build break due some code that should
not have been part of the commit. This change removes the offending
code.
Reviewers: davide, ruiu
Differential Revision: https://reviews.llvm.org/D27435
llvm-svn: 288742
clang -target arm deprecated-asm.s -c
deprecated-asm.s:30:9: warning: use of SP or PC in the list is deprecated
stmia r4!, {r12-r14}
We have to have an option what can disable it.
Patched by Yin Ma!
Reviewers: joey, echristo, weimingz
Subscribers: llvm-commits, aemerson
Differential Revision: https://reviews.llvm.org/D27219
llvm-svn: 288734
The function used to finish off PHIs by adding the relevant basic blocks can
fail if we're aborting and still don't actually have the needed
MachineBasicBlocks. So avoid trying in that case.
llvm-svn: 288727
There are two cases handled here:
1) a branch on undef
2) a switch with an undef condition.
Both cases are currently handled by ResolvedUndefsIn. If we have
a branch on undef, we force its value to false (which is trivially
foldable). If we have a switch on undef, we force to the first
constant (which is also foldable).
llvm-svn: 288725
Summary: The code we use to read PDBs assumed that streams we ask it to read exist, and would read memory outside a vector and crash if this wasn't the case. This would, for example, cause llvm-pdbdump to crash on PDBs generated by lld. This patch handles such cases more gracefully: the PDB reading code in LLVM now reports errors when asked to get a stream that is not present, and llvm-pdbdump will report missing streams and continue processing streams that are present.
Reviewers: ruiu, zturner
Subscribers: thakis, amccarth
Differential Revision: https://reviews.llvm.org/D27325
llvm-svn: 288722
When the entry block was empty after arg lowering, we were always placing
constants at the end. This is probably hamrless while translating the same
block, but horribly wrong once its terminator has been translated. So switch to
inserting at the beginning.
llvm-svn: 288720