Function::lookupIntrinsicID is somewhat forgiving as it comes to
overloaded intrinsics' names: it returns an ID as soon as the name
provided has a prefix that matches a registered intrinsic's name w/o
actually checking that the rest of the name encodes all the concrete arg
types, let alone that those types are compatible with the intrinsic's
definition.
That's probably fine and comes in handy in MIR serialization: we don't
care about IR types at MIR level and every intrinsic should be
selectable based on its ID and low-level types (LLTs) of its operands,
including the overloaded ones, so there is no point in serializing
mangled IR types as part of the intrinsic's name.
However, lookupIntrinsicID is somewhat inconsistent in its forgiveness:
if the name provided is actually an exact match, it will refuse to
return the ID if the intrinsic is overloaded. There is probably no
real reason for that and it renders MIRParser incapable to deserialize
MIR MIRPrinter serialized.
This commit fixes it.
Reviewers: rnk, aditya_nandakumar, qcolombet, thegameg, dsanders,
marcello.maggioni
Reviewed By: bogner
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D43267
llvm-svn: 326387
When the function has musttail call - its cc is fixed to be equal to the
cc of the musttail callee. In such case (and in the case of the musttail
callee), GlobalOpt should not change the cc to fastcc as it will break
the invariant.
This fixes PR36546
Patch by: Fedor Indutny (indutny)
Differential revision: https://reviews.llvm.org/D43859
llvm-svn: 326376
Summary:
For use by LLPC SPV_AMD_shader_ballot extension.
The v_writelane instruction was already implemented for use by SGPR
spilling, but I had to add an extra dummy operand tied to the
destination, to represent that all lanes except the selected one keep
the old value of the destination register.
.ll test changes were due to schedule changes caused by that new
operand.
Differential Revision: https://reviews.llvm.org/D42838
llvm-svn: 326353
The API verification tool tapi has difficulty processing frameworks
which enable code coverage, but which have no code. The profile lowering
pass does not emit the runtime hook in this case because no counters are
lowered.
While the hook is not needed for program correctness (the profile
runtime doesn't have to be linked in), it's needed to allow tapi to
validate the exported symbol set of instrumented binaries.
It was not possible to add a workaround in tapi for empty binaries due
to an architectural issue: tapi generates its expected symbol set before
it inspects a binary. Changing that model has a higher cost than simply
forcing llvm to always emit the runtime hook.
rdar://36076904
Differential Revision: https://reviews.llvm.org/D43794
llvm-svn: 326350
FailedISel MachineFunction property is part of the CodeGen pipeline
state as much as every other property, notably, Legalized,
RegBankSelected, and Selected. Let's make that part of the state also
serializable / de-serializable, so if GlobalISel aborts on some of the
functions of a large module, but not the others, it could be easily seen
and the state of the pipeline could be maintained through llc's
invocations with -stop-after / -start-after.
To make MIR printable and generally to not to break it too much too
soon, this patch also defers cleaning up the vreg -> LLT map until
ResetMachineFunctionPass.
To make MIR with FailedISel: true also machine verifiable, machine
verifier is changed so it treats a MIR-module as non-regbankselected and
non-selected if there is FailedISel property set.
Reviewers: qcolombet, ab
Reviewed By: dsanders
Subscribers: javed.absar, rovka, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D42877
llvm-svn: 326343
Emulated TLS is enabled by llc flag -emulated-tls,
which is passed by clang driver.
When llc is called explicitly or from other drivers like LTO,
missing -emulated-tls flag would generate wrong TLS code for targets
that supports only this mode.
Now use useEmulatedTLS() instead of Options.EmulatedTLS to decide whether
emulated TLS code should be generated.
Unit tests are modified to run with and without the -emulated-tls flag.
Differential Revision: https://reviews.llvm.org/D42999
llvm-svn: 326341
Summary:
Expressions of the form x < 0 ? 0 : x; and x < -1 ? -1 : x can be lowered using bit-operations instead of branching or conditional moves
In thumb-mode this results in a two-instruction sequence, a shift followed by a bic or or while in ARM/thumb2 mode that has flexible second operand the shift can be folded into a single bic/or instructions. In most cases this results in smaller code and possibly less branches, and in no case larger than before.
Patch by Martin Svanfeldt
Reviewers: fhahn, pbarrio, rogfer01
Reviewed By: pbarrio, rogfer01
Subscribers: chrib, yroux, eugenis, efriedma, rogfer01, aemerson, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D42574
llvm-svn: 326333
Summary:
Some targets does not support labels inside debug sections, but support
references in form `section +|- offset`. Patch adds initial support
for this. Also, this patch disables emission of all additional debug
sections that may have labels inside of it (like pub sections and
string tables).
Reviewers: probinson, echristo
Subscribers: JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D43627
llvm-svn: 326328
Summary:
Fix a bug in MergeICmp that can lead to a BCECmp block being processed more than once and eventually lead to a broken LLVM module.
The problem is that if the non-constant value is not produced by the last block, the producer will be processed once when the its parent block
is processed and second time when the last block is processed.
We end up having 2 same BCECmpBlock in the merge queue. And eventually lead to a broken LLVM module.
Reviewers: courbet, davide
Reviewed By: courbet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43825
llvm-svn: 326318
An extract_element where the result type is larger than the scalar element type is semantically an any_extend of from the scalar element type to the result type. If we expect zeroes in the upper bits of the i8/i32 we need to mae sure those zeroes are explicit in the DAG.
For these cases the best way to accomplish this is use an insert_subvector to pad zeroes to the upper bits of the v1i1 first. We extend to either v16i1(for i32) or v8i1(for i8). Then bitcast that to a scalar and finish with a zero_extend up to i32 if necessary. We can't extend past v16i1 because that's the largest mask size on KNL. But isel is smarter enough to know that a zext of a bitcast from v16i1 to i16 can use a KMOVW instruction. The insert_subvectors will be dropped during isel because we can determine that the producing instruction already zeroed the upper bits of the k-register.
llvm-svn: 326308
Neither the linker nor the runtime need this information
anymore. We were originally using this to model BSS size
but the plan is now to use the segment metadata to allow
for BSS segments.
Differential Revision: https://reviews.llvm.org/D41366
llvm-svn: 326267
Absence of memory operands is treated as "aliasing everything", so
dropping them is sufficient.
Recommit r326256 with a fixed testcase.
llvm-svn: 326262
Qualifiers on a pointer or reference type may apply to either the
pointee or the pointer itself. Consider 'const char *' and 'char *
const'. In the first example, the pointee data may not be modified
without casts, and in the second example, the pointer may not be updated
to point to new data.
In the general case, qualifiers are applied to types with LF_MODIFIER
records, which support the usual const and volatile qualifiers as well
as the __unaligned extension qualifier.
However, LF_POINTER records, which are used for pointers, references,
and member pointers, have flags for qualifiers applying to the
*pointer*. In fact, this is the only way to represent the restrict
qualifier, which can only apply to pointers, and cannot qualify regular
data types.
This patch causes LLVM to correctly fold 'const' and 'volatile' pointer
qualifiers into the pointer record, as well as adding support for
'__restrict' qualifiers in the same place.
Based on a patch from Aaron Smith
Differential Revision: https://reviews.llvm.org/D43060
llvm-svn: 326260
When attempting to compile the following Objective-C++ code with
CodeView debug info:
void (^b)(void) = []() {};
The generated debug metadata contains a structure like the following:
!43 = !DICompositeType(tag: DW_TAG_structure_type, name: "__block_literal_1", scope: !6, file: !6, line: 1, size: 168, elements: !44)
!44 = !{!45, !46, !47, !48, !49, !52}
...
!52 = !DIDerivedType(tag: DW_TAG_member, scope: !6, file: !6, line: 1, baseType: !53, size: 8, offset: 160, flags: DIFlagPublic)
!53 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !54)
!54 = !DICompositeType(tag: DW_TAG_class_type, file: !6, line: 1, flags: DIFlagFwdDecl)
Note that the member node (!52) is unnamed, but rather than pointing to
a DICompositeType directly, it points to a DIDerivedType with tag
DW_TAG_const_type, which then points to the DICompositeType. However,
the CodeView assembly printer currently assumes that the base type for
an unnamed member will always be a DICompositeType, and attempts to
perform that cast, which triggers an assertion failure, since in this
case the base type is actually a DIDerivedType, not a DICompositeType
(and we would have to get the base type of the DIDerivedType to reach
the DICompositeType). I think the debug metadata being generated by the
frontend is correct (or at least plausible), and the CodeView printer
needs to handle this case.
This patch teaches the CodeView printer to unwrap any qualifier types.
The qualifiers are just dropped for now. Ideally, they would be applied
to the added indirect members instead, but this occurs infrequently
enough that adding the logic to handle the qualifiers correctly isn't
worth it for now. A FIXME is added to note this.
Additionally, Reid pointed out that the underlying assumption that an
unnamed member must be a composite type is itself incorrect and may not
hold for all frontends. Therefore, after all qualifiers have been
stripped, check if the resulting type is in fact a DICompositeType and
just return if it isn't, rather than assuming the type and crashing if
that assumption is violated.
Differential Revision: https://reviews.llvm.org/D43803
llvm-svn: 326255
This is similar to what's done in computeKnownBits and computeSignBits. Don't do anything fancy just collect information valid for any element.
Differential Revision: https://reviews.llvm.org/D43789
llvm-svn: 326237
We were always setting the block alignment to 2 bytes in Thumb mode
and 4-bytes in ARM mode (r325754, and r325012), but this could cause
reducing the block alignment when it already had been aligned (e.g.
in Thumb mode when the block is a CPE that was already 4-byte aligned).
Patch by Momchil Velikov, I've only added a test.
Differential Revision: https://reviews.llvm.org/D43777
llvm-svn: 326232
Following DW_AT_sibling attributes completely defeats the pruning pass.
Although clang doesn't generate the DW_AT_sibling attribute we should
still handle it correctly.
Differential revision: https://reviews.llvm.org/D43439
llvm-svn: 326231
This is a slight reduction of one of the benchmarks
that suffered with D43079. Cost model changes should
not cause this test to remain scalarized.
llvm-svn: 326221
This is a slight reduction of one of the benchmarks
that suffered with D43079. Cost model changes should
not cause this test to remain scalarized.
llvm-svn: 326217
Re-enable commit r323991 now that r325931 has been committed to make
MachineOperand::isRenamable() check more conservative w.r.t. code
changes and opt-in on a per-target basis.
llvm-svn: 326208
Summary:
Since r325479 the DataLayout includes a program address space. However, it
is not possible to use `call %foo` if foo is a `i8(...) addrspace(200)` and
the DataLayout specifies address space 200 as the address space for functions.
With this change the IR parser will still accept variables in the program
address space as well as address space 0 for call and invoke functions.
Reviewers: pcc, arsenm, bjope, dylanmckay, theraven
Reviewed By: dylanmckay
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D43645
llvm-svn: 326188
Add test that verifies that we don't follow DWARF values with a
reference form class, such as DW_AT_sibling.
Since clang doesn't generate the latter attribute, we added a PowerPC
test generated on an old PowerBook G4. (Thanks Adrian!)
llvm-svn: 326183
In case we update a ValuePHI node created earlier, we could update it
based on a different OpPHI which could be in a different block.
We need to update the TempToBlock mapping reflecting the new block,
otherwise we would end up placing the new phi node in a wrong block.
This problem is exposed by the test case in
https://bugs.llvm.org/show_bug.cgi?id=36504.
This patch fixes a slightly simpler problem than in the bug report. In
the bug's re-producer, the additional problem is that we are re-using a
ValuePHI node with to few incoming values for the new OpPHI. If this
patch makes sense, I will follow it up with a patch that creates a new
PHI node if the existing PHI node has a different number of incoming
values.
Reviewers: davide, dberlin
Reviewed By: dberlin
Differential Revision: https://reviews.llvm.org/D43770
llvm-svn: 326181
Agner's tables indicate that for SSE42+ targets (Core2 and later) we can reduce the FADD/FSUB/FMUL costs down to 1, which should fix the Himeno benchmark.
Note: the AVX512 FDIV costs look rather dodgy, but this isn't part of this patch.
Differential Revision: https://reviews.llvm.org/D43733
llvm-svn: 326133
There's still some shortcoming in our ability to combine binops of constants with different sizes separated by an extend. I'll try to look at that next.
llvm-svn: 326128