Should fix UBSan bot by also checking there's no "uwtable" attribute
before skipping. Otherwise the unwind table will be useless since its
moves expect CSRs to actually be preserved.
A noreturn nounwind function can be expected to never return in any way, and by
never returning it will also never have to restore any callee-saved registers
for its caller. This makes it possible to skip spills of those registers during
function entry, saving some stack space and time in the process. This is rather
useful for embedded targets with limited stack space.
Should fix PR9970.
Patch mostly by myeisha (pmb).
llvm-svn: 329494
Summary:
The LLVM SourceMgr class (which is used indirectly by Swift, though not Clang)
has a routine for looking up line numbers of SMLocs. This routine uses a
shared, special-purpose cache that handles exactly one access pattern
efficiently: looking up the line number of an SMLoc that points into the same
buffer as the last query made to the SourceMgr, at a location in the buffer at
or ahead of the last query.
When this works it's fine, but when it fails it's catastrophic for performancer:
one recent out-of-order access from a Swift utility routine ran for tens of
seconds, spending 99% of its time repeatedly scanning buffers for '\n'.
This change removes the shared cache from the SourceMgr and installs a new
cache in each SrcBuffer. The per-SrcBuffer caches are also "full", in the sense
that rather than caching a single last-query pointer, they cache _all_ the
line-ending offsets, in a binary-searchable array, such that once it's
populated (on first access), all subsequent access patterns run at the same
speed.
Performance measurements I've done show this is actually a little bit faster on
real codebases (though only a couple fractions of a percent). Memory usage is
up by a few tens to hundreds of bytes per SrcBuffer that has a line lookup done
on it; I've attempted to minimize this by using dynamic selection of integer
sized when storing offset arrays. But the main motive here is to
make-impossible the cases we don't always see, that show up by surprise when
there is an out-of-order access pattern.
Reviewers: jordan_rose
Reviewed By: jordan_rose
Subscribers: probinson, llvm-commits
Differential Revision: https://reviews.llvm.org/D45003
llvm-svn: 329470
Llvm-mc (and tools that use Path.inc on Windows) assume that strings are utf-8
encoded, however, this is not always the case. On Windows the default codepage
is not utf-8, so most of the time the strings are not utf-8 encoded.
The lld test 'format-binary-non-ascii' uses llvm-mc with a file with non-ascii
characters in the name which is how this bug was found. The test fails when run
using Python 3 because it uses properly encoded unicode strings (Python 2 actually
ends up using a byte string which is not utf-8 encoded, so the test passes, but
that's separate issue).
Patch by Stella Stamenova!
llvm-svn: 329468
Previously HalfTy was not handled which would either trigger an assertion,
or result in array initialized with garbage.
Differential Revision: https://reviews.llvm.org/D45391
llvm-svn: 329463
v2f16 is a special case in NVPTX. v4f16 may be loaded as a pair of v2f16
and that was not previously handled correctly by tryLDGLDU()
Differential Revision: https://reviews.llvm.org/D45339
llvm-svn: 329456
Summary:
This patch implements a tablegen-driven Instruction Compression
mechanism for generating RISCV compressed instructions
(C Extension) from the expanded instruction form.
This tablegen backend processes CompressPat declarations in a
td file and generates all the compile-time and runtime checks
required to validate the declarations, validate the input
operands and generate correct instructions.
The checks include validating register operands, immediate
operands, fixed register operands and fixed immediate operands.
Example:
class CompressPat<dag input, dag output> {
dag Input = input;
dag Output = output;
list<Predicate> Predicates = [];
}
let Predicates = [HasStdExtC] in {
def : CompressPat<(ADD GPRNoX0:$rs1, GPRNoX0:$rs1, GPRNoX0:$rs2),
(C_ADD GPRNoX0:$rs1, GPRNoX0:$rs2)>;
}
The result is an auto-generated header file
'RISCVGenCompressEmitter.inc' which exports two functions for
compressing/uncompressing MCInst instructions, plus
some helper functions:
bool compressInst(MCInst& OutInst, const MCInst &MI,
const MCSubtargetInfo &STI,
MCContext &Context);
bool uncompressInst(MCInst& OutInst, const MCInst &MI,
const MCRegisterInfo &MRI,
const MCSubtargetInfo &STI);
The clients that include this auto-generated header file and
invoke these functions can compress an instruction before emitting
it, in the target-specific ASM or ELF streamer, or can uncompress
an instruction before printing it, when the expanded instruction
format aliases is favored.
The following clients were added to implement compression\uncompression
for RISCV:
1) RISCVAsmParser::MatchAndEmitInstruction:
Inserted a call to compressInst() to compresses instructions
parsed by llvm-mc coming from an ASM input.
2) RISCVAsmPrinter::EmitInstruction:
Inserted a call to compressInst() to compress instructions that
were lowered from Machine Instructions (MachineInstr).
3) RVInstPrinter::printInst:
Inserted a call to uncompressInst() to print the expanded
version of the instruction instead of the compressed one (e.g,
add s0, s0, a5 instead of c.add s0, a5) when -riscv-no-aliases
is not passed.
This patch squashes D45119, D42780 and D41932. It was reviewed in smaller patches by
asb, efriedma, apazos and mgrang.
Reviewers: asb, efriedma, apazos, llvm-commits, sabuasal
Reviewed By: sabuasal
Subscribers: mgorny, eraman, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, niosHD, kito-cheng, shiva0217, zzheng
Differential Revision: https://reviews.llvm.org/D45385
llvm-svn: 329455
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.
To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.
Reviewers: stoklund, kparzysz, dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45144
llvm-svn: 329451
Summary:
The 'strong' StackProtector heuristic takes into consideration call instructions.
Certain intrinsics, such as lifetime.start, can cause the
StackProtector to protect functions that do not need to be protected.
Specifically, a volatile variable, (not optimized away), but belonging to a stack
allocation will encourage a llvm.lifetime.start to be inserted during
compilation. Because that intrinsic is a 'call' the strong StackProtector
will see that the alloca'd variable is being passed to a call instruction, and
insert a stack protector. In this case the intrinsic isn't really lowered to a
call. This can cause unnecessary stack checking, at the cost of additional
(wasted) CPU cycles.
In the future we should rely on TargetTransformInfo::isLoweredToCall, but as of
now that routine considers all intrinsics as not being lowerable. That needs
to be corrected, and such a change is on my list of things to get moving on.
As a side note, the updated stack-protector-dbginfo.ll test always seems to
pass. I never see the dbg.declare/dbg.value reaching the
StackProtector::HasAddressTaken, but I don't see any code excluding dbg
intrinsic calls either, so I think it's the safest thing to do.
Reviewers: void, timshen
Reviewed By: timshen
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45331
llvm-svn: 329450
The compiler is generating packet with the following instructions,
which causes an undefined register assert in the verifier.
$r0 = IMPLICIT_DEF
$r1 = IMPLICIT_DEF
S2_storerd_io killed $r29, 0, killed %d0
The problem is that the packetizer is not saving the IMPLICIT_DEF
instructions, which are needed when checking if it is legal to
add the store instruction. The fix is to add the IMPLICIT_DEF
instructions to the CurrentPacketMIs structure.
Patch by Brendon Cahoon.
llvm-svn: 329439
Packetizer keeps two zero-latency bound instrctions in the same packet ignoring
the stalls on the later instruction. This should not be the case if there is no
data dependence.
Patch by Sumanth Gundapaneni.
llvm-svn: 329437
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.
To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.
Reviewers: bogner, rnk, MatzeB, RKSimon
Reviewed By: rnk
Subscribers: JDevlieghere, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D45133
llvm-svn: 329435
Summary:
This patch removes InstRW overrides for basic arithmetic/logic instructions. To do this I've added the store address port to RMW. And used a WriteSequence to make the latency additive. It does not cover ADC/SBB because they have different latency.
Apparently we were inconsistent about whether the store has latency or not thus the test changes.
I've also left out Sandy Bridge because the load latency there is currently 4 cycles and should be 5.
Reviewers: RKSimon, andreadb
Reviewed By: andreadb
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45351
llvm-svn: 329416
Summary:
Fixing an issue where initializations of globals where constructors use
casts were silently translated to 0-initialization.
Reviewers: davidxl, evgeny777
Reviewed By: evgeny777
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45198
llvm-svn: 329409
Summary:
This patch add checks to verify that the information in the name index
entries is consistent with the debug_info section. Specifically, we
check that entries point to valid DIEs, and their names, tags, and
compile units match the information in the debug_info sections.
These checks are only run if the previous checks did not find any errors
in the name index headers. Attempting to proceed with the checks anyway
would likely produce a lot of spurious errors and the verification code
would need to be very careful to avoid crashing.
I also add a couple of more checks to the abbreviation-validation code
to verify that some attributes are always present (an index without a
DW_IDX_die_offset attribute is fairly useless).
The entry verification works only on indexes without any type units - I
haven't attempted to extend it to type units, as we don't even have a
DWARF v5-compatible type unit generator at the moment.
Reviewers: JDevlieghere, aprantl, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45323
llvm-svn: 329392
As mentioned on D44647, this patch increases the default memory latency to +5cy , which more closely matches what most custom cases are doing for reg-mem instructions.
I've bumped LoadLatency, ReadAfterLd and WriteLoad values to 5cy to be consistent.
As Sandy Bridge is currently our default generic model, this affects a lot of scheduling tests...
Differential Revision: https://reviews.llvm.org/D44654
llvm-svn: 329388
Inserting instrumentation between a musttail call and ret instruction
would create invalid IR. Instead, treat musttail calls as function
exits.
llvm-svn: 329385
MFI.LocalFrameSize was not serialized.
It is usually set from LocalStackSlotAllocation, so if that pass doesn't
run it is impossible do deduce it from the stack objects. Until now, this
information was lost.
llvm-svn: 329382
Summary:
The positions of the DwarfVersion and AddressSize arguments were
reversed, which caused parsing for dwarf opcodes which contained
address-size-dependent operands (such as DW_OP_addr). Amusingly enough,
none of the address-size asserts fired, as dwarf version was always 4,
which is a valid address size.
I ran into this when constructing weird inputs for the DWARF verifier. I
I add a test case as hand-written dwarf -- I am not sure how to trigger
this differently, as having a DW_OP_addr inside a location list is a
fairly non-standard thing to do.
Fixing this error exposed a bug in the debug_loc.dwo parser, which was
always being constructed with an address size of 0. I fix that as well
by following the pattern in the non-dwo parser of picking up the address
size from the first compile unit (which is technically not correct, but
probably good enough in practice).
Reviewers: JDevlieghere, aprantl, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45324
llvm-svn: 329381
VSX D-form load/store instructions of POWER9 require the offset be a multiple of 16 and a helper`isOffsetMultipleOf` is used to check this.
So far, the helper handles FrameIndex + offset case, but not handling FrameIndex without offset case. Due to this, we are missing opportunities to exploit D-form instructions when accessing an object or array allocated on stack.
For example, x-form store (stxvx) is used for int a[4] = {0}; instead of d-form store (stxv). For larger arrays, D-form instruction is not used when accessing the first 16-byte. Using D-form instructions reduces register pressure as well as instructions.
Differential Revision: https://reviews.llvm.org/D45079
llvm-svn: 329377
Summary:
- Add a missing getter for module-level inline assembly
- Add a missing append function for module-level inline assembly
- Deprecate LLVMSetModuleInlineAsm and replace it with LLVMSetModuleInlineAsm2 which takes an explicit length parameter
- Deprecate LLVMConstInlineAsm and replace it with LLVMGetInlineAsm, a function that allows passing a dialect and is not mis-classified as a constant operation
Reviewers: whitequark, deadalnix
Reviewed By: whitequark
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45346
llvm-svn: 329369