This adds handling for two cases:
1. A scalable vector where the element type is promoted.
2. A scalable vector where the element count is odd (or more generally,
not divisble by the element count of the part type).
(Some element types still don't work; for example, <vscale x 2 x i128>,
or <vscale x 2 x fp128>.)
Differential Revision: https://reviews.llvm.org/D105591
The declaration for the global new function in C++ is generated in the compiler front-end. When examining exception propagation, we found that this is the largest root throw site propagator requiring unwind code to be generated for callers up the stack. Allowing this to be handled immediately with termination stops upward propagation and leads to significantly less landing pads generated. This in turns leads to a performance and .text size win.
With `-fnew-infallible` this annotates the declaration with `throw()` and `__attribute__((returns_nonnull))`. `throw()` allows the compiler to assume exceptions do not propagate out of new and eliminate it as a root throw site. Note that the definition of global new is user-replaceable so users should ensure that the one used follows these semantics.
Measuring internally, we're seeing at 0.5% CPU win in one of our large internal FB workload. Measuring on clang self-build (cd0a1226b5) we get:
thinlto/
"dwarfehprepare.NumCleanupLandingPadsRemaining": 153494,
"dwarfehprepare.NumNoUnwind": 26309,
thinlto_newinfallible/
"dwarfehprepare.NumCleanupLandingPadsRemaining": 143660,
"dwarfehprepare.NumNoUnwind": 28744,
a 1-143660/153494 = 6.4% reduction in landing pads and a 28744/26309 = 9.3% increase in the number of nounwind functions.
Testing:
ninja check-all
new test case to make sure these attributes are added correctly to global new.
Reviewed By: urnathan
Differential Revision: https://reviews.llvm.org/D105225
... and rename it to 'warnIfNonZero' to better-reflect what it actually
does.
The goal is to minimize the amount of logic that's conditionally
compiled under '#if __APPLE__'.
There was a slightly mismatch between the double COO and actual numerical
type in the final sparse tensor storage (due to external formats always
using double). This minor revision removes that inconsistency by using a
properly typed COO and casting during the "add" method instead. This also
prepares alternative ways of initializing the COO object.
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D107310
Add more checks, info on -fno-sanitize=..., and reference to 5/2021 UBSan Oracle blog.
Authored By: DianeMeirowitz
Reviewed By: hctim
Differential Revision: https://reviews.llvm.org/D106908
I'm not sure this is the best way to approach this,
but the situation is rather not very detectable unless we explicitly call it out when refusing to advise to unroll.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D107271
When a value is expected to be extended, we should emit an extended load rather
than a normal G_LOAD.
Add checklines to arm64-abi.ll which show that we now emit the correct loads.
For ease of comparison: https://godbolt.org/z/8WvY6EfdE
Differential Revision: https://reviews.llvm.org/D107313
ValueTracking should allow for value ranges that may satisfy
llvm.assume, instead of restricting the ranges only to values that
will always satisfy the condition.
Differential Revision: https://reviews.llvm.org/D107298
This aligns the multiple exit costing with all the other cost decisions. Note that UnrollAndJam, which is the only other caller of the original home of this code, unconditionally bails out of multiple exit loops.
The new -mtargetos= option is a replacement for the existing, OS-specific options
like -miphoneos-version-min=. This allows us to introduce support for new darwin OSes
easier as they won't require the use of a new option. The older options will be
deprecated and the use of the new option will be encouraged instead.
Differential Revision: https://reviews.llvm.org/D106316
In some environments this test could fail if start.S has its own DWARF
CompileUnit or similar are included before the DWARF CompileUnit for the
file.
This change makes the test independent of the index of the compile unit,
instead checking the filename.
Reviewed By: herhut, jankratochvil
Differential Revision: https://reviews.llvm.org/D107300
The .cpp file uses SIGNAL_POLLING_UNSUPPORTED to guard the call
to sigaction, so use it in the .h file too. (LLVM also calls
sigaction without a guard on non-Windows.)
No behavior change.
Differential Revision: https://reviews.llvm.org/D107255
The option to not preserve LCSSA is in fact not tested at all in upstream. I was tempted to just remove the code entirely, but realized I didn't need to for my actual goal.
Dummy procedures can be defined as subprograms with explicit
interfaces, e.g.
subroutine subr(dummy)
interface
subroutine dummy(x)
real :: x
end subroutine
end interface
! ...
end subroutine
but the symbol table had no means of marking such symbols as dummy
arguments, so predicates like IsDummy(dummy) would fail. Add an
isDummy_ flag to SubprogramNameDetails, analogous to the corresponding
flag in EntityDetails, and set/test it as needed.
Differential Revision: https://reviews.llvm.org/D106697
When we build with split dwarf in single mode the .o files that contain both "normal" debug sections and dwo sections, along with relocaiton sections for "normal" debug sections.
When we create DWARF context in DWARFObjInMemory we process relocations and store them in the map for .debug_info, etc section.
For DWO Context we also do it for non dwo dwarf sections. Which I believe is not necessary. This leads to a lot of memory being wasted. We observed 70GB extra memory being used.
I went with context sensitive approach, flag is passed in. I am not sure if it's always safe not to process relocations for regular debug sections if Obj contains .dwo sections.
If it is alternatvie might be just to scan, in constructor, sections and if there are .dwo sections not to process regular debug ones.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D106624
Add new pass LowerRefTypesIntPtrConv to generate trap
instruction for an inttoptr and ptrtoint of a reference type instead
of erroring, since calling these instructions on non-integral pointers
has been since allowed (see ac81cb7e6).
Differential Revision: https://reviews.llvm.org/D107102
This patch updates VPInterleaveRecipe::print to print the actual defined
VPValues for load groups and the store VPValue operands for store
groups.
The IR references may become outdated while transforming the VPlan and
the defined and stored VPValues always are up-to-date.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D107223
We had a [bad bug](69655864ee) over in CIRCT
caused by accidentally passing around PatternRewriter
by value. There is no reason to support copy/assignment
of the pattern rewriter, so disable it.
Differential Revision: https://reviews.llvm.org/D107232