This patch moves the construction of the default backend from llvm-mca.cpp and
into mca::Context. The Context class is responsible for holding ownership of
the simulated hardware components. These components are subclasses of
HardwareUnit. Right now the HardwareUnit is pretty bare-bones, but eventually
we might want to add some common functionality across all hardware components,
such as isReady() or something similar.
I have a feeling this patch will probably need some updates, but it's a start.
One thing I am not particularly fond of is the rather large interface for
createDefaultPipeline. That convenience routine takes a rather large set of
inputs from the llvm-mca driver, where many of those inputs are generated via
command line options.
One item I think we might want to change is the separating of ownership of
hardware components (owned by the context) and the pipeline (which owns
Stages). In short, a Pipeline owns Stages, a Context (currently) owns hardware.
The Pipeline's Stages make use of the components, and thus there is a lifetime
dependency generated. The components must outlive the pipeline. We could solve
this by having the Context also own the Pipeline, and not return a
unique_ptr<Pipeline>. Now that I think about it, I like that idea more.
Differential Revision: https://reviews.llvm.org/D48691
llvm-svn: 336456
This diff adds support for handling static libraries
to llvm-objcopy and llvm-strip.
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D48413
llvm-svn: 336455
Suppress the diagnostic for mis-sized dbg.values when a value operand is
narrower than the unsigned variable it describes. Assume that a debugger
would implicitly zero-extend these values.
llvm-svn: 336452
The replaceAllDbgUsesWith utility helps passes preserve debug info when
replacing one value with another.
This improves upon the existing insertReplacementDbgValues API by:
- Updating debug intrinsics in-place, while preventing use-before-def of
the replacement value.
- Falling back to salvageDebugInfo when a replacement can't be made.
- Moving the responsibiliy for rewriting llvm.dbg.* DIExpressions into
common utility code.
Along with the API change, this teaches replaceAllDbgUsesWith how to
create DIExpressions for three basic integer and pointer conversions:
- The no-op conversion. Applies when the values have the same width, or
have bit-for-bit compatible pointer representations.
- Truncation. Applies when the new value is wider than the old one.
- Zero/sign extension. Applies when the new value is narrower than the
old one.
Testing:
- check-llvm, check-clang, a stage2 `-g -O3` build of clang,
regression/unit testing.
- This resolves a number of mis-sized dbg.value diagnostics from
Debugify.
Differential Revision: https://reviews.llvm.org/D48676
llvm-svn: 336451
As discussed in D48987 and D48893, there are many different
ways to go wrong depending on the binop (and as shown here
we already do go wrong in some cases).
llvm-svn: 336450
Summary:
Namely, set the abort message, and allow to write the message to syslog if the
option is enabled.
Reviewers: alekseyshl
Reviewed By: alekseyshl
Subscribers: delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D48902
llvm-svn: 336445
The enhanced version will be used in D48893 and related patches
and an almost identical (fadd is different) version is proposed
in D28907, so adding this as a preliminary step.
llvm-svn: 336444
Added statistics for the number of SMLAD instructions created, and
als renamed the pass name to -arm-parallel-dsp.
Differential Revision: https://reviews.llvm.org/D48971
llvm-svn: 336441
LoopBlockNumber is a DenseMap<BasicBlock*, int>, comparing the result of
find() will compare a pair<BasicBlock*, int>. That's of course depending
on pointer ordering which varies from run to run. Reverse iteration
doesn't find this because we're copying to a vector first.
This bug has been there since 2016 but only recently showed up on clang
selfhost with FDO and ThinLTO, which is also why I didn't manage to get
a reasonable test case for this. Add an assert that would've caught
this.
llvm-svn: 336439
When emitting a CU, store the MCSymbol pointing to the beginning of the
CU. We'll need this information later when emitting the .debug_names
section (DWARF5 accelerator table).
llvm-svn: 336433
This patches adds support for passing -mcpu=native for AArch64. It will
get turned into the host CPU name, before we get the target features.
CPU = native is handled in a similar fashion in
getAArch64MicroArchFetauresFromMtune and getAArch64TargetCPU already.
Having a good test case for this is hard, as it depends on the host CPU
of the machine running the test. But we can check that native has been
replaced with something else.
When cross-compiling, we will get a CPU name from the host architecture
and get ` the clang compiler does not support '-mcpu=native'` as error
message, which seems reasonable to me.
Reviewers: rengolin, peter.smith, dlj, javed.absar, t.p.northover
Reviewed By: peter.smith
Tags: #clang
Differential Revision: https://reviews.llvm.org/D48931
llvm-svn: 336429
Summary:
The method only takes PPreprocessor and don't require structures that
might not be available (e.g. Sema and ASTContext) when CodeCompletionString
needs to be generated for macros.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D48973
llvm-svn: 336427
D48278
Allow to reduce redundant shift masks.
For example:
x1 = x & 0xAB00
x2 = (x >> 8) & 0xAB
can be reduced to:
x1 = x & 0xAB00
x2 = x1 >> 8
It only allows folding when the masks and shift values are constants.
llvm-svn: 336426
This is a maintenance update. Besides many minor changes it ships two
functions "isl_*_list_size" and "isl_*_list_get_at" which will allow us
to simplify the iterator implementation in Polly.
llvm-svn: 336425
They were failing in Chromium's packaging builds with:
C:\b\rr\tmphqfaff\w\src\third_party\llvm\tools\lld\test\COFF\pdb-globals-dia-vfunc-collision2.test:24:8:
error: expected string not found in input
CHECK: func [0x00001060+ 0 - 0x0000106c-12 | sizeof= 12] (FPO) virtual int __cdecl A132()
^
<stdin>:8:11: note: scanning from here
struct S [sizeof = 8] {
^
<stdin>:9:2: note: possible intended match here
func [0x00001060+ 0 - 0x0000106c-12 | sizeof= 12] (FPO) virtual int __cdecl S::A132()
^
Maybe due to different DIA versions.
llvm-svn: 336424
This patch modifies the Scheduler heuristic used to select the next instruction
to issue to the pipelines.
The motivating example is test X86/BtVer2/add-sequence.s, for which llvm-mca
wrongly reported an estimated IPC of 1.50. According to perf, the actual IPC for
that test should have been ~2.00.
It turns out that an IPC of 2.00 for test add-sequence.s cannot possibly be
predicted by a Scheduler that only prioritizes instructions based on their
"age". A similar issue also affected test X86/BtVer2/dependent-pmuld-paddd.s,
for which llvm-mca wrongly estimated an IPC of 0.84 instead of an IPC of 1.00.
Instructions in the ReadyQueue are now ranked based on two factors:
- The "age" of an instruction.
- The number of unique users of writes associated with an instruction.
The new logic still prioritizes older instructions over younger instructions to
minimize the pressure on the reorder buffer. However, the number of users of an
instruction now also affects the overall rank. This potentially increases the
ability of the Scheduler to extract instruction level parallelism. This patch
fixes the problem with the wrong IPC reported for test add-sequence.s and test
dependent-pmuld-paddd.s.
llvm-svn: 336420
Previously we only iterated over functions reachable from the set of
external functions in the module. But since some of the passes under
this (notably the always-inliner and coroutine lowerer) are required for
correctness, they need to run over everything.
This just adds an extra layer of iteration over the CallGraph to keep
track of which functions we've already visited and get the next batch of
SCCs.
Should fix PR38029.
llvm-svn: 336419
The intrinsics can be implemented with a f32/f64 llvm.fma intrinsic and an insert into a zero vector.
There are a couple regressions here due to SelectionDAG not being able to pull an fneg through an extract_vector_elt. I'm not super worried about this though as InstCombine should be able to do it before we get to SelectionDAG.
llvm-svn: 336416
A Chromium developer reported a bug which turned out to be a mangling
collision between these two literals:
char s[] = "foo";
char t[32] = "foo";
They may look the same, but for the initialization of t we will (under
some circumstances) use a literal that's extended with zeros, and
both the length and those zeros should be accounted for by the mangling.
This actually makes the mangling code simpler: where it previously had
special logic for null terminators, which are not part of the
StringLiteral, that is now covered by the general algorithm.
(The problem was reported at https://crbug.com/857442)
Differential Revision: https://reviews.llvm.org/D48928
llvm-svn: 336415
Remove support for linking microMIPS 64-bit code because this kind of
ISA is rarely used and unsupported by LLVM.
Differential revision: https://reviews.llvm.org/D48949
llvm-svn: 336413
Summary:
Error's new operator<< is the first way to print an error without consuming it.
formatv() can now print objects with an operator<< that works with raw_ostream.
Reviewers: bkramer
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D48966
llvm-svn: 336412
Summary:
Add support for two additional ObjC image info flags: `IS_SIMULATED` and
`HAS_CATEGORY_CLASS_PROPERTIES`.
`IS_SIMULATED` indicates a Mach-O binary built for iOS simulator.
`HAS_CATEGORY_CLASS_PROPERTIES` indicates a Mach-O binary built by a compiler
that supports class properties in categories.
Reviewers: enderby, compnerd
Reviewed By: compnerd
Subscribers: keith, llvm-commits
Differential Revision: https://reviews.llvm.org/D48568
llvm-svn: 336411
This upgrades all of the intrinsics to use fneg instructions to convert fma into fmsub/fnmsub/fnmadd/fmsubadd. And uses a select instruction for masking.
This matches how clang uses the intrinsics these days.
llvm-svn: 336409
-Split cases that call 2 intrinsics in the same case.
-Remove testing mask3 and maskz intrinsics with an all ones mask. These won't be interesting after the upgrade.
-Restore test cases for some intrinsics that are marked for deletion, but haven't been deleted yet.
llvm-svn: 336408
We add an option to dump the entire global / public symbol record
stream. Previously we would dump globals or publics, but not both.
And when we did dump them, we would always dump them in the order
they were referenced by the corresponding hash streams, not in
the order they were serialized in. This patch adds a lower level
mode that just dumps the whole stream in serialization order.
Additionally, when dumping global-extras, we now dump the hash
bitmap as well as the record offset instead of dumping all zeros
for the offsets.
llvm-svn: 336407