The verifier identified several modules that were broken due to incorrect
linkage on declarations. To fix this, CompileOnDemandLayer2::extractFunction
has been updated to change decls to external linkage.
llvm-svn: 336150
Different CodeBlocks don't overlap. The same MCInst cannot appear in more than
one code block because all blocks are instantiated before the simulation is run.
We should always clear the content of map VariantDescriptors before every
simulation, since VariantDescriptors cannot possibly store useful information
for the next blocks. It is also "safer" to clear its content because `MCInst*`
is used as the key type for map VariantDescriptors.
llvm-svn: 336142
On darwin, all virtual sections have zerofill type, and having a
.zerofill directive in a non-virtual section is not allowed. Instead of
asserting, show a nicer error.
In order to use the equivalent of .zerofill in a non-virtual section,
the usage of .zero of .space is required.
This patch replaces the assert with an error.
Differential Revision: https://reviews.llvm.org/D48517
llvm-svn: 336127
Summary:
This adds a new -no-weak flag to nm to hide weak symbols in its output.
This also adds a -W alias for this which is analogous to -U.
Patch by Keith Smiley
Reviewers: kastiglione, enderby, compnerd
Reviewed By: kastiglione
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D48751
llvm-svn: 336126
Currently the llvm-exegesis native architecture is determined by comparing the
llvm native architecture with X86, so to add a new target would mean adding a
new check. Change this to building up a list of the targets llvm-exegesis
supports then using that, as this means that when adding a new target you just
add the target to the list of supported targets.
Differential Revision: https://reviews.llvm.org/D48778
llvm-svn: 336105
Currently the cycle counter is taken from the subtarget schedule model, which
isn't any use if the subtarget doesn't have one. Delegate the decision to the
target benchmark runner, as it may know better what to do in that case, with
the default being the current behaviour.
Differential Revision: https://reviews.llvm.org/D48779
llvm-svn: 336099
We were printing every character, even those that weren't printable. It
doesn't really make sense for this option.
The string content was sticked to its address, added two spaces in
between.
Differential Revision: https://reviews.llvm.org/D48271
llvm-svn: 336058
The original binary holder has an optimization where it caches a static
library (archive) between consecutive calls to GetObjects. However, the
actual memory buffer wasn't cached between calls.
This made sense when dsymutil was processing objects one after each
other, but when processing them in parallel, several binaries have to be
in memory at the same time. For this reason, every link context
contained a binary holder.
Having one binary holder per context is problematic, because the same
static archive was cached for every object file. Luckily, when the file
is mmap'ed, this was only costing us virtual memory.
This patch introduces a new BinaryHolder variant that is fully cached,
for all the object files it load, as well as the static archives. This
way, we don't have to give up on this optimization of bypassing the
file system.
Differential revision: https://reviews.llvm.org/D48501
llvm-svn: 335990
This simplifies the logic that updates RAW dependencies in the DispatchStage.
There is no advantage in storing that flag in the ReadDescriptor; we should
simply rely on the call to `STI.getReadAdvanceCycles()` to obtain the
ReadAdvance cycles. If there are no read-advance entries, then method
`getReadAdvanceCycles()` quickly returns 0.
No functional change intended.
llvm-svn: 335977
This change adds experimental support for SHT_RELR sections, proposed
here: https://groups.google.com/forum/#!topic/generic-abi/bX460iggiKg
Definitions for the new ELF section type and dynamic array tags, as well
as the encoding used in the new section are all under discussion and are
subject to change. Use with caution!
Author: rahulchaudhry
Differential Revision: https://reviews.llvm.org/D47919
llvm-svn: 335922
The checking logic should not treat artificial locations as being
somehow problematic. Producing these locations can be the desired
behavior of some passes.
See llvm.org/PR37961.
llvm-svn: 335897
This patch introduces a new class named WriteRef. A WriteRef is used by the
RegisterFile to keep track of register definitions. Internally it wraps a
WriteState, as well as the source index of the defining instruction.
This patch allows the tool to propagate additional information to support future
analysis on data dependencies.
llvm-svn: 335867
Rather than calling std::find in a loop, just sort the vector and remove
duplicate entries at the end of the function.
Also, move the debug print at the end of the function, and query the
MCRegisterInfo to print register names rather than physreg IDs.
No functional change intended.
llvm-svn: 335837
Summary:
This enables the X86-specific X86FloatingPointStackifierPass, and allow
llvm-exegesis to generate and measure X87 latency/uops for some FP ops.
Reviewers: gchatelet
Subscribers: tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D48592
llvm-svn: 335815
This patch splits off some abstractions used by dsymutil's dwarf linker
and moves them into separate header and implementation files. This
almost halves the number of LOC in DwarfLinker.cpp and makes it a lot
easier to understand what functionality lives where.
Differential revision: https://reviews.llvm.org/D48647
llvm-svn: 335749
Summary:
This patch removes a few callbacks from Pipeline. It comes at the cost of
registering Listeners with all Stages. Not all stages need listeners or issue
callbacks, this registration is a bit redundant. However, as we build-out the
API, this redundancy can disappear.
The main purpose here is to move callback code from the Pipeline and into the
stages that actually issue those callbacks. This removes the back-pointer to
the Pipeline that was put into a few Stage subclasses.
Reviewers: andreadb, courbet, RKSimon
Reviewed By: andreadb, courbet
Subscribers: tschuett, gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D48576
llvm-svn: 335748
When promoting instructions from the wait queue to the ready queue, we should
check if an instruction has already reached the IS_READY state before
calling method update().
llvm-svn: 335722
It's not possible to get the fragment size of some dbg.values. Teach the
mis-sized dbg.value diagnostic to detect this scenario and bail out.
Tested with:
$ find test/Transforms -print -exec opt -debugify-each -instcombine {} \;
llvm-svn: 335695
Report an error in -check-debugify when the size of a dbg.value operand
doesn't match up with the size of the variable it describes.
Eventually this check should be moved into the IR verifier. For the
moment, it's useful to include the check in -check-debugify as a means
of catching regressions and finding existing bugs.
Here are some instances of bugs the new check finds in the -O2 pipeline
(all in InstCombine):
1) A float is used where a double is expected:
ERROR: dbg.value operand has size 32, but its variable has size 64:
call void @llvm.dbg.value(metadata float %expf, metadata !12, metadata
!DIExpression()), !dbg !15
2) An i8 is used where an i32 is expected:
ERROR: dbg.value operand has size 8, but its variable has size 32:
call void @llvm.dbg.value(metadata i8 %t4, metadata !14, metadata
!DIExpression()), !dbg !24
3) A <4 x i32> is used where something twice as large is expected
(perhaps a <4 x i64>, I haven't double-checked):
ERROR: dbg.value operand has size 128, but its variable has size 256:
call void @llvm.dbg.value(metadata <4 x i32> %4, metadata !40, metadata
!DIExpression()), !dbg !95
Differential Revision: https://reviews.llvm.org/D48408
llvm-svn: 335682
LLJIT is a prefabricated ORC based JIT class that is meant to be the go-to
replacement for MCJIT. Unlike OrcMCJITReplacement (which will continue to be
supported) it is not API or bug-for-bug compatible, but targets the same
use cases: Simple, non-lazy compilation and execution of LLVM IR.
LLLazyJIT extends LLJIT with support for function-at-a-time lazy compilation,
similar to what was provided by LLVM's original (now long deprecated) JIT APIs.
This commit also contains some simple utility classes (CtorDtorRunner2,
LocalCXXRuntimeOverrides2, JITTargetMachineBuilder) to support LLJIT and
LLLazyJIT.
Both of these classes are works in progress. Feedback from JIT clients is very
welcome!
llvm-svn: 335670
When checking the debug info in a module, don't treat a missing
dbg.value as an error. The dbg.value may simply have been DCE'd, in
which case the debugger has enough information to display the variable
as <optimized out>.
llvm-svn: 335647
Summary:
Adds assembly parsing support for the module summary index (follow on
to r333335 which added the assembly writing support).
I added support to llvm-as to invoke the index parsing, so that it can
create either a bitcode file with a Module and a per-module index, or
a combined index without a Module.
I will send follow on patches soon to do the following:
- add support to tools such as llvm-lto2 to parse the per-module indexes
from assembly instead of bitcode when testing the thin link.
- verification support.
Depends on D47844 and D47842.
Reviewers: pcc, dexonsmith, mehdi_amini
Subscribers: inglorion, eraman, steven_wu, llvm-commits
Differential Revision: https://reviews.llvm.org/D47905
llvm-svn: 335602
Summary:
This allows targets to override code generation for some instructions.
As an example of override, this also moves ad-hoc instruction filtering
for X86 into the X86 ExegesisTarget.
Reviewers: gchatelet
Subscribers: mgorny, tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D48587
llvm-svn: 335582
Summary:
This change renames the Backend and BackendPrinter to Pipeline and PipelinePrinter respectively.
Variables and comments have also been updated to reflect this change.
The reason for this rename, is to be slightly more correct about what MCA is modeling. MCA models a Pipeline, which implies some logical sequence of stages.
Reviewers: andreadb, courbet, RKSimon
Reviewed By: andreadb, courbet
Subscribers: mgorny, javed.absar, tschuett, gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D48496
llvm-svn: 335496
Summary:
This ensures that the snippet always sees the same values for registers,
making measurements reproducible.
This will also allow exploring different values.
Reviewers: gchatelet
Subscribers: tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D48542
llvm-svn: 335465
The DispatchUnit is no longer a dependency of RCU, so this patch removes a
stale include and forward decl. This patch also cleans up some comments.
llvm-svn: 335392
Summary:
Remove explicit stages and introduce a list of stages.
A pipeline should be composed of an arbitrary list of stages, and not any
predefined list of stages in the Backend. The Backend should not know of any
particular stage, rather it should only be concerned that it has a list of
stages, and that those stages will fulfill the contract of what it means to be
a Stage (namely pre/post/execute a given instruction).
For now, we leave the original set of stages defined in the Backend ctor;
however, I imagine these will be moved out at a later time.
This patch makes an adjustment to the semantics of Stage::isReady.
Specifically, what the Backend really needs to know is if a Stage has
unfinished work. With that said, it is more appropriately renamed
Stage::hasWorkToComplete(). This change will clean up the check in
Backend::run(), allowing us to query each stage to see if there is unfinished
work, regardless of what subclass a stage might be. I feel that this change
simplifies the semantics too, but that's a subjective statement.
Given how RetireStage and ExecuteStage handle data in their preExecute(), I've
had to change the order of Retire and Execute in our stage list. Retire must
complete any of its preExecute actions before ExecuteStage's preExecute can
take control. This is mainly because both stages utilize the RCU. In the
meantime, I want to see if I can adjust that or remove that coupling.
Reviewers: andreadb, RKSimon, courbet
Reviewed By: andreadb
Subscribers: tschuett, gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D46907
llvm-svn: 335361
After the recent refactoring that introduced parallel handling of
different object, the binary holder became unique per object file. This
defeats its optimization of caching archives, leading to an archive
being opened for every binary it contains. This is obviously unfortunate
and will need to be refactored soon.
Luckily in practice, the impact of this is limited as most files are
mmap'ed instead of memcopy'd. There's a caveat however: when the memory
buffer requires a null terminator and it's a multiple of the page size,
we allocate instead of mmap'ing. If this happens for a static archive,
we end up with N copies of it in memory, where N is the number of
objects in the archive, leading to exuberant memory usage. This provided
a stopgap solution to ensure that all the files it loads are mmap in
memory by removing the requirement for a terminating null byte.
Differential revision: https://reviews.llvm.org/D48397
llvm-svn: 335293
Summary: Pretty much everything we need is in llvm::TargetMachine.
Reviewers: gchatelet
Subscribers: llvm-commits, tschuett
Differential Revision: https://reviews.llvm.org/D48428
llvm-svn: 335237
Errors found processing the DW_AT_ranges attribute are propagated by lower level
routines and reported by their callers.
Reviewer: JDevlieghere
Differential Revision: https://reviews.llvm.org/D48344
llvm-svn: 335188
This patch teaches llvm-mca how to identify register writes that implicitly zero
the upper portion of a super-register.
On X86-64, a general purpose register is implemented in hardware as a 64-bit
register. Quoting the Intel 64 Software Developer's Manual: "an update to the
lower 32 bits of a 64 bit integer register is architecturally defined to zero
extend the upper 32 bits". Also, a write to an XMM register performed by an AVX
instruction implicitly zeroes the upper 128 bits of the aliasing YMM register.
This patch adds a new method named clearsSuperRegisters to the MCInstrAnalysis
interface to help identify instructions that implicitly clear the upper portion
of a super-register. The rest of the patch teaches llvm-mca how to use that new
method to obtain the information, and update the register dependencies
accordingly.
I compared the kernels from tests clear-super-register-1.s and
clear-super-register-2.s against the output from perf on btver2. Previously
there was a large discrepancy between the estimated IPC and the measured IPC.
Now the differences are mostly in the noise.
Differential Revision: https://reviews.llvm.org/D48225
llvm-svn: 335113
Summary: Introducing a Prototype object to capture Variables that must be set but keeps degrees of freedom as Invalid. This allows exploring non constraint variables later on.
Reviewers: courbet
Subscribers: tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D48316
llvm-svn: 335105
Summary: This is a step towards implementing memory operands and X87.
Reviewers: gchatelet
Subscribers: mgorny, tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D48210
llvm-svn: 335038
Summary:
While that is indeed a quite interesting summary stat,
there are cases where it does not really add anything
other than consuming extra lines.
Declutters the output of D48190.
Reviewers: RKSimon, andreadb, courbet, craig.topper
Reviewed By: andreadb
Subscribers: javed.absar, gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D48209
llvm-svn: 334833
Summary:
On hover, the whole asm snippet is displayed, including operands.
This requires the actual assembly output instead of just the MCInsts:
This is because some pseudo-instructions get lowered to actual target
instructions during codegen (e.g. ABS_Fp32 -> SSE or X87).
Reviewers: gchatelet
Subscribers: mgorny, tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D48164
llvm-svn: 334805
Summary:
Get rid of OpcodeName.
To remove the opcode name from an old file:
```
cat old_file | sed '/opcode_name.*/d'
```
Reviewers: gchatelet
Subscribers: tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D48121
llvm-svn: 334691
Summary: This patch transforms the Scheduler class into the ExecuteStage. Most of the logic remains.
Reviewers: andreadb, RKSimon, courbet
Reviewed By: andreadb
Subscribers: mgorny, javed.absar, tschuett, gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D47246
llvm-svn: 334679
Fixes PR37790.
In some (very rare) cases, the LSUnit (Load/Store unit) was wrongly marking a
load (or store) as "ready to execute" effectively bypassing older memory barrier
instructions.
To reproduce this bug, the memory barrier must be the first instruction in the
input assembly sequence, and it doesn't have to perform any register writes.
llvm-svn: 334633
Not sure why, but it breaks buildbot clang-cmake-armv8-full.
It causes a failure in TEST 'Xray-armhf-linux :: TestCases/Posix/profiling-single-threaded.cc'.
llvm-svn: 334617
Summary: Previous design was relying on the 'mutate' keyword and was quite confusing. This version separate mutable from immutable data and makes it clearer what changes and what doesn't.
Reviewers: courbet
Subscribers: tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D48020
llvm-svn: 334596
Summary:
This method was not correct for entries in DWO files as it assumed it
could just add up the CU and DIE offsets to get the absolute DIE offset.
This is not correct for the DWO files, as here the CU offset will
reference the skeleton unit, whereas the DIE offset will be the offset
in the full unit in the DWO file.
Unfortunately, this means that we are not able to determine the absolute
DIE offset using the information in the .debug_names section alone,
which means we have to offload some of this work to the users of this
class.
To demonstrate how this can be done, I've added/fixed the ability to
lookup entries using accelerator tables in DWO files in llvm-dwarfdump.
To make this happen, I've needed to make two extra changes in other
classes:
- made the DWARFContext method to lookup a CU based on the section
offset public. I've needed this functionality to lookup a CU, and this
seems like a useful thing in general.
- made DWARFUnit::getDWOId call extractDIEsIfNeeded. Before this, the
DWOId was filled in only if the root DIE happened to be parsed
before we called the accessor. Since the lazy parsing is supposed to
happen under the hood, calling extractDIEsIfNeeded seems appropriate.
Reviewers: JDevlieghere, aprantl, dblaikie
Subscribers: mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D48009
llvm-svn: 334578
This simplifies some code which had StringRefs to begin with, and
makes other code more complicated which had const char* to begin
with.
In the end, I think this makes for a more idiomatic and platform
agnostic API. Not all platforms launch process with null terminated
c-string arrays for the environment pointer and argv, but the api
was designed that way because it allowed easy pass-through for
posix-based platforms. There's a little additional overhead now
since on posix based platforms we'll be takign StringRefs which
were constructed from null terminated strings and then copying
them to null terminate them again, but from a readability and
usability standpoint of the API user, I think this API signature
is strictly better.
llvm-svn: 334518
Don't provide the assembler source as the "root file" unless the user
asked to have debug info for the assembler source (with -g).
If the source doesn't provide an explicit ".file 0" then (a) use the
compilation directory as directory #0, and (b) use the file #1 info
for file #0 also.
Differential Revision: https://reviews.llvm.org/D48055
llvm-svn: 334512
Summary: This patch moves linking of libpfm from different places to a single one.
Reviewers: courbet
Subscribers: mgorny, tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D48075
llvm-svn: 334499
Name table occupies a big chunk of size in current binary format sample profile.
In order to reduce its size, the patch changes the sample writer/reader to
save/restore MD5Hash of names in the name table. Sample annotation phase will
also use MD5Hash of name to query samples accordingly.
Experiment shows compact binary format can reduce the size of sample profile by
2/3 compared with binary format generally.
Differential Revision: https://reviews.llvm.org/D47955
llvm-svn: 334447
NFC here, this just raises some platform specific ifdef hackery
out of a class and creates proper platform-independent typedefs
for the relevant things. This allows these typedefs to be
reused in other places without having to reinvent this preprocessor
logic.
llvm-svn: 334294
I don't know how to build this code, but based on the failing
buildbot error message it looks like this change should get
the buildbot up and running again.
llvm-svn: 334231
This breaks the OpenFlags enumeration into two separate
enumerations: OpenFlags and CreationDisposition. The first
controls the behavior of the API depending on whether or not
the target file already exists, and is not a flags-based
enum. The second controls more flags-like values.
This yields a more easy to understand API, while also allowing
flags to be passed to the openForRead api, where most of the
values didn't make sense before. This also makes the apis more
testable as it becomes easy to enumerate all the configurations
which make sense, so I've added many new tests to exercise all
the different values.
llvm-svn: 334221
The class Object contains std::shared_ptr<MemoryBuffer> OwnedData
which is not used anywhere. Besides avoiding two stage initialization
the motivation to remove it comes from the plan to add (currently missing) support
for static libraries.
NFC.
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D47855
llvm-svn: 334217
Summary: BenchmarkRunner subclasses can now create many configurations - although this patch still generates one.
Reviewers: courbet
Subscribers: tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D47877
llvm-svn: 334197
Summary: This is the first step to have the BenchmarkRunner create and measure many different configurations (different initial values for instance).
Reviewers: courbet
Subscribers: tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D47826
llvm-svn: 334169
Summary: BenchmarkResult IO functions now return an Error or Expected so caller can deal take proper action.
Reviewers: courbet
Subscribers: tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D47868
llvm-svn: 334167
With '-elf-output-style=GNU -relocations', a header containing the number
of entries is printed before all the relocation entries in the section.
For Android packed format, we need to perform the unpacking first before
we can get the actual number of relocations in the section.
Patch by Rahul Chaudhry!
Differential Revision: https://reviews.llvm.org/D47800
llvm-svn: 334147
With the upcoming patch to add summary parsing support, IsAnalysis would
be true in contexts where we are not performing module summary analysis.
Rename to the more specific and approprate HaveGVs, which is essentially
what this flag is indicating.
llvm-svn: 334140
Before this patch, debugify would insert debug value intrinsics before the
terminating instruction in a block. This had the advantage of being simple,
but was a bit too simple/unrealistic.
This patch teaches debugify to insert debug values immediately after their
operand defs. This enables better testing of the compiler.
For example, with this patch, `opt -debugify-each` is able to identify a
vectorizer DI-invariance bug fixed in llvm.org/PR32761. In this bug, the
vectorizer produced different output with/without debug info present.
Reverting Davide's bugfix locally, I see:
$ ~/scripts/opt-check-dbg-invar.sh ./bin/opt \
.../SLPVectorizer/AArch64/spillcost-di.ll -slp-vectorizer
Comparing: -slp-vectorizer .../SLPVectorizer/AArch64/spillcost-di.ll
Baseline: /var/folders/j8/t4w0bp8j6x1g6fpghkcb4sjm0000gp/T/tmp.iYYeL1kf
With DI : /var/folders/j8/t4w0bp8j6x1g6fpghkcb4sjm0000gp/T/tmp.sQtQSeet
9,11c9,11
< %5 = getelementptr inbounds %0, %0* %2, i64 %0, i32 1
< %6 = bitcast i64* %4 to <2 x i64>*
< %7 = load <2 x i64>, <2 x i64>* %6, align 8, !tbaa !0
---
> %5 = load i64, i64* %4, align 8, !tbaa !0
> %6 = getelementptr inbounds %0, %0* %2, i64 %0, i32 1
> %7 = load i64, i64* %6, align 8, !tbaa !5
12a13
> store i64 %5, i64* %8, align 8, !tbaa !0
14,15c15
< %10 = bitcast i64* %8 to <2 x i64>*
< store <2 x i64> %7, <2 x i64>* %10, align 8, !tbaa !0
---
> store i64 %7, i64* %9, align 8, !tbaa !5
:: Found a test case ^
Running this over the *.ll files in tree, I found four additional examples
which compile differently with/without DI present. I plan on filing bugs for
these.
llvm-svn: 334118
Moves the Mode field out of the Key. The existing yaml benchmark results can be fixed with the following script:
```
readonly FILE=$1
readonly MODE=latency # Change to uops to fix a uops benchmark.
cat $FILE | \
sed "/^\ \+mode:\ \+$MODE$/d" | \
sed "/^cpu_name.*$/i mode: $MODE"
```
Differential Revision: https://reviews.llvm.org/D47813
Authored by: Guillaume Chatelet
llvm-svn: 334079
This patch fixe the logic in ReadState::cycleEvent(). That method was not
correctly updating field `TotalCycles`.
Added extra code comments in class ReadState to better describe each field.
llvm-svn: 334028
We want llvm-exegesis to explore instructions (effect of initial register values, effect of operand selection). To enable this a BenchmarkResult muststore all the relevant data in its key. This patch starts adding such data. Here we simply allow to store the generated instructions, following patches will add operands and initial values for registers.
https://reviews.llvm.org/D47764
Authored by: Guilluame Chatelet
llvm-svn: 334008
The -check-debugify pass should preserve all analyses. Otherwise, it may
invalidate an optional analysis and inadvertently alter codegen.
The test case is reduced from deopt-bundle.ll. The result of `opt -O1`
on this file would differ when -debugify-each was toggled. That happened
because CheckDebugify failed to preserve GlobalsAA.
Thanks to Davide Italiano for his help chasing this down!
llvm-svn: 333959
Review feedback from r328165. Split out just the one function from the
file that's used by Analysis. (As chandlerc pointed out, the original
change only moved the header and not the implementation anyway - which
was fine for the one function that was used (since it's a
template/inlined in the header) but not in general)
llvm-svn: 333954
Summary:
These tools failed for a very large bitcode file produced by LTO due to
64-bit values being assigned to 32-bit types. For the BitstreamReader.h
fix, the value initially fit into the 32-bit unsigned, but there was an
overflow when multiplying by 32 furter below to compute the bit offset.
No test case in the patch as this requires a huge bitcode file.
Reviewers: pcc, george.karpenkov
Subscribers: mehdi_amini, a.sidorin, llvm-commits
Differential Revision: https://reviews.llvm.org/D47731
llvm-svn: 333942
This patch is the last of a sequence of three patches related to LLVM-dev RFC
"MC support for variant scheduling classes".
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123181.html
This fixes PR36672.
The main goal of this patch is to teach llvm-mca how to solve variant scheduling
classes. This patch does that, plus it adds new variant scheduling classes to
the BtVer2 scheduling model to identify so-called zero-idioms (i.e. so-called
dependency breaking instructions that are known to generate zero, and that are
optimized out in hardware at register renaming stage).
Without the BtVer2 change, this patch would not have had any meaningful tests.
This patch is effectively the union of two changes:
1) a change that teaches llvm-mca how to resolve variant scheduling classes.
2) a change to the BtVer2 scheduling model that allows us to special-case
packed XOR zero-idioms (this partially fixes PR36671).
Differential Revision: https://reviews.llvm.org/D47374
llvm-svn: 333909
Resubmit of r333424. This version contains the fix for fails found by buildbots
on some targets.
This patch allows parsing GNU_PROPERTY_X86_FEATURE_1_AND
notes in .note.gnu.property sections. These notes
indicate that the object file is built to support Intel CET.
patch by mike.dvoretsky
Differential Revision: https://reviews.llvm.org/D47473
llvm-svn: 333908
This is required if we want to correctly match the behavior of method
SubtargetEmitter::ExpandProcResource() in Tablegen. When computing the set of
"consumed" processor resources and resource cycles, the logic in
ExpandProcResource() doesn't update the number of resource cycles contributed by
a "Super" resource to a group. We need to take this into account when a model
declares a processor resource which is part of a 'processor resource group', and
it is also used as the "Super" of other resources.
llvm-svn: 333892
Summary:
We now highlight any sched classes whose measurements do not match the
LLVM SchedModel. "bad" clusters are marked in red.
Screenshot in phabricator diff.
Reviewers: gchatelet
Subscribers: tschuett, mgrang, RKSimon, llvm-commits
Differential Revision: https://reviews.llvm.org/D47639
llvm-svn: 333884
After r333856, opt -debugify would just stop emitting debug value
intrinsics after encountering a musttail call. This wasn't sufficient to
avoid verifier failures.
Debug value intrinicss for all instructions preceding a musttail call
must also be emitted before the musttail call.
llvm-svn: 333866
Applying synthetic debug info before the bitcode writer pass has no
testing-related purpose. This commit prevents that from happening.
It also adds tests which check that IR produced with/without
-debugify-each enabled is identical after stripping. This makes it
possible to check that individual passes (or full pipelines) are
invariant to debug info.
llvm-svn: 333861
The -strip-module-flags option strips llvm.module.flags metadata from a
module at the beginning of the opt pipeline.
This will be used to test whether the output of a pass is debug info
(DI) invariant.
E.g, after applying synthetic debug info to a test case, we'd like to
strip out all DI-related metadata and check that the final IR is
identical to a baseline file without any DI applied, to check that
optimizations aren't inhibited by debug info.
llvm-svn: 333860
Object FIle Representation
At codegen time this is emitted into the ELF file a pair of symbol indices and a weight. In assembly it looks like:
.cg_profile a, b, 32
.cg_profile freq, a, 11
.cg_profile freq, b, 20
When writing an ELF file these are put into a SHT_LLVM_CALL_GRAPH_PROFILE (0x6fff4c02) section as (uint32_t, uint32_t, uint64_t) tuples as (from symbol index, to symbol index, weight).
Differential Revision: https://reviews.llvm.org/D44965
llvm-svn: 333823
This fixes the bug where strip-all option was
leading to a malformed outputted ELF file.
Differential Revision: https://reviews.llvm.org/D47414
llvm-svn: 333772
Summary:
"Unknown" for platforms that were not manually added into the switch
did not make sense at all. Now it prints Target + addend for all
elf-machines that were not explicitly mentioned.
Addresses PR21059 and PR25124.
Original author: fedor.sergeev
Reviewers: jyknight, espindola, fedor.sergeev
Reviewed By: jyknight
Subscribers: eraman, dcederman, jfb, dschuff, aheejin, llvm-commits
Differential Revision: https://reviews.llvm.org/D36464
llvm-svn: 333726
This diff implements the option -o
for specifying a file to write the output to.
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D47505
llvm-svn: 333693
The lambda functions used by method ResourceManager::mustIssueImmediately() was
incorrectly truncating masks of buffered processor resources to 32-bit quantities.
The invalid mask values were then used to access a map of processor
resource descriptors.
Fixes PR37643.
llvm-svn: 333692
As noted by Adrian on llvm-commits, PrintHTMLEscaped and PrintEscaped in
StringExtras did not conform to the LLVM coding guidelines. This commit
rectifies that.
llvm-svn: 333669
Summary:
Both (Apple and DWARF5) implementations of the iterators had bugs which
resulted in crashes if one attempted to iterate through the accelerator
tables all the way.
For the Apple tables, the issue was that we did not clear the DataOffset
field when we reached the end, which made our iterator compare unequal
to the "end" iterator. For the Dwarf5 tables, the problem was that we
incremented the CurrentIndex pointer and then used the incremented
(possibly invalid) pointer to check whether we have reached the end of
the index list.
The reason these bugs went undetected is because their only user
(dwarfdump) only ever searched for the first match. Besides allowing us
to test this fix, changing llvm-dwarfdump --find to display all matches
seems like a good improvement (it makes the behavior consistent with the
--name option), so I change llvm-dwarfdump to do that.
The existing tests would be sufficient to test this fix with the new
llvm-dwarfdump behavior, but I add a special test that demonstrates that
the tool indeed displays multiple results. The find.test test needed to
be tweaked a bit as the tool now does not print the ".debug_info
contents" header (also consistent with how --name works).
Reviewers: JDevlieghere, aprantl, dblaikie
Subscribers: mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D47543
llvm-svn: 333635
Summary:
I'm slowly looking into a new X86 scheduler model,
for AMD Bulldozer CPU, model 2 (bdver2, Piledriver).
And naturally, i have hit that assert :)
I happened to know what it meant, and how to fix it,
but that is not too common knowledge.
Reviewers: courbet, RKSimon
Reviewed By: courbet
Subscribers: tschuett, llvm-commits, craig.topper
Differential Revision: https://reviews.llvm.org/D47572
llvm-svn: 333632
Per discussion on the generic-abi mailing list:
https://groups.google.com/forum/#!topic/generic-abi/MPr8TVtnVn4
An object file manipulation tool must either write out a symbol
table with the same number of entries as the original symbol table
and in the same order, or if this is impossible, refuse to operate
on the object file if it has unrecognized sections that are linked
to the symtab section. However, existing tools (namely GNU strip,
GNU objcopy and ld.{bfd,gold,lld} -r) do not comply with this at
present: they change symbol table indexes and set sh_link to 0 on
the unrecognized symtab-linked sections.
We intend to use the latter as a (temporary) signal that a tool has
operated on a proposed new symtab-linked section and invalidated the
symbol table indexes. However, llvm-objcopy currently keeps sh_link
pointing to the new symtab section. This patch changes llvm-objcopy
to set sh_link to 0 to match the behaviour of the other tools.
Differential Revision: https://reviews.llvm.org/D47404
llvm-svn: 333581
When printing string in the Plist, we weren't escaping the characters
which lead to invalid XML. This patch adds the escape logic to
StringExtras.
rdar://39785334
llvm-svn: 333565
Previously JITCompileCallbackManager only supported single threaded code. This
patch embeds a VSO (see include/llvm/ExecutionEngine/Orc/Core.h) in the callback
manager. The VSO ensures that the compile callback is only executed once and that
the resulting address cached for use by subsequent re-entries.
llvm-svn: 333490
This patch allows parsing GNU_PROPERTY_X86_FEATURE_1_AND
notes in .note.gnu.property sections. These notes
indicate that the object file is built to support Intel CET.
patch by mike.dvoretsky
Differential Revision: https://reviews.llvm.org/D47473
llvm-svn: 333424