Summary:
Use `vector<char> Added + vector<size_t> ToProcess` to replace `SetVector ToProcess`
We also check `Added[P]` to enqueueing a point more than once, which
also saves us a `ClusterIdForPoint_[Q].isUndef()` check.
Reviewers: courbet, RKSimon, gchatelet, john.brawn, lebedev.ri
Subscribers: tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D54442
........
Patch wasn't approved and breaks buildbots
llvm-svn: 349139
Summary:
Use `vector<char> Added + vector<size_t> ToProcess` to replace `SetVector ToProcess`
We also check `Added[P]` to enqueueing a point more than once, which
also saves us a `ClusterIdForPoint_[Q].isUndef()` check.
Reviewers: courbet, RKSimon, gchatelet, john.brawn, lebedev.ri
Subscribers: tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D54442
llvm-svn: 349136
Summary:
llvm-size uses "isText()" etc. which seem to indicate whether the section contains code-like things, not whether or not it will actually go in the text segment when in a fully linked executable.
The unit test added (elf-sizes.test) shows some types of sections that cause discrepencies versus the GNU size tool. llvm-size is not correctly reporting sizes of things mapping to text/data segments, at least for ELF files.
This fixes pr38723.
Reviewers: echristo, Bigcheese, MaskRay
Reviewed By: MaskRay
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54369
llvm-svn: 349074
Summary:
In both Elf{32,64}_Phdr, the field Elf{32,64}_World p_type is uint32_t.
Also reorder the fields to be similar to Elf64_Phdr (which is different
from Elf32_Phdr but quite similar).
Reviewers: rupprecht, jhenderson, jakehehrlich, alexshap, espindola
Reviewed By: rupprecht
Subscribers: emaste, arichardson, llvm-commits
Differential Revision: https://reviews.llvm.org/D55618
llvm-svn: 348985
Continue to present HSA metadata as YAML in ASM and when output by tools
(e.g. llvm-readobj), but encode it in Messagepack in the code object.
Differential Revision: https://reviews.llvm.org/D48179
llvm-svn: 348963
Using an Error as an out parameter from an indirect operation like
iteration as described in the documentation (
http://llvm.org/docs/ProgrammersManual.html#building-fallible-iterators-and-iterator-ranges
) seems to be a little fussy - so here's /one/ possible solution, though
I'm not sure it's the right one.
Alternatively such APIs may be better off being switched to a standard
algorithm style, where they take a lambda to do the iteration work that
is then called back into (eg: "Error e = obj.for_each_note([](const
Note& N) { ... });"). This would be safer than having an unwritten
assumption that the user of such an iteration cannot return early from
the inside of the function - and must always exit through the gift
shop... I mean error checking. (even though it's guaranteed that if
you're mid-way through processing an iteration, it's not in an error
state).
Alternatively we'd need some other (the super untrustworthy/thing we've
generally tried to avoid) error handling primitive that actually clears
the error state entirely so it's safe to ignore.
Fleshed this solution out a bit further during review - it now relies on
op==/op!= comparison as the equivalent to "if (Err)" testing the Error.
So just like an Error must be checked (even if it's in a success state),
the Error hiding in the iterator must be checked after each increment
(including by comparison with another iterator - perhaps this could be
constrained to only checking if the iterator is compared to the end
iterator? Not sure it's too important).
So now even just creating the iterator and not incrementing it at all
should still assert because the Error has not been checked.
Reviewers: lhames, jakehehrlich
Differential Revision: https://reviews.llvm.org/D55235
llvm-svn: 348811
Summary:
When bugpoint attempts to find the other executables it needs to run,
such as `opt` or `clang`, it tries searching the user's PATH. However,
in many cases, the 'bugpoint' executable is part of an LLVM build, and
the 'opt' executable it's looking for is in that same directory.
Many LLVM tools handle this case by using the `Paths` parameter of
`llvm::sys::findProgramByName`, passing the parent path of the currently
running executable. Do this same thing for bugpoint. However, to
preserve the current behavior exactly, first search the user's PATH,
and then search for 'opt' in the directory containing 'bugpoint'.
Test Plan:
`check-llvm`. Many of the existing bugpoint tests no longer need to use the
`--opt-command` option as a result of these changes.
Reviewers: MatzeB, silvas, davide
Reviewed By: MatzeB, davide
Subscribers: davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D54884
llvm-svn: 348734
Summary:
This anoymous function actually has same logic with `Obj->toMappedAddr`.
Besides, I have a question on resolving illegal value. `gnu-readelf`, `gnu-objdump` and `llvm-objdump` could parse the test file 'test/tools/llvm-objdump/Inputs/private-headers-x86_64.elf', but `llvm-readobj` will fail when parse `DT_RELR` segment. Because, the value is 0x87654321 which is illegal. So, shall we do this clean up rather then remove the checking statements inside anoymous function?
```
if (Delta >= Phdr.p_filesz)
return createError("Virtual address is not in any segment");
```
Reviewers: rupprecht, jhenderson
Reviewed By: jhenderson
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D55329
llvm-svn: 348701
Summary: This line is longer than 80 characters.
Subscribers: llvm-commits, jakehehrlich
Differential Revision: https://reviews.llvm.org/D55419
llvm-svn: 348580
Summary: This line is longer than 80 characters.
Subscribers: llvm-commits, jakehehrlich
Differential Revision: https://reviews.llvm.org/D55419
llvm-svn: 348578
I just hard core goofed when I wrote this and created a different name
for no good reason. I'm failry aware of most "fresh" users of llvm-objcopy
(that is, users which are not using it as a drop in replacement for GNU
objcopy) and can say that only "-j" is being used by such people so this
patch should strictly increase compatibility and not remove it.
Differential Revision: https://reviews.llvm.org/D52180
llvm-svn: 348446
Summary:
r336838 allowed these to be toggleable.
r336858 reverted r336838.
r336943 made the generation of these sections conditional on LDPO_REL.
This commit brings back the toggle-ability. You can specify:
-plugin-opt=-function-sections
-plugin-opt=-data-sections
For your linker flags to disable the changes made in r336943.
Without toggling r336943 off, arm64 linux kernels linked with gold-plugin
see significant boot time regressions, but with r336943 outright reverted
x86_64 linux kernels linked with gold-plugin fail to boot.
Reviewers: pcc, void
Reviewed By: pcc
Subscribers: javed.absar, kristof.beyls, llvm-commits, srhines
Differential Revision: https://reviews.llvm.org/D55291
llvm-svn: 348389
After TimePoint's precision was increased in LLVM we started seeing
failures because the modification times didn't match. This adds a time
cast to ensure that we're comparing TimePoints with the same amount of
precision.
llvm-svn: 348283
This flag does not exist in GNU objcopy but has a major use case.
Debugging tools support the .build-id directory structure to find
debug binaries. There is no easy way to build this structure up
however. One way to do it is by using llvm-readelf and some crazy
shell magic. This implements the feature directly. It is most often
the case that you'll want to strip a file and send the original to
the .build-id directory but if you just want to send a file to the
.build-id directory you can copy to /dev/null instead.
Differential Revision: https://reviews.llvm.org/D54384
llvm-svn: 348174
Usually local symbols will have their address described in the debug
map. Global symbols have to have their address looked up in the symbol
table of the main executable. By playing with 'ld -r' and export lists,
you can get a symbol described as global by the debug map while actually
being a local symbol as far as the link in concerned. By gathering the
address of local symbols, we fix this issue.
Also, we prefer a global symbol in case of a name collision to preserve
the previous behavior.
Note that using the 'ld -r' tricks, people can actually cause symbol
names collisions that dsymutil has no way to figure out. This fixes the
simple case where there is only one symbol of a given name.
rdar://problem/32826621
Differential revision: https://reviews.llvm.org/D54922
llvm-svn: 348021
This patch removes a (potentially) slow while loop in
DefaultResourceStrategy::select(). A better (and faster) approach is to do some
bit manipulation in order to shrink the range of candidate resources.
On a release build, this change gives an average speedup of ~10%.
llvm-svn: 348007
Summary: The original intention of !Config.xx.empty() was probably to emphasize the thing that is currently considered, but I feel the simplified form is actually easier to understand and it is also consistent with the call sites in other llvm components.
Reviewers: alexshap, rupprecht, jakehehrlich, jhenderson, espindola
Reviewed By: alexshap, rupprecht
Subscribers: emaste, arichardson, llvm-commits
Differential Revision: https://reviews.llvm.org/D55040
llvm-svn: 347891
Summary:
We can sometimes end up with multiple copies of a local variable that
have the same GUID in the index. This happens when there are local
variables with the same name that are in different source files having the
same name/path at compile time (but compiled into different bitcode objects).
In this case make sure we import the copy in the caller's module.
This enables importing both of the variables having the same GUID
(but which will have different promoted names since the module paths,
and therefore the module hashes, will be distinct).
Importing the wrong copy is particularly problematic for read only
variables, since we must import them as a local copy whenever
referenced. Otherwise we get undefs at link time.
Note that the llvm-lto.cpp and ThinLTOCodeGenerator changes are needed
for testing the distributed index case via clang, which will be sent as
a separate clang-side patch shortly. We were previously not doing the
dead code/read only computation before computing imports when testing
distributed index generation (like it was for testing importing and
other ThinLTO mechanisms alone).
Reviewers: evgeny777
Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, dang, llvm-commits
Differential Revision: https://reviews.llvm.org/D55047
llvm-svn: 347886
This patch adds the ability to specify via tablegen which processor resources
are load/store queue resources.
A new tablegen class named MemoryQueue can be optionally used to mark resources
that model load/store queues. Information about the load/store queue is
collected at 'CodeGenSchedule' stage, and analyzed by the 'SubtargetEmitter' to
initialize two new fields in struct MCExtraProcessorInfo named `LoadQueueID` and
`StoreQueueID`. Those two fields are identifiers for buffered resources used to
describe the load queue and the store queue.
Field `BufferSize` is interpreted as the number of entries in the queue, while
the number of units is a throughput indicator (i.e. number of available pickers
for loads/stores).
At construction time, LSUnit in llvm-mca checks for the presence of extra
processor information (i.e. MCExtraProcessorInfo) in the scheduling model. If
that information is available, and fields LoadQueueID and StoreQueueID are set
to a value different than zero (i.e. the invalid processor resource index), then
LSUnit initializes its LoadQueue/StoreQueue based on the BufferSize value
declared by the two processor resources.
With this patch, we more accurately track dynamic dispatch stalls caused by the
lack of LS tokens (i.e. load/store queue full). This is also shown by the
differences in two BdVer2 tests. Stalls that were previously classified as
generic SCHEDULER FULL stalls, are not correctly classified either as "load
queue full" or "store queue full".
About the differences in the -scheduler-stats view: those differences are
expected, because entries in the load/store queue are not released at
instruction issue stage. Instead, those are released at instruction executed
stage. This is the main reason why for the modified tests, the load/store
queues gets full before PdEx is full.
Differential Revision: https://reviews.llvm.org/D54957
llvm-svn: 347857
This reapplies r347767 (originally reviewed at: https://reviews.llvm.org/D55000)
with a fix for the missing std::move of the Error returned by the call to
Pipeline::runCycle().
Below is the original commit message from r347767.
If a user only cares about the overall latency, then the best/quickest way is to
change method Pipeline::run() so that it returns the total number of cycles to
the caller.
When the simulation pipeline is run, the number of cycles (or an error) is
returned from method Pipeline::run().
The advantage is that no hardware event listener is needed for computing that
latency. So, the whole process should be faster (and simpler - at least for that
particular use case).
llvm-svn: 347795
If a user only cares about the overall latency, then the best/quickest way is to
change method Pipeline::run() so that it returns the total number of cycles to
the caller.
When the simulation pipeline is run, the number of cycles (or an error) is
returned from method Pipeline::run().
The advantage is that no hardware event listener is needed for computing that
latency. So, the whole process should be faster (and simpler - at least for that
particular use case).
llvm-svn: 347767
This allows libtool to detect the presence of llvm-strip and use
it with the options --strip-debug and --strip-unneeded.
Also hook up the -V alias for objcopy.
Differential Revision: https://reviews.llvm.org/D54936
llvm-svn: 347731
Summary:
By default sanstats search binaries at the same location where they were when
stats was collected. Sometime you can not print report immediately or you need
to move post-processing to another workstation. To support this use-case when
original binary is missing sanstats will fall-back to directory with sanstats
file.
Reviewers: pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D53857
llvm-svn: 347601
Summary: Help with off-line symbolization or other type debugging.
Reviewers: pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D53606
llvm-svn: 347600
By default, llvm-mca conservatively assumes that a register operand from the
variadic sequence is both a register read and a register write. That is because
MCInstrDesc doesn't describe extra variadic operands; we don't have enough
dataflow information to tell which register operands from the variadic sequence
is a definition, and which is a use instead.
However, if a variadic instruction is flagged 'mayStore' (but not 'mayLoad'),
and it has no 'unmodeledSideEffects', then llvm-mca (very) optimistically
assumes that any register operand in the variadic sequence is a register read
only. Conversely, if a variadic instruction is marked as 'mayLoad' (but not
'mayStore'), and it has no 'unmodeledSideEffects', then llvm-mca optimistically
assumes that any extra register operand is a register definition only.
These assumptions work quite well for variadic load/store multiple instructions
defined by the ARM backend.
llvm-svn: 347522
With this change, InstrBuilder emits an error if the MCInst sequence contains an
instruction with a variadic opcode, and a non-zero number of variadic operands.
Currently we don't know how to correctly analyze variadic opcodes. The problem
with variadic operands is that there is no information for them in the opcode
descriptor (i.e. MCInstrDesc). That means, we don't know which variadic operands
are defs, and which are uses.
In future, we could try to conservatively assume that any extra register
operands is both a register use and a register definition.
This patch fixes a subtle bug in the evaluation of read/write operands for ARM
VLD1 with implicit index update. Added test vld1-index-update.s
llvm-svn: 347503
RetireControlUnitStatistics now reports extra information about the ROB and the
avg/maximum number of entries consumed over the entire simulation.
Example:
Retire Control Unit - number of cycles where we saw N instructions retired:
[# retired], [# cycles]
0, 109 (17.9%)
1, 102 (16.7%)
2, 399 (65.4%)
Total ROB Entries: 64
Max Used ROB Entries: 35 ( 54.7% )
Average Used ROB Entries per cy: 32 ( 50.0% )
Documentation in llvm/docs/CommandGuide/llvmn-mca.rst has been updated to
reflect this change.
llvm-svn: 347493
Also, try to minimize the number of queries to the memory queues to speedup the
analysis.
On average, this change gives a small 2% speedup. For memcpy-like kernels, the
speedup is up to 5.5%.
llvm-svn: 347469
This avoids a heap allocation most of the times.
This patch gives a small but consistent 3% speedup on a release build (up to ~5%
on a debug build).
llvm-svn: 347464
This patch fixes an invalid memory read introduced by r346487.
Before this patch, partial register write had to query the latency of the
dependent full register write by calling a method on the full write descriptor.
However, if the full write is from an already retired instruction, chances are
that the EntryStage already reclaimed its memory.
In some parial register write tests, valgrind was reporting an invalid
memory read.
This change fixes the invalid memory access problem. Writes are now responsible
for tracking dependent partial register writes, and notify them in the event of
instruction issued.
That means, partial register writes no longer need to query their associated
full write to check when they are ready to execute.
Added test X86/BtVer2/partial-reg-update-7.s
llvm-svn: 347459
Apply review comments of https://reviews.llvm.org/D54185 to other target as well, specifically:
1. make anonymous namespaces as small as possible, avoid using static inside anonymous namespaces
2. Add missing header to some files
3. GetLoadImmediateOpcodem-> getLoadImmediateOpcode
4. Fix typo
Differential Revision: https://reviews.llvm.org/D54343
llvm-svn: 347309