PT_GNU_STACK is used in an llvm-objcopy test.
I plan to use PT_GNU_RELRO in a patch to improve nested segment
processing in llvm-objcopy (PR42963).
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D67146
llvm-svn: 370857
Add a mode in which profile read errors are not immediately treated as
fatal. In this mode, merging makes forward progress and reports failure
only if no inputs can be read.
Differential Revision: https://reviews.llvm.org/D66985
llvm-svn: 370827
The check needs to validate a counter offset before performing pointer
arithmetic with the (potentially corrupt) offset.
Found by UBSan's pointer overflow check.
rdar://54843625
Differential Revision: https://reviews.llvm.org/D66979
llvm-svn: 370826
Fix: add a 'consumeError()' call to ObjectFile.cpp.
This error was never checked.
Original commit message:
It adds a test case for a problem fixed by D66976 <https://reviews.llvm.org/D66976>.
It was introduced by me in D66089 <https://reviews.llvm.org/D66089>.
The error reported was never consumed because of a wrong variable name used,
so it could fail when LLVM_ENABLE_ABI_BREAKING_CHECKS is used.
Differential revision: https://reviews.llvm.org/D67002
llvm-svn: 370669
It adds a test case for a problem fixed by D66976.
It was introduced by me in D66089.
The error reported was never consumed because of a wrong variable name used,
so it could fail when LLVM_ENABLE_ABI_BREAKING_CHECKS is used.
Differential revision: https://reviews.llvm.org/D67002
llvm-svn: 370661
On BtVer2 conditional SIMD stores are heavily microcoded.
The latency is directly proportional to the number of packed elements extracted
from the input vector. Also, according to micro-benchmarks, most of the
computation seems to be done in the integer unit.
Only a minority of the uOPs is executed by the FPU. The observed behaviour on
the FPU looks similar to this:
- The input MASK value is moved to the Integer Unit
-- [ a VMOVMSK-like uOP-executed on JFPU0].
- In parallel, each element of the input XMM/YMM is extracted and then sent to
the IntegerUnit through JFPU1.
As expected, a (conditional) store is executed for every extracted element.
Interestingly, a (speculative) load is executed for every extracted element too.
It is as-if a "LOAD - BIT_EXTRACT- CMOV" sequence of uOPs is repeated by the
integer unit for every contionally stored element.
VMASKMOVDQU is a special case: the number of speculative loads is always 2
(presumably, one load per quadword). That means, extra shifts and masking is
performed on (one of) the loaded quadwords before each conditional store (that
also explains the big number of non-FP uOPs retired).
This patch replaces the existing writes for conditional SIMD stores (i.e.
WriteFMaskedStore, and WriteFMaskedStoreY) with the following new writes:
WriteFMaskedStore32 [ XMM Packed Single ]
WriteFMaskedStore32Y [ YMM Packed Single ]
WriteFMaskedStore64 [ XMM Packed Double ]
WriteFMaskedStore64Y [ YMM Packed Double ]
Added a wrapper class named X86SchedWriteMaskMove in X86Schedule.td to describe
both RM and MR variants for conditional SIMD moves in a single tablegen
definition.
Instances of that class are then passed in input to multiclass avx_movmask_rm
when constructing MASKMOVPS/PD definitions.
Since this patch introduces new writes, I had to update all the X86 scheduling
models.
Differential Revision: https://reviews.llvm.org/D66801
llvm-svn: 370649
This is in line with the previous changes which allowed to
override the sh_offset/sh_size and useful for writing test cases.
Differential revision: https://reviews.llvm.org/D66998
llvm-svn: 370633
Verify that the call site DWARF symbols (added during the implementation
of the debug entry values feature) are generated properly.
Differential Revision: https://reviews.llvm.org/D66865
llvm-svn: 370631
cold versus function being newly added.
This is the second half of https://reviews.llvm.org/D66374.
Profile symbol list is the collection of function symbols showing up in
the binary which generates the current profile. It is used to discriminate
function being cold versus function being newly added. Profile symbol list
is only added for profile with ExtBinary format.
During profile use compilation, when profile-sample-accurate is enabled,
a function without profile will be regarded as cold only when it is
contained in that list.
Differential Revision: https://reviews.llvm.org/D66766
llvm-svn: 370563
Also improve assembler parser register validation for .seh_ directives.
This requires moving X86-specific seh directive handling into the x86
backend, which addresses some assembler FIXMEs.
Differential Revision: https://reviews.llvm.org/D66625
llvm-svn: 370533
This tool merges interface stub files to produce a merged interface stub file
or a stub library. Currently it for stub library generation it can produce an
ELF .so stub file, or a TBD file (experimental). It will be used by the clang
-emit-interface-stubs compilation pipeline to merge and assemble the per-CU
stub files into a stub library.
The new IFS format is as follows:
--- !experimental-ifs-v1
IfsVersion: 1.0
Triple: <llvm triple>
ObjectFileFormat: <ELF | TBD>
Symbols:
_ZSymbolName: { Type: <type>, etc... }
...
Differential Revision: https://reviews.llvm.org/D66405
llvm-svn: 370499
Currenly we can encode the 'st_other' field of symbol using 3 fields.
'Visibility' is used to encode STV_* values.
'Other' is used to encode everything except the visibility, but it can't handle arbitrary values.
'StOther' is used to encode arbitrary values when 'Visibility'/'Other' are not helpfull enough.
'st_other' field is used to encode symbol visibility and platform-dependent
flags and values. Problem to encode it is that it consists of Visibility part (STV_* values)
which are enumeration values and the Other part, which is different and inconsistent.
For MIPS the Other part contains flags for all STO_MIPS_* values except STO_MIPS_MIPS16.
(Like comment in ELFDumper says: "Someones in their infinite wisdom decided to make
STO_MIPS_MIPS16 flag overlapped with other ST_MIPS_xxx flags."...)
And for PPC64 the Other part might actually encode any value.
This patch implements custom logic for handling the st_other and removes
'Visibility' and 'StOther' fields.
Here is an example of a new YAML style this patch allows:
- Name: foo
Other: [ 0x4 ]
- Name: bar
Other: [ STV_PROTECTED, 4 ]
- Name: zed
Other: [ STV_PROTECTED, STO_MIPS_OPTIONAL, 0xf8 ]
Differential revision: https://reviews.llvm.org/D66886
llvm-svn: 370472
This allows llvm-readobj to print the contents of each resource
when printing resources from an object file or executable, like it
already does for plain .res files.
This requires providing the whole COFFObjectFile to ResourceSectionRef.
This supports both object files and executables. For executables,
the DataRVA field is used as is to look up the right section.
For object files, ideally we would need to complete linking of them
and fix up all relocations to know what the DataRVA field would end up
being. In practice, the only thing that makes sense for an RVA field
is an ADDR32NB relocation. Thus, find a relocation pointing at this
field, verify that it has the expected type, locate the symbol it
points at, look up the section the symbol points at, and read from the
right offset in that section.
This works both for GNU windres object files (which use one single
.rsrc section, with all relocations against the base of the .rsrc
section, with the original value of the DataRVA field being the
offset of the data from the beginning of the .rsrc section) and
cvtres object files (with two separate .rsrc$01 and .rsrc$02 sections,
and one symbol per data entry, with the original pre-relocated DataRVA
field being set to zero).
Differential Revision: https://reviews.llvm.org/D66820
llvm-svn: 370433
This allows us to produce broken binaries with local
symbols placed after global in '.dynsym'/'.symtab'
Also, simplifies the code.
Differential revision: https://reviews.llvm.org/D66799
llvm-svn: 370331
When we have a dynamic relocation with a broken symbol's st_name,
tools report a useless error: "Invalid data was encountered while parsing the file".
After this change we report a warning + "<corrupt>" as a symbol name.
Differential revision: https://reviews.llvm.org/D66734
llvm-svn: 370330
Before this change, if multiple binary files were presented, all of them must have been instrumented or the load would fail with coverage_map_error::no_data_found.
Patch by Dean Sturtevant.
Differential Revision: https://reviews.llvm.org/D66763
llvm-svn: 370257
This relands this commit, I mistakenly reverted the original change
thinking it was the cause of the observed MSan failures but it was not.
llvm-svn: 370206
Summary:
This patch implements main entry and auxiliary entries of symbol table generation for llvm-readobj on AIX.
The source code of aix_xcoff_xlc_test8.o (compile with xlc) is:
-bash-4.2$ cat test8.c
extern int i;
extern int TestforXcoff;
extern int fun(int i);
static int static_i;
char* p="abcd";
int fun1(int j) {
static_i++;
j++;
j=j+*p;
return j;
}
int main() {
i++;
fun(i);
return fun1(i);
}
Patch provided by DiggerLin
Differential Revision: https://reviews.llvm.org/D65240
llvm-svn: 370097
This is a follow up discussed in the comments of D66583.
Currently, if for example, we have both StOther and Other set in YAML document for a symbol,
then yaml2obj reports an "unknown key 'Other'" error.
It happens because 'mapOptional()' is never called for 'Other/Visibility' in this case,
leaving those unhandled.
This message does not describe the reason of the error well. This patch fixes it.
Differential revision: https://reviews.llvm.org/D66642
llvm-svn: 370032
This is a patch split from https://reviews.llvm.org/D66374. It tries to add
a new format of profile called ExtBinary. The format adds a section header
table to the profile and organize the profile in sections, so the future
extension like adding a new section or extending an existing section will be
easier while keeping backward compatiblity feasible.
Differential Revision: https://reviews.llvm.org/D66513
llvm-svn: 369798
Summary:
GNU --strip-unneeded strips debugging sections as well. Do that for llvm-objcopy as well.
Additionally, add a test that verifies we keep the .gnu_debuglink section. This apparently was not always the case, and I'm not sure which commit fixed it, but there doesn't appear to be any test coverage to make sure we continue to do so.
This fixes PR41043.
Reviewers: jhenderson, jakehehrlich, espindola, alexshap
Subscribers: emaste, arichardson, MaskRay, abrachet, seiya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66623
llvm-svn: 369761
This is a follow up of r369642.
This patch assigns a ReadAfterLd to every implicit register use of instruction
CMPXCHG8B and instruction CMPXCHG16B. Perf micro-benchmarks show that implicit
registers are read after 3cy from the start of execution.
llvm-svn: 369750
Excluding ADC/SBB and the bit-test instructions (BTR/BTS/BTC), the observed
latency of all other RMW integer arithmetic/logic instructions is 6cy and not
5cy.
Example (ADD):
```
addb $0, (%rsp) # Latency: 6cy
addb $7, (%rsp) # Latency: 6cy
addb %sil, (%rsp) # Latency: 6cy
addw $0, (%rsp) # Latency: 6cy
addw $511, (%rsp) # Latency: 6cy
addw %si, (%rsp) # Latency: 6cy
addl $0, (%rsp) # Latency: 6cy
addl $511, (%rsp) # Latency: 6cy
addl %esi, (%rsp) # Latency: 6cy
addq $0, (%rsp) # Latency: 6cy
addq $511, (%rsp) # Latency: 6cy
addq %rsi, (%rsp) # Latency: 6cy
```
The same latency profile applies to SUB/AND/OR/XOR/INC/DEC.
The observed latency of ADC/SBB is 7-8cy. So we need a different write to model
those. Latency of BTS/BTR/BTC is not fixed by this patch (they are much slower
than what the model for btver2 currently reports).
Differential Revision: https://reviews.llvm.org/D66636
llvm-svn: 369748
ExtName should not be decorated, just like Name.
This avoids double decoration on symbols in import libraries
that use = for renaming functions. (Weak aliases, which use ==,
worked fine with respect to decoration.)
Differential Revision: https://reviews.llvm.org/D66617
llvm-svn: 369747
st_other field of a symbol usually contains its visibility.
Other bits are usually 0, though some targets, like
MIPS can set them using the named bit field values.
Problem is that there is no way to set an arbitrary value now,
though that might be useful for our test cases.
In this patch I introduced a way to set st_other to any numeric
value using the new StOther field.
I added a test and simplified the existent one to show the effect/benefit
Differential revision: https://reviews.llvm.org/D66583
llvm-svn: 369742
This fixes https://bugs.llvm.org/show_bug.cgi?id=40337.
Previously, it was always assumed that relocations referenced symbols in the static symbol table.
Now, if the Link field references a section called ".dynsym" it will look up these symbols
in the dynamic symbol table.
This patch is heavily based on D59097 by James Henderson
Differential revision: https://reviews.llvm.org/D66532
llvm-svn: 369645
On Jaguar, XCHG has a latency of 1cy and decodes to 2 macro-opcodes. Maximum
throughput for XCHG is 1 IPC. The byte exchange has worse latency and decodes to
1 extra uOP; maximum observed throughput is 0.5 IPC.
```
xchgb %cl, %dl # Latency: 2cy - uOPs: 3 - 2 ALU
xchgw %cx, %dx # Latency: 1cy - uOPs: 2 - 2 ALU
xchgl %ecx, %edx # Latency: 1cy - uOPs: 2 - 2 ALU
xchgq %rcx, %rdx # Latency: 1cy - uOPs: 2 - 2 ALU
```
The reg-mem forms of XCHG are atomic operations with an observed latency of
16cy. The resource usage is similar to the XCHGrr variants. The biggest
difference is obviously the bus-locking, which prevents the LS to issue other
memory uOPs in parallel until the unlocking store uOP is executed.
```
xchgb %cl, (%rsp) # Latency: 16cy - uOPs: 3 - ECX latency: 11cy
xchgw %cx, (%rsp) # Latency: 16cy - uOPs: 3 - ECX latency: 11cy
xchgl %ecx, (%rsp) # Latency: 16cy - uOPs: 3 - ECX latency: 11cy
xchgq %rcx, (%rsp) # Latency: 16cy - uOPs: 3 - ECX latency: 11cy
```
The exchanged in/out register operand becomes available after 11cy from the
start of execution. Added test xchg.s to verify that we correctly see that
register write committed in 11cy (and not 16cy).
Reg-reg XADD instructions have the same latency/throughput than the byte
exchange (register-register variant).
```
xaddb %cl, %dl # latency: 2cy - uOPs: 3 - 3 ALU
xaddw %cx, %dx # latency: 2cy - uOPs: 3 - 3 ALU
xaddl %ecx, %edx # latency: 2cy - uOPs: 3 - 3 ALU
xaddq %rcx, %rdx # latency: 2cy - uOPs: 3 - 3 ALU
```
The non-atomic RM variants have a latency of 11cy, and decode to 4
macro-opcodes. They still consume 2 ALU pipes, and the exchange in/out register
operand becomes available in 3cy (it matches the 'load-to-use latency').
```
xaddb %cl, (%rsp) # latency: 11cy - uOPs: 4 - 3 ALU
xaddw %cx, (%rsp) # latency: 11cy - uOPs: 4 - 3 ALU
xaddl %ecx, (%rsp) # latency: 11cy - uOPs: 4 - 3 ALU
xaddq %rcx, (%rsp) # latency: 11cy - uOPs: 4 - 3 ALU
```
The atomic XADD variants execute in 16cy. The in/out register operand is
available after 11cy from the start of execution.
```
lock xaddb %cl, (%rsp) # latency: 16cy - uOPs: 4 - 3 ALU -- ECX latency: 11cy
lock xaddw %cx, (%rsp) # latency: 16cy - uOPs: 4 - 3 ALU -- ECX latency: 11cy
lock xaddl %ecx, (%rsp) # latency: 16cy - uOPs: 4 - 3 ALU -- ECX latency: 11cy
lock xaddq %rcx, (%rsp) # latency: 16cy - uOPs: 4 - 3 ALU -- ECX latency: 11cy
```
Added test xadd.s to verify those latencies as well as read-advance values.
Differential Revision: https://reviews.llvm.org/D66535
llvm-svn: 369642
The error reporting function are not consistent.
Before this change:
* They had inconsistent naming (e.g. 'error' vs 'report_error').
* Some of them reported the object name, others - dont.
* Some of them accepted the case when there was no error. (i.e. error code or Error had a success value).
This patch tries to cleanup it a bit.
It also renames report_error -> reportError, report_warning -> reportWarning
and removes a full stop from messages.
Differential revision: https://reviews.llvm.org/D66418
llvm-svn: 369515
Summary:
StringMap is used for storing call target to frequency map for AutoFDO. However the iterating order of StringMap is non-deterministic, which leads to non-determinism in AutoFDO profile output. Now new API getSortedCallTargets and SortCallTargets are added for deterministic ordering and output.
Roundtrip test for text profile and binary profile is added.
Reviewers: wmi, davidxl, danielcdh
Subscribers: hiraditya, mgrang, llvm-commits, twoh
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66191
llvm-svn: 369440
test/llvm-objcopy/ELF/error-format.test is similar to test/llvm-readobj/error-format.test added in D66425.
Reviewed By: grimar, jhenderson
Differential Revision: https://reviews.llvm.org/D66476
llvm-svn: 369392
Currently the warning message of `llvm-strip %t.o %t.o` does not include
the trailing newline. Fix this by appending a '\n'.
This is the only warning llvm-objcopy and llvm-strip can issue.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D66475
llvm-svn: 369391
Latency and throughput of LOCK INC/DEC/NEG/NOT is always 19cy.
Number of uOPs is still 1.
Differential Revision: https://reviews.llvm.org/D66469
llvm-svn: 369388
One of the report_error functions was taking object::Archive::Child as an
argument. It feels excessive, this patch removes it and introduce a helper
function instead. Also I fixed a "TODO" in this patch what improved the message printed.
Differential revision: https://reviews.llvm.org/D66468
llvm-svn: 369382
The type_offset field is 8 bytes long in DWARF64. The patch extends
TypeOffset to uint64_t and fixes its reading. The patch also fixes
checking of TypeOffset bounds as it was inaccurate in DWARF64 case.
Differential Revision: https://reviews.llvm.org/D66465
llvm-svn: 369378
Summary:
Currently, we report:
error: ...
Prepend argv[0] (tool name):
llvm-readobj: error: ...
This is consistent with most GNU binutils/clang/lld, and gives a bit
more context in a long build log.
Reviewed By: grimar, jhenderson, rupprecht
Differential Revision: https://reviews.llvm.org/D66425
llvm-svn: 369377