Commit Graph

2256 Commits

Author SHA1 Message Date
Andrea Di Biagio ff630c2cdc [llvm-mca][BtVer2] teach how to identify false dependencies on partially written
registers.

The goal of this patch is to improve the throughput analysis in llvm-mca for the
case where instructions perform partial register writes.

On x86, partial register writes are quite difficult to model, mainly because
different processors tend to implement different register merging schemes in
hardware.

When the code contains partial register writes, the IPC (instructions per
cycles) estimated by llvm-mca tends to diverge quite significantly from the
observed IPC (using perf).

Modern AMD processors (at least, from Bulldozer onwards) don't rename partial
registers. Quoting Agner Fog's microarchitecture.pdf:
" The processor always keeps the different parts of an integer register together.
For example, AL and AH are not treated as independent by the out-of-order
execution mechanism. An instruction that writes to part of a register will
therefore have a false dependence on any previous write to the same register or
any part of it."

This patch is a first important step towards improving the analysis of partial
register updates. It changes the semantic of RegisterFile descriptors in
tablegen, and teaches llvm-mca how to identify false dependences in the presence
of partial register writes (for more details: see the new code comments in
include/Target/TargetSchedule.h - class RegisterFile).

This patch doesn't address the case where a write to a part of a register is
followed by a read from the whole register.  On Intel chips, high8 registers
(AH/BH/CH/DH)) can be stored in separate physical registers. However, a later
(dirty) read of the full register (example: AX/EAX) triggers a merge uOp, which
adds extra latency (and potentially affects the pipe usage).
This is a very interesting article on the subject with a very informative answer
from Peter Cordes:
https://stackoverflow.com/questions/45660139/how-exactly-do-partial-registers-on-haswell-skylake-perform-writing-al-seems-to

In future, the definition of RegisterFile can be extended with extra information
that may be used to identify delays caused by merge opcodes triggered by a dirty
read of a partial write.

Differential Revision: https://reviews.llvm.org/D49196

llvm-svn: 337123
2018-07-15 11:01:38 +00:00
Nico Weber 337e241d58 Attempt to get test/tools/llvm-lib/help.test passing on sanitizer-x86_64-linux-fast
The bot has a /b directory, so /? matches against that and gets expanded to it.

(Thanks to Hans's r187366, which solved the same problem for clang-cl a while
ago and which saved me much head scratching.)

llvm-svn: 337092
2018-07-14 11:33:33 +00:00
Nico Weber 17172c6b80 Give llvm-lib rudimentary help output.
https://reviews.llvm.org/D49318

llvm-svn: 337084
2018-07-14 02:29:44 +00:00
Jonas Devlieghere 327e7a1608 [dwarfdump] Add pretty printer for accelerator table based on Atom.
For instance, When dumping .apple_types, the second atom represents the
DW_TAG. In addition to printing the raw value, we now also pretty print
the value if the ATOM tells us how.

llvm-svn: 337026
2018-07-13 17:21:51 +00:00
Andrea Di Biagio e86e6efea1 [llvm-mca][BtVer2] Add tests for dependency breaking instructions.
llvm-svn: 337024
2018-07-13 16:46:51 +00:00
Joel Galenson 667eac80da [cfi-verify] Only run AArch64 tests when it is a supported target
This stops the tests I added in r337007 from running when AArch64 is not a supported target.

llvm-svn: 337012
2018-07-13 16:09:19 +00:00
Joel Galenson 06e7e5798f [cfi-verify] Support AArch64.
This patch adds support for AArch64 to cfi-verify.

This required three changes to cfi-verify.  First, it generalizes checking if an instruction is a trap by adding a new isTrap flag to TableGen (and defining it for x86 and AArch64).  Second, the code that ensures that the operand register is not clobbered between the CFI check and the indirect call needs to allow a single dereference (in x86 this happens as part of the jump instruction).  Third, we needed to ensure that return instructions are not counted as indirect branches.  Technically, returns are indirect branches and can be covered by CFI, but LLVM's forward-edge CFI does not protect them, and x86 does not consider them, so we keep that behavior.

In addition, we had to improve AArch64's code to evaluate the branch target of a MCInst to handle calls where the destination is not the first operand (which it often is not).

Differential Revision: https://reviews.llvm.org/D48836

llvm-svn: 337007
2018-07-13 15:19:33 +00:00
Dean Michael Berris 10141261e1 [XRay][compiler-rt] Add PID field to llvm-xray tool and add PID metadata record entry in FDR mode
Summary:
llvm-xray changes:
- account-mode - process-id  {...} shows after thread-id
- convert-mode - process {...} shows after thread
- parses FDR and basic mode pid entries
- Checks version number for FDR log parsing.

Basic logging changes:
- Update header version from 2 -> 3

FDR logging changes:
- Update header version from 2 -> 3
- in writeBufferPreamble, there is an additional PID Metadata record (after thread id record and tsc record)

Test cases changes:
- fdr-mode.cc, fdr-single-thread.cc, fdr-thread-order.cc modified to catch process id output in the log.

Reviewers: dberris

Reviewed By: dberris

Subscribers: hiraditya, llvm-commits, #sanitizers

Differential Revision: https://reviews.llvm.org/D49153

llvm-svn: 336974
2018-07-13 05:38:22 +00:00
Bill Wendling 7bd9e94e38 [gold-plugin] Disable section ordering for relocatable links
Not all programs want section ordering when compiled with LTO.
In particular, the Linux kernel is very sensitive when it comes to linking, and
doesn't boot when each function is placed in its own sections.

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D48756

llvm-svn: 336943
2018-07-12 20:35:58 +00:00
Stephen Hines e8c3c5fe5d Add --strip-all option back to llvm-strip.
Summary:
This option appears to have been dropped as part of the refactoring in
r331663. Unfortunately, if we want to use llvm-strip as a drop-in
replacement for strip, this option should still be available.

Reviewers: alexshap

Reviewed By: alexshap

Subscribers: meikeb, kongyi, chh, jakehehrlich, llvm-commits, pirama

Differential Revision: https://reviews.llvm.org/D49226

llvm-svn: 336921
2018-07-12 17:42:17 +00:00
Paul Semel 0f9ca2d960 Revert "[llvm-objdump] Add -demangle (-C) option"
This reverts commit 3a44ccd156e0edd2e89226f8ed63928e227900bb.
This reverts commit d5cfc836bb5552e20507d3612d13ff66ff9e36a0.

llvm-svn: 336829
2018-07-11 18:09:52 +00:00
Paul Semel 569200aab8 Fix llvm-objdump demangle test (added triple option)
llvm-svn: 336821
2018-07-11 16:31:33 +00:00
Andrea Di Biagio 483db141e3 [X86] Fix MayLoad/HasSideEffect flag for (V)MOVLPSrm instructions.
Before revision 336728, the "mayLoad" flag for instruction (V)MOVLPSrm was
inferred directly from the "default" pattern associated with the instruction
definition.

r336728 removed special node X86Movlps, and all the patterns associated to it.
Now instruction (V)MOVLPSrm doesn't have a pattern associated to it, and the
'mayLoad/hasSideEffects' flags are left unset.

When the instruction info is emitted by tablegen, method
CodeGenDAGPatterns::InferInstructionFlags() sees that (V)MOVLPSrm doesn't have a
pattern, and flags are undefined. So, it conservatively sets the
"hasSideEffects" flag for it.

As a consequence, we were losing the 'mayLoad' flag, and we were gaining a
'hasSideEffect' flag in its place.
This patch fixes the issue (originally reported by Michael Holmen).

The mca tests show the differences in the instruction info flags.  Instructions
that were affected by this problem were: MOVLPSrm/VMOVLPSrm/VMOVLPSZ128rm.

Differential Revision: https://reviews.llvm.org/D49182

llvm-svn: 336818
2018-07-11 15:27:50 +00:00
Paul Semel bcf55ab95a [llvm-objdump] Add -demangle (-C) option
Differential Revision: https://reviews.llvm.org/D49043

llvm-svn: 336816
2018-07-11 15:25:39 +00:00
Andrea Di Biagio d2e2c053cf [llvm-mca] Use a different character to flag instructions with side-effects in the Instruction Info View. NFC
This makes easier to identify changes in the instruction info flags.  It also
helps spotting potential regressions similar to the one recently introduced at
r336728.

Using the same character to mark MayLoad/MayStore/HasSideEffects is problematic
for llvm-lit. When pattern matching substrings, llvm-lit consumes tabs and
spaces. A change in position of the flag marker may not trigger a test failure.

This patch only changes the character used for flag `hasSideEffects`. The reason
why I didn't touch other flags is because I want to avoid spamming the mailing
because of the massive diff due to the numerous tests affected by this change.

In future, each instruction flag should be associated with a different character
in the Instruction Info View.

llvm-svn: 336797
2018-07-11 12:44:44 +00:00
Paul Semel b98f504850 [llvm-readobj] Add -hex-dump (-x) option
Differential Revision: https://reviews.llvm.org/D48281

llvm-svn: 336782
2018-07-11 10:00:29 +00:00
Andrea Di Biagio 2b3a4f9c9b [llvm-mca] Add tests for partial register writes.
llvm-mca doesn't know that on modern AMD processors, portions of a general
purpose register are not treated independently. So, a partial register write has
a false dependency on the super-register.

The issue with partial register writes will be addressed by a follow-up patch.

llvm-svn: 336778
2018-07-11 09:50:00 +00:00
Jonas Devlieghere 82dee6aca8 [dsymutil] Add support for outputting assembly
When implementing the DWARF accelerator tables in dsymutil I ran into an
assertion in the assembler. Debugging these kind of issues is a lot
easier when looking at the assembly instead of debugging the assembler
itself. Since it's only a matter of creating an AsmStreamer instead of a
MCObjectStreamer it made sense to turn this into a (hidden) dsymutil
feature.

Differential revision: https://reviews.llvm.org/D49079

llvm-svn: 336561
2018-07-09 16:58:48 +00:00
Andrea Di Biagio 8834779644 [llvm-mca] report an error if the assembly sequence contains an unsupported instruction.
This is a short-term fix for PR38093.
For now, we llvm::report_fatal_error if the instruction builder finds an
unsupported instruction in the instruction stream.

We need to revisit this fix once we start addressing PR38101.
Essentially, we need a better framework for error handling.

llvm-svn: 336543
2018-07-09 12:30:55 +00:00
Roman Lebedev 0e58dee284 [MCA][X86][NFC] Add BSF/BSR resource tests
Reviewers: RKSimon, andreadb, courbet

Reviewed By: RKSimon

Subscribers: gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D48997

llvm-svn: 336510
2018-07-08 09:50:14 +00:00
Alexander Shaposhnikov 42b5ef0269 [llvm-objcopy] Add support for static libraries
This diff adds support for handling static libraries 
to llvm-objcopy and llvm-strip.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D48413

llvm-svn: 336455
2018-07-06 17:51:03 +00:00
Andrea Di Biagio 61c52af9d9 [llvm-mca] improve the instruction issue logic implemented by the Scheduler.
This patch modifies the Scheduler heuristic used to select the next instruction
to issue to the pipelines.

The motivating example is test X86/BtVer2/add-sequence.s, for which llvm-mca
wrongly reported an estimated IPC of 1.50. According to perf, the actual IPC for
that test should have been ~2.00.
It turns out that an IPC of 2.00 for test add-sequence.s cannot possibly be
predicted by a Scheduler that only prioritizes instructions based on their
"age". A similar issue also affected test X86/BtVer2/dependent-pmuld-paddd.s,
for which llvm-mca wrongly estimated an IPC of 0.84 instead of an IPC of 1.00.

Instructions in the ReadyQueue are now ranked based on two factors:
 - The "age" of an instruction.
 - The number of unique users of writes associated with an instruction.

The new logic still prioritizes older instructions over younger instructions to
minimize the pressure on the reorder buffer. However, the number of users of an
instruction now also affects the overall rank. This potentially increases the
ability of the Scheduler to extract instruction level parallelism.  This patch
fixes the problem with the wrong IPC reported for test add-sequence.s and test
dependent-pmuld-paddd.s.

llvm-svn: 336420
2018-07-06 08:08:30 +00:00
Dave Lee 390abe4a75 Reapply: "objdump: Support newer ObjC image info flags"
Summary:
Add support for two additional ObjC image info flags: `IS_SIMULATED` and
`HAS_CATEGORY_CLASS_PROPERTIES`.

`IS_SIMULATED` indicates a Mach-O binary built for iOS simulator.

`HAS_CATEGORY_CLASS_PROPERTIES` indicates a Mach-O binary built by a compiler
that supports class properties in categories.

Reviewers: enderby, compnerd

Reviewed By: compnerd

Subscribers: keith, llvm-commits

Differential Revision: https://reviews.llvm.org/D48568

llvm-svn: 336411
2018-07-06 05:11:35 +00:00
Dave Lee e6de96410b Revert "objdump: Support newer ObjC image info flags"
This reverts commit 8c4cc472e7a67bd3b2b20cc4cf32d31af29bc7e9.

llvm-svn: 336402
2018-07-06 00:13:21 +00:00
Dave Lee 9e412ec8f2 objdump: Support newer ObjC image info flags
Summary:
Add support for two additional ObjC image info flags: `IS_SIMULATED` and
`HAS_CATEGORY_CLASS_PROPERTIES`.

`IS_SIMULATED` indicates a Mach-O binary built for iOS simulator.

`HAS_CATEGORY_CLASS_PROPERTIES` indicates a Mach-O binary built by a compiler
that supports class properties in categories.

Reviewers: enderby, compnerd

Reviewed By: compnerd

Subscribers: keith, llvm-commits

Differential Revision: https://reviews.llvm.org/D48568

llvm-svn: 336399
2018-07-05 23:32:15 +00:00
Paul Semel 91c9d4251c [llvm-objdump] Removed archive-headers-disas test
This test is failing because of the disas part.
For the moment, I will juste remove it. I will add it again tomorrow
with a proper fix.

llvm-svn: 336370
2018-07-05 16:49:46 +00:00
Paul Semel 63e4008718 [llvm-objcopy] Fix timezone dependant tests
llvm-svn: 336363
2018-07-05 15:24:11 +00:00
Paul Semel 0dc92f6a74 [llvm-objdump] Add --archive-headers (-a) option
llvm-svn: 336357
2018-07-05 14:43:29 +00:00
Roman Lebedev 0dd27042c6 [X86][BtVer2][MCA][NFC] Add CMPEQ dependency-breaking one-idioms tests
Summary: As per `Agner's Microarchitecture doc
(21.8 AMD Bobcat and Jaguar pipeline - Dependency-breaking instructions)`,
these, like zero-idioms, are dependency-breaking,
although they produce ones and still consume resources.

FIXME: as discussed in D48877, llvm-mca handling is broken for these.

Reviewers: andreadb

Reviewed By: andreadb

Subscribers: gbedwell, RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D48876

llvm-svn: 336292
2018-07-04 17:32:44 +00:00
Paul Semel d2af4d6f1b [llvm-objdump] Add --file-headers (-f) option
llvm-svn: 336284
2018-07-04 15:25:03 +00:00
Gabor Buella da4a966e1c NFC - Various typo fixes in tests
llvm-svn: 336268
2018-07-04 13:28:39 +00:00
Teresa Johnson 50615c72b4 Remove absolute path in test
My test change in r336148 accidentally included an absolute path, clean
that up to fix bot failures.

llvm-svn: 336151
2018-07-02 23:02:07 +00:00
Teresa Johnson 8fc766681d [ThinLTO] Fix printing of module paths for distributed backend indexes
Summary:
In the individual index files emitted for distributed ThinLTO backends,
the module path ids are not contiguous. Assign slots to module paths in
order to handle this better and also to get contiguous numbering in the
summary assembly.

Reviewers: davidxl, dexonsmith

Subscribers: mehdi_amini, inglorion, eraman, llvm-commits, steven_wu

Differential Revision: https://reviews.llvm.org/D48698

llvm-svn: 336148
2018-07-02 22:09:23 +00:00
Fangrui Song f50ad6c311 Replace unused output filenames with /dev/null in tests
Similar to rLLD336129

llvm-svn: 336131
2018-07-02 18:16:44 +00:00
Dave Lee d4f77a523b nm: Add -no-weak flag for hiding weak symbols
Summary:
This adds a new -no-weak flag to nm to hide weak symbols in its output.
This also adds a -W alias for this which is analogous to -U.

Patch by Keith Smiley

Reviewers: kastiglione, enderby, compnerd

Reviewed By: kastiglione

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48751

llvm-svn: 336126
2018-07-02 17:24:37 +00:00
Paul Semel 8dabda70af Revert "[llvm-readobj] Fix printing format"
There is a problem with the formatting on windows build.
I need to investigate on this.

llvm-svn: 336061
2018-07-01 11:54:09 +00:00
Paul Semel 49997adc88 [llvm-readobj] Fix printing format
We were printing every character, even those that weren't printable. It
doesn't really make sense for this option.

The string content was sticked to its address, added two spaces in
between.

Differential Revision: https://reviews.llvm.org/D48271

llvm-svn: 336058
2018-07-01 09:51:59 +00:00
Jonas Devlieghere a0857eaefe [dsymutil] Make the CachedBinaryHolder the default
Replaces all uses of the old binary holder with its cached variant.

Differential revision: https://reviews.llvm.org/D48770

llvm-svn: 335991
2018-06-29 16:51:52 +00:00
Sterling Augustine 0cf1f15e83 Require x86 for this test.
llvm-svn: 335939
2018-06-28 23:22:14 +00:00
Jake Ehrlich 0f440d832f [llvm-readobj] Add experimental support for SHT_RELR sections
This change adds experimental support for SHT_RELR sections, proposed
here: https://groups.google.com/forum/#!topic/generic-abi/bX460iggiKg

Definitions for the new ELF section type and dynamic array tags, as well
as the encoding used in the new section are all under discussion and are
subject to change. Use with caution!

Author: rahulchaudhry

Differential Revision: https://reviews.llvm.org/D47919

llvm-svn: 335922
2018-06-28 21:07:34 +00:00
Sterling Augustine 052ce120d5 Some targets don't have lld built, so just use a binary copy
of the input file.

llvm-svn: 335908
2018-06-28 19:47:23 +00:00
Sterling Augustine bc78b62169 Handle absolute symbols as branch targets in disassembly.
https://reviews.llvm.org/D48554

llvm-svn: 335903
2018-06-28 18:57:13 +00:00
Simon Pilgrim 83125594ed [llvm-mca][x86] Add FMA4 resource tests
We should be ensuring we have (near) complete test coverage of instructions, at least for the generic model.

llvm-svn: 335870
2018-06-28 16:24:13 +00:00
Simon Pilgrim 12f9503d40 [llvm-mca][x86] Add 3dnow! resource tests
We should be ensuring we have (near) complete test coverage of instructions, at least for the generic model.

llvm-svn: 335869
2018-06-28 16:21:22 +00:00
Fangrui Song ee15d3dcdb Move `REQUIRES:` line to the top
llvm-svn: 335635
2018-06-26 17:44:23 +00:00
Tim Northover f2f9f2f505 ARM: add binary file git swallowed.
Should fix bots.

llvm-svn: 335596
2018-06-26 12:28:47 +00:00
Tim Northover bf54858115 ARM: diagnose unpredictable IT instructions
IT instructions are allowed to have the 'AL' predicate, but it must never
result in an 'NV' predicated instruction. Essentially this means that all
branches must be 't' rather than 'e' if the predicate is 'AL'.

This patch adds a diagnostic for this during assembly (error because parsing
hits an assertion if allowed to continue) and an annotation during disassembly.

llvm-svn: 335593
2018-06-26 11:38:41 +00:00
Vedant Kumar b725c69f12 [SelectionDAG] Remove debug locations from ConstantSD(FP)Nodes
This removes debug locations from ConstantSDNode and ConstantSDFPNode.

When this kind of node is materialized we no longer create a line table
entry which jumps back to the constant's first point of use. This makes
single-stepping behavior smoother, and it matches the model used by IR,
where Constants have no locations. See this thread for more context:

  http://lists.llvm.org/pipermail/llvm-dev/2018-June/124164.html

I'd like to handle constant BuildVectorSDNodes and to try to eliminate
passing SDLocs to SelectionDAG::getConstant*() in follow-up commits.

Differential Revision: https://reviews.llvm.org/D48468

llvm-svn: 335497
2018-06-25 17:06:18 +00:00
Jonas Devlieghere fb54074112 [llvm-mt] Use WithColor for printing errors.
Use the WithColor helper from support to print errors.

llvm-svn: 335416
2018-06-23 16:49:07 +00:00
Eugene Leviant da873b5e2e [LIT] Enable testing of LLVM gold plugin on Mac OS X
Differential revision: https://reviews.llvm.org/D48350

llvm-svn: 335136
2018-06-20 15:32:47 +00:00
Andrea Di Biagio 2145b13fc9 [llvm-mca][X86] Teach how to identify register writes that implicitly clear the upper portion of a super-register.
This patch teaches llvm-mca how to identify register writes that implicitly zero
the upper portion of a super-register.

On X86-64, a general purpose register is implemented in hardware as a 64-bit
register. Quoting the Intel 64 Software Developer's Manual: "an update to the
lower 32 bits of a 64 bit integer register is architecturally defined to zero
extend the upper 32 bits".  Also, a write to an XMM register performed by an AVX
instruction implicitly zeroes the upper 128 bits of the aliasing YMM register.

This patch adds a new method named clearsSuperRegisters to the MCInstrAnalysis
interface to help identify instructions that implicitly clear the upper portion
of a super-register.  The rest of the patch teaches llvm-mca how to use that new
method to obtain the information, and update the register dependencies
accordingly.

I compared the kernels from tests clear-super-register-1.s and
clear-super-register-2.s against the output from perf on btver2.  Previously
there was a large discrepancy between the estimated IPC and the measured IPC.
Now the differences are mostly in the noise.

Differential Revision: https://reviews.llvm.org/D48225

llvm-svn: 335113
2018-06-20 10:08:11 +00:00
Roman Lebedev d23b6831de [X86][Znver1] Specify Register Files, RCU; FP scheduler capacity.
Summary:
First off: i do not have any access to that processor,
so this is purely theoretical, no benchmarks.

I have been looking into b**d**ver2 scheduling profile, and while cross-referencing
the existing b**t**ver2, znver1 profiles, and the reference docs
(`Software Optimization Guide for AMD Family {15,16,17}h Processors`),
i have noticed that only b**t**ver2 scheduling profile specifies these.

Also, there is no mca test coverage.

Reviewers: RKSimon, craig.topper, courbet, GGanesh, andreadb

Reviewed By: GGanesh

Subscribers: gbedwell, vprasad, ddibyend, shivaram, Ashutosh, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D47676

llvm-svn: 335099
2018-06-20 07:01:14 +00:00
Clement Courbet e0aa30008f [X86] Fix r335097
Missed `Generic` test in llvm-mca.

llvm-svn: 335098
2018-06-20 06:44:13 +00:00
Clement Courbet 7b9913fb9f [X86] Add sched class WriteLAHFSAHF and fix values.
Summary:
I ran llvm-exegesis on SKX, SKL, BDW, HSW, SNB.
Atom is from Agner and SLM is a guess.
I've left AMD processors alone.

Reviewers: RKSimon, craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48079

llvm-svn: 335097
2018-06-20 06:13:39 +00:00
Roman Lebedev ae0527aac9 [MCA][NFC] Add generic XOP resource tests
Summary:
Based on
* [[ https://support.amd.com/TechDocs/43479.pdf | AMD64 Architecture Programmer’s Manual Volume 6: 128-Bit and 256-Bit XOP and FMA4 Instructions ]],
* [[ https://support.amd.com/TechDocs/24594.pdf | AMD64 Architecture Programmer’s Manual Volume 3: General-Purpose and System Instructions]],
* https://en.wikipedia.org/wiki/XOP_instruction_set

Appears to be only supported in AMD's 15h generation, so only in b**d**ver[1-4],
for which currently llvm has no scheduling profiles.

Reviewers: RKSimon, craig.topper, andreadb, spatel

Reviewed By: RKSimon

Subscribers: gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D48264

llvm-svn: 335034
2018-06-19 09:21:27 +00:00
Roman Lebedev 0d12c1685b [MCA][NFC] Add generic TBM resource tests
Summary:
Based on https://support.amd.com/TechDocs/24594.pdf,
https://en.wikipedia.org/wiki/Bit_Manipulation_Instruction_Sets#TBM_(Trailing_Bit_Manipulation)

Appears to be only supported in AMD's 15h generation, so only in b**d**ver[1-4],
for which currently llvm has no scheduling profiles.

Reviewers: RKSimon, craig.topper, simark, andreadb

Reviewed By: RKSimon

Subscribers: gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D48252

llvm-svn: 335033
2018-06-19 09:21:22 +00:00
Andrea Di Biagio a88281d8ae [llvm-mca] Use an ordered map to collect hardware statistics. NFC.
Histogram entries are now ordered by key.  This should improves their
readability when statistics are printed.

llvm-svn: 334961
2018-06-18 17:04:56 +00:00
Andrea Di Biagio 487da729a2 [llvm-mca] Add tests for XOP and AVX512 instructions that implicitly clear the upper portion of a super-register.
When the destination register of a XOP instruction is an XMM register, bits
[255:128] of the corresponding YMM register are cleared.

When the destination register of a EVEX encoded instruction is an XMM/YMM
register, the upper bits of the corresponding ZMM are cleared.
On processors that feature AVX512, a write to an XMM registers always clears the
upper portion of the corresponding ZMM register if the instruction is VEX or
EVEX encoded.

These new tests show some interesting cases which aren't correctly analyzed by
llvm-mca. The lack of knowledge related to the implicit update on the
super-registers is addressed by D48225.

llvm-svn: 334945
2018-06-18 14:00:30 +00:00
Clement Courbet 0d9da88d18 [X86] Fix NOOP sched overrides on BDW/HSW/SKL.
Summary: Noop certainly does not use resources.

Reviewers: RKSimon, craig.topper, andreadb

Subscribers: gbedwell, llvm-commits, gchatelet

Differential Revision: https://reviews.llvm.org/D48028

llvm-svn: 334927
2018-06-18 06:48:22 +00:00
Simon Pilgrim e930f569f7 [llvm-mca][X86] Add some avx512f/avx512vl resource test placeholders
There are a lot of instructions to add under these ISAs (and the other AVX512 variants) but this should demonstrate how to test for the EVEX instructions with different maskings

llvm-svn: 334907
2018-06-17 16:25:48 +00:00
Simon Pilgrim f5ecd8d50d [llvm-mca][x86] Add Generic cpu resource tests
Added a Generic x86 cpu set of resource tests to allow us to check all ISAs.

We currently use SandyBridge as our generic CPU model, but it's better if we actually duplicate these tests for if/when we change the model, it also means we don't end up polluting the SandyBridge folder with tests for ISAs it doesn't support.

llvm-svn: 334853
2018-06-15 18:35:25 +00:00
Roman Lebedev 9ddf128f79 [MCA] Add -summary-view option
Summary:
While that is indeed a quite interesting summary stat,
there are cases where it does not really add anything
other than consuming extra lines.

Declutters the output of D48190.

Reviewers: RKSimon, andreadb, courbet, craig.topper

Reviewed By: andreadb

Subscribers: javed.absar, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D48209

llvm-svn: 334833
2018-06-15 14:01:43 +00:00
Roman Lebedev 7c423001e4 [MCA][x86][NFC] Add tests for -register-file-stats, -scheduler-stats
Summary:
There does not seem to be any other tests for this.
Split off from D47676.

Reviewers: RKSimon, craig.topper, courbet, andreadb

Reviewed By: andreadb

Subscribers: javed.absar, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D48190

llvm-svn: 334832
2018-06-15 14:01:35 +00:00
Andrea Di Biagio 4cafb297d5 [llvm-mca] Add tests for instructions that implicitly clear the upper portion of a super-register.
On x86-64, a write to register EAX implicitly clears the upper half or RAX.
128-bit AVX instructions clear the upper 128-bit of the YMM register that
aliases the XMM definition register.

llvm-mca doesn't know about register writes that implicitly clear the upper
portion of an aliasing super-register. This issue will be fixed in a future patch.

llvm-svn: 334742
2018-06-14 17:48:42 +00:00
Andrea Di Biagio 4729d1ff27 [llvm-mca] Add another test for partial register stalls.
This test checks that a physical register is correctly allocated for the partial
write to register BX.
The ADD instruction has to wait for the write to RBX (and BX) before being
executed.

llvm-svn: 334730
2018-06-14 15:54:34 +00:00
Andrea Di Biagio 0ffb2271a1 [llvm-mca] Fixed a bug in the logic that checks if a memory operation is ready to execute.
Fixes PR37790.

In some (very rare) cases, the LSUnit (Load/Store unit) was wrongly marking a
load (or store) as "ready to execute" effectively bypassing older memory barrier
instructions.

To reproduce this bug, the memory barrier must be the first instruction in the
input assembly sequence, and it doesn't have to perform any register writes.

llvm-svn: 334633
2018-06-13 18:30:14 +00:00
Pavel Labath 4adc88ed25 [DWARF/AccelTable] Remove getDIESectionOffset for DWARF v5 entries
Summary:
This method was not correct for entries in DWO files as it assumed it
could just add up the CU and DIE offsets to get the absolute DIE offset.
This is not correct for the DWO files, as here the CU offset will
reference the skeleton unit, whereas the DIE offset will be the offset
in the full unit in the DWO file.

Unfortunately, this means that we are not able to determine the absolute
DIE offset using the information in the .debug_names section alone,
which means we have to offload some of this work to the users of this
class.

To demonstrate how this can be done, I've added/fixed the ability to
lookup entries using accelerator tables in DWO files in llvm-dwarfdump.
To make this happen, I've needed to make two extra changes in other
classes:
- made the DWARFContext method to lookup a CU based on the section
  offset public. I've needed this functionality to lookup a CU, and this
  seems like a useful thing in general.
- made DWARFUnit::getDWOId call extractDIEsIfNeeded. Before this, the
  DWOId was filled in only if the root DIE happened to be parsed
  before we called the accessor. Since the lazy parsing is supposed to
  happen under the hood, calling extractDIEsIfNeeded seems appropriate.

Reviewers: JDevlieghere, aprantl, dblaikie

Subscribers: mgrang, llvm-commits

Differential Revision: https://reviews.llvm.org/D48009

llvm-svn: 334578
2018-06-13 08:14:27 +00:00
Clement Courbet 7db69cc08a [X86] Fix skylake server scheduling info.
Summary:
This fixes most of the scheduling info for SKX vector operations.
I had to split a lot of the YMM/ZMM classes into separate classes for YMM and ZMM.

The before/after llvm-exegesis analysis are in the phabricator diff.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47721

llvm-svn: 334407
2018-06-11 14:37:53 +00:00
Simon Pilgrim 89deac6694 [X86][BtVer2] Add support for all SUB/XOR 32/64 scalar instructions that should match the dependency-breaking 'zero-idiom'
As detailed on Agner's Microarchitecture doc (21.8 AMD Bobcat and Jaguar pipeline - Dependency-breaking instructions), these instructions are dependency breaking and fast-path zero the destination register (and appropriate EFLAGS bits).

llvm-svn: 334303
2018-06-08 17:00:45 +00:00
Simon Pilgrim efb4806bb9 [X86][BtVer2] Remove SBB tests that were accidentally added in rL334296
These aren't true zero-idiom instructions (just dependency breaking).

llvm-svn: 334297
2018-06-08 15:43:00 +00:00
Simon Pilgrim 53766a986d [X86][BtVer2] Add tests for scalar SUB/XOR instructions that should match the dependency-breaking 'zero-idiom'
As detailed on Agner's Microarchitecture doc (21.8 AMD Bobcat and Jaguar pipeline - Dependency-breaking instructions).

llvm-svn: 334296
2018-06-08 15:28:43 +00:00
Simon Pilgrim aafcf9e4a1 [X86][BtVer2] Limit zero idiom tests to a single iteration.
Reduces output size and we're only wanting to check that the instructions are fast-path'd (just Dispatch+Retire) anyhow

llvm-svn: 334292
2018-06-08 15:01:40 +00:00
Paul Semel e57bc78324 [llvm-strip] Expose --strip-unneeded option
Differential Revision: https://reviews.llvm.org/D47818

llvm-svn: 334182
2018-06-07 10:05:25 +00:00
Peter Collingbourne cf017ada68 llvm-readobj: fix printing number of relocations in Android packed format.
With '-elf-output-style=GNU -relocations', a header containing the number
of entries is printed before all the relocation entries in the section.
For Android packed format, we need to perform the unpacking first before
we can get the actual number of relocations in the section.

Patch by Rahul Chaudhry!

Differential Revision: https://reviews.llvm.org/D47800

llvm-svn: 334147
2018-06-07 00:02:07 +00:00
Alexander Shaposhnikov 29407f3abe [llvm-strip] Expose --discard-all option
Expose objcopy's --discard-all option in llvm-strip.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D47750

llvm-svn: 334131
2018-06-06 21:23:19 +00:00
Simon Pilgrim aef5bdbea1 [X86][BtVer2] Add support for all vector instructions that should match the dependency-breaking 'zero-idiom'
As detailed on Agner's Microarchitecture doc (21.8 AMD Bobcat and Jaguar pipeline - Dependency-breaking instructions), all these instructions are dependency breaking and zero the destination register.

llvm-svn: 334119
2018-06-06 19:06:09 +00:00
Simon Pilgrim 7a48bb6e44 [llvm-mca][x86] Fix all resources-x86_64.s tests to use different registers in reg-reg cases
I noticed while working on zero-idiom + dependency-breaking support (PR36671) that most of our binary instruction tests were reusing the same src registers, which would cause the tests to fail once we enable scalar zero-idiom support on btver2. Fixed in all targets to keep them in sync.

llvm-svn: 334110
2018-06-06 18:20:25 +00:00
Simon Pilgrim 64541ff297 [X86][BtVer2] Add tests for all vector instructions that should match the dependency-breaking 'zero-idiom'
As detailed on Agner's Microarchitecture doc (21.8 AMD Bobcat and Jaguar pipeline - Dependency-breaking instructions), all these instructions are dependency breaking and zero the destination register.

TODO: Scalar instructions still need to be tested (need to check EFLAGS handling).
llvm-svn: 334104
2018-06-06 16:14:37 +00:00
Sanjay Patel 59313be8d3 [CodeGen] assume max/default throughput for unspecified instructions
This is a fix for the problem arising in D47374 (PR37678):
https://bugs.llvm.org/show_bug.cgi?id=37678

We may not have throughput info because it's not specified in the model 
or it's not available with variant scheduling, so assume that those
instructions can execute/complete at max-issue-width.

Differential Revision: https://reviews.llvm.org/D47723

llvm-svn: 334055
2018-06-05 23:34:45 +00:00
Andrea Di Biagio 757600bccb [llvm-mca] Correctly update the CyclesLeft of a register read in the presence of partial register updates.
This patch fixe the logic in ReadState::cycleEvent(). That method was not
correctly updating field `TotalCycles`.

Added extra code comments in class ReadState to better describe each field.

llvm-svn: 334028
2018-06-05 17:12:02 +00:00
Alexander Shaposhnikov d7eaf27654 [llvm-strip] Add missing aliases for --strip-debug
Add missing aliases for --strip-debug: -g, -S, -d.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D47674

llvm-svn: 333940
2018-06-04 18:55:41 +00:00
Andrea Di Biagio 39e5a5695f [RFC][patch 3/3] Add support for variant scheduling classes in llvm-mca.
This patch is the last of a sequence of three patches related to LLVM-dev RFC
"MC support for variant scheduling classes".
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123181.html

This fixes PR36672.

The main goal of this patch is to teach llvm-mca how to solve variant scheduling
classes.  This patch does that, plus it adds new variant scheduling classes to
the BtVer2 scheduling model to identify so-called zero-idioms (i.e. so-called
dependency breaking instructions that are known to generate zero, and that are
optimized out in hardware at register renaming stage).

Without the BtVer2 change, this patch would not have had any meaningful tests.
This patch is effectively the union of two changes:
 1) a change that teaches llvm-mca how to resolve variant scheduling classes.
 2) a change to the BtVer2 scheduling model that allows us to special-case
    packed XOR zero-idioms (this partially fixes PR36671).

Differential Revision: https://reviews.llvm.org/D47374 

llvm-svn: 333909
2018-06-04 15:43:09 +00:00
Alexander Ivchenko ab60a2823f [llvm-readobj] Support GNU_PROPERTY_X86_FEATURE_1_AND notes in .note.gnu.property
Resubmit of r333424. This version contains the fix for fails found by buildbots
on some targets.

This patch allows parsing GNU_PROPERTY_X86_FEATURE_1_AND
notes in .note.gnu.property sections. These notes
indicate that the object file is built to support Intel CET.

patch by mike.dvoretsky

Differential Revision: https://reviews.llvm.org/D47473

llvm-svn: 333908
2018-06-04 15:14:18 +00:00
Greg Bedwell bbe64af0a0 [llvm-mca] Regenerate a test to remove a double newline
Command used: py update_mca_test_checks.py ..\test\tools\llvm-mca\*\*.s ..\test\tools\llvm-mca\*\*\*.s

llvm-svn: 333893
2018-06-04 12:30:03 +00:00
Roman Lebedev 7b53d1454f [llvm-mca] Make sure not to end the test files with an empty line.
Summary:
It's super irritating.

[properly configured] git client then complains about that double-newline,
and you have to use `--force` to ignore the warning, since even if you
fix it manually, it will be reintroduced the very next runtime :/

Reviewers: RKSimon, andreadb, courbet, craig.topper, javed.absar, gbedwell

Reviewed By: gbedwell

Subscribers: javed.absar, tschuett, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D47697

llvm-svn: 333887
2018-06-04 11:48:46 +00:00
Paul Semel 46201fb7bc [llvm-objcopy] Fix null symbol handling
This fixes the bug where strip-all option was
leading to a malformed outputted ELF file.

Differential Revision: https://reviews.llvm.org/D47414

llvm-svn: 333772
2018-06-01 16:19:46 +00:00
Alexander Shaposhnikov ecc84834b7 [llvm-strip] Add -o option to llvm-strip
This diff implements the option -o 
for specifying a file to write the output to.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D47505

llvm-svn: 333693
2018-05-31 20:42:13 +00:00
Clement Courbet 2e41c5a79c [X86] Introduce WriteFLDC for x87 constant loads.
Summary:
{FLDL2E, FLDL2T, FLDLG2, FLDLN2, FLDPI} were using WriteMicrocoded.

 - I've measured the values for Broadwell, Haswell, SandyBridge, Skylake.
 - For ZnVer1 and Atom, values were transferred form InstRWs.
 - For SLM and BtVer2, I've guessed some values :(

Reviewers: RKSimon, craig.topper, andreadb

Subscribers: gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D47585

llvm-svn: 333656
2018-05-31 14:22:01 +00:00
Clement Courbet b78ab5097d [X86] Extract latency of fldz/fld1 in separate classes.
Summary:
 - I've measured the values for Broadwell, Haswell, SandyBridge, Skylake.
 - For ZnVer1 and Atom, values were transferred form `InstRW`s.
 - For SLM and BtVer2, values are from Agner.

This is split off from https://reviews.llvm.org/D47377

Reviewers: RKSimon, andreadb

Subscribers: gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D47523

llvm-svn: 333642
2018-05-31 11:41:27 +00:00
Pavel Labath 59870af66f DWARFAcceleratorTable: fix equal_range iterators
Summary:
Both (Apple and DWARF5) implementations of the iterators had bugs which
resulted in crashes if one attempted to iterate through the accelerator
tables all the way.

For the Apple tables, the issue was that we did not clear the DataOffset
field when we reached the end, which made our iterator compare unequal
to the "end" iterator. For the Dwarf5 tables, the problem was that we
incremented the CurrentIndex pointer and then used the incremented
(possibly invalid) pointer to check whether we have reached the end of
the index list.

The reason these bugs went undetected is because their only user
(dwarfdump) only ever searched for the first match. Besides allowing us
to test this fix, changing llvm-dwarfdump --find to display all matches
seems like a good improvement (it makes the behavior consistent with the
--name option), so I change llvm-dwarfdump to do that.

The existing tests would be sufficient to test this fix with the new
llvm-dwarfdump behavior, but I add a special test that demonstrates that
the tool indeed displays multiple results. The find.test test needed to
be tweaked a bit as the tool now does not print the ".debug_info
contents" header (also consistent with how --name works).

Reviewers: JDevlieghere, aprantl, dblaikie

Subscribers: mgrang, llvm-commits

Differential Revision: https://reviews.llvm.org/D47543

llvm-svn: 333635
2018-05-31 08:47:00 +00:00
Vedant Kumar e3c1fb8b12 [llvm-cov] Use the new PrintHTMLEscaped utility
This removes some duplicate logic to escape characters in HTML output.

llvm-svn: 333608
2018-05-30 23:35:14 +00:00
Peter Collingbourne 1651ac13be llvm-objcopy: Set sh_link to 0 on unrecognized symtab-linked sections.
Per discussion on the generic-abi mailing list:
https://groups.google.com/forum/#!topic/generic-abi/MPr8TVtnVn4

An object file manipulation tool must either write out a symbol
table with the same number of entries as the original symbol table
and in the same order, or if this is impossible, refuse to operate
on the object file if it has unrecognized sections that are linked
to the symtab section. However, existing tools (namely GNU strip,
GNU objcopy and ld.{bfd,gold,lld} -r) do not comply with this at
present: they change symbol table indexes and set sh_link to 0 on
the unrecognized symtab-linked sections.

We intend to use the latter as a (temporary) signal that a tool has
operated on a proposed new symtab-linked section and invalidated the
symbol table indexes. However, llvm-objcopy currently keeps sh_link
pointing to the new symtab section. This patch changes llvm-objcopy
to set sh_link to 0 to match the behaviour of the other tools.

Differential Revision: https://reviews.llvm.org/D47404

llvm-svn: 333581
2018-05-30 19:30:39 +00:00
Galina Kistanova df917811ca Reverted r333424 as it broke multiple build bots and left unfixed for a long time
llvm-svn: 333578
2018-05-30 18:51:08 +00:00
Jonas Devlieghere f4ce54a123 [dsymutil] Escape HTML special characters in plist.
When printing string in the Plist, we weren't escaping the characters
which lead to invalid XML. This patch adds the escape logic to
StringExtras.

rdar://39785334

llvm-svn: 333565
2018-05-30 17:47:11 +00:00
Alexander Ivchenko 6572425462 [llvm-readobj] Support GNU_PROPERTY_X86_FEATURE_1_AND notes in .note.gnu.property
This patch allows parsing GNU_PROPERTY_X86_FEATURE_1_AND
notes in .note.gnu.property sections. These notes
indicate that the object file is built to support Intel CET.

patch by mike.dvoretsky

Differential Revision: https://reviews.llvm.org/D47473

llvm-svn: 333424
2018-05-29 14:49:51 +00:00
Clement Courbet 07c9ec6f2e [X86][Sched] Add InstRW for CLC on Intel after SNB.
Summary:
After SNB, Intel CPUs can rename CF independently of other EFLAGS,
so the renamer can zero it for free. Note that STC still consumes resources.

To reproduce: `$ llvm-exegesis -mode=uops -opcode-name=CLC`

On SNB:
```
---
key:
  opcode_name:     CLC
  mode:            uops
  config:          ''
cpu_name:        sandybridge
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: '3', value: 0.0014, debug_string: SBPort0 }
  - { key: '4', value: 0.0013, debug_string: SBPort1 }
  - { key: '5', value: 0.0003, debug_string: SBPort4 }
  - { key: '6', value: 0.0029, debug_string: SBPort5 }
  - { key: '10', value: 0.0003, debug_string: SBPort23 }
error:           ''
info:            'instruction is serial, repeating a random one.
Snippet:
CLC
'
...
```

On HSW:
```
---
key:
  opcode_name:     CLC
  mode:            uops
  config:          ''
cpu_name:        haswell
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: '3', value: 0.001, debug_string: HWPort0 }
  - { key: '4', value: 0.0009, debug_string: HWPort1 }
  - { key: '5', value: 0.0004, debug_string: HWPort2 }
  - { key: '6', value: 0.0006, debug_string: HWPort3 }
  - { key: '7', value: 0.0002, debug_string: HWPort4 }
  - { key: '8', value: 0.0012, debug_string: HWPort5 }
  - { key: '9', value: 0.0022, debug_string: HWPort6 }
  - { key: '10', value: 0.0001, debug_string: HWPort7 }
error:           ''
info:            'instruction is serial, repeating a random one.
Snippet:
CLC
'
...

```

Reviewers: craig.topper, RKSimon

Subscribers: gchatelet, llvm-commits

Differential Revision: https://reviews.llvm.org/D47362

llvm-svn: 333392
2018-05-29 06:19:39 +00:00
Jonas Devlieghere cb547cbb5c [dwarfdump] Make -c and -p work together
When requesting to dump both the parent chain and children, we used to
print the DIE more than once because we propagated the dump options to
the parent without clearing the respective flags. This commit fixes this
oversight and adds a test.

rdar://39415292

Differential revision: https://reviews.llvm.org/D47263

llvm-svn: 333350
2018-05-26 19:39:56 +00:00
Paul Semel cf51c80bf1 [llvm-objcopy] Add --keep-file-symbols option
This option prevent from removing file symbols while removing symbols.

Differential Revision: https://reviews.llvm.org/D46830

llvm-svn: 333339
2018-05-26 08:10:37 +00:00
Simon Pilgrim 0155bf0da9 [X86][SNB] Fix differences between vex/non-vex XMM vector moves (PR37286)
As confirmed by llvm-exegesis, there is no scheduler difference between MOVDQA/MOVDQU and VMOVDQA/VMOVDQU xmm reg-reg moves

Another chapter in the never ending crusade to remove useless InstRW overrides from the x86 scheduler models......

llvm-svn: 333271
2018-05-25 12:18:11 +00:00
Paul Semel 99dda0bab8 [llvm-objcopy] Add --strip-unneeded option
Differential Revision: https://reviews.llvm.org/D46896

llvm-svn: 333267
2018-05-25 11:01:25 +00:00
Greg Bedwell e790f6fb06 [UpdateTestChecks] Improved update_mca_test_checks block analysis
Previously update_mca_test_checks worked entirely at "block" level where
a block is some sequence of lines delimited by at least one empty line.
This generally worked well, but could sometimes lead to excessive
repetition of check lines for various prefixes if some block was almost
identical between prefixes, but not quite (for example, due to a
different dispatch width in the otherwise identical summary views).

This new analyis attempts to split blocks further in the case where the
following conditions are met:
  a) There is some prefix common to every RUN line (typically 'ALL').
  b) The first line of the block is common to the output with every prefix.
  c) The block has the same number of lines for the output with every prefix.

Also, regenerated all llvm-mca test files with the following command:
update_mca_test_checks.py "../test/tools/llvm-mca/*/*.s" "../test/tools/llvm-mca/*/*/*.s"

The new analysis showed a "multiple lines not disambiguated by prefixes" warning
for test "AArch64/Exynos/scheduler-queue-usage.s" so I've also added some
explicit prefixes to each of the RUN lines in that test.

Differential Revision: https://reviews.llvm.org/D47321

llvm-svn: 333204
2018-05-24 16:36:44 +00:00
Jonas Devlieghere 27126f5260 [Support] Add color cl category.
This commit adds a color category so tools can document this option and
enables it for dwarfdump and dsymuttil.

rdar://problem/40498996

llvm-svn: 333176
2018-05-24 11:36:57 +00:00
Alexander Shaposhnikov c7277e6e2b [llvm-strip] Minor fix of the usage of TableGen
This is a small follow-up to the revisions r333117 and r331663.

1. Avoid the name conflicts of the generated variables for prefixes.
2. Apply clang-format -i -style=llvm to llvm-objcopy.cpp once again.
3. Add a test for the flag with double dash.

Test plan: make check-all

llvm-svn: 333120
2018-05-23 20:39:52 +00:00
Alexander Shaposhnikov 35bee3e06b [llvm-strip] Expose --keep-symbol option
Expose --keep-symbol option in llvm-strip.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D47222

llvm-svn: 333117
2018-05-23 19:44:19 +00:00
Andrea Di Biagio 3fc20c9c7f [llvm-mca] Print the "Block RThroughput" in the SummaryView.
This patch implements the "block reciprocal throughput" computation in the
SummaryView.

The block reciprocal throughput is computed as the MAX of:
  - NumMicroOps / DispatchWidth
  - Resource Cycles / #Units   (for every resource consumed).

The block throughput is bounded from above by the hardware dispatch throughput.
That is because the DispatchWidth is an upper bound on how many opcodes can be part
of a single dispatch group.

The block throughput is also limited by the amount of hardware parallelism. The
number of available resource units affects how the resource pressure is
distributed, and also how many blocks can be delivered every cycle.

llvm-svn: 333095
2018-05-23 15:59:27 +00:00
Alexander Shaposhnikov 6e7814c484 [llvm-objcopy] Fix the behavior of --strip-* and --keep-symbol
If one runs llvm-objcopy --strip-all --keep-symbol foo
and the symbol table indeed contains the symbol "foo"
then it should not be removed.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D47052

llvm-svn: 333008
2018-05-22 18:24:07 +00:00
Paul Semel 31a212d694 Revert "[llvm-objcopy] Add --strip-unneeded option"
There is a use after free I didn't see. Need to investigate.

This reverts commit f7624abeb1f0d012309baf2e78cf2499fbfe5e5f.

llvm-svn: 332925
2018-05-22 01:04:36 +00:00
Paul Semel 040df77ed6 [llvm-objcopy] Add --strip-unneeded option
This option removes symbols that are not needed by relocations.

Differential Revision: https://reviews.llvm.org/D46896

llvm-svn: 332915
2018-05-21 22:50:32 +00:00
Peter Collingbourne c5a9765cea LTO: Replace split dwarf implementation that uses objcopy with one that uses direct emission.
Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47091

llvm-svn: 332884
2018-05-21 20:26:49 +00:00
Jonas Devlieghere c111382aa8 [DebugInfo] Use absolute addresses in location lists
Rather than relying on the user to do the address calculating in
DW_AT_location we should just dump the absolute address.

rdar://problem/38513870

Differential revision: https://reviews.llvm.org/D47152

llvm-svn: 332873
2018-05-21 19:36:54 +00:00
Andrea Di Biagio cb1ed400a4 [llvm-mca] Removed an empty line generated by the timeline view. NFC.
Also, regenerate all tests.

llvm-svn: 332853
2018-05-21 17:11:56 +00:00
Andrea Di Biagio b5757abefb [X86][BtVer2] Add a 'J' prefix to the PRF/RCU defs. NFC
This is to keep the Jaguar model's naming convention. Processor resources all
have a 'J' prefix in the BtVer2 scheduling model.

llvm-svn: 332851
2018-05-21 16:30:26 +00:00
Nico Weber da5513b9c4 win: try to fix dia tests with newer msvc versions
llvm-svn: 332827
2018-05-21 02:09:57 +00:00
Simon Pilgrim 1273f4ad93 [X86] Add GPR<->XMM Schedule Tags
BtVer2 - fix NumMicroOp and account for the Lat+6cy GPR->XMM and Lat+1cy XMm->GPR delays (see rL332737)

The high number of MOVD/MOVQ equivalent instructions meant that there were a number of missed patterns in SNB/Znver1:
SNB - add missing GPR<->MMX costs (taken from Agner / Intel AOM)
Znver1 - add missing GPR<->XMM MOVQ costs (taken from Agner)

llvm-svn: 332745
2018-05-18 17:58:36 +00:00
Simon Pilgrim 007b50fd35 [X86][BtVer2] Improve simulation of (V)PINSR values
Include the 6cy delay transferring from the GPR to FPU.

llvm-svn: 332737
2018-05-18 17:09:41 +00:00
Simon Pilgrim 3ecb0b80f6 [X86][BtVer2] Partial vector stores (inc MMX) have a 2cy latency
llvm-svn: 332722
2018-05-18 14:22:22 +00:00
Simon Pilgrim c4b8d367a8 [X86][SSE] Ensure vector partial load/stores use the WriteVecLoad/WriteVecStore scheduler classes
Retag some instructions that were missed when we split off vector load/store/moves - MOVQ/MOVD etc.

Fixes BtVer2/SLM which have different behaviours for GPR stores.

llvm-svn: 332718
2018-05-18 14:08:01 +00:00
Simon Pilgrim d749b321b2 [X86][SSE] Ensure float load/stores use the WriteFLoad/WriteFStore scheduler classes
Retag some instructions that were missed when we split off vector load/store/moves - MOVSS/MOVSD/MOVHPD/MOVHPD/MOVLPD/MOVLPS etc.

Fixes BtVer2/SLM which have different behaviours for GPR stores.

llvm-svn: 332714
2018-05-18 13:13:59 +00:00
Simon Pilgrim e389ea0e3e [llvm-mca][X86] Add CMOV test files
llvm-svn: 332622
2018-05-17 16:29:12 +00:00
Simon Pilgrim b5741f5c3d [X86][BtVer2] ADC/SBB take 2cy on an ALU pipe, not 1cy like ADD/SUB
llvm-svn: 332616
2018-05-17 15:43:23 +00:00
Andrea Di Biagio 650b5fc6cb [llvm-mca] add flag -all-views and flag -all-stats.
Flag -all-views enables all the views.
Flag -all-stats enables all the views that print hardware statistics.

llvm-svn: 332602
2018-05-17 12:27:03 +00:00
Simon Pilgrim b4fd145fc3 [llvm-mca][X86] Add ADX test files
llvm-svn: 332595
2018-05-17 11:32:38 +00:00
Simon Pilgrim d5d77dcb46 [X86] Fix typo in instregex for CVTSI642SDrr
llvm-svn: 332510
2018-05-16 18:31:17 +00:00
Andrea Di Biagio 45ccdd1785 [llvm-mca] Regenerate tests after r332381 and r332361. NFC
llvm-svn: 332447
2018-05-16 10:12:06 +00:00
Jake Ehrlich e40398ad98 [llvm-objcopy] Add --only-keep-debug as a noop
This option just keeps being a problem and really needs to be implemented
in some fashion. Implementing it properly requires some kind of
"replaceSectionReference" method because all the existing links need to be
maintained. The desired behavior is just for allocated sections to become
NOBITS but actually implementing that is rather tricky due to the current
design of llvm-objcopy. However converting allocated sections to NOBITS is
just an optimization and not something debuggers need. Debuggers can debug
a stripped executable and take an unstripped executable for that stripped
executable as input. Additionally allocated sections account for a very
small part of debug binaries so this optimization is quite small. I propose
that for the time being we implement this as a NOP so that people can use
llvm-objcopy where they need to, just in a sub-optimal way.

This option has already blocked a lot of people and its currently blocking me.

llvm-svn: 332396
2018-05-15 20:53:53 +00:00
Martin Storsjo e241ce6f65 [llvm-rc] Add support for the optional CLASS statement for dialogs
Differential Revision: https://reviews.llvm.org/D46875

llvm-svn: 332386
2018-05-15 19:21:28 +00:00
Simon Pilgrim be9a206883 [X86] Split WriteCvtF2F into F32->F64 and F64->F32 scheduler classes
BtVer2 - Fixes schedules for (V)CVTPS2PD instructions

A lot of the Intel models still have too many InstRW overrides for these new classes - this needs cleaning up but I wanted to get the classes in first

llvm-svn: 332376
2018-05-15 17:36:49 +00:00
Simon Pilgrim 891ebcdbaa [X86] Split off F16C WriteCvtPH2PS/WriteCvtPS2PH scheduler classes
Btver2 - VCVTPH2PSYrm needs to double pump the AGU
Broadwell - missing VCVTPS2PH*mr stores extra latency

Allows us to remove the WriteCvtF2FSt conversion store class

llvm-svn: 332357
2018-05-15 14:12:32 +00:00
Paul Semel 5d97c823a4 [llvm-objcopy] Add --keep-symbol (-K) option
This option permits to explicitly keep the specified
symbol so that it doesn't get removed.

Differential Revision: https://reviews.llvm.org/D46819

llvm-svn: 332356
2018-05-15 14:09:37 +00:00
Pavel Labath 80827f10a1 Reapply "DWARFVerifier: Check "completeness" of .debug_names section"
This is a resubmit of r331868 (D46583), which was reverted due to
failures on the PS4 bot.

These have been resolved with r332246/D46748.

llvm-svn: 332349
2018-05-15 13:24:10 +00:00
Simon Pilgrim 2aa395abcf [llvm-mca][x86] Add F16C instruction tests
llvm-svn: 332347
2018-05-15 12:50:06 +00:00
Martin Storsjo 11adbacac8 [llvm-rc] Add support for parsing memory flags
Most of the handling is pretty straightforward; fetch the default
memory flags for the specific resource type before parsing the flags
and apply them on top of that, except that some flags imply others
and some flags clear more than one flag.

For icons and cursors, the flags set get passed on to all individual
single icon/cursor resources, while only some flags affect the icon/cursor
group resource.

For stringtables, the behaviour is pretty simple; the first stringtable
resource of a bundle sets the flags for the whole bundle.

The output of these tests match rc.exe byte for byte.

The actual use of these memory flags is deprecated and they have no
effect since Win16, but some resource script files may still happen
to have them in place.

Differential Revision: https://reviews.llvm.org/D46818

llvm-svn: 332329
2018-05-15 06:35:29 +00:00
Martin Storsjo 860e5fcdf4 [llvm-rc] Read the Planes/BitCount fields from BITMAPINFOHEADER for icons
Previously these fields were only read from this header for cursors,
while Planes was hardcoded to 1 for icons (with a comment that it was
unknown why this was needed) and BitCount was left at the value
read originally in the RESDIRENTRY.

This fixes the single byte that was differing for the icon/cursor test
compared to rc.exe.

This is based on research/testing by Nico Weber.

Differential Revision: https://reviews.llvm.org/D46816

llvm-svn: 332328
2018-05-15 06:35:20 +00:00
Martin Storsjo 5556841cd3 [llvm-rc] Add missing inputs for tag-icon-cursor.test.
This adds the missing input files used for this test, except for
the separate input files for specific error cases; matching
test input files were provided by Nico Weber.

The extra copying of files into the %t directory doesn't seem to
be necessary since that directory only ever is used for output here,
not for inputs.

Differential Revision: https://reviews.llvm.org/D46813

llvm-svn: 332297
2018-05-14 21:32:47 +00:00
Simon Pilgrim 5bd5e2fd3e [llvm-mca][X86] Add missing SSE4A test file
llvm-svn: 332270
2018-05-14 18:20:40 +00:00
Simon Pilgrim 228d24a2d6 [X86][BtVer2] Fix MMX/YMM integer vector nt store schedules
MMX was missing and YMM was tagged as a fp nt store

llvm-svn: 332269
2018-05-14 18:07:28 +00:00
Simon Pilgrim 4135de2e93 [llvm-mca][x86] Add scalar nt-store instruction tests
llvm-svn: 332262
2018-05-14 17:10:33 +00:00
Simon Pilgrim 7340d88740 [llvm-mca][x86] Add and/not/or/xor instruction tests
llvm-svn: 332257
2018-05-14 16:26:24 +00:00
Simon Pilgrim 661ae7778d [X86][BtVer2] Model ymm move as double pumped instructions
We still need to handle mmx/xmm moves as 'decode-only' no-pipe instructions

llvm-svn: 332109
2018-05-11 17:38:36 +00:00
Simon Pilgrim 706403bab8 [X86][MMX] Tag MMX Move/Load/Store as WriteVec schedule classes
Fixes an issue on SLM/Btver2 where we had instructions were being treated as scalar loads/stores

llvm-svn: 332104
2018-05-11 16:38:59 +00:00
Simon Pilgrim 032a01f74a [X86][SLM] Vector stores only use the MEC port.
Confirmed by both Agner and Intel's AOM - the IEC/FPC are not required for pure load/stores (even if its a partial update).

Can't fix WriteStore until all RMW instructions are cleaned up though....

llvm-svn: 332096
2018-05-11 15:16:15 +00:00
Simon Pilgrim 22dd72b995 [X86] Split WriteF/WriteVec Move/Load/Store scheduler classes by vector width
Fixes a SNB issue that was missing vlddqu/vmovntdqa ymm instructions

llvm-svn: 332094
2018-05-11 14:30:54 +00:00
Alexander Shaposhnikov 18b5fb7b84 [llvm-strip] Add support for -remove-section
This diff adds support for -remove-section to llvm-strip.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D46567

llvm-svn: 332081
2018-05-11 05:27:06 +00:00
Alexander Shaposhnikov 191913e3e7 [llvm-objcopy] Update remove-section.test
Verify that the input binary is not getting modified
and add an invocation which uses -remove-section instead of -R.

Test plan: make check-all

llvm-svn: 332078
2018-05-11 04:30:57 +00:00
Sam Clegg a5908009cd [WebAsembly] Update default triple in test files to wasm32-unknown-unkown.
Summary: The final -wasm component has been the default for some time now.

Subscribers: jfb, dschuff, jgravelle-google, eraman, aheejin, JDevlieghere, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D46342

llvm-svn: 332007
2018-05-10 17:49:11 +00:00
Simon Pilgrim 37fbb7f173 [X86][SNB] Fix typo in PEXTRDmr instregex, was missing VPEXTRDmr.
llvm-svn: 332002
2018-05-10 17:30:49 +00:00
Alexander Shaposhnikov af555fb4a3 [llvm-objcopy] Add tests for help messages
This diff slightly reorganizes the tests  and improves 
the test coverage of help messages / error reports.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D46589

llvm-svn: 331993
2018-05-10 15:56:04 +00:00
James Henderson a3acf99e59 [DWARF] Rework debug line parsing to use llvm::Error and callbacks
Reviewed by: dblaikie, JDevlieghere, espindola

Differential Revision: https://reviews.llvm.org/D44560

Summary:
The .debug_line parser previously reported errors by printing to stderr and
return false. This is not particularly helpful for clients of the library code,
as it prevents them from handling the errors in a manner based on the calling
context. This change switches to using llvm::Error and callbacks to indicate
what problems were detected during parsing, and has updated clients to handle
the errors in a location-specific manner. In general, this means that they
continue to do the same thing to external users. Below, I have outlined what
the known behaviour changes are, relating to this change.

There are two levels of "errors" in the new error mechanism, to broadly
distinguish between different fail states of the parser, since not every
failure will prevent parsing of the unit, or of subsequent unit. Malformed
table errors that prevent reading the remainder of the table (reported by
returning them) and other minor issues representing problems with parsing that
do not prevent attempting to continue reading the table (reported by calling a
specified callback funciton). The only example of this currently is when the
last sequence of a unit is unterminated. However, I think it would be good to
change the handling of unrecognised opcodes to report as minor issues as well,
rather than just printing to the stream if --verbose is used (this would be a
subsequent change however).

I have substantially extended the DwarfGenerator to be able to handle
custom-crafted .debug_line sections, allowing for comprehensive unit-testing
of the parser code. For now, I am just adding unit tests to cover the basic
error reporting, and positive cases, and do not currently intend to test every
part of the parser, although the framework should be sufficient to do so at a
later point.

Known behaviour changes:
  - The dump function in DWARFContext now does not attempt to read subsequent
  tables when searching for a specific offset, if the unit length field of a
  table before the specified offset is a reserved value.
  - getOrParseLineTable now returns a useful Error if an invalid offset is
  encountered, rather than simply a nullptr.
  - The parse functions no longer use `WithColor::warning` directly to report
  errors, allowing LLD to call its own warning function.
  - The existing parse error messages have been updated to not specifically
  include "warning" in their message, allowing consumers to determine what
  severity the problem is.
  - If the line table version field appears to have a value less than 2, an
  informative error is returned, instead of just false.
  - If the line table unit length field uses a reserved value, an informative
  error is returned, instead of just false.
  - Dumping of .debug_line.dwo sections is now implemented the same as regular
  .debug_line sections.
  - Verbose dumping of .debug_line[.dwo] sections now prints the prologue, if
  there is a prologue error, just like non-verbose dumping.

As a helper for the generator code, I have re-added emitInt64 to the
AsmPrinter code. This previously existed, but was removed way back in r100296,
presumably because it was dead at the time.

This change also requires a change to LLD, which will be committed separately.

llvm-svn: 331971
2018-05-10 10:51:33 +00:00
Douglas Yung 1d4c29c437 Fix tests added in r331924 so that they work on Windows.
The test needed to check for the optional executable extension (llvm-objcopy.EXE).

llvm-svn: 331952
2018-05-10 03:06:42 +00:00
Paul Semel 4246a462a3 [llvm-objcopy] Add --strip-symbol (-N) option
llvm-svn: 331924
2018-05-09 21:36:54 +00:00