Commit Graph

609 Commits

Author SHA1 Message Date
Tom Stellard 4b0b26199b Revert CMake: Make most target symbols hidden by default
This reverts r362990 (git commit 374571301d)

This was causing linker warnings on Darwin:

ld: warning: direct access in function 'llvm::initializeEvexToVexInstPassPass(llvm::PassRegistry&)'
from file '../../lib/libLLVMX86CodeGen.a(X86EvexToVex.cpp.o)' to global weak symbol
'void std::__1::__call_once_proxy<std::__1::tuple<void* (&)(llvm::PassRegistry&),
std::__1::reference_wrapper<llvm::PassRegistry>&&> >(void*)' from file '../../lib/libLLVMCore.a(Verifier.cpp.o)'
means the weak symbol cannot be overridden at runtime. This was likely caused by different translation
units being compiled with different visibility settings.

llvm-svn: 363028
2019-06-11 03:21:13 +00:00
Tom Stellard 374571301d CMake: Make most target symbols hidden by default
Summary:
For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF
this change makes all symbols in the target specific libraries hidden
by default.

A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these
libraries public, which is mainly needed for the definitions of the
LLVMInitialize* functions.

This patch reduces the number of public symbols in libLLVM.so by about
25%.  This should improve load times for the dynamic library and also
make abi checker tools, like abidiff require less memory when analyzing
libLLVM.so

One side-effect of this change is that for builds with
LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that
access symbols that are no longer public will need to be statically linked.

Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1):
nm before/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
36221
nm after/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
26278

Reviewers: chandlerc, beanz, mgorny, rnk, hans

Reviewed By: rnk, hans

Subscribers: Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D54439

llvm-svn: 362990
2019-06-10 22:12:56 +00:00
Pengfei Wang 2e67d0c842 [X86] Add VP2INTERSECT instructions
Support Intel AVX512 VP2INTERSECT instructions in llvm

Patch by Xiang Zhang (xiangzhangllvm)

Differential Revision: https://reviews.llvm.org/D62366

llvm-svn: 362188
2019-05-31 02:50:41 +00:00
Fangrui Song 656afe370d [X86] Fix x86-64 call *foo@tlsdesc(%rax) and support R_386_TLSGOTDESC R_386_TLS_DESC_CALL
D18885 emitted 5 bytes for call *foo@tlsdesc(%rax). It should use the
2-byte form instead and let R_X86_64_TLSDESC_CALL apply to the beginning
of the call instruction.

The 2-byte form was deliberately chosen to make ->LE and ->IE relaxation work:

    0:   48 8d 05 00 00 00 00    lea    0x0(%rip),%rax        # 7 <.text+0x7>
                         3: R_X86_64_GOTPC32_TLSDESC     a-0x4
    7:   ff 10                   callq  *(%rax)
                         7: R_X86_64_TLSDESC_CALL        a

=>

    0:   48 c7 c0 fc ff ff ff    mov    $0xfffffffffffffffc,%rax
    7:   66 90                   xchg   %ax,%ax

Also change the symbol type to STT_TLS when VK_TLSCALL or VK_TLSDESC is
seen.

Reviewed By: compnerd

Differential Revision: https://reviews.llvm.org/D62512

llvm-svn: 361910
2019-05-29 02:02:59 +00:00
Simon Pilgrim a044410f37 [X86][SSE] Add shuffle combining support for ISD::ANY_EXTEND_VECTOR_INREG
Reuses what we already have in place for ISD::ZERO_EXTEND_VECTOR_INREG just with a different sentinel

llvm-svn: 361734
2019-05-26 16:00:35 +00:00
Fangrui Song 2463239777 [X86] Support .reloc *, R_{386,X86_64}_NONE, *
This can be used to create references among sections. When --gc-sections
is used, the referenced section will be retained if the origin section
is retained.

See R_MIPS_NONE (D13659), R_ARM_NONE (D61992), R_AARCH64_NONE (D61973) for similar changes.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D62014

llvm-svn: 360983
2019-05-17 03:25:39 +00:00
Richard Trieu 0116385452 [X86] Create a TargetInfo header. NFC
Move the declarations of getThe<Name>Target() functions into a new header in
TargetInfo and make users of these functions include this new header.
This fixes a layering problem.

llvm-svn: 360736
2019-05-15 01:17:58 +00:00
Richard Trieu b28b8b7724 [X86] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc.  Merging them together will fix this.  For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.

llvm-svn: 360484
2019-05-10 23:24:38 +00:00
Craig Topper 47d8865a38 [X86] Remove string literal from an if. NFC
This if used to be an assert that got refactored into an if, but left the string literal behind.

Fixes PR41718

llvm-svn: 359833
2019-05-02 21:57:18 +00:00
Eric Christopher 6b06c6a5ef Add explicit dependencies on MCSection.h and MCDwarf.h to the .cpp
files rather than rely on transitive includes from MCStreamer.h.

llvm-svn: 358263
2019-04-12 07:40:01 +00:00
Luo, Yuanke a2b4d3fab6 [X86] Add MM register mapping from CodeView to MC register id
Differential Revision: https://reviews.llvm.org/D60437

Change-Id: I2183a6d825d0284b22705d423b88882992b236c5
llvm-svn: 358179
2019-04-11 15:01:03 +00:00
Craig Topper 8e2871cd2c [X86] Add support for {vex2}, {vex3}, and {evex} to the assembler to match gas. Use {evex} to improve the one our 32-bit AVX512 tests.
These can be used to force the encoding used for instructions.

{vex2} will fail if the instruction is not VEX encoded, but otherwise won't do anything since we prefer vex2 when possible. Might need to skip use of the _REV MOV instructions for this too, but I haven't done that yet.

{vex3} will force the instruction to use the 3 byte VEX encoding or fail if there is no VEX form.

{evex} will force the instruction to use the EVEX version or fail if there is no EVEX version.

Differential Revision: https://reviews.llvm.org/D59266

llvm-svn: 358029
2019-04-09 18:45:15 +00:00
Craig Topper 80aa2290fb [X86] Merge the different Jcc instructions for each condition code into single instructions that store the condition code as an operand.
Summary:
This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between Jcc instructions and condition codes.

Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser.

Reviewers: spatel, lebedev.ri, courbet, gchatelet, RKSimon

Reviewed By: RKSimon

Subscribers: MatzeB, qcolombet, eraman, hiraditya, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60228

llvm-svn: 357802
2019-04-05 19:28:09 +00:00
Craig Topper 7323c2bf85 [X86] Merge the different SETcc instructions for each condition code into single instructions that store the condition code as an operand.
Summary:
This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between SETcc instructions and condition codes.

Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser.

Reviewers: andreadb, courbet, RKSimon, spatel, lebedev.ri

Reviewed By: andreadb

Subscribers: hiraditya, lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60138

llvm-svn: 357801
2019-04-05 19:27:49 +00:00
Craig Topper e0bfeb5f24 [X86] Merge the different CMOV instructions for each condition code into single instructions that store the condition code as an immediate.
Summary:
Reorder the condition code enum to match their encodings. Move it to MC layer so it can be used by the scheduler models.

This avoids needing an isel pattern for each condition code. And it removes
translation switches for converting between CMOV instructions and condition
codes.

Now the printer, encoder and disassembler take care of converting the immediate.
We use InstAliases to handle the assembly matching. But we print using the
asm string in the instruction definition. The instruction itself is marked
IsCodeGenOnly=1 to hide it from the assembly parser.

This does complicate the scheduler models a little since we can't assign the
A and BE instructions to a separate class now.

I plan to make similar changes for SETcc and Jcc.

Reviewers: RKSimon, spatel, lebedev.ri, andreadb, courbet

Reviewed By: RKSimon

Subscribers: gchatelet, hiraditya, kristina, lebedev.ri, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60041

llvm-svn: 357800
2019-04-05 19:27:41 +00:00
Craig Topper 4307172b84 [X86] Classify the AVX512 rounding control operand as X86::OPERAND_ROUNDING_CONTROL instead of MCOI::OPERAND_IMMEDIATE. Add an assert on legal values of rounding control in the encoder and remove an explicit mask.
This should allow llvm-exegesis to intelligently constrain the rounding mode.

The mask in the encoder shouldn't be necessary any more. We used to allow codegen to use 8-11 for rounding mode and the assembler would use 0-3 to mean the same thing so we masked here and in the printer. Codegen now matches the assembler and the printer was updated, but I forgot to update the encoder.

llvm-svn: 357419
2019-04-01 19:08:15 +00:00
Jonas Hahnfeld c64d73cce2 [ELF] Fix GCC8 warnings about "fall through", NFCI
Add break statements in Object/ELF.cpp since the code should consider the
generic tags for Hexagon, MIPS, and PPC. Add a test (copied from llvm-readobj)
to show that this works correctly (earlier versions of this patch would have
asserted).

The warnings in X86ELFObjectWriter.cpp are actually false-positives since
the nested switch() handles all possible values and returns in all cases.
Make this explicit by adding llvm_unreachable's.

Differential Revision: https://reviews.llvm.org/D58837

llvm-svn: 356037
2019-03-13 10:38:17 +00:00
Craig Topper 93e15dfacc [X86] Make lowering of intrinsics with rounding mode stricter so that only valid rounding modes are lowered. Update tests accordingly
Many of our tests were not using valid rounding mode immediates. Clang verifies this in the frontend when it creates the intrinsics from builtins, but the backend would still lower invalid immediates.

With this change we will now leave them as intrinsics if the immediate is invalid. This will cause an isel selection failure.

llvm-svn: 355789
2019-03-10 17:20:45 +00:00
Andrea Di Biagio edbf06a767 [AsmPrinter] Remove hidden flag -print-schedule.
This patch removes hidden codegen flag -print-schedule effectively reverting the
logic originally committed as r300311
(https://llvm.org/viewvc/llvm-project?view=revision&revision=300311).

Flag -print-schedule was originally introduced by r300311 to address PR32216
(https://bugs.llvm.org/show_bug.cgi?id=32216). That bug was about adding "Better
testing of schedule model instruction latencies/throughputs".

These days, we can use llvm-mca to test scheduling models. So there is no longer
a need for flag -print-schedule in LLVM. The main use case for PR32216 is
now addressed by llvm-mca.
Flag -print-schedule is mainly used for debugging purposes, and it is only
actually used by x86 specific tests. We already have extensive (latency and
throughput) tests under "test/tools/llvm-mca" for X86 processor models. That
means, most (if not all) existing -print-schedule tests for X86 are redundant.

When flag -print-schedule was first added to LLVM, several files had to be
modified; a few APIs gained new arguments (see for example method
MCAsmStreamer::EmitInstruction), and MCSubtargetInfo/TargetSubtargetInfo gained
a couple of getSchedInfoStr() methods.

Method getSchedInfoStr() had to originally work for both MCInst and
MachineInstr. The original implmentation of getSchedInfoStr() introduced a
subtle layering violation (reported as PR37160 and then fixed/worked-around by
r330615).
In retrospect, that new API could have been designed more optimally. We can
always query MCSchedModel to get the latency and throughput. More importantly,
the "sched-info" string should not have been generated by the subtarget.
Note, r317782 fixed an issue where "print-schedule" didn't work very well in the
presence of inline assembly. That commit is also reverted by this change.

Differential Revision: https://reviews.llvm.org/D57244

llvm-svn: 353043
2019-02-04 12:51:26 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Simon Pilgrim 7ee86e8e81 [X86] Fix gcc7 -Wunused-but-set-variable warning. NFCI.
llvm-svn: 350701
2019-01-09 11:18:49 +00:00
Francis Visoiu Mistrih 7a6d7672c1 [X86][Darwin] Emit compact-unwind for register-sized stack adjustments
For stack frames on the size of a register in x86, a code size optimization
emits "push rax/eax" instead of "sub" for stack allocation. For example:

foo:
  .cfi_startproc
BB#0:
  pushq %rax
Ltmp0:
  .cfi_def_cfa_offset 16
  ...
  .cfi_endproc

However, we are falling back to DWARF in this case because we cannot
encode %rax as a saved register.

This requirement is wrong, since we don't care about the contents of
%rax, it is the equivalent of a sub.

In order to specify that we care about the contents of %rax, we would
need a .cfi_offset %rax, <offset>.

It's also overzealous in the case where there are pushes for callee saved
registers followed by a "push rax/eax" instead of "sub", in which case we should
also be able to encode the callee saved regs and everything else using compact
unwind.

Patch authored by Bruno Cardoso Lopes.

Differential Revision: https://reviews.llvm.org/D13793

llvm-svn: 350623
2019-01-08 13:53:15 +00:00
Evandro Menezes 9ef79c884a [TableGen] Refactor macro names (NFC)
Make the names for the macros for `TargetInstrInfo` uniform.

llvm-svn: 347706
2018-11-27 20:58:27 +00:00
Clement Courbet 54a1184fff [X86][NFC] Fix comment.
llvm-svn: 346226
2018-11-06 13:48:56 +00:00
Martin Storsjo 37b742e208 [COFF] [X86] Don't use llvm_unreachable for unsupported relocation types
This can happen if assembling a reference to _GLOBAL_OFFSET_TABLE_.

While it doesn't make sense to try to assemble that for COFF,
the fact that we previously used llvm_unreachable meant that the code
had undefined behaviour if something tried to assemble that.

The configure script of libgmp would try to assemble such a snippet
(which should signal a failure). If llvm is built without assertions,
the undefined behaviour meant a (near) infinite loop.

Differential Revision: https://reviews.llvm.org/D52903

llvm-svn: 343811
2018-10-04 20:43:38 +00:00
Reid Kleckner d5e4ec74e3 [codeview] Fix 32-bit x86 variable locations in realigned stack frames
Add the .cv_fpo_stackalign directive so that we can define $T0, or the
VFRAME virtual register, with it. This was overlooked in the initial
implementation because unlike MSVC, we push CSRs before allocating stack
space, so this value is only needed to describe local variable
locations. Variables that the compiler now addresses via ESP are instead
described as being stored at offsets from VFRAME, which for us is ESP
after alignment in the prologue.

This adds tests that show that we use the VFRAME register properly in
our S_DEFRANGE records, and that we emit the correct FPO data to define
it.

Fixes PR38857

llvm-svn: 343603
2018-10-02 16:43:52 +00:00
Andrea Di Biagio 8b6c314be1 [TableGen][SubtargetEmitter] Add the ability for processor models to describe dependency breaking instructions.
This patch adds the ability for processor models to describe dependency breaking
instructions.

Different processors may specify a different set of dependency-breaking
instructions.
That means, we cannot assume that all processors of the same target would use
the same rules to classify dependency breaking instructions.

The main goal of this patch is to provide the means to describe dependency
breaking instructions directly via tablegen, and have the following
TargetSubtargetInfo hooks redefined in overrides by tabegen'd
XXXGenSubtargetInfo classes (here, XXX is a Target name).

```
virtual bool isZeroIdiom(const MachineInstr *MI, APInt &Mask) const {
  return false;
}

virtual bool isDependencyBreaking(const MachineInstr *MI, APInt &Mask) const {
  return isZeroIdiom(MI);
}
```

An instruction MI is a dependency-breaking instruction if a call to method
isDependencyBreaking(MI) on the STI (TargetSubtargetInfo object) evaluates to
true. Similarly, an instruction MI is a special case of zero-idiom dependency
breaking instruction if a call to STI.isZeroIdiom(MI) returns true.
The extra APInt is used for those targets that may want to select which machine
operands have their dependency broken (see comments in code).
Note that by default, subtargets don't know about the existence of
dependency-breaking. In the absence of external information, those method calls
would always return false.

A new tablegen class named STIPredicate has been added by this patch to let
processor models classify instructions that have properties in common. The idea
is that, a MCInstrPredicate definition can be used to "generate" an instruction
equivalence class, with the idea that instructions of a same class all have a
property in common.

STIPredicate definitions are essentially a collection of instruction equivalence
classes.
Also, different processor models can specify a different variant of the same
STIPredicate with different rules (i.e. predicates) to classify instructions.
Tablegen backends (in this particular case, the SubtargetEmitter) will be able
to process STIPredicate definitions, and automatically generate functions in
XXXGenSubtargetInfo.

This patch introduces two special kind of STIPredicate classes named
IsZeroIdiomFunction and IsDepBreakingFunction in tablegen. It also adds a
definition for those in the BtVer2 scheduling model only.

This patch supersedes the one committed at r338372 (phabricator review: D49310).

The main advantages are:
 - We can describe subtarget predicates via tablegen using STIPredicates.
 - We can describe zero-idioms / dep-breaking instructions directly via
   tablegen in the scheduling models.

In future, the STIPredicates framework can be used for solving other problems.
Examples of future developments are:
 - Teach how to identify optimizable register-register moves
 - Teach how to identify slow LEA instructions (each subtarget defining its own
   concept of "slow" LEA).
 - Teach how to identify instructions that have undocumented false dependencies
   on the output registers on some processors only.

It is also (in my opinion) an elegant way to expose knowledge to both external
tools like llvm-mca, and codegen passes.
For example, machine schedulers in LLVM could reuse that information when
internally constructing the data dependency graph for a code region.

This new design feature is also an "opt-in" feature. Processor models don't have
to use the new STIPredicates. It has all been designed to be as unintrusive as
possible.

Differential Revision: https://reviews.llvm.org/D52174

llvm-svn: 342555
2018-09-19 15:57:45 +00:00
Martin Storsjo 489993db94 [MinGW] [X86] Add stubs for references to data variables that might end up imported from a dll
Variables declared with the dllimport attribute are accessed via a
stub variable named __imp_<var>. In MinGW configurations, variables that
aren't declared with a dllimport attribute might still end up imported
from another DLL with runtime pseudo relocs.

For x86_64, this avoids the risk that the target is out of range
for a 32 bit PC relative reference, in case the target DLL is loaded
further than 4 GB from the reference. It also avoids having to make the
text section writable at runtime when doing the runtime fixups, which
makes it worthwhile to do for i386 as well.

Add stub variables for all dso local data references where a definition
of the variable isn't visible within the module, since the DLL data
autoimporting might make them imported even though they are marked as
dso local within LLVM.

Don't do this for variables that actually are defined within the same
module, since we then know for sure that it actually is dso local.

Don't do this for references to functions, since there's no need for
runtime pseudo relocations for autoimporting them; if a function from
a different DLL is called without the appropriate dllimport attribute,
the call just gets routed via a thunk instead.

GCC does something similar since 4.9 (when compiling with -mcmodel=medium
or large; from that version, medium is the default code model for x86_64
mingw), but only for x86_64.

Differential Revision: https://reviews.llvm.org/D51288

llvm-svn: 340942
2018-08-29 17:28:34 +00:00
Joel Galenson c6f6c17c9b Add missing override keyword (NFC)
llvm-svn: 340615
2018-08-24 16:15:44 +00:00
Joel Galenson d36fb48a27 Find PLT entries for x86, x86_64, and AArch64.
This adds a new method to ELFObjectFileBase that returns the symbols and addresses of PLT entries.

This design was suggested by pcc and eugenis in https://reviews.llvm.org/D49383.

Differential Revision: https://reviews.llvm.org/D50203

llvm-svn: 340610
2018-08-24 15:21:56 +00:00
Zachary Turner bc94ae437f Add the extended XMM registers mappings for AVX-512.
After this we should have the entire AVX-512 register set
mapping in place.

llvm-svn: 340118
2018-08-18 03:54:16 +00:00
Reid Kleckner bd5d71229d [codeview] Use push_macro to avoid conflicts instead of a prefix
Summary:
This prefix was added in r333421, and it changed our dumper output to
say things like "CVRegEAX" instead of just "EAX". That's a functional
change that I'd rather avoid.

I tested GCC, Clang, and MSVC, and all of them support #pragma
push_macro. They don't issue warnings whem the macro is not defined
either.

I don't have a Mac so I can't test the real termios.h header, but I
looked at the termios.h sources online and looked for other conflicts.
I saw only the CR* macros, so those are the ones we work around.

Reviewers: zturner, JDevlieghere

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D50851

llvm-svn: 339907
2018-08-16 17:34:31 +00:00
Nirav Dave 7fd992a755 [MC][X86] Enhance X86 Register expression handling to more closely match GCC.
Allow the comparison of x86 registers in the evaluation of assembler
directives. This generalizes and simplifies the extension from r334022
to catch another case found in the Linux kernel.

Reviewers: rnk, void

Reviewed By: rnk

Subscribers: hiraditya, nickdesaulniers, llvm-commits

Differential Revision: https://reviews.llvm.org/D50795

llvm-svn: 339895
2018-08-16 16:31:14 +00:00
Zachary Turner 2838b59121 Add support for AVX-512 CodeView registers.
When compiling with /arch:AVX512 and optimizations turned on,
we could crash while emitting debug info because we did not
have CodeView register constants for the AVX 512 register
set defined.  This patch defines them.

Differential Revision: https://reviews.llvm.org/D50819

llvm-svn: 339893
2018-08-16 16:17:55 +00:00
Andrea Di Biagio a1852b6194 [llvm-mca][BtVer2] Teach how to identify dependency-breaking idioms.
This patch teaches llvm-mca how to identify dependency breaking instructions on
btver2.

An example of dependency breaking instructions is the zero-idiom XOR (example:
`XOR %eax, %eax`), which always generates zero regardless of the actual value of
the input register operands.
Dependency breaking instructions don't have to wait on their input register
operands before executing. This is because the computation is not dependent on
the inputs.

Not all dependency breaking idioms are also zero-latency instructions. For
example, `CMPEQ %xmm1, %xmm1` is independent on
the value of XMM1, and it generates a vector of all-ones.
That instruction is not eliminated at register renaming stage, and its opcode is
issued to a pipeline for execution. So, the latency is not zero. 

This patch adds a new method named isDependencyBreaking() to the MCInstrAnalysis
interface. That method takes as input an instruction (i.e. MCInst) and a
MCSubtargetInfo.
The default implementation of isDependencyBreaking() conservatively returns
false for all instructions. Targets may override the default behavior for
specific CPUs, and return a value which better matches the subtarget behavior.

In future, we should teach to Tablegen how to automatically generate the body of
isDependencyBreaking from scheduling predicate definitions. This would allow us
to expose the knowledge about dependency breaking instructions to the machine
schedulers (and, potentially, other codegen passes).

Differential Revision: https://reviews.llvm.org/D49310

llvm-svn: 338372
2018-07-31 13:21:43 +00:00
Andrea Di Biagio b6022aa8d9 [X86][BtVer2] correctly model the latency/throughput of LEA instructions.
This patch fixes the latency/throughput of LEA instructions in the BtVer2
scheduling model.

On Jaguar, A 3-operands LEA has a latency of 2cy, and a reciprocal throughput of
1. That is because it uses one cycle of SAGU followed by 1cy of ALU1.  An LEA
with a "Scale" operand is also slow, and it has the same latency profile as the
3-operands LEA. An LEA16r has a latency of 3cy, and a throughput of 0.5 (i.e.
RThrouhgput of 2.0).

This patch adds a new TIIPredicate named IsThreeOperandsLEAFn to X86Schedule.td.
The tablegen backend (for instruction-info) expands that definition into this
(file X86GenInstrInfo.inc):
```
static bool isThreeOperandsLEA(const MachineInstr &MI) {
  return (
    (
      MI.getOpcode() == X86::LEA32r
      || MI.getOpcode() == X86::LEA64r
      || MI.getOpcode() == X86::LEA64_32r
      || MI.getOpcode() == X86::LEA16r
    )
    && MI.getOperand(1).isReg()
    && MI.getOperand(1).getReg() != 0
    && MI.getOperand(3).isReg()
    && MI.getOperand(3).getReg() != 0
    && (
      (
        MI.getOperand(4).isImm()
        && MI.getOperand(4).getImm() != 0
      )
      || (MI.getOperand(4).isGlobal())
    )
  );
}
```

A similar method is generated in the X86_MC namespace, and included into
X86MCTargetDesc.cpp (the declaration lives in X86MCTargetDesc.h).

Back to the BtVer2 scheduling model:
A new scheduling predicate named JSlowLEAPredicate now checks if either the
instruction is a three-operands LEA, or it is an LEA with a Scale value
different than 1.
A variant scheduling class uses that new predicate to correctly select the
appropriate latency profile.

Differential Revision: https://reviews.llvm.org/D49436

llvm-svn: 337469
2018-07-19 16:42:15 +00:00
Craig Topper 0661f67296 [X86] Remove FMA3Info DenseMap. Break into sorted tables that we can binary search.
I separated out the rounding and broadcast groups into their own tables because it made the ordering in the main table easier.

Further splitting of the tables might make it possible to directly index using bits from the TSFlags, but its probably not worth it right now.

llvm-svn: 336075
2018-07-02 06:23:39 +00:00
Craig Topper d8d64a56b5 [X86] Make %eiz usage in 64-bit mode, force a 0x67 address size prefix. Fix some test CHECK lines.
llvm-svn: 335414
2018-06-23 06:15:04 +00:00
Craig Topper c26c62e0e5 [X86][AsmParser] Allow (%bp,%si) and (%bp,%di) to be encoded without using a zero displacement.
(%bp) can't be encoded without a displacement. The encoding is instead used for displacement alone. So a 1 byte displacement of 0 must be used. But if there is an index register we can encode without a displacement.

llvm-svn: 335379
2018-06-22 19:42:21 +00:00
Andrea Di Biagio 2145b13fc9 [llvm-mca][X86] Teach how to identify register writes that implicitly clear the upper portion of a super-register.
This patch teaches llvm-mca how to identify register writes that implicitly zero
the upper portion of a super-register.

On X86-64, a general purpose register is implemented in hardware as a 64-bit
register. Quoting the Intel 64 Software Developer's Manual: "an update to the
lower 32 bits of a 64 bit integer register is architecturally defined to zero
extend the upper 32 bits".  Also, a write to an XMM register performed by an AVX
instruction implicitly zeroes the upper 128 bits of the aliasing YMM register.

This patch adds a new method named clearsSuperRegisters to the MCInstrAnalysis
interface to help identify instructions that implicitly clear the upper portion
of a super-register.  The rest of the patch teaches llvm-mca how to use that new
method to obtain the information, and update the register dependencies
accordingly.

I compared the kernels from tests clear-super-register-1.s and
clear-super-register-2.s against the output from perf on btver2.  Previously
there was a large discrepancy between the estimated IPC and the measured IPC.
Now the differences are mostly in the noise.

Differential Revision: https://reviews.llvm.org/D48225

llvm-svn: 335113
2018-06-20 10:08:11 +00:00
Fangrui Song f72cdb50be [MC] [X86] Teach leaq _GLOBAL_OFFSET_TABLE(%rip), %r15 to use R_X86_64_GOTPC32 instead of R_X86_64_PC32
Summary:
This is similar to D46319 (ARM). x86-64 psABI p40 gives an example:

  leaq _GLOBAL_OFFSET_TABLE(%rip), %r15 # GOTPC32 reloc

GNU as creates R_X86_64_GOTPC32. However, MC currently emits R_X86_64_PC32.

Reviewers: javed.absar, echristo

Subscribers: kristof.beyls, llvm-commits, peter.smith, grimar

Differential Revision: https://reviews.llvm.org/D47507

llvm-svn: 334515
2018-06-12 16:20:44 +00:00
Peter Smith 57f661bd7d [MC] Pass MCSubtargetInfo to fixupNeedsRelaxation and applyFixup
On targets like Arm some relaxations may only be performed when certain
architectural features are available. As functions can be compiled with
differing levels of architectural support we must make a judgement on
whether we can relax based on the MCSubtargetInfo for the function. This
change passes through the MCSubtargetInfo for the function to
fixupNeedsRelaxation so that the decision on whether to relax can be made
per function. In this patch, only the ARM backend makes use of this
information. We must also pass the MCSubtargetInfo to applyFixup because
some fixups skip error checking on the assumption that relaxation has
occurred, to prevent code-generation errors applyFixup must see the same
MCSubtargetInfo as fixupNeedsRelaxation.

Differential Revision: https://reviews.llvm.org/D44928

llvm-svn: 334078
2018-06-06 09:40:06 +00:00
Nirav Dave 05b589101e [MC][X86] Allow assembler variable assignment to register name.
Summary:
Allow extended parsing of variable assembler assignment syntax and modify X86 to permit
VAR = register assignment. As we emit these as .set directives when possible, we inline
such expressions in output assembly.

Fixes PR37425.

Reviewers: rnk, void, echristo

Reviewed By: rnk

Subscribers: nickdesaulniers, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D47545

llvm-svn: 334022
2018-06-05 15:13:39 +00:00
Sriraman Tallam d10c4e07f5 Relax GOTPCREL relocations for tail jmp instructions.
Differential Revision: https://reviews.llvm.org/D47563

llvm-svn: 333676
2018-05-31 18:12:33 +00:00
Jonas Devlieghere 43dce3edbe [CodeView] Add prefix to CodeView registers.
Adds CVReg to CodeView register names to prevent a duplicate symbol with
CR3 defined in termios.h, as suggested by Zachary on the mailing list.

http://lists.llvm.org/pipermail/llvm-dev/2018-May/123372.html

Differential revision: https://reviews.llvm.org/D47478

rdar://39863705

llvm-svn: 333421
2018-05-29 14:35:34 +00:00
Peter Collingbourne dcd7d6c331 MC: Separate creating a generic object writer from creating a target object writer. NFCI.
With this we gain a little flexibility in how the generic object
writer is created.

Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47045

llvm-svn: 332868
2018-05-21 19:20:29 +00:00
Peter Collingbourne 2602a0d40c Fix ubsan bounds check failure.
llvm-svn: 332866
2018-05-21 19:09:47 +00:00
Peter Collingbourne 571a3301ae MC: Change MCAsmBackend::writeNopData() to take a raw_ostream instead of an MCObjectWriter. NFCI.
To make this work I needed to add an endianness field to MCAsmBackend
so that writeNopData() implementations know which endianness to use.

Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47035

llvm-svn: 332857
2018-05-21 17:57:19 +00:00
Peter Collingbourne f7b81db715 MC: Change the streamer ctors to take an object writer instead of a stream. NFCI.
The idea is that a client that wants split dwarf would create a
specific kind of object writer that creates two files, and use it to
create the streamer.

Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47050

llvm-svn: 332749
2018-05-18 18:26:45 +00:00
Adrian Prantl 5f8f34e459 Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00