Commit Graph

226 Commits

Author SHA1 Message Date
Craig Topper 811756b4dc [X86][XOP] Reduce the size of a multiclass by moving more stuff to parameters instead of doing 128-bit and 256-bit simultaneously.
This requires some instructions to be renamed to move the Y earlier in the instruction name. The new names are more consistent with other instructions.

llvm-svn: 295579
2017-02-18 22:53:43 +00:00
Hans Wennborg a468601e0e [X86] Re-enable conditional tail calls and fix PR31257.
This reverts r294348, which removed support for conditional tail calls
due to the PR above. It fixes the PR by marking live registers as
implicitly used and defined by the now predicated tailcall. This is
similar to how IfConversion predicates instructions.

Differential Revision: https://reviews.llvm.org/D29856

llvm-svn: 295262
2017-02-16 00:04:05 +00:00
Hans Wennborg 819e3e02a9 [X86] Disable conditional tail calls (PR31257)
They are currently modelled incorrectly (as calls, which clobber
registers, confusing e.g. Machine Copy Propagation).

Reverting until we figure out the proper solution.

llvm-svn: 294348
2017-02-07 20:37:45 +00:00
Sanjoy Das 2f63cbcc0c [ImplicitNullCheck] Extend Implicit Null Check scope by using stores
Summary:
This change allows usage of store instruction for implicit null check.

Memory Aliasing Analisys is not used and change conservatively supposes
that any store and load may access the same memory. As a result
re-ordering of store-store, store-load and load-store is prohibited.

Patch by Serguei Katkov!

Reviewers: reames, sanjoy

Reviewed By: sanjoy

Subscribers: atrick, llvm-commits

Differential Revision: https://reviews.llvm.org/D29400

llvm-svn: 294338
2017-02-07 19:19:49 +00:00
Peter Collingbourne dc5e583687 X86: Produce @ABS8 symbol modifiers for absolute symbols in range [0,128).
Differential Revision: https://reviews.llvm.org/D28689

llvm-svn: 293844
2017-02-02 00:32:03 +00:00
Nirav Dave a7c041d147 [X86] Implement -mfentry
Summary: Insert calls to __fentry__ at function entry.

Reviewers: hfinkel, craig.topper

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D28000

llvm-svn: 293648
2017-01-31 17:00:27 +00:00
Matthias Braun 8c209aa877 Cleanup dump() functions.
We had various variants of defining dump() functions in LLVM. Normalize
them (this should just consistently implement the things discussed in
http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html

For reference:
- Public headers should just declare the dump() method but not use
  LLVM_DUMP_METHOD or #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
- The definition of a dump method should look like this:
  #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
  LLVM_DUMP_METHOD void MyClass::dump() {
    // print stuff to dbgs()...
  }
  #endif

llvm-svn: 293359
2017-01-28 02:02:38 +00:00
Dean Michael Berris f7e7b938ea [XRay] Merge instrumentation point table emission code into AsmPrinter.
Summary:
No need to have this per-architecture.  While there, unify 32-bit ARM's
behaviour with what changed elsewhere and start function names lowercase
as per the coding standards.  Individual entry emission code goes to the
entry's own class.

Fully tested on amd64, cross-builds on both ARMs and PowerPC.

Reviewers: dberris

Subscribers: aemerson, llvm-commits

Differential Revision: https://reviews.llvm.org/D28209

llvm-svn: 290858
2017-01-03 04:30:21 +00:00
Gadi Haber 19c4fc5e62 This is a large patch for X86 AVX-512 of an optimization for reducing code size by encoding EVEX AVX-512 instructions using the shorter VEX encoding when possible.
There are cases of AVX-512 instructions that have two possible encodings. This is the case with instructions that use vector registers with low indexes of 0 - 15 and do not use the zmm registers or the mask k registers.
The EVEX encoding prefix requires 4 bytes whereas the VEX prefix can take only up to 3 bytes. Consequently, using the VEX encoding for these instructions results in a code size reduction of ~2 bytes even though it is compiled with the AVX-512 features enabled.

Reviewers: Craig Topper, Zvi Rackoover, Elena Demikhovsky 
Differential Revision: https://reviews.llvm.org/D27901

llvm-svn: 290663
2016-12-28 10:12:48 +00:00
Craig Topper d4091494d3 [X86] Invert an 'if' and early out to fix a weird indentation. NFCI
llvm-svn: 287909
2016-11-25 02:29:24 +00:00
Craig Topper a46936185a [X86] Size a SmallVector to the worst case mask size for a 512-bit shuffle. NFCI
llvm-svn: 287908
2016-11-25 02:29:21 +00:00
Kuba Mracek 06995e866b [xray] Add XRay support for Mach-O in CodeGen
Currently, XRay only supports emitting the XRay table (xray_instr_map) on ELF binaries. Let's add Mach-O support.

Differential Revision: https://reviews.llvm.org/D26983

llvm-svn: 287734
2016-11-23 02:07:04 +00:00
Simon Pilgrim ca3072ac58 [X86][AVX512] Add mask/maskz writemask support to constant pool shuffle decode commentx
llvm-svn: 284488
2016-10-18 15:45:37 +00:00
Craig Topper 175a415e78 [AVX-512] Add support for decoding shuffle mask from constant pool for masked VPERMILPS/PD.
llvm-svn: 284450
2016-10-18 03:36:52 +00:00
Craig Topper 1f5178ff9f [X86] Fix shuffle decoding assertions to print the right number of required operands. Update the checks themselves to be >= to the same number instead of > one less than the required number.
llvm-svn: 284365
2016-10-17 06:41:18 +00:00
Hans Wennborg c4b1d20ba2 Win64: Don't emit unwind info for "leaf" functions (PR30337)
According to MSDN (see the PR), functions which don't touch any callee-saved
registers (including %rsp) don't need any unwind info.

This patch makes LLVM not emit unwind info for such functions, to save
binary size.

Differential Revision: https://reviews.llvm.org/D24748

llvm-svn: 282185
2016-09-22 19:50:05 +00:00
Dean Michael Berris 4640154446 [XRay] ARM 32-bit no-Thumb support in LLVM
This is a port of XRay to ARM 32-bit, without Thumb support yet. The XRay instrumentation support is moving up to AsmPrinter.
This is one of 3 commits to different repositories of XRay ARM port. The other 2 are:

https://reviews.llvm.org/D23932 (Clang test)
https://reviews.llvm.org/D23933 (compiler-rt)

Differential Revision: https://reviews.llvm.org/D23931

llvm-svn: 281878
2016-09-19 00:54:35 +00:00
Eric Christopher a808f2981e Remove unused function getMang().
llvm-svn: 281707
2016-09-16 07:32:58 +00:00
Hans Wennborg 6ecf619be9 X86: Fold tail calls into conditional branches also for 64-bit (PR26302)
This extends the optimization in r280832 to also work for 64-bit. The only
quirk is that we can't do this for 64-bit Windows (yet).

Differential Revision: https://reviews.llvm.org/D24423

llvm-svn: 281113
2016-09-09 22:37:27 +00:00
Hans Wennborg c39ef776fc Win64: Don't use REX prefix for direct tail calls
The REX prefix should be used on indirect jmps, but not direct ones.
For direct jumps, the unwinder looks at the offset to determine if
it's inside the current function.

Differential Revision: https://reviews.llvm.org/D24359

llvm-svn: 281003
2016-09-08 23:35:10 +00:00
Renato Golin 049f387112 Revert "[XRay] ARM 32-bit no-Thumb support in LLVM"
And associated commits, as they broke the Thumb bots.

This reverts commit r280935.
This reverts commit r280891.
This reverts commit r280888.

llvm-svn: 280967
2016-09-08 17:10:39 +00:00
Dean Michael Berris 17d94e279e [XRay] ARM 32-bit no-Thumb support in LLVM
This is a port of XRay to ARM 32-bit, without Thumb support yet. The XRay instrumentation support is moving up to AsmPrinter.
This is one of 3 commits to different repositories of XRay ARM port. The other 2 are:

1. https://reviews.llvm.org/D23932 (Clang test)
2. https://reviews.llvm.org/D23933 (compiler-rt)

Differential Revision: https://reviews.llvm.org/D23931

llvm-svn: 280888
2016-09-08 00:19:04 +00:00
Hans Wennborg 75e25f6812 X86: Fold tail calls into conditional branches where possible (PR26302)
When branching to a block that immediately tail calls, it is possible to fold
the call directly into the branch if the call is direct and there is no stack
adjustment, saving one byte.

Example:

  define void @f(i32 %x, i32 %y) {
  entry:
    %p = icmp eq i32 %x, %y
    br i1 %p, label %bb1, label %bb2
  bb1:
    tail call void @foo()
    ret void
  bb2:
    tail call void @bar()
    ret void
  }

before:

  f:
          movl    4(%esp), %eax
          cmpl    8(%esp), %eax
          jne     .LBB0_2
          jmp     foo
  .LBB0_2:
          jmp     bar

after:

  f:
          movl    4(%esp), %eax
          cmpl    8(%esp), %eax
          jne     bar
  .LBB0_1:
          jmp     foo

I don't expect any significant size savings from this (on a Clang bootstrap I
saw 288 bytes), but it does make the code a little tighter.

This patch only does 32-bit, but 64-bit would work similarly.

Differential Revision: https://reviews.llvm.org/D24108

llvm-svn: 280832
2016-09-07 17:52:14 +00:00
Dean Michael Berris e8ae5baaf7 [XRay] Detect and emit sleds for sibling/tail calls
Summary:
This change promotes the 'isTailCall(...)' member function to
TargetInstrInfo as a query interface for determining on a per-target
basis whether a given MachineInstr is a tail call instruction. We build
upon this in the XRay instrumentation pass to emit special sleds for
tail call optimisations, where we emit the correct kind of sled.

The tail call sleds look like a mix between the function entry and
function exit sleds. Form-wise, the sled comes before the "jmp"
instruction that implements the tail call similar to how we do it for
the function entry sled. Functionally, because we know this is a tail
call, it behaves much like an exit sled -- i.e. at runtime we may use
the exit trampolines instead of a different kind of trampoline.

A follow-up change to recognise these sleds will be done in compiler-rt,
so that we can start intercepting these initially as exits, but also
have the option to have different log entries to more accurately reflect
that this is actually a tail call.

Reviewers: echristo, rSerge, majnemer

Subscribers: mehdi_amini, dberris, llvm-commits

Differential Revision: https://reviews.llvm.org/D23986

llvm-svn: 280334
2016-09-01 01:29:13 +00:00
Philip Reames e83c4b30ca [stackmaps] More extraction of common code [NFCI]
General cleanup before starting to work on the part I want to actually change.

llvm-svn: 279586
2016-08-23 23:33:29 +00:00
Dean Michael Berris 1dd1ca9727 [XRay] Synthesize a reference to the xray_instr_map
Without the synthesized reference to a symbol in the xray_instr_map,
linker section garbage collection will helpfully remove the whole
xray_instr_map section from the final executable (or archive). This will
cause the runtime to not be able to identify the sleds and hot-patch the
calls/jumps into the runtime trampolines.

This change adds a reference from the text section at the end of the
function to keep around the associated xray_instr_map section as well.

We also make sure that we catch this reference in the test.

Reviewers: chandlerc, echristo, majnemer, mehdi_amini

Subscribers: mehdi_amini, llvm-commits, dberris

Differential Revision: https://reviews.llvm.org/D23398

llvm-svn: 279204
2016-08-19 04:44:30 +00:00
Dean Michael Berris 3a25d84a51 [XRay] Test for xray_instr_map in object file. (NFC)
This makes a trivial change in the emission of the per-function XRay
tables, and makes sure that the xray_instr_map section does show up in
the object file.

llvm-svn: 278113
2016-08-09 10:42:11 +00:00
Dean Michael Berris 7e9abea2ae [XRay] Align entry and return sleds to 2 byte boundaries
This should ensure that we can atomically write two bytes (on top of the
retq and the one past it) and have those two bytes not straddle cache
lines.

We also move the label past the alignment instruction so that we can refer
to the actual first instruction, as opposed to potential padding before the
aligned instruction.

Update the tests to allow us to reflect the new order of assembly.

Reviewers: rSerge, echristo, majnemer

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23101

llvm-svn: 277701
2016-08-04 07:37:28 +00:00
Dean Michael Berris 0b8f6c8777 [XRay] Make the xray_instr_map section specification more correct
Summary:
We also add a test to show what currently happens when we create a
section per function and emit an xray_instr_map. This illustrates the
relationship (or lack thereof) between the per-function section and the
xray_instr_map section.

We also change the code generation slightly so that we don't always
create group sections, but rather only do so if a function where the
table is associated with is in a group.

Also in this change:

  - Remove the "merge" flag on the xray_instr_map section.
  - Test that we're generating the right table for comdat and non-comdat functions.

Reviewers: echristo, majnemer

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23104

llvm-svn: 277580
2016-08-03 07:21:55 +00:00
Dean Michael Berris 52735fc435 XRay: Add entry and exit sleds
Summary:
In this patch we implement the following parts of XRay:

- Supporting a function attribute named 'function-instrument' which currently only supports 'xray-always'. We should be able to use this attribute for other instrumentation approaches.
- Supporting a function attribute named 'xray-instruction-threshold' used to determine whether a function is instrumented with a minimum number of instructions (IR instruction counts).
- X86-specific nop sleds as described in the white paper.
- A machine function pass that adds the different instrumentation marker instructions at a very late stage.
- A way of identifying which return opcode is considered "normal" for each architecture.

There are some caveats here:

1) We don't handle PATCHABLE_RET in platforms other than x86_64 yet -- this means if IR used PATCHABLE_RET directly instead of a normal ret, instruction lowering for that platform might do the wrong thing. We think this should be handled at instruction selection time to by default be unpacked for platforms where XRay is not availble yet.

2) The generated section for X86 is different from what is described from the white paper for the sole reason that LLVM allows us to do this neatly. We're taking the opportunity to deviate from the white paper from this perspective to allow us to get richer information from the runtime library.

Reviewers: sanjoy, eugenis, kcc, pcc, echristo, rnk

Subscribers: niravd, majnemer, atrick, rnk, emaste, bmakam, mcrosier, mehdi_amini, llvm-commits

Differential Revision: http://reviews.llvm.org/D19904

llvm-svn: 275367
2016-07-14 04:06:33 +00:00
Simon Pilgrim a99368fa35 [X86][AVX512] Add support for VPERMILPD/VPERMILPS variable shuffle mask comments
llvm-svn: 275272
2016-07-13 15:45:36 +00:00
Duncan P. N. Exon Smith 7b4c18e8f3 X86: Avoid implicit iterator conversions, NFC
Avoid implicit conversions from MachineInstrBundleIterator to
MachineInstr*, mainly by preferring MachineInstr& over MachineInstr* and
using range-based for loops.

llvm-svn: 275149
2016-07-12 03:18:50 +00:00
Rafael Espindola a99ccfce1a Drop support for creating $stubs.
They are created by ld64 since OS X 10.5.

llvm-svn: 274130
2016-06-29 14:59:50 +00:00
Joerg Sonnenberger 2298203056 doesSetDirectiveSuppressesReloc -> doesSetDirectiveSuppressReloc, the
former is grammatically incorrect.

llvm-svn: 273100
2016-06-18 23:25:37 +00:00
Benjamin Kramer 46e38f3678 Avoid copies of std::strings and APInt/APFloats where we only read from it
As suggested by clang-tidy's performance-unnecessary-copy-initialization.
This can easily hit lifetime issues, so I audited every change and ran the
tests under asan, which came back clean.

llvm-svn: 272126
2016-06-08 10:01:20 +00:00
Simon Pilgrim 2ead861d07 [X86][XOP] Added VPERMIL2PD/VPERMIL2PS shuffle mask comment decoding
llvm-svn: 271809
2016-06-04 21:44:28 +00:00
Rafael Espindola 712f957cae Simplify handling of hidden stub.
Since r207518 they are printed exactly like non-hidden stubs on x86 and
since r207517 on ARM.

This means we can use a single set for all stubs in those platforms.

llvm-svn: 269776
2016-05-17 16:01:32 +00:00
Simon Pilgrim af742d51ad [X86][SSE] Added TODO comment to add support for AVX512 mask registers to shuffle comments
This came up in discussion on D19198

llvm-svn: 268915
2016-05-09 13:30:16 +00:00
Quentin Colombet 4e1d389ac5 [X86] Model FAULTING_LOAD_OP as a terminator and branch.
This operation may branch to the handler block and we do not want it
to happen anywhere within the basic block.
Moreover, by marking it "terminator and branch" the machine verifier
does not wrongly assume (because of AnalyzeBranch not knowing better)
the branch is analyzable. Indeed, the target was seeing only the
unconditional branch and not the faulting load op and thought it was
a simple unconditional block.
The machine verifier was complaining because of that and moreover,
other optimizations could have done wrong transformation!

In the process, simplify the representation of the handler block in
the faulting load op. Now, we directly reference the handler block
instead of using a label. This has the benefits of:
1. MC knows how to issue a label for a BB, so leave that to it.
2. Accessing the target BB from its label is painful, whereas it is
   direct from a MBB operand.

Note: The 2 bytes offset in implicit-null-check.ll comes from the
fact the unconditional jumps are not removed anymore, as the whole
terminator sequence is not analyzable anymore.

Will fix it in a subsequence commit.

llvm-svn: 268327
2016-05-02 22:58:54 +00:00
Craig Topper 184310d6a9 [X86] Use nested switches to vary the operand to helper functions that were previously called in multiple cases. This seems to help the inliner reduce code. NFC
llvm-svn: 267964
2016-04-29 00:51:30 +00:00
Davide Italiano bf4df85ba7 [MC] Silence warning due to unused variable in !Debug builds.
llvm-svn: 266901
2016-04-20 18:45:31 +00:00
Davide Italiano 8a8f24b098 [MC] EmitNop: Make an assertion more useful.
Differential Revision:  http://reviews.llvm.org/D19334

llvm-svn: 266895
2016-04-20 17:53:21 +00:00
Sanjoy Das 2effffd456 [X86] Simplify StackMapShadowTracker; NFC
- Elide trivial contructor and desctructor
 - Move implementation out of an unnecessary explicit llvm namespace
   scope

llvm-svn: 266794
2016-04-19 18:48:16 +00:00
Sanjoy Das 6ecfae61dc [X86MCInstLower] Clean up EmitNops; NFC
Instead of having a conditional assert inside EmitNops, refactor so that
the caller can have the assert instead.

llvm-svn: 266793
2016-04-19 18:48:13 +00:00
Sanjoy Das c0441c29df Introduce a "patchable-function" function attribute
Summary:
The `"patchable-function"` attribute can be used by an LLVM client to
influence LLVM's code generation in ways that makes the generated code
easily patchable at runtime (for instance, to redirect control).
Right now only one patchability scheme is supported,
`"prologue-short-redirect"`, but this can be expanded in the future.

Reviewers: joker.eph, rnk, echristo, dberris

Subscribers: joker.eph, echristo, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19046

llvm-svn: 266715
2016-04-19 05:24:47 +00:00
Simon Pilgrim 1cc5712763 [X86][XOP] Support for VPPERM 2-input shuffle mask decoding
This patch adds support for decoding XOP VPPERM instruction when it represents a basic shuffle.

The mask decoding required the existing MCInstrLowering code to be updated to support binary shuffles - the implementation now matches what is done in X86InstrComments.cpp.

Differential Revision: http://reviews.llvm.org/D18441

llvm-svn: 265874
2016-04-09 14:51:26 +00:00
Duncan P. N. Exon Smith e9bc579c37 ADT: Remove == and != comparisons between ilist iterators and pointers
I missed == and != when I removed implicit conversions between iterators
and pointers in r252380 since they were defined outside ilist_iterator.

Since they depend on getNodePtrUnchecked(), they indirectly rely on UB.
This commit removes all uses of these operators.  (I'll delete the
operators themselves in a separate commit so that it can be easily
reverted if necessary.)

There should be NFC here.

llvm-svn: 261498
2016-02-21 20:39:50 +00:00
Craig Topper e00bffbc13 [X86] Make MOV32ri64 a post-RA pseudo instead of a CodeGenOnly instruction. It was only needed for rematerialization.
llvm-svn: 256818
2016-01-05 07:44:14 +00:00
Craig Topper 69653af748 [X86] Move shuffle decoding for constant pool into the X86CodeGen library to remove a layering violation in the Util library.
llvm-svn: 256680
2015-12-31 22:40:45 +00:00
Craig Topper 7e3ba15529 [X86] Add support for printing shuffle comments for AVX512 PSHUFB instructions.
llvm-svn: 256452
2015-12-26 19:48:43 +00:00