Commit Graph

13337 Commits

Author SHA1 Message Date
JF Bastien d7fcc6f9c7 WebAssembly: handle unused function arguments.
Subscribers: llvm-commits, sunfish, jfb

Differential Revision: http://reviews.llvm.org/D11684

llvm-svn: 243770
2015-07-31 18:13:27 +00:00
JF Bastien 600aee9805 WebAssembly: print basic integer assembly.
Summary:
This prints assembly for int32 integer operations defined in WebAssemblyInstrInteger.td only, with major caveats:

  - The operation names are currently incorrect.
  - Other integer and floating-point types will be added later.
  - The printer isn't factored out to handle recursive AST code yet, since it can't even handle control flow anyways.
  - The assembly format isn't full s-expressions yet either, this will be added later.
  - This currently disables PrologEpilogCodeInserter as well as MachineCopyPropagation becasue they don't like virtual registers, which WebAssembly likes quite a bit. This will be fixed by factoring out NVPTX's change (currently a fork of PrologEpilogCodeInserter).

Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11671

llvm-svn: 243763
2015-07-31 17:53:38 +00:00
Sanjay Patel 9ff4626028 [x86] reassociate integer multiplies using machine combiner pass
Add i16, i32, i64 imul machine instructions to the list of reassociation
candidates.

A new bit of logic is needed to handle integer instructions: they have an
implicit EFLAGS operand, so we have to make sure it's dead in order to do
any reassociation with integer ops.

Differential Revision: http://reviews.llvm.org/D11660

llvm-svn: 243756
2015-07-31 16:21:55 +00:00
Geoff Berry 8a7ef3b2ee [AArch64] Favor extended reg patterns for sub
Summary:
Favor the extended reg patterns over the shifted reg patterns that match
only the operand shift and not the full sign/zero extend and shift.

Reviewers: jmolloy, t.p.northover

Subscribers: mcrosier, aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D11569

llvm-svn: 243753
2015-07-31 15:55:54 +00:00
Matt Arsenault e1ce344b5a AMDGPU: Fix v16i32 to v16i8 truncstore
llvm-svn: 243731
2015-07-31 04:12:04 +00:00
Sumanth Gundapaneni 532a13691c [ARM] Lower modulo operation to generate __aeabi_divmod on Android
For a modulo (reminder) operation,
clang -target armv7-none-linux-gnueabi generates "__modsi3"
clang -target armv7-none-eabi generates "__aeabi_idivmod"
clang -target armv7-linux-androideabi generates "__modsi3"
Android bionic libc doesn't provide a __modsi3, instead it provides a
"__aeabi_idivmod". This patch fixes the LLVM ARMISelLowering to generate
the correct call when ever there is a modulo operation.

Differential Revision: http://reviews.llvm.org/D11661

llvm-svn: 243717
2015-07-31 00:45:12 +00:00
Alex Lorenz 60bf599607 MIR Parser: Report an error when a constant pool item is redefined.
llvm-svn: 243696
2015-07-30 22:00:17 +00:00
Alex Lorenz a06c0c6401 MIR Parser: Report an error when a virtual register is redefined.
llvm-svn: 243695
2015-07-30 21:54:10 +00:00
Sanjay Patel 1166f2ff9f fix memcpy/memset/memmove lowering when optimizing for size
Fixing MinSize attribute handling was discussed in D11363. 
This is a prerequisite patch to doing that.

The handling of OptSize when lowering mem* functions was broken
on Darwin because it wants to ignore -Os for these cases, but the
existing logic also made it ignore -Oz (MinSize).

The Linux change demonstrates a widespread problem. The backend
doesn't usually recognize the MinSize attribute by itself; it
assumes that if the MinSize attribute exists, then the OptSize 
attribute must also exist. 

Fixing this more generally will be a follow-on patch or two.

Differential Revision: http://reviews.llvm.org/D11568

llvm-svn: 243693
2015-07-30 21:41:50 +00:00
Alex Lorenz 618b283cd9 MIR Serialization: Serialize the machine basic block's successor weights.
Reviewers: Duncan P. N. Exon Smith
llvm-svn: 243659
2015-07-30 16:54:38 +00:00
Vasileios Kalintiris 2041b1dd0b [mips][FastISel] Remove hidden mips-fast-isel option.
Summary:
This hidden option would disable code generation through FastISel by
default. It was removed from the available options and from the
Fast-ISel tests that required it in order to run the tests.

Reviewers: dsanders

Subscribers: qcolombet, llvm-commits

Differential Revision: http://reviews.llvm.org/D11610

llvm-svn: 243638
2015-07-30 12:39:33 +00:00
Vasileios Kalintiris 77fb0a3dcf [mips][FastISel] Apply only zero-extension to constants prior to their materialization.
Summary:
Previously, we would sign-extend non-boolean negative constants and
zero-extend otherwise. This was problematic for PHI instructions with
negative values that had a type with bitwidth less than that of the
register used for materialization.

More specifically, ComputePHILiveOutRegInfo() assumes the constants
present in a PHI node are zero extended in their container and
afterwards deduces the known bits.

For example, previously we would materialize an i16 -4 with the
following instruction:

  addiu $r, $zero, -4

The register would end-up with the 32-bit 2's complement representation
of -4. However, ComputePHILiveOutRegInfo() would generate a constant
with the upper 16-bits set to zero. The SelectionDAG builder would use
that information to generate an AssertZero node that would remove any
subsequent trunc & zero_extend nodes.

In theory, we should modify ComputePHILiveOutRegInfo() to consult
target-specific hooks about the way they prefer to materialize the
given constants. However, git-blame reports that this specific code
has not been touched since 2011 and it seems to be working well for every
target so far.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11592

llvm-svn: 243636
2015-07-30 11:51:44 +00:00
Nick Lewycky c3890d2969 Fix typo "fuction" noticed in comments in AssumptionCache.h, and also all the other files that have the same typo. All comments, no functionality change! (Merely a "fuctionality" change.)
Bonus change to remove emacs major mode marker from SystemZMachineFunctionInfo.cpp because emacs already knows it's C++ from the extension. Also fix typo "appeary" in AMDGPUMCAsmInfo.h.

llvm-svn: 243585
2015-07-29 22:32:47 +00:00
Simon Pilgrim ba10f76705 [X86][SSE] Keep 32-bit target i64 vector shifts on SSE unit.
This patch improves the 32-bit target i64 constant matching to detect the shuffle vector splats that are introduced by i64 vector shift vectorization (D8416).

Differential Revision: http://reviews.llvm.org/D11327

llvm-svn: 243577
2015-07-29 21:44:27 +00:00
Tim Northover 2a9d801fd5 AArch64: use 32-bit MOV rather than UBFX to truncate registers.
It's potentially more efficient on Cyclone, and from the optimization guides &
schedulers looks like it has no effect on Cortex-A53 or A57. In general you'd
expect a MOV to be about the most efficient instruction with its semantics,
even though the official "UXTW" alias is really a UBFX.

llvm-svn: 243576
2015-07-29 21:34:32 +00:00
Alex Lorenz a6f9a37d92 MIR Serialization: Serialize the frame info's save and restore points.
This commit serializes the save and restore machine basic block references from
the machine frame information class.

llvm-svn: 243575
2015-07-29 21:09:09 +00:00
Simon Pilgrim 86478c6909 [X86][SSE] Vectorize i64 ASHR operations
This patch vectorizes the v2i64/v4i64 ASHR shift operations - the last remaining integer vector shifts that are still being transferred to/from the scalar unit to be completed.

Differential Revision: http://reviews.llvm.org/D11439

llvm-svn: 243569
2015-07-29 20:31:45 +00:00
Jingyue Wu 3a04dc6e78 Roll forward r242871
r242871 missed one place that should be guarded with isPhysicalReg. This patch
fixes that.

llvm-svn: 243555
2015-07-29 18:59:09 +00:00
Alex Lorenz b139323f21 MIR Serialization: Serialize the '.cfi_def_cfa' CFI instruction.
llvm-svn: 243554
2015-07-29 18:57:23 +00:00
Alex Lorenz fbe9c04c5f MIR Parser: Parse multiple LHS register machine operands.
llvm-svn: 243553
2015-07-29 18:51:21 +00:00
Bruno Cardoso Lopes 38c0250679 Revert "[PeepholeOptimizer] Look through PHIs to find additional register sources"
Reported to Broke some internal tests: PR24303

This reverts commit r243486.

llvm-svn: 243540
2015-07-29 17:46:47 +00:00
Jingyue Wu 7ec38530a5 Temporarily revert r242871
PR24299

llvm-svn: 243522
2015-07-29 15:26:11 +00:00
Bill Schmidt 42ddd71120 [PPC] Fix PR24216: Don't generate splat for misaligned shuffle mask
Given certain shuffle-vector masks, LLVM emits splat instructions
which splat the wrong bytes from the source register.  The issue is
that the function PPC::isSplatShuffleMask() in PPCISelLowering.cpp
does not ensure that the splat pattern found is requesting bytes that
are aligned on an EltSize boundary.  This patch detects this situation
as not a valid splat mask, resulting in a permute being generated
instead of a splat.

Patch and test case by Tyler Kenney, cleaned up a bit by me.

This is a simple bug fix that would be good to incorporate into 3.7.

llvm-svn: 243519
2015-07-29 14:31:57 +00:00
Akira Hatanaka f53b0403f8 [AArch64] Define subtarget feature strict-align.
This commit defines subtarget feature strict-align and uses it instead of
cl::opt -aarch64-strict-align to decide whether strict alignment should be
forced.

rdar://problem/21529937

llvm-svn: 243516
2015-07-29 14:17:26 +00:00
Sanjay Patel 133e68b45c ignore duplicate divisor uses when transforming into reciprocal multiplies (PR24141)
PR24141: https://llvm.org/bugs/show_bug.cgi?id=24141
contains a test case where we have duplicate entries in a node's uses() list.

After r241826, we use CombineTo() to delete dead nodes when combining the uses into
reciprocal multiplies, but this fails if we encounter the just-deleted node again in
the list.

The solution in this patch is to not add duplicate entries to the list of users that
we will subsequently iterate over. For the test case, this avoids triggering the
combine divisors logic entirely because there really is only one user of the divisor.

Differential Revision: http://reviews.llvm.org/D11345

llvm-svn: 243500
2015-07-28 23:28:22 +00:00
Alex Lorenz ef5c196fb0 MIR Serialization: Serialize the target index machine operands.
Reviewers: Duncan P. N. Exon Smith
llvm-svn: 243497
2015-07-28 23:02:45 +00:00
Akira Hatanaka 2670f4a550 [ARM] Define subtarget feature strict-align.
This commit defines subtarget feature strict-align and uses it instead of
cl::opt -arm-strict-align to decide whether strict alignment should be
forced. Also, remove the logic that was checking the OS and architecture
as clang is now responsible for setting strict-align based on the command
line options specified and the target architecute and OS.

rdar://problem/21529937

http://reviews.llvm.org/D11470

llvm-svn: 243493
2015-07-28 22:44:28 +00:00
Tim Northover 17ae83a25f AArch64: be careful of large immediates when optimising cmps.
llvm-svn: 243492
2015-07-28 22:42:32 +00:00
Bruno Cardoso Lopes 3c235763e5 [PeepholeOptimizer] Look through PHIs to find additional register sources
Reapply 243271 with more fixes; although we are not handling multiple
sources with coalescable copies, we were not properly skipping this
case.

- Teaches the ValueTracker in the PeepholeOptimizer to look through PHI
instructions.
- Add findNextSourceAndRewritePHI method to lookup into multiple sources
returnted by the ValueTracker and rewrite PHIs with new sources.

With these changes we can find more register sources and rewrite more
copies to allow coaslescing of bitcast instructions. Hence, we eliminate
unnecessary VR64 <-> GR64 copies in x86, but it could be extended to
other archs by marking "isBitcast" on target specific instructions. The
x86 example follows:

A:
  psllq %mm1, %mm0
  movd  %mm0, %r9
  jmp C

B:
  por %mm1, %mm0
  movd  %mm0, %r9
  jmp C

C:
  movd  %r9, %mm0
  pshufw  $238, %mm0, %mm0

Becomes:

A:
  psllq %mm1, %mm0
  jmp C

B:
  por %mm1, %mm0
  jmp C

C:
  pshufw  $238, %mm0, %mm0

Differential Revision: http://reviews.llvm.org/D11197
rdar://problem/20404526

llvm-svn: 243486
2015-07-28 21:45:50 +00:00
Vasileios Kalintiris 9876946aee [mips][FastISel] Fix call lowering by bailing out on "fastcc" calls.
Summary:
Currently, we support only the MIPS O32 ABI calling convention for call
lowering. With this change we avoid using the O32 calling convetion for
lowering calls marked as using the fast calling convention.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11515

llvm-svn: 243485
2015-07-28 21:43:31 +00:00
Chih-Hung Hsieh 41169c5487 Fix typo.
llvm-svn: 243475
2015-07-28 20:38:29 +00:00
Chih-Hung Hsieh c5e53ca1b7 Limit this test only on linux.
Differential Revision: http://reviews.llvm.org/D10522

llvm-svn: 243474
2015-07-28 20:31:10 +00:00
Vasileios Kalintiris 9ec6114860 [mips][FastISel] Fix generated code for IR's select instruction.
Summary:
Generate correct code for the select instruction by zero-extending
it's boolean/condition operand to GPR-width. This is necessary because
the conditional-move instructions operate on the whole register.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11506

llvm-svn: 243469
2015-07-28 19:57:25 +00:00
Matt Arsenault 7227cc1a48 AMDGPU: Don't try to use LDS/vector for private if pointer value stored
If the pointer is the store's value operand, this would produce
a broken module. Make sure the use is actually for the pointer operand.

llvm-svn: 243462
2015-07-28 18:47:00 +00:00
Matt Arsenault fdcd39a8ad AMDGPU: Fix crash if called function is a bitcast
getCalledFunction() is null, so this would crash. Replace
crash with an error on unsupported call.

llvm-svn: 243461
2015-07-28 18:29:14 +00:00
Alex Lorenz 305e8f6312 Add a test case for r242191 ([MMX] Use the appropriate instructions for
GR64 <-> VR64 copies).

This commit adds a MIR test case for the commit r242191, which was committed
without one. This test case verifies that the ExpandPostRA pass expands the
GR64 <-> VR64 copies into the appropriate MMX_MOV instructions.

llvm-svn: 243457
2015-07-28 17:52:59 +00:00
Chih-Hung Hsieh 9843f406ec Move unit tests to target specific directories.
Differential Revision: http://reviews.llvm.org/D10522

llvm-svn: 243454
2015-07-28 17:32:49 +00:00
Alex Lorenz deb534907e MIR Serialization: Serialize the block address machine operands.
llvm-svn: 243453
2015-07-28 17:28:03 +00:00
Sanjay Patel 94a7433cde add tests to show broken current behavior of minsize attribute
llvm-svn: 243451
2015-07-28 17:18:25 +00:00
Chih-Hung Hsieh 1e859582d6 Implement target independent TLS compatible with glibc's emutls.c.
The 'common' section TLS is not implemented.
Current C/C++ TLS variables are not placed in common section.
DWARF debug info to get the address of TLS variables is not generated yet.

clang and driver changes in http://reviews.llvm.org/D10524

  Added -femulated-tls flag to select the emulated TLS model,
  which will be used for old targets like Android that do not
  support ELF TLS models.

Added TargetLowering::LowerToTLSEmulatedModel as a target-independent
function to convert a SDNode of TLS variable address to a function call
to __emutls_get_address.

Added into lib/Target/*/*ISelLowering.cpp to call LowerToTLSEmulatedModel
for TLSModel::Emulated. Although all targets supporting ELF TLS models are
enhanced, emulated TLS model has been tested only for Android ELF targets.
Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for
emulated TLS variables.
Modified DwarfCompileUnit.cpp to skip some DIE for emulated TLS variabls.

TODO: Add proper DIE for emulated TLS variables.
      Added new unit tests with emulated TLS.

Differential Revision: http://reviews.llvm.org/D10522

llvm-svn: 243438
2015-07-28 16:24:05 +00:00
Geoff Berry c573bf7a5f [AArch64] Match float round and convert to int instructions.
Summary:
Add patterns for doing floating point round with various rounding modes
followed by conversion to int as a single FCVT* instruction.

Reviewers: t.p.northover, jmolloy

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D11424

llvm-svn: 243422
2015-07-28 15:24:10 +00:00
Adhemerval Zanella 7bc3319d84 Implement __builtin_thread_pointer
This path add the aarch64 lowering of __builtin_thread_pointer.  It uses
the already implemented AArch64ISD::THREAD_POINTER used in TLS generation.

llvm-svn: 243412
2015-07-28 13:03:31 +00:00
Simon Pilgrim df984f58ad [X86][SSE] Use bitmasks instead of shuffles where possible.
VPAND is a lot faster than VPSHUFB and VPBLENDVB - this patch ensures we attempt to lower to a basic bitmask before lowering to the slower byte shuffle/blend instructions.

Split off from D11518.

Differential Revision: http://reviews.llvm.org/D11541

llvm-svn: 243395
2015-07-28 08:54:41 +00:00
Igor Breger 8352a0ddf2 AVX512: Implemented encoding and intrinsics for VGETEXPSS/D instructions
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11528

llvm-svn: 243390
2015-07-28 06:53:28 +00:00
Sanjay Patel 8c13e3680d fix invalid load folding with SSE/AVX FP logical instructions (PR22371)
This is a follow-up to the FIXME that was added with D7474 ( http://reviews.llvm.org/rL229531 ).
I thought this load folding bug had been made hard-to-hit, but it turns out to be very easy
when targeting 32-bit x86 and causes a miscompile/crash in Wine:
https://bugs.winehq.org/show_bug.cgi?id=38826
https://llvm.org/bugs/show_bug.cgi?id=22371#c25

The quick fix is to simply remove the scalar FP logical instructions from the load folding table
in X86InstrInfo, but that causes us to miss load folds that should be possible when lowering fabs,
fneg, fcopysign. So the majority of this patch is altering those lowerings to use *vector* FP
logical instructions (because that's all x86 gives us anyway). That lets us do the load folding 
legally.

Differential Revision: http://reviews.llvm.org/D11477

llvm-svn: 243361
2015-07-28 00:48:32 +00:00
JF Bastien 088c47ee5b WebAssembly: add a generic CPU
Summary: WebAssemblySubtarget.cpp expects a default 'generic' CPU to exist, and this seems to be prevalent with other targets. It makes sense to have something between MVP and bleeding-edge, even though for now it's the same as MVP. This removes a warning that's currently generated.

Subscribers: jfb, llvm-commits, sunfish

Differential Revision: http://reviews.llvm.org/D11546

llvm-svn: 243345
2015-07-27 23:25:54 +00:00
NAKAMURA Takumi 69ed7170dc Tweak llvm/test/CodeGen/X86/virtual-registers-cleared-in-machine-functions-liveins.ll not to fail for targeting win32.
llvm-svn: 243341
2015-07-27 23:01:41 +00:00
Alex Lorenz 8a1915b04e MIR Serialization: Serialize the unnamed basic block references.
This commit serializes the references from the machine basic blocks to the
unnamed basic blocks.

This commit adds a new attribute to the machine basic block's YAML mapping
called 'ir-block'. This attribute contains the actual reference to the
basic block.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 243340
2015-07-27 22:42:41 +00:00
Simon Pilgrim c363e7d8e0 [X86][SSE] Added shuffle tests to demonstrate missed bitmask.
llvm-svn: 243324
2015-07-27 20:41:57 +00:00
Alex Lorenz 5b0d5f6f26 MIR Serialization: Serialize the '.cfi_def_cfa_register' CFI instruction.
llvm-svn: 243322
2015-07-27 20:39:03 +00:00