Commit Graph

34753 Commits

Author SHA1 Message Date
Matt Arsenault 2ea0a23f18 AMDGPU: Print modifiers when dumping AMDGPUOperand
llvm-svn: 251160
2015-10-24 00:12:56 +00:00
Reid Kleckner f02e33ce42 [X86] Clean up the tail call eligibility logic
Summary:
The logic here isn't straightforward because our support for
TargetOptions::GuaranteedTailCallOpt.

Also fix a bug where we were allowing tail calls to cdecl functions from
fastcall and vectorcall functions. We were special casing thiscall and
stdcall callers rather than checking for any convention that requires
clearing stack arguments before returning.

Reviewers: hans

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14024

llvm-svn: 251137
2015-10-23 19:35:38 +00:00
Matt Arsenault 382557ec72 AMDGPU: Fix parsing of 32-bit literals with sign bit set
llvm-svn: 251132
2015-10-23 18:07:58 +00:00
Artyom Skrobov 5a6e39454e [ARM] Renaming +t2dsp feature into +dsp, as discussed on llvm-dev
llvm-svn: 251125
2015-10-23 17:19:19 +00:00
Oleg Ranevskyy 6389dd9fa2 [ARM CodeGen] @llvm.debugtrap call may be removed when restoring callee saved registers
Summary:
When ARMFrameLowering::emitPopInst generates a "pop" instruction to restore the callee saved registers, it checks if the LR register is among them. If so, the function may decide to remove the basic block's terminator and replace it with a "pop" to the PC register instead of LR.

This leads to a problem when the block's terminator is preceded by a "llvm.debugtrap" call. The MI iterator points to the trap in such a case, which is also a terminator. If the function decides to restore LR to PC, it erroneously removes the trap.

Reviewers: asl, rengolin

Subscribers: aemerson, jfb, rengolin, dschuff, llvm-commits

Differential Revision: http://reviews.llvm.org/D13672

llvm-svn: 251123
2015-10-23 17:17:59 +00:00
Oleg Ranevskyy 5f78c5c293 Test commit: fix typo in comment.
llvm-svn: 251122
2015-10-23 17:10:44 +00:00
Joseph Tremoulet 3d0fbf1d74 [CodeGen] Mark setjmp/catchret MBBs address-taken
Summary:
This ensures that BranchFolding (and similar) won't remove these blocks.

Also allow AsmPrinter::EmitBasicBlockStart to process MBBs which are
address-taken but do not have BBs that are address-taken, since otherwise
its call to getAddrLabelSymbolTableToEmit would fail an assertion on such
blocks.  I audited the other callers of getAddrLabelSymbolTableToEmit
(and getAddrLabelSymbol); they all have BBs known to be address-taken
except for the call through getAddrLabelSymbol from
WinException::create32bitRef; that call is actually now unreachable, so
I've removed it and updated the signature of create32bitRef.

This fixes PR25168.

Reviewers: majnemer, andrew.w.kaylor, rnk

Subscribers: pgavlin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13774

llvm-svn: 251113
2015-10-23 15:06:05 +00:00
James Molloy 5b18b4ce96 Revert "[AArch64]Merge halfword loads into a 32-bit load"
This reverts commit r250719. This introduced a codegen fault in SPEC2000.gcc, when compiled for Cortex-A53.

llvm-svn: 251108
2015-10-23 10:41:38 +00:00
Zlatko Buljan 2cf61020b8 [mips][microMIPS] Implement SHLL.PH, SHLL_S.PH, SHLL.QB, SHLLV.PH, SHLLV_S.PH, SHLLV.QB, SHLLV_S.W, SHLL_S.W, SHRA.QB and SHRA_R.QB instructions
Differential Revision: http://reviews.llvm.org/D13929

llvm-svn: 251098
2015-10-23 06:39:29 +00:00
Matthias Braun d276de6db1 AArch64: Disable the latency heuristic
It turned out not to improve any of our benchmarks but occasionally led
to increased register pressure and spilling.

Only enabling for the Cyclone CPU as the results on the cortex CPUs
give mixed results.

Differential Revision: http://reviews.llvm.org/D13708

llvm-svn: 251038
2015-10-22 18:07:38 +00:00
Eric Christopher 227d71bba6 Remove the last traces of X86CompilationCallback as it is completely
unused.

llvm-svn: 251035
2015-10-22 17:55:35 +00:00
Craig Topper 8fe40e0ed5 Change makeLibCall to take an ArrayRef<SDValue> instead of pointer and size. This removes the need to pass a hardcoded size in many places. NFC
llvm-svn: 251032
2015-10-22 17:05:00 +00:00
Bill Schmidt de1dc9c98f [PPC] Fix PR24686 by failing assembly for an invalid relocation
PR24686 identifies a problem where a relocation expression is invalid
when not all of the symbols in the expression can be locally
resolved.  This causes the compiler to request a PC-relative half16ds
relocation, which is nonsensical for PowerPC.  This patch recognizes
this situation and ensures we fail the assembly cleanly.

Test case provided by Anton Blanchard.

llvm-svn: 251027
2015-10-22 15:53:44 +00:00
Asaf Badouh 7c52245660 [X86][AVX512] extend vcvtph2ps to support xmm/ymm and sae versions
Differential Revision: http://reviews.llvm.org/D13945

llvm-svn: 251018
2015-10-22 14:01:16 +00:00
Elena Demikhovsky 5c97dfdc9c AVX-512: Fixed a bug in select_cc for i1 type
Fixed faiure:
LLVM ERROR: Cannot select: t33: i1 = select_cc t25, Constant:i32<0>, t45, t42, seteq:ch

added a test

Differential Revision: http://reviews.llvm.org/D13943

llvm-svn: 250996
2015-10-22 07:10:29 +00:00
Elena Demikhovsky 7ad0d563a5 Partially reverted changes from r250686
Clang runtime failure was reported.
   Assertion failed: (isExtended() && "Type is not extended!"), function getTypeForEVT
I'll need to add a proper handling for PointerType in masked load/store intrinsics.

llvm-svn: 250995
2015-10-22 06:20:29 +00:00
JF Bastien f2364bf129 WebAssembly: fix more syntax
br_if shouldn't start with a dot.
div and rem went from prefix u/s to suffix.

llvm-svn: 250972
2015-10-22 02:32:50 +00:00
Pete Cooper b70b956c80 Add missing load/store flags to thumb2 instructions.
These were the cause of a verifier error when building 7zip with
-verify-machineinstrs.  Running 'make check' with the verifier
triggered the same error on the test here so i've updated the test
to run the verifier on one of its runs instead of adding a new one.

While looking at this code, there was a stale comment that these
instructions were only used for disassembly.  This probably used to
be the case, but they are now used in the 'ARM load / store optimization pass' too.

This reapplies r242300 which was reverted in r242428 due to bot failures.

Ultimately those failures were spurious and completely unrelated to this commit.  I reverted this
at the time because it was thought to be at fault.

llvm-svn: 250969
2015-10-22 01:48:57 +00:00
Matt Arsenault 391be09ef3 AMDGPU: Fix adding redundant m0 uses
BuildMI already adds these since they are defined correctly now.

llvm-svn: 250961
2015-10-21 22:37:51 +00:00
Matt Arsenault e8c0891e42 AMDGPU: Fix verifier error in SIFoldOperands
There may be other use operands that also need their kill flags cleared.

This happens in a few tests when SIFoldOperands is moved after
PeepholeOptimizer.

PeepholeOptimizer rewrites cases that look like:
%vreg0 = ...
%vreg1 = COPY %vreg0
use %vreg1<kill>
%vreg2 = COPY %vreg0
use %vreg2<kill>

to use the earlier source to
%vreg0 = ...
use %vreg0
use %vreg0

Currently SIFoldOperands sees the copied registers, so there is
only one use. So far I haven't managed to come up with a test
that currently has multiple uses of a foldable VGPR -> VGPR copy.

llvm-svn: 250960
2015-10-21 22:37:50 +00:00
Matt Arsenault b6fd98c7d9 AMDGPU: Split DiagnosticInfoUnsupported into its own file
llvm-svn: 250959
2015-10-21 22:37:46 +00:00
Matt Arsenault 6005fcbe12 AMDGPU: Simplify VOP3 operand legalization.
This was checking for a variety of situations that should
never happen. This saves a tiny bit of compile time.

We should not be selecting instructions with invalid operands in the
first place. Most of the time for registers copys are inserted
to the correct operand register class.

For VOP3, since all operand types are supported and literal
constants never are, we just need to verify the constant bus
requirements (all immediates should be legal inline ones).

The only possibly tricky case to maybe worry about is if when
legalizing operands in moveToVALU with s_add_i32 and similar
instructions. If the original s_add_i32 had a literal constant
and we need to replace it with v_add_i32_e64 we would have an
unsupported literal operand.  However, I don't think we should worry
about that because SIFoldOperands should handle folding literal
constant operands into the SALU instructions based on the uses.
At SIFoldOperands time, the legality and profitability of
operand types is a bit different.

llvm-svn: 250951
2015-10-21 21:51:02 +00:00
Matt Arsenault e223cebd10 AMDGPU: Fix not checking implicit operands in verifyInstruction
When verifying constant bus restrictions, this wasn't catching
uses in implicit operands.

llvm-svn: 250948
2015-10-21 21:15:01 +00:00
Joerg Sonnenberger 7212809abc Drop assert that a call with struct return goes to a function with sret
attribute. Clang incorrectly misses it on __muldc3 and friends and the
type system doesn't include it properly either.

llvm-svn: 250938
2015-10-21 20:05:01 +00:00
Sanjay Patel efab8b0d08 [x86] move recursive add match for LEA to helper function; NFCI
llvm-svn: 250926
2015-10-21 18:56:06 +00:00
Craig Topper 896c267544 [X86] Add AMD mwaitx, monitorx, and clzero instructions to the assembly parser and disassembler.
llvm-svn: 250911
2015-10-21 17:26:45 +00:00
Daniel Sanders d6cf3e05ef [mips][mips16] Re-work the inline assembly stubs to work with IAS. NFC.
Summary:
Previously, we were inserting an InlineAsm statement for each line of the
inline assembly. This works for GAS but it triggers prologue/epilogue
emission when IAS is in use. This caused:
    .set noreorder
    .cpload $25
to be emitted as:
    .set push
    .set reorder
    .set noreorder
    .set pop
    .set push
    .set reorder
    .cpload $25
    .set pop
which led to assembler errors and caused the test to fail.

The whitespace-after-comma changes included in this patch are necessary to
match the output when IAS is in use.

Reviewers: vkalintiris

Subscribers: rkotler, llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D13653

llvm-svn: 250895
2015-10-21 12:44:14 +00:00
Daniel Sanders 0f596814e9 [mips][msa] Remove copy_u.d and move copy_u.w to MSA64.
Summary:
The forwards compatibility strategy employed by MIPS is to consider registers
to be infinitely sign-extended. Then on ISA's with a wider register, the result
of existing instructions are sign-extended to register width and zero-extended
counterparts are added. copy_u.w on MSA32 and copy_u.w on MSA64 violate this
strategy and we have therefore corrected the MSA specs to fix this.

We still keep track of sign/zero-extension during legalization but we now
match copy_s.[wd] where required.

No change required to clang since __builtin_msa_copy_u_[wd] will map to
copy_s.[wd] where appropriate for the target.

Reviewers: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13472

llvm-svn: 250887
2015-10-21 09:58:54 +00:00
Mehdi Amini 4215236621 Do not use `dyn_cast<X>` after `isa<X>` (NFC)
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 250883
2015-10-21 06:11:01 +00:00
JF Bastien 1a59c6b2c9 WebAssembly: support imports
C/C++ code can declare an extern function, which will show up as an import in WebAssembly's output. It's expected that the linker will resolve these, and mark unresolved imports as call_import (I have a patch which does this in wasmate).

llvm-svn: 250875
2015-10-21 02:23:09 +00:00
Krzysztof Parzyszek ced9941cd4 [Hexagon] Bit-based instruction simplification
Analyze bit patterns of operands and values of instructions to perform
various simplifications, dead/redundant code elimination, etc.

llvm-svn: 250868
2015-10-20 22:57:13 +00:00
Krzysztof Parzyszek 26b2c9080f [Hexagon] Fix isNVStorable flag in .td files
An upper half and a double word cannot be used as value sources in a
new-value store.

llvm-svn: 250867
2015-10-20 22:40:57 +00:00
Krzysztof Parzyszek 79512b88b0 [Hexagon] Capture aggregate variables by reference, not value
llvm-svn: 250851
2015-10-20 19:33:46 +00:00
Krzysztof Parzyszek e4cff4058c [Hexagon] Do not fall-through if there is no CFG edge
llvm-svn: 250850
2015-10-20 19:30:21 +00:00
Krzysztof Parzyszek bfe8e92fd1 [Hexagon] Use symbolic name for subregister instead of hardcoded number
llvm-svn: 250849
2015-10-20 19:26:36 +00:00
Krzysztof Parzyszek 0257905f27 [Hexagon] Change Based->Base in getBasedWithImmOffset
llvm-svn: 250848
2015-10-20 19:21:05 +00:00
Krzysztof Parzyszek 05da79d5ac [Hexagon] Remove the remnants of isConstExtProfitable
llvm-svn: 250845
2015-10-20 19:04:53 +00:00
Jonas Paulsson 4b29f6f7f7 [SystemZ] Use LivePhysRegs helper class in SystemZShortenInst.cpp.
Don't use home brewed liveness tracking code for phys regs, since
this class does the job.

Reviewed by Ulrich Weigand.

llvm-svn: 250829
2015-10-20 15:05:58 +00:00
Artyom Skrobov 7fd67e25aa Adding support for TargetLoweringBase::LibCall
Summary:
TargetLoweringBase::Expand is defined as "Try to expand this to other ops,
otherwise use a libcall." For ISD::UDIV and ISD::SDIV, the choice between
the two possibilities was defined in a rather convoluted way:

- if DIVREM is legal, expand to DIVREM
- if DIVREM has a custom lowering, expand to DIVREM
- if DIVREM libcall is defined and a remainder from the same division is
  computed elsewhere, expand to a DIVREM libcall
- else, expand to a DIV libcall

This had the undesirable effect that if both DIV and DIVREM are implemented
as libcalls, then ISD::UDIV and ISD::SDIV are expanded to the heavier DIVREM
libcall, even when the remainder isn't used.

The new code adds a new LegalizeAction, TargetLoweringBase::LibCall, so that
backends can directly control whether they prefer an expansion or a conversion
to a libcall. This makes the generic lowering code even more generic,
allowing its reuse in a wider range of target-specific configurations.

The useful effect is that ARM backend will now generate a call
to __aeabi_{i,u}div rather than __aeabi_{i,u}divmod in cases where
it doesn't need the remainder. There's no functional change outside
the ARM backend.

Reviewers: t.p.northover, rengolin

Subscribers: t.p.northover, llvm-commits, aemerson

Differential Revision: http://reviews.llvm.org/D13862

llvm-svn: 250826
2015-10-20 13:14:52 +00:00
Igor Breger 21296d230a AVX512: Implemented encoding and intrinsics for VPBROADCASTB/W/D/Q instructions.
Differential Revision: http://reviews.llvm.org/D13884

llvm-svn: 250819
2015-10-20 11:56:42 +00:00
Matt Arsenault 3add6439d0 AMDGPU: Add MachineInstr overloads for instruction format tests
llvm-svn: 250797
2015-10-20 04:35:43 +00:00
Matt Arsenault 8f18917a90 AMDGPU: Stop reserving v[254:255]
This wasn't doing anything useful. They weren't explicitly used
anywhere, and the RegScavenger ignores reserved registers.

This for some reason caused a random scheduling change in the test.
Getting the check lines to pass is too frustrating, and there's probably
not too much value in checking the vector case's operands N times.

llvm-svn: 250794
2015-10-20 03:59:58 +00:00
JF Bastien c8f89e86d5 WebAssembly: fix call/return syntax.
They are now typeless, unlike other operations.

llvm-svn: 250793
2015-10-20 01:26:54 +00:00
Duncan P. N. Exon Smith c4829deae8 MSP430: Remove implicit ilist iterator conversions, NFC
llvm-svn: 250792
2015-10-20 01:18:39 +00:00
Duncan P. N. Exon Smith a2c90e4743 SystemZ: Remove implicit ilist iterator conversion, NFC
llvm-svn: 250790
2015-10-20 01:12:46 +00:00
Duncan P. N. Exon Smith 0ce253d3a9 XCore: Remove implicit ilist iterator conversions, NFC
llvm-svn: 250788
2015-10-20 01:07:42 +00:00
Duncan P. N. Exon Smith ac65b4c422 PowerPC: Remove implicit ilist iterator conversions, NFC
llvm-svn: 250787
2015-10-20 01:07:37 +00:00
Duncan P. N. Exon Smith c3f7988472 Sparc: Remove implicit ilist iterator conversions, NFC
llvm-svn: 250781
2015-10-20 00:59:43 +00:00
Duncan P. N. Exon Smith 61149b86c3 NVPTX: Remove implicit ilist iterator conversions, NFC
llvm-svn: 250779
2015-10-20 00:54:09 +00:00
Duncan P. N. Exon Smith a72c6e25ec Hexagon: Remove implicit ilist iterator conversions, NFC
There are two things out of the ordinary in this commit.  First, I made
a loop obviously "infinite" in HexagonInstrInfo.cpp.  After checking if
an instruction was at the beginning of a basic block (in which case,
`break`), the loop decremented and checked the iterator for `nullptr` as
the loop condition.  This has never been possible (the prev pointers are
always been circular, so even with the weird ilist/iplist
implementation, this isn't been possible), so I removed the condition.

Second, in HexagonAsmPrinter.cpp there was another case of comparing a
`MachineBasicBlock::instr_iterator` against `MachineBasicBlock::end()`
(which returns `MachineBasicBlock::iterator`).  While not incorrect,
it's fragile.  I switched this to `::instr_end()`.

All that said, no functionality change intended here.

llvm-svn: 250778
2015-10-20 00:46:39 +00:00
JF Bastien 3b0177c542 WebAssembly: fix syntax for br_if.
llvm-svn: 250777
2015-10-20 00:37:42 +00:00
Duncan P. N. Exon Smith 7869148c47 Mips: Remove implicit ilist iterator conversions, NFC
llvm-svn: 250769
2015-10-20 00:15:20 +00:00
Duncan P. N. Exon Smith 19d951874a CppBackend: Remove implicit ilist iterator conversions, NFC
Mostly just converted to range-based for loops.  May have converted a
couple of extra loops as a drive-by (not sure).

llvm-svn: 250766
2015-10-20 00:06:41 +00:00
Duncan P. N. Exon Smith d95fa4ccf1 BPF: Remove implicit ilist iterator conversion, NFC
llvm-svn: 250765
2015-10-20 00:02:50 +00:00
Duncan P. N. Exon Smith 9f9559e807 ARM: Remove implicit ilist iterator conversions, NFC
llvm-svn: 250759
2015-10-19 23:25:57 +00:00
Duncan P. N. Exon Smith d77de6495e X86: Remove implicit ilist iterator conversions, NFC
llvm-svn: 250741
2015-10-19 21:48:29 +00:00
Krzysztof Parzyszek 055c5fd74e [Hexagon] Remove unnecessary argument sign extends
llvm-svn: 250724
2015-10-19 19:10:48 +00:00
Benjamin Kramer 755e502952 Add missing override noticed by Clang's -Winconsistent-missing-override.
llvm-svn: 250720
2015-10-19 18:41:23 +00:00
Jun Bum Lim d3548303ec [AArch64]Merge halfword loads into a 32-bit load
Convert two halfword loads into a single 32-bit word load with bitfield extract
instructions. For example :
  ldrh w0, [x2]
  ldrh w1, [x2, #2]
becomes
  ldr w0, [x2]
  ubfx w1, w0, #16, #16
  and  w0, w0, #ffff

llvm-svn: 250719
2015-10-19 18:34:53 +00:00
Krzysztof Parzyszek 23920ec95d [Hexagon] Fix debug information for local objects
- Isolate the check for the existence of a stack frame into hasFP.
- Implement getFrameIndexReference for DWARF address computation.
- Use getFrameIndexReference for offset computation in eliminateFrameIndex.
- Preserve debug information for dynamically allocated stack objects.
- Prefer FP to access local objects at -O0.
- Add experimental code to skip allocframe when not strictly necessary
  (disabled by default).

llvm-svn: 250718
2015-10-19 18:30:27 +00:00
Krzysztof Parzyszek db8677067c [Hexagon] Delay emission of CFI instructions
Emit the CFI instructions after all code transformation have been done.
This will avoid any interference between CFI instructions and packetization.

llvm-svn: 250714
2015-10-19 17:46:01 +00:00
Benjamin Kramer 335332329b Remove CRLF newlines. NFC.
llvm-svn: 250698
2015-10-19 13:05:25 +00:00
Asiri Rathnayake 1040a53be3 Fix mapping of @llvm.arm.ssat/usat intrinsics to ssat/usat instructions
The mapping of these two intrinsics in ARMInstrInfo.td had a small
omission which lead to their operands not being validated/transformed
before being lowered into usat and ssat instructions. This can cause
incorrect instructions to be emitted.

I've also added tests for the remaining two saturating arithmatic
intrinsics @llvm.arm.qadd and @llvm.arm.qsub as they are missing
codegen tests.

llvm-svn: 250697
2015-10-19 11:44:24 +00:00
Elena Demikhovsky 20662e39f1 Removed parameter "Consecutive" from isLegalMaskedLoad() / isLegalMaskedStore().
Originally I planned to use the same interface for masked gather/scatter and set isConsecutive to "false" in this case.

Now I'm implementing masked gather/scatter and see that the interface is inconvenient. I want to add interfaces isLegalMaskedGather() / isLegalMaskedScatter() instead of using the "Consecutive" parameter in the existing interfaces.

Differential Revision: http://reviews.llvm.org/D13850

llvm-svn: 250686
2015-10-19 07:43:38 +00:00
Zlatko Buljan 5292083584 [mips][microMIPS] Implement ADDQ.PH, ADDQ_S.W, ADDQH.PH, ADDQH.W, ADDSC, ADDU.PH, ADDU_S.QB, ADDWC and ADDUH.QB instructions
Differential Revision: http://reviews.llvm.org/D13130

llvm-svn: 250685
2015-10-19 07:16:26 +00:00
Zlatko Buljan d0a7d6e4ee [mips][microMIPS] Implement ABSQ.QB, ABSQ_S.PH, ABSQ_S.W, ABSQ_S.QB, INSV, MADD, MADDU, MSUB, MSUBU, MULT and MULTU instructions
Differential Revision: http://reviews.llvm.org/D13721

llvm-svn: 250683
2015-10-19 06:34:44 +00:00
Asaf Badouh 696e8e0bb7 [X86][AVX512DQ] add scalar fpclass
Differential Revision: http://reviews.llvm.org/D13769

llvm-svn: 250650
2015-10-18 11:04:38 +00:00
Igor Breger cbb9550537 AVX512: Lowering i8/i16 vector CTLZ using the dword LZCNT vector instruction
Differential Revision: http://reviews.llvm.org/D13632

llvm-svn: 250649
2015-10-18 09:56:39 +00:00
Craig Topper 92cfdd70f8 [Sparc] Use MCPhysReg instead of unsigned to size static arrays of registers. Should reduce the table size.
llvm-svn: 250644
2015-10-18 05:29:05 +00:00
Craig Topper 2626094fa1 Make a bunch of static arrays const.
llvm-svn: 250642
2015-10-18 05:15:34 +00:00
Craig Topper ec15ea12e7 Use std::find instead of manual loop.
llvm-svn: 250624
2015-10-17 21:32:28 +00:00
Craig Topper a833451173 Use std::is_sorted to replace a custom version. Also replace a comparison predicate struct with a lambda.
llvm-svn: 250623
2015-10-17 21:32:26 +00:00
Simon Pilgrim 86c5e85e84 [X86][XOP] Add VPROT instruction opcodes
Added X86ISD opcodes for VPROT vector rotate by variable and by immediate.

llvm-svn: 250620
2015-10-17 19:04:24 +00:00
Craig Topper a2d0635098 Remove unnecessary 'const' pointed out by David Blaikie.
llvm-svn: 250619
2015-10-17 18:22:46 +00:00
Craig Topper 9ff9bf4959 Replace a custom table sort check with std::is_sorted. Change a function to take ArrayRef instead of pointer and length. NFC
llvm-svn: 250615
2015-10-17 16:37:13 +00:00
Craig Topper c177d9edb3 Use std::begin/end and std::is_sorted to simplify some code. NFC
llvm-svn: 250614
2015-10-17 16:37:11 +00:00
Simon Pilgrim a18ae9bd70 [CostModel] Fixed AVX integer shift costs
Targets with AVX but without AVX2 were incorrectly reporting costs of 256-bit integer shifts.

llvm-svn: 250611
2015-10-17 13:23:38 +00:00
Simon Pilgrim 5b65f28fe7 [X86][FastISel] Teach how to select SSE4A nontemporal stores.
Add FastISel support for SSE4A scalar float / double non-temporal stores

Follow up to D13698

Differential Revision: http://reviews.llvm.org/D13773

llvm-svn: 250610
2015-10-17 13:04:42 +00:00
Colin LeMahieu 7c9587136d [Hexagon] Adding skeleton of HVX extension instructions.
llvm-svn: 250600
2015-10-17 01:33:04 +00:00
JF Bastien 3428ed4f53 WebAssembly: don't omit dead vregs from locals
Summary:
This is a temporary hack until we get around to remapping the vreg
numbers to local numbers. Dead vregs cause bad numbering and make
consumers sad.

We could also just look at debug info an use named locals instead, but
vregs have to work properly anyways so there!

Reviewers: binji, sunfish

Subscribers: jfb, llvm-commits, dschuff

Differential Revision: http://reviews.llvm.org/D13839

llvm-svn: 250594
2015-10-17 00:25:38 +00:00
JF Bastien 4f43e80ece WebAssembly: fix the syntax for comparisons
Summary: It has also slightly changed.

Reviewers: binji

Subscribers: jfb, dschuff, llvm-commits, sunfish

Differential Revision: http://reviews.llvm.org/D13837

llvm-svn: 250591
2015-10-17 00:12:29 +00:00
Reid Kleckner 28e490342b [WinEH] Fix stack alignment in funclets and ParentFrameOffset calculation
Our previous value of "16 + 8 + MaxCallFrameSize" for ParentFrameOffset
is incorrect when CSRs are involved. We were supposed to have a test
case to catch this, but it wasn't very rigorous.

The main effect here is that calling _CxxThrowException inside a
catchpad doesn't immediately crash on MOVAPS when you have an odd number
of CSRs.

llvm-svn: 250583
2015-10-16 23:43:27 +00:00
Sanjay Patel bbd524496c [x86] promote 'add nsw' to a wider type to allow more combines
The motivation for this patch starts with PR20134:
https://llvm.org/bugs/show_bug.cgi?id=20134

void foo(int *a, int i) {
  a[i] = a[i+1] + a[i+2];
}

It seems better to produce this (14 bytes):

movslq	%esi, %rsi
movl	0x4(%rdi,%rsi,4), %eax
addl	0x8(%rdi,%rsi,4), %eax
movl	%eax, (%rdi,%rsi,4)

Rather than this (22 bytes):

leal	0x1(%rsi), %eax
cltq             
leal	0x2(%rsi), %ecx      
movslq	%ecx, %rcx     
movl	(%rdi,%rcx,4), %ecx
addl	(%rdi,%rax,4), %ecx
movslq	%esi, %rax       
movl	%ecx, (%rdi,%rax,4)

The most basic problem (the first test case in the patch combines constants) should also be fixed in InstCombine, 
but it gets more complicated after that because we need to consider architecture and micro-architecture. For
example, AArch64 may not see any benefit from the more general transform because the ISA solves the sexting in
hardware. Some x86 chips may not want to replace 2 ADD insts with 1 LEA, and there's an attribute for that: 
FeatureSlowLEA. But I suspect that doesn't go far enough or maybe it's not getting used when it should; I'm 
also not sure if FeatureSlowLEA should also mean "slow complex addressing mode".

I see no perf differences on test-suite with this change running on AMD Jaguar, but I see small code size
improvements when building clang and the LLVM tools with the patched compiler.

A more general solution to the sext(add nsw(x, C)) problem that works for multiple targets is available
in CodeGenPrepare, but it may take quite a bit more work to get that to fire on all of the test cases that
this patch takes care of.

Differential Revision: http://reviews.llvm.org/D13757

llvm-svn: 250560
2015-10-16 22:14:12 +00:00
Andrew Kaylor 09b39acc03 Fix assertion failure with fp128 to unsigned i64 conversion
Patch by Mitch Bodart

Differential Revision: http://reviews.llvm.org/D13780

llvm-svn: 250550
2015-10-16 20:39:20 +00:00
Krzysztof Parzyszek a7c5f0409c [Hexagon] Split double registers
llvm-svn: 250549
2015-10-16 20:38:54 +00:00
Krzysztof Parzyszek aec39c68ae [Hexagon] Delete lib/Target/Hexagon/HexagonRemoveSZExtArgs.cpp
llvm-svn: 250543
2015-10-16 19:51:53 +00:00
Krzysztof Parzyszek 5b7dd0cdf9 [Hexagon] Merge adjacent stores
llvm-svn: 250542
2015-10-16 19:43:56 +00:00
JF Bastien 6126d2b883 WebAssembly: fix load/store syntax
Summary: The syntax has changed a bit recently.

Reviewers: binji

Subscribers: llvm-commits, jfb, sunfish, dschuff

Differential Revision: http://reviews.llvm.org/D13821

llvm-svn: 250535
2015-10-16 18:24:42 +00:00
JF Bastien 53bd975033 WebAssembly: relooper analysis pass
Summary: Make the relooper an analysis pass, to convert CFG to AST.

Reviewers: sunfish

Subscribers: jfb, dschuff

Differential Revision: http://reviews.llvm.org/D12744

llvm-svn: 250524
2015-10-16 16:35:49 +00:00
Charlie Turner 434d4599d4 [AArch64] Implement vector splitting on UADDV.
Summary: Fixes PR25056.

Reviewers: mcrosier, junbuml, jmolloy

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13466

llvm-svn: 250520
2015-10-16 15:38:25 +00:00
Hrvoje Varga 3c88fbd367 [mips][microMIPS] Implement LB, LBE, LBU and LBUE instructions
Differential Revision: http://reviews.llvm.org/D11633

llvm-svn: 250511
2015-10-16 12:24:58 +00:00
Craig Topper 09b6598572 [X86] Add fxsr feature flag for fxsave/fxrestore instructions.
llvm-svn: 250497
2015-10-16 06:03:09 +00:00
JF Bastien 1d20a5e9e8 WebAssembly: update syntax
Summary:
Follow the same syntax as for the spec repo. Both have evolved slightly
independently and need to converge again.

This, along with wasmate changes, allows me to do the following:

  echo "int add(int a, int b) { return a + b; }" > add.c
  ./out/bin/clang -O2 -S --target=wasm32-unknown-unknown add.c -o add.wack
  ./experimental/prototype-wasmate/wasmate.py add.wack > add.wast
  ./sexpr-wasm-prototype/out/sexpr-wasm add.wast -o add.wasm
  ./sexpr-wasm-prototype/third_party/v8-native-prototype/v8/v8/out/Release/d8 -e "print(WASM.instantiateModule(readbuffer('add.wasm'), {print:print}).add(42, 1337));"

As you'd expect, the d8 shell prints out the right value.

Reviewers: sunfish

Subscribers: jfb, llvm-commits, dschuff

Differential Revision: http://reviews.llvm.org/D13712

llvm-svn: 250480
2015-10-16 00:53:49 +00:00
Evgeniy Stepanov 9addbc9fc1 Revert "[safestack] Fast access to the unsafe stack pointer on AArch64/Android."
Breaks the hexagon buildbot.

llvm-svn: 250461
2015-10-15 21:26:49 +00:00
Evgeniy Stepanov 142947e9f0 [safestack] Fast access to the unsafe stack pointer on AArch64/Android.
Android libc provides a fixed TLS slot for the unsafe stack pointer,
and this change implements direct access to that slot on AArch64 via
__builtin_thread_pointer() + offset.

This change also moves more code into TargetLowering and its
target-specific subclasses to get rid of target-specific codegen
in SafeStackPass.

This change does not touch the ARM backend because ARM lowers
builting_thread_pointer as aeabi_read_tp, which is not available
on Android.

llvm-svn: 250456
2015-10-15 20:50:16 +00:00
JF Bastien 2cdd5e4710 x86: preserve flags when folding atomic operations
D4796 taught LLVM to fold some atomic integer operations into a single
instruction. The pattern was unaware that the instructions clobbered
flags. I fixed some of this issue in D13680 but had missed INC/DEC.

This patch adds the missing EFLAGS definition.

llvm-svn: 250438
2015-10-15 18:24:52 +00:00
JF Bastien 5b327712b0 x86 FP atomic codegen: don't drop globals, stack
Summary:
x86 codegen is clever about generating good code for relaxed
floating-point operations, but it was being silly when globals and
immediates were involved, forgetting where the global was and
loading/storing from/to the wrong place. The same applied to hard-coded
address immediates.

Don't let it forget about the displacement.

This fixes https://llvm.org/bugs/show_bug.cgi?id=25171

A very similar bug when doing floating-points atomics to the stack is
also fixed by this patch.

This fixes https://llvm.org/bugs/show_bug.cgi?id=25144

Reviewers: pete

Subscribers: llvm-commits, majnemer, rsmith

Differential Revision: http://reviews.llvm.org/D13749

llvm-svn: 250429
2015-10-15 16:46:29 +00:00
Daniel Sanders 6394ee598e [mips][ias] Implement ulh macro.
Summary:
This macro is needed to prevent test/CodeGen/Mips/2008-08-01-AsmInline.ll from
failing after the integrated assembler is enabled by default.

Reviewers: vkalintiris

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D13654

llvm-svn: 250414
2015-10-15 14:52:58 +00:00
Benjamin Kramer c5275bdec1 [NVPTX] Remove dead code.
I left helpers that look useful for debugging alone. NFC.

llvm-svn: 250410
2015-10-15 14:45:41 +00:00
Daniel Sanders 8008de5551 [mips][mips16] MIPS16 is not a CPU/Architecture but is an ASE.
Summary:
The -mcpu=mips16 option caused the Integrated Assembler to crash because
it couldn't figure out the architecture revision number to write to the
.MIPS.abiflags section. This CPU definition has been removed because, like
microMIPS, MIPS16 is an ASE to a base architecture.

Reviewers: vkalintiris

Subscribers: rkotler, llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D13656

llvm-svn: 250407
2015-10-15 14:34:23 +00:00