Commit Graph

33446 Commits

Author SHA1 Message Date
Pete Cooper 4915dd076f Remove includes of MCMachOSymbolFlags.h after it was deleted
llvm-svn: 239318
2015-06-08 17:25:57 +00:00
Matthias Braun 6f8db0e1a7 X86: Reject register operands with obvious type mismatches.
While we have some code to transform specification like {ax} into
{eax}/{rax} if the operand type isn't 16bit, we should reject cases
where there is no sane way to do this, like the i128 type in the
example.

Related to rdar://21042280

Differential Revision: http://reviews.llvm.org/D10260

llvm-svn: 239309
2015-06-08 16:56:23 +00:00
Colin LeMahieu 6aca6f0be5 [Hexagon] Adding functionality for searching for compound instruction pairs. Compound instructions reduce slot resource requirements freeing those packet slots up for more instructions.
llvm-svn: 239307
2015-06-08 16:34:47 +00:00
Javed Absar e1c7dc3ee2 ARM]: Add support for MMFR4_EL1 in assembler
This patch adds support for system register MMFR4_EL1 (memory model feature register) in the assembler.
This register provides information about the implemented memory model and memory management support.

llvm-svn: 239302
2015-06-08 15:01:11 +00:00
Igor Breger 00d9f8457b AVX-512: Implemented 256/128bit VALIGND/Q instructions for SKX and KNL
Implemented DAG lowering for all these forms.
Added tests for DAG lowering and encoding.

Differential Revision: http://reviews.llvm.org/D10310

llvm-svn: 239300
2015-06-08 14:03:17 +00:00
Simon Pilgrim 3a7718038d [X86] Added BitScanForward/BitScanReverse memory folding + tests
llvm-svn: 239257
2015-06-07 18:34:25 +00:00
Rafael Espindola f3d49b30b5 Handle 16 bit PC relative relocations.
Fixes pr23771.

llvm-svn: 239214
2015-06-06 02:29:56 +00:00
Peter Collingbourne 6679fc1a79 Revert r238473, "Thumb2: Modify codegen for memcpy intrinsic to prefer LDM/STM."
as it caused miscompilations and assertion failures (PR23768,
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150601/280380.html).

llvm-svn: 239169
2015-06-05 18:01:28 +00:00
Alexei Starovoitov 8cf9a4c472 [bpf] rename triple names bpf_be -> bpfeb
llvm-svn: 239162
2015-06-05 16:11:14 +00:00
Colin LeMahieu be8c453d58 [Hexagon] Reapply r239097 with tests corrected for shuffling and duplexing.
llvm-svn: 239161
2015-06-05 16:00:11 +00:00
Benjamin Kramer 113b2a943f [ARM] Make helper function static.
This one had a declaration but it differed from the definition so the
declaration was actually dead.

llvm-svn: 239157
2015-06-05 14:32:54 +00:00
John Brawn 985c04e8fa [ARM] Add support for -sp- FPUs and FPU none to TargetParser
These are added mainly for the benefit of clang, but this also means that they
are now allowed in .fpu directives and we emit the correct .fpu directive when
single-precision-only is used.

Differential Revision: http://reviews.llvm.org/D10238

llvm-svn: 239151
2015-06-05 13:31:19 +00:00
John Brawn d03d22922d [ARM] Add knowledge of FPU subtarget features to TargetParser
Add getFPUFeatures to TargetParser, which gets the list of subtarget features
that are enabled/disabled for each FPU, and use it when handling the .fpu
directive.

No functional change in this commit, though clang will start behaving
differently once it starts using this.

Differential Revision: http://reviews.llvm.org/D10237

llvm-svn: 239150
2015-06-05 13:29:24 +00:00
Toma Tabacu 399a56d771 Revert "[mips] [IAS] Restore STI.FeatureBits in .set pop." (r239144).
This is breaking the Windows buildbots.

llvm-svn: 239145
2015-06-05 12:19:27 +00:00
Toma Tabacu 89ebf88ff3 [mips] [IAS] Restore STI.FeatureBits in .set pop.
Summary:
Only restoring AvailableFeatures is not enough and will lead to buggy behaviour.
For example, if we have a feature enabled and we ".set pop", the next time we try
to ".set" that feature nothing will happen because the "!(STI.getFeatureBits()[Feature])"
check will be false, because we didn't restore STI.FeatureBits.

In order to fix this, we need to make MipsAssemblerOptions remember the STI.FeatureBits
instead of the AvailableFeatures and then regenerate AvailableFeatures each time we ".set pop".
This is because, AFAIK, there is no way to convert from AvailableFeatures back to STI.FeatureBits,
but the reverse is possible by using ComputeAvailableFeatures(STI.FeatureBits).

I also moved the updating of AssemblerOptions inside the "if" statement in
setFeatureBits() and clearFeatureBits(), as there is no reason to update if
nothing changes.

Reviewers: dsanders, mkuper

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9156

llvm-svn: 239144
2015-06-05 11:48:54 +00:00
Jim Grosbach 56ed0bb111 MC: Clean up the naming for MCMachObjectWriter. NFC.
s/ExecutePostLayoutBinding/executePostLayoutBinding/
s/ComputeSymbolTable/computeSymbolTable/
s/BindIndirectSymbols/bindIndirectSymbols/
s/RecordTLVPRelocation/recordTLVPRelocation/
s/RecordScatteredRelocation/recordScatteredRelocation/
s/WriteLinkerOptionsLoadCommand/writeLinkerOptionsLoadCommand/
s/WriteLinkeditLoadCommand/writeLinkeditLoadCommand/
s/WriteNlist/writeNlist/
s/WriteDysymtabLoadCommand/writeDysymtabLoadCommand/
s/WriteSymtabLoadCommand/writeSymtabLoadCommand/
s/WriteSection/writeSection/
s/WriteSegmentLoadCommand/writeSegmentLoadCommand/
s/WriteHeader/writeHeader/

llvm-svn: 239119
2015-06-04 23:25:54 +00:00
Charles Davis da280728b6 [Target/X86] Don't use callee-saved registers in a Win64 tail call on non-Windows.
Summary:
A small bit that I missed when I updated the X86 backend to account for
the Win64 calling convention on non-Windows. Now we don't use dead
non-volatile registers when emitting a Win64 indirect tail call on
non-Windows.

Should fix PR23710.

Test Plan: Added test for the correct behavior based on the case I posted to PR23710.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10258

llvm-svn: 239111
2015-06-04 22:50:05 +00:00
Jim Grosbach 36e60e9127 MC: Clean up naming in MCObjectWriter. NFC.
s/WriteObject/writeObject/
s/RecordRelocation/recordRelocation/
s/IsSymbolRefDifferenceFullyResolved/isSymbolRefDifferenceFullyResolved/
s/Write8/write8/
s/WriteLE16/writeLE16/
s/WriteLE32/writeLE32/
s/WriteLE64/writeLE64/
s/WriteBE16/writeBE16/
s/WriteBE32/writeBE32/
s/WriteBE64/writeBE64/
s/Write16/write16/
s/Write32/write32/
s/Write64/write64/
s/WriteZeroes/writeZeroes/
s/WriteBytes/writeBytes/

llvm-svn: 239108
2015-06-04 22:24:41 +00:00
Colin LeMahieu c40be85adc Revert r239095 incorrect test tree.
llvm-svn: 239102
2015-06-04 21:32:42 +00:00
Jingyue Wu a2f6027a31 [NVPTX] roll forward r239082
NVPTXISelDAGToDAG translates "addrspacecast to param" to
NVPTX::nvvm_ptr_gen_to_param

Added an llc test in bug21465.

llvm-svn: 239100
2015-06-04 21:28:26 +00:00
Colin LeMahieu f99fe00afc [Hexagon] Removing unused variable.
llvm-svn: 239097
2015-06-04 21:22:12 +00:00
Colin LeMahieu fc52c11d80 [Hexagon] Adding functionality for duplexing. Duplexing is a way to compress commonly used pairs of instructions in order to reduce code size. The test case duplex.ll normally would be 8 bytes, assign register to 0 and jump to link register. After duplexing this is only 4 bytes. This also tests the HexagonMCShuffler code path which is used to make sure duplexed instructions still follow slot requirements.
llvm-svn: 239095
2015-06-04 21:16:16 +00:00
Jingyue Wu b8f38668d5 Revert r239082
llc crashed for NVPTX backend

llvm-svn: 239094
2015-06-04 21:07:08 +00:00
Ahmed Bougacha 8207641251 [GlobalMerge] Take into account minsize on Global users' parents.
Now that we can look at users, we can trivially do this: when we would
have otherwise disabled GlobalMerge (currently -O<3), we can just run
it for minsize functions, as it's usually a codesize win.

Differential Revision: http://reviews.llvm.org/D10054

llvm-svn: 239087
2015-06-04 20:39:23 +00:00
Jim Grosbach 7c76b4cc6e MC: Remove obsolete MachO UseAggressiveSymbolFolding.
Fix the FIXME and remove this old as(1) compat option. It was useful for
bringup of the integrated assembler to diff object files, but now it's
just causing more relocations than strictly necessary to be generated.

rdar://21201804

llvm-svn: 239084
2015-06-04 20:27:42 +00:00
Jingyue Wu f3a8079b75 [NVPTX] kernel pointer arguments point to the global address space
Summary:
With this patch, NVPTXLowerKernelArgs converts a kernel pointer argument to a
pointer in the global address space. This change, along with
NVPTXFavorNonGenericAddrSpaces, allows the NVPTX backend to emit ld.global.*
and st.global.* for accessing kernel pointer arguments.

Minor changes:
1. refactor: extract function convertToPointerInAddrSpace
2. fix a bug in the test case in bug21465.ll

Test Plan: lower-kernel-ptr-arg.ll

Reviewers: eliben, meheff, jholewinski

Reviewed By: jholewinski

Subscribers: wengxt, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10154

llvm-svn: 239082
2015-06-04 20:19:38 +00:00
Alexei Starovoitov 310deada10 [bpf] add big- and host- endian support
Summary:
-march=bpf    -> host endian
-march=bpf_le -> little endian
-match=bpf_be -> big endian

Test Plan:
v1 was tested by IBM s390 guys and appears to be working there.
It bit rots too fast here.

Reviewers: chandlerc, tstellarAMD

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10177

llvm-svn: 239071
2015-06-04 19:15:05 +00:00
Matt Arsenault 73e06fa262 R600/SI: Reimplement isLegalAddressingMode
Now that we sometimes know the address space, this can
theoretically do a better job.

This needs better test coverage, but this mostly depends on
first updating the loop optimizatiosn to provide the address
space.

llvm-svn: 239053
2015-06-04 16:17:42 +00:00
Matt Arsenault 81c7ae2bf5 R600/SI: Fix some cases for load / store of half
Mostly argument loads were producing broken zextloads
from an FP type.

llvm-svn: 239049
2015-06-04 16:00:27 +00:00
Benjamin Kramer 50e2a29385 Replace custom fixed endian to raw_ostream emission with EndianStream.
Less code, clearer and more efficient. No functionality change intended.

llvm-svn: 239040
2015-06-04 15:03:02 +00:00
Daniel Sanders 7813ae879e Replace string GNU Triples with llvm::Triple in MCAsmInfo subclasses and create*AsmInfo(). NFC.
Summary:
This is the first of several patches to eliminate StringRef forms of GNU
triples from the internals of LLVM. After this is complete, GNU triples
will be replaced by a more authoratitive representation in the form of
an LLVM TargetTuple.

Reviewers: rengolin

Reviewed By: rengolin

Subscribers: ted, llvm-commits, rengolin, jholewinski

Differential Revision: http://reviews.llvm.org/D10236

llvm-svn: 239036
2015-06-04 13:12:25 +00:00
Elena Demikhovsky 2f1a0dabd0 AVX-512: I brought back vector-shuffle-512-v8.ll test.
I re-generated it after all AVX-512 shuffle optimizations.

llvm-svn: 239026
2015-06-04 07:49:56 +00:00
Elena Demikhovsky 4078c75bd4 AVX-512: added all SKX forms of VPERMW/D/Q instructions.
Added all forms of VPERMPS/PD instrcuctions.
Added encoding tests.

llvm-svn: 239016
2015-06-04 07:07:13 +00:00
Elena Demikhovsky 214335d703 Removed {}, NFC.
llvm-svn: 239014
2015-06-04 07:01:29 +00:00
Rafael Espindola 8c006ee385 Bring back r239006 with a fix.
The fix is just that getOther had not been updated for packing the st_other
values in fewer bits and could return spurious values:

-  unsigned Other = (getFlags() & (0x3f << ELF_STO_Shift)) >> ELF_STO_Shift;
+  unsigned Other = (getFlags() & (0x7 << ELF_STO_Shift)) >> ELF_STO_Shift;

Original message:

Pack the MCSymbolELF bit fields into MCSymbol's Flags.

This reduces MCSymolfELF from 64 bytes to 56 bytes on x86_64.

While at it, also make getOther/setOther easier to use by accepting unshifted
STO_* values.

llvm-svn: 239012
2015-06-04 05:59:23 +00:00
Rafael Espindola a86ecee52b Revert "Pack the MCSymbolELF bit fields into MCSymbol's Flags."
This reverts commit r239006.

I am debugging the powerpc failures.

llvm-svn: 239010
2015-06-04 05:00:12 +00:00
Rafael Espindola d31203ae21 Pack the MCSymbolELF bit fields into MCSymbol's Flags.
This reduces MCSymolfELF from 64 bytes to 56 bytes on x86_64.

While at it, also make getOther/setOther easier to use by accepting unshifted
STO_* values.

llvm-svn: 239006
2015-06-04 02:32:20 +00:00
Sanjay Patel 667a7e2a0f make reciprocal estimate code generation more flexible by adding command-line options (3rd try)
The first try (r238051) to land this was reverted due to ExecutionEngine build failure;
that was hopefully addressed by r238788.

The second try (r238842) to land this was reverted due to BUILD_SHARED_LIBS failure;
that was hopefully addressed by r238953.

This patch adds a TargetRecip class for processing many recip codegen possibilities.
The class is intended to handle both command-line options to llc as well
as options passed in from a front-end such as clang with the -mrecip option.

The x86 backend is updated to use the new functionality.
Only -mcpu=btver2 with -ffast-math should see a functional change from this patch.
All other x86 CPUs continue to *not* use reciprocal estimates by default with -ffast-math.

Differential Revision: http://reviews.llvm.org/D8982

llvm-svn: 239001
2015-06-04 01:32:35 +00:00
Tom Stellard 1ba52feb96 R600: Re-enable sub-reg liveness
The bug in the R600 backend that this uncovered has been fixed.

llvm-svn: 238999
2015-06-04 01:20:04 +00:00
Rafael Espindola f8794ff29d Remove MCELFSymbolFlags.h. It is now internal to MCSymbolELF.
llvm-svn: 238996
2015-06-04 00:47:43 +00:00
Rafael Espindola c73aed1cb3 Remove getOrCreateSymbolData. There is no MCSymbolData anymore.
llvm-svn: 238952
2015-06-03 19:03:11 +00:00
Colin LeMahieu 1ce7a11c9c [Hexagon] Test doesn't work on all platforms. At any rate the uninitialized variable issue was fixed. Removing re-registering ASM backend.
llvm-svn: 238949
2015-06-03 18:00:45 +00:00
Colin LeMahieu a675077310 [Hexagon] Reapply 238772 OSABI was not correctly set, added empty_elf test to make sure it is.
llvm-svn: 238947
2015-06-03 17:34:16 +00:00
Matthias Braun 125c9f5f7b ARM: Thumb2 LDRD/STRD supports independent input/output regs
The existing code would unnecessarily break LDRD/STRD apart with
non-adjacent registers, on thumb2 this is not necessary.

Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore
as there is not reason to set register hints anymore, changing that is
something for a future patch however.

Differential Revision: http://reviews.llvm.org/D9694

Recommiting after the revert in r238821, the buildbot still failed with
the patch removed so there seems to be another reason for the breakage.

llvm-svn: 238935
2015-06-03 16:30:24 +00:00
Daniel Sanders 43a79bf694 [arm] Fix r238921. We must handle Constraint_i too.
llvm-svn: 238925
2015-06-03 14:17:18 +00:00
Asaf Badouh 402ebb34af re-apply 238809
AVX-512: Implemented GETEXP instruction for KNL and SKX
Added rounding mode modifier for SQRTPS/PD
Added tests for encoding and intrinsics.
CR:
http://reviews.llvm.org/D9991

llvm-svn: 238923
2015-06-03 13:41:48 +00:00
Daniel Sanders 1f58ef71ea [arm] Distinguish the /U[qytnms]/, 'Uv', 'Q', and 'm' inline assembly memory constraints.
Summary:
But still handle them the same way since I don't know how they differ on
this target.

Of these, /U[qytnms]/ do not have backend tests but are accepted by clang.

No functional change intended.

Reviewers: t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D8203

llvm-svn: 238921
2015-06-03 12:33:56 +00:00
Elena Demikhovsky 86224fe468 AVX-512: More code improvements in shuffles, NFC
llvm-svn: 238919
2015-06-03 12:05:03 +00:00
Elena Demikhovsky 21de893377 AVX-512: VSHUFPD instruction selection - code improvements
llvm-svn: 238918
2015-06-03 11:21:01 +00:00
Elena Demikhovsky 9e38086534 AVX-512: Implemented SHUFF32x4/SHUFF64x2/SHUFI32x4/SHUFI64x2 instructions for SKX and KNL.
Added tests for encoding.

By Igor Breger (igor.breger@intel.com)

llvm-svn: 238917
2015-06-03 10:56:40 +00:00
Elena Demikhovsky f7e641cc2d X86: Added MPX feature and bound registers.
Intel® Memory Protection Extensions (Intel® MPX) is a new feature in Skylake.
It is a part of KNL and SKX sets. It is also a part of Skylake client.

I added definition of %bnd0 - %bnd3 registers, each register is a pair of 64-bit integers.

llvm-svn: 238916
2015-06-03 10:30:57 +00:00
Simon Pilgrim 452252e6c8 [X86] Removed (unused) FSRL x86 operation
This patch removes the old X86ISD::FSRL op - which allowed float vectors to use the byte right shift operations (causing a domain switch....).

Since the refactoring of the shuffle lowering code this no longer has any use.

Differential Revision: http://reviews.llvm.org/D10169

llvm-svn: 238906
2015-06-03 08:32:36 +00:00
Rafael Espindola cf8beece97 Revert "make reciprocal estimate code generation more flexible by adding command-line options (2nd try)"
This reverts commit r238842.

It broke -DBUILD_SHARED_LIBS=ON build.

llvm-svn: 238900
2015-06-03 05:32:44 +00:00
Rafael Espindola 9aa3ab30a9 Avoid a call to getOrCreateSymbol when we already have the symbol.
llvm-svn: 238890
2015-06-03 00:02:40 +00:00
Rafael Espindola 0ccf9b71f3 Pass a MCSymbolELF to a few ELF only functions. NFC.
llvm-svn: 238868
2015-06-02 21:30:13 +00:00
Rafael Espindola 95fb9b93ed Merge MCELF.h into MCSymbolELF.h.
Now that we have a dedicated type for ELF symbol, these helper functions can
become member function of MCSymbolELF.

llvm-svn: 238864
2015-06-02 20:38:46 +00:00
Tim Northover 3f3a4d8503 AArch64: fix typo in SMIN far atomics and add tests
llvm-svn: 238858
2015-06-02 18:37:20 +00:00
Benjamin Kramer db220dbf02 Push constness through LoopInfo::isLoopHeader and clean it up a bit.
NFC.

llvm-svn: 238843
2015-06-02 15:28:27 +00:00
Sanjay Patel 6f031d848e make reciprocal estimate code generation more flexible by adding command-line options (2nd try)
The first try (r238051) to land this was reverted due to bot failures
that were hopefully addressed by r238788.

This patch adds a TargetRecip class for processing many recip codegen possibilities.
The class is intended to handle both command-line options to llc as well
as options passed in from a front-end such as clang with the -mrecip option.

The x86 backend is updated to use the new functionality.
Only -mcpu=btver2 with -ffast-math should see a functional change from this patch.
All other x86 CPUs continue to *not* use reciprocal estimates by default with -ffast-math.

Differential Revision: http://reviews.llvm.org/D8982

llvm-svn: 238842
2015-06-02 15:28:15 +00:00
Elena Demikhovsky 8938f5acca AVX-512: Implemented VRANGESD and VRANGESS instructions for SKX Implemented DAG lowering for all these forms.
Added tests for encoding.

By Igor Breger (igor.breger@intel.com)

llvm-svn: 238834
2015-06-02 14:12:54 +00:00
Elena Demikhovsky 44a129c533 AVX-512: Shorten implementation of lowerV16X32VectorShuffle()
using lowerVectorShuffleWithSHUFPS() and other shuffle-helpers routines.
Added matching of VALIGN instruction.

llvm-svn: 238830
2015-06-02 13:43:18 +00:00
Vasileios Kalintiris bb698c7d5f [mips] Add support for dynamic stack realignment.
Summary:
With this change we are able to realign the stack dynamically, whenever it
contains objects with alignment requirements that are larger than the
alignment specified from the given ABI.

We have to use the $fp register as the frame pointer when we perform
dynamic stack realignment. In complex stack frames, with variably-sized
objects, we reserve additionally the callee-saved register $s7 as the
base pointer in order to reference locals.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8633

llvm-svn: 238829
2015-06-02 13:14:46 +00:00
Renato Golin 3a7bec86bd Revert "ARM: Thumb2 LDRD/STRD supports independent input/output regs"
This reverts commit r238795, as it broke the Thumb2 self-hosting buildbot.

Since self-hosting issues with Clang are hard to investigate, I'm taking the
liberty to revert now, so we can investigate it offline.

llvm-svn: 238821
2015-06-02 11:47:30 +00:00
Vladimir Sukharev 5f6f60d942 [AArch64] Add v8.1a atomic instructions
Patch by: Tom Coxon

Reviewers: t.p.northover

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8501

llvm-svn: 238818
2015-06-02 10:58:41 +00:00
Toma Tabacu 2969650ecd [mips] [IAS] Add support for the .set softfloat/hardfloat directives.
Summary: These directives are used to set the current value of the SoftFloat feature.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits, mpf

Differential Revision: http://reviews.llvm.org/D9074

llvm-svn: 238813
2015-06-02 09:48:04 +00:00
Elena Demikhovsky 3425c932da AVX-512: Implemented VFIXUPIMMSD and VFIXUPIMMSS instructions for KNL
Implemented DAG lowering for all these forms.
Added tests for encoding.

By Igor Breger (igor.breger@intel.com)

llvm-svn: 238811
2015-06-02 08:28:57 +00:00
Asaf Badouh 8d897dd05f revert 238809
llvm-svn: 238810
2015-06-02 07:45:19 +00:00
Asaf Badouh 17de10f37e AVX-512: Implemented GETEXP instruction for KNL and SKX
Added rounding mode modifier for SQRTPS/PD
Added tests for encoding and intrinsics.

llvm-svn: 238809
2015-06-02 07:18:14 +00:00
Rafael Espindola a869576008 Create a MCSymbolELF.
This create a MCSymbolELF class and moves SymbolSize since only ELF
needs a size expression.

This reduces the size of MCSymbol from 56 to 48 bytes.

llvm-svn: 238801
2015-06-02 00:25:12 +00:00
Matthias Braun e20dc1cd3a ARM: Thumb2 LDRD/STRD supports independent input/output regs
The existing code would unnecessarily break LDRD/STRD apart with
non-adjacent registers, on thumb2 this is not necessary.

Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore
as there is not reason to set register hints anymore, changing that is
something for a future patch however.

Differential Revision: http://reviews.llvm.org/D9694

llvm-svn: 238795
2015-06-01 23:27:08 +00:00
Matthias Braun 72b8f74813 AArch64: Use CMP;CCMP sequences for and/or/setcc trees.
Previously CCMP/FCCMP instructions were only used by the
AArch64ConditionalCompares pass for control flow. This patch uses them
for SELECT like instructions as well by matching patterns in ISelLowering.

PR20927, rdar://18326194

Differential Revision: http://reviews.llvm.org/D8232

llvm-svn: 238793
2015-06-01 22:31:17 +00:00
Alexei Starovoitov dadc97767f [bpf] fix build
fix breakage due to r238634

Patch by Vijay Subramanian.

llvm-svn: 238792
2015-06-01 22:24:36 +00:00
Matt Arsenault a0269b6d20 R600/SI: Don't hardcode pointer type
llvm-svn: 238789
2015-06-01 21:58:24 +00:00
Matthias Braun ec50fa6f8c ARMLoadStoreOptimizer: Fix doxygen comments; NFC
llvm-svn: 238784
2015-06-01 21:26:23 +00:00
Rafael Espindola b5815b4738 Revert "[Hexagon] Adding basic ELF relocation generation and testing advanced relaxation codepath."
This reverts commit r238748.

It broke the msan bot:

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/4372/steps/check-llvm%20msan/logs/stdio

llvm-svn: 238772
2015-06-01 19:20:47 +00:00
Vasileios Kalintiris cbbf8e0a39 [mips][FastISel] Implement bswap.
Summary: Implement bswap intrinsic for MIPS FastISel. It's very different for misp32 r1/r2 .

Based on a patch by Reed Kotler.

Test Plan:
bswap1.ll
test-suite

Reviewers: dsanders, rkotler

Subscribers: llvm-commits, rfuhler

Differential Revision: http://reviews.llvm.org/D7219

llvm-svn: 238760
2015-06-01 16:40:45 +00:00
Vasileios Kalintiris bdb91b31f0 [mips][FastISel] Implement intrinsics memset, memcopy & memmove.
Summary:
Implement the intrinsics memset, memcopy and memmove in MIPS FastISel.
Make some needed infrastructure fixes so that this can work.

Based on a patch by Reed Kotler.

Test Plan:
memtest1.ll
The patch passes test-suite for mips32 r1/r2 and at O0/O2

Reviewers: rkotler, dsanders

Subscribers: llvm-commits, rfuhler

Differential Revision: http://reviews.llvm.org/D7158

llvm-svn: 238759
2015-06-01 16:36:01 +00:00
Vasileios Kalintiris 8fcb3986d0 [mips][FastISel] Implement srem/urem and sdiv/udiv instructions.
Summary: Implement the LLVM assembly urem/srem and sdiv/udiv instructions in MIPS FastISel.

Based on a patch by Reed Kotler.

Test Plan:
srem1.ll
div1.ll
test-suite at O0/O2 for mips32 r1/r2

Reviewers: dsanders, rkotler

Subscribers: llvm-commits, rfuhler

Differential Revision: http://reviews.llvm.org/D7028

llvm-svn: 238757
2015-06-01 16:17:37 +00:00
Vasileios Kalintiris 127f894b55 [mips][FastISel] Implement the select statement for MIPS FastISel.
Summary: Implement the LLVM IR select statement for MIPS FastISelsel.

Based on a patch by Reed Kotler.

Test Plan:
"Make check" test included now.
Passes test-suite at O2/O0 mips32 r1/r2.

Reviewers: dsanders, rkotler

Subscribers: llvm-commits, rfuhler

Differential Revision: http://reviews.llvm.org/D6774

llvm-svn: 238756
2015-06-01 15:56:40 +00:00
Vasileios Kalintiris 7f680e156e [mips][FastISel] Clobber HI0/LO0 registers in MUL instructions.
Summary:
The contents of the HI/LO registers are unpredictable after the execution of
the MUL instruction. In addition to implicitly defining these registers in the
MUL instruction definition, we have to mark those registers as dead too.

Without this the fast register allocator is running out of registers when the
MUL instruction is followed by another one that tries to allocate the AC0
register.

Based on a patch by Reed Kotler.

Reviewers: dsanders, rkotler

Subscribers: llvm-commits, rfuhler

Differential Revision: http://reviews.llvm.org/D9825

llvm-svn: 238755
2015-06-01 15:48:09 +00:00
Rafael Espindola 7f7caf9167 Fix relocation selection for foo-. on mips.
This handles only the 32 bit case.

llvm-svn: 238751
2015-06-01 15:10:51 +00:00
Rafael Espindola ccb8d1a114 Simplify code, NFC.
llvm-svn: 238750
2015-06-01 14:58:29 +00:00
Colin LeMahieu a739a4b3c7 [Hexagon] Adding basic ELF relocation generation and testing advanced relaxation codepath.
llvm-svn: 238748
2015-06-01 14:51:26 +00:00
Elena Demikhovsky 67afb630e1 AVX-512: Optimized vector shuffle for v16f32 and v16i32 types.
llvm-svn: 238743
2015-06-01 13:26:18 +00:00
Luke Cheeseman 85fd06d389 Re-commit of r238201 with fix for building with shared libraries.
llvm-svn: 238739
2015-06-01 12:02:47 +00:00
Elena Demikhovsky 3582eb3b39 AVX-512: Implemented VRANGEPD and VRANGEPD instructions for SKX.
Implemented DAG lowering for all these forms.
Added tests for encoding.

By Igor Breger (igor.breger@intel.com)

llvm-svn: 238738
2015-06-01 11:05:34 +00:00
Elena Demikhovsky 0c41088ebf AVX-512: Implemented vector shuffle lowering for v8i64 and v8f64 types.
I removed the vector-shuffle-512-v8.ll, it is auto-generated test, not valid any more.

llvm-svn: 238735
2015-06-01 09:49:53 +00:00
Elena Demikhovsky 75ede68793 AVX-512: added all forms of VPSHUFD and VPSHUFHW, VPSHUFLW
including encodings.

llvm-svn: 238729
2015-06-01 07:17:23 +00:00
Elena Demikhovsky 42c96d9c0a AVX-512: Implemented VFIXUPIMMPD and VFIXUPIMMPS instructions for KNL and SKX
Implemented DAG lowering for all these forms.
Added tests for encoding.

by Igor Breger (igor.breger@intel.com)

llvm-svn: 238728
2015-06-01 06:50:49 +00:00
Elena Demikhovsky dd68d0cb0f AVX-512: Fixed a bug in compress and expand intrinsics.
By Igor Breger (igor.breger@intel.com)

llvm-svn: 238724
2015-06-01 06:30:13 +00:00
Matt Arsenault bd7d80a4a6 Add address space argument to isLegalAddressingMode
This is important because of different addressing modes
depending on the address space for GPU targets.

This only adds the argument, and does not update
any of the uses to provide the correct address space.

llvm-svn: 238723
2015-06-01 05:31:59 +00:00
Rafael Espindola 5eb02e45e3 Simplify another function that doesn't fail.
llvm-svn: 238703
2015-06-01 00:27:26 +00:00
NAKAMURA Takumi 072a58a7fd ARMConstantIslandPass.cpp: Prune an empty \brief. [-Wdocumentation]
llvm-svn: 238697
2015-05-31 23:05:35 +00:00
Colin LeMahieu a97365b8e0 [Hexagon] Including raw_ostream for debug builds.
llvm-svn: 238695
2015-05-31 22:29:33 +00:00
Colin LeMahieu b819d3c465 [Hexagon] classes are actually structs.
llvm-svn: 238694
2015-05-31 22:18:42 +00:00
Colin LeMahieu b23c47bab3 [Hexagon] Adding MC packet shuffler.
llvm-svn: 238692
2015-05-31 21:57:09 +00:00
Tim Northover a603c4076c ARM: recommit r237590: allow jump tables to be placed as constant islands.
The original version didn't properly account for the base register
being modified before the final jump, so caused miscompilations in
Chromium and LLVM. I've fixed this and tested with an LLVM self-host
(I don't have the means to build & test Chromium).

The general idea remains the same: in pathological cases jump tables
can be too far away from the instructions referencing them (like other
constants) so they need to be movable.

Should fix PR23627.

llvm-svn: 238680
2015-05-31 19:22:07 +00:00
Colin LeMahieu b510fb38f5 [Hexagon] Adding override specifier and removing erroneous assertion
llvm-svn: 238664
2015-05-30 20:03:07 +00:00
Colin LeMahieu 86f218e7ec [Hexagon] Adding basic relaxation functionality.
llvm-svn: 238660
2015-05-30 18:55:47 +00:00
Simon Pilgrim f19ef9f741 Stripped trailing whitespace. NFC.
llvm-svn: 238654
2015-05-30 13:01:42 +00:00
Renato Golin 5d78c9ce58 Comment change. NFC
That comment misleads the current discussions in mentioned bug. Leave
the discussions to the bug. Also, adding a future change FIXME.

llvm-svn: 238653
2015-05-30 10:44:07 +00:00
Chandler Carruth cb58910ce8 [x86] Unify the horizontal adding used for popcount lowering taking the
best approach of each.

For vNi16, we use SHL + ADD + SRL pattern that seem easily the best.

For vNi32, we use the PUNPCK + PSADBW + PACKUSWB pattern. In some cases
there is a huge improvement with this in IACA's estimated throughput --
over 2x higher throughput!!!! -- but the measurements are too good to be
true. In one narrow case, the SHL + ADD + SHL + ADD + SRL pattern looks
slightly faster, but I'm not sure I believe any of the measurements at
this point. Both are the exact same uops though. Hard to be confident of
anything past that.

If anyone wants to collect very detailed (Agner-level) timings with the
result of this patch, or with the i32 case replaced with SHL + ADD + SHl
+ ADD + SRL, I'd be very interested. Note that you'll need to test it on
both Ivybridge and Haswell, with both SSE3, SSSE3, and AVX selected as
I saw unique behavior in each of these buckets with IACA all of which
should be checked against measured performance.

But this patch is still a useful improvement by dropping duplicate work
and getting the much nicer PSADBW lowering for v2i64.

I'd still like to rephrase this in terms of generic horizontal sum. It's
a bit lame to have a special case of that just for popcount.

llvm-svn: 238652
2015-05-30 10:35:03 +00:00
Renato Golin 230d298320 [ARMTargetParser] Move IAS arch ext parser. NFC
The plan was to move the whole table into the already existing ArchExtNames
but some fields depend on a table-generated file, and we don't yet have this
feature in the generic lib/Support side.

Once the minimum target-specific table-generated files are available in a
generic fashion to these libraries, we'll have to keep it in the ASM parser.

llvm-svn: 238651
2015-05-30 10:30:02 +00:00
Chandler Carruth 11e6f8fed1 [x86] Split out the horizontal byte sum lowering component of the LUT
lowering into a helper function.

NFC.

llvm-svn: 238650
2015-05-30 09:46:16 +00:00
Chandler Carruth 9cc2516676 [x86] Replace the long spelling of getting a bitcast with the *much*
shorter one. NFC.

In addition to being much shorter to type and requiring fewer arguments,
this change saves over 30 lines from this one file, all wasted on total
boilerplate...

llvm-svn: 238640
2015-05-30 04:23:13 +00:00
Chandler Carruth 060cdca996 [x86] Replace the long spelling of getting a bitcast with the new short
spelling. NFC.

llvm-svn: 238639
2015-05-30 04:19:57 +00:00
Chandler Carruth 502b23a7a9 [sdag] Add the helper I most want to the DAG -- building a bitcast
around a value using its existing SDLoc.

Start using this in just one function to save omg lines of code.

llvm-svn: 238638
2015-05-30 04:14:10 +00:00
Chandler Carruth 2599da3cfd [x86] Restore the bitcasts I removed when refactoring this to avoid
shifting vectors of bytes as x86 doesn't have direct support for that.

This removes a bunch of redundant masking in the generated code for SSE2
and SSE3.

In order to avoid the really significant code size growth this would
have triggered, I also factored the completely repeatative logic for
shifting and masking into two lambdas which in turn makes all of this
much easier to read IMO.

llvm-svn: 238637
2015-05-30 04:05:11 +00:00
Chandler Carruth 6ba9730a4e [x86] Implement a faster vector population count based on the PSHUFB
in-register LUT technique.

Summary:
A description of this technique can be found here:
http://wm.ite.pl/articles/sse-popcount.html

The core of the idea is to use an in-register lookup table and the
PSHUFB instruction to compute the population count for the low and high
nibbles of each byte, and then to use horizontal sums to aggregate these
into vector population counts with wider element types.

On x86 there is an instruction that will directly compute the horizontal
sum for the low 8 and high 8 bytes, giving vNi64 popcount very easily.
Various tricks are used to get vNi32 and vNi16 from the vNi8 that the
LUT computes.

The base implemantion of this, and most of the work, was done by Bruno
in a follow up to D6531. See Bruno's detailed post there for lots of
timing information about these changes.

I have extended Bruno's patch in the following ways:

0) I committed the new tests with baseline sequences so this shows
   a diff, and regenerated the tests using the update scripts.

1) Bruno had noticed and mentioned in IRC a redundant mask that
   I removed.

2) I introduced a particular optimization for the i32 vector cases where
   we use PSHL + PSADBW to compute the the low i32 popcounts, and PSHUFD
   + PSADBW to compute doubled high i32 popcounts. This takes advantage
   of the fact that to line up the high i32 popcounts we have to shift
   them anyways, and we can shift them by one fewer bit to effectively
   divide the count by two. While the PSHUFD based horizontal add is no
   faster, it doesn't require registers or load traffic the way a mask
   would, and provides more ILP as it happens on different ports with
   high throughput.

3) I did some code cleanups throughout to simplify the implementation
   logic.

4) I refactored it to continue to use the parallel bitmath lowering when
   SSSE3 is not available to preserve the performance of that version on
   SSE2 targets where it is still much better than scalarizing as we'll
   still do a bitmath implementation of popcount even in scalar code
   there.

With #1 and #2 above, I analyzed the result in IACA for sandybridge,
ivybridge, and haswell. In every case I measured, the throughput is the
same or better using the LUT lowering, even v2i64 and v4i64, and even
compared with using the native popcnt instruction! The latency of the
LUT lowering is often higher than the latency of the scalarized popcnt
instruction sequence, but I think those latency measurements are deeply
misleading. Keeping the operation fully in the vector unit and having
many chances for increased throughput seems much more likely to win.

With this, we can lower every integer vector popcount implementation
using the LUT strategy if we have SSSE3 or better (and thus have
PSHUFB). I've updated the operation lowering to reflect this. This also
fixes an issue where we were scalarizing horribly some AVX lowerings.

Finally, there are some remaining cleanups. There is duplication between
the two techniques in how they perform the horizontal sum once the byte
population count is computed. I'm going to factor and merge those two in
a separate follow-up commit.

Differential Revision: http://reviews.llvm.org/D10084

llvm-svn: 238636
2015-05-30 03:20:59 +00:00
Chandler Carruth c2e400de83 [x86] Restructure the parallel bitmath lowering of popcount into
a separate routine, generalize it to work for all the integer vector
sizes, and do general code cleanups.

This dramatically improves lowerings of byte and short element vector
popcount, but more importantly it will make the introduction of the
LUT-approach much cleaner.

The biggest cleanup I've done is to just force the legalizer to do the
bitcasting we need. We run these iteratively now and it makes the code
much simpler IMO. Other changes were minor, and mostly naming and
splitting things up in a way that makes it more clear what is going on.

The other significant change is to use a different final horizontal sum
approach. This is the same number of instructions as the old method, but
shifts left instead of right so that we can clear everything but the
final sum with a single shift right. This seems likely better than
a mask which will usually have to read the mask from memory. It is
certaily fewer u-ops. Also, this will be temporary. This and the LUT
approach share the need of horizontal adds to finish the computation,
and we have more clever approaches than this one that I'll switch over
to.

llvm-svn: 238635
2015-05-30 03:20:55 +00:00
Jim Grosbach 13760bd152 MC: Clean up MCExpr naming. NFC.
llvm-svn: 238634
2015-05-30 01:25:56 +00:00
Reid Kleckner e6531a5588 [WinEH] Adjust the 32-bit SEH prologue to better match reality
It turns out that _except_handler3 and _except_handler4 really use the
same stack allocation layout, at least today. They just make different
choices about encoding the LSDA.

This is in preparation for lowering the llvm.eh.exceptioninfo().

llvm-svn: 238627
2015-05-29 22:57:46 +00:00
Reid Kleckner 173a72524f Disable FP elimination in funcs using 32-bit MSVC EH personalities
The value in 'ebp' acts as an implicit argument to the outlined
handlers, and is recovered with frameaddress(1).

llvm-svn: 238619
2015-05-29 21:58:11 +00:00
Rafael Espindola 4d37b2a259 Remove getData.
This completes the mechanical part of merging MCSymbol and MCSymbolData.

llvm-svn: 238617
2015-05-29 21:45:01 +00:00
Reid Kleckner 5b8ebfbc25 Only add the EH state insertion pass on 32-bit Windows
llvm-svn: 238612
2015-05-29 20:43:10 +00:00
Rafael Espindola beb6060a51 Remove the MCSymbolData typedef.
The getData member function is next.

llvm-svn: 238611
2015-05-29 20:41:47 +00:00
Rafael Espindola b5d316bfc3 Rename getOrCreateSymbolData to registerSymbol and return void.
Another step in merging MCSymbol and MCSymbolData.

llvm-svn: 238607
2015-05-29 20:21:02 +00:00
Rafael Espindola e3b2acf274 Pass MCSymbols to the helper functions in MCELF.h.
llvm-svn: 238596
2015-05-29 18:47:23 +00:00
Rafael Espindola ece40ca43d Pass a MCSymbol to needsRelocateWithSymbol.
llvm-svn: 238589
2015-05-29 18:26:09 +00:00
Nemanja Ivanovic 376e17364f Add support for VSX FMA single-precision instructions to the PPC back end
This patch corresponds to review:
http://reviews.llvm.org/D9941

It adds the various FMA instructions introduced in the version 2.07 of
the ISA along with the testing for them. These are operations on single
precision scalar values in VSX registers.

llvm-svn: 238578
2015-05-29 17:13:25 +00:00
Reid Kleckner 1d3d4adbb9 [WinEH] Emit EH tables for __CxxFrameHandler3 on 32-bit x86
Small (really small!) C++ exception handling examples work on 32-bit x86
now.

This change disables the use of .seh_* directives in WinException when
CFI is not in use. It also uses absolute symbol references in the tables
instead of imagerel32 relocations.

Also fixes a cache invalidation bug in MMI personality classification.

llvm-svn: 238575
2015-05-29 17:00:57 +00:00
Jingyue Wu 995dde2799 [NVPTXFavorNonGenericAddrSpaces] recursively trace into GEP and BitCast
Summary:
This patch allows NVPTXFavorNonGenericAddrSpaces to remove addrspacecast
from longer chains consisting of GEPs and BitCasts. For example, it can
now optimize

  %0 = addrspacecast [10 x float] addrspace(3)* @a to [10 x float]*
  %1 = gep [10 x float]* %0, i64 0, i64 %i
  %2 = bitcast float* %1 to i32*
  %3 = load i32* %2 ; emits ld.u32

to

  %0 = gep [10 x float] addrspace(3)* @a, i64 0, i64 %i
  %1 = bitcast float addrspace(3)* %0 to i32 addrspace(3)*
  %3 = load i32 addrspace(3)* %1 ; emits ld.shared.f32

Test Plan: @ld_int_from_global_float in access-non-generic.ll

Reviewers: broune, eliben, jholewinski, meheff

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10074

llvm-svn: 238574
2015-05-29 17:00:27 +00:00
Colin LeMahieu 68d967d92e [Hexagon] Disassembling, printing, and emitting instructions a whole-bundle at a time which is the semantic unit for Hexagon. Fixing tests to use the new format. Disabling tests in the direct object emission path for a followup patch.
llvm-svn: 238556
2015-05-29 14:44:13 +00:00
Toma Tabacu b45fb36f20 [mips] Remove 2 unused variables in MipsTargetStreamer.cpp. NFC.
llvm-svn: 238554
2015-05-29 13:52:56 +00:00
Matthias Braun e41e146c16 CodeGen: Use mop_iterator instead of MIOperands/ConstMIOperands
MIOperands/ConstMIOperands are classes iterating over the MachineOperand
of a MachineInstr, however MachineInstr::mop_iterator does the same
thing.

I assume these two iterators exist to have a uniform interface to
iterate over the operands of a machine instruction bundle and a single
machine instruction. However in practice I find it more confusing to have 2
different iterator classes, so this patch transforms (nearly all) the
code to use mop_iterators.

The only exception being MIOperands::anlayzePhysReg() and
MIOperands::analyzeVirtReg() still needing an equivalent, I leave that
as an exercise for the next patch.

Differential Revision: http://reviews.llvm.org/D9932

This version is slightly modified from the proposed revision in that it
introduces MachineInstr::getOperandNo to avoid the extra counting
variable in the few loops that previously used MIOperands::getOperandNo.

llvm-svn: 238539
2015-05-29 02:56:46 +00:00
Reid Kleckner fe4d491bd9 [WinEH] Start inserting state number stores for C++ EH
This moves all the state numbering code for C++ EH to WinEHPrepare so
that we can call it from the X86 state numbering IR pass that runs
before isel.

Now we just call the same state numbering machinery and insert a bunch
of stores. It also populates MachineModuleInfo with information about
the current function.

llvm-svn: 238514
2015-05-28 22:00:24 +00:00
Rafael Espindola 3a5d3cce80 Remove a trivial forwarding function. NFC.
llvm-svn: 238506
2015-05-28 21:36:02 +00:00
Reid Kleckner bfcad2f181 Remove debug prints from r238487
llvm-svn: 238501
2015-05-28 21:23:53 +00:00
Reid Kleckner 80956a0142 Disable x86 tail call optimizations that jump through GOT
For x86 targets, do not do sibling call optimization when materializing
the callee's address would require a GOT relocation. We can still do
tail calls to internal functions, hidden functions, and protected
functions, because they do not require this kind of relocation. It is
still possible to get GOT relocations when the user explicitly asks for
it with musttail or -tailcallopt, both of which are supposed to
guarantee TCO.

Based on a patch by Chih-hung Hsieh.

Reviewers: srhines, timmurray, danalbert, enh, void, nadav, rnk

Subscribers: joerg, davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D9799

llvm-svn: 238487
2015-05-28 20:44:28 +00:00
Peter Collingbourne 450fbee6b2 Thumb2: Modify codegen for memcpy intrinsic to prefer LDM/STM.
We were previously codegen'ing these as regular load/store operations and
hoping that the register allocator would allocate registers in ascending order
so that we could apply an LDM/STM combine after register allocation. According
to the commit that first introduced this code (r37179), we planned to teach
the register allocator to allocate the registers in ascending order. This
never got implemented, and up to now we've been stuck with very poor codegen.

A much simpler approach for achiveing better codegen is to create LDM/STM
instructions with identical sets of virtual registers, let the register
allocator pick arbitrary registers and order register lists when printing an
MCInst. This approach also avoids the need to repeatedly calculate offsets
which ultimately ought to be eliminated pre-RA in order to decrease register
pressure.

This is implemented by lowering the memcpy intrinsic to a series of SD-only
MCOPY pseudo-instructions which performs a memory copy using a given number
of registers. During SD->MI lowering, we lower MCOPY to LDM/STM. This is a
little unusual, but it avoids the need to encode register lists in the SD,
and we can take advantage of SD use lists to decide whether to use the _UPD
variant of the instructions.

Fixes PR9199.

Differential Revision: http://reviews.llvm.org/D9508

llvm-svn: 238473
2015-05-28 20:02:45 +00:00
Reid Kleckner e2e57faa7d [WinEH] Remove debugging dump() call
llvm-svn: 238472
2015-05-28 20:02:05 +00:00
Chad Rosier adc06311ba Reuse Loc variable. NFC.
llvm-svn: 238448
2015-05-28 18:18:21 +00:00
Kai Nacke 3adf9b8d80 [mips] Add new format for dmtc2/dmfc2 for Octeon CPUs.
Octeon CPUs use dmtc2 rt,imm16 and dmfcp2 rt,imm16 for the crypto coprocessor.
E.g. dmtc2 rt,0x4057 starts calculation of sha-1.

I had to introduce a new deconding namespace to avoid a decoding conflict.

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D10083

llvm-svn: 238439
2015-05-28 16:23:16 +00:00
Petar Jovanovic 9720283e99 [Mips64] Add support for MCJIT for MIPS64r2 and MIPS64r6
Add support for resolving MIPS64r2 and MIPS64r6 relocations in MCJIT.

Patch by Vladimir Radosavljevic.

Differential Revision: http://reviews.llvm.org/D9667

llvm-svn: 238424
2015-05-28 13:48:41 +00:00
Benjamin Kramer dba7ee90b5 Don't call utostr in Twine/raw_ostream contexts.
Creating temporary std::strings there is unnecessary.

llvm-svn: 238412
2015-05-28 11:24:24 +00:00
Jingyue Wu c2a014697a [NaryReassociate] Run EarlyCSE after NaryReassociate
Summary:
This patch made two improvements to NaryReassociate and the NVPTX pipeline

1. Run EarlyCSE/GVN after NaryReassociate to get rid of redundant common
expressions.

2. When adding an instruction to SeenExprs, maps both the SCEV before and after
reassociation to that instruction.

Test Plan: updated @reassociate_gep_nsw in nary-gep.ll

Reviewers: meheff, broune

Reviewed By: broune

Subscribers: dberlin, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9947

llvm-svn: 238396
2015-05-28 04:56:52 +00:00
Renato Golin f7c0d5f247 ARMTargetParser: Normalising build attributes
Now that most of the methods in Clang and LLVM that were parsing arch/cpu/fpu
strings are using ARMTargetParser, it's time to make it a bit more conforming
with what the ABI says.

This commit adds some clarification on what build attributes are accepted and
which are "non-standard". It also makes clear that the "defaultCPU" and
"defaultArch" methods were really just build attribute getters.

It also diverges from GCC's behaviour to say that armv2/armv3 are really an
ARMv4 in the build attributes, when the ABI has a clear state for that: Pre-v4.

llvm-svn: 238344
2015-05-27 18:15:37 +00:00
Jan Vesely 6c12e2db06 R600: Rely on TypeLegalizer to use divrem instead of div/rem
reviewer: tstellardAMD
llvm-svn: 238337
2015-05-27 16:54:10 +00:00
Zoran Jovanovic 85a53a1ed5 [mips][microMIPSr6] Implement SEB and SEH instructions
Differential Revision: http://reviews.llvm.org/D9739

llvm-svn: 238333
2015-05-27 15:39:47 +00:00
Jozef Kolek 888830adfe [mips][microMIPSr6] Implement BEQZALC, BGEZALC, BGTZALC, BLEZALC, BLTZALC and BNEZALC instructions
This patch implements microMIPS32r6 BEQZALC, BGEZALC, BGTZALC, BLEZALC, BLTZALC
and BNEZALC instructions using mapping.

Differential Revision: http://reviews.llvm.org/D10031

llvm-svn: 238325
2015-05-27 14:19:22 +00:00
Elena Demikhovsky 86c7b46680 AVX-512: Fixed a bug in extracting subvector from v64i1
By Igor Breger (igor.breger@intel.com)

llvm-svn: 238322
2015-05-27 14:09:33 +00:00
Rafael Espindola f4a1365387 Use operator<< instead of print in a few more places.
llvm-svn: 238315
2015-05-27 13:05:42 +00:00
Daniel Sanders 8ef465f4bb Revert r238190 and r238197: [mips] Make TTypeEncoding indirect to allow .eh_frame to be read-only.
This broke the llvm-mips-linux builder and several of our out-of-tree builders.
Initial investigations show that the commit probably isn't the problem but
reverting anyway while I investigate.

llvm-svn: 238302
2015-05-27 08:44:01 +00:00
Elena Demikhovsky 3948c590e3 AVX-512: Implemented all forms of sign-extend and zero-extend instructions for KNL and SKX
Implemented DAG lowering for all these forms.
Added tests for DAG lowering and encoding.

By Igor Breger (igor.breger@intel.com)

llvm-svn: 238301
2015-05-27 08:15:19 +00:00
Quentin Colombet aa8020752e [X86] Implement the support for shrink-wrapping.
With this patch the x86 backend is now shrink-wrapping capable
and this functionality can be tested by using the
-enable-shrink-wrap switch.

The next step is to make more test and enable shrink-wrapping by
default for x86.

Related to <rdar://problem/20821487>

llvm-svn: 238293
2015-05-27 06:28:41 +00:00
Matthias Braun aa9fa35555 ARMLoadStoreOptimizer: Code cleanup; NFC
llvm-svn: 238289
2015-05-27 05:12:40 +00:00
Rafael Espindola 2fb8401b2a Print "lock \t foo" instead of "lock \n foo".
This gets gas and llc -filetype=obj to agree on the order of prefixes.

For llvm-mc we need to fix the asm parser to know that it makes a difference
on which line the "lock" is in.

Part of pr23594.

llvm-svn: 238232
2015-05-26 18:35:10 +00:00
Jan Vesely b670d37105 R600: Use SIGN_EXTEND_INREG for SEXT loads
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com>
llvm-svn: 238229
2015-05-26 18:07:22 +00:00
Jan Vesely a2143fa244 R600: Add comments to subword private address load lowering code
v2: Use C++ comments and end with periods

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com>
llvm-svn: 238228
2015-05-26 18:07:21 +00:00
Diego Novillo bfecc06656 Revert "Re-commit changes in r237579 with fix for bug breaking windows builds."
This reverts commit r238201 to fix linking problems in x86 Linux
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150525/278413.html

llvm-svn: 238223
2015-05-26 17:45:38 +00:00
Tom Stellard 245c15fce2 R600/SI: Add assembler support for all CI and VI VOP2 instructions
llvm-svn: 238211
2015-05-26 15:55:52 +00:00
Rafael Espindola bb9a71c1ed Replace getOrCreateSectionData with registerSection.
There is now no SectionData to be created.

llvm-svn: 238208
2015-05-26 15:07:25 +00:00
Luke Cheeseman a5d053d6f4 Re-commit changes in r237579 with fix for bug breaking windows builds.
llvm-svn: 238201
2015-05-26 13:40:31 +00:00
Luke Cheeseman 0af4f635f1 Test Commit
llvm-svn: 238199
2015-05-26 13:10:35 +00:00
Elena Demikhovsky 887baa0b49 AVX-512: fixed a bug in arithmetic operations lowering for i1 type
https://llvm.org/bugs/show_bug.cgi?id=23630

llvm-svn: 238198
2015-05-26 12:37:17 +00:00
Elena Demikhovsky b2b901c607 AVX-512: fixed a bug in lowering VSELECT for 512-bit vector
https://llvm.org/bugs/show_bug.cgi?id=23634

llvm-svn: 238195
2015-05-26 11:32:39 +00:00
Michael Kuperstein db0712f986 Use std::bitset for SubtargetFeatures.
Previously, subtarget features were a bitfield with the underlying type being uint64_t. 
Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset.
No functional change.

The first several times this was committed (e.g. r229831, r233055), it caused several buildbot failures.
Apparently the reason for most failures was both clang and gcc's inability to deal with large numbers (> 10K) of bitset constructor calls in tablegen-generated initializers of instruction info tables. 
This should now be fixed.

llvm-svn: 238192
2015-05-26 10:47:10 +00:00
Daniel Sanders 58ee4c9451 [mips] Make TTypeEncoding indirect to allow .eh_frame to be read-only.
Summary:
Following on from r209907 which made personality encodings indirect, do the
same for TType encodings. This fixes the case where a try/catch block needs
to generate references to, for example, std::exception in the
.gcc_except_table.

This commit uses DW_EH_PE_sdata8 for N64 as far as is possible at the moment.
However, it is possible to end up with DW_EH_PE_sdata4 when a TargetMachine is
not available. There's no risk of issues with inconsistency here since the
tables are self describing but it does mean there is a small chance of the
PC-relative offset being out of range for particularly large programs.

Reviewers: petarj

Reviewed By: petarj

Subscribers: srhines, joerg, tberghammer, llvm-commits

Differential Revision: http://reviews.llvm.org/D9669

llvm-svn: 238190
2015-05-26 10:19:18 +00:00
Rafael Espindola 64acc7fcc7 Remove most uses of MCSectionData from MCAssembler.
llvm-svn: 238172
2015-05-26 02:17:21 +00:00
Rafael Espindola 61e724a8c5 Stop using MCSectionData in MCMachObjectWriter.h.
llvm-svn: 238165
2015-05-26 01:15:30 +00:00
Rafael Espindola 079027ea90 Stop using MCSectionData in MCExpr.h.
llvm-svn: 238163
2015-05-26 00:52:18 +00:00
Rafael Espindola 7549f87672 Return a MCSection from MCFragment::getParent().
Another step in merging MCSectionData and MCSection.

llvm-svn: 238162
2015-05-26 00:36:57 +00:00
Rafael Espindola a554c05d95 Turn MCSectionData into a field of MCSection.
This also changes MCAssembler to store a vector of MCSections instead of an
iplist of MCSectionData.

llvm-svn: 238159
2015-05-25 23:14:17 +00:00
Simon Pilgrim 0be4fa761f [X86][AVX2] Vectorized i16 shift operators
Part of D9474, this patch extends AVX2 v16i16 types to 2 x 8i32 vectors and uses i32 shift variable shifts before packing back to i16.

Adds AVX2 tests for v8i16 and v16i16 

llvm-svn: 238149
2015-05-25 17:49:13 +00:00
Tom Stellard 50828163a1 R600/SI: Remove some unnecessary patterns from VINTRP multiclass
DisableEncoding and Constraints can be set using let statements around
the multiclass defs.

llvm-svn: 238148
2015-05-25 16:15:56 +00:00
Tom Stellard ec87f841c6 R600/SI: Fix bug with v_interp_p1_f32 instructions on 16 bank lds chips
The src and dst register cannot be the same on chips with 16 lds banks.

llvm-svn: 238147
2015-05-25 16:15:54 +00:00
Tom Stellard c70cf90d09 R600/SI: Use NAME rather than opName as the key to the MCOpcode tables
This lets us drop a parameter the opName parameter to the VINTRP
multiclass and makes it possible to create multiple VINTRP defs
with the same asm mnemonic.

llvm-svn: 238146
2015-05-25 16:15:50 +00:00
Kit Barton 6646033e6e This patch adds support for the vector quadword add/sub instructions introduced
in POWER8:

vadduqm
vaddeuqm
vaddcuq
vaddecuq
vsubuqm
vsubeuqm
vsubcuq
vsubecuq
In addition to adding the instructions themselves, it also adds support for the
v1i128 type for intrinsics (Intrinsics.td, Function.cpp, and
IntrinsicEmitter.cpp).

http://reviews.llvm.org/D9081

llvm-svn: 238144
2015-05-25 15:49:26 +00:00
Rafael Espindola 6e6820a7e6 Stop forwarding getOrdinal and setOrdinal.
llvm-svn: 238139
2015-05-25 14:12:48 +00:00
Michael Kuperstein f145228676 [X86] When pattern-matching scalar FMA3 intrinsics, don't re-arrange the first and second operands.
The semantics of the scalar FMA intrinsics are that the high vector elements are copied from the first source.
The existing pattern switches src1 and src2 around, to match the "213" order, which ends up tying the original src2 to the dest. Since the actual scalar fma3 instructions copy the high elements from the dest register, the wrong values are copied.

This modifies the pattern to leave src1 and src2 in their original order.

Differential Revision: http://reviews.llvm.org/D9908

llvm-svn: 238131
2015-05-25 12:35:25 +00:00
Elena Demikhovsky 1c1391ba24 Added promotion to EXTRACT_SUBVECTOR operand.
I encountered with this case in one of KNL tests for i1 vectors.
v16i1 = EXTRACT_SUBVECTOR v32i1, x

llvm-svn: 238130
2015-05-25 11:33:13 +00:00
NAKAMURA Takumi 5582a6a4a5 Reformat.
llvm-svn: 238126
2015-05-25 01:43:34 +00:00
NAKAMURA Takumi fb3bd7127a Prune CRLFs.
llvm-svn: 238125
2015-05-25 01:43:23 +00:00
Matt Arsenault 65ad1602b0 Add target hook to allow merging stores of nonzero constants
On GPU targets, materializing constants is cheap and stores are
expensive, so only doing this for zero vectors was silly.

Most of the new testcases aren't optimally merged, and are for
later improvements.

llvm-svn: 238108
2015-05-24 00:51:27 +00:00
Benjamin Kramer 1577f1f484 Bump SmallString to the minimum required amount for raw_ostream to avoid allocation.
NFC.

llvm-svn: 238104
2015-05-23 17:20:53 +00:00
Benjamin Kramer 33b4691fd0 [Mips] Prefer Twine::utohexstr over utohexstr, saves a string copy.
NFC.

llvm-svn: 238103
2015-05-23 16:53:07 +00:00
Benjamin Kramer be48c40475 [AArch64] Clean up the ELF streamer a bit.
llvm-svn: 238102
2015-05-23 16:39:10 +00:00
Benjamin Kramer 1d1b9243d5 [AArch64] Move AArch64TargetStreamer out of MCStreamer.h
It doesn't belong in the shared MC layer. NFC.

llvm-svn: 238101
2015-05-23 16:15:10 +00:00
Hal Finkel 5f2a1379ef [PowerPC] Fix fast-isel when compare is split from branch
When the compare feeding a branch was in a different BB from the branch, we'd
try to "regenerate" the compare in the block with the branch, possibly trying
to make use of values not available there. Copy a page from AArch64's play book
here to fix the problem (at least in terms of correctness).

Fixes PR23640.

llvm-svn: 238097
2015-05-23 12:18:10 +00:00
Akira Hatanaka ddf76aa36f Stop resetting NoFramePointerElim in TargetMachine::resetTargetOptions.
This is part of the work to remove TargetMachine::resetTargetOptions.

In this patch, instead of updating global variable NoFramePointerElim in
resetTargetOptions, its use in DisableFramePointerElim is replaced with a call
to TargetFrameLowering::noFramePointerElim. This function determines on a
per-function basis if frame pointer elimination should be disabled.

There is no change in functionality except that cl:opt option "disable-fp-elim"
can now override function attribute "no-frame-pointer-elim". 

llvm-svn: 238080
2015-05-23 01:14:08 +00:00
Rafael Espindola 445712264d Revert "make reciprocal estimate code generation more flexible by adding command-line options"
This reverts commit r238051.

It broke some bots:

http://lab.llvm.org:8011/builders/llvm-ppc64-linux1/builds/18190

llvm-svn: 238075
2015-05-23 00:22:44 +00:00
Sanjay Patel ba2ba80302 make reciprocal estimate code generation more flexible by adding command-line options
This patch adds a class for processing many recip codegen possibilities.
The TargetRecip class is intended to handle both command-line options to llc as well
as options passed in from a front-end such as clang with the -mrecip option.

The x86 backend is updated to use the new functionality.
Only -mcpu=btver2 with -ffast-math should see a functional change from this patch.
All other CPUs continue to *not* use reciprocal estimates by default with -ffast-math.

Differential Revision: http://reviews.llvm.org/D8982

llvm-svn: 238051
2015-05-22 21:10:06 +00:00
Chad Rosier 67336305f5 Use new MachineInstr mayLoadOrStore() API. NFC.
llvm-svn: 238044
2015-05-22 20:07:34 +00:00
Alexei Starovoitov 6296f6d7d8 [bpf] emit jmp fixups in little endian
The 'off' field of 'struct bpf_insn' is in cpu-endianness,
since the rest is emitted as little endian, make sure
that 'off' field is little endian as well.

llvm-svn: 238038
2015-05-22 18:47:33 +00:00
Quentin Colombet 494eb606cd Reapply r238011 with a fix for the trap instruction.
The problem was that I slipped a change required for shrink-wrapping, namely I
used getFirstTerminator instead of the getLastNonDebugInstr that was here before
the refactoring, whereas the surrounding code is not yet patched for that.

Original message:
[X86] Refactor the prologue emission to prepare for shrink-wrapping.

- Add a late pass to expand pseudo instructions (tail call and EH returns).
 Instead of doing it in the prologue emission.
- Factor some static methods in X86FrameLowering to ease code sharing.

NFC.

Related to <rdar://problem/20821487>

llvm-svn: 238035
2015-05-22 18:10:47 +00:00
Bill Schmidt e26236eed9 [PPC64] Add support for clrbhrb, mfbhrbe, rfebb.
This patch adds support for the ISA 2.07 additions involving the
branch history rolling buffer and event-based branching.  These will
not be used by typical applications, so built-in support is not
required.  They will only be available via inline assembly.

Assembly/disassembly tests are included in the patch.

llvm-svn: 238032
2015-05-22 16:44:10 +00:00
John Brawn c815a969c7 [ARM] Fix typo in subtarget feature list for 7em triple
The list of subtarget features for the 7em triple contains 't2xtpk',
which actually disables that subtarget feature. Correct that to
'+t2xtpk' and test that the instructions enabled by that feature do
actually work.

Differential Revision: http://reviews.llvm.org/D9936

llvm-svn: 238022
2015-05-22 14:16:22 +00:00
Tamas Berghammer 466692abdc Revert "[X86] Fix a variable name for r237977 so that it works with every compilers."
Revert "[X86] Refactor the prologue emission to prepare for shrink-wrapping."

This reverts commit 6b3b93fc8b68a2c806aa992ee4bd3d7f61898d4b.
This reverts commit ab0b15dff8539826283a59c2dd700a18a9680e0f.

llvm-svn: 238011
2015-05-22 10:01:56 +00:00
Quentin Colombet 04ac8fcbde [X86] Fix a variable name for r237977 so that it works with every compilers.
llvm-svn: 237980
2015-05-22 00:41:03 +00:00
Quentin Colombet faf4b57e1d [X86] Refactor the prologue emission to prepare for shrink-wrapping.
- Add a late pass to expand pseudo instructions (tail call and EH returns).
  Instead of doing it in the prologue emission.
- Factor some static methods in X86FrameLowering to ease code sharing.

NFC.

Related to <rdar://problem/20821487>

llvm-svn: 237977
2015-05-22 00:12:31 +00:00
Hal Finkel 82e1fc5fc7 [PPC] Correct iterator bug in PPCTLSDynamicCall
Unfortunately, I can't reduce a small test case for this (although compiling
mpfr-3.1.2 with -O2 -mcpu=a2 would fairly reliably trigger a crash), but the
problem is fairly clear (at least once you know you're looking for one). If the
TLS instruction being replaced was at the end of the block, we'd increment the
iterator past it (so it would then point to MBB.end()), and then we'd increment
it again as part of the for statement, thus overrunning the end of the list.
Don't do that.

llvm-svn: 237974
2015-05-21 23:45:49 +00:00
Peter Collingbourne 7e814d100b Revert r237590, "ARM: allow jump tables to be placed as constant islands."
Caused a miscompile of the Android port of Chromium, details
forthcoming.

llvm-svn: 237972
2015-05-21 23:20:55 +00:00
Chad Rosier a73b359542 Use new MachineInstr mayLoadOrStore() API.
llvm-svn: 237965
2015-05-21 21:59:57 +00:00
Chad Rosier ce8e5abbaf [AArch64] Enhance the load/store optimizer with target-specific alias analysis.
Phabricator: http://reviews.llvm.org/D9863
llvm-svn: 237963
2015-05-21 21:36:46 +00:00
David Blaikie 457343dcaa [opaque pointer type] Allow gep_type_iterator to work with the pointee type from the GEP instruction
The raw non-instruction/constant form of this is still relying on being
able to access the pointee type from a pointer type - those will be
cleaned up later. For now, just focus on the cases where the pointee
type is easily accessible.

llvm-svn: 237958
2015-05-21 21:12:43 +00:00
Rafael Espindola 967d6a6914 Stop forwarding (get|set)Aligment from MCSectionData to MCSection.
llvm-svn: 237956
2015-05-21 21:02:35 +00:00
Bill Schmidt e13ac91c5d [PPC64] Handle vpkudum mask pattern correctly when vpkudum isn't available
My recent patch to add support for ISA 2.07 vector pack/unpack
instructions didn't properly check for availability of the vpkudum
instruction when recognizing it as a special vector shuffle case.
This causes us to leave the vector shuffle in place (rather than
converting it to a vector permute) so that it can be recognized later
as a vpkudum, but that pattern is invalid for processors prior to
POWER8.  Thus LLVM crashes with an "unable to select" message.  We
observed this since one of our buildbots is configured to generate
code for a POWER7.

This patch fixes the problem by checking for availability of the
vpkudum instruction during custom lowering of vector shuffles.

I've added a test case variant for the vpkudum pattern when the
instruction isn't available.

llvm-svn: 237952
2015-05-21 20:48:49 +00:00
Hal Finkel 3b3c9c3e44 [PPC/LoopUnrollRuntime] Don't avoid high-cost trip count computation on the PPC/A2
On X86 (and similar OOO cores) unrolling is very limited, and even if the
runtime unrolling is otherwise profitable, the expense of a division to compute
the trip count could greatly outweigh the benefits. On the A2, we unroll a lot,
and the benefits of unrolling are more significant (seeing a 5x or 6x speedup
is not uncommon), so we're more able to tolerate the expense, on average, of a
division to compute the trip count.

llvm-svn: 237947
2015-05-21 20:30:23 +00:00
Nemanja Ivanovic f02def6cbc Add support for VSX scalar single-precision arithmetic in the PPC target
http://reviews.llvm.org/D9891
Following up on the VSX single precision loads and stores added earlier, this
adds support for elementary arithmetic operations on single precision values
in VSX registers. These instructions utilize the new VSSRC register class.
Instructions added:
xsaddsp
xsdivsp
xsmulsp
xsresp
xsrsqrtesp
xssqrtsp
xssubsp

llvm-svn: 237937
2015-05-21 19:32:49 +00:00
Rafael Espindola 0709a7bd1a Move alignment from MCSectionData to MCSection.
This starts merging MCSection and MCSectionData.

There are a few issues with the current split between MCSection and
MCSectionData.

* It optimizes the the not as important case. We want the production
of .o files to be really fast, but the split puts the information used
for .o emission in a separate data structure.

* The ELF/COFF/MachO hierarchy is not represented in MCSectionData,
leading to some ad-hoc ways to represent the various flags.

* It makes it harder to remember where each item is.

The attached patch starts merging the two by moving the alignment from
MCSectionData to MCSection.

Most of the patch is actually just dropping 'const', since
MCSectionData is mutable, but MCSection was not.

llvm-svn: 237936
2015-05-21 19:20:38 +00:00
Elena Demikhovsky 4aed59fc89 AVX-512: Enabled SSE intrinsics on AVX-512.
Predicate UseAVX depricates pattern selection on AVX-512.
This predicate is necessary for DAG selection to select EVEX form.
But mapping SSE intrinsics to AVX-512 instructions is not ready yet.
So I replaced UseAVX with HasAVX for intrinsics patterns.

llvm-svn: 237903
2015-05-21 14:01:32 +00:00
Simon Pilgrim f483abc14e Fixed unused variable warning in non-assert builds from rL237885
llvm-svn: 237889
2015-05-21 10:22:10 +00:00
Simon Pilgrim e054199354 [X86][SSE] Improve support for 128-bit vector sign extension
This patch improves support for sign extension of the lower lanes of vectors of integers by making use of the SSE41 pmovsx* sign extension instructions where possible, and optimizing the sign extension by shifts on pre-SSE41 targets (avoiding the use of i64 arithmetic shifts which require scalarization).

It converts SIGN_EXTEND nodes to SIGN_EXTEND_VECTOR_INREG where necessary, that more closely matches the pmovsx* instruction than the default approach of using SIGN_EXTEND_INREG which splits the operation (into an ANY_EXTEND lowered to a shuffle followed by shifts) making instruction matching difficult during lowering. Necessary support for SIGN_EXTEND_VECTOR_INREG has been added to the DAGCombiner.

Differential Revision: http://reviews.llvm.org/D9848

llvm-svn: 237885
2015-05-21 10:05:03 +00:00
Reid Kleckner 2632f0df48 [WinEH] Store pointers to the LSDA in the exception registration object
We aren't yet emitting the LSDA yet, so this will still fail to
assemble.

llvm-svn: 237852
2015-05-20 23:08:04 +00:00
Hans Wennborg a8f8df5dd2 Revert r237828 "[X86] Remove unused node after morphing it from shr to and."
This caused assertions during DAG combine: PR23601.

llvm-svn: 237843
2015-05-20 22:31:55 +00:00
Davide Italiano 141b2891cb [Target/ARM] Only enable OptimizeBarrierPass at -O1 and above.
Ideally this is going to be and LLVM IR pass (shared, among others
with AArch64), but for the time being just enable it if consumers
ask us for optimization and not unconditionally.

Discussed with Tim Northover on IRC.

llvm-svn: 237837
2015-05-20 21:40:38 +00:00
Duncan P. N. Exon Smith 92a699c50e MC: Remove most remaining uses of MCSymbolData::getSymbol(), NFC
Remove most remaining calls to `MCSymbolData::getSymbol()`, instead
using the already available `MCSymbol` directly.

llvm-svn: 237829
2015-05-20 20:18:16 +00:00
Benjamin Kramer a74480d1eb [X86] Remove unused node after morphing it from shr to and.
In some cases it won't get cleaned up properly leading to crashes
downstream. PR23353.

Based on a patch by Davide Italiano.

llvm-svn: 237828
2015-05-20 20:10:26 +00:00
Matthias Braun 6091208331 ARM: Fix comment and make it slightly more readable
llvm-svn: 237820
2015-05-20 18:40:06 +00:00
Pete Cooper 9e1d335697 Change Function::getIntrinsicID() to return an Intrinsic::ID. NFC.
Now that Intrinsic::ID is a typed enum, we can forward declare it and so return it from this method.

This updates all users which were either using an unsigned to store it, or had a now unnecessary cast.

llvm-svn: 237810
2015-05-20 17:16:39 +00:00
Duncan P. N. Exon Smith fd27a1dc1b MC: Update MCAssembler to use MCSymbol, NFC
Use `MCSymbol` over `MCSymbolData` where both are needed.

llvm-svn: 237803
2015-05-20 16:02:11 +00:00
Duncan P. N. Exon Smith 08b8726de3 MC: Use MCSymbol in MachObjectWriter, NFC
Replace uses of `MCSymbolData` with `MCSymbol` where both are needed, so
we can remove the backpointer.

llvm-svn: 237799
2015-05-20 15:16:14 +00:00
Elena Demikhovsky f61727d880 AVX-512: fixed algorithm of building vectors of i1 elements
fixed extract-insert i1 element,
load i1, zextload i1 should be with "and $1, %reg" to prevent loading garbage.
added a bunch of new tests.

llvm-svn: 237793
2015-05-20 14:32:03 +00:00
Daniel Sanders 69c6008e49 Revert r237789 - [mips] The naming convention for private labels is ABI dependant.
It works, but I've noticed that I missed several callers of createMCAsmInfo()
and many don't have a TargetMachine to provide.

llvm-svn: 237792
2015-05-20 14:18:59 +00:00
Daniel Sanders b718eca643 [mips] The naming convention for private labels is ABI dependant.
Summary:
For N32/N64, private labels begin with '.L' but for O32 they begin with '$'.

MCAsmInfo now has an initializer function which can be used to provide information from the TargetMachine to control the assembly syntax.

Reviewers: vkalintiris

Reviewed By: vkalintiris

Subscribers: jfb, sandeep, llvm-commits, rafael

Differential Revision: http://reviews.llvm.org/D9821

llvm-svn: 237789
2015-05-20 13:16:42 +00:00
Toma Tabacu 81496c1dec [mips] [IAS] Factor out .set nomacro warning. NFC.
Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9772

llvm-svn: 237780
2015-05-20 08:54:45 +00:00
David Majnemer 402c5def11 [X86] Implement the local-exec TLS model for Windows targets
We know that _tls_index is zero for local-exec TLS variables because
they are always defined in the executable.

llvm-svn: 237772
2015-05-20 04:45:26 +00:00
Alexei Starovoitov ccbcf7cfd3 [bpf] fix build
llvm-svn: 237751
2015-05-20 00:20:26 +00:00
Duncan P. N. Exon Smith 99d8a8e8ac MC: Take MCSymbol in MachObjectWriter::getSymbolAddress(), NFC
Pass through an `MCSymbol` instead of an `MCSymbolData` so we can get
rid of the back pointer.

llvm-svn: 237750
2015-05-20 00:02:39 +00:00
Duncan P. N. Exon Smith 2a40483418 MC: Use MCSymbol in MCAsmLayout::getSymbolOffset(), NFC
Continue to canonicalize on MCSymbol instead of MCSymbolData when both
are needed.

llvm-svn: 237749
2015-05-19 23:53:20 +00:00
Matthias Braun 07066cca20 MachineInstr: Remove unused parameter.
llvm-svn: 237726
2015-05-19 21:22:20 +00:00
Pete Cooper f0cd2b49f5 Remove unnecessary cast. NFC
llvm-svn: 237722
2015-05-19 20:50:14 +00:00
Zoran Jovanovic dde61c00c3 [mips][microMIPSr6] Implement NOR, OR, ORI, XOR and XORI instructions
Differential Revision: http://reviews.llvm.org/D8800

llvm-svn: 237697
2015-05-19 14:12:55 +00:00
Zoran Jovanovic 299fed6b7d [mips][microMIPSr6] Implement AND and ANDI instructions
Differential Revision: http://reviews.llvm.org/D8772

llvm-svn: 237696
2015-05-19 13:32:31 +00:00
Daniel Sanders c8cd58fa26 [mips] Correct and improve special-case shuffle instructions.
Summary:
The documentation writes vectors highest-index first whereas LLVM-IR writes
them lowest-index first. As a result, instructions defined in terms of
left_half() and right_half() had the halves reversed.

In addition to correcting them, they have been improved to allow shuffles
that use the same operand twice or in reverse order. For example, ilvev
used to accept masks of the form:
  <0, n, 2, n+2, 4, n+4, ...>
but now accepts:
  <0, 0, 2, 2, 4, 4, ...>
  <n, n, n+2, n+2, n+4, n+4, ...>
  <0, n, 2, n+2, 4, n+4, ...>
  <n, 0, n+2, 2, n+4, 4, ...>

One further improvement is that splati.[bhwd] is now the preferred instruction
for splat-like operations. The other special shuffles are no longer used
for splats. This lead to the discovery that <0, 0, ...> would not cause
splati.[hwd] to be selected and this has also been fixed.

This fixes the enc-3des test from the test-suite on Mips64r6 with MSA.

Reviewers: vkalintiris

Reviewed By: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9660

llvm-svn: 237689
2015-05-19 12:24:52 +00:00
Zoran Jovanovic 3825261572 [mips][microMIPSr6] Implement DIV, DIVU, MOD and MODU instructions
Differential Revision: http://reviews.llvm.org/D8769

llvm-svn: 237685
2015-05-19 11:21:37 +00:00
Michael Kuperstein e88a021bf4 [X86] ABI change for x86-32: pass 3 vector arguments in-register instead of 4, except on Darwin.
This changes the ABI used on 32-bit x86 for passing vector arguments. 
Historically, clang passes the first 4 vector arguments in-register, and additional vector arguments on the stack, regardless of platform. That is different from the behavior of gcc, icc, and msvc, all of which pass only the first 3 arguments in-register.
The 3-register convention is documented, unofficially, in Agner's calling convention guide, and, officially, in the recently released version 1.0 of the i386 psABI.

Darwin is kept as is because the OS X ABI Function Call Guide explicitly documents the current (4-register) behavior.

This fixes PR21510

Differential revision: http://reviews.llvm.org/D9644

llvm-svn: 237682
2015-05-19 11:06:56 +00:00
Reid Kleckner 7bc6b6210c Re-land r237175: [X86] Always return the sret parameter in eax/rax ...
This reverts commit r237210.

Also fix X86/complex-fca.ll to match the code that we used to generate
on win32 and now generate everwhere to conform to SysV.

llvm-svn: 237639
2015-05-18 23:35:09 +00:00
Jozef Kolek cc0c0fc926 [mips][microMIPSr6] Implement LSA instruction
This patch implements LSA instruction using mapping.

Differential Revision: http://reviews.llvm.org/D8919

llvm-svn: 237634
2015-05-18 23:12:10 +00:00
David Blaikie ff6409d096 Simplify IRBuilder::CreateCall* by using ArrayRef+initializer_list/braced init only
llvm-svn: 237624
2015-05-18 22:13:54 +00:00
Tim Northover d6223a2471 AArch64: work around ld64 bug more aggressively.
ld64 currently mishandles internal pointer relocations (i.e.
ARM64_RELOC_UNSIGNED referred to by section & offset rather than symbol). The
existing __cfstring clause was an early discovery and workaround for this, but
the problem is wider and we should avoid such relocations wherever possible for
now.

This code should be reverted to allowing internal relocations as soon as
possible.

PR23437.

llvm-svn: 237621
2015-05-18 22:07:20 +00:00
Matthias Braun fa3872e7ad MachineInstr: Change return value of getOpcode() to unsigned.
This was previously returning int. However there are no negative opcode
numbers and more importantly this was needlessly different from
MCInstrDesc::getOpcode() (which even is the value returned here) and
SDValue::getOpcode()/SDNode::getOpcode().

llvm-svn: 237611
2015-05-18 20:27:55 +00:00
Jim Grosbach 6f482000e9 MC: Clean up method names in MCContext.
The naming was a mish-mash of old and new style. Update to be consistent
with the new. NFC.

llvm-svn: 237594
2015-05-18 18:43:14 +00:00
Tim Northover 12c41af07c ARM: allow jump tables to be placed as constant islands.
Previously, they were forced to immediately follow the actual branch
instruction. This was usually OK (the LEAs actually accessing them got emitted
nearby, and weren't usually separated much afterwards). Unfortunately, a
sufficiently nasty phi elimination dumps many instructions right before the
basic block terminator, and this can increase the range too much.

This patch frees them up to be placed as usual by the constant islands pass,
and consequently has to slightly modify the form of TBB/TBH tables to refer to
a PC-relative label at the final jump. The other jump table formats were
already position-independent.

rdar://20813304

llvm-svn: 237590
2015-05-18 17:10:40 +00:00
James Y Knight c49e78851c Sparc: support the "set" synthetic instruction.
This pseudo-instruction expands into 'sethi' and 'or' instructions,
or, just one of them, if the other isn't necessary for a given value.

Differential Revision: http://reviews.llvm.org/D9089

llvm-svn: 237585
2015-05-18 16:43:33 +00:00
Oliver Stannard 6cb23465e0 Revert r237579, as it broke windows buildbots
llvm-svn: 237583
2015-05-18 16:39:16 +00:00
James Y Knight f7e7017281 Sparc: Support PSR, TBR, WIM read/write instructions.
Differential Revision: http://reviews.llvm.org/D8971

llvm-svn: 237582
2015-05-18 16:38:47 +00:00
James Y Knight 24060be73a Sparc: Add the "alternate address space" load/store instructions.
- Adds support for the asm syntax, which has an immediate integer
  "ASI" (address space identifier) appearing after an address, before
  a comma.

- Adds the various-width load, store, and swap in alternate address
  space instructions. (ldsba, ldsha, lduba, lduha, lda, stba, stha,
  sta, swapa)

This does not attempt to hook these instructions up to pointer address
spaces in LLVM, although that would probably be a reasonable thing to
do in the future.

Differential Revision: http://reviews.llvm.org/D8904

llvm-svn: 237581
2015-05-18 16:35:04 +00:00
James Y Knight 807563df22 Add support for the Sparc implementation-defined "ASR" registers.
(Note that register "Y" is essentially just ASR0).

Also added some test cases for divide and multiply, which had none before.

Differential Revision: http://reviews.llvm.org/D8670

llvm-svn: 237580
2015-05-18 16:29:48 +00:00
Oliver Stannard 0c553afe6a [LLVM - ARM/AArch64] Add ACLE special register intrinsics
This patch implements LLVM support for the ACLE special register intrinsics in
section 10.1, __arm_{w,r}sr{,p,64}.

This patch is intended to lower the read/write_register instrinsics, used to
implement the special register intrinsics in the clang patch for special
register intrinsics (see http://reviews.llvm.org/D9697), to ARM specific
instructions MRC,MCR,MSR etc. to allow reading an writing of coprocessor
registers in AArch32 and AArch64. This is done by inspecting the register
string passed to the intrinsic and then lowering to the appropriate
instruction.

Patch by Luke Cheeseman.

Differential Revision: http://reviews.llvm.org/D9699

llvm-svn: 237579
2015-05-18 16:23:33 +00:00
Jozef Kolek cbb227b48d [mips][microMIPSr6] Implement ALIGN and AUI instructions
This patch implements ALIGN and AUI instructions using mapping.

Differential Revision: http://reviews.llvm.org/D8782

llvm-svn: 237563
2015-05-18 11:44:30 +00:00
Elena Demikhovsky b8573cba02 AVX-512: Added intrinsics for ADDSS/D, MULSS/D, SUBSS/D, DIVSS/D
instructions. These intrinsics are comming with rounding mode.
Added intrinsics for MAXSS/D, MINSS/D - with and without  sae.

By Asaf Badouh (asaf.badouh@intel.com)

llvm-svn: 237560
2015-05-18 07:24:19 +00:00
Elena Demikhovsky 7eb367625e fixed compilation warning/error
llvm-svn: 237559
2015-05-18 07:10:25 +00:00
Elena Demikhovsky 08ce53c0ea AVX-512: Added patterns for scalar-to-vector broadcast
llvm-svn: 237558
2015-05-18 07:06:23 +00:00
Elena Demikhovsky ad9c396838 AVX-512: Added VBROADCASTF64X4, VBROADCASTF64X2, VBROADCASTI32X8, and other instructions from this set
Added encoding tests.

llvm-svn: 237557
2015-05-18 06:42:57 +00:00
Hal Finkel 8340de142c [PowerPC] Add extra r2 read deps on @toc@l relocations
If some commits are happy, and some commits are sad, this is a sad commit. It
is sad because it restricts instruction scheduling to work around a binutils
linker bug, and moreover, one that may never be fixed. On 2012-05-21, GCC was
updated not to produce code triggering this bug, and now we'll do the same...

When resolving an address using the ELF ABI TOC pointer, two relocations are
generally required: one for the high part and one for the low part. Only
the high part generally explicitly depends on r2 (the TOC pointer). And, so,
we might produce code like this:

.Ltmp526:
        addis 3, 2, .LC12@toc@ha
.Ltmp1628:
        std 2, 40(1)
        ld 5, 0(27)
        ld 2, 8(27)
        ld 11, 16(27)
        ld 3, .LC12@toc@l(3)
        rldicl 4, 4, 0, 32
        mtctr 5
        bctrl
        ld 2, 40(1)

And there is nothing wrong with this code, as such, but there is a linker bug
in binutils (https://sourceware.org/bugzilla/show_bug.cgi?id=18414) that will
misoptimize this code sequence to this:
        nop
        std     r2,40(r1)
        ld      r5,0(r27)
        ld      r2,8(r27)
        ld      r11,16(r27)
        ld      r3,-32472(r2)
        clrldi  r4,r4,32
        mtctr   r5
        bctrl
        ld      r2,40(r1)
because the linker does not know (and does not check) that the value in r2
changed in between the instruction using the .LC12@toc@ha (TOC-relative)
relocation and the instruction using the .LC12@toc@l(3) relocation.
Because it finds these instructions using the relocations (and not by
scanning the instructions), it has been asserted that there is no good way
to detect the change of r2 in between. As a result, this bug may never be
fixed (i.e. it may become part of the definition of the ABI). GCC was
updated to add extra dependencies on r2 to instructions using the @toc@l
relocations to avoid this problem, and we'll do the same here.

This is done as a separate pass because:
 1. These extra r2 dependencies are not really properties of the
    instructions, but rather due to a linker bug, and maybe one day we'll be
    able to get rid of them when targeting linkers without this bug (and,
    thus, keeping the logic centralized here will make that
    straightforward).
 2. There are ISel-level peephole optimizations that propagate the @toc@l
    relocations to some user instructions, and so the exta dependencies do
    not apply only to a fixed set of instructions (without undesirable
    definition replication).

The test case was reduced with the help of bugpoint, with minimal cleaning. I'm
looking forward to our upcoming MI serialization support, and with that, much
better tests can be created.

llvm-svn: 237556
2015-05-18 06:25:59 +00:00
Elena Demikhovsky a8200603d4 AVX-512: fixed extended load to 512-bit register
llvm-svn: 237537
2015-05-17 08:08:06 +00:00
Elena Demikhovsky 1d6a495d6d AVX-512: fixed a bug in mask operations - (i1 1) pattern
Filling k-reg with all-ones value was wrong,
(i1 1) should switch on only one bit in mask register

llvm-svn: 237536
2015-05-17 07:28:51 +00:00
Daniel Sanders d049669546 [x86] Distinguish the 'o', 'v', 'X', and 'i' inline assembly memory constraints.
Summary:
But still handle them the same way since I don't know how they differ on
this target.

Of these, 'o' and 'v' are not tested but were already implemented.

I'm not sure why 'i' is required for X86 since it's supposed to be an
immediate constraint rather than a memory constraint. A test asserts
without it so I've included it for now.

No functional change intended.

Reviewers: nadav

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8254

llvm-svn: 237517
2015-05-16 12:09:54 +00:00
Duncan P. N. Exon Smith 6e23e5a680 MC: Use MCSymbol in RelAndSymbol, NFC
Switch from `MCSymbolData` to `MCSymbol`.

llvm-svn: 237502
2015-05-16 01:14:19 +00:00