llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	5b8627aada	[X86][SSE] Regenerate scalar i64 uitofp test Added 32-bit target test llvm-svn: 283883	2016-10-11 14:01:38 +00:00
Simon Pilgrim	092cfc597f	[X86][SSE] Regenerate vector load-trunc test llvm-svn: 283881	2016-10-11 13:55:49 +00:00
Simon Pilgrim	fe9fa7314c	[X86][SSE] Regenerate vsplit and tests To make it more obvious how bad some of that truncation code is.... llvm-svn: 283880	2016-10-11 13:51:44 +00:00
Sanjay Patel	6d71f7b348	[x86] update test to use FileCheck and auto-generate checks llvm-svn: 283876	2016-10-11 13:36:07 +00:00
Oliver Stannard	d2083fb356	[Thumb] Save/restore high registers in Thumb1 pro/epilogues The high registers are not allocatable in Thumb1 functions, but they could still be used by inline assembly, so we need to save and restore the callee-saved high registers (r8-r11) in the prologue and epilogue. This is complicated by the fact that the Thumb1 push and pop instructions cannot access these registers. Therefore, we have to move them down into low registers before pushing, and move them back after popping into low registers. In most functions, we will have low registers that are also being pushed/popped, which we can use as the temporary registers for saving/restoring the high registers. However, this is not guaranteed, so we may need to push some extra low registers to ensure that the high registers can be saved/restored. For correctness, it would be sufficient to use just one low register, but if we have enough low registers available then we only need one push/pop instruction, rather than one per high register. We can also use the argument/return registers when they are not live, and the link register when saving (but not restoring), reducing the number of extra registers we need to push. There are still a few extreme edge cases where we need two push/pop instructions, because not enough low registers can be made live in the prologue or epilogue. In addition to the regression tests included here, I've also tested this using a script to generate functions which clobber different combinations of registers, have different numbers of argument and return registers (including variadic arguments), allocate different fixed sized objects on the stack, and do or don't use variable sized allocas and the __builtin_return_address intrinsic (all of which affect the available registers in the prologue and epilogue). I ran these functions in a test harness which verifies that all of the callee-saved registers are correctly preserved. Differential Revision: https://reviews.llvm.org/D24228 llvm-svn: 283867	2016-10-11 10:12:25 +00:00
Oliver Stannard	50a74393c2	[ARM] Fix registers clobbered by SjLj EH on soft-float targets Currently, the Int_eh_sjlj_dispatchsetup intrinsic is marked as clobbering all registers, including floating-point registers that may not be present on the target. This is technically true, as we could get linked against code that does use the FP registers, but that will not actually work, as the soft-float code cannot save and restore the FP registers. SjLj exception handling can only work correctly if either all or none of the code is built for a target with FP registers. Therefore, we can assume that, when Int_eh_sjlj_dispatchsetup is compiled for a soft-float target, it is only going to be linked against other soft-float code, and so only clobbers the general-purpose registers. This allows us to check that no non-savable registers are clobbered when generating the prologue/epilogue. Differential Revision: https://reviews.llvm.org/D25180 llvm-svn: 283866	2016-10-11 10:06:59 +00:00
Diana Picus	c93518db8c	[AArch64] Allow label arithmetic with add/sub/cmp Allow instructions such as 'cmp w0, #(end - start)' by folding the expression into a constant. For ELF, we fold only if the symbols are in the same section. For MachO, we fold if the expression contains only symbols that are not linker visible. Fixes https://llvm.org/bugs/show_bug.cgi?id=18920 Differential Revision: https://reviews.llvm.org/D23834 llvm-svn: 283862	2016-10-11 09:17:47 +00:00
George Rimar	5fecfaadc9	Reverted r283740 [Object/ELF] - Do not crash on invalid Header->e_shoff value. Bot does not like it: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/17075 /mnt/b/sanitizer-buildbot3/sanitizer-x86_64-linux-fast/build/llvm/test/Object/invalid.test:70:32: error: expected string not found in input INVALID-SEC-ADDRESS-ALIGNMENT: Invalid address alignment of section headers ^ <stdin>:1:1: note: scanning from here /mnt/b/sanitizer-buildbot3/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/Object/ELF.h:412:7: runtime error: upcast of misaligned address 0x000002d8b899 for type 'llvm::object::Elf_Shdr_Impl<llvm::object::ELFType<llvm::support::endianness::little, true> >', which requires 2 byte alignment ^ <stdin>:1:125: note: possible intended match here /mnt/b/sanitizer-buildbot3/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/Object/ELF.h:412:7: runtime error: upcast of misaligned address 0x000002d8b899 for type 'llvm::object::Elf_Shdr_Impl<llvm::object::ELFType<llvm::support::endianness::little, true> >', which requires 2 byte alignment llvm-svn: 283858	2016-10-11 08:12:27 +00:00
Daniel Jasper	0c42dc4784	Revert "Codegen: Tail-duplicate during placement." This reverts commit r283842. test/CodeGen/X86/tail-dup-repeat.ll causes an llc crash with our internal testing. I'll share a link with you. llvm-svn: 283857	2016-10-11 07:36:11 +00:00
Matthias Braun	74ad41c7cd	MIRParser: Rewrite register info initialization; mostly NFC This changes MachineRegisterInfo to be initialized after parsing all instructions. This is in preparation for upcoming commits that allow the register class specification on the operand or deduce them from the MCInstrDesc. This commit removes the unused feature of having nonsequential register numbers. This was confusing anyway as the vreg numbers would be different after parsing when you had "holes" in your numbering. This patch also introduces the concept of an incomplete virtual register. An incomplete virtual register may be used during .mir parsing to construct MachineOperands without knowing the exact register class (or register bank) yet. NFC except for some error messages. Differential Revision: https://reviews.llvm.org/D22397 llvm-svn: 283848	2016-10-11 03:13:01 +00:00
Kyle Butt	ae068a320c	Codegen: Tail-duplicate during placement. The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, especially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Issue with early tail-duplication of blocks that branch to a fallthrough predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283842	2016-10-11 01:20:33 +00:00
Dylan McKay	c328fe5af4	[RegAllocGreedy] Attempt to split unspillable live intervals Summary: Previously, when allocating unspillable live ranges, we would never attempt to split. We would always bail out and try last ditch graph recoloring. This patch changes this by attempting to split all live intervals before performing recoloring. This fixes LLVM bug PR14879. I can't add test cases for any backends other than AVR because none of them have small enough register classes to trigger the bug. Reviewers: qcolombet Subscribers: MatzeB Differential Revision: https://reviews.llvm.org/D25070 llvm-svn: 283838	2016-10-11 01:04:36 +00:00
David Majnemer	80dca0c78f	[InstCombine] Transform !range metadata to !nonnull when combining loads When combining an integer load with !range metadata that does not include 0 to a pointer load, make sure to emit !nonnull metadata on the newly-created pointer load. This prevents the !nonnull metadata from being dropped during a ptrtoint/inttoptr pair. This fixes PR30597. Patch by Ariel Ben-Yehuda! Differential Revision: https://reviews.llvm.org/D25215 llvm-svn: 283836	2016-10-11 01:00:45 +00:00
Quentin Colombet	d2623f8e38	[AArch64][InstructionSelector] Teach how to select FP load/store. This patch allows selecting 32- and 64-bit FP loads and stores. llvm-svn: 283832	2016-10-11 00:21:14 +00:00
Quentin Colombet	0e5312787e	[AArch64][InstructionSelector] Teach the selector how to handle vector OR. This only adds the support for 64-bit vector OR. Adding more sizes is not difficult, but it requires a bigger refactoring because ORs work on any size, not necessarily the ones that match the register width. Right now, this is not expressed in the legalization, so don't bother pushing the refactoring yet. llvm-svn: 283831	2016-10-11 00:21:11 +00:00
Quentin Colombet	d3126d5fb4	[AArch64][MachineLegalizer] Mark v2s32 G_LOAD as legal. Actually, all 64-bit loads are legal, but right now the API does not offer a simple way to express that. llvm-svn: 283829	2016-10-11 00:21:08 +00:00
Sanjay Patel	3013a62dd8	[x86] auto-generate checks llvm-svn: 283812	2016-10-10 22:04:12 +00:00
Sanjay Patel	b493cdaabf	[x86] auto-generate checks llvm-svn: 283811	2016-10-10 22:01:42 +00:00
Tim Northover	bdf1624367	GlobalISel: select G_GLOBAL_VALUE uses on AArch64. llvm-svn: 283809	2016-10-10 21:50:00 +00:00
Tim Northover	ad0acca544	GlobalISel: allow G_GLOBAL_VALUEs in AArch64 legalization. llvm-svn: 283808	2016-10-10 21:49:53 +00:00
Tim Northover	2fda4b08ae	GlobalISel: support selecting G_GEP instructions. They're basically just an alias for G_ADD on AArch64. llvm-svn: 283807	2016-10-10 21:49:49 +00:00
Tim Northover	4edc60d785	GlobalISel: support selecting constants on AArch64. llvm-svn: 283806	2016-10-10 21:49:42 +00:00
Hal Finkel	fcd2421667	[SelectionDAGBuilder] Support llvm.flt.rounds on targets where i32 is not legal Add integer expansion for FLT_ROUNDS_ for targets where i32 is not a legal type. Patch by Edward Jones, thanks! Differential Revision: https://reviews.llvm.org/D24459 llvm-svn: 283797	2016-10-10 20:45:15 +00:00
Adrian Prantl	3bfe1093df	Teach llvm::StripDebugInfo() about global variable !dbg attachments. This is a regression introduced by the global variable ownership reversal performed in r281284. rdar://problem/28448075 llvm-svn: 283784	2016-10-10 17:53:33 +00:00
Alexandros Lamprineas	20e9ddba73	[ARM] Fix invalid VLDM/VSTM access when targeting Big Endian with NEON The instructions VLDM/VSTM can only access word-aligned memory locations and produce an alignment fault if the condition is not met. The compiler currently generates VLDM/VSTM for v2f64 load/store regardless of the alignment of the memory access. Instead, if a v2f64 load/store is not word-aligned, the compiler should generate VLD1/VST1. For each non double-word-aligned VLD1/VST1, a VREV instruction should be generated when targeting Big Endian. Differential Revision: https://reviews.llvm.org/D25281 llvm-svn: 283763	2016-10-10 16:01:54 +00:00
Zvi Rackover	2a21f125bd	[X86] Prefer rotate by 1 over rotate by imm Summary: Rotate by 1 is translated to 1 micro-op, while rotate with imm8 is translated to 2 micro-ops. Fixes pr30644. Reviewers: delena, igorb, craig.topper, spatel, RKSimon Differential Revision: https://reviews.llvm.org/D25399 llvm-svn: 283758	2016-10-10 14:43:55 +00:00
Simon Pilgrim	cfef627b1f	[SLPVectorizer][X86] Add 512-bit sitofp/uitofp tests llvm-svn: 283756	2016-10-10 14:28:06 +00:00
Simon Pilgrim	2c0733c678	[SLPVectorizer][X86] Add avx512 sitofp/uitofp tests llvm-svn: 283751	2016-10-10 14:14:31 +00:00
Simon Pilgrim	6cadb5610e	[SLPVectorizer][X86] Fixed alignments of scalar loads in sitofp/uitofp tests Fixed copy+paste vector alignment to correct for per-element scalar loads Increased to 512-bit data sizes in preparation of avx512 tests llvm-svn: 283748	2016-10-10 14:10:41 +00:00
Simon Pilgrim	4aea8e8a39	Fixed windows stdout/stderr redirection in inline asm constraint tests llvm-svn: 283741	2016-10-10 11:11:27 +00:00
George Rimar	e4dce5ce3e	[Object/ELF] - Do not crash on invalid Header->e_shoff value. sections_begin() may return an unaligned pointer when Header->e_shoff is invalid. That may result in a crash in clients, for example we have one in LLD: assert((PtrWord & ~PointerBitMask) == 0 && "Pointer is not sufficiently aligned"); fails when trying to push_back Elf_Shdr* (unaligned) into TinyPtrVector. Patch forces check for alignment of Header->e_shoff. Differential revision: https://reviews.llvm.org/D25368 llvm-svn: 283740	2016-10-10 10:51:38 +00:00
Chris Dewhurst	850131213f	This pass, fixing an erratum in some LEON 2 processors, ensures that the SDIV instruction is not issued, but replaced by SDIVcc instead, which does not exhibit the error. Unit test included. Differential Review: https://reviews.llvm.org/D24660 llvm-svn: 283727	2016-10-10 08:53:06 +00:00
Craig Topper	9ece2f7529	[AVX-512] Add missing pattern sext or zext from bytes to quad words with a 128-bit load as input. llvm-svn: 283720	2016-10-10 06:25:48 +00:00
Craig Topper	0f905027b3	[AVX-512] Add test cases for AVX512 sign/zero extend instructions derived from the sse41 and avx2 test cases. Code will be improved in future commits. llvm-svn: 283719	2016-10-10 06:25:45 +00:00
Craig Topper	aba15075da	[AVX-512] Add an AVX512VL/BW command line to sse41-pmovxrm.ll and avx2-pmovxrm.ll. Also disable peephole so we really test pattern matching. llvm-svn: 283718	2016-10-10 06:25:42 +00:00
Michael Zuckerman	3eeac2d56b	[x86][inline-asm][llvm] accept 'v' constraint Commit in the name of: Coby Tayree 1. 'v' constraint for (x86) non-avx arch imitates the already implemented 'x' constraint, i.e. allows XMM{0-15} & YMM{0-15} depending on the apparent arch & mode (32/64). 2. for the avx512 arch it allows [X,Y,Z]MM{0-31} (mode dependent) This patch applies the needed changes to clang clang patch: https://reviews.llvm.org/D25004 Differential Revision: D25005 llvm-svn: 283717	2016-10-10 05:48:56 +00:00
Craig Topper	64378f4378	[AVX-512] Port 128 and 256-bit memory->register sign/zero extend patterns from SSE file. Also add a minimal set for 512-bit. llvm-svn: 283704	2016-10-09 23:08:39 +00:00
Zvi Rackover	b764bf2987	[X86] Adding the 'nounwind' attribute to test functions for cleaner generated code Thanks to RKSimon for the suggestion. llvm-svn: 283696	2016-10-09 13:33:51 +00:00
Zvi Rackover	f841080caf	[X86] Improve the rotate ISel test Summary: - Added 64-bit target testing. - Added 64-bit operand test cases. - Added cases that demonstrate pr30644 Reviewers: RKSimon, craig.topper, igorb Differential Revision: https://reviews.llvm.org/D25401 llvm-svn: 283695	2016-10-09 13:07:25 +00:00
Elena Demikhovsky	5b10aa1f1e	DAG: Setting Masked-Expand-Load as a variant of Masked-Load node Masked-expand-load node represents a load operation that loads a variable number of elements from memory according to the number of "true" bits in the mask and expands the loaded elements according to their position in the mask vector. Right now, the node is used in intrinsics for VEXPAND* instructions. The work is done towards implementation of masked.expandload and masked.compressstore intrinsics. Differential Revision: https://reviews.llvm.org/D25322 llvm-svn: 283694	2016-10-09 10:48:52 +00:00
Craig Topper	43973154dd	[AVX-512] Fix execution domain for EVEX encoded VINSERTPS. llvm-svn: 283692	2016-10-09 06:41:47 +00:00
Craig Topper	e30cb00dc0	[AVX-512] Add subvector insert and extract to load/store folding tables. llvm-svn: 283689	2016-10-09 03:54:13 +00:00
Craig Topper	50a468e03f	[AVX-512] Add avx512dq to the fp stack folding test. llvm-svn: 283688	2016-10-09 03:54:09 +00:00
Craig Topper	4262d53024	[AVX-512] Add the vector down convert instructions to the store folding tables. llvm-svn: 283687	2016-10-09 03:54:05 +00:00
Mehdi Amini	8ec7b4f588	ThinLTO: Fix Gold test after caching fix in r283655 (I don't have Gold available, so this is speculative) llvm-svn: 283681	2016-10-08 22:49:28 +00:00
Simon Pilgrim	319c094771	[X86][SSE] Regenerate select tests llvm-svn: 283674	2016-10-08 21:17:44 +00:00
Zvi Rackover	ce4900aaa6	Revert "[X86] Apply the Update LLC Test Checks tool on the rotate tests." This reverts commit 283667. llvm-svn: 283673	2016-10-08 20:54:20 +00:00
Simon Pilgrim	9e7a22fc13	[X86][SSE] Regenerate and add 32-bit tests to widening tests llvm-svn: 283672	2016-10-08 19:54:28 +00:00
Simon Pilgrim	30cbd1ab84	Fix comment typos - full update script path in assertions note llvm-svn: 283670	2016-10-08 18:51:55 +00:00
Craig Topper	2067142d7d	[AVX-512] Add test case for PR30430 that I should have added in r281959. llvm-svn: 283669	2016-10-08 18:50:00 +00:00
Craig Topper	086f0c1401	[AVX-512] Fix a bug in getLargestLegalSuperClass where we inflated to VR128X/VR256X even when VLX isn't supported. This seems to have been responsible for the XMM16-31 spills observed in PR29112. With this fixed the test case has been modified to no longer have a spill of XMM16. llvm-svn: 283668	2016-10-08 18:49:57 +00:00
Zvi Rackover	2413d475fc	[X86] Apply the Update LLC Test Checks tool on the rotate tests. Also added cases demonstrating pr30644. llvm-svn: 283667	2016-10-08 18:44:47 +00:00
Simon Pilgrim	d0d90fb9b2	[X86][AVX2] Regenerate and add 32-bit tests to core tests llvm-svn: 283666	2016-10-08 18:36:57 +00:00
Teresa Johnson	897bab9b35	[ThinLTO] Record calls to aliases Summary: When there is a call to an alias in the same module, we were not adding a call edge. So we could incorrectly think that the alias was dead if it was inlined in that function, despite having a reference imported elsewhere. This resulted in unsats at link time. Add a call edge when the call is to an alias. Reviewers: davide, mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25384 llvm-svn: 283664	2016-10-08 16:11:42 +00:00
Sebastian Pop	eb65d72d9c	[AArch64] Avoid generating indexed vector instructions for Exynos Avoid generating indexed vector instructions for Exynos. This is needed for fmla/fmls/fmul/fmulx. For example, the instruction fmla v0.4s, v1.4s, v2.s[1] is less efficient than the instructions dup v2.4s, v2.s[1] fmla v0.4s, v1.4s, v2.4s Patch written by Abderrazek Zaafrani. Differential Revision: https://reviews.llvm.org/D21571 llvm-svn: 283663	2016-10-08 12:30:07 +00:00
Adam Nemet	ee5cf031ce	[OptRemarks] Remove non-printable chars from function name Value names may be prefixed with a binary '1' to indicate that the backend should not modify the symbols due to any platform naming convention. This should not show up in the YAML opt record file because it breaks the YAML parser. llvm-svn: 283656	2016-10-08 04:47:20 +00:00
Mehdi Amini	f82bda0a7a	ThinLTO: don't perform incremental LTO on module without a hash Clang always emits a hash for ThinLTO, but as other frontends are starting to use ThinLTO, this could be a serious bug. Differential Revision: https://reviews.llvm.org/D25379 llvm-svn: 283655	2016-10-08 04:44:23 +00:00
Mehdi Amini	00fa1409ec	ThinLTO: handles modules with empty summaries We need to add an entry in the combined-index for modules that have a hash but otherwise empty summary, this is needed so that we can get the hash for the module. Also, if no entry is present in the combined index for a module, we need to skip it when trying to compute a cache entry. Differential Revision: https://reviews.llvm.org/D25300 llvm-svn: 283654	2016-10-08 04:44:18 +00:00
Mehdi Amini	01e0e136bd	Requires the AVR backend for running test/CodeGen/AVR llvm-svn: 283653	2016-10-08 04:39:34 +00:00
Kyle Butt	2facd194a2	Revert "Codegen: Tail-duplicate during placement." This reverts commit 71c312652c10f1855b28d06697c08d47e7a243e4. llvm-svn: 283647	2016-10-08 01:47:05 +00:00
Zachary Turner	3b14764ce5	[pdb] Dump Module Symbols to Yaml. This is the first step towards round-tripping symbol information, and thusly being able to write symbol information to a PDB. This patch writes the symbol information for each compiland to the Yaml when running in pdb2yaml mode. There are still some loose ends, such as what to do about relocations (necessary in order to print linkage names), how to print enums with friendly names, and how to give the dumper access to the StringTable, but this is a good first start. llvm-svn: 283641	2016-10-08 01:12:01 +00:00
Dylan McKay	12109e7314	Allow a maximum of 64 bits to be returned in registers The rest spills to the stack Authored by Jake Goulding llvm-svn: 283635	2016-10-08 01:05:09 +00:00
Dylan McKay	c1ff65cf62	[AVR] Expand MULHS for all types Once MULHS was expanded, this exposed an issue where the condition register was thought to be 16-bit. This caused an attempt to copy a 16-bit register to an 8-bit register. Authored by Jake Goulding llvm-svn: 283634	2016-10-08 01:01:49 +00:00
Hal Finkel	f495280a09	[llvm-opt-report] Don't leave space for opts that never happen Because screen space is precious, if an optimization (vectorization, for example) never happens, don't leave empty space for the associated markers on every line of the output. This makes the output much more compact, and allows for the later inclusion of markers for more (although perhaps rare) optimizations. llvm-svn: 283626	2016-10-08 00:26:54 +00:00
Gor Nishanov	1b6aec8e25	[coroutines] Store an address of destroy OR cleanup part in the coroutine frame. Summary: If heap allocation of a coroutine is elided, we need to make sure that we will update an address stored in the coroutine frame from f.destroy to f.cleanup. Before this change, CoroSplit synthesized these stores after coro.begin: ``` store void (%f.Frame) @f.resume, void (%f.Frame)* %resume.addr store void (%f.Frame) @f.destroy, void (%f.Frame)* %destroy.addr ``` In those cases where we did heap elision, but were not able to devirtualize all indirect calls, destroy call will attempt to "free" the coroutine frame stored on the stack. Oops. Now we use select to put an appropriate coroutine subfunction in the destroy slot. As below: ``` store void (%f.Frame) @f.resume, void (%f.Frame)* %resume.addr %0 = select i1 %need.alloc, void (%f.Frame) @f.destroy, void (%f.Frame) @f.cleanup store void (%f.Frame) %0, void (%f.Frame)* %destroy.addr ``` Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D25377 llvm-svn: 283625	2016-10-08 00:22:50 +00:00
Tom Stellard	5ab6154dc3	AMDGPU/SI: Handle div_fmas hazard in GCNHazardRecognizer Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25250 llvm-svn: 283622	2016-10-07 23:42:48 +00:00
Kyle Butt	37e676d857	Codegen: Tail-duplicate during placement. The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, especially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283619	2016-10-07 22:33:20 +00:00
Arnold Schwaighofer	3f25658143	swifterror: Don't compute swifterror vregs during instruction selection The code used llvm basic block predecessors to decide where to insert phi nodes. Instruction selection can and will liberally insert new machine basic block predecessors. There is not a guaranteed one-to-one mapping from pred. llvm basic blocks and machine basic blocks. Therefore the current approach does not work as it assumes we can mark predecessor machine basic block as needing a copy, and needs to know the set of all predecessor machine basic blocks to decide when to insert phis. Instead of computing the swifterror vregs as we select instructions, propagate them at the end of instruction selection when the MBB CFG is complete. When an instruction needs a swifterror vreg and we don't know the value yet, generate a new vreg and remember this "upward exposed" use, and reconcile this at the end of instruction selection. This will only happen if the target supports promoting swifterror parameters to registers and the swifterror attribute is used. rdar://28300923 llvm-svn: 283617	2016-10-07 22:06:55 +00:00
Davide Italiano	f6988d2980	[InstCombine] Don't unpack arrays that are too large (part 2). This is similar to r283599, but for store instructions. Thanks to David for pointing out! llvm-svn: 283612	2016-10-07 21:53:09 +00:00
Davide Italiano	da11412243	[InstCombine] Don't unpack arrays that are too large Differential Revision: https://reviews.llvm.org/D25376 llvm-svn: 283599	2016-10-07 20:57:42 +00:00
Tom Stellard	6982bb8f25	AMDGPU/SI: Add support for 8-byte relocations Reviewers: arsenm, kzhuravl Subscribers: wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25375 llvm-svn: 283593	2016-10-07 20:36:58 +00:00
Anna Thomas	e76d77ace5	[RS4GC] Strengthen coverage: add more tests Summary: Add tests for cases where we have zero coverage in RS4GC. Reviewers: sanjoy, reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25341 llvm-svn: 283591	2016-10-07 20:34:00 +00:00
Sanjay Patel	4326c4ac8f	[InstCombine] fold select X, (ext X), C If we're going to canonicalize IR towards select of constants, try harder to create those. Also, don't lose the metadata. This is actually 4 related transforms in one patch: // select X, (sext X), C --> select X, -1, C // select X, (zext X), C --> select X, 1, C // select X, C, (sext X) --> select X, C, 0 // select X, C, (zext X) --> select X, C, 0 Differential Revision: https://reviews.llvm.org/D25126 llvm-svn: 283575	2016-10-07 17:53:07 +00:00
Simon Pilgrim	f9648b72df	[X86][SSE] Reapplied: Add vector fcopysign combine tests Now with better lowering and fix for PR30443 llvm-svn: 283569	2016-10-07 16:00:59 +00:00
Artem Tamazov	73f1ab28cd	[AMDGPU][mc] Add support for buffer_load_dwordx3, buffer_store_dwordx3. Partially fixes Bug 28232. Lit tests added. Differential Revision: https://reviews.llvm.org/D25367 llvm-svn: 283567	2016-10-07 15:53:16 +00:00
Matthew Simpson	a371c14ffe	[LV] Don't mark multi-use branch conditions uniform Previously, we marked the branch conditions of latch blocks uniform after vectorization if they were instructions contained in the loop. However, if a condition instruction has users other than the branch, it may not remain uniform. This patch ensures the conditions we mark uniform are only used by the branch. This should fix PR30627. Reference: https://llvm.org/bugs/show_bug.cgi?id=30627 llvm-svn: 283563	2016-10-07 15:20:13 +00:00
Sam Kolton	a3ec5c10e2	[AMDGPU] Assembler: support v_mac_f32 DPP and SDWA. Move getNamedOperandIdx to AMDGPUBaseInfo.h Reviewers: artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D25084 llvm-svn: 283560	2016-10-07 14:46:06 +00:00
Simon Pilgrim	02f623e74c	[X86][SSE] Tidied up tests - use standard check prefixes llvm-svn: 283559	2016-10-07 14:42:22 +00:00
Tom Stellard	17eb3413cd	[ValueTracking] Fix crash in GetPointerBaseWithConstantOffset() Summary: While walking defs of pointer operands we were assuming that the pointer size would remain constant. This is not true, because addresspacecast instructions may cast the pointer to an address space with a different pointer width. This partially reverts r282612, which was a more conservative solution to this problem. Reviewers: reames, sanjoy, apilipenko Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D24772 llvm-svn: 283557	2016-10-07 14:23:29 +00:00
Konstantin Zhuravlyov	f74fc60a7d	[AMDGPU] Promote uniform (i1, i16] operations to i32 Differential Revision: https://reviews.llvm.org/D25302 llvm-svn: 283555	2016-10-07 14:22:58 +00:00
Martin Storsjo	04864f45b2	[ARM] Reapply: Use __rt_div functions for divrem on Windows Reapplying r283383 after revert in r283442. The additional fix is getting rid of a stray space in a function name, in the refactoring part of the commit. This avoids falling back to calling out to the GCC rem functions (__moddi3, __umoddi3) when targeting Windows. The __rt_div functions have flipped the two arguments compared to the __aeabi_divmod functions. To match MSVC, we emit a check for division by zero before actually calling the library function (even if the library function itself also might do the same check). Not all calls to __rt_div functions for division are currently merged with calls to the same function with the same parameters for the remainder. This is more wasteful than a div + mls as before, but avoids calls to __moddi3. Differential Revision: https://reviews.llvm.org/D25332 llvm-svn: 283550	2016-10-07 13:28:53 +00:00
Javed Absar	fb4b6e8db9	[ARM]: Add Cortex-R52 target to LLVM This patch adds Cortex-R52, the new ARM real-time processor, to LLVM. Cortex-R52 implements the ARMv8-R architecture. llvm-svn: 283542	2016-10-07 12:06:40 +00:00
Simon Pilgrim	a5d019ee95	[X86][SSE] Update register class during MOVSD/MOVSS - BLENDPD/BLENDPS commutation MOVSD/MOVSS take a 128-bit register and a FR32/FR64 register input, the commutation code wasn't taking this into account leading to verification errors. This patch inserts a vreg copy mi to ensure that the registers are correct. Fix for PR30607 Differential Revision: https://reviews.llvm.org/D25280 llvm-svn: 283539	2016-10-07 11:18:38 +00:00
Alexey Bataev	6ad5da7c81	[SLPVectorizer] Fix for PR25748: reduction vectorization after loop unrolling. The following code is not vectorized by the SLPVectorizer: ``` int test(unsigned int *p) { int sum = 0; for (int i = 0; i < 8; i++) sum += p[i]; return sum; } ``` During optimization this loop is fully unrolled and SLPVectorizer is unable to vectorize it. Patch tries to fix this problem. Differential Revision: https://reviews.llvm.org/D24796 llvm-svn: 283535	2016-10-07 09:39:22 +00:00
Oliver Stannard	4df1cc0b00	[ARM] Don't convert switches to lookup tables of pointers with ROPI/RWPI With the ROPI and RWPI relocation models we can't always have pointers to global data or functions in constant data, so don't try to convert switches into lookup tables if any value in the lookup table would require a relocation. We can still safely emit lookup tables of other values, such as simple constants. Differential Revision: https://reviews.llvm.org/D24462 llvm-svn: 283530	2016-10-07 08:48:24 +00:00
Nicolai Haehnle	87bc4c218b	AMDGPU: Fix use-after-free in SIOptimizeExecMasking Summary: There was a bug with sequences like s_mov_b64 s[0:1], exec s_and_b64 s[2:3]<def>, s[0:1], s[2:3]<kill> ... s_mov_b64_term exec, s[2:3] because s[2:3] was defined and used in the same instruction, ending up with SaveExecInst inside OtherUseInsts. Note that the test case also exposes an unrelated bug. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98028 Reviewers: tstellarAMD, arsenm Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25306 llvm-svn: 283528	2016-10-07 08:40:14 +00:00
Matt Arsenault	93401f4b5e	AMDGPU: Change check prefix in test llvm-svn: 283521	2016-10-07 03:55:04 +00:00
Hal Finkel	bd5a172d9c	[llvm-opt-report] Left justify unrolling counts, etc. In the left part of the reports, we have things like U<number>; if some of these numbers use more digits than others, we don't want a space in between the U and the start of the number. Instead, the space should come afterward. This way it is clear that the number goes with the U and not any other optimization indicator that might come later on the line. llvm-svn: 283518	2016-10-07 01:57:06 +00:00
David Majnemer	8c03c1bade	[SimplifyCFG] Correctly test for unconditional branches in GetCaseResults GetCaseResults assumed that a terminator with one successor was an unconditional branch. This is not necessarily the case, it could be a cleanupret. Strengthen the check by querying whether or not the terminator is exceptional. llvm-svn: 283517	2016-10-07 01:38:35 +00:00
Hal Finkel	16d29e3111	[llvm-opt-report] Use -no-demangle to disable demangling As this is intended to be a user-facing option, -no-demangle seems much better than -demangle=0. Add testing for the option. llvm-svn: 283516	2016-10-07 01:30:59 +00:00
Michael Kuperstein	5185b7dde3	[LV] Remove triples from target-independent vectorizer tests. NFC. Vectorizer tests in the target-independent directory should not have a target triple. If a test really needs to query a specific backend, it belongs in the right target subdirectory (which "REQUIRES" the right backend). Otherwise, it should not specify a triple. llvm-svn: 283512	2016-10-06 23:57:25 +00:00
Dan Gohman	2726b88c03	[WebAssembly] Implement block signatures. Per spec changes, this implements block signatures, and adds just enough logic to produce correct block signatures at the ends of functions. Differential Revision: https://reviews.llvm.org/D25144 llvm-svn: 283503	2016-10-06 22:29:32 +00:00
Dan Gohman	3a643e8d46	[WebAssembly] Remove loop's bottom label. Per spec changes, loop constructs no longer have a bottom label. https://reviews.llvm.org/D25118 llvm-svn: 283502	2016-10-06 22:10:23 +00:00
Dan Gohman	7f1bdb2e02	[WebAssembly] Remove the output operand from stores. Per spec changes, store instructions in WebAssembly no longer have a return value. Update the instruction descriptions. Differential Revision: https://reviews.llvm.org/D25122 llvm-svn: 283501	2016-10-06 22:08:28 +00:00
Wolfgang Pieb	e51bede1d8	Preserve the debug location when CodeGenPrepare sinks a compare instruction into the basic block of a user. Patch by Andrea DiBiagio. Differential Revision: https://reviews.llvm.org/D24632 llvm-svn: 283500	2016-10-06 21:43:45 +00:00
Pirama Arumuga Nainar	cc152ac794	Handle *_EXTEND_VECTOR_INREG during Integer Legalization Summary: These nodes need legalization for 3-element vectors. This commit handles the legalization and adds tests for zext and sext. This fixes PR30614. Reviewers: RKSimon, srhines Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25268 llvm-svn: 283496	2016-10-06 21:27:05 +00:00
Rong Xu	0e79f7d11d	[PGO] Create weak alias for the renamed Comdat function Add a weak alias to the renamed Comdat function in IR level instrumentation, using its original name. This ensures the same behavior w/ and w/o IR instrumentation, even for non standard conforming code. Differential Revision: http://reviews.llvm.org/D25339 llvm-svn: 283490	2016-10-06 20:38:13 +00:00
Michael Kuperstein	e524e22846	[X86] Preserve BasePtr for LEA64_32r When replacing FrameIndex with BasePtr, we must preserve BasePtr for LEA64_32r since BasePtr is used later for stack adjustment if it is the same as StackPtr. Patch by H.J Lu <hjl.tools@gmail.com> Differential Revision: https://reviews.llvm.org/D23575 llvm-svn: 283486	2016-10-06 19:31:27 +00:00
Simon Pilgrim	bddb412896	[X86][SSE] Add f16/f80/f128 vector sitofp test cases As discussed on D23808 llvm-svn: 283485	2016-10-06 19:29:25 +00:00
Michael Kuperstein	7cc2123847	[DAG] Generalize build_vector -> vector_shuffle combine for more than 2 inputs This generalizes the build_vector -> vector_shuffle combine to support any number of inputs. The idea is to create a binary tree of shuffles, where the first layer performs pairwise shuffles of the input vectors placing each input element into the correct lane, and the rest of the tree blends these shuffles together. This doesn't try to be smart and create any sort of "optimal" shuffles. The assumption is that even a "poor" shuffle sequence is better than extracting and inserting the elements one by one. Differential Revision: https://reviews.llvm.org/D24683 llvm-svn: 283480	2016-10-06 18:58:24 +00:00
Michael Ilseman	6d6b4d87a3	Revert "Add -strip-nonlinetable-debuginfo capability" This reverts commit r283473. Reverted until review is completed. llvm-svn: 283478	2016-10-06 18:30:26 +00:00
Michael Ilseman	d0a4db7632	Add -strip-nonlinetable-debuginfo capability This adds a new function to DebugInfo.cpp that takes an llvm::Module as input and removes all debug info metadata that is not directly needed for line tables, thus effectively stripping all type and variable information from the module. The primary motivation for this feature was the bitcode work flow (cf. http://lists.llvm.org/pipermail/llvm-dev/2016-June/100643.html for more background). This is not wired up yet, but will be in subsequent patches. For testing, the new functionality is exposed to opt with a -strip-nonlinetable-debuginfo option. The secondary use-case (and one that works right now!) is as a reduction pass in bugpoint. I added two new bugpoint options (-disable-strip-debuginfo and -disable-strip-debug-types) to control the new features. By default it will first attempt to remove all debug information, then only the type info, and then proceed to hack at any remaining MDNodes. llvm-svn: 283473	2016-10-06 17:58:38 +00:00
Matt Arsenault	6bc43d8627	BranchRelaxation: Support expanding unconditional branches AMDGPU needs to expand unconditional branches in a new block with an indirect branch. llvm-svn: 283464	2016-10-06 16:20:41 +00:00
Krzysztof Parzyszek	d391d6f1c3	[Hexagon] Avoid replacing full regs with subregisters in tied operands Doing so will result in the two-address pass generating incorrect code. llvm-svn: 283463	2016-10-06 16:18:04 +00:00
Matt Arsenault	ef5bba0136	BranchRelaxation: Account for function alignment llvm-svn: 283462	2016-10-06 16:00:58 +00:00
Matt Arsenault	36919a4f7c	Move AArch64BranchRelaxation to generic code llvm-svn: 283459	2016-10-06 15:38:53 +00:00
Nirav Dave	ee554e6155	[X86] Fix intel syntax push parsing bug Change erroneous parsing of push immediate instructions in intel syntax to default to pointer size by rewriting into the ATT style for matching. This fixes PR22028. Reviewers: majnemer, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25288 llvm-svn: 283457	2016-10-06 15:28:08 +00:00
Rafael Espindola	d9525a166d	Centralize sh_entsize checking. llvm-svn: 283455	2016-10-06 15:08:10 +00:00
Rafael Espindola	c3befb2e39	Refactor to use getSectionContentsAsArray. This centralizes quite a bit of error checking. llvm-svn: 283454	2016-10-06 14:47:04 +00:00
Sam Kolton	3381d7a216	[AMDGPU] Disassembler: print label names in branch instructions Summary: Add AMDGPUSymbolizer for finding names for labels from ELF symbol table. Initialize MCObjectFileInfo with some default values. Reviewers: vpykhtin, artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D24802 llvm-svn: 283450	2016-10-06 13:46:08 +00:00
Hal Finkel	4d6f3088c3	[llvm-opt-report] Record VF, etc. correctly for multiple opts on one line When there are multiple optimizations on one line, record the vectorization factors, etc. correctly (instead of incorrectly substituting default values). llvm-svn: 283443	2016-10-06 11:58:52 +00:00
Diana Picus	6341e46cd1	Revert "[ARM] Use __rt_div functions for divrem on Windows" This reverts commit r283383 because it broke some of the bots: undefined reference to ` __aeabi_uldivmod' It affected (at least) clang-cmake-armv7-a15-selfhost, clang-cmake-armv7-a15-selfhost and clang-native-arm-lnt. llvm-svn: 283442	2016-10-06 11:24:29 +00:00
Hal Finkel	47faf3be89	[llvm-opt-report] Print line numbers starting from 1 Line numbers should start from 1, not 2. llvm-svn: 283440	2016-10-06 11:11:11 +00:00
Zvi Rackover	08a37f46e3	Add test-cases which demonstrate pr30561 llvm-svn: 283436	2016-10-06 10:04:00 +00:00
Bjorn Pettersson	3961603921	[ValueTracking] Teach computeKnownBits and ComputeNumSignBits to look through ExtractElement. Summary: The computeKnownBits and ComputeNumSignBits functions in ValueTracking can now do a simple look-through of ExtractElement. Reviewers: majnemer, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24955 llvm-svn: 283434	2016-10-06 09:56:21 +00:00
James Molloy	6215fad0e9	[ARM] Constant pool promotion - fix alignment calculation Global variables are GlobalValues, so they have explicit alignment. Querying DataLayout for the alignment was incorrect. Testcase added. llvm-svn: 283423	2016-10-06 07:56:00 +00:00
James Molloy	78561c4917	[ARM] Improve testcase for r283323 We can work around a shortcoming of FileCheck by using {{\[}} to match a square bracket before a [[ sequence. Thanks to Eli Friedman for the heads up! llvm-svn: 283422	2016-10-06 07:44:05 +00:00
Konstantin Zhuravlyov	b4eb5d5049	[AMDGPU] Promote uniform i16 bitreverse intrinsic to i32 Differential Revision: https://reviews.llvm.org/D25121 llvm-svn: 283415	2016-10-06 02:20:46 +00:00
Sanjay Patel	edc2baddf8	[DAG] add tests to show missing checks for SDNode FMF The AVX attribute is added to remove noise caused by SSE's destructive insts. llvm-svn: 283410	2016-10-05 23:20:32 +00:00
Hal Finkel	5d0fbbbca1	Fix tests for Windows We need to match file names with both forward and backward slashes. llvm-svn: 283407	2016-10-05 22:48:13 +00:00
Reid Kleckner	bb96df602e	[codeview] Truncate records to maximum record size near 64KB If we don't truncate, LLVM asserts when the label difference doesn't fit in a 16 bit field. This patch truncates two kinds of data: trailing null terminated names in symbol records, and inline line tables. The inline line table test that I have is too large (many MB), so I'm not checking it in. Hopefully fixes PR28264. llvm-svn: 283403	2016-10-05 22:36:07 +00:00
Hal Finkel	5aa0248059	[llvm-opt-report] Distinguish inlined contexts when optimizations differ How code is optimized sometimes, perhaps often, depends on the context into which it was inlined. This change allows llvm-opt-report to track the differences between the optimizations performed, or not, in different contexts, and when these differ, display those differences. For example, this code: $ cat /tmp/q.cpp void bar(); void foo(int n) { for (int i = 0; i < n; ++i) bar(); } void quack() { foo(4); } void quack2() { foo(4); } will now produce this report: < /home/hfinkel/src/llvm/test/tools/llvm-opt-report/Inputs/q.cpp 2 \| void bar(); 3 \| void foo(int n) { [[ > foo(int): 4 \| for (int i = 0; i < n; ++i) > quack(), quack2(): 4 U4 \| for (int i = 0; i < n; ++i) ]] 5 \| bar(); 6 \| } 7 \| 8 \| void quack() { 9 I \| foo(4); 10 \| } 11 \| 12 \| void quack2() { 13 I \| foo(4); 14 \| } 15 \| Note that the tool has demangled the function names, and grouped the reports associated with line 4. This shows that the loop on line 4 was unrolled by a factor of 4 when inlined into the functions quack() and quack2(), but not in the function foo(int) itself. llvm-svn: 283402	2016-10-05 22:25:33 +00:00
Adrian Prantl	b3510afcd1	Verifier: Reject any unknown named MD nodes in the llvm.dbg namespace. This came out of a discussion in https://reviews.llvm.org/D25285. There used to be various other llvm.dbg.* nodes, but we don't support upgrading them and we want to reserve the namespace for future uses. This also removes an entirely obsolete and bitrotted testcase for PR7662. Reapplies 283390 with a forgotten testcase. llvm-svn: 283400	2016-10-05 22:15:37 +00:00
Adrian Prantl	497f085475	Revert "Verifier: Reject any unknown named MD nodes in the llvm.dbg namespace." Forgot to add a testcase in r283390. llvm-svn: 283399	2016-10-05 22:15:34 +00:00
Hal Finkel	52031b7e65	Add an llvm-opt-report tool to generate basic source-annotated optimization summaries LLVM now has the ability to record information from optimization remarks in a machine-consumable YAML file for later analysis. This can be enabled in opt (see r282539), and D25225 adds a Clang flag to do the same. This patch adds llvm-opt-report, a tool to generate basic optimization "listing" files (annotated sources with information about what optimizations were performed) from one of these YAML inputs. D19678 proposed to add this capability directly to Clang, but this more-general YAML-based infrastructure was the direction we decided upon in that review thread. For this optimization report, I focused on making the output as succinct as possible while providing information on inlining and loop transformations. The goal here is that the source code should still be easily readable in the report. My primary inspiration here is the reports generated by Cray's tools (http://docs.cray.com/books/S-2496-4101/html-S-2496-4101/z1112823641oswald.html). These reports are highly regarded within the HPC community. Intel's compiler, for example, also has an optimization-report capability (https://software.intel.com/sites/default/files/managed/55/b1/new-compiler-optimization-reports.pdf). $ cat /tmp/v.c void bar(); void foo() { bar(); } void Test(int *res, int *c, int *d, int *p, int n) { int i; #pragma clang loop vectorize(assume_safety) for (i = 0; i < 1600; i++) { res[i] = (p[i] == 0) ? res[i] : res[i] + d[i]; } for (i = 0; i < 16; i++) { res[i] = (p[i] == 0) ? res[i] : res[i] + d[i]; } foo(); foo(); bar(); foo(); } D25225 adds -fsave-optimization-record (and -fsave-optimization-record=filename), and this would be used as follows: $ clang -O3 -o /tmp/v.o -c /tmp/v.c -fsave-optimization-record $ llvm-opt-report /tmp/v.yaml > /tmp/v.lst $ cat /tmp/v.lst < /tmp/v.c 2 \| void bar(); 3 \| void foo() { bar(); } 4 \| 5 \| void Test(int *res, int *c, int *d, int *p, int n) { 6 \| int i; 7 \| 8 \| #pragma clang loop vectorize(assume_safety) 9 V4,2 \| for (i = 0; i < 1600; i++) { 10 \| res[i] = (p[i] == 0) ? res[i] : res[i] + d[i]; 11 \| } 12 \| 13 U16 \| for (i = 0; i < 16; i++) { 14 \| res[i] = (p[i] == 0) ? res[i] : res[i] + d[i]; 15 \| } 16 \| 17 I \| foo(); 18 \| 19 \| foo(); bar(); foo(); I \| ^ I \| ^ 20 \| } Each source line gets a prefix giving the line number, and a few columns for important optimizations: inlining, loop unrolling and loop vectorization. An 'I' is printed next to a line where a function was inlined, a 'U' next to an unrolled loop, and 'V' next to a vectorized loop. These are printed on the relevant code line when that seems unambiguous, or on subsequent lines when multiple potential options exist (messages, both positive and negative, from the same optimization with different column numbers are taken to indicate potential ambiguity). When on subsequent lines, a '^' is output in the relevant column. Annotated source for all relevant input files are put into the listing file (each starting with '<' and then the file name). You can disable having the unrolling/vectorization factors appear by using the -s flag. Differential Revision: https://reviews.llvm.org/D25262 llvm-svn: 283398	2016-10-05 22:10:35 +00:00
Sanjay Patel	5839858584	[DAG] change test to use 'unsafe' function attribute instead of global setting But we have node-level FMF, so the next step is to fix this at the instruction/node-level. llvm-svn: 283393	2016-10-05 21:43:50 +00:00
Adrian Prantl	71bba7253e	Verifier: Reject any unknown named MD nodes in the llvm.dbg namespace. This came out of a discussion in https://reviews.llvm.org/D25285. There used to be various other llvm.dbg.* nodes, but we don't support upgrading them and we want to reserve the namespace for future uses. This also removes an entirely obsolete and bitrotted testcase for PR7662. llvm-svn: 283390	2016-10-05 21:31:19 +00:00
Reid Kleckner	2b3e6428e5	[codeview] Translate bitpiece metadata to DEFRANGE_SUBFIELD* records This allows LLVM to describe locations of aggregate variables that have been split by SROA. Fixes PR29141 Reviewers: amccarth, majnemer Differential Revision: https://reviews.llvm.org/D25253 llvm-svn: 283388	2016-10-05 21:21:33 +00:00
Martin Storsjo	f997759aef	[ARM] Use __rt_div functions for divrem on Windows This avoids falling back to calling out to the GCC rem functions (__moddi3, __umoddi3) when targeting Windows. The __rt_div functions have flipped the two arguments compared to the __aeabi_divmod functions. To match MSVC, we emit a check for division by zero before actually calling the library function (even if the library function itself also might do the same check). Not all calls to __rt_div functions for division are currently merged with calls to the same function with the same parameters for the remainder. This is more wasteful than a div + mls as before, but avoids calls to __moddi3. Differential Revision: https://reviews.llvm.org/D24076 llvm-svn: 283383	2016-10-05 21:08:02 +00:00
James Y Knight	b0a473aaf8	[Sparc] Implement UMUL_LOHI and SMUL_LOHI instead of MULHS/MULHU/MUL. This is what the instruction-set actually provides, and the default expansions of the others into the lohi opcodes are good. llvm-svn: 283381	2016-10-05 20:54:17 +00:00
Yunzhong Gao	ba150d6156	Improve the debug-info test created in r274263. This patch is related to r274263 or Phabricator/D21818. This patch aims to improve the test case added in the previous commit to verify specifically that the stack protector pass is adding the debug line info as intended. Before, the test only verified that the verifier pass does not crash. The current approach is to generate the assembly output and then look for the .loc directive. Differential Revision: https://reviews.llvm.org/D25290 llvm-svn: 283374	2016-10-05 20:26:29 +00:00
Krzysztof Parzyszek	3b6cbd55f7	[RDF] Fix live def propagation through basic block llvm-svn: 283371	2016-10-05 20:08:09 +00:00
Reid Kleckner	f9dddec21c	Improve DEBUG_VALUE assembly comments for spilled bitpieces Previously we would give up when we saw the bitpiece DWARF expression and print "[complex expression]" when actually we handled bitpiece expressions outside the loop. llvm-svn: 283355	2016-10-05 18:36:02 +00:00
Simon Dardis	299dbd6cd1	[mips][ias] fix li macro when values are negated with ~ The integrated assembler evaluates the expressions such as ~0x80000000 to 0xffffffff7fffffff early in the parsing process. This patch adds compatibility with gas so that li loads the expected value (0x7fffffff) in those cases. This only occurs iff all the upper 32bits are set and maintains existing checks by not truncating the result down to 32 bits if any of the upper bits are not set. Reviewers: dsanders, zoran.jovanovic Differential Review: https://reviews.llvm.org/D23399 llvm-svn: 283353	2016-10-05 18:26:19 +00:00
Bjorn Pettersson	12559441bd	[DAG] Teach computeKnownBits and ComputeNumSignBits in SelectionDAG to look through EXTRACT_VECTOR_ELT. Summary: Both computeKnownBits and ComputeNumSignBits can now do a simple look-through of EXTRACT_VECTOR_ELT. It will compute the result based on the known bits (or known sign bits) for the vector that the element is extracted from. Reviewers: bogner, tstellarAMD, mkuper Subscribers: wdng, RKSimon, jyknight, llvm-commits, nhaehnle Differential Revision: https://reviews.llvm.org/D25007 llvm-svn: 283347	2016-10-05 17:40:27 +00:00
Bjorn Pettersson	ddd31e5637	Test commit permission. NFC llvm-svn: 283346	2016-10-05 17:22:11 +00:00
Simon Dardis	f45a59f80b	Recommit: "[mips] Add rsqrt, recip for MIPS" Add rsqrt.[ds], recip.[ds] for MIPS. Correct the microMIPS definitions for architecture support and register usage. Reviewers: vkalintiris, zoran.jovanoic Differential Review: https://reviews.llvm.org/D24499 llvm-svn: 283334	2016-10-05 16:11:01 +00:00
Hans Wennborg	c26c03d911	Revert r282920 "X86: Allow conditional tail calls in Win64 "leaf" functions (PR26302)" This is suspected to cause a miscompile in Chromium. Reverting while investigating. llvm-svn: 283329	2016-10-05 15:39:27 +00:00
Simon Dardis	bbfd528748	Revert "[mips] Add rsqrt, recip for MIPS" This reverts commit r282485 which contain two patches instead of one. llvm-svn: 283327	2016-10-05 15:28:33 +00:00
Douglas Katzman	0411e8669b	[X86] Don't randomly encode %rip where illegal Differential Revision: https://reviews.llvm.org/D25112 llvm-svn: 283326	2016-10-05 15:23:35 +00:00
James Molloy	b7de497cb9	[Thumb] Don't try and emit LDRH/LDRB from the constant pool This is not a valid encoding - these instructions cannot do PC-relative addressing. The underlying problem here is of whitelist in ARMISelDAGToDAG that unwraps ARMISD::Wrappers during addressing-mode selection. This didn't realise TargetConstantPool was actually possible, so didn't handle it. llvm-svn: 283323	2016-10-05 14:52:13 +00:00
Douglas Katzman	8449b238ea	[X86] Fix some tests that didn't assert anything llvm-svn: 283322	2016-10-05 14:46:14 +00:00
Oren Ben Simhon	0670e5a35b	Test commit permission llvm-svn: 283319	2016-10-05 14:12:41 +00:00
Krzysztof Parzyszek	e7c72cdbb0	Fix machine operand traversal in ScheduleDAGInstrs::fixupKills llvm-svn: 283315	2016-10-05 13:15:06 +00:00
Kyle Butt	25ac35d822	Revert "Codegen: Tail-duplicate during placement." This reverts commit 062ace9764953e9769142c1099281a345f9b6bdc. Issue with loop info and block removal revealed by polly. I have a fix for this issue already in another patch, I'll re-roll this together with that fix, and a test case. llvm-svn: 283292	2016-10-05 01:39:29 +00:00
Kyle Butt	adabac2d57	Codegen: Tail-duplicate during placement. The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, especially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283274	2016-10-04 23:54:18 +00:00
Sanjay Patel	bfdbea6481	[Target] move reciprocal estimate settings from TargetOptions to TargetLowering The motivation for the change is that we can't have pseudo-global settings for codegen living in TargetOptions because that doesn't work with LTO. Ideally, these reciprocal attributes will be moved to the instruction-level via FMF, metadata, or something else. But making them function attributes is at least an improvement over the current state. The ingredients of this patch are: Remove the reciprocal estimate command-line debug option. Add TargetRecip to TargetLowering. Remove TargetRecip from TargetOptions. Clean up the TargetRecip implementation to work with this new scheme. Set the default reciprocal settings in TargetLoweringBase (everything is off). Update the PowerPC defaults, users, and tests. Update the x86 defaults, users, and tests. Note that if this patch needs to be reverted, the related clang patch checked in at r283251 should be reverted too. Differential Revision: https://reviews.llvm.org/D24816 llvm-svn: 283252	2016-10-04 20:46:43 +00:00
Kevin Enderby	f993d6e72c	Next set of additional error checks for invalid Mach-O files for the load commands that uses the MachO::encryption_info_command and MachO::encryption_info_command types but not used in llvm libObject code but used in llvm tool code. This includes just LC_ENCRYPTION_INFO and LC_ENCRYPTION_INFO_64 load commands. llvm-svn: 283250	2016-10-04 20:37:43 +00:00
Matthias Braun	46a5238682	AArch64: Macrofusion: Split features, add missing combinations. AArch64InstrInfo::shouldScheduleAdjacent() determines whether two instruction can benefit from macroop fusion on apple CPUs. The list turned out to be incomplete: - the "rr" variants of the instructions were missing - even the "rs" variants can have shift value == 0 and behave like the "rr" variants This also splits the MacropFusion target feature into ArithmeticBccFusion and ArithmeticCbzFusion. Differential Revision: https://reviews.llvm.org/D25142 llvm-svn: 283243	2016-10-04 19:28:21 +00:00
Hal Finkel	bdd6735a9e	Don't filter diagnostics written as YAML to the output file The purpose of the YAML diagnostic output file is to collect information on optimizations performed, or not performed, for later processing by tools that help users (and compiler developers) understand how code was optimized. As such, the diagnostics that appear in the file should not be coupled to what a user might want to see summarized for them as the compiler runs, and in fact, because the user likely does not know what optimization diagnostics their tools might want to use, the user cannot provide a useful filter regardless. As such, we shouldn't filter the diagnostics going to the output file. Differential Revision: https://reviews.llvm.org/D25224 llvm-svn: 283236	2016-10-04 18:13:45 +00:00
Adam Nemet	0428e93217	Serialize remark argument as a mapping to get proper quotation for the value. llvm-svn: 283231	2016-10-04 17:05:04 +00:00
Alexey Bataev	7e217c2402	[SLPVectorizer] Add a test with non-vectorizable IR. llvm-svn: 283225	2016-10-04 15:07:23 +00:00
Anna Thomas	479cbb9405	[RS4GC] Handle ShuffleVector instruction in findBasePointer Summary: This patch modifies the findBasePointer to handle the shufflevector instruction. Tests run: RS4GC tests, local downstream tests. Reviewers: reames, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25197 llvm-svn: 283219	2016-10-04 13:48:37 +00:00
Andrey Bokhanko	6903be56d5	Fix IntegerType::MAX_INT_BITS value IntegerType::MAX_INT_BITS is apparently not in sync with Type::SubclassData size. This patch fixes this. Differential Revision: https://reviews.llvm.org/D24814 llvm-svn: 283215	2016-10-04 12:43:46 +00:00
Nemanja Ivanovic	6354d23555	[Power9] Exploit D-Form VSX Scalar memory ops that target full VSX register set This patch corresponds to review: The newly added VSX D-Form (register + offset) memory ops target the upper half of the VSX register set. The existing ones target the lower half. In order to unify these and have the ability to target all the VSX registers using D-Form operations, this patch defines Pseudo-ops for the loads/stores which are expanded post-RA. The expansion then chooses the correct opcode based on the register that was allocated for the operation. llvm-svn: 283212	2016-10-04 11:25:52 +00:00
Simon Dardis	86b3a1e79b	[mips][fastisel] Consider soft-float an unsupported floating point mode Treat soft-float as unsupported for fast-isel. Additionally, ensure we check that lowering f32 arguments also considers the case of soft-float mode. Reviewers: ehostunreach, vkalintiris, zoran.jovanovic Differential Review: https://reviews.llvm.org/D24505 llvm-svn: 283209	2016-10-04 10:35:07 +00:00
George Rimar	67443021a4	[Object/ELF] - Do not crash on invalid sh_offset value of REL[A] section. Previously code would access invalid memory and may crash, patch fixes the issue. Differential revision: https://reviews.llvm.org/D25187 llvm-svn: 283204	2016-10-04 09:25:39 +00:00
George Rimar	5cbf23664d	[Object/ELF] - Avoid possible crash in getExtendedSymbolTableIndex(). When using broken input object found using AFL, getExtendedSymbolTableIndex() crashed because ShndxTable was empty as object does not contain SHT_SYMTAB_SHNDX section. Differential revision: https://reviews.llvm.org/D25189 llvm-svn: 283196	2016-10-04 08:44:03 +00:00
Nemanja Ivanovic	a565d9e612	Fix a test case failure on Apple PPC. llvm-svn: 283191	2016-10-04 07:37:38 +00:00
Nemanja Ivanovic	11049f8f07	[Power9] Part-word VSX integer scalar loads/stores and sign extend instructions This patch corresponds to review: https://reviews.llvm.org/D23155 This patch removes the VSHRC register class (based on D20310) and adds exploitation of the Power9 sub-word integer loads into VSX registers as well as vector sign extensions. The new instructions are useful for a few purposes: Int to Fp conversions of 1 or 2-byte values loaded from memory Building vectors of 1 or 2-byte integers with values loaded from memory Storing individual 1 or 2-byte elements from integer vectors This patch implements all of those uses. llvm-svn: 283190	2016-10-04 06:59:23 +00:00
Kyle Butt	3ffb8529bc	Revert "Codegen: Tail-duplicate during placement." This reverts commit ff234efbe23528e4f4c80c78057b920a51f434b2. Causing crashes on aarch64 build. llvm-svn: 283172	2016-10-04 00:38:23 +00:00
Eli Friedman	74bed9d757	Make GlobalsAA ignore dead constant expressions. Slightly improves the precision of GlobalsAA in certain situations, and makes the behavior of optimization passes more predictable. Differential Revision: https://reviews.llvm.org/D24104 llvm-svn: 283165	2016-10-04 00:03:55 +00:00
Kyle Butt	396bfdd707	Codegen: Tail-duplicate during placement. The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, especially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. llvm-svn: 283164	2016-10-04 00:00:09 +00:00
Matthias Braun	d2fc0d40e4	Set some tests to an unknown vendor and OS This avoids llc using the hosts OS/vendor as defaults and triggering unwanted behaviour in the tests. This should deal with the buildbot breakages on windows after r283140. llvm-svn: 283149	2016-10-03 21:58:20 +00:00
Mehdi Amini	762d68bbab	[LTO] Fix test to not depend on the exact address of symbols, just their linkage llvm-svn: 283148	2016-10-03 21:40:50 +00:00
Krzysztof Parzyszek	c8b6ecabd8	[RDF] Fix liveness propagation through shadows Each shadow only represents data flow that is restricted to its reaching def. Propagating more than that could lead to spurious register liveness, resulting in extra (incorrectly) block live-ins. llvm-svn: 283143	2016-10-03 20:17:20 +00:00
Matthias Braun	eccdee9196	X86: Do not produce GOT relocations on windows Windows has no GOT relocations the way elf/darwin has. Some people use x86_64-pc-win32-macho to build EFI firmware; Do not produce GOT relocations for this target. Differential Revision: https://reviews.llvm.org/D24627 llvm-svn: 283140	2016-10-03 20:11:24 +00:00
Sanjoy Das	0359a193a7	[PruneEH] Be correct in the face of IPO This fixes one spot I had missed in r265762. Credit goes to Philip Reames for spotting this one! llvm-svn: 283137	2016-10-03 19:35:30 +00:00
Konstantin Zhuravlyov	691e2e020b	[AMDGPU] Sign extend AShr when promoting (instead of zero extending) llvm-svn: 283130	2016-10-03 18:29:01 +00:00
Hans Wennborg	b4d2678c6f	Jump threading: avoid trying to split edge into landingpad block (PR27840) Splitting the edge is nontrivial because of the landing pad, and we would currently assert trying to do it. Differential Revision: https://reviews.llvm.org/D24680 llvm-svn: 283129	2016-10-03 18:18:04 +00:00
Sanjay Patel	d27a21874b	[x86, SSE/AVX] allow 128/256-bit lowering for copysign vector intrinsics (PR30433) This should fix: https://llvm.org/bugs/show_bug.cgi?id=30433 There are a couple of open questions about the codegen: 1. Should we let scalar ops be scalars and avoid vector constant loads/splats? 2. Should we have a pass to combine constants such as the inverted pair that we have here? Differential Revision: https://reviews.llvm.org/D25165 llvm-svn: 283119	2016-10-03 16:38:27 +00:00
Rafael Espindola	d7325ee702	Don't drop the llvm. prefix when renaming. If the llvm. prefix is dropped other parts of llvm don't see this as an intrinsic. This means that the number of regular symbols depends on the context the module is loaded into, which causes LTO to abort. Fixes PR30509. llvm-svn: 283117	2016-10-03 15:51:42 +00:00
Nirav Dave	157891c57f	Prevent out of order HashDirective lexing in AsmLexer. Retrying after buildbot reset. To lex hash directives we peek ahead to find component tokens, create a unified token, and unlex the peeked tokens so the parser does not need to parse the tokens then. Make sure we do not lex another hash directive during peek operation. This fixes PR28921. Reviewers: rnk, loladiro Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24839 llvm-svn: 283111	2016-10-03 13:48:27 +00:00
Matt Arsenault	40bae76620	AMDGPU: Fix missing -verify-machineinstrs in test llvm-svn: 283107	2016-10-03 12:58:59 +00:00
Simon Pilgrim	52ab136881	[X86][SSE] Add PR30371 (shuffle constant folding) test case llvm-svn: 283103	2016-10-03 12:16:39 +00:00
Sjoerd Meijer	4dbe73c1ed	[ARM] Code size optimisation to lower udiv+urem to udiv+mls instead of a library call to __aeabi_uidivmod. This is an improved implementation of r280808, see also D24133, that got reverted because isel was stuck in a loop. That was caused by the optimisation incorrectly triggering on i64 ints, which shouldn't happen because there is no 64bit hwdiv support; that put isel's type legalization and this optimisation in a loop. A native ARM compiler and testing now shows that this is fixed. Patch mostly by Pablo Barrio. Differential Revision: https://reviews.llvm.org/D25077 llvm-svn: 283098	2016-10-03 10:12:32 +00:00
Alexey Bataev	fe91cf3aba	[CodeGen] Adding a test showing the current state of poor code gen of search loop, by Andrey Tischenko PR27136 shows failure to hoist constant out of loop. This test is used as start point to fix the failure: it shows the current state of codegen and discovers what should be fixed Differential Revision: https://reviews.llvm.org/D25097 llvm-svn: 283091	2016-10-03 07:47:01 +00:00
Simon Pilgrim	a8d2168cb0	[X86][AVX2] Add support for combining target shuffles to VPERMD/VPERMPS llvm-svn: 283080	2016-10-02 21:07:58 +00:00
Simon Pilgrim	bce1f6b491	[X86][AVX2] Missed opportunities to combine to VPERMD/VPERMPS llvm-svn: 283077	2016-10-02 20:43:02 +00:00
Simon Pilgrim	b5200971d6	[X86][AVX2] Fix typo in test names We are testing vpermps not vpermd llvm-svn: 283076	2016-10-02 19:31:58 +00:00
Sanjay Patel	170d7eb303	[x86] remove 'nan' strings from copysign assertions; NFC Preemptively scrubbing these to avoid a bot fail as in PR30443: https://llvm.org/bugs/show_bug.cgi?id=30443 I'm nearly done with a patch to fix these cases, so not trying very hard to do better for the temporary win. I plan to use better checks than what the script produces for the vectorized cases. llvm-svn: 283072	2016-10-02 17:07:24 +00:00
Sanjay Patel	dfbbbcd662	[x86] add test to show unnecessary scalarization of copysign intrinsics (PR30433) llvm-svn: 283071	2016-10-02 16:31:35 +00:00
Simon Pilgrim	03afbe783d	[X86][AVX] Ensure broadcast loads respect dependencies To allow broadcast loads of a non-zero'th vector element, lowerVectorShuffleAsBroadcast can replace a load with a new load with an adjusted address, but unfortunately we weren't ensuring that the new load respected the same dependencies. This patch adds a TokenFactor and updates all dependencies of the old load to reference the new load instead. Bug found during internal testing. Differential Revision: https://reviews.llvm.org/D25039 llvm-svn: 283070	2016-10-02 15:59:15 +00:00
Hal Finkel	a9321059b9	[PowerPC] Refactor soft-float support, and enable PPC64 soft float This change enables soft-float for PowerPC64, and also makes soft-float disable all vector instruction sets for both 32-bit and 64-bit modes. This latter part is necessary because the PPC backend canonicalizes many Altivec vector types to floating-point types, and so soft-float breaks scalarization support for many operations. Both for embedded targets and for operating-system kernels desiring soft-float support, it seems reasonable that disabling hardware floating-point also disables vector instructions (embedded targets without hardware floating point support are unlikely to have Altivec, etc. and operating system kernels desiring not to use floating-point registers to lower syscall cost are unlikely to want to use vector registers either). If someone needs this to work, we'll need to change the fact that we promote many Altivec operations to act on v4f32. To make it possible to disable Altivec when soft-float is enabled, hardware floating-point support needs to be expressed as a positive feature, like the others, and not a negative feature, because target features cannot have dependencies on the disabling of some other feature. So +soft-float has now become -hard-float. Fixes PR26970. llvm-svn: 283060	2016-10-02 02:10:20 +00:00
Martell Malone	3a4d900039	COFF: Fix short import lib import name type bitshift As per the PE COFF spec (section 8.3, Import Name Type) Offset: 18 Size 2 bits Name: Type Offset: 20 Size 3 bits Name: Name Type Offset: 20 added based on 18+2 Partially committed as rL279069 Differential Revision: https://reviews.llvm.org/D23540 llvm-svn: 283055	2016-10-01 23:10:20 +00:00
Simon Pilgrim	04e249b128	[SLPVectorizer][X86] Added fptosi/fptoui tests llvm-svn: 283048	2016-10-01 19:35:59 +00:00
Simon Pilgrim	1ec20e9b0a	[CostModel][X86] Added tests for current fptosi/fptoui costs llvm-svn: 283047	2016-10-01 19:09:59 +00:00
Simon Pilgrim	567c4fbdae	[SLPVectorizer][X86] Added fcopysign tests llvm-svn: 283046	2016-10-01 17:00:26 +00:00
Simon Pilgrim	cceeb2a4fa	[SLPVectorizer][X86] Added fabs tests llvm-svn: 283045	2016-10-01 16:54:01 +00:00
Simon Pilgrim	e0ec5c1f05	[CostModel][X86] Added fcopysign costs llvm-svn: 283044	2016-10-01 16:41:52 +00:00
Simon Pilgrim	8b021c382d	[CostModel][X86] Added fabs costs llvm-svn: 283042	2016-10-01 16:30:13 +00:00
Simon Pilgrim	1638d49f20	[X86][SSE] Add support for combining target shuffles to binary BLEND We already had support for 1-input BLEND with zero - this adds support for 2-input BLEND as well. llvm-svn: 283040	2016-10-01 16:04:28 +00:00
Simon Pilgrim	ae17cf20ce	[X86][SSE] Always combine target shuffles to MOVSD/MOVSS Now we can commute to BLENDPD/BLENDPS on SSE41+ targets if necessary, so simplify the combine matching where we can. This required me to add a couple of scalar math movsd/movss fold patterns that hadn't been needed in the past. llvm-svn: 283038	2016-10-01 15:33:01 +00:00
Simon Pilgrim	ccdd1ff49b	[X86][SSE] Enable commutation from MOVSD/MOVSS to BLENDPD/BLENDPS on SSE41+ targets Instead of selecting between MOVSD/MOVSS and BLENDPD/BLENDPS at shuffle lowering by subtarget this will help us select the instruction based on actual commutation requirements. We could possibly add BLENDPD/BLENDPS -> MOVSD/MOVSS commutation and MOVSD/MOVSS memory folding using a similar approach if it proves useful I avoided adding AVX512 handling as I'm not sure when we should be making use of VBLENDPD/VBLENDPS on EVEX targets llvm-svn: 283037	2016-10-01 14:26:11 +00:00
Simon Pilgrim	f1c575bad7	[X86][SSE] Regenerate vselect tests and improve AVX1/AVX2 coverage llvm-svn: 283035	2016-10-01 13:10:14 +00:00
Nirav Dave	e4c6153cf1	Revert "[MC] Prevent out of order HashDirective lexing in AsmLexer." This reverts commit r282992 which appears to be causing an LTO test failure. llvm-svn: 283034	2016-10-01 10:57:55 +00:00
Craig Topper	5eb5ade894	[X86] Cleanup patterns for using VMOVDDUP for broadcasts. -Remove OptForSize. Not all of the backend follows the same rules for creating broadcasts and there is no conflicting pattern. -Don't stop selecting VEX VMOVDDUP when AVX512 is supported. We need VLX for EVEX VMOVDDUP. -Only use VMOVDDUP for v2i64 broadcasts if AVX2 is not supported. llvm-svn: 283020	2016-10-01 07:11:24 +00:00
Craig Topper	8aca90507f	[AVX-512] Add VLX command lines to 128 and 256-bit shuffle tests. llvm-svn: 283014	2016-10-01 06:01:18 +00:00
Mehdi Amini	86eeda8e20	Revert "AMDGPU: Don't use offen if it is 0" This reverts commit r282999. Tests are not passing: http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/20038 llvm-svn: 283003	2016-10-01 02:35:24 +00:00
Matt Arsenault	3070fdf798	AMDGPU: Don't use offen if it is 0 This removes many re-initializations of a base register to 0. llvm-svn: 282999	2016-10-01 01:37:15 +00:00
Nirav Dave	9f2bd4e7ea	[MC] Prevent out of order HashDirective lexing in AsmLexer. To lex hash directives we peek ahead to find component tokens, create a unified token, and unlex the peeked tokens so the parser does not need to parse the tokens then. Make sure we do not lex another hash directive during peek operation. This fixes PR28921. Reviewers: rnk, loladiro Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24839 llvm-svn: 282992	2016-10-01 00:42:32 +00:00
Mehdi Amini	6610b01a27	[ASAN] Add the binder globals on Darwin to llvm.compiler.used to avoid LTO dead-stripping The binder is in a specific section that "reverse" the edges in a regular dead-stripping: the binder is live as long as a global it references is live. This is a big hammer that prevents LLVM from dead-stripping these, while still allowing linker dead-stripping (with special knowledge of the section). Differential Revision: https://reviews.llvm.org/D24673 llvm-svn: 282988	2016-10-01 00:05:34 +00:00
Reid Kleckner	9cb915b7be	[SEH] Emit the parent frame offset label even if there are no funclets This avoids errors about references to undefined local labels from unreferenced filter functions. Fixes (sort of) PR30431 llvm-svn: 282967	2016-09-30 22:10:12 +00:00
Rui Ueyama	5d6714e593	Do not pass a superblock to PDBFileBuilder. When we create a PDB file using PDBFileBuilder, the information in the superblock, such as the size of the resulting file, is not available. Previously, PDBFileBuilder::initialize took a superblock assuming that all the members of the struct are correct. That is useful when you want to restore the exact information from a YAML file, but that's probably the only use case in which that is useful. When we are creating a PDB file on the fly, we have to backfill the members. This patch redefines PDBFileBuilder::initialize to take only a block size. Now all the other members are left as default values, so that they'll be updated when commit() is called. Differential Revision: https://reviews.llvm.org/D25108 llvm-svn: 282944	2016-09-30 20:52:12 +00:00
Hans Wennborg	b5643b47b6	X86: Allow conditional tail calls in Win64 "leaf" functions (PR26302) We can't use Jcc to leave a Win64 function in general, because that confuses the unwinder. However, for "leaf" functions, that is, functions where the return address is always on top of the stack and which don't have unwind info, it's OK. Differential Revision: https://reviews.llvm.org/D24836 llvm-svn: 282920	2016-09-30 20:07:35 +00:00
Sanjay Patel	f7b851fe84	[InstCombine] allow non-splat folds of select cond (ext X), C llvm-svn: 282906	2016-09-30 19:49:22 +00:00
Dehao Chen	a067b0926f	Revert test change in r282894 as it's broken in some platforms. llvm-svn: 282903	2016-09-30 19:25:23 +00:00
Gor Nishanov	a263a60ad5	[Coroutines] Part15c: Fix coro-split to correctly handle definitions between coro.save and coro.suspend Summary: In the case below, %Result.i19 is defined between coro.save and coro.suspend and used after coro.suspend. We need to correctly place such a value into the coroutine frame. ``` %save = call token @llvm.coro.save(i8* null) %Result.i19 = getelementptr inbounds %"struct.lean_future<int>::Awaiter", %"struct.lean_future<int>::Awaiter"* %ref.tmp7, i64 0, i32 0 %suspend = call i8 @llvm.coro.suspend(token %save, i1 false) switch i8 %suspend, label %exit [ i8 0, label %await.ready i8 1, label %exit ] await.ready: %val = load i32, i32* %Result.i19 ``` Reviewers: majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24418 llvm-svn: 282902	2016-09-30 19:24:19 +00:00
Sanjay Patel	75b2518762	[InstCombine] add tests for non-splat select(ext) llvm-svn: 282901	2016-09-30 19:15:41 +00:00
Gor Nishanov	c16219486a	[Coroutines] Part15b: Fix dbg information handling in coro-split. Summary: Without the fix, if there was a function inlined into the coroutine with debug information, CloneFunctionInto(NewF, &F, VMap, /*ModuleLevelChanges=*/true, Returns); would duplicate all of the debug information including the DICompileUnit. We now use VMap to indicate that debug metadata for a File, Unit and FunctionType should not be duplicated when we are creating clones that will become f.resume, f.destroy and f.cleanup. Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24417 llvm-svn: 282899	2016-09-30 19:05:06 +00:00
Gor Nishanov	768de2c604	[Coroutines] Part 15a: Lower coro.subfn.addr in CoroCleanup Summary: Not all coro.subfn.addr intrinsics can be eliminated in CoroElide through devirtualization. Those that remain need to be lowered in CoroCleanup. Reviewers: majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24412 llvm-svn: 282897	2016-09-30 18:41:35 +00:00
Sanjay Patel	b43712a513	clean up tests and auto-generate checks llvm-svn: 282896	2016-09-30 18:37:34 +00:00
Michal Gorny	b002f6370b	cmake: Install the OCaml libraries into a more correct path Add a OCAML_INSTALL_PATH variable that can be used to control the install path for OCaml libraries. The new variable defaults to ${OCAML_STDLIB_PATH}, i.e. the OCaml library path obtained from the OCaml compiler. Install libraries into "llvm" subdirectory. This fixes two issues: 1. OCaml library directories differ between systems, and 'lib/ocaml' is incorrect e.g. on amd64 Gentoo where OCaml is installed in 'lib64/ocaml'. Therefore, obtain the library path from the OCaml compiler using 'ocamlc -where' (which is already used to set OCAML_STDLIB_PATH), which is the method used commonly in OCaml packages. 2. The top-level directory is reserved for the standard library, and has precedence over local directory in search path. As a result, OCaml preferred the files installed along with previous LLVM version over the source tree when building a new version, resulting in two versions being mixed during the build. The new layout is used commonly by other OCaml packages, and findlib is able to find the LLVM libraries successfully. Bug: https://bugs.gentoo.org/559134 Bug: https://bugs.gentoo.org/559624 Differential Revision: https://reviews.llvm.org/D24354 llvm-svn: 282895	2016-09-30 18:34:23 +00:00
Dehao Chen	977853b7c5	Update loop unroller cost model to make sure debug info does not affect optimization decisions. Summary: Debug info should not affect optimization decisions. This patch updates loop unroller cost model to make it not affected by debug info. Reviewers: davidxl, mzolotukhin Subscribers: haicheng, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D25098 llvm-svn: 282894	2016-09-30 18:30:04 +00:00
Sanjay Patel	9685b3eb57	[InstCombine] add tests for select X, (ext X), C llvm-svn: 282891	2016-09-30 18:10:14 +00:00
Derek Schuff	e9e6891b2d	[WebAssembly] Make register stackification more conservative Register stackification currently checks VNInfo for changes. Make that more accurate by testing each intervening instruction for any other defs to the same virtual register. Patch by Jacob Gravelle Differential Revision: https://reviews.llvm.org/D24942 llvm-svn: 282886	2016-09-30 18:02:54 +00:00
Etienne Bergeron	0ca0568604	[asan] Support dynamic shadow address instrumentation Summary: This patch is adding the support for a shadow memory with dynamically allocated address range. The compiler-rt needs to export a symbol containing the shadow memory range. This is required to support ASAN on windows 64-bits. Reviewers: kcc, rnk, vitalybuka Subscribers: zaks.anna, kubabrecka, dberris, llvm-commits, chrisha Differential Revision: https://reviews.llvm.org/D23354 llvm-svn: 282881	2016-09-30 17:46:32 +00:00
Matthew Simpson	7808833e28	[LV] Build all scalar steps for non-uniform induction variables When building the steps for scalar induction variables, we previously attempted to determine if all the scalar users of the induction variable were uniform. If they were, we would only emit the step corresponding to vector lane zero. This optimization was too aggressive. We generally don't know the entire set of induction variable users that will be scalar. We have isScalarAfterVectorization, but this is only a conservative estimate of the instructions that will be scalarized. Thus, an induction variable may have scalar users that aren't already known to be scalar. To avoid emitting unused steps, we can only check that the induction variable is uniform. This should fix PR30542. Reference: https://llvm.org/bugs/show_bug.cgi?id=30542 llvm-svn: 282863	2016-09-30 15:13:52 +00:00
Dylan McKay	309eba75b1	Revert "[RegAllocGreedy] Attempt to split unspillable live intervals" It was accidentally committed. llvm-svn: 282855	2016-09-30 14:05:15 +00:00
Dylan McKay	2a80cc688a	[RegAllocGreedy] Attempt to split unspillable live intervals Summary: Previously, when allocating unspillable live ranges, we would never attempt to split. We would always bail out and try last ditch graph recoloring. This patch changes this by attempting to split all live intervals before performing recoloring. This fixes LLVM bug PR14879. I can't add test cases for any backends other than AVR because none of them have small enough register classes to trigger the bug. Reviewers: qcolombet Subscribers: MatzeB Differential Revision: https://reviews.llvm.org/D25070 llvm-svn: 282852	2016-09-30 13:59:20 +00:00
Craig Topper	3f37a4180b	Revert r282835 "[AVX-512] Always use the full 32 register vector classes for addRegisterClass regardless of whether AVX512/VLX is enabled or not." Turns out this doesn't pass verify-machineinstrs. llvm-svn: 282841	2016-09-30 05:35:42 +00:00
Craig Topper	bc6e97b8f4	[AVX-512] Always use the full 32 register vector classes for addRegisterClass regardless of whether AVX512/VLX is enabled or not. If AVX512 is disabled, the registers should already be marked reserved. Pattern predicates and register classes on instructions should take care of most of the rest. Loads/stores and physical register copies for XMM16-31 and YMM16-31 without VLX have already been taken care of. I'm a little unclear why this changed the register allocation of the SSE2 run of the sad.ll test, but the registers selected appear to be valid after this change. llvm-svn: 282835	2016-09-30 04:31:33 +00:00
Piotr Padlewski	d28694739c	[thinlto] Don't decay threshold for hot callsites Summary: We don't want to decay hot callsites to import chains of hot callsites. The same mechanism is used in LIPO. Reviewers: tejohnson, eraman, mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24976 llvm-svn: 282833	2016-09-30 03:01:17 +00:00
Matt Arsenault	5d8eb25e78	AMDGPU: Use unsigned compare for eq/ne For some reason there are both of these available, except for scalar 64-bit compares which only has u64. I'm not sure why there are both (I'm guessing it's for the one bit inputs we don't use), but for consistency always using the unsigned one. llvm-svn: 282832	2016-09-30 01:50:20 +00:00
Reid Kleckner	147f91c88e	[X86] Don't preserve Win64 SSE CSRs when SSE is disabled Code that doesn't use floating point and doesn't use SSE (kernel code) shouldn't save and restore SSE registers. Fixes PR30503 llvm-svn: 282819	2016-09-30 00:17:49 +00:00
Kevin Enderby	4f229d867b	Next set of additional error checks for invalid Mach-O files for the load command that uses the MachO::entry_point_command type but not used in llvm libObject code but used in llvm tool code. This includes just the LC_MAIN load command. llvm-svn: 282766	2016-09-29 21:07:29 +00:00
Reid Kleckner	e45b2c7d8e	[codeview] Use character types for all byte-sized integer types The VS debugger doesn't appear to understand the 0x68 or 0x69 type indices, which were probably intended for use on a platform where a C 'int' is 8 bits. So, use the character types instead. Clang was already using the character types because '[u]int8_t' is usually defined in terms of 'char'. See the Rust issue for screenshots of what VS does: https://github.com/rust-lang/rust/issues/36646 Fixes PR30552 llvm-svn: 282739	2016-09-29 17:55:01 +00:00
Kevin Enderby	245be3ed2a	Next set of additional error checks for invalid Mach-O files for the load command that uses the MachO::source_version_command type but not used in llvm libObject code but used in llvm tool code. This includes just the LC_SOURCE_VERSION load command. llvm-svn: 282736	2016-09-29 17:45:23 +00:00
Kostya Serebryany	a9b0dd0e51	[sanitizer-coverage/libFuzzer] make the guards for trace-pc 32-bit; create one array of guards per function, instead of one guard per BB. reorganize the code so that trace-pc-guard does not create unneeded globals llvm-svn: 282735	2016-09-29 17:43:24 +00:00
Piotr Padlewski	ba72b95f7b	[thinlto] Add cold-callsite import heuristic Summary: Not a tuned heuristic, but with this small heuristic there is about a +0.10% improvement on SPEC 2006 Reviewers: tejohnson, mehdi_amini, eraman Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24940 llvm-svn: 282733	2016-09-29 17:32:07 +00:00
Simon Pilgrim	d72330dd09	[X86] Add explicit test triple to make windows/msvc builds happier llvm-svn: 282719	2016-09-29 15:10:09 +00:00
Craig Topper	bd74f75619	[X86] Really fix the FileCheck line from r282690. Why does Folded Spill comments print with a different number of # characters on different systems? llvm-svn: 282693	2016-09-29 06:49:21 +00:00
Craig Topper	1b60e9d7c0	[AVX-512] Fix a check line from r282690. llvm-svn: 282691	2016-09-29 06:37:21 +00:00
Craig Topper	d875d6b9b4	[AVX-512] Support spills of XMM16-31 and YMM16-31 when VLX isn't available. This adds new pseudo instructions that can be selected during register allocation to represent loads and stores of XMM/YMM registers when AVX512F is available, but VLX isn't. They will be converted to VEX encoded moves if the register turns out to be XMM0-15/YMM0-15. Otherwise either an EVEX VEXTRACT(store) or VBROADCAST(load) will be used. Fixes one of the cases from PR29112. llvm-svn: 282690	2016-09-29 06:07:09 +00:00
Craig Topper	f91830e6ee	[X86] Remove extra FileCheck lines that got left behind in r282688. llvm-svn: 282689	2016-09-29 06:07:07 +00:00
Craig Topper	7eb0e7ce1f	[AVX-512] Replicate pattern from AVX to select VMOVDDUP for (v2f64 (X86VBroadcast f64:)). Add AVX512VL to command line of existing AVX2 test that hits this condition. llvm-svn: 282688	2016-09-29 05:54:43 +00:00
Craig Topper	e7f2611160	[X86] Add EVEX encoded VBROADCASTSS/SD and VPBROADCASTD/Q to execution domain fixing table. llvm-svn: 282687	2016-09-29 05:54:39 +00:00
Craig Topper	7da0465062	[X86] Add 512-bit VPBROADCASTB and VPBROADCASTW tests. llvm-svn: 282685	2016-09-29 05:54:32 +00:00
Craig Topper	816a1d7783	[X86] Add VBROADCASTF128/VBROADCASTI128 to execution domain fixing tables. llvm-svn: 282684	2016-09-29 05:54:28 +00:00
Matt Arsenault	e6740754f0	AMDGPU: Partially fix control flow at -O0 Fixes to allow spilling all registers at the end of the block work with exec modifications. Don't emit s_and_saveexec_b64 for if lowering, and instead emit copies. Mark control flow mask instructions as terminators to get correct spill code placement with fast regalloc, and then have a separate optimization pass form the saveexec. This should work if SGPRs are spilled to VGPRs, but will likely fail in the case that an SGPR spills to memory and no workitem takes a divergent branch. llvm-svn: 282667	2016-09-29 01:44:16 +00:00
Lei Liu	361615cfd0	AArch64: Set shift bit of TLSLE HI12 add instruction Summary: AArch64 LLVM assembler emits add instruction without shift bit to calculate the higher 12-bit address of TLS variables in local exec model. This generates wrong code sequence to access TLS variables with thread offset larger than 0x1000. Reviewers: t.p.northover, peter.smith, rovka Subscribers: salim.nasser, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D24702 llvm-svn: 282661	2016-09-29 01:05:48 +00:00
Evgeny Stupachenko	dc8a254663	Wisely choose sext or zext when widening IV. Summary: The patch fixes regression caused by two earlier patches D18777 and D18867. Reviewers: reames, sanjoy Differential Revision: http://reviews.llvm.org/D24280 From: Li Huang llvm-svn: 282650	2016-09-28 23:39:39 +00:00
Kevin Enderby	76966bf066	Next set of additional error checks for invalid Mach-O files for the load command that uses the MachO::rpath_command type but not used in llvm libObject code but used in llvm tool code. This includes just the LC_RPATH load command. llvm-svn: 282649	2016-09-28 23:16:01 +00:00
Mike Aizatsky	392caa538d	[sancov] introducing symbolized coverage files (.symcov) Summary: Answering any meaningful questions about .sancov files requires accessing symbol information from the corresponding binary. This change introduces a separate intermediate data structure and format: symbolized coverage. It contains all symbol information that is required to answer common queries: - merging - covered/uncovered files and functions - line status. Also removing the html report functionality from sancov: generated HTML files are too huge, and a different approach is required. Maintaining this half-working approach in the C++ is painful. Differential Revision: https://reviews.llvm.org/D24947 llvm-svn: 282639	2016-09-28 21:39:28 +00:00
Kevin Enderby	32359dbf6b	Next set of additional error checks for invalid Mach-O files for the other load commands that use the MachO::version_min_command type but not used in llvm libObject code but used in llvm tool code. This includes LC_VERSION_MIN_MACOSX, LC_VERSION_MIN_IPHONEOS, LC_VERSION_MIN_TVOS and LC_VERSION_MIN_WATCHOS load commands. llvm-svn: 282635	2016-09-28 21:20:45 +00:00
Krzysztof Parzyszek	dcb1bcae0b	IfConversion: Add implicit uses for redefined regs with live subregisters Normally, if conversion would add implicit uses for redefined registers, e.g. R0<def> = add_if ..., R0<imp-use>. However, if only subregisters of R0 are known to be live but not R0 itself, such implicit uses will not be added, causing prior definitions of such subregisters and R0 itself to become dead. llvm-svn: 282626	2016-09-28 20:07:41 +00:00
Konstantin Zhuravlyov	e14df4b236	[AMDGPU] Promote uniform i16 ops to i32 ops for targets that have 16 bit instructions Differential Revision: https://reviews.llvm.org/D24125 llvm-svn: 282624	2016-09-28 20:05:39 +00:00
Sanjay Patel	3e9e5ccf7c	[InstCombine] update to use FileCheck Also, remove unnecessary function attributes, parameters, and comments. It looks like at least some of these tests are not minimal though... llvm-svn: 282620	2016-09-28 19:10:16 +00:00
Simon Pilgrim	fea5c7a051	[X86][AVX] Add test showing that VBROADCAST loads don't correctly respect dependencies llvm-svn: 282613	2016-09-28 17:59:30 +00:00
Artur Pilipenko	b6ce6e5dac	Don't look through addrspacecast in GetPointerBaseWithConstantOffset Pointers in different addrspaces can have different sizes, so it's not valid to look through addrspace cast calculating base and offset for a value. This is similar to D13008. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D24729 llvm-svn: 282612	2016-09-28 17:57:16 +00:00

40260 Commits