llvm-project

Commit Graph

Author	SHA1	Message	Date
Renato Golin	54c736f833	[DWARF] Move test to x86 directory llvm-svn: 301176	2017-04-24 12:37:11 +00:00
George Rimar	ca53211beb	[DWARF] - Take relocations into account when extracting ranges from .debug_ranges I found this while investigating "Bug 32319 - .gdb_index is broken/incomplete" for LLD. When an object file has a .debug_ranges section, it may be filled with zeroes. Relocations exist in the file to relocate these zeroes into real values later, but until then a pair of zeroes is treated as a terminator, so the DWARF parser thinks there are no ranges at all when I try to collect address ranges for building .gdb_index. The solution implemented in this patch is to take relocations into account when parsing ranges. Differential revision: https://reviews.llvm.org/D32228 llvm-svn: 301170	2017-04-24 10:19:45 +00:00
Diana Picus	f53865daa4	[ARM] GlobalISel: Legalize s8 and s16 G_(S|U)DIV We have to widen the operands to 32 bits and then we can either use hardware division if it is available or lower to a libcall otherwise. At the moment it is not enough to set the Legalizer action to WidenScalar, since for libcalls it won't know what to do (it won't be able to find what size to widen to, because it will find Libcall and not Legal for 32 bits). To hack around this limitation, we request Custom lowering, and as part of that we widen first and then we run another legalizeInstrStep on the widened DIV. llvm-svn: 301166	2017-04-24 09:12:19 +00:00
Sjoerd Meijer	e5b8557d5b	[AArch64AsmParser] better diagnostic for isb Instruction isb takes as an operand either 'sy' or an immediate value. This improves the diagnostic when the string is not 'sy' and adds a previously missing test case for it. This also adds tests to check invalid inputs for dsb and dmb. Differential Revision: https://reviews.llvm.org/D32227 llvm-svn: 301165	2017-04-24 08:22:20 +00:00
Diana Picus	b70e88bdec	[ARM] GlobalISel: Support G_(S|U)DIV for s32 Add support for both targets with hardware division and without. For hardware division we have to add support throughout the pipeline (legalizer, reg bank select, instruction select). For targets without hardware division, we only need to mark it as a libcall. llvm-svn: 301164	2017-04-24 08:20:05 +00:00
Diana Picus	95a8aa93e2	[ARM] GlobalISel: Select G_CONSTANT with CImm operands When selecting a G_CONSTANT to a MOVi, we need the value to be an Imm operand. We used to just leave the G_CONSTANT operand unchanged, which works in some cases (such as the GEP offsets that we create when referring to stack slots). However, in many other places the G_CONSTANTs are created with CImm operands. This patch makes sure to handle those as well, and to error out gracefully if in the end we don't end up with an Imm operand. Thanks to Oliver Stannard for reporting this issue. llvm-svn: 301162	2017-04-24 06:30:56 +00:00
Dean Michael Berris	ca780b5a27	[XRay] A tool for Comparing xray function call graphs Summary: This is a tool for comparing the function graphs produced by the llvm-xray graph tool. It takes the form of a new subcommand of the llvm-xray tool 'graph-diff'. This initial version of the patch is very rough, but it is close to feature complete. Depends on D29363 Reviewers: dblaikie, dberris Reviewed By: dberris Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D29320 llvm-svn: 301160	2017-04-24 05:54:33 +00:00
Sanjoy Das	0cdcdf018e	Revert "[SCEV] Enable SCEV verification by default in EXPENSIVE_CHECKS builds" This reverts commit r301150. It breaks CodeGen/Hexagon/hwloop-wrap2.ll, reverting while I investigate. llvm-svn: 301154	2017-04-24 02:35:19 +00:00
Sanjoy Das	8919303b0a	[SCEV] Enable SCEV verification by default in EXPENSIVE_CHECKS builds llvm-svn: 301150	2017-04-24 00:41:58 +00:00
Sanjoy Das	bdbc4938f9	[SCEV] Fix exponential time complexity by caching llvm-svn: 301149	2017-04-24 00:09:46 +00:00
Xinliang David Li	db8d09b6c2	[PartialInliner]: add triaging options There are more bugs (runtime failures) triggered when partial inlining is turned on. Add options to help triage problems. llvm-svn: 301148	2017-04-23 23:39:04 +00:00
Simon Pilgrim	12df01c3c7	[X86][AVX] Add scheduling latency/throughput tests for some AVX1 instructions More instructions will be added in future commits llvm-svn: 301145	2017-04-23 22:08:17 +00:00
Sanjay Patel	e0c26e0640	[InstCombine] add/move folds for [not]-xor We handled all of the commuted variants for plain xor already, although they were scattered around and sometimes folded less efficiently using distributive laws. We had no folds for not-xor. Handling all of these patterns consistently is part of trying to reinstate: https://reviews.llvm.org/rL300977 llvm-svn: 301144	2017-04-23 22:00:02 +00:00
Simon Pilgrim	06d6263309	[X86][SSE] Add scheduler class support for SSE42 (PCMPGT) instructions llvm-svn: 301142	2017-04-23 21:23:27 +00:00
Simon Pilgrim	7d71ed503d	[X86][SSE] Add scheduling latency/throughput tests for (most) SSE42 instructions llvm-svn: 301141	2017-04-23 21:00:25 +00:00
Sanjay Patel	afa371fd1d	[InstCombine] add tests for not-xor and remove redundant tests; NFC llvm-svn: 301140	2017-04-23 20:59:00 +00:00
Xin Tong	f98602a1ab	[JumpThread] We want to fold (not thread) when all predecessors go to single BB's successor. Summary: In case all predecessors go to a single successor of the current BB, we want to fold (not thread). I failed to update the phi nodes properly in the last patch https://reviews.llvm.org/rL300657. Phi node values are per predecessor in LLVM. Reviewers: sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32400 llvm-svn: 301139	2017-04-23 20:56:29 +00:00
Simon Pilgrim	19a173ac23	[X86][SSE] Add scheduling latency/throughput tests for (most) SSE41 instructions llvm-svn: 301137	2017-04-23 20:05:21 +00:00
Simon Pilgrim	57fea6879b	[X86][SSE] Add missing scheduling latency/throughput test for PINSRW llvm-svn: 301136	2017-04-23 19:56:49 +00:00
Sanjay Patel	42a84ac710	[InstCombine] add tests for or-to-xor; NFC llvm-svn: 301131	2017-04-23 16:37:36 +00:00
Sanjay Patel	d13b0bfdac	[InstCombine] add pattern matches for commuted variants of xor-to-xor There's probably some better way to write this that eliminates the code duplication without hurting readability, but at least this eliminates the logic holes and is hopefully slightly more efficient than creating new instructions. llvm-svn: 301129	2017-04-23 16:03:00 +00:00
Sanjay Patel	9081808521	[InstCombine] add tests for xor-to-xor; NFC Besides missing 2 commuted patterns, the way we handle these folds is inefficient. llvm-svn: 301128	2017-04-23 14:51:03 +00:00
Simon Pilgrim	c781c0f630	[X86][SSE] Add scheduling latency/throughput tests for SSSE3 instructions llvm-svn: 301127	2017-04-23 14:01:55 +00:00
Simon Pilgrim	e8f8422fe5	[X86][SSE] Add scheduling latency/throughput tests for SSE3 instructions llvm-svn: 301126	2017-04-23 13:59:29 +00:00
Sanjay Patel	794c34dc35	[InstCombine] add tests for add-to-xor commuted variants; NFC 1 out of the 4 tests commuted the operands, so there's an asymmetry somewhere under this in how we handle these transforms. llvm-svn: 301125	2017-04-23 13:37:05 +00:00
Ayman Musa	544988f34d	[X86] Convert test checks to generated checks of update_llc_test_checks.py. NFC llvm-svn: 301107	2017-04-23 07:41:40 +00:00
Artyom Skrobov	53cf1897cc	[ARM] ScheduleDAGRRList::DelayForLiveRegsBottomUp must consider OptionalDefs Summary: D30400 has enabled tADC and tSBC instructions to be unglued, thereby allowing CPSR to remain live between Thumb1 scheduling units. Most Thumb1 instructions have an OptionalDef for CPSR; but the scheduler ignored the OptionalDefs, and could unwittingly insert a flag-setting instruction in between an ADDS and the corresponding ADC. Reviewers: javed.absar, atrick, MatzeB, t.p.northover, jmolloy, rengolin Reviewed By: javed.absar Subscribers: rogfer01, efriedma, aemerson, rengolin, llvm-commits, MatzeB Differential Revision: https://reviews.llvm.org/D31081 llvm-svn: 301106	2017-04-23 06:58:08 +00:00
Adrian Prantl	4677205010	Revert "Use DW_OP_stack_value when reconstructing variable values with arithmetic." This reverts commit r301093 while investigating stage2 bot breakage. llvm-svn: 301099	2017-04-23 00:44:40 +00:00
Jonathan Roelofs	1233fe5ac3	Fix testcase: s/CHECKNEXT/CHECK-NEXT/ llvm-svn: 301098	2017-04-22 23:43:44 +00:00
Sanjay Patel	ceff20fe50	[InstCombine] clean up tests and regenerate checks; NFC llvm-svn: 301097	2017-04-22 23:36:47 +00:00
Adrian Prantl	a2d25ac14a	Use DW_OP_stack_value when reconstructing variable values with arithmetic. When the location description of a source variable involves arithmetic on the value itself, it needs to be marked with DW_OP_stack_value since it is not describing the variable's location, but rather its value. This is a follow-up to r297971 and fixes the source testcase quoted in the comment in debuginfo-dce.ll. rdar://problem/30725338 llvm-svn: 301093	2017-04-22 20:54:06 +00:00
Simon Pilgrim	f27a714a9e	[X86] Regenerate TLS tests Use the correct check prefix for X86/X32/X64 target types. llvm-svn: 301092	2017-04-22 20:13:58 +00:00
Daniel Sanders	658541fe69	[globalisel][tablegen] Add support for RegisterOperand. Summary: It functions just like RegisterClass except that the class is obtained from a field. Depends on D31761. Reviewers: ab, qcolombet, t.p.northover, rovka, kristof.beyls, aditya_nandakumar Reviewed By: ab Subscribers: dberris, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D32229 llvm-svn: 301080	2017-04-22 15:53:21 +00:00
Daniel Sanders	2deea1878e	[globalisel][tablegen] Revise API for ComplexPattern operands to improve flexibility. Summary: Some targets need to be able to do more complex rendering than just adding an operand or two to an instruction. For example, it may need to insert an instruction to extract a subreg first, or it may need to perform an operation on the operand. In SelectionDAG, targets would create SDNode's to achieve the desired effect during the complex pattern predicate. This worked because SelectionDAG had a form of garbage collection that would take care of SDNode's that were created but not used due to a later predicate rejecting a match. This doesn't translate well to GlobalISel and the churn was wasteful. The API changes in this patch enable GlobalISel to accomplish the same thing without the waste. The API is now: InstructionSelector::OptionalComplexRendererFn selectArithImmed(MachineOperand &Root) const; where Root is the root of the match. The return value can be omitted to indicate that the predicate failed to match, or a function with the signature ComplexRendererFn can be returned. For example: return OptionalComplexRendererFn( [=](MachineInstrBuilder &MIB) { MIB.addImm(Immed).addImm(ShVal); }); adds two immediate operands to the rendered instruction. Immed and ShVal are captured from the predicate function. As an added bonus, this also reduces the amount of information we need to provide to GIComplexOperandMatcher. Depends on D31418 Reviewers: aditya_nandakumar, t.p.northover, qcolombet, rovka, ab, javed.absar Reviewed By: ab Subscribers: dberris, kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D31761 llvm-svn: 301079	2017-04-22 15:11:04 +00:00
Daniel Sanders	3016d3c6c9	[globalisel][tablegen] Fix PR32733 by checking which instruction operands belong to. canMutate() was returning true when the operands were all in the same order as the matched instruction. However, it wasn't checking the operands were actually on that instruction. This worked when we could only match a single instruction but the addition of nested instruction matching led to cases where the operands could be split across multiple instructions. canMutate() now returns false if operands belong to instructions other than the root of the match. llvm-svn: 301077	2017-04-22 14:31:28 +00:00
David Blaikie	5477b97d45	Fix test to handle .rel and .rela sections (& to actually specify the target architecture as X86) llvm-svn: 301073	2017-04-22 08:17:39 +00:00
David Blaikie	85366acf15	Avoid using relocations for ref_addr in .dwo files In dwo files the fixed offset can be used - if the dwos are linked into a dwp, the dwo consumer must use the dwp tables to find out where the original range of the debug_info was and resolve the "section relative" value relative to that original range - effectively avoiding/reimplementing the relocation handling. llvm-svn: 301072	2017-04-22 07:53:44 +00:00
David Blaikie	6cce69020c	Fix test from polluting the source tree (though this seems like a "does this not crash" test - which isn't very good. Should be fixed) llvm-svn: 301071	2017-04-22 07:53:40 +00:00
Artur Pilipenko	0632bdc648	Fix for PR32740 - Invalid floating type, unreachable between r300969 and r301029 The bug was introduced by r301018 "[InstCombine] fadd double (sitofp x), y check that the promotion is valid". The patch didn't expect that fadd can be on vectors not necessarily scalars. Add vector support along with the test. llvm-svn: 301070	2017-04-22 07:24:52 +00:00
Matt Arsenault	01d17e7c5f	LowerSwitch: Fix producing invalid IR on unreachable code If a switch was in an unreachable block that branched to a block with a phi, it would leave phis with missing predecessors. llvm-svn: 301064	2017-04-21 23:54:12 +00:00
David Blaikie	96b1ed50e8	Move Split DWARF handling to an MC option/command line argument rather than using metadata Since Split DWARF needs to name the actual .dwo file that is generated, it can't be known at the time the llvm::Module is produced as it may be merged with other Modules before the object is generated and that object may be generated with any name. By passing the Split DWARF file name when LLVM is producing object code the .dwo file name in the object file can match correctly. The support for Split DWARF for implicit modules remains the same - using metadata to store the dwo name and dwo id so that potentially multiple skeleton CUs referring to different dwo files can be generated from one llvm::Module. llvm-svn: 301062	2017-04-21 23:35:26 +00:00
Matthias Braun	d78597ec08	AArch64FrameLowering: Check if the ExtraCSSpill register is actually unused The code assumed that when saving an additional CSR register (ExtraCSSpill==true) we would have a free register throughout the function. This was not true if this CSR register is also used to pass values as in the swiftself case. rdar://31451816 llvm-svn: 301057	2017-04-21 22:42:08 +00:00
Adrian Prantl	ff384546f5	Add test coverage for mem2reg dbg.declare lowering. llvm-svn: 301050	2017-04-21 22:13:55 +00:00
Hans Wennborg	9b9a5358dd	Re-commit r301040 "X86: Don't emit zero-byte functions on Windows" In addition to the original commit, tighten the condition for when to pad empty functions to COFF Windows. This avoids running into problems when targeting e.g. Win32 AMDGPU, which caused test failures when this was committed initially. llvm-svn: 301047	2017-04-21 21:48:41 +00:00
Matt Arsenault	c07bda7b87	InferAddressSpaces: Infer for just GEPs Fixes leaving intermediate flat addressing computations where a GEP instruction's source is a constant expression. Still leaves behind a trivial addrspacecast + gep pair that instcombine is able to handle, which ideally could be folded here directly. llvm-svn: 301044	2017-04-21 21:35:04 +00:00
Xinliang David Li	0e9f6df169	[PartialInliner] Partial inliner needs to check use kind before transformation Differential Revision: https://reviews.llvm.org/D32373 llvm-svn: 301042	2017-04-21 21:20:56 +00:00
Hans Wennborg	04593000d8	Revert r301040 "X86: Don't emit zero-byte functions on Windows" This broke almost all bots. Reverting while fixing. llvm-svn: 301041	2017-04-21 21:10:37 +00:00
Hans Wennborg	cb3e810714	X86: Don't emit zero-byte functions on Windows Empty functions can lead to duplicate entries in the Guard CF Function Table of a binary due to multiple functions sharing the same RVA, causing the kernel to refuse to load that binary. We had a terrific bug due to this in Chromium. It turns out we were already doing this for Mach-O in certain situations. This patch expands the code for that in AsmPrinter::EmitFunctionBody() and renames TargetInstrInfo::getNoopForMachoTarget() to simply getNoop() since it seems it was used for not just Mach-O anyway. Differential Revision: https://reviews.llvm.org/D32330 llvm-svn: 301040	2017-04-21 20:58:12 +00:00
Zachary Turner	0fc009b008	Add a dependency from llvm/test to llvm-cvtres. llvm-svn: 301038	2017-04-21 20:45:11 +00:00
Tim Northover	1efaa3a88f	AArch64: add test for "fence singlethread" Forgot a git add yesterday. llvm-svn: 301037	2017-04-21 20:36:08 +00:00
Tim Northover	e31cf3f824	ARM: make sure we use all entries in a vector before forming a vpaddl. Otherwise there's some mismatch, and we'll either form an illegal type or an illegal node. Thanks to Eli Friedman for pointing out the problem with my original solution. llvm-svn: 301036	2017-04-21 20:35:52 +00:00
Sanjay Patel	8ce1d4cbe1	[InstCombine] revert r300977 and r301021 This can cause an inf-loop. Investigating... llvm-svn: 301035	2017-04-21 20:29:17 +00:00
Konstantin Zhuravlyov	3d1cc88c68	AMDGPU: Temporarily disable packed inlinable literals (v2f16, v2i16) Differential Revision: https://reviews.llvm.org/D32361 llvm-svn: 301028	2017-04-21 19:45:22 +00:00
Konstantin Zhuravlyov	88938d4e67	AMDGPU: Fix S_PACK_HH_B32_B16 - We really ought to zero out lower 16 bits Differential Revision: https://reviews.llvm.org/D32356 llvm-svn: 301026	2017-04-21 19:35:05 +00:00
Yaxun Liu	15a96b1dc8	[AMDGPU] Handle SI_MASKED_UNREACHABLE in instruction emitter SI_MASKED_UNREACHABLE does not have machine instruction encoding. It needs special handling in AMDGPUAsmPrinter::EmitInstruction like some other pseudo instructions. This patch fixes compilation failure of RadeonRays. Differential Revision: https://reviews.llvm.org/D32364 llvm-svn: 301025	2017-04-21 19:32:02 +00:00
Konstantin Zhuravlyov	c4b18e7099	AMDGPU: Do not lower fast unsafe div for safe, f32, with fp32 denormals Differential Revision: https://reviews.llvm.org/D32085 llvm-svn: 301023	2017-04-21 19:25:33 +00:00
Akira Hatanaka	22e839f4b2	[AArch64] Improve code generation for logical instructions taking immediate operands. This commit adds an AArch64 dag-combine that optimizes code generation for logical instructions taking immediate operands. The optimization uses demanded bits to change a logical instruction's immediate operand so that the immediate can be folded into the immediate field of the instruction. This recommits r300932 and r300930, which was causing dag-combine to loop forever. The problem was that optimizeLogicalImm was returning true even when there was no change to the immediate node (which happened when the immediate was all zeros or ones), which caused dag-combine to push and pop the same node to the work list over and over again without making any progress. This commit fixes the bug by returning false early in optimizeLogicalImm if the immediate is all zeros or ones. Also, it changes the code to compare the immediate with 0 or Mask rather than calling countPopulation. rdar://problem/18231627 Differential Revision: https://reviews.llvm.org/D5591 llvm-svn: 301019	2017-04-21 18:53:12 +00:00
Artur Pilipenko	134d94f9a3	[InstCombine] fadd double (sitofp x), y check that the promotion is valid When doing these transformations, check that the result of the integer addition is representable in the FP type. (fadd double (sitofp x), fpcst) --> (sitofp (add int x, intcst)) (fadd double (sitofp x), (sitofp y)) --> (sitofp (add int x, y)) This is a fix for https://bugs.llvm.org//show_bug.cgi?id=27036 Reviewed By: andrew.w.kaylor, scanon, spatel Differential Revision: https://reviews.llvm.org/D31182 llvm-svn: 301018	2017-04-21 18:45:25 +00:00
Zachary Turner	087edfa2e8	Add empty shell of llvm-cvtres. This marks the beginning of an effort to port remaining MSVC toolchain miscellaneous utilities to all platforms. Currently clang-cl shells out to certain additional tools such as the IDL compiler, resource compiler, and a few other tools, but as these tools are Windows-only it limits the ability of clang to target Windows on other platforms. Having a full suite of these tools directly in LLVM should eliminate this constraint. The current implementation provides no actual functionality; it is just an empty skeleton executable for the purposes of making incremental changes. Differential Revision: https://reviews.llvm.org/D32095 Patch by Eric Beckmann (ecbeckmann@google.com) llvm-svn: 301004	2017-04-21 17:30:29 +00:00
Tim Northover	1061ccca8c	ARM: don't try to create an i8 -> i32 vpaddl. DAG combine was mistakenly assuming that the step-up it was looking at was always a doubling, but it can sometimes be a larger extension in which case we'd crash. llvm-svn: 301002	2017-04-21 17:21:59 +00:00
Daniel Sanders	e7b0d66080	[globalisel][tablegen] Import SelectionDAG's rule predicates and support the equivalent in GIRule. Summary: The SelectionDAG importer now imports rules with Predicate's attached via Requires, PredicateControl, etc. These predicates are implemented as bitset's to allow multiple predicates to be tested together. However, unlike the MC layer subtarget features, each target only pays for its own predicates (e.g. AArch64 doesn't have 192 feature bits just because X86 needs a lot). Both AArch64 and X86 derive at least one predicate from the MachineFunction or Function so they must re-initialize AvailableFeatures before each function. They also declare locals in <Target>InstructionSelector so that computeAvailableFeatures() can use the code from SelectionDAG without modification. Reviewers: rovka, qcolombet, aditya_nandakumar, t.p.northover, ab Reviewed By: rovka Subscribers: aemerson, rengolin, dberris, kristof.beyls, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D31418 llvm-svn: 300993	2017-04-21 15:59:56 +00:00
Craig Topper	7af078847c	[SimplifyCFG] Fix the determination of PostBB in conditional store merging to handle the targets on the second branch being commuted Currently we choose PostBB as the single successor of QFB, but it's possible that QTB's single successor is QFB, which would make QFB the correct choice. Differential Revision: https://reviews.llvm.org/D32323 llvm-svn: 300992	2017-04-21 15:53:42 +00:00
Wei Mi	337d4d95c2	[ConstHoisting] Add BFI in constanthoisting pass and select the best insertion places based on it. The existing constant hoisting pass will merge a group of constants in a small range and hoist the const materialization code to the common dominator of their uses. However, if the uses are all in cold paths, the existing implementation may hoist the materialization code from cold paths to a hot place. This may hurt performance. The patch introduces BFI to the pass and selects the best insertion places based on it. The change is controlled by an option consthoist-with-block-frequency which is off by default for now. Differential Revision: https://reviews.llvm.org/D28962 llvm-svn: 300989	2017-04-21 15:50:16 +00:00
Matthew Simpson	e2037d24f9	[LV] Model if-converted phi node costs Phi nodes in non-header blocks are converted to select instructions after if-conversion. This patch updates the cost model to account for the selects. Differential Revision: https://reviews.llvm.org/D31906 llvm-svn: 300980	2017-04-21 14:14:54 +00:00
Daniel Sanders	419efdd55b	Revert r300964 + r300970 - [globalisel][tablegen] Import SelectionDAG's rule predicates and support the equivalent in GIRule. It's causing llvm-clang-x86_64-expensive-checks-win to fail to compile and I haven't worked out why. Reverting to make it green while I figure it out. llvm-svn: 300978	2017-04-21 14:09:20 +00:00
Sanjay Patel	347b54b093	[InstCombine] prefer xor with -1 because 'not' is easier to understand (PR32706) This matches the demanded bits behavior in the DAG and should fix: https://bugs.llvm.org/show_bug.cgi?id=32706 Differential Revision: https://reviews.llvm.org/D32255 llvm-svn: 300977	2017-04-21 14:03:54 +00:00
Diana Picus	64a33431eb	[ARM] GlobalISel: Add support for G_TRUNC Select them as copies. We only select if both the source and the destination are on the same register bank, so this shouldn't cause any trouble. llvm-svn: 300971	2017-04-21 13:16:50 +00:00
Diana Picus	f941ec0ecc	[ARM] GlobalISel: Make struct arguments fail elegantly The condition in isSupportedType didn't handle struct/array arguments properly. Fix the check and add a test to make sure we use the fallback path in this kind of situation. The test deals with some common cases where the call lowering should error out. There are still some issues here that need to be addressed (tail calls come to mind), but they can be addressed in other patches. llvm-svn: 300967	2017-04-21 11:53:01 +00:00
Daniel Sanders	279d03527e	[globalisel][tablegen] Import SelectionDAG's rule predicates and support the equivalent in GIRule. Summary: The SelectionDAG importer now imports rules with Predicate's attached via Requires, PredicateControl, etc. These predicates are implemented as bitset's to allow multiple predicates to be tested together. However, unlike the MC layer subtarget features, each target only pays for its own predicates (e.g. AArch64 doesn't have 192 feature bits just because X86 needs a lot). Both AArch64 and X86 derive at least one predicate from the MachineFunction or Function so they must re-initialize AvailableFeatures before each function. They also declare locals in <Target>InstructionSelector so that computeAvailableFeatures() can use the code from SelectionDAG without modification. Reviewers: rovka, qcolombet, aditya_nandakumar, t.p.northover, ab Reviewed By: rovka Subscribers: aemerson, rengolin, dberris, kristof.beyls, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D31418 llvm-svn: 300964	2017-04-21 10:27:20 +00:00
Clement Courbet	00b51bf123	add skylake llvm-svn: 300962	2017-04-21 09:21:01 +00:00
Clement Courbet	2430e25ba3	add 32 bit tests llvm-svn: 300961	2017-04-21 09:20:58 +00:00
Clement Courbet	d5f6182bec	use repmovsb when optimizing for minsize llvm-svn: 300960	2017-04-21 09:20:55 +00:00
Clement Courbet	203fc17797	Rename FastString flag. llvm-svn: 300959	2017-04-21 09:20:50 +00:00
Clement Courbet	a7c233fbe0	add more tests llvm-svn: 300958	2017-04-21 09:20:44 +00:00
Clement Courbet	1ce3b82dea	X86 memcpy: use REPMOVSB instead of REPMOVS{Q,D,W} for inline copies when the subtarget has fast strings. This has two advantages: - Speed is improved. For example, on Haswell throughput improvements increase linearly with size from 256 to 512 bytes, after which they plateau (e.g. 1% for 260 bytes, 25% for 400 bytes, 40% for 508 bytes). - Code is much smaller (no need to handle boundaries). llvm-svn: 300957	2017-04-21 09:20:39 +00:00
Artyom Skrobov	8d9643009f	[Thumb1] The recently added tADCS and tSBCS pseudo-instructions were missing `Uses = [CPSR]` Summary: Thanks to Oliver Stannard for helping catch this. Reviewers: olista01, efriedma Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D31815 llvm-svn: 300951	2017-04-21 07:35:21 +00:00
Davide Italiano	fa15de34b7	[PartialInliner] Fix crash when inlining functions with unreachable blocks. CodeExtractor looks up the dominator node corresponding to return blocks when splitting them. If one of these blocks is unreachable, there's no node in the Dom and CodeExtractor crashes because it doesn't check for domtree node validity. In theory, we could add just a check for skipping null DTNodes in `splitReturnBlock` but the fix I propose here is slightly different. To the best of my knowledge, unreachable blocks are irrelevant for the algorithm, therefore we can just skip them when building the candidate set in the constructor. Differential Revision: https://reviews.llvm.org/D32335 llvm-svn: 300946	2017-04-21 04:25:00 +00:00
Akira Hatanaka	78ccba6a20	Revert r300932 and r300930. It seems that r300930 was creating an infinite loop in dag-combine when compiling the following file: MultiSource/Benchmarks/MiBench/consumer-typeset/z21.c llvm-svn: 300940	2017-04-21 01:31:50 +00:00
Akira Hatanaka	19077aaee0	[AArch64] Improve code generation for logical instructions taking immediate operands. This commit adds an AArch64 dag-combine that optimizes code generation for logical instructions taking immediate operands. The optimization uses demanded bits to change a logical instruction's immediate operand so that the immediate can be folded into the immediate field of the instruction. This recommits r300913, which broke bots because I didn't fix a call to ShrinkDemandedConstant in SIISelLowering.cpp after changing the APIs of TargetLoweringOpt and TargetLowering. rdar://problem/18231627 Differential Revision: https://reviews.llvm.org/D5591 llvm-svn: 300930	2017-04-21 00:05:16 +00:00
Eli Friedman	d0e6ae5678	Revert r300746 (SCEV analysis for or instructions). There have been multiple reports of this causing problems: a compile-time explosion on the LLVM testsuite, and a stack overflow for an opencl kernel. llvm-svn: 300928	2017-04-20 23:59:05 +00:00
Akira Hatanaka	7b06cebe73	Revert "[AArch64] Improve code generation for logical instructions taking" This reverts r300913. This broke bots. llvm-svn: 300916	2017-04-20 23:03:30 +00:00
Craig Topper	1cb0e5afb0	[Simplify] Add testcase to show that merging conditional stores for triangles is sensitive to the order of the branch targets on the conditional branches. NFC llvm-svn: 300915	2017-04-20 22:57:36 +00:00
Akira Hatanaka	e327f09832	[AArch64] Improve code generation for logical instructions taking immediate operands. This commit adds an AArch64 dag-combine that optimizes code generation for logical instructions taking immediate operands. The optimization uses demanded bits to change a logical instruction's immediate operand so that the immediate can be folded into the immediate field of the instruction. rdar://problem/18231627 Differential Revision: https://reviews.llvm.org/D5591 llvm-svn: 300913	2017-04-20 22:47:56 +00:00
Sanjay Patel	c9485ca895	[InstCombine] allow shl+shr demanded bits folds with splat constants llvm-svn: 300911	2017-04-20 22:33:54 +00:00
Sanjay Patel	be2dcaf45a	[InstCombine] add tests for shl+shr demanded bits splat vector folds; NFC llvm-svn: 300907	2017-04-20 22:18:47 +00:00
Tim Northover	46e58354da	ARM: lower "fence singlethread" to a pure compiler barrier. Single-threaded fences aren't required to provide any synchronization with other processing elements so there's no need for a DMB. They should still be a barrier for compiler optimizations though. llvm-svn: 300904	2017-04-20 21:56:52 +00:00
Sanjay Patel	3e1ae72fcf	[InstCombine] allow shl demanded bits folds with splat constants More fixes are needed to enable the helper SimplifyShrShlDemandedBits(). llvm-svn: 300898	2017-04-20 21:33:02 +00:00
Sanjay Patel	fb5b3e773a	[InstCombine] allow ashr/lshr demanded bits folds with splat constants llvm-svn: 300888	2017-04-20 20:59:02 +00:00
Sanjay Patel	7e77bed813	[InstCombine] add tests for demanded bits ashr/lshr splat constants; NFC llvm-svn: 300884	2017-04-20 20:44:54 +00:00
Adrian Prantl	ada104888e	Don't emit locations that need a DW_OP_stack_value in DWARF 2 & 3. https://bugs.llvm.org/show_bug.cgi?id=32382 llvm-svn: 300883	2017-04-20 20:42:33 +00:00
Tim Northover	8b1240b0f0	ARM: handle post-indexed NEON ops where the offset isn't the access width. Before, we assumed that any ConstantInt offset was precisely the access width, so we could use the "[rN]!" form. ISelLowering only ever created that kind, but further simplification during combining could lead to unexpected constants and incorrect codegen. Should fix PR32658. llvm-svn: 300878	2017-04-20 19:54:02 +00:00
Paul Robinson	70b34533c2	[DWARF] Versioning for DWARF constants; verify FORMs Associate the version-when-defined with definitions of standard DWARF constants. Identify the "vendor" for DWARF extensions. Use this information to verify FORMs in .debug_abbrev are defined as of the DWARF version specified in the associated unit. Removed two tests that had specified DWARF v1 (which essentially does not exist). Differential Revision: http://reviews.llvm.org/D30785 llvm-svn: 300875	2017-04-20 19:16:51 +00:00
Yaxun Liu	5d977f8ed4	CodeGen: Let frame index value type match alloca addr space Recently alloca address space has been added to data layout. Due to this change, the pointer returned by alloca may have a different size from a pointer in address space 0. However, currently the value type of frame index is assumed to be of the same size as a pointer in address space 0. This patch fixes that. Most targets assume alloca returns a pointer in address space 0, which is the default alloca address space. Therefore it is NFC for them. The AMDGCN target with the amdgiz environment requires this change since it assumes alloca returns a pointer to addr space 5 whose size is 32, which is different from the size of a pointer in addr space 0, which is 64. Differential Revision: https://reviews.llvm.org/D32021 llvm-svn: 300864	2017-04-20 18:15:34 +00:00
Amara Emerson	bfbdebd00e	[MVT][SVE] Scalable vector MVTs (2/3) Adds scalable vector machine value types, and updates the switch statements required for tablegen. Patch by Graham Hunter. Differential Revision: https://reviews.llvm.org/D32018 llvm-svn: 300840	2017-04-20 13:36:58 +00:00
Petar Jovanovic	2b6fe3ffa6	[mips][msa] Mask vectors holding shift amounts Mask vectors which hold shift amounts when creating the following nodes: ISD::SHL, ISD::SRL or ISD::SRA. Instructions that use said nodes, which have had their arguments altered, are sll, srl, sra, bneg, bclr and bset. For said instructions, the shift amount or the bit position that is specified in the corresponding vector elements will be interpreted as the shift amount/bit position modulo the size of the element in bits. The problem lies in compiling with -O2 enabled, where the instructions for formats .w and .d are not generated, but are instead optimized away. In this case, having shift amounts that are either negative or greater than the element bit size results in generation of incorrect results when constant folding. We remedy this by masking the operands for the nodes mentioned above before actually creating them, so that the final result is correct before being placed into the constant pool. Patch by Stefan Maksimovic. Differential Revision: https://reviews.llvm.org/D31331 llvm-svn: 300839	2017-04-20 13:26:46 +00:00
John Brawn	66719f63d0	[ARM] Fix handling of mapping symbols when changing sections ChangeSection incorrectly registers LastEMSInfo as belonging to the previous section, not the current section. This happens to work when changing sections using .section, as the previous section is set to the current section before the call to ChangeSection, but not when using .popsection. Differential Revision: https://reviews.llvm.org/D32225 llvm-svn: 300831	2017-04-20 10:18:13 +00:00
John Brawn	5ca5daa6b9	[AArch64] Fix handling of zero immediate in fmov instructions Currently fmov #0 with a vector destination is handled incorrectly and results in fmov #-1.9375 being emitted but should instead give an error. This is due to the way we cope with fmov #0 with a scalar destination being an alias of fmov zr, so fix this by actually doing it through an alias. Differential Revision: https://reviews.llvm.org/D31949 llvm-svn: 300830	2017-04-20 10:13:54 +00:00
John Brawn	dcf037a6f0	[AArch64] Fix handling of integer fp immediates When an integer is used as an fp immediate we're failing to check the return value of getFP64Imm, so invalid values are silently permitted. Fix this by merging together the integer and real handling. llvm-svn: 300828	2017-04-20 10:10:10 +00:00
Adrian Prantl	c12cee3600	Fix bug that caused DwarfExpression to drop DW_OP_deref from FI locations - introduced in r300522 and found via the Swift LLDB testsuite. The fix is to set the location kind to memory whenever a FrameIndex location is emitted. rdar://problem/31707602 llvm-svn: 300793	2017-04-19 23:42:25 +00:00
Reid Kleckner	aa0cec7d6d	Simplify test for sret attribute in instcombine This change is correct because the verifier requires that at most one argument be marked 'sret'. NFC, removes a use of AttributeList slot APIs. llvm-svn: 300784	2017-04-19 23:17:47 +00:00
Galina Kistanova	2cc97d92ce	Temporarily revert r299221 to fix nondeterminism in ThinLTO builder. llvm-svn: 300783	2017-04-19 23:16:14 +00:00
Matthias Braun	372ee59766	X86FrameLowering: Fix getFrameIndexReference() for 'fixed' objects Debug information is calculated with getFrameIndexReference() which was missing some logic for the fixed object cases (= parameters on the stack). rdar://24557797 Differential Revision: https://reviews.llvm.org/D32204 llvm-svn: 300781	2017-04-19 23:10:43 +00:00
Kostya Serebryany	c5d3d49034	[sanitizer-coverage] remove some more stale code llvm-svn: 300778	2017-04-19 22:42:11 +00:00
Sanjay Patel	0658a95a35	[DAG] add splat vector support for 'or' in SimplifyDemandedBits I've changed one of the tests to not fold away, but we didn't and still don't do the transform that the comment claims we do (and I don't know why we'd want to do that). Follow-up to: https://reviews.llvm.org/rL300725 https://reviews.llvm.org/rL300763 llvm-svn: 300772	2017-04-19 22:00:00 +00:00
Kostya Serebryany	be87d480ff	[sanitizer-coverage] remove stale code llvm-svn: 300769	2017-04-19 21:48:09 +00:00
Sanjay Patel	ae382bb6af	[DAG] add splat vector support for 'xor' in SimplifyDemandedBits This allows forming more 'not' ops, so we get improvements for ISAs that have and-not. Follow-up to: https://reviews.llvm.org/rL300725 llvm-svn: 300763	2017-04-19 21:23:09 +00:00
Matthias Braun	8aaa368d00	ARMFrameLowering: Reserve emergency spill slot for large arguments Re-commit after revert in r300668. Changed getMaxFPOffset() to a more conservative heuristic instead of trying to be clever and missing for some exotic calling conventions. We need to reserve an emergency spill slot in cases with large argument types that could overflow immediate offsets for FP relative address calculations. rdar://31317893 Differential Revision: https://reviews.llvm.org/D31643 llvm-svn: 300761	2017-04-19 21:11:44 +00:00
Simon Pilgrim	9a828c1f2b	[InstCombine] Add frem constant folding test (PR3316) llvm-svn: 300757	2017-04-19 21:09:19 +00:00
Matt Arsenault	4a48623e4f	AMDGPU: Custom lower illegal small select types Promote them to i32 vectors to avoid unpacking and re-packing the vectors. llvm-svn: 300754	2017-04-19 20:53:07 +00:00
Simon Pilgrim	19a6dddd6d	[InstCombine] Add frem constant folding test (PR32177) llvm-svn: 300750	2017-04-19 20:47:58 +00:00
Eli Friedman	f281d490cc	[ARM] Use TableGen patterns to select vtbl. NFC. Differential Revision: https://reviews.llvm.org/D32103 llvm-svn: 300749	2017-04-19 20:39:39 +00:00
Eli Friedman	e77d2b86b4	[SCEV] Make SCEV or modeling more aggressive. Use haveNoCommonBitsSet to figure out whether an "or" instruction is equivalent to addition. This handles more cases than just checking for a constant on the RHS. Differential Revision: https://reviews.llvm.org/D32239 llvm-svn: 300746	2017-04-19 20:19:58 +00:00
Dehao Chen	a364f09f18	Using address range map to speedup finding inline stack for address. Summary: In the current implementation, finding the inline stack for an address incurs an expensive linear search in 2 places: * linear search for the top-level DIE * recursive linear traversal of the DIE tree to find the path to the leaf DIE In this patch, a map is built from address to its corresponding leaf DIE. The inline stack is built by traversing from the leaf DIE up to the root DIE. This speeds up batch symbolization by ~10X without noticeable memory overhead. Reviewers: dblaikie Reviewed By: dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32177 llvm-svn: 300742	2017-04-19 20:09:38 +00:00
Dehao Chen	cfe5b2bf74	Update the madd.ll test with utils/update_llc_test_checks.py (NFC) llvm-svn: 300740	2017-04-19 20:08:14 +00:00
Dehao Chen	58601674d2	PR32710: Disable using PMADDWD for unsigned short. Summary: PMADDWD can only handle signed short. Reviewers: mkuper, wmi Reviewed By: mkuper Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D32236 llvm-svn: 300737	2017-04-19 19:50:34 +00:00
Matt Arsenault	021a218dd2	AMDGPU: Don't emit amd_kernel_code_t for callable functions This is inserted directly in the text section. The relocation for the function ends up resolving to the beginning of the amd_kernel_code_t header rather than the actual function entry point. Also skip some of the comments for initialization that only make sense for kernels. llvm-svn: 300736	2017-04-19 19:38:10 +00:00
Artem Tamazov	557a057d4f	[AMDGPU][mc][tests][NFC] Update bulk ISA tests for Gfx7 and Gfx8 Added approx. 1100 gfx7 and 1040 gfx8 test cases. llvm-svn: 300734	2017-04-19 19:12:06 +00:00
Matt Arsenault	d3406bc45c	StructurizeCFG: Directly invert cmp instructions The most common case for a branch condition is a single use compare. Directly invert the branch predicate rather than adding a lot of xor i1 true which the DAG will have to fold later. This produces nicer to read structurizer output. This produces some random changes in codegen due to the DAG swapping branch conditions itself, and then does a poor job of dealing with those inverts. llvm-svn: 300732	2017-04-19 18:29:07 +00:00
Sanjoy Das	5945447d84	[GVN] Don't coerce non-integral pointers to integers or vice versa Summary: See http://llvm.org/docs/LangRef.html#non-integral-pointer-type The NewGVN test does not fail without these changes (perhaps it does try to coerce pointers <-> integers to begin with?), but I added the test case anyway. Reviewers: dberlin Subscribers: mcrosier, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D32208 llvm-svn: 300730	2017-04-19 18:21:09 +00:00
Tim Northover	ff168c68dc	ARM: TLS calling convention doesn't preserve r9 or r12 on Darwin. llvm-svn: 300726	2017-04-19 18:07:54 +00:00
Sanjay Patel	ded7d59f0e	[DAG] add splat vector support for 'and' in SimplifyDemandedBits The patch itself is simple: stop discriminating against vectors in visitAnd() and again in SimplifyDemandedBits(). Some notes for reference: 1. We're not consistent about calls to SimplifyDemandedBits in the various visitXXX functions. Sometimes, we check if the RHS is a constant first. Other times (like here), we just dive in. 2. I'd like to break the vector shackles in steps for the sake of risk minimization, but we could make similar simultaneous changes in other places if we think that would be better. 3. I don't know what the intent of the changed tests in this patch was supposed to be, but since they wiggled in a positive way, I'm just going with that. :) 4. In the rotate tests, note that we can see through non-splat constants. This is a result of D24253. 5. My motivation for being here now is to make D31944 look better, so this is step 1 of N towards improving the vector codegen in that patch without writing any actual new code. Differential Revision: https://reviews.llvm.org/D32230 llvm-svn: 300725	2017-04-19 18:05:06 +00:00
Matt Arsenault	6cb7b8a42f	AMDGPU: Don't align callable functions to 256 llvm-svn: 300720	2017-04-19 17:42:39 +00:00
Matt Arsenault	4c1ecded63	AMDGPU: Change DivergenceAnalysis for function arguments Stop assuming all functions are kernels. llvm-svn: 300719	2017-04-19 17:42:34 +00:00
Sanjay Patel	a3c297dba4	[InstSimplify] fold identity shuffles (recursing if needed) This patch simplifies the examples from D31509 and D31927 (PR30630) and catches the basic identity shuffle tests that Zvi recently added. I'm not sure if we have something like this in DAGCombiner, but we should? It's worth noting that "MaxRecurse / RecursionLimit" is only 3 on entry at the moment. We might want to bump that up if there are longer shuffle chains like this in the wild. For now, we're ignoring shuffles that have undef mask elements because it's not clear how those should be handled. Differential Revision: https://reviews.llvm.org/D31960 llvm-svn: 300714	2017-04-19 16:48:22 +00:00
Krzysztof Parzyszek	333b2bf2ed	[Hexagon] Generate proper offset in opt-addr-mode Also, make a few changes to allow using the pass in .mir testcases. Among other things, change the abbreviation from opt-amode to amode-opt, because otherwise lit would expand the "opt" part to the full path to the opt binary. llvm-svn: 300707	2017-04-19 15:15:51 +00:00
Sanjay Patel	c9d36f181f	[PowerPC] add test and auto-generate checks; NFC llvm-svn: 300700	2017-04-19 14:58:09 +00:00
Sanjay Patel	5a2235bbd0	[ARM] add test and auto-generate checks; NFC llvm-svn: 300698	2017-04-19 14:55:50 +00:00
Davide Italiano	a9f047a594	[InstSimplify] Deduce correct type for vector GEP. InstSimplify returned the wrong type when simplifying a vector GEP and we ended up crashing when trying to replace all uses with the new value. Fixes PR32697. Differential Revision: https://reviews.llvm.org/D32180 llvm-svn: 300693	2017-04-19 14:23:42 +00:00
Dylan McKay	da2d74642a	[AVR] Remove the 'multibyte' asm test It tests registers which are not actually used on AVR. llvm-svn: 300684	2017-04-19 12:13:45 +00:00
Simon Pilgrim	5536ecd9f0	Regenerate test. NFCI. llvm-svn: 300683	2017-04-19 12:06:40 +00:00
Dylan McKay	7838104382	[AVR] Fix the test suite A bunch of tests failed because memory operations have been reordered. I am unsure which commit changed this behaviour as the AVR build was failing at that point with an unrelated error. This commit just reorders some of the CHECK lines in some tests to suit current llc output. llvm-svn: 300682	2017-04-19 12:02:52 +00:00
Igor Breger	4fdf1e489c	[GlobalIsel][X86] support G_TRUNC selection. Summary: [GlobalIsel][X86] support G_TRUNC selection. Add regbank-select and legalizer tests. Currently legalization of trunc i64 on a 32-bit platform is not supported. Reviewers: ab, zvi, rovka Reviewed By: zvi Subscribers: dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D32115 llvm-svn: 300678	2017-04-19 11:34:59 +00:00
Simon Pilgrim	94d5dba8ae	[X86] Add D32039/PR31357 tests to show current BSWAP codegen llvm-svn: 300672	2017-04-19 11:06:22 +00:00
Simon Pilgrim	b13ab63503	[X86][SSE] Add scheduling latency/throughput tests for (most) SSE2 instructions llvm-svn: 300671	2017-04-19 10:52:09 +00:00
Renato Golin	742aed8683	Revert "ARMFrameLowering: Reserve emergency spill slot for large arguments" This reverts commit r300639, as it broke self-hosting on ARM. PR32709. llvm-svn: 300668	2017-04-19 09:02:52 +00:00
Igor Breger	746b3c336f	[GlobalISel][X86] Split select tests. NFC. llvm-svn: 300666	2017-04-19 08:40:44 +00:00
Diana Picus	49472ff1cf	[ARM] GlobalISel: Add support for G_MUL Support G_MUL, very similar to G_ADD and G_SUB. The only difference is in the instruction selector, where we have to select either MUL or MULv5 depending on the target. llvm-svn: 300665	2017-04-19 07:29:46 +00:00
Kristof Beyls	0f36e68f62	[GlobalISel] Support vector-of-pointers in LLT This fixes PR32471. As comment 10 on that bug report highlights (https://bugs.llvm.org//show_bug.cgi?id=32471#c10), there are quite a few different defendable design tradeoffs that could be made, including not representing pointers at all in LLT. I decided to go for representing vector-of-pointer as a concept in LLT, while keeping the size of the LLT type 64 bits (this is an increase from 48 bits before). My rationale for keeping pointers explicit is that on some targets probably it's very handy to have the distinction between pointer and non-pointer (e.g. 68K has a different register bank for pointers IIRC). If we keep a scalar pointer, it probably is easiest to also have a vector-of-pointers to keep LLT relatively conceptually clean and orthogonal, while we don't have a very strong reason to break that orthogonality. Once we gain more experience on the use of LLT, we can of course reconsider this direction. Rejecting vector-of-pointer types in the IRTranslator is also an option to avoid the crash reported in PR32471, but that is only a very short-term solution; also needs quite a bit of code tweaks in places, and is probably fragile. Therefore I didn't consider this the best option. llvm-svn: 300664	2017-04-19 07:23:57 +00:00
Chandler Carruth	ae3386aa74	Revert r300657 due to crashes in stage2 of bootstraps: http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/2476/steps/build-stage2-LLVMgold.so/logs/stdio http://bb.pgr.jp/builders/clang-3stage-x86_64-linux/builds/15036/steps/build_llvmclang/logs/stdio I've updated the commit thread, reverting to get the bots back to green. Original commit summary: [JumpThread] We want to fold (not thread) when all predecessor go to single BB's successor. llvm-svn: 300662	2017-04-19 06:23:20 +00:00
Xin Tong	636a332906	[JumpThread] We want to fold (not thread) when all predecessor go to single BB's successor. Summary: In case all predecessors go to a single successor of the current BB, we want to fold (not thread). Reviewers: efriedma, sanjoy Reviewed By: sanjoy Subscribers: dberlin, majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D30869 llvm-svn: 300657	2017-04-19 05:15:57 +00:00
Matthias Braun	661d3d4b00	ARMFrameLowering: Reserve emergency spill slot for large arguments We need to reserve an emergency spill slot in cases with large argument types that could overflow immediate offsets for FP relative address calculations. rdar://31317893 Differential Revision: https://reviews.llvm.org/D31643 llvm-svn: 300639	2017-04-19 01:16:07 +00:00
Dean Michael Berris	97923db891	[XRay][tools] Fix yaml matching to be more permissive Account for a potentially empty function name. Follow-up to D32153. llvm-svn: 300631	2017-04-19 00:10:09 +00:00
Dean Michael Berris	918802bed4	[XRay][tools] Add option to llvm-xray extract to symbolize functions Summary: If the symbol names are available in the binary, this allows us to provide the function name in the YAML output. Reviewers: dblaikie, pelikan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32153 llvm-svn: 300624	2017-04-18 23:23:54 +00:00
Sanjay Patel	ff981f9256	[x86] add tests for potential andn optimization; NFC llvm-svn: 300617	2017-04-18 22:36:59 +00:00
Chih-Hung Hsieh	877923a87f	[X86] Keep EXTRACT_VECTOR_ELT result type as f128 for Android x86_64. Android x86_64 target uses f128 type and stores f128 values in %xmm* registers. SoftenFloatRes_EXTRACT_VECTOR_ELT should not convert result value from f128 to i128. Differential Revision: http://reviews.llvm.org/D32102 llvm-svn: 300583	2017-04-18 20:15:18 +00:00
Simon Pilgrim	9398649fea	[X86][SSE] Add scheduling latency/throughput tests for (most) SSE1 instructions llvm-svn: 300576	2017-04-18 19:04:40 +00:00
Easwaran Raman	76aba5f6d7	[SLP vectorizer] Allow phi node reordering in tryToVectorizeList. In tryToVectorizeList, under a very limited circumstance (when entered from tryToVectorizePair), the values may be reordered (swapped) and the SLP tree is built with the new order. This extends that to the case when starting from phis in vectorizeChainsInBlock when there are exactly two phis. The textual order of phi nodes shouldn't really matter. Without this change, the loop body in the accompanying test case is fully vectorized when we swap the order of the phis but not with this order. While this doesn't solve the phi-ordering problem in a general way (for more than 2 phis), this is a simple fix that piggybacks on an existing mechanism and is useful in cases like multiplying two complex numbers. Differential revision: https://reviews.llvm.org/D32065 llvm-svn: 300574	2017-04-18 18:16:57 +00:00
Nirav Dave	855ef45602	[DAG] Improve store merge candidate pruning. Remove non-consecutive stores from store merge candidate search as they cannot be merged and will prevent us from finding subsequent mergeable store cases. Reviewers: jyknight, bogner, javed.absar, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32086 llvm-svn: 300561	2017-04-18 15:36:34 +00:00
Nirav Dave	e50544cdf2	Add base-index-based store merge test llvm-svn: 300559	2017-04-18 15:12:13 +00:00
Nikolai Bozhenov	95fc644148	Make globalaa-retained.ll test catch more cases. Summary: * Add checks for store. That is needed because GlobalsAA is called twice in the current pipeline with different sets of Function passes following it. However, the loads are eliminated using instcombine which happens everywhere. On the other hand, DeadStoreElimination is performed only once so by checking for store we'll be able to catch more cases when GlobalsAA is invalidated unintentionally. * Add empty function above/below the test so that we don't depend on the relative order of instcombine/dead-store-elimination and the pass that invalidates the analysis (inside the same FunctionPassManager). Reviewers: kristof.beyls Reviewed By: kristof.beyls Subscribers: llvm-commits, n.bozhenov Differential Revision: https://reviews.llvm.org/D32015 Patch by Andrei Elovikov <andrei.elovikov@intel.com> llvm-svn: 300553	2017-04-18 13:29:26 +00:00
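
Note on r300746 ([SCEV] Make SCEV or modeling more aggressive): the entry relies on the fact that an "or" of two values is equivalent to an addition whenever the two values have no set bits in common, since no bit position can then produce a carry. The sketch below only illustrates that arithmetic fact on concrete integers. It is not LLVM code: the function names are hypothetical, and the real haveNoCommonBitsSet reasons about known bits of IR values rather than constants.

```python
def have_no_common_bits_set(a: int, b: int) -> bool:
    # Hypothetical stand-in operating on concrete non-negative integers:
    # true when no bit position is set in both operands.
    return (a & b) == 0


def or_as_add(a: int, b: int):
    # If the operands share no set bits, no carry can occur, so a | b == a + b
    # and the "or" can be modeled as an addition. Otherwise report that the
    # rewrite does not apply by returning None.
    if have_no_common_bits_set(a, b):
        return a + b
    return None


# Exhaustive check over small operands: whenever the rewrite applies,
# the sum equals the bitwise or.
for x in range(64):
    for y in range(64):
        rewritten = or_as_add(x, y)
        if rewritten is not None:
            assert rewritten == (x | y)
```

Checking only for a constant on the right-hand side, as the entry says was done before, covers a subset of these cases; the disjoint-bits condition is the more general one.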

44438 Commits