llvm-project

Commit Graph

Author	SHA1	Message	Date
Sumanth Gundapaneni	0ac1ce70ba	In the below scenario, we must be able to skip the DBG_VALUE instruction and remove the dead store. %vreg0<def> = L2_loadri_io <fi#15>, 0; mem:LD4[%dataF](align=4) DBG_VALUE %vreg0, %noreg, !"dataF", <!184>; IntRegs:%vreg0 S2_storeri_io <fi#15>, 0, %vreg0; mem:ST4[%dataF] In reality, stores of this kind are eliminated before the Stack Slot Coloring pass, possibly in instruction lowering. Differential Revision: https://reviews.llvm.org/D26616 llvm-svn: 291455	2017-01-09 17:45:02 +00:00
Bjorn Pettersson	b14afd452d	[SelectionDAG] Fix in legalization of UMAX/SMAX/UMIN/SMIN. Solves PR31486. Summary: Originally i64 = umax t8, Constant:i64<4> was expanded into i32,i32 = umax Constant:i32<0>, Constant:i32<0> i32,i32 = umax t7, Constant:i32<4> Now instead the two produced umax:es return i32 instead of i32, i32. Thanks to Jan Vesely for help with the test case. Patch by mikael.holmen at ericsson.com Reviewers: bogner, jvesely, tstellarAMD, arsenm Subscribers: test, wdng, RKSimon, arsenm, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D28135 llvm-svn: 291441	2017-01-09 12:03:50 +00:00
Daniel Sanders	12360efa76	[globalisel] Stop requiring -debug/-debug-only=registerbankinfo for assertions. Summary: I've noticed that these assertions don't trigger when the condition is false. The problem is that the DEBUG(x) macro only executes x when the pass is emitting debug output via the -debug and -debug-only=registerbankinfo command line arguments. Debug builds should always execute the assertions so use '#ifndef NDEBUG' instead. Also removed an assertion that is only true the first time it's tested. <Target>RegisterBankInfo's constructor will re-use register banks causing them to be valid on subsequent tests. That assertion will fail on the first test too in the near future. Reviewers: t.p.northover, ab, rovka, qcolombet Subscribers: dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D28358 llvm-svn: 291235	2017-01-06 14:29:34 +00:00
David Majnemer	9e04befb09	[SelectionDAG] Rework lowerRangeToAssertZExt Utilize ConstantRange to make it easier to interpret range metadata. llvm-svn: 291211	2017-01-06 02:43:28 +00:00
David Majnemer	eaba06cffa	[SelectionDAG] Correctly transform range metadata to AssertZExt We used the logBase2 of the high instead of the ceilLogBase2 resulting in the wrong result for certain values. For example, it resulted in an i1 AssertZExt when the exclusive portion of the range was 3. llvm-svn: 291196	2017-01-06 00:11:46 +00:00
Joerg Sonnenberger	83963995c6	PR 31534: When emitting both DWARF unwind tables and debug information, do not use .cfi_sections. This requires checking if any non-declaration function in the module needs an unwind table. llvm-svn: 291172	2017-01-05 20:55:28 +00:00
Matthias Braun	1172332203	CodeGen: Assert that liveness is up to date when reading block live-ins. Add an assert that checks whether liveins are up to date before they are used. - Do not print liveins into .mir files anymore in situations where they are out of date anyway. - The assert in the RegisterScavenger is superseded by the new one in livein_begin(). - Skip parts of the liveness updating logic in IfConversion.cpp when liveness isn't tracked anymore (just enough to avoid hitting the new assert()). Differential Revision: https://reviews.llvm.org/D27562 llvm-svn: 291169	2017-01-05 20:01:19 +00:00
Kristof Beyls	a983e7c4a4	[GlobalISel] Add support for address-taken basic blocks To make this work, pointers from the MachineBasicBlock to the LLVM-IR-level basic blocks need to be initialized, as the AsmPrinter uses this link to be able to print out labels for the basic blocks that are address-taken. Most of the changes in this commit are about adapting existing tests to include the basic block name that is now printed out in the MIR format, now that the name becomes available as the link to the LLVM-IR basic block is initialized. The relevant test change for the functionality added in this patch are the added "(address-taken)" strings in test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll. Differential Revision: https://reviews.llvm.org/D28123 llvm-svn: 291105	2017-01-05 13:27:52 +00:00
Kristof Beyls	eced071e88	[GlobalISel] Add support for switch statements This commit does this using a trivial chain of conditional branches. In the future, we probably want to reuse the optimized switch lowering used in SelectionDAG. Differential Revision: https://reviews.llvm.org/D28176 llvm-svn: 291099	2017-01-05 11:28:51 +00:00
Saleem Abdulrasool	6252bd8eac	MC: support passing search paths to the IAS This is needed to support inclusion in inline assembly via the `.include` directive. llvm-svn: 291085	2017-01-05 05:56:39 +00:00
Tim Shen	5480eb8445	[Legalizer] Fix fp-to-uint to fp-to-sint promotion assertion. Summary: When promoting fp-to-uint16 to fp-to-sint32, the result is actually zero extended. For example, given double 65534.0, without legalization: fp-to-uint16: 65534.0 -> 0xfffe With the legalization: fp-to-sint32: 65534.0 -> 0x0000fffe Without this patch, legalization wrongly emits a sign-extend assertion, which is consumed by a later icmp instruction and causes a miscompile. Note that the floating point value must be in [0, 65535), otherwise the behavior is undefined. This patch reverts r279223 behavior and adds more tests and documentation. In PR29041's context, James Molloy mentioned that: We don't need to mask because conversion from float->uint8_t is undefined if the integer part of the float value is not representable in uint8_t. Therefore we can assume this doesn't happen! which is totally true and good, because fptoui is documented clearly to have undefined behavior when overflow/underflow happens. We should take advantage of this behavior so that we can save unnecessary mask instructions. Reviewers: jmolloy, nadav, echristo, kbarton Subscribers: mehdi_amini, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D28284 llvm-svn: 291015	2017-01-04 22:11:42 +00:00
Evgeny Stupachenko	c88697dc16	The patch fixes (base, index, offset) match. Summary: Instead of matching: (a + i) + 1 -> (a + i, undef, 1) Now it matches: (a + i) + 1 -> (a, i, 1) Reviewers: rengolin Differential Revision: http://reviews.llvm.org/D26367 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 291012	2017-01-04 21:43:39 +00:00
Bjorn Pettersson	3c6ce733f5	Fix for InlineSpiller accessing not updated dom tree base information. Summary: The InlineSpiller was accessing the DominatorTreeBase directly through the public data member DT in the MachineDominatorTree. This is not a good idea as the "cached" information in SplitCriticalEdges is not applied before the access. The DominatorTreeBase must be accessed through the member function getBase() in MachineDominatorTree. The fault was introduced in r266162. I think the public data member DT in the MachineDominatorTree should have been made private in the original code (r215576) that introduced the concept of lazily updating the MachineDominatorTree information from MachineBasicBlock::SplitCriticalEdge(). Patch by Karl-Johan Karlsson <karl-johan.karlsson@ericsson.com> Reviewers: wmi, qcolombet Subscribers: llvm-commits, bjope, uabelho Differential Revision: https://reviews.llvm.org/D27983 llvm-svn: 290950	2017-01-04 09:41:56 +00:00
Ahmed Bougacha	8a41319d8d	[CodeGen] Further simplify returned call operand logic. NFC. As Pete points out in r290905, CallSite lets us avoid duplicating this! llvm-svn: 290909	2017-01-03 21:42:43 +00:00
Ahmed Bougacha	6aff744e7c	[CodeGen] Simplify logic that looks for returned call operands. NFC-ish. Use getReturnedArgOperand() instead of rolling our own. Note that it's equivalent because there can only be one 'returned' operand. The existing code was also incorrect: there already was awkward logic to ignore callee/EH blocks, but operands can now also be operand bundles, in which case we'll look for non-existent parameter attributes. Unfortunately, this isn't observable in-tree, as it only crashes when exercising the regular call lowering logic with operand bundles. Still, this is a nice small cleanup anyway. llvm-svn: 290905	2017-01-03 20:33:22 +00:00
Dean Michael Berris	f7e7b938ea	[XRay] Merge instrumentation point table emission code into AsmPrinter. Summary: No need to have this per-architecture. While there, unify 32-bit ARM's behaviour with what changed elsewhere and start function names lowercase as per the coding standards. Individual entry emission code goes to the entry's own class. Fully tested on amd64, cross-builds on both ARMs and PowerPC. Reviewers: dberris Subscribers: aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D28209 llvm-svn: 290858	2017-01-03 04:30:21 +00:00
Joerg Sonnenberger	7b83732a40	Emit .cfi_sections before the first .cfi_startproc GNU as rejects input where .cfi_sections is used after .cfi_startproc, if the new section differs from the old. Adjust our output to always emit .cfi_sections before the first .cfi_startproc to minimize necessary code. Differential Revision: https://reviews.llvm.org/D28011 llvm-svn: 290817	2017-01-02 18:05:27 +00:00
Keno Fischer	f7d84ee6ff	Reapply "[CodeGen] Fix invalid DWARF info on Win64" This reapplies rL289013 (reverted in rL289014) with the fixes identified in D21731. Should hopefully pass the buildbots this time. llvm-svn: 290809	2017-01-02 03:00:19 +00:00
Florian Hahn	f872d230ad	[selectiondag] Check PromotedFloats map during expensive checks. Summary: `PromotedFloats` needs to be checked in `DAGTypeLegalizer::PerformExpensiveChecks`. This patch fixes a few type legalization failures with expensive checks for ARM fp16 tests. Reviewers: baldrick, bogner, arsenm Subscribers: arsenm, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D28187 llvm-svn: 290796	2017-01-01 13:58:27 +00:00
Reid Kleckner	0e7c84c682	Simplify FunctionLoweringInfo.cpp with range for loops I'm preparing to add some pattern matching code here, so simplify the code before I do. NFC llvm-svn: 290731	2016-12-30 00:21:38 +00:00
Reid Kleckner	cd46c1df80	Revert "[COFF] Use 32-bit jump table entries in .rdata for Win64" This reverts commit r290694. It broke sanitizer tests on Win64. I'll probably bring this back, but the jump tables will just live in .text like they do for MSVC. llvm-svn: 290714	2016-12-29 17:07:10 +00:00
Igor Laevsky	4f31e52f94	Introduce element-wise atomic memcpy intrinsic This change adds a new intrinsic which is intended to provide memcpy functionality with additional atomicity guarantees. Please refer to the review thread or language reference for further details. Differential Revision: https://reviews.llvm.org/D27133 llvm-svn: 290708	2016-12-29 14:31:07 +00:00
Reid Kleckner	c9e0a153cf	[COFF] Use 32-bit jump table entries in .rdata for Win64 Summary: We were already using 32-bit jump table entries, but this was a consequence of the default PIC model on Win64, and not an intentional design decision. This patch ensures that we always use 32-bit label difference jump table entries on Win64 regardless of the PIC model. This is a good idea because it saves executable size and object file size. Moving the jump tables to .rdata cleans up the disassembled object code and reduces the available ROP targets, but it requires adding one more RIP-relative lea to the code. COFF doesn't have relocations to express the difference between two arbitrary symbols, so we can't use the jump table label in the label difference like we do elsewhere. Fixes PR31488 Reviewers: majnemer, compnerd Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28141 llvm-svn: 290694	2016-12-29 00:12:39 +00:00
Reid Kleckner	92647369fc	[WinEH] Don't assume endFunction is called while in .text Jump table emission can switch to .rdata before WinException::endFunction gets called. Just remember the appropriate text section we started in and reset back to it when we end the function. We were already switching sections back from .xdata anyway. Fixes the first problem in PR31488, so that now COFF switch tables can live in .rdata if we want them to. llvm-svn: 290678	2016-12-28 19:05:12 +00:00
Simon Pilgrim	0d66d29678	[SelectionDAG] Early out from computeKnownBits when we know we will have no common bits. Avoid extra (recursive) calls to computeKnownBits if we already know that there are no common known bits. llvm-svn: 290490	2016-12-24 12:59:35 +00:00
Zijiao Ma	bf6007bd1b	Make the canonicalisation on shifts benefit more cases. 1. Fix pessimized case in FIXME. 2. Add tests for it. 3. The canonicalisation on shifts results in a different sequence for the machine-licm tests. Correct some check lines. Differential Revision: https://reviews.llvm.org/D27916 llvm-svn: 290410	2016-12-23 02:56:07 +00:00
Sanjoy Das	50fef4321b	NFC code motion in ImplicitNullChecks Extract out two large lambdas into top level member functions. llvm-svn: 290395	2016-12-23 00:41:24 +00:00
Sanjoy Das	9a129807f3	Reimplement dependency tracking in the ImplicitNullChecks pass Summary: This change rewrites a core component in the ImplicitNullChecks pass for greater simplicity since the original design was over-complicated for no good reason. Please review this as essentially a new pass. The change is almost NFC and I've added a test case for a scenario that this new code handles that wasn't handled earlier. The implicit null check pass, at its core, is a code hoisting transform. It differs from "normal" code transforms in that it speculates potentially faulting instructions (by design), but a lot of the usual hazard detection logic (register read-after-write etc.) still applies. We previously detected hazards by keeping track of registers defined and used by machine instructions over an instruction range, but that was unwieldy and did not actually confer any performance benefits. The intent was to have linear time complexity over the number of machine instructions considered, but it ended up being N^2 in practice. This new version is more obviously O(N^2) (with N capped to 8 by default) in hazard detection. It does not attempt to be clever in tracking register uses or defs (the previous cleverness here was a source of bugs). Once this is checked in, I'll extract out the `IsSuitableMemoryOp` and `CanHoistLoadInst` lambda into member functions (they're too complicated to be inline lambdas) and do some other related NFC cleanups. Reviewers: reames, anna, atrick Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D27592 llvm-svn: 290394	2016-12-23 00:41:21 +00:00
Quentin Colombet	3749f33888	[GlobalISel] More fix for the size vs. type typo. NFC. I missed those in my previous commit (r290378). llvm-svn: 290387	2016-12-22 22:50:34 +00:00
Quentin Colombet	fa5960a28b	[MachineVerifier] Check that even generic vregs comply to regclass constraints. We used to not check generic vregs, but that is actually a mistake given nothing in the GlobalISel pipeline is going to fix the constraints on target specific instructions. Therefore, the target has to have them right from the start. llvm-svn: 290380	2016-12-22 21:56:39 +00:00
Quentin Colombet	e08cc599b8	[MIRParser] Fix a typo in comment and error message. We have long switched from size to type. llvm-svn: 290378	2016-12-22 21:56:35 +00:00
Quentin Colombet	9751e61fe1	[MIRParser] Non-generic virtual register may have a type. When generic virtual registers get constrained, because of a use on a target specific operation for instance, we end up with regular virtual registers with a type and that's perfectly fine. llvm-svn: 290376	2016-12-22 21:56:29 +00:00
Quentin Colombet	7e1f66d6f5	[RegisterBankInfo] Allow setting a register class when nothing else is set. This is going to be needed to be able to constrain the register class on target-specific instructions when the RegBankSelect pass has not run yet. llvm-svn: 290375	2016-12-22 21:56:26 +00:00
Quentin Colombet	b4e71185b2	[GlobalISel] Refactor the logic to constrain registers. Move the logic to constrain registers from InstructionSelector to a utility function. It will be required by other passes in the GlobalISel pipeline. llvm-svn: 290374	2016-12-22 21:56:19 +00:00
Wei Mi	a2f0b594c2	Redo store splitting in CodeGenPrepare. This is a succeeding patch of https://reviews.llvm.org/D22840 to address the issue when a value to be merged into an int64 pair is in a different BB. Redoing the store splitting in CodeGenPrepare so we can match the pattern across multiple BBs and move some instructions into the same BB. We still keep the code in dag combine so that we can catch cases that show up after DAG combining runs. Differential Revision: https://reviews.llvm.org/D25914 llvm-svn: 290365	2016-12-22 19:44:45 +00:00
Wei Mi	f3f01aba48	Change the interface of TLI.isMultiStoresCheaperThanBitsMerge. This is for splitMergedValStore in DAG Combine to share the target query interface with similar logic in CodeGenPrepare. Differential Revision: https://reviews.llvm.org/D24707 llvm-svn: 290363	2016-12-22 19:38:22 +00:00
Krzysztof Parzyszek	8839124848	Add the DAG mutation interface to the software pipeliner llvm-svn: 290360	2016-12-22 19:21:20 +00:00
Krzysztof Parzyszek	df24da221e	Fix two bugs in the pipeliner in renaming phis in the prolog and epilog When the pipeliner is renaming phi values, it may need to iterate through the phi operands to check for other phis. However, the pipeliner should stop once it reaches a phi that is outside the pipelined loop. Also, when the generateExistingPhis code is unable to reuse an existing phi, the default code that computes the PhiOp2 is only to be used when the pipeliner is generating the kernel. Otherwise, the phi may be a value computed earlier in the same epilog. Patch by Brendon Cahoon. llvm-svn: 290355	2016-12-22 18:49:55 +00:00
Adrian Prantl	5542da4bbc	Fix an assertion in DwarfExpression when emitting fragments in vector registers When DwarfExpression is emitting a fragment that is located in a register and that fragment is smaller than the register, and the register must be composed from sub-registers (are you still with me?) the last DW_OP_piece operation must not be larger than the size of the fragment itself, since the last piece of the fragment could be smaller than the last subregister that is being emitted. rdar://problem/29779065 llvm-svn: 290324	2016-12-22 06:10:41 +00:00
Adrian Prantl	49797ca6be	Refactor the DIExpression fragment query interface (NFC) ... so it becomes available to DIExpressionCursor. llvm-svn: 290322	2016-12-22 05:27:12 +00:00
Matt Arsenault	485dacd90c	DAG: Add helper for testing constant values There are helpers for testing for constant or constant build_vector, and for splat ConstantFP vectors, but not for a constantfp or non-splat ConstantFP vector. llvm-svn: 290317	2016-12-22 04:39:45 +00:00
Oren Ben Simhon	3b95157090	[X86] Vectorcall Calling Convention - Adding CodeGen Complete Support The vectorcall calling convention specifies that arguments to functions are to be passed in registers, when possible. vectorcall uses more registers for arguments than fastcall or the default x64 calling convention use. The vectorcall calling convention is only supported in native code on x86 and x64 processors that include Streaming SIMD Extensions 2 (SSE2) and above. The current implementation does not handle Homogeneous Vector Aggregates (HVAs) correctly and this review attempts to fix it. This commit also includes additional lit tests to better cover HVA corner cases. Differential Revision: https://reviews.llvm.org/D27392 llvm-svn: 290240	2016-12-21 08:31:45 +00:00
Sebastian Pop	7779484313	machine combiner: fix pretty printer We used to print UNKNOWN instructions when the instruction to be printed was not yet inserted in any BB: in that case the pretty printer would not be able to compute a TII as the instruction does not belong to any BB or function yet. This patch explicitly passes the TII to the pretty-printer. Differential Revision: https://reviews.llvm.org/D27645 llvm-svn: 290228	2016-12-21 01:41:12 +00:00
George Burgess IV	3f08914e7e	[Analysis] Centralize objectsize lowering logic. We're currently doing nearly the same thing for @llvm.objectsize in three different places: two of them are missing checks for overflow, and one of them could subtly break if InstCombine gets much smarter about removing alloc sites. Seems like a good idea to not do that. llvm-svn: 290214	2016-12-20 23:46:36 +00:00
Adrian Prantl	bceaaa9643	[IR] Remove the DIExpression field from DIGlobalVariable. This patch implements PR31013 by introducing a DIGlobalVariableExpression that holds a pair of DIGlobalVariable and DIExpression. Currently, DIGlobalVariable holds a DIExpression. This is not the best way to model this: (1) The DIGlobalVariable should describe the source level variable, not how to get to its location. (2) It makes it unsafe/hard to update the expressions when we call replaceExpression on the DIGlobalVariable. (3) It makes it impossible to represent a global variable that is in more than one location (e.g., a variable with multiple DW_OP_LLVM_fragment-s). We also moved away from attaching the DIExpression to DILocalVariable for the same reasons. This reapplies r289902 with additional testcase upgrades and a change to the Bitcode record for DIGlobalVariable that makes upgrading the old format unambiguous also for variables without DIExpressions. <rdar://problem/29250149> https://llvm.org/bugs/show_bug.cgi?id=31013 Differential Revision: https://reviews.llvm.org/D26769 llvm-svn: 290153	2016-12-20 02:09:43 +00:00
Bjorn Pettersson	b29a15ecad	[CodeGen] Make MachineInstr::isIdenticalTo() symmetric. Summary: MachineInstr::isIdenticalTo() is for some reason not symmetric when comparing bundles, which gives us the property: I1->isIdenticalTo(I2) != I2->isIdenticalTo(I1) when comparing bundles where one bundle is longer than the other. This patch makes sure that bundles of different length always are considered as not being identical. Thus, the result of the comparison will be the same regardless of which side that happens to be to the left. Reviewers: dexonsmith, jonpa, andrew.w.kaylor Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D27508 llvm-svn: 290095	2016-12-19 11:20:57 +00:00
Dean Michael Berris	03b8be575e	[XRay] Fix assertion failure on empty machine basic blocks (PR 31424) The original version of the code in XRayInstrumentation.cpp assumed that functions may not have empty machine basic blocks (or that the first one couldn't be). This change addresses that by special-casing that specific situation. We provide two .mir test-cases to make sure we're handling this appropriately. Fixes llvm.org/PR31424. Reviewers: chandlerc Subscribers: varno, llvm-commits Differential Revision: https://reviews.llvm.org/D27913 llvm-svn: 290091	2016-12-19 09:20:38 +00:00
Tom Stellard	7761abb64a	Add custom type for PseudoSourceValue Summary: PseudoSourceValue can be used to attach a target specific value for "well behaved" side-effects lowered from target specific intrinsics. This is useful whenever there is not an LLVM IR Value around when representing such "well behaved" side-effected operations in backends by attaching a MachineMemOperand with a custom PseudoSourceValue, as this keeps the scheduler from treating them as "GlobalMemoryObjects", which triggers logic that makes the operation act like a barrier in the Schedule DAG. This patch adds another Kind to the PseudoSourceValue object which is "TargetCustom". It indicates a type of PseudoSourceValue that has a target specific meaning (i.e. LLVM shouldn't assume any specific usage for such a PSV). It supports the possibility of having many different kinds of "TargetCustom" PseudoSourceValues. We had a discussion about whether this was valuable or not (in particular because there was a belief that PSVs were going away sooner or later) but it seems like they are not going anywhere and I think they are useful on the backend side. The interaction of this with MIRParser is not clear (do we need a target hook to parse these?) and I would like a comment from Alex about that :) Reviewers: arphaman, hfinkel, arsenm Subscribers: Eugene.Zelenko, llvm-commits Patch By: Marcello Maggioni Differential Revision: https://reviews.llvm.org/D13575 llvm-svn: 290037	2016-12-17 04:41:53 +00:00
Matthias Braun	181983055f	BranchRelaxation: Recompute live-ins when splitting a block Factors out and reuses live-in computation code from BranchFolding. Differential Revision: https://reviews.llvm.org/D27558 llvm-svn: 290013	2016-12-16 23:55:37 +00:00
Paul Robinson	2dfb688214	Allow "line 0" to be the first explicit debug location in a function. Feedback on r289468 from Adrian Prantl. llvm-svn: 290012	2016-12-16 23:54:33 +00:00
Zachary Turner	46225b193f	Resubmit "[CodeView] Hook CodeViewRecordIO for reading/writing symbols." The original patch was broken due to some undefined behavior as well as warnings that were triggering -Werror. llvm-svn: 290000	2016-12-16 22:48:14 +00:00
Jun Bum Lim	90b6b5074a	[CodeGenPrep] Skip merging empty case blocks This is a recommit of r287553 after fixing the invalid loop info after eliminating an empty block and unit test failures in AVR and WebAssembly: Summary: Merging an empty case block into the header block of switch could cause ISel to add COPY instructions in the header of switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targeting. Reviewers: t.p.northover, mcrosier, manmanren, wmi, joerg, davidxl Subscribers: joerg, qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D22696 llvm-svn: 289988	2016-12-16 20:38:39 +00:00
Sanjoy Das	007572706b	Inline stripInvariantGroupMetadata out of existence As a one liner function, I don't think it is pulling its weight in terms of helping readability. llvm-svn: 289987	2016-12-16 20:29:39 +00:00
Adrian Prantl	73ec065604	Revert "[IR] Remove the DIExpression field from DIGlobalVariable." This reverts commit 289920 (again). I forgot to implement a Bitcode upgrade for the case where a DIGlobalVariable has no DIExpression. Unfortunately it is not possible to safely upgrade these variables without adding a flag to the bitcode record indicating which version they are. My plan of record is to roll the planned follow-up patch that adds a unit: field to DIGlobalVariable into this patch before recommitting. This way we only need one Bitcode upgrade for both changes (with a version flag in the bitcode record to safely distinguish the record formats). Sorry for the churn! llvm-svn: 289982	2016-12-16 19:39:01 +00:00
Zachary Turner	d0fffd1d14	Revert "[CodeView] Hook CodeViewRecordIO for reading/writing symbols." This reverts commit r289978, which is failing due to some rebase/merge issues. llvm-svn: 289981	2016-12-16 19:25:23 +00:00
Zachary Turner	a4e7dfbc16	[CodeView] Hook CodeViewRecordIO for reading/writing symbols. This is the 3rd of 3 patches to get reading and writing of CodeView symbol and type records to use a single codepath. Differential Revision: https://reviews.llvm.org/D26427 llvm-svn: 289978	2016-12-16 19:20:35 +00:00
Krzysztof Parzyszek	ea9f8ce03c	Implement LaneBitmask::any(), use it to replace !none(), NFCI llvm-svn: 289974	2016-12-16 19:11:56 +00:00
Sanjoy Das	089c699743	Fix CodeGenPrepare::stripInvariantGroupMetadata `dropUnknownNonDebugMetadata` takes a list of "known" metadata IDs. The only reason it worked at all is that `getMetadataID` returns something unrelated -- it returns the subclass ID of the receiver (which is used in `dyn_cast` etc.). That does not numerically match `LLVMContext::MD_invariant_group` and ends up dropping `invariant_group` along with every other metadata that does not numerically match `LLVMContext::MD_invariant_group`. llvm-svn: 289973	2016-12-16 18:52:33 +00:00
Joel Jones	8980ba643e	Fix name typo in SelectionDAG llvm-svn: 289969	2016-12-16 18:22:54 +00:00
Jun Bum Lim	f9416af191	Revert "[CodeGenPrep] Skip merging empty case blocks" This reverts commit r289951. llvm-svn: 289960	2016-12-16 17:06:14 +00:00
Jun Bum Lim	85347dde27	[CodeGenPrep] Skip merging empty case blocks This is a recommit of r287553 after fixing the invalid loop info after eliminating an empty block: Summary: Merging an empty case block into the header block of switch could cause ISel to add COPY instructions in the header of switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targeting. Reviewers: t.p.northover, mcrosier, manmanren, wmi, joerg, davidxl Subscribers: joerg, qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D22696 llvm-svn: 289951	2016-12-16 16:03:31 +00:00
Krzysztof Parzyszek	36ef5dc3df	[MIRParser] Add parsing hex literals of arbitrary size as unsigned integers The current code does not parse hex literals larger than 32-bit. llvm-svn: 289943	2016-12-16 13:58:01 +00:00
Diana Picus	812caee65a	[ARM] GlobalISel: Select add i32, i32 Add the minimal support necessary to select a function that returns the sum of two i32 values. This includes some support for argument/return lowering of i32 values through registers, as well as the handling of copy and add instructions throughout the GlobalISel pipeline. Differential Revision: https://reviews.llvm.org/D26677 llvm-svn: 289940	2016-12-16 12:54:46 +00:00
Florian Hahn	3c8b8c98b0	[codegen] Add generic functions to skip debug values. Summary: This commit moves skipDebugInstructionsForward and skipDebugInstructionsBackward from lib/CodeGen/IfConversion.cpp to include/llvm/CodeGen/MachineBasicBlock.h and updates some codegen files to use them. This refactoring was suggested in https://reviews.llvm.org/D27688 and I thought it's best to do the refactoring in a separate review, but I could also put both changes in a single review if that's preferred. Also, the names for the functions aren't the snappiest and I would be happy to rename them if anybody has suggestions. Reviewers: eli.friedman, iteratee, aprantl, MatzeB Subscribers: MatzeB, llvm-commits Differential Revision: https://reviews.llvm.org/D27782 llvm-svn: 289933	2016-12-16 11:10:26 +00:00
Adrian Prantl	74a835cda0	[IR] Remove the DIExpression field from DIGlobalVariable. This patch implements PR31013 by introducing a DIGlobalVariableExpression that holds a pair of DIGlobalVariable and DIExpression. Currently, DIGlobalVariable holds a DIExpression. This is not the best way to model this: (1) The DIGlobalVariable should describe the source level variable, not how to get to its location. (2) It makes it unsafe/hard to update the expressions when we call replaceExpression on the DIGlobalVariable. (3) It makes it impossible to represent a global variable that is in more than one location (e.g., a variable with multiple DW_OP_LLVM_fragment-s). We also moved away from attaching the DIExpression to DILocalVariable for the same reasons. This reapplies r289902 with additional testcase upgrades. <rdar://problem/29250149> https://llvm.org/bugs/show_bug.cgi?id=31013 Differential Revision: https://reviews.llvm.org/D26769 llvm-svn: 289920	2016-12-16 04:25:54 +00:00
Chandler Carruth	ba5de63bc3	Add extra headers that got deleted by my revert in r289916 but for which new usage had already grown in the file. llvm-svn: 289917	2016-12-16 04:08:31 +00:00
Chandler Carruth	4154062b69	Revert patch series introducing the DAG combine to match a load-by-bytes idiom. r289538: Match load by bytes idiom and fold it into a single load r289540: Fix a buildbot failure introduced by r289538 r289545: Use more detailed assertion messages in the code ... r289646: Add a couple of assertions to the load combine code ... This DAG combine has a bad crash in it that is quite hard to trigger sadly -- it relies on sneaking code with UB through the SDAG build and into this particular combine. I've responded to the original commit with a test case that reproduces it. However, the code also has other problems that will require substantial changes to address and so I'm going ahead and reverting it for now. This should unblock us and perhaps others that are hitting the crash in the wild and will let a fresh patch with updated approach come in cleanly afterward. Sorry for any trouble or disruption! llvm-svn: 289916	2016-12-16 04:05:22 +00:00
Adrian Prantl	03c6d31a3b	Revert "[IR] Remove the DIExpression field from DIGlobalVariable." This reverts commit 289902 while investigating bot breakage. llvm-svn: 289906	2016-12-16 01:00:30 +00:00
Adrian Prantl	ce13935776	[IR] Remove the DIExpression field from DIGlobalVariable. This patch implements PR31013 by introducing a DIGlobalVariableExpression that holds a pair of DIGlobalVariable and DIExpression. Currently, a DIGlobalVariable holds a DIExpression. This is not the best way to model this: (1) The DIGlobalVariable should describe the source level variable, not how to get to its location. (2) It makes it unsafe/hard to update the expressions when we call replaceExpression on the DIGlobalVariable. (3) It makes it impossible to represent a global variable that is in more than one location (e.g., a variable with multiple DW_OP_LLVM_fragment-s). We also moved away from attaching the DIExpression to DILocalVariable for the same reasons. <rdar://problem/29250149> https://llvm.org/bugs/show_bug.cgi?id=31013 Differential Revision: https://reviews.llvm.org/D26769 llvm-svn: 289902	2016-12-16 00:36:43 +00:00
David Blaikie	38b74bf249	DebugInfo: Address non-deterministic output (iterating a SmallPtrSet) in 289697 Post-commit review feedback from Adrian Prantl. Hopefully this fixes that up :) llvm-svn: 289892	2016-12-15 23:37:38 +00:00
Quentin Colombet	327f942876	[IRTranslator] Merge the entry and ABI lowering blocks. The IRTranslator uses an additional block before the LLVM-IR entry block to perform all the ABI lowering and the constant hoisting. Thus, this block is the actual entry block and it falls through the LLVM-IR entry block. However, with such representation, we end up with two basic blocks that are not maximal. Therefore, this patch adds a bit of canonicalization by merging both the LLVM-IR entry block and the ABI lowering/constants hoisting into one block, making the resulting block more likely to be maximal (indeed the LLVM-IR entry block might not have been maximal). llvm-svn: 289891	2016-12-15 23:32:25 +00:00
David Blaikie	3e3eb33ed7	DebugInfo: Emit ranges for functions with DISubprograms but lacking locations on any instructions This seems more consistent, and helps tidy up/simplify some other code in this change. llvm-svn: 289889	2016-12-15 23:17:52 +00:00
Eli Friedman	379294676d	Don't combine splats with other shuffles. We sometimes end up creating shuffles which are worse than the obvious translation of the IR. Fixes https://llvm.org/bugs/show_bug.cgi?id=31301 . Differential Revision: https://reviews.llvm.org/D27793 llvm-svn: 289882	2016-12-15 22:41:40 +00:00
Eli Friedman	34505083c6	Don't combine a shuffle of two BUILD_VECTORs with duplicate elements. Targets can't handle this case well in general; we often transform a shuffle of two cheap BUILD_VECTORs to element-by-element insertion, which is very inefficient. Fixes https://llvm.org/bugs/show_bug.cgi?id=31364 . Partially fixes https://llvm.org/bugs/show_bug.cgi?id=31301. Differential Revision: https://reviews.llvm.org/D27787 llvm-svn: 289874	2016-12-15 21:36:59 +00:00
Geoff Berry	66d1f0ff1f	[LiveRangeEdit] Change eliminateDeadDef assert to if condition. The assert could potentially fire (though no cases have been encountered), so just check that the instruction we're handling specially for rematerialization only has one def to begin with. Reviewed by Wei Mi over email. llvm-svn: 289861	2016-12-15 19:55:19 +00:00
Krzysztof Parzyszek	91b5cf8412	Extract LaneBitmask into a separate type Specifically avoid implicit conversions from/to integral types to avoid potential errors when changing the underlying type. For example, a typical initialization of a "full" mask was "LaneMask = ~0u", which would result in a value of 0x00000000FFFFFFFF if the type was extended to uint64_t. Differential Revision: https://reviews.llvm.org/D27454 llvm-svn: 289820	2016-12-15 14:36:06 +00:00
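The widening pitfall described in the LaneBitmask commit above can be reproduced in plain C++. The following is a minimal sketch, assuming a hypothetical `Mask` wrapper that is illustrative only and is not LLVM's actual `LaneBitmask` class: it shows why `~0u` stops being a "full" mask once the underlying type grows, and how an explicit constructor plus a `getAll()` factory avoids the problem.

```cpp
#include <cassert>
#include <cstdint>

// The pitfall: ~0u is a 32-bit value, so converting it to a 64-bit type
// zero-extends it and leaves the upper 32 bits clear.
constexpr uint64_t widenedAllOnes() { return ~0u; }

// Hypothetical strong mask type (an assumption for illustration, not the
// LLVM class). The constructor is explicit, so plain integers never
// convert to a mask implicitly.
template <typename T> class Mask {
  T Bits;

public:
  constexpr explicit Mask(T B) : Bits(B) {}

  // "Full" and "empty" masks are derived from T itself, so they stay
  // correct when T is changed to a wider type.
  static constexpr Mask getAll() { return Mask(static_cast<T>(~T(0))); }
  static constexpr Mask getNone() { return Mask(T(0)); }

  constexpr T raw() const { return Bits; }
  constexpr bool operator==(Mask O) const { return Bits == O.Bits; }
};

static_assert(widenedAllOnes() == 0x00000000FFFFFFFFull,
              "~0u does not fill a 64-bit mask");
static_assert(Mask<uint64_t>::getAll().raw() == 0xFFFFFFFFFFFFFFFFull,
              "getAll() fills every bit of the underlying type");
```

With this shape, an assignment such as `Mask<uint64_t> M = ~0u;` does not compile, which is the kind of error the commit message says it wants to rule out.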
Prakhar Bahuguna	52a7dd7d78	[ARM] Implement execute-only support in CodeGen This implements execute-only support for ARM code generation, which prevents the compiler from generating data accesses to code sections. The following changes are involved: * Add the CodeGen option "-arm-execute-only" to the ARM code generator. * Add the clang flag "-mexecute-only" as well as the GCC-compatible alias "-mpure-code" to enable this option. * When enabled, literal pools are replaced with MOVW/MOVT instructions, with VMOV used in addition for floating-point literals. As the MOVT instruction is required, execute-only support is only available in Thumb mode for targets supporting ARMv8-M baseline or Thumb2. * Jump tables are placed in data sections when in execute-only mode. * The execute-only text section is assigned section ID 0, and is marked as unreadable with the SHF_ARM_PURECODE flag with symbol 'y'. This also overrides selection of ELF sections for globals. llvm-svn: 289784	2016-12-15 07:59:08 +00:00
Hal Finkel	34f9d6ac11	Trying to fix NDEBUG build after r289764 llvm-svn: 289766	2016-12-15 05:33:19 +00:00
Sanjoy Das	d7389d6261	[MachineBlockPlacement] Don't make blocks "uneditable" Summary: This fixes an issue with MachineBlockPlacement due to a badly timed call to `analyzeBranch` with `AllowModify` set to true. The timeline is as follows: 1. `MachineBlockPlacement::maybeTailDuplicateBlock` calls `TailDup.shouldTailDuplicate` on its argument, which in turn calls `analyzeBranch` with `AllowModify` set to true. 2. This `analyzeBranch` call edits the terminator sequence of the block based on the physical layout of the machine function, turning an unanalyzable non-fallthrough block to a unanalyzable fallthrough block. Normally MBP bails out of rearranging such blocks, but this block was unanalyzable non-fallthrough (and thus rearrangeable) the first time MBP looked at it, and so it goes ahead and decides where it should be placed in the function. 3. When placing this block MBP fails to analyze and thus update the block in keeping with the new physical layout. Concretely, before (1) we have something like: ``` LBL0: < unknown terminator op that may branch to LBL1 > jmp LBL1 LBL1: ... A LBL2: ... B ``` In (2), analyze branch simplifies this to ``` LBL0: < unknown terminator op that may branch to LBL2 > ;; jmp LBL1 <- redundant jump removed LBL1: ... A LBL2: ... B ``` In (3), MachineBlockPlacement goes ahead with its plan of putting LBL2 after the first block since that is profitable. ``` LBL0: < unknown terminator op that may branch to LBL2 > ;; jmp LBL1 <- redundant jump LBL2: ... B LBL1: ... A ``` and the program now has incorrect behavior (we no longer fall-through from `LBL0` to `LBL1`) because MBP can no longer edit LBL0. There are several possible solutions, but I went with removing the teeth off of the `analyzeBranch` calls in TailDuplicator. That makes thinking about the result of these calls easier, and breaks nothing in the lit test suite. 
I've also added some bookkeeping to the MachineBlockPlacement pass and used that to write an assert that would have caught this. Reviewers: chandlerc, gberry, MatzeB, iteratee Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D27783 llvm-svn: 289764	2016-12-15 05:08:57 +00:00
Sanjay Patel	afee21a5b2	[DAG] allow more select folding for targets that have 'and not' (PR31175) The original motivation for this patch comes from wanting to canonicalize more IR to selects and also canonicalizing min/max. If we're going to do that, we need more backend fixups to undo select codegen when simpler ops will do. I chose AArch64 for the tests because that shows the difference in the simplest way. This should fix: https://llvm.org/bugs/show_bug.cgi?id=31175 Differential Revision: https://reviews.llvm.org/D27489 llvm-svn: 289738	2016-12-14 22:59:14 +00:00
David Blaikie	b461468958	DebugInfo: Improve type safety and simplify some subprogram finalization code This probably ended up this way after the subprogram<>function link inversion and debug info metadata schema changes. llvm-svn: 289697	2016-12-14 19:38:39 +00:00
Andrew Kaylor	ce3bcae632	[WinEH] Avoid holding references to BlockColor (DenseMap) entries while inserting new elements Differential Revision: https://reviews.llvm.org/D27693 llvm-svn: 289694	2016-12-14 19:30:18 +00:00
Nirav Dave	f5bf03c7ef	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." Reverting due to ARM MCJIT and MIPS LLD error. This reverts commit r289659. llvm-svn: 289667	2016-12-14 16:43:44 +00:00
Nirav Dave	8527ab0ad2	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Retrying after removing load-store factoring through token factors in favor of improved token factor operand pruning Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself?
CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 289659	2016-12-14 15:44:26 +00:00
Simon Pilgrim	05ab8ffc7e	[DAGCombiner] Try to use SelectionDAG::isKnownToBeAPowerOfTwo instead of just APInt::isPowerOf2 Generalize sdiv/udiv/srem/urem combines using APInt::isPowerOf2, which only works for const/splat-const values, to call SelectionDAG::isKnownToBeAPowerOfTwo instead which recognises many more cases. Added a DAGCombiner::BuildLogBase2 helper since PowerOf2 combines often involve taking the log2 of such a value. Differential Revision: https://reviews.llvm.org/D27714 llvm-svn: 289654	2016-12-14 15:08:13 +00:00
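The combines mentioned in the commit above rest on a simple arithmetic identity: when the divisor is a power of two, unsigned division becomes a right shift by its log2 and unsigned remainder becomes a mask. The sketch below illustrates only that identity on plain 32-bit integers. The helper names `udivPow2` and `uremPow2` are hypothetical, and the code does not use or model the SelectionDAG API.

```cpp
#include <cassert>
#include <cstdint>

// True when V has exactly one bit set.
constexpr bool isPowerOf2(uint32_t V) { return V != 0 && (V & (V - 1)) == 0; }

// log2 of a power of two, i.e. the index of its single set bit.
inline unsigned logBase2(uint32_t V) {
  unsigned L = 0;
  while (V >>= 1)
    ++L;
  return L;
}

// X / D for a power-of-two D, written as a shift (hypothetical helper).
inline uint32_t udivPow2(uint32_t X, uint32_t D) {
  assert(isPowerOf2(D) && "divisor must be a power of two");
  return X >> logBase2(D);
}

// X % D for a power-of-two D, written as a mask (hypothetical helper).
inline uint32_t uremPow2(uint32_t X, uint32_t D) {
  assert(isPowerOf2(D) && "divisor must be a power of two");
  return X & (D - 1);
}
```

The point of the generalization is that the divisor need not be a literal constant: a value such as `1u << N` is known to be a power of two even though `N` is only available at run time, so the same rewrite applies.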
Stephan Bergmann	17c7f70362	Replace APFloatBase static fltSemantics data members with getter functions At least the plugin used by the LibreOffice build (<https://wiki.documentfoundation.org/Development/Clang_plugins>) indirectly uses those members (through inline functions in LLVM/Clang include files in turn using them), but they are not exported by utils/extract_symbols.py on Windows, and accessing data across DLL/EXE boundaries on Windows is generally problematic. Differential Revision: https://reviews.llvm.org/D26671 llvm-svn: 289647	2016-12-14 11:57:17 +00:00
Artur Pilipenko	f3ee444010	Add a couple of assertions to the load combine code introduced by r289538 llvm-svn: 289646	2016-12-14 11:55:47 +00:00
Paul Robinson	8fec3da00c	[DWARF] Preserve column number when emitting 'line 0' record Follow-up to r289256, address a FIXME to avoid resetting the column number. This reduced .debug_line by 2.6% in a RelWithDebInfo self-build of clang. llvm-svn: 289620	2016-12-14 00:27:35 +00:00
Alina Sbirlea	77c5eaaeda	Generalize strided store pattern in interleave access pass Summary: This patch aims to generalize matching of the strided store accesses to more general masks. The more general rule is to have consecutive accesses based on the stride: [x, y, ... z, x+1, y+1, ...z+1, x+2, y+2, ...z+2, ...] All elements in the masks need not form a contiguous space, there may be gaps. As before, undefs are allowed and filled in with adjacent element loads. Reviewers: HaoLiu, mssimpso Subscribers: mkuper, delena, llvm-commits Differential Revision: https://reviews.llvm.org/D23646 llvm-svn: 289573	2016-12-13 19:32:36 +00:00
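The mask rule quoted in the commit above, `[x, y, ... z, x+1, y+1, ...z+1, x+2, ...]`, can be checked with a short loop. This is a simplified sketch under stated assumptions, not the pass's actual implementation: the function name `isStridedStoreMask` is hypothetical, `-1` stands for an undef element, and `Factor` is the number of interleaved lanes (the stride).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true when, for every lane J in [0, Factor), the elements
// Mask[J], Mask[J + Factor], Mask[J + 2*Factor], ... are consecutive
// integers. Undef elements (-1) match anything. The lanes' starting
// values need not be contiguous with each other, so gaps are allowed.
inline bool isStridedStoreMask(const std::vector<int> &Mask, unsigned Factor) {
  if (Factor == 0 || Mask.empty() || Mask.size() % Factor != 0)
    return false;
  const std::size_t Rows = Mask.size() / Factor;

  for (unsigned J = 0; J < Factor; ++J) {
    bool HaveStart = false;
    int Start = 0; // value this lane would have in row 0
    for (std::size_t I = 0; I < Rows; ++I) {
      const int Elt = Mask[J + I * Factor];
      if (Elt < 0)
        continue; // undef: no constraint
      if (!HaveStart) {
        Start = Elt - static_cast<int>(I);
        if (Start < 0)
          return false; // the lane would begin before element 0
        HaveStart = true;
      } else if (Elt != Start + static_cast<int>(I)) {
        return false; // lane is not consecutive
      }
    }
  }
  return true;
}
```

For example, `{0, 8, 1, 9, 2, 10, 3, 11}` with `Factor = 2` is accepted even though elements 4 to 7 are never referenced, whereas `{0, 4, 1, 6}` is rejected because the second lane jumps from 4 to 6.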
Artur Pilipenko	469fcd2afd	Use more detailed assertion messages in the code introduced by r289538 llvm-svn: 289545	2016-12-13 16:26:15 +00:00
Artur Pilipenko	79d1255e26	Fix a buildbot failure introduced by r289538 Build failed because of unused variable in product mode. llvm-svn: 289540	2016-12-13 14:55:31 +00:00
Artur Pilipenko	c93cc5955f	[DAGCombiner] Match load by bytes idiom and fold it into a single load Match a pattern where a wide type scalar value is loaded by several narrow loads and combined by shifts and ors. Fold it into a single load or a load and a bswap if the target supports it. Assuming little endian target: i8 a = ... i32 val = a[0] | (a[1] << 8) | (a[2] << 16) | (a[3] << 24) => i32 val = ((i32)a) i8 a = ... i32 val = (a[0] << 24) | (a[1] << 16) | (a[2] << 8) | a[3] => i32 val = BSWAP(((i32)a)) This optimization was discussed on llvm-dev some time ago in "Load combine pass" thread. We came to the conclusion that we want to do this transformation late in the pipeline because in presence of atomic loads load widening is irreversible transformation and it might hinder other optimizations. Eventually we'd like to support folding patterns like this where the offset has a variable and a constant part: i32 val = a[i] | (a[i + 1] << 8) | (a[i + 2] << 16) | (a[i + 3] << 24) Matching the pattern above is easier at SelectionDAG level since address reassociation has already happened and the fact that the loads are adjacent is clear. Understanding that these loads are adjacent at IR level would have involved looking through geps/zexts/adds while looking at the addresses. The general scheme is to match OR expressions by recursively calculating the origin of individual bits which constitute the resulting OR value. If all the OR bits come from memory verify that they are adjacent and match with little or big endian encoding of a wider value. If so and the load of the wider type (and bswap if needed) is allowed by the target generate a load and a bswap if needed. Reviewed By: hfinkel, RKSimon, filcab Differential Revision: https://reviews.llvm.org/D26149 llvm-svn: 289538	2016-12-13 14:21:14 +00:00
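The two source patterns named in the commit above can be written out in C++ to show what the combine is expected to recognize. This is a sketch of the idiom only, not of the DAG combine itself, and all function names are assumptions chosen for illustration. `loadLE` and `loadBE` are the byte-by-byte forms, `loadNative` is the single wide load they may be folded into, and `bswap32` is the byte swap needed when the byte order of the pattern and of the target differ.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Little-endian byte-by-byte load: a[0] | a[1] << 8 | a[2] << 16 | a[3] << 24.
inline uint32_t loadLE(const uint8_t *A) {
  return uint32_t(A[0]) | (uint32_t(A[1]) << 8) | (uint32_t(A[2]) << 16) |
         (uint32_t(A[3]) << 24);
}

// Big-endian byte-by-byte load: a[0] << 24 | a[1] << 16 | a[2] << 8 | a[3].
inline uint32_t loadBE(const uint8_t *A) {
  return (uint32_t(A[0]) << 24) | (uint32_t(A[1]) << 16) |
         (uint32_t(A[2]) << 8) | uint32_t(A[3]);
}

// Portable 32-bit byte swap.
inline uint32_t bswap32(uint32_t V) {
  return (V >> 24) | ((V >> 8) & 0x0000FF00u) | ((V << 8) & 0x00FF0000u) |
         (V << 24);
}

// A single wide load in the host's byte order.
inline uint32_t loadNative(const uint8_t *A) {
  uint32_t V;
  std::memcpy(&V, A, sizeof(V));
  return V;
}

// Detects the host byte order at run time.
inline bool isLittleEndianHost() {
  const uint32_t One = 1;
  uint8_t First;
  std::memcpy(&First, &One, 1);
  return First == 1;
}
```

On a little-endian host `loadLE` equals `loadNative` and `loadBE` equals `bswap32(loadNative(...))`; on a big-endian host the roles are reversed. The relation `loadBE(a) == bswap32(loadLE(a))` holds on either.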
Artur Pilipenko	01e86444a0	Move BaseIndexOffset in DAGCombiner.cpp so it will be available for the upcoming user llvm-svn: 289537	2016-12-13 14:16:02 +00:00
Simon Pilgrim	9dc67c0101	[SelectionDAG] computeKnownBits - simplified knownbits sign extension. NFCI. We don't need to extract+test the sign bit of the known ones/zeros, we can use sext which will handle all of this. llvm-svn: 289534	2016-12-13 13:36:27 +00:00
Diana Picus	2d9adbf524	[GlobalISel] Move extendRegister where it belongs. NFCI Apparently I missed this one when I moved ValueHandler back in r288658. Sorry! llvm-svn: 289528	2016-12-13 10:46:12 +00:00
Philip Reames	1f1bbac8da	[peephole] Enhance folding logic to work for STATEPOINTs The general idea here is to get enough of the existing restrictions out of the way that the already existing folding logic in foldMemoryOperand can kick in for STATEPOINTs and fold references to immutable stack slots. The key changes are: (1) Support for folding multiple operands at once which reference the same load. (2) Support for folding multiple loads into a single instruction. (3) Walk all the operands of the instruction for variadic instructions (this is a bug fix!). Once this lands, I'll post another patch which refactors the TII interface here. There's nothing actually x86 specific about the x86 code used here. Differential Revision: https://reviews.llvm.org/D24103 llvm-svn: 289510	2016-12-13 01:38:41 +00:00
Philip Reames	51387a8c28	[Statepoints] Reuse stack slots more than once within a basic block The stack slot reuse code had a really amusing bug. We ended up only reusing a stack slot exactly once (initial use + reuse) within a basic block. If we had a third statepoint to process, we ended up allocating a new set of stack slots. If we crossed a basic block boundary, the set got cleared. As a result, code which is invoke heavy doesn't see the problem, but multiple calls within a basic block do. Net result: as we optimize invokes into calls, lowering gets worse. The root error here is that the bitmap used by the custom allocator wasn't kept in sync. The result was that we ended up resizing the bitmap on the next statepoint (to handle the cross block case), reset the bit once, but then never reset it again. Differential Revision: https://reviews.llvm.org/D25243 llvm-svn: 289509	2016-12-13 01:21:15 +00:00
Andrew Kaylor	ff6a1edfa8	Avoid infinite loops in branch folding Differential Revision: https://reviews.llvm.org/D27582 llvm-svn: 289486	2016-12-12 23:05:38 +00:00
Paul Robinson	ac7fe5e0c4	Recommit r288212: Emit 'no line' information for interesting 'orphan' instructions. DWARF specifies that "line 0" really means "no appropriate source location" in the line table. By default, use this for branch targets and some other cases that have no specified source location, to prevent inheriting unfortunate line numbers from physically preceding instructions (which might be from completely unrelated source). Updated patch allows enabling or suppressing this behavior for all unspecified source locations. Differential Revision: http://reviews.llvm.org/D24180 llvm-svn: 289468	2016-12-12 20:49:11 +00:00
Geoff Berry	d73420d591	[LiveRangeEdit] Add assert string and descriptive comment. llvm-svn: 289456	2016-12-12 19:12:41 +00:00
Simon Pilgrim	040a36c176	[SelectionDAG] Add support for EXTRACT_SUBVECTOR to ComputeNumSignBits Pre-commit as discussed on D27657 llvm-svn: 289425	2016-12-12 10:29:43 +00:00
Sebastian Pop	e08d9c7c87	instr-combiner: sum up all latencies of the transformed instructions We have found that -- when the selected subarchitecture has a scheduling model and we are not optimizing for size -- the machine-instruction combiner uses a too-simple algorithm to compute the cost of one of the two alternatives [before and after running a combining pass on a section of code], and therefore it throws away the combination results too often. This fix has the potential to help any ISA with the potential to combine instructions and for which at least one subarchitecture has a scheduling model. As of now, this is only known to definitely affect AArch64 subarchitectures with a scheduling model. Regression tested on AMD64/GNU-Linux, new test case tested to fail on an unpatched compiler and pass on a patched compiler. Patch by Abe Skolnik and Sebastian Pop. llvm-svn: 289399	2016-12-11 19:39:32 +00:00
Simon Pilgrim	54945a12ec	[SelectionDAG] Add ability for computeKnownBits to peek through bitcasts from 'large element' scalar/vector to 'small element' vector. Extension to D27129 which already supported bitcasts from 'small element' vector to 'large element' scalar/vector types. llvm-svn: 289329	2016-12-10 17:00:00 +00:00
Adrian Prantl	8fafb8d378	Fix LLVM's use of DW_OP_bit_piece in DWARF expressions. LLVM's use of DW_OP_bit_piece is incorrect and based on a misunderstanding of the wording in the DWARF specification. The offset argument of DW_OP_bit_piece refers to the offset into the location that is on the top of the DWARF expression stack, and not an offset into the source variable. This has since also been clarified in the DWARF specification. This patch fixes all uses of DW_OP_bit_piece to emit the correct offset and simplifies the DwarfExpression class to semi-automatically emit empty DW_OP_pieces to adjust the offset of the source variable, thus simplifying the code using DwarfExpression. While this is an incompatible bugfix, in practice I don't expect this to be much of a problem since LLVM's old interpretation and the correct interpretation of DW_OP_bit_piece differ only when there are gaps in the fragmented locations of the described variables or if individual fragments are smaller than a byte. LLDB at least won't interpret locations with gaps in them because it has no way to present undefined bits in a variable, and there is a high probability that an old-form expression will be malformed when interpreted correctly, because the DW_OP_bit_piece offset will be outside of the location at the top of the stack. As a nice side-effect, this patch enables us to use a more efficient encoding for subregisters: In order to express a sub-register at a non-zero offset we now use a DW_OP_bit_piece instead of shifting the value into place manually. This patch also adds missing test coverage for code paths that weren't exercised before. <rdar://problem/29335809> Differential Revision: https://reviews.llvm.org/D27550 llvm-svn: 289266	2016-12-09 20:43:40 +00:00
Paul Robinson	4fa7b57a1f	[DWARF] Suppress .loc directives from CFI instructions Like DBG_VALUE, these emit nothing to the .text section, and sometimes have no source location specified. Just ignore them. Differential Revision: http://reviews.llvm.org/D27492 llvm-svn: 289256	2016-12-09 19:15:32 +00:00
Simon Pilgrim	017b7a71d8	[SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes (REAPPLIED) Reapplied with fix for PR31323 - X86 SSE2 vXi16 multiplies for illegal types were creating CONCAT_VECTORS nodes with vector inputs that might not total the number of elements in the result type. llvm-svn: 289232	2016-12-09 17:53:11 +00:00
Matt Arsenault	38d8ed2b75	AMDGPU: Fix i128 mul llvm-svn: 289231	2016-12-09 17:49:14 +00:00
Nirav Dave	bedb5d906c	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r289221 which appears to be triggering an assertion llvm-svn: 289226	2016-12-09 17:18:24 +00:00
Nirav Dave	fd51ff4fd8	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Retrying after fixing overly aggressive load-store forwarding optimization. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself?
CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 289221	2016-12-09 16:15:12 +00:00
Simon Pilgrim	b9eb99f570	Use SelectionDAG.getSplatBuildVector helper. NFCI. llvm-svn: 289220	2016-12-09 16:01:50 +00:00
Simon Pilgrim	bf9c0e7434	[SelectionDAG] Use SelectionDAG.getBuildVector helper. NFCI. Makes interception of BUILD_VECTOR creation easier for debugging. llvm-svn: 289218	2016-12-09 15:23:41 +00:00
Simon Pilgrim	15f1f828b5	[SelectionDAG] Add additional checks to CONCAT_VECTORS creation Part of the work for PR31323 - add extra asserts checking that the input vectors are of consistent type and result in the correct number of vector elements. llvm-svn: 289214	2016-12-09 14:27:52 +00:00
Benjamin Kramer	eedc4059c3	Plug another leak in the DWARF unittests, DIEInlineStrings are never destroyed. llvm-svn: 289208	2016-12-09 13:33:41 +00:00
Simon Pilgrim	e4050a2961	[SelectionDAG] Add partial BITCAST support to computeKnownBits Adds support for bitcasting a little endian 'small element' vector to 'large element' scalar/vector (e.g. v16i8 to v4i32 or v2i32 to i64), which is required for PR30845. We extract the knownbits for each 'small element' part and concatenate the results together. We can add support for big endian and 'large element' scalar/vector to 'small element' vector bitcasting once we have test cases for them. Differential Revision: https://reviews.llvm.org/D27129 llvm-svn: 289200	2016-12-09 10:13:45 +00:00
Daniel Jasper	f51e05ffbc	Revert "[SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes" This reverts commit r288916 as it is currently causing a crasher in Halide. Reproducer on llvm.org/PR31323. While it might be that halide is generating invalid IR, llc shouldn't crash. llvm-svn: 289194	2016-12-09 09:04:51 +00:00
Tim Northover	b58346f2f2	GlobalISel: fall back gracefully for debug intrinsics. Supporting them properly is a reasonably complex chunk of work, so to allow bot testing before then we should at least be able to fall back to DAG ISel. llvm-svn: 289150	2016-12-08 22:44:13 +00:00
Tim Northover	1e656ec137	GlobalISel: factor overflow handling into separate function. NFC. llvm-svn: 289149	2016-12-08 22:44:00 +00:00
Reid Kleckner	785e7d282c	Don't emit .seh_handler directives for any cleanup funclets We were falsely claiming that we had an LSDA for the relevant EH personality before this change, which could lead to the EH machinery interpreting random adjacent data as an LSDA. Fixes PR31317 This change is safe because cleanups can't contain exception handlers today. We do these things to maintain that invariant: - C++ destructors are naturally out-of-line - __finally blocks are outlined in clang - LLVM's inliner will not inline EH constructs into cleanups llvm-svn: 289101	2016-12-08 20:38:46 +00:00
NAKAMURA Takumi	689493bb12	Prune unused libdeps. llvm-svn: 289060	2016-12-08 15:28:02 +00:00
Nicolai Haehnle	f08dc90253	[SelectionDAG] Add expansion and promotion of [US]MUL_LOHI Summary: Most targets set the action for these nodes to Expand even though there isn't actually any code for them in ExpandNode. Instead, targets simply relied on the fact that no code generates these nodes as long as the nodes aren't legal or custom. However, generating these nodes can be useful e.g. for divide-by-constant in wider integer types. Expand of [US]MUL_LOHI will use MULH[US] when legal or custom, and a sequence of half-width multiplications otherwise. Promote uses a wider multiply. This patch intends to not change the generated code, but indirect effects are possible since expansions/promotions that were previously done in DAGCombine may now be done in LegalizeDAG. See D24822 for a change that actually uses the new expansion. Reviewers: spatel, bkramer, venkatra, efriedma, hfinkel, ast, nadav, tstellarAMD Subscribers: arsenm, jyknight, nemanjai, wdng, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D24956 llvm-svn: 289050	2016-12-08 14:08:14 +00:00
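The commit above says that, when MULH[US] is unavailable, [US]MUL_LOHI is expanded into a sequence of half-width multiplications. The arithmetic behind such an expansion can be sketched for the unsigned 32-bit case using 16-bit halves. This is an illustration of the schoolbook decomposition only; the exact node sequence the legalizer emits may differ, and `LoHi` and `umulLoHi32` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Low and high 32-bit halves of a 64-bit product.
struct LoHi {
  uint32_t Lo;
  uint32_t Hi;
};

// Computes the full 64-bit product of A and B using only 32-bit
// arithmetic, by splitting each operand into 16-bit halves:
//   A * B = HH * 2^32 + (LH + HL) * 2^16 + LL
inline LoHi umulLoHi32(uint32_t A, uint32_t B) {
  const uint32_t AL = A & 0xFFFFu, AH = A >> 16;
  const uint32_t BL = B & 0xFFFFu, BH = B >> 16;

  // Four half-width products; each fits in 32 bits.
  const uint32_t LL = AL * BL;
  const uint32_t LH = AL * BH;
  const uint32_t HL = AH * BL;
  const uint32_t HH = AH * BH;

  // Bits 16..31 of the result, plus the carry into the high word.
  // The sum is at most 3 * 0xFFFF, so it cannot overflow 32 bits.
  const uint32_t Mid = (LL >> 16) + (LH & 0xFFFFu) + (HL & 0xFFFFu);

  LoHi R;
  R.Lo = (LL & 0xFFFFu) | (Mid << 16);
  R.Hi = HH + (LH >> 16) + (HL >> 16) + (Mid >> 16);
  return R;
}
```

The result can be cross-checked against a native 64-bit multiply, `uint64_t(A) * B`, which is what the promotion path described in the commit amounts to for this width.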
Daniel Jasper	0f77869d58	Move DwarfGenerator.cpp to unittests So far it creates a test helper and so it should be moved there. It also creates a layering cycle between CodeGen and CodeGen/AsmPrinter, which should be avoided. Review: https://reviews.llvm.org/D27570 llvm-svn: 289044	2016-12-08 12:45:29 +00:00
Simon Pilgrim	413c8e217f	Wdocumentation fix llvm-svn: 289038	2016-12-08 10:41:41 +00:00
Keno Fischer	d4ea4c18f1	Revert "[CodeGen] Fix invalid DWARF info on Win64" Appears to break on build bots. Reverting pending investigation. llvm-svn: 289014	2016-12-08 01:56:23 +00:00
Keno Fischer	460218fb7d	[CodeGen] Fix invalid DWARF info on Win64 The relocations for `DIEEntry::EmitValue` were wrong for Win64 (emitting FK_Data_4 instead of FK_SecRel_4). This corrects that oversight so that the DWARF data is correct in Win64 COFF files. Fixes PR15393. Patch by Jameson Nash <jameson@juliacomputing.com> based on a patch by David Majnemer. Differential Revision: https://reviews.llvm.org/D21731 llvm-svn: 289013	2016-12-08 01:40:21 +00:00
Greg Clayton	3462a420d1	Make a DWARF generator so we can unit test DWARF APIs with gtest. The only tests we have for the DWARF parser are the tests that use llvm-dwarfdump and expect output from textual dumps. More DWARF parser modifications are coming in the next few weeks and I wanted to add tests that can verify that we can encode and decode all form types, as well as test some other basic DWARF APIs where we ask DIE objects for their children and siblings. DwarfGenerator.cpp was added in the lib/CodeGen directory. This file contains the code necessary to easily create DWARF for tests:
    dwarfgen::Generator DG;
    Triple Triple("x86_64--");
    bool success = DG.init(Triple, Version);
    if (!success)
      return;
    dwarfgen::CompileUnit &CU = DG.addCompileUnit();
    dwarfgen::DIE CUDie = CU.getUnitDIE();
    CUDie.addAttribute(DW_AT_name, DW_FORM_strp, "/tmp/main.c");
    CUDie.addAttribute(DW_AT_language, DW_FORM_data2, DW_LANG_C);
    dwarfgen::DIE SubprogramDie = CUDie.addChild(DW_TAG_subprogram);
    SubprogramDie.addAttribute(DW_AT_name, DW_FORM_strp, "main");
    SubprogramDie.addAttribute(DW_AT_low_pc, DW_FORM_addr, 0x1000U);
    SubprogramDie.addAttribute(DW_AT_high_pc, DW_FORM_addr, 0x2000U);
    dwarfgen::DIE IntDie = CUDie.addChild(DW_TAG_base_type);
    IntDie.addAttribute(DW_AT_name, DW_FORM_strp, "int");
    IntDie.addAttribute(DW_AT_encoding, DW_FORM_data1, DW_ATE_signed);
    IntDie.addAttribute(DW_AT_byte_size, DW_FORM_data1, 4);
    dwarfgen::DIE ArgcDie = SubprogramDie.addChild(DW_TAG_formal_parameter);
    ArgcDie.addAttribute(DW_AT_name, DW_FORM_strp, "argc");
    // ArgcDie.addAttribute(DW_AT_type, DW_FORM_ref4, IntDie);
    ArgcDie.addAttribute(DW_AT_type, DW_FORM_ref_addr, IntDie);
    StringRef FileBytes = DG.generate();
    MemoryBufferRef FileBuffer(FileBytes, "dwarf");
    auto Obj = object::ObjectFile::createObjectFile(FileBuffer);
    EXPECT_TRUE((bool)Obj);
    DWARFContextInMemory DwarfContext(*Obj.get());
This code is backed by the AsmPrinter code that emits DWARF for the actual compiler. While adding unit tests it was discovered that DIEValue objects that used DIEEntry as their values had bugs where DW_FORM_ref1, DW_FORM_ref2, DW_FORM_ref8, and DW_FORM_ref_udata forms were not supported. These are all now supported. Added support for DW_FORM_string so we can emit inlined C strings. Centralized the code to unique abbreviations into a new DIEAbbrevSet class and made both the dwarfgen::Generator and the llvm::DwarfFile classes use the new class. Fixed comments in the llvm::DIE class so that the Offset is known to be the compile/type unit offset. DIEInteger now supports more DW_FORM values. There are also unit tests that cover encoding and decoding all form types and values, and encoding and decoding all reference types (DW_FORM_ref1, DW_FORM_ref2, DW_FORM_ref4, DW_FORM_ref8, DW_FORM_ref_udata, DW_FORM_ref_addr), including cross compile unit references that go forward one compile unit and backward one compile unit. Differential Revision: https://reviews.llvm.org/D27326 llvm-svn: 289010	2016-12-08 01:03:48 +00:00
Matthias Braun	e2d2ead661	TargetPassConfig: Rename DisablePostRA -> DisablePostRASched; NFC llvm-svn: 289003	2016-12-08 00:16:08 +00:00
Matthias Braun	0c989a893b	LivePhysReg: Use reference instead of pointer in init(); NFC llvm-svn: 289002	2016-12-08 00:15:51 +00:00
Quentin Colombet	ae3168da3f	[InlineSpiller] Don't call TargetInstrInfo::foldMemoryOperand with an empty list. Since r287792 if we try to do that we will hit an assert. llvm-svn: 289001	2016-12-08 00:06:51 +00:00
Tim Northover	c53606ef02	GlobalISel: use correct builder for ConstantExprs. ConstantExpr instances were emitting code into the current block rather than the entry block. This meant they didn't necessarily dominate all uses, which is clearly wrong. llvm-svn: 288985	2016-12-07 21:29:15 +00:00
Tim Northover	50db7f416c	GlobalISel: store the current MachineFunction as direct state. NFC. Having to ask the MIRBuilder for the current function is a little awkward, and I'm intending to improve how that's threaded through anyway. llvm-svn: 288983	2016-12-07 21:17:47 +00:00
Tim Northover	05cc4859ad	GlobalISel: simplify MachineIRBuilder interface. MachineIRBuilder had weird before/after and beginning/end flags for the insert point. Unfortunately the non-default means that instructions will be inserted in reverse order which is almost never what anyone wants. Really, I think we just want (like IRBuilder has) the ability to insert at any C++ iterator-style point (i.e. before any instruction or before MBB.end()). So this fixes MIRBuilders to behave like IRBuilders in this respect. llvm-svn: 288980	2016-12-07 21:05:38 +00:00
Simon Pilgrim	ba05d41095	[SelectionDAG] Add knownbits support for vector demandedelts in SMAX/SMIN/UMAX/UMIN opcodes llvm-svn: 288926	2016-12-07 17:54:00 +00:00
Simon Pilgrim	967325b373	[SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes llvm-svn: 288916	2016-12-07 16:28:21 +00:00
Simon Pilgrim	ff79f31328	[SelectionDAG] Removed old knownbits TODO comment. NFCI. EXTRACT_VECTOR_ELT does support demanded elts if the element index is known and in range. llvm-svn: 288913	2016-12-07 15:31:12 +00:00
Eli Friedman	0a76e3241f	[CodeGen] Fix result type for SMULO/UMULO legalization On some platforms (like MSP430) the second element of the result structure for SMULO/UMULO may have a shorter type than the one returned by SetCC. We need to truncate it to the right type, or else some incorrect code may be generated later on. This fixes issue https://github.com/rust-lang/rust/issues/37829 Patch by Vadzim Dambrouski! Differential Revision: https://reviews.llvm.org/D27154 llvm-svn: 288857	2016-12-06 22:49:36 +00:00
Tim Northover	14ceb45fb4	GlobalISel: correctly handle small args via memory. We were rounding size in bits down rather than up, leading to 0-sized slots for i1 (assert!) and bugs for other types not byte-aligned. llvm-svn: 288848	2016-12-06 21:02:19 +00:00
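The rounding problem described in r288848 can be illustrated with a minimal sketch. The helper names `bytesRoundedDown` and `bytesRoundedUp` are hypothetical and are not taken from the LLVM sources; the sketch only assumes 8-bit bytes and shows why truncating division yields a 0-sized slot for an i1 while ceiling division does not.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not the LLVM implementation.

// Truncating division: a 1-bit value (i1) maps to a 0-byte slot.
constexpr uint64_t bytesRoundedDown(uint64_t SizeInBits) {
  return SizeInBits / 8;
}

// Ceiling division: every non-empty value gets at least one byte.
constexpr uint64_t bytesRoundedUp(uint64_t SizeInBits) {
  return (SizeInBits + 7) / 8;
}

// i1: rounding down gives an empty slot, rounding up gives one byte.
static_assert(bytesRoundedDown(1) == 0, "i1 rounds down to zero bytes");
static_assert(bytesRoundedUp(1) == 1, "i1 rounds up to one byte");

// A type that is not byte-aligned (17 bits here) loses a byte when rounded down.
static_assert(bytesRoundedDown(17) == 2, "17 bits truncated to two bytes");
static_assert(bytesRoundedUp(17) == 3, "17 bits need three bytes");
```

For sizes that are already a multiple of 8 the two helpers agree, which is presumably why only i1 and other non-byte-aligned types were affected.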
Simon Pilgrim	dd6ca639d5	[DAGCombine] Add (sext_in_reg (zext x)) -> (sext x) combine Handle the case where a sign extension has ended up being split into separate stages (typically to get around vector legal ops) and a zext + sext_in_reg gets inserted. Differential Revision: https://reviews.llvm.org/D27461 llvm-svn: 288842	2016-12-06 19:09:37 +00:00
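The identity behind the r288842 combine can be checked on plain integers. The sketch below assumes an i8 source value widened to i32; `zext8`, `sext8`, and `sextInReg8` are hypothetical stand-ins for the DAG nodes and are not LLVM API.

```cpp
#include <cstdint>

// Hypothetical integer models of the DAG nodes; illustration only.

// zext i8 -> i32: the upper 24 bits become zero.
constexpr uint32_t zext8(uint8_t X) { return X; }

// sext i8 -> i32: the upper 24 bits copy bit 7 of the input.
constexpr uint32_t sext8(uint8_t X) {
  return X < 0x80 ? uint32_t(X) : (uint32_t(X) | 0xFFFFFF00u);
}

// sext_in_reg on an i32 whose low 8 bits hold the value:
// flipping bit 7 and subtracting 0x80 replicates bit 7 upward.
constexpr uint32_t sextInReg8(uint32_t V) {
  return ((V & 0xFFu) ^ 0x80u) - 0x80u;
}

// (sext_in_reg (zext x)) equals (sext x) for negative and non-negative inputs.
static_assert(sextInReg8(zext8(0x80)) == sext8(0x80), "negative i8");
static_assert(sextInReg8(zext8(0x7F)) == sext8(0x7F), "positive i8");
static_assert(sextInReg8(zext8(0x80)) == 0xFFFFFF80u, "bit 7 replicated");
```

The same reasoning should hold for any narrower source width, since sext_in_reg reads only the low bits, which zext leaves unchanged.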
Tim Northover	0a683e7bfd	GlobalISel: fall back gracefully when we hit unhandled legalizer default. llvm-svn: 288840	2016-12-06 19:02:15 +00:00
Simon Pilgrim	1577b39f51	[SelectionDAG] We can ignore knownbits from an undef shuffle vector index if we don't actually demand that element llvm-svn: 288839	2016-12-06 18:58:25 +00:00
Tim Northover	c1a23854f3	GlobalISel: handle G_SEQUENCE fallbacks gracefully. There were two problems: + AArch64 was reusing random data from its binary op tables, which is complete nonsense for G_SEQUENCE. + Even when AArch64 gave up and said it couldn't handle G_SEQUENCE, the generic code asserted. llvm-svn: 288836	2016-12-06 18:38:38 +00:00
Tim Northover	f50f2f3d32	GlobalISel: allow G_SELECT instructions for pointers. llvm-svn: 288835	2016-12-06 18:38:34 +00:00
Tim Northover	405e25cd6a	GlobalISel: stop the legalizer from trying to handle oddly-sized types. It'll almost immediately fail because it always tries to halve/double the size until it finds a legal one. Unfortunately, this triggers an assertion preventing the DAG fallback from being possible. llvm-svn: 288834	2016-12-06 18:38:29 +00:00
Simon Pilgrim	29c17f3f58	Avoid repeated calls to Op.getOpcode(). NFCI. llvm-svn: 288814	2016-12-06 14:50:09 +00:00
Sam McCall	03435f57aa	Add missing parens in assert. Summary: Add missing parens in assert, which warn in GCC. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27448 llvm-svn: 288792	2016-12-06 10:14:36 +00:00
Tim Northover	800638fd67	GlobalISel: avoid looking too closely at PHIs when we bail. The function used to finish off PHIs by adding the relevant basic blocks can fail if we're aborting and still don't actually have the needed MachineBasicBlocks. So avoid trying in that case. llvm-svn: 288727	2016-12-05 23:10:19 +00:00
Tim Northover	b566848d68	GlobalISel: place constants correctly in the entry block. When the entry block was empty after arg lowering, we were always placing constants at the end. This is probably harmless while translating the same block, but horribly wrong once its terminator has been translated. So switch to inserting at the beginning. llvm-svn: 288720	2016-12-05 22:40:13 +00:00
Tim Northover	c0bd197c6b	GlobalISel: handle pointer arguments that get assigned to the stack. llvm-svn: 288717	2016-12-05 22:20:32 +00:00
Tim Northover	cc35f90492	GlobalISel: translate constants larger than 64 bits. llvm-svn: 288713	2016-12-05 21:54:17 +00:00
Tim Northover	9267ac5d47	GlobalISel: make G_CONSTANT take a ConstantInt rather than int64_t. This makes it more similar to the floating-point constant, and also allows for larger constants to be translated later. There's no real functional change in this patch though, just syntax updates. llvm-svn: 288712	2016-12-05 21:47:07 +00:00
Tim Northover	6ad7b9f837	GlobalISel: improve translation fallback for constants. Returning 0 (NoReg) from getOrCreateVReg leads to unexpected situations later in the translation. It's better to return a valid (if undefined) register and let the rest of the instruction carry on as planned. llvm-svn: 288709	2016-12-05 21:40:33 +00:00

21889 Commits