llvm-project

Commit Graph

Author	SHA1	Message	Date
Chandler Carruth	41ed4034dd	[x86] Revert the X86FoldTablesEmitter due to more miscompiles. In testing, we've found yet another miscompile caused by the new tables. And this one is even less clear how to fix (we could teach it to fold a 16-bit load instead of the 32-bit load it wants, or block folding entirely). Also, the approach to excluding instructions seems increasingly to not scale well. I have left a more detailed analysis on the review log for the original patch (https://reviews.llvm.org/D32684) along with suggested path forward. I will land an additional test case that I wrote which covers the code that was miscompiling (folding into the output of `pextrw`) in a subsequent commit to keep this a pure revert. For each commit reverted here, I've restricted the revert to the non-test code touching the x86 fold table emission until the last commit where I did revert the test updates. This means the new test cases added for `insertps` and `xchg` remain untouched (and continue to pass). Reverted commits: r304540: [X86] Don't fold into memory operands into insertps in the ... r304347: [TableGen] Adapt more places to getValueAsString now ... r304163: [X86] Don't fold away the memory operand of an xchg. r304123: Don't capture a temporary std::string in a StringRef. r304122: Resubmit "[X86] Adding new LLVM TableGen backend that ..." Original commit was in r304088, and after a string of fixes was reverted previously in r304121 to fix build bots, and then re-landed in r304122. llvm-svn: 304762	2017-06-06 02:15:31 +00:00
Matthias Braun	7bda195812	CodeGen: Refactor MIR parsing When parsing .mir files immediately construct the MachineFunctions and put them into MachineModuleInfo. This allows us to get rid of the delayed construction (and delayed error reporting) through the MachineFunctionInitialzier interface. Differential Revision: https://reviews.llvm.org/D33809 llvm-svn: 304758	2017-06-06 00:44:35 +00:00
Matthias Braun	c7c06f158c	CodeGen/LLVMTargetMachine: Refactor ISel pass construction; NFCI - Move ISel (and pre-isel) pass construction into TargetPassConfig - Extract AsmPrinter construction into a helper function Putting the ISel code into TargetPassConfig seems a lot more natural and both changes together make make it easier to build custom pipelines involving .mir in an upcoming commit. This moves MachineModuleInfo to an earlier place in the pass pipeline which shouldn't have any effect. llvm-svn: 304754	2017-06-06 00:26:13 +00:00
Sanjay Patel	d47e64398e	[x86] fix over-specific triple; NFC There's nothing darwin-specific in these tests, and using that setting causes extra phantom diffs when the auto-generated check lines are regenerated today. llvm-svn: 304753	2017-06-06 00:18:11 +00:00
Quentin Colombet	c668935d85	[InlineSpiller] Don't spill fully undef values Althought it is not wrong to spill undef values, it is useless and harms both code size and runtime. Before spilling a value, check that its content actually matters. http://www.llvm.org/PR33311 llvm-svn: 304752	2017-06-05 23:51:27 +00:00
Matt Arsenault	9e5b5053d1	RenameIndependentSubregs: Fix handling of undef tied operands If a tied source operand was undef, it would be replaced but not update the other tied operand, which would end up using different virtual registers. llvm-svn: 304747	2017-06-05 22:58:57 +00:00
Volkan Keles	ebe6bb9006	[GlobalISel] IRTranslator: Add MachineMemOperand to target memory intrinsics Reviewers: qcolombet, ab, t.p.northover, aditya_nandakumar, dsanders Reviewed By: qcolombet Subscribers: rovka, kristof.beyls, javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D33724 llvm-svn: 304743	2017-06-05 22:17:17 +00:00
Davide Italiano	fb4d5c095b	[SelectionDAG] Update the dominator after splitting critical edges. Running `llc -verify-dom-info` on the attached testcase results in a crash in the verifier, due to a stale dominator tree. i.e. DominatorTree is not up to date! Computed: =============================-------------------------------- Inorder Dominator Tree: [1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,7} [2] %lor.lhs.false.i61.i.i.i {1,2} [2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,6} [3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5} Actual: =============================-------------------------------- Inorder Dominator Tree: [1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,9} [2] %lor.lhs.false.i61.i.i.i {1,2} [2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,8} [3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5} [3] %safe_mod_func_int8_t_s_s.exit.i.i.i.lor.lhs.false.i61.i.i.i_crit_edge {6,7} This is because in `SelectionDAGIsel` we split critical edges without updating the corresponding dominator for the function (and we claim in `MachineFunctionPass::getAnalysisUsage()` that the domtree is preserved). We could either stop preserving the domtree in `getAnalysisUsage` or tell `splitCriticalEdge()` to update it. As the second option is easy to implement, that's the one I chose. Differential Revision: https://reviews.llvm.org/D33800 llvm-svn: 304742	2017-06-05 22:16:41 +00:00
Simon Pilgrim	807b708d13	[X86][SSE41] Non-temporal loads shouldn't be folded if it can be avoided (PR32743) Missed SSE41 non-temporal load case in previous commit Differential Revision: https://reviews.llvm.org/D33728 llvm-svn: 304722	2017-06-05 16:45:32 +00:00
Simon Pilgrim	b2ef948628	[X86][AVX1] Split 256-bit vector non-temporal loads to keep it non-temporal (PR32744) Differential Revision: https://reviews.llvm.org/D33728 llvm-svn: 304718	2017-06-05 16:02:01 +00:00
Simon Pilgrim	a25bf0b6b9	[X86][SSE] Non-temporal loads shouldn't be folded if it can be avoided (PR32743) Differential Revision: https://reviews.llvm.org/D33728 llvm-svn: 304717	2017-06-05 15:43:03 +00:00
Diana Picus	0091cc3528	[ARM] GlobalISel: Constrain callee register on indirect calls When lowering calls, we generate instructions with machine opcodes rather than generic ones. Therefore, we need to constrain the register classes of the operands. Also enable the machine verifier on the arm-irtranslator.ll test, since that would've caught this issue. Fixes (part of) PR32146. llvm-svn: 304712	2017-06-05 12:54:53 +00:00
Javed Absar	b16d146838	Add support for #pragma clang section This patch provides a means to specify section-names for global variables, functions and static variables, using #pragma directives. This feature is only defined to work sensibly for ELF targets. One can specify section names as: #pragma clang section bss="myBSS" data="myData" rodata="myRodata" text="myText" One can "unspecify" a section name with empty string e.g. #pragma clang section bss="" data="" text="" rodata="" Reviewers: Roger Ferrer, Jonathan Roelofs, Reid Kleckner Differential Revision: https://reviews.llvm.org/D33413 llvm-svn: 304704	2017-06-05 10:09:13 +00:00
Stanislav Mekhanoshin	286a4225b9	[AMDGPU] Fix SIFoldOperands crash with clamp Fixes bug #33302. Pass did not account that Src1 of max instruction can be an immediate. Differential Revision: https://reviews.llvm.org/D33884 llvm-svn: 304696	2017-06-05 01:03:04 +00:00
Simon Pilgrim	46dd55f1e1	[X86][SSE] Change BUILD_VECTOR interleaving ordering to improve coalescing/combine opportunities We currently generate BUILD_VECTOR as a tree of UNPCKL shuffles of the same type: e.g. for v4f32: Step 1: unpcklps 0, 2 ==> X: <?, ?, 2, 0> : unpcklps 1, 3 ==> Y: <?, ?, 3, 1> Step 2: unpcklps X, Y ==> <3, 2, 1, 0> The issue is because we are not placing sequential vector elements together early enough, we fail to recognise many combinable patterns - consecutive scalar loads, extractions etc. Instead, this patch unpacks progressively larger sequential vector elements together: e.g. for v4f32: Step 1: unpcklps 0, 2 ==> X: <?, ?, 1, 0> : unpcklps 1, 3 ==> Y: <?, ?, 3, 2> Step 2: unpcklpd X, Y ==> <3, 2, 1, 0> This does mean that we are creating UNPCKL shuffle of different value types, but the relevant combines that benefit from this are quite capable of handling the additional BITCASTs that are now included in the shuffle tree. Differential Revision: https://reviews.llvm.org/D33864 llvm-svn: 304688	2017-06-04 20:12:04 +00:00
Igor Breger	3bfba2c569	[GlobalISel][X86] merge irtranslator-call test files. NFC llvm-svn: 304683	2017-06-04 12:41:10 +00:00
Stanislav Mekhanoshin	0330660403	[AMDGPU] Untangle SDWA pass from SIShrinkInstructions Remove dependency of SDWA pass on SIShrinkInstructions. The goal is to move SDWA even higher in the stack to avoid second run of MachineLICM, MachineCSE and SIFoldOperands. Also added handling to preserve original src modifiers. Differential Revision: https://reviews.llvm.org/D33860 llvm-svn: 304665	2017-06-03 17:39:47 +00:00
Amaury Sechet	39fbe3bb60	Regenerate expectations for trunc-to-bool.ll . NFC llvm-svn: 304660	2017-06-03 11:35:40 +00:00
Simon Pilgrim	f93debb40c	[X86][SSE] Add SCALAR_TO_VECTOR(PEXTRW/PEXTRB) support to faux shuffle combining Generalized existing SCALAR_TO_VECTOR(EXTRACT_VECTOR_ELT) code to support AssertZext + PEXTRW/PEXTRB cases as well. llvm-svn: 304659	2017-06-03 11:12:57 +00:00
Tom Stellard	e042412ef1	AMDGPU/GlobalISel: Mark 1-bit integer constants as legal Summary: These are mostly legal, but will probably need special lowering for some cases. Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D33791 llvm-svn: 304628	2017-06-03 01:13:33 +00:00
Stanislav Mekhanoshin	f154b4f52c	[AMDGPU] Preserve operand order in SIFoldOperands SIFoldOperands can commute operands even if no folding was done. This change is to preserve IR is no folding was done. Differential Revision: https://reviews.llvm.org/D33802 llvm-svn: 304625	2017-06-03 00:41:52 +00:00
Quentin Colombet	1ee8616ca0	[SystemZ] Simplify test case. NFC Remove useless successors information. llvm-svn: 304615	2017-06-02 23:40:58 +00:00
Sanjay Patel	56641ac497	[x86] fix over-specific triple; NFC There's nothing darwin-specific in these tests, and using that setting causes extra phantom diffs when the auto-generated check lines are regenerated today. llvm-svn: 304614	2017-06-02 23:40:46 +00:00
Philip Reames	80135bdf9e	Canonicalize a test via utils/update_test_checks.py Turns out I might not have further changes to make here, but with the way I'd written the tests, even I couldn't tell that. :( llvm-svn: 304613	2017-06-02 23:27:36 +00:00
Sanjay Patel	4cad0f0477	[x86] add tests for unsigned vector compares with known signbits; NFC (PR33276) llvm-svn: 304612	2017-06-02 23:24:28 +00:00
Matthias Braun	0021d46a1c	RegisterScavenging: Add ScavengerTest pass This pass allows to run the register scavenging independently of PrologEpilogInserter to allow targeted testing. Also adds some basic register scavenging tests. llvm-svn: 304606	2017-06-02 23:01:42 +00:00
Quentin Colombet	2145cf3f07	[RABasic] Properly update the LiveRegMatrix when LR splitting occur Prior to this patch we used to not touch the LiveRegMatrix while doing live-range splitting. In other words, when live-range splitting was occurring, the LiveRegMatrix was not reflecting the changes. This is generally fine because it means the query to the LiveRegMatrix will be conservately correct. However, when decisions are taken based on what is going to happen on the interferences (e.g., when we spill a register and know that it is going to be available for another one), we might hit an assertion that the color used for the assignment is still in use. This patch makes sure the changes on the live-ranges are properly reflected in the LiveRegMatrix, so the assertions don't break. An alternative could have been to remove the assertion, but it would make the invariants of the code and the general reasoning more complicated in my opnion. http://llvm.org/PR33057 llvm-svn: 304603	2017-06-02 22:46:31 +00:00
Quentin Colombet	ebbaed6d3c	[RABasic] Properly initialize the pass Use the initializeXXX method to initialize the RABasic pass in the pipeline. This enables us to take advantage of the .mir infrastructure. llvm-svn: 304602	2017-06-02 22:46:26 +00:00
Ahmed Bougacha	018a68f9e4	[X86] Correctly broadcast NaN-like integers as float on AVX. Since r288804, we try to lower build_vectors on AVX using broadcasts of float/double. However, when we broadcast integer values that happen to have a NaN float bitpattern, we lose the NaN payload, thereby changing the integer value being broadcast. This is caused by ConstantFP::get, to which we pass the splat i32 as a float (by bitcasting it using bitsToFloat). ConstantFP::get takes a double parameter, so we end up lossily converting a single-precision NaN to double-precision. Instead, avoid any kinds of conversions by directly building an APFloat from the splatted APInt. Note that this also fixes another piece of code (broadcast of subvectors), that currently isn't susceptible to the same problem. Also note that we could really just use APInt and ConstantInt throughout: the constant pool type doesn't matter much. Still, for consistency, use the appropriate type. llvm-svn: 304590	2017-06-02 20:02:59 +00:00
Amaury Sechet	04ffaca604	Regenerate expectation for wide-fma-contraction.ll . NFC llvm-svn: 304586	2017-06-02 19:15:04 +00:00
Konstantin Zhuravlyov	be6c0ca5e2	AMDGPU: Make auto waitcnt before barrier a feature Differential Revision: https://reviews.llvm.org/D33793 llvm-svn: 304571	2017-06-02 17:40:26 +00:00
Philip Reames	94cc4a29ed	Add placeholder for more extensive verification of psuedo ops This initial patch doesn't actually do much useful. It's just to show where the new code goes. Once this is in, I'll extend the verification logic to check more useful properties. For those curious, the more complicated version of this patch already found one very suspicious thing. Differential Revision: https://reviews.llvm.org/D33819 llvm-svn: 304564	2017-06-02 16:36:37 +00:00
Amaury Sechet	5746e7356a	Update select.ll expected results. NFC llvm-svn: 304557	2017-06-02 16:07:43 +00:00
Alexander Timofeev	3f70b619a9	AMDGPUAnnotateUniformValue should always treat volatile loads as divergent llvm-svn: 304554	2017-06-02 15:25:52 +00:00
Mark Searles	70359ac60d	[AMDGPU] Turn on the new waitcnt insertion pass. Adjust tests. -enable-si-insert-waitcnts=1 becomes the default -enable-si-insert-waitcnts=0 to use old pass Differential Revision: https://reviews.llvm.org/D33730 llvm-svn: 304551	2017-06-02 14:19:25 +00:00
Zoran Jovanovic	2aae0649a1	[mips][microMIPS] Extending size reduction pass with LBU16, LHU16, SB16 and SH16 Author: milena.vujosevic.janicic Reviewers: sdardis The patch extends size reduction pass for MicroMIPS. The following instructions are examined and transformed, if possible: LBU instruction is transformed into 16-bit instruction LBU16 LHU instruction is transformed into 16-bit instruction LHU16 SB instruction is transformed into 16-bit instruction SB16 SH instruction is transformed into 16-bit instruction SH16 Differential Revision: https://reviews.llvm.org/D33091 llvm-svn: 304550	2017-06-02 14:14:21 +00:00
Krzysztof Parzyszek	066e8b56a0	[Hexagon] Return 0 from getDotNewPredOp when .new opcode does not exist This allows using this function to test if an instruction can be converted to a .new form. llvm-svn: 304549	2017-06-02 14:07:06 +00:00
Amaury Sechet	2e1fed9ef8	Regenerate sse3.ll test results. NFC llvm-svn: 304548	2017-06-02 14:02:49 +00:00
Amaury Sechet	8e370f14cb	Regenerate and-sink.ll test results. NFC llvm-svn: 304547	2017-06-02 14:02:46 +00:00
Amaury Sechet	f0c066f140	Regenerate shrink-compare.ll test results. NFC llvm-svn: 304546	2017-06-02 14:02:43 +00:00
Benjamin Kramer	19092d783c	[X86] Don't fold into memory operands into insertps in the generated folding tables. insertps behaves differently, the register form selects from an input register based on the immediate operand while the memory form just loads the given address. We have custom code to change the immediate in cases where that's legal, so completely remove insertps from the generated tables. llvm-svn: 304540	2017-06-02 10:50:22 +00:00
John Brawn	6671616cde	[GlobalMerge] Don't merge globals that may be preempted When a global may be preempted it needs to be accessed directly, instead of indirectly through a MergedGlobals symbol, for the preemption to work. This fixes PR33136. Differential Revision: https://reviews.llvm.org/D33727 llvm-svn: 304537	2017-06-02 10:24:14 +00:00
Diana Picus	e7aa90987d	[ARM] GlobalISel: Support struct params/returns Very very similar to the support for arrays. As with arrays, we don't support returning large structs that wouldn't fit in R0-R3. Most front-ends would likely use sret arguments for that anyway. The only significant difference is that when splitting a struct, we need to make sure we set the correct original alignment on each member, otherwise it may get split incorrectly between stack and registers. llvm-svn: 304536	2017-06-02 10:16:48 +00:00
Javed Absar	4ae7e81233	[ARM] Cortex-A57 scheduling model for ARM backend (AArch32) This patch implements the Cortex-A57 scheduling model. The main code is in ARMScheduleA57.td, ARMScheduleA57WriteRes.td. Small changes in cpp,.h files to support required scheduling predicates. Scheduling model implemented according to: http://infocenter.arm.com/help/topic/com.arm.doc.uan0015b/Cortex_A57_Software_Optimization_Guide_external.pdf. Patch by : Andrew Zhogin (submitted on his behalf, as requested). Rewiewed by: Renato Golin, Diana Picus, Javed Absar, Kristof Beyls. Differential Revision: https://reviews.llvm.org/D28152 llvm-svn: 304530	2017-06-02 08:53:19 +00:00
Amaury Sechet	9a6fdc0bd5	Specify triple for xor-icmp.ll . llvm-svn: 304526	2017-06-02 07:45:22 +00:00
Amaury Sechet	968dda7f81	Regenerate expectations for xor-icmp.ll . NFC llvm-svn: 304525	2017-06-02 07:25:02 +00:00
Yaxun Liu	a618acf923	[AMDGPU] Fix kernel arg segment size for amdgizcl Differential Revision: https://reviews.llvm.org/D33307 llvm-svn: 304482	2017-06-01 21:31:53 +00:00
Nirav Dave	4952871630	[SDAG] Fix CombineTo ordering in visitZERO_EXTEND and visitSIGN_EXTEND Reorder CombineTo Calls to prevent references to stale/deleted SDNodes which caused undue assertions. Reviewers: dbabokin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D31625 llvm-svn: 304460	2017-06-01 19:33:50 +00:00
Krzysztof Parzyszek	3cf16576d5	[Hexagon] Fix dependence check in the packetizer An incorrect check in the packetizer lead to an attempt to convert an unconditional branch to a .new (conditional) form. llvm-svn: 304442	2017-06-01 18:02:40 +00:00
Krzysztof Parzyszek	51fd5405d5	[Hexagon] Handle long-running simplification loop in idiom recognition The initial assumption was that the simplification would converge to a fixed point relatvely quickly. Turns out that there are legitimate situa- tions where the complexity of the code causes it to take a large number of iterations. Two main changes: - Instead of aborting upon hitting the limit, simply return nullptr. - Reduce the limit to 10,000 from 100,000. llvm-svn: 304441	2017-06-01 18:00:47 +00:00

1 2 3 4 5 ...

20371 Commits