llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	5d787ac4be	[X86] Remove some uarch tuning flags from KNL that look to have been inherited from SNB/IVB incorrectly KNL is based on a modified Silvermont core so I don't think these features apply. I think the LEA flag is probably also wrong, but I'm less sure as I barely understand the 3 LEA flags we have currently. Differential Revision: https://reviews.llvm.org/D53671 llvm-svn: 345285	2018-10-25 17:28:57 +00:00
Volkan Keles	3a103b1d25	[AArch64][GlobalISel] Fix the LegalityPredicate for lowerIf for G_LOAD/G_STORE Summary: Currently, Legalizer is trying to lower G_LOAD with a vector type that has more than two elements due to the incorrect LegalityPredicate. This patch fixes the issue by removing the multiplication by 8 as `MemDesc.Size` already contains the size in bits. Reviewers: dsanders, aemerson Reviewed By: dsanders Subscribers: rovka, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D53679 llvm-svn: 345282	2018-10-25 17:23:25 +00:00
Simon Pilgrim	8f11ddc397	[ARM] Regenerate vdup tests llvm-svn: 345276	2018-10-25 15:33:47 +00:00
John Brawn	958865202d	[AArch64] Add EXT patterns for 64-bit EXT of a subvector of a 128-bit vector If we have a 64-bit EXT where one of the operands is a subvector of a 128-bit vector then in some cases we can eliminate an extract_subvector by converting to a 128-bit EXT of the 128-bit vector. Differential Revision: https://reviews.llvm.org/D53582 llvm-svn: 345275	2018-10-25 15:31:51 +00:00
Sam Parker	a16667e79b	[ARM] Use Cortex-A57 sched model for Cortex-A72 This mirrors what we already do for AArch64 as the cores are similar. As discussed in the review, enabling the machine scheduler causes more variations in performance changes so it is not enabled for now. This patch improves LNT scores by a geomean of 1.57% at -O3. Differential Revision: https://reviews.llvm.org/D53562 llvm-svn: 345272	2018-10-25 15:08:29 +00:00
John Brawn	49e61d90ca	[AArch64] Do 64-bit vector move of 0 and -1 by extracting from the 128-bit move Currently a vector move of 0 or -1 will use different instructions depending on the size of the vector. Using a single instruction (the 128-bit one) for both gives more opportunity for Machine CSE to eliminate instructions. Differential Revision: https://reviews.llvm.org/D53579 llvm-svn: 345270	2018-10-25 14:56:48 +00:00
Alexey Bataev	0f2fe4f135	[DEBUG_INFO][NVPTX]Fix processing of DBG_VALUES. Summary: If the instruction in the eliminateFrameIndex function is a DBG_VALUE instruction, it requires special processing. The frame register is set to VRFrame and the offset is based on the object offset. The code is similar to the code used in lib/CodeGen/PrologEpilogInserter.cpp. Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D53657 llvm-svn: 345269	2018-10-25 14:27:27 +00:00
Francis Visoiu Mistrih	7d55dd673b	[X86] Fix llc invocation on MIR test case The current state of the llc invocation is: * Running all the passes from dwarfehprepare to stack coloring (included) * It runs it from the LLVM IR included in the file * It ADDS the generated MI from ISel to the MI in the MIR file * The machine verifier doesn't like it. Differential Revision: https://reviews.llvm.org/D53698 llvm-svn: 345266	2018-10-25 14:11:07 +00:00
Amara Emerson	cbd86d8429	[GlobalISel] Use the target preferred type for G_EXTRACT_VECTOR_ELT index. Allows for better imported pattern re-use. llvm-svn: 345265	2018-10-25 14:04:54 +00:00
Simon Pilgrim	53e8e145e9	[CostModel][X86] Add realistic vXi64 uitofp vXf64 costs Match codegen improvements from D53649/rL345256 llvm-svn: 345263	2018-10-25 13:06:20 +00:00
Simon Pilgrim	0573b8d8b6	[CostModel][X86] Add realistic i64 uitofp f64 scalar costs llvm-svn: 345261	2018-10-25 12:42:10 +00:00
Simon Pilgrim	838eb24014	[TargetLowering] Improve vXi64 UINT_TO_FP vXf64 support (P38226) As suggested on D52965, this patch moves the i64 to f64 UINT_TO_FP expansion code from LegalizeDAG into TargetLowering and makes it available to LegalizeVectorOps as well. Not only does this help perform X86 lowering as a true vectorization instead of (partially vectorized) scalar conversions, it avoids the HADDPD op from the scalar code which can be slow on most targets. The AVX512F does have the vcvtusi2sdq scalar operation but we don't unroll to use it as it seems to only help for the v2f64 case - otherwise the unrolling cost will certainly be too high. My feeling is that we should leave it to the vectorizers - and if it generates the vector UINT_TO_FP we should use it. Differential Revision: https://reviews.llvm.org/D53649 llvm-svn: 345256	2018-10-25 11:15:57 +00:00
George Rimar	581fc63dc0	[llvm-dwarfdump] - Fix incorrect parsing of the DW_LLE_startx_length As was already mentioned in comments for D53364, DWARF 5 spec says about DW_LLE_startx_length: "This is a form of bounded location description that has two unsigned ULEB operands. The first value is an address index (into the .debug_addr section) that indicates the beginning of the address range over which the location is valid. The second value is the length of the range. ") Currently, the length is always parsed as U32. Patch change the behavior to parse DW_LLE_startx_length as ULEB128 for DWARF 5 and keeps it as U32 for DWARF4+(pre-DWARF5) for compatibility. Differential revision: https://reviews.llvm.org/D53564 llvm-svn: 345254	2018-10-25 10:56:44 +00:00
Carlos Alberto Enciso	9a24e1a7cd	[DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG. When SimplifyCFG changes the PHI node into a select instruction, the debug line records becomes ambiguous. It causes the debugger to display unreachable source lines. Differential Revision: https://reviews.llvm.org/D53287 llvm-svn: 345250	2018-10-25 09:58:59 +00:00
Gabor Buella	1f6ca0ba15	Add -instcombine-code-sinking option Reviewers: craig.topper, andrew.w.kaylor, efriedma Reviewed By: craig.topper, andrew.w.kaylor, efriedma Differential Revision: https://reviews.llvm.org/D52709 llvm-svn: 345248	2018-10-25 08:32:29 +00:00
Simon Atanasyan	1993254509	[llvm-readobj] Print ELF header flags names in GNU output GNU readelf tool prints hex value of the ELF header flags field and the flags names. This change adds the same functionality to llvm-readobj. Now llvm-readobj can print MIPS and RISCV flags. New GNUStyle::printFlags() method is a copy of ScopedPrinter::printFlags() routine. Probably we can escape code duplication and / or simplify the printFlags() method. But it's a task for separate commit. Differential revision: https://reviews.llvm.org/D52027 llvm-svn: 345238	2018-10-25 05:39:27 +00:00
Thomas Lively	325c9c5e84	[WebAssembly] Set LoadExt and TruncStore actions for SIMD types Summary: Fixes part of the problem reported in bug 39275. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits, alexcrichton Differential Revision: https://reviews.llvm.org/D53542 llvm-svn: 345230	2018-10-25 01:46:07 +00:00
Reid Kleckner	a6c6698217	[X86] Adjust MIR test case to pacify machine verifier llvm-svn: 345227	2018-10-24 23:52:33 +00:00
Reid Kleckner	24d12c28e7	[X86] Fix pipeline tests when enabling MIR verification, NFC llvm-svn: 345226	2018-10-24 23:52:22 +00:00
David Blaikie	60fddac907	DebugInfo: Reuse common addresses for rnglist base address selections This makes the offsets larger (since they are further from the base address) but those are in the .dwo - and allows removing addresses and relocations from the .o file. This could be built into the AddressPool more fundamentally, perhaps - when you ask for an AddressPool entry you could say "or give me some other entry and an offset I need to use" - though what to do about situations where the first use of an address in a section is not the earliest address in that section... is tricky. At least with range addresses we can be fairly sure we've seen the earliest address first because we see the start address for the function. llvm-svn: 345224	2018-10-24 23:36:29 +00:00
Heejin Ahn	ac764aa88e	[WebAssembly] Fix immediate of rethrow when throwing to caller Summary: Currently when assigning depths 'rethrow' does not take the whole control flow stack into accounts but only considers EH pad stacks. When assigning depth immmediates to rethrows, in normal cases it is done correctly but when a rethrow instruction throws up to a caller, i.e., we convert a pseudo RETHROW_TO_CALLER instruction to a rethrow, it mistakenly compute the whole stack depth. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D53619 llvm-svn: 345223	2018-10-24 23:31:24 +00:00
Thomas Lively	ed9513472c	[WebAssembly] Retain shuffle types during custom lowering Summary: Changing the node type in lowering was violating assumptions made in the DAG combiner, so don't change the node type any more. This fixes one of the issues reported in bug 39275. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits, alexcrichton Differential Revision: https://reviews.llvm.org/D53537 llvm-svn: 345221	2018-10-24 23:27:40 +00:00
Reid Kleckner	49a24278ba	[ELF] Fix large code model MIR verifier errors Instead of using the MOVGOT64r pseudo, use the existing MO_PIC_BASE_OFFSET support on symbol operands. Now I don't have to create a "scratch register operand" for the pseudo to use, and the register allocator can make better decisions. Fixes some X86 verifier errors tracked in PR27481. llvm-svn: 345219	2018-10-24 22:57:28 +00:00
Thomas Lively	30f1d69115	[NFC] Rename minnan and maxnan to minimum and maximum Summary: Changes all uses of minnan/maxnan to minimum/maximum globally. These names emphasize that the semantic difference between these operations is more than just NaN-propagation. Reviewers: arsenm, aheejin, dschuff, javed.absar Subscribers: jholewinski, sdardis, wdng, sbc100, jgravelle-google, jrtc27, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D53112 llvm-svn: 345218	2018-10-24 22:49:55 +00:00
Alina Sbirlea	ad4d018202	Update MemorySSA in LoopRotate. Summary: Teach LoopRotate to preserve MemorySSA. Enable tests for correctness, dependency disabled by default. Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits Differential Revision: https://reviews.llvm.org/D51718 llvm-svn: 345216	2018-10-24 22:46:45 +00:00
David Blaikie	c8ae096739	llvm-dwarfdump: Account for skeleton addr_base when dumping addresses in split unit in the same file llvm-svn: 345215	2018-10-24 22:44:54 +00:00
Thomas Lively	43bc46207a	[SelectionDAG] DAG combiner for fminnan and fmaxnan Summary: Depends on D52765. Reviewers: aheejin, dschuff Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52768 llvm-svn: 345210	2018-10-24 22:18:54 +00:00
Vedant Kumar	c299006879	[HotColdSplitting] Identify larger cold regions using domtree queries The current splitting algorithm works in three stages: 1) Identify cold blocks, then 2) Use forward/backward propagation to mark hot blocks, then 3) Grow a SESE region of blocks outside of the set of hot blocks and start outlining. While testing this pass on Apple internal frameworks I noticed that some kinds of control flow (e.g. loops) are never outlined, even though they unconditionally lead to / follow cold blocks. I noticed two other issues related to how cold regions are identified: - An inconsistency can arise in the internal state of the hotness propagation stage, as a block may end up in both the ColdBlocks set and the HotBlocks set. Further inconsistencies can arise as these sets do not match what's in ProfileSummaryInfo. - It isn't necessary to limit outlining to single-exit regions. This patch teaches the splitting algorithm to identify maximal cold regions and outline them. A maximal cold region is defined as the set of blocks post-dominated by a cold sink block, or dominated by that sink block. This approach can successfully outline loops in the cold path. As a side benefit, it maintains less internal state than the current approach. Due to a limitation in CodeExtractor, blocks within the maximal cold region which aren't dominated by a single entry point (a so-called "max ancestor") are filtered out. Results: - X86 (LNT + -Os + externals): 134KB of TEXT were outlined compared to 47KB pre-patch, or a ~3x improvement. Did not see a performance impact across two runs. - AArch64 (LNT + -Os + externals + Apple-internal benchmarks): 149KB of TEXT were outlined. Ditto re: performance impact. - Outlining results improve marginally in the internal frameworks I tested. Follow-ups: - Outline more than once per function, outline large single basic blocks, & try to remove unconditional branches in outlined functions. Differential Revision: https://reviews.llvm.org/D53627 llvm-svn: 345209	2018-10-24 22:15:41 +00:00
Sanjay Patel	e9f9a2a29c	[InstCombine] add test for fptrunc with vector with undef elt; NFC This should be fixed with D53650. llvm-svn: 345206	2018-10-24 22:02:05 +00:00
Paul Robinson	73766ccfda	Make llvm-dwarfdump -name work on type units. Differential Revision: https://reviews.llvm.org/D53672 llvm-svn: 345203	2018-10-24 21:51:55 +00:00
Joel E. Denny	3e66509f6c	[SourceMgr][FileCheck] Obey -color by extending WithColor (Relands r344930, reverted in r344935, and now hopefully fixed for Windows.) While this change specifically targets FileCheck, it affects any tool using the same SourceMgr facilities. Previously, -color was documented in FileCheck's -help output, but -color had no effect. Now, -color obeys its documentation: it forces colors to be used in FileCheck diagnostics even when stderr is not a terminal. -color is especially helpful when combined with FileCheck's -v, which can produce a long series of diagnostics that you might wish to pipe to a pager, such as less -R. The WithColor extensions here will also help to clean up color usage in FileCheck's annotated dump of input, which is proposed in D52999. Reviewed By: JDevlieghere, zturner Differential Revision: https://reviews.llvm.org/D53419 llvm-svn: 345202	2018-10-24 21:46:42 +00:00
Reid Kleckner	9c5bda652c	[X86] Add SP to tailcall register class to fix verifier error It's possible to do a tail call to a stack argument. LLVM already calculates the right stack offset to call through. Fixes the sibcall and musttail* verifier failures tracked at PR27481. llvm-svn: 345197	2018-10-24 21:09:34 +00:00
Reid Kleckner	953bdce68d	[MC] Separate masm integer literal lexer support from inline asm Summary: This renames the IsParsingMSInlineAsm member variable of AsmLexer to LexMasmIntegers and moves it up to MCAsmLexer. This is the only behavior controlled by that variable. I added a public setter, so that it can be set from outside or from the llvm-mc command line. We may need to arrange things so that users can get this behavior from clang, but that's future work. I also put additional hex literal lexing functionality under this flag to fix PR32973. It appears that this hex literal parsing wasn't intended to be enabled in non-masm-style blocks. Now, masm integers (0b1101 and 0ABCh) work in __asm blocks from clang, but 0b label references work when using .intel_syntax in standalone .s files. However, 0b label references will not work from __asm blocks in clang. They will work from GCC inline asm blocks, which it sounds like is important for Crypto++ as mentioned in PR36144. Essentially, we only lex masm literals for inline asm blobs that use intel syntax. If the .intel_syntax directive is used inside a gnu-style inline asm statement, masm literals will not be lexed, which is compatible with gas and llvm-mc standalone .s assembly. This fixes PR36144 and PR32973. Reviewers: Gerolf, avt77 Subscribers: eraman, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D53535 llvm-svn: 345189	2018-10-24 20:23:57 +00:00
Tim Northover	1c353419ab	AArch64: add a pass to compress jump-table entries when possible. llvm-svn: 345188	2018-10-24 20:19:09 +00:00
Simon Pilgrim	c5bb362b13	[X86][SSE] Add SimplifyDemandedBitsForTargetNode PMULDQ/PMULUDQ handling Add X86 SimplifyDemandedBitsForTargetNode and use it to simplify PMULDQ/PMULUDQ target nodes. This enables us to repeatedly simplify the node's arguments after the previous approach had to be reverted due to PR39398. Differential Revision: https://reviews.llvm.org/D53643 llvm-svn: 345182	2018-10-24 19:11:28 +00:00
Teresa Johnson	c8dba682bb	[hot-cold-split] Name split functions with ".cold" suffix Summary: The current default of appending "_"+entry block label to the new extracted cold function breaks demangling. Change the deliminator from "_" to "." to enable demangling. Because the header block label will be empty for release compile code, use "extracted" after the "." when the label is empty. Additionally, add a mechanism for the client to pass in an alternate suffix applied after the ".", and have the hot cold split pass use "cold."+Count, where the Count is currently 1 but can be used to uniquely number multiple cold functions split out from the same function with D53588. Reviewers: sebpop, hiraditya Subscribers: llvm-commits, erik.pilkington Differential Revision: https://reviews.llvm.org/D53534 llvm-svn: 345178	2018-10-24 18:53:47 +00:00
Simon Pilgrim	ac84005841	[CostModel][X86] Add vXi8 vector division by constants costs. ISD::MULHS/ISD::MULHU lowering of vXi8 types means we expand these in TargetLowering BuildSDIV/BuildUDIV. llvm-svn: 345175	2018-10-24 18:44:12 +00:00
Peter Collingbourne	4bb928c110	ARM: Use BKPT instead of TRAP to implement llvm.debugtrap. The BKPT instruction is specified to cause a software breakpoint, and at least on Linux results in a SIGTRAP. This makes it more suitable for implementing debugtrap than TRAP (aka UDF #254), which is specified to cause an undefined instruction exception and results in a SIGILL on Linux. Moreover, BKPT is not marked as a terminator, which is not only consistent with the IR instruction but allows the analyzeBlock function to correctly analyze a basic block containing the instruction, which fixes an assertion failure in the machine block placement pass previously triggered by the included test case. Because BKPT is only supported starting with ARMv5T, we continue to use UDF #254 when targeting v4T. Differential Revision: https://reviews.llvm.org/D53614 llvm-svn: 345171	2018-10-24 18:10:38 +00:00
Craig Topper	2417273255	[X86] Bring back the MOV64r0 pseudo instruction This patch brings back the MOV64r0 pseudo instruction for zeroing a 64-bit register. This replaces the SUBREG_TO_REG MOV32r0 sequence we use today. Post register allocation we will rewrite the MOV64r0 to a 32-bit xor with an implicit def of the 64-bit register similar to what we do for the various XMM/YMM/ZMM zeroing pseudos. My main motivation is to enable the spill optimization in foldMemoryOperandImpl. As we were seeing some code that repeatedly did "xor eax, eax; store eax;" to spill several registers with a new xor for each store. With this optimization enabled we get a store of a 0 immediate instead of an xor. Though I admit the ideal solution would be one xor where there are multiple spills. I don't believe we have a test case that shows this optimization in here. I'll see if I can try to reduce one from the code were looking at. There's definitely some other machine CSE(and maybe other passes) behavior changes exposed by this patch. So it seems like there might be some other deficiencies in SUBREG_TO_REG handling. Differential Revision: https://reviews.llvm.org/D52757 llvm-svn: 345165	2018-10-24 17:32:09 +00:00
Simon Pilgrim	2cce074e8c	[CostModel][X86] Enable non-uniform vector division by constants costs. Non-uniform division/remainder handling was added back at D49248/D50765 - so share the 'mul+sub' costs that already exist for uniform cases. llvm-svn: 345164	2018-10-24 17:30:29 +00:00
Robert Lougher	18bfb3a5ec	[CodeGen] skip lifetime end marker in isInTailCallPosition A lifetime end intrinsic between a tail call and the return should not prevent the call from being tail call optimized. Differential Revision: https://reviews.llvm.org/D53519 llvm-svn: 345163	2018-10-24 17:03:19 +00:00
Sanjay Patel	d1fe437cf1	[InstCombine] add test for ComputeNumSignBits with shuffle; NFC llvm-svn: 345162	2018-10-24 17:01:42 +00:00
Sanjay Patel	2169b9c976	[InstCombine] add test for select with shuffled condition (PR37549); NFC llvm-svn: 345156	2018-10-24 16:21:23 +00:00
Sanjay Patel	3b206305fd	[InstCombine] try harder to form select from logic ops (2nd try) The original patch was committed here: rL344609 ...and reverted: rL344612 ...because it did not properly check/test data types before calling ComputeNumSignBits(). The tests that caused bot failures for the previous commit are over-reaching front-end tests that run the entire -O optimizer pipeline: Clang :: CodeGen/builtins-systemz-zvector.c Clang :: CodeGen/builtins-systemz-zvector2.c I've added a negative test here to ensure coverage for that case. The new early exit check also tests the type of the 'B' parameter, so we don't waste time on matching if either value is unsuitable. Original commit message: This is part of solving PR37549: https://bugs.llvm.org/show_bug.cgi?id=37549 The patterns shown here are a special case of something that we already convert to select. Using ComputeNumSignBits() catches that case (but not the more complicated motivating patterns yet). The backend has hooks/logic to convert back to logic ops if that's better for the target. llvm-svn: 345149	2018-10-24 15:17:56 +00:00
Cameron McInally	678f43f666	[FPEnv] Convert more BinaryOperator::isFNeg(...) to m_FNeg(...) This work is to avoid regressions when we seperate FNeg from the FSub IR instruction. Differential Revision: https://reviews.llvm.org/D53205 llvm-svn: 345146	2018-10-24 14:45:18 +00:00
Alexey Bataev	c15c853c3a	[DEBUGINFO, NVPTX] Try to pack bytes data into a single string. Summary: If the target does not support `.asciz` and `.ascii` directives, the strings are represented as bytes and each byte is placed on the new line as a separate byte directive `.b8 <data>`. NVPTX target allows to represent the vector of the data of the same type as a vector, where values are separated using `,` symbol: `.b8 <data1>,<data2>,...`. This allows to reduce the size of the final PTX file. Ptxas tool includes ptx files into the resulting binary object, so reducing the size of the PTX file is important. Reviewers: tra, jlebar, echristo Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D45822 llvm-svn: 345142	2018-10-24 14:04:00 +00:00
James Henderson	5b2e968264	Fix llvm-strings crash for negative char values On Windows at least, llvm-strings was crashing if it encountered bytes that mapped to negative chars, as it was passing these into std::isgraph and std::isblank functions, resulting in undefined behaviour. On debug builds using MSVC, these functions verfiy that the value passed in is representable as an unsigned char. Since the char is promoted to an int, a value greater than 127 would turn into a negative integer value, and fail the check. Using the llvm::isPrint function is sufficient to solve the issue. Reviewed by: ruiu, mstorsjo Differential Revision: https://reviews.llvm.org/D53509 llvm-svn: 345137	2018-10-24 13:16:16 +00:00
Simon Pilgrim	84cc110732	[X86][SSE] Update PMULDQ schedule tests to survive more aggressive SimplifyDemandedBits llvm-svn: 345136	2018-10-24 13:13:36 +00:00
Andrea Di Biagio	083addf751	[llvm-mca] [llvm-mca] Improved error handling and error reporting from class InstrBuilder. A new class named InstructionError has been added to Support.h in order to improve the error reporting from class InstrBuilder. The llvm-mca driver is responsible for handling InstructionError objects, and printing them out to stderr. The goal of this patch is to remove all the remaining error handling logic from the library code. In particular, this allows us to: - Simplify the logic in InstrBuilder by removing a needless dependency from MCInstrPrinter. - Centralize all the error halding logic in a new function named 'runPipeline' (see llvm-mca.cpp). This is also a first step towards generalizing class InstrBuilder, so that in future, we will be able to reuse its logic to also "lower" MachineInstr to mca::Instruction objects. Differential Revision: https://reviews.llvm.org/D53585 llvm-svn: 345129	2018-10-24 10:56:47 +00:00
Gil Rapaport	c523036fd2	Revert r345114 Investigating fails. llvm-svn: 345123	2018-10-24 08:41:22 +00:00
Tim Renouf	2a1b1d94b6	[AMDGPU] Defined gfx909 Raven Ridge 2 Differential Revision: https://reviews.llvm.org/D53418 Change-Id: Ie3d054f2e956c2768988c0f4c0ffd29a47294eef llvm-svn: 345120	2018-10-24 08:14:07 +00:00
Eugene Leviant	1f54500af0	[ThinLTO] Fix dot dumper for regular LTO modules Regular LTO module identifier is (unsigned)-1. This patch emits correct module identifier while printing edges with source summary in regular LTO module. Differential revision: https://reviews.llvm.org/D53583 llvm-svn: 345118	2018-10-24 07:48:32 +00:00
Dorit Nuzman	5114390e48	[LV] Don't have fold-tail under optsize invalidate interleave-groups when masked-interleaving is enabled Enable interleave-groups under fold-tail scenario for Opt for size compilation; D50480 added support for vectorizing loops of arbitrary trip-count without a remiander, which in turn makes everything in the loop conditional, including interleave-groups if any. It therefore invalidated all interleave-groups because we didn't have support for vectorizing predicated interleaved-groups at the time. In the meantime, D53011 introduced this support, so we don't have to invalidate interleave-groups when masked-interleaved support is enabled. Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: hsaito Differential Revision: https://reviews.llvm.org/D53559 llvm-svn: 345115	2018-10-24 07:11:38 +00:00
Gil Rapaport	5012e7f6ac	[LSR] Combine unfolded offset into invariant register LSR reassociates constants as unfolded offsets when the constants fit as immediate add operands, which currently prevents such constants from being combined later with loop invariant registers. This patch modifies GenerateCombinations() to generate a second formula which includes the unfolded offset in the combined loop-invariant register. Differential Revision: https://reviews.llvm.org/D51861 llvm-svn: 345114	2018-10-24 07:08:38 +00:00
Sanjin Sijaric	cd41638292	[ARM64][Windows] Add unwind support to llvm-readobj This patch adds support for dumping the unwind info from ARM64 COFF object files. Differential Revision: https://reviews.llvm.org/D53264 llvm-svn: 345108	2018-10-24 00:03:34 +00:00
Saleem Abdulrasool	4005f9a860	ARM: handle checking aliases with out-of-bounds GEPs A global alias may use indices which are not considered in bounds. In such a case, accessing the base object will fail as it only peers through inbounds accesses. This pattern is used by the swift compiler to create references to preceeding members in the type metadata. This would cause the code generation to fail when targeting a platform that used ELF as the object file format. Be conservative and fail the read-only check if we run into an alias that we cannot peer through. llvm-svn: 345107	2018-10-24 00:00:52 +00:00
Wei Mi	80a0c97e07	[PM] keeping history when original SCC split and then merge into itself in the same round of SCC update. In https://reviews.llvm.org/rL309784, inline history is added to prevent infinite inlining across multiple run of inliner and SCC update, but the history will only be kept when new SCC is actually generated during SCC update. We found a case that SCC can be split and then merge into itself in the same round of SCC update, so the same SCC will be pop out from UR.CWorklist and then added back immediately, without any new SCC generated, that is why the existing patch cannot catch the infinite inline case. What the patch does is even if no new SCC is generated, if only the current SCC appears in UR.CWorklist again, then keep the inline history. Differential Revision: https://reviews.llvm.org/D52915 llvm-svn: 345103	2018-10-23 23:29:45 +00:00
Matthias Braun	4f82406c46	SelectionDAG: Reuse bigger sized constants in memset expansion. When implementing memset's today we often see this pattern: $x0 = MOV 0xXYXYXYXYXYXYXYXY store $x0, ... $w1 = MOV 0xXYXYXYXY store $w1, ... We first create a 64bit constant in a 64bit register with all bytes the same and then create a 32bit constant with all bytes the same in a 32bit register. In many targets we could just access the lower byte of the 64bit register instead. - Ideally this would be handled by the ConstantHoist pass but it runs too early when memset isn't expanded yet. - The memset expansion code already had this optimization implemented, however SelectionDAG constantfolding would constantfold the "trunc(bigconstnat)" pattern to "smallconstant". - This patch makes the memset expansion mark the constant as Opaque and stop DAGCombiner from constant folding in this situation. (Similar to how ConstantHoisting marks things as Opaque to avoid folding ADD/SUB/etc.) Differential Revision: https://reviews.llvm.org/D53181 llvm-svn: 345102	2018-10-23 23:19:23 +00:00
Lang Hames	23cb2e7f77	[ORC] Re-apply r345077 with fixes to remove ambiguity in lookup calls. llvm-svn: 345098	2018-10-23 23:01:39 +00:00
Teresa Johnson	7c6344a64f	Revert "[ThinLTO] Fix a crash in lazy loading of Metadata" This reverts commit r345095. It was accidentally committed. llvm-svn: 345097	2018-10-23 23:00:29 +00:00
Teresa Johnson	d725335bd1	[hot-cold-split] Only perform splitting in ThinLTO backend post-link Summary: Fix the new PM to only perform hot cold splitting once during ThinLTO, by skipping it in the pre-link phase. This was already fixed in the old PM by the move of the hot cold split pass later (after the early return when PrepareForThinLTO) by r344869. Reviewers: vsk, sebpop, hiraditya Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D53611 llvm-svn: 345096	2018-10-23 22:57:40 +00:00
Teresa Johnson	3513dc245e	[ThinLTO] Fix a crash in lazy loading of Metadata Summary: This is a revised version of D41474. When the debug location is parsed in BitcodeReader::parseFunction, the scope and inlinedAt MDNodes are obtained via MDLoader->getMDNodeFwdRefOrNull(), which will create a forward ref if they were not yet loaded. Specifically, if one of these MDNodes is in the module level metadata block, and this is during ThinLTO importing, that metadata block is lazily loaded. Most places in that invoke getMDNodeFwdRefOrNull have a corresponding call to resolveForwardRefsAndPlaceholders which will take care of resolving them. E.g. places that call getMetadataFwdRefOrLoad, or at the end of parsing a function-level metadata block, or at the end of the initial lazy load of module level metadata in order to handle invocations of getMDNodeFwdRefOrNull for named metadata and global object attachments. However, the calls for the scope/inlinedAt of debug locations are not backed by any such call to resolveForwardRefsAndPlaceholders. To fix this, change the scope and inlinedAt parsing to instead use getMetadataFwdRefOrLoad, which will ensure the forward refs to lazily loaded metadata are resolved. Fixes PR35472. Reviewers: dexonsmith, Sunil_Srivastava, vsk Subscribers: inglorion, eraman, steven_wu, sebpop, mehdi_amini, dmikulin, vsk, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D53596 llvm-svn: 345095	2018-10-23 22:57:21 +00:00
Fangrui Song	fa735b0eab	Actually fix test from r345085 REQUIRE: asserts llvm-svn: 345090	2018-10-23 22:07:34 +00:00
Fangrui Song	54b825cafe	Fix test after r345085 llvm-svn: 345089	2018-10-23 22:04:33 +00:00
Craig Topper	e01d516ac7	[X86] Autogenerate comple checks. NFC llvm-svn: 345087	2018-10-23 21:58:49 +00:00
Zhizhou Yang	13f76f84bc	Print out DebugCounter info with -print-debug-counter Summary: This patch will print out {Counter, Skip, StopAfter} info of all passes which have DebugCounter set at destruction. It can be used to monitor how many times does certain transformation happen in a pass, and also help check if -debug-counter option is set correctly. Please refer to this [[ http://lists.llvm.org/pipermail/llvm-dev/2018-July/124722.html \| thread ]] for motivation. Reviewers: george.burgess.iv, davide, greened Reviewed By: greened Subscribers: kristina, llozano, mgorny, llvm-commits, mgrang Differential Revision: https://reviews.llvm.org/D50031 llvm-svn: 345085	2018-10-23 21:51:56 +00:00
Jonas Devlieghere	3ef53e10d3	[dwarfdump] Make incompatibility between -diff and -verbose explicit. Using -diff and -verbose together doesn't work today. We should audit where these two options interact and fix them. In the meantime we error out when the user try to specify both. llvm-svn: 345084	2018-10-23 21:51:44 +00:00
Peter Collingbourne	abd820a92b	CGP: Clear data structures at the end of a loop iteration instead of the beginning. Clearing LargeOffsetGEPMap at the end fixes a bug where if a large offset GEP is in a dead basic block, we fail an assertion when trying to delete the block due to the asserting VH in LargeOffsetGEPMap. Differential Revision: https://reviews.llvm.org/D53464 llvm-svn: 345082	2018-10-23 21:23:18 +00:00
Reid Kleckner	db367e952e	Revert r345077 "[ORC] Change how non-exported symbols are matched during lookup." Doesn't build on Windows. The call to 'lookup' is ambiguous. Clang and MSVC agree, anyway. http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/787 C:\b\slave\clang-x64-windows-msvc\build\llvm.src\unittests\ExecutionEngine\Orc\CoreAPIsTest.cpp(315): error C2668: 'llvm::orc::ExecutionSession::lookup': ambiguous call to overloaded function C:\b\slave\clang-x64-windows-msvc\build\llvm.src\include\llvm/ExecutionEngine/Orc/Core.h(823): note: could be 'llvm::Expected<llvm::JITEvaluatedSymbol> llvm::orc::ExecutionSession::lookup(llvm::ArrayRef<llvm::orc::JITDylib *>,llvm::orc::SymbolStringPtr)' C:\b\slave\clang-x64-windows-msvc\build\llvm.src\include\llvm/ExecutionEngine/Orc/Core.h(817): note: or 'llvm::Expected<llvm::JITEvaluatedSymbol> llvm::orc::ExecutionSession::lookup(const llvm::orc::JITDylibSearchList &,llvm::orc::SymbolStringPtr)' C:\b\slave\clang-x64-windows-msvc\build\llvm.src\unittests\ExecutionEngine\Orc\CoreAPIsTest.cpp(315): note: while trying to match the argument list '(initializer list, llvm::orc::SymbolStringPtr)' llvm-svn: 345078	2018-10-23 20:54:43 +00:00
Lang Hames	841796decd	[ORC] Change how non-exported symbols are matched during lookup. In the new scheme the client passes a list of (JITDylib&, bool) pairs, rather than a list of JITDylibs. For each JITDylib the boolean indicates whether or not to match against non-exported symbols (true means that they should be found, false means that they should not). The MatchNonExportedInJD and MatchNonExported parameters on lookup are removed. The new scheme is more flexible, and easier to understand. This patch also updates JITDylib search orders to be lists of (JITDylib&, bool) pairs to match the new lookup scheme. Error handling is also plumbed through the LLJIT class to allow regression tests to fail predictably when a lookup from a lazy call-through fails. llvm-svn: 345077	2018-10-23 20:20:22 +00:00
Vedant Kumar	503154615d	[HotColdSplitting] Attach MinSize to outlined code Outlined code is cold by assumption, so it makes sense to optimize it for minimal code size rather than performance. After r344869 moved the splitting pass to the end of the IR pipeline, this does not result in much of a code size reduction. This is probably because a comparatively small number backend transforms make use of the MinSize hint. Running LNT on x86_64, I see that 33/1020 binaries shrink for a total of 919 bytes of TEXT reduction. I didn't measure a significant performance impact. Differential Revision: https://reviews.llvm.org/D53518 llvm-svn: 345072	2018-10-23 19:41:12 +00:00
Simon Pilgrim	b6c57075c0	[X86][SSE] Revert rL343922 combinePMULDQ AddToWorklist (PR39398) We can't add the MULDQ node back to the worklist after the demanded bits change has been committed in case the node has been removed entirely. This will have to wait until we have SimplifyDemandedBitsForTargetNode. llvm-svn: 345070	2018-10-23 19:07:53 +00:00
Jordan Rupprecht	aaeaa0a8b3	[llvm-strip] Support -s alias for --strip-all. Make both strip and objcopy case sensitive to support both -s (--strip-all) and -S (--strip-debug). Summary: GNU strip supports both `-s` and `-S` as aliases for `--strip-all` and `--strip-debug`, respectfully. As part of this, it turns out that strip/objcopy were accepting case insensitive command line args. I'm not sure if there was an explicit reason for this. The only others uses of this are llvm-cvtres/llvm-mt/llvm-lib, which are all tools specific for windows support. Forcing case sensitivity allows both aliases to exist, but seems like a good idea anyway. And as a surprise test case adjustment, the llvm-strip unit test was running with `-keep=unavailable_symbol`, despite `keep` not be a valid flag for strip. This is because there is a flag `-K` which, when case insensitivity is permitted, allows it to be interpreted as `-K` = `eep=unavailable_symbol` (e.g. to allow `-Kfoo` == `--keep-symbol=foo`). Reviewers: jakehehrlich, jhenderson, alexshap Reviewed By: jakehehrlich Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53163 llvm-svn: 345068	2018-10-23 18:46:33 +00:00
Simon Pilgrim	d705ba97dd	[LegalizeDAG] Share Vector/Scalar CTLZ Expansion As suggested on D53258, this patch shares common CTLZ expansion code between VectorLegalizer and SelectionDAGLegalize by putting it in TargetLowering. Extension to D53474 llvm-svn: 345060	2018-10-23 17:48:30 +00:00
Stefan Pintilie	927e8bf316	[Power9] Add __float128 support in the backend for bitcast to a i128 Add support to allow bit-casting from f128 to i128 and then extracting 64 bits from the result. Differential Revision: https://reviews.llvm.org/D49507 llvm-svn: 345053	2018-10-23 17:11:36 +00:00
Simon Pilgrim	f04a04c2b6	[TTI][X86] Treat SK_Transpose shuffles as SK_PermuteTwoSrc - there's no difference in lowering. llvm-svn: 345048	2018-10-23 16:45:26 +00:00
Jordan Rupprecht	2fed6ac186	[DebugInfo][GlobalOpt] Fix -debugify for globalopt shrinking globals to booleans. Summary: TryToShrinkGlobalToBoolean, when possible, will split store <value> + load <value> into store <bool> + select <bool ? value : 0>. This preserves DebugLoc during that pass. Fixes PR37959. The test case here is the simplified .ll for: ``` static int foo; int bar() { foo = 5; return foo; } ``` Reviewers: dblaikie, gbedwell, aprantl Reviewed By: dblaikie Subscribers: mehdi_amini, JDevlieghere, dexonsmith, llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D53531 llvm-svn: 345046	2018-10-23 16:35:51 +00:00
Simon Pilgrim	f1d8b7c49e	[CostModel][X86] Add transpose shuffle cost tests llvm-svn: 345045	2018-10-23 16:27:14 +00:00
Sanjay Patel	5b6b090cf2	[Reassociate] replace fake binop queries with 'match' API We need to update this code before introducing an 'fneg' instruction in IR, so we might as well kill off the integer neg/not queries too. This is no-functional-change-intended for scalar code and most vector code. For vectors, we can see that the 'match' API allows for undef elements in constants, so we optimize those cases better. Ideally, there would be a test for each code diff, but I don't see evidence of that for the existing code, so I didn't try very hard to come up with new vector tests for each code change. Differential Revision: https://reviews.llvm.org/D53533 llvm-svn: 345042	2018-10-23 15:55:06 +00:00
Simon Pilgrim	532a0f122e	[SLPVectorizer] Add basic support for mul/and/or/xor horizontal reductions Expand arithmetic reduction to include mul/and/or/xor instructions. This patch just fixes the SLPVectorizer - the effective reduction costs for AVX1+ are still poor (see rL344846) and will need to be improved before SLP sees this as a valid transform - but we can already see the effect on SSE2 tests. This partially helps PR37731, but doesn't fix it all as it still falls over on the extraction/reduction order for some reason. Differential Revision: https://reviews.llvm.org/D53473 llvm-svn: 345037	2018-10-23 15:13:09 +00:00
Sanjay Patel	747feb28e4	[InstCombine] use 'match' to handle vectors and simplify code This is another step towards completely removing the fake binop queries for not/neg/fneg. llvm-svn: 345036	2018-10-23 15:05:12 +00:00
Sanjay Patel	ad76c682c7	[InstCombine] swap select profile metadata when swapping select ops llvm-svn: 345034	2018-10-23 14:43:31 +00:00
Sanjay Patel	54da0057b9	[InstCombine] add/move tests for select with inverted condition; NFC The transform is broken in 2 ways - it doesn't correct metadata (or even drop it), and it doesn't work with vectors with undef elements. llvm-svn: 345033	2018-10-23 14:37:29 +00:00
Sanjay Patel	5141435d23	[SLSR] use 'match' to simplify code; NFC This pass could probably be modified slightly to allow vector splat transforms for practically no cost, but it only works on scalars for now. So the use of the newer 'match' API should make no functional difference. llvm-svn: 345030	2018-10-23 14:07:39 +00:00
Sanjay Patel	d3d2a0b591	[SLSR] auto-generate full test assertions; NFC llvm-svn: 345028	2018-10-23 13:39:40 +00:00
Roman Lebedev	06e4db07af	Experimental re-land of [X86][BMI1] X86DAGToDAGISel: select BEXTR from x << (32 - y) >> (32 - y) pattern This initially landed in rL345014, but was reverted in rL345017 due to sanitizer-x86_64-linux-fast buildbot failure in check-lld (ELF/relocatable-versioned.s) test. While i'm not yet quite sure what is the problem, one obvious thing here is that extra truncation roundtrip. Maybe that's it? If not, will re-revert. Differential Revision: https://reviews.llvm.org/D53521 llvm-svn: 345027	2018-10-23 13:19:31 +00:00
Simon Pilgrim	9a2ee1ea2d	Add BROADCAST shuffle cost tests. Part of a lot of cleanup necessary before PR39368. llvm-svn: 345025	2018-10-23 13:14:54 +00:00
Simon Pilgrim	051feee5c2	Add BROADCAST shuffle cost tests. Part of a lot of cleanup necessary before PR39368. llvm-svn: 345023	2018-10-23 13:00:22 +00:00
Roman Lebedev	c29dbbdb10	Revert "[X86][BMI1] X86DAGToDAGISel: select BEXTR from x << (32 - y) >> (32 - y) pattern" Seems to be breaking sanitizer-x86_64-linux-fast buildbot, the ELF/relocatable-versioned.s test: ==17758==MemorySanitizer CHECK failed: /b/sanitizer-x86_64-linux-fast/build/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_allocator.cc:191 "((kBlockMagic)) == ((((u64)addr)[0]))" (0x6a6cb03abcebc041, 0x0) #0 0x59716b in MsanCheckFailed(char const, int, char const, unsigned long long, unsigned long long) /b/sanitizer-x86_64-linux-fast/build/llvm/projects/compiler-rt/lib/msan/msan.cc:393 #1 0x586635 in __sanitizer::CheckFailed(char const, int, char const, unsigned long long, unsigned long long) /b/sanitizer-x86_64-linux-fast/build/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_termination.cc:79 #2 0x57d5ff in __sanitizer::InternalFree(void, __sanitizer::SizeClassAllocatorLocalCache<__sanitizer::SizeClassAllocator32<__sanitizer::AP32> >*) /b/sanitizer-x86_64-linux-fast/build/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_allocator.cc:191 #3 0x7fc21b24193f (/lib/x86_64-linux-gnu/libc.so.6+0x3593f) #4 0x7fc21b241999 in exit (/lib/x86_64-linux-gnu/libc.so.6+0x35999) #5 0x7fc21b22c2e7 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e7) #6 0x57c039 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/lld+0x57c039) This reverts commit r345014. llvm-svn: 345017	2018-10-23 10:34:57 +00:00
Roman Lebedev	1c95b2f779	[X86][BMI1] X86DAGToDAGISel: select BEXTR from x << (32 - y) >> (32 - y) pattern Summary: Continuation of D52348. We also get the `c) x & (-1 >> (32 - y))` pattern here, because of the D48768. I will add extra-uses into those tests and follow-up with a patch to handle those patterns too. Reviewers: RKSimon, craig.topper Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53521 llvm-svn: 345014	2018-10-23 09:08:44 +00:00
Craig Topper	f50f086743	[X86] Regenerate test checks to show fma comments. NFC llvm-svn: 344999	2018-10-23 04:18:08 +00:00
Lang Hames	776f1d50c8	[RuntimeDyld][COFF] Skip non-loaded sections when calculating ImageBase. Non-loaded sections (whose unused load-address defaults to zero) should not be taken into account when calculating ImageBase, or ImageBase will be incorrectly set to 0. Patch by Andrew Scheidecker. Thanks Andrew! https://reviews.llvm.org/D51343 + // The Sections list may contain sections that weren't loaded for + // whatever reason: they may be debug sections, and ProcessAllSections + // is false, or they may be sections that contain 0 bytes. If the + // section isn't loaded, the load address will be 0, and it should not + // be included in the ImageBase calculation. llvm-svn: 344995	2018-10-23 01:36:33 +00:00
Kostya Serebryany	af95597c3c	[hwasan] add stack frame descriptions. Summary: At compile-time, create an array of {PC,HumanReadableStackFrameDescription} for every function that has an instrumented frame, and pass this array to the run-time at the module-init time. Similar to how we handle pc-table in SanitizerCoverage. The run-time is dummy, will add the actual logic in later commits. Reviewers: morehouse, eugenis Reviewed By: eugenis Subscribers: srhines, llvm-commits, kubamracek Differential Revision: https://reviews.llvm.org/D53227 llvm-svn: 344985	2018-10-23 00:50:40 +00:00
Heejin Ahn	a40303aa03	[WebAssembly] Fix assembly printing of br_table Summary: In `br_table's stack version asm string, \t was missing. Reviewers: aardappel Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D53516 llvm-svn: 344981	2018-10-23 00:28:14 +00:00
Wouter van Oortmerssen	a569c20587	[WebAssembly] Added test for inline assembly roundtrip. Summary: Due to previous work to make WebAssembly MC by default stack-only inline assembly now "just works" (previously it didn't since it had no way to know types of registers), so no further work required. So far we only have tests (in inline-asm.ll) which test with non-existing instructions, so this adds a test that roundtrips both the inline assembly and its surrounding code thru the assembler. Reviewers: dschuff, sunfish Subscribers: sbc100, jgravelle-google, eraman, aheejin, llvm-commits Differential Revision: https://reviews.llvm.org/D52914 llvm-svn: 344977	2018-10-23 00:12:49 +00:00
Leonard Chan	0acfc6be38	[Intrinsic] Unigned Saturation Addition Intrinsic Add an intrinsic that takes 2 integers and perform unsigned saturation addition on them. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Differential Revision: https://reviews.llvm.org/D53340 llvm-svn: 344971	2018-10-22 23:08:40 +00:00
Matthias Braun	a0beeffeed	X86: Do not optimize branches with undef eflags inputs analyzeBranch()/insertBranch() etc. do not properly deal with an undef flag on the eflags input and used to produce invalid MIR. I don't see this ever affecting real world inputs (I don't think it is possible to produce undef flags with llvm IR), so I simply changed the code to bail out in this case. rdar://42122367 llvm-svn: 344970	2018-10-22 22:52:23 +00:00
Sanjay Patel	767625400d	[Reassociate] remove bogus tests; NFC I was trying to provide test coverage for D53533 with rL344964, but these don't do it...and I don't think they add any value, so deleting. llvm-svn: 344969	2018-10-22 22:50:27 +00:00
Simon Pilgrim	8c3d87b8cf	[ARM] Regenerate reverse shuffle costs Came about while cleaning up general shuffle costs for PR39368 llvm-svn: 344966	2018-10-22 22:26:00 +00:00
Craig Topper	c8e183f9ee	Recommit r344877 "[X86] Stop promoting integer loads to vXi64" I've included a fix to DAGCombiner::ForwardStoreValueToDirectLoad that I believe will prevent the previous miscompile. Original commit message: Theoretically this was done to simplify the amount of isel patterns that were needed. But it also meant a substantial number of our isel patterns have to match an explicit bitcast. By making the vXi32/vXi16/vXi8 types legal for loads, DAG combiner should be able to change the load type to rem I had to add some additional plain load instruction patterns and a few other special cases, but overall the isel table has reduced in size by ~12000 bytes. So it looks like this promotion was hurting us more than helping. I still have one crash in vector-trunc.ll that I'm hoping @RKSimon can help with. It seems to relate to using getTargetConstantFromNode on a load that was shrunk due to an extract_subvector combine after the constant pool entry was created. So we end up decoding more mask elements than the lo I'm hoping this patch will simplify the number of patterns needed to remove the and/or/xor promotion. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits, RKSimon Differential Revision: https://reviews.llvm.org/D53306 llvm-svn: 344965	2018-10-22 22:14:05 +00:00

1 2 3 4 5 ...

56881 Commits