llvm-project

Commit Graph

Author	SHA1	Message	Date
Jonas Devlieghere	27126f5260	[Support] Add color cl category. This commit adds a color category so tools can document this option and enables it for dwarfdump and dsymutil. rdar://problem/40498996 llvm-svn: 333176	2018-05-24 11:36:57 +00:00
Jonas Paulsson	7bcfeab4f2	[ScheduleDAGInstrs / buildSchedGraph] Clear subregister entries also. In addPhysRegDeps, subregister entries of the defined register were previously not removed from Uses or Defs, which resulted in extra redundant edges for subregs around the register definition. This is principally NFC (in very rare cases some node got a different height). This makes the DAG more readable and efficient in some cases. Review: Andy Trick https://reviews.llvm.org/D46838 llvm-svn: 333165	2018-05-24 08:38:06 +00:00
Simon Atanasyan	f6b0c93fb3	[mips] Remove duplicated code from the expandLoadInst. NFC llvm-svn: 333164	2018-05-24 07:36:18 +00:00
Simon Atanasyan	a188267f0a	[mips] Remove redundant argument from expandLoadInst/expandStoreInst. NFC llvm-svn: 333163	2018-05-24 07:36:11 +00:00
Simon Atanasyan	be8a42efe2	[mips] Add precondition asserts to the expandLoadInst/expandStoreInst. NFC llvm-svn: 333162	2018-05-24 07:36:06 +00:00
Simon Atanasyan	478220f1fc	[mips] Cleanup the code a bit. NFC llvm-svn: 333161	2018-05-24 07:36:00 +00:00
Fangrui Song	79420acb96	[demangler] Add ItaniumPartialDemangler::isCtorOrDtor Reviewers: erik.pilkington, ruiu, echristo, pcc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47248 llvm-svn: 333159	2018-05-24 06:57:57 +00:00
Shiva Chen	43bfe84451	[RISCV] Support linker relax function call from auipc and jalr to jal To do this: 1. Add fixup_riscv_relax fixup types which eventually will transfer to R_RISCV_RELAX relocation types. 2. Insert R_RISCV_RELAX relocation types to auipc function call expression when linker relaxation enabled. Differential Revision: https://reviews.llvm.org/D44886 llvm-svn: 333158	2018-05-24 06:21:23 +00:00
Karl-Johan Karlsson	478232d52f	[NaryReassociate] Detect deleted instr with WeakVH Summary: If NaryReassociate succeeds it will, when replacing the old instruction with the new instruction, also recursively delete trivially dead instructions from the old instruction. However, if the input to the NaryReassociate pass contains dead code it is not safe to recursively delete trivially dead instructions as it might lead to deleting the newly created instruction. This patch will fix the problem by using WeakVH to detect this rare case, when the newly created instruction is dead, and it will then restart the basic block iteration from the beginning. This fixes pr37539 Reviewers: tra, meheff, grosser, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47139 llvm-svn: 333155	2018-05-24 06:09:02 +00:00
Tom Stellard	1b95fed6f7	AMDGPU/R600: Remove code for handling AMDGPUISD::CLAMP Summary: We don't generate AMDGPUISD::CLAMP for R600 now that llvm.AMDGPU.clamp is gone. Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D47181 llvm-svn: 333153	2018-05-24 05:28:34 +00:00
Andres Freund	361941283f	Revert r333147 "[ORC] Add findSymbolIn() wrapper to C bindings." This reverts r333147 until https://reviews.llvm.org/D47308 is ready to be reviewed. r333147 exposed a behavioural difference between OrcCBindingsStack::findSymbolIn() and OrcCBindingsStack::findSymbol(), where only the latter does name mangling. After r333147 that causes a test failure on OSX, because the new test looks for main using findSymbolIn() but the mangled name is _main. llvm-svn: 333152	2018-05-24 05:10:19 +00:00
Lei Huang	f4ec67822f	[PowerPC] Remove the match pattern in the definition of LXSDX/STXSDX The match pattern in the definition of LXSDX is xoaddr, so the Pseudo instruction XFLOADf64 never gets selected. XFLOADf64 expands to LXSDX/LFDX post RA based on the register pressure. To avoid ambiguity, we need to remove the select pattern for LXSDX, same as what was done for LXSD. STXSDX also has the same issue. Patch by Qing Shan Zhang (steven.zhang). Differential Revision: https://reviews.llvm.org/D47178 llvm-svn: 333150	2018-05-24 03:20:28 +00:00
Andres Freund	b0b67b07f5	[ORC] Add findSymbolIn() wrapper to C bindings. In many cases JIT users will know in which module a symbol resides. Avoiding to search other modules can be more efficient. It also allows to handle duplicate symbol names between modules. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D44889 llvm-svn: 333147	2018-05-24 01:01:42 +00:00
Mandeep Singh Grang	ddcb95664e	[RISCV] Lower the tail pseudoinstruction This patch lowers the tail pseudoinstruction. This has been modeled after ARM's tail call opt. llvm-svn: 333137	2018-05-23 22:44:08 +00:00
Vedant Kumar	9374c0432b	[DebugInfo] Maintain DI for sunken bitcasts When a bitcast is being sunk in -codegenprepare pass, its DI wasn't copied over to the newly created instruction. This patch fixes that bug. Patch by Kareem Ergawy! Differential Revision: https://reviews.llvm.org/D47282 llvm-svn: 333133	2018-05-23 22:03:48 +00:00
Sameer AbuAsal	eadce02741	[RISCV] Set CostPerUse for registers Summary: Set CostPerUse higher for registers that are not used in the compressed instruction set. This will influence the greedy register allocator to reduce the use of registers that can't be encoded in 16 bit instructions. This affects register allocation even when compressed instruction isn't targeted, we see no major negative codegen impact. Reviewers: asb Reviewed By: asb Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, apazos, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang Differential Revision: https://reviews.llvm.org/D47039 llvm-svn: 333132	2018-05-23 21:34:30 +00:00
Lang Hames	4c4a2ba353	[RuntimeDyld][MachO] Add support for MachO::ARM64_RELOC_POINTER_TO_GOT reloc. llvm-svn: 333130	2018-05-23 21:27:07 +00:00
Lang Hames	5216ac9685	[LKH] Add a new IRTransformLayer. llvm-svn: 333129	2018-05-23 21:27:07 +00:00
Lang Hames	85642262b2	[LKH] Add ObjectTransformLayer2. llvm-svn: 333128	2018-05-23 21:27:06 +00:00
Lang Hames	4caa2f70ac	[LKH] Add a new IRCompileLayer. llvm-svn: 333127	2018-05-23 21:27:01 +00:00
Roman Tereshin	13229aff54	[GlobalISel] NFCI, Getting GlobalISel ~5% faster by replacing DenseMap with IndexedMap for LLTs within MRI, as benchmarked by cross-compiling sqlite3 amalgamation for AArch64 on x86 machine. Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D46809 llvm-svn: 333125	2018-05-23 21:12:02 +00:00
Lei Huang	8b0da65bfb	[Power9]Legalize and emit code for W vector extract and convert to QP Implement patterns to extract [Un]signed Word vector element and convert to quad-precision. Differential Revision: https://reviews.llvm.org/D46536 llvm-svn: 333115	2018-05-23 19:31:54 +00:00
Lei Huang	8990168a45	[Power9]Legalize and emit code for DW vector extract and convert to QP Implement patterns to extract [Un]signed DWord vector element and convert to quad-precision. Differential Revision: https://reviews.llvm.org/D46333 llvm-svn: 333112	2018-05-23 18:36:51 +00:00
Changpeng Fang	5f9154618e	StructurizeCFG: Adjust the loop depth for a subregion to order the nodes correctly Summary: StructurizeCFG::orderNodes basically uses a reverse post-order (RPO) traversal of the region list to get the order. The only problem with it is that sometimes backedges for outer loops will be visited before backedges for inner loops. To solve this problem, a loop depth based approach has been used to make sure all blocks in this loop have been visited before moving on to the outer loop. However, we found a problem for a SubRegion which is a loop itself: --> BB1 --> BB2 --> BB3 --> In this case, BB2 is a SubRegion (loop), and thus its loopdepth is different than that of BB1 and BB3. This fact will lead BB2 to be placed in the wrong order. In this work, we treat the SubRegion as a special case and use its exit block to determine the loop and its depth to guard the sorting. Reviewers: arsenm, jlebar Differential Revision: https://reviews.llvm.org/D46912 llvm-svn: 333111	2018-05-23 18:34:48 +00:00
Chad Rosier	3f66363139	[CodeGen][AArch64] Use RegUnits to track register aliases. (NFC) Use RegUnits to track register aliases in AArch64RedundantCopyElimination. Differential Revision: https://reviews.llvm.org/D47269 llvm-svn: 333107	2018-05-23 17:49:38 +00:00
Roman Lebedev	6b6c553bb8	[InstCombine] Fold unfolded masked merge pattern with variable mask! Summary: Finally fixes [[ https://bugs.llvm.org/show_bug.cgi?id=6773 | PR6773 ]]. Now that the backend is all done, we can finally fold it! The canonical unfolded masked merge pattern is ```(x & m) | (y & ~m)``` There is a second, equivalent variant: ```(x | ~m) & (y | m)``` Only one of them (the or-of-and's i think) is canonical. And if the mask is not a constant, we should fold it to: ```((x ^ y) & M) ^ y``` https://rise4fun.com/Alive/ndQw Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: nicholas, RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D46814 llvm-svn: 333106	2018-05-23 17:47:52 +00:00
Jakub Kuderski	ef33edd9b5	[Dominators] Add PDT constructor from Function Summary: This patch adds a PDT constructor from Function and lets codes previously using a local class to do this use PostDominatorTree class directly. Reviewers: davide, kuhar, grosser, dberlin Reviewed By: kuhar Author: NutshellySima Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46709 llvm-svn: 333102	2018-05-23 17:29:21 +00:00
Craig Topper	3b768e8602	[InstCombine] Negate ABS/NABS patterns by swapping the select operands to remove the negation Differential Revision: https://reviews.llvm.org/D47236 llvm-svn: 333101	2018-05-23 17:29:03 +00:00
Petar Jovanovic	7d37bb42a1	Silence warnings introduced with r333093 r333093 introduced several warnings (-Wlogical-not-parentheses, -Wbool-compare). Adding parentheses in MipsSEInstrInfo::isCopyInstr() to silence it. llvm-svn: 333097	2018-05-23 16:27:51 +00:00
Petar Jovanovic	c051000b83	[X86][MIPS][ARM] New machine instruction property 'isMoveReg' This property is needed in order to follow values movement between registers. This property is used in TII to implement method that returns true if simple copy like instruction is recognized, along with source and destination machine operands. Patch by Nikola Prica. Differential Revision: https://reviews.llvm.org/D45204 llvm-svn: 333093	2018-05-23 15:28:28 +00:00
Nicola Zaghen	03d0b91f43	Remove DEBUG macro. Now that the LLVM_DEBUG() macro landed on the various sub-projects the DEBUG macro can be removed. Also change the new uses of DEBUG to LLVM_DEBUG. Differential Revision: https://reviews.llvm.org/D46952 llvm-svn: 333091	2018-05-23 15:09:29 +00:00
Alex Bradbury	257d5b5639	[RISCV] Add symbol diff relocation support for RISC-V For RISC-V it is desirable to have relaxation happen in the linker once addresses are known, and as such the size between two instructions/byte sequences in a section could change. For most assembler expressions, this is fine, as the absolute address results in the expression being converted to a fixup, and finally relocations. However, for expressions such as .quad .L2-.L1, the assembler folds this down to a constant once fragments are laid out, under the assumption that the difference can no longer change, although in the case of linker relaxation the differences can change at link time, so the constant is incorrect. One place where this commonly appears is in debug information, where the size of a function expression is in a form similar to the above. This patch extends the assembler to allow an AsmBackend to declare that it does not want the assembler to fold down this expression, and instead generate a pair of relocations that allow the linker to carry out the calculation. In this case, the expression is not folded, but when it comes to emitting a fixup, the generic FK_Data_* fixups are converted into a pair, one for the addition half, one for the subtraction, and this is passed to the relocation generating methods as usual. I have named these FK_Data_Add_* and FK_Data_Sub_* to indicate which half these are for. For RISC-V, which supports this via e.g. the R_RISCV_ADD64, R_RISCV_SUB64 pair of relocations, these are also set to always emit relocations relative to local symbols rather than section offsets. This is to deal with the fact that if relocations were calculated on e.g. .text+8 and .text+4, the result 12 would be stored rather than 4 as both addends are added in the linker. Differential Revision: https://reviews.llvm.org/D45181 Patch by Simon Cook. llvm-svn: 333079	2018-05-23 12:36:18 +00:00
Alex Bradbury	3fa69dd055	[Sparc] Use addAliasForDirective to support data directives The Sparc asm parser currently has custom parsing logic for .half, .word, .nword and .xword. Rather than use this custom logic, we can just use addAliasForDirective to enable the reuse of AsmParser::parseDirectiveValue. https://reviews.llvm.org/D47003 llvm-svn: 333078	2018-05-23 11:20:28 +00:00
Alex Bradbury	0a59f18951	[AArch64] Use addAliasForDirective to support data directives The AArch64 asm parser currently has custom parsing logic for .hword, .word, and .xword. Rather than use this custom logic, we can just use addAliasForDirective to enable the reuse of AsmParser::parseDirectiveValue. Differential Revision: https://reviews.llvm.org/D47000 llvm-svn: 333077	2018-05-23 11:17:20 +00:00
Alex Bradbury	1c010d0fa4	[RISCV] Correctly report sizes for builtin fixups This is a different approach to fixing the problem described in D46746. RISCVAsmBackend currently depends on the getSize helper function returning the number of bytes a fixup may change (note: some other backends have a similar helper named getFixupNumKindBytes). As noted in that review, this doesn't return the correct size for FK_Data_1, FK_Data_2, or FK_Data_8 meaning that too few bytes will be written in the case of FK_Data_8, and there's the potential of writing outside the Data array for the smaller fixups. D46746 extends getSize to recognise some of the builtin fixup types. Rather than having a function that needs to be kept up to date as new builtin or target-specific fixups are added, We can calculate an appropriate bound on the number of bytes that might be touched using Info.TargetSize and Info.TargetOffset. Differential Revision: https://reviews.llvm.org/D46965 llvm-svn: 333076	2018-05-23 10:53:56 +00:00
Max Kazantsev	d99f3bacb4	[LoopUnswitch] Fix SCEV invalidation in unswitching Loop unswitching makes substantial changes to a loop that can also affect cached SCEV info in its outer loops as well, but it only cares to invalidate SCEV cache for the innermost loop in case of full unswitching and does not invalidate anything at all in case of trivial unswitching. As result, we may end up with incorrect data in cache. Differential Revision: https://reviews.llvm.org/D46045 Reviewed By: mzolotukhin llvm-svn: 333072	2018-05-23 10:09:53 +00:00
Piotr Padlewski	d6f7346a4b	Fix aliasing of launder.invariant.group Summary: Patch for capture tracking broke bootstrap of clang with -fstrict-vtable-pointers which resulted in debugging nightmare. It was fixed in https://reviews.llvm.org/D46900 but as it turned out, there were other parts like inliner (computing of noalias metadata) that I found after bootstrapping with enabled assertions. Reviewers: hfinkel, rsmith, chandlerc, amharc, kuhar Subscribers: JDevlieghere, eraman, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D47088 llvm-svn: 333070	2018-05-23 09:16:44 +00:00
Daniel Cederman	6356571ec0	[Sparc] Add mnemonic aliases for flush, stb, stba, sth, and stha Reviewers: jyknight Reviewed By: jyknight Subscribers: fedor.sergeev, jrtc27, llvm-commits Differential Revision: https://reviews.llvm.org/D47140 llvm-svn: 333068	2018-05-23 08:26:49 +00:00
Serguei Katkov	46ef8fffdf	SafepointIRVerifier is made unreachable block tolerant SafepointIRVerifier crashed while traversing blocks without a DomTreeNode. This could happen with a custom pipeline or when some optional passes were skipped by OptBisect. SafepointIRVerifier is fixed to traverse basic blocks that are reachable from entry. Tests are added. Patch Author: Yevgeny Rouban! Reviewers: anna, reames, dneilson, DaniilSuchkov, skatkov Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47011 llvm-svn: 333063	2018-05-23 05:54:55 +00:00
Roman Tereshin	e79d656c33	[GlobalISel][ARM] Adding HPR and QPR regclasses to FPRB regbank Also bringing ARMRegisterBankInfo::getRegBankFromRegClass implementation up to speed with the *.td-definition. Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D43982 llvm-svn: 333056	2018-05-23 02:59:31 +00:00
Heejin Ahn	1e4d35044f	[WebAssembly] Add functions for EHScopes Summary: There are functions using the term 'funclet' to refer to both 1. an EH scopes, the structure of BBs that starts with catchpad/cleanuppad and ends with catchret/cleanupret, and 2. a small function that gets outlined in AsmPrinter, which is the original meaning of 'funclet'. So far the two have been the same thing; EH scopes are always outlined in AsmPrinter as funclets at the end of the compilation pipeline. But now wasm also uses scope-based EH but does not outline those, so we now need to correctly distinguish those two use cases in functions. This patch splits `MachineBasicBlock::isFuncletEntry` into `isFuncletEntry` and `isEHScopeEntry`, and `MachineFunction::hasFunclets` into `hasFunclets` and `hasEHScopes`, in order to distinguish the two different use cases. And this also changes some uses of the term 'funclet' to 'scope' in `getFuncletMembership` and change the function name to `getEHScopeMembership` because this function is not about outlined funclets but about EH scope memberships. This change is in the same vein as D45559. Reviewers: majnemer, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D47005 llvm-svn: 333045	2018-05-23 00:32:46 +00:00
Sanjay Patel	4b96935bd7	[InstCombine] use nsw negation for abs libcalls Also, produce the canonical IR abs (s<0) to be more efficient. This is the libcall equivalent of the clang builtin change from: rL333038 Pasting from that commit message: The stdlib functions are defined in section 7.20.6.1 of the C standard with: "If the result cannot be represented, the behavior is undefined." That lets us mark the negation with 'nsw' because "sub i32 0, INT_MIN" would be UB/poison. llvm-svn: 333042	2018-05-22 23:29:40 +00:00
David Bolvansky	1f343fa0e0	[InstCombine] Remove calloc transformations Summary: Previous patch does not care if a value is changed between calloc and strlen. This needs to be removed from InstCombine and maybe moved to DSE later after some rework. Reviewers: efriedma Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47218 llvm-svn: 333022	2018-05-22 20:27:36 +00:00
Matt Arsenault	606bc315d6	AMDGPU: Fix v2f16 fneg/fabs pattern The integer operation conversion for some reason only happens if the source is a bitcast from an integer, which happens to always be the situation when the result is loaded. Add an additional pattern for when the source operation is really an FP operation. llvm-svn: 333019	2018-05-22 20:13:34 +00:00
Eli Friedman	785acce51d	Delete unused variable from r333015. (The assertion suppressed the unused variable warning on Release+Asserts builds, so I didn't notice.) llvm-svn: 333018	2018-05-22 19:38:07 +00:00
Tom Stellard	b12f4dec08	AMDGPU: Move AMDGPUTargetLowering::isFPExtFoldable() into SITargetLowering Summary: This is always false for R600. Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D47180 llvm-svn: 333016	2018-05-22 19:37:55 +00:00
Eli Friedman	042dc9e092	[MachineOutliner] Add "thunk" outlining for AArch64. When we're outlining a sequence that ends in a call, we can save up to three instructions in the outlined function by turning the call into a tail-call. I refer to this as thunk outlining because the resulting outlined function looks like a thunk; suggestions welcome for a better name. In addition to making the outlined function shorter, thunk outlining allows outlining calls which would otherwise be illegal to outline: we don't need to save/restore LR, so we don't need to prove anything about the stack access patterns of the callee. To make this work effectively, I also added MachineOutlinerInstrType::LegalTerminator to the generic MachineOutliner code; this allows treating an arbitrary instruction as a terminator in the suffix tree. Differential Revision: https://reviews.llvm.org/D47173 llvm-svn: 333015	2018-05-22 19:11:06 +00:00
Krzysztof Parzyszek	840b02bccf	[Hexagon] Add patterns for accumulating HVX compares llvm-svn: 333009	2018-05-22 18:27:02 +00:00
Florian Hahn	a6e63f176c	[NewGVN] Fix handling of assumes This patch fixes two bugs: * test1: Previously assume(a >= 5) concluded that a == 5. That's only valid for assume(a == 5)... * test2: If operands were swapped, additional users were added to the wrong cmp operand. This resulted in an "unsettled iteration" assertion failure. Patch by Nikita Popov Differential Revision: https://reviews.llvm.org/D46974 llvm-svn: 333007	2018-05-22 17:38:22 +00:00
Jonas Devlieghere	63eca15e95	[DebugInfo] Invert DIE order for range errors. When printing an error for an invalid address range in a DIE, we used to print the child above the parent, which is counter intuitive. This patch reverses the order and indents the child to mimic the way we print the debug info section. llvm-svn: 333006	2018-05-22 17:38:03 +00:00
Jonas Devlieghere	7e0b023302	[DebugInfo] Fix location list check in the verifier We weren't properly verifying location lists because we tried obtaining the offset as a constant. llvm-svn: 333005	2018-05-22 17:37:27 +00:00
Paul Robinson	543c0e1d50	[DWARFv5] Put the DWO ID in its place. In DWARF v5, the DWO ID is in the (split/skeleton) CU header, not an attribute on the CU DIE. This changes the size of those headers, so use the parsed size whenever we have one, for simplicity. Differential Revision: https://reviews.llvm.org/D47158 llvm-svn: 333004	2018-05-22 17:27:31 +00:00
Lang Hames	5261aa9f91	[ORC] Move symbol-scanning and discard from BasicIRLayerMaterializationUnit in to a base class (IRMaterializationUnit). The new class, IRMaterializationUnit, provides a convenient base for any client that wants to write a materializer for LLVM IR. llvm-svn: 332993	2018-05-22 16:15:38 +00:00
David Bolvansky	41f4b64ee1	[InstCombine] Calloc-ed strings optimizations Summary: Example cases: strlen(calloc(...)) -> 0 Reviewers: efriedma, bkramer Reviewed By: bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47059 llvm-svn: 332990	2018-05-22 15:41:23 +00:00
Aleksandar Beserminji	a5f755186a	[mips] Merge MipsLongBranch and MipsHazardSchedule passes MipsLongBranchPass and MipsHazardSchedule passes are joined to one pass because of mutual conflict. When MipsHazardSchedule inserts 'nop's, it potentially breaks some jumps, so they have to be expanded to long branches. When some branch is expanded to long branch, it potentially creates a hazard situation, which should be fixed by adding nops. New pass is called MipsBranchExpansion, it combines these two passes, and runs them alternately until one of them reports no changes were made. Differential Revision: https://reviews.llvm.org/D46641 llvm-svn: 332977	2018-05-22 13:24:38 +00:00
Simon Dardis	437153bb80	[mips] Correct the predicates of the cache and pref instructions Reviewers: atanasyan, abeserminji, smaksimovic Differential Revision: https://reviews.llvm.org/D46949 llvm-svn: 332970	2018-05-22 10:55:05 +00:00
Simon Pilgrim	4162d77744	[TTI] Add uniform/non-uniform constant Pow2 detection to TargetTransformInfo::getInstructionThroughput This enables us to detect more fast path sdiv cases under cost analysis. This patch also enables us to handle non-uniform-constant pow2 cases for X86 SDIV costs. Found while working on D46276 Future patches can then extend the vectorizers to more fully support non-uniform pow2 cases. Differential Revision: https://reviews.llvm.org/D46637 llvm-svn: 332969	2018-05-22 10:40:09 +00:00
Karl-Johan Karlsson	11d68a619e	[LowerSwitch] Fixed faulty PHI node update Summary: When lowerswitch merges several cases into a new default block it's not updating the PHI nodes accordingly. The code that updates the PHI nodes for the default edge only updates the first entry and does not remove the remaining ones to make sure the number of entries matches the number of predecessors. This is easily fixed by replacing the code that updates the PHI node with the already existing utility function for updating PHI nodes. Reviewers: hans, reames, arsenm Reviewed By: arsenm Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D47055 llvm-svn: 332960	2018-05-22 08:46:48 +00:00
Bjorn Pettersson	fecef6be9e	[LoopVersioning] Don't modify the list that we iterate over in addPHINodes Summary: In LoopVersioning::addPHINodes we need to iterate over all users for a value "Inst", and if the user is outside of the VersionedLoop we should replace the use of "Inst" by using the value "PN" instead. Replacing the use of "Inst" for a user of "Inst" also means that Inst->users() is modified. So it is not safe to do the replace while iterating over Inst->users() as we used to do. This patch splits the task into two steps. First we iterate over Inst->users() to find all users that should be updated. Those users are saved into a local data structure on the stack. And then, in the second step, we do the actual updates. This time iterating over the local data structure. Reviewers: mzolotukhin, anemet Reviewed By: mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47134 llvm-svn: 332958	2018-05-22 08:33:02 +00:00
Stanislav Mekhanoshin	0e132dca53	[AMDGPU] Optimize old value of v_mov_b32_dpp We can eliminate old value if bound_ctrl = 1 and row_mask = bank_mask = 0xf. This is an alternative implementation working with the intrinsic in InstCombine. Original review for past-ISel optimization: D46570. Differential Revision: https://reviews.llvm.org/D46596 llvm-svn: 332956	2018-05-22 08:04:33 +00:00
Matt Arsenault	1349a04ef5	AMDGPU: Make v2i16/v2f16 legal on VI This usually results in better code. Fixes using inline asm with short2, and also fixes having a different ABI for function parameters between VI and gfx9. Partially cleans up the mess used for lowering of the d16 operations. Making v4f16 legal will help clean this up more, but this requires additional work. llvm-svn: 332953	2018-05-22 06:32:10 +00:00
Dan Gohman	b81848272d	[WebAssembly] Fix fast-isel lowering illegal argument and return types. For both argument and return types, promote illegal types like i24 to i32, and if a type can't be easily promoted, clear out the signature before bailing out, to avoid leaving it in a partially complete state. Fixes PR37546. llvm-svn: 332947	2018-05-22 04:58:36 +00:00
Tom Stellard	44b30b4537	AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers Summary: MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instruction and register definitions, which are huge so we only want to include them where needed. This will also make it easier if we want to split the R600 and GCN definitions into separate tablegenerated files. I was unable to remove AMDGPUMCTargetDesc.h from SIMachineFunctionInfo.h because it uses some enums from the header to initialize default values for the SIMachineFunction class, so I ended up having to remove includes of SIMachineFunctionInfo.h from headers too. Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: MatzeB, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46272 llvm-svn: 332930	2018-05-22 02:03:23 +00:00
Peter Collingbourne	2d3eaeb2d4	MC: Remove dead code. NFCI. This code appears to have been copied from the mach-o streamer. It has no effect in ELF because indirect symbols are specific to mach-o. llvm-svn: 332926	2018-05-22 01:20:46 +00:00
Sanjay Patel	17a870f07c	[DAG] fold FP binops with undef operands to NaN This is the FP sibling of D43141 with the corresponding IR change in rL327212. We can't propagate undef here because if a variable operand is a NaN, these binops must propagate NaN. Neither global nor node-level fast-math makes a difference. If we have 'nnan', I think later folds can turn the NaN into undef. The tests in X86/fp-undef.ll are meant to be the definitive verification for these folds - everything reduces identically now. The other test changes are collateral damage. They may need to be altered to preserve their intent. Differential Revision: https://reviews.llvm.org/D47026 llvm-svn: 332920	2018-05-21 23:54:19 +00:00
Lang Hames	373f4628a5	[LKH] Add a replacement RTDyldLayer. llvm-svn: 332918	2018-05-21 23:45:40 +00:00
Craig Topper	358b094971	[X86] Remove 128/256-bit cvtdq2ps, cvtudq2ps, cvtqq2pd, cvtuqq2pd intrinsics. These can all be implemented with sitofp/uitofp instructions. llvm-svn: 332916	2018-05-21 23:15:00 +00:00
Diego Caballero	1bd5f2261d	Fix warning from r332654 with LLVM_ATTRIBUTE_USED r332654 tried to fix an unused function warning with a void cast. This approach worked for clang and gcc but not for MSVC. This commit replaces the void cast with the LLVM_ATTRIBUTE_USED approach. llvm-svn: 332910	2018-05-21 22:12:38 +00:00
Roman Lebedev	9f65d16d5d	[DAGCombiner] isAllOnesConstantOrAllOnesSplatConstant(): look through bitcasts Summary: As pointed out in D46528, we erroneously transform cases like `xor X, -1`, even though we use said function. It's because the `-1` is actually a bitcast there. So i think we can just look through it in the function. Differential Revision: https://reviews.llvm.org/D47156 llvm-svn: 332905	2018-05-21 21:41:10 +00:00
Roman Lebedev	7772de25d0	[DAGCombine][X86][AArch64] Masked merge unfolding: vector edition. Summary: This appears to be the last missing piece for the masked merge pattern handling in the backend. This is [[ https://bugs.llvm.org/show_bug.cgi?id=37104 | PR37104 ]]. [[ https://bugs.llvm.org/show_bug.cgi?id=6773 | PR6773 ]] will introduce an IR canonicalization that is likely bad for the end assembly. Previously, `andps`+`andnps` / `bsl` would be generated. (see `@out`) Now, they would no longer be generated (see `@in`), and we need to make sure that they are generated. Differential Revision: https://reviews.llvm.org/D46528 llvm-svn: 332904	2018-05-21 21:41:02 +00:00
Lang Hames	502f81e37e	[ORC] Preserve Materializing symbol flag during resolution. llvm-svn: 332899	2018-05-21 21:11:22 +00:00
Lang Hames	0b0b41fcce	[ORC] Lookup now returns an error if any symbols are not found. Also tightens the behavior of ExecutionSession::failQuery. Queries can usually only be failed by marking a symbol as failed-to-materialize, but ExecutionSession::failQuery provides a second route, and both routes may be executed from different threads. In the case that a query has already been failed due to a materialization error, ExecutionSession::failQuery will direct the error to ExecutionSession::reportError instead. llvm-svn: 332898	2018-05-21 21:11:21 +00:00
Lang Hames	add9b6805c	[ORC] Remove the optional MaterializationResponsibility argument from lookup. The lookup function provides blocking symbol resolution for JIT clients (not layers themselves) so it does not need to track symbol dependencies via a MaterializationResponsibility. llvm-svn: 332897	2018-05-21 21:11:21 +00:00
Lang Hames	1cf9987f6e	[ORC] Add IRLayer and ObjectLayer interfaces and related MaterializationUnits. llvm-svn: 332896	2018-05-21 21:11:13 +00:00
Craig Topper	25444c852a	[DAGCombiner] Use computeKnownBits to match rotate patterns that have had their amount masking modified by simplifyDemandedBits SimplifyDemandedBits can remove bits from the masks for the shift amounts we need to see to detect rotates. This patch uses zeroes from computeKnownBits to fill in some of these mask bits to make the match work. As currently written this calls computeKnownBits even when the mask hasn't been simplified because it made the code simpler. If we're worried about compile time performance we can improve this. I know we're talking about making a rotate intrinsic, but hopefully we can go ahead and do this change and just make sure the rotate intrinsic also handles it. Differential Revision: https://reviews.llvm.org/D47116 llvm-svn: 332895	2018-05-21 21:09:18 +00:00
Reid Kleckner	537917d13c	[X86] Simplify some X86 address mode folding code, NFCI This code should really do exactly the same thing for 32-bit x86 and 64-bit small code models, with the exception that RIP-relative addressing can't use base and index registers. llvm-svn: 332893	2018-05-21 21:03:19 +00:00
Craig Topper	aad3aefaeb	[X86] Remove masking from vpternlog intrinsics. Use a select in IR instead. This removes 6 intrinsics since we no longer need separate mask and maskz intrinsics. Differential Revision: https://reviews.llvm.org/D47124 llvm-svn: 332890	2018-05-21 20:58:09 +00:00
Peter Collingbourne	274c4f7ab4	Fix a make_unique ambiguity. llvm-svn: 332889	2018-05-21 20:56:28 +00:00
Sanjay Patel	b8346e3f07	[InstCombine] remove fptrunc (select) code; NFCI This pattern is handled within commonCastTransforms(), so the code here is dead AFAICT. llvm-svn: 332887	2018-05-21 20:39:35 +00:00
Peter Collingbourne	c5a9765cea	LTO: Replace split dwarf implementation that uses objcopy with one that uses direct emission. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47091 llvm-svn: 332884	2018-05-21 20:26:49 +00:00
Peter Collingbourne	9a45114b3c	CodeGen: Add a dwo output file argument to addPassesToEmitFile and hook it up to dwo output. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47089 llvm-svn: 332881	2018-05-21 20:16:41 +00:00
Peter Collingbourne	63062d9d0f	MC: Introduce an ELF dwo object writer and teach llvm-mc about it. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47051 llvm-svn: 332875	2018-05-21 19:44:54 +00:00
Jonas Devlieghere	c111382aa8	[DebugInfo] Use absolute addresses in location lists Rather than relying on the user to do the address calculating in DW_AT_location we should just dump the absolute address. rdar://problem/38513870 Differential revision: https://reviews.llvm.org/D47152 llvm-svn: 332873	2018-05-21 19:36:54 +00:00
Peter Collingbourne	f0226e62a8	MC: Extract a derived class from ELFObjectWriter. NFCI. This class will be used to create regular, non-split ELF files. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47049 llvm-svn: 332870	2018-05-21 19:30:59 +00:00
Peter Collingbourne	dcd7d6c331	MC: Separate creating a generic object writer from creating a target object writer. NFCI. With this we gain a little flexibility in how the generic object writer is created. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47045 llvm-svn: 332868	2018-05-21 19:20:29 +00:00
Peter Collingbourne	a29fe579f4	MC: Extract ELFObjectWriter's ELF writing functionality into an ELFWriter class. NFCI. The idea is that we will be able to use this class to create multiple files. Differential Revision: https://reviews.llvm.org/D47048 llvm-svn: 332867	2018-05-21 19:18:28 +00:00
Peter Collingbourne	2602a0d40c	Fix ubsan bounds check failure. llvm-svn: 332866	2018-05-21 19:09:47 +00:00
Craig Topper	f14e62c9a5	[EarlyCSE] Improve EarlyCSE of some absolute value cases. Change matchSelectPattern to return X and -X for ABS/NABS in a well defined order. Adjust EarlyCSE to account for this. Ensure the SPF result is some kind of min/max and not abs/nabs in one place in InstCombine that made me nervous. Previously we returned the two operands of the compare part of the abs pattern. The RHS is always going to be a 0, 1 or -1 constant. This isn't a very meaningful thing to return for anyone. There's also some freedom in the abs pattern as to what happens when the value is equal to 0. This freedom led to early cse failing to match when different constants were used in otherwise equivalent operations. By returning the input and its negation in a defined order we can ensure an exact match. This also makes sure both patterns use the exact same subtract instruction for the negation. I believe CSE should eventually make this happen and properly merge the nsw/nuw flags. But I'm not familiar with CSE and what order it does things in so it seemed like it might be good to really enforce that they were the same. Differential Revision: https://reviews.llvm.org/D47037 llvm-svn: 332865	2018-05-21 18:42:42 +00:00
Peter Collingbourne	59a6fc469f	MC: Remove stream and output functions from MCObjectWriter. NFCI. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47043 llvm-svn: 332864	2018-05-21 18:28:57 +00:00
Peter Collingbourne	438390fae1	MC: Have the object writers return the number of bytes written. NFCI. This removes the last external use of the stream. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47042 llvm-svn: 332863	2018-05-21 18:23:50 +00:00
Stanislav Mekhanoshin	9badad2051	[AMDGPU] Add divergence analysis as a dependency for ISel AMDGPUDAGToDAGISel adds DivergenceAnalysis in getAnalysisUsage but does not list it in pass dependencies which may lead to crash. Differential Revision: https://reviews.llvm.org/D47151 llvm-svn: 332862	2018-05-21 18:18:52 +00:00
Peter Collingbourne	f17b149d8c	MC: Change object writers to use endian::Writer. NFCI. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47040 llvm-svn: 332861	2018-05-21 18:17:42 +00:00
Diego Caballero	168d04d544	[VPlan] Reland r332654 and silence unused func warning r332654 was reverted due to an unused function warning in release build. This commit includes the same code with the warning silenced. Differential Revision: https://reviews.llvm.org/D44338 llvm-svn: 332860	2018-05-21 18:14:23 +00:00
Peter Collingbourne	147db3e628	MC: Change MCAssembler::writeSectionData and writeFragmentPadding to take a raw_ostream. NFCI. Also clean up a couple of hacks where we were writing the section contents to another stream by setting the object writer's stream, writing and setting it back. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47038 llvm-svn: 332858	2018-05-21 18:11:35 +00:00
Peter Collingbourne	571a3301ae	MC: Change MCAsmBackend::writeNopData() to take a raw_ostream instead of an MCObjectWriter. NFCI. To make this work I needed to add an endianness field to MCAsmBackend so that writeNopData() implementations know which endianness to use. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47035 llvm-svn: 332857	2018-05-21 17:57:19 +00:00
Tom Stellard	a91ce17b5f	AMDGPU/GlobalISel: Address post-commit review comments for r332379 MCRegisterInfo::getPhysRegSize() will be deprecated. llvm-svn: 332856	2018-05-21 17:49:31 +00:00
Alexey Bataev	7c9ad0db3d	[InstCombine] Fix PR37526: MinMax patterns produce an infinite loop. Summary: This patch fixes PR37526 by simplifying the newly generated LoadInst instructions. If the pointer address is a bitcast from the pointer to the NewType, we can just remove this extra bitcast instead of creating the new one. This fixes the PR37526 + may speed up the whole compilation process. Reviewers: spatel, RKSimon, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47144 llvm-svn: 332855	2018-05-21 17:46:34 +00:00
Andrea Di Biagio	b5757abefb	[X86][BtVer2] Add a 'J' prefix to the PRF/RCU defs. NFC This is to keep the Jaguar model's naming convention. Processor resources all have a 'J' prefix in the BtVer2 scheduling model. llvm-svn: 332851	2018-05-21 16:30:26 +00:00
Robert Widmann	38fa750b7a	[LLVM-C] Add DIBuilder Bindings For ObjC Classes Summary: Add LLVMDIBuilderCreateObjCIVar, LLVMDIBuilderCreateObjCProperty, and LLVMDIBuilderCreateInheritance to allow declaring metadata for Objective-C class hierarchies and their associated properties and instance variables. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: harlanhaskins, llvm-commits Differential Revision: https://reviews.llvm.org/D47123 llvm-svn: 332850	2018-05-21 16:27:35 +00:00
Lama Saba	9417f7ff2e	[X86] - Avoid SFB pass - fix bug in updating the offsets for newly created copies Change-Id: I169ab6fe7e187727c0298c2a1e2868a683f3e688 llvm-svn: 332849	2018-05-21 16:23:16 +00:00
James Henderson	004b729ed1	[DWARF] Refactor callback usage for .debug_line error handling Change the "recoverable" error callback to take an Error instead of a string. Reviewed by: JDevlieghere Differential Revision: https://reviews.llvm.org/D46831 llvm-svn: 332845	2018-05-21 15:30:54 +00:00
Simon Pilgrim	a8869e68a9	[X86][SSE] Add an assert to ensure that rotation amount is converted to a scale Missed in rL332832 where we added SSE v4i32 rotations for PR37426. llvm-svn: 332844	2018-05-21 15:17:23 +00:00
Tim Northover	4e3eec39fa	ARM: be conservative when asked load/store alignment of weird type. Chances are we'll be asked again after type legalization, but before that point it's better to claim misaligned accesses aren't allowed than to assert. llvm-svn: 332840	2018-05-21 12:43:54 +00:00
Nico Weber	e4a12cfa2f	revert r332610, it breaks cfi, see D46326 llvm-svn: 332838	2018-05-21 11:44:39 +00:00
Aleksandar Beserminji	4977705727	[mips] Revert Merge MipsLongBranch and MipsHazardSchedule passes Revert this patch due to buildbot failure. Differential Revision: https://reviews.llvm.org/D46641 llvm-svn: 332837	2018-05-21 11:38:52 +00:00
David Green	8ceab61c75	[CVP] Require DomTree for new Pass Manager We were previously using a DT in CVP through SimplifyQuery, but not requiring it in the new pass manager. Hence it would crash if DT was not already available. This now gets DT directly and plumbs it through to where it is used (instead of using it through SQ). llvm-svn: 332836	2018-05-21 11:06:28 +00:00
Eric Christopher	563d0b9cb9	Fix up a few grammar issues. llvm-svn: 332835	2018-05-21 10:27:36 +00:00
Aleksandar Beserminji	de7be5e46f	[mips] Merge MipsLongBranch and MipsHazardSchedule passes MipsLongBranchPass and MipsHazardSchedule passes are joined to one pass because of mutual conflict. When MipsHazardSchedule inserts 'nop's, it potentially breaks some jumps, so they have to be expanded to long branches. When some branch is expanded to long branch, it potentially creates a hazard situation, which should be fixed by adding nops. New pass is called MipsBranchExpansion, it combines these two passes, and runs them alternately until one of them reports no changes were made. Differential Revision: https://reviews.llvm.org/D46641 llvm-svn: 332834	2018-05-21 10:20:02 +00:00
Simon Pilgrim	5aa7cdfd70	[X86][SSE] Support v4i32 rotations (PR37426) As suggested by Fabian on PR37426, we can use PMULUDQ to perform v4i32 vector rotations as the upper 32bits of the multiply will contain the 'wrapped' bits of the rotation. v8i16/v16i8 rotations would be straightforward to add to lowerRotate in the future - ideally we'd mostly share code with the vector shifts lowering. Differential Revision: https://reviews.llvm.org/D46954 llvm-svn: 332832	2018-05-21 09:45:59 +00:00
Robert Widmann	360d6e35e6	[LLVM-C] Improve Bindings For Aliases Summary: Add wrappers for a module's alias iterators and a getter and setter for the aliasee value. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: llvm-commits, harlanhaskins Differential Revision: https://reviews.llvm.org/D46808 llvm-svn: 332826	2018-05-20 23:49:08 +00:00
Craig Topper	e4c045b7df	[X86] Remove mask arguments from permvar builtins/intrinsics. Use a select in IR instead. Someday maybe we'll use selects for all intrinsics. llvm-svn: 332824	2018-05-20 23:34:04 +00:00
Nico Weber	41597b92b1	Revert 332750, llvm part (see comment on D46910). llvm-svn: 332823	2018-05-20 23:03:17 +00:00
Simon Dardis	777afc7fbd	[mips] Add microMIPSR6 ll/sc instructions. Previously the compiler was using the microMIPSR3 variants, incorrectly. Reviewers: atanasyan, abeserminji, smaksimovic Differential Revision: https://reviews.llvm.org/D46948 llvm-svn: 332820	2018-05-20 17:21:00 +00:00
Sanjay Patel	a003c728a5	[InstCombine] choose 1 form of abs and nabs as canonical We already do this for min/max (see the blob above the diff), so we should do the same for abs/nabs. A sign-bit check (<s 0) is used as a predicate for other IR transforms and it's likely the best for codegen. This might solve the motivating cases for D47037 and D47041, but I think those patches still make sense. We can't guarantee this canonicalization if the icmp has more than one use. Differential Revision: https://reviews.llvm.org/D47076 llvm-svn: 332819	2018-05-20 14:23:23 +00:00
Haicheng Wu	69ba0613f2	[GlobalMerge] Exit early if only one global is to be merged To save some compilation time and prevent some unnecessary changes. Differential Revision: https://reviews.llvm.org/D46640 llvm-svn: 332813	2018-05-19 18:00:02 +00:00
Brian Gesiak	9968e0dd49	Re-revert "[Option] Fix PR37006 prefix choice in findNearest" Summary: Reverting due to a test failure in an llvm-mt test on some buildbots, namely http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/26020/. llvm-svn: 332812	2018-05-19 16:21:01 +00:00
Robert Widmann	025c78f5d7	[LLVM-C] Use Length-Providing Value Name Getters and Setters Summary: - Provide LLVMGetValueName2 and LLVMSetValueName2 that return and take the length of the provided C string respectively - Deprecate LLVMGetValueName and LLVMSetValueName Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: llvm-commits, harlanhaskins Differential Revision: https://reviews.llvm.org/D46890 llvm-svn: 332810	2018-05-19 15:08:36 +00:00
Max Kazantsev	c0b268f90c	[IRCE] Fix miscompile with range checks against negative values In the patch rL329547, we have lifted the over-restrictive limitation on collected range checks, allowing to work with range checks with the end of their range not being provably non-negative. However it appeared that the non-negativity of this value was assumed in the utility function `ClampedSubtract`. In particular, its reasoning is based on the fact that `0 <= SINT_MAX - X`, which is not true if `X` is negative. The function `ClampedSubtract` is only called twice, once with `X = 0` (which is OK) and the second time with `X = IRC.getEnd()`, where we may now see the problem if the end is actually a negative value. In this case, we may sometimes miscompile. This patch is the conservative fix of the miscompile problem. Rather than rejecting non-provably non-negative `getEnd()` values, we will check it for non-negativity in runtime. For this, we use function `smax(smin(X, 0), -1) + 1` that is equal to `1` if `X` is non-negative and is equal to 0 if `X` is negative. If we multiply `Begin, End` of safe iteration space by this function calculated for `X = IRC.getEnd()`, we will get the original `[Begin, End)` if `IRC.getEnd()` was non-negative (and, thus, `ClampedSubtract` worked correctly) and the empty range `[0, 0)` in case if `IRC.getEnd()` was negative. So we in fact prohibit execution of the main loop if at least one of range checks was made against a negative value (and we figured it out in runtime). It is still better than what we have before (non-negativity had to be proved in compile time) and prevents us from miscompile, however it is sometimes too restrictive for unsigned range checks against a negative value (which in fact can be eliminated). Once we re-implement `ClampedSubtract` in a way that it handles negative `X` correctly, this limitation can be lifted, too. 
Differential Revision: https://reviews.llvm.org/D46860 Reviewed By: samparker llvm-svn: 332809	2018-05-19 13:06:37 +00:00
Benjamin Kramer	a76b64ff80	[MergeICmps] Don't crash when memcmp is not available Fixes clang crashing with -fno-builtin, PR37527. llvm-svn: 332808	2018-05-19 12:51:59 +00:00
Simon Pilgrim	ede0e4073e	Fix MSVC unused variable warning. NFCI. AMDGPURegisterInfo::getSubRegFromChannel is a static method - we don't need to get the AMDGPURegisterInfo instance. llvm-svn: 332807	2018-05-19 12:46:02 +00:00
Brian Gesiak	8cfb4b6d41	Un-revert "[Option] Fix PR37006 prefix choice in findNearest" Summary: In https://reviews.llvm.org/rL332804 I loosened the assertion in the Clang driver test that forced me to revert https://reviews.llvm.org/rL332299. Once this lands I should be able to narrow down what caused PS4 buildbots to fail, and reinstate the check in that test. Test Plan: check-llvm & check-clang llvm-svn: 332805	2018-05-19 12:03:26 +00:00
Yaxun Liu	ea988f1fd9	Fix evaluator for non-zero alloca addr space The evaluator goes through BB and creates global vars as temporary values to evaluate results of LLVM instructions. It creates undef for alloca, however it assumes alloca in addr space 0. If the next instruction is addrspace cast to 0, then we get an invalid cast instruction. This patch lets the temp global var have an address space matching alloca addr space, so that the evaluation can be done. Differential Revision: https://reviews.llvm.org/D47081 llvm-svn: 332794	2018-05-19 02:58:16 +00:00
Piotr Padlewski	5642a42442	Propagate nonnull and dereferenceable through launder Summary: invariant.group.launder should not stop propagation of nonnull and dereferenceable, because e.g. we would not be able to hoist loads speculatively. Reviewers: rsmith, amharc, kuhar, xbolva00, hfinkel Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D46972 llvm-svn: 332788	2018-05-18 23:54:33 +00:00
Piotr Padlewski	ce358262eb	Disallow non-empty metadata for invariant.group Summary: This feature is not needed, but it might be useful in the future to use metadata to mark which function should support it (and strip it when not). Reviewers: rsmith, sanjoy, amharc, kuhar Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45419 llvm-svn: 332787	2018-05-18 23:53:46 +00:00
Piotr Padlewski	a26a08cb52	Constant fold launder of null and undef Summary: This might be useful because clang will add some barriers for pointer comparisons. Reviewers: majnemer, dberlin, hfinkel, nlewycky, davide, rsmith, amharc, kuhar Subscribers: davide, amharc, llvm-commits Differential Revision: https://reviews.llvm.org/D32423 llvm-svn: 332786	2018-05-18 23:52:57 +00:00
Piotr Padlewski	153fe60079	[MemDep] Fixed handling of invariant.group Summary: Memdep had a funny bug related to invariant.groups: because it did not invalidate the cache, in some very rare cases it was possible to show memory dependence of an instruction that was deleted, but because another instruction took its place it resulted in a call to vtable! Thanks @amharc for repro! Reviewers: dberlin, kuhar, amharc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45320 Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com> llvm-svn: 332781	2018-05-18 22:40:34 +00:00
Matt Arsenault	9fc8593a77	DAG: Fix crash on shift with large shift amounts Fixes bug 37521. llvm-svn: 332774	2018-05-18 21:54:16 +00:00
Wolfgang Pieb	20e1546655	Fixing buildbot error introduced with r332759. llvm-svn: 332772	2018-05-18 21:44:28 +00:00
Matt Arsenault	372d796ab1	AMDGPU: Add pass to optimize reqd_work_group_size Eliminate loads from the dispatch packet when they will have a known value. Also pattern match the code used by the library to handle partial workgroup dispatches, which isn't necessary if reqd_work_group_size is used. llvm-svn: 332771	2018-05-18 21:35:00 +00:00
Craig Topper	0198b73769	[InstCombine] Qualify a select pattern based transform to restrict to only min/max and ignore abs/nabs. llvm-svn: 332770	2018-05-18 21:21:56 +00:00
Sam Clegg	4bbc6b55e7	[WebAssembly] Object: Add more error checking for object file reading This should address some of the assert failures the fuzzer has been finding such as: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=6719 Differential Revision: https://reviews.llvm.org/D47046 llvm-svn: 332769	2018-05-18 21:08:26 +00:00
Wolfgang Pieb	401b5ecfea	Addressing a couple of compiler warnings introduced with r332759. llvm-svn: 332766	2018-05-18 20:51:16 +00:00
Wolfgang Pieb	da71639cdb	Fixing build error introduced with r332759. llvm-svn: 332762	2018-05-18 20:35:13 +00:00
Evgeniy Stepanov	28f330fd6f	[msan] Don't check divisor shadow in fdiv. Summary: Floating point division by zero or even undef does not have undefined behavior and may occur due to optimizations. Fixes https://bugs.llvm.org/show_bug.cgi?id=37523. Reviewers: kcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D47085 llvm-svn: 332761	2018-05-18 20:19:53 +00:00
Wolfgang Pieb	ad60559be7	[DWARF v5] Improved support for .debug_rnglists (consumer). Enables any consumer to extract DWARF v5 encoded rangelists. Reviewer: JDevlieghere Differential Revision: https://reviews.llvm.org/D45549 llvm-svn: 332759	2018-05-18 20:12:54 +00:00
Jessica Paquette	c604817493	[NFC] Change cast from r332739 to a static cast The casts in the delta computation for size remarks should have been static casts. This fixes that. Thanks to Dávid Bolvanský for pointing that out. llvm-svn: 332758	2018-05-18 20:04:21 +00:00
Peter Collingbourne	e3f652973e	Support: Simplify endian stream interface. NFCI. Provide some free functions to reduce verbosity of endian-writing a single value, and replace the endianness template parameter with a field. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47032 llvm-svn: 332757	2018-05-18 19:46:24 +00:00
Konstantin Zhuravlyov	caa8251971	AMDGPU/NFC: Set symbol's type that is coming from an argument in EmitAMDGPUSymbolType, instead of hard-coding it to STT_AMDGPU_HSA_KERNEL. llvm-svn: 332753	2018-05-18 18:41:37 +00:00
Petr Hosek	24b61ac832	[Support] Avoid normalization in sys::getDefaultTargetTriple The return value of sys::getDefaultTargetTriple, which is derived from -DLLVM_DEFAULT_TARGET_TRIPLE, is used to construct tool names, default target, and in the future also to control the search path directly; as such it should be used textually, without interpretation by LLVM. Normalization of this value may lead to unexpected results, for example if we configure LLVM with -DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-linux-gnu, normalization will transform that value to x86_64--linux-gnu. Driver will use that value to search for tools prefixed with x86_64--linux-gnu- which may be confusing. This is also inconsistent with the behavior of the --target flag which is taken as-is without any normalization and overrides the value of LLVM_DEFAULT_TARGET_TRIPLE. Users of sys::getDefaultTargetTriple already perform their own normalization as needed, so this change shouldn't impact existing logic. Differential Revision: https://reviews.llvm.org/D46910 llvm-svn: 332750	2018-05-18 18:33:07 +00:00
Peter Collingbourne	f7b81db715	MC: Change the streamer ctors to take an object writer instead of a stream. NFCI. The idea is that a client that wants split dwarf would create a specific kind of object writer that creates two files, and use it to create the streamer. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47050 llvm-svn: 332749	2018-05-18 18:26:45 +00:00
Brendon Cahoon	e5ed563cc5	[Hexagon] Generate post-increment for floating point types The code that generates post-increments for Hexagon considered integer values only. This patch adds support to generate them for floating point values, f32 and f64. Differential Revision: https://reviews.llvm.org/D47036 llvm-svn: 332748	2018-05-18 18:14:44 +00:00
Galina Kistanova	083ea389d6	Reverted r332654 as it has broken some buildbots and left unfixed for a long time. The introduced problem is: llvm.src/lib/Transforms/Vectorize/VPlanVerifier.cpp:29:13: error: unused function 'hasDuplicates' [-Werror,-Wunused-function] static bool hasDuplicates(const SmallVectorImpl<VPBlockBase *> &VPBlockVec) { ^ llvm-svn: 332747	2018-05-18 18:14:06 +00:00
Simon Pilgrim	1273f4ad93	[X86] Add GPR<->XMM Schedule Tags BtVer2 - fix NumMicroOp and account for the Lat+6cy GPR->XMM and Lat+1cy XMM->GPR delays (see rL332737) The high number of MOVD/MOVQ equivalent instructions meant that there were a number of missed patterns in SNB/Znver1: SNB - add missing GPR<->MMX costs (taken from Agner / Intel AOM) Znver1 - add missing GPR<->XMM MOVQ costs (taken from Agner) llvm-svn: 332745	2018-05-18 17:58:36 +00:00
Craig Topper	f94ed26ea9	[X86] Directly legalize v16i16/v8i16 vselect to vXi8 vselect to use VPBLENDVB The intrinsic legalization for masked truncate uses ISD::TRUNCATE which can be constant folded by getNode. This prevents getVectorMaskingNode from seeing the ISD::TRUNCATE special case where it should emit X86ISD::SELECT instead of ISD::VSELECT. This causes a vselect with a v16i1 or v8i1 condition to be emitted during vector legalization, but vector legalization doesn't revisit nodes it creates. DAG combine will then promote this condition to match the result type. Then op legalization will try to legalize it, but the custom lowering hook returned SDValue(). But op legalization doesn't have an Expand for VSELECT because it expects vector legalization to have taken care of it. So the operation sticks around and fails in isel. This patch adds a custom legalization hook to morph it to a vXi8 vselect instead. This also simplifies the normal vXi16 vselect handling because vector legalization was normally expanding to AND/ANDN/OR and DAG combine was turning that into VBLENDVB. So we can skip a step by doing it directly. Fixes PR37499 Differential Revision: https://reviews.llvm.org/D47025 llvm-svn: 332743	2018-05-18 17:48:06 +00:00
Than McIntosh	3c639dbd0d	Revert changes from D46265. This is a revert of the changes from https://reviews.llvm.org/D46265; the new test introduced (test/CodeGen/X86/PR37310.mir) causes buildbot failures. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47061 llvm-svn: 332742	2018-05-18 17:47:10 +00:00
Nirav Dave	588fad4d3b	[MC] Relax .fill size requirements Avoid requirement that number of values must be known at assembler time. Fixes PR33586. Reviewers: rnk, peter.smith, echristo, jyknight Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D46703 llvm-svn: 332741	2018-05-18 17:45:48 +00:00
Jessica Paquette	e49374d009	Add remarks describing when a pass changes the IR instruction count of a module This patch adds a remark which tells the user when a pass changes the number of IR instructions in a module. It can be enabled by using -Rpass-analysis=size-info. The point of this is to make it easier to collect statistics on how passes modify programs in terms of code size. This is similar in concept to timing reports, but using a remark-based interface makes it easy to diff changes over multiple compilations of the same program. By adding functionality like this, we can see * Which passes impact code size the most * How passes impact code size at different optimization levels * Which pass might have contributed the most to an overall code size regression The patch lives in the legacy pass manager, but since it's simply emitting remarks, it shouldn't be too difficult to adapt the functionality to the new pass manager as well. This can also be adapted to handle MachineInstr counts in code gen passes. https://reviews.llvm.org/D38768 llvm-svn: 332739	2018-05-18 17:26:39 +00:00
Simon Pilgrim	007b50fd35	[X86][BtVer2] Improve simulation of (V)PINSR values Include the 6cy delay transferring from the GPR to FPU. llvm-svn: 332737	2018-05-18 17:09:41 +00:00
Simon Pilgrim	3ecb0b80f6	[X86][BtVer2] Partial vector stores (inc MMX) have a 2cy latency llvm-svn: 332722	2018-05-18 14:22:22 +00:00
Simon Pilgrim	c4b8d367a8	[X86][SSE] Ensure vector partial load/stores use the WriteVecLoad/WriteVecStore scheduler classes Retag some instructions that were missed when we split off vector load/store/moves - MOVQ/MOVD etc. Fixes BtVer2/SLM which have different behaviours for GPR stores. llvm-svn: 332718	2018-05-18 14:08:01 +00:00
Simon Pilgrim	e819199e2a	[X86][AVX] VEXTRACTF128mr store is a WriteFStoreX not WriteFStore llvm-svn: 332715	2018-05-18 13:17:51 +00:00
Simon Pilgrim	d749b321b2	[X86][SSE] Ensure float load/stores use the WriteFLoad/WriteFStore scheduler classes Retag some instructions that were missed when we split off vector load/store/moves - MOVSS/MOVSD/MOVHPD/MOVHPD/MOVLPD/MOVLPS etc. Fixes BtVer2/SLM which have different behaviours for GPR stores. llvm-svn: 332714	2018-05-18 13:13:59 +00:00
Clement Courbet	8892c7db08	[ExynosM3] Fix scheduling info. Differential Revision: https://reviews.llvm.org/D46356 llvm-svn: 332713	2018-05-18 13:10:41 +00:00
Simon Pilgrim	a325dffd36	[X86][ZnVer1] Cleanup more single match instregexs llvm-svn: 332712	2018-05-18 13:05:26 +00:00
Than McIntosh	4c21a363af	StackColoring: better handling of statically unreachable code Summary: Avoid assert/crash during liveness calculation in situations where the incoming machine function has statically unreachable BBs. Fixes PR37130. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46265 llvm-svn: 332707	2018-05-18 12:25:30 +00:00
Jonas Paulsson	b51ccaf4d4	[SystemZ] Fix commit message of previous commit. Sorry, the commit comment for r332703 is completely broken. My mind slipped - the right description would be: In SystemZDAGToDAGISel::Select(), in the handling for SELECT_CCMASK: Check if UpdateNodeOperands() returns a different SDNode and in that case call ReplaceNode. Review: Ulrich Weigand. llvm-svn: 332706	2018-05-18 12:07:16 +00:00
Alexander Ivchenko	5c54742da4	[X86][CET] Changing -fcf-protection behavior to comply with gcc (LLVM part) This patch aims to match the changes introduced in gcc by https://gcc.gnu.org/ml/gcc-cvs/2018-04/msg00534.html. The IBT feature definition is removed, with the IBT instructions being freely available on all X86 targets. The shadow stack instructions are also being made freely available, and the use of all these CET instructions is controlled by the module flags derived from the -fcf-protection clang option. The hasSHSTK option remains since clang uses it to determine availability of shadow stack instruction intrinsics, but it is no longer directly used. Comes with a clang patch (D46881). Patch by mike.dvoretsky Differential Revision: https://reviews.llvm.org/D46882 llvm-svn: 332705	2018-05-18 11:58:25 +00:00
Jonas Paulsson	de54c058a6	[SystemZ] Fold AHIMux in foldMemoryOperandImpl. AHIMux can be folded the same way as AHI. Review: Ulrich Weigand llvm-svn: 332703	2018-05-18 11:54:04 +00:00
David Stenberg	0af67e5b65	[SimplifyCFG] Fix a debug invariant bug in FoldBranchToCommonDest() Summary: Fix a case where FoldBranchToCommonDest() would bail out from doing CSE when encountering a debug intrinsic. Handle that by skipping past the debug intrinsics. Also, as a minor refactoring, rename checkCSEInPredecessor() to tryCSEWithPredecessor() to make it a bit more clear that the function may remove instructions. Reviewers: fhahn, craig.topper, dblaikie, xbolva00 Reviewed By: fhahn, xbolva00 Subscribers: vsk, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D46635 llvm-svn: 332698	2018-05-18 08:52:15 +00:00
Shiva Chen	6e07dfb148	[RISCV] Add WasForced parameter to MCAsmBackend::fixupNeedsRelaxationAdvanced For RISCV branch instructions, we need to preserve relocation types when linker relaxation is enabled, so that the linker can modify the offset when the branch offsets change. We preserve relocation types by defining shouldForceRelocation. IsResolved returned by evaluateFixup will always be false when shouldForceRelocation returns true. This makes RISCV MC Branch Relaxation always relax 16-bit branches to the 32-bit form, even if the symbol could actually be resolved. To avoid always relaxing 16-bit branches to the 32-bit form when linker relaxation is enabled, we add a new parameter WasForced to distinguish a symbol that actually couldn't be resolved from one forced by shouldForceRelocation returning true. RISCVAsmBackend::fixupNeedsRelaxationAdvanced can relax branches with unresolved symbols by checking (!IsResolved && !WasForced). RISCV MC Branch Relaxation is needed because RISCV can perform 32-bit to 16-bit transformation in the MC layer. Differential Revision: https://reviews.llvm.org/D46350 llvm-svn: 332696	2018-05-18 06:42:21 +00:00
Serguei Katkov	5095883fe9	[LICM] Extend the MustExecute scope CanProveNotTakenFirstIteration utility does not handle the case when condition of the branch is a constant. Add its handling. Reviewers: reames, anna, mkazantsev Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46996 llvm-svn: 332695	2018-05-18 04:56:28 +00:00
Walter Lee	cdbb207bd1	[asan] Add instrumentation support for Myriad 1. Define Myriad-specific ASan constants. 2. Add code to generate an outer loop that checks that the address is in DRAM range, and strip the cache bit from the address. The former is required because Myriad has no memory protection, and it is up to the instrumentation to range-check before using it to index into the shadow memory. 3. Do not add an unreachable instruction after the error reporting function; on Myriad such function may return if the run-time has not been initialized. 4. Add a test. Differential Revision: https://reviews.llvm.org/D46451 llvm-svn: 332692	2018-05-18 04:10:38 +00:00
Eric Christopher	68f2218e1e	Revert "Temporarily revert "[DEBUG] Initial adaptation of NVPTX target for debug info emission."" This reapplies commits: r330271, r330592, r330779. [DEBUG] Initial adaptation of NVPTX target for debug info emission. Summary: Patch adds initial emission of the debug info for NVPTX target. Currently, only .file and .loc directives are emitted, everything else is commented out to not break the compilation of Cuda. llvm-svn: 332689	2018-05-18 03:13:08 +00:00
Eric Christopher	f31e91e4a8	Tidy comment up a bit. llvm-svn: 332687	2018-05-18 02:39:57 +00:00
Eli Friedman	d268bf0a4d	Fix unused lambda capture. llvm-svn: 332686	2018-05-18 02:11:25 +00:00
Eli Friedman	4081a57af7	[MachineOutliner] Count savings from outlining in bytes. Counting the number of instructions is both unintuitive and inaccurate. On AArch64, this only affects the generated remarks and certain rare pseudo-instructions, but it will have a bigger impact on other targets. Differential Revision: https://reviews.llvm.org/D46921 llvm-svn: 332685	2018-05-18 01:52:16 +00:00
Keno Fischer	e07153a859	[X86DomainReassignment] Don't compare stack-allocated values by address Summary: The Closure allocated in the main loop is allocated on the stack. However, later in the code its address is taken (and used for comparisons). This obviously doesn't work. In fact, the Closure will get the same stack address during every loop iteration, rendering the check that intended to identify Closure conflicts entirely ineffective. Fix this bug by giving every Closure a unique ID and using that for comparison. Alternatively, we could heap allocate the closure object. Fixes PR37396 Fixes JuliaLang/julia#27032 Reviewers: craig.topper, guyblank Reviewed By: craig.topper Subscribers: vchuravy, llvm-commits Differential Revision: https://reviews.llvm.org/D46800 llvm-svn: 332682	2018-05-18 01:03:01 +00:00
Keno Fischer	66ab99c3ee	[X86DomainReassignment] Don't delete IMPLICIT_DEF nodes Summary: We cannot simply delete IMPLICIT_DEF nodes. They may be used later (e.g. by a PHI) and deleting them will cause later passes (e.g. LiveVariables) to crash. However, it seems fine to ignore them for purposes of the domain reassignment (as we do with PHI). Fixes PR37430 Fixes JuliaLang/julia#27080 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D46797 llvm-svn: 332680	2018-05-18 00:40:52 +00:00
Zachary Turner	c762666e87	Resubmit "[pdb] Change /DEBUG:GHASH to emit 8 byte hashes." This fixes the remaining failing tests, so resubmitting with no functional change. llvm-svn: 332676	2018-05-17 22:55:15 +00:00
Peter Collingbourne	070777dbdd	Support: Add a raw_ostream::write_zeros() function. NFCI. This will eventually replace MCObjectWriter::WriteZeros. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47033 llvm-svn: 332675	2018-05-17 22:11:43 +00:00
George Burgess IV	c6526176cf	Revert r332657: "[AA] cfl-anders-aa with field sensitivity" I don't believe the person who LGTMed this review has appropriate context on this code. I apologize if I'm wrong. llvm-svn: 332674	2018-05-17 21:56:39 +00:00
Changpeng Fang	860d460063	AMDGPU/SI: Don't promote alloca to vector for atomic load/store Summary: Don't promote alloca to vector for atomic load/store Reviewer: arsenm Differential Revision: https://reviews.llvm.org/D46085 llvm-svn: 332673	2018-05-17 21:49:44 +00:00
Zachary Turner	1de9fce151	Revert "[pdb] Change /DEBUG:GHASH to emit 8 byte hashes." A few tests haven't been properly updated, so reverting while I have time to investigate proper fixes. llvm-svn: 332672	2018-05-17 21:49:25 +00:00
Zachary Turner	3c4c8a0937	[pdb] Change /DEBUG:GHASH to emit 8 byte hashes. Previously we emitted 20-byte SHA1 hashes. This is overkill for identifying debug info records, and has the negative side effect of making object files bigger and links slower. By using only the last 8 bytes of a SHA1, we get smaller object files and ~10% faster links. This modifies the format of the .debug$H section by adding a new value for the hash algorithm field, so that the linker will still work when its object files have an old format. Differential Revision: https://reviews.llvm.org/D46855 llvm-svn: 332669	2018-05-17 21:22:48 +00:00
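Illustration for 3c4c8a0937 above: a minimal Python sketch of keeping only the last 8 bytes of a 20-byte SHA1 digest, as the message describes. This is not the LLVM implementation; the function name `truncated_sha1` and the sample record bytes are made up for illustration.

```python
import hashlib


def truncated_sha1(record, size=8):
    """Return the last `size` bytes of the SHA-1 digest of `record`."""
    return hashlib.sha1(record).digest()[-size:]


# Hypothetical record contents, used only to exercise the helper.
sample = b"example debug info record"
full_digest = hashlib.sha1(sample).digest()   # 20 bytes
short_digest = truncated_sha1(sample)         # last 8 bytes of the same digest
```

The truncated value is a suffix of the full digest, so each hash shrinks from 20 bytes to 8, which is where the smaller object files in the message come from.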
Heejin Ahn	b4be38fcdd	[WebAssembly] Add Wasm personality and isScopedEHPersonality() Summary: - Add wasm personality function - Re-categorize the existing `isFuncletEHPersonality()` function into two different functions: `isFuncletEHPersonality()` and `isScopedEHPersonality()`. This becomes necessary as wasm EH uses scoped EH instructions (catchswitch, catchpad/ret, and cleanuppad/ret) but not outlined funclets. - Changed some callsites of `isFuncletEHPersonality()` to `isScopedEHPersonality()` if they are related to scoped EH IR-level stuff. Reviewers: majnemer, dschuff, rnk Subscribers: jfb, sbc100, jgravelle-google, eraman, JDevlieghere, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D45559 llvm-svn: 332667	2018-05-17 20:52:03 +00:00
Lang Hames	ecb3e50041	[ORC] Consolidate materialization errors, and generate them in VSO's notifyFailed method rather than passing in an error generator. VSO::notifyFailed is responsible for notifying queries that they will not succeed due to error. In practice the queries don't care about the details of the failure, just the fact that a failure occurred for some symbols. Having VSO::notifyFailed take care of this simplifies the interface. llvm-svn: 332666	2018-05-17 20:48:58 +00:00
Reid Kleckner	f40f85868e	[codeview] Include record prefix in global type hashing The prefix includes type kind, which is important to preserve. Two different type leafs can easily have the same interior record contents as another type. We ran into this issue in PR37492 where a bitfield type record collided with a const modifier record. Their contents were bitwise identical, but their kinds were different. llvm-svn: 332664	2018-05-17 20:47:22 +00:00
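Illustration for f40f85868e above: a small Python sketch of why hashing only the interior contents of a record lets two records of different kinds collide, while hashing the prefix as well keeps them apart. The prefix layout used here (a 2-byte length followed by a 2-byte kind), the kind values `KIND_A` and `KIND_B`, and the helper `hash_record` are assumptions chosen for illustration, not taken from the LLVM sources.

```python
import hashlib
import struct

KIND_A = 0x0001  # hypothetical kind value for one record type
KIND_B = 0x0002  # hypothetical kind value for another record type


def hash_record(kind, contents, include_prefix):
    """Hash a record, optionally including its (length, kind) prefix."""
    # Assumed prefix layout: little-endian 2-byte length, then 2-byte kind.
    prefix = struct.pack("<HH", len(contents) + 2, kind)
    data = prefix + contents if include_prefix else contents
    return hashlib.sha1(data).digest()[-8:]


# Two records whose interior bytes are bitwise identical.
contents = bytes([0x74, 0x00, 0x00, 0x00, 0x01, 0x00])

collide = hash_record(KIND_A, contents, False) == hash_record(KIND_B, contents, False)
distinct = hash_record(KIND_A, contents, True) != hash_record(KIND_B, contents, True)
```

Without the prefix the two hashes are equal by construction, since the hashed bytes are the same; with the prefix the kind participates in the hash input.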
Peter Collingbourne	0d8fa1b6fd	ARC, Nios2: Silence build warnings. NFCI. llvm-svn: 332663	2018-05-17 20:46:01 +00:00
David Bolvansky	b1c59e3f30	[AA] cfl-anders-aa with field sensitivity Summary: There was some unfinished work started for offset tracking in CFLGraph by the author of the implementation of the Andersen algorithm. This work was completed and support for field sensitivity was added to the core of the Andersen algorithm. The performance results seem promising. SPEC2006 int_base score was increased by 1.1 % (I compared clang 6.0 with clang 6.0 with this patch). The average compile time was increased by +- 1 % according to my measurements with small and medium C/C++ projects (I did not test it on large projects with millions of lines of code) Reviewers: chandlerc, george.burgess.iv, rja Reviewed By: rja Subscribers: rja, llvm-commits Differential Revision: https://reviews.llvm.org/D46282 llvm-svn: 332657	2018-05-17 20:23:33 +00:00
Diego Caballero	f58ad3129c	[LV][VPlan] Build plain CFG with simple VPInstructions for outer loops. Patch #3 from VPlan Outer Loop Vectorization Patch Series #1 (RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html). Expected to be NFC for the current inner loop vectorization path. It introduces the basic algorithm to build the VPlan plain CFG (single-level CFG, no hierarchical CFG (H-CFG), yet) in the VPlan-native vectorization path using VPInstructions. It includes: - VPlanHCFGBuilder: Main class to build the VPlan H-CFG (plain CFG without nested regions, for now). - VPlanVerifier: Main class with utilities to check the consistency of a H-CFG. - VPlanBlockUtils: Main class with utilities to manipulate VPBlockBases in VPlan. Reviewers: rengolin, fhahn, mkuper, mssimpso, a.elovikov, hfinkel, aprantl. Differential Revision: https://reviews.llvm.org/D44338 llvm-svn: 332654	2018-05-17 19:24:47 +00:00
Xinliang David Li	bc471c39ee	Add a limit for phi folding instcombine Differential Revision: http://reviews.llvm.org/D47023 llvm-svn: 332653	2018-05-17 19:24:03 +00:00
Sameer AbuAsal	1dc0a8fb18	[RISCV] Separate base from offset in lowerGlobalAddress Summary: When lowering global address, lower the base as a TargetGlobal first then create an SDNode for the offset separately and chain it to the address calculation This optimization will create a DAG where the base address of a global access will be reused between different access. The offset can later be folded into the immediate part of the memory access instruction. With this optimization we generate: lui a0, %hi(s) addi a0, a0, %lo(s) ; shared base address. addi a1, zero, 20 ; 2 instructions per access. sw a1, 44(a0) addi a1, zero, 10 sw a1, 8(a0) addi a1, zero, 30 sw a1, 80(a0) Instead of: lui a0, %hi(s+44) ; 3 instructions per access. addi a1, zero, 20 sw a1, %lo(s+44)(a0) lui a0, %hi(s+8) addi a1, zero, 10 sw a1, %lo(s+8)(a0) lui a0, %hi(s+80) addi a1, zero, 30 sw a1, %lo(s+80)(a0) Which will save one instruction per access. Reviewers: asb, apazos Reviewed By: asb Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, apazos, asb, llvm-commits Differential Revision: https://reviews.llvm.org/D46989 llvm-svn: 332641	2018-05-17 18:14:53 +00:00
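Illustration for 1dc0a8fb18 above: a Python sketch of the arithmetic behind the `lui`/`addi` pair with `%hi`/`%lo`, and of the check that the per-access offsets (44, 8, 80 in the message) fit in a 12-bit signed immediate so they can be folded into the `sw`. The base address `BASE` and the helper names are hypothetical; this models the address arithmetic only, not the DAG lowering itself.

```python
def split_hi_lo(addr):
    """Split a 32-bit address into (%hi, %lo) parts as used by lui/addi."""
    lo = ((addr & 0xFFF) ^ 0x800) - 0x800   # low 12 bits, sign-extended
    hi = ((addr - lo) >> 12) & 0xFFFFF      # upper 20 bits, compensating for lo's sign
    return hi, lo


def fits_simm12(offset):
    """True if `offset` fits in a 12-bit signed immediate field."""
    return -2048 <= offset <= 2047


BASE = 0x10000FFC  # hypothetical address of the global `s`
hi, lo = split_hi_lo(BASE)

# lui loads hi << 12, addi then adds the signed low part.
rebuilt = ((hi << 12) + lo) & 0xFFFFFFFF

# Offsets from the commit message; each can sit in the store's immediate.
offsets_foldable = all(fits_simm12(off) for off in (44, 8, 80))
```

Because the low part is sign-extended, `%hi` is rounded up whenever bit 11 of the address is set, which is why `hi` here is `0x10001` rather than `0x10000`.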
Mandeep Singh Grang	ef0ebf2806	[RISCV] Implement MC layer support for the tail pseudoinstruction Summary: This patch implements MC support for tail pseudo instruction. A follow-up patch implements the codegen support as well as handling of the indirect tail pseudo instruction. Reviewers: asb, apazos Reviewed By: asb Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, llvm-commits Differential Revision: https://reviews.llvm.org/D46221 llvm-svn: 332634	2018-05-17 17:31:27 +00:00
Sam Clegg	c0d41195b5	[WebAssembly] MC: Fix typo in comment llvm-svn: 332632	2018-05-17 17:15:15 +00:00
Simon Pilgrim	2782a19fad	[X86] Split WriteCMOV + WriteCMOV2 scheduler classes Handle SNB+ targets which treat CMOVA/CMOVBE specially due to partial EFLAGS handling. llvm-svn: 332626	2018-05-17 16:47:30 +00:00
Changpeng Fang	391bcf8893	AMDGPU/SI: Handle infinite loop for the structurizer to work with CFG with infinite loops. Summary: The current StructurizeCFG pass only works for CFG with one exit. AMDGPUUnifyDivergentExitNodes combines multiple "return" blocks and/or "unreachable" blocks to one exit block for the Structurizer to work. However, infinite loop is another kind of special "exit", and if we don't handle it, the case of multiple exits will prevent the structurizer from working. In this work, for each infinite loop, we add a dummy edge to the "return" block, and thus the AMDGPUUnifyDivergentExitNodes pass will work with infinite loops. This will make CFG with infinite loops be structurized. Reviewer: nhaehnle Differential Revision: https://reviews.llvm.org/D46340 llvm-svn: 332625	2018-05-17 16:45:01 +00:00
Petar Jovanovic	daf5169398	[mips] Add support for Global INValidate ASE This includes Instructions: ginvi, ginvt, Assembler directives: .set ginv, .set noginv, .module ginv, .module noginv Attribute: ginv .MIPS.abiflags: GINV (0x20000) Patch by Vladimir Stefanovic. Differential Revision: https://reviews.llvm.org/D46268 llvm-svn: 332624	2018-05-17 16:30:32 +00:00
Craig Topper	bd332588bd	[InstCombine] Propagate the nsw/nuw flags from the add in the 'shifty' abs pattern to the sub in the select version. According to alive this is valid. I'm hoping to use this to make an assumption that the sign bit is zero after this sequence. The only way it wouldn't be is if the input was INT_MIN, but by preserving the flags we can make doing this to INT_MIN UB. The nuw flag is weird because it creates such a contradiction that the original number would have to be positive meaning we could remove the select entirely, but we don't get that far. Differential Revision: https://reviews.llvm.org/D46988 llvm-svn: 332623	2018-05-17 16:29:52 +00:00
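Illustration for bd332588bd above: a Python model of the two abs forms on 32-bit values, assuming the 'shifty' pattern is `(x + (x >> 31)) ^ (x >> 31)` with an arithmetic shift and the select version is `x < 0 ? 0 - x : x`. It emulates wrapping arithmetic only; the nsw/nuw flags themselves (which turn the wrapping case into undefined behavior) are not modeled, and the helper names are made up.

```python
MASK = 0xFFFFFFFF
INT_MIN = -(1 << 31)


def to_signed(x):
    """Wrap an integer to a signed 32-bit value."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def abs_shifty(x):
    """abs via add/xor with the sign mask (x >> 31 is 0 or -1)."""
    x = to_signed(x)
    sign = x >> 31  # Python's >> on negative ints is an arithmetic shift
    return to_signed(to_signed(x + sign) ^ sign)


def abs_select(x):
    """abs via compare, negate, and select."""
    x = to_signed(x)
    return to_signed(0 - x) if x < 0 else x


samples = [0, 1, -1, 5, -5, 2**31 - 1, -(2**31 - 1), INT_MIN]
agree = all(abs_shifty(v) == abs_select(v) for v in samples)
```

With wrapping arithmetic both forms map `INT_MIN` to `INT_MIN`, the one input whose result keeps the sign bit set; that is the case the message wants to rule out by preserving the flags.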
Alex Bradbury	6a53023b4e	[RISCV] Set isReMaterializable on ADDI and LUI instructions The isReMaterializable flag is somewhat confusing, unlike most other instruction flags it is currently interpreted as a hint (mightBeRematerializable would be a better name). While LUI is always rematerialisable, for an instruction like ADDI it depends on its operands. TargetInstrInfo::isTriviallyReMaterializable will call TargetInstrInfo::isReallyTriviallyReMaterializable, which in turn calls TargetInstrInfo::isReallyTriviallyReMaterializableGeneric. We rely on the logic in the latter to pick out instances of ADDI that really are rematerializable. The isReMaterializable flag does make a difference on a variety of test programs. The recently committed remat.ll test case demonstrates how stack usage is reduced and an unnecessary lw/sw can be removed. Stack usage in the Proc0 function in dhrystone reduces from 192 bytes to 112 bytes. For the sake of completeness, this patch also implements RISCVRegisterInfo::isConstantPhysReg. Although this is called from a number of places, it doesn't seem to result in different codegen for any programs I've thrown at it. However, it is called in the rematerialisation codepath and it seems sensible to implement something correct here. Differential Revision: https://reviews.llvm.org/D46182 llvm-svn: 332617	2018-05-17 15:51:37 +00:00
Simon Pilgrim	b5741f5c3d	[X86][BtVer2] ADC/SBB take 2cy on an ALU pipe, not 1cy like ADD/SUB llvm-svn: 332616	2018-05-17 15:43:23 +00:00
Dmitry Mikulin	3c6b4e35bd	In thin and full LTO + CFI, direct function calls may go through jump table entries to reach the target. Since these calls don't require type checks, we can short-circuit them to their real targets. Differential Revision: https://reviews.llvm.org/D46326 llvm-svn: 332610	2018-05-17 14:29:07 +00:00
Alex Bradbury	5e41fc83c5	[Hexagon] Use addAliasForDirective for data directives Data directives such as .word, .half, .hword are currently parsed using HexagonAsmParser::ParseDirectiveValue which effectively duplicates logic from AsmParser::parseDirectiveValue. This patch deletes that duplicated logic in favour of using addAliasForDirective. Differential Revision: https://reviews.llvm.org/D46999 llvm-svn: 332607	2018-05-17 13:21:18 +00:00
Simon Pilgrim	0c0336e003	[X86] Split WriteADC/WriteADCRMW scheduler classes For integer ALU instructions taking eflags as an input (ADC/SBB/ADCX/ADOX) llvm-svn: 332605	2018-05-17 12:43:42 +00:00
Jonas Paulsson	caafed5570	[SystemZ] Commenting (NFC) Some minor commenting in scheduler files. Review: Ulrich Weigand llvm-svn: 332599	2018-05-17 11:53:56 +00:00
Simon Pilgrim	ceb4933dc1	[X86][SNB] Minor scheduler cleanup Merge 2 instregex and explain the VMOVDQArr/MOVDQArr difference llvm-svn: 332591	2018-05-17 10:36:29 +00:00
Sander de Smalen	75cfa34156	[AArch64][SVE] Asm: Support for structured ST2, ST3 and ST4 (scalar+scalar) store instructions. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D46680 llvm-svn: 332584	2018-05-17 09:05:41 +00:00
Mikael Holmen	2ca16899ec	Require DominatorTree when requiring/preserving LoopInfo in the old pass manager Summary: Require DominatorTree when requiring/preserving LoopInfo in the old pass manager BreakCriticalEdges tries to keep LoopInfo and DominatorTree updated if they exist. However, since commit r321653 and r321805, to update LoopInfo we must have a DominatorTree, or we will hit an assert. To fix this we now make a couple of passes that only required/preserved LoopInfo also require DominatorTree. This solves PR37334. Reviewers: eli.friedman, efriedma Reviewed By: efriedma Subscribers: efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D46829 llvm-svn: 332583	2018-05-17 09:05:40 +00:00
Martin Storsjo	c10788728b	[Analysis] Only use _unlocked stdio functions on linux The existing comment said that the functions were available only on GNU/Linux (and on certain Android versions), but only checked T.isGNUEnvironment() which also is true on MinGW (for arch-windows-gnu triplets), which doesn't have such functions. Existing checks in the initialize function in TargetLibraryInfo.cpp also use only T.isOSLinux() to check for glibc features. This fixes use of stdio on MinGW. Differential Revision: https://reviews.llvm.org/D47002 llvm-svn: 332581	2018-05-17 08:16:08 +00:00
Bjorn Pettersson	81a76a388a	[SROA] Handle PHI with multiple duplicate predecessors Summary: The verifier accepts PHI nodes with multiple entries for the same basic block, as long as the value is the same. As seen in PR37203, SROA did not handle such PHI nodes properly when speculating loads over the PHI, since it inserted multiple loads in the predecessor block and changed the PHI into having multiple entries for the same basic block, but with different values. This patch teaches SROA to reuse the same speculated load for each PHI duplicate entry in such situations. Resolves: https://bugs.llvm.org/show_bug.cgi?id=37203 Reviewers: uabelho, chandlerc, hfinkel, bkramer, efriedma Reviewed By: efriedma Subscribers: dberlin, efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D46426 llvm-svn: 332577	2018-05-17 07:21:41 +00:00
Hiroshi Inoue	f5c0e6c285	[SROA] pr37267: fix assertion failure in integer widening The current integer widening does not support rewriting partial split slices in rewriteIntegerStore (and rewriteIntegerLoad). This patch adds explicit checks for this case in isIntegerWideningViableForSlice. Before r322533, splitting is allowed only for the whole-alloca slice and hence the above case is implicitly rejected by another check `if (DL.getTypeStoreSize(ValueTy) > Size)` because whole-alloca slice is larger than the partition. Differential Revision: https://reviews.llvm.org/D46750 llvm-svn: 332575	2018-05-17 06:32:17 +00:00
Alex Bradbury	cea6db0480	[RISCV] Add support for .half, .hword, .word, .dword directives These directives are recognised by gas. Support is added through the use of addAliasForDirective. Also match RISC-V gcc in preferring .half and .word for 16-bit and 32-bit data directives. llvm-svn: 332574	2018-05-17 05:58:08 +00:00
Craig Topper	a2c5264718	[X86] Add OptForSize to a couple load folding patterns. Remove some bad FIXME comments. The FIXME comments were about preventing load folding to avoid a partial xmm update. But these instructions use GPR as input when the load isn't folded. This won't help prevent a partial xmm update. llvm-svn: 332573	2018-05-17 05:41:11 +00:00
Dan Gohman	aef674102c	[WebAssembly] Fix the opcode number for i64.load16_u. Fixes PR37488. llvm-svn: 332561	2018-05-17 00:14:13 +00:00
Craig Topper	342273a139	[CodeGen] Use MachineInstr::getOperand(0) instead of getting the defs iterator_range and calling begin. NFC Defs are well defined to come first in MachineInstr operand list. No need for a more complex indirection. llvm-svn: 332559	2018-05-16 23:39:27 +00:00
Greg Clayton	f81f3a838a	Revert 332508 as it caused problems in the clang test suite. llvm-svn: 332555	2018-05-16 23:29:36 +00:00
Vedant Kumar	5a0872c2b7	[STLExtras] Add size() for ranges, and remove distance() r332057 introduced distance() for ranges. Based on post-commit feedback, this renames distance() to size(). The new size() is also only enabled when the operation is O(1). Differential Revision: https://reviews.llvm.org/D46976 llvm-svn: 332551	2018-05-16 23:20:42 +00:00
JF Bastien	ddc84bf7d1	[NFC] WebAssembly build break #2 Summary: Same as r332530, move WasmSymbol::dump to an implementation file to avoid linker issues when the dump function is seen in the header, doesn't get eliminated, and then linking fails because of the missing dependency. <rdar://problem/40258137> Reviewers: sbc100, ncw, paquette, vsk, dschuff Subscribers: jgravelle-google, aheejin, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D46985 llvm-svn: 332542	2018-05-16 22:31:42 +00:00
Lang Hames	d261e1258c	[ORC] Rewrite the VSO symbol table yet again. Update related utilities. VSOs now track dependencies for materializing symbols. Each symbol must have its dependencies registered with the VSO prior to finalization. Usually this will involve registering the dependencies returned in AsynchronousSymbolQuery::ResolutionResults for queries made while linking the symbols being materialized. Queries against symbols are notified that a symbol is ready once it and all of its transitive dependencies are finalized, allowing compilation work to be broken up and moved between threads without queries returning until their symbols are fully safe to access / execute. Related utilities (VSO, MaterializationUnit, MaterializationResponsibility) are updated to support dependence tracking and more explicitly track responsibility for symbols from the point of definition until they are finalized. llvm-svn: 332541	2018-05-16 22:24:30 +00:00
Simon Pilgrim	820433f533	[X86][SNB] Remove unnecessary CVT InstRW overrides llvm-svn: 332536	2018-05-16 22:14:29 +00:00
Sam Clegg	6a32560886	[WebAssembly] Remove unused headers in MCWasmObjectWriter Differential Revision: https://reviews.llvm.org/D46969 llvm-svn: 332535	2018-05-16 22:13:18 +00:00
Benjamin Kramer	8ac15bf4dc	[InstCombine] Fix the signature of fgets_unlocked. It returns a pointer, not an int. This miscompiles all code that uses the return value of fgets. llvm-svn: 332531	2018-05-16 21:45:39 +00:00
JF Bastien	659932b0b2	[NFC] WebAssembly build fix Summary: r332305 added a use of llvm::wasm::toString in llvm::object::WasmSymbol::print, which is in a header file. It also moves toString to BinaryFormat. This has the unintended side-effect that any inclusion of Object/Wasm.h now relies on toString, and needs to required_libraries = BinaryFormat. Thankfully most builds don't fail with this because print just isn't used and gets eliminated, dropping the required dependency in the process. Not all builds are so lucky. Fix this issue by moving print to the corresponding .cpp file. <rdar://problem/40258137> Reviewers: sbc100, ncw, paquette Subscribers: dschuff, jgravelle-google, aheejin, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D46977 llvm-svn: 332530	2018-05-16 21:24:03 +00:00
Eli Friedman	ddbf6d6514	[MachineOutliner] Don't outline instructions that modify SP. This breaks the code which saves and restores LR, so we can't outline without doing something more complicated for stack adjustment. Found by inspection; we get lucky in most cases because getMemOpInfo only handles STRWpost, not any other pre/post-increment forms. But it hits a couple of artificial testcases in the tree. Differential Revision: https://reviews.llvm.org/D46920 llvm-svn: 332529	2018-05-16 21:20:16 +00:00
Krzysztof Parzyszek	f18009dbc6	[Hexagon] Fix the order of operands when selecting QCAT llvm-svn: 332526	2018-05-16 21:02:43 +00:00
Krzysztof Parzyszek	e8a0ae7346	[Hexagon] Mark HVX vector predicate bitwise ops as legal, add patterns llvm-svn: 332525	2018-05-16 21:00:24 +00:00
Simon Pilgrim	2e0f6c9b21	[X86][SSE] Reduce instruction/register usages for v4i32 vector shifts (PR37441) As suggested by Fabian on PR37441, use PSHUFLW to extend shift amount types for use with PSRAD/PSRLD to reduce register pressure. Some of this ideally would be done by combineTargetShuffle but it's tricky to do as most of the shuffles are sharing inputs. Differential Revision: https://reviews.llvm.org/D46959 llvm-svn: 332524	2018-05-16 20:52:52 +00:00
Konstantin Zhuravlyov	c72ece6c2c	AMDGPU : Recalculate SGPRs when trap handler is supported Differential Revision: https://reviews.llvm.org/D29911 llvm-svn: 332523	2018-05-16 20:47:48 +00:00
Eric Christopher	1f5eb86b51	Fix small grammar-o. llvm-svn: 332522	2018-05-16 20:34:00 +00:00
Eric Christopher	fb923d28a9	Fix up a misleading format warning. llvm-svn: 332521	2018-05-16 20:33:59 +00:00
Sam Clegg	6ccb59b3e9	[WebAssembly] MC: Ensure that FUNCTION_OFFSET relocations are always against function symbols. The getAtom() method wasn't doing what we needed in all cases. We want the symbols for the function which defines that section. We can compute this easily enough and we know that we have at most one function in each section. Once this lands I will revert rL331412 which is no longer needed. Fixes PR37409 Differential Revision: https://reviews.llvm.org/D46970 llvm-svn: 332517	2018-05-16 20:09:05 +00:00
Eli Friedman	02709bcb78	[MachineOutliner] Don't save/restore LR for tail calls. The cost computation assumes we do this correctly, but the actual lowering was wrong. Differential Revision: https://reviews.llvm.org/D46923 llvm-svn: 332514	2018-05-16 19:49:01 +00:00
Simon Pilgrim	d5d77dcb46	[X86] Fix typo in instregex for CVTSI642SDrr llvm-svn: 332510	2018-05-16 18:31:17 +00:00
Greg Clayton	b24957e22a	Fix llvm::sys::path::remove_dots() to return "." instead of an empty path. Differential Revision: https://reviews.llvm.org/D46887 llvm-svn: 332508	2018-05-16 18:25:51 +00:00
Roman Lebedev	e592104cf0	[Timers] TimerGroup: add constructor from StringMap<TimeRecord> Summary: This is needed for the continuation of D46504, to be able to store the timings. Reviewers: george.karpenkov, NoQ, alexfh, sbenza Reviewed By: alexfh Subscribers: llvm-commits, cfe-commits Differential Revision: https://reviews.llvm.org/D46939 llvm-svn: 332506	2018-05-16 18:16:01 +00:00
Roman Lebedev	d9ade38d4e	[Timers] TimerGroup: make printJSONValues() method public Summary: This is needed for the continuation of D46504, to be able to store the timings. Reviewers: george.karpenkov, NoQ, alexfh, sbenza Reviewed By: alexfh Subscribers: llvm-commits, cfe-commits Differential Revision: https://reviews.llvm.org/D46938 llvm-svn: 332505	2018-05-16 18:15:56 +00:00
Roman Lebedev	ddfefc3538	[Timers] TimerGroup::printJSONValue(): print doubles with no precision loss Summary: Although this is not strictly required, I would very much prefer not to have known random precision losses along the way. Reviewers: george.karpenkov, NoQ, alexfh, sbenza Reviewed By: george.karpenkov Subscribers: llvm-commits, cfe-commits Differential Revision: https://reviews.llvm.org/D46937 llvm-svn: 332504	2018-05-16 18:15:51 +00:00
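Illustration for ddfefc3538 above: a Python sketch of what printing a double "with no precision loss" means in practice. It relies on the general fact that 17 significant decimal digits are enough to round-trip any IEEE 754 double; the message does not say which format the patch actually uses, so the `.17g` format and the helper name `format_lossless` are assumptions.

```python
def format_lossless(value):
    """Format a float with 17 significant digits so parsing recovers it exactly."""
    return format(value, ".17g")


one_third = 1.0 / 3.0

# A short fixed-precision format drops information.
lossy_text = format(one_third, ".6f")
lossy_roundtrip = float(lossy_text) == one_third

# 17 significant digits round-trip exactly.
exact_text = format_lossless(one_third)
exact_roundtrip = float(exact_text) == one_third
```

The same check can be run over any set of timer values to confirm that what is written out parses back to the identical double.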
Roman Lebedev	c39ad98d80	[Timers] TimerGroup::printJSONValues(): print mem timer with .mem suffix Summary: We have just used `.sys` suffix for the previous timer, this is clearly a typo Reviewers: george.karpenkov, NoQ, alexfh, sbenza Reviewed By: alexfh Subscribers: llvm-commits, cfe-commits Differential Revision: https://reviews.llvm.org/D46936 llvm-svn: 332503	2018-05-16 18:15:47 +00:00
Craig Topper	67aa726f8c	[X86][AVX512DQ] Use packed instructions for scalar FP<->i64 conversions on 32-bit targets As i64 types are not legal on 32-bit targets, insert these into a suitable zero vector and use the packed vXi64<->FP conversion instructions instead. Fixes PR3163. Differential Revision: https://reviews.llvm.org/D43441 llvm-svn: 332498	2018-05-16 17:40:07 +00:00
JF Bastien	aa1333a91f	Signal handling should be signal-safe Summary: Before this patch, signal handling wasn't signal safe. This leads to real-world crashes. It used ManagedStatic inside of signals, this can allocate and can lead to unexpected state when a signal occurs during llvm_shutdown (because llvm_shutdown destroys the ManagedStatic). It also used cl::opt without custom backing storage. Some de-allocation was performed as well. Acquiring a lock in a signal handler is also a great way to deadlock. We can't just disable signals on llvm_shutdown because the signals might do useful work during that shutdown. We also can't just disable llvm_shutdown for programs (instead of library uses of clang) because we'd have to then mark the pointers as not leaked and make sure all the ManagedStatic uses are OK to leak and remain so. Move all of the code to lock-free datastructures instead, and avoid having any of them in an inconsistent state. I'm not trying to be fancy, I'm not using any explicit memory order because this code isn't hot. The only purpose of the atomics is to guarantee that a signal firing on the same or a different thread doesn't see an inconsistent state and crash. In some cases we might miss some state (for example, we might fail to delete a temporary file), but that's fine. Note that I haven't touched any of the backtrace support despite it not technically being totally signal-safe. When that code is called we know something bad is up and we don't expect to continue execution, so calling something that e.g. sets errno is the least of our problems. A similar patch should be applied to lib/Support/Windows/Signals.inc, but that can be done separately. Fix r332428 which I reverted in r332429. I originally used double-wide CAS because I was lazy, but some platforms use a runtime function for that which thankfully failed to link (it would have been bad for signal handlers otherwise). I use a separate flag to guard the data instead. <rdar://problem/28010281> Reviewers: dexonsmith Subscribers: steven_wu, llvm-commits llvm-svn: 332496	2018-05-16 17:25:35 +00:00
Nirav Dave	11fd14c1ac	[DAG] Prune cycle check in store merge. As part of merging stores we check that fusing the nodes does not cause a cycle due to one candidate store being indirectly dependent on another store (this may happen via chained memory copies). This is done by searching if a store is a predecessor to another store's value. Prune the search at the candidate search's root node which is a predecessor to all candidate stores. This reduces the size of the subgraph searched in large basic blocks. Reviewers: jyknight Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D46955 llvm-svn: 332490	2018-05-16 16:48:20 +00:00
Nirav Dave	d9d86cb738	[DAG] Defer merge store cycle checking to just before merge. NFCI. llvm-svn: 332489	2018-05-16 16:47:54 +00:00
Tony Tye	43259df44a	[AMDGPU] Change llvm.debugtrap to be a debug breakpoint that can resume execution. No longer require the queue pointer to be passed in in fixed SGPRs. Differential Revision: https://reviews.llvm.org/D46769 llvm-svn: 332485	2018-05-16 16:19:34 +00:00
Sander de Smalen	22176a2242	[AArch64][SVE] Improve diagnostics for vectors with incorrect element-size. For regular SVE vector operands, this patch introduces a more sensible diagnostic when the vector has a wrong suffix (e.g. z0.s vs z0.b). For example: add z0.s, z1.s, z2.b -> invalid element width ^_____^ mismatch For the vector-with-shift/extend (e.g. z0.s, uxtw #2) this patch takes a slightly different approach and instead returns an 'invalid operand' if the element size is not as expected. This is because the diagnostics are more specialized to suggest using the right shift/extend suffix. This is a trade-off not to introduce more operand classes and still provide useful diagnostics for LD1 and PRF instructions. For example: ld1w z1.s, p0/z, [x0, z0.s] -> invalid shift/extend specified, expected 'z[0..31].s, (uxtw\|sxtw)' ld1w z1.d, p0/z, [x0, z0.s] -> invalid operand ^________________^ mismatch For gather prefetches, both 'z0.s' and 'z0.d' would be allowed: prfw #0, p0, [x0, z0.s] -> invalid shift/extend specified, expected 'z[0..31].s, (uxtw\|sxtw) #2' prfw #0, p0, [x0, z0.d] -> invalid shift/extend specified, expected 'z[0..31].d, (lsl\|uxtw\|sxtw) #2' Without this change, the diagnostic would unnecessarily suggest a different element size: prfw #0, p0, [x0, z0.s] -> invalid shift/extend specified, expected 'z[0..31].d, (lsl\|uxtw\|sxtw) #2' Reviewers: SjoerdMeijer, aemerson, fhahn, samparker, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D46688 llvm-svn: 332483	2018-05-16 15:45:17 +00:00
Sirish Pande	cabe50a308	[AArch64] Gangup loads and stores for pairing. Keep loads and stores together (target defines how many loads and stores to gang up), such that it will help in pairing and vectorization. Differential Revision https://reviews.llvm.org/D46477 llvm-svn: 332482	2018-05-16 15:36:52 +00:00
Sanjay Patel	2eb3512090	[InstCombine] allow more binop (shuffle X), C transforms The canonicalization was restricted to shuffle masks with a 1-to-1 mapping to the constant vector, but that disqualifies the common splat pattern. This is part of solving PR37463: https://bugs.llvm.org/show_bug.cgi?id=37463 llvm-svn: 332479	2018-05-16 15:15:22 +00:00
Sander de Smalen	bbc4e9a4e3	[AArch64][SVE] Asm: Support for gather PRF prefetch instructions Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D46686 llvm-svn: 332472	2018-05-16 14:16:01 +00:00
Krzysztof Pszeniczny	2ba8fd4914	[BasicAA] Fix handling of invariant group launders Summary: A recent patch ([[ https://reviews.llvm.org/rL331587 \| rL331587 ]]) to Capture Tracking taught it that the `launder_invariant_group` intrinsic captures its argument only by returning it. Unfortunately, BasicAA still considered every call instruction as a possible escape source and hence concluded that the result of a `launder_invariant_group` call cannot alias any local non-escaping value. This led to [[ https://bugs.llvm.org/show_bug.cgi?id=37458 \| bug 37458 ]]. This patch updates the relevant check for escape sources in BasicAA. Reviewers: Prazek, kuhar, rsmith, hfinkel, sanjoy, xbolva00 Reviewed By: hfinkel, xbolva00 Subscribers: JDevlieghere, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D46900 llvm-svn: 332466	2018-05-16 13:16:54 +00:00
Simon Dardis	8ea0ecdedc	[mips] Simplify some of the predicate scopes for (negative) multiply add/sub instructions (NFCI) llvm-svn: 332464	2018-05-16 12:44:27 +00:00
Simon Dardis	6c35f8f445	[mips] Join existing scopes for DecoderNamespace (NFCI) llvm-svn: 332462	2018-05-16 12:37:04 +00:00
Matt Arsenault	67a9815a5c	AMDGPU: Custom lower v4i16/v4f16 vector operations Avoids stack access. Also handle extract hi elt pattern from truncate + shift to avoid a couple test regressions. llvm-svn: 332453	2018-05-16 11:47:30 +00:00
David Bolvansky	ca22d427b9	[SimplifyLibcalls] Replace locked IO with unlocked IO Summary: If file stream arg is not captured and source is fopen, we could replace IO calls by unlocked IO ("_unlocked" function variants) to gain better speed. Reviewers: efriedma, RKSimon, spatel, sanjoy, hfinkel, majnemer, lebedev.ri, rja Reviewed By: rja Subscribers: rja, srhines, efriedma, lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D45736 llvm-svn: 332452	2018-05-16 11:39:52 +00:00
Simon Pilgrim	5647e89f5a	[X86] Split WriteCvtI2F/WriteCvtF2I into I<->F32 and I<->F64 scheduler classes A lot of the models still have too many InstRW overrides for these new classes - this needs cleaning up but I wanted to get the classes in first llvm-svn: 332451	2018-05-16 10:53:45 +00:00
David Green	cdee1d957e	[LoopUnroll] Split out simplify code after Unroll into a new function. NFC So that it can be shared with other passes that may end up doing the same thing. Differential Revision: https://reviews.llvm.org/D45874 llvm-svn: 332450	2018-05-16 10:41:58 +00:00
Amara Emerson	0d6a26dffc	[GlobalISel][IRTranslator] Split aggregates during IR translation. We currently handle all aggregates by creating one large LLT, and letting the legalizer deal with splitting them up. However using this approach means that we can't support big endian code correctly. This patch changes the way that the IRTranslator deals with aggregate values, by splitting them up into their constituent element values. To do this, parts of the translator need to be modified to deal with multiple VRegs for a single Value. A new Value to VReg mapper is introduced to help keep compile time under control, currently there is no measurable impact on CTMark despite the extra code being generated in some cases. Patch is based on the original work of Tim Northover. Differential Revision: https://reviews.llvm.org/D46018 llvm-svn: 332449	2018-05-16 10:32:02 +00:00
Simon Dardis	5cf9de4b72	[mips] Add support for isBranchOffsetInRange and use it for MipsLongBranch Add support for this target hook, covering MIPS, microMIPS and MIPSR6, along with some tests. Also add missing getOppositeBranchOpc() cases exposed by the tests. Reviewers: atanasyan, abeserminji, smaksimovic Differential Revision: https://reviews.llvm.org/D46794 llvm-svn: 332446	2018-05-16 10:03:05 +00:00
Peter Smith	c811758da6	[AArch64] Support "S" inline assembler constraint This patch re-introduces the "S" inline assembler constraint. This matches an absolute symbolic address or a label reference. The primary use case is asm("adrp %0, %1\n\t" "add %0, %0, :lo12:%1" : "=r"(addr) : "S"(&var)); I say re-introduces as it seems like "S" was implemented in the original AArch64 backend, but it looks like it wasn't carried forward to the merged backend. The original implementation had A and L modifiers that could be used to print ":lo12:" to the string. It looks like gcc doesn't use these and :lo12: is expected to be written in the inline assembly string so I've not implemented A and L. Clang already supports the S modifier. Fixes PR37180 Differential Revision: https://reviews.llvm.org/D46745 llvm-svn: 332444	2018-05-16 09:33:25 +00:00
Sander de Smalen	a680f558be	[AArch64][SVE] Asm: Support for structured LD2, LD3 and LD4 (scalar+scalar) load instructions. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D46679 llvm-svn: 332442	2018-05-16 09:16:20 +00:00
Alexander Richardson	8f44579d0b	Emit a left-shift instead of a power-of-two multiply for jump-tables Summary: SelectionDAGLegalize::ExpandNode() inserts an ISD::MUL when lowering a BR_JT opcode. While many backends optimize this multiply into a shift, the MIPS backend, for example, currently always lowers this into a sequence of load-immediate+multiply+mflo in MipsSETargetLowering::lowerMulDiv(). I initially changed the multiply to a shift in the MIPS backend but it turns out that would not have handled the MIPSR6 case and was a lot more code than doing it in LegalizeDAG. I believe performing this simple optimization in LegalizeDAG instead of each individual backend is the better solution since this also fixes other backends such as MSP430, which calls the multiply runtime function __mspabi_mpyi without this patch. Reviewers: sdardis, atanasyan, pftbest, asl Reviewed By: sdardis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45760 llvm-svn: 332439	2018-05-16 08:58:26 +00:00
Sander de Smalen	67f9154964	[AArch64][SVE] Asm: Support for contiguous PRF prefetch instructions. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D46682 llvm-svn: 332433	2018-05-16 07:50:09 +00:00
Fangrui Song	2cafed76d3	[Unix] Indent ChangeStd{in,out}ToBinary. llvm-svn: 332432	2018-05-16 06:43:27 +00:00

113740 Commits