llvm-project

Commit Graph

Author	SHA1	Message	Date
Peter Collingbourne	a7d936f0c0	Revert r329611, "AArch64: Allow offsets to be folded into addresses with ELF." Caused a build failure in check-tsan. llvm-svn: 329718	2018-04-10 16:19:30 +00:00
Jessica Paquette	e4b90d82a0	Add missing nullptr check to AArch64MachObjectWriter::recordRelocation There was missing nullptr check before a call to getSection() in recordRelocation. This would result in a segfault in code like the attached test. This adds the missing check and a test which makes sure we get the expected error output. llvm-svn: 329716	2018-04-10 15:53:28 +00:00
Nicolai Haehnle	b1c3b22b4c	AMDGPU/MC: Allow disassembling without symbol info Summary: We would like the UMR debugging tool[0] to be able to provide disassembly for currently live waves based on plain memory dumps, and we want to leverage the LLVM disassembler for this. This mostly works, except that UMR clearly can't provide real symbol info, so it wants to set DisInfo == nullptr. [0] https://cgit.freedesktop.org/amd/umr/ Reviewers: arsenm, rampitec, artem.tamazov, dp Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45477 Change-Id: Ibb2c5af2e66f2e100b4702fd81308e1932bc4ee6 llvm-svn: 329715	2018-04-10 15:46:43 +00:00
Aaron Smith	c0a5c01aeb	[PDB] Remove dead code and run clang format; NFC llvm-svn: 329712	2018-04-10 15:25:04 +00:00
Chad Rosier	af7519e9af	Fix spelling. NFC. llvm-svn: 329709	2018-04-10 14:57:13 +00:00
Pavel Labath	b7243ed2f4	[CodeGen/Dwarf] Rename the "sizetype" synthetic type and add it to the accelerator table Summary: This type is created on-demand and used as the base type for array ranges. Since it is "special", its construction did not go through the createTypeDIE function and so it was never inserted into the accelerator table, although it clearly belongs there. I add an explicit addAccelType call to insert it into the table. During review, we also decided to rename the type to something more unique to avoid confusion in case the user has own "sizetype" type. The new name for the type size __ARRAY_SIZE_TYPE__. Reviewers: JDevlieghere, aprantl, dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45445 llvm-svn: 329705	2018-04-10 14:23:41 +00:00
Simon Pilgrim	95f941117c	Fix whitespace indentation. NFCI. llvm-svn: 329704	2018-04-10 14:21:33 +00:00
Gabor Buella	3eab22d896	[X86] Disable SGX for Skylake Server Reviewers: craig.topper, zvi, echristo Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45057 llvm-svn: 329700	2018-04-10 13:58:57 +00:00
David Green	5ef933b02c	[DA] Improve alias checking in dependence analysis Improve the alias analysis to account for cases where we know that src/dst pairs cannot alias due to things like TBAA. As we know they are noalias, we know no dependency can occur. Also fixes issues around the size parameter to AA being incorrect. Differential Revision: https://reviews.llvm.org/D42381 llvm-svn: 329692	2018-04-10 11:37:21 +00:00
Francis Visoiu Mistrih	f2c22050e8	[AArch64] Use FP to access the emergency spill slot In the presence of variable-sized stack objects, we always picked the base pointer when resolving frame indices if it was available. This makes us hit an assert where we can't reach the emergency spill slot if it's too far away from the base pointer. Since on AArch64 we decide to place the emergency spill slot at the top of the frame, it makes more sense to use FP to access it. The changes here don't affect only emergency spill slots but all the frame indices. The goal here is to try to choose between FP, BP and SP so that we minimize the offset and avoid scavenging, or worse, asserting when trying to access a slot allocated by the scavenger. Previously discussed here: https://reviews.llvm.org/D40876. Differential Revision: https://reviews.llvm.org/D45358 llvm-svn: 329691	2018-04-10 11:29:40 +00:00
Tim Renouf	7190a4692a	[AMDGPU] For OS type AMDPAL, fixed scratch on compute shader Summary: For OS type AMDPAL, the scratch descriptor is loaded from offset 0 of the GIT, whose 32 bit pointer is in s0 (s8 for gfx9 merged shaders). This commit fixes that to use offset 0x10 instead of offset 0 for a compute shader, per the PAL ABI spec. V2: Ensure s0 (s8 for gfx9 merged shader) is marked live-in when loading scratch descriptor from GIT. Reviewers: kzhuravl, nhaehnle, timcorringham Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits, dstuttard, nhaehnle, arsenm Differential Revision: https://reviews.llvm.org/D44468 Change-Id: I93dffa647758e37f613bb5e0dfca840d82e6d26f llvm-svn: 329690	2018-04-10 11:25:15 +00:00
Tim Northover	6a1c51bf6b	AArch64: diagnose unpredictable store-exclusive instructions Much like any written register in load/store instructions, the status register is not allowed to overlap with any others. So diagnose it like we already do with the other cases. llvm-svn: 329687	2018-04-10 11:04:29 +00:00
Andrea Di Biagio	486358c153	[X86][Broadwell] HWPort5 should not be added to BroadwellModelProcResources. The BroadwellModelProcResources had an entry for HWPort5, which is a Haswell resource, and not a Broadwell processor resource. That entry was added to the Broadwell model because variable blends were consuming it. This was clearly a typo (the resource name should have been BWPort5), which unfortunately was never caught before. It was not reported as an error because HWPort5 is a resource defined by the Haswell model. It has been found when testing some code with llvm-mca: the list of resources in the resource pressure view was odd. This patch fixes the issue; now variable blend instructions consume 2 cycles on BWPort5 instead of HWPort5. This is enough to get rid of the extra (spurious) entry in the BroadWellModelProcResources table. llvm-svn: 329686	2018-04-10 10:49:41 +00:00
Sander de Smalen	f974e255fe	[AArch64][SVE] Asm: Add support for unpredicated LSL/LSR (shift by immediate) instructions. Reviewers: rengolin, fhahn, javed.absar, SjoerdMeijer, huntergr, t.p.northover, echristo, evandro Reviewed By: rengolin, fhahn Subscribers: tschuett, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45371 llvm-svn: 329681	2018-04-10 10:03:13 +00:00
Clement Courbet	b449379eae	[MC][TableGen] Add optional libpfm counter names for ProcResUnits. Summary: Subtargets can define the libpfm counter names that can be used to measure cycles and uops issued on ProcResUnits. This allows making llvm-exegesis available on more targets. Fixes PR36984. Reviewers: gchatelet, RKSimon, andreadb, craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45360 llvm-svn: 329675	2018-04-10 08:16:37 +00:00
Sander de Smalen	30fda45c18	[AArch64][SVE] Asm: Add support for SVE INDEX instructions. Reviewers: rengolin, fhahn, javed.absar, SjoerdMeijer, huntergr, t.p.northover, echristo, evandro Reviewed By: rengolin, fhahn Subscribers: tschuett, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D45370 llvm-svn: 329674	2018-04-10 07:01:53 +00:00
Chandler Carruth	0ca3bd0729	[x86] Model the direction flag (DF) separately from the rest of EFLAGS. This cleans up a number of operations that only claimed te use EFLAGS due to using DF. But no instructions which we think of us setting EFLAGS actually modify DF (other than things like popf) and so this needlessly creates uses of EFLAGS that aren't really there. In fact, DF is so restrictive it is pretty easy to model. Only STD, CLD, and the whole-flags writes (WRFLAGS and POPF) need to model this. I've also somewhat cleaned up some of the flag management instruction definitions to be in the correct .td file. Adding this extra register also uncovered a failure to use the correct datatype to hold X86 registers, and I've corrected that as necessary here. Differential Revision: https://reviews.llvm.org/D45154 llvm-svn: 329673	2018-04-10 06:40:51 +00:00
Craig Topper	7e42af87a6	[X86] Prevent folding loads with 64-bit ANDs with immediates that fit in 32-bits. Prefer to use the 32-bit AND with immediate instead. Primarily I'm doing this to ensure that immediates created by shrinkAndImmediate will always get absorbed into the AND. But I do believe this would be a reduction in the number of uops that need to execute. Ideally we should shrink the 'and' and the 'load' during DAG combine to re-enable the fold. Fixes PR37063. llvm-svn: 329667	2018-04-10 03:44:15 +00:00
Michael Zolotukhin	d6beefd5d3	Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time. This reverts r329661. Bots are still unhappy. llvm-svn: 329666	2018-04-10 03:40:29 +00:00
Michael Zolotukhin	8a13f6d4a7	Revert "Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading."" This reapplies commit r329644. llvm-svn: 329661	2018-04-10 02:16:45 +00:00
Michael Zolotukhin	aa7868594e	[SSAUpdaterBulk] Handle CFG with unreachable from entry blocks. llvm-svn: 329660	2018-04-10 02:16:29 +00:00
Alexandre Ganea	08df84e4f0	[DebugInfo][COFF] Fix reading variable-length encoded records While reading Codeview records which contain variable-length encoded integers, such as LF_BCLASS, LF_ENUMERATE, LF_MEMBER, LF_VBCLASS or LF_IVBCLASS, the record's size would be improperly calculated in cases where the value was indeed of a variable length (>= LF_NUMERIC). This caused a bad alignement on the next record, which would/might crash later on. Differential Revision: https://reviews.llvm.org/D45104 llvm-svn: 329659	2018-04-10 01:58:45 +00:00
Chandler Carruth	19618fc639	[x86] Introduce a pass to begin more systematically fixing PR36028 and similar issues. The key idea is to lower COPY nodes populating EFLAGS by scanning the uses of EFLAGS and introducing dedicated code to preserve the necessary state in a GPR. In the vast majority of cases, these uses are cmovCC and jCC instructions. For such cases, we can very easily save and restore the necessary information by simply inserting a setCC into a GPR where the original flags are live, and then testing that GPR directly to feed the cmov or conditional branch. However, things are a bit more tricky if arithmetic is using the flags. This patch handles the vast majority of cases that seem to come up in practice: adc, adcx, adox, rcl, and rcr; all without taking advantage of partially preserved EFLAGS as LLVM doesn't currently model that at all. There are a large number of operations that techinaclly observe EFLAGS currently but shouldn't in this case -- they typically are using DF. Currently, they will not be handled by this approach. However, I have never seen this issue come up in practice. It is already pretty rare to have these patterns come up in practical code with LLVM. I had to resort to writing MIR tests to cover most of the logic in this pass already. I suspect even with its current amount of coverage of arithmetic users of EFLAGS it will be a significant improvement over the current use of pushf/popf. It will also produce substantially faster code in most of the common patterns. This patch also removes all of the old lowering for EFLAGS copies, and the hack that forced us to use a frame pointer when EFLAGS copies were found anywhere in a function so that the dynamic stack adjustment wasn't a problem. None of this is needed as we now lower all of these copies directly in MI and without require stack adjustments. Lots of thanks to Reid who came up with several aspects of this approach, and Craig who helped me work out a couple of things tripping me up while working on this. Differential Revision: https://reviews.llvm.org/D45146 llvm-svn: 329657	2018-04-10 01:41:17 +00:00
Vlad Tsyrklevich	0cdc6ec535	ShadowCallStack/x86_64: Ignore pseudo-machine instructions llvm-svn: 329656	2018-04-10 01:31:01 +00:00
Vitaly Buka	6c05a3bb71	Object: Don't mark alias unconditionally defined Summary: Can't remove EmitAssignment override as llvm/test/Object/X86/nm-bitcodeweak.test expects this behavior. Reviewers: pcc, espindola Subscribers: mehdi_amini, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D44596 llvm-svn: 329651	2018-04-10 00:53:16 +00:00
Michael Zolotukhin	0274632ee6	Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading." This reverts commit r329644. llvm-svn: 329650	2018-04-10 00:42:43 +00:00
Hideki Saito	d829973794	Fix for the buildbot failure. Now-unused private field TTI deleted. llvm-svn: 329649	2018-04-10 00:38:36 +00:00
Alexandre Ganea	3241cec577	Fix line endings (CR/LF -> LF) introduced by rL329613 reviewer: zturner llvm-svn: 329646	2018-04-10 00:09:15 +00:00
Hideki Saito	dfa932b049	[NFC][LV] Move InterleaveInfo from Legal to CostModel Summary: Another clean up, following D43208. Interleaved memory access analysis/optimization has nothing to do with vectorization legality. It doesn't really belong there. On the other hand, cost model certainly has to know about it. In principle, vectorization should proceed like Legality ==> Optimization ==> CostModel ==> CodeGen, and this change just does that, by moving the interleaved access analysis/decision out of Legal, and run it just before CostModel object is created. After this, I can move LoopVectorizationLegality and Hints/Requirements classes into it's own header file, making it shareable within Transform tree. I have the patch already but I don't want to mix with this change. Eventual goal is to move to Analysis tree, but I first need to move RecurrenceDescriptor/InductionDescriptor from Transform/Util/LoopUtil.* to Analysis. Reviewers: rengolin, hfinkel, mkuper, dcaballe, sguggill, fhahn, aemerson Reviewed By: rengolin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45072 llvm-svn: 329645	2018-04-09 23:45:40 +00:00
Michael Zolotukhin	c6d2d65f37	[PR16756] Use SSAUpdaterBulk in JumpThreading. Summary: SSAUpdater is a bottleneck in JumpThreading, and this patch improves the situation by using SSAUpdaterBulk instead. Compile time impact: no noticable changes on CTMark, a big improvement on the test from PR16756. Reviewers: dberlin, davide, MatzeB Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D44282 llvm-svn: 329644	2018-04-09 23:37:37 +00:00
Michael Zolotukhin	52b064f3d3	[PR16756] Add SSAUpdaterBulk. Summary: SSAUpdater is a bottleneck in a number of passes, and one of the reasons is that it performs a lot of unnecessary computations (DT/IDF) over and over again. This patch adds a new SSAUpdaterBulk that uses existing DT and avoids recomputing IDF when possible. Reviewers: dberlin, davide, MatzeB Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D44282 llvm-svn: 329643	2018-04-09 23:37:20 +00:00
George Burgess IV	0034e393d9	[MemorySSA] remove cruft; NFC. The caching walker used to hold its own caches, which made its `reset()` function meaningful. Since caching has been moved out of it, there's no reason to continue to have these cache-related methods. Similarly, the EXPENSIVE_CHECKS block that's getting removed used to rerun the query with caching disabled. Since that's how we always do queries now, it's redundant. llvm-svn: 329638	2018-04-09 23:09:27 +00:00
George Burgess IV	2a84e4ab12	[MemorySSA] Remove redundant assert; NFC The `if (!Def && !Use) return nullptr;` right above this assert sort of defeats the purpose. llvm-svn: 329632	2018-04-09 22:45:14 +00:00
Daniel Sanders	5281b02e84	[globalisel][legalizerinfo] Add support for the Lower action in getActionDefinitionsBuilder() and use it in AArch64. Lower is slightly odd. It often doesn't change the type but the lowerings do use the new type to decide what code to create. Treat it like a mutation but provide convenience functions that re-use the existing type. Re-uses the existing tests: test/CodeGen/AArch64/GlobalISel/legalize-rem.mir test/CodeGen/AArch64/GlobalISel//legalize-mul.mir test/CodeGen/AArch64/GlobalISel//legalize-cmpxchg-with-success.mir llvm-svn: 329623	2018-04-09 21:10:09 +00:00
Matt Arsenault	97b6b1b926	Fix printing of stack id in MachineFrameInfo uint8_t is printed as a char, so it needs to be casted to do the right thing. llvm-svn: 329622	2018-04-09 21:04:30 +00:00
Zhaoshi Zheng	43af17be41	[MemorySSAUpdater] Mark Phi users of a node being moved as non-optimize Fix PR36484, as suggested: <quote> during moves, mark the direct users of the erased things that were phis as "not to be optimized" <quote> llvm-svn: 329621	2018-04-09 20:55:37 +00:00
Konstantin Zhuravlyov	6183065b97	AMDGPU: Remove max_scratch_backing_memory_byte_size from kernel header 1. Remove max_scratch_backing_memory_byte_size from kernel header 2. Make it a reserved field 3. Ignore it while parsing assembly for backwards compatibility 4. Bump up minor version of kernel header Differential Revision: https://reviews.llvm.org/D45452 llvm-svn: 329620	2018-04-09 20:47:22 +00:00
Craig Topper	47b2f9d836	[X86] Don't use Lower512IntUnary to split bitcasts with v32i16/v64i8 types on targets without AVX512BW. LowerIntUnary as its name says has an assert for integer types. But for the bitcast case one side might be an FP type. Rather than making sure the function really works for fp types and renaming it. Just do really basic splitting directly. The LowerIntUnary has the advantage that it can peek through BUILD_VECTOR because every other call is during Lowering. But these calls are during legalization and will be followed by a DAG combine round. Revert some change to LowerVectorIntUnary that were originally made just to make these two calls work even in pure integer cases. This was found purely by compiling the avx512f-builtins.c test from clang so I've copied over the offending function from that. llvm-svn: 329616	2018-04-09 20:37:14 +00:00
Alexandre Ganea	d9e96741c4	[Debuginfo][COFF] Minimal serialization support for precompiled types records This change adds support for the LF_PRECOMP and LF_ENDPRECOMP records required to read/write Microsoft precompiled types .objs. See https://en.wikipedia.org/wiki/Precompiled_header#Microsoft_Visual_C_and_C++ This also adds handling for the .debug$P section, which is actually a .debug$T section in disguise, found only in precompiled .objs. Differential Revision: https://reviews.llvm.org/D45283 llvm-svn: 329613	2018-04-09 20:17:56 +00:00
Peter Collingbourne	5cff2409ae	AArch64: Allow offsets to be folded into addresses with ELF. This is a code size win in code that takes offseted addresses frequently, such as C++ constructors that typically need to compute an offseted address of a vtable. It reduces the size of Chromium for Android's .text section by 46KB, or 56KB with ThinLTO (which exposes more opportunities to use a direct access rather than a GOT access). Because the addend range is limited in COFF and Mach-O, this is enabled for ELF only. Differential Revision: https://reviews.llvm.org/D45199 llvm-svn: 329611	2018-04-09 19:59:57 +00:00
Alex Shlyapnikov	79f2c720b5	Revert "AMDGPU: enable 128-bit for local addr space under an option" This reverts commit r329591. It breaks various bots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/16516 http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/17374 http://lab.llvm.org:8011/builders/clang-ppc64le-linux/builds/15992 http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/11251 ... llvm-svn: 329610	2018-04-09 19:47:38 +00:00
Mandeep Singh Grang	afa3aaf14d	[WebAssembly] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: sunfish, RKSimon Reviewed By: sunfish Subscribers: jfb, dschuff, sbc100, jgravelle-google, aheejin, llvm-commits Differential Revision: https://reviews.llvm.org/D44873 llvm-svn: 329607	2018-04-09 19:38:31 +00:00
Daniel Sanders	c01efe690f	Fix type mismatch between MachineMemOperand constructor and accessors. NFC This allows MachineMemOperand::getSize()'s result to be fed directly into MachineMemOperand::MachineMemOperand() without a narrowing type conversion warning. llvm-svn: 329602	2018-04-09 18:42:19 +00:00
Erik Pilkington	d43931dcb8	[demangler] Support for fold expressions. llvm-svn: 329601	2018-04-09 18:33:01 +00:00
Erik Pilkington	452e2ef996	[demangler] Support for <data-member-prefix>. llvm-svn: 329600	2018-04-09 18:32:25 +00:00
Erik Pilkington	650130ac04	[demangler] Support for partially substituted sizeof.... llvm-svn: 329599	2018-04-09 18:31:50 +00:00
Aditya Nandakumar	b1c467dbe7	[GISel] Refactor MachineIRBuilder to allow transformations while building. https://reviews.llvm.org/D45067 This change attempts to do two things: 1) It separates out the state that is stored in the MachineIRBuilder(InsertionPt, MF, MRI, InsertFunction etc) into a separate object called MachineIRBuilderState. 2) Add the ability to constant fold operations while building instructions (optionally). MachineIRBuilder is now refactored into a MachineIRBuilderBase which contains lots of non foldable build methods and their implementation. Instructions which can be constant folded/transformed are now in a class called FoldableInstructionBuilder which uses CRTP to use the implementation of the derived class for buildBinaryOps. Additionally buildInstr in the derived class can be used to implement other kinds of transformations. Also because of separation of state, given a MachineIRBuilder in an API, if one wishes to use another MachineIRBuilder, a new one can be constructed from the state locally. For eg, void doFoo(MachineIRBuilder &B) { MyCustomBuilder CustomB(B.getState()); // Use CustomB for building. } reviewed by : aemerson llvm-svn: 329596	2018-04-09 17:30:56 +00:00
Craig Topper	0c2a12cb3e	[X86] Revert the SLM part of r328914. While it appears to be correct information based on Intel's optimization manual and Agner's data, it causes perf regressions on a couple of the benchmarks in our internal list. llvm-svn: 329593	2018-04-09 17:07:40 +00:00
Marek Olsak	52b033b827	AMDGPU: enable 128-bit for local addr space under an option Author: Samuel Pitoiset ds_read_b128 and ds_write_b128 have been recently enabled under the amdgpu-ds128 option because the performance benefit is unclear. Though, using 128-bit loads/stores for the local address space appears to introduce regressions in tessellation shaders. Not sure what is broken, but as ds_read_b128/ds_write_b128 are not enabled by default, just introduce a global option and enable 128-bit only if requested (until it's fixed/used correctly). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464 llvm-svn: 329591	2018-04-09 16:56:32 +00:00
Tom Stellard	e753c52227	AMDGPU: Initialize GlobalISel passes Summary: This fixes AMDGPU GlobalISel test failures when enabling the AMDGPU target without any other targets that use GlobalISel. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D45353 llvm-svn: 329588	2018-04-09 16:09:13 +00:00
Simon Pilgrim	23c2182c2b	Support generic expansion of ordered vector reduction (PR36732) Without the fast math flags, the llvm.experimental.vector.reduce.fadd/fmul intrinsic expansions must be expanded in order. This patch scalarizes the reduction, applying the accumulator at the start of the sequence: ((((Acc + Scl[0]) + Scl[1]) + Scl[2]) + ) ... + Scl[NumElts-1] Differential Revision: https://reviews.llvm.org/D45366 llvm-svn: 329585	2018-04-09 15:44:20 +00:00
Zaara Syeda	935474fef5	[MachineLICM] Re-enable hoisting of constant stores This patch fixes an issue exposed on the SystemZ build bots when committing https://reviews.llvm.org/rL327856. The hoisting was temporarily disabled with an option. This patch now re-enables hoisting and checks that we only hoist a store instruction when all its operands are either constant caller preserved registers or immediates. Differential Revision: https://reviews.llvm.org/D45286 llvm-svn: 329577	2018-04-09 14:50:02 +00:00
Pavel Labath	eadfac8748	[CodeGen/AccelTable] Don't emit zero-CU name indexes Summary: If an input DICompileUnit is completely empty (e.g., the result of running "clang -g" on an empty file), we don't bother emitting an empty DWARF CU. When we do that, we must make sure we don't also emit a DWARF v5 name index, as DWARF specifies that each index must reference at least one compilation unit. Reviewers: JDevlieghere, aprantl, dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45435 llvm-svn: 329575	2018-04-09 14:38:53 +00:00
Xin Tong	fdad23bc36	[MergeICmp] Update debug msg.NFC llvm-svn: 329572	2018-04-09 14:29:13 +00:00
Simon Pilgrim	e5ed5e2cba	[X86][MMX] Fix missing itinerary for PALIGNR llvm-svn: 329568	2018-04-09 13:52:33 +00:00
Simon Pilgrim	140fee078f	[X86][MMX] Fix missing itinerary for MOVQ2DQ instruction format llvm-svn: 329567	2018-04-09 13:42:14 +00:00
Simon Pilgrim	abf3611332	[X86][MMX] Fix missing itinerary for CVTPI2PS llvm-svn: 329565	2018-04-09 13:27:47 +00:00
Xin Tong	0efadbbcde	[MergeICmp] Split blocks that do other work. Summary: We do not try to move the instructions and split the block till we know the blocks can be split, i.e. BCE-cmp-insts can be separated from non-BCE-cmp-insts. Reviewers: davide, courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44443 llvm-svn: 329564	2018-04-09 13:14:06 +00:00
Dmitry Preobrazhensky	2f8e146ad3	[AMDGPU][MC][GFX9] Added instructions s_mul_hi_32, s_lshl_add_u32 See bugs 36841: https://bugs.llvm.org/show_bug.cgi?id=36841 36842: https://bugs.llvm.org/show_bug.cgi?id=36842 Differential Revision: https://reviews.llvm.org/D45251 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 329562	2018-04-09 13:10:33 +00:00
Simon Pilgrim	0047efdd1e	[X86][MMX] Fix flipped reg/mem typo in MMX_MISC_FUNC_ITINS The RR/RM itineraries were the wrong way around llvm-svn: 329561	2018-04-09 13:02:07 +00:00
Simon Pilgrim	6131286553	[X86][SSE] Fix f32 mul/div itinerary groups typo The RM folded itineraries were incorrectly using the f64 version. llvm-svn: 329556	2018-04-09 10:45:53 +00:00
Pavel Labath	889bf9fe00	[CodeGen/AccelTable]: Don't emit accelerator entries for functions with no names Summary: We were emitting accelerator entries for functions with no name, which is contrary to the DWARF v5 spec: "All other (i.e., not DW_TAG_namespace) debugging information entries without a DW_AT_name attribute are excluded." Besides that, a name table entry with an empty string as a key is fairly useless. We can sometimes end up with functions which have a DW_AT_linkage_name but no DW_AT_name. One such example is the global-constructor-initialization functions, which C++ compilers synthesize for each compilation unit with global constructors. A very strict reading of the DWARF v5 spec would suggest that we should not even emit the accelerator entry for the linkage name in this case, but I don't think we should go that far. I found this when running the dwarf verifier over llvm codebase compiled with DWARF v5 accelerator tables. Reviewers: JDevlieghere, aprantl, dblaikie Subscribers: vleschuk, clayborg, echristo, probinson, llvm-commits Differential Revision: https://reviews.llvm.org/D45367 llvm-svn: 329552	2018-04-09 08:41:57 +00:00
Sam Parker	1f4f4d9a08	[DAGCombine] Improve ReduceLoad for SRL Recommitting r329283, third time lucky... If the SRL node is only used by an AND, we may be able to set the ExtVT to the width of the mask, making the AND redundant. To support this, another check has been added in isLegalNarrowLoad which queries whether the load is valid. Differential Revision: https://reviews.llvm.org/D41350 llvm-svn: 329551	2018-04-09 08:16:11 +00:00
Craig Topper	c95e122cc7	[X86] Merge some of the autoupgrade handling for masked intrinsics that just need to upgrade to an unmasked version plus a select. NFCI These are were previously grouped in small groups of similarish intrinsics. But all the intrinsics have the same number of arguments and the same order. So we can move them all into a larger group for handling. llvm-svn: 329549	2018-04-09 06:15:09 +00:00
Max Kazantsev	8624a4786a	[IRCE] Relax restriction on collected range checks In IRCE, we have a very old legacy check that works when we collect comparisons that we treat as range checks. It ensures that the value against which the indvar is compared is loop invariant and is also positive. This latter condition remained there since the times when IRCE was only able to handle signed latch comparison. As the optimization evolved, it now learned how to intersect signed or unsigned ranges, and this logic has no reliance on the fact that the right border of each range should be positive. The old implementation of this non-negativity check was also naive enough and just looked into ranges (while most of other IRCE logic tries to use power of SCEV implications), so this check did not allow to deal with the most simple case that looks like follows: int size; // not known non-negative int length; //known non-negative; i = 0; if (size != 0) { do { range_check(i < size); range_check(i < length); ++i; } while (i < size) } In this case, even if from some dominating conditions IRCE could parse loop structure, it could only remove the range check against `length` and simply ignored the check against `size`. In this patch we remove this obsolete check. It will allow IRCE to pick comparison against `size` as a potential range check and then let Range Intersection logic decide whether it is OK to eliminate it or not. Differential Revision: https://reviews.llvm.org/D45362 Reviewed By: samparker llvm-svn: 329547	2018-04-09 06:01:22 +00:00
Hiroshi Inoue	9ff2380ea6	[NFC] fix trivial typos in comments and error message "is is" -> "is", "are are" -> "are" llvm-svn: 329546	2018-04-09 04:37:53 +00:00
Michael Zolotukhin	8d052a0dd2	Remove MachineLoopInfo dependency from AsmPrinter. Summary: Currently MachineLoopInfo is used in only two places: 1) for computing IsBasicBlockInsideInnermostLoop field of MCCodePaddingContext, and it is never used. 2) in emitBasicBlockLoopComments, which is called only if `isVerbose()` is true. Despite that, we currently have a dependency on MachineLoopInfo, which makes pass manager to compute it and MachineDominator Tree. This patch removes the use (1) and makes the use (2) lazy, thus avoiding some redundant recomputations. Reviewers: opaparo, gadi.haber, rafael, craig.topper, zvi Subscribers: rengolin, javed.absar, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D44812 llvm-svn: 329542	2018-04-09 00:54:47 +00:00
Sanjay Patel	0d7df36c66	[TargetSchedule] shrink interface for init(); NFCI The TargetSchedModel is always initialized using the TargetSubtargetInfo's MCSchedModel and TargetInstrInfo, so we don't need to extract those and pass 3 parameters to init(). Differential Revision: https://reviews.llvm.org/D44789 llvm-svn: 329540	2018-04-08 19:56:04 +00:00
Craig Topper	b7baa358f6	[X86] Add SchedWrites for CMOV and SETCC. Use them to remove InstRWs. Summary: Cmov and setcc previously used WriteALU, but on Intel processors at least they are more restricted than basic ALU ops. This patch adds new SchedWrites for them and removes the InstRWs. I had to leave some InstRWs for CMOVA/CMOVBE and SETA/SETBE because those have an extra uop relative to the other condition codes on Intel CPUs. The test changes are due to fixing a missing ZnAGU dependency on the memory form of setcc. Reviewers: RKSimon, andreadb, GGanesh Reviewed By: RKSimon Subscribers: GGanesh, llvm-commits Differential Revision: https://reviews.llvm.org/D45380 llvm-svn: 329539	2018-04-08 17:53:18 +00:00
Craig Topper	c362f42b6a	[X86][Znver1] Remove InstRWs for BLENDVPS/PD Summary: This removes the InstRWs for BLENDVPS/PD in favor of WriteFVarBlend. The latency listed was 3 cycles but WriteFVarBlend is defined as 1 cycle latency. The 1 cycle latency matches Agner Fog's data. The patterns were missing the VEX forms which is why there are no test changes. We don't test "-mcpu=znver1 -mattr=-avx" Reviewers: RKSimon, GGanesh Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44841 llvm-svn: 329538	2018-04-08 17:53:15 +00:00
Mandeep Singh Grang	68ab401f62	[Support] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: chandlerc, jordan_rose, bkramer Reviewed By: bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45140 llvm-svn: 329536	2018-04-08 16:46:22 +00:00
Mandeep Singh Grang	327fd5e47c	[PowerPC] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: hfinkel, RKSimon Reviewed By: RKSimon Subscribers: nemanjai, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D44870 llvm-svn: 329535	2018-04-08 16:45:04 +00:00
Mandeep Singh Grang	68a151a13c	[X86] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: chandlerc, craig.topper, RKSimon Reviewed By: chandlerc, craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44874 llvm-svn: 329534	2018-04-08 16:42:52 +00:00
Xin Tong	99c4e2f364	[LIR] Reorder header. NFC llvm-svn: 329530	2018-04-08 13:19:53 +00:00
Zvi Rackover	7a53f169f1	DAGCombiner: Combine SDIV with non-splat vector pow2 divisor Summary: Extend existing SDIV combine for pow2 constant divider to handle non-splat vectors of pow2 constants. Reviewers: RKSimon, craig.topper, spatel, hfinkel, efriedma Reviewed By: RKSimon Subscribers: magabari, llvm-commits Differential Revision: https://reviews.llvm.org/D42479 llvm-svn: 329525	2018-04-08 11:35:20 +00:00
Simon Pilgrim	86588fc809	[X86][Btver2] Add vector extract costs llvm-svn: 329524	2018-04-08 11:26:26 +00:00
Michal Gorny	47671e31cd	[LLVMTestingSupport] Add explicit linkage to LLVMSupport Explicitly link LLVMTestingSupport library against LLVMSupport. This is necessary to fix linking errors when LLVMTestingSupport is built as a shared library (with BUILD_SHARED_LIBS=ON) and -Wl,-z,defs is used. Differential Revision: https://reviews.llvm.org/D45408 llvm-svn: 329522	2018-04-08 06:49:17 +00:00
Guozhi Wei	0eb86c8efc	[DAGCombiner] Fold (zext (and/or/xor (shl/shr (load x), cst), cst)) In our real world application, we found the following optimization is missed in DAGCombiner (zext (and/or/xor (shl/shr (load x), cst), cst)) -> (and/or/xor (shl/shr (zextload x), (zext cst)), (zext cst)) If the user of original zext is an add, it may enable further lea optimization on x86. This patch add a new function CombineZExtLogicopShiftLoad to do this optimization. Differential Revision: https://reviews.llvm.org/D44402 llvm-svn: 329516	2018-04-07 23:36:10 +00:00
Craig Topper	ef37aebc96	[X86] Combine vXi64 multiplies to MULDQ/MULUDQ during DAG combine instead of lowering. Previously we used a custom lowering for this because of the AVX1 splitting requirement. But we can do the split during DAG combine if we check the types and subtarget llvm-svn: 329510	2018-04-07 19:09:52 +00:00
Craig Topper	5b95eae1c3	[DAGCombiner] Add a combine to turn a build vector of zero extends of extract vector elts into a vector zero extend and possibly an extract subvector. llvm-svn: 329509	2018-04-07 19:09:50 +00:00
Sanjay Patel	2a24958923	[InstCombine] simplify code that propagates FMF; NFC llvm-svn: 329503	2018-04-07 14:14:23 +00:00
Simon Pilgrim	80ce1dde44	[CostModel][X86] Fix v32i16/v64i8 SETCC costs on AVX512BW targets llvm-svn: 329498	2018-04-07 13:24:33 +00:00
Tim Northover	e25e458d52	Reapply ARM: Do not spill CSR to stack on entry to noreturn functions Should fix UBSan bot by also checking there's no "uwtable" attribute before skipping. Otherwise the unwind table will be useless since its moves expect CSRs to actually be preserved. A noreturn nounwind function can be expected to never return in any way, and by never returning it will also never have to restore any callee-saved registers for its caller. This makes it possible to skip spills of those registers during function entry, saving some stack space and time in the process. This is rather useful for embedded targets with limited stack space. Should fix PR9970. Patch mostly by myeisha (pmb). llvm-svn: 329494	2018-04-07 10:57:03 +00:00
Roman Lebedev	41922f1a6d	[InstCombine] Get rid of select of bittest (PR36950 / PR17564) Summary: See [[ https://bugs.llvm.org/show_bug.cgi?id=36950 \| PR36950 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=17564 \| PR17564 ]], D45065, D45107 https://godbolt.org/g/iAYRup Alive proof: https://rise4fun.com/Alive/uiH Testing: `ninja check-llvm` Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45108 llvm-svn: 329492	2018-04-07 10:37:24 +00:00
Robert Widmann	f53050f010	[LLVM-C] Move DIBuilder Bindings For Block Scopes Summary: Move LLVMDIBuilderCreateFunction , LLVMDIBuilderCreateLexicalBlock, and LLVMDIBuilderCreateLexicalBlockFile from Go to LLVM-C. Reviewers: whitequark, harlanhaskins, deadalnix Reviewed By: whitequark, harlanhaskins Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45352 llvm-svn: 329488	2018-04-07 06:07:55 +00:00
Vitaly Buka	de5f196530	Revert "ARM: Do not spill CSR to stack on entry to noreturn functions" Breaks ubsan test TestCases/Misc/missing_return.cpp on ARM This reverts commit r329287 llvm-svn: 329486	2018-04-07 05:36:44 +00:00
Nico Weber	377e247018	Convert line endings of lib/WindowsManifest/CMakeLists.txt to unix. llvm-svn: 329483	2018-04-07 04:28:08 +00:00
Nico Weber	b64da22db7	Remove trailing space in build file. llvm-svn: 329479	2018-04-07 03:30:28 +00:00
Graydon Hoare	54fe208a5f	[Support] Make line-number cache robust against access patterns. Summary: The LLVM SourceMgr class (which is used indirectly by Swift, though not Clang) has a routine for looking up line numbers of SMLocs. This routine uses a shared, special-purpose cache that handles exactly one access pattern efficiently: looking up the line number of an SMLoc that points into the same buffer as the last query made to the SourceMgr, at a location in the buffer at or ahead of the last query. When this works it's fine, but when it fails it's catastrophic for performancer: one recent out-of-order access from a Swift utility routine ran for tens of seconds, spending 99% of its time repeatedly scanning buffers for '\n'. This change removes the shared cache from the SourceMgr and installs a new cache in each SrcBuffer. The per-SrcBuffer caches are also "full", in the sense that rather than caching a single last-query pointer, they cache _all_ the line-ending offsets, in a binary-searchable array, such that once it's populated (on first access), all subsequent access patterns run at the same speed. Performance measurements I've done show this is actually a little bit faster on real codebases (though only a couple fractions of a percent). Memory usage is up by a few tens to hundreds of bytes per SrcBuffer that has a line lookup done on it; I've attempted to minimize this by using dynamic selection of integer sized when storing offset arrays. But the main motive here is to make-impossible the cases we don't always see, that show up by surprise when there is an out-of-order access pattern. Reviewers: jordan_rose Reviewed By: jordan_rose Subscribers: probinson, llvm-commits Differential Revision: https://reviews.llvm.org/D45003 llvm-svn: 329470	2018-04-07 00:44:02 +00:00
Aaron Smith	8a5ea61886	Windows needs the current codepage instead of utf8 sometimes Llvm-mc (and tools that use Path.inc on Windows) assume that strings are utf-8 encoded, however, this is not always the case. On Windows the default codepage is not utf-8, so most of the time the strings are not utf-8 encoded. The lld test 'format-binary-non-ascii' uses llvm-mc with a file with non-ascii characters in the name which is how this bug was found. The test fails when run using Python 3 because it uses properly encoded unicode strings (Python 2 actually ends up using a byte string which is not utf-8 encoded, so the test passes, but that's separate issue). Patch by Stella Stamenova! llvm-svn: 329468	2018-04-07 00:32:59 +00:00
Artem Belevich	f256decdc4	[NVPTX] add support for initializing fp16 arrays. Previously HalfTy was not handled which would either trigger an assertion, or result in array initialized with garbage. Differential Revision: https://reviews.llvm.org/D45391 llvm-svn: 329463	2018-04-06 22:25:08 +00:00
Vitaly Buka	9cb59b92cc	Fix warning by cl::opt<int> -> cl::opt<unsigned> llvm-svn: 329461	2018-04-06 21:41:17 +00:00
Vitaly Buka	66f53d71f7	Runtime flag to control branch funnel threshold Reviewers: pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45193 llvm-svn: 329459	2018-04-06 21:32:36 +00:00
Artem Belevich	a28e598ebb	[NVPTX] Fixed vectorized LDG for f16. v2f16 is a special case in NVPTX. v4f16 may be loaded as a pair of v2f16 and that was not previously handled correctly by tryLDGLDU() Differential Revision: https://reviews.llvm.org/D45339 llvm-svn: 329456	2018-04-06 21:10:24 +00:00
Sameer AbuAsal	c1b0e66b58	[RISCV] Tablegen-driven Instruction Compression. Summary: This patch implements a tablegen-driven Instruction Compression mechanism for generating RISCV compressed instructions (C Extension) from the expanded instruction form. This tablegen backend processes CompressPat declarations in a td file and generates all the compile-time and runtime checks required to validate the declarations, validate the input operands and generate correct instructions. The checks include validating register operands, immediate operands, fixed register operands and fixed immediate operands. Example: class CompressPat<dag input, dag output> { dag Input = input; dag Output = output; list<Predicate> Predicates = []; } let Predicates = [HasStdExtC] in { def : CompressPat<(ADD GPRNoX0:$rs1, GPRNoX0:$rs1, GPRNoX0:$rs2), (C_ADD GPRNoX0:$rs1, GPRNoX0:$rs2)>; } The result is an auto-generated header file 'RISCVGenCompressEmitter.inc' which exports two functions for compressing/uncompressing MCInst instructions, plus some helper functions: bool compressInst(MCInst& OutInst, const MCInst &MI, const MCSubtargetInfo &STI, MCContext &Context); bool uncompressInst(MCInst& OutInst, const MCInst &MI, const MCRegisterInfo &MRI, const MCSubtargetInfo &STI); The clients that include this auto-generated header file and invoke these functions can compress an instruction before emitting it, in the target-specific ASM or ELF streamer, or can uncompress an instruction before printing it, when the expanded instruction format aliases is favored. The following clients were added to implement compression\uncompression for RISCV: 1) RISCVAsmParser::MatchAndEmitInstruction: Inserted a call to compressInst() to compresses instructions parsed by llvm-mc coming from an ASM input. 2) RISCVAsmPrinter::EmitInstruction: Inserted a call to compressInst() to compress instructions that were lowered from Machine Instructions (MachineInstr). 3) RVInstPrinter::printInst: Inserted a call to uncompressInst() to print the expanded version of the instruction instead of the compressed one (e.g, add s0, s0, a5 instead of c.add s0, a5) when -riscv-no-aliases is not passed. This patch squashes D45119, D42780 and D41932. It was reviewed in smaller patches by asb, efriedma, apazos and mgrang. Reviewers: asb, efriedma, apazos, llvm-commits, sabuasal Reviewed By: sabuasal Subscribers: mgorny, eraman, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, niosHD, kito-cheng, shiva0217, zzheng Differential Revision: https://reviews.llvm.org/D45385 llvm-svn: 329455	2018-04-06 21:07:05 +00:00
Mandeep Singh Grang	1b0e2f2a20	[TableGen] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: stoklund, kparzysz, dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45144 llvm-svn: 329451	2018-04-06 20:18:05 +00:00
Matt Davis	13b8331054	[StackProtector] Ignore certain intrinsics when calculating sspstrong heuristic. Summary: The 'strong' StackProtector heuristic takes into consideration call instructions. Certain intrinsics, such as lifetime.start, can cause the StackProtector to protect functions that do not need to be protected. Specifically, a volatile variable, (not optimized away), but belonging to a stack allocation will encourage a llvm.lifetime.start to be inserted during compilation. Because that intrinsic is a 'call' the strong StackProtector will see that the alloca'd variable is being passed to a call instruction, and insert a stack protector. In this case the intrinsic isn't really lowered to a call. This can cause unnecessary stack checking, at the cost of additional (wasted) CPU cycles. In the future we should rely on TargetTransformInfo::isLoweredToCall, but as of now that routine considers all intrinsics as not being lowerable. That needs to be corrected, and such a change is on my list of things to get moving on. As a side note, the updated stack-protector-dbginfo.ll test always seems to pass. I never see the dbg.declare/dbg.value reaching the StackProtector::HasAddressTaken, but I don't see any code excluding dbg intrinsic calls either, so I think it's the safest thing to do. Reviewers: void, timshen Reviewed By: timshen Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45331 llvm-svn: 329450	2018-04-06 20:14:13 +00:00
Geoff Berry	5bf4a5eafa	[EarlyCSE] Add debug counter for debugging mis-optimizations. NFC. Reviewers: reames, spatel, davide, dberlin Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D45162 llvm-svn: 329443	2018-04-06 18:47:33 +00:00
Dmitry Preobrazhensky	ae31223ba7	[AMDGPU][MC][GFX9] Added s_call_b64 See bug 36843: https://bugs.llvm.org/show_bug.cgi?id=36843 Differential Revision: https://reviews.llvm.org/D45268 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 329440	2018-04-06 18:24:49 +00:00
Krzysztof Parzyszek	b7e54e8482	[Hexagon] Fix assert with packetizing IMPLICIT_DEF instructions The compiler is generating packet with the following instructions, which causes an undefined register assert in the verifier. $r0 = IMPLICIT_DEF $r1 = IMPLICIT_DEF S2_storerd_io killed $r29, 0, killed %d0 The problem is that the packetizer is not saving the IMPLICIT_DEF instructions, which are needed when checking if it is legal to add the store instruction. The fix is to add the IMPLICIT_DEF instructions to the CurrentPacketMIs structure. Patch by Brendon Cahoon. llvm-svn: 329439	2018-04-06 18:19:22 +00:00
Krzysztof Parzyszek	aca8f32713	[Hexagon] Prevent a stall across zero-latency instructions in a packet Packetizer keeps two zero-latency bound instrctions in the same packet ignoring the stalls on the later instruction. This should not be the case if there is no data dependence. Patch by Sumanth Gundapaneni. llvm-svn: 329437	2018-04-06 18:13:11 +00:00
Krzysztof Parzyszek	269740a88e	[Hexagon] Remove duplicated code, NFC llvm-svn: 329436	2018-04-06 18:10:13 +00:00
Mandeep Singh Grang	e92f0cfe34	[CodeGen] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: bogner, rnk, MatzeB, RKSimon Reviewed By: rnk Subscribers: JDevlieghere, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D45133 llvm-svn: 329435	2018-04-06 18:08:42 +00:00
Krzysztof Parzyszek	ed04f02432	[Hexagon] Handle subregisters when calculating iteration count in HW loops llvm-svn: 329434	2018-04-06 17:51:57 +00:00
Dmitry Preobrazhensky	306b1a0119	[AMDGPU][MC][GFX9] Added instruction s_endpgm_ordered_ps_done See bug 36844: https://bugs.llvm.org/show_bug.cgi?id=36844 Differential Revision: https://reviews.llvm.org/D45313 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 329430	2018-04-06 17:25:00 +00:00
Sanjay Patel	a9ca709011	[InstCombine] limit nsz: -(X - Y) --> Y - X to hasOneUse() As noted in the post-commit discussion for r329350, we shouldn't generally assume that fsub is the same cost as fneg. llvm-svn: 329429	2018-04-06 17:24:08 +00:00
Craig Topper	c50570fb4f	[X686] Add appropriate ReadAfterLd for the register input to memory forms of ADC/SBB. llvm-svn: 329424	2018-04-06 17:12:18 +00:00
Simon Pilgrim	a74f4ae404	Strip trailing whitespace. NFCI. llvm-svn: 329421	2018-04-06 17:01:54 +00:00
Dmitry Preobrazhensky	f20aff565d	[AMDGPU][MC][GFX9] Added instructions saveexec, wrexec and bitreplicate See bug 36840: https://bugs.llvm.org/show_bug.cgi?id=36840 Differential Revision: https://reviews.llvm.org/D45250 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 329419	2018-04-06 16:35:11 +00:00
Craig Topper	b9d298ecf2	[X86] Remove InstRWs for basic arithmetic instructions from Sandy Bridge scheduler model. We can get this right through WriteALU and friends now. llvm-svn: 329417	2018-04-06 16:29:31 +00:00
Craig Topper	f0d042619b	[X86] Attempt to model basic arithmetic instructions in the Haswell/Broadwell/Skylake scheduler models without InstRWs Summary: This patch removes InstRW overrides for basic arithmetic/logic instructions. To do this I've added the store address port to RMW. And used a WriteSequence to make the latency additive. It does not cover ADC/SBB because they have different latency. Apparently we were inconsistent about whether the store has latency or not thus the test changes. I've also left out Sandy Bridge because the load latency there is currently 4 cycles and should be 5. Reviewers: RKSimon, andreadb Reviewed By: andreadb Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45351 llvm-svn: 329416	2018-04-06 16:16:48 +00:00
Craig Topper	f131b60049	[X86] Add an extra store address cycle to WriteRMW in the Sandy Bridge/Broadwell/Haswell/Skylake scheduler model. Even those the address was calculated for the load, its calculated again for the store. llvm-svn: 329415	2018-04-06 16:16:46 +00:00
Craig Topper	22d25a08ae	[X86] Merge itineraries for CLC, CMC, and STC. These are very simple flag setting instructions that appear to only be a single uop. They're unlikely to need this separation. llvm-svn: 329414	2018-04-06 16:16:43 +00:00
Mircea Trofin	aa3fea6cb0	[GlobalOpt] Fix support for casts in ctors. Summary: Fixing an issue where initializations of globals where constructors use casts were silently translated to 0-initialization. Reviewers: davidxl, evgeny777 Reviewed By: evgeny777 Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45198 llvm-svn: 329409	2018-04-06 15:54:47 +00:00
Dmitry Preobrazhensky	59399ae4cc	[AMDGPU][MC][VI][GFX9] Added s_atc_probe* instructions See bug 36839: https://bugs.llvm.org/show_bug.cgi?id=36839 Differential Revision: https://reviews.llvm.org/D45249 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 329408	2018-04-06 15:48:39 +00:00
Pete Couperus	b7b6e1da6c	[ARC] Add <.f> suffix for F32_GEN4_{DOP\|SOP}. Add disassembler support for instructions which writeback STATUS32. https://reviews.llvm.org/D45148 Patch by Yan Luo! (Yan.Luo2@synopsys.com) llvm-svn: 329404	2018-04-06 15:43:11 +00:00
Dmitry Preobrazhensky	4732d876ee	[AMDGPU][MC][GFX9] Added s_dcache_discard* instructions See bug 36838: https://bugs.llvm.org/show_bug.cgi?id=36838 Differential Revision: https://reviews.llvm.org/D45247 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 329397	2018-04-06 15:08:42 +00:00
Chad Rosier	45735b8e40	[LoopUnroll] Make LoopPeeling respect the AllowPeeling preference. The SimpleLoopUnrollPass isn't suppose to perform loop peeling. Differential Revision: https://reviews.llvm.org/D45334 llvm-svn: 329395	2018-04-06 13:57:21 +00:00
Pavel Labath	c9f07b06a1	DWARFVerifier: validate information in name index entries Summary: This patch add checks to verify that the information in the name index entries is consistent with the debug_info section. Specifically, we check that entries point to valid DIEs, and their names, tags, and compile units match the information in the debug_info sections. These checks are only run if the previous checks did not find any errors in the name index headers. Attempting to proceed with the checks anyway would likely produce a lot of spurious errors and the verification code would need to be very careful to avoid crashing. I also add a couple of more checks to the abbreviation-validation code to verify that some attributes are always present (an index without a DW_IDX_die_offset attribute is fairly useless). The entry verification works only on indexes without any type units - I haven't attempted to extend it to type units, as we don't even have a DWARF v5-compatible type unit generator at the moment. Reviewers: JDevlieghere, aprantl, dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45323 llvm-svn: 329392	2018-04-06 13:34:12 +00:00
Simon Pilgrim	09eeb3a8b9	[X86][SandyBridge] Add (V)DPPS memory fold latencies Noticed this during D44654 llvm-svn: 329389	2018-04-06 11:25:21 +00:00
Simon Pilgrim	8a83f16ccd	[X86][SandyBridge] SBWriteResPair +5cy Memory Folds As mentioned on D44647, this patch increases the default memory latency to +5cy , which more closely matches what most custom cases are doing for reg-mem instructions. I've bumped LoadLatency, ReadAfterLd and WriteLoad values to 5cy to be consistent. As Sandy Bridge is currently our default generic model, this affects a lot of scheduling tests... Differential Revision: https://reviews.llvm.org/D44654 llvm-svn: 329388	2018-04-06 11:00:51 +00:00
Hans Wennborg	da8b71f292	Tweak an assert message in the verifier llvm-svn: 329387	2018-04-06 10:20:19 +00:00
Simon Pilgrim	fd1f4fe54e	[X86][SkylakeServer] Merge 2 InstRW entries to the same sched group. NFCI. llvm-svn: 329386	2018-04-06 10:16:36 +00:00
Hans Wennborg	b230c763a4	EntryExitInstrumenter: Handle musttail calls Inserting instrumentation between a musttail call and ret instruction would create invalid IR. Instead, treat musttail calls as function exits. llvm-svn: 329385	2018-04-06 10:14:09 +00:00
Max Kazantsev	832563a782	[NFC] Add missing end of line symbols llvm-svn: 329383	2018-04-06 09:47:06 +00:00
Francis Visoiu Mistrih	537d7eee90	[MIR] Add support for MachineFrameInfo::LocalFrameSize MFI.LocalFrameSize was not serialized. It is usually set from LocalStackSlotAllocation, so if that pass doesn't run it is impossible do deduce it from the stack objects. Until now, this information was lost. llvm-svn: 329382	2018-04-06 08:56:25 +00:00
Pavel Labath	54ca2d688a	[debug_loc] Fix typo in DWARFExpression constructor Summary: The positions of the DwarfVersion and AddressSize arguments were reversed, which caused parsing for dwarf opcodes which contained address-size-dependent operands (such as DW_OP_addr). Amusingly enough, none of the address-size asserts fired, as dwarf version was always 4, which is a valid address size. I ran into this when constructing weird inputs for the DWARF verifier. I I add a test case as hand-written dwarf -- I am not sure how to trigger this differently, as having a DW_OP_addr inside a location list is a fairly non-standard thing to do. Fixing this error exposed a bug in the debug_loc.dwo parser, which was always being constructed with an address size of 0. I fix that as well by following the pattern in the non-dwo parser of picking up the address size from the first compile unit (which is technically not correct, but probably good enough in practice). Reviewers: JDevlieghere, aprantl, dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45324 llvm-svn: 329381	2018-04-06 08:49:57 +00:00
Max Kazantsev	2f2fbebdc8	[NFC] Loosen restriction on preheader to fix buildbot llvm-svn: 329379	2018-04-06 07:23:45 +00:00
Hiroshi Inoue	a2eefb6d9a	[PowerPC] allow D-form VSX load/store when accessing FrameIndex without offset VSX D-form load/store instructions of POWER9 require the offset be a multiple of 16 and a helper`isOffsetMultipleOf` is used to check this. So far, the helper handles FrameIndex + offset case, but not handling FrameIndex without offset case. Due to this, we are missing opportunities to exploit D-form instructions when accessing an object or array allocated on stack. For example, x-form store (stxvx) is used for int a[4] = {0}; instead of d-form store (stxv). For larger arrays, D-form instruction is not used when accessing the first 16-byte. Using D-form instructions reduces register pressure as well as instructions. Differential Revision: https://reviews.llvm.org/D45079 llvm-svn: 329377	2018-04-06 05:41:16 +00:00
Robert Widmann	f108d57f9b	[LLVM-C] Audit Inline Assembly APIs for Consistency Summary: - Add a missing getter for module-level inline assembly - Add a missing append function for module-level inline assembly - Deprecate LLVMSetModuleInlineAsm and replace it with LLVMSetModuleInlineAsm2 which takes an explicit length parameter - Deprecate LLVMConstInlineAsm and replace it with LLVMGetInlineAsm, a function that allows passing a dialect and is not mis-classified as a constant operation Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45346 llvm-svn: 329369	2018-04-06 02:31:29 +00:00
Manoj Gupta	afb355bdc0	Fix lld-x86_64-darwin13 build fails. Use double braces in std::array initialization to keep Darwin builders happy. llvm-svn: 329363	2018-04-05 23:23:29 +00:00
Sanjay Patel	04683de82f	[InstCombine] FP: Z - (X - Y) --> Z + (Y - X) This restores what was lost with rL73243 but without re-introducing the bug that was present in the old code. Note that we already have these transforms if the ops are marked 'fast' (and I assume that's happening somewhere in the code added with rL170471), but we clearly don't need all of 'fast' for these transforms. llvm-svn: 329362	2018-04-05 23:21:15 +00:00
Manoj Gupta	9d68b9eac5	Attempt to fix Mips breakages. Summary: Replace ArrayRefs by actual std::array objects so that there are no dangling references. Reviewers: rsmith, gkistanova Subscribers: sdardis, arichardson, llvm-commits Differential Revision: https://reviews.llvm.org/D45338 llvm-svn: 329359	2018-04-05 22:47:25 +00:00
Craig Topper	fbe3132f67	[X86] Separate CDQ and CDQE in the scheduler model. According to Agner's data, CDQE is closer to CWDE. llvm-svn: 329354	2018-04-05 21:56:19 +00:00
Mandeep Singh Grang	f3555650bd	[IR] Change std::sort to llvm::sort in response to r327219 r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer D44363 for a list of all the required patches. llvm-svn: 329353	2018-04-05 21:52:24 +00:00
Craig Topper	4cc3827791	[X86] Add MOVZPQILo2PQIrr to the Sandy Bridge scheduler model llvm-svn: 329351	2018-04-05 21:40:32 +00:00
Sanjay Patel	03e2526728	[InstCombine] nsz: -(X - Y) --> Y - X This restores part of the fold that was removed with rL73243 (PR4374). llvm-svn: 329350	2018-04-05 21:37:17 +00:00
Craig Topper	3b0b96c591	[X86] Add LEAVE instruction to the scheduler models using the same data as LEAVE64. Make LEAVE/LEAVE64 more correct on Sandy Bridge. This is the 32-bit mode version of LEAVE64. It should be at least somewhat similar to LEAVE64. The Sandy Bridge version was missing a load port use. llvm-svn: 329347	2018-04-05 21:16:26 +00:00
Wolfgang Pieb	3fb9e3f398	[DWARF v5][NFC]: Refactor DebugRnglists to prepare for the support of the DW_AT_ranges attribute in conjunction with .debug_rnglists. Reviewers: JDevlieghere Differential Revision: https://reviews.llvm.org/D45307 llvm-svn: 329345	2018-04-05 21:01:49 +00:00
Konstantin Zhuravlyov	c233ae8004	AMDGPU/Metadata: Always report a fixed number of hidden arguments Currently it is 6. If the "feature" was not used, report dummy hidden argument. Otherwise it does not match the kernarg size reported in the kernel header. Differential Revision: https://reviews.llvm.org/D45129 llvm-svn: 329341	2018-04-05 20:46:04 +00:00
Craig Topper	c6bb36a3d0	[X86] Remove some InstRWs for plain store instructions on Sandy Bridge. We were forcing the latency of these instructions to 5 cycles, but every other scheduler model had them as 1 cycle. I'm sure I didn't get everything, but this gets a big portion. llvm-svn: 329339	2018-04-05 20:04:06 +00:00
Lang Hames	7a598477aa	[RuntimeDyld][PowerPC] Use global entry points for calls between sections. Functions in different objects may use different TOCs, so calls between such functions should use the global entry point of the callee which updates the TOC pointer. This should fix a bug that the Numba developers encountered (see https://github.com/numba/numba/issues/2451). Patch by Olexa Bilaniuk. Thanks Olexa! No RuntimeDyld checker test case yet as I am not familiar enough with how RuntimeDyldELF fixes up call-sites, but I do not want to hold up landing this. I will continue to work on it and see if I can rope some powerpc experts in. llvm-svn: 329335	2018-04-05 19:37:05 +00:00
Mandeep Singh Grang	176c3efa0e	[Bitcode] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: pcc, mehdi_amini, dexonsmith Reviewed By: dexonsmith Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45132 llvm-svn: 329334	2018-04-05 19:27:04 +00:00
Daniel Neilson	367c2aea4e	[InstCombine] Properly change GEP type when reassociating loop invariant GEP chains Summary: This is a fix to PR37005. Essentially, rL328539 ([InstCombine] reassociate loop invariant GEP chains to enable LICM) contains a bug whereby it will convert: %src = getelementptr inbounds i8, i8* %base, <2 x i64> %val %res = getelementptr inbounds i8, <2 x i8> %src, i64 %val2 into: %src = getelementptr inbounds i8, i8 %base, i64 %val2 %res = getelementptr inbounds i8, <2 x i8*> %src, <2 x i64> %val By swapping the index operands if the GEPs are in a loop, and %val is loop variant while %val2 is loop invariant. This fix recreates new GEP instructions if the index operand swap would result in the type of %src changing from vector to scalar, or vice versa. Reviewers: sebpop, spatel Reviewed By: sebpop Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45287 llvm-svn: 329331	2018-04-05 18:51:45 +00:00
Craig Topper	9eec2025c5	[X86] Synchronize the SchedRW on some EVEX instructions with their VEX equivalents. Mostly vector load, store, and move instructions. llvm-svn: 329330	2018-04-05 18:38:45 +00:00
Mandeep Singh Grang	9893fe218c	[ARM] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: t.p.northover, RKSimon, MatzeB, bkramer Reviewed By: bkramer Subscribers: javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D44855 llvm-svn: 329329	2018-04-05 18:31:50 +00:00
Craig Topper	665f74414d	[X86] Disassembler support for having an ADSIZE prefix affect instructions with 0xf2 and 0xf3 prefixes. Needed to support umonitor from D45253. llvm-svn: 329327	2018-04-05 18:20:14 +00:00
Sanjay Patel	deaf4f354e	[InstCombine] use pattern matchers for fsub --> fadd folds This allows folding for vectors with undef elements. llvm-svn: 329316	2018-04-05 17:06:45 +00:00
Sam Clegg	cfd44a2e69	[WebAssembly] Allow for the creation of user-defined custom sections This patch adds a way for users to create their own custom sections to be added to wasm files. At the LLVM IR layer, they are defined through the "wasm.custom_sections" named metadata. The expected use case for this is bindings generators such as wasm-bindgen. Patch by Dan Gohman Differential Revision: https://reviews.llvm.org/D45297 llvm-svn: 329315	2018-04-05 17:01:39 +00:00
Craig Topper	6ecdb03f16	[X86] Use WriteFShuffle256 for VEXTRACTF128 to be consistent with VEXTRACTI128 which uses WriteShuffle256. llvm-svn: 329310	2018-04-05 16:32:48 +00:00

1 2 3 4 5 ...

112375 Commits