llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	88ffb5d4d5	[X86] Mark ISD::FP_TO_UINT v16i8/v16i16 as Promote under AVX512 instead of legal. Fix infinite loop in op legalization when promotion requires 2 steps. Previously we had an isel pattern to add the truncate. Instead use Promote to add the truncate to the DAG before isel. The Promote legalization code had to be updated to prevent an infinite loop if promotion took multiple steps because it wasn't remembering the previously tried value. llvm-svn: 319259	2017-11-28 23:56:02 +00:00
Craig Topper	3f749c2d4b	[X86] Regenerate avx512-schedule test. For some reason some sqrt instructions were missing the scheduling comments. llvm-svn: 319258	2017-11-28 23:55:59 +00:00
Matt Arsenault	607a756651	AMDGPU: Enable IPRA llvm-svn: 319256	2017-11-28 23:40:12 +00:00
Simon Pilgrim	b9aa93cb93	[X86] Tag CLFLUSHOPT with same scheduling behaviour as CLFLUSH llvm-svn: 319253	2017-11-28 23:25:42 +00:00
Daniel Sanders	40c5cbfb08	[globalisel][tablegen] Fix PR35375 by sign-extending the table value to match getConstantVRegVal() Summary: From the bug report: > The problem is that it fails when trying to compare -65536 (or 4294901760) to 0xFFFF,0000. This is because the > constant in the instruction is sign extended to 64 bits (0xFFFF,FFFF,FFFF,0000) and then compared to the non > extended 64 bit version expected by TableGen. > > In contrast, the DAGISelEmitter generates special code for AND immediates (OPC_CheckAndImm), which does not > sign extend. This patch doesn't introduce the special case for AND (and OR) immediates since the majority of it is related to handling known bits that have no effect on the result and GlobalISel doesn't detect known-bits at this time. Instead this patch just ensures that the immediate is extended consistently on both sides of the check. Thanks to Diana Picus for the detailed bug report. Reviewers: rovka Reviewed By: rovka Subscribers: kristof.beyls, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D40532 llvm-svn: 319252	2017-11-28 23:18:54 +00:00
Simon Pilgrim	a675071b7a	[X86] Add CLFLUSHOPT schedule tests llvm-svn: 319250	2017-11-28 23:12:12 +00:00
Simon Pilgrim	f490c6efee	[X86][SSE] Add SSE_SHUFP OpndItins Update multi-classes to take the scheduling OpndItins instead of hard coding it. Will be reused in the AVX512 equivalents. llvm-svn: 319249	2017-11-28 23:09:18 +00:00
Simon Pilgrim	c6c2103e1b	[X86] Test clflushopt intrinsic on 32 and 64-bit targets llvm-svn: 319247	2017-11-28 23:04:42 +00:00
Simon Pilgrim	8f62394751	[X86][SSE] Add SSE_UNPCK/SSE_PUNPCK OpndItins Update multi-classes to take the scheduling OpndItins instead of hard coding it. Will be reused in the AVX512 equivalents. llvm-svn: 319245	2017-11-28 22:55:08 +00:00
Simon Pilgrim	1bc7b0e148	[X86][SSE] Use SSE_PACK OpndItins in PACKSS/PACKUS instruction definitions Update multi-classes to take the scheduling OpndItins instead of hard coding it. SSE_PACK will be reused in the AVX512 equivalents. llvm-svn: 319243	2017-11-28 22:47:45 +00:00
Adam Nemet	80fb55625b	Remove this test After r319235, we no longer generate this remark. llvm-svn: 319242	2017-11-28 22:39:38 +00:00
Simon Pilgrim	14d3fd29f8	Fix VS2017 narrowing conversion warning. NFCI llvm-svn: 319240	2017-11-28 22:32:43 +00:00
Craig Topper	ab9bfc904b	[X86] Remove unused variable. llvm-svn: 319239	2017-11-28 22:28:23 +00:00
Adam Nemet	2e92289014	Demote this opt remark to DEBUG. From a random opt-stat output: Top 10 remarks: tailcallelim/tailcall 53% inline/AlwaysInline 13% gvn/LoadClobbered 13% inline/Inlined 8% inline/TooCostly 2% inline/NoDefinition 2% licm/LoadWithLoopInvariantAddressInvalidated 2% licm/Hoisted 1% asm-printer/InstructionCount 1% prologepilog/StackSize 1% llvm-svn: 319235	2017-11-28 22:11:00 +00:00
Craig Topper	a27f1e675a	[X86] Remove code from combineUIntToFP that tried to favor UINT_TO_FP if legal when zero extending from vXi8/vXi16. The UINT_TO_FP is immediately converted to SINT_TO_FP when the node is re-evaluated because we'll detect that the sign bit is zero. llvm-svn: 319234	2017-11-28 22:08:51 +00:00
Craig Topper	3aaa71f222	[X86] Remove custom lowering for uint_to_fp from vXi8/vXi16. We have a DAG combine that uses a zero extend that should prevent this from ever occurring now. llvm-svn: 319233	2017-11-28 22:08:48 +00:00
Daniel Sanders	766646517f	[globalisel][tablegen] Add support for importing G_ATOMIC_CMPXCHG, G_ATOMICRMW_* rules from SelectionDAG. GIM_CheckNonAtomic has been replaced by GIM_CheckAtomicOrdering to allow it to support a wider range of orderings. This has then been used to import patterns using nodes such as atomic_cmp_swap, atomic_swap, and atomic_load_*. llvm-svn: 319232	2017-11-28 22:07:05 +00:00
Adrian Prantl	77d90b0c39	SROA: Don't create variable fragments that are outside of the variable. An alloca may be larger than a variable that is described to be stored there. Don't create a dbg.value for fragments that are outside of the variable. This fixes PR35447. https://bugs.llvm.org/show_bug.cgi?id=35447 llvm-svn: 319230	2017-11-28 21:30:38 +00:00
Don Hinton	f5aab5454e	[cmake] Pass LLVM_USE_LINKER flag when building host tools, e.g., LLVM_OPTIMIZED_TABLEGEN=ON, and not crosscompiling. Differential Revision: https://reviews.llvm.org/D39734 llvm-svn: 319228	2017-11-28 21:23:30 +00:00
Alexey Bataev	ab5f3f2b33	[SLP] Additional test for PR35354, NFC. llvm-svn: 319224	2017-11-28 20:48:24 +00:00
Mandeep Singh Grang	e0173664e9	[Hexagon] Use stable sort for HexagonShuffler to remove non-deterministic ordering Summary: This fixes failures in the following tests uncovered by D39245: LLVM :: CodeGen/Hexagon/args.ll LLVM :: CodeGen/Hexagon/constp-extract.ll LLVM :: CodeGen/Hexagon/expand-condsets-basic.ll LLVM :: CodeGen/Hexagon/gp-rel.ll LLVM :: CodeGen/Hexagon/packetize_cond_inst.ll LLVM :: CodeGen/Hexagon/simple_addend.ll LLVM :: CodeGen/Hexagon/swp-stages4.ll LLVM :: CodeGen/Hexagon/swp-vmult.ll LLVM :: CodeGen/Hexagon/swp-vsum.ll LLVM :: MC/Hexagon/align.s LLVM :: MC/Hexagon/asmMap.s LLVM :: MC/Hexagon/dis-duplex-p0.s LLVM :: MC/Hexagon/double-vector-producer.s LLVM :: MC/Hexagon/inst_select.ll LLVM :: MC/Hexagon/instructions/j.s Reviewers: colinl, kparzysz, adasgupt, slarin Reviewed By: kparzysz Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40227 llvm-svn: 319223	2017-11-28 20:48:10 +00:00
Daniel Sanders	7b361b50d8	[aarch64][globalisel] Add missing tests from r319216 llvm-svn: 319220	2017-11-28 20:27:59 +00:00
Sean Fertile	e200016ea9	[PowerPC] Allow tail calls of fastcc functions from C CallingConv functions. Allow fastcc callees to be tail-called from ccc callers. Differential Revision: https://reviews.llvm.org/D40355 llvm-svn: 319218	2017-11-28 20:25:58 +00:00
Daniel Sanders	7fe7acc6b1	[aarch64][globalisel] Define G_ATOMIC_CMPXCHG and G_ATOMICRMW_* and make them legal The IRTranslator cannot generate these instructions at the moment so there's no issue with not having implemented ISel for them yet. D40092 will add G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMICRMW_* to the IRTranslator and a further patch will add support for lowering G_ATOMIC_CMPXCHG_WITH_SUCCESS into G_ATOMIC_CMPXCHG with an external success check via the `Lower` action. The separation of G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMIC_CMPXCHG is to import SelectionDAG rules while still supporting targets that prefer to custom lower the original LLVM-IR-like operation. llvm-svn: 319216	2017-11-28 20:21:15 +00:00
Mandeep Singh Grang	230b0a1477	[SelectionDAG] Make sorting predicate stronger to remove non-deterministic ordering Summary: Recommitting this with the correct sorting predicate. The Low field of Clusters is a ConstantInt and cannot be directly compared. So we needed to invoke slt (signed less than) to compare correctly. This fixes failures in the following tests uncovered by D39245: LLVM :: CodeGen/ARM/ifcvt3.ll LLVM :: CodeGen/ARM/switch-minsize.ll LLVM :: CodeGen/X86/switch.ll LLVM :: CodeGen/X86/switch-bt.ll LLVM :: CodeGen/X86/switch-density.ll Reviewers: hans, fhahn Reviewed By: hans Subscribers: aemerson, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D40541 llvm-svn: 319210	2017-11-28 19:55:54 +00:00
Simon Pilgrim	d49bd0cd87	[X86][SSE] Add SSE_HADDSUB/SSE_PABS/SSE_PALIGN OpndItins Update multi-classes to take the scheduling OpndItins instead of hard coding it. Will be reused in the AVX512 equivalents. llvm-svn: 319209	2017-11-28 19:39:47 +00:00
Craig Topper	dd4295626b	[X86] In lowerVectorShuffleAsElementInsertion, if we're able to find a scalar i8 or i16 and need to zero extend it, make sure we use a vXi32 type of the full vector width. Previously, this was hardcoded to v4i32, but if the input type is 256 bits we need to use v8i32. Fixes PR35443 llvm-svn: 319208	2017-11-28 19:25:45 +00:00
Francis Visoiu Mistrih	3aa8eaa951	[CodeGen] Fix doxygen \file comment style llvm-svn: 319207	2017-11-28 19:23:39 +00:00
Francis Visoiu Mistrih	d4b340b460	[CodeGen] Fix doxygen llvm-svn: 319206	2017-11-28 19:15:46 +00:00
Sanjay Patel	1a72f67006	[InstCombine] auto-generate complete test checks; NFC llvm-svn: 319205	2017-11-28 19:13:23 +00:00
Krzysztof Parzyszek	081e458e90	[Hexagon] Make sure to zero-extend bytes before building a vector llvm-svn: 319204	2017-11-28 19:13:17 +00:00
Sanjay Patel	b1a97d3774	[InstCombine] auto-generate complete test checks; NFC llvm-svn: 319203	2017-11-28 19:07:28 +00:00
Daniel Sanders	17d277b734	[mir] Print/Parse both MOLoad and MOStore when they occur together. Summary: They're not always mutually exclusive. read-modify-write atomics are both at the same time. One example of this is the SWP instructions on AArch64. Another example is GlobalISel's G_ATOMICRMW_* generic instructions which will be added in a later patch. Reviewers: arphaman, aemerson Reviewed By: aemerson Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D40157 llvm-svn: 319202	2017-11-28 18:57:02 +00:00
Rafael Espindola	bba7f862d8	Fix non assert build warnings. llvm-svn: 319200	2017-11-28 18:50:08 +00:00
Hans Wennborg	ca46db957d	EntryExitInstrumenter: set DebugLocs on the inserted call instructions (PR35412) Apparently the verifier requires that inlineable calls in a function with debug info have debug locations. llvm-svn: 319199	2017-11-28 18:44:26 +00:00
Zachary Turner	6900de1dfb	[CodeView] Refactor / Rewrite TypeSerializer and TypeTableBuilder. The motivation behind this patch is that future directions require us to be able to compute the hash value of records independently of actually using them for de-duplication. The current structure of TypeSerializer / TypeTableBuilder being a single entry point that takes an unserialized type record, and then hashes and de-duplicates it is not flexible enough to allow this. At the same time, the existing TypeSerializer is already extremely complex for this very reason -- it tries to be too many things. In addition to serializing, hashing, and de-duplicating, it also supports splitting up field list records and adding continuations. All of this functionality crammed into this one class makes it very complicated to work with and hard to maintain. To solve all of these problems, I've re-written everything from scratch and split the functionality into separate pieces that can easily be reused. The end result is that one class TypeSerializer is turned into 3 new classes SimpleTypeSerializer, ContinuationRecordBuilder, and TypeTableBuilder, each of which in isolation is simple and straightforward. A quick summary of these new classes and their responsibilities is: - SimpleTypeSerializer : Turns a non-FieldList leaf type into a series of bytes. Does not do any hashing. Every time you call it, it will re-serialize and return bytes again. The same instance can be re-used over and over to avoid re-allocations, and in exchange for this optimization the bytes returned by the serializer only live until the caller attempts to serialize a new record. - ContinuationRecordBuilder : Turns a FieldList-like record into a series of fragments. Does not do any hashing. Like SimpleTypeSerializer, returns references to privately owned bytes, so the storage is invalidated as soon as the caller tries to re-use the instance. Works equally well for LF_FIELDLIST as it does for LF_METHODLIST, solving a long-standing theoretical limitation of the previous implementation. - TypeTableBuilder : Accepts sequences of bytes that the user has already serialized, and inserts them by de-duplicating with a hash table. For the sake of convenience and efficiency, this class internally stores a SimpleTypeSerializer so that it can accept unserialized records. The same is not true of ContinuationRecordBuilder. The user is required to create their own instance of ContinuationRecordBuilder. Differential Revision: https://reviews.llvm.org/D40518 llvm-svn: 319198	2017-11-28 18:33:17 +00:00
Simon Pilgrim	4fecbd8871	[X86][X87] Tag FP_TO_INT_IN_MEM pseudos with hasNoSchedulingInfo We don't need scheduling info for pseudos llvm-svn: 319197	2017-11-28 18:10:29 +00:00
Francis Visoiu Mistrih	aa739695a4	[CodeGen] Separate MachineOperand implementation from MachineInstr Move the implementation to its own file. Differential Revision: https://reviews.llvm.org/D40419 llvm-svn: 319194	2017-11-28 17:58:43 +00:00
Francis Visoiu Mistrih	946e394e33	[CodeGen] Cleanup MachineOperand * clang-format * move doxygen from the implementation to headers * remove duplicate doxygen llvm-svn: 319193	2017-11-28 17:58:38 +00:00
Konstantin Zhuravlyov	06ae4ec78e	AMDGPU: Add num spilled s/vgprs to metadata This was requested by tools. Differential Revision: https://reviews.llvm.org/D40321 llvm-svn: 319192	2017-11-28 17:51:08 +00:00
Adam Nemet	353f7cbc21	Add opt-viewer testing Detects whether we have the Python modules (pygments, yaml) required by opt-viewer and hooks this up to REQUIRES. This fixes https://bugs.llvm.org/show_bug.cgi?id=34129 (the lack of opt-viewer testing). It's also related to https://github.com/apple/swift/pull/12938 and the idea is to expose LLVM_HAVE_OPT_VIEWER_MODULES to the Swift cmake. Differential Revision: https://reviews.llvm.org/D40202 llvm-svn: 319188	2017-11-28 17:26:28 +00:00
Francis Visoiu Mistrih	9d7bb0cb40	[CodeGen] Print register names in lowercase in both MIR and debug output As part of the unification of the debug format and the MIR format, always print registers as lowercase. * Only debug printing is affected. It now follows MIR. Differential Revision: https://reviews.llvm.org/D40417 llvm-svn: 319187	2017-11-28 17:15:09 +00:00
Dan Gohman	2803bfaf00	[WebAssembly] Support bitcasted function addresses with varargs. Generalize FixFunctionBitcasts to handle varargs functions. This in particular fixes the case where clang bitcasts away a varargs when calling a K&R-style function. This avoids interacting with tricky ABI details because it operates at the LLVM IR level before varargs ABI details are exposed. This fixes PR35385. llvm-svn: 319186	2017-11-28 17:15:03 +00:00
Matt Arsenault	e123aba94e	DAG: Legalize truncstores to illegal int types Truncate to a legal int type, and produce a new truncstore from a narrower type. llvm-svn: 319185	2017-11-28 17:11:30 +00:00
Simon Pilgrim	ece5bc358a	[X86][X87] Tag FTST x87 instruction scheduler class Looking through Agner, FTST is very similar to generic float compare behaviour, so I've added them to the existing IIC_FCOMI (WriteFAdd) tags. llvm-svn: 319184	2017-11-28 16:57:20 +00:00
Sanjay Patel	14230e02ff	[InstCombine] add tests from D39421 to show current transforms; NFC llvm-svn: 319182	2017-11-28 16:40:30 +00:00
Francis Visoiu Mistrih	14bd3b9f21	[Support] Add unit test for printLowerCase Add test case for the function added in r319171. llvm-svn: 319177	2017-11-28 16:11:56 +00:00
Don Hinton	17fdf32cc1	[cmake] Remove redundant call to cmake when building host tools. Summary: Remove the redundant, config-time call to cmake when building host tools for cross compiles or optimized tablegen. The config-time call to cmake is redundant because it will always get called again when the CONFIGURE_LLVM_${target_name} target fires at build-time. This speeds up initial configuration, but has no effect on build behavior. Reviewers: beanz Reviewed By: beanz Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D40229 llvm-svn: 319176	2017-11-28 16:08:57 +00:00
Simon Pilgrim	0747a7e8c3	[X86][X87] Tag FABS/FCHS/FSQRT/FSIN/FCOS x87 instruction scheduler classes Atom's FABS/FCHS/FSQRT latencies taken from Agner. Note: I just added FSIN and FCOS to the existing IIC_FSINCOS itinerary, which is actually a more costly instruction. llvm-svn: 319175	2017-11-28 15:03:42 +00:00
Jonas Paulsson	f0ff20f1f0	Use getStoreSize() in various places instead of 'BitSize >> 3'. This is needed for cases when the memory access is not as big as the width of the data type. For instance, storing i1 (1 bit) would be done in a byte (8 bits). Using 'BitSize >> 3' (or '/ 8') would e.g. give the memory access of an i1 a size of 0, which for instance makes alias analysis return NoAlias even when it shouldn't. There are no tests as this was done as a follow-up to the bugfix for the case where this was discovered (r318824). This handles more similar cases. Review: Björn Petterson https://reviews.llvm.org/D40339 llvm-svn: 319173	2017-11-28 14:44:32 +00:00
Simon Pilgrim	b843dc26e4	[X86][X87] Add some x87 schedule tests Still missing some instructions: mainly loads/stores/system ops, all flagged as TODO. llvm-svn: 319172	2017-11-28 14:35:52 +00:00
Francis Visoiu Mistrih	26d6fc1f0e	[Support] Merge toLower / toUpper implementations Merge the ones from StringRef and StringExtras. llvm-svn: 319171	2017-11-28 14:22:27 +00:00
Francis Visoiu Mistrih	9d419d3b0c	[CodeGen] Rename functions PrintReg* to printReg* LLVM Coding Standards: Function names should be verb phrases (as they represent actions), and command-like function should be imperative. The name should be camel case, and start with a lower case letter (e.g. openFile() or isFoo()). Differential Revision: https://reviews.llvm.org/D40416 llvm-svn: 319168	2017-11-28 12:42:37 +00:00
Simon Pilgrim	8dc603b031	[X86][3DNow] Add instruction itinerary and scheduling classes for femms/prefetch/prefetchw llvm-svn: 319167	2017-11-28 12:37:35 +00:00
Peter Smith	a939257a42	[ARM][AArch64] Workaround ARM/AArch64 peculiarity in clearing icache. Certain ARM implementations treat the icache clear instruction as a memory read, and the CPU segfaults when trying to clear the cache on a !PROT_READ page. We work around this in Memory::protectMappedMemory by adding PROT_READ to affected pages, clearing the cache, and then setting the desired protection. This fixes "AllocationTests/MappedMemoryTest.***/3" unit-tests on affected hardware. Reviewers: psmith, zatrazz, kristof.beyls, lhames Reviewed By: lhames Subscribers: llvm-commits, krytarowski, peter.smith, jgreenhalgh, aemerson, rengolin Patch by maxim-kuvrykov! Differential Revision: https://reviews.llvm.org/D40423 llvm-svn: 319166	2017-11-28 12:34:05 +00:00
Chandler Carruth	c34f789e38	Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable. The core idea is to (re-)introduce some redundancies where their cost is hidden by the cost of materializing immediates for constant operands of PHI nodes. When the cost of the redundancies is covered by this, avoiding materializing the immediate has numerous benefits: 1) Less register pressure 2) Potential for further folding / combining 3) Potential for more efficient instructions due to immediate operand As a motivating example, consider the remarkably different cost on x86 of a SHL instruction with an immediate operand versus a register operand. This pattern turns up surprisingly frequently, but is somewhat rarely obvious as a significant performance problem. The pass is entirely target independent, but it does rely on the target cost model in TTI to decide when to speculate things around the PHI node. I've included x86-focused tests, but any target that sets up its immediate cost model should benefit from this pass. There is probably more that can be done in this space, but the pass as-is is enough to get some important performance on our internal benchmarks, and should be generally performance neutral, but help with more extensive benchmarking is always welcome. One awkward part is that this pass has to be scheduled after everything that can eliminate these kinds of redundancies. This includes SimplifyCFG, GVN, etc. I'm open to suggestions about better places to put this. We could in theory make it part of the codegen pass pipeline, but there doesn't really seem to be a good reason for that -- it isn't "lowering" in any sense and only relies on pretty standard cost model based TTI queries, so it seems to fit well with the "optimization" pipeline model. Still, further thoughts on the pipeline position are welcome. I've also only implemented this in the new pass manager. If folks are very interested, I can try to add it to the old PM as well, but I didn't really see much point (my use case is already switched over to the new PM). I've tested this pretty heavily without issue. A wide range of benchmarks internally show no change outside the noise, and I don't see any significant changes in SPEC either. However, the size class computation in tcmalloc is substantially improved by this, which turns into a 2% to 4% win on the hottest path through tcmalloc for us, so there are definitely important cases where this is going to make a substantial difference. Differential revision: https://reviews.llvm.org/D37467 llvm-svn: 319164	2017-11-28 11:32:31 +00:00
Florian Hahn	25ea91a838	[TailRecursionElimination] Skip debug intrinsics. Summary: I think we do not need to analyze debug intrinsics here, as they should not impact codegen. This has 2 benefits: 1) slightly less work to do and 2) avoiding generating optimization remarks for converting calls to debug intrinsics to tail calls, which are not really helpful for users. Based on work by Sander de Smalen. Reviewers: davide, trentxintong, aprantl Reviewed By: aprantl Subscribers: llvm-commits, JDevlieghere Tags: #debug-info Differential Revision: https://reviews.llvm.org/D40440 llvm-svn: 319158	2017-11-28 09:32:25 +00:00
Nicolai Haehnle	b4f28deda0	AMDGPU: Re-organize the outer loop of SILoadStoreOptimizer Summary: The entire algorithm operates per basic-block, so for cache locality it should be better to re-optimize a basic-block immediately rather than in a separate loop. I don't have performance measurements. Change-Id: I85106570bd623c4ff277faaa50ee43258e1ddcc5 Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D40344 llvm-svn: 319156	2017-11-28 08:42:46 +00:00
Nicolai Haehnle	39980dac0b	AMDGPU: Consistently check for immediates in SIInstrInfo::FoldImmediate Summary: The PeepholeOptimizer pass calls this function solely based on checking DefMI->isMoveImmediate(), which only checks the MoveImm bit of the instruction description. So it's up to FoldImmediate itself to properly check that DefMI actually moves from an immediate. I don't have a separate test case for this, but the next patch introduces a test case which happens to crash without this change. This error is caught by the assertion in MachineOperand::getImm(). Change-Id: I88e7cdbcf54d75e1a296822e6fe5f9a5f095bbf8 Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D40342 llvm-svn: 319155	2017-11-28 08:41:50 +00:00
Max Kazantsev	6e78ad35cc	[SCEV][NFC] More efficient caching in CompareValueComplexity Currently, we use a set of pairs to cache responses like `CompareValueComplexity(X, Y) == 0`. If we had proved that `CompareValueComplexity(S1, S2) == 0` and `CompareValueComplexity(S2, S3) == 0`, this cache does not allow us to prove that `CompareValueComplexity(S1, S3)` is also `0`. This patch replaces this set with `EquivalenceClasses` that merges Values into equivalence sets so that any two values from the same set are equal from the point of view of `CompareValueComplexity`. This, in particular, allows us to prove the fact from the example above. Differential Revision: https://reviews.llvm.org/D40429 llvm-svn: 319153	2017-11-28 08:26:43 +00:00
Martin Storsjo	04b68446eb	[COFF] Implement constructor priorities The priorities in the section name suffixes are zero padded, allowing the linker to just do a lexical sort. Add zero padding for .ctors sections in ELF as well. Differential Revision: https://reviews.llvm.org/D40407 llvm-svn: 319150	2017-11-28 08:07:18 +00:00
Max Kazantsev	cf9b1b24ce	[SCEV][NFC] More efficient caching in CompareSCEVComplexity Currently, we use a set of pairs to cache responses like `CompareSCEVComplexity(X, Y) == 0`. If we had proved that `CompareSCEVComplexity(S1, S2) == 0` and `CompareSCEVComplexity(S2, S3) == 0`, this cache does not allow us to prove that `CompareSCEVComplexity(S1, S3)` is also `0`. This patch replaces this set with `EquivalenceClasses`, in which any two values from the same set are equal from the point of view of `CompareSCEVComplexity`. This, in particular, allows us to prove the fact from the example above. Differential Revision: https://reviews.llvm.org/D40428 llvm-svn: 319149	2017-11-28 07:48:12 +00:00
Max Kazantsev	115607226a	[GVN] Prevent ScalarPRE from hoisting across instructions that don't pass control flow to successors This is to address a problem similar to those in D37460 for Scalar PRE. We should not PRE across an instruction that may not pass execution to its successor unless it is safe to speculatively execute it. Differential Revision: https://reviews.llvm.org/D38619 llvm-svn: 319147	2017-11-28 07:07:55 +00:00
Adam Nemet	bf74f64e67	Revert "Add opt-viewer testing" This reverts commit r319073. Bot fails with a mismatch that looks like pygments-generated HTML. llvm-svn: 319146	2017-11-28 06:22:29 +00:00
Dan Gohman	3ff73cfbcd	[WebAssembly] Handle errors better in fast-isel. Fast-isel routines need to bail out in the case that fast-isel fails on the operands. This fixes https://bugs.llvm.org/show_bug.cgi?id=35064 llvm-svn: 319144	2017-11-28 05:36:42 +00:00
Craig Topper	640a3c1e2a	[X86] Remove some unused pattern fragments from td file. NFC llvm-svn: 319143	2017-11-28 05:23:57 +00:00
Simon Dardis	3aeb1a5404	[DAGCombine] Disable finding better chains for stores at O0 Unoptimized IR can have linear sequences of stores to an array, where the initial GEP for the first store is formed from the pointer to the array, and the GEP for each store after the first is formed from the previous GEP with some offset in an inductive fashion. The (large) resulting DAG when analyzed by DAGCombine undergoes an excessive number of combines as each store node is examined every time its offset node is combined with any child of the offset. One of the transformations is findBetterNeighborChains which assists MergeConsecutiveStores. The former relies on repeated chain walking to do its work; however, MergeConsecutiveStores is disabled at O0, which makes the transformation redundant. Any optimization level other than O0 would invoke InstCombine which would resolve the chain of GEPs into flat base + offset GEP for each store which does not exhibit the repeated examination of each store to the array. Disabling this optimization fixes an excessive compile time issue (~30 minutes for the test case provided) at O0. Reviewers: niravd, craig.topper, t.p.northover Differential Revision: https://reviews.llvm.org/D40193 llvm-svn: 319142	2017-11-28 04:07:59 +00:00
Matthias Braun	eca985847c	MachineVerifier: Improve register operand checks This fixes cases where we wouldn't perform various register operand checks just because we didn't happen to have a definition in the MCInstrDesc. This changes the code to only skip the tests that actually depend on the MCInstrDesc definition. This makes the machine verifier spot the problem from https://llvm.org/PR33071 after the pass that actually caused it. llvm-svn: 319141	2017-11-28 03:54:20 +00:00
Matthias Braun	a6d5374ee6	MachineVerifier: Improve PHI operand checking Additional checks for phi operands: - first operand should be a virtual register def. It should not be tied, implicit, internalread, earlyclobber or a read. - The other operands should be register/mbb operands next to each other - The register operands should not be implicit, internalread, earlyclobber, debug or tied. - We can perform most of the PHI checks even for unreachable blocks. llvm-svn: 319140	2017-11-28 03:54:19 +00:00
Matthias Braun	adf7582d14	lit: Bring back -Dtool=xxx feature lost in r313928 llvm-svn: 319139	2017-11-28 03:23:07 +00:00
Rafael Espindola	3ecd20430c	Use FILE_FLAG_DELETE_ON_CLOSE for TempFile on windows. We won't see the temp file anymore. llvm-svn: 319137	2017-11-28 01:41:22 +00:00
Craig Topper	ddbc340c20	[X86] Make zero extend from v16i1/v8i1 to v16i8/v8i16/v16i16 not scalarize under AVX512. llvm-svn: 319136	2017-11-28 01:36:33 +00:00
Craig Topper	5befc5bfce	[X86] Add command line without AVX512BW/AVX512VL to bitcast-int-to-vector-bool-zext.ll. llvm-svn: 319135	2017-11-28 01:36:31 +00:00
Rafael Espindola	2c4e920f0c	Move code. NFC. This moves the TempFile implementation so that it can use system specific code. llvm-svn: 319134	2017-11-28 01:34:20 +00:00
Peter Collingbourne	1621c20ffc	Reland r319090, "COFF: Do not create SectionChunks for discarded comdat sections." with a fix for debug sections. If /debug was not specified, readSection will return a null pointer for debug sections. If the debug section is associative with another section, we need to make sure that the section returned from readSection is not a null pointer before adding it as an associative section. Differential Revision: https://reviews.llvm.org/D40533 llvm-svn: 319133	2017-11-28 01:30:07 +00:00
Rafael Espindola	c06f55e1e8	This reverts commit r319096 and r319097. Revert "[SROA] Propagate !range metadata when moving loads." Revert "[Mem2Reg] Clang-format unformatted parts of this file. NFCI." Davide says they broke a bot. llvm-svn: 319131	2017-11-28 01:25:38 +00:00
Matthias Braun	5d01e708e1	ARM: Fix PR32578 https://llvm.org/PR32578 I simplified and converted the reproducer into a lit test. Patch by Vedant Kumar! llvm-svn: 319130	2017-11-28 01:17:52 +00:00
Dan Gohman	cdd48b8a6b	[WebAssembly] Fix trapping behavior in fptosi/fptoui. This adds code to protect WebAssembly's `trunc_s` family of opcodes from values outside their domain. Even though such conversions have full undefined behavior in C/C++, LLVM IR's `fptosi` and `fptoui` do not, and only return undef. This also implements the proposed non-trapping float-to-int conversion feature and uses that instead when available. llvm-svn: 319128	2017-11-28 01:13:40 +00:00
Adrian Prantl	d7f6f1636d	SROA: Avoid creating a fragment expression that covers the entire variable. Fixes PR35416. https://bugs.llvm.org/show_bug.cgi?id=35416 llvm-svn: 319126	2017-11-28 00:57:53 +00:00
Adrian Prantl	3e0e1d0934	Move getVariableSize from Verifier.cpp into DIVariable::getSize() (NFC) llvm-svn: 319125	2017-11-28 00:57:51 +00:00
Craig Topper	8b9cd03824	[X86] Remove unnecessary fp<->int setOperationAction lines from a hasVLX block. NFCI These lines all exist identically either under SSE2, AVX2 or AVX512. Given that VLX implies all of those, these aren't providing anything new. llvm-svn: 319124	2017-11-28 00:41:12 +00:00
Craig Topper	ce732e7c30	[X86] Remove duplicate calls to setOperationAction. NFCI These same calls exist a few lines down. llvm-svn: 319122	2017-11-28 00:16:42 +00:00
Rafael Espindola	bce112c9e9	Add an F_Delete flag. For now this only changes the handle Access. llvm-svn: 319121	2017-11-28 00:12:44 +00:00
Craig Topper	dbd4a7fecc	[DAGCombiner] Don't combine aext(setcc) if the setcc is already using the target's preferred result type. With AVX512 vXi1 types are legal so we shouldn't be extending them. This change is similar to existing code in the zext(setcc) combine. llvm-svn: 319120	2017-11-27 23:51:40 +00:00
Craig Topper	57c02d18b9	[DAGCombiner] Use EVT::changeVectorElementTypeToInteger() instead of implementing manually. llvm-svn: 319119	2017-11-27 23:51:31 +00:00
Rafael Espindola	d19c2e8126	Add OpenFlags to the create(Unique|Temporary)File interfaces. This will allow a future F_Delete flag to be specified when we want the file to be automatically deleted on close. llvm-svn: 319117	2017-11-27 23:44:11 +00:00
Craig Topper	256cc48df6	[X86] Teach getSetCCResultType to handle more than just SimpleVTs when looking at larger than 512-bit vectors. Which VTs are considered simple is determined by the superset of the legal types of all targets in LLVM. If we're looking at VTs that are going to be split down to 512-bits we should allow any VT not just simple ones since the simple list changes over time as new targets are added. llvm-svn: 319110	2017-11-27 22:56:10 +00:00
Petr Hosek	6163329caa	[CMake] Pass LLVM_HOST_TRIPLE to external projects LLVM runtimes rely on LLVM_HOST_TRIPLE being set in their builds and tests so make sure it's being passed down. Differential Revision: https://reviews.llvm.org/D40515 llvm-svn: 319109	2017-11-27 22:50:48 +00:00
Petr Hosek	a08d65ded2	[CMake][runtimes] Support monorepo layout with runtimes build We introduce a new variable LLVM_ENABLE_RUNTIMES which works similarly to LLVM_ENABLE_PROJECTS and allows specifying runtimes that will be enabled in the runtimes build. Differential Revision: https://reviews.llvm.org/D40233 llvm-svn: 319107	2017-11-27 22:31:11 +00:00
Michal Gorny	8eaa8ec8fc	[cmake] Pass -Wl,-z,nodelete on Linux to prevent unloading Prevent unloading shared libraries on Linux when dlclose() is called. This is necessary since command-line option parsing API relies on registering the global option instances in the option parser instance which can be loaded in a different shared library. Given that we can't reliably remove those options when a library is unloaded, the parser ends up containing dangling references. Since glibc has relatively complex library unloading rules, some of the LLVM libraries can be unloaded while others (including the Support library) stay loaded causing quite a mayhem. To reliably prevent that, just forbid unloading all libraries -- it's a very bad idea anyway. While the issue arguably happens only with BUILD_SHARED_LIBS, it may affect any library reusing llvm::cl interface. Based on patch provided by Ross Hayward on https://bugs.gentoo.org/617154. Previously hit by Fedora back in Feb 2016: https://lists.freedesktop.org/archives/mesa-dev/2016-February/107242.html Differential Revision: https://reviews.llvm.org/D40459 llvm-svn: 319105	2017-11-27 22:23:09 +00:00
Greg Clayton	d6b67eb15c	Fixed the ability to recursively get an attribute value from a DWARFDie. The previous implementation would only look 1 DW_AT_specification or DW_AT_abstract_origin deep. This means DWARFDie::getName() would fail in certain cases. I ran into such a case while creating a tool that used the LLVM DWARF parser to generate a symbolication format so I have seen this in the wild. Differential Revision: https://reviews.llvm.org/D40156 llvm-svn: 319104	2017-11-27 22:12:44 +00:00
Craig Topper	4aa519507d	[X86] Remove lines that set v8f32 FP_ROUND/FP_EXTEND to Legal under AVX512. NFCI We don't do this for narrow vectors under AVX or SSE features. We also don't set them to Expand like we do for many vector ops. Nor does TargetLoweringBase.cpp. This leads me to believe these default to Legal. llvm-svn: 319103	2017-11-27 22:01:17 +00:00
Peter Collingbourne	c8477b8234	Revert r319090, "COFF: Do not create SectionChunks for discarded comdat sections." Caused test failures in check-cfi on Windows. http://lab.llvm.org:8011/builders/sanitizer-windows/builds/20284 llvm-svn: 319100	2017-11-27 21:37:51 +00:00
Davide Italiano	824d71a9c5	[Mem2Reg] Clang-format unformatted parts of this file. NFCI. llvm-svn: 319097	2017-11-27 21:25:52 +00:00
Davide Italiano	b5d59e73ee	[SROA] Propagate !range metadata when moving loads. This tries to propagate !range metadata to a pre-existing load when a load is optimized out. This is done instead of adding an assume because converting loads to and from assumes creates a lot of IR. Patch by Ariel Ben-Yehuda. Differential Revision: https://reviews.llvm.org/D37216 llvm-svn: 319096	2017-11-27 21:25:13 +00:00
Sanjay Patel	0de1a4bc2d	[PartiallyInlineLibCalls][x86] add TTI hook to allow sqrt inlining to depend on arg rather than result This should fix PR31455: https://bugs.llvm.org/show_bug.cgi?id=31455 Differential Revision: https://reviews.llvm.org/D28314 llvm-svn: 319094	2017-11-27 21:15:43 +00:00
Daniel Sanders	7c3a89231c	Add release note about TargetRegistry change from r318352 llvm-svn: 319093	2017-11-27 21:12:55 +00:00
Yaxun Liu	dcb6067d9f	[AMDGPU] Update test nullptr.ll to use amdgiz environment This test needs to be manually updated since it is difficult to do it with script. Addr space 6 to 23 are only used by r600, therefore only check them for r600. Differential Revision: https://reviews.llvm.org/D40117 llvm-svn: 319092	2017-11-27 20:48:21 +00:00
Peter Collingbourne	3f2921f5ec	COFF: Do not create SectionChunks for discarded comdat sections. With this change, instead of creating a SectionChunk for each section in the object file, we only create them when we encounter a prevailing comdat section. Also change how symbol resolution occurs between comdat symbols. Now only the comdat leader participates in comdat resolution, and not any other external associated symbols. This is more in line with how COFF semantics are defined, and should allow for a more straightforward implementation of non-ANY comdat types. On my machine, this change reduces our runtime linking a release build of chrome_child.dll with /nopdb from 5.65s to 4.54s (median of 50 runs). Differential Revision: https://reviews.llvm.org/D40238 llvm-svn: 319090	2017-11-27 20:42:34 +00:00
Petr Hosek	1f34379965	Use LIST_SEPARATOR rather than escaping in ExternalProject_Add Escaping ; in list arguments passed to ExternalProject_Add doesn't seem to be working in newer versions of CMake (see https://public.kitware.com/Bug/view.php?id=16137 for more details). Use a custom LIST_SEPARATOR instead which is the officially supported way. Differential Revision: https://reviews.llvm.org/D40232 llvm-svn: 319089	2017-11-27 20:41:53 +00:00

157208 Commits