llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	5d67d81f48	[InstCombine] prevent crashing/assert on shift constant expression (PR44028) The binary operator cast implies an instruction, but the matcher for shift does not: https://bugs.llvm.org/show_bug.cgi?id=44028	2019-11-17 17:31:09 -05:00
Craig Topper	1b0efe2b17	[LegalizeTypes] When expanding the integer result of LLROUND/LLRINT, also call GetSoftenedFloat if the floating point input needs to be softened. Before this we were emitting a bitcast to integer from the lowering code that itself will need to be legalized. By calling GetSoftenedFloat we get the integer conversion in one step without needing to relegalize a bitcast.	2019-11-17 13:31:30 -08:00
Craig Topper	9b515b6dd9	[LegalizeTypes] Remove PromoteFloat support from ExpandIntRes_LLROUND_LLRINT. This code isn't exercised, and was in the wrong place. If we need this, we would need to promote the type before figuring out which libcall to use. I'm choosing to remove it rather than fixing since we don't support PromoteFloat for LRINT/LROUND/LLRINT/LLROUND when the result type is legal so I don't see much reason to support it for the case where the result type isn't legal.	2019-11-17 13:31:30 -08:00
Craig Topper	d4ba11ae32	[LegalizeTypes] Merge ExpandIntRes_LLROUND and ExpandIntRes_LLRINT into one function that handles both. NFC These two functions were the same except for which libcall gets emitted. Just merge them into one. This is prep work for some other work including strict fp support.	2019-11-17 13:31:30 -08:00
Florian Hahn	8eeabbaf5d	[ConstantFold] Handle identity folds at top of ConstantFoldBinaryInst Currently we miss folds with undef and identity values for binary ops that do not fold to undef in general. We can generalize the identity simplifications and do them before checking for undef in particular. Alive checks: * OR - https://rise4fun.com/Alive/8OsK * AND - https://rise4fun.com/Alive/e3tE This will also allow us to remove some now redundant cases throughout the function, but I would like to do this as follow-up. That should make tracking down potential issues easier. Reviewers: spatel, RKSimon, lebedev.ri Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D70169	2019-11-17 21:30:14 +00:00
Stefan Stipanovic	a516fbac52	[Attributor] Use nofree argument attribute for heap-to-stack conversion Reviewers: jdoerfert, uenoku Subscribers: Differential Revision: https://reviews.llvm.org/D70140	2019-11-17 21:35:04 +01:00
Sanjay Patel	ebf9bf2cbc	[SimplifyCFG] propagate fast-math-flags (FMF) from phi to select Similar to/extension of D70208 (rGee0882bdf866), but this one may finally allow closing motivating bugs. This is another step towards having FMF apply only to FP values rather than those + fcmp. See PR38086 for one of the original discussions/motivations: https://bugs.llvm.org/show_bug.cgi?id=38086 And the test here is derived from PR39535: https://bugs.llvm.org/show_bug.cgi?id=39535 Currently, we lose FMF when converting any phi to select in SimplifyCFG. There are a small number of similar changes needed to correct within SimplifyCFG, so it should be quick to patch this pass up. FMF was extended to select and phi with: D61917 D67564	2019-11-17 11:23:44 -05:00
David Green	08390c52a2	[InstCombine] Canonicalize ssub.with.overflow with clamp to ssub.sat Working on top of D69252, this adds canonicalisation patterns for ssub.with.overflow to ssub.sats. Differential Revision: https://reviews.llvm.org/D69753	2019-11-17 10:45:11 +00:00
David Green	03fce6b12e	[InstCombine] Canonicalize sadd.with.overflow with clamp to sadd.sat This adds to D69245, adding extra signed patterns for folding from a sadd_with_overflow to a sadd_sat. These are more complex than the unsigned patterns, as the overflow can occur in either direction. For the add case, the positive overflow can only occur if both of the values are positive (same for both the values being negative). So there is an extra select on whether to use the positive or negative overflow limit. Differential Revision: https://reviews.llvm.org/D69252	2019-11-17 10:42:39 +00:00
Simon Atanasyan	584704c725	[mips] Remove redundant cast. NFC	2019-11-16 20:22:18 +03:00
Simon Atanasyan	6d7fa65c38	[mips] Remove old FIXME comment. NFC The issue was fixed at r275050.	2019-11-16 20:22:17 +03:00
Sourabh Singh Tomar	423f541c1a	[DWARF5] Addition of alignment attribute in typedef DIE. This patch adds support for the DW_AT_alignment (DWARF5) attribute, to be emitted with a typedef DIE when explicit alignment is specified. Patch by Awanish Pandey <Awanish.Pandey@amd.com> Reviewers: aprantl, dblaikie, jini.susan.george, SouraVX, alok, deadalinx Differential Revision: https://reviews.llvm.org/D70111	2019-11-16 21:56:53 +05:30
James Y Knight	bf142fc433	MCObjectStreamer: assign MCSymbols in the dummy fragment to offset 0. In MCObjectStreamer, when there is no current fragment, initially symbols are created in a "pending" state and assigned to a dummy empty fragment. Previously, they were not being assigned an offset, and thus evaluateAbsolute would fail if trying to evaluate an expression 'a - b', where both 'a' and 'b' were in this pending state. Also slightly refactored the EmitLabel overload which takes an MCFragment for clarity. Fixes: https://llvm.org/PR41825 Differential Revision: https://reviews.llvm.org/D70062	2019-11-16 09:52:07 -05:00
Sylvestre Ledru	114f3e5b08	Fix a build failure with perf: Add a missing include to llvm/Support/ManagedStatic.h It was failing with PerfJITEventListener.cpp:489:7: error: 'ManagedStatic' in namespace 'llvm' does not name a template type llvm::ManagedStatic<PerfJITEventListener> PerfListener;	2019-11-16 14:43:46 +01:00
Nicolai Hähnle	d8f7c68e28	AMDGPU/SILoadStoreOptimizer: fix a likely bug introduced recently Summary: We should check for same instruction class before checking whether they have the same base address, else we might iterate out of bounds of a MachineInstr operands list. The InstClass check is also cheaper. This was introduced in SVN r373630. Reviewers: tstellar Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68690	2019-11-16 11:35:34 +01:00
Shiva Chen	cf6cf0cd14	[RISCV] Handle variable sized objects with the stack need to be realigned Differential Revision: https://reviews.llvm.org/D68979	2019-11-16 12:39:53 +08:00
David Blaikie	77cfcd7509	DebugInfo: Use loclistx for DWARFv5 location lists to reduce the number of relocations This only implements the non-dwo part, but loclistx is necessary to use location lists in DWARFv5, so it's a precursor to that work - and generally reduces relocations (only using one reloc, then indexes/relative offsets for all location list references) in non-split DWARF.	2019-11-15 18:51:13 -08:00
David Blaikie	d295087639	DebugInfo: Templatize rnglist header parsing to setup for reuse with loclist header parsing	2019-11-15 16:23:02 -08:00
Thomas Lively	194d7ec081	[WebAssembly] Fix miscompile of select with and Summary: Rolls back the remaining bad optimizations introduced in `eb15d00193`. Some of them were already rolled back in `e661f946a7` and this finishes the job. Fixes https://bugs.llvm.org/show_bug.cgi?id=44012. Reviewers: dschuff, aheejin Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70347	2019-11-15 16:22:01 -08:00
Quentin Colombet	98ceac4981	[GISel][CombinerHelper] Use uses() instead of operands() when traversing use operands. NFC	2019-11-15 13:54:33 -08:00
Quentin Colombet	304abde077	[GISel][CombinerHelper] Add support for scalar type for the result of shuffle vector LLVM IR of 1-element vectors gets lowered into scalars in GISel. As a result, shuffle vector may also produce a scalar. This patch teaches the shuffle combiner how to deal with scalars when they are in the destination type of a shuffle vector. For now, we just support the easy case where this can be lowered to a plain copy. For other cases, we leave the shuffle vector as is. This type of IR is seen in O0 pipelines. E.g., as produced with SingleSource/UnitTests/Vector/AArch64/aarch64_neon_intrinsics.c. rdar://problem/57198904	2019-11-15 13:54:33 -08:00
Reid Kleckner	631be5c0d4	Remove Support/Options.h, it is unused It was added in 2014 in `732e0aa9fb` with one use in Scalarizer.cpp. That one use was then removed when porting to the new pass manager in 2018 in `b6f76002d9`. While the RFC and the desire to get off of static initializers for cl::opt all still stand, this code is now dead, and I think we should delete this code until someone is ready to do the migration. There were many clients of CommandLine.h that were including it transitively through LLVMContext.h, so I cleaned that up in `4c1a1d3cf9`. Reviewers: beanz Differential Revision: https://reviews.llvm.org/D70280	2019-11-15 13:32:52 -08:00
Sanjay Patel	ee0882bdf8	[SimplifyCFG] propagate fast-math-flags (FMF) from phi to select This is another step towards having FMF apply only to FP values rather than those + fcmp. See PR38086 for one of the original discussions/motivations: https://bugs.llvm.org/show_bug.cgi?id=38086 And the test here is derived from PR39535: https://bugs.llvm.org/show_bug.cgi?id=39535 Currently, we lose FMF when converting any phi to select in SimplifyCFG. There are a small number of similar changes needed to correct within SimplifyCFG, so it should be quick to patch this pass up. FMF was extended to select and phi with: D61917 D67564 Differential Revision: https://reviews.llvm.org/D70208	2019-11-15 16:14:35 -05:00
Richard Smith	7889d8e7eb	Revert "[LoadStoreVectorize] Use '||' instead of '|' between sides with function calls. NFCI." This broke two tests. Presumably the non-short-circuiting '|' was intentional here. This reverts commit `f7efea0ded`.	2019-11-15 12:49:35 -08:00
Simon Atanasyan	6108eb4e5c	[mips] Enable `la` pseudo instruction on 64-bit arch. This patch makes LLVM compatible with GAS. It accepts `la` pseudo instruction on arch with 64-bit pointers and just shows a warning. Differential Revision: https://reviews.llvm.org/D70202	2019-11-15 23:38:14 +03:00
Simon Atanasyan	0287efb891	[mips] Do not emit R_MIPS_JALR for sym+offset in case of O32 ABI O32 ABI uses relocations in REL format. Relocation's addend is written in place. R_MIPS_JALR relocation points to the `jalr` instruction which does not have a place to store the relocation addend. So it's impossible to save non-zero "offset". This patch blocks emission of `R_MIPS_JALR` relocations in such cases. Differential Revision: https://reviews.llvm.org/D70201	2019-11-15 23:38:14 +03:00
Rachel Craik	f897d087d0	[LoopCacheAnalysis]: Fix assertion failure during cost computation Ensure the stride and trip count have the same type before multiplying them during reference cost calculation Reviewed By: jdoefert Differential Revision: https://reviews.llvm.org/D70192	2019-11-15 14:56:26 -05:00
Alexandre Ganea	478ad94c8e	[GCOV] Skip artificial functions from being emitted This is a patch to support D66328, which was reverted until this lands. Enable a compiler-rt test that used to fail previously with D66328. Differential Revision: https://reviews.llvm.org/D67283	2019-11-15 14:23:11 -05:00
Francesco Petrogalli	d6de5f12d4	[SVFS] Inject TLI Mappings in VFABI attribute. This patch introduces a function pass to inject the scalar-to-vector mappings stored in the TargetLibraryInfo (TLI) into the Vector Function ABI (VFABI) variants attribute. The test is testing the injection for three vector libraries supported by the TLI (Accelerate, SVML, MASSV). The pass does not change any of the analysis associated to the function. Differential Revision: https://reviews.llvm.org/D70107	2019-11-15 18:42:56 +00:00
Fangrui Song	8bcd01f48a	[ThinLTO] Fix -Wunused-function in NDEBUG builds after llvmorg-10-init-9933-g3d708bf5c26	2019-11-15 10:00:23 -08:00
Dávid Bolvanský	f7efea0ded	[LoadStoreVectorize] Use '||' instead of '|' between sides with function calls. NFCI. Fixes warning from PVS Studio	2019-11-15 18:51:13 +01:00
Aditya Nandakumar	7276868556	[MirNamer][Canonicalizer]: Perform instruction semantic based renaming https://reviews.llvm.org/D70210 Previously: Due to sensitivity of the algorithm with gaps and extra instructions, when diffing, often we see naming being off by a few. This makes the diff unreadable even for tests with 7 and 8 instructions respectively. Naming can change depending on candidates (and order of picking candidates). Suddenly if there's one extra instruction somewhere, the entire subtree would be named completely differently. There is no consistent naming of similar instructions which occur in different functions. If we try to do something like count the frequency distribution of various differences across a suite, then the above sensitivity issues are going to result in poor results. Instead: Name instructions based on the semantics of the instruction (hash of the opcode and operands). Essentially, a given instruction that occurs in any module/function will be named similarly (i.e. semantically). This has some nice properties: We can easily look at many instructions and just check the hash, and if they're named similarly, then it's the same instruction. It makes it very easy to spot the same instruction both multiple times, as well as across many functions (useful for frequency distribution). It is independent of traversal/candidates/depth of graph. There is no need to keep track of last index/gaps/skip count etc. There are no off-by-few issues with diffs. I've tried the old vs new implementation in files ranging from 30 to 700 instructions. In both cases with the old algorithm, diffs are a sea of red, whereas for the semantic version, in both cases, the diffs line up beautifully. The main loop has a simplified implementation (simple iteration), with no need to keep track of what's visited and not. Collisions are handled just by incrementing a counter. Names are roughly bb[N]_hash_[CollisionCount]. Additionally with the new implementation, we can probably avoid doing the hoisting of instructions to various places, as they'll likely be named the same, resulting in differences only based on collision (i.e. regardless of whether the instruction is hoisted or not/close to use or not, it'll be named the same hash, which should result in uses of the instruction being identical with the only change being the collision count), which is very easy to spot visually.	2019-11-15 08:38:54 -08:00
diggerlin	3dfa975fb3	Add read-only data assembly writing for aix SUMMARY: The patch will emit read-only variable assembly code for aix. Reviewers: daltenty,Xiangling_Liao Subscribers: rupprecht, seiyai,hiraditya Differential Revision: https://reviews.llvm.org/D70182	2019-11-15 11:30:19 -05:00
Momchil Velikov	aa6d48fa70	Implement target(branch-protection) attribute for AArch64 This patch implements `__attribute__((target("branch-protection=...")))` in a manner, compatible with the analogous GCC feature: https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes Differential Revision: https://reviews.llvm.org/D68711	2019-11-15 15:40:46 +00:00
Simon Tatham	b0c1900820	[ARM,MVE] Add reversed isel patterns for MVE `vcmp qN,rN` Summary: As well as vector/vector compare instructions, MVE also has a family of comparisons taking a vector and a scalar, which compare every lane of the vector against the same value. We generate those at isel time using isel patterns that match `(ARMvcmp vector, (ARMvdup scalar))`. This commit adds corresponding patterns for the operand-reversed form `(ARMvcmp (ARMvdup scalar), vector)`, with condition codes swapped as necessary. That way, we can still generate the vector/scalar compare instruction if the IR happens to have been rearranged to put the operands the other way round, which can happen in some optimization phases. Previously, a vcmp the other way round was handled by emitting a `vdup` instruction to //explicitly// replicate the scalar input into a vector, and then doing a vector/vector comparison. I haven't added a new test, because it turned out that several existing tests were already exhibiting that failure mode. So just updating the expected output in the existing MVE codegen tests demonstrates what's been improved. Reviewers: ostannard, MarkMurrayARM, dmgreen Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70296	2019-11-15 14:06:00 +00:00
Piotr Sobczak	02419ab5c7	[AMDGPU] Lower llvm.amdgcn.s.buffer.load.v3[i|f]32 Summary: Add lowering support for 32-bit vec3 variant of s.buffer.load intrinsic. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70118	2019-11-15 15:01:15 +01:00
evgeny	3d708bf5c2	Recommit "[ThinLTO] Add correctness check for RO/WO variable import" ValueInfo has user-defined 'operator bool' which allows incorrect implicit conversion to GlobalValue::GUID (which is unsigned long). This causes bugs which are hard to track and should be removed in future.	2019-11-15 16:13:19 +03:00
Serge Pavlov	e6584b2b7b	Move floating point related entities to namespace level Enumerations that describe rounding mode and exception behavior were defined inside ConstrainedFPIntrinsic. It makes sense to use the same definitions to represent the same properties in other cases, not only in constrained intrinsics. It was, however, inconvenient, as it required including constrained intrinsics definitions even if they were not needed. Also, using the long scope prefix reduced readability. This change moves these definitions to the namespace llvm::fp. No functional changes. Differential Revision: https://reviews.llvm.org/D69552	2019-11-15 19:56:33 +07:00
Pavel Labath	0908093977	DWARFDebugLoc(v4): Add an incremental parsing function Summary: This adds a visitLocationList function to the DWARF v4 location lists, similar to what already exists for DWARF v5. It follows the approach outlined in previous patches (D69672), where the parsed form is always stored in the DWARF v5 format, which makes it easier for generic code to be built on top of that. v4 location lists are "upgraded" during parsing, and then this upgrade is undone while dumping. Both "inline" and section-based dumping is rewritten to reuse the existing "generic" location list dumper. This means that the output format is consistent for all location lists (the only thing one needs to implement is the function which prints the "raw" form of a location list), and that debug_loc dumping correctly processes base address selection entries, etc. The previous existing debug_loc functionality (e.g., parseOneLocationList) is rewritten on top of the new API, but it is not removed as there is still code which uses them. This will be done in follow-up patches, after I build the API to access the "interpreted" location lists in a generic way (as that is what those users really want). Reviewers: dblaikie, probinson, JDevlieghere, aprantl, SouraVX Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69847	2019-11-15 13:38:00 +01:00
Jay Foad	c953e061b4	[CodeGen] Increase the size of a SmallVector The SmallVector reserve() call in MachineInstrExpressionTrait::getHashValue accounted for over 3% of all calls to malloc() when I compiled a bunch of graphics shaders for the AMDGPU target. Its initial size was only enough for machine instructions with up to 7 operands, but for AMDGPU 8 and 10 operands are very common. Here's a histogram of number of operands for each call to getHashValue, gathered from the same collection of shaders (operands: calls): 1: 13503, 2: 254273, 3: 135781, 4: 422508, 5: 614997, 6: 194953, 7: 287248, 8: 1517255, 9: 31218, 10: 1191269, 11: 70731, 12: 24, 13: 77, 15: 84, 17: 4692, 27: 16, 33: 705, 49: 6. Typical instructions with 8 and 10 operands are floating point arithmetic and multiply-accumulate instructions like: `%83:vgpr_32 = V_MUL_F32_e64 0, killed %82:vgpr_32, 0, killed %81:vgpr_32, 0, 0, implicit $exec` and `%330:vgpr_32 = V_MAC_F32_e64 0, killed %327:vgpr_32, 0, killed %329:sgpr_32, 0, %328:vgpr_32(tied-def 0), 0, 0, implicit $exec` Differential Revision: https://reviews.llvm.org/D70301	2019-11-15 11:32:11 +00:00
Sjoerd Meijer	71327707b0	[ARM][MVE] tail-predication This is a follow up of `d90804d`, to also flag fcmp instructions as instructions that we do not support in tail-predicated vector loops. Differential Revision: https://reviews.llvm.org/D70295	2019-11-15 11:01:13 +00:00
Petar Avramovic	1f559353a7	[MIPS GlobalISel] Select andi, ori and xori Introduce IntImmLeaf version of PatLeaf immZExt16 for 32-bit immediates. Change immZExt16 with imm32ZExt16 for andi, ori and xori. This keeps same behavior for SDAG and allows for GlobalISel selectImpl to select 'G_CONSTANT imm' + G_AND, G_OR, G_XOR into ANDi, ORi, XORi, respectively, when 32-bit imm satisfies imm32ZExt16 predicate: zero extending 16 low bits of imm is equal to imm. Large number of test changes comes from zero extending of small types which is transformed into 'and' with bitmask in legalizer. Differential Revision: https://reviews.llvm.org/D70185	2019-11-15 11:41:25 +01:00
Petar Avramovic	dda8e95540	[MIPS GlobalISel] Select addiu Introduce IntImmLeaf version of PatLeaf immSExt16 for 32-bit immediates. Change immSExt16 with imm32SExt16 for addiu. This keeps same behavior for SDAG and allows for GlobalISel selectImpl to select 'G_CONSTANT imm' + G_ADD into ADDIu when 32-bit imm satisfies imm32SExt16 predicate: sign extending 16 low bits of imm is equal to imm. Differential Revision: https://reviews.llvm.org/D70184	2019-11-15 11:36:13 +01:00
Mikael Holmen	1587c7e86f	[Scalarizer] Treat values from unreachable blocks as undef Summary: When scalarizing PHI nodes we might try to examine/rewrite InsertElement nodes in predecessors. If those predecessors are unreachable from entry, then the IR in those blocks could have unexpected properties resulting in infinite loops in Scatterer::operator[]. By simply treating values originating from instructions in unreachable blocks as undef we do not need to analyse them further. This fixes PR41723. Reviewers: bjope Reviewed By: bjope Subscribers: bjope, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70171	2019-11-15 11:13:37 +01:00
Hans Wennborg	c42e385135	Fix GCC -Wcast-qual warnings	2019-11-15 09:49:06 +01:00
Hans Wennborg	04dcb8009f	GCC 5.3 build fix It was failing with llvm/lib/ExecutionEngine/Orc/DebugUtils.cpp:56:10: error: could not convert ‘Obj’ from ‘std::unique_ptr<llvm::MemoryBuffer>’ to ‘llvm::Expected<std::unique_ptr<llvm::MemoryBuffer> >’ return Obj; ^	2019-11-15 09:49:06 +01:00
Matt Arsenault	31479d868e	AMDGPU: Change boolean content type to 0 or 1 The usage of target boolean checks is overly inflexible, since sext and zext of a compare are equally cheap. The choice is arbitrary, but using 0/1 to some degree is the choice of lower resistance since that's what most targets use. This enables a few combines that don't bother to support ZeroOrNegativeOneBooleanContent.	2019-11-15 13:43:47 +05:30
Matt Arsenault	69fcfb7d35	AMDGPU: Try to commute sub of boolean ext Avoids another regression in a future patch.	2019-11-15 13:43:42 +05:30
Matt Arsenault	bc276c6379	GlobalISel: Lower s1 source G_SITOFP/G_UITOFP	2019-11-15 13:37:20 +05:30
Lang Hames	16f38dda29	[ORC] Add a utility to support dumping JIT'd objects to disk for debugging. Adds a DumpObjects utility that can be used to dump JIT'd objects to disk. Instances of DebugObjects may be used by ObjectTransformLayer as no-op transforms. This patch also adds an ObjectTransformLayer to LLJIT and an example of how to use this utility to dump JIT'd objects in LLJIT.	2019-11-14 21:27:19 -08:00
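
Commit `03fce6b12e` above describes folding a signed add-with-overflow followed by a clamp into a saturating add, noting that overflow can only happen when both operands share a sign. The following is a minimal C++ sketch of that source pattern, not the InstCombine code itself. The function name `sadd_clamped` is hypothetical, and `__builtin_add_overflow` (a GCC/Clang builtin) is used only as a stand-in for `llvm.sadd.with.overflow`.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical illustration of the "overflow check + clamp" pattern.
// Assumes a GCC- or Clang-compatible compiler for __builtin_add_overflow.
int32_t sadd_clamped(int32_t a, int32_t b) {
  int32_t sum;
  // Returns true when the mathematically exact sum does not fit in int32_t.
  bool overflowed = __builtin_add_overflow(a, b, &sum);
  if (!overflowed)
    return sum;
  // Signed add can only overflow when both operands have the same sign,
  // so the sign of either operand selects which limit to clamp to.
  return a < 0 ? std::numeric_limits<int32_t>::min()
               : std::numeric_limits<int32_t>::max();
}
```

The extra select on the operand sign is what makes the signed case more involved than the unsigned one, where there is only a single overflow limit.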


128536 Commits
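
Commits `1f559353a7` and `dda8e95540` in the list above state their immediate predicates in words: a 32-bit immediate qualifies when zero extending (or sign extending) its low 16 bits gives back the same value. Those conditions can be sketched in plain C++ as below. The helper names `fitsZExt16` and `fitsSExt16` are hypothetical; the actual predicates (`imm32ZExt16`, `imm32SExt16`) are TableGen definitions in the MIPS backend and may be written differently.

```cpp
#include <cstdint>

// Hypothetical helper: true when zero extending the low 16 bits of imm
// reproduces imm, i.e. the upper 16 bits are all zero.
bool fitsZExt16(uint32_t imm) {
  return (imm & 0xFFFFu) == imm;
}

// Hypothetical helper: true when sign extending the low 16 bits of imm
// reproduces imm, i.e. imm lies in [-32768, 32767].
bool fitsSExt16(int32_t imm) {
  uint32_t low = static_cast<uint32_t>(imm) & 0xFFFFu;
  // Flip the sign bit of the 16-bit value and subtract it back out; this
  // sign extends without relying on narrowing conversions.
  int32_t extended = static_cast<int32_t>(low ^ 0x8000u) - 0x8000;
  return extended == imm;
}
```

Under this reading, an immediate such as 0xFFFF satisfies the zero-extension check but not the sign-extension one, which is why the two instruction groups need separate predicates.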