llvm-project

Commit Graph

Author	SHA1	Message	Date
Sriraman Tallam	df082ac45a	Basic Block Sections support in LLVM. This is the second patch in a series of patches to enable basic block sections support. This patch adds support for: * Creating direct jumps at the end of basic blocks that have fall through instructions. * New pass, bbsections-prepare, that analyzes placement of basic blocks in sections. * Actual placing of a basic block in a unique section with special handling of exception handling blocks. * Supports placing a subset of basic blocks in a unique section. * Support for MIR serialization and deserialization with basic block sections. Parent patch : D68063 Differential Revision: https://reviews.llvm.org/D73674	2020-03-16 16:06:54 -07:00
Craig Topper	378b1e6080	[X86] Assign avx512bf16 instructions to the SSEPackedSingle ExeDomain.	2020-03-16 14:07:01 -07:00
Benjamin Kramer	05ff3323e0	[AArch64] Remove unused variable	2020-03-16 21:59:55 +01:00
Nico Weber	623cb95eb3	Revert "[InstSimplify] Simplify calls with "returned" attribute" This reverts commit `45555c3819`. Causes clang crashes in some cases, see comments on https://reviews.llvm.org/D75815 for details (including repro steps).	2020-03-16 15:21:30 -04:00
Francesco Petrogalli	0f2b68d9c7	Implement IR intrinsics for gather prefetch. Summary: Intrinsics and relative codegen have been implemented for the following SVE instructions: 1. PRF<T> <prfop>, <Pg>, [<Xn\|SP>, <Zm>.S, <mod>] -> 32-bit scaled offset 2. PRF<T> <prfop>, <Pg>, [<Xn\|SP>, <Zm>.D, <mod>] -> 32-bit unpacked scaled offset 3. PRF<T> <prfop>, <Pg>, [<Xn\|SP>, <Zm>.D] -> 64-bit scaled offset 4. PRF<T> <prfop>, <Pg>, [<Zn>.S{, #<imm>}] -> 32-bit element 5. PRF<T> <prfop>, <Pg>, [<Zn>.D{, #<imm>}] -> 64-bit element The instructions are associated with the following intrinsics, respectively: 1. void @llvm.aarch64.sve.gather.prf<T>.scaled.<mod>.nx4vi32( i8* %base, <vscale x 4 x i32> %offset, <vscale x 4 x i1> %Pg, i32 %prfop) 2. void @llvm.aarch64.sve.gather.prf<T>.scaled.<mod>.nx2vi32( i8* %base, <vscale x 2 x i32> %offset, <vscale x 2 x i1> %Pg, i32 %prfop) 3. void @llvm.aarch64.sve.gather.prf<T>.scaled.nx2vi64( i8* %base, <vscale x 2 x i64> %offset, <vscale x 2 x i1> %Pg, i32 %prfop) 4. void @llvm.aarch64.sve.gather.prf<T>.nx4vi32( <vscale x 4 x i32> %bases, i64 %imm, <vscale x 4 x i1> %Pg, i32 %prfop) 5. void @llvm.aarch64.sve.gather.prf<T>.nx2vi64( <vscale x 2 x i64> %bases, i64 %imm, <vscale x 2 x i1> %Pg, i32 %prfop) The intrinsics are the IR counterpart of the following SVE ACLE functions: * void svprf<T>(svbool_t pg, const void *base, svprfop op) * void svprf<T>_vnum(svbool_t pg, const void *base, int64_t vnum, svprfop op) * void svprf<T>_gather[_u32base](svbool_t pg, svuint32_t bases, svprfop op) * void svprf<T>_gather[_u64base](svbool_t pg, svuint64_t bases, svprfop op) * void svprf<T>_gather_[s32]offset(svbool_t pg, const void *base, svint32_t offsets, svprfop op) * void svprf<T>_gather_[u32]offset(svbool_t pg, const void *base, svint32_t offsets, svprfop op) * void svprf<T>_gather_[s64]offset(svbool_t pg, const void *base, svint64_t offsets, svprfop op) * void svprf<T>_gather_[u64]offset(svbool_t pg, const void *base, svint64_t offsets, svprfop op) * void svprf<T>_gather[_u32base]_offset(svbool_t pg, svuint32_t bases, int64_t offset, svprfop op) * void svprf<T>_gather[_u64base]_offset(svbool_t pg, svuint64_t bases, int64_t offset, svprfop op) Reviewers: andwar, sdesmalen, efriedma, rengolin Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75580	2020-03-16 18:52:35 +00:00
Huihui Zhang	0616e9964b	[InstSimplify][SVE] Fix SimplifyGEPInst for scalable vector. Summary: Skip folds that rely on DataLayout::getTypeAllocSize(). For scalable vector, only minimal type alloc size is known at compile-time. Reviewers: sdesmalen, efriedma, spatel, apazos Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75892	2020-03-16 11:46:12 -07:00
Matt Arsenault	b0bdb186f5	Utils: Always set alignment when expanding mem intrinsics This was creating natural aligned loads and stores, which may not be the case. The target could request a wider type load with less alignment.	2020-03-16 14:34:29 -04:00
Matt Arsenault	05e7d8d6ce	TTI: Add addrspace parameters to memcpy lowering functions	2020-03-16 14:34:29 -04:00
Nico Weber	9e48422035	Revert "[llvm-objdump] Display locations of variables alongside disassembly" Makes tests fail on Windows, see https://reviews.llvm.org/D70720#1924542 This reverts commit `3a5ddedadb`, and follow-ups: `f4cb9c919e` `042eb0482a` `c0cf5f5da9` `18649f4813` `f62b898c1f`	2020-03-16 14:04:25 -04:00
Simon Pilgrim	ebb181cf40	[X86] matchScalarReduction - add support for partial reductions Add optional support for opt-in partial reduction cases by providing an optional partial mask to indicate which elements have been extracted for the scalar reduction.	2020-03-16 18:01:02 +00:00
Matt Arsenault	2e77362626	GlobalISel: Fix lower bswap for vectors This would hit an assertion from trying to use the wrong bitwidth for the constants.	2020-03-16 13:59:08 -04:00
Matt Arsenault	80b627d69d	AMDGPU/GlobalISel: Fix handling of G_ANYEXT with s1 source We were letting G_ANYEXT with a vcc register bank through, which was incorrect and would select to an invalid copy. Fix this up like G_ZEXT and G_SEXT. Also drop old code to fixup the non-boolean case in RegBankSelect. We now have to perform that expansion during selection, so there's no benefit to doing it during RegBankSelect.	2020-03-16 12:59:54 -04:00
Matt Arsenault	c460dc6eeb	AMDGPU/GlobalISel: Fix some illegal scalar argument types Fixes integers that don't evenly divide to i32 pieces. We should probably extract some of the code in the legalizer to start handling argument breakdowns. I'm dissatisfied with the argument lowering's handling of vectors for example, and we should not be producing the weird G_EXTRACTs we do now.	2020-03-16 12:51:23 -04:00
Juneyoung Lee	07a41544fd	Minor fix to a comment in CodeGenPrepare.cpp	2020-03-17 01:10:26 +09:00
Matt Arsenault	84386b2d8a	AMDGPU: Drop special case f64 fround lowering The result is better if ftrunc is emitted and separately legalized when unavailable.	2020-03-16 12:09:30 -04:00
Matt Arsenault	19a0350187	GlobalISel: Fix round lowering I used the implementation for floor instead of round. It also turns out the OpenCL builtin library wasn't using the round builtin, but implemented the expanded form.	2020-03-16 11:37:30 -04:00
Dominik Montada	8ff2dcb18b	[GlobalISel] add additional lowering support for G_INSERT Summary: Add lowering support for inserting pointers or scalars into scalars, vectors or pointers Reviewers: arsenm, dsanders Reviewed By: arsenm Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75994	2020-03-16 16:27:17 +01:00
Matt Arsenault	57d896e838	AMDGPU/GlobalISel: Make some large merges legal We allow up to 1024-bit registers, so we should support merges all the way to the maximum.	2020-03-16 10:49:10 -04:00
Fangrui Song	536ba6373f	[Object] Change ELFObjectFile<ELFT>::getFileFormatName() to use BFD names Follow-up for D74433 What the function returns are almost standard BFD names, except that "ELF" is in uppercase instead of lowercase. This patch changes "ELF" to "elf" and changes ARM/AArch64 to use their BFD names. MIPS and PPC64 have endianness differences as well, but this patch does not intend to address them. Advantages: * llvm-objdump: the "file format " line matches GNU objdump on ARM/AArch64 objects * "file format " line can be extracted and fed into llvm-objcopy -O literally. (https://github.com/ClangBuiltLinux/linux/issues/779 has such a use case) Affected tools: llvm-readobj, llvm-objdump, llvm-dwarfdump, MCJIT (internal implementation detail, not exposed) Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D76046	2020-03-16 07:42:04 -07:00
Simon Pilgrim	2b3b453a82	[TargetLowering] Only demand a funnelshift's modulo amount bits ISD::FSHL/FSHR shift amount values are guaranteed to act as a modulo amount, so for power-of-2 bitwidths we only need the lowest bits.	2020-03-16 13:52:17 +00:00
Juneyoung Lee	7aecf2323c	[ExpandMemCmp] Correctly set alignment of generated loads Summary: This is a part of the series of efforts for correcting alignment of memory operations. (Other related bugs: https://bugs.llvm.org/show_bug.cgi?id=44388 , https://bugs.llvm.org/show_bug.cgi?id=44543 ) This fixes https://bugs.llvm.org/show_bug.cgi?id=43880 by setting the default alignment of loads to 1. The test CodeGen/AArch64/bcmp-inline-small.ll should have been changed; it was introduced by https://reviews.llvm.org/D64805 . I talked with @evandro, and confirmed that the test is okay to be changed. Two other tests from PowerPC needed changes as well, but the fixes were straightforward. Reviewers: courbet Reviewed By: courbet Subscribers: nlopes, gchatelet, wuzish, nemanjai, kristof.beyls, hiraditya, steven.zhang, danielkiss, llvm-commits, evandro Tags: #llvm Differential Revision: https://reviews.llvm.org/D76113	2020-03-16 22:39:48 +09:00
Simon Pilgrim	e43a085781	[X86] X86::isConstantSplat - enable partial undef bit handling by default. We currently only ever use this for lowering constant uniform values (shift/rotate by immediate) so we can safely enable it by default (it treats the undef bits as zero when extracting constants). This is necessary for an upcoming patch that will use SimplifyDemandedBits more aggressively on funnel shift amounts and causes regressions in vXi64 constant without it.	2020-03-16 12:56:24 +00:00
Simon Pilgrim	ac4609cb1d	[X86] LowerRotate - use X86::isConstantSplat to detect constant splat rotation amounts. Avoid code duplication and matches what we do for the similar LowerFunnelShift and LowerScalarImmediateShift methods.	2020-03-16 12:56:23 +00:00
Jonas Paulsson	132f25bcca	[SystemZ] Avoid scalarization of [SU]INT_TO_FP ISD-nodes. The type legalizer will scalarize vector conversions from integer to floating point if the source element size is less than that of the result. This is avoided now by inserting a zero/sign-extension of the source vector before type legalization. Review: Ulrich Weigand Differential revision: https://reviews.llvm.org/D75978	2020-03-16 13:07:42 +01:00
David Stenberg	02b6a3c349	[DebugInfo] Handle generic type DW_OP_convert ops in dsymutil Summary: This is a preparatory change for allowing LLVM to emit DW_OP_convert operations converting to the generic type. If DW_OP_convert's operand is 0, it converts the top of the stack to the generic type, as specified by DWARFv5 section 2.5.1.6: "[...] takes one operand, which is an unsigned LEB128 integer that represents the offset of a debugging information entry in the current compilation unit, or value 0 which represents the generic type." This adds support for such operations to dsymutil. Reviewers: aprantl, markus, friss, JDevlieghere Reviewed By: aprantl, JDevlieghere Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D76142	2020-03-16 12:16:37 +01:00
Oliver Stannard	18649f4813	[llvm-objdump] Add entry_value and stack_value opcodes Add the DW_OP_entry_value and DW_OP_stack_value opcodes to the DWARF expression printer. Differential revision: https://reviews.llvm.org/D74843	2020-03-16 10:54:41 +00:00
Oliver Stannard	c0cf5f5da9	[llvm-objdump] Add simple memory expressions to variable display Add the DW_OP_breg0..DW_OP_breg31 and DW_OP_bregx opcodes to the DWARF expression printer. Differential revision: https://reviews.llvm.org/D74841	2020-03-16 10:54:41 +00:00
Oliver Stannard	3a5ddedadb	[llvm-objdump] Display locations of variables alongside disassembly This adds the --debug-vars option to llvm-objdump, which prints locations (registers/memory) of source-level variables alongside the disassembly based on DWARF info. A vertical line is printed for each live-range, with a label at the top giving the variable name and location, and the position and length of the line indicating the program counter range in which it is valid. Currently, this only works for object files, not executables or shared libraries. Differential revision: https://reviews.llvm.org/D70720	2020-03-16 10:54:40 +00:00
David Stenberg	c93652517c	[DebugInfo] Handle generic type DW_OP_convert ops in llvm-dwarfdump Summary: This is a preparatory change for allowing LLVM to emit DW_OP_convert operations converting to the generic type. If DW_OP_convert's operand is 0, it converts the top of stack to the generic type, as specified by DWARFv5 section 2.5.1.6: "[...] takes one operand, which is an unsigned LEB128 integer that represents the offset of a debugging information entry in the current compilation unit, or value 0 which represents the generic type." This adds support for such operations to llvm-dwarfdump. Reviewers: aprantl, markus, jdoerfert, jhenderson Reviewed By: aprantl Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D76141	2020-03-16 11:24:01 +01:00
Shengchen Kan	b1a7a245ec	[NFC][MC] Rename alignBranches* to emitInstruction* alignBranches is X86 specific; change the name to a more general one, since other targets can do some state changes before and after emitting the instruction.	2020-03-16 17:13:14 +08:00
Simon Atanasyan	e0ab0e6a28	[MIPS] Implement PUL.PS and PUU.PS instructions Patch by Michael Roe. Differential Revision: https://reviews.llvm.org/D75812	2020-03-16 09:39:47 +03:00
Serguei Katkov	ad643d5e93	[Verifier] Remove invalid verifier check According to LangRef for unordered atomic memory transfer intrinsics "The first three arguments are the same as they are in the @llvm.memcpy intrinsic, with the added constraint that len is required to be a positive integer multiple of the element_size. If len is not a positive integer multiple of element_size, then the behaviour of the intrinsic is undefined." So a len that is not a multiple of the element size is just undefined behavior, and the verifier should not complain about it, as undefined behavior is allowed in LLVM IR. This change removes the verifier check for this condition. Reviewers: reames Reviewed By: reames Subscribers: dantrushin, hiraditya, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D76116	2020-03-16 12:00:08 +07:00
Juneyoung Lee	6ad63606ea	[CodeGenPrepare] Freeze condition when transforming select to br Summary: This is a simple fix for CodeGenPrepare that freezes branch condition when transforming select to branch. If it is not frozen, instsimplify or the later pipeline can potentially exploit undefined behavior. The diff shows optimized form because D75859 and D76048 already made a few changes to CodeGenPrepare for optimizing freeze(cmp). Reviewers: jdoerfert, spatel, lebedev.ri, efriedma Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76179	2020-03-16 12:46:20 +09:00
Juneyoung Lee	4ffe3ac729	Revert "[CodeGenPrepare] Freeze condition when transforming select to br" This reverts commit `10aa7ea951`.	2020-03-16 12:45:54 +09:00
Philip Reames	a79863f2f7	Support prefix padding for alignment purposes (Relaxable instructions only) Now that D75203 has landed and baked for a few days, extend the basic approach to prefix padding as well. The patch itself is fairly straightforward. For the moment, this patch adds the functional support and some basic testing thereof, but defaults to not enabling prefix padding. I want to be able to phrase a separate patch which adds the target specific reasoning and test it cleanly. I haven't decided whether I want to common it with the nop logic or not. Differential Revision: https://reviews.llvm.org/D75300	2020-03-15 19:53:41 -07:00
Craig Topper	b2da1ddaef	[X86] Add a non-zero cost for truncating v32i16->v32i8 on avx512bw.	2020-03-15 17:18:46 -07:00
Lang Hames	9c5771710e	Revert "[ORC] Enable JITEventListeners in the RTDyldObjectLinkingLayer." This reverts commit `98f2bb4461`. Reverting while I investigate bot failures.	2020-03-15 15:35:08 -07:00
Lang Hames	98f2bb4461	[ORC] Enable JITEventListeners in the RTDyldObjectLinkingLayer. Enable use of ExecutionEngine JITEventListeners in RTDyldObjectLinkingLayer. This allows existing MCJIT clients to more easily migrate to LLJIT / ORCv2. Example usage in llvm/examples/OrcV2Examples/LLJITWithGDBRegistrationListener. Differential Revision: https://reviews.llvm.org/D75838	2020-03-15 15:14:46 -07:00
Benjamin Kramer	caef4a81c9	[AVR] Make helper functions static. NFC.	2020-03-15 16:50:15 +01:00
Sander de Smalen	8105935d3a	[TypeSize] Allow returning scalable size in implicit conversion to uint64_t This patch removes compiler runtime assertions that ensure the implicit conversion are only guaranteed to work for fixed-width vectors. With the assert it would be impossible to get _anything_ to build until the entire codebase has been upgraded, even when the indiscriminate uses of the size as uint64_t would work fine for both scalable and fixed-width types. This issue will need to be addressed differently, with build-time errors rather than assertion failures, but that effort falls beyond the scope of this patch. Returning the scalable size and avoiding the assert in getFixedSize() is a temporary stop-gap in order to use LLVM for compiling and using the SVE ACLE intrinsics. Reviewers: efriedma, huntergr, rovka, ctetreau, rengolin Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D75297	2020-03-15 13:48:49 +00:00
Simon Pilgrim	5641804298	[DAG] MatchRotate - Add funnel shift by variable support Followup to D75114, this patch reuses the existing MatchRotate ROTL/ROTR rotation pattern code to also recognize the more general FSHL/FSHR funnel shift patterns when we have variable shift amounts, matched with MatchFunnelPosNeg which acts in an (almost) equivalent manner to MatchRotatePosNeg.	2020-03-15 11:50:45 +00:00
Florian Hahn	650f363bd7	[ValueLattice] Add singlecrfromundef lattice value. This patch adds a new singlecrfromundef lattice value, indicating a single element constant range which was merged with undef at some point. Merging it with another constant range results in overdefined, as we won't be able to replace all users with a single value. This patch uses a ConstantRange instead of a Constant*, because regular integer constants are represented as single element constant ranges as well and this allows the existing code to work without additional changes. Reviewers: efriedma, nikic, reames, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D75845	2020-03-15 11:23:46 +00:00
Juneyoung Lee	10aa7ea951	[CodeGenPrepare] Freeze condition when transforming select to br Summary: This is a simple fix for CodeGenPrepare that freezes branch condition when transforming select to branch. If it is not frozen, instsimplify or the later pipeline can potentially exploit undefined behavior. The diff shows optimized form because D75859 and D76048 already made a few changes to CodeGenPrepare for optimizing freeze(cmp). Reviewers: jdoerfert, spatel, lebedev.ri, efriedma Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76179	2020-03-15 11:10:46 +09:00
Lang Hames	981f017c5c	[ORC] Print symbol flags and materializer name in ExecutionSession::dump. The extra information can be helpful in diagnosing JIT bugs.	2020-03-14 18:52:10 -07:00
Lang Hames	9c9eb60b4b	[JITLink][MachO] Re-apply `b64afadf30`, MachO linker-private support, with fixes. Global symbols with linker-private prefixes should be resolvable across object boundaries, but internal symbols with linker-private prefixes should not.	2020-03-14 18:36:15 -07:00
Lang Hames	a7d187d9c0	Revert "[JITLink][MachO] Treat linker private symbols as hidden rather than private." This reverts commit `b64afadf30`. Reverting while I investigate bot failures.	2020-03-14 16:52:25 -07:00
Lang Hames	b64afadf30	[JITLink][MachO] Treat linker private symbols as hidden rather than private. Linker-private symbols should be resolvable across object file boundaries.	2020-03-14 16:33:15 -07:00
Lang Hames	633ea07200	[Orc] Add basic OrcV2 C bindings and example. Renames the llvm/examples/LLJITExamples directory to llvm/examples/OrcV2Examples since it is becoming a home for all OrcV2 examples, not just LLJIT. See http://llvm.org/PR31103.	2020-03-14 14:41:22 -07:00
Krzysztof Parzyszek	3656558cec	[Hexagon] Only allow single HVX vector loads/stores in lowering This will prevent store widening from forming vector pair stores, which eventually end up broken up into single stores.	2020-03-14 14:26:01 -05:00
Simon Pilgrim	ee862adf60	Fix signed/unsigned comparison warning.	2020-03-14 18:42:27 +00:00

132266 Commits