llvm-project

Commit Graph

Author	SHA1	Message	Date
Florian Hahn	ed9e1a7dcc	[PhaseOrdering] Add test for missing vectorization with NewPM.	2021-05-12 19:34:14 +01:00
Florian Hahn	96c1fa2a04	[SCEV] Add loop-guard pessimizing test with step = 2.	2021-05-12 19:30:11 +01:00
Stelios Ioannou	1124ad2f5d	[LoopFlatten] Simplify loops so that the pass can operate on unsimplified loops. The loop flattening pass requires loops to be in simplified form. If the loops are not in simplified form, the pass cannot operate. This patch simplifies all loops before flattening. As a result, all loops will be simplified regardless of whether anything ends up being flattened. This change was inspired by observing a certain loop that was not flattened because the loops were not in simplified form. This loop is added as a test to verify that it is now flattened. Differential Revision: https://reviews.llvm.org/D102249 Change-Id: I45bcabe70fb99b0d89f0effafc82eb9e0585ec30	2021-05-12 19:22:01 +01:00
Shoaib Meenai	56f7e5a822	[cmake] Add support for multiple distributions LLVM's build system contains support for configuring a distribution, but it can often be useful to be able to configure multiple distributions (e.g. if you want separate distributions for the tools and the libraries). Add this support to the build system, along with documentation and usage examples. Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D89177	2021-05-12 11:13:18 -07:00
Rob Suderman	7b57517507	[mlir][linalg] Fixed issue generating reassociation map with Rank-0 types Rank-0 case causes a graph during linalg reshape operation. Differential Revision: https://reviews.llvm.org/D102282	2021-05-12 11:00:51 -07:00
Benjamin Kramer	1470b8587f	Remove AST inclusion from Basic include That's a cyclic dependency. NFC.	2021-05-12 19:51:21 +02:00
Valentin Clement	113b807017	[mlir][openacc] Add OpenACC translation to LLVM IR (enter_data op create/copyin) This patch begins to translate the acc.enter_data operation to a tgt runtime call. It currently only translates create/copyin operands of memref type. This acts as a basis to add support for FIR types in the Flang/OpenACC support. It follows more or less a similar path to clang with `omp target enter data map` directives. This patch takes a different approach than D100678: it performs a translation to LLVM IR and makes use of the OpenMPIRBuilder instead of doing a conversion to the LLVMIR dialect. OpenACC support in Flang will rely on the current OpenMP runtime where 1:1 lowering can be applied. Some extensions will be added where features are not available yet. A big part of this code will be shared with other standalone data operations in the OpenACC dialect such as acc.exit_data and acc.update. It is likely that parts of the lowering can also be shared later with the ops for standalone data directives in the OpenMP dialect when they are introduced. This is an initial translation and it probably needs more work. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D101504	2021-05-12 13:41:14 -04:00
Roman Lebedev	81f56a2eb3	[NFC][clang][Codegen] Split ThunkInfo into its own header Otherwise we'll have issues with forward definition of GlobalDecl. Split off from https://reviews.llvm.org/D100388	2021-05-12 20:39:54 +03:00
Roman Lebedev	2d84195d60	[NFCI][clang][Codegen] CodeGenVTables::addVTableComponent(): use getGlobalDecl It does the same thing. Split off from https://reviews.llvm.org/D100388	2021-05-12 20:39:54 +03:00
Fangrui Song	0fe6649bc5	[X86] Fix -Wunused-lambda-capture	2021-05-12 10:34:32 -07:00
Fangrui Song	3bf1acab5b	[CMake][ELF] Add -fno-semantic-interposition and -Bsymbolic-functions llvm-dev message: https://lists.llvm.org/pipermail/llvm-dev/2021-May/150465.html In an ELF shared object, a default visibility defined symbol is preemptible by default. This creates some missed optimization opportunities. -fno-semantic-interposition can optimize -fPIC: * in Clang: avoid GOT/PLT cost for variable access/function calls to external linkage definition in the same TU * in GCC: enable interprocedural optimizations (including inlining) and avoid PLT See https://gist.github.com/MaskRay/2d4dfcfc897341163f734afb59f689c6 for more information. -Bsymbolic-functions is more aggressive than -fvisibility-inlines-hidden (present since 2012) as it applies to all function definitions. It can * avoid PLT for cross-TU function calls && reduce dynamic symbol lookup * reduce dynamic symbol lookup for taking function addresses and optimize out GOT/TOC on x86-64/ppc64 With both options, the libLLVM.so and libclang-cpp.so performance should be closer to PIE binary linking against `libLLVM.a` and `libclang.a` (In a -DLLVM_TARGETS_TO_BUILD=X86 build, the number of JUMP_SLOT decreases from 12716 to 1628, and the number of GLOB_DAT decreases from 1918 to 1313. The built clang with `-DLLVM_LINK_LLVM_DYLIB=on -DCLANG_LINK_CLANG_DYLIB=on` is significantly faster. See the Linux kernel build result https://bugs.archlinux.org/task/70697 ) Some implications: Interposing a subset of functions is no longer supported. (This is fragile anyway and cannot really be supported. For Mach-O we don't use `ld -interpose`, so interposition is not supported on Mach-O at all.) Compiling a program which takes the address of any LLVM function with `{gcc,clang} -fno-pic` and expects the address to equal the address taken from libLLVM.so or libclang-cpp.so is unsupported. I am fairly confident that llvm-project shouldn't have different behaviors depending on such pointer equality (as we've been using -fvisibility-inlines-hidden which applies to inline functions for a long time), but if we accidentally do, users should be aware that they should not make assumptions on pointer equality in `-fno-pic` mode. Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D102090	2021-05-12 10:34:31 -07:00
Inho Seo	5480ea6c84	Update static bound checker for Linalg to cover decreasing cases The current static checker for linalg does not work well on decreasing index cases. This updates the current static bound checker for linalg to cover decreasing index cases. Reviewed By: hanchung Differential Revision: https://reviews.llvm.org/D102302	2021-05-12 10:29:19 -07:00
Simon Pilgrim	fb1d61b725	[X86][AVX] Fold concat(pslq(x,32),pslq(y,32)) -> shuffle(concat(x,y),zero) (PR46621) On AVX1 targets we can handle v4i64 logical shifts by 32 bits as a pair of v8f32 shuffles with zero. I was hoping to put this in LowerScalarImmediateShift, but performing that early causes regressions where other instructions were resplitting the subvectors.	2021-05-12 18:04:40 +01:00
Aart Bik	ca5d0a7310	[mlir][sparse] keep runtime support library signature consistent Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D102285	2021-05-12 09:59:46 -07:00
Amara Emerson	dc8d16c03f	[AArch64][GlobalISel] Add MMOs to constant pool loads to allow LICM hoisting. This caused performance regressions vs SDAG on SingleSource/Benchmarks/Adobe-C++	2021-05-12 09:47:09 -07:00
Greg McGary	93c8559baf	[lld-macho] Implement branch-range-extension thunks Extend the range of calls beyond an architecture's limited branch range by first calling a thunk, which loads the far address into a scratch register (x16 on ARM64) and branches through it. Other ports (COFF, ELF) use multiple passes with successively-refined guesses regarding the expansion of text-space imposed by thunk-space overhead. This MachO algorithm places thunks during MergedOutputSection::finalize() in a single pass using exact thunk-space overheads. Thunks are kept in a separate vector to avoid the overhead of inserting into the `inputs` vector of `MergedOutputSection`. FIXME: * arm64-stubs.s test is broken * add thunk tests * Handle thunks to DylibSymbol in MergedOutputSection::finalize() Differential Revision: https://reviews.llvm.org/D100818	2021-05-12 09:44:58 -07:00
Jon Chesterfield	9934571eab	[libomptarget][amdgpu][nfc] Expand errorcheck macros These macros expand to continue, which is confusing, or exit, which is incompatible with continuing execution on offloading fail. Expanding the macros in place makes the code look untidy but the control flow obvious and amenable to improving. In particular, exit becomes easier to eliminate. Reviewed By: pdhaliwal Differential Revision: https://reviews.llvm.org/D102230	2021-05-12 17:30:41 +01:00
Abhina Sreeskantharajan	cbed6e5b2f	[SystemZ][z/OS] Fix warning caused by umask returning a signed integer type On z/OS, umask() returns an int because mode_t is type int, however it is being compared to an unsigned int. This patch fixes the following warning we see when compiling Path.cpp. ``` comparison of integers of different signs: 'const int' and 'const unsigned int' ``` Reviewed By: muiez Differential Revision: https://reviews.llvm.org/D102326	2021-05-12 12:26:22 -04:00
Malcolm Parsons	5389a05836	[docs] Fix documentation for bugprone-dangling-handle string_view isn't experimental anymore. This check has always handled both forms. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D102313	2021-05-12 17:20:15 +01:00
Fabian Schuiki	33f908c428	[MLIR] Factor pass timing out into a dedicated timing manager This factors out the pass timing code into a separate `TimingManager` that can be plugged into the `PassManager` from the outside. Users are able to provide their own implementation of this manager, and use it to time additional code paths outside of the pass manager. Also allows for multiple `PassManager`s to run and contribute to a single timing report. More specifically, moves most of the existing infrastructure in `Pass/PassTiming.cpp` into a new `Support/Timing.cpp` file and adds a public interface in `Support/Timing.h`. The `PassTiming` instrumentation becomes a wrapper around the new timing infrastructure which adapts the instrumentation callbacks to the new timers. Reviewed By: rriddle, lattner Differential Revision: https://reviews.llvm.org/D100647	2021-05-12 18:14:51 +02:00
Victor Huang	cf4610d27b	[PowerPC] Fix definitions of CMPRB8, CMPEQB, CMPRB, SETB in PPCInstr64Bit.td and PPCInstrInfo.td	2021-05-12 10:59:33 -05:00
Baptiste Saleil	5885f1a4cb	[AMDGPU] Disable the SIFormMemoryClauses pass at -O1 This patch disables the SIFormMemoryClauses pass at -O1. This pass has a significant impact on compilation time, so we only want it to be enabled starting from -O2. Differential Revision: https://reviews.llvm.org/D101939	2021-05-12 11:51:59 -04:00
Paul Robinson	47a11a97d0	Fix grammar in README.md	2021-05-12 08:48:59 -07:00
Simon Pilgrim	7bff9bdd34	[X86][AVX] combineConcatVectorOps - add ConcatSubOperand helper. NFCI. Pull out repeated code to create a concat_vectors of the same operand from all subvecs.	2021-05-12 16:42:18 +01:00
Simon Pilgrim	778562ada3	[X86][AVX] Add v4i64 shift-by-32 tests AVX1 could perform this as a v8f32 shuffle instead of splitting - based off PR46621	2021-05-12 16:42:18 +01:00
Fraser Cormack	c5ec00e62b	[TargetLowering] Improve legalization of scalable vector types This patch extends the vector type-conversion and legalization capabilities of scalable vector types. Firstly, `vscale x 1` types now behave more like the corresponding `vscale x 2+` types. This enables the integer promotion legalization of extended scalable types, such as the promotion of `<vscale x 1 x i5>` to `<vscale x 1 x i8>`. These `vscale x 1` types are also now better handled by `getVectorTypeBreakdown`, where what looks like older handling for 1-element fixed-length vector types was spuriously updated to include scalable types. Widening of scalable types is now better supported, by using `INSERT_SUBVECTOR` to insert the smaller scalable vector "value" type into the wider scalable vector "part" type. This allows AArch64 to pass and return `vscale x 1` types by value by widening. There are still cases where we are unable to legalize `vscale x 1` types, such as where expansion would require splitting the vector in two. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D102073	2021-05-12 16:33:07 +01:00
Valentin Clement	6110b667b0	[mlir][openacc] Conversion of data operand to LLVM IR dialect Add a conversion pass to convert higher-level types before translation. This conversion extracts meaningful information and packs it into a struct that the translation (D101504) will be able to understand. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D102170	2021-05-12 11:34:15 -04:00
Anastasia Stulova	58d18dde5c	[OpenCL] Remove pragma requirement from Arm dot extension. This removed the pointless need for extension pragma since it doesn't disable anything properly and it doesn't need to enable anything that is not possible to disable. The change doesn't break existing kernels since it allows to compile more cases i.e. without pragma statements but the pragma continues to be accepted. Differential Revision: https://reviews.llvm.org/D100985	2021-05-12 16:25:33 +01:00
Jordan Rupprecht	1336c5ae2f	[llvm-cov][test] Add test coverage for "gcov" implying "llvm-cov gcov" compatibility. Much like other LLVM binary utilities, `llvm-cov` has a symlink compatibility feature where it runs in `gcov` compatibility mode if the binary name ends in `gcov`. This is identical to invoking `llvm-cov gcov ...`. Differential Revision: https://reviews.llvm.org/D102299	2021-05-12 08:21:42 -07:00
Yaxun (Sam) Liu	98575708da	[CUDA][HIP] Fix device template variables Currently clang does not emit device template variables instantiated only in host functions, however, nvcc is able to do that: https://godbolt.org/z/fneEfferY This patch fixes this issue by refactoring and extending the existing mechanism for emitting static device var ODR-used by host only. Basically clang records device variables ODR-used by host code and forces them to be emitted in device compilation. The existing mechanism makes sure these device variables ODR-used by host code are added to llvm.compiler-used, therefore they are guaranteed not to be deleted. It also fixes non-ODR-use of a static device variable by host code causing the static device variable to be emitted and registered, which should not happen. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D102237	2021-05-12 11:13:29 -04:00
Craig Topper	44e0e91db0	[ValueTypes] Rename MVT::getVectorNumElements() to MVT::getVectorMinNumElements(). Fix some misuses of getVectorNumElements() getVectorNumElements() returns a value for scalable vectors without any warning so it is effectively getVectorMinNumElements(). By renaming it and making getVectorNumElements() forward to it, we can insert a check for scalable vectors into getVectorNumElements() similar to EVT. I didn't do that in this patch because there are still more fixes needed, but I was able to temporarily do it and passed the RISCV lit tests with these changes. The changes to isPow2VectorType and getPow2VectorType are copied from EVT. The change to TypeInfer::EnforceSameNumElts reduces the size of AArch64's isel table. We're now considering SameNumElts to require the scalable property to match which removes some unneeded type checks. This was motivated by the bug I fixed yesterday in `80b9510806` Reviewed By: frasercrmck, sdesmalen Differential Revision: https://reviews.llvm.org/D102262	2021-05-12 07:46:45 -07:00
Stefan Pintilie	8d37411e48	Revert "[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics" This reverts commit `6c80361b84`. Breaks PowerPC Big Endian buildbots.	2021-05-12 09:46:18 -05:00
Hendrik Greving	762ac725bf	[DAGCombiner] Fix DAG combine store elimination, different address space. Fixes a bug in the DAG combiner that eliminates the stores because it failed to inspect the address space of the pointers. %v = load %ptr_as1 // no chain side effect store %v, %ptr_as2 As well as store %v, %ptr_as1 store %v, %ptr_as2 Fixes a test for above in X86. Differential Revision: https://reviews.llvm.org/D102096	2021-05-12 07:14:22 -07:00
Hendrik Greving	4b00ffa767	[DAGCombiner] Add test exposing bug in DAG combine. Adds a test in X86, exposing a bug in DAG combine eliminating stores that are the same value but not the same address space. Differential Revision: https://reviews.llvm.org/D102243	2021-05-12 07:14:21 -07:00
Peter Waller	3fa6510f6e	[CodeGen][AArch64][SVE] Fold [rdffr, ptest] => rdffrs; bugfix for optimizePTestInstr When a ptest is used to set flags from the output of rdffr, the ptest can be eliminated, using a flags-setting rdffrs instead. Additionally, check that nothing consumes flags between rdffr and ptest; this case appears to have been missed previously. * There is no unpredicated RDFFRS instruction. * If substituting RDFFR_PP, require that the mask argument of the PTEST matches that of the RDFFR_PP. * Move some precondition code up inside optimizePTestInstr, so that it covers the new code paths for RDFFR which return earlier. * Only consider RDFFR, PTEST in same basic block. * Check for other flag setting instructions between the two, abort if found. * Drop an old TODO comment about removing dead PTEST instructions. RDFFR_P to follow in later patch. Differential Revision: https://reviews.llvm.org/D101357	2021-05-12 15:06:22 +01:00
Ben Shi	892c56eabe	[clang][AVR] Redefine some types to be compatible with avr-gcc Reviewed By: dylanmckay Differential Revision: https://reviews.llvm.org/D100701	2021-05-12 22:05:26 +08:00
David Sherwood	61630814b1	[NFC] Use variable GEP index in vec_demanded_elts tests I've changed a test in each of these files: Transforms/InstCombine/vec_demanded_elts.ll Transforms/InstCombine/vec_demanded_elts-inseltpoison.ll to use a variable GEP index instead of a constant value so that we're testing the more general case.	2021-05-12 14:56:04 +01:00
Martin Storsjö	4b98199ce8	[Passes] Reenable the relative lookup table converter pass for ELF and COFF on aarch64 The bug (PR50227, affecting COFF) that caused the revert in `6f5670a4c3` has been fixed in `382c505d9c` now, so it should be safe to reenable the pass for that target (and ELF). In PR50227 it's also mentioned that the same pass seems to cause problems on aarch64 on darwin, so leaving it disabled there for now.	2021-05-12 16:42:11 +03:00
Greg McGary	5a43901539	[llvm-objdump] Exclude __mh__header symbols during MachO disassembly `__mh_(execute\|dylib\|dylinker\|bundle\|preload\|object)_header` are special symbols whose values hold the VMA of the Mach header to support introspection. They are attached to the first section in `__TEXT`, even though their addresses are outside `__TEXT`, and they do not refer to code. It is normally harmless, but when the first section of `__TEXT` has no other symbols, `__mh__header` is considered by the disassembler when determining function boundaries. Since `__mh_*_header` refers to an address outside `__TEXT`, the boundary determination fails and disassembly quits. Since `__TEXT,__text` normally has symbols, this bug is obscured. Experiments placing `__stubs` and `__stub_helper` first exposed the bug, since neither has symbols. Differential Revision: https://reviews.llvm.org/D101786	2021-05-12 06:39:14 -07:00
Julien Pagès	46adccc5cc	[AMDGPU] Improve Codegen for build_vector Improve the code generation of build_vector. Use the v_pack_b32_f16 instruction instead of v_and_b32 + v_lshl_or_b32 Differential Revision: https://reviews.llvm.org/D98081 Patch by Julien Pagès!	2021-05-12 14:17:44 +01:00
Roman Lebedev	554b1bced3	[InstCombine] ~(C + X) --> ~C - X (PR50308) We can not rely on (C+X)-->(X+C) already happening, because we might not have visited that `add` yet. The added testcase would get stuck in an endless combine loop.	2021-05-12 16:10:55 +03:00
Jay Foad	a383d325f6	[TargetRegisterInfo] Speed up getAllocatableSet. NFCI. MachineRegisterInfo caches the reserved register set that is computed by TargetRegisterInfo::getReservedRegs, so call into MRI to get the reserved regs to avoid recomputing them. In particular this speeds up AMDGPU's SIFormMemoryClauses pass because AMDGPU has a particularly complicated reserved set that is expensive to compute. Differential Revision: https://reviews.llvm.org/D102318	2021-05-12 14:09:05 +01:00
Tobias Gysi	06bb9cf30d	[mlir][linalg] Remove IndexedGenericOp support from LinalgInterchangePattern... after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612). Differential Revision: https://reviews.llvm.org/D102245	2021-05-12 13:01:37 +00:00
Piotr Sobczak	a4db7025a9	[AMDGPU] Remove assert Remove assert introduced in D101177, following post-commit feedback.	2021-05-12 14:52:37 +02:00
Sanjay Patel	f58e0513dd	[x86] try harder to lower to PCMPGT instead of not-of-PCMPEQ This is motivated by the example in https://llvm.org/PR50055 , but it doesn't do anything for that bug currently because we don't actually have a zero-extended setcc there. Proof for the generic transform (inverse of what we would try to do in combining): https://alive2.llvm.org/ce/z/aBL-Mg Differential Revision: https://reviews.llvm.org/D102275	2021-05-12 08:25:29 -04:00
Sanjay Patel	24d06fff55	[x86] add test for pcmpeq with 0; NFC	2021-05-12 08:25:29 -04:00
Nathan James	4c59ab34f7	[clang-tidy][NFC] Simplify a lot of bugprone-sizeof-expression matchers There should be a follow up to this for changing the traversal mode, but some of the tests don't like that. Reviewed By: steveire Differential Revision: https://reviews.llvm.org/D101614	2021-05-12 13:18:41 +01:00
Tobias Gysi	c6b96ae06f	[mlir][linalg] Remove IndexedGenericOp support from LinalgBufferize... after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612). Differential Revision: https://reviews.llvm.org/D102308	2021-05-12 12:15:05 +00:00
David Spickett	7d0a81ca38	Revert "[scudo] Enable arm32 arch" This reverts commit `b1a77e465e`. Which has a failing test on our armv7 bots: https://lab.llvm.org/buildbot/#/builders/59/builds/1812	2021-05-12 13:12:28 +01:00
Hana Joo	163325086c	[clang-tidy] Enable the use of IgnoreArray flag in pro-type-member-init rule The `IgnoreArray` flag was not used before while running the rule. Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=47288 \| b/47288 ]] Reviewed By: njames93 Differential Revision: https://reviews.llvm.org/D101239	2021-05-12 12:57:21 +01:00
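
The rewrite named in commit 554b1bced3 above, `~(C + X) --> ~C - X`, relies on the two's-complement identity `~y = -y - 1`. The sketch below checks that identity on a few sample values. It is an illustration only, not code from the patch: it assumes 32-bit wrap-around integers, and the names `MASK32`, `not_of_add`, `not_c_minus_x`, `identity_holds`, and `SAMPLES` are hypothetical.

```python
# Assumption: model a 32-bit integer type as arithmetic modulo 2**32.
MASK32 = 0xFFFFFFFF


def not_of_add(c: int, x: int) -> int:
    """Left-hand side of the rewrite: ~(C + X), wrapped to 32 bits."""
    return ~((c + x) & MASK32) & MASK32


def not_c_minus_x(c: int, x: int) -> int:
    """Right-hand side of the rewrite: ~C - X, wrapped to 32 bits."""
    return ((~c & MASK32) - x) & MASK32


def identity_holds(samples) -> bool:
    """Return True if both sides agree on every (C, X) pair in samples."""
    return all(not_of_add(c, x) == not_c_minus_x(c, x) for c, x in samples)


# Hypothetical sample pairs, chosen to include overflow and sign-bit cases.
SAMPLES = [
    (0, 0),
    (1, MASK32),
    (MASK32, MASK32),
    (0x80000000, 0x7FFFFFFF),
    (42, 12345),
]

if __name__ == "__main__":
    print(identity_holds(SAMPLES))
```

Since `~(C + X) = -(C + X) - 1 = (-C - 1) - X = ~C - X` modulo 2**32, the two helpers should agree for every input, not only the samples listed here.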

388178 Commits, All Branches