llvm-project

Commit Graph

Author	SHA1	Message	Date
Edd Dawson	dd2f79ed44	[test][MC] Use %python in llvm/test/MC/COFF/bigobj.py ... instead of the one on the $PATH. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D88986	2020-10-07 14:03:28 -04:00
Florian Hahn	a73166a452	[LAA] Use DL to get element size for bound computation. Currently LAA uses getScalarSizeInBits to compute the size of an element when computing the end bound of an access. This does not work as expected for pointers to pointers, because getScalarSizeInBits will return 0 for pointer types. By using DataLayout to get the size of the element we can also correctly handle pointer element types. Note the changes to the existing test, which seems to also use the wrong offset for the end. Fixes PR47751. Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D88953	2020-10-07 18:57:07 +01:00
Amara Emerson	322d0afd87	[llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics. This change renames the intrinsics to not have "experimental" in the name. The autoupgrader will handle legacy intrinsics. Relevant ML thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html Differential Revision: https://reviews.llvm.org/D88787	2020-10-07 10:36:44 -07:00
Heejin Ahn	3bba91f64e	[WebAssembly] Rename Emscripten EH functions Renaming for some Emscripten EH functions has so far been done in wasm-emscripten-finalize tool in Binaryen. But recently we decided to make a compilation/linking path that does not rely on wasm-emscripten-finalize for modifications, so here we move that functionality to LLVM. Invoke wrappers are generated in LowerEmscriptenEHSjLj pass, but final wasm types are not available in the IR pass, we need to rename them at the end of the pipeline. This patch also removes uses of `emscripten_longjmp_jmpbuf` in LowerEmscriptenEHSjLj pass, replacing that with `emscripten_longjmp`. `emscripten_longjmp_jmpbuf` is lowered to `emscripten_longjmp`, but previously we generated calls to `emscripten_longjmp_jmpbuf` in LowerEmscriptenEHSjLj pass because it takes `jmp_buf*` instead of `i32`. But we were able use `ptrtoint` to make it use `emscripten_longjmp` directly here. Addresses: https://github.com/WebAssembly/binaryen/issues/3043 https://github.com/WebAssembly/binaryen/issues/3081 Companions: https://github.com/WebAssembly/binaryen/pull/3191 https://github.com/emscripten-core/emscripten/pull/12399 Reviewed By: dschuff, tlively, sbc100 Differential Revision: https://reviews.llvm.org/D88697	2020-10-07 09:42:49 -07:00
Nikita Popov	7a01fc5abe	[MemCpyOpt] Add additional callslot test cases (NFC) For cases where the destination is captured.	2020-10-07 18:06:29 +02:00
Roman Lebedev	bef27e50b9	[NFC][InstCombine] Autogenerate a few tests being affected by upcoming patch	2020-10-07 19:00:08 +03:00
Philip Reames	14d5ee63e3	[Tests] Precommit test showing gap around load forwarding of vectors in instcombine	2020-10-07 08:57:24 -07:00
Yonghong Song	ddf1864ace	BPF: add AdjustOpt IR pass to generate verifier friendly codes Add an IR phase right before main module optimization. This is to modify IR to restrict certain downward optimizations in order to generate verifier friendly code. > prevent certain instcombine optimizations, handling both in-block/cross-block instcombines. > avoid speculative code motion if the variable used in condition is also used in the later blocks. Internally, a bpf IR builtin result = __builtin_bpf_passthrough(seq_num, result) is used to enforce ordering. This builtin is only used during target independent IR optimizations and it will be removed at the beginning of target dependent IR optimizations. For example, removing the following workaround, --- a/tools/testing/selftests/bpf/progs/test_sysctl_loop1.c +++ b/tools/testing/selftests/bpf/progs/test_sysctl_loop1.c @@ -47,7 +47,7 @@ int sysctl_tcp_mem(struct bpf_sysctl ctx) / a workaround to prevent compiler from generating * codes verifier cannot handle yet. */ - volatile int ret; + int ret; this patch is able to generate code which passed the verifier. To disable optimization, users need to use "opt" command like below: clang -target bpf -O2 -S -emit-llvm -Xclang -disable-llvm-passes test.c // disable icmp serialization opt -O2 -bpf-disable-serialize-icmp test.ll \| llvm-dis > t.ll // disable avoid-speculation opt -O2 -bpf-disable-avoid-speculation test.ll \| llvm-dis > t.ll llc t.ll Differential Revision: https://reviews.llvm.org/D85570	2020-10-07 08:49:10 -07:00
Ronak Chauhan	528057c197	[AMDGPU] Support disassembly for AMDGPU kernel descriptors Decode AMDGPU Kernel descriptors as assembler directives. Reviewed By: scott.linder, jhenderson, kzhuravl Differential Revision: https://reviews.llvm.org/D80713	2020-10-07 20:39:43 +05:30
Cameron McInally	333b2ab60b	[SVE] Lower fixed length VECREDUCE_OR operation Differential Revision: https://reviews.llvm.org/D88847	2020-10-07 09:56:25 -05:00
Jay Foad	fc819b6925	[AMDGPU] Use @LINE for error checking in gfx10.3 assembler tests	2020-10-07 15:48:01 +01:00
Georgii Rymar	55a60af237	[llvm-readelf] - Implement --addrsig option. We have `--addrsig` implemented for `llvm-readobj`. Usually it is convenient to use a single tool for dumping, so it seems we might want to implement `--addrsig` for `llvm-readelf` too. I've selected a simple output format which is a bit similar to one, used for dumping of the symbol table. It looks like: ``` Address-significant symbols section '.llvm_addrsig' contains 2 entries: Num: Name 1: foo 2: bar ``` Differential revision: https://reviews.llvm.org/D88835	2020-10-07 16:45:30 +03:00
Dmitry Preobrazhensky	4a7e7620d6	[AMDGPU][MC] Improved diagnostics for instructions with missing features Reviewers: rampitec Differential Revision: https://reviews.llvm.org/D88887	2020-10-07 16:31:29 +03:00
Roman Lebedev	fed0f890e5	InstCombine: Negator: don't rely on complexity sorting already being performed (PR47752) In some cases, we can negate instruction if only one of it's operands negates. Previously, we assumed that constants would have been canonicalized to RHS already, but that isn't guaranteed to happen, because of InstCombine worklist visitation order, as the added test (previously-hanging) shows. So if we only need to negate a single operand, we should ensure ourselves that we try constant operand first. Do that by re-doing the complexity sorting ourselves, when we actually care about it. Fixes https://bugs.llvm.org/show_bug.cgi?id=47752	2020-10-07 15:09:50 +03:00
Rodrigo Dominguez	f71f5f39f6	[AMDGPU] Implement hardware bug workaround for image instructions Summary: This implements a workaround for a hardware bug in gfx8 and gfx9, where register usage is not estimated correctly for image_store and image_gather4 instructions when D16 is used. Change-Id: I4e30744da6796acac53a9b5ad37ac1c2035c8899 Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81172	2020-10-07 07:39:52 -04:00
Simon Pilgrim	dce03e3059	[InstCombine] Tweak funnel by constant tests for better shl/lshr commutation coverage	2020-10-07 11:47:03 +01:00
Simon Pilgrim	6625892d7c	[ARM] Regenerate vldlane tests To help make the diffs in D88569 clearer	2020-10-07 11:47:03 +01:00
Florian Hahn	20cfd5fa33	[LAA] Add test for PR47751, which currently uses wrong bounds.	2020-10-07 11:22:22 +01:00
Jay Foad	1aa8e6a51a	[SDag] SimplifyDemandedBits: simplify to FP constant if all bits known We were already doing this for integer constants. This patch implements the same thing for floating point constants. Differential Revision: https://reviews.llvm.org/D88570	2020-10-07 09:24:38 +01:00
Max Kazantsev	85a6f8fc96	[Test] Add one more test where we can avoid creating trunc	2020-10-07 15:06:38 +07:00
Roman Lebedev	7fa503ef4a	[SROA] rewritePartition()/findCommonType(): if uses have conflicting type, try getTypePartition() before falling back to largest integral use type (PR47592) And another step towards transformss not introducing inttoptr and/or ptrtoint casts that weren't there already. In this case, when load/store uses have conflicting types, instead of falling back to the iN, we can try to use allocated sub-type. As disscussed, this isn't the best idea overall (we shouldn't rely on allocated type), but it works fine as a temporary measure. I've measured, and @ `-O3` as of vanilla llvm test-suite + RawSpeed, this results in +0.05% more bitcasts, -5.51% less inttoptr and -1.05% less ptrtoint (at the end of middle-end opt pipeline) See https://bugs.llvm.org/show_bug.cgi?id=47592 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D88788	2020-10-07 09:20:19 +03:00
Yonghong Song	edd71db38b	BPF: avoid duplicated globals for CORE relocations This patch fixed two issues related with relocation globals. In LLVM, if a global, e.g. with name "g", is created and conflict with another global with the same name, LLVM will rename the global, e.g., with a new name "g.2". Since relocation global name has special meaning, we do not want llvm to change it, so internally we have logic to check whether duplication happens or not. If happens, just reuse the previous global. The first bug is related to non-btf-id relocation (BPFAbstractMemberAccess.cpp). Commit `54d9f743c8` ("BPF: move AbstractMemberAccess and PreserveDIType passes to EP_EarlyAsPossible") changed ModulePass to FunctionPass, i.e., handling each function at a time. But still just one BPFAbstractMemberAccess object is created so module level de-duplication still possible. Commit `40251fee00` ("[BPF][NewPM] Make BPFTargetMachine properly adjust NPM optimizer pipeline") made a change to create a BPFAbstractMemberAccess object per function so module level de-duplication is not possible any more without going through all module globals. This patch simply changed the map which holds reloc globals as class static, so it will be available to all BPFAbstractMemberAccess objects for different functions. The second bug is related to btf-id relocation (BPFPreserveDIType.cpp). Before Commit `54d9f743c8`, the pass is a ModulePass, so we have a local variable, incremented for each instance, and works fine. But after Commit `54d9f743c8`, the pass becomes a FunctionPass. Local variable won't work properly since different functions will start with the same initial value. Fix the issue by change the local count variable as static, so it will be truely unique across the whole module compilation. Differential Revision: https://reviews.llvm.org/D88942	2020-10-06 22:37:49 -07:00
Max Kazantsev	0c009e092e	[Test] Add test showing that we can avoid inserting trunc/zext	2020-10-07 12:19:01 +07:00
Chen Zheng	f05608707c	[PowerPC] implement target hook getTgtMemIntrinsic This patch can make pass recognize Powerpc related memory intrinsics. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D88373	2020-10-07 00:02:44 -04:00
Bill Wendling	d2c61d2bf9	[CodeGen][TailDuplicator] Don't duplicate blocks with INLINEASM_BR Tail duplication of a block with an INLINEASM_BR may result in a PHI node on the indirect branch. This is okay, but it also introduces a copy for that PHI node after the INLINEASM_BR, which is not okay. See: https://github.com/ClangBuiltLinux/linux/issues/1125 Differential Revision: https://reviews.llvm.org/D88823	2020-10-06 18:44:59 -07:00
Johannes Doerfert	7993d61177	[Attributor] Use smarter way to determine alignment of GEPs Use same logic existing in other places to deal with base case GEPs. Add the original Attributor talk example.	2020-10-06 19:31:08 -05:00
Johannes Doerfert	c4cfe7a435	[Attributor] Ignore read accesses to constant memory The old function attribute deduction pass ignores reads of constant memory and we need to copy this behavior to replace the pass completely. First step are constant globals. TBAA can also describe constant accesses and there are other possibilities. We might want to consider asking the alias analyses that are available but for now this is simpler and cheaper.	2020-10-06 19:31:07 -05:00
Johannes Doerfert	3f540c05df	[Attributor] Give up early on AANoReturn::initialize If the function is not assumed `noreturn` we should not wait for an update to mark the call site as "may-return". This has two kinds of consequences: - We have less iterations in many tests. - We have less deductions based on "known information" (since we ask earlier, point 1, and therefore assumed information is not "known" yet). The latter is an artifact that we might want to tackle properly at some point but which is not easily fixable right now.	2020-10-06 19:31:07 -05:00
Scott Linder	bf5c1d92d9	[AMDGPU] Fix remaining kernel descriptor test Follow up on `e4a9e4ef55` to fix a test I missed in the original patch. Committed as obvious.	2020-10-06 18:45:04 +00:00
Scott Linder	e4a9e4ef55	[AMDGPU] Emit correct kernel descriptor on big-endian hosts Previously we wrote multi-byte values out as-is from host memory. Use the `emitIntN` helpers in `MCStreamer` to produce a valid descriptor irrespective of the host endianness. Reviewed By: arsenm, rochauha Differential Revision: https://reviews.llvm.org/D88858	2020-10-06 17:29:38 +00:00
Nikita Popov	616f545048	[MemCpyOpt] Use dereferenceable pointer helper The call slot optimization has some home-grown code for checking whether the destination is dereferenceable. Replace this with the generic isDereferenceableAndAlignedPointer() helper. I'm not checking alignment here, because that is currently handled separately and may be an enforced alignment for allocas. The clean way of integrating that part would probably be to accept a callback in isDereferenceableAndAlignedPointer() for the actual isAligned check, which would then have a chance to use an enforced alignment instead. This allows the destination to be a GEP (among other things), though the two open TODOs may prevent it from working in practice. Differential Revision: https://reviews.llvm.org/D88805	2020-10-06 18:41:19 +02:00
Nikita Popov	6b441ca523	[MemCpyOpt] Check for throwing calls during call slot optimization When performing call slot optimization for a non-local destination, we need to check whether there may be throwing calls between the call and the copy. Otherwise, the early write to the destination may be observable by the caller. This was already done for call slot optimization of load/store, but not for memcpys. For the sake of clarity, I'm moving this check into the common optimization function, even if that does need an additional instruction scan for the load/store case. As efriedma pointed out, this check is not sufficient due to potential accesses from another thread. This case is left as a TODO. Differential Revision: https://reviews.llvm.org/D88799	2020-10-06 18:24:40 +02:00
Fangrui Song	43c7dc52f1	[X86] .code16: temporarily set Mode32Bit when matching an instruction with the data32 prefix PR47632 This allows MC to match `data32 ...` as one instruction instead of two (data32 without insn + insn). The compatibility with GNU as improves: `data32 ljmp` will be matched as ljmpl. `data32 lgdt 4(%eax)` will be matched as `lgdtl` (prefixes: 0x67 0x66, instead of 0x66 0x67). GNU as supports many other `data32 w` as `l`. We currently just hard code `data32 callw` and `data32 ljmpw`. Generalizing the suffix replacement is tricky and requires a think about the "bwlq" appending suffix rules in MatchAndEmitATTInstruction. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D88772	2020-10-06 08:32:03 -07:00
Dávid Bolvanský	86429c4eaf	[SimplifyLibCalls] Optimize mempcpy_chk to mempcpy	2020-10-06 17:08:46 +02:00
Arthur Eubanks	40251fee00	[BPF][NewPM] Make BPFTargetMachine properly adjust NPM optimizer pipeline This involves porting BPFAbstractMemberAccess and BPFPreserveDIType to NPM, then adding them BPFTargetMachine::registerPassBuilderCallbacks (the NPM equivalent of adjustPassManager()). Reviewed By: yonghong-song, asbirlea Differential Revision: https://reviews.llvm.org/D88855	2020-10-06 07:42:32 -07:00
Arthur Eubanks	8df17b4dc1	[test][InstCombine][NewPM] Fix InstCombine tests under NPM Some of these depended on analyses being present that aren't provided automatically in NPM. early_dce_clobbers_callgraph.ll was previously inlining a noinline function? cast-call-combine.ll relied on the legacy always-inline pass being a CGSCC pass and getting rerun. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D88187	2020-10-06 07:39:00 -07:00
Arthur Eubanks	61d4b342d1	[test][NewPM] Make dead-uses.ll work under NPM This one is weird... globals-aa needs to be already computed at licm, or else a function pass can't run a module analysis and won't have access to globals-aa. But the globals-aa result is impacted by instcombine in a way that affects what the test is expecting. If globals-aa is computed before instcombine, it is cached and globals-aa used in licm won't contain the necessary info provided by instcombine. Another catch is that if we don't invalidate AAManager, it will use the cached AAManager that instcombine requested, which may not contain globals-aa. So we have to invalidate<aa> so that licm can recompute an AAManager with the globals-aa created by the require<globals-aa>. This is essentially the problem described in https://reviews.llvm.org/D84259. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D88118	2020-10-06 07:33:02 -07:00
Johannes Doerfert	4a7a988442	[Attributor][FIX] Move assertion to make it not trivially fail The idea of this assertion was to check the simplified value before we assign it, not after, which caused this to trivially fail all the time.	2020-10-06 09:32:18 -05:00
Johannes Doerfert	04f6951397	[Attributor][FIX] Dead return values are not `noundef` When we assume a return value is dead we might still visit return instructions via `Attributor::checkForAllReturnedValuesAndReturnInsts(..)`. When we do so the "returned value" is potentially simplified to `undef` as it is the assumed "returned value". This is a problem if there was a preexisting `noundef` attribute that will only be removed as we manifest the `undef` return value. We should not use this combination to derive `unreachable` though. Two test cases fixed.	2020-10-06 09:32:18 -05:00
Johannes Doerfert	957094e31b	[Attributor][NFC] Ignore benign uses in AAMemoryBehaviorFloating In AAMemoryBehaviorFloating we used to track benign uses in a SetVector. With this change we look through benign uses eagerly to reduce the number of elements (=Uses) we look at during an update. The test does actually not fail prior to this commit but I already wrote it so I kept it.	2020-10-06 09:32:18 -05:00
Sam Tebbs	68e002e181	[ARM] Fold select_cc(vecreduce_[u\|s][min\|max], x) into VMINV or VMAXV This folds a select_cc or select(set_cc) of a max or min vector reduction with a scalar value into a VMAXV or VMINV. Differential Revision: https://reviews.llvm.org/D87836	2020-10-06 14:44:58 +01:00
Dmitry Preobrazhensky	e2452f57fa	[AMDGPU][MC] Added detection of unsupported instructions Implemented identification of unsupported instructions; improved errors reporting. See bug 42590. Reviewers: rampitec Differential Revision: https://reviews.llvm.org/D88211	2020-10-06 16:44:27 +03:00
Jonas Paulsson	5588dbce73	[SystemZAsmParser] Treat VR128 separately in ParseDirectiveInsn(). This patch makes the parser - reject higher vector registers (>=16) in operands where they should not be accepted. - accept higher integers (>=16) in vector register operands. Review: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D88888	2020-10-06 14:42:40 +02:00
Alexander Shaposhnikov	315970de1d	[llvm-objcopy][MachO] Add support for universal binaries This diff adds support for universal binaries to llvm-objcopy. This is a recommit of `32c8435ef7` with the asan issue fixed. Test plan: make check-all Differential revision: https://reviews.llvm.org/D88400	2020-10-06 04:01:40 -07:00
Denis Antrushin	c08d48fc2d	[Statepoints] Change statepoint machine instr format to better suit VReg lowering. Current Statepoint MI format is this: STATEPOINT <id>, <num patch bytes >, <num call arguments>, <call target>, [call arguments...], <StackMaps::ConstantOp>, <calling convention>, <StackMaps::ConstantOp>, <statepoint flags>, <StackMaps::ConstantOp>, <num deopt args>, [deopt args...], <gc base/derived pairs...> <gc allocas...> Note that GC pointers are listed in pairs <base,derived>. This causes base pointers to appear many times (at least twice) in instruction, which is bad for us when VReg lowering is ON. The problem is that machine operand tiedness is 1-1 relation, so it might look like this: %vr2 = STATEPOINT ... %vr1, %vr1(tied-def0) Since only one instance of %vr1 is tied, that may lead to incorrect codegen (see PR46917 for more details), so we have to always spill base pointers. This mostly defeats new VReg lowering scheme. This patch changes statepoint instruction format so that every gc pointer appears only once in operand list. That way they all can be tied. Additional set of operands is added to preserve base-derived relation required to build stackmap. New statepoint has following format: STATEPOINT <id>, <num patch bytes>, <num call arguments>, <call target>, [call arguments...], <StackMaps::ConstantOp>, <calling convention>, <StackMaps::ConstantOp>, <statepoint flags>, <StackMaps::ConstantOp>, <num deopt args>, [deopt args...], <StackMaps::ConstantOp>, <num gc pointers>, [gc pointers...], <StackMaps::ConstantOp>, <num gc allocas>, [gc allocas...] <StackMaps::ConstantOp>, <num entries in gc map>, [base/derived indices...] Changes are: - every gc pointer is listed only once in a flat length-prefixed list; - alloca list is prefixed with its length too; - following alloca list is length-prefixed list of base-derived indices of pointers from gc pointer list. Note that indices are logical (number of pointer), not absolute (index of machine operand). Differential Revision: https://reviews.llvm.org/D87154	2020-10-06 17:40:29 +07:00
Paul Walker	27f3d51b4e	[SVE] Lower fixed length vector fneg and fsqrt operations. Also updates sve-fp.ll to use fneg directly. Differential Revision: https://reviews.llvm.org/D88683	2020-10-06 10:48:16 +01:00
Paul Walker	8bb702a8ad	[SVE] Lower fixed length vector floating point rounding operations. Adds lowering for: llvm.ceil llvm.floor llvm.nearbyint llvm.rint llvm.round llvm.trunc Differential Revision: https://reviews.llvm.org/D88671	2020-10-06 10:48:16 +01:00
Dmitri Gribenko	80f66ac0d5	Revert "[llvm-objcopy][MachO] Add support for universal binaries" This reverts commit `32c8435ef7`. It fails ASan, details in https://reviews.llvm.org/D88400.	2020-10-06 11:29:24 +02:00
Mauri Mustonen	cef0de5eb5	[VPlan] Add vplan native path vectorization test case for inner loop reduction Regarding this bug I posted earlier: https://bugs.llvm.org/show_bug.cgi?id=47035 After reading through LLVM source code and getting familiar with VPlan I was able to vectorize the code using by enabling VPlan native path. After talking with @fhahn he suggested that I contribute this as a test case. So here it is. I tried to follow the available guides how to do this best I could. I modified IR code by hand to have more clear variable names instead of numbers. One thing what I'd like to get input from someone is that is current CHECK lines sufficient enough to verify that the inner loop has been vectorized properly? Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D87564	2020-10-06 10:11:58 +01:00
Georgii Rymar	f1ceaa200f	[llvm-readobj/elf][test] - Stop using precompiled binaries in mips-got.test This removed 2 last precompiled binaries from the mips-got.test. YAML descriptions are used instead. Differential revision: https://reviews.llvm.org/D88565	2020-10-06 12:04:44 +03:00

1 2 3 4 5 ...

75839 Commits