llvm-project

Commit Graph

Author	SHA1	Message	Date
Kerry McLaughlin	3f5bf35f86	[AArch64][SVE] Implement intrinsics for non-temporal loads & stores Summary: Adds the following intrinsics: - llvm.aarch64.sve.ldnt1 - llvm.aarch64.sve.stnt1 This patch creates masked loads and stores with the MONonTemporal flag set when used with the intrinsics above. Reviewers: sdesmalen, paulwalker-arm, dancgr, mgudim, efriedma, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71000	2019-12-11 11:13:51 +00:00
Sjoerd Meijer	d97cf1f889	[ARM][LowOverheadLoops] Remove dead loop update instructions. After creating a low-overhead loop, the loop update instruction was still lingering around hurting performance. This removes dead loop update instructions, which in our case are mostly SUBS instructions. To support this, some helper functions were added to MachineLoopUtils and ReachingDefAnalysis to analyse live-ins of loop exit blocks and find uses before a particular loop instruction, respectively. This is a first version that removes a SUBS instruction when there are no other uses inside and outside the loop block, but there are some more interesting cases in test/CodeGen/Thumb2/LowOverheadLoops/mve-tail-data-types.ll which shows that there is room for improvement. For example, we can't handle this case yet: .. dlstp.32 lr, r2 .LBB0_1: mov r3, r2 subs r2, #4 vldrh.u32 q2, [r1], #8 vmov q1, q0 vmla.u32 q0, q2, r0 letp lr, .LBB0_1 @ %bb.2: vctp.32 r3 .. which is a lot more tricky because r2 is not only used by the subs, but also by the mov to r3, which is used outside the low-overhead loop by the vctp instruction, and that requires a bit of a different approach, and I will follow up on this. Differential Revision: https://reviews.llvm.org/D71007	2019-12-11 10:20:19 +00:00
Simon Tatham	bd0f271c9e	[ARM][MVE] Add intrinsics for immediate shifts. (reland) This adds the family of `vshlq_n` and `vshrq_n` ACLE intrinsics, which shift every lane of a vector left or right by a compile-time immediate. They mostly work by expanding to the IR `shl`, `lshr` and `ashr` operations, with their second operand being a vector splat of the immediate. There's a fiddly special case, though. ACLE specifies that the immediate in `vshrq_n` can take values up to //and including// the bit size of the vector lane. But LLVM IR thinks that shifting right by the full size of the lane is UB, and feels free to replace the `lshr` with an `undef` half way through the optimization pipeline. Hence, to keep this legal in source code, I have to detect it at codegen time. Logical (unsigned) right shifts by the element size are handled by simply emitting the zero vector; arithmetic ones are converted into a shift of one bit less, which will always give the same output. In order to do that check, I also had to enhance the tablegen MveEmitter so that it can cope with converting a builtin function's operand into a bare integer to pass to a code-generating subfunction. Previously the only bare integers it knew how to handle were flags generated from within `arm_mve.td`. Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard Reviewed By: dmgreen, MarkMurrayARM Subscribers: echristo, hokein, rdhindsa, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71065	2019-12-11 10:10:09 +00:00
QingShan Zhang	46822083ef	[NFC] Correct the example in the comments of JSON.h to avoid mislead user	2019-12-11 10:03:34 +00:00
Florian Hahn	b48b4ed1a0	[MCRegInfo] Add sub_and_superregs_inclusive iterator range. Reviewers: evandro, qcolombet, paquette, MatzeB, arsenm Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D70566	2019-12-11 09:53:19 +00:00
Andrzej Warzynski	1eecbda087	[AArch64][SVE] Move TableGen class definitions for gather loads (NFC) Move 2 intrinsic class definitions so that they're all clustered in one place. Patch submitted to test commit access.	2019-12-11 09:48:48 +00:00
Florian Hahn	11f311875f	[LiveRegUnits] Add phys_regs_and_masks iterator range (NFC). This iterator range just includes physical registers and register masks, which are interesting when dealing with register liveness. Reviewers: evandro, t.p.northover, paquette, MatzeB, arsenm Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D70562	2019-12-11 09:34:42 +00:00
Wang, Pengfei	21bc8631fe	[FPEnv][X86] Constrained FCmp intrinsics enabling on X86 Summary: This is a follow up of D69281, it enables the X86 backend support for the FP comparision. Reviewers: uweigand, kpn, craig.topper, RKSimon, cameron.mcinally, andrew.w.kaylor Subscribers: hiraditya, llvm-commits, annita.zhang, LuoYuanke, LiuChen3 Tags: #llvm Differential Revision: https://reviews.llvm.org/D70582	2019-12-11 08:23:09 +08:00
Vlad Tsyrklevich	636c93ed11	Revert "Reapply: [DebugInfo] Recover debug intrinsics when killing duplicated/empty..." This reverts commit `f2ba93971c`, it was causing build timeouts on sanitizer-x86_64-linux-autoconf such as http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/44917	2019-12-10 16:03:17 -08:00
Sanjay Patel	16e9315685	[IR] allow undefined elements when checking for splat constants This mimics the related call in SDAG. The caller is responsible for ensuring that undef values are propagated safely.	2019-12-10 17:16:59 -05:00
Vedant Kumar	8ddec9ad46	Fix -Wincomplete-umbrella warning in the modules build [281/3666] Building CXX object lib/IR/CMakeFiles/LLVMCore.dir/IntrinsicInst.cpp.o /Users/vsk/src/llvm-project-master/llvm/lib/IR/IntrinsicInst.cpp:155:2: warning: missing submodule 'LLVM_IR.ConstrainedOps' [-Wincomplete-umbrella]	2019-12-10 11:19:17 -08:00
Kevin P. Neal	6515c524b0	[FPEnv] clang support for constrained FP builtins Change the IRBuilder and clang so that constrained FP intrinsics will be emitted for builtins when appropriate. Only non-target-specific builtins are affected in this patch. Differential Revision: https://reviews.llvm.org/D70256	2019-12-10 13:09:12 -05:00
Fangrui Song	83b79f8a18	[VectorUtils] Fix -Wunused-private-field after D67572	2019-12-10 09:36:23 -08:00
Francesco Petrogalli	0be81968a2	[VectorUtils] Introduce the Vector Function Database (VFDatabase). This patch introduced the VFDatabase, the framework proposed in http://lists.llvm.org/pipermail/llvm-dev/2019-June/133484.html. [] In this patch the VFDatabase is used to bridge the TargetLibraryInfo (TLI) calls that were previously used to query for the availability of vector counterparts of scalar functions. The VFISAKind field `ISA` of VFShape have been moved into into VFInfo, under the assumption that different vector ISAs may provide the same vector signature. At the moment, the vectorizer accepts any of the available ISAs as long as the signature provided by the VFDatabase matches the one expected in the vectorization process. For example, when targeting AVX or AVX2, which both have 256-bit registers, the IR signature of the two vector functions associated to the two ISAs is the same. The `getVectorizedFunction` method at the moment returns the first available match. We will need to add more heuristics to the search system to decide which of the available version (TLI, AVX, AVX2, ...) the system should prefer, when multiple versions with the same VFShape are present. Some of the code in this patch is based on the work done by Sumedh Arani in https://reviews.llvm.org/D66025. [] Notice that in the proposal the VFDatabase was called SVFS. The name VFDatabase is more in line with LLVM recommendations for naming classes and variables. Differential Revision: https://reviews.llvm.org/D67572	2019-12-10 16:36:44 +00:00
Mikhail Maltsev	e6d3261c67	[ARM][MVE] Refactor complex vector intrinsics [NFCI] Summary: This patch refactors instruction selection of the complex vector addition, multiplication and multiply-add intrinsics, so that it is now based on TableGen patterns rather than C++ code. It also changes the first parameter (halving vs non-halving) of the arm_mve_vcaddq IR intrinsic to match the corresponding instruction encoding, hence it requires some changes in the tests. The patch addresses David's comment in https://reviews.llvm.org/D71190 Reviewers: dmgreen, ostannard, simon_tatham, MarkMurrayARM Reviewed By: dmgreen Subscribers: merge_guards_bot, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71245	2019-12-10 16:21:52 +00:00
Yonghong Song	d77ae1552f	[DebugInfo] Support to emit debugInfo for extern variables Extern variable usage in BPF is different from traditional pure user space application. Recent discussion in linux bpf mailing list has two use cases where debug info types are required to use extern variables: - extern types are required to have a suitable interface in libbpf (bpf loader) to provide kernel config parameters to bpf programs. https://lore.kernel.org/bpf/CAEf4BzYCNo5GeVGMhp3fhysQ=_axAf=23PtwaZs-yAyafmXC9g@mail.gmail.com/T/#t - extern types are required so kernel bpf verifier can verify program which uses external functions more precisely. This will make later link with actual external function no need to reverify. https://lore.kernel.org/bpf/87eez4odqp.fsf@toke.dk/T/#m8d5c3e87ffe7f2764e02d722cb0d8cbc136880ed This patch added clang support to emit debuginfo for extern variables with a TargetInfo hook to enable it. The debuginfo for the extern variable is emitted only if that extern variable is referenced in the current compilation unit. Currently, only BPF target enables to generate debug info for extern variables. The emission of such debuginfo is disabled for C++ at this moment since BPF only supports a subset of C language. Emission with C++ can be enabled later if an appropriate use case is identified. -fstandalone-debug permits us to see more debuginfo with the cost of bloated binary size. This patch did not add emission of extern variable debug info with -fstandalone-debug. This can be re-evaluated if there is a real need. Differential Revision: https://reviews.llvm.org/D70696	2019-12-10 08:09:51 -08:00
Guillaume Chatelet	1b2842bf90	[Alignment][NFC] CreateMemSet use MaybeAlign Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71213	2019-12-10 15:17:44 +01:00
stozer	f2ba93971c	Reapply: [DebugInfo] Recover debug intrinsics when killing duplicated/empty... basic blocks Originally applied in `72ce759928`. Fixed a build failure caused by incorrect use of cast instead of dyn_cast. This reverts commit `8b0780f795`.	2019-12-10 13:33:32 +00:00
Kiran Chandramohan	965ed1e974	[AArch64] Fix issues with large arrays on stack Summary: This patch fixes a few issues when large arrays are allocated on the stack. Currently, clang has inconsistent behaviour, for debug builds there is an assertion failure when the array size on stack is around 2GB but there is no assertion when the stack is around 8GB. For release builds there is no assertion, the compilation succeeds but generates incorrect code. The incorrect code generated is due to using int/unsigned int instead of their 64-bit counterparts. This patch, 1) Removes the assertion in frame legality check. 2) Converts int/unsigned int in some places to the 64-bit variants. This helps in generating correct code and removes the inconsistent behaviour. 3) Adds a test which runs without optimisations. Reviewers: sdesmalen, efriedma, fhahn, aemerson Reviewed By: efriedma Subscribers: eli.friedman, fpetrogalli, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70496	2019-12-10 11:44:41 +00:00
Johannes Doerfert	eb3e81f43f	[OpenMP][NFCI] Introduce llvm/IR/OpenMPConstants.h Summary: The new OpenMPConstants.h is a location for all OpenMP related constants (and helpers) to live. This patch moves the directives there (the enum OpenMPDirectiveKind) and rewires Clang to use the new location. Initially part of D69785. Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim Subscribers: jholewinski, ppenzin, penzn, llvm-commits, cfe-commits, jfb, guansong, bollu, hiraditya, mgorny Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D69853	2019-12-10 00:10:09 -06:00
Fangrui Song	9574757dba	[MC] Delete MCCodePadder D34393 added MCCodePadder as an infrastructure for padding code with NOP instructions. It lacked tests and was not being worked on since then. Intel has now worked on an assembler patch to mitigate performance loss after applying microcode update for the Jump Conditional Code Erratum. https://www.intel.com/content/www/us/en/support/articles/000055650/processors.html This new patch shares similarity with MCCodePadder, but has a concrete use case in mind and is being actively developed. The infrastructure it introduces can potentially be used for general performance improvement via alignment. Delete the unused MCCodePadder so that people can develop the new feature from a clean state. Reviewed By: jyknight, skan Differential Revision: https://reviews.llvm.org/D71106	2019-12-09 19:21:31 -08:00
Eric Christopher	9c6b7f68b8	Revert "[ARM][MVE] Add intrinsics for immediate shifts." and two follow-on commits: one warning fix and one functionality. As it's breaking at least the lto bot: http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/15132/steps/test-stage1-compiler/logs/stdio This reverts commits: `8d70f3c933` `ff4dceef92` `d97b3e3e65`	2019-12-09 16:47:38 -08:00
Christian Sigg	d5acc83a3a	Implement LWG#1203 for raw_ostream. Implement LWG#1203 (https://cplusplus.github.io/LWG/issue1203) for raw_ostream like libc++ does for std::basic_ostream<...>. Add a operator<< overload that takes an rvalue reference of a typed derived from raw_ostream, streams the value to it and returns the stream of the same type as the argument. This allows free operator<< to work with rvalue reference raw_ostreams: raw_ostream& operator<<(raw_ostream&, const SomeType& Value); raw_os_ostream(std::cout) << SomeType(); It also allows using the derived type like: auto Foo = (raw_string_ostream(buffer) << "foo").str(); Author: Christian Sigg <csigg@google.com> Differential Revision: https://reviews.llvm.org/D70686	2019-12-09 14:00:53 -08:00
Hiroshi Yamauchi	d9ae493937	[PGO][PGSO] Instrument the code gen / target passes. Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). A second try after reverted D71072. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71149	2019-12-09 12:42:59 -08:00
Mark Murray	fc3417cb5a	[ARM][MVE][Intrinsics] Add VQADDQ, VHADDQ, VRHADDQ, VQSUBQ, VHSUBQ, VQDMULHQ, VQRDMULHQ intrinsics. Summary: Add VQADDQ, VHADDQ, VRHADDQ, VQSUBQ, VHSUBQ, VQDMULHQ, VQRDMULHQ intrinsics and unit tests. Reviewers: simon_tatham, ostannard, dmgreen, miyuki Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71198	2019-12-09 17:41:47 +00:00
Mark Murray	2eb61fa5d6	[ARM][MVE][Intrinsics] Add VMULL[BT]Q_(INT\|POLY) intrinsics. Summary: Add VMULL[BT]Q_(INT\|POLY) intrinsics and unit tests. Reviewers: simon_tatham, ostannard, dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71066	2019-12-09 17:41:47 +00:00
Simon Tatham	d97b3e3e65	[ARM][MVE] Add intrinsics for immediate shifts. Summary: This adds the family of `vshlq_n` and `vshrq_n` ACLE intrinsics, which shift every lane of a vector left or right by a compile-time immediate. They mostly work by expanding to the IR `shl`, `lshr` and `ashr` operations, with their second operand being a vector splat of the immediate. There's a fiddly special case, though. ACLE specifies that the immediate in `vshrq_n` can take values up to //and including// the bit size of the vector lane. But LLVM IR thinks that shifting right by the full size of the lane is UB, and feels free to replace the `lshr` with an `undef` half way through the optimization pipeline. Hence, to keep this legal in source code, I have to detect it at codegen time. Logical (unsigned) right shifts by the element size are handled by simply emitting the zero vector; arithmetic ones are converted into a shift of one bit less, which will always give the same output. In order to do that check, I also had to enhance the tablegen MveEmitter so that it can cope with converting a builtin function's operand into a bare integer to pass to a code-generating subfunction. Previously the only bare integers it knew how to handle were flags generated from within `arm_mve.td`. Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard Reviewed By: MarkMurrayARM Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71065	2019-12-09 15:44:09 +00:00
Jeremy Morse	00e238896c	[DebugInfo] Nerf placeDbgValues, with prejudice CodeGenPrepare::placeDebugValues moves variable location intrinsics to be immediately after the Value they refer to. This makes tracking of locations very easy; but it changes the order in which assignments appear to the debugger, from the source programs order to the order in which the optimised program computes values. This then leads to PR43986 and PR38754, where variable locations that were in a conditional block are made unconditional, which is highly misleading. This patch adjusts placeDbgValues to only re-order variable location intrinsics if they use a Value before it is defined, significantly reducing the damage that it does. This is still not 100% safe, but the rest of CodeGenPrepare needs polishing to correctly update debug info when optimisations are performed to fully fix this. This will probably break downstream debuginfo tests -- if the instruction-stream position of variable location changes isn't the focus of the test, an easy fix should be to manually apply placeDbgValues' behaviour to the failing tests, moving dbg.value intrinsics next to SSA variable definitions thus: %foo = inst1 %bar = ... %baz = ... void call @llvm.dbg.value(metadata i32 %foo, ... to %foo = inst1 void call @llvm.dbg.value(metadata i32 %foo, ... %bar = ... %baz = ... This should return your test to exercising whatever it was testing before. Differential Revision: https://reviews.llvm.org/D58453	2019-12-09 12:52:10 +00:00
Mikhail Maltsev	0d1490bf6a	[ARM][MVE] Add complex vector intrinsics Summary: This patch adds intrinsics for the following MVE instructions: * VCADD, VHCADD * VCMUL * VCMLA Each of the above 3 groups has a corresponding new LLVM IR intrinsic. Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen Reviewed By: MarkMurrayARM Subscribers: merge_guards_bot, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71190	2019-12-09 12:05:59 +00:00
David Green	4a6e13ad88	[CommandLine] Add missing Callbacks It appears that the cl::bits options are not used anywhere in-tree. In the recent addition to add Callback's to the options, the Callback was missing from this one. This fixes it by adding the same code from the other classes. It also adds a simple test, of sorts, just to make sure these continue compiling.	2019-12-09 11:37:34 +00:00
David Green	be7a107070	[ARM] Teach the Arm cost model that a Shift can be folded into other instructions This attempts to teach the cost model in Arm that code such as: %s = shl i32 %a, 3 %a = and i32 %s, %b Can under Arm or Thumb2 become: and r0, r1, r2, lsl #3 So the cost of the shift can essentially be free. To do this without trying to artificially adjust the cost of the "and" instruction, it needs to get the users of the shl and check if they are a type of instruction that the shift can be folded into. And so it needs to have access to the actual instruction in getArithmeticInstrCost, which if available is added as an extra parameter much like getCastInstrCost. We otherwise limit it to shifts with a single user, which should hopefully handle most of the cases. The list of instruction that the shift can be folded into include ADC, ADD, AND, BIC, CMP, EOR, MVN, ORR, ORN, RSB, SBC and SUB. This translates to Add, Sub, And, Or, Xor and ICmp. Differential Revision: https://reviews.llvm.org/D70966	2019-12-09 10:24:33 +00:00
David Stenberg	6965f835b4	[DebugInfo] Make describeLoadedValue() reg aware Summary: Currently the describeLoadedValue() hook is assumed to describe the value of the instruction's first explicit define. The hook will not be called for instructions with more than one explicit define. This commit adds a register parameter to the describeLoadedValue() hook, and invokes the hook for all registers in the worklist. This will allow us to for example describe instructions which produce more than two parameters' values; e.g. Hexagon's various combine instructions. This also fixes situations in our downstream target where we may pass smaller parameters in the high part of a register. If such a parameter's value is produced by a larger copy instruction, we can't describe the call site value using the super-register, and we instead need to know which sub-register that should be used. This also allows us to handle cases like this: $ebx = [...] $rdi = MOVSX64rr32 $ebx $esi = MOV32rr $edi CALL64pcrel32 @call The hook will first be invoked for the MOV32rr instruction, which will say that @call's second parameter (passed in $esi) is described by $edi. As $edi is not preserved it will be added to the worklist. When we get to the MOVSX64rr32 instruction, we need to describe two values; the sign-extended value of $ebx -> $rdi for the first parameter, and $ebx -> $edi for the second parameter, which is now possible. This commit modifies the dbgcall-site-lea-interpretation.mir test case. In the test case, the values of some 32-bit parameters were produced with LEA64r. Perhaps we can in general cases handle such by emitting expressions that AND out the lower 32-bits, but I have not been able to land in a case where a LEA64r is used for a 32-bit parameter instead of LEA64_32 from C code. I have not found a case where it would be useful to describe parameters using implicit defines, so in this patch the hook is still only invoked for explicit defines of forwarding registers. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: djtodoro, vsk Subscribers: ormris, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D70431	2019-12-09 10:47:49 +01:00
David Stenberg	f3696533f2	Revert "[DebugInfo] Make describeLoadedValue() reg aware" This reverts commit `3cd93a4efc`. I'll recommit with a well-formatted arcanist commit message.	2019-12-09 10:45:13 +01:00
David Stenberg	3cd93a4efc	[DebugInfo] Make describeLoadedValue() reg aware Currently the describeLoadedValue() hook is assumed to describe the value of the instruction's first explicit define. The hook will not be called for instructions with more than one explicit define. This commit adds a register parameter to the describeLoadedValue() hook, and invokes the hook for all registers in the worklist. This will allow us to for example describe instructions which produce more than two parameters' values; e.g. Hexagon's various combine instructions. This also fixes a case in our downstream target where we may pass smaller parameters in the high part of a register. If such a parameter's value is produced by a larger copy instruction, we can't describe the call site value using the super-register, and we instead need to know which sub-register that should be used. This also allows us to handle cases like this: $ebx = [...] $rdi = MOVSX64rr32 $ebx $esi = MOV32rr $edi CALL64pcrel32 @call The hook will first be invoked for the MOV32rr instruction, which will say that @call's second parameter (passed in $esi) is described by $edi. As $edi is not preserved it will be added to the worklist. When we get to the MOVSX64rr32 instruction, we need to describe two values; the sign-extended value of $ebx -> $rdi for the first parameter, and $ebx -> $edi for the second parameter, which is now possible. This commit modifies the dbgcall-site-lea-interpretation.mir test case. In the test case, the values of some 32-bit parameters were produced with LEA64r. Perhaps we can in general cases handle such by emitting expressions that AND out the lower 32-bits, but I have not been able to land in a case where a LEA64r is used for a 32-bit parameter instead of LEA64_32 from C code. I have not found a case where it would be useful to describe parameters using implicit defines, so in this patch the hook is still only invoked for explicit defines of forwarding registers.	2019-12-09 10:44:17 +01:00
Ulrich Weigand	9db13b5a7d	[FPEnv] Constrained FCmp intrinsics This adds support for constrained floating-point comparison intrinsics. Specifically, we add: declare <ty2> @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>, metadata <condition code>, metadata <exception behavior>) declare <ty2> @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>, metadata <condition code>, metadata <exception behavior>) The first variant implements an IEEE "quiet" comparison (i.e. we only get an invalid FP exception if either argument is a SNaN), while the second variant implements an IEEE "signaling" comparison (i.e. we get an invalid FP exception if either argument is any NaN). The condition code is implemented as a metadata string. The same set of predicates as for the fcmp instruction is supported (except for the "true" and "false" predicates). These new intrinsics are mapped by SelectionDAG codegen onto two new ISD opcodes, ISD::STRICT_FSETCC and ISD::STRICT_FSETCCS, again representing quiet vs. signaling comparison operations. Otherwise those nodes look like SETCC nodes, with an additional chain argument and result as usual for strict FP nodes. The patch includes support for the common legalization operations for those nodes. The patch also includes full SystemZ back-end support for the new ISD nodes, mapping them to all available SystemZ instruction to fully implement strict semantics (scalar and vector). Differential Revision: https://reviews.llvm.org/D69281	2019-12-07 11:28:39 +01:00
Don Hinton	6555995a6d	[CommandLine] Add callbacks to Options Summary: Add a new cl::callback attribute to Option. This attribute specifies a callback function that is called when an option is seen, and can be used to set other options, as in option A implies option B. If the option is a `cl::list`, and `cl::CommaSeparated` is also specified, the callback will fire once for each value. This could be used to validate combinations or selectively set other options. Reviewers: beanz, thomasfinch, MaskRay, thopre, serge-sans-paille Reviewed By: beanz Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70620	2019-12-06 15:16:45 -08:00
Sam Clegg	b4f4e370b5	[WebAssebmly][MC] Support .import_name/.import_field asm directives Convert the MC test to use asm rather than bitcode. This is a precursor to https://reviews.llvm.org/D70520. Differential Revision: https://reviews.llvm.org/D70877	2019-12-06 15:09:56 -08:00
Reid Kleckner	1d9291cc78	[MC] Rewrite tablegen for printInstrAlias to comiple faster, NFC Before this change, the *InstPrinter.cpp files of each target where some of the slowest objects to compile in all of LLVM. See this snippet produced by ClangBuildAnalyzer: https://reviews.llvm.org/P8171$96 Search for "InstPrinter", and see that it shows up in a few places. Tablegen was emitting a large switch containing a sequence of operand checks, each of which created many conditions and many BBs. Register allocation and jump threading both did not scale well with such a large repetitive sequence of basic blocks. So, this change essentially turns those control flow structures into data. The previous structure looked like: switch (Opc) { case TGT::ADD: // check alias 1 if (MI->getOperandCount() == N && // check num opnds MI->getOperand(0).isReg() && // check opnd 0 ... MI->getOperand(1).isImm() && // check opnd 1 AsmString = "foo"; break; } // check alias 2 if (...) ... return false; The new structure looks like: OpToPatterns: Sorted table of opcodes mapping to pattern indices. \-> Patterns: List of patterns. Previous table points to subrange of patterns to match. \-> Conds: The if conditions above encoded as a kind and 32-bit value. See MCInstPrinter.cpp for the details of how the new data structures are interpreted. Here are some before and after metrics. Time to compile AArch64InstPrinter.cpp: 0m29.062s vs. 0m2.203s size of the obj: 3.9M vs. 676K size of clang.exe: 97M vs. 96M I have not benchmarked disassembly performance, but typically disassemblers are bottlenecked on IO and string processing, not alias matching, so I'm not sure it's interesting enough to be worth doing. Reviewers: RKSimon, andreadb, xbolva00, craig.topper Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D70650	2019-12-06 15:00:18 -08:00
Hiroshi Yamauchi	2eb30fafa5	Revert "[PGO][PGSO] Instrument the code gen / target passes." This reverts commit `9a0b5e1407`. This seems to break buildbots.	2019-12-06 12:17:32 -08:00
Hiroshi Yamauchi	9a0b5e1407	[PGO][PGSO] Instrument the code gen / target passes. Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71072	2019-12-06 10:43:39 -08:00
Cullen Rhodes	bb8c679f4b	[AArch64][SVE] Implement integer compare intrinsics Summary: Adds intrinsics for the following: * cmphs, cmphi * cmpge, cmpgt * cmpeq, cmpne * cmplt, cmple * cmplo, cmpls Includes a minor change to `TLI.getMemValueType` that fixes a crash due to the scalable flag being dropped. Reviewers: sdesmalen, efriedma, rengolin, rovka, dancgr, huntergr Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70889	2019-12-06 10:39:06 +00:00
Alexey Lapshin	9e8c799e2b	[Dsymutil][NFC] Move NonRelocatableStringpool into common CodeGen folder. That refactoring moves NonRelocatableStringpool into common CodeGen folder. So that NonRelocatableStringpool could be used not only inside dsymutil. Differential Revision: https://reviews.llvm.org/D71068	2019-12-06 10:02:27 +03:00
Lang Hames	72db78eba5	[JITLink] Use Blocks rather than Symbols for SectionRange. This ensures that anonymous blocks are included in the section range.	2019-12-05 20:19:17 -08:00
Lang Hames	8c4f048a00	[JITLink] Remove the Section::symbols_empty() method. llvm::empty(Sec.symbols()) can be used instead.	2019-12-05 20:19:17 -08:00
Greg Clayton	aeda128a96	Add lookup functions for efficient lookups of addresses when using GsymReader classes. Summary: Lookup functions are designed to not fully decode a FunctionInfo, LineTable or InlineInfo, they decode only what is needed into a LookupResult object. This allows lookups to avoid costly memory allocations and avoid parsing large amounts of information one a suitable match is found. LookupResult objects contain the address that was looked up, the concrete function address range, the name of the concrete function, and a list of source locations. One for each inline function, and one for the concrete function. This allows one address to turn into multiple frames and improves the signal you get when symbolicating addresses in GSYM files. Reviewers: labath, aprantl Subscribers: mgorny, hiraditya, llvm-commits, lldb-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70993	2019-12-05 16:49:53 -08:00
Wenlei He	e503fd85d3	[AutoFDO] Properly merge context-sensitive profile of inlinee back to outlined function Summary: When sample profile loader decides not to inline a previously inlined call-site, we adjust the profile of outlined function simply by scaling up its profile counts by call-site count. This means the context-sensitive profile of that inlined instance will be thrown away. This commit try to keep context-sensitive profile for such cases: - Instead of scaling outlined function's profile, we now properly merge the FunctionSamples of inlined instance into outlined function, including all recursively inlined profile. - Instead of adjusting the profile for negative inline decision at the end of the sample profile loader pass, we do the profile merge right after processing each function. This change paired with top-down ordering of annotation/inline-replay (a separate diff) will make sure we recursively merge profile back before the profile is used for annotation and inline replay. A new switch -sample-profile-merge-inlinee is added to enable the new profile merge for tuning. It should be the default behavior eventually. Reviewers: wmi, davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70653	2019-12-05 15:57:55 -08:00
Fangrui Song	7faa844044	[IR] Move ctor in the NDEBUG branch	2019-12-05 15:23:11 -08:00
Fangrui Song	b98f3ce33c	[IR] Add a default copy constructor for -Wdeprecated-copy	2019-12-05 15:00:30 -08:00
Volkan Keles	bfa3d260b8	[GlobalISel] Localizer: Allow targets not to run the pass conditionally Summary: Previously, it was not possible to skip running the localizer pass conditionally. This patch adds an input function to the pass which decides if the pass should run on the given MachineFunction or not. No test case as there is no upstream target needs this functionality. Reviewers: qcolombet Reviewed By: qcolombet Subscribers: rovka, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71038	2019-12-05 11:09:50 -08:00
Danilo Carvalho Grael	b29916cec3	[AArch64][SVE] Integer reduction instructions pattern/intrinsics. Added pattern matching/intrinsics for the following SVE instructions: -- saddv, uaddv -- smaxv, sminv, umaxv, uminv -- orv, eorv, andv	2019-12-05 09:59:19 -05:00

1 2 3 4 5 ...

38856 Commits