llvm-project

Commit Graph

Author	SHA1	Message	Date
Rob Suderman	e708471395	[mlir][NFC] Cleanup AffineOps directory structure Summary: Change AffineOps Dialect structure to better group both IR and Tranforms. This included extracting transforms directly related to AffineOps. Also move AffineOps to Affine. Differential Revision: https://reviews.llvm.org/D76161	2020-03-20 14:23:43 -07:00
Richard Smith	dc4259d5a3	[c++20] Further extend the set of comparisons broken by C++20 that we accept as an extension. This attempts to accept the same cases a GCC, plus cases where a comparison is rewritten to an operator== with an integral but non-bool return type; this is sufficient to avoid most problems with various major open-source projects (such as ICU) and appears to fix all but one of the comparison-related C++20 build breaks in LLVM. This approach is being pursued for standardization.	2020-03-20 14:22:48 -07:00
Adrian Prantl	97f490d87b	Don't set the isOptimized flag in module skeleton DICompileUnits. It's not used for anything.	2020-03-20 14:18:15 -07:00
Adrian Prantl	079c6ddaf5	Correctly initialize the DW_AT_comp_dir attribute of Clang module skeleton CUs Before this patch a Clang module skeleton CU would have a DW_AT_comp_dir pointing to the directory of the module map file, and this information was not used by anyone. Even worse, LLDB actually resolves relative DWO paths by appending it to DW_AT_comp_dir. This patch sets it to the same directory that is used as the main CU's compilation directory, which would make the LLDB code work. Differential Revision: https://reviews.llvm.org/D76377	2020-03-20 14:18:14 -07:00
Nikita Popov	417d69595f	[InstSimplify] Reorder checks to be more efficient; NFC First check whether the RHS is a null pointer, and only then perform a potentially expensive non-zero query.	2020-03-20 22:05:38 +01:00
Davide Italiano	696ae6f7d8	[StopHook] Use wildcard matching. Pointed out by Jim Ingham.	2020-03-20 13:57:40 -07:00
Davide Italiano	6385c2ab8f	[AppleObjCRuntimeV2] Force lazily allocated class names to be resolved. Fixes a couple of tests on new versions of the Obj-C runtime.	2020-03-20 13:43:08 -07:00
Nicolas Vasilache	40fc80a023	[mlir][AVX512] Hotfix - Fix CMake Doc generation dependence	2020-03-20 16:32:18 -04:00
Pirama Arumuga Nainar	fe5599eac6	[llvm-ar] Use target triple to deduce archive kind for bitcode inputs Summary: When using full LTO on cross-compile settings, instead of generating the default archive kind of the host platform, we could deduce the archive kind based on the target triple. This specifically addresses https://github.com/android/ndk/issues/1209 by making it possible to drop llvm-ar in place of GNU ar without extra flags. Reviewers: compnerd, pcc, srhines, danalbert Subscribers: hiraditya, MaskRay, steven_wu, dexonsmith, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76461	2020-03-20 13:19:44 -07:00
Ahmed Taei	08a9147349	[mlir][LLVMIR] Fix fusion for rank-0 tensors Summary: This diff fixes fusion craching for ops with rank-0 tensors Reviewers: mravishankar, nicolasvasilache, rriddle! Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76479	2020-03-20 13:17:19 -07:00
Hanhan Wang	be4e9db579	[mlir][Linalg] NFC: Clean up for 0-D abstraction. Summary: After D75831 has been landed, both the generic op and indexed_generic op can handle 0-D edge case. In the previous patch, only generic op has been updated. This patch updates the lowering to loops for indexed_generic op. Since they are almost the sanme, the patch also refactors the common part. Differential Revision: https://reviews.llvm.org/D76413	2020-03-20 13:07:14 -07:00
Nikita Popov	2b52e4e629	[InstCombine] Remove known bits constant folding If ExpensiveCombines is enabled (which is the case with -O3 on the legacy PM and always on the new PM), InstCombine tries to compute the known bits of all instructions in the hope that all bits end up being known, which is fairly expensive. How effective is it? If we add some statistics on how often the constant folding succeeds and how many KnownBits calculations are performed and run test-suite we get: "instcombine.NumConstPropKnownBits": 642, "instcombine.NumConstPropKnownBitsComputed": 18744965, In other words, we get one fold for every 30000 KnownBits calculations. However, the truth is actually much worse: Currently, known bits are computed before performing other folds, so there is a high chance that cases that get folded by known bits would also have been handled by other folds. What happens if we compute known bits after all other folds (hacky implementation: https://gist.github.com/nikic/751f25b3b9d9e0860db5dde934f70f46)? "instcombine.NumConstPropKnownBits": 0, "instcombine.NumConstPropKnownBitsComputed": 18105547, So it turns out despite doing 18 million known bits calculations, the known bits fold does not do anything useful on test-suite. I was originally planning to move this into AggressiveInstCombine so it only runs once in the pipeline, but seeing this, I think we're better off removing it entirely. As this is the only use of the "expensive combines" mechanism, it may be removed afterwards, but I'll leave that to a separate patch. Differential Revision: https://reviews.llvm.org/D75801	2020-03-20 20:54:06 +01:00
Alexey Bataev	9b95929a26	[OPENMP50]Do not allow several scan directives in the same parent region. According to OpenMP 5.0, exactly one scan directive must appear in the loop body of an enclosing worksharing-loop, worksharing-loop SIMD, or simd construct on which a reduction clause with the inscan modifier is present.	2020-03-20 15:45:31 -04:00
Vedant Kumar	7ec2444880	unittest: Work around build failure on MSVC builders MSVC insists on using the deleted move constructor instead of the copy constructor: http://lab.llvm.org:8011/builders/lld-x86_64-win7/builds/41203 C:\ps4-buildslave2\lld-x86_64-win7\llvm-project\llvm\unittests\ADT\CoalescingBitVectorTest.cpp(193): error C2280: 'llvm::CoalescingBitVector<unsigned int,16>::CoalescingBitVector(llvm::CoalescingBitVector<unsigned int,16> &&)': attempting to reference a deleted function	2020-03-20 12:38:00 -07:00
Vedant Kumar	a245943355	[LiveDebugValues] Speed up collectIDsForRegs, NFC Use the advanceToLowerBound operation available on CoalescingBitVector iterators to speed up collection of variables which reside within some set of registers. The speedup comes from avoiding repeated top-down traversals in IntervalMap::find. The linear scan forward from one register interval to the next is unlikely to be as expensive as a full IntervalMap search starting from the root. This reduces time spent in LiveDebugValues when compiling sqlite3 by 200ms (about 0.1% - 0.2% of the total User Time). Depends on D76466. rdar://60046261 Differential Revision: https://reviews.llvm.org/D76467	2020-03-20 12:18:26 -07:00
Vedant Kumar	a3fd1a1c74	[ADT] CoalescingBitVector: Add advanceToLowerBound iterator operation advanceToLowerBound moves an iterator to the first bit set at, or after, the given index. This can be faster than doing IntervalMap::find. rdar://60046261 Differential Revision: https://reviews.llvm.org/D76466	2020-03-20 12:18:26 -07:00
Vedant Kumar	4716ebb823	[ADT] CoalescingBitVector: Avoid initial heap allocation, NFC Avoid making a heap allocation when constructing a CoalescingBitVector. This reduces time spent in LiveDebugValues when compiling sqlite3 by 700ms (0.5% of the total User Time). rdar://60046261 Differential Revision: https://reviews.llvm.org/D76465	2020-03-20 12:18:25 -07:00
Siva Chandra Reddy	25294708f5	[libc] NFC - Move the round redirector from its own nested directory. We have moved away from directories for entrypoints but this function was not moved out. Submitting as obvious.	2020-03-20 12:15:27 -07:00
Louis Dionne	7efbd851ad	[libc++] Add a new FILE_DEPENDENCIES parser Instead of considering all the .dat files to be dependencies of a test, only consider those that are listed in FILE_DEPENDENCIES.	2020-03-20 14:55:52 -04:00
Alexey Bataev	06dea73307	[OPENMP50]Initial support for inclusive clause. Added parsing/sema/serialization support for inclusive clause in scan directive.	2020-03-20 14:20:38 -04:00
Fangrui Song	7899fe9da8	[X86] Reland D71360 Clean up UseInitArray initialization for X86ELFTargetObjectFile UseInitArray is now the CC1 default but TargetLoweringObjectFileELF::UseInitArray still defaults to false. The following two unknown OS target triples continue using .ctors/.dtors because InitializeELF is not called. clang -target i386 -c a.c clang -target x86_64 -c a.c This cleanup fixes this as a bonus. Differential Revision: https://reviews.llvm.org/D71360	2020-03-20 11:18:36 -07:00
Fangrui Song	fe5937cb33	[llc] Initialize TargetLoweringObjectFile for MIR input MIRParser uses MC and transitively calls MCObjectFileInfo::getObjectFileType(). TargetLoweringObjectFile::Initialize should be called beforehand to initialize MCObjectFileInfo::Env. This manifested as a -fsanitize=undefined test/CodeGen/MIR/X86/instr-symbols-and-mcsymbol-operands.mir failure when D71360/aa5ee8f244441a8ea103a7e0ed8b6f3e74454516 was committed.	2020-03-20 11:18:36 -07:00
Vedant Kumar	636665331b	PR45181: Fix another invalid DIExpression combination The original test case from PR45181 triggers a DIExpression combination that wasn't fixed in D76164.	2020-03-20 11:18:05 -07:00
Nicolas Vasilache	462db62053	[mlir][AVX512] Start a primitive AVX512 dialect The Vector Dialect [document](https://mlir.llvm.org/docs/Dialects/Vector/) discusses the vector abstractions that MLIR supports and the various tradeoffs involved. One of the layer that is missing in OSS atm is the Hardware Vector Ops (HWV) level. This revision proposes an AVX512-specific to add a new Dialect/Targets/AVX512 Dialect that would directly target AVX512-specific intrinsics. Atm, we rely too much on LLVM’s peephole optimizer to do a good job from small insertelement/extractelement/shufflevector. In the future, when possible, generic abstractions such as VP intrinsics should be preferred. The revision will allow trading off HW-specific vs generic abstractions in MLIR. Differential Revision: https://reviews.llvm.org/D75987	2020-03-20 14:11:57 -04:00
Adrian Prantl	18e8f27ad8	Add missing module map entry	2020-03-20 11:11:27 -07:00
Sterling Augustine	5de4ba1770	Cleanup the plumbing for DILineInfoSpecifier. [NFC - Try 2]	2020-03-20 10:29:57 -07:00
Nikita Popov	3205d1a860	[InstCombine] Handle known shl nsw sign bit in SimplifyDemanded Ideally SimplifyDemanded should compute the same known bits as computeKnownBits(). This patch addresses one discrepancy, where ValueTracking is more powerful: If we have a shl nsw shift, we know that the sign bit of the input and output must be the same. If this results in a conflict, the result is poison. This is implemented in `2c4ca6832f/lib/Analysis/ValueTracking.cpp (L1175-L1179)` and `2c4ca6832f/lib/Analysis/ValueTracking.cpp (L904-L908)`. This implements the same basic logic in SimplifyDemanded. It's slightly stronger, because I return undef instead of zero for the poison case (which is not an option inside ValueTracking). As mentioned in https://reviews.llvm.org/D75801#inline-698484, we could detect poison in more cases, this just establishes parity with the existing logic. Differential Revision: https://reviews.llvm.org/D76489	2020-03-20 18:16:05 +01:00
Anton Kolesov	0b18b568e9	[lldb-vscode] Don't use SBLaunchInfo in request_attach If LLDB attaches to an already running target, then structure SBAttachInfo is used instead of SBLaunchInfo. lldb-vscode function request_attach sets some values to g_vsc.launch_info, however this field is then not passed anywhere, so this action has no effect. This commit removes invocation of SBLaunchInfo::SetDetachOnError, which has no equivalent in SBAttachInfo. File package.json doesn't describe detachOnError property for "attach" request type, therefore it is not needed to update it. Differential Revision: https://reviews.llvm.org/D76351	2020-03-20 20:15:23 +03:00
Feng Liu	942afe0cb2	[mlir/quant] fix a small typo in the quant utility This is an edge case where the input type is a primitive type. Differential Revision: https://reviews.llvm.org/D76442	2020-03-20 10:12:53 -07:00
Gabor Marton	f59bb40e36	Attempt to fix failing build-bot with [-Werror,-Wcovered-switch-default]	2020-03-20 18:04:55 +01:00
Gabor Marton	ededa65d55	[analyzer] StdLibraryFunctionsChecker: Add NotNull Arg Constraint Reviewers: NoQ, Szelethus, balazske, gamesh411, baloghadamsoftware, steakhal Subscribers: whisperity, xazax.hun, szepet, rnkovacs, a.sidorin, mikhail.ramalho, donat.nagy, dkrupp, Charusso, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D75063	2020-03-20 17:34:29 +01:00
Pirama Arumuga Nainar	edcfb47ff6	[DAGCombiner] Do not fold truncate(build_vector(..)) if it creates an illegal type Summary: It can be the case that a vector type is legal but the corresponding scalar type is not legal for an architecture (i8 vs. v16i8 on AArch64). Check if the scalar type created when folding truncate(build_vector(x,y)) -> build_vector(truncate(x),truncate(y)) is legal if we are running after the type legalizer. This fixes https://github.com/android/ndk/issues/1207. Reviewers: RKSimon, srhines Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76312	2020-03-20 09:20:16 -07:00
Sean Fertile	56122fcd64	[PowerPC][AIX][NFC] Extend the test coverage of ByVal args. Adds/changes some types in the ByVal cc test so that they aren't all structs of arrays of bytes, and adds testing for passing multiple ByVal arguments.	2020-03-20 12:19:08 -04:00
Craig Topper	32fbea1548	[X86] Prevent (bitcast (broadcast_load)) combine from producing vXf16 broadcast instructions. The combine tries to put the broadcast in either the integer or fp domain to match the bitcast domain. But we can only do this if the broadcast size is 32 or larger.	2020-03-20 09:15:07 -07:00
Reid Kleckner	ce5173c0e1	Use FinishThunk to finish musttail thunks FinishThunk, and the invariant of setting and then unsetting CurCodeDecl, was added in `7f416cc426` (2015). The invariant didn't exist when I added this musttail codepath in `ab2090d107` (2014). Recently in `28328c3771`, I started using this codepath on non-Windows platforms, and users reported problems during release testing (PR44987). The issue was already present for users of EH on i686-windows-msvc, so I added a test for that case as well. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D76444	2020-03-20 09:02:21 -07:00
Erich Keane	ffcc076a2b	[[Clang CallGraph]] CallGraph should still record calls to decls. Discovered by a downstream user, we found that the CallGraph ignores callees unless they are defined. This seems foolish, and prevents combining the report with other reports to create unified reports. Additionally, declarations contain information that is likely useful to consumers of the CallGraph. This patch implements this by splitting the includeInGraph function into two versions, the current one plus one that is for callees only. The only difference currently is that includeInGraph checks for a body, then calls includeCalleeInGraph. Differential Revision: https://reviews.llvm.org/D76435	2020-03-20 08:55:23 -07:00
Simon Pilgrim	34659de5fd	[InstCombine][X86] simplifyX86immShift - convert variable in-range vector shift by scalar amounts to generic shifts (PR40391) The sll/srl/sra scalar vector shifts can be replaced with generic shifts if the shift amount is known to be in range. This also required public DemandedElts variants of llvm::computeKnownBits to be exposed (PR36319).	2020-03-20 15:48:06 +00:00
Simon Tatham	1adfa4c991	[ARM,MVE] Add ACLE intrinsics for the vaddv/vaddlv family. Summary: I've implemented them as target-specific IR intrinsics rather than using `@llvm.experimental.vector.reduce.add`, on the grounds that the 'experimental' intrinsic doesn't currently have much code generation benefit, and my replacements encapsulate the sign- or zero-extension so that you don't expose the illegal MVE vector type (`<4 x i64>`) in IR. The machine instructions come in two versions: with and without an input accumulator. My new IR intrinsics, like the 'experimental' one, don't take an accumulator parameter: we represent that by just adding on the input value using an ordinary i32 or i64 add. So if you write the `vaddvaq` C-language intrinsic with an input accumulator of zero, it can be optimised to VADDV, and conversely, if you write something like `x += vaddvq(y)` then that can be combined into VADDVA. Most of this is achieved in isel lowering, by converting these IR intrinsics into the existing `ARMISD::VADDV` family of custom SDNode types. For the difficult case (64-bit accumulators), isel lowering already implements the optimization of folding an addition into a VADDLV to make a VADDLVA; so once we've made a VADDLV, our job is already done, except that I had to introduce a parallel set of ARMISD nodes for the //predicated// forms of VADDLV. For the simpler VADDV, we handle the predicated form by just leaving the IR intrinsic alone and matching it in an ordinary dag pattern. Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76491	2020-03-20 15:42:33 +00:00
Simon Tatham	45a9945b9e	[ARM,MVE] Add ACLE intrinsics for the vminv/vmaxv family. Summary: I've implemented these as target-specific IR intrinsics, because they're not //quite// enough like @llvm.experimental.vector.reduce.min (which doesn't take the extra scalar parameter). Also this keeps the predicated and unpredicated versions looking similar, and the floating-point minnm/maxnm versions fold into the same schema. We had a couple of min/max reductions already implemented, from the initial pathfinding exercise in D67158. Those were done by having separate IR intrinsic names for the signed and unsigned integer versions; as part of this commit, I've changed them to use a flag parameter indicating signedness, which is how we ended up deciding that the rest of the MVE intrinsics family ought to work. So now hopefully the ewhole lot is consistent. In the new llc test, the output code from the `v8f16` test functions looks quite unpleasant, but most of it is PCS lowering (you can't pass a `half` directly in or out of a function). In other circumstances, where you do something else with your `half` in the same function, it doesn't look nearly as nasty. Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: MarkMurrayARM Subscribers: kristof.beyls, hiraditya, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76490	2020-03-20 15:42:33 +00:00
Marcel Hlopko	eddede9d51	[Syntax] Test both the default and windows target platforms in unittests Summary: This increases the coverage for things that differ between Linux and Windows, such as -fdelayed-template-parsing. This would have prevented the rollback of https://reviews.llvm.org/D76346. While at it, update -std=c++11 to c++17 for the test. Reviewers: gribozavr2 Reviewed By: gribozavr2 Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76497	2020-03-20 16:33:58 +01:00
Gabor Marton	94061df6e5	[analyzer] StdLibraryFunctionsChecker: Add argument constraints Differential Revision: https://reviews.llvm.org/D73898	2020-03-20 16:33:14 +01:00
Sean Fertile	fc902cb6e2	[PowerPC][AIX][NFC] Add zero-sized by val params to cc test. The zero sized structs force creation of a stack object of size 1, align 8 in the locals area, but otherwise have no effect on the calling convention code. i.e. They consume no registers or stack space in the paramater save area. The 32-bit codegen has 8 bytes of padding to fit the new stack object so stack size stays the same. 64-bit codegen has no padding in the stack frames allocated so 8 bytes is added, and becuase of 16-byte aligned stack, the stack size increases from 112 bytes to 128.	2020-03-20 11:24:46 -04:00
Bjorn Pettersson	d168b77780	[DAGCombiner] Fix non-determinism problem related to argument evaluation order in visitFDIV Summary: For some reason the order in which we call getNegatedExpression for the involved operands, after a call to isCheaperToUseNegatedFPOps, seem to matter. This patch includes a new test case in test/CodeGen/X86/fdiv.ll that crashes if we reverse the order of those calls. Before this patch that could happen depending on which compiler that were used when buildind llvm. With my GCC version (7.4.0) I got the crash, because it seems like it is using a different order for the argument evaluation compared to clang. All other users of isCheaperToUseNegatedFPOps already used this pattern with unfolded/ordered calls to getNegatedExpression, so this patch is aligning visitFDIV with the other use cases. This patch simply deals with the non-determinism for FDIV. While the underlying problem with getNegatedExpression is discussed further in D76439. Reviewers: spatel, RKSimon Reviewed By: spatel Subscribers: hiraditya, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76319	2020-03-20 16:11:17 +01:00
Matt Arsenault	a950e3beef	AMDGPU: Move towards deprecating alignbit intrinsic This is equivalent to llvm.fshr, so legalize the intrinsic to the generic node.	2020-03-20 11:03:04 -04:00
Matt Arsenault	53d6b156bb	AMDGPU: Add more tests for fshr	2020-03-20 11:01:51 -04:00
Kostya Kortchinsky	f8352502a3	[scudo][standalone] Allow fallback to secondary if primary is full Summary: We introduced a way to fallback to the immediately larger size class for the Primary in the event a region was full, but in the event of the largest size class, we would just fail. This change allows to fallback to the Secondary when the last region of the Primary is full. We also expand the trick to all platforms as opposed to being Android only, and update the test to cover the new case. Reviewers: hctim, cferris, eugenis, morehouse, pcc Subscribers: #sanitizers, llvm-commits Tags: #sanitizers Differential Revision: https://reviews.llvm.org/D76430	2020-03-20 07:55:20 -07:00
alex-t	6e34e71869	[AMDGPU] Enable divergence driven ISel for ADD/SUB i64 Summary: Currently we custom select add/sub with carry out to scalar form relying on later replacing them to vector form if necessary. This change enables custom selection code to take the divergence of adde/addc SDNodes into account and select the appropriate form in one step. Reviewers: arsenm, vpykhtin, rampitec Reviewed By: arsenm, vpykhtin Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa Differential Revision: https://reviews.llvm.org/D76371	2020-03-20 17:06:11 +03:00
Mikhail Maltsev	6ae3eff8ba	[ARM,CDE] Implement CDE vreinterpret intrinsics Summary: This patch implements the following CDE intrinsics: int8x16_t __arm_vreinterpretq_s8_u8 (uint8x16_t in); uint16x8_t __arm_vreinterpretq_u16_u8 (uint8x16_t in); int16x8_t __arm_vreinterpretq_s16_u8 (uint8x16_t in); uint32x4_t __arm_vreinterpretq_u32_u8 (uint8x16_t in); int32x4_t __arm_vreinterpretq_s32_u8 (uint8x16_t in); uint64x2_t __arm_vreinterpretq_u64_u8 (uint8x16_t in); int64x2_t __arm_vreinterpretq_s64_u8 (uint8x16_t in); float16x8_t __arm_vreinterpretq_f16_u8 (uint8x16_t in); float32x4_t __arm_vreinterpretq_f32_u8 (uint8x16_t in); These intrinsics are header-only because they reuse the existing MVE vreinterpret clang built-ins. This set is slightly different from the published specification (see https://static.docs.arm.com/101028/0010/ACLE_2019Q4_release-0010.pdf): it includes int8x16_t __arm_vreinterpretq_s8_u8 (uint8x16_t in); which was unintentionally ommitted from the spec, and does not include float64x2_t __arm_vreinterpretq_f64_u8 (uint8x16_t in); The float64x2_t type requires additional implementation effort, and we are not including it yet. Reviewers: simon_tatham, MarkMurrayARM, dmgreen, ostannard Reviewed By: MarkMurrayARM Subscribers: kristof.beyls, danielkiss, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76300	2020-03-20 14:01:57 +00:00
Mikhail Maltsev	969034b860	[ARM,CDE] Implement CDE unpredicated Q-register intrinsics Summary: This patch implements the following intrinsics: uint8x16_t __arm_vcx1q_u8 (int coproc, uint32_t imm); T __arm_vcx1qa(int coproc, T acc, uint32_t imm); T __arm_vcx2q(int coproc, T n, uint32_t imm); uint8x16_t __arm_vcx2q_u8(int coproc, T n, uint32_t imm); T __arm_vcx2qa(int coproc, T acc, U n, uint32_t imm); T __arm_vcx3q(int coproc, T n, U m, uint32_t imm); uint8x16_t __arm_vcx3q_u8(int coproc, T n, U m, uint32_t imm); T __arm_vcx3qa(int coproc, T acc, U n, V m, uint32_t imm); Most of them are polymorphic. Furthermore, some intrinsics are polymorphic by 2 or 3 parameter types, such polymorphism is not supported by the existing MVE/CDE tablegen backends, also we don't really want to have a combinatorial explosion caused by 1000 different combinations of 3 vector types. Because of this some intrinsics are implemented as macros involving a cast of the polymorphic arguments to uint8x16_t. The IR intrinsics are even more restricted in terms of types: all MVE vectors are cast to v16i8. Reviewers: simon_tatham, MarkMurrayARM, dmgreen, ostannard Reviewed By: MarkMurrayARM Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76299	2020-03-20 14:01:56 +00:00
Mikhail Maltsev	d22e661712	[ARM,CDE] Implement CDE S and D-register intrinsics Summary: This patch implements the following ACLE intrinsics: uint32_t __arm_vcx1_u32(int coproc, uint32_t imm); uint32_t __arm_vcx1a_u32(int coproc, uint32_t acc, uint32_t imm); uint32_t __arm_vcx2_u32(int coproc, uint32_t n, uint32_t imm); uint32_t __arm_vcx2a_u32(int coproc, uint32_t acc, uint32_t n, uint32_t imm); uint32_t __arm_vcx3_u32(int coproc, uint32_t n, uint32_t m, uint32_t imm); uint32_t __arm_vcx3a_u32(int coproc, uint32_t acc, uint32_t n, uint32_t m, uint32_t imm); uint64_t __arm_vcx1d_u64(int coproc, uint32_t imm); uint64_t __arm_vcx1da_u64(int coproc, uint64_t acc, uint32_t imm); uint64_t __arm_vcx2d_u64(int coproc, uint64_t m, uint32_t imm); uint64_t __arm_vcx2da_u64(int coproc, uint64_t acc, uint64_t m, uint32_t imm); uint64_t __arm_vcx3d_u64(int coproc, uint64_t n, uint64_t m, uint32_t imm); uint64_t __arm_vcx3da_u64(int coproc, uint64_t acc, uint64_t n, uint64_t m, uint32_t imm); Since the semantics of CDE instructions is opaque to the compiler, the ACLE intrinsics require dedicated LLVM IR intrinsics. The 64-bit and 32-bit variants share the same IR intrinsic. Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen Reviewed By: MarkMurrayARM Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76298	2020-03-20 14:01:53 +00:00

... 2 3 4 5 6 ...

345973 Commits All Branches Search

345973 Commits

All Branches