llvm-project

Commit Graph

Author	SHA1	Message	Date
John Brawn	590dc8d02c	Use virtual functions in ParsedAttrInfo instead of function pointers This doesn't do anything on its own, but it's the first step towards allowing plugins to define attributes. It does simplify the ParsedAttrInfo generation in ClangAttrEmitter a little though. Differential Revision: https://reviews.llvm.org/D31337	2020-02-26 17:24:30 +00:00
Simon Tatham	c32af4447f	[ARM,MVE] Add the vmovnbq,vmovntq intrinsic family. Summary: These are in some sense the inverse of vmovl[bt]q: they take a vector of n wide elements and truncate each to half its width. So they only write half a vector's worth of output data, and therefore they also take an 'inactive' parameter to provide the other half of the data in the output vector. So vmovnb overwrites the even lanes of 'inactive' with the narrowed values from the main input, and vmovnt overwrites the odd lanes. LLVM had existing codegen which generates these MVE instructions in response to IR that takes two vectors of wide elements, or two vectors of narrow ones. But in this case, we have one vector of each. So my clang codegen strategy is to narrow the input vector of wide elements by simply reinterpreting it as the output type, and then we have two narrow vectors and can represent the operation as a vector shuffle that interleaves lanes from both of them. Even so, not all the cases I needed ended up being selected as a single MVE instruction, so I've added a couple more patterns that spot combinations of the 'MVEvmovn' and 'ARMvrev32' SDNodes which can be generated as a VMOVN instruction with operands swapped. This commit adds the unpredicated forms only. Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D74337	2020-02-18 09:34:50 +00:00
Johannes Doerfert	b86bf83c28	[FIX] Remove pointer in attribute to eliminate leaks (see D71830)	2020-02-15 18:09:54 -06:00
Johannes Doerfert	1228d42dda	[OpenMP][Part 2] Use reusable OpenMP context/traits handling This patch implements an almost complete handling of OpenMP contexts/traits such that we can reuse most of the logic in Flang through the OMPContext.{h,cpp} in llvm/Frontend/OpenMP. All but construct SIMD specifiers, e.g., inbranch, and the device ISA selector are define in `llvm/lib/Frontend/OpenMP/OMPKinds.def`. From these definitions we generate the enum classes `TraitSet`, `TraitSelector`, and `TraitProperty` as well as conversion and helper functions in `llvm/lib/Frontend/OpenMP/OMPContext.{h,cpp}`. The above enum classes are used in the parser, sema, and the AST attribute. The latter is not a collection of multiple primitive variant arguments that contain encodings via numbers and strings but instead a tree that mirrors the `match` clause (see `struct OpenMPTraitInfo`). The changes to the parser make it more forgiving when wrong syntax is read and they also resulted in more specialized diagnostics. The tests are updated and the core issues are detected as before. Here and elsewhere this patch tries to be generic, thus we do not distinguish what selector set, selector, or property is parsed except if they do behave exceptionally, as for example `user={condition(EXPR)}` does. The sema logic changed in two ways: First, the OMPDeclareVariantAttr representation changed, as mentioned above, and the sema was adjusted to work with the new `OpenMPTraitInfo`. Second, the matching and scoring logic moved into `OMPContext.{h,cpp}`. It is implemented on a flat representation of the `match` clause that is not tied to clang. `OpenMPTraitInfo` provides a method to generate this flat structure (see `struct VariantMatchInfo`) by computing integer score values and boolean user conditions from the `clang::Expr` we keep for them. The OpenMP context is now an explicit object (see `struct OMPContext`). This is in anticipation of construct traits that need to be tracked. The OpenMP context, as well as the `VariantMatchInfo`, are basically made up of a set of active or respectively required traits, e.g., 'host', and an ordered container of constructs which allows duplication. Matching and scoring is kept as generic as possible to allow easy extension in the future. --- Test changes: The messages checked in `OpenMP/declare_variant_messages.{c,cpp}` have been auto generated to match the new warnings and notes of the parser. The "subset" checks were reversed causing the wrong version to be picked. The tests have been adjusted to correct this. We do not print scores if the user did not provide one. We print spaces to make lists in the `match` clause more legible. Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim Subscribers: merge_guards_bot, rampitec, mgorny, hiraditya, aheejin, fedor.sergeev, simoncook, bollu, guansong, dexonsmith, jfb, s.egerton, llvm-commits, cfe-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71830	2020-02-14 16:37:42 -06:00
Jonas Devlieghere	4fe839ef3a	[CMake] Rename EXCLUDE_FROM_ALL and make it an argument to add_lit_testsuite EXCLUDE_FROM_ALL means something else for add_lit_testsuite as it does for something like add_executable. Distinguish between the two by renaming the variable and making it an argument to add_lit_testsuite. Differential revision: https://reviews.llvm.org/D74168	2020-02-06 15:33:18 -08:00
Sven van Haastregt	0fff6593f8	[OpenCL] Reduce size of builtin function tables Reduce the size of some of the TableGen'ed OpenCL builtin function tables: - Use bit fields for bools such that they are packed together. This saves about 7kb. - Use unsigned short for SignatureTable. This saves about 10kb.	2020-02-06 15:08:32 +00:00
Simon Tatham	f8d4afc49a	[ARM,MVE] Add intrinsics for v[id]dupq and v[id]wdupq. Summary: These instructions generate a vector of consecutive elements starting from a given base value and incrementing by 1, 2, 4 or 8. The `wdup` versions also wrap the values back to zero when they reach a given limit value. The instruction updates the scalar base register so that another use of the same instruction will continue the sequence from where the previous one left off. At the IR level, I've represented these instructions as a family of target-specific intrinsics with two return values (the constructed vector and the updated base). The user-facing ACLE API provides a set of intrinsics that throw away the written-back base and another set that receive it as a pointer so they can update it, plus the usual predicated versions. Because the intrinsics return two values (as do the underlying instructions), the isel has to be done in C++. This is the first family of MVE intrinsics that use the `imm_1248` immediate type in the clang Tablegen framework, so naturally, I found I'd given it the wrong C integer type. Also added some tests of the check that the immediate has a legal value, because this is the first time those particular checks have been exercised. Finally, I also had to fix a bug in MveEmitter which failed an assertion when I nested two `seq` nodes (the inner one used to extract the two values from the pair returned by the IR intrinsic, and the outer one put on by the predication multiclass). Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73357	2020-02-03 11:20:06 +00:00
Benjamin Kramer	735f90fe42	Fix one round of implicit conversions found by g++5.	2020-01-29 01:52:48 +01:00
Benjamin Kramer	adcd026838	Make llvm::StringRef to std::string conversions explicit. This is how it should've been and brings it more in line with std::string_view. There should be no functional change here. This is mostly mechanical from a custom clang-tidy check, with a lot of manual fixups. It uncovers a lot of minor inefficiencies. This doesn't actually modify StringRef yet, I'll do that in a follow-up.	2020-01-28 23:25:25 +01:00
Francis Visoiu Mistrih	0f34ea5dc3	[perf-training] Update ' (in-process)' prefix handling A recent change added a new line after the prefix, so it's now part of the prefix list.	2020-01-25 09:14:24 -08:00
Simon Tatham	98ea4b30c2	[ARM,MVE] Make the MVE intrinsics work in C++! Summary: Apparently nobody has tried this in months of development. It turns out that `FunctionDecl::getBuiltinID` will never consider a function to be a builtin if it is in C++ and not extern "C". So none of the function declarations in <arm_mve.h> are recognized as builtins when clang is compiling in C++ mode: it just emits calls to them as ordinary functions, which then turn out not to exist at link time. The trivial fix is to wrap most of arm_mve.h in an extern "C". Added a test in clang/test/CodeGen/arm-mve-intrinsics which checks basic functioning of the MVE header file in C++ mode. I've filled it with copies of existing test functions from other files in that directory, including a few moderately tricky cases of overloading (in particular one that relies on the strict-polymorphism attribute added in D72518). (I considered making //every// test in that directory compile in both C and C++ mode and check the code generation was identical. But I think that would increase testing time by more than the value it adds, and also update_cc_test_checks gets confused when the output function name varies between RUN lines.) Reviewers: LukeGeeson, MarkMurrayARM, miyuki, dmgreen Reviewed By: MarkMurrayARM Subscribers: kristof.beyls, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D73268	2020-01-23 14:10:27 +00:00
Simon Tatham	4321c6af28	[ARM,MVE] Support immediate vbicq,vorrq,vmvnq intrinsics. Summary: Immediate vmvnq is code-generated as a simple vector constant in IR, and left to the backend to recognize that it can be created with an MVE VMVN instruction. The predicated version is represented as a select between the input and the same constant, and I've added a Tablegen isel rule to turn that into a predicated VMVN. (That should be better than the previous VMVN + VPSEL: it's the same number of instructions but now it can fold into an adjacent VPT block.) The unpredicated forms of VBIC and VORR are done by enabling the same isel lowering as for NEON, recognizing appropriate immediates and rewriting them as ARMISD::VBICIMM / ARMISD::VORRIMM SDNodes, which I then instruction-select into the right MVE instructions (now that I've also reworked those instructions to use the same MC operand encoding). In order to do that, I had to promote the Tablegen SDNode instance `NEONvorrImm` to a general `ARMvorrImm` available in MVE as well, and similarly for `NEONvbicImm`. The predicated forms of VBIC and VORR are represented as a vector select between the original input vector and the output of the unpredicated operation. The main convenience of this is that it still lets me use the existing isel lowering for VBICIMM/VORRIMM, and not have to write another copy of the operand encoding translation code. This intrinsic family is the first to use the `imm_simd` system I put into the MveEmitter tablegen backend. So, naturally, it showed up a bug or two (emitting bogus range checks and the like). Fixed those, and added a full set of tests for the permissible immediates in the existing Sema test. Also adjusted the isel pattern for `vmovlb.u8`, which stopped matching because lowering started turning its input into a VBICIMM. Now it recognizes the VBICIMM instead. Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72934	2020-01-23 11:53:52 +00:00
Reid Kleckner	f63d763738	[TableGen] Use a table to lookup MVE intrinsic names Summary: Speeds up compilation of SemaDeclAttr.cpp by nine seconds: 0m49.555s - > 0m40.249s Reviewers: simon_tatham, dmgreen, ostannard, MarkMurrayARM Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D72984	2020-01-21 11:05:45 -08:00
Francis Visoiu Mistrih	03689fe97f	[perf-training] Ignore ' (in-process)' prefix from -### After D69825, the output of clang -### when running in process can be prefixed by ' (in-process)'. Skip it.	2020-01-17 09:38:35 -08:00
Simon Tatham	ada01d1b86	[clang] New __attribute__((__clang_arm_mve_strict_polymorphism)). This is applied to the vector types defined in <arm_mve.h> for use with the intrinsics for the ARM MVE vector architecture. Its purpose is to inhibit lax vector conversions, but only in the context of overload resolution of the MVE polymorphic intrinsic functions. This solves an ambiguity problem with polymorphic MVE intrinsics that take a vector and a scalar argument: the scalar argument can often have the wrong integer type due to default integer promotions or unsuffixed literals, and therefore, the type of the vector argument should be considered trustworthy when resolving MVE polymorphism. As part of the same change, I've added the new attribute to the declarations generated by the MveEmitter Tablegen backend (and corrected a namespace issue with the other attribute while I was there). Reviewers: aaron.ballman, dmgreen Reviewed By: aaron.ballman Subscribers: kristof.beyls, JDevlieghere, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D72518	2020-01-15 15:04:10 +00:00
Jinsong Ji	e2b8e2113a	[clang][OpenCL] Fix covered switch warning -Werror clang build is broken now. tools/clang/lib/Sema/OpenCLBuiltins.inc:11824:5: error: default label in switch which covers all enumeration values [-Werror,-Wcovered-switch-default] default: We don't need default now, since all enumeration values are covered. Reviewed By: svenvh Differential Revision: https://reviews.llvm.org/D72707	2020-01-14 16:21:42 +00:00
Simon Tatham	3100480925	[ARM,MVE] Intrinsics for partial-overwrite imm shifts. This batch of intrinsics covers two sets of immediate shift instructions, which have in common that they only overwrite part of their output register and so they need an extra input giving its previous value. The VSLI and VSRI instructions shift each lane of the input vector left or right just as if they were normal immediate VSHL/VSHR, but then they only overwrite the output bits that correspond to actual shifted bits of the input. So VSLI will leave the low n bits of each output lane unchanged, and VSRI the same with the top n bits. The V[Q][R]SHR[U]N family are all narrowing shifts: they take an input vector of 2n-bit integers, shift each lane right by a constant, and then narrowing the shifted result to only n bits. So they only overwrite half of the n-bit lanes in the output register, and the B/T suffix indicates whether it's the bottom or top half of each 2n-bit lane. I've implemented the whole of the latter family using a single IR intrinsic `vshrn`, which takes a lot of i32 parameters indicating which instruction it expands to (by specifying signedness of the input and output types, whether it saturates and/or rounds, etc). Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72328	2020-01-08 14:42:24 +00:00
Simon Tatham	4978296cd8	[ARM,MVE] Support -ve offsets in gather-load intrinsics. Summary: The ACLE intrinsics with `gather_base` or `scatter_base` in the name are wrappers on the MVE load/store instructions that take a vector of base addresses and an immediate offset. The immediate offset can be up to 127 times the alignment unit, and it can be positive or negative. At the MC layer, we got that right. But in the Sema error checking for the wrapping intrinsics, the offset was erroneously constrained to be positive. To fix this I've adjusted the `imm_mem7bit` class in the Tablegen that defines the intrinsics. But that causes integer literals like `0xfffffffffffffe04` to appear in the autogenerated calls to `SemaBuiltinConstantArgRange`, which provokes a compiler warning because that's out of the non-overflowing range of an `int64_t`. So I've also tweaked `MveEmitter` to emit that as `-0x1fc` instead. Updated the tests of the Sema checks themselves, and also adjusted a random sample of the CodeGen tests to actually use negative offsets and prove they get all the way through code generation without causing a crash. Reviewers: dmgreen, miyuki, MarkMurrayARM Reviewed By: dmgreen Subscribers: kristof.beyls, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72268	2020-01-06 16:33:07 +00:00
Artem Dergachev	2203089a60	[analyzer] exploded-graph-rewriter: Fix string encodings in python3. Makes sure that the script works fine both in python2 and python3. Patch by Pavel Samolysov! Differential Revision: https://reviews.llvm.org/D71746	2019-12-21 10:59:38 -08:00
Sven van Haastregt	308b8b76ce	[OpenCL] Add builtin function extension handling Provide a mechanism to attach OpenCL extension information to builtin functions, so that their use can be restricted according to the extension(s) the builtin is part of. Patch by Pierre Gondois and Sven van Haastregt. Differential Revision: https://reviews.llvm.org/D71476	2019-12-18 10:13:51 +00:00
Xin-Xin Wang	b3f789e037	[perf-training] Change profile file pattern string to use %4m instead of %p Summary: With %p, each test file that we're using to generate profile data will make its own profraw file which is around 60 MB in size. If we have a lot of test files, that quickly uses a lot of space. Use %4m instead to share the profraw files used to store the profile data. We use 4 here based on the default value in https://reviews.llvm.org/source/llvm-github/browse/master/llvm/CMakeLists.txt$604 Reviewers: beanz, phosek, xiaobai, smeenai, vsk Reviewed By: vsk Subscribers: vsk, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D71585	2019-12-17 12:12:21 -08:00
John McCall	b699fe8b95	Forward {read,write}SomeEnumType to {read,write}Enum instead of directly to {read,write}UInt32. This will be useful for textual formats. NFC.	2019-12-16 13:34:00 -05:00
John McCall	6887ccfcf2	Add the ability for properties to be conditional on other properties. This will be required by TemplateName.	2019-12-16 13:34:00 -05:00
John McCall	256ec99644	Add the ability to declare helper variables when reading properties from a value. This is useful when the properties of a case are actually read out of a specific structure, as with TemplateName.	2019-12-16 13:34:00 -05:00
John McCall	efd0dfbd70	Add the ability to use property-based serialization for "cased" types. This patch doesn't actually use this serialization for anything, but follow-ups will move the current handling of various standard types over to this.	2019-12-16 13:33:59 -05:00
John McCall	00bc76eddd	Move Basic{Reader,Writer} emission into ASTPropsEmitter; NFC. I'm going to introduce some uses of the property read/write methods.	2019-12-16 13:33:59 -05:00
Shoaib Meenai	2c59c4ffb9	[perf-training] Make training data location configurable We may wish to keep the PGO training data outside the repository. Add a CMake variable to allow referencing an external lit testsuite. Differential Revision: https://reviews.llvm.org/D71507	2019-12-14 09:46:41 -08:00
John McCall	d505e57cc2	Abstract serialization: TableGen the (de)serialization code for Types. The basic technical design here is that we have three levels of readers and writers: - At the lowest level, there's a `Basic{Reader,Writer}` that knows how to emit the basic structures of the AST. CRTP allows this to be metaprogrammed so that the client only needs to support a handful of primitive types (e.g. `uint64_t` and `IdentifierInfo`) and more complicated "inline" structures such as `DeclarationName` can just be emitted in terms of those primitives. In Clang's binary-serialization code, these are `ASTRecord{Reader,Writer}`. For now, a large number of basic structures are still emitted explicitly by code on those classes rather than by either TableGen or CRTP metaprogramming, but I expect to move more of these over. - In the middle, there's a `Property{Reader,Writer}` which is responsible for processing the properties of a larger object. The object-level reader/writer asks the property-level reader/writer to project out a particular property, yielding a basic reader/writer which will be used to read/write the property's value, like so: ``` propertyWriter.find("count").writeUInt32(node->getCount()); ``` Clang's binary-serialization code ignores this level (it uses the basic reader/writer as the property reader/writer and has the projection methods just return `this`) and simply relies on the roperties being read/written in a stable order. - At the highest level, there's an object reader/writer (e.g. `Type{Reader,Writer}` which emits a logical object with properties. Think of this as writing something like a JSON dictionary literal. I haven't introduced support for bitcode abbreviations yet --- it turns out that there aren't any operative abbreviations for types besides the QualType one --- but I do have some ideas of how they should work. At any rate, they'll be necessary in order to handle statements. I'm sorry for not disentangling the patches that added basic and type reader/writers; I made some effort to, but I ran out of energy after disentangling a number of other patches from the work. Negligible impact on module size, time to build a set of about 20 fairly large modules, or time to read a few declarations out of them.	2019-12-14 00:17:01 -05:00
John McCall	6404bd2362	Abstract serialization: TableGen "basic" reader/writer CRTP classes that serialize basic values	2019-12-14 00:16:48 -05:00
John McCall	3ce3d23fac	Standardize the reader methods in ASTReader; NFC. There are three significant changes here: - Most of the methods to read various embedded structures (`APInt`, `NestedNameSpecifier`, `DeclarationName`, etc.) have been moved from `ASTReader` to `ASTRecordReader`. This cleans up quite a bit of code which was passing around `(F, Record, Idx)` arguments everywhere or doing explicit indexing, and it nicely parallels how it works on the writer side. It also sets us up to then move most of these methods into the `BasicReader`s that I'm introducing as part of abstract serialization. As part of this, several of the top-level reader methods (e.g. `readTypeRecord`) have been converted to use `ASTRecordReader` internally, which is a nice readability improvement. - I've standardized most of these method names on `readFoo` rather than `ReadFoo` (used in some of the helper structures) or `GetFoo` (used for some specific types for no apparent reason). - I've changed a few of these methods to return their result instead of reading into an argument passed by reference. This is partly for general consistency and partly because it will make the metaprogramming easier with abstract serialization.	2019-12-14 00:16:48 -05:00
John McCall	f6da0cf34a	Enable better node-hierarchy metaprogramming; NFC.	2019-12-14 00:16:47 -05:00
John McCall	30066e522c	Extract out WrappedRecord as a convenience base class; NFC.	2019-12-14 00:16:47 -05:00
John McCall	91dd67ef72	Introduce some types and functions to make it easier to work with the tblgen AST node hierarchies. Not totally NFC because both of the emitters now emit in a different order. The type-nodes emitter now visits nodes in hierarchy order, which means we could use range checks in classof if we had any types that would benefit from that; currently we do not. The AST-nodes emitter now uses a multimap keyed by the name of the record; previously it was using `Record*`, which of couse isn't stable across processes and may have led to non-reproducible builds in some circumstances.	2019-12-14 00:16:47 -05:00
John McCall	a7950ffd12	[NFC] Correct accidental use of tabs.	2019-12-14 00:16:47 -05:00
John McCall	b6f03a5a6b	[NFC] Rename ClangASTEmitters.h -> ASTTableGen.h	2019-12-14 00:16:47 -05:00
Simon Tatham	bd0f271c9e	[ARM][MVE] Add intrinsics for immediate shifts. (reland) This adds the family of `vshlq_n` and `vshrq_n` ACLE intrinsics, which shift every lane of a vector left or right by a compile-time immediate. They mostly work by expanding to the IR `shl`, `lshr` and `ashr` operations, with their second operand being a vector splat of the immediate. There's a fiddly special case, though. ACLE specifies that the immediate in `vshrq_n` can take values up to //and including// the bit size of the vector lane. But LLVM IR thinks that shifting right by the full size of the lane is UB, and feels free to replace the `lshr` with an `undef` half way through the optimization pipeline. Hence, to keep this legal in source code, I have to detect it at codegen time. Logical (unsigned) right shifts by the element size are handled by simply emitting the zero vector; arithmetic ones are converted into a shift of one bit less, which will always give the same output. In order to do that check, I also had to enhance the tablegen MveEmitter so that it can cope with converting a builtin function's operand into a bare integer to pass to a code-generating subfunction. Previously the only bare integers it knew how to handle were flags generated from within `arm_mve.td`. Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard Reviewed By: dmgreen, MarkMurrayARM Subscribers: echristo, hokein, rdhindsa, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71065	2019-12-11 10:10:09 +00:00
Eric Christopher	9c6b7f68b8	Revert "[ARM][MVE] Add intrinsics for immediate shifts." and two follow-on commits: one warning fix and one functionality. As it's breaking at least the lto bot: http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/15132/steps/test-stage1-compiler/logs/stdio This reverts commits: `8d70f3c933` `ff4dceef92` `d97b3e3e65`	2019-12-09 16:47:38 -08:00
Mark Murray	2eb61fa5d6	[ARM][MVE][Intrinsics] Add VMULL[BT]Q_(INT\|POLY) intrinsics. Summary: Add VMULL[BT]Q_(INT\|POLY) intrinsics and unit tests. Reviewers: simon_tatham, ostannard, dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71066	2019-12-09 17:41:47 +00:00
Haojian Wu	ff4dceef92	Fix the compiler warnings: "-Winconsistent-missing-override", "-Wunused-variable" for `d97b3e3e65`	2019-12-09 17:09:07 +01:00
Simon Tatham	d97b3e3e65	[ARM][MVE] Add intrinsics for immediate shifts. Summary: This adds the family of `vshlq_n` and `vshrq_n` ACLE intrinsics, which shift every lane of a vector left or right by a compile-time immediate. They mostly work by expanding to the IR `shl`, `lshr` and `ashr` operations, with their second operand being a vector splat of the immediate. There's a fiddly special case, though. ACLE specifies that the immediate in `vshrq_n` can take values up to //and including// the bit size of the vector lane. But LLVM IR thinks that shifting right by the full size of the lane is UB, and feels free to replace the `lshr` with an `undef` half way through the optimization pipeline. Hence, to keep this legal in source code, I have to detect it at codegen time. Logical (unsigned) right shifts by the element size are handled by simply emitting the zero vector; arithmetic ones are converted into a shift of one bit less, which will always give the same output. In order to do that check, I also had to enhance the tablegen MveEmitter so that it can cope with converting a builtin function's operand into a bare integer to pass to a code-generating subfunction. Previously the only bare integers it knew how to handle were flags generated from within `arm_mve.td`. Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard Reviewed By: MarkMurrayARM Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71065	2019-12-09 15:44:09 +00:00
Reid Kleckner	1f822f212c	Handle two corner cases in creduce-clang-crash.py Summary: First, call os.path.normpath on the filename argument. I passed in ./foo-asdf.cpp, and this meant that the script failed to find the filename, and bad things happened. Second, call os.path.abspath on binaries. CReduce runs the interestingness test in a temp dir, so relative paths will not work. Reviewers: akhuang Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D71098	2019-12-05 16:24:24 -08:00
Simon Tatham	d173fb5d28	[ARM,MVE] Add intrinsics to deal with predicates. Summary: This commit adds the `vpselq` intrinsics which take an MVE predicate word and select lanes from two vectors; the `vctp` intrinsics which create a tail predicate word suitable for processing the first m elements of a vector (e.g. in the last iteration of a loop); and `vpnot`, which simply complements a predicate word and is just syntactic sugar for the `~` operator. The `vctp` ACLE intrinsics are lowered to the IR intrinsics we've already added (and which D70592 just reorganized). I've filled in the missing isel rule for VCTP64, and added another set of rules to generate the predicated forms. I needed one small tweak in MveEmitter to allow the `unpromoted` type modifier to apply to predicates as well as integers, so that `vpnot` doesn't pointlessly convert its input integer to an `<n x i1>` before complementing it. Reviewers: ostannard, MarkMurrayARM, dmgreen Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D70485	2019-12-02 16:20:30 +00:00
Tim Northover	78ad22e0cc	Recommit ARM-NEON: make type modifiers orthogonal and allow multiple modifiers. The modifier system used to mutate types on NEON intrinsic definitions had a separate letter for all kinds of transformations that might be needed, and we were quite quickly running out of letters to use. This patch converts to a much smaller set of orthogonal modifiers that can be applied together to achieve the desired effect. When merging with downstream it is likely to cause a conflict with any local modifications to the .td files. There is a new script in utils/convert_arm_neon.py that was used to convert all .td definitions and I would suggest running it on the last downstream version of those files before this commit rather than resolving conflicts manually. The original version broke vcreate_* because it became a macro and didn't apply the normal integer promotion rules before bitcasting to a vector. This adds a temporary.	2019-11-26 09:21:47 +00:00
Nico Weber	6f773205cd	Revert "Use InitLLVM to setup a pretty stack printer" This reverts commit `3f76260dc0`. Breaks at least these tests on Windows: Clang :: Driver/clang-offload-bundler.c Clang :: Driver/clang-offload-wrapper.c	2019-11-25 21:06:56 -05:00
Rui Ueyama	3f76260dc0	Use InitLLVM to setup a pretty stack printer InitLLVM does not only save a few lines from main() but also makes the commands do the right thing for multibyte character pathnames on Windows (i.e. canonicalize argv's to UTF-8) because of the code we have in this file: https://github.com/llvm/llvm-project/blob/master/llvm/lib/Support/InitLLVM.cpp#L32 For many LLVM commands, we already have calls of InitLLVM, but there are still remainings. Differential Revision: https://reviews.llvm.org/D70702	2019-11-26 10:56:10 +09:00
Hans Wennborg	21f26470e9	Revert `3f91705ca5` "ARM-NEON: make type modifiers orthogonal and allow multiple modifiers." This broke the vcreate_u64 intrinsic. Example: $ cat /tmp/a.cc #include <arm_neon.h> void g() { auto v = vcreate_u64(0); } $ bin/clang -c /tmp/a.cc --target=arm-linux-androideabi16 -march=armv7-a /tmp/a.cc:4:12: error: C-style cast from scalar 'int' to vector 'uint64x1_t' (vector of 1 'uint64_t' value) of different size auto v = vcreate_u64(0); ^~~~~~~~~~~~~~ /work/llvm.monorepo/build.release/lib/clang/10.0.0/include/arm_neon.h:4144:11: note: expanded from macro 'vcreate_u64' __ret = (uint64x1_t)(__p0); \ ^~~~~~~~~~~~~~~~~~ Reverting until this can be investigated. > The modifier system used to mutate types on NEON intrinsic definitions had a > separate letter for all kinds of transformations that might be needed, and we > were quite quickly running out of letters to use. This patch converts to a much > smaller set of orthogonal modifiers that can be applied together to achieve the > desired effect. > > When merging with downstream it is likely to cause a conflict with any local > modifications to the .td files. There is a new script in > utils/convert_arm_neon.py that was used to convert all .td definitions and I > would suggest running it on the last downstream version of those files before > this commit rather than resolving conflicts manually.	2019-11-25 16:27:53 +01:00
Tim Northover	3f91705ca5	ARM-NEON: make type modifiers orthogonal and allow multiple modifiers. The modifier system used to mutate types on NEON intrinsic definitions had a separate letter for all kinds of transformations that might be needed, and we were quite quickly running out of letters to use. This patch converts to a much smaller set of orthogonal modifiers that can be applied together to achieve the desired effect. When merging with downstream it is likely to cause a conflict with any local modifications to the .td files. There is a new script in utils/convert_arm_neon.py that was used to convert all .td definitions and I would suggest running it on the last downstream version of those files before this commit rather than resolving conflicts manually.	2019-11-20 13:20:02 +00:00
Tim Northover	e23d6f3184	NeonEmitter: remove special case on casting polymorphic builtins. For some reason we were not casting a fairly obscure class of builtin calls we expected to be polymorphic to vectors of char. It worked because the only affected intrinsics weren't actually polymorphic after all, but is unnecessarily complicated.	2019-11-20 13:20:02 +00:00
Simon Tatham	9e37892773	[ARM,MVE] Add intrinsics for vector get/set lane. This adds the `vgetq_lane` and `vsetq_lane` families, to copy between a scalar and a specified lane of a vector. One of the new `vgetq_lane` intrinsics returns a `float16_t`, which causes a compile error if `%clang_cc1` doesn't get the option `-fallow-half-arguments-and-returns`. The driver passes that option to cc1 already, but I've had to edit all the explicit cc1 command lines in the existing MVE intrinsics tests. A couple of fixes are included for the code I wrote up front in MveEmitter to support lane-index immediates (and which nothing has tested until now): the type was wrong (`uint32_t` instead of `int`) and the range was off by one. I've also added a method of bypassing the default promotion to `i32` that is done by the MveEmitter code generation: it's sensible to promote short scalars like `i16` to `i32` if they're going to be passed to custom IR intrinsics representing a machine instruction operating on GPRs, but not if they're going to be passed to standard IR operations like `insertelement` which expect the exact type. Reviewers: ostannard, MarkMurrayARM, dmgreen Reviewed By: dmgreen Subscribers: kristof.beyls, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D70188	2019-11-15 09:53:58 +00:00
Simon Tatham	902e84556a	[ARM,MVE] Add intrinsics for 'administrative' vector operations. This batch of intrinsics includes lots of things that move vector data around or change its type without really affecting its value very much. It includes the `vreinterpretq` family (cast one vector type to another); `vuninitializedq` (create a vector of a given type with don't-care contents); and `vcreateq` (make a 128-bit vector out of two `uint64_t` halves). These are all implemented using completely standard IR that's already tested in existing LLVM unit tests, so I've just written a clang test to check the IR is correct, and left it at that. I've also added some richer infrastructure to the MveEmitter Tablegen backend, to make it specify the exact integer type of integer arguments passed to IR construction functions, and wrap those arguments in a `static_cast` in the autogenerated C++. That was necessary to prevent an overloading ambiguity when passing the integer literal `0` to `IRBuilder::CreateInsertElement`, because otherwise, it could mean either a null pointer `llvm::Value *` or a zero `uint64_t`. Reviewers: ostannard, MarkMurrayARM, dmgreen Subscribers: kristof.beyls, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D70133	2019-11-15 09:53:43 +00:00

1 2 3 4 5 ...

1327 Commits