llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	4465a765ee	[X86] Remove icmp undef from reduced tests Pre-commit for D59363 (Add icmp UNDEF handling to SelectionDAG::FoldSetCC) Approved by @spatel (Sanjay Patel) llvm-svn: 356859	2019-03-24 17:02:08 +00:00
Simon Pilgrim	a71c0ed471	[X86][AVX] Start shuffle combining from ZERO_EXTEND_VECTOR_INREG (PR40685) Just enable this for AVX for now as SSE41 introduces extra register moves for the PMOVZX(PSHUFD(V)) -> UNPCKH(V,0) pattern (but otherwise helps reduce port5 usage on Intel targets). Only AVX support is required for PR40685 as the issue is due to 8i8->8i32 zext shuffle leftovers. llvm-svn: 356858	2019-03-24 16:30:35 +00:00
George Rimar	272571718c	Recommit r356738 "[llvm-objcopy] - Implement replaceSectionReferences for GroupSection class." Fix: r356853 + set AddressAlign to 4 in Inputs/compress-debug-sections.yaml for the new group section introduced. Original commit message: Currently, llvm-objcopy incorrectly handles compression and decompression of the sections from COMDAT groups, because we do not implement the replaceSectionReferences for this type of the sections. The patch does that. Differential revision: https://reviews.llvm.org/D59638 llvm-svn: 356856	2019-03-24 14:41:45 +00:00
Sanjay Patel	7d676dfd86	[x86] improve the default expansion of uaddsat/usubsat This is yet another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 uaddsat X, Y --> (X >u (X + Y)) ? -1 : X + Y usubsat X, Y --> (X >u Y) ? X - Y : 0 We can't count on a sane vector ISA, so override the default (umin/umax) expansion of unsigned add/sub saturate in cases where we do not have umin/umax. Differential Revision: https://reviews.llvm.org/D59006 llvm-svn: 356855	2019-03-24 13:55:54 +00:00
George Rimar	0a5d4b8472	[llvm-objcopy] - Report SHT_GROUP sections with invalid alignment. This patch fixes the cause of the ubsan failure (UB detected) that happened after landing D59638 (I had to revert it). (http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-ubsan/builds/11760/steps/check-llvm%20ubsan/logs/stdio) The problem is the following. Our implementation of GroupSection assumes that its address is 4-byte aligned when writing it: template <class ELFT> void ELFSectionWriter<ELFT>::visit(const GroupSection &Sec) { ELF::Elf32_Word *Buf = reinterpret_cast<ELF::Elf32_Word *>(Out.getBufferStart() + Sec.Offset); ... But the test case for D59638 did not set AddressAlign in YAML, so the address was not 4-byte aligned since Sec.Offset was odd. That triggered the issue. This patch teaches llvm-objcopy to report an error for such sections (which should not occur in practice), which is better than having UB. Differential revision: https://reviews.llvm.org/D59695 llvm-svn: 356853	2019-03-24 13:31:08 +00:00
Simon Pilgrim	9eb0de8573	[X86][SLP] Show example of failure to uniformly commute splats for 'alt' shuffles. If either the main/alt opcodes isn't commutable we may end up with the splats not correctly commuted to the same side. llvm-svn: 356837	2019-03-23 16:14:04 +00:00
Reid Kleckner	e6a81b9bec	[pdb] Add -type-stats and sort stats by descending size Summary: It prints this on chromium browser_tests.exe.pdb: Types Total: 5647475 entries ( 371,897,512 bytes, 65.85 avg) -------------------------------------------------------------------------- LF_CLASS: 397894 entries ( 119,537,780 bytes, 300.43 avg) LF_STRUCTURE: 236351 entries ( 83,208,084 bytes, 352.05 avg) LF_FIELDLIST: 291003 entries ( 66,087,920 bytes, 227.10 avg) LF_MFUNCTION: 1884176 entries ( 52,756,928 bytes, 28.00 avg) LF_POINTER: 1149030 entries ( 13,877,344 bytes, 12.08 avg) LF_ARGLIST: 789980 entries ( 12,436,752 bytes, 15.74 avg) LF_METHODLIST: 361498 entries ( 8,351,008 bytes, 23.10 avg) LF_ENUM: 16069 entries ( 6,108,340 bytes, 380.13 avg) LF_PROCEDURE: 269374 entries ( 4,309,984 bytes, 16.00 avg) LF_MODIFIER: 235602 entries ( 2,827,224 bytes, 12.00 avg) LF_UNION: 9131 entries ( 2,072,168 bytes, 226.94 avg) LF_VFTABLE: 323 entries ( 207,784 bytes, 643.29 avg) LF_ARRAY: 6639 entries ( 106,380 bytes, 16.02 avg) LF_VTSHAPE: 126 entries ( 6,472 bytes, 51.37 avg) LF_BITFIELD: 278 entries ( 3,336 bytes, 12.00 avg) LF_LABEL: 1 entries ( 8 bytes, 8.00 avg) The PDB is overall 1.9GB, so the LF_CLASS and LF_STRUCTURE declarations account for about 10% of the overall file size. I was surprised to find that on average LF_FIELDLIST records are short. Maybe this is because there are many more types with short member lists than there are instantiations with lots of members, like std::vector. Reviewers: aganea, zturner Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59672 llvm-svn: 356813	2019-03-22 21:22:13 +00:00
Douglas Yung	8316ea4299	Revert "[llvm-readobj] Separate `Symbol Version` dumpers into `LLVM style` and `GNU style`" This reverts commit `94a0cffe25` (r356764). This change was originally committed in r356764, but then partially reverted in r356777 due to "bad changes". This caused test failures because the test changes committed along with the original change were not reverted, so this change reverts the rest of the changes. llvm-svn: 356811	2019-03-22 21:07:57 +00:00
Eli Friedman	b906bba576	[ARM] Don't form "ands" when it isn't scheduled correctly. In r322972/r323136, the iteration here was changed to catch cases at the beginning of a basic block... but we accidentally deleted an important safety check. Restore that check to the way it was. Fixes https://bugs.llvm.org/show_bug.cgi?id=41116 Differential Revision: https://reviews.llvm.org/D59680 llvm-svn: 356809	2019-03-22 20:49:15 +00:00
Craig Topper	ce1ed55a4a	[X86] Use xmm registers to implement 64-bit popcnt on 32-bit targets if possible if popcnt instruction is not available On 32-bit targets without popcnt, we currently expand 64-bit popcnt to sequences of arithmetic and logic ops for each 32-bit half and then add the 32-bit halves together. If we have xmm registers we can use those to implement the operation instead. This results in fewer instructions than doing two separate 32-bit popcnt sequences. This mitigates some of PR41151 for the i64 on i686 case when we have SSE2. Differential Revision: https://reviews.llvm.org/D59662 llvm-svn: 356808	2019-03-22 20:47:02 +00:00
Craig Topper	1ffd8e8114	[X86] Use movq for i64 atomic load on 32-bit targets when sse2 is enabled We used a lock cmpxchg8b to do i64 atomic loads. But if we have SSE2 we can do better and use a plain movq to do the load instead. I tried to just use an f64 atomic load and add isel patterns to MOVSD (which the domain fixing pass can turn to MOVQ), but the atomic_load SDNode in TargetSelectionDAG.td requires the type to be integer. So I've emitted VZEXT_LOAD instead, which should be selected by isel to a MOVQ. Hopefully we don't need a specific atomic flavor of this. I kept the memory operand from the original AtomicSDNode. I wasn't sure if I might need to set the MOVolatile flag? I've left some FIXMEs for improvements we can do without SSE2. Differential Revision: https://reviews.llvm.org/D59679 llvm-svn: 356807	2019-03-22 20:46:56 +00:00
Daniel Sanders	ef8761fd3b	Fix non-determinism in Reassociate caused by address coincidences Summary: Between building the pair map and querying it there are a few places that erase and create Values. It's rare but the address of these newly created Values is occasionally the same as a just-erased Value that we already have in the pair map. These coincidences should be accounted for to avoid non-determinism. Thanks to Roman Tereshin for the test case. Reviewers: rtereshin, bogner Reviewed By: rtereshin Subscribers: mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59401 llvm-svn: 356803	2019-03-22 20:16:35 +00:00
Evandro Menezes	4a7739b681	[AArch64, ARM] Add support for Exynos M5 Add Exynos M5 support and test cases. llvm-svn: 356793	2019-03-22 18:42:14 +00:00
Sanjay Patel	a0aaa11afc	[SLP] fix variable names in test; NFC 'tmpXXX' conflicts with the auto-generated script regex names. That could mask a bug or cause a failure if the output changes. llvm-svn: 356790	2019-03-22 18:33:11 +00:00
James Y Knight	c0e6b8ac3a	IR: Support parsing numeric block ids, and emit them in textual output. Just as llvm IR supports explicitly specifying numeric value ids for instructions, and emits them by default in textual output, now do the same for blocks. This is a slightly incompatible change in the textual IR format. Previously, llvm would parse numeric labels as string names. E.g. define void @f() { br label %"55" 55: ret void } defined a label named "55", even without needing to be quoted, while the reference required quoting. Now, if you intend a block label which looks like a value number to be a name, you must quote it in the definition too (e.g. `"55":`). Previously, llvm would print nameless blocks only as a comment, and would omit it if there was no predecessor. This could cause confusion for readers of the IR, just as unnamed instructions did prior to the addition of "%5 = " syntax, back in 2008 (PR2480). Now, it will always print a label for an unnamed block, with the exception of the entry block. (IMO it may be better to print it for the entry-block as well. However, that requires updating many more tests.) Thus, the following is supported, and is the canonical printing: define i32 @f(i32, i32) { %3 = add i32 %0, %1 br label %4 4: ret i32 %3 } New test cases covering this behavior are added, and other tests updated as required. Differential Revision: https://reviews.llvm.org/D58548 llvm-svn: 356789	2019-03-22 18:27:13 +00:00
Simon Pilgrim	aea9db9d40	[X86] Regenerate powi tests to include i686 x87/sse targets llvm-svn: 356787	2019-03-22 18:04:28 +00:00
Simon Pilgrim	08380afaab	[X86] Add PR13897 test case (i128 mul on i686) llvm-svn: 356786	2019-03-22 17:52:21 +00:00
Simon Pilgrim	564392d752	[X86] lowerShuffleAsBitMask - ensure float bit masks are the correct width (PR41203) llvm-svn: 356784	2019-03-22 17:23:55 +00:00
Philip Reames	d627048c07	[Tests] Add masked.gather tests for non-constant masks + speculation possibilities llvm-svn: 356782	2019-03-22 16:39:04 +00:00
Bixia Zheng	bdf0230cff	[ConstantFolding] Fix GetConstantFoldFPValue to avoid cast overflow. Summary: In C++, the behavior of casting a double value that is beyond the range of a single precision floating-point to a float value is undefined. This change replaces such a cast with APFloat::convert to convert the value, which is consistent with how we convert a double value to a half value. Reviewers: sanjoy Subscribers: lebedev.ri, sanjoy, jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59500 llvm-svn: 356781	2019-03-22 16:37:37 +00:00
Philip Reames	f032e85d64	[tests] Add a generic masked.gather test to show sometimes we can't transform llvm-svn: 356779	2019-03-22 16:30:56 +00:00
Philip Reames	e234fd6118	[tests] Add tests for converting masked.load to load speculatively llvm-svn: 356778	2019-03-22 16:26:57 +00:00
Philip Reames	4a518c7055	[Tests] Use valid alignment in masked.gather tests llvm-svn: 356775	2019-03-22 16:20:24 +00:00
Tim Renouf	94c163c34e	InstCombineSimplifyDemanded: Allow v3 results for AMDGCN buffer and image intrinsics This helps to avoid the situation where RA spots that only 3 of the v4f32 result of a load are used, and immediately reallocates the 4th register for something else, requiring a stall waiting for the load. Differential Revision: https://reviews.llvm.org/D58906 Change-Id: I947661edfd5715f62361a02b100f14aeeada29aa llvm-svn: 356768	2019-03-22 15:53:50 +00:00
Xing GUO	94a0cffe25	[llvm-readobj] Separate `Symbol Version` dumpers into `LLVM style` and `GNU style` Summary: Currently, llvm-readobj can dump symbol version sections only in LLVM style. In this patch, I would like to separate these dumpers into GNU style and LLVM style for future implementation. Reviewers: grimar, jhenderson, mattd, rupprecht Reviewed By: rupprecht Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59186 llvm-svn: 356764	2019-03-22 15:42:13 +00:00
Sanjay Patel	221081e365	[x86] auto-generate complete test checks; NFC llvm-svn: 356763	2019-03-22 15:33:59 +00:00
Sanjay Patel	0893351c1c	[x86] auto-generate complete test checks; NFC llvm-svn: 356762	2019-03-22 15:33:55 +00:00
Sanjay Patel	61e2333acb	[x86] add 'nounwind' to tests to reduce noise; NFC llvm-svn: 356761	2019-03-22 15:33:51 +00:00
Sanjay Patel	f39494e795	[x86] auto-generate complete checks for test; NFC llvm-svn: 356760	2019-03-22 15:33:47 +00:00
Tim Renouf	6f0191a55a	[AMDGPU] Use three- and five-dword result type in image ops Some image ops return three or five dwords. Previously, we modeled that with a 4 or 8 dword register class. The register allocator could cleverly spot that some subregs were dead and allocate something else there, but that caused the de-optimization that waitcnt insertion would think that the result was used immediately. This commit allows such an image op to have a result with a three or five dword result, avoiding the above de-optimization. Differential Revision: https://reviews.llvm.org/D58905 Change-Id: I3651211bbd7ed22721ee7b9fefd7bcc60a809d8b llvm-svn: 356757	2019-03-22 15:21:11 +00:00
Tim Renouf	677387d8dc	[AMDGPU] Implemented dwordx3 variants of buffer/tbuffer load/store intrinsics Now we have vec3 MVTs, this commit implements dwordx3 variants of the buffer intrinsics. On gfx6, a dwordx3 buffer load intrinsic is implemented as a dwordx4 instruction, and a dwordx3 buffer store intrinsic is not supported. We need to support the dwordx3 load intrinsic because it is generated by subtarget-unaware code in InstCombine. Differential Revision: https://reviews.llvm.org/D58904 Change-Id: I016729d8557b98a52f529638ae97c340a5922a4e llvm-svn: 356755	2019-03-22 14:58:02 +00:00
Dinar Temirbulatov	f95351b918	[SLPVectorizer] Add test related to SLP Throttling support, NFCI. llvm-svn: 356754	2019-03-22 14:50:53 +00:00
Pavel Labath	69de7a955e	[ObjectYAML] Add basic minidump generation support Summary: This patch adds the ability to read a yaml form of a minidump file and write it out as binary. Apart from the minidump header and the stream directory, only three basic stream kinds are supported: - Text: This kind is used for streams which contain textual data. This is typically the contents of a /proc file on linux (e.g. /proc/PID/maps). In this case, we just put the raw stream contents into the yaml. - SystemInfo: This stream contains various bits of information about the host system in binary form. We expose the data in a structured form. - Raw: This kind is used as a fallback when we don't have any special knowledge about the stream. In this case, we just print the stream contents in hex. For this code to be really useful, more stream kinds will need to be added (particularly for things like lists of memory regions and loaded modules). However, these can be added incrementally. Reviewers: jhenderson, zturner, clayborg, aprantl Subscribers: mgorny, lemo, llvm-commits, lldb-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59482 llvm-svn: 356753	2019-03-22 14:47:26 +00:00
James Henderson	c069d9fd36	[llvm-objcopy]Add coverage for --split-dwo and --output-format Also fix up a couple of minor issues in the test being updated, where FileCheck could match on incorrect output and fix the test case order to match the struct order. Reviewed by: grimar Differential Revision: https://reviews.llvm.org/D59691 llvm-svn: 356746	2019-03-22 12:45:27 +00:00
George Rimar	d822018dbe	Revert r356738 "[llvm-objcopy] - Implement replaceSectionReferences for GroupSection class." Seems this broke ubsan bot: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-ubsan/builds/11760 llvm-svn: 356745	2019-03-22 12:14:04 +00:00
Alex Bradbury	dab1f6fc4e	[RISCV] Add basic RV32E definitions and MC layer support The RISC-V ISA defines RV32E as an alternative "base" instruction set encoding, that differs from RV32I by having only 16 rather than 32 registers. This patch adds basic definitions for RV32E as well as MC layer support (assembling, disassembling) and tests. The only supported ABI on RV32E is ILP32E. Add a new RISCVFeatures::validate() helper to RISCVUtils which can be called from codegen or MC layer libraries to validate the combination of TargetTriple and FeatureBitSet. Other targets have similar checks (e.g. erroring if SPE is enabled on PPC64 or oddspreg + o32 ABI on Mips), but they either duplicate the checks (Mips), or fail to check for both codegen and MC codepaths (PPC). Codegen for the ILP32E ABI support and RV32E codegen are left for a future patch/patches. Differential Revision: https://reviews.llvm.org/D59470 llvm-svn: 356744	2019-03-22 11:21:40 +00:00
Alex Bradbury	b9e78c3994	[RISCV] Optimize emission of SELECT sequences This patch optimizes the emission of a sequence of SELECTs with the same condition, avoiding the insertion of unnecessary control flow. Such a sequence often occurs when a SELECT of values wider than XLEN is legalized into two SELECTs with legal types. We have identified several use cases where the SELECTs could be interleaved with other instructions. Therefore, we extend the sequence to include non-SELECT instructions if we are able to detect that the non-SELECT instructions do not impact the optimization. This patch supersedes https://reviews.llvm.org/D59096, which attempted to address this issue by introducing a new SelectionDAG node. Hat tip to Eli Friedman for his feedback on how to best handle this issue. Differential Revision: https://reviews.llvm.org/D59355 Patch by Luís Marques. llvm-svn: 356741	2019-03-22 10:45:03 +00:00
Alex Bradbury	3369101158	[RISCV] Allow conversion of CC logic to bitwise logic Indicates in the TargetLowering interface that conversions from CC logic to bitwise logic are allowed. Adds tests that show the benefit when optimization opportunities are detected. Also adds tests that show that when the optimization is not applied correct code is generated (but opportunities for other optimizations remain). Differential Revision: https://reviews.llvm.org/D59596 Patch by Luís Marques. llvm-svn: 356740	2019-03-22 10:39:22 +00:00
George Rimar	1ed6a745db	[llvm-objcopy] - Fix a st_name of the first symbol table entry. The spec says about the first symbol table entry that index 0 both designates the first entry in the table and serves as the undefined symbol index. It should have zero value. Hence the first symbol table entry has no name, and so it has to have st_name == 0. (http://refspecs.linuxbase.org/elf/gabi4+/ch4.symtab.html) Currently, we do not emit a zero value for the first symbol table entry. That happens because we add empty strings to the string builder, which for each such case adds a zero byte: (https://github.com/llvm-mirror/llvm/blob/master/lib/MC/StringTableBuilder.cpp#L185) After the string optimization is performed, it might return non-zero indexes for the empty string requested. The patch fixes this issue for the case above and other sections with no names. Differential revision: https://reviews.llvm.org/D59496 llvm-svn: 356739	2019-03-22 10:28:56 +00:00
George Rimar	73e1c4a030	[llvm-objcopy] - Implement replaceSectionReferences for GroupSection class. Currently, llvm-objcopy incorrectly handles compression and decompression of the sections from COMDAT groups, because we do not implement the replaceSectionReferences for this type of the sections. The patch does that. Differential revision: https://reviews.llvm.org/D59638 llvm-svn: 356738	2019-03-22 10:24:37 +00:00
James Henderson	c040d5de25	[llvm-objcopy]Add support for *-freebsd output formats GNU objcopy can support output formats like elf32-i386-freebsd and elf64-x86-64-freebsd. The only difference from their regular non-freebsd counterparts that I have observed is that the freebsd versions set the OS/ABI field to ELFOSABI_FREEBSD. This patch sets the OS/ABI field based on the format whenever --output-format is specified. Reviewed by: rupprecht, grimar Differential Revision: https://reviews.llvm.org/D59645 llvm-svn: 356737	2019-03-22 10:21:09 +00:00
Alex Bradbury	4fdad7e30e	[RISCV][NFC] Add test case to MC/RISCV/linker-relaxation.s showing incorrect relocations being emitted A follow-up patch will fix this case. llvm-svn: 356736	2019-03-22 10:20:21 +00:00
Tim Renouf	033f99a2e5	[AMDGPU] Added v5i32 and v5f32 register classes They are not used by anything yet, but a subsequent commit will start using them for image ops that return 5 dwords. Differential Revision: https://reviews.llvm.org/D58903 Change-Id: I63e1904081e39a6d66e4eb96d51df25ad399d271 llvm-svn: 356735	2019-03-22 10:11:21 +00:00
Alex Bradbury	f8c785bf12	[RISCV][NFC] Expand test/MC/RISCV/linker-relaxation.s tests Add more complete CHECK lines for the relocations generated when relaxation is enabled, and add cases where a locally defined symbol is referenced. Two instances of pcrel_lo(defined_symbol) are commented out, as they will produce an error. A follow-up patch will fix this. llvm-svn: 356734	2019-03-22 06:05:52 +00:00
Craig Topper	b865084ef3	[X86] Add 32-bit command lines with and without SSE2 to atomic-non-integer.ll. NFC llvm-svn: 356733	2019-03-22 04:28:40 +00:00
Yonghong Song	a1ffe2fa49	[BPF] fix flaky btf unit test static-var-derived-type.ll DataSecEntries is defined as an unordered_map since order does not really matter. std::unordered_map<std::string, std::unique_ptr<BTFKindDataSec>> DataSecEntries; This seems to make the test static-var-derived-type.ll flaky, as the two sections ".bss" and ".readonly" have nondeterministic ordering during map iteration, which decides the output assembly code sequence of BTF_KIND_DATASEC entries. Fix the test to have only one data section to remove flakiness. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 356731	2019-03-22 02:54:47 +00:00
Fangrui Song	4597dce483	[DWARF] Refactor RelocVisitor and fix computation of SHT_RELA-typed relocation entries Summary: getRelocatedValue may compute incorrect value for SHT_RELA-typed relocation entries. // DWARFDataExtractor.cpp uint64_t DWARFDataExtractor::getRelocatedValue(uint32_t Size, uint32_t Off, ... // This formula is correct for REL, but may be incorrect for RELA if the value // stored in the location (getUnsigned(Off, Size)) is not zero. return getUnsigned(Off, Size) + Rel->Value; In this patch, we refactor these visit* functions to include a new parameter `uint64_t A`. Since these visit* functions are no longer used as visitors, rename them to resolve. + REL: A is used as the addend. A is the value stored in the location where the relocation applies: getUnsigned(Off, Size) + RELA: The addend encoded in RelocationRef is used, e.g. getELFAddend(R) and add another set of supports* functions to check if a given relocation type is handled. DWARFObjInMemory uses them to fail early. Reviewers: echristo, dblaikie Reviewed By: echristo Subscribers: mgorny, aprantl, aheejin, fedor.sergeev, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57939 llvm-svn: 356729	2019-03-22 02:43:11 +00:00
Yonghong Song	ded9a440d0	[BPF] handle derived type properly for computing type id Currently, the type id for a derived type is computed incorrectly. For example, type #1: int type #2: ptr to #1 For a global variable "int *a", type #1 will be attributed to variable "a". This is due to a bug which assigns the type id of the basetype of that derived type as the derived type's type id. This happens to "const", "volatile", "restrict", "typedef" and "pointer" types. This patch fixed this bug, fixed existing test cases and added a new one focusing on pointers plus other derived types. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 356727	2019-03-22 01:30:50 +00:00
Craig Topper	056b9a995b	[X86] Autogenerate complete checks. NFC llvm-svn: 356723	2019-03-21 23:09:56 +00:00
Amara Emerson	c10b24691a	[AArch64] Split the neon.addp intrinsic into integer and fp variants. This is the result of discussions on the list about how to deal with intrinsics which require codegen to disambiguate them via only the integer/fp overloads. It causes problems for GlobalISel as some of that information is lost during translation, while with other operations like IR instructions the information is encoded into the instruction opcode. This patch changes clang to emit the new faddp intrinsic if the vector operands to the builtin have FP element types. LLVM IR AutoUpgrade has been taught to upgrade existing calls to aarch64.neon.addp with fp vector arguments, and we remove the workarounds introduced for GlobalISel in r355865. This is a more permanent solution to PR40968. Differential Revision: https://reviews.llvm.org/D59655 llvm-svn: 356722	2019-03-21 22:31:37 +00:00
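
Note on 7d676dfd86 ([x86] improve the default expansion of uaddsat/usubsat): the two select-based expansions quoted in that commit message can be written out in scalar form as a rough illustration. This is only a sketch, assuming 32-bit unsigned operands; the function names uaddsat32, usubsat32 and checkSaturatingOps are hypothetical and are not taken from the LLVM sources, where the expansion is done on SelectionDAG nodes rather than in plain C++.

```cpp
#include <cassert>
#include <cstdint>

// uaddsat X, Y --> (X >u (X + Y)) ? -1 : X + Y
// Hypothetical scalar model: unsigned addition wraps modulo 2^32, so a
// wrapped sum is smaller than X, and the result is clamped to all-ones.
inline uint32_t uaddsat32(uint32_t x, uint32_t y) {
  uint32_t sum = x + y;
  return (x > sum) ? UINT32_MAX : sum;
}

// usubsat X, Y --> (X >u Y) ? X - Y : 0
// Hypothetical scalar model: the difference is clamped to zero whenever
// the subtraction would go below zero.
inline uint32_t usubsat32(uint32_t x, uint32_t y) {
  return (x > y) ? x - y : 0u;
}

// A few spot checks of the two formulas.
inline void checkSaturatingOps() {
  assert(uaddsat32(1u, 2u) == 3u);                   // no overflow
  assert(uaddsat32(0xFFFFFFF0u, 0x20u) == UINT32_MAX); // overflow saturates
  assert(usubsat32(5u, 3u) == 2u);                   // no underflow
  assert(usubsat32(3u, 5u) == 0u);                   // underflow saturates
}
```

Neither formula needs an unsigned min/max operation, only a compare and a select, which matches the commit's stated reason for overriding the default umin/umax expansion on targets that lack those operations.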


60278 Commits