llvm-project

Commit Graph

Author	SHA1	Message	Date
Tim Northover	097a3e3d95	ARM: deduplicate hard-float detection code. NFC. ARMSubtarget had a copy/pasted block to determine whether the target was hard-float, but it just delegated to triple features anyway so it's better at the TargetMachine level. llvm-svn: 337384	2018-07-18 12:36:25 +00:00
Sander de Smalen	330d887d72	[AArch64][SVE] Asm: Support for unpredicated FP operations. This patch adds support for the following unpredicated floating-point instructions: FADD Floating point add FSUB Floating point subtract FMUL Floating point multiplication FTSMUL Floating point trigonometric starting value FRECPS Floating point reciprocal step FRSQRTS Floating point reciprocal square root step The instructions have the following assembly format: fadd z0.h, z1.h, z2.h and have variants for 16, 32 and 64-bit FP elements. llvm-svn: 337383	2018-07-18 11:59:12 +00:00
George Rimar	7d8e632e98	[ELF] - Stop silently producing a broken .eh_frame_hdr. Currently, getFdePC() returns uint64_t. It's because the following encodings might use 8 bytes: DW_EH_PE_absptr and DW_EH_PE_udata8. But the caller assigns the returned value to a uint32_t field: https://github.com/llvm-mirror/lld/blob/master/ELF/SyntheticSections.cpp#L508 The value is used for building the .eh_frame_hdr section. We use DW_EH_PE_sdata4 encoding for building it at this moment: https://github.com/llvm-mirror/lld/blob/master/ELF/SyntheticSections.cpp#L2545 And that means that an overflow issue might happen if DW_EH_PE_absptr/DW_EH_PE_udata8 address encodings are present in .eh_frame. In that case, before this patch, we would silently truncate the address and produce a broken .eh_frame_hdr section. It would not be hard to support real 64-bit values for DW_EH_PE_absptr/DW_EH_PE_udata8 encodings, but it is unclear if it is useful and if we should do it. Since nobody faced/reported it, in this patch I only implement a check to stop producing broken output silently for now. llvm-svn: 337382	2018-07-18 11:56:53 +00:00
Nico Weber	1dbff9a406	Mention clang-cl improvements from r335466 and r336379 in ReleaseNotes.rst llvm-svn: 337381	2018-07-18 11:55:03 +00:00
Andrea Di Biagio	cd8f627c37	[TargetInstPredicate] Add definition of CheckInvalidRegisterOperand. This should have been part of r337378. I forgot to svn add it before committing the change. llvm-svn: 337380	2018-07-18 11:16:31 +00:00
Max Kazantsev	6b12506200	[NFC] Make a test more neat llvm-svn: 337379	2018-07-18 11:03:40 +00:00
Andrea Di Biagio	9a2e9db712	[Tablegen][PredicateExpander] Add the ability to define checks for invalid registers. This was discussed in review D49436. llvm-svn: 337378	2018-07-18 11:03:22 +00:00
George Rimar	a1bb8f7a0c	[ELF] - Add a test case to check DW_EH_PE_absptr address encoding. This covers the following line of the code: https://github.com/llvm-mirror/lld/blob/master/ELF/SyntheticSections.cpp#L525 llvm-svn: 337377	2018-07-18 11:02:37 +00:00
Roman Lebedev	3cb87e905c	[InstCombine] Re-commit: Fold 'check for [no] signed truncation' pattern Summary: [[ https://bugs.llvm.org/show_bug.cgi?id=38149 \| PR38149 ]] As discussed in https://reviews.llvm.org/D49179#1158957 and later, the IR for 'check for [no] signed truncation' pattern can be improved: https://rise4fun.com/Alive/gBf ^ that pattern will be produced by Implicit Integer Truncation sanitizer, https://reviews.llvm.org/D48958 https://bugs.llvm.org/show_bug.cgi?id=21530 in signed case, therefore it is probably a good idea to improve it. The DAGCombine will reverse this transform, see https://reviews.llvm.org/D49266 This transform is surprisingly frustrating. This does not deal with non-splat shift amounts, or with undef shift amounts. I've outlined what i think the solution should be: ``` // Potential handling of non-splats: for each element: // * if both are undef, replace with constant 0. // Because (1<<0) is OK and is 1, and ((1<<0)>>1) is also OK and is 0. // * if both are not undef, and are different, bailout. // * else, only one is undef, then pick the non-undef one. ``` This is a re-commit, as the original patch, committed in rL337190 was reverted in rL337344 as it broke chromium build: https://bugs.llvm.org/show_bug.cgi?id=38204 and https://crbug.com/864832 Proofs that the fixed folds are ok: https://rise4fun.com/Alive/VYM Differential Revision: https://reviews.llvm.org/D49320 llvm-svn: 337376	2018-07-18 10:55:17 +00:00
Simon Pilgrim	21813140f6	[X86][SSE] Add extra scalar fop + blend tests for commuted inputs While working on PR38197, I noticed that we don't make use of FADD/FMUL being able to commute the inputs to support the addps+movss -> addss style combine llvm-svn: 337375	2018-07-18 10:54:13 +00:00
George Rimar	c51b81de9c	[ELF] - Improve eh-frame-value-format7.s test case. This adds .eh_frame_hdr content checking to test that DW_EH_PE_udata2 address was decoded correctly. llvm-svn: 337374	2018-07-18 10:42:10 +00:00
Daniel Cederman	959c8bf51c	Revert "[Sparc] Use the IntPair reg class for r constraints with value type f64" This reverts commit 55222c9183c6e07f53a54c4061677734f54feac1. I missed that this patch has a dependency on https://reviews.llvm.org/D49219 that has not been approved yet. llvm-svn: 337373	2018-07-18 10:05:30 +00:00
Sander de Smalen	ccdc7ebc1d	[AArch64][SVE] Asm: Support for UDOT/SDOT instructions. The signed/unsigned DOT instructions perform a dot-product on quadtuplets from two source vectors and accumulate the result in the destination register. The instructions come in two forms: Vector form, e.g. sdot z0.s, z1.b, z2.b - signed dot product on four 8-bit quad-tuplets, accumulating results in 32-bit elements. udot z0.d, z1.h, z2.h - unsigned dot product on four 16-bit quad-tuplets, accumulating results in 64-bit elements. Indexed form, e.g. sdot z0.s, z1.b, z2.b[3] - signed dot product on four 8-bit quad-tuplets with specified quadtuplet from second source vector, accumulating results in 32-bit elements. udot z0.d, z1.h, z2.h[1] - dot product on four 16-bit quad-tuplets with specified quadtuplet from second source vector, accumulating results in 64-bit elements. llvm-svn: 337372	2018-07-18 09:37:51 +00:00
George Rimar	c1090da852	[llvm-objdump] - An attempt to fix BB after r337361. Seems r337361 is the reason of the following ARM BB failures: http://lab.llvm.org:8011/builders/clang-cmake-armv8-quick http://lab.llvm.org:8011/builders/clang-cmake-armv8-full/builds/4633 Reason is unclear to me, other bots are OK. If this will not help, I'll revert r337361. llvm-svn: 337371	2018-07-18 09:25:36 +00:00
Daniel Cederman	4e38df18ea	[Sparc] Use the IntPair reg class for r constraints with value type f64 Summary: This is how it appears to be handled in GCC and it prevents a "Unknown mismatch" error in the SelectionDAGBuilder. Reviewers: venkatra, jyknight, jrtc27 Reviewed By: jyknight, jrtc27 Subscribers: eraman, fedor.sergeev, jrtc27, llvm-commits Differential Revision: https://reviews.llvm.org/D49218 llvm-svn: 337370	2018-07-18 09:25:33 +00:00
Sander de Smalen	889fe81ce5	[AArch64][SVE] Asm: Integer divide instructions. This patch adds the following predicated instructions: UDIV Unsigned divide active elements UDIVR Unsigned divide active elements, reverse form. SDIV Signed divide active elements SDIVR Signed divide active elements, reverse form. e.g. udiv z0.s, p0/m, z0.s, z1.s (unsigned divide active elements in z0 by z1, store result in z0) sdivr z0.s, p0/m, z0.s, z1.s (signed divide active elements in z1 by z0, store result in z0) llvm-svn: 337369	2018-07-18 09:17:29 +00:00
Simon Pilgrim	e2c615dca1	Fix -Wdocumentation warning. NFCI. llvm-svn: 337368	2018-07-18 09:10:18 +00:00
Simon Pilgrim	3cb3056fca	Fix -Wdocumentation warning. NFCI. llvm-svn: 337367	2018-07-18 09:07:54 +00:00
Philip Pfaffe	3b8b3c2a2c	[CMake] Export the LLVM_LINK_LLVM_DYLIB setting Summary: When building out-of-tree tools, there are several macros available to automate linking against llvm. An example is `add_llvm_executable`, or the clang variant of this. These macros use the LLVM_LINK_LLVM_DYLIB option to decide whether to link against libraries defined by setting LLVM_LINK_COMPONENTS or to link against libLLVM instead. Currently this is problematic in out-of-tree targets, because they cannot identify whether this option is required or even available. If the option was enabled in LLVM's own build, the clang libraries are built against libLLVM, so a client linking against those must link against it too. On the other hand the client can't just always link against it, because it might not be available. This is related to D44391, but that change assumed the client knew whether they wanted the dylib or not. Reviewers: mgorny, beanz, labath Reviewed By: mgorny Subscribers: bollu, llvm-commits Differential Revision: https://reviews.llvm.org/D49193 llvm-svn: 337366	2018-07-18 08:53:31 +00:00
George Rimar	2566c0a2f2	[ELF] - Improve relocatable-many-sections.s test case. NFC. This adds a check for .shstrtab section index. llvm-svn: 337365	2018-07-18 08:52:09 +00:00
Roman Lebedev	3404d4dd41	[NFC][InstCombine] i65 tests for 'check for [no] signed truncation' pattern Those initially broke chromium build: https://bugs.llvm.org/show_bug.cgi?id=38204 and https://crbug.com/864832 llvm-svn: 337364	2018-07-18 08:49:51 +00:00
George Rimar	b9f3ea3e1c	[ELF] - Do not produce broken output when amount of sections is > ~65k This is a part of https://bugs.llvm.org/show_bug.cgi?id=38119 We produce broken ELF header now when the number of output sections is >= SHN_LORESERVE (0xff00). ELF spec says (http://www.sco.com/developers/gabi/2003-12-17/ch4.eheader.html): e_shnum: If the number of sections is greater than or equal to SHN_LORESERVE (0xff00), this member has the value zero and the actual number of section header table entries is contained in the sh_size field of the section header at index 0. (Otherwise, the sh_size member of the initial entry contains 0.) e_shstrndx: If the section name string table section index is greater than or equal to SHN_LORESERVE (0xff00), this member has the value SHN_XINDEX (0xffff) and the actual index of the section name string table section is contained in the sh_link field of the section header at index 0. (Otherwise, the sh_link member of the initial entry contains 0.) We did not set these fields correctly earlier. The patch fixes the issue. Differential revision: https://reviews.llvm.org/D49371 llvm-svn: 337363	2018-07-18 08:44:38 +00:00
George Rimar	9958f620f9	[ELF] - Add a test case for DW_EH_PE_udata2 encoding. This adds a test to check LLD can handle such address format correctly. Test case covers the following line: https://github.com/llvm-mirror/lld/blob/master/ELF/SyntheticSections.cpp#L519 llvm-svn: 337362	2018-07-18 08:39:31 +00:00
George Rimar	e35e6448f9	[llvm-objdump] - Stop reporting bogus section IDs. Imagine we have a file with a few sections, and one of them is .foo with index N != 0. The problem is that when llvm-objdump is given a -section=.foo parameter it lists .foo as a section at index 0. That makes it impossible to write test cases which need to find the index of a particular section while ignoring the dumping of others. The patch fixes that. Differential revision: https://reviews.llvm.org/D49372 llvm-svn: 337361	2018-07-18 08:34:35 +00:00
George Rimar	6fdac3b23a	[llvm-readobj] - Teach tool to dump objects with >= SHN_LORESERVE of sections. http://www.sco.com/developers/gabi/2003-12-17/ch4.eheader.html says that e_shnum and/or e_shstrndx may have special values if "the number of sections is greater than or equal to SHN_LORESERVE" or "the section name string table section index is greater than or equal to SHN_LORESERVE (0xff00)" Previously llvm-readobj was unable to dump such files, patch changes that. I had to add a precompiled test case because it does not seem possible to prepare a test using yaml2obj or llvm-mc (not clear how to make .shstrtab to have index >= SHN_LORESERVE). Differential revision: https://reviews.llvm.org/D49369 llvm-svn: 337360	2018-07-18 08:19:58 +00:00
Roman Lebedev	ad50ae82ad	Revert test changes part of "Revert "[InstCombine] Fold 'check for [no] signed truncation' pattern"" We want the test to remain good anyway. I think the fix is incoming. This reverts part of commit rL337344. llvm-svn: 337359	2018-07-18 08:15:13 +00:00
Sander de Smalen	ac0cb5bf75	[AArch64][SVE] Asm: Support for integer MUL instructions. This patch adds the following instructions: MUL - multiply vectors, e.g. mul z0.h, p0/m, z0.h, z1.h - multiply with immediate, e.g. mul z0.h, z0.h, #127 SMULH - signed multiply returning high half, e.g. smulh z0.h, p0/m, z0.h, z1.h UMULH - unsigned multiply returning high half, e.g. umulh z0.h, p0/m, z0.h, z1.h llvm-svn: 337358	2018-07-18 08:10:03 +00:00
Craig Topper	92ea7a7b48	[X86] Enable commuting of VUNPCKHPD to VMOVLHPS to enable load folding by using VMOVLPS with a modified address. This required an annoying amount of tablegen multiclass changes to make only VUNPCKHPDZ128rr commutable. llvm-svn: 337357	2018-07-18 07:31:32 +00:00
Craig Topper	a2bbfa21ce	[X86] Add test case for missed opportunity to commute vunpckhpd to enable use of vmovlps to fold a load. We do this transform for SSE, but not AVX or AVX512VL. llvm-svn: 337356	2018-07-18 07:31:30 +00:00
Joachim Protze	bb869f42b7	[libomptarget] Also support several images for elf In revision r336569 (D49036) libomptarget support for multiple nvidia images has been fixed in case a target region resides inside one or multiple libraries and in the compiled application. But the issue is still present for elf images. This fix will also support multiple images for elf. Patch by Jannis Klinkenberg Reviewers: protze.joachim, ABataev, grokos Reviewed By: protze.joachim, ABataev, grokos Subscribers: openmp-commits Differential Revision: https://reviews.llvm.org/D49418 llvm-svn: 337355	2018-07-18 07:23:46 +00:00
Craig Topper	a9ec6a11d8	[X86] Regenerate fma.ll checks using current version of the script which produces different regular expressions on spills and reloads. NFC llvm-svn: 337354	2018-07-18 07:08:28 +00:00
Vassil Vassilev	33eb7297d2	[modules] Print input files when -module-file-info file switch is passed. This patch improves traceability of duplicated header files which end up in multiple pcms. Differential Revision: https://reviews.llvm.org/D47118 llvm-svn: 337353	2018-07-18 06:49:33 +00:00
Martin Storsjo	17c0f721b9	[AArch64] Define TARGET_HEADER_BUILTIN Without it, the new intrinsics became available for all language variants. This was missed in SVN r337327. llvm-svn: 337352	2018-07-18 06:15:09 +00:00
Hiroshi Inoue	cd83d459bc	[NFC] fix trivial typos in comments llvm-svn: 337351	2018-07-18 06:04:43 +00:00
Justin Hibbits	22e939a15b	Fix build failures from r337347, found by clang * Delete a no-longer-used override, and mark the other getRegisterTypeForCallingConv() as override. * SPE only supports i32, not i64, as the internal type, so simply remove the type check, so that DestReg and Opc are provably always set. GCC 6.4 did not warn about either of the above. llvm-svn: 337350	2018-07-18 05:19:25 +00:00
Craig Topper	95063a45b8	[X86] Remove patterns that mix X86ISD::MOVLHPS/MOVHLPS with v2i64/v2f64 types. The X86ISD::MOVLHPS/MOVHLPS should now only be emitted in SSE1 only. This means that the v2i64/v2f64 types would be illegal thus we don't need these patterns. llvm-svn: 337349	2018-07-18 05:10:53 +00:00
Craig Topper	1425e10cc6	[X86] Generate v2f64 X86ISD::UNPCKL/UNPCKH instead of X86ISD::MOVLHPS/MOVHLPS for unary v2f64 {0,0} and {1,1} shuffles with SSE2. I'm trying to restrict the MOVLHPS/MOVHLPS ISD nodes to SSE1 only. With SSE2 we can use unpcks. I believe this will allow some patterns to be cleaned up to require fewer bitcasts. I've put in an odd isel hack to still select MOVHLPS instruction from the unpckh node to avoid changing tests and because movhlps is a shorter encoding. Ideally we'd do execution domain switching on this, but the operands are in the wrong order and are tied. We might be able to try a commute in the domain switching using custom code. We already support domain switching for UNPCKLPD and MOVLHPS. llvm-svn: 337348	2018-07-18 05:10:51 +00:00
Justin Hibbits	d52990c71b	Introduce codegen for the Signal Processing Engine Summary: The Signal Processing Engine (SPE) is found on NXP/Freescale e500v1, e500v2, and several e200 cores. This adds support targeting the e500v2, as this is more common than the e500v1, and is in SoCs still on the market. This patch is very intrusive because the SPE is binary incompatible with the traditional FPU. After discussing with others, the cleanest solution was to make both SPE and FPU features on top of a base PowerPC subset, so all FPU instructions are now wrapped with HasFPU predicates. Supported by this are: * Code generation following the SPE ABI at the LLVM IR level (calling conventions) * Single- and Double-precision math at the level supported by the APU. Still to do: * Vector operations * SPE intrinsics As this changes the Callee-saved register list order, one test, which tests the precise generated code, was updated to account for the new register order. Reviewed by: nemanjai Differential Revision: https://reviews.llvm.org/D44830 llvm-svn: 337347	2018-07-18 04:25:10 +00:00
Justin Hibbits	4fa4fa6a73	Complete the SPE instruction set patterns This is the lead-up to having SPE codegen. Add the rest of the instructions, along with MC tests. Differential Revision: https://reviews.llvm.org/D44829 llvm-svn: 337346	2018-07-18 04:24:57 +00:00
Justin Hibbits	ceb3cd96f7	Add PowerPC e500(v2) core scheduler and directives. Differential Revision: https://reviews.llvm.org/D44828 llvm-svn: 337345	2018-07-18 04:24:49 +00:00
Bob Haarman	4ebe5d59b6	Revert "[InstCombine] Fold 'check for [no] signed truncation' pattern" This reverts r337190 (and a few follow-up commits), which caused the Chromium build to fail. See https://bugs.llvm.org/show_bug.cgi?id=38204 and https://crbug.com/864832 llvm-svn: 337344	2018-07-18 02:18:28 +00:00
Dean Michael Berris	4719c52455	[XRay][compiler-rt] Segmented Array: Simplify and Optimise Summary: This is a follow-on to D49217 which simplifies and optimises the implementation of the segmented array. In this patch we co-locate the book-keeping for segments in the `__xray::Array<T>` with the data it's managing. We take the chance in this patch to actually rename `Chunk` to `Segment` to better align with the high-level description of the segmented array. With measurements using benchmarks landed in D48879, we've identified that calls to `pthread_getspecific` started dominating the cycles, which led us to revert the change made in D49217 to use C++ thread_local initialisation instead (it reduces the cost by a huge margin, since we save one PLT-based call to pthread functions in the hot path). In particular, this is in `__xray::getThreadLocalData()`. We also took the opportunity to remove the least-common-multiple based calculation and instead pack as much data into segments of the array. This greatly simplifies the API of the container which hides as much of the implementation details as possible. For instance, we calculate the number of elements we need for the each segment internally in the Array instead of making it part of the type. With the changes here, we're able to get a measurable improvement on the performance of profiling mode on top of what D48879 already provides. Depends on D48879. Reviewers: kpw, eizan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49363 llvm-svn: 337343	2018-07-18 02:08:39 +00:00
Dean Michael Berris	9d6b7a5f2b	[XRay][compiler-rt] Simplify Allocator Implementation Summary: This change simplifies the XRay Allocator implementation to self-manage an mmap'ed memory segment instead of using the internal allocator implementation in sanitizer_common. We've found through benchmarks and profiling these benchmarks in D48879 that using the internal allocator in sanitizer_common introduces a bottleneck on allocating memory through a central spinlock. This change allows thread-local allocators to eliminate contention on the centralized allocator. To get the most benefit from this approach, we also use a managed allocator for the chunk elements used by the segmented array implementation. This gives us the chance to amortize the cost of allocating memory when creating these internal segmented array data structures. We also took the opportunity to remove the preallocation argument from the allocator API, simplifying the usage of the allocator throughout the profiling implementation. In this change we also tweak some of the flag values to reduce the amount of maximum memory we use/need for each thread, when requesting memory through mmap. Depends on D48956. Reviewers: kpw, eizan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49217 llvm-svn: 337342	2018-07-18 01:53:39 +00:00
Dean Michael Berris	1e3feb49e3	[XRay][compiler-rt] FDR Mode: Allow multiple runs Summary: Fix a bug in FDR mode which didn't allow for re-initialising the logging in the same process. This change ensures that: - When we flush the FDR mode logging, that the state of the logging implementation is `XRAY_LOG_UNINITIALIZED`. - Fix up the thread-local initialisation to use aligned storage and `pthread_getspecific` as well as `pthread_setspecific` for the thread-specific data. - Actually use the pointer provided to the thread-exit cleanup handling, instead of assuming that the thread has thread-local data associated with it, and reaching at thread-exit time. In this change we also have an explicit test for two consecutive sessions for FDR mode tracing, and ensuring both sessions succeed. Reviewers: kpw, eizan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49359 llvm-svn: 337341	2018-07-18 01:31:30 +00:00
Sterling Augustine	2526104ee4	Workaround warning bug in old versions of gcc. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56480 llvm-svn: 337340	2018-07-18 00:33:25 +00:00
Peter Collingbourne	14b468bab6	Re-land r337333, "Teach Clang to emit address-significance tables.", which was reverted in r337336. The problem that required a revert was fixed in r337338. Also added a missing "REQUIRES: x86-registered-target" to one of the tests. Original commit message: > Teach Clang to emit address-significance tables. > > By default, we emit an address-significance table on all ELF > targets when the integrated assembler is enabled. The emission of an > address-significance table can be controlled with the -faddrsig and > -fno-addrsig flags. > > Differential Revision: https://reviews.llvm.org/D48155 llvm-svn: 337339	2018-07-18 00:27:07 +00:00
Peter Collingbourne	fc50498ced	CodeGen: Don't create address significance table entries for thread-local variables. The presence of these symbols in the symbol table can cause symbol type mismatch errors (or undefined symbol errors on emulated TLS targets) and they can't be ICF'd anyway. llvm-svn: 337338	2018-07-18 00:21:40 +00:00
Puyan Lotfi	0f5d5fae93	[NFC][llvm-objcopy] Cleanup namespace usage in llvm-objcopy. Nest any classes not used outside of a file into anon. Nest any classes used across files in llvm-objcopy into namespace llvm::objcopy. Differential Revision: https://reviews.llvm.org/D49449 llvm-svn: 337337	2018-07-18 00:10:51 +00:00
Peter Collingbourne	35c6996b68	Revert r337333, "Teach Clang to emit address-significance tables." Causing multiple failures on sanitizer bots due to TLS symbol errors, e.g. /usr/bin/ld: __msan_origin_tls: TLS definition in /home/buildbots/ppc64be-clang-test/clang-ppc64be/stage1/lib/clang/7.0.0/lib/linux/libclang_rt.msan-powerpc64.a(msan.cc.o) section .tbss.__msan_origin_tls mismatches non-TLS reference in /tmp/lit_tmp_0a71tA/mallinfo-3ca75e.o llvm-svn: 337336	2018-07-17 23:56:30 +00:00
Jason Molenda	c6ab25deef	Link the lldb driver ("lldb") against the llvm static libraries because of the new prettystackprinter dependency. llvm-svn: 337335	2018-07-17 23:44:09 +00:00
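
The ELF rule quoted in r337363 and r337360 (how `e_shnum` and `e_shstrndx` escape to section header 0 once values reach `SHN_LORESERVE`) can be sketched as follows. This is only an illustration of the gABI wording quoted in those commit messages, not the code from either patch: the struct `SectionCountFields` and the function `encodeSectionCounts` are hypothetical names, and only the two constants come from the quoted spec text.

```cpp
#include <cstdint>

// Constants as quoted from the ELF gABI in the commit messages.
constexpr uint32_t SHN_LORESERVE = 0xff00;
constexpr uint32_t SHN_XINDEX = 0xffff;

// Hypothetical holder for the four fields the rule touches.
struct SectionCountFields {
  uint16_t e_shnum;    // ELF header: number of section headers
  uint16_t e_shstrndx; // ELF header: index of the section name string table
  uint64_t sh_size0;   // sh_size of the section header at index 0
  uint32_t sh_link0;   // sh_link of the section header at index 0
};

// Hypothetical helper: apply the escape rule to a section count and a
// section name string table index.
SectionCountFields encodeSectionCounts(uint64_t numSections,
                                       uint32_t shstrtabIndex) {
  SectionCountFields f{};

  if (numSections >= SHN_LORESERVE) {
    // Too large for the 16-bit field: store zero and put the real
    // count into sh_size of section header 0.
    f.e_shnum = 0;
    f.sh_size0 = numSections;
  } else {
    f.e_shnum = static_cast<uint16_t>(numSections);
    f.sh_size0 = 0;
  }

  if (shstrtabIndex >= SHN_LORESERVE) {
    // Same idea for the string table index: store SHN_XINDEX and put
    // the real index into sh_link of section header 0.
    f.e_shstrndx = static_cast<uint16_t>(SHN_XINDEX);
    f.sh_link0 = shstrtabIndex;
  } else {
    f.e_shstrndx = static_cast<uint16_t>(shstrtabIndex);
    f.sh_link0 = 0;
  }

  return f;
}
```

A reader such as llvm-readobj would presumably apply the inverse: when `e_shnum` is zero it consults `sh_size` of section header 0, and when `e_shstrndx` equals `SHN_XINDEX` it consults `sh_link` of section header 0.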


294573 Commits, All Branches
