llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	3cd3959fe2	GlobalISel: Implement fewerElementsVector for G_BUILD_VECTOR Turn it into a G_CONCAT_VECTORS of G_BUILD_VECTOR. llvm-svn: 374252	2019-10-09 22:44:43 +00:00
Cameron McInally	47363a148f	[IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator Also update Clang to call Builder.CreateFNeg(...) for UnaryMinus. Differential Revision: https://reviews.llvm.org/D61675 llvm-svn: 374240	2019-10-09 21:52:15 +00:00
Wei Mi	09dcfe6805	[SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format Currently for Text, Binary and ExtBinary format profiles, when we compile a module with samplefdo, even if there is no function showing up in the profile, we have to load all the function profiles from the profile input. That is a waste of compile time. CompactBinary format profile has already had the support of loading function profiles on demand. In this patch, we add the support to load profile on demand for ExtBinary format. It will work no matter the sections in ExtBinary format profile are compressed or not. Experiment shows it reduces the time to compile a server benchmark by 30%. When profile remapping and loading function profiles on demand are both used, extra work needs to be done so that the loading on demand process will take the name remapping into consideration. It will be addressed in a follow-up patch. Differential Revision: https://reviews.llvm.org/D68601 llvm-svn: 374233	2019-10-09 21:36:03 +00:00
David Blaikie	411497c6c7	llvm-dwarfdump: Support multiple debug_loclists contributions Also fixing the incorrect "offset" field being computed/printed for each location list. llvm-svn: 374232	2019-10-09 21:25:28 +00:00
Vitaly Buka	2d85fd942a	[System Model] [TTI] Fix virtual destructor warning llvm-svn: 374221	2019-10-09 20:48:52 +00:00
Evandro Menezes	e60415a0db	[Support] Add mathematical constants Add own version of the mathematical constants from the upcoming C++20 `std::numbers`. Differential revision: https://reviews.llvm.org/D68257 llvm-svn: 374207	2019-10-09 19:58:01 +00:00
David Greene	2e6f6b4dad	[System Model] [TTI] Update cache and prefetch TTI interfaces Re-apply 9fdfb045ae8b/r365676 with fixes for PPC and Hexagon. This involved moving defaults from TargetTransformInfoImplBase to MCSubtargetInfo. Rework the TTI cache and software prefetching APIs to prepare for the introduction of a general system model. Changes include: - Marking existing interfaces const and/or override as appropriate - Adding comments - Adding BasicTTIImpl interfaces that delegate to a subtarget implementation - Moving the default TargetTransformInfoImplBase implementation to a default MCSubtarget implementation Only a handful of targets use these interfaces currently: AArch64, Hexagon, PPC and SystemZ. AArch64 already has a custom subtarget implementation, so its custom TTI implementation is migrated to use the new facilities in BasicTTIImpl to invoke its custom subtarget implementation. The custom TTI implementations continue to exist for the other targets with this change. They are not moved over to subtarget-based implementations. The end goal is to have the default subtarget implementation defer to the system model defined by the target. With this change, the default MCSubtargetInfo implementation essentially returns the defaults TargetTransformInfoImplBase used to return. Existing users of TTI defaults will hit the defaults now in MCSubtargetInfo. Targets that define their own custom TTI implementations won't use the BasicTTIImpl implementations that route to the subtarget. Once system models are in place for the targets that use these interfaces, their custom TTI implementations can be removed. Differential Revision: https://reviews.llvm.org/D63614 llvm-svn: 374205	2019-10-09 19:51:48 +00:00
Thomas Lively	3419e90dc1	[WebAssembly] Add builtin and intrinsic for v8x16.swizzle Summary: This clang builtin and corresponding LLVM intrinsic are necessary to expose the exact semantics of the underlying WebAssembly instruction to users. LLVM produces a poison value if the dynamic swizzle indices are greater than the vector size, but the WebAssembly instruction sets the corresponding output lane to zero. Users who depend on this behavior can safely use this builtin. Depends on D68527. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68531 llvm-svn: 374189	2019-10-09 17:45:47 +00:00
Jason Liu	6453f700f2	[AIX][XCOFF][NFC] Change the SectionLen field name of CSect Auxiliary entry to SectionOrLength. Summary: According to the XCOFF documentation, the meaning of x_scnlen depends on the symbol type: for XTY_SD, x_scnlen contains the csect length; for XTY_LD, x_scnlen contains the symbol table index of the containing csect; for XTY_CM, x_scnlen contains the csect length; for XTY_ER, x_scnlen contains 0. Renaming the SectionLen member to SectionOrLength is therefore more reasonable. Authored By: DiggerLin Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D68650 llvm-svn: 374179	2019-10-09 16:19:39 +00:00
Simon Pilgrim	604b7c22be	Fix Wdocumentation unknown parameter warning. NFCI. llvm-svn: 374171	2019-10-09 14:26:09 +00:00
Hans Wennborg	1e1e3ba252	Unify the two CRC implementations David added the JamCRC implementation in r246590. More recently, Eugene added a CRC-32 implementation in r357901, which falls back to zlib's crc32 function if present. These checksums are essentially the same, so having multiple implementations seems unnecessary. This replaces the CRC-32 implementation with the simpler one from JamCRC, and implements the JamCRC interface in terms of CRC-32 since this means it can use zlib's implementation when available, saving a few bytes and potentially making it faster. JamCRC took an ArrayRef<char> argument, and CRC-32 took a StringRef. This patch changes it to ArrayRef<uint8_t> which I think is the best choice, and simplifies a few of the callers nicely. Differential revision: https://reviews.llvm.org/D68570 llvm-svn: 374148	2019-10-09 09:06:30 +00:00
Kristina Brooks	0746aafd89	[TypeSize] Fix module builds (cassert) TypeSize.h uses `assert` statements without including the <cassert> header first which leads to failures in modular builds. llvm-svn: 374138	2019-10-09 04:00:03 +00:00
David Blaikie	5841e9af1d	DebugInfo: Move LLE enum handling to .def to match RLE handling llvm-svn: 374122	2019-10-08 21:48:46 +00:00
Jordan Rose	cb8292274a	Mark several PointerIntPair methods as lvalue-only No point in mutating 'this' if it's just going to be thrown away. https://reviews.llvm.org/D63945 llvm-svn: 374102	2019-10-08 19:01:48 +00:00
Daniel Sanders	4b7cabf1e1	[tblgen] Add getOperatorAsDef() to Record Summary: While working with DagInit's, it's often the case that you expect the operator to be a reference to a def. This patch adds a wrapper for this common case to reduce the amount of boilerplate callers need to duplicate repeatedly. getOperatorAsDef() returns the record if the DagInit has an operator that is a DefInit. Otherwise, it prints a fatal error. There's only a few pre-existing examples in LLVM at the moment and I've left a few instances of the code this simplifies as they had more specific error messages than the generic one this produces. I'm going to be using this a fair bit in my subsequent patches. Reviewers: bogner, volkan, nhaehnle Reviewed By: nhaehnle Subscribers: nhaehnle, hiraditya, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, lenary, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68424 llvm-svn: 374101	2019-10-08 18:41:32 +00:00
Yonghong Song	05e46979d2	[BPF] do compile-once run-everywhere relocation for bitfields A bpf specific clang intrinsic is introduced: u32 __builtin_preserve_field_info(member_access, info_kind) Depending on info_kind, different information will be returned to the program. A relocation is also recorded for this builtin so that bpf loader can patch the instruction on the target host. This clang intrinsic is used to get certain information to facilitate struct/union member relocations. The offset relocation is extended by 4 bytes to include relocation kind. Currently supported relocation kinds are enum { FIELD_BYTE_OFFSET = 0, FIELD_BYTE_SIZE, FIELD_EXISTENCE, FIELD_SIGNEDNESS, FIELD_LSHIFT_U64, FIELD_RSHIFT_U64, }; for __builtin_preserve_field_info. The old access offset relocation is covered by FIELD_BYTE_OFFSET = 0. An example: struct s { int a; int b1:9; int b2:4; }; enum { FIELD_BYTE_OFFSET = 0, FIELD_BYTE_SIZE, FIELD_EXISTENCE, FIELD_SIGNEDNESS, FIELD_LSHIFT_U64, FIELD_RSHIFT_U64, }; void bpf_probe_read(void *, unsigned, const void *); int field_read(struct s *arg) { unsigned long long ull = 0; unsigned offset = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_OFFSET); unsigned size = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_SIZE); #ifdef USE_PROBE_READ bpf_probe_read(&ull, size, (const void *)arg + offset); unsigned lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64); #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ lshift = lshift + (size << 3) - 64; #endif #else switch(size) { case 1: ull = *(unsigned char *)((void *)arg + offset); break; case 2: ull = *(unsigned short *)((void *)arg + offset); break; case 4: ull = *(unsigned int *)((void *)arg + offset); break; case 8: ull = *(unsigned long long *)((void *)arg + offset); break; } unsigned lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64); #endif ull <<= lshift; if (__builtin_preserve_field_info(arg->b2, FIELD_SIGNEDNESS)) return (long long)ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64); return ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64); } There is a minor overhead for bpf_probe_read() on big endian. The code and relocation generated for field_read where bpf_probe_read() is used to access argument data on little endian mode: r3 = r1 r1 = 0 r1 = 4 <=== relocation (FIELD_BYTE_OFFSET) r3 += r1 r1 = r10 r1 += -8 r2 = 4 <=== relocation (FIELD_BYTE_SIZE) call bpf_probe_read r2 = 51 <=== relocation (FIELD_LSHIFT_U64) r1 = *(u64 *)(r10 - 8) r1 <<= r2 r2 = 60 <=== relocation (FIELD_RSHIFT_U64) r0 = r1 r0 >>= r2 r3 = 1 <=== relocation (FIELD_SIGNEDNESS) if r3 == 0 goto LBB0_2 r1 s>>= r2 r0 = r1 LBB0_2: exit Compared to the above code between relocations FIELD_LSHIFT_U64 and FIELD_RSHIFT_U64, the code with big endian mode has four more instructions. r1 = 41 <=== relocation (FIELD_LSHIFT_U64) r6 += r1 r6 += -64 r6 <<= 32 r6 >>= 32 r1 = *(u64 *)(r10 - 8) r1 <<= r6 r2 = 60 <=== relocation (FIELD_RSHIFT_U64) The code and relocation generated when using direct load. r2 = 0 r3 = 4 r4 = 4 if r4 s> 3 goto LBB0_3 if r4 == 1 goto LBB0_5 if r4 == 2 goto LBB0_6 goto LBB0_9 LBB0_6: # %sw.bb1 r1 += r3 r2 = *(u16 *)(r1 + 0) goto LBB0_9 LBB0_3: # %entry if r4 == 4 goto LBB0_7 if r4 == 8 goto LBB0_8 goto LBB0_9 LBB0_8: # %sw.bb9 r1 += r3 r2 = *(u64 *)(r1 + 0) goto LBB0_9 LBB0_5: # %sw.bb r1 += r3 r2 = *(u8 *)(r1 + 0) goto LBB0_9 LBB0_7: # %sw.bb5 r1 += r3 r2 = *(u32 *)(r1 + 0) LBB0_9: # %sw.epilog r1 = 51 r2 <<= r1 r1 = 60 r0 = r2 r0 >>= r1 r3 = 1 if r3 == 0 goto LBB0_11 r2 s>>= r1 r0 = r2 LBB0_11: # %sw.epilog exit Considering that the verifier is able to do limited constant propagation following branches, the following is the code actually traversed. r2 = 0 r3 = 4 <=== relocation r4 = 4 <=== relocation if r4 s> 3 goto LBB0_3 LBB0_3: # %entry if r4 == 4 goto LBB0_7 LBB0_7: # %sw.bb5 r1 += r3 r2 = *(u32 *)(r1 + 0) LBB0_9: # %sw.epilog r1 = 51 <=== relocation r2 <<= r1 r1 = 60 <=== relocation r0 = r2 r0 >>= r1 r3 = 1 if r3 == 0 goto LBB0_11 r2 s>>= r1 r0 = r2 LBB0_11: # %sw.epilog exit For the native load case, the load size is calculated to be the same as the load width LLVM would otherwise use to load the value, which is then used to extract the bitfield value. Differential Revision: https://reviews.llvm.org/D67980 llvm-svn: 374099	2019-10-08 18:23:17 +00:00
Jinsong Ji	9912232b46	Revert "[LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize" Also Revert "[LoopVectorize] Fix non-debug builds after rL374017" This reverts commit `9f41deccc0`. This reverts commit `18b6fe07bc`. The patch is breaking PowerPC internal build, checked with author, reverting on behalf of him for now due to timezone. llvm-svn: 374091	2019-10-08 17:32:56 +00:00
Vedant Kumar	9852699dcb	[CodeExtractor] Factor out and reuse shrinkwrap analysis Factor out CodeExtractor's analysis of allocas (for shrinkwrapping purposes), and allow the analysis to be reused. This resolves a quadratic compile-time bug observed when compiling AMDGPUDisassembler.cpp.o. Pre-patch (Release + LTO clang): ``` ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 176.5278 ( 57.8%) 0.4915 ( 18.5%) 177.0192 ( 57.4%) 177.4112 ( 57.3%) Hot Cold Splitting ``` Post-patch (ReleaseAsserts clang): ``` ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 1.4051 ( 3.3%) 0.0079 ( 0.3%) 1.4129 ( 3.2%) 1.4129 ( 3.2%) Hot Cold Splitting ``` Testing: check-llvm, and comparing the AMDGPUDisassembler.cpp.o binary pre- vs. post-patch. An alternate approach is to hide CodeExtractorAnalysisCache from clients of CodeExtractor, and to recompute the analysis from scratch inside of CodeExtractor::extractCodeRegion(). This eliminates some redundant work in the shrinkwrapping legality check. However, some clients continue to exhibit O(n^2) compile time behavior as computing the analysis is O(n). rdar://55912966 Differential Revision: https://reviews.llvm.org/D68616 llvm-svn: 374089	2019-10-08 17:17:51 +00:00
Nikola Prica	98603a8153	[DebugInfo][If-Converter] Update call site info during the optimization During the If-Converter optimization pay attention when copying or deleting call instructions in order to keep call site information in valid state. Reviewers: aprantl, vsk, efriedma Reviewed By: vsk, efriedma Differential Revision: https://reviews.llvm.org/D66955 llvm-svn: 374068	2019-10-08 15:43:12 +00:00
Hideto Ueno	96e6ce4cd3	[Attributor][MustExec] Deduce dereferenceable and nonnull attribute using MustBeExecutedContextExplorer Summary: In D65186 and related patches, MustBeExecutedContextExplorer is introduced. This enables us to traverse instructions guaranteed to execute from function entry. If we can know the argument is used as `dereferenceable` or `nonnull` in these instructions, we can mark `dereferenceable` or `nonnull` in the argument definition: 1. Memory instruction (similar to D64258) Trace memory instruction pointer operand. Currently, only inbounds GEPs are traced. ``` define i64* @f(i64* %a) { entry: %add.ptr = getelementptr inbounds i64, i64* %a, i64 1 ; (because of inbounds GEP we can know that %a is at least dereferenceable(16)) store i64 1, i64* %add.ptr, align 8 ret i64* %add.ptr ; dereferenceable 8 (because above instruction stores into it) } ``` 2. Propagation from callsite (similar to D27855) If `deref` or `nonnull` are known in call site parameter attributes, we can also say that the argument has that attribute. ``` declare void @use3(i8* %x, i8* %y, i8* %z); declare void @use3nonnull(i8* nonnull %x, i8* nonnull %y, i8* nonnull %z); define void @parent1(i8* %a, i8* %b, i8* %c) { call void @use3nonnull(i8* %b, i8* %c, i8* %a) ; Above instruction is always executed so we can say that @parent1(i8* nonnull %a, i8* nonnull %b, i8* nonnull %c) call void @use3(i8* %c, i8* %a, i8* %b) ret void } ``` Reviewers: jdoerfert, sstefan1, spatel, reames Reviewed By: jdoerfert Subscribers: xbolva00, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65402 llvm-svn: 374063	2019-10-08 15:25:56 +00:00
Cyndy Ishida	fb92ef1e55	Revert [TextAPI] Introduce TBDv4 This reverts r374058 (git commit `5d566c5a46`) llvm-svn: 374062	2019-10-08 15:24:37 +00:00
Cyndy Ishida	5d566c5a46	[TextAPI] Introduce TBDv4 Summary: This format introduces new features and platforms The motivation for this format is to support more than 1 platform since previous versions only supported additional architectures and 1 platform, for example ios + ios-simulator and macCatalyst. Reviewers: ributzka, steven_wu Reviewed By: ributzka Subscribers: mgorny, hiraditya, mgrang, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67529 llvm-svn: 374058	2019-10-08 15:07:36 +00:00
Pavel Labath	6e0b1ce48e	Object/minidump: Add support for the MemoryInfoList stream Summary: This patch adds the definitions of the constants and structures necessary to interpret the MemoryInfoList minidump stream, as well as the object::MinidumpFile interface to access the stream. While the code is fairly simple, there is one important deviation from the other minidump streams, which is worth calling out explicitly. Unlike other "List" streams, the size of the records inside MemoryInfoList stream is not known statically. Instead it is described in the stream header. This makes it impossible to return ArrayRef<MemoryInfo> from the accessor method, as it is done with other streams. Instead, I create an iterator class, which can be parameterized by the runtime size of the structure, and return iterator_range<iterator> instead. Reviewers: amccarth, jhenderson, clayborg Subscribers: JosephTremoulet, zturner, markmentovai, lldb-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68210 llvm-svn: 374051	2019-10-08 14:15:32 +00:00
Sebastian Pop	d0d52edae9	fix fmls fp16 Tim Northover remarked that the added patterns for fmls fp16 produce wrong code in case the fsub instruction has a multiplication as its first operand, i.e., all the patterns FMLSv*_OP1: > define <8 x half> @test_FMLSv8f16_OP1(<8 x half> %a, <8 x half> %b, <8 x half> %c) { > ; CHECK-LABEL: test_FMLSv8f16_OP1: > ; CHECK: fmls {{v[0-9]+}}.8h, {{v[0-9]+}}.8h, {{v[0-9]+}}.8h > entry: > > %mul = fmul fast <8 x half> %c, %b > %sub = fsub fast <8 x half> %mul, %a > ret <8 x half> %sub > } > > This doesn't look right to me. The exact instruction produced is "fmls > v0.8h, v2.8h, v1.8h", which I think calculates "v0 - v2*v1", but the > IR is calculating "v2*v1-v0". The equivalent <4 x float> code also > doesn't emit an fmls. This patch generates an fmla and negates the value of the operand2 of the fsub. Inspecting the pattern match, I found that there was another mistake in the opcode to be selected: matching FMULv4*16 should generate FMLSv4*16 and not FMLSv2*32. Tested on aarch64-linux with make check-all. Differential Revision: https://reviews.llvm.org/D67990 llvm-svn: 374044	2019-10-08 13:23:57 +00:00
Graham Hunter	b302561b76	[SVE][IR] Scalable Vector size queries and IR instruction support * Adds a TypeSize struct to represent the known minimum size of a type along with a flag to indicate that the runtime size is an integer multiple of that size * Converts existing size query functions from Type.h and DataLayout.h to return a TypeSize result * Adds convenience methods (including a transparent conversion operator to uint64_t) so that most existing code 'just works' as if the return values were still scalars. * Uses the new size queries along with ElementCount to ensure that all supported instructions used with scalable vectors can be constructed in IR. Reviewers: hfinkel, lattner, rkruppe, greened, rovka, rengolin, sdesmalen Reviewed By: rovka, sdesmalen Differential Revision: https://reviews.llvm.org/D53137 llvm-svn: 374042	2019-10-08 12:53:54 +00:00
Andrea Di Biagio	8d6651f7b1	[MCA][LSUnit] Track loads and stores until retirement. Before this patch, loads and stores were only tracked by their corresponding queues in the LSUnit from dispatch until execute stage. In practice we should be more conservative and assume that memory opcodes leave their queues at retirement stage. Basically, loads should leave the load queue only when they have completed and delivered their data. We conservatively assume that a load is completed when it is retired. Stores should be tracked by the store queue from dispatch until retirement. In practice, stores can only leave the store queue if their data can be written to the data cache. This is mostly a mechanical change. With this patch, the retire stage notifies the LSUnit when a memory instruction is retired. That triggers the release of LDQ/STQ entries. The only visible change is in memory tests for the bdver2 model. That is because bdver2 is the only model that defines the load/store queue size. This patch partially addresses PR39830. Differential Revision: https://reviews.llvm.org/D68266 llvm-svn: 374034	2019-10-08 10:46:01 +00:00
Zi Xuan Wu	9f41deccc0	[LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize In loop-vectorize, the interleave count and vector factor depend on the target register number. Currently, it does not estimate register pressure separately for different register classes (especially for scalar types, where float should not be counted together with int), so it is not accurate. Specifically, it causes too much interleaving/unrolling, resulting in too many register spills in the loop body and hurting performance. So we need to classify the register classes at the IR level; importantly, these are abstract register classes, not the target register classes the backend provides in the td file. They are used to establish the mapping between the types of IR values and the number of simultaneous live ranges to which we'd like to limit some set of those types. For example, on the POWER target, the register count is special when VSX is enabled. When VSX is enabled, the number of int scalar registers is 32 (GPR) and float is 64 (VSR), but for int and float vector registers both are 64 (VSR). So there should be 2 kinds of register class when VSX is enabled, and 3 kinds of register class when VSX is NOT enabled. On the POWER target, it yields a large (~30%) performance improvement in one specific benchmark (503.bwaves_r) of SPEC2017 and no other obvious regressions. Differential revision: https://reviews.llvm.org/D67148 llvm-svn: 374017	2019-10-08 03:28:33 +00:00
Chen Zheng	9806a1d5f9	[ConstantRange] [NFC] replace addWithNoSignedWrap with addWithNoWrap. llvm-svn: 374016	2019-10-08 03:00:31 +00:00
Johannes Doerfert	766f2cc1a4	[Attributor] Use local linkage instead of internal Local linkage is internal or private, and private is a specialization of internal, so either is fine for all our "local linkage" queries. llvm-svn: 373986	2019-10-07 23:21:52 +00:00
Johannes Doerfert	661db04b98	[Attributor] Use abstract call sites for call site callback Summary: When we iterate over uses of functions and expect them to be call sites, we now use abstract call sites to allow callback calls. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, hfinkel, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67871 llvm-svn: 373985	2019-10-07 23:14:58 +00:00
Reid Kleckner	f9b67b810e	[X86] Add new calling convention that guarantees tail call optimization When the target option GuaranteedTailCallOpt is specified, calls with the fastcc calling convention will be transformed into tail calls if they are in tail position. This diff adds a new calling convention, tailcc, currently supported only on X86, which behaves the same way as fastcc, except that the GuaranteedTailCallOpt flag does not need to be enabled in order to enable tail call optimization. Patch by Dwight Guth <dwight.guth@runtimeverification.com>! Reviewed By: lebedev.ri, paquette, rnk Differential Revision: https://reviews.llvm.org/D67855 llvm-svn: 373976	2019-10-07 22:28:58 +00:00
Johannes Doerfert	ee33c61e34	[Attributor][FIX] Remove assertion that is wrong for invalid IRPositions llvm-svn: 373972	2019-10-07 21:48:08 +00:00
Cameron McInally	60786f9143	[llvm-c] Add UnaryOperator to LLVM_FOR_EACH_VALUE_SUBCLASS macro Note that we are not sure where the tests for these functions live. This was discussed in the Phab Diff. Differential Revision: https://reviews.llvm.org/D68588 llvm-svn: 373969	2019-10-07 21:33:39 +00:00
Johannes Doerfert	1097fab1cf	[Attributor] Deduce memory behavior of functions and arguments Deduce the memory behavior, aka "read-none", "read-only", or "write-only", for functions and arguments. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67384 llvm-svn: 373965	2019-10-07 21:07:57 +00:00
Cameron McInally	46d317fad4	[Bitcode] Update naming of UNOP_NEG to UNOP_FNEG Differential Revision: https://reviews.llvm.org/D68588 llvm-svn: 373958	2019-10-07 20:41:25 +00:00
Jonas Devlieghere	61446a1421	[AccelTable] Remove stale comment (NFC) rdar://55857228 llvm-svn: 373956	2019-10-07 20:33:20 +00:00
Matt Arsenault	4bcdcad91b	GlobalISel: Partially implement lower for G_INSERT llvm-svn: 373946	2019-10-07 19:13:27 +00:00
Matt Arsenault	27269054d2	GlobalISel: Add target pre-isel instructions Allows targets to introduce regbankselectable pseudo-instructions. Currently the closest feature to this is an intrinsic. However this requires creating a public intrinsic declaration. This litters the public intrinsic namespace with operations we don't necessarily want to expose to IR producers, and would rather leave as private to the backend. Use a new instruction bit. A previous attempt tried to keep using enum value ranges, but it turned into a mess. llvm-svn: 373937	2019-10-07 18:43:29 +00:00
Jordan Rose	fdaa742174	Second attempt to add iterator_range::empty() Doing this makes MSVC complain that `empty(someRange)` could refer to either C++17's std::empty or LLVM's llvm::empty, which previously we avoided via SFINAE because std::empty is defined in terms of an empty member rather than begin and end. So, switch callers over to the new method as it is added. https://reviews.llvm.org/D68439 llvm-svn: 373935	2019-10-07 18:14:24 +00:00
Erich Keane	8a410bcef0	Fix Calling Convention through aliases r369697 changed the behavior of stripPointerCasts to no longer include aliases. However, the code in CGDeclCXX.cpp's createAtExitStub counted on looking through aliases to properly set the calling convention of a call. The result of the change was that a call with a calling convention mismatch would be replaced with an llvm.trap, causing a runtime crash. Differential Revision: https://reviews.llvm.org/D68584 llvm-svn: 373929	2019-10-07 17:28:03 +00:00
Wei Mi	283df8cf74	Fix build errors caused by rL373914. llvm-svn: 373919	2019-10-07 16:45:47 +00:00
Wei Mi	b523790ae1	[SampleFDO] Add compression support for any section in ExtBinary profile format Previously ExtBinary profile format only supports compression using zlib for profile symbol list. In this patch, we extend the compression support to any section. User can select some or all of the sections to compress. In an experiment, for a 45M profile in ExtBinary format, compressing name table reduced its size to 24M, and compressing all the sections reduced its size to 11M. Differential Revision: https://reviews.llvm.org/D68253 llvm-svn: 373914	2019-10-07 16:12:37 +00:00
whitequark	b63db94fa5	[LLVM-C] Add bindings to create macro debug info Summary: The C API doesn't have the bindings to create macro debug information. Reviewers: whitequark, CodaFi, deadalnix Reviewed By: whitequark Subscribers: aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58334 llvm-svn: 373903	2019-10-07 13:57:13 +00:00
Kevin P. Neal	1c3d19c82d	[FPEnv] Add constrained intrinsics for lrint and lround Earlier in the year intrinsics for lrint, llrint, lround and llround were added to llvm. The constrained versions are now implemented here. Reviewed by: andrew.w.kaylor, craig.topper, cameron.mcinally Approved by: craig.topper Differential Revision: https://reviews.llvm.org/D64746 llvm-svn: 373900	2019-10-07 13:20:00 +00:00
Martin Storsjo	dfc1aee25b	Revert "[SLP] avoid reduction transform on patterns that the backend can load-combine" This reverts SVN r373833, as it caused a failed assert "Non-zero loop cost expected" on building numerous projects, see PR43582 for details and reproduction samples. llvm-svn: 373882	2019-10-07 08:21:37 +00:00
Simon Pilgrim	b4ba3cbda0	[X86][AVX] Access a scalar float/double as a free extract from a broadcast load (PR43217) If a fp scalar is loaded and then used as both a scalar and a vector broadcast, perform the load as a broadcast and then extract the scalar for 'free' from the 0th element. This involved switching the order of the X86ISD::BROADCAST combines so we only convert to X86ISD::BROADCAST_LOAD once all other canonicalizations have been attempted. Adds a DAGCombinerInfo::recursivelyDeleteUnusedNodes wrapper. Fixes PR43217 Differential Revision: https://reviews.llvm.org/D68544 llvm-svn: 373871	2019-10-06 21:11:45 +00:00
Matt Arsenault	a5b9c75674	GlobalISel: Partially implement lower for G_EXTRACT Turn into shift and truncate. Doesn't yet handle pointers. llvm-svn: 373838	2019-10-06 01:37:35 +00:00
Sanjay Patel	e2321bb448	[SLP] avoid reduction transform on patterns that the backend can load-combine I don't see an ideal solution to these 2 related, potentially large, perf regressions: https://bugs.llvm.org/show_bug.cgi?id=42708 https://bugs.llvm.org/show_bug.cgi?id=43146 We decided that load combining was unsuitable for IR because it could obscure other optimizations in IR. So we removed the LoadCombiner pass and deferred to the backend. Therefore, preventing SLP from destroying load combine opportunities requires that it recognizes patterns that could be combined later, but not do the optimization itself ( it's not a vector combine anyway, so it's probably out-of-scope for SLP). Here, we add a scalar cost model adjustment with a conservative pattern match and cost summation for a multi-instruction sequence that can probably be reduced later. This should prevent SLP from creating a vector reduction unless that sequence is extremely cheap. In the x86 tests shown (and discussed in more detail in the bug reports), SDAG combining will produce a single instruction on these tests like: movbe rax, qword ptr [rdi] or: mov rax, qword ptr [rdi] Not some (half) vector monstrosity as we currently do using SLP: vpmovzxbq ymm0, dword ptr [rdi + 1] # ymm0 = mem[0],zero,zero,.. vpsllvq ymm0, ymm0, ymmword ptr [rip + .LCPI0_0] movzx eax, byte ptr [rdi] movzx ecx, byte ptr [rdi + 5] shl rcx, 40 movzx edx, byte ptr [rdi + 6] shl rdx, 48 or rdx, rcx movzx ecx, byte ptr [rdi + 7] shl rcx, 56 or rcx, rdx or rcx, rax vextracti128 xmm1, ymm0, 1 vpor xmm0, xmm0, xmm1 vpshufd xmm1, xmm0, 78 # xmm1 = xmm0[2,3,0,1] vpor xmm0, xmm0, xmm1 vmovq rax, xmm0 or rax, rcx vzeroupper ret Differential Revision: https://reviews.llvm.org/D67841 llvm-svn: 373833	2019-10-05 18:03:58 +00:00
Mehdi Amini	482f4d9aa9	Expose ProvidePositionalOption as a public API The motivation is to reuse the key value parsing logic here to parse instance specific pass options within the context of MLIR. The primary functionality exposed is the "," splitting for arrays and the logic for properly handling duplicate definitions of a single flag. Patch by: Parker Schuh <parkers@google.com> Differential Revision: https://reviews.llvm.org/D68294 llvm-svn: 373815	2019-10-05 01:37:04 +00:00
Aditya Kumar	6a2673605e	Invalidate assumption cache before outlining. Subscribers: llvm-commits Tags: #llvm Reviewers: compnerd, vsk, sebpop, fhahn, tejohnson Reviewed by: vsk Differential Revision: https://reviews.llvm.org/D68478 llvm-svn: 373807	2019-10-04 22:46:42 +00:00
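
Note on 1e1e3ba252 (Unify the two CRC implementations): the message says the JamCRC and CRC-32 checksums are "essentially the same". The following is a minimal Python sketch of that relationship, not LLVM code. It assumes the usual definitions, where JamCRC is CRC-32 without the final bit inversion, so one is the bitwise complement of the other. The helper name `jamcrc` is hypothetical.

```python
import zlib


def jamcrc(data: bytes) -> int:
    """Hypothetical helper: JamCRC computed from zlib's CRC-32.

    Assumption: JamCRC equals CRC-32 without the final XOR with
    0xFFFFFFFF, so complementing zlib's result recovers it.
    """
    return ~zlib.crc32(data) & 0xFFFFFFFF


# "123456789" is the conventional check input for CRC catalogues.
check_input = b"123456789"
print(hex(zlib.crc32(check_input)))
print(hex(jamcrc(check_input)))
```

Because the two values differ only by a complement, one implementation can be expressed in terms of the other, which is the direction the commit describes (JamCRC on top of CRC-32, so zlib can be used when available).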
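
Note on 05e46979d2 ([BPF] relocation for bitfields): the example in that message extracts `b2` by shifting the loaded word left by FIELD_LSHIFT_U64 and then right by FIELD_RSHIFT_U64. The sketch below models that arithmetic in Python for the little-endian values quoted in the message (lshift 51, rshift 60 for `int b2:4` following `int b1:9`). The function names and the bit layout used by `pack_word` are illustrative assumptions, not part of the patch.

```python
MASK64 = (1 << 64) - 1


def extract_bitfield(word: int, lshift: int, rshift: int, signed: bool) -> int:
    """Model `ull <<= lshift; ull >>= rshift` on a 64-bit register."""
    ull = (word << lshift) & MASK64  # left shift discards bits above the field
    if signed and (ull >> 63):
        ull -= 1 << 64  # reinterpret as a negative 64-bit value
    # Python's >> on a negative int is an arithmetic shift, matching s>>=.
    return ull >> rshift


def pack_word(b1: int, b2: int) -> int:
    """Assumed little-endian layout: b1 in bits 0..8, b2 in bits 9..12."""
    return (b1 & 0x1FF) | ((b2 & 0xF) << 9)


word = pack_word(b1=0x1FF, b2=-3)
print(extract_bitfield(word, 51, 60, signed=True))
print(extract_bitfield(word, 51, 60, signed=False))
```

The left shift moves the top bit of the field to bit 63 (63 - 12 = 51), and the right shift by 64 - 4 = 60 leaves only the four field bits, sign-extended when the field is signed.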
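
Note on 6e0b1ce48e (MemoryInfoList stream): the message explains that the record size is read from the stream header at run time, so the accessor returns an iterator range rather than a fixed-type array. A rough Python sketch of that idea follows. The header layout (two 32-bit fields for header size and entry size, then a 64-bit entry count) is an assumption made for illustration, and `iter_records` is a hypothetical name rather than the LLVM interface.

```python
import struct
from typing import Iterator

# Assumed header layout: SizeOfHeader (u32), SizeOfEntry (u32),
# NumberOfEntries (u64), all little-endian.
HEADER = struct.Struct("<IIQ")


def iter_records(stream: bytes) -> Iterator[bytes]:
    """Yield each record, using the entry size stored in the header."""
    size_of_header, size_of_entry, count = HEADER.unpack_from(stream, 0)
    if size_of_header < HEADER.size or size_of_entry == 0:
        raise ValueError("malformed header")
    if size_of_header + size_of_entry * count > len(stream):
        raise ValueError("stream too short for declared entries")
    for i in range(count):
        start = size_of_header + i * size_of_entry
        yield stream[start:start + size_of_entry]


# Three 6-byte records; the reader never hard-codes the record size.
demo = HEADER.pack(HEADER.size, 6, 3) + b"aaaaaabbbbbbcccccc"
print(list(iter_records(demo)))
```

Stepping by the header-declared size lets a reader skip over records that are larger than the structure it knows about, which is the reason the commit gives for not returning a statically sized array.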
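
Note on e2321bb448 ([SLP] avoid reduction transform on load-combine patterns): the pattern the cost model tries to protect is a chain of byte loads, shifts, and ORs that the backend can later fold into one wide load. The Python sketch below only illustrates why the two forms are equivalent for a little-endian 64-bit value; the function names are hypothetical and nothing here reflects the SLP implementation.

```python
def load_by_bytes(buf: bytes, offset: int = 0) -> int:
    """Assemble a 64-bit value from eight byte loads, shifts and ORs."""
    value = 0
    for i in range(8):
        value |= buf[offset + i] << (8 * i)
    return value


def load_wide(buf: bytes, offset: int = 0) -> int:
    """The same value read as a single little-endian 64-bit load."""
    return int.from_bytes(buf[offset:offset + 8], "little")


data = bytes(range(1, 9))
print(hex(load_by_bytes(data)), hex(load_wide(data)))
```

Reversing the shift amounts gives the big-endian variant, which corresponds to the `movbe` form mentioned in the commit message.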


38273 Commits