llvm-project

Commit Graph

Author	SHA1	Message	Date
Krzysztof Parzyszek	e704583f23	[Hexagon] Cache loads to select to avoid traversing mutating DAG llvm-svn: 321034	2017-12-18 23:13:27 +00:00
Evandro Menezes	687df6380e	[AArch64] Expand test coverage of vector element shuffling to Exynos Make sure that all test cases are run for Exynos as well. Otherwise, NFC. llvm-svn: 321032	2017-12-18 22:17:39 +00:00
Bob Haarman	ea5ff9fa6b	Fix buffer overrun in WindowsResourceCOFFWriter::writeSymbolTable() Summary: We were using sprintf(..., "$R%06X", <some uint32_t>) to create strings that are expected to be exactly length 8, but this results in longer strings if the uint32_t is greater than 0xffffff. This change modifies the behavior as follows: - Uses the loop counter instead of the data offset. This gives us sequential symbol names, avoiding collisions as much as possible. - Masks the value to 0xffffff to avoid generating names longer than 8 bytes. - Uses formatv instead of sprintf. Fixes PR35581. Reviewers: ruiu, zturner Reviewed By: ruiu Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D41270 llvm-svn: 321030	2017-12-18 22:10:14 +00:00
Reid Kleckner	8f3c351aa3	Add test for .req directive starting with 'p' Reduced test case from libjpeg_turbo. llvm-svn: 321029	2017-12-18 22:01:18 +00:00
Craig Topper	4802d4e23e	[X86] Don't use NOPL when the assembler is passed an empty CPU string. Update tests to force a CPU with NOPL Empty string should be equivalent to "generic" which doesn't allow NOPL. Force tests to specify 'pentiumpro' to guarantee NOPL. Fixes PR35686 llvm-svn: 321026	2017-12-18 21:37:27 +00:00
Reid Kleckner	37517a2ddd	Revert "[AArch64][SVE] Asm" changes, they broke libjpeg_turbo This reverts changes r320992, r320986, r320973, and r320970. r320970 by itself breaks the test case, and the rest depend on it. Test case will land soon. llvm-svn: 321024	2017-12-18 20:58:25 +00:00
Ivan A. Kosarev	a80c79b5bf	[Analysis] Generate more precise TBAA tags when one access encloses the other There are cases when two tags with different base types denote accesses to the same direct or indirect member of a structure type. Currently, merging of such tags results in a tag that represents an access to an object that has the type of that member. This patch changes this so that if one of the accesses encloses the other, then the generic tag is the one of the enclosed access. Differential Revision: https://reviews.llvm.org/D39557 llvm-svn: 321019	2017-12-18 20:05:20 +00:00
Teresa Johnson	915897e21b	[PGO] Fix handling of cold entry count for instrumented PGO Summary: In r277849, getEntryCount was changed to return None when the entry count was 0, specifically for SamplePGO where it means no samples were recorded. However, for instrumentation PGO a 0 entry count should be returned directly, since it does mean that the function was completely cold. Otherwise we end up treating these functions conservatively in isFunctionEntryCold() and isColdBB(). Instead, for SamplePGO use -1 when there are no samples, and change getEntryCount to return None when the value is -1. Reviewers: danielcdh, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41307 llvm-svn: 321018	2017-12-18 20:02:43 +00:00
Quentin Colombet	ec76d9c47f	[TableGen][GlobalISel] Optimize MatchTable for faster instruction selection * Context * Prior to this patch, the table generated for matching instructions was straightforward but highly inefficient. Basically, each pattern generates its own set of self-contained checks and actions. E.g., TableGen generated: // First pattern CheckNumOperand 3 CheckOpcode G_ADD ... Build ADDrr // Second pattern CheckNumOperand 3 CheckOpcode G_ADD ... Build ADDri // Third pattern CheckNumOperand 3 CheckOpcode G_SUB ... Build SUBrr * Problem * Because of that generation, a lot of checks were redundant between patterns and were checked every single time until we reached the pattern that matches. E.g., taking the previous table, let's say we are matching a G_SUB; that means we were going to check all the rules for G_ADD before looking at the G_SUB rule. In particular we are going to do: check 3 operands; PASS check G_ADD; FAIL ; Next rule check 3 operands; PASS (but we already knew that!) check G_ADD; FAIL (well it is still not true) ; Next rule check 3 operands; PASS (really!!) check G_SUB; PASS (at last :P) * Proposed Solution * This patch introduces a concept of group of rules (GroupMatcher) that share some predicates and only get checked once for the whole group. This patch only creates groups with one nesting level. Conceptually there is nothing preventing us from having deeper nesting levels. However, the current implementation is not smart enough to share the recording (aka capturing) of values. That limits its ability to do more sharing. For the given example the current patch will generate: // First group CheckOpcode G_ADD // First pattern CheckNumOperand 3 ... Build ADDrr // Second pattern CheckNumOperand 3 ... Build ADDri // Second group CheckOpcode G_SUB // Third pattern CheckNumOperand 3 ... Build SUBrr But if we allowed several nesting levels, it could create a subgroup for the CheckNumOperand 3. (We would need to call optimizeRules on the rules within a group.) * Result * With only one level of nesting, the instruction selection pass is up to 4x faster. For instance, one instruction now takes 500 checks, instead of 24k! With more nesting we could get in the tens I believe. Differential Revision: https://reviews.llvm.org/D39034 rdar://problem/34670699 llvm-svn: 321017	2017-12-18 19:47:41 +00:00
Dimitry Andric	e4f5d01033	Fix more inconsistent line endings. NFC. llvm-svn: 321016	2017-12-18 19:46:56 +00:00
Jessica Paquette	02c124d644	[MachineOutliner] Recommit r320229 LR was undefined entering outlined functions that contain calls. This made the machine verifier unhappy when expensive checks were enabled. This fixes that. llvm-svn: 321014	2017-12-18 19:33:21 +00:00
Benjamin Kramer	efc7c88ea8	[PPC] Also disable the pre-emit version of reg+reg to reg+imm transformation. This has the same issue as the early pass disabled in r321010. llvm-svn: 321013	2017-12-18 19:21:56 +00:00
Paul Robinson	a06f8dcca6	Recommit "[DWARFv5] Dump an MD5 checksum in the line-table header." Adds missing support for DW_FORM_data16. Update of r320852/r320886, fixing the unittest again, this time use a raw char string for the test data. Differential Revision: https://reviews.llvm.org/D41090 llvm-svn: 321011	2017-12-18 19:08:35 +00:00
Krzysztof Parzyszek	6b589e593d	[Hexagon] Generate HVX code for vector sign-, zero- and any-extends Implement any-extend as zero-extend. llvm-svn: 321004	2017-12-18 18:32:27 +00:00
Simon Pilgrim	f947137ed0	[X86] Regenerate test to improve codegen testing for D41350 llvm-svn: 321003	2017-12-18 18:31:02 +00:00
Teresa Johnson	9ecaaff251	[ThinLTO] Make distributed indexes test more robust Modify test so that it passes in the reverse-iteration bot. We use DenseMap instead of std::map for the summaries to emit into distributed index files. The iteration order is not defined, but it is deterministic, which is good enough. llvm-svn: 321000	2017-12-18 18:00:32 +00:00
Xinliang David Li	19fb5b467b	[PGO] add MST min edge selection heuristic to ensure non-zero entry count Differential Revision: http://reviews.llvm.org/D41059 llvm-svn: 320998	2017-12-18 17:56:19 +00:00
Francis Visoiu Mistrih	b213b27ee3	[YAML] Add support for non-printable characters LLVM IR function names which disable mangling start with '\01' (https://www.llvm.org/docs/LangRef.html#identifiers). When an identifier like "\01@abc@" gets dumped to MIR, it is quoted, but only with single quotes. http://www.yaml.org/spec/1.2/spec.html#id2770814: "The allowed character range explicitly excludes the C0 control block #x0-#x1F (except for TAB #x9, LF #xA, and CR #xD which are allowed), DEL #x7F, the C1 control block #x80-#x9F (except for NEL #x85 which is allowed), the surrogate block #xD800-#xDFFF, #xFFFE, and #xFFFF." http://www.yaml.org/spec/1.2/spec.html#id2776092: "All non-printable characters must be escaped. [...] Note that escape sequences are only interpreted in double-quoted scalars." This patch adds support for printing escaped non-printable characters between double quotes if needed. Should also fix PR31743. Differential Revision: https://reviews.llvm.org/D41290 llvm-svn: 320996	2017-12-18 17:38:03 +00:00
Sander de Smalen	09f56a54d0	[AArch64][SVE] Asm: Improve diagnostics further when +sve is not specified Summary: Patch [4/4] in a series to add parsing of predicates and properly parse SVE ZIP1/ZIP2 instructions. This patch further improves diagnostic messages for when the SVE feature is not specified. Reviewers: rengolin, fhahn, olista01, echristo, efriedma Reviewed By: fhahn Subscribers: sdardis, aemerson, javed.absar, tschuett, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D40363 llvm-svn: 320992	2017-12-18 16:48:53 +00:00
Simon Dardis	fd8c65e868	Reland "[mips] Fix the target specific instruction verifier" Fix an off by one error in the bounds checking for 'dinsu' and update the ranges in the test comments so that they are accurate. This version has the correct commit message. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D41183 llvm-svn: 320991	2017-12-18 15:56:40 +00:00
Sean Fertile	5fb624a3b8	[Memcpy Loop Lowering] Remove the fixed int8 lowering. Switch over to the lowering that uses target supplied operand types. Differential Revision: https://reviews.llvm.org/D41201 llvm-svn: 320989	2017-12-18 15:31:14 +00:00
Sander de Smalen	190979189a	[TableGen][AsmMatcherEmitter] Only choose specific diagnostic for enabled instruction Summary: When emitting a diagnostic for an invalid operand, a specific diagnostic should only be reported when the instruction being matched is actually enabled by the feature flags. Patch [3/4] in a series to add parsing of predicates and properly parse SVE ZIP1/ZIP2 instructions. This patch fixes bogus diagnostic messages for when the SVE feature is not specified. Reviewers: rengolin, craig.topper, olista01, sdardis, stoklund Reviewed By: olista01, sdardis Subscribers: fhahn, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D40362 llvm-svn: 320986	2017-12-18 14:34:24 +00:00
Max Kazantsev	1acab00229	[LVI] Support for ashr in LVI Enhance LVI to analyze the ‘ashr’ binary operation. This leverages the infrastructure in ConstantRange for the ashr operation. Patch by Surya Kumari Jangala! Differential Revision: https://reviews.llvm.org/D40886 llvm-svn: 320983	2017-12-18 14:23:30 +00:00
Simon Dardis	f70af977af	Revert "[mips] Fix the target specific instruction verifier" This reverts commit r320974. The commit message lacked the Differential Revision: line. llvm-svn: 320975	2017-12-18 12:30:34 +00:00
Simon Dardis	c3c0d4590b	[mips] Fix the target specific instruction verifier Fix an off by one error in the bounds checking for 'dinsu' and update the ranges in the test comments so that they are accurate. Reviewers: atanasyan https://reviews.llvm.org/D41183 llvm-svn: 320974	2017-12-18 12:24:17 +00:00
Sander de Smalen	fce0c1c45b	[AArch64][SVE] Asm: Add ZIP1/ZIP2 instructions (predicate/data vectors) Summary: Patch [2/4] in a series to add parsing of predicates and properly parse SVE ZIP1/ZIP2 instructions. Reviewers: rengolin, kristof.beyls, fhahn, mcrosier, evandro Reviewed By: fhahn Subscribers: aemerson, javed.absar, llvm-commits, tschuett Differential Revision: https://reviews.llvm.org/D40361 llvm-svn: 320973	2017-12-18 11:29:59 +00:00
Sander de Smalen	ce1e0975f4	[AArch64][SVE] Asm: Add SVE predicate register definitions and parsing support Summary: Patch [1/4] in a series to add parsing of predicates and properly parse SVE ZIP1/ZIP2 instructions. Reviewers: rengolin, kristof.beyls, fhahn, mcrosier, evandro, echristo, efriedma Reviewed By: fhahn Subscribers: aemerson, javed.absar, llvm-commits, tschuett Differential Revision: https://reviews.llvm.org/D40360 llvm-svn: 320970	2017-12-18 11:26:34 +00:00
Tim Northover	9097a07e4e	AArch64: work around how Cyclone handles "movi.2d vD, #0". For Cyclone, the instruction "movi.2d vD, #0" is executed incorrectly in some rare circumstances. Work around the issue conservatively by avoiding the instruction entirely. This patch changes CodeGen so that problematic instructions are never generated, and the AsmParser so that an equivalent instruction is used (with a warning). llvm-svn: 320965	2017-12-18 10:36:00 +00:00
Igor Laevsky	7bd3fb15e1	[TargetLibraryInfo] Discard library functions with incorrectly sized integers Differential Revision: https://reviews.llvm.org/D41184 llvm-svn: 320964	2017-12-18 10:31:58 +00:00
Sam Parker	fd967f2f7a	[ARM] Adjust test checks Correct the CHECK-LABELS of a couple of dag combine tests. llvm-svn: 320963	2017-12-18 10:08:03 +00:00
Sam Parker	00804efd72	[DAGCombine] Move AND nodes to multiple load leaves Search from AND nodes to find whether they can be propagated back to loads, so that the AND and load can be combined into a narrow load. We search through OR, XOR and other AND nodes and all bar one of the leaves are required to be loads or constants. The exception node then needs to be masked off, meaning that the 'and' isn't removed, but the load(s) are still narrowed. Differential Revision: https://reviews.llvm.org/D41177 llvm-svn: 320962	2017-12-18 10:04:27 +00:00
Craig Topper	7034d401f8	[X86] Use mattr instead of mcpu in some of the cost model tests. Based on the names of the check lines, features seem more appropriate than cpu. Spotted while prototyping my patch to make 512-bit vectors illegal on SKX sometimes. llvm-svn: 320959	2017-12-18 07:21:58 +00:00
Hiroshi Inoue	c6faf15459	[SROA] Disable non-whole-alloca splits by default This patch introduces a switch to control splitting of non-whole-alloca slices, with default off. The switch will be default on again after fixing an issue reported in PR35657. llvm-svn: 320958	2017-12-18 06:47:37 +00:00
Serguei Katkov	b0b67a8d38	[CGP] Fix the handling of select inst in complex addressing mode When we put the value in the select placeholder, we must pass it through the simplification tracker, because the value might already be simplified and erased. This is a fix for PR35658. Reviewers: john.brawn, uabelho Reviewed By: john.brawn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41251 llvm-svn: 320956	2017-12-18 04:25:07 +00:00
Sanjay Patel	9da049fa8a	[x86] add tests for finite libcall lowering (PR35672); NFC llvm-svn: 320955	2017-12-18 00:38:45 +00:00
Bjorn Steinbrink	3603de2fa2	Re-commit "Properly handle multi-element and dynamically sized allocas in getPointerDereferenceableBytes()" llvm-clang-x86_64-expensive-checks-win is still broken, so the failure seems unrelated. llvm-svn: 320953	2017-12-17 21:20:16 +00:00
Craig Topper	255a76d6d1	[X86] Add test cases that show cases where buildvector of extract and inserts should be turned into fmsubadd. This is a follow up to the fmaddsub support added in r320950. Hopefully in the future we can fix lowering to handle this fmsubadd too. llvm-svn: 320951	2017-12-17 18:31:36 +00:00
Craig Topper	fd8d040820	[X86] Make the code that creates fmaddsub from build_vector of extracts and inserts functional and add tests. Summary: We had no tests for this and we couldn't do the optimization because of a bad use count check. We need to know how many non-undef pieces of the build vector were filled in and ensure our use count is equal to that. But on the shuffle combine version we need the use count to be 2. The missing coverage was noticed during the review of D40335. Reviewers: RKSimon, zvi, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41133 llvm-svn: 320950	2017-12-17 18:23:45 +00:00
Simon Pilgrim	406d04a916	[X86] Regenerate truncated rotation tests + add missing 32-bit checks llvm-svn: 320949	2017-12-17 18:20:42 +00:00
Bjorn Steinbrink	6f7bbf349f	Revert "Properly handle multi-element and dynamically sized allocas in getPointerDereferenceableBytes()" This reverts commit 217067d5179882de9deb60d2e866befea4c126e7. Fails on llvm-clang-x86_64-expensive-checks-win llvm-svn: 320945	2017-12-17 15:16:58 +00:00
Bjorn Steinbrink	c27f81b92b	Properly handle byval arguments in getPointerDereferenceableBytes() Summary: For byval arguments, the number of dereferenceable bytes is equal to the size of the pointee, not the pointer. Reviewers: hfinkel, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41305 llvm-svn: 320939	2017-12-17 02:37:42 +00:00
Bjorn Steinbrink	5d86532467	Properly handle multi-element and dynamically sized allocas in getPointerDereferenceableBytes() Reviewers: hfinkel, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41288 llvm-svn: 320938	2017-12-17 01:54:25 +00:00
Craig Topper	ee1e71e576	[X86] Use extract_vector_elt instead of X86ISD::VEXTRACT for isel of vXi1 extractions. llvm-svn: 320937	2017-12-17 01:35:48 +00:00
Craig Topper	c0c2d19e08	[X86] Canonicalize extract_vector_elt from vXi1 to always return MVT::i32. This allows us to remove some isel patterns that allowed MVT::i8 result type. llvm-svn: 320936	2017-12-17 01:35:47 +00:00
Simon Pilgrim	4c9e8215e9	[X86][AVX] lowerVectorShuffleAsBroadcast - aggressively peek through BITCASTs Assuming we can safely adjust the broadcast index for the new type to keep it suitably aligned, then peek through BITCASTs when looking for the broadcast source. Fixes PR32007 llvm-svn: 320933	2017-12-16 23:32:18 +00:00
Simon Pilgrim	f3b6da00f5	[X86][AVX] Fix failed broadcast fold Strip excess BITCASTs from EXTRACT_SUBVECTOR input llvm-svn: 320930	2017-12-16 22:57:17 +00:00
Sean Fertile	68d7f9da76	[Memcpy Loop Lowering] Only calculate residual size/bytes copied when needed. If the loop operand type is int8 then there will be no residual loop for the unknown size expansion. Don't create the residual-size and bytes-copied values when they are not needed. llvm-svn: 320929	2017-12-16 22:41:39 +00:00
Craig Topper	1260a4e826	[X86] When using vpopcntdq for ctpop of v8i16 vectors, only promote to v8i32. Previously we promoted to v8i64, but we don't need to go all the way to 512-bits. If we have VLX we can use the 256-bit instruction. And even if we don't have VLX we can widen v8i32 to v16i32 and drop the upper half. llvm-svn: 320926	2017-12-16 19:31:36 +00:00
Simon Pilgrim	5f022d278b	[InstCombine] Regenerate FMUL/FMA combine tests with update_test_checks.py llvm-svn: 320922	2017-12-16 17:18:15 +00:00
Sanjay Patel	5a0cdac174	[InstCombine] canonicalize shifty abs(): ashr+add+xor --> cmp+neg+sel We want to do this for 2 reasons: 1. Value tracking does not recognize the ashr variant, so it would fail to match for cases like D39766. 2. DAGCombiner does better at producing optimal codegen when we have the cmp+sel pattern. More detail about what happens in the backend: 1. DAGCombiner has a generic transform for all targets to convert the scalar cmp+sel variant of abs into the shift variant. That is the opposite of this IR canonicalization. 2. DAGCombiner has a generic transform for all targets to convert the vector cmp+sel variant of abs into either an ABS node or the shift variant. That is again the opposite of this IR canonicalization. 3. DAGCombiner has a generic transform for all targets to convert the exact shift variants produced by #1 or #2 into an ISD::ABS node. Note: It would be an efficiency improvement if we had #1 go directly to an ABS node when that's legal/custom. 4. The pattern matching above is incomplete, so it is possible to escape the intended/optimal codegen in a variety of ways. a. For #2, the vector path is missing the case for setlt with a '1' constant. b. For #3, we are missing a match for commuted versions of the shift variants. 5. Therefore, this IR canonicalization can only help get us to the optimal codegen. The version of cmp+sel produced by this patch will be recognized in the DAG and converted to an ABS node when possible or the shift sequence when not. 6. In the following examples with this patch applied, we may get conditional moves rather than the shift produced by the generic DAGCombiner transforms. The conditional move is created using a target-specific decision for any given target. Whether it is optimal or not for a particular subtarget may be up for debate. 
define i32 @abs_shifty(i32 %x) { %signbit = ashr i32 %x, 31 %add = add i32 %signbit, %x %abs = xor i32 %signbit, %add ret i32 %abs } define i32 @abs_cmpsubsel(i32 %x) { %cmp = icmp slt i32 %x, zeroinitializer %sub = sub i32 zeroinitializer, %x %abs = select i1 %cmp, i32 %sub, i32 %x ret i32 %abs } define <4 x i32> @abs_shifty_vec(<4 x i32> %x) { %signbit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31> %add = add <4 x i32> %signbit, %x %abs = xor <4 x i32> %signbit, %add ret <4 x i32> %abs } define <4 x i32> @abs_cmpsubsel_vec(<4 x i32> %x) { %cmp = icmp slt <4 x i32> %x, zeroinitializer %sub = sub <4 x i32> zeroinitializer, %x %abs = select <4 x i1> %cmp, <4 x i32> %sub, <4 x i32> %x ret <4 x i32> %abs } > $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=x86_64 -mattr=avx > abs_shifty: > movl %edi, %eax > negl %eax > cmovll %edi, %eax > retq > > abs_cmpsubsel: > movl %edi, %eax > negl %eax > cmovll %edi, %eax > retq > > abs_shifty_vec: > vpabsd %xmm0, %xmm0 > retq > > abs_cmpsubsel_vec: > vpabsd %xmm0, %xmm0 > retq > > $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=aarch64 > abs_shifty: > cmp w0, #0 // =0 > cneg w0, w0, mi > ret > > abs_cmpsubsel: > cmp w0, #0 // =0 > cneg w0, w0, mi > ret > > abs_shifty_vec: > abs v0.4s, v0.4s > ret > > abs_cmpsubsel_vec: > abs v0.4s, v0.4s > ret > > $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=powerpc64le > abs_shifty: > srawi 4, 3, 31 > add 3, 3, 4 > xor 3, 3, 4 > blr > > abs_cmpsubsel: > srawi 4, 3, 31 > add 3, 3, 4 > xor 3, 3, 4 > blr > > abs_shifty_vec: > vspltisw 3, -16 > vspltisw 4, 15 > vsubuwm 3, 4, 3 > vsraw 3, 2, 3 > vadduwm 2, 2, 3 > xxlxor 34, 34, 35 > blr > > abs_cmpsubsel_vec: > vspltisw 3, -16 > vspltisw 4, 15 > vsubuwm 3, 4, 3 > vsraw 3, 2, 3 > vadduwm 2, 2, 3 > xxlxor 34, 34, 35 > blr > Differential Revision: https://reviews.llvm.org/D40984 llvm-svn: 320921	2017-12-16 16:41:17 +00:00
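Illustrative note on r320921 (not part of the commit): the two scalar IR functions in that message, abs_shifty and abs_cmpsubsel, compute the same value for every 32-bit input. Below is a minimal C++ sketch of that equivalence. The function names mirror the IR examples, but the C++ bodies and the check_samples helper are assumptions made here for illustration, not code from LLVM. The sketch uses unsigned arithmetic to mimic the wrapping behavior of LLVM's plain add and sub, and it assumes that right-shifting a negative signed integer is an arithmetic shift (guaranteed since C++20, and the behavior of mainstream compilers before that).

```cpp
#include <cassert>
#include <cstdint>

// Shift form: ashr + add + xor, as in @abs_shifty.
// Unsigned arithmetic mirrors the wrapping semantics of LLVM's plain 'add'.
std::int32_t abs_shifty(std::int32_t x) {
    // Assumes an arithmetic right shift: yields 0 for x >= 0, all ones for x < 0.
    const std::uint32_t signbit = static_cast<std::uint32_t>(x >> 31);
    const std::uint32_t sum = signbit + static_cast<std::uint32_t>(x);
    return static_cast<std::int32_t>(signbit ^ sum);
}

// Compare form: icmp slt + sub + select, as in @abs_cmpsubsel.
std::int32_t abs_cmpsubsel(std::int32_t x) {
    // 0 - x computed in unsigned arithmetic so INT32_MIN wraps instead of overflowing.
    const std::uint32_t neg = 0u - static_cast<std::uint32_t>(x);
    return x < 0 ? static_cast<std::int32_t>(neg) : x;
}

// Spot-check that both forms agree, including the INT32_MIN wraparound case.
void check_samples() {
    const std::int32_t samples[] = {0, 1, -1, 7, -5, INT32_MAX, INT32_MIN};
    for (std::int32_t x : samples) {
        assert(abs_shifty(x) == abs_cmpsubsel(x));
    }
}
```

Both forms return INT32_MIN unchanged for an INT32_MIN input, which matches the wrapping behavior of the IR versions, since neither add nor sub carries an nsw flag there.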


49670 Commits