llvm-project

Commit Graph

Author	SHA1	Message	Date
Jatin Bhateja	23eaf52d7d	[X86] Adding more tests for horizontal [F]HADD/[F]SUB for AVX512 vectors types llvm-svn: 311847	2017-08-27 12:43:25 +00:00
Craig Topper	36bd247f64	[X86] Add a target-specific DAG combine to combine extract_subvector from all zero/one build_vectors. llvm-svn: 311841	2017-08-27 05:39:57 +00:00
Craig Topper	a088362e88	[AVX512] Add patterns to match masked extract_subvector with bitcasts between the vselect and the extract_subvector. Remove the late DAG combine. We used to do a late DAG combine to move the bitcasts out of the way, but I'm starting to think that it's better to canonicalize extract_subvector's type to match the type of its input. I've seen some cases where we've formed two different extract_subvector from the same node where one had a bitcast and the other didn't. Add some more test cases to ensure we've also got most of the zero masking covered too. llvm-svn: 311837	2017-08-26 22:24:57 +00:00
Jatin Bhateja	c2f41b9f0b	[X86] Adding a test for horizontal [f]add/[f]sub for avx512 vector type 16x32. Differential Revision: https://reviews.llvm.org/D37183 llvm-svn: 311834	2017-08-26 19:02:49 +00:00
Jatin Bhateja	e4ca95d6aa	[DAGCombiner] Extending pattern detection for vector shuffle. Summary: If all the operands of a BUILD_VECTOR extract elements from same vector then split the vector efficiently based on the maximum vector access index. This will also fix PR 33784 Reviewers: zvi, delena, RKSimon, thakis Reviewed By: RKSimon Subscribers: chandlerc, eladcohen, llvm-commits Differential Revision: https://reviews.llvm.org/D35788 llvm-svn: 311833	2017-08-26 19:02:36 +00:00
Jatin Bhateja	b60cfbefac	Revert rL311247 : To rectify commit message. Summary: This reverts commit rL311247. Differential Revision: https://reviews.llvm.org/D36927 llvm-svn: 311832	2017-08-26 19:02:17 +00:00
Daniel Berlin	de269f4620	NewGVN: Fix PR33204 - We need to add memory users when we bypass memorydefs for loads, not just when we do it for stores. llvm-svn: 311829	2017-08-26 07:37:11 +00:00
Craig Topper	69ec201848	[X86] Qualify the RMW INC/DEC patterns with NotSlowIncDec. We were suppressing most uses of INC/DEC, but this one seems to have been missed. llvm-svn: 311828	2017-08-26 06:24:25 +00:00
Petr Hosek	08089e5201	Revert "[llvm] Add symbol table support to llvm-objcopy" This reverts commit r311826 because it's failing on llvm-i686-linux-RA. llvm-svn: 311827	2017-08-26 03:22:25 +00:00
Petr Hosek	70535d2c7b	[llvm] Add symbol table support to llvm-objcopy This change adds support for SHT_SYMTAB sections. Patch by Jake Ehrlich Differential Revision: https://reviews.llvm.org/D34167 llvm-svn: 311826	2017-08-26 03:18:41 +00:00
Craig Topper	d27386a9ed	[AVX512] Add patterns to use masked moves to implement masked extract_subvector of the lowest subvector. This only supports 32 and 64 bit element sizes for now. But we could probably do 16 and 8-bit elements with BWI. llvm-svn: 311821	2017-08-25 23:34:59 +00:00
Craig Topper	b89dbf0220	[AVX512] Add additional test cases for masked extract subvector. This includes tests for extracting 128-bits from a 256-bit vector and zero masking. llvm-svn: 311820	2017-08-25 23:34:57 +00:00
Craig Topper	e81de105a5	[X86] Add patterns to show more failures to use TBM instructions when we're trying to check flags. We can probably add patterns to fix some of them. But the ones that use 'and' as their root node emit a X86ISD::CMP node in front of the 'and' and then pattern matching that to 'test' instruction. We can't use a tablegen pattern to fix that because we can't remap the cmp result to the flag output of a TBM instruction. llvm-svn: 311819	2017-08-25 23:34:55 +00:00
Chandler Carruth	4b611a896d	[x86] Teach the backend to fold more read-modify-write memory operands to instructions. These can't be reasonably matched in tablegen due to the handling of flags, so we have to do this in C++ code. We only did it for `inc` and `dec` historically, this starts fleshing that out to more interesting instructions. Notably, this handles transfering operands to `add` and `sub`. Currently this forces them into a register. The next patch will add support for keeping immediate operands as immediates. Then I'll extend this beyond just `add` and `sub`. I'm not super thrilled by the repeated switches in the code but everything else I tried was really ugly or problematic. Many thanks to Craig Topper for the suggestions about where to even begin here and how to make this stuff work. Differential Revision: https://reviews.llvm.org/D37130 llvm-svn: 311806	2017-08-25 22:50:52 +00:00
Davide Italiano	26053818a3	[Verifier] Diagnose invalid DIType references instead of crashing. Fixes PR34325. llvm-svn: 311805	2017-08-25 22:08:15 +00:00
Matt Morehouse	6ec7595b1e	Revert "[SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer" This reverts r311801 due to a bot failure. llvm-svn: 311803	2017-08-25 22:01:21 +00:00
Matt Morehouse	f42bd31323	[SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer Summary: - Don't sanitize __sancov_lowest_stack. - Don't instrument leaf functions. - Add CoverageStackDepth to Fuzzer and FuzzerNoLink. Reviewers: vitalybuka, kcc Reviewed By: kcc Subscribers: cfe-commits, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D37156 llvm-svn: 311801	2017-08-25 21:18:29 +00:00
Kostya Serebryany	d3e4b7e24a	[sanitizer-coverage] extend fsanitize-coverage=pc-table with flags for every PC llvm-svn: 311794	2017-08-25 19:29:47 +00:00
Sanjay Patel	50a446ef10	[x86] regenerate checks; NFC llvm-svn: 311793	2017-08-25 19:25:03 +00:00
Craig Topper	c5e818e341	[InstCombine] Add tests to show missed opportunities to combine bit tests hidden by a sign compare and a truncate. NFC llvm-svn: 311784	2017-08-25 17:14:35 +00:00
Florian Hahn	cd78345398	[LoopInterchange] Skip zext instructions when looking for induction var. Summary: SimplifyIndVar may introduce zext instructions to widen arguments of the loop exit check. They should not prevent us from splitting the loop at the induction variable, but maybe the check should be more conservative, e.g. making sure it only extends arguments used by a comparison? Reviewers: karthikthecool, mcrosier, mzolotukhin Reviewed By: mcrosier Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D34879 llvm-svn: 311783	2017-08-25 16:52:29 +00:00
David Green	6a2c17a639	[gold] Fix up a new test to allow it to pass on non x86 builds. Fix a test that is failing on a downstream ARM/AArch64 bootstrap. We just need add an elf_x86_64 parameter to gold. llvm-svn: 311780	2017-08-25 16:14:56 +00:00
Amjad Aboud	22178dd33b	[InstCombine] Consider more cases where SimplifyDemandedUseBits does not convert AShr to LShr. There are cases where AShr have better chance to be optimized than LShr, especially when the demanded bits are not known to be Zero, and also known to be similar to the sign bit. Differential Revision: https://reviews.llvm.org/D36936 llvm-svn: 311773	2017-08-25 11:07:54 +00:00
Aditya Nandakumar	892979effc	[GISel]: Implement widenScalar for Legalizing G_PHI https://reviews.llvm.org/D37018 llvm-svn: 311763	2017-08-25 04:57:27 +00:00
Chandler Carruth	46259260c7	[x86] NFC - normalize test case formatting of IR and generate CHECK lines with the script rather than using manually written checks. llvm-svn: 311753	2017-08-25 02:32:51 +00:00
Gor Nishanov	e29e94cf87	[coroutines] Add support for symmetric control transfer (musttail on coro.resumes followed by a suspend) Summary: Add musttail to any resume instructions that is immediately followed by a suspend (i.e. ret). We do this even in -O0 to support guaranteed tail call for symmetrical coroutine control transfer (C++ Coroutines TS extension). This transformation is done only in the resume part of the coroutine that has identical signature and calling convention as the coro.resume call. Reviewers: GorNishanov Reviewed By: GorNishanov Subscribers: EricWF, majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D37125 llvm-svn: 311751	2017-08-25 02:25:10 +00:00
Craig Topper	355d8cff49	[X86] Add TBM instructions to X86InstrInfo::isDefConvertible. This allows us to remove "test" instructions and use the flags from the TBM instructions directly. llvm-svn: 311747	2017-08-25 01:59:06 +00:00
Chandler Carruth	5b491808f5	[x86] Back out one aspect of r311318: don't generically set FeatureSlowUAMem32. The idea was to mark things that are slow on widely available processors as slow in the generic CPU so that the code generated for that CPU would be fast across those processors. However, for this feature that doesn't work out very well at all. The problem here is that you can very easily enable AVX or AVX2 on top of this generic CPU. For example, this can happen just by using AVX2 intrinsics from Clang within a region of code guarded by a dynamic CPU feature test. When you do that, the generated code with SlowUAMem32 set is ... amazingly slower. The problem is that there really aren't very good alternatives to the unaligned loads, and so our vector codegen regresses significantly. The other issue is that there are plenty of AMD CPUs with AVX1 that don't set FeatureSlowUAMem32 and so we shouldn't just check for AVX2 instead of this special feature. =/ It would be nice to have the target attriute logic be able to enable/disable more than just one feature at a time and control this in a more fine grained and useful way, but that doesn't seem easy. Given that it is only Sandybridge and Ivybridge that set this feature, for now I'm just backing it out of the generic CPU. That has the additional advantage of going back to the previous state that people seemed vaguely happy with. llvm-svn: 311740	2017-08-25 00:56:05 +00:00
Chandler Carruth	8ac488b161	[x86] Fix an amazing goof in the handling of sub, or, and xor lowering. The comment for this code indicated that it should work similar to our handling of add lowering above: if we see uses of an instruction other than flag usage and store usage, it tries to avoid the specialized X86ISD::* nodes that are designed for flag+op modeling and emits an explicit test. Problem is, only the add case actually did this. In all the other cases, the logic was incomplete and inverted. Any time the value was used by a store, we bailed on the specialized X86ISD node. All of this appears to have been historical where we had different logic here. =/ Turns out, we have quite a few patterns designed around these nodes. We should actually form them. I fixed the code to match what we do for add, and it has quite a positive effect just within some of our test cases. The only thing close to a regression I see is using: notl %r testl %r, %r instead of: xorl -1, %r But we can add a pattern or something to fold that back out. The improvements seem more than worth this. I've also worked with Craig to update the comments to no longer be actively contradicted by the code. =[ Some of this still remains a mystery to both Craig and myself, but this seems like a large step in the direction of consistency and slightly more accurate comments. Many thanks to Craig for help figuring out this nasty stuff. Differential Revision: https://reviews.llvm.org/D37096 llvm-svn: 311737	2017-08-25 00:34:07 +00:00
Sanjay Patel	e404cbff66	[DAG] convert vector select-of-constants to logic/math This goes back to a discussion about IR canonicalization. We'd like to preserve and convert more IR to 'select' than we currently do because that's likely the best choice in IR: http://lists.llvm.org/pipermail/llvm-dev/2016-September/105335.html ...but that's often not true for codegen, so we need to account for this pattern coming in to the backend and transform it to better DAG ops. Steps in this patch: 1. Add an EVT param to the existing convertSelectOfConstantsToMath() TLI hook to more finely enable this transform. Other targets will probably want that anyway to distinguish scalars from vectors. We're using that here to exclude AVX512 targets, but it may not be necessary. 2. Convert a vselect to ext+add. This eliminates a constant load/materialization, and the vector ext is often free. Implementing a more general fold using xor+and can be a follow-up for targets that don't have a legal vselect. It's also possible that we can remove the TLI hook for the special case fold implemented here because we're eliminating a constant, but it needs to be tested on other targets. Differential Revision: https://reviews.llvm.org/D36840 llvm-svn: 311731	2017-08-24 23:24:43 +00:00
Mandeep Singh Grang	872f689d0a	[ADT] Enable reverse iteration for DenseMap Reviewers: mehdi_amini, dexonsmith, dblaikie, davide, chandlerc, davidxl, echristo, efriedma Reviewed By: dblaikie Subscribers: rsmith, mgorny, emaste, llvm-commits Differential Revision: https://reviews.llvm.org/D35043 llvm-svn: 311730	2017-08-24 23:02:48 +00:00
Xinliang David Li	66531dd10a	[Profile] backward propagate profile info in JumpThreading Take-2 after fixing bugs in the original patch. Differential Revsion: http://reviews.llvm.org/D36864 llvm-svn: 311727	2017-08-24 22:54:01 +00:00
Sanjay Patel	bb789381fc	[InstCombine] fix and enhance udiv/urem narrowing There are 3 small independent changes here: 1. Account for multiple uses in the pattern matching: avoid the transform if it increases the instruction count. 2. Add a missing fold for the case where the numerator is the constant: http://rise4fun.com/Alive/E2p 3. Enable all folds for vector types. There's still one more potential change - use "shouldChangeType()" to keep from transforming to an illegal integer type. Differential Revision: https://reviews.llvm.org/D36988 llvm-svn: 311726	2017-08-24 22:54:01 +00:00
Dehao Chen	f0e27e63e7	Move accurate-sample-profile into the function attribute. Summary: We need to have accurate-sample-profile in function attribute so that it works with LTO. Reviewers: davidxl, rsmith Reviewed By: davidxl Subscribers: sanjoy, mehdi_amini, javed.absar, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D37113 llvm-svn: 311706	2017-08-24 21:37:04 +00:00
Jacob Gravelle	690b76e13d	[WebAssembly] FastISel : Bail to SelectionDAG for constexpr calls Summary: Currently FastISel lowers constexpr calls as indirect calls. We'd like those to direct calls, and falling back to SelectionDAGISel handles that. Reviewers: dschuff, sunfish Subscribers: jfb, sbc100, llvm-commits, aheejin Differential Revision: https://reviews.llvm.org/D37073 llvm-svn: 311693	2017-08-24 19:53:44 +00:00
Daniel Sanders	069bb8d45f	[globalisel][tablegen] Predicates should start from GIPFP_Invalid+1 not GIPFP_Invalid This fixes a warning when there are zero defined predicates and also fixes an unnoticed bug where the first predicate in the table was unusable. llvm-svn: 311684	2017-08-24 18:54:16 +00:00
Pete Couperus	2d1f6d67c5	[ARC] Add ARC backend. Add the ARC backend as an experimental target to lib/Target. Reviewed at: https://reviews.llvm.org/D36331 llvm-svn: 311667	2017-08-24 15:40:33 +00:00
Sjoerd Meijer	b0eb5fb317	[AArch64] Add FMOVH0: materialize 0 using zero register for f16 values Instead of loading 0 from a constant pool, it's of course much better to materialize it using an fmov and the zero register. Thanks to Ahmed Bougacha for the suggestion. Differential Revision: https://reviews.llvm.org/D37102 llvm-svn: 311662	2017-08-24 14:47:06 +00:00
Chad Rosier	bfd4014304	[TargetParser][AArch64] Add support for RDM feature in the target parser. Differential Revision: https://reviews.llvm.org/D37081 llvm-svn: 311659	2017-08-24 14:30:44 +00:00
Michael Zuckerman	9ee61d9b00	Adding base lit test for x86interleaved llvm-svn: 311658	2017-08-24 14:11:28 +00:00
Krzysztof Parzyszek	c09a14eeb2	[Hexagon] Generate correct runtime check when recognizing memmove The check (assuming positive stride) for validity of memmove should be (a) the destination is at a lower address than the source, or (b) the distance between the source and destination is greater than or equal the number of bytes copied. For the second part it is sufficient to assume that the destination is at a higher address, since the opposite case is covered by (a). The distance calculation was previously done by subtracting the pointers in the wrong order. llvm-svn: 311650	2017-08-24 11:59:53 +00:00
Evgeny Astigeevich	540a39adf7	[ARM, Thumb1] Prevent ARMTargetLowering::isLegalAddressingMode from accepting illegal modes ARMTargetLowering::isLegalAddressingMode can accept illegal addressing modes for the Thumb1 target. This causes generation of redundant code and affects performance. This fixes PR34106: https://bugs.llvm.org/show_bug.cgi?id=34106 Differential Revision: https://reviews.llvm.org/D36467 llvm-svn: 311649	2017-08-24 10:00:25 +00:00
Sjoerd Meijer	afc2cd3c9e	[AArch64] Custom lowering of copysign f16 This is a follow up patch of r311154 and introduces custom lowering of copysign f16 to avoid promotions to single precision types when the subtarget supports fullfp16. Differential Revision: https://reviews.llvm.org/D36893 llvm-svn: 311646	2017-08-24 09:21:10 +00:00
Daniel Sanders	2c269f6bf8	Re-commit: [globalisel][tablegen] Add support for ImmLeaf without SDNodeXForm Summary: This patch adds support for predicates on imm nodes but only for ImmLeaf and not for PatLeaf or PatFrag and only where the value does not need to be transformed before being rendered into the instruction. The limitation on PatLeaf/PatFrag/SDNodeXForm is due to differences in the necessary target-supplied C++ for GlobalISel. Depends on D36085 The previous commit was reverted for breaking the build but this appears to have been the recurring problem on the Windows bots with tablegen not being re-run when llvm-tblgen is changed but the .td's aren't. If it re-occurs then forcing a build with clean=True should fix it but this string should do this in advance: Requires a clean build. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Reviewed By: rovka Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D36086 llvm-svn: 311645	2017-08-24 09:11:20 +00:00
Coby Tayree	21c312d8c6	[LLVM][x86][Inline Asm] support for GCC style inline asm - Y<x> constraints This patch is intended to enable the use of basic double letter constraints used in GCC extended inline asm {Yi Y2 Yz Y0 Ym Yt}. Supersedes D35204 Clang counterpart: D36371 Differential Revision: https://reviews.llvm.org/D36369 llvm-svn: 311644	2017-08-24 09:08:33 +00:00
Mikael Holmen	7a99e33b8e	[Reassociate] Do not drop debug location if replacement is missing Summary: When reassociating an expression, do not drop the instruction's original debug location in case the replacement location is missing. The debug location must at least not be dropped for inlinable callsites of debug-info-bearing functions in debug-info-bearing functions. Failing to do so would result in an "inlinable function " "call in a function with debug info must have a !dbg location" error in the verifier. As preserving the original debug location is not expected to result in overly jumpy debug line information, it is preserved for all other cases too. This fixes PR34231: https://bugs.llvm.org/show_bug.cgi?id=34231 Original patch by David Stenberg Reviewers: davide, craig.topper, mcrosier, dblaikie, aprantl Reviewed By: davide, aprantl Subscribers: aprantl Differential Revision: https://reviews.llvm.org/D36865 llvm-svn: 311642	2017-08-24 09:05:00 +00:00
Coby Tayree	d89128925b	[X86AsmParser] Refactoring, (almost) NFC. Some refactoring to X86AsmParser, mostly regarding the way rewrites are conducted. Mainly, we try to concentrate all the rewrite effort under one hood, so it'll hopefully be less of a mess and easier to maintain and understand. naturally, some frontend tests were affected: D36794 Differential Revision: https://reviews.llvm.org/D36793 llvm-svn: 311639	2017-08-24 08:46:25 +00:00
Matt Arsenault	d664315ae8	IPRA: Don't assume called function is first call operand Fixes not finding the called global for AMDGPU call pseudoinstructions, which prevented IPRA from doing much. llvm-svn: 311637	2017-08-24 07:55:15 +00:00
Chandler Carruth	dc2556934c	[x86] NFC: Clean up two tests and generate precise checks for them. Mostly this involved giving unnamed values names and running the IR through `opt` to re-format it but merging in any important comments in the original. I then deleted pointless comments and inlined the function attributes for ease of reading and editting. All of this is to make it much easier to see the instructions being generated here and evaluate any updates to the tests. llvm-svn: 311634	2017-08-24 07:38:36 +00:00
Igor Breger	47be5fbbe9	[GlobalISel][X86] Support G_IMPLICIT_DEF. Summary: Support G_IMPLICIT_DEF. Reviewers: zvi, guyblank, t.p.northover Reviewed By: guyblank Subscribers: rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D36733 llvm-svn: 311633	2017-08-24 07:06:27 +00:00
Wei Ding	a131d3fb29	Add ‘llvm.experimental.constrained.fma‘ Intrinsic. Differential Revision: http://reviews.llvm.org/D36335 llvm-svn: 311629	2017-08-24 04:18:24 +00:00
Daniel Berlin	f948603a15	NewGVN: We weren't properly simplifying selects with equal arguments due to a thinko. llvm-svn: 311626	2017-08-24 02:43:17 +00:00
Hans Wennborg	c39ec95d88	[DAG] Fix Node Replacement in PromoteIntBinOp When one operand is a user of another in a promoted binary operation we may replace and delete the returned value before returning triggering an assertion. Reorder node replacements to prevent this. Fixes PR34137. Landing on behalf of Nirav. Differential Revision: https://reviews.llvm.org/D36581 llvm-svn: 311623	2017-08-24 01:08:27 +00:00
Dylan McKay	4f5002198b	[AVR] Use the correct register classes for 16-bit atomic operations llvm-svn: 311620	2017-08-24 00:14:38 +00:00
Dehao Chen	b2d1de5a7c	Add test to cover accurate-sample-profile. Summary: This patch adds test to cover the logic guarded by "accurate-sample-profile" flag. Reviewers: davidxl Reviewed By: davidxl Subscribers: sanjoy, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D37084 llvm-svn: 311618	2017-08-23 23:19:11 +00:00
Tim Northover	4bafa16748	ARM: use internal relocations for local symbols after all. Switching to external relocations for ARM-mode branches (to allow Thumb interworking when the offset is unencodable) causes calls to temporary symbols to be miscompiled and instead go to the parent externally visible symbol. Calling a temporary never happens in compiled code, but can occasionally in hand-written assembly. llvm-svn: 311611	2017-08-23 22:07:10 +00:00
Aditya Nandakumar	850b983455	Fix Verifier test - add REQUIRES aarch64-registered-target llvm-svn: 311609	2017-08-23 21:55:36 +00:00
Adrian Prantl	33aa8acb40	Add a Verifier check for DILocation's scopes. Found via https://bugs.llvm.org/show_bug.cgi?id=33997. llvm-svn: 311608	2017-08-23 21:52:24 +00:00
Jonas Devlieghere	a845167dca	[WebAssembly] Fix overflow for input with missing version Differential revision: https://reviews.llvm.org/D37070 llvm-svn: 311605	2017-08-23 21:36:04 +00:00
Rong Xu	15848e5977	[PGO] Set edge weights for indirectbr instruction with profile counts Current PGO only annotates the edge weight for branch and switch instructions with profile counts. We should also annotate the indirectbr instruction as all the information is there. This patch enables the annotating for indirectbr instructions. Also uses this annotation in branch probability analysis. Differential Revision: https://reviews.llvm.org/D37074 llvm-svn: 311604	2017-08-23 21:36:02 +00:00
Aditya Nandakumar	efd8a84cd5	[GISEl]: Translate phi into G_PHI G_PHI has the same semantics as PHI but also has types. This lets us verify that the types in the G_PHI are consistent. This also allows specifying legalization actions for G_PHIs. https://reviews.llvm.org/D36990 llvm-svn: 311596	2017-08-23 20:45:48 +00:00
Reid Kleckner	6d353348e5	Parse and print DIExpressions inline to ease IR and MIR testing Summary: Most DIExpressions are empty or very simple. When they are complex, they tend to be unique, so checking them inline is reasonable. This also avoids the need for CodeGen passes to append to the llvm.dbg.mir named md node. See also PR22780, for making DIExpression not be an MDNode. Reviewers: aprantl, dexonsmith, dblaikie Subscribers: qcolombet, javed.absar, eraman, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37075 llvm-svn: 311594	2017-08-23 20:31:27 +00:00
Lei Huang	0cb591fc4c	Update branch coalescing to be a PowerPC specific pass Implementing this pass as a PowerPC specific pass. Branch coalescing utilizes the analyzeBranch method which currently does not include any implicit operands. This is not an issue on PPC but must be handled on other targets. Differential Revision : https: // reviews.llvm.org/D32776 llvm-svn: 311588	2017-08-23 19:25:04 +00:00
Craig Topper	853a8d9ffc	[AVX512] Don't create SHRUNKBLEND SDNodes for 512-bit vectors There are no 512-bit blend instructions so we shouldn't create SHRUNKBLEND for them. On a side note, it looks like there may be a missed opportunity for constant folding TESTM when LHS and RHS are equal. This fixes PR34139. Differential Revision: https://reviews.llvm.org/D36992 llvm-svn: 311572	2017-08-23 16:41:02 +00:00
Hans Wennborg	66f6fc0a49	LowerAtomic: Don't skip optnone functions; atomic still need lowering (PR34020) The lowering isn't really an optimization, so optnone shouldn't make a difference. ARM relies on the pass running when using "-mthread-model single", because in that mode, it doesn't run AtomicExpand. See bug for more details. Differential Revision: https://reviews.llvm.org/D37040 llvm-svn: 311565	2017-08-23 15:43:28 +00:00
Victor Leschuk	3697ebe25f	Revert r311546 as it breaks build http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/4394 llvm-svn: 311560	2017-08-23 15:21:10 +00:00
Gor Nishanov	2f55b958b1	[coroutines] CoroBegin from inner coroutines should be considered for spills Summary: If a coroutine outer calls another coroutine inner and the inner coroutine body is inlined into the outer, coro.begin from the inner coroutine should be considered for spilling if accessed across suspends. Prior to this change, coroutine frame building code was not considering any coro.begins for spilling. With this change, we only ignore coro.begin for the current coroutine, but, any coro.begins that were inlined into the current coroutine are eligible for spills. Fixes PR34267 Reviewers: GorNishanov Subscribers: qcolombet, llvm-commits, EricWF Differential Revision: https://reviews.llvm.org/D37062 llvm-svn: 311556	2017-08-23 14:47:52 +00:00
Chad Rosier	8db41e9dbd	[Reassociate] Don't canonicalize x + (-Constant * y) -> x - (Constant * y).. ..if the resulting subtract will be broken up later. This can cause us to get into an infinite loop. x + (-5.0 * y) -> x - (5.0 * y) ; Canonicalize neg const x - (5.0 * y) -> x + (0 - (5.0 * y)) ; Break up subtract x + (0 - (5.0 * y)) -> x + (-5.0 * y) ; Replace 0-X with X*-1. PR34078 llvm-svn: 311554	2017-08-23 14:10:06 +00:00
Daniel Sanders	c3885c4589	[globalisel][tablegen] Add support for ImmLeaf without SDNodeXForm Summary: This patch adds support for predicates on imm nodes but only for ImmLeaf and not for PatLeaf or PatFrag and only where the value does not need to be transformed before being rendered into the instruction. The limitation on PatLeaf/PatFrag/SDNodeXForm is due to differences in the necessary target-supplied C++ for GlobalISel. Depends on D36085 Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Reviewed By: rovka Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D36086 llvm-svn: 311546	2017-08-23 12:14:18 +00:00
Florian Hahn	5b92960091	[ARM] Check for assembler instructions in test. Currently this test causes test failures on some machines, due to isel not being registered. Update the test to run all passes and check emitted assembly instructions for now. llvm-svn: 311545	2017-08-23 11:53:24 +00:00
Florian Hahn	214e13d949	[ARM] Add missing patterns for insert_subvector. Summary: In some cases, shufflevector instruction can be transformed involving insert_subvector instructions. The ARM backend was missing some insert_subvector patterns, causing a failure during instruction selection. AArch64 has similar patterns. Reviewers: t.p.northover, olista01, javed.absar, rengolin Reviewed By: javed.absar Subscribers: aemerson, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D36796 llvm-svn: 311543	2017-08-23 10:20:59 +00:00
Daniel Sanders	499807079b	[globalisel][tablegen] Add tests for FeatureBitsets and ComplexPattern predicates. llvm-svn: 311542	2017-08-23 10:09:25 +00:00
Davide Italiano	06d9eda150	[gold] Test we don't strip globals when producing relocatables. lld was broken in this regard (PR33097). The gold plugin gets this right so, no changes needed, but better adding a test. llvm-svn: 311541	2017-08-23 09:43:41 +00:00
Davide Italiano	c78885818a	[InstCombine] Fold branches with irrelevant conditions to a constant. InstCombine folds instructions with irrelevant conditions to undef. This, as Nuno confirmed is a bug. (see https://bugs.llvm.org/show_bug.cgi?id=33409#c1 ) Given the original motivation for the change is that of removing an USE, we now fold to false instead (which reaches the same goal without undesired side effects). Fixes PR33409. Differential Revision: https://reviews.llvm.org/D36975 llvm-svn: 311540	2017-08-23 09:14:37 +00:00
Hiroshi Inoue	cc555bd0ac	[PowerPC] better instruction selection for OR (XOR) with a 32-bit immediate - recommitting after fixing a test failure on MacOS On PPC64, OR (XOR) with a 32-bit immediate can be done with only two instructions, i.e. ori + oris. But the current LLVM generates three or four instructions for this purpose (and also it clobbers one GPR). This patch makes PPC backend generate ori + oris (xori + xoris) for OR (XOR) with a 32-bit immediate. e.g. (x \| 0xFFFFFFFF) should be ori 3, 3, 65535 oris 3, 3, 65535 but LLVM generates without this patch li 4, 0 oris 4, 4, 65535 ori 4, 4, 65535 or 3, 3, 4 Differential Revision: https://reviews.llvm.org/D34757 llvm-svn: 311538	2017-08-23 08:55:18 +00:00
Hiroshi Inoue	dbb285ca51	Revert rL311526: [PowerPC] better instruction selection for OR (XOR) with a 32-bit immediate This reverts commit rL311526 due to failures in some buildbot. llvm-svn: 311530	2017-08-23 06:38:05 +00:00
Hiroshi Inoue	c4449df1b0	[PowerPC] better instruction selection for OR (XOR) with a 32-bit immediate On PPC64, OR (XOR) with a 32-bit immediate can be done with only two instructions, i.e. ori + oris. But the current LLVM generates three or four instructions for this purpose (and also it clobbers one GPR). This patch makes PPC backend generate ori + oris (xori + xoris) for OR (XOR) with a 32-bit immediate. e.g. (x \| 0xFFFFFFFF) should be ori 3, 3, 65535 oris 3, 3, 65535 but LLVM generates without this patch li 4, 0 oris 4, 4, 65535 ori 4, 4, 65535 or 3, 3, 4 Differential Revision: https://reviews.llvm.org/D34757 llvm-svn: 311526	2017-08-23 05:15:15 +00:00
Dean Michael Berris	0884b73220	[XRay][CodeGen] Use PIC-friendly code in XRay sleds; remove synthetic references in .text Summary: This change achieves two things: - Redefine the Custom Event handling instrumentation points emitted by the compiler to not require dynamic relocation of references to the __xray_CustomEvent trampoline. - Remove the synthetic reference we emit at the end of a function that we used to keep auxiliary sections alive in favour of SHF_LINK_ORDER associated with the section where the function is defined. To achieve the custom event handling change, we've had to introduce the concept of sled versioning -- this will need to be supported by the runtime to allow us to understand how to turn on/off the new version of the custom event handling sleds. That change has to land first before we change the way we write the sleds. To remove the synthetic reference, we rely on a relatively new linker feature that preserves the sections that are associated with each other. This allows us to limit the effects on the .text section of ELF binaries. Because we're still using absolute references that are resolved at runtime for the instrumentation map (and function index) maps, we mark these sections write-able. In the future we can re-define the entries in the map to use relative relocations instead that can be statically determined by the linker. That change will be a bit more invasive so we defer this for later. Depends on D36816. Reviewers: dblaikie, echristo, pcc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36615 llvm-svn: 311525	2017-08-23 04:49:41 +00:00
Yonghong Song	dc1dbf6ef3	bpf: add variants of -mcpu=# and support for additional jmp insns -mcpu=# will support: . generic: the default insn set . v1: insn set version 1, the same as generic . v2: insn set version 2, version 1 + additional jmp insns . probe: the compiler will probe the underlying kernel to decide proper version of insn set. We did not not use -mcpu=native since llc/llvm will interpret -mcpu=native as the underlying hardware architecture regardless of -march value. Currently, only x86_64 supports -mcpu=probe. Other architecture will silently revert to "generic". Also added -mcpu=help to print available cpu parameters. llvm will print out the information only if there are at least one cpu and at least one feature. Add an unused dummy feature to enable the printout. Examples for usage: $ llc -march=bpf -mcpu=v1 -filetype=asm t.ll $ llc -march=bpf -mcpu=v2 -filetype=asm t.ll $ llc -march=bpf -mcpu=generic -filetype=asm t.ll $ llc -march=bpf -mcpu=probe -filetype=asm t.ll $ llc -march=bpf -mcpu=v3 -filetype=asm t.ll 'v3' is not a recognized processor for this target (ignoring processor) ... $ llc -march=bpf -mcpu=help -filetype=asm t.ll Available CPUs for this target: generic - Select the generic processor. probe - Select the probe processor. v1 - Select the v1 processor. v2 - Select the v2 processor. Available features for this target: dummy - unused feature. Use +feature to enable a feature, or -feature to disable it. For example, llc -mcpu=mycpu -mattr=+feature1,-feature2 ... Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 311522	2017-08-23 04:25:57 +00:00
Matthias Braun	d6c0868da5	Fix tail-merge-after-mbp test The output of this test changed after the fix in r311520 to have -run-pass=block-placement behave like it does in a normal pipeline. Adjust the test. llvm-svn: 311521	2017-08-23 03:49:53 +00:00
Matthias Braun	8426d1342d	Add test case for r311511 This also changes the TailDuplicator to be configured explicitely pre/post regalloc rather than relying on the isSSA() flag. This was necessary to have `llc -run-pass` work reliably. llvm-svn: 311520	2017-08-23 03:17:59 +00:00
Craig Topper	ec4b82571c	[InstCombine] Remove check for sext of vector icmp from shouldOptimizeCast Looks like for 'and' and 'or' we end up performing at least some of the transformations this is bocking in a round about way anyway. For 'and sext(cmp1), sext(cmp2) we end up later turning it into 'select cmp1, sext(cmp2), 0'. Then we optimize that back to sext (and cmp1, cmp2). This is the same result we would have gotten if shouldOptimizeCast hadn't blocked it. We do something analogous for 'or'. With this patch we allow that transformation to happen directly in foldCastedBitwiseLogic. And we now support the same thing for 'xor'. This is definitely opening up many other cases, but since we already went around it for some cases hopefully it's ok. Differential Revision: https://reviews.llvm.org/D36213 llvm-svn: 311508	2017-08-22 23:40:15 +00:00
Jonas Devlieghere	4942a0b0f3	Revert "[llvm-dwarfdump] Print type names in DW_AT_type DIEs" This reverts commit r311492. llvm-svn: 311499	2017-08-22 21:59:46 +00:00
Jonas Devlieghere	f456d1864d	[llvm-dwarfdump] Print type names in DW_AT_type DIEs This patch adds printing for DW_AT_type DIEs like it's currently already the case for DW_AT_specification DIEs. llvm-svn: 311492	2017-08-22 21:41:49 +00:00
Peter Collingbourne	001052a067	WholeProgramDevirt: Create bitcast to i8* at each virtual call site. We can't reuse the llvm.assume instruction's bitcast because it may not dominate every user of the vtable pointer. Differential Revision: https://reviews.llvm.org/D36994 llvm-svn: 311491	2017-08-22 21:41:19 +00:00
Matt Morehouse	b1fa8255db	[SanitizerCoverage] Optimize stack-depth instrumentation. Summary: Use the initialexec TLS type and eliminate calls to the TLS wrapper. Fixes the sanitizer-x86_64-linux-fuzzer bot failure. Reviewers: vitalybuka, kcc Reviewed By: kcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37026 llvm-svn: 311490	2017-08-22 21:28:29 +00:00
Jakub Kuderski	2724d45325	[ADCE][Dominators] Reapply: Teach ADCE to preserve dominators Summary: This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees. This is reapplies the original patch r311057 that was reverted in r311381. The previous version wasn't using the batch update api for updating dominators, which in vary rare cases caused assertion failures. This also fixes PR34258. Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki Reviewed By: davide Subscribers: grandinj, zhendongsu, llvm-commits, david2050 Differential Revision: https://reviews.llvm.org/D35869 llvm-svn: 311467	2017-08-22 16:30:21 +00:00
Sanjay Patel	0ab50f6d68	[x86] auto-generate full checks; NFC I don't see anything Darwin-specific here, so I made the target generic x86-64. llvm-svn: 311465	2017-08-22 16:27:00 +00:00
Sanjay Patel	40b8e3bfe5	[x86] simplify runs and auto-generate full checks I've replaced the two OS-specific runs with a generic run because there's no functional difference in the resulting output that we're checking. Also, the script still doesn't work with a Win target. llvm-svn: 311463	2017-08-22 16:21:45 +00:00
Sam Parker	6dc3fcb1c6	[ARM][AArch64] v8.3-A Javascript Conversion Armv8.3-A adds instructions that convert a double-precision floating point number to a signed 32-bit integer with round towards zero, designed for improving Javascript performance. Differential Revision: https://reviews.llvm.org/D36785 llvm-svn: 311448	2017-08-22 11:08:21 +00:00
Renato Golin	c070c73d5e	[ARM] Avoid creating duplicate ANDs in SelectionDAG When expanding a BRCOND into a BR_CC, do not create an AND 1 if one already exists. Review: D36705 Patch by Joel Galenson <jgalenson@google.com> llvm-svn: 311447	2017-08-22 11:02:45 +00:00
Renato Golin	f63d701669	[ARM] Call setBooleanContents(ZeroOrOneBooleanContent) The ARM backend should call setBooleanContents so that it can use known bits to make some optimizations. Review: D35821 Patch by Joel Galenson <jgalenson@google.com> llvm-svn: 311446	2017-08-22 11:02:37 +00:00
George Rimar	1e94ca115d	[lib/Analysis] - Mark personality functions as live. This is PR33245. Case I am fixing is next: Imagine we have 2 BC files, one defines and uses personality routine, second has only declaration and also uses it. Previously algorithm computing dead symbols (llvm::computeDeadSymbols) did not know about personality routines and leaved them dead even if function that has routine was live. As a result thinLTOInternalizeAndPromoteGUID() method changed binding for such symbol to local. Later when LLD tried to link these objects it failed because one object had undefined global symbol for routine and second object contained local definition instead of global. Patch set the live root flag on the corresponding FunctionSummary for personality routines when we build the per-module summaries during the compile step. Differential revision: https://reviews.llvm.org/D36834 llvm-svn: 311432	2017-08-22 08:50:56 +00:00
Craig Topper	b49f0893b2	[X86] Prevent several calls to ISD::isConstantSplatVector from returning a narrower APInt than the original scalar type ISD::isConstantSplatVector can shrink to the smallest splat width. But we don't check the size of the resulting APInt at all. This can cause us to misinterpret the results. This patch just adds a flag to prevent the APInt from changing width. Fixes PR34271. Differential Revision: https://reviews.llvm.org/D36996 llvm-svn: 311429	2017-08-22 05:40:17 +00:00
Adrian Prantl	acdc3a7bff	dsymutil: don't copy compile units without children from PCM files rdar://problem/33830532 llvm-svn: 311416	2017-08-22 01:10:48 +00:00
Sanjay Patel	6f527aae0b	[InstCombine] add udiv/urem tests with constant numerator; NFC llvm-svn: 311396	2017-08-21 22:40:02 +00:00
Sanjay Patel	5e3037cfc4	[InstCombine] add more tests for udiv/urem narrowing; NFC We don't currently limit these folds with hasOneUse() or shouldChangeType(). llvm-svn: 311390	2017-08-21 21:57:52 +00:00
Evandro Menezes	bc11ca1a31	[AArch64] Restore the test of conditional branch fusion Restore the functionality of this test that was broken by https://reviews.llvm.org/rL306144. Differential revision: https://reviews.llvm.org/D36807 llvm-svn: 311389	2017-08-21 21:57:43 +00:00
Tim Northover	ef1fc5ae89	GlobalISel (AArch64): fix ABI at border between GPRs and SP. If a struct would end up half in GPRs and half on SP the ABI says it should actually go entirely on the stack. We were getting this wrong in GlobalISel before, causing compatibility issues. llvm-svn: 311388	2017-08-21 21:56:11 +00:00
Steven Wu	010fc49e42	[IR] AutoUpgrade ModuleFlagBehavior for PIC and PIE level Summary: From r303590, ModuleFlagBehavior for PIC and PIE level is changed from Error to Max. This will cause bitcode compatibility issue when linking against a bitcode static archive built with old compiler. Add an auto-ugprade path to upgrade the the ModuleFlagBehavior in the old bitcode to match the new one so IRLinker can link them. Reviewers: tejohnson, mehdi_amini, dexonsmith Reviewed By: dexonsmith Subscribers: hans, llvm-commits Differential Revision: https://reviews.llvm.org/D36556 llvm-svn: 311387	2017-08-21 21:49:13 +00:00
Sanjoy Das	08a38fe71e	Revert "Reapply: [ADCE][Dominators] Teach ADCE to preserve dominators" Summary: This partially reverts commit r311057 since it breaks ADCE. See PR34258. Reviewers: kuhar Subscribers: mcrosier, david2050, llvm-commits Differential Revision: https://reviews.llvm.org/D36979 llvm-svn: 311381	2017-08-21 20:39:18 +00:00
Sanjay Patel	82ec872990	[LibCallSimplifier] try harder to fold memcmp with constant arguments (2nd try) The 1st try was reverted because it could inf-loop by creating a dead instruction. Fixed that to not happen and added a test case to verify. Original commit message: Try to fold: memcmp(X, C, ConstantLength) == 0 --> load X == *C Without this change, we're unnecessarily checking the alignment of the constant data, so we miss the transform in the first 2 tests in the patch. I noted this shortcoming of LibCallSimpifier in one of the recent CGP memcmp expansion patches. This doesn't help the example in: https://bugs.llvm.org/show_bug.cgi?id=34032#c13 ...directly, but it's worth short-circuiting more of these simple cases since we're already trying to do that. The benefit of transforming to load+cmp is that existing IR analysis/transforms may further simplify that code. For example, if the load of the variable is common to multiple memcmp calls, CSE can remove the duplicate instructions. Differential Revision: https://reviews.llvm.org/D36922 llvm-svn: 311366	2017-08-21 19:13:14 +00:00
Craig Topper	74177e1ed1	[InstCombine] Teach foldSelectICmpAnd to recognize a (icmp slt X, 0) and (icmp sgt X, -1) as equivalent to an and with the sign bit of the truncated type This is similar to what was already done in foldSelectICmpAndOr. Ultimately I'd like to see if we can call foldSelectICmpAnd from foldSelectIntoOp if we detect a power of 2 constant. This would allow us to remove foldSelectICmpAndOr entirely. Differential Revision: https://reviews.llvm.org/D36498 llvm-svn: 311362	2017-08-21 19:02:06 +00:00
Sean Fertile	00393cce3a	[PPC] Refine checks for emiting TOC restore nop and tail-call eligibility. For the medium and large code models we only need to check if a call crosses dso-boundaries when considering tail-call elgibility. Differential Revision: https://reviews.llvm.org/D34245 llvm-svn: 311353	2017-08-21 17:35:32 +00:00
Sanjay Patel	cf081f9a30	[InstCombine] add tests for memcmp with constant; NFC This is the baseline (current) version of the tests that would have been added with the transform in r311333 (reverted at r311340 due to inf-looping). Adding these now to aid in testing and minimize the patch if/when it is reinstated. llvm-svn: 311350	2017-08-21 16:47:12 +00:00
Sam Elliott	e604b563ea	Emit only A Single Opt Remark When Inlining Summary: This updates the Inliner to only add a single Optimization Remark when Inlining, rather than an Analysis Remark and an Optimization Remark. Fixes https://bugs.llvm.org/show_bug.cgi?id=33786 Reviewers: anemet, davidxl, chandlerc Reviewed By: anemet Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D36054 llvm-svn: 311349	2017-08-21 16:45:47 +00:00
Craig Topper	cc255bcd77	[InstCombine] Fix a weakness in canEvaluateZExtd around 'and' instructions Summary: If the bitsToClear from the LHS of an 'and' comes back non-zero, but all of those bits are known zero on the RHS, we can reset bitsToClear. Without this, the 'or' in the modified test case blocks the transform because it has non-zero bits in its RHS in those bits. Reviewers: spatel, majnemer, davide Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36944 llvm-svn: 311343	2017-08-21 16:04:11 +00:00
Craig Topper	8078dd2984	[X86] When selecting sse_load_f32/f64 pattern, make sure there's only one use of every node all the way back to the root of the match Summary: With masked operations, its possible for the operation node like fadd, fsub, etc. to be used by multiple different vselects. Since the pattern matching will start at the vselect, we need to make sure the operation node itself is only used once before we can fold a load. Otherwise we'll end up folding the same load into multiple instructions. Reviewers: RKSimon, spatel, zvi, igorb Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36938 llvm-svn: 311342	2017-08-21 16:04:04 +00:00
Xinliang David Li	d2838fc4b9	Revert 311208, 311209 llvm-svn: 311341	2017-08-21 16:00:38 +00:00
Sanjay Patel	707f786cc5	revert r311333: [LibCallSimplifier] try harder to fold memcmp with constant arguments We're getting lots of compile-timeout bot failures like: http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/7119 http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux llvm-svn: 311340	2017-08-21 15:16:25 +00:00
Sanjay Patel	0707434ce8	[InstCombine] add vector tests; NFC llvm-svn: 311339	2017-08-21 15:11:39 +00:00
Zachary Turner	d1de2f4f5e	[llvm-pdbutil] Add support for dumping detailed module stats. This adds support for dumping a summary of module symbols and CodeView debug chunks. This option prints a table for each module of all of the symbols that occurred in the module and the number of times it occurred and total byte size. Then at the end it prints the totals for the entire file. Additionally, this patch adds the -jmc (just my code) option, which suppresses modules which are from external libraries or linker imports, so that you can focus only on the object files and libraries that originate from your own source code. llvm-svn: 311338	2017-08-21 14:53:25 +00:00
Sanjay Patel	48c67c9965	[InstCombine] regenerate test checks; NFC llvm-svn: 311337	2017-08-21 14:34:06 +00:00
Sanjay Patel	7756edfa93	[LibCallSimplifier] try harder to fold memcmp with constant arguments Try to fold: memcmp(X, C, ConstantLength) == 0 --> load X == *C Without this change, we're unnecessarily checking the alignment of the constant data, so we miss the transform in the first 2 tests in the patch. I noted this shortcoming of LibCallSimpifier in one of the recent CGP memcmp expansion patches. This doesn't help the example in: https://bugs.llvm.org/show_bug.cgi?id=34032#c13 ...directly, but it's worth short-circuiting more of these simple cases since we're already trying to do that. The benefit of transforming to load+cmp is that existing IR analysis/transforms may further simplify that code. For example, if the load of the variable is common to multiple memcmp calls, CSE can remove the duplicate instructions. Differential Revision: https://reviews.llvm.org/D36922 llvm-svn: 311333	2017-08-21 13:55:49 +00:00
Stefan Pintilie	9495f33e45	[PowerPC] Check if the pre-increment PHI Node already exists Preparations to use the per-increment are sometimes done in the target independent pass Loop Strength Reduction. We try to detect them in the PowerPC specific pass so that they are not done twice and so that we do not add PHIs that are not required. Differential Revision: https://reviews.llvm.org/D36736 llvm-svn: 311332	2017-08-21 13:36:18 +00:00
Igor Breger	685889cf9b	[GlobalISel][X86] Support G_BRCOND operation. Summary: Support G_BRCOND operation. For now don't try to fold cmp/trunc instructions. Reviewers: zvi, guyblank Reviewed By: guyblank Subscribers: rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D34754 llvm-svn: 311327	2017-08-21 10:51:54 +00:00
Oliver Stannard	9bd18aa7d8	[AsmParser] Recommit: Hash is not a comment on some targets Re-committing after r311325 fixed an unintentional use of '#' comments in clang. The '#' token is not a comment for all targets (on ARM and AArch64 it marks an immediate operand), so we shouldn't treat it as such. Comments are already converted to AsmToken::EndOfStatement by AsmLexer::LexLineComment, so this check was unnecessary. Differential Revision: https://reviews.llvm.org/D36405 llvm-svn: 311326	2017-08-21 09:58:37 +00:00
Igor Breger	1b5e3d3e28	[GlobalISel][X86] LowerCall, for now don't handel ByValue function arguments. llvm-svn: 311321	2017-08-21 08:59:59 +00:00
Michael Zuckerman	bdb6673151	[InterLeaved] Adding lit test for future work interleaved load strid 3 llvm-svn: 311320	2017-08-21 08:56:39 +00:00
Chandler Carruth	98c51cbee1	[x86] Teach the "generic" x86 CPU to avoid patterns that are slow on widely used processors. This occured to me when I saw that we were generating 'inc' and 'dec' when for Haswell and newer we shouldn't. However, there were a few "X is slow" things that we should probably just set. I've avoided any of the "X is fast" features because most of those would be pretty serious regressions on processors where X isn't actually fast. The slow things are likely to be negligible costs on processors where these aren't slow and a significant win when they are slow. In retrospect this seems somewhat obvious. Not sure why we didn't do this a long time ago. Differential Revision: https://reviews.llvm.org/D36947 llvm-svn: 311318	2017-08-21 08:45:22 +00:00
Chandler Carruth	63dd5e0ef6	[x86] Handle more cases where we can re-use an atomic operation's flags rather than doing a separate comparison. This both saves an explicit comparision and avoids the use of `xadd` which introduces register constraints and other challenges to the generated code. The motivating case is from atomic reference counts where `1` is the sentinel rather than `0` for whatever reason. This can and should be lowered efficiently on x86 by just using a different flag, however the x86 code only handled the `0` case. There remains some further opportunities here that are currently hidden due to canonicalization. I've included test cases that show these and FIXMEs. However, I don't at the moment have any production use cases and they seem substantially harder to address. Differential Revision: https://reviews.llvm.org/D36945 llvm-svn: 311317	2017-08-21 08:45:19 +00:00
Sam Parker	b252ffd2cc	[ARM][AArch64] Cortex-A75 and Cortex-A55 support This patch introduces support for Cortex-A75 and Cortex-A55, Arm's latest big.LITTLE A-class cores. They implement the ARMv8.2-A architecture, including the cryptography and RAS extensions, plus the optional dot product extension. They also implement the RCpc AArch64 extension from ARMv8.3-A. Cortex-A75: https://developer.arm.com/products/processors/cortex-a/cortex-a75 Cortex-A55: https://developer.arm.com/products/processors/cortex-a/cortex-a55 Differential Revision: https://reviews.llvm.org/D36667 llvm-svn: 311316	2017-08-21 08:43:06 +00:00
Coby Tayree	c54c5cbe67	[X86] Allow xacquire/xrelease prefixes Allow those prefixes on assembly code Differential Revision: https://reviews.llvm.org/D36845 llvm-svn: 311309	2017-08-21 07:50:15 +00:00
Craig Topper	d6f4be97e6	[AVX-512] Don't change which instructions we use for unmasked subvector broadcasts when AVX512DQ is enabled. There's no functional difference between the AVX512DQ instructions if we're not masking. This change unifies test checks and removes extra isel entries. Similar was done for subvector insert and extracts recently. llvm-svn: 311308	2017-08-21 05:29:02 +00:00
Craig Topper	485cca1ecb	[AVX512] Add 128->256 vbroadcastf64x2/vbroadcasti64x2 instructions to the EVEX->VEX table. llvm-svn: 311307	2017-08-21 05:03:28 +00:00
Dean Michael Berris	c5caf3e9c6	[XRay][tools] Support new kinds of instrumentation map entries Summary: When extracting the instrumentation map from a binary, we should be able to recognize the new kinds of instrumentation sleds we've been emitting with the compiler using -fxray-instrument. This change adds a test for all the kinds of sleds we currently support (sans the tail-call sled, which is a bit harder to force in a simple prebuilt input). Reviewers: kpw, dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36819 llvm-svn: 311305	2017-08-21 00:14:06 +00:00
Chandler Carruth	bd6dc14230	Revert r311077: [LV] Using VPlan ... This causes LLVM to assert fail on PPC64 and crash / infloop in other cases. Filed http://llvm.org/PR34248 with reproducer attached. llvm-svn: 311304	2017-08-20 23:17:11 +00:00
Craig Topper	a152903c1b	[InstCombine] Add a test case for a weakness in canEvaluateZExtd. NFC llvm-svn: 311303	2017-08-20 21:38:28 +00:00
Craig Topper	d63b33f9c4	[AVX512] Add a test to check what happens when a load is referenced by two different masked scalar intrinsics with the same op inputs, but different masking node. We're missing some single use checks in the sse_load_f32/f64 handling that cause us to replicate the load. llvm-svn: 311300	2017-08-20 19:47:00 +00:00
Kuba Mracek	6734671dda	Fix archive-update.test after r311296. llvm-svn: 311299	2017-08-20 18:31:30 +00:00
Kuba Mracek	2c0bca49b1	Remove uses of "%T" from test/Object/archive-* tests. llvm-svn: 311296	2017-08-20 18:18:44 +00:00
Kuba Mracek	d3f3fae32d	Get rid of even more "%T" expansions, see <https://reviews.llvm.org/D35396 >. llvm-svn: 311294	2017-08-20 17:05:22 +00:00
Kuba Mracek	5c393d2565	Get rid of some more "%T" expansions, see <https://reviews.llvm.org/D35396 >. llvm-svn: 311293	2017-08-20 17:00:08 +00:00
Benjamin Kramer	760e00b0bc	[MachO] Use Twines more efficiently. llvm-svn: 311291	2017-08-20 15:13:39 +00:00
Elena Demikhovsky	f58f838495	Changed basic cost of store operation on X86 Store operation takes 2 UOps on X86 processors. The exact cost calculation affects several optimization passes including loop unroling. This change compensates performance degradation caused by https://reviews.llvm.org/D34458 and shows improvements on some benchmarks. Differential Revision: https://reviews.llvm.org/D35888 llvm-svn: 311285	2017-08-20 12:34:29 +00:00
Aditya Kumar	a525fffd07	[Loop Vectorize] Added a separate metadata Added a separate metadata to indicate when the loop has already been vectorized instead of setting width and count to 1. Patch written by Divya Shanmughan and Aditya Kumar Differential Revision: https://reviews.llvm.org/D36220 llvm-svn: 311281	2017-08-20 10:32:41 +00:00
Igor Breger	88a3d5c855	[GlobalISel][X86] Support call ABI. Summary: Support call ABI. For now only Linux C and X86_64_SysV calling conventions supported. Variadic function not supported. Reviewers: zvi, guyblank, oren_ben_simhon Reviewed By: oren_ben_simhon Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34602 llvm-svn: 311279	2017-08-20 09:25:22 +00:00
Igor Breger	b3a860a5e8	[GlobalISel][X86] Support asimetric copy from/to GPR physical register. Usually this case generated by ABI lowering, it requare to performe trancate/anyext. llvm-svn: 311278	2017-08-20 07:14:40 +00:00
Sam Elliott	7fe0aaa140	Revert "Emit only A Single Opt Remark When Inlining" Reverting due to clang build failure llvm-svn: 311274	2017-08-20 06:55:10 +00:00
Sam Elliott	785dd75369	Emit only A Single Opt Remark When Inlining Summary: This updates the Inliner to only add a single Optimization Remark when Inlining, rather than an Analysis Remark and an Optimization Remark. Fixes https://bugs.llvm.org/show_bug.cgi?id=33786 Reviewers: anemet, davidxl, chandlerc Reviewed By: anemet Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D36054 llvm-svn: 311273	2017-08-20 06:43:34 +00:00
Sam Elliott	b0c9753691	Keep Optimization Remark Yaml in NewPM Summary: The New Pass Manager infrastructure was forgetting to keep around the optimization remark yaml file that the compiler might have been producing. This meant setting the option to '-' for stdout worked, but setting it to a filename didn't give file output (presumably it was deleted because compilation didn't explicitly keep it). This change just ensures that the file is kept if compilation succeeds. So far I have updated one of the optimization remark output tests to add a version with the new pass manager. It is my intention for this patch to also include changes to all tests that use `-opt-remark-output=` but I wanted to get the code patch ready for review while I was making all those changes. Fixes https://bugs.llvm.org/show_bug.cgi?id=33951 Reviewers: anemet, chandlerc Reviewed By: anemet, chandlerc Subscribers: javed.absar, chandlerc, fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D36906 llvm-svn: 311271	2017-08-20 01:30:45 +00:00
Chandler Carruth	9ef881efab	[x86] Fix an even stranger corner case where we have multiple levels of cmov self-refrencing. Pointed out by Amjad Aboud in code review, test case minorly simplified from the one he posted. llvm-svn: 311267	2017-08-19 23:35:50 +00:00
Craig Topper	a0319bb434	[AVX512] Use alignedstore256 in a pattern that's emitting a 256-bit movaps from an extract subvector operation. llvm-svn: 311263	2017-08-19 22:02:02 +00:00
Martin Storsjo	91522ffa12	[ARM] Check the right order for halves of VZIP/VUZP if both parts are used This is the exact same fix as in SVN r247254. In that commit, the fix was applied only for isVTRNMask and isVTRN_v_undef_Mask, but the same issue is present for VZIP/VUZP as well. This fixes PR33921. Differential Revision: https://reviews.llvm.org/D36899 llvm-svn: 311258	2017-08-19 19:47:48 +00:00
Teresa Johnson	b225ad05af	Fix bot failures by requiring x86 target The tests added in r311254 require a target triple since they are running through code generation. Fix bot failures by requiring an x86 target. llvm-svn: 311257	2017-08-19 19:15:04 +00:00
Jatin Bhateja	6b4c205685	[DAGCombiner] Extending pattern detection for vector shuffle. Summary: If all the operands of a BUILD_VECTOR extract elements from same vector then split the vector efficiently based on the maximum vector access index. Reviewers: zvi, delena, RKSimon, thakis Reviewed By: RKSimon Subscribers: chandlerc, eladcohen, llvm-commits Differential Revision: https://reviews.llvm.org/D35788 llvm-svn: 311255	2017-08-19 18:08:59 +00:00
Teresa Johnson	73305f82e9	[ThinLTO] Fix ThinLTO crash Summary: Follow up to fix in r311023, which fixed the case where the combined index is written to disk. The same samplePGO logic exists for the in-memory index when computing imports, so we need to filter out GlobalVariable summaries there too. Reviewers: davidxl Subscribers: inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D36919 llvm-svn: 311254	2017-08-19 18:04:25 +00:00
Jatin Bhateja	66f7958e91	Revert rL311247 : To rectify commit message. Summary: This reverts commit rL311247. Differential Revision: https://reviews.llvm.org/D36927 llvm-svn: 311252	2017-08-19 17:59:58 +00:00
Jatin Bhateja	6f0d0d23b0	Merge branch 'arcpatch-D35788' llvm-svn: 311247	2017-08-19 17:00:04 +00:00
Jatin Bhateja	1c56863739	Revert rL311242 "Extension of shuffle vector pattern detection, updating post rebase." Summary: This reverts commit rL311242. Differential Revision: https://reviews.llvm.org/D36924 llvm-svn: 311246	2017-08-19 16:40:06 +00:00
Jatin Bhateja	313f97dd84	Extension of shuffle vector pattern detection, updating post rebase. llvm-svn: 311242	2017-08-19 15:58:36 +00:00
Victor Leschuk	ee7d232a41	revert failing test llvm-svn: 311238	2017-08-19 12:24:41 +00:00
Victor Leschuk	ba0954c4e2	Add temporary test to verify that win10 builder hangs on error llvm-svn: 311236	2017-08-19 12:02:39 +00:00
Chandler Carruth	4f3aa29a46	[Inliner] Fix a nasty bug when inlining a non-recursive trace of a function into itself. We tried to fix this before in r306495 but that got reverted as the assert was actually hit. This fixes the original bug (which we seem to have lost track of with the revert) by blocking a second remapping when the function being inlined is also the caller and the remapping could succeed but erroneously. The included test case would actually load from an inlined copy of the alloca before this change, failing to load the stored value and miscompiling. Many thanks to Richard Smith for diagnosing a user miscompile to this bug, and to Kyle for the first attempt and initial analysis and David Li for remembering the issue and how to fix it and suggesting the patch. I'm just stitching it together and landing it. =] llvm-svn: 311229	2017-08-19 06:56:11 +00:00
Chandler Carruth	2a80fddf67	[Inliner] Clean up a test case a bit to make it more clear what is being tested and why. llvm-svn: 311228	2017-08-19 06:06:44 +00:00
Chandler Carruth	93a645525c	[x86] Teach the cmov converter to aggressively convert cmovs with memory operands into control flow. We have seen periodically performance problems with cmov where one operand comes from memory. On modern x86 processors with strong branch predictors and speculative execution, this tends to be much better done with a branch than cmov. We routinely see cmov stalling while the load is completed rather than continuing, and if there are subsequent branches, they cannot be speculated in turn. Also, in many (even simple) cases, macro fusion causes the control flow version to be fewer uops. Consider the IACA output for the initial sequence of code in a very hot function in one of our internal benchmarks that motivates this, and notice the micro-op reduction provided. Before, SNB: ``` Throughput Analysis Report -------------------------- Block Throughput: 2.20 Cycles Throughput Bottleneck: Port1 \| Num Of \| Ports pressure in cycles \| \| \| Uops \| 0 - DV \| 1 \| 2 - D \| 3 - D \| 4 \| 5 \| \| --------------------------------------------------------------------- \| 1 \| \| 1.0 \| \| \| \| \| CP \| mov rcx, rdi \| 0* \| \| \| \| \| \| \| \| xor edi, edi \| 2^ \| 0.1 \| 0.6 \| 0.5 0.5 \| 0.5 0.5 \| \| 0.4 \| CP \| cmp byte ptr [rsi+0xf], 0xf \| 1 \| \| \| 0.5 0.5 \| 0.5 0.5 \| \| \| \| mov rax, qword ptr [rsi] \| 3 \| 1.8 \| 0.6 \| \| \| \| 0.6 \| CP \| cmovbe rax, rdi \| 2^ \| \| \| 0.5 0.5 \| 0.5 0.5 \| \| 1.0 \| \| cmp byte ptr [rcx+0xf], 0x10 \| 0F \| \| \| \| \| \| \| \| jb 0xf Total Num Of Uops: 9 ``` After, SNB: ``` Throughput Analysis Report -------------------------- Block Throughput: 2.00 Cycles Throughput Bottleneck: Port5 \| Num Of \| Ports pressure in cycles \| \| \| Uops \| 0 - DV \| 1 \| 2 - D \| 3 - D \| 4 \| 5 \| \| --------------------------------------------------------------------- \| 1 \| 0.5 \| 0.5 \| \| \| \| \| \| mov rax, rdi \| 0* \| \| \| \| \| \| \| \| xor edi, edi \| 2^ \| 0.5 \| 0.5 \| 1.0 1.0 \| \| \| \| \| cmp byte ptr [rsi+0xf], 0xf \| 1 \| 0.5 \| 0.5 \| \| \| \| \| \| mov ecx, 0x0 \| 1 \| \| \| \| \| \| 1.0 \| CP \| jnbe 0x39 \| 2^ \| \| \| \| 1.0 1.0 \| \| 1.0 \| CP \| cmp byte ptr [rax+0xf], 0x10 \| 0F \| \| \| \| \| \| \| \| jnb 0x3c Total Num Of Uops: 7 ``` The difference even manifests in a throughput cycle rate difference on Haswell. Before, HSW: ``` Throughput Analysis Report -------------------------- Block Throughput: 2.00 Cycles Throughput Bottleneck: FrontEnd \| Num Of \| Ports pressure in cycles \| \| \| Uops \| 0 - DV \| 1 \| 2 - D \| 3 - D \| 4 \| 5 \| 6 \| 7 \| \| --------------------------------------------------------------------------------- \| 0* \| \| \| \| \| \| \| \| \| \| mov rcx, rdi \| 0* \| \| \| \| \| \| \| \| \| \| xor edi, edi \| 2^ \| \| \| 0.5 0.5 \| 0.5 0.5 \| \| 1.0 \| \| \| \| cmp byte ptr [rsi+0xf], 0xf \| 1 \| \| \| 0.5 0.5 \| 0.5 0.5 \| \| \| \| \| \| mov rax, qword ptr [rsi] \| 3 \| 1.0 \| 1.0 \| \| \| \| \| 1.0 \| \| \| cmovbe rax, rdi \| 2^ \| 0.5 \| \| 0.5 0.5 \| 0.5 0.5 \| \| \| 0.5 \| \| \| cmp byte ptr [rcx+0xf], 0x10 \| 0F \| \| \| \| \| \| \| \| \| \| jb 0xf Total Num Of Uops: 8 ``` After, HSW: ``` Throughput Analysis Report -------------------------- Block Throughput: 1.50 Cycles Throughput Bottleneck: FrontEnd \| Num Of \| Ports pressure in cycles \| \| \| Uops \| 0 - DV \| 1 \| 2 - D \| 3 - D \| 4 \| 5 \| 6 \| 7 \| \| --------------------------------------------------------------------------------- \| 0* \| \| \| \| \| \| \| \| \| \| mov rax, rdi \| 0* \| \| \| \| \| \| \| \| \| \| xor edi, edi \| 2^ \| \| \| 1.0 1.0 \| \| \| 1.0 \| \| \| \| cmp byte ptr [rsi+0xf], 0xf \| 1 \| \| 1.0 \| \| \| \| \| \| \| \| mov ecx, 0x0 \| 1 \| \| \| \| \| \| \| 1.0 \| \| \| jnbe 0x39 \| 2^ \| 1.0 \| \| \| 1.0 1.0 \| \| \| \| \| \| cmp byte ptr [rax+0xf], 0x10 \| 0F \| \| \| \| \| \| \| \| \| \| jnb 0x3c Total Num Of Uops: 6 ``` Note that this cannot be usefully restricted to inner loops. Much of the hot code we see hitting this is not in an inner loop or not in a loop at all. The optimization still remains effective and indeed critical for some of our code. I have run a suite of internal benchmarks with this change. I saw a few very significant improvements and a very few minor regressions, but overall this change rarely has a significant effect. However, the improvements were very significant, and in quite important routines responsible for a great deal of our C++ CPU cycles. The gains pretty clealy outweigh the regressions for us. I also ran the test-suite and SPEC2006. Only 11 binaries changed at all and none of them showed any regressions. Amjad Aboud at Intel also ran this over their benchmarks and saw no regressions. Differential Revision: https://reviews.llvm.org/D36858 llvm-svn: 311226	2017-08-19 05:01:19 +00:00
Dinar Temirbulatov	7aff8cfa55	[SLPVectorizer] Tighten up VLeft, VRight declaration, remove unnecessary testcase test/Transforms/SLPVectorizer/X86/reorder.ll, NFCI. llvm-svn: 311223	2017-08-19 03:15:07 +00:00
Dinar Temirbulatov	e3ce1b455e	[SLPVectorizer] Add opcode parameter to reorderAltShuffleOperands, reorderInputsAccordingToOpcode functions. Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D36766 llvm-svn: 311221	2017-08-19 02:54:20 +00:00
Adrian Prantl	2116dd360a	Filter out non-constant DIGlobalVariableExpressions reachable via the CU They won't affect the DWARF output, but they will mess with the sorting of the fragments. This fixes the crash reported in PR34159. https://bugs.llvm.org/show_bug.cgi?id=34159 llvm-svn: 311217	2017-08-19 01:15:06 +00:00
Eric Beckmann	91d8af5386	llvm-mt: Merge manifest namespaces. mt.exe performs a tree merge where certain element nodes are combined into one. This introduces the possibility of xml namespaces conflicting with each other. The original mt.exe has a hierarchy whereby certain namespace names can override others, and nodes that would then end up in ambigious namespaces have their namespaces explicitly defined. This namespace handles this merging process. llvm-svn: 311215	2017-08-19 00:37:41 +00:00
Xinliang David Li	709ffe178e	[Profile] backward propagate profile info in JumpThreading Differential Revsion: http://reviews.llvm.org/D36864 llvm-svn: 311208	2017-08-18 23:00:05 +00:00
Amjad Aboud	88ffa3afe2	[InstCombine] Teach ComputeNumSignBitsImpl to handle integer multiply instruction. Differential Revision: https://reviews.llvm.org/D36679 llvm-svn: 311206	2017-08-18 22:56:55 +00:00
Max Kazantsev	0aaf8c16ac	[IRCE] Fix buggy behavior in Clamp Clamp function was too optimistic when choosing signed or unsigned min/max function for calculations. In fact, `!IsSignedPredicate` guarantees us that `Smallest` and `Greatest` can be compared safely using unsigned predicates, but we did not check this for `S` which can in theory be negative. This patch makes Clamp use signed min/max for cases when it fails to prove `S` being non-negative, and it adds a test where such situation may lead to incorrect conditions calculation. Differential Revision: https://reviews.llvm.org/D36873 llvm-svn: 311205	2017-08-18 22:50:29 +00:00
Jonas Devlieghere	a2faf7b60f	[llvm-dwarfdump] Hide .debug_str and DIE reference offsets in brief mode This patch hides the .debug_str offset and DIE reference offsets into the CU when llvm-dwarfdump is invoked with -brief. Differential Revision: https://reviews.llvm.org/D36835 llvm-svn: 311201	2017-08-18 21:35:44 +00:00
Simon Pilgrim	f36cca88fb	[X86][ADX] Regenerate ADX intrinsics tests llvm-svn: 311198	2017-08-18 21:21:14 +00:00
Ana Pazos	6210f27dfc	[PGO] Fixed assertion due to mismatched memcpy size type. Summary: Memcpy intrinsics have size argument of any integer type, like i32 or i64. Fixed size type along with its value when cloning the intrinsic. Reviewers: davidxl, xur Reviewed By: davidxl Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D36844 llvm-svn: 311188	2017-08-18 19:17:08 +00:00
Tim Northover	14302fcb24	ARM: use an external relocation for calls from MachO ARM mode. The internal (__text-relative) relocation risks the offset not being encodable if the destination is Thumb. llvm-svn: 311187	2017-08-18 19:13:56 +00:00
Matt Morehouse	5c7fc76983	[SanitizerCoverage] Add stack depth tracing instrumentation. Summary: Augment SanitizerCoverage to insert maximum stack depth tracing for use by libFuzzer. The new instrumentation is enabled by the flag -fsanitize-coverage=stack-depth and is compatible with the existing trace-pc-guard coverage. The user must also declare the following global variable in their code: thread_local uintptr_t __sancov_lowest_stack https://bugs.llvm.org/show_bug.cgi?id=33857 Reviewers: vitalybuka, kcc Reviewed By: vitalybuka Subscribers: kubamracek, hiraditya, cfe-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D36839 llvm-svn: 311186	2017-08-18 18:43:30 +00:00
Marek Sokolowski	5cd3d5c8d6	Reapply: [llvm-rc] Add basic RC scripts parsing ability. As for now, the parser supports a limited set of statements and resources. This will be extended in the following patches. Thanks to Nico Weber (thakis) for his original work in this area. This patch was originally submitted as r311175 and got reverted in r311177 because of the problems with compilation under gcc. Differential Revision: https://reviews.llvm.org/D36340 llvm-svn: 311184	2017-08-18 18:24:17 +00:00
Jonas Devlieghere	e101b07a1d	[Debug info] Transfer DI to fragment expressions for split integer values. This patch teaches the SDag type legalizer how to split up debug info for integer values that are split into a hi and lo part. (re-commit) Differential Revision: https://reviews.llvm.org/D36805 llvm-svn: 311181	2017-08-18 18:07:00 +00:00
Marek Sokolowski	f276f52014	Revert "[llvm-rc] Add basic RC scripts parsing ability." This reverts commit r311175. This failed some buildbots compilation. llvm-svn: 311177	2017-08-18 17:25:55 +00:00
Marek Sokolowski	dbc16476c1	[llvm-rc] Add basic RC scripts parsing ability. As for now, the parser supports a limited set of statements and resources. This will be extended in the following patches. Thanks to Nico Weber (thakis) for his original work in this area. Differential Revision: https://reviews.llvm.org/D36340 llvm-svn: 311175	2017-08-18 17:05:47 +00:00
Simon Pilgrim	879ce046ad	[X86][BMI2] Added scheduling test for RORX/SARX/SHLX/SHRX instructions llvm-svn: 311171	2017-08-18 16:26:39 +00:00
Simon Pilgrim	358aeae7b8	[X86][AES] Add scheduling latency/throughput tests for AES instructions llvm-svn: 311167	2017-08-18 15:26:51 +00:00
Simon Pilgrim	9eb0869e91	[X86][PCLMUL] Add scheduling latency/throughput test for PCLMULQDQ instruction Added it to the SSE42 tests as targets seem to always have both llvm-svn: 311166	2017-08-18 15:08:30 +00:00
Simon Pilgrim	ccaec26175	[X86][SHA] Add scheduling latency/throughput tests for SHA instructions llvm-svn: 311164	2017-08-18 14:55:50 +00:00
Simon Pilgrim	7f506f7d72	[X86][MOVBE] Add scheduling latency/throughput tests for MOVBE instructions llvm-svn: 311163	2017-08-18 14:44:31 +00:00
Simon Pilgrim	320f89782a	[X86][BMI2] Added scheduling test for MULX instructions llvm-svn: 311159	2017-08-18 13:22:18 +00:00
Sjoerd Meijer	ec9581e5e0	[AArch64] Do not promote f16 when subtarget HasFullFP16 Armv8.2-A adds FP16 support, i.e. f16 is not only a storage-only type, but it also supports performing data processing on 16-bit floating-point quantities. All the necessary (tablegen) groundwork of adding the ARMv8.2-A FP16 (scalar) instructions was done in D15014. To take advantage of this, this patch avoids promotion of f16 to f32 types when the subtarget supports FullFP16, which enables instruction selection of these FP16 instructions. Differential Revision: https://reviews.llvm.org/D36396 llvm-svn: 311154	2017-08-18 10:51:14 +00:00
Diana Picus	42ea77d5c2	Revert "GlobalISel (AArch64): fix ABI at border between GPRs and SP." This reverts commit e8fd20964798ca6d46d2729dd3a789707a6416da in an attempt to appease the GlobalISel buildbot, which fails in the test-suite with errors like fpcmp: files differ without tolerance allowance llvm-svn: 311151	2017-08-18 09:31:21 +00:00
Geoff Berry	bd47e8a4f7	Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" round 2 This reverts commit r311135. sanitizer-x86_64-linux-android buildbot is timing out with just this patch applied. llvm-svn: 311142	2017-08-18 01:43:11 +00:00
Richard Smith	c0541dfa3e	Increase tail dup threshold for -O3 from 3 to 4. We see a modest performance improvement from this slightly higher tail dup threshold. Differential Revision: https://reviews.llvm.org/D36775 llvm-svn: 311139	2017-08-17 23:38:41 +00:00
Craig Topper	1fae3ae6f0	[X86] Remove SSE/AVX patterns for AND/XOR/OR/ANDN that checked for the inputs being bitcasted from floating point types. There's really no reason to do this we should just let isel pick the integer version and let the execution dependency fixing pass take care of moving to FP if necessary. It's not very reliable to look for bitcasts at the edges of patterns. If for some reason one input was bitcasted and the other wasn't, or if one was a v4f32 bitcast and one was a v2f64 bitcast, we would have fallen back to the integer pattern anyway. llvm-svn: 311138	2017-08-17 23:20:57 +00:00
Tim Northover	48fff995d6	GlobalISel (AArch64): fix ABI at border between GPRs and SP. If a struct would end up half in GPRs and half on SP the ABI says it should actually go entirely on the stack. We were getting this wrong in GlobalISel before, causing compatibility issues. llvm-svn: 311137	2017-08-17 23:14:01 +00:00
Geoff Berry	51f52c4fca	Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding" Two issues identified by buildbots were addressed: - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use. - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use). [MachineCopyPropagation] Extend pass to do COPY source forwarding This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. Reviewers: qcolombet, javed.absar, MatzeB, jonpa Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny Differential Revision: https://reviews.llvm.org/D30751 llvm-svn: 311135	2017-08-17 23:06:55 +00:00
Jonas Devlieghere	30756da212	Revert "[Debug info] Transfer DI to fragment expressions for split integer values." This reverts commit r311102. llvm-svn: 311111	2017-08-17 17:58:33 +00:00
Alexey Bataev	84ad9ae032	[SimplifyCFG] Add a test for preserve store alignment, NFC. llvm-svn: 311106	2017-08-17 17:26:52 +00:00
Sanjay Patel	f2d67f7ecc	[x86] add tests for vector select-of-constants; NFC We've discussed canonicalizing to this form in IR, so the backend should be prepared to lower these in ways better than what we see here in most cases. llvm-svn: 311103	2017-08-17 17:07:37 +00:00
Jonas Devlieghere	622fedc001	[Debug info] Transfer DI to fragment expressions for split integer values. This patch teaches the SDag type legalizer how to split up debug info for integer values that are split into a hi and lo part. Differential Revision: https://reviews.llvm.org/D36805 llvm-svn: 311102	2017-08-17 17:06:48 +00:00
Sanjay Patel	18424e1581	[PowerPC] add tests for vector select-of-constants; NFC We've discussed canonicalizing to this form in IR, so the backend should be prepared to lower these in ways better than what we see here. llvm-svn: 311099	2017-08-17 17:03:11 +00:00
Adrian Prantl	6a57daad81	Improve line debug info when translating a CaseBlock to SDNodes. The SelectionDAGBuilder translates various conditional branches into CaseBlocks which are then translated into SDNodes. If a conditional branch results in multiple CaseBlocks only the first CaseBlock is translated into SDNodes immediately, the rest of the CaseBlocks are put in a queue and processed when all LLVM IR instructions in the basic block have been processed. When a CaseBlock is transformed into SDNodes the SelectionDAGBuilder is queried for the current LLVM IR instruction and the resulting SDNodes are annotated with the debug info of the current instruction (if it exists and has debug metadata). When the deferred CaseBlocks are processed, the SelectionDAGBuilder does not have a current LLVM IR instruction, and the resulting SDNodes will not have any debuginfo. As DwarfDebug::beginInstruction() outputs a .loc directive for the first instruction in a labeled block (typically the case for something coming from a CaseBlock) this tends to produce a line-0 directive. This patch changes the handling of CaseBlocks to store the current instruction's debug info into the CaseBlock when it is created (and the SelectionDAGBuilder knows the current instruction) and to always use the stored debug info when translating a CaseBlock to SDNodes. Patch by Frej Drejhammar! Differential Revision: https://reviews.llvm.org/D36671 llvm-svn: 311097	2017-08-17 16:57:13 +00:00
Craig Topper	3a622a14f9	[AVX512] Don't switch unmasked subvector insert/extract instructions when AVX512DQI is enabled. There's no reason to switch instructions with and without DQI. It just creates extra isel patterns and test divergences. There is however value in enabling the masked version of the instructions with DQI. This required introducing some new multiclasses to enabling this splitting. Differential Revision: https://reviews.llvm.org/D36661 llvm-svn: 311091	2017-08-17 15:40:25 +00:00
Victor Leschuk	bcfd8e28c8	Mark Verifier/invalid-eh.ll as unsupported on windows Mark this unsupported for now as it causes tests hangs on buildbot. Will place it back when the problem is debugged. llvm-svn: 311089	2017-08-17 15:07:03 +00:00
Simon Dardis	b5205c69d2	[dfsan] Add explicit zero extensions for shadow parameters in function wrappers. In the case where dfsan provides a custom wrapper for a function, shadow parameters are added for each parameter of the function. These parameters are i16s. For targets which do not consider this a legal type, the lack of sign extension information would cause LLVM to generate anyexts around their usage with phi variables and calling convention logic. Address this by introducing zero exts for each shadow parameter. Reviewers: pcc, slthakur Differential Revision: https://reviews.llvm.org/D33349 llvm-svn: 311087	2017-08-17 14:14:25 +00:00
Daniel Sanders	032e7f2cad	[globalisel][tablegen] Generate TypeObject table. NFC Summary: Generate the type table from the types used by a target rather than hard-coding the union of types used by all targets. Depends on D36084 Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Reviewed By: rovka Subscribers: kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D36085 llvm-svn: 311084	2017-08-17 13:18:35 +00:00
Simon Pilgrim	8be9f4af4f	[DAGCombiner] Add support for non-uniform constant vectors to (mul x, (1 << c)) -> x << c llvm-svn: 311083	2017-08-17 13:03:34 +00:00
Davide Italiano	903fd3ea4e	[Verifier] Avoid visiting DIGlobalVariables twice. We currently visit them twice. Once, through `visitMDNode()` -> (the code generated by) `../include/llvm/IR/Metadata.def:109` -> `visitDIGlobalVariable()` Then, through `visitMDNode()` -> `visitDIGlobalVariableExpression()` -> `visitDIGlobalVariable()` This results in verification failures printed twice, e.g.: $ ./opt -verify ../../test/DebugInfo/pr34186.ll missing global variable type !4 = distinct !DIGlobalVariable(name: "pat", scope: !0, file: !1, line: 27, isLocal: true, isDefinition: true) missing global variable type !4 = distinct !DIGlobalVariable(name: "pat", scope: !0, file: !1, line: 27, isLocal: true, isDefinition: true) ./opt: ../../test/DebugInfo/pr34186.ll: error: input module is broken! The patch removes one call so we ensure each GV is visited exactly once. Differential Revision: https://reviews.llvm.org/D36797 llvm-svn: 311081	2017-08-17 11:32:21 +00:00
Ayal Zaks	6627883369	[LV] Using VPlan to model the vectorized code and drive its transformation VPlan is an ongoing effort to refactor and extend the Loop Vectorizer. This patch introduces the VPlan model into LV and uses it to represent the vectorized code and drive the generation of vectorized IR. In this patch VPlan models the vectorized loop body: the vectorized control-flow is represented using VPlan's Hierarchical CFG, with predication refactored from being a post-vectorization-step into a vectorization planning step modeling if-then VPRegionBlocks, and generating code inline with non-predicated code. The vectorized code within each VPBasicBlock is represented as a sequence of Recipes, each responsible for modelling and generating a sequence of IR instructions. To keep the size of this commit manageable the Recipes in this patch are coarse-grained and capture large chunks of LV's code-generation logic. The constructed VPlans are dumped in dot format under -debug. This commit retains current vectorizer output, except for minor instruction reorderings; see associated modifications to lit tests. For further details on the VPlan model see docs/Proposals/VectorizationPlan.rst and its references. Authors: Gil Rapaport and Ayal Zaks Differential Revision: https://reviews.llvm.org/D32871 llvm-svn: 311077	2017-08-17 09:29:59 +00:00
Daniel Sanders	edd0784be6	Re-commit: [globalisel][tablegen] Support zero-instruction emission. Summary: Support the case where an operand of a pattern is also the whole of the result pattern. In this case the original result and all its uses must be replaced by the operand. However, register class restrictions can require a COPY. This patch handles both cases by always emitting the copy and leaving it for the register allocator to optimize. The previous commit failed on Windows machines due to a flaw in the sort predicate which allowed both A < B < C and B == C to be satisfied simultaneously. The cause of this was some sloppiness in the priority order of G_CONSTANT instructions compared to other instructions. These had equal priority because it makes no difference, however there were operands had higher priority than G_CONSTANT but lower priority than any other instruction. As a result, a priority order between G_CONSTANT and other instructions must be enforced to ensure the predicate defines a strict weak order. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D36084 llvm-svn: 311076	2017-08-17 09:26:14 +00:00
Jonas Paulsson	57a705d9d0	[SystemZ, MachineScheduler] Improve post-RA scheduling. The idea of this patch is to continue the scheduler state over an MBB boundary in the case where the successor block has only one predecessor. This means that the scheduler will continue in the successor block (after emitting any branch instructions) with e.g. maintained processor resource counters. Benchmarks have been confirmed to benefit from this. The algorithm in MachineScheduler.cpp that extracts scheduling regions of an MBB has been extended so that the strategy may optionally reverse the order of processing the regions themselves. This is controlled by a new method doMBBSchedRegionsTopDown(), which defaults to false. Handling the top-most region of an MBB first also means that a top-down scheduler can continue the scheduler state across any scheduling boundary between to regions inside MBB. Review: Ulrich Weigand, Matthias Braun, Andy Trick. https://reviews.llvm.org/D35053 llvm-svn: 311072	2017-08-17 08:33:44 +00:00
Elad Cohen	124d32829c	[SelectionDAG] Teach the vector-types operand scalarizer about SETCC When v1i1 is legal (e.g. AVX512) the legalizer can reach a case where a v1i1 SETCC with an illgeal vector type operand wasn't scalarized (since v1i1 is legal) but its operands does have to be scalarized. This used to assert because SETCC was missing from the vector operand scalarizer. This patch attemps to teach the legalizer to handle these cases by scalazring the operands, converting the node into a scalar SETCC node. Differential revision: https://reviews.llvm.org/D36651 llvm-svn: 311071	2017-08-17 08:06:36 +00:00
Serguei Katkov	9e5604dbe1	[CGP] Fix the rematerialization of gc.relocates If we want to substitute the relocation of derived pointer with gep of base then we must ensure that relocation of base dominates the relocation of derived pointer. Currently only check for basic block is present. However it is possible that both relocation are in the same basic block but relocation of derived pointer is defined earlier. The patch moves the relocation of base pointer right before relocation of derived pointer in this case. Reviewers: sanjoy,artagnon,igor-laevsky,reames Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36462 llvm-svn: 311067	2017-08-17 05:48:30 +00:00
Geoff Berry	4e38e02e6f	Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" This reverts commit r311038. Several buildbots are breaking, and at least one appears to be due to the forwarding of physical regs enabled by this change. Reverting while I investigate further. llvm-svn: 311062	2017-08-17 04:04:11 +00:00
Saleem Abdulrasool	dd8c16b58e	ARM: mark CPSR as clobbered for Windows VLAs When lowering a VLA, we emit a __chstk call. However, this call can internally clobber CPSR. We did not mark this register as an ImpDef, which could potentially allow a comparison to be hoisted above the call to `__chkstk`. In such a case, the CPSR could be clobbered, and the check invalidated. When the support was initially added, it seemed that the call would take care of preventing CPSR from being clobbered, but this is not the case. Mark the register as clobbered to fix a possible state corruption. llvm-svn: 311061	2017-08-17 02:42:24 +00:00
Jakub Kuderski	fd5c5c9144	Reapply: [ADCE][Dominators] Teach ADCE to preserve dominators Summary: This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees. I didn't notice any performance impact when bootstrapping clang with this patch. The patch was originally committed in r311039 and reverted in r311049. This revision fixes the problem with not adding a dependency on the DominatorTreeWrapperPass for the LegacyPassManager. Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki Reviewed By: davide Subscribers: grandinj, zhendongsu, llvm-commits, david2050 Differential Revision: https://reviews.llvm.org/D35869 llvm-svn: 311057	2017-08-17 01:41:49 +00:00
Sanjay Patel	4abc3f6036	[x86] add cmov promotion tests for D36711; NFC This way we can see what the current codegen looks like. I've also explicitly added/removed the cmov attribute from the RUN lines, so we know exactly what we're checking in the runs. llvm-svn: 311052	2017-08-16 22:50:11 +00:00
Amjad Aboud	86111c6696	[InstCombine] Teach canEvaluateTruncated to handle arithmetic shift (including those with vector splat shift amount) Differential Revision: https://reviews.llvm.org/D36784 llvm-svn: 311050	2017-08-16 22:42:38 +00:00
Jakub Kuderski	cbcffb173c	Revert "[ADCE][Dominators] Teach ADCE to preserve dominators" This reverts commit r311039. The patch caused the `test/Bindings/OCaml/Output/scalar_opts.ml` to fail. llvm-svn: 311049	2017-08-16 22:10:53 +00:00
Craig Topper	882f29630b	[InstCombine] Make folding (X >s -1) ? C1 : C2 --> ((X >>s 31) & (C2 - C1)) + C1 support splat vectors This also uses decomposeBitTestICmp to decode the compare. Differential Revision: https://reviews.llvm.org/D36781 llvm-svn: 311044	2017-08-16 21:52:07 +00:00
Jakub Kuderski	4552e9de9f	[ADCE][Dominators] Teach ADCE to preserve dominators Summary: This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees. I didn't notice any performance impact when bootstrapping clang with this patch. Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki Reviewed By: davide Subscribers: grandinj, zhendongsu, llvm-commits, david2050 Differential Revision: https://reviews.llvm.org/D35869 llvm-svn: 311039	2017-08-16 20:50:23 +00:00
Geoff Berry	87f8d25150	[MachineCopyPropagation] Extend pass to do COPY source forwarding This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. Reviewers: qcolombet, javed.absar, MatzeB, jonpa Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny Differential Revision: https://reviews.llvm.org/D30751 llvm-svn: 311038	2017-08-16 20:50:01 +00:00
Simon Atanasyan	cb833076ac	[mips] Handle R_MIPS_TLS_DTPREL32/64 relocations in the RelocVisitor Debug information for TLS variables on MIPS might have R_MIPS_TLS_DTPREL32 or R_MIPS_TLS_DTPREL64 relocations. This patch adds a support for such relocations in the `RelocVisitor`. llvm-svn: 311031	2017-08-16 19:01:22 +00:00
Xinliang David Li	71ecaa19ff	[PGO] Fix ThinLTO crash Differential Revsion: http://reviews.llvm.org/D36640 llvm-svn: 311023	2017-08-16 17:18:01 +00:00
Simon Pilgrim	38e8a023fa	[X86] Regenerate immediate store merging tests llvm-svn: 311016	2017-08-16 16:22:19 +00:00
Hal Finkel	9e54b7093a	[BDCE] Don't check demanded bits on unsized types To clear assumptions that are potentially invalid after trivialization, we need to walk the use/def chain. Normally, the only way to reach an instruction with an unsized type is via an instruction that has side effects (or otherwise will demand its input bits). That would stop the walk. However, if we have a readnone function that returns an unsized type (e.g., void), we must avoid asking for the demanded bits of the function call's return value. A void-returning readnone function is always dead (and so we can stop walking the use/def chain here), but the check is necessary to avoid asserting. Fixes PR34211. llvm-svn: 311014	2017-08-16 16:09:22 +00:00
Davide Italiano	cd21378ff6	[Verifier] Reject globals without a type associated. llvm-svn: 311012	2017-08-16 15:16:33 +00:00
Dmitry Preobrazhensky	b865ef534a	[AMDGPU][MC][GFX9] Added op_sel support for v_mad_*16, v_fma_f16, v_div_fixup_f16 This change implements features postponed in https://reviews.llvm.org/D35424 because of a dependency on https://reviews.llvm.org/D36322 Reviewers: SamWot, artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D36694 llvm-svn: 311011	2017-08-16 15:16:32 +00:00
Balaram Makam	c5698befb6	Revert "MachineInstr: Reason locally about some memory objects before going to AA." r310825 caused the clang-ppc64le-linux-lnt bot to go red (http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/5712) because of a test-suite failure of SingleSource/UnitTests/2003-07-09-SignedArgs This reverts commit 0028f6a87224fb595a1c19c544cde9b003035996. llvm-svn: 311008	2017-08-16 14:17:43 +00:00
Dmitry Preobrazhensky	ff64aa514b	[AMDGPU][MC][GFX9] Added integer clamping support for VOP3 opcodes See Bug 34152: https://bugs.llvm.org//show_bug.cgi?id=34152 Reviewers: SamWot, artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D36674 llvm-svn: 311006	2017-08-16 13:51:56 +00:00
Simon Pilgrim	c63f93a197	[CostModel][X86][XOP] Improve costs for XOP shuffles VPPERM/VPERMIL2PD/VPERMIL2PS all provide more effective 2-input shuffles than regular AVX instructions llvm-svn: 311005	2017-08-16 13:50:20 +00:00
Davide Italiano	75ebc568e2	[DI] Every DIGlobalVariable should have a type. I'll make this a verifier check to catch other violations. This commit fixes the tests already in tree. llvm-svn: 311004	2017-08-16 13:39:07 +00:00
Simon Dardis	9c5d64b901	[mips] Handle variables with an explicit section and interactions with .sdata, .sbss If a variable has an explicit section such as .sdata or .sbss, it is placed in that section and accessed in a gp relative manner. This overrides the global -G setting. Otherwise if a variable has a explicit section attached to it, such as '.rodata' or '.mysection', it is not placed in the small data section. This also overrides the global -G setting. Reviewers: atanasyan, nitesh.jain Differential Revision: https://reviews.llvm.org/D36616 llvm-svn: 311001	2017-08-16 12:18:04 +00:00
Sam Parker	84fd0c3bf2	[ARM] Improve loop unrolling for Cortex-M - Set the default runtime unroll count to 4 and use the newly added UnrollRemainder option. - Create loop cost and force unroll for a cost less than 12. - Disable unrolling on Thumb1 only targets. Differential Revision: https://reviews.llvm.org/D36134 llvm-svn: 310997	2017-08-16 07:42:44 +00:00
Igor Breger	ce5ea38135	[GlobalISel][X86] Fix mir tests, use correct physical register.NFC. llvm-svn: 310996	2017-08-16 07:25:51 +00:00
Martin Storsjo	58c9527eaf	[llvm-dlltool] Fix creating stdcall/fastcall import libraries for i386 Hook up the -k option (that in the original GNU dlltool removes the @n suffix from the symbol that the final executable ends up linked to). In llvm-dlltool, make sure that functions end up with the undecorate name type if this option is set and they are decorated. In mingw, when creating import libraries from def files instead of creating an import library as a side effect of linking a DLL, the symbol names in the def contain the stdcall/fastcall decoration (but no leading underscore). By setting the undecorate name type, a linker linking to the import library will omit the decoration from the DLL import entry. With this in place, mingw-w64 for i386 built with llvm-dlltool/clang produces import libraries that actually work. Differential Revision: https://reviews.llvm.org/D36548 llvm-svn: 310990	2017-08-16 05:18:36 +00:00
Stanislav Mekhanoshin	a9487d92d7	[AMDGPU] Eliminate no effect instructions before s_endpgm Differential Revision: https://reviews.llvm.org/D36585 llvm-svn: 310987	2017-08-16 04:43:49 +00:00
Dehao Chen	84d412035a	Merge debug info when hoist then-else code to if. Summary: When we move then-else code to if, we need to merge its debug info, otherwise the hoisted instruction may have inaccurate debug info attached. Reviewers: aprantl, probinson, dblaikie, echristo, loladiro Reviewed By: aprantl Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D36778 llvm-svn: 310985	2017-08-16 01:55:26 +00:00
Derek Schuff	8e71359561	[WebAssembly] Remove infinite loop from reg-stackify test r310940 exposed reverse-unreachable code to some optimizers, which caused some of the code in this test to be sunk, changing the input to the pass and breaking the exptectations. Since that change is irrelevant to this particular test, this change just adds an exit node to work around the problem; the test should really be more robust (or be an MIR test?) but this preserves the existing test intent. llvm-svn: 310981	2017-08-16 00:49:44 +00:00
Quentin Colombet	647b482c1e	[VirtRegRewriter] Properly model the register liveness on undef subreg definition Undef subreg definition means that the content of the super register doesn't matter at this point. While that's true for virtual registers, this may not hold when replacing them with actual physical registers. Indeed, some part of the physical register may be coalesced with the related virtual register and thus, the values for those parts matter and must be live. The fix consists in checking whether or not subregs of the physical register being assigned to an undef subreg definition are live through that def and insert an implicit use if they are. Doing so, will keep them alive until that point like they should be. E.g., let vreg14 being assigned to R0_R1 then %vreg14:gsub_0<def,read-undef> = COPY %R0 ; <-- R1 is still live here %vreg14:gsub_1<def> = COPY %R1 Before this changes, the rewriter would change the code into: %R0<def> = KILL %R0, %R0_R1<imp-def> ; <-- this tells R1 is redefined %R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use> ; this value of this R1 ; is believed to come ; from the previous ; instruction Because of this invalid liveness, later pass could make wrong choices and in particular clobber live register as it happened with the register scavenger in llvm.org/PR34107 Now we would generate: %R0<def> = KILL %R0, %R0_R1<imp-def>, %R0_R1<imp-use> ; This tells R1 needs to ; reach this point %R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use> The bug has been here forever, it got exposed recently because the register scavenger got smarter. Fixes llvm.org/PR34107 llvm-svn: 310979	2017-08-16 00:17:05 +00:00
Kuba Mracek	a1adbe6ca1	Revert archive-* tests from r310953, there were test failures. llvm-svn: 310974	2017-08-15 23:41:34 +00:00
Craig Topper	0a1a276d91	[InstCombine] Teach canEvaluateZExtd and canEvaluateTruncated to handle vector shifts with splat shift amount We were only allowing ConstantInt before. This patch allows splat of ConstantInt too. Differential Revision: https://reviews.llvm.org/D36763 llvm-svn: 310970	2017-08-15 22:48:41 +00:00
Kuba Mracek	c0301b2f53	Revert changes in r310953 for llvm-symbolizer.test. The change causes a test failure. llvm-svn: 310956	2017-08-15 21:02:17 +00:00
Kuba Mracek	17ee427ef3	[llvm] Get rid of "%T" expansions The %T lit expansion expands to a common directory shared between all the tests in the same directory, which is unexpected and unintuitive, and more importantly, it's been a source of subtle race conditions and flaky tests. In https://reviews.llvm.org/D35396, it was agreed that it would be best to simply ban %T and only keep %t, which is unique to each test. When a test needs a temporary directory, it can just create one using mkdir %t. This patch removes %T in llvm. Differential Revision: https://reviews.llvm.org/D36495 llvm-svn: 310953	2017-08-15 20:29:24 +00:00
Amjad Aboud	0464c5d958	[InstCombine] Added support for (X >>s C) << C --> X & (-1 << C) Differential Revision: https://reviews.llvm.org/D36743 llvm-svn: 310949	2017-08-15 19:33:14 +00:00
Sanjay Patel	f69b7d5c93	[InstCombine] sink sext after ashr Narrow ops are better for bit-tracking, and in the case of vectors, may enable better codegen. As the trunc test shows, this can allow follow-on simplifications. There's a block of code in visitTrunc that deals with shifted ops with FIXME comments. It may be possible to remove some of that now, but I want to make sure there are no problems with this step first. http://rise4fun.com/Alive/Y3a Name: hoist_ashr_ahead_of_sext_1 %s = sext i8 %x to i32 %r = ashr i32 %s, 3 ; shift value is < than source bit width => %a = ashr i8 %x, 3 %r = sext i8 %a to i32 Name: hoist_ashr_ahead_of_sext_2 %s = sext i8 %x to i32 %r = ashr i32 %s, 8 ; shift value is >= than source bit width => %a = ashr i8 %x, 7 ; so clamp this shift value %r = sext i8 %a to i32 Name: junc_the_trunc %a = sext i16 %v to i32 %s = ashr i32 %a, 18 %t = trunc i32 %s to i16 => %t = ashr i16 %v, 15 llvm-svn: 310942	2017-08-15 18:25:52 +00:00
Jakub Kuderski	638c085d07	[Dominators] Include infinite loops in PostDominatorTree Summary: This patch teaches PostDominatorTree about infinite loops. It is built on top of D29705 by @dberlin which includes a very detailed motivation for this change. What's new is that the patch also teaches the incremental updater how to deal with reverse-unreachable regions and how to properly maintain and verify tree roots. Before that, the incremental algorithm sometimes ended up preserving reverse-unreachable regions after updates that wouldn't appear in the tree if it was constructed from scratch on the same CFG. This patch makes the following assumptions: - A sequence of updates should produce the same tree as a recalculating it. - Any sequence of the same updates should lead to the same tree. - Siblings and roots are unordered. The last two properties are essential to efficiently perform batch updates in the future. When it comes to the first one, we can decide later that the consistency between freshly built tree and an updated one doesn't matter match, as there are many correct ways to pick roots in infinite loops, and to relax this assumption. That should enable us to recalculate postdominators less frequently. This patch is pretty conservative when it comes to incremental updates on reverse-unreachable regions and ends up recalculating the whole tree in many cases. It should be possible to improve the performance in many cases, if we decide that it's important enough. That being said, my experiments showed that reverse-unreachable are very rare in the IR emitted by clang when bootstrapping clang. Here are the statistics I collected by analyzing IR between passes and after each removePredecessor call: ``` # functions: 52283 # samples: 337609 # reverse unreachable BBs: 216022 # BBs: 247840796 Percent reverse-unreachable: 0.08716159869015269 % Max(PercRevUnreachable) in a function: 87.58620689655172 % # > 25 % samples: 471 ( 0.1395104988314885 % samples ) ... in 145 ( 0.27733680163724345 % functions ) ``` Most of the reverse-unreachable regions come from invalid IR where it wouldn't be possible to construct a PostDomTree anyway. I would like to commit this patch in the next week in order to be able to complete the work that depends on it before the end of my internship, so please don't wait long to voice your concerns :). Reviewers: dberlin, sanjoy, grosser, brzycki, davide, chandlerc, hfinkel Reviewed By: dberlin Subscribers: nhaehnle, javed.absar, kparzysz, uabelho, jlebar, hiraditya, llvm-commits, dberlin, david2050 Differential Revision: https://reviews.llvm.org/D35851 llvm-svn: 310940	2017-08-15 18:14:57 +00:00
Sanjay Patel	d52997a296	[InstCombine] add tests for sext+ashr; NFC llvm-svn: 310935	2017-08-15 17:41:31 +00:00
Daniel Sanders	eb2f5f3256	Revert r310919 - [globalisel][tablegen] Support zero-instruction emission. As expected, this failed on the windows bots but the instrumentation showed something interesting. The ADD8ri and INC8r rules are never directly compared on the windows machines. That implies that the issue lies in transitivity of the Compare predicate. I believe I've already verified that but maybe I missed something. llvm-svn: 310922	2017-08-15 15:10:31 +00:00
Daniel Sanders	16e6dd3cd6	Re-commit with some instrumentation: [globalisel][tablegen] Support zero-instruction emission. Summary: Support the case where an operand of a pattern is also the whole of the result pattern. In this case the original result and all its uses must be replaced by the operand. However, register class restrictions can require a COPY. This patch handles both cases by always emitting the copy and leaving it for the register allocator to optimize. The previous commit failed on the windows bots and this one is likely to fail on those same bots. However, the added instrumentation should reveal a particular isHigherPriorityThan() evaluation which I'm expecting to expose that these machines are weighing priority of two rules differently from the non-windows machines. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D36084 llvm-svn: 310919	2017-08-15 13:50:09 +00:00
Alex Bradbury	2fee9ead7e	[RISCV] Add RISCVInstPrinter and basic MC assembler tests With the addition of RISCVInstPrinter, it is now possible to test the basic operation of the RISCV MC layer. Differential Revision: https://reviews.llvm.org/D23564 llvm-svn: 310917	2017-08-15 13:08:29 +00:00
George Rimar	6957ab5b7b	[llvm-dwarfdump] - Print section name and index when dumping .debug_info ranges Teaches llvm-dwarfdump to print section index and name of range when it dumps .debug_info. Differential revision: https://reviews.llvm.org/D36313 llvm-svn: 310915	2017-08-15 12:32:54 +00:00
Ayal Zaks	25e2800e20	[LV] Minor savings to Sink casts to unravel first order recurrence Two minor savings: avoid copying the SinkAfter map and avoid moving a cast if it is not needed. Differential Revision: https://reviews.llvm.org/D36408 llvm-svn: 310910	2017-08-15 08:32:59 +00:00
Craig Topper	b1e4b1a070	[InstSimplify] Teach decomposeBitTestICmp to handle non-canonical compares This adds support non-canonical compare predicates. InstSimplify can't rely on canonicalization to have occurred. Differential Revision: https://reviews.llvm.org/D36646 llvm-svn: 310893	2017-08-14 22:11:43 +00:00
John Baldwin	1255b165bf	[MIPS] Implement support for -mstack-alignment. Summary: This is modeled on the implementation for x86 which stores the command line option in a 'StackAlignOverride' field in MipsSubtarget and then uses this to compute a 'stackAlignment' value in MipsSubtarget::initializeSubtargetDependencies. The stackAlignment() method in MipsSubTarget is renamed to getStackAlignment() and returns the computed 'stackAlignment'. Reviewers: sdardis Reviewed By: sdardis Subscribers: llvm-commits, arichardson Differential Revision: https://reviews.llvm.org/D35874 llvm-svn: 310891	2017-08-14 21:49:38 +00:00
Craig Topper	0aa3a19512	Recommit r310869, "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify" This recommits r310869, with the moved files and no extra changes. Original commit message: This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too. I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself. I also had to make decomposeBitTest support vectors since InstSimplify needs that. As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library. Differential Revision: https://reviews.llvm.org/D36593 llvm-svn: 310889	2017-08-14 21:39:51 +00:00
Chandler Carruth	bba762a13f	[InlineCost] Refactor the checks for different analyses to be a bit more localized to the code that uses those analyses. Technically, this can change behavior as we no longer require the existence of the ProfileSummaryInfo analysis to use local profile information via BFI. We didn't actually require the PSI to have an interesting profile though, so this only really impacts the behavior in non-default pass pipelines. IMO, this makes it substantially less surprising how everything works -- before an analysis that wasn't actually used had to exist to trigger any profile aware inlining. I think the new organization makes it more obvious where various checks for profile signals happen. Differential Revision: https://reviews.llvm.org/D36710 llvm-svn: 310888	2017-08-14 21:25:00 +00:00
Andrew Kaylor	53a5fbb45f	Add strictfp attribute to prevent unwanted optimizations of libm calls Differential Revision: https://reviews.llvm.org/D34163 llvm-svn: 310885	2017-08-14 21:15:13 +00:00
Craig Topper	69fa8e0d99	Revert r310869 "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify" Failed to add the two files that moved. And then added an extra change I didn't mean to while trying to fix that. Reverting everything. llvm-svn: 310873	2017-08-14 19:09:32 +00:00
Craig Topper	2f0b450666	[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too. I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself. I also had to make decomposeBitTest support vectors since InstSimplify needs that. As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library. Differential Revision: https://reviews.llvm.org/D36593 llvm-svn: 310869	2017-08-14 18:49:42 +00:00
Craig Topper	58def1e1f2	[InstSimplify] Add some tests cases for selects with bittests hidden in ugt/ult/uge/ule compares. NFC llvm-svn: 310868	2017-08-14 18:49:39 +00:00

... 3 4 5 6 7 ...

47239 Commits