llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	04493fda81	[X86] Don't print the aliased version of CVTSD2SI64rm. This appears to be a mistake I made years ago. llvm-svn: 257149	2016-01-08 06:09:18 +00:00
Craig Topper	29510c0430	[X86] Use \t instead of space after mnemonics in a bunch of InstAliases for consistency. llvm-svn: 257148	2016-01-08 06:09:13 +00:00
Xinliang David Li	062cde9cc3	[PGO] Ensure vp data in indexed profile always sorted Done in InstrProfWriter to eliminate the need for client code to do the sorting. The operation is done once and reused many times so it is more efficient. Update unit test to remove sorting. Also update expected output of affected tests. llvm-svn: 257145	2016-01-08 05:45:21 +00:00
Junmo Park	aa9243a25d	Remove extra whitespace. NFC. llvm-svn: 257144	2016-01-08 04:20:32 +00:00
Xinliang David Li	51dc04cff2	[PGO] Fix a bug in InstProfWriter addRecord For a new record with weight != 1, only edge profiling counters are scaled, VP data is not properly scaled. This patch refactors the code and fixes the problem. Also added sort by count interface (for follow up patch). llvm-svn: 257143	2016-01-08 03:49:59 +00:00
Mehdi Amini	599ebf2767	Remove static global GCNames from Function.cpp and move it to the Context This removes the need for locking when deleting a function. Differential Revision: http://reviews.llvm.org/D15988 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 257139	2016-01-08 02:28:20 +00:00
Kyle Butt	bfcff3856a	Add call sequence start and end for __tls_get_addr This is a fix for bug http://llvm.org/bugs/show_bug.cgi?id=25839. For a PIC TLS variable access in a function, prologue (mflr followed by std and stdu) gets scheduled after a tls_get_addr call. tls_get_addr messed up LR but no one saves/restores it. Also added a test for save/restore clobbered registers during calling __tls_get_addr. Patch by Tim Shen llvm-svn: 257137	2016-01-08 02:06:19 +00:00
Kyle Butt	a02ce98bd4	[Vectorization] Actually return from error case in isStridedPtr The early return seems to be missed. This causes a radical and wrong loop optimization on powerpc. It isn't reproducible on x86_64, because "UseInterleaved" is false. Patch by Tim Shen. llvm-svn: 257134	2016-01-08 01:55:13 +00:00
Sanjay Patel	d72a458d28	[InstCombine] insert a new shuffle in a safe place (PR25999) Limit this transform to a basic block and guard against PHIs. Hopefully, this fixes the remaining failures in PR25999: https://llvm.org/bugs/show_bug.cgi?id=25999 llvm-svn: 257133	2016-01-08 01:39:16 +00:00
Dan Gohman	4ef99433aa	[WebAssembly] Minor code cleanups. NFC. llvm-svn: 257131	2016-01-08 01:18:00 +00:00
Matthias Braun	7c66afb887	IntEqClasses: Let join() return the new leader The new leader is known anyway so we can return it for some micro optimization in code where it is easy to pass along the result to the next join(). llvm-svn: 257130	2016-01-08 01:16:39 +00:00
Matthias Braun	bf47f63b74	LiveInterval: A LiveRange is enough for ConnectedVNInfoEqClasses::Classify() llvm-svn: 257129	2016-01-08 01:16:35 +00:00
Dan Gohman	35e4a28947	[WebAssembly] Minor code cleanups. NFC. llvm-svn: 257128	2016-01-08 01:06:00 +00:00
Dan Gohman	8633eedb30	[WebAssembly] Remove an unused def : Pat. WebAssemblyISelLowering.cpp does not wrap jump table nodes inside of WebAssemblywrapper nodes, so this pattern is not currently used. llvm-svn: 257127	2016-01-08 00:50:33 +00:00
Dan Gohman	cceedf79b4	[WebAssembly] Remove unused arguments, unused functions. NFC. llvm-svn: 257125	2016-01-08 00:43:54 +00:00
Eric Christopher	b793230797	Add some testing for thumb1 and thumb2 inline asm immediate constraints and fix a couple of bugs on inspection. Also fixes PR26061. llvm-svn: 257122	2016-01-08 00:34:44 +00:00
Alexey Samsonov	117b104166	[LiveDebugValues] Replace several lines of code with operator[]. llvm-svn: 257114	2016-01-07 23:38:45 +00:00
Aditya Nandakumar	f94c149f7f	Instructions to be redone only if from the same BB While adding instructions (possible roots) to be redone, make sure they are from the same basic block. llvm-svn: 257112	2016-01-07 23:22:55 +00:00
JF Bastien	b9ec4c6cea	WebAssembly: use .skip instead of .zero directive .zero is confusing when used with two arguments. Documentation: This directive emits SIZE 0-valued bytes. SIZE must be an absolute expression. This directive is actually an alias for the '.skip' directive so it can take an optional second argument of the value to store in the bytes instead of zero. Using '.zero' in this way would be confusing however. Ref: https://sourceware.org/bugzilla/show_bug.cgi?id=18353 Hexagon and Sparc do the same, and it's all the same to WebAssembly so let's pick the less confusing of the two. llvm-svn: 257111	2016-01-07 23:18:29 +00:00
Xinliang David Li	1054a85a28	[PGO] Minor refactoring /NFC Move common defs into common header files. llvm-svn: 257108	2016-01-07 22:46:29 +00:00
Keno Fischer	ea33a25816	Temporarily revert r257105 "[Verifier] Check that debug values have proper size" Looks like there's a case where clang generates debug info that triggers the new verifier check. Reverting while investigating. llvm-svn: 257107	2016-01-07 22:39:11 +00:00
Keno Fischer	b3326be6ad	[Verifier] Check that debug values have proper size Summary: Teach the Verifier to make sure that the storage size given to llvm.dbg.declare or the value size given to llvm.dbg.value agree with what is declared in DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA). Additionally this catches a number of common mistakes, such as passing a pointer when a value was intended or vice versa. One complication comes from stack coloring which modifies the original IR when it merges allocas in order to make sure that if AA falls back to the IR it gets the correct result. However, given this new invariant, indiscriminately replacing one alloca by a different (differently sized one) is no longer valid. Fix this by just undefing out any use of the alloca in a dbg.declare in this case. Additionally, I had to fix a number of test cases. Of particular note: - I regenerated dbg-changes-codegen-branch-folding.ll from the given source as it was affected by the bug fixed in r256077 - two-cus-from-same-file.ll was changed to avoid having a variable-typed debug variable as that would depend on the target, even though this test is supposed to be generic - I had to manually declare size/align for reference type. See also the discussion for D14275/r253186. - fpstack-debuginstr-kill.ll required changing `double` to `long double` - most others were just a question of adding OP_deref Reviewers: aprantl Differential Revision: http://reviews.llvm.org/D14276 llvm-svn: 257105	2016-01-07 22:18:37 +00:00
Dimitry Andric	2c36421337	Turn off lldb debug tuning by default for FreeBSD Summary: In rL242338, debugger tuning was introduced, and the tuning for FreeBSD was set to lldb by default. However, for the foreseeable future we still need to default to gdb tuning, since lldb is not ready for all of FreeBSD's architectures, and some system tools (like objcopy, etc) have not yet been adapted to cope with the lldb tuned format, which has .apple sections. Therefore, let FreeBSD use gdb by default for now. Reviewers: emaste, probinson Subscribers: llvm-commits, emaste Differential Revision: http://reviews.llvm.org/D15966 llvm-svn: 257103	2016-01-07 22:09:12 +00:00
David Majnemer	f1a9c9e148	[SCCP] Don't violate the lattice invariants We marked values which are 'undef' as constant instead of undefined which violates SCCP's invariants. If we can figure out that a computation results in 'undef', leave it in the undefined state. This fixes PR16052. llvm-svn: 257102	2016-01-07 21:36:16 +00:00
JF Bastien	841085c561	WebAssembly: update expected failures, more asserts got resolved. llvm-svn: 257098	2016-01-07 21:00:37 +00:00
Mehdi Amini	b9b50aaffd	Fix crash when printing instructions that have a metadata attached but no parent. Fix PR24852 (crash with -debug -instcombine) Patch by Than McIntosh <thanm@google.com> Summary: Add guards to the asm writer to prevent crashing when dumping an instruction that has no basic block. Differential Revision: http://reviews.llvm.org/D15798 From: Than McIntosh <thanm@google.com> llvm-svn: 257094	2016-01-07 20:14:30 +00:00
JF Bastien	d9d2892668	WebAssembly: update expected failures, assert got resolved by r257084. llvm-svn: 257093	2016-01-07 20:07:21 +00:00
Xinliang David Li	810560773e	[PGO] Simplify coverage mapping lowering Coverage mapping data may reference names of functions that are skipped by FE (e.g., unused inline functions). Since those functions are skipped, normal instr-prof function lowering pass won't put those names in the right section, so special handling is needed to walk through coverage mapping structure and recollect the references. With this patch, only names that are skipped are processed. This simplifies the lowering code and it no longer needs to make assumptions about the coverage mapping data layout. It should also be more efficient. llvm-svn: 257091	2016-01-07 20:05:49 +00:00
David Majnemer	f3b99dd22e	Remove junk accidentally committed with r257087 llvm-svn: 257089	2016-01-07 19:30:13 +00:00
David Majnemer	bae945735a	[SCCP] Can't go from overdefined to constant The fix for PR23999 made us mark loads of null as producing the constant undef which upsets the lattice. Instead, keep the load as "undefined". This fixes PR26044. llvm-svn: 257087	2016-01-07 19:25:39 +00:00
Derek Schuff	9bfea27c26	[WebAssembly] Support combining GEP and FrameIndex offsets in memory operand offset field Previously we only supported putting the FI into memory operand offset fields if there was nothing there already. Now combine them. Differential Revision: http://reviews.llvm.org/D15941 llvm-svn: 257084	2016-01-07 18:55:52 +00:00
Dan Gohman	a4730cf0b4	[WebAssembly] Use the default private label prefixes. The MC assembler doesn't like using the empty string as a private label prefix because then it treats all labels as private. This commit reverts back to the default prefix, which is .L, which is common in ELF targets and consistent with the LLVM name mangler. llvm-svn: 257083	2016-01-07 18:49:53 +00:00
Nicolai Haehnle	82fc962c20	AMDGPU/SI: Fold operands with sub-registers Summary: Multi-dword constant loads generated unnecessary moves from SGPRs into VGPRs, increasing the code size and VGPR pressure. These moves are now folded away. Note that this lack of operand folding was not a problem for VMEM loads, because COPY nodes from VReg_Nnn to VGPR32 are eliminated by the register coalescer. Some tests are updated, note that the fsub.ll test explicitly checks that the move is elided. With the IR generated by current Mesa, the changes are obviously relatively minor: 7063 shaders in 3531 tests Totals: SGPRS: 351872 -> 352560 (0.20 %) VGPRS: 199984 -> 200732 (0.37 %) Code Size: 9876968 -> 9881112 (0.04 %) bytes LDS: 91 -> 91 (0.00 %) blocks Scratch: 1779712 -> 1767424 (-0.69 %) bytes per wave Wait states: 295164 -> 295337 (0.06 %) Totals from affected shaders: SGPRS: 65784 -> 66472 (1.05 %) VGPRS: 38064 -> 38812 (1.97 %) Code Size: 1993828 -> 1997972 (0.21 %) bytes LDS: 42 -> 42 (0.00 %) blocks Scratch: 795648 -> 783360 (-1.54 %) bytes per wave Wait states: 54026 -> 54199 (0.32 %) Reviewers: tstellarAMD, arsenm, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15875 llvm-svn: 257074	2016-01-07 17:10:29 +00:00
Nicolai Haehnle	3c05d6d3b5	AMDGPU/SI: xnack_mask is always reserved on VI Summary: Somehow, I first interpreted the docs as saying space for xnack_mask is only reserved when XNACK is enabled via SH_MEM_CONFIG. I felt uneasy about this and went back to actually test what is happening, and it turns out that xnack_mask is always reserved at least on Tonga and Carrizo, in the sense that flat_scr is always fixed below the SGPRs that are used to implement xnack_mask, whether or not they are actually used. I confirmed this by writing a shader using inline assembly to tease out the aliasing between flat_scratch and regular SGPRs. For example, on Tonga, where we fix the number of SGPRs to 80, s[74:75] aliases flat_scratch (so xnack_mask is s[76:77] and vcc is s[78:79]). This patch changes both the calculation of the total number of SGPRs and the various register reservations to account for this. It ought to be possible to use the gap left by xnack_mask when the feature isn't used, but this patch doesn't try to do that. (Note that the same applies to vcc.) Note that previously, even before my earlier change in r256794, the SGPRs that alias to xnack_mask could end up being used as well when flat_scr was unused and the total number of SGPRs happened to fall on the right alignment (e.g. highest regular SGPR being used s29 and VCC used would lead to number of SGPRs being 32, where s28 and s29 alias with xnack_mask). So if there were some conflict due to such aliasing, we should have noticed that already. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15898 llvm-svn: 257073	2016-01-07 17:10:20 +00:00
Michael Zuckerman	3aca221b31	[AVX512] add PSLLW and PSLLV Intrinsic Differential Revision: http://reviews.llvm.org/D15889 llvm-svn: 257070	2016-01-07 16:02:51 +00:00
Silviu Baranga	dd68d46ec1	Revert r257064. It caused failures in some sanitizer tests. llvm-svn: 257069	2016-01-07 15:46:43 +00:00
Silviu Baranga	c67ec3f716	Fix build after r257064: we should be returning false, not nullptr llvm-svn: 257067	2016-01-07 15:09:22 +00:00
Nico Weber	4324b9b236	Revert r257055, it caused PR26064. llvm-svn: 257066	2016-01-07 15:01:46 +00:00
Silviu Baranga	57b1b90996	[InstCombine] Look through PHIs, GEPs, IntToPtrs and PtrToInts to expose more constants when comparing GEPs Summary: When comparing two GEP instructions which have the same base pointer and one of them has a constant index, it is possible to only compare indices, transforming it to a compare with a constant. This removes one use for the GEP instruction with the constant index, can reduce register pressure and can sometimes lead to removing the comparison entirely. InstCombine was already doing this when comparing two GEPs if the base pointers were the same. However, in the case where we have complex pointer arithmetic (GEPs applied to GEPs, PHIs of GEPs, conversions to or from integers, etc) the value of the original base pointer will be hidden to the optimizer and this transformation will be disabled. This change detects when the two sides of the comparison can be expressed as GEPs with the same base pointer, even if they don't appear as such in the IR. The transformation will convert all the pointer arithmetic to arithmetic done on indices and all the relevant uses of GEPs to GEPs with a common base pointer. The GEP comparison will be converted to a comparison done on indices. Reviewers: majnemer, jmolloy Subscribers: hfinkel, jevinskie, jmolloy, aadg, llvm-commits Differential Revision: http://reviews.llvm.org/D15146 llvm-svn: 257064	2016-01-07 14:56:08 +00:00
Michael Zuckerman	354152d590	[AVX512] add PSRAV Intrinsic Differential Revision: http://reviews.llvm.org/D15856 llvm-svn: 257063	2016-01-07 14:42:20 +00:00
Amjad Aboud	d7cfb48485	Added support for macro emission in dwarf (supporting DWARF version 4). Differential Revision: http://reviews.llvm.org/D15495 llvm-svn: 257060	2016-01-07 14:28:20 +00:00
James Molloy	9971a6841c	[GlobalsAA] Partially back out r248576 See PR25822 for a more full summary, but we were conflating the concepts of "capture" and "escape". We were proving nocapture and using that proof to infer noescape, which is not true. Escaped-ness is a function-local property - as soon as a value is used in a call argument it escapes. Capturedness is a related but distinct property. It implies a temporally limited escape. Consider: static int a; int b; int g(int * nocapture arg); int f() { a = 2; // Even though a escapes to g, it is not captured so can be treated as non-escaping here. g(&a); // But here it must be treated as escaping. g(&b); // Now that g(&a) has returned we know it was not captured so we can treat it as non-escaping again. } The original commit did not sufficiently understand this nuance and so caused PR25822 and PR26046. r248576 included both a performance improvement (which has been backed out) and a related conformance fix (which has been kept along with its testcase). llvm-svn: 257058	2016-01-07 13:33:28 +00:00
Michael Zuckerman	a6df006b50	[AVX512] add PSHUFHW and PSHUFLW Intrinsic Differential Revision: http://reviews.llvm.org/D15925 llvm-svn: 257056	2016-01-07 12:35:43 +00:00
Simon Pilgrim	bcc11a059e	[X86][AVX] Match broadcast loads through a bitcast AVX1 v8i32/v4i64 shuffles are bitcasted to v8f32/v4f64, this patch peeks through bitcasts to check for a load node to allow broadcasts to occur. Follow up to D15310 llvm-svn: 257055	2016-01-07 11:34:27 +00:00
Dylan McKay	5c96de3ad7	Added AVRTargetObjectFile class and AVR.h llvm-svn: 257049	2016-01-07 10:53:15 +00:00
Tamas Berghammer	904d5fe496	Mark arm as the 32bit variant of aarch64 in Triple Change Triple::get32BitArchVariant to return arm/armeb as the 32bit variant of aarch64/aarch64_be and do the same change for the opposite direction in Triple::get64BitArchVariant. Differential revision: http://reviews.llvm.org/D15529 llvm-svn: 257048	2016-01-07 10:41:12 +00:00
Junmo Park	1238610aa1	Remove extra whitespace. NFC. llvm-svn: 257047	2016-01-07 10:26:32 +00:00
Simon Pilgrim	83e44c66ae	[X86][SSE] Add INSERTPS as a target shuffle Follow up to D15378, added INSERTPS to the list of decodable target shuffles and enabled XFormVExtractWithShuffleIntoLoad to handle target shuffles with SentinelZero and tested this with INSERTPS. llvm-svn: 257046	2016-01-07 10:24:19 +00:00
Michael Zuckerman	4a1566827d	[AVX512] add PSHUFD Intrinsic Differential Revision: http://reviews.llvm.org/D15934 llvm-svn: 257044	2016-01-07 09:24:12 +00:00
Tim Northover	bd41cf880c	ARM: support TLS accesses on Darwin platforms Darwin TLS accesses most closely resemble ELF's general-dynamic situation, since they have to be able to handle all possible situations. The descriptors and so on are obviously slightly different though. llvm-svn: 257039	2016-01-07 09:03:03 +00:00
Jonas Paulsson	3939b690f6	[SystemZ] Add hasSideEffects flag on Serialize instruction. Serialize will perform a hardware serialization operation, and is acting as a memory barrier. Therefore it must have the hasSideEffects flag set so it will be treated as a global memory object. Reviewed by Ulrich Weigand llvm-svn: 257036	2016-01-07 07:20:55 +00:00
Craig Topper	68cffb17a0	[X86] Remove superfluous mayLoad flag. The pattern already implies it. llvm-svn: 257035	2016-01-07 06:42:10 +00:00
Craig Topper	79e0ef82e8	[X86] Add hasSideEffects=0 to VBROADCASTI128. llvm-svn: 257034	2016-01-07 06:37:55 +00:00
Craig Topper	04cc5d25c7	[X86] Add OpSize32 to MOVSX32_NOREX instructions to match their other versions. llvm-svn: 257033	2016-01-07 06:37:52 +00:00
Craig Topper	0b165557b2	[X86] Add hasSideEffects=0 and mayLoad=1 to MOVZX64* instructions. While there remove a superfluous _Q from the instruction names. llvm-svn: 257032	2016-01-07 05:57:39 +00:00
Craig Topper	fc678ba944	[X86] STOSQ without a rep prefix doesn't read or write RCX. llvm-svn: 257030	2016-01-07 05:18:49 +00:00
David Majnemer	0e90f46e10	Undo spurious change made in r256965 llvm-svn: 257028	2016-01-07 04:31:35 +00:00
Philip Reames	afdbcc6a84	[Statepoints] Add test cases around vectors and stabilize test Unlike my comment in 257022 said, it turns out we do handle constant vectors in the statepoint lowering, but only because SelectionDAG doesn't actually produce constants for them. Add a couple of tests which show this working. Also, add a triple to the same test file to hopefully fix a failing bot. llvm-svn: 257025	2016-01-07 04:15:31 +00:00
Haicheng Wu	08b9462540	[AArch64 MachineCombine] Enhance/Add support for general reassociation to reduce the critical path Allow fadd/fmul to be reassociated in aarch64. llvm-svn: 257024	2016-01-07 04:01:02 +00:00
Philip Reames	3e2cf5320c	[Statepoints] Initial support for relocating vectors of pointers Currently, we try to split vectors of pointers back into their component pointer elements during rewrite-statepoints-for-gc. This is less than ideal since presumably the vectorizer chose to vectorize for a reason. :) It's also been a source of bugs - in particular, the relocation logic as currently implemented was recently discovered to be wrong. The alternate approach is to allow gc.relocates of vector-of-pointer type and update the backend to handle them. That's what this patch tries to do. This won't actually enable vector-of-pointers in practice - there are some RS4GC changes needed - but the lowering is standalone and testable so it makes sense to separate. Note that there are some known cases around vector constants which this patch does not handle. Once this is in, I'll send another patch with individual fixes and test cases. Differential Revision: http://reviews.llvm.org/D15632 llvm-svn: 257022	2016-01-07 03:32:11 +00:00
Dan Gohman	0c6f5ac50a	[WebAssembly] Add -m:e to the target triple. This enables ELF-style name mangling, which primarily means using ".L" for private symbols. llvm-svn: 257020	2016-01-07 03:19:23 +00:00
Ahmed Bougacha	a7324a2823	[Linker] Also treat a DIImportedEntity scope DISubprogram as needed. Follow-up to r257000: DIImportedEntity can reach a DISubprogram via its entity, but also via its scope. Handle the latter case as well. PR26037. llvm-svn: 257019	2016-01-07 03:14:59 +00:00
Philip Reames	103d2381d6	[RS4GC] Add an option to suppress vector splitting At the moment, this is essentially a diagnostic option so that I can start collecting failing test cases, but we will eventually migrate to removing the vector splitting code entirely. llvm-svn: 257015	2016-01-07 02:20:11 +00:00
Kostya Serebryany	152ac7ad70	[libFuzzer] add a position hint to the dictionary-based mutator llvm-svn: 257013	2016-01-07 01:49:35 +00:00
Quentin Colombet	9ed52e9a9e	[ShrinkWrapping] Give up on irreducible CFGs. We need to know whether or not a given basic block is in a loop for the analysis to be correct. Loop information may be incomplete on irreducible CFGs, therefore we may generate incorrect code if we use it in those situations. This fixes PR25988. llvm-svn: 257012	2016-01-07 01:23:49 +00:00
Teresa Johnson	b951558294	Always treat DISubprogram reached by DIImportedEntity as needed. It is illegal to have a null entity in a DIImportedEntity, so we must link in a DISubprogram metadata node referenced by one, even if the associated function is not linked in or inlined anywhere. Fixes PR26037. llvm-svn: 257000	2016-01-07 00:06:27 +00:00
Mehdi Amini	0535003bef	Fix PR26051: Memcpy optimization should introduce a call to memcpy before the store destination position This is a conservative fix, I expect Amaury to relax this. Follow-up for r256923 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 256999	2016-01-06 23:50:22 +00:00
Sanjay Patel	882a8eed3e	rangify; NFCI llvm-svn: 256998	2016-01-06 23:45:05 +00:00
Simon Pilgrim	bc82dedd26	[X86] Determine if target shuffle can contain zero elements getTargetShuffleMask may return shuffle masks with SM_SentinelZero (-2) values (currently just for PSHUFB but VPERM2X128 as well with this patch). Although some calling functions can make use of this (mainly for shuffle combining), others can not and their inclusion makes shuffle mask comparisons more difficult. This patch adds a flag to getTargetShuffleMask to indicate if the calling function can't handle SM_SentinelZero; getTargetShuffleMask will then return false if it occurs to make handling much easier. I've tidied up some uses of getTargetShuffleMask to better indicate what is going on - more could be done but at present I don't have test cases to demonstrate it. Some upcoming patches will make use of this to both support more uses where SM_SentinelZero is not permitted (e.g. combineShuffleToAddSub), and also will allow us to add INSERTPS support to getTargetShuffleMask as part of better zero handling discussed in D14261. Differential Revision: http://reviews.llvm.org/D15378 llvm-svn: 256992	2016-01-06 23:24:40 +00:00
Weiming Zhao	0f1762caf9	Recommit r256952 "Filtering IR printing for print-after-all/print-before-all" Fix lit test fail due to outputting an extra line. Differential Revision: http://reviews.llvm.org/D15776 llvm-svn: 256987	2016-01-06 22:55:03 +00:00
Justin Bogner	a43eacbf9e	Bitcode: Fix reading and writing of ConstantDataVectors of halfs In r254991 I allowed ConstantDataVectors to contain elements of HalfTy, but I missed updating the bitcode reader and writer to handle this, so now we crash if we try to emit bitcode on programs that have constant vectors of half. This fixes the issue and adds test coverage for reading and writing constant sequences in bitcode. llvm-svn: 256982	2016-01-06 22:31:32 +00:00
Nicolai Haehnle	a61e5a8d4e	AMDGPU/SI: Fix crash when inline assembly is used in a graphics shader Summary: This is admittedly something that you could only run into by manually playing around with shader assembly because the SITypeWriter pass is skipped for compute. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15902 llvm-svn: 256980	2016-01-06 22:01:04 +00:00
Sanjay Patel	c2d6461a4a	[LibCallSimplifier] less indenting; NFCI llvm-svn: 256973	2016-01-06 20:52:21 +00:00
Chen Li	78bde83003	[SplitLandingPadPredecessors] Create a PHINode for the original landingpad only if it has some uses Summary: This patch adds a check in SplitLandingPadPredecessors to see if the original landingpad instruction has any uses. If not, we don't need to create a PHINode for it in the joint block since it would be dead code anyway. The motivation for this patch is that we found a bug where SplitLandingPadPredecessors created a PHINode of token type landingpad, which failed the verifier since a PHINode cannot be of token type. However, the created PHINode will never be used in our code pattern. This patch works around this bug, and we might add support in SplitLandingPadPredecessors to handle token type landingpad with uses in the future. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15835 llvm-svn: 256972	2016-01-06 20:32:05 +00:00
Amaury Sechet	3235c08253	Promote aggregate store to memset when possible Summary: As per title. This will allow the optimizer to pick up on it. Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15923 llvm-svn: 256969	2016-01-06 19:47:24 +00:00
Amaury Sechet	5fc9f6999d	Remove useless DEBUG llvm-svn: 256968	2016-01-06 19:45:09 +00:00
Philip Reames	5eb90a7835	Consolidate MemRefs handling from BranchFolding and correct latent bug Move the logic from BranchFolding to use the shared infrastructure for merging MMOs introduced in 256909. This has the effect of making BranchFolding more capable. In the process, fix a latent bug. The existing handling for merging didn't handle the case where one of the instructions being merged had overflowed and dropped MemRefs. This was a latent bug in the places the code was commoned from, but potentially reachable in BranchFolding. Once this is in, we're left with a single place to consider implementing MMO unique-ing as proposed in http://reviews.llvm.org/D15230. Differential Revision: http://reviews.llvm.org/D15913 llvm-svn: 256966	2016-01-06 19:33:12 +00:00
David Majnemer	eea7582bfa	[WinEH] Remove calculateCatchReturnSuccessorColors The functionality that calculateCatchReturnSuccessorColors provides was once non-trivial: it was a computation layered on top of funclet coloring. These days, LLVM IR directly encodes what calculateCatchReturnSuccessorColors computed, obsoleting the need for it. No functionality change is intended. llvm-svn: 256965	2016-01-06 19:26:30 +00:00
Sanjay Patel	cddcd7256c	[LibCallSimplifier] use instruction-level fast-math-flags for tan/atan transform llvm-svn: 256964	2016-01-06 19:23:35 +00:00
Quentin Colombet	eb61e8e6b0	[X86] Correctly model TLS calls w.r.t. frame requirements. TLS calls need the stack frame to be properly set up and this implies that such calls need ADJUSTSTACK_xxx markers. Fixes PR25820. llvm-svn: 256959	2016-01-06 19:09:26 +00:00
Nico Weber	891419adc2	Make WinCOFFObjectWriter.cpp's timestamp writing not use ENABLE_TIMESTAMPS LLVM_ENABLE_TIMESTAMPS controls if timestamps are embedded into llvm's binaries. Turning it off is useful for deterministic builds. r246905 made it so that the define suddenly also controls if the binaries that the llvm binaries _create_ embed timestamps or not – but this shouldn't be a configure-time option. r256203/r256204 added a driver option to toggle this on and off, so this patch now passes this driver option in LLVM_ENABLE_TIMESTAMPS builds so that if LLVM_ENABLE_TIMESTAMPS is set, the build of LLVM is deterministic – but the built clang can still write timestamps into other executables when requested. This also allows removing some of the test machinery added in r292012 to work around this problem. See PR24740 for background. http://reviews.llvm.org/D15783 llvm-svn: 256958	2016-01-06 19:05:19 +00:00
Sanjay Patel	ab69e9f497	refactor divrem8 lowering; NFCI The code duplication contributed to PR25754: https://llvm.org/bugs/show_bug.cgi?id=25754 llvm-svn: 256957	2016-01-06 18:47:09 +00:00
Michael Kuperstein	037c9984db	[ShrinkWrap] Fix FindIDom to only have one kind of failure. FindIDom() can fail in two different ways - it can either return nullptr or the block itself, depending on the circumstances. Some users of FindIDom() check one error condition, while others check the other. Change it to always return nullptr on failure. This fixes PR26004. Differential Revision: http://reviews.llvm.org/D15847 llvm-svn: 256955	2016-01-06 18:40:11 +00:00
Weiming Zhao	b243c95c6a	Revert r256952 due to lit test failures. llvm-svn: 256954	2016-01-06 18:31:44 +00:00
Dan Gohman	8f59cf756f	[WebAssembly] Don't use range-based loop for a list that's being modified The first instruction in a block is what the rend() iterator points to, so if it moves, we need to re-evaluate rend() so that we continue to iterate through the rest of the instructions. llvm-svn: 256953	2016-01-06 18:29:35 +00:00
Weiming Zhao	eac0636805	Filtering IR printing for print-after-all/print-before-all Summary: This patch implements "-print-funcs" option to support function filtering for IR printing like -print-after-all, -print-before etc. Examples: -print-after-all -print-funcs=foo,bar Reviewers: mcrosier, joker.eph Subscribers: tejohnson, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D15776 llvm-svn: 256952	2016-01-06 18:20:25 +00:00
Weiming Zhao	c7c18d6d14	Fix option desc in FunctionAttrs; NFC Summary: The example in desc should match with actual option name Reviewers: jmolloy Differential Revision: http://reviews.llvm.org/D15800 llvm-svn: 256951	2016-01-06 18:18:16 +00:00
Geoff Berry	12fe2279f3	ScheduleDAGInstrs: Bug fix for missed memory dependency. Summary: In buildSchedGraph(), when adding memory dependencies for loads, move the call to adjustChainDeps() after the call to addChainDependency(AliasChain) to handle the case where addChainDependency(AliasChain) ends up not adding a dependency and instead putting the SU on the RejectMemNodes list. The call to adjustChainDeps() must be done after the call to addChainDependency() in order to process the SU added to the RejectMemNodes list to create memory dependencies for it. Reviewers: hfinkel, atrick, jonpa, resistor Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D15927 llvm-svn: 256950	2016-01-06 18:14:26 +00:00
Philip Reames	fe46cadcf9	[BasicAA] Extract WriteOnly predicate on parameters [NFC] Since writeonly is the only missing attribute and special case left for the memset/memcpy family of intrinsics, rearrange the code to make that much more clear. llvm-svn: 256949	2016-01-06 18:10:35 +00:00
JF Bastien	1dede3f95f	WebAssembly: add missing expected failures exposed by r256890 llvm-svn: 256948	2016-01-06 17:08:56 +00:00
JF Bastien	e6ec487cf7	WebAssembly: add new expected failures exposed by r256890 llvm-svn: 256945	2016-01-06 16:15:51 +00:00
Krzysztof Parzyszek	2d0418e842	[Hexagon] Add system instructions for cache manipulation llvm-svn: 256936	2016-01-06 14:22:22 +00:00
Amaury Sechet	457cc4db9e	Revert "GlobalsAA: Take advantage of ArgMemOnly, InaccessibleMemOnly and InaccessibleMemOrArgMemOnly attributes" Summary: This reverts commit 5a9e526f29cf8510ab5c3d566fbdcf47ac24e1e9. As per discussion in D15665 This also add a test case so that regression introduced by that diff are not reintroduced. Reviewers: vaivaswatha, jmolloy, hfinkel, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15919 llvm-svn: 256932	2016-01-06 13:23:52 +00:00
Matthew Simpson	bf894faa15	[LV] Avoid creating empty reduction entries (NFC) This patch prevents us from unintentionally creating entries in the reductions map for PHIs that are not actually reductions. This is currently not an issue since we bail out if we encounter PHIs other than inductions or reductions. However the behavior could become problematic as we add support for additional recurrence types. llvm-svn: 256930	2016-01-06 12:50:29 +00:00
Artyom Skrobov	51f2d11be9	PR25754: avoid generating UDIVREM8_ZEXT_HREG nodes with i64 result Reviewers: spatel, srking Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15331 llvm-svn: 256924	2016-01-06 09:41:10 +00:00
Amaury Sechet	d3b2c0fd94	Improve load/store to memcpy for aggregate Summary: It turns out that if we don't try to do it at the store location, we can do it before any operation that alias the load, as long as no operation alias the store. Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15903 llvm-svn: 256923	2016-01-06 09:30:39 +00:00
Simon Pilgrim	267163e713	[X86][SSE] There is no zmm addsubpd/addsubps instruction. Replace the assert in combineShuffleToAddSub with an early out. llvm-svn: 256922	2016-01-06 09:08:49 +00:00
Simon Pilgrim	eaabd64a11	[X86][SSE] An empty target shuffle mask is always a failure. As discussed on D15378, move the mask.empty() tests to after the switch statement and consider any shuffle decode where the extracted target shuffle mask is empty as a failure. llvm-svn: 256921	2016-01-06 08:59:32 +00:00
Craig Topper	1b94d9a3cc	[X86] Use PS instead of TB for instructions that have PD/XS/XD variations. Use OpSize32 on an instruction that has an OpSize16 variant. llvm-svn: 256918	2016-01-06 06:18:41 +00:00
Craig Topper	275600390f	[X86] Fix an incorrect usage of In32BitMode that should have been Not64BitMode. llvm-svn: 256917	2016-01-06 06:18:37 +00:00
Philip Reames	2d2fc4adf1	Fix a warning [NFC] llvm-svn: 256916	2016-01-06 05:53:09 +00:00
David Majnemer	b70e23c390	[SimplifyLibCalls] Teach SimplifyLibCalls about operand bundles If we replace one call-site with another, be sure to move over any operand bundles that lingered on the old call-site. This fixes PR26036. llvm-svn: 256912	2016-01-06 05:01:34 +00:00
Philip Reames	ae050a5703	[BasicAA] Remove special casing of memset_pattern16 in favor of generic attribute inference Most of the properties of memset_pattern16 can be now covered by the generic attributes and inferred by InferFunctionAttrs. The only exceptions are: - We don't yet have a writeonly attribute for the first argument. - We don't have an attribute for modeling the access size facts encoded in MemoryLocation.cpp. Differential Revision: http://reviews.llvm.org/D15879 llvm-svn: 256911	2016-01-06 04:53:16 +00:00
Philip Reames	cdf46d1b52	[BasicAA] Delete dead code related to memset/memcpy/memmove intrinsics [NFCI] We only need to describe the writeonly property of one of the arguments. All of the rest of the semantics are nicely described by existing attributes in Intrinsics.td. Differential Revision: http://reviews.llvm.org/D15880 llvm-svn: 256910	2016-01-06 04:43:03 +00:00
Philip Reames	c86ed0055d	Extract helper function to merge MemoryOperand lists [NFC] In the discussion on http://reviews.llvm.org/D15730, Andy pointed out we had a utility function for merging MMO lists. Since it turned out we actually had two copies and there's another review in progress (http://reviews.llvm.org/D15230) which needs the same, extract it into a utility function and clean up the interfaces to make it easier to use with a MachineInstBuilder. I introduced a pair here to track size and allocation together. I think we should probably move in the direction of the MachineOperandsRef helper class, but I'm leaving that for further work. I want to get the poison state introduced before I make major changes to the interface. Differential Revision: http://reviews.llvm.org/D15757 llvm-svn: 256909	2016-01-06 04:39:03 +00:00
Junmo Park	3a40237c03	Delete trailing whitespace; NFC llvm-svn: 256908	2016-01-06 03:53:36 +00:00
Junmo Park	3ec882feed	Delete trailing whitespace; NFC llvm-svn: 256906	2016-01-06 03:41:30 +00:00
Yunzhong Gao	34c0199378	Do not define NOGDI. Mingw defines LOGFONTW type in wingdi.h and the mingw version of shlobj.h includes shobjidl.h and the latter uses the LOGFONTW type. llvm-svn: 256904	2016-01-06 03:01:10 +00:00
Yunzhong Gao	d84c13cdb8	Another attempt at fixing the i686-mingw32-RA-on-linux buildbot. I am getting confused with what version of mingw is actually installed on the buildbot, and for now I will just assume this is an unknown version which does not ship with VersionHelpers.h. llvm-svn: 256902	2016-01-06 02:48:42 +00:00
Yunzhong Gao	b15585f0ea	Another attempt at fixing the i686-mingw32-RA-on-linux buildbot. llvm-svn: 256901	2016-01-06 02:32:31 +00:00
Kostya Serebryany	80eb76abf4	[libFuzzer] extend the dictionary mutator to optionally overwrite data with the dict entry llvm-svn: 256900	2016-01-06 02:13:04 +00:00
Yunzhong Gao	d7009f31a1	Hopefully fix a mingw32 buildbot (i686-mingw32-RA-on-linux) which does not have the VersionHelpers.h header. llvm-svn: 256896	2016-01-06 01:36:45 +00:00
Yunzhong Gao	fb2a9c4209	Fixing PR25717: fatal IO error writing large outputs to console on Windows. This patch is similar to the Python issue#11395. We need to cap the output size to 32767 on Windows to work around the size limit of WriteConsole(). Reference: https://bugs.python.org/issue11395 Writing a test for this bug turns out to be harder than I thought. I am still working on it (see phabricator review D15705). Differential Revision: http://reviews.llvm.org/D15553 llvm-svn: 256892	2016-01-06 00:50:06 +00:00
Sanjay Patel	3d07ec973f	rangify; NFCI llvm-svn: 256891	2016-01-06 00:45:42 +00:00
Dan Gohman	797f639e79	[SelectionDAGBuilder] Set NoUnsignedWrap for inbounds gep and load/store offsets. In an inbounds getelementptr, when an index produces a constant non-negative offset to add to the base, the add can be assumed to not have unsigned overflow. This relies on the assumption that addresses can't occupy more than half the address space, which isn't possible in C because it wouldn't be possible to represent the difference between the start of the object and one-past-the-end in a ptrdiff_t. Setting the NoUnsignedWrap flag is theoretically useful in general, and is specifically useful to the WebAssembly backend, since it permits stronger constant offset folding. Differential Revision: http://reviews.llvm.org/D15544 llvm-svn: 256890	2016-01-06 00:43:06 +00:00
Sanjay Patel	8260d0a9fa	use std::max ; NFCI llvm-svn: 256889	2016-01-06 00:36:59 +00:00
Sanjay Patel	c7ddb7fcdb	A (B + C) = A B + A C ; NFCI llvm-svn: 256884	2016-01-06 00:32:15 +00:00
Sanjay Patel	f2ea8a25ed	fix typo; NFC llvm-svn: 256883	2016-01-06 00:23:12 +00:00
Mike Aizatsky	8b11f877e4	[libfuzzer] print_new_cov_pcs experimental option. Differential Revision: http://reviews.llvm.org/D15901 llvm-svn: 256882	2016-01-06 00:21:22 +00:00
Sanjay Patel	f5c2d129d8	fix typos; NFC llvm-svn: 256881	2016-01-06 00:18:29 +00:00
Kostya Serebryany	226b734d73	[libFuzzer] make trace-based fuzzing not crash in presence of threads llvm-svn: 256876	2016-01-06 00:03:35 +00:00
Manuel Jacob	3eedd11329	[Statepoints] Check for the "gc-leaf-function" attribute on call sites as well. Reviewers: sanjoy, reames Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D15900 llvm-svn: 256875	2016-01-05 23:59:08 +00:00
Sanjay Patel	29095ea1b0	[LibCallSimplfier] use instruction-level fast-math-flags for fmin/fmax transforms llvm-svn: 256871	2016-01-05 20:46:19 +00:00
Nicolai Haehnle	6035504ab3	AMDGPU/SI: Do not move scratch resource register on Tonga & Iceland Due to the SGPR init bug, every program claims to use the same number of SGPRs anyway, so there's no point in trying to shift those registers down from their initial spot of reservation. Add a test that uses VGPR spilling and blocks most SGPRs from being used for the scratch resource register. Previously, this would run into an assertion. Differential Revision: http://reviews.llvm.org/D15724 llvm-svn: 256870	2016-01-05 20:42:49 +00:00
Amaury Sechet	a0c242cdfd	Implement load to store => memcpy in MemCpyOpt for aggregates Summary: Most of the tool chain is able to optimize scalar and memcpy like operation efficiently while it isn't that good with aggregates. In order to improve the support of aggregate, we try to change aggregate manipulation into either scalar or memcpy like ones whenever possible without losing information. This is one such opportunity. Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15894 llvm-svn: 256868	2016-01-05 20:17:48 +00:00
Oleg Ranevskyy	2e83790c37	[Clang/Support/Windows/Unix] Command lines created by clang may exceed the command length limit set by the OS Summary: Hi Rafael, Would you be able to review this patch, please? (Clang part of the patch is D15832). When clang runs an external tool, e.g. a linker, it may create a command line that exceeds the length limit. Clang uses the llvm::sys::argumentsFitWithinSystemLimits function to check if command line length fits the OS limitation. There are two problems in this function that may cause exceeding of the limit: 1. It ignores the length of the program path in its calculations. On the other hand, clang adds the program path to the command line when it runs the program. 2. It assumes no space character is inserted after the last argument, which is not true for Windows. The flattenArgs function adds the trailing space for each argument. The result of this is that the terminating NULL character is not counted and may be placed beyond the length limit if the command line is exactly 32768 characters long. The WinAPI's CreateProcess does not find the NULL character and fails. Reviewers: rafael, ygao, probinson Subscribers: asl, llvm-commits Differential Revision: http://reviews.llvm.org/D15831 llvm-svn: 256866	2016-01-05 19:56:12 +00:00
Sanjay Patel	a1c5347982	[InstCombine] insert a new shuffle before its uses (PR26015) Although this solves the test case in PR26015: https://llvm.org/bugs/show_bug.cgi?id=26015 And may solve PR25999: https://llvm.org/bugs/show_bug.cgi?id=25999 ...I suspect this is not the best solution. I think we want to insert the new shuffle just ahead of the earliest ExtractElementInst that we're replacing, but I don't know how that should be implemented. Differential Revision: http://reviews.llvm.org/D15878 llvm-svn: 256857	2016-01-05 19:09:47 +00:00
Manuel Jacob	0aa9f7fdad	Add function for testing string attributes to InvokeInst and CallSite. NFC. llvm-svn: 256856	2016-01-05 19:08:33 +00:00
David Majnemer	861a0ae349	[X86] Determine if we have an OpaqueSPAdjustment earlier We queried hasFP before we hit ExpandISelPseudos. ExpandISelPseudos manipulated state that hasFP relied on, potentially changing the result after it has been queried elsewhere. While I am not aware of any particular bug due to this state of affairs, it seems best to avoid it entirely by changing the state during DAG construction. llvm-svn: 256849	2016-01-05 17:46:36 +00:00
Michael Zuckerman	5cbae95916	[AVX512] add PSLLD and PSLLQ Intrinsic Differential Revision: http://reviews.llvm.org/D15885 llvm-svn: 256840	2016-01-05 15:17:39 +00:00
MinSeong Kim	4a9a4e198f	[MISched] Explanatory error message when machine model is not complete. NFC When not all instructions have a scheduling class, the error message now provides a possible solution. Differential Revision: http://reviews.llvm.org/D15854 llvm-svn: 256839	2016-01-05 14:50:15 +00:00
MinSeong Kim	a7385ebf78	[AArch64] Add support for Samsung Exynos-M1 Adds core tuning support for new Samsung Exynos-M1 core (ARMv8-A). Differential Revision: http://reviews.llvm.org/D15663 llvm-svn: 256828	2016-01-05 12:51:59 +00:00
Artyom Skrobov	8c6992344d	(NFC) Change SubtargetFeatures::ToggleFeature and SubtargetFeatures::ApplyFeatureFlag to be static, so that MCSubtargetInfo doesn't need to instantiate SubtargetFeatures for nothing. Also change the return type to void, as it wasn't ever used. This is a partial commit of http://reviews.llvm.org/D15746 llvm-svn: 256823	2016-01-05 10:25:56 +00:00
Junmo Park	3b8c715b2f	Remove extra whitespace. NFC. llvm-svn: 256820	2016-01-05 09:36:47 +00:00
Simon Pilgrim	d47ac60f00	[X86][SSE] Merge PerformBLENDICombine into PerformShuffleCombine PBLEND/BLENDPD/BLENDPS are no different to the other target shuffles and this will make future improvements to the target shuffle combines more straightforward. llvm-svn: 256819	2016-01-05 09:12:17 +00:00
Craig Topper	e00bffbc13	[X86] Make MOV32ri64 a post-RA pseudo instead of a CodeGenOnly instruction. It was only needed for rematerialization. llvm-svn: 256818	2016-01-05 07:44:14 +00:00
Craig Topper	9583f51348	[X86] Add OpSize32 to OR32mrLocked instruction to match the normal OR32mr instruction. llvm-svn: 256817	2016-01-05 07:44:11 +00:00
Craig Topper	ad2ce36be0	[AVX512] Add hasSideEffects=0 to kunpck instructions since they lack a pattern in their instructions. llvm-svn: 256816	2016-01-05 07:44:08 +00:00
David Majnemer	59eb733af1	[SimplifyCFG] Further improve our ability to remove redundant catchpads In r256814, we managed to remove catchpads which were trivially redundant because they were the same SSA value. We can do better using the same algorithm but with a smarter datastructure by hashing the SSA values within the catchpad and comparing them structurally. llvm-svn: 256815	2016-01-05 07:42:17 +00:00
David Majnemer	2fa8651a8f	[SimplifyCFG] Remove redundant catchpads Remove duplicate catchpad handlers from a catchswitch. llvm-svn: 256814	2016-01-05 06:27:50 +00:00
Matt Arsenault	905042774d	AMDGPU: Remove redundant let mayLoad = 1 This is already set on the SMRD format class. llvm-svn: 256813	2016-01-05 04:50:28 +00:00
Manuel Jacob	75cbfdcf03	[RS4GC] Simplify handling of Constants in findBaseDefiningValue(). NFC. Summary: Previously there were three conditionals, checking for global variables, undef values and everything constant except these two, all three returning the same value. This commit replaces them by one conditional. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15818 llvm-svn: 256812	2016-01-05 04:06:21 +00:00
Manuel Jacob	83eefa6d20	[Statepoints] Refactor GCRelocateOperands into an intrinsic wrapper. NFC. Summary: This commit renames GCRelocateOperands to GCRelocateInst and makes it an intrinsic wrapper, similar to e.g. MemCpyInst. Also, all users of GCRelocateOperands were changed to use the new intrinsic wrapper instead. Reviewers: sanjoy, reames Subscribers: reames, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D15762 llvm-svn: 256811	2016-01-05 04:03:00 +00:00
Tom Stellard	5cd09ade38	AMDGPU/SI: Select non-uniform constant addrspace loads to flat instructions for HSA Summary: This fixes a regression caused by r256282. Reviewers: arsenm, cfang Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15736 llvm-svn: 256810	2016-01-05 03:40:16 +00:00
Joseph Tremoulet	0d808888c1	[WinEH] Simplify unreachable catchpads Summary: At least for CoreCLR, a catchpad which immediately executes an `unreachable` instruction indicates that the exception can never have a matching type, and so such catchpads can be removed, and so can their catchswitches if the catchswitch becomes empty. Reviewers: rnk, andrew.w.kaylor, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15846 llvm-svn: 256809	2016-01-05 02:37:41 +00:00
David Majnemer	869be0a4a6	Revert "[X86] Use push-pop for materializing small constants under 'minsize'" The red zone consists of 128 bytes beyond the stack pointer so that the allocation of objects in leaf functions doesn't require decrementing rsp. In r255656, we introduced an optimization that would cheaply materialize certain constants via push/pop. Push decrements the stack pointer and stores its result at what is now the top of the stack. However, this means that using push/pop would encroach on the red zone. PR26023 gives an example where this corrupts an object in the red zone. llvm-svn: 256808	2016-01-05 02:32:06 +00:00
Tom Stellard	2c82ee60c3	AMDGPU/SI: Consolidate FLAT patterns Summary: We had two sets of identical FLAT patterns one inside the HasFlatAddressSpace predicate and one inside the useFlatForGloabl predicate. This patch merges these sets into a single pattern under the isCIVI predicate. The reason we can remove the predicates is that when MUBUF instructions are legal, the instruction selector will prefer selecting those over FLAT instructions because MUBUF patterns have a higher complexity score. So, in this case having patterns for FLAT instructions will have no effect. This change also simplifies the process for forcing global address space loads to use FLAT instructions, since we now only have to disable the MUBUF patterns instead of having to disable the MUBUF patterns and enable the FLAT patterns. Reviewers: arsenm, cfang Subscribers: llvm-commits llvm-svn: 256807	2016-01-05 02:26:37 +00:00
Philip Reames	a694a0b141	[MDA] Don't be quite as conservative for noalias functions If we encounter a noalias call that alias analysis can't analyse, we can fall down into the generic call handling rather than giving up entirely. I noticed this while reading through the code for another purpose. I can't seem to write a test case which changes; that sorta makes sense given any test case would have to be an inconsistency in AA. Suggestions welcome. Differential Revision: http://reviews.llvm.org/D15825 llvm-svn: 256802	2016-01-05 00:49:14 +00:00
Matthias Braun	7e762e4f9c	MachineInstrBundle: Fix reversed isSuperRegisterEq() call Unfortunately this fix had the effect of exposing the -verify-machineinstrs FIXME of X86InstrInfo.cpp in two testcases for which I disabled it for now. Two testcases also have additional pushq/popq where the corrected code cannot prove that %rax is dead any longer. Looking at the examples, this could potentially be fixed by improving computeRegisterLiveness() to check the live-in lists of the successors blocks when reaching the end of a block. This fixes http://llvm.org/PR25951. llvm-svn: 256799	2016-01-05 00:45:35 +00:00
Nicolai Haehnle	5b50497617	AMDGPU: add +xnack feature Summary: Enabling this feature will account for the two SGPRs used by the hardware to store the XNACK_MASK physically. The hardware only requires this reservation when the XNACK feature is explicitly enabled. At some point, HSA will probably want to do that, but it does increase SGPR register pressure, so leave it disabled by default for now (but do add a small test). Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15869 llvm-svn: 256794	2016-01-04 23:35:53 +00:00

86068 Commits