llvm-project

Commit Graph

Author	SHA1	Message	Date
Geoff Berry	66d1f0ff1f	[LiveRangeEdit] Change eliminateDeadDef assert to if condition. The assert could potentially fire (though no cases have been encountered), so just check that the instruction we're handling specially for rematerialization only has one def to begin with. Reviewed by Wei Mi over email. llvm-svn: 289861	2016-12-15 19:55:19 +00:00
Peter Collingbourne	e089554c8f	LibDriver: Allow resource files to be archive members. It seems pointless to add a resource to an archive because it won't have any symbols to link against (and link.exe doesn't have an equivalent of --whole-archive), but lib.exe allows it for some reason. llvm-svn: 289859	2016-12-15 19:37:46 +00:00
Zachary Turner	578113ffb7	Re-add the check for __has_attribute in StringLiteral. llvm-svn: 289858	2016-12-15 19:33:31 +00:00
Boris Ulasevich	b76f6c2745	BrainF example: fixing segfault caused by outdated code with missing MCJIT dependency Differential Revision: https://reviews.llvm.org/D26280 llvm-svn: 289857	2016-12-15 19:29:42 +00:00
Zachary Turner	b0aa31bb25	Ignore -Wgcc-compat diagnostic in StringLiteral. llvm-svn: 289856	2016-12-15 19:22:58 +00:00
Sanjay Patel	d640641a61	[InstCombine] add folds for icmp (smin X, Y), X Min/max canonicalization (r287585) exposes the fact that we're missing combines for min/max patterns. This patch won't solve the example that was attached to that thread, so something else still needs fixing. The line between InstCombine and InstSimplify gets blurry here because sometimes the icmp instruction that we want to fold to already exists, but sometimes it's the swapped form of what we want. Corresponding changes for smax/umin/umax to follow. Differential Revision: https://reviews.llvm.org/D27531 llvm-svn: 289855	2016-12-15 19:13:37 +00:00
Reid Kleckner	e793966d80	Fix some remaining documentation references to MSVC 2013 MSVC 2015 has been the minimum supported version of VS since October. Differential Revision: https://reviews.llvm.org/D25710 llvm-svn: 289854	2016-12-15 19:08:02 +00:00
Zachary Turner	182b4652e5	[StringRef] Add enable-if to StringLiteral. To prevent StringLiteral from being created with a non-literal char array, clang has a macro enable_if() that can be used in such a way as to guarantee that the constructor is disabled unless the length of the string can be computed at compile time. This only works on clang, but at least it should allow bots to catch abuse of StringLiteral. Differential Revision: https://reviews.llvm.org/D27780 llvm-svn: 289853	2016-12-15 19:02:43 +00:00
Kostya Serebryany	9a038c188c	[libFuzzer] doc update llvm-svn: 289849	2016-12-15 18:47:22 +00:00
Ahmed Bougacha	5228603387	[GlobalISel] Drop workaround for Legalizer member/class sharing a name. NFC. MachineLegalizer used to be the name of both the class and the member, causing GCC errors. r276522 fixed that by renaming the member to just 'Legalizer'. The 'class' workaround isn't necessary anymore; drop it. llvm-svn: 289848	2016-12-15 18:45:30 +00:00
Sanjay Patel	a97358bc8e	[x86] use a single shufps for 256-bit vectors when it can save instructions This is the 256-bit counterpart to the 128-bit transform checked in here: https://reviews.llvm.org/rL289837 This patch is based on the draft by @sroland (Roland Scheidegger) that is attached to PR27885: https://llvm.org/bugs/show_bug.cgi?id=27885 llvm-svn: 289846	2016-12-15 18:43:46 +00:00
Matthew Simpson	2c8de192a1	[AArch64] Guard Misaligned 128-bit store penalty by subtarget feature This patch checks that the SlowMisaligned128Store subtarget feature is set when penalizing such stores in getMemoryOpCost. Differential Revision: https://reviews.llvm.org/D27677 llvm-svn: 289845	2016-12-15 18:36:59 +00:00
Ahmed Bougacha	2a26a5f1f0	[AArch64][GlobalISel] Remove redundant RBI comments. NFC. It's brittle, and Doxygen already picks the overridden method's comment anyway. llvm-svn: 289844	2016-12-15 18:22:15 +00:00
Teresa Johnson	1b859a2306	[ThinLTO] Ensure callees get hot threshold when first seen on cold path This is split out from D27696, since it turned out to be a bug fix and not part of the NFC efficiency change. Keep the same adjusted (possibly decayed) threshold in both the worklist and the ImportList. Otherwise if we encountered it first along a cold path, the callee would be added to the worklist with a lower decayed threshold than when it is later encountered along a hot path. But the logic uses the threshold recorded in the ImportList entry to check if we should re-add it, and without this patch the threshold recorded there is the same along both paths so we don't re-add it. Using the same possibly decayed threshold in the ImportList ensures we re-add it later with the higher non-decayed hot path threshold. llvm-svn: 289843	2016-12-15 18:21:01 +00:00
Chris Bieneman	dc9b0db8e3	[CMake] Minor change to symlink generation for LLDB If OUTPUT_DIR is not specified we can assume the symlink is linking to a file in the same directory, so we can use $<TARGET_FILE_NAME:${target}> to create a relative symlink. In the case of LLDB, when we build a framework, we are creating symlinks in a different directory than the file we're pointing to, and we don't install those links. To make this work in the build directory we can use $<TARGET_FILE:${target}> instead, which uses the full path to the target. llvm-svn: 289840	2016-12-15 18:17:07 +00:00
Sanjay Patel	a0d8a278a7	[x86] use a single shufps when it can save instructions This is a tiny patch with a big pile of test changes. This partially fixes PR27885: https://llvm.org/bugs/show_bug.cgi?id=27885 My motivating case looks like this: - vpshufd {{.#+}} xmm1 = xmm1[0,1,0,2] - vpshufd {{.#+}} xmm0 = xmm0[0,2,2,3] - vpblendw {{.#+}} xmm0 = xmm0[0,1,2,3],xmm1[4,5,6,7] + vshufps {{.#+}} xmm0 = xmm0[0,2],xmm1[0,2] And this happens several times in the diffs. For chips with domain-crossing penalties, the instruction count and size reduction should usually overcome any potential domain-crossing penalty due to using an FP op in a sequence of int ops. For chips such as recent Intel big cores and Atom, there is no domain-crossing penalty for shufps, so using shufps is a pure win. So the test case diffs all appear to be improvements except one test in vector-shuffle-combining.ll where we miss an opportunity to use a shift to generate zero elements and one test in combine-sra.ll where multiple uses prevent the expected shuffle combining. Differential Revision: https://reviews.llvm.org/D27692 llvm-svn: 289837	2016-12-15 18:03:38 +00:00
Simon Pilgrim	7522f54feb	[X86][SSE] Fix domains for scalar store instructions As discussed on D27692 llvm-svn: 289834	2016-12-15 17:09:24 +00:00
Robert Lougher	6ea759a83e	Revert "[SimplifyCFG] In sinkLastInstruction correctly set debugloc of common inst" Reverting as it is causing buildbot failures (address sanitizer). llvm-svn: 289833	2016-12-15 16:59:13 +00:00
Jacques Pienaar	ccffe38352	[lanai] Simplify small section check in LowerGlobalAddress and treat ldata sections specially. Move the check for the code model into isGlobalInSmallSectionImpl and return false (not in small section) for variables placed in sections prefixed with .ldata (workaround for a tool limitation). llvm-svn: 289832	2016-12-15 16:56:16 +00:00
Simon Pilgrim	ba46422694	[X86][AVX512] Moved instruction domain lookups to the right table. NFCI. Avoid duplicating instructions in the int32/int64 domains. llvm-svn: 289830	2016-12-15 16:38:51 +00:00
Robert Lougher	cf17674211	[SimplifyCFG] In sinkLastInstruction correctly set debugloc of "common" inst Simplify CFG will try to sink the last instruction in a series of basic blocks, creating a "common" instruction in the successor block (sinkLastInstruction). When it does this, the debug location of the single instruction should be the merged debug locations of the commoned instructions. Differential Revision: https://reviews.llvm.org/D27590 llvm-svn: 289828	2016-12-15 16:17:53 +00:00
Krzysztof Parzyszek	0ca1987977	Fix ubsan failures in lane mask shifts llvm-svn: 289826	2016-12-15 16:08:49 +00:00
Simon Pilgrim	d7518896ff	[X86][SSE] Fix domains for VZEXT_LOAD type instructions Add the missing domain equivalences for movss, movsd, movd and movq zero extending loading instructions. Differential Revision: https://reviews.llvm.org/D27684 llvm-svn: 289825	2016-12-15 16:05:29 +00:00
Alexander Timofeev	a57511c451	Fix for regression after Global Load Scalarization patch llvm-svn: 289822	2016-12-15 15:17:19 +00:00
Krzysztof Parzyszek	91b5cf8412	Extract LaneBitmask into a separate type Specifically avoid implicit conversions from/to integral types to avoid potential errors when changing the underlying type. For example, a typical initialization of a "full" mask was "LaneMask = ~0u", which would result in a value of 0x00000000FFFFFFFF if the type was extended to uint64_t. Differential Revision: https://reviews.llvm.org/D27454 llvm-svn: 289820	2016-12-15 14:36:06 +00:00
Simon Pilgrim	2f7f0e7a48	[CostModel][X86] Updated reverse shuffle costs llvm-svn: 289819	2016-12-15 14:24:07 +00:00
Alexey Bataev	4160264e30	[TEST] Initial commit of tests for minmax horizontal reductions. llvm-svn: 289817	2016-12-15 13:21:29 +00:00
Alexey Bataev	2db6045b29	Revert "[TESTS] Initial commit of tests, by Andrew Tischenko" This reverts commit ee709f8988653a0334fbf100cdbbdd83a3933347. llvm-svn: 289814	2016-12-15 12:26:18 +00:00
Ehsan Amiri	795b0671c5	[InstCombine] New opportunities for FoldAndOfICmp and FoldXorOfICmp A number of new patterns for simplifying and/xor of icmp: (icmp ne %x, 0) ^ (icmp ne %y, 0) => icmp ne %x, %y if the following is true: 1- (%x = and %a, %mask) and (%y = and %b, %mask) 2- %mask is a power of 2. (icmp eq %x, 0) & (icmp ne %y, 0) => icmp ult %x, %y if the following is true: 1- (%x = and %a, %mask1) and (%y = and %b, %mask2) 2- Let %t be the smallest power of 2 where %mask1 & %t != 0. Then for any %s that is a power of 2 and %s & %mask2 != 0, we must have %s <= %t. For example if %mask1 = 24 and %mask2 = 16, setting %s = 16 and %t = 8 violates condition (2) above. So this optimization cannot be applied. llvm-svn: 289813	2016-12-15 12:25:13 +00:00
Simon Pilgrim	9876ed07f6	[CostModel] Fix long standing bug with reverse shuffle mask detection Incorrect 'undef' mask index matching meant that broadcast shuffles could be detected as reverse shuffles llvm-svn: 289811	2016-12-15 12:12:45 +00:00
Alexey Bataev	67c90c7d95	[TESTS] Initial commit of tests, by Andrew Tischenko llvm-svn: 289807	2016-12-15 11:48:24 +00:00
Nemanja Ivanovic	552c8e960e	[Power9] Allow AnyExt immediates for XXSPLTIB In some situations, the BUILD_VECTOR node that builds a v16i8 vector by a splat of an i8 constant will end up with signed 8-bit values and in other situations, it'll end up with unsigned ones. Handle both situations. Fixes PR31340. llvm-svn: 289804	2016-12-15 11:16:20 +00:00
Dylan McKay	4f590f28e7	[AVR] Support floats in the instrumentation pass This also refactors some common code into the 'GetTypeName' method. llvm-svn: 289803	2016-12-15 11:02:41 +00:00
Simon Pilgrim	9ebeac3eed	[CostModel][X86] Add tests for reverse shuffle costs llvm-svn: 289800	2016-12-15 10:45:53 +00:00
Prakhar Bahuguna	bc35f21f70	Add missing triple target for numeric section flag test llvm-svn: 289798	2016-12-15 10:20:48 +00:00
Pavel Labath	08c2e86802	Simplify format member detection in FormatVariadic Summary: This replaces the format member search, which was quite complicated, with a more direct approach to detecting whether a class should be formatted using the format-member method. Instead we use a special type llvm::format_adapter, which every adapter must inherit from. Then the search can be simply implemented with the is_base_of type trait. Aside from the simplification, I like this way more because it makes it more explicit that you are supposed to use this type only for adapter-like formattings, and the other approach (format_provider overloads) should be used as a default (a mistake I made when first trying to use this library). The only slight change in behaviour here is that we now choose the format-adapter branch even if the format member invocation will fail to compile (e.g. because it is a non-const member function and we are passing a const adapter), whereas previously we would have gone on to search for format_providers for the type. However, I think that is actually a good thing, as it probably means the programmer did something wrong. Reviewers: zturner, inglorion Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27679 llvm-svn: 289795	2016-12-15 09:40:27 +00:00
Sjoerd Meijer	96e10b5a9e	[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently This is essentially a recommit of r285893, but with a correctness fix. The problem of the original commit was that this: bic r5, r7, #31 cbz r5, .LBB2_10 got rewritten into: lsrs r5, r7, #5 beq .LBB2_10 The result in destination register r5 is not the same and this is incorrect when r5 is not dead. So this fix includes checking the uses of the AND destination register. And also, compared to the original commit, some regression tests didn't need changing anymore because of this extra check. For completeness, this was the original commit message: For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)). 1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS. 2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS. 3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS). 4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask. 1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win. Differential Revision: https://reviews.llvm.org/D27761 llvm-svn: 289794	2016-12-15 09:38:59 +00:00
Dylan McKay	4b028e2ee1	[AVR] Add argument indices to the instrumentation hook functions This allows the instrumentation hook functions to do better pretty-printing. llvm-svn: 289793	2016-12-15 09:38:09 +00:00
Prakhar Bahuguna	13e9921ccc	Fix for build warning in execute-only support llvm-svn: 289788	2016-12-15 08:42:04 +00:00
Prakhar Bahuguna	e640c6f765	Allow ELF section flags to be specified numerically Summary: GAS already allows flags for sections to be specified directly as a numeric value. This functionality is particularly useful for setting processor or application-specific values that may not be directly supported or understood by LLVM. This patch allows LLVM to use numeric section flag values verbatim if specified by the assembly file. Reviewers: grosbach, rafael, t.p.northover, rengolin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27451 llvm-svn: 289785	2016-12-15 07:59:15 +00:00
Prakhar Bahuguna	52a7dd7d78	[ARM] Implement execute-only support in CodeGen This implements execute-only support for ARM code generation, which prevents the compiler from generating data accesses to code sections. The following changes are involved: * Add the CodeGen option "-arm-execute-only" to the ARM code generator. * Add the clang flag "-mexecute-only" as well as the GCC-compatible alias "-mpure-code" to enable this option. * When enabled, literal pools are replaced with MOVW/MOVT instructions, with VMOV used in addition for floating-point literals. As the MOVT instruction is required, execute-only support is only available in Thumb mode for targets supporting ARMv8-M baseline or Thumb2. * Jump tables are placed in data sections when in execute-only mode. * The execute-only text section is assigned section ID 0, and is marked as unreadable with the SHF_ARM_PURECODE flag with symbol 'y'. This also overrides selection of ELF sections for globals. llvm-svn: 289784	2016-12-15 07:59:08 +00:00
Sanjoy Das	93b1de0f8c	Add missing -mtriple to MIR test case llvm-svn: 289779	2016-12-15 07:13:50 +00:00
Yaxun Liu	6f8d90999e	Attempt to fix llvm-readobj crash on ppc64 due to r289674 llvm-svn: 289777	2016-12-15 06:59:23 +00:00
Daniel Jasper	befe7a3fc4	Fix go bindings after r289702 (hopefully, don't really know how to build them, build.sh seems to be broken). llvm-svn: 289775	2016-12-15 06:54:29 +00:00
Kostya Serebryany	628b43aab6	[libFuzzer] enable the failure-resistant merge by default (with trace-pc-guard only) llvm-svn: 289772	2016-12-15 06:21:21 +00:00
Dylan McKay	dc58eb543f	[AVR] Whitelist the avrlit config environment variables This allows us to use `lit` to run on-target execution tests. llvm-svn: 289769	2016-12-15 06:04:53 +00:00
Hal Finkel	f19e114237	Revert part of r289765 that is not necessary CS.doesNotAccessMemory(ArgNo) and CS.onlyReadsMemory(ArgNo) calls dataOperandHasImpliedAttr, so revert this part of r289765 because it should not be necessary. llvm-svn: 289768	2016-12-15 05:50:45 +00:00
Hal Finkel	34f9d6ac11	Trying to fix NDEBUG build after r289764 llvm-svn: 289766	2016-12-15 05:33:19 +00:00
Hal Finkel	39fed399e1	Fix argument attribute queries with bundle operands When iterating over data operands in AA, don't make argument-attribute-specific queries on bundle operands. Trying to fix self hosting... llvm-svn: 289765	2016-12-15 05:09:15 +00:00
Sanjoy Das	d7389d6261	[MachineBlockPlacement] Don't make blocks "uneditable" Summary: This fixes an issue with MachineBlockPlacement due to a badly timed call to `analyzeBranch` with `AllowModify` set to true. The timeline is as follows: 1. `MachineBlockPlacement::maybeTailDuplicateBlock` calls `TailDup.shouldTailDuplicate` on its argument, which in turn calls `analyzeBranch` with `AllowModify` set to true. 2. This `analyzeBranch` call edits the terminator sequence of the block based on the physical layout of the machine function, turning an unanalyzable non-fallthrough block to a unanalyzable fallthrough block. Normally MBP bails out of rearranging such blocks, but this block was unanalyzable non-fallthrough (and thus rearrangeable) the first time MBP looked at it, and so it goes ahead and decides where it should be placed in the function. 3. When placing this block MBP fails to analyze and thus update the block in keeping with the new physical layout. Concretely, before (1) we have something like: ``` LBL0: < unknown terminator op that may branch to LBL1 > jmp LBL1 LBL1: ... A LBL2: ... B ``` In (2), analyze branch simplifies this to ``` LBL0: < unknown terminator op that may branch to LBL2 > ;; jmp LBL1 <- redundant jump removed LBL1: ... A LBL2: ... B ``` In (3), MachineBlockPlacement goes ahead with its plan of putting LBL2 after the first block since that is profitable. ``` LBL0: < unknown terminator op that may branch to LBL2 > ;; jmp LBL1 <- redundant jump LBL2: ... B LBL1: ... A ``` and the program now has incorrect behavior (we no longer fall-through from `LBL0` to `LBL1`) because MBP can no longer edit LBL0. There are several possible solutions, but I went with removing the teeth off of the `analyzeBranch` calls in TailDuplicator. That makes thinking about the result of these calls easier, and breaks nothing in the lit test suite. 
I've also added some bookkeeping to the MachineBlockPlacement pass and used that to write an assert that would have caught this. Reviewers: chandlerc, gberry, MatzeB, iteratee Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D27783 llvm-svn: 289764	2016-12-15 05:08:57 +00:00
Craig Topper	ab5f355d8c	[AVX-512][InstCombine] Add masked scalar FMA intrinsics to SimplifyDemandedVectorElts. llvm-svn: 289759	2016-12-15 03:49:45 +00:00
Hal Finkel	321053a7ca	Fix iterator-invalidation issue Inserting a new key into a DenseMap potentially invalidates iterators into that map. Trying to fix an issue from r289755 triggering this assertion: Assertion `isHandleInSync() && "invalid iterator access!"' failed. llvm-svn: 289757	2016-12-15 03:30:40 +00:00
Hal Finkel	3ca4a6bcf1	Remove the AssumptionCache After r289755, the AssumptionCache is no longer needed. Variables affected by assumptions are now found by using the new operand-bundle-based scheme. This new scheme is more computationally efficient, and also we need much less code... llvm-svn: 289756	2016-12-15 03:02:15 +00:00
Hal Finkel	cb9f78e1c3	Make processing @llvm.assume more efficient by using operand bundles There was an efficiency problem with how we processed @llvm.assume in ValueTracking (and other places). The AssumptionCache tracked all of the assumptions in a given function. In order to find assumptions relevant to computing known bits, etc. we searched every assumption in the function. For ValueTracking, that means that we did O(#assumes * #values) work in InstCombine and other passes (with a constant factor that can be quite large because we'd repeat this search at every level of recursion of the analysis). Several of us discussed this situation at the last developers' meeting, and this implements the discussed solution: Make the values that an assume might affect operands of the assume itself. To avoid exposing this detail to frontends and passes that need not worry about it, I've used the new operand-bundle feature to add these extra call "operands" in a way that does not affect the intrinsic's signature. I think this solution is relatively clean. InstCombine adds these extra operands based on what ValueTracking, LVI, etc. will need and then those passes need only search the users of the values under consideration. This should fix the computational-complexity problem. At this point, no passes depend on the AssumptionCache, and so I'll remove that as a follow-up change. Differential Revision: https://reviews.llvm.org/D27259 llvm-svn: 289755	2016-12-15 02:53:42 +00:00
Eli Friedman	db07ebbab6	Add testcases for some shuffle bugs. See https://llvm.org/bugs/show_bug.cgi?id=31301 and https://llvm.org/bugs/show_bug.cgi?id=31364 . llvm-svn: 289751	2016-12-15 01:47:15 +00:00
Nico Weber	d43d3ba5cd	Fix test/tools/lto/hide-linkonce-odr.ll after r289719 llvm-svn: 289750	2016-12-15 01:31:38 +00:00
Justin Lebar	7853d3b9dd	[NVPTX] Remove dead #defines from NVPTXUtilities.h. llvm-svn: 289747	2016-12-15 00:45:06 +00:00
Joerg Sonnenberger	400e7b7811	Use PIC relocation model as default for PowerPC64 ELF. Most of the PowerPC64 code generation for the ELF ABI is already PIC. There are four main exceptions: (1) Constant pointer arrays etc. should be in writeable sections. (2) The TOC restoration NOP after a call is needed for all global symbols. While GNU ld has a workaround for questionable GCC self-calls, we trigger the checks for calls from COMDAT sections as they cross input sections and are therefore not considered self-calls. The current decision is questionable and suboptimal, but outside the scope of the change. (3) TLS access can not use the initial-exec model. (4) Jump tables should use relative addresses. Note that the current encoding doesn't work for the large code model, but it is more compact than the default for any non-trivial jump table. Improving this is again beyond the scope of this change. At least (1) and (3) are assumptions made in target-independent code and introducing additional hooks is a bit messy. Testing with clang shows that a -fPIC binary is 600KB smaller than the corresponding -fno-pic build. Separate testing from improved jump table encodings would explain only about 100KB or so. The rest is expected to be a result of more aggressive immediate forming for -fno-pic, where the -fPIC binary just uses TOC entries. This change brings the LLVM output in line with the GCC output, other PPC64 compilers like XLC on AIX are known to produce PIC by default as well. The relocation model can still be provided explicitly, i.e. when using MCJIT. One test case for case (1) is included, other test cases with relocation mode sensitive behavior are wired to static for now. They will be reviewed and adjusted separately. Differential Revision: https://reviews.llvm.org/D26566 llvm-svn: 289743	2016-12-15 00:01:53 +00:00
Justin Lebar	a091da75b2	[AMDGPU] Fix runtime-metadata.ll test so it doesn't leave an object file in the source tree. llvm-svn: 289742	2016-12-14 23:24:43 +00:00
Justin Lebar	a54f4d7052	[NVPTX] Remove dead code. I've chosen to remove NVPTXInstrInfo::CanTailMerge but not NVPTXInstrInfo::isLoadInstr and isStoreInstr (which are also dead) because while the latter two are reasonably useful utilities, the former cannot be used safely: It relies on successful address space inference to identify writes to shared memory, but addrspace inference is a best-effort thing. llvm-svn: 289740	2016-12-14 23:20:40 +00:00
Sanjay Patel	afee21a5b2	[DAG] allow more select folding for targets that have 'and not' (PR31175) The original motivation for this patch comes from wanting to canonicalize more IR to selects and also canonicalizing min/max. If we're going to do that, we need more backend fixups to undo select codegen when simpler ops will do. I chose AArch64 for the tests because that shows the difference in the simplest way. This should fix: https://llvm.org/bugs/show_bug.cgi?id=31175 Differential Revision: https://reviews.llvm.org/D27489 llvm-svn: 289738	2016-12-14 22:59:14 +00:00
Davide Italiano	1ebbd176b3	[gold] Add datalayout to two tests where it was missing. Reported by: thakis via chromium bots. llvm-svn: 289737	2016-12-14 22:53:43 +00:00
Eugene Zelenko	f9f8c68290	[Hexagon] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 289736	2016-12-14 22:50:46 +00:00
Greg Clayton	52fe1f68c8	Add the ability to get attribute values as Optional<T> When getting attributes it is sometimes nicer to use Optional<T> some of the time instead of magic values. I tried to cut over to only using the Optional values but it made many of the call sites very messy, so it makes sense to leave in the calls that can return a default value. Otherwise code that looks like this: uint64_t CallColumn = Die.getAttributeValueAsAddress(DW_AT_call_line, 0); Has to be turned into: uint64_t CallColumn = 0; if (auto CallColumnValue = Die.getAttributeValueAsAddress(DW_AT_call_line)) CallColumn = *CallColumnValue; The first snippet of code looks much better. But in cases where you want an offset that may or may not be there, the following code looks better: if (auto StmtOffset = Die.getAttributeValueAsSectionOffset(DW_AT_stmt_list)) { // Use StmtOffset } Differential Revision: https://reviews.llvm.org/D27772 llvm-svn: 289731	2016-12-14 22:38:08 +00:00
Justin Lebar	e7bbf7fde3	Whitespace cleanup in test/CodeGen/NVPTX/annotations.ll. llvm-svn: 289730	2016-12-14 22:32:55 +00:00
Justin Lebar	19bf9d2b6d	[NVPTX] Support .maxnreg annotation. Reviewers: tra Subscribers: llvm-commits, jholewinski Differential Revision: https://reviews.llvm.org/D27638 llvm-svn: 289729	2016-12-14 22:32:50 +00:00
Justin Lebar	e6867085fa	[NVPTX] Remove string constants from NVPTXBaseInfo.h. Summary: Previously they were defined as a 2D char array in a header file. This is kind of overkill -- we can let the linker lay out these strings however it pleases. While we're at it, we might as well just inline these constants where they're used, as each of them is used only once. Also move NVPTXUtilities.{h,cpp} into namespace llvm. Reviewers: tra Subscribers: jholewinski, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D27636 llvm-svn: 289728	2016-12-14 22:32:44 +00:00
Peter Collingbourne	b677fe00fb	LibDriver: Reject inputs that are not COFF objects or bitcode files. Fixes PR31372. Differential Revision: https://reviews.llvm.org/D27776 llvm-svn: 289726	2016-12-14 22:19:22 +00:00
Dehao Chen	40dd8c5109	Only sets profile summary when it was not preset. Summary: SampleProfileLoader pass may be invoked twice by LTO. The 2nd pass should not append more summary info as it is already preset by the 1st pass. Reviewers: eraman, davidxl Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D27733 llvm-svn: 289725	2016-12-14 22:06:49 +00:00
Dehao Chen	fb699619a0	Fix the bug in r289714 (NFC). llvm-svn: 289724	2016-12-14 22:03:08 +00:00
Jan Sjodin	071b0fa06a	Revert revision 289721. llvm-svn: 289723	2016-12-14 21:58:42 +00:00
Jan Sjodin	9419021bba	Dummy commit. llvm-svn: 289721	2016-12-14 21:57:18 +00:00
Davide Italiano	ebed410ca0	[LTO] Add the missing datalayout in a test. llvm-svn: 289720	2016-12-14 21:57:14 +00:00
Davide Italiano	2ceb628f36	[LTO] Reject modules without datalayout. Also, update the ~60 failing tests in the tree which did not contain a valid datalayout. This fixes PR31123. lld will be updated in a following patch, immediately after this is committed. Differential Revision: https://reviews.llvm.org/D27082 llvm-svn: 289719	2016-12-14 21:57:04 +00:00
Filipe Cabecinhas	dd9688703c	[asan] Don't skip instrumentation of masked load/store unless we've seen a full load/store on that pointer. Reviewers: kcc, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27625 llvm-svn: 289718	2016-12-14 21:57:04 +00:00
Filipe Cabecinhas	1e69017a6d	[asan] Hook ClInstrumentWrites and ClInstrumentReads to masked operation instrumentation. Reviewers: kcc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27548 llvm-svn: 289717	2016-12-14 21:56:59 +00:00
Dehao Chen	a99e082e15	Create SampleProfileLoader pass in llvm instead of clang Summary: We used to create SampleProfileLoader pass in clang. This makes LTO/ThinLTO unable to add this pass in the linker plugin. This patch moves the SampleProfileLoader pass creation from clang to llvm pass manager builder. Reviewers: tejohnson, davidxl, dnovillo Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D27743 llvm-svn: 289714	2016-12-14 21:40:47 +00:00
Eli Friedman	cbed30c501	[ARM] Split 128-bit vectors in BUILD_VECTOR lowering Given that INSERT_VECTOR_ELT operates on D registers anyway, combining 64-bit vectors into a 128-bit vector is basically free. Therefore, try to split BUILD_VECTOR nodes before giving up and lowering them to a series of INSERT_VECTOR_ELT instructions. Sometimes this allows dramatically better lowerings; see testcases for examples. Inspired by similar code in the x86 backend for AVX. Differential Revision: https://reviews.llvm.org/D27624 llvm-svn: 289706	2016-12-14 20:44:38 +00:00
Nico Weber	53816d074d	fix gcc warning about a superfluous ; llvm-svn: 289705	2016-12-14 20:33:54 +00:00
Robert Lougher	cfd7198698	[InstCombine] Folding of a compare with RHS const should merge debug locations If all the operands to a phi node are compares that have a RHS constant, instcombine will try to pull them through the phi node, combining them into a single operation. When it does this, the debug location of the new op should be the merged debug locations of the phi node arguments. Patch 8 of 8 for D26256. Folding of a compare that has a RHS constant. Differential Revision: https://reviews.llvm.org/D26256 llvm-svn: 289704	2016-12-14 20:27:22 +00:00
Eli Friedman	10576e73c9	[ARM] Add ARMISD::VLD1DUP to match vld1_dup more consistently. Currently, there are substantial problems forming vld1_dup even if the VDUP survives legalization. The lack of an actual node leads to terrible results: not only can we not form post-increment vld1_dup instructions, but we form scalar pre-increment and post-increment loads which force the loaded value into a GPR. This patch fixes that by combining the vdup+load into an ARMISD node before DAGCombine messes it up. Also includes a crash fix for vld2_dup (see testcase @vld2dupi8_postinc_variable). Differential Revision: https://reviews.llvm.org/D27694 llvm-svn: 289703	2016-12-14 20:25:26 +00:00
Amjad Aboud	43c8b6b7b2	[DebugInfo] Changed DIBuilder::createCompileUnit() to take DIFile instead of FileName and Directory. This way it will be easier to expand DIFile (e.g., to contain checksum) without the need to modify the createCompileUnit() API. Reviewers: llvm-commits, rnk Differential Revision: https://reviews.llvm.org/D27762 llvm-svn: 289702	2016-12-14 20:24:54 +00:00
Yaxun Liu	04334b527d	Fix build failure due to r289674 on certain systems Removed a useless include which caused conflict. llvm-svn: 289700	2016-12-14 20:17:47 +00:00
Robert Lougher	c9f7354776	[InstCombine] Folding of a binop with RHS const should merge the debug locations If all the operands to a phi node are a binop with a RHS constant, instcombine will try to pull them through the phi node, combining them into a single operation. When it does this, the debug location of the new op should be the merged debug locations of the phi node arguments. Patch 7 of 8 for D26256. Folding of a binop with RHS constant. Differential Revision: https://reviews.llvm.org/D26256 llvm-svn: 289699	2016-12-14 20:07:49 +00:00
David Blaikie	b461468958	DebugInfo: Improve type safety and simplify some subprogram finalization code This probably ended up this way after the subprogram<>function link inversion and debug info metadata schema changes. llvm-svn: 289697	2016-12-14 19:38:39 +00:00
Geoff Berry	ca11a1e147	[GVNHoist] Move GVNHoist to function simplification part of pipeline. Summary: Move GVNHoist to later in the optimization pipeline, specifically, to the function simplification part of the pipeline. The new pipeline location allows GVNHoist to run on a function after its callees have been inlined but before the function has been considered for inlining into its callers, exposing more opportunities for hoisting. Performance results on AArch64 kryo: Improvements: Benchmarks/CoyoteBench/fftbench -24.952% spec2006/bzip2 -4.071% internal bmark -3.177% Benchmarks/PAQ8p/paq8p -1.754% spec2000/perlbmk -1.328% spec2006/h264ref -1.140% Regressions: internal bmark +1.818% Benchmarks/mafft/pairlocalalign +1.084% Reviewers: sebpop, dberlin, hiraditya Subscribers: aemerson, mehdi_amini, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D27722 llvm-svn: 289696	2016-12-14 19:38:22 +00:00
Andrew Kaylor	ce3bcae632	[WinEH] Avoid holding references to BlockColor (DenseMap) entries while inserting new elements Differential Revision: https://reviews.llvm.org/D27693 llvm-svn: 289694	2016-12-14 19:30:18 +00:00
Robert Lougher	f02d9b8325	[InstCombine] When folding casts through a phi node merge the debug locations If all the operands to a phi node are a cast, instcombine will try to pull them through the phi node, combining them into a single cast. When it does this, the debug location of the new cast should be the merged debug locations of the phi node arguments. Patch 6 of 8 for D26256. Folding of a cast operation. Differential Revision: https://reviews.llvm.org/D26256 llvm-svn: 289693	2016-12-14 19:24:01 +00:00
Sean Callanan	62204ad74a	Include <cstdarg> in PrettyStackTrace.cpp, fixing the bots. llvm-svn: 289691	2016-12-14 19:19:53 +00:00
Sean Callanan	032dbf9ee3	Prepare PrettyStackTrace for LLDB adoption This patch fixes the linkage for __crashtracer_info__, making it have the proper mangling (extern "C") and linkage (private extern). It also adds a new PrettyStackTrace type, allowing LLDB to adopt this instead of Host::SetCrashDescriptionWithFormat(). Without this patch, CrashTracer on macOS won't pick up pretty stack traces from any LLVM client. An LLDB commit adopting this API will follow shortly. Differential Revision: https://reviews.llvm.org/D27683 llvm-svn: 289689	2016-12-14 19:09:43 +00:00
Robert Lougher	373e36a410	[InstCombine] Folding loads through a phi node should merge the debug locations If all the operands to a phi node are a load, instcombine will try to pull them through the phi node, combining them into a single load. When it does this, the debug location of the new load should be the merged debug locations of the phi node arguments. Patch 5 of 8 for D26256. Folding of a load operation. Differential Revision: https://reviews.llvm.org/D26256 llvm-svn: 289688	2016-12-14 19:02:14 +00:00
Robert Lougher	8fc1e89bbb	[InstCombine] When folding GEP through a phi node merge the debug locations If all the operands to a phi node are getelementptr, instcombine will try to pull them through the phi node, combining them into a single operation. When it does this, the debug location of the new getelementptr should be the merged debug locations of the phi node arguments. Patch 4 of 8 for D26256. Folding of a getelementptr operation. Differential Revision: https://reviews.llvm.org/D26256 llvm-svn: 289684	2016-12-14 18:37:50 +00:00
Eric Christopher	ba1024cfb8	This change does two things: Adds a "Discriminator" field to struct DILineInfo, which defaults to 0. Fills out the "Discriminator" field in DILineInfo in DWARFDebugLine::LineTable::getFileLineInfoForAddress(), in order to have a slightly nicer interface in getFileLineInfoForAddress. Patch by Simon Que! Differential Revision: https://reviews.llvm.org/D27649 llvm-svn: 289683	2016-12-14 18:29:39 +00:00
Robert Lougher	4b0790d488	[InstCombine] Merge debug locations when folding through a phi node If all the operands to a phi node are of the same operation, instcombine will try to pull them through the phi node, combining them into a single operation. When it does this, the debug location of the operation should be the merged debug locations of the phi node arguments. Patch 3 of 8 for D26256. Folding of a compare operation. Differential Revision: https://reviews.llvm.org/D26256 llvm-svn: 289681	2016-12-14 18:14:57 +00:00
Kostya Serebryany	d9d9a54511	[libFuzzer] disable msan for one more hook that reads target's data that might be uninitialized llvm-svn: 289680	2016-12-14 18:13:02 +00:00
Robert Lougher	2428a4050f	[InstCombine] Merge debug locations when folding through a phi node If all the operands to a phi node are of the same operation, instcombine will try to pull them through the phi node, combining them into a single operation. When it does this, the debug location of the operation should be the merged debug locations of the phi node arguments. Patch 2 of 8 for D26256. Folding of a binary operation. Differential Revision: https://reviews.llvm.org/D26256 llvm-svn: 289679	2016-12-14 17:49:19 +00:00
Dehao Chen	23025f8483	revert r289669 which breaks bots llvm-svn: 289676	2016-12-14 17:23:16 +00:00
Yaxun Liu	07d659bc76	AMDGPU: Emit runtime metadata version 2 as YAML Differential Revision: https://reviews.llvm.org/D25046 llvm-svn: 289674	2016-12-14 17:16:52 +00:00
Derek Schuff	ebd8110aa1	lit.cfg: Check value of build config rather than converting to boolean This is a CMake var which never evaluates to false. llvm-svn: 289673	2016-12-14 17:05:34 +00:00
Matt Arsenault	bdc0ac0a0e	AMDGPU: Make AllocationPriority of SGPRs higher than VGPRs Since SGPRs should spill to VGPRs, they should be allocated first. I don't think this is sufficient for SGPRs to always spill to VGPRs though. llvm-svn: 289671	2016-12-14 16:52:06 +00:00
Dehao Chen	cb61c94d87	Create SampleProfileLoader pass in llvm instead of clang Summary: We used to create SampleProfileLoader pass in clang. This makes LTO/ThinLTO unable to add this pass in the linker plugin. This patch moves the SampleProfileLoader pass creation from clang to llvm pass manager builder. Reviewers: tejohnson, davidxl, dnovillo Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D27743 llvm-svn: 289669	2016-12-14 16:49:28 +00:00
Nirav Dave	f5bf03c7ef	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." Reverting due to ARM MCJIT and MIPS LLD error. This reverts commit r289659. llvm-svn: 289667	2016-12-14 16:43:44 +00:00
Matt Arsenault	ebfba7027e	AMDGPU: Change vintrp printing llvm-svn: 289664	2016-12-14 16:36:12 +00:00
Derek Schuff	112b303905	Revert gold part of change, just liblto llvm-svn: 289663	2016-12-14 16:20:25 +00:00
Derek Schuff	0c2796dc36	Disable libLTO tests when libLTO is not built Summary: The current test only checks whether ld64 is available, causing tests to fail when ld64 is available but libLTO is not built. Reviewers: beanz, mehdi_amini Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D27739 llvm-svn: 289662	2016-12-14 16:20:22 +00:00
Robert Lougher	7bd04e3b2d	New API for merging debug locations. NFC. Given two debug locations the function getMergedLocation combines the locations into a single location (which may be an empty location). Please see https://reviews.llvm.org/D26256 for the discussion leading up to this API. Note the function is currently a stub. This allows optimisations to use the API although no location will actually be used. This is patch 1 out of 8 for D26256. As suggested by David Blaikie, each change in D26256 has been broken out into a separate patch. llvm-svn: 289661	2016-12-14 16:14:17 +00:00
Nirav Dave	8527ab0ad2	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Retrying after fixing after removing load-store factoring through token factors in favor of improved token factor operand pruning Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? 
CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 289659	2016-12-14 15:44:26 +00:00
Simon Pilgrim	facbd35696	Wdocumentation fix llvm-svn: 289655	2016-12-14 15:14:44 +00:00
Simon Pilgrim	05ab8ffc7e	[DAGCombiner] Try to use SelectionDAG::isKnownToBeAPowerOfTwo instead of just APInt::isPowerOf2 Generalize sdiv/udiv/srem/urem combines using APInt::isPowerOf2, which only works for const/splat-const values, to call SelectionDAG::isKnownToBeAPowerOfTwo instead which recognises many more cases. Added a DAGCombiner::BuildLogBase2 helper since PowerOf2 combines often involve taking the log2 of such a value. Differential Revision: https://reviews.llvm.org/D27714 llvm-svn: 289654	2016-12-14 15:08:13 +00:00
Michael Zuckerman	1ce2a23a1e	Fix bug 30945- [AVX512] Failure to flip vector comparison to remove not mask instruction adding new optimization opportunity by adding new X86ISelLowering pattern. The test case was shown in https://llvm.org/bugs/show_bug.cgi?id=30945. Test explanation: Select gets three arguments mask, op and op2. In this case, the Mask is a result of ICMP. The ICMP instruction compares (with equal operand) the zero initializer vector and the result of the first ICMP. In general, the result of "cmp eq, op1, zero initializers" is "not(op1)" where op1 is a mask. By rearranging the two arguments inside the Select instruction, we can get the same result without needing the middle phase ("cmp eq, op1, zero initializers"). Missed optimization opportunity: vpcmpled %zmm0, %zmm1, %k0 knotw %k0, %k1 can be combined to vpcmpgtd %zmm0, %zmm2, %k1 Reviewers: 1. delena 2. igorb Committed after check all Differential Revision: https://reviews.llvm.org/D27160 llvm-svn: 289653	2016-12-14 14:57:10 +00:00
Simon Pilgrim	ebe58191c8	[X86][SSE] Add AVX1 tests to sdiv/udiv srem/urem combine tests As requested on D27714 llvm-svn: 289652	2016-12-14 14:39:51 +00:00
Renato Golin	ce1dd3c949	Revert "[AVR] Add the very first on-target test" This reverts commit r289648, as it's an execution test and relies on the emulator/dispatcher being available on all builders. llvm-svn: 289651	2016-12-14 13:24:20 +00:00
Stephan Bergmann	7d94d54a36	Adapt to recent APFloat change llvm-svn: 289649	2016-12-14 12:11:35 +00:00
Dylan McKay	452e266cd6	[AVR] Add the very first on-target test This test runs on actual AVR hardware. llvm-svn: 289648	2016-12-14 12:03:39 +00:00
Stephan Bergmann	17c7f70362	Replace APFloatBase static fltSemantics data members with getter functions At least the plugin used by the LibreOffice build (<https://wiki.documentfoundation.org/Development/Clang_plugins>) indirectly uses those members (through inline functions in LLVM/Clang include files in turn using them), but they are not exported by utils/extract_symbols.py on Windows, and accessing data across DLL/EXE boundaries on Windows is generally problematic. Differential Revision: https://reviews.llvm.org/D26671 llvm-svn: 289647	2016-12-14 11:57:17 +00:00
Artur Pilipenko	f3ee444010	Add a couple of assertions to the load combine code introduced by r289538 llvm-svn: 289646	2016-12-14 11:55:47 +00:00
Dylan McKay	cfd1ce6a52	[AVR] Add the integrated testing tool to the .gitignore We build it as an LLVM tool. llvm-svn: 289645	2016-12-14 11:47:14 +00:00
Oliver Stannard	268f42f1ce	[Assembler] Better error messages for .org directive Currently, the error messages we emit for the .org directive when the expression is not absolute or is out of range do not include the line number of the directive, so it can be hard to track down the problem if a file contains many .org directives. This patch stores the source location in the MCOrgFragment, so that it can be used for diagnostics emitted during layout. Since layout is an iterative process, and the errors are detected during each iteration, it would have been possible for errors to be reported multiple times. To prevent this, I've made the assembler bail out after each iteration if any errors have been reported. This will still allow multiple unrelated errors to be reported in the common case where they are all detected in the first round of layout. Differential Revision: https://reviews.llvm.org/D27411 llvm-svn: 289643	2016-12-14 10:43:58 +00:00
Dylan McKay	3abd1d3e12	[AVR] Add a function instrumentation pass This will be used for an on-chip test suite. llvm-svn: 289641	2016-12-14 10:15:00 +00:00
Craig Topper	aeaa52cc11	[X86][InstCombine] Handle demanded elements for operand of AVX-512 scalar floating point to integer conversion intrinsics. llvm-svn: 289639	2016-12-14 07:46:12 +00:00
Hal Finkel	065b756528	[PowerPC] Fix logic dealing with nop after calls (and tail-call eligibility) This change aims to unify and correct our logic for when we need to allow for the possibility of the linker adding a TOC restoration instruction after a call. This comes up in two contexts: 1. When determining tail-call eligibility. If we make a tail call (i.e. directly branch to a function) then there is no place for the linker to add a TOC restoration. 2. When determining when we need to add a nop instruction after a call. Likewise, if there is a possibility that the linker might need to add a TOC restoration after a call, then we need to put a nop after the call (the bl instruction). First problem: We were using similar, but different, logic to decide (1) and (2). This is just wrong. Both the resideInSameModule function (used when determining tail-call eligibility) and the isLocalCall function (used when deciding if the post-call nop is needed) were supposed to be determining the same underlying fact (i.e. might a TOC restoration be needed after the call). The same logic should be used in both places. Second problem: The logic in both places was wrong. We only know that two functions will share the same TOC when both functions come from the same section of the same object. Otherwise the linker might cause the functions to use different TOC base addresses (unless the multi-TOC linker option is disabled, in which case only shared-library boundaries are relevant). There are a number of factors that can cause functions to be placed in different sections or come from different objects (-ffunction-sections, explicitly-specified section names, COMDAT, weak linkage, etc.). All of these need to be checked. The existing logic only checked properties of the callee, but the properties of the caller must also be checked (for example, calling from a function in a COMDAT section means calling between sections). 
There was a conceptual error in the resideInSameModule function in that it allowed tail calls to functions with weak linkage and protected/hidden visibility. While protected/hidden visibility does prevent the function implementation from being replaced at runtime (via interposition), it does not prevent the linker from using an alternate implementation at link time (i.e. using some strong definition to replace the provided weak one during linking). If this happens, then we're still potentially looking at a required TOC restoration upon return. Otherwise, in general, the post-call nop is needed wherever ELF interposition needs to be supported. We don't currently support ELF interposition at the IR level (see http://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html for more information), and I don't think we should try to make it appear to work in the backend in spite of that fact. This will yield subtle bugs if interposition is attempted. As a result, regardless of whether we're in PIC mode, we don't assume that we need to add the nop to support the possibility of ELF interposition. However, the necessary check is in place (i.e. calling GV->isInterposable and TM.shouldAssumeDSOLocal) so when we have functions for which interposition is allowed at the IR level, we'll add the nop as necessary. In the mean time, we'll generate more tail calls and fewer nops when compiling position-independent code. Differential Revision: https://reviews.llvm.org/D27231 llvm-svn: 289638	2016-12-14 07:24:50 +00:00
Craig Topper	268b3abe6d	[X86][InstCombine] Teach SimplifyDemandedVectorElts to handle masked scalar add/sub/mul/div/max/min intrinsics better. Now we can remove these intrinsics if element 0 isn't used. Also fix undef element tracking. llvm-svn: 289636	2016-12-14 06:06:58 +00:00
Craig Topper	dfd268d76b	[X86][InstCombine] Handle scalar fmadd intrinsics correctly in SimplifyDemandedVectorElts. Now we pass a modified version of DemandedElts to each operand and we calculate undef elts correctly. llvm-svn: 289632	2016-12-14 05:43:05 +00:00
Mehdi Amini	8e13bc4562	[ThinLTO] Add an API to trigger file-based API for returning objects to the linker Summary: The motivation is to support better the -object_path_lto option on Darwin. The linker needs to write down the generated object files on disk for later use by lldb or dsymutil (debug info is not present in the final binary). We're moving this into libLTO so that we can be smarter when a cache is enabled and hard-link when possible instead of duplicating the files. Reviewers: tejohnson, deadalnix, pcc Subscribers: dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D27507 llvm-svn: 289631	2016-12-14 04:56:42 +00:00
Craig Topper	eb6a20e79e	[X86][InstCombine] Teach SimplifyDemandedVectorElts to handle scalar round intrinsics more correctly. Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse since the upper bits are unused. Similarly we clear bit 0 for optimizing operand 0. Also calculate UndefElts correctly. Simplify InstCombineCalls for these intrinsics to just call SimplifyDemandedVectorElts for the call instruction to reuse this support. llvm-svn: 289629	2016-12-14 03:17:30 +00:00
Craig Topper	a0372dec26	[X86][InstCombine] Teach SimplifyDemandedVectorElts to handle scalar min/max/cmp intrinsics more correctly. Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse since the upper bits are unused. Also calculate UndefElts correctly. Simplify InstCombineCalls for these intrinsics to just call SimplifyDemandedVectorElts for the call instruction to reuse this support. llvm-svn: 289628	2016-12-14 03:17:27 +00:00
Mehdi Amini	76a00b51f0	Don't double-initialize cl::opt for iterating in reverse order to uncover non-determinism in codegen by default Bots are broken and need to be fixed before having this on by default. The feature was committed in r289619. I tried to disable it in r289624 and failed because it was initialized in two places. llvm-svn: 289626	2016-12-14 02:35:32 +00:00
Mehdi Amini	fd1184efb5	Disable Iterating SmallPtrSet in reverse order to uncover non-determinism in codegen by default Bots are broken and need to be fixed before having this on by default. The feature was committed in r289619. llvm-svn: 289624	2016-12-14 02:02:28 +00:00
Kostya Serebryany	8efb35b4cb	[libFuzzer] document one more desired feature of a fuzz target llvm-svn: 289622	2016-12-14 01:31:21 +00:00
Peter Collingbourne	1a0720e8c4	LTO: Add support for multi-module bitcode files. Differential Revision: https://reviews.llvm.org/D27313 llvm-svn: 289621	2016-12-14 01:17:59 +00:00
Paul Robinson	8fec3da00c	[DWARF] Preserve column number when emitting 'line 0' record Follow-up to r289256, address a FIXME to avoid resetting the column number. This reduced .debug_line by 2.6% in a RelWithDebInfo self-build of clang. llvm-svn: 289620	2016-12-14 00:27:35 +00:00
Mandeep Singh Grang	f6b069c7db	[llvm] Iterate SmallPtrSet in reverse order to uncover non-determinism in codegen Summary: Given a flag (-mllvm -reverse-iterate) this patch will enable iteration of SmallPtrSet in reverse order. The idea is to compile the same source with and without this flag and expect the code to not change. If there is a difference in codegen then it would mean that the codegen is sensitive to the iteration order of SmallPtrSet. This is enabled only with LLVM_ENABLE_ABI_BREAKING_CHECKS. Reviewers: chandlerc, dexonsmith, mehdi_amini Subscribers: mgorny, emaste, llvm-commits Differential Revision: https://reviews.llvm.org/D26718 llvm-svn: 289619	2016-12-14 00:15:57 +00:00
Evandro Menezes	54eb192b25	[ARM] Fix typo in checking prefix llvm-svn: 289617	2016-12-14 00:02:03 +00:00
Evandro Menezes	aeec780e42	Add support for Samsung Exynos M3 (NFC) llvm-svn: 289613	2016-12-13 23:31:41 +00:00
Greg Clayton	74c265e537	Update the header docs to match a recent checkin. llvm-svn: 289612	2016-12-13 23:22:53 +00:00
Greg Clayton	1cbf3fa94a	Switch functions that returned bool and filled in a DWARFFormValue arg with ones that return Optional<DWARFFormValue> Differential Revision: https://reviews.llvm.org/D27737 llvm-svn: 289611	2016-12-13 23:20:56 +00:00
Peter Collingbourne	98d40e0557	llvm-cat: Allow bitcode files to be created with no modules. llvm-svn: 289610	2016-12-13 23:14:55 +00:00
Chris Bieneman	da1c84c01e	[llvm-config] Fixing one check where shared libs implied dylib We shouldn't print the dylib if LinkDylib is false. llvm-svn: 289609	2016-12-13 23:08:52 +00:00
Derek Schuff	7ff587a96d	llvm-config: Set LinkMode in addition to LinkDyLib when using --ignore-llvm Summary: LinkDyLib is only used (before arg processing) to set up the default for LinkMode. So reset LinkMode as well, and process before --link-shared or --link-static to allow those flags to continue to override it. Reviewers: beanz Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27736 llvm-svn: 289608	2016-12-13 23:01:53 +00:00
Kostya Serebryany	f6f82c2cc8	[libFuzzer] fix a UB (invalid shift) spotted by ubsan. The code worked fine by luck, because of the way shifts actually work on clang+x86 llvm-svn: 289607	2016-12-13 22:49:14 +00:00
Chris Bieneman	7f6611cf3e	[llvm-config] Add --ignore-libllvm This flag forces off linking libLLVM. This should resolve some issues reported on llvm-commits. llvm-svn: 289605	2016-12-13 22:17:59 +00:00
Eugene Zelenko	8208592707	[Hexagon] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 289604	2016-12-13 22:13:50 +00:00
Dehao Chen	0f35fa907d	Change CoverageTracker from a global variable to member variable to avoid breaking thread-safety. (NFC) llvm-svn: 289603	2016-12-13 22:13:18 +00:00
Sanjoy Das	c02dda2ab9	Re-land "[SCEVExpander] Use llvm data structures; NFC" This change re-lands r289215, by reverting r289482. The underlying issue that caused it to be reverted has been fixed by Tim Northover in r289496. Original commit message for r289215: [SCEVExpander] Use llvm data structures; NFC Original commit message for r289482: Revert "[SCEVExpander] Use llvm data structures; NFC" This reverts r289215 (git SHA1 cb7b86a1). It breaks the ubsan build because a DenseMap that keys off of `AssertingVH<T>` will hit UB when it tries to cast the empty and tombstone keys to `T *` (due to insufficient alignment). This is the relevant stack trace (thanks to Mike Aizatsky): #0 0x25cf100 in llvm::AssertingVH<llvm::PHINode>::getValPtr() const llvm/include/llvm/IR/ValueHandle.h:212:39 #1 0x25cea20 in llvm::AssertingVH<llvm::PHINode>::operator=(llvm::AssertingVH<llvm::PHINode> const&) llvm/include/llvm/IR/ValueHandle.h:234:19 #2 0x25d0092 in llvm::DenseMapBase<llvm::DenseMap<llvm::AssertingVH<llvm::PHINode>, llvm::detail::DenseSetEmpty, llvm::DenseMapInfo<llvm::AssertingVH<llvm::PHINode> >, llvm::detail::DenseSetPair<llvm::AssertingVH<llvm::PHINode> > >, llvm::AssertingVH<llvm::PHINode>, llvm::detail::DenseSetEmpty, llvm::DenseMapInfo<llvm::AssertingVH<llvm::PHINode> >, llvm::detail::DenseSetPair<llvm::AssertingVH<llvm::PHINode> > >::clear() llvm/include/llvm/ADT/DenseMap.h:113:23 llvm-svn: 289602	2016-12-13 22:04:58 +00:00
Anna Thomas	65ca8e91cc	[IRCE] Avoid loop optimizations on pre and post loops Summary: This patch will add loop metadata on the pre and post loops generated by IRCE. Currently, we have metadata for disabling optimizations such as vectorization, unrolling, loop distribution and LICM versioning (and confirmed that these optimizations check for the metadata before proceeding with the transformation). The pre and post loops generated by IRCE need not go through loop opts (since these are slow paths). Added two test cases as well. Reviewers: sanjoy, reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26806 llvm-svn: 289588	2016-12-13 21:05:21 +00:00
Michael Kuperstein	3d23d4a234	[LV] Don't vectorize when we have a small static bound on trip count We currently check if the exact trip count is known and is smaller than the "tiny loop" bound. We should be checking the maximum bound on the trip count instead. Differential Revision: https://reviews.llvm.org/D27690 llvm-svn: 289583	2016-12-13 20:38:18 +00:00
Peter Collingbourne	b56a103462	ADT: Use delete[] to delete the array owned by OwningArrayRef, as we created it with new[]. llvm-svn: 289582	2016-12-13 20:30:12 +00:00
Peter Collingbourne	d9af29969a	ADT: Add OwningArrayRef class. This is a MutableArrayRef that owns its array. I plan to use this in D22296. Differential Revision: https://reviews.llvm.org/D27723 llvm-svn: 289579	2016-12-13 20:24:24 +00:00
Peter Collingbourne	45102a24c7	Object: Make IRObjectFile own multiple modules and enumerate symbols from all modules. This implements multi-module support in IRObjectFile. Differential Revision: https://reviews.llvm.org/D26951 llvm-svn: 289578	2016-12-13 20:20:17 +00:00
Peter Collingbourne	c5fecb4f1a	Object: Remove module accessors from IRObjectFile, and hide its constructor. Differential Revision: https://reviews.llvm.org/D27079 llvm-svn: 289577	2016-12-13 20:10:22 +00:00
Peter Collingbourne	77f4c30d6f	LTO: Port the legacy LTO API to ModuleSymbolTable. Differential Revision: https://reviews.llvm.org/D27078 llvm-svn: 289576	2016-12-13 20:01:58 +00:00
Peter Collingbourne	ad90369a94	LTO: Port the new LTO API to ModuleSymbolTable. Differential Revision: https://reviews.llvm.org/D27077 llvm-svn: 289574	2016-12-13 19:43:49 +00:00
Alina Sbirlea	77c5eaaeda	Generalize strided store pattern in interleave access pass Summary: This patch aims to generalize matching of the strided store accesses to more general masks. The more general rule is to have consecutive accesses based on the stride: [x, y, ... z, x+1, y+1, ...z+1, x+2, y+2, ...z+2, ...] All elements in the masks need not form a contiguous space, there may be gaps. As before, undefs are allowed and filled in with adjacent element loads. Reviewers: HaoLiu, mssimpso Subscribers: mkuper, delena, llvm-commits Differential Revision: https://reviews.llvm.org/D23646 llvm-svn: 289573	2016-12-13 19:32:36 +00:00
Matthias Braun	fde00fc252	Revert "AArch64CollectLOH: Rewrite as block-local analysis." This is not always behaving as expected as it turns out block live-in lists are only correct most of the time. Still waiting for reviews on https://reviews.llvm.org/D27559 to have them correct all of the time. See also http://llvm.org/PR31361, rdar://25117107 This reverts commit r288567. This reverts commit r288561. llvm-svn: 289570	2016-12-13 19:08:17 +00:00
Alexei Starovoitov	3b9efca8e8	[bpf] change llvm-objdump to print dec instead of hex since bpf instruction stream is multiple of 8 change llvm-objdump to print decimal instruction number instead of hex address, so that users don't have to do this math manually to match kernel verifier output Signed-off-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 289569	2016-12-13 19:07:08 +00:00
Tim Northover	fe7c59adb8	GlobalISel: fix GOT accesses on AArch64. We were using the correct pseudo-instruction, but because the operand's flags weren't set correctly we still ended up emitting incorrect relocations during MC lowering. llvm-svn: 289566	2016-12-13 18:25:38 +00:00
Greg Clayton	c8c1032c0c	Make a DWARFDie class that can help avoid using the wrong DWARFUnit when extracting attributes Many places pass around a DWARFDebugInfoEntryMinimal and a DWARFUnit. It is easy to get things wrong by using the wrong DWARFUnit with a DWARFDebugInfoEntryMinimal. This patch creates a DWARFDie class that contains the DWARFUnit and DWARFDebugInfoEntryMinimal objects so that they can't get out of sync. All attribute extraction has been moved out of DWARFDebugInfoEntryMinimal and into DWARFDie. DWARFDebugInfoEntryMinimal was also renamed to DWARFDebugInfoEntry. DWARFDie objects are temporary objects that are used by clients and contain 2 pointers that you always need to have anyway. Keeping them grouped will avoid errors and simplify many of the attribute extracting APIs by not having to pass in a DWARFUnit. Differential Revision: https://reviews.llvm.org/D27634 llvm-svn: 289565	2016-12-13 18:25:19 +00:00
Marcos Pividori	c21b3c949d	[libFuzzer] Add missing header needed for Windows. llvm-svn: 289564	2016-12-13 17:46:48 +00:00
Marcos Pividori	7c1defd738	[libFuzzer] Avoid name collision with Windows API. Windows uses some macros to replace DeleteFile() by DeleteFileA() or DeleteFileW(). This was causing an error at link time. DeleteFile was renamed to RemoveFile(). Differential Revision: https://reviews.llvm.org/D27577 llvm-svn: 289563	2016-12-13 17:46:40 +00:00
Marcos Pividori	67dfacdd80	[libFuzzer] Implement DirName() for Windows. Implement DirName from scratch to avoid dependencies on external libraries. It's based on MSDN documentation for Naming Files, Paths, and Namespaces. The algorithm can't simply start from the end and look backwards for the first separator, because we need to preserve the prefix that represent the root location. We shouldn't remove anything there. In Windows we have many different options, like: \\Server\Share\ , \ , C: , C:\ , \\?\C:\ , \\?\UNC\Server\Share\ We remove the last separator in the rest of the path, if it exists. It was implemented to have a similar behaviour to dirname() in linux, removing trailing separators, returning "." when the path doesn't contain separators, etc. Differential Revision: https://reviews.llvm.org/D27579 llvm-svn: 289562	2016-12-13 17:46:32 +00:00
Marcos Pividori	64d4147396	[libFuzzer] Fix bug in detecting timeouts when input string is empty. I added a new flag RunningCB to know if the Fuzzer's main thread is running the CB function, instead of using (!CurrentUnitSize). (!CurrentUnitSize) doesn't work properly. For example, in FuzzerLoop.cpp, inside ShuffleAndMinimize() function, we execute the callback with an empty string (size=0). Previous implementation failed to detect timeouts in that execution. Also, I add a regression test for that case. Differential Revision: https://reviews.llvm.org/D27433 llvm-svn: 289561	2016-12-13 17:46:25 +00:00
Marcos Pividori	178fe58745	[libFuzzer] Clean up headers and file formatting of LibFuzzer files. Reorganize #includes to follow LLVM Coding Standards. Include some missing headers. Required to use `Printf()`. Aside from that, this patch contains no functional change. It is purely a re-organization. Differential Revision: https://reviews.llvm.org/D27363 llvm-svn: 289560	2016-12-13 17:46:11 +00:00
Marcos Pividori	6e3d885c79	[libFuzzer] Properly use unsigned for workers, jobs and NumberOfCpuCores. std::thread::hardware_concurrency() returns an unsigned, so I modify NumberOfCpuCores() to return unsigned too. The number of cpus is used to define the number of workers, so I decided to update the worker and jobs flags to be declared as unsigned too. Differential Revision: https://reviews.llvm.org/D27685 llvm-svn: 289559	2016-12-13 17:45:53 +00:00
Marcos Pividori	463f8bdd0b	[libFuzzer] Properly use unsigned for Process ID. Use unsigned for PID instead of signed int. GetCurrentProcessId() returns an unsigned (DWORD) so we must be sure we can deal with all possible values. I use a long unsigned to be sure it can hold a 32 bit unsigned (DWORD). Differential Revision: https://reviews.llvm.org/D27281 llvm-svn: 289558	2016-12-13 17:45:44 +00:00
Marcos Pividori	c59b692c85	[libFuzzer] Improve Signal Handler interface. Add new flags to FuzzingOptions to represent the different conditions on the signal handling. These options are passed when calling SetSignalHandler(). These changes simplify the implementation of Windows's exception handling. Now we can define a unique handler for all the exceptions. Differential Revision: https://reviews.llvm.org/D27238 llvm-svn: 289557	2016-12-13 17:45:20 +00:00
Rong Xu	3462cac9af	Fix the test cases committed in r289521. llvm-svn: 289556	2016-12-13 17:34:29 +00:00
Simon Pilgrim	5f2db1351f	[X86][SSE] Regenerate vector of pointers tests llvm-svn: 289555	2016-12-13 17:22:39 +00:00
Zachary Turner	bc48d20ef7	[ADT] Add llvm::StringLiteral. StringLiteral is a wrapper around a string literal useful for replacing global tables of char arrays with global tables of StringRefs that can be initialized in a constexpr context, avoiding the invocation of a global constructor. Differential Revision: https://reviews.llvm.org/D27686 llvm-svn: 289551	2016-12-13 17:03:49 +00:00
David Callahan	ebcf916c5a	[ADCE] Add code to remove dead branches Summary: This is the last in a series of patches to evolve ADCE.cpp to support removing of unnecessary control flow. This patch adds the code to update the control and data flow graphs to remove the dead control flow. Also update unit tests to test the capability to remove dead, may-be-infinite loop which is enabled by the switch -adce-remove-loops. Previous patches: D23824 [ADCE] Add handling of PHI nodes when removing control flow D23559 [ADCE] Add control dependence computation D23225 [ADCE] Modify data structures to support removing control flow D23065 [ADCE] Refactor anticipating new functionality (NFC) D23102 [ADCE] Refactoring for new functionality (NFC) Reviewers: dberlin, majnemer, nadav, mehdi_amini Subscribers: llvm-commits, david2050, freik, twoh Differential Revision: https://reviews.llvm.org/D24918 llvm-svn: 289548	2016-12-13 16:42:18 +00:00
Artur Pilipenko	469fcd2afd	Use more detailed assertion messages in the code introduced by r289538 llvm-svn: 289545	2016-12-13 16:26:15 +00:00
Artur Pilipenko	79d1255e26	Fix a buildbot failure introduced by r289538 Build failed because of unused variable in product mode. llvm-svn: 289540	2016-12-13 14:55:31 +00:00
Artur Pilipenko	c93cc5955f	[DAGCombiner] Match load by bytes idiom and fold it into a single load Match a pattern where a wide type scalar value is loaded by several narrow loads and combined by shifts and ors. Fold it into a single load or a load and a bswap if the targets supports it. Assuming little endian target: i8 a = ... i32 val = a[0] | (a[1] << 8) | (a[2] << 16) | (a[3] << 24) => i32 val = ((i32)a) i8 a = ... i32 val = (a[0] << 24) | (a[1] << 16) | (a[2] << 8) | a[3] => i32 val = BSWAP(((i32)a)) This optimization was discussed on llvm-dev some time ago in "Load combine pass" thread. We came to the conclusion that we want to do this transformation late in the pipeline because in presence of atomic loads load widening is irreversible transformation and it might hinder other optimizations. Eventually we'd like to support folding patterns like this where the offset has a variable and a constant part: i32 val = a[i] | (a[i + 1] << 8) | (a[i + 2] << 16) | (a[i + 3] << 24) Matching the pattern above is easier at SelectionDAG level since address reassociation has already happened and the fact that the loads are adjacent is clear. Understanding that these loads are adjacent at IR level would have involved looking through geps/zexts/adds while looking at the addresses. The general scheme is to match OR expressions by recursively calculating the origin of individual bits which constitute the resulting OR value. If all the OR bits come from memory verify that they are adjacent and match with little or big endian encoding of a wider value. If so and the load of the wider type (and bswap if needed) is allowed by the target generate a load and a bswap if needed. Reviewed By: hfinkel, RKSimon, filcab Differential Revision: https://reviews.llvm.org/D26149 llvm-svn: 289538	2016-12-13 14:21:14 +00:00
Artur Pilipenko	01e86444a0	Move BaseIndexOffset in DAGCombiner.cpp so it will be available for the upcoming user llvm-svn: 289537	2016-12-13 14:16:02 +00:00
Simon Pilgrim	9dc67c0101	[SelectionDAG] computeKnownBits - simplified knownbits sign extension. NFCI. We don't need to extract+test the sign bit of the known ones/zeros, we can use sext which will handle all of this. llvm-svn: 289534	2016-12-13 13:36:27 +00:00
Simon Dardis	c97cfb69ba	[mips][rtdyld] Move MIPS relocation resolution to a subclass and implement N32 relocations N32 relocations are only correct for individual relocations at the moment. Support for relocation composition will follow in a later patch. Patch By: Daniel Sanders Reviewers: vkalintiris, atanasyan Differential Revision: https://reviews.llvm.org/D27467 llvm-svn: 289532	2016-12-13 11:39:18 +00:00
Simon Dardis	e8af792439	[mips] Fix comment to respect 80 chars per line; NFC llvm-svn: 289530	2016-12-13 11:10:53 +00:00
Simon Dardis	43b5ce492d	[mips] Fix compact branch hazard detection In certain cases it is possible that transient instructions such as %reg = IMPLICIT_DEF as a single instruction in a basic block to reach the MipsHazardSchedule pass. This patch teaches MipsHazardSchedule to properly look through such cases. Reviewers: vkalintiris, zoran.jovanovic Differential Revision: https://reviews.llvm.org/D27209 llvm-svn: 289529	2016-12-13 11:07:51 +00:00
Diana Picus	2d9adbf524	[GlobalISel] Move extendRegister where it belongs. NFCI Apparently I missed this one when I moved ValueHandler back in r288658. Sorry! llvm-svn: 289528	2016-12-13 10:46:12 +00:00
Craig Topper	ac75bca1eb	[X86][InstCombine] Fix SimplifyDemandedVectorElts to handle frcz scalar intrinsics correctly. Only the lower bits of the input element are used. And only the lower element can be undef since the upper bits are zeroed. Have InstCombineCalls call SimplifyDemandedVectorElts for these intrinsics to reuse this support. llvm-svn: 289523	2016-12-13 07:45:45 +00:00
NAKAMURA Takumi	b8ea75a010	llvm/test/Transforms/PGOProfile/noreturncall.ll REQUIRES asserts due to -debug-only. llvm-svn: 289522	2016-12-13 07:04:03 +00:00
Rong Xu	51a1e3c430	[PGO] Fix insane counts due to nonreturn calls Summary: Since we don't break BBs for function calls. We might get some insane counts (wrap of unsigned) in the presence of noreturn calls. This patch sets these counts to zero instead of the wrapped number. Reviewers: davidxl Subscribers: xur, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D27602 llvm-svn: 289521	2016-12-13 06:41:14 +00:00
Davide Italiano	463bebc319	[SCCP] Debug diagnostic goes under DEBUG(). NFCI. llvm-svn: 289519	2016-12-13 05:56:04 +00:00
Dylan McKay	1e57fa487b	[AVR] Add a 'relax memory operation' pass Summary: This pass will be used to relax instructions which use out of bounds memory accesses to equivalent operations that can work with the addresses. The pass currently implements relaxation for the STDWPtrQRr instruction. Without this pass, an assertion error would be hit in the pseudo expansion pass. In the future, we will need to add more instructions to this pass. We can do that on a case-by-case basis. Reviewers: arsenm, kparzysz Subscribers: wdng, llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D27650 llvm-svn: 289517	2016-12-13 05:53:14 +00:00
Philip Reames	1f1bbac8da	[peephole] Enhance folding logic to work for STATEPOINTs The general idea here is to get enough of the existing restrictions out of the way that the already existing folding logic in foldMemoryOperand can kick in for STATEPOINTs and fold references to immutable stack slots. The key changes are: Support for folding multiple operands at once which reference the same load Support for folding multiple loads into a single instruction Walk all the operands of the instruction for variadic instructions (this is a bug fix!) Once this lands, I'll post another patch which refactors the TII interface here. There's nothing actually x86 specific about the x86 code used here. Differential Revision: https://reviews.llvm.org/D24103 llvm-svn: 289510	2016-12-13 01:38:41 +00:00
Philip Reames	51387a8c28	[Statepoints] Reuse stack slots more than once within a basic block The stack slot reuse code had a really amusing bug. We ended up only reusing a stack slot exactly once (initial use + reuse) within a basic block. If we had a third statepoint to process, we ended up allocating a new set of stack slots. If we crossed a basic block boundary, the set got cleared. As a result, code which is invoke heavy doesn't see the problem, but multiple calls within a basic block do. Net result: as we optimize invokes into calls, lowering gets worse. The root error here is that the bitmap used by the custom allocator wasn't kept in sync. The result was that we ended up resizing the bitmap on the next statepoint (to handle the cross block case), reset the bit once, but then never reset it again. Differential Revision: https://reviews.llvm.org/D25243 llvm-svn: 289509	2016-12-13 01:21:15 +00:00
Kostya Serebryany	a31300e789	[libFuzzer] don't require extra flags with -minimize_crash=1 (default to -max_total_time=600). Also respect exact_artifact_path when outputting the end result llvm-svn: 289506	2016-12-13 00:40:47 +00:00
Chris Bieneman	5d58aa80ad	Missed a file in r289503. llvm-svn: 289504	2016-12-13 00:32:43 +00:00
Chris Bieneman	a0523fd0cd	[LIT] Fix system-windows Turns out if you were on windows and your default target wasn't windows the system-windows feature wasn't getting enabled. This fixes that and updates the coff-dwarf test to rely on the new "target-windows" feature. That test was the reason why system-windows was changed to not always be enabled on Windows hosts. llvm-svn: 289503	2016-12-13 00:29:56 +00:00
Chris Bieneman	5a7c5069da	Revert "Suppress LLVM::tools/llvm-symbolizer/coff-dwarf.test for mingw, for now." This reverts commit r249937. llvm-svn: 289502	2016-12-13 00:29:51 +00:00
Chris Bieneman	e96abc6d45	[llvm-config] Unsupported should be win32 Hopefully this will fix the failing Windows bot. llvm-svn: 289497	2016-12-12 23:42:08 +00:00
Tim Northover	d82cc61744	Stop lying about pointers' required alignments. These extra specializations were added in the depths of history (r67984 from 2009) and are clearly problematic now. The pointers actually are aligned to the default (8 bytes), since otherwise UBsan would be complaining loudly. I think it originally made sense because there was no "alignof" to infer the correct value so the generic case went with what malloc returned (8-byte aligned objects), and on 32-bit machines this specialization was correct. It became wrong when we started compiling for 64-bit, and caused a UBSan failure when we tried to put a ValueHandle into a DenseMap. Should fix the Green Dragon UBSan bot. llvm-svn: 289496	2016-12-12 23:29:07 +00:00
Marcos Pividori	681e904419	[libFuzzer] Implement Timers for Windows. Implemented timeouts for Windows using TimerQueueTimers. Timers are used to supervise the time of execution of the callback function that is being fuzzed. Differential Revision: https://reviews.llvm.org/D27237 llvm-svn: 289495	2016-12-12 23:25:11 +00:00
Sanjay Patel	2a1554a0b6	[x86] fix test specifications llvm-svn: 289493	2016-12-12 23:16:35 +00:00
Sanjay Patel	1740526e99	[x86] fix test specifications and auto-generate checks llvm-svn: 289492	2016-12-12 23:15:15 +00:00
Petr Hosek	024a17b06d	[CMake] Multi-target builtins build This change enables building builtins for multiple different targets using LLVM runtimes directory. To specify the builtin targets to be built, use the LLVM_BUILTIN_TARGETS variable, where the value is the list of targets. To pass a per target variable to the builtin build, you can set BUILTINS_<target>_<variable> where <variable> will be passed to the builtin build for <target>. Differential Revision: https://reviews.llvm.org/D26652 llvm-svn: 289491	2016-12-12 23:15:10 +00:00
Chris Bieneman	1a5e67869e	Revert "Disable all llvm-config tests for now, will investigate later" This reverts commit r260386. These tests all pass for me locally. I have no idea if they will pass on all configurations, so I'll watch the bots closely. llvm-svn: 289490	2016-12-12 23:14:58 +00:00
Dan Liew	197d2f0df3	[llvm-config] Fix bug where `--libfiles` and `--names` would produce incorrect output when LLVM is built with `LLVM_BUILD_LLVM_DYLIB`. `llvm-config` previously produced output like this ``` $ llvm-config --libfiles /usr/lib/liblibLLVM-4.0svn.so.so $ llvm-config --libnames liblibLLVM-4.0svn.so.so ``` The library prefix and shared library extension were added to the library name twice which was wrong. I wanted to write a test cases for this but it looks like all `llvm-config` tests were disabled by r260386 so I'll leave this for now. Subscribers: llvm-commits, tstellarAMD Reviewers: beanz, DiamondLovesYou, axw Differential Revision: https://reviews.llvm.org/D27393 llvm-svn: 289488	2016-12-12 23:07:22 +00:00
Andrew Kaylor	ff6a1edfa8	Avoid infinite loops in branch folding Differential Revision: https://reviews.llvm.org/D27582 llvm-svn: 289486	2016-12-12 23:05:38 +00:00
Chris Bieneman	7495a4895c	clang-format to fix post-commit feedback Thanks dblaikie! llvm-svn: 289485	2016-12-12 23:05:15 +00:00
Chris Bieneman	f07d05eccd	[llvm-config] Fix cflags test looking for "error" This test is (I think) actually trying to make sure no errors are printed, but it hits on the string "error" in flags. llvm-svn: 289484	2016-12-12 23:03:28 +00:00
Chris Bieneman	04418623fe	Revert "Remove system-libs.test for now" This reverts commit r260281. llvm-svn: 289483	2016-12-12 23:03:01 +00:00
Sanjoy Das	804b629812	Revert "[SCEVExpander] Use llvm data structures; NFC" This reverts r289215 (git SHA1 cb7b86a1). It breaks the ubsan build because a DenseMap that keys off of `AssertingVH<T>` will hit UB when it tries to cast the empty and tombstone keys to `T *` (due to insufficient alignment). This is the relevant stack trace (thanks to Mike Aizatsky): #0 0x25cf100 in llvm::AssertingVH<llvm::PHINode>::getValPtr() const llvm/include/llvm/IR/ValueHandle.h:212:39 #1 0x25cea20 in llvm::AssertingVH<llvm::PHINode>::operator=(llvm::AssertingVH<llvm::PHINode> const&) llvm/include/llvm/IR/ValueHandle.h:234:19 #2 0x25d0092 in llvm::DenseMapBase<llvm::DenseMap<llvm::AssertingVH<llvm::PHINode>, llvm::detail::DenseSetEmpty, llvm::DenseMapInfo<llvm::AssertingVH<llvm::PHINode> >, llvm::detail::DenseSetPair<llvm::AssertingVH<llvm::PHINode> > >, llvm::AssertingVH<llvm::PHINode>, llvm::detail::DenseSetEmpty, llvm::DenseMapInfo<llvm::AssertingVH<llvm::PHINode> >, llvm::detail::DenseSetPair<llvm::AssertingVH<llvm::PHINode> > >::clear() llvm/include/llvm/ADT/DenseMap.h:113:23 llvm-svn: 289482	2016-12-12 23:00:12 +00:00
Kostya Serebryany	092d5764a1	[libFuzzer] split one slow test into several, for more parallel testing llvm-svn: 289481	2016-12-12 22:55:25 +00:00
Nico Weber	b3901bdde8	Fix MSVC build after 289461; MSVC isn't sure if this is std:: or llvm:: llvm-svn: 289480	2016-12-12 22:46:40 +00:00
Kostya Serebryany	a4b43bf8e8	[libFuzzer] make SimpleCmpTest a bit simpler to crack and more verbose llvm-svn: 289477	2016-12-12 22:39:33 +00:00
Sanjay Patel	62104ee6d9	[x86] fix formatting; NFC llvm-svn: 289476	2016-12-12 22:31:01 +00:00
Eugene Zelenko	6a9226d9b8	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 289475	2016-12-12 22:23:53 +00:00
Tim Shen	18e7ae672e	[APFloatTest] Use std::make_tuple to make GCC 4.8 happy Differential Revision: https://reviews.llvm.org/D26817 llvm-svn: 289474	2016-12-12 22:16:08 +00:00
Guozhi Wei	1fd553c934	[PPC] Prefer direct move on power8 if load 1 or 2 bytes to VSR Power8 has MTVSRWZ but no LXSIBZX/LXSIHZX, so move 1 or 2 bytes to VSR through MTVSRWZ is much faster than store the extended value into stack and load it with LXSIWZX. This patch fixes pr31144. Differential Revision: https://reviews.llvm.org/D27287 llvm-svn: 289473	2016-12-12 22:09:02 +00:00
Tim Shen	44bde896a5	[APFloat] Implement PPCDoubleDouble add and subtract. Summary: I looked at libgcc's implementation (which is based on the paper "Software for Doubled-Precision Floating-Point Computations", by Seppo Linnainmaa, ACM TOMS vol 7 no 3, September 1981, pages 272-283.) and made it generic to arbitrary IEEE floats. Differential Revision: https://reviews.llvm.org/D26817 llvm-svn: 289472	2016-12-12 21:59:30 +00:00
Matthew Simpson	92ce0230b5	[SLP] Fix sign-extends for type-shrinking This patch ensures the correct minimum bit width during type-shrinking. Previously when type-shrinking, we always sign-extended values back to their original width. However, if we are going to sign-extend, and the sign bit is unknown, we have to increase the minimum bit width by one bit so the sign-extend will fill the upper bits correctly. If the sign bit is known to be zero, we can perform a zero-extend instead. This should fix PR31243. Reference: https://llvm.org/bugs/show_bug.cgi?id=31243 Differential Revision: https://reviews.llvm.org/D27466 llvm-svn: 289470	2016-12-12 21:11:04 +00:00
Kostya Serebryany	035af9b346	[libFuzzer] build libFuzzer itself with asan llvm-svn: 289469	2016-12-12 20:58:10 +00:00
Paul Robinson	ac7fe5e0c4	Recommit r288212: Emit 'no line' information for interesting 'orphan' instructions. DWARF specifies that "line 0" really means "no appropriate source location" in the line table. By default, use this for branch targets and some other cases that have no specified source location, to prevent inheriting unfortunate line numbers from physically preceding instructions (which might be from completely unrelated source). Updated patch allows enabling or suppressing this behavior for all unspecified source locations. Differential Revision: http://reviews.llvm.org/D24180 llvm-svn: 289468	2016-12-12 20:49:11 +00:00
Kostya Serebryany	d4be88913e	[libFuzzer] respect -max_len during merge llvm-svn: 289467	2016-12-12 20:39:35 +00:00
Teresa Johnson	a29bd6ffcc	[ThinLTO] Remove useless code (NFC) Should have been removed in r288446. llvm-svn: 289466	2016-12-12 20:34:28 +00:00
Mehdi Amini	ef27db879c	Refactor BitcodeReader: move Metadata and ValueId handling in their own class/file Summary: I'm planning on changing the way we load metadata to enable laziness. I'm getting lost in this gigantic files, and gigantic class that is the bitcode reader. This is a first toward splitting it in a few coarse components that are more easily understandable. Reviewers: pcc, tejohnson Subscribers: mgorny, llvm-commits, dexonsmith Differential Revision: https://reviews.llvm.org/D27646 llvm-svn: 289461	2016-12-12 19:34:26 +00:00
Mehdi Amini	bf2090e31a	Remove IsMetadataMaterialized from BitcodeReader (NFC) Summary: It does not seem useful. Reviewers: pcc, dexonsmith Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27668 llvm-svn: 289457	2016-12-12 19:23:39 +00:00
Geoff Berry	d73420d591	[LiveRangeEdit] Add assert string and descriptive comment. llvm-svn: 289456	2016-12-12 19:12:41 +00:00
Dimitry Andric	59e5cb4342	Fix compile with GCC 5 or later Summary: Compiling with GCC 5 or later can fail with a bogus error "constructor required before non-static data member for llvm::ValueEnumerator::MDRange::First has been parsed". This was originally fixed upstream in GCC PR 70528, but later this fix was reverted, and released versions of GCC still show the bogus error. To work around this, replace MDRange's declaration of a default constructor with a definition. Reviewers: dexonsmith, rsmith, rivanvx Subscribers: llvm-commits, dim, dexonsmith Differential Revision: https://reviews.llvm.org/D18730 llvm-svn: 289454	2016-12-12 19:05:52 +00:00
Reid Kleckner	30422eea0f	Revert "[SCEVExpand] do not hoist divisions by zero (PR30935)" Reverts r289412. It caused an OOB PHI operand access in instcombine when ASan is enabled. Reduction in progress. Also reverts "[SCEVExpander] Add a test case related to r289412" llvm-svn: 289453	2016-12-12 18:52:32 +00:00
Simon Atanasyan	5048514c20	[mips] For PIC code convert unconditional jump to unconditional branch Unconditional branch uses relative addressing which is the right choice in case of position independent code. This is a fix for the bug: https://dmz-portal.mips.com/bugz/show_bug.cgi?id=2445 Differential revision: https://reviews.llvm.org/D27483 llvm-svn: 289448	2016-12-12 17:40:26 +00:00
Nicolai Haehnle	f45ea4bbc5	AMDGPU: llvm.amdgcn.interp.mov is a source of divergence Summary: While the result is constant across a single primitive, each pixel shader wave can have pixels from multiple primitives. Reviewers: tstellarAMD, arsenm Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D27572 llvm-svn: 289447	2016-12-12 16:52:19 +00:00
Sanjay Patel	052220c5c8	remove stale FIXME note from test; NFC llvm-svn: 289445	2016-12-12 16:20:21 +00:00
Simon Pilgrim	a64d4dc22f	[X86] Regenerate vector bitcast/widening tests. llvm-svn: 289443	2016-12-12 16:15:45 +00:00
Sanjay Patel	e730ce87a5	[InstCombine] fix bug when offsetting case values of a switch (PR31260) We could truncate the condition and then try to fold the add into the original condition value causing wrong case constants to be used. Move the offset transform ahead of the truncate transform and return after each transform, so there's no chance of getting confused values. Fix for: https://llvm.org/bugs/show_bug.cgi?id=31260 llvm-svn: 289442	2016-12-12 16:13:52 +00:00
Teresa Johnson	040cc16835	[ThinLTO] Import only necessary DICompileUnit fields Summary: As discussed on mailing list, for ThinLTO importing we don't need to import all the fields of the DICompileUnit. Don't import enums, macros, retained types lists. Also only import local scoped imported entities. Since we don't currently import any global variables, we also don't need to import the list of global variables (added an assert to verify none are being imported). This is being done by pre-populating the value map entries to map the unneeded metadata to nullptr. For the imported entities, we can simply replace the source module's list with a new list containing only those needed imported entities. This is done in the IRLinker constructor so that value mapping automatically does the desired mapping. Reviewers: mehdi_amini, dexonsmith, dblaikie, aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27635 llvm-svn: 289441	2016-12-12 16:09:30 +00:00
Sanjay Patel	87e2f677d7	[InstCombine] clean up range-for-loops in visitSwitchInst(); NFCI llvm-svn: 289439	2016-12-12 15:52:56 +00:00
Simon Pilgrim	d4ff86b973	[X86] Regenerate test. llvm-svn: 289438	2016-12-12 15:47:53 +00:00
Sanjay Patel	2b060c7700	[InstCombine] add test to show PR31260 miscompile; NFC llvm-svn: 289437	2016-12-12 15:28:44 +00:00
Sanjoy Das	b1227db1f4	[SCEVExpander] Add a test case related to r289412 llvm-svn: 289435	2016-12-12 14:57:11 +00:00
Simon Pilgrim	4cbe1834e4	Update inline argument comment. NFCI. combineX86ShufflesRecursively 'HasPSHUFB' flag has been the more generic 'HasVariableMask' flag for some time. llvm-svn: 289430	2016-12-12 13:43:15 +00:00
Simon Pilgrim	5ebd2b542b	[X86][SSE] Add support for combining SSE VSHLI/VSRLI uniform constant shifts. Fixes some missed constant folding opportunities and allows us to combine shuffles that end with a logical bit shift. llvm-svn: 289429	2016-12-12 13:33:58 +00:00
Simon Pilgrim	369cd349b9	[X86][SSE] Lower suitably sign-extended mul vXi64 using PMULDQ PMULDQ returns the 64-bit result of the signed multiplication of the lower 32-bits of vXi64 vector inputs, we can lower with this if the sign bits stretch that far. Differential Revision: https://reviews.llvm.org/D27657 llvm-svn: 289426	2016-12-12 10:49:15 +00:00
Simon Pilgrim	040a36c176	[SelectionDAG] Add support for EXTRACT_SUBVECTOR to ComputeNumSignBits Pre-commit as discussed on D27657 llvm-svn: 289425	2016-12-12 10:29:43 +00:00
Craig Topper	36ecce9bed	[X86] Teach selectScalarSSELoad to accept full 128-bit vector loads and the X86ISD::VZEXT_LOAD opcode. Disable peephole on some of the tests that no longer require it to properly fold scalar intrinsics. llvm-svn: 289424	2016-12-12 07:57:24 +00:00
Craig Topper	f2c6f7abf3	[X86] Change CMPSS/CMPSD intrinsic instructions to use sse_load_f32/f64 as its memory pattern instead of full vector load. These intrinsics only load a single element. We should use sse_loadf32/f64 to give more options of what loads it can match. Currently these instructions are often only getting their load folded thanks to the load folding in the peephole pass. I plan to add more types of loads to sse_load_f32/64 so we can match without the peephole. llvm-svn: 289423	2016-12-12 07:57:21 +00:00
Craig Topper	081c0e2864	[X86] Remove some intrinsic instructions from hasPartialRegUpdate Summary: These intrinsic instructions are all selected from intrinsics that have well defined behavior for where the upper bits come from. It's not the same place as the lower bits. As you can see we were suppressing load folding for these instructions in some cases. In none of the cases was the separate load helping avoid a partial dependency on the destination register. So we should just go ahead and allow the load to be folded. Only foldMemoryOperand was suppressing folding for these. They all have patterns for folding sse_load_f32/f64 that aren't gated with OptForSize, but sse_load_f32/f64 doesn't allow 128-bit vector loads. It only allows scalar_to_vector and vzmovl of scalar loads to match. There's no reason we can't allow a 128-bit vector load to be narrowed so I would like to fix sse_load_f32/f64 to allow that. And if I do that it changes some of these same test cases to fold the load too. Reviewers: spatel, zvi, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27611 llvm-svn: 289419	2016-12-12 05:07:17 +00:00
Sebastian Pop	8c9cc8c86b	[SCEVExpand] do not hoist divisions by zero (PR30935) SCEVExpand computes the insertion point for the components of a SCEV to be code generated. When it comes to generating code for a division, SCEVexpand would not be able to check (at compilation time) all the conditions necessary to avoid a division by zero. The patch disables hoisting of expressions containing divisions by anything other than non-zero constants in order to avoid hoisting these expressions past conditions that should hold before doing the division. The patch passes check-all on x86_64-linux. Differential Revision: https://reviews.llvm.org/D27216 llvm-svn: 289412	2016-12-12 02:52:51 +00:00
Craig Topper	7fc6d34ed1	[InstCombine][XOP] The instructions for the scalar frcz intrinsics are defined to put 0 in the upper bits, not pass bits through like other intrinsics. So we should return a zero vector instead. llvm-svn: 289411	2016-12-11 22:32:38 +00:00
Simon Pilgrim	831435cb14	[X86][SSE] Add support for combining target shuffles to SHUFPD. llvm-svn: 289407	2016-12-11 21:26:25 +00:00
Davide Italiano	0a1476c756	[SCCP] Use the appropriate helper function. NFCI. llvm-svn: 289406	2016-12-11 21:19:03 +00:00
Ayman Musa	7ec4ed55d3	[X86][AVX512] Add missing patterns for broadcast fallback in case load node has multiple uses (for v4i64 and v4f64). When the load node which the broadcast instruction broadcasts has multiple uses, it cannot be folded. A fallback pattern is added to catch these cases and provide another solution. Differential Revision: https://reviews.llvm.org/D27661 llvm-svn: 289404	2016-12-11 20:11:17 +00:00
Sanjoy Das	6de678815c	[TBAA] Don't generate invalid TBAA when merging nodes Summary: Fix a corner case in `MDNode::getMostGenericTBAA` where we can sometimes generate invalid TBAA metadata. Reviewers: chandlerc, hfinkel, mehdi_amini, manmanren Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D26635 llvm-svn: 289403	2016-12-11 20:07:25 +00:00
Sanjoy Das	3336f681e3	[Verifier] Add verification for TBAA metadata Summary: This change adds some verification in the IR verifier around struct path TBAA metadata. Other than some basic sanity checks (e.g. we get constant integers where we expect constant integers), this checks: - That by the time an struct access tuple `(base-type, offset)` is "reduced" to a scalar base type, the offset is `0`. For instance, in C++ you can't start from, say `("struct-a", 16)`, and end up with `("int", 4)` -- by the time the base type is `"int"`, the offset better be zero. In particular, a variant of this invariant is needed for `llvm::getMostGenericTBAA` to be correct. - That there are no cycles in a struct path. - That struct type nodes have their offsets listed in an ascending order. - That when generating the struct access path, you eventually reach the access type listed in the tbaa tag node. Reviewers: dexonsmith, chandlerc, reames, mehdi_amini, manmanren Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D26438 llvm-svn: 289402	2016-12-11 20:07:15 +00:00
Sanjay Patel	81ed3499cd	[Constants] don't die processing non-ConstantInt GEP indices in isGEPWithNoNotionalOverIndexing() (PR31262) This should fix: https://llvm.org/bugs/show_bug.cgi?id=31262 llvm-svn: 289401	2016-12-11 20:07:02 +00:00
Simon Pilgrim	7c98a79f7b	[X86][AVX512] Add target shuffle test showing missing PSHUFPD combine. llvm-svn: 289400	2016-12-11 19:41:23 +00:00
Sebastian Pop	e08d9c7c87	instr-combiner: sum up all latencies of the transformed instructions We have found that -- when the selected subarchitecture has a scheduling model and we are not optimizing for size -- the machine-instruction combiner uses a too-simple algorithm to compute the cost of one of the two alternatives [before and after running a combining pass on a section of code], and therefore it throws away the combination results too often. This fix has the potential to help any ISA with the potential to combine instructions and for which at least one subarchitecture has a scheduling model. As of now, this is only known to definitely affect AArch64 subarchitectures with a scheduling model. Regression tested on AMD64/GNU-Linux, new test case tested to fail on an unpatched compiler and pass on a patched compiler. Patch by Abe Skolnik and Sebastian Pop. llvm-svn: 289399	2016-12-11 19:39:32 +00:00
Simon Pilgrim	8766a76f3d	[X86][XOP] Add target shuffle tests showing missing PSHUFPD combine. llvm-svn: 289398	2016-12-11 19:36:25 +00:00
Sanjoy Das	ba1bf87586	[SCEVExpander] Explicitly expand AddRec starts into loop preheader This is NFC today, but won't be once D27216 (or an equivalent patch) is in. This change fixes a design problem in SCEVExpander -- it relied on a hoisting optimization to generate correct code for add recurrences. This meant changing the hoisting optimization to not kick in under certain circumstances (to avoid speculating faulting instructions, say) would break correctness. The fix is to make the correctness requirements explicit, and have it not rely on the hoisting optimization for correctness. llvm-svn: 289397	2016-12-11 19:02:21 +00:00
Oren Ben Simhon	9683ecbff6	[X86] Regcall - Adding support for mask types Regcall calling convention passes mask types arguments in x86 GPR registers. The review includes the changes required in order to support v32i1, v16i1 and v8i1. Differential Revision: https://reviews.llvm.org/D27148 llvm-svn: 289383	2016-12-11 14:10:52 +00:00

142269 Commits