llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	c06574ffc0	AMDGPU: Add pass to replace out arguments It is better to return arguments directly in registers if we are making a call rather than introducing expensive stack usage. In one of sample compile from one of Blender's many kernel variants, this fires on about ~20 different functions. Future improvements may be to recognize simple cases where the pointer is indexing a small array. This also fails when the store to the out argument is in a separate block from the return, which happens in a few of the Blender functions. This should also probably be using MemorySSA which might help with that. I'm not sure this is correct as a FunctionPass, but MemoryDependenceAnalysis seems to not work with a ModulePass. I'm also not sure where it should run.I think it should run before DeadArgumentElimination, so maybe either EP_CGSCCOptimizerLate or EP_ScalarOptimizerLate. llvm-svn: 309416	2017-07-28 18:40:05 +00:00
Hiroshi Yamauchi	1b179bc5ff	[LVI] Constant-propagate a zero extension of the switch condition value through case edges Summary: LazyValueInfo currently computes the constant value of the switch condition through case edges, which allows the constant value to be propagated through the case edges. But we have seen a case where a zero-extended value of the switch condition is used past case edges for which the constant propagation doesn't occur. This patch adds a small logic to handle such a case in getEdgeValueLocal(). This is motivated by the Python 2.7 eval loop in PyEval_EvalFrameEx() where the lack of the constant propagation causes longer live ranges and more spill code than necessary. With this patch, we see that the code size of PyEval_EvalFrameEx() decreases by ~5.4% and a performance test improves by ~4.6%. Reviewers: wmi, dberlin, sanjoy Reviewed By: sanjoy Subscribers: davide, davidxl, llvm-commits Differential Revision: https://reviews.llvm.org/D34822 llvm-svn: 309415	2017-07-28 18:35:25 +00:00
Tim Northover	a7f583e33b	GlobalISel: map 128-bit values to an FPR by default. Eventually we may want to allow a pair of GPRs but absolutely nothing in the entire world is ready for that yet. llvm-svn: 309404	2017-07-28 17:11:01 +00:00
Matt Arsenault	9166ce86e8	AMDGPU: Annotate implicitarg.ptr usage We need to pass something to functions for this to work. It isn't derivable just from the kernarg segment pointer because the implicit arguments are placed after the kernel arguments. Also fixes missing test for the intrinsic. llvm-svn: 309398	2017-07-28 15:52:08 +00:00
Wei Mi	55c05e14af	[GVN] Recommit the patch "Add phi-translate support in scalarpre" Recommit after workaround the bug PR31652. Three bugs fixed in previous recommits: The first one is to use CurrentBlock instead of PREInstr's Parent as param of performScalarPREInsertion because the Parent of a clone instruction may be uninitialized. The second one is stop PRE when CurrentBlock to its predecessor is a backedge and an operand of CurInst is defined inside of CurrentBlock. The same value defined inside of loop in last iteration can not be regarded as available. The third one is an out-of-bound array access in a flipped if guard. Right now scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. Like the following testcase, current scalarpre cannot recognize the last "a * b" is fully redundent because a and b used by the last "a * b" expr are both defined by phis. long a[100], b[100], g1, g2, g3; __attribute__((pure)) long goo(); void foo(long a, long b, long c, long d) { g1 = a * b; if (__builtin_expect(g2 > 3, 0)) { a = c; b = d; g2 = a * b; } g3 = a * b; // fully redundant. } The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. Differential Revision: https://reviews.llvm.org/D32252 llvm-svn: 309397	2017-07-28 15:47:25 +00:00
Strahinja Petrovic	25e9e1b866	[ARM] Add the option to directly access TLS pointer This patch enables choice for accessing thread local storage pointer (like '-mtp' in gcc). Differential Revision: https://reviews.llvm.org/D34408 llvm-svn: 309381	2017-07-28 12:54:57 +00:00
Simon Pilgrim	1ff3da7273	[X86] Add test case for PR33290 llvm-svn: 309375	2017-07-28 09:43:52 +00:00
Simon Pilgrim	88d3bed351	[X86][AVX] Cleanup shuffle combine tests - remove old prefixes. llvm-svn: 309374	2017-07-28 09:41:55 +00:00
Peter Smith	5804364f7a	[ARM] Add test to check pcs of ARM ABI runtime floating point helpers The ARM Runtime ABI document (IHI0043) defines the AEABI floating point helper functions in section 4.1.2 The floating-point helper functions. The functions listed in this section must always use the base AAPCS calling convention. This test generates calls to all the helper functions that llvm supports and checks that the base AAPCS calling convention has been used. We test the equivalent of -mfloat-abi=soft, -mfloat-abi=softfp, -mfloat-abi=hardfp with an FPU that supports single and double precision, and one that only supports double precision. Differential Revision: https://reviews.llvm.org/D35904 llvm-svn: 309371	2017-07-28 09:21:00 +00:00
Max Kazantsev	fa4969539a	[SCEV] Do not visit nodes twice in containsConstantSomewhere This patch reworks the function that searches constants in Add and Mul SCEV expression chains so that now it does not visit a node more than once, and also renames this function for better correspondence between its implementation and semantics. Differential Revision: https://reviews.llvm.org/D35931 llvm-svn: 309367	2017-07-28 06:42:15 +00:00
Saleem Abdulrasool	61d81ec754	test: require x86 backend Ensure that the target is registered before using it. Should fix the hexagon Bots. llvm-svn: 309363	2017-07-28 04:15:35 +00:00
Saleem Abdulrasool	a219b3d8d1	MC: add support for cfi_return_column This adds support for the CFI pseudo-op return_column. This specifies the frame table column which contains the return address. Addresses PR33953! llvm-svn: 309360	2017-07-28 03:39:19 +00:00
Sanjoy Das	843ab57457	Revert "[SCEV] Cache results of computeExitLimit" This reverts commit r309080. The patch needs to clear out the ScalarEvolution::ExitLimits cache in forgetMemoizedResults. I've replied on the commit thread for the patch with more details. llvm-svn: 309357	2017-07-28 03:25:07 +00:00
Davide Italiano	75a001ba78	[JumpThreading] Stop falsely preserving LazyValueInfo. JumpThreading claims to preserve LVI, but it doesn't preserve the analyses which LVI holds a reference to (e.g. the Dominator). In the current pass manager infrastructure, after JT runs, the PM frees these analyses (including DominatorTree) but preserves LVI. CorrelatedValuePropagation runs immediately after and queries a corrupted domtree, causing weird miscompiles. This commit disables the preservation of LVI for the time being. Eventually, we should either move LVI to a proper dependency tracking mechanism (i.e. an analyses shouldn't hold references to other analyses and compute them on demand if needed), or we should teach all the passes preserving LVI to preserve the analyses LVI depends on. The new pass manager has a mechanism to invalidate LVI in case one of the analyses it depends on becomes invalid, so this problem shouldn't exist (at least not in this immediate form), but handling of analyses holding references is still a very delicate subject. Fixes PR33917 (and rustc). llvm-svn: 309355	2017-07-28 03:10:43 +00:00
David Blaikie	89daf77a11	DebugInfo: Consider a CU containing only local imported entities to be 'empty' This can come up in ThinLTO & wastes space & makes degenerate IR. As per the added FIXME, ultimately, local imported entities should hang off the function and that way the imported entity list on the CU can be tested for emptiness like all the other CU lists. (function-attached local imported entities are probably also the best path forward for fixing how imported entities are handled both in cross-module use (currently, while ThinLTO preserves the imported entities, they would not get used at the imported inlined location - only in the abstract origin that appears in the partial CU created by the import (which isn't emitted under Fission due to cross-CU limitations there)) and to reduce the number of points where imported entities are emitted (they're currently emitted into every inlined instance, concrete instance, and abstract origin - they should only go in teh abstract origin if there is one, otherwise in the concrete instance - but this requires lots of delayed handling and wiring up, same as abstract variables & subprograms)) llvm-svn: 309354	2017-07-28 03:06:25 +00:00
Davide Italiano	01cb947abb	[JumpThreading] Add an option to dump LazyValueInfo after the run. Differential Revision: https://reviews.llvm.org/D35973 llvm-svn: 309353	2017-07-28 02:57:43 +00:00
Matthias Braun	c618a466f1	ARMFrameLowering: Only set ExtraCSSpill for actually unused registers. The code assumed that unclobbered/unspilled callee saved registers are unused in the function. This is not true for callee saved registers that are also used to pass parameters such as swiftself. rdar://33401922 llvm-svn: 309350	2017-07-28 01:36:32 +00:00
Dehao Chen	f4240b5b91	Separate the ICP total threshold and remaining threshold. Summary: In the current implementation, isPromotionProfitable only checks if the call count to a direct target is no less than a certain percentage threshold of the remaining call counts that have not been promoted. This causes code size problems when the target count is small but greater than a large portion of remaining counts. E.g. target1 takes 99.9%, while target2 takes 0.1%. Both targets will be promoted and inlined, makes the function size too large, which potentially prevents it from further inlining into its callers. This patch adds another percentage threshold against the total indirect call count. If the target count needs to be no less than both thresholds in order to be promoted speculatively. Reviewers: davidxl, tejohnson Reviewed By: tejohnson Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D35962 llvm-svn: 309345	2017-07-28 01:02:54 +00:00
Reid Kleckner	07a5d4372e	[X86] Fix latent bug in sibcall eligibility logic The X86 tail call eligibility logic was correct when it was written, but the addition of inalloca and argument copy elision broke its assumptions. It was assuming that fixed stack objects were immutable. Currently, we aim to emit a tail call if no arguments have to be re-arranged in memory. This code would trace the outgoing argument values back to check if they are loads from an incoming stack object. If the stack argument is immutable, then we won't need to store it back to the stack when we tail call. Fortunately, stack objects track their mutability, so we can just make the obvious check to fix the bug. This was http://crbug.com/749826 llvm-svn: 309343	2017-07-28 00:58:35 +00:00
Kostya Serebryany	063b652096	[sanitizer-coverage] rename sanitizer-coverage-create-pc-table into sanitizer-coverage-pc-table and add plumbing for a clang flag llvm-svn: 309337	2017-07-28 00:09:29 +00:00
Kostya Serebryany	b75d002f15	[sanitizer-coverage] add a feature sanitizer-coverage-create-pc-table=1 (works with trace-pc-guard and inline-8bit-counters) that adds a static table of instrumented PCs to be used at run-time llvm-svn: 309335	2017-07-27 23:36:49 +00:00
Davide Italiano	1a26f24f35	[ConstantFolder] Don't try to fold gep when the idx is a vector. The code in ConstantFoldGetElementPtr() assumes integers, and therefore it crashes trying to get the integer bidwith of a vector type (in this case <4 x i32>. I just changed the code to prevent the folding in case of vectors and I didn't bother to generalize as this doesn't seem to me something that really happens in practice, but I'm willing to change the patch if you think it's worth it. This is hard to trigger from -instsimplify or -instcombine only as the second instruction is dead, so the test uses loop-unroll. Differential Revision: https://reviews.llvm.org/D35956 llvm-svn: 309330	2017-07-27 22:20:44 +00:00
Ahmed Bougacha	87807c5a86	[AArch64] Fix legality info passed to demanded bits for TBI opt. The (seldom-used) TBI-aware optimization had a typo lying dormant since it was first introduced, in r252573: when asking for demanded bits, it told TLI that it was running after legalize, where the opposite was true. This is an important piece of information, that the demanded bits analysis uses to make assumptions about the node. r301019 added such an assumption, which was broken by the TBI combine. Instead, pass the correct flags to TLO. llvm-svn: 309323	2017-07-27 21:27:25 +00:00
Eric Beckmann	d8bac0fa3f	Add test to reject merging of empty manifest. Reviewers: ruiu, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35954 llvm-svn: 309317	2017-07-27 19:58:12 +00:00
Dinar Temirbulatov	636ac1b6da	Change prefix in vector-shuffle-combining-avx.patch to reduce test size. llvm-svn: 309315	2017-07-27 19:47:35 +00:00
Hiroshi Yamauchi	60855214c2	[InstCombine] Simplify pointer difference subtractions (GEP-GEP) where GEPs have other uses and one non-constant index Summary: Pointer difference simplifications currently happen only if input GEPs don't have other uses or their indexes are all constants, to avoid duplicating indexing arithmetic. This patch enables cases with exactly one non-constant index among input GEPs to happen where there is no duplicated arithmetic or code size increase even if input GEPs have other uses. For example, this patch allows "(&A[42][i]-&A[42][0])" --> "i", which didn't happen previously, if the input GEP(s) have other uses. Reviewers: sanjoy, bkramer Reviewed By: sanjoy Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D35499 llvm-svn: 309304	2017-07-27 18:27:11 +00:00
Simon Pilgrim	ac84850ea6	[SelectionDAG] Improve DAGTypeLegalizer::convertMask assertion (PR33960) Improve DAGTypeLegalizer::convertMask's isSETCCorConvertedSETCC assertion to properly check for any mixture of SETCC or BUILD_VECTOR of constants, or a logical mask op of them. llvm-svn: 309302	2017-07-27 18:15:54 +00:00
Dinar Temirbulatov	aead31a36f	[X86] SET0 to use XMM registers where possible PR26018 PR32862 Differential Revision: https://reviews.llvm.org/D35839 llvm-svn: 309298	2017-07-27 17:47:01 +00:00
Adam Nemet	a67dfe3b04	Relax the matching in these tests Looks like the template arguments are displayed differently depending on the host compiler(?). E.g.: InnerAnalysisManagerProxy<CGSCCAnalysisManager InnerAnalysisManagerProxy<llvm::AnalysisManager<llvm::LazyCallGraph::SCC, ... Fix fallout after r309294 llvm-svn: 309297	2017-07-27 17:45:02 +00:00
Adam Nemet	0d8b5d6f69	[ICP] Migrate to OptimizationRemarkEmitter This is a module pass so for the old PM, we can't use ORE, the function analysis pass. Instead ORE is created on the fly. A few notes: - isPromotionLegal is folded in the caller since we want to emit the Function in the remark but we can only do that if the symbol table look-up succeeded. - There was good test coverage for remarks in this pass. - promoteIndirectCall uses ORE conditionally since it's also used from SampleProfile which does not use ORE yet. Fixes PR33792. Differential Revision: https://reviews.llvm.org/D35929 llvm-svn: 309294	2017-07-27 16:54:15 +00:00
Daniel Neilson	2574d7cbf6	All libcalls should be considered to be GC-leaf functions. Summary: It is possible for some passes to materialize a call to a libcall (ex: ldexp, exp2, etc), but these passes will not mark the call as a gc-leaf-function. All libcalls are actually gc-leaf-functions, so we change llvm::callsGCLeafFunction() to tell us that available libcalls are equivalent to gc-leaf-function calls. Reviewers: sanjoy, anna, reames Reviewed By: anna Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35840 llvm-svn: 309291	2017-07-27 16:49:39 +00:00
David Blaikie	2f0cc477ab	ThinLTO: Don't import aliases of any kind (even linkonce_odr) Summary: Until a more advanced version of importing can be implemented for aliases (one that imports an alias as an available_externally definition of the aliasee), skip the narrow subset of cases that was possible but came at a cost: aliases of linkonce_odr functions could be imported because the linkonce_odr function could be safely duplicated from the source module. This came/comes at the cost of not being able to 'home' imported linkonce functions (they had to be emitted linkonce_odr in all the destination modules (even if they weren't used by an alias) rather than as available_externally - causing extra object size). Tangentially, this also was the only reason ThinLTO would emit multiple CUs in to the resulting DWARF - which happens to be a problem for Fission (there's a fix for this in GDB but not released yet, etc). (actually it's not the only reason - but I'm sending a patch to fix the other reason shortly) There's no reason to believe this particularly narrow alias importing was especially/meaningfully important, only that it was /possible/ to implement in this way. When a more general solution is done, it should still satisfy the DWARF concerns above, since the import will still be available_externally, and thus not create extra CUs. Since now all aliases are treated the same, I removed/simplified some test cases since they were testing corner cases where there are no longer any corners. Reviewers: tejohnson, mehdi_amini Differential Revision: https://reviews.llvm.org/D35875 llvm-svn: 309278	2017-07-27 15:09:06 +00:00
Andrew V. Tischenko	e255526d0b	Added cost of ZEROALL and ZEROUPPER instrs in btver2 cpu. Differential Revision https://reviews.llvm.org/D35834 llvm-svn: 309269	2017-07-27 13:12:08 +00:00
Simon Pilgrim	31f5402711	[X86][AVX] Regenerate shuffle tests with broadcast comments. llvm-svn: 309266	2017-07-27 12:32:45 +00:00
Daniel Sanders	8e82af2be6	Re-commit: r309094 [globalisel][tablegen] Fuse the generated tables together. Summary: Now that we have control flow in place, fuse the per-rule tables into a single table. This is a compile-time saving at this point. However, this will also enable the optimization of a table so that similar instructions can be tested together, reducing the time spent on the matching the code. This is NFC in terms of externally visible behaviour but some internals have changed slightly. State.MIs is no longer reset between each rule that is attempted because it's not necessary to do so. As a consequence of this the restriction on the order that instructions are added to State.MIs has been relaxed to only affect recorded instructions that require new elements to be added to the vector. GIM_RecordInsn can now write to any element from 1 to State.MIs.size() instead of just State.MIs.size(). The compile-time regressions from the last commit were caused by the ARM target including a non-const variable (zero_reg) in the table and therefore generating an initializer for it. That variable is now const. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Reviewed By: rovka Subscribers: kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D35681 llvm-svn: 309264	2017-07-27 11:03:45 +00:00
Simon Pilgrim	804cbd61e6	[X86] Adding test cases for LEA factorization (PR32755 / D35014) Differential Revision: https://reviews.llvm.org/D35886 llvm-svn: 309262	2017-07-27 10:36:09 +00:00
Hiroshi Inoue	967dc58ac1	[PowerPC] enable optimizeCompareInstr for branch with static branch hint In optimizeCompareInstr, a compare instruction is eliminated by using a record form instruction if possible. If the branch instruction that uses the result of the compare has a static branch hint, the optimization does not happen. This patch makes this optimization happen regardless of the branch hint by splitting branch hint and branch condition before checking the predicate to identify the possible optimizations. Differential Revision: https://reviews.llvm.org/D35801 llvm-svn: 309255	2017-07-27 08:14:48 +00:00
Petr Hosek	88ac9a8c07	Revert "Reland "[LLVM][llvm-objcopy] Added basic plumbing to get things started"" This change is failing tests on Windows bots due to permissions. This reverts commit r309249. llvm-svn: 309251	2017-07-27 06:02:05 +00:00
Petr Hosek	27bcf6a680	Reland "[LLVM][llvm-objcopy] Added basic plumbing to get things started" As discussed on llvm-dev I've implemented the first basic steps towards llvm-objcopy/llvm-objtool (name pending). This change adds the ability to copy (without modification) 64-bit little endian ELF executables that have SHT_PROGBITS, SHT_NOBITS, SHT_NULL and SHT_STRTAB sections. Patch by Jake Ehrlich Differential Revision: https://reviews.llvm.org/D33964 llvm-svn: 309249	2017-07-27 04:35:30 +00:00
Eric Beckmann	f7e84c3ff2	Re-enable libxml2 tests. llvm-svn: 309241	2017-07-27 01:11:53 +00:00
David Blaikie	2195e13676	DebugInfo: Ensure imported entities at the top level of an inlined function don't cause degenerate concrete definitions Local imported entities at the top level of a subprogram were being handled differently from those in nested scopes - that different handling would cause pseudo concrete out-of-line definitions to be created (but without any of their attributes, nor an abstract_origin) in the case where there was no real concrete definition. These local imported entities also only appeared in the concrete definition where those imported entities in nested scopes appear in all cases (abstract, concrete, and inlined). This change at least makes top level case handle the same as the others - though there's a FIXME to improve this to /only/ emit them into the abstract origin (though this requires more plumbing - like the abstract subprogram and variable handling that must defer population until the end of the unit to discover if there is an abstract origin, or only a standalone concrete definition). llvm-svn: 309237	2017-07-27 00:06:53 +00:00
Eric Beckmann	08e38d6b3d	See if disabling libxml tests will pass the i686 bot. llvm-svn: 309229	2017-07-26 23:15:44 +00:00
Stanislav Mekhanoshin	3197eb6981	[AMDGPU] Optimize SI_IF lowering for simple if regions Currently SI_IF results in a s_and_saveexec_b64 followed by s_xor_b64. The xor is used to extract only the changed bits. In case of a simple if region where the only use of that value is in the SI_END_CF to restore the old exec mask, we can omit the xor and perform an or of the exec mask with the original exec value saved by the s_and_saveexec_b64. Differential Revision: https://reviews.llvm.org/D35861 llvm-svn: 309185	2017-07-26 21:29:15 +00:00
Wei Ding	a126a13bb3	AMDGPU : Widen extending scalar loads to 32-bits. Differential Revision: http://reviews.llvm.org/D35146 llvm-svn: 309178	2017-07-26 21:07:28 +00:00
Davide Italiano	dae1a15a79	[gold] Relax this tests a little more. Thanks to Peter for the report! llvm-svn: 309176	2017-07-26 21:01:57 +00:00
Davide Italiano	853ce87a4d	[gold] Relax tests to account for difference in layout across versions. llvm-svn: 309174	2017-07-26 20:40:33 +00:00
Matt Arsenault	894e53d6ac	AMDGPU: Fix using SMRD instructions for argument loads in functions These are not actually uniform values except in kernels. llvm-svn: 309172	2017-07-26 20:39:42 +00:00
Tom Stellard	55038cd1d3	AMDGPU/GlobalISel: Mark 32-bit G_OR as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D35127 llvm-svn: 309165	2017-07-26 20:00:53 +00:00
Adam Nemet	ea06e6e865	Migrate SimplifyLibCalls to new OptimizationRemarkEmitter Summary: This changes SimplifyLibCalls to use the new OptimizationRemarkEmitter API. In fact, as SimplifyLibCalls is only ever called via InstCombine, (as far as I can tell) the OptimizationRemarkEmitter is added there, and then passed through to SimplifyLibCalls later. I have avoided changing any remark text. This closes PR33787 Patch by Sam Elliott! Reviewers: anemet, davide Reviewed By: anemet Subscribers: davide, mehdi_amini, eraman, fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D35608 llvm-svn: 309158	2017-07-26 19:03:18 +00:00
Andrew V. Tischenko	d1fefa3d7c	This patch returns proper value to indicate the case when instruction throughput can't be calculated. Differential revision https://reviews.llvm.org/D35831 llvm-svn: 309156	2017-07-26 18:55:14 +00:00
Adrian Prantl	833ad37c90	Do a better job at emitting prefrabricated skeleton CUs. This is a better fix than r308708 for the problem introduced in r304020. It restores the skeleton CU testcases modified by that commit to their original form and most importantly ensures that frontend-generated skeleton CUs (such as used to point to Clang modules) come after the regular CUs. This broke for DICompileUnit nodes that don't have any immediate children because they are now constructed lazily instead of the order in which they are listed in !llvm.dbg.cu. After this commit we still don't guarantee that order, but we do guarantee that empty skeletons come last. Shipping versions of LLDB are very sensitive to the ordering of CUs. I'll track a fix for LLDB to be more permissive separately. This fixes a test failure in the LLDB testsuite. rdar://problem/33357252 llvm-svn: 309154	2017-07-26 18:48:32 +00:00
Eric Beckmann	6638ba2d75	Diffing against a file that is itself used in the test seems to be a bad idea, because it might get locked down and rendered unopenable. llvm-svn: 309142	2017-07-26 17:47:44 +00:00
Simon Pilgrim	66a2eb8c77	[X86][AVX512] Regenerated and cleaned up extension tests. llvm-svn: 309139	2017-07-26 16:47:00 +00:00
Simon Pilgrim	b77cb95744	[X86] Regenerate setcc tests llvm-svn: 309138	2017-07-26 16:45:57 +00:00
Simon Pilgrim	164160b4f6	[X86][AVX512] Regenerate shuffle tests with broadcast comments. llvm-svn: 309137	2017-07-26 16:41:18 +00:00
Simon Pilgrim	0a7d9ac766	[X86] Regenerate memset tests llvm-svn: 309136	2017-07-26 16:39:07 +00:00
Eric Beckmann	3f4fe8f4bd	Correctly enable the llvm-mt tests, now that build flags changed. llvm-svn: 309134	2017-07-26 16:35:44 +00:00
Reid Kleckner	43c2b131d9	Quote '?' in llvm-rc test Summary: Bash interperets the '?' character as matching an arbitrary character. On systems that have a file or directory with exactly one character in their root directory, '/?' gets reinterpreted into that pathname, which fails to match the expected Help text for llvm-rc. This patch quotes the '/?' to avoid that edge case. Reviewers: mnbvmar, ecbeckmann, rnk Reviewed By: rnk Subscribers: dyung, ruiu, llvm-commits Differential Revision: https://reviews.llvm.org/D35852 llvm-svn: 309133	2017-07-26 16:25:48 +00:00
Simon Pilgrim	01ab86e62b	[X86] Add combineBT test failure because bits have multiple uses. llvm-svn: 309124	2017-07-26 15:41:57 +00:00
Dehao Chen	e90d0153ca	Make new PM honor -fdebug-info-for-profiling Summary: The new PM needs to invoke add-discriminator pass when building with -fdebug-info-for-profiling. Reviewers: chandlerc, davidxl Reviewed By: chandlerc Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D35744 llvm-svn: 309121	2017-07-26 15:01:20 +00:00
Daniel Sanders	d3077a94a8	Revert r309094: [globalisel][tablegen] Fuse the generated tables together. The ARM bots have started failing and while this patch should be an improvement for these bots, it's also the only suspect in the blamelist. Reverting while Diana and I investigate the problem. llvm-svn: 309111	2017-07-26 13:28:40 +00:00
Zvi Rackover	092f199188	DAGCombiner: Extend reduceBuildVecToTrunc to handle non-zero offset Summary: Adding support for combining power2-strided build_vector's where the first build_vectori's operand is extracted from a non-zero index. Example: v4i32 build_vector((extract_elt V, 1), (extract_elt V, 3), (extract_elt V, 5), (extract_elt V, 7)) --> v4i32 truncate (bitcast (shuffle<1,u,3,u,5,u,7,u> V, u) to v4i64) Reviewers: delena, RKSimon, guyblank Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35700 llvm-svn: 309108	2017-07-26 12:57:03 +00:00
Simon Pilgrim	a9551fb10f	[X86] Regenerated BT tests Test on 32/64 bit targets where appropriate llvm-svn: 309107	2017-07-26 12:49:20 +00:00
Martin Storsjo	0b7bf7a2e3	[COFF, ARM64] Fix symbol offsets in ADRP/ADD/LDR/STR relocations In COFF, a symbol offset can't be stored in the relocation (as is done in ELF or MachO), but is stored as the immediate in the instruction itself. The immediate in the ADRP thus is the symbol offset in bytes, not in pages. For the PAGEOFFSET_12A/L relocations, ignore any offset outside of the lowest 12 bits; they won't have any effect on the ADD/LDR/STR instruction itself but only on the associated ADRP. This is similar to how the same issue is handled for MOVW/MOVT instructions in ELF (see e.g. SVN r307713, and r307728 in lld). This fixes "fixup out of range" errors while building larger object files, where temporary symbols end up as a plain section symbol and an offset, and fixes any cases where the symbol offset mean that the actual target ended up on a different page than the symbol itself. Differential Revision: https://reviews.llvm.org/D35791 llvm-svn: 309105	2017-07-26 11:19:17 +00:00
Simon Pilgrim	dd06da0804	[X86] Add urem vector test for non-uniform pow2 constants llvm-svn: 309104	2017-07-26 11:07:45 +00:00
Simon Pilgrim	c5c72306f3	[X86] Regenerated urem pow2 tests on 32/64 bit targets llvm-svn: 309103	2017-07-26 11:05:16 +00:00
Simon Pilgrim	976a5d2662	[X86] Regenerated umul overflow tests on 32/64 bit targets llvm-svn: 309102	2017-07-26 11:04:18 +00:00
Diana Picus	a5d6518e93	[ARM] GlobalISel: Map G_GLOBAL_VALUE to GPR A G_GLOBAL_VALUE is basically a pointer, so it should live in the GPR. llvm-svn: 309101	2017-07-26 11:01:13 +00:00
Simon Pilgrim	106307aa13	[X86][AVX] Regenerated and cleaned up AVX1 intrinsic tests. Cleaned up triple settings, added 32-bit/64-bit targets where useful, added broadcast comments llvm-svn: 309100	2017-07-26 10:54:51 +00:00
Simon Pilgrim	c402839c72	[X86][AVX2] Regenerated and cleaned up broadcast tests. llvm-svn: 309099	2017-07-26 10:47:51 +00:00
Simon Pilgrim	b695f74bba	[X86][AVX512] Regenerated and added 32-bit targets to select tests llvm-svn: 309098	2017-07-26 10:39:55 +00:00
Simon Pilgrim	82097a8d8c	[X86][AVX] Regenerated and cleaned up masked gather/scatter tests. Remove unused KNL checks and triple settings, added broadcast comments llvm-svn: 309097	2017-07-26 10:37:12 +00:00
Simon Pilgrim	dbf1fa8958	[X86][AVX] Regenerate lzcnt test. Tidied up triples and checks. llvm-svn: 309095	2017-07-26 10:22:56 +00:00
Daniel Sanders	d83817ad6e	[globalisel][tablegen] Fuse the generated tables together. Summary: Now that we have control flow in place, fuse the per-rule tables into a single table. This is a compile-time saving at this point. However, this will also enable the optimization of a table so that similar instructions can be tested together, reducing the time spent on the matching the code. This is NFC in terms of externally visible behaviour but some internals have changed slightly. State.MIs is no longer reset between each rule that is attempted because it's not necessary to do so. As a consequence of this the restriction on the order that instructions are added to State.MIs has been relaxed to only affect recorded instructions that require new elements to be added to the vector. GIM_RecordInsn can now write to any element from 1 to State.MIs.size() instead of just State.MIs.size(). Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Reviewed By: rovka Subscribers: kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D35681 llvm-svn: 309094	2017-07-26 10:20:56 +00:00
Simon Pilgrim	ddf407dec9	[X86][FMA] Regenerate test with broadcast comments. llvm-svn: 309093	2017-07-26 10:20:49 +00:00
Diana Picus	b1fd784936	[ARM] GlobalISel: Mark G_GLOBAL_VALUE as legal llvm-svn: 309090	2017-07-26 09:25:15 +00:00
Michael Zuckerman	c1918ad571	[X86][LLVM]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess. This patch expands the support of lowerInterleavedStore to 32x8i stride 4. LLVM creates suboptimal shuffle code-gen for AVX2. In overall, this patch is a specific fix for the pattern (Strid=4 VF=32) and we plan to include more patterns in the future. To reach our goal of "more patterns". We include two mask creators. The first function creates shuffle's mask equivalent to unpacklo/unpackhi instructions. The other creator creates mask equivalent to a concat of two half vectors(high/low). The patch goal is to optimize the following sequence: At the end of the computation, we have ymm2, ymm0, ymm12 and ymm3 holding each 32 chars: c0, c1, , c31 m0, m1, , m31 y0, y1, , y31 k0, k1, ., k31 And these need to be transposed/interleaved and stored like so: c0 m0 y0 k0 c1 m1 y1 k1 c2 m2 y2 k2 c3 m3 y3 k3 .... Reviewers: dorit Farhana RKSimon guyblank DavidKreitzer Differential Revision: https://reviews.llvm.org/D34601 llvm-svn: 309086	2017-07-26 08:10:14 +00:00
Max Kazantsev	f282aed428	[SCEV] Cache results of computeExitLimit This patch adds a cache for computeExitLimit to save compilation time. A lot of examples of tests that take extensive time to compile are attached to the bug 33494. Differential Revision: https://reviews.llvm.org/D35827 llvm-svn: 309080	2017-07-26 04:55:54 +00:00
Craig Topper	050c9c8f83	[X86] Prevent selecting masked aligned load instructions if the load should be non-temporal Summary: The aligned load predicates don't suppress themselves if the load is non-temporal the way the unaligned predicates do. For the most part this isn't a problem because the aligned predicates are mostly used for instructions that only load the the non-temporal loads have priority over those. The exception are masked loads. Reviewers: RKSimon, zvi Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35712 llvm-svn: 309079	2017-07-26 04:31:04 +00:00
Dehao Chen	7b05a2712a	Add test coverage for new PM PGOOpt handling. Summary: This patch adds flags and tests to cover the PGOOpt handling logic in new PM. Reviewers: chandlerc, davide Reviewed By: chandlerc Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D35807 llvm-svn: 309076	2017-07-26 02:00:43 +00:00
Davide Italiano	557a0b3a9e	[gold] Enable data-sections by default for the gold-plugin. Follow up to r309056. llvm-svn: 309075	2017-07-26 01:47:17 +00:00
Wei Mi	c54b6c881b	Add "REQUIRES: asserts" for test unswitch-equality-undef.ll. llvm-svn: 309073	2017-07-26 01:34:46 +00:00
Spyridoula Gravani	dc635f40bb	[DWARF] Generalized verification of .apple_names accelerator table to be applicable to any acceleration table. Added verification for .apple_types, .apple_namespaces and .apple_objc sections. Differential Revision: https://reviews.llvm.org/D35853 llvm-svn: 309068	2017-07-26 00:52:31 +00:00
Reid Kleckner	14d90fd05c	[PDB] Improve GSI hash table dumping for publics and globals The PDB "symbol stream" actually contains symbol records for the publics and the globals stream. The globals and publics streams are essentially hash tables that point into a single stream of records. In order to match cvdump's behavior, we need to only dump symbol records referenced from the hash table. This patch implements that, and then implements global stream dumping, since it's just a subset of public stream dumping. Now we shouldn't see S_PROCREF or S_GDATA32 records when dumping publics, and instead we should see those record in the globals stream. llvm-svn: 309066	2017-07-26 00:40:36 +00:00
Eric Beckmann	b4dbe7231e	Reapply "llvm-mt: implement simple merging of manifests, not factoring namespaces. This time with correct #if. This reverts commit 9cf4eca0e0383040c1ff1416815c7f649650c2a0. llvm-svn: 309064	2017-07-26 00:25:12 +00:00
Wei Mi	fc0e245464	Disable loop unswitching for some patterns containing equality comparison with undef. This is a workaround for the bug described in PR31652 and http://lists.llvm.org/pipermail/llvm-dev/2017-July/115497.html. The temporary solution is to add a function EqualityPropUnSafe. In EqualityPropUnSafe, for some simple patterns we can know the equality comparison may contains undef, so we regard such comparison as unsafe and will not do loop-unswitching for them. We also need to disable the select simplification when one of select operand is undef and its result feeds into equality comparison. The patch cannot clear the safety issue caused by the bug, but it can suppress the issue from happening to some extent. Differential Revision: https://reviews.llvm.org/D35811 llvm-svn: 309059	2017-07-25 23:37:17 +00:00
Adrian Prantl	be66271f04	Debug Info: Support fragmented variables in the MMI side table This reapplies commit r309034 with a bugfix+test for inlined variables. llvm-svn: 309057	2017-07-25 23:32:59 +00:00
Davide Italiano	756feb2a51	[gold] Enable function-sections by default. This is needed, among others, to respect --section-ordering-file with LTO. I'll follow up with a similar change for data sections. I hope every version of gold available on the bots has support for --section-ordering file. llvm-svn: 309056	2017-07-25 23:32:50 +00:00
Eric Beckmann	455210e18f	Revert "llvm-mt: implement simple merging of manifests, not factoring namespaces." This reverts commit 813308e240792ca70ed2f998f21df24a5061ada0. llvm-svn: 309050	2017-07-25 23:06:46 +00:00
Eric Beckmann	780fd409fb	llvm-mt: implement simple merging of manifests, not factoring namespaces. Summary: Does a simple merge, where mergeable elements are combined, all others are appended. Does not apply trickly namespace rules. Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D35753 llvm-svn: 309047	2017-07-25 22:50:25 +00:00
Petr Hosek	76fb627baf	Revert "Reland "[LLVM][llvm-objcopy] Added basic plumbing to get things started"" This reverts commit 0d9d9250483761eb2f50236830161b0e2137d483. llvm-svn: 309045	2017-07-25 22:39:52 +00:00
Petr Hosek	5e87de3e4e	Reland "[LLVM][llvm-objcopy] Added basic plumbing to get things started" As discussed on llvm-dev I've implemented the first basic steps towards llvm-objcopy/llvm-objtool (name pending). This change adds the ability to copy (without modification) 64-bit little endian ELF executables that have SHT_PROGBITS, SHT_NOBITS, SHT_NULL and SHT_STRTAB sections. Patch by Jake Ehrlich Differential Revision: https://reviews.llvm.org/D33964 llvm-svn: 309043	2017-07-25 22:38:08 +00:00
Petr Hosek	a1ddfbb119	Revert "Reland "[LLVM][llvm-objcopy] Added basic plumbing to get things started"" This reverts commit 960873b10dd071298c817ba74ef2228f94ead7a1. llvm-svn: 309037	2017-07-25 21:55:00 +00:00
Adrian Prantl	b6d5faf2ea	Revert "Debug Info: Support fragmented variables in the MMI side table" This reverts commit r309034 because of a sanitizer issue. llvm-svn: 309035	2017-07-25 21:50:45 +00:00
Adrian Prantl	3d1ab0cd1e	Debug Info: Support fragmented variables in the MMI side table <rdar://problem/17816343> llvm-svn: 309034	2017-07-25 21:29:22 +00:00
Petr Hosek	4e0a4b3674	Reland "[LLVM][llvm-objcopy] Added basic plumbing to get things started" As discussed on llvm-dev I've implemented the first basic steps towards llvm-objcopy/llvm-objtool (name pending). This change adds the ability to copy (without modification) 64-bit little endian ELF executables that have SHT_PROGBITS, SHT_NOBITS, SHT_NULL and SHT_STRTAB sections. Patch by Jake Ehrlich Differential Revision: https://reviews.llvm.org/D33964 llvm-svn: 309032	2017-07-25 21:16:33 +00:00
Martin Storsjo	1913360a16	[AArch64] Update a comment in a test The comment ended up outdated when the test was rewritten in SVN r192281. Differential Revision: https://reviews.llvm.org/D35543 llvm-svn: 309016	2017-07-25 19:57:26 +00:00
Martin Storsjo	84cda2d779	[AArch64] Add a test for float argument passing to win64 vararg functions The existing tests only tested how a va_start is lowered. Differential Revision: https://reviews.llvm.org/D35540 llvm-svn: 309015	2017-07-25 19:57:22 +00:00
Teresa Johnson	a83c3f7879	[LTO] Prevent dead stripping and internalization of symbols with sections Summary: ELF linkers generate __start_<secname> and __stop_<secname> symbols when there is a value in a section <secname> where the name is a valid C identifier. If dead stripping determines that the values declared in section <secname> are dead, and we then internalize (and delete) such a symbol, programs that reference the corresponding start and end section symbols will get undefined reference linking errors. To fix this, add the section name to the IRSymtab entry when a symbol is defined in a specific section. Then use this in the gold-plugin to mark the symbol as external and visible from outside the summary when the section name is a valid C identifier. Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D35639 llvm-svn: 309009	2017-07-25 19:42:32 +00:00
Nemanja Ivanovic	009016bb70	[PowerPC] Pretty-print CR bits the way the binutils disassembler does This patch just adds printing of CR bit registers in a more human-readable form akin to that used by the GNU binutils. Differential Revision: https://reviews.llvm.org/D31494 llvm-svn: 309001	2017-07-25 18:26:35 +00:00
Nemanja Ivanovic	864c953773	[PowerPC] - Recommit r304907 now that the issue has been fixed This is just a recommit since the issue that the commit exposed is now resolved. llvm-svn: 308995	2017-07-25 17:54:51 +00:00
Simon Pilgrim	18b97f78fe	[X86][CGP] Reduce memcmp() expansion to 2 load pairs (PR33914) D35067/rL308322 attempted to support up to 4 load pairs for memcmp inlining which resulted in regressions for some optimized libc memcmp implementations (PR33914). Until we can match these more optimal cases, this patch reduces the memcmp expansion to a maximum of 2 load pairs (which matches what we do for -Os). This patch should be considered for the 5.0.0 release branch as well Differential Revision: https://reviews.llvm.org/D35830 llvm-svn: 308986	2017-07-25 17:04:37 +00:00
Simon Pilgrim	0d3054fb44	[X86] Regenerate test. llvm-svn: 308981	2017-07-25 16:10:32 +00:00
Simon Pilgrim	3edf2901d2	[X86] Regenerate test with broadcast comments. llvm-svn: 308980	2017-07-25 16:09:56 +00:00
Fedor Sergeev	7856a3205f	[Sparc] invalid adjustments in TLS_LE/TLS_LDO relocations removed Summary: Some SPARC TLS relocations were applying nontrivial adjustments to zero value, leading to unexpected non-zero values in ELF and then Solaris linker failures. Getting rid of these adjustments. Fixes PR33825. Reviewers: rafael, asb, jyknight Subscribers: joerg, jyknight, llvm-commits Differential Revision: https://reviews.llvm.org/D35567 llvm-svn: 308978	2017-07-25 15:28:28 +00:00
Andrew V. Tischenko	32e9b1ad0b	X86 Asm uses assertions instead of proper diagnostic. This patch fixes that. Differential Revision: https://reviews.llvm.org/D35115 llvm-svn: 308972	2017-07-25 13:05:12 +00:00
Chandler Carruth	1dc34c6d80	[LIR] Teach LIR to avoid extending the BE count prior to adding one to it when safe. Very often the BE count is the trip count minus one, and the plus one here should fold with that minus one. But because the BE count might in theory be UINT_MAX or some such, adding one before we extend could in some cases wrap to zero and break when we scale things. This patch checks to see if it would be safe to add one because the specific case that would cause this is guarded for prior to entering the preheader. This should handle essentially all of the common loop idioms coming out of C/C++ code once canonicalized by LLVM. Before this patch, both forms of loop in the added test cases ended up subtracting one from the size, extending it, scaling it up by 8 and then adding 8 back onto it. This is really silly, and it turns out made it all the way into generated code very often, so this is a surprisingly important cleanup to do. Many thanks to Sanjoy for showing me how to do this with SCEV. Differential Revision: https://reviews.llvm.org/D35758 llvm-svn: 308968	2017-07-25 10:48:32 +00:00
Michael Zolotukhin	cd2255ea6a	[tests] Cleanup vect.omp.persistence.ll test. llvm-svn: 308964	2017-07-25 10:35:16 +00:00
Simon Pilgrim	3459f108f8	[X86] Add 24-byte memcmp tests (PR33914) llvm-svn: 308963	2017-07-25 10:33:36 +00:00
Francois Pichet	82bf3de606	Fix endianness bug in DAGCombiner::visitTRUNCATE and visitEXTRACT_VECTOR_ELT Summary: Do not assume little endian architecture in DAGCombiner::visitTRUNCATE and DAGCombiner::visitEXTRACT_VECTOR_ELT. PR33682 Reviewers: hfinkel, sdardis, RKSimon Reviewed By: sdardis, RKSimon Subscribers: uabelho, RKSimon, sdardis, llvm-commits Differential Revision: https://reviews.llvm.org/D34990 llvm-svn: 308960	2017-07-25 09:40:35 +00:00
Sam Parker	19a08e42a8	[ARM] Enable partial and runtime unrolling Enable runtime and partial loop unrolling of simple loops without calls on M-class cores. The thresholds are calculated based on whether the target is Thumb or Thumb-2. Differential Revision: https://reviews.llvm.org/D34619 llvm-svn: 308956	2017-07-25 08:51:30 +00:00
Martin Storsjo	8cb3667541	[AArch64] Reserve a 16 byte aligned amount of fixed stack for win64 varargs Create a dummy 8 byte fixed object for the unused slot below the first stored vararg. Alternative ideas tested but skipped: One could try to align the whole fixed object to 16, but I haven't found how to add an offset to the stack frame used in LowerWin64_VASTART. If only the size of the fixed stack object size is padded but not the offset, via MFI.CreateFixedObject(alignTo(GPRSaveSize, 16), -(int)GPRSaveSize, false), PrologEpilogInserter crashes due to "Attempted to reset backwards range!". This fixes misconceptions about where registers are spilled, since AArch64FrameLowering.cpp assumes the offset from fixed objects is aligned to 16 bytes (and the Win64 case there already manually aligns the offset to 16 bytes). This fixes cases where local stack allocations could overwrite callee saved registers on the stack. Differential Revision: https://reviews.llvm.org/D35720 llvm-svn: 308950	2017-07-25 05:20:01 +00:00
Spyridoula Gravani	2fdfab2a0c	[DWARF] Modified test for die ranges verification so that it doesn't fail on windows hosts. llvm-svn: 308943	2017-07-25 01:58:27 +00:00
NAKAMURA Takumi	3f36aae357	llvm/test/CMakeLists.txt: Add llvm-rc to LLVM_TEST_DEPENDS. llvm-svn: 308942	2017-07-25 01:44:38 +00:00
Marek Sokolowski	2ce2fa481d	Add an empty shell of llvm-rc. This starts the development on one of MS Visual Studio binutils, Resource Converter. The tool compiles resource scripts (.rc) into binary resource files (.res). The current implementation does nothing but parse the command line arguments. It is going to be extended in the future. Differential Revision: https://reviews.llvm.org/D35810 llvm-svn: 308940	2017-07-25 00:25:18 +00:00
Spyridoula Gravani	e0ba415740	[DWARF] Added verification check for die ranges. If highPC is an address, then it should be greater than lowPC for each range. Differential Revision: https://reviews.llvm.org/D35733 llvm-svn: 308928	2017-07-24 21:04:11 +00:00
James Y Knight	84dbd080b7	Followup to r308890: don't assert the llvm llvm version number. llvm-svn: 308917	2017-07-24 19:44:43 +00:00
Krzysztof Parzyszek	1fd0c7e598	[Hexagon] Recognize C4_cmpneqi, C4_cmpltei and C4_cmplteui in NewValueJump llvm-svn: 308914	2017-07-24 19:35:48 +00:00
Michael Zuckerman	196b3cadf6	Adding base test for interleave store VF16 and expand the test for AVX512 This patch doesn't modifay any non test file. llvm-svn: 308909	2017-07-24 18:29:56 +00:00
Reid Kleckner	898ddf61c0	[codeview] Emit 'D' as the cv source language for D code This matches DMD: `522263965c/src/ddmd/backend/cv8.c (L199)` Fixes PR33899. llvm-svn: 308890	2017-07-24 16:16:42 +00:00
Ayman Musa	b16ce777e3	[X86][AVX512] Add patterns for masked AVX512 floating point compare instructions that were missing. patterns were missed by D33188. Adding for completion. +Updating test. Differential Revesion: https://reviews.llvm.org/D35179 llvm-svn: 308868	2017-07-24 08:10:32 +00:00
Dylan McKay	6c5c6aa9d8	[AVR] Remove the instrumentation pass I have a much better way of running integration tests now. https://github.com/dylanmckay/avr-test-suite llvm-svn: 308857	2017-07-23 23:39:11 +00:00
Dylan McKay	94c636b7aa	[AVR] Improve the 'icall-func-pointer-correct-addr-space.ll' test Patch by Carl Peto. llvm-svn: 308856	2017-07-23 23:00:55 +00:00
Petr Hosek	710479cede	[CodeGen][X86] Fuchsia supports sincos* libcalls and sin+cos->sincos optimization Patch by Roland McGrath Differential Revision: https://reviews.llvm.org/D35748 llvm-svn: 308854	2017-07-23 22:30:00 +00:00
Florian Hahn	57ffb2c9d8	[AArch64] Add test for function alignment for a optsize function (NFC). Reviewers: dblaikie, t.p.northover, rengolin Reviewed By: rengolin Subscribers: aemerson, rengolin, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D35620 llvm-svn: 308852	2017-07-23 21:15:10 +00:00
Chad Rosier	9b2b4c961a	[AArch64] Redundant Copy Elimination - remove more zero copies. This patch removes unnecessary zero copies in BBs that are targets of b.eq/b.ne and we know the result of the compare instruction is zero. For example, BB#0: subs w0, w1, w2 str w0, [x1] b.ne .LBB0_2 BB#1: mov w0, wzr ; <-- redundant str w0, [x2] .LBB0_2 Differential Revision: https://reviews.llvm.org/D35075 llvm-svn: 308849	2017-07-23 16:38:08 +00:00
Max Kazantsev	0e9e0796f4	[SCEV] Limit max size of AddRecExpr during evolving When SCEV calculates product of two SCEVAddRecs from the same loop, it tries to combine them into one big AddRecExpr. If the sizes of the initial SCEVs were `S1` and `S2`, the size of their product is `S1 + S2 - 1`, and every operand of the resulting SCEV is combined from operands of initial SCEV and has much higher complexity than they have. As result, if we try to calculate something like: %x1 = {a,+,b} %x2 = mul i32 %x1, %x1 %x3 = mul i32 %x2, %x1 %x4 = mul i32 %x3, %x2 ... The size of such SCEVs grows as `2^N`, and the arguments become more and more complex as we go forth. This leads to long compilation and huge memory consumption. This patch sets a limit after which we don't try to combine two `SCEVAddRecExpr`s into one. By default, max allowed size of the resulting AddRecExpr is set to 16. Differential Revision: https://reviews.llvm.org/D35664 llvm-svn: 308847	2017-07-23 15:40:19 +00:00
Craig Topper	6912d7faa3	[X86] Add patterns for memory forms of SARX/SHLX/SHRX with careful complexity adjustment to keep shift by immediate using the legacy instructions. These patterns were only missing to favor using the legacy instructions when the shift was a constant. With careful adjustment of the pattern complexity we can make sure the immediate instructions still have priority over these patterns. llvm-svn: 308834	2017-07-23 03:59:37 +00:00
Nirav Dave	4e6dcf73f9	[DAG] Fix typo preventing some stores merges to truncated stores. Check the actual memory type stored and not the extended value size when considering if truncated store merge is worthwhile. Reviewers: efriedma, RKSimon, spatel, jyknight Reviewed By: efriedma Subscribers: llvm-commits, nhaehnle Differential Revision: https://reviews.llvm.org/D35623 llvm-svn: 308833	2017-07-23 02:06:28 +00:00
Petr Hosek	c348f9fcd5	Revert "Reland "[LLVM][llvm-objcopy] Added basic plumbing to get things started"" This reverts commit 2b52298eb28ba4d3eca113353a348c02a6ef1f93. llvm-svn: 308822	2017-07-22 02:43:50 +00:00
Petr Hosek	badc76623c	Reland "[LLVM][llvm-objcopy] Added basic plumbing to get things started" As discussed on llvm-dev I've implemented the first basic steps towards llvm-objcopy/llvm-objtool (name pending). This change adds the ability to copy (without modification) 64-bit little endian ELF executables that have SHT_PROGBITS, SHT_NOBITS, SHT_NULL and SHT_STRTAB sections. Patch by Jake Ehrlich Differential Revision: https://reviews.llvm.org/D33964 llvm-svn: 308821	2017-07-22 02:33:45 +00:00
Craig Topper	abfe380f9a	[X86] Add nopq instruction which is a rex encoded version of nopl for gas compatibility. llvm-svn: 308818	2017-07-22 01:30:53 +00:00
Craig Topper	e88aef4b5f	[X86] Add register form of NOPL and NOPW for assembler/disassembler. Fixes PR32805. llvm-svn: 308817	2017-07-22 01:30:51 +00:00
David Blaikie	2bee7c68af	Commit missing/empty test file from r308789 llvm-svn: 308814	2017-07-22 00:24:20 +00:00
Matt Arsenault	c5d1e503e1	RA: Remove another assert on empty intervals This case is similar to the one fixed in r308808, except when rematerializing. Fixes bug 33884. llvm-svn: 308813	2017-07-22 00:24:01 +00:00
Matt Arsenault	6a963f76ca	RA: Remove assert on empty live intervals This is possible if there is an undef use when splitting the vreg during spilling. Fixes bug 33620. llvm-svn: 308808	2017-07-21 23:56:13 +00:00
Petr Hosek	b13608f5e8	Revert "[LLVM][llvm-objcopy] Added basic plumbing to get things started" This reverts commit 2f423248e140b94b8377660d4d2fe9364f30febe. llvm-svn: 308806	2017-07-21 23:39:39 +00:00
Petr Hosek	0b746d3417	Reland "[LLVM][llvm-objcopy] Added basic plumbing to get things started" As discussed on llvm-dev I've implemented the first basic steps towards llvm-objcopy/llvm-objtool (name pending). This change adds the ability to copy (without modification) 64-bit little endian ELF executables that have SHT_PROGBITS, SHT_NOBITS, SHT_NULL and SHT_STRTAB sections. Patch by Jake Ehrlich Differential Revision: https://reviews.llvm.org/D33964 llvm-svn: 308803	2017-07-21 23:27:40 +00:00
Erich Keane	d8f61f8f7e	Remove Bitrig: LLVM Changes Bitrig code has been merged back to OpenBSD, thus the OS has been abandoned. Differential Revision: https://reviews.llvm.org/D35707 llvm-svn: 308799	2017-07-21 22:48:47 +00:00
David Blaikie	b8cc0544d2	[ProfData] Detect if zlib is available As discussed on [1], if the profile is compressed and llvm-profdata is not built with zlib support, the error message is not informative. Give a better error message if zlib is not available. [1] http://lists.llvm.org/pipermail/llvm-dev/2017-July/115571.html Reviewers: davidxl, dblaikie Differential Revision: https://reviews.llvm.org/D35586 llvm-svn: 308789	2017-07-21 21:41:15 +00:00
Farhana Aleen	e4a89a6462	X86InterleaveAccess: A fix for bug33826 Reviewers: DavidKreitzer Differential Revision: https://reviews.llvm.org/D35638 llvm-svn: 308784	2017-07-21 21:35:00 +00:00
Konstantin Zhuravlyov	e9a5a77ee3	AMDGPU: Implement memory model llvm-svn: 308781	2017-07-21 21:19:23 +00:00
Krzysztof Parzyszek	3ad0d01e9e	[Hexagon] Add inline-asm constraint 'a' for modifier register class For example asm ("memw(%0++%1) = %2" : : "r"(addr),"a"(mod),"r"(val) : "memory") llvm-svn: 308761	2017-07-21 17:51:27 +00:00
Haojie Wang	1dec57d5b0	ThinLTO Minimized Bitcode File Size Reduction Summary: Currently the ThinLTO minimized bitcode file only strip the debug info, but there is still a lot of information in the minimized bit code file that will be not used for thin linker. In this patch, most of the extra information is striped to reduce the minimized bitcode file. Now only ModuleVersion, ModuleInfo, ModuleGlobalValueSummary, ModuleHash, Symtab and Strtab are left. Now the minimized bitcode file size is reduced to 15%-30% of the debug info stripped bitcode file size. Reviewers: danielcdh, tejohnson, pcc Reviewed By: pcc Subscribers: mehdi_amini, aprantl, inglorion, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D35334 llvm-svn: 308760	2017-07-21 17:25:20 +00:00
Simon Dardis	0310eb7a67	[mips] Support -membedded-data and fix a related bug -membedded-data changes the location of constant data from the .sdata to the .rodata section. Previously it was (incorrectly) always located in the .rodata section. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D35686 llvm-svn: 308758	2017-07-21 17:19:00 +00:00
Matt Arsenault	ca7b0a1777	AMDGPU: Add instruction definitions for some scratch_* instructions Omit atomics for now since they probably aren't useful. llvm-svn: 308747	2017-07-21 15:36:16 +00:00
Dmitry Preobrazhensky	abf2839478	[AMDGPU][MC][GFX9] Added support of VOP3 'op_sel' modifier See bug 33591: https://bugs.llvm.org//show_bug.cgi?id=33591 Reviewers: vpykhtin, artem.tamazov, SamWot, arsenm Differential Revision: https://reviews.llvm.org/D35424 llvm-svn: 308740	2017-07-21 13:54:11 +00:00
Jonas Paulsson	be7a7e4979	[SystemZ] test update test/CodeGen/SystemZ/loop-01.ll was incorrectly updated by r308729. llvm-svn: 308736	2017-07-21 13:14:17 +00:00
Jonas Paulsson	024e319489	[SystemZ, LoopStrengthReduce] This patch makes LSR generate better code for SystemZ in the cases of memory intrinsics, Load->Store pairs or comparison of immediate with memory. In order to achieve this, the following common code changes were made: * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if LSR should do instruction-based addressing evaluations by calling isLegalAddressingMode() with the Instruction pointers. * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address, not just loads or stores. SystemZ changes: * isLSRCostLess() implemented with Insns first, and without ImmCost. * New function supportedAddressingMode() that is a helper for TTI methods looking at Instructions passed via pointers. Review: Ulrich Weigand, Quentin Colombet https://reviews.llvm.org/D35262 https://reviews.llvm.org/D35049 llvm-svn: 308729	2017-07-21 11:59:37 +00:00
Simon Pilgrim	84cbd8e750	[X86][SSE] Add extra (sra (sra x, c1), c2) -> (sra x, (add c1, c2)) test case We should be able to handle the case where some c1+c2 elements exceed max shift and some don't by performing a clamp after the sum llvm-svn: 308724	2017-07-21 10:22:49 +00:00

1 2 3 4 5 ...

46530 Commits