llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	4364d604c2	[InstSimplify] fold fadd+fsub with common operand llvm-svn: 339174	2018-08-07 20:23:49 +00:00
Heejin Ahn	7fb68d2679	[WebAssembly] CFG sort support for exception handling Summary: This patch extends CFGSort pass to support exception handling. Once it places a loop header, it does not place blocks that are not dominated by the loop header until all the loop blocks are sorted. This patch extends the same algorithm to exception 'catch' part, using the information calculated by WebAssemblyExceptionInfo class. Reviewers: dschuff, sunfish Subscribers: sbc100, jgravelle-google, llvm-commits Differential Revision: https://reviews.llvm.org/D46500 llvm-svn: 339172	2018-08-07 20:19:23 +00:00
Sanjay Patel	f7a8fb2dee	[InstSimplify] fold fsub+fsub with common operand llvm-svn: 339171	2018-08-07 20:14:27 +00:00
Sanjay Patel	50976393ed	[InstSimplify] add tests for fadd/fsub; NFC Instcombine gets some, but not all, of these cases via its internal reassociation transforms. It fails in all cases with vector types. llvm-svn: 339168	2018-08-07 19:49:13 +00:00
Alexey Bataev	0edcd0278d	[SLP] Fix insert point for reused extract instructions. Summary: Reworked the previously committed patch to insert shuffles for reused extract element instructions in the correct position. Previous logic was incorrect, and could lead to a crash with PHIs and EH instructions. Reviewers: efriedma, javed.absar Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50143 llvm-svn: 339166	2018-08-07 19:21:05 +00:00
Wei Mi	b1ef2cc53d	[SampleFDO] Fix a bug in getOrCompHotCountThreshold/getOrCompColdCountThreshold getOrCompHotCountThreshold/getOrCompColdCountThreshold introduced in https://reviews.llvm.org/D45377 contain a bad mistake and will only return 1 or 0 instead of the true hot/cold cutoff value. The patch fixes the mistake. However, the mistake does not seem to cause a big performance difference according to internal server benchmark testing. Differential Revision: https://reviews.llvm.org/D50370 llvm-svn: 339162	2018-08-07 18:13:10 +00:00
Philip Reames	c792e197b4	[LICM] Strengthen assume hoisting tests [NFC] As requested in review of https://reviews.llvm.org/D50364 llvm-svn: 339159	2018-08-07 17:54:36 +00:00
Craig Topper	49ed49fcb1	[SelectionDAG] When splitting scatter nodes during DAGCombine, create a serial chain dependency. Scatter could have multiple identical indices. We need to maintain sequential order. We get this right in LegalizeVectorTypes, but not in this code. Differential Revision: https://reviews.llvm.org/D50374 llvm-svn: 339157	2018-08-07 17:35:02 +00:00
Florian Hahn	950576bdf8	[GVN,NewGVN] Keep nonnull if K does not move. In combineMetadata, we should be able to preserve K's nonnull metadata, if K does not move. This condition should hold for all replacements by NewGVN/GVN, but I added a bunch of assertions to verify that. Fixes PR35038. There probably are additional kinds of metadata that could be preserved using similar reasoning. This is follow-up work. Reviewers: dberlin, davide, efriedma, nlopes Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D47339 llvm-svn: 339149	2018-08-07 15:36:11 +00:00
Sjoerd Meijer	b39cd886b9	[ARM] FP16: codegen support for VACGT Differential Revision: https://reviews.llvm.org/D50236 llvm-svn: 339148	2018-08-07 15:11:47 +00:00
Andrew V. Tischenko	1fe3375620	[X86] MCA tests for XCHG, XADD and CMPXCHG* instructions Differential Revision: https://reviews.llvm.org/D49912 llvm-svn: 339145	2018-08-07 14:36:43 +00:00
Sanjay Patel	948ff87d7d	[InstSimplify] move minnum/maxnum with common op fold from instcombine llvm-svn: 339144	2018-08-07 14:36:27 +00:00
Sanjay Patel	b06d283909	[InstSimplify] add tests for minnum/maxnum with shared op; NFC llvm-svn: 339142	2018-08-07 14:13:40 +00:00
Sanjay Patel	b802d18df7	[InstSimplify] move misplaced minnum/maxnum tests; NFC llvm-svn: 339141	2018-08-07 14:12:08 +00:00
Jonas Devlieghere	42243df3b9	Fix inconsistency with/without debug information (-g) This fixes an inconsistency in code generation when compiling with or without debug information (-g). When debug information is available in an empty block, the original test would fail, resulting in possibly different code. Patch by: Jeroen Dobbelaere Differential revision: https://reviews.llvm.org/D49467 llvm-svn: 339129	2018-08-07 12:14:01 +00:00
Aleksandar Beserminji	949a17c016	[mips] Handle branch expansion corner cases When a potential jump instruction and its target are in the same segment, use a jump instruction with an immediate field. In cases where the offset does not fit the immediate value of a bc/j instruction, the offset is stored into a register, and then a jump register instruction is used. Differential Revision: https://reviews.llvm.org/D48019 llvm-svn: 339126	2018-08-07 10:45:45 +00:00
Pavel Labath	2f0881160c	[DebugInfo] Reduce debug_str_offsets section size Summary: The accelerator tables use the debug_str section to store their strings. However, they do not support the indirect method of access that is available for the debug_info section (DW_FORM_strx et al.). Currently our code is assuming that all strings can/will be referenced indirectly, and puts all of them into the debug_str_offsets section. This is generally true for regular (unsplit) dwarf, but in the DWO case, most of the strings in the debug_str section will only be used from the accelerator tables. Therefore the contents of the debug_str_offsets section will be largely unused and bloating the main executable. This patch rectifies this by teaching the DwarfStringPool to differentiate between strings accessed directly and indirectly. When a user inserts a string into the pool it has to declare whether that string will be referenced directly or not. If at least one user requests indirect access, that string will be assigned an index ID and put into debug_str_offsets table. Otherwise, the offset table is skipped. This approach reduces the overall binary size (when compiled with -gdwarf-5 -gsplit-dwarf) in my tests by about 2% (debug_str_offsets is shrunk by 99%). Reviewers: probinson, dblaikie, JDevlieghere Subscribers: aprantl, mgrang, llvm-commits Differential Revision: https://reviews.llvm.org/D49493 llvm-svn: 339122	2018-08-07 09:54:52 +00:00
Simon Pilgrim	7e18938793	[TargetLowering] Add support for non-uniform vectors to BuildUDIV This patch refactors the existing TargetLowering::BuildUDIV base implementation to support non-uniform constant vector denominators. It also includes a fold for MULHU by pow2 constants to SRL which can now more readily occur from BuildUDIV. Differential Revision: https://reviews.llvm.org/D49248 llvm-svn: 339121	2018-08-07 09:51:34 +00:00
Simon Pilgrim	974a5a7d94	[X86][SSE] Add more non-uniform exact sdiv vector tests covering all/none ashr paths llvm-svn: 339120	2018-08-07 09:31:22 +00:00
George Rimar	65a6828b17	[yaml2obj] - Add support for changing EntSize. I was trying to add a test case for LLD and found that it is impossible to set sh_entsize via yaml. The patch implements the missing part. Differential revision: https://reviews.llvm.org/D50235 llvm-svn: 339113	2018-08-07 08:11:38 +00:00
Sjoerd Meijer	a2ddddfd3e	[ARM][NFC] Replaced tab characters in test file vfcmp.ll. llvm-svn: 339111	2018-08-07 08:05:15 +00:00
Heejin Ahn	e8653bb89a	[WebAssembly] Enable atomic expansion for unsupported atomicrmws Summary: Wasm does not have direct counterparts to some of LLVM IR's atomicrmw instructions (min, max, umin, umax, and nand). This enables atomic expansion using cmpxchg instruction within a loop for those atomicrmw instructions. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D49440 llvm-svn: 339084	2018-08-07 00:22:22 +00:00
Matt Arsenault	08f3fe4fae	AMDGPU: cvt_pk_rtz_f16 canonicalizes llvm-svn: 339078	2018-08-06 23:01:31 +00:00
Matt Arsenault	e94ee833f9	AMDGPU: Handle some vector operations in isCanonicalized llvm-svn: 339077	2018-08-06 22:45:51 +00:00
Stella Stamenova	cc2404c01d	[lit, python] Always add quotes around the python path in lit Summary: The issue with the python path is that the path to python on Windows can contain spaces. To make the tests always work, the path to python needs to be surrounded by quotes. This change updates several configuration files which specify the path to python as a substitution and also removes quotes from existing tests. Reviewers: asmith, zturner, alexshap, jakehehrlich Reviewed By: zturner, alexshap, jakehehrlich Subscribers: mehdi_amini, nemanjai, eraman, kbarton, jakehehrlich, steven_wu, dexonsmith, stella.stamenova, delcypher, llvm-commits Differential Revision: https://reviews.llvm.org/D50206 llvm-svn: 339073	2018-08-06 22:37:44 +00:00
Matt Arsenault	a29e76244a	AMDGPU: Push fcanonicalize through partially constant build_vector This usually avoids some re-packing code, and may help find canonical sources. llvm-svn: 339072	2018-08-06 22:30:44 +00:00
Peter Collingbourne	69dd7cd45e	MC: Redirect .addrsig directives referring to private (.L) symbols to the section symbol. This matches our behaviour for regular (i.e. relocated) references to private symbols and therefore avoids needing to unnecessarily write address-significant .L symbols to the object file's symbol table, which can interfere with stack traces. Fixes check-cfi after r339050. llvm-svn: 339066	2018-08-06 21:59:58 +00:00
Matt Arsenault	d49ab0b214	AMDGPU: Treat more custom operations as canonicalizing Everything should quiet, and I think everything should flush. I assume the min3/med3/max3 follow the same rules as regular min/max for flushing, which should at least be conservatively correct. There are still more operations that need to be handled. llvm-svn: 339065	2018-08-06 21:58:11 +00:00
Matt Arsenault	ce6d61fba8	AMDGPU: Conversions always produce canonical results Not sure why this was checking for denormals for f16. My interpretation of the IEEE standard is conversions should produce a canonical result, and the ISA manual says denormals are created when appropriate. llvm-svn: 339064	2018-08-06 21:51:52 +00:00
Philip Reames	94b29601ef	[LICM] Further strengthen tests for hoisting guards and invariant.starts [NFC] llvm-svn: 339062	2018-08-06 21:39:43 +00:00
Matt Arsenault	f8768bfc84	AMDGPU: Fix implementation of isCanonicalized If denormals are enabled, denormals are canonical. Also fix a few other issues. minnum/maxnum are supposed to canonicalize. Temporarily improve workaround for the instruction behavior change in gfx9. Handle selects and fcopysign. The tests were also largely broken, since they were checking for a flush used on some targets after the store of the result. llvm-svn: 339061	2018-08-06 21:38:27 +00:00
Philip Reames	9d7bb2f700	[LICM] Strengthen invariant.start hoisting tests [NFC] llvm-svn: 339057	2018-08-06 21:18:34 +00:00
Reid Kleckner	15e91c3235	[X86] Fix assertion in subreg extraction This assert fires when attempting to extract a subregister from the global PIC base register. This virtual register SD node is not in the VRBaseMap, so we shouldn't call getVR to look it up there. If this is a RegisterSDNode, we should be able to use the virtual register directly. Fixes PR38385 llvm-svn: 339056	2018-08-06 21:16:16 +00:00
Philip Reames	81c7dc93d2	[LICM] Add tests highlighting missing hoists for intrinsics [NFC] llvm-svn: 339054	2018-08-06 21:06:15 +00:00
Evandro Menezes	6e137cb9f0	[SLC] Fix shrinking of pow() Properly shrink `pow()` to `powf()` as a binary function and, when no other simplification applies, do not discard it. Differential revision: https://reviews.llvm.org/D50113 llvm-svn: 339046	2018-08-06 19:40:17 +00:00
Alexandre Ganea	741cc3531a	[llvm-pdbutil] Support PDBs without a DBI stream Differential Revision: https://reviews.llvm.org/D50258 llvm-svn: 339045	2018-08-06 19:35:00 +00:00
Easwaran Raman	10fd92dd94	[X86] Recognize a splat of negate in isFNEG Summary: Expand isFNEG so that we generate the appropriate F(N)M(ADD|SUB) instructions in more cases. For example, the following sequence a = _mm256_broadcast_ss(f) d = _mm256_fnmadd_ps(a, b, c) generates an fsub and fma without this patch and an fnma with this change. Reviewers: craig.topper Subscribers: llvm-commits, davidxl, wmi Differential Revision: https://reviews.llvm.org/D48467 llvm-svn: 339043	2018-08-06 19:23:38 +00:00
Craig Topper	0076477a4c	[X86] When using "and $0" and "orl $-1" to store 0 and -1 for minsize, make sure the store isn't volatile If the store is volatile this might be a memory mapped IO access. In that case we shouldn't generate a load that didn't exist in the source Differential Revision: https://reviews.llvm.org/D50270 llvm-svn: 339041	2018-08-06 18:44:26 +00:00
Craig Topper	f8a8c746e3	[X86] Add test cases to show bad use of "and $0" and "orl $-1" for minsize when the store is volatile If the store is volatile we shouldn't be adding a load that didn't exist in the source. llvm-svn: 339040	2018-08-06 18:44:21 +00:00
Wei Mi	3c1c088500	[RegisterCoalescer] Delay live interval update work until the rematerialization for all the uses from the same def is done. We run into a compile time problem with flex generated code combined with `-fno-jump-tables`. The cause is that machineLICM hoists a lot of invariants outside of a big loop, and drastically increases the compile time in global register splitting and copy coalescing. https://reviews.llvm.org/D49353 relieves the problem in global splitting. This patch is to handle the problem in copy coalescing. The problem in copy coalescing arises as follows. After machineLICM, we have several defs outside of a big loop with hundreds or thousands of uses inside the loop. Rematerialization in copy coalescing happens for each use, and every time rematerialization is done, shrinkToUses will be called to update the huge live interval. Because we have 'n' uses for a def, and each live interval update will have at least 'n' complexity, the total update work is n^2. To fix the problem, we try to do the live interval update work in a collective way. If the number of copylike uses of a def exceeds a threshold, each time rematerialization is done for one of those uses, we won't do the live interval update immediately but delay that work until rematerialization for all those uses is completed, so we only have to do the live interval update work once. Delaying the live interval update could potentially change the copy coalescing result, so we hope to limit that change to those defs with many (like above a hundred) copylike uses, and the cutoff can be adjusted by the option -mllvm -late-remat-update-threshold=xxx. Differential Revision: https://reviews.llvm.org/D49519 llvm-svn: 339035	2018-08-06 17:30:45 +00:00
Matt Arsenault	0d1b3934e2	AMDGPU: Fold v_lshl_or_b32 with 0 src0 Appears from expansion of some packed cases. llvm-svn: 339025	2018-08-06 15:40:20 +00:00
Matt Arsenault	56b31d8d75	ValueTracking: Handle canonicalize in CannotBeNegativeZero Also fix apparently missing test coverage for any of the handling here. llvm-svn: 339023	2018-08-06 15:16:26 +00:00
Matt Arsenault	dbf77c5b41	AMDGPU: Rename check prefixes in test Will avoid noisy diff in future change. llvm-svn: 339022	2018-08-06 15:16:12 +00:00
Bryan Chan	e023706471	[AArch64] Fix assertion failure on widened f16 BUILD_VECTOR Summary: Ensure that NormalizedBuildVector returns a BUILD_VECTOR with operands of the same type. This fixes an assertion failure in VerifySDNode. Reviewers: SjoerdMeijer, t.p.northover, javed.absar Reviewed By: SjoerdMeijer Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D50202 llvm-svn: 339013	2018-08-06 14:14:41 +00:00
Tim Northover	9956e4a24b	ARM-MachO: don't add Thumb bit for addend to non-external relocation. ld64 supplies its own Thumb bit for Thumb functions, and intentionally zeroes out that part of any addend in an object file. But it only does that for symbols marked N_EXT -- i.e. external symbols. So LLVM should avoid setting that extra bit in other cases. llvm-svn: 339007	2018-08-06 11:32:44 +00:00
Max Kazantsev	2dbbd64cb7	Re-enable "[ValueTracking] Teach isKnownNonNullFromDominatingCondition about AND" The patch was reverted because of bug detected by sanitizer. The bug is fixed, respective tests added. Differential Revision: https://reviews.llvm.org/D50172 llvm-svn: 339005	2018-08-06 11:14:18 +00:00
Max Kazantsev	3271f379a9	Revert rL338990 to see if it causes sanitizer failures Multiple failures reported by sanitizer-x86_64-linux seem to be caused by this patch. Reverting to see if they persist without it. Differential Revision: https://reviews.llvm.org/D50172 llvm-svn: 338994	2018-08-06 08:10:28 +00:00
Max Kazantsev	34b0666be9	[ValueTracking] Teach isKnownNonNullFromDominatingCondition about AND `isKnownNonNullFromDominatingCondition` is able to prove non-null based on `br` or `guard` by `%p != null` condition, but is unable to do so based on `(%p != null) && %other_cond`. This patch allows it to do so. Differential Revision: https://reviews.llvm.org/D50172 Reviewed By: reames llvm-svn: 338990	2018-08-06 06:11:36 +00:00
Max Kazantsev	eded4abef8	[GuardWidening] Widen guards with conditions of frequently taken dominated branches If there is a frequently taken branch dominated by a guard, and its condition is available at the point of the guard, we can widen guard with condition of this branch and convert the branch into unconditional: guard(cond1) if (cond2) { // taken in 99.9% cases // do something } else { // do something else } Converts to guard(cond1 && cond2) // do something Differential Revision: https://reviews.llvm.org/D49974 Reviewed By: reames llvm-svn: 338988	2018-08-06 05:49:19 +00:00
David Bolvansky	b7fcd10700	[NFC] Fixed inliner tests - 2 llvm-svn: 338973	2018-08-05 16:53:36 +00:00
David Bolvansky	2f1f3b10ad	[NFC] Fixed inliner tests llvm-svn: 338972	2018-08-05 16:30:46 +00:00
David Bolvansky	c0aa4b75a4	Enrich inline messages Summary: This patch improves Inliner to provide causes/reasons for negative inline decisions. 1. It adds one new message field to InlineCost to report causes for Always and Never instances. All Never and Always instantiations must provide a simple message. 2. Several functions that used to return the inlining results as boolean are changed to return InlineResult which carries the cause for negative decision. 3. Changed remark printing and debug output messages to provide the additional messages and related inline cost. 4. Adjusted tests for changed printing. Patch by: yrouban (Yevgeny Rouban) Reviewers: craig.topper, sammccall, sgraenitz, NutshellySima, shchenz, chandlerc, apilipenko, javed.absar, tejohnson, dblaikie, sanjoy, eraman, xbolva00 Reviewed By: tejohnson, xbolva00 Subscribers: xbolva00, llvm-commits, arsenm, mehdi_amini, eraman, haicheng, steven_wu, dexonsmith Differential Revision: https://reviews.llvm.org/D49412 llvm-svn: 338969	2018-08-05 14:53:08 +00:00
Eric Christopher	9855a5a0a1	Revert "Add a warning if someone attempts to add extra section flags to sections" There are a bunch of edge cases and inconsistencies in how we're emitting sections that cause this warning to fire and it needs more work. This reverts commit r335558. llvm-svn: 338968	2018-08-05 14:23:37 +00:00
Roman Lebedev	365fa96055	[NFC][InstCombine] Add tests for sinking 'not' into 'xor' (PR38446) https://rise4fun.com/Alive/IT3 Comes up in the [most ugliest] signed int -> signed char case of -fsanitize=implicit-conversion (https://reviews.llvm.org/D50250) Not sure if we want to do it always, or only when it is free to invert. llvm-svn: 338967	2018-08-05 10:15:04 +00:00
Roman Lebedev	656a478e98	[NFC][InstCombine] Regenerate set.ll test llvm-svn: 338965	2018-08-05 08:53:40 +00:00
Craig Topper	fb33181038	[X86] Remove stale comments from a test. NFC The 16-bit case was recently fixed so this comment no longer applies. llvm-svn: 338964	2018-08-05 06:25:01 +00:00
David Bolvansky	b82a5ec1b6	[InstCombine] [NFC] Tests for strcmp to memcmp transformation llvm-svn: 338963	2018-08-05 05:46:56 +00:00
Chijun Sima	8b5de48d62	[TailCallElim] Preserve DT and PDT Summary: Previously, in the NewPM pipeline, TailCallElim recalculates the DomTree when it modifies any instruction in the Function. For example, ``` CallInst *CI = dyn_cast<CallInst>(&I); ... CI->setTailCall(); Modified = true; ... if (!Modified || ...) return PreservedAnalyses::all(); ``` After applying this patch, the DomTree only recalculates if needed (plus an extra insertEdge() + an extra deleteEdge() call). When optimizing SQLite with `-passes="default<O3>"` pipeline of the newPM, the number of DomTree recalculations decreases by 6.2%, and the number of nodes visited by DFS decreases by 2.9%. The time used by DomTree will decrease by approximately 1%~2.5% after applying the patch. Statistics: ``` Before the patch: 23010 dom-tree-stats - Number of DomTree recalculations 489264 dom-tree-stats - Number of nodes visited by DFS -- DomTree After the patch: 21581 dom-tree-stats - Number of DomTree recalculations 475088 dom-tree-stats - Number of nodes visited by DFS -- DomTree ``` Reviewers: kuhar, dmgreen, brzycki, grosser, davide Reviewed By: kuhar, brzycki Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49982 llvm-svn: 338954	2018-08-04 08:13:47 +00:00
Chijun Sima	eacad79777	[ADCE] Remove the need of DomTree Summary: ADCE doesn't need to query domtree. Reviewers: kuhar, brzycki, dmgreen, davide, grosser Reviewed By: kuhar Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49988 llvm-svn: 338950	2018-08-04 02:50:12 +00:00
Aditya Nandakumar	e07b3b737b	[GISel]: Add Opcodes for CTLZ/CTTZ/CTPOP https://reviews.llvm.org/D48600 Added IRTranslator support to translate these known intrinsics into GISel opcodes. llvm-svn: 338944	2018-08-04 01:22:12 +00:00
Craig Topper	3c869cb5e5	[X86] Add isel patterns for atomic_load+sub+atomic_sub. Despite the comment removed in this patch, this is beneficial when the RHS of the sub is a register. llvm-svn: 338930	2018-08-03 22:08:30 +00:00
Craig Topper	84319d1b42	[X86] Add test cases to show missed opportunity to use RMW for atomic_load+sub+atomic_store. llvm-svn: 338929	2018-08-03 22:08:28 +00:00
Reid Kleckner	8e40702c1c	[X86] Re-generate abi-isel.ll checks with update_llc_test_checks.py These tests were clearly auto-generated when they were converted to FileCheck back in r80019 (2009), but we didn't have a fancy script to keep them up to date then. I've reviewed the diff, and we should be generating the exact same code sequences we used to. After this, I plan to commit a change that changes our output slightly, but in a way that is still correct. It will generate a large diff, and I want it to be clearly correct, so I am regenerating these checks in preparation for that. llvm-svn: 338928	2018-08-03 21:58:25 +00:00
Reid Kleckner	5578b53c92	[X86] Make abi-isel.ll like update_llc_test_checks.py output - Remove -asm-verbose=0 from every llc command. The tests still pass. - Reorder the RUN lines to match CHECKs. - Use -LABEL like update_llc_test_checks.py does. llvm-svn: 338927	2018-08-03 21:58:12 +00:00
Reid Kleckner	13a9035190	[X86] Layout tests exactly as update_llc_test_checks.py would Put the LLVM IR at the bottom of the function instead of the top. In my next patch, I will run update_llc_test_checks.py on this file, and I want to only highlight the diffs in the CHECK lines. Hopefully by doing this change first, the patch will be more understandable. llvm-svn: 338926	2018-08-03 21:57:59 +00:00
Craig Topper	d7391eefdf	[X86] Remove RELEASE_ and ACQUIRE_ pseudo instructions. Use isel patterns and the normal instructions instead At one point in time acquire implied mayLoad and mayStore as did release. Thus we needed separate pseudos that also carried that property. This appears to no longer be the case. I believe it was changed in 2012 with a comment saying that atomic memory accesses are marked volatile which preserves the ordering. So from what I can tell we shouldn't need additional pseudos since they don't carry any flags that are different from the normal instructions. The only thing I can think of is that we may consider them for load folding candidates in the peephole pass now where we didn't before. If that's important hopefully there's something in the memory operand we can check to prevent the folding without relying on pseudo instructions. Differential Revision: https://reviews.llvm.org/D50212 llvm-svn: 338925	2018-08-03 21:40:44 +00:00
Craig Topper	8c41136ca3	[X86] Autogenerate complete checks. NFC llvm-svn: 338921	2018-08-03 20:58:14 +00:00
Anastasis Grammenos	4dfe279e00	[TRE][DebugInfo] Preserve Debug Location in new branch instruction There are two branch instructions created so the new test covers them both. Differential Revision: https://reviews.llvm.org/D50263 llvm-svn: 338917	2018-08-03 20:27:13 +00:00
Craig Topper	c4960582ec	[SelectionDAG] Teach LegalizeVectorTypes to widen the mask input to a masked store. The mask operand is visited before the data operand so we need to be able to widen it. Fixes PR38436. llvm-svn: 338915	2018-08-03 20:14:18 +00:00
Matt Arsenault	c3dc8e65e2	DAG: Enhance isKnownNeverNaN Add a parameter for testing specifically for sNaNs - at least one instruction pattern on AMDGPU needs to check specifically for this. Also handle more cases, and add a target hook for custom nodes, similar to the hooks for known bits. llvm-svn: 338910	2018-08-03 18:27:52 +00:00
Artem Belevich	0a11b6366a	[NVPTX] Handle __nvvm_reflect("__CUDA_ARCH"). Summary: libdevice in recent CUDA versions relies on __nvvm_reflect() to select GPU-specific bitcode. This patch addresses the requirement. Reviewers: jlebar Subscribers: jholewinski, sanjoy, hiraditya, bixia, llvm-commits Differential Revision: https://reviews.llvm.org/D50207 llvm-svn: 338908	2018-08-03 18:05:24 +00:00
Craig Topper	feb2a58860	[X86] Add a DAG combine for the __builtin_parity idiom used by clang to enable better codegen Clang uses "ctpop & 1" to implement __builtin_parity. If the popcnt instruction isn't supported this generates a large amount of code to calculate the population count. Instead we can bisect the data down to a single byte using xor and then check the parity flag. Even when popcnt is supported, it's still a good idea to split 64-bit data on 32-bit targets using an xor in front of a single popcnt. Otherwise we get two popcnts and an add before the and. I've specifically targeted this at the sizes supported by clang builtins, but we could generalize this if we think that's useful. Differential Revision: https://reviews.llvm.org/D50165 llvm-svn: 338907	2018-08-03 18:00:29 +00:00
Craig Topper	b0ad9b9fd7	[X86] Add test cases for the current codegen of __builtin_parity. Will be improved in a follow-up commit llvm-svn: 338906	2018-08-03 18:00:23 +00:00
Joel Galenson	cfe5bc158d	Fix crash in bounds checking. In r337830 I added SCEV checks to enable us to insert fewer bounds checks. Unfortunately, this sometimes crashes when multiple bounds checks are added due to SCEV caching issues. This patch splits the bounds checking pass into two phases, one that computes all the conditions (using SCEV checks) and the other that adds the new instructions. Differential Revision: https://reviews.llvm.org/D49946 llvm-svn: 338902	2018-08-03 17:12:23 +00:00
Nicholas Wilson	e408a89a3a	[WebAssembly] Cleanup of the way globals and global flags are handled Differential Revision: https://reviews.llvm.org/D44030 llvm-svn: 338894	2018-08-03 14:33:37 +00:00
Jonas Devlieghere	3a92c5c1d3	[DebugInfo/Verifier] Don't emit error for missing module in index We don't expect module names to be present in the index. This patch adds DW_TAG_module to the blacklist. Differential revision: https://reviews.llvm.org/D50237 llvm-svn: 338878	2018-08-03 12:01:43 +00:00
Jonas Paulsson	f107b7275c	[SystemZ] Improve handling of instructions which expand to several groups Some instructions expand to more than one decoder group. This has been hitherto ignored, but is handled with this patch. Review: Ulrich Weigand https://reviews.llvm.org/D50187 llvm-svn: 338849	2018-08-03 10:43:05 +00:00
Sjoerd Meijer	d62c5ec2fe	[ARM] FP16: support vector zip and unzip This is addressing PR38404. Differential Revision: https://reviews.llvm.org/D50186 llvm-svn: 338835	2018-08-03 09:24:29 +00:00
Simon Pilgrim	4014fb1049	[X86] Add example of 'zero shift' guards on rotation patterns (PR34924) Basic pattern that leaves an unnecessary select on a rotation by zero result. This variant is trivial - the more general case with a compare+branch to prevent execution of undefined shifts is more tricky. llvm-svn: 338833	2018-08-03 09:20:02 +00:00
Sjoerd Meijer	9b30213828	[ARM] FP16: support VFMA This is addressing PR38404. llvm-svn: 338830	2018-08-03 09:12:56 +00:00
Craig Topper	a7a12399a1	[X86] Remove all the vector NOP bitcast patterns. Use a few lines of code in the Select method in X86ISelDAGToDAG.cpp instead. There are a lot of permutations of types here generating a lot of patterns in the isel table. It's more efficient to just ReplaceUses and RemoveDeadNode from the Select function. The test changes are because we have some shuffle patterns that have a bitcast as their root node. But the behavior is identical to another instruction whose pattern doesn't start with a bitcast. So this isn't a functional change. llvm-svn: 338824	2018-08-03 07:01:10 +00:00
Craig Topper	e902b7d0b0	[X86] Support fp128 and/or/xor/load/store with VEX and EVEX encoded instructions. Move all the patterns to X86InstrVecCompiler.td so we can keep SSE/AVX/AVX512 all in one place. To save some patterns we'll use an existing DAG combine to convert f128 fand/for/fxor to integer when sse2 is enabled. This allows us to reuse all the existing patterns for v2i64. I believe this now makes SHA instructions the only case where VEX/EVEX and legacy encoded instructions could be generated simultaneously. llvm-svn: 338821	2018-08-03 06:12:56 +00:00
Hiroshi Inoue	73f8b255b6	[InstSimplify] fold extracting from std::pair (2/2) This is the second patch of the series which intends to enable jump threading for an inlined method whose return type is std::pair<int, bool> or std::pair<bool, int>. The first patch is https://reviews.llvm.org/rL338485. This patch handles code sequences that merge two values using `shl` and `or`, then extract one value using `and`. Differential Revision: https://reviews.llvm.org/D49981 llvm-svn: 338817	2018-08-03 05:39:48 +00:00
Craig Topper	a80352c04e	[X86] When post-processing the DAG to remove zero extending moves for YMM/ZMM, make sure the producing instruction is VEX/XOP/EVEX encoded. If the producing instruction is legacy encoded it doesn't implicitly zero the upper bits. This is important for the SHA instructions which don't have a VEX encoded version. We might also be able to hit this with the incomplete f128 support that hasn't been ported to VEX. llvm-svn: 338812	2018-08-03 04:49:42 +00:00
Craig Topper	ded14af7aa	[X86] Autogenerate complete checks. NFC llvm-svn: 338811	2018-08-03 04:49:41 +00:00
Craig Topper	55697276dc	[X86] Autogenerate complete checks. NFC llvm-svn: 338802	2018-08-03 01:28:12 +00:00
Craig Topper	b99281c9b8	[X86] Autogenerate complete checks. NFC llvm-svn: 338799	2018-08-03 01:20:32 +00:00
Craig Topper	2c095444a4	[X86] Prevent promotion of i16 add/sub/and/or/xor to i32 if we can fold an atomic load and atomic store. This makes them consistent with i8/i32/i64. Which still seems to be more aggressive on folding than icc, gcc, or MSVC. llvm-svn: 338795	2018-08-03 00:37:34 +00:00
Philip Reames	5937368d4f	[LICM] Remove unnecessary safety check to increase sinking effectiveness This one requires a bit of explanation. It's not every day you simply delete code to implement an optimization. :) The transform in question is sinking an instruction from a loop to the uses in loop exiting blocks. We know (from LCSSA) that all of the uses outside the loop must be phi nodes, and after predecessor splitting, we know all phi users must have a single operand. Since the use must be strictly dominated by the def, we know from the definition of dominance/ssa that the exit block must execute along a (non-strict) subset of paths which reach the def. As a result, duplicating a potentially faulting instruction cannot introduce a fault that didn't previously exist in the program. The full story is that this patch builds on "rL338671: [LICM] Factor out fault legality from canHoistOrSinkInst [NFC]" which pulled this logic out of a common helper routine. As best I can tell, this check was originally added to the helper function for hoisting legality, later an incorrect fastpath for loads/calls was added, and then the bug was fixed by duplicating the fault safety check in the hoist path. This left the redundant check in the common code to pessimize sinking for no reason. I split it out in an NFC, and am now removing the unnecessary check. I wanted there to be something easy to revert in case I missed something. Reviewed by: Anna Thomas (in person) llvm-svn: 338794	2018-08-03 00:21:56 +00:00
Dave Lee	3fb120f12e	objdump: Better handling of Mach-O universal binaries Summary: With Mach-O, there is a flag requirement discrepancy between working with universal binaries and thin binaries. Many flags that don't require the `-macho` flag (for example `-private-headers` and `-disassemble`) fail to work on universal binaries unless `-macho` is given. When this happens, the error message is unhelpful, stating: The file was not recognized as a valid object file. Which can lead to confusion. This change allows generic flags to be used on universal binaries with and without the `-macho` flag. This means flags that can be used for thin files can be used consistently with fat files too. To do this, the universal binary support within `ParseInputMachO()` is extracted into a new function. This new function is called directly from `DumpInput()` when the input binary is universal. Additionally the `-arch` flag validation in `ParseInputMachO()` was extracted to be reused. Reviewers: compnerd Reviewed By: compnerd Subscribers: keith, llvm-commits Differential Revision: https://reviews.llvm.org/D48702 llvm-svn: 338792	2018-08-03 00:06:38 +00:00
Eli Friedman	1ba5e9ac24	[GlobalMerge] Allow merging globals with explicit section markings. At least on ELF, it's impossible to tell from the object file whether two globals with the same section marking were merged: the merged global uses "private" linkage to hide its symbol, and the aliases look like regular symbols. I can't think of any other reason to disallow it. (Of course, we can only merge globals in the same section.) The weird alignment handling matches AsmPrinter; our alignment handling for global variables should probably be refactored. Differential Revision: https://reviews.llvm.org/D49822 llvm-svn: 338791	2018-08-02 23:54:16 +00:00
Tim Renouf	abd85fb1f5	[AMDGPU] Reworked SIFixWWMLiveness Summary: I encountered some problems with SIFixWWMLiveness when WWM is in a loop: 1. It sometimes gave invalid MIR where there is some control flow path to the new implicit use of a register on EXIT_WWM that does not pass through any def. 2. There were lots of false positives of registers that needed to have an implicit use added to EXIT_WWM. 3. Adding an implicit use to EXIT_WWM (and adding an implicit def just before the WWM code, which I tried in order to fix (1)) caused lots of the values to be spilled and reloaded unnecessarily. This commit is a rework of SIFixWWMLiveness, with the following changes: 1. Instead of considering any register with a def that can reach the WWM code and a def that can be reached from the WWM code, it now considers three specific cases that need to be handled. 2. A register that needs liveness over WWM to be synthesized now has it done by adding itself as an implicit use to defs other than the dominant one. Also added the following fixmes: FIXME: We should detect whether a register in one of the above categories is already live at the WWM code before deciding to add the implicit uses to synthesize its liveness. FIXME: I believe this whole scheme may be flawed due to the possibility of the register allocator doing live interval splitting. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46756 Change-Id: Ie7fba0ede0378849181df3f1a9a7a39ed1a94a94 llvm-svn: 338783	2018-08-02 23:31:32 +00:00
Craig Topper	63873db5c4	[X86] Allow 'atomic_store (neg/not atomic_load)' to isel to a RMW instruction. There was a FIXME in the td file about a type inference issue that was easy to fix. llvm-svn: 338782	2018-08-02 23:30:38 +00:00
Craig Topper	2deeeae2a5	[X86] Add NEG and NOT test cases to atomic_mi.ll in preparation for fixing the FIXME in X86InstrCompiler.td to make these work for atomic load/store. llvm-svn: 338781	2018-08-02 23:30:31 +00:00
Tim Renouf	f1c7b92a6a	[AMDGPU] Avoid using divergent value in mubuf addr64 descriptor Summary: This fixes a problem where a load from global+idx generated incorrect code on <=gfx7 when the index is divergent. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D47383 Change-Id: Ib4d177d6254b1dd3f8ec0203fdddec94bd8bc5ed llvm-svn: 338779	2018-08-02 22:53:57 +00:00
Zachary Turner	666de23fbf	[MS Demangler] Fix some tests that are no longer broken. These were fixed with earlier patches, but had not yet been re-enabled. llvm-svn: 338778	2018-08-02 22:37:40 +00:00
Krzysztof Parzyszek	d91a9e27a9	[Hexagon] Simplify CFG after atomic expansion This will remove suboptimal branching from the generated ll/sc loops. The extra simplification pass affects a lot of testcases, which have been modified to accommodate this change: either by modifying the test to become immune to the CFG simplification, or (less preferably) by adding option -hexagon-initial-cfg-cleanup=0. llvm-svn: 338774	2018-08-02 22:17:53 +00:00
Heejin Ahn	4128cb0b6b	[WebAssembly] Support for atomic.wait / atomic.wake instructions Summary: This adds support for atomic.wait / atomic.wake instructions in the wasm thread proposal. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D49395 llvm-svn: 338770	2018-08-02 21:44:24 +00:00
Craig Topper	db89ec1185	[X86] Autogenerate complete checks. NFC llvm-svn: 338765	2018-08-02 20:28:45 +00:00
Krzysztof Parzyszek	90f3249ce2	[SCEV] Properly solve quadratic equations Differential Revision: https://reviews.llvm.org/D48283 llvm-svn: 338758	2018-08-02 19:13:35 +00:00

55120 Commits