llvm-project

Commit Graph

Author	SHA1	Message	Date
Teresa Johnson	4bdf82ce79	[SamplePGO] Minor efficiency improvement in samplePGO ICP Summary: When attaching prof metadata to promoted direct calls in SamplePGO mode, no need to construct and use a SmallVector to pass a single count to the ArrayRef parameter, we can simply use a brace-enclosed init list. This made a small but consistent improvement for a ThinLTO backend compile I was measuring. Reviewers: wmi Subscribers: mehdi_amini, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57706 llvm-svn: 353123	2019-02-05 00:18:38 +00:00
Matt Arsenault	81511e5428	GlobalISel: Implement narrowScalar for select Don't handle vector conditions. I think this can be merged in the future with fewerElementsVectorSelect, although this becomes slightly tricky with a vector condition. llvm-svn: 353122	2019-02-05 00:13:44 +00:00
Matt Arsenault	24f14993e8	GlobalISel: Combine g_extract with g_merge_values Try to use the underlying source registers. This enables legalization in more cases where some irregular operations are widened and others narrowed. This seems to make the test_combines_2 AArch64 test worse, since the MERGE_VALUES has multiple uses. Since this should be required for legalization, a hasOneUse check is probably inappropriate (or maybe should only be used if the merge is legal?). llvm-svn: 353121	2019-02-04 23:41:59 +00:00
Evandro Menezes	98f356cd74	Revert "[PATCH] [TargetLibraryInfo] Update run time support for Windows" This reverts accidental commit `ff5527718d`. llvm-svn: 353118	2019-02-04 23:34:50 +00:00
Evandro Menezes	ff5527718d	[PATCH] [TargetLibraryInfo] Update run time support for Windows It seems that the run time for Windows has changed and supports more math functions than before. Since LLVM requires at least VS2015, I assume that this is the run time that would be redistributed with programs built with Clang. Thus, I based this update on the header file `math.h` that accompanies it. This patch addresses the PR40541. Unfortunately, I have no access to a Windows development environment to validate it. llvm-svn: 353114	2019-02-04 23:29:41 +00:00
Matt Arsenault	1f795e2c2a	GlobalISel: Enforce operand types for constants A number of tests were using imm operands, not cimm. Since CSE relies on the exact ConstantInt* pointer used, and implicit conversions are generally evil, also enforce the bitsize of the types. llvm-svn: 353113	2019-02-04 23:29:31 +00:00
Matt Arsenault	f2a26339e2	GlobalISel: Verify g_select Factor the common vector element consistency check many instructions need out, although this makes the error messages worse. llvm-svn: 353112	2019-02-04 23:29:16 +00:00
Matt Arsenault	46f9c6cf0b	MachineVerifier: Move verification of G_* instructions to function llvm-svn: 353111	2019-02-04 23:29:11 +00:00
Sam Clegg	313f9f54f5	[WebAssembly] MC: Mark more function aliases as functions Aliases of functions are now marked as function symbols even if they are bitcast to some other non-function type. This is important for WebAssembly where object and function symbols can't alias each other. Fixes PR38866 Differential Revision: https://reviews.llvm.org/D57538 llvm-svn: 353109	2019-02-04 23:07:34 +00:00
Matt Arsenault	8a59b1919c	MIR: Validate LLT types when parsing llvm-svn: 353107	2019-02-04 22:59:56 +00:00
Matt Arsenault	3d6a49b0b9	GlobalISel: Fix not calling observer when legalizing bitcount ops This was hiding bugs from never legalizing the source type. llvm-svn: 353102	2019-02-04 22:26:33 +00:00
Matt Arsenault	cba0c6d0c9	AMDGPU: Don't rematerialize mov with implicit operands This was pulling the mov used for register indexing on gfx9 out of the loop. llvm-svn: 353101	2019-02-04 22:26:21 +00:00
Julian Lettner	29ac3a5b82	[SanitizerCoverage] Clang crashes if user declares `__sancov_lowest_stack` variable Summary: If the user declares or defines `__sancov_lowest_stack` with an unexpected type, then `getOrInsertGlobal` inserts a bitcast and the following cast fails: ``` Constant *SanCovLowestStackConstant = M.getOrInsertGlobal(SanCovLowestStackName, IntptrTy); SanCovLowestStack = cast<GlobalVariable>(SanCovLowestStackConstant); ``` This variable is a SanitizerCoverage implementation detail and the user should generally never have a need to access it, so we emit an error now. rdar://problem/44143130 Reviewers: morehouse Differential Revision: https://reviews.llvm.org/D57633 llvm-svn: 353100	2019-02-04 22:06:30 +00:00
Nicolai Haehnle	a69146e67e	[InstCombine] Cleanup the TFE/LWE check in AMDGPU SimplifyDemanded Summary: The fix added in r352904 is not quite correct, or rather misleading: 1. When the texfailctrl (TFC) argument was non-constant, the fix assumed non-TFE/LWE, which is incorrect. 2. Regardless, this code path cannot even be hit for correct TFE/LWE-enabled calls, because those return a struct. Added a test case for those for completeness. Change-Id: I92d314dbc67a2670f6d7adaab765ef45f56a49cf Reviewers: hliao, dstuttard, arsenm Subscribers: kzhuravl, jvesely, wdng, yaxunl, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57681 llvm-svn: 353097	2019-02-04 21:24:19 +00:00
Craig Topper	c45e39b35f	[CodeGen][ARC][SystemZ][WebAssembly] Use MachineInstr::isInlineAsm in more places instead of just comparing opcode. NFCI I'm looking at adding a second INLINEASM opcode for better modeling asm-goto as a terminator. Using the existing predicate will reduce the number of places that will need to use the new opcode. llvm-svn: 353095	2019-02-04 21:24:13 +00:00
Philip Pfaffe	0ee6a933ce	[NewPM][MSan] Add Options Handling Summary: This patch enables passing options to msan via the passes pipeline, e.g., -passes=msan<recover;kernel;track-origins=4>. Reviewers: chandlerc, fedor.sergeev, leonardchan Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57640 llvm-svn: 353090	2019-02-04 21:02:49 +00:00
Wolfgang Pieb	90d856cd5f	[DEBUGINFO] Reposting r352642: Handle restore instructions in LiveDebugValues The LiveDebugValues pass recognizes spills but not restores, which can cause large gaps in location information for some variables, depending on control flow. This patch makes LiveDebugValues recognize restores and generate appropriate DBG_VALUE instructions. This patch was posted previously with r352642 and reverted in r352666 due to buildbot errors. A missing return statement was the cause for the failures. Reviewers: aprantl, NicolaPrica Differential Revision: https://reviews.llvm.org/D57271 llvm-svn: 353089	2019-02-04 20:42:45 +00:00
Scott Linder	d19d197221	[AMDGPU] Support emitting GOT relocations for function calls Differential Revision: https://reviews.llvm.org/D57416 llvm-svn: 353083	2019-02-04 20:00:07 +00:00
Michael Kruse	70560a0a2c	[WarnMissedTransforms] Do not warn about already vectorized loops. LoopVectorize adds llvm.loop.isvectorized, but leaves llvm.loop.vectorize.enable. Do not consider such a loop for user-forced vectorization since vectorization already happened -- by prioritizing llvm.loop.isvectorized except for TM_SuppressedByUser. Fixes http://llvm.org/PR40546 Differential Revision: https://reviews.llvm.org/D57542 llvm-svn: 353082	2019-02-04 19:55:59 +00:00
Matt Arsenault	8121ec26c0	GlobalISel: Fix CSE handling of buildConstant This fixes two problems with CSE done in buildConstant. First, this would hit an assert when used with a vector result type. Solve this by allowing CSE on the vector elements, but not on the result vector for now. Second, this was also performing the CSE based on the input ConstantInt pointer. The underlying buildConstant could potentially convert the constant depending on the result type, resulting in a different ConstantInt*. Stop allowing the APInt and ConstantInt forms from automatically casting to the result type to avoid any similar problems in the future. llvm-svn: 353077	2019-02-04 19:15:50 +00:00
Heejin Ahn	18c56a0762	[WebAssembly] clang-tidy (NFC) Summary: This patch fixes clang-tidy warnings on wasm-only files. The list of checks used is: `-,clang-diagnostic-,llvm-,misc-,-misc-unused-parameters,readability-identifier-naming,modernize-` (LLVM's default .clang-tidy list is the same except it does not have `modernize-`. But I've seen in multiple CLs in LLVM the modernize style was recommended and code was fixed based on the style, so I added it as well.) The common fixes are: - Variable names start with an uppercase letter - Function names start with a lowercase letter - Use `auto` when you use casts so the type is evident - Use inline initialization for class member variables - Use `= default` for empty constructors / destructors - Use `using` in place of `typedef` Reviewers: sbc100, tlively, aardappel Subscribers: dschuff, sunfish, jgravelle-google, yurydelendik, kripken, MatzeB, mgorny, rupprecht, llvm-commits Differential Revision: https://reviews.llvm.org/D57500 llvm-svn: 353075	2019-02-04 19:13:39 +00:00
Roman Lebedev	b7ecc9b624	[X86] X86DAGToDAGISel::matchBitExtract(): prepare 'control' in 32 bits Summary: Noticed while looking at D56052. ``` // The 'control' of BEXTR has the pattern of: // [15...8 bit][ 7...0 bit] location // [ bit count][ shift] name // I.e. 0b000000011'00000001 means (x >> 0b1) & 0b11 ``` I.e. we do not care about any of the bits aside from the low 16 bits. So there is no point in doing the `shl`,`or` in 64 bits, let's just do everything in 32 bits, and anyext if needed. We could do that in 16 even, but we intentionally don't zext to i16 (longer encoding IIRC), so I'm guessing the same applies here. Reviewers: craig.topper, andreadb, RKSimon Reviewed By: craig.topper Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D56715 llvm-svn: 353073	2019-02-04 19:04:26 +00:00
David Callahan	fd3e7a9320	Adjust cardinality of internal inliner thresholds Summary: While compiling openJDK11 (also other workloads), some make files would pass both CFLAGS and LDFLAGS at link step, resulting in duplicate options on the command line when one is using LTO and trying to influence the inliner. Most of the internal flags are ZeroOrMore, this diff changes the remaining ones. Reviewers: david2050, twoh, modocache Reviewed By: twoh Subscribers: mehdi_amini, dexonsmith, eraman, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57537 Patch by: Abdoul-Kader Keita llvm-svn: 353071	2019-02-04 18:46:25 +00:00
Craig Topper	8ea72a8201	[X86] Add ST0 as an implicit def/use of x87 load/store instructions during FP stackifying. These instructions implicitly operate on ST0, but we don't currently add that information to the MachineInstr. We also don't add it to the tablegen definitions either. For the most part this doesn't cause any problems because the stackifying occurs after register allocation. All the instructions are marked as having side effects so the postRA scheduler won't reorder them amongst themselves. But nothing stops inline assembly using X87 instructions from being reordered around other x87 instructions if that inline assembly wasn't marked volatile. The two test cases I've identified so far in PR40539 involve loads and stores used to set up the inline assembly or capture the results of the inline assembly ending up in the wrong order. This patch adds implicit ST0 uses/defs to the load/store instructions to prevent this from happening. I plan to fix all of the FP instructions, but the binops are a bit trickier to get right. So I've chosen fixing the known test cases as a good first step. I think we also need to update the tablegen descriptions so MS inline assembly infers the right clobbers, but I haven't checked that yet. Differential Revision: https://reviews.llvm.org/D57644 llvm-svn: 353070	2019-02-04 18:43:55 +00:00
Matt Arsenault	0723828675	GlobalISel: Fix moreElementsToNextPow2 This was completely broken. The condition was inverted, and changed the element type for vectors of pointers. Fixes bug 40592. llvm-svn: 353069	2019-02-04 18:42:24 +00:00
Wouter van Oortmerssen	0b3cf247c4	[WebAssembly] Make segment/size/type directives optional in asm Summary: These were "boilerplate" that repeated information already present in .functype and end_function, that needed to be repeated to please the particular way our object writing works, and missing them would generate errors. Instead, we generate the information for these automatically so users can concern themselves with writing more canonical wasm functions that always work as expected. Reviewers: dschuff, sbc100 Subscribers: jgravelle-google, aheejin, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D57546 llvm-svn: 353067	2019-02-04 18:03:11 +00:00
Jessica Paquette	834bded9d6	Revert "[GlobalISel] Add IRTranslator support for G_FFLOOR" This reverts commit 8bbd570fd5205a04d88d2e5513a6e4adbd028039. Apparently adding ffloor breaks AMDGPU somehow, so I need to back this out while I look into it. llvm-svn: 353064	2019-02-04 17:32:43 +00:00
Sam Clegg	d1152a267c	[WebAssembly] Rename relocations from R_WEBASSEMBLY_ to R_WASM_ See https://github.com/WebAssembly/tool-conventions/pull/95. This is less typing and IMHO more readable, and it also fits with our naming around the binary format which tends to use the short name. e.g. include/llvm/BinaryFormat/Wasm.h tools/llvm-objdump/WasmDump.cpp etc.. Differential Revision: https://reviews.llvm.org/D57611 llvm-svn: 353062	2019-02-04 17:28:46 +00:00
Craig Topper	bf7593ec4a	[X86] Print all register forms of x87 fadd/fsub/fdiv/fmul as having two arguments where one is %st. All of these instructions consume one encoded register and the other register is %st. They either write the result to %st or the encoded register. Previously we printed both arguments when the encoded register was written. And we printed one argument when the result was written to %st. For the stack popping forms the encoded register is always the destination and we didn't print both operands. This was inconsistent with gcc and objdump and just makes the output assembly code harder to read. This patch changes things to always print both operands making us consistent with gcc and objdump. The parser should still be able to handle the single register forms just as it did before. This also matches the GNU assembler behavior. llvm-svn: 353061	2019-02-04 17:28:18 +00:00
Leonard Chan	68d428e578	[Intrinsic] Unsigned Fixed Point Multiplication Intrinsic Add an intrinsic that takes 2 unsigned integers with the scale of them provided as the third argument and performs fixed point multiplication on them. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Differential Revision: https://reviews.llvm.org/D55625 llvm-svn: 353059	2019-02-04 17:18:11 +00:00
Jessica Paquette	73158e7201	[GlobalISel] Add IRTranslator support for G_FFLOOR Follow-up to https://reviews.llvm.org/D57484 Adds G_FFLOOR to translateKnownIntrinsic and update arm64-irtranslator.ll. Differential Revision: https://reviews.llvm.org/D57485 llvm-svn: 353058	2019-02-04 17:15:34 +00:00
Sanjay Patel	c00bdab4c8	[CGP] use IRBuilder to simplify code This is no-functional-change-intended although there could be intermediate variations caused by a difference in the debug info produced by setting that from the builder's insertion point. I'm updating the IR test file associated with this code just to show that the naming differences from using the builder are visible. The motivation for adding a helper function is that we are likely to extend this code to deal with other overflow ops. llvm-svn: 353056	2019-02-04 16:30:46 +00:00
James Henderson	9652652a32	[CommandLine] Don't print empty sentinel values from EnumValN lists in help text In order to make an option value truly optional, both the ValueOptional attribute and an empty-named value are required. Prior to this change, this empty-named value appears in the command-line help text: -some-option - some help text =v1 - description 1 =v2 - description 2 = - This change improves the help text for these sort of options in a number of ways: 1) ValueOptional options with an empty-named value now print their help text twice: both without and then with '=<value>' after the name. The latter version then lists the allowed values after it. 2) Empty-named values with no help text in ValueOptional options are not listed in the permitted values. -some-option - some help text -some-option=<value> - some help text =v1 - description 1 =v2 - description 2 3) Otherwise empty-named options are printed as =<empty> rather than simply '='. 4) Option values without help text do not have the '-' separator printed. -some-option=<value> - some help text =v1 - description 1 =v2 =<empty> - description It also tweaks the llvm-symbolizer -functions help text to not print a trailing ':' as that looks bad combined with 1) above. This is mostly a reland of r353048 which in turn was a reland of r352750. Reviewed by: ruiu, thopre, mstorsjo Differential Revision: https://reviews.llvm.org/D57030 llvm-svn: 353053	2019-02-04 16:17:57 +00:00
Simon Pilgrim	6e5350a367	[X86][SSE] SimplifyDemandedBitsForTargetNode - PCMPGT(0,X) sign mask For PCMPGT(0, X) patterns where we only demand the sign bit (e.g. BLENDV or MOVMSK) then we can use X directly. Differential Revision: https://reviews.llvm.org/D57667 llvm-svn: 353051	2019-02-04 15:43:36 +00:00
James Henderson	c9e6861a76	Revert r353048. It was causing unexpected unit test failures on build bots. llvm-svn: 353050	2019-02-04 15:09:58 +00:00
James Henderson	d90b5a2e51	[CommandLine] Don't print empty sentinel values from EnumValN lists in help text In order to make an option value truly optional, both the ValueOptional attribute and an empty-named value are required. Prior to this change, this empty-named value appears in the command-line help text: -some-option - some help text =v1 - description 1 =v2 - description 2 = - This change improves the help text for these sort of options in a number of ways: 1) ValueOptional options with an empty-named value now print their help text twice: both without and then with '=<value>' after the name. The latter version then lists the allowed values after it. 2) Empty-named values with no help text in ValueOptional options are not listed in the permitted values. -some-option - some help text -some-option=<value> - some help text =v1 - description 1 =v2 - description 2 3) Otherwise empty-named options are printed as =<empty> rather than simply '='. 4) Option values without help text do not have the '-' separator printed. -some-option=<value> - some help text =v1 - description 1 =v2 =<empty> - description It also tweaks the llvm-symbolizer -functions help text to not print a trailing ':' as that looks bad combined with 1) above. This is mostly a reland of r352750. Reviewed by: ruiu, thopre, mstorsjo Differential Revision: https://reviews.llvm.org/D57030 llvm-svn: 353048	2019-02-04 14:48:33 +00:00
Matt Arsenault	56edf3f344	GlobalISel: Fix formatting of debug output There was a missing space before the instruction name, and the newline is redundant since MI::print by default adds one. llvm-svn: 353046	2019-02-04 14:05:33 +00:00
Matt Arsenault	10547230f3	AMDGPU/GlobalISel: Legalize select for v4s16 Also add some more select tests to help show future legalization changes. llvm-svn: 353045	2019-02-04 14:04:52 +00:00
Simon Pilgrim	a536b89fe0	[DAGCombine] Add ADD(SUB,SUB) combines Noticed while investigating PR40483, and fixes the basic test case from the bug - but not a more general case. We're pretty weak at dealing with ADD/SUB combines compared to the SimplifyAssociativeOrCommutative/SimplifyUsingDistributiveLaws abilities that InstCombine can manage. llvm-svn: 353044	2019-02-04 13:44:49 +00:00
Andrea Di Biagio	edbf06a767	[AsmPrinter] Remove hidden flag -print-schedule. This patch removes hidden codegen flag -print-schedule effectively reverting the logic originally committed as r300311 (https://llvm.org/viewvc/llvm-project?view=revision&revision=300311). Flag -print-schedule was originally introduced by r300311 to address PR32216 (https://bugs.llvm.org/show_bug.cgi?id=32216). That bug was about adding "Better testing of schedule model instruction latencies/throughputs". These days, we can use llvm-mca to test scheduling models. So there is no longer a need for flag -print-schedule in LLVM. The main use case for PR32216 is now addressed by llvm-mca. Flag -print-schedule is mainly used for debugging purposes, and it is only actually used by x86 specific tests. We already have extensive (latency and throughput) tests under "test/tools/llvm-mca" for X86 processor models. That means, most (if not all) existing -print-schedule tests for X86 are redundant. When flag -print-schedule was first added to LLVM, several files had to be modified; a few APIs gained new arguments (see for example method MCAsmStreamer::EmitInstruction), and MCSubtargetInfo/TargetSubtargetInfo gained a couple of getSchedInfoStr() methods. Method getSchedInfoStr() had to originally work for both MCInst and MachineInstr. The original implementation of getSchedInfoStr() introduced a subtle layering violation (reported as PR37160 and then fixed/worked-around by r330615). In retrospect, that new API could have been designed more optimally. We can always query MCSchedModel to get the latency and throughput. More importantly, the "sched-info" string should not have been generated by the subtarget. Note, r317782 fixed an issue where "print-schedule" didn't work very well in the presence of inline assembly. That commit is also reverted by this change. Differential Revision: https://reviews.llvm.org/D57244 llvm-svn: 353043	2019-02-04 12:51:26 +00:00
Simon Pilgrim	9899967464	Use auto for dyn_cast case to save a line. NFCI. llvm-svn: 353041	2019-02-04 12:32:39 +00:00
David Green	b4f36a2196	[ARM] Mark 255 and 65535 as cheap for Thumb1 "And" This prevents Constant Hoisting from pulling the constant out of the block, allowing us to still produce LDRH/UXTH nodes. LDRB/UXTB (255) is already cheap by the default getIntImmCost, but I've added it for clarity. Differential Revision: https://reviews.llvm.org/D57671 llvm-svn: 353040	2019-02-04 11:58:48 +00:00
Max Kazantsev	56b57e3f53	[NFC] Make a check in GuardWidening more obvious llvm-svn: 353038	2019-02-04 10:41:17 +00:00
Max Kazantsev	09802f41cc	[NFC] Rename variables to reflect the actual status of GuardWidening llvm-svn: 353036	2019-02-04 10:31:18 +00:00
Max Kazantsev	13ab5cbb64	[NFC] Remove redundant parameters for better readability llvm-svn: 353034	2019-02-04 10:20:51 +00:00
Max Kazantsev	65970aa24d	[NFC] Replace equivalent condition for better readability llvm-svn: 353032	2019-02-04 09:55:18 +00:00
Clement Courbet	1bb0e5ccfb	[SelectionDAG] Add a BaseIndexOffset::print() method for debugging. llvm-svn: 353028	2019-02-04 09:30:43 +00:00
Max Kazantsev	437ee05885	[SCEV] Do not bother creating separate SCEVUnknown for unreachable nodes Currently, SCEV creates SCEVUnknown for every node of unreachable code. If we have a huge amount of such code, we will be littering SE with these nodes. We could just state that they all are undef and save some memory. Differential Revision: https://reviews.llvm.org/D57567 Reviewed By: sanjoy llvm-svn: 353017	2019-02-04 05:04:19 +00:00
Craig Topper	b5e945c260	Recommit r352660 "[X86] Mark EMMS and FEMMS as clobbering MM0-7 and ST0-7." We now print ST0 as 'st' when generating the clobber list for MS inline assembly in clang. This matches what the gcc reg name list expects. Original commit message: This fixes the test case in PR35982 by preventing MMX instructions that read MM0-7 from being moved below EMMS/FEMMS by the post RA scheduler. Though as discussed in bugzilla, this is not a complete fix. There is still the possibility of reordering in IR or by the pre-RA scheduler. Differential Revision: https://reviews.llvm.org/D57298 llvm-svn: 353016	2019-02-04 04:44:20 +00:00
Craig Topper	7a2944efe1	[X86] Print %st(0) as %st when it's implicit to the instruction. Continue printing it as %st(0) when it's encoded in the instruction. This is a step back from the change I made in r352985. This appears to be more consistent with gcc and objdump behavior. llvm-svn: 353015	2019-02-04 04:15:10 +00:00
Craig Topper	f77b858dc3	Revert r352985 "[X86] Print %st(0) as %st to match what gcc inline asm uses as the clobber name to make MS inline asm work correctly" Looking into gcc and objdump behavior more this was overly aggressive. If the register is encoded in the instruction we should print %st(0), if it's implicit we should print %st. I'll be making a more directed change in a future patch. llvm-svn: 353013	2019-02-04 04:15:02 +00:00
Davide Italiano	73929c4d24	[LoopIdiomRecognize] @llvm.dbg values shouldn't affect the transformation. Summary: PR40564 Reviewers: aprantl, rnk Subscribers: llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D57629 llvm-svn: 353007	2019-02-03 20:33:20 +00:00
Sanjay Patel	84ceae6048	[CGP] adjust target constraints for forming uaddo There are 2 changes visible here: 1. There's no reason to limit this transform based on number of condition registers. That diff allows PPC to produce slightly better (dot-instructions should be generally good) code. Note: someone that cares about PPC codegen might want to look closer at that output because it seems like we could still improve this. 2. We (probably?) should not bother trying to form uaddo (or other overflow ops) when there's no target support for such an op. This goes beyond checking whether the op is expanded because both PPC and AArch64 show better codegen for standard types regardless of whether the op is legal/custom. llvm-svn: 353001	2019-02-03 17:53:09 +00:00
Simon Pilgrim	1fce5a8b75	[X86][AVX] Support shuffle combining for VBROADCAST with smaller vector sources getTargetShuffleMask can only do this safely if we're extracting the lowest subvector from a vector of the same result type. llvm-svn: 352999	2019-02-03 16:51:33 +00:00
Simon Pilgrim	18b73a655b	[X86][AVX] Support shuffle combining for VPMOVZX with smaller vector sources llvm-svn: 352997	2019-02-03 16:10:18 +00:00
Simon Pilgrim	a2a3e5b811	[X86][AVX] More aggressively simplify BROADCAST source operand Aim to use scalar source or lowest 128-bit vector directly. We're still missing some VZMOVL_LOAD combines. llvm-svn: 352994	2019-02-03 14:39:41 +00:00
Sanjay Patel	00fcc74e50	[CGP] refactor optimizeCmpExpression (NFCI) This is not truly NFC because we are bailing out without a TLI now. That should not be a real concern though because there should be a TLI in any real-world scenario. That seems better than passing around a pointer and then checking it for null-ness all over the place. The motivation is to fix what appears to be an unintended restriction on the uaddo transform - hasMultipleConditionRegisters() shouldn't be reason to limit the transform. llvm-svn: 352988	2019-02-03 13:48:03 +00:00
Philip Pfaffe	9438585fe4	[DA][NewPM] Handle transitive dependencies in the new-pm version of DA Summary: The analysis result of DA caches pointers to AA, SCEV, and LI, but it never checks for their invalidation. Fix that. Reviewers: chandlerc, dmgreen, bogner Reviewed By: dmgreen Subscribers: hiraditya, bollu, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D56381 llvm-svn: 352986	2019-02-03 12:25:41 +00:00
Craig Topper	5a570dd437	[X86] Print %st(0) as %st to match what gcc inline asm uses as the clobber name to make MS inline asm work correctly Summary: When calculating clobbers for MS style inline assembly we fail if the asm clobbers stack top because we print st(0) and try to pass it through the gcc register name check. This was found when I attempted to make emms/femms clobber all ST registers. If you use emms/femms in MS inline asm we would try to use st(0) as the clobber name but clang would think that wasn't a valid clobber name. This also matches what objdump disassembly prints. It's also what is printed by gcc -S. Reviewers: RKSimon, rnk, efriedma, spatel, andreadb, lebedev.ri Reviewed By: rnk Subscribers: eraman, gbedwell, lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D57621 llvm-svn: 352985	2019-02-03 07:53:39 +00:00
Craig Topper	950ca192f6	[X86] Lower ISD::UADDO to use the Z flag instead of C flag when the RHS is a constant 1 to encourage INC formation. Summary: Add an additional combine to combineCarryThroughADD to reverse it back to the C flag to avoid regressions. I believe this catches the cases that D57547 got. Reviewers: RKSimon, spatel Reviewed By: spatel Subscribers: javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57637 llvm-svn: 352984	2019-02-03 07:25:06 +00:00
Fangrui Song	b21ed3c57a	[AMDGPU] Fix -Wunused-variable after rL352978 llvm-svn: 352982	2019-02-03 03:51:52 +00:00
Dmitry Venikov	aaa709f2ec	[InstSimplify] Missed optimization in math expression: log10(pow(10.0,x)) == x, log2(pow(2.0,x)) == x Summary: This patch enables folding following instructions under -ffast-math flag: log10(pow(10.0,x)) -> x, log2(pow(2.0,x)) -> x Reviewers: hfinkel, spatel, efriedma, craig.topper, zvi, majnemer, lebedev.ri Reviewed By: spatel, lebedev.ri Subscribers: lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D41940 llvm-svn: 352981	2019-02-03 03:48:30 +00:00
Matt Arsenault	888aa5dedd	GlobalISel: Implement widenScalar for G_UNMERGE_VALUES For the scalar case only. Also move the similar G_MERGE_VALUES handling to a separate function and cleanup to make them look more similar. llvm-svn: 352979	2019-02-03 00:07:33 +00:00
Matt Arsenault	0e5d856eb8	GlobalISel: Implement widenScalar for G_EXTRACT vector sources Handle the basic element extract case. llvm-svn: 352978	2019-02-02 23:56:00 +00:00
Matt Arsenault	eb2603cfb2	AMDGPU/GlobalISel: Avoid reporting illegal extloads as legal This avoids breaking a test in a future commit. llvm-svn: 352977	2019-02-02 23:39:13 +00:00
Matt Arsenault	58f9d3df97	AMDGPU/GlobalISel: Legalize icmp for pointer types llvm-svn: 352976	2019-02-02 23:35:15 +00:00
Matt Arsenault	2065c94dd3	AMDGPU/GlobalISel: Legalize constant for pointer types llvm-svn: 352975	2019-02-02 23:33:49 +00:00
Matt Arsenault	2491f82679	AMDGPU/GlobalISel: Legalize select for pointer types llvm-svn: 352974	2019-02-02 23:31:50 +00:00
Matt Arsenault	cbaada6bc1	GlobalISel: Legalization for inttoptr/ptrtoint llvm-svn: 352973	2019-02-02 23:29:55 +00:00
Simon Pilgrim	dbf302c9f1	[X86][AVX] Enable INSERT_SUBVECTOR(SRC0, SHUFFLE(SRC1)) shuffle combining Push the insert_subvector up through the shuffle operands to help find more cross-lane shuffles. This exposes a couple of minor issues that will be fixed shortly: Missed broadcast folds - we have a mixture of vzext_load lengths that need cleaning up combine-sdiv.ll - AVX1 SimplifyDemandedVectorElts failure (hits max depth due to a couple of extra bitcasts). llvm-svn: 352963	2019-02-02 18:08:04 +00:00
Simon Pilgrim	bd42f97946	[SDAG] Add SDNode/SDValue getConstantOperandAPInt helper. NFCI. We already have the getConstantOperandVal helper which returns a uint64_t, but along comes the fuzzer and inserts an i128 -1 constant or something and the whole thing asserts....... I've updated a few obvious cases, and tried to make use of the const reference where possible, but there's more to do. A number of existing oss-fuzz tickets should be fixed if we start using APInt and perform value clamping where necessary. llvm-svn: 352961	2019-02-02 17:35:06 +00:00
Florian Hahn	dd2ef0af46	[LCSSA] Handle case with single new PHI faster. If there is only a single available value, all uses must be dominated by the single value and there is no need to search for a reaching definition. This drastically speeds up LCSSA in some cases. For the test case from PR37202, it speeds up LCSSA construction by 4 times. Time-passes without this patch for test case from PR37202: Total Execution Time: 29.9285 seconds (29.9276 wall clock) ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 5.2786 ( 17.7%) 0.0021 ( 1.2%) 5.2806 ( 17.6%) 5.2808 ( 17.6%) Unswitch loops 4.3739 ( 14.7%) 0.0303 ( 18.1%) 4.4042 ( 14.7%) 4.4042 ( 14.7%) Loop-Closed SSA Form Pass 4.2658 ( 14.3%) 0.0192 ( 11.5%) 4.2850 ( 14.3%) 4.2851 ( 14.3%) Loop-Closed SSA Form Pass #2 2.2307 ( 7.5%) 0.0013 ( 0.8%) 2.2320 ( 7.5%) 2.2318 ( 7.5%) Loop Invariant Code Motion 2.0888 ( 7.0%) 0.0012 ( 0.7%) 2.0900 ( 7.0%) 2.0897 ( 7.0%) Unroll loops 1.6761 ( 5.6%) 0.0013 ( 0.8%) 1.6774 ( 5.6%) 1.6774 ( 5.6%) Value Propagation 1.3686 ( 4.6%) 0.0029 ( 1.8%) 1.3716 ( 4.6%) 1.3714 ( 4.6%) Induction Variable Simplification 1.1457 ( 3.8%) 0.0010 ( 0.6%) 1.1468 ( 3.8%) 1.1468 ( 3.8%) Loop-Closed SSA Form Pass #4 1.1384 ( 3.8%) 0.0005 ( 0.3%) 1.1389 ( 3.8%) 1.1389 ( 3.8%) Loop-Closed SSA Form Pass #6 1.1360 ( 3.8%) 0.0027 ( 1.6%) 1.1387 ( 3.8%) 1.1387 ( 3.8%) Loop-Closed SSA Form Pass #5 1.1331 ( 3.8%) 0.0010 ( 0.6%) 1.1341 ( 3.8%) 1.1340 ( 3.8%) Loop-Closed SSA Form Pass #3 Time passes with this patch Total Execution Time: 19.2802 seconds (19.2813 wall clock) ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 4.4234 ( 23.2%) 0.0038 ( 2.0%) 4.4272 ( 23.0%) 4.4273 ( 23.0%) Unswitch loops 2.3828 ( 12.5%) 0.0020 ( 1.1%) 2.3848 ( 12.4%) 2.3847 ( 12.4%) Unroll loops 1.8714 ( 9.8%) 0.0020 ( 1.1%) 1.8734 ( 9.7%) 1.8735 ( 9.7%) Loop Invariant Code Motion 1.7973 ( 9.4%) 0.0022 ( 1.2%) 1.7995 ( 9.3%) 1.8003 ( 9.3%) Value Propagation 1.4010 ( 7.3%) 0.0033 ( 1.8%) 1.4043 ( 7.3%) 1.4044 ( 7.3%) Induction Variable Simplification 0.9978 ( 5.2%) 0.0244 ( 13.1%) 1.0222 ( 5.3%) 1.0224 ( 5.3%) Loop-Closed SSA Form Pass #2 0.9611 ( 5.0%) 0.0257 ( 13.8%) 0.9868 ( 5.1%) 0.9868 ( 5.1%) Loop-Closed SSA Form Pass 0.5856 ( 3.1%) 0.0015 ( 0.8%) 0.5871 ( 3.0%) 0.5869 ( 3.0%) Unroll loops #2 0.4132 ( 2.2%) 0.0012 ( 0.7%) 0.4145 ( 2.1%) 0.4143 ( 2.1%) Loop Invariant Code Motion #3 Reviewers: efriedma, davide, mzolotukhin Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D57033 llvm-svn: 352960	2019-02-02 15:26:05 +00:00
Florian Hahn	509b48a64a	[LCSSA] Add expensive verification of LCSSA form for sub-loops. This assertion makes sure all sub-loops are in LCSSA form before bringing their parent in LCSSA form. This precondition was added to formLCSSA in D56848. Reviewers: davide, efriedma, mzolotukhin Reviewed By: davide Differential Revision: https://reviews.llvm.org/D56921 llvm-svn: 352958	2019-02-02 14:42:27 +00:00
Yonghong Song	fa3654008b	[BPF] [BTF] Process FileName with absolute path correctly In IR, sometimes the following attributes for DIFile may be generated: filename: /home/yhs/test.c directory: /tmp The /tmp may represent the working directory of the compilation process. In such cases, since filename is with absolute path, the directory should be ignored by BTF. The filename alone is enough to get the source. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 352952	2019-02-02 05:54:59 +00:00
Julian Lettner	f82d8924ef	[ASan] Do not instrument other runtime functions with `__asan_handle_no_return` Summary: Currently, ASan inserts a call to `__asan_handle_no_return` before every `noreturn` function call/invoke. This is unnecessary for calls to other runtime functions. This patch changes ASan to skip instrumentation for function calls marked with `!nosanitize` metadata. Reviewers: TODO Differential Revision: https://reviews.llvm.org/D57489 llvm-svn: 352948	2019-02-02 02:05:16 +00:00
Mandeep Singh Grang	2be4eabb6f	[AutoUpgrade] Fix AutoUpgrade for x86.seh.recoverfp Summary: This fixes the bug in https://reviews.llvm.org/D56747#inline-502711. Reviewers: efriedma Reviewed By: efriedma Subscribers: javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57614 llvm-svn: 352945	2019-02-02 01:32:48 +00:00
Yonghong Song	329010e1b5	Revert "[BPF] [BTF] Process FileName with absolute path correctly" This reverts commit r352939. Some tests failed. Revert to unblock others. llvm-svn: 352941	2019-02-01 23:49:52 +00:00
Mandeep Singh Grang	dc1e778369	[AArch64] Fix unused variable [NFC] llvm-svn: 352940	2019-02-01 23:42:34 +00:00
Yonghong Song	5233fb8f5e	[BPF] [BTF] Process FileName with absolute path correctly In IR, sometimes the following attributes for DIFile may be generated: filename: /home/yhs/test.c directory: /tmp The /tmp may represent the working directory of the compilation process. In such cases, since filename is with absolute path, the directory should be ignored by BTF. The filename alone is enough to get the source. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 352939	2019-02-01 23:23:17 +00:00
Philip Reames	00056ed0e6	[CodeGen] Be as conservative about atomic accesses as for volatile Background: At the moment, we record the AtomicOrdering of an access in the MMO, but also mark any atomic access as volatile in SelectionDAG. I'm working towards separating that. See https://reviews.llvm.org/D57601 for context. Update all usages of isVolatile in lib/CodeGen to preserve behaviour once atomic MMOs stop being also volatile. This is NFC in its current form, but is essential for correctness once we make that final change. It is useful to keep in mind that AtomicSDNode is not a parent of LoadSDNode, StoreSDNode, or LSBaseSDNode. As a result, any call to isVolatile on one of those static types doesn't need a companion isAtomic check. We should probably adjust that class hierarchy long term, but for now, that separation is useful. I'm deliberately being conservative about handling. I want the change to stop adding volatile to be NFC itself, and then will work through places where we can be less conservative for atomics one by one in separate changes w/tests. Differential Revision: https://reviews.llvm.org/D57596 llvm-svn: 352937	2019-02-01 22:58:52 +00:00
Dan Gohman	f726e4454c	[WebAssembly] Add codegen support for the import_field attribute This adds the LLVM side of https://reviews.llvm.org/D57602 -- the import_field attribute. See that patch for details. Differential Revision: https://reviews.llvm.org/D57603 llvm-svn: 352931	2019-02-01 22:27:34 +00:00
Mandeep Singh Grang	70d484d94e	[COFF, ARM64] Fix localaddress to handle stack realignment and variable size objects Summary: This fixes using the correct stack registers for SEH when stack realignment is needed or when variable size objects are present. Reviewers: rnk, efriedma, ssijaric, TomTan Reviewed By: rnk, efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D57183 llvm-svn: 352923	2019-02-01 21:41:33 +00:00
Simon Pilgrim	e95550f508	[X86][AVX] Add VMOVDDUP-VPBROADCASTQ execution domain mapping Noticed in D57514. Differential Revision: https://reviews.llvm.org/D57519 llvm-svn: 352922	2019-02-01 21:41:30 +00:00
Jordan Rupprecht	835df27f85	[DebugInfo] Don't use realpath when looking up debug binary locations. Summary: Using realpath makes assumptions about build systems that do not always hold true. The debug binary referred to from the .gnu_debuglink should exist in the same directory (or in a .debug directory, etc.), but the files may only exist as symlinks to differently named files elsewhere, and using realpath causes that lookup to fail. This was added in r189250, and this is basically a revert + regression test case. Reviewers: dblaikie, samsonov, jhenderson Reviewed By: dblaikie Subscribers: llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D57609 llvm-svn: 352916	2019-02-01 21:04:16 +00:00
James Y Knight	291f791ef1	[opaque pointer types] Pass function type for CallBase::setCalledFunction. Differential Revision: https://reviews.llvm.org/D57174 llvm-svn: 352914	2019-02-01 20:44:54 +00:00
James Y Knight	7716075a17	[opaque pointer types] Pass value type to GetElementPtr creation. This cleans up all GetElementPtr creation in LLVM to explicitly pass a value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57173 llvm-svn: 352913	2019-02-01 20:44:47 +00:00
James Y Knight	14359ef1b6	[opaque pointer types] Pass value type to LoadInst creation. This cleans up all LoadInst creation in LLVM to explicitly pass the value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57172 llvm-svn: 352911	2019-02-01 20:44:24 +00:00
James Y Knight	d9e85a0861	[opaque pointer types] Pass function types to InvokeInst creation. This cleans up all InvokeInst creation in LLVM to explicitly pass a function type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57171 llvm-svn: 352910	2019-02-01 20:43:34 +00:00
James Y Knight	7976eb5838	[opaque pointer types] Pass function types to CallInst creation. This cleans up all CallInst creation in LLVM to explicitly pass a function type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57170 llvm-svn: 352909	2019-02-01 20:43:25 +00:00
Michael Liao	8b323f53eb	[InstCombine] Extra null-checking on TFE/LWE support - If that operand is not ConstantInt, skip enabling TFE/LWE. Differential Revision: https://reviews.llvm.org/D57539 llvm-svn: 352904	2019-02-01 19:53:44 +00:00
Roland Froese	7f29195c3f	test commit (add blank line) NFC llvm-svn: 352897	2019-02-01 18:55:43 +00:00
Wolfgang Pieb	58513b7761	[DWARF v5] Fix DWARF emitter and consumer to produce/expect a uleb for a location description's length. Reviewer: davide, JDevliegere Differential Revision: https://reviews.llvm.org/D57550 llvm-svn: 352889	2019-02-01 17:11:58 +00:00
Tim Corringham	fa3e4e5b53	[AMDGPU] Fix for vector element insertion Summary: Incorrect code was generated when lowering insertelement operations for vectors with 8 or 16 bit elements. The value being inserted was not adjusted for the position of the element within the 32 bit word and so only the low element within each 32 bit word could receive the intended value. Fixed by simply replicating the value to each element of a congruent vector before the mask and or operation used to update the intended element. A number of affected LIT tests have been updated appropriately. Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: llvm-commits, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Tags: #llvm Differential Revision: https://reviews.llvm.org/D57588 llvm-svn: 352885	2019-02-01 16:51:09 +00:00
Sanjay Patel	6502b1444d	[SDAG] improve variable names; NFC The version of FoldConstantArithmetic() that takes arbitrary nodes was confusingly naming those nodes as constants when they might not be; also "Cst" reads like "Cast". llvm-svn: 352884	2019-02-01 16:06:53 +00:00
Simon Pilgrim	85184017e9	[X86][SSE] Use PSLLDQ/PSRLDQ to mask out zeroable ends of a shuffle As suggested on PR40318, this patch uses PSLLDQ/PSRLDQ to lower shuffles to zero out the ends of a vector, leaving a sequential inner section. For pre-SSSE3 we do this for shuffles with zeros at either end (requiring up to 3 shifts), but once PSHUFB is available I've limited this to shuffles with a single zeroable end (2 shifts). Differential Revision: https://reviews.llvm.org/D56784 llvm-svn: 352883	2019-02-01 16:02:12 +00:00
Sanjay Patel	0279b5b0b8	[TargetLowering] try harder to determine undef elements of vector binops This might be the start of tracking all vector element constants generally if we take it to its logical conclusion, but let's stop here and make sure this is correct/beneficial so far. The affected tests require a convoluted path before they get simplified currently because we don't call SimplifyDemandedVectorElts() from binops directly and don't modify the binop operands directly in SimplifyDemandedVectorElts(). That's why the tests all have a trailing shuffle to induce a chain reaction of transforms. So something like this is happening: 1. Improve the knowledge of undefs in the binop via a SimplifyDemandedVectorElts() call that originates from a shuffle. 2. Transfer that undef knowledge back to the shuffle mask user as more undef lanes. 3. Combine the modified shuffle by calling SimplifyDemandedVectorElts() again. 4. Translate the improved shuffle mask as undemanded lanes of build vector constants causing those to become full undef constants. 5. Simplify the binop now that it has a full undef operand. As we can see from the unchanged 'and' and 'or' tests, tracking undefs alone isn't a full solution. We would need to track zero and all-ones constants to improve those opcodes. We'd probably need to track NaN for FP ops too (assuming we don't have fast-math-flags set). Differential Revision: https://reviews.llvm.org/D57066 llvm-svn: 352880	2019-02-01 15:35:12 +00:00
Simon Pilgrim	1a529f58f9	[X86][AVX] Combine INSERT_SUBVECTOR(SRC0, BITCAST(SHUFFLE(EXTRACT_SUBVECTOR(SRC1))) Enable peeking through one use bitcasts to the subvector shuffle. This still depends on the subvector being the same scalar-size but D57514 has already helped with the more tricky patterns llvm-svn: 352879	2019-02-01 15:31:01 +00:00
Sanjay Patel	fbcbac7174	[InstCombine] reduce duplicate code; NFC An unused variable problem was introduced with rL352870 and stubbed out with rL352871, but we can make a better fix by actually using the local variable in code rather than just the assert. llvm-svn: 352873	2019-02-01 14:37:49 +00:00
Fangrui Song	8495aabec2	[InstCombine] Fix -Wunused-variable when -DLLVM_ENABLE_ASSERTIONS=off llvm-svn: 352871	2019-02-01 14:22:02 +00:00
Sanjay Patel	be23a91fcd	[InstCombine] try to reduce x86 addcarry to generic uaddo intrinsic If we can reduce the x86-specific intrinsic to the generic op, it allows existing simplifications and value tracking folds. AFAICT, this always results in identical x86 codegen in the non-reduced case...which should be true because we semi-generically (too aggressively IMO) convert to llvm.uadd.with.overflow in CGP, so the DAG/isel must already combine/lower this intrinsic as expected. This isn't quite what was requested in: https://bugs.llvm.org/show_bug.cgi?id=40486 ...but we want to have these kinds of folds early for efficiency and to enable greater simplifications. For the case in the bug report where we have: _addcarry_u64(0, ahi, 0, &ahi) ...this gets completely simplified away in IR. Differential Revision: https://reviews.llvm.org/D57453 llvm-svn: 352870	2019-02-01 14:14:47 +00:00
Adhemerval Zanella	b3ccc5550d	[AArch64] Optimize floating point materialization This patch changes isFPImmLegal to also return true if the value can be encoded as the immediate operand of a logical instruction, besides checking the immediate field for fmov. This optimizes some floating point materialization, including values used in isinf lowering. Reviewed By: rengolin, efriedma, evandro Differential Revision: https://reviews.llvm.org/D57044 llvm-svn: 352866	2019-02-01 12:26:06 +00:00
Roman Lebedev	7857215f8e	[X86][BdVer2] Transfer delays from the integer to the floating point unit. Summary: I'm unable to find this number in the "AMD SOG for family 15h". llvm-exegesis measures the latencies of these instructions as `2`, which matches the latencies specified in "AMD SOG for family 15h". However if we look at Agner, Microarchitecture, "AMD Bulldozer, Piledriver, Steamroller and Excavator pipeline", "Data delay between different execution domains", the int->ivec transfer is listed as `8`..`10`cy of additional latency. Also, Agner's "Instruction tables", for Piledriver, lists their latencies as `12`, which is consistent with `2cy` from exegesis / AMD SOG + `10cy` transfer delay. Additional data point comes from the fact that Agner's "Instruction tables", for Jaguar, lists their latencies as `8`; and "AMD SOG for family 16h" does state the `+6cy` int->ivec delay, which is consistent with instr latency of `1` or `2`. Reviewers: andreadb, RKSimon, craig.topper Reviewed By: andreadb Subscribers: gbedwell, courbet, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57300 llvm-svn: 352861	2019-02-01 11:15:13 +00:00
Yevgeny Rouban	15b17d0a7c	Provide reason messages for unviable inlining InlineCost's isInlineViable() is changed to return InlineResult instead of bool. This provides messages for failure reasons and allows to get more specific messages for cases where callsites are not viable for inlining. Reviewed By: xbolva00, anemet Differential Revision: https://reviews.llvm.org/D57089 llvm-svn: 352849	2019-02-01 10:44:43 +00:00
James Henderson	212833ce76	Revert r352750. This was causing a build bot failure: http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/15346/ llvm-svn: 352848	2019-02-01 10:38:40 +00:00
Oliver Stannard	bac11518cd	[CodeGen] Don't scavenge non-saved regs in exception throwing functions Previously, LiveRegUnits was assuming that if a block has no successors and does not return, then no registers are live at the end of it (because the end of the block is unreachable). This was causing the register scavenger to use callee-saved registers to materialise stack frame addresses without saving them in the prologue. This would normally be fine, because the end of the block is unreachable, but this is not legal if the block ends by throwing a C++ exception. If this happens, the scratch register will be modified, but its previous value won't be preserved, so it doesn't get restored by the exception unwinder. Differential revision: https://reviews.llvm.org/D57381 llvm-svn: 352844	2019-02-01 09:23:51 +00:00
Yevgeny Rouban	4cdd783955	[SLPVectorizer] Get rid of IndexQueue array from vectorizeStores. NFCI. Indices are checked as they are generated. No need to fill the whole array of indices. Differential Revision: https://reviews.llvm.org/D57144 llvm-svn: 352839	2019-02-01 06:44:08 +00:00
Alex Bradbury	7539fa2c2d	[RISCV] Implement RV64D codegen This patch: * Adds necessary RV64D codegen patterns * Modifies CC_RISCV so it will properly handle f64 types (with soft float ABI) Note that in general there is no reason to try to select fcvt.w[u].d rather than fcvt.l[u].d for i32 conversions because fptosi/fptoui produce poison if the input won't fit into the target type. Differential Revision: https://reviews.llvm.org/D53237 llvm-svn: 352833	2019-02-01 03:53:30 +00:00
Alex Bradbury	32b77383ec	[SelectionDAG] Support promotion of the FPOWI integer operand For targets where i32 is not a legal type (e.g. 64-bit RISC-V), LegalizeIntegerTypes must promote the integer operand of ISD::FPOWI. As this is a signed value, this should be sign-extended. This patch enables all tests in test/CodeGen/RISCVfloat-intrinsics.ll for RV64, as prior to this patch that file couldn't be compiled for RV64 due to an assertion when performing codegen for fpowi. Differential Revision: https://reviews.llvm.org/D54574 llvm-svn: 352832	2019-02-01 03:46:28 +00:00
James Y Knight	13680223b9	[opaque pointer types] Add a FunctionCallee wrapper type, and use it. Recommit r352791 after tweaking DerivedTypes.h slightly, so that gcc doesn't choke on it, hopefully. Original Message: The FunctionCallee type is effectively a {FunctionType,Value} pair, and is a useful convenience to enable code to continue passing the result of getOrInsertFunction() through to EmitCall, even once pointer types lose their pointee-type. Then: - update the CallInst/InvokeInst instruction creation functions to take a Callee, - modify getOrInsertFunction to return FunctionCallee, and - update all callers appropriately. One area of particular note is the change to the sanitizer code. Previously, they had been casting the result of `getOrInsertFunction` to a `Function*` via `checkSanitizerInterfaceFunction`, and storing that. That would report an error if someone had already inserted a function declaration with a mismatching signature. However, in general, LLVM allows for such mismatches, as `getOrInsertFunction` will automatically insert a bitcast if needed. As part of this cleanup, cause the sanitizer code to do the same. (It will call its functions using the expected signature, however they may have been declared.) Finally, in a small number of locations, callers of `getOrInsertFunction` actually were expecting/requiring that a brand new function was being created. In such cases, I've switched them to Function::Create instead. Differential Revision: https://reviews.llvm.org/D57315 llvm-svn: 352827	2019-02-01 02:28:03 +00:00
Kostya Serebryany	a78a44d480	[sanitizer-coverage] prune trace-cmp instrumentation for CMP instructions that feed into the backedge branch. Instrumenting these CMP instructions is almost always useless (and harmful) for fuzzing. llvm-svn: 352818	2019-01-31 23:43:00 +00:00
Matt Arsenault	50d6579bac	GlobalISel: Fix MMO creation with non-power-of-2 mem size It should probably just be mandatory for getTgtMemIntrinsic to return the alignment. llvm-svn: 352817	2019-01-31 23:41:23 +00:00
Thomas Lively	9a48438832	[WebAssembly] Fix a regression selecting negative build_vector lanes Summary: The custom lowering introduced in rL352592 creates build_vector nodes with negative i32 operands, but these operands did not meet the value range constraints necessary to match build_vector nodes. This CL fixes the issue by removing the unnecessary constraints. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish Differential Revision: https://reviews.llvm.org/D57481 llvm-svn: 352813	2019-01-31 23:22:39 +00:00
Alex Bradbury	d834d8301d	[RISCV] Add RV64F codegen support This requires a little extra work due to the fact that i32 is not a legal type. When call lowering happens post-legalisation (e.g. when an intrinsic was inserted during legalisation), a bitcast from f32 to i32 can't be introduced. This is similar to the challenges with RV32D. To handle this, we introduce target-specific DAG nodes that perform bitcast+anyext for f32->i64 and trunc+bitcast for i64->f32. Differential Revision: https://reviews.llvm.org/D53235 llvm-svn: 352807	2019-01-31 22:48:38 +00:00
Sam Clegg	c0affde863	[WebAssembly] MC: Fix for outputting wasm object to /dev/null Subscribers: dschuff, jgravelle-google, aheejin, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D57479 llvm-svn: 352806	2019-01-31 22:38:22 +00:00
Richard Trieu	8f6182f7f6	[Hexagon] Rename textually included file from .h to .inc llvm-svn: 352802	2019-01-31 21:58:42 +00:00
James Y Knight	fadf25068e	Revert "[opaque pointer types] Add a FunctionCallee wrapper type, and use it." This reverts commit `f47d6b38c7` (r352791). Seems to run into compilation failures with GCC (but not clang, where I tested it). Reverting while I investigate. llvm-svn: 352800	2019-01-31 21:51:58 +00:00
Alina Sbirlea	e271889291	[EarlyCSE & MSSA] Cleanup special handling for removing MemoryAccesses. Summary: Moving special handling to MemorySSAUpdater in D57199. Reviewers: gberry, george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D57200 llvm-svn: 352794	2019-01-31 21:12:41 +00:00
Thomas Lively	88058d4e1e	[WebAssembly] Add bulk memory target feature Summary: Also clean up some preexisting target feature code. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, jfb Differential Revision: https://reviews.llvm.org/D57495 llvm-svn: 352793	2019-01-31 21:02:19 +00:00
Guozhi Wei	0bed9e0453	[DAGCombine] Avoid CombineZExtLogicopShiftLoad if there is free ZEXT This patch fixes pr39098. For the attached test case, CombineZExtLogicopShiftLoad can optimize it to t25: i64 = Constant<1099511627775> t35: i64 = Constant<0> t0: ch = EntryToken t57: i64,ch = load<(load 4 from `i40* undef`, align 8), zext from i32> t0, undef:i64, undef:i64 t58: i64 = srl t57, Constant:i8<1> t60: i64 = and t58, Constant:i64<524287> t29: ch = store<(store 5 into `i40* undef`, align 8), trunc to i40> t57:1, t60, undef:i64, undef:i64 But later visitANDLike transforms it to t25: i64 = Constant<1099511627775> t35: i64 = Constant<0> t0: ch = EntryToken t57: i64,ch = load<(load 4 from `i40* undef`, align 8), zext from i32> t0, undef:i64, undef:i64 t61: i32 = truncate t57 t63: i32 = srl t61, Constant:i8<1> t64: i32 = and t63, Constant:i32<524287> t65: i64 = zero_extend t64 t58: i64 = srl t57, Constant:i8<1> t60: i64 = and t58, Constant:i64<524287> t29: ch = store<(store 5 into `i40* undef`, align 8), trunc to i40> t57:1, t60, undef:i64, undef:i64 And it triggers CombineZExtLogicopShiftLoad again, causes a dead loop. Both forms should generate same instructions, CombineZExtLogicopShiftLoad generated IR looks cleaner. But it looks more difficult to prevent visitANDLike to do the transform, so I prevent CombineZExtLogicopShiftLoad to do the transform if the ZExt is free. Differential Revision: https://reviews.llvm.org/D57491 llvm-svn: 352792	2019-01-31 20:46:42 +00:00
James Y Knight	f47d6b38c7	[opaque pointer types] Add a FunctionCallee wrapper type, and use it. The FunctionCallee type is effectively a {FunctionType,Value} pair, and is a useful convenience to enable code to continue passing the result of getOrInsertFunction() through to EmitCall, even once pointer types lose their pointee-type. Then: - update the CallInst/InvokeInst instruction creation functions to take a Callee, - modify getOrInsertFunction to return FunctionCallee, and - update all callers appropriately. One area of particular note is the change to the sanitizer code. Previously, they had been casting the result of `getOrInsertFunction` to a `Function*` via `checkSanitizerInterfaceFunction`, and storing that. That would report an error if someone had already inserted a function declaration with a mismatching signature. However, in general, LLVM allows for such mismatches, as `getOrInsertFunction` will automatically insert a bitcast if needed. As part of this cleanup, cause the sanitizer code to do the same. (It will call its functions using the expected signature, however they may have been declared.) Finally, in a small number of locations, callers of `getOrInsertFunction` actually were expecting/requiring that a brand new function was being created. In such cases, I've switched them to Function::Create instead. Differential Revision: https://reviews.llvm.org/D57315 llvm-svn: 352791	2019-01-31 20:35:56 +00:00
Alina Sbirlea	240a90a57e	[MemorySSA] Extend removeMemoryAccess API to optimize MemoryPhis. Summary: EarlyCSE needs to optimize MemoryPhis after an access is removed and has special handling for it. This should be handled by MemorySSA instead. The default remains that MemoryPhis are not optimized after an access is removed. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D57199 llvm-svn: 352787	2019-01-31 20:13:47 +00:00
Nirav Dave	b792299d83	[DAG][SystemZ] Define unwrapAddress for PCREL_WRAPPER. Summary: Like with X86, this allows better DAG-level alias analysis and alignment inference for wrapped addresses. Reviewers: jonpa, uweigand Reviewed By: uweigand Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D57407 llvm-svn: 352786	2019-01-31 19:58:34 +00:00
Nirav Dave	4061b44057	[DAG] Aggressively cleanup dangling node in CombineZExtLogicopShiftLoad. While dangling nodes will eventually be pruned when they are considered, leaving them disables combines requiring single-use. Reviewers: Carrot, spatel, craig.topper, RKSimon, efriedma Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D57520 llvm-svn: 352784	2019-01-31 19:35:14 +00:00
Leonard Chan	ae527ac603	[Intrinsic] Expand SMULFIX to MUL, MULH[US], or [US]MUL_LOHI on vector arguments For zero scale SMULFIX, expand into MUL which produces better code for X86. For vector arguments, expand into MUL if SMULFIX is provided with a zero scale. Otherwise, expand into MULH[US] or [US]MUL_LOHI. Differential Revision: https://reviews.llvm.org/D56987 llvm-svn: 352783	2019-01-31 19:15:37 +00:00
Craig Topper	a8f0745440	Revert "[X86] Mark EMMS and FEMMS as clobbering MM0-7 and ST0-7." This is causing a failure in chromium llvm-svn: 352782	2019-01-31 19:05:22 +00:00
Philip Reames	ede49ddff5	Lower widenable_conditions in CGP This ensures that if we make it to the backend w/o lowering widenable_conditions first, that we generate correct code. Doing it in CGP - instead of isel - lets us fold control flow before hitting block local instruction selection. Differential Revision: https://reviews.llvm.org/D57473 llvm-svn: 352779	2019-01-31 18:45:46 +00:00
Simon Pilgrim	00cefe1158	Trim trailing whitespace. NFCI. llvm-svn: 352775	2019-01-31 17:49:25 +00:00
Simon Pilgrim	eb6aef6db3	[X86][AVX] Fold concat(broadcast(x),broadcast(x)) -> broadcast(x) Differential Revision: https://reviews.llvm.org/D57514 llvm-svn: 352774	2019-01-31 17:48:35 +00:00
Simon Pilgrim	d04a2d2d5e	[X86][AVX] insert_subvector(bitcast(v), bitcast(s), c1) -> bitcast(insert_subvector(v,s,c2)) Similar to what we already do in DAGCombiner, but this version also handles bitcasts from types with different scalar sizes, which x86 is better at handling. Differential Revision: https://reviews.llvm.org/D57514 llvm-svn: 352773	2019-01-31 17:38:10 +00:00
Craig Topper	c1892ec15a	[CallSite removal] Remove CallSite uses from InstCombine. Reviewers: chandlerc Reviewed By: chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D57494 llvm-svn: 352771	2019-01-31 17:23:29 +00:00
Teresa Johnson	f59242e5ff	Recommit "[ThinLTO] Rename COMDATs for COFF when promoting/renaming COMDAT leader" Recommit of r352763 with fix for use after free. llvm-svn: 352770	2019-01-31 17:18:11 +00:00
Teresa Johnson	4877715ee6	Revert "[ThinLTO] Rename COMDATs for COFF when promoting/renaming COMDAT leader" This reverts commit r352763. Causing a couple bot failures, root cause pointed to by sanitizer bot: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/28909/steps/annotate/logs/stdio Use after free. I understand the issue but will revert and test with fix before recommitting. llvm-svn: 352768	2019-01-31 16:46:14 +00:00
Teresa Johnson	992b53fd16	[ThinLTO] Rename COMDATs for COFF when promoting/renaming COMDAT leader Summary: COFF requires that COMDAT name match that of the leader. When we promote and rename an internal leader in ThinLTO due to an import, ensure we subsequently rename the associated COMDAT. Similar to D31963 which did this during ThinLTO module splitting. Fixes PR40414. Reviewers: pcc, inglorion Subscribers: mehdi_amini, dexonsmith, dmajor, llvm-commits Differential Revision: https://reviews.llvm.org/D57395 llvm-svn: 352763	2019-01-31 16:00:15 +00:00
Simon Pilgrim	63f3383ece	[X86][AVX] Fold broadcast(bitcast(src)) -> bitcast(broadcast(src)) llvm-svn: 352751	2019-01-31 14:04:07 +00:00
James Henderson	140f75f625	[CommandLine] Improve help text for cl::values style options In order to make an option value truly optional, both the ValueOptional and an empty-named value are required. This empty-named value appears in the command-line help text, which is not ideal. This change improves the help text for these sort of options in a number of ways: 1) ValueOptional options with an empty-named value now print their help text twice: both without and then with '=<value>' after the name. The latter version then lists the allowed values after it. 2) Empty-named values with no help text in ValueOptional options are not listed in the permitted values. 3) Otherwise empty-named options are printed as =<empty> rather than simply '='. 4) Option values without help text do not have the '-' separator printed. It also tweaks the llvm-symbolizer -functions help text to not print a trailing ':' as that looks bad combined with 1) above. Reviewed by: thopre, ruiu Differential Revision: https://reviews.llvm.org/D57030 llvm-svn: 352750	2019-01-31 13:58:48 +00:00
Simon Pilgrim	a001008a09	[X86] combineExtractWithShuffle - more aggressively peek through bitcasts Fixes regression introduced by rL352743 llvm-svn: 352745	2019-01-31 11:55:30 +00:00
Simon Pilgrim	b96a2c7fed	[X86][AVX] Enable AVX1 broadcasts in shuffle combining Enables 32/64-bit scalar load broadcasts on AVX1 targets The extractelement-load.ll regression will be fixed shortly in a followup commit. llvm-svn: 352743	2019-01-31 11:41:10 +00:00
Simon Pilgrim	51c2efc104	[X86][AVX] Fold vt1 concat_vectors(vt2 undef, vt2 broadcast(x)) --> vt1 broadcast(x) If we're not inserting the broadcast into the lowest subvector then we can avoid the insertion by just performing a larger broadcast. Avoids a regression when we enable AVX1 broadcasts in shuffle combining llvm-svn: 352742	2019-01-31 11:15:05 +00:00
Max Kazantsev	f392bc846f	Default lowering for experimental.widenable.condition Introduces a pass that provides default lowering strategy for the `experimental.widenable.condition` intrinsic, replacing all its uses with `i1 true`. Differential Revision: https://reviews.llvm.org/D56096 Reviewed By: reames llvm-svn: 352739	2019-01-31 09:10:17 +00:00
Yevgeny Rouban	ae29857d64	Test commit. NFCI. llvm-svn: 352738	2019-01-31 08:49:20 +00:00
Sjoerd Meijer	f222259c3c	[ARM] Thumb2: ConstantMaterializationCost Constants can also be materialised using the negated value and a MVN, and this case seems to have been missed for Thumb2. To check the constant materialisation costs, we now call getT2SOImmVal twice, once for the original constant and then also for its negated value, and this function checks if the constant can both be splatted or rotated. This was revealed by a test that optimises for minsize: instead of a LDR literal pool load and having a literal pool entry, just a MVN with an immediate is smaller (and also faster). Differential Revision: https://reviews.llvm.org/D57327 llvm-svn: 352737	2019-01-31 08:38:06 +00:00
Sjoerd Meijer	f7cc34cae8	[SelectionDAG] Codesize: don't expand SHIFT to SHIFT_PARTS And instead just generate a libcall. My motivating example on ARM was a simple: shl i64 %A, %B for which the code bloat is quite significant. For other targets that also accept __int128/i128 such as AArch64 and X86, it is also beneficial for these cases to generate a libcall when optimising for minsize. On these 64-bit targets, the 64-bits shifts are of course unaffected because the SHIFT/SHIFT_PARTS lowering operation action is not set to custom/expand. Differential Revision: https://reviews.llvm.org/D57386 llvm-svn: 352736	2019-01-31 08:07:30 +00:00
Dmitry Venikov	8817658836	[InstCombine] Missed optimization in math expression: simplify calls exp functions Summary: This patch enables folding following expressions under -ffast-math flag: exp(X) * exp(Y) -> exp(X + Y), exp2(X) * exp2(Y) -> exp2(X + Y). Motivation: https://bugs.llvm.org/show_bug.cgi?id=35594 Reviewers: hfinkel, spatel, efriedma, lebedev.ri Reviewed By: spatel, lebedev.ri Subscribers: lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D41342 llvm-svn: 352730	2019-01-31 06:28:10 +00:00
Max Kazantsev	b37419ef66	[SCEV] Prohibit SCEV transformations for huge SCEVs Currently SCEV attempts to limit transformations so that they do not work with big SCEVs (that may take almost infinite compile time). But for this, it uses heuristics such as recursion depth and number of operands, which do not give us a guarantee that we don't actually have big SCEVs. This situation is still possible, though it is not likely to happen. However, the bug PR33494 showed a bunch of simple corner case tests where we still produce huge SCEVs, even without reaching big recursion depth etc. This patch introduces a concept of 'huge' SCEVs. A SCEV is huge if its expression size (introduced in D35989) exceeds some threshold value. We prohibit optimizing transformations if any of SCEVs we are dealing with is huge. This gives us a reliable check that we don't spend too much time working with them. As the next step, we can possibly get rid of old limiting mechanisms, such as recursion depth thresholds. Differential Revision: https://reviews.llvm.org/D35990 Reviewed By: reames llvm-svn: 352728	2019-01-31 06:19:25 +00:00
Richard Trieu	108b892939	Add namespace to some types. llvm-svn: 352725	2019-01-31 04:33:11 +00:00
David L. Jones	d81f23071c	Revert "Reapply "[CGP] Check for existing inttotpr before creating new one"" This change reverts r351626. The changes in r351626 cause quadratic work in several cases. (See r351626 thread on llvm-commits for details.) llvm-svn: 352722	2019-01-31 03:28:46 +00:00
Matt Arsenault	c7bce739ad	GlobalISel: Handle odd splits in fewerElementsVector for load/store llvm-svn: 352720	2019-01-31 02:46:05 +00:00
Matt Arsenault	d1bfc8d0c3	GlobalISel: Implement narrowScalar for bswap llvm-svn: 352719	2019-01-31 02:34:03 +00:00
Matt Arsenault	cf4db733d8	GlobalISel: Don't call changingInstruction before giving up llvm-svn: 352718	2019-01-31 02:22:39 +00:00
Matt Arsenault	d5684f76e0	GlobalISel: Allow bitcount ops to have different result type For AMDGPU the result is always 32-bit for 64-bit inputs. llvm-svn: 352717	2019-01-31 02:09:57 +00:00
Matt Arsenault	8db2001d52	GlobalISel: Use helper function for MMO splitting Also fix an alignment bug getMachineMemOperand. If the tracked value is null, the offset isn't tracked so the base alignment needs to be reduced. llvm-svn: 352716	2019-01-31 01:49:58 +00:00
Matt Arsenault	2a64598ef2	GlobalISel: Fix creating MMOs with align 0 llvm-svn: 352712	2019-01-31 01:38:47 +00:00
Thomas Lively	9510adafe6	[LegalizeVectorTypes] Allow illegal indices when splitting extract_vector_elt Summary: Fixes PR40267, in which the removed assertion was triggering on perfectly valid IR. As far as I can tell, constant out of bounds indices should be allowed when splitting extract_vector_elt, since they will simply be propagated as out of bounds indices in the resulting split vector and handled appropriately elsewhere. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya Differential Revision: https://reviews.llvm.org/D57471 llvm-svn: 352702	2019-01-31 00:35:37 +00:00
Craig Topper	49c4c68919	[LegalizeTypes] Use report_fatal_error instead of llvm_unreachable in the default case of some type legalization handlers that can be reached with intrinsics with result or operands that aren't legal types. These can be triggered by mistakenly using a 64-bit-mode-only intrinsic with -mtriple=i686. Using report_fatal_error gives a better experience for this mistake in release builds instead of probably crashing. We already do this for some of the vector type legalization handlers. llvm-svn: 352699	2019-01-31 00:04:48 +00:00
Craig Topper	8bdc203d4b	[X86] Remove handling of ISD::INTRINSIC_WO_CHAIN in ReplaceNodeResults. I believe this was there to handle avx512bw intrinsics that returned i64 type in 32-bit mode. But all those intrinsics have since been changed to v64i1 results or replaced with generic IR. llvm-svn: 352698	2019-01-31 00:04:46 +00:00
Zachary Turner	3c35f774de	[RuntimeDyld] Don't try to allocate sections with align 0. ELF sections allow 0 for the alignment, which is specified to be the same as 1. However many clients do not expect this and will behave poorly in the presence of a 0-aligned section (for example by trying to modulo something by the section alignment). We can be more polite by making sure that we always pass a non-zero value to clients. Differential Revision: https://reviews.llvm.org/D57482 llvm-svn: 352694	2019-01-30 23:52:32 +00:00
Jessica Paquette	84bedac7e9	[GlobalISel][AArch64] Select G_FEXP This teaches the legalizer to handle G_FEXP in AArch64. As a result, it also allows us to select G_FEXP. It... - Updates the legalizer-info tests - Adds a test for legalizing exp - Updates the existing fp tests to show that we can now select G_FEXP https://reviews.llvm.org/D57483 llvm-svn: 352692	2019-01-30 23:46:15 +00:00
Amara Emerson	13311e5274	[GlobalISel][LegalizerHelper] Add some missing MI change observer calls. No test as it's a preventative fix. llvm-svn: 352691	2019-01-30 23:42:46 +00:00
Chen Zheng	be589423d8	[PowerPC] delete no more needed workaround for readsRegister() in PowerPC Differential Revision: https://reviews.llvm.org/D57439 llvm-svn: 352689	2019-01-30 23:18:38 +00:00
Matt Arsenault	547a83b4eb	MIR: Reject non-power-of-2 alignments in MMO parsing llvm-svn: 352686	2019-01-30 23:09:28 +00:00
Jessica Paquette	10f59405ae	[GlobalISel][AArch64] Select G_FABS This adds instruction selection support for G_FABS in AArch64. It also updates the existing basic FP tests, adds a selection test for G_FABS. https://reviews.llvm.org/D57418 llvm-svn: 352684	2019-01-30 22:54:21 +00:00
Sam Clegg	19e8befabb	[WebAssembly] MC: Use WritePatchableLEB helper function. NFC. Subscribers: dschuff, jgravelle-google, aheejin, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D57477 llvm-svn: 352683	2019-01-30 22:47:35 +00:00
Heejin Ahn	0bb9865011	[WebAssembly] Restore stack pointer right after catch instruction Summary: After the stack is unwound due to a thrown exception, `__stack_pointer` global can point to an invalid address. So a `global.set` to restore `__stack_pointer` should be inserted right after `catch` instruction. But after r352598 the `global.set` instruction is inserted not right after `catch` but after `block` - `br-on-exn` - `end_block` - `extract_exception` sequence. This CL fixes it. While doing that, we can actually move ReplacePhysRegs pass after LateEHPrepare and merge EHRestoreStackPointer pass into LateEHPrepare, and now placing `global.set` to `__stack_pointer` right after `catch` is much easier. Otherwise it is hard to guarantee that `global.set` is still right after `catch` and not touched with other transformations, in which case we have to do something to hoist it. Reviewers: dschuff Subscribers: mgorny, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D57421 llvm-svn: 352681	2019-01-30 22:44:45 +00:00
Sanjay Patel	9ab23101a8	[DAGCombiner] sub X, 0/1 --> add X, 0/-1 This extends the existing transform for: add X, 0/1 --> sub X, 0/-1 ...to allow the sibling subtraction fold. This pattern could regress with the proposed change in D57401. llvm-svn: 352680	2019-01-30 22:41:35 +00:00
Jessica Paquette	0154bd1385	[GlobalISel][AArch64] Add instruction selection support for @llvm.log2 This teaches GlobalISel to emit a RTLib call for @llvm.log2 when it encounters it. It updates the existing floating point tests to show that we don't fall back on the intrinsic, and select the correct instructions. It also adds a legalizer test for G_FLOG2. https://reviews.llvm.org/D57357 llvm-svn: 352673	2019-01-30 21:16:04 +00:00
Jessica Paquette	22457f8e9b	[GlobalISel][AArch64] Add instruction selection support for @llvm.sqrt This teaches the legalizer about G_FSQRT in AArch64. Also adds a legalizer test for G_FSQRT, a selection test for it, and updates existing floating point tests. https://reviews.llvm.org/D57361 llvm-svn: 352671	2019-01-30 21:03:52 +00:00
Jessica Paquette	b147e7d853	[GlobalISel] Add IRTranslator support for @llvm.sqrt -> G_FSQRT Follow-up commit to https://reviews.llvm.org/D57359. (r352668) This adds IRTranslator support for recognising a @llvm.sqrt intrinsic and translating it into a G_FSQRT. https://reviews.llvm.org/D57360 llvm-svn: 352670	2019-01-30 20:58:14 +00:00
Wolfgang Pieb	facd052e16	Reverting r352642 - Handle restore instructions in LiveDebugValues - as it's causing assertions on some buildbots. llvm-svn: 352666	2019-01-30 20:37:14 +00:00
Erik Pilkington	600e9deacf	Add a 'dynamic' parameter to the objectsize intrinsic This is meant to be used with clang's __builtin_dynamic_object_size. When 'true' is passed to this parameter, the intrinsic has the potential to be folded into instructions that will be evaluated at run time. When 'false', the objectsize intrinsic behaviour is unchanged. rdar://32212419 Differential revision: https://reviews.llvm.org/D56761 llvm-svn: 352664	2019-01-30 20:34:35 +00:00
Craig Topper	22b3de5b51	[X86] Mark EMMS and FEMMS as clobbering MM0-7 and ST0-7. This fixes the test case in PR35982 by preventing MMX instructions that read MM0-7 from being moved below EMMS/FEMMS by the post RA scheduler. Though as discussed in bugzilla, this is not a complete fix. There is still the possibility of reordering in IR or by the pre-RA scheduler. Differential Revision: https://reviews.llvm.org/D57298 llvm-svn: 352660	2019-01-30 19:57:01 +00:00
Philip Reames	c71e996aed	SimplifyDemandedVectorElts for all intrinsics The point is that this simplifies integration of new intrinsics into SimplifiedDemandedVectorElts, and ensures we don't miss any existing ones. This is intended to be NFC-ish, but as seen from the diffs, can produce slightly different output. This is due to order of transforms w/in instcombine resulting in two slightly different fixed points. That's something we should fix, but isn't a problem w/this patch per se. Differential Revision: https://reviews.llvm.org/D57398 llvm-svn: 352653	2019-01-30 19:21:11 +00:00
Wolfgang Pieb	5590a4355f	[DEBUGINFO] Handle restore instructions in LiveDebugValues The LiveDebugValues pass recognizes spills but not restores, which can cause large gaps in location information for some variables, depending on control flow. This patch makes LiveDebugValues recognize restores and generate appropriate DBG_VALUE instructions. Reviewers: aprantl, NicolaPrica Differential Revision: https://reviews.llvm.org/D57271 llvm-svn: 352642	2019-01-30 18:34:07 +00:00
Matt Arsenault	dc8258c4aa	GlobalISel: Add assert that legalize mutation makes sense I've repeatedly encountered bugs resulting from custom legalize mutations returning nonsense legalize results, such as increasing the number of elements for FewerElements. Add an assert function to make sure the type to mutate to is consistent with the legalize action. llvm-svn: 352636	2019-01-30 17:52:23 +00:00
Matt Arsenault	4c0409e9c7	AMDGPU: Stop generating unused intrinsic .inc files llvm-svn: 352635	2019-01-30 17:25:37 +00:00
Simon Pilgrim	317fad5921	[X86][AVX] Prefer to combine shuffle to broadcasts whenever possible This is the first step towards improving broadcast support on AVX1 targets. llvm-svn: 352634	2019-01-30 16:19:19 +00:00
Max Kazantsev	365021cc15	Properly use DT.verify in LoopSimplifyCFG llvm-svn: 352621	2019-01-30 12:32:19 +00:00
Max Kazantsev	34eeeec3ae	Enable IRCE for narrow latch by default llvm-svn: 352619	2019-01-30 11:25:12 +00:00
Shiva Chen	5af037f1e9	[RISCV] Insert R_RISCV_ALIGN relocation type and Nops for code alignment when linker relaxation enabled Linker relaxation may change code size. We need to fix up the alignment of alignment directive in text section by inserting Nops and R_RISCV_ALIGN relocation type. So then linker could satisfy the alignment by removing Nops. To do this: 1. Add shouldInsertExtraNopBytesForCodeAlign target hook to calculate the Nops we need to insert. 2. Add shouldInsertFixupForCodeAlign target hook to insert R_RISCV_ALIGN fixup type. Differential Revision: https://reviews.llvm.org/D47755 llvm-svn: 352616	2019-01-30 11:16:59 +00:00
Aleksandr Urakov	d17f6ab61b	[NativePDB] Fix access to both old & new fpo data entries from dbi stream Summary: This patch fixes access to fpo streams in native pdb from DbiStream and makes code consistent with DbiStreamBuilder. Patch By: leonid.mashinskiy Reviewers: zturner, aleksandr.urakov Reviewed By: zturner Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56725 llvm-svn: 352615	2019-01-30 10:40:45 +00:00
Craig Topper	11133d2531	[X86] Remove unnecessary code from the top of handleCompareFP in X86FloatingPoint.cpp. There were checks to ensure some tables were sorted, but those tables aren't used by this function. The same tables are checked in the function that does use them. Maybe this was copy/pasted? llvm-svn: 352609	2019-01-30 08:04:06 +00:00
Craig Topper	594f76aea2	[X86] Remove a couple places where we unnecessarily pass 0 to the EmitPriority of some FP instruction aliases. NFC As far as I can tell we already won't emit these aliases due to an operand count check in the tablegen code. Removing these because I couldn't make sense of the inconsistency between fadd and fmul from reading the code. I checked the AsmMatcher and AsmWriter files before and after this change and there were no differences. llvm-svn: 352608	2019-01-30 07:33:24 +00:00
Craig Topper	9dfe9b086e	[X86] Add FPSW as a Def on some FP instructions that were missing it. llvm-svn: 352607	2019-01-30 07:08:44 +00:00
Hiroshi Inoue	c437f310a5	[NFC] fix trivial typos in comments llvm-svn: 352602	2019-01-30 05:26:31 +00:00
Matt Arsenault	dc6c78596b	GlobalISel: Implement fewerElementsVector for select llvm-svn: 352601	2019-01-30 04:19:31 +00:00
Matt Arsenault	f6cab16258	AMDGPU/GlobalISel: Fix clamping shifts with 16-bit insts llvm-svn: 352599	2019-01-30 03:36:25 +00:00
Heejin Ahn	d6f487863d	[WebAssembly] Exception handling: Switch to the new proposal Summary: This switches the EH implementation to the new proposal: https://github.com/WebAssembly/exception-handling/blob/master/proposals/Exceptions.md (The previous proposal was https://github.com/WebAssembly/exception-handling/blob/master/proposals/old/Exceptions.md) - Instruction changes - Now we have one single `catch` instruction that returns an except_ref value - `throw` now can take variable number of operations - `rethrow` does not have 'depth' argument anymore - `br_on_exn` queries an except_ref to see if it matches the tag and branches to the given label if true. - `extract_exception` is a pseudo instruction that simulates popping values from wasm stack. This is to make `br_on_exn`, a very special instruction, work: `br_on_exn` puts values onto the stack only if it is taken, and the # of values can vary depending on the tag. - Now there's only one `catch` per `try`, this patch removes all special handling for terminate pad with a call to `__clang_call_terminate`. Before it was the only case there are two catch clauses (a normal `catch` and `catch_all` per `try`). - Make `rethrow` act as a terminator like `throw`. This splits BB after `rethrow` in WasmEHPrepare, and deletes an unnecessary `unreachable` after `rethrow` in LateEHPrepare. - Now we stop at all catchpads (because we add wasm `catch` instruction that catches all exceptions), this creates new `findWasmUnwindDestinations` function in SelectionDAGBuilder. - Now we use `br_on_exn` instruction to figure out if an except_ref matches the current tag or not, LateEHPrepare generates this sequence for catch pads: ``` catch block i32 br_on_exn $__cpp_exception end_block extract_exception ``` - Branch analysis for `br_on_exn` in WebAssemblyInstrInfo - Other various misc. changes to switch to the new proposal. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D57134 llvm-svn: 352598	2019-01-30 03:21:57 +00:00
Matt Arsenault	6d8e1b456a	GlobalISel: Use appropriate extension for legalizing select conditions llvm-svn: 352597	2019-01-30 02:57:43 +00:00
Zi Xuan Wu	fec749ff5d	[PowerPC] [NFC] Create a helper function to copy register to particular register class at PPCFastISel Make copy register code as common function as following. unsigned copyRegToRegClass(const TargetRegisterClass *ToRC, unsigned SrcReg, unsigned Flag = 0, unsigned SubReg = 0); Differential Revision: https://reviews.llvm.org/D57368 llvm-svn: 352596	2019-01-30 02:56:22 +00:00
Matt Arsenault	045bc9a4a6	GlobalISel: Support narrowScalar for uneven loads llvm-svn: 352594	2019-01-30 02:35:38 +00:00
Thomas Lively	079816efb7	[WebAssembly] Optimize BUILD_VECTOR lowering for size Summary: Implements custom lowering logic that finds the optimal value for the initial splat of the vector and either uses it or uses v128.const if it is available and if it would produce smaller code. This logic replaces large TableGen ISEL patterns that would lower all non-splat BUILD_VECTORs into a splat followed by a fixed number of replace_lane instructions. This CL fixes PR39685. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D56633 llvm-svn: 352592	2019-01-30 02:23:29 +00:00
Matt Arsenault	ccefbbd0f0	GlobalISel: Handle some odd splits in fewerElementsVector Also add some quick hacks to AMDGPU legality for the tests. llvm-svn: 352591	2019-01-30 02:22:13 +00:00
Matt Arsenault	92c5001136	GlobalISel: Handle more cases for widenScalar for G_STORE llvm-svn: 352585	2019-01-30 02:04:31 +00:00
Chen Zheng	ca26039cc7	[PowerPC] more opportunity for converting reg+reg to reg+imm Differential Revision: https://reviews.llvm.org/D57314 llvm-svn: 352583	2019-01-30 01:57:01 +00:00
Matt Arsenault	ccb810fb54	GlobalISel: Verify memory size for load/store llvm-svn: 352578	2019-01-30 01:10:42 +00:00
George Burgess IV	179f6baa45	Remove a redundant space from an error message; NFC llvm-svn: 352576	2019-01-30 00:28:56 +00:00
Sam Clegg	c7d2e5f154	[WebAssembly] Add missing SymbolRef update from rL352551 This change broke some MC tests which are now fixed. Differential Revision: https://reviews.llvm.org/D57424 llvm-svn: 352573	2019-01-30 00:15:48 +00:00
Thomas Lively	74c12ceacb	[WebAssembly] Lower SCALAR_TO_VECTOR to splats Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish Differential Revision: https://reviews.llvm.org/D57269 llvm-svn: 352568	2019-01-29 23:44:48 +00:00
Matt Arsenault	3de9a96174	GlobalISel: Fix unused variable warning in release builds llvm-svn: 352565	2019-01-29 23:38:42 +00:00
Craig Topper	0b5e6b11c3	[IR] Use CallBase to reduce code duplication. NFC Noticed in the asm-goto patch. Callbr needs to go here too. One cast and call is better than 3. Differential Revision: https://reviews.llvm.org/D57295 llvm-svn: 352563	2019-01-29 23:31:54 +00:00
Matt Arsenault	d45b03bb81	GlobalISel: Verify pointer casts Not sure if the old AArch64 tests should be just deleted or not. llvm-svn: 352562	2019-01-29 23:29:00 +00:00

120378 Commits