llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	8e364c680f	[X86] Restore the pavg intrinsics. The pattern we replaced these with may be too hard to match as demonstrated by PR41496 and PR41316. This patch restores the intrinsics and then we can start focusing on the optimizing the intrinsics. I've mostly reverted the original patch that removed them. Though I modified the avx512 intrinsics to not have masking built in. Differential Revision: https://reviews.llvm.org/D60674 llvm-svn: 358427	2019-04-15 17:17:35 +00:00
Akira Hatanaka	9ca9d32b6b	[ObjC][ARC] Convert the retainRV marker that is passed as a named metadata into a module flag in the auto-upgrader and make the ARC contract pass read the marker as a module flag. This is needed to fix a bug where ARC contract wasn't inserting the retainRV marker when LTO was enabled, which caused objects returned from a function to be auto-released. rdar://problem/49464214 Differential Revision: https://reviews.llvm.org/D60303 llvm-svn: 358047	2019-04-10 06:20:20 +00:00
Amara Emerson	c10b24691a	[AArch64] Split the neon.addp intrinsic into integer and fp variants. This is the result of discussions on the list about how to deal with intrinsics which require codegen to disambiguate them via only the integer/fp overloads. It causes problems for GlobalISel as some of that information is lost during translation, while with other operations like IR instructions the information is encoded into the instruction opcode. This patch changes clang to emit the new faddp intrinsic if the vector operands to the builtin have FP element types. LLVM IR AutoUpgrade has been taught to upgrade existing calls to aarch64.neon.addp with fp vector arguments, and we remove the workarounds introduced for GlobalISel in r355865. This is a more permanent solution to PR40968. Differential Revision: https://reviews.llvm.org/D59655 llvm-svn: 356722	2019-03-21 22:31:37 +00:00
Steven Wu	ed98229286	[Bitcode] Fix bitcode compatibility issue with clang.arc.use intrinsic Summary: In r349534, objc arc implementation is switched to use intrinsics and at the same time, clang.arc.use is renamed to llvm.objc.clang.arc.use to make the naming more consistent. The side-effect of that is llvm no longer recognize it as intrinsics and codegen external references to it instead. Rather than upgrade the old intrinsics name to the new one and wait for the arc-contract pass to remove it, simply remove it in the bitcode upgrader. rdar://problem/48607063 Reviewers: pete, ahatanak, erik.pilkington, dexonsmith Reviewed By: pete, dexonsmith Subscribers: jkorous, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59112 llvm-svn: 355663	2019-03-08 05:27:53 +00:00
Erik Pilkington	4ecd7a90a6	Fix auto-upgrade for the new parameter to llvm.objectsize r352664 added a 'dynamic' parameter to objectsize, but the AutoUpgrade changes were incomplete. Also, fix an off-by-one error I made in the upgrade logic that is now no longer unreachable. Differential revision: https://reviews.llvm.org/D58071 llvm-svn: 353884	2019-02-12 21:55:38 +00:00
Mandeep Singh Grang	2be4eabb6f	[AutoUpgrade] Fix AutoUpgrade for x86.seh.recoverfp Summary: This fixes the bug in https://reviews.llvm.org/D56747#inline-502711. Reviewers: efriedma Reviewed By: efriedma Subscribers: javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57614 llvm-svn: 352945	2019-02-02 01:32:48 +00:00
James Y Knight	14359ef1b6	[opaque pointer types] Pass value type to LoadInst creation. This cleans up all LoadInst creation in LLVM to explicitly pass the value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57172 llvm-svn: 352911	2019-02-01 20:44:24 +00:00
Erik Pilkington	600e9deacf	Add a 'dynamic' parameter to the objectsize intrinsic This is meant to be used with clang's __builtin_dynamic_object_size. When 'true' is passed to this parameter, the intrinsic has the potential to be folded into instructions that will be evaluated at run time. When 'false', the objectsize intrinsic behaviour is unchanged. rdar://32212419 Differential revision: https://reviews.llvm.org/D56761 llvm-svn: 352664	2019-01-30 20:34:35 +00:00
Craig Topper	453150bc18	[X86] Add new variadic avx512 compress/expand intrinsics that use vXi1 types for the mask argument. Remove and autoupgrade the old intrinsics llvm-svn: 352343	2019-01-28 07:03:03 +00:00
Craig Topper	3b5e01b386	[X86] Remove and autoupgrade vpconflict intrinsics that take a mask and passthru argument. We have unmasked versions as of r352172 llvm-svn: 352270	2019-01-26 06:27:01 +00:00
Craig Topper	6c9c7d0796	[X86] Remove GCCBuiltins from 512-bit cvt(u)qqtops, cvt(u)qqtopd, and cvt(u)dqtops intrinsics. Add new variadic uitofp/sitofp with rounding mode intrinsics. Summary: See clang patch D56998 for a full description. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56999 llvm-svn: 352266	2019-01-26 02:41:54 +00:00
Craig Topper	f608dc1f57	[X86] Remove and autoupgrade vpmovqd/vpmovwb intrinsics using trunc+select. llvm-svn: 351729	2019-01-21 08:16:59 +00:00
Simon Pilgrim	e1143c1322	[X86] Auto upgrade VPCOM/VPCOMU intrinsics to generic integer comparisons This causes a couple of changes in the upgrade tests as signed/unsigned eq/ne are equivalent and we constant fold true/false codes, these changes are the same as what we already do for avx512 cmp/ucmp. Noticed while cleaning up vector integer comparison costs for PR40376. llvm-svn: 351697	2019-01-20 19:27:40 +00:00
Simon Pilgrim	b590e4f7e5	[X86] Auto upgrade old style VPCOM/VPCOMU intrinsics to generic integer comparisons We were upgrading these to the new style VPCOM/VPCOMU intrinsics (which includes the condition code immediate), but we'll be getting rid of those shortly, so convert these to generics first. This causes a couple of changes in the upgrade tests as signed/unsigned eq/ne are equivalent and we constant fold true/false codes, these changes are the same as what we already do for avx512 cmp/ucmp. Noticed while cleaning up vector integer comparison costs for PR40376. llvm-svn: 351690	2019-01-20 17:36:22 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Mandeep Singh Grang	436735c3fe	[EH] Rename llvm.x86.seh.recoverfp intrinsic to llvm.eh.recoverfp Summary: Make recoverfp intrinsic target-independent so that it can be implemented for AArch64, etc. Refer D53541 for the context. Clang counterpart D56748. Reviewers: rnk, efriedma Reviewed By: rnk, efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D56747 llvm-svn: 351281	2019-01-16 00:37:13 +00:00
Craig Topper	e7b4ea4726	[X86] Remove mask parameter from avx512 pmultishiftqb intrinsics. Use select in IR instead. Fixes PR40259 llvm-svn: 351035	2019-01-14 08:46:45 +00:00
Craig Topper	3f3b8ef442	[X86] Remove mask parameter from vpshufbitqmb intrinsics. Change result to a vXi1 vector. The input mask can be represented with an AND in IR. Fixes PR40258 llvm-svn: 351028	2019-01-14 00:03:50 +00:00
Craig Topper	486313b5f7	Recommit r350554 "[X86] Remove AVX512VBMI2 concat and shift intrinsics. Replace with target independent funnel shift intrinsics." The MSVC limit we hit on AutoUpgrade.cpp has been worked around for now. llvm-svn: 350567	2019-01-07 21:00:32 +00:00
Craig Topper	81fe1fbf4a	[X86][AutoUpgrade] Make some tweaks to reduce the number of nested if/else in the intrinsic upgrade code to avoid an MSVC compiler limit. MSVC has a nesting limit of around 110-130. An if/else if/else if counts against this next level. The autoupgrade code consists a long chain of these checking matches against strings. This commit moves some code to a helper function to move out a large if/else chain that was inside of one of the blocks into a separate function. There are more of these we could move or we could change some to lookup tables. I've also merged together a few similar blocks in the outer chain. This should buy us some margin for a little bit. llvm-svn: 350564	2019-01-07 20:13:45 +00:00
Craig Topper	fad1589f39	Revert r350554 "[X86] Remove AVX512VBMI2 concat and shift intrinsics. Replace with target independent funnel shift intrinsics." The AutoUpgrade.cpp if/else cascade hit an MSVC limit again. llvm-svn: 350562	2019-01-07 19:39:05 +00:00
Craig Topper	9c4f7e9147	[X86] Remove AVX512VBMI2 concat and shift intrinsics. Replace with target independent funnel shift intrinsics. Differential Revision: https://reviews.llvm.org/D56377 llvm-svn: 350554	2019-01-07 19:10:12 +00:00
Simon Pilgrim	5d403f6bf8	[X86][SSE] Auto upgrade PADDS/PSUBS intrinsics to SADD_SAT/SSUB_SAT generic intrinsics (llvm) This auto upgrades the signed SSE saturated math intrinsics to SADD_SAT/SSUB_SAT generic intrinsics. Clang counterpart: https://reviews.llvm.org/D55890 Differential Revision: https://reviews.llvm.org/D55894 llvm-svn: 349892	2018-12-21 09:04:14 +00:00
Simon Pilgrim	2a25360ae3	[X86] Auto upgrade XOP/AVX512 rotation intrinsics to generic funnel shift intrinsics (llvm) This emits FSHL/FSHR generic intrinsics for the XOP VPROT and AVX512 VPROL/VPROR rotation intrinsics. Clang counterpart: https://reviews.llvm.org/D55937 Differential Revision: https://reviews.llvm.org/D55938 llvm-svn: 349795	2018-12-20 19:01:07 +00:00
Simon Pilgrim	7bfbf3caa4	[X86][SSE] Auto upgrade PADDUS/PSUBUS intrinsics to UADD_SAT/USUB_SAT generic intrinsics (llvm) Now that we use the generic ISD opcodes, we can use the generic intrinsics directly as well. This fixes the poor fast-isel codegen by not expanding to an easily broken IR code sequence. I'm intending to deal with the signed saturation equivalents as well. Clang counterpart: https://reviews.llvm.org/D55879 Differential Revision: https://reviews.llvm.org/D55855 llvm-svn: 349630	2018-12-19 14:43:36 +00:00
Craig Topper	02b614abc8	[X86] Merge addcarryx/addcarry intrinsic into a single addcarry intrinsic. Both intrinsics do the exact same thing so we really only need one. Earlier in the 8.0 cycle we changed the signature of this intrinsic without renaming it. But it looks difficult to get the autoupgrade code to allow me to merge the intrinsics and change the signature at the same time. So I've renamed the intrinsic slightly for the new merged intrinsic. I'm skipping autoupgrading from the previous new to 8.0 signature. I've also renamed the subborrow for consistency. llvm-svn: 348737	2018-12-10 06:07:50 +00:00
Craig Topper	7ca55323ce	[X86] Add some comments about when some X86 intrinsic autoupgrade code was added. Someday we'd like to remove old autoupgrade code so it helps to annotate how long its been there so we don't have to go digging through commit history. llvm-svn: 348728	2018-12-09 18:02:40 +00:00
Craig Topper	4863313b35	[X86] Modify the the rdtscp intrinsic to return values instead of taking a pointer argument Similar to what was recently done for addcarry/subborrow and has been done for rdrand/rdseed for a while. It's better to use two results and an explicit store in IR when the store isn't part of the semantics of the instruction. This allows store->load forwarding to happen in the middle end. Or the store to be removed if its never loaded. Differential Revision: https://reviews.llvm.org/D51803 llvm-svn: 341698	2018-09-07 19:14:15 +00:00
Craig Topper	72964ae99e	[X86] Change the addcarry and subborrow intrinsics to return 2 results and remove the pointer argument. We should represent the store directly in IR instead. This gives the middle end a chance to remove it if it can see a load from the same address. Differential Revision: https://reviews.llvm.org/D51769 llvm-svn: 341677	2018-09-07 16:58:39 +00:00
Alexander Richardson	6bcf2ba2f0	Allow creating llvm::Function in non-zero address spaces Most users won't have to worry about this as all of the 'getOrInsertFunction' functions on Module will default to the program address space. An overload has been added to Function::Create to abstract away the details for most callers. This is based on https://reviews.llvm.org/D37054 but without the changes to make passing a Module to Function::Create() mandatory. I have also added some more tests and fixed the LLParser to accept call instructions for types in the program address space. Reviewed By: bjope Differential Revision: https://reviews.llvm.org/D47541 llvm-svn: 340519	2018-08-23 09:25:17 +00:00
Craig Topper	9c1d9fdeaa	[X86] Remove masking from the 512-bit padds and psubs intrinsics. Use select in IR instead. llvm-svn: 339842	2018-08-16 06:20:24 +00:00
Craig Topper	9d6983c9fd	[X86] Remove the unused masked 128 and 256-bit masked padds/psubs intrinsics. Still need to remove masking from the 512-bit versions. llvm-svn: 339841	2018-08-16 06:20:22 +00:00
Simon Pilgrim	7bae71a209	Fix MSVC "compiler limit: blocks nested too deeply" error. NFCI. MSVC only accepts if-else chains up to 127 blocks long. I've had to merge a number of intrinsic cases together to get back below this limit, resulting in some duplication of string matches; this shouldn't cause any notable increase in runtime (and even then only for old IR, nothing that clang currently emits). llvm-svn: 339666	2018-08-14 10:04:14 +00:00
Tomasz Krupa	86a63889f3	[X86] Lowering addus/subus intrinsics to native IR Summary: This revision improves previous version (rL330322) which has been reverted due to crashes. This is the patch that lowers x86 intrinsics to native IR in order to enable optimizations. The patch also includes folding of previously missing saturation patterns so that IR emits the same machine instructions as the intrinsics. Reviewers: craig.topper, spatel, RKSimon Reviewed By: craig.topper Subscribers: mike.dvoretsky, DavidKreitzer, sroland, llvm-commits Differential Revision: https://reviews.llvm.org/D46179 llvm-svn: 339650	2018-08-14 08:00:56 +00:00
Fangrui Song	f78650a8de	Remove trailing space sed -Ei 's/[[:space:]]+$//' include/*/.{def,h,td} lib/*/.{cpp,h} llvm-svn: 338293	2018-07-30 19:41:25 +00:00
Craig Topper	034adf2683	[X86] Remove and autoupgrade the scalar fma intrinsics with masking. This converts them to what clang is now using for codegen. Unfortunately, there seem to be a few kinks to work out still. I'll try to address with follow up patches. llvm-svn: 336871	2018-07-12 00:29:56 +00:00
Craig Topper	c60e1807b3	[X86] Remove FMA4 scalar intrinsics. Use llvm.fma intrinsic instead. The intrinsics can be implemented with a f32/f64 llvm.fma intrinsic and an insert into a zero vector. There are a couple regressions here due to SelectionDAG not being able to pull an fneg through an extract_vector_elt. I'm not super worried about this though as InstCombine should be able to do it before we get to SelectionDAG. llvm-svn: 336416	2018-07-06 07:14:41 +00:00
Craig Topper	7b35585ff1	[X86] Remove all of the avx512 masked packed fma intrinsics. Use llvm.fma or unmasked 512-bit intrinsics with rounding mode. This upgrades all of the intrinsics to use fneg instructions to convert fma into fmsub/fnmsub/fnmadd/fmsubadd. And uses a select instruction for masking. This matches how clang uses the intrinsics these days. llvm-svn: 336409	2018-07-06 03:42:09 +00:00
Craig Topper	88d361e976	[X86] Remove the last of the 'x86.fma.' intrinsics and autoupgrade them to 'llvm.fma'. Add upgrade tests for all. Still need to remove the AVX512 masked versions. llvm-svn: 336383	2018-07-05 18:43:58 +00:00
Craig Topper	350c5f1881	[X86] Remove X86 specific scalar FMA intrinsics and upgrade to tart independent FMA and extractelement/insertelement. llvm-svn: 336315	2018-07-05 06:52:55 +00:00
Craig Topper	e4b9257b69	[X86] Remove some of the packed FMA3 intrinsics since we no longer use them in clang. There's a regression in here due to inability to combine fneg inputs of X86ISD::FMSUB/FNMSUB/FNMADD nodes. More removals to come, but I wanted to stop and fix the regression that showed up in this first. llvm-svn: 336303	2018-07-05 02:52:54 +00:00
Craig Topper	59f2f38fe0	[X86] Remove masking from avx512 rotate intrinsics. Use select in IR instead. llvm-svn: 336035	2018-06-30 01:32:04 +00:00
Craig Topper	875e9f8fa4	[X86] Remove masking from the avx512 packed sqrt intrinsics. Use select in IR instead. While there improve the coverage of the intrinsic testing and add fast-isel tests. llvm-svn: 335944	2018-06-29 05:43:26 +00:00
Craig Topper	31cbe75b3b	[X86] Rename the autoupgraded of packed fp compare and fpclass intrinsics that don't take a mask as input to exclude '.mask.' from their name. I think the intrinsics named 'avx512.mask.' should refer to the previous behavior of taking a mask argument in the intrinsic instead of using a 'select' or 'and' instruction in IR to accomplish the masking. This is more consistent with the goal that eventually we will have no intrinsics that have masking builtin. When we reach that goal, we should have no intrinsics named "avx512.mask". llvm-svn: 335744	2018-06-27 15:57:53 +00:00
Craig Topper	689e363ff2	[X86] Redefine avx512 packed fpclass intrinsics to return a vXi1 mask and implement the mask input argument using an 'and' IR instruction. This recommits r335562 and 335563 as a single commit. The frontend will surround the intrinsic with the appropriate marshalling to/from a scalar type to match the sigature of the builtin that software expects. By exposing the vXi1 type directly in the llvm intrinsic we make it available to optimizers much earlier. This can enable the scalar marshalling code to be optimized away. llvm-svn: 335568	2018-06-26 01:37:02 +00:00
Craig Topper	6f4fdfa9af	Revert r335562 and 335563 "[X86] Redefine avx512 packed fpclass intrinsics to return a vXi1 mask and implement the mask input argument using an 'and' IR instruction." These were supposed to have been squashed to a single commit. llvm-svn: 335566	2018-06-26 01:31:53 +00:00
Craig Topper	9b4322ce31	foo llvm-svn: 335562	2018-06-26 00:43:34 +00:00
Craig Topper	296526bf46	[X86] Remove masking from 512-bit floating max/min intrinsics. Use select instruction instead. llvm-svn: 335199	2018-06-21 05:00:56 +00:00
Tomasz Krupa	bcaab53d47	[X86] Lowering sqrt intrinsics to native IR Summary: Complementary patch to lowering sqrt intrinsics in Clang. Reviewers: craig.topper, spatel, RKSimon, DavidKreitzer, uriel.k Reviewed By: craig.topper Subscribers: tkrupa, mike.dvoretsky, llvm-commits Differential Revision: https://reviews.llvm.org/D41599 llvm-svn: 334849	2018-06-15 18:05:24 +00:00
Craig Topper	3829d258ee	[X86] Remove masking from avx512vbmi2 concat and shift by immediate intrinsics. Use select in IR instead. llvm-svn: 334576	2018-06-13 07:19:21 +00:00
Craig Topper	0e25c8239a	[X86] Remove masking from dbpsadbw intrinsics, use select in IR instead. llvm-svn: 334384	2018-06-11 06:18:22 +00:00
Craig Topper	e71ad1f6d0	[X86] Remove and autoupgrade the expandload and compressstore intrinsics. We use the target independent intrinsics now. llvm-svn: 334381	2018-06-11 01:25:22 +00:00
Craig Topper	98a79934af	[X86] Remove masking from the 512-bit masked floating point add/sub/mul/div intrinsics. Use a select in IR instead. llvm-svn: 334358	2018-06-10 06:01:36 +00:00
Craig Topper	9923eac358	[X86] Remove and autoupgrade masked avx512vnni intrinsics using the unmasked intrinsics and select instructions. llvm-svn: 333857	2018-06-03 23:24:17 +00:00
Craig Topper	21aeddc3dc	[X86] Remove masked vpermi2var/vpermt2var intrinsics and autoupgrade. We have unmasked intrinsics now and wrap them with a select. This is a net reduction of 36 intrinsics from before the unmasked intrinsics were added. llvm-svn: 333388	2018-05-29 05:22:05 +00:00
Craig Topper	51eddb8749	[X86] Remove masking from avx512ifma intrinsics. Use a select instead. This allows us to avoid having mask and maskz variant. Reducing from 12 intrinsics to 6. llvm-svn: 333346	2018-05-26 18:55:19 +00:00
Craig Topper	358b094971	[X86] Remove 128/256-bit cvtdq2ps, cvtudq2ps, cvtqq2pd, cvtuqq2pd intrinsics. These can all be implemented with sitofp/uitofp instructions. llvm-svn: 332916	2018-05-21 23:15:00 +00:00
Craig Topper	aad3aefaeb	[X86] Remove masking from vpternlog intrinsics. Use a select in IR instead. This removes 6 intrinsics since we no longer need separate mask and maskz intrinsics. Differential Revision: https://reviews.llvm.org/D47124 llvm-svn: 332890	2018-05-21 20:58:09 +00:00
Craig Topper	e4c045b7df	[X86] Remove mask arguments from permvar builtins/intrinsics. Use a select in IR instead. Someday maybe we'll use selects for all intrinsics. llvm-svn: 332824	2018-05-20 23:34:04 +00:00
Craig Topper	53ceb4805f	[X86] Remove and autoupgrade avx512.vbroadcast.ss/avx512.vbroadcast.sd intrinsics. llvm-svn: 332271	2018-05-14 18:21:22 +00:00
Craig Topper	0e71c6d5ca	[X86] Remove and autoupgrade the cvtusi2sd intrinsic. Use uitofp+insertelement instead. llvm-svn: 332206	2018-05-14 00:06:49 +00:00
Craig Topper	85906cf041	[X86] Remove and autoupgrade masked vpermd/vpermps intrinsics. llvm-svn: 332198	2018-05-13 18:03:59 +00:00
Craig Topper	df3a9cedff	[X86] Remove an autoupgrade legacy cvtss2sd intrinsics. llvm-svn: 332187	2018-05-13 00:29:40 +00:00
Craig Topper	38ad7ddabc	[X86] Remove and autoupgrade cvtsi2ss/cvtsi2sd intrinsics to match what clang has used for a very long time. llvm-svn: 332186	2018-05-12 23:14:39 +00:00
Craig Topper	a288f241cd	[X86] Remove some unused masked conversion intrinsics that can be replaced with an older intrinsic and a select. This is what clang already uses. llvm-svn: 332170	2018-05-12 02:34:28 +00:00
Craig Topper	a17d627abb	[X86] Remove and autoupgrade a bunch of FMA instrinsics that are no longer used by clang. llvm-svn: 332146	2018-05-11 21:59:34 +00:00
Craig Topper	9968af4a2a	[X86] Remove and autoupgrade the avx512.mask.store.ss intrinsic. llvm-svn: 332079	2018-05-11 04:33:18 +00:00
Piotr Padlewski	5dde809404	Rename invariant.group.barrier to launder.invariant.group Summary: This is one of the initial commit of "RFC: Devirtualization v2" proposal: https://docs.google.com/document/d/16GVtCpzK8sIHNc2qZz6RN8amICNBtvjWUod2SujZVEo/edit?usp=sharing Reviewers: rsmith, amharc, kuhar, sanjoy Subscribers: arsenm, nhaehnle, javed.absar, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45111 llvm-svn: 331448	2018-05-03 11:03:01 +00:00
Chandler Carruth	16429acacb	[x86] Revert r330322 (& r330323): Lowering x86 adds/addus/subs/subus intrinsics The LLVM commit introduces a crash in LLVM's instruction selection. I filed http://llvm.org/PR37260 with the test case. llvm-svn: 330997	2018-04-26 21:46:01 +00:00
Reid Kleckner	e160d51b42	Fix -Wtautological-compare warning with npos on Windows llvm-svn: 330614	2018-04-23 16:47:27 +00:00
Alexander Ivchenko	e8fed1546e	Lowering x86 adds/addus/subs/subus intrinsics (llvm part) This is the patch that lowers x86 intrinsics to native IR in order to enable optimizations. The patch also includes folding of previously missing saturation patterns so that IR emits the same machine instructions as the intrinsics. Patch by tkrupa Differential Revision: https://reviews.llvm.org/D44785 llvm-svn: 330322	2018-04-19 12:13:30 +00:00
Gerolf Hoflehner	1c3a07834e	[IR] Upgrade comment token in objc retain release marker for asm call Older compiler issued '#' instead of ';' llvm-svn: 330173	2018-04-17 04:02:24 +00:00
Craig Topper	254ed028a4	[X86] Remove the pmuldq/pmuldq intrinsics and replace with native IR. This completes the work started in r329604 and r329605 when we changed clang to no longer use the intrinsics. We lost some InstCombine SimplifyDemandedBit optimizations through this change as we aren't able to fold 'and', bitcast, shuffle very well. llvm-svn: 329990	2018-04-13 06:07:18 +00:00
Craig Topper	9507fa358c	[X86] Remove 128/256-bit masked pmaddubsw and pmaddwd intrinsics. Replace 512-bit masked intrinsic with unmasked intrinsic and a select. The 128/256-bit versions were no longer used by clang. It uses the legacy SSE/AVX2 version and a select. The 512-bit was changed to the same for consistency. llvm-svn: 329774	2018-04-11 04:55:04 +00:00
Craig Topper	c95e122cc7	[X86] Merge some of the autoupgrade handling for masked intrinsics that just need to upgrade to an unmasked version plus a select. NFCI These are were previously grouped in small groups of similarish intrinsics. But all the intrinsics have the same number of arguments and the same order. So we can move them all into a larger group for handling. llvm-svn: 329549	2018-04-09 06:15:09 +00:00
Gerolf Hoflehner	f41aa4fd85	[IR] Upgrade comment token in objc retain release marker Older compiler issued '#' instead of ';' llvm-svn: 329248	2018-04-05 02:44:46 +00:00
Craig Topper	9256ac1a58	[X86] Add 512-bit unmasked pmulhrsw/pmulhw/pmulhuw intrinsics. Remove and auto upgrade 128/256/512 bit masked pmulhrsw/pmulhw/pmulhuw intrinsics. The 128 and 256 bit versions were already not used by clang. This adds an equivalent unmasked 512 bit version. Then autoupgrades all sizes to use unmasked intrinsics plus select. llvm-svn: 325559	2018-02-20 07:28:14 +00:00
Craig Topper	8d19c6fba2	[X86] Reverse the operand order of the autoupgrade of the kunpack builtins. The second operand needs to be in the lower bits of the concatenation. This matches llvm 5.0, gcc, and icc behavior. Fixes PR36360. llvm-svn: 324953	2018-02-12 22:38:34 +00:00
Craig Topper	4dccffc84a	[X86] Change signatures of avx512 packed fp compare intrinsics to return a vXi1 mask type to be closer to an fcmp. Summary: This patch changes the signature of the avx512 packed fp compare intrinsics to return a vXi1 vector and no longer take a mask as input. The casts to scalar type will now need to be explicit in the IR. The masking node will now be an explicit and in the IR. This makes the intrinsic look much more similar to an fcmp instruction that we wish we could use for these but can't. We already use icmp instructions for integer compares. Previously the lowering step of isel would turn the intrinsic into an X86 specific ISD node and a emit the masking nodes as well as some bitcasts. This means DAG combines can't see the vXi1 type until somewhat late, making it more difficult to combine out gpr<->mask transition sequences. By exposing the vXi1 type explicitly in the IR and initial SelectionDAG we give earlier DAG combines and even InstCombine the chance to see it and optimize it. This should make any issues with gpr<->mask sequences the same between integer and fp. Meaning we only have to fix them once. Reviewers: spatel, delena, RKSimon, zvi Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43137 llvm-svn: 324827	2018-02-10 23:33:55 +00:00
Craig Topper	dccf72b583	[X86] Remove kortest intrinsics and replace with native IR. llvm-svn: 324646	2018-02-08 20:16:06 +00:00
Craig Topper	071ad9c6e0	[X86] Remove and autoupgrade kand/kandn/kor/kxor/kxnor/knot intrinsics. Clang already stopped using these a couple months ago. The test cases aren't great as there is nothing forcing the operations to stay in k-registers so some of them moved back to scalar ops due to the bitcasts being moved around. llvm-svn: 324177	2018-02-03 20:18:25 +00:00
Daniel Neilson	1e68724d24	Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1) Summary: This is a resurrection of work first proposed and discussed in Aug 2015: http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html and initially landed (but then backed out) in Nov 2015: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two. This change is the first in a series that allows source and dest to each have their own alignments by using the alignment attribute on their arguments. In this change we: 1) Remove the alignment argument. 2) Add alignment attributes to the source & dest arguments. We, temporarily, require that the alignments for source & dest be equal. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false) will now read call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false) Downstream users may have to update their lit tests that check for @llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script may help with updating the majority of your tests, but it does not catch all possible patterns so some manual checking and updating will be required. s~declare void @llvm\.mem(set\|cpy\|move)\.p([^(])$(.), i32, i1$~declare void @llvm.mem\1.p\2(\3, i1)~g s~call void @llvm\.memset\.p([^(])i8$i8([^])\ (.), i8 (.), i8 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i16$i8([^])\ (.), i8 (.), i16 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i32$i8([^])\ (.), i8 (.), i32 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i64$i8([^])\ (.), i8 (.), i64 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i128$i8([^])\ (.), i8 (.), i128 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i8$i8([^])\ (.), i8 (.), i8 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i8(i8\2 align \6 \3, i8 \4, i8 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i16$i8([^])\ (.), i8 (.), i16 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i16(i8\2 align \6 \3, i8 \4, i16 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i32$i8([^])\ (.), i8 (.), i32 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i32(i8\2 align \6 \3, i8 \4, i32 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i64$i8([^])\ (.), i8 (.), i64 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i64(i8\2 align \6 \3, i8 \4, i64 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i128$i8([^])\ (.), i8 (.), i128 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i128(i8\2 align \6 \3, i8 \4, i128 \5, i1 \7)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i8$i8([^])\ (.), i8([^])\ (.), i8 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i8(i8\3 \4, i8\5* \6, i8 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i16$i8([^])\ (.), i8([^])\ (.), i16 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i16(i8\3 \4, i8\5* \6, i16 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i32$i8([^])\ (.), i8([^])\ (.), i32 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i32(i8\3 \4, i8\5* \6, i32 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i64$i8([^])\ (.), i8([^])\ (.), i64 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i64(i8\3 \4, i8\5* \6, i64 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i128$i8([^])\ (.), i8([^])\ (.), i128 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i128(i8\3 \4, i8\5* \6, i128 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i8$i8([^])\ (.), i8([^])\ (.), i8 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i16$i8([^])\ (.), i8([^])\ (.), i16 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i32$i8([^])\ (.), i8([^])\ (.), i32 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i64$i8([^])\ (.), i8([^])\ (.), i64 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i128$i8([^])\ (.), i8([^])\ (.), i128 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g The remaining changes in the series will: Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. Step 3) Update Clang to use the new IRBuilder API. Step 4) Update Polly to use the new IRBuilder API. Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use getDestAlignment() and getSourceAlignment() instead. Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reviewers: pete, hfinkel, lhames, reames, bollu Reviewed By: reames Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits Differential Revision: https://reviews.llvm.org/D41675 llvm-svn: 322965	2018-01-19 17:13:12 +00:00
Craig Topper	7197a452fc	[X86] Autoupgrade kunpck intrinsics using vector operations instead of scalar operations Summary: This patch changes the kunpck intrinsic autoupgrade to use vXi1 shufflevector operations to perform vector extracts and concats. This more closely matches the definition of the kunpck instructions. Currently we rely on a DAG combine to turn the scalar shift/and/or code into a concat vectors operation. By doing it in the IR we get this for free. Reviewers: spatel, RKSimon, zvi, jina.nahias Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42018 llvm-svn: 322462	2018-01-14 19:24:10 +00:00
Craig Topper	cc342d465e	[X86] Remove llvm.x86.avx512.cvt2mask. intrinsics and autoupgrade to (icmp slt X, 0) I had to drop fast-isel-abort from a test because we can't fast isel some of the mask stuff. When we used intrinsics we implicitly fell back to SelectionDAG for the intrinsic call without triggering the abort error. But with native IR that doesn't happen the same way. llvm-svn: 322050	2018-01-09 00:50:47 +00:00
Michael Zolotukhin	f05cb4374d	Remove redundant includes from lib/IR. llvm-svn: 320622	2017-12-13 21:30:52 +00:00
Craig Topper	c3c3ebf29d	[X86] Attempt to fix a ubsan failure in the autoupgrade of kunpck intrinsics. llvm-svn: 319911	2017-12-06 17:54:07 +00:00
Jina Nahias	51c1a627c2	[x86][AVX512] Lowering kunpack intrinsics to LLVM IR This patch, together with a matching clang patch (https://reviews.llvm.org/D39719), implements the lowering of X86 kunpack intrinsics to IR. Differential Revision: https://reviews.llvm.org/D39720 Change-Id: I4088d9428478f9457f6afddc90bd3d66b3daf0a1 llvm-svn: 319778	2017-12-05 15:42:56 +00:00
Uriel Korach	2aa707bdaa	[X86] test/testn intrinsics lowering to IR. llvm part. Remove builtins from llvm and add AutoUpgrade support. Also add fast-isel tests for the TEST and TESTN instructions. Differential Revision: https://reviews.llvm.org/D38736 llvm-svn: 318036	2017-11-13 12:51:18 +00:00
Jina Nahias	9a7f9f123c	[x86][AVX512] Lowering shuffle i/f intrinsics to LLVM IR This patch, together with a matching clang patch (https://reviews.llvm.org/D38672), implements the lowering of X86 shuffle i/f intrinsics to IR. Differential Revision: https://reviews.llvm.org/D38671 Change-Id: I1e7d359a74743e995ec356237a85214ce55d3661 llvm-svn: 318026	2017-11-13 09:16:39 +00:00
Jina Nahias	7b705f1f91	[x86][AVX512] Lowering Broadcastm intrinsics to LLVM IR This patch, together with a matching clang patch (https://reviews.llvm.org/D38683), implements the lowering of X86 broadcastm intrinsics to IR. Differential Revision: https://reviews.llvm.org/D38684 Change-Id: I709ac0b34641095397e994c8ff7e15d1315b3540 llvm-svn: 317458	2017-11-06 07:09:24 +00:00
Saleem Abdulrasool	46a59fdab6	Bitcode: add an auto-upgrade for LTO section name The bitcode reader looks specifically for `__DATA, __objc_catlist` as a section name. However, SVN r304661 removed the spaces (the two names are functionally equivalent but do not compare equally lexicographically). This causes compatibility issues. Add an auto-upgrade path for removing the spaces as well as use the new name in the LTO plugin. llvm-svn: 315086	2017-10-06 18:06:59 +00:00
Adrian Prantl	a8b2ddbde4	Move the stripping of invalid debug info from the Verifier to AutoUpgrade. This came out of a recent discussion on llvm-dev (https://reviews.llvm.org/D38042). Currently the Verifier will strip the debug info metadata from a module if it finds the dbeug info to be malformed. This feature is very valuable since it allows us to improve the Verifier by making it stricter without breaking bcompatibility, but arguable the Verifier pass should not be modifying the IR. This patch moves the stripping of broken debug info into AutoUpgrade (UpgradeDebugInfo to be precise), which is a much better location for this since the stripping of malformed (i.e., produced by older, buggy versions of Clang) is a (harsh) form of AutoUpgrade. This change is mostly NFC in nature, the one big difference is the behavior when LLVM module passes are introducing malformed debug info. Prior to this patch, a NoAsserts build would have printed a warning and stripped the debug info, after this patch the Verifier will report a fatal error. I believe this behavior is actually more desirable anyway. Differential Revision: https://reviews.llvm.org/D38184 llvm-svn: 314699	2017-10-02 18:31:29 +00:00
Uriel Korach	0ecc984b1b	[X86] Finishing broadcastf32x2 and broadcasti32x2 intrinsics lowering to IR. llvm side. Removing X86 broadcast(f/i)32x2 intrinsics from llvm. Adding autoUpgrade support. Moving matching tests from avx512dq-intrinsics.ll to avx512dq-intrinsics-upgrade.ll and from avx512dqvl-intrinsics.ll to avx512dqvl-intrinsics-upgrade.ll. Differential Revision: https://reviews.llvm.org/D38220 llvm-svn: 314195	2017-09-26 07:39:39 +00:00
Jina Nahias	ccfb8d4fe8	[x86] Lowering Mask Set1 intrinsics to LLVM IR This patch, together with a matching clang patch (https://reviews.llvm.org/D37668), implements the lowering of X86 mask set1 intrinsics to IR. Differential Revision: https://reviews.llvm.org/D37669 llvm-svn: 313625	2017-09-19 11:03:06 +00:00
Craig Topper	f264fcc704	[X86] Remove VPERM2F128/VPERM2I128 intrinsics and autoupgrade to native shuffles. I've moved the test cases from the InstCombine optimizations to the backend to keep the coverage we had there. It covered every possible immediate so I've preserved the resulting shuffle mask for each of those immediates. llvm-svn: 313450	2017-09-16 07:36:14 +00:00
Steven Wu	ab211df5de	[AutoUpgrade] Fix a compatibility issue with module flag Summary: After r304661, module flag to record objective-c image info section is encoded without whitespaces after comma. The new name is equivalent to the old one, except that when LTO a module built by old compiler and a module built by a new compiler, it will fail with conflicting values. Fix the issue by removing whitespaces in bitcode upgrade path. rdar://problem/34416934 Reviewers: compnerd Reviewed By: compnerd Subscribers: mehdi_amini, hans, llvm-commits Differential Revision: https://reviews.llvm.org/D37909 llvm-svn: 313398	2017-09-15 21:12:14 +00:00
Uriel Korach	5d5da5f531	[X86] [PATCH] [intrinsics] Lowering X86 ABS intrinsics to IR. (llvm) This patch, together with a matching clang patch (https://reviews.llvm.org/D37694), implements the lowering of X86 ABS intrinsics to IR. differential revision: https://reviews.llvm.org/D37693. llvm-svn: 313134	2017-09-13 09:02:36 +00:00
Yael Tsafrir	47668b5e03	[X86] Lower _mm[256\|512]_[mask[z]]_avg_epu[8\|16] intrinsics to native llvm IR Differential Revision: https://reviews.llvm.org/D37560 llvm-svn: 313013	2017-09-12 07:50:35 +00:00
Uriel Korach	01dfd3d1e3	Revert "adding autoUpgrade support to broadcast[f\|i]32x2 intrinsics" This reverts commit r312879 - An accidental partial commit. llvm-svn: 312880	2017-09-10 09:07:21 +00:00
Uriel Korach	3eb10a79e5	adding autoUpgrade support to broadcast[f\|i]32x2 intrinsics llvm-svn: 312879	2017-09-10 08:40:13 +00:00
Steven Wu	010fc49e42	[IR] AutoUpgrade ModuleFlagBehavior for PIC and PIE level Summary: From r303590, ModuleFlagBehavior for PIC and PIE level is changed from Error to Max. This will cause bitcode compatibility issue when linking against a bitcode static archive built with old compiler. Add an auto-ugprade path to upgrade the the ModuleFlagBehavior in the old bitcode to match the new one so IRLinker can link them. Reviewers: tejohnson, mehdi_amini, dexonsmith Reviewed By: dexonsmith Subscribers: hans, llvm-commits Differential Revision: https://reviews.llvm.org/D36556 llvm-svn: 311387	2017-08-21 21:49:13 +00:00
Craig Topper	561092f233	[AVX512] Remove and autoupgrade many of the broadcast intrinsics Summary: This autoupgrades most of the broadcast intrinsics. They've been unused in clang for some time. This leaves the 32x2 intrinsics because they are still used in clang. Reviewers: RKSimon, zvi, igorb Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36606 llvm-svn: 310725	2017-08-11 16:22:45 +00:00
Adrian Prantl	abe04759a6	Remove the obsolete offset parameter from @llvm.dbg.value There is no situation where this rarely-used argument cannot be substituted with a DIExpression and removing it allows us to simplify the DWARF backend. Note that this patch does not yet remove any of the newly dead code. rdar://problem/33580047 Differential Revision: https://reviews.llvm.org/D35951 llvm-svn: 309426	2017-07-28 20:21:02 +00:00
Craig Topper	792fc92be2	[AVX-512] Remove and autoupgrade the masked integer compare intrinsics Summary: These intrinsics aren't used by clang and haven't been for a while. There's some really terrible codegen in the 32-bit target for avx512bw due to i64 not being legal. But as I said these intrinsics aren't used by clang even before this patch so this codegen reflects our clang behavior today. Reviewers: spatel, RKSimon, zvi, igorb Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34389 llvm-svn: 306047	2017-06-22 20:11:01 +00:00
Galina Kistanova	f525c76ba1	Added missing break. llvm-svn: 303454	2017-05-19 20:31:51 +00:00
Elad Cohen	ef5798acf5	Support arbitrary address space pointers in masked gather/scatter intrinsics. Fixes PR31789 - When loop-vectorize tries to use these intrinsics for a non-default address space pointer we fail with a "Calling a function with a bad singature!" assertion. This patch solves this by adding the 'vector of pointers' argument as an overloaded type which will determine the address space. Differential revision: https://reviews.llvm.org/D31490 llvm-svn: 302018	2017-05-03 12:28:54 +00:00
Simon Pilgrim	5a22eaa2bf	[X86][SSE] Update MOVNTDQA non-temporal loads to generic implementation (LLVM) MOVNTDQA non-temporal aligned vector loads can be correctly represented using generic builtin loads, allowing us to remove the existing x86 intrinsics. Clang companion patch: D31766. Differential Revision: https://reviews.llvm.org/D31767 llvm-svn: 300325	2017-04-14 15:05:35 +00:00
Matt Arsenault	f10061ec70	Add address space mangling to lifetime intrinsics In preparation for allowing allocas to have non-0 addrspace. llvm-svn: 299876	2017-04-10 20:18:21 +00:00
Michael Zuckerman	88fb171015	[X86][LLVM] Converting __mm{\|256\|512}_movm_epi{8\|16\|32\|64} LLVMIR call into generic intrinsics. This patch is a part one of two reviews, one for the clang and the other for LLVM. The patch deletes the back-end intrinsics and adds support for them in the auto upgrade. Differential Revision: https://reviews.llvm.org/D31393 llvm-svn: 299432	2017-04-04 13:32:14 +00:00
George Burgess IV	56c7e88c2c	Let llvm.objectsize be conservative with null pointers This adds a parameter to @llvm.objectsize that makes it return conservative values if it's given null. This fixes PR23277. Differential Revision: https://reviews.llvm.org/D28494 llvm-svn: 298430	2017-03-21 20:08:59 +00:00
Daniel Berlin	3f91004ce7	Keep attributes, calling convention, etc, when remangling intrinsic Summary: Fix issue reported where intrinsic calling convention is dropped after r295253. Reviewers: sanjoy Subscribers: materi, llvm-commits Differential Revision: https://reviews.llvm.org/D30422 llvm-svn: 296563	2017-03-01 01:49:13 +00:00
Craig Topper	c43f3f3291	[IR][X86] Fix llvm version number in comments in AutoUpgrade. Forgot the next release is 5.0 not 4.1 llvm-svn: 296092	2017-02-24 05:35:07 +00:00
Craig Topper	f2529c188b	[AVX-512] Remove lzcnt intrinsics and autoupgrade them to generic ctlz intrinsics with select. Clang has been emitting cltz intrinsics for a while now. llvm-svn: 296091	2017-02-24 05:35:04 +00:00
Craig Topper	185ced8b2b	[X86][IR] In AutoUpgrade, check explicitly for xop.vpcmov and xop.vpcmov.256 instead of anything starting with xop.vpcmov There were some older intrinsics that only existed for less than a month in 2012 that still exist in some out of tree test files that start with this string, but aren't able to be handled by the current upgrade code and fire an assert. Now we'll go back to treating them as not intrinsics at all and just passing them through to output. Fixes PR32041, sort of. llvm-svn: 295930	2017-02-23 03:22:14 +00:00
Craig Topper	de10312bea	Recommit "[X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR." Clang has now been fixed to not use these intrinsics. llvm-svn: 295571	2017-02-18 21:50:58 +00:00
Craig Topper	ba2a726cc6	Revert "[X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR." This reverts r295564. I missed that clang was still using the intrinsics despite our half implemented autoupgrade support. llvm-svn: 295565	2017-02-18 20:14:20 +00:00
Craig Topper	884db3f85d	[X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR. It seems we were already upgrading 128-bit VPCMOV, but the intrinsic was still defined and being used in isel patterns. While I was here I also simplified the tablegen multiclasses. llvm-svn: 295564	2017-02-18 19:51:25 +00:00
Craig Topper	03a9adc2ba	[X86][IR] Simplify the XOP vpcmov autoupgrade code. NFC llvm-svn: 295563	2017-02-18 19:51:19 +00:00
Craig Topper	aa49f14496	[X86][IR] Merge together some very similar AutoUpgrade handling. NFC llvm-svn: 295562	2017-02-18 19:51:14 +00:00
Craig Topper	a505169ca5	[AVX-512] Remove 128/256-bit masked fp max/min intrinsics. Upgrade them to legacy unmasked intrinsics and select instructions. llvm-svn: 295543	2017-02-18 07:07:50 +00:00
Craig Topper	cbd1b60e42	[IR][X86] Simplify some AutoUpgrade code slightly. NFC llvm-svn: 295426	2017-02-17 07:07:24 +00:00
Craig Topper	905cc75f97	[IR][X86] Rename an AutoUpgrade helper function to more accurately match what intrinsics it handles. NFC llvm-svn: 295425	2017-02-17 07:07:21 +00:00
Craig Topper	b9b9cb0ce6	[IR][X86] Move X86 specific portions of UpgradeIntrinsicFunction1 to a couple helper functions. NFC This enables some early outs to avoid repeatedly using IsX86 check to qualify. I hope to continue to improve this to shorten the lengths of some of the string comparisons. llvm-svn: 295424	2017-02-17 07:07:19 +00:00
Craig Topper	715873ead3	[AVX-512] Remove masked packss/packus intrinsics and autoupgrade to unmasked intrinsics with select instructions. For 512-bit add new unmasked intrinsics. The new 512-bit unmasked intrinsics will make it easy to handle these with the SSE/AVX intrinsics in InstCombine where we currently have a TODO. llvm-svn: 295290	2017-02-16 06:31:54 +00:00
Daniel Berlin	3c1432fecf	Implement intrinsic mangling for literal struct types. Fixes PR 31921 Summary: Predicateinfo requires an ugly workaround to try to avoid literal struct types due to the intrinsic mangling not being implemented. This workaround actually does not work in all cases (you can hit the assert by bootstrapping with -print-predicateinfo), and can't be made to work without DFS'ing the type (IE copying getMangledStr and using a version that detects if it would crash). Rather than do that, i just implemented the mangling. It seems simple, since they are unified structurally. Looking at the overloaded-mangling testcase we have, it actually turns out the gc intrinsics will also crash if you try to use a literal struct. Thus, the testcase added fails before this patch, and works after, without needing to resort to predicateinfo. Reviewers: chandlerc, davide Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D29925 llvm-svn: 295253	2017-02-15 23:16:20 +00:00
Justin Lebar	46624a822d	[NVPTX] Auto-upgrade some NVPTX intrinsics to LLVM target-generic code. Summary: Specifically, we upgrade llvm.nvvm.: * brev{32,64} * clz.{i,ll} * popc.{i,ll} * abs.{i,ll} * {min,max}.{i,ll,u,ull} * h2f These either map directly to an existing LLVM target-generic intrinsic or map to a simple LLVM target-generic idiom. In all cases, we check that the code we generate is lowered to PTX as we expect. These builtins don't need to be backfilled in clang: They're not accessible to user code from nvcc. Reviewers: tra Subscribers: majnemer, cfe-commits, llvm-commits, jholewinski Differential Revision: https://reviews.llvm.org/D28793 llvm-svn: 292694	2017-01-21 01:00:32 +00:00
Chad Rosier	d0114fc1dd	[ARM] Remove rbit intrinsics and autoupgrade to generic bitreverse. Testing already covered by CodeGen/ARM/rbit.ll llvm-svn: 291587	2017-01-10 19:23:51 +00:00
Chad Rosier	3daffbf6a8	[AArch64] Add support for lowering bitreverse to the rbit instruction. Differential Revision: https://reviews.llvm.org/D28379 llvm-svn: 291575	2017-01-10 17:20:33 +00:00
Craig Topper	0cda8bbf74	[AVX-512] Remove vinsert intrinsics and autoupgrade to native shufflevectors. There are some codegen problems here that I'll try to fix in future commits. llvm-svn: 290864	2017-01-03 05:45:57 +00:00
Craig Topper	4d47c6ae57	[AVX-512] Remove vextract intrinsics and autoupgrade to native shufflevectors. This unfortunately generates some really terrible code without VLX support due to v2i1 and v4i1 not being legal. Hopefully we can improve that in future patches. llvm-svn: 290863	2017-01-03 05:45:46 +00:00
Craig Topper	2da265b7bf	[AVX-512] Remove masked pmuldq and pmuludq intrinsics and autoupgrade them to unmasked intrinsics plus a select. llvm-svn: 290583	2016-12-27 05:30:14 +00:00
Craig Topper	1f1b441267	[X86] Remove masking from 512-bit VPERMIL intrinsics in preparation for being able to constant fold them in InstCombineCalls like we do for 128/256-bit. llvm-svn: 289350	2016-12-11 01:26:44 +00:00
Craig Topper	edab02b50b	[X86] Remove masking from 512-bit PSHUFB intrinsics in preparation for being able to constant fold it in InstCombineCalls like we do for 128/256-bit. llvm-svn: 289344	2016-12-10 23:09:43 +00:00
Craig Topper	abe7c5b5e9	[AVX-512] Remove 128/256 masked vpermil instrinsics and autoupgrade to a select around the unmasked avx1 intrinsics. llvm-svn: 289340	2016-12-10 21:15:52 +00:00
Craig Topper	a4744d170e	[X86][IR] Move the autoupgrading of store intrinsics out of the main nested if/else chain. This should buy a little more time against the MSVC limit mentioned in PR31034. The handlers for stores all return at the end of their block so they can be picked off early. llvm-svn: 289339	2016-12-10 21:15:48 +00:00
Craig Topper	f57e17def0	[AVX-512] Remove intrinsics for valignd/q and autoupgrade them to native shuffles. llvm-svn: 287744	2016-11-23 06:54:55 +00:00
Craig Topper	02b5a1b50f	[AVX-512] Replace masked 16-bit element variable shift intrinsics with new unmasked versions and selects. The same thing was done to 32-bit and 64-bit element sizes previously. This will allow us to support these shuffls in InstCombineCalls along with the other variable shift intrinsics. llvm-svn: 287312	2016-11-18 05:04:44 +00:00
Simon Pilgrim	b57dd17142	[X86][AVX512] Autoupgrade lossless i32/u32 to f64 conversion intrinsics with generic IR Both the (V)CVTDQ2PD (i32 to f64) and (V)CVTUDQ2PD (u32 to f64) conversion instructions are lossless and can be safely represented as generic SINT_TO_FP/UINT_TO_FP calls instead of x86 intrinsics without affecting final codegen. LLVM counterpart to D26686 Differential Revision: https://reviews.llvm.org/D26736 llvm-svn: 287108	2016-11-16 14:48:32 +00:00
Ayman Musa	4d60243bfd	[X86][AVX512] Removing llvm x86 intrinsics for _mm_mask_move_{ss\|sd} intrinsics. Differential Revision: https://reviews.llvm.org/D26128 llvm-svn: 287087	2016-11-16 09:00:28 +00:00
Craig Topper	6910fa0ef4	[X86] Remove the scalar intrinsics for fadd/fsub/fdiv/fmul Summary: These intrinsics have been unused for clang for a while. This patch removes them. We auto upgrade them to extractelements, a scalar operation and then an insertelement. This matches the sequence used by clangs intrinsic file. Reviewers: zvi, delena, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26660 llvm-svn: 287083	2016-11-16 05:24:10 +00:00
Craig Topper	e6915b85ed	[X86] Add LLVM version number for each intrinsic handled by auto upgrade for age tracking. One day we'd like to remove some of this autoupgrade support and it will be easier if we know how long some of it has been around. Differential Revision: https://reviews.llvm.org/D26321 llvm-svn: 286933	2016-11-15 05:04:51 +00:00
Craig Topper	353e59b6d6	[AVX-512] Remove and autoupgrade masked dword/qword variable shift intrinsics to the new unmasked versions and selects. llvm-svn: 286786	2016-11-14 01:53:22 +00:00
Craig Topper	987dad2bc3	[X86][IR] Reduce the number of full string comparisons in the code that autoupgrades masked shift intrinsics. llvm-svn: 286768	2016-11-13 19:09:56 +00:00
Igor Breger	e2399f9e0e	revert commit r286761, some builds failed on Win platforms llvm-svn: 286765	2016-11-13 15:48:11 +00:00
Ayman Musa	c09b3769ae	[X86][AVX512] Removing llvm x86 intrinsics for _mm_mask_move_{ss\|sd} intrinsics. Differential Revision: https://reviews.llvm.org/D26128 llvm-svn: 286761	2016-11-13 14:51:25 +00:00
Craig Topper	da6a63db1c	[AVX-512] Remove the remaining masked shift by immediate or by single value. Autoupgrade them to recently introduced unmasked versions and a select. After this I'll add the unmasked intrinsics to InstCombineCalls to finish making our handling of these types of shuffles consistent between AVX-512 and the legacy intrinsics. llvm-svn: 286725	2016-11-12 18:04:46 +00:00
George Burgess IV	63b06e185c	Add a missing break statement. NFC. llvm-svn: 286203	2016-11-08 04:01:50 +00:00
Craig Topper	b110e04851	[AVX-512] Remove masked pmovzx/pmovsx builtins and autoupgrade them to selects and native zext/sext. This mostly reuses earlier autoupgrade support for the sse and avx equivalents. Just needed to add the code to add the select. llvm-svn: 286092	2016-11-07 02:12:57 +00:00
Craig Topper	96041c6ba9	[X86] Use StringRef::startswith to reduce a few compares in the intrinsic autoupgrade code. llvm-svn: 286090	2016-11-07 00:13:42 +00:00
Craig Topper	7e545335d6	[AVX-512] Remove 128/256 masked pshufb intrinsics. Autoupgrade them to legacy intrinsics and a select. llvm-svn: 286089	2016-11-07 00:13:39 +00:00

1 2 3 4 5 ...

396 Commits