llvm-project

Commit Graph

Author	SHA1	Message	Date
Roman Shirokiy	272eac85c7	Fix overconfident assert in ScalarEvolution::isImpliedViaMerge We can have AddRec with loops having many predecessors. This changes an assert to an early return. Differential Revision: https://reviews.llvm.org/D48766 llvm-svn: 335965	2018-06-29 11:46:30 +00:00
Sjoerd Meijer	3b599d75d5	[AArch64] Armv8.4-A: Virtualization system registers This adds the Secure EL2 extension. Differential Revision: https://reviews.llvm.org/D48711 llvm-svn: 335962	2018-06-29 11:03:15 +00:00
Simon Pilgrim	aab8660e23	[X86][SSE] Support v16i8/v32i8 vector rotations This uses the same technique as for shifts - split the rotation into 4/2/1-bit partial rotations and select those partials based on the amount bit, making use of PBLENDVB if available. This halves the use of PBLENDVB compared to expanding to shifts, which can be a slow op. Unfortunately I haven't found a decent way to share much of this code with the shift equivalent. Differential Revision: https://reviews.llvm.org/D48655 llvm-svn: 335957	2018-06-29 09:36:39 +00:00
Sjoerd Meijer	195e904002	[ARM][AArch64] Armv8.4-A Enablement Initial patch adding assembly support for Armv8.4-A. Besides adding v8.4 as a supported architecture to the usual places, this also adds target features for the different crypto algorithms. Armv8.4-A introduced new crypto algorithms, made them optional, and allows different combinations: - none of the v8.4 crypto functions are supported, which is independent of the implementation of the Armv8.0 SHA1 and SHA2 instructions. - the v8.4 SHA512 and SHA3 support is implemented, in this case the Armv8.0 SHA1 and SHA2 instructions must also be implemented. - the v8.4 SM3 and SM4 support is implemented, which is independent of the implementation of the Armv8.0 SHA1 and SHA2 instructions. - all of the v8.4 crypto functions are supported, in this case the Armv8.0 SHA1 and SHA2 instructions must also be implemented. The v8.4 crypto instructions are added to AArch64 only, and not AArch32, and are made optional extensions to Armv8.2-A. The user-facing Clang options will map on these new target features, their naming will be compatible with GCC and added in follow-up patches. The Armv8.4-A instruction sets can be downloaded here: https://developer.arm.com/products/architecture/a-profile/exploration-tools Differential Revision: https://reviews.llvm.org/D48625 llvm-svn: 335953	2018-06-29 08:43:19 +00:00
Roman Lebedev	8d081b78e4	SCEVExpander::expandAddRecExprLiterally(): check before casting as Instruction Summary: An alternative to D48597. Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=37936 \| PR37936 ]]. The problem is as follows: 1. `indvars` marks `%dec` as `NUW`. 2. `loop-instsimplify` runs `instsimplify`, which constant-folds `%dec` to -1 (D47908) 3. `loop-reduce` tries to do some further modification, but crashes with an type assertion in cast, because `%dec` is no longer an `Instruction`, If the runline is split into two, i.e. you first run `-indvars -loop-instsimplify`, store that into a file, and then run `-loop-reduce`, there is no crash. So it looks like the problem is due to `-loop-instsimplify` not discarding SCEV. But in this case we can just not crash if it's not an `Instruction`. This is just a local fix, unlike D48597, so there may very well be other problems. Reviewers: mkazantsev, uabelho, sanjoy, silviu.baranga, wmi Reviewed By: mkazantsev Subscribers: evstupac, javed.absar, spatel, llvm-commits Differential Revision: https://reviews.llvm.org/D48599 llvm-svn: 335950	2018-06-29 07:44:20 +00:00
Craig Topper	875e9f8fa4	[X86] Remove masking from the avx512 packed sqrt intrinsics. Use select in IR instead. While there improve the coverage of the intrinsic testing and add fast-isel tests. llvm-svn: 335944	2018-06-29 05:43:26 +00:00
Tom Stellard	c5a154db48	AMDGPU: Separate R600 and GCN TableGen files Summary: We now have two sets of generated TableGen files, one for R600 and one for GCN, so each sub-target now has its own tables of instructions, registers, ISel patterns, etc. This should help reduce compile time since each sub-target now only has to consider information that is specific to itself. This will also help prevent the R600 sub-target from slowing down new features for GCN, like disassembler support, GlobalISel, etc. Reviewers: arsenm, nhaehnle, jvesely Reviewed By: arsenm Subscribers: MatzeB, kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46365 llvm-svn: 335942	2018-06-28 23:47:12 +00:00
Eli Friedman	65d885e376	[ARM] Assert that ARMDAGToDAGISel creates valid UBFX/SBFX nodes. We don't ever check these again (unless you're using -fno-integrated-as), so make sure the extracted bits are well-defined. I don't think it's possible to trigger any of the assertions on trunk, but it's difficult to prove. (The first one depends on DAGCombine to minimize the number of set bits in AND masks; I think the others are mathematically impossible to hit.) llvm-svn: 335931	2018-06-28 21:49:41 +00:00
Jessica Paquette	0c5d3ffbb8	[MachineOutliner] Never add the outliner in -O0 This is a recommit of r335879. We shouldn't add the outliner when compiling at -O0 even if -enable-machine-outliner is passed in. This makes sure that we don't add it in this case. This also removes -O0 from the outliner DWARF test. llvm-svn: 335930	2018-06-28 21:49:24 +00:00
Jake Ehrlich	0f440d832f	[llvm-readobj] Add experimental support for SHT_RELR sections This change adds experimental support for SHT_RELR sections, proposed here: https://groups.google.com/forum/#!topic/generic-abi/bX460iggiKg Definitions for the new ELF section type and dynamic array tags, as well as the encoding used in the new section are all under discussion and are subject to change. Use with caution! Author: rahulchaudhry Differential Revision: https://reviews.llvm.org/D47919 llvm-svn: 335922	2018-06-28 21:07:34 +00:00
Sanjay Patel	d512853aa3	[InstCombine] fix opcode check in shuffle fold There's no way to expose this difference currently, but we should use the updated variable because the original opcodes can go stale if we transform into something new. llvm-svn: 335920	2018-06-28 20:52:43 +00:00
Martin Storsjo	2a9bd7b756	[COFF] Fix constant sharing regression for MinGW This fixes a regression since SVN r334523, where the object files built targeting MinGW were rejected by GNU binutils tools. Prior to that commit, we only put constants in comdat for MSVC configurations. Differential Revision: https://reviews.llvm.org/D48567 llvm-svn: 335918	2018-06-28 20:28:29 +00:00
Teresa Johnson	e87868b7e9	[ThinLTO] Port InlinerFunctionImportStats handling to new PM Summary: The InlinerFunctionImportStats will collect and dump stats regarding how many function inlined into the module were imported by ThinLTO. Reviewers: wmi, dexonsmith Subscribers: mehdi_amini, inglorion, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D48729 llvm-svn: 335914	2018-06-28 20:07:47 +00:00
Benjamin Kramer	23d8282047	[NVPTX] Delete dead code No functionality change. llvm-svn: 335913	2018-06-28 20:05:35 +00:00
Eli Friedman	6613efbd4e	[ARM] Add missing Thumb2 assembler diagnostics. Mostly just adding checks for Thumb2 instructions which correspond to ARM instructions which already had diagnostics. While I'm here, also fix ARM-mode strd to check the input registers correctly. Differential Revision: https://reviews.llvm.org/D48610 llvm-svn: 335909	2018-06-28 19:53:12 +00:00
Anastasis Grammenos	425df22ee3	[SROA] Preserve DebugLoc when rewriting alloca partitions When rewriting an alloca partition copy the DL from the old alloca over the the new one. Differential Revision: https://reviews.llvm.org/D48640 llvm-svn: 335904	2018-06-28 18:58:30 +00:00
Zachary Turner	1adca7c4a5	Add a flag to FileOutputBuffer that allows modification. FileOutputBuffer creates a temp file and on commit atomically renames the temp file to the destination file. Sometimes we want to modify an existing file in place, but still have the atomicity guarantee. To do this we can initialize the contents of the temp file from the destination file (if it exists), that way the resulting FileOutputBuffer can have only selective bytes modified. Committing will then atomically replace the destination file as desired. llvm-svn: 335902	2018-06-28 18:49:09 +00:00
Simon Pilgrim	c09b5e31d7	Remove unnecessary semicolon. NFCI. Fixes -Wpedantic warning. llvm-svn: 335901	2018-06-28 18:37:16 +00:00
Craig Topper	90317d1d94	[X86] Suppress load folding into and/or/xor if it will prevent matching btr/bts/btc. This is a follow up to r335753. At the time I forgot about isProfitableToFold which makes this pretty easy. Differential Revision: https://reviews.llvm.org/D48706 llvm-svn: 335895	2018-06-28 17:58:01 +00:00
Jonas Devlieghere	b757fc3878	Revert "Re-land r335297 "[X86] Implement more of x86-64 large and medium PIC code models"" Reverting because this is causing failures in the LLDB test suite on GreenDragon. LLVM ERROR: unsupported relocation with subtraction expression, symbol '__GLOBAL_OFFSET_TABLE_' can not be undefined in a subtraction expression llvm-svn: 335894	2018-06-28 17:56:43 +00:00
Sanjay Patel	57bda365bf	[InstCombine] allow shl+mul combos with shuffle (select) fold (PR37806) This is an enhancement to D48401 that was discussed in: https://bugs.llvm.org/show_bug.cgi?id=37806 We can convert a shift-left-by-constant into a multiply (we canonicalize IR in the other direction because that's generally better of course). This allows us to remove the shuffle as we do in the regular opcodes-are-the-same cases. This requires a small hack to make sure we don't introduce any extra poison: https://rise4fun.com/Alive/ZGv Other examples of opcodes where this would work are add+sub and fadd+fsub, but we already canonicalize those subs into adds, so there's nothing to do for those cases AFAICT. There are planned enhancements for opcode transforms such or -> add. Note that there's a different fold needed if we've already managed to simplify away a binop as seen in the test based on PR37806, but we manage to get that one case here because this fold is positioned above the demanded elements fold currently. Differential Revision: https://reviews.llvm.org/D48485 llvm-svn: 335888	2018-06-28 17:48:04 +00:00
Jessica Paquette	dafa198c96	[MachineOutliner] Define MachineOutliner support in TargetOptions Targets should be able to define whether or not they support the outliner without the outliner being added to the pass pipeline. Before this, the outliner pass would be added, and ask the target whether or not it supports the outliner. After this, it's possible to query the target in TargetPassConfig, before the outliner pass is created. This ensures that passing -enable-machine-outliner will not modify the pass pipeline of any target that does not support it. https://reviews.llvm.org/D48683 llvm-svn: 335887	2018-06-28 17:45:43 +00:00
Simon Pilgrim	9c70d48cb2	[DAGCombiner] Ensure we use the correct CC result type in visitSDIV (REAPPLIED) We could get away with it for constant folded cases, but not for rL335719. Thanks to Krzysztof Parzyszek for noticing. Reapply original commit rL335821 which was reverted at rL335871 due to a WebAssembly bug that was fixed at rL335884. llvm-svn: 335886	2018-06-28 17:33:41 +00:00
Simon Pilgrim	99f701673d	[WebAssembly] Add getSetCCResultType placeholder override to handle vector compare results. Necessary to get the rL335821 bugfix (which was reverted at rL335871) un-reverted. llvm-svn: 335884	2018-06-28 17:27:09 +00:00
Jessica Paquette	d6261bef7b	Revert "[MachineOutliner] Add always and never options to -enable-machine-outliner" I accidentally committed this instead of D48683 because I haven't had coffee yet. llvm-svn: 335883	2018-06-28 17:26:19 +00:00
Jessica Paquette	f3a44fe833	Revert "[MachineOutliner] Never add the outliner in -O0" This reverts commit 9c7c10e4073a0bc6a759ce5cd33afbac74930091. It relies on r335872 since that introduces the machine outliner flags test. I meant to commit D48683 in that commit, but got mixed up and committed D48682 instead. So, I'm reverting this and r335872, since D48682 hasn't made it through review yet. llvm-svn: 335882	2018-06-28 17:26:18 +00:00
Jessica Paquette	c9d675266e	[MachineOutliner] Never add the outliner in -O0 We shouldn't add the outliner when compiling at -O0 even if -enable-machine-outliner is passed in. This makes sure that we don't add it in this case. This also updates machine-outliner-flags to reflect the change and improves the comment describing what that test does. llvm-svn: 335879	2018-06-28 17:05:57 +00:00
Matthias Braun	da5e7e11d1	SelectionDAGBuilder, mach-o: Skip trap after noreturn call (for Mach-O) Add NoTrapAfterNoreturn target option which skips emission of traps behind noreturn calls even if TrapUnreachable is enabled. Enable the feature on Mach-O to save code size; Comments suggest it is not possible to enable it for the other users of TrapUnreachable. rdar://41530228 DifferentialRevision: https://reviews.llvm.org/D48674 llvm-svn: 335877	2018-06-28 17:00:45 +00:00
Jessica Paquette	1ccb66c5fb	[MachineOutliner] Add always and never options to -enable-machine-outliner To enable the MachineOutliner by default on AArch64, we need to be able to disable the MachineOutliner and also provide an option to "always" enable the outliner. This adds that capability. It allows the user to still use the old -enable-machine-outliner option, which defaults to "always". This is building up to allowing the user to specify "always" versus the target-default outlining behaviour. llvm-svn: 335872	2018-06-28 16:39:42 +00:00
Haojian Wu	2103990e63	Revert "[DAGCombiner] Ensure we use the correct CC result type in visitSDIV" This reverts commit r335821. This crashes the webassembly test, run "ninja check-llvm-codegen-webassembly" to reproduce. llvm-svn: 335871	2018-06-28 16:25:57 +00:00
Stanislav Mekhanoshin	67aa18f165	[AMDGPU] Early expansion of 32 bit udiv/urem This allows hoisting of a common code, for instance if denominator is loop invariant. Current change is expansion only, adding licm to the target pass list going to be a separate patch. Given this patch changes to codegen are minor as the expansion is similar to that on DAG. DAG expansion still must remain for R600. Differential Revision: https://reviews.llvm.org/D48586 llvm-svn: 335868	2018-06-28 15:59:18 +00:00
Stanislav Mekhanoshin	298a61590a	[AMDGPU] Overload llvm.amdgcn.fmad.ftz to support f16 Differential Revision: https://reviews.llvm.org/D48677 llvm-svn: 335866	2018-06-28 15:24:46 +00:00
John Brawn	bdbbd8381f	Add a PhiValuesAnalysis pass to calculate the underlying values of phis This pass is being added in order to make the information available to BasicAA, which can't do caching of this information itself, but possibly this information may be useful for other passes. Incorporates code based on Daniel Berlin's implementation of Tarjan's algorithm. Differential Revision: https://reviews.llvm.org/D47893 llvm-svn: 335857	2018-06-28 14:13:06 +00:00
Benjamin Kramer	269eb21e1c	Revert "Add support for generating a call graph profile from Branch Frequency Info." This reverts commits r335794 and r335797. Breaks ThinLTO+FDO selfhost. llvm-svn: 335851	2018-06-28 13:15:03 +00:00
Sjoerd Meijer	c89ca5582a	[ARM] Parallel DSP Pass Armv6 introduced instructions to perform 32-bit SIMD operations. The purpose of this pass is to do some straightforward IR pattern matching to create ACLE DSP intrinsics, which map on these 32-bit SIMD operations. Currently, only the SMLAD instruction gets recognised. This instruction performs two multiplications with 16-bit operands, and stores the result in an accumulator. We will follow this up with patches to recognise SMLAD in more cases, and also to generate other DSP instructions (like e.g. SADD16). Patch by: Sam Parker and Sjoerd Meijer Differential Revision: https://reviews.llvm.org/D48128 llvm-svn: 335850	2018-06-28 12:55:29 +00:00
Jesper Antonsson	514b6b5796	Comment change to verify commit rights. NFC. Summary: Just a silly one-character correction. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D48709 llvm-svn: 335832	2018-06-28 10:55:04 +00:00
Hans Wennborg	a257376003	s/TablesChecked/TableChecked/ after r335823 llvm-svn: 335831	2018-06-28 10:24:38 +00:00
Matt Arsenault	75e7192ba3	AMDGPU: Remove MFI::ABIArgOffset We have too many mechanisms for tracking the various offsets used for kernel arguments, so remove one. There's still a lot of confusion with these because there are two different "implicit" argument areas located at the beginning and end of the kernarg segment. Additionally, the offset was determined based on the memory size of the split element types. This would break in a future commit where v3i32 is decomposed into separate i32 pieces. llvm-svn: 335830	2018-06-28 10:18:55 +00:00
Matt Arsenault	1fb9013368	AMDGPU: Error on calls from graphics shaders In principle nothing should stop these from working, but work is necessary to create an ABI for dealing with the stack related registers. llvm-svn: 335829	2018-06-28 10:18:36 +00:00
Matt Arsenault	12269dda5c	AMDGPU: Fix AMDGPUCodeGenPrepare using uninitialized AMDGPUAS struct Not sure how this wasn't noticed before. llvm-svn: 335828	2018-06-28 10:18:23 +00:00
Matt Arsenault	513e0c0ea4	AMDGPU: Fix assert on aggregate type kernel arguments Just fix the crash for now by not doing the optimization since figuring out how to properly convert the bits for an arbitrary struct is a pain. Also fix a crash when there is only an empty struct argument. llvm-svn: 335827	2018-06-28 10:18:11 +00:00
Benjamin Kramer	f9613b2995	Unify sorted asserts to use the existing atomic pattern These are all benign races and only visible in !NDEBUG. tsan complains about it, but a simple atomic bool is sufficient to make it happy. llvm-svn: 335823	2018-06-28 10:03:45 +00:00
Simon Pilgrim	abebe4c746	[DAGCombiner] Ensure we use the correct CC result type in visitSDIV We could get away with it for constant folded cases, but not for rL335719. Thanks to Krzysztof Parzyszek for noticing. llvm-svn: 335821	2018-06-28 09:54:28 +00:00
Florian Hahn	388af14f85	[SCCP] Mark CFG as preserved. SCCP does not change the CFG, so we can mark it as preserved. Reviewers: dberlin, efriedma, davide Reviewed By: davide Differential Revision: https://reviews.llvm.org/D47149 llvm-svn: 335820	2018-06-28 09:53:38 +00:00
Simon Pilgrim	49cb65bb7b	[DAGCombiner] Remove unused variable. NFCI. Noticed in D45806 review. llvm-svn: 335817	2018-06-28 09:29:08 +00:00
Max Kazantsev	f5ba37182e	[IndVarSimplify] Ignore unreachable users of truncs If a trunc has a user in a block which is not reachable from entry, we can safely perform trunc elimination as if this user didn't exist. llvm-svn: 335816	2018-06-28 08:20:03 +00:00
Petar Jovanovic	d175aeb881	[DwarfDebug] Remove unused argument (NFC) Remove unused ByteStreamer argument from function emitDebugLocValue. Patch by Nikola Prica. Differential Revision: https://reviews.llvm.org/D48590 llvm-svn: 335811	2018-06-28 04:50:40 +00:00
Craig Topper	ec5d568ac1	[X86] Use PatFrag with hardcoded numbers for FROUND_NO_EXC/FROUND_CURRENT instead of ImmLeafs with predicates where one of the two numbers was hardcoded. This more efficient for the isel table generator since we can use CheckChildInteger instead of MoveChild, CheckPredicate, MoveParent. This reduced the table size by 1-2K. I wish there was a way to share the values with X86BaseInfo.h and still use a PatFrag like this. These numbers are fixed by the X86 intrinsic spec going back many years and we should never need to change them. So we shouldn't waste table bytes to support sharing. llvm-svn: 335806	2018-06-28 01:45:44 +00:00
Craig Topper	ab70f58891	[X86] Change how we prefer shift by immediate over folding a load into a shift. BMI2 added new shift by register instructions that have the ability to fold a load. Normally without doing anything special isel would prefer folding a load over folding an immediate because the load folding pattern has higher "complexity". This would require an instruction to move the immediate into a register. We would rather fold the immediate instead and have a separate instruction for the load. We used to enforce this priority by artificially lowering the complexity of the load pattern. This patch changes this to instead reject the load fold in isProfitableToFoldLoad if there is an immediate. This is more consistent with other binops and feels less hacky. llvm-svn: 335804	2018-06-28 00:47:41 +00:00
Michael J. Spencer	98f5475f44	[CGProfile] Fix unused variable warning. llvm-svn: 335797	2018-06-28 00:12:04 +00:00

1 2 3 4 5 ...

114508 Commits