llvm-project

Commit Graph

Author	SHA1	Message	Date
Nico Weber	e4a12cfa2f	revert r332610, it breaks cfi, see D46326 llvm-svn: 332838	2018-05-21 11:44:39 +00:00
Aleksandar Beserminji	4977705727	[mips] Revert Merge MipsLongBranch and MipsHazardSchedule passes Revert this patch due buildbot failure. Differential Revision: https://reviews.llvm.org/D46641 llvm-svn: 332837	2018-05-21 11:38:52 +00:00
David Green	8ceab61c75	[CVP] Require DomTree for new Pass Manager We were previously using a DT in CVP through SimplifyQuery, but not requiring it in the new pass manager. Hence it would crash if DT was not already available. This now gets DT directly and plumbs it through to where it is used (instead of using it through SQ). llvm-svn: 332836	2018-05-21 11:06:28 +00:00
Aleksandar Beserminji	de7be5e46f	[mips] Merge MipsLongBranch and MipsHazardSchedule passes MipsLongBranchPass and MipsHazardSchedule passes are joined to one pass because of mutual conflict. When MipsHazardSchedule inserts 'nop's, it potentially breaks some jumps, so they have to be expanded to long branches. When some branch is expanded to long branch, it potentially creates a hazard situation, which should be fixed by adding nops. New pass is called MipsBranchExpansion, it combines these two passes, and runs them alternately until one of them reports no changes were made. Differential Revision: https://reviews.llvm.org/D46641 llvm-svn: 332834	2018-05-21 10:20:02 +00:00
Simon Pilgrim	5aa7cdfd70	[X86][SSE] Support v4i32 rotations (PR37426) As suggested by Fabian on PR37426, we can use PMULUDQ to perform v4i32 vector rotations as the upper 32bits of the multiply will contain the 'wrapped' bits of the rotation. v8i16/v16i8 rotations would be straightforward to add to lowerRotate in the future - ideally we'd mostly share code with the vector shifts lowering. Differential Revision: https://reviews.llvm.org/D46954 llvm-svn: 332832	2018-05-21 09:45:59 +00:00
Nico Weber	d418776e04	win: try more to fix dia tests with newer msvc versions llvm-svn: 332828	2018-05-21 02:55:41 +00:00
Nico Weber	da5513b9c4	win: try to fix dia tests with newer msvc versions llvm-svn: 332827	2018-05-21 02:09:57 +00:00
Robert Widmann	360d6e35e6	[LLVM-C] Improve Bindings For Aliases Summary: Add wrappers for a module's alias iterators and a getter and setter for the aliasee value. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: llvm-commits, harlanhaskins Differential Revision: https://reviews.llvm.org/D46808 llvm-svn: 332826	2018-05-20 23:49:08 +00:00
Craig Topper	e4c045b7df	[X86] Remove mask arguments from permvar builtins/intrinsics. Use a select in IR instead. Someday maybe we'll use selects for all intrinsics. llvm-svn: 332824	2018-05-20 23:34:04 +00:00
Simon Dardis	777afc7fbd	[mips] Add microMIPSR6 ll/sc instructions. Previously the compiler was using the microMIPSR3 variants, incorrectly. Reviewers: atanasyan, abeserminji, smaksimovic Differential Revision: https://reviews.llvm.org/D46948 llvm-svn: 332820	2018-05-20 17:21:00 +00:00
Sanjay Patel	a003c728a5	[InstCombine] choose 1 form of abs and nabs as canonical We already do this for min/max (see the blob above the diff), so we should do the same for abs/nabs. A sign-bit check (<s 0) is used as a predicate for other IR transforms and it's likely the best for codegen. This might solve the motivating cases for D47037 and D47041, but I think those patches still make sense. We can't guarantee this canonicalization if the icmp has more than one use. Differential Revision: https://reviews.llvm.org/D47076 llvm-svn: 332819	2018-05-20 14:23:23 +00:00
Craig Topper	9ed890b0db	[X86] Add test cases to show missed rotate opportunities due to SimplifyDemandedBits. llvm-svn: 332815	2018-05-20 02:32:45 +00:00
Haicheng Wu	69ba0613f2	[GlobalMerge] Exit early if only one global is to be merged To save some compilation time and prevent some unnecessary changes. Differential Revision: https://reviews.llvm.org/D46640 llvm-svn: 332813	2018-05-19 18:00:02 +00:00
Max Kazantsev	c0b268f90c	[IRCE] Fix miscompile with range checks against negative values In the patch rL329547, we have lifted the over-restrictive limitation on collected range checks, allowing to work with range checks with the end of their range not being provably non-negative. However it appeared that the non-negativity of this value was assumed in the utility function `ClampedSubtract`. In particular, its reasoning is based on the fact that `0 <= SINT_MAX - X`, which is not true if `X` is negative. The function `ClampedSubtract` is only called twice, once with `X = 0` (which is OK) and the second time with `X = IRC.getEnd()`, where we may now see the problem if the end is actually a negative value. In this case, we may sometimes miscompile. This patch is the conservative fix of the miscompile problem. Rather than rejecting non-provably non-negative `getEnd()` values, we will check it for non-negativity in runtime. For this, we use function `smax(smin(X, 0), -1) + 1` that is equal to `1` if `X` is non-negative and is equal to 0 if `X` is negative. If we multiply `Begin, End` of safe iteration space by this function calculated for `X = IRC.getEnd()`, we will get the original `[Begin, End)` if `IRC.getEnd()` was non-negative (and, thus, `ClampedSubtract` worked correctly) and the empty range `[0, 0)` in case if ` IRC.getEnd()` was negative. So we in fact prohibit execution of the main loop if at least one of range checks was made against a negative value (and we figured it out in runtime). It is still better than what we have before (non-negativity had to be proved in compile time) and prevents us from miscompile, however it is sometiles too restrictive for unsigned range checks against a negative value (which in fact can be eliminated). Once we re-implement `ClampedSubtract` in a way that it handles negative `X` correctly, this limitation can be lifted, too. Differential Revision: https://reviews.llvm.org/D46860 Reviewed By: samparker llvm-svn: 332809	2018-05-19 13:06:37 +00:00
Benjamin Kramer	a76b64ff80	[MergeICmps] Don't crash when memcmp is not available Fixes clang crashing with -fno-builtin, PR37527. llvm-svn: 332808	2018-05-19 12:51:59 +00:00
Yaxun Liu	ea988f1fd9	Fix evaluator for non-zero alloca addr space The evaluator goes through BB and creates global vars as temporary values to evaluate results of LLVM instructions. It creates undef for alloca, however it assumes alloca in addr space 0. If the next instruction is addrspace cast to 0, then we get an invalid cast instruction. This patch let the temp global var have an address space matching alloca addr space, so that the valuation can be done. Differential Revision: https://reviews.llvm.org/D47081 llvm-svn: 332794	2018-05-19 02:58:16 +00:00
Piotr Padlewski	5642a42442	Propagate nonnull and dereferenceable throught launder Summary: invariant.group.launder should not stop propagation of nonnull and dereferenceable, because e.g. we would not be able to hoist loads speculatively. Reviewers: rsmith, amharc, kuhar, xbolva00, hfinkel Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D46972 llvm-svn: 332788	2018-05-18 23:54:33 +00:00
Piotr Padlewski	ce358262eb	Dissallow non-empty metadata for invariant.group Summary: This feature is not needed, but it might be usefull in the future to use metadata to mark what which function should support it (and strip it when not). Reviewers: rsmith, sanjoy, amharc, kuhar Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45419 llvm-svn: 332787	2018-05-18 23:53:46 +00:00
Piotr Padlewski	a26a08cb52	Constant fold launder of null and undef Summary: This might be useful because clang will add some barriers for pointer comparisons. Reviewers: majnemer, dberlin, hfinkel, nlewycky, davide, rsmith, amharc, kuhar Subscribers: davide, amharc, llvm-commits Differential Revision: https://reviews.llvm.org/D32423 llvm-svn: 332786	2018-05-18 23:52:57 +00:00
Piotr Padlewski	153fe60079	[MemDep] Fixed handling of invariant.group Summary: Memdep had funny bug related to invariant.groups - because it did not invalidated cache, in some very rare cases it was possible to show memory dependence of the instruction that was deleted, but because other instruction took it's place it resulted in call to vtable! Thanks @amharc for repro!. Reviewers: dberlin, kuhar, amharc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45320 Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com> llvm-svn: 332781	2018-05-18 22:40:34 +00:00
Sanjay Patel	b41a5affea	[x86] add more FP with FMF simplification tests; NFC llvm-svn: 332780	2018-05-18 22:31:43 +00:00
Matt Arsenault	9fc8593a77	DAG: Fix crash on shift with large shift amounts Fixes bug 37521. llvm-svn: 332774	2018-05-18 21:54:16 +00:00
Matt Arsenault	372d796ab1	AMDGPU: Add pass to optimize reqd_work_group_size Eliminate loads from the dispatch packet when they will have a known value. Also pattern match the code used by the library to handle partial workgroup dispatches, which isn't necessary if reqd_work_group_size is used. llvm-svn: 332771	2018-05-18 21:35:00 +00:00
Evgeniy Stepanov	28f330fd6f	[msan] Don't check divisor shadow in fdiv. Summary: Floating point division by zero or even undef does not have undefined behavior and may occur due to optimizations. Fixes https://bugs.llvm.org/show_bug.cgi?id=37523. Reviewers: kcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D47085 llvm-svn: 332761	2018-05-18 20:19:53 +00:00
Wolfgang Pieb	ad60559be7	[DWARF v5] Improved support for .debug_rnglists (consumer). Enables any consumer to extract DWARF v5 encoded rangelists. Reviewer: JDevlieghere Differential Revision: https://reviews.llvm.org/D45549 llvm-svn: 332759	2018-05-18 20:12:54 +00:00
Michael Berg	1fa76cc3ea	adding baseline fp fold tests for unsafe on and off llvm-svn: 332756	2018-05-18 19:30:49 +00:00
Amara Emerson	08099c7edd	Delete a test that was missed in the revert r332747. r332747 originally reverted r332654 which added this test. llvm-svn: 332755	2018-05-18 19:21:40 +00:00
Brendon Cahoon	e5ed563cc5	[Hexagon] Generate post-increment for floating point types The code that generates post-increments for Hexagon considered integer values only. This patch adds support to generate them for floating point values, f32 and f64. Differential Revision: https://reviews.llvm.org/D47036 llvm-svn: 332748	2018-05-18 18:14:44 +00:00
Simon Pilgrim	1273f4ad93	[X86] Add GPR<->XMM Schedule Tags BtVer2 - fix NumMicroOp and account for the Lat+6cy GPR->XMM and Lat+1cy XMm->GPR delays (see rL332737) The high number of MOVD/MOVQ equivalent instructions meant that there were a number of missed patterns in SNB/Znver1: SNB - add missing GPR<->MMX costs (taken from Agner / Intel AOM) Znver1 - add missing GPR<->XMM MOVQ costs (taken from Agner) llvm-svn: 332745	2018-05-18 17:58:36 +00:00
Craig Topper	f94ed26ea9	[X86] Directly legalize v16i16/v8i16 vselect to vXi8 vselect to use VPBLENDVB The intrinsic legalization for masked truncate uses ISD::TRUNCATE which can be constant folded by getNode. This prevents getVectorMaskingNode from seeing the ISD::TRUNCATE special case where it should emit X86ISD::SELECT instead of ISD::VSELECT. This causes a vselect with a v16i1 or v8i1 condition to be emitted during vector legalization. but vector legalization doesn't revisit nodes it creates. DAG combine will then promote this condition to match the result type. Then op legalization will try to legalize it, but the custom lowering hook returned SDValue(). But op legalization doesn't have an Expand for VSELECT because it expects vector legalization to have taken care of it. So the operation sticks around and fails in isel. This patch adds a custom legalization hook to morph it to a vXi8 vselect instead. This also simplifies the normal vXi16 vselect handling because vector legalization was normally expanding to AND/ANDN/OR and DAG combine was turning that into VBLENDVB. So we can skip a step by doing it directly. Fixes PR37499 Differential Revision: https://reviews.llvm.org/D47025 llvm-svn: 332743	2018-05-18 17:48:06 +00:00
Than McIntosh	3c639dbd0d	Revert changes from D46265. This is a revert of the changes from https://reviews.llvm.org/D46265; the new test introduced (test/CodeGen/X86/PR37310.mir) causes buildbot failures. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47061 llvm-svn: 332742	2018-05-18 17:47:10 +00:00
Nirav Dave	588fad4d3b	[MC] Relax .fill size requirements Avoid requirement that number of values must be known at assembler time. Fixes PR33586. Reviewers: rnk, peter.smith, echristo, jyknight Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D46703 llvm-svn: 332741	2018-05-18 17:45:48 +00:00
Craig Topper	fdde9f363c	[X86] Update fast-isel test cases for _mm256_mask_cvtepi16_epi8 to match clang r332738. llvm-svn: 332740	2018-05-18 17:29:47 +00:00
Jessica Paquette	e49374d009	Add remarks describing when a pass changes the IR instruction count of a module This patch adds a remark which tells the user when a pass changes the number of IR instructions in a module. It can be enabled by using -Rpass-analysis=size-info. The point of this is to make it easier to collect statistics on how passes modify programs in terms of code size. This is similar in concept to timing reports, but using a remark-based interface makes it easy to diff changes over multiple compilations of the same program. By adding functionality like this, we can see * Which passes impact code size the most * How passes impact code size at different optimization levels * Which pass might have contributed the most to an overall code size regression The patch lives in the legacy pass manager, but since it's simply emitting remarks, it shouldn't be too difficult to adapt the functionality to the new pass manager as well. This can also be adapted to handle MachineInstr counts in code gen passes. https://reviews.llvm.org/D38768 llvm-svn: 332739	2018-05-18 17:26:39 +00:00
Simon Pilgrim	007b50fd35	[X86][BtVer2] Improve simulation of (V)PINSR values Include the 6cy delay transferring from the GPR to FPU. llvm-svn: 332737	2018-05-18 17:09:41 +00:00
Sanjay Patel	56e09c6928	[InstCombine] add tests for lack of abs/nabs canonicalization; NFC llvm-svn: 332726	2018-05-18 15:26:38 +00:00
Sanjay Patel	fa3e4601c6	[InstCombine] regenerate checks; NFC There were a combination of auto-generated styles in use here because the scripts have evolved. llvm-svn: 332725	2018-05-18 15:22:19 +00:00
Simon Pilgrim	3ecb0b80f6	[X86][BtVer2] Partial vector stores (inc MMX) have a 2cy latency llvm-svn: 332722	2018-05-18 14:22:22 +00:00
Simon Pilgrim	c4b8d367a8	[X86][SSE] Ensure vector partial load/stores use the WriteVecLoad/WriteVecStore scheduler classes Retag some instructions that were missed when we split off vector load/store/moves - MOVQ/MOVD etc. Fixes BtVer2/SLM which have different behaviours for GPR stores. llvm-svn: 332718	2018-05-18 14:08:01 +00:00
Simon Pilgrim	d749b321b2	[X86][SSE] Ensure float load/stores use the WriteFLoad/WriteFStore scheduler classes Retag some instructions that were missed when we split off vector load/store/moves - MOVSS/MOVSD/MOVHPD/MOVHPD/MOVLPD/MOVLPS etc. Fixes BtVer2/SLM which have different behaviours for GPR stores. llvm-svn: 332714	2018-05-18 13:13:59 +00:00
Than McIntosh	4c21a363af	StackColoring: better handling of statically unreachable code Summary: Avoid assert/crash during liveness calculation in situations where the incoming machine function has statically unreachable BBs. Fixes PR37130. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46265 llvm-svn: 332707	2018-05-18 12:25:30 +00:00
Alexander Ivchenko	5c54742da4	[X86][CET] Changing -fcf-protection behavior to comply with gcc (LLVM part) This patch aims to match the changes introduced in gcc by https://gcc.gnu.org/ml/gcc-cvs/2018-04/msg00534.html. The IBT feature definition is removed, with the IBT instructions being freely available on all X86 targets. The shadow stack instructions are also being made freely available, and the use of all these CET instructions is controlled by the module flags derived from the -fcf-protection clang option. The hasSHSTK option remains since clang uses it to determine availability of shadow stack instruction intrinsics, but it is no longer directly used. Comes with a clang patch (D46881). Patch by mike.dvoretsky Differential Revision: https://reviews.llvm.org/D46882 llvm-svn: 332705	2018-05-18 11:58:25 +00:00
Jonas Paulsson	de54c058a6	[SystemZ] Fold AHIMux in foldMemoryOperandImpl. AHIMux can be folded the same way as AHI. Review: Ulrich Weigand llvm-svn: 332703	2018-05-18 11:54:04 +00:00
David Stenberg	0af67e5b65	[SimplifyCFG] Fix a debug invariant bug in FoldBranchToCommonDest() Summary: Fix a case where FoldBranchToCommonDest() would bail out from doing CSE when encountering a debug intrinsic. Handle that by skipping past the debug intrinsics. Also, as a minor refactoring, rename checkCSEInPredecessor() to tryCSEWithPredecessor() to make it a bit more clear that the function may remove instructions. Reviewers: fhahn, craig.topper, dblaikie, xbolva00 Reviewed By: fhahn, xbolva00 Subscribers: vsk, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D46635 llvm-svn: 332698	2018-05-18 08:52:15 +00:00
Shiva Chen	6e07dfb148	[RISCV] Add WasForced parameter to MCAsmBackend::fixupNeedsRelaxationAdvanced For RISCV branch instructions, we need to preserve relocation types when linker relaxation enabled, so then linker could modify offset when the branch offsets changed. We preserve relocation types by define shouldForceRelocation. IsResolved return by evaluateFixup will always false when shouldForceRelocation return true. It will make RISCV MC Branch Relaxation always relax 16-bit branches to 32-bit form, even if the symbol actually could be resolved. To avoid 16-bit branches always relax to 32-bit form when linker relaxation enabled, we add a new parameter WasForced to indicate that the symbol actually couldn't be resolved and not forced by shouldForceRelocation return true. RISCVAsmBackend::fixupNeedsRelaxationAdvanced could relax branches with unresolved symbols by (!IsResolved && !WasForced). RISCV MC Branch Relaxation is needed because RISCV could perform 32-bit to 16-bit transformation in MC layer. Differential Revision: https://reviews.llvm.org/D46350 llvm-svn: 332696	2018-05-18 06:42:21 +00:00
Serguei Katkov	5095883fe9	[LICM] Extend the MustExecute scope CanProveNotTakenFirstIteration utility does not handle the case when condition of the branch is a constant. Add its handling. Reviewers: reames, anna, mkazantsev Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46996 llvm-svn: 332695	2018-05-18 04:56:28 +00:00
Keno Fischer	1e11fc1ccb	[X86DomainReassignment] Hopefully fix buildbot failure The Darwin build bot failed with: ``` llc -mcpu=skylake-avx512 -mtriple=x86_64-unknown-linux-gnu domain-reassignment-test.ll -o - \| llvm-mc -- Exit Code: 134 Command Output (stderr): -- Assertion failed: (MAI->hasSingleParameterDotFile()), function EmitFileDirective, file lib/MC/MCAsmStreamer.cpp, line 1087. ``` Looks like this is because the `llvm-mc` command was missing a triple directive and defaulting to MachO. Add the triple option. llvm-svn: 332694	2018-05-18 04:36:38 +00:00
Walter Lee	cdbb207bd1	[asan] Add instrumentation support for Myriad 1. Define Myriad-specific ASan constants. 2. Add code to generate an outer loop that checks that the address is in DRAM range, and strip the cache bit from the address. The former is required because Myriad has no memory protection, and it is up to the instrumentation to range-check before using it to index into the shadow memory. 3. Do not add an unreachable instruction after the error reporting function; on Myriad such function may return if the run-time has not been initialized. 4. Add a test. Differential Revision: https://reviews.llvm.org/D46451 llvm-svn: 332692	2018-05-18 04:10:38 +00:00
Eric Christopher	68f2218e1e	Revert "Temporarily revert "[DEBUG] Initial adaptation of NVPTX target for debug info emission."" This reapplies commits: r330271, r330592, r330779. [DEBUG] Initial adaptation of NVPTX target for debug info emission. Summary: Patch adds initial emission of the debug info for NVPTX target. Currently, only .file and .loc directives are emitted, everything else is commented out to not break the compilation of Cuda. llvm-svn: 332689	2018-05-18 03:13:08 +00:00
Eli Friedman	4081a57af7	[MachineOutliner] Count savings from outlining in bytes. Counting the number of instructions is both unintuitive and inaccurate. On AArch64, this only affects the generated remarks and certain rare pseudo-instructions, but it will have a bigger impact on other targets. Differential Revision: https://reviews.llvm.org/D46921 llvm-svn: 332685	2018-05-18 01:52:16 +00:00
Keno Fischer	e07153a859	[X86DomainReassignment] Don't compare stack-allocated values by address Summary: The Closure allocated in the main loop is allocated on the stack. However, later in the code its address is taken (and used for comparisons). This obviously doesn't work. In fact, the Closure will get the same stack address during every loop iteration, rendering the check that intended to identify Closure conflicts entirely ineffective. Fix this bug by giving every Closure a unique ID and using that for comparison. Alternatively, we could heap allocate the closure object. Fixes PR37396 Fixes JuliaLang/julia#27032 Reviewers: craig.topper, guyblank Reviewed By: craig.topper Subscribers: vchuravy, llvm-commits Differential Revision: https://reviews.llvm.org/D46800 llvm-svn: 332682	2018-05-18 01:03:01 +00:00
Keno Fischer	66ab99c3ee	[X86DomainReassignment] Don't delete IMPLICIT_DEF nodes Summary: We cannot simply delete IMPLICIT_DEF nodes. They may be used later (e.g. by a PHI) and deleting them will cause later passes (e.g. LiveVariables) to crash. However, it seems fine to ignore them for purposes of the domain reassignment (as we do with PHI). Fixes PR37430 Fixes JuliaLang/julia#27080 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D46797 llvm-svn: 332680	2018-05-18 00:40:52 +00:00
Zachary Turner	c762666e87	Resubmit [pdb] Change /DEBUG:GHASH to emit 8 byte hashes." This fixes the remaining failing tests, so resubmitting with no functional change. llvm-svn: 332676	2018-05-17 22:55:15 +00:00
George Burgess IV	c6526176cf	Revert r332657: "[AA] cfl-anders-aa with field sensitivity" I don't believe the person who LGTMed this review has appropriate context on this code. I apologize if I'm wrong. llvm-svn: 332674	2018-05-17 21:56:39 +00:00
Changpeng Fang	860d460063	AMDGPU/SI: Don't promote alloca to vector for atomic load/store Summary: Don't promote alloca to vector for atomic load/store Reviewer: arsenm Differential Revision: https://reviews.llvm.org/D46085 llvm-svn: 332673	2018-05-17 21:49:44 +00:00
Zachary Turner	1de9fce151	Revert "[pdb] Change /DEBUG:GHASH to emit 8 byte hashes." A few tests haven't been properly updated, so reverting while I have time to investigate proper fixes. llvm-svn: 332672	2018-05-17 21:49:25 +00:00
Zachary Turner	3c4c8a0937	[pdb] Change /DEBUG:GHASH to emit 8 byte hashes. Previously we emitted 20-byte SHA1 hashes. This is overkill for identifying debug info records, and has the negative side effect of making object files bigger and links slower. By using only the last 8 bytes of a SHA1, we get smaller object files and ~10% faster links. This modifies the format of the .debug$H section by adding a new value for the hash algorithm field, so that the linker will still work when its object files have an old format. Differential Revision: https://reviews.llvm.org/D46855 llvm-svn: 332669	2018-05-17 21:22:48 +00:00
Reid Kleckner	f40f85868e	[codeview] Include record prefix in global type hashing The prefix includes type kind, which is important to preserve. Two different type leafs can easily have the same interior record contents as another type. We ran into this issue in PR37492 where a bitfield type record collided with a const modifier record. Their contents were bitwise identical, but their kinds were different. llvm-svn: 332664	2018-05-17 20:47:22 +00:00
David Bolvansky	b1c59e3f30	[AA] cfl-anders-aa with field sensitivity Summary: There was some unfinished work started for offset tracking in CFLGraph by the author of implementation of Andersen algorithm. This work was completed and support for field sensitivity was added to the core of Andersen algorithm. The performance results seem promising. SPEC2006 int_base score was increased by 1.1 % (I compared clang 6.0 with clang 6.0 with this patch). The avergae compile time was increased by +- 1 % according my measures with small and medium C/C++ projects (I did not tested it on the large projects with milions of lines of code) Reviewers: chandlerc, george.burgess.iv, rja Reviewed By: rja Subscribers: rja, llvm-commits Differential Revision: https://reviews.llvm.org/D46282 llvm-svn: 332657	2018-05-17 20:23:33 +00:00
Diego Caballero	f58ad3129c	[LV][VPlan] Build plain CFG with simple VPInstructions for outer loops. Patch #3 from VPlan Outer Loop Vectorization Patch Series #1 (RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html). Expected to be NFC for the current inner loop vectorization path. It introduces the basic algorithm to build the VPlan plain CFG (single-level CFG, no hierarchical CFG (H-CFG), yet) in the VPlan-native vectorization path using VPInstructions. It includes: - VPlanHCFGBuilder: Main class to build the VPlan H-CFG (plain CFG without nested regions, for now). - VPlanVerifier: Main class with utilities to check the consistency of a H-CFG. - VPlanBlockUtils: Main class with utilities to manipulate VPBlockBases in VPlan. Reviewers: rengolin, fhahn, mkuper, mssimpso, a.elovikov, hfinkel, aprantl. Differential Revision: https://reviews.llvm.org/D44338 llvm-svn: 332654	2018-05-17 19:24:47 +00:00
Sanjay Patel	ed58fed9ef	[x86] preserve test intent by removing undef We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in https://reviews.llvm.org/D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. llvm-svn: 332648	2018-05-17 18:43:44 +00:00
Anastasis Grammenos	d6c6678766	[Debugify] Print the output to stderr Currently debugify prints it's output to stdout, with this patch all the output generated goes to stderr. This change lets us use debugify without taking away the ability to pipe the output to other llvm tools. llvm-svn: 332642	2018-05-17 18:19:58 +00:00
Sameer AbuAsal	1dc0a8fb18	[RISCV] Separate base from offset in lowerGlobalAddress Summary: When lowering global address, lower the base as a TargetGlobal first then create an SDNode for the offset separately and chain it to the address calculation This optimization will create a DAG where the base address of a global access will be reused between different access. The offset can later be folded into the immediate part of the memory access instruction. With this optimization we generate: lui a0, %hi(s) addi a0, a0, %lo(s) ; shared base address. addi a1, zero, 20 ; 2 instructions per access. sw a1, 44(a0) addi a1, zero, 10 sw a1, 8(a0) addi a1, zero, 30 sw a1, 80(a0) Instead of: lui a0, %hi(s+44) ; 3 instructions per access. addi a1, zero, 20 sw a1, %lo(s+44)(a0) lui a0, %hi(s+8) addi a1, zero, 10 sw a1, %lo(s+8)(a0) lui a0, %hi(s+80) addi a1, zero, 30 sw a1, %lo(s+80)(a0) Which will save one instruction per access. Reviewers: asb, apazos Reviewed By: asb Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, apazos, asb, llvm-commits Differential Revision: https://reviews.llvm.org/D46989 llvm-svn: 332641	2018-05-17 18:14:53 +00:00
Sanjay Patel	87621174ac	[x86] preserve test intent by removing undef We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in https://reviews.llvm.org/D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. llvm-svn: 332640	2018-05-17 18:13:58 +00:00
Sanjay Patel	5c48b73fff	[ARM] preserve test intent by removing undef We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in https://reviews.llvm.org/D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. llvm-svn: 332638	2018-05-17 18:09:56 +00:00
Sanjay Patel	38892e5ce1	[ARM] preserve test intent by removing undef We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in https://reviews.llvm.org/D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. Follow-up to: https://reviews.llvm.org/rL332538 ...because that change wasn't enough. llvm-svn: 332637	2018-05-17 18:08:27 +00:00
Sanjay Patel	ffd349577d	[AArch64] preserve test intent by removing undef We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. Follow-up to: https://reviews.llvm.org/rL332534 ...because that change wasn't enough. llvm-svn: 332636	2018-05-17 18:07:02 +00:00
Mandeep Singh Grang	ef0ebf2806	[RISCV] Implement MC layer support for the tail pseudoinstruction Summary: This patch implements MC support for tail psuedo instruction. A follow-up patch implements the codegen support as well as handling of the indirect tail pseudo instruction. Reviewers: asb, apazos Reviewed By: asb Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, llvm-commits Differential Revision: https://reviews.llvm.org/D46221 llvm-svn: 332634	2018-05-17 17:31:27 +00:00
Changpeng Fang	391bcf8893	AMDGPU/SI: Handle infinite loop for the structurizer to work with CFG with infinite loops. Summary: The current StructurizeCFG pass only works for CFG with one exit. AMDGPUUnifyDivergentExitNodes combines multiple "return" blocks and/or "unreachable" blocks to one exit block for the Structurizer to work. However, infinite loop is another kind of special "exit", and if we don't handle it, the case of multiple exits will prevent the structurizer from working. In this work, for each infinite loop, we add a dummy edge to the "return" block, and thus the AMDGPUUnifyDivergentExitNodes pass will work with infinite loops. This will make CFG with infinite loops be structurized. Reviewer: nhaehnle Differential Revision: https://reviews.llvm.org/D46340 llvm-svn: 332625	2018-05-17 16:45:01 +00:00
Petar Jovanovic	daf5169398	[mips] Add support for Global INValidate ASE This includes Instructions: ginvi, ginvt, Assembler directives: .set ginv, .set noginv, .module ginv, .module noginv Attribute: ginv .MIPS.abiflags: GINV (0x20000) Patch by Vladimir Stefanovic. Differential Revision: https://reviews.llvm.org/D46268 llvm-svn: 332624	2018-05-17 16:30:32 +00:00
Craig Topper	bd332588bd	[InstCombine] Propagate the nsw/nuw flags from the add in the 'shifty' abs pattern to the sub in the select version. According to alive this is valid. I'm hoping to use this to make an assumption that the sign bit is zero after this sequence. The only way it wouldn't be is if the input was INT__MIN, but by preserving the flags we can make doing this to INT_MIN UB. The nuw flags is weird because it creates such a contradiction that the original number would have to be positive meaning we could remove the select entirely, but we don't get that far. Differential Revision: https://reviews.llvm.org/D46988 llvm-svn: 332623	2018-05-17 16:29:52 +00:00
Simon Pilgrim	e389ea0e3e	[llvm-mca][X86] Add CMOV test files llvm-svn: 332622	2018-05-17 16:29:12 +00:00
Alex Bradbury	6a53023b4e	[RISCV] Set isReMaterializable on ADDI and LUI instructions The isReMaterlizable flag is somewhat confusing, unlike most other instruction flags it is currently interpreted as a hint (mightBeRematerializable would be a better name). While LUI is always rematerialisable, for an instruction like ADDI it depends on its operands. TargetInstrInfo::isTriviallyReMaterializable will call TargetInstrInfo::isReallyTriviallyReMaterializable, which in turn calls TargetInstrInfo::isReallyTriviallyReMaterializableGeneric. We rely on the logic in the latter to pick out instances of ADDI that really are rematerializable. The isReMaterializable flag does make a difference on a variety of test programs. The recently committed remat.ll test case demonstrates how stack usage is reduce and a unnecessary lw/sw can be removed. Stack usage in the Proc0 function in dhrystone reduces from 192 bytes to 112 bytes. For the sake of completeness, this patch also implements RISCVRegisterInfo::isConstantPhysReg. Although this is called from a number of places, it doesn't seem to result in different codegen for any programs I've thrown at it. However, it is called in the rematerialisation codepath and it seems sensible to implement something correct here. Differential Revision: https://reviews.llvm.org/D46182 llvm-svn: 332617	2018-05-17 15:51:37 +00:00
Simon Pilgrim	b5741f5c3d	[X86][BtVer2] ADC/SBB take 2cy on an ALU pipe, not 1cy like ADD/SUB llvm-svn: 332616	2018-05-17 15:43:23 +00:00
Dmitry Mikulin	3c6b4e35bd	In thin and full LTO + CFI, direct function calls may go through jump table entries to reach the target. Since these calls don't require type checks, we can short-circuit them to their real targets. Differential Revision: https://reviews.llvm.org/D46326 llvm-svn: 332610	2018-05-17 14:29:07 +00:00
Alex Bradbury	5e41fc83c5	[Hexagon] Use addAliasForDirective for data directives Data directives such as .word, .half, .hword are currently parsed using HexagonAsmParser::ParseDirectiveValue which effectively duplicates logic from AsmParser::parseDirectiveValue. This patch deletes that duplicated logic in favour of using addAliasForDirective. Differential Revision: https://reviews.llvm.org/D46999 llvm-svn: 332607	2018-05-17 13:21:18 +00:00
Simon Pilgrim	0c0336e003	[X86] Split WriteADC/WriteADCRMW scheduler classes For integer ALU instructions taking eflags as an input (ADC/SBB/ADCX/ADOX) llvm-svn: 332605	2018-05-17 12:43:42 +00:00
Andrea Di Biagio	650b5fc6cb	[llvm-mca] add flag -all-views and flag -all-stats. Flag -all-views enables all the views. Flag -all-stats enables all the views that print hardware statistics. llvm-svn: 332602	2018-05-17 12:27:03 +00:00
Simon Pilgrim	b4fd145fc3	[llvm-mca][X86] Add ADX test files llvm-svn: 332595	2018-05-17 11:32:38 +00:00
Sander de Smalen	75cfa34156	[AArch64][SVE] Asm: Support for structured ST2, ST3 and ST4 (scalar+scalar) store instructions. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D46680 llvm-svn: 332584	2018-05-17 09:05:41 +00:00
Mikael Holmen	2ca16899ec	Require DominatorTree when requiring/preserving LoopInfo in the old pass manager Summary: Require DominatorTree when requiring/preserving LoopInfo in the old pass manager BreakCriticalEdges tries to keep LoopInfo and DominatorTree updated if they exist. However, since commit r321653 and r321805, to update LoopInfo we must have a DominatorTree, or we will hit an assert. To fix this we now make a couple of passes that only required/preserved LoopInfo also require DominatorTree. This solves PR37334. Reviewers: eli.friedman, efriedma Reviewed By: efriedma Subscribers: efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D46829 llvm-svn: 332583	2018-05-17 09:05:40 +00:00
Martin Storsjo	c10788728b	[Analysis] Only use _unlocked stdio functions on linux The existing comment said that the functions were available only on GNU/Linux (and on certain Android versions), but only checked T.isGNUEnvironment() which also is true on MinGW (for arch-windows-gnu triplets), which doesn't have such functions. Existing checks in the initialize function in TargetLibraryInfo.cpp also use only T.isOSLinux() to check for glibc features. This fixes use of stdio on MinGW. Differential Revision: https://reviews.llvm.org/D47002 llvm-svn: 332581	2018-05-17 08:16:08 +00:00
Bjorn Pettersson	81a76a388a	[SROA] Handle PHI with multiple duplicate predecessors Summary: The verifier accepts PHI nodes with multiple entries for the same basic block, as long as the value is the same. As seen in PR37203, SROA did not handle such PHI nodes properly when speculating loads over the PHI, since it inserted multiple loads in the predecessor block and changed the PHI into having multiple entries for the same basic block, but with different values. This patch teaches SROA to reuse the same speculated load for each PHI duplicate entry in such situations. Resolves: https://bugs.llvm.org/show_bug.cgi?id=37203 Reviewers: uabelho, chandlerc, hfinkel, bkramer, efriedma Reviewed By: efriedma Subscribers: dberlin, efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D46426 llvm-svn: 332577	2018-05-17 07:21:41 +00:00
Hiroshi Inoue	f5c0e6c285	[SROA] pr37267: fix assertion failure in integer widening The current integer widening does not support rewriting partial split slices in rewriteIntegerStore (and rewriteIntegerLoad). This patch adds explicit checks for this case in isIntegerWideningViableForSlice. Before r322533, splitting is allowed only for the whole-alloca slice and hence the above case is implicitly rejected by another check `if (DL.getTypeStoreSize(ValueTy) > Size)` because whole-alloca slice is larger than the partition. Differential Revision: https://reviews.llvm.org/D46750 llvm-svn: 332575	2018-05-17 06:32:17 +00:00
Alex Bradbury	cea6db0480	[RISCV] Add support for .half, .hword, .word, .dword directives These directives are recognised by gas. Support is added through the use of addAliasForDirective. Also match RISC-V gcc in preferring .half and .word for 16-bit and 32-bit data directives. llvm-svn: 332574	2018-05-17 05:58:08 +00:00
Sanjay Patel	2e50cec5e3	[Thumb2] fix typo in test from r332548 llvm-svn: 332569	2018-05-17 03:24:25 +00:00
Stanislav Mekhanoshin	595fdcf43b	[AMDGPU] Move lsr test. NFC. llvm-svn: 332562	2018-05-17 01:30:51 +00:00
Sanjay Patel	ae83159530	[Hexagon] preserve test intent by removing undef We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. llvm-svn: 332550	2018-05-16 22:49:08 +00:00
Sanjay Patel	354842abc5	[PowerPC] preserve test intent by removing undef We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. llvm-svn: 332549	2018-05-16 22:48:48 +00:00
Sanjay Patel	68c83a24d4	[Thumb] preserve test intent by removing undef We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. llvm-svn: 332548	2018-05-16 22:47:51 +00:00
Sanjay Patel	eedf265a2c	[Thumb] preserve test intent by removing undef We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. llvm-svn: 332547	2018-05-16 22:47:42 +00:00
Simon Pilgrim	2dc00a64a2	[X86] Update SNB/generic scheduler tests missed from rL332536 llvm-svn: 332540	2018-05-16 22:24:22 +00:00
Sanjay Patel	2c1846de2d	[ARM] preserve test intent by removing undef We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. llvm-svn: 332539	2018-05-16 22:20:33 +00:00
Sanjay Patel	ce20ac0bc5	[ARM] preserve test intent by removing undef We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. llvm-svn: 332538	2018-05-16 22:20:26 +00:00
Sanjay Patel	066309f4ed	[ARM] preserve test intent by removing undef We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. llvm-svn: 332537	2018-05-16 22:20:11 +00:00
Sanjay Patel	60fe206793	[AArch64] preserve test intent by removing undef We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. llvm-svn: 332534	2018-05-16 21:57:57 +00:00
Sanjay Patel	6dcda28ddc	[ARM] preserve test intent by removing undef We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. llvm-svn: 332533	2018-05-16 21:57:19 +00:00
Sanjay Patel	82eca55953	[ARM] preserve test intent by removing undef We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we correct FP undef folding. llvm-svn: 332532	2018-05-16 21:57:00 +00:00
Benjamin Kramer	8ac15bf4dc	[InstCombine] Fix the signature of fgets_unlocked. It returns a pointer, not an int. This miscompiles all code that uses the return value of fgets. llvm-svn: 332531	2018-05-16 21:45:39 +00:00
Eli Friedman	ddbf6d6514	[MachineOutliner] Don't outline instructions that modify SP. This breaks the code which saves and restores LR, so we can't outline without doing something more complicated for stack adjustment. Found by inspection; we get lucky in most cases because getMemOpInfo only handles STRWpost, not any other pre/post-increment forms. But it hits a couple of artificial testcases in the tree. Differential Revision: https://reviews.llvm.org/D46920 llvm-svn: 332529	2018-05-16 21:20:16 +00:00
Krzysztof Parzyszek	e8a0ae7346	[Hexagon] Mark HVX vector predicate bitwise ops as legal, add patterns llvm-svn: 332525	2018-05-16 21:00:24 +00:00
Simon Pilgrim	2e0f6c9b21	[X86][SSE] Reduce instruction/register usages for v4i32 vector shifts (PR37441) As suggested by Fabian on PR37441, use PSHUFLW to extend shift amount types for use with PSRAD/PSRLD to reduce register pressure. Some of this ideally would be done by combineTargetShuffle but its tricky to do as most of the shuffles are sharing inputs. Differential Revision: https://reviews.llvm.org/D46959 llvm-svn: 332524	2018-05-16 20:52:52 +00:00
Konstantin Zhuravlyov	c72ece6c2c	AMDGPU : Recalculate SGPRs when trap handler is supported Differential Revision: https://reviews.llvm.org/D29911 llvm-svn: 332523	2018-05-16 20:47:48 +00:00
Sam Clegg	6ccb59b3e9	[WebAssembly] MC: Ensure that FUNCTION_OFFSET relocations are always against function symbols. The getAtom() method wasn't doing what we needed in all cases. We want the symbols for the function which defines that section. We can compute this easily enough and we know that we have at most one function in each section. Once this lands I will revert rL331412 which is no longer needed. Fixes PR37409 Differential Revision: https://reviews.llvm.org/D46970 llvm-svn: 332517	2018-05-16 20:09:05 +00:00
Eli Friedman	02709bcb78	[MachineOutliner] Don't save/restore LR for tail calls. The cost computation assumes we do this correctly, but the actual lowering was wrong. Differential Revision: https://reviews.llvm.org/D46923 llvm-svn: 332514	2018-05-16 19:49:01 +00:00
Simon Pilgrim	d5d77dcb46	[X86] Fix typo in instregex for CVTSI642SDrr llvm-svn: 332510	2018-05-16 18:31:17 +00:00
Sanjay Patel	332bbb0fea	[x86] preserve test intent by removing undef We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we make those fixes. llvm-svn: 332501	2018-05-16 17:58:50 +00:00
Sanjay Patel	502d115505	[x86] preserve test intent by removing undef We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we make those fixes. llvm-svn: 332500	2018-05-16 17:58:08 +00:00
Sanjay Patel	a7874a52c9	[x86] preserve test intent by removing undef We need to clean up the DAG floating-point undef logic. This process is similar to how we handled integer undef logic in D43141. And as we did there, I'm trying to reduce the patch by changing tests that would probably become meaningless once we make those fixes. llvm-svn: 332499	2018-05-16 17:57:35 +00:00
Craig Topper	67aa726f8c	[X86][AVX512DQ] Use packed instructions for scalar FP<->i64 conversions on 32-bit targets As i64 types are not legal on 32-bit targets, insert these into a suitable zero vector and use the packed vXi64<->FP conversion instructions instead. Fixes PR3163. Differential Revision: https://reviews.llvm.org/D43441 llvm-svn: 332498	2018-05-16 17:40:07 +00:00
Vedant Kumar	5c6b3fb8fb	[Debugify] Tighten up the test for -debugify-each, NFC In post-commit review for r332416, Paul Robinson pointed out that the test for -debugify-each is not checking what it needs to. This commit tightens up the test. llvm-svn: 332497	2018-05-16 17:30:58 +00:00
Sanjay Patel	84caa9659e	[x86] add run with unsafe global param; NFC llvm-svn: 332486	2018-05-16 16:23:41 +00:00
Tony Tye	43259df44a	[AMDGPU] Change llvm.debugtrap to be a debug breakpoint that can resume execution. No longer require the queue pointer to be passed in in fixed SGPRs. Differential Revision: https://reviews.llvm.org/D46769 llvm-svn: 332485	2018-05-16 16:19:34 +00:00
Sanjay Patel	b3ac148cb4	[x86] add tests for DAG FP undef operands; NFC llvm-svn: 332484	2018-05-16 16:16:48 +00:00
Sander de Smalen	22176a2242	[AArch64][SVE] Improve diagnostics for vectors with incorrect element-size. For regular SVE vector operands, this patch introduces a more sensible diagnostic when the vector has a wrong suffix (e.g. z0.s vs z0.b). For example: add z0.s, z1.s, z2.b -> invalid element width ^_____^ mismatch For the vector-with-shift/extend (e.g. z0.s, uxtw #2) this patch takes a slightly different approach and instead returns a 'invalid operand' if the element size is not as expected. This is because the diagnostics are more specificied to suggest using the right shift/extend suffix. This is a trade-off not to introduce more operand classes and still provide useful diagnostics for LD1 and PRF instructions. For example: ld1w z1.s, p0/z, [x0, z0.s] -> invalid shift/extend specified, expected 'z[0..31].s, (uxtw\|sxtw)' ld1w z1.d, p0/z, [x0, z0.s] -> invalid operand ^________________^ mismatch For gather prefetches, both 'z0.s' and 'z0.d' would be allowed: prfw #0, p0, [x0, z0.s] -> invalid shift/extend specified, expected 'z[0..31].s, (uxtw\|sxtw) #2' prfw #0, p0, [x0, z0.d] -> invalid shift/extend specified, expected 'z[0..31].d, (lsl\|uxtw\|sxtw) #2' Without this change, the diagnostic would unnecessarily suggest a different element size: prfw #0, p0, [x0, z0.s] -> invalid shift/extend specified, expected 'z[0..31].d, (lsl\|uxtw\|sxtw) #2' Reviewers: SjoerdMeijer, aemerson, fhahn, samparker, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D46688 llvm-svn: 332483	2018-05-16 15:45:17 +00:00
Sirish Pande	cabe50a308	[AArch64] Gangup loads and stores for pairing. Keep loads and stores together (target defines how many loads and stores to gang up), such that it will help in pairing and vectorization. Differential Revision https://reviews.llvm.org/D46477 llvm-svn: 332482	2018-05-16 15:36:52 +00:00
Sanjay Patel	2eb3512090	[InstCombine] allow more binop (shuffle X), C transforms The canonicalization was restricted to shuffle masks with a 1-to-1 mapping to the constant vector, but that disqualifies the common splat pattern. This is part of solving PR37463: https://bugs.llvm.org/show_bug.cgi?id=37463 llvm-svn: 332479	2018-05-16 15:15:22 +00:00
Sander de Smalen	bbc4e9a4e3	[AArch64][SVE] Asm: Support for gather PRF prefetch instructions Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D46686 llvm-svn: 332472	2018-05-16 14:16:01 +00:00
Krzysztof Pszeniczny	2ba8fd4914	[BasicAA] Fix handling of invariant group launders Summary: A recent patch ([[ https://reviews.llvm.org/rL331587 \| rL331587 ]]) to Capture Tracking taught it that the `launder_invariant_group` intrinsic captures its argument only by returning it. Unfortunately, BasicAA still considered every call instruction as a possible escape source and hence concluded that the result of a `launder_invariant_group` call cannot alias any local non-escaping value. This led to [[ https://bugs.llvm.org/show_bug.cgi?id=37458 \| bug 37458 ]]. This patch updates the relevant check for escape sources in BasicAA. Reviewers: Prazek, kuhar, rsmith, hfinkel, sanjoy, xbolva00 Reviewed By: hfinkel, xbolva00 Subscribers: JDevlieghere, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D46900 llvm-svn: 332466	2018-05-16 13:16:54 +00:00
Matt Arsenault	67a9815a5c	AMDGPU: Custom lower v4i16/v4f16 vector operations Avoids stack access. Also handle extract hi elt pattern from truncate + shift to avoid a couple test regressions. llvm-svn: 332453	2018-05-16 11:47:30 +00:00
David Bolvansky	ca22d427b9	[SimplifyLibcalls] Replace locked IO with unlocked IO Summary: If file stream arg is not captured and source is fopen, we could replace IO calls by unlocked IO ("_unlocked" function variants) to gain better speed, Reviewers: efriedma, RKSimon, spatel, sanjoy, hfinkel, majnemer, lebedev.ri, rja Reviewed By: rja Subscribers: rja, srhines, efriedma, lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D45736 llvm-svn: 332452	2018-05-16 11:39:52 +00:00
Amara Emerson	0d6a26dffc	[GlobalISel][IRTranslator] Split aggregates during IR translation. We currently handle all aggregates by creating one large LLT, and letting the legalizer deal with splitting them up. However using this approach means that we can't support big endian code correctly. This patch changes the way that the IRTranslator deals with aggregate values, by splitting them up into their constituent element values. To do this, parts of the translator need to be modified to deal with multiple VRegs for a single Value. A new Value to VReg mapper is introduced to help keep compile time under control, currently there is no measurable impact on CTMark despite the extra code being generated in some cases. Patch is based on the original work of Tim Northover. Differential Revision: https://reviews.llvm.org/D46018 llvm-svn: 332449	2018-05-16 10:32:02 +00:00
Andrea Di Biagio	45ccdd1785	[llvm-mca] Regenerate tests after r332381 and r332361. NFC llvm-svn: 332447	2018-05-16 10:12:06 +00:00
Simon Dardis	5cf9de4b72	[mips] Add support for isBranchOffsetInRange and use it for MipsLongBranch Add support for this target hook, covering MIPS, microMIPS and MIPSR6, along with some tests. Also add missing getOppositeBranchOpc() cases exposed by the tests. Reviewers: atanasyan, abeserminji, smaksimovic Differential Revision: https://reviews.llvm.org/D46794 llvm-svn: 332446	2018-05-16 10:03:05 +00:00
Peter Smith	c811758da6	[AArch64] Support "S" inline assembler constraint This patch re-introduces the "S" inline assembler constraint. This matches an absolute symbolic address or a label reference. The primary use case is asm("adrp %0, %1\n\t" "add %0, %0, :lo12:%1" : "=r"(addr) : "S"(&var)); I say re-introduces as it seems like "S" was implemented in the original AArch64 backend, but it looks like it wasn't carried forward to the merged backend. The original implementation had A and L modifiers that could be used to print ":lo12:" to the string. It looks like gcc doesn't use these and :lo12: is expected to be written in the inline assembly string so I've not implemented A and L. Clang already supports the S modifier. Fixes PR37180 Differential Revision: https://reviews.llvm.org/D46745 llvm-svn: 332444	2018-05-16 09:33:25 +00:00
Sander de Smalen	a680f558be	[AArch64][SVE] Asm: Support for structured LD2, LD3 and LD4 (scalar+scalar) load instructions. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D46679 llvm-svn: 332442	2018-05-16 09:16:20 +00:00
Alexander Richardson	8f44579d0b	Emit a left-shift instead of a power-of-two multiply for jump-tables Summary: SelectionDAGLegalize::ExpandNode() inserts an ISD::MUL when lowering a BR_JT opcode. While many backends optimize this multiply into a shift, e.g. the MIPS backend currently always lowers this into a sequence of load-immediate+multiply+mflo in MipsSETargetLowering::lowerMulDiv(). I initially changed the multiply to a shift in the MIPS backend but it turns out that would not have handled the MIPSR6 case and was a lot more code than doing it in LegalizeDAG. I believe performing this simple optimization in LegalizeDAG instead of each individual backend is the better solution since this also fixes other backeds such as MSP430 which calls the multiply runtime function __mspabi_mpyi without this patch. Reviewers: sdardis, atanasyan, pftbest, asl Reviewed By: sdardis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45760 llvm-svn: 332439	2018-05-16 08:58:26 +00:00
Simon Pilgrim	5df1ef7a8c	[X86][SSE] Fix tests for vector rotates by splat variable. We weren't correctly splatting the offset shift llvm-svn: 332435	2018-05-16 08:23:47 +00:00
Sander de Smalen	67f9154964	[AArch64][SVE] Asm: Support for contiguous PRF prefetch instructions. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D46682 llvm-svn: 332433	2018-05-16 07:50:09 +00:00
Shoaib Meenai	074728a2a9	[ObjCARC] Prevent code motion into a catchswitch A catchswitch must be the only non-phi instruction in its basic block; attempting to move a retain or release into a catchswitch basic block will result in invalid IR. Explicitly mark a CFG hazard in this case to prevent the code motion. Differential Revision: https://reviews.llvm.org/D46482 llvm-svn: 332430	2018-05-16 04:52:18 +00:00
Evgeny Stupachenko	bff9302c3d	Fix LSR compile time hang. Summary: Limit number of reassociations in GenerateReassociationsImpl. Reviewers: qcolombet, mkazantsev Differential Revision: https://reviews.llvm.org/D46039 From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com> llvm-svn: 332426	2018-05-16 02:48:50 +00:00
Anastasis Grammenos	66f13e0ba2	[Debugify] Fix test failing after r332416 I missed a test that needed an update. Failing bot: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/30071 llvm-svn: 332418	2018-05-16 00:11:52 +00:00
Anastasis Grammenos	b4344c66aa	[Debugfiy] Print the pass name next to the result CheckDebugify now prints the pass name right next to the result of the check. Differential Revision: https://reviews.llvm.org/D46908 llvm-svn: 332416	2018-05-15 23:38:05 +00:00
Eli Friedman	25bef201c5	[MachineOutliner] Add optsize markings to outlined functions. It doesn't matter much this late in the pipeline, but one place that does check for it is the function alignment code. Differential Revision: https://reviews.llvm.org/D46373 llvm-svn: 332415	2018-05-15 23:36:46 +00:00
Simon Pilgrim	de13589625	[X86][SSE] Add tests for vector rotates by splat variable. llvm-svn: 332410	2018-05-15 22:11:51 +00:00
Stanislav Mekhanoshin	57d341c27a	[AMDGPU] Fix handling of void types in isLegalAddressingMode It is legal for the type passed to isLegalAddressingMode to be unsized or, more specifically, VoidTy. In this case, we must check the legality of load / stores for all legal types. Directly trying to call getTypeStoreSize is incorrect, and leads to breakage in e.g. Loop Strength Reduction. This change guards against that behaviour. Differential Revision: https://reviews.llvm.org/D40405 llvm-svn: 332409	2018-05-15 22:07:51 +00:00
Sanjay Patel	919882638e	[InstCombine] fix binop (shuffle X), C --> shuffle (binop X, C') to check uses llvm-svn: 332407	2018-05-15 22:00:37 +00:00
Marek Olsak	37b9f55cc6	AMDGPU: Add a missing test for the 128-bit local addr space option This should have been pushed with: "AMDGPU: enable 128-bit for local addr space under an option" llvm-svn: 332404	2018-05-15 21:41:57 +00:00
Marek Olsak	3c5fd145c5	StructurizeCFG: fix inverting conditions Author: Samuel Pitoiset Without this patch, it appears to me that we are selecting the wrong operand when inverting conditions. In the attached test, it will select %tmp3 instead of %tmp4. To fix it, just use 'A' as everywhere. This fixes a regression introduced by "[PatternMatch] define m_Not using m_Xor and cst_pred_ty" https://reviews.llvm.org/D46351 llvm-svn: 332403	2018-05-15 21:41:55 +00:00
Evgeniy Stepanov	091fed94ae	[msan] Instrument masked.store, masked.load intrinsics. Summary: Instrument masked store/load intrinsics. Reviewers: kcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D46785 llvm-svn: 332402	2018-05-15 21:28:25 +00:00
Jake Ehrlich	e40398ad98	[llvm-objcopy] Add --only-keep-debug as a noop This option just keeps being a problem and really needs to be implemented in some fashion. Implementing it properly requires some kind of "replaceSectionReference" method because all the existing links need to be maintained. The desired behavior is just for allocated sections to become NOBITS but actually implementing that is rather tricky due to the current design of llvm-objcopy. However converting allocated sections to NOBITS is just an optimization and not something debuggers need. Debuggers can debug a stripped executable and take an unstripped executable for that stripped executable as input. Additionally allocated sections account for a very small part of debug binaries so this optimization is quite small. I propose that for the time being we implement this as a NOP so that people can use llvm-objcopy where they need to, just in a sub-optimal way. This option has already blocked a lot of people and its currently blocking me. llvm-svn: 332396	2018-05-15 20:53:53 +00:00
Evandro Menezes	8d522d811a	[AArch64] Improve single vector lane unscaled stores When storing the 0th lane of a vector, use a simpler and usually more efficient scalar store instead. In this case, also using the unscaled offset. Differential revision: https://reviews.llvm.org/D46762 llvm-svn: 332394	2018-05-15 20:41:12 +00:00
Sanjay Patel	cc0fbdb6e4	[InstCombine] add more tests for binop-shuffle; NFC The splat pattern is part of PR37463: https://bugs.llvm.org/show_bug.cgi?id=37463 llvm-svn: 332393	2018-05-15 20:34:09 +00:00
Chandler Carruth	5ecd81aab0	[x86][eflags] Fix PR37431 by teaching the EFLAGS copy lowering to specially handle SETB_C* pseudo instructions. Summary: While the logic here is somewhat similar to the arithmetic lowering, it is different enough that it made sense to have its own function. I actually tried a bunch of different optimizations here and none worked well so I gave up and just always do the arithmetic based lowering. Looking at code from the PR test case, we actually pessimize a bunch of code when generating these. Because SETB_C* pseudo instructions clobber EFLAGS, we end up creating a bunch of copies of EFLAGS to feed multiple SETB_C* pseudos from a single set of EFLAGS. This in turn causes the lowering code to ruin all the clever code generation that SETB_C* was hoping to achieve. None of this is needed. Whenever we're generating multiple SETB_C* instructions from a single set of EFLAGS we should instead generate a single maximally wide one and extract subregs for all the different desired widths. That would result in substantially better code generation. But this patch doesn't attempt to address that. The test case from the PR is included as well as more directed testing of the specific lowering pattern used for these pseudos. Reviewers: craig.topper Subscribers: sanjoy, mcrosier, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D46799 llvm-svn: 332389	2018-05-15 20:16:57 +00:00
Konstantin Zhuravlyov	f13c9969fc	AMDGPU: Fix v_dot{4, 8}* instruction encoding Differential Revision: https://reviews.llvm.org/D46848 llvm-svn: 332387	2018-05-15 19:32:47 +00:00
Martin Storsjo	e241ce6f65	[llvm-rc] Add support for the optional CLASS statement for dialogs Differential Revision: https://reviews.llvm.org/D46875 llvm-svn: 332386	2018-05-15 19:21:28 +00:00
Michael Zolotukhin	67cfbaac89	[MemorySSA] Don't sort IDF blocks. Summary: After r332167 we started to sort the IDF blocks inside IDF calculation, so there is no need to re-sort them on the user site. The test changes are due to a slightly different order we're using now (originally we used DFSInNumber and now the blocks are sorted by a pair (LevelFromRoot, DFSInNumber)). Reviewers: dberlin, mgrang Subscribers: Prazek, hiraditya, george.burgess.iv, llvm-commits Differential Revision: https://reviews.llvm.org/D46899 llvm-svn: 332385	2018-05-15 18:40:29 +00:00
Tom Stellard	e182b28ae4	AMDGPU/GlobalISel: Implement select() for G_FCONSTANT Summary: Also clean up G_CONSTANT selection. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46170 llvm-svn: 332379	2018-05-15 17:57:09 +00:00
Konstantin Zhuravlyov	603a43fcd5	AMDGPU: Add disasm tests for deep learning instructions + fix v_fmac_f32 disasm Differential Revision: https://reviews.llvm.org/D46853 llvm-svn: 332377	2018-05-15 17:39:13 +00:00
Simon Pilgrim	be9a206883	[X86] Split WriteCvtF2F into F32->F64 and F64->F32 scheduler classes BtVer2 - Fixes schedules for (V)CVTPS2PD instructions A lot of the Intel models still have too many InstRW overrides for these new classes - this needs cleaning up but I wanted to get the classes in first llvm-svn: 332376	2018-05-15 17:36:49 +00:00

1 2 3 4 5 ...

53347 Commits