llvm-project

Author	SHA1	Message	Date
Lei Huang	263dc4ef3a	[PowerPC] Utilize DQ-Form instructions for spill/restore and fix FrameIndex elimination to only use `lis/addi` if necessary. Currently we produce a bunch of unnecessary code when emitting the prologue/epilogue for spills/restores. Namely, if the load from stack slot/store to stack slot instruction is an X-Form instruction, we will always produce an LIS/ORI sequence for the stack offset. Furthermore, we have not exploited the P9 vector D-Form loads/stores for this purpose. This patch addresses both issues. Specifying the D-Form load as the instruction to use for stack spills/reloads should be safe because: 1. The stack should be aligned according to the ABI 2. If the stack isn't aligned, PPCRegisterInfo::eliminateFrameIndex() will check for the offset being a multiple of 16 and will convert it to an X-Form instruction if it isn't. Differential Revision: https://reviews.llvm.org/D38758 llvm-svn: 315500	2017-10-11 20:20:58 +00:00
Zachary Turner	fa0ca6cbd0	[llvm-rc] Use proper search algorithm for finding resources. Previously we would only look in the current directory for a resource, which might not be the same as the directory of the rc file. Furthermore, MSVC rc supports a /I option, and can also look in the system environment. This patch adds support for this search algorithm. Differential Revision: https://reviews.llvm.org/D38740 llvm-svn: 315499	2017-10-11 20:12:09 +00:00
Daniel Neilson	5acfd1dd78	[SCEV] Properly handle the case of a non-constant start with a zero accum in ScalarEvolution::createAddRecFromPHIWithCastsImpl Summary: This patch fixes an error in the patch to ScalarEvolution::createAddRecFromPHIWithCastsImpl made in D37265. In that patch we handle the cases where either the start or accum values can be zero after truncation. But, we assume that the start value must be a constant if the accum is zero. This is clearly an erroneous assumption. This change removes that assumption. Reviewers: sanjoy, dorit, mkazantsev Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38814 llvm-svn: 315491	2017-10-11 19:05:14 +00:00
Sanjay Patel	6c0aef77aa	[x86] avoid infinite loop from SoftenFloatOperand (PR34866) Legalization of fp128 assumes things that we should have asserts for, so that's another potential improvement. Differential Revision: https://reviews.llvm.org/D38771 llvm-svn: 315485	2017-10-11 18:24:21 +00:00
Rafael Espindola	3500f5e3bf	Convert the last uses of ErrorOr in include/llvm/Object. llvm-svn: 315483	2017-10-11 18:07:18 +00:00
Rafael Espindola	f340467495	Convert the last uses of ErrorOr in COFF.h. llvm-svn: 315480	2017-10-11 17:33:11 +00:00
Vivek Pandya	9590658fb8	[NFC] Convert OptimizationRemarkEmitter old emit() calls to new closure parameterized emit() calls Summary: This is a non-functional change to adopt the new emit() API added in r313691. Reviewed By: anemet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38285 llvm-svn: 315476	2017-10-11 17:12:59 +00:00
Rafael Espindola	87867988f9	Convert a couple of ErrorOr to Expected. NFC. llvm-svn: 315475	2017-10-11 17:05:24 +00:00
Rafael Espindola	1a0e5a1933	Convert an ErrorOr to Expected. getRelocationAddend should never be called on non SHT_RELA sections, but changing that requires changing RelocVisitor.h. llvm-svn: 315473	2017-10-11 16:56:33 +00:00
Krzysztof Parzyszek	bf626195df	[Hexagon] Handle non-immediate operands to A2_addi in getIncrementValue llvm-svn: 315472	2017-10-11 16:15:31 +00:00
Simon Pilgrim	7db366630c	Spelling mistake in comment. NFCI. llvm-svn: 315471	2017-10-11 16:10:05 +00:00
Craig Topper	3dc22bba47	[X86] Remove MVT::i1 handling code from LowerTRUNCATE Summary: I don't think this is necessary with i1 being illegal now. Reviewers: RKSimon, zvi, guyblank Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38784 llvm-svn: 315469	2017-10-11 16:05:05 +00:00
Krzysztof Parzyszek	12bdcab59c	[Pipeliner] Fix offset value for instrs dependent on post-inc load/stores The software pipeliner and the packetizer try to break dependence between the post-increment instruction and the dependent memory instructions by changing the base register and the offset value. However, in some cases, the existing logic didn't work properly and created incorrect offset value. Patch by Jyotsna Verma. llvm-svn: 315468	2017-10-11 15:59:51 +00:00
Krzysztof Parzyszek	8f174dde92	[Pipeliner] Improve serialization order for post-increments The pipeliner is generating a serial sequence that causes poor register allocation when a post-increment instruction appears prior to the use of the post-increment register. This occurs when there is a circular set of dependences involved with a sequence of instructions in the same cycle. In this case, there is no serialization of the parallel semantics that will not cause an additional register to be allocated. This patch fixes the problem by changing the instructions so that the post-increment instruction is used by the subsequent instruction, which enables the register allocator to make a better decision and not require another register. Patch by Brendon Cahoon. llvm-svn: 315466	2017-10-11 15:51:44 +00:00
Sanjay Patel	34fd5eaaf0	[DAGCombiner] convert insertelement of bitcasted vector into shuffle Eg: insert v4i32 V, (v2i16 X), 2 --> shuffle v8i16 V', X', {0,1,2,3,8,9,6,7} This is a generalization of the IR fold in D38316 to handle insertion into a non-undef vector. We may want to abandon that one if we can't find value in squashing the more specific pattern sooner. We're using the existing legal shuffle target hook to avoid AVX512 horror with vXi1 shuffles. There may be room for improvement in the shuffle lowering here, but that would be follow-up work. Differential Revision: https://reviews.llvm.org/D38388 llvm-svn: 315460	2017-10-11 14:12:16 +00:00
Alex Bradbury	4d275f0dfe	[TargetLowering] Correctly track NumFixedArgs field of CallLoweringInfo The NumFixedArgs field of CallLoweringInfo is used by TargetLowering::LowerCallTo to determine whether a given argument is passed using the vararg calling convention or not (specifically, to set IsFixed for each ISD::OutputArg). Firstly, CallLoweringInfo::setLibCallee and CallLoweringInfo::setCallee both incorrectly set NumFixedArgs based on the _previous_ args list. Secondly, TargetLowering::LowerCallTo failed to increment NumFixedArgs when modifying the argument list so a pointer is passed for the return value. If your backend uses the IsFixed property or directly accesses NumFixedArgs, it is _possible_ this change could result in codegen changes (although the previous behaviour would have been incorrect). No such cases have been identified during code review for any in-tree architecture. Differential Revision: https://reviews.llvm.org/D37898 llvm-svn: 315457	2017-10-11 13:48:45 +00:00
Alex Bradbury	5c1eef4618	[RISCV] Fix build after r315327 Differential Revision: https://reviews.llvm.org/D38779 Patch by Chih-Mao Chen. llvm-svn: 315455	2017-10-11 12:09:06 +00:00
Simon Dardis	41851e3546	[mips] Add support for parsing target specific flags for MIR Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D38620 llvm-svn: 315451	2017-10-11 11:11:35 +00:00
Max Kazantsev	fecaff1bd9	[NFC] Fix variables used only for assert in GVN llvm-svn: 315448	2017-10-11 10:31:49 +00:00
Oliver Stannard	4191b9eaea	[Asm] Add debug tracing in table-generated assembly matcher This adds debug tracing to the table-generated assembly instruction matcher, enabled by the -debug-only=asm-matcher option. The changes in the target AsmParsers are to add an MCInstrInfo reference under a consistent name, so that we can use it from table-generated code. This was already being used this way for targets that use deprecation warnings, but 5 targets did not have it, and Hexagon had it under a different name to the other backends. llvm-svn: 315445	2017-10-11 09:17:43 +00:00
Max Kazantsev	3b81809e06	[GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors This patch fixes the miscompile that happens when PRE hoists loads across guards and other instructions that don't always pass control flow to their successors. PRE is now prohibited to hoist across such instructions because there is no guarantee that the load standing after such instruction is still valid before such instruction. For example, a load from under a guard may be invalid before the guard in the following case: int array[LEN]; ... guard(0 <= index && index < LEN); use(array[index]); Differential Revision: https://reviews.llvm.org/D37460 llvm-svn: 315440	2017-10-11 08:10:43 +00:00
Max Kazantsev	0c8dd052b8	[LICM] Disallow sinking of unordered atomic loads into loops Sinking of unordered atomic load into loop must be disallowed because it turns a single load into multiple loads. The relevant section of the documentation is: http://llvm.org/docs/Atomics.html#unordered, specifically the Notes for Optimizers section. Here is the full text of this section: > Notes for optimizers > In terms of the optimizer, this prohibits any transformation that > transforms a single load into multiple loads, transforms a store into > multiple stores, narrows a store, or stores a value which would not be > stored otherwise. Some examples of unsafe optimizations are narrowing > an assignment into a bitfield, rematerializing a load, and turning loads > and stores into a memcpy call. Reordering unordered operations is safe, > though, and optimizers should take advantage of that because unordered > operations are common in languages that need them. Patch by Daniil Suchkov! Reviewed By: reames Differential Revision: https://reviews.llvm.org/D38392 llvm-svn: 315438	2017-10-11 07:26:45 +00:00
Max Kazantsev	25d8655dc2	[IRCE] Do not process empty safe ranges IRCE should not apply when the safe iteration range is proved to be empty. In this case we do unneeded work creating pre/post loops and then never go to the main loop. This patch makes IRCE not apply to empty safe ranges, adds a test for this situation and also slightly modifies one of the existing tests where it used to happen. Reviewed By: anna Differential Revision: https://reviews.llvm.org/D38577 llvm-svn: 315437	2017-10-11 06:53:07 +00:00
Davide Italiano	e2138fe41b	[GVN] Don't replace constants with constants. This fixes PR34908. Patch by Alex Crichton! Differential Revision: https://reviews.llvm.org/D38765 llvm-svn: 315429	2017-10-11 04:21:51 +00:00
Peter Collingbourne	b4f1b88551	WIN32_FIND_DATA -> WIN32_FIND_DATAW. Should fix mingw bot. llvm-svn: 315413	2017-10-11 02:09:06 +00:00
Lang Hames	02d330548d	[MC] Have MCObjectStreamer take its MCAsmBackend argument via unique_ptr. MCObjectStreamer owns its MCAsmBackend -- this fixes the types to reflect that, and allows us to remove another instance of MCObjectStreamer's weird "holding ownership via someone else's reference" trick. llvm-svn: 315410	2017-10-11 01:57:21 +00:00
Reid Kleckner	51b2cd8fb9	Silence MSVC warnings about unsigned wrapping without UB Of course, casting an unsigned value too large for 'int' is UB. So, write out the ternary. LLVM folds it to ADD anyway. Fixes the warning from r303693 a different way. Thanks to Erich Keane for pointing this out! llvm-svn: 315406	2017-10-11 01:40:38 +00:00
Craig Topper	85b1da1dc4	[X86] Remove temporary std::string creation from shuffle comment printing. We can just write directly to the raw_ostream. llvm-svn: 315399	2017-10-11 00:46:09 +00:00
Craig Topper	6ce20bd184	[X86] Add 128-bit version of vbroadcasti32x2 to shuffle comment decoding. llvm-svn: 315395	2017-10-11 00:11:53 +00:00
Justin Bogner	fdf9bf4f16	CodeGen: Minor cleanups to use MachineInstr::getMF. NFC Since r315388 we have a shorter way to say this, so we'll replace MI->getParent()->getParent() with MI->getMF() in a few places. llvm-svn: 315390	2017-10-10 23:50:49 +00:00
Justin Bogner	ec7cba53e6	CodeGen: Add MachineInstr::getMF(). NFC Similarly to how Instruction has getFunction, this adds a less verbose way to write MI->getParent()->getParent(). I'll follow up shortly with a change that changes a bunch of the uses. llvm-svn: 315388	2017-10-10 23:34:01 +00:00
Eugene Zelenko	e9ea08a097	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 315383	2017-10-10 22:49:55 +00:00
Craig Topper	bb0e316dc7	[X86] Add broadcast patterns that allow a scalar_to_vector between the broadcast and the load. We already have these patterns for AVX512VL, but not AVX1 or 2. llvm-svn: 315382	2017-10-10 22:40:31 +00:00
Eugene Zelenko	149178d92b	[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 315380	2017-10-10 22:33:29 +00:00
Peter Collingbourne	0dfdb44797	Support: Have directory_iterator::status() return FindFirstFileEx/FindNextFile results on Windows. This allows clients to avoid an unnecessary fs::status() call on each directory entry. Because the information returned by FindFirstFileEx is a subset of the information returned by a regular status() call, I needed to extract a base class from file_status that contains only that information. On my machine, this reduces the time required to enumerate a ThinLTO cache directory containing 520k files from almost 4 minutes to less than 2 seconds. Differential Revision: https://reviews.llvm.org/D38716 llvm-svn: 315378	2017-10-10 22:19:46 +00:00
Rafael Espindola	ef421f9c18	Make the ELFObjectFile constructor private. This forces every user to use the new create method that returns an Expected. This in turn propagates better error messages. llvm-svn: 315371	2017-10-10 21:21:16 +00:00
Dehao Chen	3f56a05ae5	Use the first instruction's count to estimate the function's entry frequency. Summary: In the current implementation, we only have accurate profile count for standalone symbols. For inlined functions, we do not have entry count data because it's not available in LBR. In this patch, we use the first instruction's frequency to estimate the function's entry count, especially for inlined functions. This may be inaccurate due to debug info in optimized code. However, this is a better estimate than the static 80/20 estimation we have in the current implementation. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: sanjoy, llvm-commits, aprantl Differential Revision: https://reviews.llvm.org/D38478 llvm-svn: 315369	2017-10-10 21:13:50 +00:00
Craig Topper	ad3d03193a	[X86] Fix some patterns that select VLX instructions, but were incorrectly also checking presence of BWI instructions. The EVEX->VEX pass probably obscures this. llvm-svn: 315365	2017-10-10 21:07:14 +00:00
Rafael Espindola	04e4dbab6b	Simplify. NFC. llvm-svn: 315364	2017-10-10 21:03:46 +00:00
Simon Dardis	b994128d14	[mips] Correct the instruction predicates for microMIPSr3 Rather than using the AdditionalPredicates mechanism to guard the microMIPS instructions, use the existing predicates to properly guard those instructions. This also resolves a case where an instruction pattern was incorrectly available for microMIPS32R6, which caused a register allocation failure as the registers specified in the pattern were not available. Reviewers: nitesh.jain, atanasyan Differential Revision: https://reviews.llvm.org/D38451 llvm-svn: 315362	2017-10-10 20:52:53 +00:00
Matt Arsenault	f42074b699	AMDGPU: Fix missing skipFunction calls llvm-svn: 315361	2017-10-10 20:48:36 +00:00
Matt Arsenault	d674e0ac0d	AMDGPU: Fix failure to select branch with optnone opt-bisect/optnone disable the AMDGPUUniformAnnotateValues pass. The heuristic in the custom selector for brcond deferred the branch uniformity check to the pattern, which would fail. llvm-svn: 315360	2017-10-10 20:34:49 +00:00
Adrian Prantl	3a3ba77ba3	Convert condition to an early exit (NFC). <rdar://problem/34689604> llvm-svn: 315359	2017-10-10 20:33:43 +00:00
Matt Arsenault	cc85223f87	AMDGPU: Fix incorrect selection of pseudo-branches These should only be used if the machine structurizer is enabled. llvm-svn: 315357	2017-10-10 20:22:07 +00:00
Rafael Espindola	12db383e20	Convert two uses of ErrorOr to Expected. llvm-svn: 315354	2017-10-10 20:00:07 +00:00
Yaxun Liu	de4b88d9a1	[AMDGPU] Lower enqueued blocks and generate runtime metadata This patch adds a post-linking pass which replaces the function pointer of enqueued block kernel with a global variable (runtime handle) and adds runtime-handle attribute to the enqueued block kernel. In LLVM CodeGen the runtime-handle metadata will be translated to RuntimeHandle metadata in code object. Runtime allocates a global buffer for each kernel with RuntimeHandle metadata and saves the kernel address required for the AQL packet into the buffer. __enqueue_kernel function in device library knows that the invoke function pointer in the block literal is actually runtime handle and loads the kernel address from it and puts it into AQL packet for dispatching. This cannot be done in FE since FE cannot create a unique global variable with external linkage across LLVM modules. The global variable with internal linkage does not work since optimization passes will try to replace loads of the global variable with its initialization value. Differential Revision: https://reviews.llvm.org/D38610 llvm-svn: 315352	2017-10-10 19:39:48 +00:00
Peter Collingbourne	0f9e889881	Support: On Windows, use CreateFileW to delete files in sys::fs::remove(). This saves a call to stat(). Differential Revision: https://reviews.llvm.org/D38715 llvm-svn: 315351	2017-10-10 19:39:46 +00:00
Rafael Espindola	8ed1198774	Try to make gcc happy. llvm-svn: 315349	2017-10-10 19:27:51 +00:00
Rafael Espindola	ff9f5f3372	Return Expected from createRTDyldELFObject. No functionality change, it just makes it easier to use Expected in Object. llvm-svn: 315348	2017-10-10 19:14:30 +00:00
Rafael Espindola	563ede149a	Simplify. NFC. llvm-svn: 315347	2017-10-10 19:07:10 +00:00
Derek Schuff	669300db9c	[WebAssembly] Update MCObjectWriter and associated interfaces after r315327 llvm-svn: 315335	2017-10-10 17:31:43 +00:00
Lang Hames	232cdb48fc	[MC] Add another missing <memory> include left out of r315327. llvm-svn: 315332	2017-10-10 16:59:01 +00:00
Lang Hames	3a67075a3a	[MC] Add a missing <memory> include left out of r315327. llvm-svn: 315331	2017-10-10 16:58:26 +00:00
Bruno Cardoso Lopes	57304923ca	Revert "[SCCP] Propagate integer range info for parameters in IPSCCP." This reverts commit r315288. This is part of fixing segfault introduced in: http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21675/ llvm-svn: 315329	2017-10-10 16:37:57 +00:00
Bruno Cardoso Lopes	122c4b3c8c	Revert "[SCCP] Fix mem-sanitizer failure introduced by r315288." This reverts commit r315294. Part of fixing seg fault introduced in: http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21675/ llvm-svn: 315328	2017-10-10 16:37:51 +00:00
Lang Hames	60fbc7cc38	[MC] Thread unique_ptr<MCObjectWriter> through the create.*ObjectWriter functions. This makes the ownership of the resulting MCObjectWriter clear, and allows us to remove one instance of MCObjectStreamer's bizarre "holding ownership via someone else's reference" trick. llvm-svn: 315327	2017-10-10 16:28:07 +00:00
Jacob Gravelle	37af00e7d0	[WebAssembly] Narrow the scope of WebAssemblyFixFunctionBitcasts Summary: The pass to fix function bitcasts generates thunks for functions that are called directly with a mismatching signature. It was also generating thunks in cases where the function was address-taken, causing aliasing problems in otherwise valid cases. This patch tightens the restrictions for when the pass runs. Reviewers: sunfish, dschuff Subscribers: jfb, sbc100, llvm-commits, aheejin Differential Revision: https://reviews.llvm.org/D38640 llvm-svn: 315326	2017-10-10 16:20:18 +00:00
Simon Dardis	96d35fe06a	[mips] Duplicate the reciprocal instruction definitions for FP32 Add instruction definitions for FP32 mode for recip.d and rsqrt.d. Previously these instructions were only defined when targeting the full 64-bit FPU model but were not guarded properly. Reviewers: nitesh.jain, atanasyan Differential Revision: https://reviews.llvm.org/D38400 llvm-svn: 315318	2017-10-10 14:41:11 +00:00
Jonas Devlieghere	aa6be823a4	Re-land "[llvm-dwarfdump] Print type names in DW_AT_type DIEs" This patch adds printing for DW_AT_type DIEs like it is already the case for DW_AT_specification DIEs. This is a rather naive approach and only a start. We should have pretty printers for different languages. Recommit after being reverted in r315299. Differential revision: https://reviews.llvm.org/D36993 llvm-svn: 315316	2017-10-10 14:15:25 +00:00
Stefan Pintilie	cc330daa5b	[PowerPC] Add missing record form instructions to the P9 Scheduling Model A number of record form instructions were missing from the P9 scheduling model. Added those instructions and marked the P9 model as complete. Differential Revision: https://reviews.llvm.org/D38560 llvm-svn: 315313	2017-10-10 13:45:35 +00:00
Uriel Korach	059e211aa1	after fixing the i386 case Change-Id: If6fe0b6ec01f111115fb734fe31c0e152dbc165f llvm-svn: 315311	2017-10-10 13:43:09 +00:00
Simon Dardis	a17a7b619a	[mips] Partially fix PR34391 Previously, the parsing of the 'subu $reg, ($reg,) imm' relied on a parser which also rendered the operand to the instruction. In some cases the general parser could construct an MCExpr which was not a MCConstantExpr which MipsAsmParser was expecting. Address this by altering the special handling to cope with unexpected inputs and fine-tune the handling of cases where a register name that is not available in the current ABI is regarded as not a match for the custom parser but also not as an outright error. Also enforces the binutils restriction that only constants are accepted. This partially resolves PR34391. Thanks to Ed Maste for reporting the issue! Reviewers: nitesh.jain, arichardson Differential Revision: https://reviews.llvm.org/D37476 llvm-svn: 315310	2017-10-10 13:34:45 +00:00
David Stuttard	51c1b22806	[DAGCombine] Fix for shuffle to vector extend for non power 2 vectors Summary: See https://llvm.org/PR33743 for more details It seems that for non-power of 2 vector sizes, the algorithm can produce non-matching sizes for input and result causing an assert. This usually isn't a problem as the isAnyExtend check will weed these out, but in some cases (most often with lots of undefined values for the mask indices) it can pass this check for non power of 2 vectors. Adding in an extra check that ensures that bit size will match for the result and input (as required) Subscribers: nhaehnle Differential Revision: https://reviews.llvm.org/D35241 llvm-svn: 315307	2017-10-10 12:45:45 +00:00
Oliver Stannard	30b732c942	[ARM, Asm] Harden GNU LDRD/STRD aliases against invalid inputs Previously, the code that implemented the GNU assembler aliases for the LDRD and STRD instructions (where the second register is omitted) assumed that the input was a valid instruction. This caused assertion failures for every example in ldrd-strd-gnu-bad-inst.s. This improves the code so that, if the instruction is not in the expected format, the check bails out and the asm parser is run on the unmodified instruction. It also relaxes the alias on thumb targets, so that unaligned pairs of registers can be used. The restriction that Rt must be even-numbered only applies to the ARM versions of these instructions. Differential revision: https://reviews.llvm.org/D36732 llvm-svn: 315305	2017-10-10 12:38:22 +00:00
Oliver Stannard	cd3306f62f	[ARM, Asm] Add diagnostics for floating-point register operands This adds diagnostic strings for the ARM floating-point register classes, which will be used when these classes are expected by the assembler, but the provided operand is not valid. One of these, DPR, requires C++ code to select the correct error message, as that class contains different registers depending on the FPU. The rest can all have their diagnostic strings stored in the tablegen description of them. Differential revision: https://reviews.llvm.org/D36693 llvm-svn: 315304	2017-10-10 12:35:09 +00:00
Oliver Stannard	bbad419e94	[ARM, Asm] Add diagnostics for general-purpose register operands This adds diagnostic strings for the ARM general-purpose register classes, which will be used when these classes are expected by the assembler, but the provided operand is not valid. One of these, rGPR, requires C++ code to select the correct error message, as that class contains different registers in pre-v8 and v8 targets. The rest can all have their diagnostic strings stored in the tablegen description of them. Differential revision: https://reviews.llvm.org/D36692 llvm-svn: 315303	2017-10-10 12:31:53 +00:00
Nicolai Haehnle	312b64f4d7	AMDGPU: Split MUBUF offset into aligned components Summary: Atomic buffer operations do not work (and trap on gfx9) when the components are unaligned, even if their sum is aligned. Previously, we generated an offset of 4156 without an SGPR by splitting it as 4095 + 61 (immediate + inline constant). The highest offset for which we can do this correctly is 4156 = 4092 + 64. Fixes dEQP-GLES31.functional.ssbo.atomic.* Reviewers: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D37850 llvm-svn: 315302	2017-10-10 12:22:23 +00:00
Jonas Devlieghere	5b0f885691	Revert "[llvm-dwarfdump] Print type names in DW_AT_type DIEs" This reverts commit r315297. llvm-svn: 315299	2017-10-10 11:49:56 +00:00
Jonas Devlieghere	2eb95c33f6	[llvm-dwarfdump] Print type names in DW_AT_type DIEs This patch adds printing for DW_AT_type DIEs like it is already the case for DW_AT_specification DIEs. This is a rather naive approach and only a start. We should have pretty printers for different languages. Differential revision: https://reviews.llvm.org/D36993 llvm-svn: 315297	2017-10-10 11:24:41 +00:00
Florian Hahn	7d2375df30	[SCCP] Fix mem-sanitizer failure introduced by r315288. llvm-svn: 315294	2017-10-10 10:33:45 +00:00
Florian Hahn	22a44bca40	[SCCP] Propagate integer range info for parameters in IPSCCP. Summary: This updates the SCCP solver to use the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to `ret i32 2` with this change source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 315288	2017-10-10 09:32:38 +00:00
Nemanja Ivanovic	7bf866eb10	Fix for PR34888. The issue is that we assume operand zero of the input to the add instruction is a register. In this case, the input comes from inline assembly and operand zero is not a register thereby causing a crash. The code will bail anyway if the input instruction doesn't have the right opcode. So do that check first and let short-circuiting prevent the crash. llvm-svn: 315285	2017-10-10 08:46:10 +00:00
NAKAMURA Takumi	aba2b3d1f3	SILoadStoreOptimizer.cpp: Fix build; Clang doesn't like "using anonymous struct" since rL315256. llvm-svn: 315283	2017-10-10 08:30:53 +00:00
Clement Courbet	e2e8a5c496	Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion." (fixed stability issues) This reverts commit d6492333d3b478a1d88163315002022f8d5e58dc. llvm-svn: 315281	2017-10-10 08:00:45 +00:00
Bjorn Steinbrink	d36bbe9c89	Ignore all duplicate frame index expressions Some passes might duplicate calls to llvm.dbg.declare, creating duplicate frame index expressions which currently trigger an assertion that is meant to catch erroneous, overlapping fragment declarations. But identical frame index expressions are just redundant and don't actually conflict with each other, so we can be more lenient and just ignore the duplicates. Reviewers: aprantl, rnk Subscribers: llvm-commits, JDevlieghere Differential Revision: https://reviews.llvm.org/D38540 llvm-svn: 315279	2017-10-10 07:46:17 +00:00
Alex Bradbury	8cc99f1887	[RISCV] Fix build after r315254 createELFObjectWriter now takes a std::unique_ptr<MCELFObjectTargetWriter> rather than a MCELFObjectTargetWriter*. llvm-svn: 315275	2017-10-10 07:19:18 +00:00
Craig Topper	a88306e6fb	[AVX512] Add patterns to commute integer comparison instructions during isel. This enables broadcast loads to be commuted and allows normal loads to be folded without the peephole pass. llvm-svn: 315274	2017-10-10 06:36:46 +00:00
Xinliang David Li	4cdc9dab0a	Re-enable r314928 Eliminate inttype phi with inttoptr/ptrtoint. This version fixes a bug in finding the matching phi -- the order of the incoming blocks may be different (triggered in self build on Windows). A new test case is added. llvm-svn: 315272	2017-10-10 05:07:54 +00:00
Reid Kleckner	97a2d5c42f	[MC] Properly diagnose badly scoped .cfi_ directives Removes two report_fatal_errors. Implement this by removing EmitCFICommon, and do the checking in getCurrentDwarfFrameInfo. Have the callers check for null before dereferencing it. llvm-svn: 315264	2017-10-10 01:49:21 +00:00
Reid Kleckner	e52d1e6787	[SEH] Use reportError instead of report_fatal_error for bad directives This makes the .seh_ directives slightly more usable from standalone assembly files. This removes a large number of report_fatal_errors and recovers from the error by ignoring the directive. llvm-svn: 315262	2017-10-10 01:26:25 +00:00
Lang Hames	1301a878f1	[MC] Plumb unique_ptr<MCWasmObjectTargetWriter> through createWasmObjectWriter to WasmObjectWriter's constructor. Fixes the same ownership issue for Wasm that r315245 did for MachO: WasmObjectWriter takes ownership of its MCWasmObjectTargetWriter, so we want to pass this through to the constructor via a unique_ptr, rather than a raw ptr. llvm-svn: 315260	2017-10-10 01:15:10 +00:00
Reid Kleckner	ab23dace56	[MC] Suppress .Lcfi labels when emitting textual assembly Summary: This suppresses the generation of .Lcfi labels in our textual assembler. It was annoying that this generated cascading .Lcfi labels: llc foo.ll -o - \| llvm-mc \| llvm-mc After three trips through MCAsmStreamer, we'd have three labels in the output when none are necessary. We should only bother creating the labels and frame data when making a real object file. This supersedes D38605, which moved the entire .seh_ implementation into MCObjectStreamer. This has the advantage that we do more checking when emitting textual assembly, at a minor efficiency cost. Outputting textual assembly is not performance critical, so this shouldn't matter. Reviewers: majnemer, MatzeB Subscribers: qcolombet, nemanjai, javed.absar, eraman, hiraditya, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D38638 llvm-svn: 315259	2017-10-10 00:57:36 +00:00
Reid Kleckner	a11b983e11	Fix Wasm build after r315254 llvm-svn: 315258	2017-10-10 00:52:40 +00:00
Lang Hames	77dff39cb4	[MC] Plumb unique_ptr<MCWinCOFFObjectTargetWriter> through createWinCOFFObjectWriter to WinCOFFObjectWriter's constructor. Fixes the same ownership issue for COFF that r315245 did for MachO: WinCOFFObjectWriter takes ownership of its MCWinCOFFObjectTargetWriter, so we want to pass this through to the constructor via a unique_ptr, rather than a raw ptr. llvm-svn: 315257	2017-10-10 00:50:29 +00:00
Lang Hames	dcb312bdb9	[MC] Plumb unique_ptr<MCELFObjectTargetWriter> through createELFObjectWriter to ELFObjectWriter's constructor. Fixes the same ownership issue for ELF that r315245 did for MachO: ELFObjectWriter takes ownership of its MCELFObjectTargetWriter, so we want to pass this through to the constructor via a unique_ptr, rather than a raw ptr. llvm-svn: 315254	2017-10-09 23:53:15 +00:00
Adam Nemet	0965da2055	Rename OptimizationDiagnosticInfo.* to OptimizationRemarkEmitter.* Sync it up with the name of the class actually defined here. This has been bothering me for a while... llvm-svn: 315249	2017-10-09 23:19:02 +00:00
Lang Hames	9b206a7d60	[MC] Plumb unique_ptr<MCMachObjectTargetWriter> through createMachObjectWriter to MCObjectWriter's constructor. MCObjectWriter takes ownership of its MCMachObjectTargetWriter argument -- this patch plumbs that ownership relationship through the constructor (which previously took raw MCMachObjectTargetWriter*) and the createMachObjectWriter function. llvm-svn: 315245	2017-10-09 22:38:13 +00:00
Aditya Nandakumar	c3bfc81a1f	[GISel]: Fix generation of illegal COPYs during CallLowering We end up creating COPY's that are either truncating/extending and this should be illegal. https://reviews.llvm.org/D37640 Patch for X86 and ARM by igorb, rovka llvm-svn: 315240	2017-10-09 20:07:43 +00:00
Zvi Rackover	c1d5955684	[X86] Unsigned saturation subtraction canonicalization [the backend part] Summary: On behalf of julia.koval@intel.com The patch transforms the canonical version of unsigned saturation, which is sub(max(a,b),a) or sub(a,min(a,b)), to the special psubus instruction on targets that support it (8-bit and 16-bit uints). umax(a,b) - b -> subus(a,b) a - umin(a,b) -> subus(a,b) There is also an extra case handled, when the right part of sub is 32 bit and can be truncated, using UMIN (this transformation was discussed in https://reviews.llvm.org/D25987). The example of special case code: ``` void foo(unsigned short p, int max, int n) { int i; unsigned m; for (i = 0; i < n; i++) { m = --p; p = (unsigned short)(m >= max ? m-max : 0); } } ``` Max in this example is truncated to max_short value, if it is greater than m, or just truncated to 16 bit, if it is not. It is a valid transformation, because if max > max_short, the result of the expression will be zero. Here is the table of types, I try to support, special case items are bold: \| Size \| 128 \| 256 \| 512 \| ----- \| ----- \| ----- \| ----- \| i8 \| v16i8 \| v32i8 \| v64i8 \| i16 \| v8i16 \| v16i16 \| v32i16 \| i32 \| \| v8i32* \| v16i32 \| i64 \| \| \| v8i64 Reviewers: zvi, spatel, DavidKreitzer, RKSimon Reviewed By: zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37534 llvm-svn: 315237	2017-10-09 20:01:10 +00:00
Lang Hames	0b9db4c1fa	[MC] Use a unique_ptr<MCAssembler> for MCObjectStreamer's Assembler member. Removes manual new/delete. llvm-svn: 315225	2017-10-09 18:11:04 +00:00
Sanjay Patel	ce36b03b03	[InstCombine] fix formatting; NFC llvm-svn: 315223	2017-10-09 17:54:46 +00:00
Adrian McCarthy	e6275c6edb	Fix after r315079 Microsoft's debug implementation of std::copy checks if the destination is an array and then does some bounds checking. This was causing an assertion failure in fs::rename_internal which copies to a buffer of the appropriate size but that's type-punned to an array of length 1 for API compatibility reasons. Fix is to make the destination a pointer rather than an array. llvm-svn: 315222	2017-10-09 17:50:01 +00:00
Sanjay Patel	2a61a821a0	[DAG] combine assertsexts around a trunc This was a suggested follow-up to: D37017 / https://reviews.llvm.org/rL313577 llvm-svn: 315206	2017-10-09 15:22:20 +00:00
Amara Emerson	24ca39ce71	[AArch64] Improve codegen for inverted overflow checking intrinsics E.g. if we have a (xor(overflow-bit), 1) where overflow-bit comes from an intrinsic like llvm.sadd.with.overflow then we can kill the xor and use the inverted condition code for the CSEL. rdar://28495949 Reviewed By: kristof.beyls Differential Revision: https://reviews.llvm.org/D38160 llvm-svn: 315205	2017-10-09 15:15:09 +00:00
Craig Topper	c88883b07d	[X86] Remove a setLoadExtAction from the AVX512 section that uses an AVX512BW type and is already present in the AVX512BW section. llvm-svn: 315202	2017-10-09 01:05:16 +00:00
Craig Topper	4f8656a7af	[X86] Enable extended comparison predicate support for SETUEQ/SETONE when targeting AVX instructions. We believe that, despite AMD's documentation, they really do support all 32 comparison predicates under AVX. Differential Revision: https://reviews.llvm.org/D38609 llvm-svn: 315201	2017-10-09 01:05:15 +00:00
Simon Pilgrim	2c742f919a	[X86][SSE] Don't call combineTo inside combineX86ShufflesRecursively. NFCI. Return the combined shuffle from combineX86ShufflesRecursively and perform the combineTo in the caller. Makes it easier for future patches to use this in functions that aren't actually shuffles themselves. llvm-svn: 315195	2017-10-08 20:58:14 +00:00
Simon Pilgrim	6abbd33ec0	Tidyup with clang-format. NFCI. llvm-svn: 315187	2017-10-08 19:24:30 +00:00
Benjamin Kramer	16610028ea	Remove unused variables. No functionality change. llvm-svn: 315185	2017-10-08 19:11:02 +00:00
Simon Pilgrim	dc32c844f9	[X86] getTargetConstantBitsFromNode - add support for decoding scalar constants llvm-svn: 315182	2017-10-08 17:21:18 +00:00
Craig Topper	c97775c03c	[X86] Prefer MOVSS/SD over BLENDI during legalization. Remove BLENDI versions of scalar arithmetic patterns Summary: We currently disable some converting of shuffles to MOVSS/MOVSD during legalization if SSE41 is enabled. But later during shuffle combining we go back to preferring MOVSS/MOVSD. Additionally we have patterns that look for BLENDIs to detect scalar arithmetic operations. I believe due to the combining using MOVSS/MOVSD these are unnecessary. Interestingly, we still codegen blend instructions even though lowering/isel emit movss/movsd instructions. Turns out machine CSE commutes them to blend, and then commuting those blends back into blends that are equivalent to the original movss/movsd. This patch fixes the inconsistency in legalization to prefer MOVSS/MOVSD. The one test change was caused by this change. The problem is that we have integer types and are mostly selecting integer instructions except for the shufps. This shufps forced the execution domain, but the vpblendw couldn't have its domain changed with a naive instruction swap. We could fix this by special casing VPBLENDW based on the immediate to widen the element type. The rest of the patch is removing all the excess scalar patterns. Long term we should probably add isel patterns to make MOVSS/MOVSD emit blends directly instead of relying on the double commute. We may also want to consider emitting movss/movsd for optsize. I also wonder if we should still use the VEX encoded blendi instructions even with AVX512. Blends have better throughput, and that may outweigh the register constraint. Reviewers: RKSimon, zvi Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38023 llvm-svn: 315181	2017-10-08 16:57:23 +00:00
Amara Emerson	1cd89ca669	[AArch64][GlobalISel] Make G_PHI of p0 types legal. Differential Revision: https://reviews.llvm.org/D38621 llvm-svn: 315177	2017-10-08 15:29:11 +00:00
Gadi Haber	684944b822	[X86][SKX] Adding the scheduling information for the SKX target. Adding the scheduling information for the SkylakeServer (SKX) target. This patch adds the instruction scheduling information for the SkylakeServer (SKX) architecture target by adding the file X86SchedSkylakeServer.td located under the X86 Target. We used the scheduling information retrieved from the Skylake architects in order to create the file. The scheduling information includes latency, number of micro-Ops and used ports by each SKL instruction. The patch continues the scheduling replacement and insertion effort started with the SNB target in r310792, the HSW target in r311879 and the SkylakeClient (SKL) target in rL313613. Please expect some performance fluctuations due to code alignment effects. Reviewers: zvi, RKSimon, craig.topper, chandlerc, aymanmu Differential Revision: https://reviews.llvm.org/D38443 Change-Id: I5c228fcc09e9e5a99b6116e62b356c4f9b971185 llvm-svn: 315175	2017-10-08 12:52:54 +00:00
Ayman Musa	1170deb9c8	[X86] Add missing entries in 'MemoryFoldTable2Addr' to get complete form of the table. Get the folding table 'MemoryFoldTable2Addr' to a complete state as part of the process explained in https://reviews.llvm.org/D38028 Differential Revision: https://reviews.llvm.org/D38500 llvm-svn: 315174	2017-10-08 09:46:50 +00:00
Ayman Musa	993339b941	[X86][TableGen] Recommitting the X86 memory folding tables TableGen backend while disabling it by default. After the original commit ([[ https://reviews.llvm.org/rL304088 \| rL304088 ]]) was reverted, a discussion in llvm-dev was opened on 'how to accomplish this task'. In the discussion we concluded that the best way to achieve our goal (which is to automate the folding tables and remove the manually maintained tables) is: # Commit the tablegen backend disabled by default. # Proceed with an incremental updating of the manual tables - while checking the validity of each added entry. # Repeat previous step until we reach a state where the generated and the manual tables are identical. Then we can safely remove the manual tables and include the generated tables instead. # Schedule periodical (1 week/2 weeks/1 month) runs of the pass: - if changes appear (new entries): - make sure the entries are legal - If they are not, mark them as illegal to folding - Commit the changes (if there are any). CMake flag added for this purpose is "X86_GEN_FOLD_TABLES". Building with this flags will run the pass and emit the X86GenFoldTables.inc file under build/lib/Target/X86/ directory which is a good reference for any developer who wants to take part in the effort of completing the current folding tables. Differential Revision: https://reviews.llvm.org/D38028 llvm-svn: 315173	2017-10-08 09:20:32 +00:00
Craig Topper	bbca2f2978	[X86] Stop LowerSIGN_EXTEND_AVX512 from creating v8i16/v16i16/v16i8 vselects with a v8i1/v16i1 condition when BWI is not available. Some of the tests in vector-shuffle-v1.ll would get into an infinite loop without this. llvm-svn: 315172	2017-10-08 08:50:59 +00:00
Ayman Musa	5fc6dc58d7	[X86] Add new attribute to X86 instructions to enable marking them as "not memory foldable" This attribute will be used in a tablegen backend that generated the X86 memory folding tables which will be added in a future pass. Instructions with this attribute unset will be excluded from the full set of X86 instructions available for the pass. Differential Revision: https://reviews.llvm.org/D38027 llvm-svn: 315171	2017-10-08 08:32:56 +00:00
Craig Topper	9563cab961	[X86] Simplify some code in getInsertVINSERTImmediate and getExtractVEXTRACTImmediate. NFC Replace one of the divides with a multiply. llvm-svn: 315162	2017-10-08 01:33:42 +00:00
Craig Topper	27170fee8d	[X86] If we see an insert of a bitcast into zero vector, canonicalize it to move the bitcast to the other side of the insert. This improves detection of zeroing of upper bits during isel. llvm-svn: 315161	2017-10-08 01:33:41 +00:00
Craig Topper	f7a19db649	[X86] Remove ISD::INSERT_SUBVECTOR handling from combineBitcastForMaskedOp. Add isel patterns to make up for it. This will allow for some flexibility in canonicalizing bitcasts around insert_subvector. llvm-svn: 315160	2017-10-08 01:33:40 +00:00
Craig Topper	16f2044fa8	[X86] Use getConstantOperandVal to simplify some code. NFC llvm-svn: 315159	2017-10-08 01:33:38 +00:00
Simon Pilgrim	9508fe7924	[X86][SSE] Match bitcasted BUILD_VECTOR of constants for v2i64 shifts on 64-bit targets (PR34855) Extension to rL315155, generate constant shifts on 64-bits as well as 32-bits. llvm-svn: 315156	2017-10-07 17:57:22 +00:00
Simon Pilgrim	70e1db78db	[X86][SSE] Match bitcasted v4i32 BUILD_VECTORS for v2i64 shifts on 64-bit targets (PR34855) We were already doing this for 32-bit targets, but we can generate these on 64-bits as well. llvm-svn: 315155	2017-10-07 17:42:17 +00:00
Craig Topper	90b76211d3	[SelectionDAG] Use KnownBits::isUnknown and hasConflict. NFC llvm-svn: 315154	2017-10-07 17:07:48 +00:00
Craig Topper	2f60295364	[X86] Add X86ISD::CMOV to computeKnownBitsForTargetNode and ComputeNumSignBitsForTargetNode. Summary: Implementations based on ISD::SELECT. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38663 llvm-svn: 315153	2017-10-07 16:51:19 +00:00
Simon Pilgrim	73f143e774	[X86][SSE] Improve shuffling combining with horizontal operations Recognise cases when we can merge the shuffles with their horizontal (HADD/HSUB/PACK) instruction inputs. Replaces an older implementation which performed some of this during lowering, expanding an existing target shuffle combine stage instead. Differential Revision: https://reviews.llvm.org/D38506 llvm-svn: 315150	2017-10-07 12:42:23 +00:00
Martin Storsjo	5e9d482b0a	[X86] Update an outdated comment about SjLj The SjLj intrinsics in the X86 backend are intended for use with SjLj exception handling as well, since SVN r271244. Differential Revision: https://reviews.llvm.org/D38532 llvm-svn: 315146	2017-10-07 06:00:32 +00:00
Craig Topper	e79eff3bb5	[X86] Correct result type for the flag result of RDSEED and RDRAND nodes. Correct the CC type for the CMOV used with RDSEED/RDRAND. The flag result was MVT::Glue, but should be MVT::i32. The CC type was MVT::i8, but should be MVT::i32. llvm-svn: 315145	2017-10-07 05:11:59 +00:00
Jessica Paquette	13593843f6	[MachineOutliner] Disable outlining from LinkOnceODRs by default Say you have two identical linkonceodr functions, one in M1 and one in M2. Say that the outliner outlines A,B,C from one function, and D,E,F from another function (where letters are instructions). Now those functions are not identical, and cannot be deduped. Locally to M1 and M2, these outlining choices would be good-- to the whole program, however, this might not be true! To mitigate this, this commit makes it so that the outliner sees linkonceodr functions as unsafe to outline from. It also adds a flag, -enable-linkonceodr-outlining, which allows the user to specify that they want to outline from such functions when they know what they're doing. Changing this handles most code size regressions in the test suite caused by competing with linker dedupe. It also doesn't have a huge impact on the code size improvements from the outliner. There are 6 tests that regress > 5% from outlining WITH linkonceodrs to outlining WITHOUT linkonceodrs. Overall, most tests either improve or are not impacted. Not outlined vs outlined without linkonceodrs: https://hastebin.com/raw/qeguxavuda Not outlined vs outlined with linkonceodrs: https://hastebin.com/raw/edepoqoqic Outlined with linkonceodrs vs outlined without linkonceodrs: https://hastebin.com/raw/awiqifiheb Numbers generated using compare.py with -m size.__text. Tests run for AArch64 with -Oz -mllvm -enable-machine-outliner -mno-red-zone. llvm-svn: 315136	2017-10-07 00:16:34 +00:00
Sanjay Patel	72d339abb7	[InstCombine] use correct type when propagating constant condition in simplifyDivRemOfSelectWithZeroOp (PR34856) llvm-svn: 315130	2017-10-06 23:43:06 +00:00
Sanjay Patel	ae2e3a44d2	[InstCombine] rename SimplifyDivRemOfSelect to be clearer, add comments, simplify code; NFCI There's at least one bug here - this code can fail with vector types (PR34856). It's also being called for FREM; I'm still trying to understand how that is valid. llvm-svn: 315127	2017-10-06 23:20:16 +00:00
Cameron McInally	9d64101fe8	[AVX512] Fix TERNLOG when folding broadcast Patch to fix ternlog instructions with a folded broadcast. The broadcast decorator, e.g. {1toX}, was missing. Differential Revision: https://reviews.llvm.org/D38649 llvm-svn: 315122	2017-10-06 22:31:29 +00:00
Jonas Devlieghere	f2fa9ebe3f	[dwarfdump] Verify that unit type matches root DIE This patch adds two new verifiers: - It checks that the root DIE of a CU is actually a valid unit DIE. (based on its tag) - For DWARF5 which contains a unit type in the CU header, it checks that this matches the type of the unit DIE. Differential revision: https://reviews.llvm.org/D38453 llvm-svn: 315121	2017-10-06 22:27:31 +00:00
Reid Kleckner	b6b210e61f	Revert "Roll forward r314928" This appears to be miscompiling Clang, as shown on two Windows bootstrap bots: http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/7611 http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/6870 Nothing else is in the blame list. Both emit errors on this valid code in the Windows ucrt headers: C:\...\ucrt\malloc.h:95:32: error: invalid operands to binary expression ('char ' and 'int') _Ptr = (char)_Ptr + _ALLOCA_S_MARKER_SIZE; ~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~ I am attempting to reproduce this now. This reverts r315044 llvm-svn: 315108	2017-10-06 21:17:51 +00:00
Reid Kleckner	813c577cc2	[PEI] Remove required properties and use 'if' instead of std::function Summary: After r303360, we initialize UsesCalleeSaves in runOnMachineFunction, which runs after getRequiredProperties. UsesCalleeSaves was initialized to 'false', so getRequiredProperties would always return an empty set. We don't have a TargetMachine available early anymore after r303360. Just removing the requirement of NoVRegs seems to make things work, so let's do that. Reviewers: thegameg, dschuff, MatzeB Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D38597 llvm-svn: 315089	2017-10-06 18:21:19 +00:00
Saleem Abdulrasool	46a59fdab6	Bitcode: add an auto-upgrade for LTO section name The bitcode reader looks specifically for `__DATA, __objc_catlist` as a section name. However, SVN r304661 removed the spaces (the two names are functionally equivalent but do not compare equally lexicographically). This causes compatibility issues. Add an auto-upgrade path for removing the spaces as well as use the new name in the LTO plugin. llvm-svn: 315086	2017-10-06 18:06:59 +00:00
Stanislav Mekhanoshin	de42c29a68	[AMDGPU] New 64 bit div/rem expansion Old expansion was 20 VGPRs, 78 SGPRs and ~380 instructions. This expansion is 11 VGPRs, 12 SGPRs and ~120 instructions. Passes OpenCL conformance test_integer_ops quick_[u]long_math Differential Revision: https://reviews.llvm.org/D38607 llvm-svn: 315081	2017-10-06 17:24:45 +00:00
Reid Kleckner	4c4422f9a5	[MC] Use unique_ptr to manage WinFrameInfos, NFC The FrameInfo cannot be stored directly in the vector because chained frames may refer to parent frames, so we need pointers that are stable across a vector resize. llvm-svn: 315080	2017-10-06 17:21:49 +00:00
Peter Collingbourne	80e31f1f84	Support: Rewrite Windows implementation of sys::fs::rename to be more POSIXy. The current implementation of rename uses ReplaceFile if the destination file already exists. According to the documentation for ReplaceFile, the source file is opened without a sharing mode. This means that there is a short interval of time between when ReplaceFile renames the file and when it closes the file during which the destination file cannot be opened. This behaviour is not POSIX compliant because rename is supposed to be atomic. It was also causing intermittent link failures when linking with a ThinLTO cache; the ThinLTO cache implementation expects all cache files to be openable. This patch addresses that problem by re-implementing rename using CreateFile and SetFileInformationByHandle. It is roughly a reimplementation of ReplaceFile with a better sharing policy as well as support for renaming in the case where the destination file does not exist. This implementation is still not fully POSIX. Specifically in the case where the destination file is open at the point when rename is called, there will be a short interval of time during which the destination file will not exist. It isn't clear whether it is possible to avoid this using the Windows API. Differential Revision: https://reviews.llvm.org/D38570 llvm-svn: 315079	2017-10-06 17:14:36 +00:00
Dehao Chen	9bd60429e2	Directly return promoted direct call instead of relying on stripPointerCast. Summary: stripPointerCast is not reliably returning the value that's being type-casted. Instead it may look further at function attributes to further propagate the value. Instead of relying on stripPointerCast, the more reliable solution is to directly use the pointer to the promoted direct call. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D38603 llvm-svn: 315077	2017-10-06 17:04:55 +00:00
Diana Picus	e393bc72ee	[ARM] GlobalISel: Select shifts Unfortunately TableGen doesn't handle this yet: Unable to deduce gMIR opcode to handle Src (which is a leaf). Just add some temporary hand-written code to generate the proper MOVsr. llvm-svn: 315071	2017-10-06 15:39:16 +00:00
Diana Picus	a81a4b17e5	[ARM] GlobalISel: Map shift operands to GPRs llvm-svn: 315067	2017-10-06 14:52:43 +00:00
Francis Ricci	8aedfde298	[llvm-dsymutil] Add support for __swift_ast MachO DWARF section Summary: Xcode's dsymutil emits a __swift_ast DWARF section, which is required for debugging, and which contains a byte-for-byte dump of the swiftmodule file. Add this feature to llvm-dsymutil. Tested with `gobjdump --dwarf=info -s`, by verifying that the contents of `__DWARF.__swift_ast` match between Xcode's dsymutil and llvm-dsymutil (Xcode's dwarfdump and llvm-dwarfdump don't currently recognize the __swift_ast section). Reviewers: aprantl, friss Subscribers: llvm-commits, JDevlieghere Differential Revision: https://reviews.llvm.org/D38504 llvm-svn: 315066	2017-10-06 14:49:20 +00:00
Diana Picus	2c95730450	[ARM] GlobalISel: Mark shifts as legal for s32 The new legalize combiner introduces shifts all over the place, so we should support them sooner rather than later. llvm-svn: 315064	2017-10-06 14:30:05 +00:00
Jonas Paulsson	c63ed222b8	[SystemZ] Enable machine scheduler. The machine scheduler (before register allocation) is enabled by default for SystemZ. The SelectionDAG scheduling preference now becomes source order scheduling (was regpressure). Review: Ulrich Weigand https://reviews.llvm.org/D37977 llvm-svn: 315063	2017-10-06 13:59:28 +00:00
Clement Courbet	d12c189e2e	Revert "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion." Still a few stability issues on windows. This reverts commit 67e3db9bc121ba244e20337aabc7cf341a62b545. llvm-svn: 315058	2017-10-06 13:02:24 +00:00
Clement Courbet	4e1bae8136	Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion." (fixed unit tests by making comparisons stable) This reverts commit 1b2d359ce256fd6737da4e93833346a0bd6d7583. llvm-svn: 315056	2017-10-06 12:12:35 +00:00
Xinliang David Li	bcd36f7c5a	Roll forward r314928 Fixed ThinLTO bootstrap failure : track new bitcast per incomingVal. Added new tests. llvm-svn: 315044	2017-10-06 05:15:25 +00:00
Davide Italiano	c74ea93b8c	[PM] Retire disable unit-at-a-time switch. This is a vestige from the GCC-3 days, which disables IPO passes when set. I don't think anybody actually uses it as there are several IPO passes which still run with this flag set and nobody complained/noticed. This reduces the delta between current and new pass manager and allows us to easily review the difference when we decide to flip the switch (or audit which passes should run, FWIW). llvm-svn: 315043	2017-10-06 04:39:40 +00:00
Jakub Kuderski	cbe9fae99d	[CodeExtractor] Fix multiple bugs under certain shape of extracted region Summary: If the extracted region has multiple exported data flows toward the same BB which is not included in the region, correct restore instructions and PHI nodes won't be generated inside the exitStub. The solution is simply to put the restore instructions right after the definition of output values instead of putting them in exitStub. Unittest for this bug is included. Author: myhsu Reviewers: chandlerc, davide, lattner, silvas, davidxl, wmi, kuhar Subscribers: dberlin, kuhar, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D37902 llvm-svn: 315041	2017-10-06 03:37:06 +00:00
Daniel Berlin	08dd582ea0	NewGVN: Factor out duplicate parts of OpIsSafeForPHIOfOps llvm-svn: 315040	2017-10-06 01:33:06 +00:00
Francis Ricci	b4e77d98ed	Revert "[llvm-dsymutil] Add support for __swift_ast MachO DWARF section" Breaks aarch64 builders This reverts commit r315014. llvm-svn: 315034	2017-10-05 23:09:17 +00:00
Xin Tong	27e66fb579	[MBP] Remove an invalid assert. The patch that this assert comes with is fixing a bug in MBP. The assert is invalid however. Thanks to @sergey.k.okunev for finding this. Currently this fails SPECCPU2006 LTO. I will add a test case when I do more investigation and have one. llvm-svn: 315032	2017-10-05 23:00:04 +00:00
Peter Collingbourne	715bcfe0c9	ModuleUtils: Stop using comdat members to generate unique module ids. It is possible for two modules to define the same set of external symbols without causing a duplicate symbol error at link time, as long as each of the symbols is a comdat member. So we cannot use them as part of a unique id for the module. Differential Revision: https://reviews.llvm.org/D38602 llvm-svn: 315026	2017-10-05 21:54:53 +00:00
Reid Kleckner	676941909d	[X86] Extract CATCHRET handling from emitEpilogue, NFC llvm-svn: 315023	2017-10-05 21:37:39 +00:00
Derek Schuff	885dc59297	[WebAssembly] Add the rest of the atomic loads Add extending loads and constant offset patterns A bit more refactoring of the tablegen to make the patterns fairly nice and uniform between the regular and atomic loads. Differential Revision: https://reviews.llvm.org/D38523 llvm-svn: 315022	2017-10-05 21:18:42 +00:00
Sanjay Patel	7ac2db6a48	[InstCombine] improve folds for icmp gt/lt (shr X, C1), C2 We can always eliminate the shift in: icmp gt/lt (shr X, C1), C2 --> icmp gt/lt X, C' This patch was supposed to just be an efficiency improvement because we were doing this 3-step process to fold: IC: Visiting: %c = icmp ugt i4 %s, 1 IC: ADD: %s = lshr i4 %x, 1 IC: ADD: %1 = udiv i4 %x, 2 IC: Old = %c = icmp ugt i4 %1, 1 New = <badref> = icmp uge i4 %x, 4 IC: ADD: %c = icmp uge i4 %x, 4 IC: ERASE %2 = icmp ugt i4 %1, 1 IC: Visiting: %c = icmp uge i4 %x, 4 IC: Old = %c = icmp uge i4 %x, 4 New = <badref> = icmp ugt i4 %x, 3 IC: ADD: %c = icmp ugt i4 %x, 3 IC: ERASE %2 = icmp uge i4 %x, 4 IC: Visiting: %c = icmp ugt i4 %x, 3 IC: DCE: %1 = udiv i4 %x, 2 IC: ERASE %1 = udiv i4 %x, 2 IC: DCE: %s = lshr i4 %x, 1 IC: ERASE %s = lshr i4 %x, 1 IC: Visiting: ret i1 %c When we could go directly to canonical icmp form: IC: Visiting: %c = icmp ugt i4 %s, 1 IC: Old = %c = icmp ugt i4 %s, 1 New = <badref> = icmp ugt i4 %x, 3 IC: ADD: %c = icmp ugt i4 %x, 3 IC: ERASE %1 = icmp ugt i4 %s, 1 IC: ADD: %s = lshr i4 %x, 1 IC: DCE: %s = lshr i4 %x, 1 IC: ERASE %s = lshr i4 %x, 1 IC: Visiting: %c = icmp ugt i4 %x, 3 ...but then I noticed that the folds were incomplete too: https://godbolt.org/g/aB2hLE Here are attempts to prove the logic with Alive: https://rise4fun.com/Alive/92o Name: lshr_ult Pre: ((C2 << C1) u>> C1) == C2 %sh = lshr i8 %x, C1 %r = icmp ult i8 %sh, C2 => %r = icmp ult i8 %x, (C2 << C1) Name: ashr_slt Pre: ((C2 << C1) >> C1) == C2 %sh = ashr i8 %x, C1 %r = icmp slt i8 %sh, C2 => %r = icmp slt i8 %x, (C2 << C1) Name: lshr_ugt Pre: (((C2+1) << C1) u>> C1) == (C2+1) %sh = lshr i8 %x, C1 %r = icmp ugt i8 %sh, C2 => %r = icmp ugt i8 %x, ((C2+1) << C1) - 1 Name: ashr_sgt Pre: (C2 != 127) && ((C2+1) << C1 != -128) && (((C2+1) << C1) >> C1) == (C2+1) %sh = ashr i8 %x, C1 %r = icmp sgt i8 %sh, C2 => %r = icmp sgt i8 %x, ((C2+1) << C1) - 1 Name: ashr_exact_sgt Pre: ((C2 << C1) >> C1) == C2 %sh = 
ashr exact i8 %x, C1 %r = icmp sgt i8 %sh, C2 => %r = icmp sgt i8 %x, (C2 << C1) Name: ashr_exact_slt Pre: ((C2 << C1) >> C1) == C2 %sh = ashr exact i8 %x, C1 %r = icmp slt i8 %sh, C2 => %r = icmp slt i8 %x, (C2 << C1) Name: lshr_exact_ugt Pre: ((C2 << C1) u>> C1) == C2 %sh = lshr exact i8 %x, C1 %r = icmp ugt i8 %sh, C2 => %r = icmp ugt i8 %x, (C2 << C1) Name: lshr_exact_ult Pre: ((C2 << C1) u>> C1) == C2 %sh = lshr exact i8 %x, C1 %r = icmp ult i8 %sh, C2 => %r = icmp ult i8 %x, (C2 << C1) We did something similar for 'shl' in D28406. Differential Revision: https://reviews.llvm.org/D38514 llvm-svn: 315021	2017-10-05 21:11:49 +00:00
Krzysztof Parzyszek	a114941fa8	[Hexagon] Make PS_fi and PS_fia extendable (they both expand to A2_addi) llvm-svn: 315019	2017-10-05 20:20:06 +00:00
Dehao Chen	16f01fb1db	Annotate VP prof on indirect call if it is ICPed in the profiled binary. Summary: In SamplePGO, when an indirect call is promoted in the profiled binary, before profile annotation, it will be promoted and inlined. For the original indirect call, the current implementation will not mark VP profile on it. This is an issue when profile becomes stale. This patch annotates VP prof on indirect calls during annotation. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D38477 llvm-svn: 315016	2017-10-05 20:15:29 +00:00
Francis Ricci	2b513b5c99	[llvm-dsymutil] Add support for __swift_ast MachO DWARF section Summary: Xcode's dsymutil emits a __swift_ast DWARF section, which is required for debugging, and which contains a byte-for-byte dump of the swiftmodule file. Add this feature to llvm-dsymutil. Tested with `gobjdump --dwarf=info -s`, by verifying that the contents of `__DWARF.__swift_ast` match between Xcode's dsymutil and llvm-dsymutil (Xcode's dwarfdump and llvm-dwarfdump don't currently recognize the __swift_ast section). Reviewers: aprantl, friss Subscribers: llvm-commits, JDevlieghere Differential Revision: https://reviews.llvm.org/D38504 llvm-svn: 315014	2017-10-05 20:03:01 +00:00
Krzysztof Parzyszek	7ae3ae9ef4	[Hexagon] Give uniform names to functions changing addressing modes, NFC The new format is changeAddrMode_xx_yy, where xx is the current mode, and yy is the new one. Old name: New name: getBaseWithImmOffset changeAddrMode_abs_io getAbsoluteForm changeAddrMode_io_abs getBaseWithRegOffset changeAddrMode_io_rr xformRegToImmOffset changeAddrMode_rr_io getBaseWithLongOffset changeAddrMode_rr_ur getRegShlForm changeAddrMode_ur_rr llvm-svn: 315013	2017-10-05 20:01:38 +00:00
Francis Ricci	5f689d0db3	Revert "[llvm-dsymutil] Add support for __swift_ast MachO DWARF section" This reverts commit r315004, because of a failing test on non-apple platforms llvm-svn: 315009	2017-10-05 19:47:13 +00:00
Francis Ricci	7767277639	[llvm-dsymutil] Add support for __swift_ast MachO DWARF section Summary: Xcode's dsymutil emits a __swift_ast DWARF section, which is required for debugging, and which contains a byte-for-byte dump of the swiftmodule file. Add this feature to llvm-dsymutil. Tested with `gobjdump --dwarf=info -s`, by verifying that the contents of `__DWARF.__swift_ast` match between Xcode's dsymutil and llvm-dsymutil (Xcode's dwarfdump and llvm-dwarfdump don't currently recognize the __swift_ast section). Reviewers: aprantl, friss Subscribers: llvm-commits, JDevlieghere Differential Revision: https://reviews.llvm.org/D38504 llvm-svn: 315004	2017-10-05 19:17:28 +00:00
Davide Italiano	e070721308	[NewPassManager] Run global dead code elimination after the inliner. This is the same exact change we did for the current pass manager in rL314997, but the new pass manager pipeline already happened to run GlobalOpt after the inliner, so we just insert a run of GDCE here. llvm-svn: 315003	2017-10-05 18:36:01 +00:00
Reid Kleckner	7344282c36	[X86] Simplify X86 epilogue frame size calculation, NFC Sink the insertion of "pop ebp" out of the frame size calculation branches. They all check for HasFP. Our handling of CLEANUPRET and CATCHRET was equivalent, both are funclets and use the same frame size. We can eliminate the CLEANUPRET case. Hoist the hasFP(MF) query into a local bool. Rename TargetMBB to CatchRetTarget to be more descriptive. Eliminate the Optional<unsigned> RetOpcode local, now that it has one use. It's only a net savings of 10 lines, but hopefully it's slightly more readable. llvm-svn: 315000	2017-10-05 18:27:08 +00:00
Davide Italiano	c8708e59e8	[PassManager] Improve the interaction between -O2 and ThinLTO. Run GDCE slightly later so that we don't have to repeat it twice when preparing for Thin. Thanks to Mehdi for the suggestion. llvm-svn: 314999	2017-10-05 18:23:25 +00:00
Davide Italiano	ff829cea8b	[PassManager] Run global optimizations after the inliner. The inliner performs some kind of dead code elimination as it goes, but there are cases that are not really caught by it. We might at some point consider teaching the inliner about them, but it is OK for now to run GlobalOpt + GlobalDCE in tandem as their benefits generally outweigh the cost, making the whole pipeline faster. This fixes PR34652. Differential Revision: https://reviews.llvm.org/D38154 llvm-svn: 314997	2017-10-05 18:06:37 +00:00
Matthew Simpson	49ee814996	[SparsePropagation] Move member definitions to header (NFC) AbstractLatticeFunction and SparseSolver are class templates parameterized by a lattice value, so we need to move these member functions over to the header. Differential Revision: https://reviews.llvm.org/D38561 llvm-svn: 314996	2017-10-05 18:03:30 +00:00
Petar Jovanovic	65f10246bb	[mips] implement .set dspr2 directive Implement .set dspr2 directive with appropriate feature bits. This directive is a counterpart of -mattr=dspr2 command line option with the exception that it does not influence elf header flags. Patch by Milos Stojanovic. Differential Revision: https://reviews.llvm.org/D38537 llvm-svn: 314994	2017-10-05 17:40:32 +00:00
Matt Arsenault	2d3f8f333d	AMDGPU: Set v2i32 any_extend to expand llvm-svn: 314993	2017-10-05 17:38:30 +00:00
Krzysztof Parzyszek	9f3e88ae64	[RDF] Simplify construction of maximal registers The old algorithm was not correct, although it worked most of the time. Avoid the complex reachability analysis and simply calculate the maximal registers out of the set of all referenced registers. llvm-svn: 314991	2017-10-05 17:12:49 +00:00
Rong Xu	289da65698	[ProfileData] Fix data racing in merging indexed profiles There is data racing to the static variable RecordIndex in index profile reader when merging in multiple threads. Make it a member variable in IndexedInstrProfReader to fix this. Differential Revision: https://reviews.llvm.org/D38431 llvm-svn: 314990	2017-10-05 17:05:20 +00:00
Artur Pilipenko	7b15254c8f	[X86] Fix chains update when lowering BUILD_VECTOR to a vector load The code which lowers BUILD_VECTOR of consecutive loads into a single vector load doesn't update chains properly. As a result the vector load can be reordered with the store to the same location. The current code in EltsFromConsecutiveLoads only updates the chain following the first load. The fix is to update the chains following all the loads comprising the vector. This is a fix for PR10114. Reviewed By: niravd Differential Revision: https://reviews.llvm.org/D38547 llvm-svn: 314988	2017-10-05 16:28:21 +00:00
Konstantin Zhuravlyov	aa0835a7ab	AMDGPU: Add and set AMDGPU-specific e_flags Differential Revision: https://reviews.llvm.org/D38556 llvm-svn: 314987	2017-10-05 16:19:18 +00:00
Ayal Zaks	c9e0f886e5	[LV] Fix PR34743 - handle casts that sink after interleaved loads When ignoring a load that participates in an interleaved group, make sure to move a cast that needs to sink after it. Testcase derived from reproducer of PR34743. Differential Revision: https://reviews.llvm.org/D38338 llvm-svn: 314986	2017-10-05 15:45:14 +00:00
Clement Courbet	922e5bc698	Revert "Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion.""" broken test on windows This reverts commit c91479518344fd1fc071c5bd5848f6eb83e53dca. llvm-svn: 314985	2017-10-05 14:42:06 +00:00
Sanjay Patel	f11b5b4f87	revert r314698 - [InstCombine] remove one-use restriction for icmp (shr exact X, C1), C2 --> icmp X, (C2<<C1) There is a bot failure that appears to be related to this change: http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost-neon/builds/2117 ...so reverting to confirm that and attempting to keep the bot green while investigating. llvm-svn: 314984	2017-10-05 14:26:15 +00:00
Ayal Zaks	fc3f7a4f0c	[LV] Fix PR34711 - widen instruction ranges when sinking casts Instead of trying to keep LastWidenRecipe updated after creating each recipe, have tryToWiden() retrieve the last recipe of the current VPBasicBlock and check if it's a VPWidenRecipe when attempting to extend its range. This ensures that such extensions, optimized to maintain the original instruction order, do so only when the instructions are to maintain their relative order. The latter does not always hold, e.g., when a cast needs to sink to unravel first order recurrence (r306884). Testcase derived from reproducer of PR34711. Differential Revision: https://reviews.llvm.org/D38339 llvm-svn: 314981	2017-10-05 12:41:49 +00:00
Clement Courbet	4cafbb9b5e	Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion."" llvm-svn: 314980	2017-10-05 12:39:57 +00:00
Simon Dardis	51a7ae2a29	[mips] Place certain 64 bit FPU instructions in their own decoder namespace Previously, instructions that were defined to use the FGR64 register class were associated with the Mips64 table which was incorrect. Reviewers: nitesh.jain, atanasyan Differential Revision: https://reviews.llvm.org/D38454 llvm-svn: 314976	2017-10-05 10:27:37 +00:00
Karl-Johan Karlsson	8d8d201c17	[DebugInfo] Insert DEBUG_VALUEs after each register redefinition Summary: When reinserting debug values after register allocation, make sure to insert debug values after each redefinition of the debug value register in the slot index range. The reason for this is that DwarfDebug will end the range of a debug variable when the physical reg is defined. For instructions with e.g. tied operands this results in a prematurely ended debug range. This resolves pr34545 Patch by Karl-Johan Karlsson and Bjorn Pettersson Reviewers: rnk, aprantl Reviewed By: rnk Subscribers: bjope, llvm-commits Differential Revision: https://reviews.llvm.org/D38229 llvm-svn: 314974	2017-10-05 08:37:31 +00:00
George Rimar	b074fbcb48	[MC] - llvm-mc hangs on non-english characters. Currently llvm-mc just hangs inside an infinite loop while trying to parse a file which has ".section .с" inside, where the section name is a non-english character. Patch fixes the issue. In this patch I also moved content of non-english-characters.s to test/MC/AsmParser/Inputs folder so that non-english-characters.s becomes a single testcase for all invalid inputs containing non-english symbols. That is convenient because llvm-mc otherwise tries to parse and tokenize the whole testcase file with tools invocations and it is harder to isolate the issue. Differential revision: https://reviews.llvm.org/D38545 llvm-svn: 314973	2017-10-05 08:15:55 +00:00
Clement Courbet	6603fc0e7b	Revert "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion." Breaks clang-stage1-cmake-RA-incremental/llvm/test/Transforms/MergeICmps/X86/tuple-four-int8.ll This reverts commit 3038c459d67f8898ffa295d54a013b280690abfa. llvm-svn: 314972	2017-10-05 08:03:39 +00:00
Craig Topper	17b0c78447	[InstCombine] Fix a vector splat handling bug in transformZExtICmp. We were using an i1 type and then zero extending to a vector. Instead just create the 0/1 directly as a ConstantInt with the correct type. No need to ask ConstantExpr to zero extend for us. This bug is a bit tricky to hit because it requires us to visit a zext of an icmp that would normally be simplified to true/false, but that icmp hasn't been visited yet. In the test case this zext and icmp were created by visiting a udiv and due to worklist ordering we got to the zext first. Fixes PR34841. llvm-svn: 314971	2017-10-05 07:59:11 +00:00
Clement Courbet	902eef32eb	[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion. Summary: This is to avoid e.g. merging two cheap icmps if the target is not going to expand to something nice later. Reviewers: dberlin, spatel Subscribers: davide, nemanjai Differential Revision: https://reviews.llvm.org/D38232 llvm-svn: 314970	2017-10-05 07:49:09 +00:00
Mikael Holmen	0ec1d25d33	Minor refactoring regarding Cast::isNoopCast(), NFC Summary: FastISel::hasTrivialKill() was the only user of the "IntPtrTy" version of Cast::isNoopCast(). According to review comments in D37894 we could instead use the "DataLayout" version of the method, and thus get rid of the "IntPtrTy" versions of isNoopCast() completely. With the above done, the remaining isNoopCast() could then be simplified a bit more. Reviewers: arsenm Reviewed By: arsenm Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D38497 llvm-svn: 314969	2017-10-05 07:07:09 +00:00
Dean Michael Berris	0a465d7a01	[XRay][tools] Support arg1 logging entries in the basic logging mode Summary: The arg1 logging handler changed in compiler-rt to start writing a different type for entries encountered when logging the first argument of XRay-instrumented functions. This change allows the trace loader to support reading these record types as well as prepare for when the basic (naive) mode implementation starts writing down the argument payloads. Without this change, binaries with arg1 logging support enabled start writing unreadable logs for any of the XRay tracing tools. Reviewers: pelikan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38550 llvm-svn: 314967	2017-10-05 05:18:17 +00:00
Xinliang David Li	04ab11a08a	Revert r314928 to investigate thinLTO bootstrap failure llvm-svn: 314961	2017-10-05 01:40:13 +00:00
Eugene Zelenko	60433b682f	[X86] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 314953	2017-10-05 00:33:50 +00:00
Matt Arsenault	f48e5c9ce5	AMDGPU: Add comment about clamps llvm-svn: 314952	2017-10-05 00:13:20 +00:00
Matt Arsenault	aafff87dda	AMDGPU: Do not fold clamp instructions when sources are different Patch by hakzsam (Samuel Pitoiset) llvm-svn: 314951	2017-10-05 00:13:17 +00:00
Craig Topper	7a93092399	[InstCombine] Improve support for ashr in foldICmpAndShift We can support ashr similar to lshr, if we know that none of the shifted in bits are used. In that case SimplifyDemandedBits would normally convert it to lshr. But that conversion doesn't happen if the shift has additional users. Differential Revision: https://reviews.llvm.org/D38521 llvm-svn: 314945	2017-10-04 23:06:13 +00:00
Matt Arsenault	9ab1fa6803	AMDGPU: Fix not accounting for instruction size in bundles These were counted as 0. Fixes branch limit exceeded errors in some large programs. llvm-svn: 314944	2017-10-04 22:59:12 +00:00
Konstantin Zhuravlyov	8684f7b4f9	AMDGPU: Correctly set EI_OSABI based on the os Differential Revision: https://reviews.llvm.org/D38555 llvm-svn: 314943	2017-10-04 22:44:13 +00:00
Adrian Prantl	b4a67907b7	clang-format file. llvm-svn: 314942	2017-10-04 22:26:19 +00:00
Adrian Prantl	617a007b7c	delete commented out code. llvm-svn: 314941	2017-10-04 22:26:19 +00:00
Sanjoy Das	005b88c0a6	Do not call Loop::getName on possibly dead loops This fixes PR34832. llvm-svn: 314938	2017-10-04 22:02:27 +00:00
Xin Tong	d8d97972de	[MachineBlockPlacement] Make sure PreferredLoopExit is cleared every time a new loop is processed Summary: Rotate on exit that actually exits the current loop. Reviewers: davidxl, danielcdh, iteratee, chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38563 llvm-svn: 314937	2017-10-04 21:39:25 +00:00
Hans Wennborg	899809d531	Fix a -Wparentheses warning. NFC. llvm-svn: 314936	2017-10-04 21:14:07 +00:00
Marcello Maggioni	df3e71e037	[LoopDeletion] Move deleteDeadLoop to LoopUtils. NFC llvm-svn: 314934	2017-10-04 20:42:46 +00:00
Rafael Espindola	8c0ff9508d	Bring r314809 back. But now include a check for CPU_COUNT so we still build on 10 year old versions of glibc. Original message: Use sched_getaffinity instead of std::thread::hardware_concurrency. The issue with std::thread::hardware_concurrency is that it forwards to libc and some implementations (like glibc) don't take thread affinity into consideration. With this change a llvm program that can execute in only 2 cores will use 2 threads, even if the machine has 32 cores. This makes benchmarking a lot easier, but should also help if someone doesn't want to use all cores for compilation for example. llvm-svn: 314931	2017-10-04 20:27:01 +00:00
Sanjay Patel	4c33d5213b	[SimplifyCFG] put the optional assumption cache pointer in the options struct; NFCI This is a follow-up to https://reviews.llvm.org/D38138. I fixed the capitalization of some functions because we're changing those lines anyway and that helped verify that we weren't accidentally dropping any options by using default param values. llvm-svn: 314930	2017-10-04 20:26:25 +00:00
Xinliang David Li	7a73757358	Recommit r314561 after fixing msan build failure (trial 2) Incoming val defined by terminator instruction which also requires bitcasts can not be handled. llvm-svn: 314928	2017-10-04 20:17:55 +00:00
Jun Bum Lim	d40e03c2d8	Recommit : Use the basic cost if a GEP is not used as addressing mode Recommitting r314517 with the fix for handling ConstantExpr. Original commit message: Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing mode in the target. However, since it doesn't check its actual users, it will return FREE even in cases where the GEP cannot be folded away as a part of actual addressing mode. For example, if an user of the GEP is a call instruction taking the GEP as a parameter, then the GEP may not be folded in isel. llvm-svn: 314923	2017-10-04 18:33:52 +00:00
Daniel Neilson	bef94bcbae	Revert D38481 due to missing cmake check for CPU_COUNT Summary: This reverts D38481. The change breaks systems with older versions of glibc. It injects a use of CPU_COUNT() from sched.h without checking to ensure that the function exists first. Reviewers: Subscribers: llvm-svn: 314922	2017-10-04 18:19:03 +00:00
Simon Pilgrim	9edbe110e8	[X86][AVX] Improve (i8 bitcast (v8i1 x)) handling for v8i64/v8f64 512-bit vector compare results. AVX1/AVX2 targets were missing a chance to use vmovmskps for v8f32/v8i32 results for bool vector bitcasts llvm-svn: 314921	2017-10-04 18:00:42 +00:00
Krzysztof Parzyszek	4697ddeea4	[Hexagon] Add a member Subtarget to HexagonInstrInfo, NFC llvm-svn: 314920	2017-10-04 18:00:15 +00:00
Hans Wennborg	2a6c9adb2f	Revert r314886 "[X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)" It broke the Chromium / SQLite build; see PR34830. > Summary: > 1/ Operand folding during complex pattern matching for LEAs has been > extended, such that it promotes Scale to accommodate similar operand > appearing in the DAG. > e.g. > T1 = A + B > T2 = T1 + 10 > T3 = T2 + A > For above DAG rooted at T3, X86AddressMode will now look like > Base = B , Index = A , Scale = 2 , Disp = 10 > > 2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs > so that if there is an opportunity then complex LEAs (having 3 operands) > could be factored out. > e.g. > leal 1(%rax,%rcx,1), %rdx > leal 1(%rax,%rcx,2), %rcx > will be factored as following > leal 1(%rax,%rcx,1), %rdx > leal (%rdx,%rcx) , %edx > > 3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, > thus avoiding creation of any complex LEAs within a loop. > > Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy > > Reviewed By: lsaba > > Subscribers: jmolloy, spatel, igorb, llvm-commits > > Differential Revision: https://reviews.llvm.org/D35014 llvm-svn: 314919	2017-10-04 17:54:06 +00:00
Simon Pilgrim	b47b3f2564	[X86][SSE] Add support for lowering v8i16 binary shuffles to PACKSS/PACKUS Missed in D38472 llvm-svn: 314916	2017-10-04 17:31:28 +00:00
Craig Topper	6fb55716e9	[X86] Redefine MOVSS/MOVSD instructions to take VR128 regclass as input instead of FR32/FR64 This patch redefines the MOVSS/MOVSD instructions to take VR128 as its second input. This allows the MOVSS/SD->BLEND commute to work without requiring a COPY to be inserted. This should fix PR33079 Overall this looks to be an improvement in the generated code. I haven't checked the EXPENSIVE_CHECKS build but I'll do that and update with results. Differential Revision: https://reviews.llvm.org/D38449 llvm-svn: 314914	2017-10-04 17:20:12 +00:00
Yonghong Song	09b01b3555	bpf: fix an insn encoding issue for neg insn Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 314911	2017-10-04 16:11:52 +00:00
Adam Nemet	6c381b7a2e	[OptRemark] Move YAML writing to IR Before the patch this was in Analysis. Moving it to IR and making it implicit part of LLVMContext::diagnose allows the full opt-remark facility to be used outside passes e.g. the pass manager. Jessica is planning to use this to report function size after each pass. The same could be used for time reports. Tested with BUILD_SHARED_LIBS=On. llvm-svn: 314909	2017-10-04 15:18:11 +00:00
Adam Nemet	f1bea0a6ab	Also update MachineORE after r314874. llvm-svn: 314908	2017-10-04 15:18:07 +00:00
Clement Courbet	98eaa88357	[NFC] clang-format lib/Transforms/Scalar/MergeICmps.cpp llvm-svn: 314906	2017-10-04 15:13:52 +00:00
Simon Pilgrim	46a366ccb7	[X86][SSE] Early out from ComputeNumSignBitsForTargetNode. NFCI. Early out from vector shift by immediates that will exceed eltsize - don't bother making an unnecessary ComputeNumSignBits recursive call. llvm-svn: 314903	2017-10-04 13:41:26 +00:00
Simon Pilgrim	bd5d2f0284	[X86][SSE] Add support for lowering unary shuffles to PACKSS/PACKUS Extension to D38472 llvm-svn: 314901	2017-10-04 13:12:08 +00:00
Dylan McKay	8dd702c1cd	[AVR] Implement LPMWRdZ pseudo-instruction's expansion. FIXME: implementation is mostly copy-pasted from LDWRdPtr, so we should refactor a bit and unify the two Patch by Gergo Erdi. llvm-svn: 314898	2017-10-04 10:37:22 +00:00
Dylan McKay	3f71f1c91e	[AVR] Factor out mayLoad in tablegen patterns Patch by Gergo Erdi. llvm-svn: 314897	2017-10-04 10:36:07 +00:00
Dylan McKay	d00f9c1ef1	[AVR] Elaborate LDWRdPtr into `ld r, X++; ld r+1, X` Patch by Gergo Erdi. llvm-svn: 314896	2017-10-04 10:33:36 +00:00
Dylan McKay	39069208d5	[AVR] Insert JMP for long branches Previously, on long branches (relative jumps of >4 kB), an assertion failure was hit, as AVRInstrInfo::insertIndirectBranch was not implemented. Despite its name, it is called by the branch relaxator for all unconditional jumps. Patch by Thomas Backman. llvm-svn: 314891	2017-10-04 09:51:28 +00:00
Dylan McKay	c4b002bf5a	[AVR] Fix displacement overflow for LDDW/STDW In some cases, the code generator attempts to generate instructions such as: lddw r24, Y+63 which expands to: ldd r24, Y+63 ldd r25, Y+64 # Oops! This is actually ld r25, Y in the binary This commit limits the first offset to 62, and thus the second to 63. It also updates some asserts in AVRExpandPseudoInsts.cpp, including for INW and OUTW, which appear to be unused. Patch by Thomas Backman. llvm-svn: 314890	2017-10-04 09:51:21 +00:00
Oliver Stannard	878216dd05	[ARM] Add diag string for movw/movt immediates in assembly This adds diagnostics for invalid immediate operands to the MOVW and MOVT instructions (ARM and Thumb). Differential revision: https://reviews.llvm.org/D31879 llvm-svn: 314888	2017-10-04 09:24:54 +00:00
Oliver Stannard	5a7aae3a80	[ARM, Asm] Change grammar of immediate operand diagnostics Currently, our diagnostics for assembly operands are not consistent. Some start with (for example) "immediate operand must be ...", and some with "operand must be an immediate ...". I think the latter form is preferable for a few reasons: * It's unambiguous that it is referring to the expected type of operand, not the type the user provided. For example, the user could provide a register operand, and get a message talking about an operand as if it is already an immediate, just not in the accepted range. * It allows us to have a consistent style once we add diagnostics for operands that could take two forms, for example a label or pc-relative memory operand. Differential revision: https://reviews.llvm.org/D36689 llvm-svn: 314887	2017-10-04 09:18:07 +00:00
Jatin Bhateja	3c29bacd43	[X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.) Summary: 1/ Operand folding during complex pattern matching for LEAs has been extended, such that it promotes Scale to accommodate similar operand appearing in the DAG. e.g. T1 = A + B T2 = T1 + 10 T3 = T2 + A For above DAG rooted at T3, X86AddressMode will now look like Base = B , Index = A , Scale = 2 , Disp = 10 2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs so that if there is an opportunity then complex LEAs (having 3 operands) could be factored out. e.g. leal 1(%rax,%rcx,1), %rdx leal 1(%rax,%rcx,2), %rcx will be factored as following leal 1(%rax,%rcx,1), %rdx leal (%rdx,%rcx) , %edx 3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, thus avoiding creation of any complex LEAs within a loop. Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy Reviewed By: lsaba Subscribers: jmolloy, spatel, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D35014 llvm-svn: 314886	2017-10-04 09:02:10 +00:00
George Rimar	099960d322	[MC] - Don't assert when non-english characters are used. I found that llvm-mc does not like non-english characters even in comments, which it tries to tokenize. Problem happens because of functions like isdigit(), isalnum() which take an int argument and expect it to be non-negative. But at the same time MCParser uses char* to store the input buffer pointer, char has a signed value, so it is possible to pass a negative value to one of the functions from above and that triggers an assert. Testcase for demonstration is provided. To fix the issue helper functions were introduced in StringExtras.h Differential revision: https://reviews.llvm.org/D38461 llvm-svn: 314883	2017-10-04 08:50:08 +00:00
Mikael Holmen	a1a3f5c5e6	Recommit [UnreachableBlockElim] Use COPY if PHI input is undef This time invoking llc with "-march=x86-64" in the testcase, so we don't assume the default target is x86. Summary: If we have %vreg0<def> = PHI %vreg2<undef>, <BB#0>, %vreg3, <BB#2>; GR32:%vreg0,%vreg2,%vreg3 %vreg3<def,tied1> = ADD32ri8 %vreg0<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg3,%vreg0 then we can't just change %vreg0 into %vreg3, since %vreg2 is actually undef. We would have to also copy the undef flag to be able to change the register. Instead we deal with this case like other cases where we can't just replace the register: we insert a COPY. The code creating the COPY already copied all flags from the PHI input, so the undef flag will be transferred as it should. Reviewers: kparzysz Reviewed By: kparzysz Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38235 llvm-svn: 314882	2017-10-04 07:42:45 +00:00
Max Kazantsev	8aacef6cae	[IRCE] Temporarily disable unsigned latch conditions by default We have found some corner cases connected to range intersection where IRCE makes a bad thing when the latch condition is unsigned. The fix for that will go as a follow up. This patch temporarily disables IRCE for unsigned latch conditions until the issue is fixed. The unsigned latch conditions were introduced to IRCE by rL310027. Differential Revision: https://reviews.llvm.org/D38529 llvm-svn: 314881	2017-10-04 06:53:22 +00:00
Mikael Holmen	75b1992f78	Revert r314879 "[UnreachableBlockElim] Use COPY if PHI input is undef" Build-bots broke on the new testcase. I'll investigate and fix. llvm-svn: 314880	2017-10-04 06:39:22 +00:00
Mikael Holmen	65eb2f394c	[UnreachableBlockElim] Use COPY if PHI input is undef Summary: If we have %vreg0<def> = PHI %vreg2<undef>, <BB#0>, %vreg3, <BB#2>; GR32:%vreg0,%vreg2,%vreg3 %vreg3<def,tied1> = ADD32ri8 %vreg0<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg3,%vreg0 then we can't just change %vreg0 into %vreg3, since %vreg2 is actually undef. We would have to also copy the undef flag to be able to change the register. Instead we deal with this case like other cases where we can't just replace the register: we insert a COPY. The code creating the COPY already copied all flags from the PHI input, so the undef flag will be transferred as it should. Reviewers: kparzysz Reviewed By: kparzysz Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38235 llvm-svn: 314879	2017-10-04 06:06:31 +00:00
Martin Storsjo	e14145dcb0	[X86] Fix using the SJLJ jump table on x86_64 The previous version didn't work if the jump table base address didn't fit in 32 bit, since it was encoded as an immediate offset. And in case the jump table is encoded as 32 bit label differences, we need to load and add them to the table base first. This solves the first half of the issues mentioned in PR34720. Also fix some of the errors pointed out by -verify-machineinstrs, by using GR32_NOSPRegClass. Differential Revision: https://reviews.llvm.org/D38333 llvm-svn: 314876	2017-10-04 05:12:10 +00:00
Adam Nemet	f31b1f310c	Move verbosity check for remarks to the diag handler Test needs some slight adjustment because we no longer check the existence of BFI but rather that the actual hotness is set on the remark. If entry_count is not set getBlockProfileCount returns None. llvm-svn: 314874	2017-10-04 04:26:23 +00:00
Tim Shen	83fd6a1243	[FuzzerUtil] Partially revert D38481 on FuzzerUtil This is because lib/Fuzzer doesn't really depend on llvm infrastructure. It's not easy to access the llvm hardware_concurrency here. Differential Revision: https://reviews.llvm.org/D38481 llvm-svn: 314870	2017-10-04 01:05:34 +00:00
Rui Ueyama	15b8327963	Simplify multikey_qsort function. This function implements the three-way radix quicksort algorithm. This patch simplifies the implementation by using MutableArrayRef. llvm-svn: 314858	2017-10-03 23:12:01 +00:00
Balaram Makam	e0c43152b5	[AArch64] Use LateSimplifyCFG after expanding atomic operations. Summary: After r308422 we defer optimizations that can destroy loop canonical forms to LateSimplifyCFG. Running LateSimplifyCFG after expanding atomic operations can exploit more control-flow opportunities. Reviewers: mcrosier, t.p.northover, efriedma Reviewed By: efriedma Subscribers: aemerson, rengolin, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D38262 llvm-svn: 314857	2017-10-03 22:39:24 +00:00
Konstantin Zhuravlyov	22bc039c89	AMDGPU: Expand setcc for v2f32 and v4f32 llvm-svn: 314853	2017-10-03 21:45:01 +00:00
Konstantin Zhuravlyov	908fa90b51	AMDGPU: Expand setcc for v2i32 and v4i32 llvm-svn: 314852	2017-10-03 21:31:24 +00:00
Konstantin Zhuravlyov	0aa94d314c	AMDGPU: Add ELFOSABI_AMDGPU_MESA3D Differential Revision: https://reviews.llvm.org/D38387 llvm-svn: 314846	2017-10-03 21:14:14 +00:00
Reid Kleckner	33cbbbc62f	[X86] Remove dead declaration convertArgMovsToPushes, NFC This was dead when it landed in r252578. We have this functionality, if not for stack probe calls, but for regular calls in X86CallFrameOptimization.cpp. llvm-svn: 314845	2017-10-03 21:12:18 +00:00
Rafael Espindola	476a7f9293	Pre-compute the tail of the archive An archive looks like <header> <symbol table> <tail> The symbol table refers to offsets in the tail. A complication is that we would like to support symbol tables that use 64 bit offsets if it turns out that any of the offsets is too big. This patch changes the archive writer to first compute the tail. We cannot just compute one big StringRef since that would require reading every member upfront, but we can represent it as a series of StringRefs. Having done that it is much easier to compute the symbol table and all offsets are computed before it is written. With this if there is an accounting problem it will show up with a regular symbol table, not just when a 64 bit one is needed. llvm-svn: 314844	2017-10-03 20:59:43 +00:00
Konstantin Zhuravlyov	a952b44ed5	AMDGPU: Add ELFOSABI_AMDGPU_PAL llvm-svn: 314843	2017-10-03 20:54:07 +00:00
Reid Kleckner	bc66947433	Refactor DIBuilder dbg intrinsic insertion, NFC Both dbg.declare and dbg.value insertion had duplicate code for the two overloads with different insertion point conventions. llvm-svn: 314839	2017-10-03 20:36:40 +00:00
Jessica Paquette	acc15e1265	[MachineOutliner] Fix off-by-one in cost model This commit does two things. Firstly, it cleans up some of the benefit calculation wrt outlined functions and candidates. Secondly, it fixes an off-by-one bug in the cost model which was caused by the benefit value of an OutlinedFunction and Candidate differing by 1. It updates the remarks test to reflect this change. llvm-svn: 314836	2017-10-03 20:32:55 +00:00
Stefan Pintilie	e1d7547237	[PowerPC] Revert P9 scheduling model to incomplete Partially revert a previous change from commit: https://llvm.org/svn/llvm-project/llvm/trunk@314026 The previous change caused regressions on Power 9. llvm-svn: 314835	2017-10-03 20:27:30 +00:00
Craig Topper	df63b96811	[InstCombine] Use isSignBitCheck to simplify an if statement. Directly create new sign bit compares instead of manipulating the constant. NFCI Since we no longer had the direct constant compares, manipulating the constant seemed less clear. llvm-svn: 314830	2017-10-03 19:14:23 +00:00
Tim Renouf	72800f0436	[AMDGPU] implemented pal metadata Summary: For the amdpal OS type: We write an AMDGPU_PAL_METADATA record in the .note section in the ELF (or as an assembler directive). It contains key=value pairs of 32 bit ints. It is a merge of metadata from codegen of the shaders, and metadata provided by the frontend as _amdgpu_pal_metadata IR metadata. Where both sources have a key=value with the same key, the two values are ORed together. This .note record is part of the amdpal ABI and will be documented in docs/AMDGPUUsage.rst in a future commit. Eventually the amdpal OS type will stop generating the .AMDGPU.config section once the frontend has safely moved over to using the .note records above instead of .AMDGPU.config. Reviewers: arsenm, nhaehnle, dstuttard Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D37753 llvm-svn: 314829	2017-10-03 19:03:52 +00:00
Alexander Timofeev	4651396584	[AMDGPU] Avoid predicated execution of the basic blocks containing scalar instructions. Differential revision: https://reviews.llvm.org/D38293 llvm-svn: 314828	2017-10-03 18:55:36 +00:00
Hans Wennborg	dc8d6f2527	Fix -Wcovered-switch-default warnings from r314821 llvm-svn: 314826	2017-10-03 18:44:12 +00:00
Hans Wennborg	ab2177edf7	Revert r314817 "[dwarfdump] Add -lookup option" The test fails on Linux; see follow-up email on the llvm-commits list. > Add the option to lookup an address in the debug information and print > out the file, function, block and line table details. > > Differential revision: https://reviews.llvm.org/D38409 This also reverts the follow-up r314818: > [test] Fix llvm-dwarfdump/cmdline.test > > Fixes test/tools/llvm-dwarfdump/cmdline.test llvm-svn: 314825	2017-10-03 18:39:13 +00:00
Hans Wennborg	9a9048e19f	Revert r314806 "[SLP] Vectorize jumbled memory loads." All the buildbots are red, e.g. http://lab.llvm.org:8011/builders/clang-cmake-aarch64-lld/builds/2436/ > Summary: > This patch tries to vectorize loads of consecutive memory accesses, accessed > in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 > which was reverted back due to some basic issue with representing the 'use mask' of > jumbled accesses. > > This patch fixes the mask representation by recording the 'use mask' in the usertree entry. > > Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df > > Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh > > Reviewed By: Ayal > > Subscribers: hans, mzolotukhin > > Differential Revision: https://reviews.llvm.org/D36130 llvm-svn: 314824	2017-10-03 18:32:29 +00:00
Reid Kleckner	b4569de739	Implement David Blaikie's suggestion for comparison operators llvm-svn: 314822	2017-10-03 18:30:11 +00:00
Hans Wennborg	660531085a	CodeView: Provide a .def file with the register ids The list of register ids was previously written out in a couple of different places. This puts it in a .def file and also adds a few more registers (e.g. the x87 regs) which should lead to more readable dumps, but I didn't include the whole list since that seems unnecessary. X86_MC::initLLVMToSEHAndCVRegMapping is pretty ugly, but at least it's not relying on magic constants anymore. The TODO of using tablegen still stands. Differential revision: https://reviews.llvm.org/D38480 llvm-svn: 314821	2017-10-03 18:27:22 +00:00
Reid Kleckner	04e25e00b7	[DebugInfo] Correctly coalesce DBG_VALUEs that mix direct and indirect values Summary: This should fix a regression introduced by r313786, which switched from MachineInstr::isIndirectDebugValue() to checking if operand 1 is an immediate. I didn't have a test case for it until now. A single UserValue, which approximates a user variable, may have many DBG_VALUE instructions that disagree about whether the variable is in memory or in a virtual register. This will become much more common once we have llvm.dbg.addr, but you can construct such a test case manually today with llvm.dbg.value. Before this change, we would get two UserValues: one for direct and one for indirect DBG_VALUE instructions describing the same variable. If we build separate interval maps for direct and indirect locations, we will end up accidentally coalescing identical DBG_VALUE intervals that need to remain separate because they are broken up by intervals of the opposite direct-ness. Reviewers: aprantl Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D37932 llvm-svn: 314819	2017-10-03 17:59:02 +00:00
Jonas Devlieghere	f998c501b6	[dwarfdump] Add -lookup option Add the option to lookup an address in the debug information and print out the file, function, block and line table details. Differential revision: https://reviews.llvm.org/D38409 llvm-svn: 314817	2017-10-03 17:10:21 +00:00
Geoff Berry	fabedbad11	Revert "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"" This reverts commit r314729. Another bug has been encountered in an out-of-tree target reported by Quentin. llvm-svn: 314814	2017-10-03 16:59:13 +00:00
Rafael Espindola	6e182fbab4	Use sched_getaffinity instead of std::thread::hardware_concurrency. The issue with std::thread::hardware_concurrency is that it forwards to libc and some implementations (like glibc) don't take thread affinity into consideration. With this change a llvm program that can execute in only 2 cores will use 2 threads, even if the machine has 32 cores. This makes benchmarking a lot easier, but should also help if someone doesn't want to use all cores for compilation for example. llvm-svn: 314809	2017-10-03 16:25:15 +00:00
Dehao Chen	ea523ddb1b	Revert the change that accidentally went in r314806. llvm-svn: 314807	2017-10-03 15:50:42 +00:00
Mohammad Shahid	1d5422f27f	[SLP] Vectorize jumbled memory loads. Summary: This patch tries to vectorize loads of consecutive memory accesses, accessed in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 which was reverted back due to some basic issue with representing the 'use mask' of jumbled accesses. This patch fixes the mask representation by recording the 'use mask' in the usertree entry. Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh Reviewed By: Ayal Subscribers: hans, mzolotukhin Differential Revision: https://reviews.llvm.org/D36130 llvm-svn: 314806	2017-10-03 15:28:48 +00:00
Oliver Stannard	0d5c792223	[ARM] Use table-gen'd assembly operand diags in ARM asm parser This switches the ARM AsmParser to use assembly operand diagnostics from tablegen, rather than a switch statement on the ARMMatchResultTy. It moves the existing diagnostic strings to tablegen, but adds no new ones, so this is NFC except for one diagnostic string that had an off-by-1 error in the hand-written switch statement. Differential revision: https://reviews.llvm.org/D31607 llvm-svn: 314804	2017-10-03 14:38:52 +00:00
Oliver Stannard	55114fd9f0	[ARM, Asm] Use correct source location for register tokens tryParseRegister advances the lexer, so we need to take copies of the start and end locations of the register operand before calling it. Previously, the caret in the diagnostic pointed to the comma after the r0 operand in the test, rather than the start of the operand. Differential revision: https://reviews.llvm.org/D31537 llvm-svn: 314799	2017-10-03 14:30:58 +00:00
Simon Dardis	055192ccd3	[mips] Enable spilling and reloading of the dsp register set. The dsp register class is an alias of the gpr register class, so we have to define instructions for spilling and reloading. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D38038 llvm-svn: 314798	2017-10-03 13:45:49 +00:00
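
Note on r314809 (sched_getaffinity vs. hardware_concurrency): the difference that commit describes can be sketched outside of LLVM. The snippet below is only an illustration in Python, not the code from the commit. The function name usable_cpu_count is hypothetical, and the fallback to os.cpu_count() on platforms without os.sched_getaffinity is an assumption made for this sketch.

```python
import os


def usable_cpu_count() -> int:
    """Return how many CPUs the calling process is allowed to run on.

    Hypothetical helper for illustration; not part of LLVM.
    """
    # os.sched_getaffinity exists only on some platforms (e.g. Linux).
    # It returns the set of CPUs the process is restricted to, so a
    # process pinned to 2 cores of a 32-core machine yields 2.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))  # 0 = the calling process
    # Assumed fallback: total CPUs in the machine, ignoring affinity.
    # os.cpu_count() may return None, so default to 1.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(usable_cpu_count())
```

On Linux, running this under something like `taskset -c 0,1` should report the restricted CPU count rather than the machine total, which is the behaviour the commit message asks for when sizing a thread pool.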

107304 Commits