llvm-project

Commit Graph

Author	SHA1	Message	Date
Tom Stellard	8ebad11ee9	AMDGPU/SI: Use InstAlias instead of MnemonicAlias for VOPC instructions Summary: With InstAlias, we don't need to print the _e32 portion of the mnemonic when we print the $dst operand. This change makes it possible to include vcc in the asm string when we switch VOPC over to having implicit vcc defs. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11813 llvm-svn: 244362	2015-08-07 22:00:56 +00:00
Sanjay Patel	8d768b5b3a	redo r244360 (tighten checks...) after specifying triple llvm-svn: 244361	2015-08-07 21:42:24 +00:00
Sanjay Patel	0e1d731631	tighten checks using update_llc_test_checks.py llvm-svn: 244360	2015-08-07 21:38:53 +00:00
Alex Lorenz	61420f790d	MIR Serialization: Serialize the base alignment for the machine memory operands. llvm-svn: 244357	2015-08-07 20:48:30 +00:00
Alex Lorenz	83127739ff	MIR Serialization: Serialize the offsets for the machine memory operands. llvm-svn: 244356	2015-08-07 20:26:52 +00:00
Matt Arsenault	711b390a7c	AMDGPU: Assume SMRD access for constant address space Since r243294 these are selected to SMRD and moved later if required. llvm-svn: 244354	2015-08-07 20:18:34 +00:00
Chen Li	eafbc9dc47	[ConstantFoldTerminator] Preserve make.implicit metadata when converting SwitchInst to BranchInst Summary: llvm::ConstantFoldTerminator function can convert SwitchInst with single case (and default) to a conditional BranchInst. This patch adds support to preserve make.implicit metadata on this conversion. Reviewers: sanjoy, weimingz, chenli Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D11841 llvm-svn: 244348	2015-08-07 19:30:12 +00:00
Simon Pilgrim	3815c16bf8	[InstCombine] Fix SSE2/AVX2 vector logical shift by constant This patch fixes the sse2/avx2 vector shift by constant instcombine call to correctly deal with the fact that the shift amount is formed from the entire lower 64-bit and not just the lowest element as it currently assumes. e.g. %1 = tail call <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32> %v, <4 x i32> <i32 15, i32 15, i32 15, i32 15>) In this case, (V)PSRLD doesn't perform a lshr by 15 but in fact attempts to shift by 64424509455 ((15 << 32) \| 15) - giving a zero result. In addition, this review also recognizes shift-by-zero from a ConstantAggregateZero type (PR23821). Differential Revision: http://reviews.llvm.org/D11760 llvm-svn: 244341	2015-08-07 18:22:50 +00:00
Tom Stellard	c8733e805e	AMDGPU/SI: Use correct encoding of vopc for VI in the assembler Summary: We were using the SI encoding for VI. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11812 llvm-svn: 244332	2015-08-07 16:45:33 +00:00
Tom Stellard	d37631a8ac	AMDGPU/SI: Add VI checks to vop3 assembler tests Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11811 llvm-svn: 244331	2015-08-07 16:45:30 +00:00
Rafael Espindola	f7eb882176	add missing tests files llvm-svn: 244323	2015-08-07 15:35:49 +00:00
Rafael Espindola	e01f43bcc1	Add dynamic_table iterators back to ELF.h. In tree they are only used by llvm-readobj, but it is also used by https://github.com/mono/CppSharp. While at it, add some missing error checking. llvm-svn: 244320	2015-08-07 15:25:20 +00:00
Frederic Riss	a5e1453ac3	[dsymutil] Use the new MCDwarfLineTableParams customization to emit linetables llvm-dsymutil has to be able to process debug info produced by other compilers which use different line table settings. The testcase wasn't generated by another compiler, but by a modified clang. llvm-svn: 244319	2015-08-07 15:14:13 +00:00
Silviu Baranga	3e8e51c1a9	[ARM] Update ReconstructShuffle to handle mismatched types Summary: Port the ReconstructShuffle function from AArch64 to ARM to handle mismatched incoming types in the BUILD_VECTOR node. This fixes an outstanding FIXME in the ReconstructShuffle code. Reviewers: t.p.northover, rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D11720 llvm-svn: 244314	2015-08-07 11:40:46 +00:00
John Brawn	64e5a66794	Revert "Make global aliases have symbol size equal to their type" This reverts r242520, as it caused pr24379. Also removes part of the test added by r243874 that checks the size of alias symbols. llvm-svn: 244313	2015-08-07 10:56:21 +00:00
NAKAMURA Takumi	b918a404c3	Tweak llvm/test/tools/dsymutil/arch-option.test to avoid globbing on mingw-w64. llvm-svn: 244311	2015-08-07 08:38:22 +00:00
JF Bastien	315cc06840	WebAssembly: textual emission uses expected opcode names Summary: WebAssembly's tablegen instructions have the names WebAssembly expects, but by LLVM convention they're uppercase and suffixed with their type after an underscore. Leave the C++ code that way, but print outt he names WebAssembly expects (lowercase, no type). We could teach tablegen to do this later, maybe by using `!cast<string>(node)` in the .td files. Reviewers: sunfish Subscribers: jfb, llvm-commits Differential Revision: http://reviews.llvm.org/D11776 llvm-svn: 244305	2015-08-07 01:57:03 +00:00
Tom Stellard	f594fcad73	ELF: Add AMDGPU specific defintions Reviewers: rafael Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11458 llvm-svn: 244303	2015-08-07 01:35:24 +00:00
Alex Lorenz	cba8c5fe31	MIR Serialization: Fix serialization of unnamed IR block references. The block address machine operands can reference IR blocks in other functions. This commit fixes a bug where the references to unnamed IR blocks in other functions weren't serialized correctly. llvm-svn: 244299	2015-08-06 23:57:04 +00:00
Juergen Ributzka	f09c7a3d0f	[AArch64][FastISel] Always use AND before checking the branch flag. When we are not emitting the condition for the branch, because the condition is in another BB or SDAG did the selection for us, then we have to mask the flag in the register with AND. This is required when the condition comes from a truncate, because SDAG only truncates down to a legal size of i32. This fixes rdar://problem/22161062. llvm-svn: 244291	2015-08-06 22:44:15 +00:00
Juergen Ributzka	9f54dbe7a1	Revert "[AArch64][FastISel] Add more truncation tests." and "[AArch64][FastISel] Always use an AND instruction when truncating to non-legal types." This reverts commit r243198 and 243304. Turns out this wasn't the correct fix for this problem. It works only within FastISel, but fails when the truncate is selected by SDAG. llvm-svn: 244287	2015-08-06 22:13:48 +00:00
Sean Silva	c2b70bf999	[compatibility.ll] Cover explicitly named comdats. Patch by Vedant Kumar! <vsk@apple.com> llvm-svn: 244284	2015-08-06 22:04:21 +00:00
Rafael Espindola	8b3b09fdcf	Move to llvm-readobj code that is only used there. lld might end up using a small part of this, but it will be in a much refactored form. For now this unblocks avoiding the full section scan in the ELFFile constructor. This also has a (very small) error handling improvement. llvm-svn: 244282	2015-08-06 21:54:37 +00:00
Frederic Riss	dc5370b9cc	[dsymutil] Implement dSYM bundle creation A dSYM bundle is a file hierarchy that looks slike this: <bundle name>.dSYM/ Contents/ Info.plist Resources/ DWARF/ <DWARF file(s)> This is the default output mode of dsymutil. llvm-svn: 244270	2015-08-06 21:05:06 +00:00
Frederic Riss	0948db6064	[dsymutil] Add (unimplemented) --flat option dsymutil should by default generate dSYM bundles which are filesystem hierarchies containing the debug info and an additional Info.plist. Currently llvm-dsymutil emits raw binaries containing the debug info. This is what we call the 'flat mode'. Add a -f/-flat option that is supposed to enable that flat mode, but don't wire it for now, only pass it to the tests that will need it to stay functional once we do bundle generation by default. This basically makes this commit NFC and removes the noise from the actual commit that adds support for bundle generation. llvm-svn: 244269	2015-08-06 21:05:01 +00:00
Sanjoy Das	366acc175e	[IndVars] Fix PR24356. Unsigned predicates increase or decrease agnostic of the signs of their increments. llvm-svn: 244265	2015-08-06 20:43:41 +00:00
Rui Ueyama	b9583d22eb	Update comments. llvm-svn: 244259	2015-08-06 20:05:27 +00:00
Tom Stellard	217361c33f	AMDGPU/SI: Add support for 32-bit immediate SMRD offsets on CI Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11604 llvm-svn: 244254	2015-08-06 19:28:38 +00:00
Tom Stellard	dee26a2876	AMDGPU/SI: Use ComplexPatterns for SMRD addressing modes Summary: This allows us to consolidate several of the TableGen patterns. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11602 llvm-svn: 244253	2015-08-06 19:28:30 +00:00
Kit Barton	a7bf96ab5c	Fix possible infinite loop in shrink wrapping when searching for save/restore points. There is an infinite loop that can occur in Shrink Wrapping while searching for the Save/Restore points. Part of this search checks whether the save/restore points are located in different loop nests and if so, uses the (post) dominator trees to find the immediate (post) dominator blocks. However, if the current block does not have any immediate (post) dominators then this search will result in an infinite loop. This can occur in code containing an infinite loop. The modification checks whether the immediate (post) dominator is different from the current save/restore block. If it is not, then the search terminates and the current location is not considered as a valid save/restore point for shrink wrapping. Phabricator: http://reviews.llvm.org/D11607 llvm-svn: 244247	2015-08-06 19:01:57 +00:00
Quentin Colombet	6443cce233	[Reassociation] Fix miscompile for va_arg arguments. iisUnmovableInstruction() had a list of instructions hardcoded which are considered unmovable. The list lacked (at least) an entry for the va_arg and cmpxchg instructions. Fix this by introducing a new Instruction::mayBeMemoryDependent() instead of maintaining another instruction list. Patch by Matthias Braun <matze@braunis.de>. Differential Revision: http://reviews.llvm.org/D11577 rdar://problem/22118647 llvm-svn: 244244	2015-08-06 18:44:34 +00:00
Alex Lorenz	e86d51533d	MIR Parser: Report an error when parsing duplicate memory operand flags. llvm-svn: 244240	2015-08-06 18:26:36 +00:00
Alex Lorenz	dc8de2a6b7	MIR Serialization: Serialize the 'invariant' machine memory operand flag. llvm-svn: 244230	2015-08-06 16:55:53 +00:00
Richard Diamond	bd753c9315	Fix an alignment error in `llvm::expandAtomicRMWToCmpXchg` without breaking the build where X86 isn't enabled. Summary: Divide the primitive size in bits by eight so the initial load's alignment is in bytes as expected. Tested with the included unit test. Reviewers: rengolin, jfb Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11804 llvm-svn: 244229	2015-08-06 16:55:03 +00:00
Alex Lorenz	10fd03857f	MIR Serialization: Serialize the 'non-temporal' machine memory operand flag. llvm-svn: 244228	2015-08-06 16:49:30 +00:00
Douglas Katzman	63d64da0ce	[SPARC] Don't compare arch name as a string, use the enum instead. Fixes PR22695 llvm-svn: 244221	2015-08-06 15:44:12 +00:00
Renato Golin	a02ac60469	Revert "Divide the primitive size in bits by eight so the initial load's alignment is in bytes as expected. Tested with the included unit test." This reverts commit r244155, as it was breaking the buildbots for too long. Should be reapplied with proper fix. llvm-svn: 244205	2015-08-06 10:37:59 +00:00
Michael Kuperstein	868dc65444	[X86] Improve EmitLoweredSelect for contiguous CMOV pseudo instructions. This change improves EmitLoweredSelect() so that multiple contiguous CMOV pseudo instructions with the same (or exactly opposite) conditions get lowered using a single new basic-block. This eliminates unnecessary extra basic-blocks (and CFG merge points) when contiguous CMOVs are being lowered. Patch by: kevin.b.smith@intel.com Differential Revision: http://reviews.llvm.org/D11428 llvm-svn: 244202	2015-08-06 08:45:34 +00:00
Peter Collingbourne	e834f42073	COFF: Assign the correct symbol type to internal functions. The COFFSymbolRef::isFunctionDefinition() function tests for several conditions that are not related to whether a symbol is a function, but rather whether the symbol meets the requirements for a function definition auxiliary record, which excludes certain symbols such as internal functions and undefined references. The test we need to determine the symbol type is much simpler: we only need to compare the complex type against IMAGE_SYM_DTYPE_FUNCTION. llvm-svn: 244195	2015-08-06 05:26:35 +00:00
Alex Lorenz	49873a8382	MIR Serialization: Initial serialization of the machine operand target flags. This commit implements the initial serialization of the machine operand target flags. It extends the 'TargetInstrInfo' class to add two new methods that help to provide text based serialization for the target flags. This commit can serialize only the X86 target flags, and the target flags for the other targets will be serialized in the follow-up commits. Reviewers: Duncan P. N. Exon Smith llvm-svn: 244185	2015-08-06 00:44:07 +00:00
Frederic Riss	b5dce473e3	Revert "Make sure all temporary files get created under %T." This reverts commit r244163. The workaround shouldn't be necessary after r244172, and moreover the commit was slightly buggy as it dis a simple mkdir without removing the directory first, which could cause 'File exists' errors. llvm-svn: 244182	2015-08-05 23:53:38 +00:00
Frederic Riss	2a6e44518f	[dsymutil] Update source used to generate test binary. Forgot to include that in the last commit. llvm-svn: 244171	2015-08-05 23:33:45 +00:00
Bjarke Hammersholt Roune	5cbc7d2999	[NVPTX] Use LDG for pointer induction variables. More specifically, make NVPTXISelDAGToDAG able to emit cached loads (LDG) for pointer induction variables. Also fix latent bug where LDG was not restricted to kernel functions. I believe that this could not be triggered so far since we do not currently infer that a pointer is global outside a kernel function, and only loads of global pointers are considered for cached loads. llvm-svn: 244166	2015-08-05 23:11:57 +00:00
Artem Belevich	d41611cfdf	Make sure all temporary files get created under %T. llvm-svn: 244163	2015-08-05 22:54:36 +00:00
Frederic Riss	ae0d436545	[dsymutil] Add support for the -arch option. This option allows to select a subset of the architectures when performing a universal binary link. The filter is done completely in the mach-o specific part of the code. llvm-svn: 244160	2015-08-05 22:33:28 +00:00
Reid Kleckner	10cac7a190	Fix Windows test failure with triple instead of using the native OS llvm-svn: 244159	2015-08-05 22:27:08 +00:00
Reid Kleckner	12d2c12023	If the "CodeView" module flag is set, emit codeview instead of DWARF Summary: Emit both DWARF and CodeView if "CodeView" and "Dwarf Version" module flags are set. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11756 llvm-svn: 244158	2015-08-05 22:26:20 +00:00
Alex Lorenz	5672a893e5	MIR Serialization: Serialize the machine operand's offset. This commit serializes the offset for the following operands: target index, global address, external symbol, constant pool index, and block address. llvm-svn: 244157	2015-08-05 22:26:15 +00:00
Richard Diamond	559c1d72a9	Divide the primitive size in bits by eight so the initial load's alignment is in bytes as expected. Tested with the included unit test. llvm-svn: 244155	2015-08-05 22:10:57 +00:00
Chen Li	50efd9220a	[LoopUnswitch] Preserve make.implicit metadata for unswitched conditions Summary: This patch adds support to preserve make.implicit metadata for unswitched conditions in loop pre-header. Reviewers: sanjoy, weimingz Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D11769 llvm-svn: 244132	2015-08-05 21:13:26 +00:00
JF Bastien	8662083770	x86 atomic: optimize a.store(reg op a.load(acquire), release) Summary: PR24191 finds that the expected memory-register operations aren't generated when relaxed { load ; modify ; store } is used. This is similar to PR17281 which was addressed in D4796, but only for memory-immediate operations (and for memory orderings up to acquire and release). This patch also handles some floating-point operations. Reviewers: reames, kcc, dvyukov, nadav, morisset, chandlerc, t.p.northover, pete Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11382 llvm-svn: 244128	2015-08-05 21:04:59 +00:00
JF Bastien	7c4218f49c	Revert "Fix MO's analyzePhysReg, it was confusing sub- and super-registers. Problem pointed out by Michael Hordijk." I mistakenly committed the patch for D6629, and was trying to commit another. Reverting until it gets proper signoff. llvm-svn: 244121	2015-08-05 20:53:56 +00:00
JF Bastien	ce5256f5c5	Fix MO's analyzePhysReg, it was confusing sub- and super-registers. Problem pointed out by Michael Hordijk. llvm-svn: 244120	2015-08-05 20:49:46 +00:00
Alex Lorenz	3f2058da16	MIR Parser: Report an error when parsing large immediate operands. llvm-svn: 244100	2015-08-05 19:03:42 +00:00
Alex Lorenz	05e3882e81	MIR Serialization: Serialize the typed immediate integer machine operands. llvm-svn: 244098	2015-08-05 18:52:21 +00:00
Frederic Riss	6e3278633d	[dsymutil] Fix test patterns. Depending on the filesystem paths, the YAML dump might quote paths. Account for that in the regex patterns. llvm-svn: 244094	2015-08-05 18:45:13 +00:00
Frederic Riss	4dd3e0c41e	[dsymutil] Implement support for handling mach-o universal binaries as main input/output. The DWARF linker isn't touched by this, the implementation links individual files and merges them together into a fat binary by calling out to the 'lipo' utility. The main change is that the MachODebugMapParser can now return multiple debug maps for a single binary. The test just verifies that lipo would be invoked correctly, but doesn't actually generate a binary. This mimics the way clang tests its external iplatform tools integration. llvm-svn: 244087	2015-08-05 18:27:44 +00:00
Alex Lorenz	2b3cf19332	MIR Parser: Report an error when parsing duplicate register flags. llvm-svn: 244081	2015-08-05 18:09:03 +00:00
Chandler Carruth	405e4f9051	[GMR] Teach the conservative path of GMR to catch even more easy cases. In PR24288 it was pointed out that the easy case of a non-escaping global and something that obviously required an escape sometimes is hidden behind PHIs (or selects in theory). Because we have this binary test, we can easily just check that all possible input values satisfy the requirement. This is done with a (very small) recursion through PHIs and selects. With this, the specific example from the PR is correctly folded by GVN. Differential Revision: http://reviews.llvm.org/D11707 llvm-svn: 244078	2015-08-05 17:58:30 +00:00
Alex Lorenz	01c1a5ee58	MIR Serialization: Serialize the 'early-clobber' register operand flag. llvm-svn: 244075	2015-08-05 17:49:03 +00:00
Alex Lorenz	9075258b6a	MIR Serialization: Serialize the 'debug-use' register operand flag. llvm-svn: 244071	2015-08-05 17:41:17 +00:00
James Y Knight	bce20afe0f	[Sparc] Fix disassembly of popc instruction. And add tests. Patch by David Wiberg! llvm-svn: 244064	2015-08-05 17:00:30 +00:00
Steven Wu	9927206f8c	Force the MachO generated for Darwin to have VERSION_MIN load command On Darwin, it is required to stamp the object file with VERSION_MIN load command. This commit will provide a VERSRION_MIN load command to the MachO file that doesn't specify the version itself by inferring from Target Triple. llvm-svn: 244059	2015-08-05 15:36:38 +00:00
Artyom Skrobov	6fbef2a780	ARMISelDAGToDAG.cpp had this self-contradictory code: return StringSwitch<int>(Flags) .Case("g", 0x1) .Case("nzcvq", 0x2) .Case("nzcvqg", 0x3) .Default(-1); ... // The _g and _nzcvqg versions are only valid if the DSP extension is // available. if (!Subtarget->hasThumb2DSP() && (Mask & 0x2)) return -1; ARMARM confirms that the comment is right, and the code was wrong. llvm-svn: 244029	2015-08-05 11:02:14 +00:00
Simon Pilgrim	42c611b9ae	[InstCombine] Added more specific SSE2/AVX2 vector shift tests. llvm-svn: 244022	2015-08-05 08:21:38 +00:00
Hal Finkel	17caf326e5	[MachineCombiner] Don't use the opcode-only form of computeInstrLatency In r242277, I updated the MachineCombiner to work with itineraries, but I missed a call that is scheduling-model-only (the opcode-only form of computeInstrLatency). Using the form that takes an MI* allows this to work with itineraries (and should be NFC for subtargets with scheduling models). llvm-svn: 244020	2015-08-05 07:45:28 +00:00
Hal Finkel	23cdeeea0f	[RuntimeDyld] Adapt PPC64 relocations to PPC32 Begin adapting some of the implemented PPC64 relocations for PPC32 (with a test case). Patch by Pierre-Andre Saulais! llvm-svn: 243991	2015-08-04 15:29:00 +00:00
Sanjay Patel	75ced2782b	[x86] machine combiner reassociation: mark EFLAGS operand as 'dead' In the commentary for D11660, I wasn't sure if it was alright to create new integer machine instructions without also creating the implicit EFLAGS operand. From what I can see, the implicit operand is always created by the MachineInstrBuilder based on the instruction type, so we don't have to do that explicitly. However, in reviewing the debug output, I noticed that the operand was not marked as 'dead'. The machine combiner should do that to preserve future optimization opportunities that may be checking for that dead EFLAGS operand themselves. Differential Revision: http://reviews.llvm.org/D11696 llvm-svn: 243990	2015-08-04 15:21:56 +00:00
Vasileios Kalintiris	044e172228	Revert r229675 - [mips] Avoid redundant sign extension of the result of binary bitwise instructions. It introduced two regressions on 64-bit big-endian targets running under N32 (MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4, and MultiSource/Applications/kimwitu++/kc) The issue is that on 64-bit targets comparisons such as BEQ compare the whole GPR64 but incorrectly tell the instruction selector that they operate on GPR32's. This leads to the elimination of i32->i64 extensions that are actually required by comparisons to work correctly. There's currently a patch under review that fixes this problem. llvm-svn: 243984	2015-08-04 14:26:35 +00:00
Simon Pilgrim	d19b9d8229	[InstCombine] Split off SSE2/AVX2 vector shift tests. These aren't vector demanded bits tests. More tests to follow. llvm-svn: 243963	2015-08-04 08:05:27 +00:00
Duncan P. N. Exon Smith	706f37e8df	Linker: Fix references to uniqued nodes after r243883 r243883 started moving 'distinct' nodes instead of duplicated them in lib/Linker. This had the side-effect of sometimes not cloning uniqued nodes that reference them. I missed a corner case: !named = !{!0} !0 = !{!1} !1 = distinct !{!0} !0 is the entry point for "remapping", and a temporary clone (say, !0-temp) is created and mapped in case we need to model a uniquing cycle. Recursive descent into !1. !1 is distinct, so we leave it alone, but update its operand to !0-temp. Pop back out to !0. Its only operand, !1, hasn't changed, so we don't need to use !0-temp. !0-temp goes out of scope, and we're finished remapping, but we're left with: !named = !{!0} !0 = !{!1} !1 = distinct !{null} ; uh oh... Previously, if !0 and !0-temp ended up with identical operands, then !0-temp couldn't have been referenced at all. Now that distinct nodes don't get duplicated, that assumption is invalid. We need to !0-temp->replaceAllUsesWith(!0) before freeing !0-temp. I found this while running an internal `-flto -g` bootstrap. Strangely, there was no case of this in the open source bootstrap I'd done before commit... llvm-svn: 243961	2015-08-04 06:42:31 +00:00
Mehdi Amini	c8d5783114	Update test suite to make "ninja check" succeed without native backend builtin Requires "native" feature in most places that were failing. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 243960	2015-08-04 06:32:54 +00:00
Mehdi Amini	63c3989f6a	Move generic MIR tests in their own subdir, requires "native" as well These tests rely on the native backend to be built-in. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 243959	2015-08-04 06:32:45 +00:00
Mehdi Amini	fc6b6983ac	Improve lit "native" feature to check if the native backend is builtin The goal is to have 'ninja check' passing even if the X86 backend is not built. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 243958	2015-08-04 06:32:31 +00:00
Saleem Abdulrasool	0a2672bb43	ARM: support windows division routines This adds the software division routines for the Windows RTABI. These are not expected to be used often though as most modern Windows ARM capable targets support hardware division. In the case that the target CPU doesnt support hardware division, this will be the fallback. llvm-svn: 243952	2015-08-04 03:57:56 +00:00
Sanjoy Das	215df9ed98	Revert "[LSR] Generate and use zero extends" This reverts commit r243348 and r243357. They caused PR24347. llvm-svn: 243939	2015-08-04 01:52:05 +00:00
Ahmed Bougacha	e0e12db8c8	[AArch64] Add isel support for f16 indexed LD/ST. llvm-svn: 243935	2015-08-04 01:29:38 +00:00
Ahmed Bougacha	b0ae36f0d1	[AArch64] Vector FCOPYSIGN supports Custom-lowering: mark it as such. There's a bunch of code in LowerFCOPYSIGN that does smart lowering, and is actually already vector-aware; let's use it instead of scalarizing! The only interesting change is that for v2f32, we previously always used use v4i32 as the integer vector type. Use v2i32 instead, and mark FCOPYSIGN as Custom. llvm-svn: 243926	2015-08-04 00:42:34 +00:00
Ahmed Bougacha	f65371a235	[CodeGen] Fix FCOPYSIGN legalization to account for mismatched types. We used to legalize it like it's any other binary operations. It's not, because it accepts mismatched operand types. Because of that, we used to hit various asserts and miscompiles. Specialize vector legalizations to, in the worst case, unroll, or, when possible, to just legalize the operand that needs legalization. Scalarization isn't covered, because I can't think of a target where some but not all of the 1-element vector types are to be scalarized. llvm-svn: 243924	2015-08-04 00:32:55 +00:00
Alex Lorenz	a518b79601	MIR Serialization: Serialize the 'volatile' machine memory operand flag. llvm-svn: 243923	2015-08-04 00:24:45 +00:00
Alex Lorenz	4af7e610c3	MIR Serialization: Initial serialization of the machine memory operands. Reviewers: Duncan P. N. Exon Smith llvm-svn: 243915	2015-08-03 23:08:19 +00:00
Justin Bogner	45291391b2	lto: Avoid relying on the environment for this test It's better to pass libLTO to ld64 via the command line flag than rely on setting DYLD_LIBRARY_PATH. llvm-svn: 243911	2015-08-03 22:43:14 +00:00
Chandler Carruth	87adb7a2e2	[Unroll] Improve the brute force loop unroll estimate by propagating through PHI nodes across iterations. This patch teaches the new advanced loop unrolling heuristics to propagate constants into the loop from the preheader and around the backedge after simulating each iteration. This lets us brute force solve simple recurrances that aren't modeled effectively by SCEV. It also makes it more clear why we need to process the loop in-order rather than bottom-up which might otherwise make much more sense (for example, for DCE). This came out of an attempt I'm making to develop a principled way to account for dead code in the unroll estimation. When I implemented a forward-propagating version of that it produced incorrect results due to failing to propagate cost between loop iterations through the PHI nodes, and it occured to me we really should at least propagate simplifications across those edges, and it is quite easy thanks to the loop being in canonical and LCSSA form. Differential Revision: http://reviews.llvm.org/D11706 llvm-svn: 243900	2015-08-03 20:32:27 +00:00
Derek Schuff	b4c1c28c6e	Fix testing for end of stream in bitstream reader. This fixes a bug found while working on the bitcode reader. In particular, the method BitstreamReader::AtEndOfStream doesn't always behave correctly when processing a data streamer. The method fillCurWord doesn't properly set CurWord/BitsInCurWord if the data streamer was already at eof, but GetBytes had not yet set the ObjectSize field of the streaming memory object. This patch fixes this problem, and provides a test to show that this problem has been fixed. Patch by Karl Schimpf. Differential Revision: http://reviews.llvm.org/D11391 llvm-svn: 243890	2015-08-03 18:01:50 +00:00
Duncan P. N. Exon Smith	55ca964e94	DI: Disallow uniquable DICompileUnits Since r241097, `DIBuilder` has only created distinct `DICompileUnit`s. The backend is liable to start relying on that (if it hasn't already), so make uniquable `DICompileUnit`s illegal and automatically upgrade old bitcode. This is a nice cleanup, since we can remove an unnecessary `DenseSet` (and the associated uniquing info) from `LLVMContextImpl`. Almost all the testcases were updated with this script: git grep -e '= !DICompileUnit' -l -- test \| grep -v test/Bitcode \| xargs sed -i '' -e 's,= !DICompileUnit,= distinct !DICompileUnit,' I imagine something similar should work for out-of-tree testcases. llvm-svn: 243885	2015-08-03 17:26:41 +00:00
Tim Northover	910dde7ab2	ARM: prefer allocating VFP regs at stride 4 on Darwin. This is necessary for WatchOS support, where the compact unwind format assumes this kind of layout. For now we only want this on Swift-like CPUs though, where it's been the Xcode behaviour for ages. Also, since it can expand the prologue we don't want it at -Oz. llvm-svn: 243884	2015-08-03 17:20:10 +00:00
Artur Pilipenko	17376c4e02	Currently string attributes on function arguments/return values can be generated using LLVM API. However they are not supported in parser. So, the following scenario will fail: * generate function with string attribute using API, * dump it in LL format, * try to parse. Add parser support for string attributes to fix the issue. Reviewed By: reames, hfinkel Differential Revision: http://reviews.llvm.org/D11058 llvm-svn: 243877	2015-08-03 14:31:49 +00:00
John Brawn	f3324cf1a5	[ARM] Make GlobalMerge merge extern globals by default Enabling merging of extern globals appears to be generally either beneficial or harmless. On some benchmarks suites (on Cortex-M4F, Cortex-A9, and Cortex-A57) it gives improvements in the 1-5% range, but in the rest the overall effect is zero. Differential Revision: http://reviews.llvm.org/D10966 llvm-svn: 243874	2015-08-03 12:13:33 +00:00
Alexander Kornienko	cd86c9791c	Don't use test inputs from other directories. The test/DebugInfo/dwarfdump-macho-universal.test test added in r243862 uses an input from another test's directory (test/tools/dsymutil/Inputs/fat-test.o) which breaks our test setup. Copying the required test input to the test's Input directory to fix the issue. llvm-svn: 243872	2015-08-03 11:59:45 +00:00
James Molloy	6967e5e4a3	Be less conservative about forming IT blocks. In http://reviews.llvm.org/rL215382, IT forming was made more conservative under the belief that a flag-setting instruction was unpredictable inside an IT block on ARMv6M. But actually, ARMv6M doesn't even support IT blocks so that's impossible. In the ARMARM for v7M, v7AR and v8AR it states that the semantics of such an instruction changes inside an IT block - it doesn't set the flags. So actually it is fine to use one inside an IT block as long as the flags register is dead afterwards. This gives significant performance improvements in a variety of MPEG based workloads. Differential revision: http://reviews.llvm.org/D11680 llvm-svn: 243869	2015-08-03 09:24:48 +00:00
Alexander Kornienko	2d4bef8d8d	Fix the test added at r243777. When RUN: lines are split into multiple lines, each one must be prefixed with RUN:. llvm-svn: 243868	2015-08-03 09:13:19 +00:00
Frederic Riss	4c5d357b57	[dwarfdump] Add support for dumping mach-o universal objectfiles llvm-svn: 243862	2015-08-03 00:10:31 +00:00
JF Bastien	fda53373f2	WebAssembly: implement getScalarShiftAmountTy so we can shift by amount, with type Summary: This currently sets the shift amount RHS to the same type as the LHS, and assumes that the LHS is a simple type. This isn't currently the case e.g. with weird integers sizes, but will eventually be true and will assert if not. That's what you get for having an experimental backend: break it and you get to keep both pieces. Most backends either set the RHS to MVT::i32 or MVT::i64, but WebAssembly is a virtual ISA and tries to have regular-looking binary operations where both operands are the same type (even if a 64-bit RHS shifter is slightly silly, hey it's free!). Subscribers: llvm-commits, sunfish, jfb Differential Revision: http://reviews.llvm.org/D11715 llvm-svn: 243860	2015-08-03 00:00:11 +00:00
Simon Pilgrim	87368fdd30	[X86][SSE] Refreshed sse2 vector shift tests llvm-svn: 243850	2015-08-02 15:23:53 +00:00
Jingyue Wu	ffa09be222	[NVPTX] allow register copy between float and int Summary: Fixes PR24303. With Bruno's WIP (D11197) on PeepholeOptimizer, across-class register copying (e.g. i32 to f32) becomes possible. Enhance NVPTXInstrInfo::copyPhysReg to handle these cases. Reviewers: jholewinski Subscribers: eliben, jholewinski, llvm-commits, bruno Differential Revision: http://reviews.llvm.org/D11622 llvm-svn: 243839	2015-08-01 18:02:12 +00:00
Simon Atanasyan	3644ee4aa8	[Mips] Support DT_MIPS_RLD_MAP_REL dynamic section tag in the llvm-readobj llvm-svn: 243833	2015-08-01 12:02:02 +00:00
Simon Pilgrim	503a2594c3	[DAGCombiner] Convert constant AND masks to shuffle clear masks down to the byte level The XformToShuffleWithZero method currently checks AND masks at the per-lane level for all-one and all-zero constants and attempts to convert them to legal shuffle clear masks. This patch generalises XformToShuffleWithZero, splitting and checking the sub-lanes of the constants down to the byte level to see if any legal shuffle clear masks are possible. This allows a lot of masks (often from legalization or truncation) to be folded into existing shuffle patterns and removes a lot of constant mask loading. There are a few examples of poor shuffle lowering that are exposed by this patch that will be cleaned up in future patches (e.g. merging shuffles that are separated by bitcasts, x86 legalized v8i8 zero extension uses PMOVZX+AND+AND instead of AND+PMOVZX, etc.) Differential Revision: http://reviews.llvm.org/D11518 llvm-svn: 243831	2015-08-01 10:01:46 +00:00
JF Bastien	8f9aea08d4	WebAssembly: handle more than int32 argument/return Summary: Also test 64-bit integers, except shifts for now which are broken because isel dislikes the 32-bit truncate that precedes them. Reviewers: sunfish Subscribers: llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11699 llvm-svn: 243822	2015-08-01 04:48:44 +00:00
Alex Lorenz	b4d0d6a345	AMDGPU/SI: Add implicit register operands in the correct order. This commit fixes a bug in the class 'SIInstrInfo' where the implicit register machine operands were added to a machine instruction in an incorrect order - the implicit uses were added before the implicit defs. I found this bug while working on moving the implicit register operand verification code from the MIR parser to the machine verifier. This commit also makes the method 'addImplicitDefUseOperands' in the machine instruction class public so that it can be reused in the 'SIInstrInfo' class. Reviewers: Matt Arsenault Differential Revision: http://reviews.llvm.org/D11689 llvm-svn: 243799	2015-07-31 23:30:09 +00:00
Alex Lorenz	59ed5919cd	MIR Parser: Report an error when a jump table entry is redefined. llvm-svn: 243798	2015-07-31 23:13:23 +00:00
Jingyue Wu	cf70053b20	[NVPTX] convert pointers in byval kernel arguments to global Summary: For example, in struct S { int x; int y; }; __global__ void foo(S s) { int *b = s.y; // use b } "b" is guaranteed to point to global. NVPTX should emit ld.global/st.global for accessing "b". Reviewers: jholewinski Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11505 llvm-svn: 243790	2015-07-31 21:44:14 +00:00
JF Bastien	4a2d56044f	WebAssembly: handle `ret void`. Summary: Use -1 as numoperands for the return SDTypeProfile, denoting that return is variadic. Note that the patterns in InstrControl.td still need to match the inputs, so this ins't an "anything goes" variadic on ret! The next step will be to handle other local types (not just int32). Reviewers: sunfish Subscribers: llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11692 llvm-svn: 243783	2015-07-31 21:04:18 +00:00
Alex Lorenz	ad156fb6af	MIR Serialization: Serialize the floating point immediate machine operands. Reviewers: Duncan P. N. Exon Smith llvm-svn: 243780	2015-07-31 20:49:21 +00:00
Duncan P. N. Exon Smith	375fa1381d	IR: Add a broad bitcode compatibility test Successive versions of LLVM should retain the ability to parse bitcode generated by old releases of the compiler. This adds a bitcode format compatibility test, which is intended to provide good (albeit not entirely exhaustive) coverage of the current LangRef. This also includes compatibility tests for LLVM 3.6. After every 3.X.0 release, the compatibility.ll file from the 3.X branch should be copied to compatibility-3.X.ll on trunk, and the 3.X.0 release used to generate a corresponding bitcode file. Patch by Vedant Kumar! llvm-svn: 243779	2015-07-31 20:44:32 +00:00
Frederic Riss	f6402bfe78	[dwarfdump] Ignore scattered relocations for mach-o. When encountering a scattered relocation, the code would assert trying to access an unexisting section. I couldn't find a way to expose the result of the processing of a scattered reloc, and I'm really unsure what the right thing to do is. This patch just skips them during the processing in DwarfContext and adds a mach-o file to the tests that exposed the asserting behavior. (This is a new failure that is being exposed by Rafael's recent work on the libObject interfaces. I think the wrong behavior has always happened, but now it's asserting) llvm-svn: 243778	2015-07-31 20:22:50 +00:00
Frederic Riss	8caf29932c	[dsymutil] Support multiple input files on the command line llvm-svn: 243777	2015-07-31 20:22:20 +00:00
Duncan P. N. Exon Smith	ed013cd221	DI: Remove DW_TAG_arg_variable and DW_TAG_auto_variable Remove the fake `DW_TAG_auto_variable` and `DW_TAG_arg_variable` tags, using `DW_TAG_variable` in their place Stop exposing the `tag:` field at all in the assembly format for `DILocalVariable`. Most of the testcase updates were generated by the following sed script: find test/ -name ".ll" -o -name ".mir" \| xargs grep -l 'DILocalVariable' \| xargs sed -i '' \ -e 's/tag: DW_TAG_arg_variable, //' \ -e 's/tag: DW_TAG_auto_variable, //' There were only a handful of tests in `test/Assembly` that I needed to update by hand. (Note: a follow-up could change `DILocalVariable::DILocalVariable()` to set the tag to `DW_TAG_formal_parameter` instead of `DW_TAG_variable` (as appropriate), instead of having that logic magically in the backend in `DbgVariable`. I've added a FIXME to that effect.) llvm-svn: 243774	2015-07-31 18:58:39 +00:00
JF Bastien	d7fcc6f9c7	WebAssembly: handle unused function arguments. Subscribers: llvm-commits, sunfish, jfb Differential Revision: http://reviews.llvm.org/D11684 llvm-svn: 243770	2015-07-31 18:13:27 +00:00
David Majnemer	654e130b6e	New EH representation for MSVC compatibility This introduces new instructions neccessary to implement MSVC-compatible exception handling support. Most of the middle-end and none of the back-end haven't been audited or updated to take them into account. Differential Revision: http://reviews.llvm.org/D11097 llvm-svn: 243766	2015-07-31 17:58:14 +00:00
JF Bastien	600aee9805	WebAssembly: print basic integer assembly. Summary: This prints assembly for int32 integer operations defined in WebAssemblyInstrInteger.td only, with major caveats: - The operation names are currently incorrect. - Other integer and floating-point types will be added later. - The printer isn't factored out to handle recursive AST code yet, since it can't even handle control flow anyways. - The assembly format isn't full s-expressions yet either, this will be added later. - This currently disables PrologEpilogCodeInserter as well as MachineCopyPropagation becasue they don't like virtual registers, which WebAssembly likes quite a bit. This will be fixed by factoring out NVPTX's change (currently a fork of PrologEpilogCodeInserter). Reviewers: sunfish Subscribers: llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11671 llvm-svn: 243763	2015-07-31 17:53:38 +00:00
Sanjay Patel	9ff4626028	[x86] reassociate integer multiplies using machine combiner pass Add i16, i32, i64 imul machine instructions to the list of reassociation candidates. A new bit of logic is needed to handle integer instructions: they have an implicit EFLAGS operand, so we have to make sure it's dead in order to do any reassociation with integer ops. Differential Revision: http://reviews.llvm.org/D11660 llvm-svn: 243756	2015-07-31 16:21:55 +00:00
Reid Kleckner	47ea9ece1a	[COFF] Return symbol VAs instead of RVAs for PE files This makes llvm-nm consistent with binutils nm on executables and DLLs. For a vanilla hello world executable, the address of main should include the default image base of 0x400000. llvm-svn: 243755	2015-07-31 16:14:22 +00:00
Geoff Berry	8a7ef3b2ee	[AArch64] Favor extended reg patterns for sub Summary: Favor the extended reg patterns over the shifted reg patterns that match only the operand shift and not the full sign/zero extend and shift. Reviewers: jmolloy, t.p.northover Subscribers: mcrosier, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D11569 llvm-svn: 243753	2015-07-31 15:55:54 +00:00
Matt Arsenault	e1ce344b5a	AMDGPU: Fix v16i32 to v16i8 truncstore llvm-svn: 243731	2015-07-31 04:12:04 +00:00
Kostya Serebryany	fb7d8d9d06	[libFuzzer] trace switch statements and apply mutations based on the expected case values llvm-svn: 243726	2015-07-31 01:33:06 +00:00
Tom Stellard	e182e74c53	ELFYAML: Enable parsing of EM_AMDGPU Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11263 llvm-svn: 243724	2015-07-31 01:15:15 +00:00
Matt Arsenault	a7bc7db53c	TableGen: Support folding casts from bits to int This is to fix an incorrect error when trying to initialize DwarfNumbers with a !cast<int> of a bits initializer. getValuesAsListOfInts("DwarfNumbers") would not see an IntInit and instead the cast, so would give up. It seems likely that this could be generalized to attempt the convertInitializerTo for any type. I'm not really sure why the existing code seems to special case the string cast cases when convertInitializerTo seems like it should generally handle this sort of thing. llvm-svn: 243722	2015-07-31 01:12:06 +00:00
Sumanth Gundapaneni	532a13691c	[ARM] Lower modulo operation to generate __aeabi_divmod on Android For a modulo (reminder) operation, clang -target armv7-none-linux-gnueabi generates "__modsi3" clang -target armv7-none-eabi generates "__aeabi_idivmod" clang -target armv7-linux-androideabi generates "__modsi3" Android bionic libc doesn't provide a __modsi3, instead it provides a "__aeabi_idivmod". This patch fixes the LLVM ARMISelLowering to generate the correct call when ever there is a modulo operation. Differential Revision: http://reviews.llvm.org/D11661 llvm-svn: 243717	2015-07-31 00:45:12 +00:00
Alex Lorenz	60bf599607	MIR Parser: Report an error when a constant pool item is redefined. llvm-svn: 243696	2015-07-30 22:00:17 +00:00
Alex Lorenz	a06c0c6401	MIR Parser: Report an error when a virtual register is redefined. llvm-svn: 243695	2015-07-30 21:54:10 +00:00
Sanjay Patel	1166f2ff9f	fix memcpy/memset/memmove lowering when optimizing for size Fixing MinSize attribute handling was discussed in D11363. This is a prerequisite patch to doing that. The handling of OptSize when lowering mem* functions was broken on Darwin because it wants to ignore -Os for these cases, but the existing logic also made it ignore -Oz (MinSize). The Linux change demonstrates a widespread problem. The backend doesn't usually recognize the MinSize attribute by itself; it assumes that if the MinSize attribute exists, then the OptSize attribute must also exist. Fixing this more generally will be a follow-on patch or two. Differential Revision: http://reviews.llvm.org/D11568 llvm-svn: 243693	2015-07-30 21:41:50 +00:00
Wei Mi	d6f7252e2e	[SLP vectorizer]: Choose the best consecutive candidate to pair with a store instruction. The patch changes the SLPVectorizer::vectorizeStores to choose the immediate succeeding or preceding candidate for a store instruction when it has multiple consecutive candidates. In this way it has better chance to find more slp vectorization opportunities. Differential Revision: http://reviews.llvm.org/D10445 llvm-svn: 243666	2015-07-30 17:40:39 +00:00
Alex Lorenz	618b283cd9	MIR Serialization: Serialize the machine basic block's successor weights. Reviewers: Duncan P. N. Exon Smith llvm-svn: 243659	2015-07-30 16:54:38 +00:00
Vasileios Kalintiris	3d9d81fbb1	[mips] Fix out-of-date debug information in test file. Update the debug info in the check-lines because the change in r243638 introduced a constant initialization before the prologue's end as part of a register spill. llvm-svn: 243640	2015-07-30 13:13:09 +00:00
Vasileios Kalintiris	2041b1dd0b	[mips][FastISel] Remove hidden mips-fast-isel option. Summary: This hidden option would disable code generation through FastISel by default. It was removed from the available options and from the Fast-ISel tests that required it in order to run the tests. Reviewers: dsanders Subscribers: qcolombet, llvm-commits Differential Revision: http://reviews.llvm.org/D11610 llvm-svn: 243638	2015-07-30 12:39:33 +00:00
Vasileios Kalintiris	77fb0a3dcf	[mips][FastISel] Apply only zero-extension to constants prior to their materialization. Summary: Previously, we would sign-extend non-boolean negative constants and zero-extend otherwise. This was problematic for PHI instructions with negative values that had a type with bitwidth less than that of the register used for materialization. More specifically, ComputePHILiveOutRegInfo() assumes the constants present in a PHI node are zero extended in their container and afterwards deduces the known bits. For example, previously we would materialize an i16 -4 with the following instruction: addiu $r, $zero, -4 The register would end-up with the 32-bit 2's complement representation of -4. However, ComputePHILiveOutRegInfo() would generate a constant with the upper 16-bits set to zero. The SelectionDAG builder would use that information to generate an AssertZero node that would remove any subsequent trunc & zero_extend nodes. In theory, we should modify ComputePHILiveOutRegInfo() to consult target-specific hooks about the way they prefer to materialize the given constants. However, git-blame reports that this specific code has not been touched since 2011 and it seems to be working well for every target so far. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11592 llvm-svn: 243636	2015-07-30 11:51:44 +00:00
Nick Lewycky	c3890d2969	Fix typo "fuction" noticed in comments in AssumptionCache.h, and also all the other files that have the same typo. All comments, no functionality change! (Merely a "fuctionality" change.) Bonus change to remove emacs major mode marker from SystemZMachineFunctionInfo.cpp because emacs already knows it's C++ from the extension. Also fix typo "appeary" in AMDGPUMCAsmInfo.h. llvm-svn: 243585	2015-07-29 22:32:47 +00:00
Frederic Riss	11ab2b8858	[dsymutil] Rename -v option to -verbose The dsymutil-classic -v option dumps the tool version rather than putting it in verbose mode. Rename -v to -verbose and update the tests that use it (in the process removing it from a few tests that didn't require it anymore since the -dump-debug-map option was introduced). A followup commit will reintroduce the -v option that dumps the version. llvm-svn: 243582	2015-07-29 22:29:34 +00:00
Simon Pilgrim	ba10f76705	[X86][SSE] Keep 32-bit target i64 vector shifts on SSE unit. This patch improves the 32-bit target i64 constant matching to detect the shuffle vector splats that are introduced by i64 vector shift vectorization (D8416). Differential Revision: http://reviews.llvm.org/D11327 llvm-svn: 243577	2015-07-29 21:44:27 +00:00
Tim Northover	2a9d801fd5	AArch64: use 32-bit MOV rather than UBFX to truncate registers. It's potentially more efficient on Cyclone, and from the optimization guides & schedulers looks like it has no effect on Cortex-A53 or A57. In general you'd expect a MOV to be about the most efficient instruction with its semantics, even though the official "UXTW" alias is really a UBFX. llvm-svn: 243576	2015-07-29 21:34:32 +00:00
Alex Lorenz	a6f9a37d92	MIR Serialization: Serialize the frame info's save and restore points. This commit serializes the save and restore machine basic block references from the machine frame information class. llvm-svn: 243575	2015-07-29 21:09:09 +00:00
Simon Pilgrim	86478c6909	[X86][SSE] Vectorize i64 ASHR operations This patch vectorizes the v2i64/v4i64 ASHR shift operations - the last remaining integer vector shifts that are still being transferred to/from the scalar unit to be completed. Differential Revision: http://reviews.llvm.org/D11439 llvm-svn: 243569	2015-07-29 20:31:45 +00:00
Alexey Samsonov	869a5ff37f	[ASan] Disable dynamic alloca and UAR detection in presence of returns_twice calls. Summary: returns_twice (most importantly, setjmp) functions are optimization-hostile: if local variable is promoted to register, and is changed between setjmp() and longjmp() calls, this update will be undone. This is the reason why "man setjmp" advises to mark all these locals as "volatile". This can not be enough for ASan, though: when it replaces static alloca with dynamic one, optionally called if UAR mode is enabled, it adds a whole lot of SSA values, and computations of local variable addresses, that can involve virtual registers, and cause unexpected behavior, when these registers are restored from buffer saved in setjmp. To fix this, just disable dynamic alloca and UAR tricks whenever we see a returns_twice call in the function. Reviewers: rnk Subscribers: llvm-commits, kcc Differential Revision: http://reviews.llvm.org/D11495 llvm-svn: 243561	2015-07-29 19:36:08 +00:00
Colin LeMahieu	fcc32766bf	[llvm-objdump] Merging MachO DumpSections in to FilterSections. Simplifying some predicate logic. llvm-svn: 243556	2015-07-29 19:08:10 +00:00
Jingyue Wu	3a04dc6e78	Roll forward r242871 r242871 missed one place that should be guarded with isPhysicalReg. This patch fixes that. llvm-svn: 243555	2015-07-29 18:59:09 +00:00
Alex Lorenz	b139323f21	MIR Serialization: Serialize the '.cfi_def_cfa' CFI instruction. llvm-svn: 243554	2015-07-29 18:57:23 +00:00
Alex Lorenz	fbe9c04c5f	MIR Parser: Parse multiple LHS register machine operands. llvm-svn: 243553	2015-07-29 18:51:21 +00:00
Michael Zolotukhin	9f06ef76d3	[Unroll] Handle SwitchInst properly. Previously successor selection was simply wrong. llvm-svn: 243545	2015-07-29 18:10:33 +00:00
Michael Zolotukhin	3a7d55b623	[Unroll] Don't crash when simplified branch condition is undef. llvm-svn: 243544	2015-07-29 18:10:29 +00:00
Michael Zolotukhin	a2069d36ce	Rename test full-unroll-bad-geps.ll to full-unroll-crashers.ll. No reason to limit it only to GEP-related crashes. More tests are to come here. llvm-svn: 243543	2015-07-29 18:10:23 +00:00
Bruno Cardoso Lopes	38c0250679	Revert "[PeepholeOptimizer] Look through PHIs to find additional register sources" Reported to Broke some internal tests: PR24303 This reverts commit r243486. llvm-svn: 243540	2015-07-29 17:46:47 +00:00
Colin LeMahieu	77804bed85	[llvm-objdump] Added -j flag to filter sections that are operated on. llvm-svn: 243526	2015-07-29 15:45:39 +00:00
Jingyue Wu	7ec38530a5	Temporarily revert r242871 PR24299 llvm-svn: 243522	2015-07-29 15:26:11 +00:00
Bill Schmidt	42ddd71120	[PPC] Fix PR24216: Don't generate splat for misaligned shuffle mask Given certain shuffle-vector masks, LLVM emits splat instructions which splat the wrong bytes from the source register. The issue is that the function PPC::isSplatShuffleMask() in PPCISelLowering.cpp does not ensure that the splat pattern found is requesting bytes that are aligned on an EltSize boundary. This patch detects this situation as not a valid splat mask, resulting in a permute being generated instead of a splat. Patch and test case by Tyler Kenney, cleaned up a bit by me. This is a simple bug fix that would be good to incorporate into 3.7. llvm-svn: 243519	2015-07-29 14:31:57 +00:00
Akira Hatanaka	f53b0403f8	[AArch64] Define subtarget feature strict-align. This commit defines subtarget feature strict-align and uses it instead of cl::opt -aarch64-strict-align to decide whether strict alignment should be forced. rdar://problem/21529937 llvm-svn: 243516	2015-07-29 14:17:26 +00:00
Sanjoy Das	cfe41f050c	[Statepoints] Let patchable statepoints have a symbolic call target. Summary: As added initially, statepoints required their call targets to be a constant pointer null if ``numPatchBytes`` was non-zero. This turns out to be a problem ergonomically, since there is no way to mark patchable statepoints as calling a (readable) symbolic value. This change remove the restriction of requiring ``null`` call targets for patchable statepoints, and changes PlaceSafepoints to maintain the symbolic call target through its transformation. Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11550 llvm-svn: 243502	2015-07-28 23:50:30 +00:00
Sanjay Patel	133e68b45c	ignore duplicate divisor uses when transforming into reciprocal multiplies (PR24141) PR24141: https://llvm.org/bugs/show_bug.cgi?id=24141 contains a test case where we have duplicate entries in a node's uses() list. After r241826, we use CombineTo() to delete dead nodes when combining the uses into reciprocal multiplies, but this fails if we encounter the just-deleted node again in the list. The solution in this patch is to not add duplicate entries to the list of users that we will subsequently iterate over. For the test case, this avoids triggering the combine divisors logic entirely because there really is only one user of the divisor. Differential Revision: http://reviews.llvm.org/D11345 llvm-svn: 243500	2015-07-28 23:28:22 +00:00
Alex Lorenz	ef5c196fb0	MIR Serialization: Serialize the target index machine operands. Reviewers: Duncan P. N. Exon Smith llvm-svn: 243497	2015-07-28 23:02:45 +00:00
Akira Hatanaka	2670f4a550	[ARM] Define subtarget feature strict-align. This commit defines subtarget feature strict-align and uses it instead of cl::opt -arm-strict-align to decide whether strict alignment should be forced. Also, remove the logic that was checking the OS and architecture as clang is now responsible for setting strict-align based on the command line options specified and the target architecute and OS. rdar://problem/21529937 http://reviews.llvm.org/D11470 llvm-svn: 243493	2015-07-28 22:44:28 +00:00
Tim Northover	17ae83a25f	AArch64: be careful of large immediates when optimising cmps. llvm-svn: 243492	2015-07-28 22:42:32 +00:00
Davide Italiano	f75bf454e4	[tests] Use llvm-readobj instead of macho-dump. llvm-svn: 243487	2015-07-28 21:58:08 +00:00
Bruno Cardoso Lopes	3c235763e5	[PeepholeOptimizer] Look through PHIs to find additional register sources Reapply 243271 with more fixes; although we are not handling multiple sources with coalescable copies, we were not properly skipping this case. - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to lookup into multiple sources returnted by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coaslescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 243486	2015-07-28 21:45:50 +00:00
Vasileios Kalintiris	9876946aee	[mips][FastISel] Fix call lowering by bailing out on "fastcc" calls. Summary: Currently, we support only the MIPS O32 ABI calling convention for call lowering. With this change we avoid using the O32 calling convetion for lowering calls marked as using the fast calling convention. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11515 llvm-svn: 243485	2015-07-28 21:43:31 +00:00
Chih-Hung Hsieh	41169c5487	Fix typo. llvm-svn: 243475	2015-07-28 20:38:29 +00:00
Chih-Hung Hsieh	c5e53ca1b7	Limit this test only on linux. Differential Revision: http://reviews.llvm.org/D10522 llvm-svn: 243474	2015-07-28 20:31:10 +00:00
Vasileios Kalintiris	9ec6114860	[mips][FastISel] Fix generated code for IR's select instruction. Summary: Generate correct code for the select instruction by zero-extending it's boolean/condition operand to GPR-width. This is necessary because the conditional-move instructions operate on the whole register. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11506 llvm-svn: 243469	2015-07-28 19:57:25 +00:00
Matt Arsenault	7227cc1a48	AMDGPU: Don't try to use LDS/vector for private if pointer value stored If the pointer is the store's value operand, this would produce a broken module. Make sure the use is actually for the pointer operand. llvm-svn: 243462	2015-07-28 18:47:00 +00:00
Matt Arsenault	fdcd39a8ad	AMDGPU: Fix crash if called function is a bitcast getCalledFunction() is null, so this would crash. Replace crash with an error on unsupported call. llvm-svn: 243461	2015-07-28 18:29:14 +00:00
Jingyue Wu	42f1d67a45	[SCEV] Apply NSW and NUW flags via poison value analysis Summary: Make Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs in some cases. This is based on reasoning about when poison from instructions with these flags would trigger undefined behavior. This gives a 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX. There does not seem to be clear agreement about when poison should be considered to propagate through instructions. In this analysis, poison propagates only in cases where that should be uncontroversial. This change makes LSR able to create induction variables for expressions like &ptr[i + offset] for loops like this: for (int i = 0; i < limit; ++i) { sum += ptr[i + offset]; } Here ptr is a 64 bit pointer and offset is a 32 bit integer. For NVPTX, LSR currently creates an induction variable for i + offset instead, which is not as fast. Improving this situation is what brings the 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX. There are more details in this discussion on llvmdev. June: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-June/thread.html#87234 July: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/thread.html#87392 Patch by Bjarke Roune Reviewers: eliben, atrick, sanjoy Subscribers: majnemer, hfinkel, jingyue, meheff, llvm-commits Differential Revision: http://reviews.llvm.org/D11212 llvm-svn: 243460	2015-07-28 18:22:40 +00:00
Alex Lorenz	305e8f6312	Add a test case for r242191 ([MMX] Use the appropriate instructions for GR64 <-> VR64 copies). This commit adds a MIR test case for the commit r242191, which was committed without one. This test case verifies that the ExpandPostRA pass expands the GR64 <-> VR64 copies into the appropriate MMX_MOV instructions. llvm-svn: 243457	2015-07-28 17:52:59 +00:00
Chih-Hung Hsieh	9843f406ec	Move unit tests to target specific directories. Differential Revision: http://reviews.llvm.org/D10522 llvm-svn: 243454	2015-07-28 17:32:49 +00:00
Alex Lorenz	deb534907e	MIR Serialization: Serialize the block address machine operands. llvm-svn: 243453	2015-07-28 17:28:03 +00:00
Sanjay Patel	94a7433cde	add tests to show broken current behavior of minsize attribute llvm-svn: 243451	2015-07-28 17:18:25 +00:00
Chih-Hung Hsieh	1e859582d6	Implement target independent TLS compatible with glibc's emutls.c. The 'common' section TLS is not implemented. Current C/C++ TLS variables are not placed in common section. DWARF debug info to get the address of TLS variables is not generated yet. clang and driver changes in http://reviews.llvm.org/D10524 Added -femulated-tls flag to select the emulated TLS model, which will be used for old targets like Android that do not support ELF TLS models. Added TargetLowering::LowerToTLSEmulatedModel as a target-independent function to convert a SDNode of TLS variable address to a function call to __emutls_get_address. Added into lib/Target//ISelLowering.cpp to call LowerToTLSEmulatedModel for TLSModel::Emulated. Although all targets supporting ELF TLS models are enhanced, emulated TLS model has been tested only for Android ELF targets. Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for emulated TLS variables. Modified DwarfCompileUnit.cpp to skip some DIE for emulated TLS variabls. TODO: Add proper DIE for emulated TLS variables. Added new unit tests with emulated TLS. Differential Revision: http://reviews.llvm.org/D10522 llvm-svn: 243438	2015-07-28 16:24:05 +00:00
Geoff Berry	c573bf7a5f	[AArch64] Match float round and convert to int instructions. Summary: Add patterns for doing floating point round with various rounding modes followed by conversion to int as a single FCVT* instruction. Reviewers: t.p.northover, jmolloy Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D11424 llvm-svn: 243422	2015-07-28 15:24:10 +00:00
Silviu Baranga	4825060059	[LAA] Add clarifying comments for the checking pointer grouping algorithm. NFC llvm-svn: 243416	2015-07-28 13:44:08 +00:00
Adhemerval Zanella	7bc3319d84	Implement __builtin_thread_pointer This path add the aarch64 lowering of __builtin_thread_pointer. It uses the already implemented AArch64ISD::THREAD_POINTER used in TLS generation. llvm-svn: 243412	2015-07-28 13:03:31 +00:00
Chandler Carruth	99ad7bb49c	[GMR] Teach GlobalsModRef to distinguish an important and safe case of no-alias with non-addr-taken globals: they cannot alias a captured pointer. If the non-global underlying object would have been a capture were it to alias the global, we can firmly conclude no-alias. It isn't reasonable for a transformation to introduce a capture in a way observable by an alias analysis. Consider, even if it were to temporarily capture one globals address into another global and then restore the other global afterward, there would be no way for the load in the alias query to observe that capture event correctly. If it observes it then the temporary capturing would have changed the meaning of the program, making it an invalid transformation. Even instrumentation passes or a pass which is synthesizing stores to global variables to expose race conditions in programs could not trigger this unless it queried the alias analysis infrastructure mid-transform, in which case it seems reasonable to return results from before the transform started. See the comments in the change for a more detailed outlining of the theory here. This should address the primary performance regression found when the non-conservatively-correct path of the alias query was disabled. Differential Revision: http://reviews.llvm.org/D11410 llvm-svn: 243405	2015-07-28 11:11:11 +00:00
Simon Pilgrim	df984f58ad	[X86][SSE] Use bitmasks instead of shuffles where possible. VPAND is a lot faster than VPSHUFB and VPBLENDVB - this patch ensures we attempt to lower to a basic bitmask before lowering to the slower byte shuffle/blend instructions. Split off from D11518. Differential Revision: http://reviews.llvm.org/D11541 llvm-svn: 243395	2015-07-28 08:54:41 +00:00
Igor Breger	47a7b95b1d	AVX512: Add encoding tests to vptestnm instructions Differential Revision: http://reviews.llvm.org/D11521 llvm-svn: 243391	2015-07-28 07:00:00 +00:00
Igor Breger	8352a0ddf2	AVX512: Implemented encoding and intrinsics for VGETEXPSS/D instructions Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11528 llvm-svn: 243390	2015-07-28 06:53:28 +00:00
Sanjoy Das	6c7a186599	FileCheck'ify some wc/grep based tests; NFCI. llvm-svn: 243378	2015-07-28 03:50:09 +00:00
Sanjay Patel	8c13e3680d	fix invalid load folding with SSE/AVX FP logical instructions (PR22371) This is a follow-up to the FIXME that was added with D7474 ( http://reviews.llvm.org/rL229531 ). I thought this load folding bug had been made hard-to-hit, but it turns out to be very easy when targeting 32-bit x86 and causes a miscompile/crash in Wine: https://bugs.winehq.org/show_bug.cgi?id=38826 https://llvm.org/bugs/show_bug.cgi?id=22371#c25 The quick fix is to simply remove the scalar FP logical instructions from the load folding table in X86InstrInfo, but that causes us to miss load folds that should be possible when lowering fabs, fneg, fcopysign. So the majority of this patch is altering those lowerings to use vector FP logical instructions (because that's all x86 gives us anyway). That lets us do the load folding legally. Differential Revision: http://reviews.llvm.org/D11477 llvm-svn: 243361	2015-07-28 00:48:32 +00:00
Sanjoy Das	3895a57b32	[LSR] Move X86 specific test case to X86/ rL243348 added the test case in the wrong directory. llvm-svn: 243357	2015-07-28 00:13:42 +00:00
Adam Nemet	54f0b83ee2	[LAA] Split out a helper to print a collection of memchecks This is effectively an NFC but we can no longer print the index of the pointer group so instead I print its address. This still lets us cross-check the section that list the checks against the section that list the groups (see how I modified the test). E.g. before we printed this: Run-time memory checks: Check 0: Comparing group 0: %arrayidxC = getelementptr inbounds i16, i16* %c, i64 %store_ind %arrayidxC1 = getelementptr inbounds i16, i16* %c, i64 %store_ind_inc Against group 1: %arrayidxA = getelementptr i16, i16* %a, i64 %ind %arrayidxA1 = getelementptr i16, i16* %a, i64 %add ... Grouped accesses: Group 0: (Low: %c High: (78 + %c)) Member: {%c,+,4}<%for.body> Member: {(2 + %c),+,4}<%for.body> Now we print this (changes are underlined): Run-time memory checks: Check 0: Comparing group (0x7f9c6040c320): ~~~~~~~~~~~~~~ %arrayidxC1 = getelementptr inbounds i16, i16* %c, i64 %store_ind_inc %arrayidxC = getelementptr inbounds i16, i16* %c, i64 %store_ind Against group (0x7f9c6040c358): ~~~~~~~~~~~~~~ %arrayidxA1 = getelementptr i16, i16* %a, i64 %add %arrayidxA = getelementptr i16, i16* %a, i64 %ind ... Grouped accesses: Group 0x7f9c6040c320: ~~~~~~~~~~~~~~ (Low: %c High: (78 + %c)) Member: {(2 + %c),+,4}<%for.body> Member: {%c,+,4}<%for.body> llvm-svn: 243354	2015-07-27 23:54:41 +00:00
Sanjoy Das	93b3504aa8	[LSR] Generate and use zero extends Summary: If a scale or a base register can be rewritten as "Zext({A,+,1})" then LSR will now consider a formula of that form in its normal cost computation. Depends on D9180 Reviewers: qcolombet, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9181 llvm-svn: 243348	2015-07-27 23:27:51 +00:00
JF Bastien	088c47ee5b	WebAssembly: add a generic CPU Summary: WebAssemblySubtarget.cpp expects a default 'generic' CPU to exist, and this seems to be prevalent with other targets. It makes sense to have something between MVP and bleeding-edge, even though for now it's the same as MVP. This removes a warning that's currently generated. Subscribers: jfb, llvm-commits, sunfish Differential Revision: http://reviews.llvm.org/D11546 llvm-svn: 243345	2015-07-27 23:25:54 +00:00
NAKAMURA Takumi	69ed7170dc	Tweak llvm/test/CodeGen/X86/virtual-registers-cleared-in-machine-functions-liveins.ll not to fail for targeting win32. llvm-svn: 243341	2015-07-27 23:01:41 +00:00
Alex Lorenz	8a1915b04e	MIR Serialization: Serialize the unnamed basic block references. This commit serializes the references from the machine basic blocks to the unnamed basic blocks. This commit adds a new attribute to the machine basic block's YAML mapping called 'ir-block'. This attribute contains the actual reference to the basic block. Reviewers: Duncan P. N. Exon Smith llvm-svn: 243340	2015-07-27 22:42:41 +00:00
Colin LeMahieu	fe36f83b11	[llvm-mc] Add --no-warn flag with -W alias to disable outputting warnings while assembling. llvm-svn: 243338	2015-07-27 22:39:14 +00:00
Colin LeMahieu	fe2c8b8015	[llvm-mc] Pushing plumbing through for --fatal-warnings flag. llvm-svn: 243334	2015-07-27 21:56:53 +00:00
Sanjoy Das	5dab205ced	[IndVars] Make loop varying predicates loop invariant. Summary: Was D9784: "Remove loop variant range check when induction variable is strictly increasing" This change re-implements D9784 with the two differences: 1. It does not use SCEVExpander and does not generate new instructions. Instead, it does a quick local search for existing `llvm::Value`s that it needs when modifying the `icmp` instruction. 2. It is more general -- it deals with both increasing and decreasing induction variables. I've added all of the tests included with D9784, and two more. As an example on what this change does (copied from D9784): Given C code: ``` for (int i = M; i < N; i++) // i is known not to overflow if (i < 0) break; a[i] = 0; } ``` This transformation produces: ``` for (int i = M; i < N; i++) if (M < 0) break; a[i] = 0; } ``` Which can be unswitched into: ``` if (!(M < 0)) for (int i = M; i < N; i++) a[i] = 0; } ``` I went back and forth on whether the top level logic should live in `SimplifyIndvar::eliminateIVComparison` or be put into its own routine. Right now I've put it under `eliminateIVComparison` because even though the `icmp` is not eliminated, it no longer is an IV comparison. I'm open to putting it in its own helper routine if you think that is better. Reviewers: reames, nicholas, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11278 llvm-svn: 243331	2015-07-27 21:42:49 +00:00
Simon Pilgrim	c363e7d8e0	[X86][SSE] Added shuffle tests to demonstrate missed bitmask. llvm-svn: 243324	2015-07-27 20:41:57 +00:00
Alex Lorenz	5b0d5f6f26	MIR Serialization: Serialize the '.cfi_def_cfa_register' CFI instruction. llvm-svn: 243322	2015-07-27 20:39:03 +00:00
Bruno Cardoso Lopes	b20841df44	Revert "[PeepholeOptimizer] Look through PHIs to find additional register sources" Still breaks some ARM buildbots. This reverts r243271. llvm-svn: 243318	2015-07-27 20:26:04 +00:00
Akira Hatanaka	2541e0241c	[AArch64] Remove check for Darwin that was needed to decide if x18 should be reserved. The decision to reserve x18 is going to be made solely by the front-end, so it isn't necessary to check if the OS is Darwin in the backend. llvm-svn: 243308	2015-07-27 19:18:47 +00:00
Juergen Ributzka	93d67463a3	[AArch64][FastISel] Add more truncation tests. This is a follow-up to r243198 and adds more truncation tests. llvm-svn: 243304	2015-07-27 19:00:23 +00:00
Simon Pilgrim	15c0a59463	[InstCombine][X86][SSE] Replace sign/zero extension intrinsics with native IR Now that we are generating sane codegen for vector sext/zext nodes on SSE targets, this patch uses instcombine to replace the SSE41/AVX2 pmovsx and pmovzx intrinsics with the equivalent native IR code. Differential Revision: http://reviews.llvm.org/D11503 llvm-svn: 243303	2015-07-27 18:52:15 +00:00
Matt Arsenault	95365ca482	Fix assert when inlining a constantexpr addrspacecast The pointer size of the addrspacecasted pointer might not have matched, so this would have hit an assert in accumulateConstantOffset. I think this was here to allow constant folding of a load of an addrspacecasted constant. Accumulating the offset through the addrspacecast doesn't make much sense, so something else is necessary to allow folding the load through this cast. llvm-svn: 243300	2015-07-27 18:31:03 +00:00
Marek Olsak	93df060871	AMDGPU: don't match vgpr loads for constant loads Author: Dave Airlie <airlied@redhat.com> In order to implement indirect sampler loads, we don't want to match on a VGPR load but an SGPR one for constants, as we cannot feed VGPRs to the sampler only SGPRs. this should be applicable for llvm 3.7 as well. llvm-svn: 243294	2015-07-27 18:16:08 +00:00
Alex Lorenz	10b23525cc	Reset the virtual registers in liveins when clearing the virtual registers. This commit zeroes out the virtual register references in the machine function's liveins in the class 'MachineRegisterInfo' when the virtual register definitions are cleared. Reviewers: Matthias Braun llvm-svn: 243290	2015-07-27 17:51:59 +00:00
Alex Lorenz	12045a4b59	MIR Serialization: Serialize the machine function's liveins. Reviewers: Duncan P. N. Exon Smith llvm-svn: 243288	2015-07-27 17:42:45 +00:00
Silviu Baranga	de38070587	The tests added in r243270 require asserts to be enabled llvm-svn: 243274	2015-07-27 15:22:49 +00:00
Silviu Baranga	65bdb6788b	Fix the tests added in r243270. Use 2>&1 instead of \|& llvm-svn: 243273	2015-07-27 15:08:55 +00:00
Bruno Cardoso Lopes	669c921bfd	[PeepholeOptimizer] Look through PHIs to find additional register sources Reapply r242295 with fixes in the implementation. - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to lookup into multiple sources returnted by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coaslescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 243271	2015-07-27 14:39:46 +00:00
Silviu Baranga	7581d22512	[ARM/AArch64] Fix cost model for interleaved accesses Summary: Fix the cost of interleaved accesses for ARM/AArch64. We were calling getTypeAllocSize and using it to check the number of bits, when we should have called getTypeAllocSizeInBits instead. This would pottentially cause the vectorizer to generate loads/stores and shuffles which cannot be matched with an interleaved access instruction. No performance changes are expected for now since matching/generating interleaved accesses is still disabled by default. Reviewers: rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D11524 llvm-svn: 243270	2015-07-27 14:39:34 +00:00
Marek Olsak	1354b87695	AMDGPU/SI: Fix the V_FRACT_F64 SI bug workaround This is a candidate for 3.7. llvm-svn: 243263	2015-07-27 11:37:42 +00:00
Jingyue Wu	bfefff555e	Roll forward r243250 r243250 appeared to break clang/test/Analysis/dead-store.c on one of the build slaves, but I couldn't reproduce this failure locally. Probably a false positive as I saw this test was broken by r243246 or r243247 too but passed later without people fixing anything. llvm-svn: 243253	2015-07-26 19:10:03 +00:00
Jingyue Wu	84879b71a9	Revert r243250 breaks tests llvm-svn: 243251	2015-07-26 18:30:13 +00:00
Jingyue Wu	bf485f059c	[TTI/CostModel] improve TTI::getGEPCost and use it in CostModel::getInstructionCost Summary: This patch updates TargetTransformInfoImplCRTPBase::getGEPCost to consider addressing modes. It now returns TCC_Free when the GEP can be completely folded to an addresing mode. I started this patch as I refactored SLSR. Function isGEPFoldable looks common and is indeed used by some WIP of mine. So I extracted that logic to getGEPCost. Furthermore, I noticed getGEPCost wasn't directly tested anywhere. The best testing bed seems CostModel, but its getInstructionCost method invokes getAddressComputationCost for GEPs which provides very coarse estimation. So this patch also makes getInstructionCost call the updated getGEPCost for GEPs. This change inevitably breaks some tests because the cost model changes, but nothing looks seriously wrong -- if we believe the new cost model is the right way to go, these tests should be updated. This patch is not perfect yet -- the comments in some tests need to be updated. I want to know whether this is a right approach before fixing those details. Reviewers: chandlerc, hfinkel Subscribers: aschwaighofer, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D9819 llvm-svn: 243250	2015-07-26 17:28:13 +00:00

... 2 3 4 5 6 ...

31468 Commits