llvm-project

Author	SHA1	Message	Date
Chandler Carruth	24e3b69cbd	[x86] Teach the new vector shuffle lowering to fall back on AVX-512 vectors. Someone will need to build the AVX512 lowering, which should follow AVX1 and AVX2 very closely for AVX512F and AVX512BW resp. I've added a dummy test which is a port of the v8f32 and v8i32 tests from AVX and AVX2 to v8f64 and v8i64 tests for AVX512F and AVX512BW. Hopefully this is enough information for someone to implement proper lowering here. If not, I'll be happy to help, but right now the AVX-512 support isn't a priority for me. llvm-svn: 218583	2014-09-28 23:53:10 +00:00
Chandler Carruth	abe742e8fb	[x86] Fix the new vector shuffle lowering's use of VSELECT for AVX2 lowerings. This was hopelessly broken. First, the x86 backend wants '-1' to be the element value representing true in a boolean vector, and second the operand order for VSELECT is backwards from the actual x86 instructions. To make matters worse, the backend is just using '-1' as the true value to get the high bit to be set. It doesn't actually symbolically map the '-1' to anything. But on x86 this isn't quite how it works: there only the high bit is relevant. As a consequence weird non-'-1' values like 0x80 actually "work" once you flip the operands to be backwards. Anyways, thanks to Hal for helping me sort out what these should be. llvm-svn: 218582	2014-09-28 23:23:55 +00:00
Matt Arsenault	93ffe58f90	Add MachineOperand::ChangeToFPImmediate and setFPImm llvm-svn: 218579	2014-09-28 19:24:59 +00:00
Chandler Carruth	6578f9208b	[x86] Fix a really silly bug that I introduced fixing another bug in the new vector shuffle target DAG combines -- it helps to actually test for the value you want rather than just using an integer in a boolean context. Have I mentioned that I loathe implicit conversions recently? :: sigh :: llvm-svn: 218576	2014-09-28 06:11:04 +00:00
Chandler Carruth	b10c6b8e9e	[x86] Fix yet another bug in the new vector shuffle lowering's handling of widening masks. We can't widen a zeroing mask unless both elements that would be merged are either zeroed or undef. This is the only way to widen a mask if it has a zeroed element. Also clean up the code here by ordering the checks in a more logical way and by using the symbolic values for undef and zero. I'm actually torn on using the symbolic values because the existing code is littered with the assumption that -1 is undef, and moreover that entries '< 0' are the special entries. While that works with the values given to these constants, using the symbolic constants actually makes it a bit more opaque why this is the case. llvm-svn: 218575	2014-09-28 03:30:25 +00:00
Hans Wennborg	ba80b5d43c	WinCOFFObjectWriter.cpp: make write_uint32_le more efficient llvm-svn: 218574	2014-09-28 00:22:27 +00:00
James Molloy	463db9a77c	[AArch64] Redundant store instructions should be removed as dead code If there is a store followed by a store with the same value to the same location, then the store is dead/noop. It can be removed. This problem is found in spec2006-197.parser. For example, stur w10, [x11, #-4] stur w10, [x11, #-4] Then one of the two stur instructions can be removed. Patch by David Xu! llvm-svn: 218569	2014-09-27 17:02:54 +00:00
Yaron Keren	7b4133ac81	Fix llvm::huge_valf multiple initializations with Visual C++. llvm::huge_valf is defined in a header file, so it is initialized multiple times in every compiled unit upon program startup. With non-VC compilers huge_valf is set to a HUGE_VALF which the compiler can probably optimize out. With VC numeric_limits<float>::infinity() does not return a number but a runtime structure member which theoretically may change between calls so the compiler does not optimize out the initialization and it happens many times. It can be easily seen by placing a breakpoint on the initialization line. This patch moves llvm::huge_valf initialization to a source file instead of the header. llvm-svn: 218567	2014-09-27 14:41:29 +00:00
Chandler Carruth	f4b9e6b9d9	[x86] Fix yet another issue with widening vector shuffle elements. I spotted this by inspection when debugging something else, so I have no test case what-so-ever, and am not even sure it is possible to realistically trigger the bug. But this is what was intended here. llvm-svn: 218565	2014-09-27 08:40:33 +00:00
Craig Topper	5ed88de99b	Update test case to match minor formatting change introduced in r218563. llvm-svn: 218564	2014-09-27 05:36:53 +00:00
Craig Topper	5546f8c8cc	Reduce code duplication a bit. llvm-svn: 218563	2014-09-27 05:26:42 +00:00
Chandler Carruth	4d03be1717	[x86] Fix terrible bugs everywhere in the new vector shuffle lowering and in the target shuffle combining when trying to widen vector elements. Previously only one of these was correct, and we didn't correctly propagate zeroing target shuffle masks (which have a different sentinel value from undef in non- target shuffle masks now). This isn't just a missed optimization, this caused us to drop zeroing shuffles on the floor and miscompile code. The added test case is one example of that. There are other fixes to the test suite as a consequence of this as well as restoring the undef elements in some of the masks that were lost when I brought sanity to the actual value of the undef and zero sentinels. I've also just cleaned up some of the PSHUFD and PSHUFLW and PSHUFHW combining code, but that code really needs to go. It was a nice initial attempt, but it isn't very principled and the recursive shuffle combiner is much more powerful. llvm-svn: 218562	2014-09-27 04:42:44 +00:00
Chandler Carruth	81e6b29f03	[x86] Flip the sentinel values used in the target shuffle mask decoding to significantly more sane sentinels. Notably, everywhere else in the backend's representation of shuffles uses '-1' to represent undef. The target shuffle masks really shouldn't diverge from that, especially as in a few places they are manipulated by shared code. This causes us to lose some undef lanes in various test masks. I want to get these back, but technically it isn't invalid and there are a lot of bugs here so I want to try to establish a saner baseline for fixing some of the bugs by aligning the specific sentinel values used. llvm-svn: 218561	2014-09-27 04:42:39 +00:00
Craig Topper	5996da2032	Fix TableGen -gen-disassembler output for bit fields with an offset. This fixes bit assignments like this Inst{7-0} = Foo{9-2} Patch by Steve King. llvm-svn: 218560	2014-09-27 04:38:02 +00:00
Sanjay Patel	bdf1e38856	Refactor reciprocal and reciprocal square root estimate into target-independent functions (part 2). This is purely refactoring. No functional changes intended. PowerPC is the only target that is currently using this interface. The ultimate goal is to allow targets other than PowerPC (certainly X86 and Aarch64) to turn this: z = y / sqrt(x) into: z = y * rsqrte(x) And: z = y / x into: z = y * rcpe(x) using whatever HW magic they can use. See http://llvm.org/bugs/show_bug.cgi?id=20900 . There is one hook in TargetLowering to get the target-specific opcode for an estimate instruction along with the number of refinement steps needed to make the estimate usable. Differential Revision: http://reviews.llvm.org/D5484 llvm-svn: 218553	2014-09-26 23:01:47 +00:00
Richard Smith	2b91a7f80f	Add LLVM_ENABLE_MODULES flag to CMake to enable building with C++ modules. llvm-svn: 218551	2014-09-26 22:40:15 +00:00
David Majnemer	601327c4b9	llvm-vtabledump: Further simplification Hoist out calls to getSection and getContents. No functional change intended. llvm-svn: 218550	2014-09-26 22:32:19 +00:00
David Majnemer	dac39857d6	Object: BSS/virtual sections don't have contents Users of getSectionContents shouldn't try to pass in BSS or virtual sections. In all instances, this is a bug in the code calling this routine. N.B. Some COFF implementations (like CL) will mark their BSS sections as taking space on disk. This would confuse COFFObjectFile into thinking the section is larger than the file. llvm-svn: 218549	2014-09-26 22:32:16 +00:00
Yaron Keren	abce3c4e18	clang-format of ChangeStdinToBinary & ChangeStdoutToBinary. llvm-svn: 218547	2014-09-26 22:27:11 +00:00
Kevin Enderby	8597488e5e	Update llvm-objdump’s Mach-O symbolizer code to print the name of symbol stubs. So in fully linked images when a call is made through a stub it now gets a comment like the following in the disassembly: callq 0x100000f6c ## symbol stub for: _printf indicating the call is to a symbol stub and which symbol it is for. This is done for branch reference types and seeing if the branch target is in a stub section and if so using the indirect symbol table entry for that stub and using that symbol table entry's symbol name. llvm-svn: 218546	2014-09-26 22:20:44 +00:00
Richard Smith	e06ffe2c2d	Remove definition of LLVM_VERSION_INFO; this macro is not used by any of the files in this directory. If it should be defined anywhere, it should be defined when building lib/LTO/LTOCodeGenerator.cpp, but we've not had it defined there for quite some time, so that doesn't really seem to be very important. (It also would slow down the modules build by creating extra module variants.) llvm-svn: 218544	2014-09-26 21:53:12 +00:00
Richard Smith	ca9ae10c1a	Fix CMake warning CMP0054: don't quote a variable name that is intended to be expanded; future versions of cmake may not expand the variable in this case. llvm-svn: 218543	2014-09-26 21:35:48 +00:00
Richard Smith	571b0b9ede	Fix misinterpretation of CMake rule found by a CMake warning (related to CMP0054). lldb sets the variable SHARED_LIBRARY to 1, which breaks this conditional, because older versions of CMake interpret if ("${t}" STREQUAL "SHARED_LIBRARY") as meaning if ("${t}" STREQUAL "1") in this case. Change the conditional so it does the right thing with both old and new CMakes. llvm-svn: 218542	2014-09-26 21:33:05 +00:00
Chandler Carruth	f572f3b2c0	[x86] Fix a moderately terrifying bug in the new 128-bit shuffle logic that managed to elude all of my fuzz testing historically. =/ Something changed to allow this code path to actually be exercised and it was doing bad things. It is especially heavily exercised by the patterns that emerge when doing AVX shuffles that end up lowered through the 128-bit code path. llvm-svn: 218540	2014-09-26 20:41:45 +00:00
Chad Rosier	7b974b73ae	[IndVar] Don't widen loop compare unless IV user is sign extended. PR21030 llvm-svn: 218539	2014-09-26 20:05:35 +00:00
Matt Arsenault	2dd3129b0a	R600/SI: Use break instead of continue If an instruction doesn't have src1, it doesn't have src2 llvm-svn: 218536	2014-09-26 17:55:14 +00:00
Matt Arsenault	ed8a3e0a08	R600/SI: Add strict check lines to div_scale tests. This has weird operand requirements so it's worthwhile to have very strict checks for its operands. Add different combinations of SGPR operands. llvm-svn: 218535	2014-09-26 17:55:11 +00:00
Matt Arsenault	a276c3e053	R600/SI: Add a note about the order of the operands to div_scale llvm-svn: 218534	2014-09-26 17:55:09 +00:00
Matt Arsenault	ee522bf23e	R600/SI: Move finding SGPR operand to move to separate function llvm-svn: 218533	2014-09-26 17:55:06 +00:00
Matt Arsenault	6a0919fb9b	R600/SI: Allow same SGPR to be used for multiple operands Instead of moving the first SGPR that is different than the first, legalize the operand that requires the fewest moves if one SGPR is used for multiple operands. This saves extra moves and is also required for some instructions which require that the same operand be used for multiple operands. llvm-svn: 218532	2014-09-26 17:55:03 +00:00
Matt Arsenault	cb0ac3d1fb	R600/SI: Partially move operand legalization to post-isel hook. Disable the SGPR usage restriction parts of the DAG legalizeOperands. It now should only be doing immediate folding until it can be replaced later. The real legalization work is now done by the other SIInstrInfo::legalizeOperands llvm-svn: 218531	2014-09-26 17:54:59 +00:00
Matt Arsenault	92befe7996	R600/SI: Implement findCommutedOpIndices The base implementation of commuteInstruction is used in some cases, but it turns out this has been broken for a long time since modifiers were inserted between the real operands. The base implementation of commuteInstruction also fails on immediates, which also needs to be fixed. llvm-svn: 218530	2014-09-26 17:54:54 +00:00
Matt Arsenault	5885bef6cf	R600/SI: Don't move operands that are required to be SGPRs e.g. v_cndmask_b32 requires the condition operand be an SGPR. If one of the source operands were an SGPR, that would be considered the one SGPR use and the condition operand would be illegally moved. llvm-svn: 218529	2014-09-26 17:54:52 +00:00
Matt Arsenault	0bea8d830e	R600/SI: Don't assert on exotic operand types This needs a test, but I'm not sure if it is currently possible and I originally hit it due to a bug. Right now the only global address operands have no reason to be VALU instructions, although it theoretically could be a problem. llvm-svn: 218528	2014-09-26 17:54:46 +00:00
Matt Arsenault	aff65fbca5	R600/SI: Fix using wrong operand indices when commuting No test since the current SIISelLowering::legalizeOperands effectively hides this, and the general uses seem to only fire on SALU instructions which don't have modifiers between the operands. When trying to use legalizeOperands immediately after instruction selection, it now sees a lot more patterns it did not see before which break on this. llvm-svn: 218527	2014-09-26 17:54:43 +00:00
Matt Arsenault	e50c1c4a64	R600/SI: Remove apparently dead code in legalizeOperands No tests hit this, and I don't see any way a GlobalAddress node would survive beyond lowering on SI. If it would, the move should probably be inserted by selection. llvm-svn: 218526	2014-09-26 17:54:38 +00:00
David Peixotto	472b05b36c	Ignore annotation function calls in cost computation The annotation instructions are dropped during codegen and have no impact on size. In some cases, the annotations were preventing the unroller from unrolling a loop because the annotation calls were pushing the cost over the unrolling threshold. Differential Revision: http://reviews.llvm.org/D5335 llvm-svn: 218525	2014-09-26 17:48:40 +00:00
Chandler Carruth	acd1906446	[x86] The mnemonic is SHUFPS not SHUPFS. =[ I'm very bad at spelling sadly. llvm-svn: 218524	2014-09-26 17:27:40 +00:00
Chandler Carruth	0c9ee10d01	[x86] In the new vector shuffle lowering, when trying to do another layer of tie-breaking sorting, it really helps to check that you're in a tie first. =] Otherwise the whole thing cycles infinitely. Test case added, another one found through fuzz testing. llvm-svn: 218523	2014-09-26 17:24:26 +00:00
Chandler Carruth	5afd4c2603	[x86] Fix a large collection of bugs that crept in as I fleshed out the AVX support. New test cases included. Note that none of the existing test cases covered these buggy code paths. =/ Also, it is clear from this that SHUFPS and SHUFPD are the most bug prone shuffle instructions in x86. =[ These were all detected by fuzz-testing. (I <3 fuzz testing.) llvm-svn: 218522	2014-09-26 17:11:02 +00:00
Renato Golin	36c626e33f	Elide repeated register operand in Thumb1 instructions This patch makes the ARM backend transform 3 operand instructions such as 'adds/subs' to the 2 operand version of the same instruction if the first two register operands are the same. Example: 'adds r0, r0, #1' is transformed to 'adds r0, #1'. Currently for some instructions such as 'adds' if you try to assemble 'adds r0, r0, #8' for thumb v6m the assembler would throw an error message because the immediate cannot be encoded using 3 bits. The backend should be smart enough to transform the instruction to 'adds r0, #8', which allows for larger immediate constants. Patch by Ranjeet Singh. llvm-svn: 218521	2014-09-26 16:14:29 +00:00
Andrea Di Biagio	196e873cdc	[X86][SchedModel] SSE reciprocal square root instruction latencies. The SSE rsqrt instruction (a fast reciprocal square root estimate) was grouped in the same scheduling IIC_SSE_SQRT* class as the accurate (but very slow) SSE sqrt instruction. For code which uses rsqrt (possibly with newton-raphson iterations) this poor scheduling was affecting performances. This patch splits off the rsqrt instruction from the sqrt instruction scheduling classes and creates new IIC_SSE_RSQER* classes with latency values based on Agner's table. Differential Revision: http://reviews.llvm.org/D5370 Patch by Simon Pilgrim. llvm-svn: 218517	2014-09-26 12:56:44 +00:00
Frederic Riss	82d5c5139f	Revert "Store TypeUnits in a SmallVector<DWARFUnitSection> instead of a single DWARFUnitSection." This reverts commit r218513. Buildbots using libstdc++ issue an error when trying to copy SmallVector<std::unique_ptr<>>. Revert the commit until we have a fix. llvm-svn: 218514	2014-09-26 12:34:06 +00:00
Frederic Riss	6b65eb0642	Store TypeUnits in a SmallVector<DWARFUnitSection> instead of a single DWARFUnitSection. Summary: There will be multiple TypeUnits in an unlinked object that will be extracted from different sections. Now that we have DWARFUnitSection that is supposed to represent an input section, we need a DWARFUnitSection<TypeUnit> per input .debug_types section. Once this is done, the interface is homogenous and we can move the Section parsing code into DWARFUnitSection. Reviewers: samsonov, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5482 llvm-svn: 218513	2014-09-26 12:15:40 +00:00
Daniel Sanders	13496c4102	Fix unused variable warning added in r218509 llvm-svn: 218510	2014-09-26 10:45:26 +00:00
Daniel Sanders	b3ca3388ca	[mips] Generalize the handling of f128 return values to support f128 arguments. Summary: This will allow us to handle f128 arguments without duplicating code from CCState::AnalyzeFormalArguments() or CCState::AnalyzeCallOperands(). No functional change. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5292 llvm-svn: 218509	2014-09-26 10:06:12 +00:00
Robert Khasanov	6d62c0202b	[AVX512] Added load/store from BW/VL subsets to Register2Memory opcode tables. Added lowering tests for these instructions. llvm-svn: 218508	2014-09-26 09:48:50 +00:00
David Majnemer	6887a251f3	llvm-vtabledump: Small cleanup llvm-svn: 218505	2014-09-26 08:01:23 +00:00
Jyoti Allur	223602db82	fix a typo in documentation index. llvm-svn: 218504	2014-09-26 06:59:15 +00:00
David Majnemer	56167c3e95	llvm-vtabledump: strip trailing NUL bytes llvm-svn: 218502	2014-09-26 05:50:45 +00:00
David Majnemer	ec44e4d053	Fix build breakage on MSVC 2013 llvm-svn: 218499	2014-09-26 04:47:54 +00:00
David Majnemer	1ac52ebfe2	llvm-vtabledump: Dump RTTI structures for the MS ABI llvm-svn: 218498	2014-09-26 04:21:51 +00:00
David Majnemer	de36075b41	Target: Fix build breakage. No functional change intended. llvm-svn: 218497	2014-09-26 02:57:05 +00:00
David Majnemer	4b3c90f209	Support: Remove undefined behavior from &raw_ostream::operator<< Don't negate signed integer types in &raw_ostream::operator<<(const FormattedNumber &FN). llvm-svn: 218496	2014-09-26 02:48:14 +00:00
David Xu	beff8bf746	Revert patch of r218493, delete the test case llvm-svn: 218495	2014-09-26 02:40:54 +00:00
David Xu	418da223dd	Revert patch of r218493 llvm-svn: 218494	2014-09-26 02:28:03 +00:00
David Xu	64f661ee0b	Redundant store instructions should be removed as dead code llvm-svn: 218493	2014-09-26 02:02:09 +00:00
Eric Christopher	a9353d1798	Add the first backend support for on demand subtarget creation based on the Function. This is currently used to implement mips16 support in the mips backend via the existing module pass resetting the subtarget. Things to note: a) This involved running resetTargetOptions before creating a new subtarget so that code generation options like soft-float could be recognized when creating the new subtarget. This is to deal with initialization code in isel lowering that only paid attention to the initial value. b) Many of the existing testcases weren't using the soft-float feature correctly. I've corrected these based on the check values assuming that was the desired behavior. c) The mips port now pays attention to the target-cpu and target-features strings when generating code for a particular function. I've removed these from one function where the requested cpu and features didn't match the check lines in the testcase. llvm-svn: 218492	2014-09-26 01:44:08 +00:00
Eric Christopher	1e9aecd69c	Add a FIXME to TargetMachine to remove the function specific code generation options from TargetMachine. This will depend upon Function + TargetSubtargetInfo based code generation at which point resetTargetOptions and this code can be removed. llvm-svn: 218491	2014-09-26 01:44:05 +00:00
Eric Christopher	f2379a840e	Have setSubtarget take a const subtarget. llvm-svn: 218490	2014-09-26 01:28:13 +00:00
Eric Christopher	3976f78247	Move resetTargetOptions from taking a MachineFunction to a Function since we are accessing the TargetMachine that we're a member function of. llvm-svn: 218489	2014-09-26 01:28:10 +00:00
Matt Arsenault	0c652c3fbc	R600: Avoid repeated check lines llvm-svn: 218487	2014-09-26 01:12:36 +00:00
Matt Arsenault	3a99759498	R600/SI: Fix emitting trailing whitespace after s_waitcnt llvm-svn: 218486	2014-09-26 01:09:46 +00:00
Adam Nemet	ce465421d7	[AVX512] Simplify use of !con() No change in X86.td.expanded. llvm-svn: 218485	2014-09-26 00:53:12 +00:00
Adam Nemet	f7988d7364	[AVX512] Pull pattern for subvector extract into the instruction definition No functional change. I initially thought that pulling the Pat<> into the instruction pattern was not possible because it was doing a transform on the index in order to convert it from a per-element (extract_subvector) index into a per-chunk (vextract*x4) index. Turns out this also works inside the pattern because the vextract_extract PatFrag has an OperandTransform EXTRACT_get_vextract{128,256}_imm, so the index in $idx goes through the same conversion. The existing test CodeGen/X86/avx512-insert-extract.ll extended in the previous commit provides coverage for this change. llvm-svn: 218480	2014-09-25 23:48:49 +00:00
Adam Nemet	8d5354eaa2	[AVX512] Make vextractx4/vinsertx4 tests check for the index as well Extend test so that it provides coverage for the next commit. llvm-svn: 218479	2014-09-25 23:48:47 +00:00
Adam Nemet	55536c6a8f	[AVX512] Refactor subvector extracts No functional change. These are now implemented as two levels of multiclasses heavily relying on the new X86VectorVTInfo class. The multiclass at the first level that is called with float or int provides the 128 or 256 bit subvector extracts. The second level provides the register and memory variants and some more Pat<>s. I've compared the td.expanded files before and after. One change is that ExeDomain for 64x4 is SSEPackedDouble now. I think this is correct, i.e. a bugfix. (BTW, this is the change that was blocked on the recent tablegen fix. The class-instance values X86VectorVTInfo inside vextract_for_type weren't properly evaluated.) Part of <rdar://problem/17688758> llvm-svn: 218478	2014-09-25 23:48:45 +00:00
Adam Nemet	6ea09eb148	[AVX512] Fix typo F->I in VEXTRACTF32x4rr. llvm-svn: 218477	2014-09-25 23:48:42 +00:00
Hal Finkel	15c9b195b2	Add SDAG TableGen definitions for BR_CC Add SelectionDAG TableGen definitions for BR_CC so that targets can instruction-select BR_CC using TableGen pattern matching. Patch by deadal nix. llvm-svn: 218476	2014-09-25 23:34:18 +00:00
Matt Arsenault	42d1565844	R600: Fix some missing conversion testcases llvm-svn: 218474	2014-09-25 23:16:18 +00:00
Matt Arsenault	c16fafb24d	Remove duplicated RUN lines in middle of test llvm-svn: 218473	2014-09-25 23:16:14 +00:00
Bruno Cardoso Lopes	d04f7596e7	[MachineSink+PGO] Teach MachineSink to use BlockFrequencyInfo Machine Sink uses loop depth information to select between successors BBs to sink machine instructions into, where BBs within smaller loop depths are preferable. This patch adds support for choosing between successors by using profile information from BlockFrequencyInfo instead, whenever the information is available. Tested it under SPEC2006 train (average of 30 runs for each program); ~1.5% execution speedup in average on x86-64 darwin. <rdar://problem/18021659> llvm-svn: 218472	2014-09-25 23:14:26 +00:00
David Majnemer	eac48b61f4	Object: Add range iterators for Archive children No functional change intended. llvm-svn: 218471	2014-09-25 22:56:54 +00:00
Nick Kledzik	d49c3ad9bc	[Support] Fix Format.h to build on Windows llvm-svn: 218467	2014-09-25 21:00:38 +00:00
Nick Kledzik	e648037449	[Support] Add type-safe alternative to llvm::format() llvm::format() is somewhat unsafe. The compiler does not check that integer parameter size matches the %x or %d size and it does not complain when a StringRef is passed for a %s. And correctly using a StringRef with format() is ugly because you have to convert it to a std::string then call c_str(). The cases where llvm::format() is useful is controlling how numbers and strings are printed, especially when you want fixed width output. This patch adds some new formatting functions to raw_streams to format numbers and StringRefs in a type safe manner. Some examples: OS << format_hex(255, 6) => "0x00ff" OS << format_hex(255, 4) => "0xff" OS << format_decimal(0, 5) => "    0" OS << format_decimal(255, 5) => "  255" OS << right_justify(Str, 5) => "  foo" OS << left_justify(Str, 5) => "foo  " llvm-svn: 218463	2014-09-25 20:30:58 +00:00
Anton Yartsev	3fa65d4ef4	Refactoring: raw pointer -> unique_ptr llvm-svn: 218462	2014-09-25 19:55:58 +00:00
Tom Stellard	1fa1ce6112	ARM: Remove unneeded check for MI->hasPostISelHook() llvm-svn: 218459	2014-09-25 18:59:23 +00:00
Tom Stellard	529efcf9d0	SelectionDAG: Remove #if NDEBUG from check for a post-isel hook The InstrEmitter will skip the check of MI.hasPostISelHook() before calling AdjustInstrPostInstrSelection() when NDEBUG is not defined. This was added in r140228, and I'm not sure if it is intentional or not, but it is a likely source for bugs, because it means with Release+Asserts builds you can forget to set the hasPostISelHook flag on TableGen definitions and AdjustInstrPostInstrSelection() will still be called. llvm-svn: 218458	2014-09-25 18:59:22 +00:00
Tom Stellard	7980fc8562	R600/SI: Add support for global atomic add llvm-svn: 218457	2014-09-25 18:30:26 +00:00
Robin Morisset	810739d174	Lower idempotent RMWs to fence+load Summary: I originally tried doing this specifically for X86 in the backend in D5091, but it was rather brittle and generally running too late to be general. Furthermore, other targets may want to implement similar optimizations. So I reimplemented it at the IR-level, fitting it into AtomicExpandPass as it interacts with that pass (which could not be cleanly done before at the backend level). This optimization relies on a new target hook, which is only used by X86 for now, as the correctness of the optimization on other targets remains an open question. If it is found correct on other targets, it should be trivial to enable for them. Details of the optimization are discussed in D5091. Test Plan: make check-all + a new test Reviewers: jfb Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5422 llvm-svn: 218455	2014-09-25 17:27:43 +00:00
Aaron Ballman	8cb2cae03a	Since the DisasmMemoryObject only operates on const data, it now only accepts a const data pointer. This silences a -Wcast-qual warning. llvm-svn: 218454	2014-09-25 14:02:43 +00:00
Sid Manning	31f7125562	Add missing attributes !cmp.[eq,gt,gtu] instructions. These instructions do not indicate they are extendable or the number of bits in the extendable operand. Rename to match architected names. Add a testcase for the intrinsics. llvm-svn: 218453	2014-09-25 13:09:54 +00:00
Daniel Sanders	621589e7c0	Add llvm_unreachables() for [ASZ]ExtUpper to X86FastISel.cpp to appease the buildbots. llvm-svn: 218452	2014-09-25 13:08:51 +00:00
Daniel Sanders	ae275e38a2	[mips] Add CCValAssign::[ASZ]ExtUpper and CCPromoteToUpperBitsInType and handle struct's correctly on big-endian N32/N64 return values. Summary: The N32/N64 ABI's require that structs passed in registers are laid out such that spilling the register with 'sd' places the struct at the lowest address. For little endian this is trivial but for big-endian it requires that structs are shifted into the upper bits of the register. We also require that structs passed in registers have the 'inreg' attribute for big-endian N32/N64 to work correctly. This is because the tablegen-erated calling convention implementation only has access to the lowered form of struct arguments (one or more integers of up to 64-bits each) and is unable to determine the original type. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5286 llvm-svn: 218451	2014-09-25 12:15:05 +00:00
Renato Golin	f5dd1dacb6	Add aliases for VAND imm to VBIC ~imm On ARM NEON, VAND with immediate (16/32 bits) is an alias to VBIC ~imm with the same type size. Adding that logic to the parser, and generating VBIC instructions from VAND asm files. This patch also fixes the validation routines for NEON splat immediates which were wrong. Fixes PR20702. llvm-svn: 218450	2014-09-25 11:31:24 +00:00
Chandler Carruth	0a6e961efd	[x86] Teach the new vector shuffle lowering to use AVX2 instructions for v4f64 and v8f32 shuffles when they are lane-crossing. We have fully general lane-crossing permutation functions in AVX2 that make this easy. Part of this also changes exactly when and how these vectors are split up when we don't have AVX2. This isn't always a win but it usually is a win, so on the balance I think it's better. The primary regressions are all things that just need to be fixed anyways such as modeling when a blend can be completely accomplished via VINSERTF128, etc. Also, this highlights one of the few remaining big features: we do a really poor job of inserting elements into AVX registers efficiently. This completes almost all of the big tricks I have in mind for AVX2. The only things left that I plan to add: 1) element insertion smarts 2) palignr and other fairly specialized lowerings when they happen to apply llvm-svn: 218449	2014-09-25 11:03:55 +00:00
Sylvestre Ledru	1623b463ae	Update my previous commit to fit 80 cols... llvm-svn: 218448	2014-09-25 10:58:16 +00:00
Sylvestre Ledru	b5984fabbd	Details that -debug-only is not available when LLVM is built with --enable-optimized llvm-svn: 218447	2014-09-25 10:57:00 +00:00
Chandler Carruth	e91d68c475	[x86] Teach the new vector shuffle lowering a fancier way to lower 256-bit vectors with lane-crossing. Rather than immediately decomposing to 128-bit vectors, try flipping the 256-bit vector lanes, shuffling them and blending them together. This reduces our worst case shuffle by a pretty significant margin across the board. llvm-svn: 218446	2014-09-25 10:21:15 +00:00
Oliver Stannard	3256b26ef2	[Thumb2] BXJ should be undefined for v7M, v8A The Thumb2 BXJ instruction (Branch and Exchange Jazelle) is not defined for v7M or v8A. It is defined for all other Thumb2-supporting architectures (v6T2, v7A and v7R). llvm-svn: 218445	2014-09-25 10:02:05 +00:00
Chandler Carruth	02387122e0	[x86] Fix an oversight in the v8i32 path of the new vector shuffle lowering where it only used the mask of the low 128-bit lane rather than the entire mask. This allows the new lowering to correctly match the unpack patterns for v8i32 vectors. For reference, the reason that we check for the entire mask rather than checking the repeated mask is because the repeated masks don't abide by all of the invariants of normal masks. As a consequence, it is safer to use the full mask with functions like the generic equivalence test. llvm-svn: 218442	2014-09-25 04:10:27 +00:00
Chandler Carruth	8140158cb5	[x86] Rearrange the code for v16i16 lowering a bit for clarity and to reduce the amount of checking we do here. The first realization is that only non-crossing cases between 128-bit lanes are handled by almost the entire function. It makes more sense to handle the crossing cases first. The second is that until we actually are going to generate fancy shared lowering strategies that use the repeated semantics of the v8i16 lowering, we shouldn't waste time checking for repeated masks. It is simplest to directly test for the entire unpck masks anyways, so we gained nothing from this. This also matches the structure of v32i8 more closely. No functionality changed here. llvm-svn: 218441	2014-09-25 04:03:22 +00:00
Chandler Carruth	d8f528adb8	[x86] Implement AVX2 support for v32i8 in the new vector shuffle lowering. This completes the basic AVX2 feature support, but there are still some improvements I'd like to do to really get the last mile of performance here. llvm-svn: 218440	2014-09-25 02:52:12 +00:00
Chandler Carruth	397d12c4b4	[x86] More tweaks to the v32i8 test cases. I made a mistake in the previous commit and produced the wrong pattern. Fix that. Also make one more shuffle pattern byte-based rather than word-based, and add two more blend patterns. llvm-svn: 218439	2014-09-25 02:44:39 +00:00
Chandler Carruth	a03011ffae	[x86] Re-work a bunch of the v32i8 test cases to actually involve byte shuffles rather than word shuffles. As you might guess, these were built starting from the word shuffle test cases and I failed to properly port a bunch of them and left them as widened word shuffle test cases. We still have a couple of tests that check our ability to widen shuffles, but now we will test the actual byte shuffle quite a bit better. llvm-svn: 218438	2014-09-25 02:20:02 +00:00
Reid Kleckner	81782f0cb8	MC: Use @IMGREL instead of @IMGREL32, which we can't parse Nico Rieck added support for this 32-bit COFF relocation some time ago for Win64 stuff. It appears that as an oversight, the assembly output used "foo"@IMGREL32 instead of "foo"@IMGREL, which is what we can parse. Sadly, there were actually tests that took in IMGREL and put out IMGREL32, and we didn't notice the inconsistency. Oh well. Now LLVM can assemble its own output with slightly more fidelity. llvm-svn: 218437	2014-09-25 02:09:18 +00:00
Chandler Carruth	d355369dbb	[x86] Remove the defunct X86ISD::BLENDV entry -- we use vector selects for this now. Should prevent folks from running afoul of this and not knowing why their code won't instruction select the way I just did... llvm-svn: 218436	2014-09-25 01:16:01 +00:00
Chandler Carruth	a577bc26b6	[x86] Fix the v16i16 blend logic I added in the prior commit and add the missing test cases for it. Unsurprisingly, without test cases, there were bugs here. Surprisingly, this bug wasn't caught at compile time. Yep, there is an X86ISD::BLENDV. It isn't wired to anything. Oops. I'll fix that next. llvm-svn: 218434	2014-09-25 01:13:38 +00:00
Justin Bogner	b35a72ae9e	llvm-cov: Combine segments that cover the same location If we have multiple coverage counts for the same segment, we need to add them up rather than arbitrarily choosing one. This fixes that and adds a test with template instantiations to exercise it. llvm-svn: 218432	2014-09-25 00:34:18 +00:00
Akira Hatanaka	8cc48bd159	[X86,AVX] Add an isel pattern for X86VBroadcast. This fixes PR21050 and rdar://problem/18434607. llvm-svn: 218431	2014-09-25 00:26:15 +00:00
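
Note on e648037449 (r218463): the fixed-width behaviour that commit describes can be modelled in a few lines of Python. This is only a sketch. The function names mirror the ones in the commit message, but the bodies are an assumption inferred from the examples given there (in particular, that the width passed to format_hex counts the "0x" prefix); they are not LLVM's C++ implementation.

```python
# Python model of the width semantics described in r218463.
# Illustrative stand-ins only; the real helpers are C++ functions in LLVM.

def format_hex(value: int, width: int) -> str:
    """Lowercase hex with a "0x" prefix; `width` is assumed to include the prefix."""
    digits = format(value, "x")
    # Zero-pad the digits so that prefix + digits is at least `width` characters.
    return "0x" + digits.rjust(max(width - 2, 0), "0")

def format_decimal(value: int, width: int) -> str:
    """Decimal number, right-aligned and space-padded to `width` characters."""
    return str(value).rjust(width)

def right_justify(text: str, width: int) -> str:
    """Pad `text` on the left with spaces up to `width` characters."""
    return text.rjust(width)

def left_justify(text: str, width: int) -> str:
    """Pad `text` on the right with spaces up to `width` characters."""
    return text.ljust(width)

if __name__ == "__main__":
    for rendered in (format_hex(255, 6), format_hex(255, 4),
                     format_decimal(255, 5), right_justify("foo", 5),
                     left_justify("foo", 5)):
        print(repr(rendered))
```

In this model a value wider than the requested width is printed in full rather than truncated; whether the LLVM helpers behave the same way is not stated in the commit message.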

108168 Commits