llvm-project

Commit Graph

Author	SHA1	Message	Date
David Majnemer	e830c60d89	Object, COFF: Increase code reuse Split getObject's smarts into checkOffset, use this to replace the handwritten check in getSectionContents. Similarly, replace checks in section_rel_begin/section_rel_end with getNumberOfRelocations. No functionality change intended. llvm-svn: 221873	2014-11-13 08:46:37 +00:00
David Majnemer	1f80b0a8e0	Object, COFF: getRelocationSymbol shouldn't assert lib/Object is supposed to be robust to malformed object files. Don't assert if we don't have a symbol table. I'll try to come up with a test case later. llvm-svn: 221870	2014-11-13 07:42:11 +00:00
David Majnemer	2314b3defa	Object, COFF: Cleanup some code in getSectionName Use StringRef::startswith to tidy up some code, no functionality change intended. llvm-svn: 221869	2014-11-13 07:42:09 +00:00
David Majnemer	58323a9763	Object, COFF: Fix some theoretical bugs getObject didn't consider the case where a pointer came before the start of the object file. No test is included, trying to come up with something reasonable. llvm-svn: 221868	2014-11-13 07:42:07 +00:00
Rafael Espindola	c11bd4229c	Read 64 bits at a time in the bitcode reader. The reading of 64 bit values could still be optimized, but at least this cuts down on the number of virtual calls to fetch more data. llvm-svn: 221865	2014-11-13 07:23:22 +00:00
Chandler Carruth	fee91883f4	[x86] Teach the vector shuffle lowering to make a more nuanced decision between splitting a vector into 128-bit lanes and recombining them vs. decomposing things into single-input shuffles and a final blend. This handles a large number of cases in AVX1 where the cross-lane shuffles would be much more expensive to represent even though we end up with a fast blend at the root. Instead, we can do a better job of shuffling in a single lane and then inserting it into the other lanes. This fixes the remaining bits of Halide's regression captured in PR21281 for AVX1. However, the bug persists in AVX2 because I've made this change reasonably conservative. The cases where it makes sense in AVX2 to split into 128-bit lanes are much more rare because we can often do full permutations across all elements of the 256-bit vector. However, the particular test case in PR21281 is an example of one of the rare cases where it is always better to work in a single 128-bit lane. I'm going to try to teach the logic to detect and form the good code even in AVX2 next, but it will need to use a separate heuristic. Finally, there is one pesky regression here where we previously would craftily use vpermilps in AVX1 to shuffle both high and low halves at the same time. We no longer pull that off, and not for any really good reason. Ultimately, I think this is just another missing nuance to the selection heuristic that I'll try to add in afterward, but this change already seems strictly worth doing considering the magnitude of the improvements in common matrix math shuffle patterns. As always, please let me know if this causes a surprising regression for you. llvm-svn: 221861	2014-11-13 04:06:10 +00:00
Rui Ueyama	ffa4cebe91	llvm-readobj: Print out address table when dumping COFF delay-import table llvm-svn: 221855	2014-11-13 03:22:54 +00:00
Frederic Riss	0f7abef2cf	Add an assert and a test that verify r221709's fix. llvm-svn: 221854	2014-11-13 03:20:23 +00:00
Chandler Carruth	253dd39a9a	[x86] Don't form overly fragmented blends when splitting and re-combining shuffles because nothing was available in the wider vector type. The key observation (which I've put in the comments for future maintainers) is that at this point, no further combining is really possible. And so even though these shuffles trivially could be combined, we need to actually do that as we produce them when producing them this late in the lowering. This fixes another (huge) part of the Halide vector shuffle regressions. As it happens, this was already well covered by the tests, but I hadn't noticed how bad some of these got. The specific patterns that turn directly into unpckl/h patterns were occurring many times in common vector processing code. There are still more problems here sadly, but trying to incrementally tease them apart and it looks like this is the core of the problem in the splitting logic. There is some chance of regression here, you can see it in the test changes. Specifically, where we stop forming pshufb in some cases, it is possible that pshufb was in fact faster. Intel "says" that pshufb is slower than the instruction sequences replacing it. llvm-svn: 221852	2014-11-13 02:42:08 +00:00
Quentin Colombet	f5485bb008	[CodeGenPrepare] Handle zero extensions in the TypePromotionHelper. Prior to this patch the TypePromotionHelper was promoting only sign extensions. Supporting zero extensions changes: - How constants are extended. - How sign extensions, zero extensions, and truncate are composed together. - How the type of the extended operation is recorded. Now we need to know the kind of the extension as well as its type. Each change is fairly small, unlike the diff. Most of the diff are comments/variable renaming to say "extension" instead of "sign extension". The performance improvements on the test suite are within the noise. Related to <rdar://problem/18310086>. llvm-svn: 221851	2014-11-13 01:44:51 +00:00
Juergen Ributzka	957a1454cc	[FastISel][AArch64] Optimize select when one of the operands is a 'true' or 'false' value. Optimize selects of i1 in the presence of 'true' and 'false' operands to simple logic operations. This fixes rdar://problem/18960150. llvm-svn: 221848	2014-11-13 00:36:46 +00:00
Juergen Ributzka	424c5fd12f	[FastISel][AArch64] Fold the cmp into the select when possible. This folds the compare emission into the select emission when possible, so we can directly use the flags and don't have to emit a separate compare. Related to rdar://problem/18960150. llvm-svn: 221847	2014-11-13 00:36:43 +00:00
Juergen Ributzka	d1a042abd0	[FastISel][AArch64] Extend 'select' lowering to support also i1 to i16. Related to rdar://problem/18960150. llvm-svn: 221846	2014-11-13 00:36:38 +00:00
Frederic Riss	e1f4958122	Revert "[dwarfdump] Add support for dumping accelerator tables." This reverts commit r221836. The tests are asserting on some buildbots. This also reverts the test part of r221837 as it relies on dwarfdump dumping the accelerator tables. llvm-svn: 221842	2014-11-13 00:15:15 +00:00
Paul Robinson	d9c4a9af7c	Improve long path name support on Windows. Windows normally limits the length of an absolute path name to 260 characters; directories can have lower limits. These limits increase to about 32K if you use absolute paths with the special '\\?\' prefix. Teach Support\Windows\Path.inc to use that prefix as needed. TODO: Other parts of Support could also learn to use this prefix. llvm-svn: 221841	2014-11-13 00:12:14 +00:00
Sanjoy Das	c5676df3ec	Teach ScalarEvolution to sharpen range information. If x is known to have the range [a, b), in a loop predicated by (icmp ne x, a) its range can be sharpened to [a + 1, b). Get ScalarEvolution and hence IndVars to exploit this fact. This change triggers an optimization to widen-loop-comp.ll, so it had to be edited to get it to pass. This change was originally landed in r219834 but had a bug and broke ASan. It was reverted in r219878, and is now being re-landed after fixing the original bug. phabricator: http://reviews.llvm.org/D5639 reviewed by: atrick llvm-svn: 221839	2014-11-13 00:00:58 +00:00
Frederic Riss	3a6b354b3e	Fix emission of Dwarf accelerator table when there are multiple CUs. The DIE offset in the accel tables is an offset relative to the start of the debug_info section, but we were encoding the offset to the start of the containing CU. llvm-svn: 221837	2014-11-12 23:48:14 +00:00
Frederic Riss	39467276d0	[dwarfdump] Add support for dumping accelerator tables. The class used for the dump only allows to dump for the moment, but it can (and will) be easily extended to support search also. llvm-svn: 221836	2014-11-12 23:48:10 +00:00
Frederic Riss	e4576d2c46	Allow DWARFFormValue::extractValue to be called with a null CU. Currently FormValues are only used for attributes of DIEs and thus uers always have a CU lying around when calling into the FormValue API. Accelerator tables encode their information using the same Forms as the attributes, thus it is natural to use DWARFFormValue to extract/dump them. There is no CU in that case though. Allow the API to be called with a null CU arguemnt by making the RelocMap lookup conditional on the CU pointer validity. And document this new behvior in the header. (Test coverage for this use of the API comes in the DwarfAccelTable support patch) llvm-svn: 221835	2014-11-12 23:48:04 +00:00
Frederic Riss	6cbfa91e91	Remove unsused variables. llvm-svn: 221834	2014-11-12 23:48:01 +00:00
Ahmed Bougacha	026600d967	[CodeGenPrepare] Replace other uses of EVT::getEVT with TL::getValueType. r221820 fixed a problem (PR21548) where an iPTR was used in TLI legality checks, which isn't valid and resulted in a failed assertion. The solution was to lower pointer types into the correct target's VT, by using TL::getValueType instead of EVT::getEVT. This commit changes 3 other uses of EVT::getEVT, but without any tests: - One of these non-lowered EVTs is passed to allowsMisalignedMemoryAccesses, which goes into target's TL implementation and doesn't cause any problem (yet.) - Two others are passed to TLI.isOperationLegalOrCustom: - one only looks at extensions, so doesn't concern pointers. - one only looks at binary operators, so also isn't a problem. The latter might some day be exposed to pointers and cause the same assert as the original PR, because there's a comment hinting at also supporting cast ops. For consistency, update all of them and be done with it. llvm-svn: 221827	2014-11-12 23:05:03 +00:00
Ahmed Bougacha	0788d49a40	[CodeGenPrepare][AArch64] Fix a TLI legality check on iPTR to use a lowered instead. Fixes PR21548. Related to PR20474. llvm-svn: 221820	2014-11-12 22:16:55 +00:00
Sanjay Patel	f6f7d5d1dd	Expose the number of Newton-Raphson iterations applied to the hardware's reciprocal estimate as a parameter (x86). This is a follow-on to r221706 and r221731 and discussed in more detail in PR21385. This patch also loosens the testcase checking for btver2. We know that the "1.0" will be loaded, but we can't tell exactly when, so replace the CHECK-NEXT specifiers with plain CHECKs. The CHECK-NEXT sequence relied on a quirk of post-RA-scheduling that may change independently of anything in these tests. llvm-svn: 221819	2014-11-12 21:39:01 +00:00
Ahmed Bougacha	55a333d89b	Add fortified (__*_chk) library functions to TLI (NFC) One of them (__memcpy_chk) was already there, the others were checked by comparing function names. Note that the fortified libfuncs are now part of TLI, but are always available, because they aren't generated, only optimized into the non-checking versions. Differential Revision: http://reviews.llvm.org/D6179 llvm-svn: 221817	2014-11-12 21:23:34 +00:00
Timur Iskhodzhanov	0e76a16200	Temporary fix for PR21528 - use mangled C++ function names in COFF debug info to un-break ASan on Windows llvm-svn: 221813	2014-11-12 20:21:20 +00:00
Timur Iskhodzhanov	a11b32b7e5	[COFF] Make it clearer that the symbols subsection holds function display name rather than just name llvm-svn: 221812	2014-11-12 20:10:09 +00:00
Cameron McInally	73a6bca32b	[AVX512] Add integer shift by immediate intrinsics. llvm-svn: 221811	2014-11-12 19:58:54 +00:00
Aaron Ballman	9f8d2b0995	Changing a StringRef::begin() call into StringRef::data(); NFC. llvm-svn: 221808	2014-11-12 19:43:13 +00:00
Rafael Espindola	0c9aa57a07	Use the return of readBytes to find out if we are at the end of the stream. This allows the removal of isObjectEnd and opens the way for reading 64 bits at a time. llvm-svn: 221804	2014-11-12 18:37:00 +00:00
Sanjay Patel	4c219fd248	CGSCC should not treat intrinsic calls like function calls (PR21403) Make the handling of calls to intrinsics in CGSCC consistent: they are not treated like regular function calls because they are never lowered to function calls. Without this patch, we can get dangling pointer asserts from the subsequent loop that processes callsites because it already ignores intrinsics. See http://llvm.org/bugs/show_bug.cgi?id=21403 for more details / discussion. Differential Revision: http://reviews.llvm.org/D6124 llvm-svn: 221802	2014-11-12 18:25:47 +00:00
Jingyue Wu	a41cf018b8	Fix broken doxygen annotations, NFC llvm-svn: 221801	2014-11-12 18:25:06 +00:00
Jingyue Wu	8a12cea5f1	Disable indvar widening if arithmetics on the wider type are more expensive Summary: Reapply r221772. The old patch breaks the bot because the @indvar_32_bit test was run whether NVPTX was enabled or not. IndVarSimplify should not widen an indvar if arithmetics on the wider indvar are more expensive than those on the narrower indvar. For instance, although NVPTX64 treats i64 as a legal type, an ADD on i64 is twice as expensive as that on i32, because the hardware needs to simulate a 64-bit integer using two 32-bit integers. Split from D6188, and based on D6195 which adds NVPTXTargetTransformInfo. Fixes PR21148. Test Plan: Added @indvar_32_bit that verifies we do not widen an indvar if the arithmetics on the wider type are more expensive. This test is run only when NVPTX is enabled. Reviewers: jholewinski, eliben, meheff, atrick Reviewed By: atrick Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D6196 llvm-svn: 221799	2014-11-12 18:09:15 +00:00
Sanjay Patel	7777b50eaf	remove function names from comments; NFC llvm-svn: 221798	2014-11-12 18:07:42 +00:00
Rafael Espindola	2d05db49bb	Return the number of read bytes in MemoryObject::readBytes. Returning more information will allow BitstreamReader to be simplified a bit and changed to read 64 bits at a time. llvm-svn: 221794	2014-11-12 17:11:16 +00:00
Justin Hibbits	a88b605721	Add support for small-model PIC for PowerPC. Summary: Large-model was added first. With the addition of support for multiple PIC models in LLVM, now add small-model PIC for 32-bit PowerPC, SysV4 ABI. This generates more optimal code, for shared libraries with less than about 16380 data objects. Test Plan: Test cases added or updated Reviewers: joerg, hfinkel Reviewed By: hfinkel Subscribers: jholewinski, mcrosier, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D5399 llvm-svn: 221791	2014-11-12 15:16:30 +00:00
Rafael Espindola	de1e5b8dfd	Reduce code duplication a bit. NFC. llvm-svn: 221785	2014-11-12 14:48:38 +00:00
Aaron Ballman	7a7b144117	Fixing a -Wcast-qual warning; NFC. llvm-svn: 221781	2014-11-12 13:55:27 +00:00
Zoran Jovanovic	fd888630b5	[mips][micromips] Add predicate 'InMicroMips' at CodeGen patterns for microMIPS instructions Differential Revision: http://reviews.llvm.org/D6198 llvm-svn: 221780	2014-11-12 13:30:10 +00:00
Chandler Carruth	0c922fcec5	[x86] Start improving the matching of unpck instructions based on test cases from Halide folks. This initial step was extracted from a prototype change by Clay Wood to try and address regressions found with Halide and the new vector shuffle lowering. llvm-svn: 221779	2014-11-12 10:05:18 +00:00
Elena Demikhovsky	be8808dc3f	AVX-512: Intrinsics for ERI 3 instructions: vrcp28, vrsqrt28, vexp2, only vector forms. Intrinsics include SAE (Suppres All Exceptions) parameter. http://reviews.llvm.org/D6214 llvm-svn: 221774	2014-11-12 07:31:03 +00:00
Jingyue Wu	a48273390c	Reverts r221772 which fails tests llvm-svn: 221773	2014-11-12 07:19:25 +00:00
Jingyue Wu	635a9b14fa	Disable indvar widening if arithmetics on the wider type are more expensive Summary: IndVarSimplify should not widen an indvar if arithmetics on the wider indvar are more expensive than those on the narrower indvar. For instance, although NVPTX64 treats i64 as a legal type, an ADD on i64 is twice as expensive as that on i32, because the hardware needs to simulate a 64-bit integer using two 32-bit integers. Split from D6188, and based on D6195 which adds NVPTXTargetTransformInfo. Fixes PR21148. Test Plan: Added @indvar_32_bit that verifies we do not widen an indvar if the arithmetics on the wider type are more expensive. Reviewers: jholewinski, eliben, meheff, atrick Reviewed By: atrick Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D6196 llvm-svn: 221772	2014-11-12 06:58:45 +00:00
Bill Schmidt	729547847f	[PowerPC] Add vec_vsx_ld and vec_vsx_st intrinsics This patch enables the vec_vsx_ld and vec_vsx_st intrinsics for PowerPC, which provide programmer access to the lxvd2x, lxvw4x, stxvd2x, and stxvw4x instructions. New LLVM intrinsics are provided to represent these four instructions in IntrinsicsPowerPC.td. These are patterned after the similar intrinsics for lvx and stvx (Altivec). In PPCInstrVSX.td, these intrinsics are tied to the code gen patterns, with additional patterns to allow plain vanilla loads and stores to still generate these instructions. At -O1 and higher the intrinsics are immediately converted to loads and stores in InstCombineCalls.cpp. This will open up more optimization opportunities while still allowing the correct instructions to be generated. (Similar code exists for aligned Altivec loads and stores.) The new intrinsics are added to the code that checks for consecutive loads and stores in PPCISelLowering.cpp, as well as to PPCTargetLowering::getTgtMemIntrinsic(). There's a new test to verify the correct instructions are generated. The loads and stores tend to be reordered, so the test just counts their number. It runs at -O2, as it's not very effective to test this at -O0, when many unnecessary loads and stores are generated. I ended up having to modify vsx-fma-m.ll. It turns out this test case is slightly unreliable, but I don't know a good way to prevent problems with it. The xvmaddmdp instructions read and write the same register, which is one of the multiplicands. Commutativity allows either to be chosen. If the FMAs are reordered differently than expected by the test, the register assignment can be different as a result. Hopefully this doesn't change often. There is a companion patch for Clang. llvm-svn: 221767	2014-11-12 04:19:40 +00:00
Rafael Espindola	79e1f9ff99	Merge StreamableMemoryObject into MemoryObject. Every MemoryObject is a StreamableMemoryObject since the removal of StringRefMemoryObject, so just merge the two. I will clean up the MemoryObject interface in the upcoming commits. llvm-svn: 221766	2014-11-12 03:55:46 +00:00
Rafael Espindola	5b7bb884e6	Remove unused method. NFC. llvm-svn: 221759	2014-11-12 02:35:31 +00:00
Rafael Espindola	5468ded718	Make readBytes pure virtual. Every real implementation has it. llvm-svn: 221758	2014-11-12 02:30:38 +00:00
Rafael Espindola	ef0482f50a	Remove unused method. NFC. llvm-svn: 221757	2014-11-12 02:27:40 +00:00
Rafael Espindola	cac0088e91	Remove the now unused StringRefMemoryObject.h. llvm-svn: 221755	2014-11-12 02:13:27 +00:00
Rafael Espindola	7fc5b87480	Pass an ArrayRef to MCDisassembler::getInstruction. With this patch MCDisassembler::getInstruction takes an ArrayRef<uint8_t> instead of a MemoryObject. Even on X86 there is a maximum size an instruction can have. Given that, it seems way simpler and more efficient to just pass an ArrayRef to the disassembler instead of a MemoryObject and have it do a virtual call every time it wants some extra bytes. llvm-svn: 221751	2014-11-12 02:04:27 +00:00
Nick Kledzik	f44dbda542	Object, support both mach-o archive t.o.c file names For historical reasons archives on mach-o have two possible names for the file containing the table of contents for the archive: "__.SYMDEF SORTED" and "__.SYMDEF". But the libObject archive reader only supported the former. This patch fixes llvm::object::Archive to support both names. llvm-svn: 221747	2014-11-12 01:37:45 +00:00
Rafael Espindola	35a12a85a1	Remove a bit of dead code. Every "real" object file implements this an ptx doesn't use it. llvm-svn: 221746	2014-11-12 01:27:22 +00:00
Philip Reames	319c48eb2d	Extend intrinsic name mangling to support arrays, named structs, and function types. Currently, we have a type parameter mechanism for intrinsics. Rather than having to specify a separate intrinsic for each combination of argument and return types, we can specify a single intrinsic with one or more type parameters. These type parameters are passed explicitly to Intrinsic::getDeclaration or can be specified implicitly in the naming of the intrinsic function in an LL file. Today, the types are limited to integer, floating point, and pointer types. With a goal of supporting symbolic targets for patchpoints and statepoints, this change adds support for function types. The change also includes support for first class aggregate types (named structures and arrays) since these appear in function types we've encountered. Reviewed by: atrick, ributzka Differential Revision: http://reviews.llvm.org/D4608 llvm-svn: 221742	2014-11-12 00:21:51 +00:00
Chad Rosier	f53f07046b	[Reassociate] Canonicalize negative constants out of expressions. Add support for FDiv, which was regressed by the previous commit. llvm-svn: 221738	2014-11-11 23:36:42 +00:00
Philip Reames	66c6de61ee	Canonicalize an assume(load != null) into !nonnull metadata We currently have two ways of informing the optimizer that the result of a load is never null: metadata and assume. This change converts the second in to the former. This avoids a need to implement optimizations using both forms. We should probably extend this basic idea to metadata of other forms; in particular, range metadata. We view is that assumes should be considered a "last resort" for when there isn't a more canonical way to represent something. Reviewed by: Hal Differential Revision: http://reviews.llvm.org/D5951 llvm-svn: 221737	2014-11-11 23:33:19 +00:00
Sanjay Patel	50fc6ff5e3	Initialize new subtarget feature variable for generating reciprocal estimate instructions. This was missed in r221706. llvm-svn: 221731	2014-11-11 23:13:15 +00:00
Duncan P. N. Exon Smith	9419863909	libLTO: Assert if LTOCodeGenerator and LTOModule are from different contexts llvm-svn: 221730	2014-11-11 23:13:10 +00:00
Juergen Ributzka	89441b0dd8	[FastISel][AArch64] Add support for fabs intrinsic. Lower the llvm.fabs intrinsic to the 'fabs' MI instruction. This fixes rdar://problem/18946552. llvm-svn: 221729	2014-11-11 23:10:44 +00:00
Duncan P. N. Exon Smith	97b45874bf	libLTO: Allow LTOModule to own a context llvm-svn: 221728	2014-11-11 23:08:05 +00:00
Duncan P. N. Exon Smith	de5e32b5b4	libLTO: Allow LTOCodeGenerator to own a context llvm-svn: 221726	2014-11-11 23:03:29 +00:00
Kostya Serebryany	231bd088d8	[asan] adding ShadowOffset64 for mips64, patch by Kumar Sukhani llvm-svn: 221725	2014-11-11 23:02:57 +00:00
Chad Rosier	094ac7735b	[Reassociate] Canonicalize negative constants out of expressions. This is a reapplication of r221171, but we only perform the transformation on expressions which include a multiplication. We do not transform rem/div operations as this doesn't appear to be safe in all cases. llvm-svn: 221721	2014-11-11 22:58:35 +00:00
Kostya Serebryany	29a18dcbc5	Move asan-coverage into a separate phase. Summary: This change moves asan-coverage instrumentation into a separate Module pass. The other part of the change in clang introduces a new flag -fsanitize-coverage=N. Another small patch will update tests in compiler-rt. With this patch no functionality change is expected except for the flag name. The following changes will make the coverage instrumentation work with tsan/msan Test Plan: Run regression tests, chromium. Reviewers: nlewycky, samsonov Reviewed By: nlewycky, samsonov Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6152 llvm-svn: 221718	2014-11-11 22:14:37 +00:00
Duncan P. N. Exon Smith	de36e8040f	Revert "IR: MDNode => Value" Instead, we're going to separate metadata from the Value hierarchy. See PR21532. This reverts commit r221375. This reverts commit r221373. This reverts commit r221359. This reverts commit r221167. This reverts commit r221027. This reverts commit r221024. This reverts commit r221023. This reverts commit r220995. This reverts commit r220994. llvm-svn: 221711	2014-11-11 21:30:22 +00:00
Tom Roeder	6312f4a422	Fix build break: remove unused variable in FCFI. llvm-svn: 221710	2014-11-11 21:26:33 +00:00
Frederic Riss	8ad4f498fb	Totally forget deallocated SDNodes in SDDbgInfo. What would happen before that commit is that the SDDbgValues associated with a deallocated SDNode would be marked Invalidated, but SDDbgInfo would keep a map entry keyed by the SDNode pointer pointing to this list of invalidated SDDbgNodes. As the memory gets reused, the list might get wrongly associated with another new SDNode. As the SDDbgValues are cloned when they are transfered, this can lead to an exponential number of SDDbgValues being produced during DAGCombine like in http://llvm.org/bugs/show_bug.cgi?id=20893 Note that the previous behavior wasn't really buggy as the invalidation made sure that the SDDbgValues won't be used. This commit can be considered a memory optimization and as such is really hard to validate in a unit-test. llvm-svn: 221709	2014-11-11 21:21:08 +00:00
Tom Roeder	eb7a303d1b	Add Forward Control-Flow Integrity. This commit adds a new pass that can inject checks before indirect calls to make sure that these calls target known locations. It supports three types of checks and, at compile time, it can take the name of a custom function to call when an indirect call check fails. The default failure function ignores the error and continues. This pass incidentally moves the function JumpInstrTables::transformType from private to public and makes it static (with a new argument that specifies the table type to use); this is so that the CFI code can transform function types at call sites to determine which jump-instruction table to use for the check at that site. Also, this removes support for jumptables in ARM, pending further performance analysis and discussion. Review: http://reviews.llvm.org/D4167 llvm-svn: 221708	2014-11-11 21:08:02 +00:00
Sanjay Patel	e2e589288f	Use rcpss/rcpps (X86) to speed up reciprocal calcs (PR21385). This is a first step for generating SSE rcp instructions for reciprocal calcs when fast-math allows it. This is very similar to the rsqrt optimization enabled in D5658 ( http://reviews.llvm.org/rL220570 ). For now, be conservative and only enable this for AMD btver2 where performance improves significantly both in terms of latency and throughput. We may never enable this codegen for Intel Core* chips because the divider circuits are just too fast. On SandyBridge, divss can be as fast as 10 cycles versus the 21 cycle critical path for the rcp + mul + sub + mul + add estimate. Follow-on patches may allow configuration of the number of Newton-Raphson refinement steps, add AVX512 support, and enable the optimization for more chips. More background here: http://llvm.org/bugs/show_bug.cgi?id=21385 Differential Revision: http://reviews.llvm.org/D6175 llvm-svn: 221706	2014-11-11 20:51:00 +00:00
Bill Schmidt	3d9674cfb1	[PowerPC] Replace foul hackery with real calls to __tls_get_addr My original support for the general dynamic and local dynamic TLS models contained some fairly obtuse hacks to generate calls to __tls_get_addr when lowering a TargetGlobalAddress. Rather than generating real calls, special GET_TLS_ADDR nodes were used to wrap the calls and only reveal them at assembly time. I attempted to provide correct parameter and return values by chaining CopyToReg and CopyFromReg nodes onto the GET_TLS_ADDR nodes, but this was also not fully correct. Problems were seen with two back-to-back stores to TLS variables, where the call sequences ended up overlapping with unhappy results. Additionally, since these weren't real calls, the proper register side effects of a call were not recorded, so clobbered values were kept live across the calls. The proper thing to do is to lower these into calls in the first place. This is relatively straightforward; see the changes to PPCTargetLowering::LowerGlobalTLSAddress() in PPCISelLowering.cpp. The changes here are standard call lowering, except that we need to track the fact that these calls will require a relocation. This is done by adding a machine operand flag of MO_TLSLD or MO_TLSGD to the TargetGlobalAddress operand that appears earlier in the sequence. The calls to LowerCallTo() eventually find their way to LowerCall_64SVR4() or LowerCall_32SVR4(), which call FinishCall(), which calls PrepareCall(). In PrepareCall(), we detect the calls to __tls_get_addr and immediately snag the TargetGlobalTLSAddress with the annotated relocation information. This becomes an extra operand on the call following the callee, which is expected for nodes of type tlscall. We change the call opcode to CALL_TLS for this case. Back in FinishCall(), we change it again to CALL_NOP_TLS for 64-bit only, since we require a TOC-restore nop following the call for the 64-bit ABIs. During selection, patterns in PPCInstrInfo.td and PPCInstr64Bit.td convert the CALL_TLS nodes into BL_TLS nodes, and convert the CALL_NOP_TLS nodes into BL8_NOP_TLS nodes. This replaces the code removed from PPCAsmPrinter.cpp, as the BL_TLS or BL8_NOP_TLS nodes can now be emitted normally using their patterns and the associated printTLSCall print method. Finally, as a result of these changes, all references to get-tls-addr in its various guises are no longer used, so they have been removed. There are existing TLS tests to verify the changes haven't messed anything up). I've added one new test that verifies that the problem with the original code has been fixed. llvm-svn: 221703	2014-11-11 20:44:09 +00:00
Rafael Espindola	a9c28b68cd	Use a 8 bit immediate when possible. This fixes pr21529. llvm-svn: 221700	2014-11-11 19:46:36 +00:00
Dario Domizioli	e904e85faf	[X86][ELF] Fix PR20243 - leaf frame pointer bug with TLS access The ISel lowering for global TLS access in PIC mode was creating a pseudo instruction that is later expanded to a call, but the code was not setting the hasCalls flag in the MachineFrameInfo alongside the adjustsStack flag. This caused some functions to be mistakenly recognized as leaf functions, and this in turn affected the decision to eliminate the frame pointer. With the fix, hasCalls is properly set and the leaf frame pointer is correctly preserved. llvm-svn: 221695	2014-11-11 18:44:49 +00:00
Oliver Stannard	8c2c67e63c	LLVM incorrectly folds xor into select LLVM replaces the SelectionDAG pattern (xor (set_cc cc x y) 1) with (set_cc !cc x y), which is only correct when the xor has type i1. Instead, we should check that the constant operand to the xor is all ones. llvm-svn: 221693	2014-11-11 17:36:01 +00:00
Vasileios Kalintiris	b2dd15f8c7	[mips] Add preliminary support for the MIPS II target. Summary: This patch enables code generation for the MIPS II target. Pre-Mips32 targets don't have the MUL instruction, so we add the correspondent pattern that uses the MULT/MFLO combination in order to retrieve the product. This is WIP as we don't support code generation for select nodes due to the lack of conditional-move instructions. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6150 llvm-svn: 221686	2014-11-11 11:43:55 +00:00
Vasileios Kalintiris	8c1c95e95c	[mips] Add hardware register name "hwr_ulr" ($29) The canonical name when printing assembly is still $29. The reason is that GAS does not accept "$hwr_ulr" at the moment. This addresses the comments from r221307, which reverted the original commit r221299. llvm-svn: 221685	2014-11-11 11:22:39 +00:00
Andrea Di Biagio	5fa2e15453	[X86] Add missing check for 'isINSERTPSMask' in method 'isShuffleMaskLegal'. This helps the DAGCombiner to identify more opportunities to fold shuffles. llvm-svn: 221684	2014-11-11 11:20:31 +00:00
Vasileios Kalintiris	10b5ba3f6e	Recommit "[mips] Add names and tests for the hardware registers" The original commit r221299 was reverted in r221307. I removed the name "hrw_ulr" ($29) from the original commit because two tests were failing. llvm-svn: 221681	2014-11-11 10:31:31 +00:00
David Majnemer	2cc4bc77bf	MC, COFF: Use relocations for function references inside the section Referencing one symbol from another in the same section does not generally require a relocation. However, the MS linker has a feature called /INCREMENTAL which enables incremental links. It achieves this by creating thunks to the actual function and redirecting all relocations to point to the thunk. This breaks down with the old scheme if you have a function which references, say, itself. On x86_64, we would use %rip relative addressing to reference the start of the function from out current position. This would lead to miscompiles because other references might reference the thunk instead, breaking function pointer equality. This fixes PR21520. llvm-svn: 221678	2014-11-11 08:43:57 +00:00
Craig Topper	f655cddb13	Use uint64_t as the type for the X86 TSFlag format enum. Allows removal of the VEXShift hack that was used to access the higher bits of TSFlags. llvm-svn: 221673	2014-11-11 07:32:32 +00:00
Michael Kuperstein	3fe15e498f	[X86] Fix pattern match for 32-to-64-bit zext in the presence of AssertSext This fixes an issue with matching trunc -> assertsext -> zext on x86-64, which would not zero the high 32-bits. See PR20494 for details. Recommitting - This time, with a hopefully working test. Differential Revision: http://reviews.llvm.org/D6128 llvm-svn: 221672	2014-11-11 07:07:40 +00:00
Jingyue Wu	dfd4eb9285	[NVPTX] Remove dead code in NVPTXTargetTransformInfo (NFC) llvm-svn: 221668	2014-11-11 05:24:04 +00:00
Rafael Espindola	961d469445	MCAsmParserExtension has a copy of the MCAsmParser. Use it. Base classes were storing a second copy. llvm-svn: 221667	2014-11-11 05:18:41 +00:00
Rafael Espindola	804f43c655	Add const. NFC. This adds const to a few methods that already return const references or creates a const version when they reterun non-const references. llvm-svn: 221666	2014-11-11 05:11:47 +00:00
Quentin Colombet	360460ba64	[X86] Custom lower UINT_TO_FP from v4f32 to v4i32, and for v8f32 to v8i32 if AVX2 is available. According to IACA, the new lowering has a throughput of 8 cycles instead of 13 with the previous one. Althought this lowering kicks in some SPECs benchmarks, the performance improvement was within the noise. Correctness testing has been done for the whole range of uint32_t with the following program: uint4 v = (uint4) {0,1,2,3}; uint32_t i; //Check correctness over entire range for uint4 -> float4 conversion for( i = 0; i < 1U << (32-2); i++ ) { float4 t = test(v); float4 c = correct(v); if( 0xf != _mm_movemask_ps( t == c )) { printf( "Error @ %vx: %vf vs. %vf\n", v, c, t); return -1; } v += 4; } Where "correct" is the old lowering and "test" the new one. The patch adds a test case for the two custom lowering instruction. It also modifies the vector cost model, which is why cast.ll and uitofp.ll are modified. 2009-02-26-MachineLICMBug.ll is also modified because we now hoist 7 instructions instead of 4 (3 more constant loads). rdar://problem/18153096> llvm-svn: 221657	2014-11-11 02:23:47 +00:00
Nico Weber	7f654a8e8f	speling. llvm-svn: 221652	2014-11-11 01:13:42 +00:00
Chad Rosier	a9ae3e311c	[yaml2obj] Support AArch64 relocations. Patch by Daniel Stewart <stewartd@codeaurora.org>! Phabricator Revision: http://reviews.llvm.org/D6192 llvm-svn: 221639	2014-11-10 23:02:03 +00:00
Michael Kuperstein	217e1eec0d	Reverting r221626 due to a too-strict test. llvm-svn: 221629	2014-11-10 21:07:41 +00:00
Juergen Ributzka	ea5870a530	[AArch64][FastISel] Fix kill flags for integer extends. In the case we optimize an integer extend away and replace it directly with the source register, we also have to clear all kill flags at all its uses. This is necessary, because the orignal IR instruction might be trivially dead, but we replaced it with a nop at MI level. llvm-svn: 221628	2014-11-10 21:05:31 +00:00
Juergen Ributzka	d441725d3d	[SwitchLowering] Fix the "fixPhis" function. Switch statements may have more than one incoming edge into the same BB if they all have the same value. When the switch statement is converted these incoming edges are now coming from multiple BBs. Updating all incoming values to be from a single BB is incorrect and would generate invalid LLVM IR. The fix is to only update the first occurrence of an incoming value. Switch lowering will perform subsequent calls to this helper function for each incoming edge with a new basic block - updating all edges in the process. This fixes rdar://problem/18916275. llvm-svn: 221627	2014-11-10 21:05:27 +00:00
Michael Kuperstein	3218b942f4	[X86] Fix pattern match for 32-to-64-bit zext in the presence of AssertSext This fixes an issue with matching trunc -> assertsext -> zext on x86-64, which would not zero the high 32-bits. See PR20494 for details. Differential Revision: http://reviews.llvm.org/D6128 llvm-svn: 221626	2014-11-10 20:40:21 +00:00
Rafael Espindola	75b809c9b6	Copy externally_initialized in GlobalVariable::copyAttributesFrom. Patch by Kevin Frei! llvm-svn: 221620	2014-11-10 18:41:59 +00:00
Jingyue Wu	0c981bd7df	[NVPTX] Add an NVPTX-specific TargetTransformInfo Summary: It currently only implements hasBranchDivergence, and will be extended in later diffs. Split from D6188. Test Plan: make check-all Reviewers: jholewinski Reviewed By: jholewinski Subscribers: llvm-commits, meheff, eliben, jholewinski Differential Revision: http://reviews.llvm.org/D6195 llvm-svn: 221619	2014-11-10 18:38:25 +00:00
Rafael Espindola	4aa6bea7a2	Misc style fixes. NFC. This fixes a few cases of: * Wrong variable name style. * Lines longer than 80 columns. * Repeated names in comments. * clang-format of the above. This make the next patch a lot easier to read. llvm-svn: 221615	2014-11-10 18:11:10 +00:00
Vasileios Kalintiris	ccde2a9a1e	Fix extra semicolon warning. NFC. llvm-svn: 221613	2014-11-10 17:37:53 +00:00
Zoran Jovanovic	37bca10148	[mips][microMIPS] Fix issue with delay slot filler and microMIPS Differential Revision: http://reviews.llvm.org/D6193 llvm-svn: 221612	2014-11-10 17:27:56 +00:00
Daniel Sanders	87f9b88bfb	[mips] Fix sret arguments for N32/N64 which were accidentally broken in r221534. llvm-svn: 221604	2014-11-10 15:57:53 +00:00
Saleem Abdulrasool	d2c5d7f6da	Transforms: address some late comments We already use the llvm namespace. Remove the unnecessary prefix. Use the StringRef::equals method to compare with C strings rather than instantiating std::strings. Addresses late review comments from David Majnemer. llvm-svn: 221564	2014-11-08 00:00:50 +00:00
Saleem Abdulrasool	92b13aac04	Transforms: sort source files in build Sort target sources. NFC. llvm-svn: 221563	2014-11-08 00:00:47 +00:00
Chad Rosier	b3eb452e83	[Reassociate] Better preserve NSW/NUW flags. Part of PR12985. Phabricator Revision: http://reviews.llvm.org/D6172 llvm-svn: 221555	2014-11-07 22:12:57 +00:00
Saleem Abdulrasool	89c5ad4cda	Transforms: use typedef rather than using aliases Visual Studio 2012 apparently does not support using alias declarations. Use the more traditional typedef approach. This should let the Windows buildbots pass. NFC. llvm-svn: 221554	2014-11-07 22:09:52 +00:00
Saleem Abdulrasool	5898e09057	Transform: add SymbolRewriter pass This introduces the symbol rewriter. This is an IR->IR transformation that is implemented as a CodeGenPrepare pass. This allows for the transparent adjustment of the symbols during compilation. It provides a clean, simple, elegant solution for symbol inter-positioning. This technique is often used, such as in the various sanitizers and performance analysis. The control of this is via a custom YAML syntax map file that indicates source to destination mapping, so as to avoid having the compiler to know the exact details of the source to destination transformations. llvm-svn: 221548	2014-11-07 21:32:08 +00:00
Michael J. Spencer	d48829b958	Fix style. llvm-svn: 221547	2014-11-07 21:30:36 +00:00
Matt Arsenault	23755997e4	R600: Remove unused define llvm-svn: 221543	2014-11-07 20:45:00 +00:00
Daniel Sanders	c43cda84ff	[mips] Promote i32 arguments to i64 for the N32/N64 ABI and fix <64-bit structs... Summary: ... and after all that refactoring, it's possible to distinguish softfloat floating point values from integers so this patch no longer breaks softfloat to do it. Remove direct handling of i32's in the N32/N64 ABI by promoting them to i64. This more closely reflects the ABI documentation and also fixes problems with stack arguments on big-endian targets. We now rely on signext/zeroext annotations (already generated by clang) and the Assert[SZ]ext nodes to avoid the introduction of unnecessary sign/zero extends. It was not possible to convert three tests to use signext/zeroext. These tests are bswap.ll, ctlz-v.ll, ctlz-v.ll. It's not possible to put signext on a vector type so we just accept the sign extends here for now. These tests don't pass the vectors the same way clang does (clang puts multiple elements in the same argument, these map 1 element to 1 argument) so we don't need to worry too much about it. With this patch, all known N32/N64 bugs should be fixed and we now pass the first 10,000 tests generated by ABITest.py. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6117 llvm-svn: 221534	2014-11-07 16:54:21 +00:00
NAKAMURA Takumi	13437e8414	[CMake] LLVMSupport: Give system_libs PRIVATE scope when LLVMSupport is built as SHARED. Users of LLVMSupport won't inherit ${system_libs}. unittests/SupporTests is another user of libpthreads. Apply LLVM_SYSTEM_LIBS for him explicitly. llvm-svn: 221531	2014-11-07 16:08:19 +00:00
Daniel Sanders	b315c8c762	[mips] Removed the remainder of MipsCC. NFC. Summary: One of the calls to AllocateStack (the one in LowerCall) doesn't look like it should be there but it was there before and removing it breaks the frame size calculation. Reviewers: vmedic, theraven Reviewed By: theraven Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6116 llvm-svn: 221529	2014-11-07 15:33:08 +00:00
Daniel Sanders	2c6f4b430b	[mips] Remove MipsCC::reservedArgArea() in favour of MipsABIInfo::GetCalleeAllocdArgSizeInBytes(). NFC. Summary: Reviewers: theraven, vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6115 llvm-svn: 221528	2014-11-07 15:03:53 +00:00
NAKAMURA Takumi	0ebd071450	MipsCCState.h: Use LLVM_DELETED_FUNCTION for msc17. llvm-svn: 221527	2014-11-07 14:56:31 +00:00
Daniel Sanders	0456c15c58	[mips] Move MipsCCState to a separate file and clang-formatted it. Summary: Depends on D6113 Reviewers: theraven, vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6114 llvm-svn: 221525	2014-11-07 14:24:31 +00:00
Daniel Sanders	892cf8af46	[mips] Fix unused variable warnings introduced in r221521 llvm-svn: 221522	2014-11-07 12:43:01 +00:00
Daniel Sanders	d7eba31508	[mips] Remove remaining use of MipsCC::intArgRegs() in favour of MipsABIInfo::GetByValArgRegs() and MipsABIInfo::GetVarArgRegs() Summary: Depends on D6112 Reviewers: theraven, vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6113 llvm-svn: 221521	2014-11-07 12:21:37 +00:00
Daniel Sanders	4f1bedaa47	[mips] Remove MipsCC::getRegVT(). NFC Summary: It's no longer used. Reviewers: vmedic, theraven Reviewed By: theraven Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6112 llvm-svn: 221519	2014-11-07 12:02:59 +00:00
Daniel Sanders	cfad1e3fca	[mips] Remove MipsCC::analyzeCallOperands in favour of CCState::AnalyzeCallOperands. NFC Summary: In addition to the usual f128 workaround, it was also necessary to provide a means of accessing ArgListEntry::IsFixed. Reviewers: theraven, vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6111 llvm-svn: 221518	2014-11-07 11:43:49 +00:00
Daniel Sanders	41a64c407f	[mips] Move SpecialCallingConv to MipsCCState and use it from tablegen-erated code. NFC Summary: In the long run, it should probably become a calling convention in its own right but for now just move it out of MipsISelLowering::analyzeCallOperands() so that we can drop this function in favour of CCState::AnalyzeCallOperands(). Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6085 llvm-svn: 221517	2014-11-07 11:10:48 +00:00
Daniel Sanders	f3096a1c8d	[mips] Removed IsVarArg from MipsISelLowering::analyzeCallOperands(). NFC. Summary: CCState objects already carry this information in their isVarArg() method. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6084 llvm-svn: 221516	2014-11-07 10:45:16 +00:00
David Majnemer	2098b86f64	SCCP: overdefined calls cannot become constant We would attempt to fold away a call instruction which had been marked overdefined. However, it's not valid to transition to constant from overdefined. This fixes PR21512. llvm-svn: 221513	2014-11-07 08:54:19 +00:00
Justin Hibbits	771c132e0f	Add Position-independent Code model Module API. Summary: This makes PIC levels a Module flag attribute, which can be queried by the backend. The flag is named `PIC Level`, and can have a value of: 0 - Backend-default 1 - Small-model (-fpic) 2 - Large-model (-fPIC) These match the `-pic-level' command line argument for clang, and the value of the preprocessor macro `__PIC__'. Test Plan: New flags tests specific for the 'PIC Level' module flag. Tests to be added as part of a future commit for PowerPC, which will use this new API. Reviewers: rafael, echristo Reviewed By: rafael, echristo Subscribers: rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D5882 llvm-svn: 221510	2014-11-07 04:46:10 +00:00
Ahmed Bougacha	72001cf287	[AArch64] Keep flags on condition vreg when instantiating a CB branch. Reversing a CB* instruction used to drop the flags on the condition. On the included testcase, this lead to a read from an undefined vreg. Using addOperand keeps the flags, here <undef>. Differential Revision: http://reviews.llvm.org/D6159 llvm-svn: 221507	2014-11-07 02:50:00 +00:00
Rafael Espindola	1d7d4eb15c	Use a StringRefMemoryObject. NFC. llvm-svn: 221503	2014-11-07 01:09:51 +00:00
David Majnemer	bf93e7c7d3	LoopVectorize: Don't assume pointees are sized A pointer's pointee might not be sized: the pointee could be a function. Report this as IK_NoInduction when calculating isInductionVariable. This fixes PR21508. llvm-svn: 221501	2014-11-07 00:31:14 +00:00
David Majnemer	c1eca5ad7c	InstCombine: Rely on cmpxchg's return code when it's strong Comparing the result of a cmpxchg instruction can be replaced with an extractvalue of the cmpxchg success indicator. llvm-svn: 221498	2014-11-06 23:23:30 +00:00
Rafael Espindola	89cb407729	Remove unused variable. NFC. llvm-svn: 221497	2014-11-06 23:16:57 +00:00
Simon Atanasyan	60e1a79242	[ELF][yaml2obj] Handle additional MIPS specific st_other field flags The ELF symbol `st_other` field might contain additional flags besides visibility ones. This patch implements support for some MIPS specific flags. llvm-svn: 221491	2014-11-06 22:46:24 +00:00
Rafael Espindola	e2541bd60e	Factor out call to push_back. NFC. llvm-svn: 221490	2014-11-06 22:39:16 +00:00
Simon Pilgrim	615ab8e721	[X86][SSE] Vector integer/float conversion memory folding (cvttps2dq / cvttpd2dq) Fixed an issue with the (v)cvttps2dq and (v)cvttpd2dq instructions being incorrectly put in the 2 source operand folding tables instead of the 1 source operand and added the missing SSE/AVX versions. Also added missing (v)cvtps2dq and (v)cvtpd2dq instructions to the folding tables. Differential Revision: http://reviews.llvm.org/D6001 llvm-svn: 221489	2014-11-06 22:15:41 +00:00
Ahmed Bougacha	b5367eeea3	[X86] Add VFMADDSUB cases for the 213->231 custom inserter. Also add tests for vfmadd/vfmsub. llvm-svn: 221488	2014-11-06 22:04:15 +00:00
Ahmed Bougacha	9152361d73	[X86] Add missing FMA3 VFMADDSUB in the emitter. Also reuse the fma4 intrinsic test to cover fma3 instructions too. llvm-svn: 221487	2014-11-06 21:58:11 +00:00
David Majnemer	504165df71	Object, COFF: Don't consider AuxFunctionDefinition for getSymbolSize mingw lies about the size of a function's AuxFunctionDefinition. Ignore the field and rely on our heuristic to determine the symbol's size. llvm-svn: 221485	2014-11-06 21:46:55 +00:00
Rafael Espindola	b7a4505a3f	Base check on the section name, not the variable name. The variable is private, so the name should not be relied on. Also, the linker uses the sections, so asan should too when trying to avoid causing the linker problems. llvm-svn: 221480	2014-11-06 20:01:34 +00:00
Lang Hames	cdd9077f3a	[RegAlloc] Kill off the trivial spiller - nobody is using it any more. llvm-svn: 221474	2014-11-06 19:12:38 +00:00
Michael Liao	736bac6482	Indentation fixes llvm-svn: 221472	2014-11-06 19:05:57 +00:00
Frederic Riss	051cd75b6d	Try to appease MSVC buildbots after r221466. llvm-svn: 221471	2014-11-06 19:00:47 +00:00
Frederic Riss	4aa51ae6c9	Change DIBuilder::createImportedDeclaration from taking a DIScope to a DIDescriptor. Imported declarations can be DIGlobalVariables which aren't a DIScope. Today clang (unknowingly I believe) shoehorns these into a DIScope and it all works just because we never access the fields. llvm-svn: 221466	2014-11-06 17:46:55 +00:00
Colin LeMahieu	2c769209a1	[Hexagon] Adding basic Hexagon ELF object emitter. llvm-svn: 221465	2014-11-06 17:05:51 +00:00
Eli Bendersky	799c564236	Clean up NVPTXLowerStructArgs.cpp. NFC * Remove unnecessary const_casts and C-style casts * Simplify attribute access code * Simplify ArrayRef creation * 80-col and clang-format llvm-svn: 221464	2014-11-06 17:05:49 +00:00
Daniel Sanders	2373af3475	[mips] Removed IsSoftFloat from MipsISelLowering::analyzeCallOperands(). NFC Summary: It isn't used anymore. Depends on D6081 Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6083 llvm-svn: 221463	2014-11-06 16:48:57 +00:00
Chad Rosier	ac6a2f532c	[Reassociate] Don't reassociate when mixing regular and fast-math FP instructions. Inlining might cause such cases and it's not valid to reassociate floating-point instructions without the unsafe algebra flag. Patch by Mehdi Amini <mehdi_amini@apple.com>! llvm-svn: 221462	2014-11-06 16:46:37 +00:00
Daniel Sanders	b70e27ca7b	[mips] Removed MipsISelLowering::analyzeFormalArguments() in favour of CCState::AnalyzeFormalArguments() Summary: As with returns, we must be able to identify f128 arguments despite them being lowered away. We do this with a pre-analyze step that builds a vector and then we use this vector from the tablegen-erated code. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6081 llvm-svn: 221461	2014-11-06 16:36:30 +00:00
Rafael Espindola	26cfbea738	Compute the correct jump table entries on 32 bit windows. On 32 bit windows we use label differences and .set does not suppress rolocations, a combination that was not used before r220256. This fixes PR21497. llvm-svn: 221456	2014-11-06 14:39:49 +00:00
Andrea Di Biagio	7ecd22ca4a	[X86] When commuting SSE immediate blend, make sure that the new blend mask is a valid imm8. Example: define <4 x i32> @test(<4 x i32> %a, <4 x i32> %b) { %shuffle = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 4, i32 5, i32 6, i32 3> ret <4 x i32> %shuffle } Before llc (-mattr=+sse4.1), produced the following assembly instruction: pblendw $4294967103, %xmm1, %xmm0 After pblendw $63, %xmm1, %xmm0 llvm-svn: 221455	2014-11-06 14:36:45 +00:00
Aaron Ballman	e77ffe35bf	Fixing some -Wcast-qual warnings; NFC. llvm-svn: 221454	2014-11-06 14:32:30 +00:00
Toma Tabacu	27cab751ca	[mips] Tolerate the use of the %z inline asm operand modifier with non-immediates. Summary: Currently, we give an error if %z is used with non-immediates, instead of continuing as if the %z isn't there. For example, you use the %z operand modifier along with the "Jr" constraints ("r" makes the operand a register, and "J" makes it an immediate, but only if its value is 0). In this case, you want the compiler to print "$0" if the inline asm input operand turns out to be an immediate zero and you want it to print the register containing the operand, if it's not. We give an error in the latter case, and we shouldn't (GCC also doesn't). Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6023 llvm-svn: 221453	2014-11-06 14:25:42 +00:00
Sasa Stankovic	b38db1eff8	[mips] Add the following MIPS options that control gp-relative addressing of small data items: -mgpopt, -mlocal-sdata, -mextern-sdata. Implement gp-relative addressing for constants. Differential Revision: http://reviews.llvm.org/D4903 llvm-svn: 221450	2014-11-06 13:20:12 +00:00
Toma Tabacu	dde4c464dd	[mips] Improve error/warning messages and testing for the .cpload assembler directive. Summary: Improved warning message when using .cpload inside a reorder section and added an error message for using .cpload with Mips16 enabled. Modified the tests to fit with the changes mentioned above, added a test-case for the N32 ABI in cpload.s and did some reformatting to make the tests easier to read. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5465 llvm-svn: 221447	2014-11-06 10:02:45 +00:00
Daniel Sanders	66e799ff1b	[JIT] Fix more missing endian conversions (opcodes for AArch64, ARM, and Mips stub functions, and ARM target in general) Summary: Fixed all of the missing endian conversions that Lang Hames and I identified in RuntimeDyldMachOARM.h. Fixed the opcode emission in RuntimeDyldImpl::createStubFunction() for AArch64, ARM, Mips when the host endian doesn't match the target endian. PowerPC will need changing if it's opcodes are affected by endianness but I've left this for now since I'm unsure if this is the case and it's the only path that specifies the target endian. This patch fixes MachO_ARM_PIC_relocations.s on a big-endian Mips host. This is the last of the known issues on this host. Reviewers: lhames Reviewed By: lhames Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D6130 llvm-svn: 221446	2014-11-06 09:53:05 +00:00
David Majnemer	51ff559500	Object, COFF: Infer symbol sizes from adjacent symbols Use the position of the subsequent symbol in the object file to infer the size of it's predecessor. I hope to eventually remove whatever COFF specific details from this little algorithm so that we can unify this logic with what Mach-O does. llvm-svn: 221444	2014-11-06 08:10:41 +00:00
David Majnemer	03d2c51cf2	X86, MC: Tidy up some whitespace in GetRelocType No functionality change intended. llvm-svn: 221443	2014-11-06 08:10:37 +00:00
Justin Bogner	58e41344f9	GCOV: Make sure that function idents in the .gcda and .gcno match When generating gcov compatible profiling, we sometimes skip emitting data for functions for one reason or another. However, this was emitting different function IDs in the .gcno and .gcda files, because the .gcno case was using the loop index before skipping functions and the .gcda the array index after. This resulted in completely invalid gcov data. This fixes the problem by making the .gcno loop track the ID separately from the loop index. llvm-svn: 221441	2014-11-06 06:55:02 +00:00
Rafael Espindola	83f0ea8982	Add three other sections when L symbols are allowed. llvm-svn: 221436	2014-11-06 05:01:21 +00:00
Rafael Espindola	bf77ed6826	Allow L symbols in no_dead_strip sections. If a section cannot be dead stripped, it is safe to use L symbols, since the linker will keep all of it in the end. llvm-svn: 221431	2014-11-06 02:42:03 +00:00
Quentin Colombet	dbe33e7aa4	[X86] Lower VSELECT into SHRUNKBLEND when we shrink the bits used into the condition to match a blend. This prevents optimizations that work on VSELECT to perform invalid transformations. Indeed, the optimized condition does not match the vector boolean content that is expected and bad things may happen. This patch yields the exact same code on the whole test-suite + specs (-O3 and -O3 -march=core-avx2), it improves one test case (vector-blend.ll) and fixes a bug reduced in vselect-avx.ll. <rdar://problem/18819506> llvm-svn: 221429	2014-11-06 02:25:03 +00:00
Matt Arsenault	6e863d12e5	Remove unnecessary .c_str() when implicitly converting to Twine llvm-svn: 221422	2014-11-06 01:13:27 +00:00
Petar Jovanovic	35f05747f3	[mips64] Fix MIPS64 exception personality encoding Remove dynamic relocations of __gxx_personality_v0 from the .eh_frame. The MIPS64 follow-up of the MIPS32 fix (rL209907). Patch by Vladimir Stefanovic. Differential Revision: http://reviews.llvm.org/D6141 llvm-svn: 221408	2014-11-05 22:42:31 +00:00
Simon Pilgrim	1fc483d991	[X86][SSE] Vector integer to float conversion memory folding Added missing memory folding for the (V)CVTDQ2PS instructions - we can safely fold these (but not the (V)CVTDQ2PD versions which have a register/memory size discrepancy in the source operand). I've added a test case demonstrating that stack folding now works. Differential Revision: http://reviews.llvm.org/D5981 llvm-svn: 221407	2014-11-05 22:28:25 +00:00
Michael Ilseman	a7202bdbed	Fix heap-use-after-free bug in expandSDiv when the operands are constants, as discovered by ASAN. Patch by Mehdi Amini! llvm-svn: 221401	2014-11-05 21:28:24 +00:00
Steven Wu	d994b8aaa4	Remove obsolete ARM intrinsics vclz and vcnt Both of the intrinsics get autoupgraded to target independent intrinsics. llvm-svn: 221396	2014-11-05 21:02:55 +00:00
Matt Arsenault	f2676a5afc	R600/SI: Fix omod display for VOP3b llvm-svn: 221387	2014-11-05 19:35:00 +00:00
Derek Schuff	a54222045e	[x86 fast-isel] Materialize allocas with the correct-sized lea for ILP32 Summary: X86FastISel::fastMaterializeAlloca was incorrectly conditioning its opcode selection on subtarget bitness rather than pointer size. Differential Revision: http://reviews.llvm.org/D6136 llvm-svn: 221386	2014-11-05 19:27:21 +00:00
Matt Arsenault	f3cd4512ac	R600/SI: Move all rsrc building functions to SIISelLowering llvm-svn: 221383	2014-11-05 19:01:19 +00:00
Matt Arsenault	485defe58c	R600/SI: Remove SI_ADDR64_RSRC llvm-svn: 221382	2014-11-05 19:01:17 +00:00
Justin Holewinski	3d140fcfd1	[NVPTX] Add NVPTXLowerStructArgs pass This works around the limitation that PTX does not allow .param space loads/stores with arbitrary pointers. If a function has a by-val struct ptr arg, say foo(%struct.x byval %d), then add the following instructions to the first basic block : %temp = alloca %struct.x, align 8 %tt1 = bitcast %struct.x %d to i8 * %tt2 = llvm.nvvm.cvt.gen.to.param %tt2 %tempd = bitcast i8 addrspace(101) * to %struct.x addrspace(101) * %tv = load %struct.x addrspace(101) * %tempd store %struct.x %tv, %struct.x * %temp, align 8 The above code allocates some space in the stack and copies the incoming struct from param space to local space. Then replace all occurences of %d by %temp. Fixes PR21465. llvm-svn: 221377	2014-11-05 18:19:30 +00:00
Duncan P. N. Exon Smith	c5754a65e6	IR: MDNode => Value: NamedMDNode::getOperator() Change `NamedMDNode::getOperator()` from returning `MDNode ` to returning `Value `. To reduce boilerplate at some call sites, add a `getOperatorAsMDNode()` for named metadata that's expected to only return `MDNode` -- for now, that's everything, but debug node named metadata (such as llvm.dbg.cu and llvm.dbg.sp) will soon change. This is part of PR21433. Note that there's a follow-up patch to clang for the API change. llvm-svn: 221375	2014-11-05 18:16:03 +00:00
Sanjay Patel	8f093f4138	remove extra breaks; NFC llvm-svn: 221374	2014-11-05 18:00:07 +00:00
Duncan P. N. Exon Smith	21efe02e59	IR: MDNode => Value: AsmWriter SlotTracker API Change `SlotTracker::CreateMetadataSlot()` and `SlotTracker::getMetadataSlot()` to use `Value` instead of `MDNode`. Part of PR21433. llvm-svn: 221373	2014-11-05 17:56:28 +00:00
Tilmann Scheller	30c5ca25a5	[ARM] Remove more dead code. Dead code identified by the Clang static analyzer. llvm-svn: 221372	2014-11-05 17:45:04 +00:00
Zoran Jovanovic	06c9d55123	ps][microMIPS] Implement CodeGen support for ANDI16 instruction llvm-svn: 221371	2014-11-05 17:43:00 +00:00
Colin LeMahieu	816ef086f6	[Hexagon] [NFC] Alphabetizing cmake files. llvm-svn: 221370	2014-11-05 17:38:48 +00:00
Zoran Jovanovic	9f99723d92	ps][microMIPS] Implement CodeGen support for SLL16 and SRL16 instructions llvm-svn: 221369	2014-11-05 17:38:31 +00:00
Tilmann Scheller	c339992338	[ARM] Remove another redundant assignment. Found by the Clang static analyzer. llvm-svn: 221368	2014-11-05 17:34:04 +00:00
Zoran Jovanovic	8853171b46	[mips][microMIPS] Implement ANDI16 instruction llvm-svn: 221367	2014-11-05 17:31:00 +00:00
Tilmann Scheller	219ad28076	[ARM] Remove redundant assignment. Found by the Clang static analyzer. llvm-svn: 221366	2014-11-05 17:28:19 +00:00
Peter Collingbourne	a1099840ff	[dfsan] Abort at runtime on indirect calls to uninstrumented vararg functions. We currently have no infrastructure to support these correctly. This is accomplished by generating a call to a runtime library function that aborts at runtime in place of the regular wrapper for such functions. Direct calls are rewritten in the usual way during traversal of the caller's IR. We also remove the "split-stack" attribute from such wrappers, as the code generator cannot currently handle split-stack vararg functions. llvm-svn: 221360	2014-11-05 17:21:00 +00:00
Duncan P. N. Exon Smith	9727e7865e	IR: MDNode => Value: NamedMDNode::addOperand() Change `NamedMDNode::addOperand()` to take a `Value ` instead of an `MDNode `. This is part of PR21433. llvm-svn: 221359	2014-11-05 17:16:09 +00:00
Tilmann Scheller	f2572c5097	[ARM] Remove dead code identified by the Clang static analyzer. llvm-svn: 221358	2014-11-05 17:10:43 +00:00
Zoran Jovanovic	9c654830f7	[mips][microMIPS] Mark symbols as microMIPS if necessary Differential Revision: http://reviews.llvm.org/D6039 llvm-svn: 221355	2014-11-05 16:35:20 +00:00
Zoran Jovanovic	a87308c84c	Reverted revisions 221351, 221352 and 221353. llvm-svn: 221354	2014-11-05 16:19:59 +00:00
Zoran Jovanovic	3038500f3b	[mips][microMIPS] Implement CodeGen support for ANDI16 instruction Differential Revision: http://reviews.llvm.org/D5797 llvm-svn: 221353	2014-11-05 15:54:05 +00:00
Zoran Jovanovic	f4f5f1e272	[mips][microMIPS] Implement CodeGen support for SLL16 and SRL16 instructions Differential Revision: http://reviews.llvm.org/D5933 llvm-svn: 221352	2014-11-05 15:46:53 +00:00
Zoran Jovanovic	e548bb0634	[mips][microMIPS] Implement ANDI16 instruction Differential Revision: http://reviews.llvm.org/D5163 llvm-svn: 221351	2014-11-05 15:39:41 +00:00
Tom Stellard	326d6ece94	R600/SI: Change all instruction assembly names to lowercase. This matches the format produced by the AMD proprietary driver. //==================================================================// // Shell script for converting .ll test cases: (Pass the .ll files you want to convert to this script as arguments). //==================================================================// ; This was necessary on my system so that A-Z in sed would match only ; upper case. I'm not sure why. export LC_ALL='C' TEST_FILES="$" MATCHES=`grep -v Patterns SIInstructions.td \| grep -o '"[A-Z0-9_]\+["e]' \| grep -o '[A-Z0-9_]\+' \| sort -r` for f in $TEST_FILES; do # Check that there are SI tests: grep -q -e 'verde' -e 'bonaire' -e 'SI' -e 'tahiti' $f if [ $? -eq 0 ]; then for match in $MATCHES; do sed -i -e "s/$[ :]$match$/\L\1/" $f done # Try to get check lines with partial instruction names sed -i 's/$;[ ]SI[A-Z\\-]: $$[A-Z_0-9]\+$/\1\L\2/' $f fi done sed -i -e 's/bb0_1/BB0_1/g' ../../../test/CodeGen/R600/infinite-loop.ll sed -i -e 's/SI-NOT: bfe/SI-NOT: {{[^@]}}bfe/g'../../../test/CodeGen/R600/llvm.AMDGPU.bfe.32.ll ../../../test/CodeGen/R600/sext-in-reg.ll sed -i -e 's/exp_IEEE/EXP_IEEE/g' ../../../test/CodeGen/R600/llvm.exp2.ll sed -i -e 's/numVgprs/NumVgprs/g' ../../../test/CodeGen/R600/register-count-comments.ll sed -i 's/$; CHECK[-NOT]*: $$[A-Z_0-9]\+$/\1\L\2/' ../../../test/CodeGen/R600/select64.ll ../../../test/CodeGen/R600/sgpr-copy.ll //==================================================================// // Shell script for converting .td files (run this last) //==================================================================// export LC_ALL='C' sed -i -e '/Patterns/!s/$"[A-Z0-9_]\+[ "e]$/\L\1/g' SIInstructions.td sed -i -e 's/"EXP/"exp/g' SIInstrInfo.td llvm-svn: 221350	2014-11-05 14:50:53 +00:00
Andrea Di Biagio	ce46b97b48	[X86] Teach method 'isVectorClearMaskLegal' how to check for legal blend masks. This patch improves the folding of vector AND nodes into blend operations for targets that feature SSE4.1. A vector AND node where one of the operands is a constant build_vector with elements that are either zero or all-ones can be converted into a blend. This allows for example to simplify the following code: define <4 x i32> @test(<4 x i32> %A, <4 x i32> %B) { %1 = and <4 x i32> %A, <i32 0, i32 0, i32 0, i32 -1> %2 = and <4 x i32> %B, <i32 -1, i32 -1, i32 -1, i32 0> %3 = or <4 x i32> %1, %2 ret <4 x i32> %3 } Before this patch llc (-mcpu=corei7) generated: andps LCPI1_0(%rip), %xmm0, %xmm0 andps LCPI1_1(%rip), %xmm1, %xmm1 orps %xmm1, %xmm0, %xmm0 retq With this patch we generate a single 'vpblendw'. llvm-svn: 221343	2014-11-05 13:04:14 +00:00
Oliver Stannard	9e89d8cc5c	[ARM] Honor FeatureD16 in the assembler and disassembler Some ARM FPUs only have 16 double-precision registers, rather than the normal 32. LLVM represents this with the D16 target feature. This is currently used by CodeGen to avoid using high registers when they are not available, but the assembler and disassembler do not. I fix this in the assmebler and disassembler rather than the InstrInfo.td files, as the latter would require a large number of changes everywhere one of the floating-point instructions is referenced in the backend. This solution is similar to the one used for co-processor numbers and MSR masks. llvm-svn: 221341	2014-11-05 12:06:39 +00:00
Craig Topper	12f0d9ef2c	Improve logic that decides if its profitable to commute when some of the virtual registers involved have uses/defs chains connecting them to physical register. Fix up the tests that this change improves. llvm-svn: 221336	2014-11-05 06:43:02 +00:00
David Majnemer	5026722287	llvm-readobj: Add support for dumping the DOS header in PE files llvm-svn: 221333	2014-11-05 06:24:35 +00:00
Jiangning Liu	1fb71bc395	Revert 220932. Commit 220932 caused crash when building clang-tblgen on aarch64 debian target, so it's blocking all daily tests. The std::call_once implementation in pthread has bug for aarch64 debian. llvm-svn: 221331	2014-11-05 04:44:31 +00:00
Duncan P. N. Exon Smith	b28deb1967	IR: Metadata: Remove unnecessary dyn_cast llvm-svn: 221328	2014-11-05 01:55:06 +00:00
David Majnemer	bf7550e7ec	InstSimplify: Exact shifts of X by Y are X if X has the lsb set Exact shifts may not shift out any non-zero bits. Use computeKnownBits to determine when this occurs and just return the left hand side. This fixes PR21477. llvm-svn: 221325	2014-11-05 00:59:59 +00:00
Tim Northover	dc0d9e46a5	ARM: try to add extra CS-register whenever stack alignment >= 8. We currently try to push an even number of registers to preserve 8-byte alignment during a function's prologue, but only when the stack alignment is prcisely 8. Many of the reasons for doing it apply also when that alignment > 8 (the extra store is often free, and can save another stack adjustment, though less frequently for 16-byte stack alignment). llvm-svn: 221321	2014-11-05 00:27:20 +00:00
Tim Northover	228c943f31	ARM/Dwarf: correctly align stack before callee-saved VPRs We were making an attempt to do this by adding an extra callee-saved GPR (so that there was an even number in the list), but when that failed we went ahead and pushed anyway. This had a couple of potential issues: + The .cfi directives we emit misplaced dN because they were based on PrologEpilogInserter's calculation. + Unaligned stores can be less efficient. + Unaligned stores can actually fault (likely only an issue in niche cases, but possible). This adds a final explicit stack adjustment if all other options fail, so that the actual locations of the registers match up with where they should be. llvm-svn: 221320	2014-11-05 00:27:13 +00:00
David Majnemer	f20d7c4c61	Analysis: Make isSafeToSpeculativelyExecute fire less for divides Divides and remainder operations do not behave like other operations when they are given poison: they turn into undefined behavior. It's really hard to know if the operands going into a div are or are not poison. Because of this, we should only choose to speculate if there are constant operands which we can easily reason about. This fixes PR21412. llvm-svn: 221318	2014-11-04 23:49:08 +00:00
Reid Kleckner	941e93e9a8	Revert "[Reassociate] Canonicalize negative constants out of expressions." This reverts commit r221171. It performs this invalid transformation: - %div.i = urem i64 -1, %add - %sub.i = sub i64 -2, %div.i + %div.i = urem i64 1, %add + %sub.i1 = add i64 %div.i, -2 llvm-svn: 221317	2014-11-04 23:42:45 +00:00
Simon Pilgrim	c9a0779309	[X86][SSE] Enable commutation for SSE immediate blend instructions Patch to allow (v)blendps, (v)blendpd, (v)pblendw and vpblendd instructions to be commuted - swaps the src registers and inverts the blend mask. This is primarily to improve memory folding (see new tests), but it also improves the quality of shuffles (see modified tests). Differential Revision: http://reviews.llvm.org/D6015 llvm-svn: 221313	2014-11-04 23:25:08 +00:00
Mark Heffernan	2d393ea6ef	Revert earlier change removing setPreservesCFG from instcombine (r221223) and change LoopSimplifyPass to be !isCFGOnly. The motivation for the earlier patch (r221223) was that LoopSimplify is not preserved by instcombine though setPreservesCFG indicates that it is. This change fixes the issue by making setPreservesCFG no longer imply LoopSimplifyPass, and is therefore less invasive. llvm-svn: 221311	2014-11-04 23:02:09 +00:00
Juergen Ributzka	f9660f0712	[AArch64] Use the correct register class for ORR. While fixing up the register classes in the machine combiner in a previous commit I missed one. This fixes the last one and adds a test case. llvm-svn: 221308	2014-11-04 22:20:07 +00:00
Rafael Espindola	d85260827c	Revert "[mips] Add names and tests for the hardware registers" This reverts commit r221299. The tests LLVM :: MC/Disassembler/Mips/mips32.txt LLVM :: MC/Disassembler/Mips/mips32_le.txt were failing. llvm-svn: 221307	2014-11-04 22:15:05 +00:00
David Blaikie	3a443c29b9	Provide gmlt-like inline scope information in the skeleton CU to facilitate symbolication without needing the .dwo files Clang -gsplit-dwarf self-host -O0, binary increases by 0.0005%, -O2, binary increases by 25%. A large binary inside Google, split-dwarf, -O0, and other internal flags (GDB index, etc) increases by 1.8%, optimized build is 35%. The size impact may be somewhat greater in .o files (I haven't measured that much - since the linked executable -O0 numbers seemed low enough) due to relocations. These relocations could be removed if we taught the llvm-symbolizer to handle indexed addressing in the .o file (GDB can't cope with this just yet, but GDB won't be reading this info anyway). Also debug_ranges could be shared between .o and .dwo, though ideally debug_ranges would get a schema that could used index(+offset) addressing, and move to the .dwo file, then we'd be back to sharing addresses in the address pool again. But for now, these sizes seem small enough to go ahead with this. Verified that no other DW_TAGs are produced into the .o file other than subprograms and inlined_subroutines. llvm-svn: 221306	2014-11-04 22:12:25 +00:00
David Blaikie	9bfd7a9f43	Move cross-unit DIE caching to the DwarfFile level, so it doesn't interfere with fission-gmlt data and produce skeleton<>full unit cross referencing. llvm-svn: 221305	2014-11-04 22:12:18 +00:00
Rafael Espindola	bfd0f01dd7	Don't produce relocations for a difference in a section with no symbols. We were producing a relocation for ---------------- .section foo,bar La: Lb: .long La-Lb -------------- but not for --------------------- .section foo,bar zed: La: Lb: .long La-Lb ---------------- This patch handles the case where both fragments are part of the first atom in a section and there is no corresponding symbol to that atom. This fixes pr21328. llvm-svn: 221304	2014-11-04 22:10:33 +00:00
Vasileios Kalintiris	a16974a5c0	[mips] Move COP2 & COP3 load/store instructions from MipsInstrFPU.td to MipsInstrInfo.td. NFC. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5843 llvm-svn: 221300	2014-11-04 21:45:16 +00:00
Vasileios Kalintiris	df6e0d0371	[mips] Add names and tests for the hardware registers Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5763 llvm-svn: 221299	2014-11-04 21:30:44 +00:00
Andrea Di Biagio	f5b34e535d	[X86] Add 'FeatureSlowSHLD' to cpu 'bdver3'. Also explicit set FeatureAVX and FeatureSSE4A for all the bdver* cpus. This patch adds 'FeatureSlowSHLD' to 'bdver3'. According to the official AMD optimization guide for amdfam15: "Using alternative code in place of SHLD achieves lower overall latency and requires fewer execution resources. The 32-bit and 64-bit forms of ADD, ADC, SHR, and LEA (except 16-bit form) are DirectPath instructions, while SHLD is a VectorPath instruction." This patch also explicitly sets feature AVX and SSE4A for all the bdver* cpus. This part of the patch is a non-functional change and it is mainly done for clarity reasons (Both XOP and FMA4 already imply AVX and SSE4A). llvm-svn: 221296	2014-11-04 21:18:09 +00:00
Arnaud A. de Grandmaison	a11cab3120	[PBQP] Callee saved regs should have a higher cost than scratch regs Registers are not all equal. Some are not allocatable (infinite cost), some have to be preserved but can be used, and some others are just free to use. Ensure there is a cost hierarchy reflecting this fact, so that the allocator will favor scratch registers over callee-saved registers. llvm-svn: 221293	2014-11-04 20:51:29 +00:00
Arnaud A. de Grandmaison	829dd81377	[PBQP] Tweak spill costs and coalescing benefits This patch improves how the different costs (register, interference, spill and coalescing) relates together. The assumption is now that: - coalescing (or any other "side effect" of reg alloc) is negative, and instead of being derived from a spill cost, they use the block frequency info. - spill costs are in the [MinSpillCost:+inf( range - register or interference costs are in [0.0:MinSpillCost( or +inf The current MinSpillCost is set to 10.0, which is a random value high enough that the current constraint builders do not need to worry about when settings costs. It would however be worth adding a normalization step for register and interference costs as the last step in the constraint builder chain to ensure they are not greater than SpillMinCost (unless this has some sense for some architectures). This would work well with the current builder pipeline, where all costs are tweaked relatively to each others, but could grow above MinSpillCost if the pipeline is deep enough. The current heuristic is tuned to depend rather on the number of uses of a live interval rather than a density of uses, as used by the greedy allocator. This heuristic provides a few percent improvement on a number of benchmarks (eembc, spec, ...) and will definitely need to change once spill placement is implemented: the current spill placement is really ineficient, so making the cost proportionnal to the number of use is a clear win. llvm-svn: 221292	2014-11-04 20:51:24 +00:00
Matt Arsenault	a95f5a0ec1	R600/SI: Rename div_scale dest operands to match documentation llvm-svn: 221291	2014-11-04 20:29:20 +00:00
Benjamin Kramer	185dc0da1f	AArch64: Pattern match integer vector abs like we do on ARM. This kind of pattern is emitted by the loop vectorizer. llvm-svn: 221289	2014-11-04 20:10:06 +00:00
Kostya Serebryany	c5bd9810cc	[asan] [mips] changed ShadowOffset32 for systems having 16kb PageSize; patch by Kumar Sukhani llvm-svn: 221288	2014-11-04 19:46:15 +00:00
David Majnemer	2de97fcd9a	InstSimplify: Fold a hasNoSignedWrap() call into a match() expression No functionality change intended, it's just a little more concise. llvm-svn: 221281	2014-11-04 17:47:13 +00:00
David Majnemer	4f438377fb	InstSimplify: Fold a hasNoUnsignedWrap() call into a match() expression No functionality change intended, it's just a little more concise. llvm-svn: 221280	2014-11-04 17:38:50 +00:00
Toma Tabacu	cc2502d8f3	[mips] Improve support for the .set mips16/nomips16 assembler directives. Summary: Appropriately set/clear the FeatureBit for Mips16 when these assembler directives are used and also emit ".set nomips16" (previously, only ".set mips16" was being emitted). These improvements allow for better testing of the .cpload/.cprestore assembler directives (which are not supposed to work when Mips16 is enabled). Test Plan: The test is bare-bones because there are no MC tests for Mips16 instructions (there's only one, which checks that the Mips16 ELF header flag gets set), and that suggests to me that it has not been implemented yet in the IAS. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5462 llvm-svn: 221277	2014-11-04 17:18:07 +00:00
Sanjay Patel	aee8421088	remove function names from comments; NFC llvm-svn: 221274	2014-11-04 16:27:42 +00:00
Sanjay Patel	547e9752ff	fix typo in comment llvm-svn: 221273	2014-11-04 16:09:50 +00:00
Simon Atanasyan	d2bfd00e71	[yaml2obj] Allow yaml2obj tool to recognize EF_MIPS_NAN2008 flag llvm-svn: 221268	2014-11-04 13:33:36 +00:00
Rafael Espindola	c1f30877e0	Remove FindProgramByName. NFC. llvm-svn: 221258	2014-11-04 12:35:47 +00:00
Yaron Keren	ec69a4ece1	Fix Visual C++ warning, Program.inc(85): warning C4018: '<' : signed/unsigned mismatch. llvm-svn: 221252	2014-11-04 09:22:41 +00:00
NAKAMURA Takumi	72e626e305	sys::findProgramByName(): [Win32] Tweak to pass lowercase .exe to SearchPath() to appease clang Driver's tests. It seems SearchPath() doesn't show actual extension on the filesystem. FIXME: Shall we use FindFirstFile() here? llvm-svn: 221246	2014-11-04 08:17:15 +00:00
David Majnemer	b925715c56	CodeGen: Enable DWARF emission for MS ABI targets This is experimental, just barely enough to get things to not immediately combust. A note for those who are curious: Only lld can successfully link the object files, other linkers truncate the section names making the debug sections illegible to debuggers. Even with this in mind, we believe we are having trouble with SECREL relocations. llvm-svn: 221245	2014-11-04 08:03:31 +00:00
Yaron Keren	6091fe7db9	#include <winbase.h> is not enough for Visual C++ 2013, it errors: 1>C:\Program Files (x86)\Windows Kits\8.1\Include\um\minwinbase.h(46): error C2146: syntax error : missing ';' before identifier 'nLength' 1>C:\Program Files (x86)\Windows Kits\8.1\Include\um\minwinbase.h(46): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int ... including <windows.h> is actually required. llvm-svn: 221244	2014-11-04 07:53:30 +00:00
NAKAMURA Takumi	8d8f396d86	R600/LLVMBuild.txt: Add TransformUtils. llvm-svn: 221228	2014-11-04 02:16:53 +00:00
Reid Kleckner	dd3f3edafa	Revert "Transforms: reapply SVN r219899" This reverts commit r220811 and r220839. It made an incorrect change to musttail handling. llvm-svn: 221226	2014-11-04 02:02:14 +00:00
Mark Heffernan	2e25042a93	Remove setPreservesCFG from instcombine. The pass, in particular, does not preserve LoopSimplify because instcombine may replace branch predicates with undef which loop simplify then replaces with always exit. Replace setPreservesCFG with the more constrained preservation of DomTree and LoopInfo. llvm-svn: 221223	2014-11-04 01:51:01 +00:00
Michael J. Spencer	f9074b5a91	Use findProgramByName. llvm-svn: 221221	2014-11-04 01:29:59 +00:00
Michael J. Spencer	65ffd92f07	[Support][Program] Add findProgramByName(Name, OptionalPaths) llvm-svn: 221220	2014-11-04 01:29:29 +00:00
Kevin Enderby	72cdbf47a9	Remove the static version of getScatteredRelocationType() now that r221211 added a public version MachOObjectFile::getScatteredRelocationType(). This should fix the build bot for the unused function error. llvm-svn: 221216	2014-11-04 01:12:39 +00:00
Sanjoy Das	e839965faa	The patchpoint lowering logic would crash with live constants equal to the tombstone or empty keys of a DenseMap<int64_t, T>. This patch fixes the issue (and adds a tests case). llvm-svn: 221214	2014-11-04 00:59:21 +00:00
Kevin Enderby	9907d0a3c2	Add the code and test cases for 32-bit Intel to llvm-objdump’s Mach-O symbolizer. llvm-svn: 221211	2014-11-04 00:43:16 +00:00
Colin LeMahieu	5241881bbc	[Hexagon] Reverting 220584 to address ASAN errors. llvm-svn: 221210	2014-11-04 00:14:36 +00:00
Sanjoy Das	429c9cae09	Change logic in StackMaps::recordStackMapOpers to use the isInt<32> predicate instead of bitwise operations. This is not a functional change. llvm-svn: 221209	2014-11-04 00:06:57 +00:00
Akira Hatanaka	bc950d52d7	Rename variables to conform to llvm coding standards. Differential Revision: http://reviews.llvm.org/D6062 llvm-svn: 221204	2014-11-03 23:24:10 +00:00
Hal Finkel	840257a49c	Use AA in LoadCombine LoadCombine can be smarter about aborting when a writing instruction is encountered, instead of aborting upon encountering any writing instruction, use an AliasSetTracker, and only abort when encountering some write that might alias with the loads that could potentially be combined. This was originally motivated by comments made (and a test case provided) by David Majnemer in response to PR21448. It turned out that LoadCombine was not responsible for that PR, but LoadCombine should also be improved so that unrelated stores (and @llvm.assume) don't interrupt load combining. llvm-svn: 221203	2014-11-03 23:19:16 +00:00
David Blaikie	5b02a19f90	Use common range handling for the CU's ranges This generalizes the range handling for ranges in both the skeleton and full unit, laying the foundation for the addition of more ranges (rather than just the CU's special case) in the skeleton CU with fission+gmlt. llvm-svn: 221202	2014-11-03 23:10:59 +00:00
Akira Hatanaka	9ee2c26b49	[AArch64] Make function processLogicalImmediate more efficient. NFC. llvm-svn: 221199	2014-11-03 23:06:31 +00:00
David Majnemer	7e2b9882b1	InstCombine: Remove infinite loop caused by FoldOpIntoPhi FoldOpIntoPhi could create an infinite loop if the PHI could potentially reach a BB it was considering inserting instructions into. The instructions it would insert would eventually lead to other combines firing which would, again, lead to FoldOpIntoPhi firing. The solution is to handicap FoldOpIntoPhi so that it doesn't attempt to insert instructions that the PHI might reach. This fixes PR21377. llvm-svn: 221187	2014-11-03 21:55:12 +00:00
David Blaikie	542616d47c	Push the CURangeList down into the skeleton CU (where available) rather than the full CU So that it may be shared between skeleton/full compile unit, for CU ranges and other ranges to be added for fission+gmlt. (at some point we might want some kind of object shared between the skeleton and full compile units for all those things we only want one of in that scope, rather than having the full unit always look through to the skeleton... - alternatively, we might be able to have the skeleton pointer (or another, separate pointer) point to the skeleton or to the unit itself in non-fission, so we don't have to special case its absence) llvm-svn: 221186	2014-11-03 21:52:56 +00:00
Ahmed Bougacha	ec0f3d755f	[X86] Add debug print name for X86ISD::[US]MUL8. NFC-ish. The opcodes were added in r220516, but I forgot to add the print names. llvm-svn: 221185	2014-11-03 21:25:18 +00:00
David Blaikie	ce343492ee	Add DwarfCompileUnit::BaseAddress to track the base address used by relative addressing in debug_ranges and debug_loc This is one of a few steps to generalize range handling to include the CU range (thus the CU's range list will be moved into the range list list, losing track of the base address in the process), which means generalizing ranges from both the skeleton and full unit under fission. And... then I can used that generalized support for ranges in fission+gmlt where there'll be a bunch more ranges in the skeleton. llvm-svn: 221182	2014-11-03 21:15:30 +00:00
Akira Hatanaka	b961534818	[ARM, inline-asm] Fix ARMTargetLowering::getRegForInlineAsmConstraint to return register class tGPRRegClass if the target is thumb1. This commit fixes a crash that occurs during register allocation which was triggered when a virtual register defined by an inline-asm instruction had to be spilled. rdar://problem/18740489 llvm-svn: 221178	2014-11-03 20:37:04 +00:00
Ahmed Bougacha	12eb558bd9	[X86] 8bit divrem: Improve codegen for AH register extraction. For 8-bit divrems where the remainder is used, we used to generate: divb %sil shrw $8, %ax movzbl %al, %eax That was to avoid an H-reg access, which is problematic mainly because it isn't possible in REX-prefixed instructions. This patch optimizes that to: divb %sil movzbl %ah, %eax To do that, we explicitly extend AH, and extract the L-subreg in the resulting register. The extension is done using the NOREX variants of MOVZX. To support signed operations, MOVSX_NOREX is also added. Further, this introduces a new SDNode type, [us]divrem_ext_hreg, which is then lowered to a sequence containing a single zext (rather than 2). Differential Revision: http://reviews.llvm.org/D6064 llvm-svn: 221176	2014-11-03 20:26:35 +00:00
Hal Finkel	1e16fa302e	EarlyCSE should ignore calls to @llvm.assume EarlyCSE uses a simple generation scheme for handling memory-based dependencies, and calls to @llvm.assume (which are marked as writing to memory to ensure the preservation of control dependencies) disturb that scheme unnecessarily. Skipping calls to @llvm.assume is legal, and the alternative (adding AA calls in EarlyCSE) is likely undesirable (we have GVN for that). Fixes PR21448. llvm-svn: 221175	2014-11-03 20:21:32 +00:00
Tom Stellard	5cbb53c41e	Reapply: R600: Make sure to inline all internal functions Function calls aren't supported yet. This was reverted due to build breakages, which should be fixed now. llvm-svn: 221173	2014-11-03 19:49:05 +00:00
Chad Rosier	005505b027	[Reassociate] Canonicalize negative constants out of expressions. This gives CSE/GVN more options to eliminate duplicate expressions. This is a follow up patch to http://reviews.llvm.org/D4904. http://reviews.llvm.org/D5363 llvm-svn: 221171	2014-11-03 19:11:30 +00:00
Paul Robinson	ad06e430ce	Normally an 'optnone' function goes through fast-isel, which does not call DAGCombiner. But we ran into a case (on Windows) where the calling convention causes argument lowering to bail out of fast-isel, and we end up in CodeGenAndEmitDAG() which does run DAGCombiner. So, we need to make DAGCombiner check for 'optnone' after all. Commit includes the test that found this, plus another one that got missed in the original optnone work. llvm-svn: 221168	2014-11-03 18:19:26 +00:00
Duncan P. N. Exon Smith	3d5a02f677	IR: MDNode => Value: Instruction::getAllMetadataOtherThanDebugLoc() Change `Instruction::getAllMetadataOtherThanDebugLoc()` from a vector of `MDNode` to one of `Value`. Part of PR21433. llvm-svn: 221167	2014-11-03 18:13:57 +00:00
Charlie Turner	1d8cc909cc	Remove the cortex-a9-mp CPU. This CPU definition is redundant. The Cortex-A9 is defined as supporting multiprocessing extensions. Remove its definition and update appropriate tests. LLVM defines both a cortex-a9 CPU and a cortex-a9-mp CPU. The only difference between the two CPU definitions in ARM.td is that cortex-a9-mp contains the feature FeatureMP for multiprocessing extensions. This is redundant since the Cortex-A9 is defined as having multiprocessing extensions in the TRMs. armcc also defines the Cortex-A9 as having multiprocessing extensions by default. Change-Id: Ifcadaa6c322be0a33d9d2a39cfdd7da1d75981a7 llvm-svn: 221166	2014-11-03 17:38:00 +00:00
David Blaikie	077ad48447	Cleanup some unused or trivial functions in DwarfCompileUnit llvm-svn: 221164	2014-11-03 17:10:38 +00:00
David Blaikie	bc532b44a0	Sink DwarfUnit::CURanges into DwarfCompileUnit llvm-svn: 221161	2014-11-03 16:40:43 +00:00
Oliver Stannard	269a275cb4	[AArch64] Fix miscompile of comparison with 0xffffffffffffffff Some literals in the AArch64 backend had 15 'f's rather than 16, causing comparisons with a constant 0xffffffffffffffff to be miscompiled. llvm-svn: 221157	2014-11-03 15:28:40 +00:00
Sid Manning	326f8af463	Handle ctor/init_array initialization. Hexagon was not calling InitializeELF and could not select between ctors and init_array. Phabricator revision: http://reviews.llvm.org/D6061 llvm-svn: 221156	2014-11-03 14:56:05 +00:00
Rafael Espindola	42bce8f69d	Add CRLF support to LineIterator. The MRI scripts have to work with CRLF, and in general it is probably a good idea to support this in a core utility like LineIterator. llvm-svn: 221153	2014-11-03 14:09:47 +00:00
Oliver Stannard	cf6bfb1dd0	Revert r221150, as it broke sanitizer tests llvm-svn: 221151	2014-11-03 12:19:03 +00:00
Oliver Stannard	652ec6ee89	Emit .eh_frame with relocations to functions, rather than sections When LLVM emits DWARF call frame information, it currently creates a local, section-relative symbol in the code section, which is pointed to by a relocation on the .eh_frame section. However, for C++ we emit some functions in section groups, and the SysV ABI has some rules to make it easier to remove these sections (http://www.sco.com/developers/gabi/latest/ch4.sheader.html#section_group_rules): A symbol table entry with STB_LOCAL binding that is defined relative to one of a group's sections, and that is contained in a symbol table section that is not part of the group, must be discarded if the group members are discarded. References to this symbol table entry from outside the group are not allowed. This means that we need to use the function symbol for the relocation, not a temporary symbol. There was a comment in the code claiming that the local symbol was used to avoid creating a relocation, but a relocation must be created anyway as the code and CFI are in different sections. llvm-svn: 221150	2014-11-03 12:02:51 +00:00
Peter Collingbourne	094d061659	CMake: Add libm to list of system libs printed by llvm-config. This is required by the interpreter library, and also matches the autoconf behavior. llvm-svn: 221147	2014-11-03 10:38:26 +00:00
Daniel Sanders	0ad1719d15	[mips] Remove unused prototype and variable. NFC. llvm-svn: 221146	2014-11-03 10:14:57 +00:00

... 3 4 5 6 7 ...

74347 Commits