llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Atanasyan	1cd169f137	[Mips] Support SHT_MIPS_ABIFLAGS section type flag in the llvm-readobj, obj2yaml and yaml2obj tools. llvm-svn: 212908	2014-07-13 15:28:54 +00:00
NAKAMURA Takumi	c4fc6eec53	[CMake] Add LLVM_LINK_COMPONENTS to loadable modules, LLVMHello and BugpointPasses, on Win32. llvm-svn: 212904	2014-07-13 13:36:48 +00:00
David Majnemer	ebc741168b	IR: Allow comdats to be applied to globals with internal linkage Our verifier check for checking if a global has local linkage was too strict. Forbid private linkage but permit local linkage. Object file formats permit this and forbidding it prevents elimination of unused, internal, vftables under the MSVC ABI. llvm-svn: 212900	2014-07-13 04:56:11 +00:00
David Majnemer	299674e94f	MC: Let non-temporary COFF aliases be in symtab MC was aping a binutils bug where aliases would default their linkage to private instead of internal. I've sent a patch to the binutils maintainers and they've recently applied it to the GNU assembler sources. This fixes PR20152. Differential Revision: http://reviews.llvm.org/D4395 llvm-svn: 212899	2014-07-13 04:31:19 +00:00
Matt Arsenault	c3f6a7e44e	Remove unused include llvm-svn: 212898	2014-07-13 03:08:59 +00:00
Matt Arsenault	d32dbb6a10	R600: Use range for and fix missing consts. llvm-svn: 212897	2014-07-13 03:06:43 +00:00
Matt Arsenault	762af96f46	R600: Make ShaderType private llvm-svn: 212896	2014-07-13 03:06:39 +00:00
Matt Arsenault	d9a23ab20d	R600: Add option to disable promote alloca This can make writing some tests harder, so add a flag to disable it. llvm-svn: 212893	2014-07-13 02:08:26 +00:00
Matt Arsenault	4181ea36a9	Templatify DominanceFrontier. Theoretically this should now work for MachineBasicBlocks. llvm-svn: 212885	2014-07-12 21:59:52 +00:00
Saleem Abdulrasool	f74d48a011	AArch64: add support for llvm.aarch64.hint intrinsic This adds a llvm.aarch64.hint intrinsic to mirror the llvm.arm.hint in order to support the various hint intrinsic functions in the ACLE. Add an optional pattern field that permits the subclass to specify the pattern that matches the selection. The intrinsic pattern is set as mayLoad, mayStore, so overload the value for the definition of the hint instruction. llvm-svn: 212883	2014-07-12 21:20:49 +00:00
Saleem Abdulrasool	db514056de	MC: remove use of unnecessary variable Due to the fact that the windows unwinding has the concept of chained frames, we maintain a current frame info pointer that is adjusted on any push and pop of a unwinding context. This just removes an unnecessary variable that was used to mirror the DWARF unwinding code. llvm-svn: 212882	2014-07-12 20:49:13 +00:00
Saleem Abdulrasool	4a1a2f7790	MC: rename MCW64UnwindInfo to MCWinFrameInfo This structure contains information related to the call frame used to generate unwinding information. Rename this to reflect the future use to represent the shared state between various architectures for WinCFI information. llvm-svn: 212881	2014-07-12 20:49:09 +00:00
Simon Atanasyan	8ebb6aed9b	[ELFYAML] Group ELF section type flags to target specific blocks. Recognize only flags which correspond to the current target. llvm-svn: 212880	2014-07-12 18:25:08 +00:00
Owen Anderson	a8d1c3e74e	Fix an issue with the MergeBasicBlockIntoOnlyPred() helper function where it did not properly handle the case where the predecessor block was the entry block to the function. The only in-tree client of this is JumpThreading, which worked around the issue in its own code. This patch moves the solution into the helper so that JumpThreading (and other clients) do not have to replicate the same fix everywhere. llvm-svn: 212875	2014-07-12 07:12:47 +00:00
Alexey Samsonov	15c9669615	[ASan] Collect unmangled names of global variables in Clang to print them in error reports. Currently ASan instrumentation pass creates a string with global name for each instrumented global (to include global names in the error report). Global name is already mangled at this point, and we may not be able to demangle it at runtime (e.g. there is no __cxa_demangle on Android). Instead, create a string with fully qualified global name in Clang, and pass it to ASan instrumentation pass in llvm.asan.globals metadata. If there is no metadata for some global, ASan will use the original algorithm. This fixes https://code.google.com/p/address-sanitizer/issues/detail?id=264. llvm-svn: 212872	2014-07-12 00:42:52 +00:00
Duncan P. N. Exon Smith	6075510839	BFI: Add constructor for Weight llvm-svn: 212868	2014-07-12 00:26:00 +00:00
Duncan P. N. Exon Smith	345c287da9	BFI: Clean up BlockMass Implementation is small now -- the interesting logic was moved to `BranchProbability` a while ago. Move it into `bfi_detail` and get rid of the related TODOs. I was originally planning to define it within `BlockFrequencyInfoImpl` (or `BFIIBase`), but it seems cleaner in a namespace. Besides, `isPodLike` needs to be specialized before `BlockMass` can be used in some of the other data structures, and there isn't a clear way to do that. llvm-svn: 212866	2014-07-12 00:21:30 +00:00
Lang Hames	5d10284238	[RuntimeDyld] Fix stub size and offset for AArch64 in RuntimeDyldMachO.h. <rdar://problem/17648000> llvm-svn: 212864	2014-07-12 00:16:47 +00:00
Reid Kleckner	fb9519838a	Avoid a warning from MSVC on "*/" in this code by inserting a space llvm-svn: 212862	2014-07-12 00:06:46 +00:00
Duncan P. N. Exon Smith	b5650e5eae	BFI: Mark the end of namespaces llvm-svn: 212861	2014-07-11 23:56:50 +00:00
Lang Hames	cb314ceaa9	[RuntimeDyld] Add GOT support for AArch64 to RuntimeDyldMachO. Test cases to follow once RuntimeDyldChecker supports introspection of stubs. Fixes <rdar://problem/17648000> llvm-svn: 212859	2014-07-11 23:52:07 +00:00
Juergen Ributzka	d755e9f730	Revert "[FastISel][X86] Implement the FastLowerIntrinsicCall hook." This reverts commit r212851, because it broke the memset lowering. llvm-svn: 212855	2014-07-11 23:10:08 +00:00
Juergen Ributzka	04b444913b	[FastISel][X86] Implement the FastLowerIntrinsicCall hook. Rename X86VisitIntrinsicCall -> FastLowerIntrinsicCall, which effectively implements the target hook. llvm-svn: 212851	2014-07-11 22:37:43 +00:00
Alexey Samsonov	08f022ae84	[ASan] Introduce a struct representing the layout of metadata entry in llvm.asan.globals. No functionality change. llvm-svn: 212850	2014-07-11 22:36:02 +00:00
Juergen Ributzka	3d9e6755e4	[FastISel] Add target-independent patchpoint intrinsic support. WIP. This implements the target-independent lowering for the patchpoint intrinsic. Targets have to implement the FastLowerCall hook to support this intrinsic. Related to <rdar://problem/17427052> llvm-svn: 212849	2014-07-11 22:19:02 +00:00
Juergen Ributzka	8179e9e5ad	[FastISel] Add basic infrastructure to support a target-independent call lowering hook in FastISel. WIP The infrastructure mimics the call lowering we have already in place for SelectionDAG, but with limitations. For example structure return demotion and non-simple types are not supported (yet). Currently every backend has its own implementation and duplicated code for call lowering. There is also no specified interface that could be called from target-independent code. The target-hook is opt-in and doesn't affect current implementations. llvm-svn: 212848	2014-07-11 22:01:42 +00:00
Aditya Nandakumar	0b5a674243	When we sink an instruction, this can open up opportunity for the operands to be sunk - add them to the worklist llvm-svn: 212847	2014-07-11 21:49:39 +00:00
Argyrios Kyrtzidis	730abd2f4a	Move the API and implementation of clang::driver::getARMCPUForMArch() to llvm::Triple::getARMCPUForArch(). Suggested by Eric Christopher. llvm-svn: 212846	2014-07-11 21:44:54 +00:00
Juergen Ributzka	4ce9863d0b	[FastISel] Make isInTailCallPosition independent of SelectionDAG. Break out the arguments required from SelectionDAG, so that this function can also be used by FastISel. llvm-svn: 212844	2014-07-11 20:50:47 +00:00
Juergen Ributzka	5dd32136b9	[FastISel] Breakout intrinsic lowering into a separate function and add a target-hook. Create a separate helper function for target-independent intrinsic lowering. Also add a target-hook that allows calling directly into a target-specific intrinsic lowering method. Currently the implementation is opt-in and doesn't affect existing target implementations. llvm-svn: 212843	2014-07-11 20:42:12 +00:00
Alp Toker	48bbd061bc	Simplify the raw_svector_ostream tweak from r212816 The memcpy() and overlap helps didn't help much with timings, so clean up the change. The difference at this point is that we now leave growth of the storage buffer up to SmallVector's implementation: - OS.reserve(OS.capacity() * 2); + OS.reserve(OS.size() + 64); llvm-svn: 212837	2014-07-11 18:23:08 +00:00
Ulrich Weigand	0a51abc100	[MC] Constify MCELF::GetVisibility and MCELF::getOther These two routines didn't take a "const MCSymbolData &SD" like the other MCELF::Get routines for some reason ... llvm-svn: 212834	2014-07-11 17:34:44 +00:00
Ulrich Weigand	ea147a9d43	[PowerPC] Fix invalid displacement created by LocalStackAlloc This commit fixes a bug in PPCRegisterInfo::isFrameOffsetLegal that could result in the LocalStackAlloc pass creating an MI instruction out-of-range displacement: %vreg17<def> = LD 33184, %vreg31; mem:LD8[%g](align=32) %G8RC:%vreg17 G8RC_and_G8RC_NOX0:%vreg31 (In final assembler output the top bits are stripped off, resulting in a negative offset loading from below the stack pointer.) Common code expects the isFrameOffsetLegal routine to verify whether adding a given offset to the offset already present in the instruction results in a valid displacement. However, on PowerPC the routine did not take the already present instruction offset into account. This commit fixes isFrameOffsetLegal to add the instruction offset, and updates a local caller (needsFrameBaseReg) to no longer add the instruction offset itself before calling isFrameOffsetLegal. Reviewed by Hal Finkel. llvm-svn: 212832	2014-07-11 17:19:31 +00:00
Marek Olsak	eac5062cc0	R600/SI: Use i32 vectors for resources and samplers This affects new intrinsics only. What surprises me is that v32i8 still works. llvm-svn: 212831	2014-07-11 17:11:52 +00:00
Marek Olsak	d8ecaeec02	R600/SI: add sample and image intrinsics exposing all instruction fields We need the intrinsics with offsets, so why not just add them all. The R128 parameter will also be useful for reducing SGPR usage. GL_ARB_image_load_store also adds some image GLSL modifiers like "coherent", so Mesa will probably translate those to slc, glc, etc. When LLVM 3.5 is released, I'll switch Mesa to these new intrinsics. llvm-svn: 212830	2014-07-11 17:11:46 +00:00
Marek Olsak	ba77c3e4ed	R600/SI: fix shadow mapping for 1D and 2D array textures It was conflicting with def TEX_SHADOW_ARRAY, which also handles them. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 212829	2014-07-11 17:11:39 +00:00
Alp Toker	bc4d1a3604	raw_svector_ostream: grow and reserve atomically Including the scratch buffer size in the initial reservation eliminates the subsequent malloc+move operation and offers a healthier constant growth with less memory wastage. When doing this, take care to avoid invalidating the source buffer. llvm-svn: 212816	2014-07-11 14:02:04 +00:00
Oliver Stannard	6eda6ffc0c	ARM: Allow __fp16 as a function arg or return type for AArch64 ACLE 2.0 allows __fp16 to be used as a function argument or return type. This enables this for AArch64. llvm-svn: 212812	2014-07-11 13:33:46 +00:00
Quentin Colombet	0f179c4d8a	[X86] Fix the inversion of low and high bits for the lowering of MUL_LOHI. Also add a few comments. <rdar://problem/17581756> llvm-svn: 212808	2014-07-11 12:08:23 +00:00
Marcello Maggioni	78035b11ec	Fixup PHIs in LowerSwitch when a Leaf node is not emitted. This commit fixes bug http://llvm.org/bugs/show_bug.cgi?id=20103. Thanks to Qwertyuiop for the report and the proposed fix. llvm-svn: 212802	2014-07-11 10:34:36 +00:00
Adam Nemet	26f817497c	[X86] AVX512: Improve readability of isCDisp8 No functional change. As I was trying to understand this function, I found that variables were reused with confusing names and the broadcast case was a bit too implicit. Hopefully, this is an improvement. llvm-svn: 212795	2014-07-11 05:23:25 +00:00
Adam Nemet	e311c3c836	[X86] AVX512: Simplify logic in isCDisp8 It was computing the VL/n case as: MemObjSize = VectorByteSize / ElemByteSize / Divider * ElemByteSize ElemByteSize not only falls out but VectorByteSize/Divider now actually matches the definition of VL/n. Also some formatting fixes. llvm-svn: 212794	2014-07-11 05:23:12 +00:00
David Blaikie	de1e1a60e8	Revert "Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself."" This reverts commit r212776. Nope, still seems to be failing on the sanitizer bots... but hey, not the msan self-host anymore, it's failing in asan now. I'll start looking there next. llvm-svn: 212793	2014-07-11 02:42:57 +00:00
Mark Heffernan	675d401a26	Partially fix PR20058: reduce compile time for loop unrolling with very high count by reducing calls to SE->forgetLoop llvm-svn: 212782	2014-07-10 23:30:06 +00:00
Lang Hames	16086b984e	[RuntimeDyld] Improve error diagnostic in RuntimeDyldChecker. The compiler often emits assembler-local labels (beginning with 'L') for use in relocation expressions, however these aren't included in the object files. Teach RuntimeDyldChecker to warn the user if they try to use one of these in an expression, since it will never work. llvm-svn: 212777	2014-07-10 23:26:20 +00:00
David Blaikie	3ca92d2406	Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself." Committed in r212205 and reverted in r212226 due to msan self-hosting failure, I believe I've got that fixed by r212761 to Clang. Original commit message: "Originally committed in r211723, reverted in r211724 due to failure cases found and fixed (ArgumentPromotion: r211872, Inlining: r212065), committed again in r212085 and reverted again in r212089 after fixing some other cases, such as debug info subprogram lists not keeping track of the function they represent (r212128) and then short-circuiting things like LiveDebugVariables that build LexicalScopes for functions that might not have full debug info. And again, I believe the invariant actually holds for some reasonable amount of code (but I'll keep an eye on the buildbots and see what happens... ). Original commit message: PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location. This situation does bad things when inlined, so I've fixed Clang not to produce inlinable call sites without locations when the caller has debug info (in the one case where I could find that this occurred). This updates the PR20038 test case to be what clang now produces, and readds the assertion that had to be removed due to this bug. I've also beefed up the debug info verifier to help diagnose these issues in the future, and I hope to add checks to the inliner to just assert-fail if it encounters this situation. If, in the future, we decide we have to cope with this situation, the right thing to do is probably to just remove all the DebugLocs from the inlined instructions." llvm-svn: 212776	2014-07-10 22:59:39 +00:00
Jan Vesely	2cb62ce2a0	R600: Implement float to long/ulong Use alg. from LegalizeDAG.cpp Move Expand setting to SIISellowering v2: Extend existing tests instead of creating new ones v3: use separate LowerFPTOSINT function v4: use TargetLowering::expandFP_TO_SINT add comment about using FP_TO_SINT for uints Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 212773	2014-07-10 22:40:21 +00:00
Jan Vesely	eca89d283e	SelectionDAG: Factor FP_TO_SINT lower code out of DAGLegalizer Move the code to a helper function to allow calls from TypeLegalizer. No functionality change intended Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> Reviewed-by: Owen Anderson <resistor@mac.com> llvm-svn: 212772	2014-07-10 22:40:18 +00:00
Brad Smith	733cb6437d	Use the integrated assembler by default on OpenBSD. llvm-svn: 212771	2014-07-10 22:37:28 +00:00
Zoran Jovanovic	f34b454219	[mips] Emit two CFI offset directives per double precision SDC1/LDC1 instead of just one for FR=1 registers Differential Revision: http://reviews.llvm.org/D4310 llvm-svn: 212769	2014-07-10 22:23:30 +00:00
Matt Arsenault	3332b70627	Revert "Revert r212640, "Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine."" Don't try to convert the select condition type. llvm-svn: 212750	2014-07-10 18:21:04 +00:00
Andrea Di Biagio	b2921c7ca0	[DAG] Further improve the logic in DAGCombiner that folds a pair of shuffles into a single shuffle if the resulting mask is legal. This patch teaches the DAGCombiner how to fold shuffles according to the following new rules: 1. shuffle(shuffle(x, y), undef) -> x 2. shuffle(shuffle(x, y), undef) -> y 3. shuffle(shuffle(x, y), undef) -> shuffle(x, undef) 4. shuffle(shuffle(x, y), undef) -> shuffle(y, undef) The backend avoids combining shuffles according to rules 3 and 4 if the resulting shuffle does not have a legal mask. This is to avoid introducing illegal shuffles that are potentially expanded into a sub-optimal sequence of target specific dag nodes during vector legalization. Added test case combine-vec-shuffle-2.ll to verify that we correctly trigger the new rules when combining shuffles. llvm-svn: 212748	2014-07-10 18:04:55 +00:00
Akira Hatanaka	7cc27649a6	[X86] Mark pseudo instruction TEST8ri_NOEREX as hasSideEffects=0. Also, add a case clause in X86InstrInfo::shouldScheduleAdjacent to enable macro-fusion. <rdar://problem/15680770> llvm-svn: 212747	2014-07-10 18:00:53 +00:00
Eric Christopher	54fe1b260c	Add the CSR company and the Kalimba DSP processor to Triple. Patch by Matthew Gardiner with fixes by me. llvm-svn: 212745	2014-07-10 17:26:54 +00:00
Eric Christopher	22405e4bbf	Make it possible for the Subtarget to change between function passes in the mips back end. This, unfortunately, required a bit of churn in the various predicates to use a pointer rather than a reference. llvm-svn: 212744	2014-07-10 17:26:51 +00:00
Duncan P. N. Exon Smith	04934b0fec	InstCombine: Fix a crash in Descale for multiply-by-zero Fix a crash in `InstCombiner::Descale()` when a multiply-by-zero gets created as an argument to a GEP partway through an iteration, causing -instcombine to optimize the GEP before the multiply. rdar://problem/17615671 llvm-svn: 212742	2014-07-10 17:13:27 +00:00
David Majnemer	ed33243e86	IR: Aliases don't belong to an explicit comdat Aliases inherit their comdat from their aliasee, they don't have an explicit comdat. This fixes PR20279. llvm-svn: 212732	2014-07-10 16:26:10 +00:00
Hal Finkel	511fea7acd	Feeding isSafeToSpeculativelyExecute its DataLayout pointer (in Sink) This is the one remaining place I see where passing isSafeToSpeculativelyExecute a DataLayout pointer might matter (at least for loads) -- I think I got the others in r212720. Most of the other remaining callers of isSafeToSpeculativelyExecute only use it for call sites (or otherwise exclude loads). llvm-svn: 212730	2014-07-10 16:07:11 +00:00
David Majnemer	99ef236542	Mips: Silence a -Wcovered-switch-default Remove a default label which covered no enumerators, replace it with a llvm_unreachable. No functionality changed. llvm-svn: 212729	2014-07-10 16:04:04 +00:00
Zoran Jovanovic	255d00dc23	[mips] Added FPXX modeless calling convention. Differential Revision: http://reviews.llvm.org/D4293 llvm-svn: 212726	2014-07-10 15:36:12 +00:00
Arnaud A. de Grandmaison	f643231163	[AArch64] Add logical alias instructions to MC AsmParser This patch teaches the AsmParser to accept some logical+immediate instructions and convert them as shown: bic Rd, Rn, #imm -> and Rd, Rn, #~imm bics Rd, Rn, #imm -> ands Rd, Rn, #~imm orn Rd, Rn, #imm -> orr Rd, Rn, #~imm eon Rd, Rn, #imm -> eor Rd, Rn, #~imm Those instructions are an alternate syntax available to assembly coders, and are needed in order to support code already compiling with some other assemblers. For example, the bic construct is used by the linux kernel. llvm-svn: 212722	2014-07-10 15:12:26 +00:00
Hal Finkel	a995f92627	Feeding isSafeToSpeculativelyExecute its DataLayout pointer isSafeToSpeculativelyExecute can optionally take a DataLayout pointer. In the past, this was mainly used to make better decisions regarding divisions known not to trap, and so was not all that important for users concerned with "cheap" instructions. However, now it also helps look through bitcasts for dereferencable loads, and will also be important if/when we add a dereferencable pointer attribute. This is some initial work to feed a DataLayout pointer through to callers of isSafeToSpeculativelyExecute, generally where one was already available. llvm-svn: 212720	2014-07-10 14:41:31 +00:00
Tim Northover	fee2adefba	AArch64: correctly fast-isel i8 & i16 multiplies We were asking for a register for type i8 or i16 which caused an assert. rdar://problem/17620015 llvm-svn: 212718	2014-07-10 14:18:46 +00:00
Daniel Sanders	7e527423f5	[mips] Add support for -modd-spreg/-mno-odd-spreg Summary: When -mno-odd-spreg is in effect, 32-bit floating point values are not permitted in odd FPU registers. The option also prohibits 32-bit and 64-bit floating point comparison results from being written to odd registers. This option has three purposes: * It allows support for certain MIPS implementations such as loongson-3a that do not allow the use of odd registers for single precision arithmetic. * When using -mfpxx, -mno-odd-spreg is the default and this allows us to statically check that code is compliant with the O32 FPXX ABI since mtc1/mfc1 instructions to/from odd registers are guaranteed not to appear for any reason. Once this has been established, the user can then re-enable -modd-spreg to regain the use of all 32 single-precision registers. * When using -mfp64 and -mno-odd-spreg together, an O32 extension named O32 FP64A is used as the ABI. This is intended to provide almost all functionality of an FR=1 processor but can also be executed on a FR=0 core with the assistance of a hardware compatibility mode which emulates FR=0 behaviour on an FR=1 processor. * Added '.module oddspreg' and '.module nooddspreg' each of which update the .MIPS.abiflags section appropriately * Moved setFpABI() call inside emitDirectiveModuleFP() so that the caller doesn't have to remember to do it. * MipsABIFlags now calculates the flags1 and flags2 member on demand rather than trying to maintain them in the same format they will be emitted in. There is one portion of the -mfp64 and -mno-odd-spreg combination that is not implemented yet. Moves to/from odd-numbered double-precision registers must not use mtc1. I will fix this in a follow-up. Differential Revision: http://reviews.llvm.org/D4383 llvm-svn: 212717	2014-07-10 13:38:23 +00:00
Zinovy Nis	cad431c122	[x32] Add AsmBackend for X32 which uses ELF32 with x86_64 (the author is Pavel Chupin). This is minimal change for backend required to have "hello world" compiled and working on x32 target (x86_64-linux-gnux32). More patches for x32 will follow. Differential Revision: http://reviews.llvm.org/D4181 llvm-svn: 212716	2014-07-10 13:03:26 +00:00
Chandler Carruth	0b666e0648	[x86,SDAG] Introduce any- and sign-extend-vector-inreg nodes analogous to the zero-extend-vector-inreg node introduced previously for the same purpose: manage the type legalization of widened extend operations, especially to support the experimental widening mode for x86. I'm adding both because sign-extend is expanded in terms of any-extend with shifts to propagate the sign bit. This removes the last fundamental scalarization from vec_cast2.ll (a test case that hit many really bad edge cases for widening legalization), although the trunc tests in that file still appear scalarized because the shuffle legalization is scalarizing. Funny thing, I've been working on that. Some initial experiments with this and SSE2 scenarios are showing moderately good behavior already for sign extension. Still some work to do on the shuffle combining on X86 before we're generating optimal sequences, but avoiding scalarization is a huge step forward. llvm-svn: 212714	2014-07-10 12:32:32 +00:00
Richard Sandiford	02bb0ec368	[SystemZ] Use SystemZCallingConv.td to define callee-saved registers Just a clean-up. No behavioral change intended. llvm-svn: 212711	2014-07-10 11:44:37 +00:00
NAKAMURA Takumi	f862ce8908	Revert r212640, "Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine." This caused miscompilation on, at least, x86-64. SExt(i1 cond) confused other optimizations. llvm-svn: 212708	2014-07-10 11:37:28 +00:00
Richard Sandiford	909aa3ad21	[SystemZ] Tweak instruction format classifications There's no real need to have Shift as a separate format type from Binary. The comments for other format types were too specific and in some cases no longer accurate. Just a clean-up, no behavioral change intended. llvm-svn: 212707	2014-07-10 11:29:23 +00:00
Chandler Carruth	df8d0caab7	[x86] Add another combine that is particularly useful for the new vector shuffle lowering: match shuffle patterns equivalent to an unpcklwd or unpckhwd instruction. This allows us to use generic lowering code for v8i16 shuffles and match the unpack pattern late. llvm-svn: 212705	2014-07-10 11:09:29 +00:00
Richard Sandiford	e66e8c8b66	[SystemZ] Add MC support for LEDBRA, LEXBRA and LDXBRA These instructions aren't used for codegen since the original L*DB instructions are suitable for fround. llvm-svn: 212703	2014-07-10 11:00:55 +00:00
Richard Sandiford	ca44614ac0	[SystemZ] Avoid using i8 constants for immediate fields Immediate fields that have no natural MVT type tended to use i8 if the field was small enough. This was a bit confusing since i8 isn't a legal type for the target. Fields for short immediates in a 32-bit or 64-bit operation use i32 or i64 instead, so it would be better to do the same for all fields. No behavioral change intended. llvm-svn: 212702	2014-07-10 10:52:51 +00:00
Richard Sandiford	ac1dba0fdf	[SystemZ] Fix FPR dwarf numbering The dwarf FPR numbers are supposed to have the order F0, F2, F4, F6, F1, F3, F5, F7, F8, etc., which matches the pairing of registers for long doubles. E.g. a long double stored in F0 is paired with F2. llvm-svn: 212701	2014-07-10 10:45:11 +00:00
Daniel Sanders	cbd44c591d	Make it possible for ints/floats to return different values from getBooleanContents() Summary: On MIPS32r6/MIPS64r6, floating point comparisons return 0 or -1 but integer comparisons return 0 or 1. Updated the various uses of getBooleanContents. Two simplifications had to be disabled when float and int boolean contents differ: - ScalarizeVecRes_VSELECT except when the kind of boolean contents is trivially discoverable (i.e. when the condition of the VSELECT is a SETCC node). - visitVSELECT (select C, 0, 1) -> (xor C, 1). Come to think of it, this one could test for the common case of 'C' being a SETCC too. Preserved existing behaviour for all other targets and updated the affected MIPS32r6/MIPS64r6 tests. This also fixes the pi benchmark where the 'low' variable was counting in the wrong direction because it thought it could simply add the result of the comparison. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: hfinkel, jholewinski, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D4389 llvm-svn: 212697	2014-07-10 10:18:12 +00:00
Chandler Carruth	853fa0ac8d	[x86] Expand the target DAG combining for PSHUFD nodes to be able to combine into half-shuffles through unpack instructions that expand the half to a whole vector without messing with the dword lanes. This fixes some redundant instructions in splat-like lowerings for v16i8, which are now getting to be really nice. llvm-svn: 212695	2014-07-10 09:57:36 +00:00
Chandler Carruth	a34a8e230d	[x86] Tweak the v16i8 single input special case lowering for shuffles that splat i8s into i16s. Previously, we would try much too hard to arrange a sequence of i8s in one half of the input such that we could unpack them into i16s and shuffle those into place. This isn't always going to be a cheaper i8 shuffle than our other strategies. The case where it is always going to be cheaper is when we can arrange all the necessary inputs into one half using just i16 shuffles. It happens that viewing the problem this way also makes it much easier to produce an efficient set of shuffles to move the inputs into one half and then unpack them. With this, our splat code gets one step closer to being not terrible with the new experimental lowering strategy. It also exposes two combines missing which I will add next. llvm-svn: 212692	2014-07-10 09:16:40 +00:00
Hal Finkel	66e23f126d	Fix isDereferenceablePointer not to try to take the size of an unsized type. I'll add a test-case shortly. llvm-svn: 212687	2014-07-10 06:06:11 +00:00
Hal Finkel	2e42c34d05	Allow isDereferenceablePointer to look through some bitcasts isDereferenceablePointer should not give up upon encountering any bitcast. If we're casting from a pointer to a larger type to a pointer to a small type, we can continue by examining the bitcast's operand. This missing capability was noted in a comment in the function. In order for this to work, isDereferenceablePointer now takes an optional DataLayout pointer (essentially all callers already had such a pointer available). Most code uses isDereferenceablePointer though isSafeToSpeculativelyExecute (which already took an optional DataLayout pointer), and to enable the LICM test case, LICM needs to actually provide its DL pointer to isSafeToSpeculativelyExecute (which it was not doing previously). llvm-svn: 212686	2014-07-10 05:27:53 +00:00
Saleem Abdulrasool	1e76cbdff7	MC: modernise for loop Convert a for loop to range-based form. NFC. llvm-svn: 212684	2014-07-10 04:50:09 +00:00
Saleem Abdulrasool	427c08d48b	MC: add and use an accessor for WinCFI This adds a utility method to access the WinCFI information in bulk and uses that to iterate rather than requesting the count and individually iterating them. This is in preparation for restructuring WinCFI handling to enable more clear sharing across architectures to enable unwind information emission for Windows on ARM. llvm-svn: 212683	2014-07-10 04:50:06 +00:00
Peter Collingbourne	8876c3face	Remove move assignment operator to appease older GCCs. llvm-svn: 212682	2014-07-10 04:39:40 +00:00
Chandler Carruth	7d2ffb5492	[x86] Initial improvements to the new shuffle lowering for v16i8 shuffles specifically for cases where a small subset of the elements in the input vector are actually used. This is specifically targeted at improving the shuffles generated for trunc operations, but also helps out splat-like operations. There is still some really low-hanging fruit here that I want to address but this is a huge step in the right direction. llvm-svn: 212680	2014-07-10 04:34:06 +00:00
Peter Collingbourne	05b9ebf2f9	Explicitly define move constructor and move assignment operator to appease MSVC. llvm-svn: 212679	2014-07-10 04:29:06 +00:00
Peter Collingbourne	d5feb7ba42	SpecialCaseList: use std::unique_ptr. llvm-svn: 212678	2014-07-10 03:55:02 +00:00
Hao Liu	71224b02fb	[AArch64]Fix an assertion failure in DAG Combiner about concating 2 build_vector. llvm-svn: 212677	2014-07-10 03:41:50 +00:00
Matt Arsenault	b0df92577d	R600/SI: Add support for llvm.convert.{to\|from}.fp16 llvm-svn: 212676	2014-07-10 03:22:20 +00:00
Chandler Carruth	b3840a55ae	[x86] Refactor some of the new code for lowering v16i8 shuffles to remove duplication and make it easier to select different strategies. No functionality changed. llvm-svn: 212674	2014-07-10 02:24:26 +00:00
Peter Collingbourne	2e28edf8e1	[dfsan] Handle bitcast aliases. llvm-svn: 212668	2014-07-10 01:30:39 +00:00
Chandler Carruth	d3561f6fec	[SDAG] Make the new zext-vector-inreg node default to expand so targets don't need to set it manually. This is based on feedback from Tom who pointed out that if every target needs to handle this we need to reach out to those maintainers. In fact, it doesn't make sense to duplicate everything when anything other than expand seems unlikely at this stage. llvm-svn: 212661	2014-07-09 22:53:04 +00:00
David Blaikie	029bd3350e	Recommit r212203: Don't try to construct debug LexicalScopes hierarchy for functions that do not have top level debug information. Reverted by Eric Christopher (Thanks!) in r212203 after Bob Wilson reported LTO issues. Duncan Exon Smith and Aditya Nandakumar helped provide a reduced reproduction, though the failure wasn't too hard to guess, and even easier with the example to confirm. The assertion that the subprogram metadata associated with an llvm::Function matches the scope data referenced by the DbgLocs on the instructions in that function is not valid under LTO. In LTO, a C++ inline function might exist in multiple CUs and the subprogram metadata nodes will refer to the same llvm::Function. In this case, depending on the order of the CUs, the first instance of the subprogram metadata may not be the one referenced by the instructions in that function and the assertion will fail. A test case (test/DebugInfo/cross-cu-linkonce-distinct.ll) is added, the assertion removed and a comment added to explain this situation. Original commit message: If a function isn't actually in a CU's subprogram list in the debug info metadata, ignore all the DebugLocs and don't try to build scopes, track variables, etc. While this is possibly a minor optimization, it's also a correctness fix for an incoming patch that will add assertions to LexicalScopes and the debug info verifier to ensure that all scope chains lead to debug info for the current function. Fix up a few test cases that had broken/incomplete debug info that could violate this constraint. Add a test case where this occurs by design (inlining a debug-info-having function in an attribute nodebug function - we want this to work because /if/ the nodebug function is then inlined into a debug-info-having function, it should be fine (and will work fine - we just stitch the scopes up as usual), but should the inlining not happen we need to not assert fail either). llvm-svn: 212649	2014-07-09 21:02:41 +00:00
Alexey Samsonov	b7dd329f2f	Decouple llvm::SpecialCaseList text representation and its LLVM IR semantics. Turn llvm::SpecialCaseList into a simple class that parses text files in a specified format and knows nothing about LLVM IR. Move this class into LLVMSupport library. Implement two users of this class: * DFSanABIList in DFSan instrumentation pass. * SanitizerBlacklist in Clang CodeGen library. The latter will be modified to use actual source-level information from frontend (source file names) instead of unstable LLVM IR things (LLVM Module identifier). Remove dependency edge from ClangCodeGen/ClangDriver to LLVMTransformUtils. No functionality change. llvm-svn: 212643	2014-07-09 19:40:08 +00:00
Matt Arsenault	658c5576d1	Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine. Do this if the truncate is free and the select is legal. llvm-svn: 212640	2014-07-09 19:12:07 +00:00
Jim Grosbach	34cc92b475	AArch64: Better codegen for storing to __fp16. Storing will generally be immediately preceded by rounding from an f32 or f64, so make sure to match those patterns directly to convert into the FPR16 register class directly rather than going through the integer GPRs. This also eliminates an extra step in the convert-from-f64 path which was first converting to f32 and then to f16 from there. rdar://17594379 llvm-svn: 212638	2014-07-09 18:55:52 +00:00
Benjamin Kramer	c560a6cadc	TargetRegisterInfo: Remove function that fell out of use years ago. llvm-svn: 212636	2014-07-09 18:53:57 +00:00
Adam Nemet	2820a5b9e9	[X86] AVX512: Enable it in the Loop Vectorizer This lets us experiment with 512-bit vectorization without passing force-vector-width manually. The code generated for a simple integer memset loop is properly vectorized. Disassembly is still broken for it though :(. llvm-svn: 212634	2014-07-09 18:22:33 +00:00
Louis Gerbarg	1ce0c37bf0	Make AArch64FastISel::EmitIntExt explicitly check its source and destination types This is a follow up to r212492. There should be no functional difference, but this patch makes it clear that SrcVT must be an i1/i8/i16/i32 and DestVT must be an i8/i16/i32/i64. rdar://17516686 llvm-svn: 212633	2014-07-09 17:54:32 +00:00
Sanjay Patel	58814445d4	Fix for PR20059 (instcombine reorders shufflevector after instruction that may trap) In PR20059 ( http://llvm.org/pr20059 ), instcombine eliminates shuffles that are necessary before performing an operation that can trap (srem). This patch calls isSafeToSpeculativelyExecute() and bails out of the optimization in SimplifyVectorOp() if needed. Differential Revision: http://reviews.llvm.org/D4424 llvm-svn: 212629	2014-07-09 16:34:54 +00:00
Daniel Sanders	c5626f4444	Add Imagination Technologies to the vendors in llvm::Triple Summary: This is a pre-requisite for supporting the mips-img-linux-gnu triple in clang. Differential Revision: http://reviews.llvm.org/D4435 llvm-svn: 212626	2014-07-09 16:03:10 +00:00
Tim Northover	ac002d3e34	Generic: add range-adapter for option parsing. I want to use it in lld, but while I'm here I'll update LLVM uses. llvm-svn: 212615	2014-07-09 13:03:37 +00:00
Chandler Carruth	5865a73a82	[x86] Fix a bug in my new zext-vector-inreg DAG trickery where we were not widening the input type to the node sufficiently to let the ext take place in a register. This would in turn result in a mysterious bitcast assertion failure downstream. First change here is to add back the helpful assert I had in an earlier version of the code to catch this immediately. Next change is to add support to the type legalization to detect when we have widened the operand either too little or too much (for whatever reason) and find a size-matched legal vector type to convert it to first. This can also fail so we get a new fallback path, but that seems OK. With this, we no longer crash on vec_cast2.ll when using widening. I've also added the CHECK lines for the zero-extend cases here. We still need to support sign-extend and trunc (or something) to get plausible code for the other two thirds of this test which is one of the regression tests that showed the most scalarization when widening was force-enabled. Slowly closing in on widening being a viable legalization strategy without it resorting to scalarization at every turn. =] llvm-svn: 212614	2014-07-09 12:36:54 +00:00
Chandler Carruth	14cad41e14	Sink two variables only used in an assert into the assert itself. Should fix the release builds with Werror. llvm-svn: 212612	2014-07-09 11:13:16 +00:00
Benjamin Kramer	d6f1733add	X86: When lowering v8i32 himuls use the correct shuffle masks for AVX2. Turns out my trick of using the same masks for SSE4.1 and AVX2 didn't work out as we have to blend two vectors. While there, remove unnecessary cross-lane moves from the shuffles so the backend can lower it to palignr instead of vperm. Fixes PR20118, a miscompilation of vector sdiv by constant on AVX2. llvm-svn: 212611	2014-07-09 11:12:39 +00:00
Chandler Carruth	afe4b2507e	[x86] Add a ZERO_EXTEND_VECTOR_INREG DAG node and use it when widening vector types to be legal and a ZERO_EXTEND node is encountered. When we use widening to legalize vector types, extend nodes are a real challenge. Either the input or output is likely to be legal, but in many cases not both. As a consequence, we don't really have any way to represent this situation and the prior code in the widening legalization framework would just scalarize the extend operation completely. This patch introduces a new DAG node to represent doing a zero extend of a vector "in register". The core of the idea is to allow legal but different vector types in the input and output. The output vector must have fewer lanes but wider elements. The operation is defined to zero extend the low elements of the input to the size of the output elements, and drop all of the high elements which don't have a corresponding lane in the output vector. It also includes generic expansion of this node in terms of blending a zero vector into the high elements of the vector and bitcasting across. This in turn yields extremely nice code for x86 SSE2 when we use the new widening legalization logic in conjunction with the new shuffle lowering logic. There is still more to do here. We need to support sign extension, any extension, and potentially int-to-float conversions. My current plan is to continue using similar synthetic nodes to model each of these transitions with generic lowering code for each one. However, with this patch LLVM already reaches performance parity with GCC for the core C loops of the x264 code (assuming you disable the hand-written assembly versions) when compiling for SSE2 and SSE3 architectures and enabling the new widening and lowering logic for vectors. Differential Revision: http://reviews.llvm.org/D4405 llvm-svn: 212610	2014-07-09 10:58:18 +00:00
Daniel Sanders	e31155fd1a	[mips][mips64r6] Correct select patterns that have the condition or true/false values backwards Summary: This bug caused SingleSource/Regression/C/uint64_to_float and SingleSource/UnitTests/2002-05-02-CastTest3 to fail (among others). Differential Revision: http://reviews.llvm.org/D4388 llvm-svn: 212608	2014-07-09 10:47:26 +00:00
Daniel Sanders	dc06718e0b	[mips][mips64r6] Correct cond names in the cmp.cond.[ds] instructions Summary: It seems we accidentally read the wrong column of the table in the MIPS64r6 spec and used the names for c.cond.fmt instead of cmp.cond.fmt. Differential Revision: http://reviews.llvm.org/D4387 llvm-svn: 212607	2014-07-09 10:40:20 +00:00
Chandler Carruth	ef5dcf571e	[x86] Initialize a pointer to null to fix a bug in r212602. This should restore GCC hosts (which happen to put the bad stuff into the pointer) and MSan, etc. llvm-svn: 212606	2014-07-09 10:36:42 +00:00
Daniel Sanders	f5a5fbd3f4	[mips][mips64r6] Use JALR for indirect branches instead of JR (which is not available on MIPS32r6/MIPS64r6) Summary: This completes the change to use JALR instead of JR on MIPS32r6/MIPS64r6. Reviewers: jkolek, vmedic, zoran.jovanovic, dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4269 llvm-svn: 212605	2014-07-09 10:21:59 +00:00
Daniel Sanders	338513b3fa	[mips][mips64r6] Use JALR for returns instead of JR (which is not available on MIPS32r6/MIPS64r6) Summary: RET, and RET_MM have been replaced by a pseudo named PseudoReturn. In addition a version with a 64-bit GPR named PseudoReturn64 has been added. Instruction selection for a return matches RetRA, which is expanded post register allocation to PseudoReturn/PseudoReturn64. During MipsAsmPrinter, PseudoReturn/PseudoReturn64 are emitted as: - (JALR64 $zero, $rs) on MIPS64r6 - (JALR $zero, $rs) on MIPS32r6 - (JR_MM $rs) on microMIPS - (JR $rs) otherwise On MIPS32r6/MIPS64r6, 'jr $rs' is an alias for 'jalr $zero, $rs'. To aid development and review (specifically, to ensure all cases of jr are updated), these aliases are temporarily named 'r6.jr' instead of 'jr'. A follow up patch will change them back to the correct mnemonic. Added (JALR $zero, $rs) to MipsNaClELFStreamer's definition of an indirect jump, and removed it from its definition of a call. Note: I haven't accounted for MIPS64 in MipsNaClELFStreamer since it doesn't appear to account for any MIPS64-specifics. The return instruction created as part of eh_return expansion is now expanded using expandRetRA() so we use the right return instruction on MIPS32r6/MIPS64r6 ('jalr $zero, $rs'). Also, fixed a misuse of isABI_N64() to detect 64-bit wide registers in expandEhReturn(). Reviewers: jkolek, vmedic, mseaborn, zoran.jovanovic, dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4268 llvm-svn: 212604	2014-07-09 10:16:07 +00:00
Chandler Carruth	2ebc942683	[x86] Re-apply a variant of the x86 side of r212324 now that the rest has settled without incident, removing the x86-specific and overly strict 'isVectorSplat' routine in favor of generic and more powerful splat detection. The primary motivation and result of this is that the x86 backend can now see through splats which contain undef elements. This is essential if we are using a widening form of legalization and I've updated a test case to also run in that mode as before this change the generated code for the test case was completely scalarized. This version of the patch much more carefully handles the undef lanes. - We aren't overly conservative about them in the shift lowering (where we will never use the splat itself). - One place where the splat would have been re-used by the existing code now explicitly constructs a new constant splat that will be safe. - The broadcast lowering is much more reasonable with undefs by doing a correct check of whether the splat is the only user of a loaded value, checking that the splat actually crosses multiple lanes before using a broadcast, and handling broadcasts of non-constant splats. As a consequence of the last bullet, the weird usage of vpshufd instead of vbroadcast is gone, and we actually can lower an AVX splat with vbroadcastss where before we emitted a really strange pattern of a vector load and a manual splat across the vector. llvm-svn: 212602	2014-07-09 10:06:58 +00:00
Timur Iskhodzhanov	e40fb373ef	[ASan/Win] Don't instrument COMDAT globals. Properly fixes PR20244. llvm-svn: 212596	2014-07-09 08:35:33 +00:00
Dmitri Gribenko	a5b27a7128	SourceMgr: consistently use 'unsigned' for the memory buffer ID type llvm-svn: 212595	2014-07-09 08:30:15 +00:00
Chandler Carruth	f0a33b71e9	[SDAG] At the suggestion of Hal, switch to an output parameter that tracks which elements of the build vector are in fact undef. This should make actually inspecting them (likely in my next patch) reasonably pretty. Also makes the output parameter optional as it is clear now that most users are happy with undefs in their splats. llvm-svn: 212581	2014-07-09 00:41:34 +00:00
NAKAMURA Takumi	843c4cb401	MipsTargetStreamer.h: Avoid "using" to appease msc17. llvm-svn: 212577	2014-07-08 23:48:22 +00:00
Jim Grosbach	04691a530d	AArch64: Better codegen for loading from __fp16. Loading will generally extend to an f32 or an f64, so make sure to match those patterns directly to load into the FPR16 register class directly rather than going through the integer GPRs. This also eliminates an extra step in the convert-to-f64 path which was first converting to f32 and then to f64 from there. rdar://17594379 llvm-svn: 212573	2014-07-08 23:28:48 +00:00
Hal Finkel	8ae0f8d618	Improve BasicAA CS-CS queries BasicAA contains knowledge of certain intrinsics, such as memcpy and memset, and uses that information to form more-accurate answers to CallSite vs. Loc ModRef queries. Unfortunately, it did not use this information when answering CallSite vs. CallSite queries. Generically, when an intrinsic takes one or more pointers and the intrinsic is marked only to read/write from its arguments, the offset/size is unknown. As a result, the generic code that answers CallSite vs. CallSite (and CallSite vs. Loc) queries in AA uses UnknownSize when forming Locs from an intrinsic's arguments. While BasicAA's CallSite vs. Loc override could use more-accurate size information for some intrinsics, it did not do the same for CallSite vs. CallSite queries. This change refactors the intrinsic-specific logic in BasicAA into a generic AA query function: getArgLocation, which is overridden by BasicAA to supply the intrinsic-specific knowledge, and used by AA's generic implementation. This allows the intrinsic-specific knowledge to be used by both CallSite vs. Loc and CallSite vs. CallSite queries, and simplifies the BasicAA implementation. Currently, only one function, Mac's memset_pattern16, is handled by BasicAA (all the rest are intrinsics). As a side-effect of this refactoring, BasicAA's getModRefBehavior override now also returns OnlyAccessesArgumentPointees for this function (which is an improvement). llvm-svn: 212572	2014-07-08 23:16:49 +00:00
Kevin Enderby	8c50dbb8cb	Add support for BSD format Archive map symbols (aka the table of contents from a __.SYMDEF or "__.SYMDEF SORTED" archive member). llvm-svn: 212568	2014-07-08 22:10:02 +00:00
Pete Cooper	91e4ba2f88	Revert "GlobalDCE: Delete available_externally initializers if it allows removing the value the initializer is referring to." This reverts commit 5b55a47e94e28fbb56d0cd5d72c3db9105c15b4c. A test case was found to crash after this was applied. I'll file a bug to track fixing this with the test case needed. llvm-svn: 212550	2014-07-08 17:06:03 +00:00
Ulrich Weigand	862d8b8d06	[PowerPC] Implement atomic NAND operations as actual NAND This changes the implementation of atomic NAND operations from "a & ~b" (compatible with GCC < 4.4) to actual "~(a & b)" (compatible with GCC >= 4.4). This is in line with the common-code and ARM back-end change implemented in r212433. llvm-svn: 212547	2014-07-08 16:16:02 +00:00
Andrea Di Biagio	d261e98f3d	[DAG] Teach how to combine a pair of shuffles into a single shuffle if the resulting mask is legal. This patch teaches how to fold a shuffle according to rule: shuffle (shuffle (x, undef, M0), undef, M1) -> shuffle(x, undef, M2) We do this only if the resulting mask M2 is legal; this is to avoid introducing illegal shuffles that are potentially expanded into a sub-optimal sequence of target specific dag nodes. This patch has the advantage of being target independent, since it works on ISD nodes. Therefore, all targets (not only x86) can take advantage of this rule. The idea behind this patch is that most shuffle pairs can be safely combined before we run the legalizer on vector operations. This allows us to combine/simplify dag nodes earlier in the process and not only immediately before instruction selection stage. That said, this patch is not meant to replace any existing target specific combine rules; backends might still introduce new shuffles during legalization stage. Also, this rule is very simple and avoids aggressively optimizing shuffles. llvm-svn: 212539	2014-07-08 15:22:29 +00:00
Benjamin Kramer	cccdadca45	Fix some Twine locals. Two of those are use after frees. Found by clang-tidy, fixed by me. llvm-svn: 212537	2014-07-08 14:55:06 +00:00
Timur Iskhodzhanov	a4212c244a	[ASan/Win] Don't instrument private COMDAT globals until PR20244 is properly fixed llvm-svn: 212530	2014-07-08 13:18:58 +00:00
Daniel Sanders	324ad956e0	[mips] Fixed struct/class mismatch introduced in r212522. Clang emits a warning about this. llvm-svn: 212528	2014-07-08 13:13:42 +00:00
Daniel Sanders	7201a3e3bb	Fix r212522 - [mips] Improve encapsulation of the .MIPS.abiflags implementation and limit scope of related enums Added two lines that should have been in r212522. llvm-svn: 212523	2014-07-08 10:35:52 +00:00
Daniel Sanders	c7dbc630e5	[mips] Improve encapsulation of the .MIPS.abiflags implementation and limit scope of related enums Summary: Follow on to r212519 to improve the encapsulation and limit the scope of the enums. Also merged two very similar parser functions, fixed a bug where ASE's were not being reported, and marked CPR1's as being 128-bit when MSA is enabled. Differential Revision: http://reviews.llvm.org/D4384 llvm-svn: 212522	2014-07-08 10:11:38 +00:00
Renato Golin	b8a86c43c0	Revert "Refactor ARM subarchitecture parsing" This reverts commit 7b4a6882467e7fef4516a0cbc418cbfce0fc6f6d. llvm-svn: 212521	2014-07-08 10:06:16 +00:00
Arnaud A. de Grandmaison	d7827606de	Truncate the immediate in logical operation to the register width And continue to produce an error if the 32 most significant bits are not all ones or zeros. llvm-svn: 212520	2014-07-08 09:53:04 +00:00
Vladimir Medic	fb8a2a95cd	Mips.abiflags is a new implicitly generated section that will be present on all new modules. The section contains a versioned data structure which represents essentially information to allow a program loader to determine the requirements of the application. This patch implements mips.abiflags section and provides test cases for it. llvm-svn: 212519	2014-07-08 08:59:22 +00:00
Chandler Carruth	142e966261	[x86,SDAG] Sink the logic for folding shuffles of splats more aggressively from the x86 shuffle lowering to the generic SDAG vector shuffle formation code. This code already tried to fold away shuffles of splats! It just had lots of bugs and couldn't handle the case my new x86 shuffle lowering needed. First, it failed to correctly compute whether N2 was undef because it pre-computed this, then did transformations which could make N2 undef, then failed to ever re-consider the precomputed state. Second, it didn't look through bitcasts at all, even in the safe cases where they are just element-type bitcasts with no change to the number of elements. Third, it didn't handle all-zero bit casts nicely the way my code in the x86 side of things did, which is essential to getting good zext-shuffle lowerings. But all of these are generic. I just ported the code down to this layer and fixed the surrounding bugs. Tests exercising this in the x86 backend still pass and some silly code in widen_cast-6.ll gets better. I updated that test to be a bit more precise but it's still pretty unclear what the value of the test is in this day and age. llvm-svn: 212517	2014-07-08 08:45:38 +00:00
Chandler Carruth	efbce58775	[SDAG] Actually check for a non-constant splat and clarify comments around the handling of UNDEF lanes in boolean vector content analysis. The code before my changes here also failed to check for non-constant splats in a buildvector. I have no idea how to trigger this, I just spotted by inspection when trying to understand the code. It seems extremely unlikely to be worth the trouble to teach the only caller of this code (DAG combining setcc patterns) how to cleverly handle undef lanes, so I've just commented more thoroughly that we're giving up there. llvm-svn: 212515	2014-07-08 07:44:15 +00:00
Chandler Carruth	b844e72e85	[SDAG] Build up a more rich set of APIs for querying build-vector SDAG nodes about whether they are splats. This is factored out and improved from r212324 which got reverted as it was far too aggressive. The new API should help more conservatively handle buildvectors that are a mixture of splatted and undef values. No functionality change at this point. The hope is to slowly re-introduce the undef-tolerant optimization of splats, but each time being forced to make a conscious decision about how to handle the undefs in a way that doesn't lead to contradicting assumptions about the collapsed value. Hal has pointed out in discussions that this may not end up being the desired API and instead it may be more convenient to get a mask of the undef elements or something similar. I'm starting simple and will expand the API as I adapt actual callers and see exactly what they need. llvm-svn: 212514	2014-07-08 07:19:55 +00:00
Alexey Samsonov	c94285a1a0	[ASan] Completely remove sanitizer blacklist file from instrumentation pass. All blacklisting logic is now moved to the frontend (Clang). If a function (or source file it is in) is blacklisted, it doesn't get sanitize_address attribute and is therefore not instrumented. If a global variable (or source file it is in) is blacklisted, it is reported to be blacklisted by the entry in llvm.asan.globals metadata, and is not modified by the instrumentation. The latter may lead to certain false positives - not all the globals created by Clang are described in llvm.asan.globals metadata (e.g., RTTI descriptors are not), so we may start reporting errors on them even if "module" they appear in is blacklisted. We assume it's fine to take such risk: 1) errors on these globals are rare and usually indicate wild memory access 2) we can lazily add descriptors for these globals into llvm.asan.globals. llvm-svn: 212505	2014-07-08 00:50:49 +00:00
Adam Nemet	79580db918	[X86] AVX512: Only allow k1-k7 as predicates to vpcmp* As destination k0 is allowed but not as predicate/writemask. I also modified the test to allow checking of error messages by the assembler. I applied a similar approach to the test ret.s in the same directory. llvm-svn: 212504	2014-07-08 00:22:32 +00:00
Alexey Samsonov	07435c4775	Kill unnecessary include llvm-svn: 212503	2014-07-08 00:03:11 +00:00
Andrea Di Biagio	2620b877b6	[x86] Fix assertion failure caused by a wrong combine of PSHUFD nodes with different types. When combining a sequence of two PSHUFD dag nodes into a single PSHUFD, make sure that we assign the correct type to the resulting PSHUFD. X86ISD::PSHUFD dag nodes can be either MVT::v4i32 or MVT::v4f32. Before this change, an assertion failure was triggered in method 'DAGCombinerInfo::CombineTo' when trying to combine the shuffles from the test below into a single PSHUFD. define <4 x float> @test1(<4 x float> %V) { %1 = shufflevector <4 x float> %V, <4 x float> undef, <4 x i32> <i32 3, i32 0, i32 2, i32 1> %2 = shufflevector <4 x float> %1, <4 x float> undef, <4 x i32> <i32 3, i32 0, i32 2, i32 1> ret <4 x float> %2 } llvm-svn: 212498	2014-07-07 23:25:23 +00:00
Sanjay Patel	70af1fdf9d	fixed some typos llvm-svn: 212495	2014-07-07 22:13:58 +00:00
Juergen Ributzka	665ea71fcd	[FastISel][X86] Fix smul.with.overflow.i8 lowering. Add custom lowering code for signed multiply instruction selection, because the default FastISel instruction selection for ISD::MUL will use unsigned multiply for the i8 type and signed multiply for all other types. This would set the incorrect flags for the overflow check. This fixes <rdar://problem/17549300> llvm-svn: 212493	2014-07-07 21:52:21 +00:00
Louis Gerbarg	4c5b4054b2	Allow AArch64FastISel to degrade gracefully in the presence of an MVT::i128 Currently AArch64FastISel crashes if it tries to extend an integer into an MVT::i128. This can happen by creating 128 bit integers like so: typedef unsigned int uint128_t __attribute__((mode(TI))); typedef int sint128_t __attribute__((mode(TI))); This patch makes EmitIntExt check for their presence and then falls back to SelectionDAG. Tests included. rdar://17516686 llvm-svn: 212492	2014-07-07 21:37:51 +00:00
Sanjay Patel	a932da8f35	Fix for PR17073 ( http://llvm.org/pr17073 ), simplifycfg illegally hoists an operation in a phi node that can trap. This patch adds to an existing loop over phi nodes in SimplifyCondBranchToCondBranch() to check for trapping ops and bails out of the optimization if we find one of those. The test cases verify that trapping ops are not hoisted and non-trapping ops are still optimized as expected. llvm-svn: 212490	2014-07-07 21:19:00 +00:00
Renato Golin	1e9c282cd1	Refactor ARM subarchitecture parsing According to a FIXME in ARMMCTargetDesc.cpp the ARM version parsing should be in the Triple helper class. Patch by: Gabor Ballabas llvm-svn: 212479	2014-07-07 20:01:11 +00:00
Ulrich Weigand	de8641bfde	[PowerPC] Fix no-assert build r212476 caused a compile failure (unused variable) in a non-assertion build ... llvm-svn: 212477	2014-07-07 19:39:44 +00:00
Ulrich Weigand	ec2bf93895	[PowerPC] Fix "byval align" arguments Arguments passed as "byval align" should get the specified alignment in the parameter save area. There was some code in PPCISelLowering.cpp that attempted to implement this, but this didn't work correctly: while code did update the ArgOffset value, it neglected to update the PtrOff value (which was already computed from the old ArgOffset), and it also neglected to update GPR_idx -- fields skipped due to alignment in the save area must likewise be skipped in GPRs. This patch fixes and simplifies this logic by: - handling argument offset alignment right at the beginning of argument processing, using a new helper routine CalculateStackSlotAlignment (this avoids having to update PtrOff and other derived values later on) - not tracking GPR_idx separately, but always computing the correct GPR_idx for each argument from its ArgOffset - removing some redundant computation in LowerFormalArguments: MinReservedArea must equal ArgOffset after argument processing, so there's no use in computing it twice. [This doesn't change the behavior of the current clang front-end, since that never creates "byval align" arguments at the moment. This will change with a follow-on patch, however.] llvm-svn: 212476	2014-07-07 19:26:41 +00:00
Chandler Carruth	beeacac0b3	[x86] Revert r212324 which was too aggressive w.r.t. allowing undef lanes in vector splats. The core problem here is that undef lanes can't unilaterally be considered to contribute to splats. Their handling needs to be more cautious. There is also a reported failure of the nightly testers (thanks Tobias!) that may well stem from the same core issue. I'm going to fix this theoretical issue, factor the APIs a bit better, and then verify that I don't see anything bad with Tobias's reduction from the test suite before recommitting. Original commit message for r212324: [x86] Generalize BuildVectorSDNode::getConstantSplatValue to work for any constant, constant FP, or undef splat and to tolerate any undef lanes in a splat, then replace all uses of isSplatVector in X86's lowering with it. This fixes issues where undef lanes in an otherwise splat vector would prevent the splat logic from firing. It is a touch more awkward to use this interface, but it is much more accurate. Suggestions for better interface structuring welcome. With this fix, the code generated with the widening legalization strategy for widen_cast-4.ll is dramatically improved as the special lowering strategies for a v16i8 SRA kick in even though the high lanes are undef. We also get a slightly different choice for broadcasting an aligned memory location, and use vpshufd instead of vbroadcastss. This looks like a minor win for pipelining and domain crossing, but a minor loss for the number of micro-ops. I suspect it's a wash, but folks can easily tweak the lowering if they want. llvm-svn: 212475	2014-07-07 19:03:32 +00:00
Matt Arsenault	d2c9e08b63	R600: Fix mishandling of load / store chains. Fixes various bugs with reordering loads and stores. Scalarized vector loads weren't collecting the chains at all. llvm-svn: 212473	2014-07-07 18:34:45 +00:00
Matt Arsenault	fda9dad17f	Fix typo, weird indentation llvm-svn: 212472	2014-07-07 18:34:42 +00:00
Benjamin Kramer	6cbe670db8	Make helper functions static. llvm-svn: 212460	2014-07-07 14:47:51 +00:00
Tim Northover	3705283b24	X86: revert unintentional change to X86FastISel. This crept in with r212443. llvm-svn: 212459	2014-07-07 14:06:42 +00:00
Evgeniy Stepanov	6fa6c677cc	[asan] Generate asm instrumentation in MC. Generate entire ASan asm instrumentation in MC without relying on runtime helper functions. Patch by Yuri Gorshenin. llvm-svn: 212455	2014-07-07 13:57:37 +00:00
Evgeniy Stepanov	d948a5f3c3	[msan] Fix handling of phi in blacklisted functions. llvm-svn: 212454	2014-07-07 13:28:31 +00:00
Benjamin Kramer	d0993e0077	InstCombine: Simplify code, no functionality change. llvm-svn: 212449	2014-07-07 11:01:16 +00:00
Chandler Carruth	0dcb366268	[x86] Teach the new vector shuffle lowering code to handle what is essentially a DAG combine that never gets a chance to run. We might typically expect DAG combining to remove shuffles-of-splats and other similar patterns, but we don't get a chance to run the DAG combiner when we recursively form sub-shuffles during the lowering of a shuffle. So instead hand-roll a really important combine directly into the lowering code to detect shuffles-of-splats, especially shuffles of an all-zero splat which needn't even have the same element width, etc. This lets the new vector shuffle lowering handle shuffles which implement things like zero-extension really nicely. This will become even more important when I wire the legalization of zero-extension to vector shuffles with the new widening legalization strategy. llvm-svn: 212444	2014-07-07 09:06:58 +00:00
Tim Northover	55beb64bd0	CodeGen: it turns out that NAND is not the same thing as BIC. At all. We've been performing the wrong operation on ARM for "atomicrmw nand" for years, since "a NAND b" is "~(a & b)" rather than ARM's very tempting "a & ~b". This bled over into the generic expansion pass. So I assume no-one has ever actually tried to do an atomic nand in the real world. Oh well. llvm-svn: 212443	2014-07-07 09:06:35 +00:00
Saleem Abdulrasool	763f9a50a5	ARM: properly lower dllimport'ed global values This completes the handling for DLL import storage symbols when lowering instructions. A DLL import storage symbol must have an additional load performed prior to use. This is applicable to variables and functions. This is particularly important for non-function symbols as it is possible to handle function references by emitting a thunk which performs the translation from the unprefixed __imp_ symbol to the proper symbol (although, this is a non-optimal lowering). For a variable symbol, no such thunk can be accommodated. llvm-svn: 212431	2014-07-07 05:18:35 +00:00
Saleem Abdulrasool	220a044888	ARM: correctly mangle dllimport symbols Add support for tracking DLLImport storage class information on a per symbol basis in the ARM instruction selection. Use that information to correctly mangle the symbol (dllimport symbols are referenced via *__imp_<name>). llvm-svn: 212430	2014-07-07 05:18:30 +00:00
Saleem Abdulrasool	1eb4a28b44	ARM: unify symbol name retrieval Ensure that all paths that retrieve the symbol name go through GetARMGVSymbol rather than getSymbol. This is desirable so that any global symbol mangling can be centralised to this function. The motivation for this is handling of symbols that are marked as having dll import dll storage. Such a symbol requires an extra load that is currently handled in the backend and a __imp_ prefix on the symbol name. llvm-svn: 212429	2014-07-07 05:18:22 +00:00
Kevin Qin	4473c1943f	[AArch64] Normalize all constants to build a vector. The value of constant operands will be truncated to fit element width. llvm-svn: 212428	2014-07-07 02:45:40 +00:00
Sanjay Patel	784a5a41e7	fixed typos in comments llvm-svn: 212424	2014-07-06 23:24:53 +00:00
Sanjay Patel	0a2ada7b98	fixed some typos in comments llvm-svn: 212423	2014-07-06 23:10:24 +00:00
Saleem Abdulrasool	97255a017b	AArch64: whitespace cleanup llvm-svn: 212420	2014-07-06 22:13:26 +00:00
Rafael Espindola	adf21f2a56	Update the MemoryBuffer API to use ErrorOr. llvm-svn: 212405	2014-07-06 17:43:13 +00:00
Rafael Espindola	a3c65096cf	This only needs a StringRef. llvm-svn: 212402	2014-07-06 14:24:03 +00:00
Rafael Espindola	8026bd0b2a	This only needs a StringRef. llvm-svn: 212401	2014-07-06 14:17:29 +00:00
Alp Toker	a55b95b58a	SourceMgr: make valid buffer IDs start from one Use 0 for the invalid buffer instead of -1/~0 and switch to unsigned representation to enable more idiomatic usage. Also introduce a trivial SourceMgr::getMainFileID() instead of hard-coding 0/1 to identify the main file. llvm-svn: 212398	2014-07-06 10:33:31 +00:00
Matt Arsenault	4261973548	Use cast<> instead of dyn_cast + assert llvm-svn: 212380	2014-07-05 21:16:43 +00:00
Matt Arsenault	258c6e7cd9	Fix grammar llvm-svn: 212379	2014-07-05 21:16:40 +00:00
Rafael Espindola	d5a8efe733	This only needs a StringRef. No functionality change. llvm-svn: 212371	2014-07-05 11:38:52 +00:00
David Majnemer	82cb0309e2	MC: make MCSymbolData::dump work on const objects This just lets us dump a const MCSymbolData object, no functionality changed. llvm-svn: 212365	2014-07-05 00:39:52 +00:00
Rafael Espindola	8286fbf4c4	Make a helper function static. No functionality change. llvm-svn: 212364	2014-07-05 00:39:08 +00:00
David Majnemer	e0950ee85c	MC: Correct comment in ExportSymbol No functionality changed, just make it so that the code _could_ be uncommented. llvm-svn: 212363	2014-07-04 23:20:46 +00:00
David Majnemer	bee5f754f2	MC: Cleanup COFFAsmParser::ParseSectionFlags Switch a normal for-loop to a range-based for. No functionality changed. llvm-svn: 212362	2014-07-04 23:15:28 +00:00
Rafael Espindola	ba79dba8ed	Make RecordStreamer.h private. llvm-svn: 212361	2014-07-04 22:44:18 +00:00
David Majnemer	d1bea693e2	IR: Fold away compares between GV GEPs and GVs A GEP of a non-weak global variable will not be equivalent to another non-weak global variable or a GEP of such a variable. Differential Revision: http://reviews.llvm.org/D4238 llvm-svn: 212360	2014-07-04 22:05:26 +00:00
Rafael Espindola	e6107799fa	Fix a bug in the conversion to ErrorOr. The regular end of the bitcode parsing is in the BitstreamEntry::EndBlock case. Should fix the LTO bootstrap on OS X (this function is only used by ld64). llvm-svn: 212357	2014-07-04 20:05:56 +00:00
Rafael Espindola	c75c4fad46	Revert "Convert a few std::strings to StringRef." This reverts commit r212342. We can get a StringRef into the current Record, but not one in the bitcode itself since the string is compressed in it. llvm-svn: 212356	2014-07-04 20:02:42 +00:00
Rafael Espindola	089a317c64	Ignore llvm specific symbols in the LTOModule. These are the llvm.* globals and functions. I don't think it is possible to test this directly since llvm-lto is not a full linker and will not report duplicated symbols, but this fixes bootstrap with gold and lto enabled. llvm-svn: 212354	2014-07-04 19:31:27 +00:00
Ehsan Akhgari	4103da6bfb	Add support for parsing the not operator in Microsoft inline assembly This fixes http://llvm.org/PR20202 llvm-svn: 212352	2014-07-04 19:13:05 +00:00
Rafael Espindola	2dc0d9bddb	Ignore llvm.* globals. It is not clear if llvm.global_ctors should or should not be in llvm.metadata, but in practice it is not and we need to ignore it for LTO. llvm-svn: 212351	2014-07-04 19:08:22 +00:00
Saleem Abdulrasool	4e63fc498c	TableGen: introduce support for MSBuiltin Add MSBuiltin which is similar in vein to GCCBuiltin. This allows for adding intrinsics for Microsoft compatibility to individual instructions. This is needed to permit the creation of ARM specific MSVC extensions. This is not currently in use, and requires an associated change in clang to enable use of the intrinsics defined by this new class. This merely sets the LLVM portion of the infrastructure in place to permit the use of this functionality. A separate set of changes will enable the new intrinsics. llvm-svn: 212350	2014-07-04 18:42:25 +00:00
Rafael Espindola	dddd1fd9f4	Implement LTOModule on top of IRObjectFile. IRObjectFile provides all the logic for producing mangled names and getting symbols from inline assembly. LTOModule then adds logic for linking-specific tasks, like constructing llvm.compiler_user or extracting linker options from the bitcode. The rule of thumb is that IRObjectFile has the functionality that is needed by both LTO and llvm-ar. llvm-svn: 212349	2014-07-04 18:40:36 +00:00
Rafael Espindola	0972d41c73	Avoid mangling names twice. No functionality change. llvm-svn: 212348	2014-07-04 16:37:02 +00:00
Rafael Espindola	3885090b86	Mark intrinsic functions as llvm-specific. llvm-svn: 212347	2014-07-04 15:58:00 +00:00
Daniel Sanders	950f48d3c7	[mips][mips64r6] Set ELF e_flags for MIPS32r6/MIPS64r6. Also do MIPS-I to MIPS-V Differential Revision: http://reviews.llvm.org/D4386 llvm-svn: 212346	2014-07-04 15:21:53 +00:00
Rafael Espindola	b674c17deb	Don't include llvm.metadata variables in archive symbol tables. llvm-svn: 212344	2014-07-04 15:03:17 +00:00
Rafael Espindola	f98536a046	Convert a few std::strings to StringRef. llvm-svn: 212342	2014-07-04 14:12:46 +00:00
Rafael Espindola	d346cc8efc	Convert these functions to use ErrorOr. llvm-svn: 212341	2014-07-04 13:52:01 +00:00
Rafael Espindola	ce8a0d6cd8	Remove unused old-style error handling. If needed, an ErrorOr should be used. llvm-svn: 212340	2014-07-04 13:30:13 +00:00
Benjamin Kramer	3c5b126239	GlobalDCE: Delete available_externally initializers if it allows removing the value the initializer is referring to. This is useful for functions that are not actually available externally but referenced by a vtable of some kind. Clang emits functions like this for the MS ABI. PR20182. llvm-svn: 212337	2014-07-04 12:36:05 +00:00
Tim Northover	1bc367a41b	ARM: when falling back to scattered relocs, keep the type. The linker relies on relocation type info (e.g. is it a branch?) to perform the correct actions, so we should keep that even when we end up using a scattered relocation for whatever reason. rdar://problem/17553104 llvm-svn: 212333	2014-07-04 10:58:05 +00:00
Tim Northover	07f99fb769	llvm-readobj: fix MachO relocation printing a bit. There were two issues here: 1. At the very least, scattered relocations cannot use the same code to determine the corresponding symbol being referred to. For some reason we pretend there is no symbol, even when one actually exists in the symtab, so to match this behaviour getRelocationSymbol should simply return symbols_end for scattered relocations. 2. Printing "-" when we can't get a symbol (including the scattered case, but not exclusively), isn't that helpful. In both cases there is interesting information in that field, so we should print it. As hex will do. Small part of rdar://problem/17553104 llvm-svn: 212332	2014-07-04 10:57:56 +00:00
Benjamin Kramer	a420df2999	InstCombine: Strength reduce sadd.with.overflow into a regular nsw add if we can prove that it cannot overflow. PR20194 llvm-svn: 212331	2014-07-04 10:22:21 +00:00
Daniel Sanders	2e03d66453	[mips][mips64r6] Correct the encoding of dmuh, dmuhu, dmul, and dmulu. We have detected a documentation bug in the encoding tables of the released MIPS64r6 specification that has resulted in the wrong encodings being used for these instructions in LLVM. This commit corrects them. llvm-svn: 212330	2014-07-04 10:08:27 +00:00
Chandler Carruth	5d79bb5d32	[x86] Generalize BuildVectorSDNode::getConstantSplatValue to work for any constant, constant FP, or undef splat and to tolerate any undef lanes in a splat, then replace all uses of isSplatVector in X86's lowering with it. This fixes issues where undef lanes in an otherwise splat vector would prevent the splat logic from firing. It is a touch more awkward to use this interface, but it is much more accurate. Suggestions for better interface structuring welcome. With this fix, the code generated with the widening legalization strategy for widen_cast-4.ll is dramatically improved as the special lowering strategies for a v16i8 SRA kick in even though the high lanes are undef. We also get a slightly different choice for broadcasting an aligned memory location, and use vpshufd instead of vbroadcastss. This looks like a minor win for pipelining and domain crossing, but a minor loss for the number of micro-ops. I suspect it's a wash, but folks can easily tweak the lowering if they want. llvm-svn: 212324	2014-07-04 08:11:49 +00:00
Alexey Volkov	302309f39f	[X86] Limit maximum nop length on Silvermont Silvermont can only decode one instruction per cycle if the instruction exceeds 8 bytes. Also, on Silvermont, instructions with more than 3 prefixes will cause a 3-cycle penalty. Maximum nop length is limited to 7 bytes when used for padding on Silvermont. For other x86 processors, max nop length remains unchanged at 15 bytes. Differential Revision: http://reviews.llvm.org/D4374 llvm-svn: 212321	2014-07-04 07:14:56 +00:00
Robert Lytton	37d3fa7e36	XCore target: remove incorrect DebugLoc entries from prologue Summary: This was causing the prologue_end to be incorrectly positioned. Differential Revision: http://reviews.llvm.org/D4122 llvm-svn: 212318	2014-07-04 06:38:22 +00:00
Alp Toker	be53eebe5a	Fix prefix comparison from r212308 llvm-svn: 212310	2014-07-04 02:01:54 +00:00
Eric Christopher	c1058df66f	Move function dependent resetting of a subtarget variable out of the subtarget. This involved having the movt predicate take the current function - since we care about size in instruction selection for whether or not to use movw/movt take the function so we can check the attributes. This required adding the current MachineFunction to FastISel and propagating through. llvm-svn: 212309	2014-07-04 01:55:26 +00:00
Alp Toker	ac90380b5e	Sink undesirable LTO functions into the old C API We want to encourage users of the C++ LTO API to reuse memory buffers instead of repeatedly opening and reading the same file contents. This reverts commit r212305 and implements a tidier scheme. llvm-svn: 212308	2014-07-04 00:58:41 +00:00
David Majnemer	651ed5e8fd	InstSimplify: Fix a bug when INT_MIN is in a sdiv When INT_MIN is the numerator in a sdiv, we would not properly handle overflow when calculating the bounds of possible values; abs(INT_MIN) is not a meaningful number. Instead, check and handle INT_MIN by reasoning that the largest value is INT_MIN/-2 and the smallest value is INT_MIN. This fixes PR20199. llvm-svn: 212307	2014-07-04 00:23:39 +00:00
Peter Collingbourne	d7f75eeffc	Modify LTOModule::isTargetMatch to take a StringRef instead of a MemoryBuffer. llvm-svn: 212305	2014-07-03 23:49:28 +00:00
Peter Collingbourne	63086fe166	LTO: rename the various makeLTOModule overloads. This rename makes it easier to identify the specific overload being called in each particular case and makes future refactorings easier. Differential Revision: http://reviews.llvm.org/D4370 llvm-svn: 212302	2014-07-03 23:28:00 +00:00
Rafael Espindola	30f37f5fc4	Move createIRObjectFile to the IRObjectFile class and return the concrete type. llvm-svn: 212301	2014-07-03 23:03:50 +00:00
Chandler Carruth	19cff8205e	[x86] Clarify that this lowering only applies to vectors and is only used when we have SSE2. llvm-svn: 212300	2014-07-03 22:57:44 +00:00
Rafael Espindola	2f0647cfdc	Use std::unique_ptr to manage memory. No functionality change. llvm-svn: 212299	2014-07-03 22:43:03 +00:00
Eric Christopher	09f7131984	Temporarily revert "Don't try to construct debug LexicalScopes hierarchy for functions that do not have top level debug information." as it appears to be breaking some LTO constructs. This reverts commit r212203. llvm-svn: 212298	2014-07-03 22:24:54 +00:00
Eric Christopher	2f991c9ee1	Remove caching of the target machine and initialization of the subtarget from ARMISelDAGtoDAG. The former is unnecessary and the latter is initialized on each runOnMachineFunction. llvm-svn: 212297	2014-07-03 22:24:49 +00:00
Andrea Di Biagio	c8e8bda58f	[CostModel][x86] Improved cost model for alternate shuffles. This patch: 1) Improves the cost model for x86 alternate shuffles (originally added at revision 211339); 2) Teaches the Cost Model Analysis pass how to analyze alternate shuffles. Alternate shuffles are a special kind of blend; on x86, we can often easily lower alternate shuffles into a single blend instruction (depending on the subtarget features). The existing cost model didn't take into account subtarget features. Also, it had a couple of "dead" entries for vector types that are never legal (example: on x86 types v2i32 and v2f32 are not legal; those are always either promoted or widened to 128-bit vector types). The new x86 cost model takes into account what target features we have before returning the shuffle cost (i.e. the number of instructions after the blend is lowered/expanded). This patch also teaches the Cost Model Analysis how to identify and analyze alternate shuffles (i.e. 'SK_Alternate' shufflevector instructions): - added function 'isAlternateVectorMask'; - added some logic to check if an instruction is an alternate shuffle and, if so, call the target specific TTI to get the corresponding shuffle cost; - added a test to verify the cost model analysis on alternate shuffles. llvm-svn: 212296	2014-07-03 22:24:18 +00:00
Andrea Di Biagio	a37a2fc81f	[X86] Add ISel patterns to select 'f32_to_f16' and 'f16_to_f32' dag nodes. This patch adds tablegen patterns to select F16C float-to-half-float conversion instructions from 'f32_to_f16' and 'f16_to_f32' dag nodes. If the target doesn't have F16C, then 'f32_to_f16' and 'f16_to_f32' are expanded into library calls. llvm-svn: 212293	2014-07-03 21:51:06 +00:00
Rafael Espindola	c63c714ed1	LTO depends on Object now. Fixes the build with only the ARM backend enabled. For some reason some other backend was pulling Object and this went unnoticed. llvm-svn: 212288	2014-07-03 20:19:03 +00:00
Gerolf Hoflehner	65b13324e1	Run interprocedural const prop before global optimizer Exposes more constant globals that can be removed by the global optimizer. A specific example is the removal of the static global block address array in clang/test/CodeGen/indirect-goto.c. This change impacts only lower optimization levels. With LTO interprocedural const prop runs already before global opt. llvm-svn: 212284	2014-07-03 19:28:15 +00:00
Rafael Espindola	13b69d63e6	Add support for inline asm symbols to IRObjectFile. This also enables it in llvm-nm so that it can be tested. llvm-svn: 212282	2014-07-03 18:59:23 +00:00
David Majnemer	3374910f19	IR: cleanup Module::dropReferences This replaces some old-style loops with range-based for. llvm-svn: 212278	2014-07-03 16:12:55 +00:00
Yi Kong	93e52da641	[ARM] Implement ISB memory barrier intrinsic Adds support for __builtin_arm_isb. Also corrects DSB and ISB instructions modelling by adding has-side-effects property. llvm-svn: 212276	2014-07-03 16:00:41 +00:00
Sanjay Patel	dc574ab500	bug fix for PR20020: anti-dependency-breaker causes miscompilation This patch sets the 'KeepReg' bit for any tied and live registers during the PrescanInstruction() phase of the dependency breaking algorithm. It then checks those 'KeepReg' bits during the ScanInstruction() phase to avoid changing any tied registers. For more details, please see comments in: http://llvm.org/bugs/show_bug.cgi?id=20020 I added two FIXME comments for code that I think can be removed by using register iterators that include self. I don't want to include those code changes with this patch, however, to keep things as small as possible. The test case is larger than I'd like, but I don't know how to reduce it further and still produce the failing asm. Differential Revision: http://reviews.llvm.org/D4351 llvm-svn: 212275	2014-07-03 15:19:40 +00:00
Ulrich Weigand	f236bb1b5b	Fix ppcf128 component access on little-endian systems The PowerPC 128-bit long double data type (ppcf128 in LLVM) is in fact a pair of two doubles, where one is considered the "high" or more-significant part, and the other is considered the "low" or less-significant part. When a ppcf128 value is stored in memory or a register pair, the high part always comes first, i.e. at the lower memory address or in the lower-numbered register, and the low part always comes second. This is true both on big-endian and little-endian PowerPC systems. (Similar to how with a complex number, the real part always comes first and the imaginary part second, no matter the byte order of the system.) This was implemented incorrectly for little-endian systems in LLVM. This commit fixes three related issues: - When printing an immediate ppcf128 constant to assembler output in emitGlobalConstantFP, emit the high part first on both big- and little-endian systems. - When lowering a ppcf128 type to a pair of f64 types in SelectionDAG (which is used e.g. when generating code to load an argument into a register pair), use correct low/high part ordering on little-endian systems. - In a related issue, because lowering ppcf128 into a pair of f64 must operate differently from lowering an int128 into a pair of i64, bitcasts between ppcf128 and int128 must not be optimized away by the DAG combiner on little-endian systems, but must effect a word-swap. Reviewed by Hal Finkel. llvm-svn: 212274	2014-07-03 15:06:47 +00:00
Evgeniy Stepanov	174242c74c	[msan] Stop propagating shadow in blacklisted functions. With this change all values passed through blacklisted functions become fully initialized. Previous behavior was to initialize all loads in blacklisted functions, but apply normal shadow propagation logic for all other operations. This makes blacklist applicable in a wider range of situations. It also makes code for blacklisted functions a lot shorter, which works as yet another workaround for PR17409. llvm-svn: 212268	2014-07-03 11:56:30 +00:00
Evgeniy Stepanov	e1a5a1f7a8	Revert of r212265. llvm-svn: 212266	2014-07-03 11:35:08 +00:00
Evgeniy Stepanov	cfc40ef98a	[msan] Stop propagating shadow in blacklisted functions. With this change all values passed through blacklisted functions become fully initialized. Previous behavior was to initialize all loads in blacklisted functions, but apply normal shadow propagation logic for all other operations. This makes blacklist applicable in a wider range of situations. It also makes code for blacklisted functions a lot shorter, which works as yet another workaround for PR17409. llvm-svn: 212265	2014-07-03 11:18:48 +00:00
Marcello Maggioni	89c05ad165	Minor stylistic fix in SimplifyCFG (test commit) llvm-svn: 212259	2014-07-03 08:29:06 +00:00
Chandler Carruth	99b1104c46	[x86] Fix the completely broken vector widening legalization of bswap. This operation was classified as a binary operation in the widening logic for some reason (clearly, untested). It is in fact a unary operation. Add a RUN line to a test to exercise this for x86. Note that again the vector widening strategy doesn't regress anything and in one case removes a totally unnecessary instruction that we couldn't avoid when promoting the element type. llvm-svn: 212257	2014-07-03 07:04:38 +00:00
Chandler Carruth	739b6ada99	[x86] Fix crashes in lowering bitcast instructions with the widening mode. This also runs the test in that mode which would reproduce the crash. What I love is that every single FIXME in the test is addressed by switching to widening. llvm-svn: 212254	2014-07-03 03:43:47 +00:00
Richard Trieu	f2a795241a	Add new lines to debugging information. Differential Revision: http://reviews.llvm.org/D4262 llvm-svn: 212250	2014-07-03 02:11:49 +00:00
Chandler Carruth	49a8b10d82	[x86] Based on a long conversation between myself, Jim Grosbach, Hal Finkel, Eric Christopher, and a bunch of other people I'm probably forgetting (sorry), add an option to the x86 backend to widen vectors during type legalization rather than promote them. This still would promote vNi1 vectors to get the masks right, but would widen other vectors. A lot of experiments are piling up right now showing that widening should probably be the default legalization strategy outside of vNi1 cases, but it is very hard to test the ramifications of that and fix bugs in widening-based legalization without an option that enables it. I'll be checking in tests shortly that use this option to exercise cases where widening doesn't work well and hopefully we'll be able to switch fully to this soon. llvm-svn: 212249	2014-07-03 02:11:29 +00:00
Rafael Espindola	97de474a36	Invert the MC -> Object dependency. Now that we have a lib/MC/MCAnalysis, the dependency was there just because of two helper classes. Move the two over to MC. This will allow IRObjectFile to parse inline assembly. llvm-svn: 212248	2014-07-03 02:01:39 +00:00
Eric Christopher	f204208e4f	Make these preprocessor directives match all of the others in the port. llvm-svn: 212245	2014-07-03 00:44:31 +00:00
Eric Christopher	ad4de684ea	Remove dead code. llvm-svn: 212244	2014-07-03 00:44:28 +00:00
Chandler Carruth	9d010fffe1	[codegen,aarch64] Add a target hook to the code generator to control vector type legalization strategies in a more fine grained manner, and change the legalization of several v1iN types and v1f32 to be widening rather than scalarization on AArch64. This fixes an assertion failure caused by scalarizing nodes like "v1i32 trunc v1i64". As v1i64 is legal it will fail to scalarize v1i32. This also provides a foundation for other targets to have more granular control over how vector types are legalized. Patch by Hao Liu, reviewed by Tim Northover. I'm committing it to allow some work to start taking place on top of this patch as it adds some really important hooks to the backend that I'd like to immediately start using. =] http://reviews.llvm.org/D4322 llvm-svn: 212242	2014-07-03 00:23:43 +00:00
Eric Christopher	daa9dbbbd5	Move subtarget dependent features into the subtarget from the target machine. Includes a fix for a subtarget initialization for hard floating point on mips16. llvm-svn: 212240	2014-07-03 00:10:24 +00:00
Eric Christopher	4cdb3f9b6a	So that we can include frame lowering in the subtarget, remove include circular dependency with the subtarget by inlining accessor methods and outlining a routine. llvm-svn: 212236	2014-07-02 23:29:55 +00:00
Eric Christopher	bf33a3cf70	So that we can include target lowering in the subtarget, remove include circular dependency with the subtarget by inlining accessor methods and outlining a routine. llvm-svn: 212234	2014-07-02 23:18:40 +00:00
Eric Christopher	0eaa541ea5	Fix typos. llvm-svn: 212228	2014-07-02 22:05:40 +00:00
David Blaikie	9a0f7948a2	Revert "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself." This reverts commit r212205. Reverting this again, still seeing crashes when building compiler-rt... Sorry for the continued noise, not sure why I'm failing to reproduce this locally. llvm-svn: 212226	2014-07-02 21:42:28 +00:00
Eric Christopher	5f9fd210b3	Move the data layout and selection dag info from the mips target machine down to the subtarget. llvm-svn: 212224	2014-07-02 21:29:23 +00:00
Adam Nemet	11dd5cf9f1	[X86] AVX512: Allow writemask argument in vpermt* intrinsics llvm-svn: 212223	2014-07-02 21:26:01 +00:00
Adam Nemet	efe9c98a16	[X86] AVX512: Generate Pat<>'s for the vpermt2* intrinsics via multiclass This new multiclass, avx512_perm_table_3src derives from the current one and provides the Pat<>. The next patch will add another Pat<> that uses the writemask. Note that I dropped the type annotation from the intrinsic call, i.e.: (v16f32 VR512:$src1) -> R512:$src1. I think that this should be fine (at least many intrinsic calls don't provide them) and it greatly reduces the number of template arguments. llvm-svn: 212222	2014-07-02 21:25:58 +00:00
Adam Nemet	2415a497b5	[X86] AVX512: Add writemask variants for vperm2 This includes assembler and codegen support (see the new tests in avx512-encodings.s and avx512-shuffle.ll). <rdar://problem/17492620> llvm-svn: 212221	2014-07-02 21:25:54 +00:00
Tom Stellard	e9219e0026	R600: Add a comment that llvm.AMDGPU.trunc is a legacy intrinsic llvm-svn: 212218	2014-07-02 20:53:57 +00:00
Tom Stellard	7c1838d797	R600/SI: Use a ComplexPattern for ADDR64 addressing of MUBUF loads llvm-svn: 212217	2014-07-02 20:53:56 +00:00
Tom Stellard	10ae6a0e6a	R600: Promote i64 loads to v2i32 llvm-svn: 212216	2014-07-02 20:53:54 +00:00
Tom Stellard	b2de94e0c6	R600/SI: Adjust SGPR live ranges before register allocation SGPRs are written by instructions that sometimes will ignore control flow, which means if you have code like: if (VGPR0) { SGPR0 = S_MOV_B32 0 } else { SGPR0 = S_MOV_B32 1 } The value of SGPR0 will be 1 no matter what the condition is. In order to deal with this situation correctly, we need to view the program as if it were a single basic block when we calculate the live ranges for the SGPRs. The way we actually update the live range is by iterating over all of the segments in each LiveRange object and setting the end of each segment equal to the start of the next segment. So a live range like: [3888r,9312r:0)[10032B,10384B:0) 0@3888r will become: [3888r,10032B:0)[10032B,10384B:0) 0@3888r This change will allow us to use SALU instructions within branches. llvm-svn: 212215	2014-07-02 20:53:48 +00:00
Tom Stellard	a305f93d81	R600/SI: Add verifier check for immediates in register operands. llvm-svn: 212214	2014-07-02 20:53:44 +00:00
Alexey Samsonov	0c5ecdd053	Remove non-static field initializer to appease MSVC llvm-svn: 212212	2014-07-02 20:25:42 +00:00
Rafael Espindola	e1865a8e8c	Fix configure+make build. llvm-svn: 212210	2014-07-02 20:05:48 +00:00
Rafael Espindola	cbc5ac7a7e	Move CFG building code to a new lib/MC/MCAnalysis library. The new library is 150KB on a Release+Asserts build, so it is quite a bit of code that regular users of MC don't need to link with now. llvm-svn: 212209	2014-07-02 19:49:34 +00:00
David Blaikie	9408f5282e	DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself. Originally committed in r211723, reverted in r211724 due to failure cases found and fixed (ArgumentPromotion: r211872, Inlining: r212065), committed again in r212085 and reverted again in r212089 after fixing some other cases, such as debug info subprogram lists not keeping track of the function they represent (r212128) and then short-circuiting things like LiveDebugVariables that build LexicalScopes for functions that might not have full debug info. And again, I believe the invariant actually holds for some reasonable amount of code (but I'll keep an eye on the buildbots and see what happens... ). Original commit message: PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location. This situation does bad things when inlined, so I've fixed Clang not to produce inlinable call sites without locations when the caller has debug info (in the one case where I could find that this occurred). This updates the PR20038 test case to be what clang now produces, and readds the assertion that had to be removed due to this bug. I've also beefed up the debug info verifier to help diagnose these issues in the future, and I hope to add checks to the inliner to just assert-fail if it encounters this situation. If, in the future, we decide we have to cope with this situation, the right thing to do is probably to just remove all the DebugLocs from the inlined instructions. llvm-svn: 212205	2014-07-02 18:32:05 +00:00
Quentin Colombet	5caa6a2da1	[RegAllocGreedy] Provide a subtarget hook to disable the local reassignment heuristic. By default, no functionality change. This is a follow-up of r212099. This hook provides a finer grain to control the optimization. <rdar://problem/17444599> llvm-svn: 212204	2014-07-02 18:32:04 +00:00
David Blaikie	d47fb5b339	Don't try to construct debug LexicalScopes hierarchy for functions that do not have top level debug information. If a function isn't actually in a CU's subprogram list in the debug info metadata, ignore all the DebugLocs and don't try to build scopes, track variables, etc. While this is possibly a minor optimization, it's also a correctness fix for an incoming patch that will add assertions to LexicalScopes and the debug info verifier to ensure that all scope chains lead to debug info for the current function. Fix up a few test cases that had broken/incomplete debug info that could violate this constraint. Add a test case where this occurs by design (inlining a debug-info-having function in an attribute nodebug function - we want this to work because /if/ the nodebug function is then inlined into a debug-info-having function, it should be fine (and will work fine - we just stitch the scopes up as usual), but should the inlining not happen we need to not assert fail either). llvm-svn: 212203	2014-07-02 18:31:35 +00:00
David Blaikie	a8c3509ffe	Constify the Function pointers in the result of makeSubprogramMap These don't need to be mutable and callers being added soon in CodeGen won't have access to non-const Module&. llvm-svn: 212202	2014-07-02 18:30:05 +00:00
Duncan P. N. Exon Smith	de58870394	AArch64: Re-enable AArch64AddressTypePromotion This reverts commits r212189 and r212190. While this pass was accidentally disabled (until r212073), r205437 slipped in a use of `auto` that should have been `auto&`. This fixes PR20188. llvm-svn: 212201	2014-07-02 18:17:40 +00:00
Duncan P. N. Exon Smith	0945abc142	AArch64: Remove unnecessary parens llvm-svn: 212199	2014-07-02 18:14:03 +00:00
Matt Arsenault	c324b95c77	R600: Fix crashes when an illegal type load or store is not handled. I don't think anything hits this now, but will be exposed in future patches. llvm-svn: 212197	2014-07-02 17:44:53 +00:00
Duncan P. N. Exon Smith	c4db656221	AArch64: Merge isa with dyn_cast llvm-svn: 212194	2014-07-02 17:26:39 +00:00
Duncan P. N. Exon Smith	6d1fc66e9b	AArch64: Temporarily disable AArch64AddressTypePromotion Temporarily disable AArch64AddressTypePromotion, which was effectively re-enabled in r212073 and r212075, while I look into PR20188. llvm-svn: 212189	2014-07-02 17:03:16 +00:00
Alexey Samsonov	4f319cca42	[ASan] Print exact source location of global variables in error reports. See https://code.google.com/p/address-sanitizer/issues/detail?id=299 for the original feature request. Introduce llvm.asan.globals metadata, which Clang (or any other frontend) may use to report extra information about global variables to ASan instrumentation pass in the backend. This metadata replaces llvm.asan.dynamically_initialized_globals that was used to detect init-order bugs. llvm.asan.globals contains the following data for each global: 1) source location (file/line/column info); 2) whether it is dynamically initialized; 3) whether it is blacklisted (shouldn't be instrumented). Source location data is then emitted in the binary and can be picked up by ASan runtime in case it needs to print error report involving some global. For example: 0x... is located 4 bytes to the right of global variable 'C::array' defined in '/path/to/file:17:8' (0x...) of size 40 These source locations are printed even if the binary doesn't have any debug info. This is an ABI-breaking change. ASan initialization is renamed to __asan_init_v4(). Pre-built libraries compiled with older Clang will not work with the fresh runtime. llvm-svn: 212188	2014-07-02 16:54:41 +00:00
Chad Rosier	aba845e835	Revert "Revert "MachineScheduler: better book-keeping for asserts."" This reverts commit r212109, which reverted r212088. However, disable the assert as it's not necessary for correctness. There are several corner cases that the assert needed to handle better for in-order scheduling, but none of them are incorrect scheduler behavior. The assert is mainly there to collect good unit tests like this and ensure that the target-independent scheduler is working as expected with the various machine models. llvm-svn: 212187	2014-07-02 16:46:08 +00:00
Benjamin Kramer	e739cf3eb5	X86: When combining shuffles just remove shuffles that are completely redundant. CombineTo doesn't allow replacing a node with itself so this would crash if the combined shuffle is the same as the input shuffle. llvm-svn: 212181	2014-07-02 15:09:44 +00:00
Elena Demikhovsky	678bd5ba4a	AVX-512: dec/inc instructions are slow on KNL After Alexey Volkov, I'm adding the same property for KNL, that prefers ADD/SUB instead of INC/DEC. Added a test. llvm-svn: 212178	2014-07-02 14:11:05 +00:00
Matt Arsenault	e9a5a50322	Fix missing const llvm-svn: 212168	2014-07-02 06:45:26 +00:00
David Majnemer	f28e2a4282	InstCombine: Optimize x/INT_MIN to x==INT_MIN The result of x/INT_MIN is either 0 or 1, we can just use an icmp instead. llvm-svn: 212167	2014-07-02 06:42:13 +00:00
Chandler Carruth	c1bedac3bd	[cleanup] Hoist an if-else chain on ISD opcodes (really designed for switches) into a switch, and sink them into a dispatch function that can return the result rather than awkward variable setting with breaks. llvm-svn: 212166	2014-07-02 06:23:34 +00:00
David Majnemer	bdeef602e9	InstCombine: Don't turn -(x/INT_MIN) -> x/INT_MIN It is not safe to negate the smallest signed integer, doing so yields the same number back. This fixes PR20186. llvm-svn: 212164	2014-07-02 06:07:09 +00:00
Saleem Abdulrasool	2e09c514c0	aarch64: support target-specific .req assembler directive Based on the support for .req on ARM. The aarch64 variant has to keep track if the alias register was a vector register (v0-31) or a general purpose or VFP/Advanced SIMD ([bhsdq]0-31) register. Patch by Janne Grunau! llvm-svn: 212161	2014-07-02 04:50:23 +00:00
Chandler Carruth	722289f311	[cleanup] Remove dead 'break;' statements that I meant to nuke in r212158 but missed. Thanks to Craig for spotting the goof! llvm-svn: 212159	2014-07-02 04:39:34 +00:00
Chandler Carruth	2746c2861f	[cleanup] Hoist the promotion dispatch logic into the promote function so that we can use return to express it more cleanly and avoid so many nested switch statements. llvm-svn: 212158	2014-07-02 03:07:15 +00:00
Chandler Carruth	1cfa895c4a	[cleanup] Nuke the 'VectorOp' bit of the promote method names. This doesn't add any information for methods in the VectorLegalizer class that clearly take SDAG operations to legalize. llvm-svn: 212157	2014-07-02 03:07:11 +00:00
Chandler Carruth	68adf1568a	[x86] Clean up and modernize the doxygen and API comments for the vector operation legalization code. llvm-svn: 212155	2014-07-02 02:16:57 +00:00
Eric Christopher	5b336a242c	Break out subtarget initialization that dependent variables need into a separate function and clean up calling convention for helper function. llvm-svn: 212153	2014-07-02 01:14:43 +00:00
Eric Christopher	a4d901f599	Unify these two lines. llvm-svn: 212152	2014-07-02 01:02:28 +00:00
Eric Christopher	1f51ddda98	Move MipsJITInfo to the subtarget rather than the target machine. llvm-svn: 212151	2014-07-02 00:54:12 +00:00
Eric Christopher	404c94c0fc	Remove unnecessary include. llvm-svn: 212150	2014-07-02 00:54:10 +00:00
Eric Christopher	4407ddefd0	Remove the cached InstrItineraryData on the TargetMachine, it's unnecessary. llvm-svn: 212149	2014-07-02 00:54:07 +00:00
Eric Christopher	1b1cef2323	Move the subtarget dependent features from XCoreTargetMachine down to the subtarget. llvm-svn: 212147	2014-07-02 00:10:09 +00:00
Eric Christopher	99a5ba8b20	Make XCoreSelectionDAGInfo take a DataLayout since it only needs that information. llvm-svn: 212146	2014-07-02 00:10:05 +00:00
Juergen Ributzka	190305b648	[FastISel] Factor out stackmap intrinsic selection code into a dedicated helper method. NFCI. llvm-svn: 212140	2014-07-01 22:25:49 +00:00
Tim Northover	334d8eebe5	X86: remove atomic instructions after we've iterated through them. Otherwise they get freed and the implicit "isa<XYZ>" tests following turn out badly (at least under sanitizers). Also corrects the ordering of unordered atomic stores. llvm-svn: 212136	2014-07-01 22:10:30 +00:00
Juergen Ributzka	3bd03c7099	[DAG] Pass the argument list to the CallLoweringInfo via move semantics. NFCI. The argument list vector is never used after it has been passed to the CallLoweringInfo and moving it to the CallLoweringInfo is cleaner and pretty much as cheap as keeping a pointer to it. llvm-svn: 212135	2014-07-01 22:01:54 +00:00
Tim Northover	df58625e3c	X86: delegate expanding atomic libcalls to generic code. On targets without cmpxchg16b or cmpxchg8b, the borderline atomic operations were slipping through the gaps. X86AtomicExpand.cpp was delegating to ISelLowering. Generic ISelLowering was delegating to X86ISelLowering and X86ISelLowering was asserting. The correct behaviour is to expand to a libcall, preferably in generic ISelLowering. This can be achieved by X86ISelLowering deciding it doesn't want the faff after all. llvm-svn: 212134	2014-07-01 21:44:59 +00:00
Reid Kleckner	813dab2fc6	Optimize InstCombine stack memory consumption This patch reduces the stack memory consumption of the InstCombine function "isOnlyCopiedFromConstantGlobal()", that in certain conditions could overflow the stack because of excessive recursiveness. For example, in a case like this: %0 = alloca [50025 x i32], align 4 %1 = getelementptr inbounds [50025 x i32]* %0, i64 0, i64 0 store i32 0, i32* %1 %2 = getelementptr inbounds i32* %1, i64 1 store i32 1, i32* %2 %3 = getelementptr inbounds i32* %2, i64 1 store i32 2, i32* %3 %4 = getelementptr inbounds i32* %3, i64 1 store i32 3, i32* %4 %5 = getelementptr inbounds i32* %4, i64 1 store i32 4, i32* %5 %6 = getelementptr inbounds i32* %5, i64 1 store i32 5, i32* %6 ... This piece of code crashes llvm when trying to apply instcombine on desktop. On embedded devices this could happen with a much lower limit of recursiveness. Some instructions (getelementptr and bitcasts) make the function recursively call itself on their uses, which is what makes the example above consume so much stack (it becomes a recursive depth-first tree visit with a very big depth). The patch changes the algorithm to be semantically equivalent, but iterative instead of recursive and the visiting order to be from a depth-first visit to a breadth-first visit (visit all the instructions of the current level before the ones of the next one). Now if a lot of memory is required a heap allocation is done instead of the stack allocation, avoiding the possible crash. Reviewed By: rnk Differential Revision: http://reviews.llvm.org/D4355 Patch by Marcello Maggioni! We don't generally commit large stress tests that look for out of memory conditions, so I didn't request that one be added to the patch. llvm-svn: 212133	2014-07-01 21:36:20 +00:00
Alp Toker	d8d510af92	Move remaining LLVM_ENABLE_DUMP conditionals out of the headers This macro is sometimes defined manually but isn't (and doesn't need to be) in llvm-config.h so shouldn't appear in the headers, likewise NDEBUG. Instead switch them over to LLVM_DUMP_METHOD on the definitions. llvm-svn: 212130	2014-07-01 21:19:13 +00:00
David Blaikie	e844cd5305	DebugInfo: Keep track of subprograms whose arguments have been promoted. Matching behavior with DeadArgumentElimination (and leveraging some now-common infrastructure), keep track of the function from debug info metadata if arguments are promoted. This may produce interesting debug info - since the arguments may be missing or of different types... but at least backtraces, inlining, etc, will be correct. llvm-svn: 212128	2014-07-01 21:13:37 +00:00
Eric Christopher	5234995e80	Move the subtarget dependent features from SystemZTargetMachine down to the subtarget. Add an initialization routine to assist. llvm-svn: 212124	2014-07-01 20:19:02 +00:00
Eric Christopher	f1bd22dfa4	Remove the use and initialization of the target machine and subtarget from SystemZFrameLowering. llvm-svn: 212123	2014-07-01 20:18:59 +00:00
David Blaikie	6876b3bcff	DebugInfo: Provide a utility for building a mapping from llvm::Function*s to llvm::DISubprograms Update DeadArgumentElimination to use this, with the intent of reusing the functionality for ArgumentPromotion as well. llvm-svn: 212122	2014-07-01 20:05:26 +00:00
Tim Northover	21feb2e1d2	AArch64: fix comment typo llvm-svn: 212120	2014-07-01 19:47:09 +00:00
Tim Northover	277066ab43	X86: expand atomics in IR instead of as MachineInstrs. The logic for expanding atomics that aren't natively supported in terms of cmpxchg loops is much simpler to express at the IR level. It also allows the normal optimisations and CodeGen improvements to help out with atomics, instead of using a limited set of possible instructions. rdar://problem/13496295 llvm-svn: 212119	2014-07-01 18:53:31 +00:00
Adam Nemet	16de2486cb	[X86] AVX512: Allow writemasks with vpcmp For now I only updated the _alt variants. The main variants are used by codegen and that will need a bit more work to trigger. <rdar://problem/17492620> llvm-svn: 212114	2014-07-01 18:03:45 +00:00
Adam Nemet	1efcb90fcd	[X86] AVX512: Factor generating the AsmString into avx512_icmp_cc Adding a writemask variant would require a third asm string to be passed to the template. Generate the AsmString in the template instead. No change in X86.td.expanded. llvm-svn: 212113	2014-07-01 18:03:43 +00:00
Chad Rosier	f575a73751	Revert "MachineScheduler: better book-keeping for asserts." This reverts commit r212088, which is causing a number of spec failures. Will provide reduced test cases shortly. PR20057 llvm-svn: 212109	2014-07-01 17:23:11 +00:00
Quentin Colombet	6d590d538f	[PeepholeOptimizer] Fix a typo in a comment. Spotted by Amara Emerson. llvm-svn: 212106	2014-07-01 16:23:44 +00:00
David Majnemer	5c92115972	GlobalOpt: Don't swap private for internal linkage There were transforms whose intent was to downgrade the linkage of external objects to have internal linkage. However, it fired on things with private linkage as well. llvm-svn: 212104	2014-07-01 15:26:50 +00:00
Rafael Espindola	83120cdf68	Avoid revocations when possible. This is a small targeted fix for pr20119. The code needs quite a bit of refactoring and I added some FIXMEs about it, but I want to get the testcase passing first. llvm-svn: 212101	2014-07-01 14:34:30 +00:00
Quentin Colombet	1111e6fe84	[PeepholeOptimizer] Advanced rewriting of copies to avoid cross register banks copies. This patch extends the peephole optimization introduced in r190713 to produce register-coalescer friendly copies when possible. This extension taught the existing cross-bank copy optimization how to deal with the instructions that generate cross-bank copies, i.e., insert_subreg, extract_subreg, reg_sequence, and subreg_to_reg. E.g. b = insert_subreg e, A, sub0 <-- cross-bank copy ... C = copy b.sub0 <-- cross-bank copy Would produce the following code: b = insert_subreg e, A, sub0 <-- cross-bank copy ... C = copy A <-- same-bank copy This patch also introduces a new helper class for that: ValueTracker. This class implements the logic to look through the copy related instructions and get the related source. For now, the advanced rewriting is disabled by default as we are lacking the semantic on target specific instructions to catch the motivating examples. Related to <rdar://problem/12702965>. llvm-svn: 212100	2014-07-01 14:33:36 +00:00
Quentin Colombet	e1a36634b7	[RegAllocGreedy] Provide a flag to disable the local reassignment heuristic. By default, no functionality change. Before evicting a local variable, this heuristic tries to find another (set of) local(s) that can be reassigned to a free color. In some extreme cases (large basic blocks with tons of local variables), the compilation time is dominated by the local interference checks that this heuristic must perform, with no code gen gain. E.g., the motivating example takes 4 minutes to compile with this heuristic, 12 seconds without. Improving the situation will likely require to make drastic changes to the register allocator and/or the interference check framework. For now, provide this flag to better understand the impact of that heuristic. <rdar://problem/17444599> llvm-svn: 212099	2014-07-01 14:08:37 +00:00
Alp Toker	1a9ea52edb	Remove obsolete function TargetRegistry::getClosestTargetForJIT() This was kept around "for compatibility through 2.6" in 2009 and is not used or tested. llvm-svn: 212095	2014-07-01 10:47:13 +00:00
David Blaikie	c8caa1702a	Revert "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself." This reverts commit r212085. This breaks the sanitizer bot... & I thought I'd tried pretty hard not to do that. Guess I need to try harder. llvm-svn: 212089	2014-07-01 04:11:45 +00:00
Andrew Trick	f1b307bcb0	MachineScheduler: better book-keeping for asserts. Fixes another test case under PR20057. llvm-svn: 212088	2014-07-01 03:23:13 +00:00
Alp Toker	568c31f236	ExecutionEngine::create(): fix interpreter fallback when JIT is unavailable ForceInterpreter=false shouldn't disable the interpreter completely because it can still be necessary to interpret if the target doesn't support JIT. No obvious way to test this in LLVM, but this matches what LLVMCreateExecutionEngineForModule() does and fixes the clang-interpreter example in the clang source tree which uses the ExecutionEngine. llvm-svn: 212086	2014-07-01 03:18:49 +00:00
David Blaikie	b89e6d93d9	DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself. Originally committed in r211723, reverted in r211724 due to failure cases found and fixed (ArgumentPromotion: r211872, Inlining: r212065), and I now believe the invariant actually holds for some reasonable amount of code (but I'll keep an eye on the buildbots and see what happens... ). Original commit message: PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location. This situation does bad things when inlined, so I've fixed Clang not to produce inlinable call sites without locations when the caller has debug info (in the one case where I could find that this occurred). This updates the PR20038 test case to be what clang now produces, and readds the assertion that had to be removed due to this bug. I've also beefed up the debug info verifier to help diagnose these issues in the future, and I hope to add checks to the inliner to just assert-fail if it encounters this situation. If, in the future, we decide we have to cope with this situation, the right thing to do is probably to just remove all the DebugLocs from the inlined instructions. llvm-svn: 212085	2014-07-01 03:11:59 +00:00
Reid Kleckner	b5dd9452b4	Fix .seh_stackalloc 0 seh_stackalloc 0 is not representable in Win64 SEH info, so emitting it is a bug. Reviewers: rnk Differential Revision: http://reviews.llvm.org/D4334 Patch by Vadim Chugunov! llvm-svn: 212081	2014-07-01 00:42:47 +00:00
David Majnemer	0e2cc2a519	GlobalOpt: Handle non-zero offsets for aliases An alias with an aliasee of a non-zero GEP is not trivially replaceable with its aliasee. llvm-svn: 212079	2014-07-01 00:30:56 +00:00
Gerolf Hoflehner	734f4c8984	Suppress inlining when the block address is taken Inlining functions with block addresses can cause many problems and requires a rich infrastructure to support including escape analysis. At this point the safest approach to address these problems is by blocking inlining from happening. Background: There have been reports on Ruby segmentation faults triggered by inlining functions with block addresses like //Ruby code snippet vm_exec_core() { finish_insn_seq_0 = &&INSN_LABEL_finish; INSN_LABEL_finish: ; } This kind of scenario can also happen when LLVM picks a subset of blocks for inlining, which is the case with the actual code in the Ruby environment. LLVM suppresses inlining for such functions when there is an indirect branch. The attached patch does so even when there is no indirect branch. Note that user code like above would not make much sense: using the global for jumping across function boundaries would be illegal. Why was there a segfault: In the snippet above the block with the label is recognized as dead, so it is eliminated. Instead of a block address the cloner stores a constant (sic!) into the global resulting in the segfault (when the global is used in a goto). Why had it worked in the past then: By luck. In older versions vm_exec_core was also inlined but the label address used was the block label address in vm_exec_core. So the global jump ended up in the original function rather than in the caller which accidentally happened to work. Test case ./tools/clang/test/CodeGen/indirect-goto.c will fail as a result of this commit. rdar://17245966 llvm-svn: 212077	2014-07-01 00:19:34 +00:00
Duncan P. N. Exon Smith	4f5e9f8207	AArch64: Follow-up to r212073 In r212073 I missed a call of `use_begin()` that assumed the wrong semantics. It's not clear to me at all what this code does without the fix, so I'm not sure how to write a testcase. llvm-svn: 212075	2014-07-01 00:05:37 +00:00
Duncan P. N. Exon Smith	7d7ae93139	AArch64: Actually do address type promotion AArch64AddressTypePromotion was doing nothing because it was using the old semantics of `Use` and `uses()`, when it really wanted to get at the `users()`. llvm-svn: 212073	2014-06-30 23:42:14 +00:00
David Blaikie	644d2eee59	DebugInfo: Preserve debug location information when transforming a call into an invoke during inlining. This both improves basic debug info quality, but also fixes a larger hole whenever we inline a call/invoke without a location (debug info for the entire inlining is lost and other badness that the debug info emission code is currently working around but shouldn't have to). llvm-svn: 212065	2014-06-30 20:30:39 +00:00
Reid Kleckner	4da3d57e62	Speculatively fix some code handling Power64 MachO files MSVC was warning on a switch containing only default labels. In this instance, it looks like it uncovered a real bug. :) llvm-svn: 212062	2014-06-30 20:12:59 +00:00
Reid Kleckner	833740ac5e	msan: Stop stripping the 'tail' modifier off of calls This probably isn't necessary since msan started to unpoison the return value shadow memory before all calls. llvm-svn: 212061	2014-06-30 20:12:27 +00:00
Ehsan Akhgari	33d1ae53f7	Refactor the code in clang to find a file in a PATH like environment variable into a helper function llvm-svn: 212057	2014-06-30 19:54:20 +00:00
Alp Toker	cf21875d41	Fix 'platform-specific' hyphenations llvm-svn: 212056	2014-06-30 18:57:16 +00:00
Alp Toker	b792a01e13	Build fix for systems without futimes/futimens Some versions of Android don't have futimes/futimens and this code wasn't updated during the recent errc refactoring. Patch by Luqman Aden! llvm-svn: 212055	2014-06-30 18:57:04 +00:00
Kevin Enderby	4c8dfe4d0f	Add the -arch flag support to llvm-nm to select the slice out of a Mach-O universal file. This also includes support for -arch all, selecting the host architecture by default from a universal file and checking if -arch is used with a standard Mach-O it matches that architecture. llvm-svn: 212054	2014-06-30 18:45:23 +00:00
Matt Arsenault	d0e0f0aea0	R600: Move mul combine to separate function llvm-svn: 212052	2014-06-30 17:55:48 +00:00
Matt Arsenault	5d293eefda	R600: Remove unused declarations leftover from AMDIL llvm-svn: 212051	2014-06-30 17:37:17 +00:00
Adrian Prantl	da7d92e3e2	Debug info: split out complex DIVariable address expressions into a separate MDNode so they can be uniqued via folding set magic. To conserve space, DIVariable nodes are still variable-length, with the last two fields being optional. No functional change. http://reviews.llvm.org/D3526 llvm-svn: 212050	2014-06-30 17:17:35 +00:00
Andrea Di Biagio	53b6830069	[X86] Add support for builtin to read performance monitoring counters. This patch adds support for a new builtin instruction called __builtin_ia32_rdpmc. Builtin '__builtin_ia32_rdpmc' is defined as a 'GCC builtin'; on X86, it can be used to read performance monitoring counters. It takes as input the index of the performance counter to read, and returns the value of the specified performance counter as a 64-bit number. Calls to this new builtin will map to instruction RDPMC. The index in input to the builtin call is moved to register %ECX. The result of the builtin call is the value of the specified performance counter (RDPMC would return that quantity in registers RDX:RAX). This patch: - Adds builtin int_x86_rdpmc as a GCCBuiltin; - Adds a new x86 DAG node called 'RDPMC_DAG'; - Teaches how to lower this new builtin; - Adds an ISel pattern to select instruction RDPMC; - Fixes the definition of instruction RDPMC adding %RAX and %RDX as implicit definitions, and adding %ECX as implicit use; - Adds an LLVM test to verify that the new builtin is correctly selected. llvm-svn: 212049	2014-06-30 17:14:21 +00:00
Chad Rosier	304fe3ff71	[AArch64] Unsized types don't specify an alignment. PR20109 llvm-svn: 212045	2014-06-30 15:03:00 +00:00
Chad Rosier	e6b8761ab9	[AArch64] Convert mul x, -(pow2 +/- 1) to shift + add/sub. The combine for mul x, pow2 +/- 1 is unchanged. Test cases for both combines as well as mul x, pow2 have been added as well. llvm-svn: 212044	2014-06-30 14:51:14 +00:00
Tim Northover	8f9590b622	macho-dump: add code to print LC_ID_DYLIB load commands. I want to check them in lld. llvm-svn: 212043	2014-06-30 14:40:57 +00:00
Scott Douglass	7650a9b871	ARM: take care not to set the ThumbFunc bit on TLS data symbols This fixes LNT SingleSource/UnitTests/Threads with -mthumb. Differential Revision: http://reviews.llvm.org/D4324 llvm-svn: 212029	2014-06-30 09:37:24 +00:00
Saleem Abdulrasool	e3c3fe53eb	X86: fix comment Fix a comment typo `DbgLocLImport` instead of `DLLImport`. llvm-svn: 212012	2014-06-30 03:11:18 +00:00
Saleem Abdulrasool	4a1e409a4d	ARM: use symbolic name for constant This just changes the constant value to the symbolic name corresponding to it. NFC. llvm-svn: 212011	2014-06-30 03:11:14 +00:00
Saleem Abdulrasool	67b548154e	CodeGen: rename Win64 ExceptionHandling to WinEH This exception format is not specific to Windows x64. A similar approach is taken on nearly all architectures. Generalise the name to reflect reality. This will eventually be used for Windows on ARM data emission as well. Switch the enum and namespace into an enum class. llvm-svn: 212000	2014-06-29 21:43:47 +00:00
Saleem Abdulrasool	7206a52522	MC: rename EmitWin64EH routines Rename the routines to reflect the reality that they are more related to call frame information than to Win64 EH. Although EH is implemented in an intertwined manner by augmenting with an exception handler and an associated parameter, the majority of these routines emit information required to unwind the frames. This also helps identify that these routines are generic for most windows platforms (they apply equally to nearly all architectures except x86) although the encoding of the information is architecture dependent. Unwinding data is emitted via EmitWinCFI* and exception handling information via EmitWinEH*. llvm-svn: 211994	2014-06-29 01:52:01 +00:00
Craig Topper	66e588be09	Add ops() method to SDNode that returns an ArrayRef<SDUse>. Use it to simplify some code. llvm-svn: 211993	2014-06-29 00:40:57 +00:00
Rafael Espindola	05cef84640	Use a range loop. No functionality change. llvm-svn: 211986	2014-06-28 18:44:59 +00:00
Alp Toker	c890aa5190	Fix build following r211956 RuntimeDyld now uses MCInst::dump_pretty() which introduces a dependency on 'MC'. llvm-svn: 211978	2014-06-28 06:31:47 +00:00
David Majnemer	329b5602d7	Verifier: Update assert message to reflect LangRef No functionality change, just correcting the assertion message. llvm-svn: 211977	2014-06-28 06:24:49 +00:00
Chandler Carruth	bd0717d7cc	[x86] Fix a bug in the v8i16 shuffling exposed by the new splat-like lowering for v16i8. ASan and some bots caught this bug with existing test cases. Fixing it even fixed a miscompile with one of the test cases. I'm still a bit suspicious of this test case as I've not taken a proper amount of time to think about it, but the fix here is strict goodness. llvm-svn: 211976	2014-06-28 05:46:28 +00:00
Chandler Carruth	887c2c3482	[x86] Add handling for splat-like widenings of v16i8 shuffles. These show up really frequently, not the least with actual splats. =] We lowered these quite badly before. The new code path tries to widen i8 shuffles to i16 shuffles in a splat-like way. There are still some inefficiencies in our i16 splat logic though, so we aren't really done here. Also, for certain patterns (bit of a gather-and-splat) we still generate pretty silly code, and I've left a fixme for addressing it. However, I'm not actually worried about this code pattern as much. The old shuffle lowering generates a 29 instruction monstrosity for it that should execute much more slowly. llvm-svn: 211974	2014-06-28 05:16:40 +00:00
Lang Hames	4a26c0ccae	[RuntimeDyld] Use a raw_ostream and llvm::format for int-to-string conversions. Some users' C++11 standard libraries don't support std::to_string yet. llvm-svn: 211961	2014-06-27 21:07:00 +00:00
Chad Rosier	5235973ee0	[AArch64] Fix memset ICE when memset value is f128. llvm-svn: 211960	2014-06-27 21:05:09 +00:00
Lang Hames	3b7ffd6314	[RuntimeDyld] #include <cctype> header in RuntimeDyldChecker.cpp. Hopefully this will unbreak the windows bots. llvm-svn: 211958	2014-06-27 20:37:39 +00:00
Lang Hames	e1c1138a38	[RuntimeDyld] Add a framework for testing relocation logic in RuntimeDyld. This patch adds a "-verify" mode to the llvm-rtdyld utility. In verify mode, llvm-rtdyld will test supplied expressions against the linked program images that it creates in memory. This scheme can be used to verify the correctness of the relocation logic applied by RuntimeDyld. The expressions to test will be read out of files passed via the -check option (there may be more than one of these). Expressions to check are extracted from lines of the form: # rtdyld-check: <expression> This system is designed to fit the llvm-lit regression test workflow. It is format and target agnostic, and supports verification of images linked for remote targets. The expression language is defined in llvm/include/llvm/RuntimeDyldChecker.h . Examples can be found in test/ExecutionEngine/RuntimeDyld. llvm-svn: 211956	2014-06-27 20:20:57 +00:00
Chandler Carruth	a94ef908d9	[x86] Fix another bug hit when bootstrapping with the new shuffle lowering. For maximum irony, I had already discovered this bug, diagnosed it, and left FIXMEs about it in the test cases. =[ I just failed to go back over those until after I had reduced a bootstrap miscompile down to a single TU, stared at the assembly for an hour, and figured out the bug. Again. Oh well. llvm-svn: 211955	2014-06-27 20:07:40 +00:00
Justin Holewinski	9982f06aa3	[NVPTX] Use GreatestCommonDivisor64 from MathExtras instead of using our own. Thanks Hal! llvm-svn: 211952	2014-06-27 19:36:25 +00:00
David Majnemer	82d6ff6b5b	Include <tuple> to make buildbots happy llvm-svn: 211949	2014-06-27 18:38:12 +00:00
Justin Holewinski	a0d531f031	[NVPTX] Add reflect intrinsic (better than matching by function name) Also clean up some of the logic in NVVMReflect.cpp while we're messing around in there. llvm-svn: 211948	2014-06-27 18:36:11 +00:00
Justin Holewinski	a8071856a6	[NVPTX] Handle all possible vector types in getSetCCResultType, not just the ones representable as MVTs llvm-svn: 211947	2014-06-27 18:36:08 +00:00
Justin Holewinski	2739c0175c	[NVPTX] Add 'b' asm constraint llvm-svn: 211946	2014-06-27 18:36:06 +00:00
Justin Holewinski	b5db95e465	[NVPTX] Simplify some argument lowering logic llvm-svn: 211945	2014-06-27 18:36:04 +00:00
Justin Holewinski	e519a4301b	[NVPTX] Do not process samplers in GenericToNVVM llvm-svn: 211944	2014-06-27 18:36:02 +00:00
Justin Holewinski	549c773619	[NVPTX] Error out if initializer is given for variable in an address space that does not support initialization llvm-svn: 211943	2014-06-27 18:36:01 +00:00
Justin Holewinski	773ca40f5d	[NVPTX] Add support for .managed variables for UVM llvm-svn: 211942	2014-06-27 18:35:58 +00:00
Justin Holewinski	d73767a80a	[NVPTX] Emit .weak linkage for link_once, weak, available_externally, and common linkage llvm-svn: 211941	2014-06-27 18:35:56 +00:00
Justin Holewinski	73cb5de546	[NVPTX] Variables that start with llvm. or nvvm. are reserved and should not be emitted llvm-svn: 211940	2014-06-27 18:35:53 +00:00
Justin Holewinski	b926d9d446	[NVPTX] Fix handling of ldg/ldu intrinsics. The address space of the pointer must be global (1) for these intrinsics. There must also be alignment metadata attached to the intrinsic calls, e.g. %val = tail call i32 @llvm.nvvm.ldu.i.global.i32.p1i32(i32 addrspace(1)* %ptr), !align !0 !0 = metadata !{i32 4} llvm-svn: 211939	2014-06-27 18:35:51 +00:00
Justin Holewinski	6e40f63e41	[NVPTX] Clean up argument lowering code and properly handle alignment for structs and vectors llvm-svn: 211938	2014-06-27 18:35:44 +00:00
Justin Holewinski	d7d8fe0e9c	[NVPTX] Add missing boolean vector contents flag llvm-svn: 211937	2014-06-27 18:35:42 +00:00
Justin Holewinski	360a5cfcd3	[NVPTX] Add support for [SHL,SRA,SRL]_PARTS llvm-svn: 211936	2014-06-27 18:35:40 +00:00
Justin Holewinski	eafe26d082	[NVPTX] Implement fma and imad contraction as target DAGCombiner patterns This also introduces DAGCombiner patterns for mul.wide to multiply two smaller integers and produce a larger integer llvm-svn: 211935	2014-06-27 18:35:37 +00:00
Justin Holewinski	832e09b4d9	[NVPTX] Add support for efficient rotate instructions on SM 3.2+ llvm-svn: 211934	2014-06-27 18:35:33 +00:00
Justin Holewinski	7be57de6b8	[NVPTX] Add missing isel patterns for 64-bit atomics llvm-svn: 211933	2014-06-27 18:35:30 +00:00
Justin Holewinski	ca7a4f136d	[NVPTX] Add isel patterns for bit-field extract (bfe) llvm-svn: 211932	2014-06-27 18:35:27 +00:00
Justin Holewinski	10c25968d8	[NVPTX] Add support for isspacep instruction llvm-svn: 211931	2014-06-27 18:35:24 +00:00
Justin Holewinski	124fc1951f	[NVPTX] Add support for envreg reads llvm-svn: 211930	2014-06-27 18:35:21 +00:00
Justin Holewinski	602fa5b5d1	[NVPTX] Add target options for PTX 3.2/4.0 and SM 5.0 (Maxwell) Default PTX version is set to PTX 3.2 llvm-svn: 211929	2014-06-27 18:35:18 +00:00
Justin Holewinski	c3f31ebe6e	[NVPTX] Update sub-target feature detection llvm-svn: 211928	2014-06-27 18:35:16 +00:00
Justin Holewinski	6dca83987c	[NVPTX] Directly control the Machine SSA passes that are invoked for NVPTX. NVPTX is a bit special in the optimizations it requires, so this gives us better control over the backend optimization pipeline. llvm-svn: 211927	2014-06-27 18:35:14 +00:00
Justin Holewinski	7d5bf66f61	[NVPTX] Emit .weak when linkage is not external, internal, or private llvm-svn: 211926	2014-06-27 18:35:10 +00:00
Justin Holewinski	0da758571c	[NVPTX] Just use getTypeAllocSize() when computing return value size for structures and vectors llvm-svn: 211925	2014-06-27 18:35:08 +00:00
Aaron Ballman	152dff71fa	Silencing some -Wcast-qual warnings. No functional changes intended. llvm-svn: 211923	2014-06-27 18:25:49 +00:00
Chandler Carruth	dd6470a9dd	[x86] Fix a miscompile in the new shuffle lowering uncovered by a bootstrap. I managed to mis-remember how PACKUS worked on x86, and was using undef for the high bytes instead of zero. The fix is fairly obvious. llvm-svn: 211922	2014-06-27 18:25:23 +00:00
David Majnemer	dad0a645a7	IR: Add COMDATs to the IR This new IR facility allows us to represent the object-file semantic of a COMDAT group. COMDATs allow us to tie together sections and make the inclusion of one dependent on another. This is required to implement features like MS ABI VFTables and optimizing away certain kinds of initialization in C++. This functionality is only representable in COFF and ELF, Mach-O has no similar mechanism. Differential Revision: http://reviews.llvm.org/D4178 llvm-svn: 211920	2014-06-27 18:19:56 +00:00
Julien Lerouge	a67d14f5a3	lldb can interrupt waitpid, so EINTR shouldn't be an error. This fixes the case where there is no timeout. In the case where there is a timeout though, the code is still wrong since it doesn't check that the alarm really went off. Without this patch, I cannot debug a program that forks itself using sys::ExecuteAndWait with lldb. llvm-svn: 211918	2014-06-27 18:02:54 +00:00
Matt Arsenault	d782d05666	R600: Move trivial getters into header, use initializer list llvm-svn: 211917	2014-06-27 17:57:00 +00:00
David Majnemer	c57d038240	MC: Fix associative sections on COFF COFF sections in MC were represented by a tuple of section-name and COMDAT-name. This is not sufficient to represent a .text section associated with another .text section; we need a way to distinguish between the key section and the one marked associative. llvm-svn: 211913	2014-06-27 17:19:44 +00:00
Juergen Ributzka	345589e257	[FastISel][X86] Fix typos. llvm-svn: 211911	2014-06-27 17:16:34 +00:00
Matt Arsenault	642d2e78b3	R600: Don't crash on unhandled instruction in promote alloca llvm-svn: 211906	2014-06-27 16:52:49 +00:00
Alexander Kornienko	b673b4b187	Clean up unused variable warning in release build. llvm-svn: 211902	2014-06-27 15:30:55 +00:00
Chandler Carruth	39cd216f8f	Re-apply r211287: Remove support for LLVM runtime multi-threading. I'll fix the problems in libclang and other projects in ways that don't require <mutex> until we sort out the cygwin situation. llvm-svn: 211900	2014-06-27 15:13:01 +00:00
Ulrich Weigand	14bd521f4c	[PowerPC] Constrain base register in PPCRegisterInfo::resolveFrameIndex I've run into a bug where current LLVM at -O0 (with fast-isel) generated invalid code like: ld 0, 20936(1) # 8-byte Folded Reload stw 12, 10348(0) stw 12, 10344(0) The underlying vreg had been introduced as base register by the Local Stack Slot Allocation pass. That register was constrained to G8RC by PPCRegisterInfo::materializeFrameBaseRegister to match the ADDI instruction used to set it, but it was not constrained to G8RC_NOX0 to fit the use of the register in an address. That should have happened in PPCRegisterInfo::resolveFrameIndex. This patch adds an appropriate constrainRegClass call. Reviewed by Hal Finkel. llvm-svn: 211897	2014-06-27 13:04:12 +00:00
Chandler Carruth	ed4a0bc734	[x86] Clean up some unused variables, especially in release builds. llvm-svn: 211894	2014-06-27 12:04:18 +00:00
Chandler Carruth	688001f042	[x86] Teach the target combine step to aggressively fold pshufd instructions. Summary: This allows it to fold pshufd instructions across intervening half-shuffles and other noise. This pattern actually shows up in the generic lowering tests, but I've also added direct tests using intrinsics to make sure that the specific desired functionality is working even if the lowering stuff changes in the future. Differential Revision: http://reviews.llvm.org/D4292 llvm-svn: 211892	2014-06-27 11:40:13 +00:00
Chandler Carruth	0d6d1f2b17	[x86] Teach the target-specific combining how to aggressively fold half-shuffles, even looking through intervening instructions in a chain. Summary: This doesn't happen to show up with any test cases I've found for the current shuffle lowering, but previous attempts would benefit from this and it seems generally useful. I've tested it directly using intrinsics, which also shows that it will work with hand vectorized code as well. Note that even though pshufd isn't directly used in these tests, it gets exercised because we combine some of the half shuffles into a pshufd first, and then merge them. Differential Revision: http://reviews.llvm.org/D4291 llvm-svn: 211890	2014-06-27 11:34:40 +00:00
Chandler Carruth	97ebc2362c	[x86] Teach the X86 backend to DAG-combine SSE2 shuffles that are trivially redundant. This fixes several cases in the new vector shuffle lowering algorithm which would generate redundant shuffle instructions for the sake of simplicity. I'm also deleting a testcase which was somewhat ridiculous. It was checking for a bug in 2007 about incorrectly transforming shuffles by looking for the string "-86" in the output of a pretty substantial function. This test case doesn't seem to have any value at this point. Differential Revision: http://reviews.llvm.org/D4240 llvm-svn: 211889	2014-06-27 11:27:52 +00:00
Chandler Carruth	83860cfcfa	[x86] Begin a significant overhaul of how vector lowering is done in the x86 backend. This sketches out a new code path for vector lowering, hidden behind an off-by-default flag while it is under development. The fundamental idea behind the new code path is to aggressively break down the problem space in ways that ease selecting the odd set of instructions available on x86, and carefully avoid scalarizing code even when forced to use older ISAs. Notably, this starts off restricting itself to SSE2 and implements the complete vector shuffle and blend space for 128-bit vectors in SSE2 without scalarizing. The plan is to layer on top of this ISA extensions where we can bail out of the complex SSE2 lowering and opt for a cheaper, specialized instruction (or set of instructions). It also needs to be generalized to AVX and AVX512 vector widths. Currently, this does a decent but not perfect job for SSE2. There are some specific shortcomings that I plan to address: - We need a peephole combine to fold together shuffles where possible. There are cases where a previous shuffle could be modified slightly to arrange for elements to be in the correct position and a later shuffle eliminated. Doing this eagerly added quite a bit of complexity, and so my plan is to combine away these redundancies afterward. - There are a lot more clever ways to use unpck and pack that need to be added. This is essential for real world shuffles as it turns out... Once SSE2 is polished a bit I should be able to get interesting numbers on performance improvements on benchmarks conducive to vectorization. All of this will be off by default until it is functionally equivalent of course. Differential Revision: http://reviews.llvm.org/D4225 llvm-svn: 211888	2014-06-27 11:23:44 +00:00
Ulrich Weigand	8f1f87c734	[RuntimeDyld, PowerPC] Fix/improve handling of TOC relocations Current PPC64 RuntimeDyld code to handle TOC relocations has two problems: - With recent linkers, in addition to the relocations that implicitly refer to the TOC base (R_PPC64_TOC), you can now also use the .TOC. magic symbol with any other relocation to refer to the TOC base explicitly. This isn't currently used much in ELFv1 code (although it could be), but it is essential in ELFv2 code. - In a complex JIT environment with multiple modules, each module may have its own .toc section, and TOC relocations in one module must refer to its own TOC section. The current findPPC64TOC implementation does not correctly implement this; in fact, it will always return the address of the first TOC section it finds anywhere. (Note that at the time findPPC64TOC is called, we don't even know which module the relocation originally resided in, so it is not even possible to fix this routine as-is.) This commit fixes both problems by handling TOC relocations earlier, in processRelocationRef. To do this, I've removed the findPPC64TOC routine and replaced it by a new routine findPPC64TOCSection, which works analogously to findOPDEntrySection in scanning the sections of the ObjImage provided by its caller, processRelocationRef. This solves the issue of finding the correct TOC section associated with the current module. This makes it straightforward to implement both R_PPC64_TOC relocations, and relocations explicitly referring to the .TOC. symbol, directly in processRelocationRef. There is now a new problem in implementing the R_PPC64_TOC16* relocations, because those can now in theory involve three different sections: the relocation may be applied in section A, refer explicitly to a symbol in section B, and refer implicitly to the TOC section C. The final processing of the relocation thus may only happen after all three of these sections have been assigned final addresses. 
There is currently no obvious means to implement this in its general form with the common-code RuntimeDyld infrastructure. Fortunately, ppc64 code usually makes no use of this most general form; in fact, TOC16 relocations are only ever generated by LLVM for symbols residing themselves in the TOC, which means "section B" == "section C" in the above terminology. This special case can easily be handled with the current infrastructure, and that is what this patch does. [ Unhandled cases result in an explicit error, unlike the current code which silently returns the wrong TOC base address ... ] This patch makes the JIT work on both BE and LE (ELFv2 requires additional patches, of course), and allowed me to successfully run complex JIT scenarios (via mesa/llvmpipe). Reviewed by Hal Finkel. llvm-svn: 211885	2014-06-27 10:32:14 +00:00
Alp Toker	de4c009be4	IRReader: don't mark MemoryBuffers const llvm-svn: 211883	2014-06-27 09:19:14 +00:00
Dinesh Dwivedi	adc07739a9	Added instruction combine to transform few more negative values addition to subtraction (Part 3) This patch enables transforms for (x + (~(y | c) + 1) --> x - (y | c) if c is odd Differential Revision: http://reviews.llvm.org/D4210 llvm-svn: 211881	2014-06-27 07:47:35 +00:00
Eric Christopher	93bf97c146	Remove the caching of the target machine from SystemZTargetLowering. Update all callers and uses accordingly. llvm-svn: 211880	2014-06-27 07:38:01 +00:00
Eric Christopher	673b3afacd	Remove target machine caching from SystemZInstrInfo and SystemZRegisterInfo and replace it with the subtarget as that's all they needed in the first place. Update all uses and calls accordingly. llvm-svn: 211877	2014-06-27 07:01:17 +00:00
David Blaikie	dada538bb4	Revert "Revert "Revert "PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location.""" Reverting this again, didn't mean to commit it - while r211872 fixes one of the issues here, there are still others to figure out and address. This reverts commit r211871. llvm-svn: 211873	2014-06-27 05:34:05 +00:00
David Blaikie	b0cdf530c3	ArgumentPromotion: Propagate debug locations on calls for which arguments are promoted. llvm-svn: 211872	2014-06-27 05:32:09 +00:00
David Blaikie	8832992df5	Revert "Revert "PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location."" This reverts commit r211724. llvm-svn: 211871	2014-06-27 05:31:49 +00:00
Eric Christopher	bb712eda0f	Have SystemZSelectionDAGInfo constructor take a DataLayout rather than a target machine since it doesn't need anything past the DataLayout. llvm-svn: 211870	2014-06-27 05:26:28 +00:00
Craig Topper	9f62d8006a	Rename getX86ConditonCode -> getX86ConditionCode llvm-svn: 211869	2014-06-27 05:18:21 +00:00
Andrew Trick	040c0da578	Left out the NDEBUG in the previous checkin. llvm-svn: 211867	2014-06-27 05:09:36 +00:00
Andrew Trick	5632722cab	MachineScheduler: add some book-keeping to fix an assert. Fix for Bug 20057 - Assertion failed in llvm::SUnit* llvm::SchedBoundary::pickOnlyChoice(): Assertion `i <= (HazardRec->getMaxLookAhead() + MaxObservedStall) && "permanent hazard"' Thanks to Chad for the test case. llvm-svn: 211865	2014-06-27 04:57:05 +00:00
Alp Toker	f6ae844eea	Propagate const-correctness into parseBitcodeFile() llvm-svn: 211864	2014-06-27 04:48:32 +00:00
Eric Christopher	5432e75a25	Have MipsSelectionDAGInfo constructor take a DataLayout rather than a target machine since it doesn't need anything past the DataLayout. llvm-svn: 211863	2014-06-27 04:38:30 +00:00
Alp Toker	5ebb7b3112	ParseIR: don't take ownership of the MemoryBuffer clang was needlessly duplicating whole memory buffer contents in an attempt to satisfy unclear ownership semantics. Let's just hide internal LLVM quirks and present a simple non-owning interface. The public C API preserves previous behaviour for stability. llvm-svn: 211861	2014-06-27 04:33:58 +00:00
Eric Christopher	493f91b6de	Move NVPTX subtarget dependent variables from the target machine to the subtarget. llvm-svn: 211860	2014-06-27 04:33:14 +00:00
Eric Christopher	2ecb77e31f	Use the target lowering we can get off of the DAG rather than off of the cached target machine. llvm-svn: 211858	2014-06-27 03:45:49 +00:00
Matt Arsenault	6f62cf80d0	Fix missing newline and simplify debug printing. llvm-svn: 211850	2014-06-27 02:36:59 +00:00
Matt Arsenault	961ca43180	R600: Move load/store ReplaceNodeResults to common code. Future patches will want to custom lower loads on SI. llvm-svn: 211848	2014-06-27 02:33:47 +00:00
Eric Christopher	dd440f8727	Move the constructor for NVPTXFrameLowering into the implementation file in preparation for the subtarget move. llvm-svn: 211847	2014-06-27 02:05:24 +00:00
Eric Christopher	f0dad2670d	Remove unnecessary caching of the TargetMachine on NVPTXFrameLowering. Adjust the constructor accordingly. llvm-svn: 211846	2014-06-27 02:05:22 +00:00
Eric Christopher	d8132862a9	Rework the logic for setting the TargetName. This appears to be shorter and identical in goal. llvm-svn: 211845	2014-06-27 02:05:19 +00:00
Eric Christopher	e032a8c0bd	Remove caching of the target machine in NVPTXInstrInfo and update constructor accordingly. llvm-svn: 211840	2014-06-27 01:27:08 +00:00
Eric Christopher	a1869461e1	Remove comment that duplicated information in the constructor that it's after. llvm-svn: 211839	2014-06-27 01:27:06 +00:00
Eric Christopher	e8f50281a9	Remove commented out code. llvm-svn: 211838	2014-06-27 01:27:05 +00:00
Eric Christopher	c22ad16bc6	Remove extraneous parens and extraneous const cast (and fix the prototype for the function to patch what we were returning). llvm-svn: 211837	2014-06-27 01:27:03 +00:00
Eric Christopher	1f86ccac46	Move the subtarget dependent features from the target machine to the subtarget for the MSP430 target. llvm-svn: 211836	2014-06-27 01:14:54 +00:00
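
Note on r211881 (Dinesh Dwivedi, instruction combine, Part 3): the rewrite listed above rests on the two's-complement identity ~a + 1 == -a, so x + (~(y | c) + 1) equals x - (y | c). The sketch below is only an illustration and not code from the patch: it checks the identity exhaustively at an assumed width of 8 bits, and the names BITS, add_form, sub_form and forms_agree are hypothetical. It does not model why the patch restricts the transform to odd c; the arithmetic identity itself holds for any c.

```python
# Illustrative check only; width and helper names are assumptions,
# not taken from the LLVM patch.
BITS = 8
MASK = (1 << BITS) - 1


def add_form(x, y, c):
    """x + (~(y | c) + 1), evaluated modulo 2**BITS."""
    inverted = ~(y | c) & MASK          # bitwise NOT at fixed width
    return (x + inverted + 1) & MASK


def sub_form(x, y, c):
    """x - (y | c), evaluated modulo 2**BITS."""
    return (x - (y | c)) & MASK


def forms_agree(c):
    """True if both forms match for every 8-bit x and y with this c."""
    return all(
        add_form(x, y, c) == sub_form(x, y, c)
        for x in range(1 << BITS)
        for y in range(1 << BITS)
    )
```

Calling forms_agree(1) or forms_agree(5) walks all 65536 (x, y) pairs for that constant.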
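
Note on r211922 (Chandler Carruth, shuffle lowering miscompile): PACKUS narrows signed 16-bit words to unsigned 8-bit values with saturation, so a word passes its low byte through unchanged only when its high byte is zero. The sketch below models a single lane to show why leaving the high bytes undef is unsafe. It is a simplified scalar model, not the lowering code, and the helper names to_signed16, packus_lane and pack_low_byte are hypothetical.

```python
# Scalar model of one PACKUSWB lane; helper names are illustrative.

def to_signed16(word):
    """Reinterpret the low 16 bits of word as a signed value."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def packus_lane(word):
    """Signed 16-bit to unsigned 8-bit with saturation to [0, 255]."""
    return max(0, min(255, to_signed16(word)))


def pack_low_byte(low, high):
    """Pack a word built from the given low and high bytes."""
    return packus_lane(((high & 0xFF) << 8) | (low & 0xFF))
```

With high set to zero, pack_low_byte returns the low byte itself. Any other high byte pushes the word outside 0..255, so the result saturates to 255 (positive word) or 0 (negative word), which is the kind of corruption an undef high byte can introduce.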

71354 Commits