llvm-project

Commit Graph

Author	SHA1	Message	Date
Venkatraman Govindaraju	5615aca219	[SparcV9] Add ctpop instruction for i64. Also, expand ctlz, cttz and bswap. llvm-svn: 193941	2013-11-03 05:59:07 +00:00
Filip Pizlo	9f50ccd1a3	When LLVM is embedded in a larger application, it's not OK for LLVM to intercept crashes. LLVM already has the ability to disable this functionality. This patch exposes it via the C API. llvm-svn: 193937	2013-11-03 00:29:47 +00:00
Rafael Espindola	586af97a30	move getSymbolNMTypeChar to the one program that needs it: nm. llvm-svn: 193933	2013-11-02 21:16:09 +00:00
Rafael Espindola	06adfac8be	Convert another use of getSymbolNMTypeChar. llvm-svn: 193932	2013-11-02 20:10:07 +00:00
Rafael Espindola	e62ab11f09	Avoid some getSymbolNMTypeChar uses in COFFObjectFile.cpp itself. This is a fixed version of 193928 which keeps these uses in sync. llvm-svn: 193931	2013-11-02 18:07:48 +00:00
Rafael Espindola	7700d8bba5	Revert "Don't use getSymbolNMTypeChar for implementing COFFObjectFile::getSymbolFileOffset." Investigating a bot failure. This reverts commit r193928. llvm-svn: 193929	2013-11-02 17:12:49 +00:00
Rafael Espindola	48d0bb950d	Don't use getSymbolNMTypeChar for implementing COFFObjectFile::getSymbolFileOffset. llvm-svn: 193928	2013-11-02 16:55:21 +00:00
Benjamin Kramer	089c1e4f6d	SLPVectorizer: Remove duplicated function. llvm-svn: 193927	2013-11-02 14:46:27 +00:00
Benjamin Kramer	568a1cd9df	LoopVectorize: Remove quadratic behavior the local CSE. Doing this with a hash map doesn't change behavior and avoids calling isIdenticalTo O(n^2) times. This should probably eventually move into a utility class shared with EarlyCSE and the limited CSE in the SLPVectorizer. llvm-svn: 193926	2013-11-02 13:39:00 +00:00
Rafael Espindola	a135632af0	Fix llvm-nm to mach OS X's nm on some tests. There is still a long way to go for llvm-nm, but at least we now match nm's letter output in the cases we test for. llvm-svn: 193912	2013-11-02 05:03:24 +00:00
Michael Liao	b638d05ecb	Fix PR17764 - When selecting BLEND from vselect, the operands need swapping as due to the difference between vselect and SSE/AVX's BLEND insn llvm-svn: 193900	2013-11-02 00:10:02 +00:00
Yuchen Wu	dbcf19758d	Added command-line option to output llvm-cov to file. Added -o option to llvm-cov. If no output file is specified, it defaults to STDOUT. llvm-svn: 193899	2013-11-02 00:09:17 +00:00
Arnold Schwaighofer	d0789cdffe	LoopVectorizer: Move cse code into its own function llvm-svn: 193895	2013-11-01 23:28:54 +00:00
Eric Christopher	fedfa44922	Comment some and reformat for clarity beginFunction. llvm-svn: 193894	2013-11-01 23:14:17 +00:00
Arnold Schwaighofer	a846a7f8f0	LoopVectorizer: Perform redundancy elimination on induction variables When the loop vectorizer was part of the SCC inliner pass manager gvn would run after the loop vectorizer followed by instcombine. This way redundancy (multiple uses) were removed and instcombine could perform scalarization on the induction variables. Having moved the loop vectorizer to later we no longer run any form of redundancy elimination before we perform instcombine. This caused vectorized induction variables to survive that did not before. On a recent iMac this helps linpack back from 6000Mflops to 7000Mflops. This should also help lpbench and paq8p. I ran a Release (without Asserts) build over the test-suite and did not see any negative impact on compile time. radar://15339680 llvm-svn: 193891	2013-11-01 22:18:19 +00:00
David Blaikie	2ede02f6d0	DebugInfo: Make pubnames header printing similar to unit header printing In a failed attempt to allow the gnu-public-names.ll test case to not hardcode the size of the unit that the pubnames section referred to I've at least managed to have unit headers and pubnames headers print out in a similar style. This failed to achieve the desired goal because the header in a unit specifies the length of the unit without the length element of the header whereas the length in the pubnames includes this element, so the numbers are off by 4 bytes. I don't know of any arithmetic powers in FileCheck so the test case can't simply say "CU_LENGTH + 4". llvm-svn: 193872	2013-11-01 17:53:30 +00:00
Juergen Ributzka	359c532d36	[Stackmap] Remove erroneous assert. llvm-svn: 193871	2013-11-01 17:53:27 +00:00
Matt Arsenault	ef1a950b48	Use isa<> instead of dyn_cast<> with unused value llvm-svn: 193869	2013-11-01 17:39:26 +00:00
Chad Rosier	995d9c2fdc	[AArch64] Simplify a few of the instruction patterns. No functional change intended. llvm-svn: 193867	2013-11-01 17:13:44 +00:00
Chad Rosier	a4bfb44a9e	[AArch64] Fix assembly string formatting and other coding standard violations. llvm-svn: 193866	2013-11-01 17:13:42 +00:00
Rafael Espindola	716e7405d3	Remove linkonce_odr_auto_hide. linkonce_odr_auto_hide was in incomplete attempt to implement a way for the linker to hide symbols that are known to be available in every TU and whose addresses are not relevant for a particular DSO. It was redundant in that it all its uses are equivalent to linkonce_odr+unnamed_addr. Unlike those, it has never been connected to clang or llvm's optimizers, so it was effectively dead. Given that nothing produces it, this patch just nukes it (other than the llvm-c enum value). llvm-svn: 193865	2013-11-01 17:09:14 +00:00
Aaron Ballman	2b7a733b16	Commenting out this assert because it is causing the build bots to fail. This effectively reverts r193861, but needs to be fixed as part of r193769. llvm-svn: 193862	2013-11-01 15:12:23 +00:00
Aaron Ballman	96321aa523	Fixing an order of evaluation error in an assert. llvm-svn: 193861	2013-11-01 14:53:14 +00:00
Benjamin Kramer	1fbcdca9e3	LoopVectorize: Look for consecutive acces in GEPs with trailing zero indices If we have a pointer to a single-element struct we can still build wide loads and stores to it (if there is no padding). llvm-svn: 193860	2013-11-01 14:09:50 +00:00
Bradley Smith	2521975a42	[ARM] Add Virtualization subtarget feature and more build attributes in this area Add a Virtualization ARM subtarget feature along with adding proper build attribute emission for Tag_Virtualization_use (encodes Virtualization and TrustZone) and Tag_MPextension_use. Also rework test/CodeGen/ARM/2010-10-19-mc-elf-objheader.ll testcase to something that is more maintainable. This changes the focus of this testcase away from testing CPU defaults (which is tested elsewhere), onto specifically testing that attributes are encoded correctly. llvm-svn: 193859	2013-11-01 13:27:35 +00:00
Bradley Smith	c848beba5e	[ARM] Fix Tag_ABI_HardFP_use build attribute Fix Tag_ABI_HardFP_use build attribute to handle single precision FP, replace deprecated Tag_ABI_HardFP_use value of 3 with 0 and also add some tests for Tag_ABI_VFP_args. llvm-svn: 193856	2013-11-01 11:21:16 +00:00
Hal Finkel	4d94930bcb	Consider (x == -1) unlikely in BranchProbabilityInfo This adds another heuristic to BPI, similar to the existing heuristic that considers (x == 0) unlikely to be true. As suggested in the PACT'98 paper by Deitrich, Cheng, and Hwu, -1 is often used to indicate an invalid index, and equality comparisons with -1 are also unlikely to succeed. Local experimentation supports this hypothesis: This yields a 1-2% speedup in the test-suite sqlite benchmark on the PPC A2 core, with no significant regressions. llvm-svn: 193855	2013-11-01 10:58:22 +00:00
Arnold Schwaighofer	70a4665f55	LoopVectorizer: If dependency checks fail try runtime checks When a dependence check fails we can still try to vectorize loops with runtime array bounds checks. This helps linpack to vectorize a loop in dgefa. And we are back to 2x of the scalar performance on a corei7-avx. radar://15339680 llvm-svn: 193853	2013-11-01 03:05:07 +00:00
Arnold Schwaighofer	1ca922e296	LoopVectorizer: Clear all member data structures in RuntimeCheck.reset() Clear all data structures when resetting the RuntimeCheck data structure. No test case. This was exposed by an upcomming change. llvm-svn: 193852	2013-11-01 03:05:04 +00:00
David Blaikie	71d34a2eef	DebugInfo: Emit member variable locations as data instead of expressions in blocks Drive by space optimization. Also makes the DIEs more regular which might speed up DWARF parsing. llvm-svn: 193835	2013-11-01 00:25:45 +00:00
Kevin Enderby	3c5ac81032	Add to the disassembler C API output reference types for Objective-C data structures. This is allows tools such as darwin's otool(1) that uses the LLVM disassembler take a pointer value being loaded by an instruction and add a comment to what it is being referenced to make following disassembly of Objective-C programs more readable. For example disassembling the Mac OS X TextEdit app one will see comments like the following: movq 0x20684(%rip), %rsi ## Objc selector ref: standardUserDefaults movq 0x21985(%rip), %rdi ## Objc class ref: _OBJC_CLASS_$_NSUserDefaults movq 0x1d156(%rip), %r14 ## Objc message: +[NSUserDefaults standardUserDefaults] leaq 0x23615(%rip), %rdx ## Objc cfstring ref: @"SelectLinePanel" callq 0x10001386c ## Objc message: -[[%rdi super] initWithWindowNibName:] These diffs also include putting quotes around C strings in literal pools and uses "symbol address" in the comment when adding a symbol name to the comment to tell these types of references apart: leaq 0x4f(%rip), %rax ## literal pool for: "Hello world" movq 0x1c3ea(%rip), %rax ## literal pool symbol address: ___stack_chk_guard Of course the easy changes are in the LLVM disassembler and the hard work is up to the implementer of the SymbolLookUp() call back. rdar://10602439 llvm-svn: 193833	2013-11-01 00:00:07 +00:00
Dan Gohman	3e6f7aff3e	Fix unused variable warnings. llvm-svn: 193823	2013-10-31 22:58:11 +00:00
Andrew Trick	c21d86f7ec	Unused variable llvm-svn: 193819	2013-10-31 22:42:20 +00:00
Chad Rosier	74b65cd811	[AArch64] Add support for NEON scalar fixed-point convert to floating-point instructions. llvm-svn: 193816	2013-10-31 22:36:59 +00:00
Andrew Trick	a3a11dedca	Add new calling convention for WebKit Java Script. llvm-svn: 193812	2013-10-31 22:12:01 +00:00
Andrew Trick	153ebe6d2a	Add support for stack map generation in the X86 backend. Originally implemented by Lang Hames. llvm-svn: 193811	2013-10-31 22:11:56 +00:00
Manman Ren	87a2adc7fe	Do not convert "call asm" to "invoke asm" in Inliner. Given that backend does not handle "invoke asm" correctly ("invoke asm" will be handled by SelectionDAGBuilder::visitInlineAsm, which does not have the right setup for LPadToCallSiteMap) and we already made the assumption that inline asm does not throw in InstCombiner::visitCallSite, we are going to make the same assumption in Inliner to make sure we don't convert "call asm" to "invoke asm". If it becomes necessary to add support for "invoke asm" later on, we will need to modify the backend as well as remove the assumptions that inline asm does not throw. Fix rdar://15317907 llvm-svn: 193808	2013-10-31 21:56:03 +00:00
Rafael Espindola	282a47037b	Use LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN instead of the "dso list". There are two ways one could implement hiding of linkonce_odr symbols in LTO: * LLVM tells the linker which symbols can be hidden if not used from native files. * The linker tells LLVM which symbols are not used from other object files, but will be put in the dso symbol table if present. GOLD's API is the second option. It was implemented almost 1:1 in llvm by passing the list down to internalize. LLVM already had partial support for the first option. It is also very similar to how ld64 handles hiding these symbols when not doing LTO. This patch then * removes the APIs for the DSO list. * marks LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN all linkonce_odr unnamed_addr global values and other linkonce_odr whose address is not used. * makes the gold plugin responsible for handling the API mismatch. llvm-svn: 193800	2013-10-31 20:51:58 +00:00
Rui Ueyama	29d2910875	Use StringRef::startswith_lower. No functionality change. llvm-svn: 193796	2013-10-31 19:59:55 +00:00
Nuno Lopes	14d4b0cdb2	[ConstantRange] improve my previous patch per Nick suggestion llvm-svn: 193795	2013-10-31 19:53:53 +00:00
Chad Rosier	20e1f20d69	[AArch64] Add support for NEON scalar shift immediate instructions. llvm-svn: 193790	2013-10-31 19:28:44 +00:00
Roman Divacky	2262cfaf19	SparcV9 doesnt have rem instruction either. llvm-svn: 193789	2013-10-31 19:22:33 +00:00
Alexey Samsonov	d3cba699c1	DWARFDebugArangeSet: remove dead code llvm-svn: 193785	2013-10-31 18:54:20 +00:00
Alexey Samsonov	0428d12df1	DWARFUnit: kill dead code and make a couple of functions private. No functionality change. llvm-svn: 193780	2013-10-31 18:05:02 +00:00
Manman Ren	4dbdc9021d	Debug Info: remove duplication of DIEs when a DIE can be shared across CUs. We add a map in DwarfDebug to map MDNodes that are shareable across CUs to the corresponding DIEs: MDTypeNodeToDieMap. These DIEs can be shared across CUs, that is why we keep the maps in DwarfDebug instead of CompileUnit. We make the assumption that if a DIE is not added to an owner yet, we assume it belongs to the current CU. Since DIEs for the type system are added to their owners immediately after creation, and other DIEs belong to the current CU, the assumption should be true. A testing case is added to show that we only create a single DIE for a type MDNode and we use ref_addr to refer to the type DIE. We also add a testing case to show ref_addr relocations for non-darwin platforms. llvm-svn: 193779	2013-10-31 17:54:35 +00:00
Alexey Samsonov	d5cc93c3ae	DWARFAbbreviationDeclaration: remove dead code, refactor parsing code and make it more robust. No functionality change. llvm-svn: 193770	2013-10-31 17:20:14 +00:00
Andrew Trick	74f4c749cf	Lower stackmap intrinsics directly to their target opcode in the DAG builder. llvm-svn: 193769	2013-10-31 17:18:24 +00:00
Andrew Trick	a2efd99bdf	Enable variable arguments support for intrinsics. llvm-svn: 193766	2013-10-31 17:18:11 +00:00
Andrew Trick	d4d1d9c06e	whitespace llvm-svn: 193765	2013-10-31 17:18:07 +00:00
Rafael Espindola	4b102d0ead	Remove another unused flag. llvm-svn: 193756	2013-10-31 15:58:33 +00:00
Rafael Espindola	74e1d0a0a0	Remove unused flag. llvm-svn: 193752	2013-10-31 15:49:39 +00:00
Rafael Espindola	aca9739a1a	Rules adjustments in order to build on DragonFly BSD. Patch by Robin Hahling. llvm-svn: 193750	2013-10-31 14:35:00 +00:00
Rafael Espindola	dbec9d9b2a	Remove the --shrink-wrap option. It had no tests, was unused and was "experimental at best". llvm-svn: 193749	2013-10-31 14:07:59 +00:00
Cameron McInally	394d557f41	Add AVX512 unmasked integer broadcast intrinsics and support. llvm-svn: 193748	2013-10-31 13:56:31 +00:00
Elena Demikhovsky	496656900e	AVX-512: Implemented CMOV for 512-bit vectors llvm-svn: 193747	2013-10-31 13:15:32 +00:00
Richard Sandiford	f834ea19db	[SystemZ] Automatically detect zEC12 and z196 hosts As on other hosts, the CPU identification instruction is priveleged, so we need to look through /proc/cpuinfo. I copied the PowerPC way of handling "generic". Several tests were implicitly assuming z10 and so failed on z196. llvm-svn: 193742	2013-10-31 12:14:17 +00:00
Amara Emerson	f80f95fcc7	[AArch64] Make the use of FP instructions optional, but enabled by default. This adds a new subtarget feature called FPARMv8 (implied by NEON), and predicates the support of the FP instructions and registers on this feature. llvm-svn: 193739	2013-10-31 09:32:11 +00:00
Rafael Espindola	26b43cac18	Fix a use after free on invalid input. llvm-svn: 193737	2013-10-31 04:20:23 +00:00
Rafael Espindola	8fb73c8778	Fix most memory leaks in tablegen. Found by the valgrind bot. llvm-svn: 193736	2013-10-31 04:07:41 +00:00
Rafael Espindola	6554e5a94d	Merge CallGraph and BasicCallGraph. llvm-svn: 193734	2013-10-31 03:03:55 +00:00
Jim Grosbach	7236678687	Legalize: Improve legalization of long vector extends. When an extend more than doubles the size of the elements (e.g., a zext from v16i8 to v16i32), the normal legalization method of splitting the vectors will run into problems as by the time the destination vector is legal, the source vector is illegal. The end result is the operation often becoming scalarized, with the typical horrible performance. For example, on x86_64, the simple input of: define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind { %tmp = zext <16 x i8> %a to <16 x i32> store <16 x i32> %tmp, <16 x i32>*%p ret void } Generates: .section __TEXT,__text,regular,pure_instructions .section __TEXT,__const .align 5 LCPI0_0: .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .section __TEXT,__text,regular,pure_instructions .globl _bar .align 4, 0x90 _bar: vpunpckhbw %xmm0, %xmm0, %xmm1 vpunpckhwd %xmm0, %xmm1, %xmm2 vpmovzxwd %xmm1, %xmm1 vinsertf128 $1, %xmm2, %ymm1, %ymm1 vmovaps LCPI0_0(%rip), %ymm2 vandps %ymm2, %ymm1, %ymm1 vpmovzxbw %xmm0, %xmm3 vpunpckhwd %xmm0, %xmm3, %xmm3 vpmovzxbd %xmm0, %xmm0 vinsertf128 $1, %xmm3, %ymm0, %ymm0 vandps %ymm2, %ymm0, %ymm0 vmovaps %ymm0, (%rdi) vmovaps %ymm1, 32(%rdi) vzeroupper ret So instead we can check if there are legal types that enable us to split more cleverly when the input vector is already legal such that we don't turn it into an illegal type. If the extend is such that it's more than doubling the size of the input we check if - the number of vector elements is even, - the source type is legal, - the type of a split source is illegal, - the type of an extended (by doubling element size) source is legal, and - the type of that extended source when split is legal. If the conditions are met, instead of just splitting both the destination and the source types, we create an extend that only goes up one "step" (doubling the element width), and the continue legalizing the rest of the operation normally. The result is that this operates as a new, more effecient, termination condition for the loop of "split the operation until the destination type is legal." With this change, the above example now compiles to: _bar: vpxor %xmm1, %xmm1, %xmm1 vpunpcklbw %xmm1, %xmm0, %xmm2 vpunpckhwd %xmm1, %xmm2, %xmm3 vpunpcklwd %xmm1, %xmm2, %xmm2 vinsertf128 $1, %xmm3, %ymm2, %ymm2 vpunpckhbw %xmm1, %xmm0, %xmm0 vpunpckhwd %xmm1, %xmm0, %xmm3 vpunpcklwd %xmm1, %xmm0, %xmm0 vinsertf128 $1, %xmm3, %ymm0, %ymm0 vmovaps %ymm0, 32(%rdi) vmovaps %ymm2, (%rdi) vzeroupper ret This generalizes a custom lowering that was added a while back to the ARM backend. That lowering is no longer necessary, and is removed. The testcases for it, however, provide excellent ARM tests for this change and so remain. rdar://14735100 llvm-svn: 193727	2013-10-31 00:20:48 +00:00
Matt Arsenault	909d0c063f	Fix a few typos llvm-svn: 193723	2013-10-30 23:43:29 +00:00
Matt Arsenault	2ba54c3d90	Fix CodeGen for unaligned loads with address spaces llvm-svn: 193721	2013-10-30 23:30:05 +00:00
Matt Arsenault	38b8ecf378	Teach scalarrepl about address spaces llvm-svn: 193720	2013-10-30 22:54:58 +00:00
Rafael Espindola	55fdcff446	Add calls to doInitialization() and doFinalization() in verifyFunction() The function verifyFunction() in lib/IR/Verifier.cpp misses some calls. It creates a temporary FunctionPassManager that will run a single Verifier pass. Unfortunately, FunctionPassManager is no PassManager and does not call doInitialization() and doFinalization() by itself. Verifier does important tasks in doInitialization() such as collecting type information used to check DebugInfo metadata and doFinalization() does some additional checks. Therefore these checks were missed and debug info couldn't be verified at all, it just crashed if the function had some. verifyFunction() is currently not used in llvm unless -debug option is enabled, and in unittests/IR/VerifierTest.cpp VerifierTest had to be changed to create the function in a module from which the type debug info can be collected. Patch by Michael Kruse. llvm-svn: 193719	2013-10-30 22:37:51 +00:00
Rafael Espindola	6f1b2852fc	Produce .weak_def_can_be_hidden for some linkonce_odr values With this patch llvm produces a weak_def_can_be_hidden for linkonce_odr if they are also unnamed_addr or don't have their address taken. There is not a lot of documentation about .weak_def_can_be_hidden, but from the old discussion about linkonce_odr_auto_hide and the name of the directive this looks correct: these symbols can be hidden. Testing this with the ld64 in Xcode 5 linking clang reduces the number of exported symbols from 21053 to 19049. llvm-svn: 193718	2013-10-30 22:08:11 +00:00
David Blaikie	6b288cfa7a	DebugInfo: Push header handling down into CompileUnit This is a preliminary step to handling type units by abstracting over all (type or compile) units. llvm-svn: 193714	2013-10-30 20:42:41 +00:00
Matt Arsenault	614ea99da7	Fix GVN creating bitcast between address spaces llvm-svn: 193710	2013-10-30 19:05:41 +00:00
Tom Roeder	04d88fba3e	This commit adds some (but not all) of the x86-64 relocations that are not currently supported in the ELF object writer, along with a simple test case. llvm-svn: 193709	2013-10-30 18:47:25 +00:00
Rui Ueyama	00e24e48b6	Add {start,end}with_lower methods to StringRef. startswith_lower is ocassionally useful and I think worth adding. endwith_lower is added for completeness. Differential Revision: http://llvm-reviews.chandlerc.com/D2041 llvm-svn: 193706	2013-10-30 18:32:26 +00:00
Artyom Skrobov	c1be9c16bc	[ARM] NEON instructions were erroneously decoded from certain invalid encodings llvm-svn: 193705	2013-10-30 18:10:09 +00:00
Tom Stellard	c947d8ca64	R600: Custom lower f32 = uint_to_fp i64 llvm-svn: 193701	2013-10-30 17:22:05 +00:00
David Blaikie	2d4e11228b	DwarfDebug: Change Abbreviations member from pointer to reference llvm-svn: 193699	2013-10-30 17:14:24 +00:00
Hans Wennborg	3e9b1c1010	Add #include of raw_ostream.h to MipsSEISelLowering.cpp Fixing this Windows build error: ..\lib\Target\Mips\MipsSEISelLowering.cpp(997) : error C2027: use of undefined type 'llvm::raw_ostream' llvm-svn: 193696	2013-10-30 16:10:10 +00:00
Daniel Sanders	d5f554f0bb	[mips][msa] Correct definition of bins[lr] and CHECK-DAG-ize related tests llvm-svn: 193695	2013-10-30 15:45:42 +00:00
Nuno Lopes	1112eca0af	make ConstantRange::signExtend() optimal the case [x, INT_MIN) was not handled optimally llvm-svn: 193694	2013-10-30 15:36:50 +00:00
Daniel Sanders	ab94b537d7	[mips][msa] Added support for matching bmnz, bmnzi, bmz, and bmzi from normal IR (i.e. not intrinsics) Also corrected the definition of the intrinsics for these instructions (the result register is also the first operand), and added intrinsics for bsel and bseli to clang (they already existed in the backend). These four operations are mostly equivalent to bsel, and bseli (the difference is which operand is tied to the result). As a result some of the tests changed as described below. bitwise.ll: - bsel.v test adapted so that the mask is unknown at compile-time. This stops it emitting bmnzi.b instead of the intended bsel.v. - The bseli.b test now tests the right thing. Namely the case when one of the values is an uimm8, rather than when the condition is a uimm8 (which is covered by bmnzi.b) compare.ll: - bsel.v tests now (correctly) emits bmnz.v instead of bsel.v because this is the same operation (see MSA.txt). i8.ll - CHECK-DAG-ized test. - bmzi.b test now (correctly) emits equivalent bmnzi.b with swapped operands because this is the same operation (see MSA.txt). - bseli.b still emits bseli.b though because the immediate makes it distinguishable from bmnzi.b. vec.ll: - CHECK-DAG-ized test. - bmz.v tests now (correctly) emits bmnz.v with swapped operands (see MSA.txt). - bsel.v tests now (correctly) emits bmnz.v with swapped operands (see MSA.txt). llvm-svn: 193693	2013-10-30 15:20:38 +00:00
Chad Rosier	be020d0309	[AArch64] Add support for NEON scalar floating-point compare instructions. llvm-svn: 193691	2013-10-30 15:19:37 +00:00
Daniel Sanders	d74b130cc9	[mips][msa] Added support for matching bins[lr]i.[bhwd] from normal IR (i.e. not intrinsics) This required correcting the definition of the bins[lr]i intrinsics because the result is also the first operand. It also required removing the (arbitrary) check for 32-bit immediates in MipsSEDAGToDAGISel::selectVSplat(). Currently using binsli.d with 2 bits set in the mask doesn't select binsli.d because the constant is legalized into a ConstantPool. Similar things can happen with binsri.d with more than 10 bits set in the mask. The resulting code when this happens is correct but not optimal. llvm-svn: 193687	2013-10-30 14:45:14 +00:00
Daniel Sanders	53fe6c4d56	[mips][msa] Combine binsri-like DAG of AND and OR into equivalent VSELECT (or (and $a, $mask), (and $b, $inverse_mask)) => (vselect $mask, $a, $b). where $mask is a constant splat. This allows bitwise operations to make use of bsel. It's also a stepping stone towards matching bins[lr], and bins[lr]i from normal IR. Two sets of similar tests have been added in this commit. The bsel_* functions test the case where binsri cannot be used. The binsr_*_i functions will start to use the binsri instruction in the next commit. llvm-svn: 193682	2013-10-30 13:51:01 +00:00
Daniel Sanders	62aeab83e7	[mips] MipsSETargetLowering now reports DAGCombiner changes when using -debug-only=mips-isel No test since -debug output is intended for developers and not end-users. llvm-svn: 193681	2013-10-30 13:31:27 +00:00
Daniel Sanders	e7ef0c817b	[mips][msa] Added support for matching splat.[bhw] from normal IR (i.e. not intrinsics) splat.d is implemented but this subtest is currently disabled. This is because it is difficult to match the appropriate IR on MIPS32. There is a patch under review that should help with this so I hope to enable the subtest soon. llvm-svn: 193680	2013-10-30 13:07:44 +00:00
Juergen Ributzka	3bd686d493	Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too." Now Hexagon and SystemZ are not happy with it :-( llvm-svn: 193677	2013-10-30 06:36:19 +00:00
Juergen Ributzka	6ad05d6b95	SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too. The Type Legalizer recognizes that VSELECT needs to be split, because the type is to wide for the given target. The same does not always apply to SETCC, because less space is required to encode the result of a comparison. As a result VSELECT is split and SETCC is unrolled into scalar comparisons. This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG Combiner. If a matching pattern is found, then the result mask of SETCC is promoted to the expected vector mask type for the given target. This mask has usually the same size as the VSELECT return type (except for Intel KNL). Now the type legalizer will split both VSELECT and SETCC. This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>. Reviewed by Nadav llvm-svn: 193676	2013-10-30 05:48:18 +00:00
Bill Wendling	d3b4344af9	Reformat Makefile. No other changes. llvm-svn: 193675	2013-10-30 04:03:03 +00:00
Akira Hatanaka	3048b0248a	[mips] Compute stack alignment on the fly. llvm-svn: 193673	2013-10-30 02:29:43 +00:00
Josh Magee	7245f1d85d	Reformat code with clang-format. Differential Revision: http://llvm-reviews.chandlerc.com/D2057 llvm-svn: 193672	2013-10-30 02:25:14 +00:00
Manman Ren	251a1bd215	Debug Info: code clean up. Use EmitLabelOffsetDifference for handling on darwin platform when non-darwin platforms use EmitLabelPlusOffset. Also fix a bug in EmitLabelOffsetDifference where the size is hard-coded to 4 even though Size is passed in as an argument. llvm-svn: 193660	2013-10-29 23:14:15 +00:00
Manman Ren	ce20d460e2	Debug Info: support for DW_FORM_ref_addr. To support ref_addr, we calculate the section offset of a DIE (i.e. offset of a DIE from beginning of the debug info section). The Offset field in DIE is currently CU-relative. To calculate the section offset, we add a DebugInfoOffset field in CompileUnit to store the offset of a CU from beginning of the debug info section. We set the value in DwarfUnits::computeSizeAndOffset for each CompileUnit. A helper function DIE::getCompileUnit is added to return the CU DIE that the input DIE belongs to. We also add a map CUDieMap in DwarfDebug to help finding the CU for a given CU DIE. For a cross-referenced DIE, we first find the CU DIE it belongs to with getCompileUnit, then we use CUDieMap to get the corresponding CU for the CU DIE. Adding the section offset of the CU with the CU-relative offset of a DIE gives us the seciton offset of the DIE. We correctly emit ref_addr with relocation using EmitLabelPlusOffset when doesDwarfUseRelocationsAcrossSections is true. This commit handles the emission of DW_FORM_ref_addr when we have an attribute with FORM_ref_addr. A follow-on patch will start using ref_addr when adding a DIEEntry. This commit will be tested and verified in the follow-on patch. Reviewed off-list by Eric, Thanks. llvm-svn: 193658	2013-10-29 22:57:10 +00:00
Manman Ren	f4c339e04a	Debug Info: instead of calling addToContextOwner which constructs the context after the DIE creation, we construct the context first. Ensure that we create the context before we create a type so that we can add the newly created type to the parent. Remove last use of addToContextOwner now that it's not needed. We use createAndAddDIE to wrap around "new DIE(". Now all shareable DIEs should be added to their parents right after the creation. Reviewed off-list by Eric, Thanks. llvm-svn: 193657	2013-10-29 22:49:29 +00:00
Manman Ren	b504f49448	Struct byval cleanup: add helper functions to reduce code duplication. Helper functions are added: emitPostLd: emit a post-increment load operation with given size. emitPostSt: emit a post-increment store operation with given size. No functionality change. llvm-svn: 193656	2013-10-29 22:27:32 +00:00
Josh Magee	3f1c0e35e6	[stackprotector] Update the StackProtector pass to perform datalayout analysis. This modifies the pass to classify every SSP-triggering AllocaInst according to an SSPLayoutKind (LargeArray, SmallArray, AddrOf). This analysis is collected by the pass and made available for use, but no other pass uses it yet. The next patch will make use of this analysis in PEI and StackSlot passes. The end goal is to support ssp-strong stack layout rules. WIP. Differential Revision: http://llvm-reviews.chandlerc.com/D1789 llvm-svn: 193653	2013-10-29 21:16:16 +00:00
Aaron Ballman	9ab670fb54	Removing a switch statement that contains only a default label. This resolves an MSVC warning. No functional change intended. llvm-svn: 193649	2013-10-29 20:40:52 +00:00
Akira Hatanaka	6b2d841975	[mips] Align the stack to 16-bytes for mfp64. llvm-svn: 193641	2013-10-29 19:29:03 +00:00
Rafael Espindola	e133ed88b5	Move getSymbol to TargetLoweringObjectFile. This allows constructing a Mangler with just a TargetMachine. llvm-svn: 193630	2013-10-29 17:28:26 +00:00
Rafael Espindola	79858aa3df	Add a helper getSymbol to AsmPrinter. llvm-svn: 193627	2013-10-29 17:07:16 +00:00
Weiming Zhao	ffade617bd	[AArch64] Implement FrameAddr and ReturnAddr Fixes PR17690 llvm-svn: 193625	2013-10-29 17:00:25 +00:00
Amara Emerson	f9a67fce26	[ARM] Make sure HasCRC is initialized to false in Subtarget. llvm-svn: 193624	2013-10-29 16:54:52 +00:00
Zoran Jovanovic	507e084a18	Support for microMIPS jump instructions llvm-svn: 193623	2013-10-29 16:38:59 +00:00
Tom Stellard	6e1ee476ab	R600/SI: Add compute support for CI v2 v2: - Fix LDS size calculation Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 193621	2013-10-29 16:37:28 +00:00
Tom Stellard	e118b8becd	R600: Expand vector FSQRT ops llvm-svn: 193620	2013-10-29 16:37:20 +00:00
Alexey Samsonov	cbd806aef8	DWARF parser: propery handle DW_FORM_ref_sig8 and fix Windows build. Based on D2050 by Timur Iskhodzhanov. llvm-svn: 193619	2013-10-29 16:32:19 +00:00
Rafael Espindola	7d78b2ae3a	The asm printer has a mangler. Use it. llvm-svn: 193618	2013-10-29 16:24:21 +00:00
Rafael Espindola	69c1d631f2	The AsmPrinter has a Mangler. Use it. llvm-svn: 193617	2013-10-29 16:18:15 +00:00
Rafael Espindola	38c2e65e78	The asm printer has a mangler. Don't keep a second pointer to it. llvm-svn: 193616	2013-10-29 16:11:22 +00:00
Timur Iskhodzhanov	cb4e7550eb	Quick-fix DebugInfo build on Windows MSVC can't comprehend template<typename T, size_t N> ArrayRef<T> makeArrayRef(const T (&Arr)[N]) { return ArrayRef<T>(Arr); } if Arr is static const uint8_t sizes[]; declared in a templated and defined a few lines later. I'll send a proper fix (i.e. get rid of unnecessary templates) for review soon. llvm-svn: 193604	2013-10-29 12:13:22 +00:00
Bernard Ogden	ee87e85505	ARM: Add subtarget feature for CRC Adds a subtarget feature for the CRC instructions (optional in v8-A) to the ARM (32-bit) backend. Differential Revision: http://llvm-reviews.chandlerc.com/D2036 llvm-svn: 193599	2013-10-29 09:47:35 +00:00
Anders Waldenborg	a36a7825fb	Fix misapplied patch in r193597 Sorry Peter Zotov, entirely my fault. llvm-svn: 193598	2013-10-29 09:37:28 +00:00
Anders Waldenborg	213a63fe53	llvm-c: Make LLVM{Get,Set}Alignment work on {Load,Store}Inst too Patch by Peter Zotov Differential Revision: http://llvm-reviews.chandlerc.com/D1910 llvm-svn: 193597	2013-10-29 09:02:02 +00:00
Tim Northover	d29ddf6713	AArch64: add 'a' inline asm operand modifier This is used in the Linux kernel, and effectively just means "print an address". llvm-svn: 193593	2013-10-29 08:22:33 +00:00
Manman Ren	f6b936bc06	Debug Info: instead of calling addToContextOwner which constructs the context after the DIE creation, we construct the context first. This touches creation of namespaces and global variables. The purpose is to handle all DIE creations similarly: constructs the context first, then creates the DIE and immediately adds the DIE to its parent. We use createAndAddDIE to wrap around "new DIE(". llvm-svn: 193589	2013-10-29 05:49:41 +00:00
Alp Toker	6a03374526	Fix "existant" typos llvm-svn: 193579	2013-10-29 02:35:28 +00:00
Richard Smith	58d575926c	Clean up. llvm-svn: 193576	2013-10-29 01:44:23 +00:00
NAKAMURA Takumi	83a05039eb	DWARFFormValue.cpp: Appease gcc to give explicit constructors. error: conversion from `const uint8_t*' to non-scalar type `llvm::ArrayRef<unsigned char>' requested llvm-svn: 193575	2013-10-29 01:43:05 +00:00
Arnold Schwaighofer	89ae217422	ARM cost model: Unaligned vectorized double stores are expensive Updated a test case that assumed that <2 x double> would vectorize to use <4 x float>. radar://15338229 llvm-svn: 193574	2013-10-29 01:33:57 +00:00
Arnold Schwaighofer	77af0f6e82	ARM cost model: Account for zero cost scalar SROA instructions By vectorizing a series of srl, or, ... instructions we have obfuscated the intention so much that the backend does not know how to fold this code away. radar://15336950 llvm-svn: 193573	2013-10-29 01:33:53 +00:00
Arnold Schwaighofer	86252451c4	SLPVectorizer: Use vector type for vectorized memory operations No test case, because with the current cost model we don't see a difference. An upcoming ARM memory cost model change will expose and test this bug. radar://15332579 llvm-svn: 193572	2013-10-29 01:33:50 +00:00
Joerg Sonnenberger	fc18473400	Move the STT_FILE symbols out of the normal symbol table processing for ELF. They can overlap with the other symbols, e.g. if a source file "foo.c" contains a function "foo" with a static variable "c". llvm-svn: 193569	2013-10-29 01:06:17 +00:00
Manman Ren	4a841a86bd	Debug Info: use createAndAddDIE to wrap around "new DIE" in DwarfDebug. This commit ensures DIEs are constructed within a compile unit and immediately added to their parents. Reviewed off-list by Eric. llvm-svn: 193568	2013-10-29 01:03:01 +00:00
Manman Ren	73d697c641	Debug Info: use createAndAddDIE for newly-created Subprogram DIEs. More patches will be submitted to convert "new DIE(" to use createAddAndDIE in DwarfCompileUnit.cpp. This will simplify implementation of addDIEEntry where we have to decide between ref4 and ref_addr, because DIEs that can be shared across CU will be added to a CU already. Reviewed off-list by Eric. llvm-svn: 193567	2013-10-29 00:58:04 +00:00
Manman Ren	b987e517f2	Debug Info: add a helper function createAndAddDIE. It wraps around "new DIE(" and handles the bookkeeping part of the newly-created DIE. It adds the DIE to its parent, and calls insertDIE if necessary. It makes sure that bookkeeping is done at the earliest time and we should not see parentless DIEs if all constructions of DIEs go through this helper function. Later on, we can use an allocator for DIE allocation, and will only need to change createAndAddDIE instead of modifying all the "new DIE(". Reviewed off-list by Eric. llvm-svn: 193566	2013-10-29 00:53:03 +00:00
Alexey Samsonov	330b8939bb	Merge DWARFDIE::extractFast and DWARFDIE::extract into one function. Complicated CU-DIE-specific logic in the latter was never used, and it makes sense to have safety checks for broken dwarf in the former. llvm-svn: 193563	2013-10-28 23:58:58 +00:00
Alexey Samsonov	a56bbf0c8c	DWARF parser: Use ArrayRef to represent form sizes and simplify DWARFDIE::extractFast() interface. No functionality change. llvm-svn: 193560	2013-10-28 23:41:49 +00:00
Alexey Samsonov	7614212fd1	DWARF parser: since DWARF4, DW_AT_high_pc may be a constant representing function size llvm-svn: 193555	2013-10-28 23:15:15 +00:00
Alexey Samsonov	48cbda5850	DebugInfo: Introduce the notion of "form classes" Summary: Use DWARF4 table of form classes to fetch attributes from DIE in a more consistent way. This shouldn't change the functionality and serves as a refactoring for upcoming change: DW_AT_high_pc has different semantics depending on its form class. Reviewers: dblaikie, echristo Reviewed By: echristo CC: echristo, llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D1961 llvm-svn: 193553	2013-10-28 23:01:48 +00:00
Akira Hatanaka	7d82252d4b	[mips] Simplify LowerFormalArguments using getRegClassFor. No functionality change. llvm-svn: 193540	2013-10-28 21:21:36 +00:00
Lang Hames	b52816615b	Return early from getUnconditionalBranchTargetOpValue if the branch target is an MCExpr, in order to avoid writing an encoded zero value in the immediate field. When getUnconditionalBranchTargetOpValue is called with an MCExpr target, we don't know what the final immediate field value should be. We shouldn't explicitly set the immediate field to an encoded zero value as zero is encoded with a non-zero bit pattern. This leads to bits being set that pollute the final immediate value. The nature of the encoding is such that the polluted bits only affect very large immediate values, explaining why this hasn't caused problems earlier. Fixes <rdar://problem/15155975>. llvm-svn: 193535	2013-10-28 20:51:11 +00:00
Logan Chien	8cbb80d159	[arm] Implement eabi_attribute, cpu, and fpu directives. This commit allows the ARM integrated assembler to parse and assemble the code with .eabi_attribute, .cpu, and .fpu directives. To implement the feature, this commit moves the code from AttrEmitter to ARMTargetStreamers, and several new test cases related to cortex-m4, cortex-r5, and cortex-a15 are added. Besides, this commit also change the Subtarget->isFPOnlySP() to Subtarget->hasD16() to match the usage of .fpu directive. This commit changes the test cases: * Several .eabi_attribute directives in 2010-09-29-mc-asm-header-test.ll are removed because the .fpu directive already cover the functionality. * In the Cortex-A15 test case, the value for Tag_Advanced_SIMD_arch has be changed from 1 to 2, which is more precise. llvm-svn: 193524	2013-10-28 17:51:12 +00:00
Nuno Lopes	8a24152048	simplify ConstantRange::getSetSize() llvm-svn: 193523	2013-10-28 16:52:38 +00:00
Richard Sandiford	094e609716	[SystemZ] Set usaAA to true useAA significantly improves the handling of vector code that has TBAA information attached. It also helps other cases, as shown by the testsuite changes here. The only real downside I've seen is that it interferes with MergeConsecutiveStores. The problem is that that optimization works top down, starting at the first store in the chain, and looks for cases where the chain result is only used by a single related store. These related stores don't alias, so useAA will have rewritten all the later stores to use a different chain input (typically the same one as the first store). I think the advantages outweigh the disadvantages though, so for now I've just disabled alias analysis for the unaligned-01.ll test. llvm-svn: 193521	2013-10-28 13:53:37 +00:00
Richard Sandiford	981fdeb477	[DAGCombiner] Respect volatility when checking for aliases Making useAA() default to true for SystemZ showed that the combiner alias analysis wasn't handling volatile accesses. This hit many of the SystemZ tests, but I arbitrarily picked one for the purpose of this patch. llvm-svn: 193518	2013-10-28 12:00:00 +00:00
Richard Sandiford	39c1ce4dc1	Keep TBAA info when rewriting SelectionDAG loads and stores Most SelectionDAG code drops the TBAA info when creating a new form of a load and store (e.g. during legalization, or when converting a plain load to an extending one). This patch tries to catch all cases where the TBAA information can legitimately be carried over. The patch adds alternative forms of getLoad() and getExtLoad() that take a MachineMemOperand instead of individual fields. (The corresponding getTruncStore() already exists.) The idea is to use the MachineMemOperand forms when all fields are carried over (size, pointer info, isVolatile, isNonTemporal, alignment and TBAA info). If some adjustment is being made, e.g. to narrow the load, then we still pass the individual fields but also pass the TBAA info. llvm-svn: 193517	2013-10-28 11:17:59 +00:00
Benjamin Kramer	6094f30da2	SCEV: Make the final add of an inbounds GEP nuw if we know that the index is positive. We can't do this for the general case as saying a GEP with a negative index doesn't have unsigned wrap isn't valid for negative indices. %gep = getelementptr inbounds i32* %p, i64 -1 But an inbounds GEP cannot run past the end of address space. So we check for the very common case of a positive index and make GEPs derived from that NUW. Together with Andy's recent non-unit stride work this lets us analyze loops like void foo3(int a, int b) { for (; a < b; a++) {} } PR12375, PR12376. Differential Revision: http://llvm-reviews.chandlerc.com/D2033 llvm-svn: 193514	2013-10-28 07:30:06 +00:00
NAKAMURA Takumi	8a0464393f	Prune utf8 chars in comments. llvm-svn: 193512	2013-10-28 04:07:38 +00:00
NAKAMURA Takumi	0b865d445e	Prune trailing linefeeds. llvm-svn: 193511	2013-10-28 04:07:31 +00:00
NAKAMURA Takumi	4bb85f90fd	Target/R600: Un-tab-ify. llvm-svn: 193510	2013-10-28 04:07:23 +00:00
Reed Kotler	91ae9829a9	Make first substantial checkin of my port of ARM constant islands code to Mips. Before I just ported the shell of the pass. I've tried to keep everything nearly identical to the ARM version. I think it will be very easy to eventually merge these two and create a new more general pass that other targets can use. I have some improvements I would like to make to allow pools to be shared across functions and some other things. When I'm all done we can think about making a more general pass. More to be ported but the basic mechanism works now almost as good as gcc mips16. llvm-svn: 193509	2013-10-27 21:57:36 +00:00
Benjamin Kramer	7ad4100f8b	NVPTX: Remove unused globals. llvm-svn: 193500	2013-10-27 11:31:46 +00:00
Benjamin Kramer	602bb4ad86	Hexagon: Remove global state. llvm-svn: 193499	2013-10-27 11:16:09 +00:00
Elena Demikhovsky	199c823555	AVX-512: PMIN/PMAX intrinsics and patterns Patch by Cameron McInally <cameron.mcinally@nyu.edu> llvm-svn: 193497	2013-10-27 08:18:37 +00:00
Shuxin Yang	2e1890e18b	Revert r193251 : Use address-taken to disambiguate global variable and indirect memops. llvm-svn: 193489	2013-10-27 03:08:44 +00:00
Wan Xiaofei	be640b28c0	Quick look-up for block in loop. This patch implements quick look-up for block in loop by maintaining a hash set for blocks. It improves the efficiency of loop analysis a lot, the biggest improvement could be 5-6%(458.sjeng). Below are the compilation time for our benchmark in llc before & after the patch. Benchmark llc - trunk llc - patched 401.bzip2 0.339081 100.00% 0.329657 102.86% 403.gcc 19.853966 100.00% 19.605466 101.27% 429.mcf 0.049823 100.00% 0.048451 102.83% 433.milc 0.514898 100.00% 0.510217 100.92% 444.namd 1.109328 100.00% 1.103481 100.53% 445.gobmk 4.988028 100.00% 4.929114 101.20% 456.hmmer 0.843871 100.00% 0.825865 102.18% 458.sjeng 0.754238 100.00% 0.714095 105.62% 464.h264ref 2.9668 100.00% 2.90612 102.09% 471.omnetpp 4.556533 100.00% 4.511886 100.99% bitmnp01 0.038168 100.00% 0.0357 106.91% idctrn01 0.037745 100.00% 0.037332 101.11% libquake2 3.78689 100.00% 3.76209 100.66% libquake_ 2.251525 100.00% 2.234104 100.78% linpack 0.033159 100.00% 0.032788 101.13% matrix01 0.045319 100.00% 0.043497 104.19% nbench 0.333161 100.00% 0.329799 101.02% tblook01 0.017863 100.00% 0.017666 101.12% ttsprk01 0.054337 100.00% 0.053057 102.41% Reviewer : Andrew Trick <atrick@apple.com>, Hal Finkel <hfinkel@anl.gov> Approver : Andrew Trick <atrick@apple.com> Test : Pass make check-all & llvm test-suite llvm-svn: 193460	2013-10-26 03:08:02 +00:00
Andrew Trick	57243da70f	Fix SCEVExpander: don't try to expand quadratic recurrences outside a loop. Partial fix for PR17459: wrong code at -O3 on x86_64-linux-gnu (affecting trunk and 3.3) When SCEV expands a recurrence outside of a loop it attempts to scale by the stride of the recurrence. Chained recurrences don't work that way. We could compute binomial coefficients, but would hve to guarantee that the chained AddRec's are in a perfectly reduced form. llvm-svn: 193438	2013-10-25 21:35:56 +00:00
Andrew Trick	29abce3189	Fix LSR: don't normalize quadratic recurrences. Partial fix for PR17459: wrong code at -O3 on x86_64-linux-gnu (affecting trunk and 3.3) ScalarEvolutionNormalization was attempting to normalize by adding and subtracting strides. Chained recurrences don't work that way. llvm-svn: 193437	2013-10-25 21:35:52 +00:00
Rafael Espindola	7749d7ccc7	Handle calls and invokes in GlobalStatus. This patch teaches GlobalStatus to analyze a call that uses the global value as a callee, not as an argument. With this change internalize call handle the common use of linkonce_odr functions. This reduces the number of linkonce_odr functions in a LTO build of clang (checked with the emit-llvm gold plugin option) from 1730 to 60. llvm-svn: 193436	2013-10-25 21:29:52 +00:00
Hal Finkel	02f562df43	LoopVectorizer: Don't attempt to vectorize extractelement instructions The loop vectorizer does not currently understand how to vectorize extractelement instructions. The existing check, which excluded all vector-valued instructions, did not catch extractelement instructions because it checked only the return value. As a result, vectorization would proceed, producing illegal instructions like this: %58 = extractelement <2 x i32> %15, i32 0 %59 = extractelement i32 %58, i32 0 where the second extractelement is illegal because its first operand is not a vector. llvm-svn: 193434	2013-10-25 20:40:15 +00:00
David Blaikie	8bc7db777d	DIEHash: Summary hashing of member functions llvm-svn: 193432	2013-10-25 20:04:25 +00:00
Rafael Espindola	1d19c8f03a	Change MemoryBuffer::getFile to take a Twine. llvm-svn: 193429	2013-10-25 19:06:52 +00:00
David Blaikie	65cc969f50	DIEHash: Summary hashing of nested types llvm-svn: 193427	2013-10-25 18:38:43 +00:00
Quentin Colombet	8761a8f5c0	[X86][AVX512] Add patterns that match the AVX512 floating point register vbroadcast intrinsics. Patch by Cameron McInally <cameron.mcinally@nyu.edu> llvm-svn: 193422	2013-10-25 18:04:12 +00:00
Quentin Colombet	4bf1c282c2	[X86][AVX512] Add patterns that match the AVX512 floating point vbroadcast intrinsics. Patch by Cameron McInally <cameron.mcinally@nyu.edu> llvm-svn: 193421	2013-10-25 17:47:18 +00:00
Rafael Espindola	64cc1b0043	Call destroy from ~BasicCallGraph. This fix a memory leak found by valgrind. Calling it from the base class destructor would not destroy the BasicCallGraph bits. FIXME: BasicCallGraph is the only thing that inherits from CallGraph. Can we merge the two? llvm-svn: 193412	2013-10-25 15:01:34 +00:00
Tim Northover	1744d0ad83	ARM: allow .thumb_func to be separated from symbol definition When assembling, a .thumb_func directive is supposed to be applicable to the next symbol definition, even if there are intervening directives. We were racing ahead to try and find it, and this commit should fix the issue. Patch by Gabor Ballabas llvm-svn: 193403	2013-10-25 12:49:50 +00:00
Yaron Keren	2eac89868c	The FIXME was indeed fixed in the linker, comment removed. llvm-svn: 193402	2013-10-25 12:01:53 +00:00
Tim Northover	c7ea8048e7	ARM: don't expand atomicrmw inline on Cortex-M0 There's a barrier instruction so that should still be used, but most actual atomic operations are going to need a platform decision on the correct behaviour (either nop if single-threaded or OS-support otherwise). rdar://problem/15287210 llvm-svn: 193399	2013-10-25 09:30:24 +00:00
Tim Northover	a564d329c2	LegalizeDAG: allow libcalls for max/min atomic operations ARM processors without ldrex/strex need to be able to make libcalls for all atomic operations, including the newer min/max versions. The alternative would probably be expanding these operations in terms of cmpxchg (as x86 does always), but in the configurations where this matters code-size tends to be paramount so the libcall is more desirable. llvm-svn: 193398	2013-10-25 09:30:20 +00:00
Nadav Rotem	d369d4bdf9	Optimize concat_vectors(X, undef) -> scalar_to_vector(X). This optimization is not SSE specific so I am moving it to DAGco. The new scalar_to_vector dag node exposed a missing pattern in the AArch64 target that I needed to add. llvm-svn: 193393	2013-10-25 06:41:18 +00:00
Yuchen Wu	03678157b5	llvm-cov dump to dbgs() instead of outs(). llvm-svn: 193390	2013-10-25 02:22:24 +00:00
Yuchen Wu	14ae8e6195	Support for reading program counts in llvm-cov. llvm-cov will now be able to read program counts from the GCDA file and output it in the same format as gcov. The program summary tag was identified from gcov-io.h as "\0\0\0\a3". There is currently a bug in GCOVProfiling.cpp which does not generate the run- or program-counting IR, so this change was tested manually by modifying the GCDA file and comparing the gcov and llvm-cov outputs. llvm-svn: 193389	2013-10-25 02:22:21 +00:00
Jim Grosbach	1d1d6d4675	ARM: Tweak usage of '*vfp' compiler_rt functions. Only use them if the subtarget has ARM mode, as these routines are implemented as ARM code. rdar://15302004 llvm-svn: 193381	2013-10-24 23:07:11 +00:00
David Blaikie	d8c5b4e8ef	MCStreamer: Reimplement the virtual EmitRawText as a protected member, EmitRawTextImpl, to avoid string literal ambiguities Also improve the implementation of EmitRawText(Twine) so it doesn't bother using the SmallString buffer if the Twine is a simple StringRef anyway. llvm-svn: 193378	2013-10-24 22:43:10 +00:00
David Blaikie	68642d3118	DWARF emission: Remove unnecessary/redundant DIE reference code The default case at the end of the switch handles this just fine. llvm-svn: 193374	2013-10-24 22:00:44 +00:00
Eric Christopher	e34116750f	Fix name of variable in comment. llvm-svn: 193373	2013-10-24 21:54:58 +00:00
Eric Christopher	670ee0e941	Grammar. llvm-svn: 193372	2013-10-24 21:20:23 +00:00
Eric Christopher	b088d2d0bc	Update misleading comment. llvm-svn: 193371	2013-10-24 21:05:08 +00:00
David Blaikie	2aee7be871	DIEHash: Const correct and use references where non-null/non-rebound. llvm-svn: 193363	2013-10-24 18:29:03 +00:00
David Blaikie	32744412d2	DIEHash: Do not use shallow type hashing for unnamed types llvm-svn: 193361	2013-10-24 17:53:58 +00:00
David Blaikie	afcb9656c3	DIEHash: Refactor ref attribute hashing into smaller functions llvm-svn: 193360	2013-10-24 17:51:43 +00:00
David Blaikie	e568225fc3	Remove unused debug-only member variable. This may've been used at some point but the 'print' member function grew an Indent parameter that entirely shadows this parameter. llvm-svn: 193358	2013-10-24 17:10:13 +00:00
David Peixotto	b0653e539b	Remove class abstraction from ARM struct byval lowering This commit changes the struct byval lowering for arm to use inline checks for the subtarget instead of a class abstraction to represent the differences. The class abstraction was judged to be too much code for this task. No intended functionality change. llvm-svn: 193357	2013-10-24 16:39:36 +00:00
Tom Stellard	bc7d87f07c	Inliner: Handle readonly attribute per argument when adding memcpy Patch by: Vincent Lejeune llvm-svn: 193356	2013-10-24 16:38:33 +00:00
Tim Northover	5620faf771	ARM: Mark double-precision instructions as such This prevents us from silently accepting invalid instructions on (for example) Cortex-M4 with just single-precision VFP support. No tests for the extra Pat Requires because they're essentially assertions: the affected code should have been lowered to libcalls before ISel. rdar://problem/15302004 llvm-svn: 193354	2013-10-24 15:49:39 +00:00
John Thompson	6cd5bd4a3d	Reverting my r193344 checkin due to build breakage. llvm-svn: 193350	2013-10-24 14:52:56 +00:00
Renato Golin	1ba143e140	Mark vector loops as already vectorized Make sure we mark all loops (scalar and vector) when vectorizing, so that we don't try to vectorize them anymore. Also, set unroll to 1, since this is what we check for on early exit. llvm-svn: 193349	2013-10-24 14:50:51 +00:00
John Thompson	e38e57206f	Added std::string as a built-in type for mapping. llvm-svn: 193344	2013-10-24 13:36:58 +00:00
Tim Northover	225bcbbe71	ARM: add a couple more NEON predicates. The fused multiply instructions were added in VFPv4 but are still NEON instructions, in particular they shouldn't be available on a Cortex-M4 not matter how floaty it is. llvm-svn: 193342	2013-10-24 12:48:05 +00:00
Tim Northover	64dacb2b8a	ARM: mark various aliases with their architecture requirements. If an alias inherits directly from InstAlias then it doesn't get any default "Requires" values, so llvm-mc will allow it even on architectures that don't support the underlying instruction. This tidies up the obvious VFP and NEON cases I found. llvm-svn: 193340	2013-10-24 12:22:58 +00:00
Tim Northover	94ecbd2e6c	ARM: Use non-VFP softcalls on embedded Darwinish targets The compiler-rt functions __adddf3vfp and so on exist purely to allow Thumb1 code to make use of VFP instructions by switching back to ARM mode, they make no sense for M-class processors which don't even have an ARM mode. Given that justification, in practice this is a platform ABI decision so the actual check is based on that rather than CPU features. rdar://problem/15302004 llvm-svn: 193327	2013-10-24 10:37:09 +00:00
Yaron Keren	1ec9df3322	Replaced non-ASCII character. llvm-svn: 193324	2013-10-24 10:04:47 +00:00
Chandler Carruth	d55d159d09	Revert part of r193291, restoring the deletion of loaded objects. Without this, customers of the MCJIT were leaking memory like crazy. It's not really clear what the right memory management is here, so I'm not trying to add lots of tests or other logic, just trying to get us back to a better baseline. I'll follow up on the original commit to figure out the right path forward. llvm-svn: 193323	2013-10-24 09:52:56 +00:00
Tim Northover	741e6ef4d4	ARM: fix assert on unpredictable POP instruction. POP instructions are aliased to the ARM LDM variants but have different syntax. This caused two problems: we tried to access a non-existent operand to annotate the '!', and the error message didn't make much sense. With some vigorous hand-waving in the error message both problems can be fixed. llvm-svn: 193322	2013-10-24 09:37:18 +00:00
Job Noorman	a8d35c98fd	Make sure SP is always aligned on a 2 byte boundary llvm-svn: 193320	2013-10-24 09:32:31 +00:00
Nuno Lopes	340b0463e6	fix PR17635: false positive with packed structures LLVM optimizers may widen accesses to packed structures that overflow the structure itself, but should be in bounds up to the alignment of the object llvm-svn: 193317	2013-10-24 09:17:24 +00:00
Amara Emerson	c5cae0f20c	[AArch64] Fix NZCV reg live-in bug in F128CSEL codegen. When generating the IfTrue basic block during the F128CSEL pseudo-instruction handling, the NZCV live-in for the newly created BB wasn't being added. This caused a fault during MI-sched/live range calculation when the predecessor for the fall-through BB didn't have a live-in for phys-reg as expected. llvm-svn: 193316	2013-10-24 08:28:24 +00:00
Elena Demikhovsky	dd0794e51b	AVX-512: added VCVTPH2PS, VCVTPS2PH with intrinsics llvm-svn: 193312	2013-10-24 07:16:35 +00:00
Juergen Ributzka	d04d096ecf	Fix a bug in LinearFunctionTestReplace that created invalid loop exit checks. Reviewed by Andy llvm-svn: 193303	2013-10-24 05:29:56 +00:00
Yuchen Wu	887c20ffc2	Fixed llvm-cov to count edges instead of blocks. This was a fundamental flaw in llvm-cov where it treated the values in the GCDA files as block counts instead of edge counts. This created incorrect line counts when branching was present. Instead, the edge counts should be summed to obtain the correct block count. The fix was tested using custom test files as well as single source files from the test-suite directory. The behaviour can be verified by reading the GCOV documentation that describes the GCDA spec ("ARC_COUNTS gives the counter values for those arcs that are instrumented") and the header description provided by GCOVProfiling.cpp ("instruments the code that runs to records (sic) the edges between blocks that run and emit a complementary "gcda" file on exit"). llvm-svn: 193299	2013-10-24 01:51:04 +00:00
Andrew Trick	ada2356ac9	Clarify comments in genLoopLimit. llvm-svn: 193292	2013-10-24 00:43:38 +00:00
Andrew Kaylor	c89fc826b2	Optimizing MCJIT module state tracking Patch co-developed with Yaron Keren. llvm-svn: 193291	2013-10-24 00:19:14 +00:00
Yaron Keren	79bb266346	(this is a corrected patch) Calling _chkstk is required on ELF as well as COFF on Windows. Without _chkstk, functions requiring large stack crash in initialization code. Previous code tested for COFF format but not Mach-O and this patch modifies the code to test for Windows OS (both Windows target and MingW target) but not Mach-O object format: Looks like macho environment was used to build some EFI code. Credits to Andrew MacPherson. llvm-svn: 193289	2013-10-23 23:37:01 +00:00
Manman Ren	ffc9a71866	Debug Info: code clean up. Since we never insert DIE for DITemplateTypeParameter to a map, there is no need to call getDIE in getOrCreateTemplateTypeParameterDIE. It is also renamed to constructTemplateTypeParameterDIE to match with other construct functions in CompileUnit. Same applies to getOrCreateTemplateValueParameterDIE. llvm-svn: 193287	2013-10-23 23:05:28 +00:00
Manman Ren	230ec864af	Debug Info: code clean up. Rename createMemberDIE to constructMemberDIE to match other construct functions in CompileUnit. llvm-svn: 193286	2013-10-23 23:00:44 +00:00
Manman Ren	57e6ff7e72	Debug Info: code clean up. Remove the unneeded return values from createMemberDIE, constructEnumTypeDIE, getOrCreateTemplateTypeParameterDIE, and getOrCreateTemplateValueParameterDIE. llvm-svn: 193285	2013-10-23 22:57:12 +00:00
Manman Ren	0cfd20b99e	Debug Info: code clean up. Unifying the argument ordering of private construct functions in CompileUnit to follow constructTypeDIE(DIE &, DIBasicType), constructTypeDIE(DIE &, DIDerivedType), constructTypeDIE(DIE &, DICompositeType), constructSubrangeDIE and constructArrayTypeDIE. llvm-svn: 193284	2013-10-23 22:52:22 +00:00
Manman Ren	b9512a7c57	Remove {} from one-line block. llvm-svn: 193276	2013-10-23 22:12:26 +00:00
Rafael Espindola	bca3ab0905	Revert "Calling _chkstk is required on ELF as well as COFF on Windows. Without _chkstk functions requiring large stack crash in initialization code. Previous code tested for COFF format but not Mach-O and this patch modifies the code to test for Windows." This reverts commit r193263. It is causing CodeGen/X86/mingw-alloca.ll to fail. llvm-svn: 193275	2013-10-23 21:45:09 +00:00
Rafael Espindola	b02877416e	Reduce casting and use a fully covered switch. llvm-svn: 193272	2013-10-23 21:24:34 +00:00
Benjamin Kramer	0ccab2d66c	X86: Custom lower sext v16i8 to v16i16, and the corresponding truncate. Also update the cost model. llvm-svn: 193270	2013-10-23 21:06:07 +00:00
Yuchen Wu	3197b25b27	Fixed comment typo in GCOVProfiling.cpp llvm-svn: 193268	2013-10-23 20:35:00 +00:00
Yuchen Wu	48342ee908	Use a map instead of vector to store line counts. There are a few motivations for this: - Using a map allows for checking if line is in map. This differentiates unexecutable lines (such as comments) from unexecuted logical lines of code. "#####" is now outputted in this case, in line with gcov. - Source files are no longer read in twice: once when storing the line counts, and once when outputting the data. - Greatly simplifies the function FileInfo::addLineCount(). llvm-svn: 193264	2013-10-23 19:45:03 +00:00

... 2 3 4 5 6 ...

65142 Commits