llvm-project

Author	SHA1	Message	Date
Simon Pilgrim	f02a39c371	[X86] Add JCC/JECXZ/JECXZ/JRCXZ/LOOP schedule tests llvm-svn: 320603	2017-12-13 18:09:45 +00:00
Amaury Sechet	a402e51428	Regenerate test-shrink.ll test results. NFC llvm-svn: 320602	2017-12-13 18:04:57 +00:00
Jonas Devlieghere	2fbee4f869	[dsymutil] Re-enable threading Threading was disabled in r317263 because it broke a test in combination with `-DLLVM_ENABLE_THREADS=OFF`. This was because a ThreadPool warning was piped to llvm-dwarfdump which was expecting to read an object from stdin. This patch re-enables threading and fixes the offending test. Unfortunately this required more than just moving the ThreadPool out of the for loop because of the TempFile refactoring that took place in the meantime. Differential revision: https://reviews.llvm.org/D41180 llvm-svn: 320601	2017-12-13 18:03:04 +00:00
Simon Pilgrim	542a711806	[X86] Add RET/RETF schedule tests llvm-svn: 320600	2017-12-13 17:50:40 +00:00
Simon Pilgrim	c1bd968c8c	[X86] Add POP/PUSH schedule tests llvm-svn: 320598	2017-12-13 17:42:25 +00:00
Galina Kistanova	9dee3f0a97	Reverted r320229. It broke tests on builder llvm-clang-x86_64-expensive-checks-win. llvm-svn: 320588	2017-12-13 15:26:27 +00:00
Simon Pilgrim	0bd31a8360	[X86] Add PREFETCH schedule tests llvm-svn: 320587	2017-12-13 15:12:02 +00:00
Simon Pilgrim	1df18ee3fc	[X86] Add XCHG schedule tests llvm-svn: 320586	2017-12-13 15:02:10 +00:00
Simon Pilgrim	9d9f170172	[X86] Add MOVNTI schedule tests llvm-svn: 320585	2017-12-13 14:51:06 +00:00
Nemanja Ivanovic	6f590bf8bb	[PowerPC] MachineSSA pass to reduce the number of CR-logical operations The initial implementation of an MI SSA pass to reduce cr-logical operations. Currently, the only operations handled by the pass are binary operations where both CR-inputs come from the same block and the single use is a conditional branch (also in the same block). Committing this off by default to allow for a period of field testing. Will enable it by default in a follow-up patch soon. Differential Revision: https://reviews.llvm.org/D30431 llvm-svn: 320584	2017-12-13 14:47:35 +00:00
Simon Pilgrim	88e6f83f9e	[X86] Add ENTER/LEAVE schedule tests llvm-svn: 320583	2017-12-13 14:46:33 +00:00
Simon Pilgrim	cef5b64fdb	[X86] Add IMUL schedule tests llvm-svn: 320582	2017-12-13 14:24:04 +00:00
Simon Pilgrim	f00ea1b4cd	[X86] Add RDMSR/WRMSR, RDPMC + RDTSC/RDTSCP schedule tests Add missing RDTSCP itinerary llvm-svn: 320581	2017-12-13 14:22:04 +00:00
Simon Pilgrim	46ec195d19	[X86] Add ARPL/BOUND schedule tests llvm-svn: 320580	2017-12-13 13:54:45 +00:00
Alex Bradbury	845e5dce83	[RISCV] Define sfence.vma InstAliases to match the GNU RISC-V tools Unfortunately these aren't defined explicitly in the privileged spec, but the GNU assembler does accept `sfence.vma` and `sfence.vma rs` as well as the usual `sfence.vma rs, rt`. llvm-svn: 320575	2017-12-13 12:46:55 +00:00
Simon Pilgrim	f51f4d3623	[X86][SSE] MOVMSK only uses the sign bit from each vector element Pass the input vector through SimplifyDemandedBits as we only need the sign bit from each vector element of MOVMSK We'd probably get more hits if SimplifyDemandedBits was better at handling vectors... Differential Revision: https://reviews.llvm.org/D41119 llvm-svn: 320570	2017-12-13 11:43:14 +00:00
Alex Bradbury	fa7e4ec837	[RISCV] Implement floating point assembler pseudo instructions Adds the assembler aliases for the floating point instructions which can be mapped to a single canonical instruction. The missing pseudo instructions (flw, fld, fsw, fsd) are marked as TODO. Other things, like for example PCREL_LO, have to be implemented first. This patch builds upon D40902. Differential Revision: https://reviews.llvm.org/D41071 Patch by Mario Werner. llvm-svn: 320569	2017-12-13 11:37:19 +00:00
Igor Laevsky	e0edb66475	Reintroduce r320049, r320014 and r319894. OpenGL issues should be fixed by now. llvm-svn: 320568	2017-12-13 11:21:18 +00:00
Roger Ferrer Ibanez	e8d4e88bab	[DAG] Promote ADDCARRY / SUBCARRY Add missing case that was not implemented yet. Differential Revision: https://reviews.llvm.org/D38942 llvm-svn: 320567	2017-12-13 10:45:21 +00:00
Francis Visoiu Mistrih	b41dbbe325	[CodeGen] Print jump-table index operands as %jump-table.0 in both MIR and debug output Work towards the unification of MIR and debug output by printing `%jump-table.0` instead of `<jt#0>`. Only debug syntax is affected. llvm-svn: 320566	2017-12-13 10:30:59 +00:00
Francis Visoiu Mistrih	26ae8a6582	[CodeGen] Print constant pool index operands as %const.0 + 8 in both MIR and debug output Work towards the unification of MIR and debug output by printing `%const.0 + 8` instead of `<cp#0+8>` and `%const.0 - 8` instead of `<cp#0-8>`. Only debug syntax is affected. Differential Revision: https://reviews.llvm.org/D41116 llvm-svn: 320564	2017-12-13 10:30:45 +00:00
Stefan Maksimovic	0a075d68ec	[mips] Provide additional DSP bitconvert patterns Previously, v2i16 -> f32 bitcast could not be matched. Add patterns to support matching this and similar types of bitcasts. Differential revision: https://reviews.llvm.org/D40959 llvm-svn: 320562	2017-12-13 10:13:35 +00:00
Alex Bradbury	60714f98ba	[RISCV] MC layer support for the remaining RVC instructions Differential Revision: https://reviews.llvm.org/D40003 Patch by Shiva Chen. llvm-svn: 320558	2017-12-13 09:32:55 +00:00
Gadi Haber	6090c148dc	[X86][BMI]: Adding full coverage of MC encoding for the BMI isa set.<NFC> NFC. Adding MC regressions tests to cover the BMI1 and BMI2 ISA sets both 32 and 64 bit. This patch is part of a larger task to cover MC encoding of all X86 ISA Sets. started in revision: https://reviews.llvm.org/D39952 Reviewers: zvi, craig.topper, m_zuckerman, RKSimon Differential Revision: https://reviews.llvm.org/D41106 Change-Id: I033ce137b5b82d36e1e601cd5e0534637b43a4a9 llvm-svn: 320557	2017-12-13 09:13:53 +00:00
Serguei Katkov	ac4a8fb1cd	Revert "[CGP] Enable select in complex addr mode" Causes: Assertion `ScaledReg == nullptr' failed. This is actually a revert of rL320551. llvm-svn: 320553	2017-12-13 07:39:35 +00:00
Serguei Katkov	b8cb5da28d	[CGP] Enable select in complex addr mode Enable select instruction handling in complex addr modes. Reviewers: john.brawn, reames, aaboud Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40634 llvm-svn: 320551	2017-12-13 06:57:59 +00:00
Mohammad Shahid	dbd30edb7f	[SLP] Vectorize jumbled memory loads. Summary: This patch tries to vectorize loads of consecutive memory accesses, accessed in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 which was reverted back due to some basic issue with representing the 'use mask' of jumbled accesses. This patch fixes the mask representation by recording the 'use mask' in the usertree entry. Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh Reviewed By: Ayal Subscribers: mgrang, dcaballe, hans, mzolotukhin Differential Revision: https://reviews.llvm.org/D36130 llvm-svn: 320548	2017-12-13 03:08:29 +00:00
Florian Hahn	beda7d517d	[CallSiteSplitting] Refactor creating callsites. Summary: This change makes the call site creation more general if any of the arguments is predicated on a condition in the call site's predecessors. If we find a callsite, that potentially can be split, we collect the set of conditions for the call site's predecessors (currently only 2 predecessors are allowed). To do that, we traverse each predecessor's predecessors as long as it only has single predecessors and record the condition, if it is relevant to the call site. For each condition, we also check if the condition is taken or not. In case it is not taken, we record the inverse predicate. We use the recorded conditions to create the new call sites and split the basic block. This has 2 benefits: (1) it is slightly easier to see what is going on (IMO) and (2) we can easily extend it to handle more complex control flow. Reviewers: davidxl, junbuml Reviewed By: junbuml Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40728 llvm-svn: 320547	2017-12-13 03:05:20 +00:00
Evgeniy Stepanov	ecb48e523e	[hwasan] Inline instrumentation & fixed shadow. Summary: This brings CPU overhead on bzip2 down from 5.5x to 2x. Reviewers: kcc, alekseyshl Subscribers: kubamracek, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D41137 llvm-svn: 320538	2017-12-13 01:16:34 +00:00
Michael Trent	1d3d8adad7	reverting out -r320532 because a warning is breaking the lld build llvm-svn: 320534	2017-12-13 00:36:13 +00:00
Michael Trent	0f6bfaf176	Updated llvm-objdump to display local relocations in Mach-O binaries Summary: llvm-objdump's Mach-O parser was updated in r306037 to display external relocations for MH_KEXT_BUNDLE file types. This change extends the Mach-O parser to display local relocations for MH_PRELOAD files. When used with the -macho option relocations will be displayed in a historical format. rdar://35778019 Reviewers: enderby Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41061 llvm-svn: 320532	2017-12-12 23:53:46 +00:00
Sanjay Patel	3cf695aa38	[EarlyCSE] add tests for commuted min/max; NFC See PR35642: https://bugs.llvm.org/show_bug.cgi?id=35642 llvm-svn: 320530	2017-12-12 22:23:09 +00:00
Krzysztof Parzyszek	2eda05db87	[Hexagon] Relax some checks in testcases, NFC llvm-svn: 320529	2017-12-12 21:44:04 +00:00
Alexey Bataev	83c15b1363	[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast. Summary: If we have pattern `store (load(bitcast(select (cmp(V1, V2), &V1, &V2)))), bitcast)`, but the load is used in other instructions, it leads to looping in InstCombiner. Patch adds additional check that all users of the load instructions are stores and then replaces all uses of load instruction by the new one with new type. Reviewers: RKSimon, spatel, majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41072 llvm-svn: 320525	2017-12-12 20:28:46 +00:00
Krzysztof Parzyszek	edcd9dcbc4	[Hexagon] Better detection of identity and undef masks in shuffles llvm-svn: 320523	2017-12-12 20:23:12 +00:00
Krzysztof Parzyszek	40a605f1be	[Hexagon] Fix wrong order of operands for vmux Shuffle generation uses vmux to collapse vectors resulting from two individual shuffles into one. The indexes of the elements selected from the first operand were indicated by 0xFF in the constant vector used in the compare instruction, but the compare (veqb) set the bits corresponding to the 0x00 elements, thus inverting the selection. Reverse the order of operands to vmux to get the correct output. llvm-svn: 320516	2017-12-12 19:32:41 +00:00
Fiona Glaser	b8a330c42a	Reassociate: add global reassociation algorithm This algorithm (explained more in the source code) takes into account global redundancies by building a "pair map" to find common subexprs. The primary motivation of this is to handle situations like foo = (a * b) * c bar = (a * d) * c where we currently don't identify that "a * c" is redundant. Accordingly, it prioritizes the emission of a * c so that CSE can remove the redundant calculation later. Does not change the actual reassociation algorithm -- only the order in which the reassociated operand chain is reconstructed. Gives ~1.5% floating point math instruction count reduction on a large offline suite of graphics shaders. llvm-svn: 320515	2017-12-12 19:18:02 +00:00
Alexey Bataev	fa0a76dbcc	Revert "[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast." This reverts commit r320510 - again sanitizers bbots. llvm-svn: 320513	2017-12-12 19:12:34 +00:00
Sanjoy Das	1074eb225b	Reapply "[X86] Flag BroadWell scheduler model as complete" This reverts commit r320508, in effect re-applying r320308. Simon has already reverted the parts that caused the crash that motivated the revert in r320492. llvm-svn: 320512	2017-12-12 19:11:31 +00:00
Hiroshi Yamauchi	f3bda1daa2	Split IndirectBr critical edges before PGO gen/use passes. Summary: The PGO gen/use passes currently fail with an assert failure if there's a critical edge whose source is an IndirectBr instruction and that edge needs to be instrumented. To avoid this in certain cases, split IndirectBr critical edges in the PGO gen/use passes. This works for blocks with single indirectbr predecessors, but not for those with multiple indirectbr predecessors (splitting an IndirectBr critical edge isn't always possible.) Reviewers: davidxl, xur Reviewed By: davidxl Subscribers: efriedma, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D40699 llvm-svn: 320511	2017-12-12 19:07:43 +00:00
Alexey Bataev	195c97e220	[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast. Summary: If we have pattern `store (load(bitcast(select (cmp(V1, V2), &V1, &V2)))), bitcast)`, but the load is used in other instructions, it leads to looping in InstCombiner. Patch adds additional check that all users of the load instructions are stores and then replaces all uses of load instruction by the new one with new type. Reviewers: RKSimon, spatel, majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41072 llvm-svn: 320510	2017-12-12 18:47:00 +00:00
Sanjoy Das	81a4a02cbc	Revert "[X86] Flag BroadWell scheduler model as complete" This reverts commit r320308. r320308 crashes LLC, please see the llvm-commits thread for a reproducer. llvm-svn: 320508	2017-12-12 18:40:58 +00:00
Nirav Dave	674d053d18	[X86] Cleanup type conversion of 64-bit load-store pairs. Summary: Simplify and generalize chain handling and search for 64-bit load-store pairs. Nontemporal test now converts 64-bit integer load-store into f64 which it realizes directly instead of splitting into two i32 pairs. Reviewers: craig.topper, spatel Reviewed By: craig.topper Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40918 llvm-svn: 320505	2017-12-12 18:25:48 +00:00
Geoff Berry	60c431022e	[MachineOperand][MIR] Add isRenamable to MachineOperand. Summary: Add isRenamable() predicate to MachineOperand. This predicate can be used by machine passes after register allocation to determine whether it is safe to rename a given register operand. Register operands that aren't marked as renamable may be required to be assigned their current register to satisfy constraints that are not captured by the machine IR (e.g. ABI or ISA constraints). Reviewers: qcolombet, MatzeB, hfinkel Subscribers: nemanjai, mcrosier, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D39400 llvm-svn: 320503	2017-12-12 17:53:59 +00:00
Alexey Bataev	6132a50d2a	Revert "[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast." This reverts commit r320499 again to resolve the problem with the sanitizers bbots. llvm-svn: 320501	2017-12-12 17:35:29 +00:00
Alexey Bataev	ca4c9a5246	[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast. Summary: If we have pattern `store (load(bitcast(select (cmp(V1, V2), &V1, &V2)))), bitcast)`, but the load is used in other instructions, it leads to looping in InstCombiner. Patch adds additional check that all users of the load instructions are stores and then replaces all uses of load instruction by the new one with new type. Reviewers: RKSimon, spatel, majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41072 llvm-svn: 320499	2017-12-12 17:19:15 +00:00
Alexey Bataev	d19dbe6791	Revert "[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast." This reverts commit r320496 to solve the problems with sanitizer buildbots. llvm-svn: 320498	2017-12-12 17:08:48 +00:00
Alexey Bataev	d0c3aeb200	[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast. Summary: If we have pattern `store (load(bitcast(select (cmp(V1, V2), &V1, &V2)))), bitcast)`, but the load is used in other instructions, it leads to looping in InstCombiner. Patch adds additional check that all users of the load instructions are stores and then replaces all uses of load instruction by the new one with new type. Reviewers: RKSimon, spatel, majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41072 llvm-svn: 320496	2017-12-12 16:58:48 +00:00
Simon Pilgrim	68f9accf51	[X86] Remove CompleteModel tags from CPU targets until we have better error checking (PR35636) The checks we have for complete models are not great and miss many cases - e.g. in PR35636 it failed to recognise that only the first output (of 2) was actually tagged by the InstRW Raised PR35639 and PR35643 as examples llvm-svn: 320492	2017-12-12 16:12:53 +00:00
Alexey Bataev	c9f1d2e4a0	Revert "[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast." This reverts commit r320488 because of the failed asan buildbots.. llvm-svn: 320490	2017-12-12 16:05:52 +00:00
Alexey Bataev	fb68c48a82	[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast. Summary: If we have pattern `store (load(bitcast(select (cmp(V1, V2), &V1, &V2)))), bitcast)`, but the load is used in other instructions, it leads to looping in InstCombiner. Patch adds additional check that all users of the load instructions are stores and then replaces all uses of load instruction by the new one with new type. Reviewers: RKSimon, spatel, majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41072 llvm-svn: 320488	2017-12-12 15:54:49 +00:00
Alex Bradbury	9ed84c8ae8	[RISCV] Implement assembler pseudo instructions for RV32I and RV64I Adds the assembler pseudo instructions of RV32I and RV64I which can be mapped to a single canonical instruction. The missing pseudo instructions (e.g., call, tail, ...) are marked as TODO. Other things, like for example PCREL_LO, have to be implemented first. Currently, alias emission is disabled by default to keep the patch minimal. Alias emission by default will be enabled in a subsequent patch which also updates all affected tests. Note that this patch should actually break the floating point MC tests. However, the used FileCheck configuration is not tight enough to detect the breakage. Differential Revision: https://reviews.llvm.org/D40902 Patch by Mario Werner. llvm-svn: 320487	2017-12-12 15:46:15 +00:00
Alexey Bataev	ca2a8cea2f	Revert "[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast." This reverts commit r320483 because of the failed Windows buildbots. llvm-svn: 320485	2017-12-12 15:24:17 +00:00
Alex Bradbury	8bba6bfeef	[RISCV] MC layer support for the instructions added in the privileged spec Adds support for the instructions added in the RISC-V privileged ISA (https://content.riscv.org/wp-content/uploads/2017/05/riscv-privileged-v1.10.pdf): uret, sret, mret, wfi, and sfence.vma. Note from the committer: I made very minor formatting changes prior to commit, which didn't seem worth creating another review round-trip for. Differential Revision: https://reviews.llvm.org/D40383 Patch by David Craven. llvm-svn: 320484	2017-12-12 15:17:45 +00:00
Alexey Bataev	1daef8a667	[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast. If we have pattern `store (load(bitcast(select (cmp(V1, V2), &V1, &V2)))), bitcast)`, but the load is used in other instructions, it leads to looping in InstCombiner. Patch adds additional check that all users of the load instructions are stores and then replaces all uses of load instruction by the new one with new type. Reviewers: RKSimon, spatel, majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41072 llvm-svn: 320483	2017-12-12 15:03:17 +00:00
Ayman Musa	c2eed926b0	[X86] Recognize constant arrays with special values and replace loads from it with subtract and shift instructions, which then will be replaced by X86 BZHI machine instruction. Recognize constant arrays with the following values: 0x0, 0x1, 0x3, 0x7, 0xF, 0x1F, ..., 2^(size - 1) - 1 where //size// is the size of the array. The result of a load with index //idx// from this array is equivalent to the result of the following: (0xFFFFFFFF >> (sub 32, idx)) (assuming the array of type 32-bit integer). And the result of an 'AND' operation on the returned value of such a load and another input is exactly equivalent to the X86 BZHI instruction behavior. See test cases in the LIT test for better understanding. Differential Revision: https://reviews.llvm.org/D34141 llvm-svn: 320481	2017-12-12 14:13:51 +00:00
Anna Thomas	2dd9835f35	[InstCombineLoadStoreAlloca] Optimize stores to GEP off null base Summary: Currently, in InstCombineLoadStoreAlloca, we have simplification rules for the following cases: 1. load off a null 2. load off a GEP with null base 3. store to a null This patch adds support for the fourth case which is store into a GEP with null base. Since this is UB as well (and directly analogous to the load off a GEP with null base), we can substitute the stored val with undef in instcombine, so that SimplifyCFG can optimize this code into unreachable code. Note: Right now, simplifyCFG hasn't been taught about optimizing this to unreachable and adding an llvm.trap (this is already done for the above 3 cases). Reviewers: majnemer, hfinkel, sanjoy, davide Reviewed by: sanjoy, davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41026 llvm-svn: 320480	2017-12-12 14:12:33 +00:00
Nemanja Ivanovic	b0783cccb7	[PowerPC] Follow-up to r318436 to get the missed CSE opportunities The last of the three patches that https://reviews.llvm.org/D40348 was broken up into. Canonicalize the materialization of constants so that they are more likely to be CSE'd regardless of the bit-width of the use. If a constant can be materialized using PPC::LI, materialize it the same way always. For example: li 4, -1 li 4, 255 li 4, 65535 are equivalent if the uses only use the low byte. Canonicalize it to the first form. Differential Revision: https://reviews.llvm.org/D40348 llvm-svn: 320473	2017-12-12 12:09:34 +00:00
Jonas Devlieghere	f0945f48bd	[dsymutil] Accept line tables up to DWARFv5. This patch removes the hard-coded check for DWARFv2 line tables. Now dsymutil accepts line tables for DWARF versions 2 to 5 (inclusive). Differential revision: https://reviews.llvm.org/D41084 rdar://35968319 llvm-svn: 320469	2017-12-12 11:32:21 +00:00
Igor Laevsky	d63560b817	Revert r320049, r320014 and r319894 They were causing failures of the piglit OpenGL tests with AMD GPUs using the Mesa radeonsi driver. llvm-svn: 320466	2017-12-12 10:03:39 +00:00
Dorit Nuzman	927b31600e	[LV] Ignore the cost of values that will not appear in the vectorized loop VecValuesToIgnore holds values that will not appear in the vectorized loop. We should therefore ignore their cost when VF > 1. Differential Revision: https://reviews.llvm.org/D40883 llvm-svn: 320463	2017-12-12 08:57:43 +00:00
Mikael Holmen	66cf383761	[CallSiteSplitting] Don't let debug intrinsics affect optimizations Summary: This solves PR35616. We don't want the compiler to generate different code when we compile with/without -g, so we now ignore debug intrinsics when determining if the optimization can trigger or not. Reviewers: junbuml Subscribers: davide, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D41068 llvm-svn: 320460	2017-12-12 07:29:57 +00:00
Craig Topper	468a813315	[X86] Use Ld scheduler classes for instructions with folded loads. llvm-svn: 320459	2017-12-12 07:06:35 +00:00
Craig Topper	c1e72c019d	[X86] Correct the FMA3 regular expressions in the znver1 scheduler model. llvm-svn: 320458	2017-12-12 07:06:32 +00:00
Vedant Kumar	7a911b5851	[llvm-cov] Simplify a test case. NFC. llvm-svn: 320439	2017-12-11 23:34:50 +00:00
Max Moroz	fe4d904917	[llvm-cov] Add an option for "export" command to emit only file summary data. Summary: That allows to get the same data as produced by "llvm-cov report", but in JSON format, which is better for further processing by end users. Reviewers: vsk Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D41085 llvm-svn: 320435	2017-12-11 23:17:46 +00:00
Sam Clegg	f950b24a7a	Reland "[WebAssembly] Import the linear memory and function table." Original change: https://reviews.llvm.org/D40875 llvm-svn: 320432	2017-12-11 23:03:38 +00:00
Richard Trieu	efef032f02	Revert r318704 - [Sparc] efficient pattern for UINT_TO_FP conversion See bug https://bugs.llvm.org/show_bug.cgi?id=35631 r318704 is giving a fatal error on some code with unsigned to floating point conversions. llvm-svn: 320429	2017-12-11 22:25:04 +00:00
Matt Arsenault	3e268cc0dd	LSR: Check more intrinsic pointer operands llvm-svn: 320424	2017-12-11 21:38:43 +00:00
Hans Wennborg	27d1c00c01	Revert r320407 "[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast." The tests fail (opt asserts) on Windows. > Summary: > If we have pattern `store (load(bitcast(select (cmp(V1, V2), &V1, > &V2)))), bitcast)`, but the load is used in other instructions, it leads > to looping in InstCombiner. Patch adds additional check that all users > of the load instructions are stores and then replaces all uses of load > instruction by the new one with new type. > > Reviewers: RKSimon, spatel, majnemer > > Subscribers: llvm-commits > > Differential Revision: https://reviews.llvm.org/D41072 llvm-svn: 320421	2017-12-11 21:15:27 +00:00
Adrian Prantl	3c6c14d14b	ASAN: Provide reliable debug info for local variables at -O0. The function stack poisoner conditionally stores local variables either in an alloca or in malloc'ed memory, which has the unfortunate side effect that the actual address of the variable is only materialized when the variable is accessed, which means that those variables are mostly invisible to the debugger even when compiling without optimizations. This patch stores the address of the local stack base into an alloca, which can be referred to by the debug info and is available throughout the function. This adds one extra pointer-sized alloca to each stack frame (but mem2reg can optimize it away again when optimizations are enabled, yielding roughly the same debug info quality as before in optimized code). rdar://problem/30433661 Differential Revision: https://reviews.llvm.org/D41034 llvm-svn: 320415	2017-12-11 20:43:21 +00:00
Tony Jiang	3b49dc548f	[PowerPC] Partially enable the ISEL expansion pass. The pass to expand ISEL instructions into if-then-else sequences in patch D23630 is currently disabled. This patch partially enables it by always removing the unnecessary ISELs (all registers used by the ISELs are the same one) and folding the ISELs which have the same input registers into unconditional copies. Differential Revision: https://reviews.llvm.org/D40497 llvm-svn: 320414	2017-12-11 20:42:37 +00:00
Alexey Bataev	ec128ace8a	[InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast. Summary: If we have pattern `store (load(bitcast(select (cmp(V1, V2), &V1, &V2)))), bitcast)`, but the load is used in other instructions, it leads to looping in InstCombiner. Patch adds additional check that all users of the load instructions are stores and then replaces all uses of load instruction by the new one with new type. Reviewers: RKSimon, spatel, majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41072 llvm-svn: 320407	2017-12-11 19:11:16 +00:00
Krzysztof Parzyszek	a8ab1b75cb	[Hexagon] Add support for Hexagon V65 llvm-svn: 320404	2017-12-11 18:57:54 +00:00
Simon Pilgrim	e83876e31d	[X86] Add LODS schedule tests llvm-svn: 320403	2017-12-11 18:39:42 +00:00
Simon Pilgrim	e8715025f5	[X86] Add CMP/TEST schedule tests llvm-svn: 320402	2017-12-11 18:32:59 +00:00
Simon Pilgrim	5512525c5d	[X86] Add AND/OR/XOR schedule tests llvm-svn: 320400	2017-12-11 18:23:24 +00:00
Jonas Devlieghere	ba915897da	[dwarfdump] Fix off-by-one bug in accelerator table extractor. This fixes a bug where the verifier was complaining about empty accelerator tables. When the table is empty, its size is not a valid offset as it points after the end of the section. This patch also makes the extractor return llvm::Error instead of bool for better error reporting in the verifier. Differential revision: https://reviews.llvm.org/D41063 rdar://35932007 llvm-svn: 320399	2017-12-11 18:22:47 +00:00
Simon Pilgrim	9b2a5e1e0b	[X86] Add ADD/SUB schedule tests llvm-svn: 320397	2017-12-11 18:13:40 +00:00
Simon Pilgrim	dbe6c45fcd	[X86] Add ADC/SBB schedule tests llvm-svn: 320395	2017-12-11 17:59:05 +00:00
Simon Pilgrim	8c2d90a2f4	[X86] Add MOVSLQ schedule tests llvm-svn: 320392	2017-12-11 17:37:08 +00:00
Amara Emerson	df9b529d42	[GlobalISel] Disable GISel for big endian. This is due to PR26161 needing to be resolved before we can fix big endian bugs like PR35359. The work to split aggregates into smaller LLTs instead of using one large scalar will take some time, so in the meantime we'll fall back to SDAG. Some ARM BE tests xfailed for now as a result. Differential Revision: https://reviews.llvm.org/D40789 llvm-svn: 320388	2017-12-11 16:58:29 +00:00
Simon Pilgrim	fabe354b42	[X86] Add LWP schedule tests Tag LWP instructions as WriteSystem llvm-svn: 320387	2017-12-11 16:47:21 +00:00
Simon Pilgrim	67644be692	[X86] Add INT/INTO schedule tests llvm-svn: 320386	2017-12-11 16:32:58 +00:00
Simon Pilgrim	1fe82016a2	[X86] Add IN/OUT schedule tests llvm-svn: 320385	2017-12-11 16:16:40 +00:00
Simon Pilgrim	d0ce975528	[X86] Add IDIV schedule tests llvm-svn: 320384	2017-12-11 16:08:21 +00:00
Simon Pilgrim	6c29962f2e	[X86] Add CMPXCHG schedule tests llvm-svn: 320383	2017-12-11 16:04:08 +00:00
Simon Pilgrim	1c83cd18ae	[X86] Add CLZERO schedule test llvm-svn: 320382	2017-12-11 15:53:12 +00:00
Simon Pilgrim	d9d37f8c3c	[X86] Add ADCX/ADOX/XADD/XLAT schedule tests llvm-svn: 320380	2017-12-11 15:41:52 +00:00
Nirav Dave	e830b758b8	[X86] Modify Nontemporal tests to avoid deadstore optimization. llvm-svn: 320379	2017-12-11 15:35:40 +00:00
Simon Pilgrim	4f2c415a13	[X86] Add SETCC/STC/STD/UD2 schedule tests llvm-svn: 320376	2017-12-11 15:25:31 +00:00
Dmitry Preobrazhensky	ac2b02643b	[AMDGPU][MC][GFX9] Corrected encoding of ttmp registers, disabled tba/tma See bugs 35494 and 35559: https://bugs.llvm.org/show_bug.cgi?id=35494 https://bugs.llvm.org/show_bug.cgi?id=35559 Reviewers: vpykhtin, artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D41007 llvm-svn: 320375	2017-12-11 15:23:20 +00:00
Sanjay Patel	f3436d7dab	[DAGCombiner] protect against an infinite loop between shl <--> mul (PR35579) At first, I tried to thread the x86 needle and use a target hook (isVectorShiftByScalarCheap()) to disable the transform only for non-splat pow-of-2 constants, but not AVX2, but only some element types, but...it's difficult. Here we just avoid the loop with the x86 vector transform that conflicts with the general DAG combine and preserve all of the existing behavior AFAICT otherwise. Some tests that will probably fail if someone does try to restrict this in a more targeted way for x86-only may be found in: test/CodeGen/X86/combine-mul.ll test/CodeGen/X86/vector-mul.ll test/CodeGen/X86/widen_arith-5.ll This should prevent the infinite looping seen with: https://bugs.llvm.org/show_bug.cgi?id=35579 Differential Revision: https://reviews.llvm.org/D41040 llvm-svn: 320374	2017-12-11 15:19:31 +00:00
Simon Pilgrim	5154d249a8	[X86] Add SAR/SHL/SHR schedule tests llvm-svn: 320371	2017-12-11 14:56:44 +00:00
Simon Pilgrim	426add6915	[X86] Add RCL/RCR schedule tests llvm-svn: 320370	2017-12-11 14:46:42 +00:00
Krzysztof Parzyszek	152414595b	[Hexagon] Crash in instruction selection for insert_vector_elt for HVX A wrong type was passed to insertVector, causing an out-of-bounds value to be added as an operand to HexagonISD::INSERT. This later failed in instruction selection. llvm-svn: 320369	2017-12-11 14:46:06 +00:00
Nemanja Ivanovic	50d37a1129	[PowerPC] Sign-extend negative constant stores Second part of https://reviews.llvm.org/D40348. Revision r318436 has extended all constants feeding a store to 64 bits to allow for CSE on the SDAG. However, negative constants were zero extended which made the constant being loaded appear to be a positive value larger than 16 bits. This resulted in long sequences to materialize such constants rather than simply a "load immediate". This patch just sign-extends those updated constants so that they remain 16-bit signed immediates if they started out that way. llvm-svn: 320368	2017-12-11 14:35:48 +00:00
Diana Picus	291e8d924f	[ARM GlobalISel] Add test for a MOVTi16 pattern. NFC Add test for matching an OR with 0xFFFF0000 to a MOVTi16. llvm-svn: 320362	2017-12-11 13:28:45 +00:00
Simon Pilgrim	969850f514	[X86] Add fsgsbase schedule tests. llvm-svn: 320361	2017-12-11 13:25:02 +00:00
Alex Bradbury	dc31c61b18	[RISCV] Add custom CC_RISCV calling convention and improved call support The TableGen-based calling convention definitions are inflexible, while writing a function to implement the calling convention is very straight-forward, and allows difficult cases to be handled more easily. This patch adds support for: * Passing large scalars according to the RV32I calling convention * Byval arguments * Passing values on the stack when the argument registers are exhausted The custom CC_RISCV calling convention is also used for returns. This patch also documents the ABI lowering that a language frontend is expected to perform. I would like to work to simplify these requirements over time, but this will require further discussion within the LLVM community. We add PendingArgFlags CCState, as a companion to PendingLocs. The PendingLocs vector is used by a number of backends to handle arguments that are split during legalisation. However CCValAssign doesn't keep track of the original argument alignment. Therefore, add a PendingArgFlags vector which can be used to keep track of the ISD::ArgFlagsTy for every value added to PendingLocs. Differential Revision: https://reviews.llvm.org/D39898 llvm-svn: 320359	2017-12-11 12:49:02 +00:00
Alex Bradbury	bfb00d4c1c	[RISCV] Allow lowering of dynamic_stackalloc, stacksave, stackrestore llvm-svn: 320358	2017-12-11 12:38:17 +00:00
Alex Bradbury	b014e3de52	[RISCV] Implement prolog and epilog insertion As frame pointer elimination isn't implemented until a later patch and we make extensive use of update_llc_test_checks.py, this change touches a lot of the RISC-V tests. Differential Revision: https://reviews.llvm.org/D39849 llvm-svn: 320357	2017-12-11 12:34:11 +00:00
Simon Pilgrim	220b1c13bf	[X86] Regenerate fsgsbase intrinsic tests. NFCI. llvm-svn: 320356	2017-12-11 12:22:15 +00:00
Roger Ferrer Ibanez	5ea0f2501f	[ARM] Use ADDCARRY / SUBCARRY This is a preparatory step for D34515. This change: - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 - lowering is done by first converting the boolean value into the carry flag using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two operations does the actual addition. - for subtraction, given that ISD::SUBCARRY second result is actually a borrow, we need to invert the value of the second operand and result before and after using ARMISD::SUBE. We need to invert the carry result of ARMISD::SUBE to preserve the semantics. - given that the generic combiner may lower ISD::ADDCARRY and ISD::SUBCARRY into ISD::UADDO and ISD::USUBO we need to update their lowering as well otherwise i64 operations now would require branches. This implies updating the corresponding test for unsigned. - add new combiner to remove the redundant conversions from/to carry flags to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C - fixes PR34045 - fixes PR34564 - fixes PR35103 Differential Revision: https://reviews.llvm.org/D35192 llvm-svn: 320355	2017-12-11 12:13:45 +00:00
Alex Bradbury	660bcceccf	[RISCV] Support lowering FrameIndex Introduces the AddrFI "addressing mode", which is necessary simply because it's not possible to write a pattern that directly matches a frameindex. Ensure callee-saved registers are accessed relative to the stackpointer. This is necessary as callee-saved register spills are performed before the frame pointer is set. Move HexagonDAGToDAGISel::isOrEquivalentToAdd to SelectionDAGISel, so we can make use of it in the RISC-V backend. Differential Revision: https://reviews.llvm.org/D39848 llvm-svn: 320353	2017-12-11 11:53:54 +00:00
Diana Picus	775bb74379	[ARM GlobalISel] Add tests for PKHBT and PKHTB Test (some of) the patterns for selecting PKHBT and PKHTB. The others are just very similar to the ones we're testing and there would be little value in covering them as well. llvm-svn: 320352	2017-12-11 11:44:23 +00:00
Aleksandar Beserminji	d6dada17ff	[mips] Removal of microMIPS64R6 All files and parts of files related to microMIPS64R6 are removed. When target is microMIPS64R6, errors are printed. This is LLVM part of patch. Differential Revision: https://reviews.llvm.org/D35625 llvm-svn: 320350	2017-12-11 11:21:40 +00:00
Dylan McKay	2124bcf805	[AVR] Implement some missing code paths This has been broken since r320009. llvm-svn: 320348	2017-12-11 11:01:27 +00:00
Craig Topper	ad45bf5895	[DAGCombiner] Support folding (mulhs/u X, 0)->0 for vectors. We should probably also fold (mulhs/u X, 1) for vectors, but that's harder. llvm-svn: 320344	2017-12-11 08:33:20 +00:00
Craig Topper	0bea09b737	[X86] Regenerate test with update_llc_test_checks.py llvm-svn: 320342	2017-12-11 06:16:26 +00:00
Craig Topper	1e83485613	[X86] Add a test case for masked scatter where the index needs to be legalized from v2i32 while other types are legal. llvm-svn: 320340	2017-12-11 01:48:10 +00:00
Simon Pilgrim	6b1f532ccf	[X86] Add ROL/ROR schedule tests llvm-svn: 320334	2017-12-10 22:11:56 +00:00
Simon Pilgrim	a6564e2358	[X86] Add DIV/MUL/NEG/NOP/NOT/PAUSE schedule tests llvm-svn: 320333	2017-12-10 21:56:24 +00:00
Simon Pilgrim	8e6d0fcbac	[X86] Add DEC/INC schedule tests Include i686 (non-REX) variant tests as well llvm-svn: 320332	2017-12-10 21:28:00 +00:00
Simon Pilgrim	f1c51d187a	[X86] Add INS/OUTS schedule tests llvm-svn: 320331	2017-12-10 21:10:28 +00:00
Simon Pilgrim	07ebbd53f0	[X86] Add CMPS/MOVS/SCAS/STOS schedule tests llvm-svn: 320330	2017-12-10 20:58:22 +00:00
Simon Pilgrim	f65831d731	[X86] Add CMOV schedule tests llvm-svn: 320329	2017-12-10 20:46:57 +00:00
Simon Pilgrim	4a431edddc	[X86] Add BT/BTC/BTR/BTS schedule tests llvm-svn: 320328	2017-12-10 20:22:47 +00:00
Craig Topper	a0be5a06c1	[X86] Rename some instructions that start with Int_ to have the _Int at the end. This matches AVX512 version and is more consistent overall. And improves our scheduler models. In some cases this adds _Int to instructions that didn't have any Int_ before. It's a side effect of the adjustments made to some of the multiclasses. llvm-svn: 320325	2017-12-10 19:47:56 +00:00
Simon Pilgrim	c493d4f5b9	[X86][X87] Fix typo in znver1 FIST/FISTT schedule patterns llvm-svn: 320322	2017-12-10 19:19:22 +00:00
Simon Pilgrim	930e435937	[X86][X87] Add missing x87 scheduler tests Split off some 'n' instruction versions to make it clearer when WAIT is being inserted llvm-svn: 320321	2017-12-10 18:53:15 +00:00
Craig Topper	1de942b2d1	[X86] Rename some instructions from 'rb' to 'rrb' to make 'b' a proper suffix. Fix the scheduling information for some of them. Some of the scheduling information was only present for the 'rb' version and not the 'rr' version. Now we match 'rr(b?)' llvm-svn: 320320	2017-12-10 17:42:44 +00:00
Craig Topper	c7445f2cdc	[X86] Add VCVTQQ2PS to the skylake server scheduler models. llvm-svn: 320319	2017-12-10 17:42:43 +00:00
Sanjay Patel	b23e148114	[SimplifyLibCalls] propagate FMF when folding pow(x, -1.0) call Follow-up for a bug that's similar to: https://bugs.llvm.org/show_bug.cgi?id=35601 llvm-svn: 320312	2017-12-10 17:25:54 +00:00
Sanjay Patel	ac9cbd6c56	[InstCombine] add test for pow(x, -1.0) with FMF; NFC llvm-svn: 320311	2017-12-10 17:21:51 +00:00
Sanjay Patel	09ec34349a	[SimplifyLibCalls] propagate FMF when folding pow(x, 2.0) call (PR35601) This should fix the larger problem with sqrt shown in: https://bugs.llvm.org/show_bug.cgi?id=35601 llvm-svn: 320310	2017-12-10 16:52:26 +00:00
Sanjay Patel	719bc64ba5	[InstCombine] add test for pow(x, 2.0) with FMF; NFC llvm-svn: 320309	2017-12-10 16:43:34 +00:00
Simon Pilgrim	1f8cfba0bb	[X86] Flag BroadWell scheduler model as complete Locally tag COPY as WriteMove, which has caused some reg-reg + reg-mem instruction tests to reorder. llvm-svn: 320308	2017-12-10 13:49:51 +00:00
Simon Pilgrim	4ff43d8120	Regenerate some AVX2+ scheduling tests that got missed llvm-svn: 320307	2017-12-10 13:41:29 +00:00
Simon Pilgrim	af35b76bda	Regenerate some scheduling tests that got missed llvm-svn: 320305	2017-12-10 12:59:55 +00:00
Dorit Nuzman	5809e70540	[SCEV] Fix wrong Equal predicate created in getAddRecForPhiWithCasts CreateAddRecFromPHIWithCastsImpl() adds an IncrementNUSW overflow predicate which allows the PSCEV rewriter to rewrite this scev expression: (zext i8 {0, + , (trunc i32 step to i8)} to i32) into {0, +, (sext i8 (trunc i32 step to i8) to i32)} But then it adds the wrong Equal predicate: %step == (zext i8 (trunc i32 %step to i8) to i32). instead of: %step == (sext i8 (trunc i32 %step to i8) to i32) This is fixed here. Differential Revision: https://reviews.llvm.org/D40641 llvm-svn: 320298	2017-12-10 11:13:35 +00:00
Craig Topper	253562eb81	[X86] Fix duplicate entries in skylake server scheduler model by changing Z128 to Z256 Based on the fact that the 'Y' version of the instruction is next to this, I assume Z256 is the intended value. llvm-svn: 320295	2017-12-10 09:14:45 +00:00
Craig Topper	90c9c15936	[X86] Add MOVQI2PQIrm, MOVSDmr, and MOVSDrm to scheduler information The VEX versions were present but not the legacy SSE versions. llvm-svn: 320294	2017-12-10 09:14:44 +00:00
Tim Northover	cf4701bb89	PowerPC: support external pid instructions in MC layer. This adds assembly & disassembly support for the e500mc "external pid" instructions. See https://reviews.llvm.org/D39249. Patch by vit9696 <vit9696@avp.su> llvm-svn: 320287	2017-12-10 08:43:19 +00:00
Craig Topper	4e57776fb2	[X86] Correct the _Int part of more scheduler model instrexes. Put _b in the correct order relative to _Int llvm-svn: 320282	2017-12-10 03:16:38 +00:00
Craig Topper	29868dcbaa	[X86] Fix test case I failed to update in r320279. llvm-svn: 320280	2017-12-10 01:27:54 +00:00
Craig Topper	391c6f9507	[X86] Fix bad regular expressions in the scheduler models. Question marks should be outside of multicharacter parenthesized expressions If the question mark is inside the parentheses it only applies to the single character preceding it. I had to make a few additional cleanups to fix some duplicate warnings that were exposed by fixing this. llvm-svn: 320279	2017-12-10 01:24:08 +00:00
Joel Jones	5cc21e83ce	[AArch64] Improve loop unrolling performance on Cavium T99 This patch improves performance on Cavium T99 as shown here (libquantum 0.2.4): https://docs.google.com/spreadsheets/d/1Lo1o2E1NjrpkwS7DvYYWsiVvPdd93h7KBaqeptMrZPY/edit?usp=sharing By increasing the LoopMicroOpsBufferSize in the Cavium T99 Scheduler file, loop unrolling becomes more aggressive. This helps performance on T99. Test case included. Patch by Stefan Teleman Differential Revision: https://reviews.llvm.org/D40695 llvm-svn: 320272	2017-12-09 23:59:55 +00:00
Simon Pilgrim	a42a54258e	[InstCombine] Fix SimplifyDemandedUseBits SHL handling (PR35515) Don't assume that the pattern matched SRL can be cast to an Instruction (might be ConstExpr etc.) llvm-svn: 320270	2017-12-09 23:42:56 +00:00
Craig Topper	f4e3044db9	[X86] Use KMOV instructions to zero upper bits of vectors when possible. llvm-svn: 320268	2017-12-09 23:10:59 +00:00
Craig Topper	5ac75d5628	[X86] Improve lowering of vXi1 insert_subvectors to better utilize (insert_subvector zero, vec, 0) for zeroing upper bits. This can be better recognized during isel when the producer already zeroed the upper bits. llvm-svn: 320267	2017-12-09 22:44:42 +00:00
Craig Topper	504534514c	[X86] Don't use getTargetConstant for all 0s and all 1s mask vector. llvm-svn: 320260	2017-12-09 19:18:30 +00:00
Florian Hahn	c5bebffe4f	[InlineFunction] Set debug loc for call to forward varargs. Reviewers: aprantl, dblaikie, rnk Reviewed By: rnk Subscribers: eraman, llvm-commits, JDevlieghere Differential Revision: https://reviews.llvm.org/D40432 llvm-svn: 320252	2017-12-09 14:25:33 +00:00
Craig Topper	6504a8f888	[X86] When inserting into the upper bits of a vXi1 vector, make sure we shift enough bits if we widened the vector. We may need to widen the vector to make the shifts legal, but if we do that we need to make sure we shift left/right after accounting for the new size. If not we can't guarantee we are shifting in zeros. The test cases affected actually show cases where we should move the shifts all together, but that's another problem. llvm-svn: 320248	2017-12-09 08:19:07 +00:00
Dylan McKay	ba23343a45	Revert an accidentally committed revert commit This reverts commit r320245. llvm-svn: 320247	2017-12-09 08:01:28 +00:00
Dylan McKay	f7e8ec1348	[AVR] Fix two CodeGen tests These were broken because of various printing format changes. llvm-svn: 320246	2017-12-09 07:51:43 +00:00
Dylan McKay	f5422afdf0	Revert "[AVR] Override ParseDirective" This reverts commit 57c16f9267969ebb09d6448607999b4a9f40c418. llvm-svn: 320245	2017-12-09 07:51:37 +00:00
Craig Topper	b3e14ce90c	[X86] Improve lowering of concats of mask vectors to better optimize zero vector inputs. We were previously using kunpck with zero inputs unnecessarily. And we had cases where we would insert into a zero vector and then insert into larger zero vector incurring two sets of shifts. llvm-svn: 320244	2017-12-09 07:02:19 +00:00
Dylan McKay	80463fe64d	Relax unaligned access assertion when type is byte aligned Summary: This relaxes an assertion inside SelectionDAGBuilder which is overly restrictive on targets which have no concept of alignment (such as AVR). In these architectures, all types are aligned to 8-bits. After this, LLVM will only assert that accesses are aligned on targets which actually require alignment. This patch follows from a discussion on llvm-dev a few months ago http://llvm.1065342.n5.nabble.com/llvm-dev-Unaligned-atomic-load-store-td112815.html Reviewers: bogner, nemanjai, joerg, efriedma Reviewed By: efriedma Subscribers: efriedma, cactus, llvm-commits Differential Revision: https://reviews.llvm.org/D39946 llvm-svn: 320243	2017-12-09 06:45:36 +00:00
Jessica Paquette	a249c4f513	[MachineOutliner] Outline calls The outliner previously would never outline calls. Calls are pretty common in files, so it makes sense to outline them. In fact, in the LLVM test suite, if you count the number of instructions that the outliner misses when you outline calls vs when you don't, it turns out that, on average, around 6% of the instructions encountered are calls. So, if we outline calls, we can find more candidates, and thus save some more space. This commit adds that functionality and updates the mir test to reflect that. llvm-svn: 320229	2017-12-09 00:43:49 +00:00
Wolfgang Pieb	8b1a175be6	[NFC] Change the string offsets table tests to generate the object on the fly which enables us to remove the test scripts and object files from the repository. https://reviews.llvm.org/D40914 llvm-svn: 320227	2017-12-09 00:39:53 +00:00
Evgeniy Stepanov	c667c1f47a	Hardware-assisted AddressSanitizer (llvm part). Summary: This is LLVM instrumentation for the new HWASan tool. It is basically a stripped down copy of ASan at this point, w/o stack or global support. Instrumentation adds a global constructor + runtime callbacks for every load and store. HWASan comes with its own IR attribute. A brief design document can be found in clang/docs/HardwareAssistedAddressSanitizerDesign.rst (submitted earlier). Reviewers: kcc, pcc, alekseyshl Subscribers: srhines, mehdi_amini, mgorny, javed.absar, eraman, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D40932 llvm-svn: 320217	2017-12-09 00:21:41 +00:00
Paul Robinson	8bd9d6ad83	Fix out-of-order stepping behavior in programs with sunk instructions. MachineSink attempts to place instructions near the basic blocks where they are needed. Once an instruction has been sunk, its location relative to other instructions no longer is consistent with the original source code. In order to ensure correct stepping in the debugger, the debug location for sunk instructions is either merged with the insertion point or erased if the target successor block is empty. Originally submitted as r318679, revised to fix sanitizer failure and improve testing. Patch by Matthew Voss! Differential Revision: https://reviews.llvm.org/D39933 llvm-svn: 320216	2017-12-09 00:17:01 +00:00
Adrian Prantl	01fb31cc89	dwarfdump: Add support for the --diff option. --diff Emit the output in a diff-friendly way by omitting offsets and addresses. <rdar://problem/34502625> llvm-svn: 320214	2017-12-08 23:32:47 +00:00
Duncan P. N. Exon Smith	9b8caf5bd7	Revert part of "Cleanup some GraphTraits iteration code" This reverts part of r300656, which caused a regression in propagateMassToSuccessors by counting edges n^2 times, where n is the number of edges from the source basic block to the same successor basic block. The result was both incorrect and very slow to compute for large values of n (e.g. switches with multiple cases that go to the same basic block). Patch by Andrew Scheidecker! llvm-svn: 320208	2017-12-08 22:42:43 +00:00
Vedant Kumar	195dfd10a6	[Debugify] Add a pass to test debug info preservation The Debugify pass synthesizes debug info for IR. It's paired with a CheckDebugify pass which determines how much of the original debug info is preserved. These passes make it easier to create targeted tests for debug info preservation. Here is the Debugify algorithm: NextLine = 1 for (Instruction &I : M) attach DebugLoc(NextLine++) to I NextVar = 1 for (Instruction &I : M) if (canAttachDebugValue(I)) attach dbg.value(NextVar++) to I The CheckDebugify pass expects contiguous ranges of DILocations and DILocalVariables. If it fails to find all of the expected debug info, it prints a specific error to stderr which can be FileChecked. This was discussed on llvm-dev in the thread: "Passes to add/validate synthetic debug info" Differential Revision: https://reviews.llvm.org/D40512 llvm-svn: 320202	2017-12-08 21:57:28 +00:00
Florian Hahn	e5089e2e94	[CodeExtractor] Add debug locations for new call and branch instrs. Summary: If a partially inlined function has debug info, we have to add debug locations to the call instruction calling the outlined function. We use the debug location of the first instruction in the outlined function, as the introduced call transfers control to this statement and there is no other equivalent line in the source code. We also use the same debug location for the branch instruction added to jump from artificial entry block for the outlined function, which just jumps to the first actual basic block of the outlined function. Reviewers: davide, aprantl, rriddle, dblaikie, danielcdh, wmi Reviewed By: aprantl, rriddle, danielcdh Subscribers: eraman, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D40413 llvm-svn: 320199	2017-12-08 21:49:03 +00:00
Dan Gohman	3a762bf9df	[WebAssembly] Reapply r319186: "Support bitcasted function addresses with varargs." This puts the functionality under control of a command-line option which is off by default to avoid breaking existing setups. llvm-svn: 320197	2017-12-08 21:27:00 +00:00
Dan Gohman	6736f59078	[WebAssemby] Re-apply r320041: "Support main functions with alternate signatures." This includes a fix so that it doesn't transform declarations, and it puts the functionality under control of a command-line option which is off by default to avoid breaking existing setups. llvm-svn: 320196	2017-12-08 21:18:21 +00:00
Konstantin Zhuravlyov	c40d9f2e5d	AMDGPU/GCN: Bring processors in sync with AMDGPUUsage - Add gfx704 - Change bonaire to gfx704 - Remove gfx804 - Remove gfx901 - Remove gfx903 Differential Revision: https://reviews.llvm.org/D40046 llvm-svn: 320194	2017-12-08 20:52:28 +00:00
Craig Topper	7f0d456ef8	[X86] Teach lowering to only let through (insert_subvector (vXi1 zeros), subvec, 0) for vector sizes that have native KSHIFT support. For narrow sizes we'll widen the zero vector and widen the insert. Then do an extract_subvector to get back down to correct size. This allows us to remove some patterns from the isel table that had to COPY_TO_REGCLASS to an oversized register, do the shift and then COPY_TO_REGCLASS back to the narrow register. Now this is represented explicitly in the DAG. This seems to have perturbed the register allocation in one of the tests, but the number of instructions didn't change. llvm-svn: 320190	2017-12-08 20:10:33 +00:00
Simon Pilgrim	6415f56c79	[X86][X87] Tag x87 float compare instructions scheduler classes llvm-svn: 320189	2017-12-08 20:10:31 +00:00
Matt Arsenault	856777d8c9	AMDGPU: image_getlod and image_getresinfo do not read memory llvm-svn: 320187	2017-12-08 20:00:57 +00:00
Xinliang David Li	d91057bf52	Revert r320104: infinite loop profiling bug fix Causes unexpected memory issue with New PM this time. The new PM invalidates BPI but not BFI, leaving the reference to BPI from BFI invalid. Abandon this patch. There is a more general solution which also handles runtime infinite loop (but not statically). llvm-svn: 320180	2017-12-08 19:38:07 +00:00
Konstantin Zhuravlyov	e30f88f3a9	AMDGPU: Report Arg's Value name in metadata if kernel_arg_name metadata is not available Differential Revision: https://reviews.llvm.org/D40924 llvm-svn: 320176	2017-12-08 19:22:12 +00:00
Michael Trent	ad840d2206	Reverting r320166 to fix test failures. llvm-svn: 320174	2017-12-08 19:09:26 +00:00
Michael Trent	de5209bdbd	Updated llvm-objdump to display local relocations in Mach-O binaries Summary: llvm-objdump's Mach-O parser was updated in r306037 to display external relocations for MH_KEXT_BUNDLE file types. This change extends the Mach-O parser to display local relocations for MH_PRELOAD files. When used with the -macho option relocations will be displayed in a historical format. rdar://35778019 Reviewers: enderby Reviewed By: enderby Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40867 llvm-svn: 320166	2017-12-08 17:51:04 +00:00
Davide Italiano	b5a62cc81a	[DebugInfo] Use llc instead of llc_dwarf to fix this test. We work around the fact that some platforms add a triple when they expand llc_dwarf in lit. llvm-svn: 320164	2017-12-08 17:15:50 +00:00
Simon Pilgrim	19d460b066	[X86][SHA] Tag SHA instructions scheduler classes Put these under VecIMul itinerary classes for now - seems to be a good average value llvm-svn: 320161	2017-12-08 16:38:41 +00:00
Alexey Bataev	ec95c6cc0a	[InstCombine] PR35354: Convert store(bitcast, load bitcast (select (Cond, &V1, &V2)) --> store (, load (select(Cond, load &V1, load &V2))) Summary: If we have the code like this: ``` float a, b; a = std::max(a ,b); ``` it is converted into something like this: ``` %call = call dereferenceable(4) float* @_ZSt3maxIfERKT_S2_S2_(float* nonnull dereferenceable(4) %a.addr, float* nonnull dereferenceable(4) %b.addr) %1 = bitcast float* %call to i32* %2 = load i32, i32* %1, align 4 %3 = bitcast float* %a.addr to i32* store i32 %2, i32* %3, align 4 ``` After inlining this code is converted to the next: ``` %1 = load float, float* %a.addr %2 = load float, float* %b.addr %cmp.i = fcmp fast olt float %1, %2 %__b.__a.i = select i1 %cmp.i, float* %a.addr, float* %b.addr %3 = bitcast float* %__b.__a.i to i32* %4 = load i32, i32* %3, align 4 %5 = bitcast float* %arrayidx to i32* store i32 %4, i32* %5, align 4 ``` This pattern is not recognized as minmax pattern. Patch solves this problem by converting sequence ``` store (bitcast, (load bitcast (select ((cmp V1, V2), &V1, &V2)))) ``` to a sequence ``` store (,load (select((cmp V1, V2), &V1, &V2))) ``` After this the code is recognized as minmax pattern. Reviewers: RKSimon, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40304 llvm-svn: 320157	2017-12-08 15:32:10 +00:00
Max Kazantsev	9c08b7a053	[SCEV] Fix predicate usage in computeExitLimitFromICmp In this method, we invoke `SimplifyICmpOperands` which takes the `Cond` predicate by reference and may change it along with `LHS` and `RHS` SCEVs. But then we invoke `computeShiftCompareExitLimit` with Values from which the SCEVs have been derived, these Values have not been modified while `Cond` could be. One of possible outcomes of this is that we may falsely prove that an infinite loop ends within some finite number of iterations. In this patch, we save the original `Cond` and pass it along with original operands. This logic may be removed in future once `computeShiftCompareExitLimit` works with SCEVs instead of value operands. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D40953 llvm-svn: 320142	2017-12-08 12:19:45 +00:00
Gadi Haber	2cf601f28f	[X86][Haswell]: Updating the scheduling information for the Haswell subtarget. Updated the scheduling information for the Haswell subtarget with the following changes: Regrouped the instructions after adding appropriate load + store latencies. Added scheduling for missing instructions such as the GATHER instrs. The changes were made after revisiting the latencies impact of all memory uOps. Reviewers: RKSimon, zvi, craig.topper, apilipenko Differential Revision: https://reviews.llvm.org/D40021 Change-Id: Iaf6c1f5169add1552845a8a566af4e5a359217a7 llvm-svn: 320137	2017-12-08 09:48:44 +00:00
Abderrazek Zaafrani	2c80e4c7c3	[AArch64] Avoid SIMD interleaved store instruction for Exynos. Replace interleaved store instructions by equivalent and more efficient instructions based on latency cost model. https://reviews.llvm.org/D38196 llvm-svn: 320123	2017-12-08 00:58:49 +00:00
Derek Schuff	9e1baeda74	Revert "[WebAssemby] Support main functions with alternate signatures." This reverts commit 959e37e669b0c3cfad4cb9f1f7c9261ce9f5e9ae. That commit doesn't handle the case where main is declared rather than defined, in particular the even-more special case where main is a prototypeless declaration (which is of course the one actually used by musl currently). llvm-svn: 320121	2017-12-08 00:39:54 +00:00
Craig Topper	323ba39f10	[X86] Handle all versions of vXi1 insert_vector_elt with a constant index without falling back to shuffles. We previously only supported inserting to the LSB or MSB where it was easy to zero to perform an OR to insert. This change effectively extracts the old value and the new value, xors them together and then xors that single bit with the correct location in the original vector. This will cancel out the old value in the first xor leaving the new value in the position. The way I've implemented this uses 3 shifts and two xors and uses an additional register. We can avoid the additional register at the cost of another shift. llvm-svn: 320120	2017-12-08 00:16:09 +00:00
Bill Seurer	957a076cce	[PowerPC][asan] Update asan to handle changed memory layouts in newer kernels In more recent Linux kernels with 47 bit VMAs the layout of virtual memory for powerpc64 changed causing the address sanitizer to not work properly. This patch adds support for 47 bit VMA kernels for powerpc64 and fixes up test cases. https://reviews.llvm.org/D40907 There is an associated patch for compiler-rt. Tested on several 4.x and 3.x kernel releases. llvm-svn: 320109	2017-12-07 22:53:33 +00:00
Eric Christopher	a469acac03	Temporarily revert "[PowerPC] Allow tail calls of fastcc functions from C CallingConv functions." It is causing sanitizer failures on llvm tests in a bootstrapped compiler. No bot link since it's currently down, but following up to get the bot up. This reverts commit r319218. llvm-svn: 320106	2017-12-07 22:26:19 +00:00
Xinliang David Li	4b0027f671	[PGO] detect infinite loop and form MST properly Differential Revision: http://reviews.llvm.org/D40873 llvm-svn: 320104	2017-12-07 22:23:28 +00:00
Jessica Paquette	59948666fb	[MachineOutliner] Fix offset overflow check The offset overflow check before was incorrect. It would always give the correct result, but it was comparing the SCALED potential fixed-up offset against an UNSCALED minimum/maximum. As a result, the outliner was missing a bunch of frame setup/destroy instructions that ought to have been safe to outline. This fixes that, and adds an instruction to the .mir test that failed the old test. llvm-svn: 320090	2017-12-07 21:51:43 +00:00
Mark Searles	9ebdbb433a	[AMDGPU] Revert "[AMDGPU] Add options for waitcnt pass debugging; add instr count in debug output." Patch caused a buildbot failure; http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/15733/steps/build_Lld/logs/stdio : lib/Target/AMDGPU/SIInsertWaitcnts.cpp:396:11: error: private field 'InstCnt' is not used [-Werror,-Wunused-private-field] int32_t InstCnt = 0; ^ 1 error generated. " This reverts commit 71627f79010aafe74fdcba901bba28dd7caa0869. llvm-svn: 320086	2017-12-07 21:14:41 +00:00
Mark Searles	a84d23489a	[AMDGPU] Add options for waitcnt pass debugging; add instr count in debug output. -amdgpu-waitcnt-forcezero={1\|0} Force all waitcnt instrs to be emitted as s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) -amdgpu-waitcnt-forceexp=<n> Force emit a s_waitcnt expcnt(0) before the first <n> instrs -amdgpu-waitcnt-forcelgkm=<n> Force emit a s_waitcnt lgkmcnt(0) before the first <n> instrs -amdgpu-waitcnt-forcevm=<n> Force emit a s_waitcnt vmcnt(0) before the first <n> instrs Differential Revision: https://reviews.llvm.org/D40091 llvm-svn: 320084	2017-12-07 20:36:39 +00:00
Mark Searles	d29f24acfb	[AMDGPU] Add GCNHazardRecognizer::checkInlineAsmHazards() and GCNHazardRecognizer::checkVALUHazardsHelper(). checkInlineAsmHazards() checks INLINEASM for hazards that we particularly care about (so not exhaustive); this patch adds a check for INLINEASM that defs vregs that hold data-to-be stored by immediately preceding store of more than 8 bytes. If the instr were not within an INLINEASM, this scenario would be handled by checkVALUHazard(). Add checkVALUHazardsHelper(), which will be called by both checkVALUHazards() and checkInlineAsmHazards(). Differential Revision: https://reviews.llvm.org/D40098 llvm-svn: 320083	2017-12-07 20:34:25 +00:00
Craig Topper	dfc79c7c33	[X86] Fix InsertBitToMaskVector to only issue KSHIFTS of native size so that upper bits are properly zeroed. There's no v2i1 or v4i1 kshift, and v8i1 is only supported with AVXDQ. Isel has fake patterns to extend these types to native shifts, but makes no guarantees about the value of any bits shifted in when shifting right. This patch promotes the vector to a type that supports a native shift first and only allows inserting into the msb of a native sized shift. I've constructed this in a way that doesn't do the promotion if we're going to fallback to using a xmm/ymm/zmm shuffle. I think I have a plan to remove the shuffle fall back entirely. In which case this can be simplified, but I wanted to fix the correctness issue first. llvm-svn: 320081	2017-12-07 20:10:04 +00:00
Sanjay Patel	6cfc136870	[InstCombine] add tests for abs using bit hackery; NFC llvm-svn: 320068	2017-12-07 18:13:33 +00:00
Simon Pilgrim	386b23f1fa	[X86] Tag BMI/BMI2/TBM instructions scheduler classes Put these under UNARY/BINOP ALU itinerary classes for now - seems to be a good average value llvm-svn: 320064	2017-12-07 17:37:39 +00:00
Krzysztof Parzyszek	039d4d9286	[Hexagon] Generate HVX code for basic arithmetic operations Handle and, or, xor, add, sub, mul for vectors of i8, i16, and i32. llvm-svn: 320063	2017-12-07 17:37:28 +00:00
Simon Pilgrim	d2e93e76b8	[X86][TBM] Add TBM scheduling tests llvm-svn: 320062	2017-12-07 17:23:00 +00:00
Craig Topper	5db260fca4	[X86] Rename function in recently added test case to not be 'main' returning 'void'. NFC llvm-svn: 320059	2017-12-07 17:02:49 +00:00
Davide Italiano	f6e180d523	[DebugInfo] Move this test to X86/ now that it specifies a triple. Should bring back the arm/arm64 bots. Reported by Yvan Roux. llvm-svn: 320057	2017-12-07 16:10:39 +00:00
Simon Pilgrim	2983b46973	[X86] Tag SALC instructions scheduler class Treat these the same as LAHF/SAHF (although it's not an x86_64 instruction) llvm-svn: 320055	2017-12-07 16:07:06 +00:00
Simon Pilgrim	ffce0d8fbc	[X86] Add LAHF/SAHF scheduling test llvm-svn: 320054	2017-12-07 16:04:20 +00:00
Simon Pilgrim	a383f84233	[X86] Add SALC scheduling test llvm-svn: 320052	2017-12-07 15:46:58 +00:00
Simon Pilgrim	f1d599adb2	[X86] Tag LZCNT/TZCNT instructions scheduler classes Tagged as IMUL instructions for a reasonable approximation (ALU tends to be a lot faster) - POPCNT is currently tagged as FAdd which I think should be replaced with IMUL as well llvm-svn: 320051	2017-12-07 15:24:14 +00:00
Sanjay Patel	9012391af1	[DAGCombiner] eliminate shuffle of insert element I noticed this pattern in D38316 / D38388. We failed to combine a shuffle that is either repeating a scalar insertion at the same position in a vector or translated to a different element index. Like the earlier patch, this could be an instcombine too, but since we opted to make this a DAG transform earlier, I've made this one a DAG patch too. We do not need any legality checking because the new insert is identical to the existing insert except that it may have a different constant insertion operand. The constant insertion test in test/CodeGen/X86/vector-shuffle-combining.ll was the motivation for D38756. Differential Revision: https://reviews.llvm.org/D40209 llvm-svn: 320050	2017-12-07 15:17:58 +00:00
Igor Laevsky	4a4f2e8c67	[InstCombine] Don't crash on out of bounds index in the insertelement Differential Revision: https://reviews.llvm.org/D40390 llvm-svn: 320049	2017-12-07 15:00:52 +00:00
Simon Pilgrim	ff5212091a	[X86][FMA] Regenerate fma schedule tests llvm-svn: 320048	2017-12-07 14:51:47 +00:00
Simon Pilgrim	60411d9a8c	[X86] Tag RDRAND/RDSEED instruction scheduler classes llvm-svn: 320045	2017-12-07 14:18:48 +00:00
Simon Pilgrim	9a2898ed22	[X86] Regenerate RDTSC codegen tests llvm-svn: 320042	2017-12-07 13:50:29 +00:00
Dan Gohman	cdaa87dd2e	[WebAssembly] Support main functions with alternate signatures. WebAssembly requires caller and callee signatures to match, so the usual C runtime trick of calling main and having it just work regardless of whether main is defined as '()' or '(int argc, char *argv[])' doesn't work. Extend the FixFunctionBitcasts pass to rewrite main to use the latter form. llvm-svn: 320041	2017-12-07 13:49:27 +00:00
Simon Pilgrim	439679c085	[X86][RDSEED] Add rdseed scheduling tests llvm-svn: 320040	2017-12-07 13:47:17 +00:00


49670 Commits