llvm-project

Author	SHA1	Message	Date
Lama Saba	927468309f	[X86] Reduce Store Forward Block issues in HW - Recommit after fixing Bug 36346 If a load follows a store and reloads data that the store has written to memory, Intel microarchitectures can in many cases forward the data directly from the store to the load. This "store forwarding" saves cycles by enabling the load to directly obtain the data instead of accessing the data from cache or memory. A "store forward block" occurs when a store cannot be forwarded to the load. The most typical case of store forward block on Intel Core microarchitecture is that a small store cannot be forwarded to a large load. The estimated penalty for a store forward block is ~13 cycles. This pass tries to recognize and handle cases where "store forward block" is created by the compiler when lowering memcpy calls to a sequence of a load and a store. The pass currently only handles cases where memcpy is lowered to XMM/YMM registers; it tries to break the memcpy into smaller copies. Breaking the memcpy should be possible since there is no atomicity guarantee for loads and stores to XMM/YMM. Differential revision: https://reviews.llvm.org/D41330 Change-Id: Ib48836ccdf6005989f7d4466fa2035b7b04415d9 llvm-svn: 328973	2018-04-02 13:48:28 +00:00
Andrea Di Biagio	6fd62feff8	[llvm-mca] Do not assume that implicit reads cannot be associated with ReadAdvance entries. Before, the instruction builder incorrectly assumed that only explicit reads could have been associated with ReadAdvance entries. This patch fixes the issue and adds a test to verify it. llvm-svn: 328972	2018-04-02 13:46:49 +00:00
Nico Weber	62ea0c562e	Attempt to fix papertrail-warnings.test on Windows bots. llvm-svn: 328971	2018-04-02 13:45:39 +00:00
Nico Weber	dce9a72d98	Assume existence of inttypes.h and stdint.h in DataTypes.h. These should exist in all toolchains LLVM supports nowadays. Enables making DataTypes.h a regular header instead of a .h.cmake file and allows deleting a bunch of cmake goop (which should also speed up cmake configure time a bit). All the code this removes is 9+ years old. https://reviews.llvm.org/D45155 llvm-svn: 328970	2018-04-02 13:22:26 +00:00
Hiroshi Inoue	6d48493817	[PowerPC] fix assertion failure due to missing instruction in P9InstrResources.td This patch adds L(D|W|H|B)XTLS instructions introduced by https://reviews.llvm.org/rL327635 in P9InstrResources.td. llvm-svn: 328969	2018-04-02 12:18:21 +00:00
Jonas Devlieghere	9e3e7a99e8	[dsymutil] Upstream emitting of papertrail warnings. When running dsymutil as part of your build system, it can be desirable for warnings to be part of the end product, rather than just being emitted to the output stream. This patch upstreams that functionality. Differential revision: https://reviews.llvm.org/D44639 llvm-svn: 328965	2018-04-02 10:40:43 +00:00
Simon Pilgrim	3f0bda296d	Wdocumentation fix. NFCI. llvm-svn: 328964	2018-04-02 10:34:39 +00:00
Simon Pilgrim	49a5ddfda0	Wdocumentation fixes. NFCI. llvm-svn: 328963	2018-04-02 10:21:51 +00:00
Craig Topper	96729cd64b	[X86][Silvermont] Use correct latency and throughput information for divide and square root in the scheduler model. Data taken from Table 16-17 in the Intel Optimization Manual. llvm-svn: 328962	2018-04-02 06:34:16 +00:00
Craig Topper	6a814904da	[X86][SkylakeServer] Correct throughput for 512-bit sqrt and divide. Data taken from the AVX512_SKX_PortAssign spreadsheet at http://instlatx64.atw.hu/ llvm-svn: 328961	2018-04-02 05:54:34 +00:00
Craig Topper	8104f266a4	[X86] Correct the throughput for divide instructions in Sandy Bridge/Haswell/Broadwell/Skylake scheduler models. Fixes most of PR36898. Still need to fix the 512-bit instructions, but Agner's tables don't have those. llvm-svn: 328960	2018-04-02 05:33:28 +00:00
Craig Topper	dc74094398	[X86] Fix the SchedRW for AVX512 shift instructions. It was being inadvertently defaulted to an FADD scheduler class. llvm-svn: 328959	2018-04-02 03:15:02 +00:00
Craig Topper	5fb1dc2d22	[X86] Give the AVX512 VEXTRACT instructions the same SchedRWs as the SSE/AVX versions. llvm-svn: 328958	2018-04-02 02:44:55 +00:00
Nico Weber	7a4f647dc2	Remove a few unreferenced config.h defines. Found by looking through the output of for f in $(grep -o '\bHAVE_[A-Z0-9_]\b' llvm/cmake/config-ix.cmake); do echo $f $(git grep $f '' | wc -l); done in the monorepo. llvm-svn: 328957	2018-04-02 01:46:08 +00:00
Craig Topper	caec723a1a	[X86] Add an itinerary to BTR64rr. llvm-svn: 328956	2018-04-02 01:12:34 +00:00
Craig Topper	02daec00a2	[X86] Make sure all the classes declared in the Haswell scheduler model are prefixed with HW. The tablegen files all share a namespace so we shouldn't use generic names in a specific scheduler model. llvm-svn: 328955	2018-04-02 01:12:32 +00:00
Craig Topper	c90d906b16	[X86] Give VINSERTPS the same itinerary as INSERTPS. llvm-svn: 328954	2018-04-02 00:48:11 +00:00
Harlan Haskins	b7881bbfa2	Add C API bindings for DIBuilder 'Type' APIs This patch adds a set of unstable C API bindings to the DIBuilder interface for creating structure, function, and aggregate types. This patch also removes the existing implementations of these functions from the Go bindings and updates the Go API to fit the new C APIs. llvm-svn: 328953	2018-04-02 00:17:40 +00:00
Craig Topper	dc4a6d1ef6	[X86] Cleanup ADCX/ADOX instruction definitions. Give them both the same itineraries. Add hasSideEffects = 0 to ADOX since they don't have patterns. Rename source operands to $src1 and $src2 instead of $src0 and $src. Add ReadAfterLd to the memory form SchedRW. llvm-svn: 328952	2018-04-01 23:58:50 +00:00
Petr Hosek	934e5d5436	[AArch64] Reserve x18 register on Fuchsia This register is reserved as a platform register on Fuchsia. Differential Revision: https://reviews.llvm.org/D45105 llvm-svn: 328950	2018-04-01 23:44:04 +00:00
Craig Topper	8a1787ae22	[DebugCounter] Make -debug-counter cl::Hidden. llvm-svn: 328948	2018-04-01 22:16:52 +00:00
Craig Topper	f5730c38e9	[LegacyPassManager] Make 'print-module-scope' cl::Hidden like the rest of the printing options. llvm-svn: 328947	2018-04-01 21:54:26 +00:00
Craig Topper	9f834810ea	[X86] Give ADC8/16/32/64mi the same scheduling information as ADC8/16/32/64mr and SBB8/16/32/64mi. It doesn't make a lot of sense that it would be different. llvm-svn: 328946	2018-04-01 21:54:24 +00:00
Chandler Carruth	4244625c51	[x86] Correct the operand structure of the ADOX instruction. This also moves to define it in the same way as ADCX which seems to use constraints a bit better. This is pulled out of the review for reducing the use of popf for restoring EFLAGS, but is independent. There are still more problems with our definitions for these instructions that Craig is going to look at but this is at least less broken and he can start from this to improve them more fully. Thanks to Craig for the review here. llvm-svn: 328945	2018-04-01 21:53:18 +00:00
Chandler Carruth	06b343c6ed	[x86] Expose more of the condition conversion routines in the public API for X86's instruction information. I've now got a second patch under review that needs these same APIs. This bit is nicely orthogonal and obvious, so landing it. NFC. llvm-svn: 328944	2018-04-01 21:47:55 +00:00
Mandeep Singh Grang	8db564e033	[tools] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused by the undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort with llvm::sort. Refer to the comments section in D44363 for a list of all the required patches. Reviewers: JDevlieghere, zturner, echristo, dberris, friss Reviewed By: echristo Subscribers: gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D45141 llvm-svn: 328943	2018-04-01 21:24:53 +00:00
Mandeep Singh Grang	ba8033be4d	[include] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused by the undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort with llvm::sort. Refer to the comments section in D44363 for a list of all the required patches. Reviewers: echristo, zturner, mzolotukhin, lhames Reviewed By: echristo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45135 llvm-svn: 328940	2018-04-01 18:39:50 +00:00
Nicolai Haehnle	4254d45a79	AMDGPU: Make isIntrinsicSourceOfDivergence table-driven Summary: This is in preparation for the new dimension-aware image intrinsics, which I'd rather not have to list here by hand. Change-Id: Iaa16e3a635a11283918ce0d9e1e618591b0bf6fa Reviewers: arsenm, rampitec, b-sumner Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D44938 llvm-svn: 328939	2018-04-01 17:09:14 +00:00
Nicolai Haehnle	5d0d30304c	AMDGPU: Make getTgtMemIntrinsic table-driven for resource-based intrinsics Summary: Avoids having to list all intrinsics manually. This is in preparation for the new dimension-aware image intrinsics, which I'd rather not have to list here by hand. Change-Id: If7ced04998397ef68c4cb8f7de66b5050fb767e5 Reviewers: arsenm, rampitec, b-sumner Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D44937 llvm-svn: 328938	2018-04-01 17:09:07 +00:00
Nicolai Haehnle	398c0b6701	TableGen: Support Intrinsic values in SearchableTable Summary: We will use this in the AMDGPU backend in a subsequent patch in the stack to lookup target-specific per-intrinsic information. The generic CodeGenIntrinsic machinery is used to ensure that, even though we don't calculate actual enum values here, we do get the intrinsics in the right order for the binary search index. Change-Id: If61cd5587963a4c5a1cc53df1e59c5e4dec1f9dc Reviewers: arsenm, rampitec, b-sumner Subscribers: wdng, tpr, llvm-commits Differential Revision: https://reviews.llvm.org/D44935 llvm-svn: 328937	2018-04-01 17:08:58 +00:00
Nicolai Haehnle	24e3a4d6e9	TableGen: More helpful error messages Summary: Change-Id: I3c23f6f6597912423762780cd8c5315870412bbe Reviewers: arsenm, rampitec, b-sumner Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D44936 Change-Id: Ie62614a3e2d7774f46e4034478b28f57100a2c92 llvm-svn: 328936	2018-04-01 17:08:49 +00:00
Mandeep Singh Grang	fe1d28e83d	[DebugInfo] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused by the undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort with llvm::sort. Refer to the comments section in D44363 for a list of all the required patches. Reviewers: echristo, zturner, samsonov Reviewed By: echristo Subscribers: JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D45134 llvm-svn: 328935	2018-04-01 16:18:49 +00:00
Teresa Johnson	974706ebf7	[ThinLTO] Add an import cutoff for debugging/triaging Summary: Adds -import-cutoff=N which will stop importing during the thin link after N imports. Default is -1 (no limit). Reviewers: wmi Subscribers: inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D45127 llvm-svn: 328934	2018-04-01 15:54:40 +00:00
David Green	f80ebc8d21	[LoopRotate] Rotate loops with loop exiting latches If a loop has a loop exiting latch, it can be profitable to rotate the loop if it leads to the simplification of a phi node. Perform rotation in these cases even if loop rotate itself didn't simplify the loop to get there. Differential Revision: https://reviews.llvm.org/D44199 llvm-svn: 328933	2018-04-01 12:48:24 +00:00
Craig Topper	9b8cd5fe55	[X86] Don't check for folding into a store when deciding if we can promote an i16 mul. There's no RMW mul operation. llvm-svn: 328931	2018-04-01 06:29:32 +00:00
Craig Topper	db6caabccc	[X86] Check if the load and store are to the same pointer before preventing i16 RMW shifts and subtracts from being promoted. llvm-svn: 328930	2018-04-01 06:29:28 +00:00
Craig Topper	3998041e80	[X86] Add test case to show failure to promote i16 subtract when the LHS is a load and the result is stored to a different address. We mistakenly believe we might be able to fold this as a RMW operation, but that doesn't end up happening. llvm-svn: 328929	2018-04-01 06:29:27 +00:00
Craig Topper	ae2de57db0	[X86] Allow i16 subtracts to be promoted if the load is on the LHS and it's not being stored. llvm-svn: 328928	2018-04-01 06:29:25 +00:00
Craig Topper	280f631350	[X86] Add test case to show failure to promote i16 subtract because we mistakenly believe the load can be folded. NFC The left hand side of the subtract is a load, but we can't fold those unless we also have a store. llvm-svn: 328927	2018-04-01 06:29:23 +00:00
Craig Topper	9bc0d881a3	[X86] Remove unneeded temporary variable. NFC This Promote flag was always set to true except in the default case. But in the default case we don't need to set PVT and can just return false. llvm-svn: 328926	2018-04-01 06:29:21 +00:00
Mandeep Singh Grang	97bcade70f	[Analysis] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused by the undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort with llvm::sort. Refer to D44363 for a list of all the required patches. Reviewers: sanjoy, dexonsmith, hfinkel, RKSimon Reviewed By: dexonsmith Subscribers: david2050, llvm-commits Differential Revision: https://reviews.llvm.org/D44944 llvm-svn: 328925	2018-04-01 01:46:51 +00:00
Sanjay Patel	6124cae8f7	[DAGCombine] (float)((int) f) --> ftrunc (PR36617) fptosi / fptoui round towards zero, and that's the same behavior as ISD::FTRUNC, so replace a pair of casts with the equivalent node. We don't have to account for special cases (NaN, INF) because out-of-range casts are undefined. Differential Revision: https://reviews.llvm.org/D44909 llvm-svn: 328921	2018-03-31 17:55:44 +00:00
Lang Hames	9c755450ef	[llvm-rtdyld] Fix the InputFileList cl::opt description: it accepts multiple input files. llvm-svn: 328920	2018-03-31 16:01:01 +00:00
Simon Pilgrim	3b8ad346f9	[X86][Btver2] Add MMX_PSHUFB to the JWritePSHUFB InstRW entries llvm-svn: 328918	2018-03-31 09:15:54 +00:00
Simon Pilgrim	8c8ebd7945	Fix trailing whitespace. NFCI. llvm-svn: 328917	2018-03-31 09:14:14 +00:00
Benjamin Kramer	824f36edff	Unbreak the build of the go bindings after r328839. llvm-svn: 328916	2018-03-31 07:41:25 +00:00
Puyan Lotfi	57c4f38c35	[MIR-Canon] Adding support for local idempotent instruction hoisting. llvm-svn: 328915	2018-03-31 05:48:51 +00:00
Craig Topper	13a0f83a05	[X86] Add SchedRW for PMULLD Summary: It seems many CPUs don't implement this instruction as well as the other vector multiplies. Often using a multi uop flow. Silvermont in particular has a 7 uop flow with 11 cycle throughput. Sandy Bridge implements it as a single uop with 5 cycle latency and 1 cycle throughput. But Haswell and later use 2 uops with 10 cycle latency and 2 cycle throughput. This patch adds a new X86SchedWritePair we can use to tag this instruction separately. I've provided correct information for Silvermont, Btver2, and Sandy Bridge. I've removed the InstRWs for SandyBridge. I've left Haswell/Broadwell/Skylake InstRWs in place because I wasn't sure how to account for the different load latency between 128 and 256 bits. I also left Znver1 InstRWs in place because the existing values don't match Agner's spreadsheet. I also left a FIXME in the SandyBridge model because it being used for the "generic" model is too optimistic for the 256/512-bit versions since those are multiple uops on all known CPUs. Reviewers: RKSimon, GGanesh, courbet Reviewed By: RKSimon Subscribers: gchatelet, gbedwell, andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D44972 llvm-svn: 328914	2018-03-31 04:54:32 +00:00
Teresa Johnson	db83aceb06	[ThinLTO] Add an option to force summary call edges cold for debugging Summary: Useful to selectively disable importing into specific modules for debugging/triaging/workarounds. Reviewers: eraman Subscribers: inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D45062 llvm-svn: 328909	2018-03-31 00:18:08 +00:00
Fangrui Song	956ee79795	Fix a bunch of typos. NFC llvm-svn: 328907	2018-03-30 22:22:31 +00:00
Ekaterina Romanova	0b01dfbba6	Prevent data races in concurrent ThinLTO processes. Make sure ThinLTO with caching doesn't use non-atomic writes to the cache file (to prevent data races and cache file corruption). 1. Place the temp file in the same place as the caching directory (instead of creating it in the directory pointed to by the TMP/TEMP variable). This will help to prevent using non-atomic rename and falling back to non-atomic "direct" write to the cache file. 2. If rename failed, do not write to the cache file directly (direct write to the file is non-atomic and could cause data race conditions). 3. If the cache file doesn't exist (e.g., because 'rename' failed or for some other reason), bypass using the cache altogether. Differential Revision: https://reviews.llvm.org/D45076 llvm-svn: 328904	2018-03-30 21:35:42 +00:00
Jacob Gravelle	40926451d2	[WebAssembly] Register wasm passes with the PassRegistry Summary: This exposes WebAssembly passes for use on the command line (as arguments to -print-before and the like). Reviewers: dschuff, sunfish Subscribers: MatzeB, jfb, sbc100, llvm-commits, aheejin Differential Revision: https://reviews.llvm.org/D45103 llvm-svn: 328901	2018-03-30 20:36:58 +00:00
Krzysztof Parzyszek	526fbf8e33	[Hexagon] Fix testcase llvm-svn: 328899	2018-03-30 19:46:28 +00:00
Krzysztof Parzyszek	74096f7258	[Hexagon] Reduce excessive indentation in .s output llvm-svn: 328898	2018-03-30 19:30:28 +00:00
Krzysztof Parzyszek	0f983d69a4	[Hexagon] Avoid creating invalid offsets in packetizer Two memory instructions with a dependency only on the address register between the two (the first one of them being post-increment) can be packetized together after the offset on the second was updated to the increment value. Make sure that the new offset is valid for the instruction. llvm-svn: 328897	2018-03-30 19:28:37 +00:00
Andrea Di Biagio	dc97172b2f	[X86][BtVer2] Fixed the number of micro opcodes for AVX vector converts and VSQRT instructions. There were still a few AVX instructions with an incorrect number of opcodes. These should be fixed now. llvm-svn: 328892	2018-03-30 18:53:47 +00:00
Peter Collingbourne	d03bf12c1b	DataFlowSanitizer: wrappers of functions with local linkage should have the same linkage as the function being wrapped This patch resolves link errors when the address of a static function is taken, and that function is uninstrumented by DFSan. This change resolves bug 36314. Patch by Sam Kerner! Differential Revision: https://reviews.llvm.org/D44784 llvm-svn: 328890	2018-03-30 18:37:55 +00:00
Puyan Lotfi	399b46c98d	[MIR] Adding support for Named Virtual Registers in MIR. llvm-svn: 328887	2018-03-30 18:15:54 +00:00
Andrea Di Biagio	3eaa26bb64	[X86][BtVer2] Fix the number of uOps for horizontal operations. llvm-svn: 328886	2018-03-30 18:15:30 +00:00
Tim Shen	8f9f026965	[NVPTX] Enable StructuredCFG for NVPTX Summary: Make NVPTX require structured CFG. Added a temporary flag to "roll back" the behavior for easy deployment. Combined with D45008, this fixes several internal Nvidia GPU test failures that we suspect to be ptxas miscompiles (PR27738). Reviewers: jlebar Subscribers: jholewinski, sanjoy, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D45070 llvm-svn: 328885	2018-03-30 17:51:03 +00:00
Tim Shen	1a8c6776a3	[BlockPlacement] Disable block placement tail duplication in structured CFG. Summary: Tail duplication easily breaks the structure of CFG, e.g. duplicating on a region entry. If the structure is intended to be preserved, then we may want to configure tail duplication, or disable it for structured CFG. From our benchmark results disabling it doesn't cause performance regression. Notice that this currently affects AMDGPU backend. In the next patch, I also plan to turn on requiresStructuredCFG for NVPTX. All unit tests still pass. Reviewers: jlebar, arsenm Subscribers: jholewinski, sanjoy, wdng, tpr, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45008 llvm-svn: 328884	2018-03-30 17:51:00 +00:00
Robert Widmann	478fce9ebf	[LLVM-C] Finish exception instruction bindings - Round 2 Summary: Previous revision caused a leak in the echo test that got caught by the ASAN bots because of missing free of the handlers array and was reverted in r328759. Resubmitting the patch with that correction. Add support for cleanupret, catchret, catchpad, cleanuppad and catchswitch and their associated accessors. Test is modified from SimplifyCFG because it contains many diverse usages of these instructions. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: llvm-commits, vlad.tsyrklevich Differential Revision: https://reviews.llvm.org/D45100 llvm-svn: 328883	2018-03-30 17:49:53 +00:00
Zachary Turner	ce5b834abf	Fix some signed / unsigned conversion problems. llvm-svn: 328881	2018-03-30 17:28:35 +00:00
Zachary Turner	d5cf5cf637	[llvm-pdbutil] Dig deeper into the PDB and DBI streams when explaining. This will show more detail when using `llvm-pdbutil explain` on an offset in the DBI or PDB streams. Specifically, it will dig into individual header fields and substreams to give a more precise description of what the byte represents. llvm-svn: 328878	2018-03-30 17:16:50 +00:00
Derek Schuff	a2726e9ab6	[WebAssembly] Refactor tablegen for store instructions (NFC) Summary: Add patterns similar to loads. Differential Revision: https://reviews.llvm.org/D45064 llvm-svn: 328876	2018-03-30 17:02:50 +00:00
Krzysztof Parzyszek	fce30c2ba3	Revert "peel loops with runtime small trip counts" This reverts commit r328854, it breaks some Hexagon tests. llvm-svn: 328875	2018-03-30 16:55:44 +00:00
Stanislav Mekhanoshin	74e2974ac6	[AMDGPU] Fixed some instructions latencies Differential Revision: https://reviews.llvm.org/D45073 llvm-svn: 328874	2018-03-30 16:19:13 +00:00
Sanjay Patel	e09b7dcf3d	[SelectionDAG] Removing FABS folding from DAGCombiner The code has bugs dealing with -0.0. Since D44550 introduced FABS pattern folding in InstCombine, this patch removes the now-redundant code that causes https://bugs.llvm.org/show_bug.cgi?id=36600. Patch by Mikhail Dvoretckii! Differential Revision: https://reviews.llvm.org/D44683 llvm-svn: 328872	2018-03-30 15:42:52 +00:00
Krzysztof Parzyszek	4f99836a9e	[Hexagon] Recognize and handle :endloop01 llvm-svn: 328870	2018-03-30 15:29:47 +00:00
Krzysztof Parzyszek	46abcb236b	[Hexagon] Fix printing :mem_noshuf on compiler-generated packets llvm-svn: 328869	2018-03-30 15:09:05 +00:00
Krzysztof Parzyszek	71731fab24	[Hexagon] Fix flags for store-related intrinsics llvm-svn: 328868	2018-03-30 14:57:01 +00:00
Andrea Di Biagio	073a9d74ca	[X86][BtVer2] Add missing ReadAfterLd to RM variants of AVX horizontal adds and most vector logic instructions. Fixed a few InstRW that forgot to specify a ReadAfterLd for the register input operand. llvm-svn: 328867	2018-03-30 14:48:08 +00:00
Krzysztof Parzyszek	3f55ad8fae	[Hexagon] Remove unused scheduling classes llvm-svn: 328866	2018-03-30 14:34:32 +00:00
Andrea Di Biagio	42d8ea22c0	[X86][BtVer2] Add tests that show how ReadAfterLd is missing for some instructions. In the Btver2 model, there are a few InstRW overrides that don't specify a ReadAfterLd for the register input operand. As a result, a few AVX variants of horizontal operations and most vector logic operations with a folded memory operand don't have a ReadAdvance info associated to their input register operands. llvm-svn: 328865	2018-03-30 14:29:33 +00:00
Krzysztof Parzyszek	1ca23d9837	[Hexagon] Pass pointer to SelectionDAG to dump functions llvm-svn: 328864	2018-03-30 14:29:15 +00:00
Andrea Di Biagio	01043625cf	[X86] Add llvm-mca tests for r328834. Verify that the ReadAfterLd is correctly applied to FMA and 4-ops variable blend instructions. As Craig pointed out in D44726, some Intel models still have to be fixed. llvm-svn: 328861	2018-03-30 13:38:37 +00:00
Andrea Di Biagio	0823090843	[X86] Add tests to verify the presence of "ReadAfterLd" after r328823. This change adds a couple of tests to verify the change introduced by revision 328823 ([X86] Correct the placement of ReadAfterLd in BEXTR and BZHI). llvm-svn: 328859	2018-03-30 11:44:48 +00:00
Vlad Tsyrklevich	894c028d56	Revert "[LLVM-C] Finish exception instruction bindings" This reverts commit r328759. It was causing LSan failures on sanitizer-x86_64-linux-bootstrap llvm-svn: 328858	2018-03-30 06:21:28 +00:00
Michael Bedy	59e5ef793c	[AMDGPU] Fix the SDWA Peephole phase to handle src for dst:UNUSED_PRESERVE. Summary: The phase attempts to transform operations that extract a portion of a value into an SDWA src operand in cases where that value is used only once. It was not prepared for this use to be the preserved portion of a value for dst:UNUSED_PRESERVE, resulting in a crash or assert. This change either rejects the illegal SDWA attempt, or in the case where dst:WORD_1 and the src_sel would be WORD_0, removes the unneeded extract instruction. Reviewers: arsenm, #amdgpu Reviewed By: arsenm, #amdgpu Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D44364 llvm-svn: 328856	2018-03-30 05:03:36 +00:00
Ikhlas Ajbar	a7d614c3e5	[Hexagon] add missing lit config file llvm-svn: 328855	2018-03-30 03:32:24 +00:00
Ikhlas Ajbar	66c8ba5a50	peel loops with runtime small trip counts For Hexagon, peeling loops with small runtime trip count is beneficial for our benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set by the target for computing the desired peel count. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 328854	2018-03-30 03:05:34 +00:00
Eli Friedman	208fe67a78	[MachineCopyPropagation] Handle COPY with overlapping source/dest. MachineCopyPropagation::CopyPropagateBlock has a bunch of special handling for COPY instructions. This handling assumes that COPY instructions do not modify the source of the copy; this is wrong if the COPY destination overlaps the source. To fix the bug, check explicitly for this situation, and fall back to the generic instruction handling. This bug can't happen for most register classes because they don't have this sort of overlap, but there are a few register classes where this is possible. The testcase uses the AArch64 QQQQ register class. Differential Revision: https://reviews.llvm.org/D44911 llvm-svn: 328851	2018-03-30 00:56:03 +00:00
Eugene Zelenko	7fb5d41e44	[IR] Fix some Clang-tidy modernize-use-auto warnings; other minor fixes (NFC). llvm-svn: 328850	2018-03-30 00:47:31 +00:00
Rafael Espindola	4b4d85fd4d	Style update. NFC. Rename 3 functions to start with lowercase letters. Don't repeat the name in the comments. llvm-svn: 328848	2018-03-29 23:32:54 +00:00
David Blaikie	f423062aff	Fix some layering in StripNonLineTableDebugInfo, moving its declaration from IPO.h to Utils.h to match its implementation llvm-svn: 328844	2018-03-29 22:42:08 +00:00
David Blaikie	7883340331	Remove unused header to fix layering. llvm-svn: 328842	2018-03-29 22:35:59 +00:00
David Blaikie	4778bb88ef	Remove unused headers to fix layering llvm-svn: 328840	2018-03-29 22:31:39 +00:00
David Blaikie	c90289b5d3	llvm-c: Split Utils out of Scalar.h To fix layering (so that Scalar.h, a libScalarOpts header, isn't included from Utils - which libScalarOpts depends on). llvm-svn: 328839	2018-03-29 22:31:38 +00:00
David Blaikie	bd0c88078a	Remove some unneeded #includes to fix layering llvm-svn: 328838	2018-03-29 22:31:36 +00:00
Craig Topper	ee3c19fd7f	[X86] Add ReadAfterLds to some 3 src instructions Sometimes the operand comes after the memory operand so we need 5 ReadDefaults first. I suspect we also need to do something for the mask operand for masked avx512 instructions? I'm not sure if the mask should be ReadAfterLd or not since it can mask faults. If it shouldn't be ReadAfterLd then we're probably wrong for zero masking instructions already. Differential Revision: https://reviews.llvm.org/D44726 llvm-svn: 328834	2018-03-29 22:03:05 +00:00
Eric Christopher	dd4baff48d	Typo fix: epilouge->epilogue. NFC. llvm-svn: 328833	2018-03-29 21:59:04 +00:00
Matt Arsenault	efd1b30436	AMDGPU: Fix build warning in release llvm-svn: 328832	2018-03-29 21:44:44 +00:00
Matt Arsenault	03ae399d50	AMDGPU: Support realigning stack While the stack access instructions don't care about alignment > 4, some transformations on the pointer calculation do make assumptions based on knowing the low bits of a pointer are 0. If a stack object ends up being accessed through its absolute address (relative to the kernel scratch wave offset), the addressing expression may depend on the stack frame being properly aligned. This was breaking in a testcase due to the add->or combine. I think some of the SP/FP handling logic is still backwards, and overly simplistic to support all of the stack features. Code which tries to modify the SP with inline asm for example or variable sized objects will probably require redoing this. llvm-svn: 328831	2018-03-29 21:30:06 +00:00
Evgeniy Stepanov	50635dab26	Add msan custom mapping options. Similarly to https://reviews.llvm.org/D18865 this adds options to provide custom mapping for msan. As discussed in http://lists.llvm.org/pipermail/llvm-dev/2018-February/121339.html Patch by vit9696(at)avp.su. Differential Revision: https://reviews.llvm.org/D44926 llvm-svn: 328830	2018-03-29 21:18:17 +00:00
Craig Topper	3f2dbec652	[X86] Remove ReadAfterLd from BMI and TBM instructions that don't have a register operand in their memory form The memory form of these instructions only read an input from memory. They don't have any register operands. Differential Revision: https://reviews.llvm.org/D44836 llvm-svn: 328828	2018-03-29 21:03:53 +00:00
Kevin Enderby	4129308413	Try to fix sanitizer-x86_64-linux-fast bot due to change in r328820. llvm-svn: 328824	2018-03-29 20:49:24 +00:00
Craig Topper	89310f56c8	[X86] Correct the placement of ReadAfterLd in BEXTR and BZHI. Add dedicated SchedRW for BEXTR/BZHI. These instructions have the memory operand before the register operand. So we need to put ReadDefault for all the load ops first. Then the ReadAfterLd Differential Revision: https://reviews.llvm.org/D44838 llvm-svn: 328823	2018-03-29 20:41:39 +00:00
Philip Reames	5c14ed89f6	[NFC][LICM] Rearrange checks to have the cheap bail out first llvm-svn: 328822	2018-03-29 20:32:15 +00:00
Matt Arsenault	ffb132e74b	AMDGPU: Increase default stack alignment 8 and 16-byte values are common, so increase the default alignment to avoid realigning the stack in most functions. llvm-svn: 328821	2018-03-29 20:22:04 +00:00
Kevin Enderby	d9911f6f7b	For llvm-nm and Mach-O files that are fully stripped, special case a redacted LC_MAIN As a further refinement on: r328274 - For llvm-nm and Mach-O files also use function starts info in some cases when printing symbols we want to special case a redacted LC_MAIN so it is easier to find. rdar://38978929 llvm-svn: 328820	2018-03-29 20:04:29 +00:00
Matt Arsenault	6c041a3cab	AMDGPU: Fix selection error on constant loads with < 4 byte alignment llvm-svn: 328818	2018-03-29 19:59:28 +00:00
Philip Reames	e4b728e82b	Fix an accidental circular dependence llvm-svn: 328816	2018-03-29 19:22:12 +00:00
Mandeep Singh Grang	10d8b85570	[Mips] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused by the undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer to the comments section in D44363 for a list of all the required patches. Reviewers: sdardis, RKSimon, dsanders, atanasyan Reviewed By: atanasyan Subscribers: atanasyan, arichardson, llvm-commits Differential Revision: https://reviews.llvm.org/D44869 llvm-svn: 328815	2018-03-29 19:05:26 +00:00
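The shuffle-before-sort idea in this message can be sketched in a few lines. This is not the `llvm::sort` implementation: `Item`, `keyLess`, and `shuffledSort` are hypothetical names, and the explicit seed is an assumption made so the sketch is reproducible.

```cpp
#include <algorithm>
#include <random>
#include <utility>
#include <vector>

using Item = std::pair<int, char>; // (key, payload)

// Compares keys only, so items with equal keys have no defined relative order.
inline bool keyLess(const Item &A, const Item &B) { return A.first < B.first; }

// Hypothetical stand-in for a shuffling sort wrapper: shuffle the input
// first, then sort it. Code that silently relies on the relative order of
// equal-key items will now see that order vary with the seed.
inline std::vector<Item> shuffledSort(std::vector<Item> Items, unsigned Seed) {
  std::mt19937 Rng(Seed);
  std::shuffle(Items.begin(), Items.end(), Rng);
  std::sort(Items.begin(), Items.end(), keyLess);
  return Items;
}
```

The output is always sorted by key. Only the order of the payloads within a run of equal keys may change between seeds, which is exactly the hidden dependency such a wrapper is meant to surface.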
Paul Robinson	407ff1b1cd	Try to fix a couple tests for Windows. llvm-svn: 328814	2018-03-29 18:59:33 +00:00
Dinar Temirbulatov	c326c1c582	[SLPVectorizer] Add tests related to PR30787, NFCI. llvm-svn: 328813	2018-03-29 18:57:03 +00:00
Zachary Turner	3203e27473	[MSF] Default to FPM2, and always mark FPM pages allocated. There are two FPMs in an MSF file, the idea being that for incremental updates you can write to the alternate one and then atomically swap them on commit. LLVM defaulted to using FPM1 on the first commit, but this differs from Microsoft's behavior which is to default to using FPM2 on the first commit. To eliminate some byte-level file differences, this patch changes LLVM's default to also be FPM2. Additionally, LLVM was trying to be "smart" about marking FPM pages allocated. In addition to marking every page belonging to the alternate FPM as unallocated, LLVM also marked pages at the end of the main FPM which were not needed as unallocated. In order to match the behavior of Microsoft-generated PDBs, we now always mark every FPM block as allocated, regardless of whether it is in the main FPM or the alt FPM, and regardless of whether or not it describes blocks which are actually in the file. This has the side benefit of simplifying our code. llvm-svn: 328812	2018-03-29 18:34:15 +00:00
Zachary Turner	f4b6dcf6af	[PDB] Print some more details when explaining MSF fields. When we determine that a field belongs to an MSF super block or the free page map, we wouldn't print any additional information. With this patch, we now print the value of the field (for super block fields) or the allocation status of the specified byte (in the case of offsets in the FPM). llvm-svn: 328808	2018-03-29 17:45:34 +00:00
Craig Topper	2fa1436206	[IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to CodeGen layer. Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it. The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there are only a few primitive types, we can just print them as strings directly. Differential Revision: https://reviews.llvm.org/D45017 llvm-svn: 328806	2018-03-29 17:21:10 +00:00
Paul Robinson	b271f31d8d	Reapply "[DWARFv5] Emit file 0 to the line table." DWARF v5 specifies that the root file (also given in the DW_AT_name attribute of the compilation unit DIE) should be emitted explicitly to the line table's list of files. This makes the line table more independent of the .debug_info section. We emit the new syntax only for DWARF v5 and later. Fixes the bug found by asan. Also XFAIL the new test for Darwin, which is stuck on DWARF v2, and fix up other tests so they stop failing on Windows. Last but not least, don't break "clang -g" of an assembler file that has .file directives in it. Differential Revision: https://reviews.llvm.org/D44054 llvm-svn: 328805	2018-03-29 17:16:41 +00:00
Zachary Turner	1b20416bfa	[PDB] Fix a bug in the explain subcommand. We were trying to dig into the super block fields and print a description of the field at the specified offset, but we were printing the wrong field due to an off-by-one-field-error. llvm-svn: 328804	2018-03-29 17:11:14 +00:00
David Zarzycki	b458329327	[ADT] NFC: Fix bogus StringSwitch rule-of-five boilerplate Now that 'Str' is constant, the rule-of-five logic needs updating. Reported by: vit9696@avp.su Reviewed by: jordan_rose@apple.com llvm-svn: 328803	2018-03-29 16:51:28 +00:00
Zachary Turner	db0f2f68b0	Remove unused function. llvm-svn: 328802	2018-03-29 16:46:47 +00:00
Zachary Turner	ea40f40e1b	[PDB] Add an explain subcommand. When investigating various things, we often have a file offset and want to know what's in the PDB at that address. For example we may be doing a binary comparison of two LLD-generated PDBs to look for sources of non-determinism, or we may wish to compare an LLD-generated PDB with a Microsoft generated PDB for sources of byte-for-byte incompatibility. In these cases, we can do a binary diff of the two files, and once we find a mismatched byte we can use explain to figure out what that byte is, immediately honing in on the problem. This patch implements this by trying to narrow the meaning of a particular file offset down as much as possible. Differential Revision: https://reviews.llvm.org/D44959 llvm-svn: 328799	2018-03-29 16:28:20 +00:00
Haicheng Wu	c7cc87922e	[JumpThreading] Don't select an edge that we know we can't thread In r312664 (D36404), JumpThreading stopped threading edges into loop headers. Unfortunately, I observed a significant performance regression as a result of this change. Upon further investigation, the problematic pattern looked something like this (after many high level optimizations): while (true) { bool cond = ...; if (!cond) { <body> } if (cond) break; } Now, naturally we want jump threading to essentially eliminate the second if check and hook up the edges appropriately. However, the above-mentioned change prevented it from doing this because it would have to thread an edge into the loop header. Upon further investigation, what is happening is that since both branches are threadable, JumpThreading picks one of them arbitrarily. In my case, because of the way that the IR ended up, it tended to pick the one to the loop header, bailing out immediately after. However, if it had picked the one to the exit block, everything would have worked out fine (because the only remaining branch would then be folded, not threaded, which is acceptable). Thus, to fix this problem, we can simply eliminate loop headers from consideration as possible threading targets earlier, to make sure that if there are multiple eligible branches, we can still thread one of the ones that don't target a loop header. Patch by Keno Fischer! Differential Revision: https://reviews.llvm.org/D42260 llvm-svn: 328798	2018-03-29 16:01:26 +00:00
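The loop pattern in this message can be written out at source level to show what the desired threading amounts to. This is only a sketch: `loopBefore` and `loopAfter` are hypothetical functions, the condition `i >= limit` and the body are placeholders, and the real transformation operates on LLVM IR rather than C++.

```cpp
// Shape of the problematic pattern: cond is tested twice per iteration.
inline int loopBefore(int limit) {
  int i = 0, bodyRuns = 0;
  while (true) {
    bool cond = i >= limit;
    if (!cond) { // <body>
      ++bodyRuns;
      ++i;
    }
    if (cond)
      break;
  }
  return bodyRuns;
}

// Shape after the edge to the exit block is threaded: the cond == true path
// leaves the loop directly, and the remaining path runs the body and falls
// through to the loop header without a second test.
inline int loopAfter(int limit) {
  int i = 0, bodyRuns = 0;
  while (true) {
    bool cond = i >= limit;
    if (cond)
      break;
    ++bodyRuns; // <body>
    ++i;
  }
  return bodyRuns;
}
```

Both functions run the body the same number of times for any `limit`; the second form simply has one conditional branch per iteration instead of two.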
Pavel Labath	ea0f841c3b	.debug_names: Correctly align the AugmentationStringSize field We should align the value of the field, not the overall section offset. This distinction matters if one of the debug_names contributions has a size that is not a multiple of four. The dwarf producers may choose to emit rounded contributions, but they are not required to do so. In the latter case, without this patch we would corrupt the parsing state, as we would adjust the offset even if subsequent contributions contained correctly rounded augmentation strings. llvm-svn: 328796	2018-03-29 15:12:45 +00:00
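The difference between aligning the field value and aligning the section offset can be shown with plain arithmetic. The sketch below is not the LLVM parser: `alignTo`, `skipByAlignedSize`, and `skipByAlignedOffset` are hypothetical helpers, and the offsets used in the usage note are made-up numbers chosen to show the divergence.

```cpp
#include <cassert>
#include <cstdint>

// Rounds Value up to the next multiple of Align (Align must be nonzero).
inline std::uint64_t alignTo(std::uint64_t Value, std::uint64_t Align) {
  assert(Align != 0 && "alignment must be nonzero");
  return (Value + Align - 1) / Align * Align;
}

// Approach described as correct: round the string size up to a multiple of
// four, then advance the offset by that rounded size.
inline std::uint64_t skipByAlignedSize(std::uint64_t Offset,
                                       std::uint64_t StringSize) {
  return Offset + alignTo(StringSize, 4);
}

// Approach described as buggy: advance by the raw size, then round the
// overall section offset up to a multiple of four.
inline std::uint64_t skipByAlignedOffset(std::uint64_t Offset,
                                         std::uint64_t StringSize) {
  return alignTo(Offset + StringSize, 4);
}
```

When the string starts at an offset that is a multiple of four (say 36, with a 5-byte string), both helpers return 44. When an earlier contribution has left the offset at 38, the first returns 46 and the second returns 44, so the second would resume parsing two bytes too early.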
Andrea Di Biagio	0a837ef6b1	[llvm-mca] Correctly set the ReadAdvance information for register use operands. The tool was passing the wrong operand index to method MCSubtargetInfo::getReadAdvanceCycles(). That method requires a "UseIdx", and not the operand index. This was found when testing X86 code where instructions had a memory folded operand. This patch fixes the issue and adds test read-advance-1.s to ensure that the ReadAfterLd (a ReadAdvance of 3cy) information is correctly used. llvm-svn: 328790	2018-03-29 14:26:56 +00:00
Krzysztof Parzyszek	dc7a557e6a	[Hexagon] Add support to handle bit-reverse load intrinsics Patch by Sumanth Gundapaneni. llvm-svn: 328774	2018-03-29 13:52:46 +00:00
Pavel Labath	2d1fc4375f	.debug_names: Parse DW_IDX_die_offset as a reference Before this patch we were parsing the attributes as section offsets, as that is what apple_names is doing. However, this is not correct as DWARF v5 specifies that this attribute should use the Reference form class. This also updates all the testcases (except the ones that deliberately pass a different form) to use the correct form class. llvm-svn: 328773	2018-03-29 13:47:57 +00:00
Sjoerd Meijer	4f8f1e5115	[Kaleidoscope] Tiny typo fixes Fixes for "lets" references which should be "let's" in the Kaleidoscope tutorial. Patch by: Robin Dupret Differential Revision: https://reviews.llvm.org/D44990 llvm-svn: 328772	2018-03-29 12:31:06 +00:00
Simon Pilgrim	71c5f3fffd	[X86][SSE] Don't bother re-adding combined target shuffles to the work list We are re-adding all the bitcasts, constant masks and target shuffles to the work list for no apparent gain. Found while investigating adding SimplifyDemandedVectorElts to target shuffles. Differential Revision: https://reviews.llvm.org/D44942 llvm-svn: 328771	2018-03-29 11:18:41 +00:00
Sylvestre Ledru	f22ebb7599	Rename llvm library from libLLVM-X.Y to libLLVM-X Summary: As we are only doing X.0.Z releases (not using the minor version), there is no need to keep -X.Y in the version. Like patch https://reviews.llvm.org/D41808, I propose that we rename libLLVM-7.0svn.so to libLLVM-7svn.so This patch will also rename downstream libraries like liblldb-7.0 to liblldb-7 Reviewers: axw, beanz, dim, hans Reviewed By: dim, hans Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D41869 llvm-svn: 328768	2018-03-29 09:44:09 +00:00
Simon Dardis	32a27fc77a	[Mips] Remove dead code I believe the role of ehDataReg has been replaced by MipsABIInfo::GetEhDataReg, thus removing the dead code. Patch By: Wei-Ren Chen. Reviewers: ehostunreach, sdardis Differential Revision: https://reviews.llvm.org/D44867 llvm-svn: 328767	2018-03-29 09:21:20 +00:00
David Green	b0aa36f9c2	[LoopRotate] Restructuring LoopRotation.cpp to create Loop Rotation Pass with Loop Rotation Utility Interface The existing LoopRotation.cpp is implemented as one of loop passes instead of being a utility. The user cannot easily perform the loop rotation selectively (or on demand) under different optimization levels. For example, the loop rotation is needed as part of the logic to convert a loop into a loop with bottom test for a transformation. If the loop rotation is simply added as a loop pass before the transformation, the pass is skipped if it is compiled at -O0 or if it is explicitly disabled by the user, causing the compiler to generate incorrect code. Furthermore, as a loop pass it will rotate all loops instead of just the relevant loops. We provide a utility interface for the loop rotation so that the loop rotation can be called on demand. The changeset is as follows: - Create a new file lib/Transforms/Utils/LoopRotationUtils.cpp and move the main implementation of class LoopRotate into this file. - Create a new file llvm/include/Transform/Utils/LoopRotationUtils.h with the interface LoopRotation(...). - Original LoopRotation.cpp is changed to use the utility function LoopRotation in LoopRotationUtils.cpp. This is done in the same way community did for mem-to-reg implementation. Patch by Jin Lin! Differential Revision: https://reviews.llvm.org/D44595 llvm-svn: 328766	2018-03-29 08:48:15 +00:00
Benjamin Kramer	6b995a4a7e	[Transforms] Make sure to include the c binding header when defining c binding functions Otherwise the definitions can't see the extern C declarations and get name mangled, making it impossible for users to call them. This breaks the Go bindings. llvm-svn: 328765	2018-03-29 07:56:53 +00:00
Max Kazantsev	18f93894db	[NFC] Fix meaningless assert in SCEV llvm-svn: 328764	2018-03-29 07:54:59 +00:00
Craig Topper	a21758fa2c	[X86] Don't pass getRegisterName from the InstPrinters into EmitAnyX86InstComments. Just always use the function from the ATTPrinter. NFC The IntelPrinter and the ATTPrinter produce the same strings for the same input. We already use the ATTPrinter explicitly in several other places. llvm-svn: 328762	2018-03-29 04:14:04 +00:00
Robert Widmann	6775f52fe0	[LLVM-C] Finish exception instruction bindings Summary: Add support for cleanupret, catchret, catchpad, cleanuppad and catchswitch and their associated accessors. Test is modified from SimplifyCFG because it contains many diverse usages of these instructions. Reviewers: whitequark, deadalnix, echristo Reviewed By: echristo Subscribers: llvm-commits, harlanhaskins Differential Revision: https://reviews.llvm.org/D44496 llvm-svn: 328759	2018-03-29 03:43:15 +00:00
Craig Topper	7456af88f4	[X86] Rename RIi64_NOREX tblgen class to just Ii64. Make RIi64 inherit from it. NFC This feels more consistent with the other classes. We don't need to say _NOREX if we didn't start it with an R in the first place. llvm-svn: 328757	2018-03-29 03:14:57 +00:00
Craig Topper	7441ffff84	[X86] Cleanup inheritance of the X86InstrFormats.td classes. NFC EVEX shouldn't inherit from VEX and EVEX_4V shouldn't inherit from VEX_4V. llvm-svn: 328756	2018-03-29 03:14:56 +00:00
George Burgess IV	af0b06f4fd	[MemorySSA] Turn an assert into a condition Eli pointed out that variadic functions are totally a thing, so this assert is incorrect. No test-case is provided, since the only way this assert fires is if a specific DenseMap falls back to doing `isEqual` checks, and that seems fairly brittle (and requires a pyramid of growing `call void (i8, ...) @varargs(i8 0)`). llvm-svn: 328755	2018-03-29 03:12:03 +00:00
George Burgess IV	3588fd4865	[MemorySSA] Consider callsite args for hashing and equality. We use a `DenseMap<MemoryLocOrCall, MemlocStackInfo>` to keep track of prior work when optimizing uses in MemorySSA. Because we weren't accounting for callsite arguments in either the hash code or equality tests for `MemoryLocOrCall`s, we optimized uses too aggressively in some rare cases. Fix by Daniel Berlin. Should fix PR36883. llvm-svn: 328748	2018-03-29 00:54:39 +00:00
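The bug class described here, a map key whose hash and equality ignore part of its identity, can be sketched with standard containers. This is not the MemorySSA code: `CallKey`, `CallKeyHash`, and `countDistinct` are hypothetical, and integer IDs stand in for the callee and argument values.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <vector>

// Hypothetical key for a call site: callee ID plus argument IDs.
struct CallKey {
  int Callee;
  std::vector<int> Args;

  // Equality must look at the arguments, not only the callee; otherwise
  // two different calls to the same function collapse into one entry.
  bool operator==(const CallKey &Other) const {
    return Callee == Other.Callee && Args == Other.Args;
  }
};

// The hash mixes in every argument so that it stays consistent with
// operator== above.
struct CallKeyHash {
  std::size_t operator()(const CallKey &K) const {
    std::size_t H = std::hash<int>{}(K.Callee);
    for (int A : K.Args)
      H = H * 31 + std::hash<int>{}(A);
    return H;
  }
};

// Counts how many distinct call sites a list of keys describes.
inline std::size_t countDistinct(const std::vector<CallKey> &Keys) {
  std::unordered_set<CallKey, CallKeyHash> Seen(Keys.begin(), Keys.end());
  return Seen.size();
}
```

Two calls to callee 1 with arguments `{0}` and `{1}` count as two distinct keys here. If `operator==` compared only `Callee`, cached work for one call could be reused for the other, which is the over-aggressive behaviour the message describes.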
David Blaikie	b3f471a4bd	Remove some unused includes to fix layering. llvm-svn: 328745	2018-03-29 00:29:45 +00:00
David Blaikie	3d94e9f320	Split Disassembler.h in two to fix dependencies Support includes this header for the typedefs - but logically it's part of the MC/Disassembler library that implements the functions. Split the header so as not to create a circular dependency. This is another case where probably inverting the llvm-c implementation might be best (rather than core llvm libraries implementing the parts of llvm-c - instead llvm-c could be its own library, depending on all the parts of LLVM's core libraries to then implement llvm-c on top of them... if that makes sense) llvm-svn: 328744	2018-03-29 00:29:44 +00:00
David Blaikie	10f71304b7	Add missing dependency (headers are included from MC, so a link dependency could exist easily enough) llvm-svn: 328743	2018-03-29 00:29:43 +00:00
David Blaikie	8ad9a97310	Plumb useAA through TargetTransformInfo to remove Transforms->CodeGen header dependency Thanks to echristo for the pointers on direction. llvm-svn: 328737	2018-03-28 22:28:50 +00:00
Craig Topper	aac23d7881	[X86][SkylakeServer] Remove checks for 'k', 'z', '_Int' and 'b' from scheduler regexs. Most of these were optional matches at the end of the strings, but since the strings themselves are prefix matches by default you don't need to check for something optional at the end. I've left the 'b' on memory instructions where it means 'broadcast' because I'm not sure those really have the same load latency and we may need to split them explicitly in the future. llvm-svn: 328730	2018-03-28 20:40:24 +00:00
Jun Bum Lim	f90fe701ef	[PostRAMachineSink] preserve CFG Summary: Mark the CFG as preserved since this pass does not make any changes to the CFG. Reviewers: sebpop, mzolotukhin, mcrosier Reviewed By: mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44845 llvm-svn: 328727	2018-03-28 19:56:26 +00:00
Krzysztof Parzyszek	440ba3ae5c	[Hexagon] Add support for "new" circular buffer intrinsics These instructions have been around for a long time, but we haven't supported intrinsics for them. The "new" versions use the CSx register for the start of the buffer instead of the K field in the Mx register. We need to use pseudo instructions for these instructions until after register allocation. The problem is that these instructions allocate a M0/CS0 or M1/CS1 pair. But, we can't generate code for the CSx set-up until after register allocation when the Mx register has been fixed for the instruction. There is a related clang patch. Patch by Brendon Cahoon. llvm-svn: 328724	2018-03-28 19:38:29 +00:00
David Blaikie	eb8cc04ea2	Oops - moved slightly too many things from Scalar to Utils. Move LoopSimplifyCFG things back llvm-svn: 328720	2018-03-28 18:03:25 +00:00
Jessica Paquette	4aa14dbcc2	[MachineOutliner] Simplify call outlining + require valid callee save info for call outlining This commit simplifies the call outlining logic by removing references to the Function associated with the callee. To do this, it requires that valid callee save info is available to the outliner. llvm-svn: 328719	2018-03-28 17:52:31 +00:00
David Blaikie	a373d18eb7	Transforms: Introduce Transforms/Utils.h rather than spreading the declarations amongst Scalar.h and IPO.h Fixes layering - Transforms/Utils shouldn't depend on including a Scalar or IPO header, because Scalar and IPO depend on Utils. llvm-svn: 328717	2018-03-28 17:44:36 +00:00
Peter Collingbourne	d579c31d68	[llvm-ar] Support multiple dashed options This allows syntax like: $ llvm-ar -c -r -u file.a file.o This is in addition to the other formats that are already supported: $ llvm-ar cru file.a file.o $ llvm-ar -cru file.a file.o Patch by Tom Anderson! Differential Revision: https://reviews.llvm.org/D44452 llvm-svn: 328716	2018-03-28 17:21:14 +00:00
Simon Pilgrim	7237e0cf39	[X86][AVX2] Add shuffle test case from PR36933 llvm-svn: 328714	2018-03-28 16:48:48 +00:00
Dmitry Preobrazhensky	622bde8bc7	[AMDGPU][MC] Added ds_add_src2_f32 See bug 36833: https://bugs.llvm.org/show_bug.cgi?id=36833 Differential Revision: https://reviews.llvm.org/D44779 Reviewers: arsenm, artem.tamazov, timcorringham llvm-svn: 328713	2018-03-28 16:21:56 +00:00
Lang Hames	da5c6acfe9	[ORC] Restore the narrower check from before r328687. This should get the builders green again while I investigate why r328706 was insufficient. llvm-svn: 328711	2018-03-28 15:58:14 +00:00
Dmitry Preobrazhensky	2456ac696a	[AMDGPU][MC] Added PCK variants of image load/store instructions See bug 36834: https://bugs.llvm.org/show_bug.cgi?id=36834 Differential Revision: https://reviews.llvm.org/D44795 Reviewers: artem.tamazov, arsenm, timcorringham, nhaehnle llvm-svn: 328710	2018-03-28 15:44:16 +00:00
Daniel Neilson	45796f6be9	[PatternMatch] Add matchers for vector operations Summary: There aren't any matchers for the three vector operations: insertelement, extractelement, and shufflevector. This patch adds them as well as corresponding unit tests. llvm-svn: 328709	2018-03-28 15:39:00 +00:00
Dmitry Preobrazhensky	a917e88585	[AMDGPU][MC][GFX9] Added buffer_*_format_d16_hi_x See bug 36835: https://bugs.llvm.org/show_bug.cgi?id=36835 Differential Revision: https://reviews.llvm.org/D44825 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 328707	2018-03-28 14:53:13 +00:00
Lang Hames	ec978e2226	[ORC] Re-add the Windows check that was dropped in r328687. This check prevents the ORC execution tests from running on Windows (which is not supported yet). This should fix the windows bots. llvm-svn: 328706	2018-03-28 14:47:11 +00:00
Dmitry Preobrazhensky	dd2b929ffb	[AMDGPU][MC][GFX9] Added s_scratch* instructions See bug 36836: https://bugs.llvm.org/show_bug.cgi?id=36836 Differential Revision: https://reviews.llvm.org/D44832 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 328704	2018-03-28 14:08:03 +00:00
Dan Liew	8ade9e75b0	Revert "[lit] Temporarily disable shtest-timeout.py on darwin" This reverts commit 771829b640a5494ab65c810dd6b4330522bf3a33 (rr328598) Hopefully the test will now pass on the bots. rdar://problem/38774530 llvm-svn: 328703	2018-03-28 13:55:13 +00:00
Dan Liew	7efde3c440	[lit] Remove a timing sensitive part of `shtest-timeout.py` The `shtest-timeout.py` test was failing intermittently. It looks like the issue is that on a resource constrained system lit is unable to run `quick_then_slow.py` twice and print out the messages the test expects within the one second timeout. The underlying issue is that the test is dependent on the performance of the host machine in a rather fragile way. This is due to hardcoding timeout values and having assumptions that the host machine is able to perform a certain amount of work within the hardcoded timeout values. We could increase the timeout values but that doesn't really fix the underlying issue. Instead this patch removes one of the fragile assumptions in the hope that this will be enough to fix the bots. There are other fragile assumptions in this test (e.g. `quick.py` can be executed in less than 1 second). If the bots continue to fail we'll have to revisit this. rdar://problem/38774530 llvm-svn: 328702	2018-03-28 13:55:08 +00:00
Simon Pilgrim	b1bc6cd96b	[X86][Btver2] Moved JWriteFCmp/JWriteFCmpY classes next to each other. NFCI Renamed JWriteFPAY22 to JWriteFCmpY - we've tended to avoid latency based names llvm-svn: 328701	2018-03-28 13:53:21 +00:00
Alexander Potapenko	202f809437	Revert "Reapply "[DWARFv5] Emit file 0 to the line table."" This reverts commit r328676. Commit r328676 broke the -no-integrated-as flag necessary to build Linux kernel with Clang: $ cat t.c void foo() {} $ clang -no-integrated-as -c t.c -g /tmp/t-dcdec5.s: Assembler messages: /tmp/t-dcdec5.s:8: Error: file number less than one clang-7.0: error: assembler command failed with exit code 1 (use -v to see invocation) llvm-svn: 328699	2018-03-28 12:36:46 +00:00
Andrea Di Biagio	5076b98fb9	[X86][BtVer2] Fix the number of micro opcodes for AES[ENC|DEC] and other YMM instructions. Similar to r328694. The number of micro opcodes should be 2 for those instructions. This was found when testing AVX code for BtVer2 using llvm-mca. llvm-svn: 328698	2018-03-28 12:12:04 +00:00
Alexander Potapenko	4e7ad0805e	[MSan] Introduce ActualFnStart. NFC This is a step towards the upcoming KMSAN implementation patch. KMSAN is going to prepend a special basic block containing tool-specific calls to each function. Because we still want to instrument the original entry block, we'll need to store it in ActualFnStart. For MSan this will still be F.getEntryBlock(), whereas for KMSAN it'll contain the second BB. llvm-svn: 328697	2018-03-28 11:35:09 +00:00
Tim Renouf	cdac172e2a	Revert "[AMDGPU] For OS type AMDPAL, fixed scratch on compute shader" This reverts commit 0daf86291d3aa04d3cc280cd0ef24abdb0174981. It was causing an assert in test/CodeGen/AMDGPU/amdpal.ll only on a release-with-asserts build. I will resubmit the change when I have fixed that. Change-Id: If270594eba27a7dc4076bdeab3fa8e6bfda3288a llvm-svn: 328695	2018-03-28 11:21:07 +00:00
Andrea Di Biagio	010924e35c	[X86][BtVer2] Fix the number of micro opcodes for a bunch of YMM instructions. The Jaguar backend natively supports 128-bit data types. Operations on YMM registers are split into two COPs (complex operations). Each COP consumes a slot in the dispatch group, and in the reorder buffer. The scheduling model for Jaguar should mark those instructions as `let NumMicroOps = 2`. This was found when testing AVX code for BtVer2 using llvm-mca. llvm-svn: 328694	2018-03-28 10:49:33 +00:00
Alexander Potapenko	e1d5877847	[MSan] Add an isStore argument to getShadowOriginPtr(). NFC This is a step towards the upcoming KMSAN implementation patch. The isStore argument is to be used by getShadowOriginPtrKernel(), it is ignored by getShadowOriginPtrUserspace(). Depending on whether a memory access is a load or a store, KMSAN instruments it with different functions, __msan_metadata_ptr_for_load_X() and __msan_metadata_ptr_for_store_X(). Those functions may return different values for a single address, which is necessary in the case the runtime library decides to ignore particular accesses. llvm-svn: 328692	2018-03-28 10:17:17 +00:00
Christof Douma	a1e77c0e02	[ARM] Support float literals under XO Follow up patch of r328313 to support the UseVMOVSR constraint. Removed some unneeded instructions from the test and removed some stray comments. Differential Revision: https://reviews.llvm.org/D44941 llvm-svn: 328691	2018-03-28 10:02:26 +00:00
Mikael Holmen	6c062b7641	[RegisterCoalescing] Don't move COPY if it would interfere with another value Summary: RegisterCoalescer::removePartialRedundancy tries to hoist B = A from BB0/BB2 to BB1: BB1: ... BB0/BB2: ---- B = A; | ... | A = B; | |------- | It does so if a number of conditions are fulfilled. However, it failed to check if B was used by any of the terminators in BB1. Since we must insert B = A before the terminators (since it's not a terminator itself), this means that we could erroneously insert a new definition of B before a use of it. Reviewers: wmi, qcolombet Reviewed By: wmi Subscribers: MatzeB, llvm-commits, sdardis Differential Revision: https://reviews.llvm.org/D44918 llvm-svn: 328689	2018-03-28 06:01:30 +00:00
Lang Hames	a95b0df5ed	[ORC] Fix ORC on platforms without indirection support. Previously this crashed because a nullptr (returned by createLocalIndirectStubsManagerBuilder() on platforms without indirection support) functor was unconditionally invoked. Patch by Andres Freund. Thanks Andres! llvm-svn: 328687	2018-03-28 03:41:45 +00:00
Sanjay Patel	bb33007b25	[AArch64] add ftrunc tests; NFC As suggested in D44909. llvm-svn: 328683	2018-03-28 00:56:00 +00:00
Sanjay Patel	594c1546f1	[PowerPC] add ftrunc vector tests; NFC Baseline tests for vectors as suggested in D44909. llvm-svn: 328682	2018-03-28 00:49:12 +00:00
Heejin Ahn	37307b450f	[WebAssembly] Add exception and selector intrinsics Summary: Since wasm EH does not use landingpad instructions, these instructions provide exception pointer and selector values until we lower them in WasmEHPrepare. Reviewers: jgravelle-google Subscribers: jfb, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D44930 llvm-svn: 328678	2018-03-27 23:37:07 +00:00
Matt Arsenault	bd49eccca1	AMDGPU: Really implement getFrameRegister Currently this seems to only really be used for debug info. llvm-svn: 328677	2018-03-27 23:26:59 +00:00
Paul Robinson	07480bd177	Reapply "[DWARFv5] Emit file 0 to the line table." DWARF v5 specifies that the root file (also given in the DW_AT_name attribute of the compilation unit DIE) should be emitted explicitly to the line table's list of files. This makes the line table more independent of the .debug_info section. Fixes the bug found by asan. Also XFAIL the new test for Darwin, which is stuck on DWARF v2, and fix up other tests so they stop failing on Windows. Last but not least, don't break "clang -g" of an assembler file that has .file directives in it. Differential Revision: https://reviews.llvm.org/D44054 llvm-svn: 328676	2018-03-27 22:40:34 +00:00
Jessica Paquette	2519ee7081	[MachineOutliner] AArch64: Don't outline ADRPs with un-outlinable operands If an ADRP appears with, say, a CPI operand, we shouldn't outline it. This moves the check for unsafe operands so that it occurs before the special-case for ADRPs. Also add a test for outlining ADRPs. llvm-svn: 328674	2018-03-27 22:23:48 +00:00
Tim Renouf	e4208bfa5b	[AMDGPU] For OS type AMDPAL, fixed scratch on compute shader Summary: For OS type AMDPAL, the scratch descriptor is loaded from offset 0 of the GIT, whose 32 bit pointer is in s0 (s8 for gfx9 merged shaders). This commit fixes that to use offset 0x10 instead of offset 0 for a compute shader, per the PAL ABI spec. Reviewers: kzhuravl, nhaehnle, timcorringham Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits, dstuttard, nhaehnle, arsenm Differential Revision: https://reviews.llvm.org/D44468 Change-Id: I93dffa647758e37f613bb5e0dfca840d82e6d26f llvm-svn: 328673	2018-03-27 21:35:00 +00:00
Paul Robinson	7cb26ad2ef	[DWARF] Suppress split line tables more carefully. If a given split type unit does not have source locations, don't have it refer to the split line table. If no split type unit refers to the split line table, don't emit the line table at all. This will save a little space on rare occasions, but also refactors things a bit to improve which class is responsible for what. Responding to review comments on r326395. Differential Revision: https://reviews.llvm.org/D44220 llvm-svn: 328670	2018-03-27 21:28:59 +00:00
Tony Tye	01bfd6c4e5	[AMDGPU] Define code object identification string used in AMDHSA runtimes. Differential Revision: https://reviews.llvm.org/D44718 llvm-svn: 328669	2018-03-27 21:20:46 +00:00
Tim Renouf	4db0960420	[CodeGen] Fixed unreachable with -print-machineinstrs and custom pseudo source value Summary: Rev 327580 "[CodeGen] Use MIR syntax for MachineMemOperand printing" broke -print-machineinstrs for us on AMDGPU, because we have custom pseudo source values, and MIR serialization does not implement that. This commit at least restores the functionality of -print-machineinstrs, even if it does not properly implement the missing MIR serialization functionality. Differential Revision: https://reviews.llvm.org/D44871 Change-Id: I44961c0b90bf6d48c01484ed7a4e466fd300db66 llvm-svn: 328668	2018-03-27 21:14:04 +00:00
Sterling Augustine	33dc01861a	Initialize variable added in r328617. llvm-svn: 328667	2018-03-27 21:11:57 +00:00
Graydon Hoare	b62f86a6b7	[YAML] Remove unit test of multibyte non-printable escaping that uses C++11 escapes llvm-svn: 328665	2018-03-27 20:46:26 +00:00
Simon Pilgrim	a2f26788a3	[X86] Add WriteFMOVMSK/WriteVecMOVMSK/WriteMMXMOVMSK scheduler classes Currently MOVMSK instructions use the WriteVecLogic class, which is a very poor choice given that MOVMSK involves a SSE->GPR transfer. Differential Revision: https://reviews.llvm.org/D44924 llvm-svn: 328664	2018-03-27 20:38:54 +00:00
Wolfgang Pieb	ab068eaa57	[DWARF][DWARF v5]: Adding support for dumping DW_RLE_offset_pair and DW_RLE_base_address Reviewers: dblakie, aprantl Differential Revision: https://reviews.llvm.org/D44811 llvm-svn: 328662	2018-03-27 20:27:36 +00:00
Graydon Hoare	926cd9b837	[YAML] Escape non-printable multibyte UTF8 in Output::scalarString. The existing YAML Output::scalarString code path includes a partial and incorrect implementation of YAML escaping logic. In particular, the logic put in place in rL321283 escapes non-printable bytes only if they are not part of a multibyte UTF8 sequence; implicitly this means that all multibyte UTF8 sequences -- printable and non -- are passed through verbatim. The simplest solution to this is to direct the Output::scalarString method to use the standalone yaml::escape function, and this _almost_ works, except that the existing code in that function _over_ escapes: any multibyte UTF8 sequence is escaped, even printable ones. While this is permitted for YAML, it is also more aggressive (and hard to read for non-English locales) than necessary, and the entire point of rL321283 was to back off such aggressive over-escaping. So in this change, I have both redirected Output::scalarString to use yaml::escape _and_ modified yaml::escape to optionally restrict its escaping to non-printables. This preserves behaviour of any existing clients while giving them a path to more moderate escaping should they desire. Reviewers: JDevlieghere, thegameg, MatzeB, vladimir.plyashkun Reviewed By: thegameg Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44863 llvm-svn: 328661	2018-03-27 19:52:45 +00:00
Xin Tong	0272cb077f	80-line wrap. NFC llvm-svn: 328660	2018-03-27 19:43:02 +00:00
Matt Arsenault	17f3338015	AMDGPU: Fix not preserving CSR VGPR if used for SGPR spills Before this was not done if the function had no calls in it. This is still a possible issue with any callable function, regardless of calls present. llvm-svn: 328659	2018-03-27 19:42:55 +00:00
Matt Arsenault	95329f8c53	AMDGPU: Set natural stack alignment in DataLayout Only 4 byte alignment is ever useful, so increasing anything beyond this may require realigning the stack. llvm-svn: 328656	2018-03-27 19:26:40 +00:00
Fangrui Song	fc5dabe738	[DWARF] Simplify DWARFAddressRange::contains This transform is valid because the ranges have been validated (LowPC <= HighPC). Differential Revision: https://reviews.llvm.org/D44772 llvm-svn: 328655	2018-03-27 19:05:02 +00:00
Rong Xu	662f38b16f	[PGO] Fix branch probability remarks assert Fixed counter/weight overflow that leads to an assertion. Also fixed the help string for pgo-emit-branch-prob option. Differential Revision: https://reviews.llvm.org/D44809 llvm-svn: 328653	2018-03-27 18:55:56 +00:00
Matt Arsenault	0a0c871f60	AMDGPU: Fix crash when MachinePointerInfo invalid The combine on a select of a load only triggers for addrspace 0, and discards the MachinePointerInfo. The conservative default needs to be used for this. llvm-svn: 328652	2018-03-27 18:39:45 +00:00
Matt Arsenault	126a874952	AMDGPU: Fix register name format in tests These were changed to match the asm output name a long time ago, although I think the old tablegenerated names still work. llvm-svn: 328651	2018-03-27 18:39:42 +00:00
Matt Arsenault	e9f3679031	AMDGPU: Fix FP restore from being reordered with stack ops In a function, s5 is used as the frame base SGPR. If a function is calling another function, during the call sequence it is copied to a preserved SGPR and restored. Before it was possible for the scheduler to move stack operations before the restore of s5, since there's nothing to associate a frame index access with the restore. Add an implicit use of s5 to the adjcallstack pseudo which ends the call sequence to prevent this from happening. I'm not 100% satisfied with this solution, but I'm not sure what else would be better. llvm-svn: 328650	2018-03-27 18:38:51 +00:00
Krzysztof Parzyszek	0375cd46ef	[Hexagon] Implement TTI::shouldMaximizeVectorBandwidth llvm-svn: 328648	2018-03-27 18:10:47 +00:00
Stefan Pintilie	659f040351	[Power9] Fix the resource list for the COPY instruction. The COPY instruction was listed as a 4 cycle instruction. It is now listed correctly as a 2 cycle ALU instruction. llvm-svn: 328647	2018-03-27 17:51:53 +00:00
Pirama Arumuga Nainar	ddd7b06842	Remap values in PromotedFloats Summary: When a node is about to be erased from ReplacedValues, we should also remap its corresponding values in PromotedFloats. Patch by Yan Luo (Yan.Luo2@synopsys.com) Reviewers: pirama Reviewed By: pirama Subscribers: lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D44872 llvm-svn: 328644	2018-03-27 17:42:36 +00:00
Artur Pilipenko	ca1d849cd6	Fix a reoccurring typo in load-combine tests %tmp = bitcast i32* %arg to i8* %tmp1 = getelementptr inbounds i8, i8* %tmp, i32 0 - %tmp2 = load i8, i8* %tmp, align 1 + %tmp2 = load i8, i8* %tmp1, align 1 This doesn't change the semantics of the tests but makes use of %tmp1 which was originally intended. llvm-svn: 328642	2018-03-27 17:33:50 +00:00
Krzysztof Parzyszek	0a15d24134	[Hexagon] Rudimentary support for auto-vectorization for HVX This implements a set of TTI functions that the loop vectorizer uses. The only purpose of this is to enable testing. Auto-vectorization is disabled by default, enabled by -hexagon-autohvx. llvm-svn: 328639	2018-03-27 17:07:52 +00:00
Rafael Auler	d058b882be	[AArch64] Decorate AArch64 instrs with OPERAND_PCREL Summary: This is a canonical way to teach objdump to print the target symbols for branches when disassembling AArch64 code. Reviewers: evandro, t.p.northover, espindola Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D44851 llvm-svn: 328638	2018-03-27 16:58:01 +00:00
Fedor Sergeev	98014e433f	[NFC] OptPassGate extracted from OptBisect Summary: This is an NFC refactoring of the OptBisect class to split it into an optional pass gate interface used by LLVMContext and the Optional Pass Bisector (OptBisect) used for debugging of optional passes. This refactoring is needed for D44464, which introduces setOptPassGate() method to allow implementations other than OptBisect. Patch by Yevgeny Rouban. Reviewers: andrew.w.kaylor, fedor.sergeev, vsk, dberlin, Eugene.Zelenko, reames, skatkov Reviewed By: fedor.sergeev Differential Revision: https://reviews.llvm.org/D44821 llvm-svn: 328637	2018-03-27 16:57:20 +00:00
Krzysztof Parzyszek	52396bb9c5	Use .set instead of = when printing assignment in assembly output On Hexagon "x = y" is a syntax used in most instructions, and is not treated as a directive. Differential Revision: https://reviews.llvm.org/D44256 llvm-svn: 328635	2018-03-27 16:44:41 +00:00
Krzysztof Parzyszek	5d93fdfa89	[LV] Add TTI::shouldMaximizeVectorBandwidth to allow enabling it per target The default implementation returns false and keeps the current behavior. Differential Revision: https://reviews.llvm.org/D44735 llvm-svn: 328632	2018-03-27 16:14:11 +00:00
Andrea Di Biagio	9ecb4011ca	[llvm-mca] pass the correct set of used registers in checkRAT. We were incorrectly initializing the array of used registers in method checkRAT. As a consequence, the number of register file stalls was misreported. Added a test to cover this case. llvm-svn: 328629	2018-03-27 15:23:41 +00:00
Simon Pilgrim	5f7ab4fedf	[X86][Btver2] Add MMX_PMOVMSKBrr to MOVMSK scheduler class llvm-svn: 328620	2018-03-27 12:26:12 +00:00
Strahinja Petrovic	06cf6a6490	[PowerPC] Secure PLT support This patch supports secure PLT mode for PowerPC 32 architecture. Differential Revision: https://reviews.llvm.org/D42112 llvm-svn: 328617	2018-03-27 11:23:53 +00:00
Alexander Richardson	e8059b1de4	[MIPS] Add static_assert that all Fixups are handled in getFixupKind Summary: I recently added a new Fixup kind to our fork of LLVM but forgot to add it to the table in MipsAsmBackend.cpp. With this static_assert the error would have been caught instead of zero-initializing the array entries for the new fixups. Reviewers: sdardis, atanasyan Reviewed By: atanasyan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44895 llvm-svn: 328616	2018-03-27 10:08:12 +00:00
Max Kazantsev	b1ad66ff12	[LoopUnroll][NFC] Remove redundant canPeel check We check `canPeel` twice: when evaluating the number of iterations to be peeled and within the method `peelLoop` that performs peeling. This method is only executed if the calculated peel count is positive. Thus, the check in `peelLoop` can never fail. This patch replaces this check with an assert. Differential Revision: https://reviews.llvm.org/D44919 Reviewed By: fhahn llvm-svn: 328615	2018-03-27 09:40:51 +00:00
Sam Parker	90b7f4f72c	[IRCE] Enable decreasing loops of non-const bound As a follow-up to r328480, this updates the logic for the decreasing safety checks in a similar manner: - CanBeMax is replaced by CannotBeMaxInLoop which queries isLoopEntryGuardedByCond on the maximum value. - SumCanReachMin is replaced by isSafeDecreasingBound which includes some logic from parseLoopStructure and, again, has been updated to use isLoopEntryGuardedByCond on the given bounds. Differential Revision: https://reviews.llvm.org/D44776 llvm-svn: 328613	2018-03-27 08:24:53 +00:00
Max Kazantsev	ee5dd8306f	[NFC] Fix comments in getExact() llvm-svn: 328612	2018-03-27 08:13:55 +00:00
Max Kazantsev	7094c8deb2	[SCEV] Make exact taken count calculation more optimistic Currently, `getExact` fails if it sees two exit counts in different blocks. There is no solid reason to do so, given that we only calculate exact non-taken count for exiting blocks that dominate latch. Using this fact, we can simply take min out of all exits of all blocks to get the exact taken count. This patch makes the calculation more optimistic with enforcing our assumption with asserts. It allows us to calculate exact backedge taken count in trivial loops like for (int i = 0; i < 100; i++) { if (i > 50) break; . . . } Differential Revision: https://reviews.llvm.org/D44676 Reviewed By: fhahn llvm-svn: 328611	2018-03-27 07:30:38 +00:00
Max Kazantsev	a63d333881	[SCEV] Add one more case in computeConstantDifference This patch teaches `computeConstantDifference` to handle calculation of constant difference between `(X + C1)` and `(X + C2)` which is `(C2 - C1)`. Differential Revision: https://reviews.llvm.org/D43759 Reviewed By: anna llvm-svn: 328609	2018-03-27 04:54:00 +00:00
Craig Topper	44a23f4283	[MachineScheduler] Add itinerary to schedcover.py. Make default work in the command line filter Summary: This patch adds itinerary support to the schedcover.py script. I've been trying to use this script to figure out why SSE and AVX instructions are ending up in separate tablegen scheduler classes and sometimes it's because we are using different itineraries. Rather than using None to indicate the default scheduler model, I now use the string "default". I had to hack around the sorting a little to keep "default" at the beginning. But this also makes it so you can specify "default" on the command line to just get the defaults. I also fixed the regular expression code so that the no_default wasn't evaluated twice. Reviewers: RKSimon, atrick, jmolloy, javed.absar Reviewed By: javed.absar Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44834 llvm-svn: 328608	2018-03-27 04:26:39 +00:00
Mircea Trofin	56ba71b2a7	Revert "Revert "[lit] Generalized /dev/null support on Windows."" Summary: This reverts commit r328596. Checking if the arguments are strings before testing if they contain "/dev/null". Reviewers: rnk Reviewed By: rnk Subscribers: delcypher, llvm-commits Differential Revision: https://reviews.llvm.org/D44914 llvm-svn: 328603	2018-03-27 01:39:17 +00:00
Sanjay Patel	15f7df9f44	[x86] add RUN for target before roundss; NFC llvm-svn: 328601	2018-03-27 00:32:19 +00:00
Jan Korous	1e0e0b077d	[lit] Temporarily disable shtest-timeout.py on darwin Disabled until fixed in order to avoid random failures on green dragon. rdar://problem/38774530 llvm-svn: 328598	2018-03-27 00:16:28 +00:00
Mircea Trofin	373c445c24	Revert "[lit] Generalized /dev/null support on Windows." This reverts commit ca7fdbb974384ce5a05528b22a41d46b1cc13e92. llvm-svn: 328596	2018-03-26 23:59:39 +00:00
David Blaikie	60e62438d2	Add a build dependency from libMC to libDebugInfoCodeView to match the reality of header dependencies here llvm-svn: 328595	2018-03-26 23:48:52 +00:00
David Blaikie	211c67cdb2	Move CVDebugRecord from CodeView to Object to fix layering llvm-svn: 328593	2018-03-26 23:37:02 +00:00
Sanjay Patel	8653776367	[x86] add tests for ftrunc; NFC llvm-svn: 328592	2018-03-26 23:18:32 +00:00
Aaron Smith	f13938382c	[DebugInfoPDB] Print the method name along with the variant value Before this change, using dumpProperties() with PDBSymbolData would look like this: get_locationType: 3 1 After this change: get_locationType: 3 get_value: 1 llvm-svn: 328590	2018-03-26 22:53:38 +00:00
Mircea Trofin	88911686c8	[lit] Generalized /dev/null support on Windows. Generalized /dev/null remapping on Windows, and added test. Reviewers: rnk Reviewed By: rnk Subscribers: amccarth, zturner, delcypher, llvm-commits Differential Revision: https://reviews.llvm.org/D44771 llvm-svn: 328589	2018-03-26 22:41:06 +00:00
Aaron Smith	1af50bcf89	[DebugInfoPDB] Add methods to get the compiland and line numbers with PDBSymbolData llvm-svn: 328587	2018-03-26 22:17:12 +00:00
Aaron Smith	ed81a9db29	[DebugInfoPDB] Add DIA implementation of findLineNumbersByRVA This method is used to find line numbers for PDBSymbolData that have an invalid virtual address. llvm-svn: 328586	2018-03-26 22:13:22 +00:00
Aaron Smith	53708a5e9e	[DebugInfoPDB] Add DIA implementation of addressForVA and addressForRVA These are used in finding line numbers for PDBSymbolData llvm-svn: 328585	2018-03-26 22:10:02 +00:00
Simon Pilgrim	f6440b6fb1	Fix newlines. NFCI. llvm-svn: 328583	2018-03-26 21:07:59 +00:00
Simon Pilgrim	28e7bcbba6	[X86] Add WriteCRC32 scheduler class Currently CRC32 instructions use the WriteFAdd class, this patch splits them off into their own, at the moment it is still mostly just a duplicate of WriteFAdd but it can now be tweaked on a target by target basis. Differential Revision: https://reviews.llvm.org/D44647 llvm-svn: 328582	2018-03-26 21:06:14 +00:00
Rafael Espindola	78fdca3cd5	Use local symbols for creating .stack-size. llvm-svn: 328581	2018-03-26 20:40:22 +00:00
Reid Kleckner	2fe905cfee	Fix go bindings test when using goma distributed build tool Goma[1] is a distributed build system similar to distcc and icecc primarily used to compile Chromium. The client is open source, and hopefully soon the server will be as well. The intended usage model is similar to most distributed build systems: prefix gomacc onto your compiler command line, and it transparently distributes compilation. The go lit config wants to determine the host compiler binary, so it needs some extra logic to avoid looking at these prefixes. [1] https://chromium.googlesource.com/infra/goma/client/ llvm-svn: 328580	2018-03-26 20:19:14 +00:00
Paul Robinson	82e4864730	Use correct format specifier. Review comment on r328235 by James Henderson. llvm-svn: 328578	2018-03-26 19:55:01 +00:00
Eli Friedman	88e2bac94d	[MemorySSA] Fix exponential compile-time updating MemorySSA. MemorySSAUpdater::getPreviousDefRecursive is a recursive algorithm, for each block, it computes the previous definition for each predecessor, then takes those definitions and combines them. But currently it doesn't remember results which it already computed; this means it can visit the same block multiple times, which adds up to exponential time overall. To fix this, this patch adds a cache. If we computed the result for a block already, we don't need to visit it again because we'll come up with the same result. Well, unless we RAUW a MemoryPHI; in that case, the TrackingVH will be updated automatically. This matches the original source paper for this algorithm. The testcase isn't really a test for the bug, but it adds coverage for the case where tryRemoveTrivialPhi erases an existing PHI node. (It's hard to write a good regression test for a performance issue.) Differential Revision: https://reviews.llvm.org/D44715 llvm-svn: 328577	2018-03-26 19:52:54 +00:00
Krzysztof Parzyszek	4a5a80c370	[Hexagon] Assertion failure in HexagonSubtarget.cpp In restoreLatency, replace range-for loop with std::find. Patch by Jyotsna Verma. llvm-svn: 328574	2018-03-26 19:04:58 +00:00
Simon Pilgrim	fcf49df21c	[X86][Btver2] Add (U)COMISD/(U)COMISD scheduler costs Account for the "+i" integer pipe transfer cost (1cy use of JALU0 for GPR PRF write) llvm-svn: 328573	2018-03-26 19:01:06 +00:00
Haicheng Wu	b45f921678	[SLP] Add more checks to a test case. NFC. llvm-svn: 328572	2018-03-26 18:59:28 +00:00
Reid Kleckner	41fb2dba9c	[X86] Fix Windows `i1 zeroext` conventions to use i8 instead of i32 Summary: Re-lands r328386 and r328443, reverting r328482. Incorporates fixes from @mstorsjo in D44876 (thanks!) so that small parameters in i8 and i16 do not end up in the SysV register parameters (EDI, ESI, etc). I added tests for how we receive small parameters, since that is the important part. It's always safe to store more bytes than will be read, but the assumptions you make when loading them are what really matter. I also tested this by self-hosting clang and it passed tests on win64. Reviewers: mstorsjo, hans Subscribers: hiraditya, mstorsjo, llvm-commits Differential Revision: https://reviews.llvm.org/D44900 llvm-svn: 328570	2018-03-26 18:49:48 +00:00
Simon Pilgrim	f33d905293	[X86] Add WriteBitScan/WriteLZCNT/WriteTZCNT/WritePOPCNT scheduler classes (PR36881) Give the bit count instructions their own scheduler classes instead of forcing them into existing classes. These were mostly overridden anyway, but I had to add in costs from Agner for silvermont and znver1 and the Fam16h SoG for btver2 (Jaguar). Differential Revision: https://reviews.llvm.org/D44879 llvm-svn: 328566	2018-03-26 18:19:28 +00:00
David Blaikie	7c4b5d92f1	Remove unused file, ExecutionEngine/MCJIT/ObjectBuffer.h This header also wasn't self contained/modular - but with no users, it didn't seem worth fixing because it'd break so easily again. llvm-svn: 328565	2018-03-26 18:10:31 +00:00
Mandeep Singh Grang	1b9ff45157	[XCore] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer to the comments section in D44363 for a list of all the required patches. Reviewers: dblaikie, RKSimon, robertlytton Reviewed By: robertlytton Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44875 llvm-svn: 328564	2018-03-26 18:08:26 +00:00
Reid Kleckner	8252892951	[lit] Implement 'cat' command for internal shell Fixes PR36449 Patch by Chamal de Silva Differential Revision: https://reviews.llvm.org/D43501 llvm-svn: 328563	2018-03-26 18:05:12 +00:00
Zachary Turner	7b84b678a9	Delete pdbutil diff mode. This has been made obsolete by the fact that almost all of the things it previously checked for are no longer relevant since we can just compare bytes in a lot of places. llvm-svn: 328562	2018-03-26 18:01:07 +00:00
Krzysztof Parzyszek	5488deb1ab	[Hexagon] Add more lit tests llvm-svn: 328561	2018-03-26 17:53:48 +00:00
Sanjay Patel	0e3167cb30	[InstCombine] improve code comment; NFC llvm-svn: 328560	2018-03-26 17:52:02 +00:00
Lei Huang	be0afb0870	[Power9]Legalize and emit code for quad-precision convert from double-precision Legalize and emit code for quad-precision floating point operation xscvdpqp and add option to guard the quad precision operation support. Differential Revision: https://reviews.llvm.org/D44746 llvm-svn: 328558	2018-03-26 17:46:25 +00:00
Stefan Pintilie	26d4f923c4	[PowerPC] Infrastructure work. Implement getting the opcode for a spill in one place. A new function getOpcodeForSpill should now be the only place to get the opcode for a given spilled register. Differential Revision: https://reviews.llvm.org/D43086 llvm-svn: 328556	2018-03-26 17:39:18 +00:00
Zaara Syeda	17e4eeaa8b	Disable [MachineLICM] Add functions to MachineLICM to hoist invariant stores Disable https://reviews.llvm.org/D40196 with setting option hoist-const-stores to false since failing s390 buildbot. llvm-svn: 328555	2018-03-26 17:22:33 +00:00
Krzysztof Parzyszek	3ca233414b	[Pipeliner] Several node-ordering fixes First, we change the heuristic that is used to ignore the recurrent node-sets in the node ordering. In certain cases it's not important to focus on the recurrent node-sets. Instead, the algorithm begins by considering all the instructions in the node ordering step. Second, a minor change to the bottom up traversal, which needs to consider loop carried dependences (modeled as anti dependences). Previously, these instructions were skipped, which caused problems because the instruction ends up having both predecessors and successors in the schedule. Third, consider anti-dependences as a tie breaker when choosing between instructions in the node ordering. We want to make sure that the source of the anti-dependence does not end up with both predecessors and successors in the final node ordering. Patch by Brendon Cahoon. llvm-svn: 328554	2018-03-26 17:07:41 +00:00
Tim Corringham	7116e8963d	[AMDGPU] Improve disassembler error handling Summary: llvm-objdump now disassembles unrecognised opcodes as data, using the .long directive. We treat unrecognised opcodes as being 32 bit values, so move along 4 bytes rather than the single byte which previously resulted in a cascade of bogus disassembly following an unrecognised opcode. While no solution can always disassemble code that contains embedded data correctly this provides a significant improvement. The disassembler will now cope with an arbitrary length section as it no longer truncates it to a multiple of 4 bytes, and will use the .byte directive for trailing bytes. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D44685 llvm-svn: 328553	2018-03-26 17:06:33 +00:00
Simon Pilgrim	86ea53123d	[X86][Btver2] Add CVTSI2SD/CVTSI2SS scheduler costs We still need to account for how Jaguar passes data from GPR -> XMM, which isn't as clean as XMM -> GPR..... llvm-svn: 328551	2018-03-26 17:02:02 +00:00
Krzysztof Parzyszek	8c07d0c42c	[Pipeliner] Check for affine expression in isLoopCarriedOrder The pipeliner must add a loop carried dependence between two memory operations if the base register is not an affine (linear) expression. The current implementation doesn't check how the base register is defined, which allows non-affine expressions, and then the pipeliner does not add a loop carried dependence when one is needed. This patch adds code to isLoopCarriedOrder that checks if the base register of the memory operations is defined by a phi, and the loop definition for the phi is a constant increment value. This is a very simple check for a linear expression. Patch by Brendon Cahoon. llvm-svn: 328550	2018-03-26 16:58:40 +00:00
David Blaikie	535ca36e5e	Remove an unneeded (& mislayered) include from Target/TargetLoweringObjectFile on a CodeGen header llvm-svn: 328549	2018-03-26 16:57:31 +00:00
David Blaikie	a1b2bf4c71	Remove unneeded (& mislayered) include from TargetMachine.cpp on a CodeGen header llvm-svn: 328548	2018-03-26 16:52:10 +00:00
Krzysztof Parzyszek	9f041b1830	[Pipeliner] Add missing loop carried dependences The pipeliner is not adding a dependence edge for a loop carried dependence, and ends up scheduling a load from iteration n prior to an aliased store in iteration n-1. The code that adds the loop carried dependences in the pipeliner doesn't check if the memory objects for loads and stores are "identified" (i.e., distinct) objects. If they are not, then the code that adds the dependences needs to be conservative. The objects can be used to check dependences only when they are distinct objects. The code that checks for loop carried dependences has been updated to classify loads and stores that are not identified as "unknown" values. A store with an "unknown" value can potentially create a loop carried dependence with any pending load. Patch by Brendon Cahoon. llvm-svn: 328547	2018-03-26 16:50:11 +00:00
Haicheng Wu	0ec1dbe417	[SLP] Add a test case. NFC. llvm-svn: 328546	2018-03-26 16:47:37 +00:00
Krzysztof Parzyszek	16e66f5901	[Pipeliner] Fix renaming in pipeliner when eliminating phis The phi renaming code in the pipeliner uses the wrong value when rewriting phi uses, which results in an undefined value. In this case, the original phi is no longer needed due to the order of instruction in the pipelined loop. The pipeliner was assuming, in this case, that the phi loop definition should be used to rewrite the uses. However, the pipeliner needs to check to make sure that the loop definition has already been scheduled. If not, then the phi initial value needs to be used instead. Patch by Brendon Cahoon. llvm-svn: 328545	2018-03-26 16:41:36 +00:00
Krzysztof Parzyszek	3f72a6b7a1	[Pipeliner] Fix number of phis to generate in the epilog The pipeliner was generating too many phis in the epilog blocks, which caused incorrect code generation when rewriting an instruction that uses the phi. In this case, there are 3 prolog and epilog stages. An existing phi was scheduled at stage 1. When generating the code for the 2nd epilog an extra new phi was generated. To fix this, we need to update the code that calculates the maximum number of phis that can be generated, which is based upon the current prolog stage and the stage of the original phi. In this case, when the prolog stage is 1 and the original phi stage is 1, the maximum number of phis to generate is 2. Patch by Brendon Cahoon. llvm-svn: 328543	2018-03-26 16:37:55 +00:00
Krzysztof Parzyszek	a212204453	[Pipeliner] Use latency to compute RecMII The patch contains several changes needed to pipeline an example that was transformed so that a Phi with a subreg is converted to copies. The pipeliner wasn't working for a couple of reasons. - The RecMII was 3 instead of 2 due to the extra copies. - Copy instructions contained a latency of 1. - The node order algorithm was not choosing the best "bottom" node, which caused an instruction to be scheduled that had a predecessor and successor already scheduled. - Updated the Hexagon Machine Scheduler to check if the node is latency bound when adding the cost for a 0-latency dependence. The RecMII was 3 because the computation looks at the number of nodes in the recurrence. The extra copy is an extra node but it shouldn't increase the latency. The new RecMII computation looks at the latency of the instructions in the recurrence. We changed the latency of the dependence of a copy to 0. The latency computation for the copy also checks the use of the copy (similar to a reg_sequence). The node order algorithm was not choosing the last instruction in the recurrence for a bottom up traversal. This was when the last instruction is a copy. A check was added when choosing the instruction to check for NodeNum if the maxASAP is the same. This means that the scheduler will not end up with another node in the recurrence that has both a predecessor and successor already scheduled. The cost computation in Hexagon Machine Scheduler adds cost when an instruction can be packetized with a zero-latency instruction. We should only do this if the schedule is latency bound. Patch by Brendon Cahoon. llvm-svn: 328542	2018-03-26 16:33:16 +00:00
Simon Pilgrim	8815105cd5	[X86][Btver2] Add CVTSD2SS/CVTSS2SD scheduler costs llvm-svn: 328541	2018-03-26 16:24:13 +00:00
Krzysztof Parzyszek	f13bbf1d58	[Pipeliner] Fix assert caused by pipeliner serialization The pipeliner is asserting because the serialization step that occurs at the end is deleting an instruction. The assert occurs later on because there is a use without a definition. The problem occurs when an instruction defines a value used by a REG_SEQUENCE and that value is used by a COPY instruction. The latencies between these instructions are zero, so they are put into the same packet. The serialization code is unable to handle this correctly, and ends up putting the REG_SEQUENCE before its definition. There is special code in the serialization step that attempts to handle zero-cost instructions (phis, copy, reg_sequence) differently than regular instructions. Unfortunately, this means the order does not come out correct. This patch simplifies the code by changing the separate steps for handling zero-cost and regular instructions. Only phis are handled separately now, since they should occur first. Then, this patch adds checks to make sure the MoveUse is set to the smallest value if there are multiple uses in a cycle. Patch by Brendon Cahoon. llvm-svn: 328540	2018-03-26 16:23:29 +00:00
Sebastian Pop	d870aea03e	[InstCombine] reassociate loop invariant GEP chains to enable LICM This change brings performance of zlib up by 10%. The example below is from a hot loop in longest_match() from zlib. do.body: %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ] %idx.ext = zext i32 %cur_match.addr.0 to i64 %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 %idx.ext1 %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 -1 In this example %idx.ext1 is a loop invariant. It will be moved above the use of loop induction variable %idx.ext such that it can be hoisted out of the loop by LICM. The operands that have dependences carried by the loop will be sunk down in the GEP chain. This patch will produce the following output: do.body: %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ] %idx.ext = zext i32 %cur_match.addr.0 to i64 %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext1 %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 -1 %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 %idx.ext llvm-svn: 328539	2018-03-26 16:19:31 +00:00

162432 Commits