llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	dbe3473634	[X86] Add test case to show missed opportunity to use a single pmuludq to implement a multiply when a zext lives in another basic block. This can occur when one of the inputs to the multiply is loop invariant. Though my test cases just use two basic blocks with an unconditional jump which we won't merge until after isel in the codegen pipeline. For scalars, I believe SelectionDAGBuilder can add an AssertZExt to pass knowledge across basic blocks but its explicitly disabled for vectors. llvm-svn: 347266	2018-11-19 22:04:12 +00:00
Konstantin Zhuravlyov	700b1ef54d	AMDGPU: Fix V_FMA_F16 selection on GFX9 GFX9 should select opsel version. Differential Revision: https://reviews.llvm.org/D54545 llvm-svn: 347265	2018-11-19 21:10:16 +00:00
Benjamin Kramer	fdd9b4fc8f	Revert "[LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches and switches" This reverts commits r347183 & r347184. Crashes while building libxml. llvm-svn: 347260	2018-11-19 20:01:20 +00:00
Stanislav Mekhanoshin	8bafbae889	[AMDGPU] Restored selection of scalar_to_vector (v2x16) This works if DAG combiner is enabled, but without combining we cannot select scalar_to_vector of <2 x half> and <2 x i16>. Differential Revision: https://reviews.llvm.org/D54718 llvm-svn: 347259	2018-11-19 19:58:13 +00:00
Vedant Kumar	238533ec2e	[InstCombine] Set debug loc on `mergeStoreIntoSuccessor` phi Assigning a merged debug location to the `mergeStoreIntoSuccessor` phi improves backtrace quality. Fixes llvm.org/PR38083. llvm-svn: 347257	2018-11-19 19:55:02 +00:00
Vedant Kumar	4de31bba51	[IR] Add hasNPredecessors, hasNPredecessorsOrMore to BasicBlock Add methods to BasicBlock which make it easier to efficiently check whether a block has N (or more) predecessors. This can be more efficient than using pred_size(), which is a linear time operation. We might consider adding similar methods for successors. I haven't done so in this patch because succ_size() is already O(1). With this patch applied, I measured a 0.065% compile-time reduction in user time for running `opt -O3` on the sqlite3 amalgamation (30 trials). The change in mergeStoreIntoSuccessor alone saves 45 million linked list iterations in a stage2 Release build of llc. See llvm.org/PR39702 for a harder but more general way of achieving similar results. Differential Revision: https://reviews.llvm.org/D54686 llvm-svn: 347256	2018-11-19 19:54:27 +00:00
Simon Pilgrim	740122fb8c	[DAGCombine] SimplifyNodeWithTwoResults - ensure same legalization for LO/HI operands (PR21207) Consistently use (!LegalOperations \|\| isOperationLegalOrCustom) for all node pairs. Differential Revision: https://reviews.llvm.org/D53478 llvm-svn: 347255	2018-11-19 19:37:59 +00:00
Reid Kleckner	c6846a812b	Fix clang test suite on Windows by reverting part of r347216 Otherwise, the clang analyzer tests fail on Windows when attempting to unpickle AnalyzerTest objects in the worker processes. The pattern of, add to path, import, remove from path, serialize, deserialize, doesn't work. Once something gets added to the path, if we want to move it across the wire for multiprocessing, we need to keep the module on sys.path. llvm-svn: 347254	2018-11-19 19:36:28 +00:00
Simon Pilgrim	01f4c4be90	Fix Wdocumentation warning. NFCI. llvm-svn: 347253	2018-11-19 19:18:33 +00:00
Simon Pilgrim	493ac5320b	Fix unused function warning. llvm-svn: 347252	2018-11-19 19:18:00 +00:00
Simon Pilgrim	de3605f56b	[TargetLowering] expandFP_TO_UINT - improve fp16 support As discussed on D53794, for float types with ranges smaller than the destination integer type, then we should be able to just use a regular FP_TO_SINT opcode. I thought we'd need to provide MSA test cases for very small integer types as well (fp16 -> i8 etc.), but it turns out that promotion will kick in so they're unnecessary. Differential Revision: https://reviews.llvm.org/D54703 llvm-svn: 347251	2018-11-19 19:16:13 +00:00
Roman Lebedev	f284803213	[IR] DISubprogram::toSPFlags(): fix "enumeral and non-enumeral type in conditional expression" /build/llvm/include/llvm/IR/DebugInfoMetadata.h: In static member function ‘static llvm::DISubprogram::DISPFlags llvm::DISubprogram::toSPFlags(bool, bool, bool, unsigned int)’: /build/llvm/include/llvm/IR/DebugInfoMetadata.h:1636:50: warning: enumeral and non-enumeral type in conditional expression [-Wextra] (IsLocalToUnit ? SPFlagLocalToUnit : 0) \| ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~ /build/llvm/include/llvm/IR/DebugInfoMetadata.h:1637:49: warning: enumeral and non-enumeral type in conditional expression [-Wextra] (IsDefinition ? SPFlagDefinition : 0) \| ~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~ /build/llvm/include/llvm/IR/DebugInfoMetadata.h:1638:48: warning: enumeral and non-enumeral type in conditional expression [-Wextra] (IsOptimized ? SPFlagOptimized : 0)); ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~ llvm-svn: 347250	2018-11-19 19:07:03 +00:00
Simon Pilgrim	9ad5717fcc	Add missing stream operator for Polynomial class to fix debug builds. llvm-svn: 347249	2018-11-19 18:57:49 +00:00
Craig Topper	a5e0380c30	[X86][CostModel] Don't lookup intrinsic cost tables if the intrinsic isn't one we care about We're seeing some issues internally where we sent some intrinsics into the cost model that the getTypeLegalizationCost call fails on, but X86 specific tables don't care about. Our base class implementation takes care of them. We'd just like X86 backend to ignore them. This patch makes sure the switch returned something X86 cares about and skips the table lookups and type legalization call if not. Probably more efficient too since we don't go scanning the tables for every intrinsic we could possibly see. Differential Revision: https://reviews.llvm.org/D54711 llvm-svn: 347248	2018-11-19 18:57:31 +00:00
Simon Pilgrim	0a1cb71e64	Add missing closing bracket. llvm-svn: 347247	2018-11-19 18:54:34 +00:00
Paul Robinson	fdaeb0c647	Fix build break from r347239 llvm-svn: 347246	2018-11-19 18:51:11 +00:00
Simon Pilgrim	987c253e5a	Fix Wdocumentation warning. NFCI. llvm-svn: 347245	2018-11-19 18:46:40 +00:00
Simon Pilgrim	c4861ab170	[X86][SSE] Remove unnecessary bit-and in pshufb vector ctlz (PR39703) SSE PSHUFB vector ctlz lowering works at the i4 nibble level. As detailed in PR39703, we were masking the lower nibble off but we only actually use it in the case where the upper nibble is known to be zero, making it safe to remove the mask and save an instruction. Differential Revision: https://reviews.llvm.org/D54707 llvm-svn: 347242	2018-11-19 18:40:59 +00:00
Martin Elshuber	5a47dc607e	[InterleavedLoadCombine] Fix warnings * remove unused function * fix compare llvm-svn: 347241	2018-11-19 18:35:31 +00:00
Craig Topper	311bbcd535	[X86] Attempt to improve v32i8/v64i8 multiply lowering by applying the v16i8 non-avx2 algorithm to each 128-bit lane. Previously we split the vectors in half to allow the two halves to be any extended then concatenated the results back together. This patch instead instead extends the v16i8 sse algorithm to extend half of each 128-bit lane using punpcklbw/punpckhbw. Multiplies all the low half lanes and high half lanes together in separate operations. Then merges the half lane results back together using packuswb. Unfortunately, some of the cases in vector-reduce-mul.ll regress because we aren't narrowing the vector width of the multiplies as we reduce. The splitting was somewhat making up for that before by causing halves to be discarded after the split. Differential Revision: https://reviews.llvm.org/D54668 llvm-svn: 347240	2018-11-19 18:32:53 +00:00
Paul Robinson	cda5421016	[DebugInfo] DISubprogram flags get their own flags word. NFC. This will hold flags specific to subprograms. In the future we could potentially free up scarce bits in DIFlags by moving subprogram-specific flags from there to the new flags word. This patch does not change IR/bitcode formats, that will be done in a follow-up. Differential Revision: https://reviews.llvm.org/D54597 llvm-svn: 347239	2018-11-19 18:29:28 +00:00
Sam Parker	1c803f5988	[ARM] Attempt to fix arm selfhost bots after rL347191 llvm-svn: 347238	2018-11-19 18:08:46 +00:00
Fangrui Song	d83a5526d5	[AMDGPU] Fix -Wunused-variable llvm-svn: 347234	2018-11-19 17:54:27 +00:00
Stanislav Mekhanoshin	054f8101f1	[AMDGPU] Convert insert_vector_elt into set of selects This allows to avoid scratch use or indirect VGPR addressing for small vectors. Differential Revision: https://reviews.llvm.org/D54606 llvm-svn: 347231	2018-11-19 17:39:20 +00:00
Francis Visoiu Mistrih	fe034625df	[llvm-nm] Fix use-after-free for MachOUniversalBinaries MachOObjectFile::getHostArch() returns a temporary, and getArchName returns a StringRef pointing to a temporary std::string. No tests since it doesn't trigger any errors except with the sanitizers. llvm-svn: 347230	2018-11-19 17:19:50 +00:00
Martin Elshuber	026a2a5152	[InterleavedLoadCombine] Fix warning unused variable Differential Revision: https://reviews.llvm.org/D52653 llvm-svn: 347229	2018-11-19 17:11:48 +00:00
Wouter van Oortmerssen	49482f824a	[WebAssembly] replaced .param/.result by .functype Summary: This makes it easier/cleaner to generate a single signature from this directive. Also: - Adds the symbol name, such that we don't depend on the location of this directive anymore. - Actually constructs the signature in the assembler, and make the assembler own it. - Refactor the use of MVT vs ValType in the streamer and assembler to require less conversions overall. - Changed 700 or so tests to use it. Reviewers: sbc100, dschuff Subscribers: jgravelle-google, eraman, aheejin, sunfish, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D54652 llvm-svn: 347228	2018-11-19 17:10:36 +00:00
Sanjay Patel	b25adf5edb	[SelectionDAG] simplify vector select with undef operand(s) llvm-svn: 347227	2018-11-19 17:06:05 +00:00
Benjamin Kramer	22a04efcb0	[InterleavedLoadCombine] Remove unused include. NFC. llvm-svn: 347226	2018-11-19 17:01:19 +00:00
Benjamin Kramer	2cad359c91	Revert "[LICM] Make LICM able to hoist phis" This reverts commit r347190. llvm-svn: 347225	2018-11-19 16:51:57 +00:00
David Stuttard	be3d7ba9fb	[AMDGPU] Derive GCNSubtarget from MF to get overridden target features Summary: AMDGPUAsmPrinter has a getSTI function that derives a GCNSubtarget from the TM. However, this means that overridden target features are not detected and can result in incorrect behaviour. Switch to using STM which is a GCNSubtarget derived from the MF (used elsewhere in the same function). Change-Id: Ib6328ad667b7fcdc87e9c06344e59859207db9b0 Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D54301 llvm-svn: 347221	2018-11-19 15:44:20 +00:00
Anna Thomas	5e9215f02b	[LV] Avoid vectorizing unsafe dependencies in uniform address Summary: Currently, when vectorizing stores to uniform addresses, the only instance we prevent vectorization is if there are multiple stores to the same uniform address causing an unsafe dependency. This patch teaches LAA to avoid vectorizing loops that have an unsafe cross-iteration dependency between a load and a store to the same uniform address. Fixes PR39653. Reviewers: Ayal, efriedma Subscribers: rkruppe, llvm-commits Differential Revision: https://reviews.llvm.org/D54538 llvm-svn: 347220	2018-11-19 15:39:59 +00:00
Sanjay Patel	08c0a0ac58	[Hexagon] make test immune to improvements in undef simplification llvm-svn: 347218	2018-11-19 15:34:09 +00:00
Sanjay Patel	60abc29b0a	[x86] add/make tests immune to improvements in undef simplification llvm-svn: 347217	2018-11-19 15:33:44 +00:00
Zachary Turner	58db03a116	Fix some issues with LLDB's lit configuration files. Recently I tried to port LLDB's lit configuration files over to use a on the surface, but broke some cases that weren't broken before and also exposed some additional problems with the old approach that we were just getting lucky with. When we set up a lit environment, the goal is to make it as hermetic as possible. We should not be relying on PATH and enabling the use of arbitrary shell commands. Instead, only whitelisted commands should be allowed. These are, generally speaking, the lit builtins such as echo, cd, etc, as well as anything for which substitutions have been explicitly set up for. These substitutions should map to the build output directory, but in some cases it's useful to be able to override this (for example to point to an installed tools directory). This is, of course, how it's supposed to work. What was actually happening is that we were bringing in PATH and LD_LIBRARY_PATH and then just running the given run line as a shell command. This led to problems such as finding the wrong version of clang-cl on PATH since it wasn't even a substitution, and flakiness / non-determinism since the environment the tests were running in would change per-machine. On the other hand, it also made other things possible. For example, we had some tests that were explicitly running cl.exe and link.exe instead of clang-cl and lld-link and the only reason it worked at all is because it was finding them on PATH. Unfortunately we can't entirely get rid of these tests, because they support a few things in debug info that clang-cl and lld-link don't (notably, the LF_UDT_MOD_SRC_LINE record which makes some of the tests fail. The high level changes introduced in this patch are: 1. Removal of functionality - The lit test suite no longer respects LLDB_TEST_C_COMPILER and LLDB_TEST_CXX_COMPILER. This means there is no more support for gcc, but nobody was using this anyway (note: The functionality is still there for the dotest suite, just not the lit test suite). There is no longer a single substitution %cxx and %cc which maps to <arbitrary-compiler>, you now explicitly specify the compiler with a substitution like %clang or %clangxx or %clang_cl. We can revisit this in the future when someone needs gcc. 2. Introduction of the LLDB_LIT_TOOLS_DIR directory. This does in spirit what LLDB_TEST_C_COMPILER and LLDB_TEST_CXX_COMPILER used to do, but now more friendly. If this is not specified, all tools are expected to be the just-built tools. If it is specified, the tools which are not themselves being tested but are being used to construct and run checks (e.g. clang, FileCheck, llvm-mc, etc) will be searched for in this directory first, then the build output directory. 3. Changes to core llvm lit files. The use_lld() and use_clang() functions were introduced long ago in anticipation of using them in lldb, but since they were never actually used anywhere but their respective problems, there were some issues to be resolved regarding generality and ability to use them outside their project. 4. Changes to .test files - These are all just replacing things like clang-cl with %clang_cl and %cxx with %clangxx, etc. 5. Changes to lit.cfg.py - Previously we would load up some system environment variables and then add some new things to them. Then do a bunch of work building out our own substitutions. First, we delete the system environment variable code, making the environment hermetic. Then, we refactor the substitution logic into two separate helper functions, one which sets up substitutions for the tools we want to test (which must come from the build output directory), and another which sets up substitutions for support tools (like compilers, etc). 6. New substitutions for MSVC -- Previously we relied on location of MSVC by bringing in the entire parent's PATH and letting subprocess.Popen just run the command line. Now we set up real substitutions that should have the same effect. We use PATH to find them, and then look for INCLUDE and LIB to construct a substitution command line with appropriate /I and /LIBPATH: arguments. The nice thing about this is that it opens the door to having separate %msvc-cl32 and %msvc-cl64 substitutions, rather than only requiring the user to run vcvars first. Because we can deduce the path to 32-bit libraries from 64-bit library directories, and vice versa. Without these substitutions this would have been impossible. Differential Revision: https://reviews.llvm.org/D54567 llvm-svn: 347216	2018-11-19 15:12:34 +00:00
Fedor Sergeev	3a3d688cc8	[LoopPass] fixing 'Modification' messages in -debug-pass=Executions for loop passes Legacy loop pass manager is issuing "Made Modification" message after each Loop Pass run, however condition for issuing it is accumulated among all the runs. That leads to confusing 'modification' messages as soon as the first modification is done. Changing condition to be "current pass made modifications", similar to how it is being done in all other pass managers. llvm-svn: 347215	2018-11-19 15:10:59 +00:00
Sanjay Patel	a1dca3553e	[SelectionDAG] simplify select FP with undef condition llvm-svn: 347212	2018-11-19 14:42:28 +00:00
Sanjay Patel	7a51bdcf3b	[x86] add test for select FP with undef condition; NFC llvm-svn: 347211	2018-11-19 14:39:57 +00:00
Sanjay Patel	c036d844be	[SelectionDAG] add simplifySelect() to reduce code duplication; NFC This should be extended to handle FP and vectors in follow-up patches. llvm-svn: 347210	2018-11-19 14:35:22 +00:00
Clement Courbet	bbab546a71	[llvm-exegesis][NFC] More tests for ExegesisTarget::fillMemoryOperands(). Reviewers: gchatelet Subscribers: tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D54304 llvm-svn: 347209	2018-11-19 14:31:43 +00:00
Martin Elshuber	fef3036d37	Subject: [PATCH] [CodeGen] Add pass to combine interleaved loads. This patch defines an interleaved-load-combine pass. The pass searches for ShuffleVector instructions that represent interleaved loads. Matches are converted such that they will be captured by the InterleavedAccessPass. The pass extends LLVMs capabilities to use target specific instruction selection of interleaved load patterns (e.g.: ld4 on Aarch64 architectures). Differential Revision: https://reviews.llvm.org/D52653 llvm-svn: 347208	2018-11-19 14:26:10 +00:00
Eugene Leviant	0c7460ad05	[ThinLTO] Fix comment. NFC llvm-svn: 347207	2018-11-19 14:19:37 +00:00
Sanjay Patel	3fa76bd08f	[SelectionDAG] fix formatting; NFC llvm-svn: 347206	2018-11-19 14:03:07 +00:00
Roman Lebedev	71fdb57640	[llvm-exegesis] (+final perf overview) InstructionBenchmarkClustering::rangeQuery(): reserve for the upper bound of Neighbors Summary: As it was pointed out in D54388+D54390, the maximal size of `Neighbors` is known, it will contain at most Points_.size() minus one (the center of the cluster) While that is the upper bound, meaning in the most cases, the actual count will be much smaller, since D54390 made the allocation persistent, we no longer have to worry about overly-optimistically `reserve()`ing. Old: (D54393) ``` Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs): 6553.167456 task-clock (msec) # 1.000 CPUs utilized ( +- 0.21% ) ... 6.5547 +- 0.0134 seconds time elapsed ( +- 0.20% ) ``` New: ``` Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs): 6315.057872 task-clock (msec) # 0.999 CPUs utilized ( +- 0.24% ) ... 6.3187 +- 0.0160 seconds time elapsed ( +- 0.25% ) ``` And that is another -~4%. Since this is the last (as of this moment) patch in this patch series, it is a good time to summarize: Old: (svn trunk, as stated in D54381) ``` $ time ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html &> /dev/null real 0m24.884s user 0m24.099s sys 0m0.785s ``` So these patches, on a given benchmark, has decreased llvm-exegesis analysis time by 74.62%. There surely is more room for further improvements. D54514 may improve thins by -11.5% more (relative to this patch). Parallelization may improve things further significantly, too. Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn Reviewed By: courbet, MaskRay Subscribers: tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D54415 llvm-svn: 347204	2018-11-19 13:28:41 +00:00
Roman Lebedev	8e315b66c2	[llvm-exegesis] Move InstructionBenchmarkClustering::isNeighbour() into header Summary: Old: (D54390) ``` Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs): 7432.421721 task-clock (msec) # 1.000 CPUs utilized ( +- 0.15% ) ... 7.4336 +- 0.0115 seconds time elapsed ( +- 0.15% ) ``` New: ``` Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs): 6569.936144 task-clock (msec) # 1.000 CPUs utilized ( +- 0.22% ) ... 6.5711 +- 0.0143 seconds time elapsed ( +- 0.22% ) ``` And another -12%. You'd think it would be `inline`d anyway, but no! :) Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn Reviewed By: courbet, MaskRay Subscribers: tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D54393 llvm-svn: 347203	2018-11-19 13:28:36 +00:00
Roman Lebedev	666d855fbb	[llvm-exegesis] InstructionBenchmarkClustering::rangeQuery(): write into llvm::SmallVectorImpl& output parameter Summary: I do believe this is the correct fix. We call `rangeQuery()` very often. And many times it's output vector is large (tens of thousands entries), so small-size-opt won't help. Old: (D54389) ``` Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs): 7934.528363 task-clock (msec) # 1.000 CPUs utilized ( +- 0.19% ) ... 7.9354 +- 0.0148 seconds time elapsed ( +- 0.19% ) ``` New: ``` Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs): 7383.793440 task-clock (msec) # 1.000 CPUs utilized ( +- 0.47% ) ... 7.3868 +- 0.0340 seconds time elapsed ( +- 0.46% ) ``` And another -7%. And that isn't even the good bit yet. Old: * calls to allocation functions: 2081419 * temporary allocations: 219658 (10.55%) * bytes allocated in total (ignoring deallocations): 4.31 GB New: * calls to allocation functions: 1880295 (-10%) * temporary allocations: 18758 (1%) (-91% sic) * bytes allocated in total (ignoring deallocations): 545.15 MB (-88% sic) Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn Reviewed By: courbet, MaskRay Subscribers: tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D54390 llvm-svn: 347202	2018-11-19 13:28:31 +00:00
Roman Lebedev	5c5b1ea725	[llvm-exegesis] InstructionBenchmarkClustering::dbScan(): replace std::vector<> with std::deque<> in llvm::SetVector<> Summary: Old: (D54388) ``` Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs): 8606.323981 task-clock (msec) # 1.000 CPUs utilized ( +- 0.11% ) ... 8.60773 +- 0.00978 seconds time elapsed ( +- 0.11% ) ``` New: ``` Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs): 7971.403653 task-clock (msec) # 1.000 CPUs utilized ( +- 0.14% ) ... 7.9728 +- 0.0113 seconds time elapsed ( +- 0.14% ) ``` Another -~7%. Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn Reviewed By: courbet, RKSimon Subscribers: tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D54389 llvm-svn: 347201	2018-11-19 13:28:26 +00:00
Roman Lebedev	8aecb0c489	[llvm-exegesis] InstructionBenchmarkClustering::rangeQuery(): use llvm::SmallVector<size_t, 0> for storage. Summary: Old: (D54383) ``` Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs): 9098.781978 task-clock (msec) # 1.000 CPUs utilized ( +- 0.16% ) ... 9.1015 +- 0.0148 seconds time elapsed ( +- 0.16% ) ``` New: ``` Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (10 runs): 8553.352480 task-clock (msec) # 1.000 CPUs utilized ( +- 0.12% ) ... 8.5539 +- 0.0105 seconds time elapsed ( +- 0.12% ) ``` So another -6%. That is because the `SmallVector` doubles it size when reallocating, which is great here, since we can't `reserve()` since we can't know how many `Neighbors` we will have. Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn Subscribers: tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D54388 llvm-svn: 347200	2018-11-19 13:28:22 +00:00
Roman Lebedev	b311c1d6b8	[llvm-exegesis] Analysis: writeMeasurementValue(): don't alloc string for double each time. Summary: Test data: 500kLOC of benchmark.yaml, 23Mb. (that is a subset of the actual uops benchmark i was trying to analyze!) Old time: (D54382) ``` Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs): 9024.354355 task-clock (msec) # 1.000 CPUs utilized ( +- 0.18% ) ... 9.0262 +- 0.0161 seconds time elapsed ( +- 0.18% ) ``` New time: ``` Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs): 8996.541057 task-clock (msec) # 0.999 CPUs utilized ( +- 0.19% ) ... 9.0045 +- 0.0172 seconds time elapsed ( +- 0.19% ) ``` -~0.3%, not that much. But this isn't the important part. Old: * calls to allocation functions: 2109712 * temporary allocations: 33112 * bytes allocated in total (ignoring deallocations): 4.43 GB New: * calls to allocation functions: 2095345 (-0.68%) * temporary allocations: 18745 (-43.39% !!!) * bytes allocated in total (ignoring deallocations): 4.31 GB (-2.71%) Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn Reviewed By: courbet Subscribers: tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D54383 llvm-svn: 347199	2018-11-19 13:28:17 +00:00
Roman Lebedev	f8b28e9bf4	[llvm-exegesis] Analysis::writeSnippet(): be smarter about memory allocations. Summary: Test data: 500kLOC of benchmark.yaml, 23Mb. (that is a subset of the actual uops benchmark i was trying to analyze!) Old time: (D54381) ``` $ time ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html &> /dev/null real 0m10.487s user 0m9.745s sys 0m0.740s ``` New time: ``` $ time ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html &> /dev/null real 0m9.599s user 0m8.824s sys 0m0.772s ``` Not that much, around -9%. But that is not the good part yet, again. Old: * calls to allocation functions: 3347676 * temporary allocations: 277818 * bytes allocated in total (ignoring deallocations): 10.52 GB New: * calls to allocation functions: 2109712 (-36%) * temporary allocations: 33112 (-88%) * bytes allocated in total (ignoring deallocations): 4.43 GB (-58% sic) Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn Reviewed By: courbet, MaskRay Subscribers: tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D54382 llvm-svn: 347198	2018-11-19 13:28:14 +00:00
Roman Lebedev	0b4b512826	[llvm-exegesis] InstructionBenchmarkClustering::dbScan(): use llvm::SetVector<> instead of ILLEGAL std::unordered_set<> Summary: Test data: 500kLOC of benchmark.yaml, 23Mb. (that is a subset of the actual uops benchmark i was trying to analyze!) Old time: ``` $ time ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html &> /dev/null real 0m24.884s user 0m24.099s sys 0m0.785s ``` New time: ``` $ time ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html &> /dev/null real 0m10.469s user 0m9.797s sys 0m0.672s ``` So -60%. And that isn't the good bit yet. Old: * calls to allocation functions: 106560180 (yes, 107 million allocations.) * bytes allocated in total (ignoring deallocations): 12.17 GB New: * calls to allocation functions: 3347676 (-96.86%) (just 3 mil) * bytes allocated in total (ignoring deallocations): 10.52 GB (~2GB less) --- Two points i want to raise: * `std::unordered_set<>` should not have been used there in the first place. It is banned by the https://llvm.org/docs/ProgrammersManual.html#other-set-like-container-options * There is no tests, so i'm not fully sure this is correct. Since it was unordered set, i guess there are zero restrictions on the order, and anything will be ok? * I tried other containers suggested in https://llvm.org/docs/ProgrammersManual.html#set-like-containers-std-set-smallset-setvector-etc, this `llvm::SetVector<>` seems to be best here. Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn Reviewed By: courbet Subscribers: kristina, bobsayshilol, tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D54381 llvm-svn: 347197	2018-11-19 13:28:09 +00:00
Simon Pilgrim	f6c2fbdd1a	[X86] Add codegen tests for slow-shld scalar funnel shifts llvm-svn: 347195	2018-11-19 12:29:41 +00:00
Michael Platings	dad611c1a0	Test commit - delete a trailing space. llvm-svn: 347193	2018-11-19 12:10:07 +00:00
Nicolai Haehnle	c548d91419	AMDGPU/InsertWaitcnts: Some more const-correctness Reviewers: msearles, rampitec, scott.linder, kanarayan Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits, hakzsam Differential Revision: https://reviews.llvm.org/D54225 llvm-svn: 347192	2018-11-19 12:03:11 +00:00
Sam Parker	e7c42dd7e2	[ARM] Remove trunc sinks in ARM CGP Truncs are treated as sources if their produce a value of the same type as the one we currently trying to promote. Truncs used to be considered as a sink if their operand was the same value type. We now allow smaller types in the search, so we should search through truncs that produce a smaller value. These truncs can then be converted to an AND mask. This leaves sinks as being: - points where the value in the register is being observed, such as an icmp, switch or store. - points where value types have to match, such as calls and returns. - zext are included to ease the transformation and are generally removed later on. During this change, it also became apart from truncating sinks was broken: if a sink used a source, its type information had already been lost by the time the truncation happens. So I've changed the method of caching the type information. Differential Revision: https://reviews.llvm.org/D54515 llvm-svn: 347191	2018-11-19 11:34:40 +00:00
John Brawn	12c046fba0	[LICM] Make LICM able to hoist phis The general approach taken is to make note of loop invariant branches, then when we see something conditional on that branch, such as a phi, we create a copy of the branch and (empty versions of) its successors and hoist using that. This has no impact by itself that I've been able to see, as LICM typically doesn't see such phis as they will have been converted into selects by the time LICM is run, but once we start doing phi-to-select conversion later it will be important. Differential Revision: https://reviews.llvm.org/D52827 llvm-svn: 347190	2018-11-19 11:31:24 +00:00
Anton Korobeynikov	4df19b75c0	[MSP430] Optimize srl/sra in case of A >> (8 + N) There is no variable-length shifts on MSP430. Therefore "eat" 8 bits of shift via bswap & ext. Path by Kristina Bessonova! Differential Revision: https://reviews.llvm.org/D54623 llvm-svn: 347187	2018-11-19 10:43:02 +00:00
Serge Guelton	12c7a96064	Fix disturbing warning - NFCI llvm-svn: 347186	2018-11-19 10:05:28 +00:00
Craig Topper	8b22bcd39f	[X86] Use a pcmpgt with 0 instead of psrad 31, to fill elements with the sign bit in v4i32 MULH lowering. The shift requires a copy to avoid clobbering a register. Comparing with 0 uses an xor to produce 0 that will be overwritten with the compare results. So still requires 2 instructions, but should be one byte shorter since it doesn't need to encode an immediate. llvm-svn: 347185	2018-11-19 07:22:26 +00:00
Fangrui Song	209cfbe60e	[LoopSimplifyCFG] Add requires: asserts after rL347183 llvm-svn: 347184	2018-11-19 06:28:15 +00:00
Max Kazantsev	8e3e33d138	[LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches and switches This patch introduces infrastructure and the simplest case for constant-folding of branch and switch instructions within loop into unconditional branches. It is useful as a cleanup for such passes as loop unswitching that sometimes produce such branches. Only the simplest case supported in this patch: after the folding, no block should become dead or stop being part of the loop. Support for more sophisticated cases will go separately in follow-up patches. Differential Revision: https://reviews.llvm.org/D54021 Reviewed By: anna llvm-svn: 347183	2018-11-19 05:54:38 +00:00
Vedant Kumar	e7b789b529	[ProfileSummary] Standardize methods and fix comment Every Analysis pass has a get method that returns a reference of the Result of the Analysis, for example, BlockFrequencyInfo &BlockFrequencyInfoWrapperPass::getBFI(). I believe that ProfileSummaryInfo::getPSI() is the only exception to that, as it was returning a pointer. Another change is renaming isHotBB and isColdBB to isHotBlock and isColdBlock, respectively. Most methods use BB as the argument of variable names while methods usually refer to Basic Blocks as Blocks, instead of BB. For example, Function::getEntryBlock, Loop:getExitBlock, etc. I also fixed one of the comments. Patch by Rodrigo Caetano Rocha! Differential Revision: https://reviews.llvm.org/D54669 llvm-svn: 347182	2018-11-19 05:23:16 +00:00
Craig Topper	3616891046	[X86] Use compare with 0 to fill an element with sign bits when sign extending to v2i64 pre-sse4.1 Previously we used an arithmetic shift right by 31, but that requires a copy to preserve the input. So we might as well materialize a zero and compare to it since the comparison will overwrite the register that contains the zeros. This should be one byte shorter. llvm-svn: 347181	2018-11-19 04:33:20 +00:00
Craig Topper	053f1eea96	[X86] Remove most of the SEXTLOAD Custom setOperationAction calls under -x86-experimental-vector-widening-legalization. Leave just the v4i8->v4i64 and v8i8->v8i64, but only enable them on pre-sse4.1 targets when 64-bit mode is enabled. In those cases we end up creating sext loads that get scalarized to code that looks better than what we get from loading into a vector register and doing a multiple step sign extend using unpacks and shifts. llvm-svn: 347180	2018-11-19 00:33:16 +00:00
Simon Pilgrim	7f92efa5a9	[X86][SSE] Add SimplifyDemandedVectorElts support for SSE packed i2fp conversions. llvm-svn: 347177	2018-11-18 22:13:31 +00:00
Craig Topper	0468c860b7	[X86] Add custom type legalization for extending v4i8/v4i16->v4i64. Pre-SSE4.1 sext_invec for v2i64 is complicated because we don't have a v2i64 sra instruction. So instead we sign extend to i32 using unpack and sra, then copy the elements and do a v4i32 sra to fill with sign bits, then interleave the i32 sign extend and the sign bits. So really we're doing to two sign extends but only using half of the v4i32 intermediate result. When the result is more than 128 bits, default type legalization would prefer to split the destination type all the way down to v2i64 with shuffles followed by v16i8/v8i16->v2i64 sext_inreg operations. This results in more instructions than necessary because we are only utilizing the lower 2 elements of the v4i32 intermediate result. Instead we can custom split a v4i8/v4i16->v4i64 sign_extend. Then we can sign extend v4i8/v4i16->v4i32 invec producing a full v4i32 result. Create the sign bit vector as a v4i32 then split and interleave with the sign bits using an punpackldq and punpackhdq. llvm-svn: 347176	2018-11-18 21:28:50 +00:00
Craig Topper	950f3842cc	[X86] Add a 32-bit command line with only sse2 to vector-sext.ll and vector-sext.ll to show some of the scalarized load sequences without 64-bit scalar support. Some of these sequeces look pretty bad since we have to copy the sign bit from a 32 bit register to a 64 bit register to finish a sign extend. llvm-svn: 347175	2018-11-18 21:28:47 +00:00
Simon Pilgrim	b31bdbd2e9	[X86][SSE] Add SimplifyDemandedVectorElts support for SSE splat-vector-shifts. SSE vector shifts only use the bottom 64-bits of the shift amount vector. llvm-svn: 347173	2018-11-18 20:21:52 +00:00
Craig Topper	11d50948e2	[X86] Disable combineToExtendVectorInReg under -x86-experimental-vector-widening-legalization. Add custom type legalization for extends. If we widen illegal types instead of promoting, we should be able to rely on the type legalizer to create the vector_inreg operations for us with some caveats. This patch disables combineToExtendVectorInReg when we are using widening. I've enabled custom legalization for v8i8->v8i64 extends under avx512f since the type legalizer would want to create a vector_inreg with a v64i8 input type which isn't legal without avx512bw. So we go to v16i8 with custom code using the relaxation of rules we get from D54346. I've also enable custom legalization of v8i64 and v16i32 operations with with AVX. When the input type is 128 bits, the default splitting legalization would extend first 128->256, then do the a split to two 128 pieces. Extend each half to 256 and then concat the result. The custom legalization I've added instead uses a 128->256 bit vector_inreg extend that only reads the lower 64-bits for the low half of the split. Then shuffles the high 64-bits to the low 64-bits and does another vector_inreg extend. llvm-svn: 347172	2018-11-18 18:11:25 +00:00
Craig Topper	bc8148f7b0	[X86] Lower v16i16->v8i16 truncate using an 'and' with 255, an extract_subvector, and a packuswb instruction. Summary: This is an improvement over the two pshufbs and punpcklqdq we'd get otherwise. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54671 llvm-svn: 347171	2018-11-18 17:59:28 +00:00
Sanjay Patel	8c0cd77bff	[DAG] add undef simplifications for select nodes Sadly, this duplicates (twice) the logic from InstSimplify. There might be some way to at least share the DAG versions of the code, but copying the folds seems to be the standard method to ensure that we don't miss these folds. Unlike in IR, we don't run DAGCombiner to fixpoint, so there's no way to ensure that we do these kinds of simplifications unless the code is repeated at node creation time and during combines. There were other tests that would become worthless with this improvement that I changed as pre-commits: rL347161 rL347164 rL347165 rL347166 rL347167 I'm not sure how to salvage the remaining tests (diffs in this patch). So the x86 tests verify that the new code is working as intended. The AMDGPU test is actually similar to my motivating case: we have some undef value that has survived to machine IR in an x86 test, and then it gets folded in some weird way, or we crash if we don't transfer the undef flag. But we would have been better off never getting to that point by doing these simplifications. This will lead back to PR32023 someday... https://bugs.llvm.org/show_bug.cgi?id=32023 llvm-svn: 347170	2018-11-18 17:36:23 +00:00
Simon Pilgrim	ec808cf541	Remove unused variable. NFCI. llvm-svn: 347169	2018-11-18 17:24:59 +00:00
Simon Pilgrim	50828c75d0	[X86][SSE] Split IsSplatValue into GetSplatValue and IsSplatVector Refactor towards making this recursive (necessary for PR38243 rotation splat detection). IsSplatVector returns the original vector source of the splat and the splat index. GetSplatValue returns the scalar splatted value as an extraction from IsSplatVector. llvm-svn: 347168	2018-11-18 17:15:06 +00:00
Sanjay Patel	bc23408fe5	[x86] regenerate full checks; NFC llvm-svn: 347167	2018-11-18 16:56:17 +00:00
Sanjay Patel	7e659ef4b1	[SystemZ] make test immune to improvements in undef simplification llvm-svn: 347166	2018-11-18 16:50:44 +00:00
Sanjay Patel	cb04e590d3	[Hexagon] make tests immune to improvements in undef simplification llvm-svn: 347165	2018-11-18 16:50:16 +00:00
Sanjay Patel	becf03efa1	[ARM] make test immune to improvements in undef simplification llvm-svn: 347164	2018-11-18 16:49:42 +00:00
Simon Pilgrim	fec9f8657b	[X86][SSE] Relax IsSplatValue - remove the 'variable shift' limit on subtracts. Means we don't use the per-lane-shifts as much when we can cheaply use the older splat-variable-shifts. llvm-svn: 347162	2018-11-18 15:52:08 +00:00
Sanjay Patel	40509997eb	[x86] make tests immune to improvements in undef handling llvm-svn: 347161	2018-11-18 15:27:19 +00:00
Sanjay Patel	42c22a1f87	[SelectionDAG] simplify code; NFC llvm-svn: 347160	2018-11-18 14:39:03 +00:00
Simon Pilgrim	7fdbae3224	[X86][SSE] Add some generic masked gather codegen tests llvm-svn: 347159	2018-11-18 14:35:57 +00:00
Simon Pilgrim	cc1f5d2407	[X86][SSE] Use raw shuffle mask decode in SimplifyDemandedVectorEltsForTargetNode (PR39549) We were using the 'normalized' shuffle mask from resolveTargetShuffleInputs, which replaces zero/undef inputs with sentinel values. For SimplifyDemandedVectorElts we need the raw mask so we can correctly demand those 'zero' inputs that got normalized away, this requires an extra bit of logic to locally normalize undef inputs. llvm-svn: 347158	2018-11-18 13:34:53 +00:00
Kamil Rytarowski	83aabf43ea	Swap order of discovering of -ltinfo and -lterminfo Summary: NetBSD ships with native curses(3) and -ltinfo is a part of ncurses. Set -lterminfo before -ltinfo, as it allows to prioritize native curses libraries. Mixing curses and ncurses does not work well, especially in software built on top of llvm. Original patch by Ryo Onodera (NetBSD) in pkgsrc. Reviewers: labath, dim, mgorny Reviewed By: dim, mgorny Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54650 llvm-svn: 347156	2018-11-18 12:13:51 +00:00
Heejin Ahn	e0f8b9bfc6	[WebAssembly] Add null streamer support Summary: Now `llc -filetype=null` works. Reviewers: eush Subscribers: dschuff, jgravelle-google, sbc100, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D54660 llvm-svn: 347155	2018-11-18 11:58:47 +00:00
Heejin Ahn	7a391ff918	[WebAssembly] Add equality comparison operators for WasmEventType Summary: This was missing in D54096. Independent tests for this is not available here, because these are used in lld. Reviewers: sbc100 Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D54662 llvm-svn: 347154	2018-11-18 11:53:35 +00:00
Craig Topper	cd94a7c227	[X86] Add -x86-experimental-vector-widening-legalization check to combineSelect and combineSetCC to cover vXi16/vXi8 promotion without BWI. I don't yet have any test cases for this, but its the right thing to do based on log file inspection. llvm-svn: 347151	2018-11-18 08:30:09 +00:00
Craig Topper	b03f80a21c	[X86] Rename WidenMaskArithmetic->PromoteMaskArithmetic since we usually use widen to refer to adding elements not making elements larger. NFC llvm-svn: 347150	2018-11-18 07:35:08 +00:00
Craig Topper	f56a57518d	[X86] Don't use a pmaddwd for vXi32 multiply if the inputs are zero extends from i8 or smaller without SSE4.1. Prefer to shrink the mul instead. The zero extend will require two stages of unpacks to implement. So its better to shrink the multiply using pmullw and then extend that result back to v4i32 using a single unpack. llvm-svn: 347149	2018-11-18 05:53:21 +00:00
John Regehr	ab7781493d	tighten up a couple of assertions. hitting the BitPosition == BitWidth case that was previously not caught resulted in nasty corruption of APInts that (on my system at least) could not be detected using UBSan, ASan, or Valgrind. this patch does not cause any extra failures in a check-all nor does it interfere with bootstrapping. David Blaikie informally approved this change. llvm-svn: 347148	2018-11-18 01:51:43 +00:00
Vedant Kumar	35f504c113	[CorrelatedValuePropagation] Preserve debug locations (PR38178) Fix all of the missing debug location errors in CVP found by debugify. This includes the missing-location-after-udiv-truncation case described in llvm.org/PR38178. llvm-svn: 347147	2018-11-18 00:29:58 +00:00
Teresa Johnson	5b9bb25c45	Fix bot failure from r347145 The #if check around the statistics computation gave an error about the statistic being an unused variable. Instead, guard with AreStatisticsEnabled(). llvm-svn: 347146	2018-11-17 20:41:45 +00:00
Teresa Johnson	8c1915cc01	[ThinLTO] Add some stats for read only variable internalization Summary: Follow up to D49362 ([ThinLTO] Internalize read only globals). Add a statistic on the number of read only variables (only counting live variables since dead variables will be dropped anyway). Reviewers: evgeny777 Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D54642 llvm-svn: 347145	2018-11-17 20:03:22 +00:00
Craig Topper	0438d791fa	[X86] Add support for matching PACKUSWB from a v64i8 shuffle. llvm-svn: 347143	2018-11-17 18:54:43 +00:00
Craig Topper	c6c760f07f	[X86] Add test case to show missed opportunity to use PACKUSWB in v64i8 shuffle lowering. llvm-svn: 347142	2018-11-17 18:54:41 +00:00
David Blaikie	ef543381ed	Move BuryPointer from Clang to LLVM for use in other LLVM tools Specifically planning to use this in llvm-symbolizer to remove the cost of cleanup there. llvm-svn: 347140	2018-11-17 18:03:47 +00:00
Simon Pilgrim	0e1a9d5ee6	[X86][SSE] Add shuffle demanded elts test case for PR39549 llvm-svn: 347139	2018-11-17 14:06:03 +00:00
Xing GUO	785edea926	[llvm-objdump] Print a blank row at the end of sections Summary: When using option `-x` (--all-headers), it will print `Sections`, `Symbol Table`, `Program Header` ... `Sections` and `Symbol Table` will be connected together. Before: ``` Sections: Idx Name Size Address Type 0 00000000 0000000000000000 ... 29 .shstrtab 0000011a 0000000000000000 SYMBOL TABLE: ... ``` After: ``` Sections: Idx Name Size Address Type 0 00000000 0000000000000000 ... 29 .shstrtab 0000011a 0000000000000000 SYMBOL TABLE: ... ``` Reviewers: Higuoxing Reviewed By: Higuoxing Subscribers: llvm-commits, jhenderson Differential Revision: https://reviews.llvm.org/D54665 llvm-svn: 347135	2018-11-17 08:12:48 +00:00
David Blaikie	81959a2730	llvm-symbolizer: Avoid calling getFromOffset when the index entry is already available Especially for symbolizer it can be efficient to have to search through the entire index when it isn't needed - llvm-symbolizer looks up only a few CUs & already has an index available in getUnitForEntry, once it's passed down to DWARFUnitHeader::extract then there's no need for it to call getFromOffset. llvm-svn: 347134	2018-11-17 05:57:58 +00:00
Craig Topper	dd61f11642	[X86] Don't extend v32i8 multiplies to v32i16 with avx512bw and prefer-vector-width=256. llvm-svn: 347131	2018-11-17 02:36:07 +00:00
Craig Topper	d8da95bbe3	[X86] Add test cases to show incorrect use of a 512 bit vector in v32i8 multiply lowering with prefer-vector-width=256. On the min-legal-vector-width test this actually causes some of the v32i16 operations we emitted to be scalarized. llvm-svn: 347130	2018-11-17 02:36:02 +00:00
Vyacheslav Zakharin	6a5d5ac4bd	Reverted r347092 due to the following build fails: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/8662 http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/26263 llvm-svn: 347129	2018-11-17 02:26:34 +00:00
Nico Weber	f94d6ea95a	Add initial scaffolding for the GN build. See "GN build roundtable summary; adding GN build files to the repo" on llvm-dev and cfe-dev for discussion. In particular, this build is completely unsupported. People adding new files to LLVM are not expected to update the GN build files, and reviewers are not supposed to request the gn build files to be updated. This adds just enough to be able to build llvm/lib/Demangle. It requires using a monorepo. This adds a few build config options you can set in args.gn (`gn args out/foo --list` for all): - is_debug = true to enable debug builds (defaults to release) - llvm_enable_assertions to toggle assertions (defaults to true) - clang_base_path, if set an absolute path to a locally-built clang to be used as host compiler Differential Revision: https://reviews.llvm.org/D54345 llvm-svn: 347128	2018-11-17 02:21:53 +00:00
Craig Topper	b05ea28f1f	[X86] Use getUnpackl/getUnpackh instead of hardcoding a shuffle mask. llvm-svn: 347127	2018-11-17 02:18:12 +00:00
Fangrui Song	7570932977	Use llvm::copy. NFC llvm-svn: 347126	2018-11-17 01:44:25 +00:00
Fangrui Song	5ec95dbd74	[llvm-objcopy] Use llvm::all_of and rename the variables "Segment" to avoid confusion with the type of the same name llvm-svn: 347123	2018-11-17 01:15:55 +00:00
Stanislav Mekhanoshin	c12d64ab16	Moved dag-combine-select-undef.ll into amdgpu. NFC. Tests really needs target arch to be specified. llvm-svn: 347115	2018-11-17 00:17:15 +00:00
James Y Knight	85362113c7	Make git-llvm python3 compatible again. Hopefully. :) llvm-svn: 347113	2018-11-16 23:59:23 +00:00
Stanislav Mekhanoshin	c3214ad1dd	Fixed test after r347110 Comments in llc outputs are printed differently on different platforms, some with '#', some with '##'. Removed non-essential part of the checks. llvm-svn: 347112	2018-11-16 23:40:04 +00:00
Stanislav Mekhanoshin	0ff7c8309d	DAG combiner: fold (select, C, X, undef) -> X Differential Revision: https://reviews.llvm.org/D54646 llvm-svn: 347110	2018-11-16 23:13:38 +00:00
Craig Topper	ee0333b4a9	[X86] Add custom promotion of narrow fp_to_uint/fp_to_sint operations under -x86-experimental-vector-widening-legalization. This tries to force the result type to vXi32 followed by a truncate. This can help avoid scalarization that would otherwise occur. There's some annoying examples of an avx512 truncate instruction followed by a packus where we should really be able to just use one truncate. But overall this is still a net improvement. llvm-svn: 347105	2018-11-16 22:53:00 +00:00
James Y Knight	12167822be	Speed up git-llvm script by only svn up'ing affected directories. Also, support modifications to toplevel files in git (which need to be committed to "monorepo-root" in svn). Differential Revision: https://reviews.llvm.org/D54341 llvm-svn: 347103	2018-11-16 22:36:17 +00:00
Craig Topper	87bc07b3dd	[X86] Qualify part of the masked gather handling in ReplaceNodeResults with a getTypeAction call to know if we can use default legalization. If we managed to switch to -x86-experimental-vector-widening-legalization this block can be removed. llvm-svn: 347100	2018-11-16 22:04:29 +00:00
Sam Clegg	a2827edc2f	[WebAssembly] Cleanup unused declares in test code. NFC. In one case probably you have be using it, in the other it looks like it was redundant. Differential Revision: https://reviews.llvm.org/D54644 llvm-svn: 347098	2018-11-16 21:20:00 +00:00
Fedor Sergeev	2e3e224e71	[SimpleLoopUnswitch] adding cost multiplier to cap exponential unswitch with We need to control exponential behavior of loop-unswitch so we do not get run-away compilation. Suggested solution is to introduce a multiplier for an unswitch cost that makes cost prohibitive as soon as there are too many candidates and too many sibling loops (meaning we have already started duplicating loops by unswitching). It does solve the currently known problem with compile-time degradation (PR 39544). Tests are built on top of a recently implemented CHECK-COUNT-<num> FileCheck directives. Reviewed By: chandlerc, mkazantsev Differential Revision: https://reviews.llvm.org/D54223 llvm-svn: 347097	2018-11-16 21:16:43 +00:00
Craig Topper	567aaeb40d	[X86] Remove a branch on SSE4.1 from LowerLoad We should be able to use getExtendInVec with or without sse4.1 to produce a SIGN_EXTEND_VECTOR_INREG. llvm-svn: 347095	2018-11-16 21:05:00 +00:00
Craig Topper	9e97054211	[LegalizeVectorOps] After custom legalizing an extending load or a truncating store, make sure the custom code is also legal. For example, on X86 we emit a sign_extend_vector_inreg from LowerLoad and without sse4.1 this node will need further legalization. Previously this sign_extend_vector_inreg was being custom lowered during DAG legalization instead of vector op legalization. Unfortunately, this doesn't seem to matter for the output of any existing lit tests. llvm-svn: 347094	2018-11-16 21:04:58 +00:00
Craig Topper	7fff9a9aef	[X86] In LowerLoad, fix assert messages and rename a variable that use Zize instead of Size. NFC llvm-svn: 347093	2018-11-16 21:04:56 +00:00
Vyacheslav Zakharin	dd0a1fdf56	Preprocessing support in tablegen. Differential Revision: https://reviews.llvm.org/D53840 llvm-svn: 347092	2018-11-16 20:57:29 +00:00
Nemanja Ivanovic	ed6159bb71	[PowerPC][NFC] Add tests for vector fp <-> int conversions This NFC patch just adds test cases for conversions that currently require scalarization of vectors. An updcoming patch will change the legalization for these and it is more suitable on the review to show the diferences in code gen rather than just the new code gen. llvm-svn: 347090	2018-11-16 20:24:10 +00:00
Peter Collingbourne	527024469a	AArch64: Emit a call frame instruction for the shadow call stack register. When unwinding past a function that uses shadow call stack, we must subtract 8 from the value of the x18 register. This patch causes us to emit a call frame instruction that causes that to happen. Differential Revision: https://reviews.llvm.org/D54609 llvm-svn: 347089	2018-11-16 20:08:54 +00:00
Cameron McInally	e4ee9849c0	[FNeg] Add FNeg Instruction to LangRef document The FNeg IR Instruction code was added with D53877. Differential Revision: https://reviews.llvm.org/D54549 llvm-svn: 347086	2018-11-16 19:52:59 +00:00
Anton Korobeynikov	e5cb1c35b4	[MSP430] Add RTLIB::[SRL/SRA/SHL]_I32 lowering to EABI lib calls Patch by Kristina Bessonova! Differential Revision: https://reviews.llvm.org/D54626 llvm-svn: 347080	2018-11-16 19:36:15 +00:00
Rong Xu	3a38175723	[X86] Disable Condbr_merge pass Disable Condbr_merge pass for now due to PR39658. Will reenable the pass once the bug is fixed. llvm-svn: 347079	2018-11-16 19:35:00 +00:00
Stefan Pintilie	9004444d81	Revert "[PowerPC] Make no-PIC default to match GCC - LLVM" This reverts commit r347069 llvm-svn: 347076	2018-11-16 19:24:23 +00:00
Anton Korobeynikov	883c70959d	[MSP430] Use R_MSP430_16_BYTE type for FK_Data_2 fixup Linker fails to link example like this (simplified case from newlib sources): $ cat test.c extern const char _ctype_b[]; struct _t { char ptr; }; struct _t T = { ((char ) _ctype_b + 3) }; $ cat ctype.c char _ctype_b[4] = { 0, 0, 0, 0 }; LD: test.o:(.data+0x0): warning: internal error: unsupported relocation error We also follow gnu toolchain here, where 2-byte relocation mapped to R_MSP430_16_BYTE, instead of R_MSP430_16. Patch by Kristina Bessonova! Differential Revision: https://reviews.llvm.org/D54620 llvm-svn: 347074	2018-11-16 19:20:51 +00:00
Sam Clegg	74f5fd4e32	[WebAssembly] Default to static reloc model Differential Revision: https://reviews.llvm.org/D54637 llvm-svn: 347073	2018-11-16 18:59:51 +00:00
Reid Kleckner	755577168a	[codeview] Expose -gcodeview-ghash for global type hashing Summary: Experience has shown that the functionality is useful. It makes linking optimized clang with debug info for me a lot faster, 20s to 13s. The type merging phase of PDB writing goes from 10s to 3s. This removes the LLVM cl::opt and replaces it with a metadata flag. After this change, users can do the following to use ghash: - add -gcodeview-ghash to compiler flags - replace /DEBUG with /DEBUG:GHASH in linker flags Reviewers: zturner, hans, thakis, takuto.ikuta Subscribers: aprantl, hiraditya, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D54370 llvm-svn: 347072	2018-11-16 18:47:41 +00:00
Stefan Pintilie	046eff502f	[PowerPC] Make no-PIC default to match GCC - LLVM Set -fno-PIC as the default option. Differential Revision: https://reviews.llvm.org/D53383 llvm-svn: 347069	2018-11-16 18:36:21 +00:00
Stefan Granitz	534618d78e	[CMake] Accept ENTITLEMENTS in add_llvm_executable and llvm_codesign Summary: Allow code-signing with entitlements. FORCE may be used to avoid an error when replacing existing signatures. Reviewers: beanz, bogner Reviewed By: beanz Subscribers: mgorny, llvm-commits, lldb-commits Differential Revision: https://reviews.llvm.org/D54443 llvm-svn: 347068	2018-11-16 18:10:36 +00:00
Simon Pilgrim	66f42ea6e1	[SelectionDAG] Move (repeated) SDTIntShiftDOp double shift node def to common code. NFCI. Prep work for PR39467. llvm-svn: 347067	2018-11-16 17:50:59 +00:00
Simon Pilgrim	96f7924fe2	[X86] Add codegen tests for scalar funnel shifts llvm-svn: 347066	2018-11-16 17:48:52 +00:00
Adrian Prantl	83d87520ed	GlobalDCE: Teach isEmptyFunction() to ignore debug intrinsics. This fixes PR39669. https://bugs.llvm.org/show_bug.cgi?id=39669 llvm-svn: 347065	2018-11-16 17:47:21 +00:00
Than McIntosh	fec48be329	[CodeGen] Expose some data types and accessors from StackMaps Summary: This is for supporting custom stack map formats, where the custom printer can access the stack map data. Patch by Cherry Zhang <cherryyz@google.com>. Related: https://reviews.llvm.org/D53892 Reviewers: thanm, apilipenko Reviewed By: apilipenko Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54224 llvm-svn: 347061	2018-11-16 16:48:49 +00:00
Sanjay Patel	f967328e24	[InstSimplify] add tests for saturating add/sub; NFC These are baseline tests for D54532. Patch based on the original tests by: @nikic (Nikita Popov) llvm-svn: 347060	2018-11-16 16:32:34 +00:00
Sanjay Patel	5ebd2a785e	[InstSimplify] add test to demonstrate undef matching differences; NFC This is a baseline test for D54631. Patch by: @nikic (Nikita Popov) llvm-svn: 347055	2018-11-16 15:35:58 +00:00
Simon Pilgrim	bcd6631a2a	[X86][SSE] Move number of input limit out of resolveTargetShuffleInputs. Only combineX86ShufflesRecursively needs this limit. llvm-svn: 347054	2018-11-16 15:01:05 +00:00
Sanjay Patel	8da76a6581	[x86] regenerate complete checks for test; NFC llvm-svn: 347051	2018-11-16 14:44:20 +00:00
Than McIntosh	4a1c5da7ac	[IRVerifier] Allow StructRet in statepoint Summary: StructRet attribute is not allowed in vararg calls. The statepoint intrinsic is vararg, but the wrapped function may be not. Allow calls of statepoint with StructRet arg, as long as the wrapped function is not vararg. Reviewers: thanm, anna Reviewed By: anna Subscribers: anna, llvm-commits Differential Revision: https://reviews.llvm.org/D53602 llvm-svn: 347050	2018-11-16 14:28:05 +00:00
Simon Atanasyan	705fbd5d4f	[DWARF] Use PRIx64 instead of 'x' to format 64-bit values This is a follow-up to r346715. Use PRIx64 to formatted print of 64-bit value in the `DWARFDebugLoclists::LocationList::dump` to escape problem on big-endian hosts. llvm-svn: 347049	2018-11-16 13:14:26 +00:00
Roman Lebedev	90c5b3f78e	[X86] X86DAGToDAGISel::matchBitExtract(): extract 'lshr' from `X` Summary: As discussed in previous review, and noted in the FIXME, if `X` is actually an `lshr Y, Z` (logical!), we can fold the `Z` into 'control`, and let the `BEXTR` do this too. We could just insert those 8 bits of shift amount into control, but it is better to instead zero-extend them, and 'or' them in place. We can only do this for `lshr`, not `ashr`, because we do not know that the mask cover only the bits of `Y`, and not any of the sign-extended bits. The obvious question is, is this actually legal to do? I believe it is. Relevant quotes, from `Intel® 64 and IA-32 Architectures Software Developer’s Manual`, `BEXTR — Bit Field Extract`: * `Bit 7:0 of the second source operand specifies the starting bit position of bit extraction.` * `A START value exceeding the operand size will not extract any bits from the second source operand.` * `Only bit positions up to (OperandSize -1) of the first source operand are extracted.` * `All higher order bits in the destination operand (starting at bit position LENGTH) are zeroed.` * `The destination register is cleared if no bits are extracted.` FIXME: if we can do this, i wonder if we should prefer `BEXTR` over `BZHI` in such cases. Reviewers: RKSimon, craig.topper, spatel, andreadb Reviewed By: RKSimon, craig.topper, andreadb Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54095 llvm-svn: 347048	2018-11-16 13:04:54 +00:00
Simon Pilgrim	3c8baf4f90	[TargetLowering] Cleanup more of the EXTEND demanded bits cases so that they match. NFCI. Use the same variable names etc. llvm-svn: 347045	2018-11-16 12:26:26 +00:00
Alex Bradbury	b4a64cede8	[RISCV][NFC] Define and use the new CA instruction format The RISC-V ISA manual was updated on 2018-11-07 (commit 00557c3) to define a new compressed instruction format, RVC format CA (no actual instruction encodings were changed). This patch updates the RISC-V backend to define the new format, and to use it in the relevant instructions. Differential Revision: https://reviews.llvm.org/D54302 Patch by Luís Marques. llvm-svn: 347043	2018-11-16 10:33:23 +00:00
Alex Bradbury	2146e8fb1e	[RISCV] Constant materialisation for RV64I This commit introduces support for materialising 64-bit constants for RV64I, making use of the RISCVMatInt::generateInstSeq helper in order to share logic for immediate materialisation with the MC layer (where it's used for the li pseudoinstruction). test/CodeGen/RISCV/imm.ll is updated to test RV64, and gains new 64-bit constant tests. It would be preferable if anyext constant returns were sign rather than zero extended (see PR39092). This patch simply adds an explicit signext to the returns in imm.ll. Further optimisations for constant materialisation are possible, most notably for mask-like values which can be generated my loading -1 and shifting right. A future patch will standardise on the C++ codepath for immediate selection on RV32 as well as RV64, and then add further such optimisations to RISCVMatInt::generateInstSeq in order to benefit both RV32 and RV64 for codegen and li expansion. Differential Revision: https://reviews.llvm.org/D52962 llvm-svn: 347042	2018-11-16 10:14:16 +00:00
Anton Korobeynikov	411773d227	[MSP430] Add support for .refsym directive Introduces support for '.refsym' assembler directive. From GCC docs (for MSP430): '.refsym' - This directive instructs assembler to add an undefined reference to the symbol following the directive. No relocation is created for this symbol; it will exist purely for pulling in object files from archives. Patch by Kristina Bessonova! Differential Revision: https://reviews.llvm.org/D54618 llvm-svn: 347041	2018-11-16 09:50:24 +00:00
Anton Korobeynikov	cad2b83182	[MSP430] Add more tests for ABI and calling convention Patch by Kristina Bessonova! Differential Revision: https://reviews.llvm.org/D54582 llvm-svn: 347040	2018-11-16 09:47:58 +00:00
Sam Parker	ab99cfab21	[DAGCombine] Fix non-deterministic debug output PR37970 reported non-deterministic debug output, this was caused by iterating through a set and not a a vector. bugzilla: https://bugs.llvm.org/show_bug.cgi?id=37970 Differential Revision: https://reviews.llvm.org/D54570 llvm-svn: 347037	2018-11-16 08:35:19 +00:00
Craig Topper	bac7d9735a	[LegalizeVectorTypes] Teach WidenVecRes_Convert to turn ANY_EXTEND into ANY_EXTEND_VECTOR_INREG when the input and output types need to be widened to the same width. If we don't do it here, DAGCombine will just end up creating it from the scalar any_extend+build_vector so might as well save a step. llvm-svn: 347034	2018-11-16 07:13:34 +00:00
Eugene Leviant	bf46e7410c	[ThinLTO] Internalize readonly globals An attempt to recommit r346584 after failure on OSX build bot. Fixed cache key computation in ThinLTOCodeGenerator and added test case llvm-svn: 347033	2018-11-16 07:08:00 +00:00
Craig Topper	079c37da58	[X86] Add custom type legalization for v2i8/v4i8/v8i8 mul under -x86-experimental-vector-widening. By early promoting the multiply to use an i16 element type we can avoid op legalization emit a second multiply for the 8 upper elements of the v16i8 type we would otherwise get. llvm-svn: 347032	2018-11-16 06:15:21 +00:00
Craig Topper	dc957d49f9	[X86] Add some test cases for vector multiplies on vectors shorter than 128 bits with -x86-experimental-vector-widening-legalization. llvm-svn: 347031	2018-11-16 06:15:20 +00:00
Matt Arsenault	eabb8dd015	AMDGPU: Fix analyzeBranch failing with pseudoterminators If a block had one of the _term instructions used for gluing exec modifying instructions to the end of the block, analyzeBranch would fail, preventing the verifier from catching a broken successor list. llvm-svn: 347027	2018-11-16 05:03:02 +00:00
Petr Hosek	f8e27b3878	[CMake] Support cross-compiling with multi-stage builds When using multi-stage builds, we would like support cross-compilation. Example is 2-stage build when the first stage is compiled for host while the second stage is compiled for the target. Normally, the second stage would be also used for compiling runtimes, but that's not possible when cross-compiling, so we use the first stage compiler instead. However, we still want to use the second stage paths. To do so, we set the -resource-dir of the first stage compiler to point to the resource directory of the second stage. We also need compiler tools that support the target architecture. These tools are not guaranteed to be present on the host, but in case of multi-stage build, we can build these tools in the first stage. Differential Revision: https://reviews.llvm.org/D54461 llvm-svn: 347025	2018-11-16 04:46:48 +00:00
Zachary Turner	6284aee9f8	[NativePDB] Rewrite the PdbSymUid to use our own custom namespacing scheme. Originally we created our 64-bit UID scheme by using the first byte as sort of a "tag" to represent what kind of symbol this was, and we re-used the PDB_SymType enumeration for this. For native pdb support, this is not really the right abstraction layer, because what we really want is something that tells us how to find the symbol. This means, specifically, is in the globals stream / public stream / module stream / TPI stream / etc, and for whichever one it is in, where is it within that stream? A good example of why the old namespacing scheme was insufficient is that it is more or less impossible to create a uid for a field list member of a class/struction/union/enum that tells you how to locate the original record. With this new scheme, the first byte is no longer a PDB_SymType enum but a new enum created specifically to identify where in the PDB this record lives. This gives us much better flexibility in what kinds of symbols the uids can identify. llvm-svn: 347018	2018-11-16 02:42:32 +00:00
Volodymyr Sapsai	ab73426c68	[VFS] Update unittest to fix Windows buildbot. Buildbot http://lab.llvm.org:8011/builders/clang-x64-windows-msvc is failing because it doesn't like paths in VFS, make them more Windows-friendly. Follow up to r347009. llvm-svn: 347016	2018-11-16 02:20:33 +00:00
Craig Topper	c93ae2b0a2	Revert r347014 "[X86] Add some test cases for vector multiplies on vectors shorter than 128 bits with -x86-experimental-vector-widening-legalization." Apparently I failed to update this after turnign sign extend to any extend. llvm-svn: 347015	2018-11-16 01:57:55 +00:00
Craig Topper	36920b44f7	[X86] Add some test cases for vector multiplies on vectors shorter than 128 bits with -x86-experimental-vector-widening-legalization. llvm-svn: 347014	2018-11-16 01:52:32 +00:00
Artem Belevich	d9e21fff62	Added missing whitespace in the link. llvm-svn: 347013	2018-11-16 01:23:12 +00:00
Craig Topper	5802b82b40	[X86] Use ANY_EXTEND instead of SIGN_EXTEND in the AVX2 and later path for legalizing vXi8 multiply. We aren't going to use the upper bits of the multiply result that the extend would effect. So we don't need a specific type of extend. This makes some reduction test cases shorter because we were previously trying to sign_extend a truncate which we can't eliminate. llvm-svn: 347011	2018-11-16 01:16:59 +00:00
Craig Topper	1acafd863f	[X86] Update a couple comments to remove a mention of a sign extending that no longer happens. NFC llvm-svn: 347010	2018-11-16 01:16:51 +00:00
Volodymyr Sapsai	7610033f56	[VFS] Implement `RedirectingFileSystem::getRealPath`. It fixes the case when Objective-C framework is added as a subframework through a symlink. When parent framework infers a module map and fails to detect a symlink, it would add a subframework as a submodule. And when we parse module map for the subframework, we would encounter an error like > error: umbrella for module 'WithSubframework.Foo' already covers this directory By implementing `getRealPath` "an egregious but useful hack" in `ModuleMap::inferFrameworkModule` works as expected. rdar://problem/45821279 Reviewers: bruno, benlangmuir, erik.pilkington Reviewed By: bruno Subscribers: hiraditya, dexonsmith, JDevlieghere, cfe-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D54245 llvm-svn: 347009	2018-11-16 01:15:54 +00:00
Ron Lieberman	cac749ac88	[AMDGPU] Add FixupVectorISel pass, currently Supports SREGs in GLOBAL LD/ST Add a pass to fixup various vector ISel issues. Currently we handle converting GLOBAL_{LOAD\|STORE}_* and GLOBAL_Atomic_* instructions into their _SADDR variants. This involves feeding the sreg into the saddr field of the new instruction. llvm-svn: 347008	2018-11-16 01:13:34 +00:00
Artem Belevich	5d14b72d5c	[CUDA] updated CompileCudaWithLLVM.rst Differential Revision: https://reviews.llvm.org/D54608 llvm-svn: 347007	2018-11-16 01:02:43 +00:00
Tom Stellard	fb7d1a92e6	Re-apply r346985: [ADT] Drop llvm::Optional clang-specific optimization for trivially copyable types Remove a test case that was added with the optimization we are now removing. llvm-svn: 347004	2018-11-16 00:47:24 +00:00
Heejin Ahn	095796a391	[WebAssembly] Split BBs after throw instructions Summary: `throw` instruction is a terminator in wasm, but BBs were not splitted after `throw` instructions, causing machine instruction verifier to fail. This patch - Splits BBs after `throw` instructions in WasmEHPrepare and adding an unreachable instruction after `throw`, which will be deleted in LateEHPrepare pass - Refactors WasmEHPrepare into two member functions - Changes the semantics of `eraseBBsAndChildren` in LateEHPrepare pass to match that of WasmEHPrepare pass, which is newly added. Now `eraseBBsAndChildren` does not delete BBs with remaining predecessors. - Fixes style nits, making static function names conform to clang-tidy - Re-enables the test temporarily disabled by rL346840 && rL346845 Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D54571 llvm-svn: 347003	2018-11-16 00:47:18 +00:00
Ron Lieberman	2f5683e6b0	[AMDGPU] NFC Test commit llvm-svn: 347002	2018-11-16 00:46:51 +00:00
Konstantin Zhuravlyov	af7b5d7092	AMDHSA: More code object v3 fixes: - Make sure IsaInfo::hasCodeObjectV3 returns true only for AMDHSA - Update assembler metadata tests to use v2 by default llvm-svn: 347001	2018-11-15 23:14:23 +00:00
Craig Topper	22bfa99448	[X86] Remove ANY_EXTEND special case from canReduceVMulWidth Removing this code doesn't affect any lit tests so it doesn't appear to be tested anymore. I assume it was when it was added, but I guess something else changed? Code coverage report also says its unused. I mostly didn't like that it seemed to count the sign bits as if it was a sign_extend, but then set isPositive as if it was a zero_extend. It feels like we should have picked one interpretation? Differential Revision: https://reviews.llvm.org/D54596 llvm-svn: 346995	2018-11-15 21:19:32 +00:00
Scott Linder	8d5a36a839	[AMDGPU] Update code object metadata format documentation * Add amdhsa prefix to names to allow other tools to use the metadata without collision. * Make names consistent. * Simplify structure. * Change note record ID. * Switch from YAML to MsgPack format. * Document metadata assembler directive. Patch By: t-tye (Tony Tye) Differential Revision: https://reviews.llvm.org/D53445 llvm-svn: 346992	2018-11-15 20:46:55 +00:00
Tom Stellard	67666d7629	Revert "[ADT] Drop llvm::Optional clang-specific optmization for trivially copyable types" This reverts commit r346985. It looks like one of the unittests also needs to be updated, reverting while I investigate. llvm-svn: 346990	2018-11-15 20:27:11 +00:00
Tom Stellard	cc14a32411	[ADT] Drop llvm::Optional clang-specific optmization for trivially copyable types Summary: This fixes libLLVM.so ABI mismatches between llvm compiled with clang and llvm compiled with gcc (PR39427). Reviewers: bkramer, sylvestre.ledru, mgorny, hans Reviewed By: bkramer, hans Subscribers: dexonsmith, kristina, llvm-commits Differential Revision: https://reviews.llvm.org/D54540 llvm-svn: 346985	2018-11-15 19:32:24 +00:00
Craig Topper	b144c7a6fb	[X86] Minor cleanup to getExtendInVec. NFCI Use unsigned to calculate the subvector index to avoid a cast. Remove an unnecessary condition and replace it with a stronger assert. Use the InVT variable we updated when we extracted instead of grabbing it from the In SDValue. llvm-svn: 346983	2018-11-15 19:20:22 +00:00
Sanjay Patel	c92aa7618f	[InstCombine] adjust rotate direction in tests; NFC Copy/paste errors - all of the changed tests rotated left before. llvm-svn: 346982	2018-11-15 19:15:41 +00:00
Craig Topper	73bb04ab6f	[X86] Add -x86-experimental-vector-widening support to reduceVMULWidth and combineMulToPMADDWD In reduceVMULWidth, we no longer need to worry about extending the vector to 128 bits first. Regular widening of extends, muls and shuffles will take care of that for us. In combineMulToPMADDWD, we can handle v2i32 multiplies and allow the VPMADDWD to be widened to v4i32 during type legalization by adding custom widening like we do have for AVG/ADDUS/SUBUS. I had to modify that code a little to allow different and output VTs. Differential Revision: https://reviews.llvm.org/D54512 llvm-svn: 346980	2018-11-15 18:59:31 +00:00
Thomas Lively	fc3163b67a	[WebAssembly] Fix return type of nextByte Summary: The old return type did not allow for correct error reporting and was causing a compiler warning. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D54586 llvm-svn: 346979	2018-11-15 18:56:49 +00:00
Scott Linder	919fbbbcca	[BinaryFormat] Add MsgPackTypes Add data structure to represent MessagePack "documents" and convert to/from both MessagePack and YAML encodings. Differential Revision: https://reviews.llvm.org/D48175 llvm-svn: 346978	2018-11-15 18:50:01 +00:00
Sanjay Patel	6cda87463f	[InstCombine] add tests for funnel shift (rotate) canonicalization; NFC llvm-svn: 346975	2018-11-15 18:19:56 +00:00
Craig Topper	aa3f2494b3	[X86] Guess that a CPU is Icelake it if reports support for AVX512VBMI2. llvm-svn: 346973	2018-11-15 18:11:52 +00:00
Xin Tong	642c8d3575	[LTO] Load sample profile in LTO link step. Summary: Load sample profile in LTO link step. ThinLTO calls populateModulePassManager to load the profile Reviewers: tejohnson, davidxl, danielcdh Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D54564 llvm-svn: 346971	2018-11-15 18:06:42 +00:00
Simon Pilgrim	924f193419	[TTI] Reduction costs only need to include a single extract element cost We were adding the entire scalarization extraction cost for reductions, which returns the total cost of extracting every element of a vector type. For reductions we don't need to do this - we just need to extract the 0'th element after the reduction pattern has completed. Fixes PR37731 Differential Revision: https://reviews.llvm.org/D54585 llvm-svn: 346970	2018-11-15 17:42:53 +00:00
Sanjay Patel	bc56b2432d	[InstCombine] fix rotate narrowing bug for non-pow-2 types llvm-svn: 346968	2018-11-15 17:19:14 +00:00
Sanjay Patel	712bdb275c	[InstCombine] add rotate narrowing tests with odd types; NFC There's a potential miscompile here. It's unlikely in the real world because this transform is guarded with shouldChangeType(), but this test file doesn't include a standard data-layout for some reason (despite including a custom 1), so we can see the bug. llvm-svn: 346966	2018-11-15 16:34:26 +00:00
Simon Pilgrim	5a1b7cea91	[SLPVectorizer][X86] Regenerate reduction minmax tests and cleanup check prefixes llvm-svn: 346965	2018-11-15 16:34:15 +00:00
Simon Pilgrim	4dd692ec2a	[SLPVectorizer][X86] Regenerate reduction tests and add PR37731 test Cleanup check prefixes llvm-svn: 346964	2018-11-15 16:08:25 +00:00
Simon Pilgrim	0db8cb0147	[X86] Fix MCNullStreamer support for modules with a CodeView flag This fixes -filetype=null support when compiling for a Win32 target and the module has a CodeView flag. The only places changed are the uses of getTargetStreamer function - this patch guards both of them with null checks. Committed on behalf of @eush (Eugene Sharygin) Differential Revision: https://reviews.llvm.org/D54008 llvm-svn: 346962	2018-11-15 15:17:15 +00:00
Sanjay Patel	e98ec77a95	[InstSimplify] delete shift-of-zero guard ops around funnel shifts This is a problem seen in common rotate idioms as noted in: https://bugs.llvm.org/show_bug.cgi?id=34924 Note that we are not canonicalizing standard IR (shifts and logic) to the intrinsics yet. (Although I've written this before...) I think this is the last step before we enable that transform. Ie, we could regress code by doing that transform without this simplification in place. In PR34924, I questioned whether this is a valid transform for target-independent IR, but I convinced myself this is ok. If we're speculating a funnel shift by turning cmp+br into select, then SimplifyCFG has already determined that the transform is justified. It's possible that SimplifyCFG is not taking into account profile or other metadata, but if that's true, then it's a bug independent of funnel shifts. Also, we do have CGP code to restore a guard like this around an intrinsic if it can't be lowered cheaply. But that isn't necessary for funnel shift because the default expansion in SelectionDAGBuilder includes this same cmp+select. Differential Revision: https://reviews.llvm.org/D54552 llvm-svn: 346960	2018-11-15 14:53:37 +00:00
Alex Bradbury	f809d89980	[RISCV] Mark C.EBREAK instruction as having side effects C.EBREAK was defined with hasSideEffects = 0, which is incorrect and inconsistent with the non-compressed instruction form. This patch corrects this oversight. This wouldn't cause codegen issues, as compressed instructions are only ever generated by converting the non-compressed form as an MCInst. But having correct flags is still worthwhile. Differential Revision: https://reviews.llvm.org/D54256 Patch by Luís Marques. llvm-svn: 346959	2018-11-15 14:52:24 +00:00
Alex Bradbury	7727240438	[RISCV] Mark FREM as Expand Mark the FREM SelectionDAG node as Expand, which is necessary in order to support the frem IR instruction on RISC-V. This is expanded into a library call. Adds the corresponding test. Previously, this would have triggered an assertion at instruction selection time. Differential Revision: https://reviews.llvm.org/D54159 Patch by Luís Marques. llvm-svn: 346958	2018-11-15 14:46:11 +00:00
Anton Korobeynikov	f0001f4186	Add missed files from prev. commit llvm-svn: 346949	2018-11-15 12:35:04 +00:00
Anton Korobeynikov	49045c6a0d	[MSP430] Add MC layer Reapply r346374 with the fixes for modules build. Original summary: This change implements assembler parser, code emitter, ELF object writer and disassembler for the MSP430 ISA. Also, more instruction forms are added to the target description. Patch by Michael Skvortsov! llvm-svn: 346948	2018-11-15 12:29:43 +00:00
Xing GUO	cc0829f3cb	[llvm-objdump] Use `auto` declaration in typecasting Summary: According to `MaskRay`, use `auto` for type inference, according to coding standards. Delete some comments, because these comments can be easily inferred from codes. Reviewers: jhenderson, MaskRay Reviewed By: jhenderson Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54573 llvm-svn: 346946	2018-11-15 11:51:13 +00:00
Alex Bradbury	22c091fc3c	[RISCV] Introduce the RISCVMatInt::generateInstSeq helper Logic to load 32-bit and 64-bit immediates is currently present in RISCVAsmParser::emitLoadImm in order to support the li pseudoinstruction. With the introduction of RV64 codegen, there is a greater benefit of sharing immediate materialisation logic between the MC layer and codegen. The generateInstSeq helper allows this by producing a vector of simple structs representing the chosen instructions. This can then be consumed in the MC layer to produce MCInsts or at instruction selection time to produce appropriate SelectionDAG node. Sharing this logic means that both the li pseudoinstruction and codegen can benefit from future optimisations, and that this logic can be used for materialising constants during RV64 codegen. This patch does contain a behaviour change: addi will now be produced on RV64 when no lui is necessary to materialise the constant. In that case addiw takes x0 as the source register, so is semantically identical to addi. Differential Revision: https://reviews.llvm.org/D52961 llvm-svn: 346937	2018-11-15 10:11:31 +00:00
Craig Topper	553ac560aa	[X86] Add some custom type legalization rules for truncate with -x86-experimental-vector-widening-legalization. This avoids some nasty shuffles when we have avx512. It will also prevent using zmm truncate instructions when a ymm instruction that zeroes part of an xmm register will do. Also avoid using avx512 truncate instructions when the input is 128 bits or less. These instructions are 2 uops on skx so we can probably find a better single uop shuffle like pshufb. llvm-svn: 346936	2018-11-15 08:23:40 +00:00
Craig Topper	926dbdd601	[X86] Add -x86-experimental-vector-widening-legalization versions of shuffle-vs-trunc tests. llvm-svn: 346935	2018-11-15 08:23:37 +00:00
Thomas Lively	77b33c86f5	[WebAssembly] Renumber SIMD bitwise instructions Summary: Changed to match https://github.com/WebAssembly/simd/pull/54. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D54561 llvm-svn: 346931	2018-11-15 03:38:59 +00:00
Konstantin Zhuravlyov	7d1532d333	AMDGPU: Fix check lines in fdot2 test: GCN900 -> GFX900 llvm-svn: 346925	2018-11-15 02:42:04 +00:00
Xing GUO	2e3364f9c2	[commit-test] Add blank line for test/tools/llvm-objdump/symbol-table-elf.test Summary: Test commit Reviewers: Higuoxing Reviewed By: Higuoxing Subscribers: llvm-commits, Higuoxing Differential Revision: https://reviews.llvm.org/D54562 llvm-svn: 346924	2018-11-15 02:36:20 +00:00
Konstantin Zhuravlyov	a25e0524c0	AMDGPU: Enable code object v3 for AMDHSA only Differential Revision: https://reviews.llvm.org/D54186 llvm-svn: 346923	2018-11-15 02:32:43 +00:00
Craig Topper	ea6ced9d1a	[X86] Don't mark SEXTLOADS with narrow types as Custom with -x86-experimental-vector-widening-legalization. The narrow types end up requesting widening, but generic legalization will end up scalaring and using a build_vector to do the widening. llvm-svn: 346916	2018-11-15 00:21:41 +00:00
Jessica Paquette	ddb039a199	[MachineOutliner][NFC] Check if CandidatesForRepeatedSeq < 2 There's no reason to call getOutliningCandidateInfo with a single candidate. llvm-svn: 346913	2018-11-15 00:02:24 +00:00
Benjamin Kramer	6b7d6fe079	[X86] Remove unused variable llvm-svn: 346909	2018-11-14 23:13:27 +00:00

... 2 3 4 5 6 ...

171928 Commits