llvm-project

Commit Graph

Author	SHA1	Message	Date
Justin Bogner	1f5c505437	GlobalISel: Fix text wrapping in a comment. NFC llvm-svn: 292460	2017-01-19 01:04:46 +00:00
Matthias Braun	58f99615d6	Use an actual valid register in test llvm-svn: 292459	2017-01-19 01:04:08 +00:00
Dehao Chen	b3a70de753	Add -fdebug-info-for-profiling to emit more debug info for sample pgo profile collection Summary: SamplePGO uses profile with debug info to collect profile. Unlike the traditional debugging purpose, sample pgo needs more accurate debug info to represent the profile. We add -femit-accurate-debug-info for this purpose. It can be combined with all debugging modes (-g, -gmlt, etc). It makes sure that the following pieces of info is always emitted: * start line of all subprograms * linkage name of all subprograms * standalone subprograms (functions that has neither inlined nor been inlined) The impact on speccpu2006 binary size (size increase comparing with -g0 binary, also includes data for -g binary, which does not change with this patch): -gmlt(orig) -gmlt(patched) -g 433.milc 4.68% 5.40% 19.73% 444.namd 8.45% 8.93% 45.99% 447.dealII 97.43% 115.21% 374.89% 450.soplex 27.75% 31.88% 126.04% 453.povray 21.81% 26.16% 92.03% 470.lbm 0.60% 0.67% 1.96% 482.sphinx3 5.77% 6.47% 26.17% 400.perlbench 17.81% 19.43% 73.08% 401.bzip2 3.73% 3.92% 12.18% 403.gcc 31.75% 34.48% 122.75% 429.mcf 0.78% 0.88% 3.89% 445.gobmk 6.08% 7.92% 42.27% 456.hmmer 10.36% 11.25% 35.23% 458.sjeng 5.08% 5.42% 14.36% 462.libquantum 1.71% 1.96% 6.36% 464.h264ref 15.61% 16.56% 43.92% 471.omnetpp 11.93% 15.84% 60.09% 473.astar 3.11% 3.69% 14.18% 483.xalancbmk 56.29% 81.63% 353.22% geomean 15.60% 18.30% 57.81% Debug info size change for -gmlt binary with this patch: 433.milc 13.46% 444.namd 5.35% 447.dealII 18.21% 450.soplex 14.68% 453.povray 19.65% 470.lbm 6.03% 482.sphinx3 11.21% 400.perlbench 8.91% 401.bzip2 4.41% 403.gcc 8.56% 429.mcf 8.24% 445.gobmk 29.47% 456.hmmer 8.19% 458.sjeng 6.05% 462.libquantum 11.23% 464.h264ref 5.93% 471.omnetpp 31.89% 473.astar 16.20% 483.xalancbmk 44.62% geomean 16.83% Reviewers: davidxl, andreadb, rob.lougher, dblaikie, echristo Reviewed By: dblaikie, echristo Subscribers: hfinkel, rob.lougher, andreadb, gbedwell, cfe-commits, probinson, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D25435 llvm-svn: 292458	2017-01-19 00:44:21 +00:00
Dehao Chen	1ce8d6ca59	Add -debug-info-for-profiling to emit more debug info for sample pgo profile collection Summary: SamplePGO binaries built with -gmlt to collect profile. The current -gmlt debug info is limited, and we need some additional info: * start line of all subprograms * linkage name of all subprograms * standalone subprograms (functions that has neither inlined nor been inlined) This patch adds these information to the -gmlt binary. The impact on speccpu2006 binary size (size increase comparing with -g0 binary, also includes data for -g binary, which does not change with this patch): -gmlt(orig) -gmlt(patched) -g 433.milc 4.68% 5.40% 19.73% 444.namd 8.45% 8.93% 45.99% 447.dealII 97.43% 115.21% 374.89% 450.soplex 27.75% 31.88% 126.04% 453.povray 21.81% 26.16% 92.03% 470.lbm 0.60% 0.67% 1.96% 482.sphinx3 5.77% 6.47% 26.17% 400.perlbench 17.81% 19.43% 73.08% 401.bzip2 3.73% 3.92% 12.18% 403.gcc 31.75% 34.48% 122.75% 429.mcf 0.78% 0.88% 3.89% 445.gobmk 6.08% 7.92% 42.27% 456.hmmer 10.36% 11.25% 35.23% 458.sjeng 5.08% 5.42% 14.36% 462.libquantum 1.71% 1.96% 6.36% 464.h264ref 15.61% 16.56% 43.92% 471.omnetpp 11.93% 15.84% 60.09% 473.astar 3.11% 3.69% 14.18% 483.xalancbmk 56.29% 81.63% 353.22% geomean 15.60% 18.30% 57.81% Debug info size change for -gmlt binary with this patch: 433.milc 13.46% 444.namd 5.35% 447.dealII 18.21% 450.soplex 14.68% 453.povray 19.65% 470.lbm 6.03% 482.sphinx3 11.21% 400.perlbench 8.91% 401.bzip2 4.41% 403.gcc 8.56% 429.mcf 8.24% 445.gobmk 29.47% 456.hmmer 8.19% 458.sjeng 6.05% 462.libquantum 11.23% 464.h264ref 5.93% 471.omnetpp 31.89% 473.astar 16.20% 483.xalancbmk 44.62% geomean 16.83% Reviewers: davidxl, echristo, dblaikie Reviewed By: echristo, dblaikie Subscribers: aprantl, probinson, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D25434 llvm-svn: 292457	2017-01-19 00:44:11 +00:00
Michael Kuperstein	230867e583	[LV] Run loop-simplify and LCSSA explicitly instead of "requiring" them This changes the vectorizer to explicitly use the loopsimplify and lcssa utils, instead of "requiring" the transformations as if they were analyses. This is not NFC, since it changes the LCSSA behavior - we no longer run LCSSA for all loops, but rather only for the loops we expect to modify. Differential Revision: https://reviews.llvm.org/D28868 llvm-svn: 292456	2017-01-19 00:42:28 +00:00
Matthias Braun	9f21a8d787	LiveIntervalAnalysis: Cleanup; NFC - Fix doxygen comments: Do not repeat name, remove duplicated doxygen comment (on declaration + implementation), etc. - Use more range based for llvm-svn: 292455	2017-01-19 00:32:13 +00:00
Jason Molenda	848c7be02a	Fix a problem with the new dyld interface code -- when a new process starts up, we need to clear the target's image list and only add the binaries into the target that are actually present in this process run. <rdar://problem/29857613> llvm-svn: 292454	2017-01-19 00:20:29 +00:00
Artem Belevich	3d3f6190ab	[NVPTX] Fix lowering of fp16 ISD::FNEG. There's no neg.f16 instruction, so negation has to be done via subtraction from zero. Differential Revision: https://reviews.llvm.org/D28876 llvm-svn: 292452	2017-01-19 00:14:45 +00:00
Peter Collingbourne	87cdfa7635	Add llvm-dis dependency to check-clang. llvm-svn: 292450	2017-01-19 00:04:44 +00:00
Eli Friedman	f1f49c8265	[SCEV] Make getUDivExactExpr handle non-nuw multiplies correctly. To avoid regressions, make ScalarEvolution::createSCEV a bit more clever. Also get rid of some useless code in ScalarEvolution::howFarToZero which was hiding this bug. No new testcase because it's impossible to actually expose this bug: we don't have any in-tree users of getUDivExactExpr besides the two functions I just mentioned, and they both dodged the problem. I'll try to add some interesting users in a followup. Differential Revision: https://reviews.llvm.org/D28587 llvm-svn: 292449	2017-01-18 23:56:42 +00:00
Peter Collingbourne	1e1475ace5	Move vtable type metadata emission behind a cc1-level flag. In ThinLTO mode, type metadata will require the module to be written as a multi-module bitcode file, which is currently incompatible with the Darwin linker. It is also useful to be able to enable or disable multi-module bitcode for testing purposes. This introduces a cc1-level flag, -f{,no-}lto-unit, which is used by the driver to enable multi-module bitcode on all but Darwin+ThinLTO, and can also be used to enable/disable the feature manually. Differential Revision: https://reviews.llvm.org/D28877 llvm-svn: 292448	2017-01-18 23:55:27 +00:00
Eli Friedman	0a2174533e	Preserve domtree and loop-simplify for runtime unrolling. Mostly straightforward changes; we just didn't do the computation before. One sort of interesting change in LoopUnroll.cpp: we weren't handling dominance for children of the loop latch correctly, but foldBlockIntoPredecessor hid the problem for complete unrolling. Currently punting on loop peeling; made some minor changes to isolate that problem to LoopUnrollPeel.cpp. Adds a flag -unroll-verify-domtree; it verifies the domtree immediately after we finish updating it. This is on by default for +Asserts builds. Differential Revision: https://reviews.llvm.org/D28073 llvm-svn: 292447	2017-01-18 23:26:37 +00:00
Krzysztof Parzyszek	de44c9d857	Treat segment [B, E) as not overlapping block with boundaries [A, B) llvm-svn: 292446	2017-01-18 23:12:19 +00:00
Krzysztof Parzyszek	954dd8d9ba	[Hexagon] Remove dead defs from the live set when expanding wstores llvm-svn: 292445	2017-01-18 23:11:40 +00:00
Michael Kuperstein	d3d2925933	Revert r291670 because it introduces a crash. r291670 doesn't crash on the original testcase from PR31589, but it crashes on a slightly more complex one. PR31589 has the new reproducer. llvm-svn: 292444	2017-01-18 23:05:58 +00:00
Stephan T. Lavavej	d6c0b35c11	[libcxx] [test] Add msvc_stdlib_force_include.hpp. No functional change; nothing includes this, instead our test harness injects it via the /FI compiler option. No code review; blessed in advance by EricWF. llvm-svn: 292443	2017-01-18 22:19:14 +00:00
Mehdi Amini	062b3fed4c	Improve the `-filter-print-funcs` option to skip the banner for CGSCC pass when nothing is to be printed Before, it would print a sequence of: * IR Dump After Function Integration/Inlining ** * IR Dump After Function Integration/Inlining **** * IR Dump After Function Integration/Inlining ****** ... for every single function in the module. llvm-svn: 292442	2017-01-18 21:37:11 +00:00
Sanjay Patel	cfb8a45942	[InstCombine] add tests for shl nsw with icmp eq/ne; NFCI These should be fixed with D28406. llvm-svn: 292441	2017-01-18 21:31:21 +00:00
Sanjay Patel	ae23d65a7d	[InstCombine] add an assert to make a shl+icmp transform assumption explicit; NFCI llvm-svn: 292440	2017-01-18 21:16:12 +00:00
David Blaikie	75ed8ad69e	Remove now redundant code that ensured debug info for class definitions was emitted under certain circumstances Introduced in r181561 - it may've been subsumed by work done to allow emission of declarations for vtable types while still emitting some of their member functions correctly for those declarations. Whatever the reason, the tests pass without this code now. llvm-svn: 292439	2017-01-18 21:15:18 +00:00
Haicheng Wu	8ce2d14356	[CodeGenPrepare] Fix a typo in the comment. NFC. encode => endcode. Differential Revision: https://reviews.llvm.org/D28866 llvm-svn: 292438	2017-01-18 21:12:10 +00:00
Arpith Chacko Jacob	fe4890a68b	[OpenMP] Support for the if-clause on the combined directive 'target parallel'. The if-clause on the combined directive potentially applies to both the 'target' and the 'parallel' regions. Codegen'ing the if-clause on the combined directive requires additional support because the expression in the clause must be captured by the 'target' capture statement but not the 'parallel' capture statement. Note that this situation arises for other clauses such as num_threads. The OMPIfClause class inherits OMPClauseWithPreInit to support capturing of expressions in the clause. A member CaptureRegion is added to OMPClauseWithPreInit to indicate which captured statement (in this case 'target' but not 'parallel') captures these expressions. To ensure correct codegen of captured expressions in the presence of combined 'target' directives, OMPParallelScope was added to 'parallel' codegen. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28781 llvm-svn: 292437	2017-01-18 20:40:48 +00:00
Graydon Hoare	9c982440a2	[ASTReader] Add a DeserializationListener callback for IMPORTED_MODULES Summary: Add a callback from ASTReader to DeserializationListener when the former reads an IMPORTED_MODULES block. This supports Swift in using PCH for bridging headers. Reviewers: doug.gregor, manmanren, bruno Reviewed By: manmanren Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D28779 llvm-svn: 292436	2017-01-18 20:36:59 +00:00
Graydon Hoare	dc0405f74c	[Modules] Correct test comment from obsolete earlier version of code. NFC Summary: Code committed in rL290219 went through a few iterations; test wound up with stale comment. Reviewers: doug.gregor, manmanren Reviewed By: manmanren Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D28790 llvm-svn: 292435	2017-01-18 20:34:44 +00:00
Stephan T. Lavavej	a730ed3149	[libcxx] [test] Fix comment typos, strip trailing whitespace. No functional change, no code review. llvm-svn: 292434	2017-01-18 20:10:25 +00:00
Sanjay Patel	589de5ea4e	[InstCombine] remove a redundant check; NFCI I missed deleting this check when I refactored this chunk in: https://reviews.llvm.org/rL292260 llvm-svn: 292433	2017-01-18 20:09:59 +00:00
Stephan T. Lavavej	3d26ee2921	[libcxx] [test] Fix MSVC warnings C4127 and C6326 about constants. MSVC has compiler warnings C4127 "conditional expression is constant" (enabled by /W4) and C6326 "Potential comparison of a constant with another constant" (enabled by /analyze). They're potentially useful, although they're slightly annoying to library devs who know what they're doing. In the latest version of the compiler, C4127 is suppressed when the compiler sees simple tests like "if (name_of_thing)", so extracting comparison expressions into named constants is a workaround. At the same time, using std::integral_constant avoids C6326, which doesn't look at template arguments. test/std/containers/sequences/vector.bool/emplace.pass.cpp Replace 1 == 1 with true, which is the same as far as the library is concerned. Fixes D28837. llvm-svn: 292432	2017-01-18 20:09:56 +00:00
Peter Collingbourne	20a00933fb	ThinLTOBitcodeWriter: Clear comdats on filtered globals. Differential Revision: https://reviews.llvm.org/D28839 llvm-svn: 292431	2017-01-18 20:03:02 +00:00
Peter Collingbourne	10e3b12c7a	Cloning: Copy comdats when cloning globals. Differential Revision: https://reviews.llvm.org/D28838 llvm-svn: 292430	2017-01-18 20:02:31 +00:00
Arpith Chacko Jacob	44a87c9f1b	[OpenMP] Codegen for the 'target parallel' directive on the NVPTX device. This patch adds codegen for the 'target parallel' directive on the NVPTX device. We term offload OpenMP directives such as 'target parallel' and 'target teams distribute parallel for' as SPMD constructs. SPMD constructs, in contrast to Generic ones like the plain 'target', can never contain a serial region. SPMD constructs can be handled more efficiently on the GPU and do not require the Warp Loop of the Generic codegen scheme. This patch adds SPMD codegen support for 'target parallel' on the NVPTX device and can be reused for other SPMD constructs. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28755 llvm-svn: 292428	2017-01-18 19:35:00 +00:00
Richard Smith	11255ec765	PR9551: Implement DR1004 (http://wg21.link/cwg1004 ). This rule permits the injected-class-name of a class template to be used as both a template type argument and a template template argument, with no extra syntax required to disambiguate. llvm-svn: 292426	2017-01-18 19:19:22 +00:00
Michael Kuperstein	0de990da16	Fix up a comment. NFC. llvm-svn: 292425	2017-01-18 19:05:48 +00:00
Michael Kuperstein	7cefb409b0	[LV] Allow reductions that have several uses outside the loop We currently check whether a reduction has a single outside user. We don't really need to require that - we just need to make sure a single value is used externally. The number of external users of that value shouldn't actually matter. Differential Revision: https://reviews.llvm.org/D28830 llvm-svn: 292424	2017-01-18 19:02:52 +00:00
Justin Bogner	2ceeb30eb6	cmake: Only sanitize use-after-scope if the host compiler supports it In r292256, we started adding -fsanitize-use-after-scope when using the address sanitizer, but that flag wasn't always available. This fixes the config to only add the flag if the host compiler supports it. llvm-svn: 292423	2017-01-18 19:01:58 +00:00
Evandro Menezes	7960b2e19a	[AArch64] Generate literals by the little end ARM seems to prefer that long literals be formed from their little end in order to promote the fusion of the instrs pairs MOV/MOVK and MOVK/MOVK on Cortex A57 and others (v. "Cortex A57 Software Optimisation Guide", section 4.14). Differential revision: https://reviews.llvm.org/D28697 llvm-svn: 292422	2017-01-18 18:57:08 +00:00
Davide Italiano	bca9d73309	[NewGVN] We don't use postdom info anymore. Update. Differential Revision: https://reviews.llvm.org/D28842 llvm-svn: 292421	2017-01-18 18:42:28 +00:00
Mehdi Amini	67d2cc1fad	[ThinLTO] Add a recursive step in Metadata lazy-loading Summary: Without this, we're stressing the RAUW of unique nodes, which is a costly operation. This is intended to limit the number of RAUW, and is very effective on the total link-time of opt with ThinLTO, before: real 4m4.587s user 15m3.401s sys 0m23.616s after: real 3m25.261s user 12m22.132s sys 0m24.152s Reviewers: tejohnson, pcc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28751 llvm-svn: 292420	2017-01-18 18:36:21 +00:00
Arpith Chacko Jacob	19b911cb75	[OpenMP] Codegen support for 'target parallel' on the host. This patch adds support for codegen of 'target parallel' on the host. It is also the first combined directive that requires two or more captured statements. Support for this functionality is included in the patch. A combined directive such as 'target parallel' has two captured statements, one for the 'target' and the other for the 'parallel' region. Two captured statements are required because each has different implicit parameters (see SemaOpenMP.cpp). For example, the 'parallel' has 'global_tid' and 'bound_tid' while the 'target' does not. The patch adds support for handling multiple captured statements based on the combined directive. When codegen'ing the 'target parallel' directive, the 'target' outlined function is created using the outer captured statement and the 'parallel' outlined function is created using the inner captured statement. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28753 llvm-svn: 292419	2017-01-18 18:18:53 +00:00
Jonathan Roelofs	8829e961e6	Revert r286788 The Itanium ABI [1] specifies that __cxa_demangle accept either: 1) symbol names, which start with "_Z" 2) type manglings, which do not start with "_Z" r286788 erroneously assumes that it should only handle symbols, so this patch reverts it and adds a counterexample to the testcase. 1: https://mentorembedded.github.io/cxx-abi/abi.html#demangler Reviewers: zygoloid, EricWF llvm-svn: 292418	2017-01-18 18:12:39 +00:00
Graydon Hoare	ae5d7bb4f5	[lit] Support sharding testsuites, for parallel execution. Summary: This change equips lit.py with two new options, --num-shards=M and --run-shard=N (set by default from env vars LIT_NUM_SHARDS and LIT_RUN_SHARD). The options must be used together, and N must be in 1..M. Together these options effect only test selection: they partition the testsuite into M equal-sized "shards", then select only the Nth shard. They can be used in a cluster of test machines to achieve a very crude (static) form of parallelism, with minimal configuration work. Reviewers: modocache, ddunbar Reviewed By: ddunbar Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28789 llvm-svn: 292417	2017-01-18 18:12:20 +00:00
Alexey Bataev	f86cca1a42	[SLP] Add a test for a fix for PR30787. Add a test for PR30787: Failure to beneficially vectorize 'copyable' elements in integer binary ops. llvm-svn: 292416	2017-01-18 18:07:46 +00:00
Ehsan Akhgari	221ab77ff7	[clang-tidy] Add -extra-arg and -extra-arg-before to run-clang-tidy.py Summary: These flags allow specifying extra arguments to the tool's command line which don't appear in the compilation database. Reviewers: alexfh Differential Revision: https://reviews.llvm.org/D28334 llvm-svn: 292415	2017-01-18 17:49:35 +00:00
Pavel Labath	c69d0a203b	Fix new Log unit test. The test was flaky because I specified the format string for the process id incorrectly. This should fix it. llvm-svn: 292414	2017-01-18 17:31:55 +00:00
Stanislav Mekhanoshin	a4e63ead4b	[AMDGPU] Do not allow register coalescer to create big superregs Limit register coalescer by not allowing it to artificially increase size of registers beyond dword. Such super-registers are in fact register sequences and not distinct HW registers. With more super-regs we would need to allocate adjacent registers and constraint regalloc more than needed. Moreover, our super registers are overlapping. For instance we have VGPR0_VGPR1_VGPR2, VGPR1_VGPR2_VGPR3, VGPR2_VGPR3_VGPR4 etc, which complicates registers allocation even more, resulting in excessive spilling. Differential Revision: https://reviews.llvm.org/D28782 llvm-svn: 292413	2017-01-18 17:30:05 +00:00
Justin Bogner	fde0104649	GlobalISel: Implement narrowing for G_STORE Legalize stores of types that are too wide by breaking them up into sequences of smaller stores. llvm-svn: 292412	2017-01-18 17:29:54 +00:00
Justin Bogner	cb60161a25	GlobalISel: Correct copy-pasted comment. NFC llvm-svn: 292411	2017-01-18 17:28:41 +00:00
Kostya Kortchinsky	b39dff4551	[scudo] Refactor of CRC32 and ARM runtime CRC32 detection Summary: ARM & AArch64 runtime detection for hardware support of CRC32 has been added via check of the AT_HWVAL auxiliary vector. Following Michal's suggestions in D28417, the CRC32 code has been further changed and looks better now. When compiled with full relro (which is strongly suggested to benefit from additional hardening), the weak symbol for computeHardwareCRC32 is read-only and the assembly generated is fairly clean and straight forward. As suggested, an additional optimization is to skip the runtime check if SSE 4.2 has been enabled globally, as opposed to only for scudo_crc32.cpp. scudo_crc32.h has no purpose anymore and was removed. Reviewers: alekseyshl, kcc, rengolin, mgorny, phosek Reviewed By: rengolin, mgorny Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D28574 llvm-svn: 292409	2017-01-18 17:11:17 +00:00
Teresa Johnson	2d384ac381	Don't create a comdat group for a dropped def with initializer Non-prevailing weak/linkonce odr symbols will be dropped by ThinLTO to available_externally when possible. If they had an initializer in the global_ctors list, a comdat group was being created. This code already had logic to skip available_externally defs, but now the EliminateAvailableExternally pass will drop these symbols to declarations earlier. Change the check to skip all declarations for linker (which includes available_externally along with declarations). Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28737 llvm-svn: 292408	2017-01-18 16:58:43 +00:00
Kirill Bobyrev	6afbaf0944	Revert 292404 due to buildbot failures. llvm-svn: 292407	2017-01-18 16:34:25 +00:00
Benjamin Kramer	8de9c9b01e	[ASTUnit] Reset diag state when creating the ASTUnit. A client could call this with a dirty diagnostic engine, don't crash. llvm-svn: 292406	2017-01-18 16:25:48 +00:00
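
The test selection described in commit ae5d7bb4f5 (lit sharding, llvm-svn: 292417) can be illustrated with a small sketch. The commit message says only that the testsuite is partitioned into M equal-sized shards and that shard N (with N in 1..M) is selected. The round-robin partitioning below is an assumption, and `select_shard` is a hypothetical helper, not a function taken from lit.

```python
def select_shard(tests, num_shards, run_shard):
    """Return the tests in shard `run_shard` (1-based) out of `num_shards`.

    Hypothetical helper: round-robin assignment is assumed here, so that
    shard sizes differ by at most one test.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    if not 1 <= run_shard <= num_shards:
        raise ValueError("run_shard must be in 1..num_shards")
    # Shard N takes the tests at indices N-1, N-1+M, N-1+2M, ...
    return tests[run_shard - 1::num_shards]


if __name__ == "__main__":
    tests = [f"test{i}" for i in range(10)]
    for n in range(1, 4):
        print(n, select_shard(tests, 3, n))
```

Under this assumption every test lands in exactly one shard, so M machines that each run a different N together cover the whole suite with no coordination beyond agreeing on M.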


252400 Commits
