llvm-project

Commit Graph

Author	SHA1	Message	Date
Richard Trieu	a9354b2f33	Fix narrowing issue from r353129 llvm-svn: 353134	2019-02-05 02:26:03 +00:00
Wei Mi	4901f371a2	[SamplePGO][NFC] Minor improvement to replace a temporary vector with a brace-enclosed init list. Differential Revision: https://reviews.llvm.org/D57726 llvm-svn: 353129	2019-02-05 00:57:50 +00:00
James Y Knight	7716075a17	[opaque pointer types] Pass value type to GetElementPtr creation. This cleans up all GetElementPtr creation in LLVM to explicitly pass a value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57173 llvm-svn: 352913	2019-02-01 20:44:47 +00:00
James Y Knight	14359ef1b6	[opaque pointer types] Pass value type to LoadInst creation. This cleans up all LoadInst creation in LLVM to explicitly pass the value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57172 llvm-svn: 352911	2019-02-01 20:44:24 +00:00
James Y Knight	d9e85a0861	[opaque pointer types] Pass function types to InvokeInst creation. This cleans up all InvokeInst creation in LLVM to explicitly pass a function type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57171 llvm-svn: 352910	2019-02-01 20:43:34 +00:00
James Y Knight	7976eb5838	[opaque pointer types] Pass function types to CallInst creation. This cleans up all CallInst creation in LLVM to explicitly pass a function type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57170 llvm-svn: 352909	2019-02-01 20:43:25 +00:00
Yevgeny Rouban	15b17d0a7c	Provide reason messages for unviable inlining InlineCost's isInlineViable() is changed to return InlineResult instead of bool. This provides messages for failure reasons and allows to get more specific messages for cases where callsites are not viable for inlining. Reviewed By: xbolva00, anemet Differential Revision: https://reviews.llvm.org/D57089 llvm-svn: 352849	2019-02-01 10:44:43 +00:00
James Y Knight	13680223b9	[opaque pointer types] Add a FunctionCallee wrapper type, and use it. Recommit r352791 after tweaking DerivedTypes.h slightly, so that gcc doesn't choke on it, hopefully. Original Message: The FunctionCallee type is effectively a {FunctionType,Value} pair, and is a useful convenience to enable code to continue passing the result of getOrInsertFunction() through to EmitCall, even once pointer types lose their pointee-type. Then: - update the CallInst/InvokeInst instruction creation functions to take a Callee, - modify getOrInsertFunction to return FunctionCallee, and - update all callers appropriately. One area of particular note is the change to the sanitizer code. Previously, they had been casting the result of `getOrInsertFunction` to a `Function*` via `checkSanitizerInterfaceFunction`, and storing that. That would report an error if someone had already inserted a function declaraction with a mismatching signature. However, in general, LLVM allows for such mismatches, as `getOrInsertFunction` will automatically insert a bitcast if needed. As part of this cleanup, cause the sanitizer code to do the same. (It will call its functions using the expected signature, however they may have been declared.) Finally, in a small number of locations, callers of `getOrInsertFunction` actually were expecting/requiring that a brand new function was being created. In such cases, I've switched them to Function::Create instead. Differential Revision: https://reviews.llvm.org/D57315 llvm-svn: 352827	2019-02-01 02:28:03 +00:00
James Y Knight	fadf25068e	Revert "[opaque pointer types] Add a FunctionCallee wrapper type, and use it." This reverts commit `f47d6b38c7` (r352791). Seems to run into compilation failures with GCC (but not clang, where I tested it). Reverting while I investigate. llvm-svn: 352800	2019-01-31 21:51:58 +00:00
James Y Knight	f47d6b38c7	[opaque pointer types] Add a FunctionCallee wrapper type, and use it. The FunctionCallee type is effectively a {FunctionType,Value} pair, and is a useful convenience to enable code to continue passing the result of getOrInsertFunction() through to EmitCall, even once pointer types lose their pointee-type. Then: - update the CallInst/InvokeInst instruction creation functions to take a Callee, - modify getOrInsertFunction to return FunctionCallee, and - update all callers appropriately. One area of particular note is the change to the sanitizer code. Previously, they had been casting the result of `getOrInsertFunction` to a `Function*` via `checkSanitizerInterfaceFunction`, and storing that. That would report an error if someone had already inserted a function declaraction with a mismatching signature. However, in general, LLVM allows for such mismatches, as `getOrInsertFunction` will automatically insert a bitcast if needed. As part of this cleanup, cause the sanitizer code to do the same. (It will call its functions using the expected signature, however they may have been declared.) Finally, in a small number of locations, callers of `getOrInsertFunction` actually were expecting/requiring that a brand new function was being created. In such cases, I've switched them to Function::Create instead. Differential Revision: https://reviews.llvm.org/D57315 llvm-svn: 352791	2019-01-31 20:35:56 +00:00
Bjorn Pettersson	d014d576a9	[IPCP] Don't crash due to arg count/type mismatch between caller/callee Summary: This patch avoids an assert in IPConstantPropagation when there is a argument count/type mismatch between the caller and the callee. While this is actually UB on C-level (clang emits a warning), the IR verifier seems to accept it. I'm not sure what other frontends/languages might think about this, so simply bailing out to avoid hitting an assert (in CallSiteBase<>::getArgOperand or Value::doRAUW) seems like a simple solution. The problem is exposed by the fact that AbstractCallSites will look through a bitcast at the callee position of a call/invoke. Reviewers: jdoerfert, reames, efriedma Reviewed By: jdoerfert, efriedma Subscribers: eli.friedman, efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D57052 llvm-svn: 352469	2019-01-29 10:19:44 +00:00
Teresa Johnson	5b2f6a1bc2	[ThinLTO] Refine reachability check to fix compile time increase Summary: A recent fix to the ThinLTO whole program dead code elimination (D56117) increased the thin link time on a large MSAN'ed binary by 2x. It's likely that the time increased elsewhere, but was more noticeable here since it was already large and ended up timing out. That change made it so we would repeatedly scan all copies of linkonce symbols for liveness every time they were encountered during the graph traversal. This was needed since we only mark one copy of an aliasee as live when we encounter a live alias. This patch fixes the issue in a more efficient manner by simply proactively visiting the aliasee (thus marking all copies live) when we encounter a live alias. Two notes: One, this requires a hash table lookup (finding the aliasee summary in the index based on aliasee GUID). However, the impact of this seems to be small compared to the original pre-D56117 thin link time. It could be addressed if we keep the aliasee ValueInfo in the alias summary instead of the aliasee GUID, which I am exploring in a separate patch. Second, we only populate the aliasee GUID field when reading summaries from bitcode (whether we are reading individual summaries and merging on the fly to form the compiled index, or reading in a serialized combined index). Thankfully, that's currently the only way we can get to this code as we don't yet support reading summaries from LLVM assembly directly into a tool that performs the thin link (they must be converted to bitcode first). I added a FIXME, however I have the fix under test already. The easiest fix is to simply populate this field always, which isn't hard, but more likely the change I am exploring to store the ValueInfo instead as described above will subsume this. I don't want to hold up the regression fix for this though. Reviewers: trentxintong Subscribers: mehdi_amini, inglorion, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D57203 llvm-svn: 352438	2019-01-28 22:27:05 +00:00
Vedant Kumar	db3f9774ee	[HotColdSplit] Introduce a cost model to control splitting behavior The main goal of the model is to avoid increasing function size, as that would eradicate any memory locality benefits from splitting. This happens when: - There are too many inputs or outputs to the cold region. Argument materialization and reloads of outputs have a cost. - The cold region has too many distinct exit blocks, causing a large switch to be formed in the caller. - The code size cost of the split code is less than the cost of a set-up call. A secondary goal is to prevent excessive overall binary size growth. With the cost model in place, I experimented to find a splitting threshold that works well in practice. To make warm & cold code easily separable for analysis purposes, I moved split functions to a "cold" section. I experimented with thresholds between [0, 4] and set the default to the threshold which minimized geomean __text size. Experiment data from building LNT+externals for X86 (N = 639 programs, all sizes in bytes): \| Configuration \| __text geom size \| __cold geom size \| TEXT geom size \| \| -Os \| 1736.3 \| 0, n=0 \| 10961.6 \| \| -Os, thresh=0 \| 1740.53 \| 124.482, n=134 \| 11014 \| \| -Os, thresh=1 \| 1734.79 \| 57.8781, n=90 \| 10978.6 \| \| -Os, thresh=2 \| 1733.85 \| 65.6604, n=61 \| 10977.6 \| \| -Os, thresh=3 \| 1733.85 \| 65.3071, n=61 \| 10977.6 \| \| -Os, thresh=4 \| 1735.08 \| 67.5156, n=54 \| 10965.7 \| \| -Oz \| 1554.4 \| 0, n=0 \| 10153 \| \| -Oz, thresh=2 \| 1552.2 \| 65.633, n=61 \| 10176 \| \| -O3 \| 2563.37 \| 0, n=0 \| 13105.4 \| \| -O3, thresh=2 \| 2559.49 \| 71.1072, n=61 \| 13162.4 \| Picking thresh=2 reduces the geomean __text section size by 0.14% at -Os, -Oz, and -O3 and causes ~0.2% growth in the TEXT segment. Note that TEXT size is page-aligned, whereas section sizes are byte-aligned. Experiment data from building LNT+externals for ARM64 (N = 558 programs, all sizes in bytes): \| Configuration \| __text geom size \| __cold geom size \| TEXT geom size \| \| -Os \| 1763.96 \| 0, n=0 \| 42934.9 \| \| -Os, thresh=2 \| 1760.9 \| 76.6755, n=61 \| 42934.9 \| Picking thresh=2 reduces the geomean __text section size by 0.17% at -Os and causes no growth in the TEXT segment. Measurements were done with D57082 (r352080) applied. Differential Revision: https://reviews.llvm.org/D57125 llvm-svn: 352228	2019-01-25 18:30:37 +00:00
Vedant Kumar	9d70f2b939	[HotColdSplit] Describe the pass in more detail, NFC llvm-svn: 352161	2019-01-25 03:22:38 +00:00
Vedant Kumar	65de025d64	[HotColdSplit] Split more aggressively before/after cold invokes While a cold invoke itself and its unwind destination can't be extracted, code which unconditionally executes before/after the invoke may still be profitable to extract. With cost model changes from D57125 applied, this gives a 3.5% increase in split text across LNT+externals on arm64 at -Os. llvm-svn: 352160	2019-01-25 03:22:23 +00:00
Vedant Kumar	ef1ebed1c6	[HotColdSplit] Move splitting earlier in the pipeline Performing splitting early has several advantages: - Inhibiting inlining of cold code early improves code size. Compared to scheduling splitting at the end of the pipeline, this cuts code size growth in half within the iOS shared cache (0.69% to 0.34%). - Inhibiting inlining of cold code improves compile time. There's no need to inline split cold functions, or to inline as much within those split functions as they are marked `minsize`. - During LTO, extra work is only done in the pre-link step. Less code must be inlined during cross-module inlining. An additional motivation here is that the most common cold regions identified by the static/conservative splitting heuristic can (a) be found before inlining and (b) do not grow after inlining. E.g. __assert_fail, os_log_error. The disadvantages are: - Some opportunities for splitting out cold code may be missed. This gap can potentially be narrowed by adding a worklist algorithm to the splitting pass. - Some opportunities to reduce code size may be lost (e.g. store sinking, when one side of the CFG diamond is split). This does not outweigh the code size benefits of splitting earlier. On net, splitting early in the pipeline has substantial code size benefits, and no major effects on memory locality or performance. We measured memory locality using ktrace data, and consistently found that 10% fewer pages were needed to capture 95% of text page faults in key iOS benchmarks. We measured performance on frequency-stabilized iOS devices using LNT+externals. This reverses course on the decision made to schedule splitting late in r344869 (D53437). Differential Revision: https://reviews.llvm.org/D57082 llvm-svn: 352080	2019-01-24 18:55:49 +00:00
Julian Lettner	b62e9dc46b	Revert "[Sanitizers] UBSan unreachable incompatible with ASan in the presence of `noreturn` calls" This reverts commit `cea84ab93a`. llvm-svn: 352069	2019-01-24 18:04:21 +00:00
Florian Hahn	bed7f9eab2	Revert "[HotColdSplitting] Get DT and PDT from the pass manager." This reverts commit `a6982414ed` (llvm-svn: 352036), because it causes a memory leak in the pass manager. Failing bot http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/10351/steps/check-llvm%20asan/logs/stdio llvm-svn: 352041	2019-01-24 11:22:08 +00:00
Florian Hahn	a6982414ed	[HotColdSplitting] Get DT and PDT from the pass manager. Instead of manually computing DT and PDT, we can get the from the pass manager, which ideally has them already cached. With the new pass manager, we could even preserve DT/PDT on a per function basis in a module pass. I think this also addresses the TODO about re-using the computed DTs for BFI. IIUC, GetBFI will fetch the DT from the pass manager and when we will fetch the cached version later. Reviewers: vsk, hiraditya, tejohnson, thegameg, sebpop Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D57092 llvm-svn: 352036	2019-01-24 09:44:52 +00:00
Julian Lettner	cea84ab93a	[Sanitizers] UBSan unreachable incompatible with ASan in the presence of `noreturn` calls Summary: UBSan wants to detect when unreachable code is actually reached, so it adds instrumentation before every `unreachable` instruction. However, the optimizer will remove code after calls to functions marked with `noreturn`. To avoid this UBSan removes `noreturn` from both the call instruction as well as from the function itself. Unfortunately, ASan relies on this annotation to unpoison the stack by inserting calls to `_asan_handle_no_return` before `noreturn` functions. This is important for functions that do not return but access the the stack memory, e.g., unwinder functions like `longjmp` (`longjmp` itself is actually "double-proofed" via its interceptor). The result is that when ASan and UBSan are combined, the `noreturn` attributes are missing and ASan cannot unpoison the stack, so it has false positives when stack unwinding is used. Changes: # UBSan now adds the `expect_noreturn` attribute whenever it removes the `noreturn` attribute from a function # ASan additionally checks for the presence of this attribute Generated code: ``` call void @__asan_handle_no_return // Additionally inserted to avoid false positives call void @longjmp call void @__asan_handle_no_return call void @__ubsan_handle_builtin_unreachable unreachable ``` The second call to `__asan_handle_no_return` is redundant. This will be cleaned up in a follow-up patch. rdar://problem/40723397 Reviewers: delcypher, eugenis Tags: #sanitizers Differential Revision: https://reviews.llvm.org/D56624 llvm-svn: 352003	2019-01-24 01:06:19 +00:00
David Callahan	d2eeb2516d	Update entry count for cold calls Summary: Profile sample files include the number of times each entry or inlined call site is sampled. This is translated into the entry count metadta on functions. When sample data is being read, if a call site that was inlined in the sample program is considered cold and not inlined, then the entry count of the out-of-line functions does not reflect the current compilation. In this patch, we note call sites where the function was not inlined and as a last action of the sample profile loading, we update the called function's entry count to reflect the calls from these call sites which are not included in the profile file. Reviewers: danielcdh, wmi, Kader, modocache Reviewed By: wmi Subscribers: davidxl, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D52845 llvm-svn: 352001	2019-01-24 00:55:23 +00:00
Florian Hahn	68cea130df	[HotColdSplitting] Remove unused SSAUpdater.h include (NFC). llvm-svn: 351945	2019-01-23 11:51:38 +00:00
Vedant Kumar	cde65c0fac	[HotColdSplit] Calculate BFI lazily to reduce compile-time, NFC The splitting pass does not need BFI unless the Module actually has a profile summary. Do not calcualte BFI unless the summary is present. For the sqlite3 amalgamation, this reduces time spent in the splitting pass from 0.4% of the total to under 0.1%. llvm-svn: 351894	2019-01-22 22:49:22 +00:00
Vedant Kumar	38874c8f7b	[HotColdSplit] Calculate domtrees lazily to reduce compile-time, NFC The splitting pass does not need (post)domtrees until after it's found a cold block. Defer domtree calculation until a cold block is found. For the sqlite3 amalgamation, this reduces time spent in the splitting pass from 0.8% of the total to 0.4%. llvm-svn: 351892	2019-01-22 22:49:08 +00:00
Vedant Kumar	857cacd9de	[ConstantMerge] Factor out check for un-mergeable globals, NFC llvm-svn: 351671	2019-01-20 02:44:43 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Johannes Doerfert	36872b5db9	Enable IPConstantPropagation to work with abstract call sites This modification of the currently unused inter-procedural constant propagation pass (IPConstantPropagation) shows how abstract call sites enable optimization of callback calls alongside direct and indirect calls. Through minimal changes, mostly dealing with the partial mapping of callbacks, inter-procedural constant propagation was enabled for callbacks, e.g., OpenMP runtime calls or pthreads_create. Differential Revision: https://reviews.llvm.org/D56447 llvm-svn: 351628	2019-01-19 05:19:12 +00:00
Vedant Kumar	b537b946b8	[MergeFunc] Allow merging identical vararg functions using aliases Thanks to Nikita Popov for pointing out this missed case. This is a follow-up to r351411, which disabled function merging for vararg functions outright due to a miscompile (see llvm.org/PR40345). Differential Revision: https://reviews.llvm.org/D56865 llvm-svn: 351624	2019-01-19 02:46:22 +00:00
Vedant Kumar	b755a2df51	[HotColdSplit] Mark inherently cold functions as such If an inherently cold function is found, mark it as cold. For now this means applying the `cold` and `minsize` attributes. As a drive-by, revisit and clean up the criteria for considering a function for splitting. Add tests. llvm-svn: 351623	2019-01-19 02:38:47 +00:00
Vedant Kumar	4de1962bb9	[HotColdSplit] Remove a set which tracked split functions (NFC) Use the begin/end iterator idiom to avoid visiting split functions, instead of doing a set lookup. llvm-svn: 351622	2019-01-19 02:38:17 +00:00
Vedant Kumar	f529b50702	[HotColdSplit] Allow outlining with live outputs Prior to r348205, extracting code regions with live output values was disabled because of a miscompilation (PR39433). Lift the restriction as PR39433 has been addressed. Tested on LNT+externals, on a run of check-llvm in a stage2 build, and with a full build of iOS (with hot/cold splitting enabled). As a drive-by, remove an errant TODO. llvm-svn: 351492	2019-01-17 22:36:05 +00:00
Vedant Kumar	b70e20db62	[HotColdSplit] Consider resume instructions to be cold Resuming exception unwinding is roughly as unlikely as throwing an exception. Tested on LNT+externals (in particular, the C++ EH regression tests provide end-to-end test coverage), as well as with a full build of iOS. llvm-svn: 351491	2019-01-17 22:35:47 +00:00
Vedant Kumar	4541be0686	[HotColdSplit] Relax requirement that the cold sink block be extractable Relaxing this requirement creates opportunities to split code dominated by an EH pad. Tested on LNT+externals. llvm-svn: 351483	2019-01-17 21:42:36 +00:00
Vedant Kumar	32a014d048	[HotColdSplit] Simplify tests by lowering their splitting thresholds This gets rid of the brittle/mysterious calls to @sink()/@sideeffect() peppered throughout the test cases. They are no longer needed to force splitting to occur. llvm-svn: 351480	2019-01-17 21:29:34 +00:00
Wei Mi	3bcccdfe38	[SampleFDO] Skip profile reading when flattened profile used in ThinLTO postlink If the sample profile has no inlining hierachy information included, we call the sample profile is flattened. For flattened profile, in ThinLTO postlink phase, SampleProfileLoader's hot function inlining and profile annotation will do nothing, so it is better to save the effort to read in the profile and run the sample profile loader pass. It is helpful for reducing compile time when the flattened profile is huge. Differential Revision: https://reviews.llvm.org/D54819 llvm-svn: 351476	2019-01-17 20:48:34 +00:00
Teresa Johnson	8d86f1ba47	Revert "[ThinLTO] Add summary entries for index-based WPD" Mistaken commit of something still under review! This reverts commit r351453. llvm-svn: 351455	2019-01-17 16:05:04 +00:00
Teresa Johnson	4fcf3b1621	[ThinLTO] Add summary entries for index-based WPD Summary: If LTOUnit splitting is disabled, the module summary analysis computes the summary information necessary to perform single implementation devirtualization during the thin link with the index and no IR. The information collected from the regular LTO IR in the current hybrid WPD algorithm is summarized, including: 1) For vtable definitions, record the function pointers and their offset within the vtable initializer (subsumes the information collected from IR by tryFindVirtualCallTargets). 2) A record for each type metadata summarizing the vtable definitions decorated with that metadata (subsumes the TypeIdentiferMap collected from IR). Also added are the necessary bitcode records, and the corresponding assembly support. The index-based WPD will be sent as a follow-on. Depends on D53890. Reviewers: pcc Subscribers: mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D54815 llvm-svn: 351453	2019-01-17 15:49:03 +00:00
Vedant Kumar	a9906c1e5e	[MergeFunc] Prevent silent miscompile of vararg functions The function merging pass miscompiles identical vararg functions. The forwarding thunk it emits doesn't forward the full variable-length list of arguments. Disable merging for vararg functions for now. I've filed llvm.org/PR40345 to track the issue. rdar://47326238 llvm-svn: 351411	2019-01-17 02:15:05 +00:00
Wei Mi	79c4408aa2	Fix a mistake in rL351392. PGOInstrGen should be initialized to "" instead of false. llvm-svn: 351397	2019-01-16 23:31:40 +00:00
Wei Mi	c876e3d42b	[PGO] Make pgo related options in opt more consistent. Currently we have pgo options defined in PassManagerBuilder.cpp only for instrument pgo, but not for sample pgo. We also have pgo options defined in NewPMDriver.cpp in opt only for new pass manager and for all kinds of pgo. They have some inconsistency. To make the options more consistent and make tests writing easier, the patch let old pass manager to share the same pgo options with new pass manager in opt, and removes the options in PassManagerBuilder.cpp. Differential Revision: https://reviews.llvm.org/D56749 llvm-svn: 351392	2019-01-16 23:19:02 +00:00
Tom Stellard	3d36e5c3e6	Only promote args when function attributes are compatible Summary: Check to make sure that the caller and the callee have compatible function arguments before promoting arguments. This uses the same TargetTransformInfo queries that are used to determine if attributes are compatible for inlining. The goal here is to avoid breaking ABI when a called function's ABI depends on a target feature that is not enabled in the caller. This is a very conservative fix for PR37358. Ideally we would have a more sophisticated check for ABI compatiblity rather than checking if the attributes are compatible for inlining. Reviewers: echristo, chandlerc, eli.friedman, craig.topper Reviewed By: echristo, chandlerc Subscribers: nikic, xbolva00, rkruppe, alexcrichton, llvm-commits Differential Revision: https://reviews.llvm.org/D53554 llvm-svn: 351296	2019-01-16 05:15:31 +00:00
David Callahan	dee001207e	We can improve the performance (generally) by memo-izing the action to map a debug location to its function summary. Summary: Here are timings (as reported by "opt -time-passes") for sample-profile pass for some files holding hot functions from a major service©r. Average 17% reduction. Delta column is 100*(old-new)/old. ``` Old New Delta 0.0537 0.0538 -0.2% 0.8155 0.6522 20.0% 0.0779 0.0751 3.6% 0.0727 0.0913 -25.6% 0.1622 0.1302 19.7% 0.0627 0.0594 5.3% 0.0766 0.0744 2.9% 0.6426 0.4387 31.7% 0.3521 0.2776 21.2% 0.3549 0.2721 23.3% 0.0912 0.0904 0.9% 0.1236 0.1059 14.3% 0.0854 0.0866 -1.4% 0.0757 0.0722 4.6% 0.1293 0.1147 11.3% 0.1354 0.1122 17.1% 0.0767 0.0770 -0.4% 0.1135 0.0968 14.7% 0.0524 0.0608 -16.0% 0.1279 0.1106 13.5% ========== 3.6820 3.0520 17.1% Total ``` Reviewers: twoh, Kader, danielcdh, wmi Reviewed By: wmi Subscribers: dblaikie, llvm-commits Differential Revision: https://reviews.llvm.org/D56435 llvm-svn: 351211	2019-01-15 17:45:54 +00:00
David Callahan	957795973b	Ignore PhiNodes when mapping sample profile data Summary: Like branch instructions, phi nodes frequently do not have debug information related to the block they are in and so they should be ignored. Reviewers: danielcdh, twoh, Kader, wmi Reviewed By: wmi Subscribers: aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D55094 llvm-svn: 351102	2019-01-14 19:05:59 +00:00
David Callahan	b1853a6a94	Revert "Merge branch 'arcpatch-D55094'" This reverts commit a9788dd6587d67c856df74eedff5a6ad34ce8320, reversing changes made to f1309ffebf718d16aec4fab83380556c660e2825. unintended merge pushed llvm-svn: 351095	2019-01-14 18:49:27 +00:00
David Callahan	1b46231764	Merge branch 'arcpatch-D55094' llvm-svn: 351092	2019-01-14 18:35:43 +00:00
Benjamin Kramer	b17d2136ea	Give helper classes/functions local linkage. NFC. llvm-svn: 351016	2019-01-12 18:36:22 +00:00
Teresa Johnson	290a839891	[LTO] Record whether LTOUnit splitting is enabled in index Summary: Records in the module summary index whether the bitcode was compiled with the option necessary to enable splitting the LTO unit (e.g. -fsanitize=cfi, -fwhole-program-vtables, or -fsplit-lto-unit). The information is passed down to the ModuleSummaryIndex builder via a new module flag "EnableSplitLTOUnit", which is propagated onto a flag on the summary index. This is then used during the LTO link to check whether all linked summaries were built with the same value of this flag. If not, an error is issued when we detect a situation requiring whole program visibility of the class hierarchy. This is the case when both of the following conditions are met: 1) We are performing LowerTypeTests or Whole Program Devirtualization. 2) There are type tests or type checked loads in the code. Note I have also changed the ThinLTOBitcodeWriter to also gate the module splitting on the value of this flag. Reviewers: pcc Subscribers: ormris, mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, dang, llvm-commits Differential Revision: https://reviews.llvm.org/D53890 llvm-svn: 350948	2019-01-11 18:31:57 +00:00
Vedant Kumar	ee10ef737e	[MergeFunc] Erase unused duplicate functions if they are discardable MergeFunc only deletes unused duplicate functions if they have local linkage, but it should be safe to relax this to any "discardable if unused" linkage type. Differential Revision: https://reviews.llvm.org/D56574 llvm-svn: 350939	2019-01-11 17:56:35 +00:00
Vedant Kumar	08fe7e02fb	[MergeFunc] Use Instruction::getFunction as a cleanup, NFC llvm-svn: 350938	2019-01-11 17:56:21 +00:00
Easwaran Raman	b45994b843	Refactor synthetic profile count computation. NFC. Summary: Instead of using two separate callbacks to return the entry count and the relative block frequency, use a single callback to return callsite count. This would allow better supporting hybrid mode in the future as the count of callsite need not always be derived from entry count (as in sample PGO). Reviewers: davidxl Subscribers: mehdi_amini, steven_wu, dexonsmith, dang, llvm-commits Differential Revision: https://reviews.llvm.org/D56464 llvm-svn: 350755	2019-01-09 20:10:27 +00:00

1 2 3 4 5 ...

3407 Commits