llvm-project

Commit Graph

Author	SHA1	Message	Date
Tom Stellard	3d36e5c3e6	Only promote args when function attributes are compatible Summary: Check to make sure that the caller and the callee have compatible function arguments before promoting arguments. This uses the same TargetTransformInfo queries that are used to determine if attributes are compatible for inlining. The goal here is to avoid breaking ABI when a called function's ABI depends on a target feature that is not enabled in the caller. This is a very conservative fix for PR37358. Ideally we would have a more sophisticated check for ABI compatiblity rather than checking if the attributes are compatible for inlining. Reviewers: echristo, chandlerc, eli.friedman, craig.topper Reviewed By: echristo, chandlerc Subscribers: nikic, xbolva00, rkruppe, alexcrichton, llvm-commits Differential Revision: https://reviews.llvm.org/D53554 llvm-svn: 351296	2019-01-16 05:15:31 +00:00
Serguei Katkov	a5b0e5585b	[InstCombine]Avoid introduction of unaligned mem access InstCombine is able to transform mem transfer instrinsic to alone store or store/load pair. It might result in generation of unaligned atomic load/store which later in backend will be transformed to libcall. It is not an evident gain and it is better to keep intrinsic as is and handle it at backend. Reviewers: reames, anna, apilipenko, mkazantsev Reviewed By: reames Subscribers: t.p.northover, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D56582 llvm-svn: 351295	2019-01-16 04:36:26 +00:00
David Callahan	d129d3e93f	treat invoke like call Summary: InvokeInst should be treated like CallInst and assigned a separate discriminator. This is particularly import when an Invoke is converted to a Call during compilation and so can invalidate sample profile data collected wtih different link time optimizations Reviewers: twoh, Kader, danielcdh, wmi Reviewed By: wmi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56491 llvm-svn: 351251	2019-01-15 21:26:51 +00:00
Matt Morehouse	19ff35c481	[SanitizerCoverage] Don't create comdat for interposable functions. Summary: Comdat groups override weak symbol behavior, allowing the linker to keep the comdats for weak symbols in favor of comdats for strong symbols. Fixes the issue described in: https://bugs.chromium.org/p/chromium/issues/detail?id=918662 Reviewers: eugenis, pcc, rnk Reviewed By: pcc, rnk Subscribers: smeenai, rnk, bd1976llvm, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D56516 llvm-svn: 351247	2019-01-15 21:21:01 +00:00
David Callahan	dee001207e	We can improve the performance (generally) by memo-izing the action to map a debug location to its function summary. Summary: Here are timings (as reported by "opt -time-passes") for sample-profile pass for some files holding hot functions from a major service©r. Average 17% reduction. Delta column is 100*(old-new)/old. ``` Old New Delta 0.0537 0.0538 -0.2% 0.8155 0.6522 20.0% 0.0779 0.0751 3.6% 0.0727 0.0913 -25.6% 0.1622 0.1302 19.7% 0.0627 0.0594 5.3% 0.0766 0.0744 2.9% 0.6426 0.4387 31.7% 0.3521 0.2776 21.2% 0.3549 0.2721 23.3% 0.0912 0.0904 0.9% 0.1236 0.1059 14.3% 0.0854 0.0866 -1.4% 0.0757 0.0722 4.6% 0.1293 0.1147 11.3% 0.1354 0.1122 17.1% 0.0767 0.0770 -0.4% 0.1135 0.0968 14.7% 0.0524 0.0608 -16.0% 0.1279 0.1106 13.5% ========== 3.6820 3.0520 17.1% Total ``` Reviewers: twoh, Kader, danielcdh, wmi Reviewed By: wmi Subscribers: dblaikie, llvm-commits Differential Revision: https://reviews.llvm.org/D56435 llvm-svn: 351211	2019-01-15 17:45:54 +00:00
Zaara Syeda	b7dff9c9af	[SimpleLoopUnswitch] Increment stats counter for unswitching switch instruction Increment statistics counter NumSwitches at unswitchNontrivialInvariants() for unswitching a non-trivial switch instruction. This is to fix a bug that it increments NumBranches even for the case of switch instruction. There is no functional change in this patch. Differential Revision: https://reviews.llvm.org/D56408 llvm-svn: 351193	2019-01-15 15:08:01 +00:00
Florian Hahn	4094f34f78	[InstCombine] Don't undo 0 - (X * Y) canonicalization when combining subs. Otherwise instcombine gets stuck in a cycle. The canonicalization was added in D55961. This patch fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=12400 llvm-svn: 351187	2019-01-15 11:18:21 +00:00
Max Kazantsev	f8a0e0ddf0	[NFC] Remove some code duplication llvm-svn: 351185	2019-01-15 11:16:14 +00:00
Max Kazantsev	80242ee87e	[NFC] Remove obsolete enum RangeCheckKind llvm-svn: 351183	2019-01-15 10:48:45 +00:00
Max Kazantsev	78a5435284	[NFC] Decrease if nest llvm-svn: 351180	2019-01-15 10:01:46 +00:00
Max Kazantsev	a78dc4d6c8	[NFC] Move some functions to LoopUtils llvm-svn: 351179	2019-01-15 09:51:34 +00:00
Marek Olsak	33eb4d947d	AMDGPU: Add a fast path for icmp.i1(src, false, NE) Summary: This allows moving the condition from the intrinsic to the standard ICmp opcode, so that LLVM can do simplifications on it. The icmp.i1 intrinsic is an identity for retrieving the SGPR mask. And we can also get the mask from and i1, or i1, xor i1. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D52060 llvm-svn: 351150	2019-01-15 02:13:18 +00:00
Jonathan Metzman	e159a0dd1a	[SanitizerCoverage][NFC] Use appendToUsed instead of include Summary: Use appendToUsed instead of include to ensure that SanitizerCoverage's constructors are not stripped. Also, use isOSBinFormatCOFF() to determine if target binary format is COFF. Reviewers: pcc Reviewed By: pcc Subscribers: hiraditya Differential Revision: https://reviews.llvm.org/D56369 llvm-svn: 351118	2019-01-14 21:02:02 +00:00
David Callahan	957795973b	Ignore PhiNodes when mapping sample profile data Summary: Like branch instructions, phi nodes frequently do not have debug information related to the block they are in and so they should be ignored. Reviewers: danielcdh, twoh, Kader, wmi Reviewed By: wmi Subscribers: aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D55094 llvm-svn: 351102	2019-01-14 19:05:59 +00:00
David Callahan	b1853a6a94	Revert "Merge branch 'arcpatch-D55094'" This reverts commit a9788dd6587d67c856df74eedff5a6ad34ce8320, reversing changes made to f1309ffebf718d16aec4fab83380556c660e2825. unintended merge pushed llvm-svn: 351095	2019-01-14 18:49:27 +00:00
David Callahan	1b46231764	Merge branch 'arcpatch-D55094' llvm-svn: 351092	2019-01-14 18:35:43 +00:00
Tom Stellard	d0a7676087	cmake: Don't install plugins used for examples or tests Summary: This patch drops install targets for LLVMHello.so, TestPlugin.so, and BugpointPasses.so. Reviewers: chandlerc, beanz, thakis, philip.pfaffe Reviewed By: chandlerc Subscribers: SquallATF, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D55965 llvm-svn: 351087	2019-01-14 18:25:35 +00:00
Jeremy Morse	f216da7ee0	[DebugInfo] Remove un-necessary logic from HoistThenElseCodeToIf Following PR39807, the way in which SimplifyCFG hoists common code on branch paths was fixed in r347782. However this left extra code hanging around HoistThenElseCodeToIf that wasn't necessary and needlessly complicated matters -- we no longer need to look up through the 'if' basic block to find a location for hoisted 'select' insts, we can instead use the location chosen by applyMergedLocation. This patch deletes that extra logic, and updates a regression test to reflect the new logic (selects get the merged location, not a previous insts location). Differential Revision: https://reviews.llvm.org/D55272 llvm-svn: 351058	2019-01-14 12:13:12 +00:00
David Stuttard	f77079f892	[AMDGPU] Add support for TFE/LWE in image intrinsics. 2nd try TFE and LWE support requires extra result registers that are written in the event of a failure in order to detect that failure case. The specific use-case that initiated these changes is sparse texture support. This means that if image intrinsics are used with either option turned on, the programmer must ensure that the return type can contain all of the expected results. This can result in redundant registers since the vector size must be a power-of-2. This change takes roughly 6 parts: 1. Modify the instruction defs in tablegen to add new instruction variants that can accomodate the extra return values. 2. Updates to lowerImage in SIISelLowering.cpp to accomodate setting TFE or LWE (where the bulk of the work for these instruction types is now done) 3. Extra verification code to catch cases where intrinsics have been used but insufficient return registers are used. 4. Modification to the adjustWritemask optimisation to account for TFE/LWE being enabled (requires extra registers to be maintained for error return value). 5. An extra pass to zero initialize the error value return - this is because if the error does not occur, the register is not written and thus must be zeroed before use. Also added a new (on by default) option to ensure ALL return values are zero-initialized that is required for sparse texture support. 6. Disable the inst_combine optimization in the presence of tfe/lwe (later TODO for this to re-enable and handle correctly). There's an additional fix now to avoid a dmask=0 For an image intrinsic with tfe where all result channels except tfe were unused, I was getting an image instruction with dmask=0 and only a single vgpr result for tfe. That is incorrect because the hardware assumes there is at least one vgpr result, plus the one for tfe. Fixed by forcing dmask to 1, which gives the desired two vgpr result with tfe in the second one. The TFE or LWE result is returned from the intrinsics using an aggregate type. Look in the test code provided to see how this works, but in essence IR code to invoke the intrinsic looks as follows: %v = call {<4 x float>,i32} @llvm.amdgcn.image.load.1d.v4f32i32.i32(i32 15, i32 %s, <8 x i32> %rsrc, i32 1, i32 0) %v.vec = extractvalue {<4 x float>, i32} %v, 0 %v.err = extractvalue {<4 x float>, i32} %v, 1 This re-submit of the change also includes a slight modification in SIISelLowering.cpp to work-around a compiler bug for the powerpc_le platform that caused a buildbot failure on a previous submission. Differential revision: https://reviews.llvm.org/D48826 Change-Id: If222bc03642e76cf98059a6bef5d5bffeda38dda Work around for ppcle compiler bug Change-Id: Ie284cf24b2271215be1b9dc95b485fd15000e32b llvm-svn: 351054	2019-01-14 11:55:24 +00:00
Max Kazantsev	1f73310e1e	[BasicBlockUtils] Generalize DeleteDeadBlock to deal with multiple dead blocks Utility function `DeleteDeadBlock` expects that all predecessors of a block being deleted are already deleted, with the exception of single-block loop. It makes it hard to use for deletion of a set of blocks that may contain cyclic dependencies. The is no correct order of invocations of this function that does not produce dangling pointers on already deleted blocks. This patch introduces a generalized version of this function `DeleteDeadBlocks` that allows us to remove multiple blocks at once, even if there are cycles among them. The only requirement is that no block being deleted should have a predecessor that is not being deleted. The logic of `DeleteDeadBlocks` is following: for each block create relevant DT updates; remove all instructions (replace with undef if needed); replace terminator with unreacheable; apply DT updates; for each block delete block; Therefore, `DeleteDeadBlock` becomes a particular case of the general algorithm called for a single block. Differential Revision: https://reviews.llvm.org/D56120 Reviewed By: skatkov llvm-svn: 351045	2019-01-14 10:26:26 +00:00
Benjamin Kramer	b17d2136ea	Give helper classes/functions local linkage. NFC. llvm-svn: 351016	2019-01-12 18:36:22 +00:00
Sanjay Patel	7d65fe5cd5	[LoopVectorizer] give more advice in remark about failure to vectorize call Something like this is requested by: https://bugs.llvm.org/show_bug.cgi?id=40265 ...and it seems like a common enough case that we should acknowledge it. Differential Revision: https://reviews.llvm.org/D56551 llvm-svn: 351010	2019-01-12 15:27:15 +00:00
Teresa Johnson	290a839891	[LTO] Record whether LTOUnit splitting is enabled in index Summary: Records in the module summary index whether the bitcode was compiled with the option necessary to enable splitting the LTO unit (e.g. -fsanitize=cfi, -fwhole-program-vtables, or -fsplit-lto-unit). The information is passed down to the ModuleSummaryIndex builder via a new module flag "EnableSplitLTOUnit", which is propagated onto a flag on the summary index. This is then used during the LTO link to check whether all linked summaries were built with the same value of this flag. If not, an error is issued when we detect a situation requiring whole program visibility of the class hierarchy. This is the case when both of the following conditions are met: 1) We are performing LowerTypeTests or Whole Program Devirtualization. 2) There are type tests or type checked loads in the code. Note I have also changed the ThinLTOBitcodeWriter to also gate the module splitting on the value of this flag. Reviewers: pcc Subscribers: ormris, mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, dang, llvm-commits Differential Revision: https://reviews.llvm.org/D53890 llvm-svn: 350948	2019-01-11 18:31:57 +00:00
Vedant Kumar	ee10ef737e	[MergeFunc] Erase unused duplicate functions if they are discardable MergeFunc only deletes unused duplicate functions if they have local linkage, but it should be safe to relax this to any "discardable if unused" linkage type. Differential Revision: https://reviews.llvm.org/D56574 llvm-svn: 350939	2019-01-11 17:56:35 +00:00
Vedant Kumar	08fe7e02fb	[MergeFunc] Use Instruction::getFunction as a cleanup, NFC llvm-svn: 350938	2019-01-11 17:56:21 +00:00
Ehsan Amiri	f452f116d2	[Jump Threading] Unfold a select insn that feeds a switch via a phi node Currently when a select has a constant value in one branch and the select feeds a conditional branch (via a compare/ phi and compare) we unfold the select statement. This results in threading the conditional branch later on. Similar opportunity exists when a select (with a constant in one branch) feeds a switch (via a phi node). The patch unfolds select under this condition. A testcase is provided. llvm-svn: 350931	2019-01-11 15:52:57 +00:00
Matt Davis	9cd9f41f0e	[GVN] Update BlockRPONumber prior to use. Summary: The original patch addressed the use of BlockRPONumber by forcing a sequence point when accessing that map in a conditional. In short we found cases where that map was being accessed with blocks that had not yet been added to that structure. For context, I've kept the wall of text below, to what we are trying to fix, by always ensuring a updated BlockRPONumber. == Backstory == I was investigating an ICE (segfault accessing a DenseMap item). This failure happened non-deterministically, with no apparent reason and only on a Windows build of LLVM (from October 2018). After looking into the crashes (multiple core files) and running DynamoRio, the cores and DynamoRio (DR) log pointed to the same code in `GVN::performScalarPRE()`. The values in the map are unsigned integers, the keys are `llvm::BasicBlock*`. Our test case that triggered this warning and periodic crash is rather involved. But the problematic line looks to be: GVN.cpp: Line 2197 ``` if (BlockRPONumber[P] >= BlockRPONumber[CurrentBlock] && ``` To test things out, I cooked up a patch that accessed the items in the map outside of the condition, by forcing a sequence point between accesses. DynamoRio stopped warning of the issue, and the test didn't seem to crash after 1000+ runs. My investigation was on an older version of LLVM, (source from October this year). What it looks like was occurring is the following, and the assembly from the latest pull of llvm in December seems to confirm this might still be an issue; however, I have not witnessed the crash on more recent builds. Of course the asm in question is generated from the host compiler on that Windows box (not clang), but it hints that we might want to consider how we access the BlockRPONumber map in this conditional (line 2197, listed above). In any case, I don't think the host compiler is wrong, rather I think it is pointing out a possibly latent bug in llvm. 1) There is no sequence point for the `>=` operation. 2) A call to a `DenseMapBase::operator[]` can have the side effect of the map reallocating a larger store (more Buckets, via a call to `DenseMap::grow`). 3) It seems perfectly legal for a host compiler to generate assembly that stores the result of a call to `operator[]` on the stack (that's what my host compile of GVN.cpp is doing) . A second call to `operator[]` //might// encourage the map to 'grow' thus making any pointers to the map's store invalid. The `>=` compares the first and second values. If the first happens to be a pointer produced from operator[], it could be invalid when dereferenced at the time of comparison. The assembly generated from the Window's host compiler does show the result of the first access to the map via `operator[]` produces a pointer to an unsigned int. And that pointer is being stored on the stack. If a second call to the map (which does occur) causes the map to grow, that address (on the stack) is now invalid. Reviewers: t.p.northover, efriedma Reviewed By: efriedma Subscribers: efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D55974 llvm-svn: 350880	2019-01-10 19:56:03 +00:00
Alina Sbirlea	cae12edaaa	Use MemorySSA in LICM to do sinking and hoisting. Summary: Step 2 in using MemorySSA in LICM: Use MemorySSA in LICM to do sinking and hoisting, all under "EnableMSSALoopDependency" flag. Promotion is disabled. Enable flag in LICM sink/hoist tests to test correctness of this change. Moved one test which relied on promotion, in order to test all sinking tests. Reviewers: sanjoy, davide, gberry, george.burgess.iv Subscribers: llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D40375 llvm-svn: 350879	2019-01-10 19:29:04 +00:00
James Y Knight	62df5eed16	[opaque pointer types] Remove some calls to generic Type subtype accessors. That is, remove many of the calls to Type::getNumContainedTypes(), Type::subtypes(), and Type::getContainedType(N). I'm not intending to remove these accessors -- they are useful/necessary in some cases. However, removing the pointee type from pointers would potentially break some uses, and reducing the number of calls makes it easier to audit. llvm-svn: 350835	2019-01-10 16:07:20 +00:00
Eli Friedman	d4e7a0d83c	[SimplifyLibCalls] Fix memchr expansion for constant strings. The C standard says "The memchr function locates the first occurrence of c (converted to an unsigned char)[...]". The expansion was missing the conversion to unsigned char. Fixes https://bugs.llvm.org/show_bug.cgi?id=39041 . Differential Revision: https://reviews.llvm.org/D55947 llvm-svn: 350775	2019-01-09 23:39:26 +00:00
Easwaran Raman	b45994b843	Refactor synthetic profile count computation. NFC. Summary: Instead of using two separate callbacks to return the entry count and the relative block frequency, use a single callback to return callsite count. This would allow better supporting hybrid mode in the future as the count of callsite need not always be derived from entry count (as in sample PGO). Reviewers: davidxl Subscribers: mehdi_amini, steven_wu, dexonsmith, dang, llvm-commits Differential Revision: https://reviews.llvm.org/D56464 llvm-svn: 350755	2019-01-09 20:10:27 +00:00
Florian Hahn	9697d2a764	Revert r350647: "[NewPM] Port tsan" This patch breaks thread sanitizer on some macOS builders, e.g. http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/52725/ llvm-svn: 350719	2019-01-09 13:32:16 +00:00
Max Kazantsev	4615a505f8	[IPT] Drop cache less eagerly in GVN and LoopSafetyInfo Current strategy of dropping `InstructionPrecedenceTracking` cache is to invalidate the entire basic block whenever we change its contents. In fact, `InstructionPrecedenceTracking` has 2 internal strictures: `OrderedInstructions` that is needed to be invalidated whenever the contents changes, and the map with first special instructions in block. This second map does not need an update if we add/remove a non-special instuction because it cannot affect the contents of this map. This patch changes API of `InstructionPrecedenceTracking` so that it now accounts for reasons under which we invalidate blocks. This should lead to much less recalculations of the map and should save us some compile time because in practice we don't typically add/remove special instructions. Differential Revision: https://reviews.llvm.org/D54462 Reviewed By: efriedma llvm-svn: 350694	2019-01-09 07:28:13 +00:00
Sanjay Patel	d023dd60e9	[InstCombine] canonicalize another raw IR rotate pattern to funnel shift This is matching the equivalent of the DAG expansion, so it should never end up with worse perf than the original code even if the target doesn't have a rotate instruction. llvm-svn: 350672	2019-01-08 22:39:55 +00:00
Philip Pfaffe	82f995db75	[NewPM] Port tsan A straightforward port of tsan to the new PM, following the same path as D55647. Differential Revision: https://reviews.llvm.org/D56433 llvm-svn: 350647	2019-01-08 19:21:57 +00:00
Anna Thomas	2dfa412efe	[UnrollRuntime] Fix domTree failures in multiexit unrolling Summary: This fixes the IDom for exit blocks and all blocks reachable from the exit blocks, when runtime unrolling under multiexit/exiting case. We initially had a restrictive check that the IDom is only updated when it is the header of the loop. However, we also need to update the IDom to the correct one when the IDom is any block within the original loop. See added test cases (which fail dom tree verification without the patch). Reviewers: reames, mzolotukhin, mkazantsev, hfinkel Reviewed by: brzycki, kuhar Subscribers: zzheng, dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D56284 llvm-svn: 350640	2019-01-08 17:16:25 +00:00
Chandler Carruth	57578aaf96	[CallSite removal] Port `IndirectCallSiteVisitor` to use `CallBase` and update client code. Also rename it to use the more generic term `call` instead of something that could be confused with a praticular type. Differential Revision: https://reviews.llvm.org/D56183 llvm-svn: 350508	2019-01-07 07:15:51 +00:00
Chandler Carruth	363ac68374	[CallSite removal] Migrate all Alias Analysis APIs to use the newly minted `CallBase` class instead of the `CallSite` wrapper. This moves the largest interwoven collection of APIs that traffic in `CallSite`s. While a handful of these could have been migrated with a minorly more shallow migration by converting from a `CallSite` to a `CallBase`, it hardly seemed worth it. Most of the APIs needed to migrate together because of the complex interplay of AA APIs and the fact that converting from a `CallBase` to a `CallSite` isn't free in its current implementation. Out of tree users of these APIs can fairly reliably migrate with some combination of `.getInstruction()` on the `CallSite` instance and casting the resulting pointer. The most generic form will look like `CS` -> `cast_or_null<CallBase>(CS.getInstruction())` but in most cases there is a more elegant migration. Hopefully, this migrates enough APIs for users to fully move from `CallSite` to the base class. All of the in-tree users were easily migrated in that fashion. Thanks for the review from Saleem! Differential Revision: https://reviews.llvm.org/D55641 llvm-svn: 350503	2019-01-07 05:42:51 +00:00
Nikita Popov	65038515ee	[InstCombine] Relax cttz/ctlz with select on zero The cttz/ctlz intrinsics have a parameter specifying whether the result is undefined for zero. cttz(x, false) can be relaxed to cttz(x, true) if x is known non-zero, and in fact such an optimization is already performed. However, this currently doesn't work if x is non-zero as a result of a select rather than an explicit branch. This patch adds handling for this case, thus allowing x != 0 ? cttz(x, false) : y to simplify to x != 0 ? cttz(x, true) : y. Differential Revision: https://reviews.llvm.org/D55786 llvm-svn: 350463	2019-01-05 09:48:16 +00:00
Easwaran Raman	366a873f14	[Inliner] Optimize shouldBeDeferred This has some minor optimizations to shouldBeDeferred. This is not strictly NFC because the early exit inside the loop assumes TotalSecondaryCost is monotonically non-decreasing, which is not true if the threshold used by CostAnalyzer is negative. AFAICT the thresholds do not go below 0 for the default values of the various options we use. llvm-svn: 350456	2019-01-05 02:26:29 +00:00
Evgeniy Stepanov	0184c53cbd	Revert "Revert "[hwasan] Android: Switch from TLS_SLOT_TSAN(8) to TLS_SLOT_SANITIZER(6)"" This reapplies commit r348983. llvm-svn: 350448	2019-01-05 00:44:58 +00:00
Nikita Popov	6658fce4fc	[BDCE] Remove dead uses of arguments In addition to finding dead uses of instructions, also find dead uses of function arguments, and replace them with zero as well. I'm changing the way the known bits are computed here to remove the coupling between the transfer function and the algorithm. It previously relied on the first op being visited first and computing known bits -- unless the first op is not an instruction, in which case they're computed on the second op. I could have adjusted this to check for "instruction or argument", but I think it's better to avoid the repeated calculation with an explicit flag. Differential Revision: https://reviews.llvm.org/D56247 llvm-svn: 350435	2019-01-04 21:21:43 +00:00
Peter Collingbourne	87f477b5e4	hwasan: Implement lazy thread initialization for the interceptor ABI. The problem is similar to D55986 but for threads: a process with the interceptor hwasan library loaded might have some threads started by instrumented libraries and some by uninstrumented libraries, and we need to be able to run instrumented code on the latter. The solution is to perform per-thread initialization lazily. If a function needs to access shadow memory or add itself to the per-thread ring buffer its prologue checks to see whether the value in the sanitizer TLS slot is null, and if so it calls __hwasan_thread_enter and reloads from the TLS slot. The runtime does the same thing if it needs to access this data structure. This change means that the code generator needs to know whether we are targeting the interceptor runtime, since we don't want to pay the cost of lazy initialization when targeting a platform with native hwasan support. A flag -fsanitize-hwaddress-abi={interceptor,platform} has been introduced for selecting the runtime ABI to target. The default ABI is set to interceptor since it's assumed that it will be more common that users will be compiling application code than platform code. Because we can no longer assume that the TLS slot is initialized, the pthread_create interceptor is no longer necessary, so it has been removed. Ideally, lazy initialization should only cost one instruction in the hot path, but at present the call may cause us to spill arguments to the stack, which means more instructions in the hot path (or theoretically in the cold path if the spills are moved with shrink wrapping). With an appropriately chosen calling convention for the per-thread initialization function (TODO) the hot path should always need just one instruction and the cold path should need two instructions with no spilling required. Differential Revision: https://reviews.llvm.org/D56038 llvm-svn: 350429	2019-01-04 19:27:04 +00:00
Teresa Johnson	853b962416	[ThinLTO] Handle chains of aliases At -O0, globalopt is not run during the compile step, and we can have a chain of an alias having an immediate aliasee of another alias. The summaries are constructed assuming aliases in a canonical form (flattened chains), and as a result only the base object but no intermediate aliases were preserved. Fix by adding a pass that canonicalize aliases, which ensures each alias is a direct alias of the base object. Reviewers: pcc, davidxl Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D54507 llvm-svn: 350423	2019-01-04 19:04:54 +00:00
Vedant Kumar	a1778df474	[CodeExtractor] Do not extract unsafe lifetime markers Lifetime markers which reference inputs to the extraction region are not safe to extract. Example ('rhs' will be extracted): ``` entry: +------------+ \| x = alloca \| \| y = alloca \| +------------+ / \ lhs: rhs: +-------------------+ +-------------------+ \| lifetime_start(x) \| \| lifetime_start(x) \| \| use(x) \| \| lifetime_start(y) \| \| lifetime_end(x) \| \| use(x, y) \| \| lifetime_start(y) \| \| lifetime_end(y) \| \| use(y) \| \| lifetime_end(x) \| \| lifetime_end(y) \| +-------------------+ +-------------------+ ``` Prior to extraction, the stack coloring pass sees that the slots for 'x' and 'y' are in-use at the same time. After extraction, the coloring pass infers that 'x' and 'y' are not in-use concurrently, because markers from 'rhs' are no longer available to help decide otherwise. This leads to a miscompile, because the stack slots actually are in-use concurrently in the extracted function. Fix this by moving lifetime start/end markers for memory regions defined in the calling function around the call to the extracted function. Fixes llvm.org/PR39671 (rdar://45939472). Differential Revision: https://reviews.llvm.org/D55967 llvm-svn: 350420	2019-01-04 17:43:22 +00:00
Sanjay Patel	722466e1f1	[InstCombine] reduce raw IR narrowing rotate patterns to funnel shift Similar to rL350199 - there are no known analysis/codegen holes for funnel shift intrinsics now, so we can canonicalize the 6+ regular instructions to funnel shift to improve vectorization, inlining, unrolling, etc. llvm-svn: 350419	2019-01-04 17:38:12 +00:00
John Brawn	39ac159c24	[LICM] Adjust how moving the re-hoist point works In some cases the order that we hoist instructions in means that when rehoisting (which uses the same order as hoisting) we can rehoist to a block A, then a block B, then block A again. This currently causes an assertion failure as it expects that when changing the hoist point it only ever moves to a block that dominates the hoist point being moved from. Fix this by moving the re-hoist point when it doesn't dominate the dominator of hoisted instruction, or in other words when it wouldn't dominate the uses of the instruction being rehoisted. Differential Revision: https://reviews.llvm.org/D55266 llvm-svn: 350408	2019-01-04 17:12:09 +00:00
Xin Tong	47beee2f3f	[memcpyopt] Remove a few unnecessary isVolatile() checks. NFC We already checked for isSimple() on the store. llvm-svn: 350378	2019-01-04 02:13:22 +00:00
Anna Thomas	a470aa6701	[UnrollRuntime] Move the DomTree verification under expensive checks Suggested by Hal as done in r349871. llvm-svn: 350349	2019-01-03 19:43:33 +00:00
Anna Thomas	0785e7307e	[UnrollRuntime] Add DomTree verification under debug mode NFC: This adds the dom tree verification under debug mode at a point just before we start unrolling the loop. This allows us to verify dom tree at a state where it is much smaller and before the unrolling actually happens. This also implies we do not need to run -verify-dom-info everytime to see if the DT is in a valid state when we transform the loop for runtime unrolling. llvm-svn: 350334	2019-01-03 17:44:44 +00:00

1 2 3 4 5 ...

21147 Commits