llvm-project

Commit Graph

Author	SHA1	Message	Date
Bruno Cardoso Lopes	c9005f2f2b	[InstCombine] Fix visitSwitchInst to use right operand types for sub cstexpr The visitSwitchInst generates SUB constant expressions to recompute the switch condition. When truncating the condition to a smaller type, SUB expressions should use the previous type (before trunc) for both operands. This fixes an assertion crash. Differential Revision: http://reviews.llvm.org/D6644 rdar://problem/19191835 llvm-svn: 224574	2014-12-19 14:23:15 +00:00
Duncan P. N. Exon Smith	46d7af5729	Rename MapValue(Metadata*) to MapMetadata() Instead of reusing the name `MapValue()` when mapping `Metadata`, use `MapMetadata()`. The old name doesn't make much sense after the `Metadata`/`Value` split. llvm-svn: 224566	2014-12-19 06:06:18 +00:00
Sanjay Patel	c242dbb3b6	fix formatting; NFC llvm-svn: 224542	2014-12-18 21:11:09 +00:00
Viktor Kutuzov	b4ffb5d5e9	[Msan] Generalize instrumentation code to support FreeBSD mapping Differential Revision: http://reviews.llvm.org/D6666 llvm-svn: 224514	2014-12-18 12:12:59 +00:00
Chandler Carruth	68ea415d04	[SROA] Cleanup - remove the use of std::mem_fun_ref nonsense and use a lambda now that we have them. llvm-svn: 224500	2014-12-18 05:19:47 +00:00
Kostya Serebryany	fea4fb404e	[sanitizer] allow -fsanitize-coverage=N w/ -fsanitize=leak, llvm part llvm-svn: 224463	2014-12-17 21:50:04 +00:00
Suyog Sarda	43fae93da8	Revert 224119 "This patch recognizes (+ (+ v0, v1) (+ v2, v3)), reorders them for bundling into vector of loads, and vectorizes it." This was re-ordering floating point data types resulting in mismatch in output. llvm-svn: 224424	2014-12-17 10:34:27 +00:00
Erik Eckstein	a451b9b0b5	Strength reduce intrinsics with overflow into regular arithmetic operations if possible. Some intrinsics, like s/uadd.with.overflow and umul.with.overflow, are already strength reduced. This change adds other arithmetic intrinsics: s/usub.with.overflow, smul.with.overflow. It completes the work on PR20194. llvm-svn: 224417	2014-12-17 07:29:19 +00:00
Kostya Serebryany	7376294086	[sanitizer] prevent function call merging for sanitizer-coverage callbacks llvm-svn: 224372	2014-12-16 21:24:15 +00:00
Elena Demikhovsky	f5b72afff4	Masked Load and Store Intrinsics in loop vectorizer. The loop vectorizer optimizes loops containing conditional memory accesses by generating masked load and store intrinsics. This decision is target dependent. http://reviews.llvm.org/D6527 llvm-svn: 224334	2014-12-16 11:50:42 +00:00
Elena Demikhovsky	a5599bfd72	Sink store based on alias analysis - by Ella Bolshinsky The alias analysis is used to determine whether the given instruction is a barrier for store sinking. For 2 identical stores, the following instructions are checked in both basic blocks to determine whether they are sinking barriers. http://reviews.llvm.org/D6420 llvm-svn: 224247	2014-12-15 14:09:53 +00:00
Elena Demikhovsky	3fcafa2cdb	Loop Vectorizer minor changes in the code - some comments, function names, indentation. Reviewed here: http://reviews.llvm.org/D6527 llvm-svn: 224218	2014-12-14 09:43:50 +00:00
Steven Wu	f179d12e50	More code format fix from r224133, NFC llvm-svn: 224140	2014-12-12 18:48:37 +00:00
Steven Wu	1f7402a14e	Restructure code from r224097. NFC llvm-svn: 224133	2014-12-12 17:21:54 +00:00
Chad Rosier	78943bcc18	[Reassociate] Use dbgs() instead of errs(). llvm-svn: 224125	2014-12-12 14:44:12 +00:00
Suyog Sarda	384095e65c	This patch recognizes (+ (+ v0, v1) (+ v2, v3)), reorders them for bundling into vector of loads, and vectorizes it. Test case : float hadd(float* a) { return (a[0] + a[1]) + (a[2] + a[3]); } AArch64 assembly before patch : ldp s0, s1, [x0] ldp s2, s3, [x0, #8] fadd s0, s0, s1 fadd s1, s2, s3 fadd s0, s0, s1 ret AArch64 assembly after patch : ldp d0, d1, [x0] fadd v0.2s, v0.2s, v1.2s faddp s0, v0.2s ret Reviewed Link : http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141208/248531.html llvm-svn: 224119	2014-12-12 12:53:44 +00:00
Steven Wu	881916dea5	Fix another infinite loop in InstCombine Summary: InstCombine infinite-loops for the testcase added. This happens because InstCombine generates instructions that it can itself optimize again. Fix by not optimizing frem if the optimized type is the same as the original type. rdar://problem/19150820 Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D6634 llvm-svn: 224097	2014-12-12 04:34:07 +00:00
Alexey Samsonov	4b7f413e3e	[ASan] Change fake stack and local variables handling. This commit changes the way we get fake stack from ASan runtime (to find use-after-return errors) and the way we represent local variables: - __asan_stack_malloc function now returns pointer to newly allocated fake stack frame, or NULL if frame cannot be allocated. It doesn't take pointer to real stack as an input argument, it is calculated inside the runtime. - __asan_stack_free function doesn't take pointer to real stack as an input argument. Now this function is never called if fake stack frame wasn't allocated. - __asan_init version is bumped to reflect changes in the ABI. - new flag "-asan-stack-dynamic-alloca" allows to store all the function local variables in a dynamic alloca, instead of the static one. It reduces the stack space usage in use-after-return mode (dynamic alloca will not be called if the local variables are stored in a fake stack), and improves the debug info quality for local variables (they will not be described relatively to %rbp/%rsp, which are assumed to be clobbered by function calls). This flag is turned off by default for now, but I plan to turn it on after more testing. llvm-svn: 224062	2014-12-11 21:53:03 +00:00
Andrea Di Biagio	72b05aa59c	[InstCombine][X86] Improved folding of calls to Intrinsic::x86_sse4a_insertqi. This patch teaches the instruction combiner how to fold a call to 'insertqi' if the 'length field' (3rd operand) is set to zero, and if the sum between field 'length' and 'bit index' (4th operand) is bigger than 64. From the AMD64 Architecture Programmer's Manual: 1. If the sum of the bit index + length field is greater than 64, then the results are undefined; 2. A value of zero in the field length is defined as a length of 64. This patch improves the existing combining logic for intrinsic 'insertqi' adding extra checks to address both point 1. and point 2. Differential Revision: http://reviews.llvm.org/D6583 llvm-svn: 224054	2014-12-11 20:44:59 +00:00
Michael Kuperstein	fffb6996c9	The inliner needs to fix up debug information for llvm.dbg.declare, not only for llvm.dbg.value. Patch by Amjad Aboud Differential Revision: http://reviews.llvm.org/D6525 llvm-svn: 224015	2014-12-11 12:41:10 +00:00
Erik Eckstein	096ff7dcd6	Refactor creation of overflow result tuples in InstCombineCalls. Extract the creation of overflow result tuples in a separate function. NFC. llvm-svn: 224006	2014-12-11 08:02:30 +00:00
Kaelyn Takata	22324f378a	Rename static function "map" to be more descriptive and to avoid potential confusion with the std::map type. llvm-svn: 223853	2014-12-09 23:32:46 +00:00
Michael Zolotukhin	4def395646	Remove redundant variable. Tested by adding assert(LoopVectorPreHeader == VecPreheader) on LLVM test suite and SPECs. llvm-svn: 223847	2014-12-09 22:45:07 +00:00
Chandler Carruth	a7f247ea56	Revert r223764 which taught instcombine about integer-based element extraction patterns. This is causing Clang to miscompile itself for 32-bit x86 somehow, and likely also on ARM and PPC. I really don't know how, but reverting now that I've confirmed this is actually the culprit. I have a reproduction as well and so should be able to restore this shortly. This reverts commit r223764. Original commit log follows: Teach instcombine to canonicalize "element extraction" from a load of an integer and "element insertion" into a store of an integer into actual element extraction, element insertion, and vector loads and stores. Previously various parts of LLVM (including instcombine itself) would introduce integer loads and stores into the code as a way of opaquely loading and storing "bits". In some cases (such as a memcpy of std::complex<float> object) we will eventually end up using those bits in non-integer types. In order for SROA to effectively promote the allocas involved, it splits these "store a bag of bits" integer loads and stores up into the constituent parts. However, for non-alloca loads and stores which remain, it uses integer math to recombine the values into a large integer to load or store. All of this would be "fine", except that it forces LLVM to go through integer math to combine and split up values. While this makes perfect sense for integers (and in fact is critical for bitfields to end up lowering efficiently) it is terrible for non-integer types, especially floating point types. We have a much more canonical way of representing the act of concatenating the bits of two SSA values in LLVM: a vector and insertelement. This patch teaches InstCombine to use this representation. With this patch applied, LLVM will no longer introduce integer math into the critical path of every loop over std::complex<float> operations such as those that make up the hot path of ... oh, most HPC code, Eigen, and any other heavy linear algebra library. For the record, I looked extensively at fixing this in other parts of the compiler, but it just doesn't work: - We really do want to canonicalize memcpy and other bit-motion to integer loads and stores. SSA values are tremendously more powerful than "copy" intrinsics. Not doing this regresses massive amounts of LLVM's scalar optimizer. - We really do need to split up integer loads and stores of this form in SROA or every memcpy of a trivially copyable struct will prevent SSA formation of the members of that struct. It essentially turns off SROA. - The closest alternative is to actually split the loads and stores when partitioning with SROA, but this has all of the downsides historically discussed of splitting up loads and stores -- the wide-store information is fundamentally lost. We would also see performance regressions for bitfield-heavy code and other places where the integers aren't really intended to be split without seemingly arbitrary logic to treat integers totally differently. - We can effectively fix this in instcombine, so it isn't that hard of a choice to make IMO. llvm-svn: 223813	2014-12-09 19:21:16 +00:00
Frederic Riss	35f0a9aeba	Remove unneeded curly braces. llvm-svn: 223809	2014-12-09 18:57:39 +00:00
Frederic Riss	ff58fd207e	Reorder the code to avoid inserting at the beginning of a vector. As per dblaikie suggestion, thanks! llvm-svn: 223808	2014-12-09 18:57:34 +00:00
Duncan P. N. Exon Smith	5bf8fef580	IR: Split Metadata from Value Split `Metadata` away from the `Value` class hierarchy, as part of PR21532. Assembly and bitcode changes are in the wings, but this is the bulk of the change for the IR C++ API. I have a follow-up patch prepared for `clang`. If this breaks other sub-projects, I apologize in advance :(. Help me compile it on Darwin I'll try to fix it. FWIW, the errors should be easy to fix, so it may be simpler to just fix it yourself. This breaks the build for all metadata-related code that's out-of-tree. Rest assured the transition is mechanical and the compiler should catch almost all of the problems. Here's a quick guide for updating your code: - `Metadata` is the root of a class hierarchy with three main classes: `MDNode`, `MDString`, and `ValueAsMetadata`. It is distinct from the `Value` class hierarchy. It is typeless -- i.e., instances do not have a `Type`. - `MDNode`'s operands are all `Metadata *` (instead of `Value *`). - `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively. If you're referring solely to resolved `MDNode`s -- post graph construction -- just use `MDNode*`. - `MDNode` (and the rest of `Metadata`) have only limited support for `replaceAllUsesWith()`. As long as an `MDNode` is pointing at a forward declaration -- the result of `MDNode::getTemporary()` -- it maintains a side map of its uses and can RAUW itself. Once the forward declarations are fully resolved RAUW support is dropped on the ground. This means that uniquing collisions on changing operands cause nodes to become "distinct". (This already happened fairly commonly, whenever an operand went to null.) If you're constructing complex (non self-reference) `MDNode` cycles, you need to call `MDNode::resolveCycles()` on each node (or on a top-level node that somehow references all of the nodes). Also, don't do that. Metadata cycles (and the RAUW machinery needed to construct them) are expensive. - An `MDNode` can only refer to a `Constant` through a bridge called `ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`). As a side effect, accessing an operand of an `MDNode` that is known to be, e.g., `ConstantInt`, takes three steps: first, cast from `Metadata` to `ConstantAsMetadata`; second, extract the `Constant`; third, cast down to `ConstantInt`. The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have metadata schema owners transition away from using `Constant`s when the type isn't important (and they don't care about referring to `GlobalValue`s). In the meantime, I've added transitional API to the `mdconst` namespace that matches semantics with the old code, in order to avoid adding the error-prone three-step equivalent to every call site. If your old code was: MDNode *N = foo(); bar(isa <ConstantInt>(N->getOperand(0))); baz(cast <ConstantInt>(N->getOperand(1))); bak(cast_or_null <ConstantInt>(N->getOperand(2))); bat(dyn_cast <ConstantInt>(N->getOperand(3))); bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4))); you can trivially match its semantics with: MDNode *N = foo(); bar(mdconst::hasa <ConstantInt>(N->getOperand(0))); baz(mdconst::extract <ConstantInt>(N->getOperand(1))); bak(mdconst::extract_or_null <ConstantInt>(N->getOperand(2))); bat(mdconst::dyn_extract <ConstantInt>(N->getOperand(3))); bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4))); and when you transition your metadata schema to `MDInt`: MDNode *N = foo(); bar(isa <MDInt>(N->getOperand(0))); baz(cast <MDInt>(N->getOperand(1))); bak(cast_or_null <MDInt>(N->getOperand(2))); bat(dyn_cast <MDInt>(N->getOperand(3))); bay(dyn_cast_or_null<MDInt>(N->getOperand(4))); - A `CallInst` -- specifically, intrinsic instructions -- can refer to metadata through a bridge called `MetadataAsValue`. This is a subclass of `Value` where `getType()->isMetadataTy()`. `MetadataAsValue` is the only class that can legally refer to a `LocalAsMetadata`, which is a bridged form of non-`Constant` values like `Argument` and `Instruction`. It can also refer to any other `Metadata` subclass. (I'll break all your testcases in a follow-up commit, when I propagate this change to assembly.) llvm-svn: 223802	2014-12-09 18:38:53 +00:00
Frederic Riss	7c78db5065	Correctly handle complex locations expressions in replaceDbgDeclareForAlloca() replaceDbgDeclareForAlloca() replaces an alloca by a value storing the address of what was the alloca. If there is a dbg.declare corresponding to that alloca, we need to lower it to a dbg.value describing the additional dereference operation to be performed to get to the underlying variable. This is done by adding a DW_OP_deref to the complex location part of the location description. This deref was added to the end of the operation list, which is wrong. The expression applies to what is described by the dbg.{declare,value}, and as we are changing this, we need to apply the DW_OP_deref as the first operation in the list. Part of the fix for rdar://19162268. llvm-svn: 223799	2014-12-09 17:55:48 +00:00
Juergen Ributzka	194350a936	Revert "Move function to obtain branch weights into the BranchInst class. NFC." This reverts commit r223784 and copies the 'ExtractBranchMetadata' to CodeGenPrepare. llvm-svn: 223795	2014-12-09 17:32:12 +00:00
Juergen Ributzka	e2aa3aa38a	Move function to obtain branch weights into the BranchInst class. NFC. Make this function available to other parts of LLVM. llvm-svn: 223784	2014-12-09 16:36:06 +00:00
Chandler Carruth	7415205113	Teach instcombine to canonicalize "element extraction" from a load of an integer and "element insertion" into a store of an integer into actual element extraction, element insertion, and vector loads and stores. Previously various parts of LLVM (including instcombine itself) would introduce integer loads and stores into the code as a way of opaquely loading and storing "bits". In some cases (such as a memcpy of std::complex<float> object) we will eventually end up using those bits in non-integer types. In order for SROA to effectively promote the allocas involved, it splits these "store a bag of bits" integer loads and stores up into the constituent parts. However, for non-alloca loads and stores which remain, it uses integer math to recombine the values into a large integer to load or store. All of this would be "fine", except that it forces LLVM to go through integer math to combine and split up values. While this makes perfect sense for integers (and in fact is critical for bitfields to end up lowering efficiently) it is terrible for non-integer types, especially floating point types. We have a much more canonical way of representing the act of concatenating the bits of two SSA values in LLVM: a vector and insertelement. This patch teaches InstCombine to use this representation. With this patch applied, LLVM will no longer introduce integer math into the critical path of every loop over std::complex<float> operations such as those that make up the hot path of ... oh, most HPC code, Eigen, and any other heavy linear algebra library. For the record, I looked extensively at fixing this in other parts of the compiler, but it just doesn't work: - We really do want to canonicalize memcpy and other bit-motion to integer loads and stores. SSA values are tremendously more powerful than "copy" intrinsics. Not doing this regresses massive amounts of LLVM's scalar optimizer. - We really do need to split up integer loads and stores of this form in SROA or every memcpy of a trivially copyable struct will prevent SSA formation of the members of that struct. It essentially turns off SROA. - The closest alternative is to actually split the loads and stores when partitioning with SROA, but this has all of the downsides historically discussed of splitting up loads and stores -- the wide-store information is fundamentally lost. We would also see performance regressions for bitfield-heavy code and other places where the integers aren't really intended to be split without seemingly arbitrary logic to treat integers totally differently. - We can effectively fix this in instcombine, so it isn't that hard of a choice to make IMO. Differential Revision: http://reviews.llvm.org/D6548 llvm-svn: 223764	2014-12-09 08:55:32 +00:00
Justin Bogner	61ba2e3996	InstrProf: An intrinsic and lowering for instrumentation based profiling Introduce the ``llvm.instrprof_increment`` intrinsic and the ``-instrprof`` pass. These provide the infrastructure for writing counters for profiling, as in clang's ``-fprofile-instr-generate``. The implementation of the instrprof pass is ported directly out of the CodeGenPGO classes in clang, and with the followup in clang that rips that code out to use these new intrinsics this ends up being NFC. Doing the instrumentation this way opens some doors in terms of improving the counter performance. For example, this will make it simple to experiment with alternate lowering strategies, and allows us to try handling profiling specially in some optimizations if we want to. Finally, this drastically simplifies the frontend and puts all of the lowering logic in one place. llvm-svn: 223672	2014-12-08 18:02:35 +00:00
NAKAMURA Takumi	6980404cfe	LLVMInstrumentation requires MC since r223532. llvm-svn: 223573	2014-12-06 02:22:11 +00:00
Duncan P. N. Exon Smith	b236211c4c	Utils: Style cleanups, NFC llvm-svn: 223556	2014-12-06 00:48:17 +00:00
Duncan P. N. Exon Smith	b13f7d2e36	Utils: Avoid RAUW on metadata in CloneFunction() llvm-svn: 223555	2014-12-06 00:48:13 +00:00
Kuba Brecka	1001bb533b	Recommit of r223513 and r223514. Reviewed at http://reviews.llvm.org/D6488 llvm-svn: 223532	2014-12-05 22:19:18 +00:00
Kuba Brecka	086e34bef8	Reverting r223513 and r223514. llvm-svn: 223520	2014-12-05 21:32:46 +00:00
Peter Collingbourne	0826e60748	[DFSAN][MIPS][LLVM] Defining ShadowPtrMask variable for MIPS64 Patch by Kumar Sukhani! corresponding compiler-rt patch: http://reviews.llvm.org/D6437 clang patch: http://reviews.llvm.org/D6147 Differential Revision: http://reviews.llvm.org/D6459 llvm-svn: 223516	2014-12-05 21:22:32 +00:00
Kuba Brecka	1e21378a37	AddressSanitizer - Don't instrument globals from cstring_literals sections. (llvm part) Reviewed at http://reviews.llvm.org/D6488 llvm-svn: 223513	2014-12-05 21:04:43 +00:00
Evgeniy Stepanov	d85ddee01d	[msan] Avoid extra origin address realignment. Do not realign origin address if the corresponding application address is at least 4-byte-aligned. Saves 2.5% code size in track-origins mode. llvm-svn: 223464	2014-12-05 14:34:03 +00:00
Simon Pilgrim	be24ab367b	[InstCombine] Minor optimization for bswap with binary ops Added instcombine optimizations for BSWAP with AND/OR/XOR ops: OP( BSWAP(x), BSWAP(y) ) -> BSWAP( OP(x, y) ) OP( BSWAP(x), CONSTANT ) -> BSWAP( OP(x, BSWAP(CONSTANT) ) ) Since it's just a one liner, I've also added BSWAP to the DAGCombiner equivalent as well: fold (OP (bswap x), (bswap y)) -> (bswap (OP x, y)) Refactored bswap-fold tests to use FileCheck instead of just checking that the bswaps had gone. Differential Revision: http://reviews.llvm.org/D6407 llvm-svn: 223349	2014-12-04 09:44:01 +00:00
Kostya Serebryany	543f3db572	[msan] allow -fsanitize-coverage=N together with -fsanitize=memory, llvm part llvm-svn: 223312	2014-12-03 23:28:26 +00:00
Matthias Braun	395a82f6cc	correct spelling, NFC llvm-svn: 223274	2014-12-03 22:10:39 +00:00
Matthias Braun	d34e4d2354	[SimplifyLibCalls] Improve double->float shrinking to consider constants This allows cases like float x; fmin(1.0, x); to be optimized to fminf(1.0f, x); rdar://19049359 Differential Revision: http://reviews.llvm.org/D6496 llvm-svn: 223270	2014-12-03 21:46:33 +00:00
Matthias Braun	892c923c46	[SimplifyLibCalls] Enable double to float shrinking for copysign rdar://19049359 Differential Revision: http://reviews.llvm.org/D6495 llvm-svn: 223269	2014-12-03 21:46:29 +00:00
Evgeniy Stepanov	2e5a1f1c9c	[msan] Add compile-time checks for missing origins. This change makes MemorySanitizer instrumentation a bit more strict about instructions that have no origin id assigned to them. This would have caught the bug that was fixed in r222918. This is re-commit of r222997, reverted in r223211, with 3 more missing origins added. llvm-svn: 223236	2014-12-03 14:15:53 +00:00
Erik Eckstein	d181752be0	InstCombine: simplify signed range checks Try to convert two compares of a signed range check into a single unsigned compare. Examples: (icmp sge x, 0) & (icmp slt x, n) --> icmp ult x, n (icmp slt x, 0) | (icmp sgt x, n) --> icmp ugt x, n llvm-svn: 223224	2014-12-03 10:39:15 +00:00
Nick Lewycky	a4acb44995	Revert r222997. The newly added compile-time checks are finding missing origins, testcase is being reduced and a PR will be posted shortly. llvm-svn: 223211	2014-12-03 05:47:00 +00:00
Duncan P. N. Exon Smith	a48bd07e5e	LoopVectorize: Remove unnecessary RAUW Remove an unnecessary `MDNode::replaceAllUsesWith()`. In the preceding line, `TheLoop->setLoopID()` visits all backedges and sets the new loop ID. This sufficiently updates the loop metadata. Metadata RAUW is going away as part of PR21532. llvm-svn: 223210	2014-12-03 05:41:20 +00:00
Tom Stellard	1f0dded057	StructurizeCFG: Use LoopInfo analysis for better loop detection We were assuming that each back-edge in a region represented a unique loop, which is not always the case. We need to use LoopInfo to correctly determine which back-edges are loops. llvm-svn: 223199	2014-12-03 04:28:32 +00:00
Nick Lewycky	2e8a6219fc	Emit the entry block first and the exit block second, then all the blocks in between afterwards. This is what gcc always does, and some out of tree tools depend on that. llvm-svn: 223193	2014-12-03 02:45:01 +00:00
Peter Collingbourne	51d2de7b9e	Prologue support Patch by Ben Gamari! This redefines the `prefix` attribute introduced previously and introduces a `prologue` attribute. There are three primary use cases that these attributes aim to serve: 1. Function prologue sigils 2. Function hot-patching: Enable the user to insert `nop` operations at the beginning of the function which can later be safely replaced with a call to some instrumentation facility 3. Runtime metadata: Allow a compiler to insert data for use by the runtime during execution. GHC is one example of a compiler that needs this functionality for its tables-next-to-code functionality. Previously `prefix` served cases (1) and (2) quite well by allowing the user to introduce arbitrary data at the entrypoint but before the function body. Case (3), however, was poorly handled by this approach as it required that prefix data was valid executable code. Here we redefine the notion of prefix data to instead be data which occurs immediately before the function entrypoint (i.e. the symbol address). Since prefix data now occurs before the function entrypoint, there is no need for the data to be valid code. The previous notion of prefix data now goes under the name "prologue data" to emphasize its duality with the function epilogue. The intention here is to handle cases (1) and (2) with prologue data and case (3) with prefix data. References: This idea arose out of discussions[1] with Reid Kleckner in response to a proposal to introduce the notion of symbol offsets to enable handling of case (3). [1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-May/073235.html Test Plan: testsuite Differential Revision: http://reviews.llvm.org/D6454 llvm-svn: 223189	2014-12-03 02:08:38 +00:00
Michael Zolotukhin	ea8327b80f	PR21302. Vectorize only bottom-tested loops. rdar://problem/18886083 llvm-svn: 223171	2014-12-02 22:59:06 +00:00
Philip Reames	1a1bdb22bf	[Statepoints 3/4] Statepoint infrastructure for garbage collection: SelectionDAGBuilder This is the third patch in a small series. It contains the CodeGen support for lowering the gc.statepoint intrinsic sequences (223078) to the STATEPOINT pseudo machine instruction (223085). The change also includes the set of helper routines and classes for working with gc.statepoints, gc.relocates, and gc.results since the lowering code uses them. With this change, gc.statepoints should be functionally complete. The documentation will follow in the fourth change, and there will likely be some cleanup changes, but interested parties can start experimenting now. I'm not particularly happy with the amount of code or complexity involved with the lowering step, but at least it's fairly well isolated. The statepoint lowering code is split into its own files and anyone not working on the statepoint support itself should be able to ignore it. During the lowering process, we currently spill aggressively to stack. This is not entirely ideal (and we have plans to do better), but it's functional, relatively straightforward, and matches closely the implementations of the patchpoint intrinsics. Most of the complexity comes from trying to keep relocated copies of values in the same stack slots across statepoints. Doing so avoids the insertion of pointless load and store instructions to reshuffle the stack. The current implementation isn't as effective as I'd like, but it is functional and 'good enough' for many common use cases. In the long term, I'd like to figure out how to integrate the statepoint lowering with the register allocator. In principle, we shouldn't need to eagerly spill at all. The register allocator should do any spilling required and the statepoint should simply record that fact. Depending on how challenging that turns out to be, we may invest in a smarter global stack slot assignment mechanism as a stop gap measure. Reviewed by: atrick, ributzka llvm-svn: 223137	2014-12-02 18:50:36 +00:00
Bruno Cardoso Lopes	15520db9ad	[SwitchLowering] Handle destinations on multiple phi instructions Follow up from r222926. Also handle multiple destinations from merged cases on multiple and subsequent phi instructions. rdar://problem/19106978 llvm-svn: 223135	2014-12-02 18:31:53 +00:00
Bruno Cardoso Lopes	d035fbb96f	[LICM] Avoid store sinking if no preheader is available Load instructions are inserted into loop preheaders when sinking stores and later removed if not used by the SSA updater. Avoid sinking if the loop has no preheader and avoid crashes. This fixes one more side effect of not handling indirectbr instructions properly on LoopSimplify. llvm-svn: 223119	2014-12-02 14:22:34 +00:00
Hans Wennborg	5bef5b522b	Revert r223049, r223050 and r223051 while investigating test failures. I didn't foresee affecting the Clang test suite :/ llvm-svn: 223054	2014-12-01 17:36:43 +00:00
Hans Wennborg	269ebb612e	SimplifyCFG: Omit range checks for switch lookup tables when default is unreachable They would get optimized away later, but we might as well not emit them. llvm-svn: 223051	2014-12-01 17:08:38 +00:00
Hans Wennborg	5a1e5c05d8	SimplifyCFG: don't remove unreachable default switch destinations An unreachable default destination can be exploited by other optimizations, and SDag lowering is now prepared to handle them efficiently. For example, branches to the unreachable destination will be optimized away, such as in the case of range checks for switch lookup tables. On 64-bit Linux, this reduces the size of a clang bootstrap by 80 kB (and Chromium by 30 kB). llvm-svn: 223050	2014-12-01 17:08:35 +00:00
Evgeniy Stepanov	a056ac8a98	[msan] Add compile-time checks for missing origins. This change makes MemorySanitizer instrumentation a bit more strict about instructions that have no origin id assigned to them. This would have caught the bug that was fixed in r222918. No functional change. llvm-svn: 222997	2014-12-01 09:53:51 +00:00
Yury Gribov	3ae427d811	[asan] Change dynamic alloca instrumentation to only consider allocas that are dominating all exits from function. Reviewed in http://reviews.llvm.org/D6412 llvm-svn: 222991	2014-12-01 08:47:58 +00:00
Duncan P. N. Exon Smith	910f05d181	DebugIR: Delete -debug-ir llvm-svn: 222945	2014-11-29 03:15:47 +00:00
Duncan P. N. Exon Smith	9bc81fbe92	Revert "Masked Vector Load and Store Intrinsics." This reverts commit r222632 (and follow-up r222636), which caused a host of LNT failures on an internal bot. I'll respond to the commit on the list with a reproduction of one of the failures. Conflicts: lib/Target/X86/X86TargetTransformInfo.cpp llvm-svn: 222936	2014-11-28 21:29:14 +00:00
David Majnemer	3d6f80b619	InstCombine: FoldOrOfICmps harder We may be in a situation where the icmps might not be near each other in a tree of or instructions. Try to dig out related compare instructions and see if they combine. N.B. This won't fire on deep trees of compares because rewriting the tree might end up creating a net increase of IR. We may have to resort to something more sophisticated if this is a real problem. llvm-svn: 222928	2014-11-28 19:58:29 +00:00
Bruno Cardoso Lopes	46d5bf2982	[LICM] Store sink and indirectbr instructions Loop simplify skips exit-block insertion when exits contain indirectbr instructions. This leads to an assertion in LICM when trying to sink stores out of non-dedicated loop exits containing indirectbr instructions. This patch fixes this issue by re-checking for dedicated exits in LICM prior to store sink attempts. Differential Revision: http://reviews.llvm.org/D6414 rdar://problem/18943047 llvm-svn: 222927	2014-11-28 19:47:46 +00:00
Bruno Cardoso Lopes	bc7ba2c766	[SwitchLowering] Handle multiple destinations on condensed case stmts Switch case statements with sequential values that branch to the same destination BB may often be handled together in a single new source BB. In this scenario we need to remove remaining incoming values from PHI instructions in the destination BB, so as to match the number of source branches. Differential Revision: http://reviews.llvm.org/D6415 rdar://problem/19040894 llvm-svn: 222926	2014-11-28 19:47:33 +00:00
Evgeniy Stepanov	a0b6899234	[msan] Fix origin propagation for select of floats. MSan does not assign origin for instrumentation temps (i.e. the ones that do not come from the application code), but "select" instrumentation erroneously tried to use one of those. https://code.google.com/p/memory-sanitizer/issues/detail?id=78 llvm-svn: 222918	2014-11-28 11:17:58 +00:00
Ankur Garg	876b891d51	Removed extra line from a comment to test first commit. NFC. llvm-svn: 222916	2014-11-28 10:38:18 +00:00
Erik Eckstein	0d86c7623f	reinstate r222872: Peephole optimization in switch table lookup: reuse the guarding table comparison if possible. Fixed missing dominance check. Original commit message: This optimization tries to reuse the generated compare instruction, if there is a comparison against the default value after the switch. Example: if (idx < tablesize) r = table[idx]; // table does not contain default_value else r = default_value; if (r != default_value) ... Is optimized to: cond = idx < tablesize; if (cond) r = table[idx]; else r = default_value; if (cond) ... Jump threading will then eliminate the second if(cond). llvm-svn: 222891	2014-11-27 15:13:14 +00:00
Evgeniy Stepanov	e402d9ef4c	[msan] Remove indirect call wrapping code. This functionality was only used in MSanDR, which is deprecated. llvm-svn: 222889	2014-11-27 14:54:02 +00:00
Erik Eckstein	2190cd9ffa	Revert "Peephole optimization in switch table lookup: reuse the guarding table comparison if possible." It is breaking the clang bootstrap. llvm-svn: 222877	2014-11-27 10:59:08 +00:00
Erik Eckstein	e73e308ab9	Peephole optimization in switch table lookup: reuse the guarding table comparison if possible. This optimization tries to reuse the generated compare instruction, if there is a comparison against the default value after the switch. Example: if (idx < tablesize) r = table[idx]; // table does not contain default_value else r = default_value; if (r != default_value) ... Is optimized to: cond = idx < tablesize; if (cond) r = table[idx]; else r = default_value; if (cond) ... Jump threading will then eliminate the second if(cond). llvm-svn: 222872	2014-11-27 08:33:51 +00:00
David Majnemer	40157d5c4d	InstCombine: Restore optimizations lost in r210006 This restores our ability to optimize: (X & C) == 0 ? X ^ C : X into X | C (X & C) != 0 ? X ^ C : X into X & ~C llvm-svn: 222871	2014-11-27 07:25:21 +00:00
David Majnemer	5468e86469	Revert "Added inst combine transforms for single bit tests from Chris's note" This reverts commit r210006, it miscompiled libapr which is used in who knows how many projects. A test has been added to ensure that we don't regress again. I'll work on a rewrite of what the optimization was trying to do later. llvm-svn: 222856	2014-11-26 23:00:38 +00:00
Chandler Carruth	816d26fe5e	[InstCombine] Change LLVM To canonicalize toward the value type being stored rather than the pointer type. This change is analogous to r220138 which changed the canonicalization for loads. The rationale is the same: memory does not have a type, operations (and thus the values they produce) have a type. We should match that type as closely as possible rather than reading some form of semantics into the pointer type. With this change, loads and stores should no longer be made with nonsensical types for the values that they load and store. This is particularly important when trying to match specific loaded and stored types in the process of doing other instcombines, which is what led me down this twisty maze of miscanonicalization. I've put quite some effort into looking through IR to find places where LLVM's optimizer was being unreasonably conservative in the face of mismatched load and store types, however it is possible (let's say, likely!) I have missed some. If you see regressions here, or from r220138, the likely cause is some part of LLVM failing to cope with load and store types differing. Test cases appreciated, it is important that we root all of these out of LLVM. llvm-svn: 222748	2014-11-25 10:09:51 +00:00
Chandler Carruth	1a3c2c414c	Revert r220349 to re-instate r220277 with a fix for PR21330 -- quite clearly only exactly equal width ptrtoint and inttoptr casts are no-op casts, it says so right there in the langref. Make the code agree. Original log from r220277: Teach the load analysis to allow finding available values which require inttoptr or ptrtoint cast provided there is datalayout available. Eventually, the datalayout can just be required but in practice it will always be there today. To go with the ability to expose available values requiring a ptrtoint or inttoptr cast, helpers are added to perform one of these three casts. These smarts are necessary to finish canonicalizing loads and stores to the operational type requirements without regressing fundamental combines. I've added some test cases. These should actually improve as the load combining and store combining improves, but they may fundamentally be highlighting some missing combines for select in addition to exercising the specific added logic to load analysis. llvm-svn: 222739	2014-11-25 08:20:27 +00:00
Matt Arsenault	238ff1ad1e	Bug 21610: Canonicalize min/max fcmp selects to use ordered comparisons llvm-svn: 222705	2014-11-24 23:15:18 +00:00
Kostya Serebryany	4cadd4afa0	[asan/coverage] change the way asan coverage instrumentation is done: instead of setting the guard to 1 in the generated code, pass the pointer to guard to __sanitizer_cov and set it there. No user-visible functionality change expected llvm-svn: 222675	2014-11-24 18:49:53 +00:00
David Majnemer	8e6f6a98b5	InstCombine: Don't create an unused instruction We would create an instruction but not insert it. Not inserting the unused instruction would lead to a verification failure. This fixes PR21653. llvm-svn: 222659	2014-11-24 16:41:13 +00:00
David Majnemer	b2a6e7458d	InstCombine: Don't assume DataLayout is always available We tried to get the result of DataLayout::getLargestLegalIntTypeSize but we didn't have a DataLayout. This resulted in opt crashing. This fixes PR21651. llvm-svn: 222645	2014-11-24 07:26:20 +00:00
Elena Demikhovsky	9e5089a938	Masked Vector Load and Store Intrinsics. Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores. Added SDNodes for masked operations and lowering patterns for X86 code generator. Examples: <16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask) declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask) Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch. http://reviews.llvm.org/D6191 llvm-svn: 222632	2014-11-23 08:07:43 +00:00
David Majnemer	fb3805576b	InstCombine: Propagate exact for (sdiv X, Pow2) -> (udiv X, Pow2) llvm-svn: 222625	2014-11-22 20:00:41 +00:00
David Majnemer	ec6e481bc5	InstCombine: Propagate exact for (sdiv X, Y) -> (udiv X, Y) llvm-svn: 222624	2014-11-22 20:00:38 +00:00
David Majnemer	fa4699e65f	InstCombine: Propagate exact for (sdiv -X, C) -> (sdiv X, -C) llvm-svn: 222623	2014-11-22 20:00:34 +00:00
Simon Pilgrim	a279410ede	Tidied up target triple OS detection. NFC Use Triple::isOS*() helper functions where possible. llvm-svn: 222622	2014-11-22 19:12:10 +00:00
David Majnemer	a3aeb15613	InstCombine: Propagate exact in (udiv (lshr X,C1),C2) -> (udiv x,C1<<C2) llvm-svn: 222620	2014-11-22 18:16:54 +00:00
David Majnemer	546f81064c	InstCombine: Propagate NSW/NUW for X*(1<<Y) -> X<<Y llvm-svn: 222613	2014-11-22 08:57:02 +00:00
David Majnemer	8279a7506d	InstCombine: Propagate NSW for -X * -Y -> X * Y llvm-svn: 222612	2014-11-22 07:25:19 +00:00
David Majnemer	83484fdb8b	InstCombine: Silence a parenthesis warning llvm-svn: 222609	2014-11-22 06:09:28 +00:00
David Majnemer	80c8f627db	InstCombine: Preserve nsw when folding X*(2^C) -> X << C llvm-svn: 222606	2014-11-22 04:52:55 +00:00
David Majnemer	fd4a6d2b7a	InstCombine: Preserve nsw/nuw for ((X << C2)*C1) -> (X * (C1 << C2)) llvm-svn: 222605	2014-11-22 04:52:52 +00:00
David Majnemer	027bc80928	InstCombine: Preserve nsw for (mul %V, -1) -> (sub 0, %V) llvm-svn: 222604	2014-11-22 04:52:38 +00:00
Gerolf Hoflehner	ec6217c929	[InstCombine] Re-commit of r218721 (Optimize icmp-select-icmp sequence) Fixes the self-host fail. Note that this commit activates dominator analysis in the combiner by default (like the original commit did). llvm-svn: 222590	2014-11-21 23:36:44 +00:00
Kostya Serebryany	60ef25bd54	[asan] remove old experimental code llvm-svn: 222586	2014-11-21 22:34:29 +00:00
Kostya Serebryany	ea2cb6f616	[asan] add statistic counter to dynamic alloca instrumentation llvm-svn: 222573	2014-11-21 21:25:18 +00:00
Roman Divacky	d2b9a1b890	Disable header duplication at -Oz in loop-rotate pass. llvm-svn: 222562	2014-11-21 19:53:24 +00:00
Yury Gribov	55441bb601	[asan] Add new hidden compile-time flag asan-instrument-allocas to sanitize variable-sized dynamic allocas. Patch by Max Ostapenko. Reviewed at http://reviews.llvm.org/D6055 llvm-svn: 222519	2014-11-21 10:29:50 +00:00
David Majnemer	1f44142e4e	This Reassociate change unintentionally slipped in r222499 llvm-svn: 222500	2014-11-21 02:37:38 +00:00
David Majnemer	c0a313b57c	SROA: The alloca type isn't a candidate promotion type for vectors The alloca's type is irrelevant, only those types which are used in a load or store of the exact size of the slice should be considered. This manifested as an assertion failure when we compared the various types: we had a size mismatch. This fixes PR21480. llvm-svn: 222499	2014-11-21 02:34:55 +00:00
Mehdi Amini	ffd0100618	SimplifyCFG: Refactor GatherConstantCompares() result in a struct Code seems cleaner and easier to understand this way This is basically r222416, after fixes for MSVC lack of standard support, and a few cleaning (got rid of a warning). Thanks Nakamura Takumi and Nico Weber for the MSVC fixes. llvm-svn: 222472	2014-11-20 22:40:25 +00:00

12258 Commits