Summary: Mimic parseTriple(); and exposes it to LTOModule.cpp
Reviewers: dexonsmith, rafael
Subscribers: llvm-commits
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 252442
This commit adds enums in LLVMBitCodes.h to improve readability and
maintainability. This is a follow-up to r252368 which was discussed
here:
http://reviews.llvm.org/D12923
llvm-svn: 252395
Summary:
The CLR's personality routine passes these in rdx/edx, not rax/eax.
Make getExceptionPointerRegister a virtual method parameterized by
personality function to allow making this distinction.
Similarly make getExceptionSelectorRegister a virtual method parameterized
by personality function, for symmetry.
Reviewers: pgavlin, majnemer, rnk
Subscribers: jyknight, dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D14344
llvm-svn: 252383
This reverts commit r252373, reapplying r252372 now that I've updated
clang-tools-extra. Original commit message follows.
ADT: Require explicit ilist iterator/pointer conversions
Disallow implicit conversions between ilist iterators and element
points. Explicit conversions still work of course.
This is the first step toward removing the undefined behaviour in
`ilist` and `iplist`:
http://lists.llvm.org/pipermail/llvm-dev/2015-October/091115.html
The motivation for removing the implicit iterators is that I came across
real bugs (that were *really* getting lucky). More details and some
brief discussion later in that thread:
http://lists.llvm.org/pipermail/llvm-dev/2015-October/091617.html
Note: if you have out-of-tree code, it should be fairly easy to revert
this patch downstream while you update your out-of-tree call sites.
Note that these conversions are occasionally latent bugs (that may
happen to "work" now, but only because of getting lucky with UB;
follow-ups will change your luck). When they are valid, I suggest using
`->getIterator()` to go from pointer to iterator, and `&*` to go from
iterator to pointer.
llvm-svn: 252380
Disallow implicit conversions between ilist iterators and element
points. Explicit conversions still work of course.
This is the first step toward removing the undefined behaviour in
`ilist` and `iplist`:
http://lists.llvm.org/pipermail/llvm-dev/2015-October/091115.html
The motivation for removing the implicit iterators is that I came across
real bugs (that were *really* getting lucky). More details and some
brief discussion later in that thread:
http://lists.llvm.org/pipermail/llvm-dev/2015-October/091617.html
Note: if you have out-of-tree code, it should be fairly easy to revert
this patch downstream while you update your out-of-tree call sites.
Note that these conversions are occasionally latent bugs (that may
happen to "work" now, but only because of getting lucky with UB;
follow-ups will change your luck). When they are valid, I suggest using
`->getIterator()` to go from pointer to iterator, and `&*` to go from
iterator to pointer.
llvm-svn: 252372
This marker prevents optimization passes from adding 'tail' or
'musttail' markers to a call. Is is used to prevent tail call
optimization from being performed on the call.
rdar://problem/22667622
Differential Revision: http://reviews.llvm.org/D12923
llvm-svn: 252368
Summary:
This change makes the `isImpliedCondition` interface similar to the rest
of the functions in ValueTracking (in that it takes a DataLayout,
AssumptionCache etc.). This is an NFC, intended to make a later diff
less noisy.
Depends on D14369
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14391
llvm-svn: 252333
Summary:
In this implementation, LiveIntervalAnalysis invents a few register
masks on basic block boundaries that preserve no registers. The nice
thing about this is that it prevents the prologue inserter from thinking
it needs to spill all XMM CSRs, because it doesn't see any explicit
physreg defs in the MI.
Reviewers: MatzeB, qcolombet, JosephTremoulet, majnemer
Subscribers: MatzeB, llvm-commits
Differential Revision: http://reviews.llvm.org/D14407
llvm-svn: 252318
We now create the .eh_frame section early, just like every other special
section.
This means that the special flags are visible in code that explicitly
asks for ".eh_frame".
llvm-svn: 252313
This attribute allows the compiler to assume that the function never recurses into itself, either directly or indirectly (transitively). This can be used among other things to demote global variables to locals.
llvm-svn: 252282
The bug: I missed adding break statements in the switch / case.
Original commit message:
[SCEV] Teach SCEV some axioms about non-wrapping arithmetic
Summary:
- A s< (A + C)<nsw> if C > 0
- A s<= (A + C)<nsw> if C >= 0
- (A + C)<nsw> s< A if C < 0
- (A + C)<nsw> s<= A if C <= 0
Right now `C` needs to be a constant, but we can later generalize it to
be a non-constant if needed.
Reviewers: atrick, hfinkel, reames, nlewycky
Subscribers: sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D13686
llvm-svn: 252236
Previously, subprograms contained a metadata reference to the function they
described. Because most clients need to get or set a subprogram for a given
function rather than the other way around, this created unneeded inefficiency.
For example, many passes needed to call the function llvm::makeSubprogramMap()
to build a mapping from functions to subprograms, and the IR linker needed to
fix up function references in a way that caused quadratic complexity in the IR
linking phase of LTO.
This change reverses the direction of the edge by storing the subprogram as
function-level metadata and removing DISubprogram's function field.
Since this is an IR change, a bitcode upgrade has been provided.
Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is
attached to the PR.
Differential Revision: http://reviews.llvm.org/D14265
llvm-svn: 252219
Also, remove an enum hack where enum values were used as indexes into an array.
We may want to make this a real class to allow pattern-based queries/customization (D13417).
llvm-svn: 252196
The needed lld matching changes to be submitted immediately next,
but this revision will cause lld failures with this alone which is expected.
This removes the eating of the error in Archive::Child::getSize() when the characters
in the size field in the archive header for the member is not a number. To do this we
have all of the needed methods return ErrorOr to push them up until we get out of lib.
Then the tools and can handle the error in whatever way is appropriate for that tool.
So the solution is to plumb all the ErrorOr stuff through everything that touches archives.
This include its iterators as one can create an Archive object but the first or any other
Child object may fail to be created due to a bad size field in its header.
Thanks to Lang Hames on the changes making child_iterator contain an
ErrorOr<Child> instead of a Child and the needed changes to ErrorOr.h to add
operator overloading for * and -> .
We don’t want to use llvm_unreachable() as it calls abort() and is produces a “crash”
and using report_fatal_error() to move the error checking will cause the program to
stop, neither of which are really correct in library code. There are still some uses of
these that should be cleaned up in this library code for other than the size field.
The test cases use archives with text files so one can see the non-digit character,
in this case a ‘%’, in the size field.
These changes will require corresponding changes to the lld project. That will be
committed immediately after this change. But this revision will cause lld failures
with this alone which is expected.
llvm-svn: 252192
With this change, instrumentation code and reader/write
code related to profile data structs are kept strictly
in-sync. THis will be extended to cfe and compile-rt
references as well.
Differential Revision: http://reviews.llvm.org/D13843
llvm-svn: 252113
1. A macro with argument: LLVM_PACKED(StructDefinition)
2. A pair of macros defining scope of region with packing:
LLVM_PACKED_START
struct A { ... };
struct B { ... };
LLVM_PACKED_END
Differential Revision: http://reviews.llvm.org/D14337
llvm-svn: 252099
Splits PrintLoopPass into a new-style pass and a PrintLoopPassWrapper,
much like we already do for PrintFunctionPass and PrintModulePass.
llvm-svn: 252085
This is part-1 of the patch that replaces all edge weights in MBB by
probabilities, which only adds new interfaces. No functional changes.
Differential revision: http://reviews.llvm.org/D13908
llvm-svn: 252083
Summary:
Data operands of a call or invoke consist of the call arguments, and
the bundle operands associated with the `call` (or `invoke`)
instruction. The motivation for this change is that we'd like to be
able to query "argument attributes" like `readonly` and `nocapture`
for bundle operands naturally.
This change also provides a conservative "implementation" for these
attributes for any bundle operand, and an extension point for future
work.
Reviewers: chandlerc, majnemer, reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14305
llvm-svn: 252077
We can often end up with conditional stores that cannot be speculated. They can come from fairly simple, idiomatic code:
if (c & flag1)
*a = x;
if (c & flag2)
*a = y;
...
There is no dominating or post-dominating store to a, so it is not legal to move the store unconditionally to the end of the sequence and cache the intermediate result in a register, as we would like to.
It is, however, legal to merge the stores together and do the store once:
tmp = undef;
if (c & flag1)
tmp = x;
if (c & flag2)
tmp = y;
if (c & flag1 || c & flag2)
*a = tmp;
The real power in this optimization is that it allows arbitrary length ladders such as these to be completely and trivially if-converted. The typical code I'd expect this to trigger on often uses binary-AND with constants as the condition (as in the above example), which means the ending condition can simply be truncated into a single binary-AND too: 'if (c & (flag1|flag2))'. As in the general case there are bitwise operators here, the ladder can often be optimized further too.
This optimization involves potentially increasing register pressure. Even in the simplest case, the lifetime of the first predicate is extended. This can be elided in some cases such as using binary-AND on constants, but not in the general case. Threading 'tmp' through all branches can also increase register pressure.
The optimization as in this patch is enabled by default but kept in a very conservative mode. It will only optimize if it thinks the resultant code should be if-convertable, and additionally if it can thread 'tmp' through at least one existing PHI, so it will only ever in the worst case create one more PHI and extend the lifetime of a predicate.
This doesn't trigger much in LNT, unfortunately, but it does trigger in a big way in a third party test suite.
llvm-svn: 252051
This was breaking the modules build and is being reverted while we reach consensus on the right way to solve this layering problem. This reverts commit r251785.
llvm-svn: 252040
Intended to make later changes simpler. Exposes
`getBundleOperandsStartIndex` and `getBundleOperandsEndIndex`, and uses
them for the computation in `getNumTotalBundleOperands`.
llvm-svn: 252037
Summary:
The goal of this pass is to perform store-to-load forwarding across the
backedge of a loop. E.g.:
for (i)
A[i + 1] = A[i] + B[i]
=>
T = A[0]
for (i)
T = T + B[i]
A[i + 1] = T
The pass relies on loop dependence analysis via LoopAccessAnalisys to
find opportunities of loop-carried dependences with a distance of one
between a store and a load. Since it's using LoopAccessAnalysis, it was
easy to also add support for versioning away may-aliasing intervening
stores that would otherwise prevent this transformation.
This optimization is also performed by Load-PRE in GVN without the
option of multi-versioning. As was discussed with Daniel Berlin in
http://reviews.llvm.org/D9548, this is inferior to a more loop-aware
solution applied here. Hopefully, we will be able to remove some
complexity from GVN/MemorySSA as a consequence.
In the long run, we may want to extend this pass (or create a new one if
there is little overlap) to also eliminate loop-indepedent redundant
loads and store that *require* versioning due to may-aliasing
intervening stores/loads. I have some motivating cases for store
elimination. My plan right now is to wait for MemorySSA to come online
first rather than using memdep for this.
The main motiviation for this pass is the 456.hmmer loop in SPECint2006
where after distributing the original loop and vectorizing the top part,
we are left with the critical path exposed in the bottom loop. Being
able to promote the memory dependence into a register depedence (even
though the HW does perform store-to-load fowarding as well) results in a
major gain (~20%). This gain also transfers over to x86: it's
around 8-10%.
Right now the pass is off by default and can be enabled
with -enable-loop-load-elim. On the LNT testsuite, there are two
performance changes (negative number -> improvement):
1. -28% in Polybench/linear-algebra/solvers/dynprog: the length of the
critical paths is reduced
2. +2% in Polybench/stencils/adi: Unfortunately, I couldn't reproduce this
outside of LNT
The pass is scheduled after the loop vectorizer (which is after loop
distribution). The rational is to try to reuse LAA state, rather than
recomputing it. The order between LV and LLE is not critical because
normally LV does not touch scalar st->ld forwarding cases where
vectorizing would inhibit the CPU's st->ld forwarding to kick in.
LoopLoadElimination requires LAA to provide the full set of dependences
(including forward dependences). LAA is known to omit loop-independent
dependences in certain situations. The big comment before
removeDependencesFromMultipleStores explains why this should not occur
for the cases that we're interested in.
Reviewers: dberlin, hfinkel
Subscribers: junbuml, dberlin, mssimpso, rengolin, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D13259
llvm-svn: 252017
Summary: Will be used by the LoopLoadElimination pass.
Reviewers: hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13258
llvm-svn: 252016
Summary:
The functions use LAI and MemoryDepChecker classes so they need to be
defined after those definitions outside of the Dependence class.
Will be used by the LoopLoadElimination pass.
Reviewers: hfinkel
Subscribers: rengolin, llvm-commits
Differential Revision: http://reviews.llvm.org/D13257
llvm-svn: 252015
A profile of an LTO link of Chrome revealed that we were spending some
~30-50% of execution time in the function Constant::getRelocationInfo(),
which is called from TargetLoweringObjectFile::getKindForGlobal() and in turn
from TargetMachine::getNameWithPrefix().
It turns out that we only need the result of getKindForGlobal() when
targeting Mach-O, so this change moves the relevant part of the logic to
TargetLoweringObjectFileMachO.
NFCI.
Differential Revision: http://reviews.llvm.org/D14168
llvm-svn: 252014
Introduce DIPrinter which takes care of rendering DILineInfo and
friends. This allows LLVMSymbolizer class to return a structured data
instead of plain std::strings.
llvm-svn: 251989
Summary:
We now collect all types of dependences including lexically forward
deps not just "interesting" ones.
Reviewers: hfinkel
Subscribers: rengolin, llvm-commits
Differential Revision: http://reviews.llvm.org/D13256
llvm-svn: 251985
Make printDILineInfo and friends responsible for just rendering the
contents of the structures, demangling should actually be performed
earlier, when we have the information about the originating
SymbolizableModule at hand.
llvm-svn: 251981
Summary:
When the dependence distance in zero then we have a loop-independent
dependence from the earlier to the later access.
No current client of LAA uses forward dependences so other than
potentially hitting the MaxDependences threshold earlier, this change
shouldn't affect anything right now.
This and the previous patch were tested together for compile-time
regression. None found in LNT/SPEC.
Reviewers: hfinkel
Subscribers: rengolin, llvm-commits
Differential Revision: http://reviews.llvm.org/D13255
llvm-svn: 251973
Bypassing LLVM for this has a number of benefits:
1) Laziness support becomes asm-syntax agnostic (previously lazy jitting didn't
work on Windows as the resolver block was in Darwin asm).
2) For cross-process JITs, it allows resolver blocks and trampolines to be
emitted directly in the target process, reducing cross process traffic.
3) It should be marginally faster.
llvm-svn: 251933
When push instructions are being used to pass function arguments on
the stack, and either EH or debugging are enabled, we need to generate
.cfi_adjust_cfa_offset directives appropriately. For (synch) EH, it is
enough for the CFA offset to be correct at every call site, while
for debugging we want to be correct after every push.
Darwin does not support this well, so don't use pushes whenever it
would be required.
Differential Revision: http://reviews.llvm.org/D13767
llvm-svn: 251904
ScheduleDAGInstrs doesn't behave differently before or after register
allocation. It was only used in a method of MachineSchedulerBase which
behaved differently in MachineScheduler/PostMachineScheduler. Change
this to let MachineScheduler/PostMachineScheduler just pass in a
parameter to that function.
The order of the LiveIntervals* and bool RemoveKillFlags paramters have
been switched to make out-of-tree code fail instead of unintentionally
passing a value intended for the IsPostRA flag to the (previously
following and default initialized) RemoveKillFlags.
Differential Revision: http://reviews.llvm.org/D14245
llvm-svn: 251883
This reverts commit r251837, due to a number of bot failures of the form:
/home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.obj/tools/llvm-link/Release+Asserts/llvm-link.o:llvm-link.cpp:function
loadIndex(llvm::LLVMContext&, llvm::Module const*): error: undefined
reference to
'llvm::object::FunctionIndexObjectFile::create(llvm::MemoryBufferRef,
llvm::LLVMContext&, llvm::Module const*, bool)'
/home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.obj/tools/llvm-link/Release+Asserts/llvm-link.o:llvm-link.cpp:function
loadIndex(llvm::LLVMContext&, llvm::Module const*): error: undefined
reference to 'llvm::object::FunctionIndexObjectFile::takeIndex()'
I'm not sure why these are happening - I added Object to the requred
libraries in tools/llvm-link/LLVMBuild.txt and the LLVM_LINK_COMPONENTS
in tools/llvm-link/CMakeLists.txt. Confirmed for my build that these
symbols come out of libLLVMObject.a. What am I missing?
llvm-svn: 251841
Summary:
Support for necessary linkage changes and symbol renaming during
ThinLTO function importing.
Also includes llvm-link support for manually importing functions
and associated llvm-link based tests.
Note that this does not include support for intelligently importing
metadata, which is currently imported duplicate times. That support will
be in the follow-on patch, and currently is ignored by the tests.
Reviewers: dexonsmith, joker.eph, davidxl
Subscribers: tobiasvk, tejohnson, llvm-commits
Differential Revision: http://reviews.llvm.org/D13515
llvm-svn: 251837
Summary:
SCEV Predicates represent conditions that typically cannot be derived from
static analysis, but can be used to reduce SCEV expressions to forms which are
usable for different optimizers.
ScalarEvolution now has the rewriteUsingPredicate method which can simplify a
SCEV expression using a SCEVPredicateSet. The normal workflow of a pass using
SCEVPredicates would be to hold a SCEVPredicateSet and every time assumptions
need to be made a new SCEV Predicate would be created and added to the set.
Each time after calling getSCEV, the user will call the rewriteUsingPredicate
method.
We add two types of predicates
SCEVPredicateSet - implements a set of predicates
SCEVEqualPredicate - tests for equality between two SCEV expressions
We use the SCEVEqualPredicate to re-implement stride versioning. Every time we
version a stride, we will add a SCEVEqualPredicate to the context.
Instead of adding specific stride checks, LoopVectorize now adds a more
generic SCEV check.
We only need to add support for this in the LoopVectorizer since this is the
only pass that will do stride versioning.
Reviewers: mzolotukhin, anemet, hfinkel, sanjoy
Subscribers: sanjoy, hfinkel, rengolin, jmolloy, llvm-commits
Differential Revision: http://reviews.llvm.org/D13595
llvm-svn: 251800
Summary:
The new function sys::path::user_cache_directory tries to discover
a directory suitable for cache storage for current system user.
On Windows and Darwin it returns a path to system-specific user cache directory.
On Linux it follows XDG Base Directory Specification, what is:
- use non-empty $XDG_CACHE_HOME env var,
- use $HOME/.cache.
Reviewers: chapuni, aaron.ballman, rafael
Subscribers: rafael, aaron.ballman, llvm-commits
Differential Revision: http://reviews.llvm.org/D13801
llvm-svn: 251784
1. Added a set of public interfaces in InstrProfRecord
class to access (read/write) value profile data.
2. Changed IndexedProfile reader and writer code to
use the newly defined interfaces and hide implementation
details.
3. Added a couple of unittests for value profiling:
- Test new interfaces to get and set value profile data
- Test value profile data merging with various scenarios.
No functional change is expected. The new interfaces will also
make it possible to change on-disk format of value prof data
to be more compact (to be submitted).
llvm-svn: 251771
This is a bit ugly, but has a few advantages:
* Archive is now easy to copy since there is no Archive -> Child -> Archive
loop.
* It makes it clear that we already checked for errors when finding the Child
data.
llvm-svn: 251750
Introduce LLVMSymbolizer::symbolizeInlinedCode() instead of switching
on PrintInlining option passed to the constructor. This will be needed
once we retrun structured data (instead of std::string) from
LLVMSymbolizer and move printing logic out.
llvm-svn: 251675
Summary:
This is mostly NFC. It is a first step in cleaning up LLVMSymbolize
library. It removes "ModuleInfo" class which bundles together ObjectFile
and its debug info context in favor of:
* abstract SymbolizableModule in public headers;
* SymbolizableObjectFile subclass in implementation.
Additionally, SymbolizableObjectFile is now created via factory, so we
can properly detect object parsing error at this stage instead of keeping
the broken half-parsed object. As a next step, we would be able to
propagate the error all the way back to the library user.
Further improvements might include:
* factoring out the logic of finding appropriate file with debug info
for a given object file, and caching all parsed object files into a
separate class [A].
* factoring out DILineInfo rendering [B].
This would make what is now a heavyweight "LLVMSymbolizer" a relatively
straightforward class, that calls into [A] to turn filepath into a
SymbolizableModule, delegates actual symbolization to concrete SymbolizableModule
implementation, and lets [C] render the result.
Reviewers: dblaikie, echristo, rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14099
llvm-svn: 251662
This was a layering violation in ScheduleDAGInstrs (and
MachineSchedulerBase) they both shouldn't know directly whether they are
used by the PostMachineScheduler or the MachineScheduler.
llvm-svn: 251608
These MachO file directives are used by linkers and other tools to provide
compatibility information, much like the existing .ios_version_min and
.macosx_version_min.
llvm-svn: 251569
It looks like this broke the stage 2 builder:
http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto/6989/
Original commit message:
AliasSetTracker does not need to convert the access mode to ModRefAccess if the
new visited UnknownInst has only 'REF' modrefinfo to existing pointers in the
sets.
Patch by Andrew Zhogin!
llvm-svn: 251562
This teaches SCEV to compute //max// backedge taken counts for loops
like
for (int i = k; i != 0; i >>>= 1)
whatever();
SCEV yet cannot represent the exact backedge count for these loops, and
this patch does not change that. This is really geared towards teaching
SCEV that loops like the above are *not* infinite.
llvm-svn: 251558
Add a couple of helper methods to make the primary
raw profile reader interface's implementation more
readable. It also hides more format details. This
patch has no functional change.
llvm-svn: 251546
When checking if an indirect global (a global with pointer type) is only assigned by allocation functions, first check if the global is itself initialized. If it is, it's not only assigned by allocation functions.
This fixes PR25309. Thanks to David Majnemer for reducing the test case!
llvm-svn: 251508
Change InstrProfReaderIndex from typedef into a wrapper
class with helper methods. This makes the index profile
reader code more readable. It also hides the implementation
detail of the index format and make it more flexible to allow
support different (or more than one) format in the future.
llvm-svn: 251491
This also lets us remove the versions of the functions that took a statically sized array as we can rely on ArrayRef implicit conversion now.
llvm-svn: 251490
Summary: This will allow a later patch to `JumpThreading` use this functionality.
Reviewers: reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13971
llvm-svn: 251488
r248010 changed the -debug output to use short ids, but did not
similarly modify the graph printer. Change to be consistent, for ease of
cross-reference.
llvm-svn: 251465
Use 10 bits to represent calling convention ID's instead of 13, and
update the bitcode compatibility tests accordingly. We now error-out in
the bitcode reader when we see bad calling conv ID's.
Thanks to rnk and dexonsmith for feedback!
Differential Revision: http://reviews.llvm.org/D13826
llvm-svn: 251452
AliasSetTracker does not need to convert the access mode to ModRefAccess if the
new visited UnknownInst has only 'REF' modrefinfo to existing pointers in the
sets.
Patch by Andrew Zhogin!
llvm-svn: 251451
When optimization is disabled, edge weights that are stored in MBB won't be used so that we don't have to store them. Currently, this is done by adding successors with default weight 0, and if all successors have default weights, the weight list will be empty. But that the weight list is empty doesn't mean disabled optimization (as is stated several times in MachineBasicBlock.cpp): it may also mean all successors just have default weights.
We should discourage using default weights when adding successors, because it is very easy for users to forget update the correct edge weights instead of using default ones (one exception is that the MBB only has one successor). In order to detect such usages, it is better to differentiate using default weights from the case when optimizations is disabled.
In this patch, a new interface addSuccessorWithoutWeight(MBB*) is created for when optimization is disabled. In this case, MBB will try to maintain an empty weight list, but it cannot guarantee this as for many uses of addSuccessor() whether optimization is disabled or not is not checked. But it can guarantee that if optimization is enabled, then the weight list always has the same size of the successor list.
Differential revision: http://reviews.llvm.org/D13963
llvm-svn: 251429
convert float to half with mask/maskz for the reg to reg version and mask for the reg to mem version (there is no maskz version for reg to mem).
Differential Revision: http://reviews.llvm.org/D14113
llvm-svn: 251409
GNU tools require elfiamcu to take up the entire OS field, so, e.g.
i?86-*-linux-elfiamcu is not considered a legal triple.
Make us compatible.
Differential Revision: http://reviews.llvm.org/D14081
llvm-svn: 251390
This avoid mentioning the table name an extra time and allows the lookup to be done directly in the ifs by relying on the bool conversion of the pointer.
While there make use of ArrayRef and std::find_if.
llvm-svn: 251382
LLVMSymbolizer::Options is mostly used in LLVMSymbolizer class anyway.
Let's keep their usage restricted to that class, especially given that
it's worth to move ModuleInfo to a different header, independent from
the symbolizer class.
llvm-svn: 251363
Summary: This idiom is used elsewhere in LLVM, but was overlooked here.
Reviewers: chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13628
llvm-svn: 251348
We should remove noalias along with dereference and dereference_or_null attributes
because statepoint could potentially touch the entire heap including noalias objects.
Differential Revision: http://reviews.llvm.org/D14032
llvm-svn: 251333
Processing bitcode from a different LLVM version can lead to
unexpected behavior. The LLVM project guarantees autoupdating
bitcode from a previous minor revision for the same major, but
can't make any promise when reading bitcode generated from a
either a non-released LLVM, a vendor toolchain, or a "future"
LLVM release. This patch aims at being more user-friendly and
allows a bitcode produce to emit an optional block at the
beginning of the bitcode that will contains an opaque string
intended to describe the bitcode producer information. The
bitcode reader will dump this information alongside any error it
reports.
The optional block also includes an "epoch" number, monotonically
increasing when incompatible changes are made to the bitcode. The
reader will reject bitcode whose epoch is different from the one
expected.
Differential Revision: http://reviews.llvm.org/D13666
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 251325
Android libc provides a fixed TLS slot for the unsafe stack pointer,
and this change implements direct access to that slot on AArch64 via
__builtin_thread_pointer() + offset.
This change also moves more code into TargetLowering and its
target-specific subclasses to get rid of target-specific codegen
in SafeStackPass.
This change does not touch the ARM backend because ARM lowers
builting_thread_pointer as aeabi_read_tp, which is not available
on Android.
The previous iteration of this change was reverted in r250461. This
version leaves the generic, compiler-rt based implementation in
SafeStack.cpp instead of moving it to TargetLoweringBase in order to
allow testing without a TargetMachine.
llvm-svn: 251324
We were previously overflowing a 32-bit multiply operation when emitting large
(>512MB) bitcode files, resulting in corrupted bitcode. Fix by extending
one of the operands to 64 bits.
There are a few other 32-bit integer types in this code that seem like they
also ought to be extended to 64 bits; this will be done separately.
llvm-svn: 251323
In PIC mode we were previously computing global variable addresses (or GOT
entry addresses) by adding the PC, the PC-relative GOT displacement and
the GOT-relative symbol/GOT entry displacement. Because the latter two
displacements are fixed, we ended up performing one more addition than
necessary.
This change causes us to compute addresses using a single PC-relative
displacement, resulting in a shorter code sequence. This reduces code size
by about 4% in a recent build of Chromium for Android.
As a result of this change we no longer need to compute the GOT base address
in the ARM backend, which allows us to remove the Global Base Reg pass and
SDAG lowering for the GOT.
We also now no longer use the GOT when addressing a symbol which is known
to be defined in the same linkage unit. Specifically, the symbol must have
either hidden visibility or a strong definition in the current module in
order to not use the the GOT.
This is a change from the previous behaviour where we would use the GOT to
address externally visible symbols defined in the same module. I think the
only cases where this could matter are cases involving symbol interposition,
but we don't really support that well anyway.
Differential Revision: http://reviews.llvm.org/D13650
llvm-svn: 251322
Summary:
Replace (const SCEVAddRecExpr *) with cast<SCEVAddRecExpr>.
Rename SCEVApplyRewriter to SCEVLoopAddRecRewriter (which is a more
appropriate name) since the description is "takes a scalar evolution
expression and applies the Map (Loop -> SCEV) to all AddRecExprs."
Subscribers: llvm-commits, sanjoy
Differential Revision: http://reviews.llvm.org/D14065
llvm-svn: 251292
Summary:
Add a SCEVRewriteVisitor class which contains the common
visiting patterns used when rewriting SCEVs.
SCEVParameterRewriter and SCEVApplyRewriter now inherit
from SCEVRewriteVisitor (and are therefore much simpler).
Reviewers: anemet, mzolotukhin, sanjoy
Subscribers: rengolin, llvm-commits, sanjoy
Differential Revision: http://reviews.llvm.org/D13242
llvm-svn: 251283
the module pointer type passed in by the user.
The previous ownership scheme, where the user pointer was always moved into a
std::shared_ptr, breaks if the user passes in a raw pointer.
Discovered while working on the Orc C API, which should be landing shortly.
I expect to include a test-case with that.
llvm-svn: 251273
Instead of playing around with dominance to verify if the possible expansion of
a scop region is indeed a single entry single exit region, we now distinguish
two cases. In case we only append a basic block, all edges entering this basic
block need to have come from within the region that is expanded. In case we join
two regions, the source basic blocks of the edges that end at the entry node of
the region that is appended most be part of either the original region or the
region that is appended.
This change will be tested through Polly.
This fixes llvm.org/PR25242
llvm-svn: 251267
When the target does not support these intrinsics they should be converted to a chain of scalar load or store operations.
If the mask is not constant, the scalarizer will build a chain of conditional basic blocks.
I added isLegalMaskedGather() isLegalMaskedScatter() APIs.
Differential Revision: http://reviews.llvm.org/D13722
llvm-svn: 251237
When using the MCU psABI, compiler-generated library calls should pass
some parameters in-register. However, since inreg marking for x86 is currently
done by the front end, it will not be applied to backend-generated calls.
This is a workaround for PR3997, which describes a similar issue for -mregparm.
Differential Revision: http://reviews.llvm.org/D13977
llvm-svn: 251223
This adds support for the i?86-*-elfiamcu triple, which indicates the IAMCU psABI is used.
Differential Revision: http://reviews.llvm.org/D13977
llvm-svn: 251222
I think it's fine to keep this fields around in terms of overhead,
I wasn't able to measure any substantial regression while running the
test suite, but, in case this causes some regression I'm ready to revert
and work on an alternative solution.
This was tested building with clang/gcc both in Debug and Release mode
and passes the test-suite.
llvm-svn: 251209
The loop idiom creating a ConstantRange is repeated twice in the
codebase, time to give it a name and a home.
The loop is also repeated in `rangeMetadataExcludesValue`, but using
`getConstantRangeFromMetadata` there would not be an NFC -- the range
returned by `getConstantRangeFromMetadata` may contain a value that none
of the subranges did.
llvm-svn: 251180
In this mode it just tries to tail merge the strings without imposing any other
format constrains. It will not, for example, add a null byte between them.
Also add support for keeping a tentative size and offset if we decide to
not optimize after all.
This will be used shortly in lld for merging SHF_STRINGS sections.
llvm-svn: 251153
Also adds a 'trivial' ELF file. This was generated by assembling
and linking a file with the symbol main which contains a single
return instruction.
llvm-svn: 251096
This patch converts the remaining references to literal
strings for names of profile runtime entites (such as
profile runtime hook, runtime hook use function, profile
init method, register function etc).
Also added documentation for all the new interfaces.
llvm-svn: 251093
In r251064 I removed a logically unreachable call to `redoLoop`, and
now there aren't any callers of this API at all. Remove the needless
complexity.
llvm-svn: 251067
The insertLoop() API is only used to add new loops, and has confusing
ownership semantics. Simplify it by replacing it with addLoop().
llvm-svn: 251064
This is a clean up patch that defines instr prof section and variable
name prefixes in a common header with access helper functions.
clang FE change will be done as a follow up once this patch is in.
Differential Revision: http://reviews.llvm.org/D13919
llvm-svn: 251058
Summary:
An unsigned comparision is equivalent to is corresponding signed version
if both the operands being compared are positive. Teach SCEV to use
this fact when profitable.
Reviewers: atrick, hfinkel, reames, nlewycky
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13687
llvm-svn: 251051
Summary:
- A s< (A + C)<nsw> if C > 0
- A s<= (A + C)<nsw> if C >= 0
- (A + C)<nsw> s< A if C < 0
- (A + C)<nsw> s<= A if C <= 0
Right now `C` needs to be a constant, but we can later generalize it to
be a non-constant if needed.
Reviewers: atrick, hfinkel, reames, nlewycky
Subscribers: sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D13686
llvm-svn: 251050
The array handling CondCodes only allocated 2 bits to describe the
desired action for each type. The new addition of a "LibCall" option
overflowed this and caused corruption for Custom actions.
No in-tree targets have a Custom CondCodeAction, so unfortunately it
can't be tested.
llvm-svn: 251033
isKnownNonEqual(A, B) returns true if it can be determined that A != B.
At the moment it only knows two facts, that a non-wrapping add of nonzero to a value cannot be that value:
A + B != A [where B != 0, addition is nsw or nuw]
and that contradictory known bits imply two values are not equal.
This patch also hooks this up to InstSimplify; InstSimplify had a peephole for the first fact but not the second so this teaches InstSimplify a new trick too (alas no measured performance impact!)
llvm-svn: 251012
Summary: This will be used in a future change to ScalarEvolution.
Reviewers: hfinkel, reames, nlewycky
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13612
llvm-svn: 250975
Summary:
If a `CallSite` has operand bundles, then do not peek into the called
function to get a more precise `ModRef` answer.
This is tested using `argmemonly`, `-basicaa` and `-gvn`; but the
functionality is not specific to any of these.
Depends on D13961
Reviewers: reames, chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13962
llvm-svn: 250974
Summary:
This makes attribute accessors on `CallInst` and `InvokeInst` do the
(conservatively) right thing. This essentially involves, in some
cases, *not* falling back querying the attributes on the called
`llvm::Function` when operand bundles are present.
Attributes locally present on the `CallInst` or `InvokeInst` will still
override operand bundle semantics. The LangRef has been amended to
reflect this. Note: this change does not do anything prevent
`-function-attrs` from inferring `CallSite` local attributes after
inspecting the called function -- that will be done as a separate
change.
I've used `-adce` and `-early-cse` to test these changes. There is
nothing special about these passes (and they did not require any
changes) except that they seemed be the easiest way to write the tests.
This change does not add deal with `argmemonly`. That's a later change
because alias analysis requires a related fix before `argmemonly` can be
tested.
Reviewers: reames, chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13961
llvm-svn: 250973
in the size field in the archive header for the member is not a number. To do this we
have all of the needed methods return ErrorOr to push them up until we get out of lib.
Then the tools and can handle the error in whatever way is appropriate for that tool.
So the solution is to plumb all the ErrorOr stuff through everything that touches archives.
This include its iterators as one can create an Archive object but the first or any other
Child object may fail to be created due to a bad size field in its header.
Thanks to Lang Hames on the changes making child_iterator contain an
ErrorOr<Child> instead of a Child and the needed changes to ErrorOr.h to add
operator overloading for * and -> .
We don’t want to use llvm_unreachable() as it calls abort() and is produces a “crash”
and using report_fatal_error() to move the error checking will cause the program to
stop, neither of which are really correct in library code. There are still some uses of
these that should be cleaned up in this library code for other than the size field.
Also corrected the code where the size gets us to the “at the end of the archive”
which is OK but past the end of the archive will return object_error::parse_failed now.
The test cases use archives with text files so one can see the non-digit character,
in this case a ‘%’, in the size field.
llvm-svn: 250906
"external" AA wrapper pass.
This is a generic hook that can be used to thread custom code into the
primary AAResultsWrapperPass for the legacy pass manager in order to
allow it to merge external AA results into the AA results it is
building. It does this by threading in a raw callback and so it is
*very* powerful and should serve almost any use case I have come up with
for extending the set of alias analyses used. The only thing not well
supported here is using a *different order* of alias analyses. That form
of extension *is* supportable with the new pass manager, and I can make
the callback structure here more elaborate to support it in the legacy
pass manager if this is a critical use case that people are already
depending on, but the only use cases I have heard of thus far should be
reasonably satisfied by this simpler extension mechanism.
It is hard to test this using normal facilities (the built-in AAs don't
use this for obvious reasons) so I've written a fairly extensive set of
custom passes in the alias analysis unit test that should be an
excellent test case because it models the out-of-tree users: it adds
a totally custom AA to the system. This should also serve as
a reasonably good example and guide for out-of-tree users to follow in
order to rig up their existing alias analyses.
No support in opt for commandline control is provided here however. I'm
really unhappy with the kind of contortions that would be required to
support that. It would fully re-introduce the analysis group
self-recursion kind of patterns. =/
I've heard from out-of-tree users that this will unblock their use cases
with extending AAs on top of the new infrastructure and let us retain
the new analysis-group-free-world.
Differential Revision: http://reviews.llvm.org/D13418
llvm-svn: 250894
This reverts commit r250239.
It seems unwanted changes got committed here, and part of
the patch does not seem correct.
For instance RoundUpToAlignment() is called without its returned
value actually used.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 250882
Summary:
TargetLoweringBase::Expand is defined as "Try to expand this to other ops,
otherwise use a libcall." For ISD::UDIV and ISD::SDIV, the choice between
the two possibilities was defined in a rather convoluted way:
- if DIVREM is legal, expand to DIVREM
- if DIVREM has a custom lowering, expand to DIVREM
- if DIVREM libcall is defined and a remainder from the same division is
computed elsewhere, expand to a DIVREM libcall
- else, expand to a DIV libcall
This had the undesirable effect that if both DIV and DIVREM are implemented
as libcalls, then ISD::UDIV and ISD::SDIV are expanded to the heavier DIVREM
libcall, even when the remainder isn't used.
The new code adds a new LegalizeAction, TargetLoweringBase::LibCall, so that
backends can directly control whether they prefer an expansion or a conversion
to a libcall. This makes the generic lowering code even more generic,
allowing its reuse in a wider range of target-specific configurations.
The useful effect is that ARM backend will now generate a call
to __aeabi_{i,u}div rather than __aeabi_{i,u}divmod in cases where
it doesn't need the remainder. There's no functional change outside
the ARM backend.
Reviewers: t.p.northover, rengolin
Subscribers: t.p.northover, llvm-commits, aemerson
Differential Revision: http://reviews.llvm.org/D13862
llvm-svn: 250826
The mask value type for maskload/maskstore GCC builtins is never a vector of
packed floats/doubles.
This patch fixes the following issues:
1. The mask argument for builtin_ia32_maskloadpd and builtin_ia32_maskstorepd
should be of type llvm_v2i64_ty and not llvm_v2f64_ty.
2. The mask argument for builtin_ia32_maskloadpd256 and
builtin_ia32_maskstorepd256 should be of type llvm_v4i64_ty and not
llvm_v4f64_ty.
3. The mask argument for builtin_ia32_maskloadps and builtin_ia32_maskstoreps
should be of type llvm_v4i32_ty and not llvm_v4f32_ty.
4. The mask argument for builtin_ia32_maskloadps256 and
builtin_ia32_maskstoreps256 should be of type llvm_v8i32_ty and not
llvm_v8f32_ty.
Differential Revision: http://reviews.llvm.org/D13776
llvm-svn: 250817
As usual, this is a polymorphic hierarchy without polymorphic ownership,
so simply make the dtor protected non-virtual, protected default copy
ctor/assign, and make derived classes final. The derived classes will
pick up correct default public copy ops (and dtor) implicitly.
(wish I could add -Wdeprecated to the build, but last time I tried it
triggered on some system headers I still need to look into/figure out)
llvm-svn: 250747
Besides the usual, I finally added an overload to
`BasicBlock::splitBasicBlock()` that accepts an `Instruction*` instead
of `BasicBlock::iterator`. Someone can go back and remove this overload
later (after updating the callers I'm going to skip going forward), but
the most common call seems to be
`BB->splitBasicBlock(BB->getTerminator(), ...)` and I'm not sure it's
better to add `->getIterator()` to every one than have the overload.
It's pretty hard to get the usage wrong.
llvm-svn: 250745
memory, rather than representing the stubs in IR. Update the CompileOnDemand
layer to use this functionality.
Directly emitting stubs is much cheaper than building them in IR and codegen'ing
them (see below). It also plays well with remote JITing - stubs can be emitted
directly in the target process, rather than having to send them over the wire.
The downsides are:
(1) Care must be taken when resolving symbols, as stub symbols are held in a
separate symbol table. This is only a problem for layer writers and other
people using this API directly. The CompileOnDemand layer hides this detail.
(2) Aliases of function stubs can't be symbolic any more (since there's no
symbol definition in IR), but must be converted into a constant pointer
expression. This means that modules containing aliases of stubs cannot be
cached. In practice this is unlikely to be a problem: There's no benefit to
caching such a module anyway.
On balance I think the extra performance is more than worth the trade-offs: In a
simple stress test with 10000 dummy functions requiring stubs and a single
executed "hello world" main function, directly emitting stubs reduced user time
for JITing / executing by over 90% (1.5s for IR stubs vs 0.1s for direct
emission).
llvm-svn: 250712
Originally I planned to use the same interface for masked gather/scatter and set isConsecutive to "false" in this case.
Now I'm implementing masked gather/scatter and see that the interface is inconvenient. I want to add interfaces isLegalMaskedGather() / isLegalMaskedScatter() instead of using the "Consecutive" parameter in the existing interfaces.
Differential Revision: http://reviews.llvm.org/D13850
llvm-svn: 250686
Sometimes it is more natural to use a ArrayRef<uint8_t> than a StringRef to
represent a range of bytes that is not, semantically, a string.
This will be used in lld in a sec.
llvm-svn: 250658