Summary:
extract_vector_elt can cause an implicit any_ext if the types don't
match. When processing the following pattern:
(and (extract_vector_elt (load ([non_ext|any_ext|zero_ext] V))), c)
DAGCombine was ignoring the possible extend, and sometimes removing
the AND even though it was required to maintain some of the bits
in the result to 0, resulting in a miscompile.
This change fixes the issue by limiting the transformation only to
cases where the extract_vector_elt doesn't perform the implicit
extend.
Reviewers: t.p.northover, jmolloy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18247
llvm-svn: 263935
Summary:
The old address space inference pass (NVPTXFavorNonGenericAddrSpaces) is unable
to convert the address space of a pointer induction variable. This patch adds a
new pass called NVPTXInferAddressSpaces that overcomes that limitation using a
fixed-point data-flow analysis (see the file header comments for details).
The new pass is experimental and not enabled by default. Users can turn
it on by setting the -nvptx-use-infer-addrspace flag of llc.
Reviewers: jholewinski, tra, jlebar
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D17965
llvm-svn: 263916
Improve computeZeroableShuffleElements to be able to peek through bitcasts to extract zero/undef values from BUILD_VECTOR nodes of different element sizes to the shuffle mask.
Differential Revision: http://reviews.llvm.org/D14261
llvm-svn: 263906
Summary: Also expose getters and setters in the C API, so that the change can be tested.
Reviewers: nhaehnle, axw, joker.eph
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18260
From: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
llvm-svn: 263886
The sinpi/cospi can be replaced with sincospi to remove unnecessary
computations. However, we need to make sure that the calls are within
the same function!
This fixes PR26993.
llvm-svn: 263875
MDString are uniqued in the Context on creation, hashing the
pointer is less expensive than hashing the String itself.
Reviewers: dexonsmith
Differential Revision: http://reviews.llvm.org/D16560
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263867
Summary:
This patch changes the computation of the hash key for DISubprogram to
be computed on a small subset of the fields. The hash is computed a
lot faster, but there might be more collision in the table.
However by carefully selecting the fields, colisions should be rare.
Using `opt` to load the IR for FastISelEmitter.cpp.o, with this patch:
- DISubprogram::getImpl() goes from 28ms to 15ms.
- DICompositeType::getImpl() goes from 6ms to 2ms
- DIDerivedType::getImpl() goes from 18 to 12ms
Reviewers: dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16571
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263866
Summary:
ThinLTO is relying on linkInModule to import selected function.
However a lot of "magic" was hidden in linkInModule and the IRMover,
who would rename and promote global variables on the fly.
This is moving to an approach where the steps are decoupled and the
client is reponsible to specify the list of globals to import.
As a consequence some test are changed because they were relying on
the previous behavior which was importing the definition of *every*
single global without control on the client side.
Now the burden is on the client to decide if a global has to be imported
or not.
Reviewers: tejohnson
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D18122
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263863
On Rafael's suggestion!
(also fix a discrepancy between this error message format and the others)
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263860
We need to be careful on which registers can be explicitly handled
via copies. Prologue, Epilogue use physical registers and if one belongs
to the set of CSRsViaCopy, it will no longer be CSRed, since PEI overwrites
it after the explicit copies.
llvm-svn: 263857
Avoid modifying other modules in `AArch64PromoteConstant` when the
constant is `ConstantData` (a horrible accident, I'm sure, caught by an
experimental follow-up to r261464).
Previously, this walked through all the users of a constant, but that
reaches into other modules when the constant doesn't depend transitively
on a `GlobalValue`! Since we're walking instructions anyway, just
modify the instructions we actually see.
As a drive-by, instead of storing `Use` and getting the instructions
again via `Use::getUser()` (which is not a constantant time lookup),
store `std::pair<Instruction, unsigned>`. Besides being cheaper, this
makes it easier to drop use-lists form `ConstantData` in the future.
(I threw this in because I was touching all the code anyway.)
Because the patch completely changes the traversal logic, it looks
like a rewrite of the pass, but the core logic is all the same (or
should be, minus the out-of-module changes). In other words, there
should be NFC as long as the LLVMContext only has a single Module.
I didn't think of a good way to test this, but I hope to submit a patch
eventually that makes walking these use-lists illegal/impossible.
llvm-svn: 263853
While not strictly necessary, since we don't support large integer
types, this avoids bugs due to silent truncation from uint64_t to a
32-bit unsigned (e.g. DL.isLegalInteger(DL.getTypeSizeInBits(Ty) )
This fixes PR26972.
Differential Revision: http://reviews.llvm.org/D18258
llvm-svn: 263850
* Renamed to be camel case, consistent with other docs.
* Fixed non-ascii characters (this is what I get for writing docs on an iPad).
llvm-svn: 263840
The loop on IVOperand's incoming values assumes IVOperand to be an
induction variable on the loop over which `S Pred X` is invariant;
otherwise loop invariant incoming values to IVOperand are not guaranteed
to dominate the comparision.
This fixes PR26973.
llvm-svn: 263827
In the <DisplayString> of PointerIntPair , I cast the pointer to the actual type, so VS can leverage it while visualizing, not unlike the recent change to PointerUnion visualization.
In the expansion, the current code is casting to the incorrect type (wrong number of stars), so I fixed that as well.
llvm-svn: 263821
This patch adds unscaled loads and sign-extend loads to the TII
getMemOpBaseRegImmOfs API, which is used to control clustering in the MI
scheduler. This is done to create more opportunities for load pairing. I've
also added the scaled LDRSWui instruction, which was missing from the scaled
instructions. Finally, I've added support in shouldClusterLoads for clustering
adjacent sext and zext loads that too can be paired by the load/store optimizer.
Differential Revision: http://reviews.llvm.org/D18048
llvm-svn: 263819
Summary:
These dependencies would be used in the future to reduce the number
of instrumented blocks(http://reviews.llvm.org/rL262103)
This is submitted as a separate CL because of previous problems with
ARM.
Subscribers: aemerson
Differential Revision: http://reviews.llvm.org/D18227
llvm-svn: 263797
Summary:
Allow the selection of BUFFER_LOAD_FORMAT_x and _XY. Do this now before
the frontend patches land in Mesa. Eventually, we may want to automatically
reduce the size of loads at the LLVM IR level, which requires such overloads,
and in some cases Mesa can generate them directly.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D18255
llvm-svn: 263792
Summary:
These intrinsics expose the BUFFER_ATOMIC_* instructions and will be used
by Mesa to implement atomics with buffer semantics. The intrinsic interface
matches that of buffer.load.format and buffer.store.format, except that the
GLC bit is not exposed (it is automatically deduced based on whether the
return value is used).
The change of hasSideEffects is required for TableGen to accept the pattern
that matches the intrinsic.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm, rivanvx, llvm-commits
Differential Revision: http://reviews.llvm.org/D18151
llvm-svn: 263791
Summary:
We cannot easily deduce that an offset is in an SGPR, but the Mesa frontend
cannot easily make use of an explicit soffset parameter either. Furthermore,
it is likely that in the future, LLVM will be in a better position than the
frontend to choose an SGPR offset if possible.
Since there aren't any frontend uses of these intrinsics in upstream
repositories yet, I would like to take this opportunity to change the
intrinsic signatures to a single offset parameter, which is then selected
to immediate offsets or voffsets using a ComplexPattern.
Reviewers: arsenm, tstellarAMD, mareko
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D18218
llvm-svn: 263790
Now that the resolved path cache stores the StringRef's, its
best to just always cache the results, even when realpath isn't
used. This way we'll still avoid the StringMap hashing and lookup.
This also conveniently reorganises this code in a way I need for
a future patch.
llvm-svn: 263777
ResolvedPaths was storing std::string's as a cache. We would then take those strings and look them up in the internString pool to get a unique StringRef for each path.
This patch changes ResolvedPaths to store the StringRef pointing in to the internString pool itself. This way, when getResolvedPath returns a string, we know we have the StringRef we would find in the pool anyway. We can avoid the duplicate memory of the std::string's, and also the time from the lookup.
Unfortunately my profiles show no runtime change here, but it should still save memory allocations which is nice.
Reviewed by Frederic Riss.
Differential Revision: http://reviews.llvm.org/D18259
llvm-svn: 263774
Summary:
It can hurt performance to prefetch ahead too much. Be conservative for
now and don't prefetch ahead more than 3 iterations on Cyclone.
Reviewers: hfinkel
Subscribers: llvm-commits, mzolotukhin
Differential Revision: http://reviews.llvm.org/D17949
llvm-svn: 263772
Summary:
And use this TTI for Cyclone. As it was explained in the original RFC
(http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758), the HW
prefetcher work up to 2KB strides.
I am also adding tests for this and the previous change (D17943):
* Cyclone prefetching accesses with a large stride
* Cyclone not prefetching accesses with a small stride
* Generic Aarch64 subtarget not prefetching either
Reviewers: hfinkel
Subscribers: aemerson, rengolin, llvm-commits, mzolotukhin
Differential Revision: http://reviews.llvm.org/D17945
llvm-svn: 263771
Summary:
This wires up the pass for Cyclone but keeps it off for now because we
need a few more TTIs.
The getPrefetchMinStride value is not very well tuned right now but it
works well with CFP2006/433.milc which motivated this.
Tests will be added as part of the upcoming large-stride prefetching
patch.
Reviewers: t.p.northover
Subscribers: llvm-commits, aemerson, hfinkel, rengolin
Differential Revision: http://reviews.llvm.org/D17943
llvm-svn: 263770
A virtual index of -1u indicates that the subprogram's virtual index is
unrepresentable (for example, when using the relative vtable ABI), so do
not emit a DW_AT_vtable_elem_location attribute for it.
Differential Revision: http://reviews.llvm.org/D18236
llvm-svn: 263765
MSVC as usual:
C:\Buildbot\Slave\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.src\include\llvm/ADT/STLExtras.h(120):
error C2100: illegal indirection
C:\Buildbot\Slave\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.src\include\llvm/IR/Instructions.h(3966):
note: see reference to class template instantiation
'llvm::mapped_iterator<llvm::User::op_iterator,llvm::CatchSwitchInst::DerefFnTy>'
being compiled
This reverts commit e091dd63f1f34e043748e28ad160d3bc17731168.
llvm-svn: 263760
For fcmp, major concern about the following 6 cases is NaN result. The
comparison result consists of 4 bits, indicating lt, eq, gt and un (unordered),
only one of which will be set. The result is generated by fcmpu
instruction. However, bc instruction only inspects one of the first 3
bits, so when un is set, bc instruction may jump to to an undesired
place.
More specifically, if we expect an unordered comparison and un is set, we
expect to always go to true branch; in such case UEQ, UGT and ULT still
give false, which are undesired; but UNE, UGE, ULE happen to give true,
since they are tested by inspecting !eq, !lt, !gt, respectively.
Similarly, for ordered comparison, when un is set, we always expect the
result to be false. In such case OGT, OLT and OEQ is good, since they are
actually testing GT, LT, and EQ respectively, which are false. OGE, OLE
and ONE are tested through !lt, !gt and !eq, and these are true.
llvm-svn: 263753
idiom.
Most LLVM tool code exits immediately when an error is encountered and prints an
error message to stderr. The ExitOnError class supports this by providing two
call operators - one for Errors, and one for Expected<T>s. Calls to code that
can return Errors (or Expected<T>s) can use these calls to bail out on error,
and otherwise continue as if the operation had succeeded. E.g.
Error foo();
Expected<int> bar();
int main(int argc, char *argv[]) {
ExitOnError ExitOnErr;
ExitOnErr.setBanner(std::string("Error in ") + argv[0] + ":");
// Exit if foo returns an error. No need to manually check error return.
ExitOnErr(foo());
// Exit if bar returns an error, otherwise unwrap the contained int and
// continue.
int X = ExitOnErr(bar());
// ...
return 0;
}
llvm-svn: 263749
Summary:
Use the new LoopVersioning facility (D16712) to add noalias metadata in
the vector loop if we versioned with memchecks. This can enable some
optimization opportunities further down the pipeline (see the included
test or the benchmark improvement quoted in D16712).
The test also covers the bug I had in the initial version in D16712.
The vectorizer did not previously use LoopVersioning. The reason is
that the vectorizer performs its transformations in single shot. It
creates an empty single-block vector loop that it then populates with
the widened, if-converted instructions. Thus creating an intermediate
versioned scalar loop seems wasteful.
So this patch (rather than bringing in LoopVersioning fully) adds a
special interface to LoopVersioning to allow the vectorizer to add
no-alias annotation while still performing its own versioning.
As the vectorizer propagates metadata from the instructions in the
original loop to the vector instructions we also check the pointer in
the original instruction and see if LoopVersioning can add no-alias
metadata based on the issued memchecks.
Reviewers: hfinkel, nadav, mzolotukhin
Subscribers: mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D17191
llvm-svn: 263744
Summary:
If we decide to version a loop to benefit a transformation, it makes
sense to record the now non-aliasing accesses in the newly versioned
loop. This allows non-aliasing information to be used by subsequent
passes.
One example is 456.hmmer in SPECint2006 where after loop distribution,
we vectorize one of the newly distributed loops. To vectorize we
version this loop to fully disambiguate may-aliasing accesses. If we
add the noalias markers, we can use the same information in a later DSE
pass to eliminate some dead stores which amounts to ~25% of the
instructions of this hot memory-pipeline-bound loop. The overall
performance improves by 18% on our ARM64.
The scoped noalias annotation is added in LoopVersioning. The patch
then enables this for loop distribution. A follow-on patch will enable
it for the vectorizer. Eventually this should be run by default when
versioning the loop but first I'd like to get some feedback whether my
understanding and application of scoped noalias metadata is correct.
Essentially my approach was to have a separate alias domain for each
versioning of the loop. For example, if we first version in loop
distribution and then in vectorization of the distributed loops, we have
a different set of memchecks for each versioning. By keeping the scopes
in different domains they can conveniently be defined independently
since different alias domains don't affect each other.
As written, I also have a separate domain for each loop. This is not
necessary and we could save some metadata here by using the same domain
across the different loops. I don't think it's a big deal either way.
Probably the best is to review the tests first to see if I mapped this
problem correctly to scoped noalias markers. I have plenty of comments
in the tests.
Note that the interface is prepared for the vectorizer which needs the
annotateInstWithNoAlias API. The vectorizer does not use LoopVersioning
so we need a way to pass in the versioned instructions. This is also
why the maps have to become part of the object state.
Also currently, we only have an AA-aware DSE after the vectorizer if we
also run the LTO pipeline. Depending how widely this triggers we may
want to schedule a DSE toward the end of the regular pass pipeline.
Reviewers: hfinkel, nadav, ashutosh.nema
Subscribers: mssimpso, aemerson, llvm-commits, mcrosier
Differential Revision: http://reviews.llvm.org/D16712
llvm-svn: 263743
I hit a crash in the bitcode reader on some corrupt input where an
MDString had somehow been attached to an instruction instead of an
MDNode. This input is pretty bogus, but we shouldn't be crashing on bad
input here.
This change adds error handling in all of the places where we
currently have unchecked casts from Metadata to MDNode, which means
we'll error out instead of crashing for that sort of input.
Unfortunately, I don't have tests. Hitting this requires flipping bits
in the input bitcode, and committing corrupt binary files to catch
these cases is a bit too opaque and unmaintainable.
llvm-svn: 263742
Summary:
The multiprocessing.Queue.put() call can hang if we try queueing all the
tests before starting to take them out of the queue.
The current implementation hangs if tests exceed 2^^15, on Mac OS X.
This might happen with a ninja check-all if one has a bunch of llvm
projects.
Reviewers: delcypher, bkramer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D17609
llvm-svn: 263731
Autogenerated from the corresponding assembler tests with a few FIXME added (will fix soon).
Differential Revision: http://reviews.llvm.org/D18249
llvm-svn: 263729
This patch prevents CTR loops optimization when using soft float operations
inside loop body. Soft float operations use function calls, but function
calls are not allowed inside CTR optimized loops.
Patch by Aleksandar Beserminji.
Differential Revision: http://reviews.llvm.org/D17600
llvm-svn: 263727
Summary:
MRI::eliminateFrameIndex can emit several instructions to do address
calculations; these can usually be stackified. Because instructions with
FI operands can have subsequent operands which may be expression trees,
find the top of the leftmost tree and insert the code before it, to keep
the LIFO property.
Also use stackified registers when writing back the SP value to memory
in the epilog; it's unnecessary because SP will not be used after the
epilog, and it results in better code.
Differential Revision: http://reviews.llvm.org/D18234
llvm-svn: 263725
Symmary:
ds_permute/ds_bpermute do not read memory so s_waitcnt is not needed.
Reviewers
arsenm, tstellarAMD
Subscribers
llvm-commits, arsenm
Differential Revision:
http://reviews.llvm.org/D18197
llvm-svn: 263720
Summary:
As explained by the comment, threads will typically see different values
returned by atomic instructions even if the arguments are equal.
Reviewers: arsenm, tstellarAMD
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D18156
llvm-svn: 263719
We were being too aggressive in trying to combine a shuffle into a blend-with-zero pattern, often resulting in a endless loop of contrasting combines
This patch stops the combine if we already have a blend in place (means we miss some domain corrections)
llvm-svn: 263717
This is similar to D18133 where we allowed profile weights on select instructions.
This extends that change to also allow the 'unpredictable' attribute of branches to apply to selects.
A test to check that 'unpredictable' metadata is preserved when cloning instructions was checked in at:
http://reviews.llvm.org/rL263648
Differential Revision: http://reviews.llvm.org/D18220
llvm-svn: 263716
The two changes together weakened the test and caused a regression with division
handling in MSVC mode. They were applied to avoid an assertion being triggered
in the block frequency analysis. However, the underlying problem was simply
being masked rather than solved properly. Address the actual underlying problem
and revert the changes. Rather than analyze the cause of the assertion, the
division failure was assumed to be an overflow.
The underlying issue was a subtle bug in the BB construction in the emission of
the div-by-zero check (WIN__DBZCHK). We did not construct the proper successor
information in the basic blocks, nor did we update the PHIs associated with the
basic block when we split them. This would result in assertions being triggered
in the block frequency analysis pass.
Although the original tests are being removed, the tests themselves performed
very little in terms of validation but merely tested that we did not assert when
generating code. Update this with new tests that actually ensure that we do not
regress on the code generation.
llvm-svn: 263714
It might be hard to recognize a hexadecimal number without '0x' prefix.
Besides that '0x' prefix corresponds to GNU objdump behaviour.
Differential Revision: http://reviews.llvm.org/D18207
llvm-svn: 263705
That allows, for example, to print hex-formatted immediates using
llvm-objdump --print-imm-hex command line option.
Differential Revision: http://reviews.llvm.org/D18195
llvm-svn: 263704
Summary:
This should eliminate all occurrences of this within LLVMMipsAsmParser.
This patch is in response to http://reviews.llvm.org/D17983. I was unable
to reproduce the warnings on my machine so please advise if this fixes the
warnings.
Reviewers: ariccio, vkalintiris, dsanders
Subscribers: dblaikie, dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D18087
llvm-svn: 263703
The section alignment field was marked optional but not provided a
default value: initialize it with 0.
While we are here, ensure that the section alignment is plausible.
llvm-svn: 263692
This splits out the logic that maps the `"statepoint-id"` attribute into
the actual statepoint ID, and the `"statepoint-num-patch-bytes"`
attribute into the number of patchable bytes the statpeoint is lowered
into. The new home of this logic is in IR/Statepoint.cpp, and this
refactoring will support similar functionality when lowering calls with
deopt operand bundles in the future.
llvm-svn: 263685
The allocator here can still be a nullptr, but this atleast makes the
single caller which needed nullptr be explicit about it.
Note, lld started always passing a parameter here as of r263680. If
anything builds out of sync, that would be why errors may occur.
llvm-svn: 263681
In lld we allocate atoms on an allocator and so don't run their
destructors. This means we also shouldn't allocate memory inside
them without that also being on an allocator.
Reviewed by Lang Hames and Rafael Espindola.
llvm-svn: 263676
Summary: If TBAA is on an intrinsic and it gets upgraded and drops the TBAA we hit an odd assert. We should just upgrade the TBAA first because it doesn't have side-effects.
Reviewers: reames, apilipenko, manmanren
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18229
llvm-svn: 263673
Summary:
This is a step towards implementing "direct" lowering of calls and
invokes with deopt operand bundles into STATEPOINT nodes (as opposed to
having them mandatorily pass through RewriteStatepointsForGC, which is
the case today).
This change extracts out a `SelectionDAGBuilder::LowerAsStatepoint`
helper function that is able to lower a "statepoint like thing", and
uses it to lower `gc.statepoint` calls. This is an NFC now, but in a
later change we will use `LowerAsStatepoint` to directly lower calls and
invokes with operand bundles without going through an intermediate
`gc.statepoint` IR representation.
FYI: I expect `SelectionDAGBuilder::StatepointInfo` will evolve as I add
support for lowering non gc.statepoints, right now it is fairly tightly
coupled with an IR level `gc.statepoint`.
Reviewers: reames, pgavlin, JosephTremoulet
Subscribers: sanjoy, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D18106
llvm-svn: 263671
I'm testing out a script that auto-generates the check lines.
It's 98% copied from utils/update_llc_test_checks.py.
If others think this is useful, please let me know.
llvm-svn: 263668
I'm testing out a script that auto-generates the check lines.
It's 98% copied from utils/update_llc_test_checks.py.
If others think this is useful, please let me know.
llvm-svn: 263667
- Rename getATOMIC to getSYNC, as llvm will soon be able to emit both
'__sync' libcalls and '__atomic' libcalls, and this function is for
the '__sync' ones.
- getInsertFencesForAtomic() has been replaced with
shouldInsertFencesForAtomic(Instruction), so that the decision can be
made per-instruction. This functionality will be used soon.
- emitLeadingFence/emitTrailingFence are no longer called if
shouldInsertFencesForAtomic returns false, and thus don't need to
check the condition themselves.
llvm-svn: 263665
SelectionDAGBuilder::populateCallLoweringInfo is now used instead of
SelectionDAGBuilder::lowerCallOperands. The populateCallLoweringInfo
interface is more composable in face of design changes like
http://reviews.llvm.org/D18106
llvm-svn: 263663
The swift frontend needs to be able to look up PGO function name
variables based on the original raw function name. That's because it's
not possible to create PGO function name variables while emitting swift
IR. Instead, we have to create the name variables while lowering swift
IR to llvm IR, at which point we fix up all calls to the increment
intrinsic to point to the right name variable.
llvm-svn: 263662
Summary:
Uniform loops where the branch leaving the loop is predicated on VCCNZ
must be skipped if EXEC = 0, otherwise they will be infinite.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D18137
llvm-svn: 263658
Summary:
Fix LSRInstance::HoistInsertPosition() to check the original insert
position block first for a canonical insertion point that is dominated
by all inputs. This leads to SCEV being able to reuse more instructions
since it currently tracks the instructions it creates for reuse by
keeping a table of <Value, insert point> pairs.
Reviewers: atrick
Subscribers: mcrosier, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D18001
llvm-svn: 263644
And emit an error if it fails.
This prevents illegal instructions from getting sent to the GPU, which
would potentially result in a hang.
This is a candidate for the stable branch(es).
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
llvm-svn: 263627
Fork off compatibility.ll for the 3.8 release. The *.bc file in this
commit was produced using a Release build of the release_38 branch.
llvm-svn: 263620
This patch introduces the Error classs for lightweight, structured,
recoverable error handling. It includes utilities for creating, manipulating
and handling errors. The scheme is similar to exceptions, in that errors are
described with user-defined types. Unlike exceptions however, errors are
represented as ordinary return types in the API (similar to the way
std::error_code is used).
For usage notes see the LLVM programmer's manual, and the Error.h header.
Usage examples can be found in unittests/Support/ErrorTest.cpp.
Many thanks to David Blaikie, Mehdi Amini, Kevin Enderby and others on the
llvm-dev and llvm-commits lists for lots of discussion and review.
llvm-svn: 263609
We can currently only match zeroable vector elements of the same size as the shuffle type - these tests demonstrate the problem and a solution will be shortly added in an updated D14261
llvm-svn: 263606