The immediate should be 1 or 2, not 0 or 1. This was found while adding bounds checking to clang. In fact the existing clang builtin test failed if we ran it all the way to assembly.
llvm-svn: 297591
This reverts commit a6a29374662716710f80c8ece96629751697841e.
It has a few compilation failures that I don't have time to fix
at the moment.
llvm-svn: 297589
x86 has undef SSE/AVX intrinsics that should represent a bogus register operand.
This is not the same as LLVM's undef value which can take on multiple bit patterns.
There are better solutions / follow-ups to this discussed here:
https://bugs.llvm.org/show_bug.cgi?id=32176
...but this should prevent miscompiles with a one-line code change.
Differential Revision: https://reviews.llvm.org/D30834
llvm-svn: 297588
In ScheduleOptimizer::isTileableBand(), allow the case in which
the band node's child is an isl_schedule_sequence_node and its
grandchildren isl_schedule_leaf_nodes. This case can arise when
two or more statements are fused by the isl scheduler.
The tile_after_fusion.ll test has two statements in separate
loop nests and checks whether they are tiled after being fused
when polly-opt-fusion equals "max".
Reviewers: grosser
Subscribers: gareevroman, pollydev
Tags: #polly
Contributed-by: Theodoros Theodoridis <theodort@student.ethz.ch>
Differential Revision: https://reviews.llvm.org/D30815
llvm-svn: 297587
I noticed unnecessary 'sbb' instructions in D30472 and while looking at 'ptest' codegen recently.
This happens because we were transforming any 'setb' - even when we only wanted a single-bit result.
This patch moves those transforms under visitAdd/visitSub, so we we're only creating sbb/adc when it
is a win. I don't know why we need a SETCC_CARRY node type, but I'm not proposing to change that
existing behavior in this patch.
Also, I'm skeptical that sbb/adc are a win for all micro-arches, so I added comments to the test files
where this transform still fires.
The test changes here are all cases where we no longer produce sbb/adc. Avoiding partial register
stalls (generating an xor to clear a register) is not handled in some cases, but that's a separate
issue.
Differential Revision: https://reviews.llvm.org/D30611
llvm-svn: 297586
There were a couple of problems with this function on Windows. Different
separators and differences in how tilde expressions are resolved for
starters, but in addition there was no clear indication of what the
function's inputs or outputs were supposed to be, and there were no tests
to demonstrate its use.
To more easily paper over the differences between Windows paths,
non-Windows paths, and tilde expressions, I've ported this function to use
LLVM-based directory iteration (in fact, I would like to eliminate all of
LLDB's directory iteration code entirely since LLVM's is cleaner / more
efficient (i.e. it invokes fewer stat calls)). and llvm's portable path
manipulation library.
Since file and directory completion assumes you are referring to files and
directories on your local machine, it's safe to assume the path syntax
properties of the host in doing so, so LLVM's APIs are perfect for this.
I've also added a fairly robust set of unit tests. Since you can't really
predict what users will be on your machine, or what their home directories
will be, I added an interface called TildeExpressionResolver, and in the
unit test I've mocked up a fake implementation that acts like a unix
password database. This allows us to configure some fake users and home
directories in the test, so we can exercise all of those hard-to-test
codepaths that normally otherwise depend on the host.
Differential Revision: https://reviews.llvm.org/D30789
llvm-svn: 297585
Summary:
A53 scheduler causes an assertion failure on all CRC instructions:
include/llvm/CodeGen/MachineInstr.h:280: const llvm::MachineOperand
&llvm::MachineInstr::getOperand(unsigned int) const: Assertion `i <
getNumOperands() && "getOperand() out of range!"' failed.
The case statements corresponding to CRC instructions are incorrect and should
be removed.
Also adding a testcase while on this.
Reviewers: t.p.northover, javed.absar, apazos, rengolin
Reviewed By: rengolin
Subscribers: evandro, aemerson, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D30274
llvm-svn: 297582
Unroller's specialized scalarizeInstruction() is mostly duplicating Vectorizer's
variant. OTOH Vectorizer's scalarizeInstruction() already supports the special
case of VF==1 except for avoiding mask-bit extraction in that case. This patch
removes Unroller's specialized version in favor of a unified method.
The only functional difference between the two variants seems to be setting
memcheck metadata for loads and stores only in Vectorizer's variant, which is a
bug in Unroller. To keep this patch an NFC the unified method doesn't set
memcheck metadata for VF==1.
Differential Revision: https://reviews.llvm.org/D30715
llvm-svn: 297580
If a SCoP is most probably sequential, then it's better to run it on a CPU.
Hence, there's no point in running it on a GPU.
Reviewers: grosser
Subscribers: nemanjai
Tags: #polly
Contributed-by: Singapuram Sanjay <singapuram.sanjay@gmail.com>
Differential Revision: https://reviews.llvm.org/D30864
llvm-svn: 297578
I'm pretty sure there are more problems lurking here. But I think this fixes PR32241.
I've added the test case from that bug and added asserts that will fail if we ever try to copy between high registers and mask registers again.
llvm-svn: 297574
Without SSE41 (pextrb) we currently extract byte elements from a vector by spilling to stack and reloading the byte.
This patch is an initial attempt at using MOVD/PEXTRW to extract the relevant DWORD/WORD from the vector and then shift+truncate to collect the correct byte.
Extraction of multiple bytes this way would result in code bloat, but as explained in the patch we could probably afford to be more aggressive with the supported extractions before again falling back on spilling - possibly through counting the number of extracts and which DWORD/WORD they originate?
Differential Revision: https://reviews.llvm.org/D29841
llvm-svn: 297568
Summary:
This patch implements a initial support of lit test for builtins.
Unit/arm/call_apsr.S is updated to support thumb1.
It also fixes a bug in arm/aeabi_uldivmod_test.c
gcc_personality_test is XFAILED as the framework cannot handle it so far.
cpu_model_test is also XFAILED for now as it is expected to return non-zero.
Reviewers: rengolin, compnerd, jroelofs, erik.pilkington, arphaman
Reviewed By: jroelofs
Subscribers: jroelofs, aemerson, srhines, nemanjai, llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D30802
llvm-svn: 297566
Since v_max_f32_e64/v_max_f16_e64 can be folded if the target
instruction supports the clamp bit, we also need to maintain
modifiers when converting v_mac to v_mad.
This fixes a rendering issue with Dirt Rally because a v_mac
instruction with the clamp bit set was converted to a v_mad
but that bit was lost during the conversion.
Fixes: e184e01dd79 ("AMDGPU: Fold FP clamp as modifier bit")
Patch by Samuel Pitoiset <samuel.pitoiset@gmail.com>
llvm-svn: 297556
Clang doesn't produce gcov compatible coverage files. This
causes lcov to break because it uses gcov by default. This
patch switches lcov to use llvm-cov as the gcov-tool.
Unfortunatly llvm-cov doesn't provide a gcov like interface by
default so it won't work with lcov. However `llvm-cov gcov` does.
For this reason we generate 'llvm-cov-wrapper' script that always
passes the gcov flag.
llvm-svn: 297553
Summary:
Some coroutine diagnostics need to point to the location of the first coroutine keyword in the function, like when diagnosing a `return` inside a coroutine. Previously we did this by storing each *valid* coroutine statement in a list and select the first one to use in diagnostics. However if every coroutine statement is invalid we would have no location to point to.
This patch fixes the storage of the first coroutine statement location, ensuring that it gets stored even when the resulting AST node would be invalid.
This patch also removes the `CoroutineStmts` list in `FunctionScopeInfo` because it was unused.
Reviewers: rsmith, GorNishanov, aaron.ballman
Reviewed By: GorNishanov
Subscribers: mehdi_amini, cfe-commits
Differential Revision: https://reviews.llvm.org/D30776
llvm-svn: 297547
When CMAKE_INSTALL_MANDIR isn't defined it ends up attempting to install
the man pages under "/man1" and we really don't want to accidentally install
stuff at the filesystem root.
llvm-svn: 297545