This is a follow up to r212492. There should be no functional difference, but
this patch makes it clear that SrcVT must be an i1/i8/16/i32 and DestVT must be
an i8/i16/i32/i64.
rdar://17516686
llvm-svn: 212633
gcc supports this behavior and it is pervasively used inside the Linux
kernel.
Note that both gcc and clang will reject code that attempts to do this
in a C++ language mode.
This fixes PR17998.
llvm-svn: 212631
The current strategy for host allocation is to choose a random
address and attempt to allocate there, eventually failing if the
allocation cannot be satisfied.
The C standard only guarantees that RAND_MAX >= 32767, so for
platforms that use a very small RAND_MAX allocations will fail
with very high probability. On such platforms (Windows is one),
you can reproduce this trivially by running lldb, typing "expr (3)"
and then hitting enter you see a failure. Failures generally
happen with a frequency of about 1 failure every 5 evaluations.
There is no good reason that allocations need to look like "real"
pointers, so this patch changes the allocation scheme to simply
jump straight to the end and grab a free chunk of memory.
Reviewed By: Sean Callanan
Differential Revision: http://reviews.llvm.org/D4300
llvm-svn: 212630
In PR20059 ( http://llvm.org/pr20059 ), instcombine eliminates shuffles that are necessary before performing an operation that can trap (srem).
This patch calls isSafeToSpeculativelyExecute() and bails out of the optimization in SimplifyVectorOp() if needed.
Differential Revision: http://reviews.llvm.org/D4424
llvm-svn: 212629
The getopt library has a structure called option (lowercase). We
have a structure called Option (uppercase). previously the two
structures had exactly the same definitions, and we were doing a
C-style cast of an Option* to an option*. C-style casts don't
bother to warn you when you cast to unrelated types, but in the
original OptionValidator patch I modified the definition of Option.
This patch fixes the errors by building an array of option
structures and filling it out the correct way before passing it to
the getopt library.
This also fixes one other source of test failures: an uninitialized
read that occurs due to not initializing a field of the
OptionDefinition.
Reviewed By: Todd Fiala
Differential Revision: http://reviews.llvm.org/D4425
llvm-svn: 212628
Summary: This is a pre-requisite for supporting the mips-img-linux-gnu triple in clang.
Differential Revision: http://reviews.llvm.org/D4435
llvm-svn: 212626
The clang -cc1as options are nearly a strict subset of -cc1. Instead of
duplicating the definitions and documentation, let's go ahead and share the
definitions in a similar way the current handling of combined driver and
frontend flags, eliminating some of the vestigial legacy surrounding the
assembler subcommand.
llvm-svn: 212620
Summary:
This removes the need to pass -mnan=2008 explicitly to be able to compile
the test-suite for MIPS32r6/MIPS64r6.
Differential Revision: http://reviews.llvm.org/D4433
llvm-svn: 212619
Summary:
While debugging another issue, I noticed that Mips currently specifies that the
count leading zero builtins are undefined when the input is zero. The
architecture specifications say that the clz and dclz instructions write 32 or
64 respectively when given zero.
This doesn't fix any bugs that I'm aware of but it may improve optimisation in
some cases.
Differential Revision: http://reviews.llvm.org/D4431
llvm-svn: 212618
not widening the input type to the node sufficiently to let the ext take
place in a register.
This would in turn result in a mysterious bitcast assertion failure
downstream. First change here is to add back the helpful assert I had in
an earlier version of the code to catch this immediately.
Next change is to add support to the type legalization to detect when we
have widened the operand either too little or too much (for whatever
reason) and find a size-matched legal vector type to convert it to
first. This can also fail so we get a new fallback path, but that seems
OK.
With this, we no longer crash on vec_cast2.ll when using widening. I've
also added the CHECK lines for the zero-extend cases here. We still need
to support sign-extend and trunc (or something) to get plausible code
for the other two thirds of this test which is one of the regression
tests that showed the most scalarization when widening was
force-enabled. Slowly closing in on widening being a viable legalization
strategy without it resorting to scalarization at every turn. =]
llvm-svn: 212614
Turns out my trick of using the same masks for SSE4.1 and AVX2 didn't work out
as we have to blend two vectors. While there remove unecessary cross-lane moves
from the shuffles so the backend can lower it to palignr instead of vperm.
Fixes PR20118, a miscompilation of vector sdiv by constant on AVX2.
llvm-svn: 212611
vector types to be legal and a ZERO_EXTEND node is encountered.
When we use widening to legalize vector types, extend nodes are a real
challenge. Either the input or output is likely to be legal, but in many
cases not both. As a consequence, we don't really have any way to
represent this situation and the prior code in the widening legalization
framework would just scalarize the extend operation completely.
This patch introduces a new DAG node to represent doing a zero extend of
a vector "in register". The core of the idea is to allow legal but
different vector types in the input and output. The output vector must
have fewer lanes but wider elements. The operation is defined to zero
extend the low elements of the input to the size of the output elements,
and drop all of the high elements which don't have a corresponding lane
in the output vector.
It also includes generic expansion of this node in terms of blending
a zero vector into the high elements of the vector and bitcasting
across. This in turn yields extremely nice code for x86 SSE2 when we use
the new widening legalization logic in conjunction with the new shuffle
lowering logic.
There is still more to do here. We need to support sign extension, any
extension, and potentially int-to-float conversions. My current plan is
to continue using similar synthetic nodes to model each of these
transitions with generic lowering code for each one.
However, with this patch LLVM already reaches performance parity with
GCC for the core C loops of the x264 code (assuming you disable the
hand-written assembly versions) when compiling for SSE2 and SSE3
architectures and enabling the new widening and lowering logic for
vectors.
Differential Revision: http://reviews.llvm.org/D4405
llvm-svn: 212610
Summary:
It seems we accidentally read the wrong column of the table MIPS64r6 spec
and used the names for c.cond.fmt instead of cmp.cond.fmt.
Differential Revision: http://reviews.llvm.org/D4387
llvm-svn: 212607
Summary:
This completes the change to use JALR instead of JR on MIPS32r6/MIPS64r6.
Reviewers: jkolek, vmedic, zoran.jovanovic, dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D4269
llvm-svn: 212605
Summary:
RET, and RET_MM have been replaced by a pseudo named PseudoReturn.
In addition a version with a 64-bit GPR named PseudoReturn64 has been
added.
Instruction selection for a return matches RetRA, which is expanded post
register allocation to PseudoReturn/PseudoReturn64. During MipsAsmPrinter,
this PseudoReturn/PseudoReturn64 are emitted as:
- (JALR64 $zero, $rs) on MIPS64r6
- (JALR $zero, $rs) on MIPS32r6
- (JR_MM $rs) on microMIPS
- (JR $rs) otherwise
On MIPS32r6/MIPS64r6, 'jr $rs' is an alias for 'jalr $zero, $rs'. To aid
development and review (specifically, to ensure all cases of jr are
updated), these aliases are temporarily named 'r6.jr' instead of 'jr'.
A follow up patch will change them back to the correct mnemonic.
Added (JALR $zero, $rs) to MipsNaClELFStreamer's definition of an indirect
jump, and removed it from its definition of a call.
Note: I haven't accounted for MIPS64 in MipsNaClELFStreamer since it's
doesn't appear to account for any MIPS64-specifics.
The return instruction created as part of eh_return expansion is now expanded
using expandRetRA() so we use the right return instruction on MIPS32r6/MIPS64r6
('jalr $zero, $rs').
Also, fixed a misuse of isABI_N64() to detect 64-bit wide registers in
expandEhReturn().
Reviewers: jkolek, vmedic, mseaborn, zoran.jovanovic, dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D4268
llvm-svn: 212604
Summary:
This patch re-uses the implementation of 'llvm-mc -show-inst' and makes it
available to llc as 'llc -asm-show-inst'.
This is necessary to test parts of MIPS32r6/MIPS64r6 without resorting to
'llc -filetype=obj' tests. For example, on MIPS32r2 and earlier we use the
'jr $rs' instruction for indirect branches and returns. On MIPS32r6, we no
longer have 'jr $rs' and use 'jalr $zero, $rs' instead. The catch is that,
on MIPS32r6, 'jr $rs' is an alias for 'jalr $zero, $rs' and is the preferred
way of writing this instruction. As a result, all MIPS ISA's emit 'jr $rs' in
their assembly output and the assembler encodes this to different opcodes
according to the ISA.
Using this option, we can check that the MCInst really is a JR or a JALR by
matching the emitted comment. This removes the need for a 'llc -filetype=obj'
test.
Reviewers: rafael, dsanders
Reviewed By: dsanders
Subscribers: zoran.jovanovic, llvm-commits
Differential Revision: http://reviews.llvm.org/D4267
llvm-svn: 212603
has settled without incident, removing the x86-specific and overly
strict 'isVectorSplat' routine in favor of generic and more powerful
splat detection.
The primary motivation and result of this is that the x86 backend can
now see through splats which contain undef elements. This is essential
if we are using a widening form of legalization and I've updated a test
case to also run in that mode as before this change the generated code
for the test case was completely scalarized.
This version of the patch much more carefully handles the undef lanes.
- We aren't overly conservative about them in the shift lowering
(where we will never use the splat itself).
- One place where the splat would have been re-used by the existing code
now explicitly constructs a new constant splat that will be safe.
- The broadcast lowering is much more reasonable with undefs by doing
a correct check of whether the splat is the only user of a loaded
value, checking that the splat actually crosses multiple lanes before
using a broadcast, and handling broadcasts of non-constant splats.
As a consequence of the last bullet, the weird usage of vpshufd instead
of vbroadcast is gone, and we actually can lower an AVX splat with
vbroadcastss where before we emitted a really strange pattern of
a vector load and a manual splat across the vector.
llvm-svn: 212602
Having some kind of weird kernel-assisted ABI for these when the
native instructions are available appears to be (and should be) the
exception; OSs have been gradually opting in for years and the code
was getting silly.
So let LLVM decide whether it's possible/profitable to inline them by
default.
Patch by Phoebe Buckheister.
llvm-svn: 212598
Though not completely identical, make former
IndentFunctionDeclarationAfterType change this flag for backwards
compatibility (it is somewhat close in meaning and better the err'ing on
an unknown config flag).
llvm-svn: 212597
Key changes:
- Correctly (well ...) distinguish function declarations and variable
declarations with ()-initialization.
- Don't indent when breaking function declarations/definitions after the
return type.
- Indent variable declarations and typedefs when breaking after the
type.
This fixes llvm.org/PR17999.
llvm-svn: 212591
This -f group flag appears to influence linker flags, breaking the usual rules
and causing CMake's link invocation to fail during feature detection due to
missing link dependencies (msan_*).
Let's forcibly add it for now to get things the way they were before feature
detection started working.
llvm-svn: 212590
When LLVM_ENABLE_TIMESTAMPS has been disabled we can prevent the preprocessor
from embedding dates, times and file timestamps.
There are a few motivations for this:
1) Validate the recent CMake feature detection bugfix from LLVM r212586 with
a flag that's not actually available everywhere.
2) Dogfood clang's new -Wdate-time warning from r210511 when bootstrapping.
3) Encourage reproducible builds.
llvm-svn: 212587
add_flag_if_supported() and add_flag_or_print_warning() were effectively
no-ops, just returning the value of the first result (usually
'-fno-omit-frame-pointer') for all subsequent checks for different flags.
Due to the way CMake caches feature detection results, we need to provide
symbolic variable names which will persist the cached results. This commit
fixes feature detection using these two macros.
The feature checks now run and get stored correctly, and the correct output can
be observed in configure logs:
-- Performing Test C_SUPPORTS_FPIC
-- Performing Test C_SUPPORTS_FPIC - Success
-- Performing Test CXX_SUPPORTS_FPIC
-- Performing Test CXX_SUPPORTS_FPIC - Success
llvm-svn: 212586
This flag is set by most other tools and avoids extra stat() calls. The
frontend will diagnose anyway as it performs the check atomically while opening
files at point of use.
We could probably make Driver::CheckInputsExist default to false and only
enable it in the main 'clang' binary, or even better only perform the checks if
we know the tool is external but that needs more thought.
llvm-svn: 212585