Removes unnecessary casts from non-generic address spaces to the generic address
space for certain code patterns.
Patch by Jingyue Wu.
llvm-svn: 205571
ELFLinkingContext has a method addUndefinedAtomsFromSharedLibrary().
The method is being used to skip a shared library within --start-group
and --end-group if it's not the first iteration of the group.
We have the same, incomplete mechanism to skip a shared library within
a group too. That's implemented in ELFFileNode. It's intended to not
return a shared library on the second or further iterations in the
first place. This mechanism is preferred over
addUndefinedAtomsFromSharedLibrary because the policy is implemented
in Input Graph -- that's what Input Graph is for.
This patch removes the dupluicate feature and fixes ELFFileNode.
Differential Revision: http://llvm-reviews.chandlerc.com/D3280
llvm-svn: 205566
When rematerializing through truncates, the coalescer may produce instructions
with dead defs, but live implicit-defs of subregs:
E.g.
%X1<def,dead> = MOVi64imm 2, %W1<imp-def>; %X1:GPR64, %W1:GPR32
These instructions are live, and their definitions should not be rewritten.
Fixes <rdar://problem/16492408>
llvm-svn: 205565
Acording to AMD documentation, the correct opcode for
BFE_INT is 0x5, not 0x4
Fixes Arithm/Absdiff.Mat/3 OpenCV test
Patch by: Bruno Jiménez
llvm-svn: 205562
"x.empty()" is more idiomatic than "x.size() == 0". This patch is to
add such method and use it in LLD.
Differential Revision: http://llvm-reviews.chandlerc.com/D3279
llvm-svn: 205558
Reversed the order in which LD_LIBRARY_PATH is defined in order to make sure the
${CLOOG_INSTALL} prefix is found first.
Contributed-by: Christian Bielert <cib123@googlemail.com>
llvm-svn: 205556
By ignoring this pragma with a warning, we're essentially miscompiling
the user's program. WebKit / Blink use this pragma to disable dynamic
initialization and finalization of some static data, and running the
dtors crashes the program.
Error out for now, so that /fallback compiles the TU correctly with
MSVC. This pragma should be implemented some time this month, and we
can remove this hack.
llvm-svn: 205554
Seems clang-modernize couldn't add "override" to nested classes, so
doing it by hand. Also removed unused virtual member function that
is not overriding anything, that seems to have been added by mistake.
llvm-svn: 205552
these is very much off and is more than just the branch
from this bug incorrect:
Address Line Column File ISA Discriminator Flags
------------------ ------ ------ ------ --- ------------- -------------
0x30830a0100000002 3 0 1 0 0 is_stmt
0x30830a0100000008 3 0 1 0 0 is_stmt end_sequence
llvm-svn: 205551
Extend the SSE2 comment lexing to AVX2. Only 16byte align when not on AVX2.
This provides some 3% speedup when preprocessing gcc.c as a single file.
The patch is wrong, it always uses SSE2, and when I fix that there's no speedup
at all. I am not sure where the 3% came from previously.
--Thi lie, and those below, will be ignored--
M Lex/Lexer.cpp
llvm-svn: 205548
More updating of tests to be explicit about the target triple rather than
relying on the default target triple supporting ARM mode.
Indicate to lit that object emission is not yet available for Windows on ARM.
llvm-svn: 205545
No functionality change.
When determining the pattern for instantiating a generic lambda call operator specialization - we must not go drilling down for the 'prototype' (i.e. as written) pattern - rather we must use our partially transformed pattern (whose DeclRefExprs are wired correctly to any enclosing lambda's decls that should be mapped correctly in a local instantiation scope) that is the templated pattern of the specialization's primary template (even though the primary template might be instantiated from a 'prototype' member-template). Previously, the drilling down was haltted by marking the instantiated-from primary template as a member-specialization (incorrectly).
This prompted Richard to remark (http://llvm-reviews.chandlerc.com/D1784?id=4687#inline-10272)
"It's a bit nasty to (essentially) set this bit incorrectly. Can you put the check into getTemplateInstantiationPattern instead?"
In my reckless youth, I chose to ignore that comment. With the passage of time, I have come to learn the value of bowing to the will of the angry Gods ;)
llvm-svn: 205543
This changes the tests that were targeting ARM EABI to explicitly specify the
environment rather than relying on the default. This breaks with the new
Windows on ARM support when running the tests on Windows where the default
environment is no longer EABI.
Take the opportunity to avoid a pointless redirect (helps when trying to debug
with providing a command line invocation which can be copy and pasted) and
removing a few greps in favour of FileCheck.
llvm-svn: 205541
Implementing this via ComputeMaskedBits has two advantages:
+ It actually works. DAGISel doesn't deal with the chains properly
in the previous pattern-based solution, so they never trigger.
+ The information can be used in other DAG combines, as well as the
trivial "get rid of truncs". For example if the trunc is in a
different basic block.
rdar://problem/16227836
llvm-svn: 205540
Summary:
test/MC/Mips/<isa1>/invalid-<isa2>.s
Test that <isa1> does not support <isa2>'s instructions.
test/MC/Mips/<isa1>/invalid-<isa2>-xfail.s
Things that should be invalid but currently aren't. Will XPASS if any
become invalid.
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D3262
llvm-svn: 205538
The terminal barrier of a cmpxchg expansion will be either Acquire or
SequentiallyConsistent. In either case it can be skipped if the
operation has Monotonic requirements on failure.
rdar://problem/15996804
llvm-svn: 205535
If a multi-threaded program calls fork(), TSan ignores all memory accesses
in the child to prevent deadlocks in TSan runtime. This is OK, as child is
probably going to call exec() as soon as possible. However, a rare deadlocks
could be caused by ThreadIgnoreBegin() function itself.
ThreadIgnoreBegin() remembers the current stack trace and puts it into the
StackDepot to report a warning later if a thread exited with ignores enabled.
Using StackDepotPut in a child process is dangerous: it locks a mutex on
a slow path, which could be already locked in a parent process.
The fix is simple: just don't put current stack traces to StackDepot in
ThreadIgnoreBegin() and ThreadIgnoreSyncBegin() functions if we're
running after a multithreaded fork. We will not report any
"thread exited with ignores enabled" errors in this case anyway.
Submitting this without a testcase, as I believe the standalone reproducer
is pretty hard to construct.
llvm-svn: 205534
Summary:
Adds the 'mips4' processor and a simple test of the ELF e_flags.
Patch by David Chisnall
His work was sponsored by: DARPA, AFRL
I made one small change to the testcase so that it uses
mips64-unknown-linux instead of mips4-unknown-linux.
This patch indirectly adds FeatureCondMov to FeatureMips64. This is ok
because it's supposed to be there anyway and it turns out that
FeatureCondMov is not a predicate of any instructions at the moment
(this is a bug that hasn't been noticed because there are no targets
without the conditional move instructions yet).
CC: theraven
Differential Revision: http://llvm-reviews.chandlerc.com/D3244
llvm-svn: 205530
llc doesn't generate nodes for unconditional fall-through branches for targets
without FastISel implementation (X86 has it, but can be disabled by
"-fast-isel=false") in SelectionDAGBuilder::visitBr().
So for line 4 in the following testcase
1: void foo(int i){
2: switch(i){
3: default:
4: break;
5: }
6: return;
7: }
there is no corresponding line in .debug_line section, and a debugger
cannot set a breakpoint at line 4.
Fix this by always emitting a branch when we're not optimizing and add a
testcase to ensure that there's code on every line we'd want to break.
Patch by Daniil Fukalov.
llvm-svn: 205529
Don't allow the RHS of an operator to be split over multiple
lines unless there is a line-break right after the operator.
Before:
if (aaaa && bbbbb || // break
cccc) {
}
After:
if (aaaa &&
bbbbb || // break
cccc) {
}
In most cases, this seems to increase readability.
llvm-svn: 205527
The previous situation where ATOMIC_LOAD_WHATEVER nodes were expanded
at MachineInstr emission time had grown to be extremely large and
involved, to account for the subtly different code needed for the
various flavours (8/16/32/64 bit, cmpxchg/add/minmax).
Moving this transformation into the IR clears up the code
substantially, and makes future optimisations much easier:
1. an atomicrmw followed by using the *new* value can be more
efficient. As an IR pass, simple CSE could handle this
efficiently.
2. Making use of cmpxchg success/failure orderings only has to be done
in one (simpler) place.
3. The common "cmpxchg; did we store?" idiom can be exposed to
optimisation.
I intend to gradually improve this situation within the ARM backend
and make sure there are no hidden issues before moving the code out
into CodeGen to be shared with (at least ARM64/AArch64, though I think
PPC & Mips could benefit too).
llvm-svn: 205525
The trouble as in ARMAsmParser, in ParseInstruction method. It assumes that ARM::R12 + 1 == ARM::SP.
It is wrong, since ARM::<Register> codes are generated by tablegen and actually could be any random numbers.
llvm-svn: 205524
add operation since extract_vector_elt can perform an extend operation. Get the input lane
type from the vector on which we're performing the vpaddl operation on and extend or
truncate it to the output type of the original add node.
llvm-svn: 205523