The main problem is in the predicate passed to the `std::stable_sort()`.
This predicate always returns false if **both** section's names do not
start with `.init_array` or `.fini_array` prefixes. In short, it does not
define a strict weak orderng. Suppose we have the following sections:
.A .init_array.1 .init_array.2
The predicate states that:
not .init_array.1 < .A
not .A < .init_array.2
but .init_array.1 < .init_array.2 !!!
The second problem is that `.init_array` section without number should
go last in the list. Not it has the lowest priority '0' and goes first.
The patch fixes both of the problems.
llvm-svn: 209875
http://llvm.org/bugs/show_bug.cgi?id=19876
The following C++1y code results in a crash:
struct X {
int m = 10;
int n = [this](auto) { return m; }(20);
};
When implicitly instantiating the generic lambda's call operator specialization body, Sema is unable to determine the current 'this' type when transforming the MemberExpr 'm' - since it looks for the nearest enclosing FunctionDeclDC - which is obviously null.
I considered two ways to fix this:
1) In InstantiateFunctionDefinition, when the context is saved after the lambda scope info is created, retain the 'this' pointer.
2) Teach getCurrentThisType() to recognize it is within a generic lambda within an NSDMI/default-initializer and return the appropriate this type.
I chose to implement #2 (though I confess I do not have a compelling reason for choosing it over #1).
Richard Smith accepted the patch:
http://reviews.llvm.org/D3935
Thank you!
llvm-svn: 209874
This patch adds support to vectorize intrinsics such as powi, cttz and ctlz in Vectorizer. These intrinsics are different from other
intrinsics as second argument to these function must be same in order to vectorize them and it should be represented as a scalar.
Review: http://reviews.llvm.org/D3851#inline-32769 and http://reviews.llvm.org/D3937#inline-32857
llvm-svn: 209873
The corresponding CFE patch replaces these intrinsics with vector initializers
in avxintrin.h. This patch removes the LLVM intrinsics from the backend.
We now stop lowering at X86ISD::VBROADCAST custom node rather than lowering
that further to the intrinsics.
The patch only changes VBROADCASTS* and leaves VBROADCAST[FI]128 to continue
to use intrinsics. As explained in the CFE patch, the reason is that we
currently don't generate as good code for them without the intrinsics.
CodeGen/X86/avx-vbroadcast.ll already provides coverage for this change. It
checks that for a series of insertelements we generate the appropriate
vbroadcast instruction.
Also verified that there was no assembly change in the test-suite before and
after this patch.
llvm-svn: 209864
They are replaced with the same IR that is generated for the
vector-initializers in avxintrin.h.
The test verifies that we get back the original instruction. I haven't seen
this approach to be used in other auto-upgrade tests (i.e. llc + FileCheck)
but I think it's the most direct way to test this case. I believe this should
work because llc upgrades calls during parsing. (Other tests mostly check
that assembling and disassembling yields the upgraded IR.)
llvm-svn: 209863
original fix would actually trigger the *exact* same crasher as the
original bug for a different reason. Awesomesauce.
Working on test cases now, but wanted to get bots healthier.
llvm-svn: 209860
across PHI nodes. The code was computing the Idxs from the 'GEP'
variable's indices when what it wanted was Op1's indices. This caused an
ASan heap-overflow for me that pin pointed the issue when Op1 had more
indices than GEP did. =] I'll let Louis add a specific test case for
this if he wants.
llvm-svn: 209857
The loop vectorizer instantiates be-taken-count + 1 as the loop iteration count.
If this expression overflows the generated code was invalid.
In case of overflow the code now jumps to the scalar loop.
Fixes PR17288.
llvm-svn: 209854
These tests ensure that a change I will propose in clang works as
expected.
Summary:
Added tests for the generation of blend+immediate instructions from a
shufflevector.
These tests were proposed along with a patch that was dropped. I'm
committing the tests anyway to protect against possible regressions in
codegen.
Reviewers: nadav, bkramer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3600
llvm-svn: 209853
Changes include:
- ObjectFileMachO can now determine if a binary is "*-apple-ios" or "*-apple-macosx" by checking the min OS and SDK load commands
- ArchSpec now says "<arch>-apple-macosx" is equivalent to "<arch>-apple-ios" since the simulator mixes and matches binaries (some from the system and most from the iOS SDK).
- Getting process inforamtion on MacOSX now correctly classifies iOS simulator processes so they have "*-apple-ios" architectures in the ProcessInstanceInfo
- PlatformiOSSimulator can now list iOS simulator processes correctly instead of showing nothing by using:
(lldb) platform select ios-simulator
(lldb) platform process list
- debugserver can now properly return "*-apple-ios" for the triple in the process info packets for iOS simulator executables
- GDBRemoteCommunicationClient now correctly passes along the triples it gets for process info by setting the OS in the llvm::Triple correctly
<rdar://problem/17060217>
llvm-svn: 209852
These intrinsics are special because they directly take a memory operand (AVX2
adds the register counterparts). Typically, other non-memop intrinsics take
registers and then it's left to isel to fold memory operands.
In order to LICM intrinsics directly reading memory, we require that no stores
are in the loop (LICM) or that the folded load accesses constant memory
(MachineLICM). When neither is the case we fail to hoist a loop-invariant
broadcast.
We can work around this limitation if we expose the load as a regular load and
then just implement the broadcast using the vector initializer syntax. This
exposes the load to LICM and other optimizations.
At the IR level this is translated into a series of insertelements. The
sequence is already recognized as a broadcast so there is no impact on the
quality of codegen.
_mm256_broadcast_pd and _mm256_broadcast_ps are not updated by this patch
because right now we lack the DAG-combiner smartness to recover the broadcast
instructions. This will be tackled in a follow-on.
There will be completing changes on the LLVM side to remove the LLVM
intrinsics and to auto-upgrade bitcode files.
Fixes <rdar://problem/16494520>
llvm-svn: 209846
Added new SocketPacketPump class to decouple gdb remote packet
reading from packet expectations code. This allowed for cleaner
implementation of the separate $O output streams (non-deterministic
packaging of inferior stdout/stderr) from all the rest of the packets.
Added a packet expectation matcher that can match expected accumulated
output with a timeout. Use a dictionary with "type":"output_match".
See lldbgdbserverutils.MatchRemoteOutputEntry for details.
Added a gdb remote test to verify that $Hc (continue thread selection)
plus signal delivery ($C{signo}) works. Having trouble getting this
to pass with debugserver on MacOSX 10.9. Tried different variants,
including $vCont;C{signo}:{thread-id};c. In some cases, I get the
test exe's signal handler to run ($vCont variant first time), in others I don't
($vCont second and further times). $C{signo} doesn't hit the signal
handler code at all in the test exe but delivers a stop. Further
$Hc and $C{signo} deliver the stop marking the wrong thread. For now I'm
marking the test as XFAIL on dsym/debugserver. Will revisit this on
lldb-dev.
Updated the text exe for these tests to support thread:print-ids (each
thread announces its thread id) and provide a SIGUSR1 thread handler
that prints out the thread id on which it was signaled.
llvm-svn: 209845
Currently LLVM will generally merge GEPs. This allows backends to use more
complex addressing modes. In some cases this is not happening because there
is PHI inbetween the two GEPs:
GEP1--\
|-->PHI1-->GEP3
GEP2--/
This patch checks to see if GEP1 and GEP2 are similiar enough that they can be
cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123):
GEP1--\ --\ --\
|-->PHI1-->GEP3 ==> |-->PHI2->GEP12->GEP3 == > |-->PHI2->GEP123
GEP2--/ --/ --/
This also breaks certain use chains that are preventing GEP->GEP merges that the
the existing instcombine would merge otherwise.
Tests included.
llvm-svn: 209843
Summary:
This adds documentation for -Rpass, -Rpass-missed and -Rpass-analysis.
It also adds release notes for 3.5.
Reviewers: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D3730
llvm-svn: 209841
Summary:
These two flags are in the same family as -Rpass, but are used in
different situations.
-Rpass-missed is used by optimizers to inform the user when they tried
to apply an optimization but couldn't (or wouldn't).
-Rpass-analysis is used by optimizers to report analysis results back
to the user (e.g., why the transformation could not be applied).
Depends on D3682.
Reviewers: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D3683
llvm-svn: 209839
without this case we would end on an infinite recursion: the remainder is zero,
so Numerator - Remainder is equal to Numerator and so we would recursively ask
for the division of Numerator by Denominator.
llvm-svn: 209838
when ScalarEvolution::getElementSize returns nullptr it is safe to early return
in ScalarEvolution::findArrayDimensions such that we avoid later problems when
we try to divide the terms by ElementSize.
llvm-svn: 209837
You can expect the sanitizers to be built under any of the following conditions:
1) CMAKE_C_COMPILER is GCC built to cross-compile to ARM
2) CMAKE_C_COMPILER is Clang built to cross-compile to ARM (ARM is default target)
3) CMAKE_C_COMPILER is Clang and CMAKE_C_FLAGS contains -target and --sysroot
Differential Revision: http://reviews.llvm.org/D3794
llvm-svn: 209835
This makes it slightly harder to misuse Twines. It is still possible to
refer to destroyed temporaries with the regular constructors, though.
Patch by Marco Alesiani!
llvm-svn: 209832