Change `NamedMDNode::getOperator()` from returning `MDNode *` to
returning `Value *`. To reduce boilerplate at some call sites, add a
`getOperatorAsMDNode()` for named metadata that's expected to only
return `MDNode` -- for now, that's everything, but debug node named
metadata (such as llvm.dbg.cu and llvm.dbg.sp) will soon change. This
is part of PR21433.
Note that there's a follow-up patch to clang for the API change.
llvm-svn: 221375
We currently have no infrastructure to support these correctly.
This is accomplished by generating a call to a runtime library function that
aborts at runtime in place of the regular wrapper for such functions. Direct
calls are rewritten in the usual way during traversal of the caller's IR.
We also remove the "split-stack" attribute from such wrappers, as the code
generator cannot currently handle split-stack vararg functions.
llvm-svn: 221360
This matches the format produced by the AMD proprietary driver.
//==================================================================//
// Shell script for converting .ll test cases: (Pass the .ll files
you want to convert to this script as arguments).
//==================================================================//
; This was necessary on my system so that A-Z in sed would match only
; upper case. I'm not sure why.
export LC_ALL='C'
TEST_FILES="$*"
MATCHES=`grep -v Patterns SIInstructions.td | grep -o '"[A-Z0-9_]\+["e]' | grep -o '[A-Z0-9_]\+' | sort -r`
for f in $TEST_FILES; do
# Check that there are SI tests:
grep -q -e 'verde' -e 'bonaire' -e 'SI' -e 'tahiti' $f
if [ $? -eq 0 ]; then
for match in $MATCHES; do
sed -i -e "s/\([ :]$match\)/\L\1/" $f
done
# Try to get check lines with partial instruction names
sed -i 's/\(;[ ]*SI[A-Z\\-]*: \)\([A-Z_0-9]\+\)/\1\L\2/' $f
fi
done
sed -i -e 's/bb0_1/BB0_1/g' ../../../test/CodeGen/R600/infinite-loop.ll
sed -i -e 's/SI-NOT: bfe/SI-NOT: {{[^@]}}bfe/g'../../../test/CodeGen/R600/llvm.AMDGPU.bfe.*32.ll ../../../test/CodeGen/R600/sext-in-reg.ll
sed -i -e 's/exp_IEEE/EXP_IEEE/g' ../../../test/CodeGen/R600/llvm.exp2.ll
sed -i -e 's/numVgprs/NumVgprs/g' ../../../test/CodeGen/R600/register-count-comments.ll
sed -i 's/\(; CHECK[-NOT]*: \)\([A-Z_0-9]\+\)/\1\L\2/' ../../../test/CodeGen/R600/select64.ll ../../../test/CodeGen/R600/sgpr-copy.ll
//==================================================================//
// Shell script for converting .td files (run this last)
//==================================================================//
export LC_ALL='C'
sed -i -e '/Patterns/!s/\("[A-Z0-9_]\+[ "e]\)/\L\1/g' SIInstructions.td
sed -i -e 's/"EXP/"exp/g' SIInstrInfo.td
llvm-svn: 221350
This patch improves the folding of vector AND nodes into blend operations for
targets that feature SSE4.1. A vector AND node where one of the operands is
a constant build_vector with elements that are either zero or all-ones can be
converted into a blend.
This allows for example to simplify the following code:
define <4 x i32> @test(<4 x i32> %A, <4 x i32> %B) {
%1 = and <4 x i32> %A, <i32 0, i32 0, i32 0, i32 -1>
%2 = and <4 x i32> %B, <i32 -1, i32 -1, i32 -1, i32 0>
%3 = or <4 x i32> %1, %2
ret <4 x i32> %3
}
Before this patch llc (-mcpu=corei7) generated:
andps LCPI1_0(%rip), %xmm0, %xmm0
andps LCPI1_1(%rip), %xmm1, %xmm1
orps %xmm1, %xmm0, %xmm0
retq
With this patch we generate a single 'vpblendw'.
llvm-svn: 221343
Some ARM FPUs only have 16 double-precision registers, rather than the
normal 32. LLVM represents this with the D16 target feature. This is
currently used by CodeGen to avoid using high registers when they are
not available, but the assembler and disassembler do not.
I fix this in the assmebler and disassembler rather than the
InstrInfo.td files, as the latter would require a large number of
changes everywhere one of the floating-point instructions is referenced
in the backend. This solution is similar to the one used for
co-processor numbers and MSR masks.
llvm-svn: 221341
LLVM Parser decodes "\bb" as hex in "C:\bb-win7\buildername\build...", with MDString.
See also, http://llvm.org/docs/LangRef.html#metadata-nodes-and-metadata-strings
This reverts r221270, "Disable 3 tests in llvm/test/Transforms/GCOVProfiling/ for now. Investigating."
FIXME: Please check EC in GCOVProfiler::emitProfileNotes().
llvm-svn: 221334
Commit 220932 caused crash when building clang-tblgen on aarch64 debian target,
so it's blocking all daily tests.
The std::call_once implementation in pthread has bug for aarch64 debian.
llvm-svn: 221331
Exact shifts may not shift out any non-zero bits. Use computeKnownBits
to determine when this occurs and just return the left hand side.
This fixes PR21477.
llvm-svn: 221325
We currently try to push an even number of registers to preserve 8-byte
alignment during a function's prologue, but only when the stack alignment is
prcisely 8. Many of the reasons for doing it apply also when that alignment > 8
(the extra store is often free, and can save another stack adjustment, though
less frequently for 16-byte stack alignment).
llvm-svn: 221321
We were making an attempt to do this by adding an extra callee-saved GPR (so
that there was an even number in the list), but when that failed we went ahead
and pushed anyway.
This had a couple of potential issues:
+ The .cfi directives we emit misplaced dN because they were based on
PrologEpilogInserter's calculation.
+ Unaligned stores can be less efficient.
+ Unaligned stores can actually fault (likely only an issue in niche cases,
but possible).
This adds a final explicit stack adjustment if all other options fail, so that
the actual locations of the registers match up with where they should be.
llvm-svn: 221320
Divides and remainder operations do not behave like other operations
when they are given poison: they turn into undefined behavior.
It's really hard to know if the operands going into a div are or are not
poison. Because of this, we should only choose to speculate if there
are constant operands which we can easily reason about.
This fixes PR21412.
llvm-svn: 221318
Patch to allow (v)blendps, (v)blendpd, (v)pblendw and vpblendd instructions to be commuted - swaps the src registers and inverts the blend mask.
This is primarily to improve memory folding (see new tests), but it also improves the quality of shuffles (see modified tests).
Differential Revision: http://reviews.llvm.org/D6015
llvm-svn: 221313
change LoopSimplifyPass to be !isCFGOnly. The motivation for the earlier patch
(r221223) was that LoopSimplify is not preserved by instcombine though
setPreservesCFG indicates that it is. This change fixes the issue
by making setPreservesCFG no longer imply LoopSimplifyPass, and is therefore less
invasive.
llvm-svn: 221311
While fixing up the register classes in the machine combiner in a previous
commit I missed one.
This fixes the last one and adds a test case.
llvm-svn: 221308
Clang -gsplit-dwarf self-host -O0, binary increases by 0.0005%, -O2,
binary increases by 25%.
A large binary inside Google, split-dwarf, -O0, and other internal flags
(GDB index, etc) increases by 1.8%, optimized build is 35%.
The size impact may be somewhat greater in .o files (I haven't measured
that much - since the linked executable -O0 numbers seemed low enough)
due to relocations. These relocations could be removed if we taught the
llvm-symbolizer to handle indexed addressing in the .o file (GDB can't
cope with this just yet, but GDB won't be reading this info anyway).
Also debug_ranges could be shared between .o and .dwo, though ideally
debug_ranges would get a schema that could used index(+offset)
addressing, and move to the .dwo file, then we'd be back to sharing
addresses in the address pool again.
But for now, these sizes seem small enough to go ahead with this.
Verified that no other DW_TAGs are produced into the .o file other than
subprograms and inlined_subroutines.
llvm-svn: 221306
We were producing a relocation for
----------------
.section foo,bar
La:
Lb:
.long La-Lb
--------------
but not for
---------------------
.section foo,bar
zed:
La:
Lb:
.long La-Lb
----------------
This patch handles the case where both fragments are part of the first atom
in a section and there is no corresponding symbol to that atom.
This fixes pr21328.
llvm-svn: 221304
This patch adds 'FeatureSlowSHLD' to 'bdver3'.
According to the official AMD optimization guide for amdfam15: "Using
alternative code in place of SHLD achieves lower overall latency and
requires fewer execution resources. The 32-bit and 64-bit forms of
ADD, ADC, SHR, and LEA (except 16-bit form) are DirectPath
instructions, while SHLD is a VectorPath instruction."
This patch also explicitly sets feature AVX and SSE4A for all the bdver*
cpus. This part of the patch is a non-functional change and it is mainly
done for clarity reasons (Both XOP and FMA4 already imply AVX and SSE4A).
llvm-svn: 221296
Registers are not all equal. Some are not allocatable (infinite cost),
some have to be preserved but can be used, and some others are just free
to use.
Ensure there is a cost hierarchy reflecting this fact, so that the
allocator will favor scratch registers over callee-saved registers.
llvm-svn: 221293
This patch improves how the different costs (register, interference, spill
and coalescing) relates together. The assumption is now that:
- coalescing (or any other "side effect" of reg alloc) is negative, and
instead of being derived from a spill cost, they use the block
frequency info.
- spill costs are in the [MinSpillCost:+inf( range
- register or interference costs are in [0.0:MinSpillCost( or +inf
The current MinSpillCost is set to 10.0, which is a random value high
enough that the current constraint builders do not need to worry about
when settings costs. It would however be worth adding a normalization
step for register and interference costs as the last step in the
constraint builder chain to ensure they are not greater than SpillMinCost
(unless this has some sense for some architectures). This would work well
with the current builder pipeline, where all costs are tweaked relatively
to each others, but could grow above MinSpillCost if the pipeline is
deep enough.
The current heuristic is tuned to depend rather on the number of uses of
a live interval rather than a density of uses, as used by the greedy
allocator. This heuristic provides a few percent improvement on a number
of benchmarks (eembc, spec, ...) and will definitely need to change once
spill placement is implemented: the current spill placement is really
ineficient, so making the cost proportionnal to the number of use is a
clear win.
llvm-svn: 221292
Summary:
Appropriately set/clear the FeatureBit for Mips16 when these assembler directives are used and also emit ".set nomips16" (previously, only ".set mips16" was being emitted).
These improvements allow for better testing of the .cpload/.cprestore assembler directives (which are not supposed to work when Mips16 is enabled).
Test Plan: The test is bare-bones because there are no MC tests for Mips16 instructions (there's only one, which checks that the Mips16 ELF header flag gets set), and that suggests to me that it has not been implemented yet in the IAS.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5462
llvm-svn: 221277
test/MC/ARM/directive-eabi_attribute.s was missing several tests of object file
encodings relative to the existing tests for assembly file encodings. This
commit adds the missing tests.
Change-Id: Ie110ca02b65e8f4d4c77f437bd09d03607fa5c0d
llvm-svn: 221250
This is experimental, just barely enough to get things to not
immediately combust.
A note for those who are curious:
Only lld can successfully link the object files, other linkers truncate
the section names making the debug sections illegible to debuggers.
Even with this in mind, we believe we are having trouble with SECREL
relocations.
llvm-svn: 221245
1>C:\Program Files (x86)\Windows Kits\8.1\Include\um\minwinbase.h(46):
error C2146: syntax error : missing ';' before identifier 'nLength'
1>C:\Program Files (x86)\Windows Kits\8.1\Include\um\minwinbase.h(46):
error C4430: missing type specifier - int assumed. Note: C++ does not support default-int
...
including <windows.h> is actually required.
llvm-svn: 221244
preserve LoopSimplify because instcombine may replace branch predicates
with undef which loop simplify then replaces with always exit. Replace
setPreservesCFG with the more constrained preservation of DomTree and
LoopInfo.
llvm-svn: 221223
We shouldn't put this kind of attribute stuff in DataTypes.h.
Leave the END_WITH_NULL name for now so I can update clang without
making build spam.
llvm-svn: 221215
LoadCombine can be smarter about aborting when a writing instruction is
encountered, instead of aborting upon encountering any writing instruction, use
an AliasSetTracker, and only abort when encountering some write that might
alias with the loads that could potentially be combined.
This was originally motivated by comments made (and a test case provided) by
David Majnemer in response to PR21448. It turned out that LoadCombine was not
responsible for that PR, but LoadCombine should also be improved so that
unrelated stores (and @llvm.assume) don't interrupt load combining.
llvm-svn: 221203
This generalizes the range handling for ranges in both the skeleton and
full unit, laying the foundation for the addition of more ranges (rather
than just the CU's special case) in the skeleton CU with fission+gmlt.
llvm-svn: 221202
FoldOpIntoPhi could create an infinite loop if the PHI could potentially
reach a BB it was considering inserting instructions into. The
instructions it would insert would eventually lead to other combines
firing which would, again, lead to FoldOpIntoPhi firing.
The solution is to handicap FoldOpIntoPhi so that it doesn't attempt to
insert instructions that the PHI might reach.
This fixes PR21377.
llvm-svn: 221187
So that it may be shared between skeleton/full compile unit, for CU
ranges and other ranges to be added for fission+gmlt.
(at some point we might want some kind of object shared between the
skeleton and full compile units for all those things we only want one of
in that scope, rather than having the full unit always look through to
the skeleton... - alternatively, we might be able to have the skeleton
pointer (or another, separate pointer) point to the skeleton or to the
unit itself in non-fission, so we don't have to special case its
absence)
llvm-svn: 221186
This is one of a few steps to generalize range handling to include the
CU range (thus the CU's range list will be moved into the range list
list, losing track of the base address in the process), which means
generalizing ranges from both the skeleton and full unit under fission.
And... then I can used that generalized support for ranges in
fission+gmlt where there'll be a bunch more ranges in the skeleton.
llvm-svn: 221182
register class tGPRRegClass if the target is thumb1.
This commit fixes a crash that occurs during register allocation which was
triggered when a virtual register defined by an inline-asm instruction had to
be spilled.
rdar://problem/18740489
llvm-svn: 221178
For 8-bit divrems where the remainder is used, we used to generate:
divb %sil
shrw $8, %ax
movzbl %al, %eax
That was to avoid an H-reg access, which is problematic mainly because
it isn't possible in REX-prefixed instructions.
This patch optimizes that to:
divb %sil
movzbl %ah, %eax
To do that, we explicitly extend AH, and extract the L-subreg in the
resulting register. The extension is done using the NOREX variants of
MOVZX. To support signed operations, MOVSX_NOREX is also added.
Further, this introduces a new SDNode type, [us]divrem_ext_hreg, which is
then lowered to a sequence containing a single zext (rather than 2).
Differential Revision: http://reviews.llvm.org/D6064
llvm-svn: 221176
EarlyCSE uses a simple generation scheme for handling memory-based
dependencies, and calls to @llvm.assume (which are marked as writing to memory
to ensure the preservation of control dependencies) disturb that scheme
unnecessarily. Skipping calls to @llvm.assume is legal, and the alternative
(adding AA calls in EarlyCSE) is likely undesirable (we have GVN for that).
Fixes PR21448.
llvm-svn: 221175
Unconditional noexcept support was added in the VS 2013 Nov CTP. Given
that there have been three CTPs since then, I don't think we need
careful macro magic to target that specific tech preview. Instead,
target the major release version number of 1900, which corresponds to
the as-yet unreleased VS "14".
llvm-svn: 221169
call DAGCombiner. But we ran into a case (on Windows) where the
calling convention causes argument lowering to bail out of fast-isel,
and we end up in CodeGenAndEmitDAG() which does run DAGCombiner.
So, we need to make DAGCombiner check for 'optnone' after all.
Commit includes the test that found this, plus another one that got
missed in the original optnone work.
llvm-svn: 221168
This CPU definition is redundant. The Cortex-A9 is defined as
supporting multiprocessing extensions. Remove its definition and
update appropriate tests.
LLVM defines both a cortex-a9 CPU and a cortex-a9-mp CPU. The only
difference between the two CPU definitions in ARM.td is that
cortex-a9-mp contains the feature FeatureMP for multiprocessing
extensions.
This is redundant since the Cortex-A9 is defined as having
multiprocessing extensions in the TRMs. armcc also defines the
Cortex-A9 as having multiprocessing extensions by default.
Change-Id: Ifcadaa6c322be0a33d9d2a39cfdd7da1d75981a7
llvm-svn: 221166
Some literals in the AArch64 backend had 15 'f's rather than 16, causing
comparisons with a constant 0xffffffffffffffff to be miscompiled.
llvm-svn: 221157
Hexagon was not calling InitializeELF and could not select between
ctors and init_array.
Phabricator revision: http://reviews.llvm.org/D6061
llvm-svn: 221156
test/MC/ARM/directive-eabi_attribute.s had gotten out-of-sync with
test/MC/ARM/directive-eabi_attribute-2.s. The former tests the encoding of
build attributes in object files, and the latter the encoding in assembly
files. Since both these tests need to be updated at the same time, it makes
sense to combine them into a single test. The object file encodings are being
checked against the ouput of -arm-attributes rather than by direct byte
comparisons which makes for easier reading.
Change-Id: I0075de506ae5626fb2fa235383fe5ce6a65a15a9
llvm-svn: 221155
The MRI scripts have to work with CRLF, and in general it is probably
a good idea to support this in a core utility like LineIterator.
llvm-svn: 221153
When LLVM emits DWARF call frame information, it currently creates a local,
section-relative symbol in the code section, which is pointed to by a
relocation on the .eh_frame section. However, for C++ we emit some functions in
section groups, and the SysV ABI has some rules to make it easier to remove
these sections
(http://www.sco.com/developers/gabi/latest/ch4.sheader.html#section_group_rules):
A symbol table entry with STB_LOCAL binding that is defined relative to one
of a group's sections, and that is contained in a symbol table section that is
not part of the group, must be discarded if the group members are discarded.
References to this symbol table entry from outside the group are not allowed.
This means that we need to use the function symbol for the relocation, not a
temporary symbol.
There was a comment in the code claiming that the local symbol was used to
avoid creating a relocation, but a relocation must be created anyway as the
code and CFI are in different sections.
llvm-svn: 221150
Bindings built out-of-tree, e.g. via OPAM, should append
a line to META.llvm like the following:
linkopts = "-cclib -L$libdir -cclib -Wl,-rpath,$libdir"
where $libdir is the lib/ directory where LLVM libraries are
installed.
llvm-svn: 221139
ocamlc and ocamlopt expose a distinct set of buildsystem bugs, e.g.
only ocamlc would detect -custom or -dllib-related bugs, and as all
buildbots will have ocamlopt, these bugs will stay hidden.
This change should add no more than 30 seconds of testing time.
llvm-svn: 221137
The problem is mostly that variadic output instruction
aren't handled, so it is rejected for having an inconsistent
number of operands, and then the right number of operands
isn't emitted.
llvm-svn: 221117
The issue was that linkAppendingVarProto does the full linking job, including
deleting the old dst variable. The fix is just to call it and return early
if we have a GV with appending linkage.
original message:
Refactor duplicated code in liking GlobalValues.
There is quiet a bit of logic that is common to any GlobalValue but was
duplicated for Functions, GlobalVariables and GlobalAliases.
While at it, merge visibility even when comdats are used, fixing pr21415.
llvm-svn: 221098
This commit introduces heap-use-after-free detected by ASan. Here is the output
for one of several tests that detect it:
******************** TEST 'LLVM :: Linker/AppendingLinkage.ll' FAILED ********************
Command Output (stderr):
--
=================================================================
==2122==ERROR: AddressSanitizer: heap-use-after-free on address 0x60c00000b9c8 at pc 0x0000005d05d1 bp 0x7fff64ed27c0 sp 0x7fff64ed27b8
READ of size 4 at 0x60c00000b9c8 thread T0
#0 0x5d05d0 in llvm::GlobalValue::setUnnamedAddr(bool) /usr/local/google/home/chandlerc/src/llvm/build/../include/llvm/IR/GlobalValue.h:115:35
#1 0x69fff1 in (anonymous namespace)::ModuleLinker::linkGlobalValueProto(llvm::GlobalValue*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1041:5
#2 0x697229 in (anonymous namespace)::ModuleLinker::run() /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1485:9
#3 0x696542 in llvm::Linker::linkInModule(llvm::Module*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1621:10
#4 0x4a2db7 in main /usr/local/google/home/chandlerc/src/llvm/build/../tools/llvm-link/llvm-link.cpp:116:9
#5 0x7f4ae61e5ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
#6 0x41eb71 in _start (/usr/local/google/home/chandlerc/src/llvm/build/bin/llvm-link+0x41eb71)
0x60c00000b9c8 is located 72 bytes inside of 128-byte region [0x60c00000b980,0x60c00000ba00)
freed by thread T0 here:
#0 0x4a1e6b in operator delete(void*) /usr/local/google/home/chandlerc/src/llvm/opt-build/../projects/compiler-rt/lib/asan/asan_new_delete.cc:94:3
#1 0x5d1a7a in llvm::iplist<llvm::GlobalVariable, llvm::ilist_traits<llvm::GlobalVariable> >::erase(llvm::ilist_iterator<llvm::GlobalVariable>) /usr/local/google/home/chandlerc/src/llvm/build/../inclu
de/llvm/ADT/ilist.h:466:5
#2 0x5d1980 in llvm::GlobalVariable::eraseFromParent() /usr/local/google/home/chandlerc/src/llvm/build/../lib/IR/Globals.cpp:204:3
#3 0x6a8a4d in (anonymous namespace)::ModuleLinker::linkAppendingVarProto(llvm::GlobalVariable*, llvm::GlobalVariable const*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.
cpp:980:3
#4 0x6a7403 in (anonymous namespace)::ModuleLinker::linkGlobalVariableProto(llvm::GlobalVariable const*, llvm::GlobalValue*, bool) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkMod
ules.cpp:1074:11
#5 0x69ff4e in (anonymous namespace)::ModuleLinker::linkGlobalValueProto(llvm::GlobalValue*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1028:13
#6 0x697229 in (anonymous namespace)::ModuleLinker::run() /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1485:9
#7 0x696542 in llvm::Linker::linkInModule(llvm::Module*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1621:10
#8 0x4a2db7 in main /usr/local/google/home/chandlerc/src/llvm/build/../tools/llvm-link/llvm-link.cpp:116:9
#9 0x7f4ae61e5ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
previously allocated by thread T0 here:
#0 0x4a192b in operator new(unsigned long) /usr/local/google/home/chandlerc/src/llvm/opt-build/../projects/compiler-rt/lib/asan/asan_new_delete.cc:62:35
#1 0x61d85c in llvm::User::operator new(unsigned long, unsigned int) /usr/local/google/home/chandlerc/src/llvm/build/../lib/IR/User.cpp:57:19
#2 0x6a7525 in (anonymous namespace)::ModuleLinker::linkGlobalVariableProto(llvm::GlobalVariable const*, llvm::GlobalValue*, bool) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkMod
ules.cpp:1100:3
#3 0x69ff4e in (anonymous namespace)::ModuleLinker::linkGlobalValueProto(llvm::GlobalValue*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1028:13
#4 0x697229 in (anonymous namespace)::ModuleLinker::run() /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1485:9
#5 0x696542 in llvm::Linker::linkInModule(llvm::Module*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1621:10
#6 0x4a2db7 in main /usr/local/google/home/chandlerc/src/llvm/build/../tools/llvm-link/llvm-link.cpp:116:9
#7 0x7f4ae61e5ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
SUMMARY: AddressSanitizer: heap-use-after-free /usr/local/google/home/chandlerc/src/llvm/build/../include/llvm/IR/GlobalValue.h:115 llvm::GlobalValue::setUnnamedAddr(bool)
Shadow bytes around the buggy address:
0x0c187fff96e0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
0x0c187fff96f0: 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa fa
0x0c187fff9700: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
0x0c187fff9710: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
0x0c187fff9720: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa
=>0x0c187fff9730: fd fd fd fd fd fd fd fd fd[fd]fd fd fd fd fd fd
0x0c187fff9740: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
0x0c187fff9750: fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa fa
0x0c187fff9760: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c187fff9770: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
0x0c187fff9780: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Heap right redzone: fb
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack partial redzone: f4
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
ASan internal: fe
==2122==ABORTING
llvm-svn: 221096
Currently we only need to emit skeleton strings into the CU header and
we do this by explicitly calling "addLocalString". With gmlt-in-fission,
we'll be emitting a bunch of other strings from other codepaths where
it's not statically known that these strings will be local or not.
Introduce a virtual function to indicate whether this unit is a DWO unit
or not (I'm not sure if we have a good term for this, the
opposite/alternative to 'skeleton' unit) and use that to generalize the
string emission logic so that strings can be correctly emitted in both
the skeleton and dwo unit when in split dwarf mode.
And to demonstrate that this works, switch the existing special callers
of addLocalString in the skeleton builder to addString - and they still
work. Yay.
llvm-svn: 221094
This is a useful distinction/invariant/delination to make because
LineTablesOnly mode is never relevant to type units, so it's clear that
we're not doing weird line-tables-only-with-types by making this API
choice.
It also lays the foundations nicely for adding gmlt-like data to fission
skeleton CUs while limiting the effects to CUs and not TUs.
llvm-svn: 221093
(these will shortly become virtual, with a null implementation in
DwarfUnit (since type units don't have accelerator tables in the current
schema) and the current implementation down in DwarfCompileUnit, moving
the actual maps there too)
llvm-svn: 221082
r221056 "[mips] Move F128 argument handling into MipsCCState as we did for returns. NFC."
r221058 "[mips] Fix unused variable warning introduced in r221056"
r221059 "[mips] Move all ByVal handling into CCState and tablegen-erated code. NFC."
r221061 "Renamed CCState members that appear to misspell 'Processed' as 'Proceed'. NFC."
It cuased an undefined behavior in LLVM :: CodeGen/Mips/return-vector.ll.
llvm-svn: 221081
This would help catch cases where we might otherwise try to reference a
dwo CU label, which would be weird - because without relocations in the
dwo file it's not generally meaningful to talk about the CU offsets
there (or, if it is, we can do so in absolute terms without using a
relocation to compute it).
llvm-svn: 221078
The given example was overflowing its alloca and segfaulting if actually run on
x86, so it's a good idea to provide something that works there too.
Patch by Ramkumar Ramachandra.
llvm-svn: 221077
This allows the CU label to be emitted only for compile units, as
they're the only ones that need it (so they can be referenced from
pubnames)
llvm-svn: 221072
m_ZExt might bind against a ConstantExpr instead of an Instruction.
Assuming this, using cast<Instruction>, results in InstCombine crashing.
Instead, introduce ZExtOperator to bridge both Instruction and
ConstantExpr ZExts.
This fixes PR21445.
llvm-svn: 221069
This was a compile-unit specific label (unused in type units) and seems
unnecessary anyway when we can more easily directly compute the size of
the compile unit.
llvm-svn: 221067
Summary:
CCState already contains a byval implementation that is very similar to the
Mips custom code. This patch merges the custom code into the existing
common code and tablegen-erated code.
Reviewers: vmedic
Reviewed By: vmedic
Subscribers: rnk, llvm-commits
Differential Revision: http://reviews.llvm.org/D5977
llvm-svn: 221059
Summary:
There are a couple more changes to make before analyzeFormalArguments can
be merged into the standard AnalyzeFormalArguments. I've had to temporarily
poke a couple holes in MipsCCState's encapsulation to save having to make
all the required changes for this merge all at once*. These will be removed
shortly.
* We must merge our ByVal argument handling with the implementation in CCState.
This will be done over the next three patches, then the fourth will merge
analyzeFormalArguments with AnalyzeFormalArguments.
Depends on D5967
Reviewers: vmedic
Reviewed By: vmedic
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5969
llvm-svn: 221056
Type units no longer have skeletons and it's misleading to be able to
query for a type unit's skeleton (it might incorrectly lead one to
conclude that if a unit doesn't have a skeleton it's not in a .dwo
file... ).
llvm-svn: 221055
Summary:
It's now passed in as an argument to functions that need it. Eventually
this argument will be replaced by the 'this' pointer for a MipsCCState
object.
Depends on D5966
Reviewers: vmedic
Reviewed By: vmedic
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5967
llvm-svn: 221054
Summary:
There is one remaining trace of it in MipsCC::analyzeCallOperands() where
Mips16 might override the calling convention. This will moved into
tablegen-erated code later.
Depends on D5965
Reviewers: vmedic
Reviewed By: vmedic
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5966
llvm-svn: 221053
Summary:
CustomCallingConv is simply a CallingConv that tablegen should not generate the
implementation for. It allows regular CallingConv's to delegate to these custom
functions. This is (currently) necessary for Mips and we cannot use CCCustom
without having to adapt to the different API that CCCustom uses.
This brings us a bit closer to being able to remove
MipsCC::analyzeCallOperands and MipsCC::analyzeFormalArguments in favour of
the common implementation.
No functional change to the targets.
Depends on D3341
Reviewers: vmedic
Reviewed By: vmedic
Subscribers: vmedic, llvm-commits
Differential Revision: http://reviews.llvm.org/D5965
llvm-svn: 221052
This is the first big step to allowing gmlt-like inline scope
information in the skeleton CU. While this commit doesn't change the
functionality, it's only a small step to call
"constructAbstractSubprogramDIE" on both the InfoHolder and the
SkeletonHolder (when in use) and that will at least create the abstract
SP dies in that case, though still not creating the other subprograms.
llvm-svn: 221051
This removes calls to isMaterializable in the following cases:
* It was redundant with a call to isDeclaration now that isDeclaration returns
the correct answer for materializable functions.
* It was followed by a call to Materialize. Just call Materialize and check EC.
llvm-svn: 221050
This can happen pretty often in code that looks like:
int foo = bar - 1;
if (foo < 0)
do stuff
In this case, bar < 1 is an equivalent condition.
This transform requires that the add instruction be annotated with nsw.
llvm-svn: 221045
getDISubprogram was mistakenly thought to contain a bug: we thought we
might need to try harder if we found a DebugLoc we didn't find.
llvm-svn: 221044
Summary:
This patch extends the 'show' and 'merge' commands in llvm-profdata to handle
sample PGO formats. Using the 'merge' command it is now possible to convert
one sample PGO format to another.
The only format that is currently not working is 'gcc'. I still need to
implement support for it in lib/ProfileData.
The changes in the sample profile support classes are needed for the
merge operation.
Reviewers: bogner
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D6065
llvm-svn: 221032
"[x86] Simplify vector selection if condition value type matches vselect value type and true value is all ones or false value is all zeros."
llvm-svn: 221028
Change `Instruction::getAllMetadata()` to modify a vector of `Value`
instead of `MDNode` and update call sites. This is part of PR21433.
llvm-svn: 221027
Change `Instruction::getMetadata()` to return `Value` as part of
PR21433.
Update most callers to use `Instruction::getMDNode()`, which wraps the
result in a `cast_or_null<MDNode>`.
llvm-svn: 221024
Add `Instruction::getMDNode()` that casts to `MDNode` before changing
`Instruction::getMetadata()` to return `Value`. This avoids adding
`cast_or_null<MDNode>` boiler-plate throughout the code.
Part of PR21433.
llvm-svn: 221023
It appears to ignore or find ambiguous MachineInstrBuilder's conversion
operators that allow conversion to MachineInstr* and
MachineBasicBlock::bundle_iterator.
As a workaround, add an explicit way to get the MachineInstr.
llvm-svn: 221017
There is quiet a bit of logic that is common to any GlobalValue but was
duplicated for Functions, GlobalVariables and GlobalAliases.
While at it, merge visibility even when comdats are used, fixing pr21415.
llvm-svn: 221014
We have to use _MSC_FULL_VER here as CTP 2 and earlier didn't define
noexcept to my knowledge.
Fixes build error in lib/Support/Error.cpp when inheriting from
std::error_category, which has a noexcept virtual method.
llvm-svn: 221013
IMO we need to clean up some of these, but the member variable one
(C4458) has false positives on static methods. It is currently firing
on Twine, which has a static method like:
struct Twine {
uintptr_t LHS, RHS;
static void staticMethod() {
// warning C4458: declaration of 'LHS' hides class member
uintptr_t LHS;
...
}
};
We should fix up clang's -Wshadow and clean it up, and then we can
re-enable some of these MSVC warnings.
llvm-svn: 221012
The getBinary and getBuffer method now return ordinary pointers of appropriate
const-ness. Ownership is transferred by calling takeBinary(), which returns a
pair of the Binary and a MemoryBuffer.
llvm-svn: 221003
We need to figure out how to track ptrtoint values all the
way until result is converted back to a pointer in order
to correctly rewrite the pointer type.
llvm-svn: 220997
Now that we have initial support for VSX, we can begin adding
intrinsics for programmer access to VSX instructions. This patch adds
basic support for VSX intrinsics in general, and tests it by
implementing intrinsics for minimum and maximum for the vector double
data type.
The LLVM portion of this is quite straightforward. There is a
companion patch for Clang.
llvm-svn: 220988
Our internal test reveals such case should not be transformed:
cmp x17, #3
b.lt .LBB10_15
...
subs x12, x12, #1
b.gt .LBB10_1
where x12 is a liveout, becomes:
cmp x17, #2
b.le .LBB10_15
...
subs x12, x12, #2
b.ge .LBB10_1
Unable to provide test case as it's difficult to reproduce on community branch.
http://reviews.llvm.org/D6048
Patch by Zhaoshi Zheng <zhaoshiz@codeaurora.org>!
llvm-svn: 220987
This patch adds an optimization in CodeGenPrepare to move an extractelement
right before a store when the target can combine them.
The optimization may promote any scalar operations to vector operations in the
way to make that possible.
** Context **
Some targets use different register files for both vector and scalar operations.
This means that transitioning from one domain to another may incur copy from one
register file to another. These copies are not coalescable and may be expensive.
For example, according to the scheduling model, on cortex-A8 a vector to GPR
move is 20 cycles.
** Motivating Example **
Let us consider an example:
define void @foo(<2 x i32>* %addr1, i32* %dest) {
%in1 = load <2 x i32>* %addr1, align 8
%extract = extractelement <2 x i32> %in1, i32 1
%out = or i32 %extract, 1
store i32 %out, i32* %dest, align 4
ret void
}
As it is, this IR generates the following assembly on armv7:
vldr d16, [r0] @vector load
vmov.32 r0, d16[1] @ cross-register-file copy: 20 cycles
orr r0, r0, #1 @ scalar bitwise or
str r0, [r1] @ scalar store
bx lr
Whereas we could generate much faster code:
vldr d16, [r0] @ vector load
vorr.i32 d16, #0x1 @ vector bitwise or
vst1.32 {d16[1]}, [r1:32] @ vector extract + store
bx lr
Half of the computation made in the vector is useless, but this allows to get
rid of the expensive cross-register-file copy.
** Proposed Solution **
To avoid this cross-register-copy penalty, we promote the scalar operations to
vector operations. The penalty will be removed if we manage to promote the whole
chain of computation in the vector domain.
Currently, we do that only when the chain of computation ends by a store and the
target is able to combine an extract with a store.
Stores are the most likely candidates, because other instructions produce values
that would need to be promoted and so, extracted as some point[1]. Moreover,
this is customary that targets feature stores that perform a vector extract (see
AArch64 and X86 for instance).
The proposed implementation relies on the TargetTransformInfo to decide whether
or not it is beneficial to promote a chain of computation in the vector domain.
Unfortunately, this interface is rather inaccurate for this level of details and
although this optimization may be beneficial for X86 and AArch64, the inaccuracy
will lead to the optimization being too aggressive.
Basically in TargetTransformInfo, everything that is legal has a cost of 1,
whereas, even if a vector type is legal, usually a vector operation is slightly
more expensive than its scalar counterpart. That will lead to too many
promotions that may not be counter balanced by the saving of the
cross-register-file copy. For instance, on AArch64 this penalty is just 4
cycles.
For now, the optimization is just enabled for ARM prior than v8, since those
processors have a larger penalty on cross-register-file copies, and the scope is
limited to basic blocks. Because of these two factors, we limit the effects of
the inaccuracy. Indeed, I did not want to build up a fancy cost model with block
frequency and everything on top of that.
[1] We can imagine targets that can combine an extractelement with other
instructions than just stores. If we want to go into that direction, the current
interfaces must be augmented and, moreover, I think this becomes a global isel
problem.
Differential Revision: http://reviews.llvm.org/D5921
<rdar://problem/14170854>
llvm-svn: 220978
Tested this by #if 0'ing out the pthreads implementation, which
indicated that this fallback was not currently compiling successfully
and applying this patch resolves that.
Patch by Andy Chien.
llvm-svn: 220969
In a case where we have a no {un,}signed wrap flag on the increment, if
RHS - Start is constant then we can avoid inserting a max operation bewteen
the two, since we can statically determine which is greater.
This allows us to unroll loops such as:
void testcase3(int v) {
for (int i=v; i<=v+1; ++i)
f(i);
}
llvm-svn: 220960
Since block address values can be larger than 2GB in 64-bit code, they
cannot be loaded simply using an @l / @ha pair, but instead must be
loaded from the TOC, just like GlobalAddress, ConstantPool, and
JumpTable values are.
The commit also fixes a bug in PPCLinuxAsmPrinter::doFinalization where
temporary labels could not be used as TOC values, since code would
attempt (and fail) to use GetOrCreateSymbol to create a symbol of the
same name as the temporary label.
llvm-svn: 220959
Since JIT->MCJIT migration, most of the ExecutionEngine interface
became deprecated and/or broken. This especially affected the OCaml
bindings, as runFunction is no longer available, and unlike in C,
it is not possible to coerce a pointer to a function and call it
in OCaml.
In practice, LLVM 3.5 shipped completely unusable
Llvm_executionengine.
The GenericValue interface and runFunction were essentially
a poor man's FFI. As such, this interface was removed and instead
a dependency on ctypes >=0.3 added, which handled platform-specific
aspects of accessing data and calling functions.
The new interface does not expose JIT (which is a shim around MCJIT),
as well as the interpreter (which can't handle a lot of valid IR).
Llvm_executionengine.add_global_mapping is currently unusable
due to PR20656.
llvm-svn: 220957
r212242 introduced a legalizer hook, originally to let AArch64 widen
v1i{32,16,8} rather than scalarize, because the legalizer expected, when
scalarizing the result of a conversion operation, to already have
scalarized the operands. On AArch64, v1i64 is legal, so that commit
ensured operations such as v1i32 = trunc v1i64 wouldn't assert.
It did that by choosing to widen v1 types whenever possible. However,
v1i1 types, for which there's no legal widened type, would still trigger
the assert.
This commit fixes that, by only scalarizing a trunc's result when the
operand has already been scalarized, and introducing an extract_elt
otherwise.
This is similar to r205625.
Fixes PR20777.
llvm-svn: 220937
Summary: This is a fix for the command line syntax error while building LTO when using MinGW.
Patch By: jsroemer
Reviewers: rnk
Reviewed By: rnk
Subscribers: rnk, beanz, llvm-commits
Differential Revision: http://reviews.llvm.org/D5476
llvm-svn: 220935
Earlier this summer I fixed an issue where we were incorrectly combining
multiple loads that had different constraints such alignment, invariance,
temporality, etc. Apparently in one case I made copt paste error and swapped
alignment and invariance.
Tests included.
rdar://18816719
llvm-svn: 220933
Summary:
This patch adds an llvm_call_once which is a wrapper around std::call_once on platforms where it is available and devoid of bugs. The patch also migrates the ManagedStatic mutex to be allocated using llvm_call_once.
These changes are philosophically equivalent to the changes added in r219638, which were reverted due to a hang on Win32 which was the result of a bug in the Windows implementation of std::call_once.
Reviewers: aaron.ballman, chapuni, chandlerc, rnk
Reviewed By: rnk
Subscribers: majnemer, llvm-commits
Differential Revision: http://reviews.llvm.org/D5922
llvm-svn: 220932
We're using cl::opt here, but for some reason we're reading out one
particular option by hand instead. This makes -help and the like
behave rather poorly, so let's not do it this way.
llvm-svn: 220928
The langref says:
LLVM explicitly allows declarations of global variables to be marked
constant, even if the final definition of the global is not. This
capability can be used to enable slightly better optimization of the
program, but requires the language definition to guarantee that
optimizations based on the ‘constantness’ are valid for the
translation units that do not include the definition.
Given that definition, when merging two declarations, we have to drop
constantness if of of them is not marked contant, since the Module
without the constant marker might not have the necessary guarantees.
llvm-svn: 220927