r187874 seems to have been missed by the build bot infrastructure, and
the subsequent commits to compiler-rt don't seem to be queuing up new
build requsets. Hopefully this will.
As it happens, having the space here is the more common formatting. =]
llvm-svn: 187879
using it to detect whether or not a terminal supports colors. This
replaces a particularly egregious hack that merely compared the TERM
environment variable to "dumb". That doesn't really translate to
a reasonable experience for users that have actually ensured their
terminal's capabilities are accurately reflected.
This makes testing a terminal for color support somewhat more expensive,
but it is called very rarely anyways. The important fast path when the
output is being piped somewhere is already in place.
The global lock may seem excessive, but the spec for calling into curses
is *terrible*. The whole library is terrible, and I spent quite a bit of
time looking for a better way of doing this before convincing myself
that this was the fundamentally correct way to behave. The damage of the
curses library is very narrowly confined, and we continue to use raw
escape codes for actually manipulating the colors which is a much sane
system than directly using curses here (IMO).
If this causes trouble for folks, please let me know. I've tested it on
Linux and will watch the bots carefully. I've also worked to account for
the variances of curses interfaces that I could finde documentation for,
but that may not have been sufficient.
llvm-svn: 187874
lld has a hashtable with StringRef keys; it needs to iterate over the keys in
*insertion* order. This is currently implemented as std::vector<StringRef> +
DenseMap<StringRef, T>. This will probably need a proper
DenseMapInfo<StringRef> if we don't want to lose memory/performance by
migrating to a different data structure.
llvm-svn: 187868
for StringRef with a StringMap
The bug is that the empty key compares equal to the tombstone key.
Also added an assertion to DenseMap to catch similar bugs in future.
llvm-svn: 187866
- Since we only have a few of these, use the cumbersome method of getting the
exception object from 'sys' to retain the current pre-2.6 compatibility.
llvm-svn: 187854
One use needs to copy the alloca into a std::string, and the other use
is before calling CreateProcess, which is very heavyweight anyway.
llvm-svn: 187845
Previously this check was guarded by MSVC, which doesn't distinguish
between the compiler and the headers/library. This enables clang to
compile more of LLVM on Windows with Microsoft headers.
Remove some unused macros while I'm here: error_t and LTDL stuff.
llvm-svn: 187839
Summary:
This is a second attempt to get this right. After reading the Unicode
Standard I came up with the code that uses definitions of "printable" and
"column width" more suitable for terminal output (i.e. fixed-width fonts and
special treatment of many control characters).
The implementation here can probably be used for Windows and MacOS if someone
can test it properly.
The patch addresses PR14910.
Reviewers: jordan_rose, gribozavr
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1253
llvm-svn: 187837
Since the VSrc_* register classes contain both VGPRs and SGPRs, copies
that used be emitted by isel like this:
SGPR = COPY VGPR
Will now be emitted like this:
VSrC = COPY VGPR
This patch also adds a pass that tries to identify and fix situations where
a VGPR to SGPR copy may occur. Hopefully, these changes will make it
impossible for the compiler to generate illegal VGPR to SGPR copies.
llvm-svn: 187831
The globals being generated here were given the 'private' linkage type. However,
this caused them to end up in different sections with the wrong prefix. E.g.,
they would be in the __TEXT,__const section with an 'L' prefix instead of an 'l'
(lowercase ell) prefix.
The problem is that the linker will eat a literal label with 'L'. If a weak
symbol is then placed into the __TEXT,__const section near that literal, then it
cannot distinguish between the literal and the weak symbol.
Part of the problems here was introduced because the address sanitizer converted
some C strings into constant initializers with trailing nuls. (Thus putting them
in the __const section with the wrong prefix.) The others were variables that
the address sanitizer created but simply had the wrong linkage type.
llvm-svn: 187827
LLVM's coding standards recommend raw_ostream and MemoryBuffer for
reading and writing text.
This has the side effect of allowing clang to compile more of Support
and TableGen in the Microsoft C++ ABI.
llvm-svn: 187826
Also remove checking of llvm.dbg.sp since it is not used in generating dwarf.
Current state of Finder:
DebugInfoFinder tries to list all debug info MDNodes used in a module. To
list debug info MDNodes used by an instruction, DebugInfoFinder provides
processDeclare, processValue and processLocation to handle DbgDeclareInst,
DbgValueInst and DbgLoc attached to instructions. processModule will go
through all DICompileUnits in llvm.dbg.cu and list debug info MDNodes
used by the CUs.
TODO:
1> Finder has a list of CUs, SPs, Types, Scopes and global variables. We
need to add a list of variables that are used by DbgDeclareInst and
DbgValueInst.
2> MDString fields should be null or isa<MDString> and MDNode fields should be
null or isa<MDNode>. We currently use empty string or int 0 to represent null.
3> Go though Verify functions and make sure that they check field types.
4> Clean up existing testing cases to remove llvm.dbg.sp and make sure each
testing case has a llvm.dbg.cu.
Re-apply r187609 with fix to pass ocaml binding. vmcore.ml generates a debug
location with scope being metadata !{}, in verifier we treat this as a null
scope.
llvm-svn: 187812
The PPC backend had been missing a pattern to generate mulli for 64-bit
multiples. We had been generating it only for 32-bit multiplies. Unfortunately,
generating li + mulld unnecessarily increases register pressure.
llvm-svn: 187807
This change converts the NVPTX target to use the MC infrastructure
instead of directly emitting MachineInstr instances. This brings
the target more up-to-date with LLVM TOT, and should fix PR15175
and PR15958 (libNVPTXInstPrinter is empty) as a side-effect.
llvm-svn: 187798
fix for: Bug 16694 - ExecutionEngine/test-interp-vec-loadstore.ll failing on powerpc-darwin8 (http://llvm.org/bugs/show_bug.cgi?id=16694)
The ExecutionEngine/test-interp-vec-loadstore.ll test has been failing on powerpc-darwin8 (on other platforms it passed)
the reason of fail was wrong output by printf. this output is checked by FileCheck, but on little-endian powerpc the output numeric data were printed inside out and FileCheck reported fail.
the printfs have been replaced by checking data inside test and numeric output has been replaced by the text output like : "int test passed, float test passed". The text output is checked by FileCheck.
the dependency on data layout has been removed.
done by Yuri Veselov (Intel)
llvm-svn: 187791
This change came about primarily because of two issues in the existing code.
Niether of:
define i64 @test1(i64 %val) {
%in = trunc i64 %val to i32
tail call i32 @ret32(i32 returned %in)
ret i64 %val
}
define i64 @test2(i64 %val) {
tail call i32 @ret32(i32 returned undef)
ret i32 42
}
should be tail calls, and the function sameNoopInput is responsible. The main
problem is that it is completely symmetric in the "tail call" and "ret" value,
but in reality different things are allowed on each side.
For these cases:
1. Any truncation should lead to a larger value being generated by "tail call"
than needed by "ret".
2. Undef should only be allowed as a source for ret, not as a result of the
call.
Along the way I noticed that a mismatch between what this function treats as a
valid truncation and what the backends see can lead to invalid calls as well
(see x86-32 test case).
This patch refactors the code so that instead of being based primarily on
values which it recurses into when necessary, it starts by inspecting the type
and considers each fundamental slot that the backend will see in turn. For
example, given a pathological function that returned {{}, {{}, i32, {}}, i32}
we would consider each "real" i32 in turn, and ask if it passes through
unchanged. This is much closer to what the backend sees as a result of
ComputeValueVTs.
Aside from the bug fixes, this eliminates the recursion that's going on and, I
believe, makes the bulk of the code significantly easier to understand. The
trade-off is the nasty iterators needed to find the real types inside a
returned value.
llvm-svn: 187787
Without explicit dependencies, both per-file action and in-CommonTableGen action could run in parallel.
It races to emit *.inc files simultaneously.
llvm-svn: 187780
This virtual function can be implemented by targets to specify the type
to use for the index operand of INSERT_VECTOR_ELT, EXTRACT_VECTOR_ELT,
INSERT_SUBVECTOR, EXTRACT_SUBVECTOR. The default implementation returns
the result from TargetLowering::getPointerTy()
The previous code was using TargetLowering::getPointerTy() for vector
indices, because this is guaranteed to be legal on all targets. However,
using TargetLowering::getPointerTy() can be a problem for targets with
pointer sizes that differ across address spaces. On such targets,
when vectors need to be loaded or stored to an address space other than the
default 'zero' address space (which is the address space assumed by
TargetLowering::getPointerTy()), having an index that
is a different size than the pointer can lead to inefficient
pointer calculations, (e.g. 64-bit adds for a 32-bit address space).
There is no intended functionality change with this patch.
llvm-svn: 187748
Our internal regex implementation does not cope with large numbers
of anchors very efficiently. Given a ~3600-entry special case list,
regex compilation can take on the order of seconds. This patch solves
the problem for the special case of patterns matching literal global
names (i.e. patterns with no regex metacharacters). Rather than
forming regexes from literal global name patterns, add them to
a StringSet which is checked before matching against the regex.
This reduces regex compilation time by an order of roughly thousands
when reading the aforementioned special case list, according to a
completely unscientific study.
No test cases. I figure that any new tests for this code should
check that regex metacharacters are properly recognised. However,
I could not find any documentation which documents the fact that the
syntax of global names in special case lists is based on regexes.
The extent to which regex syntax is supported in special case lists
should probably be decided on/documented before writing tests.
Differential Revision: http://llvm-reviews.chandlerc.com/D1150
llvm-svn: 187732
This will be used to implement an optimisation for literal entries
in special case lists.
Differential Revision: http://llvm-reviews.chandlerc.com/D1278
llvm-svn: 187731
This patch just uses a peephole test for "add; compare; branch" sequences
within a single block. The IR optimizers already convert loops to
decrement-and-branch-on-nonzero form in some cases, so even this
simplistic test triggers many times during a clang bootstrap and
projects/test-suite run. It looks like there are still cases where we
need to more strongly prefer branches on nonzero though. E.g. I saw a
case where a loop that started out with a check for 0 ended up with a
check for -1. I'll try to look at that sometime.
I ended up adding the Reference class because MachineInstr::readsRegister()
doesn't check for subregisters (by design, as far as I could tell).
llvm-svn: 187723
Perhaps predictably, doing comparison elimination on the fly during
SystemZLongBranch turned out to be a bad idea. The next patches make
use of LOAD AND TEST and BRANCH ON COUNT, both of which require
changes to earlier instructions.
No functionality change intended.
llvm-svn: 187718
helper functions. This can be optimized out later when the remaining
parts of the helper function work is moved into the Mips16HardFloat pass.
For now it forces us to use the 32 bit save/restore instructions instead
of the 16 bit ones.
llvm-svn: 187712
Due to the weird and wondeful usual arithmetic conversions, some
calculations involving negative values were getting performed in
uint32_t and then promoted to int64_t, which is really not a good
idea.
Patch by Katsuhiro Ueno.
llvm-svn: 187703
Internally, the PowerPC backend names the 32-bit GPRs R[0-9]+, and names the
64-bit parent GPRs X[0-9]+. When matching inline assembly constraints with
explicit register names, on PPC64 when an i64 MVT has been requested, we need
to follow gcc's convention of using r[0-9]+ to refer to the 64-bit (parent)
registers.
At some point, we'll probably want to arrange things so that the generic code
in TargetLowering uses the AsmName fields declared in *RegisterInfo.td in order
to match these inline asm register constraints. If we do that, this change can
be reverted.
llvm-svn: 187693
Recent versions of the OS X linker support this but follow the existing
OS X linker convention of using an underscore in the option name, i.e.,
-export_dynamic. Rather than changing our configure scripts to check for
that alternate spelling, it is simpler to just use the compiler's -rdynamic
option and let it deal with translating that to the appropriate linker
option. One potential disadvantage of this approach is that the compiler
will typically ignore -rdynamic on platforms where it is not supported, so
the HAVE_LINK_EXPORT_DYNAMIC in config.h will not necessarily show whether
that option has any effect or not. I don't see any in-tree uses of that
macro, so I'm assuming it is OK.
llvm-svn: 187686
Everything that comes after -- should be treated as a filename. This
enables passing in filenames that would otherwise be conflated with
command-line options.
This is especially important for clang-cl which supports options
starting with /, which are easily conflatable with Unix-style
path names.
Differential Revision: http://llvm-reviews.chandlerc.com/D1274
llvm-svn: 187675
The ExtractLoops function tries to reduce the failing test case by extracting
one or more loops from the misoptimized piece of the program. In doing this,
ExtractLoops must keep the MiscompiledFunctions vector up-to-date by ensuring
that the pointers refer to functions in the current failing program.
Unfortunately, this is not trivial because:
- ExtractLoops is iterative, and there are several early exits (and the
MiscompiledFunctions vector must be consistent with the current program at
every non-fatal exit point).
- Several of the utility functions used by ExtractLoops (such as
TestOptimizer, some of which are called through the TestFn callback
parameter, and Linker::LinkModules) delete their inputs upon success.
This change adds several updates of the MiscompiledFunctions vector at
different points. The first is after the initial call to TestMergedProgram
which checks that the loop-extracted program still works. The second is after
the call to TestFn (TestOptimizer, for example). This function will delete its
inputs (which is why the existing ExtractLoops logic cloned the inputs first).
llvm-svn: 187674
This patch fixes the multiple breakages on ARM test-suite after the SLP
vectorizer was introduced by default on O3. The problem was an illegal
vector type on ARMTTI::getCmpSelInstrCost() <3 x i1> which is not simple.
The guard protects this code from breaking (cause of the problems) but
doesn't fix the issue that is generating the odd vector in the first
place, which also needs to be investigated.
llvm-svn: 187658
Function attributes are the future! So just query whether we want to realign the
stack directly from the function instead of through a random target options
structure.
llvm-svn: 187618
This is actually an LLVM bug in the way it generates signatures for these
when soft float is enabled. For example, floor ends up having the signature
of int64(int64). The signature part is not the same as where the actual
parameter types are recorded, and those ARE of course int64(int64) when
soft float is enabled. (Yes, Mips16 hard float uses soft float but with
different runtime rounes but then has to interoperate with Mips32 using
normal floating point). This logic will eventually be moved to the
Mips16HardFloat pass so it's not worth sorting out these issues in LLVM
since nobody but Mips16 cares about these signatures, as far as I know,
and even I won't eventually either.
llvm-svn: 187613
Also remove checking of llvm.dbg.sp since it is not used in generating dwarf.
Current state of Finder:
DebugInfoFinder tries to list all debug info MDNodes used in a module. To
list debug info MDNodes used by an instruction, DebugInfoFinder provides
processDeclare, processValue and processLocation to handle DbgDeclareInst,
DbgValueInst and DbgLoc attached to instructions. processModule will go
through all DICompileUnits in llvm.dbg.cu and list debug info MDNodes
used by the CUs.
TODO:
1> Finder has a list of CUs, SPs, Types, Scopes and global variables. We
need to add a list of variables that are used by DbgDeclareInst and
DbgValueInst.
2> MDString fields should be null or isa<MDString> and MDNode fields should be
null or isa<MDNode>. We currently use empty string or int 0 to represent null.
3> Go though Verify functions and make sure that they check field types.
4> Clean up existing testing cases to remove llvm.dbg.sp and make sure each
testing case has a llvm.dbg.cu.
llvm-svn: 187609
This is another case where internalize hides a symbol that is needed by
a loadable module. I am currently investigating a proper fix but this patch
will get our buildbot to pass in the meantime. <rdar://problem/14578094>
llvm-svn: 187601
* Added R600_Reg64 class
* Added T#Index#.XY registers definition
* Added v2i32 register reads from parameter and global space
* Added f32 and i32 elements extraction from v2f32 and v2i32
* Added v2i32 -> v2f32 conversions
Tom Stellard:
- Mark vec2 operations as expand. The addition of a vec2 register
class made them all legal.
Patch by: Dmitry Cherkassov
Signed-off-by: Dmitry Cherkassov <dcherkassov@gmail.com>
llvm-svn: 187582
This also fixes a bug in the predication of LR to LOCR: I'd forgotten
that with these in-place instruction builds, the implicit operands need
to be added manually. I think this was latent until now, but is tested
by int-cmp-45.c. It also adds a CC valid mask to STOC, again tested by
int-cmp-45.c.
llvm-svn: 187573
Convert >= 1 to > 0, etc. Using comparison with zero isn't a win on its own,
but it exposes more opportunities for CC reuse (the next patch).
llvm-svn: 187571
Patch by Ana Pazos.
- Completed implementation of instruction formats:
AdvSIMD three same
AdvSIMD modified immediate
AdvSIMD scalar pairwise
- Completed implementation of instruction classes
(some of the instructions in these classes
belong to yet unfinished instruction formats):
Vector Arithmetic
Vector Immediate
Vector Pairwise Arithmetic
- Initial implementation of instruction formats:
AdvSIMD scalar two-reg misc
AdvSIMD scalar three same
- Intial implementation of instruction class:
Scalar Arithmetic
- Initial clang changes to support arm v8 intrinsics.
Note: no clang changes for scalar intrinsics function name mangling yet.
- Comprehensive test cases for added instructions
To verify auto codegen, encoding, decoding, diagnosis, intrinsics.
llvm-svn: 187567
This makes option aliases more powerful by enabling them to
pass along arguments to the option they're aliasing.
For example, if we have a joined option "-foo=", we can now
specify a flag option "-bar" to be an alias of that, with the
argument "baz".
This is especially useful for the cl.exe compatible clang driver,
where many options are aliases. For example, this patch enables
us to alias "/Ox" to "-O3" (-O is a joined option), and "/WX" to
"-Werror" (again, -W is a joined option).
Differential Revision: http://llvm-reviews.chandlerc.com/D1245
llvm-svn: 187537
While the .td entry is nice and all, it takes a pretty gross hack in
ARMAsmParser::ParseInstruction() because of handling of other "subs"
instructions to get it to match. Ran it by Jim Grosbach and he said it was
about what he expected to make this work given the existing code.
rdar://14214063
llvm-svn: 187530
If we merge vector when a vector is used, it will generate an artificial
antidependency that can prevent 2 tex/vtx instructions to use the same
clause and thus generate extra clauses that reduce performance.
There is no test case as such situation is really hard to predict.
llvm-svn: 187516
There are a lot of restrictions on instruction groups that contain
LDS instructions, so for now we will be conservative and not packetize
anything else with them.
llvm-svn: 187513
We were using two instructions for similar purpose : break and
predicated break. Only predicated_break was emitted and it was
lowered at R600ControlFlowFinalizer to JUMP;CF_BREAK;POP.
This commit simplify the situation by making AMDILCFGStructurizer
emit IF_PREDICATE;BREAK;ENDIF; instead of predicated_break (which
is now removed).
There is no functionality change.
llvm-svn: 187510
The loop optimizers were assuming that scales > 1 were OK. I think this
is actually a bug in TargetLoweringBase::isLegalAddressingMode(),
since it seems to be trying to reject anything that isn't r+i or r+r,
but it has no default case for scales other than 0, 1 or 2. Implementing
the hook for z means that z can no longer test any change there though.
llvm-svn: 187497
Extend r187495 to conditional loads. I split this out because the
easiest way seemed to be to force a particular operand order in
SystemZISelDAGToDAG.cpp.
llvm-svn: 187496
System z branches have a mask to select which of the 4 CC values should
cause the branch to be taken. We can invert a branch by inverting the mask.
However, not all instructions can produce all 4 CC values, so inverting
the branch like this can lead to some oddities. For example, integer
comparisons only produce a CC of 0 (equal), 1 (less) or 2 (greater).
If an integer EQ is reversed to NE before instruction selection,
the branch will test for 1 or 2. If instead the branch is reversed
after instruction selection (by inverting the mask), it will test for
1, 2 or 3. Both are correct, but the second isn't really canonical.
This patch therefore keeps track of which CC values are possible
and uses this when inverting a mask.
Although this is mostly cosmestic, it fixes undefined behavior
for the CIJNLH in branch-08.ll. Another fix would have been
to mask out bit 0 when generating the fused compare and branch,
but the point of this patch is that we shouldn't need to do that
in the first place.
The patch also makes it easier to reuse CC results from other instructions.
llvm-svn: 187495
r187116 moved compare-and-branch generation from the instruction-selection
pass to the peephole optimizer (via optimizeCompare). It turns out that even
this is a bit too early. Fused compare-and-branch instructions don't
interact well with predication, where a CC result is needed. They also
make it harder to reuse the CC side-effects of earlier instructions
(not yet implemented, but the subject of a later patch).
Another problem was that the AnalyzeBranch family of routines weren't
handling compares and branches, so we weren't able to reverse the fused
form in cases where we would reverse a separate branch. This could have
been fixed by extending AnalyzeBranch, but given the other problems,
I've instead moved the fusing to the long-branch pass, which is also
responsible for the opposite transformation: splitting out-of-range
compares and branches into separate compares and long branches.
I've added a test for the AnalyzeBranch problem. A test for the
predication problem is included in the next patch, which fixes a bug
in the choice of CC mask.
llvm-svn: 187494
r186399 aggressively used the RISBG instruction for immediate ANDs,
both because it can handle some values that AND IMMEDIATE can't,
and because it allows the destination register to be different from
the source. I realized later while implementing the distinct-ops
support that it would be better to leave the choice up to
convertToThreeAddress() instead. The AND IMMEDIATE form is shorter
and is less likely to be cracked.
This is a problem for 32-bit ANDs because we assume that all 32-bit
operations will leave the high word untouched, whereas RISBG used in
this way will either clear the high word or copy it from the source
register. The patch uses the z196 instruction RISBLG for this instead.
This means that z10 will be restricted to NILL, NILH and NILF for
32-bit ANDs, but I think that should be OK for now. Although we're
using z10 as the base architecture, the optimization work is going
to be focused more on z196 and zEC12.
llvm-svn: 187492
All insertf*/extractf* functions replaced with insert/extract since we have insertf and inserti forms.
Added lowering for INSERT_VECTOR_ELT / EXTRACT_VECTOR_ELT for 512-bit vectors.
Added lowering for EXTRACT/INSERT subvector for 512-bit vectors.
Added a test.
llvm-svn: 187491