It is currently broken because it reads the wrong value from the profile
(heap instead of total). Also make it faster by reading /proc/self/statm;
reading /proc/self/smaps can consume more than 50% of the running time on
beefy apps if done every 100ms.
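For reference, a minimal sketch of the cheaper read (illustrative, not the
actual code; /proc/self/statm holds space-separated counters in pages, of
which the second is the resident set):

  #include <cstdio>
  #include <unistd.h>

  static long GetRSSBytes() {
    FILE *F = fopen("/proc/self/statm", "r");
    if (!F)
      return -1;
    long SizePages = 0, RSSPages = 0;
    int Matched = fscanf(F, "%ld %ld", &SizePages, &RSSPages);
    fclose(F);
    if (Matched != 2)
      return -1; // unexpected format
    // statm counts pages, so scale by the page size.
    return RSSPages * sysconf(_SC_PAGESIZE);
  }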
llvm-svn: 213942
Specifically, the part where we removed a warning to be compatible with
GCC; that removal has been widely regarded as a bad idea.
I'm not quite happy with how obtuse this warning is, especially in the fairly common case of a 32-bit integer literal, so I've got another patch awaiting review that adds a fixit to reduce confusion.
llvm-svn: 213935
SDValues, fixing the two bugs left in the regression suite.
The key for both of these was the use of a single value type rather than
a VTList, which caused an unintentional single-result merge-value node.
Fix this by getting the appropriate VTList in place.
Doing this exposed that the comments in x86's code about how MUL_LOHI
operands are handled are wrong. The bug with the use of out-of-range
result numbers was hiding the bug about the order of operands here (as
best I can tell). There are more places where the code appears to get
this backwards still...
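For illustration, the shape of the fix (a sketch assuming DAG, DL, VT,
LHS, and RHS are in scope; not the literal patch):

  // A two-result node needs a VTList; passing a single VT here would
  // silently create a single-result merge-value node, the bug above.
  SDVTList VTs = DAG.getVTList(VT, VT);
  SDValue MulLoHi = DAG.getNode(ISD::UMUL_LOHI, DL, VTs, LHS, RHS);
  SDValue Lo = MulLoHi.getValue(0); // low half of the product
  SDValue Hi = MulLoHi.getValue(1); // high half of the product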
llvm-svn: 213931
with a result number outside the range of results for the node.
I don't know how we managed to not really check this very basic
invariant for so long, but the code is *very* broken at this point.
I have over 270 test failures with the assert enabled. I'm committing it
disabled so that others can join in the cleanup effort and reproduce the
issues. I've also included one of the obvious fixes that I already
found. More fixes to come.
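The invariant in question, roughly as asserted in the SDValue constructor
(shown disabled, as committed):

  SDValue::SDValue(SDNode *node, unsigned resno)
      : Node(node), ResNo(resno) {
  #if 0 // Disabled until the ~270 failures above are cleaned up.
    assert((!Node || ResNo < Node->getNumValues()) &&
           "Invalid result number for the given node!");
  #endif
  }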
llvm-svn: 213926
assembly instructions.
This is necessary to ensure the ARM assembler switches to Thumb mode before
it starts assembling the file-level inline assembly instructions at the
beginning of a .s file.
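For example (hypothetical source; the label name is invented), file-scope
inline assembly like this must be assembled as Thumb when the module
targets Thumb:

  // Module-level (file-scope) inline assembly in a C/C++ source file.
  asm(".globl early_init\n"
      "early_init:\n"
      "  bx lr\n");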
<rdar://problem/17757232>
llvm-svn: 213924
* Track override set across module load and save
* Track originating module to allow proper re-export of #undef
* Make override set properly transitive when it picks up a #undef
This fixes nearly all of the remaining macro issues with self-host.
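A hypothetical illustration of the transitive case (module and macro names
invented): module A defines FOO in a.h; module B includes a.h and #undefs
FOO in b.h; a user of B should then see FOO as undefined:

  #include "b.h" // imports module B, which imports module A
  #ifdef FOO
  #error "FOO should have been #undef'd by module B"
  #endif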
llvm-svn: 213922
StringMap doesn't guarantee any particular iteration order; this is
suboptimal when comparing llvm-vtabledump's output for two
object files.
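A minimal sketch of the usual remedy (assuming this is roughly what the
change does): pull the keys out, sort them, and print in that order so the
output is deterministic:

  #include "llvm/ADT/StringMap.h"
  #include "llvm/ADT/StringRef.h"
  #include <algorithm>
  #include <vector>

  template <typename ValueT>
  static std::vector<llvm::StringRef>
  sortedKeys(const llvm::StringMap<ValueT> &M) {
    std::vector<llvm::StringRef> Keys;
    Keys.reserve(M.size());
    for (const auto &Entry : M)
      Keys.push_back(Entry.getKey()); // iteration order is arbitrary
    std::sort(Keys.begin(), Keys.end());
    return Keys;
  }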
llvm-svn: 213921
Summary:
This patch extends the __asm parser to make it keep parsing input tokens
as inline assembly if a single-line __asm line is followed by another line
starting with __asm too. It also makes sure that we correctly keep
matching braces in such situations by separating the notions of how many
braces we are matching and whether we are in single-line asm block mode.
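For example, the parser now treats consecutive single-line __asm statements
as one contiguous block (hypothetical function):

  void zero_regs() {
    __asm mov eax, 0
    __asm mov ebx, 0 // parsed as a continuation of the same asm block
  }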
Reviewers: rnk
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D4598
llvm-svn: 213916
Because the PowerPC vmrgh* and vmrgl* instructions have a built-in
big-endian bias, it is necessary to swap their inputs in little-endian
mode when using them to implement a vector shuffle. This was
previously missed in the vector LE implementation.
There was already logic to distinguish between unary and "normal"
vmrg* vector shuffles, so this patch extends that logic to use a third
option: "swapped" vmrg* vector shuffles that are used for little
endian in place of the "normal" ones.
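As a scalar model of why the swap is needed (illustrative only), vmrghw
interleaves the "high" halves of its inputs, where element 0 is the high
element in big-endian numbering:

  #include <cstdint>

  static void vmrghw(const uint32_t a[4], const uint32_t b[4],
                     uint32_t out[4]) {
    out[0] = a[0]; out[1] = b[0]; // interleave the high halves
    out[2] = a[1]; out[3] = b[1];
  }

In little-endian numbering the same register lanes are indexed in the
opposite order, so the instruction realizes a different shuffle mask
unless its inputs are swapped; that is what the "swapped" variants capture.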
I've updated the vec-shuffle-le.ll test to check for the expected
register ordering on the generated instructions.
This bug was discovered when testing the LE and ELFv2 patches for
safety if they were backported to 3.4. A different vectorization
decision was made in 3.4 than on mainline trunk, and that exposed the
problem. I've verified this fix takes care of that issue.
llvm-svn: 213915
This change has the practical effect of fixing some backtrace
scenarios that would fail with inferiors running on the Android Art
host-side JVM under Linux x86_64 on Ubuntu 14.04.
See this lldb-commits thread for more details:
http://lists.cs.uiuc.edu/pipermail/lldb-commits/Week-of-Mon-20140721/011988.html
Change by Tong Shen.
Reviewed by Jason Molenda.
Tested:
Ubuntu 14.04 x86_64, clang-3.5-built lldb.
MacOSX 10.10 Preview 4, Xcode 6 Beta 4-built lldb.
llvm-svn: 213914
it through the normal TreeTransform logic for Exprs (which will strip off
implicit parts of the initialization and never re-create them).
llvm-svn: 213913
This patch implements the data structures, the reader and
the writers for the new code coverage mapping system.
The new code coverage mapping system uses instrumentation-based profiling
to provide code coverage analysis.
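Roughly, each mapping region ties a source range to a counter; the real
definitions live in llvm/ProfileData/CoverageMapping.h, and the field
names below are approximate:

  struct MappingRegionSketch {
    unsigned FileID;                 // file containing the source range
    unsigned LineStart, ColumnStart; // start of the covered range
    unsigned LineEnd, ColumnEnd;     // end of the covered range
    unsigned CounterID;              // counter (or expression) to evaluate
  };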
llvm-svn: 213910
This patch implements the data structures, the reader and
the writers for the new code coverage mapping system.
The new code coverage mapping system uses instrumentation-based profiling
to provide code coverage analysis.
llvm-svn: 213909
The -print-file-name option in llvm-nm precedes each symbol with the
object file it came from. While code for parsing this option and its
aliases existed, there was no code to implement it.
llvm-svn: 213906
This tool's job is to dump the vtables inside object files. It is
currently limited to MS ABI vf- and vb-tables but it will eventually
support Itanium-style v-tables as well.
Differential Revision: http://reviews.llvm.org/D4584
llvm-svn: 213903
Sometimes compilers emit data into code sections (e.g. constant pools or
jump tables). These runs of data can throw off disassemblers. The solution
in mach-o is that ranges of data-in-code are encoded into a table pointed to
by the LC_DATA_IN_CODE load command.
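Each table entry has the layout declared in <mach-o/loader.h> (reproduced
here from memory; consult the header for the authoritative definition):

  #include <cstdint>

  struct data_in_code_entry {
    uint32_t offset; // from mach_header to start of the data range
    uint16_t length; // number of bytes in the data range
    uint16_t kind;   // DICE_KIND_DATA, DICE_KIND_JUMP_TABLE32, ...
  };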
Data-in-code information is encoded into lld's Atom model by marking the
start and end of each data run with a Reference whose offset is the
start/end of the run. For ARM, the switch back to code also marks whether
it is Thumb or ARM code.
llvm-svn: 213901
hint) the loop unroller replaces the llvm.loop.unroll.count metadata with
llvm.loop.unroll.disable metadata to prevent any subsequent unrolling
passes from unrolling more than the hint indicates. This patch fixes
an issue where loop unrolling could also be disabled for other loops
that share the same llvm.loop metadata.
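A source-level illustration (hypothetical function): the pragma below
becomes llvm.loop.unroll.count metadata on this loop, and once the
unroller honors it the metadata is replaced with llvm.loop.unroll.disable;
the fix ensures that replacement does not also disable other loops that
happen to share the metadata node:

  void scale(int *a, int n) {
  #pragma clang loop unroll_count(4)
    for (int i = 0; i < n; ++i)
      a[i] *= 2;
  }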
llvm-svn: 213900
which have successfully round-tripped through the combine phase, and use
this to ensure all operands to DAG nodes are visited by the combiner,
even if they are only added during the combine phase.
This is critical to have the combiner reach nodes that are *introduced*
during combining. Previously these would sometimes be visited and
sometimes not be visited based on whether they happened to end up on the
worklist or not. Now we always run them through the combiner.
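A minimal sketch of the idea (names simplified; not the actual DAGCombiner
code): defer a node until every operand has round-tripped through the
combiner, so operands added mid-combine are never skipped:

  while (!Worklist.empty()) {
    SDNode *N = Worklist.pop_back_val();
    bool Deferred = false;
    for (unsigned i = 0, e = N->getNumOperands(); i != e; ++i) {
      SDNode *Op = N->getOperand(i).getNode();
      if (!CombinedNodes.count(Op)) {
        Worklist.push_back(Op); // combine the operand first
        Deferred = true;
      }
    }
    if (Deferred) {
      Worklist.push_back(N); // revisit once operands are done
      continue;
    }
    CombinedNodes.insert(N); // N has now round-tripped
    combine(N);
  }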
This fixes quite a few bad codegen test cases lurking in the suite while
also being more principled. Among these, the TLS code generation is
particularly exciting for programs that have this in the critical path,
like TSan-instrumented binaries (although I think they engineer to use
a different TLS that is faster anyway).
I've tried to check for compile-time regressions here by running llc
over a merged (but not LTO-ed) clang bitcode file and observed at most
a 3% slowdown in llc. Given that this is essentially a worst case (none
of opt or clang are running at this phase) I think this is tolerable.
The actual LTO case should be even less costly, and the cost in normal
compilation should be negligible.
With this combining logic, it is possible to re-legalize as we combine
which is necessary to implement PSHUFB formation on x86 as
a post-legalize DAG combine (my ultimate goal).
Differential Revision: http://reviews.llvm.org/D4638
llvm-svn: 213898
vector operation legalization with support for custom target lowering
and fallback to expand when it fails, and use this to implement sext and
anyext load lowering for x86 in a more principled way.
Previously, the x86 backend relied on a target DAG combine to "combine
away" sextload and extload nodes prior to legalization, or would expand
them during legalization with terrible code. This is particularly
problematic because the DAG combine relies on running over non-canonical
DAG nodes at just the right time to match several common and important
patterns. It used a combine rather than lowering because we didn't have
good lowering support, and to expose some of the tricks being employed
to other combine phases.
With this change it becomes a proper lowering operation, the backend
marks that it can lower these nodes, and I've added support for handling
the canonical forms that don't have direct legal representations such as
sextload of a v4i8 -> v4i64 on AVX1. With this change, our test cases
for this behavior continue to pass even after the DAG combiner begins
running more systematically over every node.
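As a source-level picture of that canonical form (illustrative C++;
whether it vectorizes this way depends on the vectorizer), the loop below
amounts to a sextload of v4i8 -> v4i64 when vectorized at width 4:

  #include <cstdint>

  void widen(const int8_t *src, int64_t *dst) {
    for (int i = 0; i < 4; ++i)
      dst[i] = src[i]; // sign-extending load of i8 into i64
  }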
There is some noise caused by this in the test suite where we actually
use vector extends instead of subregister extraction. This doesn't
really seem like the right thing to do, but is unlikely to be a critical
regression. We do regress in one case where, by lowering to the
target-specific patterns early, we were able to combine away extraneous
legal math nodes. However, this regression is completely addressed by
switching to a widening-based legalization, which is what I'm working
toward anyway, so I've just switched the test to that mode.
Differential Revision: http://reviews.llvm.org/D4654
llvm-svn: 213897
The Microsoft ABI and MSVCRT are considered the canonical C runtime and ABI.
The long double routines are not part of this environment. However, Cygwin
and MinGW both provide supplementary implementations. Change the condition to
reflect this reality.
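A hypothetical sketch of the kind of guard involved (the macro name is
invented; this is not the literal diff):

  // Build the long double helpers whenever we are not targeting the MSVC
  // environment proper; Cygwin and MinGW still want them.
  #if !defined(_WIN32) || defined(__CYGWIN__) || defined(__MINGW32__)
  #define CRT_HAS_LONG_DOUBLE_ROUTINES 1
  #endif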
llvm-svn: 213896
This patch minimizes the number of nops that must be emitted on X86 to satisfy
stackmap shadow constraints.
To minimize the number of nops inserted, the X86AsmPrinter now records the
size of the most recent stackmap's shadow in the StackMapShadowTracker class,
and tracks the number of instruction bytes emitted since that stackmap
instruction was encountered. Padding is emitted (if it is required at all)
immediately before the next stackmap/patchpoint instruction, or at the end of
the basic block.
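A sketch of the bookkeeping (method names assumed; only the class name
comes from this commit):

  class StackMapShadowTracker {
    unsigned RequiredShadowSize = 0; // shadow requested by last stackmap
    unsigned CurrentShadowSize = 0;  // instruction bytes emitted since then
  public:
    void startShadow(unsigned Size) {
      RequiredShadowSize = Size;
      CurrentShadowSize = 0;
    }
    void countInstruction(unsigned Bytes) { CurrentShadowSize += Bytes; }
    // Nop bytes still owed before the next stackmap or end of block.
    unsigned requiredPadding() const {
      return CurrentShadowSize < RequiredShadowSize
                 ? RequiredShadowSize - CurrentShadowSize
                 : 0;
    }
  };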
This optimization should reduce code-size and improve performance for people
using the llvm stackmap intrinsic on X86.
<rdar://problem/14959522>
llvm-svn: 213892