Instead of unconditionally storing origin with every application store,
only do this when the shadow of the stored value is != 0.
This change also delays instrumentation of stores until after the walk over
the function's instructions, because adding new basic blocks confuses
InstVisitor.
We only keep one origin value per 4 bytes of application memory. This change
fixes a bug where a store of a single clean byte wiped the origin for the
whole 4-byte area.
Since stores of uninitialized values are relatively uncommon, this change
improves the performance of track-origins mode by a median of 5%, and by up
to 47% on SPEC.
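A minimal sketch of the resulting behavior, written as C++ for illustration
(this is not the actual MSan pass or runtime code; the shadow/origin address
helpers are made-up names for what the instrumented code computes inline):

  #include <cstdint>

  // Hypothetical helpers: real instrumented code derives these addresses
  // with fixed offsets and masks rather than calling functions.
  uint32_t *shadowFor(void *app_addr);
  uint32_t *originFor(void *app_addr);  // one origin slot per 4 app bytes

  void instrumentedStore32(uint32_t *addr, uint32_t val,
                           uint32_t shadow, uint32_t origin) {
    *shadowFor(addr) = shadow;    // shadow is still stored unconditionally
    if (shadow != 0)              // new: skip the origin update for fully
      *originFor(addr) = origin;  // clean values, so a clean store no longer
                                  // wipes the origin shared by its 4 bytes
    *addr = val;
  }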
llvm-svn: 169490
generally support the C++11 memory model requirements for bitfield
accesses by relying more heavily on LLVM's memory model.
The primary change this introduces is to move from a manually aligned
and strided access pattern across the bits of the bitfield to a much
simpler lump access of all bits in the bitfield followed by math to
extract the bits relevant for the particular field.
This simplifies the code significantly, but relies on LLVM to lower these
integers intelligently.
I have tested LLVM's lowering both synthetically and in benchmarks. The
lowering appears to be functional, and there are no really significant
performance regressions. Different code patterns accessing bitfields
will vary in how this impacts them. The only real regressions I'm seeing
are a few patterns where the LLVM code generation for loads that feed
directly into a mask operation doesn't take advantage of the x86 ability
to do a smaller load and a cheap zero-extension. This doesn't regress
any benchmark in the nightly test suite on my box past the noise
threshold, but my box is quite noisy. I'll be watching the LNT numbers,
and will look into further improvements to the LLVM lowering as needed.
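A rough model of the new access pattern, written as C++ for illustration
(the actual change is in Clang's IR generation, which emits one wide integer
load followed by shift/mask; the layout below assumes a typical little-endian
Itanium-style ABI):

  #include <cstdint>
  #include <cstring>

  struct S {
    unsigned a : 3;
    unsigned b : 6;
    unsigned c : 7;
  };

  // Load the whole storage unit once, then extract 'b' with arithmetic,
  // instead of a manually aligned and strided access per field.
  unsigned load_b(const S *s) {
    uint32_t bits;
    std::memcpy(&bits, s, sizeof(bits));  // single lump load of all bits
    return (bits >> 3) & 0x3F;            // skip a's 3 bits, mask b's 6 bits
  }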
llvm-svn: 169489
Some languages, e.g. Ada and Pascal, allow you to specify that the array bounds
are different from the default (1 in these cases). If we have a lower bound
that's non-default, then we emit the lower bound. We also calculate the correct
upper bound in those cases.
llvm-svn: 169484
We still need to do a recursive walk to determine all static/global variables
referenced by a block, which is needed for region invalidation.
llvm-svn: 169481
Don't require that, during template deduction, a template specialization type
as a function parameter has at least as many template arguments as one used in
a function argument (not even if the argument has been resolved to an exact
type); the additional parameters might be provided by default template
arguments in the template. We don't need this check, since we now implement
[temp.deduct.call]p4 with an additional check after deduction.
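For illustration only (made-up names, and no claim about what was accepted or
rejected before this change), this is the shape of code the wording above
refers to: the parameter's template-id is written with fewer arguments than
the specialization the argument resolves to, with the remainder supplied by
default template arguments.

  template <typename T, typename U = int> struct S {};

  template <typename T> void f(S<T>);  // written with a single argument

  void g() {
    S<char> x;  // this specialization is really S<char, int>
    f(x);       // T is deduced as char; U comes from the default argument
  }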
llvm-svn: 169475
Fixed zero-sized arrays to work correctly. This will only happen once we get a clang that emits correct debug info for zero-sized arrays. For now I have marked TestStructTypes.py as an expected failure.
llvm-svn: 169465
of the "self"/"this" pointer for the current stack
frame before wrapping expressions in C++ or
Objective-C methods. This works around bad debug
info where the compiler emits a "this" or "self"
but doesn't give any way to find its location.
<rdar://problem/12809985>
llvm-svn: 169461
- Removed the BitfieldMap class because it is unnecessary.
We now just track the most recently added field.
- Moved the code that calculates bitfield widths so it
can also be used to determine whether it's necessary
to insert anonymous fields (see the sketch below).
- Simplified the anonymous field calculation code into
three cases (two of which are resolved identically).
- Beefed up the bitfield testcase.
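As a loose illustration (hypothetical source, not taken from the testcase),
this is the kind of layout where an anonymous field has to be synthesized so
the remaining bitfields land at the right bit offsets:

  struct Bits {
    unsigned a : 3;
    unsigned   : 5;  // unnamed bitfield leaves a hole in the layout
    unsigned b : 4;  // an anonymous field must cover the 5-bit gap before 'b'
  };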
llvm-svn: 169449
A pair of RUN lines such as

  RUN: a
  RUN: b || true

is interpreted as "a && (b || true)" in Tcl mode, and as "(a && b) || true" in
sh mode.
Everyone seems to (quite reasonably) write tests assuming the Tcl behavior,
so use that in sh mode too.
llvm-svn: 169441
If a test contains

  RUN: a
  RUN: b || true

lit expands it to "a && b || true", and the "|| true" applies to both commands
(thus ignoring failures in 'a')! This is PR10867 again.
llvm-svn: 169434
This is more consistent with other vectors in this code. In addition, I ran some
tests compiling a large program, and >96% of fragments have 4 or fewer fixups, so
SmallVector<4> is a good optimization.
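A minimal sketch of the shape of the change (hypothetical type names, not the
actual MC classes): SmallVector stores its first N elements inline, so the
common case of a fragment with at most 4 fixups avoids a heap allocation.

  #include "llvm/ADT/SmallVector.h"

  struct Fixup { unsigned Offset; };

  struct Fragment {
    // Before: std::vector<Fixup> Fixups;
    llvm::SmallVector<Fixup, 4> Fixups;  // inline storage for up to 4 fixups
  };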
llvm-svn: 169433