The way that stack coloring updated MMOs when merging stack slots, while
correct, is suboptimal, and is incompatible with the use of AA during
instruction scheduling. The solution, which involves the use of const_cast (and
more importantly, updating the IR from within an MI-level pass), obviously
requires some explanation:
When the stack coloring pass was originally committed, the code in
ScheduleDAGInstrs::buildSchedGraph tracked possible alias sets by using
GetUnderlyingObject, and all load/store and store/store memory control
dependencies where added between SUs at the object level (where only one
object, that returned by GetUnderlyingObject, was used to identify the object
associated with each MMO). When stack coloring merged stack slots, it would
replace MMOs derived from the remapped alloca with the alloca with which the
remapped alloca was being replaced. Because ScheduleDAGInstrs only used single
objects, and tracked alias sets at the object level, this was a fine solution.
In r169744, (Andy and) I updated the code in ScheduleDAGInstrs to use
GetUnderlyingObjects, and track alias sets using, potentially, multiple
underlying objects for each MMO. This was done, primarily, to provide the
ability to look through PHIs, and provide better scheduling for
induction-variable-dependent loads and stores inside loops. At this point, the
MMO-updating code in stack coloring became suboptimal, because it would clear
the MMOs for (i.e. completely pessimize) all instructions for which r169744
might help in scheduling. Updating the IR directly is the simplest fix for this
(and the one with, by far, the least compile-time impact), but others are
possible (we could give each MMO a small vector of potential values, or make
use of a remapping table, constructed from MFI, inside ScheduleDAGInstrs).
Unfortunately, replacing all MMO values derived from the remapped alloca with
the base replacement alloca fundamentally breaks our ability to use AA during
instruction scheduling (which is critical to performance on some targets). The
reason is that the original MMO might have had an offset (either constant or
dynamic) from the base remapped alloca, and that offset is not present in the
updated MMO. One possible way around this would be to use
GetPointerBaseWithConstantOffset, and update not only the MMO's value, but also
its offset based on the original offset. Unfortunately, this solution would
only handle constant offsets, and for safety (because AA is not completely
restricted to deducing relationships with constant offsets), we would need to
clear all MMOs without constant offsets over the entire function. This would be
an even worse pessimization than the current single-object restriction. Any
other solution would involve passing around a vector of remapped allocas, and
teaching AA to use it, introducing additional complexity and overhead into AA.
Instead, when remapping an alloca, we replace all IR uses of that alloca as
well (optionally inserting a bitcast as necessary). This is even more efficient
that the old MMO-updating code in the stack coloring pass (because it removes
the need to call GetUnderlyingObject on all MMO values), removes the
single-object pessimization in the default configuration, and enables the
correct use of AA during instruction scheduling (all without any additional
overhead).
LLVM now no longer miscompiles itself on x86_64 when using -enable-misched
-enable-aa-sched-mi -misched-bottomup=0 -misched-topdown=0 -misched=shuffle!
Fixed PR18497.
Because the alloca replacement is now done at the IR level, unless the MMO
directly refers to the remapped alloca, the change cannot be seen at the MI
level. As a result, there is no good way to fix test/CodeGen/X86/pr14090.ll.
llvm-svn: 199658
When using AA to break false chain dependencies, we need to track multiple
stores per object in ScheduleDAGInstrs. Historically, we tracked potential alias
chains at the object level, and so all loads of an object would retain
dependencies on any store to that object. With AA, however, this is not
sufficient: non-overlapping stores and loads to the same object all need to be
tested for dependencies separately, we cannot only test all loads to an object
against only the last store (see PR18497 for an explicit example).
To mitigate any unwelcome compile-time impact when not using AA, only one store
is kept in the list per object when not using AA.
This, along with a stack coloring change to come shortly, will provide a test
case, fix PR18497 (and allow LLVM to compile itself using -enable-aa-sched-mi
on x86-64).
llvm-svn: 199657
In optimized hybrid execution we do not use DynamoRIO private loader, which
mangles TLS access, so we can access the application's TLS directly.
Patch by Qin Zhao.
llvm-svn: 199655
The addition of IC_OPSIZE_ADSIZE in r198759 wasn't quite complete. It
also turns out to have been unnecessary. The disassembler handles the
AdSize prefix for itself, and doesn't care about the difference between
(e.g.) MOV8ao8 and MOB8ao8_16 definitions. So just let them coexist and
don't worry about it.
llvm-svn: 199654
The disassembler has a special case for 'L' vs. 'W' in its heuristic for
checking for 32-bit and 16-bit equivalents. We could expand the heuristic,
but better just to be consistent in using the 'L' suffix.
llvm-svn: 199652
When disassembling in 16-bit mode the meaning of the OpSize bit is
inverted. Instructions found in the IC_OPSIZE context will actually
*not* have the 0x66 prefix, and instructions in the IC context will
have the 0x66 prefix. Make use of the existing special-case handling
for the 0x66 prefix being in the wrong place, to cope with this.
llvm-svn: 199650
Aside from cleaning up the code, this also adds support for the -code16
environment and actually enables the MODE_16BIT mode that was previously
not accessible.
There is no point adding any testing for 16-bit yet though; basically
nothing will work because we aren't handling the OpSize prefix correctly
for 16-bit mode.
llvm-svn: 199649
various opt verifier commandline options.
Mostly mechanical wiring of the verifier to the new pass manager.
Exercises one of the more unusual aspects of it -- a pass can be either
a module or function pass interchangably. If this is ever problematic,
we can make things more constrained, but for things like the verifier
where there is an "obvious" applicability at both levels, it seems
convenient.
This is the next-to-last piece of basic functionality left to make the
opt commandline driving of the new pass manager minimally functional for
testing and further development. There is still a lot to be done there
(notably the factoring into .def files to kill the current boilerplate
code) but it is relatively uninteresting. The only interesting bit left
for minimal functionality is supporting the registration of analyses.
I'm planning on doing that on top of the .def file switch mostly because
the boilerplate for the analyses would be significantly worse.
llvm-svn: 199646
ADDITIONAL_HEADERS is intended to add header files for IDEs as hint.
For example:
add_llvm_library(LLVMSupport
Host.cpp
ADDITIONAL_HEADERS
Unix/Host.inc
Windows/Host.inc
)
llvm-svn: 199639
coding standards, and instead fix the existing section.
Thanks to Daniel Jasper for pointing out we already had a section
devoted to this topic. Instead of adding sections, just hack on this
section some. Also fix the example in the anonymous namespace section
below it to agree with the new advice.
As a re-cap, this switches the LLVM preferred style to never indent
namespaces. Having two approaches just led to endless (and utterly
pointless) debates about what was "small enough". This wasn't helping
anyone. The no-indent rule is easy to understand and doesn't really make
anything harder to read. Moreover, with tools like clang-format it is
considerably nicer to have simple consistent rules.
llvm-svn: 199637
Now instead of just looking in the system root for it, we also look
relative to the clang binary's directory. This should "just work" in
almost all cases. I've added test cases accordingly.
This is probably *very* worthwhile to backport to the 3.4 branch so that
folks can check it out, build it, and use that as their host compiler
going forward.
llvm-svn: 199632
type units were enabled. The crux of the issue is that the
addDwarfTypeUnitType routine can end up being indirectly recursive. In
this case, the reference into the dense map (TU) became invalid by the
time we popped all the way back and used it to add the DIE type
signature.
Instead, use early return in the case where we can bypass the recursive
step and creating a type unit. Then use the pointer to the new type unit
to set up the DIE type signature in the case where we have to.
I tried really hard to reduce a testcase for this, but it's really
annoying. You have to get this to be mid-recursion when the densemap
grows. Even if we got a test case for this today, it'd be very unlikely
to continue exercising this pattern.
llvm-svn: 199630
This logic hadn't been updated to handle FastMathFlags, and it took me a while to detect it because it doesn't show up in a simple search for CreateFAdd.
llvm-svn: 199629
This attribute is supported by GCC. More generally it should
probably be a type attribute, but this behavior matches 'nonnull'.
This patch does not include warning logic for checking if a null
value is returned from a function annotated with this attribute.
That will come in subsequent patches.
llvm-svn: 199626
The alias test "exprf x 1234" expands to "expr -f x 1234" and is
expected to fail: it ends up trying to evaluate the invalid expression
void
$__lldb_expr(void *$__lldb_arg)
{
-f x 1234;
}
On FreeBSD LLDB ends up finding a static function f() in a math library,
and thus the error produced does not include "use of undeclared
identifier 'f'".
We will report failure to parse the expression in any case, so require
only that error message.
llvm-svn: 199623
Have I mentioned that functions returning true on error and false on
success are confusing? They're more confusing when their name is
"verify". Anyways...
llvm-svn: 199622
For FCMEQ, FCMGE, FCMGT, FCMLE and FCMLT, floating point zero will be
printed as #0.0 instead of #0. To support the history codes using #0,
we consider to let asm parser accept both #0.0 and #0.
llvm-svn: 199621
(and to mention namespace ending comments). This is based on a quick
discussion on the developer mailing list where there was essentially no
objections to a simple and consistent rule. This should avoid future
debates about whether or not a namespace is "big enough" to indent. It
also matches clang-format's current behavior with LLVM source code which
hasn't really seen any opposition in code reviews that I spot checked.
llvm-svn: 199620
Implement type trait primitives used in the latest edition of the Microsoft
standard C++ library type_traits header.
With this change we can parse much of the Visual Studio 2013 standard headers,
particularly anything that includes <type_traits>.
Fully implemented, available in all language modes:
* __is_constructible()
* __is_nothrow_constructible()
* __is_nothrow_assignable()
Partially implemented, semantic analysis WIP, available as MS extensions:
* __is_destructible()
* __is_nothrow_destructible()
llvm-svn: 199619
Check all default ctors, not just the first one we see. This brings
__has_nothrow_constructor() in line with the other unary type traits.
A C++ class can have multiple default constructors but clang was only checking
the first one written, presumably due to ambiguity in the GNU specification.
MSVC has the same bug, while g++ has the correct implementation which we now
match.
llvm-svn: 199618
This was due to arithmetic overflow in the getNumBits() computation. Now we
cast BitWidth to a uint64_t so that does not occur during the computation. After
the computation is complete, the uint64_t is truncated when the function
returns.
I know that this is not something that is likely to happen, but it *IS* a valid
input and we should not blow up.
llvm-svn: 199609