type erased interface and a single analysis pass rather than an
extremely complex analysis group.
The end result is that the TTI analysis can contain a type erased
implementation that supports the polymorphic TTI interface. We can build
one from a target-specific implementation or from a dummy one in the IR.
I've also factored all of the code into "mix-in"-able base classes,
including CRTP base classes to facilitate calling back up to the most
specialized form when delegating horizontally across the surface. These
aren't as clean as I would like and I'm planning to work on cleaning
some of this up, but I wanted to start by putting into the right form.
There are a number of reasons for this change, and this particular
design. The first and foremost reason is that an analysis group is
complete overkill, and the chaining delegation strategy was so opaque,
confusing, and high overhead that TTI was suffering greatly for it.
Several of the TTI functions had failed to be implemented in all places
because of the chaining-based delegation making there be no checking of
this. A few other functions were implemented with incorrect delegation.
The message to me was very clear working on this -- the delegation and
analysis group structure was too confusing to be useful here.
The other reason of course is that this is *much* more natural fit for
the new pass manager. This will lay the ground work for a type-erased
per-function info object that can look up the correct subtarget and even
cache it.
Yet another benefit is that this will significantly simplify the
interaction of the pass managers and the TargetMachine. See the future
work below.
The downside of this change is that it is very, very verbose. I'm going
to work to improve that, but it is somewhat an implementation necessity
in C++ to do type erasure. =/ I discussed this design really extensively
with Eric and Hal prior to going down this path, and afterward showed
them the result. No one was really thrilled with it, but there doesn't
seem to be a substantially better alternative. Using a base class and
virtual method dispatch would make the code much shorter, but as
discussed in the update to the programmer's manual and elsewhere,
a polymorphic interface feels like the more principled approach even if
this is perhaps the least compelling example of it. ;]
Ultimately, there is still a lot more to be done here, but this was the
huge chunk that I couldn't really split things out of because this was
the interface change to TTI. I've tried to minimize all the other parts
of this. The follow up work should include at least:
1) Improving the TargetMachine interface by having it directly return
a TTI object. Because we have a non-pass object with value semantics
and an internal type erasure mechanism, we can narrow the interface
of the TargetMachine to *just* do what we need: build and return
a TTI object that we can then insert into the pass pipeline.
2) Make the TTI object be fully specialized for a particular function.
This will include splitting off a minimal form of it which is
sufficient for the inliner and the old pass manager.
3) Add a new pass manager analysis which produces TTI objects from the
target machine for each function. This may actually be done as part
of #2 in order to use the new analysis to implement #2.
4) Work on narrowing the API between TTI and the targets so that it is
easier to understand and less verbose to type erase.
5) Work on narrowing the API between TTI and its clients so that it is
easier to understand and less verbose to forward.
6) Try to improve the CRTP-based delegation. I feel like this code is
just a bit messy and exacerbating the complexity of implementing
the TTI in each target.
Many thanks to Eric and Hal for their help here. I ended up blocked on
this somewhat more abruptly than I expected, and so I appreciate getting
it sorted out very quickly.
Differential Revision: http://reviews.llvm.org/D7293
llvm-svn: 227669
This is affecting the behavior of some ObjC++ / AArch64 test cases on Darwin.
Reverting to get the bots green while I track down the source of the changed
behavior.
llvm-svn: 225311
This matches gcc's behavior. It also seems natural given that aliases
contain other properties that govern how it is accessed (linkage,
visibility, dll storage).
Clang still has to be updated to expose this feature to C.
llvm-svn: 209759
This matches both what we do for the non-thread case and what gcc does.
With this patch clang would match gcc's behaviour in
static __thread int a = 42;
extern __thread int b __attribute__((alias("a")));
int *f(void) { return &a; }
int *g(void) { return &b; }
if not for pr19843. Manually writing the IL does produce the same access modes.
It is also a step in the direction of fixing pr19844.
llvm-svn: 209543
make the functions to set them non-static.
Move and rename the llvm specific backend options to avoid conflicting
with the clang option.
Paired with a backend commit to update.
llvm-svn: 209238
For now it contains a single flag, SanitizeAddress, which enables
AddressSanitizer instrumentation of inline assembly.
Patch by Yuri Gorshenin.
llvm-svn: 206971
This adds back r204781.
Original message:
Aliases are just another name for a position in a file. As such, the
regular symbol resolutions are not applied. For example, given
define void @my_func() {
ret void
}
@my_alias = alias weak void ()* @my_func
@my_alias2 = alias void ()* @my_alias
We produce without this patch:
.weak my_alias
my_alias = my_func
.globl my_alias2
my_alias2 = my_alias
That is, in the resulting ELF file my_alias, my_func and my_alias are
just 3 names pointing to offset 0 of .text. That is *not* the
semantics of IR linking. For example, linking in a
@my_alias = alias void ()* @other_func
would require the strong my_alias to override the weak one and
my_alias2 would end up pointing to other_func.
There is no way to represent that with aliases being just another
name, so the best solution seems to be to just disallow it, converting
a miscompile into an error.
llvm-svn: 204934
This reverts commit r204781.
I will follow up to with msan folks to see what is what they
were trying to do with aliases to weak aliases.
llvm-svn: 204784
Aliases are just another name for a position in a file. As such, the
regular symbol resolutions are not applied. For example, given
define void @my_func() {
ret void
}
@my_alias = alias weak void ()* @my_func
@my_alias2 = alias void ()* @my_alias
We produce without this patch:
.weak my_alias
my_alias = my_func
.globl my_alias2
my_alias2 = my_alias
That is, in the resulting ELF file my_alias, my_func and my_alias are
just 3 names pointing to offset 0 of .text. That is *not* the
semantics of IR linking. For example, linking in a
@my_alias = alias void ()* @other_func
would require the strong my_alias to override the weak one and
my_alias2 would end up pointing to other_func.
There is no way to represent that with aliases being just another
name, so the best solution seems to be to just disallow it, converting
a miscompile into an error.
llvm-svn: 204781
TargetLoweringBase is implemented in CodeGen, so before this patch we had
a dependency fom Target to CodeGen. This would show up as a link failure of
llvm-stress when building with -DBUILD_SHARED_LIBS=ON.
This fixes pr18900.
llvm-svn: 201711
code to see if we're emitting a function into a non-default
text section. This is still a less-than-ideal solution, but more
contained than r199871 to determine whether or not we're emitting
code into an array of comdat sections.
llvm-svn: 200269
e.g. linkonce, to TargetMachine and set it when we've done so
for ELF targets currently. This involved making TargetMachine
non-const in a TLOF use and propagating that change around - I'm
open to other ideas.
This will be used in a future commit to handle emitting debug
information with ranges.
llvm-svn: 199871
Improvements over r195317:
- Set/restore EnableFastISel flag instead of just running FastISel within
SelectAllBasicBlocks; the flag is checked in various places, and
FastISel won't run properly if those places don't do the right thing.
- Test looks for normal ISel versus FastISel behavior, and not
something more subtle that doesn't work everywhere.
Based on work by Andrea Di Biagio.
llvm-svn: 195491
It broke, at least, i686 target. It is reproducible with "llc -mtriple=i686-unknown".
FYI, it didn't appear to add either "-O0" or "-fast-isel".
llvm-svn: 195339
There's no need to specify a flag to omit frame pointer elimination on non-leaf
nodes...(Honestly, I can't parse that option out.) Use the function attribute
stuff instead.
llvm-svn: 187093
This doesn't reset all of the target options within the TargetOptions
object. This is because some of those are ABI-specific and must be determined if
it's okay to change those on the fly.
llvm-svn: 176986
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.
There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.
The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.
I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).
I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.
llvm-svn: 171366
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.
Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]
llvm-svn: 169131
This allows the user/front-end to specify a model that is better
than what LLVM would choose by default. For example, a variable
might be declared as
@x = thread_local(initialexec) global i32 42
if it will not be used in a shared library that is dlopen'ed.
If the specified model isn't supported by the target, or if LLVM can
make a better choice, a different model may be used.
llvm-svn: 159077
optimizations which are valid for position independent code being linked
into a single executable, but not for such code being linked into
a shared library.
I discussed the design of this with Eric Christopher, and the decision
was to support an optional bit rather than a completely separate
relocation model. Fundamentally, this is still PIC relocation, its just
that certain optimizations are only valid under a PIC relocation model
when the resulting code won't be in a shared library. The simplest path
to here is to expose a single bit option in the TargetOptions. If folks
have different/better designs, I'm all ears. =]
I've included the first optimization based upon this: changing TLS
models to the *Exec models when PIE is enabled. This is the LLVM
component of PR12380 and is all of the hard work.
llvm-svn: 154294
in TargetLowering. There was already a FIXME about this location being
odd. The interface is simplified as a consequence. This will also make
it easier to change TLS models when compiling with PIE.
llvm-svn: 154292
Creates a configurable regalloc pipeline.
Ensure specific llc options do what they say and nothing more: -reglloc=... has no effect other than selecting the allocator pass itself. This patch introduces a new umbrella flag, "-optimize-regalloc", to enable/disable the optimizing regalloc "superpass". This allows for example testing coalscing and scheduling under -O0 or vice-versa.
When a CodeGen pass requires the MachineFunction to have a particular property, we need to explicitly define that property so it can be directly queried rather than naming a specific Pass. For example, to check for SSA, use MRI->isSSA, not addRequired<PHIElimination>.
CodeGen transformation passes are never "required" as an analysis
ProcessImplicitDefs does not require LiveVariables.
We have a plan to massively simplify some of the early passes within the regalloc superpass.
llvm-svn: 150226
change, now you need a TargetOptions object to create a TargetMachine. Clang
patch to follow.
One small functionality change in PTX. PTX had commented out the machine
verifier parts in their copy of printAndVerify. That now calls the version in
LLVMTargetMachine. Users of PTX who need verification disabled should rely on
not passing the command-line flag to enable it.
llvm-svn: 145714
and code model. This eliminates the need to pass OptLevel flag all over the
place and makes it possible for any codegen pass to use this information.
llvm-svn: 144788
.file filenumber "directory" "filename"
This removes one join+split of the directory+filename in MC internals. Because
bitcode files have independent fields for directory and filenames in debug info,
this patch may change the .o files written by existing .bc files.
llvm-svn: 142300
- Introduce JITDefault code model. This tells targets to set different default
code model for JIT. This eliminates the ugly hack in TargetMachine where
code model is changed after construction.
llvm-svn: 135580
(including compilation, assembly). Move relocation model Reloc::Model from
TargetMachine to MCCodeGenInfo so it's accessible even without TargetMachine.
llvm-svn: 135468
- Each target asm parser now creates its own MCSubtatgetInfo (if needed).
- Changed AssemblerPredicate to take subtarget features which tablegen uses
to generate asm matcher subtarget feature queries. e.g.
"ModeThumb,FeatureThumb2" is translated to
"(Bits & ModeThumb) != 0 && (Bits & FeatureThumb2) != 0".
llvm-svn: 134678
MCStreamer instead of just MCObjectStreamer. Address changes cannot
be as efficient as we have to use DW_LNE_set_addres, but at least
most of the logic is shared.
This will be used so that, with CodeGen still using EmitDwarfLocDirective,
llvm-gcc is able to produce debug_line sections without needing an
assembler that supports .loc.
llvm-svn: 119777
-enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen fp math optimizations only care whether the fp arithmetics arguments and results can never be NaN.
llvm-svn: 108465
the variable actually tracks.
N.B., several back-ends are using "HasCalls" as being synonymous for something
that adjusts the stack. This isn't 100% correct and should be looked into.
llvm-svn: 103802
the '-pre-RA-sched' flag. It actually makes more sense to do it this way. Also,
keep track of the SDNode ordering by default. Eventually, we would like to make
this ordering a way to break a "tie" in the scheduler. However, doing that now
breaks the "CodeGen/X86/abi-isel.ll" test for 32-bit Linux.
llvm-svn: 94308
- Move DisableScheduling flag into TargetOption.h
- Move SDNodeOrdering into its own header file. Give it a minimal interface that
doesn't conflate construction with storage.
- Move assigning the ordering into the SelectionDAGBuilder.
This isn't used yet, so there should be no functional changes.
llvm-svn: 91727
feature, either build the JIT in debug mode to enable it by default or pass
-jit-emit-debug to lli.
Right now, the only debug information that this communicates to GDB is call
frame information, since it's already being generated to support exceptions in
the JIT. Eventually, when DWARF generation isn't tied so tightly to AsmPrinter,
it will be easy to push that information to GDB through this interface.
Here's a step-by-step breakdown of how the feature works:
- The JIT generates the machine code and DWARF call frame info
(.eh_frame/.debug_frame) for a function into memory.
- The JIT copies that info into an in-memory ELF file with a symbol for the
function.
- The JIT creates a code entry pointing to the ELF buffer and adds it to a
linked list hanging off of a global descriptor at a special symbol that GDB
knows about.
- The JIT calls a function marked noinline that GDB knows about and has put an
internal breakpoint in.
- GDB catches the breakpoint and reads the global descriptor to look for new
code.
- When sees there is new code, it reads the ELF from the inferior's memory and
adds it to itself as an object file.
- The JIT continues, and the next time we stop the program, we are able to
produce a proper backtrace.
Consider running the following program through the JIT:
#include <stdio.h>
void baz(short z) {
long w = z + 1;
printf("%d, %x\n", w, *((int*)NULL)); // SEGFAULT here
}
void bar(short y) {
int z = y + 1;
baz(z);
}
void foo(char x) {
short y = x + 1;
bar(y);
}
int main(int argc, char** argv) {
char x = 1;
foo(x);
}
Here is a backtrace before this patch:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x2aaaabdfbd10 (LWP 25476)]
0x00002aaaabe7d1a8 in ?? ()
(gdb) bt
#0 0x00002aaaabe7d1a8 in ?? ()
#1 0x0000000000000003 in ?? ()
#2 0x0000000000000004 in ?? ()
#3 0x00032aaaabe7cfd0 in ?? ()
#4 0x00002aaaabe7d12c in ?? ()
#5 0x00022aaa00000003 in ?? ()
#6 0x00002aaaabe7d0aa in ?? ()
#7 0x01000002abe7cff0 in ?? ()
#8 0x00002aaaabe7d02c in ?? ()
#9 0x0100000000000001 in ?? ()
#10 0x00000000014388e0 in ?? ()
#11 0x00007fff00000001 in ?? ()
#12 0x0000000000b870a2 in llvm::JIT::runFunction (this=0x1405b70,
F=0x14024e0, ArgValues=@0x7fffffffe050)
at /home/rnk/llvm-gdb/lib/ExecutionEngine/JIT/JIT.cpp:395
#13 0x0000000000baa4c5 in llvm::ExecutionEngine::runFunctionAsMain
(this=0x1405b70, Fn=0x14024e0, argv=@0x13f06f8, envp=0x7fffffffe3b0)
at /home/rnk/llvm-gdb/lib/ExecutionEngine/ExecutionEngine.cpp:377
#14 0x00000000007ebd52 in main (argc=2, argv=0x7fffffffe398,
envp=0x7fffffffe3b0) at /home/rnk/llvm-gdb/tools/lli/lli.cpp:208
And a backtrace after this patch:
Program received signal SIGSEGV, Segmentation fault.
0x00002aaaabe7d1a8 in baz ()
(gdb) bt
#0 0x00002aaaabe7d1a8 in baz ()
#1 0x00002aaaabe7d12c in bar ()
#2 0x00002aaaabe7d0aa in foo ()
#3 0x00002aaaabe7d02c in main ()
#4 0x0000000000b870a2 in llvm::JIT::runFunction (this=0x1405b70,
F=0x14024e0, ArgValues=...)
at /home/rnk/llvm-gdb/lib/ExecutionEngine/JIT/JIT.cpp:395
#5 0x0000000000baa4c5 in llvm::ExecutionEngine::runFunctionAsMain
(this=0x1405b70, Fn=0x14024e0, argv=..., envp=0x7fffffffe3c0)
at /home/rnk/llvm-gdb/lib/ExecutionEngine/ExecutionEngine.cpp:377
#6 0x00000000007ebd52 in main (argc=2, argv=0x7fffffffe3a8,
envp=0x7fffffffe3c0) at /home/rnk/llvm-gdb/tools/lli/lli.cpp:208
llvm-svn: 82418
and short. Well, it's kinda short. Definitely nasty and brutish.
The front-end generates the register/unregister calls into the SjLj runtime,
call-site indices and landing pad dispatch. The back end fills in the LSDA
with the call-site information provided by the front end. Catch blocks are
not yet implemented.
Built on Darwin and verified no llvm-core "make check" regressions.
llvm-svn: 78625
--- Reverse-merging r75799 into '.':
U test/Analysis/PointerTracking
U include/llvm/Target/TargetMachineRegistry.h
U include/llvm/Target/TargetMachine.h
U include/llvm/Target/TargetRegistry.h
U include/llvm/Target/TargetSelect.h
U tools/lto/LTOCodeGenerator.cpp
U tools/lto/LTOModule.cpp
U tools/llc/llc.cpp
U lib/Target/PowerPC/PPCTargetMachine.h
U lib/Target/PowerPC/AsmPrinter/PPCAsmPrinter.cpp
U lib/Target/PowerPC/PPCTargetMachine.cpp
U lib/Target/PowerPC/PPC.h
U lib/Target/ARM/ARMTargetMachine.cpp
U lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp
U lib/Target/ARM/ARMTargetMachine.h
U lib/Target/ARM/ARM.h
U lib/Target/XCore/XCoreTargetMachine.cpp
U lib/Target/XCore/XCoreTargetMachine.h
U lib/Target/PIC16/PIC16TargetMachine.cpp
U lib/Target/PIC16/PIC16TargetMachine.h
U lib/Target/Alpha/AsmPrinter/AlphaAsmPrinter.cpp
U lib/Target/Alpha/AlphaTargetMachine.cpp
U lib/Target/Alpha/AlphaTargetMachine.h
U lib/Target/X86/X86TargetMachine.h
U lib/Target/X86/X86.h
U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.h
U lib/Target/X86/AsmPrinter/X86AsmPrinter.cpp
U lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.h
U lib/Target/X86/X86TargetMachine.cpp
U lib/Target/MSP430/MSP430TargetMachine.cpp
U lib/Target/MSP430/MSP430TargetMachine.h
U lib/Target/CppBackend/CPPTargetMachine.h
U lib/Target/CppBackend/CPPBackend.cpp
U lib/Target/CBackend/CTargetMachine.h
U lib/Target/CBackend/CBackend.cpp
U lib/Target/TargetMachine.cpp
U lib/Target/IA64/IA64TargetMachine.cpp
U lib/Target/IA64/AsmPrinter/IA64AsmPrinter.cpp
U lib/Target/IA64/IA64TargetMachine.h
U lib/Target/IA64/IA64.h
U lib/Target/MSIL/MSILWriter.cpp
U lib/Target/CellSPU/SPUTargetMachine.h
U lib/Target/CellSPU/SPU.h
U lib/Target/CellSPU/AsmPrinter/SPUAsmPrinter.cpp
U lib/Target/CellSPU/SPUTargetMachine.cpp
U lib/Target/Mips/AsmPrinter/MipsAsmPrinter.cpp
U lib/Target/Mips/MipsTargetMachine.cpp
U lib/Target/Mips/MipsTargetMachine.h
U lib/Target/Mips/Mips.h
U lib/Target/Sparc/AsmPrinter/SparcAsmPrinter.cpp
U lib/Target/Sparc/SparcTargetMachine.cpp
U lib/Target/Sparc/SparcTargetMachine.h
U lib/ExecutionEngine/JIT/TargetSelect.cpp
U lib/Support/TargetRegistry.cpp
llvm-svn: 75820
Update code generator to use this attribute and remove NoImplicitFloat target option.
Update llc to set this attribute when -no-implicit-float command line option is used.
llvm-svn: 72959
Update code generator to use this attribute and remove DisableRedZone target option.
Update llc to set this attribute when -disable-red-zone command line option is used.
llvm-svn: 72894
instead of requiring all "short description" strings to begin with
two spaces. This makes these strings less mysterious, and it fixes
some cases where short description strings mistakenly did not
begin with two spaces.
llvm-svn: 57521
- Add a basic machine-level dead block eliminator.
These two have to go together, since many other parts of the code generator are unable to handle the unreachable blocks otherwise created.
llvm-svn: 54333
switches use the binary search algorithm) for
environments that don't support it. PPC64 JIT
is such an environment; turn the flag on for that.
llvm-svn: 54248