cost modeling for if-conversion. Now if only we had a way to estimate the misprediction probability.
Adjsut CodeGen/ARM/ifcvt10.ll. The pipeline on Cortex-A8 is long enough that it is still profitable
to predicate an ldm, but the shorter pipeline on Cortex-A9 makes it unprofitable.
llvm-svn: 114995
take multiple cycles to decode.
For the current if-converter clients (actually only ARM), the instructions that
are predicated on false are not nops. They would still take machine cycles to
decode. Micro-coded instructions such as LDM / STM can potentially take multiple
cycles to decode. If-converter should take treat them as non-micro-coded
simple instructions.
llvm-svn: 113570
memory and synchronization barrier dmb and dsb instructions.
- Change instruction names to something more sensible (matching name of actual
instructions).
- Added tests for memory barrier codegen.
llvm-svn: 110785
Modules and ModuleProviders. Because the "ModuleProvider" simply materializes
GlobalValues now, and doesn't provide modules, it's renamed to
"GVMaterializer". Code that used to need a ModuleProvider to materialize
Functions can now materialize the Functions directly. Functions no longer use a
magic linkage to record that they're materializable; they simply ask the
GVMaterializer.
Because the C ABI must never change, we can't remove LLVMModuleProviderRef or
the functions that refer to it. Instead, because Module now exposes the same
functionality ModuleProvider used to, we store a Module* in any
LLVMModuleProviderRef and translate in the wrapper methods. The bindings to
other languages still use the ModuleProvider concept. It would probably be
worth some time to update them to follow the C++ more closely, but I don't
intend to do it.
Fixes http://llvm.org/PR5737 and http://llvm.org/PR5735.
llvm-svn: 94686
for all the processors where I have tried it, and even when it might not help
performance, the cost is quite low. The opportunities for duplicating
indirect branches are limited by other factors so code size does not change
much due to tail duplicating indirect branches aggressively.
llvm-svn: 90144
than doing the same via constpool:
1. Load from constpool costs 3 cycles on A9, movt/movw pair - just 2.
2. Load from constpool might stall up to 300 cycles due to cache miss.
3. Movt/movw does not use load/store unit.
4. Less constpool entries => better compiler performance.
This is only enabled on ELF systems, since darwin does not have needed
relocations (yet).
llvm-svn: 89720
contents of the block to be duplicated. Use this for ARM Cortex A8/9 to
be more aggressive tail duplicating indirect branches, since it makes it
much more likely that they will be predicted in the branch target buffer.
Testcase coming soon.
llvm-svn: 89187
Module*.
Also, dropped uses of TargetMachine where unnecessary. The only target which
still takes a TargetMachine& is Mips, I would appreciate it if someone would
normalize this to match other targets.
llvm-svn: 77918
instructions for calls since BL and BLX are always 32-bit long and BX is always
16-bit long.
Also, we should be using BLX to call external function stubs.
llvm-svn: 77756