1. Makes it possible to lower with floating point loads and stores.
2. Avoid unaligned loads / stores unless it's fast.
3. Fix some memcpy lowering logic bug related to when to optimize a
load from constant string into a constant.
4. Adjust x86 memcpy lowering threshold to make it more sane.
5. Fix x86 target hook so it uses vector and floating point memory
ops more effectively.
rdar://7774704
llvm-svn: 100090
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
A update of langref will occur in a subsequent checkin.
llvm-svn: 99928
"the bigstack patch for SPU, with testcase. It is essentially the patch committed as 97091, and reverted as 97099, but with the following additions:
-in vararg handling, registers are marked to be live, to not confuse the register scavenger
-function prologue and epilogue are not emitted, if the stack size is 16. 16 means it is empty - there is only the register scavenger emergency spill slot, which is not used as there is no stack."
llvm-svn: 99819
transforming it into (add (i32 GPR), 4). This allows us to write type
generic multi patterns and have tblgen automatically drop the bitconvert
in the case when the types align. This allows us to fold an extra load
in the changed testcase.
llvm-svn: 99756
If a TableGen class has an initializer expression containing an X.Y subexpression,
AND X depends on template parameters,
AND those template parameters have defaults,
AND some parameters with defaults are beyond position 1,
THEN parts of the initializer expression are evaluated prematurely with the default values when the first explicit template parameter is substituted, before the remaining explicit template parameters have been substituted.
llvm-svn: 99492
happening.
Enhance scheduling to set the DEAD flag on implicit defs
more aggressively. Before, we'd set an implicit def operand
to dead if it were present in the SDNode corresponding to
the machineinstr but had no use. Now we do it in this case
AND if the implicit def does not exist in the SDNode at all.
This exposes a couple of problems: one is the FIXME, which
causes a live intervals crash on CodeGen/X86/sibcall.ll.
The second is that it makes machinecse and licm more
aggressive (which is a good thing) but also exposes a case
where licm hoists a set0 and then it doesn't get resunk.
Talking to codegen folks about both these issues, but I need
this patch in in the meantime.
llvm-svn: 99485
--- Reverse-merging r99440 into '.':
U test/MC/AsmParser/X86/x86_32-bit_cat.s
U test/MC/AsmParser/X86/x86_32-encoding.s
U include/llvm/IntrinsicsX86.td
U include/llvm/CodeGen/SelectionDAGNodes.h
U lib/Target/X86/X86InstrSSE.td
U lib/Target/X86/X86ISelLowering.h
llvm-svn: 99450
not get an "Unknown immediate size" assert failure when used. All instructions
of this form have an 8-bit immediate. Also added a test case of an example
instruction that is of this form.
llvm-svn: 99435
otherwise the SmallVector it contains doesn't free its memory.
In most cases LiveIntervalAnalysis could get away by not calling the destructor,
because VNInfos are bumpptr-allocated, and smallvectors usually don't grow.
However when the SmallVector does grow it always leaks.
This is the valgrind shown leak from the original testcase:
==8206== 18,304 bytes in 151 blocks are definitely lost in loss record 164 of 164
==8206== at 0x4A079C7: operator new(unsigned long) (vg_replace_malloc.c:220)
==8206== by 0x4DB7A7E: llvm::SmallVectorBase::grow_pod(unsigned long, unsigned long) (in /home/edwin/clam/git/builds/defaul
t/libclamav/.libs/libclamav.so.6.1.0)
==8206== by 0x4F90382: llvm::VNInfo::addKill(llvm::SlotIndex) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libcl
amav.so.6.1.0)
==8206== by 0x5126B5C: llvm::LiveIntervals::handleVirtualRegisterDef(llvm::MachineBasicBlock*, llvm::ilist_iterator<llvm::M
achineInstr>, llvm::SlotIndex, llvm::MachineOperand&, unsigned int, llvm::LiveInterval&) (in /home/edwin/clam/git/builds/defau
lt/libclamav/.libs/libclamav.so.6.1.0)
==8206== by 0x512725E: llvm::LiveIntervals::handleRegisterDef(llvm::MachineBasicBlock*, llvm::ilist_iterator<llvm::MachineI
nstr>, llvm::SlotIndex, llvm::MachineOperand&, unsigned int) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav
.so.6.1.0)
==8206== by 0x51278A8: llvm::LiveIntervals::computeIntervals() (in /home/edwin/clam/git/builds/default/libclamav/.libs/libc
lamav.so.6.1.0)
==8206== by 0x5127CB4: llvm::LiveIntervals::runOnMachineFunction(llvm::MachineFunction&) (in /home/edwin/clam/git/builds/de
fault/libclamav/.libs/libclamav.so.6.1.0)
==8206== by 0x4DAE935: llvm::FPPassManager::runOnFunction(llvm::Function&) (in /home/edwin/clam/git/builds/default/libclama
v/.libs/libclamav.so.6.1.0)
==8206== by 0x4DAEB10: llvm::FunctionPassManagerImpl::run(llvm::Function&) (in /home/edwin/clam/git/builds/default/libclama
v/.libs/libclamav.so.6.1.0)
==8206== by 0x4DAED3D: llvm::FunctionPassManager::run(llvm::Function&) (in /home/edwin/clam/git/builds/default/libclamav/.l
ibs/libclamav.so.6.1.0)
==8206== by 0x4D8BE8E: llvm::JIT::runJITOnFunctionUnlocked(llvm::Function*, llvm::MutexGuard const&) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav.so.6.1.0)
==8206== by 0x4D8CA72: llvm::JIT::getPointerToFunction(llvm::Function*) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav.so.6.1.0)
llvm-svn: 99400
for the noinline attribute, and make the inliner refuse to
inline a call site when the call site is marked noinline even
if the callee isn't. This fixes PR6682.
llvm-svn: 99341
of runs without leak checking. We add -vg to the triple for non-checked runs,
or -vg_leak for checked runs. Also use this to XFAIL the TableGen tests, since
tablegen leaks like a sieve. This includes some valgrindArgs refactoring.
llvm-svn: 99103
override prefix and only the r/m16 forms should have had that. Also for variant
one, the AT&T syntax, added suffixes to all forms. Also added the missing
64-bit form for 'CRC32 r64, r/m8'. Plus added test cases for all forms and
tweaked one test case to add the needed suffixes.
llvm-svn: 98980
- This is "extraordinarily" Darwin 'as' compatible. See the litany of FIXMEs littered about for more information.
- There are a few cases which seem to clearly be 'as' bugs which I have left unsupported, and there is one cases where we diverge but should fix if it blocks diffing .o files (Darwin 'as' ends up widening a jump unnecessarily).
- 403.gcc build, runs, and diffs equivalently to the 'as' built version now (using llvm-mc). However, it builds so slowly that I wouldn't recommend trying it quite yet. :)
llvm-svn: 98974
script to the #! command by using bash instead of /bin/sh. Bash
searches $PATH for its script argument, but dash, which /bin/sh
resolves to on some systems, does not.
https://bugs.kde.org/show_bug.cgi?id=231257 tracks the valgrind
problem.
llvm-svn: 98892
temporary workaround for matching inc/dec on x86_64 to the correct instruction.
- This hack will eventually be replaced with a robust mechanism for handling
matching instructions based on the available target features.
llvm-svn: 98858
instructions to help disassembly.
We also changed the output of the addressing modes to omit the '+' from the
assembler syntax #+/-<imm> or +/-<Rm>. See, for example, A8.6.57/58/60.
And modified test cases to not expect '+' in +reg or #+num. For example,
; CHECK: ldr.w r9, [r7, #28]
llvm-svn: 98745
U test/CodeGen/ARM/tls2.ll
U test/CodeGen/ARM/arm-negative-stride.ll
U test/CodeGen/ARM/2009-10-30.ll
U test/CodeGen/ARM/globals.ll
U test/CodeGen/ARM/str_pre-2.ll
U test/CodeGen/ARM/ldrd.ll
U test/CodeGen/ARM/2009-10-27-double-align.ll
U test/CodeGen/Thumb2/thumb2-strb.ll
U test/CodeGen/Thumb2/ldr-str-imm12.ll
U test/CodeGen/Thumb2/thumb2-strh.ll
U test/CodeGen/Thumb2/thumb2-ldr.ll
U test/CodeGen/Thumb2/thumb2-str_pre.ll
U test/CodeGen/Thumb2/thumb2-str.ll
U test/CodeGen/Thumb2/thumb2-ldrh.ll
U utils/TableGen/TableGen.cpp
U utils/TableGen/DisassemblerEmitter.cpp
D utils/TableGen/RISCDisassemblerEmitter.h
D utils/TableGen/RISCDisassemblerEmitter.cpp
U Makefile.rules
U lib/Target/ARM/ARMInstrNEON.td
U lib/Target/ARM/Makefile
U lib/Target/ARM/AsmPrinter/ARMInstPrinter.cpp
U lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp
U lib/Target/ARM/AsmPrinter/ARMInstPrinter.h
D lib/Target/ARM/Disassembler
U lib/Target/ARM/ARMInstrFormats.td
U lib/Target/ARM/ARMAddressingModes.h
U lib/Target/ARM/Thumb2ITBlockPass.cpp
llvm-svn: 98640
(RISCDisassemblerEmitter) which emits the decoder functions for ARM and Thumb,
and the disassembler core which invokes the decoder function and builds up the
MCInst based on the decoded Opcode.
Added sub-formats to the NeonI/NeonXI instructions to further refine the NEONFrm
instructions to help disassembly.
We also changed the output of the addressing modes to omit the '+' from the
assembler syntax #+/-<imm> or +/-<Rm>. See, for example, A8.6.57/58/60.
And modified test cases to not expect '+' in +reg or #+num. For example,
; CHECK: ldr.w r9, [r7, #28]
llvm-svn: 98637
to LLVM IR changes with addr label weirdness. In the testcase, we
generate references to the two bb's when codegen'ing the first
function:
_test1: ## @test1
leaq Ltmp0(%rip), %rax
..
leaq Ltmp1(%rip), %rax
Then continue to codegen the second function where the blocks
get merged. We're now smart enough to emit both labels, producing
this code:
_test_fun: ## @test_fun
## BB#0: ## %entry
Ltmp1: ## Block address taken
Ltmp0:
## BB#1: ## %ret
movl $-1, %eax
ret
Rejoice.
llvm-svn: 98595
- Although it would be nice to allow this decoupling, the assembler needs to be able to reason about MCSymbolRefExprs in too many places to make this viable. We can use a target specific encoding of the variant if this becomes an issue.
- This patch also extends llvm-mc to support parsing of the modifiers, as opposed to lumping them in with the symbol.
llvm-svn: 98592
32-bit indices. Instead of shuffling each element out of the index vector,
when all indices are needed, just store the input vector to the stack and
load the elements out.
llvm-svn: 98588
label is generated, but then the block is deleted. Since the
value is undefined, we just emit the label right after the entry
label of the function. It might matter that the label is in the
same section as the function was afterall.
llvm-svn: 98579
function, then the BB is RAUW'd before the definition is emitted. There
are still two cases not being handled, but this should improve us back to
the situation before I touched anything.
llvm-svn: 98566
- The implementation is currently very brain dead and inefficient, but I have a
clear plan on how to fix it.
- The good news is, it works and correctly assembles 403.gcc (when built with
Clang, at '-Os', '-Os -g', and '-O3'). Even better, at '-Os' and '-Os -g',
the resulting binary is exactly equivalent to that when built with the system
assembler. So it probably works! :)
llvm-svn: 98396
cl = EXTRACT_SUBREG reg1024, 1, is overly conservative. It should check
for overlaps of vr's live interval with the super registers of the
physical register (ECX in this case) and let JoinIntervals() handle checking
the coalescing feasibility against the physical register (cl in this case).
llvm-svn: 98251
PR6540: Set the newly introduced variables ENABLE_SHARED and
SHLIBPATH_VAR in lit.site.cfg not only in the autoconf build, but also
in a cmake one.
llvm-svn: 98171
directly to the maccu / maccs instructions. We handle this in
ExpandADDSUB since after type legalisation it is messy to
recognise these operations.
llvm-svn: 98150
Make it so. (This patch is in LowerCall_Darwin, which seems
to be used by SVR4 code as well; since that doesn't belong here,
I haven't worried about this case.)
llvm-svn: 98077
This patch updates LLVMDebugVersion to 8.
Debug info descriptors encoded using LLVMDebugVersion 7 is supported.
Corresponding llvmgcc and clang FE commits are required.
llvm-svn: 98020