instructions which have no direct register usage.
Darwin 'as' accepts:
add $0, (%rax)
but rejects
mov $0, (%rax)
for example.
Given that, only accept suffix matches which match exactly one form. We still
need to emit nice diagnostics for failures...
llvm-svn: 103015
- The idea is that when a match fails, we just try to match each of +'b', +'w',
+'l'. If exactly one matches, we assume this is a mnemonic prefix and accept
it. If all match, we assume it is width generic, and take the 'l' form.
- This would be a horrible hack, if it weren't so simple. Therefore it is an
elegant solution! Chris gets the credit for this particular elegant
solution. :)
- Next step to making this more robust is to have the X86 matcher generate the
mnemonic prefix information. Ideally we would also compute up-front exactly
which mnemonic to attempt to match, but this may require more custom code in
the matcher than is really worth it.
llvm-svn: 103012
instructions as the Mac OS X darwin assembler. Some of which like 'fcoml'
assembled to different opcodes. While some of the suffixes were just different.
llvm-svn: 102958
caused the a pushl instruction to be incorrectly encoding using only two bytes
of immediate, causing the following 2 instruction bytes to be part of the 32-bit
immediate value. Also fixed the one byte form of push to be used when the
immediate would fit in a signed extended byte. Lastly changed the names to not
include the 32 of PUSH32 since they actually push the size of the stack pointer.
llvm-svn: 102951
before reglist were not properly handled with respect to IT Block. Fix that by
creating a new method ARMBasicMCBuilder::DoPredicateOperands() used by those
instructions for disassembly. Add a test case.
llvm-svn: 101974
Pseudocode details of conditional, Condition bits '111x' indicate the
instruction is always executed. That is, '1111' is a leagl condition field
value, which is now mapped to ARMCC::AL.
Also add a test case for condition field '1111'.
llvm-svn: 101817
case. Also, the 0xFF hex literal involved in the shift for ESize64 should be
suffixed "ul" to preserve the shift result.
Implemented printHex*ImmOperand() by copying from ARMAsmPrinter.cpp and added a
test case for DisassembleN1RegModImmFrm()/printHex64ImmOperand().
llvm-svn: 101557
to the UAL syntax of LDCL<c>, instead.
Add a test case for this change which also tests the removal of assert() from
printAddrMode2OffsetOperand().
llvm-svn: 101527
current PC. rdar://7834775
We now produce an identical .o file compared to the cctools
assembler for something like this:
_f0:
L0:
jmp L1
.long . - L0
L1:
jmp A
.long . - L1
.zerofill __DATA,_bss,A,0
llvm-svn: 101227
ARM_AM::getSoImmVal(V) with a legitimate so_imm value: #245 rotate right by 2.
Introduce ARM_AM::getSOImmValOneOrNoRotate(unsigned Arg) which is called from
ARMInstPrinter.cpp's printSOImm() function, replacing ARM_AM::getSOImmVal(V).
[12:44:43] johnny:/Volumes/data/llvm/git/trunk (local-trunk) $ gdb Debug/bin/llvm-mc
GNU gdb 6.3.50-20050815 (Apple version gdb-1346) (Fri Sep 18 20:40:51 UTC 2009)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin"...Reading symbols for shared libraries ... done
(gdb) set args -triple=arm-apple-darwin9 -debug-only=arm-disassembler --disassemble
(gdb) r
Starting program: /Volumes/data/llvm/git/trunk/Debug/bin/llvm-mc -triple=arm-apple-darwin9 -debug-only=arm-disassembler --disassemble
Reading symbols for shared libraries ++. done
0xf5 0x71 0xf0 0x53
Opcode=201 Name=MVNi Format=ARM_FORMAT_DPFRM(4)
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
-------------------------------------------------------------------------------------------------
| 0: 1: 0: 1| 0: 0: 1: 1| 1: 1: 1: 1| 0: 0: 0: 0| 0: 1: 1: 1| 0: 0: 0: 1| 1: 1: 1: 1| 0: 1: 0: 1|
-------------------------------------------------------------------------------------------------
mvnpls r7, Assertion failed: (V != -1 && "Not a valid so_imm value!"), function printSOImm, file ARMInstPrinter.cpp, line 229.
Program received signal SIGABRT, Aborted.
0x00007fff88c65886 in __kill ()
(gdb) bt
#0 0x00007fff88c65886 in __kill ()
#1 0x00007fff88d05eae in abort ()
#2 0x00007fff88cf2ef0 in __assert_rtn ()
#3 0x000000010020e422 in printSOImm (O=@0x1010bdf80, V=-1, VerboseAsm=false, MAI=0x1020106d0) at ARMInstPrinter.cpp:229
#4 0x000000010020e5fe in llvm::ARMInstPrinter::printSOImmOperand (this=0x1020107e0, MI=0x7fff5fbfee70, OpNum=1, O=@0x1010bdf80) at ARMInstPrinter.cpp:254
#5 0x00000001001ffbc0 in llvm::ARMInstPrinter::printInstruction (this=0x1020107e0, MI=0x7fff5fbfee70, O=@0x1010bdf80) at ARMGenAsmWriter.inc:3236
#6 0x000000010020c27c in llvm::ARMInstPrinter::printInst (this=0x1020107e0, MI=0x7fff5fbfee70, O=@0x1010bdf80) at ARMInstPrinter.cpp:182
#7 0x000000010003cbff in PrintInsts (DisAsm=@0x10200f4e0, Printer=@0x1020107e0, Bytes=@0x7fff5fbff060, SM=@0x7fff5fbff078) at Disassembler.cpp:65
#8 0x000000010003c8b4 in llvm::Disassembler::disassemble (T=@0x1010c13c0, Triple=@0x1010b6798, Buffer=@0x102010690) at Disassembler.cpp:153
#9 0x000000010004095c in DisassembleInput (ProgName=0x7fff5fbff3f0 "/Volumes/data/llvm/git/trunk/Debug/bin/llvm-mc") at llvm-mc.cpp:347
#10 0x000000010003eefb in main (argc=4, argv=0x7fff5fbff298) at llvm-mc.cpp:374
(gdb) q
The program is running. Exit anyway? (y or n) y
[13:36:26] johnny:/Volumes/data/llvm/git/trunk (local-trunk) $
llvm-svn: 101053
backend (ARMDecoderEmitter) which emits the decoder functions for ARM and Thumb,
and the disassembler core which invokes the decoder function and builds up the
MCInst based on the decoded Opcode.
Reviewed by Chris Latter and Bob Wilson.
llvm-svn: 100233
--- Reverse-merging r99440 into '.':
U test/MC/AsmParser/X86/x86_32-bit_cat.s
U test/MC/AsmParser/X86/x86_32-encoding.s
U include/llvm/IntrinsicsX86.td
U include/llvm/CodeGen/SelectionDAGNodes.h
U lib/Target/X86/X86InstrSSE.td
U lib/Target/X86/X86ISelLowering.h
llvm-svn: 99450
not get an "Unknown immediate size" assert failure when used. All instructions
of this form have an 8-bit immediate. Also added a test case of an example
instruction that is of this form.
llvm-svn: 99435
override prefix and only the r/m16 forms should have had that. Also for variant
one, the AT&T syntax, added suffixes to all forms. Also added the missing
64-bit form for 'CRC32 r64, r/m8'. Plus added test cases for all forms and
tweaked one test case to add the needed suffixes.
llvm-svn: 98980
- This is "extraordinarily" Darwin 'as' compatible. See the litany of FIXMEs littered about for more information.
- There are a few cases which seem to clearly be 'as' bugs which I have left unsupported, and there is one cases where we diverge but should fix if it blocks diffing .o files (Darwin 'as' ends up widening a jump unnecessarily).
- 403.gcc build, runs, and diffs equivalently to the 'as' built version now (using llvm-mc). However, it builds so slowly that I wouldn't recommend trying it quite yet. :)
llvm-svn: 98974
temporary workaround for matching inc/dec on x86_64 to the correct instruction.
- This hack will eventually be replaced with a robust mechanism for handling
matching instructions based on the available target features.
llvm-svn: 98858
- The implementation is currently very brain dead and inefficient, but I have a
clear plan on how to fix it.
- The good news is, it works and correctly assembles 403.gcc (when built with
Clang, at '-Os', '-Os -g', and '-O3'). Even better, at '-Os' and '-Os -g',
the resulting binary is exactly equivalent to that when built with the system
assembler. So it probably works! :)
llvm-svn: 98396
section with TextAlignFillValue and calls EmitCodeAlignment() instead of
calling EmitValueToAlignment(). This allows x86 assembly code to be aligned
with optimal nops.
llvm-svn: 97158
containing the subset of the full auto generated test case that currently
encodes correctly. Again it is useful as we bring up the the new encoder
to make sure currently working stuff stays working.
llvm-svn: 95791
in X86-32 mode. This is still required in x86-64 mode to avoid
forming [disp+rip] encoding. Rewrite the SIB byte decision logic
to be actually understandable.
llvm-svn: 95693
Lock prefix, Repeat string operation prefixes and the Segment override prefixes.
Also added versions of the move string and store string instructions without the
repeat prefixes to X86InstrInfo.td. And finally marked the rep versions of
move/store string records in X86InstrInfo.td as isCodeGenOnly = 1 so tblgen is
happy building the disassembler files.
llvm-svn: 95252
It's unclear if the matcher is nondeterminstic of what here,
but I'm getting matches without TAILCALL and some other hosts
are getting matches with it.
llvm-svn: 95149
This test case is different subset of the full auto generated test case, and a
larger subset that is in x86_32-bit.s (that set will encode correctly). These
instructions can pass though llvm-mc as it were a logical cat(1) and then
reassemble to the same instruction. It is useful as we bring up the parser and
matcher so we don't break things that currently work.
llvm-svn: 95107
something totally broken and parsing them as immediates, but the .td file also
had the wrong match class so things sortof worked. Except, that is, that we
would parse
movl $0, %eax
as
movl 0, %eax
Feel free to guess how well that worked.
llvm-svn: 94869
- This test case is auto generated, and has been verified to round-trip
correctly through llvm-mc by checking the assembled .o file before and after
piping through llvm-mc. It will be extended over time as the matcher grows
support for more instructions.
llvm-svn: 94857
parses the .word directive as 4 bytes and ARMAsmParser::ParseInstruction will
give an error is called. Broke out the test of the .word directive into two
different test cases, one for x86 and one for arm.
llvm-svn: 81817
- I'm still trying to figure out the cleanest way to implement this and match the assembler, currently there are some substantial differences.
llvm-svn: 80347
- I moved section creation back into AsmParser. I think policy decisions like
this should be pushed higher, not lower, when possible (in addition the
assembler has flags which change this behavior, for example).
llvm-svn: 80162
- I haven't really tried to find the "right" way to store the fixups or apply
them, yet. This works, but isn't particularly elegant or fast.
- Still no evaluation support, so we don't actually ever not turn a fixup into
a relocation entry.
llvm-svn: 80089
- This is mostly complete, the main thing missing is .indirect_symbol support
(which would be straight-forward, except that the way it is implemented in
'as' makes getting an exact .o match interesting).
llvm-svn: 79899
- Honor .globl.
- Set symbol type and section correctly ('nm' now works), and order symbols
appropriately.
- Take care to the string table so that the .o matches 'as' exactly (for ease
of testing).
llvm-svn: 79740
(e.g., .objc_message_refs).
- Just emit a .align when we see the directive; this isn't exactly what 'as'
does but in practice it should be ok, at least for now. See FIXME.
llvm-svn: 79697
- Together these form the (Mach-O) back end of the assembler.
- MCAssembler is the actual assembler backend, which is designed to have a
reasonable API. This will eventually grow to support multiple object file
implementations, but for now its Mach-O/i386 only.
- MCMachOStreamer adapts the MCStreamer "actions" API to the MCAssembler API,
e.g. converting the various directives into fragments, managing state like
the current section, and so on.
- llvm-mc will use the new backend via '-filetype=obj', which may eventually
be, but is not yet, since I hear that people like assemblers which actually
assemble.
- The only thing that works at the moment is changing sections. For the time
being I have a Python Mach-O dumping tool in test/scripts so this stuff can
be easily tested, eventually I expect to replace this with a real LLVM tool.
- More doxyments to come.
I assume that since this stuff doesn't touch any of the things which are part of
2.6 that it is ok to put this in not so long before the freeze, but if someone
objects let me know, I can pull it.
llvm-svn: 79612
symbol as the symbol name itself, not the expression it was defined to. These
have different semantics due to the quirky .set behavior (which absolutizes an
expression that would otherwise be treated as a relocation).
llvm-svn: 79025
specific printer (this only works on x86, for now).
- This makes it possible to do some correctness checking of the parsing and
matching, since we can compare the results of 'as' on the original input, to
those of 'as' on the output from llvm-mc.
- In theory, we could now have an easy ATT -> Intel syntax converter. :)
llvm-svn: 78986
- This doesn't actually improve the algorithm (its still linear), but the
generated (match) code is now fairly compact and table driven. Still need a
generic string matcher.
- The table still needs to be compressed, this is quite simple to do and should
shrink it to under 16k.
- This also simplifies and restructures the code to make the match classes more
explicit, in anticipation of resolving ambiguities.
llvm-svn: 78461
I can clean this up a bit more and do way with the TheCondState and just use
the top element on the TheCondStack if not empty. Also may tweak the code
around ParseConditionalAssemblyDirectives() to simplify the AsmParser code.
llvm-svn: 78423
- Still not very sane, but a least its not 60k lines on X86. :)
- In terms of correctness, currently some things are hard wired for X86, and we
still don't properly resolve ambiguities (this is ignoring the instructions
we don't even match due to funny .td stuff or other corner cases).
The high level changes:
1. Represent tokens which are significant for matching explicitly as separate
operands. This uniformly handles not only the instruction mnemonic, but
also 'signficiant' syntax like the '*' in "call * ...".
2. Separate the matching of operands to an instruction from the construction of
the MCInst. In theory this can be done during matching, but since the number
of variations is small I think it makes sense to decompose the problems.
3. Improved a few of the mechanisms to at least successfully flatten / tokenize
the assembly strings for PowerPC and ARM.
4. The comment at the top of AsmMatcherEmitter.cpp explains the approach I'm
moving towards for handling ambiguous instructions. The high-bit is to infer
a partial ordering of the operand classes (and force the user to specify one
if we can't) and use that to resolve ambiguities.
llvm-svn: 78378
- Operands which are just a label should be parsed as immediates, not memory
operands (from the assembler perspective).
- Match a few more flavors of immediates.
- Distinguish match functions for memory operands which don't take a segment
register.
- We match the .s for "hello world" now!
llvm-svn: 77745
- Uses MCAsmToken::getIdentifier which returns the (sub)string representing the
meaningfull contents a string or identifier token.
- Directives aren't done yet.
llvm-svn: 77739
- This is "experimental" code, I am feeling my way around and working out the
best way to do things (and learning tblgen in the process). Comments welcome,
but keep in mind this stuff will change radically.
- This is enough to match "subb" and friends, but not much else. The next step is to
automatically generate the matchers for individual operands.
llvm-svn: 77657
the parsing of the .dump and .load should be done in the assembly parser and
not have any need for an MCStreamer API. Changed the code for now so these
just produce an error saying these specific directives are not yet implemented
since they are likely no longer used and may never need to be implemented.
llvm-svn: 76462
allowed to be undefined when the expression is seen, we cannot enforce the
same-section requirement until the entire assembly file has been seen.
llvm-svn: 74565