This commit caches the value of the AllowAtInIdentifier variable as
a class variable in AsmLexer. We do this to avoid repeated MAI
queries and string comparisons each time we lex an identifier.
llvm-svn: 196622
The integrated assembler fails to properly lex arm comments when
they are adjacent to an identifier in the input stream. The reason
is that the arm comment symbol '@' is also used as symbol variant in
other assembly languages so when lexing an identifier it allows the
'@' symbol as part of the identifier.
Example:
$ cat comment.s
foo:
add r0, r0@got to parse this as a comment
$ llvm-mc -triple armv7 comment.s
comment.s:4:18: error: unexpected token in argument list
add r0, r0@got to parse this as a comment
^
This should be parsed as correctly as `add r0, r0`.
This commit modifes the assembly lexer to not include the '@' symbol
in identifiers when lexing for targets that use '@' for comments.
llvm-svn: 196607
ELF_Other_Weakref and ELF_Other_ThumbFunc seems to be LLVM
internal ELF symbol flags. These should not be emitted to
object file.
This commit defines ELF_STO_Shift for the target-defined
flags for st_other, and increase the value of
ELF_Other_Shift to 16.
llvm-svn: 196440
ARM symbol variants are written with parens instead of @ like this:
.word __GLOBAL_I_a(target1)
This commit adds support for parsing these symbol variants in
expressions. We introduce a new flag to MCAsmInfo that indicates the
parser should use parens to parse the symbol variant. The expression
parser is modified to look for symbol variants using parens instead
of @ when the corresponding MCAsmInfo flag is true.
The MCAsmInfo parens flag is enabled only for ARM on ELF.
By adding this flag to MCAsmInfo, we are able to get rid of
redundant ARM-specific symbol variants and use the generic variants
instead (e.g. VK_GOT instead of VK_ARM_GOT). We use the new
UseParensForSymbolVariant attribute in MCAsmInfo to correctly print
the symbol variants for arm.
To achive this we need to keep a handle to the MCAsmInfo in the
MCSymbolRefExpr class that we can check when printing the symbol
variant.
Updated Tests:
Changed case of symbol variant to match the generic kind.
test/CodeGen/ARM/tls-models.ll
test/CodeGen/ARM/tls1.ll
test/CodeGen/ARM/tls2.ll
test/CodeGen/Thumb2/tls1.ll
test/CodeGen/Thumb2/tls2.ll
PR18080
llvm-svn: 196424
With this patch we use simple names for COMDAT sections (like .text or .bss).
This matches the MSVC behavior.
When merging it is the COMDAT symbol that is used to decide if two sections
should be merged, so there is no point in building a fancy name.
This survived a bootstrap on mingw32.
llvm-svn: 195798
This patch fixes a bug in the assembler that was causing bad code to
be emitted. When switching modes in an assembly file (e.g. arm to
thumb mode) we would always emit the opcode from the original mode.
Consider this small example:
$ cat align.s
.code 16
foo:
add r0, r0
.align 3
add r0, r0
$ llvm-mc -triple armv7-none-linux align.s -filetype=obj -o t.o
$ llvm-objdump -triple thumbv7 -d t.o
Disassembly of section .text:
foo:
0: 00 44 add r0, r0
2: 00 f0 20 e3 blx #4195904
6: 00 00 movs r0, r0
8: 00 44 add r0, r0
This shows that we have actually emitted an arm nop (e320f000)
instead of a thumb nop. Unfortunately, this encodes to a thumb
branch which causes bad things to happen when compiling assembly
code with align directives.
The fix is to notify the ARMAsmBackend when we switch mode. The
MCMachOStreamer was already doing this correctly. This patch makes
the same change for the MCElfStreamer.
There is still a bug in the way nops are emitted for alignment
because the MCAlignment fragment does not store the correct mode.
The ARMAsmBackend will emit nops for the last mode it knew about. In
the example above, we still generate an arm nop if we add a `.code
32` to the end of the file.
PR18019
llvm-svn: 195677
These should not use COMDATs. GNU as uses .bss for .lcomm and section 0 for
.comm.
Given
static int a;
int b;
MSVC puts both in .bss. This patch then puts both .comm and .lcomm on .bss. With
this change we agree with gas on .lcomm, are much closer on .comm and clang-cl
matches msvc on the above example.
llvm-svn: 195654
This is the first step to fix pr17918.
It extends the .section directive a bit, inspired by what the ELF one looks
like. The problem with using linkonce is that given
.section foo
.linkonce....
.section foo
.linkonce
we would already have switched sections when getting to .linkonce. The cleanest
solution seems to be to add the comdat information in the .section itself.
llvm-svn: 195148
This patch removes most of the trivial cases of weak vtables by pinning them to
a single object file. The memory leaks in this version have been fixed. Thanks
Alexey for pointing them out.
Differential Revision: http://llvm-reviews.chandlerc.com/D2068
Reviewed by Andy
llvm-svn: 195064
This reverts commit r190888, to fix PR17967. The original change wasn't
the right way to get @feat.00 into the object file. The right fix is to
make @feat.00 be a global symbol.
llvm-svn: 195053
This change is incorrect. If you delete virtual destructor of both a base class
and a subclass, then the following code:
Base *foo = new Child();
delete foo;
will not cause the destructor for members of Child class. As a result, I observe
plently of memory leaks. Notable examples I investigated are:
ObjectBuffer and ObjectBufferStream, AttributeImpl and StringSAttributeImpl.
llvm-svn: 194997
This patch removes most of the trivial cases of weak vtables by pinning them to
a single object file.
Differential Revision: http://llvm-reviews.chandlerc.com/D2068
Reviewed by Andy
llvm-svn: 194865
There is nothing special about quotes and newlines from the object
file point of view, only the assembler has to worry about expanding
the \n and \".
This patch then removes the special handling from the Mangler.
llvm-svn: 194667
Accepting quotes is a property of an assembler, not of an object file. For
example, ELF can support any names for sections and symbols, but the gnu
assembler only accepts quotes in some contexts and llvm-mc in a few more.
LLVM should not produce different symbols based on a guess about which assembler
will be reading the code it is printing.
llvm-svn: 194575
Objective-C data structures.
This is allows tools such as darwin's otool(1) that uses the
LLVM disassembler take a pointer value being loaded by
an instruction and add a comment to what it is being referenced
to make following disassembly of Objective-C programs
more readable.
For example disassembling the Mac OS X TextEdit app one
will see comments like the following:
movq 0x20684(%rip), %rsi ## Objc selector ref: standardUserDefaults
movq 0x21985(%rip), %rdi ## Objc class ref: _OBJC_CLASS_$_NSUserDefaults
movq 0x1d156(%rip), %r14 ## Objc message: +[NSUserDefaults standardUserDefaults]
leaq 0x23615(%rip), %rdx ## Objc cfstring ref: @"SelectLinePanel"
callq 0x10001386c ## Objc message: -[[%rdi super] initWithWindowNibName:]
These diffs also include putting quotes around C strings
in literal pools and uses "symbol address" in the comment
when adding a symbol name to the comment to tell these
types of references apart:
leaq 0x4f(%rip), %rax ## literal pool for: "Hello world"
movq 0x1c3ea(%rip), %rax ## literal pool symbol address: ___stack_chk_guard
Of course the easy changes are in the LLVM disassembler and
the hard work is up to the implementer of the SymbolLookUp()
call back.
rdar://10602439
llvm-svn: 193833
When assembling, a .thumb_func directive is supposed to be applicable to the
next symbol definition, even if there are intervening directives. We were
racing ahead to try and find it, and this commit should fix the issue.
Patch by Gabor Ballabas
llvm-svn: 193403
Also improve the implementation of EmitRawText(Twine) so it doesn't
bother using the SmallString buffer if the Twine is a simple StringRef
anyway.
llvm-svn: 193378
This is another (final?) stab at making us able to parse our own asm output
on Windows.
Symbols on Windows often contain @'s and ?'s in their names. Our asm parser
didn't like this. ?'s were not allowed, and @'s were intepreted as trying to
reference PLT/GOT/etc.
We can't just add quotes around the bad names, since e.g. for MinGW, we use gas
to assemble, and it doesn't like quotes in some places (notably in .def
directives).
This commit makes us allow ?'s in symbol names, and @'s in symbol names for MS
assembly.
Differential Revision: http://llvm-reviews.chandlerc.com/D1978
llvm-svn: 193000
This caused the clang-native-mingw32-win7 buildbot to break.
The assembler was complaining about the following lines that were showing up
in the asm for CrashRecoveryContext.cpp:
movl $"__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4", 4(%eax)
calll "_AddVectoredExceptionHandler@8"
.def "__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4";
"__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4":
calll "_RemoveVectoredExceptionHandler@4"
Reverting for now.
llvm-svn: 192940
The reason this got reverted was that the @feat.00 symbol which was emitted
for every TU became quoted, and on cygwin/mingw we use the gas assembler which
couldn't handle the quotes.
This commit fixes the problem by only emitting @feat.00 for win32, where we use
clang -cc1as to assemble. gas would just drop this symbol anyway, so there is no
loss there.
With @feat.00 gone, there shouldn't be quoted symbols showing up on cygwin since
it uses the Itanium ABI, which doesn't put these funny characters in symbols.
> Because of win32 mangling, we produce symbol and section names with
> funny characters in them, most notably @ characters.
>
> MC would choke on trying to parse its own assembly output. This patch addresses
> that by:
>
> - Making @ trigger quoting of symbol names
> - Also quote section names in the same way
> - Just parse section names like other identifiers (to allow for quotes)
> - Don't assume @ signifies a symbol variant if it is in a string.
llvm-svn: 192859
This patch fixes a small mistake in MCDataAtom::addData() where it doesn't ever
call remap():
- if (Data.size() > Begin - End - 1)
+ if (Data.size() > End + 1 - Begin)
remap(Begin, End + 1);
This is currently not visible because of another bug is the disassembler, so
the patch includes a unit test.
Patch by Stephen Checkoway.
llvm-svn: 192823
GNU AS didn't like quotes in symbol names.
Error: junk at end of line, first unrecognized character is `"'
.def "@feat.00";
"@feat.00" = 1
Reproduced on Cygwin's 2.23.52.20130309 and mingw32's 2.20.1.20100303.
llvm-svn: 192775
Because of win32 mangling, we produce symbol and section names with
funny characters in them, most notably @ characters.
MC would choke on trying to parse its own assembly output. This patch addresses
that by:
- Making @ trigger quoting of symbol names
- Also quote section names in the same way
- Just parse section names like other identifiers (to allow for quotes)
- Don't assume @ signifies a symbol variant if it is in a string.
Differential Revision: http://llvm-reviews.chandlerc.com/D1945
llvm-svn: 192758
This can happen when processing command line arguments, which
are often stored as std::string's and later turned into
StringRef's via std::string::data(). Unfortunately this
is not guaranteed to return a null-terminated string
until C++11, causing breakage on platforms that don't do this.
llvm-svn: 192558
This patch fixes an old FIXME by creating a MCTargetStreamer interface
and moving the target specific functions for ARM, Mips and PPC to it.
The ARM streamer is still declared in a common place because it is
used from lib/CodeGen/ARMException.cpp, but the Mips and PPC are
completely hidden in the corresponding Target directories.
I will send an email to llvmdev with instructions on how to use this.
llvm-svn: 192181
When MC was first added, targets could use hasRawTextSupport to keep features
working before they were added to the MC interface.
The design goal of MC is to provide an uniform api for printing assembly and
object files. Short of relaxations and other corner cases, a object file is
just another representation of the assembly.
It was never the intention that targets would keep doing things like
if (hasRawTextSupport())
Set flags in one way.
else
Set flags in another way.
When they do that they create two code paths and the object file is no longer
just another representation of the assembly. This also then requires testing
with llc -filetype=obj, which is extremelly brittle.
This patch removes some of these hacks by replacing them with smaller ones.
The ARM flag setting is trivial, so I just moved it to the constructor. For
Mips, the patch adds two temporary hack directives that allow the assembly
to represent the same things as the object file was already able to.
The hope is that the mips developers will replace the hack directives with
the same ones that gas uses and drop the -print-hack-directives flag.
I will also try to implement a target streamer interface, so that we can
move this out of the common code.
In summary, for any new work, two rules of the thumb are
* Don't use "llc -filetype=obj" in tests.
* Don't add calls to hasRawTextSupport.
llvm-svn: 192035
This patch handles LLVM standalone assembler (llvm-mc) ELF flag setting based on input file
directive processing.
Mips assembly requires processing inline directives that directly and
indirectly affect the output ELF header flags. This patch handles one
".abicalls".
To process these directives we are following the model the code generator
uses by storing state in a container as we go through processing and when
we detect the end of input file processing, AsmParser is notified and we
update the ELF header flags through a MipsELFStreamer method with a call from
MCTargetAsmParser::emitEndOfAsmFile(MCStreamer &OutStreamer).
This patch will allow other targets the same functionality.
Jack
llvm-svn: 191982
itinerary model in case the target does not supply a scheduling model.
By doing this, targets like cortex-a8 can benefit from the latency printing
feature added in r191859.
This part of <rdar://problem/14687488>.
llvm-svn: 191916
classes that are marked as Variant as those require an MI to pass to
SubTargetInfo::resolveSchedClass.
This is part of <rdar://problem/14687488>.
llvm-svn: 191864