After this I will set the default back to F_None. The advantage is that
before this patch forgetting to set F_Binary would corrupt a file on windows.
Forgetting to set F_Text produces one that cannot be read in notepad, which
is a better failure mode :-)
llvm-svn: 202052
The .error directive is similar to .err in that it will halt assembly if it is
evaluated for assembly. However, it permits a user supplied message to be
rendered.
llvm-svn: 201999
The .ifeqs directive assembles the following code if the quoted string
parameters are equal. The strings must be quoted using double quotes.
llvm-svn: 201998
The .ifne directive assembles the following section of code if the argument
expression is non-zero. Effectively, it is equivalent to if.
llvm-svn: 201986
If the strings are not quoted, the first string stops at the first comma, and
the second string stops at the end of the line. Strings which contain
whitespace should be quoted. Unquoted space is to be discarded.
llvm-svn: 201985
The .err directive produces an error whenever it is assembled. This can be
useful for preventing assembly when an unexpected condition occurs.
llvm-svn: 201984
This enhances the macro parser to parse and handle parameter qualifications,
which is needed to support required formal parameters in macro definitions. A
required parameter may not be defaulted (though providing a default value is
accepted with a warning). This improves GAS compatibility.
Partially addresses PR9248.
llvm-svn: 201630
Rather than using std::pair, create a structure to represent the type. This is
a preliminary refactoring to enable required parameter handling. Additional
state is needed to indicate required parameters. This has a minor side effect
of improving readability by providing more accurate names compared to first and
second.
llvm-svn: 201629
This is implemented by handling assignments to the '.' pseudo symbol
as ".org" directives.
Differential Revision: http://llvm-reviews.chandlerc.com/D2625
llvm-svn: 201530
Until this point only macro definition with named parameters were parsed but the
names were ignored. This adds support for using that information for named
parameter instantiation.
In order to support the full semantics of the keyword arguments, the arguments
are no longer lazily initialised since the keyword arguments can be specified
out of order and partially if they are defaulted. Prepopulate the arguments
with the default value for any defaulted parameters, and then parse the
specified arguments.
This simplies some of the handling of the arguments in the inner loop since
empty arguments simply increment the parameter index and move on.
Note that keyword and positional arguments cannot be mixed.
llvm-svn: 201499
The Linux kernel defines empty macros for compatibility with ARM UAL syntax.
The comma after the name is optional, and if present can be safely lexed. This
improves compatibility with the GNU assembler.
llvm-svn: 201474
Some of the more complex directive and macro handling for GAS compatibility
requires lookahead. Add a single token lookahead in the MCAsmLexer.
llvm-svn: 201058
This patch fixes the ldr-pseudo implementation to work when used in
inline assembly. The fix is to move arm assembler constant pools
from the ARMAsmParser class to the ARMTargetStreamer class.
Previously we kept the assembler generated constant pools in the
ARMAsmParser object. This does not work for inline assembly because
a new parser object is created for each blob of inline assembly.
This patch moves the constant pools to the ARMTargetStreamer class
so that the constant pool will remain alive for the entire code
generation process.
An ARMTargetStreamer class is now required for the arm backend.
There was no existing implementation for MachO, only Asm and ELF.
Instead of creating an empty MachO subclass, we decided to make the
ARMTargetStreamer a non-abstract class and provide default
(llvm_unreachable) implementations for the non constant-pool related
methods.
Differential Revision: http://llvm-reviews.chandlerc.com/D2638
llvm-svn: 200777
This is a minimal implementation which accepts only constants rather than
full expressions, but that should be perfectly sufficient for all known
users for now.
Patch from PaX Team <pageexec@freemail.hu>
llvm-svn: 200614
This will be needed for .octa support, but we don't want to just use the
existing AsmLexer::Integer for it and then have to litter all its users
with explicit checks for the size, and make them use the new get APIntVal()
method.
So let the lexer produce an AsmLexer::Integer as before for numbers which
are small enough — which appears to cover what was previously a nasty
special case handling of numbers which don't fit in int64_t but *do* fit
in uint64_t.
Where the number is too large even for that, produce an AsmLexer::BigNum
instead. We do nothing with these except complain about them for now,
but that will be changed shortly...
Based on a patch from PaX Team <pageexec@freemail.hu>
llvm-svn: 200613
Per the GAS documentation, .fill should permit pattern widths that
aren't a power of two. While I was in the neighborhood, I added some
sanity checking. This change was motivated by a use of this construct
in the Linux Kernel.
llvm-svn: 200606
The linux kernel makes uses of a GAS `feature' which substitutes nothing
for macro arguments which aren't specified.
Proper support for these kind of macro arguments necessitated a cleanup of
differences between `GAS' and `Darwin' dialect macro processing.
Differential Revision: http://llvm-reviews.chandlerc.com/D2634
llvm-svn: 200409
This commit allows LLVM MC to process .cfi_startproc directives when
they are followed by an additional `simple' identifier. This signals to
elide the emission of target specific CFI instructions that would
normally occur initially.
This fixes PR16587.
Differential Revision: http://llvm-reviews.chandlerc.com/D2624
llvm-svn: 200227
An emitted diagnostic for an invalid relocation variant would place the caret on
the token following the relocation variant indicator or at the end of the line
if there was no following token. This change corrects the placement of the
caret to point to the token.
llvm-svn: 200159
ARM assembly syntax uses @ for a comment, execpt for the second
parameter of the .symver directive which requires @ as part of the
symbol name. This commit fixes the parsing of this directive by
adding a special case for ARM for this one argumnet.
To make the change we had to move the AllowAtInIdentifier variable
to the MCAsmLexer interface (from AsmLexer) and expose a setter for
the value. The ELFAsmParser then toggles this value when parsing
the second argument to the .symver directive for a target that
uses @ as a comment symbol
llvm-svn: 199339
subsequent changes are easier to review. About to fix some layering
issues, and wanted to separate out the necessary churn.
Also comment and sink the include of "Windows.h" in three .inc files to
match the usage in Memory.inc.
llvm-svn: 198685
Introduce a new virtual method Note into the AsmParser. This completements the
existing Warning and Error methods. Use the new method to clean up the output
of the unwind routines in the ARM AsmParser.
llvm-svn: 198661
The GNU assembler supports .rep as an alias for .rept. This simply creates the
alias for it and introduces a test for both .rept and .rep.
llvm-svn: 198097
This callback is invoked when the parse has finished successfuly. It
will be used to write out ARM constant pools to implement the ldr
pseudo.
llvm-svn: 197706
The .end directive indicates the end of the file. No further instructions are
processed after a .end directive is encountered.
One potential (glaringly obvious) optimisation that could be pursued here is to
extend MCAsmParser with a DiscardRemainder method to avoid processing lexemes to
the end of the file. It was unclear at this point if that would be worth
adding, and could easily be added in a follow on change.
Signed-off-by: Saleem Abdulrasool <compnerd@compnerd.org>
llvm-svn: 197547
This re-lands commit r196876, which was reverted in r196879.
The tests have been fixed to pass on platforms with a stack alignment
larger than 4.
Update to clang side tests will land shortly.
llvm-svn: 196939
For stack frames requiring realignment, three pointers may be needed:
- ebp to address incoming arguments
- esi (could be any callee-saved register) to address locals
- esp to address outgoing arguments
We would use esi unconditionally without verifying that it did not
conflict with inline assembly.
This change doesn't do the verification, it simply emits a fatal error
on functions that use stack realignment, dynamic SP adjustments, and
inline assembly.
Because stack realignment is common on Windows, we also no longer assume
that MS inline assembly clobbers esp. Instead, we analyze the inline
instructions for implicit definitions and check if esp is there. If so,
we require the use of a base pointer and consider it in the condition
above.
Mostly fixes PR16830, but we could try harder to find a non-conflicting
base pointer.
Reviewers: sunfish
Differential Revision: http://llvm-reviews.chandlerc.com/D1317
llvm-svn: 196876
This commit caches the value of the AllowAtInIdentifier variable as
a class variable in AsmLexer. We do this to avoid repeated MAI
queries and string comparisons each time we lex an identifier.
llvm-svn: 196622
The integrated assembler fails to properly lex arm comments when
they are adjacent to an identifier in the input stream. The reason
is that the arm comment symbol '@' is also used as symbol variant in
other assembly languages so when lexing an identifier it allows the
'@' symbol as part of the identifier.
Example:
$ cat comment.s
foo:
add r0, r0@got to parse this as a comment
$ llvm-mc -triple armv7 comment.s
comment.s:4:18: error: unexpected token in argument list
add r0, r0@got to parse this as a comment
^
This should be parsed as correctly as `add r0, r0`.
This commit modifes the assembly lexer to not include the '@' symbol
in identifiers when lexing for targets that use '@' for comments.
llvm-svn: 196607
ARM symbol variants are written with parens instead of @ like this:
.word __GLOBAL_I_a(target1)
This commit adds support for parsing these symbol variants in
expressions. We introduce a new flag to MCAsmInfo that indicates the
parser should use parens to parse the symbol variant. The expression
parser is modified to look for symbol variants using parens instead
of @ when the corresponding MCAsmInfo flag is true.
The MCAsmInfo parens flag is enabled only for ARM on ELF.
By adding this flag to MCAsmInfo, we are able to get rid of
redundant ARM-specific symbol variants and use the generic variants
instead (e.g. VK_GOT instead of VK_ARM_GOT). We use the new
UseParensForSymbolVariant attribute in MCAsmInfo to correctly print
the symbol variants for arm.
To achive this we need to keep a handle to the MCAsmInfo in the
MCSymbolRefExpr class that we can check when printing the symbol
variant.
Updated Tests:
Changed case of symbol variant to match the generic kind.
test/CodeGen/ARM/tls-models.ll
test/CodeGen/ARM/tls1.ll
test/CodeGen/ARM/tls2.ll
test/CodeGen/Thumb2/tls1.ll
test/CodeGen/Thumb2/tls2.ll
PR18080
llvm-svn: 196424
This is the first step to fix pr17918.
It extends the .section directive a bit, inspired by what the ELF one looks
like. The problem with using linkonce is that given
.section foo
.linkonce....
.section foo
.linkonce
we would already have switched sections when getting to .linkonce. The cleanest
solution seems to be to add the comdat information in the .section itself.
llvm-svn: 195148
When assembling, a .thumb_func directive is supposed to be applicable to the
next symbol definition, even if there are intervening directives. We were
racing ahead to try and find it, and this commit should fix the issue.
Patch by Gabor Ballabas
llvm-svn: 193403
This is another (final?) stab at making us able to parse our own asm output
on Windows.
Symbols on Windows often contain @'s and ?'s in their names. Our asm parser
didn't like this. ?'s were not allowed, and @'s were intepreted as trying to
reference PLT/GOT/etc.
We can't just add quotes around the bad names, since e.g. for MinGW, we use gas
to assemble, and it doesn't like quotes in some places (notably in .def
directives).
This commit makes us allow ?'s in symbol names, and @'s in symbol names for MS
assembly.
Differential Revision: http://llvm-reviews.chandlerc.com/D1978
llvm-svn: 193000
This caused the clang-native-mingw32-win7 buildbot to break.
The assembler was complaining about the following lines that were showing up
in the asm for CrashRecoveryContext.cpp:
movl $"__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4", 4(%eax)
calll "_AddVectoredExceptionHandler@8"
.def "__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4";
"__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4":
calll "_RemoveVectoredExceptionHandler@4"
Reverting for now.
llvm-svn: 192940
The reason this got reverted was that the @feat.00 symbol which was emitted
for every TU became quoted, and on cygwin/mingw we use the gas assembler which
couldn't handle the quotes.
This commit fixes the problem by only emitting @feat.00 for win32, where we use
clang -cc1as to assemble. gas would just drop this symbol anyway, so there is no
loss there.
With @feat.00 gone, there shouldn't be quoted symbols showing up on cygwin since
it uses the Itanium ABI, which doesn't put these funny characters in symbols.
> Because of win32 mangling, we produce symbol and section names with
> funny characters in them, most notably @ characters.
>
> MC would choke on trying to parse its own assembly output. This patch addresses
> that by:
>
> - Making @ trigger quoting of symbol names
> - Also quote section names in the same way
> - Just parse section names like other identifiers (to allow for quotes)
> - Don't assume @ signifies a symbol variant if it is in a string.
llvm-svn: 192859
GNU AS didn't like quotes in symbol names.
Error: junk at end of line, first unrecognized character is `"'
.def "@feat.00";
"@feat.00" = 1
Reproduced on Cygwin's 2.23.52.20130309 and mingw32's 2.20.1.20100303.
llvm-svn: 192775
Because of win32 mangling, we produce symbol and section names with
funny characters in them, most notably @ characters.
MC would choke on trying to parse its own assembly output. This patch addresses
that by:
- Making @ trigger quoting of symbol names
- Also quote section names in the same way
- Just parse section names like other identifiers (to allow for quotes)
- Don't assume @ signifies a symbol variant if it is in a string.
Differential Revision: http://llvm-reviews.chandlerc.com/D1945
llvm-svn: 192758
This patch handles LLVM standalone assembler (llvm-mc) ELF flag setting based on input file
directive processing.
Mips assembly requires processing inline directives that directly and
indirectly affect the output ELF header flags. This patch handles one
".abicalls".
To process these directives we are following the model the code generator
uses by storing state in a container as we go through processing and when
we detect the end of input file processing, AsmParser is notified and we
update the ELF header flags through a MipsELFStreamer method with a call from
MCTargetAsmParser::emitEndOfAsmFile(MCStreamer &OutStreamer).
This patch will allow other targets the same functionality.
Jack
llvm-svn: 191982
CFE produces it to indicate artificial locations.
c.f.: DWARF standard, Table 6.2:
line -- An unsigned integer indicating a source line number. Lines are numbered beginning at 1. The compiler may emit the value 0 in cases where an instruction cannot be attributed to any source line.
llvm-svn: 191471
The binutils assembler supports a mode called DOLLAR_DOT which treats
the dollar sign token as a reference to the current program counter if
the dollar sign doesn't precede a constant or identifier.
This commit adds a new MCAsmInfo flag stating whether or not a given
target supports this interpretation of the dollar sign token; by
default, this flag is not enabled.
Further, enable this flag for PPC. The system assembler for AIX and
binutils both support using the dollar sign in this manner.
This fixes PR17353.
llvm-svn: 191368
This makes using array_pod_sort significantly safer. The implementation relies
on function pointer casting but that should be safe as we're dealing with void*
here.
llvm-svn: 191175
Clean up some simple code quality issues. Bring internal naming
conventions up to current standard, fix inconsistent formatting, and
tidy up a couple of odd contructs.
llvm-svn: 191117
Summary:
The '?' flag uses the last section group if the last had a section
group. We treat combining an explicit section group and the '?' as a
hard error.
This fixes PR17198.
Reviewers: rafael, bkramer
Reviewed By: bkramer
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1686
llvm-svn: 190768
with a debug build) with this buggy .indirect_symbol directive usage:
% cat test.s
x: .indirect_symbol _y
The assertion is because it is trying to get the symbol index for the
symbol _y when it is writing out the indirect symbol table. This line of
code in MachObjectWriter::WriteObject() :
Write32(Asm.getSymbolData(*it->Symbol).getIndex());
And while there is a symbol _y it does not have any getSymbolData set which
is only done in MachObjectWriter::BindIndirectSymbols() for pointer sections
or stub sections. I added a check and an error in there to catch this in case
something slips through.
But to get a better error the parser should detect when a .indirect_symbol
directive is used and it is not in a pointer section or stub section. To make
that work I moved the handling of the indirect symbol out of the target
independent AsmParser code into the DarwinAsmParser code that can check
for the proper Mach-O section types.
rdar://14825505
llvm-svn: 189497
It's useful to be able to write down floating-point numbers without having to
worry about what they'll be rounded to (as C99 discovered), this extends that
ability to the MC assembly parsers.
llvm-svn: 188370
Currently, when an invalid attribute is encountered on processing a .s file,
clang will abort due to llvm_unreachable. Invalid user input should not cause
an abnormal termination of the compiler. Change the interface to return a
boolean to indicate the failure as a first step towards improving hanlding of
malformed user input to clang.
Signed-off-by: Saleem Abdulrasool <compnerd@compnerd.org>
llvm-svn: 188047
This is dead code since PIC16 was removed in 2010. The result was an odd mix,
where some parts would carefully pass it along and others would assert it was
zero (most of the object streamer for example).
llvm-svn: 185436
that have been run through the 'C' pre-processor.
The implementation of SrcMgr.FindLineNumber() is slow but OK if
it uses its cache when called multiple times with an SMLoc that is
forward of the previous call.
In the case of generating dwarf for assembly source files that have
been run through the 'C' pre-processor we need to calculate the
logical line number based on the last parsed cpp hash file line
comment. And the current code calls SrcMgr.FindLineNumber()
twice to do this causing its cache not to work and results in very
slow compile times:
% time /Volumes/SandBox/build-llvm/Debug+Asserts/bin/llvm-mc -triple thumbv7-apple-ios -filetype=obj -o /tmp/x.o mscorlib.dll.E -g
672.542u 0.299s 11:13.15 99.9% 0+0k 0+2io 2106pf+0w
So we save the info from the last parsed cpp hash file line comment
to avoid making the second call to SrcMgr.FindLineNumber() most times
and end up with compile times like:
% time /Volumes/SandBox/build-llvm/Debug+Asserts/bin/llvm-mc -triple thumbv7-apple-ios -filetype=obj -o /tmp/x.o mscorlib.dll.E -g
3.404u 0.104s 0:03.80 92.1% 0+0k 0+3io 2105pf+0w
rdar://14156934
llvm-svn: 184592
The assembler parser common code supports recognizing symbol variants
using the @ modifer. On PowerPC, it should also be possible to use
(some of) those modifiers with directional labels, like "1f@l".
This patch adds support for accepting symbol variants on directional
labels as well.
llvm-svn: 184437
Test cases that regressed due to r179115, plus a few more, were added in
r179182. Original commit message below:
[ms-inline asm] Use parsePrimaryExpr in lieu of parseExpression if we need to
parse an identifier. Otherwise, parseExpression may parse multiple tokens,
which makes it impossible to properly compute an immediate displacement.
An example of such a case is the source operand (i.e., [Symbol + ImmDisp]) in
the below example:
__asm mov eax, [Symbol + ImmDisp]
Part of rdar://13611297
llvm-svn: 179187
parse an identifier. Otherwise, parseExpression may parse multiple tokens,
which makes it impossible to properly compute an immediate displacement.
An example of such a case is the source operand (i.e., [Symbol + ImmDisp]) in
the below example:
__asm mov eax, [Symbol + ImmDisp]
The existing test cases exercise this patch.
rdar://13611297
llvm-svn: 179115
rather than deriving the StringRef from the Start and End SMLocs.
Using the Start and End SMLocs works fine for operands such as [Symbol], but
not for operands such as [Symbol + ImmDisp]. All existing test cases that
reference a variable exercise this patch.
rdar://13602265
llvm-svn: 179109
added back in by X86AsmPrinter::printIntelMemReference() during codegen.
Previously, this following example
void t() {
int i;
__asm mov eax, [i]
}
would generate the below assembly
mov eax, dword ptr [[eax]]
which resulted in a fatal error when compiling. Test case coming on the
clang side.
rdar://13444264
llvm-svn: 177440
For integer constants, allow 'L', 'UL' as well as 'ULL' and 'LL'. This provides
better support for shared headers between .s and .c files that define bunches
of constant values.
rdar://9321056
llvm-svn: 176118
GNU as rejects them and there are configure scripts in the wild that check if
the assembler rejects ".align 3" to determine whether the alignment is in bytes
or powers of two.
llvm-svn: 175360
Input/Output rewrite to the same location. Make sure the SizeDirective rewrite
is performed first. This also ensure the sort algorithm is stable.
llvm-svn: 175317
This is complicated by backward labels (e.g., 0b can be both a backward label
and a binary zero). The current implementation assumes [0-9]b is always a
label and thus it's possible for 0b and 1b to not be interpreted correctly for
ms-style inline assembly. However, this is relatively simple to fix in the
inline assembly (i.e., drop the [bB]).
This patch also limits backward labels to [0-9]b, so that only 0b and 1b are
ambiguous.
Part of rdar://12470373
llvm-svn: 174983
the body does not use them and it appears the body has positional parameters.
This can cause unexpected results as in the added test case. As the darwin
version of gas(1) which only supported positional parameters, happened to
ignore the named parameters. Now that we want to support both styles of
macros we issue a warning in this specific case.
rdar://12861644
llvm-svn: 173199
// FIXME: Constraints are hard coded to 'm', but we need an 'r'
// constraint for addressof. This needs to be cleaned up!
Test cases are already in place. Specifically,
clang/test/CodeGen/ms-inline-asm.c t15(), t16(), and t24().
llvm-svn: 172569
After discussing the refactoring with Jim and Daniel, the following changes were
made:
* All generic directive parsing is now done by AsmParser itself. The previous
division between it and GenericAsmParser did not have clear boundaries and
just produced unnatural code of GenericAsmParser juggling the internals of
AsmParser through an interface.
The division of responsibilities is now clear: target-specific directives,
other extensions (used by platform-specific parseres), and generic directives.
* Priority for directive parsing was reshuffled to ask extensions first and
check the generic directives later.
No change in functionality.
llvm-svn: 172568
simply use the getParser method from MCAsmParserExtension, working through the
MCAsmParser interface. There's no longer a need to overload that method to
cast it to the concrete AsmParser.
llvm-svn: 172491
This finally allows AsmParser to no longer list GenericAsmParser as a friend.
All member vars directly accessed by GenericAsmParser have been properly
encapsulated and exposed through the MCAsmParser interface. This reduces the
coupling between AsmParser and GenericAsmParser.
llvm-svn: 172490
Now that it behaves itself in terms of streamer independence (r172450), this
method can be moved to MCAsmParser to be available to all extensions,
overriding, etc.
-- -This line, and those below, will be ignored--
M lib/MC/MCParser/AsmParser.cpp
M include/llvm/MC/MCParser/MCAsmParser.h
llvm-svn: 172451
The aim of this patch is to fix the following piece of code in the
platform-independent AsmParser:
void AsmParser::CheckForValidSection() {
if (!ParsingInlineAsm && !getStreamer().getCurrentSection()) {
TokError("expected section directive before assembly directive");
Out.SwitchSection(Ctx.getMachOSection(
"__TEXT", "__text",
MCSectionMachO::S_ATTR_PURE_INSTRUCTIONS,
0, SectionKind::getText()));
}
}
This was added for the "-n" option of llvm-mc.
The proposed fix adds another virtual method to MCStreamer, called
InitToTextSection. Conceptually, it's similar to the existing
InitSections which initializes all common sections and switches to
text. The new method is implemented by each platform streamer in a way
that it sees fit. So AsmParser can now do this:
void AsmParser::CheckForValidSection() {
if (!ParsingInlineAsm && !getStreamer().getCurrentSection()) {
TokError("expected section directive before assembly directive");
Out.InitToTextSection();
}
}
Which is much more reasonable.
llvm-svn: 172450
Since it's used by extensions. One further step to fully decoupling
GenericAsmParser from an intimate knowledge of the internals of AsmParser,
pointing it to the MCASmParser interface instead (like all other parser
extensions do).
Since this change moves the MacroArgument type to the interface header, it's
renamed to be a bit more descriptive in a general context.
llvm-svn: 172449
The methods are also exposed via the MCAsmParser interface, which allows more
than one client to control them. Previously, GenericAsmParser was playing with
a member var in AsmParser directly (by virtue of being its friend).
llvm-svn: 172440
The MCAsmParser interface defines ParseIdentifier is public. There's no reason
whatsoever for AsmParser (which implements the MCAsmParser interface) to hide
this method.
This is all part of a bigger scheme. Several asm parsing "extensions" use the
main parser properly through the MCAsmParser interface. However,
GenericAsmParser has much more exclusive access and uses implementation details
from the concrete implementation - AsmParser, in which it is also declared as
a friend. This makes for overly coupled code, and even makes it hard to split
GenericAsmParser into a separate file. There's no reason why GenericAsmParser
shouldn't be able to access AsmParser through an abstract interface, as long
as it's actually registered as an extension.
llvm-svn: 172276
GenericAsmParser extension, where a lot of directives are already being parsed.
The end goal is having just a single place (and a single lookup table) for
all directive parsing.
llvm-svn: 172268