Commit Graph

13 Commits

Author SHA1 Message Date
Corentin Jabot 274adcb866 Implement delimited escape sequences.
\x{XXXX} \u{XXXX} and \o{OOOO} are accepted in all languages mode
in characters and string literals.

This is a feature proposed for both C++ (P2290R1) and C (N2785). The
papers have been seen by both committees but are not yet adopted into
either standard. However, they do have support from both committees.
2021-09-15 09:54:49 -04:00
Corentin Jabot 4e80636db7 Implement P1949
This adds the Unicode 13 data for XID_Start and XID_Continue.
The definition of valid identifier is changed in all C++ modes
as P1949 (https://wg21.link/p1949) was accepted by WG21 as a defect
report.
2021-08-18 07:33:14 -04:00
Richard Smith e87aeb378d When pretty-printing a C++11 literal operator, don't insert whitespace between
the "" and the suffix; that breaks names such as 'operator""if'. For symmetry,
also remove the space between the 'operator' and the '""'.

llvm-svn: 249641
2015-10-08 00:17:59 +00:00
Alp Toker b05e0b53b9 Preprocessor: support defined() with operator names for MS compatibility
Also flesh out missing tests, improve diagnostic QOI and fix a couple of corner
cases found in the process.

Fixes PR10606.

llvm-svn: 209276
2014-05-21 06:13:51 +00:00
Richard Smith 4ee696d55c PR18870: Parse language linkage specifiers properly if the string-literal is
spelled in an interesting way.

llvm-svn: 201536
2014-02-17 23:25:27 +00:00
Richard Smith 8b7258bdb3 PR18855: Add support for UCNs and UTF-8 encoding within ud-suffixes.
llvm-svn: 201532
2014-02-17 21:52:30 +00:00
Richard Smith 6f21206850 DR1473: Do not require a space between operator"" and the ud-suffix in a
literal-operator-id.

llvm-svn: 166373
2012-10-20 08:41:10 +00:00
Richard Smith bcc22fc4e1 Support for raw and template forms of numeric user-defined literals,
and lots of tidying up.

llvm-svn: 152392
2012-03-09 08:00:36 +00:00
Richard Smith 7d182a7909 Fix a couple of issues with literal-operator-id parsing, and provide recovery
for a few kinds of error. Specifically:

Since we're after translation phase 6, the "" token might be formed by multiple
source-level string literals. Checking the token width is not a correct way of
detecting empty string literals, due to escaped newlines. Diagnose and recover
from a missing space between "" and suffix, and from string literals other than
"", which are followed by a suffix.

llvm-svn: 152348
2012-03-08 23:06:02 +00:00
Richard Smith 39570d0020 Add support for cooked forms of user-defined-integer-literal and
user-defined-floating-literal. Support for raw forms of these literals
to follow.

llvm-svn: 152302
2012-03-08 08:45:32 +00:00
Richard Smith 75b67d6dc5 User-defined literal support for character literals.
llvm-svn: 152277
2012-03-08 01:34:56 +00:00
Richard Smith c67fdd4eb9 AST representation for user-defined literals, plus just enough of semantic
analysis to make the AST representation testable. They are represented by a
new UserDefinedLiteral AST node, which is a sugared CallExpr. All semantic
properties, including full CodeGen support, are achieved for free by this
representation.

UserDefinedLiterals can never be dependent, so no custom instantiation
behavior is required. They are mangled as if they were direct calls to the
underlying literal operator. This matches g++'s apparent behavior (but not its
actual mangling, which is broken for literal-operator-ids).

User-defined *string* literals are now fully-operational, but the semantic
analysis is quite hacky and needs more work. No other forms of user-defined
literal are created yet, but the AST support for them is present.

This patch committed after midnight because we had already hit the quota for
new kinds of literal yesterday.

llvm-svn: 152211
2012-03-07 08:35:16 +00:00
Richard Smith d67aea28f6 User-defined literals: reject string and character UDLs in all places where the
grammar requires a string-literal and not a user-defined-string-literal. The
two constructs are still represented by the same TokenKind, in order to prevent
a combinatorial explosion of different kinds of token. A flag on Token tracks
whether a ud-suffix is present, in order to prevent clients from needing to look
at the token's spelling.

llvm-svn: 152098
2012-03-06 03:21:47 +00:00