We previously had support for char and wchar_t string literals. VS 2015
added support for char16_t and char32_t.
String literals must be mangled in the MS ABI in order for them to be
deduplicated across translation units: their linker has no notion of
mergeable section. Instead, they use the mangled name to make a COMDAT
for the string literal; the COMDAT will merge with other COMDATs in
other object files.
This allows strings in object files generated by clang to get merged
with strings in object files generated by MSVC.
llvm-svn: 222564
This adds coverage for Unicode code points which are encoded with
non-zero values in the upper half of the wchar_t.
No functionality change.
llvm-svn: 205251
COFF doesn't have mergeable sections so LLVM/clang's normal tactics for
string deduplication will not have any effect.
To remedy this we place each string inside it's own section and mark
the section as IMAGE_COMDAT_SELECT_ANY. However, we can only do this if the
string has an external name that we can generate from it's contents.
To be compatible with MSVC, we must use their scheme. Otherwise identical
strings in translation units from clang may not be deduplicated with
translation units in MSVC.
This fixes PR18248.
N.B. We will not attempt to do anything with a string literal which is not of
type 'char' or 'wchar_t' because their compiler does not support unicode
string literals as of this date. Further, we avoid doing this if
either -fwritable-strings or -fsanitize=address are present.
This reverts commit r204596.
llvm-svn: 204675
This commit cleans up a few accidents:
- Do not rely on the order in which StringLiteral lays out bytes.
- Use a more efficient mechanism for handling so-called
"special-mappings" when mangling string literals.
- There is no need to allocate a copy of the mangled name.
- Add the test written for r204562.
Thanks to Richard Smith for pointing these out!
llvm-svn: 204586