llvm-project

Author	SHA1	Message	Date
Richard Smith	8ed7776bc4	PR38870: Add warning for zero-width unicode characters appearing in identifiers. llvm-svn: 341700	2018-09-07 19:25:39 +00:00
Richard Smith	77091b167f	Warn if we find a Unicode homoglyph for a symbol in an identifier. Specifically, warn if: (1) we find a character that the language standard says we must treat as an identifier, and (2) that character is not reasonably an identifier character (it's a punctuation character or similar), and (3) it renders identically to a valid non-identifier character in common fixed-width fonts. Some tools "helpfully" substitute the surprising characters for the expected characters, and replacing semicolons with Greek question marks is a common "prank". llvm-svn: 320697	2017-12-14 13:15:08 +00:00
Richard Smith	664798c034	Add test that we correctly allow some non-letter unicode characters in identifiers, and extend existing test to also cover C++. llvm-svn: 248079	2015-09-19 02:14:12 +00:00
Jordan Rose	cc538345be	Lexer: Don't warn about Unicode in preprocessor directives. This allows people to use Unicode in their #pragma mark and in macros that exist only to be string-ized. <rdar://problem/13107323&13121362> llvm-svn: 174081	2013-01-31 19:48:48 +00:00
Jordan Rose	17441589c3	Don't warn about Unicode characters in -E mode. People use the C preprocessor for things other than C files. Some of them have Unicode characters. We shouldn't warn about Unicode characters appearing outside of identifiers in this case. There's not currently a way for the preprocessor to tell if it's in -E mode, so I added a new flag, derived from the PreprocessorOutputOptions. This is only used by the Unicode warnings for now, but could conceivably be used by other warnings or even behavioral differences later. <rdar://problem/13107323> llvm-svn: 173881	2013-01-30 01:52:57 +00:00
Jordan Rose	4246ae0089	As an extension, treat Unicode whitespace characters as whitespace. llvm-svn: 173370	2013-01-24 20:50:50 +00:00

6 Commits