Commit Graph

141 Commits

Author SHA1 Message Date
Hubert Tong 614c406f9e [libcxx] Add "flag" default arg: basic_regex ptr_size_flag ctor
Summary:
The synopsis in C++11 subclause 28.8 [re.regex] has:
```
basic_regex(const charT* p, size_t len,
            flag_type f = regex_constants::ECMAScript);
```

The default argument is added to libc++ by this change.

Reviewers: mclow.lists, rsmith, hubert.reinterpretcast

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D22702

llvm-svn: 277966
2016-08-07 22:18:33 +00:00
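
A minimal sketch of what the default argument permits, assuming a standard library that follows the C++11 synopsis quoted above. The helper name `full_match_ptr_len` is hypothetical and not part of libc++.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper (not part of libc++): compiles the first `len`
// characters of `p` with the (pointer, length) constructor and no explicit
// flag, so the flag falls back to regex_constants::ECMAScript.
bool full_match_ptr_len(const char* p, std::size_t len, const std::string& text) {
    assert(p != nullptr);
    std::regex re(p, len);  // relies on the default flag argument
    return std::regex_match(text, re);
}

// Usage: only the first three characters ("ab+") form the pattern.
void check_ptr_len_ctor() {
    assert(full_match_ptr_len("ab+XYZ", 3, "abbb"));
    assert(!full_match_ptr_len("ab+XYZ", 3, "abbbXYZ"));
}
```

Before this change the same call would have required spelling out the flag explicitly.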
Hubert Tong ac98d59802 [libcxx] basic_regex: add traits_type, string_type
Summary:
In the synopsis in C++11 subclause 28.8 [re.regex], `basic_regex` is
specified to have member typedefs `traits_type` and `string_type`. This
change adds them to libc++.

Reviewers: mclow.lists, rsmith, hubert.reinterpretcast

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D22698

Patch by Jason Liu!

llvm-svn: 277526
2016-08-02 21:34:48 +00:00
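
A short sketch of how the two member typedefs can be used, assuming a library that provides them as the C++11 synopsis specifies. The helper `to_string_type` is a made-up name for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <type_traits>

// The typedefs named in the commit, checked at compile time.
static_assert(std::is_same<std::regex::traits_type, std::regex_traits<char>>::value,
              "traits_type should name the traits template argument");
static_assert(std::is_same<std::regex::string_type, std::string>::value,
              "string_type should be the traits' string type");
static_assert(std::is_same<std::wregex::string_type, std::wstring>::value,
              "wide regexes should use std::wstring");

// Hypothetical generic helper: builds a string of the type that matches
// the given regex specialization.
template <class Regex>
typename Regex::string_type to_string_type(const typename Regex::value_type* s) {
    return typename Regex::string_type(s);
}

void check_typedefs() {
    assert(to_string_type<std::regex>("a+") == "a+");
    assert(to_string_type<std::wregex>(L"a+") == L"a+");
}
```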
Daniel Sanders 4788179a2c [libcxx] Fix definition of regex_traits::__regex_word on big-endian glibc systems
Summary:
On glibc, the bits used for the various character classes are endian-dependent
(see _ISbit() in ctype.h), but __regex_word does not account for this and uses
a spare bit that isn't spare on big-endian. On big-endian, it overlaps with the
bit for graphic characters, which causes '-', '@', etc. to be considered word
characters.

Fixed this by defining the value using _ISbit(15) on MIPS glibc systems. We've
restricted this to MIPS for now to avoid the risk of introducing failures in
other targets.

Fixes PR26476.

Reviewers: hans, mclow.lists

Subscribers: dsanders, cfe-commits

Differential Revision: http://reviews.llvm.org/D17132

llvm-svn: 261088
2016-02-17 13:16:31 +00:00
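
The user-visible symptom described above can be checked with a sketch like the following. It only exercises the public `\w` class; the helper name `is_word_char` is hypothetical, and the sketch does not touch the internal `__regex_word` value.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if the single character `c` is matched by \w.
bool is_word_char(char c) {
    static const std::regex word("\\w");
    return std::regex_match(std::string(1, c), word);
}

void check_word_class() {
    // Letters, digits and underscore are word characters.
    assert(is_word_char('a'));
    assert(is_word_char('7'));
    assert(is_word_char('_'));
    // Punctuation such as '-' and '@' must not be.
    assert(!is_word_char('-'));
    assert(!is_word_char('@'));
}
```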
Duncan P. N. Exon Smith f42ef3efca re.results.form: Format out-of-range subexpression references as null
Rather than crashing in match_results::format() when a reference to a
marked subexpression is out of range, format the subexpression as empty
(i.e., replace it with an empty string).  Note that
match_results::operator[]() has a range-check and returns a null match
in this case, so this just re-uses that logic.

llvm-svn: 259682
2016-02-03 19:30:20 +00:00
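
A small sketch of the behavior this commit describes, assuming the default ECMAScript format rules. The helper `format_first_match` is an illustrative name, not a library function.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: searches `text` for `pattern` and formats the first
// match with `fmt`; returns an empty string when nothing matches.
std::string format_first_match(const std::string& text,
                               const std::string& pattern,
                               const std::string& fmt) {
    std::smatch m;
    if (!std::regex_search(text, m, std::regex(pattern)))
        return std::string();
    return m.format(fmt);
}

void check_out_of_range_reference() {
    // $1 refers to an existing group.
    assert(format_first_match("abc", "(b)", "[$1]") == "[b]");
    // $2 refers to a group that does not exist; it is formatted as empty.
    assert(format_first_match("abc", "(b)", "[$2]") == "[]");
}
```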
Marshall Clow b414b2f54b Fix PR#26175. Thanks to Josh Petrie for the report and the patch. Reviewed as http://reviews.llvm.org/D16262
llvm-svn: 258107
2016-01-19 00:50:37 +00:00
Evgeniy Stepanov 906c872db9 Cleanup: move visibility/linkage attributes to the first declaration.
This change moves visibility attributes from out-of-class method
definitions to in-class declaration. This is needed for a switch to
attribute((internal_linkage)) (see http://reviews.llvm.org/D13925)
which can only appear on the first declaration.

This change does not touch istream/ostream/streambuf. They are
handled separately in http://reviews.llvm.org/D14409.

llvm-svn: 252385
2015-11-07 01:22:13 +00:00
Marshall Clow 550dfe79ca Fix a crasher found by libFuzzer
llvm-svn: 245849
2015-08-24 15:57:09 +00:00
Marshall Clow 05ddbffbf3 Make regex and any assert when they should throw an exception _but_ the user has decreed 'no exceptions'. This matches the behavior of string and vector
llvm-svn: 245239
2015-08-17 21:14:16 +00:00
Marshall Clow bcbc37d8a1 Consolidate a bunch of #ifdef _LIBCPP_NO_EXCEPTIONS .. #endif blocks into a single template function. NFC
llvm-svn: 243415
2015-07-28 13:30:47 +00:00
Marshall Clow 983d178108 Detect and throw on a class of bad regexes that we mistakenly accepted before. Thanks to Trevor Smigiel for the report
llvm-svn: 243030
2015-07-23 18:27:51 +00:00
Eric Fiselier b50f8f9ee7 Fix initializer list order in <regex> to be correct
llvm-svn: 242864
2015-07-22 01:29:41 +00:00
Eric Fiselier 818139da59 Remove unused typedefs in random and regex
llvm-svn: 242628
2015-07-18 22:57:14 +00:00
Marshall Clow 8fa8e5fc74 Add code to honor the match_not_bol and match_not_eol regex flags. Fixes PR#22651. Thanks to Jim Porter for the report and suggested fix.
llvm-svn: 232733
2015-03-19 17:05:59 +00:00
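
The effect of the two flags can be sketched as follows. The wrapper `search_with_flags` is a hypothetical name; the flags themselves are the standard `std::regex_constants` values.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical wrapper around std::regex_search that forwards match flags.
bool search_with_flags(const std::string& text, const std::string& pattern,
                       std::regex_constants::match_flag_type flags =
                           std::regex_constants::match_default) {
    std::regex re(pattern);
    return std::regex_search(text, re, flags);
}

void check_bol_eol_flags() {
    using namespace std::regex_constants;
    // Without flags, ^ and $ match at the ends of the input.
    assert(search_with_flags("abc", "^a"));
    assert(search_with_flags("abc", "c$"));
    // match_not_bol: the start of the input is not treated as a line start.
    assert(!search_with_flags("abc", "^a", match_not_bol));
    // match_not_eol: the end of the input is not treated as a line end.
    assert(!search_with_flags("abc", "c$", match_not_eol));
}
```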
Marshall Clow 538fec0e59 Fix for PR22061 by K-ballo
llvm-svn: 227384
2015-01-28 22:22:35 +00:00
Marshall Clow 9db9069cf3 Make regex::assign not clobber the regex in case of failure. Fixes PR#22213
llvm-svn: 225799
2015-01-13 16:49:52 +00:00
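
A sketch of the guarantee described here: a failed `assign` should leave the previous pattern usable. The helper name `still_matches_after_bad_assign` is hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: compiles `good`, attempts to assign the invalid
// pattern `bad`, and reports whether the assignment threw AND the regex
// still matches `text` with its original pattern.
bool still_matches_after_bad_assign(const std::string& good,
                                    const std::string& bad,
                                    const std::string& text) {
    std::regex re(good);
    bool threw = false;
    try {
        re.assign(bad);
    } catch (const std::regex_error&) {
        threw = true;
    }
    return threw && std::regex_match(text, re);
}

void check_assign_failure() {
    // "[a" has an unterminated bracket expression, so assign() throws.
    assert(still_matches_after_bad_assign("a+", "[a", "aaa"));
}
```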
Marshall Clow b04058e8c1 Implement LWG 2217 - operator==(sub_match, string) slices on embedded '\0's
llvm-svn: 224292
2014-12-15 23:57:56 +00:00
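
The issue concerns comparisons that previously stopped at the first embedded `'\0'`. A minimal sketch, building a `sub_match` by hand over a whole string; the helper `sub_match_equals` is an illustrative name.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: wraps all of `s` in a sub_match and compares it
// with `other` using operator==(sub_match, string).
bool sub_match_equals(const std::string& s, const std::string& other) {
    std::ssub_match sm;
    sm.first = s.begin();
    sm.second = s.end();
    sm.matched = true;
    return sm == other;
}

void check_embedded_nul() {
    const std::string with_nul("a\0b", 3);  // three characters, NUL in the middle
    // The full sequence compares equal to itself...
    assert(sub_match_equals(with_nul, with_nul));
    // ...and is not truncated to "a" at the embedded NUL.
    assert(!sub_match_equals(with_nul, std::string("a")));
}
```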
Dan Albert 15c010a37e Base regex code on char_class_type.
__get_classname() and __bracket_expression were assuming that
char_class_type was ctype_base::mask rather than using
regex_traits<_CharT>::char_class_type.

This change allows char_class_type to be defined to something other than
ctype_base::mask so that the implementation will still work for
platforms with an 8-bit ctype mask (such as Android and OpenBSD).

llvm-svn: 214201
2014-07-29 19:23:39 +00:00
Marshall Clow 9393b5113b Fix Bug 19678 - libc++ does not correctly handle the regex: '[^\0]*'
llvm-svn: 209307
2014-05-21 16:29:50 +00:00
Marshall Clow 16da324051 Implement LWG issue 2306: match_results::reference should be value_type&, not const value_type&. This is a general move by the LWG to have the reference type of read-only containers be a non-const reference; however, there are no methods that return a non-const reference to a match_result entry, so there's no worries about getting a non-const reference to a constant object.
llvm-svn: 202214
2014-02-26 01:56:31 +00:00
Marshall Clow 7d35711187 Implement LWG Issues #2329 and #2332 - disallow iterators into temporary regexes and regexes into temporary strings
llvm-svn: 201717
2014-02-19 21:21:11 +00:00
Marshall Clow 9aafa898f9 Update __parse_DUP_COUNT and __parse_BACKREF to use the traits class to recognize digits. Fixes PR18514
llvm-svn: 199541
2014-01-18 03:40:03 +00:00
Marshall Clow 54f6bd59f5 Fix a bug in regex_token_iterator's copy constructor. Caught by Bob Wilson.
llvm-svn: 199122
2014-01-13 17:47:08 +00:00
Marshall Clow 79b0fee3c6 Fix PR18404 - 'Bug in regex_token_iterator::operator++(int) implementation'. Enhance the tests for regex_token_iterator and regex_iterator.
llvm-svn: 198878
2014-01-09 18:25:57 +00:00
Marshall Clow e604469e5c Patch by GM: apparently '__value' (two underscores) is a special name in Visual Studio, so rename the private method in <regex> with that name. GM's patch used '___value' (three underscores), but I changed that to '__regex_traits_value' because I've been burned in the past by identifiers that appear identical but are not.
llvm-svn: 193087
2013-10-21 15:43:25 +00:00
Howard Hinnant fc88dbd298 Debug mode for string. This commit also marks the first time libc++ debug-mode has found a bug (found one in regex). Had to play with extern templates a bit to get this to work since string is heavily used within libc++.dylib.
llvm-svn: 189114
2013-08-23 17:37:05 +00:00
Howard Hinnant f0544c2086 Nico Rieck: this patch series fixes visibility issues on Windows as explained in <http://lists.cs.uiuc.edu/pipermail/cfe-dev/2013-August/031214.html>.
llvm-svn: 188192
2013-08-12 18:38:34 +00:00
Howard Hinnant 7491a16031 Bill Fisher: This patch fixes a bug where std::regex in ECMAScript mode was ignoring capture groups inside lookahead assertions.
For example, matching /(?=(a))(a)/ to "a" should yield two captures: \1 = "a", \2 = "a"

llvm-svn: 186954
2013-07-23 16:18:04 +00:00
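
The example from this commit message can be written as a sketch like this. The helper `capture_groups` is a hypothetical name that collects the text of every numbered group from the first match.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the text of groups 1..n for the first match
// of `pattern` in `text` (ECMAScript grammar), or an empty vector.
std::vector<std::string> capture_groups(const std::string& text,
                                        const std::string& pattern) {
    std::vector<std::string> out;
    std::smatch m;
    if (std::regex_search(text, m, std::regex(pattern))) {
        for (std::size_t i = 1; i < m.size(); ++i)
            out.push_back(m[i].str());
    }
    return out;
}

void check_lookahead_captures() {
    // The group inside the lookahead and the group after it both capture "a".
    const std::vector<std::string> expected{"a", "a"};
    assert(capture_groups("a", "(?=(a))(a)") == expected);
}
```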
Howard Hinnant 22161401df Bill Fisher: This patch fixes an ill-formed comparison when parsing control escapes, e.g. "\cA\ca". The code will now throw an error_escape exception for invalid control sequences like "\c:" or "\c".
I've added the test cases to bad_escape.pass.cpp.

llvm-svn: 186335
2013-07-15 18:21:11 +00:00
Howard Hinnant c815a4e297 Bill Fisher: This patch fixes a less likely case where '\b' can back up into invalid memory, when driven by a regex_iterator (for case 1, see r185273 or http://llvm.org/bugs/show_bug.cgi?id=16240)
The attached test program also supplies a test for the case 1 fix in r185273.

llvm-svn: 186089
2013-07-11 15:32:55 +00:00
Howard Hinnant dbdeb153d8 Bill Fisher: This patch fixes a bug where regex_iterator doesn't indicate when it's restarting in the middle of a string. This bug causes /^a/ to match in the middle of the string "aaaaaaa", during iteration.
My patch uses  to communicate when  is false.

llvm-svn: 185950
2013-07-09 17:29:09 +00:00
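
A sketch of the iteration behavior described above: an anchored pattern should match only once while iterating. The helper `count_matches` is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical helper: counts non-overlapping matches of `pattern` in
// `text` by walking a std::sregex_iterator to the end.
std::ptrdiff_t count_matches(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern);  // must outlive the iterators below
    return std::distance(std::sregex_iterator(text.begin(), text.end(), re),
                         std::sregex_iterator());
}

void check_anchor_during_iteration() {
    // Unanchored: every 'a' is a match.
    assert(count_matches("aaaaaaa", "a") == 7);
    // Anchored: only the 'a' at the very start of the string matches.
    assert(count_matches("aaaaaaa", "^a") == 1);
}
```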
Howard Hinnant 43bbdd29de Bill Fisher: This patch fixes a bug where the regex parser doesn't advance the pointer after reading the third character of an octal escape (in awk mode).
That is, regex{"\141", awk} results in the regular expression /a1/ instead of just /a/.

llvm-svn: 185449
2013-07-02 17:43:31 +00:00
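
The octal escape case can be sketched as below. Note that in C++ source the backslash must be doubled so that the regex parser, not the compiler, sees `\141`. The helper `awk_full_match` is a hypothetical name.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: compiles `pattern` with the awk grammar and tests
// whether it matches all of `text`.
bool awk_full_match(const std::string& pattern, const std::string& text) {
    std::regex re(pattern, std::regex_constants::awk);
    return std::regex_match(text, re);
}

void check_awk_octal_escape() {
    // Octal 141 is 'a', so the pattern is /a/ ...
    assert(awk_full_match("\\141", "a"));
    // ... and not /a1/, which is what the bug produced.
    assert(!awk_full_match("\\141", "a1"));
}
```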
Howard Hinnant 660f2ae422 Prevent '\b' from backing up into invalid memory. Fixes http://llvm.org/bugs/show_bug.cgi?id=16240. Sorry, I can not think of a good test case for this one, except by running valgrind as reported in the bug.
llvm-svn: 185273
2013-06-29 23:45:43 +00:00
Howard Hinnant 3f75953d82 Provide missing '{' in parsing extended quoted characters. This fixes http://llvm.org/bugs/show_bug.cgi?id=16135
llvm-svn: 185211
2013-06-28 20:31:05 +00:00
Howard Hinnant 8d1e822432 William Fisher: A bug in __lookahead::exec causes /(?=^)b/ to match ab. When makes a recursive call to , it passes true for the value of . This causes a beginning-of-line anchor (^) inside a lookahead assertion to match anywhere in the text. This fixes http://llvm.org/bugs/show_bug.cgi?id=11118
llvm-svn: 185196
2013-06-28 19:11:23 +00:00
Howard Hinnant 21246e3314 Bill Fisher: Fix for failing to throw an exception in regex when parsing an invalid escape sequence. This fixes http://llvm.org/bugs/show_bug.cgi?id=16023
llvm-svn: 185192
2013-06-28 18:57:30 +00:00
Marshall Clow 1c2c986796 Fix undefined behavior in syntax_option_type::operator~ and match_flag_type::operator~. Found by UBSan
llvm-svn: 177693
2013-03-22 02:13:55 +00:00
Howard Hinnant c60bf548c5 Albert Wong: definition for regex_traits<_CharT>::__regex_word.
llvm-svn: 176640
2013-03-07 19:38:08 +00:00
Howard Hinnant 6e41256f68 No functionality change at this time. I've split _LIBCPP_VISIBLE up into two flags: _LIBCPP_TYPE_VIS and _LIBCPP_FUNC_VIS. This is in preparation for taking advantage of clang's new __type_visibility__ attribute.
llvm-svn: 176593
2013-03-06 23:30:19 +00:00
Howard Hinnant 16694b5df5 Zhang Xiongpang: Add definitions for const data members. Fixes http://llvm.org/bugs/show_bug.cgi?id=14585.
llvm-svn: 170026
2012-12-12 21:14:28 +00:00
Argyrios Kyrtzidis 88db3171dd Don't neglect to "return *this".
llvm-svn: 165860
2012-10-13 02:03:45 +00:00
Howard Hinnant aeb85680fb Dimitry Andric: many visibility fixes. Howard: Much appreciated. Can you send me a patch to CREDITS.TXT?
llvm-svn: 163862
2012-09-14 00:39:16 +00:00
Howard Hinnant 42be98ab54 noexcept and constexpr applied to <regex>.
llvm-svn: 160594
2012-07-21 01:31:58 +00:00
Howard Hinnant c206366fd7 Quash a whole bunch of warnings
llvm-svn: 145624
2011-12-01 20:21:04 +00:00
Howard Hinnant c003db1fca Further macro protection by replacing _[A-Z] with _[A-Z]p
llvm-svn: 145410
2011-11-29 18:15:50 +00:00
Howard Hinnant ab4f438239 Add protection from min/max macros
llvm-svn: 145407
2011-11-29 16:45:27 +00:00
Howard Hinnant 073458b1ab Windows support by Ruben Van Boxem.
llvm-svn: 142235
2011-10-17 20:05:10 +00:00
Howard Hinnant 2a4812fd04 Fix <rdar://problem/10255403> match_results::begin() is off by one
llvm-svn: 141494
2011-10-08 14:36:16 +00:00
Howard Hinnant 54976f2619 Fixed PR10574: http://llvm.org/bugs/show_bug.cgi?id=10574
llvm-svn: 137522
2011-08-12 21:56:02 +00:00
Howard Hinnant ce48a1137d _STD -> _VSTD to avoid macro clash on windows
llvm-svn: 134190
2011-06-30 21:18:19 +00:00
Howard Hinnant ce53420e37 Provide names for template and function parameters in forward declarations. The purpose is to aid automated documentation tools.
llvm-svn: 133008
2011-06-14 19:58:17 +00:00
Howard Hinnant 382600ff97 Jonathan Sauer found a bug in the way ^ was handled
llvm-svn: 128350
2011-03-26 20:02:27 +00:00
Howard Hinnant a0fe8c436e Chris Jefferson noted many places where function calls needed to be qualified (thanks Chris).
llvm-svn: 125510
2011-02-14 19:12:38 +00:00
Howard Hinnant 966b5a3157 N3158 Missing preconditions for default-constructed match_result objects
llvm-svn: 121282
2010-12-08 21:07:55 +00:00
Howard Hinnant 412dbebe1b license change
llvm-svn: 119395
2010-11-16 22:09:02 +00:00
Howard Hinnant 3e84caaebb visibility-decoration.
llvm-svn: 114647
2010-09-23 15:13:20 +00:00
Howard Hinnant 7609c9b665 Changed __config to react to all of clang's currently documented has_feature flags, and renamed _LIBCPP_MOVE to _LIBCPP_HAS_NO_RVALUE_REFERENCES to be more consistent with the rest of the libc++'s flags, and with clang's nomenclature.
llvm-svn: 113086
2010-09-04 23:28:19 +00:00
Howard Hinnant b3371f6f49 Fixing whitespace problems
llvm-svn: 111750
2010-08-22 00:02:43 +00:00
Howard Hinnant 86550b0038 [re.alg.replace]. This finishes all of <regex>. That being said, <regex> is exceptionally difficult to thoroughly test. If anyone has the ability to test this, combined with the interest to do so, now would be a good time. :-)
llvm-svn: 111333
2010-08-18 00:13:08 +00:00
Howard Hinnant 14dcd3d1ff [re.tokiter]
llvm-svn: 111278
2010-08-17 20:42:03 +00:00
Howard Hinnant 2bf1fd99b1 [re.regiter]
llvm-svn: 111178
2010-08-16 20:21:16 +00:00
Howard Hinnant 48b242a275 Everything under [re.results]
llvm-svn: 111074
2010-08-14 18:14:02 +00:00
Howard Hinnant 5cd6658798 Everything under [re.regex]
llvm-svn: 111024
2010-08-13 18:11:23 +00:00
Howard Hinnant 54b409fdb9 now works with -fno-exceptions and -fno-rtti
llvm-svn: 110828
2010-08-11 17:04:31 +00:00
Howard Hinnant 7189782c6b bug fix concerning search not at beginning of string and word boundaries
llvm-svn: 109750
2010-07-29 15:17:28 +00:00
Howard Hinnant 7949ab0743 fix bug incrementing past end in search
llvm-svn: 109716
2010-07-29 01:15:27 +00:00
Howard Hinnant 4ea5240e05 fix parse bug in ecma non-greedy loop
llvm-svn: 109711
2010-07-29 00:36:00 +00:00
Howard Hinnant 6e156afa71 Fixed some bugs in the ecma bracket expression regarding escaped characters, and got the awk grammar going.
llvm-svn: 109599
2010-07-28 17:35:27 +00:00
Howard Hinnant c1124300fe lookahead for ecma
llvm-svn: 109548
2010-07-27 22:20:32 +00:00
Howard Hinnant 93da3b2e41 grep and egrep grammars
llvm-svn: 109534
2010-07-27 19:53:10 +00:00
Howard Hinnant 6afe8b0a23 continued regex development...
llvm-svn: 109512
2010-07-27 17:24:17 +00:00
Howard Hinnant 5c67986156 A good start on ecma regex's. Maybe even feature complete, not sure yet. Also an unrelated fix to is_constructible thanks to Daniel Krugler.
llvm-svn: 109479
2010-07-27 01:25:38 +00:00
Howard Hinnant f7109438ea I believe posix extended expr is feature complete. Getting started on ecma exprs.
llvm-svn: 109126
2010-07-22 17:53:24 +00:00
Howard Hinnant b762bea3ba A few more tests for posix extended alternation
llvm-svn: 109107
2010-07-22 14:12:20 +00:00
Howard Hinnant c1198c320f A good start on extended posix regex. Loops working. Alternation working. Also update by-chapter completeness summary.
llvm-svn: 108548
2010-07-16 19:08:36 +00:00
Howard Hinnant 5d695f041c Fixed to work with generalized iterators.
llvm-svn: 108359
2010-07-14 21:14:52 +00:00
Howard Hinnant 5699358c63 Minor optimizations. Minor bug fixes. More tests.
llvm-svn: 108331
2010-07-14 15:45:11 +00:00
Howard Hinnant 8ab959c961 Bracket expressions are working (lightly tested).
llvm-svn: 108280
2010-07-13 21:48:06 +00:00
Howard Hinnant fdec08bd8b regex_constants icase and collate for matching a single char and for matching back references
llvm-svn: 108178
2010-07-12 19:11:27 +00:00
Howard Hinnant aea2afe334 back references for BRE
llvm-svn: 108168
2010-07-12 18:16:05 +00:00
Howard Hinnant 0cbed7e140 Redesign number 3. The previous design was not handling matching of empty strings inside of loops.
llvm-svn: 108151
2010-07-12 15:51:17 +00:00
Howard Hinnant 87ec03a2ea weekly update to by-chapter-summary, plus left and right anchor support in basic posix.
llvm-svn: 107938
2010-07-09 00:15:26 +00:00
Howard Hinnant 8c459a14a9 Marked subexpressions in a loop in basic posix working (only lightly tested so far)
llvm-svn: 107889
2010-07-08 17:43:58 +00:00
Howard Hinnant 189b212662 First loop test passed. The data structure and search algorithm is still crude and in-flux. But this milestone needed to be locked in. Right now every loop is implemented in terms of a structure that will handle the most complicated {min, max} loop. Though only *-loops are tested at the moment. In a future iteration *-loops will likely be optimized a little more. The only tests are for basic posix so far, but I have prototype code running for extended posix and ecma. The prototype code lacks the complicating properties of the real <regex> requirements though.
llvm-svn: 107803
2010-07-07 19:14:52 +00:00
Howard Hinnant 928658cd70 First test for marked subexpressions
llvm-svn: 107317
2010-06-30 20:30:19 +00:00
Howard Hinnant 237ee6fef8 First, very primitive, search results on one engine
llvm-svn: 107294
2010-06-30 17:22:19 +00:00
Howard Hinnant cdefdeee28 two steps forward, one step back...
llvm-svn: 107230
2010-06-30 00:21:42 +00:00
Howard Hinnant e5561b04e4 [re.submatch]
llvm-svn: 107187
2010-06-29 18:37:43 +00:00
Howard Hinnant 853aff80dd regex: learning to crawl
llvm-svn: 106882
2010-06-25 20:56:08 +00:00
Howard Hinnant 24e98486a3 Continuing to work through regex, and updated libcxx_by_chapter.pdf with weekly test results
llvm-svn: 106790
2010-06-24 21:28:00 +00:00
Howard Hinnant 24757ff75e Finished [re.traits]. I'd like to acknowledge the help of Bjorn Reese with <regex>.
llvm-svn: 106478
2010-06-21 21:01:43 +00:00
Howard Hinnant 70505305c1 Just getting our toes wet on <regex>
llvm-svn: 106187
2010-06-17 00:34:59 +00:00