Chapter 10. Regular Expressions
In the previous chapter, we learned all about formatted input and output in C++. We saw that there are good solutions for formatted output--as long as you make sure you're in the C locale--but that despite the many approaches to input parsing, even the simple task of parsing an int out of a string can be quite difficult. (Recall that of the two most foolproof methods, std::stoi(x) requires converting x to a heap-allocated std::string, and the verbose std::from_chars(x.data(), x.data() + x.size(), value, 10) is lagging the rest of C++17 in vendor adoption.) The fiddliest part of parsing numbers is figuring out what to do with the part of the input that isn't numeric!
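As a reminder of what that looks like in practice, here is a minimal sketch of std::from_chars reporting both the parsed value and the non-numeric leftovers. The helper name parse_leading_int and its out-parameter rest are hypothetical, invented for this illustration; only std::from_chars itself comes from the standard library.

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper (not a standard function): parse a base-10 int from
// the front of `x`. On success, return the value and store the unparsed
// tail of the input in `rest`. On failure, leave `rest` untouched.
std::optional<int> parse_leading_int(std::string_view x, std::string_view& rest)
{
    int value = 0;
    const char *first = x.data();
    const char *last = x.data() + x.size();
    // from_chars takes a pair of const char* and the output by reference.
    auto [ptr, ec] = std::from_chars(first, last, value, 10);
    if (ec != std::errc()) {
        return std::nullopt;  // no digits at all, or value out of range for int
    }
    // `ptr` points at the first character that was not consumed.
    rest = x.substr(static_cast<std::size_t>(ptr - first));
    return value;
}
```

Given the input "42abc", this sketch yields 42 and leaves "abc" in rest; deciding what to do with that "abc" is still up to the caller.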
Parsing gets easier if you can split it into two subtasks: First, figure out exactly how many bytes of the input correspond to one "input item" (this is called lexing); and second, parse the value of that item, with some error recovery in the case that the item's value is out of range or otherwise nonsensical...
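The two subtasks might be sketched as follows, assuming the "input item" is an optionally negative decimal integer. The function names lex_int and parse_int_item are hypothetical, as is the choice of std::regex for the lexing step and std::from_chars for the parsing step; other combinations would work just as well.

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <regex>
#include <string>
#include <system_error>

// Step 1 (lexing): how many bytes at the front of `input` make up one
// integer item? Returns 0 if the input does not begin with an integer.
std::size_t lex_int(const std::string& input)
{
    static const std::regex rx("-?[0-9]+");
    std::smatch m;
    // match_continuous anchors the search at the first character.
    if (std::regex_search(input, m, rx, std::regex_constants::match_continuous)) {
        return static_cast<std::size_t>(m.length(0));
    }
    return 0;
}

// Step 2 (parsing): convert exactly the first `n` bytes to an int.
// Returns nullopt if the item is out of range or otherwise malformed.
std::optional<int> parse_int_item(const std::string& input, std::size_t n)
{
    int value = 0;
    const char *first = input.data();
    auto [ptr, ec] = std::from_chars(first, first + n, value, 10);
    if (ec != std::errc() || ptr != first + n) {
        return std::nullopt;
    }
    return value;
}
```

With this split, lex_int("123abc") reports a 3-byte item and parse_int_item then converts just those bytes, while an item such as "99999999999999999999" lexes cleanly but is rejected at the parsing step as out of range.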