unix regex
| regex feature | BREs | EREs |
| dot, ^, $, [ ], [^ ] | yes | yes |
| "any number" quantifier | * | * |
| + and ? quantifiers | no | + ? |
| range quantifier | \{min,max\} | {min,max} |
| grouping | \( \) | ( ) |
| can apply quantifiers to parentheses | yes | yes |
| backreferences | \1 through \9 | no |
| alternation | no | yes |
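
The escaping convention in the table can be tried out in Python. This is only a sketch: Python's `re` module is neither a POSIX BRE nor a POSIX ERE engine, but it uses the unescaped operator forms that EREs use, and escaping a brace or parenthesis makes it a literal (in a BRE the escaped form is the operator). The helper `full` and the sample strings are made up for illustration.

```python
import re


def full(pattern: str, text: str) -> bool:
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


# ERE-style syntax: these operators are special when left unescaped.
ere_style = {
    "+ and ? quantifiers": full(r"ab+c?", "abbb"),
    "range quantifier": full(r"a{2,3}", "aaa"),
    "quantifier on a group": full(r"(ab)*", "ababab"),
    "alternation": full(r"cat|dog", "dog"),
}

# Escaped forms are literals in Python; in a BRE they would be operators.
literal_escapes = {
    "parentheses": full(r"\(ab\)", "(ab)"),
    "braces": full(r"a\{2\}", "a{2}"),
}

# Departure from POSIX ERE: Python also supports backreferences \1 to \9.
backref = full(r"(\w+) \1", "hello hello")

print(ere_style, literal_escapes, backref)
```

To exercise real BRE and ERE behaviour, the same patterns would have to go through a POSIX tool such as `grep` (BRE) and `grep -E` (ERE).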
regex table
[cite:;taken from @nlp_jurafsky_2020 chapter 2 regular expressions, text normalization, edit distance]
| regex | match | example pattern |
| [0-9] \d | a single digit | |
| [A-Z] | an upper case letter | |
| [a-z] | a lower case letter | |
| ^ | start of line | |
| $ | end of line | |
| \b | word boundary | |
| \B | non-word boundary | |
| \D | any non-digit | |
| \w | any alphanumeric/underscore | |
| \W | a non-alphanumeric | |
| \s | whitespace (space, tab) | |
| \S | non-whitespace | |
| * | zero or more occurrences of the previous char or expression | |
| + | one or more occurrences of the previous char or expression | |
| ? | exactly zero or one occurrence of the previous char or expression | |
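| {n} | exactly n occurrences of the previous char or expression | |
| {n,m} | from n to m occurrences of the previous char or expression | |
| {n,} | at least n occurrences of the previous char or expression | |
| {,m} | up to m occurrences of the previous char or expression | |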
| \* | an asterisk | |
| \. | a period | |
| \? | a question mark | |
| \n | a newline | |
| \t | a tab | |
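
The operators in this table can be checked with a few short calls to Python's `re` module. This is a minimal sketch, with sample strings invented for illustration. Two caveats: Python's `\d`, `\w` and `\s` are Unicode-aware by default (so `\s` also matches newlines, not only space and tab), and `^` and `$` anchor at every line only when `re.MULTILINE` is set.

```python
import re

text = "Order 66 shipped.\nTotal: 3 items?"

# Character classes and their negations.
digits = re.findall(r"\d+", text)               # ['66', '3']
non_digits = re.findall(r"\D+", "a1b22c")       # ['a', 'b', 'c']
capitals = re.findall(r"[A-Z]", text)           # ['O', 'T']
fields = re.findall(r"\S+", "a\tb c")           # ['a', 'b', 'c']

# Anchors: ^ and $ match at each line only with re.MULTILINE.
first_words = re.findall(r"^\w+", text, flags=re.MULTILINE)
last_chars = re.findall(r"\W$", text, flags=re.MULTILINE)

# Word boundaries: \b requires an edge of a word, \B forbids one.
whole_the = re.findall(r"\bthe\b", "the other the")
inner_the = re.findall(r"\Bthe\B", "the other the")

# Counters.
exactly_three = re.fullmatch(r"\d{3}", "123") is not None
two_to_four = re.fullmatch(r"\d{2,4}", "12345") is not None
at_least_two = re.fullmatch(r"\d{2,}", "12345") is not None
up_to_three = re.fullmatch(r"\d{,3}", "") is not None

# Escaped metacharacters match themselves.
punctuation = re.findall(r"\.|\?|\*", "a.b?c*")

print(digits, first_words, last_chars, whole_the, inner_the, punctuation)
```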