Regex
Choice information
A regex will match its pattern anywhere in a line unless it is pinned to a location.
Any meta-character to be used literally must be escaped by prepending a \ to it.
Generally, it is easier to write a pattern that excludes the bad cases than one that lists every good case.
Groups can be nested.
Groups capture only what is enclosed in ().
Groups can typically be referenced by number with \{some_num} (like \1 or \2).
Groups are numbered from \1 onwards, in order of their opening parenthesis; in some engines \0 is the whole matched string.
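| has the lowest precedence, so it ORs everything to its left against everything to its right, bounded only by the enclosing group. Literal whitespace is part of an alternative, not a separator. Wrap the alternatives in a group to limit the scope of the OR.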
All set matchers (like \d) can be inverted by capitalizing them (\D, \W, \S).
Flags can be combined like (?im).
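The notes above do not name an engine, so the following is only a sketch that assumes Python's re module; the sample strings and variable names are made up for illustration. It shows unanchored matching, escaping, nested group numbering, backreferences, and the scope of |. Note that Python spells the whole match as \g<0> in a replacement rather than \0.

```python
import re

# An unanchored pattern matches anywhere in the line.
anywhere = re.search(r"cat", "concatenate")    # matches inside the word
anchored = re.search(r"^cat", "concatenate")   # None: line does not start with "cat"

# A literal dot must be escaped; an unescaped dot matches any character.
literal_dot = re.fullmatch(r"3\.14", "3x14")   # None
any_char = re.fullmatch(r"3.14", "3x14")       # matches

# Nested groups are numbered by the position of their opening parenthesis.
m = re.search(r"((\d+)-(\d+))", "range 10-20")
whole, outer, first, second = m.group(0), m.group(1), m.group(2), m.group(3)

# \1 inside the pattern refers back to what group 1 captured (a doubled word).
doubled = re.search(r"\b(\w+) \1\b", "this is is a test").group(1)

# \1 and \2 inside a replacement; \g<0> is Python's spelling of the whole match.
swapped = re.sub(r"(\w+)@(\w+)", r"\2 at \1", "user@host")
wrapped = re.sub(r"\d+", r"<\g<0>>", "a1b22")

# | splits the whole pattern: this means "gray" OR "grey cat".
loose = re.fullmatch(r"gray|grey cat", "gray cat")       # None
# A group limits the OR to the two spellings.
tight = re.fullmatch(r"(?:gray|grey) cat", "gray cat")   # matches
```

Other engines may number and reference groups slightly differently, so treat the replacement syntax here as specific to Python.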
Sets
\d will match any digit [0-9].
\w will match any alphanumeric character or underscore [A-Za-z0-9_].
\s will match any whitespace.
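A small sketch of the set matchers and one inverse, assuming Python's re module and a made-up sample string. In Python these classes are Unicode-aware for str patterns by default; the sample is plain ASCII, so the results line up with the ranges listed above.

```python
import re

sample = "id_42 costs $7.50\n"

digits = re.findall(r"\d", sample)       # every single digit
words = re.findall(r"\w+", sample)       # runs of letters, digits and underscore
spaces = re.findall(r"\s", sample)       # spaces and the trailing newline
only_digits = re.sub(r"\D", "", sample)  # \D is the inverse of \d: strip non-digits
```

The $ and . in the sample are neither \w nor \s, which is why 7.50 is split into two word runs.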
Globbing
[] encapsulating a collection of characters indicates an OR.
- can be used inside [] to indicate a sequential range of values: [a-z].
{} can be appended to a character or group to indicate the number of instances to match: {4}.
, can be used in some engines inside {} to designate a range: {1,5}.
() will capture the matched characters in a group for later use.
?: inside () will result in a non-capturing group: (?:.*).
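The bracket, brace, and parenthesis forms above can be sketched as follows, again assuming Python's re module and invented sample strings. The last two lines show the practical difference between a capturing and a non-capturing group when using findall.

```python
import re

# [] is an OR over single characters; - inside [] gives a range.
vowels = re.findall(r"[aeiou]", "regex")
hex_runs = re.findall(r"[0-9a-f]+", "ff00zz7a")

# {4} is an exact count, {1,3} a range of counts.
year = re.search(r"\d{4}", "released 2019-05").group()
short_numbers = re.findall(r"\b\d{1,3}\b", "7 42 128 4096")

# With capturing groups, findall returns the groups rather than the whole match.
captured = re.findall(r"(\d+)(px|em)", "12px 3em")
# (?:...) groups for the OR without capturing, so the whole match comes back.
uncaptured = re.findall(r"\d+(?:px|em)", "12px 3em")
```

Non-capturing groups are mostly a convenience: they keep group numbers stable when parentheses are only needed for grouping.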
Pinning
^ is a meta-character as the first character IN [] for NOT.
^ is a meta-character OUTSIDE of [] for start of line.
$ is a meta-character for end of line.
\b is a boundary between a word character (\w) and a non-word character.
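A brief sketch of the two meanings of ^ plus $ and \b, assuming Python's re module; the strings are invented for illustration.

```python
import re

# ^ first inside [] negates the set.
non_digits = re.findall(r"[^0-9]+", "ab12cd")

# ^ outside [] pins the match to the start, $ to the end.
starts = bool(re.search(r"^err", "error: disk"))
not_at_start = bool(re.search(r"^err", "an error"))
ends = bool(re.search(r"\.log$", "app.log"))

# \b keeps "cat" from matching inside "concat" or "cats".
whole_words = re.findall(r"\bcat\b", "cat concat cats cat.")
```

The final period in the sample is a non-word character, so the last "cat" still sits on a boundary.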
Matching
. is a meta-character for literally any character (including \n only with (?s)).
* is a meta-character for 0 or more matches.
+ is a meta-character for 1 or more matches.
? after a QUANTIFIER (*, +, {n,m}) is a meta-character for laziness.
? after a REGULAR CHARACTER/TOKEN is a meta-character for optionality.
| indicates an OR between the whole sequences on either side of it.
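The two roles of ? are easy to confuse, so here is a minimal sketch assuming Python's re module, with made-up inputs. It contrasts greedy and lazy quantifiers, shows ? as optionality, and shows . with and without (?s).

```python
import re

html = "<b>one</b><b>two</b>"

# Greedy .* runs to the last </b>; lazy .*? stops at the first one.
greedy = re.findall(r"<b>.*</b>", html)
lazy = re.findall(r"<b>.*?</b>", html)

# ? after a regular character makes that character optional.
optional = re.findall(r"colou?r", "color colour colr")

# * accepts zero repetitions, + needs at least one.
zero_or_more = re.fullmatch(r"a*", "")
one_or_more = re.fullmatch(r"a+", "")

# . skips newlines unless (?s) is set.
no_newline = re.search(r"a.b", "a\nb")
with_dotall = re.search(r"(?s)a.b", "a\nb")
```

Lazy matching changes which match is preferred, not whether one exists, so the greedy and lazy patterns both succeed here and only differ in how much they consume.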
Flags
(?i) flag for case-insensitive matching.
(?m) flag for multiline (^ and $ match line boundaries).
(?s) flag for dotall, making . match newlines.
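A sketch of the inline flags, alone and combined, assuming Python's re module (which expects inline global flags at the very start of the pattern); the log-like text is invented.

```python
import re

text = "Error: one\nerror: two\nok"

# (?i): case no longer matters.
case_insensitive = re.findall(r"(?i)error", text)

# Without (?m), ^ only matches at the very start of the string.
without_multiline = re.findall(r"^error", text)
# With (?m), ^ also matches after each newline.
with_multiline = re.findall(r"(?m)^error", text)

# Flags combined: case-insensitive and multiline together.
combined = re.findall(r"(?im)^error: (\w+)$", text)

# The same flags can be passed as arguments instead of inline.
as_arguments = re.findall(r"^error: (\w+)$", text, re.I | re.M)
```

Inline flags travel with the pattern string, which helps when the pattern is stored in a config file; flag arguments keep the pattern itself shorter.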