Formatting in Python

Regular expression (RegEx) patterns are written as ordinary strings. Inside a pattern, the backslash (\) escapes special symbols and introduces special sequences. Unfortunately, the backslash is also the escape character in ordinary Python string literals, so the two uses collide.

regularexpression = "\byellow" # wrong! \b is the ASCII backspace character here
regularexpression = "\\byellow" # ok
regularexpression = r"\byellow" # easier and more intuitive

The r"..." and multiline r"""...""" forms are Python's raw strings. In a raw string the backslash is not treated as an escape character, so it reaches the RegEx engine exactly as written. Because of the leading r, it is easy to think of them as strings dedicated to RegEx.
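
The difference between the three spellings can be checked directly. The snippet below is a minimal sketch; the variable names and the sample sentence are made up for illustration.

```python
import re

plain = "\byellow"     # "\b" is turned into one backspace character (0x08)
escaped = "\\byellow"  # doubled backslash: a backslash followed by "b"
raw = r"\byellow"      # raw string: the backslash is kept as written

print(len(plain), len(escaped), len(raw))  # 7 8 8
print(escaped == raw)                      # True
print(plain[0] == "\x08")                  # True

sample = "a yellow car"  # hypothetical test sentence
print(bool(re.search(raw, sample)))    # True: \b acts as a word boundary
print(bool(re.search(plain, sample)))  # False: looks for a backspace character
```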

One more word about formatting. The re.VERBOSE flag helps us write more readable patterns. Whitespace in the pattern and comments (from the # sign to the end of the line) are ignored. When used with re.VERBOSE, the RegEx below is equivalent to the previous one:

regularexpression = r"""
\b # word boundary
yellow # color to match
"""
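
The flag has to be passed when the pattern is compiled or used, otherwise the whitespace and comments are taken literally. Here is a short sketch comparing the compact and the verbose form; the names compact, verbose and samples are assumptions chosen for this example.

```python
import re

compact = re.compile(r"\byellow")
verbose = re.compile(r"""
    \b      # word boundary
    yellow  # color to match
""", re.VERBOSE)

# Hypothetical sample inputs: both patterns should agree on each of them.
samples = ["a yellow car", "greenyellow", "yellowish"]
for text in samples:
    print(text, bool(compact.search(text)), bool(verbose.search(text)))
```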

If we want to match a literal space or # (hash) in such expressions, we have to escape it with a backslash (\) or put it inside a character class such as [ ] or [#].
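
As a small illustration, the sketch below matches a literal hash and literal spaces in verbose mode. The pattern and the "ticket" strings are invented for this example; one space is written in a character class, where verbose mode also leaves whitespace untouched.

```python
import re

pattern = re.compile(r"""
    \#       # literal hash, escaped with a backslash
    \d+      # one or more digits
    \ is     # escaped space, then the word is
    [ ]open  # a space inside a character class is also kept
""", re.VERBOSE)

print(bool(pattern.search("ticket #42 is open")))  # True
print(bool(pattern.search("ticket #42isopen")))    # False: spaces are required
```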