Grouping expressions

Parentheses have a second meaning in regular expressions: they join (group) characters into bigger units. An expression in parentheses is treated as a whole and is evaluated before the operators around it, just like in a math equation.

The pattern berry\W* matches a single word followed by any number of non-word characters, so in the text berry,berry,berry!!! one match covers just berry, (or berry!!! for the last word). The pattern (berry\W)* matches any number of words, each followed by exactly one non-word character, so in the same text it matches berry,berry,berry! in one go. Note a side effect: the group keeps only its final repetition, here berry!.
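The difference can be checked with a short sketch. It assumes Python and its standard re module, which the text above does not name; other engines should behave the same way for these patterns, and the variable names are only illustrative.

```python
import re

text = "berry,berry,berry!!!"

# Without parentheses, * applies only to \W, so one word is matched.
single = re.match(r"berry\W*", text)
print(single.group(0))    # berry,

# With parentheses, * repeats the whole unit "berry\W".
repeated = re.match(r"(berry\W)*", text)
print(repeated.group(0))  # berry,berry,berry!

# Side effect: the group holds only its last repetition.
print(repeated.group(1))  # berry!
```

Here re.match anchors at the start of the text, which is why the first pattern reports only the first word.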

Everything enclosed in parentheses is repeated as a unit by the *, +, ? and {} operators. The alternation operator | treats groups differently: it has the lowest priority, so each side of | is a complete regular expression made of all the characters and groups next to it. For example, the pattern d(uria)|(bana)n does not match the final letter of durian. It means "d(uria) or (bana)n", which gives the same result as the pattern without any parentheses: duria|banan. What we need is d(uria|bana)n, which matches the whole word durian.
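The three patterns can be compared side by side. This is a minimal sketch assuming Python's standard re module; the variable names are hypothetical.

```python
import re

word = "durian"

# | splits the whole pattern into "d(uria)" OR "(bana)n".
loose = re.match(r"d(uria)|(bana)n", word)
print(loose.group(0))   # duria

# Dropping the parentheses gives the same overall match.
plain = re.match(r"duria|banan", word)
print(plain.group(0))   # duria

# Putting | inside the group limits the alternation to the middle part.
scoped = re.match(r"d(uria|bana)n", word)
print(scoped.group(0))  # durian
print(scoped.group(1))  # uria
```

Only the last pattern consumes the final n, because the alternation is confined to the parentheses.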

In the above example we can notice a side effect: a captured group that we may not need. There is a special syntax, (?:...), that makes parentheses group characters without capturing them. For the last example, the pattern d(?:uria|bana)n matches durian but returns no group.
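A short sketch of the capturing and non-capturing variants, again assuming Python's standard re module (the (?:...) syntax is supported by most common engines, but that is worth checking for the one you use):

```python
import re

# Ordinary parentheses: the group is captured.
capturing = re.match(r"d(uria|bana)n", "durian")
print(capturing.groups())      # ('uria',)

# (?:...) still groups the alternation but captures nothing.
non_capturing = re.match(r"d(?:uria|bana)n", "durian")
print(non_capturing.group(0))  # durian
print(non_capturing.groups())  # ()
```

Both patterns match the same text; only the list of returned groups differs.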