Lazy patterns

By default, repetition operators match as much as possible: in Lyncheeeeee, the pattern Lynche+ matches the whole string. Such operators are called greedy. When we use the non-greedy (lazy) version of the same operator, written by adding a ? after it as in Lynche+?, we match as little as possible: in Lyncheeeeee, only Lynche is matched. A surprise comes when the lazy part matches nothing at all, because some operators accept zero characters as a match: in Lyncheeeeee, the pattern Lynche*? matches only Lynch.
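The three cases above can be checked with Python's re module. This is a minimal sketch; the variable names are only illustrative.

```python
import re

text = "Lyncheeeeee"

# Greedy: e+ takes every available "e".
greedy = re.search(r"Lynche+", text).group()

# Lazy: e+? stops after the single "e" it is required to match.
lazy_plus = re.search(r"Lynche+?", text).group()

# Lazy with a zero minimum: e*? is satisfied by no "e" at all.
lazy_star = re.search(r"Lynche*?", text).group()

print(greedy)     # Lyncheeeeee
print(lazy_plus)  # Lynche
print(lazy_star)  # Lynch
```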

Let's take a more useful example. In 1 coconut, 2 guavas, 3 jackfruit, the greedy pattern \d (.+), captures too many fruits (coconut, 2 guavas), because .+ runs on to the last comma. The non-greedy version \d (.+?), captures exactly one (coconut), because .+? stops at the first comma.
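The same comparison in Python might look like the sketch below. The use of re.findall at the end is an addition for illustration, not part of the original example.

```python
import re

text = "1 coconut, 2 guavas, 3 jackfruit"

# Greedy: .+ expands to the last comma in the string.
greedy = re.search(r"\d (.+),", text).group(1)

# Lazy: .+? stops at the first comma it can reach.
lazy = re.search(r"\d (.+?),", text).group(1)

# Repeating the lazy pattern over the whole string.
# "jackfruit" is not captured because no comma follows it.
all_lazy = re.findall(r"\d (.+?),", text)

print(greedy)    # coconut, 2 guavas
print(lazy)      # coconut
print(all_lazy)  # ['coconut', 'guavas']
```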

RegEx greediness is hasty. What I mean is that a greedy operator takes as much as possible at the first place where it can match, but it does not search the rest of the string for a longer match. The pattern \w+ is satisfied by the first word, even though the others are much longer: in peach, gooseberry, aaaaaaaaaaaaaaaaaaaaaaaaaa, only peach is matched.
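A short Python sketch of this behaviour follows. Picking the longest word with re.findall and max is one possible workaround, shown here as an assumption about what a reader might want, not something the regex engine does by itself.

```python
import re

text = "peach, gooseberry, aaaaaaaaaaaaaaaaaaaaaaaaaa"

# re.search returns the leftmost match, so the first word wins
# even though longer words come later.
first = re.search(r"\w+", text).group()

# To get the longest word, collect every match and compare lengths.
words = re.findall(r"\w+", text)
longest = max(words, key=len)

print(first)         # peach
print(len(words))    # 3
print(longest == words[-1])  # True
```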

There are 4 non-greedy operators in Python's RegEx: *?, +?, ?? and {}?. Let me explain the last one here, since it is less well known than the others. The {} operator is the general form of repetition. It takes a minimum and a maximum number of repetitions, written {min,max}. Therefore the other operators can be constructed from it:

  • * stands for {0,}
  • + for {1,}
  • ? for {0,1}
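These equivalences, and the lazy form {}?, can be checked with a small Python sketch. The bounds {2,4} are arbitrary values chosen for illustration.

```python
import re

text = "Lyncheeeeee"

# Each shorthand operator paired with its {min,max} spelling.
pairs = [
    (r"Lynche*", r"Lynche{0,}"),
    (r"Lynche+", r"Lynche{1,}"),
    (r"Lynche?", r"Lynche{0,1}"),
]

# Both spellings should produce the same match on the same input.
same = [
    re.search(short, text).group() == re.search(braces, text).group()
    for short, braces in pairs
]
print(same)  # [True, True, True]

# Greedy {2,4} takes the maximum (4 "e"), lazy {2,4}? the minimum (2 "e").
greedy_range = re.search(r"Lynche{2,4}", text).group()
lazy_range = re.search(r"Lynche{2,4}?", text).group()

print(greedy_range)  # Lyncheeee
print(lazy_range)    # Lynchee
```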