Python provides very powerful grouping facilities. Once you have a match object,
you can call its group(index) or groups() method.
groups() returns a tuple equivalent to (group(1), group(2), ...).
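As a minimal sketch of the difference between group() and groups(), here is a small example; the e-mail-like pattern and the sample string are made up for illustration only.

```python
import re

# Two unnamed groups: the part before "@" and the part between "@" and ".com".
m = re.search(r"(\w+)@(\w+)\.com", "contact: alice@example.com")

whole = m.group(0)    # group(0) (or group()) is the entire match
first = m.group(1)    # groups are numbered from 1, left to right
both = m.groups()     # tuple of all groups, starting at group 1

print(whole)  # alice@example.com
print(first)  # alice
print(both)   # ('alice', 'example')
```

Note that groups() never includes group(0); it only covers the parenthesized subgroups.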
The syntax for a named group is one of the Python-specific extensions: (?P<name>...).
>>> import re
>>> p = re.compile(r'(?P<word>\b\w+\b)')
>>> m = p.search( '(((( Lots of punctuation )))' )
>>> m.group('word')
'Lots'
>>> m.group(1)
'Lots'
You can refer back to an earlier named group by its name, using (?P=name):
>>> p = re.compile(r'(?P<word>\b\w+)\s+(?P=word)')
>>> p.search('Paris in the the spring').group()
'the the'
P.S.:
Note that both match() and search() stop as soon as they find ONE substring that fits the pattern. match() only tries at the beginning of the string, while search() scans forward through it.
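To see this "first match only" behaviour, and one way around it, here is a small sketch. It uses finditer(), which yields every non-overlapping match; the sample sentence and the trailing \b in the pattern are my own choices for illustration.

```python
import re

# A doubled word: a named group, whitespace, then a backreference to that group.
pat = re.compile(r"\b(?P<word>\w+)\s+(?P=word)\b")
text = "Paris in the the spring and and summer"

at_start = pat.match(text)    # match() only tries at position 0
first = pat.search(text)      # search() returns the first hit and stops
every = [m.group() for m in pat.finditer(text)]  # all non-overlapping hits

print(at_start)       # None
print(first.group())  # the the
print(every)          # ['the the', 'and and']
```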
An example of grouping:
This is a case I ran into in a backup and recovery system.
Suppose we want to split the filename
2009-07-22-23-09.tar.gz (year-month-day-hour-minute.tar.gz)
and get the year, month, day, hour and minute at which the file was created. How should we write the regular expression?
The pattern should be
?P
The file name is split into five named groups, and we can retrieve each part by calling group(name) on the match object.
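One possible way to write such a pattern is sketched below. The group names (year, month, day, hour, minute) and the \d{4} / \d{2} widths are assumptions chosen to fit the filename layout above, not necessarily the exact pattern used in the backup system.

```python
import re

# Hypothetical pattern: one named group per field, separated by hyphens.
FILENAME_RE = re.compile(
    r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"
    r"-(?P<hour>\d{2})-(?P<minute>\d{2})\.tar\.gz$"
)

m = FILENAME_RE.match("2009-07-22-23-09.tar.gz")

year = m.group("year")   # retrieve a single part by name
parts = m.groupdict()    # or all named parts at once, as a dict

print(year)   # 2009
print(parts)  # {'year': '2009', 'month': '07', 'day': '22', 'hour': '23', 'minute': '09'}
```

The values come back as strings; convert them with int() if you need numbers.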
A better example using VERBOSE:
pat = re.compile(r"""
\s* # Skip leading whitespace
(?P<header>[^:]+) # Header name
\s* : # Whitespace, and a colon
(?P<value>.*?) # The header's value -- *? used to
# lose the following trailing whitespace
\s*$ # Trailing whitespace to end-of-line
""", re.VERBOSE)
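A short usage sketch for a verbose pattern of this shape follows. The group names header and value and the sample line are assumptions for illustration; the point is that with re.VERBOSE, whitespace and # comments inside the pattern are ignored.

```python
import re

# Assumed group names: "header" for the part before the colon, "value" after it.
header_pat = re.compile(r"""
    \s*                # skip leading whitespace
    (?P<header>[^:]+)  # header name: everything up to the colon
    \s* :              # optional whitespace, then the colon
    (?P<value>.*?)     # header value; non-greedy so that the
                       # trailing whitespace is left for \s*$
    \s*$               # trailing whitespace to end of line
""", re.VERBOSE)

m = header_pat.match("  From:author@example.com   ")

print(m.group("header"))  # From
print(m.group("value"))   # author@example.com
```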