
【Solved】Python regex (re): getting every captured value from a group that matched multiple times

Python re crifan 8758 views 0 comments

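【Problem】

A problem someone else ran into:

Calling all regex experts: how do I get all the substrings of a group that was captured multiple times?

The documentation for Match.group(i) says
that if a group is captured multiple times, the group returns only the substring from its last capture.
For example, a pattern like "(\w)*", where a quantifier follows the group, causes a single group to be captured multiple times. I want every substring that the group captured, not just the last one. How do I do that?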
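To make the behavior concrete, here is a minimal sketch of the problem. The input string "abc" is a made-up sample, not taken from the original question.

```python
import re

# Hypothetical sample input, chosen only for illustration.
text = "abc"

# The group (\w) is followed by *, so it is captured once per character.
match = re.match(r"(\w)*", text)

whole = match.group(0)  # the entire match
last = match.group(1)   # only the LAST capture of group 1 survives

print(whole)  # abc
print(last)   # c
```

The earlier captures ('a' and 'b') are not reachable through the match object, which is exactly what the question is complaining about.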

【Solution process】

1. Reference:

Python regexes: How to access multiple matches of a group?

From that thread, there are three approaches:

(1) Drop the asterisk *, then use re.findall

For details, see:

http://docs.python.org/2/library/re.html#re.findall

re.findall(pattern, string, flags=0)

Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.

New in version 1.5.2.

Changed in version 2.4: Added the optional flags argument.

 

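For concrete usage, see my post:

【Notes】Differences and connections between re.search and re.findall in Python + differences among four cases in re.findall: named groups, unnamed groups, non-capturing groups, and no groups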
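As a rough sketch of approach (1), the quantifier is removed from the group and re.findall collects each capture. The sample strings and the "items:" prefix are assumptions made up for illustration. The second half shows one possible way to cope when the repeated group is part of a larger pattern: match the larger pattern first, then run re.findall on the matched part only.

```python
import re

# Hypothetical sample input.
text = "abc"

# Without the trailing *, each \w is a separate match, and with exactly
# one group in the pattern findall returns a list of that group's values.
captures = re.findall(r"(\w)", text)
print(captures)  # ['a', 'b', 'c']

# When the repeated part sits inside a larger pattern, a two-step
# approach is one option (the "items:" format is invented for this demo).
line = "items: a b c; other: x y"
outer = re.search(r"items:((?: \w)*)", line)  # grab the whole repeated run
inner = re.findall(r"\w", outer.group(1))     # then split it into pieces
print(inner)  # ['a', 'b', 'c']
```

Note that re.findall scans the entire string, so without the second step it would also pick up word characters outside the region of interest.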

 

(2) Drop the asterisk *, then use re.finditer

For details, see:

http://docs.python.org/2/library/re.html#re.finditer

re.finditer(pattern, string, flags=0)

Return an iterator yielding MatchObject instances over all non-overlapping matches for the RE pattern in string. The string is scanned left-to-right, and matches are returned in the order found. Empty matches are included in the result unless they touch the beginning of another match.

New in version 2.2.

Changed in version 2.4: Added the optional flags argument.
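A minimal sketch of approach (2), again with a made-up input string. Compared with re.findall, each item yielded by re.finditer is a match object, so the position of every capture is available as well as its text.

```python
import re

# Hypothetical sample input.
text = "abc"

# Collect the captured text together with its start offset in the string.
captures = [(m.group(1), m.start(1)) for m in re.finditer(r"(\w)", text)]

for value, offset in captures:
    print(value, offset)
```

This form is probably the better fit when the offsets matter, or when the input is large and the matches should be consumed lazily rather than built into a list up front.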

 

(3) (For more complex parsing tasks) use pyparsing

 

【Summary】

I had honestly never noticed this issue before.

I had never actually used re.finditer either; I will try it out when I have time.

