[Tutorial] Python Regular Expressions in Detail: (?:…) non-capturing group

The official explanation in the Python 2.7 manual is:

(?:...)
A non-capturing version of regular parentheses. Matches whatever regular expression is inside the parentheses, but the substring matched by the group cannot be retrieved after performing a match or referenced later in the pattern.
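The last clause of that definition ("or referenced later in the pattern") can be illustrated with a backreference. The following is only a minimal sketch: the sample string and the variable names are chosen for illustration and do not come from the manual.

```python
import re

# \1 refers back to whatever the first capturing group matched.
repeated = re.search(r"(\w+) \1", "hello hello world")
print(repeated.group(1))

# With (?:...) there is no group 1 to refer back to, so the re module
# rejects the pattern when it is compiled.
try:
    re.compile(r"(?:\w+) \1")
    backreference_error = None
except re.error as exc:
    backreference_error = exc
print(backreference_error)
```

The first pattern finds the doubled word because the capturing group recorded "hello"; the second never gets as far as matching.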

The details are explained below:

1. "non-capturing group" is rendered literally in Chinese as 非捕获组.

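2. What "non-capturing group" means:

non: simply "not".
capturing: to capture means to:

match the text, and
record what was matched at this position as a group, for later use.
How it is used later:

for example, .group(1) on the match object returns the value of the first captured group.

group: the part of the pattern enclosed in parentheses is called a group. The parentheses turn that part into a single unit, so that a quantifier or an alternation applies to the whole of it, and they also help whoever reads the code see the logical structure of the pattern.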
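As a minimal sketch of what "capturing" and "group" mean in practice (the sample strings and variable names are assumptions made for illustration):

```python
import re

# Sample input chosen for illustration only.
match = re.search(r"(\w+) (\d+)", "hello 123")

whole_match = match.group(0)   # the entire matched text
first_group = match.group(1)   # text captured by the first (...)
second_group = match.group(2)  # text captured by the second (...)
all_groups = match.groups()    # every captured group, as a tuple

print(whole_match)
print(first_group)
print(second_group)
print(all_groups)

# Parentheses also make their contents one unit: here "+" repeats "ab" as a whole.
repeats_unit = re.fullmatch(r"(ab)+", "ababab") is not None
# Without parentheses, "+" only repeats the single character "b".
repeats_char_only = re.fullmatch(r"ab+", "ababab") is not None
print(repeats_unit)
print(repeats_char_only)
```

Group 0 is always the whole match; the numbered groups start at 1 and follow the order of the opening parentheses.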

3. The non-capturing group is defined mainly in contrast to the capturing group.

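4. Capturing groups are the ones you usually see, for example:

the (xxx) form: [Tutorial] Python Regular Expressions in Detail: (…) group

the (?P<name>xxx) form: [Tutorial] Python Regular Expressions in Detail: (?P<name>…) named group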
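To see how the two capturing forms sit next to a non-capturing group, here is a small sketch. The group names first and last are hypothetical, picked only for this illustration.

```python
import re

# Two named capturing groups with a non-capturing group between them.
pattern = r"(?P<first>\d+) \w+ (?:\d+) \w+ (?P<last>\d+)"
match = re.search(pattern, "hello 123 world 456 nihao 789")

first_value = match.group("first")  # look a group up by name
last_value = match.group("last")
last_by_index = match.group(2)      # named groups are still numbered
named_groups = match.groupdict()    # only named groups appear here

print(first_value)
print(last_value)
print(last_by_index)
print(named_groups)
```

The middle number is matched but appears neither in the numbering nor in groupdict().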

[Example code, explained: the (?:xxx) non-capturing group in Python's re module]

#!/usr/bin/python
# -*- coding: utf-8 -*-
"""
Function:
[Tutorial] Python Regular Expressions in Detail: (?:…) non-capturing group

Version:    2013-09-06
Author:     Crifan Li
Contact:    https://www.crifan.com/about/me/
"""

import re

def python_re_non_capturing_group():
    """
        demo Python non-capturing group
    """
    inputStr = "hello 123 world 456 nihao 789"
    # raw strings pass \w and \d through to the re module unchanged
    rePatternAllCapturingGroup = r"\w+ (\d+) \w+ (\d+) \w+ (\d+)"
    rePatternWithNonCapturingGroup = r"\w+ (\d+) \w+ (?:\d+) \w+ (\d+)"
    # print(...) with a single formatted string runs on both Python 2 and Python 3
    print("inputStr= %s" % inputStr)
    print("rePatternAllCapturingGroup= %s" % rePatternAllCapturingGroup)
    print("rePatternWithNonCapturingGroup= %s" % rePatternWithNonCapturingGroup)
    print("--- 1. show normal case, all captured group ---")
    foundDigitsAllCapturingGroup = re.search(rePatternAllCapturingGroup, inputStr)
    if foundDigitsAllCapturingGroup:
        firstGroup = foundDigitsAllCapturingGroup.group(1)
        print("firstGroup= %s" % firstGroup)  # firstGroup= 123
        secondGroup = foundDigitsAllCapturingGroup.group(2)
        print("secondGroup= %s" % secondGroup)  # secondGroup= 456
        thirdGroup = foundDigitsAllCapturingGroup.group(3)
        print("thirdGroup= %s" % thirdGroup)  # thirdGroup= 789
    print("--- 2. show with non-capturing group ---")
    foundDigitsWithNonCapturingGroup = re.search(rePatternWithNonCapturingGroup, inputStr)
    if foundDigitsWithNonCapturingGroup:
        firstGroup = foundDigitsWithNonCapturingGroup.group(1)
        print("firstGroup= %s" % firstGroup)  # firstGroup= 123
        secondGroup = foundDigitsWithNonCapturingGroup.group(2)
        print("secondGroup= %s" % secondGroup)  # secondGroup= 789
        # thirdGroup = foundDigitsWithNonCapturingGroup.group(3)  # would raise -> IndexError: no such group
        print(r"""Explains:
1. the second pair of parentheses, (?:\d+),
has the form (?:xxx),
so it is a non-capturing group:
it takes part in the match,
but is not usable (indexable) later,
so here group(2) is not 456 but 789

2. also, because the non-capturing group is skipped in the numbering,
there is no group with index 3,
so calling group(3) above would raise:
IndexError: no such group
        """)

###############################################################################
if __name__ == "__main__":
    python_re_non_capturing_group()
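The numbering effect shown in the example can also be checked without indexing groups one by one. The sketch below reuses the same input string and the same two patterns; the variable names are new and chosen only for illustration. It relies on the groups attribute of a compiled pattern, which reports the number of capturing groups.

```python
import re

input_str = "hello 123 world 456 nihao 789"
all_capturing = re.compile(r"\w+ (\d+) \w+ (\d+) \w+ (\d+)")
with_non_capturing = re.compile(r"\w+ (\d+) \w+ (?:\d+) \w+ (\d+)")

# Number of capturing groups in each pattern.
print(all_capturing.groups)
print(with_non_capturing.groups)

captured_all = all_capturing.search(input_str).groups()
match = with_non_capturing.search(input_str)
captured_some = match.groups()
print(captured_all)
print(captured_some)

# The non-capturing group still takes part in the match: 456 is in group(0).
whole_match = match.group(0)
print(whole_match)

# Asking for a group that does not exist raises IndexError.
try:
    match.group(3)
    index_error_raised = False
except IndexError:
    index_error_raised = True
print(index_error_raised)
```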

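[Summary]

The purpose of a non-capturing group:

My understanding is that it only groups the corresponding part of the pattern

into one logical unit, i.e. a group, without actually recording the matched content as a group.

This lets a quantifier or an alternation apply to the whole unit, and helps whoever reads the code see the logical relationships in what is being matched.

Its practical value is modest but real: the numbering of the capturing groups, and the results of functions that depend on that numbering, stay free of groups you never intend to read.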
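One place where the choice between the two kinds of group visibly changes the result is re.findall and re.split, whose return values depend on whether the pattern contains capturing groups. A minimal sketch, with sample strings invented for illustration:

```python
import re

text = "10px 2em 30pt"

# With a capturing group, findall returns only what the group captured.
units_only = re.findall(r"\d+(px|em)", text)
# With a non-capturing group, findall returns each whole match.
whole_values = re.findall(r"\d+(?:px|em)", text)
print(units_only)
print(whole_values)

# With a capturing group, split keeps the separators in the result.
with_separators = re.split(r"(,|;)", "a,b;c")
# With a non-capturing group, the separators are dropped.
without_separators = re.split(r"(?:,|;)", "a,b;c")
print(with_separators)
print(without_separators)
```

In both pairs the patterns match exactly the same text; only what is reported back differs.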

Tip:

For more tutorials in this series, see:

[Tutorial] Python Regular Expressions in Detail

When reposting, please credit: 在路上 » [Tutorial] Python Regular Expressions in Detail: (?:…) non-capturing group
