6.3. How parentheses in the pattern change the result of re.findall

The result of search has already been explained in Section 6.2, "The meaning and usage of search in the re module, and the meaning of group after a search".

This section shows in detail how the result of findall differs depending on whether the pattern contains parentheses.

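Parentheses in a pattern mark the enclosed part as a group. On a match object (as returned by search), each group can be retrieved with group(N): group(0) is the whole match, and the parenthesized groups are numbered from 1. findall, however, returns a list rather than a match object, so group(N) cannot be called on its result; the groups appear as the items of that list instead (as tuples when the pattern has more than one group).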
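To make the group numbering concrete, here is a minimal sketch that runs the same pattern through search and findall. The sample string `key=value` and the variable names are hypothetical, chosen only for illustration; the single-argument print calls are written so that the snippet runs under both Python 2 and Python 3.

```python
import re

# Hypothetical sample text, used only for illustration.
text = 'key=value'
pattern = r'(\w+)=(\w+)'

# search returns a match object, so group(N) is available.
match = re.search(pattern, text)
whole = match.group(0)   # the entire match
first = match.group(1)   # first parenthesized group
second = match.group(2)  # second parenthesized group
print(whole)   # key=value
print(first)   # key
print(second)  # value

# findall returns a list (here a list of tuples, one tuple per match),
# and a list has no group() method.
found = re.findall(pattern, text)
print(found)   # [('key', 'value')]
```

With search, the groups are read from the match object; with findall, the same groups arrive already unpacked as list items.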

The detailed test below makes the difference between the two cases clear:

# Python 2 syntax (print statement)
import re

# Stand-in (an assumption) for the real blog HTML, which contains these two picture URLs:
blogContent = u'''
http://hiphotos.baidu.com/againinput_tmp/pic/item/069e0d89033b5bb53d07e9b536d3d539b400bce2.jpg
http://hiphotos.baidu.com/recommend_music/pic/item/221ebedfa1a34d224954039e.jpg
'''

# Pattern without parentheses: each list item is the whole matched string
pic_pattern_no_parenthesis = r'http://hiphotos\.baidu\.com/\S+/[ab]{0,2}pic/item/[a-zA-Z0-9]{24,40}\.\w{3}'
picList_no_parenthesis = re.findall(pic_pattern_no_parenthesis, blogContent) # always a list; empty if nothing matched
print 'findall no()=',picList_no_parenthesis
print 'findall no() len=',len(picList_no_parenthesis)
#print 'findall no() group=',picList_no_parenthesis.group(0) # -> AttributeError: a list has no group()

# Pattern with four groups: each list item is a tuple of the four group strings
pic_pattern_with_parenthesis = r'http://hiphotos\.baidu\.com/(\S+)/([ab]{0,2})pic/item/([a-zA-Z0-9]+)\.([a-zA-Z]{3})'
picList_with_parenthesis = re.findall(pic_pattern_with_parenthesis, blogContent) # always a list; empty if nothing matched
print 'findall with()=',picList_with_parenthesis
print 'findall with() len=',len(picList_with_parenthesis)
#print 'findall with() group(0)=',picList_with_parenthesis.group(0) # -> AttributeError: a list has no group()
#print 'findall with() group(1)=',picList_with_parenthesis.group(1) # -> AttributeError: a list has no group()
print 'findall with() [0][0]=',picList_with_parenthesis[0][0] # 1st group of the 1st match
print 'findall with() [0][1]=',picList_with_parenthesis[0][1] # 2nd group (empty string here)
print 'findall with() [0][2]=',picList_with_parenthesis[0][2] # 3rd group
print 'findall with() [0][3]=',picList_with_parenthesis[0][3] # 4th group
#print 'findall with() [0][4]=',picList_with_parenthesis[0][4] # only 4 groups -> IndexError
    

The test output is:

findall no()= [u'http://hiphotos.baidu.com/againinput_tmp/pic/item/069e0d89033b5bb53d07e9b536d3d539b400bce2.jpg', u'http://hiphotos.baidu.com/recommend_music/pic/item/221ebedfa1a34d224954039e.jpg']
findall no() len= 2
findall with()= [(u'againinput_tmp', u'', u'069e0d89033b5bb53d07e9b536d3d539b400bce2', u'jpg'), (u'recommend_music', u'', u'221ebedfa1a34d224954039e', u'jpg')]
findall with() len= 2
findall with() [0][0]= againinput_tmp
findall with() [0][1]=
findall with() [0][2]= 069e0d89033b5bb53d07e9b536d3d539b400bce2
findall with() [0][3]= jpg
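The test above covers a pattern with no group and a pattern with several groups. The sketch below adds three related cases that may be useful in practice: a single group (findall then returns plain strings, not tuples), a non-capturing group `(?:...)` (grouping without changing the result), and `re.finditer` (which yields match objects, so both the whole match and the groups are available). The variable `blog_content` and the simplified patterns are assumptions made for illustration, not the original test; the code is written to run under both Python 2 and Python 3.

```python
import re

# Stand-in (assumption) for the blog content: only the two picture URLs.
blog_content = (
    'http://hiphotos.baidu.com/againinput_tmp/pic/item/'
    '069e0d89033b5bb53d07e9b536d3d539b400bce2.jpg\n'
    'http://hiphotos.baidu.com/recommend_music/pic/item/'
    '221ebedfa1a34d224954039e.jpg'
)

# Simplified pattern pieces (assumption), reused in every case below.
prefix = r'http://hiphotos\.baidu\.com/'
suffix = r'/pic/item/\w+\.'

# One capturing group: each item is a plain string holding that group only.
one_group = re.findall(prefix + r'(\S+)' + suffix + r'\w{3}', blog_content)

# Non-capturing group (?:...): allows alternation without creating a group,
# so each item is still the whole matched string.
non_capturing = re.findall(prefix + r'\S+' + suffix + r'(?:jpg|png)', blog_content)

# finditer yields match objects: group(0) is the whole match, group(1) the group.
pairs = [
    (m.group(0), m.group(1))
    for m in re.finditer(prefix + r'(\S+)' + suffix + r'\w{3}', blog_content)
]

print(one_group)           # ['againinput_tmp', 'recommend_music']
print(len(non_capturing))  # 2
print(pairs[0][1])         # againinput_tmp
```

If both the full URL and its parts are needed, finditer (or wrapping the whole pattern in an extra pair of parentheses) avoids running two separate findall calls.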