[Python 3.6.1] Regular Expressions: findall & compile
1. findall: search a string and return every match as a list
    >>> import re
    >>> text = "Karl Marx was born in Germany, and German was his native language."
    >>> re.findall(r"Ger\w+", text)
    ['Germany', 'German']
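Explanation: findall returns every substring made up of "Ger" followed by one or more word characters.
- What r"" does: it tells the Python interpreter that the string is a raw string, so "\" is not treated as an escape character. For example, \n in a raw string is two characters (a backslash and the letter n), not the newline it usually stands for. Because regular expressions use "\" heavily themselves, it is best to prefix pattern strings with r to avoid conflicts.
- What \w+ does: \w matches any letter, digit, or underscore; + means the preceding item must occur one or more times.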
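The two points above can be checked with a short sketch. The variable names and the sample text are made up for illustration and are not part of the original example.

```python
import re

# In a normal string "\n" is one newline character;
# in a raw string it stays as two characters: a backslash and the letter n.
normal = "\n"
raw = r"\n"
print(len(normal))  # 1
print(len(raw))     # 2

# \w matches letters, digits, and the underscore; + requires at least one.
# A bare "Ger" is skipped because nothing follows it for \w+ to match.
sample = "Ger Ger_1 German"  # hypothetical sample text
found = re.findall(r"Ger\w+", sample)
print(found)  # ['Ger_1', 'German']
```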
2. compile: compile a regular expression into a pattern object, mainly useful for expressions that are reused many times
    >>> import re
    >>> text = "Karl Marx was born in Germany, and German was his native language."
    >>> rec = re.compile(r"Ger\w*")
    >>> rec.findall(text)
    ['Germany', 'German']
This one is easy to follow: compile r"Ger\w*" into a pattern object, then call its methods directly.
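A compiled pattern is most useful when the same expression is applied to many inputs. The sketch below assumes a small list of sentences (`lines` and `ger_pattern` are illustrative names, not from the original) and reuses one pattern object for all of them.

```python
import re

# Compile the expression once, then reuse the pattern object.
ger_pattern = re.compile(r"Ger\w*")

# Hypothetical input lines, chosen only for illustration.
lines = [
    "Karl Marx was born in Germany.",
    "German was his native language.",
    "He later lived in London.",
]

# findall on the pattern object takes only the text to search.
matches = [ger_pattern.findall(line) for line in lines]
print(matches)  # [['Germany'], ['German'], []]
```

The module-level functions such as re.findall also cache recently used patterns, so in small scripts the benefit of compile is mostly readability.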
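3. The difference between "*" and "+"
*: matches the preceding item zero or more times
+: matches the preceding item one or more times
The following test makes the difference clear: "and" appears only in the second result, because \w+ requires at least one character before the "a", while \w* allows none.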
    >>> import re
    >>> text = "Karl Marx was born in Germany, and German was his native language."
    >>> re.findall(r"\w+a\w+", text)
    ['Karl', 'Marx', 'was', 'Germany', 'German', 'was', 'native', 'language']
    >>> re.findall(r"\w*a\w*", text)
    ['Karl', 'Marx', 'was', 'Germany', 'and', 'German', 'was', 'native', 'language']
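The same contrast is easier to see on a smaller input. This is a minimal sketch with a made-up sample string in which the letter "a" appears alone, at the start of a word, at the end of a word, and in the middle.

```python
import re

# Hypothetical sample: "a" alone, leading, trailing, and in the middle.
sample = "a at ta tat"

# + needs at least one word character on each side of the "a".
plus_matches = re.findall(r"\w+a\w+", sample)
print(plus_matches)  # ['tat']

# * accepts zero characters on either side, so every word containing "a" matches.
star_matches = re.findall(r"\w*a\w*", sample)
print(star_matches)  # ['a', 'at', 'ta', 'tat']
```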