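How can I use a Python regular expression to find a letter pair that occurs twice in a string?
I want to loop over a list of strings, find the ones that contain a repeated pair of letters, and collect them in a list. The two letters in the pair don't have to be the same; the pair just has to occur twice, though the letters may be identical.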
For example:
xxhgfhdeifhjfrikfoixx
- this one contains xx twice, so I want to keep this string
kwofhdbugktrkdidhdnbk
- this one would also be kept, because hd is repeated
The best I've come up with so far only finds pairs: ([a-z][a-z])\1|([a-z])\2
I need to find which strings contain a repeated pair.
(\w{2}).*?(\1)
https://regex101.com/r/yB3nX6/1
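import re

subject = "xxhgfhdeifhjfrikfoixx"  # the string to search
for match in re.finditer(r"(\w{2}).*?(\1)", subject, re.IGNORECASE):
    start = match.start()  # match start
    end = match.end()      # match end (exclusive)
    text = match.group()   # matched text
    print(start, end, text)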
result = re.findall(r"(\w{2}).*?(\1)", subject, re.IGNORECASE)
# (\w{2}).*?(\1)
#
# Options: Case insensitive; Exact spacing; Dot doesn’t match line breaks; ^$ don’t match at line breaks; Regex syntax only
#
# Match the regex below and capture its match into backreference number 1 «(\w{2})»
#    Match a single character that is a “word character” (Unicode; any letter or ideograph, any number, underscore) «\w{2}»
#       Exactly 2 times «{2}»
# Match any single character that is NOT a line break character (line feed) «.*?»
#    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
# Match the regex below and capture its match into backreference number 2 «(\1)»
#    Match the same text that was most recently matched by capturing group number 1 (case insensitive; fail if the group did not participate in the match so far) «\1»
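To see what the pattern actually captures, here is a small sketch that runs it against the two strings from the question plus two extra strings I made up for contrast ("abcdefg" and "xxx" are my own test inputs, not from the question). The names PAIR_TWICE, samples, and found are only illustrative.

```python
import re

# Same pattern as above; group 2 simply repeats whatever group 1 captured.
PAIR_TWICE = re.compile(r"(\w{2}).*?(\1)", re.IGNORECASE)

# The first two strings come from the question; the last two are made-up controls.
samples = ["xxhgfhdeifhjfrikfoixx", "kwofhdbugktrkdidhdnbk", "abcdefg", "xxx"]

# Map each sample to its first repeated pair, or None when no pair occurs twice.
found = {}
for s in samples:
    m = PAIR_TWICE.search(s)
    found[s] = m.group(1) if m else None

print(found)
```

Note that "xxx" does not match: the second occurrence has to start after the first pair ends, so the two occurrences cannot overlap.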
You can swap \w for [a-z] if you specifically want to accept only the characters a-z.
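For the original goal of filtering a list, something along these lines should work. It is only a sketch: it assumes lowercase input, uses the [a-z] variant, drops the second capturing group since it isn't needed for a yes/no test, and the helper name strings_with_repeated_pair and the sample list are my own.

```python
import re

# [a-z]{2} instead of \w{2}, so only lowercase letter pairs count.
REPEATED_PAIR = re.compile(r"([a-z]{2}).*?\1")

def strings_with_repeated_pair(strings):
    """Return the strings in which some two-letter pair occurs at least twice."""
    return [s for s in strings if REPEATED_PAIR.search(s)]

# First two strings are from the question; the last two are made-up non-matches.
candidates = ["xxhgfhdeifhjfrikfoixx", "kwofhdbugktrkdidhdnbk", "abcdefg", "aaa"]
kept = strings_with_repeated_pair(candidates)
print(kept)
```

re.search is enough here because you only need to know whether a match exists, not where every match is.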