I am using the following regex:
url (blank:false, matches: /^(https?:\/\/)(?:[A-Za-z0-9]+([\-\.][A-Za-z0-9]+)*\.)+[A-Za-z]{2,40}(:[1-9][0-9]{0,4})?(\/\S*)?/)
I would like to add (?i) to make everything case-insensitive. How do I add it?
I can confirm that putting (?i) at the beginning of the regular expression makes it case-insensitive.
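As a quick check of the point above, here is a minimal sketch that prepends (?i) to the pattern from the question. The sample URLs and the variable names `pattern` and `caseSensitive` are made up for illustration, and the snippet runs as a plain Groovy script rather than inside a Grails constraint.

```groovy
// Pattern from the question with (?i) prepended: the flag applies to the whole expression.
def pattern = /(?i)^(https?:\/\/)(?:[A-Za-z0-9]+([\-\.][A-Za-z0-9]+)*\.)+[A-Za-z]{2,40}(:[1-9][0-9]{0,4})?(\/\S*)?/

// Same pattern without the flag, for comparison.
def caseSensitive = /^(https?:\/\/)(?:[A-Za-z0-9]+([\-\.][A-Za-z0-9]+)*\.)+[A-Za-z]{2,40}(:[1-9][0-9]{0,4})?(\/\S*)?/

// Hypothetical sample URLs.
assert 'HTTPS://WWW.EXAMPLE.COM/Index.html' ==~ pattern
assert 'https://www.example.com' ==~ pattern

// Without (?i) the upper-case scheme is rejected, because "https?" is written in lower case.
assert !('HTTPS://WWW.EXAMPLE.COM/Index.html' ==~ caseSensitive)
```

Since the character classes in the question already list both cases, the visible effect of the flag here is on the scheme part.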
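In any case, if your goal is to shorten the regex, you can use Groovy's dollar slashy string form. It lets you leave forward slashes / unescaped (the escape character becomes $).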
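To illustrate the difference between the two string forms, here is a small sketch. The pattern and the names `slashy`, `dollarSlashy` and `withDollar` are invented for this example only.

```groovy
// Slashy string: every forward slash in the pattern has to be escaped as \/
def slashy = /^https?:\/\/example\.com\/docs\/\S*/

// Dollar slashy string: forward slashes are written as-is, backslashes stay literal
def dollarSlashy = $/^https?://example\.com/docs/\S*/$

// Both literals produce the same pattern text
assert slashy == dollarSlashy
assert 'https://example.com/docs/index.html' ==~ dollarSlashy

// "$" is the escape character in this form: "$$" yields a literal dollar sign
def withDollar = $/price: $$5/$
assert withDollar == 'price: $5'
```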
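Further suggestions:
The POSIX character class \p{Alnum} is a compact equivalent of [0-9a-zA-Z]. It already covers both cases, so the host part does not need (?i); note that the scheme part https? still matches lower case only unless you keep the flag.
Remove the unneeded backslashes from the character class: [\-\.] -> [-.] (the dash does not need escaping when it is the first or last element, and a dot inside a character class is always a literal).
Remove the unneeded round brackets from the protocol part.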
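The suggestions above can be checked with a few assertions. This is only a sketch with made-up sample strings; it assumes the default java.util.regex behaviour, where \p{Alnum} is ASCII-only.

```groovy
// \p{Alnum} matches the same ASCII characters as [0-9a-zA-Z], in either case
assert 'abcXYZ019' ==~ /\p{Alnum}+/
assert 'abcXYZ019' ==~ /[0-9a-zA-Z]+/
assert !('\u00e9' ==~ /\p{Alnum}/)   // an accented letter is not matched by default

// Inside a character class, an unescaped dot and a trailing dash are literals
['.', '-'].each { ch ->
    assert ch ==~ /[\-\.]/
    assert ch ==~ /[-.]/
}
assert !('x' ==~ /[-.]/)             // the dot does not act as a wildcard here

// The round brackets around the protocol part do not change what is matched
assert 'https://' ==~ /(https?:\/\/)/
assert 'https://' ==~ /https?:\/\//
```

The brackets only matter if the captured scheme is read back later, which the constraint in the question does not do.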
The following version takes advantage of the multi-line support of dollar slashy strings and of the free-spacing regex flag (?x):
$/(?x)
^                       # start of the string
https?://               # http:// or https://, no round brackets needed
(                       # start of group 1; could be non-capturing (?: ... ), but that is less readable
  \p{Alnum}+            # one or more alphanumeric chars, instead of [a-zA-Z0-9]
  ([.-]\p{Alnum}+)*     # zero or more of (literal dot or dash followed by one or more [a-zA-Z0-9])
  \.                    # a literal dot
)+                      # repeat group 1 one or more times
\p{Alpha}{2,40}         # between 2 and 40 alphabetic chars [a-zA-Z]
(:[1-9][0-9]{0,4})?     # [optional] a literal colon ':' followed by 1 to 5 digits, the first one non-zero
(/\S*)?                 # [optional] a literal slash '/' followed by zero or more non-space chars
/$
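To see what (?x) changes in isolation, here is a small sketch with a made-up year-month pattern (the pattern and the names `commented` and `plain` are not part of the original answer).

```groovy
// Free-spacing version: whitespace and "# ..." comments are not part of the pattern
def commented = $/(?x)
    \d{4}   # year
    -       # literal dash
    \d{2}   # month
/$

// The same pattern written on one line
def plain = /\d{4}-\d{2}/

assert '2024-05' ==~ commented
assert '2024-05' ==~ plain

// Under (?x) a plain space is ignored, so a literal space has to be spelled out, e.g. as \x20
assert 'a b' ==~ $/(?x) a \x20 b /$
assert 'ab' ==~ $/(?x) a b /$
```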
The compact dollar slashy version:
$/^https?://(\p{Alnum}+([.-]\p{Alnum}+)*\.)+\p{Alpha}{2,40}(:[1-9][0-9]{0,4})?(/\S*)?/$
If you have to use the slashy version, this is an equivalent:
/^https?:\/\/(?:\p{Alnum}+([.-]\p{Alnum}+)*\.)+\p{Alpha}{2,40}(:[1-9][0-9]{0,4})?(\/\S*)?/
A snippet to test all of these regexes:
def multiline_pattern = $/(?x)
^                       # start of the string
https?://               # http:// or https://, no round brackets needed
(                       # start of group 1; could be non-capturing (?: ... ), but that is less readable
  \p{Alnum}+            # one or more alphanumeric chars, instead of [a-zA-Z0-9]
  ([.-]\p{Alnum}+)*     # zero or more of (literal dot or dash followed by one or more [0-9a-zA-Z])
  \.                    # a literal dot
)+                      # repeat group 1 one or more times
\p{Alpha}{2,40}         # between 2 and 40 alphabetic chars [a-zA-Z]
(:[1-9][0-9]{0,4})?     # [optional] a literal colon ':' followed by 1 to 5 digits, the first one non-zero
(/\S*)?                 # [optional] a literal slash '/' followed by zero or more non-space chars
/$

def compact_pattern = $/^https?://(\p{Alnum}+([.-]\p{Alnum}+)*\.)+\p{Alpha}{2,40}(:[1-9][0-9]{0,4})?(/\S*)?/$
def slashy_pattern = /^https?:\/\/(?:\p{Alnum}+([.-]\p{Alnum}+)*\.)+\p{Alpha}{2,40}(:[1-9][0-9]{0,4})?(\/\S*)?/

def url1 = 'https://www.example-test.domain.com:12344/aloha/index.html'
def notUrl1 = 'htxps://www.example-test.domain.com:12344/aloha/index.html'   // misspelled scheme
def notUrl2 = 'https://www.example-test.domain.com:02344/aloha/index.html'   // port starts with 0

// A valid URL matches all three forms
assert url1 ==~ multiline_pattern
assert url1 ==~ compact_pattern
assert url1 ==~ slashy_pattern

// Invalid URLs are rejected by all three forms
assert !( notUrl1 ==~ multiline_pattern )
assert !( notUrl1 ==~ compact_pattern )
assert !( notUrl1 ==~ slashy_pattern )

assert !( notUrl2 ==~ multiline_pattern )
assert !( notUrl2 ==~ compact_pattern )
assert !( notUrl2 ==~ slashy_pattern )