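I have a regular expression with named capture groups, where the last group is optional. I can't figure out how to iterate the groups and properly handle the optional group when it is empty; I get an EListOutOfBounds exception.
The regular expression parses a file generated by an external system, which we receive by email. The file contains information about checks that have been issued to vendors. The file is pipe-delimited; a sample is in the code below.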
    program Project1;

    {$APPTYPE CONSOLE}

    uses
      System.SysUtils,
      System.RegularExpressions,
      System.RegularExpressionsCore;

    {
      File format (pipe-delimited):
        Check #|Batch|CheckDate|System|Vendor#|VendorName|CheckAmount|Cancelled (if voided - optional)
    }
    const
      CheckFile = '201|3001|12/01/2015|1|001|JOHN SMITH|123.45|'#13 +
                  '202|3001|12/01/2015|1|002|FRED JONES|234.56|'#13 +
                  '103|2099|11/15/2015|2|001|JOHN SMITH|97.95|C'#13;

    var
      RegEx: TRegEx;
      MatchResult: TMatch;

    begin
      try
        // Groups that are not read below are named after the file-format fields.
        RegEx := TRegEx.Create(
          '^(?<Check>\d+)\|'#10 +
          '  (?<Batch>\d{3,4})\|'#10 +
          '  (?<ChkDate>\d{2}\/\d{2}\/\d{4})\|'#10 +
          '  (?<System>[1-3])\|'#10 +
          '  (?<Vendor>[0-9X]+)\|'#10 +
          '  (?<Payee>[^|]+)\|'#10 +
          '  (?<Amount>\d+\.\d+)\|'#10 +
          '(?<Cancelled>C)?$',
          [roIgnorePatternSpace, roMultiLine]);
        MatchResult := RegEx.Match(CheckFile);
        while MatchResult.Success do
        begin
          WriteLn('Check: ', MatchResult.Groups['Check'].Value);
          WriteLn('Dated: ', MatchResult.Groups['ChkDate'].Value);
          WriteLn('Amount: ', MatchResult.Groups['Amount'].Value);
          WriteLn('Payee: ', MatchResult.Groups['Payee'].Value);
          // Problem is here, where Cancelled is optional and doesn't
          // exist (first two lines of sample CheckFile.)
          // Raises ERegularExpressionError
          // with message 'Index out of bounds (8)' exception.
          WriteLn('Cancelled: ', MatchResult.Groups['Cancelled'].Value);
          WriteLn('');
          MatchResult := MatchResult.NextMatch;
        end;
        ReadLn;
      except
        // Regular expression syntax error.
        on E: ERegularExpressionError do
          Writeln(E.ClassName, ': ', E.Message);
      end;
    end.
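I've tried checking whether MatchResult.Groups['Cancelled'].Index is less than MatchResult.Groups.Count, checking MatchResult.Groups['Cancelled'].Length > 0, and checking whether MatchResult.Groups['Cancelled'].Value <> '', all without success.
How do I correctly handle the optional capture group Cancelled when that group has no match?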
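If the requested named group does not exist in the result, an ERegularExpressionError exception is raised. This is by design (although the wording of the exception message is misleading). If you move your ReadLn() to after the try/except block, you will see the exception message in the console window before the process exits. As written, your code does not wait for user input when an exception is raised.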
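Since your other groups are not optional, you can simply test whether MatchResult.Groups.Count is large enough to hold the Cancelled group (the overall match is stored as the group at index 0, so it is included in Count):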
    if MatchResult.Groups.Count > 8 then
      WriteLn('Cancelled: ', MatchResult.Groups['Cancelled'].Value)
    else
      WriteLn('Cancelled: ');
Or:
    Write('Cancelled: ');
    if MatchResult.Groups.Count > 8 then
      Write(MatchResult.Groups['Cancelled'].Value);
    WriteLn('');
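By the way, make sure your loop calls NextMatch() on every iteration; without that call the code gets stuck in an endless loop.

    while MatchResult.Success do
    begin
      ...
      MatchResult := MatchResult.NextMatch; // <-- required to advance to the next match
    end;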
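Because a missing named group raises ERegularExpressionError, as described above, another option is to wrap the lookup in a small helper that catches that exception instead of hard-coding the group count. The following is only a minimal sketch: OptionalGroupValue is a hypothetical helper name (it is not part of the RTL), and the shortened pattern and input line are made up for illustration.

```pascal
program OptionalGroupDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  System.RegularExpressions,
  System.RegularExpressionsCore;

// Hypothetical helper (not part of the RTL): returns the value of the
// named group, or '' when the lookup raises ERegularExpressionError
// because the group is not present in the match result.
function OptionalGroupValue(const AMatch: TMatch; const AName: string): string;
begin
  try
    Result := AMatch.Groups[AName].Value;
  except
    on ERegularExpressionError do
      Result := '';
  end;
end;

var
  M: TMatch;
begin
  // Shortened, illustrative pattern: check number, amount, optional C flag.
  M := TRegEx.Match('202|234.56|',
    '^(?<Check>\d+)\|(?<Amount>\d+\.\d+)\|(?<Cancelled>C)?$');
  if M.Success then
    WriteLn('Cancelled: ', OptionalGroupValue(M, 'Cancelled'));
  ReadLn; // keep the console window open
end.
```

The trade-off is that the exception is still raised internally, so the debugger may stop on it, and the helper cannot tell a missing optional group apart from a misspelled group name. The Count test avoids both issues when the optional group is known to be the last one.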
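You can also avoid the optional group altogether and make the Cancelled group mandatory, matching either C or nothing. Just change the last line of the regular expression to

    '(?<Cancelled>C|)$'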
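For your test application, this does not change the output. If you need to work with Cancelled further, you can simply check whether it contains C or an empty string.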
    if MatchResult.Groups['Cancelled'].Value = 'C' then
      DoSomething;