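I'm writing code for a search results page that needs to highlight the search terms. The terms happen to occur within table cells (the app iterates through GridView row cells), and these cells may contain HTML.
Currently, my code looks like this (relevant hunks shown below):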
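const string highlightPattern = @"<span class=""highlight"">$0</span>"; // $0 is the whole match; the wrapper tag and class name are illustrative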
DataBoundLiteralControl litCustomerComments = (DataBoundLiteralControl)e.Row.Cells[CUSTOMERCOMMENTS_COLUMN].Controls[0];
// Turn "term1 term2" into "(term1|term2)"
string spaceDelimited = txtTextFilter.Text.Trim();
string pipeDelimited = string.Join("|", spaceDelimited.Split(new[] {" "}, StringSplitOptions.RemoveEmptyEntries));
string searchPattern = "(" + pipeDelimited + ")";
// Highlight search terms in Customer - Comments column
e.Row.Cells[CUSTOMERCOMMENTS_COLUMN].Text = Regex.Replace(litCustomerComments.Text, searchPattern, highlightPattern, RegexOptions.IgnoreCase);
Amazingly, it works. But sometimes the text I'm matching against is HTML that looks like this:
<span class="...">Fred was a classy individual.</span>
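If you search for "class", I want the highlight code to wrap the "class" in "classy", but of course not the HTML attribute "class" that happens to be in there! If you search for "Fred", it should be highlighted.
So what's a good regex that makes sure matches happen only outside the HTML tags? It doesn't have to be super hardcore. Simply making sure the match is not between < and > would work fine, I think.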
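This regex should do the job: (?<!<[^>]*)(regex you want to check: Fred|span)
The prefix (?<!<[^>]*) is a negative lookbehind. Starting from the position where a candidate match begins and looking backward, it checks that the regex <[^>]* cannot be matched, that is, that there is no "<" before the match without a ">" in between. If there is one, the candidate sits inside a tag and is rejected. This relies on variable-length lookbehind, which the .NET regex engine supports.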
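To make the lookbehind's effect concrete, here is a minimal sketch that reports where a term matches once the prefix is applied. The class `LookbehindDemo`, the method `MatchPositions`, and the sample markup (including the class name "note") are made up for illustration; the term is passed through `Regex.Escape` as an extra precaution that is not part of the answer's code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookbehindDemo
{
    // Negative lookbehind: reject a match if, looking backward from its start,
    // there is a '<' with no '>' after it (the match would be inside a tag).
    private const string NotInsideTag = @"(?<!<[^>]*)";

    // Returns the start index of every occurrence of `term` that lies outside a tag.
    public static int[] MatchPositions(string html, string term)
    {
        var regex = new Regex(
            NotInsideTag + "(" + Regex.Escape(term) + ")",
            RegexOptions.IgnoreCase);

        MatchCollection matches = regex.Matches(html);
        var positions = new int[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            positions[i] = matches[i].Index;
        }
        return positions;
    }
}
```

With the input `<span class="note">Fred was a classy individual.</span>`, searching for "class" should return only index 30 (the start of "classy"), "Fred" should return index 19, and "span" should return nothing, because both occurrences are inside tags. A stray "<" in ordinary text would also suppress matches after it, which is the price of keeping the check this simple.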
Modified code below:
const string notInsideBracketsRegex = @"(?<!<[^>]*)";
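const string highlightPattern = @"<span class=""highlight"">$0</span>"; // $0 is the whole match; the wrapper tag and class name are illustrative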
DataBoundLiteralControl litCustomerComments = (DataBoundLiteralControl)e.Row.Cells[CUSTOMERCOMMENTS_COLUMN].Controls[0];
// Turn "term1 term2" into "(term1|term2)"
string spaceDelimited = txtTextFilter.Text.Trim();
string pipeDelimited = string.Join("|", spaceDelimited.Split(new[] {" "}, StringSplitOptions.RemoveEmptyEntries));
string searchPattern = "(" + pipeDelimited + ")";
searchPattern = notInsideBracketsRegex + searchPattern;
// Highlight search terms in Customer - Comments column
e.Row.Cells[CUSTOMERCOMMENTS_COLUMN].Text = Regex.Replace(litCustomerComments.Text, searchPattern, highlightPattern, RegexOptions.IgnoreCase);
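One thing the code above does not handle is search text containing regex metacharacters (for example "c++" or "a(b"), which would change or break the pattern. The following is a sketch of the same approach pulled out of the GridView handler into a standalone helper, with each term passed through `Regex.Escape`. The class `SearchHighlighter`, the method `Highlight`, and the wrapper markup with its "highlight" class are assumptions made for this example, not part of the original code.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SearchHighlighter
{
    // Reject matches that start inside an unclosed '<...' tag.
    private const string NotInsideTag = @"(?<!<[^>]*)";

    // Illustrative wrapper markup; $0 is replaced by the whole match.
    private const string HighlightPattern = "<span class=\"highlight\">$0</span>";

    public static string Highlight(string html, string searchText)
    {
        // Turn "term1 term2" into escaped terms: ["term1", "term2"].
        string[] terms = (searchText ?? string.Empty)
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(Regex.Escape)
            .ToArray();

        // Nothing to search for: return the input unchanged.
        if (terms.Length == 0)
        {
            return html;
        }

        // Build "(?<!<[^>]*)(term1|term2)" and wrap every match found outside a tag.
        string pattern = NotInsideTag + "(" + string.Join("|", terms) + ")";
        return Regex.Replace(html, pattern, HighlightPattern, RegexOptions.IgnoreCase);
    }
}
```

In the handler, the last line would then become something like `e.Row.Cells[CUSTOMERCOMMENTS_COLUMN].Text = SearchHighlighter.Highlight(litCustomerComments.Text, txtTextFilter.Text);`. Because `$0` reinserts the matched text itself, the original casing of the cell content is preserved even with `RegexOptions.IgnoreCase`.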