I am trying to retrieve a specific image from an HTML document using Html Agility Pack and this XPath:
//div[@id='topslot']/a/img/@src
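As far as I can tell, it finds the src attribute, but it returns the img tag. Why is that?
I would expect InnerHtml/InnerText or something similar to be set, but both are empty strings. OuterHtml is set to the complete img tag.
Is there any documentation for Html Agility Pack?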
You can get the attribute directly if you use HtmlNodeNavigator.
    // Load the document from an HTML string
    HtmlDocument hdoc = new HtmlDocument();
    hdoc.LoadHtml(htmlContent);

    // Create a navigator for the current document
    HtmlNodeNavigator navigator = (HtmlNodeNavigator)hdoc.CreateNavigator();

    // Get the value at the given XPath
    string xpath = "//div[@id='topslot']/a/img/@src";
    string val = navigator.SelectSingleNode(xpath).Value;
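Html Agility Pack does not support attribute selection: when an XPath passed to SelectNodes or SelectSingleNode ends in an attribute step such as @src, you get back the element that owns the attribute.
You can use the method "GetAttributeValue" on that element instead.
Example: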
    // [...] code before this point needs to load an HTML document
    HtmlAgilityPack.HtmlDocument htmldoc = e.Document;

    // Get all "a" nodes matching the XPath expression
    HtmlNodeCollection AllNodes = htmldoc.DocumentNode.SelectNodes("*[@class='item']/p/a");

    // SelectNodes returns null when nothing matches, so check before iterating
    if (AllNodes != null)
    {
        // Show a message box with the "href" attribute of each node found
        foreach (var MensaNode in AllNodes)
        {
            string url = MensaNode.GetAttributeValue("href", "not found");
            MessageBox.Show(url);
        }
    }
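Applied to the XPath from the question, the same idea might look like the sketch below: drop the trailing /@src, select the img element, then read the attribute from it. The sample markup and the image path are made up for illustration, and the HtmlAgilityPack package is assumed to be referenced.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical sample markup, invented for this sketch
        string html =
            "<div id='topslot'><a href='/home'>" +
            "<img src='/images/banner.png' alt='banner'></a></div>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Select the img element itself rather than its @src attribute
        HtmlNode img = doc.DocumentNode.SelectSingleNode("//div[@id='topslot']/a/img");

        // SelectSingleNode returns null when nothing matches
        string src = img == null ? null : img.GetAttributeValue("src", "");

        Console.WriteLine(src ?? "image not found");
    }
}
```

The second argument to GetAttributeValue is the default returned when the attribute is missing, so an empty string here distinguishes "img found but no src" from "no img at all".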