
How do I get the web page content from a WebView?




1> jluckyiv..:

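I know this is a late answer, but I found this question because I had the same problem. I think I found the answer in this post on lexandera.com. The code below is basically a cut and paste from that site. It seems to do the trick.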

final Context myApp = this;

/* An instance of this class will be registered as a JavaScript interface */
class MyJavaScriptInterface
{
    @JavascriptInterface
    @SuppressWarnings("unused")
    public void processHTML(String html)
    {
        // process the html as needed by the app
    }
}

final WebView browser = (WebView)findViewById(R.id.browser);
/* JavaScript must be enabled if you want it to work, obviously */
browser.getSettings().setJavaScriptEnabled(true);

/* Register a new JavaScript interface called HTMLOUT */
browser.addJavascriptInterface(new MyJavaScriptInterface(), "HTMLOUT");

/* WebViewClient must be set BEFORE calling loadUrl! */
browser.setWebViewClient(new WebViewClient() {
    @Override
    public void onPageFinished(WebView view, String url)
    {
        /* This call injects JavaScript into the page that has just finished loading. */
        browser.loadUrl("javascript:window.HTMLOUT.processHTML(document.getElementsByTagName('html')[0].innerHTML);");
    }
});

/* load a web page */
browser.loadUrl("http://lexandera.com/files/jsexamples/gethtml.html");


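Note that this may not be the page's original HTML; the page content may have been changed dynamically by JavaScript before `onPageFinished()` runs.
This is great, but calling `browser.loadUrl` inside `onPageFinished` causes `onPageFinished` to be called again. You may want to check whether this is the first call to `onPageFinished` before calling `browser.loadUrl`.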
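
The repeated-callback concern raised above could be handled with a small guard that remembers which URLs have already been injected. This is only a sketch of one possible approach: `InjectionGuard`, `shouldInject` and `reset` are hypothetical names, not part of the Android API, and the check for `javascript:` URLs assumes that the injected call itself may be reported back to `onPageFinished`.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: decides whether onPageFinished should inject the script for a URL. */
final class InjectionGuard {
    private final Set<String> injected = new HashSet<>();

    /** Returns true only the first time a given non-javascript URL is seen. */
    synchronized boolean shouldInject(String url) {
        if (url == null || url.startsWith("javascript:")) {
            return false; // never inject in response to our own injected call
        }
        return injected.add(url); // Set.add returns false if the URL was already recorded
    }

    /** Forgets all recorded URLs, for example before an intentional reload. */
    synchronized void reset() {
        injected.clear();
    }
}
```

Inside `onPageFinished(WebView view, String url)`, the `browser.loadUrl("javascript:...")` call would then be wrapped in `if (guard.shouldInject(url)) { ... }`.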

2> durka42..:

According to issue 12987, Blundell's answer crashes (at least on my 2.3 VM). Instead, I intercept calls to console.log using a special prefix:

// intercept calls to console.log
web.setWebChromeClient(new WebChromeClient() {
    @Override
    public boolean onConsoleMessage(ConsoleMessage cmsg)
    {
        // check secret prefix
        if (cmsg.message().startsWith("MAGIC"))
        {
            String msg = cmsg.message().substring(5); // strip off prefix

            /* process HTML */

            return true;
        }

        return false;
    }
});

// inject the JavaScript on page load
web.setWebViewClient(new WebViewClient() {
    @Override
    public void onPageFinished(WebView view, String address)
    {
        // have the page spill its guts, with a secret prefix
        view.loadUrl("javascript:console.log('MAGIC'+document.getElementsByTagName('html')[0].innerHTML);");
    }
});

web.loadUrl("http://www.google.com");
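
The prefix check and the hard-coded `substring(5)` in the snippet above could be pulled into one small helper, so that the offset always follows the prefix length. The following is a minimal sketch; `PrefixedMessage` and `extract` are hypothetical names introduced here for illustration.

```java
/** Hypothetical helper for the prefix trick: pulls the payload out of a console message. */
final class PrefixedMessage {
    private PrefixedMessage() {}

    /**
     * Returns the text after the prefix, or null when the message does not
     * start with the prefix (that is, it is an ordinary console.log call).
     */
    static String extract(String message, String prefix) {
        if (message == null || prefix == null || !message.startsWith(prefix)) {
            return null;
        }
        // Use the prefix length instead of a hard-coded offset such as 5.
        return message.substring(prefix.length());
    }
}
```

In `onConsoleMessage`, one might call `PrefixedMessage.extract(cmsg.message(), "MAGIC")` and return `true` only when the result is not null.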



3> nagoya0..:

This is based on jluckyiv's answer, but I think it is better and simpler to change the JavaScript as follows.

browser.loadUrl("javascript:HTMLOUT.processHTML(document.documentElement.outerHTML);");
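
If the interface or method name ever changes, the `javascript:` URL above could be built from those names rather than typed by hand. The sketch below assumes both names are plain JavaScript identifiers and rejects anything else; `JsBridgeUrl` and `outerHtmlCall` are hypothetical names, not Android APIs.

```java
import java.util.regex.Pattern;

/** Hypothetical builder for the javascript: URL that hands the page's outerHTML to the bridge. */
final class JsBridgeUrl {
    // A conservative pattern for plain ASCII JavaScript identifiers.
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z_$][A-Za-z0-9_$]*");

    private JsBridgeUrl() {}

    /** Builds e.g. "javascript:HTMLOUT.processHTML(document.documentElement.outerHTML);". */
    static String outerHtmlCall(String interfaceName, String methodName) {
        requireIdentifier(interfaceName);
        requireIdentifier(methodName);
        return "javascript:" + interfaceName + "." + methodName
                + "(document.documentElement.outerHTML);";
    }

    private static void requireIdentifier(String name) {
        if (name == null || !IDENTIFIER.matcher(name).matches()) {
            throw new IllegalArgumentException("not a plain JavaScript identifier: " + name);
        }
    }
}
```

With the names used in this thread, `browser.loadUrl(JsBridgeUrl.outerHtmlCall("HTMLOUT", "processHTML"))` would load the same URL as the line above.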



4> larham1..:

Have you considered fetching the HTML separately and then loading it into the WebView?

String fetchContent(WebView view, String url) throws IOException {
    HttpClient httpClient = new DefaultHttpClient();
    HttpGet get = new HttpGet(url);
    HttpResponse response = httpClient.execute(get);
    StatusLine statusLine = response.getStatusLine();
    int statusCode = statusLine.getStatusCode();
    HttpEntity entity = response.getEntity();
    String html = EntityUtils.toString(entity); // assume html for simplicity
    view.loadDataWithBaseURL(url, html, "text/html", "utf-8", url); // todo: get mime, charset from entity
    if (statusCode != 200) {
        // handle fail
    }
    return html;
}
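
The Apache `HttpClient` classes used above were removed from the Android SDK in Android 6.0, so the same idea might be sketched with `java.net.HttpURLConnection` instead. This is a swapped-in technique, not the answer's original code. `HtmlFetcher`, `charsetFrom`, `readAll` and `fetch` are hypothetical names; the charset parsing is a simplified take on the "todo" in the snippet, and on Android the `fetch` call would have to run off the main thread.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: fetches a page as a String using HttpURLConnection. */
final class HtmlFetcher {
    private HtmlFetcher() {}

    /** Extracts the charset from a Content-Type header value; falls back to UTF-8. */
    static Charset charsetFrom(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                    String name = p.substring(8).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        break; // unknown or malformed charset name: use the fallback
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    /** Reads the whole stream and decodes it with the given charset. */
    static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return new String(out.toByteArray(), charset);
    }

    /** Fetches the URL and returns its body; throws if the status is not 200. */
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try {
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("unexpected HTTP status " + status);
            }
            try (InputStream in = conn.getInputStream()) {
                return readAll(in, charsetFrom(conn.getContentType()));
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

The returned string could then be passed to `view.loadDataWithBaseURL(url, html, "text/html", "utf-8", url)` on the UI thread, as in the answer above.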
