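I am writing a specialized crawler and parser for internal use, and I need to be able to take a screenshot of a web page in order to check which colors are used throughout. The program will take in around ten web addresses and save them as bitmap images.
From there I plan to use LockBits to build a list of the five most used colors in the image. To my knowledge, this is the easiest way to get the colors used within a web page, but if there is an easier way, please chime in with your suggestions.
Anyway, I was going to use the ACA WebThumb ActiveX Control until I saw the price tag. I am also fairly new to C#, having only used it for a few months. Is there a solution to my problem of taking a screenshot of a web page to extract the color scheme?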
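A quick and dirty way is to use the WinForms WebBrowser control and draw it to a bitmap. Doing this in a standalone console application is slightly tricky, because you have to understand the implications of hosting an STAThread control while using a fundamentally asynchronous programming pattern. But here is a working proof of concept that captures a web page to an 800x600 BMP file: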
    namespace WebBrowserScreenshotSample
    {
        using System;
        using System.Drawing;
        using System.Drawing.Imaging;
        using System.Threading;
        using System.Windows.Forms;

        class Program
        {
            [STAThread]
            static void Main()
            {
                int width = 800;
                int height = 600;

                using (WebBrowser browser = new WebBrowser())
                {
                    browser.Width = width;
                    browser.Height = height;
                    browser.ScrollBarsEnabled = true;

                    // This will be called when the page finishes loading
                    browser.DocumentCompleted += Program.OnDocumentCompleted;

                    browser.Navigate("https://stackoverflow.com/");

                    // This prevents the application from exiting until
                    // Application.Exit is called
                    Application.Run();
                }
            }

            static void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
            {
                // Now that the page is loaded, save it to a bitmap
                WebBrowser browser = (WebBrowser)sender;

                using (Graphics graphics = browser.CreateGraphics())
                using (Bitmap bitmap = new Bitmap(browser.Width, browser.Height, graphics))
                {
                    Rectangle bounds = new Rectangle(0, 0, bitmap.Width, bitmap.Height);
                    browser.DrawToBitmap(bitmap, bounds);
                    bitmap.Save("screenshot.bmp", ImageFormat.Bmp);
                }

                // Instruct the application to exit
                Application.Exit();
            }
        }
    }
To compile this, create a new console application and make sure to add assembly references for System.Drawing and System.Windows.Forms.
Update: I rewrote the code to avoid the hacky polling WaitOne/DoEvents pattern. This code should be closer to following best practices.
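Update 2: You indicated that you want to use this in a Windows Forms application. In that case, forget about dynamically creating the WebBrowser control. What you want is to create a hidden (Visible = false) instance of a WebBrowser on your form and use it the same way I show above. Here is another sample that shows the user code portion of a form with a text box (webAddressTextBox), a button (generateScreenshotButton), and a hidden browser (webBrowser). While I was working on this, I discovered a peculiarity I had not handled before: the DocumentCompleted event can actually be raised multiple times, depending on the nature of the page. This sample should work in general, and you can extend it to do whatever you want: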
    namespace WebBrowserScreenshotFormsSample
    {
        using System;
        using System.Drawing;
        using System.Drawing.Imaging;
        using System.IO;
        using System.Windows.Forms;

        public partial class MainForm : Form
        {
            public MainForm()
            {
                this.InitializeComponent();

                // Register for this event; we'll save the screenshot when it fires
                this.webBrowser.DocumentCompleted +=
                    new WebBrowserDocumentCompletedEventHandler(this.OnDocumentCompleted);
            }

            private void OnClickGenerateScreenshot(object sender, EventArgs e)
            {
                // Disable button to prevent multiple concurrent operations
                this.generateScreenshotButton.Enabled = false;

                string webAddressString = this.webAddressTextBox.Text;

                Uri webAddress;
                if (Uri.TryCreate(webAddressString, UriKind.Absolute, out webAddress))
                {
                    this.webBrowser.Navigate(webAddress);
                }
                else
                {
                    MessageBox.Show(
                        "Please enter a valid URI.",
                        "WebBrowser Screenshot Forms Sample",
                        MessageBoxButtons.OK,
                        MessageBoxIcon.Exclamation);

                    // Re-enable button on error before returning
                    this.generateScreenshotButton.Enabled = true;
                }
            }

            private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
            {
                // This event can be raised multiple times depending on how much of the
                // document has loaded, if there are multiple frames, etc.
                // We only want the final page result, so we do the following check:
                if (this.webBrowser.ReadyState == WebBrowserReadyState.Complete &&
                    e.Url == this.webBrowser.Url)
                {
                    // Generate the file name here
                    string screenshotFileName = Path.GetFullPath(
                        "screenshot_" + DateTime.Now.Ticks + ".png");

                    this.SaveScreenshot(screenshotFileName);
                    MessageBox.Show(
                        "Screenshot saved to '" + screenshotFileName + "'.",
                        "WebBrowser Screenshot Forms Sample",
                        MessageBoxButtons.OK,
                        MessageBoxIcon.Information);

                    // Re-enable button before returning
                    this.generateScreenshotButton.Enabled = true;
                }
            }

            private void SaveScreenshot(string fileName)
            {
                int width = this.webBrowser.Width;
                int height = this.webBrowser.Height;

                using (Graphics graphics = this.webBrowser.CreateGraphics())
                using (Bitmap bitmap = new Bitmap(width, height, graphics))
                {
                    Rectangle bounds = new Rectangle(0, 0, width, height);
                    this.webBrowser.DrawToBitmap(bitmap, bounds);
                    bitmap.Save(fileName, ImageFormat.Png);
                }
            }
        }
    }
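For the color-extraction step the question describes, here is a minimal sketch of one way it might be done with LockBits once a screenshot exists. The class and method names (DominantColors, TallyTopColors, FromBitmap) are made up for illustration and are not part of any library. It assumes that exact-match counting of RGB values is good enough (anti-aliased text and gradients will spread across many near-identical colors) and that alpha can be ignored.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Drawing.Imaging;
using System.Linq;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of any library.
static class DominantColors
{
    // Counts colors in raw 32bpp BGRA pixel data and returns the most
    // frequent ones, packed as 0xRRGGBB (alpha is ignored).
    public static List<int> TallyTopColors(byte[] pixels, int stride, int width, int height, int count)
    {
        var histogram = new Dictionary<int, int>();
        for (int y = 0; y < height; y++)
        {
            int rowStart = y * stride;
            for (int x = 0; x < width; x++)
            {
                int i = rowStart + x * 4;
                // Byte order in memory is blue, green, red, alpha
                int rgb = (pixels[i + 2] << 16) | (pixels[i + 1] << 8) | pixels[i];
                int seen;
                histogram.TryGetValue(rgb, out seen);
                histogram[rgb] = seen + 1;
            }
        }

        return histogram
            .OrderByDescending(pair => pair.Value)
            .Take(count)
            .Select(pair => pair.Key)
            .ToList();
    }

    // Copies the bitmap's pixels out with LockBits, then tallies them.
    public static List<Color> FromBitmap(Bitmap bitmap, int count)
    {
        var bounds = new Rectangle(0, 0, bitmap.Width, bitmap.Height);
        BitmapData data = bitmap.LockBits(bounds, ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb);
        try
        {
            int rowBytes = data.Width * 4;
            var pixels = new byte[rowBytes * data.Height];

            // Copy row by row so that padding and a negative stride are both handled
            for (int y = 0; y < data.Height; y++)
            {
                Marshal.Copy(data.Scan0 + y * data.Stride, pixels, y * rowBytes, rowBytes);
            }

            return TallyTopColors(pixels, rowBytes, data.Width, data.Height, count)
                .Select(rgb => Color.FromArgb((rgb >> 16) & 0xFF, (rgb >> 8) & 0xFF, rgb & 0xFF))
                .ToList();
        }
        finally
        {
            bitmap.UnlockBits(data);
        }
    }
}
```

With a saved screenshot loaded into a Bitmap, calling DominantColors.FromBitmap(bitmap, 5) would give the five most frequent colors. If near-duplicates dominate the list, rounding each channel to a coarser step before counting is one possible refinement.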
https://www.url2png.com/docs is a good one. They have a free tier.
You will need to use HttpWebRequest to download the binary of the image. Here is an example:
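    // Requires: using System.Drawing; using System.IO; using System.Net;
    HttpWebRequest request = HttpWebRequest.Create("https://[url]") as HttpWebRequest;
    Bitmap bitmap;
    using (WebResponse response = request.GetResponse())
    using (Stream stream = response.GetResponseStream())
    using (Bitmap streamed = new Bitmap(stream))
    {
        // A Bitmap built from a stream needs that stream to stay open,
        // so copy it before the stream is disposed
        bitmap = new Bitmap(streamed);
    }

    // now that you have a bitmap, you can do what you need to do...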
To generate the URL, see the URL2PNG documentation linked above.
This question is old, but you can also use the NuGet package Freezer. It is free, uses a recent Gecko web browser (with HTML5 and CSS3 support), and ships as a single DLL.
    var screenshotJob = ScreenshotJobBuilder.Create("https://google.com")
        .SetBrowserSize(1366, 768)
        .SetCaptureZone(CaptureZone.FullPage)
        .SetTrigger(new WindowLoadTrigger());

    System.Drawing.Image screenshot = screenshotJob.Freeze();
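There is a great WebKit-based browser, PhantomJS, which can execute any JavaScript from the command line.
Install it from http://phantomjs.org/download.html, and run the following sample script from the command line: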
./phantomjs ../examples/rasterize.js http://www.panoramio.com/photo/76188108 test.jpg
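It will create a screenshot of the given page in a JPEG file. The upside of this approach is that you do not rely on any external provider and can easily automate screenshot taking in large quantities.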
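To drive that command from C# for a handful of URLs, something like the following sketch could work. The class and method names (PhantomCapture, BuildArguments, Capture) are invented for illustration. The paths to the phantomjs executable and to rasterize.js are assumptions you would adjust to your installation. Arguments are simply wrapped in double quotes, which assumes none of them contains a double quote itself.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of PhantomJS or any library.
static class PhantomCapture
{
    // Builds the argument string "<script>" "<url>" "<output file>",
    // matching the rasterize.js command line shown above.
    public static string BuildArguments(string scriptPath, string url, string outputFile)
    {
        return string.Format("\"{0}\" \"{1}\" \"{2}\"", scriptPath, url, outputFile);
    }

    // Runs PhantomJS once for a single URL and returns the process exit code.
    public static int Capture(string phantomPath, string scriptPath, string url, string outputFile)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = phantomPath,
            Arguments = BuildArguments(scriptPath, url, outputFile),
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

Looping over the ten or so addresses and calling Capture with a different output file name for each would produce one image per page. Whether a non-zero exit code reliably signals a failed capture depends on the script, so checking that the output file exists afterwards is a reasonable extra guard.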