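There is an online file (for example http://www.example.com/information.asp) that I need to grab and save to a directory. I know there are several ways to fetch and read an online file (URL) line by line, but is there a way to simply download and save the file using Java?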
Give Java NIO a try:

    URL website = new URL("http://www.website.com/information.asp");
    ReadableByteChannel rbc = Channels.newChannel(website.openStream());
    FileOutputStream fos = new FileOutputStream("information.html");
    fos.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
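Using transferFrom() is potentially much more efficient than a simple loop that reads from the source channel and writes to this channel. Many operating systems can transfer bytes directly from the source channel into the filesystem cache without actually copying them.
See the transferFrom() documentation for more on this.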
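Note: the third parameter of transferFrom is the maximum number of bytes to transfer. Integer.MAX_VALUE transfers at most 2^31 - 1 bytes (just under 2 GiB), while Long.MAX_VALUE allows up to 2^63 - 1 bytes (larger than any file in existence).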
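One caveat worth sketching: the snippet above never closes its stream or channels, and the transferFrom() contract allows a single call to move fewer bytes than requested. A minimal variant that closes everything and repeats the call could look like the following. The class name ChannelDownload, the method name and the 16 MiB chunk size are hypothetical choices for illustration, and the loop assumes a blocking source (such as the stream from openStream()), where a call that moves zero bytes means end of stream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of the original answer.
final class ChannelDownload {

    // Upper bound for a single transferFrom call (arbitrary choice: 16 MiB).
    private static final long CHUNK = 1L << 24;

    // Copies everything readable from source into target and returns the byte count.
    static long download(URL source, Path target) throws IOException {
        try (InputStream in = source.openStream();
             ReadableByteChannel rbc = Channels.newChannel(in);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;
            long transferred;
            // Assumption: with a blocking source, a call that moves 0 bytes means end of stream.
            while ((transferred = out.transferFrom(rbc, position, CHUNK)) > 0) {
                position += transferred;
            }
            return position;
        }
    }
}
```

Calling ChannelDownload.download(new URL("http://www.website.com/information.asp"), Paths.get("information.html")) would then mirror the original snippet while releasing the stream and both channels even when an exception is thrown.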
Use Apache commons-io, just one line of code:

    FileUtils.copyURLToFile(URL, File)
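Simpler NIO usage:

    URL website = new URL("http://www.website.com/information.asp");
    // Destination path; the file name here is only an example.
    Path target = Paths.get("information.html");
    try (InputStream in = website.openStream()) {
        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }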
    public void saveUrl(final String filename, final String urlString)
            throws MalformedURLException, IOException {
        BufferedInputStream in = null;
        FileOutputStream fout = null;
        try {
            in = new BufferedInputStream(new URL(urlString).openStream());
            fout = new FileOutputStream(filename);

            // Copy the stream to the file in 1 KB chunks until end of stream.
            final byte[] data = new byte[1024];
            int count;
            while ((count = in.read(data, 0, 1024)) != -1) {
                fout.write(data, 0, count);
            }
        } finally {
            if (in != null) {
                in.close();
            }
            if (fout != null) {
                fout.close();
            }
        }
    }

You will need to handle the exceptions, probably outside this method.
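Downloading a file requires you to read it; either way, you will have to go through the file somehow. Instead of reading line by line, you can read it from the stream byte by byte:

    BufferedInputStream in = new BufferedInputStream(
            new URL("http://www.website.com/information.asp").openStream());
    byte[] data = new byte[1024];
    int count;
    // "out" is an OutputStream that you have already opened on the destination file.
    while ((count = in.read(data, 0, 1024)) != -1) {
        out.write(data, 0, count);
    }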
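On Java 9 or later, the same byte-copy loop is available as InputStream.transferTo(), so the buffer handling does not have to be written by hand. The sketch below is one way to wrap it; the class and method names are hypothetical, and it assumes the destination should be created or overwritten.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the original answer. Requires Java 9+.
final class StreamDownload {

    // Copies all bytes from the URL to target and returns how many were written.
    static long download(URL source, Path target) throws IOException {
        try (InputStream in = source.openStream();
             // With no options, newOutputStream creates the file or truncates an existing one.
             OutputStream out = Files.newOutputStream(target)) {
            // transferTo reads until end of stream, writing each chunk to out.
            return in.transferTo(out);
        }
    }
}
```

Both streams are closed by the try-with-resources statement, including when the copy fails partway through.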
This is an old question, but here is an elegant JDK-only solution:

    public static void download(String url, String fileName) throws Exception {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            Files.copy(in, Paths.get(fileName));
        }
    }

Concise, readable, properly closed resources, and it uses nothing but the core JDK and language features.
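When using Java 7+, use the following method to download a file from the Internet and save it to some directory:

    private static Path download(String sourceURL, String targetDirectory) throws IOException {
        URL url = new URL(sourceURL);
        // Use the text after the last slash as the file name.
        String fileName = sourceURL.substring(sourceURL.lastIndexOf('/') + 1);
        Path targetPath = new File(targetDirectory + File.separator + fileName).toPath();
        // Files.copy does not close the input stream, so close it here.
        try (InputStream in = url.openStream()) {
            Files.copy(in, targetPath, StandardCopyOption.REPLACE_EXISTING);
        }
        return targetPath;
    }

The documentation for Files.copy covers the details.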
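Taking everything after the last slash works for simple URLs, but it keeps any query string or fragment and yields an empty name for URLs that end in a slash. A possible refinement is to parse the URL first and look only at its path. The helper below is a hypothetical sketch; the fallback parameter is an assumption about what a caller would want when no usable name can be derived.

```java
import java.net.URI;

// Hypothetical helper, not part of the original answer.
final class FileNames {

    // Returns the last path segment of sourceUrl, or fallback if there is none.
    static String fromUrl(String sourceUrl, String fallback) {
        // getPath() excludes the query string and fragment; it is null for opaque URIs.
        String path = URI.create(sourceUrl).getPath();
        if (path == null) {
            return fallback;
        }
        String name = path.substring(path.lastIndexOf('/') + 1);
        // Reject empty names and the special directory entries.
        if (name.isEmpty() || name.equals(".") || name.equals("..")) {
            return fallback;
        }
        return name;
    }
}
```

For example, FileNames.fromUrl("http://www.example.com/information.asp?id=3", "download") yields "information.asp", while a URL ending in "/" falls back to "download".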
This answer is almost exactly like the selected answer, but with two enhancements: it is a method, and it closes the FileOutputStream object:

    public static void downloadFileFromURL(String urlString, File destination) {
        try {
            URL website = new URL(urlString);
            ReadableByteChannel rbc;
            rbc = Channels.newChannel(website.openStream());
            FileOutputStream fos = new FileOutputStream(destination);
            fos.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
            fos.close();
            rbc.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    import java.io.*;
    import java.net.*;

    public class filedown {

        // Downloads address to localFileName and prints the file name and byte count.
        public static void download(String address, String localFileName) {
            OutputStream out = null;
            URLConnection conn = null;
            InputStream in = null;
            try {
                URL url = new URL(address);
                out = new BufferedOutputStream(new FileOutputStream(localFileName));
                conn = url.openConnection();
                in = conn.getInputStream();
                byte[] buffer = new byte[1024];
                int numRead;
                long numWritten = 0;
                while ((numRead = in.read(buffer)) != -1) {
                    out.write(buffer, 0, numRead);
                    numWritten += numRead;
                }
                System.out.println(localFileName + "\t" + numWritten);
            } catch (Exception exception) {
                exception.printStackTrace();
            } finally {
                try {
                    if (in != null) {
                        in.close();
                    }
                    if (out != null) {
                        out.close();
                    }
                } catch (IOException ioe) {
                    // Nothing more can be done if closing fails.
                }
            }
        }

        public static void download(String address) {
            int lastSlashIndex = address.lastIndexOf('/');
            if (lastSlashIndex >= 0 && lastSlashIndex < address.length() - 1) {
                // Use the text after the last slash as the local file name.
                download(address, address.substring(lastSlashIndex + 1));
            } else {
                System.err.println("Could not figure out local file name for " + address);
            }
        }

        public static void main(String[] args) {
            for (int i = 0; i < args.length; i++) {
                download(args[i]);
            }
        }
    }
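Since this version already goes through URLConnection, it is also the natural place to set timeouts. By default the connect and read timeouts are 0, which means wait indefinitely, so a stalled server can hang the download. A minimal sketch follows, with a hypothetical class name and with the timeout values left to the caller:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the original answer.
final class TimedDownload {

    // Downloads source to target, giving up if connecting or reading stalls.
    static long download(URL source, Path target, int connectMillis, int readMillis)
            throws IOException {
        URLConnection conn = source.openConnection();
        conn.setConnectTimeout(connectMillis); // maximum wait to establish the connection
        conn.setReadTimeout(readMillis);       // maximum wait for each blocking read
        try (InputStream in = conn.getInputStream()) {
            // Returns the number of bytes copied; replaces the target if it exists.
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

A call such as TimedDownload.download(url, Paths.get("information.html"), 10_000, 60_000) would throw java.net.SocketTimeoutException (an IOException) if either limit is exceeded on an HTTP connection.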
Personally, I have found Apache's HttpClient to be more than capable of everything I have needed to do in this area, and there is a great tutorial on using it.
Here is another Java 7 variant, based on Brian Risk's answer, that uses a try-with-resources statement:

    public static void downloadFileFromURL(String urlString, File destination) throws IOException {
        URL website = new URL(urlString);
        // Both resources are closed automatically, even if the transfer fails.
        try (ReadableByteChannel rbc = Channels.newChannel(website.openStream());
             FileOutputStream fos = new FileOutputStream(destination)) {
            fos.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
        }
    }