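This is a question about Unix shell scripting (any shell), but a solution in any other "standard" scripting language would also be appreciated:
I have a directory full of files whose names are hash values, like this:
    fd73d0cf8ee68073dce270cf7e770b97 fec8047a9186fdcc98fdbfc0ea6075ee
These files have different original file types, such as png, zip, doc, pdf, etc.
Can anybody provide a script that renames the files so that they get the appropriate file extension, perhaps based on the output of the file command?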
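JF Sebastian's script works both for printing the new filenames and for the actual renaming.
You can use
    file -i filename
to get the MIME type. You can then look the type up in a list and append the corresponding extension. Lists of MIME types with example file extensions can be found on the web.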
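As a rough sketch of the lookup step: one common plain-text layout for such lists, used by Apache's mime.types file, is a MIME type followed by its extensions, with `#` starting a comment. Assuming that layout, a table could be built in Python 3 roughly like this. The function name `parse_mime_types` and the sample entries are made up for illustration.

```python
def parse_mime_types(text):
    """Build {mime_type: first_extension} from mime.types-style text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if not line:
            continue
        mime, *exts = line.split()
        if exts:  # skip types that list no extension at all
            table.setdefault(mime.lower(), exts[0])
    return table


# Illustrative sample only, not a complete or authoritative list.
SAMPLE = """\
# type            extensions
image/png         png
image/jpeg        jpg jpeg jpe
application/zip   zip
application/x-no-extension
"""

table = parse_mime_types(SAMPLE)
print(table.get("image/jpeg", "undefined"))
```

Only the first extension on each line is kept, which mirrors the "take only the first extension" choice made in the scripts in this thread.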
Here is a version that uses mimetypes:
    #!/usr/bin/env python
    """It is a `filename -> filename.ext` filter.

       `ext` is mime-based.

    """
    import fileinput
    import mimetypes
    import os
    import sys
    from subprocess import Popen, PIPE

    if len(sys.argv) > 1 and sys.argv[1] == '--rename':
        do_rename = True
        del sys.argv[1]
    else:
        do_rename = False

    for filename in (line.rstrip() for line in fileinput.input()):
        output, _ = Popen(['file', '-bi', filename], stdout=PIPE).communicate()
        mime = output.split(';', 1)[0].lower().strip()
        ext = mimetypes.guess_extension(mime, strict=False)
        if ext is None:
            ext = os.path.extsep + 'undefined'
        filename_ext = filename + ext
        print filename_ext
        if do_rename:
            os.rename(filename, filename_ext)
Example:
    $ ls *.file? | python add-ext.py --rename
    avi.file.avi
    djvu.file.undefined
    doc.file.dot
    gif.file.gif
    html.file.html
    ico.file.obj
    jpg.file.jpe
    m3u.file.ksh
    mp3.file.mp3
    mpg.file.m1v
    pdf.file.pdf
    pdf.file2.pdf
    pdf.file3.pdf
    png.file.png
    tar.bz2.file.undefined
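The example output contains some less familiar extensions, such as .jpe for a JPEG file. If that matters, one possible tweak (a Python 3 sketch, not part of the script above) is to consult a small preference table before falling back to mimetypes. The `PREFERRED` table and the `preferred_extension` name are assumptions made for this illustration.

```python
import mimetypes

# Hypothetical preference table: for these MIME types, always use this extension.
PREFERRED = {
    "image/jpeg": ".jpg",
    "text/plain": ".txt",
    "video/mpeg": ".mpg",
}


def preferred_extension(mime, default=".undefined"):
    """Return an extension (with leading dot) for a MIME type."""
    if mime in PREFERRED:
        return PREFERRED[mime]
    # Fall back to whatever the mimetypes module suggests, if anything.
    ext = mimetypes.guess_extension(mime, strict=False)
    return ext if ext is not None else default


print(preferred_extension("image/jpeg"))
print(preferred_extension("x-nonexistent/type"))
```

The table only needs entries for the types whose default guess is unwanted; everything else still goes through mimetypes.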
Following @Phil H's answer, which in turn follows @csl's answer:
    #!/usr/bin/env python
    """It is a `filename -> filename.ext` filter.

       `ext` is mime-based.
    """
    # Mapping of mime-types to extensions is taken from here:
    # http://as3corelib.googlecode.com/svn/trunk/src/com/adobe/net/MimeTypeMap.as
    mime2exts_list = [
        ["application/andrew-inset","ez"],
        ["application/atom+xml","atom"],
        ["application/mac-binhex40","hqx"],
        ["application/mac-compactpro","cpt"],
        ["application/mathml+xml","mathml"],
        ["application/msword","doc"],
        ["application/octet-stream","bin","dms","lha","lzh","exe","class","so","dll","dmg"],
        ["application/oda","oda"],
        ["application/ogg","ogg"],
        ["application/pdf","pdf"],
        ["application/postscript","ai","eps","ps"],
        ["application/rdf+xml","rdf"],
        ["application/smil","smi","smil"],
        ["application/srgs","gram"],
        ["application/srgs+xml","grxml"],
        ["application/vnd.adobe.apollo-application-installer-package+zip","air"],
        ["application/vnd.mif","mif"],
        ["application/vnd.mozilla.xul+xml","xul"],
        ["application/vnd.ms-excel","xls"],
        ["application/vnd.ms-powerpoint","ppt"],
        ["application/vnd.rn-realmedia","rm"],
        ["application/vnd.wap.wbxml","wbxml"],
        ["application/vnd.wap.wmlc","wmlc"],
        ["application/vnd.wap.wmlscriptc","wmlsc"],
        ["application/voicexml+xml","vxml"],
        ["application/x-bcpio","bcpio"],
        ["application/x-cdlink","vcd"],
        ["application/x-chess-pgn","pgn"],
        ["application/x-cpio","cpio"],
        ["application/x-csh","csh"],
        ["application/x-director","dcr","dir","dxr"],
        ["application/x-dvi","dvi"],
        ["application/x-futuresplash","spl"],
        ["application/x-gtar","gtar"],
        ["application/x-hdf","hdf"],
        ["application/x-javascript","js"],
        ["application/x-koan","skp","skd","skt","skm"],
        ["application/x-latex","latex"],
        ["application/x-netcdf","nc","cdf"],
        ["application/x-sh","sh"],
        ["application/x-shar","shar"],
        ["application/x-shockwave-flash","swf"],
        ["application/x-stuffit","sit"],
        ["application/x-sv4cpio","sv4cpio"],
        ["application/x-sv4crc","sv4crc"],
        ["application/x-tar","tar"],
        ["application/x-tcl","tcl"],
        ["application/x-tex","tex"],
        ["application/x-texinfo","texinfo","texi"],
        ["application/x-troff","t","tr","roff"],
        ["application/x-troff-man","man"],
        ["application/x-troff-me","me"],
        ["application/x-troff-ms","ms"],
        ["application/x-ustar","ustar"],
        ["application/x-wais-source","src"],
        ["application/xhtml+xml","xhtml","xht"],
        ["application/xml","xml","xsl"],
        ["application/xml-dtd","dtd"],
        ["application/xslt+xml","xslt"],
        ["application/zip","zip"],
        ["audio/basic","au","snd"],
        ["audio/midi","mid","midi","kar"],
        ["audio/mpeg","mp3","mpga","mp2"],
        ["audio/x-aiff","aif","aiff","aifc"],
        ["audio/x-mpegurl","m3u"],
        ["audio/x-pn-realaudio","ram","ra"],
        ["audio/x-wav","wav"],
        ["chemical/x-pdb","pdb"],
        ["chemical/x-xyz","xyz"],
        ["image/bmp","bmp"],
        ["image/cgm","cgm"],
        ["image/gif","gif"],
        ["image/ief","ief"],
        ["image/jpeg","jpg","jpeg","jpe"],
        ["image/png","png"],
        ["image/svg+xml","svg"],
        ["image/tiff","tiff","tif"],
        ["image/vnd.djvu","djvu","djv"],
        ["image/vnd.wap.wbmp","wbmp"],
        ["image/x-cmu-raster","ras"],
        ["image/x-icon","ico"],
        ["image/x-portable-anymap","pnm"],
        ["image/x-portable-bitmap","pbm"],
        ["image/x-portable-graymap","pgm"],
        ["image/x-portable-pixmap","ppm"],
        ["image/x-rgb","rgb"],
        ["image/x-xbitmap","xbm"],
        ["image/x-xpixmap","xpm"],
        ["image/x-xwindowdump","xwd"],
        ["model/iges","igs","iges"],
        ["model/mesh","msh","mesh","silo"],
        ["model/vrml","wrl","vrml"],
        ["text/calendar","ics","ifb"],
        ["text/css","css"],
        ["text/html","html","htm"],
        ["text/plain","txt","asc"],
        ["text/richtext","rtx"],
        ["text/rtf","rtf"],
        ["text/sgml","sgml","sgm"],
        ["text/tab-separated-values","tsv"],
        ["text/vnd.wap.wml","wml"],
        ["text/vnd.wap.wmlscript","wmls"],
        ["text/x-setext","etx"],
        ["video/mpeg","mpg","mpeg","mpe"],
        ["video/quicktime","mov","qt"],
        ["video/vnd.mpegurl","m4u","mxu"],
        ["video/x-flv","flv"],
        ["video/x-msvideo","avi"],
        ["video/x-sgi-movie","movie"],
        ["x-conference/x-cooltalk","ice"]]

    #NOTE: take only the first extension
    mime2ext = dict(x[:2] for x in mime2exts_list)

    if __name__ == '__main__':
        import fileinput, os.path
        from subprocess import Popen, PIPE

        for filename in (line.rstrip() for line in fileinput.input()):
            output, _ = Popen(['file', '-bi', filename], stdout=PIPE).communicate()
            mime = output.split(';', 1)[0].lower().strip()
            print filename + os.path.extsep + mime2ext.get(mime, 'undefined')
Here is a snippet for old Python versions (untested):
    #NOTE: take only the first extension
    mime2ext = {}
    for x in mime2exts_list:
        mime2ext[x[0]] = x[1]

    if __name__ == '__main__':
        import os
        import sys

        # this version supports only stdin (part of fileinput.input() functionality)
        lines = sys.stdin.read().split('\n')
        for line in lines:
            filename = line.rstrip()
            if not filename:
                continue  # skip the empty string left after the final newline
            output = os.popen('file -bi ' + filename).read()
            mime = output.split(';')[0].lower().strip()
            try:
                ext = mime2ext[mime]
            except KeyError:
                ext = 'undefined'
            print filename + '.' + ext
It should work on Python 2.3.5 (I guess).
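Following csl's answer:
"You can use file -i filename to get the MIME type. You can then look the type up in a list and append the corresponding extension. Lists of MIME types with suggested file extensions can be found on the web."
I'd suggest writing a script that takes the output of file -i filename and returns an extension (split on spaces, find the '/', look that term up in a table file) in your language of choice; a few lines at most. Then you can do something like:
    ls | while read f; do mv "$f" "$f".`file -i "$f" | get_extension.py`; done
in bash, or put that in a bash script. Alternatively, make the get_extension script bigger, but that makes it less useful the next time you want just the extension.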
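The get_extension.py script is only described, not shown. A minimal Python 3 sketch of the steps listed (split on whitespace, find the token containing '/', look it up in a table) might look like the following. The table contents and the names `MIME_TO_EXT`, `get_extension` and `extensions` are placeholders chosen for this example, and a real table would be much longer.

```python
# Hypothetical, deliberately tiny lookup table.
MIME_TO_EXT = {
    "image/png": "png",
    "application/zip": "zip",
    "application/msword": "doc",
    "application/pdf": "pdf",
}


def get_extension(file_output, table=MIME_TO_EXT, default="undefined"):
    """Map one line of `file -i` output to an extension."""
    for token in file_output.split():
        if "/" in token:
            # `file -i` may print "image/png;" followed by a charset field.
            mime = token.rstrip(";").lower()
            if mime in table:
                return table[mime]
            # Otherwise keep looking: the '/' may have come from a path.
    return default


def extensions(lines, table=MIME_TO_EXT):
    """Apply get_extension to every line of an iterable."""
    return [get_extension(line, table) for line in lines]


if __name__ == "__main__":
    sample = [
        "fd73d0cf8ee68073dce270cf7e770b97: image/png; charset=binary",
        "./some/dir/fec8047a9186fdcc98fdbfc0ea6075ee: application/zip; charset=binary",
        "unknown: application/x-something; charset=binary",
    ]
    for ext in extensions(sample):
        print(ext)
```

To use it as a filter in a pipeline like the one above, the sample list could be replaced with `sys.stdin` (after `import sys`), so that each line piped in produces one extension on standard output.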
Edit: changed from for f in * to ls | while read f, because the latter handles filenames with spaces (a particular nightmare on Windows).
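Filenames with spaces can also be sidestepped by doing the loop itself in Python, where names never pass through shell word splitting. The following Python 3 sketch assumes a caller-supplied function that maps a path to an extension. `add_extensions` and `ext_for` are hypothetical names, and skipping existing targets is a design choice of this example rather than something from the answers above.

```python
import os


def add_extensions(directory, ext_for):
    """Rename each regular file in `directory` to `name.ext`.

    `ext_for` is a hypothetical callable: path -> extension without the dot,
    or None/"" to leave the file untouched.
    Returns a list of (old_name, new_name) pairs for the files renamed.
    """
    renamed = []
    for name in sorted(os.listdir(directory)):  # snapshot taken before renaming
        src = os.path.join(directory, name)
        if not os.path.isfile(src):
            continue  # skip subdirectories and other non-regular entries
        ext = ext_for(src)
        if not ext:
            continue
        dst = src + os.path.extsep + ext
        if os.path.exists(dst):
            continue  # never overwrite an existing file
        os.rename(src, dst)
        renamed.append((name, os.path.basename(dst)))
    return renamed
```

A call such as `add_extensions(".", my_lookup)` would then rename everything in the current directory, where `my_lookup` stands for whichever MIME-based lookup is preferred.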