My code is very simple:
import pytesseract
from PIL import Image

img = Image.open('C:/temp/foo.jpg')
img.load()
i = pytesseract.image_to_string(img)
The error I get is:
Traceback (most recent call last):
  File "img.py", line 6, in <module>
    i = pytesseract.image_to_string(img)
  File "build\bdist.win32\egg\pytesseract\pytesseract.py", line 161, in image_to_string
  File "build\bdist.win32\egg\pytesseract\pytesseract.py", line 94, in run_tesseract
  File "C:\Users\%USER%\AppData\Local\Continuum\Anaconda\lib\subprocess.py", line 710, in __init__
    errread, errwrite)
  File "C:\Users\%USER%\AppData\Local\Continuum\Anaconda\lib\subprocess.py", line 958, in _execute_child
    startupinfo)
WindowsError: [Error 2] The system cannot find the file specified
Any guidance would be appreciated.
Update: adding the Tesseract directory to my PATH variable helped:
C:\Program Files (x86)\Tesseract-OCR
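However, the code now crashes when it tries to run pytesseract.
I just ran into the same error and decided to answer this question, since it may save someone some time.
First, make sure the Tesseract-OCR executable is actually installed (or copied) on the machine.
The error means that Windows cannot find the tesseract executable in any of the directories listed in the PATH environment variable. So either make sure the directory containing tesseract is in PATH, or override the tesseract_cmd variable in your Python script as shown here (substitute your own path):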
import pytesseract
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
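If you are not sure which of the two cases applies, a small check like the one below may help. It is only a sketch and assumes Python 3.3 or later (for shutil.which). The function name find_tesseract and the fallback path are hypothetical, so adjust them to your own installation.

```python
import os
import shutil

# Hypothetical fallback location; change it to wherever Tesseract-OCR is installed.
DEFAULT_TESSERACT = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"


def find_tesseract(fallback=DEFAULT_TESSERACT):
    """Return a usable path to the tesseract executable, or None if none is found."""
    # First look through the directories listed in PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise fall back to an explicit location, but only if the file exists.
    if os.path.isfile(fallback):
        return fallback
    return None
```

If it returns a path, you can assign that path to pytesseract.pytesseract.tesseract_cmd before calling image_to_string. If it returns None, Tesseract is most likely not installed where you expect.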
In addition, make sure the TESSDATA_PREFIX Windows environment variable is set to the directory that contains the tessdata directory. For example:
TESSDATA_PREFIX=C:\Program Files (x86)\Tesseract-OCR
if the tessdata directory is located at: C:\Program Files (x86)\Tesseract-OCR\tessdata
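If you would rather not change the system-wide settings, the variable can probably also be set from within the script, before pytesseract is called. The sketch below assumes that the tesseract child process inherits the script's environment. The helper name set_tessdata_prefix is hypothetical.

```python
import os


def set_tessdata_prefix(install_dir):
    """Set TESSDATA_PREFIX to install_dir if it contains a tessdata folder.

    Returns True when the variable was set, False otherwise.
    """
    tessdata = os.path.join(install_dir, "tessdata")
    if not os.path.isdir(tessdata):
        # Do not point the variable at a directory without language data.
        return False
    # Only affects this process and the child processes it starts.
    os.environ["TESSDATA_PREFIX"] = install_dir
    return True
```

For the layout above you would call set_tessdata_prefix(r"C:\Program Files (x86)\Tesseract-OCR"). Some later Tesseract releases reportedly expect the variable to point at the tessdata directory itself, so check this against the version you have installed.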