While reading a CSV file, I ran into the following problem with NumPy 1.10.2. I can't figure out how to give explicit data types to genfromtxt.
Here is the CSV, minimal.csv:
    x,y
    1,hello
    2,hello
    3,jello
    4,jelly
    5,belly
Here, I try to read it with genfromtxt:
    import numpy
    numpy.genfromtxt('minimal.csv', dtype=(int, str))
I also tried:
    import numpy
    numpy.genfromtxt('minimal.csv', names=True, dtype=(int, str))
Either way, I get this error:
    Traceback (most recent call last):
      File "visualize_numpy.py", line 39, in <module>
        numpy.genfromtxt('minimal.csv', dtype=(int, str))
      File "/Users/xeli/workspace/myproj/env/lib/python3.5/site-packages/numpy/lib/npyio.py", line 1518, in genfromtxt
        replace_space=replace_space)
      File "/Users/xeli/workspace/myproj/env/lib/python3.5/site-packages/numpy/lib/_iotools.py", line 881, in easy_dtype
        ndtype = np.dtype(ndtype)
    ValueError: mismatch in size of old and new data-descriptor
Alternatively, I tried:
    import numpy
    numpy.genfromtxt('minimal.csv', dtype=[('x', int), ('y', str)])
Which throws:
    Traceback (most recent call last):
      File "visualize_numpy.py", line 39, in <module>
        numpy.genfromtxt('minimal.csv', dtype=[('x', int), ('y', str)])
      File "/Users/xeli/workspace/myproj/env/lib/python3.5/site-packages/numpy/lib/npyio.py", line 1834, in genfromtxt
        rows = np.array(data, dtype=[('', _) for _ in dtype_flat])
    ValueError: size of tuple must match number of fields.
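I know that dtype=None makes NumPy try to guess the correct types, and that usually works well. However, the documentation mentions that it is much slower than giving explicit types. In my case computational efficiency is required, so dtype=None is not an option.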
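For reference, here is a minimal sketch of the dtype=None variant I mean. It assumes a comma delimiter and a header row, so the delimiter="," and names=True arguments are assumptions that do not appear in my failing calls above, and an in-memory bytes buffer stands in for minimal.csv so the snippet is self-contained.

```python
import io
import numpy as np

# Same contents as minimal.csv, held in memory so the sketch is self-contained.
csv_bytes = b"x,y\n1,hello\n2,hello\n3,jello\n4,jelly\n5,belly\n"

# dtype=None asks genfromtxt to infer each column's type from the data.
# delimiter="," and names=True are assumptions about how the file should be
# parsed; they are not part of the failing calls shown in the question.
data = np.genfromtxt(io.BytesIO(csv_bytes), delimiter=",", names=True, dtype=None)

print(data.dtype)   # inferred structured dtype; exact string type depends on the NumPy version
print(data["x"])    # the integer column
```

The inferred type of the y column (bytes or unicode) can differ between NumPy releases, so the sketch only prints it rather than relying on a particular result.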
Is there something terribly wrong with my approach, or with NumPy?