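I receive CSV data like this when I make an HTTP request for a CSV file. It is a badly malformed string: the rows are run together with no line breaks between them.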
response = '"Subject";"Start Date";"Start Time";"End Date";"End Time";"All day event";"Description""Play football";"16/11/2009";"10:00 PM";"16/11/2009";"11:00 PM";"false";"""Watch 2012";"20/11/2009";"07:00 PM";"20/11/2009";"08:00 PM";"false";""'
I want to convert it into a list of dictionaries:
[{"Subject": "Play football", "Start Date": "16/11/2009", "Start Time": "10:00 PM", "End Date": "16/11/2009", "End Time": "11:00 PM", "All day event": false, "Description": ""}, {"Subject": "Watch 2012", "Start Date": "20/11/2009", "Start Time": "07:00 PM", "End Date": "20/11/2009", "End Time": "08:00 PM", "All day event": false, "Description": ""}]
I tried to solve this with Python's csv module, but it did not work:
>>> import csv
>>> from cStringIO import StringIO
>>> str_obj = StringIO(response)
>>> reader = csv.reader(str_obj, delimiter=';')
>>> [x for x in reader]
[['Subject', 'Start Date', 'Start Time', 'End Date', 'End Time', 'All day event', 'Description"Play football', '16/11/2009', '10:00 PM', '16/11/2009', '11:00 PM', 'false', '"Watch 2012', '20/11/2009', '07:00 PM', '20/11/2009', '08:00 PM', 'false', '']]
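That is the result I got: everything ends up in a single row, and the fields at the row boundaries are merged ('Description"Play football', '"Watch 2012').
Any help would be greatly appreciated. Thanks in advance.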
Here is a pyparsing solution:
from pyparsing import QuotedString, Group, delimitedList, OneOrMore

# a row of headings or data is a list of quoted strings, delimited by ';'s
qs = QuotedString('"')
datarow = Group(delimitedList(qs, ';'))

# an entire data set is a single data row containing the headings, followed by
# one or more data rows containing the data
dataset_parser = datarow("headings") + OneOrMore(datarow)("rows")

# parse the returned response
data = dataset_parser.parseString(response)

# create dict by zipping headings with each row's data values
datadict = [dict(zip(data.headings, row)) for row in data.rows]

print(datadict)
Prints:
[{'End Date': '16/11/2009', 'Description': '', 'All day event': 'false', 'Start Time': '10:00 PM', 'End Time': '11:00 PM', 'Start Date': '16/11/2009', 'Subject': 'Play football'}, {'End Date': '20/11/2009', 'Description': '', 'All day event': 'false', 'Start Time': '07:00 PM', 'End Time': '08:00 PM', 'Start Date': '20/11/2009', 'Subject': 'Watch 2012'}]
If a quoted string contains an embedded semicolon, this will handle that case as well.
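If installing pyparsing is not an option, the same idea can be roughly sketched with only the standard library. This is not the pyparsing approach itself but a regex-based stand-in: it scans for quoted fields and `;` delimiters, and treats two quoted fields with no `;` between them as a row boundary. The function name `parse_concatenated_rows` and the `sample` string are made up for illustration, and the sketch assumes that no field contains a double quote and that rows are always joined with nothing between them.

```python
import re


def parse_concatenated_rows(text):
    """Turn ';'-delimited quoted fields, with rows run together, into dicts.

    Hypothetical helper: assumes no field contains a double quote.
    """
    rows, current = [], []
    prev_was_field = False
    # Each token is either a quoted field (group 1 is its content) or a ';'.
    for match in re.finditer(r'"([^"]*)"|;', text):
        if match.group(0) == ';':
            prev_was_field = False
            continue
        # Two quoted fields with no ';' between them mark a row boundary.
        if prev_was_field:
            rows.append(current)
            current = []
        current.append(match.group(1))
        prev_was_field = True
    if current:
        rows.append(current)
    if not rows:
        return []
    # The first row holds the headings; zip them with each data row.
    headings, data_rows = rows[0], rows[1:]
    return [dict(zip(headings, row)) for row in data_rows]


# Shortened, made-up sample in the same shape as the response above,
# including a field with an embedded semicolon and an empty field.
sample = '"Subject";"Description""Play; football";"""Watch 2012";"x"'
print(parse_concatenated_rows(sample))
```

Calling `parse_concatenated_rows(response)` on the original string should give the same two dictionaries as the pyparsing version, with `'false'` still a string rather than a boolean.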