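I'm working on a function that takes the filename of a CSV file, converts each line to a dictionary, and returns a list of the dictionaries created (so that later functions can iterate over and organize them). I've gotten it to do what I want with the code below, but I feel there has to be a better way. Any suggestions for improvement?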
    import re

    def import_incidents(filename):
        """Imports CSV and returns list of dictionaries for each incident"""
        with open(filename, 'r') as file:
            data = file.read()
            data = data.split('\n')
            list_of_data = []
            headers = True
            for line in data:
                line = line.split('","')
                if headers == True:
                    #Skip header and set to false
                    headers = False
                elif len(line) == 1 or line[3] == '':
                    #File always has a 1 length final line, skip it.
                    #Events can leave blank policies, skip those too.
                    pass
                else:
                    temp_dict = {}
                    temp_dict['id'] = re.sub('"', '', line[0])
                    temp_dict['time'] = re.sub('GMT-0600','',line[1])
                    temp_dict['source'] = line[2]
                    temp_dict['policy'] = line[3]
                    temp_dict['destination'] = line[5]
                    temp_dict['status'] = line[10]
                    list_of_data.append(temp_dict)
        return list_of_data

    print(import_incidents('Incidents (Yesterday Only).csv'))
Example CSV content:

    "ID","Incident Time","Source","Policies","Channel","Destination","Severity","Action","Maximum Matches","Transaction Size","Status",
    "9511564","29 Dec. 2015, 08:33:59 AM GMT-0600","Doe, John","Encrypted files","HTTPS","blah.blah.com","Medium","Permitted","0","47.7 KB","Closed - Authorized",
    "1848446","29 Dec. 2015, 08:23:36 AM GMT-0600","Smith, Joe","","HTTP","google.com","Low","Permitted","0","775 B","Closed"
Martijn Piet.. 6
You've re-invented the csv.DictReader() class, I'm afraid:
    import csv
    import re

    def import_incidents(filename):
        # newline='' lets the csv module handle line endings itself
        with open(filename, 'r', newline='') as file:
            reader = csv.DictReader(file)
            for row in reader:
                # Skip rows whose Policies column is blank
                if not row or not row['Policies']:
                    continue
                row['Incident Time'] = re.sub('GMT-0600', '', row['Incident Time'])
                yield row
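This relies on the header row for the dictionary keys. You can define your own dictionary keys with the fieldnames parameter to DictReader() (the fieldnames entries are matched to the columns in the file, in order), but then the first row in the file is still read like any other row. You can use the next() function to skip rows (see "Skip the headers when editing a csv file using Python").

Note that this version is a generator: it yields one dictionary per row rather than returning a list. Wrap the call in list() if you need a list.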
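As a rough sketch of the fieldnames plus next() approach described above: the key names in FIELDNAMES and the function name read_incidents are made up for illustration, and the sample is a trimmed copy of the CSV from the question with the trailing commas removed. The function takes an open file object rather than a filename so it can be tried against an in-memory string.

```python
import csv
import io

# Hypothetical key names (not from the original post), one per column, in file order.
FIELDNAMES = ['id', 'time', 'source', 'policy', 'channel', 'destination',
              'severity', 'action', 'matches', 'size', 'status']

def read_incidents(file):
    """Yield one dict per data row, keyed by FIELDNAMES instead of the header."""
    reader = csv.DictReader(file, fieldnames=FIELDNAMES)
    # With explicit fieldnames the header row would come back as data, so consume it.
    next(reader, None)
    for row in reader:
        if not row['policy']:
            continue  # skip incidents with a blank policy
        yield row

# Trimmed sample based on the question's data (trailing commas removed).
sample = io.StringIO(
    '"ID","Incident Time","Source","Policies","Channel","Destination",'
    '"Severity","Action","Maximum Matches","Transaction Size","Status"\n'
    '"9511564","29 Dec. 2015, 08:33:59 AM GMT-0600","Doe, John",'
    '"Encrypted files","HTTPS","blah.blah.com","Medium","Permitted","0",'
    '"47.7 KB","Closed - Authorized"\n'
    '"1848446","29 Dec. 2015, 08:23:36 AM GMT-0600","Smith, Joe","","HTTP",'
    '"google.com","Low","Permitted","0","775 B","Closed"\n'
)

incidents = list(read_incidents(sample))  # list() turns the generator into a list
```

Calling next() on the reader rather than on the file keeps the skip correct even if a header field were to contain a quoted line break. With a real file, open it with newline='' as in the answer's code and pass the file object in.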