2009-12-14 71 views
9

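python csv into dictionary

I am very new to Python. I need to create a class that loads csv data into a dictionary.

I want to be able to control the keys and values. So, with the code below, I can pull out worker1.name or worker1.age at any time.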

class ageName(object):
    '''class to represent a person'''
    def __init__(self, name, age):
        self.name = name
        self.age = age

worker1 = ageName('jon', 40)
worker2 = ageName('lise', 22)

# Now if we print this you see that it's stored in a dictionary
print worker1.__dict__
print worker2.__dict__
'''
{'age': 40, 'name': 'jon'}
{'age': 22, 'name': 'lise'}
'''

# when we call (key) worker1.name we are getting the (value)
print worker1.name
'''
jon
'''

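But I am stuck on loading my csv data into keys and values.

[1] I want to create my own keys: worker1 = ageName([name], [age], [id], [gender])

[2] Each [name], [age], [id] and [gender] comes from a specific column in the csv data file.

I really don't know how to approach this. I have tried many methods, but they all failed. I need some help to get started.

---- EDIT: This is my original code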

import csv 

# let us first make student an object 

class Student():
    def __init__(self):
        self.fname = []
        self.lname = []
        self.ID = []
        self.sport = []
        # let us read this file
        for row in list(csv.reader(open("copy-john.csv", "rb")))[1:]:
            self.fname.append(row[0])
            self.lname.append(row[1])
            self.ID.append(row[2])
            self.sport.append(row[3])
    def Tableformat(self):
        print "%-14s|%-10s|%-5s|%-11s" %('First Name','Last Name','ID','Favorite Sport')
        print "-" * 45
        for (i, fname) in enumerate(self.fname):
            print "%-14s|%-10s|%-5s|%3s" %(fname,self.lname[i],self.ID[i],self.sport[i])
    def Table(self):
        print self.lname

class Database(Student):
    def __init__(self):
        g = 0
        choice = ['Basketball','Football','Other','Baseball','Handball','Soccer','Volleyball','I do not like sport']
        data = student.sport
        k = len(student.fname)
        print k
        freq = {}
        for i in data:
            freq[i] = freq.get(i, 0) + 1
        for i in choice:
            if i not in freq:
                freq[i] = 0
            print i, freq[i]


student = Student() 
database = Database() 

This is my current code (incomplete)

import csv 
class Student(object):
    '''class to represent a person'''
    def __init__(self, lname, fname, ID, sport):
        self.lname = lname
        self.fname = fname
        self.ID = ID
        self.sport = sport
reader = csv.reader(open('copy-john.csv'), delimiter=',', quotechar='"') 
student = [Student(row[0], row[1], row[2], row[3]) for row in reader][1::] 
print "%-14s|%-10s|%-5s|%-11s" %('First Name','Last Name','ID','Favorite Sport') 
print "-" * 45 
for i in range(len(student)): 
    print "%-14s|%-10s|%-5s|%3s" %(student[i].lname,student[i].fname,student[i].ID,student[i].sport) 

choice = ['Basketball','Football','Other','Baseball','Handball','Soccer','Volleyball','I do not like sport'] 
lst = [] 
h = 0 
k = len(student) 
# 23 
for i in range(len(student)): 
    lst.append(student[i].sport) # merge together 

for a in set(lst): 
    print a, lst.count(a) 

for i in set(choice):
    if i not in set(lst):
        lst.append(i)
        lst.count(i) = 0  # invalid: a function call cannot be assigned to
        print lst.count(i)
+0

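Note that if you really want a dictionary, you cannot use 'worker1.name' to get the value. A dictionary is accessed as 'worker1['name']'. So, which one do you really want? – 2009-12-14 00:13:07

+0

Hi, Peter. I'm sorry, and I really appreciate your comment. That's a good question. Any pros and cons? I'm sorry... – CppLearner 2009-12-14 00:18:39

+0

There are always pros and cons, but you asked for a dictionary. Do you mean that you don't know whether you should use one? To answer that, we need to know more about what you will do with the data. – 2009-12-14 00:54:14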

Answers

12
import csv 

reader = csv.reader(open('workers.csv', 'rb'), delimiter=',', quotechar='"')
workers = [ageName(row[0], row[1]) for row in reader] 

workers now holds a list of all the workers

>>> workers[0].name 
'jon' 
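For readers on Python 3, the same idea can be sketched with all four columns the question mentions. This is only an illustration under assumptions: the `Worker` class, the `load_workers` helper and the inline sample data are made up here and stand in for the asker's real file and class.

```python
import csv
import io

# Hypothetical sample data standing in for the asker's csv file.
SAMPLE = """name,age,id,gender
jon,40,1,m
lise,22,2,f
"""

class Worker(object):
    """One csv row, with each column exposed as an attribute."""
    def __init__(self, name, age, id_, gender):
        self.name = name
        self.age = int(age)  # csv fields arrive as strings
        self.id = id_
        self.gender = gender

def load_workers(fileobj):
    """Build one Worker per data row, skipping the header row."""
    reader = csv.reader(fileobj)
    next(reader)  # skip the header row
    return [Worker(*row) for row in reader]

workers = load_workers(io.StringIO(SAMPLE))
print(workers[0].name, workers[0].age)
```

With a real file, something like `open('workers.csv', newline='')` would replace the `io.StringIO` object.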

Edited after the question was updated

Is there a reason you are using old-style classes? I am using new-style here.

class Student(object):
    sports = []
    def __init__(self, row):
        self.lname, self.fname, self.ID, self.sport = row
        self.sports.append(self.sport)
    def get(self):
        return (self.lname, self.fname, self.ID, self.sport)

reader = csv.reader(open('copy-john.csv'), delimiter=',', quotechar='"') 
print "%-14s|%-10s|%-5s|%-11s" % tuple(reader.next()) # read header line from csv 
print "-" * 45 
students = list(map(Student, reader)) # read all remaining lines 
for student in students: 
    print "%-14s|%-10s|%-5s|%3s" % student.get() 

# Printing all sports that are specified by students 
for s in set(Student.sports): # class attribute 
    print s, Student.sports.count(s) 

# Printing sports that are not picked 
allsports = ['Basketball','Football','Other','Baseball','Handball','Soccer','Volleyball','I do not like sport'] 
for s in set(allsports) - set(Student.sports): 
    print s, 0 

Hope this gives you some idea of the power of Python sequences. :)
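As a side note, the counting step could also be done with `collections.Counter` (available from Python 2.7 on). The sketch below is written for Python 3; the `sport_counts` helper and the sample `picked` list are hypothetical and only illustrate the idea.

```python
from collections import Counter

ALL_SPORTS = ['Basketball', 'Football', 'Other', 'Baseball', 'Handball',
              'Soccer', 'Volleyball', 'I do not like sport']

def sport_counts(picked, choices=ALL_SPORTS):
    """Return (sport, count) pairs for every choice, including unpicked ones."""
    counts = Counter(picked)
    # A Counter returns 0 for missing keys, so unpicked sports need no special case.
    return [(sport, counts[sport]) for sport in choices]

# Hypothetical sample of the 'sport' column.
picked = ['Soccer', 'Football', 'Soccer', 'Other']
for sport, n in sport_counts(picked):
    print(sport, n)
```

This handles picked and unpicked sports in one pass, instead of the two separate loops above.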

Edit 2, shortened as much as possible... just showing off :P

Ladies and gentlemen, 7(.5) lines.

allsports = ['Basketball','Football','Other','Baseball','Handball',
             'Soccer','Volleyball','I do not like sport']
sports = []
reader = csv.reader(open('copy-john.csv'))
for s in reader:
    if reader.line_num > 1: sports.append(s[3])  # skip the header row when collecting sports
    print "%-14s|%-10s|%-5s|%-11s" % tuple(s)
for s in allsports: print s, sports.count(s)
+0

Wooo, this is exactly what I was looking for: reader = csv.reader(open('Book.csv'), delimiter=',', quotechar='"') workers = [ageName(row[0], row (len(workers)): print workers[i].lname, workers[i].fname, workers[i].ID – CppLearner 2009-12-14 00:26:36

+0

Regarding your `for row in reader`: you can't define `print` as a method *and* use it as a statement in the same module – nosklo 2009-12-14 03:38:52

+0

I was thinking about that while writing it, and I thought it might work, but I have changed it now. – 2009-12-14 03:57:44

2

Have you looked at the csv module?

import csv 
+0

Yes, I have. In fact, I had a poorly written version, but I realized it was a pain in the neck, so I decided to go with a dictionary right away. – CppLearner 2009-12-14 00:19:29

8

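I second Mark's suggestion. In particular, look at DictReader from the csv module, which lets you read a comma-separated (or, more generally, delimiter-separated) file as dictionaries.

PyMotW's coverage of the csv module is a quick reference with examples of using DictReader and DictWriter.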
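To make the DictReader suggestion concrete, here is a minimal Python 3 sketch. The column names and inline sample rows are assumptions made for illustration, loosely modelled on the fields in the question; they are not the asker's actual data.

```python
import csv
import io

# Hypothetical sample data; the header row supplies the dictionary keys.
SAMPLE = """fname,lname,ID,sport
John,Smith,1,Soccer
Ann,Lee,2,Handball
"""

reader = csv.DictReader(io.StringIO(SAMPLE))
students = list(reader)  # one dictionary per data row

# Values are looked up by column name rather than by position.
print(students[0]['fname'])

# Index the rows by a column of your choice to control the keys yourself.
by_id = {row['ID']: row for row in students}
print(by_id['2']['sport'])
```

Note that values are accessed as `row['fname']`, not `row.fname`, which is the dictionary-versus-attribute trade-off raised in the comments on the question.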

+0

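I don't understand why this got a -1 with no explanatory comment. What is wrong with suggesting DictReader? – CruiZen 2010-04-06 10:17:47

+0

I must say, the link you posted was very useful =). Going to upvote this. (Mods, I assume that's OK?) – victorhooi 2010-05-18 05:09:39

+0

Thanks @victorhooi – CruiZen 2011-05-04 13:02:37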

9

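I know this is a very old question, but it is impossible to read it without thinking of the amazing new(ish) Python library, pandas. Its main unit of analysis is an idea called a DataFrame, which is modelled on the way R handles data.

Let's say you have a (very boring) csv file called example.csv which looks like this: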

day,fruit,sales 
Monday,Banana,10 
Monday,Orange,20 
Tuesday,Banana,12 
Tuesday,Orange,22 

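If you want to read a csv in double-quick time and "do stuff" with it, you would be hard pressed to beat the following code for simplicity or ease of use: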

>>> import pandas as pd 
>>> csv = pd.read_csv('example.csv') 
>>> csv
       day   fruit  sales
0   Monday  Banana     10
1   Monday  Orange     20
2  Tuesday  Banana     12
3  Tuesday  Orange     22
>>> csv[csv.fruit=='Banana']
       day   fruit  sales
0   Monday  Banana     10
2  Tuesday  Banana     12
>>> csv[(csv.fruit=='Banana') & (csv.day=='Monday')]
      day   fruit  sales
0  Monday  Banana     10

In my view, this is really great. Never iterate over a csv.reader object again!

+1

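Pretty nice. Thanks. Oh, what? Almost 4 years! :) – CppLearner 2013-08-15 02:12:24

+0

That's the crazy thing about the internet world: it never dies! I came across this question in a perfectly normal Google search session. Hope you take a look at 'pandas' and enjoy it! – LondonRob 2013-08-15 10:34:55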