2012-03-21 102 views

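Python: searching a data set

I have this as a homework question and am not sure how I should go about it.

First, I was given a data set containing a list of staff names, addresses, emails and so on, about 50 employees in total.

You are asked to write an application that provides information about staff. Your program should prompt the user for a search criterion. Any member of staff who matches the search criterion should be printed on screen in the following format: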

Position Designation Room and Extension Name and Email Address
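(columns are tab-separated)

Matching information .......... ..
You will have to modify the data set for processing, and you may choose to keep it in a separate file, but this is not required. Your program should satisfy certain constraints: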

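  • You should compare every column in the data set against the search criterion.
  • Comparisons should be case-insensitive.
  • All output should be in initial capitals, except for email addresses.
  • If a match is found, the result rows should be printed with all columns aligned.
  • If there is no match, a message should be printed, with no header row.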

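You should save (1) your program and (2) a paragraph explaining how you processed the data set.

You should also run your application on these test cases: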

  • Search for "Brenda"
  • Search for all clerical staff.
  • Search for "BredNa"
  • Find Dr Carr's position
  • Which office is Neil located in?

So, first of all, how should I read this data set? Should I read it in as a text file, or create a tuple, a dictionary, or something else?


staff = [['prof.liam maguire','head of school','academic','MS127','75605','[email protected]'], 
['prof. martin McGinnity','director of intelligent systems research centre','academic','MS112','75616','[email protected]'], 
['dr laxmidhar Behera','reader','academic','MS107','75276','[email protected]'], 
['dr girijesh Prasad','professor','academic','MS137','75645','[email protected]'], 
['dr kevin Curran','senior lecturer','academic','MS130','75565','[email protected]'], 
['mr aiden McCaughey','Senior Lecturer','academic','MG126','75131','[email protected]'], 
['dr tom Lunney','postgraduate courses co-ordinator (Senior Lecturer)','academic','MG121D','75388','[email protected]'], 
['dr heather Sayers','undergraduate courses co-ordinator (Senior Lecturer)','academic','MG121C','75148','[email protected]'], 
['dr liam Mc Daid','senior lecturer','academic','MS016','75452','[email protected]'], 
['mr derek Woods','senior lecturer','academic','MS134','75380','[email protected]'], 
['dr ammar Belatreche','lecturer','academic','MS104','75185','[email protected]'], 
['mr michael Callaghan','lecturer','academic','MS132','75771','[email protected]'], 
['dr sonya Coleman','lecturer','academic','MS133','75030','[email protected]'], 
['dr joan Condell','lecturer','academic','MS131','75024','[email protected]'], 
['dr damien Coyle','lecturer','academic','MS103','75170','[email protected]'], 
['mr martin Doherty','lecturer','academic','MG121A','75552','[email protected]'], 
['dr jim Harkin','lecturer','academic','MS108','75128','[email protected]'], 
['dr yuhua Li','lecturer','academic','MS106','75528','[email protected]'], 
['dr sandra Moffett','lecturer','academic','MS015','75381','[email protected]'], 
['mrs mairin Nicell','lecturer','academic','MG127','75007','[email protected]'], 
['mrs maeve Paris','lecturer','academic','MG040','75212','[email protected]'], 
['dr jose Santos','lecturer','academic','MG035','75034','[email protected]'], 
['dr nH. Siddique','lecturer','academic','MG037','75340','[email protected]'], 
['dr zumao Weng','lecturer','academic','MG050','75358','[email protected]'], 
['dr shane Wilson','lecturer','academic','MG038','75527','[email protected]'], 
['dr caitriona carr','technical Services Engineer','computing and Technical Support','MG121B','75003','[email protected]'], 
['mr neil McDonnell','technical Services Supervisor','computing and Technical Support','MS030/MF143','75360','[email protected]'], 
['mr paddy McDonough','technical Services Engineer','computing and Technical Support','MS034','75322','[email protected]'], 
['mr bernard McGarry','network Assistant','computing and Technical Support','MG132','75644','[email protected]'], 
['mr stephen Friel','secretary','clerical staff','MG048','75148','[email protected]'], 
['ms emma McLaughlin','secretary','clerical staff','MG048','75153','[email protected]'], 
['mrs. brenda Plummer','secretary','clerical staff','MS126','75605','[email protected]'], 
['miss paula Sheerin','secretary','clerical staff','MS111','75616','[email protected]'], 
['mrs michelle Stewart','secretary','clerical staff','MG048','75382','[email protected]']] 


matches = []

criterion = input("please enter search criterion: ")
criterion = criterion.lower()

for person in staff:
    for characteristic in person:
        # Lower-case the field as well, so the comparison ignores case.
        if criterion in characteristic.lower():
            matches.append(person)
            break  # one matching field is enough; avoids duplicate rows

if len(matches) == 0:
    print("No Match")
else:
    print("POSITION |||DESIGNATION ||| EXT & ROOM NO||| NAME & EMAIL")
    for i in matches:
        # The email address (i[5]) is printed as stored, not title-cased.
        print(i[1].title(), ': ', i[2].title(), ':', i[3].upper() + i[4], ':', i[0].title(), i[5])

This is what I've come up with so far, and it seems to work. Where would you make improvements?


What format is the data set in? Can you provide a sample entry? Also, what have you tried so far? – Taymon 2012-03-21 19:26:18

Answers

1

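Thank you for being honest that this is a homework question. Stack Overflow discourages giving direct answers to homework questions, but we can guide you toward the right answer.

Regarding "modify the data set for processing": this means that the data is not currently in a consistent format. The first thing you need to do is look at the data you were given and decide on the best representation for it.

I recommend a tab-delimited data file. This is easy to create in Microsoft Excel by putting the data into a spreadsheet and saving it as text. (Excel will complain that you will lose everything that makes it a spreadsheet rather than a text file, but that's fine: you want a text file.) Save the updated file.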

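What Excel produces is called a tab-delimited text file: a two-dimensional grid of data (shaped like a spreadsheet), with a newline between rows and a tab character between the fields of each row. The newline character separates the rows of data; text editors interpret it as a command to start writing on a new line. The tab character is written as \t in an escaped Python string, but it is really a single character of its own. This format is also known as tab-separated values, or TSV. Closely related is comma-separated values, or CSV, which is another option in Excel. CSV can also stand for character-separated values, a general term for representing a grid of data by using some character (',' for comma-separated, '\t' for tab-separated) to separate the fields of each record.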
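To make the format concrete, here is a small sketch. The two sample rows are taken from the data in the question (emails left out), and the variable names are only for illustration.

```python
# A tab-separated "file" held in a string: "\n" ends each row,
# "\t" separates the fields within a row.
text = (
    "Dr Kevin Curran\tSenior Lecturer\tAcademic\tMS130\t75565\n"
    "Dr Yuhua Li\tLecturer\tAcademic\tMS106\t75528\n"
)

# "\t" takes two characters to write in source code but is one character.
assert len("\t") == 1

# Splitting by hand: first into rows, then each row into fields.
rows = [line.split("\t") for line in text.splitlines()]

for row in rows:
    print(row)
```

Splitting by hand like this is fine for simple data, but it does not cope with quoted fields that contain tabs or newlines.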

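CSV is a very common file format, so Python is ready to help you here. Python has a library, csv, designed to read these files for you. If you use Excel's text format, you need to tell it to use the dialect excel-tab, which denotes Excel's tab-delimited files.

You will need to construct a csv.reader to read your formatted data file. Each time you read a row of the CSV you get a list, and the order of the items in that list is the same as the order in which you laid out the columns, so use that order to index into the list and find each field.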
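A minimal sketch of that step, assuming a column order of name, position, designation, room, extension. The in-memory string stands in for the saved file, and the file name in the comment is made up.

```python
import csv
import io

# Stand-in for a file saved from Excel as tab-delimited text.
# With a real file you would use something like:
#     open("staff.txt", newline="")   # "staff.txt" is a hypothetical name
data = io.StringIO(
    "Dr Kevin Curran\tSenior Lecturer\tAcademic\tMS130\t75565\n"
    "Dr Yuhua Li\tLecturer\tAcademic\tMS106\t75528\n"
)

# Column order chosen when the file was laid out (an assumption here).
NAME, POSITION, DESIGNATION, ROOM, EXTENSION = range(5)

reader = csv.reader(data, dialect="excel-tab")
staff = [row for row in reader]  # each row is a list of strings

for row in staff:
    print(row[NAME], "is in room", row[ROOM])
```

Naming the column positions once keeps the indexing readable in the rest of the program.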

Once you have read a row, what do you want to do with it?

You have a choice of storage formats within your program:

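  • Save each record into a list (which ends up behaving like a list of lists, because each record behaves like a list). Now the data is loaded, and when you want to search it, you iterate over the whole list of lists and use an equality test to find matches. This can be done with a list comprehension, which is almost certainly what your teacher is looking for.
  • In addition, create a dictionary for each file column and store every record in each dictionary, so that each dictionary maps that column's value (the key) to the record. There is a catch here! A dictionary can store only one value per key, but you will certainly have different people with the same "designation" (several professors, several clerical staff, and so on), and you cannot be sure that no two people share a name, either. Each index entry must therefore hold a list of records, not just a single record.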
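The first option might look something like the sketch below. It uses a case-insensitive substring test in place of a strict equality test, on the assumption that this is what the assignment's "case-insensitive" requirement calls for; `find_matches` is a made-up name, and the records are a few rows from the question with the emails left out.

```python
# A few records from the question's data set (emails left out).
staff = [
    ["Dr Kevin Curran", "Senior Lecturer", "Academic", "MS130", "75565"],
    ["Mrs. Brenda Plummer", "Secretary", "Clerical Staff", "MS126", "75605"],
    ["Mr Neil McDonnell", "Technical Services Supervisor",
     "Computing and Technical Support", "MS030/MF143", "75360"],
]


def find_matches(records, criterion):
    """Return every record in which any field contains the criterion."""
    needle = criterion.lower()
    return [
        record for record in records
        if any(needle in field.lower() for field in record)
    ]


print(find_matches(staff, "BreNDa"))
print(find_matches(staff, "clerical"))
print(find_matches(staff, "nobody"))
```

Because `any()` stops at the first matching field, each record appears at most once in the result.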

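For repeated queries the second approach is much faster, because you organize all the records up front for fast lookup. However, the first is much easier to implement and is more likely to be what your teacher expects. I suggest implementing the first, understanding it, and then implementing the second if you have time.

The user interface for all of this is of course up to you, but this should get you well on the way to implementing the core of the program. Good luck.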
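For the second approach, a rough sketch of one index, assuming the designation is the third field of each record. `build_index` is a hypothetical helper, and the keys are lower-cased so that lookups can ignore case.

```python
from collections import defaultdict

staff = [
    ["Dr Kevin Curran", "Senior Lecturer", "Academic", "MS130", "75565"],
    ["Dr Yuhua Li", "Lecturer", "Academic", "MS106", "75528"],
    ["Mrs. Brenda Plummer", "Secretary", "Clerical Staff", "MS126", "75605"],
]
DESIGNATION = 2  # position of the designation column in each record


def build_index(records, column):
    """Map each lower-cased value in `column` to the list of records with it."""
    index = defaultdict(list)
    for record in records:
        index[record[column].lower()].append(record)
    return index


by_designation = build_index(staff, DESIGNATION)

# .get() avoids creating empty entries for keys that are not present.
print(by_designation.get("academic", []))
print(by_designation.get("clerical staff", []))
```

Note that an index like this only answers whole-value lookups such as "academic"; a partial search term such as "cleric" would still need a scan over the records.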


I have posted my attempt below – smorr87 2012-03-22 19:33:01

0

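I assume you have your data set as a plain text file (or as copyable text from an email, etc.). In that case you have several options:

  1. Create a text file in which each line stores the information about one employee in a fixed format: "Name", "Position", etc. In this case, to perform a search you scan the file, print the lines that match, and repeat for the matching parts.

  2. Use Python data types to store the information in memory, for example a list of dictionaries with keys named "Name", "Position", etc. The search then becomes slightly more complicated (only a little, really), but you can format the output any way you like. However, you first need to populate the list with data, by reading the text file (or by hard-coding it by hand, if you are desperate).

  3. You can combine these approaches to some extent by building dictionaries only from the matching lines of the file.

  4. You could use a real database engine such as MySQL, but that is probably real overkill for this assignment.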
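As an illustration of option 2, here is a sketch that fills a list of dictionaries from tab-separated text using csv.DictReader. The header line, the key names and the sample rows are assumptions made for the example, and the in-memory string stands in for the text file.

```python
import csv
import io

# Stand-in for the text file; the header line supplies the dictionary keys.
data = io.StringIO(
    "Name\tPosition\tDesignation\tRoom\tExtension\n"
    "Dr Kevin Curran\tSenior Lecturer\tAcademic\tMS130\t75565\n"
    "Mrs. Brenda Plummer\tSecretary\tClerical Staff\tMS126\t75605\n"
)

# DictReader yields one dictionary per data row.
staff = [dict(row) for row in csv.DictReader(data, dialect="excel-tab")]

criterion = "secretary"
matches = [
    person for person in staff
    if any(criterion.lower() in value.lower() for value in person.values())
]

for person in matches:
    print(person["Position"], person["Room"], person["Name"], sep="\t")
```

Looking fields up by name, such as person["Room"], makes the output code easier to read than numeric indexes.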

1

Here is how I would go about it:

staff_details = [["Prof. Liam Maguire","Head of School","Academic","MS127","75605","[email protected]"], 
       ["Prof. Martin McGinnity","Director of Intelligent Systems Research Centre","Academic","MS112","75616","[email protected]"], 
       ["Dr Laxmidhar Behera","Reader","Academic","MS107","75276", "[email protected]"], 
       ["Dr Girijesh Prasad","Professor","Academic","MS137","75645","[email protected]"], 
       ["Dr Kevin Curran","Senior Lecturer","Academic","MS130","75565","[email protected]"], 
       ["Mr Aiden McCaughey","Senior Lecturer","Academic","MG126","75131","[email protected]"], 
       ["Dr Tom Lunney","Postgraduate Courses’ Co-ordinator (Senior Lecturer) ","Academic","MG121D","75388","[email protected]"], 
       ["Dr Heather Sayers","Undergraduate Courses’ Co-ordinator (Senior Lecturer) ","Academic","MG121C","75148","[email protected]"], 
       ["Dr Liam Mc Daid","Senior Lecturer","Academic","MS016","75452","[email protected]"], 
       ["Mr Derek Woods","Senior Lecturer","Academic","MS134","75380","[email protected]"], 
       ["Dr Ammar Belatreche","Lecturer","Academic","MS104","75185","[email protected]"], 
       ["Mr Michael Callaghan","Lecturer","Academic","MS132","75771","[email protected]"], 
       ["Dr Sonya Coleman","Lecturer","Academic","MS133","75030","[email protected]"], 
       ["Dr Joan Condell","Lecturer","Academic","MS131","75024","[email protected]"], 
       ["Dr Damien Coyle","Lecturer","Academic","MS103","75170","[email protected]"], 
       ["Mr Martin Doherty","Lecturer","Academic","MG121A","75552","[email protected]"], 
       ["Dr Jim Harkin","Lecturer","Academic","MS108","75128","[email protected]"], 
       ["Dr Yuhua Li","Lecturer","Academic","MS106","75528","[email protected]"], 
       ["Dr Sandra Moffett","Lecturer","Academic","MS015","75381","[email protected]"], 
       ["Mrs Mairin Nicell","Lecturer","Academic","MG127","75007","[email protected]"], 
       ["Mrs Maeve Paris","Lecturer","Academic","MG040","75212","[email protected]"], 
       ["Dr Jose Santos","Lecturer","Academic","MG035","75034","[email protected]"], 
       ["Dr NH. Siddique","Lecturer","Academic","MG037","75340","[email protected]"], 
       ["Dr Zumao Weng","Lecturer","Academic","MG050 ","75358","[email protected]"], 
       ["Dr Shane Wilson","Lecturer","Academic","MG038","75527","[email protected]"], 
       ["Dr Caitriona Carr","Technical Services Engineer","Computing and Technical Support","MG121B","75003","[email protected]"], 
       ["Mr Neil McDonnell","Technical Services Supervisor","Computing and Technical Support","MS030/MF143","75360", "[email protected]"], 
       ["Mr Paddy McDonough","Technical Services Engineer","Computing and Technical Support","MS034","75322","[email protected]"], 
       ["Mr Bernard McGarry","Network Assistant","Computing and Technical Support","MG132","75644","[email protected]"], 
       ["Mr Stephen Friel","Secretary","Clerical Staff","MG048","75148","[email protected]"], 
       ["Ms Emma McLaughlin","Secretary","Clerical Staff","MG048","75153","[email protected]"], 
       ["Mrs. Brenda Plummer","Secretary","Clerical Staff","MS126","75605","[email protected]"], 
       ["Miss Paula Sheerin","Secretary","Clerical Staff","MS111","75616","[email protected]"], 
       ["Mrs Michelle Stewart","Secretary","Clerical Staff","MG048","75382","[email protected]"]] 

search_result = []

search_input = input("Please enter a search criterion: ")
# Lower-case the criterion once; each field is lower-cased in the loop
# so that the comparison is case-insensitive.
needle = search_input.lower()

for person in staff_details:
    for characteristic in person:
        if needle in characteristic.lower():
            search_result.append(person)
            break  # one matching field is enough; avoids duplicate rows

if len(search_result) == 0:
    print("No staff members match your search criterion of ->", search_input)
else:
    print("We have a match!")
    print("{0:<30} {1:<40} {2:<40} {3:<50}".format("Position:", "Designation:", "Room and Extension:", "Name and Email:"))
    print("-" * 160)

    # One aligned row per matching member of staff.
    for align in search_result:
        print("{0:<30} {1:<40} {2:<40} {3:<50}".format(align[1], align[2], align[3] + ", Ext:" + align[4], align[0] + "(" + align[5] + ")"))
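One thing to watch: the first column is 30 characters wide, but the longest position title in the data is longer than that, so that row will push its columns out of line. A possible way around it, sketched here with made-up names and a cut-down set of rows, is to work out each column's width from the data before printing.

```python
rows = [
    ("Position", "Designation", "Room and Extension", "Name"),
    ("Secretary", "Clerical Staff", "MS126, Ext:75605", "Mrs. Brenda Plummer"),
    ("Director of Intelligent Systems Research Centre", "Academic",
     "MS112, Ext:75616", "Prof. Martin McGinnity"),
]

# Width of each column = length of its longest cell (header included).
widths = [max(len(cell) for cell in column) for column in zip(*rows)]


def format_row(row):
    """Left-justify each cell to its column width, separated by two spaces."""
    return "  ".join(cell.ljust(width) for cell, width in zip(row, widths))


lines = [format_row(row) for row in rows]
for line in lines:
    print(line)
```

Here `zip(*rows)` turns the list of rows into columns, which is what makes the per-column maximum a one-liner.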

I hope this helps!