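AttributeError: trying to match a list of string identifiers from file1 in file2

Here is a short summary of my goal. I have a list of data in a data text file, basically names or identifiers. The names are all on one line, separated by spaces, and I want to treat each one as a separate item. If a name from the original data text file also appears in a big file, I want that line of the big file, that is, the name plus the additional information on the same line, written to a smaller data file.

Here is the program with which I began attempting this feat. Perhaps it is pushing the limits of my skills, but I hope to be able to accomplish it.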

datafile = open ('C:\\datatext.txt', 'r') 

line = [item for item in open('C:\\datatext.txt', 'r').read().split(' ') 
        if item.startswith("name") or item.startswith("name2")] 

line_list = line.split(" ") 

completedataset = open('C:\\bigfile.txt', 'r') 
smallerdataset = open('C:\\smallerdataset.txt', 'w') 

trials = [ line_list ] 


for line in completedataset: 
    for t in trials: 
     if t in line: 
      smallerdataset.write(line) 

completedataset.close() 
smallerdataset.close()

Here is the error I receive when I run the program in Python:

Traceback (most recent call last): 
    File "C:/program3.py", line 7, in <module> 
    line_list = line.split(" ") 
AttributeError: 'list' object has no attribute 'split'

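I have tried to be thorough and look forward to your input. If you have further questions, I will elaborate as soon as I can. All the best, and enjoy the rainy day.

Edit:

I have made some changes based on the suggested solution. This is my program now: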

with open('C:\\datatext.txt', 'r') as datafile: 
    lines = datafile.read().split(' ') 
matchedLines = [item for item in lines if item.startswith("name1") or item.startswith("othername")] 


completedataset = open('C:\\bigfile.txt', 'r') 
smallerdataset = open('C:\\smallerdataset.txt', 'w') 

trials = [ matchedLines ] 


for line in completedataset: 
    for t in trials: 
     if t in line: 
      smallerdataset.write(line) 

completedataset.close() 
smallerdataset.close()

Now I get this error:

 
Traceback (most recent call last): 
    File "C:/program5.py", line 17, in <module> 
    if t in line: 
TypeError: 'in <string>' requires string as left operand, not list 
>>>

Thank you for your continued help with this problem.

Edit 2:

I have made some more changes, and now I get this error:

 
Traceback (most recent call last): 
    File "C:/program6.py", line 9, in <module> 
    open('C:\\smallerdataset.txt', 'w')) as (completedataset, smallerdataset): 
AttributeError: 'tuple' object has no attribute '__exit__'

Here is my program as it currently stands:

with open('C:\\datatext.txt', 'r') as datafile: 
    lines = datafile.read().split(' ') 
matchedLines = [item for item in lines if item.startswith("nam1") or item.startswith("ndname")] 


with (open('C:\\bigfile.txt', 'r'), 
     open('C:\\smallerdataset.txt', 'w')) as (completedataset, smallerdataset): 
    for line in completedataset: 
        for t in matchedLines: 
            if t in line: 
                smallerdataset.write(line) 

completedataset.close() 
smallerdataset.close()

How can I get past this obstacle?

Source

2010-07-25 Robert A. Fettikowski

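You can remove the second 'item.startswith' call. If an item starts with 'name2', it always starts with 'name' as well, so that code will never be called. – Daenyth 2010-07-25 19:45:11

You realise that if 'item.startswith('name')' is 'True', your second condition is never checked, and if it is 'False', the second condition is always 'False', right? Also, 'startswith' accepts a tuple of strings to check. – SilentGhost 2010-07-25 19:47:19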
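
As a small illustration of the tuple form mentioned in the comment above, the two `startswith` calls can be collapsed into one. This is only a sketch: the identifiers in `items` are made up, and the prefixes are the placeholder names from the question.

```python
# Hypothetical identifiers standing in for the asker's placeholder names.
items = ["name1_a", "othername_b", "unrelated_c", "name2_d"]

# str.startswith accepts a tuple of prefixes and is True if any of them matches.
prefixes = ("name1", "othername")
matched = [item for item in items if item.startswith(prefixes)]

print(matched)  # ['name1_a', 'othername_b']
```

Note that the argument has to be a tuple; passing a list of prefixes raises a TypeError.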

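Name1 and name2 are temporary names I substituted just to ask the question. I did not want to use a person's real name in the question, because that is private information. – 2010-07-25 20:36:14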

line = [item for item in open('C:\\chiptext.txt', 'r').read().split(' ') 
      if item.startswith("SNP") or item.startswith("AFFY")]

This makes line a list of strings. List objects have no split method.
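
The error can be reproduced without any files. The following is a minimal sketch in which the string `text` is an invented stand-in for the contents of the data file:

```python
# Hypothetical file contents: identifiers on one line, separated by spaces.
text = "SNP_1 AFFY_2 other_3"

# split() already returns a list of strings, and the comprehension builds another list.
items = [item for item in text.split(" ") if item.startswith(("SNP", "AFFY"))]
print(type(items).__name__)  # list

# Calling .split() on that list fails in the same way as in the traceback.
error_message = None
try:
    items.split(" ")
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

Since `items` is already the list of identifiers, the extra `split` call can simply be dropped.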

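It looks like you want a list of all the names in datatext, and the subset of that list whose names match some predicate. The best way to do that is the following.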

with open('C:\\datatext.txt', 'r') as datafile: 
    lines = datafile.read().split(' ') 
matchedLines = [item for item in lines if (PREDICATE)]

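As a general comment, try not to overuse one-line code. Your list comprehension line opens a file object that is never explicitly closed.

Edit for the new edit: matchedLines is already a list, so I don't know why you wrap it in another list when you build trials. Here is a simple example of what you are doing.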

l = [1,2,3] 
ll = [l] 
print(ll)  # [[1, 2, 3]]

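When an error does not make sense given what you expect a variable's value to be, add print statements so you can confirm that the value is what you think it is.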
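
For instance, printing the loop variable and its type makes the cause of the TypeError visible. This is a sketch with invented data: `matched_lines` and `line` are hypothetical stand-ins for the values in the question.

```python
matched_lines = ["name1", "othername"]  # hypothetical identifiers
trials = [matched_lines]                # wraps the list in another list

line = "name1 some extra data\n"        # hypothetical line from the big file

caught_type_error = False
for t in trials:
    # The print shows that t is the whole list, not a single identifier.
    print(repr(t), type(t).__name__)
    try:
        t in line
    except TypeError as exc:
        caught_type_error = True
        print("TypeError:", exc)

# Iterating over the list itself yields one string per step.
hits = [t for t in matched_lines if t in line]
print(hits)  # ['name1']
```

The loop over `trials` runs exactly once, with `t` bound to the inner list, which is why the `in` test against a string fails.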

This is probably what you need:

# Read the identifiers and keep only those with the wanted prefixes.
with open('C:\\datatext.txt', 'r') as datafile: 
    lines = datafile.read().split(' ') 
matchedLines = [item for item in lines if item.startswith("name1") or item.startswith("othername")] 

# Nested with blocks close both files automatically, even if an error occurs.
with open('C:\\bigfile.txt', 'r') as completedataset: 
    with open('C:\\smallerdataset.txt', 'w') as smallerdataset: 
        for line in completedataset: 
            for t in matchedLines: 
                if t in line: 
                    smallerdataset.write(line)
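
The same idea can be sketched as a self-contained function that is easy to try on throwaway files. Everything here is an assumption for illustration: the function name `filter_lines`, the temporary file contents, and the use of the comma form of `with` (available in Python 2.7 and 3.1 onward) in place of the nested blocks. Unlike the tuple in parentheses tried in the question, the comma form opens each file as its own context manager.

```python
import os
import tempfile


def filter_lines(names_path, big_path, out_path, prefixes):
    """Copy lines of big_path that mention a selected identifier to out_path."""
    with open(names_path, "r") as datafile:
        # split() with no argument also discards the trailing newline.
        names = datafile.read().split()
    wanted = [name for name in names if name.startswith(prefixes)]

    # Both files are opened in one with statement, separated by a comma.
    with open(big_path, "r") as big, open(out_path, "w") as out:
        for line in big:
            # any() stops at the first hit, so a line is written at most once.
            if any(name in line for name in wanted):
                out.write(line)
    return wanted


# Usage with temporary files; all names and contents below are made up.
workdir = tempfile.mkdtemp()
names_path = os.path.join(workdir, "datatext.txt")
big_path = os.path.join(workdir, "bigfile.txt")
out_path = os.path.join(workdir, "smallerdataset.txt")

with open(names_path, "w") as f:
    f.write("name1_a othername_b skip_c\n")
with open(big_path, "w") as f:
    f.write("name1_a 0.1 0.2\nskip_c 0.3 0.4\nothername_b 0.5 0.6\n")

wanted = filter_lines(names_path, big_path, out_path, ("name1", "othername"))
with open(out_path, "r") as f:
    result = f.read()
print(result)
```

Like the loop above, this does a plain substring test, so an identifier such as name1 would also match a line containing name10. Splitting each line and comparing whole fields would avoid that if it matters for the data.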

Source

2010-07-25 19:36:34 unholysampler

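Thank you for pointing out my mistake. How can I best correct it? – 2010-07-25 20:31:20

Added some sample code. If this answer solves your problem, remember to accept it. – unholysampler 2010-07-25 20:49:48

I am still having problems. See my edited original post, which reflects the problem with this line of code. Thanks. – 2010-07-25 22:05:34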
