Python: check whether an item exists in a list while iterating

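I am trying to loop over two lists and want to print an item only if it exists in the second list. I will be doing this with very large files, so I don't want to store them in memory as a list or dictionary. Is there a way to do this without storing the data in a list or dictionary?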

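I was able to do the following to confirm that items are not in the list, but I'm not sure why it doesn't work when I try to confirm that they are in the list by removing the "not".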

Code to verify that items do not exist in list_2:

list_1 = ['apple', 
      'pear', 
      'orange', 
      'kiwi', 
      'strawberry', 
      'banana'] 

list_2 = ['kiwi', 
      'melon', 
      'grape', 
      'pear'] 

for fruit_1 in list_1:
    if all(fruit_1 not in fruit_2 for fruit_2 in list_2):
        print(fruit_1)

Code to verify that items exist in list_2:

list_1 = ['apple', 
      'pear', 
      'orange', 
      'kiwi', 
      'strawberry', 
      'banana'] 

list_2 = ['kiwi', 
      'melon', 
      'grape', 
      'pear'] 

for fruit_1 in list_1:
    if all(fruit_1 in fruit_2 for fruit_2 in list_2):
        print(fruit_1)

2017-04-26 MBasith

Can't you just use a list comprehension? [x for x in list_1 if x in list_2] returns a list of the items in list_1 that are also in list_2. For the opposite, use [x for x in list_1 if x not in list_2]. – Wright

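@MBasith If you accept keeping _one_ list in memory, you can build a set from that list and iterate over the other one (while reading the file). That will be fast. Otherwise it will take forever.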

OK, I was using the all() function to avoid that. I'm just confused about why the inverse doesn't work. – MBasith
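The set-based suggestion in the comments above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name common_lines is hypothetical, each input holds one item per line, and the smaller of the two inputs fits in memory. io.StringIO objects stand in for real file handles so the example is self-contained.

```python
import io


def common_lines(small_file, large_file):
    """Yield each line of large_file that also occurs in small_file.

    Hypothetical helper: only small_file is held in memory (as a set);
    large_file is read line by line and never stored.
    """
    # Build the lookup set once; membership tests on a set are O(1) on average.
    wanted = {line.strip() for line in small_file}
    for line in large_file:
        item = line.strip()
        if item in wanted:
            yield item


# Stand-ins for open('list_2.txt') and open('list_1.txt').
list_2_file = io.StringIO("kiwi\nmelon\ngrape\npear\n")
list_1_file = io.StringIO("apple\npear\norange\nkiwi\nstrawberry\nbanana\n")

for fruit in common_lines(list_2_file, list_1_file):
    print(fruit)  # prints pear, then kiwi
```

With real files, the two StringIO objects would be replaced by file objects opened with open(); iterating over a file object reads it one line at a time.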

So, this is how you get them:

exists = [item for item in list_1 if item in list_2] 
does_not_exist = [item for item in list_1 if item not in list_2]

And to print them:

for item in exists:
    print(item)
for item in does_not_exist:
    print(item)

But if you only want to print:

for item in list_1:
    if item in list_2:
        print(item)

2017-04-26 15:53:54 zipa

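Thanks for the reply. But this saves the output to the exists and does_not_exist variables. The files I'm working with are large, and I'd like to avoid holding them in memory. – MBasith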
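One way to address the concern in this comment, as a minimal sketch: a generator expression (parentheses instead of square brackets) produces matches one at a time instead of building a result list. The helper name matching is hypothetical. Note that this only avoids storing the output; the lookup collection passed as the second argument still has to be in memory.

```python
list_1 = ['apple', 'pear', 'orange', 'kiwi', 'strawberry', 'banana']
list_2 = ['kiwi', 'melon', 'grape', 'pear']


def matching(items, lookup):
    """Lazily yield the entries of items that are present in lookup."""
    # A generator expression: nothing is computed until it is iterated,
    # and no intermediate list of results is ever created.
    return (item for item in items if item in lookup)


for item in matching(list_1, list_2):
    print(item)  # prints pear, then kiwi
```

Because items can be any iterable, a file object could be passed in place of list_1 so that the input is streamed as well.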

You can use Python sets to work out which items are in both lists:

set(list_1).intersection(set(list_2))

See https://docs.python.org/2/library/sets.html

2017-04-26 15:54:14 wrdeman

"I will be doing this with very large files, so I don't want to store them in memory as a list or dictionary"...

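One problem is that all() returns false if any single check returns false. Another is that the fruit_1 in fruit_2 part checks whether fruit_1 is a substring of fruit_2. If we modified the lists so that your logic works, they would look like this: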

list_1 = ['apple', 
      'pear', 
      'orange', 
      'kiwi', 
      'berry', 
      'banana', 
      'grape'] 

list_2 = ['grape', 
      'grape', 
      'grape', 
      'grape', 
      'grape']

but they could also be:

list_1 = ['apple', 
      'pear', 
      'orange', 
      'kiwi', 
      'berry', 
      'banana', 
      'grape'] 

list_2 = ['strawberry', 
      'strawberry', 
      'strawberry', 
      'strawberry', 
      'strawberry', 
      'strawberry']

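because berry is in strawberry. If we keep using iteration for this check rather than sets and an intersection, as @wrdeman suggested, then with the data set you provided it should look like this:

for fruit_1 in list_1:
    if fruit_1 in list_2:
        print(fruit_1)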

The other modification could be to change all to any, which returns true if any of the iterable's items is true. Then your code would look like this:

for fruit_1 in list_1:
    if any(fruit_1 == fruit_2 for fruit_2 in list_2):
        print(fruit_1)
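To make the substring point concrete, here is a small sketch (the two-item list_2 is made up for illustration): applied to two strings, in is a substring test, while applied to a string and a list it is an element-wise equality test.

```python
fruit_1 = 'berry'
list_2 = ['strawberry', 'melon']  # illustrative data, not from the question

# str in str: substring test against each element.
substring_hits = [fruit_1 in fruit_2 for fruit_2 in list_2]
# str == str: exact comparison against each element.
equality_hits = [fruit_1 == fruit_2 for fruit_2 in list_2]

print(substring_hits)       # [True, False]
print(equality_hits)        # [False, False]
print(any(substring_hits))  # True: 'berry' is inside 'strawberry'
print(any(equality_hits))   # False
print(fruit_1 in list_2)    # False: list membership compares whole items
```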

2017-04-26 16:10:47

I was able to achieve the inversion by doing a true/false evaluation.

list_1 = ['apple', 
      'pear', 
      'orange', 
      'kiwi', 
      'strawberry', 
      'banana'] 

list_2 = ['kiwi', 
      'melon', 
      'grape', 
      'pear'] 

# DOES exist
for fruit_1 in list_1:
    if all(fruit_1 not in fruit_2 for fruit_2 in list_2) is False:
        print(fruit_1)

print('\n')

# DOES NOT exist
for fruit_1 in list_1:
    if all(fruit_1 not in fruit_2 for fruit_2 in list_2) is True:
        print(fruit_1)
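The reason the is False comparison works can be sketched with De Morgan's law: "not all of the items differ" is the same statement as "at least one item matches", so negating all() gives the same result as any(). The sketch below deliberately swaps the substring test for == and != (an assumption on my part, to avoid accidental substring matches); the variable names negated_all, direct_any and found are illustrative.

```python
list_1 = ['apple', 'pear', 'orange', 'kiwi', 'strawberry', 'banana']
list_2 = ['kiwi', 'melon', 'grape', 'pear']

found = []
for fruit_1 in list_1:
    # "It is not the case that fruit_1 differs from every item in list_2" ...
    negated_all = not all(fruit_1 != fruit_2 for fruit_2 in list_2)
    # ... is the same as "fruit_1 equals at least one item in list_2".
    direct_any = any(fruit_1 == fruit_2 for fruit_2 in list_2)
    assert negated_all == direct_any
    if direct_any:
        found.append(fruit_1)

print(found)  # ['pear', 'kiwi']
```

Both all() and any() stop at the first element that decides the result, so neither form scans more of list_2 than the other.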

2017-04-26 16:20:06 MBasith

I recommend pandas, which works well with large-scale data.

Install it with pip:

pip install pandas

Then you can do something like this:

import pandas as pd 

s1 = pd.Index(list_1) 
s2 = pd.Index(list_2) 

exists = s1.intersection(s2) 
does_not_exist = s1.difference(s2)

Now you will see the magic if you run print(exists).

See the pandas docs.

2017-04-26 16:21:43 Lodour

The problem with the code is how the all() function gets evaluated. To break it down a bit more simply:

## DOES EXIST
print(all('kiwi' in fruit_2 for fruit_2 in ['pear', 'kiwi']))
print(all('pear' in fruit_2 for fruit_2 in ['pear', 'kiwi']))

evaluates to

False 
False

Whereas if you do something like this

# DOES NOT EXIST
print(all('apple' not in fruit_2 for fruit_2 in ['pear', 'kiwi']))
print(all('pear' not in fruit_2 for fruit_2 in ['pear', 'kiwi']))

it evaluates to

True 
False

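I can't pin down exactly why this is, but it probably comes from how all() works: it returns True if all elements of the iterable are true, and False otherwise.

In any case, I think using any() instead of all() for the DOES EXIST part will work.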

print("DOES NOT EXIST")
for fruit_1 in list_1:
    # print(all(fruit_1 not in fruit_2 for fruit_2 in list_2))
    if all(fruit_1 not in fruit_2 for fruit_2 in list_2):
        print(fruit_1)

print("\nDOES EXIST")
for fruit_1 in list_1:
    if any(fruit_1 in fruit_2 for fruit_2 in list_2):
        print(fruit_1)

DOES NOT EXIST 
apple 
orange 
strawberry 
banana 

DOES EXIST 
pear 
kiwi

2017-04-26 17:15:19 ThatsANo

Here is a solution that uses pandas.read_csv with memory-mapped files:

import pandas as pd 

list1 = pd.read_csv('list1.txt', dtype=str, header=None, memory_map=True) 
list2 = pd.read_csv('list2.txt', dtype=str, header=None, memory_map=True) 

exists = pd.merge(list1, list2, how='inner', on=0) 
for fruit in exists[0].tolist():
    print(fruit)

The list1.txt and list2.txt files contain the strings from the question, one string per line.

Output

pear 
kiwi

I don't have any really large files to experiment with, so I have no performance measurements.

2017-04-26 17:23:36
