How to compare two CSV files in Python 3

I have the following code:
import csv
import subprocess
from subprocess import check_output
# Writing the pacman command output to file in csv format
sysApps = check_output(["pacman", "-Qn"])
sysAppsCSV = csv.DictReader(sysApps.decode('ascii').splitlines(),
                            delimiter=' ', skipinitialspace=True,
                            fieldnames=['name', 'version'])  # Thanks to https://stackoverflow.com/a/8880768/5565713 jcollado
with open('pacman.csv', 'w') as csvfile:
    rows_sys = csv.writer(csvfile)
    rows_sys.writerow(sysAppsCSV)
# Writing the pip command output in csv format
pipApps = check_output(["pip", "list"])
pipAppsCSV = csv.DictReader(pipApps.decode('ascii').splitlines(),
                            delimiter=' ', skipinitialspace=True,
                            fieldnames=['name', 'version'])  # Thanks to https://stackoverflow.com/a/8880768/5565713 jcollado
with open('pip.csv', 'w') as csvfile:
    rows_pip = csv.writer(csvfile)
    rows_pip.writerow(pipAppsCSV)
# Comparing the files
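I want to compare the two outputs. They do not have to be files; the contents of the variables already created would work too. The result should be the difference as seen from pip.csv: in practice, I want to know what is in pip.csv but not in pacman.csv. The example here does not apply to my case, but I would like to output the results in a similar way, listing the name and version.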
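The comparison described above could be sketched roughly as follows. This is only an illustration, not the asker's code: the helper names `parse_packages` and `only_in_first` are hypothetical, the sample text is made up, and it assumes each line of output has the form "name version" separated by whitespace.

```python
def parse_packages(text):
    """Turn 'name version' lines into a {name: version} dict."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:  # skip blank or malformed lines
            packages[parts[0]] = parts[1]
    return packages


def only_in_first(first, second):
    """Return the entries of `first` whose names are missing from `second`."""
    missing = sorted(first.keys() - second.keys())
    return {name: first[name] for name in missing}


# Made-up sample data standing in for the real command output.
pip_text = "requests 2.18.4\nimgurpython 1.1.7\nsix 1.11.0\n"
pacman_text = "six 1.11.0-1\nbash 4.4.012-2\n"

pip_only = only_in_first(parse_packages(pip_text), parse_packages(pacman_text))
for name, version in pip_only.items():
    print(name, version)
```

Comparing by name only is a deliberate choice in this sketch: the version strings in the two sample inputs are formatted differently, so comparing whole lines would report every package as different.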
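Edit: @Greg Sadetsky, thanks for your suggestion. I used your example to simplify my code, but it does not solve my problem, because I cannot compare the lists that way. I have made some progress, but I still do not get the desired output: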
import csv
import subprocess
from subprocess import check_output
#Initializing variables
results_sys = ""
results_pip = ""
# Running the linux commands
sys_apps = set(check_output(["pacman", "-Qn"]).splitlines())
pip_apps = set(check_output(["pip", "list"]).splitlines())
# Saving the outputs of the commands in to a CSV format
for row in sys_apps:
    result = row.decode('ascii').split(sep=" ")
    with open('pacman.csv', 'a') as csvfile:
        rows_sys = csv.writer(csvfile)
        rows_sys.writerow(result)
for row in pip_apps:
    result = row.decode('ascii').split(sep=" ")
    with open('pip.csv', 'a') as csvfile:
        rows_sys = csv.writer(csvfile)
        rows_sys.writerow(result)
# Opening the files and comparing the results
with open('pacman.csv', 'r') as pacmanCSV:
    sys_apps = pacmanCSV.readlines()
    for row in sys_apps:
        apps = row.split(",")
        results_sys = results_sys + " " + apps[0]
with open('pip.csv', 'r') as pipCSV:
    pip_apps = pipCSV.readlines()
    for row in pip_apps:
        apps = row.split(",")
        results_pip = results_pip + " " + apps[0]
results_final = "List of apps installed from pip:\n################################"
for val in results_pip:
    if val not in results_sys:
        results_final = results_final + "\n" + val
print(results_final)
When I run this code, I only get some capital letters, for example: Imgur
OK, so after reading about sets I did this:
r1 = set(results_pip)
r2 = set(results_sys)
print(r1 - r2)
But I get a similar result: only the capitalized first letters show up.
http://stackoverflow.com/questions/15864641/python-difflib-comparing-files http://stackoverflow.com/questions/977491/comparing-2-txt-files-using-difflib-in-python – Sam
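The problem is that 'results_sys' and 'results_pip' are both strings that you keep appending to (i.e. 'results_sys + " " + apps[0]'). If you iterate over the string with 'for val in results_pip', you walk through it one character at a time... which is not what you want to do. I will edit my answer with a solution for your new version.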
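The point in the comment above can be seen in a small sketch. The package names here are made up for illustration; the sketch only contrasts iterating over (or building a set from) a concatenated string with doing the same on a list of names.

```python
names_pip = ["Imgur", "six"]   # hypothetical names found by pip
names_sys = ["six"]            # hypothetical names found by pacman

# Building one long string, as in the question's code.
results_pip = ""
for name in names_pip:
    results_pip = results_pip + " " + name
results_sys = ""
for name in names_sys:
    results_sys = results_sys + " " + name

# Iterating a string yields single characters, not names.
chars = [val for val in results_pip]

# set() on a string is a set of characters, so the difference is letters.
char_diff = set(results_pip) - set(results_sys)

# Keeping the names in sets compares whole names instead.
name_diff = set(names_pip) - set(names_sys)

print(chars)
print(sorted(char_diff))
print(name_diff)
```

With the names kept as list or set elements, the difference contains whole package names rather than individual letters.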