Getting the line numbers of changed lines

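Given two text files A and B, what is a simple way to get the line numbers of the lines in B that are not in A? I see there is difflib, but I can't find an interface in it for retrieving line numbers.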

2012-02-29 Yaroslav Bulatov

Are you only looking for lines in file B that aren't in A? Does the order of the lines matter? – 2012-02-29 20:08:21

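difflib can give you a [unified diff](http://docs.python.org/library/difflib.html#difflib.unified_diff). In [that format](http://en.wikipedia.org/wiki/Diff#Unified_format), each '@@ -l,s +l,s @@' header gives the starting line and line count of the removed range, followed by the starting line and line count of the added range. – 2012-02-29 21:06:13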
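A minimal sketch of that idea, assuming zero context lines (`n=0`) so that the "+" range of each hunk header covers only added or changed lines. The function name `added_line_numbers` and the regular expression are illustrative, not part of difflib:

```python
import difflib
import re

# Hunk headers look like "@@ -start,count +start,count @@"; difflib omits
# ",count" when the count is 1.
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")

def added_line_numbers(lines_a, lines_b):
    """Return 1-based line numbers in lines_b that the unified diff marks as added."""
    numbers = []
    # n=0 drops context lines, so each "+" range holds only added lines.
    for line in difflib.unified_diff(lines_a, lines_b, n=0):
        match = HUNK_RE.match(line)
        if match:
            start = int(match.group(1))
            count = int(match.group(2)) if match.group(2) is not None else 1
            numbers.extend(range(start, start + count))
    return numbers

a = ["this\n", "is\n", "a\n", "bunch\n", "of\n", "lines\n"]
b = ["this\n", "is\n", "a\n", "different\n", "bunch\n", "of\n", "other\n", "lines\n"]
print(added_line_numbers(a, b))
```

A hunk that only deletes lines has a "+" count of 0, so it contributes nothing to the result.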

Yes, order matters. Basically difflib already implements the smart diffing, just not the line-number part. – 2012-02-29 21:06:35

difflib can give you what you need. Assume:

a.txt

this 
is 
a 
bunch 
of 
lines

b.txt

this 
is 
a 
different 
bunch 
of 
other 
lines

Code like this:

import difflib

fileA = open("a.txt", "rt").readlines()
fileB = open("b.txt", "rt").readlines()

d = difflib.Differ()
diffs = d.compare(fileA, fileB)
lineNum = 0

for line in diffs:
    # split off the two-character code that Differ puts in front of each line
    code = line[:2]
    # if the line is in both files or just b, increment the line number.
    if code in ("  ", "+ "):
        lineNum += 1
    # if this line is only in b, print the line number and the text on the line
    if code == "+ ":
        print("%d: %s" % (lineNum, line[2:].strip()))

gives output like:

~/temp: python diffy.py
4: different 
7: other

You'll also want to look at difflib's "? " code and decide how you want to handle it.

(Also, in real code you'd want to use context managers to make sure the files get closed, and so on.)
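For what it's worth, here is one way those two notes could look in practice. This is only a sketch: the function name `lines_only_in_b` is made up for illustration, and it simply skips the "? " hint lines, which may or may not be what you want.

```python
import difflib

def lines_only_in_b(path_a, path_b):
    """Yield (line_number, text) for lines that Differ marks as added in path_b."""
    # The with-block closes both files even if reading raises.
    with open(path_a) as fa, open(path_b) as fb:
        lines_a = fa.readlines()
        lines_b = fb.readlines()

    line_num = 0
    for entry in difflib.Differ().compare(lines_a, lines_b):
        code = entry[:2]
        if code == "? ":
            # Hint lines belong to neither file; skip them explicitly.
            continue
        if code in ("  ", "+ "):
            # Lines common to both files or only in b advance b's line counter.
            line_num += 1
        if code == "+ ":
            yield line_num, entry[2:].rstrip("\n")

# Usage (assumes a.txt and b.txt exist in the current directory):
#     for num, text in lines_only_in_b("a.txt", "b.txt"):
#         print("%d: %s" % (num, text))
```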


2012-02-29 21:13:49 bgporter

Thanks! By the way, this is 1-based, so to get the modified line from fileB, use lineNum-1 as the index. – 2012-02-29 21:37:17

Yes, it should come out correctly 1-based, since it increments before printing. Right? – bgporter 2012-02-29 21:43:03
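If 0-based indices into fileB are what you're after, `difflib.SequenceMatcher.get_opcodes()` reports them directly, so there is no counter to maintain. A rough sketch, with `changed_indices_in_b` being a hypothetical helper name:

```python
import difflib

def changed_indices_in_b(lines_a, lines_b):
    """Return 0-based indices of lines in lines_b that were inserted or replaced."""
    matcher = difflib.SequenceMatcher(None, lines_a, lines_b)
    indices = []
    # Each opcode is (tag, i1, i2, j1, j2); lines_b[j1:j2] is the affected slice of b.
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            indices.extend(range(j1, j2))
    return indices

a = ["this\n", "is\n", "a\n", "bunch\n", "of\n", "lines\n"]
b = ["this\n", "is\n", "a\n", "different\n", "bunch\n", "of\n", "other\n", "lines\n"]
print(changed_indices_in_b(a, b))
```

Adding 1 to each index gives the 1-based line numbers printed by the answer's loop.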

A poor man's solution:

with open('A.txt') as f:
    linesA = f.readlines()

with open('B.txt') as f:
    linesB = f.readlines()

# 0-based indices of the lines in B whose text appears nowhere in A
print([k for k, v in enumerate(linesB) if v not in linesA])


2012-02-29 20:10:45

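This has quadratic running time and doesn't take the order of the lines into account. Imagine A's lines are '1', '2' and '3', and B's lines are '1', '2', '3', '2': your code won't produce any output. Also note that the question asks for lines that are *not* in A. – 2012-02-29 20:19:32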

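Fixed it to report lines that are not in A. Yes, the running time is pretty bad, hence "poor man's solution". And yes, good point that this code doesn't respect the order of the lines. – 2012-02-29 20:22:53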
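The quadratic part, at least, is easy to avoid by putting A's lines in a set first. This is just a sketch (the name `indices_not_in_a` is made up here), and it still ignores line order, so the '1', '2', '3', '2' case above still yields nothing:

```python
def indices_not_in_a(lines_a, lines_b):
    """Return 0-based indices of lines in lines_b whose text never occurs in lines_a."""
    # Set membership is O(1) on average, so the whole scan is roughly linear.
    seen = set(lines_a)
    return [k for k, v in enumerate(lines_b) if v not in seen]

print(indices_not_in_a(["1", "2", "3"], ["1", "2", "3", "2"]))
```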
