Pretty-printing assertEqual() for HTML strings

I want to compare two strings containing HTML in a Python unittest.

Is there a way to output the result in a human-friendly (diff) form?

2011-11-04 guettli

Since version 1.4, Django has assertHTMLEqual: http://docs.djangoproject.com/en/dev/topics/testing/#django.test.SimpleTestCase.assertHTMLEqual – guettli

I (the one who asked this question) now use BeautifulSoup:

def assertEqualHTML(string1, string2, file1='', file2=''):
    u'''
    Compare two unicode strings containing HTML.
    A human friendly diff goes to logging.error() if they
    are not equal, and an exception gets raised.
    '''
    # BeautifulSoup 3 import; with bs4 use "from bs4 import BeautifulSoup as bs".
    from BeautifulSoup import BeautifulSoup as bs
    import difflib
    import logging

    def short(mystr):
        # Truncate long strings so the error message stays readable.
        max_len = 20
        if len(mystr) > max_len:
            return mystr[:max_len]
        return mystr

    p = []
    for mystr, name in [(string1, file1), (string2, file2)]:
        # Python 2 check; on Python 3 test against str instead of unicode.
        if not isinstance(mystr, unicode):
            raise Exception(u'string is not unicode: %r %s' % (short(mystr), name))
        soup = bs(mystr)
        pretty = soup.prettify()
        p.append(pretty)
    if p[0] != p[1]:
        # Log the unified diff line by line, then fail.
        for line in difflib.unified_diff(p[0].splitlines(), p[1].splitlines(),
                                         fromfile=file1, tofile=file2, lineterm=''):
            logging.error(line)
        raise Exception('Not equal %s %s' % (file1, file2))
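To give an idea of what the function above logs, here is a minimal stdlib-only sketch of just the diffing step, written for Python 3. The two strings stand in for output of soup.prettify(); the helper name html_diff_lines and the file names expected.html and actual.html are made up for illustration and are not part of the original code.

```python
import difflib


def html_diff_lines(pretty1, pretty2, file1='expected.html', file2='actual.html'):
    """Return unified-diff lines for two already-prettified HTML strings."""
    # lineterm='' because splitlines() has already removed the newlines.
    return list(difflib.unified_diff(
        pretty1.splitlines(), pretty2.splitlines(),
        fromfile=file1, tofile=file2, lineterm=''))


# Stand-ins for prettified HTML (assumed input, one tag or text run per line).
expected = "<div>\n <p>\n  Hello\n </p>\n</div>"
actual = "<div>\n <p>\n  Goodbye\n </p>\n</div>"

diff_lines = html_diff_lines(expected, actual)
for line in diff_lines:
    print(line)
```

Lines starting with - come from the first string and lines starting with + from the second, so the changed text node stands out even in a large document.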

2011-11-10 08:50:55 guettli

Maybe this is a rather "verbose" solution. You can add a new "equality function" for a user-defined type (for example HTMLString), which you have to define first:

class HTMLString(str): 
    pass

Now you have to define a type equality function:

def assertHTMLStringEqual(first, second, msg=None):
    # unittest calls type equality functions with a msg keyword argument.
    if first != second:
        message = ... # TODO here: format your message, e.g a diff
        raise AssertionError(message)

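All you have to do is format the message to your liking. You can also use a method of your specific TestCase as the type equality function. That gives you more facilities for formatting the message, since unittest.TestCase already does a lot of this.

Now you have to register this equality function in your unittest.TestCase: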

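... 
def __init__(self, *args, **kwargs):
    # Forward the arguments unittest passes in (e.g. methodName).
    unittest.TestCase.__init__(self, *args, **kwargs)
    self.addTypeEqualityFunc(HTMLString, assertHTMLStringEqual)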

The same for a method of the class:

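... 
def __init__(self, *args, **kwargs):
    # Forward the arguments unittest passes in (e.g. methodName).
    unittest.TestCase.__init__(self, *args, **kwargs)
    # A string is looked up as a method name on the test case instance.
    self.addTypeEqualityFunc(HTMLString, 'assertHTMLStringEqual')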

Now you can use it in your tests:

def test_something(self): 
    htmlstring1 = HTMLString(...) 
    htmlstring2 = HTMLString(...) 
    self.assertEqual(htmlstring1, htmlstring2)

This should work with Python 2.7.
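Putting the pieces of this answer together, a minimal sketch could look like the following. It assumes Python 3 and fills the message with a difflib unified diff, which is only one possible way to format it; the class names HTMLTestCase and ExampleTest and the sample markup are hypothetical.

```python
import difflib
import unittest


class HTMLString(str):
    """Marker type: assertEqual() picks its comparison by exact type."""


class HTMLTestCase(unittest.TestCase):  # hypothetical base class name
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Register the method below for comparisons of two HTMLString objects.
        self.addTypeEqualityFunc(HTMLString, self.assertHTMLStringEqual)

    def assertHTMLStringEqual(self, first, second, msg=None):
        """Fail with a unified diff when the two HTML strings differ."""
        if first == second:
            return
        diff = '\n'.join(difflib.unified_diff(
            first.splitlines(), second.splitlines(),
            fromfile='first', tofile='second', lineterm=''))
        raise self.failureException(msg or 'HTML strings differ:\n' + diff)


class ExampleTest(HTMLTestCase):
    def test_something(self):
        # Sample markup, made up for this sketch.
        html1 = HTMLString('<ul>\n<li>one</li>\n</ul>')
        html2 = HTMLString('<ul>\n<li>one</li>\n</ul>')
        self.assertEqual(html1, html2)
```

Saved in a test module, this can be run with python -m unittest. Note that the custom comparison is only used when both arguments are exactly of type HTMLString; a plain str on either side falls back to the default comparison.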

2011-11-04 09:42:10 Constantinius

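A few years ago I submitted a patch to do exactly this. The patch was rejected, but you can still view it on the Python bug list.

I doubt you want to hack your unittest.py to apply the patch (if it even still works after all this time), but here is the function that reduces two strings to a manageable size while still keeping at least part of what differs. As long as you don't need the complete diff, this might be what you want: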

def shortdiff(x, y):
    '''shortdiff(x,y)

    Compare strings x and y and display differences.
    If the strings are too long, shorten them to fit
    in one line, while still keeping at least some difference.
    '''
    import difflib
    LINELEN = 79

    def limit(s):
        if len(s) > LINELEN:
            return s[:LINELEN-3] + '...'
        return s

    def firstdiff(s, t):
        # Scan in chunks of `span` characters, then locate the exact index.
        # Returns None when the strings are identical.
        span = 1000
        for pos in range(0, max(len(s), len(t)), span):
            if s[pos:pos+span] != t[pos:pos+span]:
                for index in range(pos, pos+span):
                    if s[index:index+1] != t[index:index+1]:
                        return index

    left = LINELEN // 4  # integer division, so the value can be used in slices
    index = firstdiff(x, y)
    if index is not None and index > left + 7:
        x = x[:left] + '...' + x[index-4:index+LINELEN]
        y = y[:left] + '...' + y[index-4:index+LINELEN]
    else:
        x, y = x[:LINELEN+1], y[:LINELEN+1]
        left = 0

    cruncher = difflib.SequenceMatcher(None)
    xtags = ytags = ""
    cruncher.set_seqs(x, y)
    editchars = {'replace': ('^', '^'),
                 'delete': ('-', ''),
                 'insert': ('', '+'),
                 'equal': (' ', ' ')}
    for tag, xi1, xi2, yj1, yj2 in cruncher.get_opcodes():
        lx, ly = xi2 - xi1, yj2 - yj1
        edits = editchars[tag]
        xtags += edits[0] * lx
        ytags += edits[1] * ly

    # Include ellipsis in edits line.
    if left:
        xtags = xtags[:left] + '...' + xtags[left+3:]
        ytags = ytags[:left] + '...' + ytags[left+3:]

    diffs = [x, xtags, y, ytags]
    if max([len(s) for s in diffs]) < LINELEN:
        return '\n'.join(diffs)

    diffs = [limit(s) for s in diffs]
    return '\n'.join(diffs)

2011-11-04 10:48:19 Duncan

A simple approach is to strip whitespace from the HTML and split it into a list. Python 2.7's unittest (or the backported unittest2) then gives a human-readable diff between the lists.

import re

def split_html(html):
    # One list entry per line, with surrounding whitespace removed.
    return re.split(r'\s*\n\s*', html.strip())

def test_render_html(self):
    expected = ['<div>', '...', '</div>']
    got = split_html(render_html())
    self.assertEqual(expected, got)

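If I'm writing a test for code that already works, I usually start with expected = [], insert self.maxDiff = None before the assertion, and let the test fail once. The expected list can then be copied and pasted from the test output.

You may need to adjust how whitespace is stripped depending on what your HTML looks like.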
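As one possible adjustment, the sketch below (Python 3) splits at tag boundaries instead of at line breaks, so that two renderings of the same markup with different indentation produce the same list. This is a variation on split_html, not the version from the answer, and the regex is a deliberate simplification: it assumes no ">" inside attribute values, no comments or scripts, and no whitespace-sensitive elements such as pre.

```python
import re


def split_html(html):
    """Split HTML into one entry per tag or text run, ignoring layout whitespace."""
    # Put every tag on its own line, wherever the original line breaks were.
    html = re.sub(r'\s*(<[^>]+>)\s*', r'\n\1\n', html)
    # Drop empty lines and any indentation left around text runs.
    return [line.strip() for line in html.split('\n') if line.strip()]


# Two renderings of the same markup (sample input made up for this sketch).
compact = '<div>  <p>Hello world</p>\n</div>'
indented = '<div>\n    <p>\n        Hello world\n    </p>\n</div>\n'

print(split_html(compact))
print(split_html(compact) == split_html(indented))
```

Because each tag and text run becomes its own list element, the list diff that assertEqual prints on failure points at the individual element that changed.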

2011-11-24 08:37:12 akaihola
