2017-04-04 58 views
1

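Creating a custom Counter object with a special set function

In "Adding a single character to add keys in Counter", @AshwiniChaudhary gave a great answer that creates a new Counter object with a different __setitem__() function: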

from collections import Counter

class CustomCounter(Counter):
    def __setitem__(self, key, value):
        if len(key) > 1 and not key.endswith(u"\uE000"):
            key += u"\uE000"
        super(CustomCounter, self).__setitem__(key, value)

To allow a user-defined character/str to be appended to the key, I've tried:

from collections import Counter, defaultdict

class AppendedStrCounter(Counter):
    def __init__(self, str_to_append):
        self._appended_str = str_to_append
        super(AppendedStrCounter, self).__init__()
    def __setitem__(self, key, value):
        if len(key) > 1 and not key.endswith(self._appended_str):
            key += self._appended_str
        super(AppendedStrCounter, self).__setitem__(tuple(key), value)

But it returns an empty counter:

>>> class AppendedStrCounter(Counter): 
...  def __init__(self, str_to_append): 
...   self._appended_str = str_to_append 
...   super(AppendedStrCounter, self).__init__() 
...  def __setitem__(self, key, value): 
...   if len(key) > 1 and not key.endswith(self._appended_str): 
...    key += self._appended_str 
...   super(AppendedStrCounter, self).__setitem__(tuple(key), value) 
... 
>>> AppendedStrCounter('foo bar bar blah'.split()) 
AppendedStrCounter() 

That's because I'm missing the iter argument in __init__():

from collections import Counter, defaultdict

class AppendedStrCounter(Counter):
    def __init__(self, iter, str_to_append):
        self._appended_str = str_to_append
        super(AppendedStrCounter, self).__init__(iter)
    def __setitem__(self, key, value):
        if len(key) > 1 and not key.endswith(self._appended_str):
            key += self._appended_str
        super(AppendedStrCounter, self).__setitem__(tuple(key), value)

[out]:

>>> AppendedStrCounter('foo bar bar blah'.split(), u'\ue000') 
AppendedStrCounter({('f', 'o', 'o', '\ue000'): 1, ('b', 'a', 'r', '\ue000'): 1, ('b', 'l', 'a', 'h', '\ue000'): 1}) 

The value for 'bar' is wrong: it should be 2, not 1.

Is passing iter through __init__() the right way to initialize the Counter?

+2

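You're assuming that the superclass constructor uses __setitem__ for each item it adds, but there's no guarantee that it must. https://docs.python.org/2/library/collections.html#collections.Counter only promises how its constructor will behave, not how it is implemented. – amalloy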

+2

Look carefully at @AshwiniChaudhary's referenced answer. In his answer, the count for the key "the" is also 1 instead of 2. – Felix

+0

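Changing the way keys are stored could lead to some nasty surprises... For example, no one can store a count for 'word\ue000' separately from 'word' in CustomCounter. Also, how do they get at a specific word? The user has to remember to ask for cc['word\ue000'] whenever they want cc['word'], which completely defeats the OOP goal of encapsulation. –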

Answers

1

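As Felix's comment pointed out, collections.Counter does not document how its __init__ method adds keys or sets values, only that it does. Since it is not explicitly designed for subclassing, the wisest thing to do is not to subclass it.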
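To make the failure in the question concrete, here is a minimal sketch. It assumes the counting step reads the old count with the raw key and then writes the new count back through __setitem__; that is only one plausible implementation, not documented behaviour. SuffixCounter and MARKER are hypothetical names standing in for the question's subclass.

```python
from collections import Counter

MARKER = u"\uE000"  # same private-use character as in the question


class SuffixCounter(Counter):
    """Hypothetical stand-in for the question's subclass."""

    def __setitem__(self, key, value):
        if len(key) > 1 and not key.endswith(MARKER):
            key += MARKER
        super().__setitem__(key, value)


counter = SuffixCounter()
for word in ['bar', 'bar']:
    # Assumed counting step: look up the raw key, then store via __setitem__.
    old = counter.get(word, 0)  # always 0, because only 'bar' + MARKER is stored
    counter[word] = old + 1     # lands on 'bar' + MARKER with the value 1

print(dict(counter))
```

Under that assumption the lookup never finds the mutated key, so every occurrence overwrites the stored count with 1 instead of incrementing it.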

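The collections.abc module exists to provide easily subclassed abstract versions of Python's built-in types, including dict (MutableMapping, in ABC terms). So, if all you need is "a Counter-like class" (as opposed to "a subclass of Counter that will satisfy built-ins like isinstance and issubclass"), you can create your own MutableMapping that has-a Counter, and then "man in the middle" the initializer and the three methods that Counter adds to a typical dict.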

import collections 
import collections.abc 


def _identity(s): 
    ''' 
    Default mutator function. 
    ''' 
    return s 


class CustomCounter(collections.abc.MutableMapping):
    '''
    Overrides the 5 methods of a MutableMapping:
    __getitem__, __setitem__, __delitem__, __iter__, __len__

    ...and the 3 non-Mapping methods of Counter:
    elements, most_common, subtract
    '''

    def __init__(self, values=None, *, mutator=_identity):
        self._mutator = mutator
        if values is None:
            self._counter = collections.Counter()
        else:
            # Mutate every value before the wrapped Counter sees it.
            values = (self._mutator(v) for v in values)
            self._counter = collections.Counter(values)

    def __getitem__(self, item):
        return self._counter[self._mutator(item)]

    def __setitem__(self, item, value):
        self._counter[self._mutator(item)] = value

    def __delitem__(self, item):
        del self._counter[self._mutator(item)]

    def __iter__(self):
        return iter(self._counter)

    def __len__(self):
        return len(self._counter)

    def __repr__(self):
        return ''.join([
            self.__class__.__name__,
            '(',
            repr(dict(self._counter)),
            ')'
            ])

    def elements(self):
        return self._counter.elements()

    def most_common(self, n=None):
        # n=None mirrors Counter.most_common, which then returns all items.
        return self._counter.most_common(n)

    def subtract(self, values):
        # Mutate keys (mapping) or elements (iterable) before delegating.
        if isinstance(values, collections.abc.Mapping):
            values = {self._mutator(k): v for k, v in values.items()}
        else:
            values = (self._mutator(v) for v in values)
        self._counter.subtract(values)

def main():
    def mutator(s):
        # Asterisks are easier to print than '\ue000'.
        return '*' + s + '*'

    words = 'the lazy fox jumps over the brown dog'.split()

    # Test None (allowed by collections.Counter).
    ctr_none = CustomCounter(None)
    assert 0 == len(ctr_none)

    # Test typical dict and collections.Counter methods.
    ctr = CustomCounter(words, mutator=mutator)
    print(ctr)
    assert 1 == ctr['dog']
    assert 2 == ctr['the']
    assert 7 == len(ctr)
    del ctr['lazy']
    assert 6 == len(ctr)
    ctr.subtract(['jumps', 'dog'])
    assert 0 == ctr['dog']
    assert 6 == len(ctr)
    ctr.subtract({'the': 5, 'bogus': 100})
    assert -3 == ctr['the']
    assert -100 == ctr['bogus']
    assert 7 == len(ctr)


if "__main__" == __name__: 
    main() 

Output (line-wrapped, for readability):

CustomCounter({ 
    '*brown*': 1, 
    '*lazy*': 1, 
    '*the*': 2, 
    '*over*': 1, 
    '*jumps*': 1, 
    '*fox*': 1, 
    '*dog*': 1 
    }) 

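I added a keyword argument, mutator, to the initializer. It stores the function that converts real-world whatevers into the "mutated" versions that actually get counted. Note that this could mean CustomCounter no longer stores "hashable objects", but rather "hashable objects that don't make the mutator barf".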
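As a small illustration of that last caveat, the sketch below uses a hypothetical star_mutator in the style of the example's mutator, which assumes str input. A plain Counter accepts any hashable object, but the mutator rejects anything it cannot concatenate with a string.

```python
from collections import Counter


def star_mutator(s):
    # Hypothetical mutator; it silently assumes that s is a str.
    return '*' + s + '*'


# A plain Counter is happy with any hashable element.
plain = Counter(['the', 42, ('a', 'b')])

# The mutator is pickier: a non-str element raises before anything is counted.
try:
    star_mutator(42)
except TypeError as exc:
    error = exc
else:
    error = None

print(plain[42], type(error).__name__)
```

Any wrapper that runs its mutator on every key inherits the same restriction.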

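Also, if the standard library's Counter ever gets new methods, you'll have to update CustomCounter to "override" them. (You could maybe work around that by using __getattr__ to pass any unknown attributes through to self._counter, but any keys in the arguments would be handed to the Counter in their raw, "un-mutated" form.)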
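A rough sketch of that __getattr__ workaround and its pitfall follows. DelegatingCounter is a hypothetical, trimmed-down variant of the class above that deliberately leaves out most_common and subtract, so that both reach the wrapped Counter through __getattr__.

```python
import collections
import collections.abc


class DelegatingCounter(collections.abc.MutableMapping):
    """Hypothetical wrapper: mutates keys, forwards unknown attributes."""

    def __init__(self, values=(), *, mutator=lambda s: s):
        self._mutator = mutator
        self._counter = collections.Counter(mutator(v) for v in values)

    def __getitem__(self, item):
        return self._counter[self._mutator(item)]

    def __setitem__(self, item, value):
        self._counter[self._mutator(item)] = value

    def __delitem__(self, item):
        del self._counter[self._mutator(item)]

    def __iter__(self):
        return iter(self._counter)

    def __len__(self):
        return len(self._counter)

    def __getattr__(self, name):
        # Only called when normal lookup fails. Private names are refused so
        # that a missing _counter cannot trigger endless recursion.
        if name.startswith('_'):
            raise AttributeError(name)
        return getattr(self._counter, name)


ctr = DelegatingCounter(['the', 'dog', 'the'], mutator=lambda s: '*' + s + '*')
top = ctr.most_common(1)   # forwarded to the wrapped Counter
ctr.subtract(['dog'])      # forwarded too, but 'dog' arrives un-mutated
```

After the forwarded subtract call, ctr['dog'] is still 1, while the wrapped Counter has gained a stray 'dog' entry with a count of -1 next to '*dog*'.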

Finally, as I noted earlier, it is not actually a subclass of collections.Counter, in case other code is looking specifically for one.