2011-12-31

Objects with circular references are not garbage collected

I use a lot of small convenience classes like the following in my code:

class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        self.__dict__ = self

The nice thing about it is that you can access values either with dictionary key syntax or in the usual object attribute style:

myStructure = Structure(name="My Structure") 
print myStructure["name"] 
print myStructure.name 
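
To make the self-reference behind this idiom explicit, here is a minimal sketch. It is written to run on both Python 2.7 and Python 3, which is an assumption beyond the original snippet, and the `size` attribute is made up for illustration. It checks that the instance is its own attribute dictionary, which is exactly what creates the reference cycle:

```python
class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        # The instance becomes its own attribute dictionary,
        # so it now holds a reference to itself.
        self.__dict__ = self


s = Structure(name="My Structure")

# The attribute dictionary and the instance are one and the same object.
is_own_dict = s.__dict__ is s

# A value set through attribute syntax is visible through key syntax.
s.size = 3  # "size" is a hypothetical attribute
size_by_key = s["size"]
```

Because `s.__dict__` is `s`, every instance keeps itself alive as far as plain reference counting is concerned.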

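Today I noticed that my application's memory consumption was slightly increasing in a situation where I expected it to decrease. It looks to me as if instances created from the Structure class are not garbage collected. Here is a small snippet that illustrates this: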

import gc 

class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        self.__dict__ = self

structures = [Structure(name="__{0}".format(str(value))) for value in range(4096)] 
print "Structure name: ", structures[16].name 
print "Structure name: ", structures[16]["name"] 
del structures 
gc.collect() 
print "Structures count: ", len([obj for obj in gc.get_objects() if type(obj) is Structure]) 

With the following output:

Structure name: __16 
Structure name: __16 
Structures count: 4096 

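As you can see, the Structure instance count is still 4096.

I then commented out the line that creates the convenient self-reference: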

import gc 

class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        # self.__dict__ = self

structures = [Structure(name="__{0}".format(str(value))) for value in range(4096)] 
# print "Structure name: ", structures[16].name 
print "Structure name: ", structures[16]["name"] 
del structures 
gc.collect() 
print "Structures count: ", len([obj for obj in gc.get_objects() if type(obj) is Structure]) 

Now that the circular reference is removed, the output makes sense:

Structure name: __16 
Structures count: 0 
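
The comparison above can be wrapped in a small helper so both variants run side by side. This is only a sketch: `count_instances`, `leftover_after_collect`, `PlainStructure` and `CyclicStructure` are names invented here, and it avoids `print` so that it runs on Python 2.7 and Python 3 alike. The question reports that the cyclic variant leaves all 4096 instances behind on Python 2.7.2; other interpreter versions may behave differently, so the sketch makes no claim about that number.

```python
import gc


def count_instances(cls):
    """Count objects of exactly type cls that the collector tracks."""
    return sum(1 for obj in gc.get_objects() if type(obj) is cls)


class PlainStructure(dict):
    """Hypothetical variant without the self-reference."""


class CyclicStructure(dict):
    """Variant from the question: the instance is its own __dict__."""

    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        self.__dict__ = self


def leftover_after_collect(cls, n=4096):
    """Create n instances, drop them, collect, and count survivors."""
    items = [cls(name="__{0}".format(i)) for i in range(n)]
    del items
    gc.collect()
    return count_instances(cls)


plain_left = leftover_after_collect(PlainStructure)
cyclic_left = leftover_after_collect(CyclicStructure)
```

Comparing `plain_left` with `cyclic_left` on a given interpreter shows whether the self-referencing instances are being reclaimed there.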

I pushed the test a bit further and used Meliae to analyze memory consumption:

import gc 
import pprint 
from meliae import scanner 
from meliae import loader 

class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        self.__dict__ = self

structures = [Structure(name="__{0}".format(str(value))) for value in range(4096)] 
print "Structure name: ", structures[16].name 
print "Structure name: ", structures[16]["name"] 
del structures 
gc.collect() 
print "Structures count: ", len([obj for obj in gc.get_objects() if type(obj) is Structure]) 

scanner.dump_all_objects("Test_001.json") 
om = loader.load("Test_001.json") 
summary = om.summarize() 
print summary 

structures = om.get_all("Structure") 
if structures: 
    pprint.pprint(structures[0].c) 

This produces the following output:

Structure name: __16 
Structure name: __16 
Structures count: 4096 
loading... line 5001, 5002 objs, 0.6/ 1.8 MiB read in 0.2s 
loading... line 10002, 10003 objs, 1.1/ 1.8 MiB read in 0.3s 
loading... line 15003, 15004 objs, 1.7/ 1.8 MiB read in 0.5s 
loaded line 16405, 16406 objs, 1.8/ 1.8 MiB read in 0.5s   
checked  1/ 16406 collapsed  0  
checked 16405/ 16406 collapsed  157  
compute parents  0/ 16249   
compute parents 16248/ 16249   
set parents 16248/ 16249   
collapsed in 0.2s 
Total 16249 objects, 58 types, Total size = 3.2MiB (3306183 bytes) 
 Index   Count   %      Size   % Cum     Max Kind
     0    4096  25   1212416  36  36     296 Structure
     1     390   2    536976  16  52   49432 dict
     2    5135  31    417550  12  65   12479 str
     3      82   0    290976   8  74   12624 module
     4     235   1    212440   6  80     904 type
     5     947   5    121216   3  84     128 code
     6    1008   6    120960   3  88     120 function
     7    1048   6     83840   2  90      80 wrapper_descriptor
     8     654   4     47088   1  92      72 builtin_function_or_method
     9     562   3     40464   1  93      72 method_descriptor
    10     517   3     37008   1  94     216 tuple
    11     139   0     35832   1  95    2280 set
    12     351   2     30888   0  96      88 weakref
    13     186   1     23200   0  97    1664 list
    14      63   0     21672   0  97     344 WeakSet
    15      21   0     18984   0  98     904 ABCMeta
    16     197   1     14184   0  98      72 member_descriptor
    17     188   1     13536   0  99      72 getset_descriptor
    18     284   1      6816   0  99      24 int
    19      14   0      5296   0  99    2280 frozenset
[Structure(4312707312 296B 2refs 2par), 
type(4298634592 904B 4refs 100par 'Structure')] 

Memory usage is 3.2 MiB. Removing the self-reference line results in the following output:

Structure name: __16 
Structures count: 0 
loading... line 5001, 5002 objs, 0.6/ 1.4 MiB read in 0.1s 
loading... line 10002, 10003 objs, 1.1/ 1.4 MiB read in 0.3s 
loaded line 12308, 12309 objs, 1.4/ 1.4 MiB read in 0.4s   
checked  12/ 12309 collapsed  0  
checked 12308/ 12309 collapsed  157  
compute parents  0/ 12152   
compute parents 12151/ 12152   
set parents 12151/ 12152   
collapsed in 0.1s 
Total 12152 objects, 57 types, Total size = 2.0MiB (2093714 bytes) 
 Index   Count   %      Size   % Cum     Max Kind
     0     390   3    536976  25  25   49432 dict
     1    5134  42    417497  19  45   12479 str
     2      82   0    290976  13  59   12624 module
     3     235   1    212440  10  69     904 type
     4     947   7    121216   5  75     128 code
     5    1008   8    120960   5  81     120 function
     6    1048   8     83840   4  85      80 wrapper_descriptor
     7     654   5     47088   2  87      72 builtin_function_or_method
     8     562   4     40464   1  89      72 method_descriptor
     9     517   4     37008   1  91     216 tuple
    10     139   1     35832   1  92    2280 set
    11     351   2     30888   1  94      88 weakref
    12     186   1     23200   1  95    1664 list
    13      63   0     21672   1  96     344 WeakSet
    14      21   0     18984   0  97     904 ABCMeta
    15     197   1     14184   0  98      72 member_descriptor
    16     188   1     13536   0  98      72 getset_descriptor
    17     284   2      6816   0  99      24 int
    18      14   0      5296   0  99    2280 frozenset
    19      22   0      2288   0  99     104 classobj

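This confirms that the Structure instances have been destroyed, and memory usage drops to 2.0 MiB.

Any idea how I can make sure this class gets garbage collected properly? By the way, all of this was run on Python 2.7.2 (Darwin).

Cheers,

Thomas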

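Why would you want such a self-reference? Even if you insist on the duality of attribute access and item lookup (questionable IMHO, going by the Zen of Python), there are better, simpler ways to achieve it. – delnan 2011-12-31 11:53:14

Answers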

You can implement your Structure class more directly with `__getattr__` and `__setattr__`, forwarding attribute access to the underlying dictionary.

class Structure(dict):
    def __getattr__(self, k):
        return self[k]
    def __setattr__(self, k, v):
        self[k] = v
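
One caveat with the short version above: a missing attribute surfaces as `KeyError`, whereas `getattr` with a default value (and `hasattr` on Python 3) only treats `AttributeError` as "not found". A slightly more defensive sketch, which is an elaboration rather than part of the original answer, translates the exception and adds `__delattr__` for symmetry:

```python
class Structure(dict):
    """Dict whose keys can also be read, set and deleted as attributes."""

    def __getattr__(self, key):
        # Only called when normal attribute lookup has already failed.
        try:
            return self[key]
        except KeyError:
            # Report a missing key as a missing attribute.
            raise AttributeError(key)

    def __setattr__(self, key, value):
        self[key] = value

    def __delattr__(self, key):
        try:
            del self[key]
        except KeyError:
            raise AttributeError(key)


s = Structure(name="My Structure")
s.size = 3  # "size" is a hypothetical attribute
has_missing = hasattr(s, "missing")
fallback = getattr(s, "missing", "n/a")
```

No instance ever points at itself here, so plain reference counting is enough to free it.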

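Cycles are garbage collected in Python, but only periodically (unlike ordinary reference-counted objects, which are collected as soon as their reference count drops to 0).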
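
The difference can be observed with weak references. The following sketch assumes CPython (other implementations manage memory differently) and uses a throwaway `Node` class invented for the demonstration. The periodic collector is switched off while it runs so that the timing is deterministic:

```python
import gc
import weakref


class Node(object):
    """Hypothetical helper class, used only for this demonstration."""


gc.disable()  # keep the periodic collector from running mid-demo
try:
    # No cycle: freed the moment the last reference goes away.
    plain = Node()
    plain_ref = weakref.ref(plain)
    del plain
    plain_freed_immediately = plain_ref() is None

    # Cycle: the object references itself, so its count never reaches 0.
    cyclic = Node()
    cyclic.me = cyclic
    cyclic_ref = weakref.ref(cyclic)
    del cyclic
    cyclic_alive_before_gc = cyclic_ref() is not None

    # An explicit pass of the cycle collector reclaims it.
    gc.collect()
    cyclic_freed_after_gc = cyclic_ref() is None
finally:
    gc.enable()
```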

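Avoiding cycles (as a Structure class built on `__getattr__` and `__setattr__` does) means you get better gc behavior. You may also want to look at collections.namedtuple as a good alternative: it is not exactly what you implemented, but perhaps it fits your purpose.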
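
For reference, a minimal `namedtuple` sketch; the type name and the `size` field are illustrative only. It offers attribute access plus positional indexing rather than key lookup, and its fields are immutable, so changes go through `_replace`, which returns a new tuple:

```python
from collections import namedtuple

# "Structure" and its field names are illustrative only.
Structure = namedtuple("Structure", ["name", "size"])

s = Structure(name="My Structure", size=3)
by_attribute = s.name
by_index = s[0]

# Fields cannot be assigned; the attempt raises AttributeError.
try:
    s.size = 4
    assignment_rejected = False
except AttributeError:
    assignment_rejected = True

# _replace builds a modified copy and leaves the original untouched.
bigger = s._replace(size=4)
```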

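Hi Paul, cheers! That looks like a good alternative. I had actually read about it in this post: http://ruslanspivak.com/2011/06/12/the-bunch-pattern/. Apparently the garbage collection bug is also known: http://bugs.python.org/issue1469629. About namedtuple: I looked at it a long time ago, but I need my data to be mutable. – 2011-12-31 12:01:27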