list() uses more memory than list comprehension

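So I was playing with list objects and found something a little strange: a list created with list() uses more memory than one created with a list comprehension. I'm using Python 3.5.2.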

In [1]: import sys 
In [2]: a = list(range(100)) 
In [3]: sys.getsizeof(a) 
Out[3]: 1008 
In [4]: b = [i for i in range(100)] 
In [5]: sys.getsizeof(b) 
Out[5]: 912 
In [6]: type(a) == type(b) 
Out[6]: True 
In [7]: a == b 
Out[7]: True 
In [8]: sys.getsizeof(list(b)) 
Out[8]: 1008

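From the docs:

Lists may be constructed in several ways:

Using a pair of square brackets to denote the empty list: []

Using square brackets, separating items with commas: [a], [a, b, c]

Using a list comprehension: [x for x in iterable]

Using the type constructor: list() or list(iterable)

But it seems that using list() takes more memory.

And the bigger the list, the bigger the gap.

Why does this happen?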

UPDATE #1

Tested with Python 3.6.0b2:

Python 3.6.0b2 (default, Oct 11 2016, 11:52:53) 
[GCC 5.4.0 20160609] on linux 
Type "help", "copyright", "credits" or "license" for more information. 
>>> import sys 
>>> sys.getsizeof(list(range(100))) 
1008 
>>> sys.getsizeof([i for i in range(100)]) 
912

UPDATE #2

Tested with Python 2.7.12:

Python 2.7.12 (default, Jul 1 2016, 15:12:24) 
[GCC 5.4.0 20160609] on linux2 
Type "help", "copyright", "credits" or "license" for more information. 
>>> import sys 
>>> sys.getsizeof(list(xrange(100))) 
1016 
>>> sys.getsizeof([i for i in xrange(100)]) 
920

Source

2016-10-13 vishes_shell

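This is a very interesting question. I can reproduce the phenomenon in Python 3.4.3. Even more interesting: on Python 2.7.5, 'sys.getsizeof(list(range(100)))' is 1016, 'getsizeof(range(100))' is 872 and 'getsizeof([i for i in range(100)])' is 920. All of them have type 'list'. –

Interestingly, this difference also exists in Python 2.7.10 (although the actual numbers differ from Python 3). It is also present in 3.5 and 3.6b. – cdarke

When using 'xrange', I get the same results as @SvenFestersen on Python 2.7.6. – RemcoGerlich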

I think you're seeing over-allocation patterns. This is a sample from the source:

/* This over-allocates proportional to the list size, making room 
* for additional growth. The over-allocation is mild, but is 
* enough to give linear-time amortized behavior over a long 
* sequence of appends() in the presence of a poorly-performing 
* system realloc(). 
* The growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ... 
*/ 

new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);
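To see where the growth pattern in the comment comes from, here is a minimal Python sketch of the resize rule. It assumes two things the excerpt does not show: that CPython then adds newsize to new_allocated, and that a resize only happens when an append finds the buffer full. The names grow and growth_pattern are illustrative, not CPython's.

```python
def grow(newsize):
    """Capacity chosen when a list must resize to hold `newsize` items.

    Mirrors the quoted C line, plus the assumed `new_allocated += newsize`.
    """
    return (newsize >> 3) + (3 if newsize < 9 else 6) + newsize


def growth_pattern(max_len):
    """Capacities seen while appending one item at a time up to max_len."""
    capacity = 0
    pattern = [capacity]
    for size in range(1, max_len + 1):
        if size > capacity:  # buffer is full, so over-allocate
            capacity = grow(size)
            pattern.append(capacity)
    return pattern


print(growth_pattern(88))  # [0, 4, 8, 16, 25, 35, 46, 58, 72, 88]
```

Under these assumptions the simulated capacities match the sequence listed in the source comment.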

Printing the sizes of list comprehensions of lengths 0-88, you can see that the pattern matches:

import sys

# create comprehensions for lengths 0-89 (one past 88, so the step at 88 shows up)
comprehensions = [sys.getsizeof([1 for _ in range(l)]) for l in range(90)]

# only take those that resulted in growth compared to previous length 
steps = zip(comprehensions, comprehensions[1:]) 
growths = [x for x in list(enumerate(steps)) if x[1][0] != x[1][1]] 

# print the results: 
for growth in growths: 
    print(growth)

Results (format is (list length, (old total size, new total size))):

(0, (64, 96)) 
(4, (96, 128)) 
(8, (128, 192)) 
(16, (192, 264)) 
(25, (264, 344)) 
(35, (344, 432)) 
(46, (432, 528)) 
(58, (528, 640)) 
(72, (640, 768)) 
(88, (768, 912))
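The sizes above can be converted back into buffer capacities. This is a small sketch, assuming a 64-bit build where each slot is an 8-byte pointer, and taking the 64-byte empty-list size from the first result row. capacity_from_size is a hypothetical helper.

```python
HEADER_BYTES = 64  # size of the empty list in the results above
SLOT_BYTES = 8     # assumption: one pointer per slot on a 64-bit build


def capacity_from_size(total_size):
    """Number of allocated slots implied by a getsizeof() value."""
    return (total_size - HEADER_BYTES) // SLOT_BYTES


# The "new total size" column from the results above.
new_sizes = [96, 128, 192, 264, 344, 432, 528, 640, 768, 912]
capacities = [capacity_from_size(s) for s in new_sizes]
print(capacities)  # [4, 8, 16, 25, 35, 46, 58, 72, 88, 106]
```

The first nine values are the capacities named in the source comment. The last one, 106, is consistent with the 912 bytes reported for a 100-element comprehension.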

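Over-allocation is done for performance reasons: it lets a list grow without allocating more memory on every append (better amortized performance).

A probable reason for the difference with the list comprehension is that a list comprehension cannot deterministically calculate the size of the generated list, but list() can. This means a comprehension will continuously grow the list as it fills it, using over-allocation, until it is finally filled.

It is possible that it will not grow the over-allocation buffer with unused allocated nodes once it is done (in fact, in most cases it won't, since that would defeat the purpose of over-allocation). list(), however, can add some buffer regardless of the list size, because it knows the final list size in advance.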
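The two numbers in the question can be reproduced from this reasoning. The sketch below assumes the resize rule quoted earlier with newsize added to the result, a 64-byte list header and 8-byte slots (64-bit build), and that list(iterable) resizes once to the known length while a comprehension appends item by item. These assumptions fit the versions tested in the question; other versions may allocate differently.

```python
def grow(newsize):
    # Quoted rule plus the assumed `+ newsize`.
    return (newsize >> 3) + (3 if newsize < 9 else 6) + newsize


def capacity_after_appends(n):
    """Capacity after n single appends (comprehension-style)."""
    capacity = 0
    for size in range(1, n + 1):
        if size > capacity:
            capacity = grow(size)
    return capacity


def capacity_after_one_resize(n):
    """Capacity after one resize to a known length (list()-style)."""
    return grow(n) if n else 0


def size_in_bytes(capacity, header=64, slot=8):
    return header + slot * capacity


n = 100
print(size_in_bytes(capacity_after_appends(n)))     # 912
print(size_in_bytes(capacity_after_one_resize(n)))  # 1008
```

With 100 items, the incremental path stops at a capacity of 106 while the single resize asks for 118, which accounts for the 96-byte gap.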

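Another piece of supporting evidence, also from the source, is that we see list comprehensions invoking LIST_APPEND, which indicates usage of list.resize, which in turn indicates consuming the pre-allocation buffer without knowing how much of it will be filled. This is consistent with the behavior you are seeing.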
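One way to check the LIST_APPEND claim without reading the C source is to inspect the bytecode. This is a minimal sketch, assuming CPython. opnames is a hypothetical helper that also looks into nested code objects, since some versions compile the comprehension into its own code object.

```python
import dis
import types


def opnames(code):
    """Set of opcode names in a code object and any nested code objects."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= opnames(const)
    return names


comp_ops = opnames(compile("[i for i in range(100)]", "<demo>", "eval"))
call_ops = opnames(compile("list(range(100))", "<demo>", "eval"))

print("LIST_APPEND" in comp_ops)  # comprehension appends item by item
print("LIST_APPEND" in call_ops)  # list() is an ordinary call
```

The comprehension builds its result through per-item appends at the bytecode level, while list(range(100)) hands the whole iterable to the constructor.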

To conclude, list() will pre-allocate more nodes as a function of the list size:

>>> sys.getsizeof(list([1,2,3])) 
60 
>>> sys.getsizeof(list([1,2,3,4])) 
64

A list comprehension does not know the list size, so it uses append operations as it grows, depleting the pre-allocation buffer:

# one item before filling pre-allocation buffer completely 
>>> sys.getsizeof([i for i in [1,2,3]]) 
52 
# fills pre-allocation buffer completely 
# note that size did not change, we still have buffered unused nodes 
>>> sys.getsizeof([i for i in [1,2,3,4]]) 
52 
# grows pre-allocation buffer 
>>> sys.getsizeof([i for i in [1,2,3,4,5]]) 
68

Source

2016-10-13 10:40:13

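But why would over-allocation happen with one and not the other? – cdarke

This specifically comes from 'list.resize'. I'm not an expert at navigating the source, but if one calls resize and the other doesn't, that could explain the difference. –

Python 3.5.2 here. Try printing the sizes of lists from 0 to 35 in a loop. For list I see '64, 96, 104, 112, 120, 128, 136, 144, 160, 192, 200, 208, 216, 224, 232, 240, 256, 264, 272, 280, 288, 296, 304, 312, 328, 336, 344, 352, 360, 368, 376, 384, 400, 408, 416' and for the comprehension '64, 96, 96, 96, 96, 128, 128, 128, 128, 192, 192, 192, 192, 192, 192, 192, 192, 264, 264, 264, 264, 264, 264, 264, 264, 264, 344, 344, 344, 344, 344, 344, 344, 344, 344'. I would have expected the comprehension, being the one that seems to preallocate memory, to be the approach that uses more RAM for certain sizes. – tavo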
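The experiment described in this comment can be written as below. It is a sketch only: size_table is a hypothetical helper, and the numbers depend on the interpreter version and platform, so no output is shown.

```python
import sys


def size_table(max_len=35):
    """Rows of (length, size via list(), size via comprehension)."""
    rows = []
    for n in range(max_len):
        via_list = sys.getsizeof(list(range(n)))
        via_comp = sys.getsizeof([i for i in range(n)])
        rows.append((n, via_list, via_comp))
    return rows


for n, via_list, via_comp in size_table():
    print(n, via_list, via_comp)
```

Comparing the two columns row by row shows at which lengths each construction method reports the larger size on the interpreter at hand.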