How can I make Python use the same memory for all identical strings?

Possible duplicate:
What does python intern do, and when should it be used?

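I have a Python program that has to work with arrays of millions of string objects. I found that if they all come from the same quoted string, each additional "string" is just a reference to the first, master string. However, if the strings are read from a file and are all equal, each one still requires a new memory allocation.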

That is, this takes about 14 MB of storage:

a = ["foo" for a in range(0,1000000)]

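While this requires more than 65 MB of storage:

a = ["foo".replace("o","1") for a in range(0,1000000)]

Now I can make the storage take much less space with this: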

s = {"f11":"f11"} 
a = [s["foo".replace("o","1")] for a in range(0,1000000)]

But this seems silly. Is there an easier way to do this?

2012-08-05 vy32

@Maulwurfn, just because the answer is the same doesn't mean the question is the same. – 2012-08-05 17:16:48

Why not store the value of the 'replace' operation first? – JBernardo 2012-08-05 17:17:05

How are you measuring the size of the lists? If I use 'sys.getsizeof(["foo" for a in range(0,1000000)])' I get the same size as for 'sys.getsizeof(["foo".replace("o","1") for a in range(0,1000000)])', at least in Python 3.2. – 2012-08-05 18:54:32
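One likely reason for the matching numbers in the comment above is that sys.getsizeof on a list counts only the list object and its array of pointers, not the strings those pointers refer to. The sketch below assumes CPython and Python 3; the helper name deep_size is made up for illustration, and the list length is kept small so it runs quickly.

```python
import sys

# Each replace() call builds a new str object at runtime (CPython behaviour).
distinct = ["foo".replace("o", "1") for _ in range(1000)]
# Interning maps every equal string to one shared object.
shared = [sys.intern("foo".replace("o", "1")) for _ in range(1000)]

def deep_size(strings):
    """Size of the list plus each distinct string object, counted once."""
    unique = {id(s): s for s in strings}
    return sys.getsizeof(strings) + sum(sys.getsizeof(s) for s in unique.values())

# The list objects themselves are the same size either way...
print(sys.getsizeof(distinct), sys.getsizeof(shared))
# ...but the number of distinct string objects behind them differs...
print(len({id(s) for s in distinct}), len({id(s) for s in shared}))
# ...and so does the total once the strings are included.
print(deep_size(distinct), deep_size(shared))
```

Counting distinct id() values is a rough way to see how many separate string objects a list really holds; process-level memory tools would give a fuller picture.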

Just use intern(), which tells Python to store the string in, and fetch it from, its table of interned strings:

a = [intern("foo".replace("o","1")) for a in range(0,1000000)]

This also results in around 18 MB, the same as in the first example.

Also note the comment below if you are using Python 3. Thx @Abe Karplus
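For the read-from-a-file case in the question, a minimal Python 3 sketch might look like the following. It uses sys.intern, the Python 3 name of the function, and an io.StringIO object with made-up contents as a stand-in for a real file.

```python
import io
import sys

# io.StringIO stands in for a real file here; the contents are invented.
fake_file = io.StringIO("f11\n" * 1000)

# Strip the newline, then intern, so that equal lines share one object.
lines = [sys.intern(line.rstrip("\n")) for line in fake_file]

# Every element now refers to the same string object.
print(all(s is lines[0] for s in lines))
```

Interning costs a table lookup per string, so it tends to pay off when many values repeat, as they do here.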

2012-08-05 17:31:57 erikbwork

Note that in Python 3, 'intern' has been renamed to 'sys.intern'. – 2012-08-05 17:32:35

+1 I didn't know about 'intern()'. – 2012-08-05 17:54:42

Thanks very much. I didn't know about intern. Yes, I'm using Python 3, so I need to use sys.intern(). – vy32 2012-08-05 20:50:07

You could try something like this:

strs = ["this is string1", "this is string2", "this is string1", "this is string2",
        "this is string3", "this is string4", "this is string5", "this is string1",
        "this is string5"]
new_strs = []
for x in strs:
    if x in new_strs:
        # Find the index of the earlier equal string and, instead of
        # appending the string itself, append a reference to that object.
        new_strs.append(new_strs[new_strs.index(x)])
    else:
        new_strs.append(x)

print([id(y) for y in new_strs])

Strings that are equal will now have the same id().

Output:

[18632400, 18632160, 18632400, 18632160, 18651400, 18651440, 18651360, 18632400, 18651360]

2012-08-05 17:21:20

Good idea. Unfortunately, it's an O(n**2) algorithm, and it will get very slow as the list grows. – 2012-08-05 17:23:00

Keeping a dictionary of the strings seen so far should work:

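new_strs = []
str_record = {}  # maps each distinct string to the first object seen for it
for x in strs:
    if x not in str_record:
        str_record[x] = x
    # append the recorded object, so equal strings share one reference
    new_strs.append(str_record[x])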

(Untested)
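The dictionary idea above can be sketched as a small function. This is one possible variant, not the answerer's exact code: the function name share_equal_strings is made up, and dict.setdefault replaces the explicit membership test. Each lookup is a hash operation, so the whole pass is roughly linear in the number of strings.

```python
def share_equal_strings(strings):
    """Return a list in which equal strings are replaced by one shared object."""
    seen = {}
    # setdefault stores s the first time it is seen and returns the stored
    # object on every later occurrence of an equal string.
    return [seen.setdefault(s, s) for s in strings]

# Build the test strings at runtime so that equal ones start as separate objects.
raw = ["this is string%d" % (i % 5) for i in range(20)]
shared = share_equal_strings(raw)

# Compare the number of distinct objects before and after.
print(len({id(s) for s in raw}), len({id(s) for s in shared}))
```

Unlike interning, the lookup dictionary here is an ordinary object that can be discarded once the list has been built.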

2012-08-05 17:29:26
