If the lists are large, converting b to a set makes the membership test more efficient:
st = set(b)
print([b.index(x) for x in a if x in st])
Since your data is sorted, and assuming every element of a is in b, you can also use bisect so that each index lookup is O(log n):
a = [1993, 1993, 1994, 1995, 1996, 1996, 1998, 2003, 2005, 2005]
b = [1966, 1967, 1968, 1969, 1970, 1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014]
from bisect import bisect_left
print([bisect_left(b, x) for x in a])
[27, 27, 28, 29, 30, 30, 32, 37, 39, 39]
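If a might contain values that are missing from b, bisect_left on its own does not fail; it quietly returns an insertion point. A minimal sketch of a checked lookup follows. The helper name index_sorted is made up for illustration, and b is assumed to be sorted in ascending order:

```python
from bisect import bisect_left

def index_sorted(seq, x):
    """Return the leftmost index of x in the sorted list seq, or raise ValueError."""
    i = bisect_left(seq, x)
    # bisect_left returns an insertion point, so confirm x is really there
    if i < len(seq) and seq[i] == x:
        return i
    raise ValueError("%r is not in list" % (x,))

b = list(range(1966, 2015))      # same years as the b above
a = [1993, 1993, 1994, 2005]     # shortened sample for illustration
print([index_sorted(b, x) for x in a])
# [27, 27, 28, 39]
```

The extra comparison keeps each lookup at O(log n) while giving the same failure behaviour as b.index.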
On your small dataset it runs about twice as fast as plain indexing:
In [22]: timeit [bisect_left(b, x) for x in a]
100000 loops, best of 3: 4.2 µs per loop
In [23]: timeit [b.index(x) for x in a]
100000 loops, best of 3: 8.84 µs per loop
Another option is to store the indexes in a dict, which makes the code run in linear time: one pass over a and one pass over b:
# store all indexes as values and years as keys
indexes = {k: i for i, k in enumerate(b)}
# one pass over a accessing each index in constant time
print([indexes[x] for x in a])
[27, 27, 28, 29, 30, 30, 32, 37, 39, 39]
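Two details differ from b.index here. The sketch below illustrates both, assuming b could contain duplicates or a could contain values absent from b (neither is true of the sample data above). A dict comprehension keeps the last index of a repeated key, whereas index returns the first; setdefault restores first-occurrence behaviour, and dict.get avoids a KeyError for missing values. The lists are invented for illustration:

```python
b = [1966, 1967, 1967, 1968]   # hypothetical data with a duplicate year
a = [1967, 1968, 1999]         # 1999 is deliberately missing from b

# keep the FIRST index of each value, matching list.index
first_index = {}
for i, k in enumerate(b):
    first_index.setdefault(k, i)

# .get returns None instead of raising KeyError for missing values
print([first_index.get(x) for x in a])
# [1, 3, None]
```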
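Even on the small input the dict lookup is slightly faster than index, and as a grows it becomes far more efficient (choice below is random.choice):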
In [34]: %%timeit
indexes = {k: i for i, k in enumerate(b)}
[indexes[x] for x in a]
....:
100000 loops, best of 3: 7.54 µs per loop
In [39]: b = list(range(1966,2100))
In [40]: samp = list(range(1966,2100))
In [41]: a = [choice(samp) for _ in range(100)]
In [42]: timeit [b.index(x) for x in a]
10000 loops, best of 3: 154 µs per loop
In [43]: %%timeit
indexes = {k: i for i, k in enumerate(b)}
[indexes[x] for x in a]
....:
10000 loops, best of 3: 22.5 µs per loop
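Outside IPython, a comparison along the same lines can be sketched with the standard timeit module. The function names and the repeat count are arbitrary choices for this sketch, and absolute timings will vary by machine and Python version:

```python
import timeit
from random import choice

b = list(range(1966, 2100))
a = [choice(b) for _ in range(100)]

def with_index():
    # linear scan of b for every element of a
    return [b.index(x) for x in a]

def with_dict():
    # one pass to build the dict, then constant-time lookups
    indexes = {k: i for i, k in enumerate(b)}
    return [indexes[x] for x in a]

# both approaches must agree before timing them
assert with_index() == with_dict()

for fn in (with_index, with_dict):
    seconds = timeit.timeit(fn, number=1000)
    print("%s: %.2f us per loop" % (fn.__name__, seconds / 1000 * 1e6))
```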
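This works in this case, but it will raise a ValueError if any value in 'a' is not in 'b'. [b.index(x) for x in a if b.count(x)] would be safer...
Sure, it all depends on whether you want it to fail or not.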