# The data is input as '1: 0.022,' format
def process_data(line):
    # for returning the new string that is cleaned up
    result_line = ''
    for character in line:
        # check if it is either a number or a letter
        if character.isdigit() or character.isalpha():
            result_line += character
        # we want the decimal point
        elif character == '.':
            result_line += character
        # else we replace it with space ' '
        else:
            result_line += ' '
    return result_line

my_list = []
with open('input.txt') as file:
    for lines in file:
        processed_line = process_data(lines)
        # temp_list has ['letter', 'frequency']
        temp_list = processed_line.split()
        value = temp_list[0]
        # Require to cast it to a float, since it is a string
        frequency = float(temp_list[1])
        my_list.append([value, frequency])
print(my_list)
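From this point you can work out what to do with your values. I commented the code (granted, it is a very simple, naive way of handling the input file). But my_list is now clean and well formatted, with each entry holding a string (the value) and a float (the frequency). Hope this helps. Output from the code above: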
[['0', 0.017], ['1', 0.022], ['2', 0.033], ['3', 0.033],
['4', 0.029], ['5', 0.028], ['6', 0.035], ['7', 0.032],
['8', 0.028], ['9', 0.027], ['a', 0.019], ['b', 0.022],
['c', 0.029], ['d', 0.03], ['e', 0.028], ['f', 0.035],
['g', 0.026], ['h', 0.037], ['i', 0.029], ['j', 0.025],
['k', 0.025], ['l', 0.037], ['m', 0.025], ['n', 0.023],
['o', 0.026], ['p', 0.035], ['q', 0.033], ['r', 0.031],
['s', 0.023], ['t', 0.022], ['u', 0.038], ['v', 0.022],
['w', 0.016], ['x', 0.026], ['y', 0.021], ['z', 0.033]]
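As an aside, if every line really does follow the 'symbol: frequency,' shape, the character-by-character cleanup could probably be replaced by a split on the colon. This is only a minimal sketch under that assumption; parse_line and sample_lines are hypothetical names used for illustration, and it returns tuples rather than the two-element lists used above.

```python
def parse_line(line):
    """Split a line such as '1: 0.022,' into (symbol, frequency)."""
    # partition() cuts the line at the first colon only.
    symbol, _, raw_frequency = line.partition(':')
    # Drop surrounding whitespace/newline, then the trailing comma.
    return symbol.strip(), float(raw_frequency.strip().rstrip(','))


# Hypothetical sample standing in for the contents of input.txt.
sample_lines = ['0: 0.017,', '1: 0.022,\n', 'a: 0.019,']
parsed = [parse_line(line) for line in sample_lines]
print(parsed)
```

Unlike process_data, this raises an error on a line with no colon or a non-numeric frequency, which may or may not be what you want for messy input.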
And then...
# Took a page out of TokenMacGuy, credit to him
distribution = []
distribution.append(0.00)
total = 0.0  # Create a float here
for entry in my_list:
    distribution.append(entry[1])
    total += entry[1]  # Add this entry's frequency to the running total
total = round(total, 3)  # Round to 3 decimal places
distribution.append(1.00)  # Missing the 1.00 value
print(distribution)  # Print to check
The output is here:
[0.0, 0.017, 0.022, 0.033, 0.033, 0.029, 0.028, 0.035, 0.032,
0.028, 0.027, 0.019, 0.022, 0.029, 0.03, 0.028, 0.035, 0.026,
0.037, 0.029, 0.025, 0.025, 0.037, 0.025, 0.023, 0.026, 0.035,
0.033, 0.031, 0.023, 0.022, 0.038, 0.022, 0.016, 0.026, 0.021,
0.033, 1.0]
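Finally, to print the end result: there is nothing special here, I just used pattern and format to make it look nicer. The calculation pretty much follows ninjagecko's approach. Because the calculation itself does not produce them, I had to pad the distribution with 0.00 at the start and 1.00 at the end. It is a very straightforward implementation once we had figured out how to do the probabilities.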
pattern = '{0}: [{1:1.3f}, {2:1.3f})'
count = 1  # a counter to keep track of the index
pre_p = distribution[0]
p = distribution[1]
# Here we will print it out at the end in the format you said in the question
for entry in my_list:
    print(pattern.format(entry[0], pre_p, p))
    pre_p += distribution[count]
    p += distribution[count + 1]
    count += 1
Output:
0: [0.000, 0.017)
1: [0.017, 0.039)
2: [0.039, 0.072)
3: [0.072, 0.105)
4: [0.105, 0.134)
5: [0.134, 0.162)
6: [0.162, 0.197)
7: [0.197, 0.229)
8: [0.229, 0.257)
9: [0.257, 0.284)
a: [0.284, 0.303)
b: [0.303, 0.325)
c: [0.325, 0.354)
d: [0.354, 0.384)
e: [0.384, 0.412)
f: [0.412, 0.447)
g: [0.447, 0.473)
h: [0.473, 0.510)
i: [0.510, 0.539)
j: [0.539, 0.564)
k: [0.564, 0.589)
l: [0.589, 0.626)
m: [0.626, 0.651)
n: [0.651, 0.674)
o: [0.674, 0.700)
p: [0.700, 0.735)
q: [0.735, 0.768)
r: [0.768, 0.799)
s: [0.799, 0.822)
t: [0.822, 0.844)
u: [0.844, 0.882)
v: [0.882, 0.904)
w: [0.904, 0.920)
x: [0.920, 0.946)
y: [0.946, 0.967)
z: [0.967, 1.000)
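For comparison, the same half-open ranges can be built from running sums without a manual index counter or the padded 0.00 and 1.00 entries. This is a sketch rather than the code linked below: it assumes the [symbol, frequency] layout of my_list, uses the standard library's itertools.accumulate, and the names sample and intervals are made up for illustration.

```python
from itertools import accumulate

# Hypothetical sample: the first three entries of my_list.
sample = [['0', 0.017], ['1', 0.022], ['2', 0.033]]


def intervals(pairs):
    """Yield (symbol, low, high) for half-open cumulative ranges."""
    # Running sums of the frequencies give each range's upper bound.
    upper_bounds = accumulate(freq for _, freq in pairs)
    low = 0.0
    for (symbol, _), high in zip(pairs, upper_bounds):
        yield symbol, low, high
        low = high  # The next range starts where this one ends.


for symbol, low, high in intervals(sample):
    print('{0}: [{1:1.3f}, {2:1.3f})'.format(symbol, low, high))
```

On this sample it prints the same first three lines as the output above. The last upper bound is simply the sum of all frequencies, so it only comes out as 1.000 if the input frequencies really add up to 1.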
The full source is here: http://codepad.org/a6YkHhed
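Is the data in a text file? Or is this some kind of data structure? – George 2012-04-08 01:19:59
@George I need a data structure; I am getting the probabilities from a text file of random characters/digits – iCodeLikeImDrunk 2012-04-08 01:25:35
*"0 will be [0, 0.017), [0.017, 0.022)"* - Did you mean "0 will be [0, 0.017), 1 will be [0.017, 0.017 + 0.022), 2 will be [0.017 + 0.022, 0.017 + 0.022 + 0.033)"? – ninjagecko 2012-04-08 01:28:26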