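I am trying to read a CSV file and plot the PDF and CDF with Bokeh, but I get an error. The input file has two columns, keyword and freq, and it is the distribution of freq that I want to plot. The input below is a few lines out of more than 50k rows. The error is: unsupported operand type(s) for /: 'list' and 'int'
Input: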
#sportsnews,8
#mashupradiomx,1
#arrestobama,2
#alemanha,1
#bizeskiden,1
#musicnews,4
#costumedesign,2
#champain,1
#pacer,1
#brunner,1
#fotoviajera,1
#itsjihadstupid,1
#lesdernierssurvivants,1
#sainsburycentre,1
#alanalwaysinourheart,1
#runinapp,1
#foroporlavida,1
#kidsday,1
#momentofart,2
Code:
# -*- coding: utf-8 -*-
import numpy as np
import scipy.special
import pandas as pd
from bokeh.plotting import figure, show, output_file, vplot
df = pd.read_csv('keyword.csv', header = None)
df.columns = ['keyword','freq']
p5 = figure(title="Weibull Distribution (λ=1, k=1.25)", tools="save",
            background_fill_color="#E8DDCB")
lam, k = 1, 1.25
#measured = lam*(-np.log(np.random.uniform(0, 1, 1000)))**(1/k)
#hist, edges = np.histogram(measured, density=True, bins=50)
x = df['freq']
pdf = (k/lam)*(x/lam)**(k-1) * np.exp(-(x/lam)**k)
cdf = 1 - np.exp(-(x/lam)**k)
p5.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:],
        fill_color="#036564", line_color="#033649")
p5.line(x, pdf, line_color="#D95B43", line_width=8, alpha=0.7, legend="PDF")
p5.line(x, cdf, line_color="white", line_width=2, alpha=0.7, legend="CDF")
p5.legend.location = "top_left"
p5.xaxis.axis_label = 'x'
p5.yaxis.axis_label = 'Pr(x)'
output_file('histogram.html', title="histogram.py example")
show(vplot(p5))
I only want to plot the two line plots.
Error:
Traceback (most recent call last):
  File "pdf_bokeh.py", line 21, in <module>
    pdf = (k/lam)*(x/lam)**(k-1) * np.exp(-(x/lam)**k)
TypeError: unsupported operand type(s) for /: 'list' and 'int'
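Edit 1: After changing the assignment to x = df['freq'], I am getting strange output.
Complete input file: Dropbox. The data is discrete in nature, but the distribution plot still does not look right.
Output: it is nowhere near what it should be.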
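One possible reason for the odd-looking curves, offered as an assumption rather than a confirmed diagnosis: a Bokeh line glyph connects points in the order they are given, and the freq column is unsorted and full of repeated values, so the line jumps back and forth. A minimal sketch of evaluating the same formulas on sorted distinct values instead is below. The freq array is illustrative only (a few values from the sample rows), and the empirical PDF/CDF at the end is an optional extra that is not in the original code.

```python
import numpy as np

lam, k = 1, 1.25

# Illustrative frequencies only, standing in for df['freq'].
freq = np.array([8, 1, 2, 1, 1, 4, 2, 1, 1, 1], dtype=float)

# np.unique returns the distinct values in ascending order,
# so a line drawn over x moves strictly left to right.
x = np.unique(freq)

# Weibull PDF and CDF on the sorted grid (same formulas as in the question).
pdf = (k / lam) * (x / lam) ** (k - 1) * np.exp(-(x / lam) ** k)
cdf = 1 - np.exp(-(x / lam) ** k)

# Optional: empirical distribution of the discrete data itself,
# i.e. the share of rows at each value and its running total.
values, counts = np.unique(freq, return_counts=True)
empirical_pdf = counts / counts.sum()
empirical_cdf = np.cumsum(empirical_pdf)
```

The arrays x, pdf and cdf can then be passed to p5.line(x, pdf, ...) and p5.line(x, cdf, ...) exactly as in the question.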
What is 'x' meant to be, given that you have defined it as 'x = ['freq']'? – EdChum
@EdChum 'x' is the 'freq' to be plotted –
I assume you want x = df['freq'] – oystein
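The point made in the comments can be reproduced in isolation. The sketch below uses a NumPy array with made-up values as a stand-in for df['freq'] (a pandas Series divides elementwise in the same way), and contrasts it with the one-element list that x = ['freq'] creates.

```python
import numpy as np

lam, k = 1, 1.25

# x = ['freq'] builds a plain list holding the column *name*, not the data.
x_list = ['freq']
try:
    x_list / lam
    list_error = None
except TypeError as exc:
    # Lists do not define division, which is the error in the traceback.
    list_error = str(exc)
print(list_error)

# Stand-in for df['freq']: numeric values divide elementwise.
x_values = np.array([8, 1, 2, 1, 1, 4], dtype=float)  # illustrative values only
pdf = (k / lam) * (x_values / lam) ** (k - 1) * np.exp(-(x_values / lam) ** k)
print(pdf.shape)
```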