Importing CSV into Python
I have a CSV dataset that looks like this:
FirstAge,SecondAge,FirstCountry,SecondCountry,Income,NAME
41,41,USA,UK,113764,John
53,43,USA,USA,145963,Fred
47,37,USA,UK,42857,Dan
47,44,UK,USA,95352,Mark
I tried to load it into Python 3.6 with this code:
>>> from numpy import genfromtxt
>>> my_data = genfromtxt('first.csv', delimiter=',')
>>> print(my_data)
Output:
[[ nan nan nan nan
nan nan]
[ 4.10000000e+01 4.10000000e+01 nan nan
1.13764000e+05 nan]
[ 5.30000000e+01 4.30000000e+01 nan nan
1.45963000e+05 nan]
...,
[ 2.10000000e+01 3.00000000e+01 nan nan
1.19929000e+05 nan]
[ 6.90000000e+01 6.40000000e+01 nan nan
1.52667000e+05 nan]
[ 2.00000000e+01 1.90000000e+01 nan nan
1.05077000e+05 nan]]
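I looked through the NumPy documentation, but I didn't see anything about this.
What number would 'USA' or 'UK' be?! What exactly is the problem you are facing?
The problem you are probably running into is that numpy tries to parse the data as a numeric type, which can lead to unexpected behavior. – AgnosticDev
The numeric columns/rows are correct, they are just floats. 'nan' stands for a string that could not be interpreted as a float. – hpaulj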
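To illustrate the comments above, here is a minimal sketch that reproduces the `nan` values and then shows one possible way to keep the text columns. It assumes the five sample lines from the question, held in an in-memory string (`csv_text`, a name made up for this example) instead of `first.csv`. The `dtype=None`, `names=True` and `encoding` arguments of `genfromtxt` are not in the original code; they ask numpy to infer a type per column and to read field names from the header line.

```python
from io import StringIO

import numpy as np

# Sample rows copied from the question; csv_text stands in for first.csv.
csv_text = """FirstAge,SecondAge,FirstCountry,SecondCountry,Income,NAME
41,41,USA,UK,113764,John
53,43,USA,USA,145963,Fred
47,37,USA,UK,42857,Dan
47,44,UK,USA,95352,Mark
"""

# Default behaviour: every field is parsed as float, so the header row
# and the text columns (countries, names) come back as nan.
as_float = np.genfromtxt(StringIO(csv_text), delimiter=',')

# dtype=None lets numpy infer a type for each column, and names=True
# takes the field names from the first line. The result is a 1-D
# structured array with one record per data row.
records = np.genfromtxt(
    StringIO(csv_text),
    delimiter=',',
    dtype=None,
    names=True,
    encoding='utf-8',
)

print(records.dtype.names)        # field names taken from the header
print(records['FirstCountry'])    # text column kept as strings
print(records['Income'].mean())   # numeric column usable directly
```

Columns of the structured array are accessed by field name rather than by integer index. If the data is mostly tabular with mixed types, `pandas.read_csv` may be a more convenient alternative.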