2016-11-22 106 views
2

I am trying to read Avro files with Python. How do I read an Avro file in Python 3.5.2?

I installed Apache Avro successfully (at least I think I did, because I can run "import avro" in the Python shell) by following the instructions here

https://avro.apache.org/docs/1.8.1/gettingstartedpython.html 

However, when I try to read an Avro file with the code from those instructions, I keep getting errors as soon as I import the avro modules.

>>> import avro.schema 
Traceback (most recent call last): 
File "<pyshell#6>", line 1, in <module> 
import avro.schema 
File "<frozen importlib._bootstrap>", line 969, in _find_and_load 
File "<frozen importlib._bootstrap>", line 954, in _find_and_load_unlocked 
File "<frozen importlib._bootstrap>", line 896, in _find_spec 
File "<frozen importlib._bootstrap_external>", line 1139, in find_spec 
File "<frozen importlib._bootstrap_external>", line 1115, in _get_spec 
File "<frozen importlib._bootstrap_external>", line 1096, in _legacy_get_spec 
File "<frozen importlib._bootstrap>", line 444, in spec_from_loader 
File "<frozen importlib._bootstrap_external>", line 533, in spec_from_file_location 
File "I:\Program Files\lib\site-packages\avro-_avro_version_-py3.5.egg\avro\schema.py", line 340 
except Exception, e: 
       ^
SyntaxError: invalid syntax 


>>> from avro.datafile import DataFileReader, DataFileWriter 
Traceback (most recent call last): 
File "I:\Program Files\lib\site-packages\avro-_avro_version_-py3.5.egg\avro\datafile.py", line 21, in <module> 
from cStringIO import StringIO 
ImportError: No module named 'cStringIO' 

During handling of the above exception, another exception occurred: 

Traceback (most recent call last): 
File "<pyshell#7>", line 1, in <module> 
from avro.datafile import DataFileReader, DataFileWriter 
File "I:\Program Files\lib\site-packages\avro-_avro_version_-py3.5.egg\avro\datafile.py", line 23, in <module> 
from StringIO import StringIO 
ImportError: No module named 'StringIO' 


>>> from avro.io import DatumReader, DatumWriter 
Traceback (most recent call last): 
File "<pyshell#19>", line 1, in <module> 
from avro.io import DatumReader, DatumWriter 
File "<frozen importlib._bootstrap>", line 969, in _find_and_load 
File "<frozen importlib._bootstrap>", line 954, in _find_and_load_unlocked 
File "<frozen importlib._bootstrap>", line 896, in _find_spec 
File "<frozen importlib._bootstrap_external>", line 1139, in find_spec 
File "<frozen importlib._bootstrap_external>", line 1115, in _get_spec 
File "<frozen importlib._bootstrap_external>", line 1096, in _legacy_get_spec 
File "<frozen importlib._bootstrap>", line 444, in spec_from_loader 
File "<frozen importlib._bootstrap_external>", line 533, in spec_from_file_location 
File "I:\Program Files\lib\site-packages\avro-_avro_version_-py3.5.egg\avro\io.py", line 200 
bits = (((ord(self.read(1)) & 0xffL)) | 
           ^
SyntaxError: invalid syntax 

So did I install avro successfully? Why am I getting these errors? I am using Python 3.5.2 on Windows 7.

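EDIT: I solved this problem by following Stephane Martin's suggestion. I then tried to read the avro files into Python. I have a bunch of avro files in a directory that is already set as the Python working path. Here is my code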

import avro.schema 
from avro.datafile import DataFileReader, DataFileWriter 
from avro.io import DatumReader, DatumWriter 

reader = DataFileReader(open("part-00000-of-01733.avro", "r"), DatumReader()) 
for user in reader: 
    print (user) 
reader.close() 

and it returns this error

Traceback (most recent call last): 
File "I:\DJ data\read avro.py", line 5, in <module> 
reader = DataFileReader(open("part-00000-of-01733.avro", "r"), DatumReader()) 
File "I:\Program Files\lib\site-packages\avro_python3-1.8.1-py3.5.egg\avro\datafile.py", line 349, in __init__ 
self._read_header() 
File "I:\Program Files\lib\site-packages\avro_python3-1.8.1-py3.5.egg\avro\datafile.py", line 459, in _read_header 
META_SCHEMA, META_SCHEMA, self.raw_decoder) 
File "I:\Program Files\lib\site-packages\avro_python3-1.8.1-py3.5.egg\avro\io.py", line 525, in read_data 
return self.read_record(writer_schema, reader_schema, decoder) 
File "I:\Program Files\lib\site-packages\avro_python3-1.8.1-py3.5.egg\avro\io.py", line 725, in read_record 
field_val = self.read_data(field.type, readers_field.type, decoder) 
File "I:\Program Files\lib\site-packages\avro_python3-1.8.1-py3.5.egg\avro\io.py", line 515, in read_data 
return self.read_fixed(writer_schema, reader_schema, decoder) 
File "I:\Program Files\lib\site-packages\avro_python3-1.8.1-py3.5.egg\avro\io.py", line 568, in read_fixed 
return decoder.read(writer_schema.size) 
File "I:\Program Files\lib\site-packages\avro_python3-1.8.1-py3.5.egg\avro\io.py", line 170, in read 
input_bytes = self.reader.read(n) 
File "I:\Program Files\lib\encodings\cp1252.py", line 23, in decode 
return codecs.charmap_decode(input,self.errors,decoding_table)[0] 

UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 863: character maps to <undefined>
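The last frames of this traceback go through encodings\cp1252.py, which suggests the file was opened in text mode ("r"), so Python tried to decode binary Avro data as Windows-1252 text. A minimal sketch of the difference, using only the standard library: the temporary file and its contents are made up for illustration (the 4-byte Avro container magic followed by a 0x90 byte, which cp1252 does not map), and this is not the asker's actual file.

```python
import os
import tempfile

# Stand-in for an Avro file: the container magic "Obj\x01" followed by a
# 0x90 byte, which has no mapping in cp1252. Contents are illustrative only.
payload = b"Obj\x01" + b"\x90\x00\x10"

with tempfile.NamedTemporaryFile(delete=False, suffix=".avro") as tmp:
    tmp.write(payload)
    path = tmp.name

try:
    # Text mode: bytes are decoded to str, which fails on 0x90 under cp1252.
    try:
        with open(path, "r", encoding="cp1252") as f:
            f.read()
        text_mode_error = None
    except UnicodeDecodeError as exc:
        text_mode_error = exc

    # Binary mode: bytes are returned unchanged, which is what a binary
    # format such as Avro needs.
    with open(path, "rb") as f:
        raw = f.read()
finally:
    os.remove(path)

print("text mode  :", type(text_mode_error).__name__)
print("binary mode:", raw)
```

Applied to the code in the question, the corresponding change would be open("part-00000-of-01733.avro", "rb") instead of "r".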

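I do notice that the example in the instructions creates a schema first. But what is an avsc file? How should I create it, and the corresponding schema, in my case?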
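On the avsc question: an .avsc file is a plain JSON text file that holds an Avro schema. It is needed when writing data. An Avro object container file already carries the writer's schema in its header, under the metadata key "avro.schema", so reading existing files normally does not require a separate .avsc. The sketch below is a hand-rolled header parser written for illustration with the standard library only. It is not the avro library's API, the helper names are hypothetical, and the "User" schema is invented for the demo.

```python
import io
import json

MAGIC = b"Obj\x01"  # first four bytes of an Avro object container file


def read_long(stream):
    """Decode one Avro long (variable-length, zig-zag encoded)."""
    shift = 0
    accum = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("unexpected end of data while reading a long")
        accum |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)


def read_bytes(stream):
    """Decode one Avro bytes/string value: a long length, then the data."""
    return stream.read(read_long(stream))


def read_embedded_schema(stream):
    """Return the schema stored in a container file header, as a dict."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    metadata = {}
    while True:
        count = read_long(stream)
        if count == 0:          # a zero count ends the metadata map
            break
        if count < 0:           # negative count: a byte size follows
            read_long(stream)
            count = -count
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            metadata[key] = read_bytes(stream)
    return json.loads(metadata["avro.schema"].decode("utf-8"))


def write_long(n):
    """Encode an Avro long; used here only to build the demo header."""
    n = (n << 1) ^ (n >> 63)
    out = bytearray()
    while n & ~0x7F:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def avro_bytes(data):
    """Length-prefix a bytes value the way Avro does."""
    return write_long(len(data)) + data


# Build a made-up header in memory: magic, two metadata entries, the map
# terminator, and a 16-byte sync marker (all zeros here).
schema = {"type": "record", "name": "User",
          "fields": [{"name": "name", "type": "string"}]}
header = (MAGIC + write_long(2)
          + avro_bytes(b"avro.schema") + avro_bytes(json.dumps(schema).encode("utf-8"))
          + avro_bytes(b"avro.codec") + avro_bytes(b"null")
          + write_long(0) + bytes(16))

recovered = read_embedded_schema(io.BytesIO(header))
print(json.dumps(recovered, indent=2))
```

To try this on a real file, pass a file object opened with "rb" instead of the in-memory header. Saving the printed JSON to a text file would give an .avsc for that data.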

+0

except Exception, e => Python 2-only syntax. The library is probably not Python 3 compatible
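A quick way to see this for yourself is to ask the running interpreter whether it accepts the constructs that appear in the tracebacks. This is only a sketch: the helper name is hypothetical, and the snippets are reduced versions of the failing lines in schema.py and io.py. The long-integer suffix (0xffL) is rejected by every Python 3 release. The comma form of except is a SyntaxError on Python 3.5, but newer interpreters may parse it differently, so the result is printed rather than assumed.

```python
# Python 2-era constructs from the tracebacks, each next to its Python 3 spelling.
SNIPPETS = {
    "except Exception, e": "try:\n    pass\nexcept Exception, e:\n    pass\n",
    "except Exception as e": "try:\n    pass\nexcept Exception as e:\n    pass\n",
    "0xffL": "bits = 0xffL\n",
    "0xff": "bits = 0xff\n",
}


def compiles_on_this_interpreter(source):
    """Return True if `source` is valid syntax for the running Python."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


results = {label: compiles_on_this_interpreter(src)
           for label, src in SNIPPETS.items()}

for label, ok in results.items():
    print("{:<24} {}".format(label, "ok" if ok else "SyntaxError"))
```

If the Python 2 spellings fail here, the installed avro egg was written for Python 2 and cannot be imported by this interpreter.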

Answers

2

Use the Python 3 distribution of Avro, not the one for Python 2

http://apache.mediamirrors.org/avro/avro-1.8.2/py3/

Note that the link above may stop working if avro-1.8.2 is removed.

+0

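I installed the one for Python 3, but the errors still occur. Is it possible that Python is still using the old version of avro? Should I uninstall the old one first?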

+0

I would delete both avro packages from the correct 'site-packages' directory, then reinstall the Python 3 distribution of avro
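Before deleting anything, it may help to check which copy of avro the interpreter would actually pick up. A small sketch using importlib from the standard library; locate_package is a hypothetical helper name, and the check does not import the package, so it works even when the installed copy fails to import.

```python
import importlib.util


def locate_package(name):
    """Return the file Python would import `name` from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


avro_path = locate_package("avro")
if avro_path is None:
    print("avro is not importable from this interpreter")
else:
    # In the tracebacks above, the Python 2 egg lives under
    # 'avro-_avro_version_-py3.5.egg' and the Python 3 one under
    # 'avro_python3-1.8.1-py3.5.egg'.
    print("avro would be imported from:", avro_path)
```

If the printed path still points into the old egg after reinstalling, that egg is shadowing the Python 3 distribution and needs to be removed.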

+0

It works. Thank you. But I have another problem reading the avro files. Could you take a look at the edited question? Thanks again for your help.

0

When installing via pip or a similar package manager: install the avro-python3 package rather than avro