How do I decode a numpy array with dtype=numpy.string_?

Using Python 3, I need to decode a string that was encoded as follows:

>>> s = numpy.asarray(numpy.string_("hello\nworld")) 
>>> s 
array(b'hello\nworld', 
     dtype='|S11')

I tried:

>>> str(s) 
"b'hello\\nworld'" 

>>> s.decode() 
AttributeError       Traceback (most recent call last) 
<ipython-input-31-7f8dd6e0676b> in <module>() 
----> 1 s.decode() 

AttributeError: 'numpy.ndarray' object has no attribute 'decode' 

>>> s[0].decode() 
--------------------------------------------------------------------------- 
IndexError        Traceback (most recent call last) 
<ipython-input-34-fae1dad6938f> in <module>() 
----> 1 s[0].decode() 

IndexError: 0-d arrays can't be indexed

Asked by PiRK, 2016-10-03

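If I understand correctly, you can do this with astype, which returns an array with the content converted to the corresponding type (copy=False asks numpy to skip the copy where possible, although changing the dtype still produces a new array):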

>>> s = numpy.asarray(numpy.string_("hello\nworld")) 
>>> r = s.astype(str, copy=False) 
>>> r 
array('hello\nworld', 
     dtype='<U11')
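
The same conversion also applies to arrays with more than one element. The following is a minimal sketch, not part of the original answer: the names `raw` and `text` are made up for illustration, and `np.bytes_` is used because it names the same type as `numpy.string_` and is available in both NumPy 1.x and 2.x. As far as I can tell, this conversion treats the bytes as ASCII, so non-ASCII data would need an explicit decode with a chosen encoding instead.

```python
import numpy as np

# Illustrative data: a 1-d array of byte strings (dtype kind 'S').
raw = np.array([b"hello\nworld", b"abc"], dtype=np.bytes_)

# astype(str) builds a new array whose dtype kind is 'U' (Unicode str).
text = raw.astype(str)

print(raw.dtype.kind, text.dtype.kind)
print(repr(text[0]))
```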

Answered 2016-10-03 12:13:13

Thanks! This helps a lot. Now I can recover my string this way: `s = str(s.astype(str))` – PiRK

There is no need to convert the type when you can get a regular string directly with `unicode_`. – Kasramvd

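I don't control the encoding stage. In my real-world problem I am not creating `s` myself. I just happen to know that it was written to a file after that encoding stage. – PiRK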

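In Python 3 there are two types that represent sequences of characters: bytes and str (which holds Unicode characters). When you use string_ as your type, numpy returns bytes. If you want a regular str, use numpy's unicode_ type: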

>>> s = numpy.asarray(numpy.unicode_("hello\nworld")) 
>>> s 
array('hello\nworld', 
     dtype='<U11') 

>>> str(s) 
'hello\nworld'

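Note, however, that if you don't specify a type for your string (string_ or unicode_), numpy returns the default string type, which in Python 3.x is str (holding Unicode characters).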

>>> s = numpy.asarray("hello\nworld") 
>>> str(s) 
'hello\nworld'
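
The three cases above can be compared side by side. This is a small sketch rather than the answerer's code: it uses `np.bytes_` and `np.str_`, which name the same types as `string_` and `unicode_` (those two aliases were removed in NumPy 2.0), and the variable names are illustrative.

```python
import numpy as np

b_arr = np.asarray(np.bytes_(b"hello\nworld"))  # explicit bytes type
u_arr = np.asarray(np.str_("hello\nworld"))     # explicit Unicode type
d_arr = np.asarray("hello\nworld")              # no explicit type given

# dtype.kind is 'S' for byte strings and 'U' for Unicode strings.
print(b_arr.dtype.kind, u_arr.dtype.kind, d_arr.dtype.kind)

# .item() gives back the plain Python object stored in a 0-d array.
print(type(b_arr.item()), type(u_arr.item()), type(d_arr.item()))
```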

Answered by Kasramvd, 2016-10-03 12:22:08

The reason I encode my data with numpy.string_ is compatibility. My data is converted to a data format called HDF5 and may be read by other software, not only Python. – PiRK

http://docs.h5py.org/en/latest/strings.html (Compatibility section) – PiRK

@PiRK If you want an approach that is compatible across Python versions, you should use `numpy.asarray()`; otherwise it has nothing to do with Python. – Kasramvd

Another option is np.char, a collection of string operations.

In [255]: np.char.decode(s) 
Out[255]: 
array('hello\nworld', 
     dtype='<U11')

It accepts an encoding keyword, if needed. But if you don't need that, .astype is probably the better choice.
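
The encoding keyword matters once the bytes are not plain ASCII. Here is a minimal sketch assuming the data was written as UTF-8; the sample words and variable names are made up for illustration.

```python
import numpy as np

words = ["café", "naïve"]

# Simulate data that was stored as UTF-8 byte strings.
encoded = np.array([w.encode("utf-8") for w in words])

# Decode element-wise, stating the encoding explicitly.
decoded = np.char.decode(encoded, encoding="utf-8")

print(encoded.dtype.kind, decoded.dtype.kind)
print(decoded.tolist())
```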

s is 0d (its shape is ()), so it has to be indexed with s[()].

In [268]: s[()] 
Out[268]: b'hello\nworld' 
In [269]: s[()].decode() 
Out[269]: 'hello\nworld'

s.item() also works.
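
If the array may be either 0d or have more dimensions, the two approaches can be combined. `decode_bytes_array` below is a hypothetical helper written for this illustration, not a numpy function, and the UTF-8 default is an assumption about how the data was encoded.

```python
import numpy as np

def decode_bytes_array(arr, encoding="utf-8"):
    """Hypothetical helper: decode a bytes array of any dimensionality."""
    arr = np.asarray(arr)
    if arr.ndim == 0:
        # 0-d array: pull out the single bytes object and decode it.
        return arr.item().decode(encoding)
    # n-d array: decode element-wise, returning a Unicode array.
    return np.char.decode(arr, encoding=encoding)

s = np.asarray(np.bytes_(b"hello\nworld"))
print(repr(decode_bytes_array(s)))
print(decode_bytes_array(np.array([b"a", b"bc"])))
```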

Answered by hpaulj, 2016-10-03 16:18:01
