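pandas: merging on a ByteArray column

Any thoughts on how to join two pandas DataFrames on a commonly named bytearray field? (The field in the source (Teradata) is an actual ByteArray, and on the Teradata side it can't be coerced to a character type or anything else usable outside of Teradata.)
The Teradata export reads nicely into pandas DataFrames. But I can't merge the two tables on the commonly named field (DatabaseId), where that field is a bytearray.
(Both pandas, as pd, and itertools are imported.)
When I try a simple merge: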
merge1 = pd.merge(tvm, dbase, on="DatabaseId")
I get the error:
TypeError: type object argument after * must be a sequence, not itertools.imap
I searched StackOverflow and found a similar problem for joining on a cell containing a collection, so I tried the approach from there:
dbase['DBID'] = dbase.DatabaseId.apply(lambda r: type(sorted(r.iteritems())))
But I get the error:
AttributeError: 'bytearray' object has no attribute 'iteritems'
Update: sample data
The sample data was collected through pandas using:
dbase = pd.read_sql('select databaseid, databasename from ud812.dbase sample 10', conn)
where conn is a connection to a Teradata database.
The data types coming out of Teradata are VARCHAR for all columns except:
DatabaseID = bytearray (Byte(4))
TVMID = bytearray (Byte(4))
>>> dbase.dtypes
DatabaseId object
DatabaseName object
dtype: object
>>> dbase
DatabaseId DatabaseName
0 [2, 0, 243, 185] PCDW_CRS_BBCONV3_TB
1 [2, 0, 168, 114] PAMLIF_TB
2 [2, 0, 133, 153] PADW_PRESN_TB
3 [2, 0, 29, 184] CEDW_MOBILE_TB
4 [2, 0, 190, 183] CEDW_MODEL_SCORE_TB
5 [2, 0, 71, 55] PBBBAM_TB
6 [2, 0, 169, 183] CEDW_OCC_TB
7 [2, 0, 201, 183] CCDW_DGTL_DEAL_TB
8 [0, 0, 139, 8] PRECDSS_TB
9 [2, 0, 142, 203] CDBDW_TB
>>>
>>>
>>> tvm.dtypes
TVMId object
DatabaseId object
TVMName object
TableKind object
CreateText object
dtype: object
>>> tvm
TVMId DatabaseId TVMName \
0 [230, 1, 41, 11, 0, 0] [2, 0, 67, 183] JCP_03538_112002
1 [214, 1, 60, 133, 0, 0] [2, 0, 186, 52] STL_AUTHNCTD_RULE_EXECN
2 [193, 2, 59, 48, 0, 0] [2, 0, 225, 150] uye177_Xsell_EM_OPCL_TB2
3 [0, 2, 235, 154, 0, 0] [2, 0, 244, 181] PL_CALCD_INVSTR_MTHLY_HIST_ST
4 [255, 1, 131, 76, 0, 0] [2, 0, 110, 63] IMH867_AVA0803_SNAP
5 [125, 1, 217, 138, 0, 0] [2, 0, 237, 153] FD_ACCT_STMT_ADR_ST
6 [224, 0, 80, 233, 0, 0] [2, 0, 243, 127] EXP_SRCH_RSLT_DESC
7 [208, 1, 72, 15, 0, 0] [2, 0, 8, 57] SGI_PAY_DENIED_SEP_112012
8 [246, 0, 27, 61, 0, 0] [2, 0, 143, 130] CR_INDIVD
9 [186, 1, 242, 167, 0, 0] [0, 0, 244, 18] wzu448_sb_apps
TableKind CreateText
0 T None
1 V CREATE VIEW ... ... ... ... ... ... ... ... ...
2 T None
3 V CREATE VIEW ... ... ... ... ... ... ... ... ...
4 T None
5 V CREATE VIEW ... ... ... ... ... ... ... ... ...
6 V CREATE VIEW ... ... ... ... ... ... ... ... ...
7 V CREATE VIEW ... ... ... ... ... ... ... ... ...
8 V CREATE VIEW ... ... ... ... ... ... ... ... ...
9 T None
What is the type of `tvm`? Can you provide a sample of your data? –
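Well, you can use the FROM_BYTES function to convert a BYTE to a string. The syntax is ugly, because you have to use LPAD (leading zeros are dropped), TRANSLATE (the result is Unicode), and CAST (LPAD returns a VARCHAR(32000)): `CAST(TRANSLATE(LPAD(FROM_BYTES(tvmid,'Base16'),12,'0') USING unicode_to_latin) AS VARCHAR(12))` (**12** is twice the number of bytes) – dnoeth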
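
The same idea (turning the byte value into a hex string and joining on that) could also be tried on the pandas side instead of in SQL. The following is only a minimal sketch, assuming Python 3 and that each cell really holds a `bytearray`. The helper `hex_key`, the extra column `DBID`, and the two small frames are made up for illustration; they merely stand in for the `dbase` and `tvm` results shown above. A `bytearray` is mutable and therefore not hashable, which is why it is converted to a plain string before merging.

```python
import pandas as pd


def hex_key(value):
    """Convert a bytes-like cell (e.g. a bytearray) into a hashable hex string."""
    return bytes(value).hex().upper()


# Hypothetical stand-ins for the query results in the question.
dbase = pd.DataFrame({
    "DatabaseId": [bytearray([2, 0, 243, 185]), bytearray([0, 0, 139, 8])],
    "DatabaseName": ["PCDW_CRS_BBCONV3_TB", "PRECDSS_TB"],
})
tvm = pd.DataFrame({
    "DatabaseId": [bytearray([2, 0, 243, 185]), bytearray([2, 0, 67, 183])],
    "TVMName": ["example_table", "JCP_03538_112002"],  # first name is invented
})

# Derive a string key on both sides; the original bytearray column is left untouched.
dbase["DBID"] = dbase["DatabaseId"].apply(hex_key)
tvm["DBID"] = tvm["DatabaseId"].apply(hex_key)

# Merge on the derived key, dropping the unhashable column so it is not duplicated.
merged = pd.merge(
    tvm.drop(columns="DatabaseId"),
    dbase.drop(columns="DatabaseId"),
    on="DBID",
)
print(merged)
```

Whether to do the conversion in SQL or in pandas is a design choice: converting in the query keeps the frames free of object columns, while converting in pandas avoids changing the export.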