How do I remove all characters from strings and keep only the numbers in a DataFrame?

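I have a DataFrame with a couple of columns that contain both numeric values and strings, and I want to remove all the characters and keep only the numbers. I want to go from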

Admit_DX_Description            Primary_DX_Description
510.9 - EMPYEMA W/O FISTULA     510.9 - EMPYEMA W/O FISTULA
681.10 - CELLULITIS, TOE NOS    681.10 - CELLULITIS, TOE NOS
780.2 - SYNCOPE AND COLLAPSE    427.89 - CARDIAC DYSRHYTHMIAS NEC
729.5 - PAIN IN LIMB            998.30 - DISRUPTION OF WOUND, UNSPEC

to

Admit_DX_Description    Primary_DX_Description
510.9                   510.9
681.10                  681.10
780.2                   427.89
729.5                   998.30

Code:

for col in strip_col:
    # Encoding only categorical variables
    if df[col].dtypes == 'object':
        df[col] = df[col].map(lambda x: x.rstrip(r'[a-zA-Z]'))

print(df.head())

Error:
Traceback (most recent call last):

df[col] = df[col].map(lambda x: x.rstrip(r'[a-zA-Z]'))

File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/series.py", line 2175, in map
new_values = map_f(values, arg)
File "pandas/src/inference.pyx", line 1217, in pandas.lib.map_infer (pandas/lib.c:63307)

df[col] = df[col].map(lambda x: x.rstrip(r'[a-zA-Z]'))

AttributeError: 'int' object has no attribute 'rstrip'


2017-02-03 kero

You can use this example:

I chose the re module to extract only the floating-point numbers.

import re 
import pandas 

df = pandas.DataFrame({'A': ['Hello 199.9', '19.99 Hello'], 'B': ['700.52 Test', 'Test 7.7']}) 

df
             A            B
0  Hello 199.9  700.52 Test
1  19.99 Hello     Test 7.7

for col in df: 
    df[col] = [''.join(re.findall(r"\d+\.\d+", item)) for item in df[col]]

       A       B
0  199.9  700.52
1  19.99     7.7

If you also have integers, change the re pattern to this: \d*\.?\d+
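As a quick check of the two patterns, here is a minimal sketch comparing what each one matches. The sample strings are made up for illustration (one borrows the format from the question), and the names FLOAT_ONLY and INT_OR_FLOAT are just labels chosen here.

```python
import re

FLOAT_ONLY = r"\d+\.\d+"      # pattern used in the loop above
INT_OR_FLOAT = r"\d*\.?\d+"   # pattern suggested for integers as well

# Hypothetical sample strings
samples = ["510.9 - EMPYEMA W/O FISTULA", "Room 12", "Test 7.7"]

float_matches = [re.findall(FLOAT_ONLY, s) for s in samples]
mixed_matches = [re.findall(INT_OR_FLOAT, s) for s in samples]

print(float_matches)  # the integer in "Room 12" is not matched
print(mixed_matches)  # the integer is matched too
```

Note that ''.join(...) concatenates every match in a string, so a value containing two separate numbers would have them glued together into one.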

EDITED

For the TypeError, I suggest using try. In this example I create a list, errs, which is filled in the except TypeError branch. You can inspect those values with print(errs).

Also check df.

... 
... 
errs = []
for col in df:
    try:
        df[col] = [''.join(re.findall(r"\d+\.\d+", item)) for item in df[col]]
    except TypeError:
        # The column holds a non-string value; keep its items for inspection
        errs.extend([item for item in df[col]])
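With the try/except above, a single non-string value (an int, for example) makes the whole column fall into the except branch and stay unchanged. One possible alternative is sketched below, assuming that keeping only the first number in each cell is what is wanted: cast each column to str and use the vectorized Series.str.extract. The sample frame is invented for illustration.

```python
import pandas as pd

# Hypothetical sample: column A mixes strings with a plain integer
df = pd.DataFrame({
    "A": ["510.9 - EMPYEMA W/O FISTULA", 780, "729.5 - PAIN IN LIMB"],
    "B": ["681.10 - CELLULITIS, TOE NOS", "Test 7.7", "no number here"],
})

for col in df.columns:
    # astype(str) turns every value into a string, so the regex never sees an int
    as_text = df[col].astype(str)
    # expand=False returns a Series with the first match (missing if none)
    df[col] = as_text.str.extract(r"(\d+(?:\.\d+)?)", expand=False)

print(df)
```

The results are still strings. pd.to_numeric could convert them, but that would drop trailing zeros such as the one in 681.10, which may matter if the values are codes rather than quantities.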


2017-02-03 21:38:01 estebanpdl

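Hey, it's a nice answer, but I get this error: **TypeError: expected string or buffer**. I figured out that some of the string values look like "250.82 - DIABETES, .TYPE II". Do you have any idea how I can handle this? – kero

I ran it on this new DataFrame: 'df = pandas.DataFrame({'A': ['250.82 - DIABETES, .TYPE II', '19.99 Hello'], 'B': ['700.52 Test', 'Test 7.7']})' and I don't get any 'TypeError'. Maybe it is caused by a string other than ***250.82 - DIABETES, .TYPE II***. – estebanpdl

I don't know, but it may be something like **V22.0 - SUPERVIS NORMAL 1ST PREG** – kero

You should look at df.applymap and apply it to the columns you want to remove the text from. [Edit] Or: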

import pandas as pd 
test = [{'c1':10, 'c2':100}, {'c1':11,'c2':110}, {'c1':12,'c2':120}] 
fun = lambda x: x+10 
df = pd.DataFrame(test) 
df['c1'] = df['c1'].apply(fun) 
print(df)
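To tie this back to the question, the same apply call can take a function that pulls the number out of each value. This is only a sketch: extract_number is a hypothetical helper written for this example (not part of pandas), it keeps only the first number it finds, and the frame reuses two rows from the question.

```python
import re
import pandas as pd

NUMBER = re.compile(r"\d+(?:\.\d+)?")

def extract_number(value):
    """Return the first number found in value as a string, or None."""
    match = NUMBER.search(str(value))  # str() guards against ints and floats
    return match.group(0) if match else None

df = pd.DataFrame({
    "Admit_DX_Description": ["510.9 - EMPYEMA W/O FISTULA", "729.5 - PAIN IN LIMB"],
    "Primary_DX_Description": ["510.9 - EMPYEMA W/O FISTULA",
                               "998.30 - DISRUPTION OF WOUND, UNSPEC"],
})

for col in ["Admit_DX_Description", "Primary_DX_Description"]:
    df[col] = df[col].apply(extract_number)

print(df)
```

applymap is a DataFrame method, so for a single column (a Series) apply or map is the call to use.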


2017-02-03 21:11:10

I tried that, but I get this error: **AttributeError: 'Series' object has no attribute 'applymap'** – kero

OK, sorry, I have edited my answer –
