Pandas: trying to use a regex with the apply method on a column
I have a DataFrame of airplane crash data.
In []: df = pd.read_csv('Airplane_Crashes_and_Fatalities_Since_1908.csv')
In []: df.info()
In []: df.head()
Out []:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5268 entries, 0 to 5267
Data columns (total 13 columns):
Date            5268 non-null object
Time            3049 non-null object
Location        5248 non-null object
Operator        5250 non-null object
Flight #        1069 non-null object
Route           3562 non-null object
Type            5241 non-null object
Registration    4933 non-null object
cn/In           4040 non-null object
Aboard          5246 non-null float64
Fatalities      5256 non-null float64
Ground          5246 non-null float64
Summary         4878 non-null object
dtypes: float64(3), object(10)
memory usage: 535.1+ KB
Out []:
         Date   Time                            Location  \
0  09/17/1908  17:18                 Fort Myer, Virginia
1  07/12/1912  06:30             AtlantiCity, New Jersey
2  08/06/1913    NaN  Victoria, British Columbia, Canada
3  09/09/1913  18:30                  Over the North Sea
4  10/17/1913  10:30          Near Johannisthal, Germany

                 Operator  Flight #          Route                    Type  \
0    Military - U.S. Army       NaN  Demonstration        Wright Flyer III
1    Military - U.S. Navy       NaN    Test flight               Dirigible
2                 Private         -            NaN        Curtiss seaplane
3  Military - German Navy       NaN            NaN  Zeppelin L-1 (airship)
4  Military - German Navy       NaN            NaN  Zeppelin L-2 (airship)

   Registration  cn/In  Aboard  Fatalities  Ground  \
0           NaN      1     2.0         1.0     0.0
1           NaN    NaN     5.0         5.0     0.0
2           NaN    NaN     1.0         1.0     0.0
3           NaN    NaN    20.0        14.0     0.0
4           NaN    NaN    30.0        30.0     0.0

                                             Summary
0  During a demonstration flight, a U.S. Army fly...
1  First U.S. dirigible Akron exploded just offsh...
2  The first fatal airplane accident in Canada oc...
3  The airship flew into a thunderstorm and encou...
4  Hydrogen gas which was being vented was sucked...
I want to categorize the 'Operator' column and create a new column containing the plane type. I tried using a regex with apply():
import re

def plane_type(plane):
    m = re.search(r'\w*Military', plane)
    p = re.search(r'\w*Private', plane)
    if m:
        return 'Military'
    elif p:
        return 'Private'
    else:
        return 'Passengers'

df['plane_type'] = df['operator'].apply(plane_type)
I also tried it with a lambda:
df['plane_type'] = df['operator'].apply(lambda x: plane_type(x))
Either way, I get a TypeError every time:
TypeError: expected string or buffer
Could someone please tell me what I'm missing?
Try: `df['plane_type'] = df['operator'].astype(str).apply(lambda x: plane_type(x))`. – Abdou
Also, two more things: your column name is `Operator`, but you seem to be indexing `operator`; and are you sure those are the `regex` patterns you want to use? – Abdou
@Abdou, yes, I changed the column names to lowercase and forgot to mention it. Thanks, it works now :) (I had just mixed up the order of astype(str)) –
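For anyone landing here later: the df.info() output in the question shows only 5250 non-null values out of 5268 in the Operator column, so the TypeError most likely comes from the missing rows. Pandas stores them as NaN, which is a float, and re.search only accepts strings. Below is a minimal sketch of that, not the asker's actual data: the four-row frame is a made-up stand-in for the CSV, fillna('') is a variation on the astype(str) suggestion that makes the handling of missing rows explicit, and the np.select version is an alternative I am adding for illustration. The `\w*` prefix is dropped because re.search already matches anywhere in the string.

```python
import re

import numpy as np
import pandas as pd

# Hypothetical stand-in for the real CSV: three operators plus one missing value.
df = pd.DataFrame({
    "operator": [
        "Military - U.S. Army",
        "Private",
        np.nan,
        "Deutsche Lufthansa",
    ]
})


def plane_type(plane):
    """Classify an operator string as Military, Private, or Passengers."""
    if re.search(r"Military", plane):
        return "Military"
    if re.search(r"Private", plane):
        return "Private"
    return "Passengers"


# Applying directly fails: the NaN row is a float, and re.search rejects it.
try:
    df["operator"].apply(plane_type)
    raised = False
except TypeError:
    raised = True

# Replace missing values with an empty string first, so every value is a str.
df["plane_type"] = df["operator"].fillna("").apply(plane_type)

# Vectorized alternative: na=False makes missing values count as "no match".
is_military = df["operator"].str.contains("Military", na=False)
is_private = df["operator"].str.contains("Private", na=False)
df["plane_type_vec"] = np.select(
    [is_military, is_private], ["Military", "Private"], default="Passengers"
)
```

Note that both variants put rows with a missing operator into 'Passengers'. If that is not what you want, you could give them their own label, for example by adding `df["operator"].isna()` as the first condition in np.select.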