2017-06-06 99 views
2

Removing substrings in pandas, Python

I have a DataFrame of high schools, and I am trying to remove the generic endings from the school names.

in[1]:df 
out[2]: 
    time school 
1 09:00 Brown Academy 
2 10:00 Covfefe High School 
3 11:00 Bradley High 
4 12:00 Johnson Prep 

school_endings = ['Academy', 'Prep', 'High', 'High School'] 

Expected output:

out[3]: 
    time school 
1 09:00 Brown 
2 10:00 Covfefe 
3 11:00 Bradley 
4 12:00 Johnson 

Answers

2
endings = ['Academy', 'Prep', 'High', 'High School'] 

endings = sorted(endings, key=len, reverse=True) 

df.assign(school=df.school.replace(endings, '', regex=True).str.strip()) 

    time school 
1 09:00 Brown 
2 10:00 Covfefe 
3 11:00 Bradley 
4 12:00 Johnson 
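The sort by length matters because the endings are tried one after another, and 'High' is a prefix of 'High School'. The following is a minimal plain-Python sketch of that idea, not the pandas call itself; the helper name `replace_in_order` is made up for illustration.

```python
import re

endings = ["Academy", "Prep", "High", "High School"]

def replace_in_order(name, patterns):
    """Apply each pattern in turn, replacing matches with an empty string."""
    for pattern in patterns:
        name = re.sub(pattern, "", name)
    return name.strip()

name = "Covfefe High School"

# Original order: 'High' is removed first, so 'High School' no longer matches.
unsorted_result = replace_in_order(name, endings)

# Longest first: 'High School' is removed before 'High' is tried.
sorted_result = replace_in_order(name, sorted(endings, key=len, reverse=True))

print(repr(unsorted_result))  # 'Covfefe  School'
print(repr(sorted_result))    # 'Covfefe'
```

Note that these patterns are not anchored, so an ending that appears in the middle of a name would be removed as well.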
0

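Use the rstrip() method to strip the unwanted trailing characters from the original string. Note that rstrip() treats its argument as a set of characters to remove, not as a suffix, so the space before the ending is left behind. For example:

mystring = "Brown Academy"

mystring.rstrip("Academy") -> this gives you the output 'Brown ' (with a trailing space); chain .strip() to get 'Brown'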
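Because rstrip() works on a character set, it can remove more than the intended ending when the name itself ends in those characters. A hedged alternative is to check for a whole-word suffix with endswith(); the helper `strip_ending` and the school name "Lehigh High School" below are made up for illustration.

```python
endings = ["Academy", "Prep", "High", "High School"]

def strip_ending(name, endings):
    """Remove the first matching whole-word ending from name, if any."""
    # Try longer endings first so 'High School' wins over 'High'.
    for ending in sorted(endings, key=len, reverse=True):
        suffix = " " + ending
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name

# rstrip() keeps removing any trailing character found in "High School",
# including letters that belong to the name itself.
overstripped = "Lehigh High School".rstrip("High School")
print(repr(overstripped))  # 'Le'

# The suffix check removes only the ending.
safe = strip_ending("Lehigh High School", endings)
print(repr(safe))  # 'Lehigh'
```

With a pandas column, this could be applied as df['school'].apply(lambda s: strip_ending(s, endings)).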

0

I would probably go with a regular expression replacement:

import re 

df['school']=df['school'].apply(lambda x: re.sub(r'\s+((Academy)|(Prep)|(High)|(High School))$','',x)) 
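The same anchored pattern can also be built from the list of endings and applied with the vectorized Series.str.replace instead of apply. This is a sketch under the assumption that the endings are plain words with no regex metacharacters; the DataFrame is rebuilt from the question's data so the snippet runs on its own.

```python
import pandas as pd

df = pd.DataFrame({
    "time": ["09:00", "10:00", "11:00", "12:00"],
    "school": ["Brown Academy", "Covfefe High School",
               "Bradley High", "Johnson Prep"],
})
endings = ["Academy", "Prep", "High", "High School"]

# Longest endings first, joined into one non-capturing alternation.
# Assumption: the endings contain no characters that are special in a regex.
alternation = "|".join(sorted(endings, key=len, reverse=True))

# \s+ consumes the space before the ending; $ anchors the match at the end.
pattern = rf"\s+(?:{alternation})$"

df["school"] = df["school"].str.replace(pattern, "", regex=True)
print(df)
```

Anchoring with $ means an ending is removed only when it is the last part of the name, so a word such as 'High' in the middle of a name is left alone.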
4

Using split:

df.school = df.school.str.split(' ').str[0] 

    school time 
0 Brown 09:00 
1 Covfefe 10:00 
2 Bradley 11:00 
3 Johnson 12:00
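Keeping only the first word works for the sample data because every school name there is a single word. A short sketch of the limitation, using a made-up multi-word name:

```python
# The second name is hypothetical and not part of the question's data.
names = ["Brown Academy", "St Mary High School"]

# Taking element 0 of the split keeps only the first word of each name.
first_words = [name.split(" ")[0] for name in names]
print(first_words)  # ['Brown', 'St']
```

If multi-word school names are possible, an approach that removes only the known endings is likely the safer choice.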