Separating first, middle and last names (Python)

I have a list of a few hundred members that I want to split into first, middle and last names, but some members have a prefix (denoted by 'P'). All possible combinations:

First Middle Last 
P First Middle Last 
First P Middle Last 
P First p Middle Last

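How do I separate the first name (with P, if present), the middle name (with P, if present), and the last name in Python? This is what I came up with, but it doesn't work.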

import csv 
inPath = "input.txt" 
outPath = "output.txt" 

newlist = [] 

file = open(inPath, 'rU')
if file:
    for line in file:
        member = line.split()
        newlist.append(member)
    file.close()
else:
    print "Error Opening File."

file = open(outPath, 'wb')
if file:
    for i in range(len(newlist)):
        print i, newlist[i][0] # Should get the First Name with Prefix
        print i, newlist[i][1] # Should get the Middle Name with Prefix
        print i, newlist[i][-1]
    file.close()
else:
    print "Error Opening File."

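What I want:

Get the first and middle names together with their prefixes (if present)
Output each (first, middle, last) to a delimited txt file, or to a CSV file (preferred).

Thanks a lot for your help.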

Source

2011-01-13 3zzy

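It's not clear from the examples what a "prefix" is; for example, how do you tell whether "A B C D" is ("A B", "C", "D") or ("A", "B C", "D")? Please give a more complete example and explain more specifically what a "prefix" is. – 2011-01-13 08:47:30

If the prefixes are one letter long, and no names are one letter long, you could try filtering on `len()` and grouping each prefix with its respective name. Just an idea. – soulseekah 2011-01-13 08:55:55

There are only three prefixes: "M", "Shk" and "BS" – 3zzy 2011-01-13 09:12:54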

How about this complete test script:

import sys 

def process(file):
    for line in file:
        arr = line.split()
        if not arr:
            continue  # skip blank lines
        last = arr.pop()
        # arr now holds everything before the last name
        n = len(arr)
        if n == 4:
            # prefix on both the first and the middle name
            first, middle = ' '.join(arr[:2]), ' '.join(arr[2:])
        elif n == 3:
            if arr[0] in ('M', 'Shk', 'BS'):
                # prefix on the first name only
                first, middle = ' '.join(arr[:2]), arr[-1]
            else:
                # prefix on the middle name only
                first, middle = arr[0], ' '.join(arr[1:])
        elif n == 2:
            first, middle = arr
        else:
            continue  # unexpected number of words
        print 'First: %r' % first
        print 'Middle: %r' % middle
        print 'Last: %r' % last

if __name__ == '__main__': 
    process(sys.stdin)

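If you're on Linux, run this, type in the example lines, and then press Ctrl+D to signal end of input. On Windows, use Ctrl+Z instead of Ctrl+D. Of course, you can also pipe in a file.

The following input file: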

First Middle Last 
M First Middle Last 
First Shk Middle Last 
BS First M Middle Last

gives this output:

First: 'First' 
Middle: 'Middle' 
Last: 'Last' 
First: 'M First' 
Middle: 'Middle' 
Last: 'Last' 
First: 'First' 
Middle: 'Shk Middle' 
Last: 'Last' 
First: 'BS First' 
Middle: 'M Middle' 
Last: 'Last'
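Since the question prefers CSV output, the same grouping can also be written out with the csv module. The following is only a minimal sketch in Python 3 syntax (the scripts in this thread are Python 2), assuming the three prefixes named in the comments. The names split_name, write_csv and sample are illustrative and not part of the original answer, and an in-memory buffer stands in for the output file.

```python
import csv
import io

PREFIXES = {"M", "Shk", "BS"}  # the three prefixes named in the comments


def split_name(line):
    """Hypothetical helper: return (first, middle, last) or None.

    A prefix is kept together with the word that follows it.
    """
    parts = []
    pending = None  # prefix waiting for the name it belongs to
    for word in line.split():
        if pending is not None:
            parts.append(pending + " " + word)
            pending = None
        elif word in PREFIXES:
            pending = word
        else:
            parts.append(word)
    # Reject blank lines, dangling prefixes and anything that is not 3 parts.
    if pending is not None or len(parts) != 3:
        return None
    return tuple(parts)


def write_csv(lines, out):
    """Write one first,middle,last row per parseable line."""
    writer = csv.writer(out)
    for line in lines:
        row = split_name(line)
        if row is not None:
            writer.writerow(row)


sample = [
    "First Middle Last",
    "M First Middle Last",
    "First Shk Middle Last",
    "BS First M Middle Last",
]
buffer = io.StringIO()  # stands in for open("output.csv", "w", newline="")
write_csv(sample, buffer)
print(buffer.getvalue())
```

Unlike the script above, this sketch does not dispatch on the word count; it simply attaches every prefix to the next word, so it would also accept a prefix in front of the last name.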

Source

2011-01-13 09:19:14

Great! Works like a charm! :D – 3zzy 2011-01-13 09:49:53

names = [('A', 'John', 'Paul', 'Smith'), 
('Matthew', 'M', 'Phil', 'Bond'), 
('A', 'Morris', 'O', 'Reil', 'M', 'Big')] 

def getItem(): 
    for name in names: 
     for (pos,item) in enumerate(name): 
      yield item 

itembase = getItem() 

for i in enumerate(names): 
    element = itembase.next() 
    if len(element) == 1: firstName = element+" "+itembase.next() 
    else: firstName = element 
    element = itembase.next() 
    if len(element) == 1: mName = element+" "+itembase.next() 
    else: mName = element 
    element = itembase.next() 
    if len(element) == 1: lastName = element+" "+itembase.next() 
    else: lastName = element 

    print "First Name: "+firstName 
    print "Middle Name: "+mName 
    print "Last Name: "+lastName 
    print "--"

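This seems to work. Replace the len(element) == 1 condition (I didn't know you only needed to check for three prefixes, so I made it match any single letter) with a condition that looks for your three prefixes.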
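As a rough illustration of that replacement, the length test can become a membership test against the three prefixes from the comments. This is a sketch in Python 3 syntax (next(it) rather than the Python 2 .next() method), and group_prefixes is a hypothetical name, not part of the answer above.

```python
PREFIXES = ("M", "Shk", "BS")  # replaces the len(element) == 1 test


def group_prefixes(words):
    """Yield each name, joined to the prefix in front of it if there is one."""
    it = iter(words)
    for word in it:
        if word in PREFIXES:
            # A default of "" avoids StopIteration on a trailing prefix.
            following = next(it, "")
            yield (word + " " + following).strip()
        else:
            yield word


print(list(group_prefixes("Shk First M Middle Last".split())))
# ['Shk First', 'M Middle', 'Last']
```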

**Output** 
First Name: A John 
Middle Name: Paul 
Last Name: Smith 

First Name: Matthew 
Middle Name: M Phil 
Last Name: Bond 

First Name: A Morris 
Middle Name: O Reil 
Last Name: M Big

Source

2011-01-13 09:16:38 soulseekah

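Doesn't seem to work for this: `Firts Middle Last | M First Middle | First Shk Middle Last | Shk First M Middle Last` – 3zzy 2011-01-13 09:31:38

I said you have to replace `len(element) == 1` with the condition you need. I can't do all the work for you; this is just an example. The answers others have provided are better, and we're all here to learn. – soulseekah 2011-01-13 10:12:01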

Here is another solution (obtained by modifying the source code given in the question):

import csv 
inPath = "input.txt" 
outPath = "output.txt" 

newlist = [] 

file = open(inPath, 'rU')
if file:
    for line in file:
        member = line.split()
        newlist.append(member)
    file.close()
else:
    print "Error Opening File."

file = open(outPath, 'wb')
if file:
    for fullName in newlist:
        prefix = ""
        for name in fullName:
            if name == "P" or name == "p":
                prefix = name + " "
                continue
            print prefix+name
            prefix = ""
        print
    file.close()
else:
    print "Error Opening File."

Source

2011-01-13 09:25:56 Trivikram

Here you go, in an object-oriented way:

class Name(object):
    def __init__(self, fullname):
        self.full = fullname
        s = self.full.split()

        try:
            self.first = " ".join(s[:2]) if len(s[0]) == 1 else s[0]
            s = s[len(self.first.split()):]

            self.middle = " ".join(s[:2]) if len(s[0]) == 1 else s[0]
            s = s[len(self.middle.split()):]

            self.last = " ".join(s[:2]) if len(s[0]) == 1 else s[0]
        finally:
            pass

names = [ 
    "First Middle Last", 
    "P First Middle Last", 
    "First P Middle Last", 
    "P First p Middle Last", 
] 

for fullname in names: 
    name = Name(fullname) 
    print (name.first, name.middle, name.last)

Source

2011-01-13 09:33:09 kovshenin

If "M", "Shk" and "BS" are not valid first/last names, i.e. you don't care about their exact position, you can filter them out with one line of code:

first, middle, last = filter(lambda x: x not in ('M','Shk','BS'), yourNameHere.split())

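where, of course, yourNameHere is the string containing the name you want to parse.

Warning: for this code I assume you always have a middle name, as in the examples you gave above. If not, you have to get the whole list and count its elements to determine whether there is a middle name.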
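The counting step mentioned in the warning could look roughly like this. It is a sketch in Python 3 syntax, assuming that a name without a middle part should yield an empty string; split_without_prefixes is a hypothetical helper name, not part of the answer.

```python
PREFIXES = ("M", "Shk", "BS")


def split_without_prefixes(full_name):
    """Drop the prefixes, then decide by word count whether a middle name exists."""
    words = [w for w in full_name.split() if w not in PREFIXES]
    if len(words) == 3:
        first, middle, last = words
    elif len(words) == 2:
        first, last = words
        middle = ""  # assumption: no middle name is reported as an empty string
    else:
        raise ValueError("expected 2 or 3 names, got %d" % len(words))
    return first, middle, last


print(split_without_prefixes("BS First M Middle Last"))  # ('First', 'Middle', 'Last')
print(split_without_prefixes("Shk First Last"))          # ('First', '', 'Last')
```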

Edit: if you do care about the prefix position (this drops a prefix only when it appears at index 0 or 2):

first, middle, last = map(
    lambda x: x[1], 
    filter(
     lambda (i,x): i not in (0, 2) or x not in ('M','Shk','BS'), 
     enumerate(yourNameHere.split())))

Source

2011-01-13 09:34:44 redShadow

I would use regular expressions, which are designed specifically for this kind of task. This solution is easy to maintain and understand.

Worth a try. http://docs.python.org/library/re.html

import re

you_string = "BS First M Middle Last"  # the name to parse

# patterns; a prefix is captured together with the name it belongs to,
# so groups 1 to 3 are always first, middle and last
# First Middle Last
first = re.compile(r"^(\w+) +(\w+) +(\w+)$")
# P First Middle Last
second = re.compile(r"^((?:M|Shk|BS) +\w+) +(\w+) +(\w+)$")
# First P Middle Last
third = re.compile(r"^(\w+) +((?:M|Shk|BS) +\w+) +(\w+)$")
# P First p Middle Last
fourth = re.compile(r"^((?:M|Shk|BS) +\w+) +((?:M|Shk|BS) +\w+) +(\w+)$")

# use the first pattern that matches the whole string
for pattern in (first, second, third, fourth):
    parsed = pattern.search(you_string)
    if parsed:
        print parsed.group(1), parsed.group(2), parsed.group(3)
        break
else:
    print "no match at all"

Because the patterns are precompiled, they are not recompiled for every name, which should make this faster.
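The four cases can also be folded into one pattern by making each prefix optional. The following is a sketch in Python 3 syntax, assuming the three prefixes from the comments; NAME_RE and parse are hypothetical names, not part of the answer above.

```python
import re

PREFIX = r"(?:M|Shk|BS)"  # the three prefixes named in the comments

# An optional prefix plus whitespace, then the name itself, twice;
# the last name is a single word.
NAME_RE = re.compile(
    r"^\s*((?:" + PREFIX + r"\s+)?\w+)"
    r"\s+((?:" + PREFIX + r"\s+)?\w+)"
    r"\s+(\w+)\s*$"
)


def parse(name):
    """Return (first, middle, last) with prefixes attached, or None if no match."""
    m = NAME_RE.match(name)
    if m is None:
        return None
    # Collapse any repeated whitespace between a prefix and its name.
    return tuple(" ".join(group.split()) for group in m.groups())


for line in ("First Middle Last", "M First Middle Last",
             "First Shk Middle Last", "BS First M Middle Last"):
    print(parse(line))
```

Because a prefix only counts when it is followed by whitespace, a name that merely starts with the same letters (for example "Mary") is still treated as a name.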

Source

2011-01-13 09:37:31 pav

import csv 

class CsvWriter(object): 
    """ 
    Wraps csv.writer in a partial file-API compatibility layer 
    """ 
    def __init__(self, fname, mode='w', *args, **kwargs): 
     super(CsvWriter, self).__init__() 
     self.f = open(fname, mode) 
     self.writer = csv.writer(self.f, *args, **kwargs) 

    def write(self, *args): 
     """ 
     Writes a row of data to the csv file 

     Can be called as 
      .write()   puts a blank row 
      .write(2)  puts a single cell 
      .write([1,2,3]) puts 3 cells 
      .write(1,2,3) puts 3 cells 
     """ 
     if len(args)==1 and hasattr(args[0], ('__iter__')): 
      # single argument, and it's a sequence - let it be the row data 
      rowdata = args[0] 
     else: 
      rowdata = args 

     self.writer.writerow(rowdata) 

    def close(self): 
     self.writer = None 
     self.f.close() 

    def __enter__(self): 
     return self 

    def __exit__(self, *exc): 
     self.close() 

class NameSplitter(object): 
    def __init__(self, pre=None): 
     super(NameSplitter, self).__init__() 

     # list of accepted prefixes 
     if pre is None: 
      self.pre = set(['m','shk','bs']) 
     else: 
      self.pre = set([s.lower() for s in pre]) 

     # is-a-prefix word tester 
     self.isPre = lambda x,p=self.pre: x.lower() in p 

     jn = lambda *args: ' '.join(args)  # join the words passed as separate arguments

     # signature-based dispatch table 
     self.match = {} 
     self.match[(3,())] = lambda w,j=jn: (w[0],   w[1],   w[2]) 
     self.match[(4,(0,))] = lambda w,j=jn: (j(w[0],w[1]), w[2],   w[3]) 
     self.match[(4,(1,))] = lambda w,j=jn: (w[0],   j(w[1],w[2]), w[3]) 
     self.match[(5,(0,2))] = lambda w,j=jn: (j(w[0],w[1]), j(w[2],w[3]), w[4]) 

    def __call__(self, nameStr): 
     words = nameStr.split() 

     # build hashable signature 
     pres = tuple(n for n,word in enumerate(words) if self.isPre(word)) 
     sig = (len(words), pres) 

     try: 
      do = self.match[sig] 
      return do(words) 
     except KeyError: 
      return None 

def process(inf, outf, fn): 
    for line in inf: 
     res = fn(line) 
     if res is not None: 
      outf.write(res) 

def main(): 
    infname = "input.txt" 
    outfname = "output.csv" 

    with open(infname,'rU') as inf: 
     with CsvWriter(outfname) as outf: 
      process(inf, outf, NameSplitter()) 

if __name__=="__main__": 
    main()

Source

2011-01-14 00:40:56

The complete script:

import sys 

def f(a,b): 
    if b in ('M','Shk','BS'): 
      return '%s %s' % (b,a) 
    else: 
      return '%s,%s' % (b,a) 

for line in sys.stdin: 
    sys.stdout.write(reduce(f, reversed(line.split(' '))))

Input:

First Middle Last 
M First Middle Last 
First Shk Middle Last 
BS First M Middle Last

CSV output:

First,Middle,Last 
M First,Middle,Last 
First,Shk Middle,Last 
BS First,M Middle,Last

Source

2011-01-14 09:28:14
