2015-02-11 89 views
1

Python: grab an entire string as one element

My input looks like this:

MSG1   .STRINGZ "This is my sample string : "
MEMORYSPACE .BLKW  9 
NEWLINE  .FILL  #10 
NEG48   .FILl  #-48 

     .END 

Right now I have code that splits each line of my input file into words, like this:

['MSG1', '.STRINGZ', '"This', 'is', 'my', 'sample', 'string', ':', '"']
['MEMORYSPACE', '.BLKW', '9'] 
['NEWLINE', '.FILL', '#10'] 
['NEG48', '.FILl', '#-48'] 
[] 
['.END'] 

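In the input file, the first line contains a string. I want the whole string to be treated as a single element so that I can compute its length in my code. Is there a way to do this? Here is my code: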

f = open ('testLC31.txt', 'r') 
line_count = 0 

to_ignore = ["AND", "ADD", "LEA", "PUTS", "JSR", "LD", "JSRR" , "NOT", "LDI" , 
      "LDR", "ST", "STI", "STR", "BR" , "JMP", "TRAP" , "JMP", "RTI" , 
      "BR", "ST", "STI" , "STR" , "BRz", "BRn" , "HALT"] 

label = [] 
instructions = [] 

for line in f:
    elem = line.split() if line.split() else ['']
    if len(elem) > 1 and elem[0] not in to_ignore:
        label.append(elem[0])
        instructions.append(elem[1])
        line_count += 1
    elif elem[0] in to_ignore:
        line_count += 1

Is the delimiter a tab, a run of spaces, or a combination of both? – 2015-02-11 03:00:24

Answers

0

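This can be done by assuming that a .STRINGZ directive and the string it declares always sit on a single line.

Result:

 "This is my sample string : "
len(stringz_): 32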

text_ = """ 
MSG1   .STRINGZ "This is my sample string : " 
MEMORYSPACE .BLKW  9 
NEWLINE  .FILL  #10 
NEG48   .FILl  #-48 

     .END 
""" 

STRINGZ_ = '.STRINGZ' 
line_count_ = 0 

lines_ = text_.split('\n') 

to_ignore = ["AND", "ADD", "LEA", "PUTS", "JSR", "LD", "JSRR" , "NOT", "LDI" , 
      "LDR", "ST", "STI", "STR", "BR" , "JMP", "TRAP" , "JMP", "RTI" , 
      "BR", "ST", "STI" , "STR" , "BRz", "BRn" , "HALT"] 

label = [] 
instructions = [] 

for line in lines_:
    if STRINGZ_ in line:
        # Everything after the directive, including surrounding whitespace and quotes
        stringz_ = line.split(STRINGZ_)[1]
        print(stringz_)
        print('len(stringz_): ' + str(len(stringz_)))
    elem = line.split() if line.split() else ['']
    if len(elem) > 1 and elem[0] not in to_ignore:
        label.append(elem[0])
        instructions.append(elem[1])
        line_count_ += 1
    elif elem[0] in to_ignore:
        line_count_ += 1
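
Note that the remainder after .STRINGZ still carries its surrounding whitespace and both quotes, so its length is larger than that of the string itself. A minimal sketch of one way to isolate the text, assuming the string is delimited by the first and last double quote on the line (the helper name stringz_operand is hypothetical, not part of the code above):

```python
def stringz_operand(line, directive='.STRINGZ'):
    """Return the text between the quotes of a .STRINGZ line, or None."""
    if directive not in line:
        return None
    rest = line.split(directive, 1)[1]
    start = rest.find('"')
    end = rest.rfind('"')
    if start == -1 or end <= start:
        return None  # no properly quoted operand on this line
    return rest[start + 1:end]

sample = 'MSG1   .STRINGZ "This is my sample string : " '
text = stringz_operand(sample)
print(repr(text))
print(len(text))
```

Whether the assembler should count the quotes or a terminating character depends on the target format, so adjust the slice to whatever you need to measure.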
0
with open("filename") as f: 
    rd = f.readlines() 
    print (rd[0].split("\n")[0].split()) 

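Split the first line on \n, then on whitespace, and print the resulting list of words. readlines() returns a list of lines, which is easier to manipulate. Opening the file with with open() is also preferable, because the file is closed automatically.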

1

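The str.split method takes an optional maxsplit argument, which limits the number of splits and therefore the number of elements in the resulting list: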

>>> 'MSG1   .STRINGZ "This is my sample string : "'.split(None, 2)
['MSG1', '.STRINGZ', '"This is my sample string : "']
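
Applied to a whole file, this could look like the sketch below. It assumes that every line carrying an operand starts with a label, so the first two tokens are always the label and the directive; the lines list stands in for the input file and is only illustrative.

```python
lines = [
    'MSG1   .STRINGZ "This is my sample string : "',
    'MEMORYSPACE .BLKW  9',
    'NEWLINE  .FILL  #10',
    '',
    '     .END',
]

parsed = []
for line in lines:
    # At most two splits: label, directive, and the untouched remainder.
    parts = line.strip().split(None, 2)
    if parts:
        parsed.append(parts)

for parts in parsed:
    if len(parts) == 3 and parts[1] == '.STRINGZ':
        operand = parts[2]
        # Length of the operand with its quotes included.
        print(operand, len(operand))
```

A line without a label but with an operand would break the assumption, because the operand would then be split after its first word.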

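If you need something more complex than taking the first two words and leaving the rest intact, shlex.split may suit you. It uses shell-like syntax to split a string into parts and treats a quoted string as a single element. You can tune the format precisely by creating a shlex object instance and changing its attributes. See the documentation for details.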

>>> import shlex
>>> shlex.split('MSG1   .STRINGZ "This is my sample string : "')
['MSG1', '.STRINGZ', 'This is my sample string : '] 
>>> shlex.split('MSG1   .STRINGZ "This is my sample string : "', posix=False) 
['MSG1', '.STRINGZ', '"This is my sample string : "'] 
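
As a rough sketch, the same call can be run over every line of the input to collect the length of each declared string. The source text and the string_lengths dictionary are illustrative names; the default POSIX mode is used, so the quotes are stripped before measuring.

```python
import shlex

source = '''MSG1   .STRINGZ "This is my sample string : "
MEMORYSPACE .BLKW  9
NEWLINE  .FILL  #10
NEG48   .FILL  #-48

     .END
'''

# One token list per line; an empty line yields an empty list.
tokens_per_line = [shlex.split(line) for line in source.splitlines()]

string_lengths = {}
for tokens in tokens_per_line:
    if len(tokens) == 3 and tokens[1] == '.STRINGZ':
        # Map the label to the length of the unquoted string.
        string_lengths[tokens[0]] = len(tokens[2])

print(tokens_per_line)
print(string_lengths)
```

shlex.split leaves comment handling off by default, so operands such as #10 come through as ordinary tokens.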

If that is still not enough, the remaining option is to write a full parser for the format, for example with the pyparsing library.

1

You could try joining those strings back together by hand, with a rough approach like this:

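tags = ['MSG1', '.STRINGZ', '"This', 'is', 'a', 'sample', 'string', '"']
FirstOccurance = 0  # 1 while we are inside a quoted string
longtag = ""
for tag in tags:
    if FirstOccurance == 1:
        if tag == "\"":
            longtag += tag        # closing quote, appended without a space
        else:
            longtag += " " + tag  # word inside the quoted string
    if ("\"" in tag) and (FirstOccurance == 0):
        longtag += tag            # token that opens the quoted string
        FirstOccurance = 1
    elif ("\"" in tag) and (FirstOccurance == 1):
        FirstOccurance = 0        # token that closes the quoted string

print(longtag)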
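
The same idea can be wrapped in a reusable function that returns the full token list. This is only a sketch: the name rejoin_quoted is hypothetical, and it assumes a quoted string opens in a token containing exactly one double quote and closes at the next token containing one. Rejoining with single spaces does not restore the original spacing, so the measured length can differ from the source text.

```python
def rejoin_quoted(tokens):
    """Merge the tokens of a quoted string back into a single element."""
    result = []
    buffer = None  # collects tokens while inside a quoted string
    for token in tokens:
        if buffer is None:
            if token.count('"') == 1:
                buffer = [token]      # opening quote seen, start collecting
            else:
                result.append(token)  # ordinary token, or one like '"x"'
        else:
            buffer.append(token)
            if '"' in token:
                result.append(' '.join(buffer))  # closing quote reached
                buffer = None
    if buffer is not None:
        result.extend(buffer)  # unterminated quote: keep the tokens as they were
    return result

tags = ['MSG1', '.STRINGZ', '"This', 'is', 'a', 'sample', 'string', '"']
merged = rejoin_quoted(tags)
print(merged)
```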

Hope this helps.

0

A simple assembler? Here is a rough pass using pyparsing:

code = """ 
MSG1   .STRINGZ "This is my sample string : " 
MEMORYSPACE .BLKW  9 
NEWLINE  .FILL  #10 
NEG48   .FILL  #-48 

     .END""" 

from pyparsing import Word, alphas, alphanums, Regex, Combine, quotedString, Optional 

identifier = Word(alphas, alphanums+'_') 
command = Word('.', alphanums) 

integer = Regex(r'[+-]?\d+') 
byte_literal = Combine('#' + integer) 
command_arg = quotedString | integer | byte_literal 
codeline = Optional(identifier)("label") + command("instruction") + Optional(command_arg("arg")) 

for line in code.splitlines():
    line = line.strip()
    if not line:
        continue  # skip blank lines

    print(line)
    assemline = codeline.parseString(line)
    print(assemline.dump())
    print()

This prints:

MSG1   .STRINGZ "This is my sample string : " 
['MSG1', '.STRINGZ', '"This is my sample string : "'] 
- arg: "This is my sample string : " 
- instruction: .STRINGZ 
- label: MSG1 

MEMORYSPACE .BLKW  9 
['MEMORYSPACE', '.BLKW', '9'] 
- arg: 9 
- instruction: .BLKW 
- label: MEMORYSPACE 

NEWLINE  .FILL  #10 
['NEWLINE', '.FILL', '#10'] 
- arg: #10 
- instruction: .FILL 
- label: NEWLINE 

NEG48   .FILL  #-48 
['NEG48', '.FILL', '#-48'] 
- arg: #-48 
- instruction: .FILL 
- label: NEG48 

.END 
['.END'] 
- instruction: .END 