2013-05-08 110 views
1

Python Regex Mark-Up

Hi everyone, I've run into a specific problem. I'm using Python regular expressions to convert a markup source into HTML output.

Markup source:

[ 
# sometextsometextsometextsometextsometextsometext. # 

# sometextsometextsometextsometextsometextsometextsometextsometext 
sometextsometextsometextsometextsometextsometext. # 
] 


[ 
hello i am a normal paragraph. 
] 

Desired output:

<ol> 
<li> sometextsometextsometextsometextsometextsometext. </li> 

<li> sometextsometextsometextsometextsometextsometextsometextsometext 
sometextsometextsometextsometextsometextsometext. </li> 
</ol> 

<p> 
hello i am a normal paragraph. 
</p> 
+1

How is the code supposed to know whether to put the text in a list or in a paragraph? – 2013-05-08 00:51:12

+0

By the presence of '#' inside the square brackets.. I think, not entirely sure. – user2360404 2013-05-08 00:57:07

+3

So what exactly is your question, and what solutions have you tried? – jwodder 2013-05-08 01:00:04

Answer

1
import re

with open('mk.txt') as f, open('newmk.txt', 'w') as g:
    text = f.read()
    # Each group is one [...] block; DOTALL lets '.' span newlines
    square_groups = re.findall(r'\[.+?\]', text, re.DOTALL)
    for group in square_groups:
        if '#' in group:  # must be ol
            group = group.replace('[', '<ol>')
            group = group.replace(']', '</ol>')
            # '#' followed by text opens a list item
            group = re.sub(r'#(?= ?\w)', '<li>', group)
            # '#' preceded by a word character or a space closes it
            group = re.sub(r'(?<=[\w ])#', '</li>', group)
        else:
            group = group.replace('[', '<p>')
            group = group.replace(']', '</p>')
        g.write(group)
        g.write('\n')  # optional, just makes the output look 'nicer'

This turns your input in mk.txt into the following text in newmk.txt:

<ol> 
<li> sometextsometextsometextsometextsometextsometext. </li> 

<li> sometextsometextsometextsometextsometextsometextsometextsometext 
sometextsometextsometextsometextsometextsometext. </li> 
</ol> 
<p> 
hello i am a normal paragraph. 
</p>
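
The same idea can also be written as a function that works on a string, which is easier to test than reading and writing files. This is only a rough sketch, not the answer's own method: the names `BLOCK_RE`, `ITEM_RE`, `convert_block` and `markup_to_html` are made up for illustration, and it assumes that list items are always wrapped in a pair of `#` characters and that brackets are never nested. Unlike the `findall` version, it uses `re.sub` with a callback, so any text outside the brackets is kept as-is.

```python
import re

# Hypothetical helper names; none of these appear in the original answer.
BLOCK_RE = re.compile(r'\[(.*?)\]', re.DOTALL)  # one [ ... ] block, non-greedy
ITEM_RE = re.compile(r'#(.*?)#', re.DOTALL)     # one # ... # list item


def convert_block(match):
    """Turn the body of one [ ... ] block into <ol> or <p> markup."""
    body = match.group(1)
    if '#' in body:
        # Assumption: any '#' inside the brackets means an ordered list.
        items = ITEM_RE.sub(lambda m: '<li>' + m.group(1) + '</li>', body)
        return '<ol>' + items + '</ol>'
    return '<p>' + body + '</p>'


def markup_to_html(text):
    """Replace every [ ... ] block in text; other text is left untouched."""
    return BLOCK_RE.sub(convert_block, text)


sample = "[\n# first item. #\n\n# second\nitem. #\n]\n\n[\nplain paragraph.\n]"
print(markup_to_html(sample))
```

Pairing the opening and closing `#` in a single pattern avoids the two separate lookahead/lookbehind substitutions, at the cost of misbehaving if an item's text itself contains a `#`.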