Using regular expressions

I am trying to extract the first paragraph, but I have not had any luck. Can anyone help me? Here is the text: http://dpaste.com/638776/. My text is dynamic. Thanks.

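Update: I am reading an XML file using the etree module. The XML contains a tag called <text></text>, and the data sits between <text> and </text>. I only want to print the following data from the text tags. Is that possible? Thanks.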

'''Zamindar''' ({{te|జమీందార్}}) is a 1965 [[Telugu language|Telugu]] "Thriller" film 
    directed by [[V. Madhusudhan Rao]] and produced by [[Tammareddy Krishna Murthy]] 
    of Ravindra Art Pictures.This is variety role for [[Akkineni Nageswara Rao]] 
    who is more popular with soft Romantic roles.He plays the role of a tough CID Officer  very well.The Movie has some Good songs.This movie has a considerable resemblance with the 1963 [[Cary Grant]] English Movie ''[[Charade (1963 film)|Charade]]''.
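The update above describes pulling the contents of <text> tags out of an XML file before applying any regular expression. A minimal sketch of that step, assuming the standard-library xml.etree.ElementTree module and a made-up sample document (the real file's layout is not shown in the question, so SAMPLE_XML and iter_text_bodies are hypothetical names):

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the structure of the real XML file is an assumption.
SAMPLE_XML = """<page>
  <title>Zamindar</title>
  <text>'''Zamindar''' is a 1965 film.

Second paragraph.</text>
</page>"""


def iter_text_bodies(xml_source):
    """Yield the character data of every <text> element in an XML string."""
    root = ET.fromstring(xml_source)
    for element in root.iter("text"):
        # element.text is None for an empty tag, so fall back to "".
        yield element.text or ""


bodies = list(iter_text_bodies(SAMPLE_XML))
print(bodies[0])
```

If the real file declares an XML namespace, the tag name passed to iter() would need the namespace prefix in braces, so the plain "text" lookup here may not match as written.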

Source

2011-10-22 no_freedom

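What do you mean by "paragraph"? Everything from '{{' to '}}'? It looks like a Wikipedia template, so if you use pywikipedia, there may be a better way.

@wiso It is a Wikipedia template. Thanks for your suggestion.

Very unclear... – heltonbiker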

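Revised based on the new information...

If you can produce the text between the tags, you only need a pattern for the first paragraph that fits all cases. Based on this example: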

import re
# data: the text between the <text> tags
firstparagraph = re.search(r"}}(.*?)\r*\n\r*\n", data, re.DOTALL)
if firstparagraph:  # re.search returns None when nothing matches
    print(firstparagraph.group(1))

Source

2011-10-22 13:48:20

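Thanks for your reply. But it is not working.

If you would like to post some details... I am still not sure whether you are trying to parse the pastebin or just the text?

It works great. But at the end I also get an error message: 'print firstparagraph.group(1) AttributeError: 'NoneType' object has no attribute 'group''. I only want the first paragraph, so I do not need '{{Infobox film | name = Bheemli Kabadi Jattu | image = | caption = | director = [[Tatineni Satya]] | producer = NV Prasad, Paras Jain }}'. Thanks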
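The AttributeError in the comment above appears when re.search finds no match and returns None, for example on a page with no infobox. One possible way around both problems, sketched under the assumption that any unwanted templates sit at the very start of the text, is to strip those leading {{...}} blocks first and then cut at the first blank line. The helper names strip_leading_templates and first_paragraph are made up for this illustration:

```python
import re


def strip_leading_templates(wikitext):
    """Drop any {{...}} templates (nested ones included) at the start of the text."""
    text = wikitext.lstrip()
    while text.startswith("{{"):
        depth = 0
        i = 0
        while i < len(text):
            pair = text[i:i + 2]
            if pair == "{{":
                depth += 1
                i += 2
            elif pair == "}}":
                depth -= 1
                i += 2
                if depth == 0:
                    break
            else:
                i += 1
        if depth != 0:
            # Unbalanced braces: return the text unchanged rather than guess.
            return text
        text = text[i:].lstrip()
    return text


def first_paragraph(wikitext):
    """Return the text up to the first blank line, or None if nothing is left."""
    body = strip_leading_templates(wikitext)
    if not body:
        return None
    # A blank line (LF or CRLF line endings) ends the paragraph.
    return re.split(r"\r?\n[ \t]*\r?\n", body, maxsplit=1)[0]
```

Templates inside the paragraph itself, such as the {{te|...}} call in the sample text, are left alone because only templates at the start are removed. Callers still need to check for None before printing.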

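If you build the regular expression so that the dot matches newlines, you can use the following (tested in Ruby, but I would guess it also works in Python). This is exactly the same as 尼爾·伯恩's answer: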

}}\n(.*?)\n\n

See the effect on Rubular.
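For reference, the pattern above can be tried in Python as well. This is only a sketch on a made-up sample string, using re.DOTALL as Python's flag for letting the dot match newlines:

```python
import re

# Same pattern as above; re.DOTALL lets "." match newline characters.
PATTERN = re.compile(r"}}\n(.*?)\n\n", re.DOTALL)

# Hypothetical input: an infobox followed by a two-line first paragraph.
sample = (
    "{{Infobox film\n| name = Example\n}}\n"
    "First line\nsecond line.\n\n"
    "Next paragraph.\n"
)

match = PATTERN.search(sample)
# search() returns None when the pattern is absent, so guard before group().
first = match.group(1) if match else None
print(first)
```

The pattern assumes LF line endings and a template that closes immediately before the paragraph; text with CRLF endings or no leading template would not match.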

Source

2011-10-22 15:50:58 lkuty
