Python: reading the first four lines from readlines()

How can I read only the first four lines from readlines()? My script receives the following on STDIN from a proxy:

GET http://www.yum.com/ HTTP/1.1 
Host: www.yum.com 
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:9.0.1) Gecko/20100101 Firefox/9.0.1 
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 
Accept-Language: en-gb,en;q=0.5 
Accept-Encoding: gzip, deflate 
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 
Proxy-Connection: keep-alive

I read it with sys.stdin.readlines() and log it to a file, but I only want the GET and User-Agent lines to be written to the log.

import sys

while True:
    line = sys.stdin.readlines()
    for l in line:
        log = open('/tmp/redirect.log', 'a')
        log.write(l)
        log.close()

Asked 2012-02-01 by krisdigitx

Use 4 spaces per indentation level. – 2012-02-01 11:16:03

Why use '.readlines()'? – 2012-02-01 11:18:17

Why are you opening, writing to, and closing the file for every line you read? That seems silly. – 2012-02-01 11:18:27

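Use with to make sure the log file is closed properly. As with any file-like object in Python, you can iterate over sys.stdin directly, which is faster because it does not need to build a list.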

import sys

with open('/tmp/redirect.log', 'a') as log:
    while True:  # Only if you need to continuously check for more.
        for line in sys.stdin:
            if line.startswith(("GET", "User-Agent")):
                log.write(line)

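The following is a potentially more efficient method, because it does not check for the same prefixes again and again and only keeps checking while there are still lines it needs. It is probably unnecessary in this case, but it is worth doing if you have more items to check for and more input to sort through. It also means you keep track of which parts you already have and do not read beyond what you need, which can be valuable if reading is an expensive operation.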

import sys

with open('/tmp/redirect.log', 'a') as log:
    while True:  # Only if you need to continuously check for more.
        needed = {"GET", "User-Agent"}
        for line in sys.stdin:
            for item in needed:
                if line.startswith(item):
                    log.write(line)
                    # Remove only the prefix that matched; breaking right
                    # away makes it safe to modify the set here.
                    needed.remove(item)
                    break
            if not needed:  # The set is empty, we have found all the lines we need.
                break

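The set is unordered, but we can assume the lines arrive in order, so they will also be logged in order.

This kind of setup may also be needed for more complex line checks (for example, using regular expressions). In your case, however, the first example is concise and should do the job well.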
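The early-exit idea above can also be factored into a small generator, which makes it possible to try out without a real stdin. This is only a sketch: the name `first_matching_lines`, its parameters, and the sample input are made up for illustration, and it assumes each prefix should be matched at most once.

```python
import io


def first_matching_lines(lines, prefixes):
    """Yield the first line starting with each prefix, then stop reading.

    `lines` is any iterable of strings (for example sys.stdin);
    `prefixes` is an iterable of the prefixes still wanted.
    """
    needed = set(prefixes)
    for line in lines:
        for prefix in needed:
            if line.startswith(prefix):
                needed.remove(prefix)  # safe: the inner loop ends right after
                yield line
                break
        if not needed:
            return  # every wanted prefix has been seen; stop consuming input


# Hypothetical sample input standing in for sys.stdin.
sample = io.StringIO(
    "GET http://www.yum.com/ HTTP/1.1\n"
    "Host: www.yum.com\n"
    "User-Agent: Mozilla/5.0\n"
    "Accept: text/html\n"
)
wanted = list(first_matching_lines(sample, ("GET", "User-Agent")))
```

Because the generator returns as soon as both prefixes have been found, the trailing `Accept` line is never read from `sample`. With real input, `log.writelines(first_matching_lines(sys.stdin, ("GET", "User-Agent")))` would be one way to use it.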

Answered 2012-02-01 11:32:41

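Hi Lattyware, this is a nice solution... but I have another function that operates on the line, and it looks like it is not breaking out of the loop properly. – krisdigitx 2012-02-01 12:41:25

@Lattyware Good to know that 'startswith' also accepts a tuple of strings. – jcollado 2012-02-01 12:49:41

Thanks for that, Lattyware. – krisdigitx 2012-02-01 17:12:08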

Assuming your input always starts with the 4 lines you want, this should work:

import sys

# Keep only the first four lines read from stdin.
with open('/tmp/redirect.log', 'a') as log:
    for l in sys.stdin.readlines()[:4]:
        log.write(l)

Otherwise, you will need to parse the input, possibly with regular expressions (there is another answer covering that).
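If only the first few lines matter, `itertools.islice` avoids reading the whole stream the way `readlines()` does. A minimal sketch, assuming the input is an iterable of lines; the helper name `head` and the sample data are illustrative only.

```python
import io
from itertools import islice


def head(lines, count=4):
    """Return at most `count` lines without consuming the rest of `lines`."""
    return list(islice(lines, count))


# Hypothetical sample input standing in for sys.stdin.
sample = io.StringIO("one\ntwo\nthree\nfour\nfive\n")
first_four = head(sample)
```

With real input this would be `head(sys.stdin)`, and the resulting list can be handed to `log.writelines(...)`.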

Answered 2012-02-01 11:17:31

Why open and close the log for every line added? – 2012-02-01 11:19:53

I just copied his code; the actual answer is readlines()[:4]. I agree it is not needed at all, though. Edited. – 2012-02-01 11:30:48

You can check the content of each line before writing it to the log:

import sys

while True:
    lines = sys.stdin.readlines()
    for line in lines:
        if line.startswith('GET') or line.startswith('User-Agent:'):
            log = open('/tmp/redirect.log', 'a')
            log.write(line)
            log.close()

For more complex checks, you can also use regular expressions.
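As one possible illustration of the regular-expression route, a single compiled pattern can replace the two 'startswith' calls. The pattern, the `WANTED` name, and the `wanted_lines` helper below are assumptions made for this sketch and are not part of the original answer.

```python
import re

# Matches lines beginning with the GET method or the User-Agent header name.
WANTED = re.compile(r"^(?:GET |User-Agent:)")


def wanted_lines(lines):
    """Return the lines that match the WANTED pattern, in their original order."""
    return [line for line in lines if WANTED.match(line)]


# Hypothetical sample request, shortened from the one in the question.
request = [
    "GET http://www.yum.com/ HTTP/1.1\n",
    "Host: www.yum.com\n",
    "User-Agent: Mozilla/5.0\n",
    "Accept: text/html\n",
]
selected = wanted_lines(request)
```

A regular expression only pays off once the checks go beyond fixed prefixes, for example when filtering on the requested host or on a particular browser version.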

Answered 2012-02-01 11:17:46 by jcollado

That gives me this error: if line.startswith('GET'): AttributeError: 'list' object has no attribute 'startswith' – krisdigitx 2012-02-01 12:07:54

@krisdigitx That was caused by a problem with the variable names. I have updated my answer with better names. – jcollado 2012-02-01 12:47:42

Thanks jcollado... I also used part of your solution. Cheers – krisdigitx 2012-02-01 17:23:03

>>> lines
['GET http://www.yum.com/ HTTP/1.1',
 'Host: www.yum.com',
 'User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:9.0.1) Gecko/20100101 Firefox/9.0.1',
 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
 'Accept-Language: en-gb,en;q=0.5',
 'Accept-Encoding: gzip, deflate',
 'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7',
 'Proxy-Connection: keep-alive']
>>> patterns = ["GET", "User-Agent"]
>>> for line in lines:
...     for pattern in patterns:
...         if line.startswith(pattern):
...             with open("/tmp/redirect.log", "a") as f:
...                 f.write(line)
...             break
...

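The with statement is placed inside the if so that, when the list of lines is long, the file handle is not kept open for a long time. break is used because each line matches at most one pattern: once a line has matched, there is no need to check it against the remaining patterns in the list.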

Answered 2012-02-01 11:41:41 by Kracekumar
