Creating a folder in Python

How can I make this script take "nmv-fas" from the link name, create a directory with that name, and then place all the downloaded files in that directory?

all.html (saved in the folder):

<a href="http://www.youversion.com/bible/gen.45.nmv-fas">http://www.youversion.com/bible/gen.45.nmv-fas</a> 
<a href="http://www.youversion.com/bible/gen.46.nmv-fas">http://www.youversion.com/bible/gen.46.nmv-fas</a> 
<a href="http://www.youversion.com/bible/gen.47.nmv-fas">http://www.youversion.com/bible/gen.47.nmv-fas</a> 
<a href="http://www.youversion.com/bible/gen.48.nmv-fas">http://www.youversion.com/bible/gen.48.nmv-fas</a> 
<a href="http://www.youversion.com/bible/gen.49.nmv-fas">http://www.youversion.com/bible/gen.49.nmv-fas</a> 
<a href="http://www.youversion.com/bible/gen.50.nmv-fas">http://www.youversion.com/bible/gen.50.nmv-fas</a> 
<a href="http://www.youversion.com/bible/exod.1.nmv-fas">http://www.youversion.com/bible/exod.1.nmv-fas</a> 
<a href="http://www.youversion.com/bible/exod.2.nmv-fas">http://www.youversion.com/bible/exod.2.nmv-fas</a> 
<a href="http://www.youversion.com/bible/exod.3.nmv-fas">http://www.youversion.com/bible/exod.3.nmv-fas</a>

The folder name should be:

nmv-fas

Python:

import lxml.html as html 
import urllib 
import urlparse 
from BeautifulSoup import BeautifulSoup 
import re 

root = html.parse(open('all.html')) 
for link in root.findall('//a'): 
    url = link.get('href') 
    name = urlparse.urlparse(url).path.split('/')[-1] 
    f = urllib.urlopen(url) 
    s = f.read() 
    f.close() 
    soup = BeautifulSoup(s) 
    articleTag = soup.html.body.article 
    converted = str(articleTag) 
    open(name, 'w').write(converted)

Source

2012-04-25 Blainer

You can use the lxml module to parse the links out of the file, then use urllib to download each link. Reading the links might look like this:

import lxml.html as html 

root = html.parse(open('links.html')) 
for link in root.findall('//a'): 
    url = link.get('href')

You can download a link to a file using urllib.urlopen:

import urllib 
import urlparse 

# extract the final path component and use it as 
# the local filename. 
name = urlparse.urlparse(url).path.split('/')[-1] 

fd = urllib.urlopen(url) 
open(name, 'w').write(fd.read())

Put these pieces together and you should have something close to what you want.
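The snippets above write each file to the current directory. For the folder part of the question, one possible sketch follows. It assumes Python 3, where `urlparse` lives in `urllib.parse` (the code above is Python 2). The helper names `folder_name_from_url` and `save_in_folder` are hypothetical, and the rule "use whatever follows the last dot of the final path component" is an assumption drawn from the example URLs.

```python
import os
from urllib.parse import urlparse


def folder_name_from_url(url):
    """Return the text after the last dot of the URL's final path component.

    For 'http://www.youversion.com/bible/gen.45.nmv-fas' this is 'nmv-fas'.
    (Assumed rule, based on the example links in the question.)
    """
    last_component = urlparse(url).path.split('/')[-1]
    return last_component.rsplit('.', 1)[-1]


def save_in_folder(url, content, base_dir='.'):
    """Write content to <base_dir>/<folder name>/<final path component>."""
    folder = os.path.join(base_dir, folder_name_from_url(url))
    # exist_ok=True avoids an error when the folder was already created
    # for an earlier link.
    os.makedirs(folder, exist_ok=True)
    filename = urlparse(url).path.split('/')[-1]
    path = os.path.join(folder, filename)
    with open(path, 'w', encoding='utf-8') as out:
        out.write(content)
    return path
```

Calling `save_in_folder(url, converted)` for each link would then place every file under a single `nmv-fas` directory, since all the example URLs end in the same suffix.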

Source

2012-04-25 16:26:19 larsks

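It works well, except that it only downloads the last link, not all of them. – Blainer 2012-04-25 16:37:11

Oh, no, it works fine if you put the pieces together correctly. You just copied and pasted without thinking. Perhaps you need to put things *inside the loop*. – larsks 2012-04-25 16:37:58

Yeah man, I have no idea what I'm doing. – Blainer 2012-04-25 17:00:15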
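The point made in the comments, that the download and write steps have to sit inside the loop, can be sketched as follows. This is a minimal illustration, not the answerer's code. It assumes Python 3 and swaps lxml for the standard library's `html.parser` so that it has no third-party dependency. `LinkCollector`, `fetch_page`, and `download_all` are hypothetical names, and the `fetch` parameter exists only so the loop can be tried without network access.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            href = dict(attrs).get('href')
            if href:
                self.urls.append(href)


def fetch_page(url):
    """Download url and return the raw bytes."""
    with urlopen(url) as response:
        return response.read()


def download_all(html_text, out_dir, fetch=fetch_page):
    """Save every linked page into out_dir and return the saved paths."""
    collector = LinkCollector()
    collector.feed(html_text)
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for url in collector.urls:
        # These lines are indented under the for loop, so they run once
        # per link. If they were dedented, only the last url would be
        # downloaded, which is the symptom described in the comments.
        name = urlparse(url).path.split('/')[-1]
        path = os.path.join(out_dir, name)
        with open(path, 'wb') as out:
            out.write(fetch(url))
        saved.append(path)
    return saved
```

With the contents of all.html read into a string, `download_all(html_text, 'nmv-fas')` would write one file per link into the `nmv-fas` folder.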
