2011-06-14 58 views
164

I want to download a working local copy of a web page, including all of its CSS, images, JavaScript, and so on, so that the copy loads correctly offline.

In previous discussions (e.g. here and here, both more than two years old), two suggestions generally come up: `wget -p` and httrack. However, both suggestions fail for me. I would greatly appreciate help accomplishing this task with either of these tools; alternatives are also welcome.


Option 1: wget -p

`wget -p` successfully downloads all of the page's prerequisites (CSS, images, JS). However, when I load the local copy in a web browser, the page cannot load those prerequisites, because their paths have not been modified from the versions on the web.

For example:

  • In the page's HTML, `<link rel="stylesheet" href="/stylesheets/foo.css" />` will need to be corrected to point to the new relative path of foo.css
  • In the CSS file, `background-image: url(/images/bar.png)` will similarly need to be adjusted.

Is there a way to modify `wget -p` so that the paths come out correct?
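To illustrate what "correcting the paths" means, the rewrite could in principle be done by hand with sed. This is a hypothetical, fragile sketch with made-up file names; wget's own link conversion, covered in the answer below, is the robust approach.

```shell
# Hypothetical sketch: rewrite root-relative asset paths in a saved page
# into ./-relative ones, so a local copy can find its prerequisites.
# Assumes the assets were saved under ./stylesheets/ and ./images/
# next to the page; file names are made up for illustration.
cat > page.html <<'EOF'
<link rel="stylesheet" href="/stylesheets/foo.css" />
<img src="/images/bar.png" />
EOF

# Naive rewrite: turn leading-slash paths into ./-relative ones.
# (This breaks on protocol-relative URLs, srcset, url() in CSS, etc.)
sed -i.bak -e 's|href="/|href="./|g' -e 's|src="/|src="./|g' page.html
```

After the rewrite, the stylesheet link reads `href="./stylesheets/foo.css"`. This is exactly the bookkeeping that wget's link conversion automates reliably.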


Option 2: httrack

httrack seems like a great tool for mirroring entire websites, but it is unclear to me how to use it to create a local copy of a single page. There is a lot of discussion of this topic in the httrack forums (e.g. here), but no one seems to have a bulletproof solution.


Option 3: another tool?

Some people have suggested paid tools, but I can't believe there is no free solution out there.

Many thanks!

+14

If the answer doesn't work for you, try: `wget -E -H -k -K -p http://example.com` - this was the only thing that worked for me. Credit: http://superuser.com/a/136335/94039 – 2013-07-02 06:35:03

+0

There is also software that can do this, [Teleport Pro](http://www.tenmax.com/teleport/pro/home.htm). – pbies 2016-10-08 20:18:00

+0

`wget --random-wait -r -p -e robots=off -U mozilla http://www.example.com` – davidcondrey 2017-07-17 07:47:04

Answers

207

wget is capable of doing what you are asking. Just try the following:

wget -p -k http://www.example.com/ 

`-p` will get you all of the required elements to view the site correctly (CSS, images, etc.). `-k` will change all links (including those to CSS and images) so that you can view the page offline as it appeared online.

From wget's documentation:

‘-k’ 
‘--convert-links’ 
After the download is complete, convert the links in the document to make them 
suitable for local viewing. This affects not only the visible hyperlinks, but 
any part of the document that links to external content, such as embedded images, 
links to style sheets, hyperlinks to non-html content, etc. 

Each link will be changed in one of the two ways: 

    The links to files that have been downloaded by Wget will be changed to refer 
    to the file they point to as a relative link. 

    Example: if the downloaded file /foo/doc.html links to /bar/img.gif, also 
    downloaded, then the link in doc.html will be modified to point to 
    ‘../bar/img.gif’. This kind of transformation works reliably for arbitrary 
    combinations of directories. 

    The links to files that have not been downloaded by Wget will be changed to 
    include host name and absolute path of the location they point to. 

    Example: if the downloaded file /foo/doc.html links to /bar/img.gif (or to 
    ../bar/img.gif), then the link in doc.html will be modified to point to 
    http://hostname/bar/img.gif. 

Because of this, local browsing works reliably: if a linked file was downloaded, 
the link will refer to its local name; if it was not downloaded, the link will 
refer to its full Internet address rather than presenting a broken link. The fact 
that the former links are converted to relative links ensures that you can move 
the downloaded hierarchy to another directory. 

Note that only at the end of the download can Wget know which links have been 
downloaded. Because of that, the work done by ‘-k’ will be performed at the end 
of all the downloads. 
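The conversion described above can be observed end to end against a throwaway local server, without touching the live web. This is a sketch assuming `wget` and `python3` are on PATH; the port and file names are arbitrary.

```shell
# Serve a tiny two-file site locally, mirror it with -p -k, and inspect
# how the absolute stylesheet link is rewritten to a relative one.
mkdir -p site
cat > site/index.html <<'EOF'
<html><head><link rel="stylesheet" href="/style.css"></head>
<body>hello</body></html>
EOF
echo 'body { color: red; }' > site/style.css

# Throwaway server (--directory requires Python 3.7+).
python3 -m http.server 8731 --directory site >/dev/null 2>&1 &
SERVER_PID=$!
sleep 1

wget -q -p -k http://localhost:8731/
kill $SERVER_PID

# wget saves under a host:port directory; after -k, the href that was
# "/style.css" refers to the downloaded file by a relative path:
grep stylesheet 'localhost:8731/index.html'
```

Both files land under `localhost:8731/`, and because both were downloaded, `-k` converts the link to the relative form described in the first bullet of the documentation quoted above.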
+0

Thanks! I have no idea how I missed this option. – brahn 2011-07-01 15:05:35

+2

I tried it, but somehow internal links like index.html#link-to-element-on-same-page stopped working. – rhand 2013-08-31 02:50:03

+1

For an entire website: http://snipplr.com/view/23838/downloading-an-entire-web-site-with-wget/ – 2014-04-09 17:05:34