Have you ever searched the web for software to download a complete website, only to find a Windows program or perhaps a Linux one? Your Linux box already has a nifty command that makes these troubles go away and downloads a full website in a single step: wget. Here is the command; just copy and paste it into the shell and edit the website details at the bottom.
$ wget \
--recursive \
--no-clobber \
--page-requisites \
--html-extension \
--convert-links \
--restrict-file-names=windows \
--domains techstroke.com \
--no-parent \
www.techstroke.com/Windows/
This command downloads the Web site www.techstroke.com/Windows/.
The options are:
- --recursive: follow links and download the Web site recursively (wget's default depth limit is five levels).
- --domains techstroke.com: don't follow links outside techstroke.com.
- --no-parent: don't follow links outside the directory /Windows/.
- --page-requisites: get all the elements that compose the page (images, CSS and so on).
- --html-extension: save HTML files with the .html extension (the option is named --adjust-extension in wget 1.12 and later).
- --convert-links: convert links so that they work locally, off-line.
- --restrict-file-names=windows: modify filenames so that they will work in Windows as well.
- --no-clobber: don't overwrite any existing files (used in case the download is interrupted and resumed).
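If you mirror sites often, the options above can be wrapped in a small shell function. This is only a sketch: the names mirror_site and DRY_RUN are made up for illustration and are not part of wget, and the function assumes you pass the domain and the start URL yourself.

```shell
# mirror_site DOMAIN URL
# Hypothetical helper (not part of wget): runs wget with the options
# described above. Set DRY_RUN to any non-empty value to print the
# command instead of running it.
mirror_site() {
  domain=$1
  url=$2
  # Build the command line as positional parameters so quoting is preserved.
  set -- wget \
    --recursive \
    --no-clobber \
    --page-requisites \
    --html-extension \
    --convert-links \
    --restrict-file-names=windows \
    --domains "$domain" \
    --no-parent \
    "$url"
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

For example, mirror_site techstroke.com www.techstroke.com/Windows/ runs the same download as the command shown earlier, and setting DRY_RUN=1 first lets you check the command before anything is fetched.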
Together these options are uber cool: they give you a browsable offline copy with all images, JavaScript and CSS intact!
via [linuxJournal]


March 28th, 2009 at 3:16 am
Hi, I tried that with a web page. The site is https, but everything was OK when using --no-check-certificate.
I used only "-r --no-check-certificate --page-requisites". The problem: the main CSS file is saved, and in it the other CSS files are linked with
@import url(bla.css);
@import url(blubb.css);
…
and none of these are saved at all. Is there a wget tweak that makes wget save these files as well?
--page-requisites by itself should be enough to make wget do it, but it seems it just won't. Using wget 1.11.1.
April 6th, 2009 at 12:25 am
@Rava I don't think there is such a tweak. wget parses HTML files but not CSS ones. I went through the wget manual and found nothing about this. You can search for a website copy tool for Linux if you need this done.
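For what it's worth, wget 1.12 and later do parse CSS for links. On an older build such as the 1.11.1 mentioned above, one possible workaround is to pull the @import targets out of the saved CSS file and fetch them in a second pass. The sketch below is an assumption-laden illustration, not a wget feature: list_css_imports and fetch_css_imports are made-up helper names, it only handles one @import url(...) per line, and it assumes the imported files are relative to a base URL you supply.

```shell
# list_css_imports FILE
# Hypothetical helper: print the targets of @import url(...) rules
# found in a CSS file, one per line, with any quotes stripped.
list_css_imports() {
  sed -n 's/.*@import[[:space:]]*url(\([^)]*\)).*/\1/p' "$1" | tr -d "\"'"
}

# fetch_css_imports BASE_URL FILE
# Hypothetical helper: download each imported stylesheet, assuming the
# names in FILE are relative to BASE_URL (no trailing slash).
fetch_css_imports() {
  base=$1
  css=$2
  list_css_imports "$css" | while IFS= read -r name; do
    wget --no-clobber "$base/$name"
  done
}
```

Running list_css_imports on the saved main CSS file first shows which files would be requested before fetch_css_imports downloads anything.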
September 26th, 2009 at 6:01 pm
Wow…
March 4th, 2010 at 6:07 am
Thanks for sharing this … the best