From Alessandro's Wiki

wget is a command-line tool for downloading files and entire web sites.

  • Limit the download bandwidth (e.g. to 60 KB/s):
wget --limit-rate=60k http://www.yourdomain.com/bigfile.bin
  • Resume an interrupted download (continue an incomplete file):
wget -c http://www.yourdomain.com/bigfile.bin
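For an unreliable connection, -c can be combined with retry options so wget keeps going until the file is complete (a sketch; the URL is the placeholder used above):

```shell
# -c                  : continue a partially downloaded file
# --tries=0           : retry indefinitely
# --retry-connrefused : also retry when the server refuses the connection
wget -c --tries=0 --retry-connrefused http://www.yourdomain.com/bigfile.bin
```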
  • Download to a specified file name:
wget http://www.example.com -O index.html
  • Mirror an entire website
wget -m -p -l 0 -E -k http://site
  • and also
wget -k -l 0 -m -nH -r http://site
  • and...
wget --mirror -w 2 -p --html-extension --convert-links http://site
  • another one
wget -o wget.log --html-extension --restrict-file-names=windows --convert-links --recursive --level=inf --page-requisites --wait=0 --quota=inf --reject="*_form,*@*,sitemap,RSS" http://site
  • -k / --convert-links: rewrites links in downloaded files so they work locally
  • -l 0: infinite recursion depth (same as --level=inf)
  • -E / --adjust-extension: saves HTML/CSS files with the proper extension (--html-extension in older versions)
  • -K / --backup-converted: keeps a .orig backup of each file before link conversion
  • -p / --page-requisites: also downloads images, stylesheets and other files needed to display the page
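Putting the options above together, a typical mirror-for-offline-browsing invocation might look like this (http://site is a placeholder):

```shell
# -m : mirror (recursive, infinite depth, timestamping)
# -p : fetch images/CSS needed to render each page
# -E : save HTML/CSS with matching extensions
# -k : rewrite links to work locally (done after the download)
# -K : keep a .orig backup of each converted file
wget -m -p -E -k -K http://site
```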
  • exclude directories (comma-separated list):
wget -r --exclude-directories=/dir1,/dir2 http://site
  • for HTTPS sites with a self-signed or otherwise untrusted certificate:
wget --no-check-certificate https://site
  • send authentication to apache:
--http-user=USER --http-password=PASSWD
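Passing the password on the command line leaves it in the shell history; wget can instead prompt for it with --ask-password (sketch; USER and http://site are placeholders):

```shell
# Prompt interactively for the HTTP password instead of
# putting it on the command line
wget --http-user=USER --ask-password http://site/protected/
```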
  • send authentication by POST action (backslash-escape & from the shell, or quote the string):
wget --post-data=user=USER\&password=PASSWD http://site/login