[Tfug] mirroring software?
Choprboy
tfug@tfug.org
Sun Jan 5 03:41:01 2003
On Sunday 05 January 2003 02:56 am, Gordon C. Zaft wrote:
> I'm setting up an internal webserver and I'd like to grab copies of some
> of my favorite websites so that I'll always have 'em around in case they go
> under. Also I'd like to mirror some that I have externally hosted. Can
> anyone recommend software to do this? The server is running Apache 1.3.27
> on FreeBSD 4.7.
>
Well, it won't pull CGI code, etc. (the raw server-side source, that is), only
the output the server sends. But I regularly use both "wget" and "pavuk" to
crawl websites and download entire layouts: a section of news articles, say, or
all the text and pictures of a big description/instruction piece on some
obscure hardware, for later offline reading. Both will also automatically
rewrite URLs in the downloaded pages so the local copy works just like the
remote one. Wget is very robust and mostly command-line driven. Pavuk has a
nice interface for setting lots of match/reject parameters for crawling, though
it is hard to script and cumbersome when you have lots of individual starting
URLs to enter.
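
For the wget side, here is a rough sketch of the kind of invocation I mean,
wrapped in a little Python so it is easy to loop over several sites. The
function name, the example.com URL, and the destination directory are all
made-up placeholders, and the one-second wait is just a polite guess; adjust
to taste. It only builds and prints the command, it does not run anything.

```python
import shlex


def build_wget_mirror_command(url, dest_dir):
    """Return the argv list for a wget mirroring run (nothing is executed).

    Hypothetical helper for illustration; url and dest_dir are placeholders.
    """
    return [
        "wget",
        "--mirror",             # recursive retrieval with timestamping
        "--convert-links",      # rewrite links so the local copy browses offline
        "--page-requisites",    # also fetch the images etc. needed to render pages
        "--no-parent",          # never climb above the starting directory
        "--wait=1",             # assumed pause between requests, to be polite
        "--directory-prefix", dest_dir,
        url,
    ]


if __name__ == "__main__":
    # Placeholder site and target directory, not taken from the original post.
    cmd = build_wget_mirror_command("http://www.example.com/docs/",
                                    "/usr/local/www/mirror")
    # Print a copy-and-paste friendly shell command line.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

To actually fetch something, pass the returned list to
subprocess.run(cmd, check=True), or just paste the printed line into a shell.
Calling the helper once per starting URL gets around the "lots of individual
starting URLs" problem, at least for wget.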
Adrian