'Save' a website to disk?

necroerotica

Really Experienced
Joined
May 17, 2007
Posts
180
Hi folks -
does anyone know of a utility which would enable me to save the contents of a website, or a portion of a website to disk?

I'll clarify -
My dad was given a computer. Now, before he will get broadband installed he wants to be convinced that the internet will be of any use to him (not exactly a technophobe, but a little... old fashioned?).

So, what I had the idea to do, was see if I could 'save' a selection of websites to disk - ones which I think he'd like. I can then pop them on his PC, and show him what it's all about.

Now, I realise there are limitations - searches won't work, references to back-end databases won't work, and so on - what I want is a 'snapshot' of the pages, with the hyperlinks between them working.

Now, I know I can save individual pages and edit the hyperlinks myself to relative paths, but I'd really prefer if I could be lazy and get a tool that would save me the hassle.....


Ideas, anyone?
 
You'd have varying success with this. Most commercial web sites have a lot of code that isn't going to get saved. It all runs on the server hosting the web site.

Why not bring your dad to the library or to your place and show him that way?

Small, simpler web sites you could probably save. Then you'd have to make adjustments to them so their references to files and folders worked right. Really, there isn't a completely simple way to do it. I don't know of any utilities that would do this.

MJL
 
Aye; I'm actually a software engineer myself (you'd think I'd know this stuff.....) and I've been working on .NET web-applications - I know just how much can potentially be server-side.

I'm not looking for a fully *functional* copy - I'd settle for something that'd be smart enough to automatically save me a copy of the page layout and images and whatever basic URL redirects there are between them.


I thought about the library thing, but realistically I might be better waiting until he can visit me and use my PC - privacy and time would be good.

I've been assured I'm a lousy teacher anyway - because I've been doing this so long, I often forget that some people don't even know what I mean when I say 'double-click' or 'right click on that icon'.

:eek:
 
necroerotica said:
Aye; I'm actually a software engineer myself (you'd think I'd know this stuff.....) and I've been working on .NET web-applications - I know just how much can potentially be server-side.

I'm not looking for a fully *functional* copy - I'd settle for something that'd be smart enough to automatically save me a copy of the page layout and images and whatever basic URL redirects there are between them.


I thought about the library thing, but realistically I might be better waiting until he can visit me and use my PC - privacy and time would be good.

I've been assured I'm a lousy teacher anyway - because I've been doing this so long, I often forget that some people don't even know what I mean when I say 'double-click' or 'right click on that icon'.

:eek:

Well, here's what you should do:

Install cygwin:

http://www.cygwin.com/

Then run the 'wget' command in the shell, with the following option:

--convert-links (short form: -k)
After the download is complete, convert the links in the document to make them suitable for local viewing. This affects not only the visible hyperlinks, but any part of the document that links to external content, such as embedded images, links to style sheets, hyperlinks to non-HTML content, etc.

Each link will be changed in one of the two ways:

* The links to files that have been downloaded by Wget will be changed to refer to the file they point to as a relative link.
  Example: if the downloaded file /foo/doc.html links to /bar/img.gif, also downloaded, then the link in doc.html will be modified to point to ../bar/img.gif. This kind of transformation works reliably for arbitrary combinations of directories.

* The links to files that have not been downloaded by Wget will be changed to include host name and absolute path of the location they point to.
  Example: if the downloaded file /foo/doc.html links to /bar/img.gif (or to ../bar/img.gif), then the link in doc.html will be modified to point to http://hostname/bar/img.gif.

Because of this, local browsing works reliably: if a linked file was downloaded, the link will refer to its local name; if it was not downloaded, the link will refer to its full Internet address rather than presenting a broken link. The fact that the former links are converted to relative links ensures that you can move the downloaded hierarchy to another directory.

Note that only at the end of the download can Wget know which links have been downloaded. Because of that, the work done by -k will be performed at the end of all the downloads.
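To make the two rules above concrete, here is a rough Python sketch of the rewriting decision. It is not Wget's actual code; the function name convert_link, the downloaded set and the base_url argument are all made up for illustration, and it only handles the path part of a link.

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def convert_link(page_path, link, downloaded, base_url):
    """Rewrite one link found in a saved page (illustrative only).

    page_path  -- server path of the page holding the link, e.g. "/foo/doc.html"
    link       -- the href/src value as written in that page
    downloaded -- set of server paths that were saved locally (assumed input)
    base_url   -- site root, e.g. "http://hostname/" (assumed input)
    """
    # Resolve the link against the page's own URL to get an absolute URL.
    page_url = urljoin(base_url, page_path)
    absolute = urljoin(page_url, link)
    target = urlsplit(absolute)
    same_host = target.netloc == urlsplit(base_url).netloc

    if same_host and target.path in downloaded:
        # Rule 1: the target was saved, so point at it with a relative path.
        return posixpath.relpath(target.path, start=posixpath.dirname(page_path))

    # Rule 2: the target was not saved, so keep the full Internet address.
    return absolute


if __name__ == "__main__":
    saved = {"/foo/doc.html", "/bar/img.gif"}
    print(convert_link("/foo/doc.html", "/bar/img.gif", saved, "http://hostname/"))
    # prints ../bar/img.gif
    print(convert_link("/foo/doc.html", "../bar/img.gif", set(), "http://hostname/"))
    # prints http://hostname/bar/img.gif
```

The decision needs the complete set of saved paths, which is why the conversion can only run once every download has finished.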

If you don't want cygwin, you can probably find a wget build for Windows, but it might be an old version.

More documentation for wget:
http://linux.die.net/man/1/wget

It's the most efficient method for a tech-head, and you can brush up on your unix command line skills while you're at it.
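For anyone who would rather script it than type the command by hand, here is a minimal Python sketch that assembles a wget call for a small offline copy. Only --convert-links comes from the post above; the other flags (--recursive, --level, --page-requisites, --no-parent, --directory-prefix) are taken from the wget manual as a reasonable starting set, and the URL, depth and folder name are placeholders. It assumes wget is installed and on the PATH.

```python
import shutil
import subprocess


def build_wget_command(url, depth=2, dest="site-copy"):
    """Return the argument list for a shallow, locally browsable copy."""
    return [
        "wget",
        "--recursive",                 # follow links within the site
        f"--level={depth}",            # stop after this many link hops
        "--convert-links",             # rewrite links for local viewing
        "--page-requisites",           # also fetch images, style sheets, etc.
        "--no-parent",                 # never climb above the starting folder
        f"--directory-prefix={dest}",  # save everything under this folder
        url,
    ]


def mirror(url, depth=2, dest="site-copy"):
    """Run wget with the options above; raises if wget is not available."""
    if shutil.which("wget") is None:
        raise RuntimeError("wget was not found on the PATH")
    subprocess.run(build_wget_command(url, depth, dest), check=True)


if __name__ == "__main__":
    # Placeholder URL; print the command rather than running it.
    print(" ".join(build_wget_command("http://example.com/")))
```

Keeping the depth small matters: every extra level multiplies the number of pages fetched.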
 
necroerotica said:
So, what I had the idea to do, was see if I could 'save' a selection of websites to disk - ones which I think he'd like. I can then pop them on his PC, and show him what it's all about.

What size/type of disk are you considering -- websites probably aren't going to fit very well on a floppy.

necroerotica said:
Now, I know I can save individual pages and edit the hyperlinks myself to relative paths, but I'd really prefer if I could be lazy and get a tool that would save me the hassle.....


Ideas, anyone?

What I would do is acquire a laptop (or an easily moveable desktop) and add it to my internet connection.

I'd find the websites I wanted to use as teasers for your dad and bookmark them as favorites (with "make page available offline" and "three links deep" checked).

Once I had a favorites folder full of suitable websites, I'd clear the temporary internet files and re-synchronise all of the sites.

Disconnect the computer from your internet connection and confirm that the sites are in fact available offline down to three links deep.

Once you've got a selection of sites available offline on a computer you can easily move to your dad's, let him use the offline images to play with and if he wants to re-synch them, make him get his own internet connection. :p
 
Chances are if he can't see why he needs it now, he won't use it later. There are people in this world that are perfectly fine without the internet.... I know it's even hard for me to believe being that my world REVOLVES around a computer! Maybe this would be a waste of money? If it even matters. Only an observation...

Otherwise:
You should be able to save the webpage for offline use onto a laptop, as others have said. Just make sure that you save the linked webpages as well. If you don't have a laptop, a thumb (USB) drive would probably work also.
 
Excellent stuff - I'll have a tinker with some of that stuff later, and many thanks one and all :nana:

To answer a few of the previous queries -

- I plan to burn whatever I have to DVD to put on his PC. I know that anything but the most basic website structure wouldn't fit on there, but all I'm really aiming for is a teaser, or demo, of what he can do if he had the internet.

- This is under Windows - yes, I really should brush up on my Unix. It's been years.

- I'm pretty sure both he and my mum *would* use the internet, but they're both retired (ie. on pensions now) and come very much from a generation where, if you don't *need* it, you don't spend money on it. My mission, should I choose to accept it, is to introduce them to the interweb.

- I didn't actually know that trick with synchronising pages offline when saving as bookmarks. The only time I ever use Internet Explorer is when I want to test code I've written - my browser of choice is Opera, which doesn't seem to offer that option. Cheers for the tip.
 
necroerotica said:
- I didn't actually know that trick with synchronising pages offline when saving as bookmarks. The only time I ever use Internet Explorer is when I want to test code I've written - my browser of choice is Opera, which doesn't seem to offer that option. Cheers for the tip.

If it worked better than it does, I'm sure that more people would be aware of it.

Be aware that synchronizing anything deeper than the page you're bookmarking will usually lead to a HUGE amount of data being saved because it doesn't distinguish between ads linking to other sites and actual related links. Synchronizing 3-deep on a page with ads.google links (or other similar ad banners) might wind up taking over your entire disk.
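A bit of back-of-the-envelope arithmetic shows why depth matters so much. The Python sketch below is only an upper-bound estimate: the figures of 30 links per page and 100 KB per page are invented for illustration, and it assumes every link leads to a page not already counted.

```python
def pages_reached(links_per_page, depth):
    """Upper bound on pages fetched: 1 + n + n^2 + ... + n^depth."""
    return sum(links_per_page ** level for level in range(depth + 1))


def estimated_megabytes(links_per_page, depth, kb_per_page):
    """Rough download size in MB (1 MB taken as 1000 KB)."""
    return pages_reached(links_per_page, depth) * kb_per_page / 1000


if __name__ == "__main__":
    # Hypothetical figures: 30 links on each page, followed three links deep.
    print(pages_reached(30, 1))  # prints 31
    print(pages_reached(30, 3))  # prints 27931
```

Going from one link deep to three takes the count from 31 pages to 27931 under these assumptions, and ad links count just as much as real ones.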
 
In the browser:
File > Save As > (change Save as type to Web Page, complete)

and that's about it.
 
kindashy said:
In the browser:
File > Save As > (change Save as type to Web Page, complete)

and that's about it.

That's a bit tedious when it comes to maintaining the links to child pages on a site; you have to duplicate the relative directory structure (if there is one) and save each page manually.

That does result in something that is more portable than making sites available offline -- but that's why so many utilities to automate the process are around for SubNebGuy's Google link above to turn up.
 