Downloading Guides


dwflo
16th July 2003, 04:54
Is there any way to download a particular guide from this site, either as PDF or HTML, so it can be read offline? That includes all linked pages and their images.

r6d2
16th July 2003, 05:05
Originally posted by dwflo
Is there any way to download a particular guide from this site, either as PDF or HTML, so it can be read offline? That includes all linked pages and their images.
Try using IE's option to make a web page available offline. It works every time and is independent of the format the guide is written in.

You can also use a spider to download a full web site.

Hope this helps.

__________
A stupid answer is only as stupid as the question itself.

dwflo
16th July 2003, 18:05
@r6d2
Thanks, much appreciated. Never used that option before.

The Belgain
10th August 2003, 22:05
How do you download a whole website using a spider...?

r6d2
11th August 2003, 00:09
There are plenty of tools for this. One I've tried is Webzip:

http://download.com.com/3120-20-0.html?qt=webzip&tg=dl-2001

Look for more tools of the same kind at www.download.com; search for "Download Web sites".

Hope this helps.

Doom9
11th August 2003, 00:48
Originally posted by The Belgain
How do you download a whole website using a spider...?
Site owners/admins really hate that and you risk that your IP gets on a permanent blacklist if you do this. By downloading files you don't need, you take away bandwidth from others who could use it, and you increase the hosting costs.

r6d2
11th August 2003, 01:15
Originally posted by Doom9
Site owners/admins really hate that and you risk that your IP gets on a permanent blacklist if you do this.
Sorry, Doom9, I didn't know you thought like this and didn't mean to promote ill behaviour.

Anyway, most ISPs give home users dynamic IP addresses, so by blocking one IP you are likely not punishing the offending guy but another user who cannot possibly know his IP is banned, or why it is banned.

If this matter is really a concern, I'd suggest putting it in the forum rules, or at least on the Disclaimer page.

dwflo
11th August 2003, 05:24
Originally posted by Doom9
Site owners/admins really hate that and you risk that your IP gets on a permanent blacklist if you do this. By downloading files you don't need, you take away bandwidth from others who could use it, and you increase the hosting costs.

I don't want to piss any owners/ISPs off, but why would the option be available anyhoo? I see an advantage to saving a web site, or at least parts of it, especially sites that have all their documentation posted. All I plan on doing is downloading one time and converting to PDF, so I have my own reference on my hard drive.
That makes sense to me when you have to use dial-up!

dwflo

mpucoder
11th August 2003, 05:57
Why does the option exist? Because it can be done.
It's OK to use if you use reasonable limits, such as the depth of nesting, not following links outside the URI, and limiting the number of connections and bandwidth. The problem is a large number of people try to get the entire site, with as many connections as the server will allow, with no bandwidth limit - closing down the site to other users until it's done or blocked. Also a lot of the crawlers don't remember very well where they've been, and download pages and images multiple times. And then there are people who download everything every day!
If you started on the main page of doom9.org and set no depth limit, but remained in doom9.org, you would not only get the guides and codec comparison, but every program available for download. If you allowed it to wander over to the forum (remaining in *.doom9.org), it would attempt to download more than 47K threads, as well as more than 35K user profiles.
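As a rough illustration of those limits, here is a minimal Python sketch of a crawler that stays on the starting host, stops at a fixed depth, remembers where it has been, and fetches one page at a time with a pause in between. The names polite_crawl, http_fetch, max_depth, max_pages and delay are made up for this example and are not taken from Webzip or any other tool mentioned in this thread. It does not throttle bandwidth beyond the pause between requests.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
import time
import urllib.request


class LinkParser(HTMLParser):
    """Collect the href values of <a> tags in one page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def http_fetch(url):
    """Fetch one page over HTTP and return its text."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def polite_crawl(start_url, fetch=http_fetch, max_depth=2, max_pages=50, delay=1.0):
    """Breadth-first crawl: single connection, same host only, bounded depth."""
    host = urlparse(start_url).netloc
    seen = {start_url}                 # never request the same URL twice
    queue = deque([(start_url, 0)])
    pages = {}
    while queue and len(pages) < max_pages:
        url, depth = queue.popleft()
        if pages:
            time.sleep(delay)          # pause between consecutive requests
        pages[url] = fetch(url)
        if depth >= max_depth:
            continue                   # depth limit reached: keep page, ignore its links
        parser = LinkParser()
        parser.feed(pages[url])
        for href in parser.links:
            link, _fragment = urldefrag(urljoin(url, href))
            # Skip anything that leaves the starting host (other sites, mailto:, ...).
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return pages
```

Calling polite_crawl on the first page of a single guide with a small max_depth would fetch that guide's pages only, not the download section or the forum. Images and other embedded files are not followed in this sketch.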

jeremymacmull
14th August 2003, 00:06
Or, easier and without pissing anyone off:

Save the guide one page at a time in MHT format in Internet Explorer for offline viewing (e.g. the GKNOT DivX 5 guide has only 6-7 pages, so no problem).

That's what I did.

JEREMY

mpucoder
14th August 2003, 04:46
About a year ago I got so p'd at crawlers I hid a link to a game in several pages. Being hidden from view, only crawlers would follow it. And being only a 13K game, it took almost no room on my server. The game (http://www.mpucoder.com/puzzle.php) has over 2x10^13 combinations, though (16!), so following and saving every link is too much for most crawlers.
While it amused me to watch attempts to download the entire site, it was a bandwidth killer, so the links are gone (and so was the game, but I put it back today for one day).
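To check the figure, 16! can be computed directly. The one-page-per-second crawl rate below is purely a hypothetical number for scale, not something measured on the site.

```python
import math

# Number of arrangements quoted in the post: 16!
states = math.factorial(16)
print(states)            # 20922789888000
print(f"{states:.2e}")   # 2.09e+13

# Hypothetical crawler fetching one page per second, non-stop.
seconds_per_year = 60 * 60 * 24 * 365
years = states / seconds_per_year
print(f"about {years:,.0f} years to fetch every link")
```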

r6d2
14th August 2003, 04:59
Originally posted by mpucoder
While it amused me to watch attempts to download the entire site, it was a bandwidth killer, so the links are gone (and so was the game, but I put it back today for one day).
I always knew you were evil ;)

Why not ask the hosting provider to give the site some sort of QoS, so that every IP visiting concurrently gets an equal slice of the bandwidth?

Do they charge for that?
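Just to make the idea concrete, here is a small Python sketch of that kind of per-IP fairness: the total bandwidth is split equally among the IPs that are connected, and each IP's slice is then shared by its own connections. The function name per_connection_rate and the numbers are invented for this example; it only does the arithmetic and is not how any particular hosting provider or web server implements QoS.

```python
from collections import Counter


def per_connection_rate(total_kbps, connections):
    """Split total_kbps equally per IP, then equally per connection of that IP.

    connections is a list of (ip, connection_id) pairs.
    Returns a dict mapping (ip, connection_id) to its rate in kbps.
    """
    per_ip_count = Counter(ip for ip, _ in connections)
    if not per_ip_count:
        return {}
    ip_slice = total_kbps / len(per_ip_count)
    return {(ip, cid): ip_slice / per_ip_count[ip] for ip, cid in connections}


# One crawler opening 8 connections, plus two ordinary readers (made-up addresses).
conns = [("10.0.0.1", f"c{i}") for i in range(8)]
conns += [("10.0.0.2", "r1"), ("10.0.0.3", "r2")]
rates = per_connection_rate(1000, conns)

# Total handed to the crawler's IP, however many connections it opens.
crawler_total = sum(rate for (ip, _), rate in rates.items() if ip == "10.0.0.1")
```

With this split, opening more connections no longer buys the crawler a bigger share: its 8 connections together get the same third of the bandwidth as each single reader.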