Short:        Fetch Web source trees, save to file
Author:       James Burton (burton@cs.latrobe.edu.au)
Uploader:     James Burton (burton cs latrobe edu au)
Type:         comm/tcp
Architecture: m68k-amigaos

TITLE

    GetURL.rexx

VERSION

    1.04

AUTHOR

    James Burton
    c/o Department of Computer Science & Computer Engineering
    Latrobe University
    Bundoora, Victoria, 3083
    Australia

    EMail: burton@cs.latrobe.edu.au
    Web:   http://www.cs.latrobe.edu.au/~burton/

    A few other people helped out as well; please read the documentation
    file.

DESCRIPTION

    -- Script to download HTML systems across the network --

    GetURL.rexx is an ARexx script which downloads World-Wide Web pages.
    With a simple command line it will download a specific page (see the
    EXAMPLE section at the end of this file), and with more complex
    command lines it can download specific sets of documents. The
    intention was to create a tool that allows local caching of important
    web pages, together with a flexible way of specifying which pages are
    important. The script has no GUI as yet, but may get one at some
    stage in the future.

    If you have ever tried to download and save to disc a 200-page
    document using Mosaic, then you know what this script is for. Mosaic
    will only let you load a page, save it to disc, load another page,
    and so on. This is a very frustrating process. GetURL automates it
    and will run in batch mode without user intervention.

    The major features of GetURL.rexx are as follows:

    * doesn't require AMosaic, so you can be browsing something else with
      AMosaic whilst this is running
    * saves pages to your hard disc so that they can be read offline, and
      you can also give them to friends on a floppy disc. Who knows, you
      may even be able to sell discs containing web pages :-)
    * flexible set of command line switches that allow you to restrict
      the type of pages that it downloads
    * ability to specify files for the lists of URLs that it keeps, so
      that any search for pages can be stopped and restarted at a later
      date. I.e. you could run GetURL for 2 hours a day whilst you are
      online and gradually download everything in the entire universe,
      and it won't repeat itself.
    * includes the ability to download itself when there are new versions
    * will use a proxy if you have access to one, in order to both speed
      up access to pages and reduce network load
    * will download binary files (*.gif, *.lha) as easily as text and
      HTML files

NEW FEATURES

    * AmigaDOS pattern matching to specify or restrict URLs to download
    * update facility
    * a few bugs fixed
    * separate documentation
    * more strict control of URLs

SPECIAL REQUIREMENTS

    * Until somebody writes a TCP: device for the AS225 TCP/IP protocol
      stack, this script unfortunately requires AmiTCP.
    * Requires the TCP: device to be mounted.
    * Either restraint, or an extremely large hard disc - your choice :-)

    This script is no use at all unless you have AmiTCP set up and
    running. If you don't know what this means then please ask me
    (burton@cs.latrobe.edu.au).

HOST NAME / DIRECTORY

    This script is available via anonymous FTP from Aminet:

        wuarchive.wustl.edu (128.252.135.4)
        /pub/aminet/comm/tcp/GetURL-1.04.lha

    and all of its mirrors. Please check the closest mirror FIRST.

    It is also available via HTTP from my university account:

        http://www.cs.latrobe.edu.au/~burton/Public/GetURL.rexx

    (this URL will always point to the newest version)

FILE NAMES

    GetURL-1.04.lha      41770 Bytes
    GetURL-1.04.readme    3874 Bytes

PRICE

    Absolutely free to humans.

DISTRIBUTABILITY

    Public domain. But so that a hundred different versions of this don't
    appear, please send corrections, new features, bug fixes etc. to me
    and I will coordinate.
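
EXAMPLE

    (A sketch only; the exact arguments and switch names are described in
    the documentation file included in the archive, and may differ from
    what is shown here.) ARexx scripts are started from the Shell with
    the "rx" command, so fetching a single page might look something
    like:

        rx GetURL.rexx http://www.cs.latrobe.edu.au/~burton/

    with further switches (URL pattern restrictions, URL list files for
    stopping and restarting, proxy settings) added to the command line as
    described in the documentation.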