metanohi/site/projects/noncrawl/index.org

#+title: noncrawl
#&summary
A links-centric webcrawler
#&
#+license: bysa, page
#+license: gpl 3+, program

* noncrawl

#&img;url=img/noncrawl-logo-192.png, float=right, alt=noncrawl logo, \
#&    width=192, height=192

noncrawl is a crawler that saves only links. It crawls the web but does not
attempt to do everything. Instead, its only purpose is to recursively check
sites for links to other sites, which are then checked in turn for links to
further sites, and so on. So, if site Y links to site X, that piece of
information is saved, and if site X has not been checked yet, it is crawled
just like site Y was.
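
The idea described above can be sketched in a few lines of Python. This is
only an illustration of the general technique (a breadth-first walk that
records site-to-site links), not noncrawl's actual code. The names
=crawl_links=, =site_of=, =LinkExtractor= and the =fetch= parameter are
assumptions made for the example, and the sketch simplifies by reading only
one page per site.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the href values of all <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def site_of(url):
    """Return the host part of a URL, used here as the site's identity."""
    return urlparse(url).netloc


def crawl_links(start_url, fetch, max_sites=100):
    """Record which site links to which, starting from start_url.

    fetch is a hypothetical callable: it takes a URL and returns HTML text.
    """
    links = set()      # saved (from_site, to_site) pairs
    checked = set()    # sites that have already been crawled
    queue = deque([start_url])
    while queue and len(checked) < max_sites:
        url = queue.popleft()
        site = site_of(url)
        if site in checked:
            continue
        checked.add(site)
        parser = LinkExtractor()
        parser.feed(fetch(url))
        for href in parser.hrefs:
            target = urljoin(url, href)   # resolve relative links
            target_site = site_of(target)
            if not target_site or target_site == site:
                continue                  # ignore links within the same site
            links.add((site, target_site))
            if target_site not in checked:
                queue.append(target)      # crawl the linked site later
    return links


if __name__ == "__main__":
    # A tiny in-memory "web" (made-up hosts) so no network access is needed.
    pages = {
        "http://y.example/": '<a href="http://x.example/">X</a>',
        "http://x.example/": '<a href="http://z.example/">Z</a>',
        "http://z.example/": "",
    }
    for source, target in sorted(crawl_links("http://y.example/", pages.get)):
        print(source, "links to", target)
```

Passing =fetch= in as a parameter keeps the sketch independent of any
particular HTTP library; a real crawler would also need error handling,
politeness delays and robots.txt support, none of which is shown here.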

[[noncrawl-0.1.tar.gz][DOWNLOAD]].

noncrawl has its branches at Gitorious; see [[http://gitorious.org/noncrawl]]. A
bugtracker can be found at Launchpad; see [[http://launchpad.net/noncrawl]].

A lot of projects ported from the old metanohi site. 2011-08-02 23:08:13 +02:00