A lot of projects ported from the old metanohi site.

2011-08-02 23:08:13 +02:00
parent d1b8234f78
commit a63b3ad10d
261 changed files with 12435 additions and 28 deletions
--- a/site/projects/noncrawl/index.org
+++ b/site/projects/noncrawl/index.org
@@ -0,0 +1,23 @@
+#+title: noncrawl
+#&summary
+A link-centric web crawler
+#&
+#+license: bysa, page
+#+license: gpl 3+, program
+
+* noncrawl
+
+#&img;url=img/noncrawl-logo-192.png, float=right, alt=noncrawl logo, \
+#&    width=192, height=192
+
+noncrawl is a web crawler that saves only links; it does not attempt to store
+or index anything else. Its only purpose is to recursively check sites for
+links to other sites, which are in turn checked for links to further sites,
+and so on. For example, if site Y links to site X, that link is saved, and if
+site X has not been checked yet, it is then crawled just as site Y
+was.
+
+[[noncrawl-0.1.tar.gz][DOWNLOAD]].
+
+noncrawl's branches are hosted at Gitorious; see [[http://gitorious.org/noncrawl]]. A
+bug tracker can be found at Launchpad; see [[http://launchpad.net/noncrawl]].
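
Note on the crawl behaviour described in the page above: the following is a minimal sketch of the "save only links, then crawl unchecked sites" idea, not noncrawl's actual code. The names crawl_links, fetch_links, max_sites, TOY_WEB and toy_fetch_links are all hypothetical, and a small in-memory dictionary stands in for real HTTP fetching. It also uses a queue rather than literal recursion, which visits the same sites but avoids deep call stacks.

```python
from collections import deque


def crawl_links(start_site, fetch_links, max_sites=1000):
    """Crawl outward from start_site, recording only site-to-site links.

    fetch_links is a hypothetical callable: given a site, it returns the
    sites that site links to. max_sites is an assumed safety limit.
    """
    saved_links = []             # (source, target) pairs, in discovery order
    checked = set()              # sites that have already been crawled
    queue = deque([start_site])  # sites waiting to be crawled

    while queue and len(checked) < max_sites:
        site = queue.popleft()
        if site in checked:
            continue
        checked.add(site)
        for target in fetch_links(site):
            # "site Y links to site X": save that piece of information.
            saved_links.append((site, target))
            # If X has not been checked yet, crawl it later like Y.
            if target not in checked:
                queue.append(target)
    return saved_links


# Toy in-memory "web" used instead of real network access (assumption).
TOY_WEB = {
    "y.example": ["x.example"],
    "x.example": ["y.example", "z.example"],
    "z.example": [],
}


def toy_fetch_links(site):
    """Return the outbound links of a site in the toy web."""
    return TOY_WEB.get(site, [])


if __name__ == "__main__":
    for source, target in crawl_links("y.example", toy_fetch_links):
        print(source, "->", target)
```

Starting from y.example, the sketch records the link to x.example, then crawls x.example because it has not been checked yet; the link back to y.example is saved but does not trigger a second crawl of y.example.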