name mode size
art 040000
misc 040000
morse 040000
tests 040000
wordworklib 040000
COPYING 100644 34 kb
README 100644 5 kb
VERSION 100644 0 kb
gen-package.sh 100755 0 kb
getwtype 100755 0 kb
sengen.py 100755 2 kb
wordwork-client.py 100755 2 kb
wordwork-extractor.py 100755 2 kb
wordwork-server.py 100755 2 kb
README
Wordwork

Wordwork (or "wordWORK") is a program which, optionally by parsing
definition files, can generate sentences of little or no meaning but
with (mostly) correct natural language syntax (or "grammar").

Wordwork consists of four parts:

  * A server
  * A client
  * A dictionary generator/extractor
  * A simple dictionary generator/extractor (not in use by the rest of
    Wordwork, but shipped for the sake of completeness)

The server must be running for the client to work. The client uses a
special definition text format to request different types of words.
The server naturally understands this format.

The server understands different types of words, specified by the
client using the letters below. A couple of word types are missing, as
are several verb inflections. This is a problem for irregular verbs
such as "to be" and "to have", but not otherwise.

# j: Adjective
### The software was finally >>free<<.
# c: Comparative form of adjective
### This was >>funnier<< than last time.
# s: Superlative form of adjective
### That pie is the >>tastiest<< I've ever had.
# d: Adverb
### The driver drove >>slowly<<.
# C: Comparative form of adverb
### The film played >>earlier<<.
# S: Superlative form of adverb
### You can do >>best<<!
# D: Non-comparable adverb
### The spaceships >>always<< disappear.
# v: Verb
### He was asking when to >>eat<<.
# t: Third-person singular form of verb
### She >>eats<<.
# p: Present participle form of verb
### The lion was >>eating<< something.
# l: Past participle form of verb
### The gazelle was >>eaten<<.
# o: Past form of verb
### He >>killed<< the program.
# n: Noun
### The >>keyboard<< was different.
# O: Plural form of noun
### He gave away >>flowers<<.
# P: Proper noun
### Her name was >>Cleopatra<<.
# i: Preposition
### I am >>on<< the wastebin.
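
As a rough illustration of the letter codes listed above, the following
Python sketch stores them in a lookup table. The names WORD_TYPES,
describe and sample are hypothetical and chosen for this example only;
they are not part of Wordwork's code, and the server may well store the
codes differently. The sample words are the ones marked in the example
sentences.

```python
# Letter code -> (description, sample word). Both columns are taken from
# the list of word types in this README. The table is illustrative only
# and is NOT Wordwork's own data structure.
WORD_TYPES = {
    "j": ("adjective", "free"),
    "c": ("comparative adjective", "funnier"),
    "s": ("superlative adjective", "tastiest"),
    "d": ("adverb", "slowly"),
    "C": ("comparative adverb", "earlier"),
    "S": ("superlative adverb", "best"),
    "D": ("non-comparable adverb", "always"),
    "v": ("verb", "eat"),
    "t": ("third-person singular verb", "eats"),
    "p": ("present participle", "eating"),
    "l": ("past participle", "eaten"),
    "o": ("past form of verb", "killed"),
    "n": ("noun", "keyboard"),
    "O": ("plural noun", "flowers"),
    "P": ("proper noun", "Cleopatra"),
    "i": ("preposition", "on"),
}


def describe(code):
    """Return the description for a letter code (hypothetical helper)."""
    if code not in WORD_TYPES:
        raise ValueError("unknown word type code: %r" % (code,))
    return WORD_TYPES[code][0]


def sample(code):
    """Return the README's sample word for a letter code (hypothetical helper)."""
    if code not in WORD_TYPES:
        raise ValueError("unknown word type code: %r" % (code,))
    return WORD_TYPES[code][1]


if __name__ == "__main__":
    # Print one line per code, e.g. "j  adjective  free".
    for code in WORD_TYPES:
        print(code, describe(code), sample(code), sep="  ")
```

Note that the codes are case-sensitive: "c" and "C", "s" and "S", "d"
and "D", "o" and "O", and "p" and "P" all refer to different word types.
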

The current extractor does not support finding comparative and
superlative forms when "more" and "most" must be prefixed. So, while it
is possible to stumble upon "funnier", you will not see "more OK".

To fetch a random word, use the format "&#", where # is the letter. For
example, to claim that you have a special ability, use this:

    I am &j!

This might return "I am wise", but it might also return "I am
unconfirmed".

If you want to use the same random word more than once, you can save it
in a variable, which the format also supports. You can write something
like this:

    $(somewords "look &j")
    - You #somewords#.
    - I do not #somewords#!

Check the client's help message (specify --help on the command line)
for information on how to properly apply this format. You'll find some
tests in the "tests" directory.

The format has its limits, of course, which is why it is also possible
to work directly in Python. An example of this is in "sengen.py". It
still uses the letter codes, though.


* * * * *

The server does not, of course, work without a dictionary. Wordwork
comes with a dictionary, "dict", created from a Wiktionary XML file.
The current version was created from the file found at
<http://dumps.wikimedia.org/enwiktionary/latest/enwiktionary-latest-pages-articles.xml.bz2>
by "wordwork-extractor.py". This generator parses the file and outputs
only English words. The final format is a newline-separated list with a
word type (a letter) and a word on each line.

The data used from the Wiktionary is available under both the Creative
Commons Attribution/Share-Alike 3.0 (Unported) license and the GNU Free
Documentation License (unversioned, with no invariant sections,
front-cover texts, or back-cover texts). The dictionary shipped with
the current version of Wordwork is from April 4, 2010.


* * * * *

The simple extractor can be found in the "misc" directory under the
name "wiktionary-extractor-simple.c". It is written in C, not Python.

To compile it, do something like this (the -s and -Os flags make the
resulting binary smaller):

    $ cd misc
    $ gcc -s -Os -o extract-simple wiktionary-extractor-simple.c

Then you can run it like this:

    $ ./extract-simple path/to/wiktionary-file > your-output-file

This writes a newline-separated list of all words in the Wiktionary
file to "your-output-file". The list is not sorted.


* * * * *

wordwork 0.1
Copyright (C) 2010 Niels Serup
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

The logo in the "art" directory is available under the Creative Commons
Attribution-ShareAlike 3.0 Unported license. Attribute Niels Serup of
metanohi.org.


* * * * *

Copyright (C) 2010 Niels Serup

Copying and distribution of this file, with or without modification,
are permitted in any medium without royalty provided the copyright
notice and this notice are preserved. This file is offered as-is,
without any warranty.