From e20e4891d9e381c7bc5387a82f66f7d9b0041d16 Mon Sep 17 00:00:00 2001
From: "Niels G. W. Serup"
Date: Wed, 8 Apr 2015 14:31:45 +0200
Subject: [PATCH] Wrote about using two spaces.

---
 site/writings/two-spaces.org | 132 +++++++++++++++++++++++++++++++++++
 1 file changed, 132 insertions(+)
 create mode 100644 site/writings/two-spaces.org

diff --git a/site/writings/two-spaces.org b/site/writings/two-spaces.org
new file mode 100644
index 0000000..71f305a
--- /dev/null
+++ b/site/writings/two-spaces.org
@@ -0,0 +1,132 @@
+#+title: Two spaces
+#&summary
+Not one.
+#&
+#+license: wtfpl
+#+startup: showall
+#&toc
+
+
+* Two spaces
+
+When I end a sentence and intend to write a new one, I type two spaces instead
+of one.
+
+I do this only to distinguish sentence endings from period-terminated
+abbreviations.  Consider this text:
+
+: I eat couches, e.g. brown ones. They are nice.
+
+If you didn't know that "e.g." is an abbreviation, you might think that there
+are three sentences: "I eat couches, e.g.", "brown ones.", and "They are nice."
+
+Now consider this version:
+
+: I eat couches, e.g. brown ones.  They are nice.
+
+By typing two spaces between the sentences, I have made clear that there are
+only two sentences, and that the period in "e.g." is not the end of a sentence.
+
+The problem is that the period has two purposes: To end a sentence and to end
+some abbreviations.  Always using two spaces to separate sentences solves this.
+
+
+* Other solutions
+
+** Revolution
+
+The best solution would be to use a separate character for abbreviation
+termination, or none at all, so that the period is exclusively used for ending
+sentences.
+
+
+** No change
+
+One might think that another solution is to use just one space, the very thing
+that I'm arguing against.  In the example above with one space between
+sentences, it's actually /not/ difficult to see that there are only two
+sentences: We know that a sentence must start with an uppercase letter, and
+"brown" after "e.g." does not, so it's not a new sentence.
+
+However, uppercase letters *can* occur after abbreviations if they begin proper
+nouns.  Consider this text:
+
+: I eat couches, e.g. Priscilla's brown one. They are nice.
+
+It's not clear that "Priscilla's" does not start a new sentence, because it's very
+similar to "They": Both words start with an uppercase letter and are placed
+after a period and a space.  But "Priscilla's" is just another word in the first
+sentence!
+
+This nearly shows that the one-space method is insufficient, but not
+completely.  One can argue that if we know all valid abbreviations, we can just
+check whether a period ends an abbreviation, and determine that way whether it
+ends a sentence.
+
+But this is only true if the abbreviation can be used in only one way!  Read
+this text:
+
+: I used to eat couches bef. I found the cow.
+
+It uses the abbreviation "bef." for "before"; see
+[[http://public.oed.com/how-to-use-the-oed/abbreviations/]].
+
+The text can be read in two ways: Either you read it as one sentence -- "I
+used to eat couches before I found the cow" -- or you read it as two sentences
+-- "I used to eat couches before." and "I found the cow."
+
+Both are valid (at least if you accept that a preposition can be the last word
+in a sentence).
+
+I admit that this example is a bit extreme.  After all, most abbreviations
+can be used only in unambiguous ways.  Nevertheless, it still shows that just
+using a single space between sentences *is insufficient*!
+
+Also, we have assumed that all abbreviations are known, which excludes temporary
+(and to some extent field-specific) abbreviations.  This is not good!  It's much
+easier to just use two spaces between your sentences!
+
+
+* Two spaces and fixed-width output
+
+Due to my background/foreground as a programmer, I have a tendency to limit
+myself to 80 characters per line, and write two newlines when I start a new
+paragraph (just look at the source of this page).
+
+This is just a choice of representation which works well in many cases, but I
+won't write about that.  The interesting thing is: How does this mix with using
+two spaces between sentences?  This can actually be a problem; look at this
+sentence:
+
+: Bla bla bla bef. bla bla.
+
+This is one sentence, as "bef." does not end the sentence.  If we assume that
+the line width is not 80 characters, but instead 16 characters, then the line
+should be wrapped like this:
+
+#+BEGIN_SRC
+Bla bla bla bef.
+bla bla.
+#+END_SRC
+
+But now it's not clear if "bef." ends a sentence or not!  If we want to turn the
+fixed-width representation back into a simple line representation, we don't know
+if we should insert one or two spaces after "bef.".  How do we solve that?
+
+The answer is that, when you line-wrap, you don't split word sequences separated
+by ". " (a period and one space), i.e. you treat an abbreviation and the word
+after it as a single word.  That way, you would end up with:
+
+#+BEGIN_SRC
+Bla bla bla
+bef. bla bla.
+#+END_SRC
+
+which would not cause any problems.
+
+
+* General thoughts
+
+Most natural languages have some amount of ambiguity, and part of it seems to
+make some things easier, e.g. by allowing speakers to be loose when talking
+about stuff.