135 lines
4.6 KiB
Org Mode
#+title: Two spaces
#&summary
Not one.
#&
#+license: wtfpl
#+startup: showall
#&toc


* Two spaces

When I end a sentence and intend to write a new one, I type two spaces instead
of one.

I do this only to separate sentence endings from period-terminated
abbreviations.  Consider this sentence:

: I eat couches, e.g. brown ones. They are nice.

If you didn't know that "e.g." is an abbreviation, you might think that there
are three sentences: "I eat couches, e.g.", "brown ones.", and "They are nice."

Now consider this sentence:

: I eat couches, e.g. brown ones.  They are nice.

By typing two spaces between the sentences, I have made clear that there are
only two sentences, and that the period in "e.g." is not the end of a sentence.

The problem is that the period has two purposes: To end a sentence and to end
some abbreviations.  Always using two spaces to separate sentences solves this.
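
A minimal Python sketch of how a program could exploit this convention.  It
assumes that a sentence ends with a period, exclamation mark or question mark
followed by at least two spaces; the names =SENTENCE_BREAK= and
=split_sentences= are made up for this illustration.

#+BEGIN_SRC python
import re

# Assumption: a terminator followed by two or more spaces ends a sentence.
SENTENCE_BREAK = re.compile(r"(?<=[.!?]) {2,}")


def split_sentences(text):
    """Split text wherever a terminator is followed by at least two spaces."""
    return [part for part in SENTENCE_BREAK.split(text) if part]


# The two spaces are written as 2 * " " so that they are easy to see.
one_space = "I eat couches, e.g. brown ones. They are nice."
two_spaces = "I eat couches, e.g. brown ones." + 2 * " " + "They are nice."

print(split_sentences(one_space))
print(split_sentences(two_spaces))
#+END_SRC

With one space the text comes back as a single piece, because nothing marks
where a sentence ends.  With two spaces it splits into exactly two sentences,
and the period in "e.g." is left alone.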


* Other solutions

** Revolution

The best solution would be to use a separate character for abbreviation
termination, or none at all, so that the period is exclusively used for ending
sentences.


** No change

One might think that another solution is to use just one space, the very thing
that I'm arguing against.  In the example above with one space between
sentences, it's actually /not/ difficult to see that there are only two
sentences: We know that a sentence must start with an uppercase letter, and
"brown" after "e.g." does not, so it's not a new sentence.

However, uppercase letters *can* occur after abbreviations if they are part of
given names.  Consider this sentence:

: I eat couches, e.g. Priscilla's brown one. They are nice.

It's not clear that "Priscilla's" does not start a new sentence, because it's
very similar to "They": Both words start with an uppercase letter and are
placed after a period and a space.  But "Priscilla's" is just another word in
the first sentence!

This almost shows that the one-space methodology is insufficient, but not
completely.  One can argue that if we know all valid abbreviations, we can just
check if a period is an end to an abbreviation or not, and determine that way
whether it ends a sentence.

But this is only true if the abbreviation can be used in only one way!  Read
this sentence:

: I used to eat couches bef. I found the cow.

It uses the abbreviation "bef." for "before"; see
[[http://public.oed.com/how-to-use-the-oed/abbreviations/]].

The sentence can be read in two ways: Either you read it as one sentence -- "I
used to eat couches before I found the cow" -- or you read it as two sentences
-- "I used to eat couches before." and "I found the cow."

Both are valid (at least if you accept that a preposition can be the last word
in a sentence).

I admit that this example is a bit extreme.  After all, most abbreviations
can be used only in unambiguous ways.  Nevertheless, it still shows that just
using a single space between sentences *is insufficient*!

Also, we have assumed that all abbreviations are known, which excludes temporary
(and to some extent field-specific) abbreviations.  This is not good!  It's
much easier to just use two spaces between your sentences!
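
The abbreviation-list approach can be sketched like this in Python.  The list
=KNOWN_ABBREVIATIONS= and the function =split_one_space= are hypothetical and
deliberately tiny; this only illustrates the idea and is not how any real tool
does it.

#+BEGIN_SRC python
# Hypothetical abbreviation list; a real one can never be complete.
KNOWN_ABBREVIATIONS = {"e.g.", "i.e.", "bef."}


def split_one_space(text):
    """End a sentence after every word ending in a period, unless the word
    is a known abbreviation."""
    sentences, current = [], []
    for word in text.split(" "):
        current.append(word)
        if word.endswith(".") and word not in KNOWN_ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:
        # Keep any trailing words that were not closed by a period.
        sentences.append(" ".join(current))
    return sentences


print(split_one_space("I eat couches, e.g. Priscilla's brown one. They are nice."))
print(split_one_space("I used to eat couches bef. I found the cow."))
print(split_one_space("It weighs approx. two tons."))
#+END_SRC

The Priscilla example comes out as two sentences, as wanted.  The "bef."
example always comes out as one sentence, so the two-sentence reading cannot be
expressed at all, and an abbreviation missing from the list, such as "approx.",
wrongly ends a sentence.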


* Two spaces and fixed width output

Due to my background/foreground as a programmer, I have a tendency to limit
myself to 80 characters per line, and write two newlines when I start a new
paragraph (just look at the source of this page).

This is just a choice of representation which works well in many cases, but I
won't write about that.  The interesting thing is: How does this mix with using
two spaces between sentences?  This can actually be a problem; look at this
sentence:

: Bla bla bla bef. bla bla.

This is one sentence, as "bef." does not end the sentence.  If we assume that
the line width is not 80 characters, but instead 16 characters, then the line
would be wrapped like this:

#+BEGIN_SRC
Bla bla bla bef.
bla bla.
#+END_SRC

But now it's not clear if "bef." ends a sentence or not!  If we want to turn the
fixed-width representation back into a simple line representation, we don't know
if we should insert one or two spaces after "bef.".  How do we solve that?

The answer is that, when you line-wrap, you don't split word sequences separated
by ". ", i.e. you treat an abbreviation and its following word as a single word.
That way, you would end up with:

#+BEGIN_SRC
Bla bla bla
bef. bla bla.
#+END_SRC

which would not cause any problems.
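
A sketch of this wrapping rule in Python, under the assumption that the input is
a single line which uses one space after abbreviations and two spaces between
sentences.  The names =wrap=, =unwrap= and =SENTENCE_GAP= are made up here, and
the greedy line filling is just the simplest strategy that shows the point.

#+BEGIN_SRC python
import re

SENTENCE_GAP = 2 * " "  # two spaces, written out so they are easy to see


def wrap(text, width):
    """Greedily wrap text, never breaking after a period and a single space."""
    units = []  # each entry: [separator before the unit, unit text]
    previous_word = ""
    for spaces, word in re.findall(r"( *)(\S+)", text):
        if units and spaces == " " and previous_word.endswith("."):
            # Period plus one space: an abbreviation, so glue the next word on.
            units[-1][1] += " " + word
        else:
            units.append([spaces, word])
        previous_word = word

    lines = []
    for separator, unit in units:
        if lines and len(lines[-1]) + len(separator) + len(unit) <= width:
            lines[-1] += separator + unit
        else:
            lines.append(unit)  # start a new line; the separator is dropped
    return "\n".join(lines)


def unwrap(wrapped):
    """Join wrapped lines back into a single line."""
    lines = wrapped.split("\n")
    text = lines[0]
    for line in lines[1:]:
        # wrap() never breaks after an abbreviation, so a line that ends in a
        # terminator must end a sentence and gets two spaces back.
        gap = SENTENCE_GAP if text.endswith((".", "!", "?")) else " "
        text += gap + line
    return text


print(wrap("Bla bla bla bef. bla bla.", 16))
#+END_SRC

Wrapping the example sentence at 16 characters gives the two lines shown above,
and because a line can now only end in a period when a sentence ends there,
=unwrap= can safely put two spaces back.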


* General thoughts

Most natural languages have some amount of ambiguity, and part of it seems to
make some things easier, i.e. allowing speakers to be loose when talking about
stuff.

This other kind of ambiguity doesn't help anyone.