Wrote about using two spaces.
parent f93ac48919
commit e20e4891d9
|
@ -0,0 +1,132 @@
|
|||
#+title: Two spaces
|
||||
#&summary
|
||||
Not one.
|
||||
#&
|
||||
#+license: wtfpl
|
||||
#+startup: showall
|
||||
#&toc
|
||||
|
||||
|
||||
* Two spaces
|
||||
|
||||
When I end a sentence and intend to write a new one, I type two spaces instead of
|
||||
one.
|
||||
|
||||
I do this only to separate sentence endings from period-terminated
|
||||
abbreviations. Consider this sentence:
|
||||
|
||||
: I eat couches, e.g. brown ones. They are nice.
|
||||
|
||||
If you didn't know that "e.g." is an abbreviation, you might think that there
|
||||
are three sentences: "I eat couches, e.g.", "brown ones.", and "They are nice."
|
||||
|
||||
Now consider this sentence:
|
||||
|
||||
: I eat couches, e.g. brown ones.  They are nice.
|
||||
|
||||
By typing two spaces between the sentences, I have made clear that there are
|
||||
only two sentences, and that the period in "e.g." is not the end of a sentence.
|
||||
|
||||
The problem is that the period has two purposes: To end a sentence and to end
|
||||
some abbreviations. Always using two spaces to separate sentences solves this.
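
A minimal sketch of how a program could exploit this rule, assuming the author always types two (or more) spaces between sentences and that the text sits on a single line. The function name split_sentences is hypothetical and only serves as an illustration.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Split text wherever '.', '!' or '?' is followed by two or more spaces."""
    # A period followed by a single space is NOT a boundary, so
    # abbreviations such as "e.g." stay inside their sentence.
    parts = re.split(r"(?<=[.!?]) {2,}", text.strip())
    return [part for part in parts if part]


# Two spaces between the sentences: two sentences are found.
print(split_sentences("I eat couches, e.g. brown ones.  They are nice."))
# One space only: the splitter sees a single sentence.
print(split_sentences("I eat couches, e.g. brown ones. They are nice."))
```

No list of abbreviations is needed; the spacing alone carries the information.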
|
||||
|
||||
|
||||
* Other solutions
|
||||
|
||||
** Revolution
|
||||
|
||||
The best solution would be to use a separate character for abbreviation
|
||||
termination, or none at all, so that the period is exclusively used for ending
|
||||
sentences.
|
||||
|
||||
|
||||
** No change
|
||||
|
||||
One might think that another solution is to use just one space, the very thing
|
||||
that I'm arguing against. In the example above with one space between
|
||||
sentences, it's actually /not/ difficult to see that there are only two
|
||||
sentences: We know that a sentence must start with an uppercase letter, and
|
||||
"brown" after "e.g." does not, so it's not a new sentence.
|
||||
|
||||
However, uppercase letters *can* occur after abbreviations if they are part of
|
||||
given names. Consider this sentence:
|
||||
|
||||
: I eat couches, e.g. Priscilla's brown one. They are nice.
|
||||
|
||||
It's not clear that "Priscilla's" does not start a new sentence, because it's very
|
||||
similar to "They": Both words start with an uppercase letter and are placed
|
||||
after a period and a space. But "Priscilla's" is just another word in the first
|
||||
sentence!
|
||||
|
||||
This almost shows that the one-space methodology is insufficient, but not
|
||||
completely. One can argue that if we know all valid abbreviations, we can just
|
||||
check whether a period ends an abbreviation, and determine that way whether it
ends a sentence.
|
||||
|
||||
But this is only true if the abbreviation can be used in only one way! Read
|
||||
this sentence:
|
||||
|
||||
: I used to eat couches bef. I found the cow.
|
||||
|
||||
It uses the abbreviation "bef." for "before"; see
|
||||
[[http://public.oed.com/how-to-use-the-oed/abbreviations/]].
|
||||
|
||||
The sentence can be read in two ways: Either you read it as one sentence -- "I
|
||||
used to eat couches before I found the cow" -- or you read it as two sentences
|
||||
-- "I used to eat couches before." and "I found the cow."
|
||||
|
||||
Both are valid (at least if you accept that a preposition can be the last word
|
||||
in a sentence).
|
||||
|
||||
I admit that this example is a bit extreme. After all, most abbreviations
|
||||
can be used only in unambiguous ways. Nevertheless, it still shows that just
|
||||
using a single space between sentences *is insufficient*!
|
||||
|
||||
Also, we have assumed that all abbreviations are known, which excludes temporary
|
||||
(and to some extent field-specific) abbreviations. This is not good! It's much
|
||||
easier to just use two spaces between your sentences!
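
The abbreviation-list approach can be sketched as follows, assuming one space between sentences. The names KNOWN_ABBREVIATIONS and split_one_space are hypothetical, and the list is deliberately tiny. The point of the sketch is that such a splitter must commit to a single reading of "bef." and can never produce the other one.

```python
# Hypothetical, incomplete list; a real one could never cover
# temporary or field-specific abbreviations.
KNOWN_ABBREVIATIONS = {"e.g.", "i.e.", "bef."}


def split_one_space(text: str) -> list[str]:
    """Split single-spaced text, treating known abbreviations as non-final."""
    sentences = []
    current = []
    for word in text.split():
        current.append(word)
        # End the sentence at '.', '!' or '?', unless the word is a
        # known abbreviation.
        if word[-1] in ".!?" and word not in KNOWN_ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:
        sentences.append(" ".join(current))
    return sentences


# Works for the "Priscilla" example, because "e.g." is in the list.
print(split_one_space("I eat couches, e.g. Priscilla's brown one. They are nice."))
# Always yields ONE sentence here; the two-sentence reading is lost.
print(split_one_space("I used to eat couches bef. I found the cow."))
```

Removing "bef." from the list flips the result to two sentences, but then the one-sentence reading is lost instead.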
|
||||
|
||||
|
||||
* Two spaces and fixed width output
|
||||
|
||||
Due to my background/foreground as a programmer, I have a tendency to limit
|
||||
myself to 80 characters per line, and write two newlines when I start a new
|
||||
paragraph (just look at the source of this page).
|
||||
|
||||
This is just a choice of representation which works well in many cases, but I
|
||||
won't write about that. The interesting thing is: How does this mix with using
|
||||
two spaces between sentences? This can actually be a problem; look at this
|
||||
sentence:
|
||||
|
||||
: Bla bla bla bef. bla bla.
|
||||
|
||||
This is one sentence, as "bef." does not end the sentence. If we assume that
|
||||
the line width is not 80 characters, but instead 16 characters, then the line
|
||||
should be wrapped like this:
|
||||
|
||||
#+BEGIN_SRC
|
||||
Bla bla bla bef.
|
||||
bla bla.
|
||||
#+END_SRC
|
||||
|
||||
But now it's not clear if "bef." ends a sentence or not! If we want to turn the
|
||||
fixed-width representation back into a simple line representation, we don't know
|
||||
if we should insert one or two spaces after "bef.". How do we solve that?
|
||||
|
||||
The answer is that, when you line-wrap, you don't split word sequences separated
|
||||
by ". ", i.e. you see an abbreviation and its following word as a single word.
|
||||
That way, you would end up with:
|
||||
|
||||
#+BEGIN_SRC
|
||||
Bla bla bla
|
||||
bef. bla bla.
|
||||
#+END_SRC
|
||||
|
||||
which would not cause any problems.
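
A minimal sketch of this wrapping rule, assuming greedy line filling and that sentences are separated by two spaces. The names unbreakable_units, wrap and unwrap are hypothetical. A word ending in a period inside a sentence is glued to the word after it, so a line can only end with a period when a sentence really ends there.

```python
import re


def unbreakable_units(text: str) -> list[tuple[str, str]]:
    """Return (separator, unit) pairs; a unit is never split across lines."""
    units = []
    # Two or more spaces after '.', '!' or '?' mark a sentence boundary.
    for sentence in re.split(r"(?<=[.!?]) {2,}", text.strip()):
        merged = []
        glue = False
        for word in sentence.split():
            if glue:
                merged[-1] += " " + word  # keep "bef. bla" together
            else:
                merged.append(word)
            glue = word.endswith(".")
        for j, unit in enumerate(merged):
            # Two spaces before the first unit of a new sentence.
            sep = "" if not units else ("  " if j == 0 else " ")
            units.append((sep, unit))
    return units


def wrap(text: str, width: int) -> list[str]:
    """Greedily fill lines up to width without splitting any unit."""
    lines = []
    current = ""
    for sep, unit in unbreakable_units(text):
        if current and len(current) + len(sep) + len(unit) <= width:
            current += sep + unit
        else:
            if current:
                lines.append(current)
            current = unit
    if current:
        lines.append(current)
    return lines


def unwrap(lines: list[str]) -> str:
    """Join lines again; a line ending in '.', '!' or '?' ended a sentence."""
    text = ""
    for line in lines:
        if text:
            text += "  " if text[-1] in ".!?" else " "
        text += line
    return text


print(wrap("Bla bla bla bef. bla bla.", 16))
```

With a width of 16 this yields the lines "Bla bla bla" and "bef. bla bla.", and unwrap turns them back into the original single line. A unit longer than the width is placed on a line of its own and not broken.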
|
||||
|
||||
|
||||
* General thoughts
|
||||
|
||||
Most natural languages have some amount of ambiguity, and part of it seems to
|
||||
make some things easier, i.e. allowing speakers to be loose when talking about
|
||||
stuff.
|