Monday, December 17, 2018

Using machine stenography in pen shorthand: Pen Steno

Longtime Plover community member and steno enthusiast Kevin Knox (codepoke) has invented a method to represent machine steno strokes using pen-shorthand-like techniques. As a fan and user of steno, I tried to learn Gregg shorthand and found it frustrating. You need to relearn all sorts of briefs, deal with phonetic conflicts, and just generally speak a whole new language compared to the one we use on our steno machines. Kevin took this frustration and actually did something about it, and I'm very excited to share what he's prepared with The Plover Blog.


Pen Steno

by Kevin Knox 
tl;dr: Pen Steno is a niche writing system leveraging the dictionaries of stenography to double or triple the speed of standard longhand writing. Gregg shorthand can quadruple or quintuple the speed of longhand, but at the cost of a great deal more learning than Pen Steno. 
Have you ever found yourself unable to enjoy your ice cream because you had too much cake? I found myself there in June 2018.

I'd learned Gregg shorthand back in the 90's, and loved it. For 20+ years I'd taken all my notes in Gregg, using it probably about a half-hour a week. It didn't change my life, but it was a lot of fun to take near-complete notes and to scribble out bizarre shapes, each of which had a solid meaning. The whorling kanji of Gregg not only made sense, they made writing feel "righter".

In June 2016, my relationship with the written word grew even richer. That's when I found the Open Steno Project, Plover, and the work of Mirabai Knight, Ted Morin, Benoit Pierre, and others. I found stenography, and I was in love. Writing flows in a whole new way when each word takes on its own special geometry and shape.

There was one little complexity with my new-found love triangle. Gregg is the Perl [fn 1] of the writing world. I could write anything in Gregg, but rather like Perl, it hurt to read it again. Using Gregg only once a week or so, I never got good enough to decipher my own notes smoothly, not that I'd ever have admitted that out loud.

On that fateful day in June, I was taking Gregg notes and brain-crashed on the word "whatever". I was flummoxed. In Gregg, "whatever" is spelled OTEV. (If you squint your ears funny, and say "uh" for the O, you can kind of hear it, but it's not easy to read.) In stenography, "whatever" is spelled WHAFR. Relatively speaking, the word jumps out upon a read-back. On my day of decision, the steno brief came to mind, blocking the Gregg brief. I missed hearing a half sentence while trying to figure out how to write WHAFR in Gregg, and wondering why I'd never tried to put an H in there before.

I couldn't enjoy my Gregg cake because I suddenly had too much stenography ice cream.

What to do?

Obviously, I had to create a pen steno system. With the creativity of a blind badger I named it Pen Steno, and now I'm on version 4.02. The first 3 versions were all painfully different from each other, but I'm comfortable with version 4 now and rolling with it daily. The quest for a working pen steno system took me down a number of roads, but the destination is happily a working system.

Cake à la mode was born.

How to Pen Steno 

If you don't know stenography, nothing from this point forward will mean much to you. It's possible to learn significant things about stenography by starting with pen steno, but it's truly an add-on to stenography rather than a separate path. For that reason, all the further discussion in this post will assume a working knowledge of stenography.

In discussing pen steno, I will use the following terms:

  • Outline: A shape drawn without lifting the pen.
    An outline usually represents a word, but sometimes two or more outlines are needed to define a word while other single outlines define multiple words.
  • Stroke: One piece of an outline drawn in a continuous direction.
    Most outlines are composed of several strokes.
  • Shape: What the stroke looks like after you've drawn it.
    Some strokes take the shape of a curve, others of a loop, and a few of a line. Direction, curve, and length are all significant in a stroke's shape.
  • Brief: A non-phonetic outline that represents a complex word or phrase using minimal strokes.
The goal of pen steno is to match each stenographic keystroke to a pen stroke. If you strike 9 steno keys to make a word, you can pen stroke those same 9 keys to draw an outline for that same word.

You may have noticed 9 is a bad number. Stenography makes striking 9 keys a breeze. Drawing 9 strokes to make a single pen outline can take longer than writing the word out in good, old longhand. And reading it wouldn't make anyone happy, either.

To bridge that gap, pen steno mimics the well-known blends of stenography. TKPW- is the stenographic blend for G-, and pen steno follows the same pattern by introducing a single, circular shape to represent striking 4 well-known keys.

The following map shows the shapes corresponding to each direct and blended stroke. G- is the largest circle on the left-hand side.



  • Blue shapes are the simple, direct shape corresponding to striking any one steno key.
  • Purple shapes extend across multiple steno keys horizontally adjacent to each other.
  • Orange shapes correspond to vertically adjacent keys.
  • Green shapes are loops taking in 4 adjacent keys. 

There is a logic to the choice of these shapes. In a blog post like this, I shouldn't try to teach the use of pen steno, but I'll give some of my reasoning.

  • The pinky and forefingers are shortest, so you'll see the shapes for S-, H-, -F, and -T are all short. The shapes for the longer fingers are all long. This includes the shapes for the vertical blends.
  • The curves for the left two fingers on either hand bend to the left. The curves for the right two fingers bend right.
  • The curves for -D and -Z are just like -T and -S, but with a curl on the end.
  • The horizontal blends for ST-, SK-, TP-, KW-, -FP, and -RB are straight lines on the pinky side. The outer mark is shorter, the upper lines stroke upward, and the lower lines stroke downward.
  • The horizontal blends for TPH-, PH-, KWR-, WR-, -PL, -PLT, -BG, and -BGS are inclusive curves. The bigger curves include more keys.
  • Circles always enclose 4 keys, even in the horizontal blends.
Note that every shape is either short or long. The Gregg convention of differentiating meaning with short, medium, and long strokes caused me too much pain, so I've dropped all medium-length shapes. It's easier to tell short from long than either from medium.
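
To make the effect of blends concrete, here is a minimal Python sketch that splits a left-hand chord into the fewest shapes it can find. It only knows the left-hand blends named in this post, and the largest-blend-first strategy and the name segment_left are assumptions made for illustration, not rules of Pen Steno itself.

```python
# Left-hand steno keys in steno order.
LEFT_ORDER = "STKPWHR"

# Left-hand blends named in the post. The full shape chart has more
# (for example the vertical blends), so this is an illustrative subset.
LEFT_BLENDS = ["TKPW", "TPH", "KWR", "ST", "SK", "TP", "KW", "PH", "WR"]


def segment_left(chord):
    """Split a left-hand chord into blends and single keys, largest blend first."""
    remaining = set(chord)
    if not remaining <= set(LEFT_ORDER):
        raise ValueError(f"not a left-hand chord: {chord!r}")
    pieces = []
    # Assumption: always prefer the blend that covers the most keys.
    for blend in sorted(LEFT_BLENDS, key=len, reverse=True):
        if set(blend) <= remaining:
            pieces.append(blend)
            remaining -= set(blend)
    pieces.extend(remaining)  # leftover keys become single-key shapes
    # Report the pieces in steno order of their first key.
    return sorted(pieces, key=lambda piece: LEFT_ORDER.index(piece[0]))


if __name__ == "__main__":
    for chord in ["TKPW", "STPH", "S"]:
        print(chord, "->", segment_left(chord))
```

Under these assumptions the four keys of TKPW- collapse to one shape, and STPH- becomes two shapes (S- followed by TPH-) instead of four.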

The Vowels 

Vowels are the critical thing. There are 16 of them in steno (counting the nil vowel, there are 2⁴ combinations), and they're the most overloaded of all the strokes. The vowels give words their sound, of course, but they also separate the left-side strokes from the right-side ones and disambiguate many homonyms. Vowels must be quick to write and distinct from every consonant stroke. I tried a dozen ideas before settling on version 4's method.

I've chosen to let the vowels be the sole owners of the simple small-U shape. A- and -E on the left get the upward-arching U, and O- and -U on the right get the downward-arching U shape. When either pair of vowels is blended together, the U shape widens to contain both of them. AO- is a wide, upward-arching U shape, and so is -EU. There is no wide, downward-arching U shape in pen steno.

That leaves one more edge case. Sometimes there's neither an A- nor an O-. In that case, a small circle is drawn to represent the nil vowel. This is only needed for A- and O-. The absence of -E and -U carries no special meaning, so it's not drawn. The absence of all vowels is shown by the small circle for the absence of the A-/O- pair.
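
The vowel rules above are compact enough to check mechanically. The following Python sketch enumerates all 16 vowel chords and applies the rules as described; the shape labels are just descriptive strings chosen for this example, and the function name vowel_shapes is hypothetical.

```python
from itertools import product


def vowel_shapes(a, o, e, u):
    """Return the shapes drawn for one vowel chord, in writing order."""
    shapes = []
    # A-/O- pair: always produces a shape, using the nil circle if both are absent.
    if a and o:
        shapes.append("wide upward U")
    elif a:
        shapes.append("small upward U")
    elif o:
        shapes.append("small downward U")
    else:
        shapes.append("small circle")  # nil vowel marker
    # -E/-U pair: nothing is drawn when both are absent.
    if e and u:
        shapes.append("wide upward U")
    elif e:
        shapes.append("small upward U")
    elif u:
        shapes.append("small downward U")
    return shapes


# Every on/off combination of the four vowel keys: 2**4 = 16 chords.
ALL_VOWEL_CHORDS = list(product([False, True], repeat=4))

if __name__ == "__main__":
    for a, o, e, u in ALL_VOWEL_CHORDS:
        name = "".join(k for k, on in zip("AOEU", (a, o, e, u)) if on) or "(nil)"
        print(f"{name:>5}: {', '.join(vowel_shapes(a, o, e, u))}")
```

Because the A-/O- pair always contributes a shape, each of the 16 chords yields a different sequence, which is what lets a wide U for AO- be told apart from a wide U for -EU.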

Asterisk

Lastly, there's the special steno char, "*". The asterisk in pen steno is always drawn at the end of the full outline. This is to keep it from being confused with the nil vowel marker that can appear at the beginning or in the center of an outline.

Machine Learning

Pen Steno has no conflicts. An outline, once written, can only correspond to a single dictionary entry, which in turn can only correspond to a single translation. This makes pen steno as simple a candidate for machine learning as I'd hope to find. Each outline can be read (by a person or by an ML process) and submitted to the Plover engine serially to create direct output. I'd dearly love to see a machine learning tool developed to support the process, but it's too many leaps too far for my current skill set.
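
As a rough picture of the "read, then look up" pipeline, here is a minimal Python sketch. It assumes some recognizer (a person or a hypothetical ML model) has already turned each outline into a steno stroke string. The toy dictionary uses the stroke-to-translation JSON shape that Plover dictionaries are stored in, but this is not Plover's engine or API, and it ignores multi-stroke entries entirely.

```python
import json

# Toy dictionary: one entry, the brief mentioned earlier in this post.
TOY_DICTIONARY = json.loads('{"WHAFR": "whatever"}')


def translate(recognized_strokes, dictionary):
    """Look up each recognized stroke in order.

    Strokes with no entry are left as raw steno, a choice made for this sketch.
    """
    return [dictionary.get(stroke, stroke) for stroke in recognized_strokes]


if __name__ == "__main__":
    # Pretend the recognizer produced these two strokes from a page of notes.
    print(translate(["WHAFR", "XYZ"], TOY_DICTIONARY))
```

Since each outline maps to exactly one stroke, the lookup step needs no disambiguation; all of the hard work sits in the recognizer.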

Briefs

Arbitrary briefs can be created by anyone at any time, and one day a standard set may be created. I'm not excited about it, because I'm not yet feeling a need for pen briefs. Plover already has a great set of one- and two-stroke briefs that translate well enough to pen steno to make me happy. If, however, I stop and think a moment about a briefing system, I like using a flat, short stroke like an en dash as each brief's leading stroke.

When used normally, pen steno looks like this:


Using Pen Steno 

This is the part where I fail to recommend everyone start using pen steno. According to data quoted on the Plover Discord server, a normal person can use a pen to write about 300 strokes per minute. I have no independent verification of that number, but it makes sense to me, so let's run with it.

Let's say a classic, lowercase, cursive W takes three strokes to write, H is 3, A is 3, T is 3, E is 2, V is 2, E again is 2, and R is 3. That's 21 strokes. A normal longhand writer could produce "whatever" about 14 times in one minute. Let's call that 30 words per minute, since "whatever" is nearly 2 words long.

Gregg only uses 4 strokes to produce that same longish word. A competent user of Gregg shorthand should be able to write "whatever" 75 times in that same minute for 150 WPM. The improvement is plain to see, at 5 times as fast.

A pen steno user would also write the word "whatever" in 4 strokes, but let's not be misled by a single favorable example. A quick and dirty check of some of my old Gregg shows it weighing in at a consistent 2.5 strokes per word. Better users will use deeper briefs, and get better numbers. Using all blends, I expect pen steno to weigh in closer to 4 strokes per word.

I think, allowing for 1 significant figure, we can put longhand at 10 strokes per word, pen steno at 4 strokes per word, and Gregg at 2 strokes per word.
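
For anyone who wants to rerun the arithmetic with their own numbers, here is a small Python sketch. It takes the unverified 300 strokes per minute figure and the rounded strokes-per-word estimates above as given; the function name words_per_minute is just for this example.

```python
STROKES_PER_MINUTE = 300  # figure quoted in the post, not independently verified

# Rounded strokes-per-word estimates from the paragraph above.
STROKES_PER_WORD = {"longhand": 10, "pen steno": 4, "Gregg": 2}


def words_per_minute(strokes_per_word, strokes_per_minute=STROKES_PER_MINUTE):
    """Convert a strokes-per-word estimate into words per minute."""
    return strokes_per_minute / strokes_per_word


if __name__ == "__main__":
    baseline = words_per_minute(STROKES_PER_WORD["longhand"])
    for system, spw in STROKES_PER_WORD.items():
        wpm = words_per_minute(spw)
        print(f"{system:>9}: {wpm:4.0f} WPM, {wpm / baseline:.1f}x longhand")
```

With these inputs longhand comes out at 30 WPM, pen steno at 75 WPM (2.5 times longhand), and Gregg at 150 WPM (5 times longhand).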

That's a win for me, and maybe it is for you, too.

Pen Steno should fit a niche of folk who already know stenography, need to write a few longhand notes per day, and don't want to give precious brain-space to two different dictionaries of briefs.

If that's you, I hope you have a little fun with it!

—Kevin

Footnote 1: Perl is a delightful and powerful programming language guilty of luring its users to write nearly unreadable code.
