18 December 2022

The Digital Detective: Encryption ≠ Encoding


Telex paper tape

The Explainers

Don’t refer to ‘computer codes’. When coaching lawyers for depositions, that became my first rule. I urge the same rule for authors as well. Don’t blow credibility by trying to pluralize code as ‘codes’: in computerdom, the plural of code is still code.

And what kind of code? Source code? Microcode? Machine code? Generic ‘computer code’ is less than meaningless. And while we’re at it, hackers can’t remotely set opponents’ computers on fire, not unless they slip their adversary certain laptops with defective Sony batteries.

To illustrate concepts, real-world analogies appeal to me, but some computer specialties are so abstract that explaining them is difficult. A few software specialists relate systems programming to composing music: Both take place in the originator’s mind, both use symbolic languages and, since the invention of the player piano and now modern mixing consoles, both can be programmed. But analogies can go only so far.

One of the most common questions has proved the most difficult to answer: How are characters stored in the computer? For example, what does “Now is the time” look like inside the machine? Explaining that each character has a numeric representation loses some people, but mentioning that the digits 0123 are stored as hexadecimal 30313233 in ASCII (or worse, F0F1F2F3 in EBCDIC) results in eye-glazing and blood leaking from the ears.
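For readers who would rather see it than take my word, here is a minimal Python sketch of what those characters look like as stored bytes. It assumes ASCII for the first two lines and uses Python’s cp037 codec as one stand-in for IBM’s EBCDIC, which is where the F0F1F2F3 form comes from. The helper name hex_dump is made up for illustration.

```python
def hex_dump(text: str, codec: str = "ascii") -> str:
    """Return the bytes that represent `text` in `codec`, as uppercase hex."""
    return text.encode(codec).hex().upper()

# Each pair of hex digits below is one stored character.
print(hex_dump("Now is the time"))
print(hex_dump("0123"))            # ASCII digits
print(hex_dump("0123", "cp037"))   # EBCDIC digits (cp037 is one EBCDIC code page)
```

Nothing is hidden in either form. The same lookup table that turned the characters into numbers turns them back again.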

Many programming courses don’t attempt to explain how letters and numbers are recognized and stored in computers. It’s taken for granted and too often they fall back upon, “Do as we say and you’ll do okay.” But that doesn’t answer the question.

Mike Drop

And then… two Michaels came together and showed me the way.

One was Michael Bracken. The other was… Mike Lindell.

Yes, that Michael Lindell, everyone’s favorite mad uncle, the My Pillow Guy. Wait, this is not about politics, I promise. We’re talking about writing.

Mr Lindell has very publicly complained that data in voting machines is secretly encrypted to prevent it being studied. He’s sponsored symposiums with ‘proof’ of skulduggery, and he infamously slandered and libeled voting machine companies, inviting lawsuits with nine decimal zeros in the complaints.

During one interview, Mr Lindell displayed a sample on the screen giving me my first glance at what he was talking about. Could he be correct?

As a writer, I try to get details right, because as a reader, I’ve been yanked out of stories when authors get details wrong. Mr Lindell got it wrong:

Voting machine data isn’t encrypted. It’s encoded.

Wait. Same thing, you say, right? To*mah*to versus To*may*to?

Nope: encoded ≠ encrypted.

Encryption implies obfuscation. It’s how spies try to protect their secrets. It’s how financial institutions are supposed to shield their transactions.

Mikey isn’t all techie and sciencey. I don’t doubt Mr Lindell innocently misunderstood what he saw, but failing to grasp that ‘plain text’ 0123 looks like 30313233 in a raw dump is costing him millions. If a highly visible businessman with political connections doesn’t understand, what about us ordinary readers and writers?
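A small Python sketch may make the distinction concrete. Encoding follows a published table, so anyone can reverse it without a secret; encryption needs a key. The XOR routine below is a toy I made up purely for illustration (the name toy_xor and the key are hypothetical) and it is nowhere near real cryptography.

```python
def toy_xor(data: bytes, key: bytes) -> bytes:
    """Toy cipher for illustration only: XOR each byte with a repeating key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

plain = "0123"

# Encoding: a public lookup table. No secret involved.
encoded = plain.encode("ascii")
print(encoded.hex())                     # 30313233
print(encoded.decode("ascii"))           # anyone can turn it back into 0123

# Encryption: the same bytes scrambled with a key.
key = b"secret"                          # hypothetical key
scrambled = toy_xor(encoded, key)
print(scrambled.hex())                   # unreadable without the key
print(toy_xor(scrambled, key).decode("ascii"))  # the key recovers 0123
```

The first half needs nothing but the ASCII chart. The second half is useless to anyone who lacks the key, and that is the difference.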

Michael Bracken’s Fault

Baudot 5-bit paper tape

An upcoming anthology for Michael Bracken required digging into historical events, early radio, and teletypes. I didn’t use teletypery (that’s a word, right?) in my story, but at some point the penny dropped, how to help people visualize character encoding. It’s so simple.

Once upon a time, I communicated by telex with offices in Europe. For quick notes, we’d dial in, tap out a few words and perhaps receive an immediate response. But overseas connection time was expensive, so for long flirtations, I mean messages, I’d prepare text on paper tape, then connect and transmit.

And therein lay my solution for anyone to see: encoding on paper tape, a technology a century and a half old. People could see and touch each character as a distinct hole pattern easily converted to a unique number:

hole = binary 1; no hole = binary 0
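As a rough Python sketch of that rule, the function below reads one row of tape and turns it into a number. The row notation is my own assumption for illustration: 'o' marks a hole, '.' marks no hole, and the rightmost position is worth 1.

```python
def row_to_number(row: str) -> int:
    """Convert one tape row to its value: 'o' = hole = 1, '.' = no hole = 0."""
    bits = "".join("1" if mark == "o" else "0" for mark in row)
    return int(bits, 2)

print(row_to_number("....o"))  # 1
print(row_to_number("...oo"))  # 3
print(row_to_number("ooooo"))  # 31, the largest five-hole value
```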

No Remorse

Morse Code, developed in the 1830s for single-key telegraphy, wasn’t suitable for this new medium. In the 1870s, French engineer Émile Baudot developed a five-bit code. Five bits allow for 2⁵ or 32 distinct characters, but Baudot and the subsequent Morkrum Code (1915) used ‘escape’ characters to switch to and from alphabetic letters mode and numbers-symbols mode, bringing possible combinations closer to sixty, although in practice, far fewer were used. (One of those ‘characters’ rang an attention-getting bell at the other end.)
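The shift trick is easier to follow in code. This Python sketch assumes the later ITA2 assignments, a descendant of Baudot’s code that may not match the tape pictured here, with the rightmost hole as the least significant bit. Only the ten digits are filled into the figures side; punctuation varied between versions, so it is left as '?' here. The names LETTERS, DIGITS, FIGS, LTRS, and decode are mine.

```python
# Letters indexed by 5-bit value (ITA2 assumed).
# '\x00' stands for blank tape and for the two shift codes (27 and 31).
LETTERS = "\x00E\nA SIU\rDRJNFCKTZLWHYPQOBG\x00MXV\x00"
FIGS, LTRS = 27, 31

# In figures mode the digits share codes with the top keyboard row, QWERTYUIOP.
DIGITS = dict(zip("QWERTYUIOP", "1234567890"))

def decode(codes):
    """Decode 5-bit values, tracking letters/figures mode as we go."""
    figures = False
    out = []
    for code in codes:
        if code == FIGS:
            figures = True          # everything after this is numbers-symbols
        elif code == LTRS:
            figures = False         # back to the alphabet
        else:
            letter = LETTERS[code]
            if figures and letter.isalpha():
                out.append(DIGITS.get(letter, "?"))  # punctuation left unknown
            else:
                out.append(letter)
    return "".join(out)

# 32 codes, two of them spent on shifting, so two modes give at most 2 * 30.
print(2**5, 2 * (2**5 - 2))  # 32 60

# A, T, shift to figures, 1, 0, shift back to letters, E
print(decode([3, 16, 27, 23, 22, 31, 1]))  # AT10E
```

The same hole pattern means Q in one mode and 1 in the other, which is how five holes stretch past 32 characters.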

Baudot paper tape showing shifted values

Morkrum’s new ‘teletypewriter’ was literally a modified typewriter. Morkrum, by the way, is not a person, but rather three people: Joy Morton, founder of Morton Salt, and mechanical engineer Charles Krum, joined by the latter’s son, electrical engineer Howard Krum.

Puzzle Me This

This is paper tape, the stuff of telexes and teletypes, the technology that once powered Western Union, Wall Street, and news wires. I’ve included only the Roman alphabet, in a code invented a century and a half ago. Each letter has a distinct punch pattern. Curiously, the hole combination for A looks nothing like those for B, C, D, and so on. Each letter’s numeric assignment seems so utterly random as to defy logic.

Baudot paper tape showing alpha/number shift values.
Baudot paper tape showing decimal values of alphabet

But there is a logic and I’m betting you can figure it out. Why didn’t Baudot lay out letters one after the other in alphabetical order and bump holes one-by-one?

There is method to the madness. Your challenge is to suggest a reason for these seemingly arbitrary hole assignments.

binary values of holes (numbered right to left)
values of holes
Hint № 1
It helps to know *the earliest* machines had five piano-like keys corresponding to the holes. A teletype operator would press the correct keys one-by-one, and the machine punched holes and advanced the tape.
Hint № 2
Note this sample includes a space character. It’s actually a clue.

Twitchy Fingers

AT&T developed a machine nearly identical to the Telex but using 7-bit code similar to ASCII and its Unicode descendants. Seven bits allowed for 2⁷ or 128 characters, many of them assigned special purposes. Many universities hung cheap, obsolete TTYs on their early Unix computers, making an ASCII relationship clearer.

ASCII paper tape showing 7-bit values
ASCII paper tape with 7-bit values
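To make the ASCII relationship visible, here is a short Python sketch that prints text as rows of seven-hole tape. The 'o' and '.' notation and the function name ascii_tape_rows are my own inventions for illustration, and real tape also carried sprocket holes and often a parity hole, which this sketch leaves out.

```python
def ascii_tape_rows(text: str) -> list[str]:
    """One row per character: 7 bits, 'o' = hole (1), '.' = no hole (0)."""
    rows = []
    for ch in text:
        bits = format(ord(ch), "07b")   # e.g. 'A' is 65, or 1000001 in binary
        rows.append(bits.replace("1", "o").replace("0", "."))
    return rows

for ch, row in zip("ABC", ascii_tape_rows("ABC")):
    print(row, ord(ch), ch)

print(2**7)  # 128 possible characters
```

Unlike the five-hole tape, A, B, and C here simply count upward (65, 66, 67), so the hole patterns step along in alphabetical order.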

[Unix aficionados blame those sluggish keyboards for the plethora of ungodly, abstruse Unix commands: awk, chown, df, grep, lp, m4, qalter, renice, uucp, yacc.]

Did It Work?

So does the paper tape comparison help explain how ‘plain text’ data is used and stored in computers? And does the difference between encoding and encryption make sense? Enquiring minds want to know.

Puzzle Answer ↷

The earliest of these machines had simple five-key keypads, requiring the operator to punch one key at a time and manually transmit the completed character. It made sense to keep key presses to a minimum, so, like the Morse telegraph, the letters were sorted by frequency of use rather than their position within the alphabet.

Thus the letters E and T, spaces, and the line feed / carriage return characters each require only one key punched. The letter S and vowels A, I, O need two keys pressed, and so on until we reach K, Q, X requiring four keys. Minimizing keystrokes and operator fatigue explains why Baudot characters have such a peculiar order.
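You can check the answer with a few lines of Python. This sketch assumes the ITA2 letter assignments, which may differ in detail from the tape pictured above, and counts the holes in each letter’s code. The names LETTERS and holes_by_count are mine.

```python
from collections import defaultdict

# Letters indexed by 5-bit value (ITA2 assumed); '\x00' marks blank and shift codes.
LETTERS = "\x00E\nA SIU\rDRJNFCKTZLWHYPQOBG\x00MXV\x00"

def holes_by_count():
    """Group the 26 letters by how many holes their code punches."""
    groups = defaultdict(list)
    for code, ch in enumerate(LETTERS):
        if ch.isalpha():
            groups[bin(code).count("1")].append(ch)   # count the 1 bits
    return {n: "".join(sorted(chars)) for n, chars in sorted(groups.items())}

for count, letters in holes_by_count().items():
    print(count, letters)
```

Under these assumptions E and T land in the one-hole group, S, A, I, and O in the two-hole group, and K, Q, and X (joined by V) in the four-hole group.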

Did you guess it?




While I was stationed at a client’s headquarters in France, their British subsidiary sent a telex message asking how I was getting along with the French. I’m not admitting who sent the reply, but the return telex read,

We’re giving tit for tat.
PS: Send more tats.

6 comments:

  1. Yes it does and the short history of coding into punch strips is interesting too.

    1. Thank you, Janice. I appreciate it. In this case, teaching became a learning moment.

  2. Wow! Great info! BTW, Mr. Lindell would have complete fits if he ever saw the code that court reporters use (or used to use, before recordings):
    UR
    O PB
    R
    Does that really look like, “YOUR HONOR”? OR:

    AO U YOU
    W E R WERE
    TH A EU R THERE
    AO EU I
    S A U SAW
    U YOU

    I had SUCH fun researching that for "Happy Families"!

    1. Eve, I have wondered about that strange medley court reporters use. People's lives can depend upon them getting it right. It's brilliant using that in a story.

    2. Thanks, Leigh. What I'm most proud of was that I ran all my "coding" by the court reporter I used to work with the most - and I got it all right!

  3. For more on the topic, I recommend The Evolution of Character Codes, by Eric Fischer.

