
18 December 2022

The Digital Detective: Encryption ≠ Encoding


Telex paper tape

The Explainers

Don’t refer to ‘computer codes’. When coaching lawyers for depositions, that became my first rule. I urge the same rule for authors as well. Don’t blow credibility by trying to ‘pluralize’ code as ‘codes’; in computerdom, the plural of code is still code.

And what kind of code? Source code? Microcode? Machine code? Generic ‘computer code’ is less than meaningless. And while we’re at it, hackers can’t remotely set opponents’ computers on fire, not unless they slip their adversary certain laptops with defective Sony batteries.

To illustrate concepts, real-world analogies appeal to me, but some computer specialties are so abstract, explaining them is difficult. A few software specialists relate systems programming to composing music: Both take place in the originator’s mind, both use symbolic languages and, since the invention of the player piano and now modern mixing consoles, both can be programmed. But analogies can go only so far.

One of the most common questions has proved the most difficult to answer: How are characters stored in the computer? For example, what does “Now is the time” look like inside the machine? Explaining that each character has a numeric representation loses some people, and mentioning that the digits 0123 are stored as 30313233 in ASCII (or worse, F0F1F2F3 in EBCDIC) results in eye-glazing and blood leaking from the ears.
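
For anyone who wants to see those numbers rather than take them on faith, here is a minimal Python sketch. It assumes the two representations mentioned are ASCII and EBCDIC, and it uses Python’s `cp037` codec as one common EBCDIC variant. The helper name `hex_bytes` is made up for this illustration.

```python
def hex_bytes(text: str, encoding: str) -> str:
    """Return the bytes used to store `text`, written as uppercase hex digits."""
    return text.encode(encoding).hex().upper()

# The same four digits under two different (public) encodings.
print(hex_bytes("0123", "ascii"))   # 30313233
print(hex_bytes("0123", "cp037"))   # F0F1F2F3  (cp037 is an EBCDIC variant)

# The sample sentence as it would sit in memory on an ASCII machine.
print(hex_bytes("Now is the time", "ascii"))
```

Nothing here is secret: swap the encoding name and the same characters map to different, but equally public, numbers.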

Many programming courses don’t attempt to explain how letters and numbers are recognized and stored in computers. It’s taken for granted and too often they fall back upon, “Do as we say and you’ll do okay.” But that doesn’t answer the question.

Mike Drop

And then… two Michaels came together and showed me the way.

One was Michael Bracken. The other was… Mike Lindell.

Yes, that Michael Lindell, everyone’s favorite mad uncle, the My Pillow Guy. Wait, this is not about politics, I promise. We’re talking about writing.

Mr Lindell has very publicly complained that data in voting machines is secretly encrypted to prevent it being studied. He’s sponsored symposiums with ‘proof’ of skulduggery, and he infamously slandered and libeled voting machine companies, inviting lawsuits with nine decimal zeros in the complaints.

During one interview, Mr Lindell displayed a sample on the screen giving me my first glance at what he was talking about. Could he be correct?

As a writer, I try to get details right, because as a reader, I’ve been yanked out of stories when authors get details wrong. Mr Lindell got it wrong:

Voting machine data isn’t encrypted. It’s encoded.

Wait. Same thing, you say, right? To*mah*to versus To*may*to?

Nope: encoded ≠ encrypted.

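Encryption implies obfuscation. It’s how spies try to protect their secrets. It’s how financial institutions are supposed to shield their transactions. Encoding hides nothing: it merely represents each character as a number according to a published table anyone can look up.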

Mikey isn’t all techie and sciencey. I don’t doubt Mr Lindell innocently misunderstood what he saw, but his failure to grasp that ‘plain text’ 0123 looks like 30313233 inside the machine is costing him millions. If a highly visible businessman with political connections doesn’t understand, what about us ordinary readers and writers?
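
The difference can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any voting machine or bank works: the XOR ‘cipher’ and both keys are invented purely for illustration and offer no real security.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy cipher: XOR each byte with the key, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

message = "0123"

# Encoding: a public lookup table, no secret. Anyone can reverse it.
encoded = message.encode("ascii")
print(encoded.hex())                      # 30313233
print(encoded.decode("ascii"))            # 0123

# Encryption: the result depends on a secret key (hypothetical values here).
secret_key = bytes.fromhex("5aa53cc3")
wrong_key = bytes.fromhex("00000001")
encrypted = xor_bytes(encoded, secret_key)

# Only the right key recovers the encoded bytes.
print(xor_bytes(encrypted, secret_key) == encoded)   # True
print(xor_bytes(encrypted, wrong_key) == encoded)    # False
```

Encoded text needs only the public table to read. Encrypted text needs the key.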

Michael Bracken’s Fault

Baudot 5-bit paper tape

An upcoming anthology for Michael Bracken required digging into historical events, early radio, and teletypes. I didn’t use teletypery (that’s a word, right?) in my story, but at some point the penny dropped: how to help people visualize character encoding. It’s so simple.

Once upon a time, I communicated by telex with offices in Europe. For quick notes, we’d dial in, tap out a few words and perhaps receive an immediate response. But overseas connection time was expensive, so for long flirtations, I mean messages, I’d prepare text on paper tape, then connect and transmit.

And therein lay my solution for anyone to see: encoding on paper tape, a technology a century and a half old. People could see and touch each character as a distinct hole pattern easily converted to a unique number:

hole = binary 1; no hole = binary 0
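
That rule is simple enough to sketch in Python. The function name, the `o` and `.` marks, and the sample rows below are assumptions for illustration; the rows are arbitrary patterns, not real Baudot letters. The sketch also assumes the leftmost position carries the highest value.

```python
def tape_row_to_number(row: str, hole: str = "o") -> int:
    """Read one row of tape: hole = binary 1, no hole = binary 0."""
    bits = "".join("1" if mark == hole else "0" for mark in row)
    return int(bits, 2)

print(tape_row_to_number("o...."))  # 16  (binary 10000)
print(tape_row_to_number("..o.o"))  # 5   (binary 00101)
print(tape_row_to_number("ooooo"))  # 31  (binary 11111)
```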

No Remorse

Morse Code, developed in the 1830s for single-key telegraphy, wasn’t suitable for this new medium. In the 1870s, French engineer Émile Baudot developed a five-bit code. Five bits allows for 2⁵ or 32 distinct characters, but Baudot and the subsequent Morkrum Code (1915) used ‘escape’ characters to switch to and from alphabetic letters mode and numbers-symbols mode, bringing possible combinations closer to sixty, although in practice, far fewer were used. (One of those ‘characters’ rang an attention-getting bell at the other end.)
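
The ‘escape’ trick can be sketched as a tiny decoder that remembers which mode it is in. Everything specific below is hypothetical: the shift code numbers and the three-entry tables are made up for illustration and are not the real Baudot or Morkrum assignments. The arithmetic assumes two of the 32 codes are reserved for the shifts, which leaves at most 2 × 30 = 60 printable combinations.

```python
LETTERS_SHIFT = 31   # hypothetical code meaning "switch to letters mode"
FIGURES_SHIFT = 27   # hypothetical code meaning "switch to figures mode"

# Made-up assignments for illustration only, NOT the real Baudot table.
LETTERS = {1: "A", 2: "B", 3: "C"}
FIGURES = {1: "1", 2: "2", 3: "3"}

def decode(codes):
    """Decode a sequence of 5-bit codes, tracking the current shift mode."""
    table = LETTERS              # assume the machine starts in letters mode
    out = []
    for code in codes:
        if code == LETTERS_SHIFT:
            table = LETTERS
        elif code == FIGURES_SHIFT:
            table = FIGURES
        else:
            out.append(table[code])   # same code, different meaning per mode
    return "".join(out)

print(2 ** 5)                 # 32 raw combinations
print(2 * (2 ** 5 - 2))       # 60 once two codes are spent on shifting
print(decode([1, 2, FIGURES_SHIFT, 1, 2, LETTERS_SHIFT, 3]))  # AB12C
```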

Baudot paper tape showing shifted values

Morkrum’s new ‘teletypewriter’ was literally a modified typewriter. Morkrum, by the way, is not a person, but rather three people: Joy Morton, founder of Morton Salt, and mechanical engineer Charles Krum, joined by the latter’s son, electrical engineer Howard Krum.

Puzzle Me This

This is paper tape, the stuff of telexes and teletypes, the technology, invented a century and a half ago, that once powered Western Union, Wall Street, and news wires. I’ve included only the Roman alphabet. Each letter has a distinct punch pattern. Curiously, the hole combination for A looks nothing like those for B, C, D, and so on. Each letter’s numeric assignment seems so utterly random as to defy logic.

Baudot paper tape showing alpha/number shift values.
Baudot paper tape showing decimal values of alphabet

But there is a logic and I’m betting you can figure it out. Why didn’t Baudot lay out letters one after the other in alphabetical order and bump holes one-by-one?

There is method to the madness. Your challenge is to suggest a reason for these seemingly arbitrary hole assignments.

binary values of holes (numbered right to left)
values of holes
Hint № 1
It helps to know *the earliest* machines had five piano-like keys corresponding to the holes. A teletype operator would press the correct keys one-by-one, and the machine punched holes and advanced the tape.
Hint № 2
Note this sample includes a space character. It’s actually a clue.

Twitchy Fingers

AT&T developed a machine nearly identical to the Telex but using 7-bit code similar to ASCII and its Unicode descendants. Seven bits allowed for 2⁷ or 128 characters, many of them assigned special purposes. Many universities hung cheap, obsolete TTYs on their early Unix computers, making an ASCII relationship clearer.

ASCII paper tape showing 7-bit values
ASCII paper tape with 7-bit values

[Unix aficionados blame those sluggish keyboards for the plethora of ungodly, abstruse Unix commands: awk, chown, df, grep, lp, m4, qalter, renice, uucp, yacc.]

Did It Work?

So does the paper tape comparison help explain how ‘plain text’ data is used and stored in computers? And does the difference between encoding and encryption make sense? Enquiring minds want to know.


01 May 2016

Mayday, Mayday


by Leigh Lundin

TeleType telex TTY
TeleType – early texting
It’s May Day, which got me thinking about mayday and codes. How did ‘mayday’ come to be a distress signal? It’s a mispronunciation of the French m’aider, from venez m’aider, “Come to my aid,” or “Come help me.”

So, parents and writers, it’s been a long time since we posted SMS codes and acronyms in use by kids, counter-culture, and people in technology. Some mnemonics have faded into obscurity like ROFL (rolling on floor laughing) and others have been truncated like WTF.

But OMG, a number remain with us (LOL). Some not only predate texting, but at least two, BRB and GA, date back to the days of that early messaging system, the telex. I wouldn't be surprised if Samuel Morse used such abbreviations.

I confess to liking ILYSM and 'bae' (short for bae-bae). Yet, as kids search for ever-more-circumspect communication, codes change rapidly.

You may see ‘Kik’ floating around. It’s not an acronym but a messaging phone app, popular with the young and bad guys because its messages evaporate after reading.
code meaning…
AF As ƒ, in context with other words, e.g., “That’s cool as ƒ.”
AFAIK As far as I know.
bae Babe, baby.
BMS Broke my scale, i.e., high marks for looks or deeds.
BRB Be right back.
cook Gang-up, dump on someone.
DOC Drug of choice.
FML ƒ my life, chagrin.
GA Go ahead.
HMU Hit me up, request for phone or message contact.
IDK I don't know.
ILYSM I love you so much.
KOTD Kicks of the day, sneakers.
LMAO Laughing my ass off.
LOL Laughing out loud. (still in use)
OMG Oh my God. (still in use)
OOTD Outfit of the day.
RN Right now.
smash Sexy, want sex.
SO Shout out, give recognition.
TBH To be honest.
TBR To be rude.
TF WTF? (What) the ƒ?
6 Sex, often used in combination with other codes, e.g., IW26U.
9, CD9 Parent in the room, or PIR. Formerly, POS meant parent over shoulder.

What codes are your kids sending?

08 March 2015

The Kaspersky Code


Three weeks ago, Kaspersky Lab, the Russian security software maker, exposed a cyber-espionage operation that many believe originated within the NSA. The devilishly clever bit of code hides in the firmware of disc drives and has the ability to continuously infect a machine. If you use a Windows computer, there’s a good chance it’s not only infected but was built that way, likely without the manufacturers’ knowledge.
Kaspersky researcher Costin Raiu says the NSA couldn’t have done it without the source code.

What?!!

The contention that the NSA definitely had access to the source code is not only patent nonsense, it ignores the fact that Kaspersky themselves supposedly didn’t have the code. Having the source code is the easy way, perhaps the preferred way, but it’s hardly the only way.

A Reuters article speculates how the NSA might have obtained the source code and indeed, one of those is a likely scenario. But it’s also feasible to do the job without the source and I’ll show you what I mean, a technique I used to unravel computer fraud programs. Fasten your seat belt because this is going to get technical.

World’s Greatest Puzzle

Those around in my Criminal Brief days know that I love puzzles. For me, the ultimate puzzle has been systems software programming, making the machine do what I want. But sometimes I’ve come up against puzzles, some benign, some not, where I didn’t have the source code.

Let’s try an example. What if we found mysterious code in our computer that looked something like this:

confused pseudo code snippet
Mysterious Snippet of Computer Code

If you can’t make sense out of this, you’re not alone. 98% of computer programmers wouldn’t know what to make of it either. But if you look closely, the data populating the upper block looks different from that in the lower block. This is a clue.

Unlike commercial and scientific programs, systems software deals with the operation of the computer itself– utilities, communications, and especially the operating system. The realm of a computer’s internals is abstract, far more so than the Tron movies. Key aspects seldom relate to real-world equivalents. Sure, we say that RAM is a little like notes spread out on your work table and that disc storage is kinda sorta like a file cabinet… but not really. Even the term RAM– random access memory– is misleading; there’s nothing random about it.

Back in the real world, let’s say you want to write a simple program that adds the number of apples and oranges. In most programming languages, this code would look like this:
total = apples + oranges
Internally, a program loads apples and oranges into registers (kind of like keying them into a calculator), adds them, and stores the sum in a variable called total. If we were to write this in the argot of the computer, we’d use assembly language mnemonics, an abstraction of the computer’s machine language. Deep, deep down in a program, we’d see nothing but numbers, counted in hexadecimal…
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
Yes, A-F are digits in this context. Within the computer, our little program above might resemble…

simple pseudo-code program: total=apples+oranges
total = apples + oranges

What isn’t obvious to many programmers is that computer instructions are data. Indeed, some black-hat crackers (the bad guys) have used this property to sneak malware onto unsuspecting computers.
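
Python offers a quick way to see instructions sitting there as plain data. One loud caveat: the bytes below are Python bytecode for Python’s own virtual machine, not the hardware machine code discussed in this article, so treat this as an analogy. The values 3 and 4 are arbitrary.

```python
source = "total = apples + oranges"
program = compile(source, "<demo>", "exec")   # turn source into instructions

raw = program.co_code          # the instructions, held as ordinary bytes
print(type(raw).__name__)      # bytes
print(raw.hex())               # exact digits vary by Python version

# The same bytes can still be run as instructions.
namespace = {"apples": 3, "oranges": 4}
exec(program, namespace)
print(namespace["total"])      # 7
```

The same object can be printed as hex digits or handed to the interpreter to run. Which one it ‘is’ depends on what you do with it.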

If you look again at the original sneak peek of data, you’ll start to see patterns and may even pick out the machine instructions from our code example above.

clarified pseudo code snippet
Less Mysterious Code Snippet

This puzzle solving is called reverse engineering. It’s possible to write a program called a disassembler (I have) or a de-compiler (I haven’t) to decode the machine language into something more intelligible. The program has to be smart enough not only to separate actual data from instructions, but also to distinguish the type of data.
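
To give a feel for the idea, here is a toy disassembler in Python. The instruction set is entirely hypothetical (three made-up opcodes, each followed by a one-byte operand) and matches no real processor. A real disassembler is far more elaborate.

```python
# Hypothetical one-byte opcodes, each followed by a one-byte operand.
MNEMONICS = {0x01: "LOAD", 0x02: "ADD", 0x03: "STORE"}

def disassemble(blob: bytes) -> list:
    """Turn raw bytes into readable lines, one per instruction or data byte."""
    lines = []
    i = 0
    while i < len(blob):
        opcode = blob[i]
        if opcode in MNEMONICS and i + 1 < len(blob):
            lines.append(f"{i:04X}  {MNEMONICS[opcode]} {blob[i + 1]:02X}")
            i += 2                      # skip opcode and its operand
        else:
            lines.append(f"{i:04X}  DATA {opcode:02X}")
            i += 1                      # unknown byte: treat it as data
    return lines

blob = bytes.fromhex("01 10 02 11 03 12 00 05 00 07")
for line in disassemble(blob):
    print(line)
```

This sketch guesses that any byte it doesn’t recognize is data. A data byte that happens to equal 01, 02, or 03 would be misread as an instruction, which is exactly the separation problem a real tool has to solve.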

As you see, compiling source into binary executable code isn’t a one-way street. With dedication and know-how, reversing the process is well within reach.

How safe do you feel now?