24 October 2021

The Digital Detective, Wall Street part 4


When corporations upgrade large computer systems, they typically run the old and the new in parallel for a few weeks or months until the bugs are shaken out. Occasionally events take a turn, as discussed last week.

Mutual Admiration Society

Back in New York, our mutual funds firm (not so fondly referred to as MuFu) faced a different problem. They had completely rewritten the primary application, changing over from Cobol to C, and it hadn’t gone well. Four months after parallel commenced, they were experiencing glitches and crashes.

The sizeOf problem I’d caught wasn’t a contributing cause. An unidentified problem was triggering errors, an oversight so simple it would boggle the mind.

Robert, their very defensive senior C expert, hadn’t told me about a front-end program written by yet another programmer. I had to figure that out for myself. The bug wasn’t in the program they’d assigned me; it was introduced by what came before.

Front end and Back end Processing

As previously mentioned, Cobol reads like English and C… well, C is sometimes great and often horrible. C had become the most recent fad and application programmers were feeling the bite of its double-edged sword.

The staff consisted of university C students and the last Cobol member, who was on her way out. Machine language (and assembler) weren’t in their purview, and when they dismissed John, ‘the old guy’, they’d rid themselves of the only person who could poke around in memory (RAM) to determine what went wrong.

And memory was a problem. The program used customer numbers to index into a table and reference records in storage… in theory. In practice, I soon learned the customer number was occasionally wrong, wildly wrong, trying to access a memory location off in the wilds of Kansas.

Cobol could detect out-of-bounds matrix subscripts; C could not. Thus it took me a little while to figure out the bogus account code was coming from a front end program. That preprocessor queued submitted entries, performed minor verification with a check digit, converted the input to binary, and passed the record on to the back-end program I first investigated.
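For readers who like to see such things, here is a minimal sketch of the kind of guard C leaves up to the programmer. The table size, record layout, and function name are all hypothetical, not taken from MuFu's code; the point is only that the index must be checked by hand before it touches memory.

```c
#include <stddef.h>

#define TABLE_SIZE 1000            /* hypothetical table capacity */

typedef struct {
    long account;                  /* hypothetical record layout */
    long balance_cents;
} CustomerRecord;

static CustomerRecord customer_table[TABLE_SIZE];

/* C will happily compute customer_table[index] for any index at all.
   This wrapper refuses anything outside the table and returns NULL,
   so the caller can report the bad number instead of crashing. */
CustomerRecord *lookup_customer(long index)
{
    if (index < 0 || index >= TABLE_SIZE)
        return NULL;
    return &customer_table[index];
}
```

A caller that receives NULL knows the index came in corrupted, which would have pointed at the front end far sooner than a crash did.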

In short, sometimes the data entry folks included dashes in the account number (e.g., 7654321-1) and sometimes they didn't. The Cobol app extracted only the digits; the C program didn’t. Both programs tentatively vouched for the account number (7654321) using the check digit (1), indicating it resided in the realm of possible valid numbers. Unfortunately, the newly written C routine included the hyphen when attempting to convert the number to binary. Both versions then ‘piped’ (passed along) the massaged data to the back-end program where hell and fury would erupt when a bad number with the mashed-up hyphen was passed along.
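The repair amounts to doing what the Cobol app did: keep the digits and ignore the dash. The sketch below is my illustration, not the firm's code. It assumes the last digit is the check digit and everything before it is the account number; the function name and the nine-digit limit are hypothetical, and the actual check-digit formula isn't shown because it doesn't matter here.

```c
#include <ctype.h>
#include <stddef.h>

/* Accepts "7654321-1" and "76543211" alike. Hyphens are skipped; any
   other non-digit rejects the entry rather than converting garbage.
   Returns 0 on success, -1 on a malformed entry. */
int parse_account(const char *text, long *account, int *check_digit)
{
    long digits = 0;
    int count = 0;

    for (const char *p = text; *p != '\0'; ++p) {
        unsigned char c = (unsigned char)*p;
        if (c == '-')
            continue;                   /* tolerate the optional dash */
        if (!isdigit(c))
            return -1;                  /* unexpected character */
        if (count >= 9)
            return -1;                  /* too long to fit safely in a long */
        digits = digits * 10 + (c - '0');
        ++count;
    }
    if (count < 2)
        return -1;                      /* need a number plus a check digit */

    *account = digits / 10;             /* everything but the last digit */
    *check_digit = (int)(digits % 10);  /* the trailing check digit */
    return 0;
}
```

With either spelling of the example entry, the routine yields account 7654321 and check digit 1, which can then be verified with whatever check-digit formula the shop uses.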

For all the grief it caused, correcting the C front end was trivial. Worryingly, the front-end program, instead of creating the transaction serial number, left that task for the back-end program. Bad, bad, error-prone design. And, as I would discover, prone to manipulation.

I returned the program to service and turned my attention back to the mysterious ‘sizeOf’ conundrum.

Faith, Hope, and Charity

Many organizations buy into mutual funds for long term storage of their money. City, county, and state governments store tax revenues, fines and fees there. Churches and charities divide money between money market and mutual funds.

In the mutual funds program, a template field labeled IRS501C was data-typed binary in the old Cobol Record data division and as boolean in the matching C Struct.

When I returned to the section with the anomalous ‘sizeOf’ routine, I could see this field being referenced, but I didn’t know why. A library search for original source code for sizeOf and the parent routines turned up nothing.

Growing more suspicious, I asked operations to dig through their archives and find the code. “Don't hold your breath,” they said.

Next day, the IT director gave me the conference room to spread out my work. I mapped binary instruction after instruction, recreating an assembler code version of the program. C could fool the eye, but machine code, even in the absence of context, revealed details of what was going on– if I could figure it out.

I constructed charts of data structures, trying to figure out what was taking place. At last when I spotted buried instructions trimming fractions of a cent from daily interest earned, I knew I’d stumbled upon skulduggery.

Figuring out the sleight-of-hand was mind-bending, but I got a break. Like so many magic tricks, the chicanery was breathtakingly simple. Only the surface artifice was complex.

I had accumulated a suite of experimental data to test extremes of the system. It contained only a dozen records but I noticed the audit log reported thirteen. What? A record with a proper transaction serial number had materialized like a magic trick.

As mentioned previously, the front-end processor should have been creating the transaction serial number, not the back end, but apparently no one here knew better. That oversight facilitated the deception, allowing crooked code to create records undetected.

Computer hours were reduced that day. Being the first of the quarter, month-end and quarter-end reports took priority. Idling, I suddenly wondered if month-end had anything to do with the mysterious symptoms I was witnessing. Once again I nagged operations about searching archives for source code.

An hour later found me wrestling with that data cleverly hidden beyond the end-of-data marker. An impatient operator slapped a cartridge on my work table. "Try this," he said.

Former employee John had made a rare oversight. He’d deleted the source files, but… Each evening, operations backed up everything, and that included John’s source code. It filled in gaps.

No comments, of course, but lo, I beheld the twisted mind of a criminal genius. The routines were rife with indirection and misdirection. The ‘sizeOf’ trick merely hinted at the scam iceberg. While the obfuscated C code suggested one thing, the meticulous machine instructions I’d decoded step by step helped me understand what was really happening.

The scheme launched from a database record under MuFu’s own name and address, 100 Maiden Lane. The registered agent was listed as K. King, address 103rd floor, 350 Fifth Avenue, Manhattan, New York 10118. Midtown… I looked it up… Empire State Building. The street address was legitimate, but 103rd floor?

[Figure: interest truncation example]

Greed Kills

The charlatan routine skimmed thousandths of a cent or so following rounding errors– interest and binary-to-decimal trailing digits after rounding high. On average, the algorithm could have siphoned a quarter of a cent per transaction without setting off alarms, but our sneaky programmer apparently wanted to stay well below nets cast by auditors. Those fractions of a penny accumulated in the bogus MuFu self-owned bucket until the end of the month. Dollars– thousands of them– had been created out of thin air.
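To show where such fractions come from, here is a small worked sketch of the arithmetic. It is not John's routine; the rate, the units (thousandths of a cent, which I'll call millicents), and all names are my assumptions. Interest is computed in finer units than can be posted, the customer is credited whole cents, and a residue is left over that any honest reconciliation must account for.

```c
#include <stdint.h>

/* Hypothetical split of one day's interest. */
typedef struct {
    int64_t credited_cents;       /* whole cents posted to the customer */
    int64_t residue_millicents;   /* leftover fraction of a cent, 0..999 */
} InterestSplit;

/* balance_cents: account balance in cents.
   rate_ppm: daily rate in millionths (137 means 0.0137% per day). */
InterestSplit split_daily_interest(int64_t balance_cents, int64_t rate_ppm)
{
    /* cents * 1000 millicents * rate / 1,000,000 == cents * rate / 1000 */
    int64_t interest_millicents = balance_cents * rate_ppm / 1000;

    InterestSplit s;
    s.credited_cents = interest_millicents / 1000;
    s.residue_millicents = interest_millicents % 1000;
    return s;
}
```

For a balance of $12,345.67 at the assumed rate of 137 millionths a day, the interest works out to 169.135 cents: 169 cents are credited and 135 millicents remain. An auditor's check is that credited cents times 1000 plus the residue always equals the computed interest, with the residue going somewhere documented.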

I fully expected John’s wife or a friend had opened another account to receive the transfers, but as I traced the code, it invoked a random number generator to index into an entry in the hidden part of the file, just one binary field, which turned out to be an account number. At month end, the subversive routine transferred out between $1200 and $5000 from the bogus MuFu in-house account to the account selected by the random number generator. But why only certain accounts? What was special about them? How was John profiting?

As always, I sat outside on the ferry shielded by a bulkhead. As I stared at the lights of Brooklyn, the answer hit me, knocking sleep out of the equation. I rode the ferry back.

With suppressed excitement, I extracted the account numbers and checked the first indicated record. Bingo. And the next one. And the next. And then the 20th and the 100th. Bingo, bingo. Every case showed the IRS501C non-profit tag.

Damnation. I’d unmasked a freaking Robin Hood. John– or should one say Little John– was stochastically selecting non-profit accounts to donate to. That generated the thirteenth record.

Fascinatingly, the audit trail reinforced the fraud’s legitimacy rather than exposed it. Only a paper trail might suggest a missing document, but who was going to dig through reams of flattened dead trees?

If United Way or Scouting USA or Bethune Cookman read their statements at the end of the month, they might have scratched their heads but concluded they surely made a deposit and misplaced their record of it.

I made copious notes and documented everything. When I presented my findings to the firm’s CIO, she looked disbelieving, then doubtful, and finally bewildered.

“I know your reputation,” Loretta said, “but this can’t be possible. Besides, IT claims John had aged beyond usefulness. He couldn’t keep up. He barely finished this, his last project, before we let him go.”

“If so, he put effort into making a final masterpiece.”

“Leigh, darling, can you fix it?”

Call me darling and I can fix anything. I yanked the too-clever code out by its roots and their senior programmer, Robert, fixed the hole and, upon my recommendation, moved the transaction serializer to the front-end.

“What will you do about the spurious deposits?” I asked.

“They go back months. We wouldn’t look good demanding hospitals and heart foundations return money deliberately deposited into their accounts. John gave away money we couldn’t detect was missing. We’ll leave it that way.”

“What about John?”

Loretta sighed. “Same reasoning. Arresting him will bring nothing but bad publicity. Can you imagine the Times or the Journal with headlines about a Wall Street Robin Hood? That’s bad enough, but a sympathetic soul would raise issues about ageism. No, we can’t win there. Thank God we discovered it.”

“Can you get me John’s contact info?”

“What? No, maybe, yes, why not. I’ll discreetly ask HR for it.”

Robbin’ Robin

I phoned ‘John’ and invited him to lunch.

“I don’t think so,” he said. “Who is this again?”

“Leigh Lundin.”

“Oh shit, you? What do you want?”

“Just a chat. Really.”

“You’re working for MuFu?”

“Yes, today I am; tomorrow, no. I’m wrapping up.”

“So you know…?”

“Lunch,” I said. “Let’s not do this on the phone.”

“Fraunces Tavern?”

“Whew! If you pay.”

He laughed. “Okay. If you accept that, you aren’t out to nail me.”

“I’m not. John, can you afford it?”

“I landed on my feet. Arthur Lipper knows me and his son hired me.”

I respected Lipper Inc. He chose well.

The Wolf Pup of Wall Street

We met in the pub where George Washington bade farewell to his troops. John looked like a mad Santa with puppy dog eyes and an Albert Einstein hairdo. I’d bet a dozen grandkids employed him as a stage for hundreds of adventures.

He said, “You’re not recording this?”

“No.” I kept my smile easy and relaxed my body language.

“I’m not admitting anything including this statement.”

“Hmm. Let’s talk hypothetically, this entire conversation, okay?”

“Sounds fair. What have you figured out?”

“Most of it, I imagine. Cancer research received a couple of grand on the first before I could stop it. That will be the last payment.”

“Good,” he said. “I mean, embezzling’s awful.”

I snorted. “SizeOf.”

He laughed. “I thought that was clever hiding in plain sight, but apparently not clever enough.”

“I overlooked it at first. John, what was going on? Why did our suppositional programmer take such a risk?”

He dropped the hypotheticals.

“They dismissed anyone approaching retirement, figuring to save paying pensions, I suppose. You heard about Walston?”

“I was there, John.”

“The MuFu bastards had a definite preference for young faces. I knew for months they were going to fire me, I could smell it in the air.”

“I know that feeling, John.”

“The staff treated me like crap, acting like I was in my dotage. They figured my brain had rotted along with Cobol, but they needed me to effect the conversion. I learned C until I knew it better than they did and then studied it more. Their superstars couldn’t read a dump or comprehend machine instructions during debugging. I turned the joke on their little experts.”

“Sheesh. I’m sorry you went through that, John.”

He shrugged. “What will happen to me now?”

“Far as I know, nothing. I think they’re too embarrassed. One or two, the CIO and the VP maybe, have shown a touch of grudging respect. They’re coming to grips with the senile grey-beard who fooled them.”

“Good, because I’m a coward. I’m not looking for fame and misfortune.”

“Don’t worry, John. Everyone but the sheriff loves a Robin Hood.”

Final Thoughts

And that is my favorite Wall Street crime case. I’m called when matters go mysteriously wrong, so Miss Marple-like, I occasionally stumble upon another puzzle and test of wits.

In this case, charities profited and the bad guy turned out a good guy. Some may object that a criminal avoided prosecution, but personally, I couldn’t imagine a better outcome.


Following are a few more tech notes.

Tech Notes

As computers process data, the risk of losing transactions grows as minutes and hours pass. Airlines and hospitals and banks can’t afford to lose dozens, hundreds, or thousands of records from a power outage or a program that abruptly bombs out.

One practice is to take checkpoints where data is sealed off and copied every few minutes, perhaps sacrificing a few recent entries. Typically checkpoints include a paper trail to aid recovery, perhaps bank deposits or election ballots.

Another approach is to devise more robust databases through confirmed writing, redundancy, and resilient recording technology such as data arrays and NAS– network attached storage. These usually involve taking the time to check data is actually recorded intact, either reading it back entire or confirming it with a polynomial.
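As an illustration of 'confirming it with a polynomial', the sketch below uses the common CRC-32 checksum. That choice is my assumption; a given system may use a different polynomial or a hardware check. The idea is to compute the checksum before writing, read the record back, and compare.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-by-bit CRC-32 (the widely used reflected polynomial 0xEDB88320).
   Slow but short; production code normally uses a lookup table. */
uint32_t crc32_bytes(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; ++i) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; ++bit) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical helper: returns 1 when the data read back from storage
   still matches the checksum computed before it was written. */
int write_confirmed(const void *readback, size_t len, uint32_t expected_crc)
{
    return crc32_bytes(readback, len) == expected_crc;
}
```

The standard test string "123456789" produces the checksum CBF43926 (hexadecimal); change a single character in the copy read back and the comparison fails.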

Both of the above operate in real time, meaning records are updated directly.

Front end and Back end Processing

In some businesses, it’s possible to separate the collection of data (front end) from its processing (back end). The front end handles validation and sometimes preprocessing such as converting fields to binary. An audit trail can easily be saved as a byproduct. A transaction serial number could be added too. The front end then 'pipes' that entry to the back-end program.
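A minimal sketch of why the serial number belongs in the front end follows. The record layout and function names are hypothetical. When the front end stamps each entry as it is accepted, the back end can only check the sequence, and a record that materializes downstream has no legitimate serial number to wear.

```c
#include <stddef.h>

typedef struct {
    unsigned long serial;     /* assigned once, by the front end */
    long account;
    long amount_cents;
} Transaction;                /* hypothetical record layout */

/* Front end: stamp each accepted entry with the next serial number. */
Transaction front_end_accept(unsigned long *next_serial,
                             long account, long amount_cents)
{
    Transaction t;
    t.serial = (*next_serial)++;
    t.account = account;
    t.amount_cents = amount_cents;
    return t;
}

/* Back end: serials must run first, first+1, first+2, ... in order.
   Returns 1 when the run is unbroken, 0 when a record is missing,
   duplicated, or out of sequence. */
int serials_unbroken(const Transaction *batch, size_t count,
                     unsigned long first)
{
    for (size_t i = 0; i < count; ++i)
        if (batch[i].serial != first + i)
            return 0;
    return 1;
}
```

Comparing the front end's final counter with the number of records the back end processed gives a second, independent check: twelve in and thirteen out would be caught at once.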

Implementations of logical and physical separation vary widely. For example, voting machines typically accumulate votes in an isolated environment, variously called a vacuum jar, a (virtual) Faraday Cage, or an air-gap system. Other apps may write to the cloud (a distant server) for later use. Both functions could coexist in the same computer, like the case of our mutual funds house.

Character Values

Characters are encoded with numeric values, usually in eight-bit bytes. For example, A B C is internally represented by 41 42 43 (base 16, hexadecimal) or 65 66 67 (base 10, decimal). As you see in the following table, the digits 1 2 3 are stored as 31 32 33, which must be converted to binary or packed decimal for calculations and converted back for display or printing. The conversion algorithm considers only the right-hand hexadecimal digit of each character (its low four bits) when calculating the binary equivalent.

But notice the values for a hyphen, comma, and decimal point are not recognizably numeric. If the program fails to realize this and treats them as a number, the conversion algorithm falls apart resulting in a meaningless number. This is why the C program miscalculated the index from the account number.

ASCII/Unicode Character Values Extract
name       char  hex  dec  binary
:
space            20   32   0010 0000
:
comma      ,     2C   44   0010 1100
hyphen     -     2D   45   0010 1101
dot        .     2E   46   0010 1110
slash      /     2F   47   0010 1111
zero       0     30   48   0011 0000
one        1     31   49   0011 0001
two        2     32   50   0011 0010
three      3     33   51   0011 0011
four       4     34   52   0011 0100
five       5     35   53   0011 0101
six        6     36   54   0011 0110
seven      7     37   55   0011 0111
eight      8     38   56   0011 1000
nine       9     39   57   0011 1001
colon      :     3A   58   0011 1010
semicolon  ;     3B   59   0011 1011
:
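The failure can be reproduced in a few lines. This is a sketch of the kind of conversion described above, not the firm's actual routine, and both function names are hypothetical. The naive version keeps the low four bits of every character, so the hyphen (hex 2D) is silently treated as the 'digit' 13; the checked version refuses anything outside 0 through 9.

```c
/* Naive: trusts every character to be a digit and keeps only the
   right-hand hexadecimal digit (the low four bits) of each one. */
long naive_to_binary(const char *text)
{
    long value = 0;
    for (; *text != '\0'; ++text)
        value = value * 10 + (*text & 0x0F);
    return value;
}

/* Checked: rejects any character that is not 0..9.
   Returns 0 on success, -1 on an empty or non-numeric string.
   (Overflow of very long inputs is not handled in this sketch.) */
int checked_to_binary(const char *text, long *out)
{
    long value = 0;
    if (*text == '\0')
        return -1;
    for (; *text != '\0'; ++text) {
        if (*text < '0' || *text > '9')
            return -1;
        value = value * 10 + (*text - '0');
    }
    *out = value;
    return 0;
}
```

Fed 76543211, both routines agree. Fed 7654321-1, the naive routine returns 765432231, a number that looks plausible but points nowhere near the intended record, while the checked routine reports an error.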

8 comments:

  1. Leigh, once again, you have made the mysteries of computer malfeasance interesting. I love this Robin Hood, that's for sure!

    1. Thanks, Eve. This has been the most difficult of the series to write about because the intricacies were so technical, the firm's own people couldn't follow it. How could I hope to explain it?

      But as you suggest, the story's really about 'John', a surprisingly likeable pudgy teddy bear hiding a razor sharp brain.

  2. As I was preparing the article, I showed my friend Thrush the C sizeOf problem and he twigged to it within moments. Programmers often use CamelCase for the names of constants, variables, objects and methods, but caps are never found in C reserved names.

  3. Leigh, interesting story.

    In Nam, I heard about a programmer who was so mad at the Army that he submitted some lines into the monthly pay program for it to blow up six months after he rotated home and ETS'd. Never did hear how that one came out at the end.

    1. Oh joy. RT, it's within the realm of possibility, but the best could make it damn difficult to detect. The Army could put both the program and the programmer in isolation.

  4. Sigh ... I remember being paid in fractional cents per line of transcription. I don't know if other computer-related shenanigans were involved but I suspect so. Of course "I" didn't have access to the code they were using.

    1. That's interesting, Elizabeth, a bit like turk-work of today. I haven't heard of that, but it might be tempting to puff it with as many lines as possible.

      When I was a kid baling hay in the summer, some of the generous farmers would pay 1¼¢ or even 1½¢ instead of just a penny per bale. We had a preacher who would cheat the kids. My father had a quiet word with him.

    2. It was nothing at all like turking. The transcription I was doing was medical, required great expertise, & there was no way to add to the line count. The pay rates were expressed as, let's say, 0.106, which equals 10.6 cents per line (believe me I've worked for less) ... then how long is a line? 45, 65, 73, 88 characters? Do spaces count? Then, God forbid a person gets paid only for actually TYPED characters ... aaargghh, I'm so glad I'm retired now!

