computer science, math, programming and other stuff
a blog by Christopher Swenson

64-bit assembly language programming under macOS with NASM

Just a quick follow-up to a previous post on 32-bit assembly language programming for OS X.

I've had a fair amount of interest in this post, surprisingly, so I thought I would update it for the 64-bit Intel world we live in now. The biggest change is that the calling convention now uses registers instead of the stack by default, and we use the r* 64-bit registers instead of the e* 32-bit registers.

A huge source of annoyance for me is the way we pass pointers to data. In macOS, local data pointers now need to be relative to the instruction pointer, which is most easily accomplished by writing rel your_data_here and using lea instead of a bare mov. This can also be accomplished with the DEFAULT REL directive, which makes all such addresses rel by default.

DEFAULT REL ; RIP-relative addressing by default
; Basic OS X calls to glibc
; compile with:
; nasm -g -f macho64 malloc64.asm
; gcc -o a.out malloc64.o

; glibc stuff
extern _puts, _printf, _malloc, _free

; static data
section .data

hello_world_str db "Hello world!", 10, 0
int_str db "Address %llx", 10, 0

; code
section .text

global _main

_main:
        ; save registers (3 pushes + the return address keep the stack 16-byte aligned)
        push rbp
        push r12
        push rbx

        lea  rdi, [hello_world_str]
        call _puts

        mov  rdi, 16
        call _malloc

        ; check if the malloc failed
        test rax, rax
        jz   fail_exit
        mov  rbx, rax

        xor  rax, rax
        mov  rsi, rbx
        lea  rdi, [int_str]
        call _printf

        ; print "A", "B", ..., one per line (puts appends the newline)
        mov  [rbx], word 0x0041 ; 'A', NUL

        mov  r12, 10
_loop:
        mov  rdi, rbx
        call _puts
        inc  qword [rbx]        ; 'A' -> 'B' -> ...
        dec  r12
        jnz  _loop

        ; free the malloc'd memory
        mov  rdi, rbx
        call _free

        ; success: return 0
        xor  rax, rax
        pop  rbx
        pop  r12
        pop  rbp
        ret

fail_exit:
        ; failure: return 1
        mov  rax, 1
        pop  rbx
        pop  r12
        pop  rbp
        ret

The output should look something like this:

Hello world!

Address 100200000

ttyrec conversion

I'm an armchair fan of NetHack and other roguelikes, though I don't play them too much. I've always wondered why there wasn't, say, a NetHack screensaver that would just play through the game, since it seemed to be ripe for screensavering.

I'd found some tools around the Internet that might do the job: TTY "recordings" of NetHack ascensions (essentially, captures of the terminal output that can be replayed), pyte (a Python terminal emulator), and images2gif (a mostly working animated GIF maker).

It was almost all there, so I wrote the missing piece: code that takes the screen buffer (as a matrix of character codes) and converts it to an image using an old DOS bitmap font, plus all of the glue code to make everything work together. I sped up the results by 5× or so, then ran the whole thing through ImageMagick at the end to shrink the animated GIF.

I open sourced the whole thing as ttyrec2gif. It produces nice GIFs like:

Sample nethack screensaver animated GIF

There's one last piece: how to actually turn this into a screen saver.

In OS X, it's a matter of writing a little Quartz Composer program to randomly pick GIFs from a directory and play them as movies. This has hard-coded paths in it, because I haven't figured out how to make OS X screen saver configuration parameters, so I'll just post a screen shot of what the program looks like:

Quartz Composer screenshot of my screensaver app.

(Having Quartz pick a random GIF and then reload and pick another one after 5 minutes is a fun challenge. The best solution I came up with involved using a wave generator that, when rounded, would trigger an event only when it reached its apex, and this triggered a sampler to resample and pick a new random GIF. Kind of a Rube Goldberg way of doing it, but it was fun.)

Anyway, at the very least, there are some cool GIFs out there to look at now. I'm considering doing a big run and converting more games into GIFs, but it's very time-consuming (it currently takes many hours to convert a game to an animated GIF).


Friendship API

This weekend, I wrote a quick email application called Friendship API. It solves a problem that I have: I am just terrible at staying in touch with my friends via email. This is a shame: email is really a great tool for keeping in touch.

Friendship API works like this: it uses ContextIO to look at people you interact with, filter out what looks like spam and business email, and send you a weekly email reminding you that you owe a person an email.

There's no account or password to manage: the weekly emails contain action links to ignore certain people in the future, generate another reminder, and cancel your subscription. Hence, I use the term "email application".

There's also very little storage we have to do: basically, we keep track of email addresses of people who sign up, their ContextIO tokens, what recommendations we send out, and a list of email addresses that the people want to ignore. If they cancel their subscription, we revoke our ContextIO tokens.

Why Python? The Stack

It's all built on Python 2.7. Why Python? Because there are a lot of great libraries and utilities that I can leverage to help me write the app quickly (over the weekend). Sadly, some of the stack was still not ready for Python 3+.

  • ContextIO: to do the extremely hard work of connecting to people's email accounts. Plus, they have a great Python API.
  • Mailgun: to send emails, which has a dead-simple RESTful API.
  • Flask: a straightforward way to build simple web APIs.
  • Celery: a wonderful Python-based task queue system.
  • SQLAlchemy: a robust way to deal with SQL databases.
  • Heroku: an easy way to host the email application.
  • Amazon S3: an easy way to host a simple static website.
  • ThemeForest: a cheap way to get a website theme that looks good.
  • Diet Dr Pepper: my choice of caffeine to keep me typing throughout the weekend. :)

Massive thanks to the hard work of those who wrote the stack I stand on.

The Process

How does one go about writing an app like this after getting the idea and setting aside a spare weekend? Well, here's the process I followed at least:

  1. Outline in your head (or write down) the features that you need to launch with.
  2. Talk it over with someone.
  3. Sign up for all of the accounts you might need and don't already have (in this case, the ContextIO developer account).
  4. Write a simple program to prove that the hard things are possible: in this case, that it is reasonable to use ContextIO to get the data we need, and then to generate sensible recommendations.
  5. Use Celery to make the program calls asynchronous.
  6. Write a simple Flask app that calls into that program.
  7. Use RequestBin to capture the ContextIO callback, so you know how to capture the authentication token when someone signs in.
  8. Write a Flask endpoint to generate a ContextIO signup session, and another to capture the ContextIO callback information and process it.
  9. Hook up the recommendation function to send out an email.
  10. Write a barebones web page that calls out to your web site.
  11. Set up hosting for everything.
  12. Have someone who isn't you test out the flow.
  13. Write the rest of the functions for your bare minimum features.
  14. Theme the static site.

That was fun!

Future Improvements

There are a few more things that would be nice to do for the application.

  • Better recommendations. There's still a lot of work that could be done to improve recommendations:
    • Using past recommendations to influence future recommendations.
    • Adjusting the metric used to pick who to recommend.
    • More advanced filters to differentiate business, spam, and friendship conversations.
  • Adjusting the frequency of recommendations. It's currently hardcoded to one week per email.
  • Better-looking recommendation emails.
  • Suggest topics of conversation. We could pick topics randomly, or we could try to analyze past emails, perhaps.
  • Handle people with multiple email addresses. Currently, the application keys off of email addresses alone, so it may recommend someone you already correspond with regularly, just under a different email address.

A simple AES-based random number generator

When browsing through some stack traces doing Go development, I noticed that Go had done something rather clever in their codebase: on x86-64 processors, they used the AES instruction set to build a hash function.

The more I've thought about it, the more this is potentially quite brilliant. AES, the block cipher, compounds multiple rounds of mixing data together, shuffling bits around, and transforming it. For the past several years, all Intel and AMD processors have supported doing an AES round in a single instruction.

Hashing and random number generation use the same sort of principles as AES is based on: basically, shuffle bits around and transform them. AES gets cryptographic strength by doing this many times in a row (usually 10+).

However, if we are building a hash table or generating random numbers for simulations (or other non-cryptographic uses), then we don't need the full cryptographic strength of AES. But, we can still leverage the AES instruction set to build some fast random number generators and hashes.

For starters, I've gone ahead and written an AES-based random number generator that performs a single round of AES as its core, and iterates to produce more random numbers.

In practice, it is about twice as fast as the Mersenne Twister.

I think that the biggest reason that most people haven't done this is that it's a relatively new feature, and doing CPUID detection to make sure that the AES instruction set is available at runtime is kind of annoying. At the very least, this library has those pieces already written.

Some caveats:

  • The quality of random numbers produced is not as high as the Mersenne Twister's (they don't quite pass the dieharder test suite). They're not terrible, either; they're just not as good.
  • This only works on relatively modern x86-64 processors. To detect support, you can use the included cpuid.h to call intel_has_feature(INTEL_FEATURE_AES) to see if the CPUID instruction indicates support for the AES instruction set.
  • It's very probable that more performance or better numbers could be achieved. I used a fairly straightforward seed and random number generator construction.

I've also been experimenting with an AES-based hash function that is extremely similar to this. There is an equivalent test set to the Diehard tests for hash functions, called SMHasher.

Unfortunately, my preliminary results are not encouraging: AES instructions are possibly slower than the Murmur3 algorithm when tuned to pass the SMHasher tests. With only 1 or 2 rounds of encryption used, my AES hash function does not seem to pass the SMHasher suite, though it does after 3 or 4 rounds. Unfortunately, these extra rounds make the performance not quite as competitive.

I'm still tinkering with the AES-based hash function though to try to make it faster or hash better. I'll report back with my findings.


So You Want To Learn Crypto, Part 2: Cyclic Groups And Short Codes

Part 1 is here

Modular arithmetic

When I first started tinkering with crypto, one thing that confused me was modular arithmetic. Its applications to crypto are everywhere: AES is based in part on finite field arithmetic, for example. However, modular arithmetic acts a little strange if you aren't used to it. I've seen this topic confuse just about everyone at some point, so I thought I would explore modular arithmetic a little and show a fun application of it: generating shortened URLs (or short codes).

To start, you can think of modular arithmetic simply as: every time you do a numerical operation, like adding, subtracting, or multiplying two integers, you also take the modulus with respect to some other number. Like, we know that $1 + 10 = 11$, but if we are doing arithmetic modulo $6$, then $1 + 10 \equiv 5$ (because $11 \div 6 = 1$ with remainder $5$). We use $\equiv$ to mean "equals modulo something" or "is congruent modulo something".

Addition, subtraction, and multiplication are easy: just add, subtract, or multiply the numbers like normal, then divide by the modulus and take the remainder.

Some more examples modulo $6$: $2 + 5 \equiv 1$, $2 \cdot 5 \equiv 4$, $2 - 5 \equiv -3 \equiv 3$.

There in the last example we saw that $-3 \equiv 3$ modulo $6$. In math, we often will use only the positive remainder (so, $3$), but in some programming languages, negative remainders are allowed in some cases (so you might see $-3$). For the most part, that doesn't matter: all of the arithmetic will still work as expected.
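All of this is easy to check in Python, whose % operator always returns the non-negative remainder for a positive modulus:

```python
# Modular arithmetic with Python's % operator (modulo 6)
m = 6

print((1 + 10) % m)  # 11 mod 6 = 5
print((2 + 5) % m)   # 7 mod 6 = 1
print((2 * 5) % m)   # 10 mod 6 = 4
print((2 - 5) % m)   # -3 mod 6 = 3 (Python gives the positive remainder)
```

(C, by contrast, would give -3 % 6 == -3; as noted above, either convention works.)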

Division, or, multiplicative inverses

Addition has an opposite, subtraction, that works by default. Multiplication also works, but its opposite, division, isn't guaranteed to.

For example, we might want to know, can we divide by 2? Division is really multiplying by the inverse, so what is the inverse of 2? Looking, again, modulo 6:

$$2 \cdot 1 \equiv 2$$ $$2 \cdot 2 \equiv 4$$ $$2 \cdot 3 \equiv 0$$ $$2 \cdot 4 \equiv 2$$ $$2 \cdot 5 \equiv 4$$

So, $2$ has no inverse modulo $6$, because nothing, when multiplied by $2$, is equal to $1$. And if we try to divide by $2$ modulo $6$, it fails: $4 \div 2$ is still $2$, but what is $3 \div 2$? Normally, we'd just say $3/2$ or $1.5$, but we don't have fractions or decimal points here: all we have are the integers $0$ through $5$, and none of them, when multiplied by $2$, is equal to $3$.

But, sometimes we have multiplicative inverses. Modulo $6$, $5$ has a multiplicative inverse: it's also $5$. This means that dividing by $5$ is the same as multiplying by it.

$$2 \cdot 5 \equiv 10 \equiv 4$$ $$2 \div 5 \equiv 2 \cdot 5 \equiv 4$$
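You can verify this behavior with a tiny brute-force search (the helper name is my own; for large moduli you'd use the extended Euclidean algorithm instead):

```python
def inverse_mod(a, m):
    """Brute-force search for a multiplicative inverse of a modulo m."""
    for b in range(1, m):
        if (a * b) % m == 1:
            return b
    return None  # no inverse exists

print(inverse_mod(2, 6))  # None: 2 has no inverse modulo 6
print(inverse_mod(5, 6))  # 5: 5 is its own inverse modulo 6
```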

Cyclic groups

Okay, so, division sometimes works. When?

Well, simply put, division is guaranteed to work, in that we will always be able to find a multiplicative inverse, if we are working modulo a prime, e.g., modulo $7$. (There are also finite fields of prime-power size, like $7^2 = 49$, but those are built from polynomial arithmetic, not the plain integers modulo $49$.) When we have such a case, we call the set of numbers modulo the prime a finite field. I won't go into the nitty-gritty details of the terminology, but essentially, a field is what we get when division works. Sometimes, we call the numbers (modulo our modulus), except $0$, the multiplicative group, or the cyclic group.

Why cyclic? Well, when we are working modulo, say, a prime $p$, we can generate at least part of the group with some number $g < p$ by multiplying $g$ by itself a bunch of times.

For example, modulo $11$, with $g = 2$:

$$g \equiv 2$$ $$g^2 \equiv 2 \cdot 2 \equiv 4$$ $$g^3 \equiv 4 \cdot 2 \equiv 8$$ $$g^4 \equiv 8 \cdot 2 \equiv 16 \equiv 5$$ $$g^5 \equiv 5 \cdot 2 \equiv 10$$ $$g^6 \equiv 10 \cdot 2 \equiv 20 \equiv 9$$ $$g^7 \equiv 9 \cdot 2 \equiv 18 \equiv 7$$ $$g^8 \equiv 7 \cdot 2 \equiv 14 \equiv 3$$ $$g^9 \equiv 3 \cdot 2 \equiv 6$$ $$g^{10} \equiv 6 \cdot 2 \equiv 12 \equiv 1$$ $$g^{11} \equiv 1 \cdot 2 \equiv 2$$ and we've looped back around. Hence the term cyclic.

Modulo $11$, we generated the whole multiplicative group (of size $10$). However, we won't necessarily always generate the entire group: sometimes we get a subgroup instead. For example, look at $g = 2, p = 7$:

$$g \equiv 2$$ $$g^2 \equiv 2 \cdot 2 \equiv 4$$ $$g^3 \equiv 4 \cdot 2 \equiv 8 \equiv 1$$ $$g^4 \equiv 1 \cdot 2 \equiv 2$$ and we've looped after only $3$ elements.

In general, for the multiplicative group modulo $q$, which has size $q - 1$, any element generates a subgroup whose size divides $q - 1$. Since $7 - 1 = 6 = 2 \cdot 3$, an element modulo $7$ can generate a subgroup of size $1$, $2$, $3$, or $6$.
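A quick sketch in Python shows both behaviors, generating the full group modulo $11$ but only a size-$3$ subgroup modulo $7$ (the helper is mine):

```python
def powers(g, p):
    """Return the cycle g, g^2, g^3, ... modulo p, stopping when it repeats."""
    seen, x = [], g % p
    while x not in seen:
        seen.append(x)
        x = (x * g) % p
    return seen

print(powers(2, 11))  # [2, 4, 8, 5, 10, 9, 7, 3, 6, 1]: the full group of size 10
print(powers(2, 7))   # [2, 4, 1]: a subgroup of size 3
```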


So, where's the crypto?

Well, first off, we might have noticed that the operation $g^x$ seems to have this "scrambling" effect: the numbers that you get out are kind of random.

In fact, they're very random for large moduli $q$. In general, it's believed to be quite hard, given $g$, $q$, and $g^x$ (modulo $q$), to figure out $x$. This is called the discrete logarithm problem, and its believed difficulty is the foundation of well-known cryptosystems, most notably Diffie–Hellman key exchange.

How large is "large"? Large $q$ generally means hundreds or thousands of bits in size.

We'll come back to this bit in a bit, but the important part to note here is that, in general, inverting the exponentiation, or, taking the discrete logarithm, is hard.

Also, finite field arithmetic, in particular, division in a finite field, plays a critical part in AES.

Short codes

There's another useful thing we can do with modular arithmetic and discrete logarithms: a URL-shortening scheme, like the short URLs produced by popular link-shortening services.

There are a few ways you might try to implement such a scheme. First, we'll assume that you are just assigning each URL a number, and then you'll use something like Base32 to convert the number (in binary) to a string.

Then how do you assign the number?

  1. Just pick a random number and store it in a database, tied to the full URL.
  2. Hash the full URL, and use part of the hash as the number.
  3. Use a 1-up counter and tie it to the full URL.
  4. Use a 1-up counter, but try to scramble it, for some definition of scramble.

Option 1 is the simplest, and requires basically no math: just generate a random number and stick it in the database. If there is a collision, try again.

The downside is that, for very short codes, such as for a 5-character Base32 code, the numbers will all be $< 2^{25}$, and will be very likely to collide because of the birthday paradox. Lots of collisions means you have to try a bunch of times to insert into the database successfully, and that means more round trips, more chances for weird race conditions, and more chances for bugs.
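To put a rough number on that, the standard birthday-paradox approximation says the chance of at least one collision among $n$ random $b$-bit codes is about $1 - e^{-n(n-1)/2^{b+1}}$ (the function name here is mine):

```python
from math import exp

def collision_probability(n, bits=25):
    """Approximate chance of at least one collision among n random bits-bit codes."""
    return 1 - exp(-n * (n - 1) / (2 * 2**bits))

# Even a few thousand 25-bit codes carry a substantial collision risk
for n in (1000, 6000, 10000):
    print(n, round(collision_probability(n), 3))
```

With only a few thousand codes assigned, the chance of a collision is already well over a third, and each collision costs another database round trip.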

Option 2, hashing the full URL, has similar downsides for short hashes: collisions. This case is even worse though, because you can't change the hash of the URL, so any collisions are fatal, and mean you can't use that full URL. So that option is not great.

Option 3, using a 1-up counter in the database, is really easy: just use a built-in 1-up counter, or assign a bunch of numbers to individual servers to use, and have them request new batches of numbers occasionally. However, this leads to undesirable URLs, like http://short.url/AAAAC. It allows users to guess what their URLs might be, and to start poking around at other short URLs.

Option 4 is a compromise: we want the simplicity of the 1-up counter, but we don't want users to be able to easily and meaningfully guess the short URLs. We just need a scrambling function.

Scrambling a 1-up sequence

But what to choose?

Well, again, we have some options. Some thought might lead you to think: well, we could just add a constant to the number, or multiply by a constant. However, this will be pretty obvious to users: they'd notice that consecutive short URLs would always differ by constant amounts.

Another line of thought: could we use the number in a random number generator (say, as the seed), and just turn the crank and use the next number in the sequence? Yes, that's a great idea, as long as we're careful. Specifically, we have to be careful about collisions.

Two common random number generators that might fit the bill are linear-feedback shift registers and linear congruential generators.

But, we might have a problem: we might need to "unscramble" the numbers, to tell if they might be in the database. We might just store the scrambled versions next to the 1-up counter, but that would require extra storage, and would probably require us to either write complicated SQL or do an UPDATE, and another database round-trip, to set the scrambled number as well. I don't like either of those options.

So, how hard is it to unscramble those random number generators? For linear-feedback shift registers, unscrambling means counting how many steps it took to get to the given output number, which is difficult to do. At first glance, the same is true of linear congruential generators.

Let's take a closer look at the linear congruential generator. Given a number $X_i$, it generates the next number by:

$$X_{i+1} = A \cdot X_i + C$$

If we assume $C = 0$ and $X_0 = 1$, then $X_i = A^i$: this is exactly the exponentiation that gave us cyclic groups. So, to "scramble", all we need to do is exponentiate in our cyclic group.
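A toy check of that equivalence (small parameters chosen just for illustration):

```python
def lcg_sequence(a, x0, m, steps):
    """Iterate X_{i+1} = a * X_i (mod m), i.e., an LCG with C = 0."""
    x = x0
    for _ in range(steps):
        x = (a * x) % m
    return x

# After i steps starting from X_0 = 1, the state is just a^i mod m
a, m = 3, 31
assert lcg_sequence(a, 1, m, 10) == pow(a, 10, m)
print(lcg_sequence(a, 1, m, 10))  # 25
```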

Oh, but wait, this sounds like bad news: we know that inverting exponentiation is the discrete logarithm, which is hard in general.

The key words there are "in general". Can we pick cyclic groups where it is easy?

Yes, we can.

Easy discrete logarithms

If we are working in a cyclic group modulo $p$, a prime, and if $p - 1$ is the product of a bunch of small primes or prime powers, then computing discrete logarithms is easy. To compute discrete logarithms in that case, we can use the Pohlig–Hellman algorithm.

The Pohlig–Hellman algorithm takes advantage of short cycles in the cyclic group: essentially, we can piece together the "full" discrete logarithm by computing a bunch of "small" discrete logarithms and using the Chinese remainder theorem to stitch them back together. Each of these "small" discrete logarithms lives in a subgroup defined by one of the factors of $p - 1$: a bunch of small factors of $p - 1$ means a bunch of small discrete logarithms to do. If the factors are all very small, we can even precompute all of the logarithms in the small subgroups.
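Here is a minimal Python sketch of that idea for the case where $p - 1$ is squarefree, brute-forcing each small subgroup instead of using precomputed tables (the function names and toy parameters are mine; pow(m, -1, mi) needs Python 3.8+):

```python
def crt(residues, moduli):
    """Chinese remainder theorem: find x with x = r_i (mod m_i) for all i."""
    x, m = 0, 1
    for r, mi in zip(residues, moduli):
        # Solve x + m*t = r (mod mi) for t, then fold it into x
        t = ((r - x) * pow(m, -1, mi)) % mi
        x += m * t
        m *= mi
    return x % m

def pohlig_hellman(g, h, p, factors):
    """Solve g^x = h (mod p), where factors are the distinct primes of p - 1."""
    residues = []
    for q in factors:
        # Project into the subgroup of order q and brute-force x mod q
        gq = pow(g, (p - 1) // q, p)
        hq = pow(h, (p - 1) // q, p)
        k = next(k for k in range(q) if pow(gq, k, p) == hq)
        residues.append(k)
    return crt(residues, factors)

# Toy example: 3 generates the full group mod 31, and 31 - 1 = 2 * 3 * 5
p, g = 31, 3
h = pow(g, 17, p)
print(pohlig_hellman(g, h, p, [2, 3, 5]))  # recovers 17
```

For the group we build below, the same brute-force loops would try at most $2 + 3 + 5 + 7 + 11 + 17 + 19 + 23 = 87$ candidates in total, which is exactly why the precomputed-table version needs so little storage.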

Putting it all together

Since we want roughly five-character codes, we need to find a 25-bit prime with which to make our cyclic group. Specifically, we want a prime $p$ such that $p - 1$ has only tiny factors. To make the math easier, it would be nice if $p - 1$ has no repeated factors (that is, no prime powers).

In Sage (a Python-based mathematics environment), we can find such a prime with this code:

smoothest = 2^30  # sentinel: just something large
for p in primes(2^24, 2^25):
    smoothness = 0
    for f, e in (p - 1).factor():
        if e != 1:
            smoothness = 2^30  # penalize repeated factors
        smoothness += f
    if smoothness < smoothest:
        smooth = p
        smoothest = smoothness
print smooth, smooth - 1, (smooth - 1).factor()

If we run this snippet, we find the best prime is $17160991$. We can confirm that $17160991 - 1 = 17160990 = 2 \cdot 3 \cdot 5 \cdot 7 \cdot 11 \cdot 17 \cdot 19 \cdot 23$.

This means that we can compute logarithms modulo $17160991$ by pre-computing and storing just $87$ numbers (the sum of the factors of $p - 1$) using Pohlig–Hellman.

We also need to pick a base $g$, that is, the number we are exponentiating in our cyclic group. We basically need a number that generates the entire multiplicative group. We can use Sage again to find such a number:

modulus = 17160991
for q in primes(3, 1000):
    if GF(modulus)(q).multiplicative_order() == modulus - 1:
        print q

In this case, we find that $61$ is the smallest prime that generates the full multiplicative group, so it will be our base.

So, our procedure so far to scramble an integer $x$ looks like:

$$61^x\ (\text{mod}\ 17160991)$$

However, there is one tiny problem here: for $x = 0$, we get $1$, and for $x = 1$, we get $61$, and for $x = 2$, we get $3721$. These values stick out a bit (they're all small and easily recognized). To hide them, we can just add some small number to $x$, like 30, giving us

$$61^{x + 30}\ (\text{mod}\ 17160991)$$

So, for $x = 0$, we have $4244504$ and $x = 1$ gives us $1499879$.
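Here's a minimal Python sketch of the whole scrambling pipeline, from counter to five-character code (the Base32 alphabet and helper names are arbitrary choices of mine, not from any particular implementation):

```python
P, G, OFFSET = 17160991, 61, 30
ALPHABET = "abcdefghijklmnopqrstuvwxyz234567"  # any 32-character alphabet works

def scramble(x):
    """Scramble a 1-up counter value by exponentiating in the cyclic group."""
    return pow(G, x + OFFSET, P)

def to_code(n, length=5):
    """Encode n as a fixed-length Base32 string (5 chars = 25 bits)."""
    chars = []
    for _ in range(length):
        chars.append(ALPHABET[n % 32])
        n //= 32
    return "".join(reversed(chars))

def short_code(counter):
    return to_code(scramble(counter))

print(short_code(0))
print(short_code(1))
```

Decoding reverses the two steps: map the five characters back to a number, take its discrete logarithm with Pohlig–Hellman, and subtract $30$.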

If we want to invert a short code, we'll need to unscramble a number $y$, that is, find the $x$ in

$$y \equiv 61^{x+30}\ (\text{mod}\ 17160991)$$

In the first of the two examples above, we would be trying to solve:

$$4244504 \equiv 61^{x+30}\ (\text{mod}\ 17160991)$$

Using the Pohlig–Hellman algorithm, we can easily compute (with a few modular exponentiations) that the discrete logarithm of $4244504$ is $30$, which means that $x = 0$.

Some code

I've released code for the above computations, including the Pohlig–Hellman algorithm implementation, in Python on GitHub, licensed under MIT.


How to pronounce programming words

Certain words in programming I've seen people mangle in various ways, and many times, I realized that I don't actually know the correct way to pronounce them.

So, I've constructed a list of what I consider to be the correct pronunciations of various words. I've tried to use the original authors as the source when possible, or other authoritative sources.

In alphabetical order:


So You Want To Learn Crypto, Part 1

Cryptography and cryptanalysis are topics that fascinate many people, but it's a tricky area to get into. For instance, knowing the difference between a hash function and a cryptographically strong hash function is critical, but extremely easy to mix up due to confusing terminology. In general, there's heavy jargon that takes serious studying to get a grasp on.

Cryptography is also almost entirely a self-taught discipline: even today, there are few college and graduate courses on cryptography, and most of those only cover the basics. (Indeed, this is why I wrote my book on cryptanalysis: I wanted to teach a class on the topic, and there was no suitable text, so I wrote my own.) This means that it is reasonable, even necessary, to learn a great deal of crypto on your own.

With that in mind, I'm going to list a few resources, recommendations, and maybe even explain a little bit myself to help get you started.

Some resources to get you started

  • lvh has a video from PyCon 2013 called Crypto 101. I'm not the biggest fan of this video, as it goes through too much material in too short an amount of time, but it does give you a quick look at many different areas of cryptography. Since it does cover so much, don't worry if not everything clicks. Instead, I'd pick something that interests you, like hash functions or AES, and dive further into that one topic. If you try to tackle too many subjects at once, it might be overwhelming.
  • Once you've figured out some areas you actually like, you can start digging into that particular field.
  • There are some great books on cryptography that you'll probably want to read eventually, so I'll go ahead and list some of them here:
    • Modern Cryptanalysis by Christopher Swenson. Obviously, I'm a bit biased here, but I wrote some decent introduction sections to many areas of cryptography and cryptanalysis with lots of examples in Python, and there are far worse places to start. I focus primarily on block ciphers (like AES) and public key cryptography (like RSA).
    • Applied Cryptography by Bruce Schneier. Extremely dated, but still has some useful stuff. A lot of block and stream cipher stuff.
    • Practical Cryptography by Bruce Schneier. A good companion to Applied Cryptography that is a bit more updated, with lots of focus on understanding and correctly using cryptography.
    • Algorithmic Cryptanalysis by Antoine Joux. One of the newest books, and chock full of good stuff, with a focus on public key cryptography, stream ciphers, and lattices. This one's probably one of the most advanced cryptography books on the market.
    • Handbook of Applied Cryptography by Alfred Menezes, Paul van Oorschot, and Scott Vanstone. This book covers a lot of material, but I find it a little terse at times. It is free, though, which is a big plus.

This list is not comprehensive: it's probably best to just look at all of the available books, glance through some of the pages and the table of contents, and see if the style, approach, and depth meet your interests. In general, there are two kinds of crypto books you'll see on the market: academic ones and practical ones. When you start out learning, you'll definitely want to focus on the more practical tomes (Schneier and Swenson, for example), and start reading academic books (Joux's book, and many published by academic publishers like Springer) as you get more comfortable.

I know the basics, what next?

Once you know Crypto 101 and you want to dive into Crypto 201, where do you go?

One great next step is to work on The Matasano Crypto Challenges. Not only will you have to learn a lot to complete them, but you'll have real experience making and breaking codes. I think this is a great next step for a would-be crypto enthusiast. (And they're fun for cryptographers at any level – I certainly enjoyed doing them.)

There are some Coursera courses on Cryptography taught by Dan Boneh (Cryptography I and Cryptography II). I haven't taken them, but the instructor is top-notch, and the syllabi are solid.

As you progress further in your knowledge, you should start reading recent research papers in cryptography. One great place to keep track of is the IACR Cryptology ePrint Archive. These are the cutting edge of cryptography and cryptanalysis, and they range from beginner-friendly to extremely deep, specialist literature. The writing and publication quality of the cutting-edge papers also has a wide range. And even though it is possible to learn almost everything about cryptography and cryptanalysis for free from these papers, the books in the previous section are often much easier to learn from since they have consistent writing style, terminology, notation, and editing.

This part confuses me. Help!

I'm going to have some more blog posts in the near future clearing up some areas that I've seen be particularly sticky for some people. In particular, I know that there are some sticky points in finite fields, RSA, and hashing that I think I can help clarify.

Check back next week for part 2 of So You Want To Learn Crypto.


What I Learned As A Teen Programmer

I recently stumbled across the source to a NES emulator I wrote when I was a teenager. Reading through the source code, I thought a lot about how much I had grown as a developer and engineer since that long-forgotten summer. For instance, I now understand things like (ha!) having a consistent style (some of my code was indented over 24 spaces so that it would be in the middle of the monitor), writing comments (there are basically no comments in this code), and tests (I didn't even know that testing was a thing). But, I also realized that writing that emulator, and all of the development I did at the time, laid a strange foundation that colored how I've thought about programming since.

Up to that point, I'd puttered around a little in BASIC and C, but I didn't fall in love with programming until I discovered assembly language. Assembly was the first thing that really clicked for me: it was so simple, its semantics were so clear, and I found some great resources to help me learn (e.g., a nice tutorial I found on a BBS). So, in my heart, assembly language is my first language.

When I tell this to people now, they give me strange looks. It's almost unfathomable to teach people assembly language first. But, in doing so, I learned an awful lot about how software works, how computers work, and what foundation programming languages are built on top of.

My masterpiece as a teen programmer was the NES emulator I wrote entirely in x86 assembly language: almost exactly 10,000 lines of it. This was my first taste of a real program: it was potentially useful, it had meaning, and it was a moderately large software project (my first). Programming this was what made me dig deep and learn how to program.

Here are some things that writing that emulator, and assembly language in general, taught me:

  1. Data is just memory. I remember trying to understand pointers in C and being incredibly frustrated with the strange syntax and the weird semantics. On the other hand, dealing with memory in assembly is so clearcut: memory is just a giant flat array (more or less), and pointers just point to stuff in that giant array. Indirect references, arrays of pointers, etc., make so much more sense when boiled down like this.

    Even today, when trying to visualize many problems, I think to myself: what does this look like in memory? What structures are being used and how are they laid out? This helps me understand how algorithms and data structures work together.

  2. How to write everything yourself. There were some limited BIOS and DOS kernel calls, but for the most part, if I needed some capability, say, reading from a floppy drive or writing to the screen, I had to write it from scratch: calling I/O ports, reading and writing to RAM addresses, etc. If I wanted graphics, I had to draw every single pixel myself. These days, I know that this is just reinventing the wheel, something normally regarded as a bad habit; however, I also learned an awful lot about inventing wheels doing it. I still have the bad habit of writing too many things myself, and not leveraging existing libraries as often as I should, so this lesson wasn't a complete victory.

  3. Sometimes you just have to buckle down and program all of the tedious parts. A big part of the NES emulator is writing a cycle-accurate CPU emulator. I'm sure that you can imagine how daunting it must be to start writing CPU.ASM and have an empty jump table with 256 entries, knowing that you have to write all 256 functions to handle every single operation. Even worse, you have to get every single piece exactly right.

    This taught me an extremely valuable skill: being able to focus on tedious, repetitive tasks. Programming is absolutely chock-full of tedious programming that is absolutely necessary. A great example is date and time functionality (especially time zones): incredibly important code to get correct, but so incredibly tedious and detail-oriented. Being able to dive into thousands of pages of specifications and turn them into working libraries and APIs has paid great dividends in my career as a software engineer, and I first learned that here. Being methodical and untiring is a great skill.

    And really, sometimes there is just no elegant solution. Often I see programmers strive to come up with a beautiful, simple, clean way to solve a problem, only to realize that the solution doesn't stand up to reality. Sometimes reality is ugly, and sometimes you have to write ugly code to solve it: you just need to program your way out of it.

  4. Assembly is useful. Learning assembly has come in handy countless times: reading GDB disassembly of my code for debugging, understanding and optimizing performance, writing compilers. Every time I see an assembly dump on a screen, it feels like home.

  5. You have to fail a lot to succeed. When writing code as a teenager in assembly language, I spent a lot of time crashing my computer. I mean, assembly is really easy to get wrong. When you get it wrong, you often do so catastrophically.

    But more so, in order to learn how to build things, you first have to learn to build them wrong, and learn to recognize your mistakes. And this code is just full of mistakes.
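The flat memory model from point 1 and the 256-entry jump table from point 3 fit together naturally. As a toy illustration, here's a sketch in Python rather than the original x86 assembly; the 6502 opcode 0xEA really is NOP, but the handler names and everything else here are stand-ins of my own, not code from the emulator:

```python
# Toy 6502-style dispatch: memory is one flat byte array, and each
# fetched opcode indexes into a 256-entry table of handler functions.
memory = bytearray(64 * 1024)  # a 64 KB flat address space
pc = 0x0000                    # program counter

def op_nop():
    pass

def op_unimplemented():
    raise NotImplementedError("every one of the 256 entries needs a handler")

# Start with 256 placeholders, then fill them in one tedious opcode at a time.
opcodes = [op_unimplemented] * 256
opcodes[0xEA] = op_nop  # 0xEA is NOP on the 6502

def step():
    global pc
    opcode = memory[pc]   # fetch: data is just memory
    pc += 1
    opcodes[opcode]()     # dispatch through the jump table

memory[0] = 0xEA
step()  # runs op_nop; any unfilled entry raises instead
```

The tedium the post describes is exactly filling in those 256 entries, each one cycle-accurate, before the table stops raising.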

Overall, revisiting old code like this helped me reflect on my programming: where I've been, where I am, where I'm going. I encourage everyone to save as much of their code as they ever write, and come back to it in a few years, and to reflect.


Pyway or the highway

Pyway is a database migration tool that I wrote to solve the problem of: I want database migrations, but I don't want to bring Rails or the entire JVM with me just to do them.

So, I wrote a little script that does the most simple thing you can do with database migrations: keep track of them by file name, and run any new ones in sorted order. Python is pretty much universally present at this point in server environments, so that's the tool that I used.

This is nothing sophisticated, and currently has sqlite3 hardcoded in it. I didn't even use the Python bindings for sqlite3: I just call the command-line program to run the migrations, so it should be trivial to swap in another database's CLI.
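The core idea is small enough to sketch. This is a hypothetical reconstruction, not Pyway's actual source: the `_applied` table name, the `migrations` directory, and the exact CLI invocation are my assumptions, and it assumes the `sqlite3` binary is on your PATH:

```python
import os
import subprocess

def run_sql(db_path, sql):
    """Run one SQL statement through the sqlite3 command-line tool."""
    result = subprocess.run(["sqlite3", db_path, sql],
                            check=True, capture_output=True, text=True)
    return result.stdout

def migrate(db_path, migrations_dir="migrations"):
    """Apply any .sql files not yet recorded, in sorted filename order."""
    run_sql(db_path, "CREATE TABLE IF NOT EXISTS _applied (name TEXT PRIMARY KEY);")
    applied = set(run_sql(db_path, "SELECT name FROM _applied;").splitlines())
    for name in sorted(os.listdir(migrations_dir)):
        if not name.endswith(".sql") or name in applied:
            continue
        # Pipe the migration file into the sqlite3 CLI; no Python bindings,
        # so swapping in psql or mysql later is a small change.
        with open(os.path.join(migrations_dir, name)) as f:
            subprocess.run(["sqlite3", db_path], stdin=f, check=True)
        # Fine for trusted local filenames; don't interpolate untrusted input.
        run_sql(db_path, f"INSERT INTO _applied (name) VALUES ('{name}');")
```

Running it twice is a no-op the second time: any migration already recorded in the tracking table is skipped, which is the whole trick.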

I'll probably end up adding Postgres or MySQL support sometime soon (when one of my projects that uses those technologies needs to be updated).


This Holacracy Thing

I've had a busy year where I've been trying out a lot of new ideas for working and collaborating, and learning a lot. One thing that I learned about and experienced for a few weeks was the system of Holacracy. The goal of Holacracy is to have a managerless, self-organized team of people rather than having a traditional top-down management tree. First, I'll state that my view of Holacracy comes after only about a month of working within it, and I don't feel that even after that I fully got it. So take this post with a grain of salt.

It's an AI for a boss

The initial appeal of Holacracy was that the group within the startup I was working at had no boss. The Holacratic system has enough processes in place to theoretically allow decisions to be made without a central leader: a true peer-to-peer sort of system.

Essentially, we had several Holacracy meetings a week. These meetings were incredibly structure-heavy. People would propose ideas at point A, discuss them at point B, and these discussions and round robins followed rules that theoretically ensured fairness.

An insight I had was that, in effect, your team is controlling an NPC "boss", and you have a little script that you run through to control its AI.

This sounds like a good thing: if you have a very small team, maybe only a couple of people, it might seem like there would not be enough work for a full-time manager. So you can have a cardboard cutout that you all pretend is your boss, and you have a system through which your cardboard captain makes decisions. (I like to think of my cardboard cutout boss as Captain Picard.)

It's too complex

But, this AI is quite complex. Holacracy has a lot of rules. And I mean a lot.

There was a discussion I had with someone at the startup where they indicated that Holacracy was probably impossible to understand (and certainly to get right) without hiring a special Holacracy consulting company to come in and help you adopt the system — helping you learn the rules so your Captain Picard can make decisions, basically.

This kind of put me off. A system that's so complex that it takes weeks of training to be able to even use it to have a constructive meeting? It smells a bit fishy to me. Meetings are hard, but they're not that hard, even for a cave troll programmer like me. There are entire consulting companies who would love to sell you tons of expensive training for Holacracy.

To make it even worse, it's also nearly impossible to run with Holacracy without using special Holacracy meeting software to help you facilitate your meetings. There's a lot of structure and rules, and the software helps enforce these rules. This also smells a bit fishy.

Being a newcomer to this incredibly confusing set of rules was also incredibly unpleasant. I felt lost in a maze of rules. I felt that I couldn't contribute to meetings because I didn't even know when I was allowed to talk. And when I was allowed to talk, I often didn't know what to say. (There are lots of magic words you use to indicate your status, and I sort of had to learn these all from context.) Given that I was on a feeler contract (before I came on full-time), I was not given any training in Holacracy.

Finally, with such elaborate rules, there's inevitable rules-lawyering. The rules are meant to be used as a weapon to combat inefficiency, but sometimes people get wrapped up in them to the point that legitimate, necessary work was dropped on the floor because it did not fit into the system well. I felt that at times this did a disservice to the team, and certainly to the users of our software.

It's time-consuming

Given the AI manager and the complex rules, I think it's about a wash so far. It's nice not to have a manager, but annoying to have a large RPG rules book to memorize before starting work.

But, those meetings to make Captain Picard dance were ridiculously time-consuming. I'm of the leave-me-alone-so-I-can-code-in-my-cave School of Programming. I can be coaxed out of my cave for the occasional meeting, and sometimes even enjoy it. But the level of meetings required for the Captain Picard gears to run was debilitating. Hours and hours each week were spent. Several people had a significant amount of work before and after meetings dedicated to cleanup and prep work. (And even during the meeting, a couple of people were often not allowed to participate in discussions, or supposed to limit such contributions significantly, as they played the role of facilitator, secretary, etc.)

In conclusion

Coming in as a new hire is daunting. You are adjusting to a new team, learning their quirks, and learning their work. When you add on a whole new system of thinking about meetings and management, and give no training (when a lot of training is required), you are setting people up to fail. It certainly was just too much for me. Holacracy was a huge contributor to my deciding not to work for this particular company long-term.

The company that I worked with clearly prefers Holacracy, and that's their prerogative. And if it works for them, that's great. But, I didn't feel comfortable in that system, and I don't think I'm the only one. I don't expect to in the future, and I plan to avoid working for companies that embrace Holacracy.

Or, you know, maybe Holacracy could be rebranded to use an actual Captain Picard cardboard cutout and a real RPG-like rulebook. I might have had more fun then.