A group calling itself the “Shadow Brokers” claimed to have stolen some of the “cyber weapons” of the NSA’s “Equation Group.” A sample of the tools was made publicly available, with the others supposedly available to the winner of an auction. The Washington Post is reporting that these are legitimate NSA tools.
I decided to take a quick look at some of the “cyber weapons” this morning and I’m pretty underwhelmed by their quality.
In particular, I looked at the BANANAGLEE BG2100 code for creating encryption keys (keygen-2140) and redirecting communications (bg_redirector-2140). Both of these are statically-linked Linux executables.
The purpose of the keygen tool is to generate a 16-byte random number for use by the other tools. This simple task can be accomplished by reading 16 bytes from /dev/urandom. Here’s a one-liner to accomplish what they want.
dd if=/dev/urandom bs=16 count=1 2>/dev/null |xxd -p
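For comparison, here is roughly the same thing in Python, using the standard library's secrets module, which draws on the operating system's random source. This is only an illustrative equivalent of the one-liner, not anything taken from the leaked tools, and the function name is made up for this sketch.

```python
import secrets


def make_key_hex(num_bytes: int = 16) -> str:
    """Return num_bytes of OS-provided randomness as a hex string."""
    return secrets.token_bytes(num_bytes).hex()


if __name__ == "__main__":
    # Prints 32 hex characters, i.e. a 128-bit key.
    print(make_key_hex())
```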
Instead, the 16 bytes are generated by the following procedure.
- Read 32 bytes from /dev/urandom.
- Use the first 4 of those bytes as an argument to srandom(3) to seed the non-cryptographic random(3) pseudorandom number generator.
- Generate 20 bytes with the following procedure for each byte.
  - Call rand(3) (rand() % 2931)*(rand() % 242) + 2351 times, throwing away each result.
  - Call rand(3) again and use the least significant byte.
- Use the second 4 bytes from /dev/urandom as the argument to srandom(3) and repeat the byte-generation step above to generate another 20 bytes.
- Compute the SHA-1 hash of the 40 bytes.
- Output the most significant 16 bytes of the hash.
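To make the procedure above concrete, here is a rough Python sketch of it. Several things are my own assumptions and not taken from the binary: ToyRand is a simple stand-in generator (the textbook ANSI C example LCG, not the C library's random(3)), the seeds are read as little-endian 32-bit integers, and all the function names are invented. The constants default to the ones listed above, but they are parameters so the sketch can be run quickly with smaller values.

```python
import hashlib
import os
import struct


class ToyRand:
    """Stand-in PRNG (textbook LCG). Assumption: NOT the real random(3)."""

    def __init__(self, seed: int) -> None:
        self.state = seed & 0xFFFFFFFF

    def next(self) -> int:
        self.state = (self.state * 1103515245 + 12345) & 0xFFFFFFFF
        return (self.state >> 16) & 0x7FFF


def twenty_bytes(seed: int, mod_a: int = 2931, mod_b: int = 242,
                 base: int = 2351) -> bytes:
    """Seed the generator once, then produce 20 bytes as described above."""
    rng = ToyRand(seed)
    out = bytearray()
    for _byte in range(20):
        # Two calls decide how many further results get thrown away.
        discard = (rng.next() % mod_a) * (rng.next() % mod_b) + base
        for _skip in range(discard):
            rng.next()
        # Keep only the least significant byte of the next result.
        out.append(rng.next() & 0xFF)
    return bytes(out)


def derive_key(urandom32: bytes, **constants: int) -> bytes:
    """Turn 32 random bytes into a 16-byte key; only the first 8 are used."""
    # Assumption: the two seeds are little-endian 32-bit integers.
    seed1, seed2 = struct.unpack_from("<II", urandom32, 0)
    material = twenty_bytes(seed1, **constants) + twenty_bytes(seed2, **constants)
    return hashlib.sha1(material).digest()[:16]


if __name__ == "__main__":
    # Small constants keep pure Python fast; the defaults are much slower.
    print(derive_key(os.urandom(32), mod_a=7, mod_b=5, base=3).hex())
```

Whatever generator sits in the middle, the key is a deterministic function of the two 32-bit seeds, so changing any of the other 24 bytes read from /dev/urandom changes nothing.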
This is ridiculous. There’s no reason to read 32 bytes from /dev/urandom when the procedure only uses 8 of them. There’s no benefit to calling rand(3) so many times. (It’s a little ridiculous to be seeding with srandom(3) and calling rand(3), but in this particular implementation, rand(3) does nothing but call random(3).)
But worst of all, rather than having 2^128 possible 128-bit keys, this procedure can only produce 2^64 distinct keys, because the output is completely determined by the two 32-bit seeds!
The background redirector appears to listen for IP packets being sent from the “attack” host to some particular other host and instead encrypts them and sends them to a third host. It also listens for encrypted packets from the third host, decrypts them, and sends them along to the attack host.
As near as I can tell, the idea is to make it look like the attack host is sending data to and receiving data from the second host while actually communicating with the third. That’s kinda neat.
But both the code and the crypto are bad. Very bad. The code has some boring memory leaks. The protocol used for encrypted communication is much more interesting.
I didn’t dive into it too deeply, but it appears to be encapsulating TCP or UDP packets into fragments of size 526 bytes, prepending a header, encrypting part of the whole, and sending it along.
The header consists of a 4-byte random number (shared by each fragment corresponding to a given IP packet), an 8-byte initialization vector, a 4-byte magic number 0xDECAFBAD, a 2-byte fragment number, a 2-byte total number of fragments, and a 2-byte size.
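Laid out with Python's struct module, that header comes to 22 bytes. The sketch below is only my reading of the field list: the big-endian byte order, the absence of padding, and all of the names are assumptions, since I didn't confirm any of them against the binary.

```python
import struct

MAGIC = 0xDECAFBAD

# Assumption: big-endian fields with no padding between them.
# packet id (4) | IV (8) | magic (4) | fragment no. (2) | total (2) | size (2)
HEADER = struct.Struct(">I8sIHHH")


def pack_header(packet_id: int, iv8: bytes, frag_no: int,
                frag_total: int, size: int) -> bytes:
    """Build the 22-byte plaintext header for one fragment."""
    return HEADER.pack(packet_id, iv8, MAGIC, frag_no, frag_total, size)


def unpack_header(raw: bytes) -> dict:
    """Parse a header whose encrypted part has already been decrypted."""
    packet_id, iv8, magic, frag_no, frag_total, size = HEADER.unpack(
        raw[:HEADER.size])
    if magic != MAGIC:
        raise ValueError("bad magic number")
    return {"packet_id": packet_id, "iv": iv8, "frag_no": frag_no,
            "frag_total": frag_total, "size": size}


if __name__ == "__main__":
    hdr = pack_header(0x01020304, bytes(8), 0, 1, 526)
    print(len(hdr), unpack_header(hdr))
```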
Starting with the magic number field of the header, the packet is encrypted using RC6 in output feedback mode. To compute the IV, a SHA-1 hash of the plaintext (starting with the magic field of the header) is computed and the 8 most significant bytes form the least significant 8 bytes of the 16-byte IV. (The most significant bytes of the IV are set to 0.) It’s important to note that the random value identifying the fragments is not hashed into the IV.
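As a minimal sketch of that IV construction, assuming “most significant” means the leading bytes of the digest and that the 16-byte IV is laid out with its zero bytes first (both are my interpretation, and the function name is hypothetical):

```python
import hashlib


def derive_iv(plaintext: bytes) -> bytes:
    """Build the 16-byte IV from the plaintext alone.

    plaintext is everything that gets encrypted, starting at the magic
    number field. Neither the key nor the random fragment identifier
    is involved.
    """
    digest = hashlib.sha1(plaintext).digest()
    # Upper 8 bytes are zero, lower 8 bytes are the start of the hash.
    return bytes(8) + digest[:8]


if __name__ == "__main__":
    message = b"\xde\xca\xfb\xad" + b"same payload"
    # The same plaintext always yields the same IV.
    print(derive_iv(message) == derive_iv(message))
```

Since only 8 bytes of the IV are non-zero, they fit in the 8-byte IV field of the header, which means those 8 bytes of the hash travel in the clear.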
I’m no cryptographer, but this seems crazy. An IV should never be reused for a given key. And yet identical messages will produce identical IVs, even if the keys are different. Perhaps there’s something that guarantees a message will never be sent twice, but if I were designing this, I sure wouldn’t rely on that property.
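To show why a repeated IV matters in output feedback mode, here is a small sketch. RC6 isn't in the Python standard library, so toy_block_encrypt below is a made-up stand-in (a truncated keyed hash), not a real cipher; OFB only ever runs the block function in the forward direction, so that is enough for illustration. The point is that the keystream depends only on the key and the IV.

```python
import hashlib

BLOCK = 16


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for RC6 (assumption): a truncated keyed hash, NOT a cipher."""
    return hashlib.sha256(key + block).digest()[:BLOCK]


def ofb_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data in OFB mode (the operation is its own inverse)."""
    out = bytearray()
    feedback = iv
    for offset in range(0, len(data), BLOCK):
        # Each keystream block is the encryption of the previous one.
        feedback = toy_block_encrypt(key, feedback)
        chunk = data[offset:offset + BLOCK]
        out.extend(c ^ k for c, k in zip(chunk, feedback))
    return bytes(out)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


if __name__ == "__main__":
    key, iv = b"k" * 16, bytes(16)
    p1, p2 = b"attack at dawn!!", b"retreat at nine!"
    c1, c2 = ofb_xor(key, iv, p1), ofb_xor(key, iv, p2)
    # With a shared key and IV the keystream cancels out of c1 XOR c2.
    print(xor_bytes(c1, c2) == xor_bytes(p1, p2))
```

With the IV derived from the plaintext, the most direct consequence is that the same message under the same key always encrypts to the same bytes, so an observer can at least tell when a message repeats.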
Update: Sean Devlin pointed out that leaking bits of the hash of the plaintext is also a pretty bad idea.
I looked at two tools and found:
- 128-bit keys generated using 64 bits of entropy.
- Apparently repeated IVs.
- (Update) IV leaks bits of the hash of the plaintext.
- No authentication of the encrypted communication channel.
- Sloppy and buggy code.
Maybe I simply picked bad tools and the others are all fantastic, but I kind of doubt it. Overall, I’m not impressed by what I’ve seen here.