Reverse Engineering Jane Street's Neural Network

The Puzzle

In February 2026, Jane Street released a puzzle: a neural network checkpoint, model_3_11.pt, with a single question - what input string makes the network output a positive value? All other inputs return −15. No architecture description was given. No training code. Just the weights.

The puzzle sits at an interesting intersection of reverse engineering and machine learning. A conventionally trained network would be opaque - activations spread diffusely across millions of parameters and there is no clean way to recover a specific input from the output. This one, we suspected, was different. The integer-valued weights and binary output were strong signals that the network was hand-crafted rather than trained.

Spoiler. The network implements MD5 entirely in linear algebra. It computes MD5(input) and checks it byte-by-byte against a 128-bit target embedded in the penultimate layer's biases. The answer is "bitter lesson" — the title of Rich Sutton's 2019 essay.

Opening the Archive

A PyTorch .pt checkpoint is a ZIP archive. Each tensor is stored as a flat binary blob under data/N for integer index N, so we can extract the weights without any torch.load call and avoid the security risks of arbitrary pickle deserialization on an untrusted checkpoint.
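A minimal sketch of that extraction, assuming every storage is little-endian float32 and that the entries live under `<root>/data/N`. The helper name `load_raw_tensors` is ours for illustration, not a PyTorch API.

```python
import re
import zipfile

import numpy as np


def load_raw_tensors(path_or_file, dtype="<f4"):
    """Read every <root>/data/<N> entry of a .pt ZIP archive as a flat array.

    Assumption: all storages share one dtype (float32 here). Nothing is
    unpickled, so no code from the checkpoint is ever executed.
    """
    pattern = re.compile(r"^[^/]+/data/(\d+)$")
    tensors = {}
    with zipfile.ZipFile(path_or_file) as zf:
        for name in zf.namelist():
            m = pattern.match(name)
            if m:  # skips data.pkl, version files, and so on
                raw = zf.read(name)
                tensors[int(m.group(1))] = np.frombuffer(raw, dtype=dtype)
    return tensors
```

The result maps each integer index N to a read-only 1-D NumPy array, which is the shape of input the layer-inference step expects.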

Cell output

[OK] Tensors loaded
Archive root : model_3_11/
Tensor count : 5442
Index range : 0 … 5441
Integer weights (sample) : True ← hand-crafted circuit, not trained

5,442 tensors. Every weight we sampled was an exact integer (verified to within 1e-4). A gradient-descent-trained network of this size would have smooth, non-integer weights - the integer constraint is an immediate signal that this model was constructed programmatically.
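The integrality check can be sketched in a few lines. The function name and the choice to test every entry of an array are assumptions; the 1e-4 tolerance is the one quoted above.

```python
import numpy as np


def is_integer_valued(arr, tol=1e-4):
    """Return True if every entry of `arr` is within `tol` of an integer."""
    arr = np.asarray(arr, dtype=np.float64)
    return bool(np.all(np.abs(arr - np.round(arr)) <= tol))
```

Running this over a random sample of tensors is cheap, and a single non-integer weight would be enough to cast doubt on the hand-crafted hypothesis.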

Inferring the Architecture

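For a Sequential stack of Linear layers, the checkpoint stores weight tensors at even indices and bias tensors at odd indices. For a Linear(n_in, n_out) layer, len(weight) == n_in × n_out and len(bias) == n_out. This gives a simple recovery condition: if len(weight) % len(bias) == 0, the pair is consistent with a Linear layer and n_in = len(weight) / len(bias). Divisibility is a necessary condition rather than a proof; checking that each layer's n_in equals the previous layer's n_out rules out accidental matches. No pickle parsing required.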

architecture recovery (Python)
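def infer_linear_layers(tensors):
    """Pair tensors (2k, 2k+1) as (weight, bias) and infer Linear shapes.

    `tensors` maps storage index -> flat 1-D NumPy array.
    """
    layers, i = [], 0
    last = max(tensors)  # highest storage index present
    while i + 1 <= last:
        w, b = tensors[i], tensors[i+1]
        n_out = len(b)
        # A Linear(n_in, n_out) weight holds exactly n_in * n_out values.
        if n_out > 0 and len(w) % n_out == 0:
            n_in = len(w) // n_out
            layers.append({
                "n_in": n_in, "n_out": n_out,
                # PyTorch lays out Linear weights as (n_out, n_in), row-major.
                "w": w.reshape(n_out, n_in), "b": b
            })
        i += 2
    return layers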

Cell output

[OK] Layer inference complete
Total Linear layers : 2721
Input dimension : 55 (55 ASCII bytes, zero-padded)
Output dimension : 1 (scalar match score)
Layer −2 (penultimate) : Linear(192, 48) ← hash comparison
Layer −1 (final) : Linear(48, 1) ← aggregation
Final weight pattern [+1×16, −2×16, +1×16] : True
Final bias : [-15.] (= −15)

2,721 layers. The final weight pattern - sixteen +1s, sixteen −2s, sixteen +1s - and the bias of −15 are not arbitrary. They are the signature of a specific arithmetic structure, which becomes clear once we understand how the network implements equality checking.

The Equality Mechanism

The core insight is that the network tests whether each byte of MD5(input) matches a stored target byte. It does this using what we call the hat function - the second discrete difference of ReLU.

For a single byte with target value t, three neurons compute:

h(x, t) = \operatorname{ReLU}(x - (t+1)) - 2 \operatorname{ReLU}(x - t) + \operatorname{ReLU}(x - (t-1))

The biases of those neurons store −(t+1), −t, and −(t−1) respectively. The output is exactly 1 when x = t, and 0 for every other integer. This gives us an exact equality test purely from linear algebra and ReLU activations.
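To see why the output is exactly 1 at x = t and 0 elsewhere, it helps to evaluate the three ReLU terms case by case for integer x. This is a worked expansion of the definition above, not an additional property of the network:

```latex
h(x, t) =
\begin{cases}
0 - 2 \cdot 0 + 0 = 0, & x \le t - 1, \\
0 - 2 \cdot 0 + 1 = 1, & x = t, \\
(x - t - 1) - 2(x - t) + (x - t + 1) = 0, & x \ge t + 1.
\end{cases}
```

For x ≥ t + 1 all three ReLUs are in their linear regime and the +1, −2, +1 weights cancel both the slope and the constant term.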

hat function verification (Python)
def hat(x, t):
    r = lambda v: max(0, v)
    return r(x-(t+1)) - 2*r(x-t) + r(x-(t-1))

# Validated for all t ∈ [1, 254]
for t in range(1, 255):
    hits = [x for x in range(256) if hat(x, t) != 0]
    assert hits == [t]

Cell output - hat function table (t = 100)

x      h(x,t)
97     0
98     0
99     0
100    1   ← PEAK (x == t)
101    0
102    0
103    0
[OK] Hat function is exactly 1 at x=t and 0 everywhere else (validated for t ∈ [1,254]).

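The final layer then aggregates. Its 48 inputs are the three ReLU terms of the hat function for each of the 16 bytes (16 bytes × 3 groups); weighting the groups by +1, −2, +1 turns them into 16 hat values, which are summed before 15 is subtracted. The output is 1.0 only when all 16 target bytes match. Under this scheme each mismatched byte contributes 0, so any other input scores at most 0, and exactly −15 when no byte matches.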

output = \sum_{i=1}^{16} h(x_i, t_i) - 15    (the sum is 16 when all bytes match, giving output = 1.0)
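The aggregation can be sketched directly from the final-layer weights and bias. One simplifying assumption is labeled in the code: the sketch feeds each digest byte to the comparison neurons as a plain integer, whereas the real penultimate layer is Linear(192, 48). The name `match_score` is ours.

```python
import numpy as np

# Final-layer weights and bias as reported in the layer dump.
FINAL_W = np.array([1] * 16 + [-2] * 16 + [1] * 16, dtype=np.float64)
FINAL_B = -15.0


def match_score(digest, target):
    """Score a 16-byte digest against a 16-byte target.

    Simplifying assumption: each digest byte reaches the comparison
    neurons directly as an integer value.
    """
    x = np.asarray(list(digest), dtype=np.float64)
    t = np.asarray(list(target), dtype=np.float64)
    # Three groups of 16 ReLU neurons with biases -(t+1), -t, -(t-1).
    pre = np.concatenate([x - (t + 1), x - t, x - (t - 1)])
    act = np.maximum(pre, 0.0)
    return float(FINAL_W @ act + FINAL_B)
```

With this model the score equals the number of matching bytes minus 15, so only a full 16-byte match produces a positive output.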

Hat function: ReLU second-difference equality test

Figure 1. The hat function h(x, t) is the second discrete difference of ReLU. It equals 1 exactly when x = t and 0 for all other integer inputs. Three neurons with biases −(t+1), −t, −(t−1) and weights +1, −2, +1 implement this test for each target byte position.

Extracting the Target Hash

The penultimate layer has 48 outputs, arranged as three groups of 16. Each group encodes the same 16 target bytes with a different offset, one group per ReLU term of the hat function, so the 128-bit MD5 hash is effectively stored three times:

Group	Indices	Bias value	Recovery formula
1	0–15	−(t+1)	t = −b − 1
2	16–31	−t	t = −b
3	32–47	−(t−1)	t = −b + 1

Reading group 2 (the canonical group) and asserting all three agree:
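A minimal sketch of that step, following the recovery formulas in the table. The function name `extract_target` and the error handling are assumptions for illustration.

```python
import numpy as np


def extract_target(bias):
    """Recover the 16 target bytes from the 48 penultimate-layer biases."""
    b = np.rint(np.asarray(bias, dtype=np.float64)).astype(int)
    if b.shape != (48,):
        raise ValueError("expected exactly 48 biases")
    g1, g2, g3 = b[:16], b[16:32], b[32:]
    t = -g2  # group 2 stores -t directly
    # Groups 1 and 3 store -(t+1) and -(t-1); both must decode to the same t.
    if not (np.array_equal(-g1 - 1, t) and np.array_equal(-g3 + 1, t)):
        raise ValueError("bias groups disagree")
    if t.min() < 0 or t.max() > 255:
        raise ValueError("decoded values are not bytes")
    return bytes(int(v) for v in t)
```

Calling `.hex()` on the returned bytes gives the 32-character digest string.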

Cell output - hash extraction

[OK] All three bias groups consistent.
===========================================================
EXTRACTED TARGET MD5 (128 bits, 32 hex chars):
c7ef65233c40aa32c2b9ace37595fa7c
===========================================================
Raw bytes (decimal) : [199, 239, 101, 35, 60, 64, 170, 50, 194, 185, 172, 227, 117, 149, 250, 124]
[OK] Valid 32-char hex MD5 digest.

Figure 2. Left: layer widths across all 2,721 layers - the model has an irregular but structured width profile, consistent with a hand-crafted circuit. Centre: the 48 penultimate-layer biases, coloured by group — each group encodes the same 16 target bytes with different offsets (−(t+1), −t, −(t−1)). Right: the 16 extracted MD5 hash bytes shown as decimal values with their hex labels.

Forward Pass Validation

Before inverting the hash, we validate the full forward pass with a pure NumPy implementation - ReLU on every layer except the final linear output:
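A sketch of such a forward pass, assuming `layers` is a list of dicts with "w" of shape (n_out, n_in) and "b" of shape (n_out,), and assuming the input is fed as raw ASCII byte values zero-padded to 55. Both helper names are ours, and the input encoding in particular is an assumption based on the layer dump.

```python
import numpy as np


def encode(text, width=55):
    """ASCII-encode `text` and zero-pad it to `width` bytes."""
    raw = text.encode("ascii")
    if len(raw) > width:
        raise ValueError("input longer than the network's input width")
    padded = raw.ljust(width, b"\x00")
    return np.frombuffer(padded, dtype=np.uint8).astype(np.float64)


def forward(layers, x):
    """Apply Linear + ReLU on every layer except the last, which stays linear."""
    for k, layer in enumerate(layers):
        x = layer["w"] @ x + layer["b"]
        if k < len(layers) - 1:
            x = np.maximum(x, 0.0)
    return x
```

Because the weights are integers, running in float64 keeps every intermediate value exact as long as magnitudes stay below 2^53.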

Cell output - forward pass table

Input                 Output   Expected   OK?
----------------------------------------------------------
'bitter lesson'       1.0      1.0        [OK]
'vegetable dog'       -15.0    -15.0      [OK]
'hello world'         -15.0    -15.0      [OK]
'the bitter lesson'   -15.0    -15.0      [OK]
'Bitter Lesson'       -15.0    -15.0      [OK]
[OK] All forward-pass checks pass.

Case sensitivity matters: "Bitter Lesson" with capitals has a different MD5 and returns −15. The exact string "bitter lesson" - lowercase, two words, no punctuation - is the only tested input that passes.

Inverting the Hash

MD5 is not reversible by design, but the search space for a two-word English phrase is small. We run a targeted vocabulary check first: phrases drawn from Rich Sutton's 2019 essay The Bitter Lesson, then fall back to a parallelised brute-force over a curated word list if needed.
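The two-stage search can be sketched as follows. This version is sequential rather than parallelised, and the function names, the candidate list, and the vocabulary are all placeholders rather than the notebook's actual contents.

```python
import hashlib
from itertools import product


def md5_hex(text):
    """Hex MD5 digest of an ASCII string."""
    return hashlib.md5(text.encode("ascii")).hexdigest()


def find_phrase(target_hex, candidates, vocab):
    """Try `candidates` first, then every 'w1 w2' pair drawn from `vocab`."""
    for phrase in candidates:
        if md5_hex(phrase) == target_hex:
            return phrase, "targeted"
    # Fallback: len(vocab) ** 2 two-word combinations.
    for w1, w2 in product(vocab, repeat=2):
        phrase = f"{w1} {w2}"
        if md5_hex(phrase) == target_hex:
            return phrase, "brute-force"
    return None, "not found"
```

The second element of the return value records which stage produced the hit, which makes it easy to log whether the fallback was needed.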

Cell output - solver

[OK] TARGETED HIT → 'bitter lesson'
Targeted search already solved the puzzle - skipping brute-force.

The targeted search hits on the first entry in the vocabulary list. The brute-force path (parallelised over a ~80-word vocabulary, ~6,400 two-word combinations) is included as a fallback in the notebook and would complete in under a second.

Verification

Final verification - dual confirmation (MD5 + network forward pass)

Solution string: "bitter lesson"
MD5(solution): c7ef65233c40aa32c2b9ace37595fa7c
Extracted target: c7ef65233c40aa32c2b9ace37595fa7c
Hash match: True ✓
Network output: 1.0
Network accepts: True ✓
Verification: VERIFIED, confirmed by both MD5 and full network forward pass.

What the Network Is

The model is a hand-crafted MD5 circuit implemented entirely in 2,721 stacked linear layers with ReLU activations. The bulk of those layers compute the MD5 hash of the input byte-by-byte. The final two layers perform the equality check against a target hash stored in the penultimate biases.

The network is not trained. Every weight is an integer. The 1.1 GB size is a consequence of implementing MD5's bitwise operations as matrix multiplications - a highly redundant representation, but one that a standard PyTorch inference pass handles correctly.

The payload inside the weights says "bitter lesson" - the title of Rich Sutton's essay arguing that general-purpose methods leveraging compute consistently beat methods encoding human knowledge. The puzzle is self-referential: a network that looks like it might know something deep is actually just running a hash check, and the answer it's checking for is a reminder to trust computation over human intuition.

The approach that worked: ignore the network's apparent complexity, open the archive directly, inspect the final two layers, read the biases, recover the 128-bit MD5, and search a small vocabulary. Total runtime roughly 1.2 seconds. The 2,719 intermediate layers doing the MD5 computation were never needed for the reversal.

Methodology Notes

Architecture recovery without pickle parsing

The (weight, bias) pairing and the divisibility condition len(weight) % len(bias) == 0 are enough to identify every Linear layer in this checkpoint without reading data.pkl. All 2,721 layers recover cleanly.

Hash extraction is exact

The three bias groups provide a built-in consistency check. All three groups agree to within floating-point precision (the biases are integer-valued, so agreement is exact). The extracted hash is deterministic and reproducible on any copy of the checkpoint.