cryptopals

Matasano's cryptopals challenges (cryptopals.com).
git clone git://git.jtobin.io/cryptopals.git
Log | Files | Refs | README | LICENSE

s4.md (16759B)


      1 ### Set 4
      2 
      3 #### 4.25
      4 
      5 If we can control the offset and plaintext input, then we can ask
      6 the oracle to encrypt a plaintext with the same length of the
      7 ciphertext at offset 0. *Et voilà*, there's our keystream, which we
      8 can just xor with the original ciphertext in order to decrypt it.
      9 Cryptopals.Stream.Attacks.rawrCtrAttack implements such an "attack":
     10 
     11     rawrCtrAttack :: IO BS.ByteString
     12     rawrCtrAttack = do
     13       cip <- rawrCtrOracle (maxBound :: Int) mempty
     14       let l = BS.length cip
     15           p = BS.replicate l 65
     16 
     17       new <- rawrCtrOracle 0 p
     18       let ks = new `CU.fixedXor` p
     19 
     20       pure $ ks `CU.fixedXor` cip
     21 
     22 which gives the expected plaintext:
     23 
     24     > fmap (BS.take 33) rawrCtrAttack
     25     "I'm back and I'm ringin' the bell"
     26 
     27 #### 4.26
     28 
     29 After making the necessary adjustments, we can generate an analogous
     30 ciphertext via:
     31 
     32     > -- Cryptopals.Stream.Attacks.bfcEncrypter
     33     > let cip = bfcEncrypter "AAAAA!admin!true"
     34 
     35 Encoded as hex, this one looks like:
     36 
     37     911197ace68288173aeea79803ba30ff comment1=cooking
     38     7a45f6350fefac4ff9b4f1277500e130 %20MCs;userdata=
     39     843d67e24f8af8305928b16352b943e9 AAAAA!admin!true
     40     39e33d0551e309a752c2bb658a8ba9ee ;comment=%20like
     41     dda6c8de9da59acc48f4d83f0d63fb8b %20a%20pound%20o
     42     4969b8aac52e78b701c8da0f253efe59 fbacon__________
     43 
     44 Here we want to target the desired bytes themselves, not the bytes in
     45 a previous block as was done for the attack on CBC mode. Since we know
     46 the ciphertext and plaintext byte values, we can trivially recover the
     47 keystream at those bytes (i.e. at indices 37 and 43). Then it's just a
     48 matter of replacing the ciphertext bytes by the keystream bytes XOR'd by
     49 the desired plaintext:
     50 
     51     > -- Cryptopals.Stream.Attacks.bfcEncrypter
     52     > let cip = bfcEncrypter "AAAAA!admin!true"
     53     > :{
     54     ghci| let munge cip = loop 0 mempty cip where
     55     ghci|       p = fi (C.ord '!')
     56     ghci|       s = fi (C.ord ';')
     57     ghci|       e = fi (C.ord '=')
     58     ghci|       loop j acc !bs = case BS.uncons bs of
     59     ghci|         Nothing -> acc
     60     ghci|         Just (b, etc)
     61     ghci|           | j == 37 -> let c = BS.index cip j
     62     ghci|                            k = c `B.xor` p
     63     ghci|                            vil = k `B.xor` s
     64     ghci|                            nex = BS.snoc acc vil
     65     ghci|                        in  loop (succ j) nex etc
     66     ghci|           | j == 43 -> let c = BS.index cip j
     67     ghci|                            k = c `B.xor` p
     68     ghci|                            vil = k `B.xor` e
     69     ghci|                            nex = BS.snoc acc vil
     70     ghci|                        in  loop (succ j) nex etc
     71     ghci|           | otherwise -> loop (succ j) (BS.snoc acc b) etc
     72     ghci| :}
     73     > let munged = munge cip
     74     > bfcChecker cip
     75     False
     76     > bfcChecker munged
     77     True
     78 
     79 #### 4.27
     80 
     81 This one works exactly as advertised. The ivl{en, de}cryptCbcAES128
     82 functions in Cryptopals.Block.Attacks will {en, de}crypt inputs in CBC
     83 mode using identical key and IV's, and ivlVerifier serves as the desired
     84 oracle.
     85 
     86 First, assembling the nasty ciphertext:
     87 
     88     > let b = "YELLOW SUBMARINE"
     89     > B16.encodeBase16 consistentKey
     90     "d18a7e96a50f45cb9b928e502c2b310d"
     91     > let cip = ivlEncryptCbcAES128 consistentKey (b <> b <> b)
     92     > let cs = CU.chunks 16 cip
     93     > let mcip = cs !! 0 <> BS.replicate 16 0 <> cs !! 0
     94 
     95 And now recovering the key:
     96 
     97     > let Left mpay = ivlVerifier mcip
     98     > let ps = CU.chunks 16 mpay
     99     > B16.encodeBase16 $ (ps !! 0) `CU.fixedXor` (ps !! 2)
    100     "d18a7e96a50f45cb9b928e502c2b310d"
    101 
    102 As for how this works: refer back to the omnipresent CBC-mode decryption
    103 scheme from 2.16 (here modified):
    104 
    105     for ciphertext                    c = (c_1, c_2, c_3)
    106         block decryption w/key k      dec_k
    107         xor operator                  +
    108 
    109     let p_1 = dec_k(c_1) + k
    110         p_2 = dec_k(c_2) + c_1
    111         p_3 = dec_k(c_3) + c_2
    112 
    113     in  plaintext                     p = (p_1, p_2, p_3)
    114 
    115 So if we provide the modified `c = (c_1, 0, c_1)`, decryption will give us:
    116 
    117     p_1' = dec_k(c_1) + k
    118     p_2' = dec_k(0) + c_1
    119     p_3' = dec_k(c_1) + 0
    120 
    121 such that, trivially:
    122 
    123     p_1' + p_3' = dec_k(c_1) + k + dec_k(c_1) + 0
    124                 = k.
    125 
    126 #### 4.28
    127 
    128 Using the SHA1 (Secure Hashing Algorithm) implementation from the 'sha'
    129 package under the hood, Cryptopals.MAC.sha1mac implements the desired
    130 MAC (i.e. message authentication code):
    131 
    132     > let mac = sha1mac "YELLOW SUBMARINE" "question 4.28"
    133     > B16.encodeBase16 . BSL.toStrict $ mac
    134     "45b5bb1ab02988df4609ff1227c90fe997236719"
    135 
    136 verifysha1mac verifies a MAC given a key and message:
    137 
    138     > verifysha1mac "YELLOW SUBMARINE" mac "question 4.28"
    139     True
    140 
    141 and we obviously can't tamper with anything without the MAC failing to
    142 verify:
    143 
    144     > verifysha1mac "YELLOW SUBMARINE" mac "question 4.29"
    145     False
    146 
    147 #### 4.29
    148 
    149 So, length extension on SHA-1. A preliminary note: "MD
    150 padding" refers to Merkle-Damgård compliant padding, described
    151 [here](https://en.wikipedia.org/wiki/Merkle%E2%80%93Damg%C3%A5rd_constru
    152 ction) and concretely elaborated on in
    153 [RFC1321](http://www.faqs.org/rfcs/rfc1321.html), section 3.1, and
    154 [RFC3174](http://www.faqs.org/rfcs/rfc3174.html), section 4. The latter
    155 contains the TLDR summary:
    156 
    157 > The purpose of message padding is to make the total length of a padded
    158 > message a multiple of 512. SHA-1 sequentially processes blocks of
    159 > 512 bits when computing the message digest. The following specifies
    160 > how this padding shall be performed. As a summary, a "1" followed by
    161 > m "0"s followed by a 64- bit integer are appended to the end of the
    162 > message to produce a padded message of length 512 * n. The 64-bit
    163 > integer is the length of the original message.
    164 
    165 The crux of this attack is that, like the Mersenne Twister, the SHA1
    166 algorithm (indeed, every SHA algorithm) maintains internal state.
    167 In the Mersenne Twister we use the state and temper it ("twisting"
    168 occasionally) to produce outputs; in SHA1, the output on termination
    169 of the (padded) input *is* the internal state, i.e. five 32-bit words,
    170 glued together. Any hash output thus corresponds to the internal state
    171 of the algorithm at the end of some (padded) input.
    172 
    173 Here, as the attacker, we know the message -- but not the key -- used
    174 to produce the MAC, and we know the MAC itself. The MAC corresponds
    175 to the internal state of SHA1 when we finished hashing the original
    176 `key <> message` input, padded to a multiple of 512 bits. So, we can
    177 reinitialise SHA1 with that state and continue hashing further input,
    178 the goal being to produce a MAC that verifies for an arbitrary extension
    179 of the original message without knowledge of the secret key.
    180 
    181 (N.b., two things in this challenge drove me crazy. The first was
    182 getting the padding right everywhere, something which broke my brain
    183 for awhile until things eventually slotted into place. The second was a
    184 really, really stupid error, in which I accidentally treated a 40-byte
    185 UTF8-encoded hex string as a 20-byte bytestring, messing up the internal
    186 state of SHA1 whenever I'd restore it from an output. That was annoying
    187 to track down.)
    188 
    189 *Alors*. Various functions in Cryptopals.MAC.Attacks will be employed to
    190 forge a MAC from another MAC. First, as recommended, let's hack together
    191 something to grab a key from /usr/share/dict:
    192 
    193     key :: IO BSL.ByteString
    194     key = do
    195       gen <- MWC.createSystemRandom
    196       idx <- MWC.uniformR (0, 235885) gen
    197       dict <- BL8.readFile "/usr/share/dict/words"
    198       let ls = BL8.lines dict
    199       pure $ ls !! idx
    200 
    201 and then grab one and produce a mac ('raw' is the given input text):
    202 
    203     > k <- key
    204     > let mac = CM.sha1mac k raw
    205 
    206 Now, the evil message for which we will forge a MAC. This evil message
    207 must include the *original* padding of the 'key + message' input used to
    208 produce the MAC, since SHA1 stopped hashing on completion of pad(key +
    209 message). All we know is that the message length is at least the same as
    210 'raw', and, for nontrivial keys, is strictly more.
    211 
    212 Similarly, to verify integrity, one computes sha1(key + message) and
    213 checks that it equals the provided MAC. I.e., for an evil message and
    214 forged MAC, one checks that:
    215 
    216     sha1(key + evil) == forged
    217 
    218 SHA1 will terminate at pad(key + evil), which includes the total
    219 message length of 'key + evil'. So we must ensure that our resumed
    220 length-extension hash uses this padding.
    221 
    222 As best I can tell, in order to guess productively at the key length
    223 used to construct the original MAC, we need access to an oracle that
    224 can validate message/MAC pairs for us. A sort of wonky way to simulate
    225 this is via the following 'leasha1' procedure that, while interacting with
    226 something that needs a key, doesn't make use of a key itself:
    227 
    228     leasha1
    229       :: BSL.ByteString
    230       -> BSL.ByteString
    231       -> BSL.ByteString
    232       -> R.Reader BSL.ByteString (BSL.ByteString, BSL.ByteString)
    233     leasha1 input mac addl = loop 0 where
    234       loop j = do
    235         let len = fromIntegral $ BSL.length input
    236             evil = pad (len + j) input <> addl
    237             rs   = inject mac
    238             p    = fromIntegral (BSL.length evil) + j
    239             forged = sha1 rs p addl
    240         validates <- oracleValidates evil forged
    241         if   validates
    242         then pure (evil, forged)
    243         else loop (succ j)
    244 
    245       oracleValidates msg mac = do
    246         k <- R.ask
    247         pure $ CM.verifysha1mac k mac msg
    248 
    249 'sha1' here calls the modified SHA1 allowing us to 1) initialize its
    250 internal state from the provided registers, and 2) use the specified
    251 message length for padding, instead of calculating it from the provided
    252 bytestring.
    253 
    254 So, with all that, let's *cackles* forge the evil message and a MAC that
    255 will validate for it. 'mal' is the malicious text ";admin=true":
    256 
    257     > let (evil, forged) = R.runReader (leasha1 raw mac mal) k
    258     > evil
    259     "comment1=cooking%20MCs;userdata=foo;comment2=%20like%20a%20pound%20of%20ba
    260     con\128\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL
    261     \NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NU
    262     L\NUL\NUL\NUL\STX\176;admin=true"
    263     > B16.encodeBase16 . BSL.toStrict $ forged
    264     "0ad748ec8eef1a0b510b01b8f9ff692cf050bd15"
    265     > CM.verifysha1mac k forged evil
    266     True
    267 
    268 #### 4.30
    269 
    270 I grabbed and proceeded to butcher [this
    271 guy's](https://github.com/mfeyg/md4/tree/master) MD4 implementation. He
    272 evidently likes to keep me on my toes by doing everything little-endian;
    273 after making sure everything conforms to that, the story is exactly the
    274 same as the last challenge. Here's the length extension attack that uses
    275 access to a verifying oracle:
    276 
    277     leamd4
    278       :: BSL.ByteString
    279       -> BSL.ByteString
    280       -> BSL.ByteString
    281       -> R.Reader BSL.ByteString (BSL.ByteString, BSL.ByteString)
    282     leamd4 input mac addl = loop 0 where
    283       loop j = do
    284         let len = fromIntegral $ BSL.length input
    285             evil = padle (len + j) input <> addl
    286             rs   = injectMd4 mac
    287             p    = fromIntegral (BSL.length evil) + j
    288             forged = md4 rs p addl
    289         validates <- oracleValidates evil forged
    290         if   validates
    291         then pure (evil, forged)
    292         else loop (succ j)
    293 
    294       oracleValidates msg mac = do
    295         k <- R.ask
    296         pure $ CM.verifymd4mac k mac msg
    297 
    298 Again, 'md4' is a modified implementation allowing us to resume from an
    299 arbitrary state and use devious padding. Let's give it a whirl:
    300 
    301     > k <- key
    302     > let mac = CM.md4mac k raw
    303     > let (evil, forged) = R.runReader (leamd4 raw mac mal) k
    304     > B16.encodeBase16 . BSL.toStrict $ forged
    305     "289e55e2fd99091f1b4e09e1ac4167f3"
    306     > CM.verifymd4mac k forged evil
    307     True
    308 
    309 #### 4.31
    310 
    311 For this exercise we'll switch it up and write some JavaScript. etc/hmac
    312 contains some JS HMAC utilities (in hmac.js) and a tiny express.js app
    313 (in index.js) that will verify messages and MACs it receives via query
    314 parameters.
    315 
    316 HMAC (Hash-based MAC) is not vulnerable to length-extension attacks
    317 due the presence of multiple (at least two, and potentially more)
    318 enveloping calls to the hash function when creating the MAC. Ferguson et
    319 al. recommended a similar approach in *Cryptography Engineering* when
    320 dealing with Merkle-Damgård hash functions, in which, for hash function
    321 h and message m, instead of using h(m) one uses h(h(m) || m). The HMAC
    322 construction, which goes:
    323 
    324     hmac(k, m) = h((k' + 0x5c) || h((k' + 0x36) || m))
    325 
    326 for:
    327 
    328     k' = { h(k)   for k > block size
    329          { k      otherwise
    330 
    331 where + is single-byte XOR and || is concatenation, is more or less the
    332 same idea, except.. moreso.
    333 
    334 Anyway. The idea here is that we have a webserver that does
    335 byte-at-a-time comparison with the supplied MAC, sleeping a little
    336 per byte verified and bailing out early if equality isn't found. So,
    337 we start guessing MACs by working on one byte at a time and checking
    338 whether the server takes appreciably longer to reject what we supply.
    339 If it does, then we probably know that we've matched the value for that
    340 byte, and so we move onto the next one, repeating the same procedure.
    341 
    342 The server does what is expected of it. etc/crackhmac.sh is a
    343 particularly-hacky bash script that calls curl in the requisite loop,
    344 hammering the server with different MACs until it reports verification.
    345 
    346 So, boot up the server via 'node index.js' and elsewhere do:
    347 
    348     $ ./crackhmac.sh "secrets.csv"
    349     present MAC guess: 0000000000000000000000000000000000000000
    350     working on next byte (hexstring index 0)..
    351     found byte b0
    352     present MAC guess: b000000000000000000000000000000000000000
    353     working on next byte (hexstring index 2)..
    354     found byte 94
    355     present MAC guess: b094000000000000000000000000000000000000
    356     working on next byte (hexstring index 4)..
    357     found byte e8
    358     present MAC guess: b094e80000000000000000000000000000000000
    359     working on next byte (hexstring index 6)..
    360     [..]
    361 
    362 N.b., the script is sort of janky, so there are facilities for resuming
    363 the loop in case the next byte can't be found due to timing weirdness.
    364 Append as arguments the index we were working on, the time in seconds
    365 the last comparison took, and the MAC we've extracted thus far, e.g.:
    366 
    367     $ ./crackhmac.sh "secrets.csv" 20 "0.57" "b094e8b74021e638ca4a"
    368 
    369 I probably shouldn't have chosen to do this with the shell, in
    370 hindsight, and I can immediately think of a better way to handle the
    371 timing variation. But in any case, after a while we should recover the
    372 secret MAC:
    373 
    374     working on next byte (hexstring index 38)..
    375     succeeded
    376     file: secrets.csv
    377     hmac: b094e8b74021e638ca4a65a9ac466cc7fde35c03
    378 
    379 And we can check our result manually to see HTTP status code 200 (the
    380 'safe' query parameter determines whether we do a sane comparison or
    381 this insane bytewise sleep thingy):
    382 
    383     $ file="secrets.csv"
    384     $ hmac="b094e8b74021e638ca4a65a9ac466cc7fde35c03"
    385     $ curl -o /dev/null --silent -Iw "%{http_code}\n" \
    386         "localhost:3000/hmac?safe=false&file=$file&signature=$hmac"
    387     200
    388 
    389 #### 4.32
    390 
    391 Here we do the same thing as in the last challenge, but now it's harder.
    392 
    393 > I probably shouldn't have chosen to do this with the shell, in
    394 > hindsight, and I can immediately think of a better way to handle the
    395 > timing variation.
    396 
    397 This is where we atone for our previous hacky bash scripts and write
    398 something a little more l33t. In this case we'll loop over bytes a
    399 few times each, collect timings, and then choose byte values on a
    400 statistical basis.
    401 
    402 Cryptopals.MAC.Attacks implements a few functions for poking the server,
    403 gathering timing samples, and choosing byte values. The thing that
    404 required the most fiddling was selecting the number of samples to take
    405 for each byte. A size of three didn't cut it, five often but not always
    406 cut it, and ten seemed to usually cut it, all using a boring average to
    407 aggregate the samples. Then I reckoned I could try taking less samples,
    408 but compare a more robust statistic for central tendency -- i.e., the
    409 median -- and that indeed worked really well. I converged on using seven
    410 samples, so that the median could be calculated by merely plucking out
    411 the middle sorted element. As a rule, the median is awesome.
    412 
    413 With that all in place, let's crack us a HMAC for the provided "file."
    414 We get periodic updates as more bytes are found:
    415 
    416     > mac <- crackHmac "secrets.csv"
    417     current guess: "34"
    418     current guess: "34ad"
    419     current guess: "34ade1"
    420     current guess: "34ade1f0"
    421     [..]
    422 
    423 when all is said and done, we recover our MAC:
    424 
    425     > B16.encodeBase16 mac
    426     34ade1f0a9c49af9bea5dea3f949b4b07582ba0d
    427 
    428 and can verify it validates properly for our "file":
    429 
    430     $ file="secrets.csv"
    431     $ hmac="34ade1f0a9c49af9bea5dea3f949b4b07582ba0d"
    432     $ curl -o /dev/null --silent -Iw "%{http_code}\n" \
    433         "localhost:3000/hmac?safe=true&file=$file&signature=$hmac"
    434     200
    435 
    436 A HTTP 200 status code, as expected.