s4.md (16759B)
1 ### Set 4 2 3 #### 4.25 4 5 If we can control the offset and plaintext input, then we can ask 6 the oracle to encrypt a plaintext with the same length of the 7 ciphertext at offset 0. *Et voilà*, there's our keystream, which we 8 can just xor with the original ciphertext in order to decrypt it. 9 Cryptopals.Stream.Attacks.rawrCtrAttack implements such an "attack": 10 11 rawrCtrAttack :: IO BS.ByteString 12 rawrCtrAttack = do 13 cip <- rawrCtrOracle (maxBound :: Int) mempty 14 let l = BS.length cip 15 p = BS.replicate l 65 16 17 new <- rawrCtrOracle 0 p 18 let ks = new `CU.fixedXor` p 19 20 pure $ ks `CU.fixedXor` cip 21 22 which gives the expected plaintext: 23 24 > fmap (BS.take 33) rawrCtrAttack 25 "I'm back and I'm ringin' the bell" 26 27 #### 4.26 28 29 After making the necessary adjustments, we can generate an analogous 30 ciphertext via: 31 32 > -- Cryptopals.Stream.Attacks.bfcEncrypter 33 > let cip = bfcEncrypter "AAAAA!admin!true" 34 35 Encoded as hex, this one looks like: 36 37 911197ace68288173aeea79803ba30ff comment1=cooking 38 7a45f6350fefac4ff9b4f1277500e130 %20MCs;userdata= 39 843d67e24f8af8305928b16352b943e9 AAAAA!admin!true 40 39e33d0551e309a752c2bb658a8ba9ee ;comment=%20like 41 dda6c8de9da59acc48f4d83f0d63fb8b %20a%20pound%20o 42 4969b8aac52e78b701c8da0f253efe59 fbacon__________ 43 44 Here we want to target the desired bytes themselves, not the bytes in 45 a previous block as was done for the attack on CBC mode. Since we know 46 the ciphertext and plaintext byte values, we can trivially recover the 47 keystream at those bytes (i.e. at indices 37 and 43). Then it's just a 48 matter of replacing the ciphertext bytes by the keystream bytes XOR'd by 49 the desired plaintext: 50 51 > -- Cryptopals.Stream.Attacks.bfcEncrypter 52 > let cip = bfcEncrypter "AAAAA!admin!true" 53 > :{ 54 ghci| let munge cip = loop 0 mempty cip where 55 ghci| p = fi (C.ord '!') 56 ghci| s = fi (C.ord ';') 57 ghci| e = fi (C.ord '=') 58 ghci| loop j acc !bs = case BS.uncons bs of 59 ghci| Nothing -> acc 60 ghci| Just (b, etc) 61 ghci| | j == 37 -> let c = BS.index cip j 62 ghci| k = c `B.xor` p 63 ghci| vil = k `B.xor` s 64 ghci| nex = BS.snoc acc vil 65 ghci| in loop (succ j) nex etc 66 ghci| | j == 43 -> let c = BS.index cip j 67 ghci| k = c `B.xor` p 68 ghci| vil = k `B.xor` e 69 ghci| nex = BS.snoc acc vil 70 ghci| in loop (succ j) nex etc 71 ghci| | otherwise -> loop (succ j) (BS.snoc acc b) etc 72 ghci| :} 73 > let munged = munge cip 74 > bfcChecker cip 75 False 76 > bfcChecker munged 77 True 78 79 #### 4.27 80 81 This one works exactly as advertised. The ivl{en, de}cryptCbcAES128 82 functions in Cryptopals.Block.Attacks will {en, de}crypt inputs in CBC 83 mode using identical key and IV's, and ivlVerifier serves as the desired 84 oracle. 85 86 First, assembling the nasty ciphertext: 87 88 > let b = "YELLOW SUBMARINE" 89 > B16.encodeBase16 consistentKey 90 "d18a7e96a50f45cb9b928e502c2b310d" 91 > let cip = ivlEncryptCbcAES128 consistentKey (b <> b <> b) 92 > let cs = CU.chunks 16 cip 93 > let mcip = cs !! 0 <> BS.replicate 16 0 <> cs !! 0 94 95 And now recovering the key: 96 97 > let Left mpay = ivlVerifier mcip 98 > let ps = CU.chunks 16 mpay 99 > B16.encodeBase16 $ (ps !! 0) `CU.fixedXor` (ps !! 2) 100 "d18a7e96a50f45cb9b928e502c2b310d" 101 102 As for how this works: refer back to the omnipresent CBC-mode decryption 103 scheme from 2.16 (here modified): 104 105 for ciphertext c = (c_1, c_2, c_3) 106 block decryption w/key k dec_k 107 xor operator + 108 109 let p_1 = dec_k(c_1) + k 110 p_2 = dec_k(c_2) + c_1 111 p_3 = dec_k(c_3) + c_2 112 113 in plaintext p = (p_1, p_2, p_3) 114 115 So if we provide the modified `c = (c_1, 0, c_1)`, decryption will give us: 116 117 p_1' = dec_k(c_1) + k 118 p_2' = dec_k(0) + c_1 119 p_3' = dec_k(c_1) + 0 120 121 such that, trivially: 122 123 p_1' + p_3' = dec_k(c_1) + k + dec_k(c_1) + 0 124 = k. 125 126 #### 4.28 127 128 Using the SHA1 (Secure Hashing Algorithm) implementation from the 'sha' 129 package under the hood, Cryptopals.MAC.sha1mac implements the desired 130 MAC (i.e. message authentication code): 131 132 > let mac = sha1mac "YELLOW SUBMARINE" "question 4.28" 133 > B16.encodeBase16 . BSL.toStrict $ mac 134 "45b5bb1ab02988df4609ff1227c90fe997236719" 135 136 verifysha1mac verifies a MAC given a key and message: 137 138 > verifysha1mac "YELLOW SUBMARINE" mac "question 4.28" 139 True 140 141 and we obviously can't tamper with anything without the MAC failing to 142 verify: 143 144 > verifysha1mac "YELLOW SUBMARINE" mac "question 4.29" 145 False 146 147 #### 4.29 148 149 So, length extension on SHA-1. A preliminary note: "MD 150 padding" refers to Merkle-Damgård compliant padding, described 151 [here](https://en.wikipedia.org/wiki/Merkle%E2%80%93Damg%C3%A5rd_constru 152 ction) and concretely elaborated on in 153 [RFC1321](http://www.faqs.org/rfcs/rfc1321.html), section 3.1, and 154 [RFC3174](http://www.faqs.org/rfcs/rfc3174.html), section 4. The latter 155 contains the TLDR summary: 156 157 > The purpose of message padding is to make the total length of a padded 158 > message a multiple of 512. SHA-1 sequentially processes blocks of 159 > 512 bits when computing the message digest. The following specifies 160 > how this padding shall be performed. As a summary, a "1" followed by 161 > m "0"s followed by a 64- bit integer are appended to the end of the 162 > message to produce a padded message of length 512 * n. The 64-bit 163 > integer is the length of the original message. 164 165 The crux of this attack is that, like the Mersenne Twister, the SHA1 166 algorithm (indeed, every SHA algorithm) maintains internal state. 167 In the Mersenne Twister we use the state and temper it ("twisting" 168 occasionally) to produce outputs; in SHA1, the output on termination 169 of the (padded) input *is* the internal state, i.e. five 32-bit words, 170 glued together. Any hash output thus corresponds to the internal state 171 of the algorithm at the end of some (padded) input. 172 173 Here, as the attacker, we know the message -- but not the key -- used 174 to produce the MAC, and we know the MAC itself. The MAC corresponds 175 to the internal state of SHA1 when we finished hashing the original 176 `key <> message` input, padded to a multiple of 512 bits. So, we can 177 reinitialise SHA1 with that state and continue hashing further input, 178 the goal being to produce a MAC that verifies for an arbitrary extension 179 of the original message without knowledge of the secret key. 180 181 (N.b., two things in this challenge drove me crazy. The first was 182 getting the padding right everywhere, something which broke my brain 183 for awhile until things eventually slotted into place. The second was a 184 really, really stupid error, in which I accidentally treated a 40-byte 185 UTF8-encoded hex string as a 20-byte bytestring, messing up the internal 186 state of SHA1 whenever I'd restore it from an output. That was annoying 187 to track down.) 188 189 *Alors*. Various functions in Cryptopals.MAC.Attacks will be employed to 190 forge a MAC from another MAC. First, as recommended, let's hack together 191 something to grab a key from /usr/share/dict: 192 193 key :: IO BSL.ByteString 194 key = do 195 gen <- MWC.createSystemRandom 196 idx <- MWC.uniformR (0, 235885) gen 197 dict <- BL8.readFile "/usr/share/dict/words" 198 let ls = BL8.lines dict 199 pure $ ls !! idx 200 201 and then grab one and produce a mac ('raw' is the given input text): 202 203 > k <- key 204 > let mac = CM.sha1mac k raw 205 206 Now, the evil message for which we will forge a MAC. This evil message 207 must include the *original* padding of the 'key + message' input used to 208 produce the MAC, since SHA1 stopped hashing on completion of pad(key + 209 message). All we know is that the message length is at least the same as 210 'raw', and, for nontrivial keys, is strictly more. 211 212 Similarly, to verify integrity, one computes sha1(key + message) and 213 checks that it equals the provided MAC. I.e., for an evil message and 214 forged MAC, one checks that: 215 216 sha1(key + evil) == forged 217 218 SHA1 will terminate at pad(key + evil), which includes the total 219 message length of 'key + evil'. So we must ensure that our resumed 220 length-extension hash uses this padding. 221 222 As best I can tell, in order to guess productively at the key length 223 used to construct the original MAC, we need access to an oracle that 224 can validate message/MAC pairs for us. A sort of wonky way to simulate 225 this is via the following 'leasha1' procedure that, while interacting with 226 something that needs a key, doesn't make use of a key itself: 227 228 leasha1 229 :: BSL.ByteString 230 -> BSL.ByteString 231 -> BSL.ByteString 232 -> R.Reader BSL.ByteString (BSL.ByteString, BSL.ByteString) 233 leasha1 input mac addl = loop 0 where 234 loop j = do 235 let len = fromIntegral $ BSL.length input 236 evil = pad (len + j) input <> addl 237 rs = inject mac 238 p = fromIntegral (BSL.length evil) + j 239 forged = sha1 rs p addl 240 validates <- oracleValidates evil forged 241 if validates 242 then pure (evil, forged) 243 else loop (succ j) 244 245 oracleValidates msg mac = do 246 k <- R.ask 247 pure $ CM.verifysha1mac k mac msg 248 249 'sha1' here calls the modified SHA1 allowing us to 1) initialize its 250 internal state from the provided registers, and 2) use the specified 251 message length for padding, instead of calculating it from the provided 252 bytestring. 253 254 So, with all that, let's *cackles* forge the evil message and a MAC that 255 will validate for it. 'mal' is the malicious text ";admin=true": 256 257 > let (evil, forged) = R.runReader (leasha1 raw mac mal) k 258 > evil 259 "comment1=cooking%20MCs;userdata=foo;comment2=%20like%20a%20pound%20of%20ba 260 con\128\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL 261 \NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NU 262 L\NUL\NUL\NUL\STX\176;admin=true" 263 > B16.encodeBase16 . BSL.toStrict $ forged 264 "0ad748ec8eef1a0b510b01b8f9ff692cf050bd15" 265 > CM.verifysha1mac k forged evil 266 True 267 268 #### 4.30 269 270 I grabbed and proceeded to butcher [this 271 guy's](https://github.com/mfeyg/md4/tree/master) MD4 implementation. He 272 evidently likes to keep me on my toes by doing everything little-endian; 273 after making sure everything conforms to that, the story is exactly the 274 same as the last challenge. Here's the length extension attack that uses 275 access to a verifying oracle: 276 277 leamd4 278 :: BSL.ByteString 279 -> BSL.ByteString 280 -> BSL.ByteString 281 -> R.Reader BSL.ByteString (BSL.ByteString, BSL.ByteString) 282 leamd4 input mac addl = loop 0 where 283 loop j = do 284 let len = fromIntegral $ BSL.length input 285 evil = padle (len + j) input <> addl 286 rs = injectMd4 mac 287 p = fromIntegral (BSL.length evil) + j 288 forged = md4 rs p addl 289 validates <- oracleValidates evil forged 290 if validates 291 then pure (evil, forged) 292 else loop (succ j) 293 294 oracleValidates msg mac = do 295 k <- R.ask 296 pure $ CM.verifymd4mac k mac msg 297 298 Again, 'md4' is a modified implementation allowing us to resume from an 299 arbitrary state and use devious padding. Let's give it a whirl: 300 301 > k <- key 302 > let mac = CM.md4mac k raw 303 > let (evil, forged) = R.runReader (leamd4 raw mac mal) k 304 > B16.encodeBase16 . BSL.toStrict $ forged 305 "289e55e2fd99091f1b4e09e1ac4167f3" 306 > CM.verifymd4mac k forged evil 307 True 308 309 #### 4.31 310 311 For this exercise we'll switch it up and write some JavaScript. etc/hmac 312 contains some JS HMAC utilities (in hmac.js) and a tiny express.js app 313 (in index.js) that will verify messages and MACs it receives via query 314 parameters. 315 316 HMAC (Hash-based MAC) is not vulnerable to length-extension attacks 317 due the presence of multiple (at least two, and potentially more) 318 enveloping calls to the hash function when creating the MAC. Ferguson et 319 al. recommended a similar approach in *Cryptography Engineering* when 320 dealing with Merkle-Damgård hash functions, in which, for hash function 321 h and message m, instead of using h(m) one uses h(h(m) || m). The HMAC 322 construction, which goes: 323 324 hmac(k, m) = h((k' + 0x5c) || h((k' + 0x36) || m)) 325 326 for: 327 328 k' = { h(k) for k > block size 329 { k otherwise 330 331 where + is single-byte XOR and || is concatenation, is more or less the 332 same idea, except.. moreso. 333 334 Anyway. The idea here is that we have a webserver that does 335 byte-at-a-time comparison with the supplied MAC, sleeping a little 336 per byte verified and bailing out early if equality isn't found. So, 337 we start guessing MACs by working on one byte at a time and checking 338 whether the server takes appreciably longer to reject what we supply. 339 If it does, then we probably know that we've matched the value for that 340 byte, and so we move onto the next one, repeating the same procedure. 341 342 The server does what is expected of it. etc/crackhmac.sh is a 343 particularly-hacky bash script that calls curl in the requisite loop, 344 hammering the server with different MACs until it reports verification. 345 346 So, boot up the server via 'node index.js' and elsewhere do: 347 348 $ ./crackhmac.sh "secrets.csv" 349 present MAC guess: 0000000000000000000000000000000000000000 350 working on next byte (hexstring index 0).. 351 found byte b0 352 present MAC guess: b000000000000000000000000000000000000000 353 working on next byte (hexstring index 2).. 354 found byte 94 355 present MAC guess: b094000000000000000000000000000000000000 356 working on next byte (hexstring index 4).. 357 found byte e8 358 present MAC guess: b094e80000000000000000000000000000000000 359 working on next byte (hexstring index 6).. 360 [..] 361 362 N.b., the script is sort of janky, so there are facilities for resuming 363 the loop in case the next byte can't be found due to timing weirdness. 364 Append as arguments the index we were working on, the time in seconds 365 the last comparison took, and the MAC we've extracted thus far, e.g.: 366 367 $ ./crackhmac.sh "secrets.csv" 20 "0.57" "b094e8b74021e638ca4a" 368 369 I probably shouldn't have chosen to do this with the shell, in 370 hindsight, and I can immediately think of a better way to handle the 371 timing variation. But in any case, after a while we should recover the 372 secret MAC: 373 374 working on next byte (hexstring index 38).. 375 succeeded 376 file: secrets.csv 377 hmac: b094e8b74021e638ca4a65a9ac466cc7fde35c03 378 379 And we can check our result manually to see HTTP status code 200 (the 380 'safe' query parameter determines whether we do a sane comparison or 381 this insane bytewise sleep thingy): 382 383 $ file="secrets.csv" 384 $ hmac="b094e8b74021e638ca4a65a9ac466cc7fde35c03" 385 $ curl -o /dev/null --silent -Iw "%{http_code}\n" \ 386 "localhost:3000/hmac?safe=false&file=$file&signature=$hmac" 387 200 388 389 #### 4.32 390 391 Here we do the same thing as in the last challenge, but now it's harder. 392 393 > I probably shouldn't have chosen to do this with the shell, in 394 > hindsight, and I can immediately think of a better way to handle the 395 > timing variation. 396 397 This is where we atone for our previous hacky bash scripts and write 398 something a little more l33t. In this case we'll loop over bytes a 399 few times each, collect timings, and then choose byte values on a 400 statistical basis. 401 402 Cryptopals.MAC.Attacks implements a few functions for poking the server, 403 gathering timing samples, and choosing byte values. The thing that 404 required the most fiddling was selecting the number of samples to take 405 for each byte. A size of three didn't cut it, five often but not always 406 cut it, and ten seemed to usually cut it, all using a boring average to 407 aggregate the samples. Then I reckoned I could try taking less samples, 408 but compare a more robust statistic for central tendency -- i.e., the 409 median -- and that indeed worked really well. I converged on using seven 410 samples, so that the median could be calculated by merely plucking out 411 the middle sorted element. As a rule, the median is awesome. 412 413 With that all in place, let's crack us a HMAC for the provided "file." 414 We get periodic updates as more bytes are found: 415 416 > mac <- crackHmac "secrets.csv" 417 current guess: "34" 418 current guess: "34ad" 419 current guess: "34ade1" 420 current guess: "34ade1f0" 421 [..] 422 423 when all is said and done, we recover our MAC: 424 425 > B16.encodeBase16 mac 426 34ade1f0a9c49af9bea5dea3f949b4b07582ba0d 427 428 and can verify it validates properly for our "file": 429 430 $ file="secrets.csv" 431 $ hmac="34ade1f0a9c49af9bea5dea3f949b4b07582ba0d" 432 $ curl -o /dev/null --silent -Iw "%{http_code}\n" \ 433 "localhost:3000/hmac?safe=true&file=$file&signature=$hmac" 434 200 435 436 A HTTP 200 status code, as expected.