#### Zh3r0 CTF V2 - Real Mersenne

**Author**: Quintec**Date**:

Do you believe in games of luck? I hope you make your guesses real or you’ll be floating around.

`nc crypto.zh3r0.cf 4444`

# Source Analysis

```
1import random
2from secret import flag
3from fractions import Fraction
4
5def score(a,b):
6 if abs(a-b)<1/2**10:
7 # capping score to 1024 so you dont get extra lucky
8 return Fraction(2**10)
9 return Fraction(2**53,int(2**53*a)-int(2**53*b))
10
11total_score = 0
12for _ in range(2000):
13 try:
14 x = random.random()
15 y = float(input('enter your guess:\n'))
16 round_score = score(x,y)
17 total_score+=float(round_score)
18 print('total score: {:0.2f}, round score: {}'.format(
19 total_score,round_score))
20 if total_score>10**6:
21 print(flag)
22 exit(0)
23 except:
24 print('Error, exiting')
25 exit(1)
26else:
27 print('Maybe better luck next time')
```

We can see that we have to try to guess floats generated by `random.random()`

, and we have 2000 attempts to get 1,000,000 score. The strategy is then to clone the state of the `random`

with the first guesses, and then use at most 977 guesses in the end to get the required number of points.

# Python’s Random

Python’s random module uses a Mersenne Twister inside, as hinted by the title of the challenge. So how does a Mersenne Twister work? The short answer is through a series of states, which each consist of 624 integers (in most cases, including this one, 32-bit). When you ask for random values, some of these 624 values are used up, and each value is used only once. For example, if you call `random.getrandbits(32)`

, exactly one 32-bit state value is used, after being tempered (slightly modified), to make it harder to reverse. Once all 624 values have been used, they are “twisted” to generate a set of 624 new values, and the cycle repeats. (The generator has a period of $2^{19937} - 1$, which is why we sometimes refer to it as MT19937.)

But how does that produce floats between 0 and 1? Well let’s take a look at some source code.

```
1/* random_random is the function named genrand_res53 in the original code;
2 * generates a random number on [0,1) with 53-bit resolution; note that
3 * 9007199254740992 == 2**53; I assume they're spelling "/2**53" as
4 * multiply-by-reciprocal in the (likely vain) hope that the compiler will
5 * optimize the division away at compile-time. 67108864 is 2**26. In
6 * effect, a contains 27 random bits shifted left 26, and b fills in the
7 * lower 26 bits of the 53-bit numerator.
8 * The original code credited Isaku Wada for this algorithm, 2002/01/09.
9 */
10
11/*[clinic input]
12_random.Random.random
13 self: self(type="RandomObject *")
14random() -> x in the interval [0, 1).
15[clinic start generated code]*/
16
17static PyObject *
18_random_Random_random_impl(RandomObject *self)
19/*[clinic end generated code: output=117ff99ee53d755c input=afb2a59cbbb00349]*/
20{
21 uint32_t a=genrand_uint32(self)>>5, b=genrand_uint32(self)>>6;
22 return PyFloat_FromDouble((a*67108864.0+b)*(1.0/9007199254740992.0));
23}
```

So we see that calling `random.random()`

uses up 2 state values, discards some trailing bits from them, and combines them in the numerator of some fraction to generate a random float between 0 and 1. More specifically, 27 bits and 26 bits respectively of two states are combined and divided by $2^{53}$.

# The Plan

There are still a lot of issues we will need to deal with, but a hazy plan can be formed at this point. We can pass in an easy value, such as 0, to the guess function, and use the round score to solve for the numerator of the fraction in the code above (since the score is conveniently given as a `Fraction`

with values multiplied by $2^{53}$!). Each float recovers 2 state values, so repeating this 312 times should give us 624 state values, enough to clone the random state and predict the next values perfectly, with ~800 guesses to spare at the end.

Now there are a few main issues with this plan:

## Truncation

Even after solving for the numerator using the round score, and splitting that integer into the upper 27 and lower 26 bits, we’re still missing a combined 11 bits of the 2 states! If we were to brute force those bits for all 312 guesses, that would take $2^{3432}$ attempts, clearly infeasible. The solution here is to solve for the state symbolically, since the 624 values in one state are related to each other mathematically, and this should be enough information to solve for the missing bits. To save time, a teammate remembered this repository, which was used to solve a related challenge from PlaidCTF. (Which I spent many hours of my life going down the wrong path on… but I digress.)

## Precision

This one was a killer, and I honestly still have no idea why this happened. But even though `int(2**53*a)`

should be precise due to how python stores floats, and that the default precision for floats *is* 53 bits, in my local testing the recovered range of values for b (the lower 26 bits in the numerator) were always a little off when compared to the actual value in the random state. However, this too had a relatively simple fix, had I not wasted many hours trying to fix the precision without knowing where the error was. I just had to take less bits of b for granted, and solve for all the rest. To mitigate the smaller amount of info, I used up 624 guesses (2 whole states) instead of 312, since we had 2000 total guesses.

After making these fixes, and waiting 30 minutes for the agonizingly slow server since I did not live in India, the flag finally popped out. The full lightly-commented source code of the solution is below.

```
1#This class from https://github.com/icemonster/symbolic_mersenne_cracker
2"""
3MIT License
4
5Copyright (c) 2021 icemonster
6
7Permission is hereby granted, free of charge, to any person obtaining a copy
8of this software and associated documentation files (the "Software"), to deal
9in the Software without restriction, including without limitation the rights
10to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
11copies of the Software, and to permit persons to whom the Software is
12furnished to do so, subject to the following conditions:
13
14The above copyright notice and this permission notice shall be included in all
15copies or substantial portions of the Software.
16
17THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
18IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
19FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
20AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
21LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
22OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
23SOFTWARE.
24"""
25
26from z3 import *
27from random import Random
28from itertools import count
29import time
30import logging
31
32logging.basicConfig(format='STT> %(message)s')
33logger = logging.getLogger()
34logger.setLevel(logging.DEBUG)
35
36SYMBOLIC_COUNTER = count()
37
38class Untwister:
39 def __init__(self):
40 name = next(SYMBOLIC_COUNTER)
41 self.MT = [BitVec(f'MT_{i}_{name}', 32) for i in range(624)]
42 self.index = 0
43 self.solver = Solver()
44
45 #This particular method was adapted from https://www.schutzwerk.com/en/43/posts/attacking_a_random_number_generator/
46 def symbolic_untamper(self, solver, y):
47 name = next(SYMBOLIC_COUNTER)
48
49 y1 = BitVec(f'y1_{name}', 32)
50 y2 = BitVec(f'y2_{name}' , 32)
51 y3 = BitVec(f'y3_{name}', 32)
52 y4 = BitVec(f'y4_{name}', 32)
53
54 equations = [
55 y2 == y1 ^ (LShR(y1, 11)),
56 y3 == y2 ^ ((y2 << 7) & 0x9D2C5680),
57 y4 == y3 ^ ((y3 << 15) & 0xEFC60000),
58 y == y4 ^ (LShR(y4, 18))
59 ]
60
61 solver.add(equations)
62 return y1
63
64 def symbolic_twist(self, MT, n=624, upper_mask=0x80000000, lower_mask=0x7FFFFFFF, a=0x9908B0DF, m=397):
65 '''
66 This method models MT19937 function as a Z3 program
67 '''
68 MT = [i for i in MT] #Just a shallow copy of the state
69
70 for i in range(n):
71 x = (MT[i] & upper_mask) + (MT[(i+1) % n] & lower_mask)
72 xA = LShR(x, 1)
73 xB = If(x & 1 == 0, xA, xA ^ a) #Possible Z3 optimization here by declaring auxiliary symbolic variables
74 MT[i] = MT[(i + m) % n] ^ xB
75
76 return MT
77
78 def get_symbolic(self, guess):
79 name = next(SYMBOLIC_COUNTER)
80 ERROR = 'Must pass a string like "?1100???1001000??0?100?10??10010" where ? represents an unknown bit'
81
82 assert type(guess) == str, ERROR
83 assert all(map(lambda x: x in '01?', guess)), ERROR
84 assert len(guess) <= 32, "One 32-bit number at a time please"
85 guess = guess.zfill(32)
86
87 self.symbolic_guess = BitVec(f'symbolic_guess_{name}', 32)
88 guess = guess[::-1]
89
90 for i, bit in enumerate(guess):
91 if bit != '?':
92 self.solver.add(Extract(i, i, self.symbolic_guess) == bit)
93
94 return self.symbolic_guess
95
96
97 def submit(self, guess):
98 '''
99 You need 624 numbers to completely clone the state.
100 You can input less than that though and this will give you the best guess for the state
101 '''
102 if self.index >= 624:
103 name = next(SYMBOLIC_COUNTER)
104 next_mt = self.symbolic_twist(self.MT)
105 self.MT = [BitVec(f'MT_{i}_{name}', 32) for i in range(624)]
106 for i in range(624):
107 self.solver.add(self.MT[i] == next_mt[i])
108 self.index = 0
109
110 symbolic_guess = self.get_symbolic(guess)
111 symbolic_guess = self.symbolic_untamper(self.solver, symbolic_guess)
112 self.solver.add(self.MT[self.index] == symbolic_guess)
113 self.index += 1
114
115 def get_random(self):
116 '''
117 This will give you a random.Random() instance with the cloned state.
118 '''
119 logger.debug('Solving...')
120 start = time.time()
121 print(self.solver.check())
122 model = self.solver.model()
123 end = time.time()
124 logger.debug(f'Solved! (in {round(end-start,3)}s)')
125
126 #Compute best guess for state
127 state = list(map(lambda x: model[x].as_long(), self.MT))
128 result_state = (3, tuple(state+[self.index]), None)
129 r = Random()
130 r.setstate(result_state)
131 return r
132
133#below code, the actual exploit, is mine
134import gmpy2
135from pwn import *
136
137ut = Untwister()
138sh = remote('crypto.zh3r0.cf', 4444)
139for ii in range(624):
140 print(ii)
141 sh.recvline()
142 sh.sendline(b'0')
143 lin = sh.recvline()
144 flag = False
145 sc = lin.decode().strip().split('round score: ')[1]
146 if '/' in sc:
147 score = sc.split('/')
148 else:
149 score = sc
150 flag = True
151 if not flag:
152 numer = int(score[0])
153 denom = int(score[1])
154 state_comb = gmpy2.c_div(gmpy2.mpz(9007199254740992) * gmpy2.mpz(denom), numer) + 1 # solve for state
155 bits = bin(state_comb)[2:]
156 bits = '0' * (53 - len(bits)) + bits
157 a = bits[:27] #upper
158 b = bits[27:] #lower
159 assert len(a) == 27
160 assert len(b) == 26
161 ut.submit(a + '?' * 5) #a is truncated 5 bits
162 ut.submit(b[:13] + '?' * 19) # b bits are imprecise, take only 13
163 else:
164 ut.submit('0' * 10 + '?' * 22) #we only know that the first 10 bits are 0 (1/1024)
165 ut.submit('?' * 16 + '?' * 16)
166
167rec = ut.get_random()
168
169#score 1024!
170while True:
171 print(sh.recvline())
172 sh.sendline(str(rec.random()))
173 print(sh.recvline())
```