July 28, 2022
TetrisGYM SPS 0.2.1 [sig]
TetrisGYM SPS 0.2 [sig]
TetrisGYM SPS 0.1 [sig]
TetrisGYM is a wildly popular mod of Tetris for the Nintendo Entertainment System. It has recently been made into a cartridge by one of the highest scoring players, EricICX. Released in 1989, Tetris has come a long way as a result of high score competition, a fanatical community, and quirks that make it an incredibly difficult game to master. If it was just difficult to master, it might have earned a community. But its quirks have earned a mod that turns the game into a learning experience. The current list of features for TetrisGYM:
- Score uncap
- High score save
- SPS Seeded RNG
- Drought trainer
- T-spin trainer
- Transition trainer
- Block tool
- level 29 start
- Stacking trainer
- Tapping Hz display
- Input display
- Pace display
- Tuck spin trainer setups
- Quick tap trainer
- Tap quantity trainer
- DAS delay
- Hard Drop
- Ready indicator
Each of these modes serve a niche that Kirjava and Tetris players chose to fix. For example, DAS delay shows what life would be like if the authors chose a different value for the timing of DAS. SPS Seeded RNG allows players around the world to use the same piece sets, just like they do in CTWC’s annual tournament. I chose to dig into this part because it seemed like the most interesting, and easy for me to apply my skills. I have valuable insight into this already and am ready to start the work of comparing this RNG to others that could be implemented in 6502 assembly.
Remember that even though some of the players in classic Tetris run on a modern computer with help of an emulator, we are still stuck running our code on an NES (or in my case, a replica) which was originally released in North America in 1985. They chose a particularly low-end CPU for the NES with limited RAM and limited graphics and sound to ensure that the quality of the games were high – not because they had expensive hardware, but because they had the best software. How does one get good software without a hardware multiplier and megabytes of RAM? The answer is gameplay.
First off, my scripts implement SPS v3 from TetrisGYM in Python. This means given a seed, you can find the piece sequence.
python3 tetris_sps.py 000200 300 JIOSISZTOZTLJTILJOOIJTILSOLZILJSZJOSTIZJZJTIZTZJSZOJZOLIOIISSSZJSLJSZTLZOJTLIJLTLISILTLZTJIZIOSZJZTJLJTOTZLILTZITZSLISOZTJLSTJJZIOILOZJTJLJJIOLILIOOSTZJZOSLTJISLOTJZTJTZOOILOTOIOSZSSOISIJZOSTOIZOTITZIOZSILTLOZLTSIJOZIIZOTILIOZSZJSTSIOLTOLOLTJTOTZTIOJSOZOJTSLLTOJSTJTSJISJTOZOLJJOIJLOIOZIJIILSJZILSJIL
For statistics, use the
-s flag which provides information about total piece count. Because a fairly small number of seeds have bias against one piece, a random seed is unlikely to contain bias. However slight, the possibility of bias is worth checking before using a seed in a high stakes competitive match. While it would be unsportsmanlike for a competitor to run this script before a match (since it is telling them information that they should not know), it would be sensible for a commentator to run this to ensure that a match is not decided by RNG alone. While this may be controversial, I think the result would be more competitive matches. Since it does not factor into account long droughts, the core mechanic of the game remains unaffected. Perhaps a commentator-mode could be created that would ensure that the statistics are good and would give a thumbs up or down for the seed.
pypy3 tetris_sps.py -s 345678 500 LIJISJLJOSSLOSOTZTSLOSOLZSZOLJTIOISTZOTLILTOLSIZTSIZZSIZSLITZOLJTJTJLTOSISJJLJSZLZJITZTIZSTJLOSSLSTTTJTLISOLSLOSZTILOOJOIOJLSJZSLSZOILZIOJOLSOJTLJTSZLTSZIOTOILZOZSZITZJZIZTJZJOJSJLSIJOJIJSTIJIJSOTOTZLZSSJZOSTIJLSLSTJTLSILZITLJLSZJLSTSIZJIJZILOLJTILSIJSOOJOJSZTZISSZJZSTIJIZSJTOTISOTSZOTLSLTJIOLTZJZSTZSJOLISTSJLZLIOZOLTZSIOSOTJOSLJIZLZJTIZTTOSTZJZITJTJIJITOLOJTLOILZZSLZIOTZISSOTOZTISZOISSOSOLOITOZJOZOLTSTITLOTSIOTIZJTJTLISZILZILOLZJOTLJZZLOOIJZJOTZTIOITZTLSOZSLZJIZTZJJLJOTJLSTJZSJTOTZZLTOOIZJOZIST T 077 J 070 Z 076 O 072 S 075 L 066 I 064
Statistically Anomalous Seeds
Using multiple threads, you can do a parallel search for a seed with a large number of longbars (I-pieces):
pypy3 tetris_sps_search.py 000000 37 I pieces out of 300 000002 38 I pieces out of 300 000200 48 I pieces out of 300 000254 56 I pieces out of 300 001256 57 I pieces out of 300 0026e5 59 I pieces out of 300 006430 60 I pieces out of 300 009aa6 61 I pieces out of 300 00c670 62 I pieces out of 300 026477 64 I pieces out of 300 071073 65 I pieces out of 300 09ee71 66 I pieces out of 300 a2b073 67 I pieces out of 300 # I pieces are converted to underscore (_) in the sequence for visibility python3 tetris_sps.py a2b073 300 |sed 's#I#_#g' _SJSSZ_L_Z_T_STOLZJZL__L_ZJ_T_O_L_T_J_SZZJJLSZTJJ_SZ_OTSLTSOJL_OZLSS_TOZOSJ_TJZSJ_LS_OTOOJLOTSZJTTS_LSOTJLSTZTL_LOJT_ZO_J_ST_ZTOLZS_OTLJ_S_T_OLJ_SLJ_LTO_J_O_JTS_ZLSZLTL_JLOSLS_ZZTOSJS_SJSJL_S_TLT_TZO_J_SLZ__O_TOT__LTLJ_O_ZT_TST_TL_T_LJ_TLJTSLTOJS_TZJLOL_OLSJLJ_TJS_SJLOTLOST_TJLZOOTSL_J_OTSZTL_TO_OZ_
Since it makes sense to have a lot of seeds with a large number of I pieces, you can run the following to get a list of over 500 seeds with 64 or more I pieces.
pypy3 tetris_sps_search.py 0 64 >tetris_sps_search_0_64.txt
If you want to retroactively check how many I pieces a player should have gotten in a competition that used SPS, you can check the seed using the first script with the
-s flag. Using Tetrisfish, I am able to get piece sequences automatically from Tetris games I record, which provides a way to test the code.
It is not enough to just get seeds with a large number of I pieces. Other filters are just as easy such as: lowbars, smallest maximum drought, longest maximum drought.
If you want to test your skill in a seed with no longbar droughts at all, there is no better seed than
0c3686. If you want to test yourself against a possible drought world record, try
43ee14 which starts with an enormous 82 piece drought. If you enjoy low bars, why not try
0eba57 which has only 21 I pieces?
date; pypy3 tetris_sps_consistent.py; date Thu Jul 28 10:34:30 AM PDT 2022 000200 20 max drought out of 300 pieces 000210 18 max drought out of 300 pieces 000220 17 max drought out of 300 pieces 000254 16 max drought out of 300 pieces 0002a6 13 max drought out of 300 pieces 0018c4 12 max drought out of 300 pieces 0030c4 11 max drought out of 300 pieces 0c3686 10 max drought out of 300 pieces Thu Jul 28 10:38:15 AM PDT 2022 date; pypy3 tetris_sps_inconsistent.py; date Thu Jul 28 10:51:01 AM PDT 2022 000001 7 max drought out of 300 pieces 000200 20 max drought out of 300 pieces 000202 30 max drought out of 300 pieces 000211 31 max drought out of 300 pieces 000231 37 max drought out of 300 pieces 000241 47 max drought out of 300 pieces 000245 61 max drought out of 300 pieces 001aa3 65 max drought out of 300 pieces 002643 66 max drought out of 300 pieces 015a42 76 max drought out of 300 pieces 256a10 78 max drought out of 300 pieces 43ee14 82 max drought out of 300 pieces Thu Jul 28 10:54:46 AM PDT 2022 date; pypy3 tetris_sps_search_lowbars.py; date Thu Jul 28 11:04:20 AM PDT 2022 000000 37 I pieces out of 300 000211 33 I pieces out of 300 000226 32 I pieces out of 300 000241 27 I pieces out of 300 000245 23 I pieces out of 300 000a51 22 I pieces out of 300 0eba57 21 I pieces out of 300 Thu Jul 28 11:08:04 AM PDT 2022
This appears to indicate that the longest drought that exists in the seeds in the first 300 pieces is 82 pieces. Note how that is most of the game. Similarly this indicates that the fewest number of longbars that a person can receive in the first 300 pieces is 21. Compared to the maximum number of longbars is 67 out of the first 300 pieces.
Although I am providing the source code, I am not making this source code open source. That is to say, I am putting restrictions on the use and free sharing of this code. You’re welcome to copy, read, learn, and modify it for your own personal use. You are not allowed to create external interfaces to the code (such as a bot, a website, or more convenient front-end). This is not because I want to make money or force people to learn 6502 assembly. I want to reduce the attraction to those who lack the competitive spirit of the game of Tetris. The only way I can think of is to force them to download and run python (or pypy3) on the command line to use my software.
Caution: This machine has no brain, please use your own.
Note that the seed search scripts are slow because they are doing 300x25x16777216 operations. Therefore, PyPy3 is highly recommended to reduce waste.
So where do we go from here? Statistical analysis of the SPS RNG is now possible. From statistical analysis, we can compare to a highly entropic PRNG. If it fails dramatically, we can notify players of the issue. If it performs slightly worse than a highly entropic PRNG (which I believe I have evidence of already), then we can write a new RNG and test it.
While 7-bag random generator is not on the table (NES Tetris players generally accept the low quality RNG that Tetris provides as being a key feature and unchangeable for competition), if SPS performs worse than Tetris’s RNG or is exploitable by players, it is well within our right to fix it. SPS did not come as a standard feature on Tetris, and so it is acceptable to fix it. It is pretty likely that any fix will be a benefit to Tetris players – reducing the hardship of bias.
How does TetrisGYM’s RNG work? In 279 lines of python, it is implemented in a line for line port of the 6502 assembly in TetrisGYM.
The RNG is quite simple.
setSeedNextRNG the number of times depending on the 4 highest bits of the third byte of the seed. This is a problem. If you’re curious about
setSeedNextRNG, it just calls
generateNextPseudorandomNumber(0x34, 2). What happens in
generateNextPseudorandomNumber? Well unfortunately all that happens in
generateNextPseudorandomNumber is a
ror of the first two bytes of the seed (the
ror operation shifts the bits to the right 1 bit and sets the high bit depending on the carry flag). The high bit of first byte is set depending on the
xor of the first two seed bytes’ second bit. That’s right, it’s a LFSR with little state or logic.
Finding 1: Weak Seeds
So why doesn’t SPS result in almost instantaneous repetition of piece sequences? That is the question on my mind. With low values such as 0-511, the piece sequence repeats after just 7 pieces because of the issue with
pickTetriminoSeed explained above. When does the sequence repeat in valid seeds? It depends on the seed, but values are in the range of 36096 - 161024 with outliers like
123456 and its relatives (any seed ending with 5x) having an effective period of 7936. For cryptography, this is very bad. For Tetris pieces, this may be sufficient. Since a particularly long game in T6 can only last 389 pieces and particularly long games being less than a thousand pieces, it is unlikely that a sequence in a valid seed will repeat. With this source code you can check for yourself whether a long sequence repeats. We use my open source binary analysis software, unproprietary to do the analysis.
pypy3 tetris_sps.py 345678 500 > 345678.txt python3 unproprietary/zlibPy.py -v 345678.txt |grep distance |head inflate: distance 8 b'SLOSO' inflate: distance 31 b'ZTS' inflate: distance 4 b'SIZ' inflate: distance 34 b'ZOLJT' inflate: distance 2 b'JTJ' inflate: distance 27 b'LTO' inflate: distance 69 b'ISJ' inflate: distance 70 b'JLJ' inflate: distance 25 b'ITZ' inflate: distance 33 b'IZS'
We can see using zlib (which compresses using Huffman tables and also windowed duplication elimination) that the seed does repeat 105 sequences of length 3 or more with the longest being 6 pieces: TJTLIS and ZSJTOT. Does this affect the game negatively? This post will not try to prove or disprove this possibility. The fact that sequences as long as 6 are repeated in a game of 500 pieces does not surprise me as this is a natural consequence of random chance – something that good random number generators should have. Further analysis of the repetition in a large number of seeds is needed.
Finding 2: Similar Seeds, Similar Outputs
Similar seeds result in similar outputs as a direct consequence of the RNG’s simplicity.
432100 is equivalent to the seed
432000. This is generally true that the least significant bit of the second byte of the seed is ignored resulting in a large waste of seeds.
python3 tetris_sps.py 432100 30 JZTSOOJTIJLJZLOLZLZSJLTIZJTZSZ python3 tetris_sps.py 432000 30 JZTSOOJTIJLJZLOLZLZSJLTIZJTZSZ python3 tetris_sps.py B32100 30 OLJZSJLJTIZTOSTZTSOTTLZJTZSOIJ python3 tetris_sps.py B32000 30 OLJZSJLJTIZTOSTZTSOTTLZJTZSOIJ python3 tetris_sps.py FF9131 30 JILIJSILJISLJITZTISSLTZJZLJTIS python3 tetris_sps.py FF9031 30 JILIJSILJISLJITZTISSLTZJZLJTIS pypy3 tetris_sps.py 32349f 30 TILOSLZZIOSITIZISSTIOJLSZJISLI pypy3 tetris_sps.py 32359f 30 TILOSLZZIOSITIZISSTIOJLSZJISLI
This shows that we cannot expect different seeds to contain different values. How many different seeds contain the same outputs? There are 512 seeds (
000000-0001ff) which contain the same data. These seeds should be completely ignored by the Tetris community. Further, there are 2076 sequences which are shared by 12 seeds. There are further 75930 sequences that are shared by 8 seeds. The rest of the sequences are shared by 4 seeds. The odds of getting a sequence you have memorized is low, but the odds of getting a sequence shared by 8 or more seeds is 0.4%. You could happen upon such a seed quite easily.
This indicates that an improvement to the algorithm should result in more seeds for players.
pypy3 tetris_sps_dupe.py >tetris_sps_dupe3.txt grep -o ' [0-9].' tetris_sps_dupe3.txt |uniq -c 8 64 2076 12 75930 8) 921986 4)
tetris_sps_dupe.py only outputs 1M sequences to avoid using a lot of disk space. The fact that the 921986 is associated with 4 seeds just sets the lower bound for the number of sequences with 4 seeds.
tetris_sps_dupe.py does not address the issue of one seed converting into another seed. How long does it take a seed to devolve into another seed? This is where the algorithm is weak against attacks. Seed
235567 is defined by 3 values,
23, 55, and 67. But the next random number generated is defined by the rotation of the first two bytes and the increment of the third byte.
1d 91 68 is our neighbor seed.
pypy3 tetris_sps.py 235567 30 LSLJOISITIOILILOJSOLZILTJSIZZO pypy3 tetris_sps.py 1d9168 30 SLJOISITIOILILOJSOLZILTJSIZZOZ
Note how the sequence generated by
1d9168 is just 1 piece later than the sequence
235567. This isn’t just a random fluke. It is true of every seed in SPS that it has a neighbor seed that is rapidly computable from the seed. Instead of keeping a state that is based on the seed, the seed is itself the state of the PRNG. Can piece sequence can be predicted ahead of time? If so, players can be affected negatively by their competitors memorizing one sequence and being able to then predict every piece from the point onward. How lucky must a person be to get a sequence that they have memorized in a game of Tetris? My preliminary analysis has shown that they would need to be extremely lucky. This quirk would only affect players if they allowed their opponent to pick a seed. This should not be allowed, and truly random seeds should only every be picked. It might make sense for commentators to check the piece sequence for anomalies in advance for particularly sensitive matches. Since another random seed can be generated rapidly, bad seeds (such as those ending in 5x) can be thrown out without harming the game.
Weaknesses in the TetrisGYM SPS have now been documented. These should inform decisions about the use of SPS in competition. While no serious issues affect the current use of SPS in the wild, it may make sense for commentators to check seeds for sanity before use.
Improvements to SPS RNG would benefit the Tetris community. Features that would benefit players first: fewer duplicate seeds, higher quality random numbers, no weak seeds. Additional benefits might be considered. The limitations of the NES determine how complex an RNG can be used, but high quality RNG have been created for the purpose of low power embedded systems.
Further analysis of the repetition in a large number of seeds is needed. Also further analysis of bias in the RNG is needed. With tools that do this automatically, replacement RNG can be tested to determine the benefit to players.
Thanks to Kirjava for answering my questions. Thanks to #romhacking in the CTM Discord for inspiring me to write this code. Thanks to my competitors in the Tetris scene for inspiring me to play.
Leave a Reply
Leave a reply »