Posts

How to Measure Mahjong Luck and Skill

Every Game of Mahjong Is A Dice Roll

Imagine you are playing a 4-player game of mahjong against 3 clones of yourself. The probability that you get 1st, 2nd, 3rd, or 4th place is 25% each: you're all equal in skill and playing the same strategy, so there's no reason any clone would have an advantage. The average placement you would achieve is 2.50. If you were playing against weaker players and had an average placement of 2.40, we would say that you have some edge, or advantage.

Now suppose you were playing perfect mahjong against competent opponents. How much edge is possible? One estimate of the maximum possible edge would be the results of the AI LuckyJ, which has the best performance of any person or AI in the Tokujou room on Tenhou. Over 1145 games, LuckyJ's spread of placements is 31.5% 1st, 27.5% 2nd, 24.1% 3rd, and 16.7% 4th. Its average placement is 2.26. There is some selection bias in estimating the edge of perfect play using the best performance we've seen
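The average placements quoted in this excerpt (2.50 for the clones, 2.26 for LuckyJ) can be reproduced with a short sketch. The function name `average_placement` is hypothetical, and dividing by the total is an assumption made here because the quoted LuckyJ percentages sum to 99.8 rather than 100.

```python
def average_placement(shares):
    """Mean placement given the shares of 1st, 2nd, 3rd and 4th place finishes.

    Shares may be percentages or raw counts; they are normalized by their sum
    (an assumption, to absorb rounding in quoted percentages).
    """
    total = sum(shares)
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    weighted = sum(place * share for place, share in enumerate(shares, start=1))
    return weighted / total


# Four equally skilled clones: 25% each.
clones = average_placement([25, 25, 25, 25])

# LuckyJ's placement spread as quoted above.
luckyj = average_placement([31.5, 27.5, 24.1, 16.7])

print(round(clones, 2), round(luckyj, 2))
# Edge relative to the 2.50 baseline (lower placement is better).
print(round(clones - luckyj, 2))
```

Under these assumptions the gap between the two averages is the "edge" in placement terms.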

Push Fold Fundamentals: Keiten

If you haven't already, check out the first post in the push fold fundamentals series.

What is Keiten?

The term keiten refers to a tenpai hand with no yaku. Although these hands rarely win by ron or tsumo, they can gain points if a round goes to exhaustive draw (a.k.a. Ryuukyoku, rkk). In this post, we will analyze the EV of pushing dangerous tiles against 1 riichi with 5 or fewer draws until rkk. In the context of push/fold within 5 turns of rkk, we will incorrectly use the term keiten to refer to tenpai hands with or without yaku. There is a practical reason for this: if we decide that we should push a tenpai hand with no yaku, it follows that we would push the same hand if it had a yaku.

Data Sources

The main chart we will analyze in this post is from this Mahjong Math post, which uses the nisi simulator to estimate EV. If you find this post helpful, consider supporting the Mahjong Math group by purchasing the post for 500 yen, which will give you access t

Push Fold Fundamentals: Winrate/Dealinrate

Shoutout to hue for recommending Japanese resources and proofreading this post.

Introducing The Win Dealin Ratio

Your opponent declares riichi. Do you push a dangerous tile, or fold? Some players may refer to a flowchart (push good wait with X han), or a chart of expected values (EVs) to answer this question, but I think there’s a deeper understanding to be had. What is the math underlying these charts, and how can you modify your answer if parameters such as value, dealinrate, ability to fold/maneuver, or position change? I'm going to propose a heuristic that will help you answer push fold problems in a variety of circumstances. This is a concept that I borrowed from poker and modified for riichi mahjong. It’s a concept so important that I think about it in every game of mahjong that I play, and one that I wish I had been formally taught as a beginner. It starts with a very simple idea: if I push, what percentage of the time will I win, and what percentage of the ti

No Yaku Kanchan Dama vs Break Winrates

No Yaku, No Riichi?

This analysis was inspired by a recent post by Yuusei that looked at the following situation: What would you do: riichi to secure 2nd, 4s dama, or 9m breaking tenpai? Assume a "generic" reward system: [+2, +1, 0, -3] for 1st-4th place (7 dan Tenhou, Ms1 Majsoul). As discussed in Yuusei's post, due to the harsh penalty for 4th place, riichi is a bad option. The dealer is in 1st place, so the game will probably end this round. Assuming the game ends, if we just fold this hand and don't deal in to last place, we will get +1 or 0 points. We do not want to fight last place with our bad wait and risk a -3 pt result. So we consider the two other options. Naga cuts 4s, hoping to draw our winning tile directly or upgrade to pinfu on 6m. LuckyJ, Tencent's new Mahjong AI, cuts 9m, giving up tenpai for more pinfu upgrades, which is also what Yuusei suggests. Who is right?

Simulation Set Up

To answer this question, I set up a simple Markov Cha
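The "generic" reward system [+2, +1, 0, -3] quoted in this excerpt can be turned into an expected reward for any placement distribution. The sketch below is only an illustration: the function name `expected_reward` is hypothetical, and the 50/50 split between 2nd and 3rd is a made-up example, not a figure from the post or its simulation.

```python
# Rewards for 1st, 2nd, 3rd, 4th place, as quoted in the excerpt.
REWARDS = (2, 1, 0, -3)


def expected_reward(probs, rewards=REWARDS):
    """Expected reward given probabilities of finishing 1st through 4th."""
    if len(probs) != len(rewards):
        raise ValueError("need one probability per placement")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * r for p, r in zip(probs, rewards))


# Equal chance of every placement: the -3 for 4th cancels the gains exactly.
uniform = expected_reward([0.25, 0.25, 0.25, 0.25])

# Hypothetical fold outcome: never 1st or 4th, 2nd and 3rd equally likely.
fold_example = expected_reward([0.0, 0.5, 0.5, 0.0])

print(uniform, fold_example)
```

Because 4th place costs three times what 2nd place earns, even a small added chance of finishing last can outweigh a noticeably larger chance of moving up.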

Reviewing My First 50 Houou Games With Naga

My Naga Stats (50 games)

Naga Stats Don't Matter?

When you feed a game log into the Naga AI, it gives you 3 summary metrics: match%, Naga rating, and bad move%. The match% is the share of your moves that match Naga's top choice exactly. The Naga rating gives you 100% credit for matched moves, and partial credit for moves that it considers but aren't the top choice. The bad move% penalizes every move that Naga does not consider making.

Average Match Rate/Bad Move Rate by Tenhou Rank (2020 data)

Good metrics are correlated with mahjong performance, but they can be misleading. For example, in its last 100 games, the 9-10 Dan AI Suphx had a match% of 74.4, an average Naga Rating of 86.3, and a bad move rate of 7.1%, stats comparable to the average 7 dan in 2020. Tencent's new AI LuckyJ hit 10 Dan with bad move rates of >10% in many games. The main reason Naga rating is imperfect is that Naga assigns probabilities to moves, when we really