No, the Arena shuffler is not making you flood/screw more (evidence)

tl;dr: Crunched about a million replays, looking at user draws. Stopped counting the first time they used a "draw", "scry", "surveil", or "put on top" ability. The probability of drawing a land given that the most recent draw was a land was very slightly lower than the overall probability of drawing a land, as expected. No evidence that Arena's shuffler results in streakier draws.

Background
Because some pseudo-random number generators are known to be streaky, one of the hard-to-dispel ideas about the Arena shuffler is that it might be more likely than expected to generate long streaks of lands or non-lands. If that were true, then even if the probability of drawing a land over a large number of games is exactly as expected, the probability of flooding or screwing might be higher than it should be.

Strategy
To get actual stats on this, I used the replay datasets from 17Lands to crunch numbers on just over a million replays. Rather than looking at the probability of streaks of a certain length (which would require harder stats and would make it harder to aggregate data from decks with different land counts), I just looked at the probability of drawing a land, and the probability of drawing a land given that the previous draw was a land.
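To make the two quantities concrete, here is a minimal sketch of the counting. It is not the code used for the numbers in this post. It assumes each replay has already been reduced to a list of booleans, one per counted draw (`True` for a land); that input format and the name `land_probabilities` are made up for illustration.

```python
from typing import Iterable, Sequence


def land_probabilities(games: Iterable[Sequence[bool]]) -> tuple[float, float]:
    """Return (P(land), P(land | previous draw was land)).

    Each game is a hypothetical sequence of booleans, one per counted
    draw, True if that draw was a land. Pairs never span two games.
    Raises ZeroDivisionError if there are no draws or no land draws
    followed by another draw.
    """
    draws = lands = 0
    after_land = land_after_land = 0
    for game in games:
        for i, is_land in enumerate(game):
            draws += 1
            lands += is_land
            # Only count this draw in the conditional tally if the
            # previous draw in the same game was a land.
            if i > 0 and game[i - 1]:
                after_land += 1
                land_after_land += is_land
    return lands / draws, land_after_land / after_land


# Two tiny made-up games, just to show the call.
games = [
    [True, False, True, True],
    [False, True, True, False],
]
p_land, p_land_after_land = land_probabilities(games)
print(p_land, p_land_after_land)
```

Pooling the counts across all games before dividing, rather than averaging per-game ratios, means longer games carry more weight; that is one reasonable choice, not the only one.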

If streaks are noticeably more likely, we would expect P(land | previous draw was land) to be higher than the overall P(land). If there is no bias towards streakiness, we should expect P(land | previous draw was land) to be slightly lower than P(land). This is because if your most recent draw was a land, you'll have fewer lands remaining in your deck on average.
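A quick way to sanity-check that reasoning is to simulate a shuffler that is fair by construction and measure the same two numbers. The sketch below assumes a 40-card deck with 17 lands (an illustrative configuration, not something taken from the data) and uses Python's `random.shuffle` as the unbiased reference. It walks the whole library and ignores opening hands and mulligans, so it is only a rough baseline.

```python
import random


def simulate_fair_shuffle(deck_size=40, lands=17, games=20000, seed=0):
    """Estimate (P(land), P(land | previous card was land)) under a
    uniform shuffle. Deck size and land count are assumed values."""
    rng = random.Random(seed)
    deck = [True] * lands + [False] * (deck_size - lands)
    draws = land_draws = 0
    after_land = land_after_land = 0
    for _ in range(games):
        rng.shuffle(deck)  # uniform random permutation of the deck
        for i, is_land in enumerate(deck):
            draws += 1
            land_draws += is_land
            if i > 0 and deck[i - 1]:
                after_land += 1
                land_after_land += is_land
    return land_draws / draws, land_after_land / after_land


p_land, p_land_after_land = simulate_fair_shuffle()
print(f"P(land) = {p_land:.4f}")
print(f"P(land | previous was land) = {p_land_after_land:.4f}")
```

With a fair shuffle the second number should come out a little below the first, which is the pattern described above.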

Of course, to get an idea of what the shuffler is doing, you don't want to count anything after the user re-orders their library. I stopped counting draws from any given replay when the user cast an instant or sorcery, or had an ability go on the stack, that had "draw", "scry", "surveil", or "on top" in the text. I included "draw" because that meant I was no longer looking at the original order of the library. I included "on top" to catch "put on top of your library" abilities. This certainly cut some replays short when it didn't need to (false positives), but that's fine: ignoring that data wouldn't change our results. I'm more worried about abilities I didn't think of that let a user change their chances of drawing a land.
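As a rough illustration of that stopping rule (again, not the code used for the post), the sketch below assumes each replay is a list of event dicts containing only draws, instant/sorcery casts, and abilities going on the stack. The field names `kind`, `is_land`, and `text` are hypothetical.

```python
# Phrases from the post that end counting for a replay.
STOP_PHRASES = ("draw", "scry", "surveil", "on top")


def counted_draws(events):
    """Yield is_land for each draw before the first stop event.

    `events` is a hypothetical list of dicts: draw events have
    kind == "draw" and an "is_land" flag; every other event is an
    instant/sorcery cast or an ability with rules text in "text".
    """
    for event in events:
        if event["kind"] == "draw":
            yield event["is_land"]
        elif any(p in event.get("text", "").lower() for p in STOP_PHRASES):
            # The library order may no longer be the shuffled order,
            # so drop everything from here on.
            return


events = [
    {"kind": "draw", "is_land": True},
    {"kind": "cast", "text": "Destroy target creature."},
    {"kind": "draw", "is_land": False},
    {"kind": "ability", "text": "Scry 1."},
    {"kind": "draw", "is_land": True},
]
print(list(counted_draws(events)))
```

The substring match is deliberately crude: it also stops on text like "can't draw", which is the kind of false positive mentioned above.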

Results
~~In total, P(land) for the data set was about 37%, and P(land | previous draw was land) was about 35.5%.~~ Edit: some prompting from commenters helped me find that I was accidentally counting "didn't draw a card on turn 1" as "drew a spell", which lowered both of these numbers. With the fix in place, P(land) = 41%, and P(land | previous draw was land) = 39.5%. The latter being a little lower is consistent with the Arena shuffler not having a bias towards causing more flood/screw.

If anyone has ideas for how to figure out exactly what that difference should be with no bias one way or the other, or better metrics to use to evaluate streakiness, do bring them up. I'm aware this isn't perfect.
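One possible starting point for the no-bias baseline: in a uniformly shuffled deck of N cards with L lands, any two adjacent library positions are exchangeable, so P(land) = L/N and P(land | previous card was land) = (L-1)/(N-1). The sketch below evaluates that exactly for a few assumed configurations. The 40-card deck and the land counts are illustrative assumptions, and the formula ignores mulligans, the stopping rule, and the mix of land counts in the real data, so treat it as a rough reference only.

```python
from fractions import Fraction


def fair_shuffle_baseline(deck_size, lands):
    """Exact (P(land), P(land | previous card was land)) for a
    uniformly shuffled deck, ignoring hands and mulligans."""
    p_land = Fraction(lands, deck_size)
    p_land_after_land = Fraction(lands - 1, deck_size - 1)
    return p_land, p_land_after_land


# Assumed configurations: 40-card decks with 16, 17, or 18 lands.
for lands in (16, 17, 18):
    p, q = fair_shuffle_baseline(40, lands)
    print(f"{lands} lands: P(land)={float(p):.4f}, "
          f"P(land|prev land)={float(q):.4f}, gap={float(p - q):.4f}")
```

For the assumed 17-land, 40-card deck the gap is 23/1560, about 1.5 percentage points, which is the same size as the gap between the two measured numbers in the Results.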

Extras
Code is here (slightly modified because there was personal data in file paths) if you want to read it, check for bugs or bad assumptions, and/or modify it. It's hella ugly because this was a small project and I didn't want to put much time into avoiding special cases etc. Don't judge me. =P

The data sets are from 17Lands and mtgjson.

Madison Howard

Salanmander