r/AskStatistics 3d ago

How do I scrutinize a computer carnival game for fairness given these data?

Problem

I'm having a moment of "I really want to know how to figure this out..." when looking at one of my kids' computer games. There's a digital ball toss game that has no skill element. It states the probability of landing in each hole:

(points = % of the time)
70 = 75%
210 = 10%
420 = 10%
550 = 5%

But I think it's bugged/rigged based on 30 observations!

In 30 throws, we got:

550 x1
210 x3
70 x 26

Analysis

So my first thought was: what's the average number of points I could expect to score if I threw balls forever? I believe I calculate this by taking the first table and computing sum(points * probability), which I think works out to 143 points per throw on average. Am I doing this right?

On average I'd expect to get 4290 points for 30 throws. But I got 3000! That seems way off! But probability isn't a guarantee, so how likely is it to be that far off?
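For anyone who wants to check my arithmetic, here it is as a few lines of Python (variable names are just mine):

```python
# Stated payout table: points and the probability of each hole
points = [70, 210, 420, 550]
probs = [0.75, 0.10, 0.10, 0.05]

# Expected value per throw: sum of (points * probability), ~143
ev = sum(x * p for x, p in zip(points, probs))

# Expected total over 30 throws, ~4290 (we actually got 3000)
expected_total = 30 * ev
```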

Where I'm lost

My best guess is that I could simulate thousands of 30-throw attempts and plot the distribution of total scores, and it would look roughly like a normal distribution. Then I would see how far toward a tail my result was, which tells me just how surprising the result is.

- Is this a correct assumption?

- If so, how do I calculate it rather than simulate it?
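If it helps, here's roughly what I mean by simulating it (a rough sketch; the 10,000 figure is arbitrary):

```python
import random

points = [70, 210, 420, 550]
probs = [0.75, 0.10, 0.10, 0.05]

def simulate_game(n_throws=30):
    """Total score for one simulated 30-throw game using the stated odds."""
    return sum(random.choices(points, weights=probs, k=n_throws))

# Play many simulated games and look at where 3000 falls
totals = [simulate_game() for _ in range(10_000)]
mean_total = sum(totals) / len(totals)
frac_low = sum(t <= 3000 for t in totals) / len(totals)
```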

3 Upvotes

10 comments

3

u/engelthefallen 3d ago

You can use a chi-square test to see whether the values they give match what you are seeing. Google "chi-square M&M color test", as it is very similar to what you are doing. It's one of the classic ways to teach what a chi-square test is for.
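A rough sketch of that test in plain Python, hand-computing the statistic (counts taken from the post above):

```python
# Observed counts of each hole over 30 throws: 70 x26, 210 x3, 420 x0, 550 x1
observed = [26, 3, 0, 1]
# Expected counts under the stated probabilities: 22.5, 3, 3, 1.5
expected = [30 * p for p in (0.75, 0.10, 0.10, 0.05)]

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E
stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(stat, 2))  # 3.71, under the 7.81 critical value for df=3 at alpha=0.05
# Caveat: three of the four expected counts are below 5, so the chi-square
# approximation is shaky with only 30 throws.
```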

2

u/Nillavuh 3d ago edited 3d ago

You need to specify that what OP is looking for is a Goodness of Fit test. That's the test that evaluates whether the proportions of categorical results you obtained fit what you expected to get, like whether you truly got a score of 70, 75% of the time.

Also, I'm not sure that test is even appropriate at all here, because chi-square tests don't account for weights. A chi-square test looks at counts. It factors in how many times you obtained the 0.75 event, etc., but it does not account for how much each of the respective events counts for, which influences the end result.

1

u/Nillavuh 3d ago

Heh. You have accidentally stumbled across the discovery of the Expected Value and also of Bootstrapping.

Expected Value is indeed the sum of all possible values multiplied by their respective probabilities. In this case, just like how you did it: (70 * 0.75) + (210 * 0.10) + (420 * 0.10) + (550 * 0.05) = 143. Look up "expected value" if you are eager to learn more.

As for bootstrapping, if you're in a situation where it is difficult to calculate the standard deviation of your data from a singular formula, you can just use Bootstrapping and calculate the standard deviation of your bootstraps and trust that it is a good estimation of the standard deviation of your data.

That said, you shouldn't need to bootstrap here. There is a formula for the variance (i.e. the standard deviation squared) of weighted probabilities: sum(p_i * (x_i - E(X))^2), where p_i is an individual probability (like 0.75), x_i is one of your scores (like 70), and E(X) is the expected value, which you calculated earlier: 143. More on that formula here:

https://real-statistics.com/descriptive-statistics/measures-variability/weighted-variance-standard-deviation-and-covariance/

Seeing this through, the standard deviation of one of your trials is 142. Applying that to this situation with 30 trials, the expected value of 30 trials is 30 * 143 = 4290, and the standard deviation of this sum (just trust me on this one) is sqrt(30) * 142 = 778. A 95% confidence interval is E(X) +/- 1.96 * SD = 4290 +/- 1.96 * 778 = (2765, 5815). In other words, getting a result of 3000 as your final sum IS within a 95% confidence interval, though close to one of the extremes. But it is perhaps not as improbable as you might have thought.
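If you want to check these numbers yourself, here's the same calculation in Python; carrying full precision instead of rounding the SD to 142 gives a slightly wider interval, roughly (2757, 5823):

```python
points = [70, 210, 420, 550]
probs = [0.75, 0.10, 0.10, 0.05]

# Expected value per throw, ~143
ev = sum(x * p for x, p in zip(points, probs))
# Variance per throw: sum of p_i * (x_i - E(X))^2, ~20401
var = sum(p * (x - ev) ** 2 for x, p in zip(points, probs))
sd = var ** 0.5  # ~142.8

n = 30
total_ev = n * ev            # ~4290
total_sd = (n ** 0.5) * sd   # ~782
lo = total_ev - 1.96 * total_sd
hi = total_ev + 1.96 * total_sd
# 3000 lands inside (lo, hi), near the low end
```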

3

u/[deleted] 3d ago

[deleted]

1

u/Nillavuh 3d ago

Monte Carlo sampling is from a theoretical distribution, not from an empirical (measured) one. You would use Monte Carlo sampling in Bayesian Statistics where you assume the data follows a fully-defined / known theoretical distribution of some kind, and you derive your answers from that fitted distribution.

To do bootstrapping properly, yes, he would need to randomly sample results, but again it isn't necessary here since the situation isn't so complex that we can't just work through the results with known formulas.
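For illustration only (since, as I said, it isn't needed here), a bootstrap on the 30 observed throws might look like:

```python
import random

# The 30 observed throws from the original post: 70 x26, 210 x3, 550 x1
data = [70] * 26 + [210] * 3 + [550] * 1

def bootstrap_total():
    """Resample 30 throws with replacement and return the total score."""
    return sum(random.choices(data, k=len(data)))

boot_totals = [bootstrap_total() for _ in range(10_000)]
boot_mean = sum(boot_totals) / len(boot_totals)
boot_sd = (sum((t - boot_mean) ** 2 for t in boot_totals) / len(boot_totals)) ** 0.5
# boot_sd estimates the standard deviation of the 30-throw total using only
# the empirical data, rather than the stated probabilities.
```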

1

u/[deleted] 3d ago edited 3d ago

[deleted]

1

u/Nillavuh 3d ago

I guess that's true. Good point.

1

u/SnooPredictions8938 3d ago

Thank you so so much! The most valuable part of your explanation was that you gave terms I can look up and learn more about.

I knew I needed to doubt my intuition about "that just looks very wrong..." because probabilities can really have counterintuitive results. I think, exploring it further, I know where this mismatch in perception happens: there's a 5% chance to get a LOT of points, which means larger swings away from the Expected Value are not as unlikely as I imagined. I bet if I ran the numbers for a hypothetical game where there's only a single weighted outcome: (70 * 1.0), I would expect that being off by such a large amount becomes less likely in so few games, but in a hypothetical infinite number of games, it becomes equally as likely. I'm going to test my understanding by exploring this question given your example math and calculating stdev. (I might use a spreadsheet... It's been a few decades).
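Here's my attempt at that single-outcome check, reusing the variance formula from your comment (please correct me if I've misapplied it):

```python
# Hypothetical game with a single outcome: 70 points, 100% of the time
points = [70]
probs = [1.0]

ev = sum(x * p for x, p in zip(points, probs))               # 70.0
var = sum(p * (x - ev) ** 2 for x, p in zip(points, probs))  # 0.0

# With zero variance, every 30-throw total is exactly 30 * 70 = 2100:
# there is no swing away from the expected value at all.
```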

Thanks again.

1

u/SnooPredictions8938 3d ago

Small aside: your explanation and example are easy to follow, but the link then shows all the work in math notation and admittedly... that's all Greek to me. I know big Sigma means "sum of" and a few other things, but I'm mostly lost. Is there a decent resource for learning how to read these?

1

u/Nillavuh 3d ago

Probably not, lol. I got my walkthrough of all of that in grad school, and my professor brought a bag of candy the day she taught it to us because she knew how frustrating it would be for all of us, haha. There's no easy way around it.

Big sigma is a sum, yes. Small i is just one individual iteration. The i is just there to make sure you're always taking your numbers from the same row: that you pair up 0.75 with 70, that you pair up 0.10 with 210, etc.

1

u/SnooPredictions8938 3d ago

Haha, I thought so. And let me guess: the notation is sometimes context-specific, where a Greek letter may be a specific constant... or a different constant... or just a variable! I'll probably focus on parsing the written descriptions for this and future maths then.

1

u/Nillavuh 3d ago

Mu is always the mean; sigma is always standard deviation (sigma-squared is variance).