A probabilistic pub quiz for nerds
The typical pub quiz has a “true or false” round.
You know the game:
the quizmaster tries to trick you
with statements that are often believed to be true,
but which are in fact false.
In this game,
you’re rewarded for accuracy,
but not for your confidence --
and that’s unfortunate, because
drunk people just love high-stakes gambling!
In this post,
I show the “Jim scoring” system,
a simple way to inject risk and alcoholic overconfidence
into your “true or false” quiz round.
As a refresher,
here’s a traditional “true or false” question:
Sydney is the capital of Australia. True or false?
If you answer right, you get a point.
If you answer wrong, you get no points.
If you don’t know the answer,
just have a punt,
because you can’t lose anything.
But now consider the following variant,
and consider it carefully:
Philadelphia is the capital of Pennsylvania. How likely is this?
Instead of asking for a black-and-white answer,
this asks you for your level of confidence.
Now, you’re rewarded based on two things:
your correctness, and your level of confidence.
The scoring system is as follows:
| Your answer | True | False |
|---|---|---|
| 80-100% likely | 3 pts | -7 pts |
| 60-80% likely | 2 pts | -3 pts |
| 40-60% likely | 0 pts | 0 pts |
| 20-40% likely | -3 pts | 2 pts |
| 0-20% likely | -7 pts | 3 pts |
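If you'd rather score the round with a script than a napkin, here is a minimal sketch of the table as a lookup. The names `JIM_SCORES` and `jim_score` are made up for illustration; only the point values come from the table.

```python
# Jim scoring table: answer -> (points if the statement is true, points if false).
# The dictionary and function names are illustrative, not part of the original post.
JIM_SCORES = {
    "80-100%": (3, -7),
    "60-80%": (2, -3),
    "40-60%": (0, 0),
    "20-40%": (-3, 2),
    "0-20%": (-7, 3),
}


def jim_score(answer: str, statement_is_true: bool) -> int:
    """Look up the points for one answer once the truth is revealed."""
    if_true, if_false = JIM_SCORES[answer]
    return if_true if statement_is_true else if_false


# Philadelphia is not the capital of Pennsylvania, so the statement is false.
print(jim_score("80-100%", False))  # -7
print(jim_score("0-20%", False))    # 3
```

Sitting on the fence with “40-60% likely” always scores 0, whatever the truth turns out to be.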
The Jim scoring system sure looks odd at first glance!
There’s a “magic sequence” of numbers: 3, 2, 0, -3, -7.
Stare at it for a few seconds, and you might spot a pattern.
But why is this a “good” scoring system?
It turns out the Jim scoring system has a very nice property:
the optimal strategy is to choose the option that matches your true belief.
You can’t “cheat” by pretending to be more confident than you truly are
in order to gain points.
To see why this scoring system rewards true reporting,
think about your expected score given your belief.
I’ll explain with another example:
Approximately ⅓ of human bones are in the feet. How likely is this?
Perhaps you’ve heard something like this before --
but was it ⅓ of bones, or ¼, or ½ ..?
Was it the feet, or one foot, or the hands ..?
Let’s say you think it’s 70% likely that “Approximately ⅓ of human bones are in the feet”.
Then your expected score for picking “60-80% likely” is calculated as
(70% × 2 pts) +
(30% × -3 pts).
This expected score comes out at 0.5,
which is higher than your expected score for any other answer.
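For comparison, here is the same calculation for all five answers at a belief of 70%, using the point values from the scoring table. The notation E[answer] is just shorthand for "expected score of that answer".

```latex
\begin{align*}
E[\text{80-100\%}] &= 0.7 \times 3 + 0.3 \times (-7) = 0 \\
E[\text{60-80\%}]  &= 0.7 \times 2 + 0.3 \times (-3) = 0.5 \\
E[\text{40-60\%}]  &= 0.7 \times 0 + 0.3 \times 0 = 0 \\
E[\text{20-40\%}]  &= 0.7 \times (-3) + 0.3 \times 2 = -1.5 \\
E[\text{0-20\%}]   &= 0.7 \times (-7) + 0.3 \times 3 = -4
\end{align*}
```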
[Plot: your expected score, given your belief and your answer.]
Notice that the “0-20% likely” answer is optimal
precisely in the range 0-20%, and so on.
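You can check this claim numerically. The sketch below computes the expected score of every answer for a handful of beliefs and picks the best one; `expected_score` and `best_answer` are illustrative names, and the sampled beliefs are arbitrary midpoints of each range.

```python
# Jim scoring table: answer -> (points if the statement is true, points if false).
JIM_SCORES = {
    "80-100%": (3, -7),
    "60-80%": (2, -3),
    "40-60%": (0, 0),
    "20-40%": (-3, 2),
    "0-20%": (-7, 3),
}


def expected_score(answer: str, belief: float) -> float:
    """Expected points for an answer, given your believed probability of 'true'."""
    if_true, if_false = JIM_SCORES[answer]
    return belief * if_true + (1 - belief) * if_false


def best_answer(belief: float) -> str:
    """The answer with the highest expected score for this belief."""
    return max(JIM_SCORES, key=lambda answer: expected_score(answer, belief))


# Sample the midpoint of each range and report which answer wins there.
for percent in (10, 30, 50, 70, 90):
    print(f"belief {percent}% -> best answer {best_answer(percent / 100)}")
```

At the exact boundaries (20%, 40%, 60%, 80%) the two neighbouring answers tie, so it doesn't matter which side you pick there.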
Now that you’ve had time to think about it,
and you know the theory,
you can have a go:
Approximately ⅓ of human bones are in the feet. How likely is this?
The Jim scoring system is discrete:
it asks you to put your belief into one of five categories.
But if you’re a real nerd,
you can use a continuous scoring system.
Here is one such system:
- You enter an answer a between 0 and 1 (that is, “0% likely” to “100% likely”).
- If the statement is true, your score is log(a).
- If the statement is false, your score is log(1 - a).
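As a rough sketch, the rule above fits in a few lines. Two assumptions here are mine rather than the rule's: the logarithm is the natural log, and answers of exactly 0 or 1 are rejected, because log(0) is undefined and the penalty for being certain and wrong would be unbounded. The name `log_score` is illustrative.

```python
import math


def log_score(answer: float, statement_is_true: bool) -> float:
    """Score an answer a strictly between 0 and 1: log(a) if true, log(1 - a) if false."""
    if not 0 < answer < 1:
        # Assumption: exclude the endpoints, since log(0) is undefined.
        raise ValueError("answer must be strictly between 0 and 1")
    return math.log(answer if statement_is_true else 1 - answer)


print(log_score(0.5, True))    # about -0.693
print(log_score(0.99, False))  # about -4.605
```

Every score is negative, so the aim is to lose as little as possible. Being 99% sure and wrong costs far more than sitting at 50%.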
If you want to see why this works,
suppose your believed probability is p.
Then your expected score is p × log(a) + (1 - p) × log(1 - a).
It turns out that to maximize this expected score,
you should set a=p --
that is, you should answer with your true believed probability.
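For the real nerds, here is a short sketch of why a = p is the maximum, assuming the natural logarithm and a belief p strictly between 0 and 1. Set the derivative of the expected score with respect to a to zero:

```latex
\begin{align*}
E(a) &= p \log(a) + (1 - p) \log(1 - a) \\
\frac{dE}{da} &= \frac{p}{a} - \frac{1 - p}{1 - a} = 0 \\
p\,(1 - a) &= a\,(1 - p) \\
a &= p \\
\frac{d^2E}{da^2} &= -\frac{p}{a^2} - \frac{1 - p}{(1 - a)^2} < 0
\end{align*}
```

The second derivative is negative everywhere on 0 < a < 1, so the stationary point at a = p is the unique maximum.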
But people in pubs don’t like logarithms --
they like quizzes and gambling.
The Jim scoring system
adds some fun gambling to the quiz.
It’s funny to see the effects of alcohol:
alcohol biases your confidence as well as your accuracy,
resulting in drunk people racking up many -7 scores.
Watch the confidence ratings become more extreme towards the end of the quiz.
Try it out in your next family quiz over Zoom.
But in the meantime,
I’ll leave you to test your confidence on this 10-question quiz.
A probabilistic pub quiz for nerds
A “true or false” quiz where you respond with your confidence level, and the optimal strategy is to report your true belief. 2020-04-26
This page copyright James Fisher 2020. Content is not associated with my employer.