The Surprising Bias
This short, satisfying program demonstrates an unlikely result. It reads the file of the first 50 million primes, which you can download from our resources.
We sort each pair of consecutive primes by the last digits of the two primes. For a given prime ending in 1, you would expect the next prime to be equally likely to end in 1, 3, 7, or 9 (above 5, those are the only possible last digits). That gives 4 × 4 = 16 categories, so each should hold around 1/16 = 6.25% of the pairs if the chances were equal.
What Actually Happens
The results show:
- Significant skew away from repeated last digits: $(1,1), (3,3), (7,7), (9,9)$
- Clear dominance for $(9,1)$ transitions
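Why? If the last digit of each prime were independent of the one before it, it should not be like this. The skew was described in 2016 by Robert Lemke Oliver and Kannan Soundararajan, whose heuristic explanation rests on the Hardy-Littlewood prime k-tuples conjecture rather than on the Riemann zeta function directly. The bias is expected to fade as the primes get larger, but only very slowly.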
C++ Implementation
Point this code at a file containing the first 50 million primes (one per line) by editing the filename string.
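#include <fstream>
#include <iomanip>
#include <iostream>
#include <string>
#include <unordered_map>
using namespace std;

int main() {
    string filename = "/path/to/50mprime.txt";
    ifstream file(filename);
    if (!file.is_open()) {
        cerr << "Failed to open the file." << endl;
        return 1;
    }

    // Counts keyed by "previous digit,current digit", e.g. "9,1".
    unordered_map<string, long long> sequence_counts;

    // Read the first prime; it has no predecessor, so it starts the chain.
    long long prev_prime;
    if (!(file >> prev_prime)) {
        cerr << "The file contains no primes." << endl;
        return 1;
    }

    long long total_count = 0;
    long long curr_prime;
    // Stops cleanly at end of file or at the first token that is not a number.
    while (file >> curr_prime) {
        int prev_digit = static_cast<int>(prev_prime % 10);
        int curr_digit = static_cast<int>(curr_prime % 10);
        string sequence = to_string(prev_digit) + "," + to_string(curr_digit);
        sequence_counts[sequence]++;
        total_count++;
        prev_prime = curr_prime;
    }
    file.close();

    if (total_count == 0) {
        cerr << "Need at least two primes to form a pair." << endl;
        return 1;
    }

    // Only the 16 pairs of digits 1, 3, 7, 9 are printed. If the file starts
    // at 2, the pairs (2,3), (3,5), (5,7) are still in total_count, which is
    // negligible next to 50 million.
    string sequences[] = {"1,1", "1,3", "1,7", "1,9",
                          "3,1", "3,3", "3,7", "3,9",
                          "7,1", "7,3", "7,7", "7,9",
                          "9,1", "9,3", "9,7", "9,9"};

    cout << "Sequence\tCount\tPercentage" << endl;
    for (const string& sequence : sequences) {
        long long count = sequence_counts[sequence];
        double percentage = (count * 100.0) / total_count;
        cout << "(" << sequence << ")\t" << count << "\t";
        cout << fixed << setprecision(2) << percentage << "%" << endl;
    }
    return 0;
}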
Compiling & Running
# Compile
g++ -std=c++17 -O2 last_digits.cpp -o last_digits
# Run (update the file path in the code first!)
./last_digits