Akshit Sivaraman | Research & Systems

The Problem

The Twin Prime Conjecture asserts that there are infinitely many prime pairs $(p, p+2)$. Verifying large candidates relies mainly on primality testing (typically LLR or Fermat tests) of numbers of the form $k \cdot 2^N \pm 1$.

The challenge is computational cost. Checking a single 400,000-digit number takes hours on a standard CPU, so validating a batch of 10,000 candidates sequentially would take years. We needed a way to spread this embarrassingly parallel workload across many machines without incurring massive cloud costs.

The Solution: Primorial Shielding

To avoid wasting cycles on candidates divisible by small primes (3, 5, 7, ...), we implemented a Shielding strategy. We construct a target center $C$ using the Primorial $P\#_{100}$ (the product of the first 100 primes):

$$C = k \cdot P\#_{100} \cdot 2^{1{,}500{,}000}$$

Because $C$ is divisible by every Shield prime, $C - 1$ and $C + 1$ are both coprime to all of them. This raises the probability of primality well above that of random integers of the same size.
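The shielding property can be checked on a scaled-down example. The following is a minimal sketch, not the project's generator: the helper names, the value `k=7`, and the toy exponent 64 (instead of 1,500,000) are assumptions chosen so it runs instantly. It also estimates the per-number gain from removing the Shield primes as the product of $p/(p-1)$, which comes out at roughly an order of magnitude for a single number.

```python
from math import gcd, prod

def first_primes(count):
    """Return the first `count` primes by trial division (fine for count=100)."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes

SHIELD = first_primes(100)      # 2, 3, 5, ..., 541
PRIMORIAL = prod(SHIELD)        # P#_100

def center(k, exponent):
    """C = k * P#_100 * 2^exponent (hypothetical helper)."""
    return (k * PRIMORIAL) << exponent

C = center(k=7, exponent=64)    # toy exponent, assumed for illustration

# Neither neighbour of C shares a factor with any Shield prime.
shielded = gcd(C - 1, PRIMORIAL) == 1 and gcd(C + 1, PRIMORIAL) == 1

# Heuristic gain for one number from excluding the Shield primes.
boost = prod(p / (p - 1) for p in SHIELD)
print(shielded, round(boost, 1))
```

The same check holds for any `k` and any exponent, since it only depends on $C$ being a multiple of the primorial.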


Cluster Architecture

1. Candidate Generation

Python scripts generate pre-sieved candidates ($k \cdot 2^N \pm 1$) using the Shielding logic. Batches of 1,000 candidates are serialized to Cloud Storage.
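As an illustration of this step, here is a minimal sketch of pre-sieving and batching. It assumes candidates of the form $k \cdot P\#_{100} \cdot 2^N \pm 1$, sieves with the primes between the Shield and 10,000 (an arbitrary limit chosen here), and returns JSON strings rather than uploading anything. All function names and the JSON layout are hypothetical.

```python
import json
from math import prod

def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(flags) if flag]

PRIMES = primes_up_to(10_000)
SHIELD, SIEVE = PRIMES[:100], PRIMES[100:]   # Shield = first 100 primes
PRIMORIAL = prod(SHIELD)

def sieve_residues(exponent):
    """Precompute (q, P# * 2^exponent mod q) for every sieve prime q."""
    return [(q, PRIMORIAL % q * pow(2, exponent, q) % q) for q in SIEVE]

def survives(k, residues):
    """True if neither C - 1 nor C + 1 is divisible by a sieve prime."""
    for q, base in residues:
        r = k * base % q          # C mod q
        if r == 1 or r == q - 1:  # q divides C - 1 or C + 1
            return False
    return True

def make_batches(k_values, exponent, batch_size=1000):
    """Yield JSON strings, each holding up to batch_size surviving k values."""
    residues = sieve_residues(exponent)
    batch = []
    for k in k_values:
        if survives(k, residues):
            batch.append(k)
            if len(batch) == batch_size:
                yield json.dumps({"exponent": exponent, "k": batch})
                batch = []
    if batch:
        yield json.dumps({"exponent": exponent, "k": batch})
```

Working with residues modulo each sieve prime keeps the sieve cheap even when $N$ is 1,500,000, because the full candidate never has to be materialized.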

2. Fleet Orchestration

A custom launch_fleet.py controller spawns preemptible compute-optimized instances (c2-standard-4). Each instance pulls a unique slice of candidates.
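The source of launch_fleet.py is not shown here, so the following is only a sketch of how unique slices might be assigned. The function names, the `twin-worker-N` instance names, the zone, and the `batches` metadata key are all assumptions. The command is built as an argument list and never executed.

```python
def assign_slices(batch_names, instance_count):
    """Round-robin the batches so each one goes to exactly one instance."""
    return [batch_names[i::instance_count] for i in range(instance_count)]

def launch_command(index, batches, zone="us-central1-a"):
    """Build (but do not run) a gcloud command for one preemptible worker."""
    return [
        "gcloud", "compute", "instances", "create", f"twin-worker-{index}",
        f"--zone={zone}",
        "--machine-type=c2-standard-4",
        "--preemptible",
        # Hypothetical metadata key telling the worker which batches to pull.
        f"--metadata=batches={';'.join(batches)}",
    ]

names = [f"batch-{i:04d}.json" for i in range(10)]
slices = assign_slices(names, instance_count=4)
commands = [launch_command(i, s) for i, s in enumerate(slices)]
```

Because preemptible instances can be reclaimed at any time, a real controller would also need to track which slices finished and reassign the rest. That bookkeeping is omitted here.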

3. PFGW Verification

Instances execute PFGW (PrimeForm/GW, available for Windows and Linux) to run Fermat primality tests. Results are piped back to a centralized results bucket.
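For readers unfamiliar with the test itself, here is a minimal pure-Python sketch of a Fermat probable-prime check applied to both neighbours of a center. It illustrates the idea only and says nothing about how PFGW implements it. The base 3 and the function names are assumptions.

```python
def is_fermat_prp(n, base=3):
    """Fermat test: n is a probable prime if base^(n-1) = 1 (mod n)."""
    if n < 4:
        return n in (2, 3)
    return pow(base, n - 1, n) == 1

def is_twin_center_prp(c, base=3):
    """True if both c - 1 and c + 1 pass the Fermat test."""
    return is_fermat_prp(c - 1, base) and is_fermat_prp(c + 1, base)

print(is_twin_center_prp(30))   # neighbours 29 and 31
print(is_twin_center_prp(24))   # neighbours 23 and 25
print(is_fermat_prp(91))        # 91 = 7 * 13 still passes base 3
```

The last call shows why a pass only yields a probable prime: some composites satisfy the congruence for a given base, so a PRP result still needs a separate proof of primality.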

High-Altitude Statistical Analysis ($10^{20}$)

Note: These statistical deviations are conditional on the specific candidate generation process (primorial shielding). They are presented as hypothesis-generating observations, not as evidence that the underlying probability distribution differs.

Factor Subset	Sample Size	Twin Centers	Lift vs Baseline
Divisible by 5	741,267	4,594 (0.62%)	+67.0%
Divisible by 7	530,388	2,674 (0.50%)	+35.9%
Divisible by 11	337,324	1,570 (0.47%)	+25.5%
Divisible by 13	285,038	1,265 (0.44%)	+19.6%
Baseline (Random)	3,708,924	13,760 (0.37%)	+0.0% (reference)
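The rate and lift columns follow directly from the counts. A short sketch that recomputes them from the table's sample sizes and Twin Center counts (the variable and function names are illustrative):

```python
# (sample size, twin centers) copied from the table above
SUBSETS = {
    "Divisible by 5": (741_267, 4_594),
    "Divisible by 7": (530_388, 2_674),
    "Divisible by 11": (337_324, 1_570),
    "Divisible by 13": (285_038, 1_265),
}
BASELINE = (3_708_924, 13_760)

def rate(sample_size, twin_centers):
    """Fraction of candidates in the sample that are Twin Centers."""
    return twin_centers / sample_size

def lift_percent(subset, baseline=BASELINE):
    """Relative increase of the subset rate over the baseline rate, in %."""
    return (rate(*subset) / rate(*baseline) - 1) * 100

for name, counts in SUBSETS.items():
    print(f"{name}: {rate(*counts):.2%} rate, {lift_percent(counts):+.1f}% lift")
```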

Key Findings

Analyzed 3.7 million candidates at $N=10^{20}$, with a baseline Twin Center rate of 0.37%.
Mod 5 Optimization: In this sample, candidates divisible by 5 are 67% more likely to be Twin Centers than the baseline.
This runs against the naive intuition that removing small factors is always better: within this sample, Twin Centers are over-represented among multiples of 5, 7, 11, and 13.
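One possible reading of these lifts, offered as a standard heuristic and not as a result of this project: a center $n$ can only be a Twin Center if $n \not\equiv \pm 1 \pmod p$ for every odd prime $p$. If centers are spread evenly over the residues mod $p$ (the subset sizes in the table are close to $1/p$ of the baseline, which is consistent with this), then $p - 2$ of the $p$ residues are admissible. Conditioning on $p \mid n$ guarantees an admissible residue, which suggests

$$
\frac{\Pr[\text{Twin Center} \mid p \text{ divides } n]}{\Pr[\text{Twin Center}]} \approx \frac{1}{(p-2)/p} = \frac{p}{p-2}.
$$

This gives $5/3 \approx 1.667$ for $p = 5$, $7/5 = 1.4$ for $p = 7$, $11/9 \approx 1.222$ for $p = 11$, and $13/11 \approx 1.182$ for $p = 13$, against observed lifts of +67.0%, +35.9%, +25.5%, and +19.6%. The estimate treats divisibility by different primes as independent, so it is a rough guide only.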

Limitations

While we found several Probable Primes (PRP), no Twin Prime pair was discovered in this batch. The density of twin primes near $10^{400000}$ is extremely low, so a definitive hit would require far more compute than this batch used.
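To put a rough number on "extremely low", the sketch below evaluates the Hardy-Littlewood heuristic density $2 C_2 / (\ln x)^2$ for unsieved integers near $10^{400000}$. It is an order-of-magnitude estimate under that heuristic. It does not account for the gain from shielding or pre-sieving, and the function name is illustrative.

```python
from math import log

TWIN_PRIME_CONSTANT = 0.6601618158   # C_2 in the Hardy-Littlewood heuristic

def twin_center_density(digits):
    """Heuristic chance that a random n near 10^digits has n-1, n+1 both prime."""
    ln_x = digits * log(10)
    return 2 * TWIN_PRIME_CONSTANT / ln_x ** 2

density = twin_center_density(400_000)
expected_trials = 1 / density        # unsieved candidates per expected hit
print(f"density ~ {density:.2e}, about {expected_trials:.1e} random trials per hit")
```

Shielding and sieving reduce the number of full primality tests needed per hit, but the quadratic dependence on $\ln x$ remains.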

