Twin Prime Cluster

Distributed verification of 400,000-digit candidates using a custom Primorial Shielding protocol.

Analysis Scale: $10^{20}$
Throughput: 82k / sec
Protocol: Miller-Rabin
Discovery: Mod 5 Opt.

The Problem

The Twin Prime Conjecture asserts that there are infinitely many prime pairs $(p, p+2)$. Verifying large candidates relies primarily on primality testing (typically LLR or Fermat tests) of numbers of the form $k \cdot 2^N \pm 1$.
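As a rough illustration of the Fermat test mentioned above, the sketch below checks both neighbours of a center $k \cdot 2^N$ using Python's built-in modular exponentiation. The function names and the choice of base 3 are assumptions made for this example, not details taken from the project's code, and the toy inputs are far smaller than real candidates.

```python
def is_fermat_prp(n: int, base: int = 3) -> bool:
    """Return True if n passes a Fermat test to the given base (probable prime)."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    # Fermat's little theorem: if n is prime and gcd(base, n) == 1,
    # then base^(n-1) is congruent to 1 modulo n.
    return pow(base, n - 1, n) == 1


def twin_candidates(k: int, n_exp: int) -> tuple[int, int]:
    """Return the pair (k*2^N - 1, k*2^N + 1) around the center k*2^N."""
    center = k * 2**n_exp
    return center - 1, center + 1


def is_twin_prp(k: int, n_exp: int) -> bool:
    """True if both neighbours of the center pass the Fermat test."""
    low, high = twin_candidates(k, n_exp)
    return is_fermat_prp(low) and is_fermat_prp(high)


# Toy example: k = 3, N = 2 gives the center 12 and the pair (11, 13).
print(twin_candidates(3, 2), is_twin_prp(3, 2))
```

A pass only establishes a probable prime: for instance, 91 = 7 · 13 passes the base-3 test even though it is composite, which is why a PRP result still needs a deterministic proof before it counts as a prime.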

The challenge is computational cost. Checking a single 400,000-digit number takes hours on a standard CPU, so validating a batch of 10,000 candidates sequentially would take years. We needed a way to distribute this embarrassingly parallel workload without incurring massive cloud costs.

The Solution: Primorial Shielding

To avoid wasting cycles on candidates divisible by small primes (3, 5, 7, ...), we implemented a Shielding strategy. We construct a target center $C$ using the primorial $P\#_{100}$ (the product of the first 100 primes):

$$C = k \cdot P\#_{100} \cdot 2^{1{,}500{,}000}$$

This ensures that $C \pm 1$ is coprime to all small Shield primes, increasing the probability of primality by orders of magnitude compared to random integers.
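The shielding property can be checked directly on a scaled-down center. The sketch below is a minimal illustration, assuming a small exponent in place of 1,500,000 so it runs instantly; the helper names `first_primes` and `shielded_center` are hypothetical.

```python
from math import gcd, prod


def first_primes(count: int) -> list[int]:
    """Return the first `count` primes by trial division (fine for count = 100)."""
    primes: list[int] = []
    candidate = 2
    while len(primes) < count:
        # Only primes up to sqrt(candidate) need to be tried.
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def shielded_center(k: int, exponent: int, shield_size: int = 100) -> tuple[int, int]:
    """Return (C, primorial) with C = k * primorial * 2^exponent."""
    primorial = prod(first_primes(shield_size))
    return k * primorial * 2**exponent, primorial


# Scaled-down example: exponent 64 instead of 1,500,000 (assumption for speed).
center, primorial = shielded_center(k=1, exponent=64)

# C is a multiple of every Shield prime, so C - 1 and C + 1 share no factor with them.
print(gcd(center - 1, primorial), gcd(center + 1, primorial))
```

Because $C \equiv 0 \pmod{p}$ for every Shield prime $p$, both neighbours satisfy $C \pm 1 \equiv \pm 1 \pmod{p}$, so no trial division by those primes is needed.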

[Visualization: Primorial Shielding field filtering trivial composites]

Cluster Architecture

1. Candidate Generation

Python scripts generate pre-sieved candidates ($k \cdot 2^N \pm 1$) using the Shielding logic. Batches of 1,000 are serialized to Cloud Storage.
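The batching step might look roughly like the sketch below. It is an assumption-laden illustration: the record layout (`k`, `n`, `expr`), the function names, and the use of JSON are all hypothetical, and the upload to Cloud Storage is deliberately left out rather than guessed at.

```python
import json
from itertools import islice
from typing import Iterable, Iterator


def generate_candidates(k_values: Iterable[int], exponent: int) -> Iterator[dict]:
    """Yield compact candidate records instead of materialising huge integers."""
    for k in k_values:
        # Hypothetical record layout: the center is k * 2^exponent.
        yield {"k": k, "n": exponent, "expr": f"{k}*2^{exponent}"}


def batched(items: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Split an iterable into consecutive lists of at most `size` items."""
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


def serialize_batches(candidates: Iterable[dict], size: int = 1000) -> list[str]:
    """Return one JSON document per batch, ready to be written to storage."""
    return [json.dumps(batch) for batch in batched(candidates, size)]


payloads = serialize_batches(generate_candidates(range(1, 2501), 1_500_000))
print(len(payloads), "batches")
```

Storing `(k, n)` pairs rather than the expanded integers keeps each batch small, since a single expanded candidate would run to hundreds of thousands of digits.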

2. Fleet Orchestration

A custom launch_fleet.py controller spawns preemptible compute-optimized instances (c2-standard-4). Each instance pulls a unique slice of candidates.
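One simple way to give each instance a unique slice is a strided partition over the batch identifiers. This is only a sketch of the idea: the function `slice_for_instance` and the batch naming are hypothetical, and the real launch_fleet.py may assign work differently.

```python
def slice_for_instance(batch_ids: list[str], instance_index: int, fleet_size: int) -> list[str]:
    """Return the batches owned by one instance under a strided partition."""
    if fleet_size <= 0:
        raise ValueError("fleet_size must be positive")
    if not 0 <= instance_index < fleet_size:
        raise ValueError("instance_index must be in [0, fleet_size)")
    # Instance i takes batches i, i + fleet_size, i + 2 * fleet_size, ...
    return batch_ids[instance_index::fleet_size]


# Hypothetical batch names; ten batches spread over a fleet of four instances.
batch_ids = [f"batch-{i:04d}.json" for i in range(10)]
for index in range(4):
    print(index, slice_for_instance(batch_ids, index, 4))
```

The slices are disjoint and together cover every batch. With preemptible instances, a slice owned by a preempted machine would still need to be re-queued by the controller, which this sketch does not attempt.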

3. PFGW Verification

Instances execute PFGW (PrimeForm/GW, available for Windows and Linux) to run Fermat primality tests. Results are piped back to a centralized results bucket.

High-Altitude Statistical Analysis ($10^{20}$)

Note: These statistical deviations are conditional on the specific candidate generation process (primorial shielding) and are presented as hypothesis-generating observations rather than as established differences in the underlying probability distributions.

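| Factor Subset     | Sample Size | Twin Centers (rate) | Lift vs Baseline |
|-------------------|-------------|---------------------|------------------|
| Divisible by 5    | 741,267     | 4,594 (0.62%)       | +67.0%           |
| Divisible by 7    | 530,388     | 2,674 (0.50%)       | +35.9%           |
| Divisible by 11   | 337,324     | 1,570 (0.47%)       | +25.5%           |
| Divisible by 13   | 285,038     | 1,265 (0.44%)       | +19.6%           |
| Baseline (Random) | 3,708,924   | 13,760 (0.37%)      | 1.0x             |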
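The rate and lift columns follow directly from the raw counts. The short sketch below recomputes them from the table; the variable and function names are chosen for this example only.

```python
# (sample size, twin centers) taken from the table above.
SUBSETS = {
    "Divisible by 5": (741_267, 4_594),
    "Divisible by 7": (530_388, 2_674),
    "Divisible by 11": (337_324, 1_570),
    "Divisible by 13": (285_038, 1_265),
}
BASELINE = (3_708_924, 13_760)


def rate(sample_size: int, twin_centers: int) -> float:
    """Fraction of candidates in the subset that are twin centers."""
    return twin_centers / sample_size


def lift_percent(subset: tuple[int, int], baseline: tuple[int, int]) -> float:
    """Relative increase of the subset rate over the baseline rate, in percent."""
    return (rate(*subset) / rate(*baseline) - 1.0) * 100.0


for name, counts in SUBSETS.items():
    print(f"{name}: rate {rate(*counts):.2%}, lift {lift_percent(counts, BASELINE):+.1f}%")
```

Recomputing the figures this way reproduces the published Lift column to one decimal place, which confirms that the table is internally consistent.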

Key Findings

  • Analyzed 3.7 million candidates at $N = 10^{20}$, with a baseline twin-center rate of 0.37%.
  • Mod 5 Optimization: Candidates divisible by 5 are 67% more likely to be Twin Centers than random candidates.
  • This contradicts the naive intuition that removing small factors is always better; in this sample, the distribution of Twin Centers is biased toward multiples of 5, 7, 11, and 13.
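One heuristic reading of these lifts, offered here as an assumption rather than a result from this analysis, is a residue-class argument in the spirit of Hardy and Littlewood. For a prime $p \ge 5$, a center $n$ can only be a twin center if $n \not\equiv \pm 1 \pmod{p}$, which leaves $p - 2$ admissible residues. If twin centers are spread evenly over those residues while the baseline candidates are spread evenly over all $p$ residues, the expected lift for centers divisible by $p$ is:

```latex
\[
  \frac{\Pr[\,n \text{ is a twin center} \mid p \text{ divides } n\,]}
       {\Pr[\,n \text{ is a twin center}\,]}
  \;\approx\; \frac{1/(p-2)}{1/p} \;=\; \frac{p}{p-2},
  \qquad p \ge 5 .
\]
\[
  \frac{5}{3} \approx 1.667, \qquad
  \frac{7}{5} = 1.400, \qquad
  \frac{11}{9} \approx 1.222, \qquad
  \frac{13}{11} \approx 1.182 .
\]
```

Under that assumption the predicted lifts are about +66.7%, +40.0%, +22.2%, and +18.2%, which are of the same order as the observed +67.0%, +35.9%, +25.5%, and +19.6%.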

Limitations

While we found several Probable Primes (PRP), no actual Twin Prime pair was discovered in this batch. The density of twin primes at $10^{400000}$ is extremely low, so a definitive hit requires far more compute than this batch provided.
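To put a rough number on that density, the sketch below evaluates the conjectural Hardy-Littlewood estimate $2 C_2 / (\ln n)^2$ for the chance that a random integer near $n$ starts a twin prime pair. This is a heuristic for unsieved integers, not a measurement from this project, and shielded candidates would have a higher hit rate; the function name is hypothetical.

```python
import math

# Twin prime constant C2 (approximate value).
TWIN_PRIME_CONSTANT = 0.6601618158


def twin_density(digits: int) -> float:
    """Heuristic chance that a random integer near 10^digits starts a twin prime pair."""
    ln_n = digits * math.log(10)  # ln(10^digits), computed without building the huge number
    return 2 * TWIN_PRIME_CONSTANT / ln_n**2


for digits in (20, 400_000):
    print(f"10^{digits}: {twin_density(digits):.3e}")
```

At 400,000 digits the estimate is on the order of $10^{-12}$ per unsieved integer, which gives a sense of why a batch of this size would not be expected to contain a twin pair without substantial sieving.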
