π Statistics for 4.2 x 10^9 decimal digits
by JVSchmidt (Last updated 2008/09/20)


Motivation for π mining
π is a surprising number.
Over hundreds of years mathematicians have found dozens of representations and armies of formulas to compute the value of Ludolph's number. Nevertheless, every new record in π calculation again demonstrates the apparent randomness of the digit sequence. Frequency analyses of these results have not revealed any regularity in the digits.
So, how do well-defined formulas produce random output?
David Bailey and Richard Crandall are working on a proof that the famous BBP formula, which yields the hexadecimal representation of any given digit of π, generates random digits.
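
To make the BBP idea concrete, here is a minimal Python sketch of hexadecimal digit extraction. It is only an illustration, not the method behind Bailey and Crandall's work or the statistics on this page. The function names `_bbp_series` and `pi_hex_digit` are made up for this example, and plain floating-point arithmetic limits it to modest positions.

```python
# BBP formula: pi = sum_{k>=0} 16^-k * (4/(8k+1) - 2/(8k+4) - 1/(8k+5) - 1/(8k+6))
# Hypothetical helper names; double precision keeps this reliable only for
# modest positions, so treat it as a sketch.

def _bbp_series(j, n):
    """Fractional part of sum_{k>=0} 16^(n-k) / (8k + j)."""
    s = 0.0
    # Terms with k <= n: reduce 16^(n-k) modulo the denominator first,
    # so only the fractional contribution is accumulated.
    for k in range(n + 1):
        d = 8 * k + j
        s = (s + pow(16, n - k, d) / d) % 1.0
    # Terms with k > n shrink geometrically; stop once they are negligible.
    k = n + 1
    while True:
        term = 16.0 ** (n - k) / (8 * k + j)
        if term < 1e-17:
            break
        s += term
        k += 1
    return s % 1.0


def pi_hex_digit(n):
    """Hex digit of pi at offset n after the point (n = 0 is the first one)."""
    x = (4 * _bbp_series(1, n) - 2 * _bbp_series(4, n)
         - _bbp_series(5, n) - _bbp_series(6, n)) % 1.0
    return "0123456789abcdef"[int(x * 16)]


if __name__ == "__main__":
    # pi in hexadecimal starts 3.243f6a88...
    print("".join(pi_hex_digit(i) for i in range(8)))
```

Each digit is computed on its own, without the digits before it, which is what makes the formula interesting for questions about digit behaviour.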

An exact mathematical proof of π's "randomness" has not been given yet. Thus any deeper statistical examination of known π sequences could be helpful. The results of such tests so far are surprisingly poor. Most of them present only digit counts. Others are based on small data sets.
Nowadays an ordinary home PC can handle huge amounts of raw data. And since I have been π-minded for a long time (see also MAGIC PIWORLD), I decided to set my computer on the trail. The results presented below were derived with a self-written statistical analysis program from a π sequence of 4.2 billion digits provided by the Yasumasa Kanada Laboratories.


AN ILLUSTRATION
The image shows the first 100 x 100 = 10,000 digits of π.
Shades of gray are chosen from light ("0") to dark ("9").
Is this a random pattern?
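
A picture of this kind can be reproduced with a few lines of Python. This is a sketch, not the tool used for the image on this page. It assumes the digits are already available as a string, writes the plain-text PGM image format, and uses the hypothetical function name `digits_to_pgm`.

```python
def digits_to_pgm(digits, width):
    """Return a plain-text PGM (P2) image for a string of decimal digits.

    Digit '0' becomes the lightest pixel and '9' the darkest, one pixel
    per digit, filled row by row.
    """
    if width <= 0 or len(digits) % width != 0:
        raise ValueError("digit count must be a positive multiple of width")
    rows = [digits[i:i + width] for i in range(0, len(digits), width)]
    # PGM header: magic number, "width height", maximum gray value.
    lines = ["P2", f"{width} {len(rows)}", "9"]
    for row in rows:
        # In PGM, 0 is black and the maximum value is white, so invert.
        lines.append(" ".join(str(9 - int(ch)) for ch in row))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Tiny demonstration with the first four decimals of pi.
    print(digits_to_pgm("1415", 2), end="")
```

For the illustration described above, the function would be called with the first 10,000 decimal digits and `width=100`, and the result saved to a `.pgm` file.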




Miner's Efforts: Analyzing data
Preparation
  • Basic Considerations
  • Data, Hardware, Software


  • Testing digit frequencies
  • Basic Results for Digit Frequencies (L=1-7)
  • Details: Single digit frequencies
  • Graphics: Single digit swing-in at a glance
  • Details: Double digit frequencies


  • Special Testing
  • Poker Hands
  • Non-repeating digits
  • Gap-Test and digit distances
  • Long Runs
  • Favourite Places
  • Cuckoo Positions


  • Analyzing Long Chains
  • Variety of Digits (up to L=40)
  • Sum of Digits (up to L=80)
  • Difference of Sums (up to L=80)
  • Chain Distance (up to L=80)



  • The Mountain: Where to get π data?
    Kanada Laboratories
    The FTP server of Yasumasa Kanada's record calculation team offers freely available data for up to 4.2 billion digits. To verify that a download is correct, some basic statistics (such as digit frequencies) can be found there as well.
    100,000 digits, formatted
    View/Download directly
    The Reformator
    Use this tool to bring the digits into the form you need.
    Super PI
    The classic freeware from Kanada Labs for calculating PI digits, often used as a benchmark for PC calculation speed.
    PIFAST 4.2 by Xavier Gourdon
    A really small and fast PI calculation freeware with a lot of output formatting options.
    PiDrops
    Calculate up to 400,000 digits with this spigot algorithm program.
    Not so fast, but interactive...
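
The download check mentioned for the Kanada data (comparing digit frequencies) can be sketched in Python as follows. The function names are made up for this example, and the chi-square value is a common summary statistic added here for illustration. It is not necessarily what the analysis program behind this page computes.

```python
from collections import Counter


def digit_frequencies(text):
    """Count the decimal digits 0-9 in text, ignoring spaces and line breaks."""
    counts = Counter(ch for ch in text if ch in "0123456789")
    return [counts.get(str(d), 0) for d in range(10)]


def chi_square(counts):
    """Chi-square statistic against a uniform distribution over the digits."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no digits counted")
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


if __name__ == "__main__":
    sample = "14159 26535\n"  # first ten decimals of pi, with formatting
    freq = digit_frequencies(sample)
    print(freq)
    print(chi_square(freq))
```

For a large file, the same counts can be accumulated chunk by chunk instead of loading all digits at once. The resulting ten numbers are then compared with the published frequencies.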


    π Statistics: Some related Papers, Books, Addresses
    Berggren, J. Borwein, P. Borwein: PI - A Source Book
    A fabulous collection of original papers in PI research.
    ISBN 0-387-98946-3

    Haenel, Arndt: PI - Algorithmen, Computer, Arithmetik
    A fine book illuminating the techniques of PI calculation, including an excellent overview of the history and mathematical background of PI.
    ISBN 3-540-66258-8

    Stan Wagon: Is PI normal?
    1985, The Mathematical Intelligencer Vol.7, No.3 (see also PI: A Source Book)
    (Poker hands statistics for the first 10 million decimal digits)

    Charles Seife: Randomly distributed slices of PI
    An article about the current efforts of Bailey and Crandall.

    Richard Preston: The Mountains of PI
    1992/03/02, The New Yorker
    A vivid article about the life and work of the Chudnovsky Brothers.

    David H. Bailey: The Computation of PI to 29,360,000 Digits
    1988, Mathematics of Computation, Vol.50, No.181 (see also PI: A Source Book)
    (Chapter 8: Statistical Analysis)

    Ted Jaditz: Are the digits of PI an IID sequence?
    Feb. 2000, The American Statistician, Vol. 54, No. 1
    The author investigates whether the first 1.25 million digits are independently and identically distributed (iid). Despite the small data set, he found the iid hypothesis to hold for sequences of up to 16 digits.

    Homepage of David H. Bailey
    A lot of information, mathematical papers and most interesting links related to the question "Is PI normal?"