Sublinear Algorithms and Coprimality
1 Quick Intro
I know I mentioned next post would be more in the analytics space, but I’ve been having issues sourcing data for the problem. Without giving too much away, I need the date of birth and death for about 5000 people, and my data gathering attempts have left my stuck in the mid 2000s. While I continue building this data, I thought I’d share a question I recently thought about that lead me down a long and winding road leading to my favorite computer algorithm and a surprise appearance of π: what are the odds that two randomly selected numbers are relatively prime?
2 Monte Carlo Simulation
The first place my mind tends to go in a situation like this is a Monte Carlo simulation. They're generally quick and easy to code, and can get you on the right track for a lot of problems. Of course they'll never help you find closed form solutions to problems and you'll be limited in your precision, but it can at least help give you a sense of what the answer might be and lead your investigation. Anyway, here are the results of several Monte Carlo simulations where I randomly generated two numbers between 1 and some upper limit (indicated in the table) and repeated this 10,000 times to see what percentage of the trials resulted in relatively prime numbers.
Upper limit on numbers | Percentage of pairs that were coprime |
10 | 58.5% |
100 | 60.2% |
1000 | 58.7% |
10000 | 60.1% |
100000 | 60.4% |
1000000 | 60.4% |
3 The Totient and Totient Summatory Functions
An $extremely$ useful function from the wonderful world of number theory is Euler's totient function, commonly denoted by $\phi(n)$. On the surface it might sound kind of contrived, but you'd be surprised at how often it comes up in number theory (which I know isn't exactly useful in a real world sense, but it's an indispensable tool when gaining insight into the integers). If we let $gcd(x,y)$ denote the greatest common divisor of two integers $x$ and $y$, we have the following definition: $$ \phi(x) = |\{ i | i <= x, gcd(i,x)=1 \}| $$ Or put in English, it's the count of all numbers less than or equal to x which are relatively prime to x. This function has many useful properties and applications, but in the interest of brevity I'll move on because we have more ground to cover. Given our definition of $\phi(x)$ above, we can define its "bigger brother" as follows: $$ \Phi(N) = \sum_{i=1}^{N} \phi(i) $$ Restated, we can think of this function as follows: $$ \Phi(N) = |\{ (x,y) | 1 \leq x \leq y \leq N ,\ gcd(x,y) = 1 \}| $$ It helps to frame things in terms of ordered pairs (in my opinion at least) and where we're headed this will be a good framework to ground ourselves in. As a final note, recall the following: $$ \lfloor \frac{y}{x} \rfloor = k \rightarrow (k+1) \cdot x > y $$ You may recognize this as the floor division function, and it has the following useful property: $$ \lfloor\frac{\lfloor \frac{y}{x} \rfloor}{z} \rfloor ={\lfloor \frac{y}{xz} \rfloor } $$ I will use the lamest cop out in all of academia and say the proof of this is left to the reader. I promise it's true, but the proof is somewhat involved and I don't want to dwell on it. Let's move on.
4 Counting Lattice Points
To begin our inquiry into the function $\Phi(N)$, let us define the following: $$S_{1} = \{ (x,y) | 1 \leq x \leq y \leq N , gcd(x,y) > 1\} $$ $$S_{2} = \{ (x,y) | 1 \leq x \leq y \leq N , gcd(x,y) = 1\} = \Phi(N) $$ $$S_{3} = \{ (x,y) | 1 \leq x \leq y \leq N\}$$ Of course we've defined these so that we have $$ S_{1} + S_{2} = S_{3} $$ We also know that $|S_{3}| = \frac{N(N+1)}{2} $ (this is the Nth triangle number). Using the definitions above and some substitutions we obtain the following: $$ \Phi(N) = \frac{N(N+1)}{2} - |S_{1}| $$ Meaning if we can wrap our hands around $S_{1}$ we'll be able to calculation $\Phi(N)$. Of course it won't be easy, but let's start with a seemingly innocuous observation for integers $m,x,y$: $$ gcd(x,y) = 1 \rightarrow gcd(mx,my) = m \quad \left[ 1 \right] $$ If we're interested in finding pairs of numbers that don't have a greatest common divisor of 1, it looks like we can do this by finding smaller pairs of numbers that DO have a greatest common divisor of 1. This is starting to smell like recursion to me. If we define the following: $$ P(N , i) = |\{ (x,y) | gcd(x,y) = i, 1 \leq x \leq y \leq N \}| $$ First see that: $$ P(N , 1) = \Phi(N) \quad \left[ 2 \right] $$ and that $$ P(N , i) = P(\lfloor \frac{N}{i} \rfloor ,1) \quad \left[ 3 \right] $$ By equation (1) above. Let's pause for a moment because I'd $highly$ recommend taking the time to make sure you understand what equation (3) is really saying and why it's true. It's the key insight that will allow us to determine $\Phi(N)$ efficiently. $$ (Pause) $$ Alright moving on. The last thing to note is that the $P(N,i)$'s must exhaust all pairs $(x,y)$ with $1 \leq x \leq y \leq N$. (I mean, every pair $(x,y)$ has to have $something$ as its gcd.) As a result we have: $$ \sum_{i=1}^{N} P(N,i) = \frac{N(N+1)}{2} \quad \left[ 4 \right] $$ It might seem like we're endlessly chasing our tail with definitions and whatnot, but if we combine equations (2), (3), and (4) and rearrange terms we have: $$ \Phi(N) = \frac{N(N+1)}{2} - \left(\sum_{i=2}^{N} P(\lfloor \frac{N}{i} \rfloor ,1) \right) = \frac{N(N+1)}{2} - \left(\sum_{i=2}^{N} \Phi(\lfloor \frac{N}{i} \rfloor) \right) \quad \left[ 5 \right] $$ I'll admit that although this implies an interesting recursive approach, it's not immediately obvious why we care. Let's take a closer look at the floor division term on the right though before making any snap judgements.5 The Miracle of Floor Division
You wouldn’t think it, but we’re actually knocking on the door of a sublinear algorithm for calculating Φ(N), and it’s due to a property of the floor division function (at least I think it’s sublinear, we’ll make a crude attempt at analyzing the run time later). Let’s consider the case where N = 10, and note the following:
6 A Worked Example
We can look at equations all day, but I think this algorithm in particular benefits greatly from a worked example. We don't want to be here all day, but that doesn't necessarily mean we need to pick a trivial example. Let's calculate $\Phi(25)$ and see how it plays out. As per equation (6) above, we have: $$ \Phi(25) = \frac{25\cdot26}{2} - (13\cdot \Phi(1) + 4\cdot \Phi(2) + 2\cdot\Phi(3) + \Phi(4) + \Phi(5) + \Phi(6) + \Phi(8) + \Phi(12)) $$ Feel free to take my word for it that $\Phi(1) = 1$ and $\Phi(2) = 2$. Using this, we have: $$ \Phi(3) = 2\cdot3 - (\Phi(1) + \Phi(1)) = 6 - 2 = 4 $$ $$ \Phi(4) = 2\cdot5 - (\Phi(2) + 2\cdot\Phi(1)) = 10 - (2+2) = 6 $$ $$ \Phi(5) = 3\cdot5 - (\Phi(2) + 3\cdot\Phi(1)) =15 - (2+3) = 10 $$ $$ \Phi(6) = 3\cdot7 - (\Phi(3) + \Phi(2) + 3\cdot\Phi(1)) = 21 - (4 + 2 + 3) = 12 $$ $$ \Phi(8) = 4\cdot9 - (\Phi(4) + 2\cdot\Phi(2) + 4\cdot\Phi(1)) = 36 - (6 + 4 + 4) = 22 $$ $$ \Phi(12) = 6\cdot13 - (\Phi(6) + \Phi(4) + \Phi(3) + 2\cdot\Phi(2) + 6\cdot\Phi(1)) = 78 - (12 + 6 + 4 + 4 + 6) = 46 $$ And now based on all of these calculations we can calculate $\Phi(25)$ $$ \Phi(25) = 25\cdot13 - (\Phi(12) + \Phi(8) + \Phi(6) + \Phi(5) + \Phi(4) + 2\cdot\Phi(3) + 4\cdot\Phi(2) + 13\cdot\Phi(1)) = 325 - (46+22+12+10+6+8+8+13) = 200 $$ We really belabored each calculation here for each clarity, but this should illustrate how fast this algorithm can be. Rather than manually checking the gcd of 325 pairs of numbers (and mind you this requires $O(log(n))$ operations for each pair), we only had to do a handful of additions and multiplications (and $O(1)$ lookups of course) to get the same answer. I think it's also worth mentioning that we didn't compute any intermediate values of $\phi(x)$ along the way. It's amazing to me that we're able to completely sidestep the function literally used to define $\Phi(x)$ by restating the problem in a slightly different way. (Even more importantly, we can calculate $\Phi(N)$ just as quickly if not quicker than we can calculate $\phi(N)$ which is very surprising in my opinion.)
7 Further Analysis of Φ(N)
Equation (6) above provides a useful way to implement $\Phi(N)$ as a computer algorithm (and shows why it's so fast) but I think it will be helpful to restate the function closer to the form we see in equation (4), namely: $$ \Phi(N) + \sum_{i=2}^{N} \Phi(\lfloor \frac{N}{i} \rfloor ) = \frac{N(N+1)}{2} \quad \left[ 7 \right] $$ I find this form tends to be a bit easier to work with in our subsequent inquiries. First note the following (somewhat obvious) fact: $$ \Phi(x) > 1 \; \forall x \; \in \mathbb{N} $$ (Note the upside down A symbol above simply means "for every", and the fancy N just refers to positive integers so the whole sentence basically says "this is true for all integers".) We can substitute this inequality into equation (7) and obtain the following: $$ \frac{N(N+1)}{2} = \Phi(N) + \sum_{i=2}^{N} \Phi(\lfloor \frac{N}{i} \rfloor ) \geq \Phi(N) + \sum_{i=2}^{N} 1 \geq \Phi(N) + (N-1) $$ Expanding and rearranging terms yields: $$ \Phi(N) \leq \frac{N^{2}}{2} + \frac{N}{2} - N + 1 < \frac{N^{2}}{2} \quad \left[ 8 \right] $$ Which is great, because now we have a quadratic upper bound on our function $\Phi(N)$! I guess in hindsight this is obvious, but the next conclusion took me very much by surprise. I'm going to move pretty quickly here, but I encourage you to take the time to read and verify each step. As our jumping off point: $$ \Phi(N) + \sum_{i=2}^{N} \Phi(\lfloor \frac{N}{i} \rfloor ) > \frac{N^{2}}{2} $$ By equation (8) above we also have: $$ \Phi(N) + \sum_{i=2}^{N} \Phi(\lfloor \frac{N}{i} \rfloor ) \leq \Phi(N) + \sum_{i=2}^{N} \frac{\lfloor \frac{N}{i} \rfloor ^{2}}{2} \leq \Phi(N) + \sum_{i=2}^{N} \frac{\left( \frac{N}{i}\right) ^{2}}{2} = \Phi(N) + \frac{N^{2}}{2}\left(\frac{1}{4} + \frac{1}{9}+\sum_{i=4}^{N} \left(\frac{1}{i} \right)^{2}\right) \quad \left[ 9 \right] $$ And one final identity: $$ \sum_{i=4}^{N} \left(\frac{1}{i} \right)^{2} \leq \int_{4}^{\infty} \frac{1}{(x-1)^{2}} \,dx = \frac{1}{3} \quad \left[ 10 \right] $$ To see why this is true, you can think of the left hand side as a Riemann sum approximation (with right endpoints) of the integral. Now if we combine equation (9) with the identity in (10) and inequality in (8) (and skip some mundane rearranging of terms) we have our desired result: $$ \frac{11}{72}N^{2} <\Phi(N) < \frac{N^{2}}{2} $$ Now this is admittedly a gap in my mathematical toolbox, but I believe this means we can say something to the effect of: $$ \Phi(x) \approx ax^{2} \quad \left[ 11 \right] $$ For sufficiently large $x$ and some constant $a$.. Now what could this constant $a$ be? I'll be honest I was very surprised by what I found.
8 Determining a
Now before we begin, I will issue a quick disclaimer that the following calculations aren't exactly rigorous. That being said, I'm fairly confident they're correct though (or at least barking up the right tree) and we'll see why shortly. With that out of the way, lets begin with equation (7) and sprinkle in a dash of (11) as our jumping off point: $$ \sum_{i=1}^{N} \Phi(\lfloor \frac{N}{i} \rfloor ) \approx \sum_{i=1}^{N} a(\lfloor \frac{N}{i} \rfloor )^{2} \approx aN^{2}\sum_{i=1}^{N} \frac{1}{i^{2}} \approx \frac{N^{2}+N}{2} $$ which yields $$ a = \frac{N^{2}+N}{2N^{2}} \left( \sum_{i=1}^{N} \frac{1}{i^{2}} \right) ^{-1} $$ At this point readers might recognize the summation expression on the right, and in fact if we take the limit as $N \rightarrow \infty$ the tends toward the solution to the famous Basel Problem (solved by none other than Leonhard Euler in 1734) which as a reminder is: $$ \sum_{i=1}^{\infty} \frac{1}{i^{2}} = \frac{\pi^{2}}{6} $$ Putting this all together results in the following: $$ a = \lim_{N\to\infty} \frac{N^{2}+N}{2N^{2}} \left( \sum_{i=1}^{N} \frac{1}{i^{2}} \right) ^{-1} = \frac{1}{2}\cdot \frac{6}{\pi^{2}} = \frac{3}{\pi^{2}} $$ and FINALLY we can answer the question we set out to at the start. We simply need to look at the total number of coprime pairs less than some limit, divided by the total number of pairs: $$ \lim_{N\to\infty} \frac{\Phi(N)}{\frac{N(N+1)}{2}} = \frac{2\cdot\frac{3}{\pi^{2}}N^{2}}{N^{2}+N} = \frac{6}{\pi^{2}} \approx 60.8\% $$ Which appears to be right in line with the initial results from our Monte Carlo simulation.
9 Wrapping Up
I applaud those of you who have made it to the end. What started out as a fairly benign question went completely off the rails and pushed me to my mathematical brink. This write up ended up being extremely long with a ton of algebra, and I recognize it might be too tedious to follow. I'm still working out how much detail to go into with some of these algebra steps (my understanding is math papers omit a lot of the "here's how we got from equation 1 to equation 2" type of info and leave that to the reader since space in journals is limited) but I also want to make it easy for people to follow each step.
Either way, this post was incredibly fulfilling to investigate and write. Initially I only planned on building up to and writing about the algorithm described in equation (6) as that remains one of my favorite algorithms (still blows my mind that something that seems so complex on the surface can be calculated so efficiently) but for some reason I wanted to dress it up under some (admittedly contrived) pretense, namely determining the likelihood of coprimality. I honestly had no idea about the tie-in to the Basel Problem until I had already done some initial programming and googled what the actual answer was. I very nearly gave up on "proving" some of these relationships, but stubborness prevailed and I'm thrilled that I was able to see exactly why it's related to the Basel Problem. The last week or so I've filled up a ton of notebook pages trying to prove equation (11) and I didn't have the breakthrough that allowed me to prove it until I was driving Thanksgiving morning (that breakthrough was equation (10) for anyone curious).
While I don't know exactly what the next steps here might be, I'd love to play around with some of these relationships a bit more. I bet you could derive some pretty odd looking identities involving $\pi$ and the floor function, although I don't have any good ones on hand. I'd also like to play around more with the floor function. I was surprised that $\lfloor \frac{N}{i} \rfloor ^ {2} \approx \left(\frac{N}{i} \right)^{2}$ seems to work for sufficiently large N and I'd like to look more into this. Finally, the overall result is still so surprising to me that I need to see what else I can find here. It's a genuine surprise to play around with the quotient of integers and get something involving a $\pi^{2}$ term. Anyway that's all for now, I'm putting a few additional bits of info into several appendices, but feel free to skip it if you're bored at this point.
Appendix A: An Easier Road to the Basel Problem
Wikipedia contains a nice write up that helps show the intuition behind the relationship between coprimality and the Basel Problem. We do need to lay some foundation before we get there though, so let's derive a nice identity that will help us. First, note that if we start with the infinite sum $$ \sum_{i=1}^{\infty} \frac{1}{i^{2}} $$ we can in some sense "factor" it as follows: $$ \left(\frac{1}{1^{2}} + \frac{1}{2^{2}} + \frac{1}{3^{2}} + ... \right) = \left(1 + \frac{1}{2^{2}} + \frac{1}{2^{4}} + \frac{1}{2^{6}} + .... \right) \left(1 + \frac{1}{3^{2}} + \frac{1}{3^{4}} + \frac{1}{3^{6}} + .... \right)... = \prod_{p \in primes} \left(1+\frac{1}{p^{2}} + \frac{1}{p^{4}} + ... \right) $$ Now thinking back to our Calculus 2 days, we know that each of the terms in the product on the righthand side is an infinite geometric series with first term equal to 1 and ratio equal to $\frac{1}{p^{2}}$, and we have a nice tidy formula for each sum. The overall result is: $$ \sum_{i=1}^{\infty} \frac{1}{i^{2}} = \prod_{p \in primes} \left(1+\frac{1}{p^{2}} + \frac{1}{p^{4}} + ... \right) = \prod_{p \in primes} \left( \frac{1}{1-\frac{1}{p^{2}}} \right) $$ Now armed with this identity, we can apply it to the coprimality question. To begin, note that "half" of all numbers are divisble by two, so it stands to reason that if you pick an integer randomly the odds it's divisible by two should by $\frac{1}{2}$. Building off of this logic, it shouldn't be tough to see that if we consider an arbitrary prime $p$, the likelihood that a randomly selected number is divisible by this is $\frac{1}{p}$, and furthermore that the likelihood of two randomly selected integers, say $a$ and $b$, both being divisble by $p$ is $\frac{1}{p^{2}}$. Taking this one small step further, we see the likelihood that at least one of of $a$ and $b$ not being divisible by $p$ would be $1 - \frac{1}{p^{2}}$. Of course we would need this to be true for $every$ prime, so we have the following infinite product: $$ \prod_{p \in primes}\left( 1 - \frac{1}{p^{2}} \right) = \frac{6}{\pi^{2}} \approx 60.8\% $$ And that's how our question about coprimality is related to the Basel Problem. I know the notion of probability can get a little shaky with infinite quanitities such as "all the integers" (if I'm not mistaken I think measure theory is a field that kind of extends the notion of probability to infinite sets), but it does seem to work out in this case.