Impractical Experiments #1: Representing Numbers as Polynomials

posted by Craig Gidney on September 24, 2013

In this post: trying to represent numbers like in a nice way.

Roots and Numbers

I like arbitrary precision arithmetic. I don’t end up using it that much, but I like having the option of doing things with perfect accuracy. Not having to care about overflow, not having to care about rounding error, just… being correct by default.

That is, until I have to compute a square root and am invariably forced to sacrifice my nice correctness properties. I want all the following properties, but square roots come along and ruin everything:

• Exact: Most square roots are irrational. This makes it difficult to represent them exactly, without introducing any error. You’ll never find integers and such that , so forget about using a BigRational.
• Closed: It’s not enough to represent simple roots, you have to be able to add and multiply without moving to a different type. For example, a naive representation like { base: Rational, root: Int } is exact but not closed. It can represent 1 as { base:1, root:1 } and as { base:2, root:2 }, but it can’t represent .
• Rigid Equality: Number representations flexible enough to represent square roots tend to be so flexible that you can’t actually compute equality. For example, suppose you’re representing numbers as functions that output more and more digits of their corresponding number. Given functions and , corresponding to numbers and , how do you determine if ? If they’re not equal you can just find a digit that differs, but showing they’re equal requires showing all of the digits are the same. This will either get you stuck in a loop or require a halting oracle to analyze the functions (i.e. the problem is incomputable). I’d prefer to wait not forever to discover that, yes, .
• Minimal: One way to represent roots is to use a full computer algebra system. For the purposes of this post, I’m going to consider that massive overkill. Is there a simpler representation, that can handle roots?

A couple months ago, I came up with an idea for a number representation that seemed to satisfy each of these properties: the roots of polynomials with integer coefficients. Consider: if you solve for in , what do you get? The only real solution is the cube root of 2. That’s kind of interesting: we can think of as an indirect representation of .

In fact, a wide class of numbers can appear as the roots of a polynomial and this has been investigated mathematically by plenty of people. The so-called algebraic numbers include not just roots but “Any expression formed using any combination of the basic arithmetic operations and extraction of nth roots gives an algebraic number” (and more). That sounds like exact and closed to me.

You can also check if polynomials share a root using polynomial gcd (rigid equality!). It’s a bit disappointing that transcendental numbers like euler’s constant can’t appear as polynomial roots, but from another point of view that’s just being minimal.

Roots of integer polynomials seem to have the properties I want. On the other hand, because computing polynomials roots involves approximating, I’ll need to somehow work with the roots without actually computing them. Time to experiment.

Basic Arithmetic

Let’s start by figuring out how to represent simple numbers, and do arithmetic on them involving other simple numbers.

Conversion

How do we represent an integer as the root of a polynomial? Easy: it’s the root of . If we’re representing polynomials as a little-endian list of integer coefficients, that means our representation of will be .

Representing a rational number is one small step further. The polynomial doesn’t have integer coefficients, but we can multiply both sides by to get the equivalent which does. That’s .

Shifting

How do we increase the root(s) of a polynomial by ? We use the fact that, to shift a function to the right, you just use .

More precisely, to increase the roots of by we must subtract from each like so: . We then need to expand the terms (using the binomial theorem) and simplify the result (I didn’t say we had to be fast…).

For example, has roots and . To add 1 we expand to and find that it does in fact have the roots and .

Actually, the technique of “do the opposite to x” lets us do a whole bunch of operations. To add to the roots of we use . To subtract from the roots of we use .

Scaling

To multiply the roots of by we just do the opposite to x and use . This just divides the coefficient of by for each . The resulting coefficients may no longer be integers, but we can fix that by scaling the coefficients by the least common multiple of the denominators.

For example, multiplying the roots of by gives . We cancel the denominators by multiplying by , giving which has roots and .

When the scaling factor is , meaning we’re just negating our input, dividing by corresponds to just flipping the coefficients of odd powers of .

Division

The multiplicative inverse of is . When we try to get the multiplicative inverse roots of by doing the opposite to x, we end up with ). That’s not a polynomial, but we can fix it.

For example, turns into . We fix it not being a polynomial by multiplying both sides by . The result is , which has roots that are the multiplicative inverses of and : and .

The multiplicative inverse of roots operation has a much simpler formulation. Did you catch it? Just reverse the coefficients.

Division works exactly like multiplication when is used for the numerator. When is used in the denominator, we just use the multiplicative inverse operation before multiplying.

For example, to represent the result of dividing by the roots of we just take the multiplicative inverse, which is , and multiply its roots by to get . The result has the right roots: and .

N’th Roots

To take the ‘th root of the roots of we do the opposite to x and use . This is equivalent to scaling each exponent by a factor of .

That’s right, to cube root the roots of we only need to multiply the exponents by to get . It has roots and , which is exactly what we wanted.

Equality

To determine if a rational is a root of , we just evaluate and check if the result is 0. You can use a BigRational to do the evaluation, since all the relevant quantities (the input and the coefficients) are integers or rationals.

Summary of Operations

When only one of the arguments to an operation is represented implicitly as the roots of a polynomial, you can get an implicit polynomial result as follows:

• Conversion: A rational becomes the polynomial .
• Equality: Check .
• Addition: Expand and simplify .
• Subtraction: Expand and simplify .
• Negation: Flip coefficients of odd powers.
• Multiplicative Inverse: Reverse coefficients.
• Multiplication: Scale coefficient of by for each . Cancel common denominator.
• N’th root: Scale exponents by .
• Division ( as numerator): Scale coefficient of by for each .
• Division ( as denominator): Combine the multiplicative inverse and multiplication operations.
• Exponentation: Unsure…

I hope this has given you a sense of how operations work on the roots, without actually computing the roots. Things are about to get more complicated.

Operations on Roots of Two Polynomials

The real challenge to using polynomials to represent numbers is operating on two at once. I know, because I filled pages and pages of notebook trying to rewrite the coefficients of the polynomial you get by adding or multiplying the roots of two polynomials in terms of the coefficients of the other two. This turned out to be the wrong approach, but I’ll go over it briefly.

A polynomial can be written as the product of its roots like this: . More compactly, given the list of roots , you can write .

A polynomial’s coefficient , which would be multiplied by in the standard sum-of-powers form, can be computed by adding up all the ways you can choose factors of when expanding the product form of the polynomial. More exactly, , where is the list of all the ways you can choose elements from the list of roots .

If the roots of our input polynomials are and respectively, and we want to add the roots, then the resulting polynomial must have the root whenever and . Formally, , where is “make a list of these items” in the same sense that is “sum up these items”.

Given , we can compute its coefficients . All we need to do is rewrite that equation for into a form that only mentions the coefficients derived from and , instead of directly using , , or .

Easy!… Not so much.

I did make progress, using this approach. I managed to separate the effects of the two lists of roots, but got stuck at a point where I was dot-producting all the decreasing sequences that sum up to whatever into the exponents of …

You know what, nevermind the details of how I was failing. The point is that eventually I asked for help from the math stack exchange. Within an hour, the problem was solved. Three different ways.

The solution I implemented is based on creating and solving a linear system. Unfortunately, despite implementing it and even tweaking it to work for addition, I don’t understand it well enough to explain it. The answer I was given is quite good, though.

Normally I’d work at understanding the code until I could know it was correct. In this case there are more serious issues that make the point a bit moot.

In Practice

I realized ahead of time that representing numbers as polynomials was going to have issues. That’s why it was an impractical experiment. Regardless, once I finished implementing the code, I could finally try out my numbers-as-polynomials and confirm that the expected problems were present.

The approach works fine on simple cases:

PolyNumber n = 5; // x - 5 = 0
PolyNumber v = n.Root(3); // x^3 - 5 = 0
double[] roots = v.Approximates(); // { 1.7099757194519043 }

But things can get quite slow, as the degrees of polynomials multiply together as you operate:

PolyNumber n = 5; // x - 5 = 0
PolyNumber v = n.Root(5) + n.Root(7); // takes MULTIPLE SECONDS
// x^35 + -35x^30 + -25x^28 + 525x^25 + -56875x^23 + 250x^21 + -4375x^20 + -2996875x^18
// + -1242500x^16 + 21875x^15 + -1250x^14 + -24609375x^13 + 48365625x^11 + -65625x^10
// + -2800000x^9 + -37734375x^8 + 3125x^7 + -64421875x^6 + 109375x^5 + -9406250x^4
// + -6562500x³ + -328125x² + 546875x + -81250
double[] roots = v.Approximates(); // { 2.6382288932800293 }

And don’t forget that numbers have two square roots, meaning we end up tracking multiple solutions a lot:

PolyNumber n = 2; // x - 2 = 0
PolyNumber v = n.Root(2) + 1; // x² + -2x + -1
double[] roots = v.Approximates(); // { 0.41421365737915039, 2.4142136573791504 }

It’s also possible to end up with multiple solutions despite never using an operation with two real results. For example, computing :

var two = (PolyNumber)2; // x - 2
var eight = (PolyNumber)8; // x - 8
var p = two.Root(3) + eight.Root(9); // x^24 + -20x^21 + 184x^18 + -1064x^15 + -3344x^12 + -146528x^9 + -304064x^6 + -25472x³ + -2048
var roots = p.Approximates() // { -1.2599220275878906, 2.5198402404785156 }

Where did that second root come from? Well, the root diagram shown on wolfram alpha gives a clue:

Oh right, polynomials have complex roots and our addition operation adds roots together. This is inconvenient, despite being kind of magical since I didn’t try to deal with the complex case at any point. Sometimes adding two complex roots cancels their imaginary parts and poof you get a brand new real root you didn’t want!

Well, so much for that idea.

Summary

It’s possible to represent numbers implicitly as the roots of polynomials, and you get a lot of nice properties doing so, but ultimately the result is slow and has a nasty tendency to introduce “bonus” solutions.

Update: Commenters have mentioned Holonomic functions, which generalize this idea to differential equations, as well as the Cyclotomic fields. Other people have even had the same idea.

Discuss on Reddit

• chrisdew

Have you thought of representing variables as the list of operations performed on them, with a simplification pass on each operation, and a final calculation, done only at display time? Analogous to how some ORMs merge multiple filters and merge and eliminate redundant sorts.

• CraigGidney

Yes, but that’s overkill and very complicated to get right. When you go to simplify an expression it’s hard to ensure you have all the necessary rules, and hard to know when to stop simplifying. Exact equality also remains very difficult without some sort of structure beyond “do these operations”.

• Darrell Plank

Mathematica does exactly this. If you ask for the roots of 4 x^7 – 3 x^6 + 5 x^4 – 11 x^2 + 8 you will get Root[8 - 11 #1^2 + 5 #1^4 - 3 #1^6 + 4 #1^7 &, i] where i goes from 1 to 7. You can ask for a numerical estimate of this value, but otherwise it just goes through the system in this form.

Twisted Oak Studios offers consulting and development on high-tech interactive projects. Check out our portfolio, or Give us a shout if you have anything you think some really rad engineers should help you with.

Archive

More interesting posts (7 of 29 articles)

Or check out our Portfolio.