On computing roots of quartic and cubic equations in Python

Summary

This document examines various ways to compute roots of cubic (3rd order polynomial) and quartic (4th order polynomial) equations in Python. First, two numerical algorithms, available from the Numpy package (roots and linalg.eigvals), were analyzed. Then, optimized closed-form analytical solutions to cubic and quartic equations were implemented and examined. Finally, the analytical solutions were vectorized using Numpy arrays in order to avoid slow Python iterations when multiple polynomials are solved. All these functions were evaluated by comparing their computational speeds. The analytical cubic and quartic solvers (accelerated with Numba) were one order of magnitude faster than both numerical Numpy functions for a single polynomial. When a large set of polynomials was given as input, the vectorized analytical solver outperformed the numerical Numpy functions linalg.eigvals and roots by one and two orders of magnitude, respectively.

Introduction

In scientific computing we are sometimes faced with finding the roots of a cubic (3rd order polynomial) or quartic (4th order polynomial) equation to get crucial information about the characteristics of some physical process or to develop an appropriate numerical scheme. These issues are regularly encountered when analyzing coupled dynamic systems described by three or four differential equations. One such example is two-layer shallow water flow, which is defined by four Partial Differential Equations (PDEs). In the two-layer Shallow Water Equations (SWE), the eigenvalues of the 4x4 flux matrix describe the speeds of internal and external wave propagation, and these eigenvalues correspond to the roots of a characteristic 4th order polynomial. Similarly, SWE coupled with sediment transport are defined by three PDEs; in this case, the eigenvalues of a 3x3 matrix correspond to the roots of a characteristic 3rd order polynomial. There are many more examples where such computations are required.

Roots of cubic and quartic equations can be computed using numerical methods or analytical expressions (so-called closed-form solutions). Numerical methods are based on iterative algorithms and provide only approximations to the roots. General root-finding algorithms (such as Newton’s, secant, or Brent’s method) are applicable to any continuous function, but they do not guarantee that all roots will be found. However, a different class of numerical methods is available (and recommended) for polynomials, based on finding the eigenvalues of the companion matrix of a polynomial.

In Python, there are several ways to numerically compute the roots of a polynomial; however, only two functions are generally recommended and used. The first is the Numpy function roots, which directly computes all roots of a general polynomial, and the second is the Numpy function eigvals from the linalg module, which computes the eigenvalues of a companion matrix constructed from a given (characteristic) polynomial.

On the other hand, analytical closed-form solutions exist for all polynomials of degree lower than five, that is, for quadratic, cubic, and quartic equations. Although the expressions for cubic and quartic roots are longer and more complicated than those for a quadratic equation, they can still be easily implemented in a computational algorithm. The closed-form solution for the roots of cubic equations is based on Cardano’s expressions given here and here. Similarly, the solution for the roots of quartic equations is based on Ferrari’s expressions given here and here. A fast and optimized algorithm - FQS - that uses these analytical solutions to the cubic and quartic equations was implemented in Python and made publicly available here.

All computational algorithms were implemented in Python 3.7 with Numpy 1.15, and tests were done on a 64-bit Windows machine with an i5-2500 CPU @ 3.30 GHz.

Numerical algorithms

Function numpy.roots

Function numpy.roots can compute roots of a general polynomial defined by a list of its coefficients p.

For cubic equations, p is defined as:

p[0]*x**3 + p[1]*x**2 + p[2]*x + p[3],

and for quartic equations, p is defined as:

p[0]*x**4 + p[1]*x**3 + p[2]*x**2 + p[3]*x + p[4].

The function to compute roots of a single polynomial is implemented for cubic roots as follows:

import numpy

p_cubic = numpy.random.rand(4)
cubic_roots = numpy.roots(p_cubic)

and for quartic roots:

p_quartic = numpy.random.rand(5)
quartic_roots = numpy.roots(p_quartic)

The respective results are:

>>> p_cubic
array([0.21129527, 0.23589228, 0.73094489, 0.84747689])
>>> cubic_roots
array([ 0.01557778+1.86945535j,  0.01557778-1.86945535j,
        -1.1475662 +0.j        ])

and

>>> p_quartic
array([0.30022249, 0.31473263, 0.00791689, 0.06335546, 0.73838408])
>>> quartic_roots
array([-1.19538943+0.7660177j , -1.19538943-0.7660177j ,
        0.67122379+0.87725993j,  0.67122379-0.87725993j])
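As a quick sanity check (an illustration, not part of the original benchmark), the computed roots can be substituted back into the polynomial with numpy.polyval; the residuals should be close to machine zero:

```python
import numpy

# Example cubic; evaluating the polynomial at each computed root
# should give residuals near machine precision.
p = numpy.array([0.3, 0.1, -0.5, 0.2])
roots = numpy.roots(p)
residuals = numpy.abs(numpy.polyval(p, roots))
```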

Let’s look at the computation times using IPython’s %timeit magic:

%timeit numpy.roots(p_cubic)
76.5 µs ± 148 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit numpy.roots(p_quartic)
80.1 µs ± 2.98 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

This may seem quite fast; however, what if we need to solve, let’s say, 10 000 polynomials:

p_cubics = numpy.random.rand(10_000, 4)
p_quartics = numpy.random.rand(10_000, 5)

Notice that numpy.roots takes only rank-1 arrays, which means that we have to use for loops or list comprehensions. The latter are usually faster in Python (and more Pythonic), therefore we write:

cubic_roots = [numpy.roots(pi) for pi in p_cubics]
quartic_roots = [numpy.roots(pi) for pi in p_quartics]

Their corresponding computation times:

%timeit [numpy.roots(pi) for pi in p_cubics]
786 ms ± 15.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit [numpy.roots(pi) for pi in p_quartics]
795 ms ± 2.51 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Indeed, it takes about 10 000 times longer to compute the roots of 10 000 polynomials than of a single one; the cost grows linearly with the number of polynomials because of the Python-level loop.

Can we speed things up, knowing that loops in Python are slow in comparison to compiled languages such as C/C++ or Fortran?

Function numpy.linalg.eigvals

The documentation for numpy.roots states that its algorithm relies on computing the eigenvalues of the companion matrix. It uses the fact that the eigenvalues of a matrix A are the roots of its characteristic polynomial p.

Function numpy.linalg.eigvals computes the eigenvalues of a general square matrix A using the geev LAPACK routines. The main advantage of linalg.eigvals over roots is that it is vectorized: it runs operations over entire arrays rather than over individual elements. Therefore, it can take a stacked array of companion matrices as input and does not require for loops or list comprehensions.
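To illustrate the stacking behaviour (a property of the NumPy API, shown here on random matrices rather than companion matrices):

```python
import numpy

# eigvals operates over the leading dimension of a stacked array:
# an (N, 3, 3) stack of matrices yields an (N, 3) array of eigenvalues
# in a single call, with no Python-level loop.
A = numpy.random.rand(5, 3, 3)
eigs = numpy.linalg.eigvals(A)
```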

For cubic equations, first we reduce the polynomial to the form:

x**3 + a*x**2 + b*x + c = 0,

and then construct the companion matrix:

A = [[0, 0, -c],
     [1, 0, -b],
     [0, 1, -a]]

For quartic equations, we reduce the polynomial to the form:

x**4 + a*x**3 + b*x**2 + c*x + d = 0,

and then construct the companion matrix:

A = [[0, 0, 0, -d],
     [1, 0, 0, -c],
     [0, 1, 0, -b],
     [0, 0, 1, -a]]
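As a small check (illustrative, using an example monic cubic), the eigenvalues of the companion matrix above indeed match the roots returned by numpy.roots:

```python
import numpy

# Companion matrix of the monic cubic x**3 + a*x**2 + b*x + c,
# in the same layout as above.
a, b, c = 2.0, -1.0, 0.5
A = numpy.array([[0.0, 0.0, -c],
                 [1.0, 0.0, -b],
                 [0.0, 1.0, -a]])
eigs = numpy.sort_complex(numpy.linalg.eigvals(A))
ref = numpy.sort_complex(numpy.roots([1.0, a, b, c]))
# eigs and ref agree to machine precision
```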

The function to compute roots as eigenvalues of (stacked) companion matrices is implemented for cubic equations as follows:

def eig_cubic_roots(p):
	
    # Coefficients of cubic equation
    a, b, c = p[:, 1]/p[:, 0], p[:, 2]/p[:, 0], p[:, 3]/p[:, 0]
	
    # Construct the companion matrix
    A = numpy.zeros((len(a), 3, 3))
    A[:, 1:, :2] = numpy.eye(2)
    A[:, :, 2] = -numpy.array([c, b, a]).T
	
    # Compute roots using eigenvalues
    return numpy.linalg.eigvals(A)

Similarly, for quartic equation:

def eig_quartic_roots(p):
	
    # Coefficients of quartic equation
    a, b, c, d = (p[:, 1]/p[:, 0], p[:, 2]/p[:, 0],
                  p[:, 3]/p[:, 0], p[:, 4]/p[:, 0])
	
    # Construct the companion matrix
    A = numpy.zeros((len(a), 4, 4))
    A[:, 1:, :3] = numpy.eye(3)
    A[:, :, 3] = -numpy.array([d, c, b, a]).T
	
    # Compute roots using eigenvalues
    return numpy.linalg.eigvals(A)

To compute the roots of a single cubic equation, the function is called with an added leading axis:

cubic_roots = eig_cubic_roots(p_cubic[None, :])

and for quartic roots:

quartic_roots = eig_quartic_roots(p_quartic[None, :])

The results are:

>>> cubic_roots
array([[-1.1475662 +0.j        ,  0.01557778+1.86945535j,
        0.01557778-1.86945535j]])

and

>>> quartic_roots
array([[ 0.67122379+0.87725993j,  0.67122379-0.87725993j,
        -1.19538943+0.7660177j , -1.19538943-0.7660177j ]])

Let’s look at the computation times:

%timeit eig_cubic_roots(p_cubic[None, :])
67 µs ± 316 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit eig_quartic_roots(p_quartic[None, :])
69.3 µs ± 135 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

The computation times are slightly shorter than those of numpy.roots.

Let’s see the difference for 10 000 polynomials:

%timeit eig_cubic_roots(p_cubics)
31.2 ms ± 1.05 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit eig_quartic_roots(p_quartics)
48.3 ms ± 48.6 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

The results indicate that numpy.linalg.eigvals is one order of magnitude faster than numpy.roots, and that is a huge gain.

What about analytical solutions to cubic and quartic equations?

Implementation for a single polynomial

Now, let’s look at the analytical implementations available in FQS. First, the implementation of analytical solutions for a single quadratic, cubic, and quartic equation is presented. As stated in the introduction, these algorithms are based on closed-form solutions for cubic (given here and here) and quartic equations (given here and here). The expressions were modified to avoid repeating computations.

The Python function for the roots of a quadratic equation is implemented as:

import math, cmath

def single_quadratic(a0, b0, c0):
    ''' Reduce the quadratic equation to the form:
        x**2 + a*x + b = 0 '''
    a, b = b0 / a0, c0 / a0

    # Some repeating variables
    a0 = -0.5*a
    delta = a0*a0 - b
    sqrt_delta = cmath.sqrt(delta)

    # Roots
    r1 = a0 - sqrt_delta
    r2 = a0 + sqrt_delta

    return r1, r2

The function for roots of a cubic equation is implemented as:

def single_cubic(a0, b0, c0, d0):
    ''' Reduce the cubic equation to the form:
        x**3 + a*x**2 + b*x + c = 0 '''
    a, b, c = b0 / a0, c0 / a0, d0 / a0

    # Some repeating constants and variables
    third = 1./3.
    a13 = a*third
    a2 = a13*a13
    sqr3 = math.sqrt(3)

    # Additional intermediate variables
    f = third*b - a2
    g = a13 * (2*a2 - b) + c
    h = 0.25*g*g + f*f*f

    def cubic_root(x):
        ''' Compute cubic root of a number while maintaining its sign'''
        if x.real >= 0:
            return x**third
        else:
            return -(-x)**third

    if f == g == h == 0:
        r1 = -cubic_root(c)
        return r1, r1, r1

    elif h <= 0:
        j = math.sqrt(-f)
        k = math.acos(-0.5*g / (j*j*j))
        m = math.cos(third*k)
        n = sqr3 * math.sin(third*k)
        r1 = 2*j*m - a13
        r2 = -j * (m + n) - a13
        r3 = -j * (m - n) - a13
        return r1, r2, r3

    else:
        sqrt_h = cmath.sqrt(h)
        S = cubic_root(-0.5*g + sqrt_h)
        U = cubic_root(-0.5*g - sqrt_h)
        S_plus_U = S + U
        S_minus_U = S - U
        r1 = S_plus_U - a13
        r2 = -0.5*S_plus_U - a13 + S_minus_U*sqr3*0.5j
        r3 = -0.5*S_plus_U - a13 - S_minus_U*sqr3*0.5j
        return r1, r2, r3
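The h <= 0 (trigonometric) branch can be checked by hand on a worked example; the cubic x**3 - 6*x**2 + 11*x - 6 factors as (x - 1)(x - 2)(x - 3), so the three real roots are known in advance (this example is illustrative and not from FQS):

```python
import math

# Trigonometric branch of Cardano's method for
# x**3 - 6*x**2 + 11*x - 6 = (x - 1)(x - 2)(x - 3).
a, b, c = -6.0, 11.0, -6.0
third = 1.0/3.0
a13 = a*third
f = third*b - a13*a13
g = a13*(2*a13*a13 - b) + c          # g = 0 for this example
j = math.sqrt(-f)                    # f < 0, so h = 0.25*g*g + f**3 < 0
k = math.acos(-0.5*g / (j*j*j))
m = math.cos(third*k)
n = math.sqrt(3)*math.sin(third*k)
roots = sorted([2*j*m - a13, -j*(m + n) - a13, -j*(m - n) - a13])
# roots is approximately [1.0, 2.0, 3.0]
```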

And finally, the function for roots of a quartic equation is implemented as:

def single_quartic(a0, b0, c0, d0, e0):
    ''' Reduce the quartic equation to the form:
        x**4 + a*x**3 + b*x**2 + c*x + d = 0 '''
    a, b, c, d = b0/a0, c0/a0, d0/a0, e0/a0

    # Some repeating variables
    a0 = 0.25*a
    a02 = a0*a0

    # Coefficients of subsidiary cubic equation
    p = 3*a02 - 0.5*b
    q = a*a02 - b*a0 + 0.5*c
    r = 3*a02*a02 - b*a02 + c*a0 - d

    # One root of the cubic equation
    z0, _, _ = single_cubic(1, p, r, p*r - 0.5*q*q)

    # Additional variables
    s = cmath.sqrt(2*p + 2*z0.real + 0j)
    if s == 0:
        t = z0*z0 + r
    else:
        t = -q / s

    # Compute roots by quadratic equations
    r0, r1 = single_quadratic(1, s, z0 + t)
    r2, r3 = single_quadratic(1, -s, z0 - t)

    return r0 - a0, r1 - a0, r2 - a0, r3 - a0

Notice that the quartic solution calls both cubic and quadratic functions.

For a single cubic equation, the results are:

>>> single_cubic(*p_cubic)
((-1.147566194142574+0j),
 (0.01557779507848811+1.8694553386446031j),
 (0.01557779507848811-1.8694553386446031j))

and for a quartic:

>>> single_quartic(*p_quartic)
((-1.195389428644198+0.766017693855147j),
 (-1.195389428644198-0.766017693855147j),
 (0.6712237840251022+0.8772599258280781j),
 (0.6712237840251022-0.8772599258280781j))

Let’s look at the computation times:

%timeit single_cubic(*p_cubic)
28.6 µs ± 604 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit single_quartic(*p_quartic)
50.3 µs ± 636 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

We can notice that single_cubic is twice as fast as both numerical solvers implemented in Numpy, whereas single_quartic is about 30-40% faster than the numerical solvers.

What about multiple polynomials?

%timeit [single_cubic(*pi) for pi in p_cubics]
236 ms ± 7.89 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit [single_quartic(*pi) for pi in p_quartics]
421 ms ± 1.84 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

The analytical closed-form solver is about twice as fast as numpy.roots. However, it is much slower (almost one order of magnitude) than the numerical solver numpy.linalg.eigvals. This difference is mainly a consequence of using a list comprehension.

Just-in-time compiler from Numba

Let’s see if we can speed up the computation by using the just-in-time compiler from Numba. We only need to import it and put a decorator before each function:

from numba import jit

@jit(nopython=True)
def single_quadratic(a0, b0, c0):
	...

@jit(nopython=True)
def single_cubic(a0, b0, c0, d0):
	...

@jit(nopython=True)
def single_quartic(a0, b0, c0, d0, e0):
	...
	

These are the new computation times after adding Numba:

%timeit single_cubic(*p_cubic)
6.34 µs ± 7.75 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit single_quartic(*p_quartic)
5.8 µs ± 6.87 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

We can notice that the algorithms are several times faster than both numerical solvers implemented in Numpy.

What about multiple polynomials?

%timeit [single_cubic(*pi) for pi in p_cubics]
27.6 ms ± 80.5 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit [single_quartic(*pi) for pi in p_quartics]
30.8 ms ± 90.6 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

The analytical closed-form solver (with Numba) is one order of magnitude faster than numpy.roots and performs similarly to, but slightly faster than, the numerical solver numpy.linalg.eigvals.

However, notice that list comprehensions were used here for multiple inputs. What if we could vectorize this code using Numpy arrays and speed up computation times even more?

Vectorized analytical closed-form solvers

To vectorize the functions single_quadratic, single_cubic, and single_quartic using Numpy arrays, we have to get rid of all if clauses and replace them with Numpy masks. We also have to replace all mathematical functions from math and cmath with the corresponding Numpy functions. The implementations follow.
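Before the full implementations, the masking pattern itself can be shown on a toy example (purely illustrative): boolean indexing applies a different formula to each subset of elements, replacing a Python-level if:

```python
import numpy

# Signed square root computed branch-free with boolean masks:
# one formula for negative inputs, another for non-negative ones.
x = numpy.array([-4.0, 0.0, 9.0])
result = numpy.empty_like(x)
negative = x < 0
result[negative] = -numpy.sqrt(-x[negative])
result[~negative] = numpy.sqrt(x[~negative])
# result is [-2.0, 0.0, 3.0]
```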

For quadratic equation:

def multi_quadratic(a0, b0, c0):
    # Quadratic coefficient
    a, b = b0 / a0, c0 / a0

    # Some repeating variables
    a0 = -0.5*a
    delta = a0*a0 - b
    sqrt_delta = numpy.sqrt(delta + 0j)

    # Roots
    r1 = a0 - sqrt_delta
    r2 = a0 + sqrt_delta

    return r1, r2

For cubic equation:

def multi_cubic(a0, b0, c0, d0, all_roots=True):
    # Cubic coefficients
    a, b, c = b0 / a0, c0 / a0, d0 / a0

    # Some repeating constants and variables
    third = 1./3.
    a13 = a*third
    a2 = a13*a13
    sqr3 = math.sqrt(3)

    # Additional intermediate variables
    f = third*b - a2
    g = a13 * (2*a2 - b) + c
    h = 0.25*g*g + f*f*f

    # Masks for different combinations of roots
    m1 = (f == 0) & (g == 0) & (h == 0)     # roots are real and equal
    m2 = (~m1) & (h <= 0)                   # roots are real and distinct
    m3 = (~m1) & (~m2)                      # one real root and two complex

    def cubic_root(x):
        ''' Compute cubic root of a number while maintaining its sign
        '''
        root = numpy.zeros_like(x)
        positive = (x >= 0)
        negative = ~positive
        root[positive] = x[positive]**third
        root[negative] = -(-x[negative])**third
        return root

    def roots_all_real_equal(c):
        ''' Compute cubic roots if all roots are real and equal
        '''
        r1 = -cubic_root(c)
        if all_roots:
            return r1, r1, r1
        else:
            return r1

    def roots_all_real_distinct(a13, f, g, h):
        ''' Compute cubic roots if all roots are real and distinct
        '''
        j = numpy.sqrt(-f)
        k = numpy.arccos(-0.5*g / (j*j*j))
        m = numpy.cos(third*k)
        r1 = 2*j*m - a13
        if all_roots:
            n = sqr3 * numpy.sin(third*k)
            r2 = -j * (m + n) - a13
            r3 = -j * (m - n) - a13
            return r1, r2, r3
        else:
            return r1

    def roots_one_real(a13, g, h):
        ''' Compute cubic roots if one root is real and other two are complex
        '''
        sqrt_h = numpy.sqrt(h)
        S = cubic_root(-0.5*g + sqrt_h)
        U = cubic_root(-0.5*g - sqrt_h)
        S_plus_U = S + U
        r1 = S_plus_U - a13
        if all_roots:
            S_minus_U = S - U
            r2 = -0.5*S_plus_U - a13 + S_minus_U*sqr3*0.5j
            r3 = -0.5*S_plus_U - a13 - S_minus_U*sqr3*0.5j
            return r1, r2, r3
        else:
            return r1

    # Compute roots
    if all_roots:
        roots = numpy.zeros((3, len(a)), dtype=complex)
        roots[:, m1] = roots_all_real_equal(c[m1])
        roots[:, m2] = roots_all_real_distinct(a13[m2], f[m2], g[m2], h[m2])
        roots[:, m3] = roots_one_real(a13[m3], g[m3], h[m3])
    else:
        roots = numpy.zeros(len(a))
        roots[m1] = roots_all_real_equal(c[m1])
        roots[m2] = roots_all_real_distinct(a13[m2], f[m2], g[m2], h[m2])
        roots[m3] = roots_one_real(a13[m3], g[m3], h[m3])

    return roots

And for the quartic equation:

def multi_quartic(a0, b0, c0, d0, e0):
    # Quartic coefficients
    a, b, c, d = b0/a0, c0/a0, d0/a0, e0/a0

    # Some repeating variables
    a0 = 0.25*a
    a02 = a0*a0

    # Coefficients of subsidiary cubic equation
    p = 3*a02 - 0.5*b
    q = a*a02 - b*a0 + 0.5*c
    r = 3*a02*a02 - b*a02 + c*a0 - d

    # One root of the cubic equation
    z0 = multi_cubic(1, p, r, p*r - 0.5*q*q, all_roots=False)

    # Additional variables
    s = numpy.sqrt(2*p + 2*z0.real + 0j)
    t = numpy.zeros_like(s)
    mask = (s == 0)
    t[mask] = z0[mask]*z0[mask] + r[mask]
    t[~mask] = -q[~mask] / s[~mask]

    # Compute roots by quadratic equations
    r0, r1 = multi_quadratic(1, s, z0 + t)
    r2, r3 = multi_quadratic(1, -s, z0 - t)

    return r0 - a0, r1 - a0, r2 - a0, r3 - a0

Let’s examine the computation time of vectorized analytical closed-form solvers for a single polynomial:

%timeit multi_cubic(*p_cubic.T)
174 µs ± 2.02 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit multi_quartic(*p_quartic.T)
233 µs ± 4.91 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Clearly, the vectorized version is overkill for a single polynomial and results in the slowest computation times.

But what about multiple polynomials:

%timeit multi_cubic(*p_cubics.T)
3.14 ms ± 13.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit multi_quartic(*p_quartics.T)
5.46 ms ± 38.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

The vectorized implementation of the analytical solvers is an order of magnitude faster than the original analytical implementation, even when the latter is accelerated with the just-in-time compiler from Numba. Furthermore, the vectorized implementation is two orders of magnitude faster than the numerical solver roots and one order of magnitude faster than linalg.eigvals.

Summary (TL;DR)

Findings on computation speed of different ways to solve cubic and quartic equations in Python can be summarized as follows:

  • Two numerical algorithms for finding polynomial roots are available out-of-box from Numpy package (numpy.roots and numpy.linalg.eigvals)
  • Analytical algorithms (closed-form solutions) for solving polynomial roots were implemented in Python (single_cubic/single_quartic for a single polynomial, and vectorized multi_cubic/multi_quartic for multiple polynomials). These functions are available through FQS
  • Both numerical algorithms have similar CPU times for a single polynomial, but for multiple polynomials linalg.eigvals becomes much faster (up to one order of magnitude)
  • Analytical algorithms single_cubic/single_quartic are the fastest when a single polynomial, or a set smaller than 100 polynomials, should be solved
  • For single_cubic/single_quartic just-in-time compiler from Numba gives a significant increase in the computational speed
  • Analytical algorithm multi_cubic/multi_quartic is the fastest when a set larger than 100 polynomials is given as input
  • A Python function containing single_cubic, single_quartic, multi_cubic, and multi_quartic, as well as a function that determines which solver should be used in a specific case, is available through FQS.
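A dispatching function of that kind might look as follows. This is a hypothetical sketch built on the NumPy-based solvers from this text (numpy.roots and the companion-matrix construction from eig_cubic_roots), not the actual FQS implementation:

```python
import numpy

def cubic_roots_auto(p):
    ''' Hypothetical dispatcher: route a single coefficient row to
        numpy.roots and a batch to the vectorized companion-matrix
        solver, mirroring the strategy described above. '''
    p = numpy.atleast_2d(numpy.asarray(p, dtype=float))
    if p.shape[0] == 1:
        # A single polynomial: plain numpy.roots is sufficient.
        return numpy.roots(p[0])[None, :]
    # A batch: build stacked companion matrices and solve in one call.
    a, b, c = p[:, 1]/p[:, 0], p[:, 2]/p[:, 0], p[:, 3]/p[:, 0]
    A = numpy.zeros((len(a), 3, 3))
    A[:, 1:, :2] = numpy.eye(2)
    A[:, :, 2] = -numpy.array([c, b, a]).T
    return numpy.linalg.eigvals(A)
```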

The CPU times are summarized in the following two tables for different number of polynomials (Nr.) and separately for cubic and quartic equations:

Cubic equations:

Nr.     roots    linalg.eigvals  single_cubic  single_cubic (@jit)  multi_cubic
1       76.5 µs  67 µs           28.6 µs       6.34 µs              174 µs
100     8.19 ms  0.54 ms         2.11 ms       0.27 ms              0.24 ms
10 000  786 ms   31.2 ms         236 ms        27.6 ms              3.14 ms

Quartic equations:

Nr.     roots    linalg.eigvals  single_quartic  single_quartic (@jit)  multi_quartic
1       80.1 µs  69.3 µs         50.3 µs         5.8 µs                 233 µs
100     8.22 ms  0.59 ms         3.94 ms         0.33 ms                0.34 ms
10 000  795 ms   48.3 ms         421 ms          30.8 ms                5.46 ms
Nino Krvavica
Professor of Hydraulic Engineering