Deriving the Quadratic Formula

April 25, 2021


In any high school pre-calculus course, students are taught various strategies for finding the zeros of quadratic functions, one of which is the quadratic formula.

For reference, here is the formula, where $z_0$ and $z_1$ are the zeros of a quadratic function and $a$, $b$, and $c$ are its coefficients:

$$(z_0, z_1) = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
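As a quick illustration, the formula can be evaluated directly in code. This is only a minimal sketch: the function name `quadratic_zeros` is made up for this example, and it uses Python's `cmath.sqrt` so that a negative value under the square root yields complex zeros instead of an error.

```python
import cmath


def quadratic_zeros(a, b, c):
    """Return both zeros of a*x**2 + b*x + c using the quadratic formula."""
    if a == 0:
        raise ValueError("a must be nonzero for a quadratic function")
    # cmath.sqrt also accepts negative inputs and returns a complex result.
    root = cmath.sqrt(b * b - 4 * a * c)
    return (-b + root) / (2 * a), (-b - root) / (2 * a)


# Example: x^2 - x - 6 factors as (x + 2)(x - 3), so its zeros are 3 and -2.
print(quadratic_zeros(1, -1, -6))
```

The results come back as complex numbers, with a zero imaginary part whenever the zeros are real.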

Students are often told to memorize this formula without any further explanation. I believe that if some insight into how the formula can be found were provided along with its introduction, the task of memorization might feel a little less daunting.

The common methods of deriving the quadratic formula can feel just as mysterious to students as the formula itself. In this post, I'd like to walk through a method of deriving the formula that I find fairly intuitive.

A general problem

Imagine a scenario where we have no prior knowledge of the quadratic formula and we are told to find the zeros of a quadratic function $f(x)$ whose coefficients are arbitrary constants, namely $a$, $b$, and $c$, with $a \neq 0$.

Such a function would have the following form:

$$f(x) = \color{#FF1241}a\color{#392354}x^2 + \color{#4dff6e}b\color{#392354}x + \color{#4924FF}c\color{#392354}$$

Defining the zeros

One's first instinct may be to factor the function, but that is going to be tough to do since we don't know anything about the coefficients other than the fact that they are numbers. However, what we can do is introduce a few new variables in order to write $f(x)$ in its factored form.

Let's see what that would look like:

$$f(x) = (\color{#8619E6}\alpha_1\color{#392354}x + \color{#E14DD8}\alpha_0\color{#392354})(\color{#8619E6}\beta_1\color{#392354}x + \color{#E14DD8}\beta_0\color{#392354})$$

While $f(x)$ is in this form, we can see that $f(x) = 0$ if either $\color{#8619E6}\alpha_1\color{#392354}x + \color{#E14DD8}\alpha_0\color{#392354} = 0$ or $\color{#8619E6}\beta_1\color{#392354}x + \color{#E14DD8}\beta_0\color{#392354} = 0$.

If we set each factor equal to zero and solve for $x$, we will have found the zeros of $f(x)$, namely $z_0$ and $z_1$.

$$
\begin{aligned}
\color{#8619E6}\alpha_1\color{#392354}x + \color{#E14DD8}\alpha_0\color{#392354} &= 0 \\
z_0 = x &= \frac{-\color{#E14DD8}\alpha_0\color{#392354}}{\color{#8619E6}\alpha_1\color{#392354}} \\\\
\color{#8619E6}\beta_1\color{#392354}x + \color{#E14DD8}\beta_0\color{#392354} &= 0 \\
z_1 = x &= \frac{-\color{#E14DD8}\beta_0\color{#392354}}{\color{#8619E6}\beta_1\color{#392354}}
\end{aligned}
$$
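To make this concrete, here is a small sketch that reads the zeros straight off a factored quadratic. The function name `zeros_from_factors` and the sample factors (2x + 4)(x - 3) are chosen purely for illustration, and the sketch assumes integer inputs with both leading factor coefficients nonzero.

```python
from fractions import Fraction


def zeros_from_factors(alpha1, alpha0, beta1, beta0):
    """Zeros of (alpha1*x + alpha0) * (beta1*x + beta0).

    Assumes integer inputs with alpha1 and beta1 nonzero.
    """
    z0 = Fraction(-alpha0, alpha1)  # solves alpha1*x + alpha0 = 0
    z1 = Fraction(-beta0, beta1)    # solves beta1*x + beta0 = 0
    return z0, z1


# Example factors: (2x + 4)(x - 3)
z0, z1 = zeros_from_factors(2, 4, 1, -3)
print(z0, z1)  # -2 3
```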

So we've found our zeros. However, their definitions are in terms of variables whose values we don't yet know. The only values known to us are the constant coefficients of $f(x)$, so we need to find a way to redefine both $z_0$ and $z_1$ in terms of $\color{#FF1241}a\color{#392354}$, $\color{#4dff6e}b\color{#392354}$, and $\color{#4924FF}c\color{#392354}$.

Redefining the coefficients

Even though we have defined the factored form of $f(x)$ in terms of a few arbitrary variables, it is still representative of our original function, meaning that if we were to expand the factored form of $f(x)$, it would provide us with new definitions for $\color{#FF1241}a\color{#392354}$, $\color{#4dff6e}b\color{#392354}$, and $\color{#4924FF}c\color{#392354}$ in terms of our new variables.

This expansion goes as follows:

$$
\begin{aligned}
f(x) &= (\color{#8619E6}\alpha_1\color{#392354}x + \color{#E14DD8}\alpha_0\color{#392354})(\color{#8619E6}\beta_1\color{#392354}x + \color{#E14DD8}\beta_0\color{#392354}) \\
&= \color{#8619E6}\alpha_1\color{#392354}\color{#8619E6}\beta_1\color{#392354}x^2 + (\color{#8619E6}\alpha_1\color{#392354}\color{#E14DD8}\beta_0\color{#392354} + \color{#8619E6}\beta_1\color{#392354}\color{#E14DD8}\alpha_0\color{#392354})x + \color{#E14DD8}\alpha_0\color{#392354}\color{#E14DD8}\beta_0\color{#392354} \\
&= \color{#FF1241}\alpha_1\beta_1\color{#392354}x^2 + (\color{#4dff6e}\alpha_1\beta_0 + \beta_1\alpha_0\color{#392354})x + \color{#4924FF}\alpha_0\beta_0\color{#392354} \\
&= \color{#FF1241}a\color{#392354}x^2 + \color{#4dff6e}b\color{#392354}x + \color{#4924FF}c\color{#392354}
\end{aligned}
$$

Here is a clearer view of our new definitions:

$$
\begin{aligned}
\color{#FF1241}a\color{#392354} &= \color{#8619E6}\alpha_1\color{#392354}\color{#8619E6}\beta_1\color{#392354} \\
\color{#4dff6e}b\color{#392354} &= \color{#8619E6}\alpha_1\color{#392354}\color{#E14DD8}\beta_0\color{#392354} + \color{#8619E6}\beta_1\color{#392354}\color{#E14DD8}\alpha_0\color{#392354} \\
\color{#4924FF}c\color{#392354} &= \color{#E14DD8}\alpha_0\color{#392354}\color{#E14DD8}\beta_0\color{#392354}
\end{aligned}
$$
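These three definitions are easy to check numerically. The sketch below uses a hypothetical helper, `expand_factors`, and the sample factors (2x + 4)(x - 3), both of which are assumptions made only for this example. It confirms that the factored and expanded forms agree at several values of x.

```python
def expand_factors(alpha1, alpha0, beta1, beta0):
    """Coefficients (a, b, c) of (alpha1*x + alpha0) * (beta1*x + beta0)."""
    a = alpha1 * beta1
    b = alpha1 * beta0 + beta1 * alpha0
    c = alpha0 * beta0
    return a, b, c


# Example factors: (2x + 4)(x - 3) = 2x^2 - 2x - 12
a, b, c = expand_factors(2, 4, 1, -3)
print(a, b, c)  # 2 -2 -12

# Both forms should agree for every x; spot-check a few values.
for x in range(-3, 4):
    assert (2 * x + 4) * (x - 3) == a * x**2 + b * x + c
```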

We now have formal definitions for each of our coefficients, but in a similar fashion to $z_0$ and $z_1$, they are defined in terms of the variables we recently introduced and know nothing about, so these definitions aren't going to help us much right now.

Finding a middle-ground

At this point it's not obvious what should be done next, so I'm going to take a moment to bring up a property that will help us manipulate our definitions and eliminate some of the unknown variables.

The property goes as follows:

The middle value between any two numbers $a$ and $b$ is equal to $\frac{a + b}{2}$. (Here $a$ and $b$ stand for any two numbers, not the coefficients of $f(x)$.)

$$a \; \cdots \; \frac{a + b}{2} \; \cdots \; b$$

It may not be clear how this property could help us, but stick with me through the next couple of steps, as things should soon start making sense.

Let's recall our definition of $\color{#4dff6e}b\color{#392354}$.

$$\color{#4dff6e}b\color{#392354} = \color{#8619E6}\alpha_1\color{#392354}\color{#E14DD8}\beta_0\color{#392354} + \color{#8619E6}\beta_1\color{#392354}\color{#E14DD8}\alpha_0\color{#392354}$$

Since $\color{#4dff6e}b\color{#392354}$ is the sum of $\color{#8619E6}\alpha_1\color{#392354}\color{#E14DD8}\beta_0\color{#392354}$ and $\color{#8619E6}\beta_1\color{#392354}\color{#E14DD8}\alpha_0\color{#392354}$, by the previous property we can see that the value which lies in the middle of $\color{#8619E6}\alpha_1\color{#392354}\color{#E14DD8}\beta_0\color{#392354}$ and $\color{#8619E6}\beta_1\color{#392354}\color{#E14DD8}\alpha_0\color{#392354}$ is $\frac{\color{#4dff6e}b\color{#392354}}{2}$.

Let's call this new value $\color{#FF5303}m\color{#392354}$ for middle.

$$\color{#FF5303}m\color{#392354} = \frac{\color{#8619E6}\alpha_1\color{#392354}\color{#E14DD8}\beta_0\color{#392354} + \color{#8619E6}\beta_1\color{#392354}\color{#E14DD8}\alpha_0\color{#392354}}{2} = \frac{\color{#4dff6e}b\color{#392354}}{2}$$
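A short numeric sketch may help here. It uses the illustrative factors (2x + 4)(x - 3), which are an assumption of this example and not part of the derivation, and confirms that half of b sits exactly midway between the two cross products.

```python
from fractions import Fraction

# Illustrative factors: (2x + 4)(x - 3)
alpha1, alpha0, beta1, beta0 = 2, 4, 1, -3

left = alpha1 * beta0    # alpha_1 * beta_0 = -6
right = beta1 * alpha0   # beta_1 * alpha_0 = 4
b = left + right         # the x coefficient of the expanded form, -2

m = Fraction(b, 2)       # the middle value, b / 2
print(left, m, right)    # -6 -1 4

# m is equally far from both cross products.
assert m - left == right - m
```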

Going the distance

There is a small detail in the property I previously mentioned which I omitted for clarity, but we now need it in order to move forward.

The following is a version of the previous property which includes this detail:

The middle value between any two numbers $a$ and $b$ is equal to $\frac{a + b}{2}$ and resides at a distance $d$ from both $a$ and $b$.

$$a \; \underbrace{\cdots}_{d} \; \frac{a + b}{2} \; \underbrace{\cdots}_{d} \; b$$

This additional detail should make sense, since the middle value is, by definition, equally far from both of the numbers it sits between.

This version of the property denotes this distance by $d$. Traveling a distance $d$ to the left ($-d$) and to the right ($+d$) of $\frac{a + b}{2}$ reaches $a$ and $b$ respectively.

For our particular instance, this means we can travel a distance $\color{#008CFF}d\color{#392354}$ to the left and right of $\color{#FF5303}m\color{#392354}$ in order to reach $\color{#8619E6}\alpha_1\color{#392354}\color{#E14DD8}\beta_0\color{#392354}$ and $\color{#8619E6}\beta_1\color{#392354}\color{#E14DD8}\alpha_0\color{#392354}$ respectively.

This travel can be represented by rewriting $\color{#8619E6}\alpha_1\color{#392354}\color{#E14DD8}\beta_0\color{#392354}$ and $\color{#8619E6}\beta_1\color{#392354}\color{#E14DD8}\alpha_0\color{#392354}$ as the following:

$$
\begin{aligned}
\color{#8619E6}\alpha_1\color{#392354}\color{#E14DD8}\beta_0\color{#392354} &= \color{#FF5303}m\color{#392354} - \color{#008CFF}d\color{#392354} \\
\color{#8619E6}\beta_1\color{#392354}\color{#E14DD8}\alpha_0\color{#392354} &= \color{#FF5303}m\color{#392354} + \color{#008CFF}d\color{#392354}
\end{aligned}
$$

With these two new equations we can now define $\color{#FF1241}a\color{#392354}\color{#4924FF}c\color{#392354}$ in terms of $\color{#FF5303}m\color{#392354}$ and $\color{#008CFF}d\color{#392354}$.

$$\color{#FF1241}a\color{#392354}\color{#4924FF}c\color{#392354} = \color{#8619E6}\alpha_1\color{#392354}\color{#8619E6}\beta_1\color{#392354}\color{#E14DD8}\alpha_0\color{#392354}\color{#E14DD8}\beta_0\color{#392354} = \color{#8619E6}\alpha_1\color{#392354}\color{#E14DD8}\beta_0\color{#392354}\color{#8619E6}\beta_1\color{#392354}\color{#E14DD8}\alpha_0\color{#392354} = (\color{#FF5303}m\color{#392354} - \color{#008CFF}d\color{#392354})(\color{#FF5303}m\color{#392354} + \color{#008CFF}d\color{#392354}) = \color{#FF5303}m\color{#392354}^2 - \color{#008CFF}d\color{#392354}^2$$

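Now we can solve for $\color{#008CFF}d\color{#392354}$. Since $\color{#008CFF}d\color{#392354}$ is a distance, we take the non-negative square root.

$$
\begin{aligned}
\color{#FF1241}a\color{#392354}\color{#4924FF}c\color{#392354} &= \color{#FF5303}m\color{#392354}^2 - \color{#008CFF}d\color{#392354}^2 \\
\color{#008CFF}d\color{#392354}^2 &= \color{#FF5303}m\color{#392354}^2 - \color{#FF1241}a\color{#392354}\color{#4924FF}c\color{#392354} \\
\color{#008CFF}d\color{#392354} &= \sqrt{\color{#FF5303}m\color{#392354}^2 - \color{#FF1241}a\color{#392354}\color{#4924FF}c\color{#392354}}
\end{aligned}
$$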
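As a sanity check, the sketch below computes the middle value and the distance from the coefficients alone and then recovers the two cross products. The sample coefficients 2, -2, and -12 come from the illustrative factors (2x + 4)(x - 3), which are an assumption of this example. It also assumes the value under the square root is non-negative, since `math.sqrt` raises an error otherwise.

```python
import math

# Coefficients of 2x^2 - 2x - 12, which factors as (2x + 4)(x - 3)
a, b, c = 2, -2, -12

m = b / 2                       # the middle value
d = math.sqrt(m * m - a * c)    # the distance; assumes m^2 - ac >= 0

# m - d and m + d should be the cross products alpha_1*beta_0 and beta_1*alpha_0.
print(m - d, m + d)  # -6.0 4.0

# Their product should equal ac.
assert math.isclose((m - d) * (m + d), a * c)
```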

Zeroing in

We've diverged pretty far from our original problem, but we now have all of the necessary tools to solve for the zeros of $f(x)$ as we initially set out to do.

Let's recall the definition of these zeros:

$$(z_0, z_1) = \left(\frac{-\color{#E14DD8}\alpha_0\color{#392354}}{\color{#8619E6}\alpha_1\color{#392354}}, \frac{-\color{#E14DD8}\beta_0\color{#392354}}{\color{#8619E6}\beta_1\color{#392354}}\right)$$

As I mentioned before, this definition isn't very useful since it is not in terms that we can work with, but we can now fix that using the definitions we've worked to find. Multiplying the numerator and denominator of $z_0$ by $\color{#8619E6}\beta_1\color{#392354}$, and those of $z_1$ by $\color{#8619E6}\alpha_1\color{#392354}$, lets us substitute them in.

$$
\begin{aligned}
\color{#8619E6}\beta_1\color{#392354}\color{#E14DD8}\alpha_0\color{#392354} &= \color{#FF5303}m\color{#392354} + \color{#008CFF}d\color{#392354} \\
z_0 = \frac{-\color{#E14DD8}\alpha_0\color{#392354}}{\color{#8619E6}\alpha_1\color{#392354}} &= \frac{-\color{#8619E6}\beta_1\color{#392354}\color{#E14DD8}\alpha_0\color{#392354}}{\color{#8619E6}\alpha_1\color{#392354}\color{#8619E6}\beta_1\color{#392354}} = \frac{-\color{#FF5303}m\color{#392354} - \color{#008CFF}d\color{#392354}}{\color{#8619E6}\alpha_1\color{#392354}\color{#8619E6}\beta_1\color{#392354}} = \frac{-\color{#FF5303}m\color{#392354} - \color{#008CFF}d\color{#392354}}{\color{#FF1241}a\color{#392354}} \\\\
\color{#8619E6}\alpha_1\color{#392354}\color{#E14DD8}\beta_0\color{#392354} &= \color{#FF5303}m\color{#392354} - \color{#008CFF}d\color{#392354} \\
z_1 = \frac{-\color{#E14DD8}\beta_0\color{#392354}}{\color{#8619E6}\beta_1\color{#392354}} &= \frac{-\color{#8619E6}\alpha_1\color{#392354}\color{#E14DD8}\beta_0\color{#392354}}{\color{#8619E6}\alpha_1\color{#392354}\color{#8619E6}\beta_1\color{#392354}} = \frac{-\color{#FF5303}m\color{#392354} + \color{#008CFF}d\color{#392354}}{\color{#8619E6}\alpha_1\color{#392354}\color{#8619E6}\beta_1\color{#392354}} = \frac{-\color{#FF5303}m\color{#392354} + \color{#008CFF}d\color{#392354}}{\color{#FF1241}a\color{#392354}}
\end{aligned}
$$

We can now write our zeros out as the following:

$$(z_0, z_1) = \frac{-\color{#FF5303}m\color{#392354} \pm \color{#008CFF}d\color{#392354}}{\color{#FF1241}a\color{#392354}}$$

Now we can make a few substitutions and fully simplify.

$$
\begin{aligned}
(z_0, z_1) &= \frac{-\color{#FF5303}m\color{#392354} \pm \color{#008CFF}d\color{#392354}}{\color{#FF1241}a\color{#392354}} \\
&= \frac{-\frac{\color{#4dff6e}b\color{#392354}}{2} \pm \sqrt{(\frac{\color{#4dff6e}b\color{#392354}}{2})^2 - \color{#FF1241}a\color{#392354}\color{#4924FF}c\color{#392354}}}{\color{#FF1241}a\color{#392354}} \\
&= \frac{-\frac{b}{2} \pm \frac{\sqrt{b^2 - 4ac}}{2}}{a} \\
&= \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
\end{aligned}
$$
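To see that the two routes agree, here is a minimal sketch that computes the zeros once through the middle value and distance, and once through the final formula. Both function names are made up for this example, and both functions assume real zeros (a non-negative value under the square root) and a nonzero leading coefficient.

```python
import math


def zeros_via_middle(a, b, c):
    """Zeros computed from the middle value m and the distance d."""
    m = b / 2
    d = math.sqrt(m * m - a * c)  # assumes m^2 - ac >= 0
    return (-m - d) / a, (-m + d) / a


def zeros_via_formula(a, b, c):
    """Zeros computed from the simplified quadratic formula."""
    root = math.sqrt(b * b - 4 * a * c)  # assumes b^2 - 4ac >= 0
    return (-b - root) / (2 * a), (-b + root) / (2 * a)


# Example: 2x^2 - 2x - 12 = (2x + 4)(x - 3)
print(zeros_via_middle(2, -2, -12))   # (-2.0, 3.0)
print(zeros_via_formula(2, -2, -12))  # (-2.0, 3.0)
```

The first function follows the derivation step by step, while the second is the simplified result; for the same coefficients they return the same pair of zeros.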

And there we have it: the quadratic formula.