In any high school pre-calculus course, students are taught various strategies for finding the zeros of quadratic functions, one of which is the quadratic formula.
For reference, here is the formula where z0 and z1 are the zeros of a quadratic function and a, b and c are its coefficients:
$$(z_0, z_1) = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
Students are often told to memorize this formula without any further explanation, though I believe that if some extra insight into how the formula can be found were provided along with its introduction, the task of memorization might feel a little less daunting for some.
The common methods of deriving the quadratic formula can feel just as mysterious to students as the formula itself.
In this post, however, I'd like to walk through and explain the steps of a derivation which I find to be fairly intuitive.
A general problem
Imagine a scenario where we have no prior knowledge of the quadratic formula and we are told to find the zeros of a quadratic function f(x) whose coefficients are arbitrary constants, namely a, b, and c.
Such a function would have the following form:
$$f(x) = ax^2 + bx + c$$
Defining the zeros
One's first instinct may be to factor the function, but that is going to be tough to do since we don't know anything about the coefficients other than the fact that they are numbers.
However, what we can do is introduce a few new variables in order to write f(x) in its factored form.
Let's see what that would look like:
$$f(x) = (\alpha_1 x + \alpha_0)(\beta_1 x + \beta_0)$$
While f(x) is in this form, we can see that f(x)=0 if either α1x+α0=0 or β1x+β0=0.
If we isolate both terms and solve for x we will have found the zeros of f(x), those being z0 and z1.
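$$
\begin{aligned}
\alpha_1 x + \alpha_0 &= 0 & \beta_1 x + \beta_0 &= 0 \\
z_0 = x &= \frac{-\alpha_0}{\alpha_1} & z_1 = x &= \frac{-\beta_0}{\beta_1}
\end{aligned}
$$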
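To make this step concrete, here is a minimal Python sketch that builds a factored quadratic and reads its zeros straight off the two factors. The specific factor values are hypothetical, chosen only for illustration, and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

# Hypothetical factor coefficients (illustration only):
# f(x) = (3x + 9)(2x - 4)
alpha1, alpha0 = Fraction(3), Fraction(9)
beta1, beta0 = Fraction(2), Fraction(-4)

def f(x):
    """Evaluate f in its factored form."""
    return (alpha1 * x + alpha0) * (beta1 * x + beta0)

z0 = -alpha0 / alpha1  # zero of the first factor
z1 = -beta0 / beta1    # zero of the second factor

print(z0, z1)        # -3 2
print(f(z0), f(z1))  # 0 0
```

Each zero makes exactly one of the two factors vanish, which is why the product is zero at both points.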
So we've found our zeros; however, their definitions are in terms of variables whose values we don't yet know.
The only values which are known to us are the constant coefficients of f(x), so we need to find a way to redefine both z0 and z1 in terms of a, b, and c.
Redefining the coefficients
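Even though we have defined the factored form of f(x) in terms of a few arbitrary variables, it is still representative of our original function, meaning that if we were to expand the factored form of f(x) it would provide us with new definitions for a, b, and c in terms of our new variables.
$$f(x) = (\alpha_1 x + \alpha_0)(\beta_1 x + \beta_0) = \alpha_1\beta_1 x^2 + (\alpha_1\beta_0 + \beta_1\alpha_0)x + \alpha_0\beta_0$$
Matching each term against ax^2 + bx + c gives:
$$a = \alpha_1\beta_1 \qquad b = \alpha_1\beta_0 + \beta_1\alpha_0 \qquad c = \alpha_0\beta_0$$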
We now have formal definitions for each of our coefficients, but in a similar fashion to z0 and z1, they are defined in terms of the variables we recently introduced and know nothing about, so these definitions aren't going to help us much right now.
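The expansion can be spot-checked numerically. The sketch below assumes the definitions a = α1β1, b = α1β0 + β1α0, and c = α0β0 (the same ones used later in this post) and compares the factored and standard forms at a handful of sample points. The factor values are hypothetical, picked only for illustration.

```python
from fractions import Fraction

# Hypothetical factor coefficients (illustration only):
# f(x) = (3x + 9)(2x - 4)
alpha1, alpha0 = Fraction(3), Fraction(9)
beta1, beta0 = Fraction(2), Fraction(-4)

# Coefficients obtained by expanding the factored form
a = alpha1 * beta1
b = alpha1 * beta0 + beta1 * alpha0
c = alpha0 * beta0

def factored(x):
    return (alpha1 * x + alpha0) * (beta1 * x + beta0)

def standard(x):
    return a * x**2 + b * x + c

# Compare both forms at several exact sample points
samples = [Fraction(n, 2) for n in range(-6, 7)]
forms_agree = all(factored(x) == standard(x) for x in samples)

print(a, b, c)      # 6 6 -36
print(forms_agree)  # True
```

Agreement at a few points is only a sanity check, not a proof, but two quadratics that match at three or more distinct points must be identical.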
Finding a middle-ground
At this point it's not all too obvious what should be done next, so I'm going to take a moment to bring up a property which will help us manipulate our definitions in order to eliminate some of the unknown variables.
The property goes as follows:
The middle value between any two numbers a and b is equal to (a+b)/2
$$a \;\cdots\; \frac{a+b}{2} \;\cdots\; b$$
It may not be clear as to how this property could help us, but try and stick with me as we go through these next couple of steps, as things should soon start making sense.
Let's recall our definition of b.
$$b = \alpha_1\beta_0 + \beta_1\alpha_0$$
Since b is the sum of α1β0 and β1α0, by the previous property we can see that the value which lies in between α1β0 and β1α0 should be b/2.
Let's call this new value m for middle.
$$m = \frac{\alpha_1\beta_0 + \beta_1\alpha_0}{2} = \frac{b}{2}$$
Going the distance
There is a small detail in the property I previously mentioned which I omitted for clarity, but we now need it in order to move forward.
The following is a version of the previous property which includes this detail:
The middle value between any two numbers a and b is equal to (a+b)/2 and resides at a distance d from both a and b
$$a \;\overset{d}{\cdots}\; \frac{a+b}{2} \;\overset{d}{\cdots}\; b$$
This additional detail should make sense, as a middle object is defined by the fact that the distance between it and each of its surrounding objects is equal.
This version of the property denotes this distance by d, and it can be used for traveling to the left (−d) and right (+d) of (a+b)/2 in order to reach a and b respectively.
For our particular instance, this means we can travel a distance d to the left and right of m in order to reach both α1β0 and β1α0 respectively.
This travel can be represented in both α1β0 and β1α0 by rewriting them as the following:
$$\alpha_1\beta_0 = m - d \qquad \beta_1\alpha_0 = m + d$$
With these two new equations we can now define ac in terms of m and d.
$$ac = \alpha_1\beta_1\alpha_0\beta_0 = (\alpha_1\beta_0)(\beta_1\alpha_0) = (m - d)(m + d) = m^2 - d^2$$
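The midpoint-and-distance rewrite can be checked on a concrete case. This is a small sketch with hypothetical factor values, chosen so that α1β0 lies to the left of β1α0 as the text assumes. It confirms that both terms sit at distance d from m and that ac equals m² − d².

```python
from fractions import Fraction

# Hypothetical factor coefficients (illustration only):
# f(x) = (3x + 9)(2x - 4)
alpha1, alpha0 = Fraction(3), Fraction(9)
beta1, beta0 = Fraction(2), Fraction(-4)

left = alpha1 * beta0    # alpha1 * beta0 = -12
right = beta1 * alpha0   # beta1 * alpha0 = 18

a = alpha1 * beta1
b = left + right
c = alpha0 * beta0

m = (left + right) / 2   # middle value between the two terms
d = right - m            # distance from m to either term

print(m == b / 2)                          # True
print(left == m - d, right == m + d)       # True True
print(a * c, m**2 - d**2)                  # -216 -216
```

Here m is 3 and d is 15, so the two terms are 3 − 15 and 3 + 15, and their product is the difference of squares 9 − 225.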
Now we can solve for d.
$$
\begin{aligned}
ac &= m^2 - d^2 \\
d &= \sqrt{m^2 - ac}
\end{aligned}
$$
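What matters about this result is that d can now be computed from the coefficients alone, without knowing the factors. Here is a minimal sketch, assuming d is taken as the non-negative square root of m² − ac with m = b/2, and assuming that quantity is not negative. The function name is made up for this example.

```python
import math

def distance_from_coefficients(a, b, c):
    """Return d = sqrt(m^2 - a*c) with m = b/2.

    Assumes m^2 - a*c >= 0; math.sqrt raises ValueError otherwise.
    """
    m = b / 2
    return math.sqrt(m * m - a * c)

# f(x) = 6x^2 + 6x - 36, i.e. the expansion of (3x + 9)(2x - 4)
print(distance_from_coefficients(6, 6, -36))  # 15.0
```

For this quadratic, m is 3 and m² − ac is 9 + 216 = 225, so d comes out as 15.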
Zeroing in
We've diverged pretty far from our original problem, but we now have all of the necessary tools to solve for the zeros of f(x) as we initially set out to do.
Let's recall the definition of these zeros:
$$(z_0, z_1) = \left(\frac{-\alpha_0}{\alpha_1}, \frac{-\beta_0}{\beta_1}\right)$$
As I had mentioned before, this definition isn't very useful since it is not in terms that we can work with, but we can now fix that using the new definitions we've worked to find.