What is the Inverse of a Vector?
If vector multiplication were invertible, we could treat vectors just like any other variable. We could do simple algebra like:
It's strange to see vectors manipulated this way, isn't it? The cancellations generally make sense, but what does a quantity like -3\textcolor{purple}{\vec{b}}\textcolor{orange}{\vec{a}}^2 even mean?
This type of algebra looks weird because we learned in school how to add and subtract vectors, but we never learned how to multiply and divide them!
Most curriculums cover the dot product and the cross product, but neither of these is invertible, so neither gives us vector inversion.
But invertible vector multiplication does exist and we can derive it ourselves.
In this post we will re-invent a form of math that is far superior to the one you learned in school. The ideas herein are nothing short of revolutionary.
Table of Contents
- The Units of Magnitude
- Addition of Like Types
- Addition of Disparate Types
- Choosing The Vector Product
- Examining the Vector Product
- A Quick Warmup (Start here if you just need a refresher)
- Vectors as Transformations
- Transforming Twice
- A New Foundation for Physics
- Conclusion
- Geometric Algebra Cheat Sheet
- Contact me
- Errata
The Units of Magnitude
We're going to look at the things we're already familiar with and try to draw a pattern. We'll use this pattern to create some new objects. Those objects will lead us to invertible vector multiplication.
Along the way we'll pretend that we're inventing this all for ourselves. We'll make a few wrong turns and see why they're wrong, rather than just dropping conclusions out of the sky.
We're going to derive the foundations of an entire branch of Mathematics that is usually only taught to grad students. But to get there you need nothing more than high-school algebra skills.
The key insight comes from examining scalars and vectors. We'll look at their magnitudes and focus on the units that those magnitudes are measured in.
Scalars
A scalar s is a regular number like 1.618 or -3 or \frac{1}{137}.
A scalar is a point on a number line. Its magnitude \|s\| is just its absolute value. We can call this a 0-Vector or refer to it as "Grade 0" to indicate that its magnitude carries no units.
Vectors
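For the purposes of this post a vector \textcolor{orange}{\vec{a}} is an arrow in 3D space. It points from the origin to some other point in space and is usually written like this:
\textcolor{orange}{\vec{a}} = \textcolor{orange}{a_1}\textcolor{red}{\hat{x}} + \textcolor{orange}{a_2}\textcolor{green}{\hat{y}} + \textcolor{orange}{a_3}\textcolor{blue}{\hat{z}}
Or equivalently, like this:
\textcolor{orange}{\vec{a}} = \begin{bmatrix} \textcolor{orange}{a_1} \\ \textcolor{orange}{a_2} \\ \textcolor{orange}{a_3} \end{bmatrix}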
By the way, all these 3D illustrations are fully interactive. Go ahead and rotate them around to make sure you get a good intuitive feel for them.
A vector's magnitude \|\textcolor{orange}{\vec{a}}\| is equal to its length, so we'll call them 1-Vectors or say they are "Grade 1" because length is a 1-dimensional quantity.
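That length is just:
\|\textcolor{orange}{\vec{a}}\| = \sqrt{\textcolor{orange}{a_1}^2 + \textcolor{orange}{a_2}^2 + \textcolor{orange}{a_3}^2}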
Bivectors
This suggests a pattern. If a scalar's magnitude is a 0-dimensional value that carries no units, and a vector's magnitude is a 1-dimensional value that carries units of length, can we invent a type of object whose magnitude is a 2-dimensional value that carries units of area?
Let's try!
If vectors are oriented chunks of length, then our new object is some sort of oriented chunk of area, some segment of a 2D plane that is floating in 3D space.
The most efficient way to define a plane is to use the normal vector \textcolor{gray}{\vec{n}} which is perpendicular to the plane. The direction of the normal vector tells us the orientation of the plane, and we can encode magnitude as the length of that normal vector.
But...wait that definition isn't consistent with our goal. This object does describe an oriented plane and it does have a degree of freedom to represent scale, but it's still just a vector so its magnitude will be a length, not an area. This won't work.
What else can we try?
One alternative way to describe a plane is to use two vectors, \textcolor{orange}{\vec{a}} and \textcolor{purple}{\vec{b}}. This requires specifying 6 values rather than just 3, so it is less efficient but who cares, we're inventing our own math here so we make the rules.
The plane that they both fall upon gives us an orientation. But how do we get an area?
We've already got two vectors so the simplest possible solution might be to complete the triangle:
Which has an area equal to:
\frac{1}{2}\|\textcolor{orange}{\vec{a}}\|\|\textcolor{purple}{\vec{b}}\|\sin\theta
Where \theta is the angle between the vectors.
This works but we'll have to carry around an inconvenient factor of \frac{1}{2}.
On a lark, what if we borrow an idea from CAD software and try to extrude \textcolor{orange}{\vec{a}} in the direction of \textcolor{purple}{\vec{b}}?
The resulting shape is a parallelogram which covers an area:
\|\textcolor{orange}{\vec{a}}\|\|\textcolor{purple}{\vec{b}}\|\sin\theta
By choosing to extrude, we get to drop the extra factor of \frac{1}{2}, and we get a very hands-on, intuitive definition for what we're doing with our two vectors.
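To put numbers on the two candidate areas, here is a minimal Python sketch. The helper names and the sample vectors are illustrative assumptions, not anything defined in this post.

```python
import math

def norm(v):
    """Length of a 3D vector given as a tuple."""
    return math.sqrt(sum(c * c for c in v))

def angle_between(a, b):
    """Angle theta between a and b, in radians."""
    dot = sum(p * q for p, q in zip(a, b))
    cos_theta = dot / (norm(a) * norm(b))
    # Clamp to guard against tiny floating-point overshoot past +/-1.
    return math.acos(max(-1.0, min(1.0, cos_theta)))

def parallelogram_area(a, b):
    """Area swept out by extruding a along b: |a||b|sin(theta)."""
    return norm(a) * norm(b) * math.sin(angle_between(a, b))

def triangle_area(a, b):
    """Area of the triangle completed from a and b: half the parallelogram."""
    return 0.5 * parallelogram_area(a, b)

a = (2.0, 0.0, 0.0)   # sample values, chosen only for illustration
b = (1.0, 3.0, 0.0)
print(parallelogram_area(a, b))
print(triangle_area(a, b))
```

The triangle always comes out at exactly half the parallelogram, which is the factor of \frac{1}{2} that extrusion lets us drop.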
In fact, let's come up with a special new symbol that means extrude. We'll use \wedge and pronounce it "extrude". We can write down our parallelogram as \textcolor{orange}{\vec{a}}\wedge\textcolor{purple}{\vec{b}}.
The parallelogram \textcolor{orange}{\vec{a}}\wedge\textcolor{purple}{\vec{b}} is defined by extruding the vector \textcolor{orange}{\vec{a}} in the direction of \textcolor{purple}{\vec{b}}
So we've got our new object, a 2D parallelogram floating in 3D space. It takes two vectors to describe these things so let's call this object a bivector.
The bivector \textcolor{orange}{\vec{a}}\wedge\textcolor{purple}{\vec{b}} is defined by extruding the vector \textcolor{orange}{\vec{a}} in the direction of \textcolor{purple}{\vec{b}}
If we double the length of one of the input vectors, the output bivector stays in the same plane, it just doubles in area:
Already we have a tough choice to make. Consider the bivectors 2\textcolor{orange}{\vec{a}}\wedge\textcolor{purple}{\vec{b}} and \textcolor{orange}{\vec{a}}\wedge 2\textcolor{purple}{\vec{b}}. These two objects are obviously not identical. But they share the same plane and they have the same magnitude. Should we say that they are equal or not equal?
When you're inventing your own algebra, you get to choose!
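If we say they are equal we'll get an algebra that allows us to freely move coefficients around, which is a property we're used to from scalar algebra. So sure, let's define this as an axiom:
2\textcolor{orange}{\vec{a}}\wedge\textcolor{purple}{\vec{b}} = \textcolor{orange}{\vec{a}}\wedge 2\textcolor{purple}{\vec{b}}
Because their orientations and magnitudes are both equal.
More generally, using any scalar s:
(s\textcolor{orange}{\vec{a}})\wedge\textcolor{purple}{\vec{b}} = \textcolor{orange}{\vec{a}}\wedge(s\textcolor{purple}{\vec{b}}) = s(\textcolor{orange}{\vec{a}}\wedge\textcolor{purple}{\vec{b}}) \tag{1.0}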
This property is called linearity, which just means that the output scales linearly with the inputs. This is a desirable property when designing an algebra! If the extrude product is linear, we can freely move coefficients around however we like.
But it comes at a cost. 2\textcolor{orange}{\vec{a}}\wedge\textcolor{purple}{\vec{b}} and \textcolor{orange}{\vec{a}}\wedge2\textcolor{purple}{\vec{b}} are visually distinct objects yet we say they are equal.
The fact that we can write (and visualize) the same bivector multiple ways implies that our notation is redundant. Even though we write down bivectors using 6 numbers, it turns out there are only 3 degrees of freedom in a bivector.
Even though we visualize them as oriented parallelograms, the only important properties of a bivector are the plane it lives in and the area of its parallelogram. The actual shape of the parallelogram and its location in that plane aren't important.
If you've heard of quaternions, they famously use 4 numbers to represent 3 degrees of rotation. Their redundancy makes them much more algebraically useful but also harder to visualize and understand.
Bivectors might be similar. We'll come back to this.
But let's keep inspecting them first! What happens if we switch the input ordering? Does \textcolor{orange}{\vec{a}} \wedge \textcolor{purple}{\vec{b}} equal \textcolor{purple}{\vec{b}} \wedge \textcolor{orange}{\vec{a}}?
Well, \textcolor{orange}{\vec{a}} \wedge \textcolor{purple}{\vec{b}} forms a parallelogram by extruding an orange vector out sideways, so it forms an orange parallelogram. But \textcolor{purple}{\vec{b}} \wedge \textcolor{orange}{\vec{a}} extrudes a purple vector up almost vertically, so it extrudes a purple parallelogram. Again we get to choose if these should be considered equal.
You may complain that vectors don't usually carry any intrinsic color, so how can we reliably use color as the basis of distinguishing bivectors?
Your complaint is valid. Besides, we don't always get to use color when we're sketching in a notebook, so let's try looking at this situation in a way that doesn't rely so heavily on color:
By putting the second vector at the tip of the first, we can see that \textcolor{orange}{\vec{a}} \wedge \textcolor{purple}{\vec{b}} forms a parallelogram by tracing the perimeter clockwise, but \textcolor{purple}{\vec{b}} \wedge \textcolor{orange}{\vec{a}} forms a parallelogram by tracing the perimeter anticlockwise.
Now, the perceived spin direction changes based on which side of the parallelogram you're looking at. If you move the camera around you can make either bivector look clockwise or anticlockwise so I'm not trying to make any claims about their absolute spin directions.
But the fact remains that from any fixed camera position, the two bivectors spin in opposite directions. This is just another way of looking at the color difference, without us having to use colored pencils every time we draw bivectors on paper.
So we have to choose: Do we want our algebra to encapsulate a concept of spin direction, or do we want to go without?
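There is probably a useful algebra to be found down either of these roads. But to me, spin direction feels pretty important so let's define our second axiom:
\textcolor{orange}{\vec{a}}\wedge\textcolor{purple}{\vec{b}} = -\textcolor{purple}{\vec{b}}\wedge\textcolor{orange}{\vec{a}} \tag{1.1}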
This equation makes it look like one quantity is positive and the other is negative, but we're not necessarily picking which is which. We're just saying that whatever the two spin directions are, they're opposites.
In deriving bivectors we invented the extrude product. Let's explore it a little more!
What happens if you extrude a vector by itself?
Well, any vector \textcolor{purple}{\vec{b}} is always parallel to itself, so the extruded parallelogram will always have zero area. We can already write an identity!
\textcolor{purple}{\vec{b}}\wedge\textcolor{purple}{\vec{b}} = 0
What happens if we form a bivector from two orthogonal unit vectors like \textcolor{red}{\hat{x}} and {\textcolor{green}{\hat{y}}}?
The parallelogram they form happens to be a square with unit area! We can use this to form three distinct unit bivectors:
These three unit bivectors are orthogonal to each other, which just means that if you project one plane onto another, its shadow occupies zero area. They also form a basis for all bivectors, which means that every conceivable bivector can be written as a weighted sum of these three.
Said again: Every plane is just some combination of these three planes.
The order you choose to write the vectors in doesn't really matter as long as you pick one thing and stick to it, but we should pick something that is easy to remember. I think we should use \textcolor{red}{\hat{x}}\wedge\textcolor{blue}{\hat{z}} instead of \textcolor{blue}{\hat{z}}\wedge\textcolor{red}{\hat{x}} because that way, every unit bivector is spelled alphabetically: (\textcolor{red}{\hat{x}}\wedge\textcolor{green}{\hat{y}}), (\textcolor{green}{\hat{y}}\wedge\textcolor{blue}{\hat{z}}), and (\textcolor{red}{\hat{x}}\wedge\textcolor{blue}{\hat{z}}). This ruins our nice one-of-each-color pattern, but it means we have a handy mnemonic device to remember our basis bivectors.
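However you choose to write them, their mutual orthogonality gives bivectors an interesting commonality with regular vectors. Any vector can be written as the weighted sum of basis vectors:
\vec{v} = v_1\textcolor{red}{\hat{x}} + v_2\textcolor{green}{\hat{y}} + v_3\textcolor{blue}{\hat{z}}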
Where you find the coefficients v_1, v_2, v_3 by projecting \vec{v} onto the unit vectors \textcolor{red}{\hat{x}}, \textcolor{green}{\hat{y}}, \textcolor{blue}{\hat{z}}.
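Similarly, any bivector can be written as the weighted sum of basis bivectors:
\overset{\Rightarrow}{b} = b_1(\textcolor{red}{\hat{x}}\wedge\textcolor{green}{\hat{y}}) + b_2(\textcolor{green}{\hat{y}}\wedge\textcolor{blue}{\hat{z}}) + b_3(\textcolor{red}{\hat{x}}\wedge\textcolor{blue}{\hat{z}})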
Where you find the coefficients b_1, b_2, b_3 by projecting the bivector \overset{\Rightarrow}{b} onto the unit bivectors (\textcolor{red}{\hat{x}} \wedge \textcolor{green}{\hat{y}}), (\textcolor{green}{\hat{y}}\wedge\textcolor{blue}{\hat{z}}), (\textcolor{red}{\hat{x}}\wedge\textcolor{blue}{\hat{z}}).
Because bivectors can be written as three coefficients, they behave very similarly to vectors. You can add two bivectors the same way you add two vectors. You can multiply a bivector by a scalar the same way you would with a vector. The similarities are so striking that we might think of them as "pseudovectors":
But I won't write them this way because I think that obscures their true nature.
Instead, I will use:
Because it forces us to remember what those coefficients are attached to. A bivector contains three degrees of freedom just like a vector, but they are not the same degrees of freedom.
By the way I drew two arrows on top of the b to indicate that it is a bivector, not a vector. We won't use this notation much but it's nice to have in case we need it.
So there we have it, a new type of object called a bivector which is an oriented chunk of area! Its magnitude carries units of area, so bivectors are considered 2-Vectors, or referred to as "Grade 2".
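Here is a small Python sketch of the extrude product in terms of the three basis bivectors. The function names and sample vectors are assumptions made for illustration; the coefficient formulas come from expanding both inputs in \hat{x}, \hat{y}, \hat{z} and applying the two axioms (move coefficients freely, swap order at the cost of a minus sign).

```python
import math

def wedge(a, b):
    """Extrude a along b; returns the coefficients on (x^y, y^z, x^z)."""
    ax, ay, az = a
    bx, by, bz = b
    return (ax * by - ay * bx,   # x^y
            ay * bz - az * by,   # y^z
            ax * bz - az * bx)   # x^z

def magnitude(bivector):
    """Area carried by a bivector given as three coefficients."""
    return math.sqrt(sum(c * c for c in bivector))

a = (1.0, 2.0, 0.5)   # sample values, chosen only for illustration
b = (0.0, 1.0, 3.0)
two_a = tuple(2 * c for c in a)
two_b = tuple(2 * c for c in b)

print(wedge(two_a, b) == wedge(a, two_b))  # coefficients move freely
print(wedge(a, b), wedge(b, a))            # swapping inputs flips every sign
print(wedge(b, b))                         # a vector extruded by itself
print(magnitude(wedge(a, b)))              # area of the parallelogram
```

Six input numbers go in, but only three coefficients come out, which is the redundancy discussed earlier: many different pairs of vectors describe the same bivector.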
Trivectors
Okay we have a handle on scalars, vectors, and bivectors. Is there an object which behaves like an oriented volume? Something we might call a trivector?
For bivectors we made progress by extruding one vector in the direction given by a second vector, which yielded a 2D parallelogram. Can we extrude our parallelogram into a 3D volume?
Why not! Extruding a parallelogram into 3D yields a parallelepiped. Basically a cube but all askew:
Which we can write as \overset{\Rrightarrow}{t} = \textcolor{orange}{\vec{a}}\wedge\textcolor{purple}{\vec{b}}\wedge\textcolor{aqua}{\vec{c}}.
This object's magnitude is clearly its volume so we've got that taken care of. But how should we think about orientation?
Well, let's take a quick look at \textcolor{red}{\hat{x}}\wedge\textcolor{green}{\hat{y}}\wedge\textcolor{blue}{\hat{z}}
This is clearly a unit cube which has a volume of 1.
If we throw a minus sign in front of any vector, let's pick \textcolor{green}{\hat{y}}, we get a different cube. Are these two trivectors equal or opposite?
Yet another difficult choice! Another way to ask it is: does \textcolor{red}{\hat{x}}\wedge-\textcolor{green}{\hat{y}}\wedge\textcolor{blue}{\hat{z}} have a volume of 1 or -1? Can a box contain negative volume?
I think so, if it is inside out! If you have a right-handed glove and you flip it inside out it becomes a left-handed glove.
If you have a right-handed volume and you flip it inside out it becomes a left-handed volume.
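Left-handed gloves are clearly not equal to right-handed gloves, so I think we should consider them opposites:
\textcolor{red}{\hat{x}}\wedge(-\textcolor{green}{\hat{y}})\wedge\textcolor{blue}{\hat{z}} = -(\textcolor{red}{\hat{x}}\wedge\textcolor{green}{\hat{y}}\wedge\textcolor{blue}{\hat{z}})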
By convention, most people agree that right-handed volume is positive and left-handed volume is negative, but this is an arbitrary choice. Really, the only important thing is that they are opposites.
Writing it out more formally:
Which is really just another way of saying that we can freely move coefficients around, even negative coefficients.
Remembering equation 1.1, which lets us swap the order of any two vectors at the cost of a minus sign:
This is great because it means that no matter what order the unit vectors appear in, no matter if they are positive or negative, we can always shuffle them around and insert minus signs to get them back into the standard \textcolor{red}{\hat{x}}\wedge\textcolor{green}{\hat{y}}\wedge\textcolor{blue}{\hat{z}} order.
That means every trivector is just a scaled version of \textcolor{red}{\hat{x}}\wedge\textcolor{green}{\hat{y}}\wedge\textcolor{blue}{\hat{z}}:
We might write a trivector using 9 numbers, but it only encapsulates a single degree of freedom: its volume.
So I guess we're back at something that has just one degree of freedom and feels more like a scalar. In fact you might even think of trivectors as pseudoscalars because they behave so similarly.
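As a sketch of that single degree of freedom: expanding three vectors in unit vectors and shuffling everything back into standard order leaves one coefficient, which works out to the 3x3 determinant of the inputs. The function name `wedge3` and the sample inputs are illustrative assumptions.

```python
def wedge3(a, b, c):
    """Coefficient of x^y^z in a^b^c, i.e. the signed volume (a 3x3 determinant)."""
    ax, ay, az = a
    bx, by, bz = b
    cx, cy, cz = c
    return (ax * (by * cz - bz * cy)
            - ay * (bx * cz - bz * cx)
            + az * (bx * cy - by * cx))

x, y, z = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
neg_y = (0.0, -1.0, 0.0)

print(wedge3(x, y, z))      # the right-handed unit cube
print(wedge3(x, neg_y, z))  # flipping one vector turns it inside out
print(wedge3(y, x, z))      # so does swapping two vectors
print(wedge3((2.0, 0.0, 0.0), (0.0, 3.0, 0.0), (0.0, 0.0, 4.0)))  # a 2 x 3 x 4 box
```

Nine numbers go in and one signed volume comes out.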
Remember that we only arrived at equation 1.1 because we decided that bivector spin direction was important—that clockwise and anticlockwise are opposites.
We could have chosen to ignore bivector spin direction, but if we had made that decision then to be self-consistent we would now be forced to ignore trivector handedness. Perhaps there is a useful algebraic system down that road, but it models a universe where inside-out and rightside-in are not distinguishable, which is not our universe.
Trivectors are called 3-Vectors or thought of as "Grade 3" because their magnitude carries units of volume.
In the same way that bivectors behave as pseudovectors, trivectors behave as pseudoscalars.
Recap and Higher Dimensional Concerns
Object | Grade | Magnitude | Written | Bases |
---|---|---|---|---|
Scalars | 0 | - | s | 1 |
Vectors | 1 | length | \textcolor{orange}{\vec{a}} | 3 |
Bivectors | 2 | area | \textcolor{orange}{\vec{a}}\wedge\textcolor{purple}{\vec{b}} | 3 |
Trivectors | 3 | volume | \textcolor{orange}{\vec{a}}\wedge\textcolor{purple}{\vec{b}}\wedge\textcolor{aqua}{\vec{c}} | 1 |
The concept of Grade extends out to any number of dimensions, so we might ask what about quadvectors?
Well, it is impossible to hold any 3D volume in a 2D universe, so it also seems impossible to hold any 4D volume in a 3D universe. I don't think there's any way to extend our pattern further without amending our definition of what space is.
Addition of Like Types
You've made it a long way! We've derived some new fundamental objects and now it's time we learn how they interact with each other.
We all know how to add scalars together: Just add the values.
Vectors can be added together in a simple way: Project the input vectors onto three orthogonal unit vectors then add up the components.
Bivectors behave a lot like vectors:
We could go ahead and develop explicit formulas for how to project a bivector on another bivector but there's no need. We're just trying to develop an intuition right now.
Trivectors are just fancy scalars. There's only one meaningful unit trivector so you just write your trivectors in terms of \textcolor{red}{\hat{x}}\wedge\textcolor{green}{\hat{y}}\wedge\textcolor{blue}{\hat{z}} and add the components you end up with.
So we've made it through the first hurdle. Our new algebra would not be very useful if we couldn't add like types!
Addition of Disparate Types
What does it mean to add a vector to a scalar?
I don't think there is any meaningful way to perform this computation. All you can do is just write out the sum as given.
This is reminiscent of the imaginary number i=\sqrt{-1}. We just accept that value in the direction of i is orthogonal to value in the direction of 1, so we write out complex numbers like:
a + bi
And we don't try to actually add a and b in any other way. We just accept that a on its own is a real number, bi on its own is an imaginary number, and a + bi is a new thing called a complex number.
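Applying this idea to our current situation, we can just say that objects with different grades are orthogonal to each other, so their sum cannot be simplified any further.
A quantity like
a + b\textcolor{red}{\hat{x}} + c(\textcolor{red}{\hat{x}}\wedge\textcolor{green}{\hat{y}}) + d(\textcolor{red}{\hat{x}}\wedge\textcolor{green}{\hat{y}}\wedge\textcolor{blue}{\hat{z}})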
Is perfectly valid, but it cannot be simplified into anything else. We just accept that a is a scalar, b\textcolor{red}{\hat{x}} is a vector, c(\textcolor{red}{\hat{x}}\wedge\textcolor{green}{\hat{y}}) is a bivector, d(\textcolor{red}{\hat{x}}\wedge\textcolor{green}{\hat{y}}\wedge\textcolor{blue}{\hat{z}}) is a trivector, and their sum is something different. We can call this new object a Geometric and write it as \mathbf{G}.
Writing it all out, we can see that in general a Geometric has eight orthogonal components:
One scalar component, three vector components, three bivector components, and one trivector component.
This might feel like a lot to deal with, but remember that most of the time most of those coefficients are zero.
If orthogonal objects don't mix, then the sum of any two geometrics, \mathbf{G_1} and \mathbf{G_2}, is just the pairwise sum of their scalar, vector, bivector, and trivector components. You'll never need this giant equation, but just for illustrative purposes:
So there we have it: adding disparate types is permitted, it just doesn't simplify!
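One way to picture a Geometric in code is as a record with eight slots that never mix. This is a minimal sketch; the class name `Geometric` follows the text, but the field names and the choice of a dataclass are assumptions made for illustration.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class Geometric:
    s: float = 0.0     # scalar part
    x: float = 0.0     # vector parts
    y: float = 0.0
    z: float = 0.0
    xy: float = 0.0    # bivector parts
    yz: float = 0.0
    xz: float = 0.0
    xyz: float = 0.0   # trivector part

    def __add__(self, other):
        # Different grades are orthogonal, so addition is purely slot by slot.
        return Geometric(*(getattr(self, f.name) + getattr(other, f.name)
                           for f in fields(self)))

g1 = Geometric(s=1.0, x=2.0)
g2 = Geometric(x=3.0, xy=4.0)
print(g1 + g2)
```

Most of the slots stay at zero most of the time, which is why the eight-term sum is rarely as unwieldy as it looks.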
Choosing The Vector Product
The products we've mentioned so far are the dot product and the extrude product. You may also know about the cross product.
None of these are invertible.
Consider the dot product:
Given:
There is no way to unambiguously recover \textcolor{orange}{a_1}, \textcolor{orange}{a_2}, \textcolor{orange}{a_3}:
All we can say is that \textcolor{orange}{\vec{a}} lies somewhere on the infinite plane which is perpendicular to \textcolor{purple}{\vec{b}} and passes through \textcolor{gray}{s} \textcolor{purple}{\vec{b}}. There are infinitely many valid solutions.
The extrude product gets closer but still fails:
Given:
No matter which \textcolor{orange}{\vec{a}} you choose, as long as it lies on the line z = \frac{1}{2}, x = 0, the resulting parallelogram will always lie in the \textcolor{green}{\hat{y}} \wedge \textcolor{blue}{\hat{z}} plane and its area will always equal \frac{1}{2}.
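A quick numerical sketch of both failures. It assumes \textcolor{purple}{\vec{b}} = \textcolor{green}{\hat{y}} and uses made-up candidates for \textcolor{orange}{\vec{a}}; the helper names are illustrative too.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def wedge(a, b):
    """Coefficients of a ^ b on (x^y, y^z, x^z)."""
    ax, ay, az = a
    bx, by, bz = b
    return (ax * by - ay * bx, ay * bz - az * by, ax * bz - az * bx)

b = (0.0, 1.0, 0.0)  # assumed for illustration: b is the unit vector y

# Two very different vectors with the same dot product against b.
a1, a2 = (5.0, 2.0, -1.0), (-3.0, 2.0, 9.0)
print(dot(a1, b), dot(a2, b))

# Two different vectors on the line z = 1/2, x = 0 with the same extrusion along b.
a3, a4 = (0.0, 2.0, 0.5), (0.0, -7.0, 0.5)
print(wedge(a3, b), wedge(a4, b))
```

Each product on its own throws away information: the dot product forgets everything perpendicular to \textcolor{purple}{\vec{b}}, and the extrude product forgets everything parallel to it.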
If the dot product constrains \textcolor{orange}{\vec{a}} to a plane, and the extrude product constrains \textcolor{orange}{\vec{a}} to a line which passes through that plane...
A Eureka moment is upon us!
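What if we define the Vector Product as:
\textcolor{orange}{\vec{a}}\textcolor{purple}{\vec{b}} = \textcolor{orange}{\vec{a}} \cdot \textcolor{purple}{\vec{b}} + \textcolor{orange}{\vec{a}} \wedge \textcolor{purple}{\vec{b}} = \textcolor{gray}{s} + \textcolor{gray}{\overset{\Rightarrow}{c}} \tag{4.0}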
Which we read as:
The vector product \textcolor{orange}{\vec{a}}\textcolor{purple}{\vec{b}} equals the dot product \textcolor{orange}{\vec{a}} \cdot \textcolor{purple}{\vec{b}} plus the extrude product \textcolor{orange}{\vec{a}} \wedge \textcolor{purple}{\vec{b}}
The output is a Geometric with a scalar component \textcolor{gray}{s} and a bivector component \textcolor{gray}{\overset{\Rightarrow}{c}}.
Visually, if the dot product constrains \textcolor{orange}{\vec{a}} to a plane and the extrude product constrains \textcolor{orange}{\vec{a}} to a line that passes through that plane, then the only value of \textcolor{orange}{\vec{a}} which satisfies both constraints lies at the point where the line intersects the plane
We write the vector product as \textcolor{orange}{\vec{a}}\textcolor{purple}{\vec{b}} in direct analogy to how we write scalar multiplication like 7x or 2c because it behaves so similarly.
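To see the invertibility claim in action, here is a compact Python sketch of the vector product. The representation is an assumption chosen for brevity, not something from this post: a Geometric is a dict from a bitmask (one bit per unit vector) to a coefficient, and the product only relies on the two rules that orthogonal unit vectors swap at the cost of a minus sign and that a unit vector times itself is 1.

```python
from itertools import product

X, Y, Z = 1, 2, 4  # one bit per unit vector; X | Y stands for the x^y blade

def swap_sign(p, q):
    """Sign picked up while sorting the unit vectors of blade p followed by blade q."""
    swaps, p = 0, p >> 1
    while p:
        swaps += bin(p & q).count("1")
        p >>= 1
    return -1.0 if swaps % 2 else 1.0

def gp(A, B):
    """Vector product of two Geometrics stored as {blade bitmask: coefficient}."""
    out = {}
    for (p, cp), (q, cq) in product(A.items(), B.items()):
        # Repeated unit vectors multiply to 1 and drop out, hence the XOR.
        out[p ^ q] = out.get(p ^ q, 0.0) + swap_sign(p, q) * cp * cq
    return out

def vec(x, y, z):
    return {X: x, Y: y, Z: z}

def inverse(v):
    """Inverse of a vector: the same vector divided by its squared length."""
    n2 = sum(c * c for c in v.values())
    return {blade: c / n2 for blade, c in v.items()}

a = vec(1.0, 2.0, 3.0)    # sample values, chosen only for illustration
b = vec(0.5, -1.0, 2.0)
ab = gp(a, b)                    # key 0 is the scalar; X|Y, Y|Z, X|Z are the bivector
recovered = gp(ab, inverse(b))   # multiply by b's inverse on the right
print(ab)
print(recovered)
```

Up to floating-point rounding, `recovered` holds the original components of `a` and nothing else: the scalar and bivector parts together pin the vector down uniquely.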
The Vector Product is very important so let's examine it a few ways.
Examining the Vector Product
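What happens when we plug in two of the same vector?
\textcolor{orange}{\vec{a}}\textcolor{orange}{\vec{a}} = \textcolor{orange}{\vec{a}} \cdot \textcolor{orange}{\vec{a}} + \textcolor{orange}{\vec{a}} \wedge \textcolor{orange}{\vec{a}} = \|\textcolor{orange}{\vec{a}}\|^2 + 0 = \|\textcolor{orange}{\vec{a}}\|^2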
And just like that we've learned what it means to take the square of a vector!
Any vector squared yields a scalar equal to the magnitude of that vector, squared.
Buckle up because we're really going to start moving quickly now.
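Consider the quantity \frac{\textcolor{orange}{\vec{a}}}{\textcolor{orange}{\vec{a}}^2}. What happens if you multiply it by \textcolor{orange}{\vec{a}}?
\textcolor{orange}{\vec{a}}\frac{\textcolor{orange}{\vec{a}}}{\textcolor{orange}{\vec{a}}^2} = \frac{\textcolor{orange}{\vec{a}}^2}{\textcolor{orange}{\vec{a}}^2} = 1
\textcolor{orange}{\vec{a}} times the quantity \frac{\textcolor{orange}{\vec{a}}}{\textcolor{orange}{\vec{a}}^2} equals 1. That means \frac{\textcolor{orange}{\vec{a}}}{\textcolor{orange}{\vec{a}}^2} is the inverse of \textcolor{orange}{\vec{a}}:
\textcolor{orange}{\vec{a}}^{-1} = \frac{\textcolor{orange}{\vec{a}}}{\textcolor{orange}{\vec{a}}^2} \tag{5.4}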
Just like scalars, vectors can be meaningfully inverted! Apparently the inverse of a vector is just that same vector, but scaled so that its new magnitude squared is the inverse of its old magnitude squared. Writing in a more suggestive way:
Qualitatively: The inverse of a big vector is a small vector that points in the same direction.
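A small numerical sketch of the inverse, with assumed helper names and a made-up sample vector. Because the inverse is parallel to the original, the vector product of the two reduces to its dot-product part.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def wedge(a, b):
    """Coefficients of a ^ b on (x^y, y^z, x^z)."""
    ax, ay, az = a
    bx, by, bz = b
    return (ax * by - ay * bx, ay * bz - az * by, ax * bz - az * bx)

def inverse(a):
    """The same vector divided by its squared length."""
    n2 = dot(a, a)
    return tuple(c / n2 for c in a)

a = (3.0, 0.0, 4.0)   # length 5, chosen only for illustration
a_inv = inverse(a)

print(a_inv)
print(math.sqrt(dot(a_inv, a_inv)))  # length of the inverse
print(dot(a, a_inv))                 # scalar part of a times its inverse
print(wedge(a, a_inv))               # bivector part of a times its inverse
```

A vector of length 5 gets an inverse of length one fifth pointing the same way, and their product has a scalar part of 1 with no bivector part left over (up to rounding).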
What happens if we swap the order of the input vectors?
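The dot product is commutative:
\textcolor{orange}{\vec{a}} \cdot \textcolor{purple}{\vec{b}} = \textcolor{purple}{\vec{b}} \cdot \textcolor{orange}{\vec{a}}
And we know from equation 1.1 that the extrude product is anticommutative:
\textcolor{orange}{\vec{a}} \wedge \textcolor{purple}{\vec{b}} = -\textcolor{purple}{\vec{b}} \wedge \textcolor{orange}{\vec{a}}
So the vector product \textcolor{purple}{\vec{b}}\textcolor{orange}{\vec{a}}:
\textcolor{purple}{\vec{b}}\textcolor{orange}{\vec{a}} = \textcolor{purple}{\vec{b}} \cdot \textcolor{orange}{\vec{a}} + \textcolor{purple}{\vec{b}} \wedge \textcolor{orange}{\vec{a}}
Can be rewritten as:
\textcolor{purple}{\vec{b}}\textcolor{orange}{\vec{a}} = \textcolor{orange}{\vec{a}} \cdot \textcolor{purple}{\vec{b}} - \textcolor{orange}{\vec{a}} \wedge \textcolor{purple}{\vec{b}} \tag{5.7}
Which is more similar to our original definition:
\textcolor{orange}{\vec{a}}\textcolor{purple}{\vec{b}} = \textcolor{orange}{\vec{a}} \cdot \textcolor{purple}{\vec{b}} + \textcolor{orange}{\vec{a}} \wedge \textcolor{purple}{\vec{b}} \tag{4.0}
Just for practice, let's add equations 5.7 and 4.0 to get:
\textcolor{orange}{\vec{a}}\textcolor{purple}{\vec{b}} + \textcolor{purple}{\vec{b}}\textcolor{orange}{\vec{a}} = 2\,\textcolor{orange}{\vec{a}} \cdot \textcolor{purple}{\vec{b}}
Dividing both sides by 2 gives us an explicit formula for the dot product in terms of the vector product!
\textcolor{orange}{\vec{a}} \cdot \textcolor{purple}{\vec{b}} = \frac{\textcolor{orange}{\vec{a}}\textcolor{purple}{\vec{b}} + \textcolor{purple}{\vec{b}}\textcolor{orange}{\vec{a}}}{2}
If we subtract equation 5.7 from equation 4.0 we get:
\textcolor{orange}{\vec{a}}\textcolor{purple}{\vec{b}} - \textcolor{purple}{\vec{b}}\textcolor{orange}{\vec{a}} = 2\,\textcolor{orange}{\vec{a}} \wedge \textcolor{purple}{\vec{b}}
Which, after dividing both sides by 2, gives an explicit formula for the extrude product in terms of the vector product!
\textcolor{orange}{\vec{a}} \wedge \textcolor{purple}{\vec{b}} = \frac{\textcolor{orange}{\vec{a}}\textcolor{purple}{\vec{b}} - \textcolor{purple}{\vec{b}}\textcolor{orange}{\vec{a}}}{2}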
These two equations aren't super useful on their own, it's just nice to get our hands dirty and do some algebraic manipulation with our new tools.
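If you'd like to check the half-sum and half-difference numerically, here is a minimal sketch. It stores the product of two vectors as a pair (scalar part, bivector coefficients); that layout, the function names, and the sample vectors are all assumptions for illustration.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def wedge(a, b):
    """Coefficients of a ^ b on (x^y, y^z, x^z)."""
    ax, ay, az = a
    bx, by, bz = b
    return (ax * by - ay * bx, ay * bz - az * by, ax * bz - az * bx)

def vector_product(a, b):
    """Vector product of two vectors as (scalar part, bivector part)."""
    return dot(a, b), wedge(a, b)

def combine(p, q, sign):
    """(p + sign * q) / 2, applied to the scalar and bivector parts separately."""
    scalar = (p[0] + sign * q[0]) / 2
    bivector = tuple((u + sign * v) / 2 for u, v in zip(p[1], q[1]))
    return scalar, bivector

a = (1.0, 2.0, 3.0)   # sample values, chosen only for illustration
b = (4.0, 5.0, 6.0)
ab, ba = vector_product(a, b), vector_product(b, a)

print(combine(ab, ba, +1))  # half the sum: only the dot product survives
print(combine(ab, ba, -1))  # half the difference: only the extrude product survives
```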
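What if we plug in unit vectors?
\textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}} = \textcolor{red}{\hat{x}} \cdot \textcolor{green}{\hat{y}} + \textcolor{red}{\hat{x}} \wedge \textcolor{green}{\hat{y}}
The dot product of two orthogonal vectors is always 0, so:
\textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}} = \textcolor{red}{\hat{x}} \wedge \textcolor{green}{\hat{y}} \tag{5.12}
This is an incredibly convenient result! It means that we can stop writing our unit bivectors as \textcolor{red}{\hat{x}} \wedge \textcolor{green}{\hat{y}} and instead start writing them as \textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}} which is much more compact.
Given that \textcolor{red}{\hat{x}} \wedge \textcolor{green}{\hat{y}} = -\textcolor{green}{\hat{y}} \wedge \textcolor{red}{\hat{x}}, it is clear that
\textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}} = -\textcolor{green}{\hat{y}}\textcolor{red}{\hat{x}}
So our freedom to swap the order of any two adjacent unit vectors at the cost of a minus sign still holds!
Repeating the exercise for other unit vectors gives us the full set:
\textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}} = -\textcolor{green}{\hat{y}}\textcolor{red}{\hat{x}} \qquad \textcolor{green}{\hat{y}}\textcolor{blue}{\hat{z}} = -\textcolor{blue}{\hat{z}}\textcolor{green}{\hat{y}} \qquad \textcolor{red}{\hat{x}}\textcolor{blue}{\hat{z}} = -\textcolor{blue}{\hat{z}}\textcolor{red}{\hat{x}} \tag{5.15}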
With these identities we'll be able to easily rearrange almost any equation we're given!
We created equation 4.0 to give us a way to multiply two vectors together. Can it cope if we start throwing bivectors at it? Let's try multiplying the bivector \textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}} times the unit vector \textcolor{blue}{\hat{z}}.
Well, what does it mean to take the dot product between the \textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}} plane and the \textcolor{blue}{\hat{z}} vector? Normally we visualize the dot product by projecting one argument onto the other, and in this case the \textcolor{blue}{\hat{z}} vector is completely orthogonal to the \textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}} plane. It casts no "shadow".
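So even though we haven't developed a formula for how to compute the dot product of a bivector with a vector, I think it makes sense to say that:
(\textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}}) \cdot \textcolor{blue}{\hat{z}} = 0
That leaves us with
\textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}}\textcolor{blue}{\hat{z}} = (\textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}}) \wedge \textcolor{blue}{\hat{z}}
Which is just ordinary extrusion of a bivector along a vector. Substituting in equation 5.12 we see that:
\textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}}\textcolor{blue}{\hat{z}} = \textcolor{red}{\hat{x}} \wedge \textcolor{green}{\hat{y}} \wedge \textcolor{blue}{\hat{z}}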
Another huge notational win! The unit trivector can be written as \textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}}\textcolor{blue}{\hat{z}}!
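What about two of the same unit vector?
\textcolor{red}{\hat{x}}\textcolor{red}{\hat{x}} = \textcolor{red}{\hat{x}} \cdot \textcolor{red}{\hat{x}} + \textcolor{red}{\hat{x}} \wedge \textcolor{red}{\hat{x}} = 1 + 0 = 1
Whoa. Repeating for each unit vector, we see that:
\textcolor{red}{\hat{x}}^2 = \textcolor{green}{\hat{y}}^2 = \textcolor{blue}{\hat{z}}^2 = 1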
In fact, any unit-length vector is its own inverse! This combined with 5.15 will help us simplify complex expressions!
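These unit-vector identities are enough to mechanize the whole product. The following Python sketch is one possible encoding, assumed here purely for illustration: each unit vector gets one bit, a product of unit vectors is a bitmask, and the sign is found by counting how many adjacent swaps it takes to sort the unit vectors into alphabetical order.

```python
from itertools import product

X, Y, Z = 1, 2, 4  # one bit per unit vector; X | Y stands for the blade xy

def swap_sign(p, q):
    """Sign picked up while sorting the unit vectors of blade p followed by blade q."""
    swaps, p = 0, p >> 1
    while p:
        swaps += bin(p & q).count("1")
        p >>= 1
    return -1.0 if swaps % 2 else 1.0

def gp(A, B):
    """Vector product of two Geometrics stored as {blade bitmask: coefficient}."""
    out = {}
    for (p, cp), (q, cq) in product(A.items(), B.items()):
        # A unit vector times itself is 1, so shared bits cancel (XOR).
        out[p ^ q] = out.get(p ^ q, 0.0) + swap_sign(p, q) * cp * cq
    return out

x, y, z = {X: 1.0}, {Y: 1.0}, {Z: 1.0}

print(gp(x, y))         # the unit bivector xy
print(gp(y, x))         # same blade, opposite sign
print(gp(x, x))         # key 0 is the scalar part
print(gp(gp(x, y), z))  # the unit trivector xyz
```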
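Now that we can simplify complicated expressions, let's make some! What happens if we try to square the \textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}} unit bivector?
(\textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}})^2 = \textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}}\textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}}
Here I'm just writing out what it means to square something; I haven't plugged it in to our special vector product formula. We know from 5.15 that we can swap the last two axes and bring in a minus sign:
= \textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}}(-\textcolor{green}{\hat{y}}\textcolor{red}{\hat{x}})
And we know from equation 1.0 that we can freely move coefficients around:
= -\textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}}\textcolor{green}{\hat{y}}\textcolor{red}{\hat{x}}
The two \textcolor{green}{\hat{y}}'s cancel:
= -\textcolor{red}{\hat{x}}\textcolor{red}{\hat{x}}
And the two \textcolor{red}{\hat{x}}'s cancel:
= -1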
Whoa.
Apparently, the unit bivector that lives in the \textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}} plane squares to negative one! Without resorting to anything "imaginary", we've stumbled upon an object with the property we normally assign to the imaginary number i! Maybe our new system of algebra can do some of the things that we normally rely on complex numbers to do, like 2D rotations?
This property is so remarkable to stumble upon that I think we should feel free to refer to the \textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}} unit bivector as just \overset{\Rightarrow}{i} for short!
Instead of thinking of complex numbers as a + bi maybe we can write them as a + b\overset{\Rightarrow}{i} or a + b\textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}}. As an exercise for the reader: can you prove that this system behaves identically to the complex numbers we normally learn about?
But there was nothing special about the \textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}} unit bivector that gave us this strange property. Does it hold with the \textcolor{green}{\hat{y}}\textcolor{blue}{\hat{z}} unit bivector?
This is wild! We apparently have two "imaginary numbers" and they are orthogonal to each other! We can't just call this one \overset{\Rightarrow}{i} as well. Maybe we'll call this one \overset{\Rightarrow}{j}?
For completeness' sake let's try the last one:
Which is unsurprising the third time around, I guess. This plane is orthogonal to both of the previous planes so let's call it \overset{\Rightarrow}{k}!
Apparently our new algebraic system based on Geometrics permits three mutually orthogonal "imaginary numbers" corresponding to the three unit bivectors!
If you've ever studied quaternions you might remember the fundamental equation that defines them:
i^2 = j^2 = k^2 = ijk = -1
And how strikingly similar our current situation seems to be!
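Now that we have bivector definitions for \overset{\Rightarrow}{i}, \overset{\Rightarrow}{j}, and \overset{\Rightarrow}{k}, let's see if that final, crucial identity holds:
\overset{\Rightarrow}{i}\overset{\Rightarrow}{j}\overset{\Rightarrow}{k} = (\textcolor{red}{\hat{x}}\textcolor{green}{\hat{y}})(\textcolor{green}{\hat{y}}\textcolor{blue}{\hat{z}})(\textcolor{red}{\hat{x}}\textcolor{blue}{\hat{z}}) = \textcolor{red}{\hat{x}}\textcolor{blue}{\hat{z}}\textcolor{red}{\hat{x}}\textcolor{blue}{\hat{z}} = (\textcolor{red}{\hat{x}}\textcolor{blue}{\hat{z}})^2 = -1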
Fantastic! Our new system of algebra conforms to the same fundamental rules that govern quaternions!
Perhaps our new algebra will be useful for some of the things that quaternions are useful for, like 3D rotations!
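The quaternion-style identities can also be checked mechanically. This sketch assumes a bitmask encoding (one bit per unit vector, a dict from blade to coefficient) that is not part of the post; it only uses the swap rule and the fact that unit vectors square to 1.

```python
from itertools import product

X, Y, Z = 1, 2, 4  # one bit per unit vector; X | Y stands for the blade xy

def swap_sign(p, q):
    """Sign picked up while sorting the unit vectors of blade p followed by blade q."""
    swaps, p = 0, p >> 1
    while p:
        swaps += bin(p & q).count("1")
        p >>= 1
    return -1.0 if swaps % 2 else 1.0

def gp(A, B):
    """Vector product of two Geometrics stored as {blade bitmask: coefficient}."""
    out = {}
    for (p, cp), (q, cq) in product(A.items(), B.items()):
        out[p ^ q] = out.get(p ^ q, 0.0) + swap_sign(p, q) * cp * cq
    return out

i = {X | Y: 1.0}   # the xy unit bivector
j = {Y | Z: 1.0}   # the yz unit bivector
k = {X | Z: 1.0}   # the xz unit bivector

print(gp(i, i), gp(j, j), gp(k, k))  # each square; key 0 is the scalar part
print(gp(gp(i, j), k))               # the triple product ijk
print(gp(i, j))                      # ij lands on k
```

With the alphabetical spelling of the basis bivectors, the pairwise products line up with the quaternion table as well: `gp(i, j)` comes out as `k`.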
A Quick Warmup (Start here if you just need a refresher)
We're now armed with a robust set of identities, tools for simplifying complexity, and even a few hints at what directions we might want to explore. It's time to see what our newly invented form of algebra can do!
First let's do a quick practice problem just to get a taste. We're going to simplify:
We're just dealing with normal algebra here. Use the FOIL method:
Let's simplify the second term because \textcolor{red}{\hat{x}} is its own inverse:
On the final term let's swap the middle \textcolor{red}{\hat{x}} and \textcolor{green}{\hat{y}} and add our minus sign:
Remembering that \textcolor{green}{\hat{y}}^2 = 1:
Then combining like terms we see:
And that's what it feels like to use our new algebra! We did all that work to define the vector product and derive a bunch of identities, but most of our actual manipulations boil down to just swapping vector order and adding minus signs, or cancelling out squared unit vectors.
Other than these new rules, the system we've created behaves just like normal algebra!
Vectors as Transformations
In Matrix Algebra we often employ transformations that look like:
And when working with quaternions we can rotate vectors like:
This form of equation is sometimes called the sandwich product for obvious reasons. If you've never seen it before, just know that it pops up everywhere.
So it isn't a huge leap to ask: What happens when we use a vector as a transformation? We can invert them and multiply them now, so let's try!
The formula for applying a transformation (sandwich) is:
\textcolor{brown}{\vec{t}}\,\textcolor{orange}{\vec{a}}\,\textcolor{brown}{\vec{t}}^{-1}
Our input vector will be something simple like:
The vector we'll transform it by will also be simple; let's use:
We already know how to find the inverse vector using equation 5.4:
Plugging it all in:
Let's pull out the constants .5 and 2 to the very front and let them cancel:
We can distribute the leftmost \textcolor{blue}{\hat{z}}, taking care to keep it on the left of any existing unit vectors:
Now we can incorporate that rightmost \textcolor{blue}{\hat{z}}:
It looks like we reflected \textcolor{orange}{\vec{a}} across \textcolor{brown}{\vec{t}} as though it were a mirror!
Let's verify that with a more 3D example, this time using:
Plugging into our transformation formula we see:
Which confirms our finding from the simpler, 2D case!
This is a refreshing change in perspective. Ordinarily we think of reflections happening only in mirrors which are planes, but the tools we've derived here let us compute reflections across vectors, without having to define any sort of plane!
So it turns out that if we use them in a sandwich product, vectors are more than just values—they are transformations.
When used as transformations, vectors act as mirrors.
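Here is a runnable sketch of the sandwich product acting as a mirror. The dict-of-bitmasks representation, the helper names, and the sample vectors are assumptions made for illustration; they are not necessarily the values used in the worked example above.

```python
from itertools import product

X, Y, Z = 1, 2, 4  # one bit per unit vector

def swap_sign(p, q):
    """Sign picked up while sorting the unit vectors of blade p followed by blade q."""
    swaps, p = 0, p >> 1
    while p:
        swaps += bin(p & q).count("1")
        p >>= 1
    return -1.0 if swaps % 2 else 1.0

def gp(A, B):
    """Vector product of two Geometrics stored as {blade bitmask: coefficient}."""
    out = {}
    for (p, cp), (q, cq) in product(A.items(), B.items()):
        out[p ^ q] = out.get(p ^ q, 0.0) + swap_sign(p, q) * cp * cq
    return out

def vec(x, y, z):
    return {X: x, Y: y, Z: z}

def inverse(v):
    """Inverse of a vector: the same vector divided by its squared length."""
    n2 = sum(c * c for c in v.values())
    return {blade: c / n2 for blade, c in v.items()}

def sandwich(t, a):
    """t a t^-1: reflect a across the line through t."""
    return gp(gp(t, a), inverse(t))

def components(v):
    """(x, y, z) coefficients of a Geometric, ignoring other grades."""
    return tuple(v.get(blade, 0.0) for blade in (X, Y, Z))

a = vec(1.0, 2.0, 3.0)   # sample input, chosen only for illustration
t = vec(0.0, 0.0, 2.0)   # mirror along z; its length drops out of the result
reflected = sandwich(t, a)
print(components(reflected))
```

The component along \textcolor{brown}{\vec{t}} survives and everything perpendicular to it flips sign, regardless of how long \textcolor{brown}{\vec{t}} is.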
The title of this post is What is the Inverse of a Vector? The answer is that the inverse of a vector is the missing piece that lets us view vectors as actions, not just objects. Vectors and their inverses reflect.
Transforming Twice
This is the last section with math in it and the most important to remember.
What happens if we reflect \textcolor{orange}{\vec{a}} around \textcolor{brown}{\vec{t}} and then reflect the result around \textcolor{#516fae}{\vec{u}}?
I'm finding this a little hard to read so I'll temporarily drop the arrow signs, but we're still talking about vectors:
And we're just exploring here so let's arbitrarily define:
So, plugging it all in:
We reflected our input vector across \textcolor{green}{\hat{y}} and then the result across \textcolor{blue}{\hat{z}}.
But wait a second, if you reflect a right-handed glove you get a left-handed glove. If you reflect that one more time you're back to a right-handed glove, just rotated. Two reflections in series are equivalent to a single rotation!
So what rotation have we performed here?
We have rotated our input vector \textcolor{orange}{\vec{a}} by 2 \theta in the \textcolor{brown}{\vec{t}}\wedge\textcolor{#516fae}{\vec{u}} plane, where \theta is the angle between \textcolor{brown}{\vec{t}} and \textcolor{#516fae}{\vec{u}}. In this case \theta = 90 \degree.
Somehow our simple algebraic rules of swapping and cancelling have produced an operation that normally involves trigonometry or rotation matrices or quaternions! Using just two vectors and a sandwich product, we have implemented 3D rotation!
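We can sanity-check the 2\theta claim numerically. The sketch below is an assumption-laden illustration, not this post's method: instead of the geometric product it uses the classical closed form that the sandwich \vec{t}\vec{a}\vec{t}^{-1} expands to for plain vectors, namely 2(\vec{a}\cdot\vec{t})\vec{t}/|\vec{t}|^2 - \vec{a}, and compares two reflections against an ordinary trigonometric rotation. The function names and sample vectors are hypothetical.

```python
import math


def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))


def reflect(a, t):
    """Reflect a across the line along t: 2 (a.t) t / |t|^2 - a."""
    k = 2.0 * dot(a, t) / dot(t, t)
    return tuple(k * ti - ai for ai, ti in zip(a, t))


def rotate_in_xy(a, angle):
    """Ordinary rotation by `angle` radians in the xy plane, for comparison."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = a
    return (c * x - s * y, s * x + c * y, z)


theta = math.radians(30.0)                      # angle between t and u
t = (1.0, 0.0, 0.0)
u = (math.cos(theta), math.sin(theta), 0.0)
a = (0.3, -1.2, 0.7)                            # arbitrary input vector

twice = reflect(reflect(a, t), u)               # reflect across t, then across u
rotated = rotate_in_xy(a, 2.0 * theta)          # rotate by 2 * theta from t toward u

print(twice)
print(rotated)
```

The two printed vectors agree up to floating-point rounding. The same `reflect` function applied across \hat{y} and then \hat{z} sends (1, 2, 3) to (1, -2, -3), a 180° turn in the yz plane.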
Rearranging equation 8.0 stresses the combined transformation:
The product of two vectors can therefore be thought of as a rotor, because it rotates things.
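The regrouping behind that statement can be sketched in three lines. The label \vec{a}'' for the twice-reflected vector is an assumption made here for readability; the only fact used beyond what we already have is that the inverse of a product reverses the order of its factors, because (\vec{u}\vec{t})(\vec{t}^{-1}\vec{u}^{-1}) = \vec{u}(\vec{t}\vec{t}^{-1})\vec{u}^{-1} = 1.

```latex
\begin{aligned}
\vec{a}'' &= \vec{u}\,(\vec{t}\,\vec{a}\,\vec{t}^{-1})\,\vec{u}^{-1} \\
          &= (\vec{u}\vec{t})\,\vec{a}\,(\vec{t}^{-1}\vec{u}^{-1}) \\
          &= (\vec{u}\vec{t})\,\vec{a}\,(\vec{u}\vec{t})^{-1}
\end{aligned}
```

So the pair of mirrors collapses into one object, \vec{u}\vec{t}, wrapped around the input in a single sandwich.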
The rotor \textcolor{#516fae}{\vec{u}}\textcolor{brown}{\vec{t}} will rotate the input by 2\theta in the \textcolor{#516fae}{\vec{u}}\wedge\textcolor{brown}{\vec{t}} plane, where \theta is the angle between \textcolor{#516fae}{\vec{u}} and \textcolor{brown}{\vec{t}}.
The extra factor of 2 comes from the fact that the rotor is applied twice: once on the left and once, as its inverse, on the right in the sandwich product.
Every rotor has a bivector part and a scalar part, and the scalar part may be zero. That means every bivector is also a rotor.
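To see the scalar and bivector parts of a rotor directly, here is a small Python sketch. As before, the representation (a dictionary from blade bitmask to coefficient), the helper names and the sample vectors are all assumptions for illustration and are not taken from this post. It builds the rotor as a product of two vectors, prints its parts, and applies it as a sandwich.

```python
import math


def reorder_sign(a, b):
    """Sign picked up when sorting the unit vectors of blade a followed by blade b."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1.0 if swaps % 2 else 1.0


def gp(A, B):
    """Geometric product of multivectors stored as {blade_bitmask: coefficient}."""
    out = {}
    for blade_a, coef_a in A.items():
        for blade_b, coef_b in B.items():
            blade = blade_a ^ blade_b
            sign = reorder_sign(blade_a, blade_b)
            out[blade] = out.get(blade, 0.0) + sign * coef_a * coef_b
    return out


def vector(x, y, z):
    # bit 1 = x-hat, bit 2 = y-hat, bit 4 = z-hat; bitmask 3 is the xy bivector
    return {1: float(x), 2: float(y), 4: float(z)}


def inverse(M):
    """Inverse of a vector, or of a scalar-plus-bivector rotor, in 3D.

    Reversing the order of the unit vectors flips the sign of bivector
    terms, and M times its reverse is the sum of squared coefficients.
    """
    norm_sq = sum(c * c for c in M.values())
    return {
        blade: (-c if bin(blade).count("1") == 2 else c) / norm_sq
        for blade, c in M.items()
    }


def sandwich(R, a):
    """R a R^-1."""
    return gp(gp(R, a), inverse(R))


def components(m):
    return tuple(m.get(blade, 0.0) for blade in (1, 2, 4))


theta = math.radians(30.0)
t = vector(1, 0, 0)
u = vector(math.cos(theta), math.sin(theta), 0)
a = vector(1, 0, 0.5)

R = gp(u, t)                                   # the rotor u t
print("scalar part:", R.get(0, 0.0))           # equals u . t = cos(theta)
print("xy bivector part:", R.get(3, 0.0))      # equals -sin(theta)
print(components(sandwich(R, a)))              # a rotated by 2 * theta in the xy plane
print(components(sandwich(u, sandwich(t, a)))) # same result from two reflections
```

A pure bivector such as `{3: 1.0}` (the unit \hat{x}\hat{y} area) also works as the first argument of `sandwich`; it has no scalar part and turns its input by 180° in the xy plane.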
If vectors perform reflection, bivectors perform rotation. Can you figure out for yourself what action trivectors perform?
A New Foundation for Physics
Every new physics student stumbles when learning about angular momentum and torque.
The world of linear momentum and linear forces is relatively intuitive. It is self-consistent in the sense that vectors can be added together meaningfully, provided you use scalars where required to make units compatible.
Then you learn about angular momentum and torque and you have to introduce a Bizarro World version of vectors called "pseudovectors" or "axial vectors" which are compatible with each other but are incompatible with regular vectors. It is never correct to add a pseudovector like angular momentum to a regular vector like linear momentum and you just have to memorize which ones belong to which camp.
Even worse, some physics curriculums don't even call out the difference between axial vectors and regular vectors; they just assume you will pick up on their patterns implicitly. Many students never pick up this intuition and consequently believe themselves to be "bad at physics".
Pseudovectors as we typically teach them are actually bivectors. They can be written using 3 numbers, but their true nature is tied to the fact that they are oriented pieces of area, not length.
If we teach students about bivectors and if we write them using their full notation like:
It is obvious from the beginning that pseudovectors and vectors have similar structure but are incompatible types.
The equation for torque that we're taught in high school physics is:
But it is equally correct and far simpler to teach:
Which means torque is a bivector living in the plane defined by r and F, not a pseudovector perpendicular to them. There is no need to introduce the cross product or the right-hand rule at all when teaching rotational dynamics.
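To make the comparison concrete, here is a short Python sketch, assuming the bivector form of torque is the wedge product of r and F. The function names, the choice of xy, yz and zx as the three unit areas, and the sample numbers are assumptions for illustration. It computes the three bivector coefficients and sets them next to the familiar cross product.

```python
def wedge(r, f):
    """Bivector r ^ F as coefficients on the xy, yz and zx unit areas."""
    rx, ry, rz = r
    fx, fy, fz = f
    return {
        "xy": rx * fy - ry * fx,
        "yz": ry * fz - rz * fy,
        "zx": rz * fx - rx * fz,  # zx rather than xz, so the signs line up with the cross product
    }


def cross(r, f):
    """The usual cross product r x F, for comparison."""
    rx, ry, rz = r
    fx, fy, fz = f
    return (ry * fz - rz * fy, rz * fx - rx * fz, rx * fy - ry * fx)


r = (1.0, 0.0, 0.0)       # lever arm along x (hypothetical numbers)
force = (0.0, 2.0, 0.0)   # force along y

torque = wedge(r, force)
print(torque)             # all of the torque sits in the xy plane
print(cross(r, force))    # same number, but attached to the z axis instead
```

The three numbers are the same in both pictures. The difference is bookkeeping: the wedge product files the torque under the plane that r and F actually span, while the cross product files it under an axis perpendicular to both.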
All bivectors are also rotors, so this new definition is self-consistent in the colloquial sense that torque, being a rotor, rotates things. Angular velocity, also a rotor, is a measure of how much rotation is occurring!
A similar problem occurs when students take the conceptual leap from electric fields to magnetic fields. The electric field is a vector field which lets them build intuition, but then the magnetic field is a pseudovector field which just tosses their intuition out the window.
The magnetic field, like torque and every other pseudovector in physics, is better thought of as a bivector.
If you represent the electric field with vectors and the magnetic field with bivectors, then they can naturally coexist in a single Geometric object that represents the combined electromagnetic field.
If you pull on that thread, you can write out a single Geometric equation:
Which, when broken down into its scalar, vector, bivector, and trivector components, corresponds to Maxwell's four equations! All of electricity and magnetism in one single equation!
The word "Pseudoscalar" also comes up in physics to describe scalar-like quantities that change their sign under reflection. The basic examples we learn are Magnetic Charge and Magnetic Flux, but the concept crops up later in fluid dynamics as the Stream Function and in quantum mechanics as Helicity. All of these are in fact just trivectors. In a mirror, trivectors behave differently than scalars do because mirrors change handedness.
Speaking of quantum mechanics, the defining traits of the Pauli Spin Matrices are that they all square to one and they are mutually orthogonal. They are just the unit vectors \textcolor{red}{\hat{x}}, \textcolor{green}{\hat{y}}, \textcolor{blue}{\hat{z}} dressed up as matrices!
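This is easy to check by hand, or with a few lines of Python. The sketch below uses the standard 2x2 Pauli matrices written as nested lists of complex numbers; the helper names are hypothetical. It tests the same two rules we have been using for unit vectors: each one squares to 1, and swapping two different ones flips the sign.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]


IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

paulis = {"x": SIGMA_X, "y": SIGMA_Y, "z": SIGMA_Z}

for name, s in paulis.items():
    # Rule 1: every unit vector squares to 1
    print(name, "squares to identity:", matmul(s, s) == IDENTITY)

for p, q in (("x", "y"), ("y", "z"), ("z", "x")):
    # Rule 2: swapping two different unit vectors flips the sign, so ab + ba = 0
    swapped_sum = add(matmul(paulis[p], paulis[q]), matmul(paulis[q], paulis[p]))
    print(p + q, "anticommute:", swapped_sum == ZERO)

# The product of two different ones plays the role of a unit bivector: it squares to -1
xy = matmul(SIGMA_X, SIGMA_Y)
print("(xy)^2 is minus identity:", matmul(xy, xy) == [[-1, 0], [0, -1]])
```

Every check prints True, so the matrices obey exactly the multiplication rules we wrote down for \hat{x}, \hat{y} and \hat{z}.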
The algebra we have learned today is a better, simpler model of the universe than the one we teach students in school. It is easier to learn and easier to teach. It extends beautifully to the concept of 4D spacetime, which makes special relativity far easier to understand.
The normal approach to physics notation is to learn standard 3D vector notation, then patch over it as needed to reconcile the physical reality with the limits of the notation:
- Complex Numbers
- Quaternions
- Pauli Spinors
- Minkowski 4-Vectors
- Dirac Spinors
All of these are band-aids over a fundamentally limited system. The alternative is to teach the Geometric Algebra we've started learning today, which encompasses the dynamics of all of these lesser systems.
Conclusion
So, what is the inverse of a vector?
It turns out that inverting a vector on its own isn't well defined. We have to invert a vector with respect to some kind of multiplication, and we are free to invent new kinds of multiplication!
Beyond that we can invent entire new structures: bivectors and trivectors, and we get to choose what multiplication and inversion should mean when dealing with these new objects!
The math we are taught in school is not handed down by the Gods themselves; it is invented in small pieces by human beings. It changes over time and supports regional dialects just like spoken language.
We can contribute to it. We can change it.
Many people, including many engineers and scientists, view math as just rote memorization of formulas or rigid steps to complete some task. But that's like taking one semester of Spanish and viewing the entire Spanish language as nothing but rote memorization of vocabulary and steps to complete a conversation. It completely misses the point.
Math is a living language and learning how to use it is a form of enculturation. Anyone can create new math in the same way that anyone can write a new poem or a new short story.
I firmly believe that in 100 years, Geometric Algebra will be the dominant way of introducing students to mathematical physics. In the same way that Newton's notation for Calculus is no longer the dominant one, or that Maxwell's actual equations for Electromagnetism have been replaced by Heaviside's, textbooks will change because a better system has come along.
Unlike those examples, the change from Gibbs's vector notation to Geometric Algebra is something we're living through right now. New YouTube videos and textbooks are coming out all the time that incorporate this superior approach. Pick one up and be part of the revolution!
Geometric Algebra Cheat Sheet
Contact me
If you would like to tell me in excruciating detail exactly what I said wrong in this post, I'm on Twitter at @mferraro89. Email me at mattferraro.dev@gmail.com if you prefer to shame me privately.
Errata
The original version of the post incorrectly claimed that a bivector encapsulates 5 degrees of freedom. That is wrong. A bivector contains 3 degrees of freedom. Thank you to Sandrea for the helpful correction and detailed explanation of why the vector product is still invertible despite carrying only 4 degrees of freedom.