# A Gentle Introduction to Tensors and Monoids

There are at least three distinct conceptual roles which vectors and vector spaces play in mathematics:

• A vector is a column of numbers. This is the way vector spaces appear in quantum mechanics, sections of line bundles, elementary linear algebra, etc.
• A vector is a weighted direction in space. Vector spaces of this kind are often the infinitesimal data of some global structure, such as tangent spaces to manifolds, Lie algebras of Lie groups, and so on.
• A vector is an element of a module over the base ring/field.

What is a module? The basic idea is that a module $V$ is an object equipped with an action by a monoid $A$. (This is closely related to the concept of a representation of a group.)

Let’s take an example that you’re familiar with, vector spaces, and generalize it to get some intuition for working with modules.

Fields $\hookrightarrow$ Rings

If $K$ is a field, then a $K$-vector space (a vector space over $K$) $\equiv$ $K$-module.

$K$-Vector Spaces $\hookrightarrow$ $K$-modules

For the categorically minded: a familiar example of a module is a vector space $V$ over a field $K$; this is a module over $K$ in the category of abelian groups: every element of $K$ acts on the vector space by a multiplication of vectors, and this action respects the addition of vectors.

Tensors play analogous conceptual roles.

• A tensor is a multidimensional array of numbers.
• A tensor is multiple weighted directions in space.
• A tensor is an element of a free monoid over the base ring. (If you don’t know what a free monoid is, don’t worry. I’ll go over them later in this post.)

An explanation of tensors as type constructors is postscript for fellow Haskell enthusiasts.

#### Gaining Intuition Through the Classical Approach

The ‘classical’ approach to tensor theory views tensors as multidimensional arrays that are $n$D generalizations of 0D scalars, 1D vectors, and 2D matrices. The ‘components’ of the tensor are the indices of the array.

We can then visualize a higher-order analogue of matrix rows and columns as fiber. A fiber is defined by fixing every index but one, where a slice is defined by fixing two indices.

An algebra is a bilinear product distributive over vector addition (no commutatitvity, associativity, or identity required). In this article, I will use the word “algebra” to mean “associative algebra with unit.”

#### Gluing Stuff Together to Make Other Stuff

A monoid is a set with a binary operation (let’s call it multiplication and denote it as infix *). We demand that this operation:

• has a special identity element $1$ such that $1*a=a*1=a$
• is associative, $(a * b) * c = (a * b) * c$

The easiest way to understand a free monoid is to construct one.

1. Pick a set, any set, and call it the set of generators.

• Assign letters of the alphabet to your generators. Say, you have less than 26 generators and you call them $a, b, c$, etc.

2. Add one more element to the set and call it the unit.

• Reserve the empty string (which I’ll denote $””$) for your unit element.

3. Define multiplication by unit to always return the other muliplicand.

• “”a = a

4. For every pair of generators, create a new element and call it their product.

• When asked for the product of, say, $a$ and $t$, call it $at$. The product of $c$ and $at$ will be $cat$, the product of $ca$ with $t$ is also $cat$.

As you can see a free monoid generated by an alphabet is equivalent to the set of strings, with product defined as string concatenation.

Lists are free monoids. Take any finite or infinite set and create ordered lists of its elements. Your unit element is the empty list, and “multiplication” is the concatenation of lists. Strings are just a special type of lists based on a finite set of generators. But you can have lists of integers, 3-D points, or even lists of lists.

Let’s construct another one!

We can construct an algebra for any $R$-linear space $V$ by repeatedly gluing $V$ together via the tensor product.

To start, we define the $1$-tensor space on $V$ := $\mathcal{T}^1(V) = V$.

We can recursively define the $k$-tensor space on $V$ := $\mathcal{T}^k(V)$ by setting $\mathcal{T}^k(V) = V \otimes \mathcal{T}^{k-1}(V)$.

In other words: $\mathcal{T}(V) = \mathbb{K} \oplus V \oplus (V \otimes V) \oplus …$

Moreover, we can define the tensor algebra on $V$ as $\mathcal{T}(V) = \bigoplus_{n \in \mathbb{N}} \mathcal{T}^n(V)$.

This is the coproduct of all tensor products of $V$. A tensor algebra is the free monoid on the linear space $V$.

As with other free constructions, $\mathcal{T}$ is left adjoint to a forgetful functor that takes a $R$-Algebra to its underlying $R$-linear space.

Keep in mind that Lin is a type of monoidal category (a category $(C, \otimes)$ equipped with a tensor product). We can take the tensor product of linear spaces and of linear maps.

The functoriality of $\mathcal{T}$ means that any linear map $\phi: V \rightarrow W$ extends uniquely to an algebra homomorphism from $\mathcal{T}(\phi): \mathcal{T}(V) \rightarrow \mathcal{T}(W)$.

Note that a map $f: A \to B$ between two algebras is an algebra morphism if $f$ is linear and preserves multiplication $f(a_1a_2) = f(a_1)f(a_2)$.

#### Tensors are the Building Blocks of Multilinear Algebra

The tensor product is an operation combining vector spaces, and tensors are the elements of the resulting vector space.

Let’s start with an example of the tensor product of two vectors $\phi, \psi \in \mathbb{C}^2$:

What is this mystical thing?

The tensor product of $A, B$ over $R$ looks like this:

• an abelian group $A \otimes_R B$
• a bilinear map $\otimes: A \times B \to A \otimes_R B$
• the universal property holds.

At first glance, this diagram might upset category theorists (isn’t all in Vect). If we note that it is taking place in a multicategory (since our objects are modules and our morphisms are k-linear maps), your feelings of angst toward the diagram will hopefully subside.

Given:

• $R$-module $M$
• $R$-bilinear map $f: A \times B \to M$

The universal property means that: there exists a unique $R$-bilinear map $\hat{f}: A \otimes_R B \to M$, such that the above diagram commutes (i.e. every $R$-bilinear map defined on the product $A \times B$ factors through $A \otimes_R B$ uniquely).

The tensor product $\otimes$ of $A$ and $B$ (the tensor space) is any linear space having this universal property. Note that the following diagram is the same diagram as above.

In other words, a linear map out of the tensor space corresponds to a bilinear map out of the original linear space.

Note that a tensor can be naturally extended from a bilinear map to a multilinear map by currying.

The rules for manipulation of of tensors arise as an extension of linear algebra to multilinear algebra. We can think of composition as doing things in series, and tensoring as doing things in parallel.

The second bifunctor condition can be sloganized as: Doing things in parallel, in series, is the same as doing things in series, in parallel. If you’re an algebraist: this equivalent presentation may be more palletable: $Hom(M,M) \otimes Hom(N,N) \simeq Hom(M \otimes N, M \otimes N)$

#### Motivating Example: Application to Quantum Computing

Tensor products are used to describe systems consisting of multiple subsystems.

Each subsystem is a vector in a Hilbert space. A qubit is a 2D, normalized, complex vector in a Hilbert space with base vectors $|0\rangle$ and $|1\rangle$.

The state space of a quantum computer with $n$ qubits can be represented as the tensor product of the respective state spaces of all the individual qubits.

If we have 2 systems, let us have 2 systems $I$ and $II$ with corresponding Hilbert spaces $\mathcal{V}_I$ and $\mathcal{V}_{II}$.

The state vectors $|\phi_{I}\rangle$ and $|\phi_{II}\rangle$ describe the state of the total system as $|\phi_{I}\rangle \otimes |\phi_{II}\rangle$.

Contra-||Co-variance: A Discussion of Tensor Types

Contra-||co-variance refers to how the change of scale in the reference axis affects the components of the object.

In general, most things that we think of as vectors, such as a position or displacement, are contravariant vectors. These are represented as column vectors in linear algebra. Covariant vectors, or one-forms, are represented by row vectors in linear algebra. In the language of quantum mechanics, contravariant vectors are kets $v^i \sim |v\rangle$, while covariant vectors are bras $v_i \sim \langle v|$.

Tensors can be defined as objects in multilinear algebra that can have aspects of both covariance and contravariance.

The tensors are classified according to their type $(p, q)$, where $p$ is the number of contravariant indices, $q$ is the number of covariant indices, and $p + q$ gives the total order of the tensor.

The geometric product is based on simple geometric principles. The first premise is that multiplying two contravariant vectors $a \wedge b$ produces an area-like object called a bivector. Multiplying 3 vectors together $a \wedge b \wedge c$ produces a volume-like object called a trivector. In general, multiplying $p$ vectors together produces a $p$-vector.

A scalar is a grade-0 object, denoted as $\langle A \rangle_0$, a vector is a grade-1 object $\langle A \rangle_1$, a bivector is a grade-2 object $\langle A \rangle_2$, and in general a $p$-vector is a grade-$p$ that defines a $p$-volume.

Adding different grade objects creates a multivector of the form $A = \langle A \rangle_0 + … + \langle A \rangle_{p+q}$.

#### Postscript: Tensors as Type Constructors

In Haskell, we construct a free vector space over a type by taking some type a as a basis, and forming k-linear combinations of elements of a, represented as Vect k a.

Given 2 vector spaces:
A = Vect k a
B = Vect k b
We can form their tensor product A $\otimes$ B = Vect k (Tensor a b).

The tensor product is the vector space whose basis is given by all expressions of the form a $\otimes$ b. A Tensor is a type constructor on basis types, taking basis types a, b  for vector spaces A, B, and returning a basis type for the tensor product A $\otimes$ B.

In other words, a function of type <Tensor> is a bilinear function that takes each pair of basis elements a, b in the input to a basis element (a,b) in the output.

The power of this bilinear function is that it is in some sense “the mother of all bilinear functions”. Specifically, you can specify a bilinear function completely by specifying what happens to each pair (a,b) of basis elements.

It follows that any bilinear function f :: Vect k (Either a b) -> Vect k t can be factored as f = f' . tensor, where f' :: Vect k (a,b) -> Vect k t is the linear function having the required action on the basis elements (a,b) of Vect k (a,b).

Hungry for more Haskell? Here‘s a post written in literate Haskell using examples from computer vision.