Jeffrey Scholz

# Elementary Set Theory and Abstract Algebra for Programmers

Updated: 2 days ago

## Why another set theory tutorial?

The target audience for this piece are the sort of folks who don’t care about abstract math unless they see a direct use-case for it. They want to get the essential parts they need and move on. This piece caters to that audience.

Specifically, abstract algebra has a lot of concepts that can be properly called “useful,” but abstract algebra is heavily dependent on set theory.

Our goal here is to take the most direct path through set theory to abstract algebra such that we understand all the terminology and concepts we will be dealing with.

Engineers are usually not interested in collecting abstractions or proving theorems, they want to ship things while having a sufficiently deep understanding of the topic such that they don’t unintentionally create bugs or inefficiencies. Acquiring knowledge that does not directly aid in this quest is considered a waste of time.

To optimize for this goal, I have deliberately omitted any aspect of set theory that I do not think is directly useful to understanding the aspects of abstract algebra we need, and as such, this resource is not comprehensive by design. But please note that this is part of a series of tutorials, and is not intended to be the complete treatment.

Some mathematicians will be horrified that I don’t even discuss the axioms of set theory here. This is a feature, not a bug. If you want a proof for something, ask Google or ChatGPT (or better yet, work it out on your own). The concepts discussed here have been proven to death for over a century.

Set theory is not difficult (at least if you skip the proof writing, set theory proofs can be shockingly hard when doing it for the first time). You probably understand set theory intuitively already, and almost certainly have used sets in your code, probably to eliminate duplicates in an array quickly. However, we need to put a *language* to this intuition and make our intuitive understanding *explicit*.

Learning abstract math is like learning a human language. You can learn the vocabulary (what the words refer to), and the grammar (how they combine together in a valid way). We place a heavy emphasis on vocabulary over grammar here, to use that analogy. There is good reason for this.

If you enter a store in a foreign country and ask the equivalent of “the breads to be buy for me where?” the clerk can help you even though your grammar is horribly off. But if you nail the grammar and don’t know the vocabulary, your knowledge is useless. You can form a perfect sentence, but if you can’t succinctly refer to “bread” and “buy” your trip to the store will not be a success (I didn’t invent this analogy, but I can’t remember where I first saw it unfortunately).

Hence, we will barely scratch the grammar (proofs and theorems) and emphasize the vocabulary of abstract math (which is very useful in navigating this foreign country).

**Some types of knowledge can only be acquired by experience, not by explanation.**

Therefore, *you should* do the exercises in this text. Don’t worry, you won’t be writing proofs, just making sure you actually internalized what you just read.

Don’t let the relative brevity of this text fool you. It will take at least an afternoon (or two, maybe three) to work through it if you actually do the exercises. If a certain section doesn’t make sense, consult the internet for an alternative explanations of that subtopic.

## Definition of a set

A set is a well-defined collection of objects. Where the abstract power comes from is that these could be *anything* and the rules we learn from set theory apply to them.

What integers, rational numbers, real numbers, complex numbers, matrices, polynomials, polygons, and many other things have in common is that they are all sets.

There is a well-defined rule that decides if something is a member of the set or not. We won’t get into it, but it is clear that a polygon is not a polynomial, and a polynomial is not a matrix, etc.

Sets are allowed to be empty. We call this the *empty set*.

Sets do not contain duplicate items by definition, {a, a, b} is really {a, b}.

**Exercise:** Assume you have a proper definition for integers. Create a well-defined set of rational numbers.

### Superset and subsets

When we look at integers and rational numbers, there seems to be a relationship between some of them. Specifically, all integers are rational numbers, but not all rational numbers are integers. The relationship between them is that integers are a *subset* of rational numbers. On the flip side, rational numbers are a *superset* of integers.

A subset need not be strictly smaller than the set it is a part of. It is perfectly okay to say that integers are a subset of integers.

The precise definition for the relationship between integers and rational numbers is a *proper subset*, that is, there exist rational numbers that are not integers.

**Exercise:** Define the subset relationship between integers, rational numbers, real numbers, and complex numbers.

**Exercise:** Define the relationship between the set of __transcendental numbers__ and the set of complex numbers in terms of subsets. Is it a proper subset?

### Set equality

Sets are defined to be equal if they contain the same elements, regardless of order. For example, {4, 2, 5} is the same set as {2, 5, 4}. When doing formal proofs for sets, we say that if A is a subset of B and B is a subset of A, then A = B. Or in more mathy notation: A = B ⟷ A ⊆ B and B ⊆ A. That’s read as A = B if and only if A is a subset of B and B is a subset of A.

### Cardinality

In some of our above examples, there are an infinite number of integers, rational numbers, etc. However, we can also define sets in a finite way, such as the numbers {0,1,2,3,4,5,6,7,8,9,10}. The *cardinality* of the previous set is 11. If A = {5,9,10}, then |A| = 3, where the two vertical bars around the A mean cardinality.

There are different levels of infinite in set theory. For example, there are infinitely many more real numbers than there are integers. Specifically, we say integers are *countably infinite* because you can literally count them out. But there is no way to start counting real numbers which are *uncountably infinite*.

**Exercise:** Using the formal definition of equality, show that if two finite sets have different cardinality, they cannot be equal. (Demonstrating this for infinite sets is a little trickier, so we skip that).

## Fancy blackboard letters

Because “integers as a set” and “real numbers as a set” are used so frequently, there is scary looking mathematical shorthand for that.

The symbol ℕ is the set of natural numbers (1,2,3,…). It definitely does not include negative numbers, but whether it includes zero depends on who you are talking to.

The symbol ℤ is the set of all integers (because “zahl” is integer in German)

The symbol ℚ is the set of all rational numbers (I remember it as the quotient of the numerator and denominator)

The symbol ℝ is the set of all real numbers, because R stands for real. Duh.

The symbol ℂ is the set of all complex numbers for similarly obvious reasons.

Sometimes people write ℝ² as a vector of two real numbers, so a ∈ ℝ² means “a” is a 2d vector. I recommend writing it the second way because it is more concise, and also makes you look smarter.

## Ordered pairs

Although sets do not respect order, a new type of data structure can emerge from sets called the *ordered pair*. For example, (a, b) is an ordered pair while {a, b} is a set.

We programmers typically think of this as a *tuple*. We say two ordered pairs are equal in the same sense we say two tuples are equal.

How do we create order out of something that is unordered?

The implementation detail is that we write (a, b) in set form as {a, {b}}. We can do this because we can define our set as containing either letters or a set of cardinality one that contains a letter. This is why we can say (a, b) ≠ (b, a) because {a, {b}} ≠ {b, {a}}. We will not concern ourselves with this implementation detail any further.

Just like in other programming languages, our ordered pair can be arbitrarily long, for example (a,b,c,d) is valid. We are also allowed to encode ((a, b), c), which will be useful later.

## Cartesian product

Because sets are well defined, we can define a set such that every element from one set is one part of an ordered pair with an element from another set. For example, if A = {1,2,3} and B = {x, y, z} A × B = {(1, x), (1, y), (1, z), (2, x), …, (3, z)} or done as a table:

The result of a cartesian product is still a set: a set of ordered pairs.

Cartesian products are not commutative, as the following exercise will demonstrate.

**Exercise:** Compute the cartesian product of B × A using the definitions above.

**Exercise:** Compute the cartesian product of {1,2,3,4} and {3,6,9,12} (in that order). If you were to pick 4 particular ordered pairs from this, what arithmetic computation would that encode?

## Subsets of the cartesian product form a function

What if we wanted to say we have a function 1 → y, 2 → z, 3 → x. (I picked this out-of-order example to make it a little more interesting).

Well, we can define a set that defines this mapping. We just take a *subset* of our cartesian product above to include (1, y), (2, z), and (3, x).

**In set-theoretic terms, a function is a subset of the cartesian product of the domain and codomain sets.**

In other words, we have the set we start from, and the set we end at. We take the cartesian product of these two, which results in every possible assignment from the input set to the output set, then we take the subset of that to define the function as we like. When dealing with infinite sets like integers, we aren’t bothered that we can’t enumerate all the ordered points explicitly.

A very important note is that mathematicians rarely concern themselves with *computability*. A function is a mapping between sets. How that function is *computed*, if it is even possible to compute with a reasonable computer, is not a concern of mathematicians.

This is where programmers sometimes get tripped up. They often only think of functions as something that can be efficiently computed with lines of code. While useful, this limits our understanding of the general properties of functions.

The reason I emphasize this is that in zero knowledge proofs, we are going to be dealing with functions that are a lot “higher level” than plugging an argument into a function and getting a return value. We need to be able to appreciate the “big picture” of functions. They are a mapping from one set to another set. And **a mapping between sets comes about by taking a subset of their cartesian product**.

Specifically, we are going to be jumping between integers, polynomials, matrices, elliptic curves in one dimension, then elliptic curves in another dimension, and so forth.

You’ll get very dizzy trying to conceptualize this unless you understand at a foundational level that, by set theory, we are allowed to define jumps however we like!

Of course, how we *map* the jump will have a strong effect on its usefulness, if we map everything to zero, that is a valid map, but not very useful.

I want you to understand early that we aren’t doing anything weird when we warp between universes in this way.

At the end of the day, we are allowed to take any two sets we like, create a new set by taking their Cartesian product, taking a subset of that set of ordered pairs, and boom, we have a mapping.

If you're thinking "hey wait a minute, I can just *pick* a subset and define the function however I like?" you are not alone in wondering this. If you want to go down the rabbit hole, we've really been discussing the __axiom of choice__ this whole time, and it has been the subject of controversy despite the definition seeming non-controversial:

"The cartesian product of a collection of non-empty sets is non-empty."

### Subsets of the cartesian product form a function: example

Let’s define a mapping between positive real numbers and positive integers using the floor function. The floor function simply removes the decimal portion of a number. We can’t show all the real numbers (or integers), but we can create a sketch.

When we do ℝ × ℤ and take a subset, we simply pick the ordered pairs that correspond to taking the floor of the element from the real numbers. The ordered pairs we do not show in the table are ordered pairs that are not in our subset that defines the mapping. For example, 2 is not the floor of 500.3333 so that ordered pair is not included.

**Exercise:** Define a mapping (function) from integers n ∈ 1,2,3,4,5,6 to the set {even, odd}.

**Exercise:** Take the cartesian product of the set of integers 0,1,2,…,8 and the polygons triangle, square, pentagon, hexagon, heptagon, and octagon. Define a mapping such that the integer maps to the number of sides on the shape. For example, the ordered pair (4, □) should be in the subset, but (7,△) should not be in the subset of the cartesian product.

**Exercise:** Define a mapping between positive integers and positive rational numbers (not the whole thing, obviously). It is possible to perfectly map the integers to rational numbers. Hint: draw a table to construct rational numbers where the columns are the numerators and the rows are the denominators. Struggle with this for at least 15 minutes before looking up the answer.

### Valid and invalid subsets of the cartesian product.

There is an important restriction on how we pick our subset. For example, the following subset of the cartesian product {1,2,3}, {p,q,r} is not valid because 1 maps to p and 1 maps to q. When defining a function with a cartesian product, the same domain element cannot map to two different co-domain elements.

## Injective, Surjective, and Bijective functions

You may have noticed in the first exercise above, even and odd have many incoming relations, but in the second exercise, some of the integers don’t map to any of the shapes.

Can we say something more general about this?

Specifically, the first exercise you did created an *surjective* function, the second exercise was an *injective* function, and the final exercise was a *bijective* functions.

I found these terms hard to remember personally. I remember “surjective” as elements in the codomain as possibly having a “surge” of multiple incoming relations (remember, surjective functions do not require multiple incoming relationships, but they require at least one incoming relationship).

An injective function means the elements in the codomain have

*at most*one preimage. If one of the elements does not have a preimage, that is okay, but if more than one element in the domain maps to the same element, it is not injective. We can also say that if an output element has a preimage, then it is unique.A surjective function means the elements in the codomain has

*at least*one preimage. If an element in the codomain does not have a preimage, the function is not surjective.A function is bijective if and only if it is injective and surjective.

How we define the domain and codomain is very important. In our integers to polygons example, 0 and 1 didn’t map to anything, so the function was injective. But if we *defined* our domain to not include these numbers, then the function would become *bijective*.

Another example. f(x) = x² is surjective if we define the codomain to be all positive real numbers including zero. But if we define the codomain to include -1, then it is no longer surjective.

This is a fantastic youtube __video on injective, surjective, and bijective__ functions explaining these three kinds of functions further.

As promised at the beginning, I won’t introduce concepts that won’t directly aid you with understanding important topics later. Bijective and surjective functions are important when we talk about isomorphisms and homomorphisms later. We don’t directly use injective functions in our later journey, but a bijective function is a function that is both surjective and injective, so you might as well remember that.

**Exercise:** Let set A be {1,2,3} and set B be {x,y,z}. Define a function from A to B that is well-defined, but not surjective and not injective.

## A cartesian product of a set with itself

It should be no surprise that instead of doing a cartesian product between A and B, you can do a cartesian product between A and A. This is just mapping a set to itself.

This is actually a more abstract form of what we traditionally think of as functions between integers.

For example, y = x² (over positive integers) can be visualized in set theoretic terms as the subset of ℤ+ × ℤ+.

## Set relations

The phrase “taking a subset of the cartesian product” is so common that we have a word for it. It is a **relation**.

A relation can be from a cartesian product of a set with itself or a set with another set. If we have two sets A and B (B might or might not be equal to A without loss of generality), and we take a subset of A × B, then we say an element a from A is related to an element b from B if there is an ordered pair (a, b) in the subset.

In the y = x² example, 2 from X is related to 4 from Y, but 3 in X is not related to 6.

## A “binary operator” in set theoretic terms

A binary operator is a function from A × A → A. Basically, we take every possible pair from A (the cartesian product of A with itself) and map it to A.

Let’s use an example of the set {0,1,2} with binary operator addition modulo 3. First we take the set’s cartesian product with itself:

Then we take the cartesian product of this new set of pairs with the original set

And then take the subset of that which defines our binary operator addition modulo 3

You can see that the sum of the inner ordered pair modulo 3 is equal to the second number. This is the set-theoretic definition of addition modulo 3 in the domain {0,1,2}. By picking different subsets in the third step, we can define different binary operators, for example multiplication or exponentiation.

**Exercise:** Pick a subset of ordered pairs that defines a * b mod 3.

Here is an interesting fact about the relation between (A x A) and A (shown in the final table above): if the binary operator is commutative, then the map from (A x A) to A cannot be injective if the cardinality of the set is 2 or greater.

**Exercise:** Demonstrate the above statement is correct by reasoning from ((a, b), c) and ((b, a), c) and the definition of injective.

The relation is not necessarily surjective, because a closed binary operator does not need to have the entire set as a possible output. For example, our binary operator could always return 1.

A binary operator is more than just addition or multiplication, in fact, I think those are very bad examples, because they encourage you to think in limited terms.

A closed binary operator takes *any two* elements of a set, and outputs another element from the same set. The closed part is important, as it restricts the output to be in the same set.

Specifically, we construct a binary operator as follows:

Take a cartesian product of a set with itself, call this set of ordered pairs

*P*.Take the cartesian product of

*P*with A and take a subset of that such that P→A is well defined.

Division over integers is not a binary operator because what really happens is we do the P = (ℤ × ℤ) then take a subset of (P × ℚ) to get our relation. Division over integers is not closed because it can produce rational numbers.

We are going to deal with binary operators a lot on elliptic curves, integers, polynomials, matrices, etc.

When we can trust that the binary operator will have certain properties, then we can abstract away a lot of implementation detail.

For example, “adding” elliptic curve points is not exactly trivial, and the math is not obvious from the get go.

However, if you *know* the binary operator is closed, and follows certain properties, then *how* the “addition” is implemented does not matter!

Let’s use a slightly more relatable example. If you multiply two square matrices of determinant 1 together, the determinant of the product matrix will also be 1. The proof is not something you can quickly work out in your head. But if you model this as a “set of 3x3 matrices with determinant 1 and binary operator multiply, and multiply is closed” then you suddenly have a lot of functional knowledge of a system you don’t know the implementation details about. You know no matter what you do, you’ll get a determinant 1 matrix without having to know why.

Having the language to describe a binary operator as “closed” allows you to operate at a higher level and understand the bigger picture of transformations and not get bogged down in the implementation details.

You can reason about operations without understanding how they work!

### Constructing valid binary operations

When it comes to binary operators, we are not allowed to take a subset of A ⨉ A before mapping that to A. Binary operators must accept *all* members of group A as its inputs. We of course must take a subset of the ordered pairs between A ⨉ A and A because each pair from A ⨉ A must map to exactly one A.

**Exercise:** Define our set A to be the numbers 0,1,2,3,4 and our binary operator to be subtraction modulo 5. Define all the ordered pairs of A ⨉ A in a table, then map that set of ordered pairs to A.

## Properties of binary operators over sets: Magma, Semigroups, and Monoids, and Groups

We are now going to head into what is traditionally considered abstract algebra land, but the border between set theory and abstract algebra is rather fuzzy.

If we have sets and a binary operator on that set, we can categorize those sets based on how the binary operator behaves, and what elements are allowed (or expected) to be in the set.

Mathematicians have a word for every possible kind of behavior of the binary operator on the set. As applied programmers, we care about the *Group* (from Group Theory) in particular, but let’s work our way their incrementally. The group is just one type of animal in this large zoo. So rather than study the group in isolation, let’s study the group in its larger context of related algebraic structures (i.e. sets with a binary operator).

### Magma

A magma is a set with a closed binary operator. That’s it.

A magma is definitely something you understand intuitively as a programmer. Now you have a word for it.

As an example, let our set be all positive integers and our binary operator be x ^ y. Note that we don’t allow negative numbers because if y is negative, we get a fraction.

Clearly, the output will be in the space of integers.

Interestingly, this example is not commutative or associative. You can convince yourself of this by picking values for a, b, and c in the python code below.

```
assert a ** (b ** c) != (a ** b) ** c
assert a ** b != b ** a
```

But we don’t care. A magma is the one of the least restrictive kinds of algebraic structures. All that matters is that the binary operator is closed. Everything else is fair game.

### Semigroup

A semigroup is a magma where the binary operator must be associative.

In other words, a semigroup is a set with a binary operator that is closed and associative.

Let our (infinite) set be the set of all possible non-empty strings from the traditional alphabet a,b,…,y,z. For example, “az”, “programmer”, “zero”, “asdfghjk”, “foo”, and “bar” are all in this set.

Let our binary operator be the concatenation of strings. This is closed, because it produces another string in the set.

Note that if we commute “foo” and “bar”, the output string will not be the same, i.e. “foobar” and “barfoo”. However, that does not matter. Both “foobar” and “barfoo” are members of the set, so the binary operator “concatenate” is closed. Because we have a set with a closed binary operator, the set of all strings under concatenation is a magma.

**Exercise:** Work out for yourself that concatenating “foo”, “bar”, “baz” in that order is associative. Remember, associative means (A op B) op C = A op (B op C).

**Exercise:** Give an example of a magma and a semigroup. The magma must not be a semigroup. Don’t use the examples above.

### Monoid

A monoid is a semigroup with an identity element.

Aww yes, this is the same Monoid from the “A monad is a monoid in the category of endofunctors.”

If we look at the __monoid documentation__ in the Cats library for Scala, we see these definitions explicitly:

```
trait Semigroup[A] {
def combine(x: A, y: A): A
}
trait Monoid[A] extends Semigroup[A] {
def empty: A
}
```

The Cats library simply refers to "identity" as "empty" and the binary operator as "combine." The fact that Monoid extends Semigroup shows that a Monoid is a Semigroup with the requirement that it has an "empty" (identity).

The snippet above doesn’t show it, but it is indeed required that combine be associative.

A semigroup just has a binary operator with no restrictions on it except that it outputs the same type (A) as the inputs (x, and y).

For example, addition over positive integers *without zero* is a semigroup, but if you include zero, it becomes a monoid.

An identity element means you do a binary operator with the identity element and another element a, you get a. In the example of addition 8 + 0 = 8, where 0 is the identity element. If your binary operator was multiplication, then the identity element would be 1, since multiplying by 1 gives the same number back.

#### Sets of sets over union and intersection

Something bizarrely absent from our discussion of sets above was the discussion of *union* and *intersection* of sets. These are binary operators, and now is a good time to introduce them.

If you take the union of two sets {1,2,3,4} and {3,4,5,6}, you get {1,2,3,4,5,6}. If you take the intersection of {1,2,3,4} and {3,4,5,6} you get {3, 4}.

It should be clear that both of these operators is associative.

If we define our domain to be the set of all sets of integers, then the binary operators union and intersection are closed because their result is a set of integers.

(If you want to be a little meta, we could define our domain to be the set of all possible sets and it would still be closed, think about it).

Set union has an identity element in this domain: the empty set. Take the union of a set with the empty set and you get the original set, i.e. A ∪ {} = A.

Hence, the set of all sets of integers over union is a monoid.

However, in the case of set *intersection* over all possible sets of integers, it isn’t so clear if this is a monoid or a semigroup (infinities can be weird unfortunately). An identity element over intersection must be I such that A = A ∩ I for all A. This means for all A, |I| > |A| and it’s not clear we can define such a set without using a “special case.”

By the way, we are allowed to define such a set and include it in our domain if we wanted to. We’ll see later that elliptic curves use a trick like this and include a special point called the “point at infinity” to stay consistent with the algebraic laws. The point is we need to be very clear what our identity element is if we say a set is a monoid over some binary operator.

However, if we restrict the domain to be all sets of positive integers up to 5, then intersection clearly becomes a monoid because the identity element would be {0,1,2,3,4,5}, as any set of integers you intersect with it will produce the other set, i.e. A ∩ {0,1,2,3,4,5} = A.

At this point it should be clear that the category of an algebraic structure for a given binary operator is very sensitive to the domain of the set.

**Exercise:** Let our binary operator be the function min(a,b) over integers. Is this a magma, semigroup, or monoid? What if we restrict the domain to be positive integers (zero or greater)? What about the binary operator max(a,b) over those two domains?

**Exercise:** Let our set be all 3 bit binary numbers (a set of cardinality 8). Let our possible binary operators be bit-wise and, bit-wise or, bit-wise xor, bit-wise nor, bit-wise xnor, and bit-wise nand. Clearly this is closed because the output is a 3 bit binary number. For each binary operator, determine if the set under that binary operator is a magma, semigroup, or monoid.

### Group - The Star of the Show

A group is a monoid where each element has an inverse.

Or to be explicit, it is a set with three properties

a closed and associative binary operator (a semigroup)

an identity element (a monoid)

every element has an inverse. That is, there exists an inverse element of the set such that the binary operator of an element and its inverse produces the identity element.

In other words, a group is a monoid that, for each element, there exists in the set an “inverse element” such that the binary operator on an element and it’s inverse produces the identity element. If a is an element, and a’ is that element’s inverse, then (a op a’) = identity.

Using integers with addition, the identity element is zero (because you add zero, you get the same number back), and the inverse of an integer is that integer with the sign flipped (e.g. the inverse of 5 is -5 and the inverse of -7 is 7).

Going back to the domain sensitivity, addition over positive integers is not a group because there can be no inverse elements.

Here is an interesting table to drive the point home

Note that “inverse” is not meaningful if the set does not have an identity, because the definition of inverse is doing a binary operator of an element with its inverse produces the identity.

**Exercise:** Why can’t strings under concatenation be a group?

**Exercise:** Polynomials under addition satisfy the property of a group. Demonstrate this is the case by showing it matches the three properties that define a group.

Unfortunately, our tutorial must end here, because elementary group theory is the subject of another chapter.

But now you have a lot of context to understand what a group is even though we barely discussed it here!

### A word about commutativity

None of the algebraic data structures above are required to be commutative. If they are, we say they are *abelian* over their binary operator. So an *abelian group* means the binary operator is not sensitive to the order.

Abelian means the binary operator is commutative.

But say abelian, you'll sound smarter.

The technicality is we don't normally say "addition is abelian" but "the group is abelian over addition."

## Subsets again

Let’s tie this all back to what we learned at the beginning. Magmas, semigroups, monoids, and groups are all sets that have a closed binary operator. A binary operator is just a map from all the ordered pairs of the set’s cartesian product with itself back to itself.

Groups are a subset of monoids, monoids are a subset of semigroups, semigroups are a subset of magmas, and magmas are a subset of sets in general. Every group is also a magma or a set, but a magma is not necessarily a group.

“Sets” are easy to conceptualize, but when we start talking about groups and other algebraic structures, it’s easy to start getting lost. Groups are *very important* in our study of cryptography. Just remember:

**Groups are sets with restrictions.**

Also, it’s time to free your mind from “addition” and “multiplication” being the primary way of combining things.

We are allowed to take a cartesian product of a set (which could be anything) with itself then map that set of ordered pairs back to the set. This is a binary operator. If it follows the construction above, then it is closed.

## Math vocabulary doesn’t need to scare us

Before you began this tutorial, the sentence “the set of strings over the binary operator string concatenation is a semigroup or a monoid depending on the presence or absence of the empty string in the set” probably didn’t make sense. You might still have to translate that in your head like most learners of a second language, but you realize it’s actually packing a *lot* of information into a tiny space.

Could I say that sentence without the mathiness? Of course I could, but it would take me at least 500 words to do it clearly. It’s actually worth understanding what those terms mean. This will save us a lot of trouble in the long run.

What makes it cool is there are a plethora of theorems about groups that let us make claims about the group *without understanding how the binary operator of the group work under the hood*. This is very roughly like polymorphism in object-oriented programming or traits in functional languages. They hide implementation details from you and let you focus on the high level. That is powerful.

## Learn more with RareSkills

Want to keep going? The next tutorial in this series is __group theory for programmers__

The usefulness of the vocabulary from abstract algebra is why our __zero knowledge course__ does not dodge math. We just make sure we have our essential vocabulary down before we start using it.