The Lost Science of Set Theory – Part 1: What is Set Theory?

This series will cover the basics of Set Theory and how it is applied to relational databases.

History

Georg Cantor (1845-1918), a German mathematician, is considered by many to be the father of Set Theory. His 1874 paper, “On a Property of the Collection of All Real Algebraic Numbers”, is widely regarded as the starting point of the field, and he later described a set this way: “A set is a Many that allows itself to be thought of as a One.” His work launched an entire branch of mathematics and introduced many concepts that we take for granted today, such as a rigorous treatment of infinity.

Definition

Since the relational model is based upon some of the concepts within set theory, a basic understanding of those concepts is quite helpful. For our purposes, however, we will limit the discussion to the parts of set theory that the relational model implements, and we will take a much simplified approach.

To understand how set theory is applied to relational databases, a few basic definitions are required:

  • A set is essentially a collection of objects which are grouped together.
  • The objects in the set are known as members and those members are related to one another in some fashion.
  • For all intents and purposes, a set is treated as a single entity.
  • Any operation that is performed against a set is performed against all members of the set simultaneously.
  • By definition, a set can be divided into one or more sub-sets based upon some criteria and multiple sets can be merged into a single set.
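
The definitions above can be sketched with Python's built-in set type. This is only an illustration, not how a relational engine stores or processes data, and every name in it (fruits, citrus_names, berries, and so on) is a hypothetical chosen for the example:

```python
# Hypothetical example data; the names are illustrative only.
fruits = {"apple", "banana", "orange", "lemon", "peach"}

# A sub-set is carved out of a set by some criterion
# (here: membership in a known list of citrus names).
citrus_names = {"orange", "lemon", "lime"}
citrus = {f for f in fruits if f in citrus_names}

# An operation against the set is expressed once and applies to every
# member; the caller does not manage a position or an ordering.
shouting = {f.upper() for f in fruits}

# Multiple sets can be merged into a single set (a union).
berries = {"strawberry", "blueberry"}
produce = fruits | berries

print(citrus <= fruits)  # True when citrus is a sub-set of fruits
print(len(produce))      # number of distinct members after the merge
```

Note that the set is handled as one value throughout: it is filtered, transformed, and merged as a whole rather than member by member.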

Examples

To give some perspective, the following are examples of various sets:

  • Real Numbers (3.14159, 23, -14.76, 256, -12, etc.)
  • Countries of the world
  • Fruits (apples, bananas, oranges, peaches, etc.)
  • Database professionals (SQL Server, Oracle, etc.)
  • Deck of cards

As previously mentioned, sets can be broken into one or more sub-sets:

  • Integers (-12, 256, 65536, -351, etc.)
  • Countries of Europe
  • Citrus fruits (oranges, lemons, limes, etc.)
  • SQL Server professionals
  • Card Suits (spades, clubs, hearts, diamonds)
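
As a rough sketch of the deck-of-cards example, the snippet below builds a deck as a set, divides it into one sub-set per suit, and merges those sub-sets back together. The representation of a card as a (rank, suit) pair and the helper name suit_subset are assumptions made for illustration:

```python
from itertools import product

SUITS = ("spades", "clubs", "hearts", "diamonds")
RANKS = ("A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K")

# The full deck: every (rank, suit) combination, held as one set.
deck = set(product(RANKS, SUITS))

def suit_subset(cards, suit):
    """Return the sub-set of cards that belong to the given suit."""
    return {card for card in cards if card[1] == suit}

# Divide the deck into sub-sets based upon a criterion (the suit).
by_suit = {suit: suit_subset(deck, suit) for suit in SUITS}
hearts = by_suit["hearts"]

# Merge the sub-sets back into a single set.
rebuilt = set().union(*by_suit.values())

print(len(deck), len(hearts))
print(hearts < deck)    # hearts is a proper sub-set of the deck
print(rebuilt == deck)  # merging the suits gives back the full deck
```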

Why is this important?

Set theory is typically taught only in Computer Science classes, and even there, with the focus on the latest trends and fashions in programming, many curricula do not cover it. It’s also true that many database developers started out as programmers working in procedural or object-oriented languages such as Java, Ruby, or C#. Therefore, most database developers have had very little, if any, exposure to relational database theory.

As a result, set theory is typically avoided like the plague. Many database solutions end up being developed in a procedural fashion because that is the mode with which their developers are comfortable. Many developers are simply scared of the prospect of handling set-based operations!
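
To make the contrast concrete, here is a minimal sketch using Python's standard sqlite3 module and an in-memory database. The employee table, its columns, and the function names are all hypothetical; the point is only to show the same change written row by row and then as a single set-based statement:

```python
import sqlite3

def make_db():
    """Create an in-memory database with a small, made-up employee table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (id INTEGER PRIMARY KEY, dept TEXT, salary INTEGER)"
    )
    conn.executemany(
        "INSERT INTO employee (dept, salary) VALUES (?, ?)",
        [("sales", 1000), ("sales", 2000), ("it", 3000)],
    )
    return conn

def raise_procedural(conn):
    """Procedural style: fetch the rows, then update them one at a time."""
    rows = conn.execute(
        "SELECT id, salary FROM employee WHERE dept = 'sales'"
    ).fetchall()
    for emp_id, salary in rows:
        conn.execute(
            "UPDATE employee SET salary = ? WHERE id = ?", (salary + 100, emp_id)
        )

def raise_set_based(conn):
    """Set-based style: one statement describes the change for the whole sub-set."""
    conn.execute("UPDATE employee SET salary = salary + 100 WHERE dept = 'sales'")

def salaries(conn):
    return [row[0] for row in conn.execute("SELECT salary FROM employee ORDER BY id")]

a, b = make_db(), make_db()
raise_procedural(a)
raise_set_based(b)
print(salaries(a) == salaries(b))  # both styles leave the table in the same state
```

Both functions produce the same data. The set-based version simply states which sub-set of rows is affected and what happens to it, leaving the "how" to the database engine.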

Part 2 will cover the basics of set operations. Your comments and questions are most welcome!
