Greg Heo

SQL JOINs, part 1

Left? Right? Outer? Let's review.

SQL JOIN is right up there in the list of most useful inventions along with sliced bread, the iPhone, and the chemical toilet. Why are there so many different kinds? Left? Right? Outer? Let’s review.

The INNER track

Here’s our example data set:

Equipment
ItemOwner
iPhoneJim
StaplerDwight
CalculatorJim
PrinterPam
MicrowaveEveryone
People
NameExtension
Jim255
Dwight132
Michael329
Pam0


That’s some pretty crappy schema design. Anyway, we want a list of equipment along with its owner and that person’s telephone extension:

SELECT * FROM equipment JOIN people ON (people.name=equipment.owner)

The result set will look something like this:

iPhoneJimJim255
StaplerDwightDwight132
CalculatorJimJim255
PrinterPamPam0

Hey, what happened to the microwave? The SELECT picks it up since it’s in the equipment table, but there’s no matching record in the people table (there’s no person called “Everyone”).

When you say JOIN, that actually means INNER JOIN, which means “if you cannot find a matching record to join, discard it”. And there goes the microwave.

In this case, that might not be what we’re interested in. Maybe we want to know about all the equipment we have, and the owners are just nice to know.

Enter stage LEFT

LEFT JOIN is what we want in this case:

SELECT * FROM equipment LEFT JOIN people ON (people.name=equipment.owner)

Notice that our main table equipment is on the left side of the statement, and our joined table people is toward the right. A left join means “return every record available from the table on the left, even if there’s nothing to join to”. Here are the results:

iPhoneJimJim255
StaplerDwightDwight132
CalculatorJimJim255
PrinterPamPam0
MicrowaveEveryoneNULLNULL

Much better. We get to see every piece of equipment and the fields that couldn’t be joined are filled with NULLs.

Exit stage RIGHT

A right join is essentially the same thing except everything from the table on the right is returned. That means these two queries will return the same thing (although the fields may show up in a different order):

SELECT * FROM equipment LEFT JOIN people ON (people.name=equipment.owner)

SELECT * FROM people RIGHT JOIN equipment ON (people.name=equipment.owner)

Generally though, you put the main table as the SELECT and the secondary one as the join. Right joins don’t get a lot of use and I would even argue that they’re bad form. Just stick to left joins and your future self and other code maintainers will thank you.

In closing

The default inner join is really a “left and right” or “both” join, since a record must be returned in both tables. Most JOINs in the world are either INNER or LEFT so those are the ones to know.

In part 2, we’ll look at the oddballs: OUTER and CROSS joins.