A scan node sits at the leaf of the tree and pulls rows from a single table. A join node sits in the middle and brings together the rows that its two children send up. It takes one row from users, one row from orders, checks whether they belong to the same user, and if they match, emits the combined row. PostgreSQL has three nodes for this one job: NestLoop, HashJoin, and MergeJoin. The reason a single task splits into three nodes is much like the reason scans did. There is more than one way to find matching pairs from two inputs, and which way is cheapest depends on the size of the inputs and the shape of the join condition.

Deciding which way is cheapest, by costing the alternatives, was the planner's job in an earlier chapter. This section looks at what those three nodes actually do when they execute. Given the same two tables, the three find matches in completely different ways, and that difference in approach is exactly what tells them apart.

How the three nodes route requests

All three join nodes are internal nodes with two children. One child is called the outer, the other the inner. All three run on the Volcano model's pull framework: when the parent asks for the next row, the join node takes rows from its two children, builds one matched row, and sends it up. The only difference is the order and manner in which it routes pull requests to its two children.