join operation

the JOIN operation, denoted by , is used to combine related tuples from two relations into single "longer" tuples. this operation is very important for any relational database with more than a single relation because it allows us to process relationships among relations. to illustrate JOIN, suppose that we want to retrieve the name of the manager of each department. to get the manager's name, we need to combine each department tuple with the employee tuple whose Ssn value matches the Mgr_ssn value in the department tuple. we do this by using the JOIN operation and then projecting the result over the necessary attributes, as follows:
Note that Mgr_ssn is a foreign key of the DEPARTMENT relation that references Ssn, the primary key of the EMPLOYEE relation. this referential integrity constraint plays a role in having matching tuples in the referenced relation EMPLOYEE.
the JOIN operation can be specified as a cartesian product operation followed by a select operation. however, JOIN is very important because it is used frequently when specifying database queries.
consider the following sequence of operations illustrating the cartesian product:
these two operations can be replaced with a single JOIN operation as follows:
the general form of a JOIN operation on two relations and is
the result of the JOIN is a relation with attributes in that order; has one tuple for each combination of tuples--one from and one from --whenever the combination satisfies the join condition. this is the main difference between cartesian product and join. in JOIN, only combinations of tuples satisfying the join condition appear in the result, whereas in the CARTESIAN PRODUCT all combinations of tuples are included in the result. the join condition is specified on attributes from the two relations and and is evaluated for each combination of tuples. each tuple combination for which the join condition evaluates to TRUE is included in the resulting relation as a single combined tuple.
a general join condition is of the form
where each <condition> is of the form , is an attribute of , is an attribute of , and have the same domain, and is one of the comparison operators . a JOIN operation with such a general join condition is called a THETA JOIN. tuples whose join attributes are NULL or for which the join condition is FALSE do not appear in the result. in that sense, the JOIN operation does not necessarily preserve all of the information in the participating relations, because tuples that do not get combined with matching ones in the other relation do not appear in the result.