group by operator
aggregate functions are used to summarize information from multiple tuples into a single-tuple summary. grouping is used to create subgroups of tuples before summarization. grouping and aggregation are required in many database applications, a number of built-in functions exist: COUNT, SUM, MAX, MIN, and AVG.
in many cases we want to apply an aggregate function operation to subgroups of tuples in a relation, where the subgroups are based on some attribute values. for example, we may want to find the average salary of employees in each department or the number of employees who work on each project. in these cases we need to partition the relation into nonoverlapping subsets (or groups) of tuples. each group (partition) will consist of the tuples that have the same value of some attribute(s), called the grouping attribute(s). we can then apply the function to each such group independently to produce summary information about each group. sql has a
sometimes we want to retrieve the values of these functions only for groups that satisfy certain conditions. SQL provides a
[cite:@elmasri_db_2015 chapter 7.1 more complex sql retrieval queries]
in many cases we want to apply an aggregate function operation to subgroups of tuples in a relation, where the subgroups are based on some attribute values. for example, we may want to find the average salary of employees in each department or the number of employees who work on each project. in these cases we need to partition the relation into nonoverlapping subsets (or groups) of tuples. each group (partition) will consist of the tuples that have the same value of some attribute(s), called the grouping attribute(s). we can then apply the function to each such group independently to produce summary information about each group. sql has a
GROUP BY clause for this purpose. the GROUP BY clause specifies the grouping attributes, which should also appear in the SELECT clause, so that the value resulting from applying each aggregate function to a group of tuples appears along with the value of the grouping attribute(s).
sometimes we want to retrieve the values of these functions only for groups that satisfy certain conditions. SQL provides a
HAVING clause, which can appear in conjunction with a GROUP BY clause, for this purpose. HAVING provides a condition on the summary information regarding the group of tuples associated with each value of the grouping attributes. only the groups that satisfy the condition are retrieved in the result of the query.
SELECT
Pnumber, Pname, COUNT (*)
FROM
PROJECT, WORKS_ON
WHERE
Pnumber = Pno
GROUP BY
Pnumber, Pname
HAVING
COUNT (*) > 2;