tgroup: Calculates aggregate functions on groups of rows
tgroup identifies groups of rows in
a table based on the values in a given column or columns,
and calculates statistical quantities or otherwise collapses
down the multiple values from other columns into single values
representing each group.
It does the same job as a
SELECT ... GROUP BY statement
with aggregate functions in ADQL/SQL.
keys parameter defines how input rows are grouped,
aggcols parameter defines what quantities to
aggregate from the rows in each group.
keys specifies one or more values (column names or expresssions)
that must be the same for rows grouped together,
aggcols specifies zero or more columns to be
added based on the content of rows in each group.
The output table therefore contains
one column for each entry in
and one column for each entry in
and has one row for each group identified.
This command can therefore be used to count rows or calculate statistical quantities per group. A number of statistical aggregation methods are provided such as mean, median, minimum, maximum etc. For more specialised requirements, for instance quantiles or custom statistics, you can also use the array aggregators which generate an array containing all of the values in the group, and operate on the resulting column using one of the functions in the Arrays class.
By way of comparison, the
stilts tgroup in=t keys="year detector" aggcols="0;count;num gmag;min;min_gmag gmag;mean"corresponds roughly to the ADQL query:
SELECT COUNT(*) AS num, MIN(gmag) AS min_gmag, MEAN(gmag), FROM t GROUP BY year, detector
See also the
which provide similar functionality where the grouping is over
evenly spaced numeric/coordinate values.