Aggregations
Start by importing:
import scalaql._
// Docs classes
import scalaql.docs.Hogwarts._
Basic groupByβ
You could define aggregations for your data as you do with plain SQL.
To evaluate such kind of aggregation:
case class FacultyInfo(
name: String,
avgAge: Double,
totalGrade: Double,
specializations: List[String])
First, start with groupBy:
val grouped = select[Student].groupBy(_.faculty)
// grouped: Query.GroupByQuery[From[Student], Student, String] = FROM(Hogwarts::Student) -> GROUP BY(String)
Then you could aggregate this query:
val facultyInfos = grouped
.aggregate((faculty, students) =>
(
students.avgBy(_.age.toDouble) &&
students.sumBy(_.grade) &&
students.toListBy(_.specialization)
).map {
case (avgAge, totalGrade, specializations) =>
FacultyInfo(faculty, avgAge, totalGrade, specializations)
}
)
// facultyInfos: Query[From[Student], FacultyInfo] = FROM(Hogwarts::Student) -> GROUP BY(String) -> AGGREGATE(MdocSession::App::FacultyInfo)
aggregate function provides you the grouping key (in this case - faculty) and a special value to declare your
aggregations.
In this example:
- To calculate average age - use
students.avgBy(_.age.toDouble) - To get total grade - use
students.sumBy(_.grade) - To collect each specialization of the faculty - use
students.toListBy(_.specialization)
You could chain multiple aggregations by && operator. It will automatically flatten the results into a single tuple.
Then it's possible to map the aggregation result.
This is the result of running such query:
facultyInfos
.show(truncate=false)
.run(
from(students)
)
// +----------+------------------+----------+-----------------------------------------+
// |name |avgAge |totalGrade|specializations |
// +----------+------------------+----------+-----------------------------------------+
// |Gryffindor|18.333333333333332|250.9 |[getting into troubles, eating, learning]|
// |Slytherin |18.0 |85.1 |[trolling] |
// |Hufflepuff|17.0 |90.1 |[young dying] |
// +----------+------------------+----------+-----------------------------------------+
//
Reportsβ
report aggregation function is somehow similar to groupBy with multiple keys.
The difference is that report allows you to do aggregations on intermediate results by using combine method.
First, start by defining report case class:
case class BySpec(specialization: String, avgGrade: Double)
case class ByAge(age: Int, grades: List[BySpec])
case class ByFaculty(faculty: String, grades: List[ByAge])
Then, you could define the aggreagtion as follows:
val byFacultyReportQuery: Query[From[Student], ByFaculty] =
select[Student]
.groupBy(_.faculty)
.aggregate { case (faculty, students) =>
students
.report(_.age, _.specialization) { (age, spec, students) =>
// We know Dumbledore too well =)
val additionalPoints = if (faculty == Gryffindor.name) 100 else 0
students.avgBy(_.grade + additionalPoints)
.map(BySpec(spec, _))
}
.combine((age, cascadeInfos) => cascadeInfos.toList.map(ByAge(age, _)))
.map(byAge => ByFaculty(faculty, byAge))
}
// byFacultyReportQuery: Query[From[Student], ByFaculty] = FROM(Hogwarts::Student) -> GROUP BY(String) -> AGGREGATE(MdocSession::App::ByFaculty)
In this case, you define your aggregation as always. Notice the following:
- Usage of
reportaggregation function reportaccepts grouping keys (likeageandspecialization)- First, you define aggregations per all grouping keys (both
ageandspecialization) - Then, using
combine, you aggregate onagegrouping key level - Inside
combineyou'll use a deeper level aggregation result - ... and so on until you combine all nested aggregations
Then run the query:
byFacultyReportQuery
.show(truncate = false)
.run(from(students))
// +----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
// |faculty |grades |
// +----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
// |Gryffindor|[{age: 18, grades: [{specialization: eating, avgGrade: 166.2}, {specialization: learning, avgGrade: 199.6}]}, {age: 19, grades: [{specialization: getting into troubles, avgGrade: 185.1}]}]|
// |Slytherin |[{age: 18, grades: [{specialization: trolling, avgGrade: 85.1}]}] |
// |Hufflepuff|[{age: 17, grades: [{specialization: young dying, avgGrade: 90.1}]}] |
// +----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
//
Available aggregation functionsβ
toList- collects records into aListdistinct- collects distinct records into aSetdistinctBy- collects distinct records, with specific field to distinguish byflatDistinctBy- same asdistinctBy, but for a collection of fieldsconst- return a constant value as an aggregation resultsum- sum the fieldsumBy- sum the field by some other valueproduct- multiplication of field valuesproductBy- multiplication of field values by some other valueavg- average value of the fieldavgBy- average value of the field by some other valuestd- standard deviation of the fieldstdBy- standard deviation of the field by some other valuecount- number of values for each given predicate holdssize- total number of valuesreport(K1, K2, ...)- report of arity N