Aggregations
Hugr provides powerful aggregation capabilities for data analysis, including single-row aggregations and bucket (group-by) aggregations. These are automatically generated for all data objects.
Single Row Aggregation
Aggregate all records into a single result:
query {
customers_aggregation {
_rows_count
age {
avg
min
max
}
created_at {
min
max
}
}
}
Available Aggregation Functions
Numeric fields (Int, Float, BigInt):
count
- Count of non-null valuessum
- Sum of valuesavg
- Average valuemin
- Minimum valuemax
- Maximum valuestddev
- Standard deviationvariance
- Variance
String fields:
count
- Count of non-null valuesmin
- Alphabetically first valuemax
- Alphabetically last valuestring_agg(separator: String!)
- Concatenate with separatorlist(distinct: Boolean)
- Array of values
Boolean fields:
count
- Count of non-null valuesbool_and
- Logical AND of all valuesbool_or
- Logical OR of all values
Date/Timestamp fields:
count
- Count of non-null valuesmin
- Earliest date/timemax
- Latest date/time
Filtered Aggregation
Apply filters before aggregating:
query {
orders_aggregation(
filter: {
status: { eq: "completed" }
order_date: { gte: "2024-01-01" }
}
) {
_rows_count
total {
sum
avg
}
}
}
Bucket Aggregation (GROUP BY)
Group records and aggregate each group:
query {
orders_bucket_aggregation {
key {
status
customer {
country
}
}
aggregations {
_rows_count
total {
sum
avg
}
}
}
}
Sorting Bucket Results
Sort by aggregated values or keys:
query {
products_bucket_aggregation(
order_by: [
{ field: "aggregations.sales.sum", direction: DESC }
]
) {
key {
category {
name
}
}
aggregations {
_rows_count
sales {
sum
}
}
}
}
Time-Based Grouping
Group by time buckets:
query {
orders_bucket_aggregation {
key {
order_date(bucket: month)
year: _order_date_part(extract: year)
month: _order_date_part(extract: month)
}
aggregations {
_rows_count
revenue {
sum
}
}
}
}
Available time buckets:
minute
hour
day
week
month
quarter
year
Custom Intervals
Use custom time intervals:
query {
sensor_data_bucket_aggregation {
key {
timestamp(bucket_interval: "15 minutes")
}
aggregations {
temperature {
avg
min
max
}
}
}
}
Nested Aggregations
Aggregate related data through relationships:
query {
customers_aggregation {
_rows_count
orders {
_rows_count
total {
sum
avg
}
order_details {
quantity {
sum
}
}
}
}
}
Multiple Aggregations with Filters
Apply different filters to different aggregations:
query {
products_bucket_aggregation {
key {
category {
name
}
}
aggregations {
_rows_count
}
in_stock: aggregations(
filter: { stock_quantity: { gt: 0 } }
) {
_rows_count
stock_quantity {
sum
}
}
on_sale: aggregations(
filter: { discount: { gt: 0 } }
) {
_rows_count
discount {
avg
}
}
}
}
JSON Field Aggregation
Aggregate data within JSON fields:
query {
events_aggregation {
metadata {
count(path: "$.user_id")
sum(path: "$.score")
avg(path: "$.duration")
list(path: "$.tags", distinct: true)
}
}
}
Geometry Aggregation
Aggregate geometric data:
query {
locations_aggregation {
_rows_count
area {
union # Union of all geometries
extent # Bounding box
}
}
}
Sub-aggregations
For aggregated fields, apply additional aggregation functions:
query {
sales_bucket_aggregation {
key {
region
}
aggregations {
stores {
_rows_count
}
stores_aggregation {
revenue {
sum {
sum # Sum of sums
avg # Average of sums
min # Minimum sum
max # Maximum sum
}
}
}
}
}
}
Performance Considerations
Limit Aggregated Data
Use filters and limits to reduce the dataset:
query {
large_table_aggregation(
filter: { created_at: { gte: "2024-01-01" } }
limit: 10000
) {
_rows_count
amount {
sum
}
}
}
Optimize GROUP BY
Limit the number of groups:
query {
orders_bucket_aggregation(
limit: 100 # Return only top 100 groups
order_by: [
{ field: "aggregations.total.sum", direction: DESC }
]
) {
key {
customer_id
}
aggregations {
total {
sum
}
}
}
}
Use Indexes
Ensure proper indexes exist for:
- Fields used in GROUP BY (key fields)
- Fields used in filters
- Fields used in ORDER BY