Unified API for Your Data, Wherever It Lives

One GraphQL layer for all your data.

Access databases, files, and APIs through a single, secure interface — with domain ownership and centralized governance built-in.

Hugr is an open-source Data Mesh platform and high-performance GraphQL backend.

Designed for seamless access to distributed data sources, advanced analytics, and geospatial processing — Hugr powers rapid backend development for applications and BI tools. It provides a unified GraphQL API across all your data.

Key Benefits

Data Mesh–Ready

Build federated, domain-driven schemas without losing visibility or control.

Geospatial & Analytical Power

Perform spatial joins, aggregations, and OLAP queries — all in GraphQL.

Modern Data Stack Support

Natively integrates with Postgres, DuckDB, Parquet, Iceberg, Delta Lake, and REST APIs.

Cluster-Ready & Extensible

Scale with your workloads or embed the engine directly in your Go services.

Talk-to-data

Coming soon! Leverage natural language queries to access and analyze your data effortlessly.

Secure by Design

Enforce fine-grained access policies with built-in authentication and role-based permissions.

Understanding Data Mesh

Data Mesh is a modern approach where teams own and publish their data as a product — just like they do with APIs or microservices.

Hugr enables this by giving every domain a flexible, secure, and unified way to expose their data using GraphQL.

Use Cases

1. Data Access Backend for Applications

hugr acts as a universal GraphQL layer over data sources:

  • Rapid API deployment over existing databases and files
  • Centralized schema and access control
  • Unified interfaces for apps and BI tools
  • Minimal manual integration
  • Ideal for data-first applications

2. Building Data Mesh Platforms

hugr is perfect for Data Mesh architecture:

  • Modular schema definitions
  • Federated access through a single API
  • Decentralized data ownership
  • Domain-specific modeling and scaling
  • Easy onboarding of teams and data sources

3. Analytics, DataOps and MLOps Integration

hugr enables:

  • Support for OLAP and spatial analytics
  • Export to Arrow IPC and Python (pandas/GeoDataFrame)
  • Server-side jq transformations
  • Caching and scalability for heavy workloads
  • Integration of ETL/ELT and ML pipeline results

4. Vibe/Agentic Analytics

hugr MCP powers vibe and agentic analytics by:

  • Providing modular schema design for diverse data sources
  • Summarizing data objects and their fields, relationships, functions, modules, and data source descriptions with LLMs to capture business context
  • Exposing lazy hugr schema introspection tools that let models generate GraphQL queries from user requests
  • Letting models build complex queries over multiple data sources and chain queries to fetch and aggregate data as needed
  • Letting models build jq transformations to process and filter data server-side before returning results to users

Quick Setup in 2 Minutes

1. Start hugr in a Container

Make sure you have Docker installed on your machine. Read deployment guide →

Start hugr container:

docker run -d --name hugr -p 15000:15000 -v ./schemas:/schemas ghcr.io/hugr-lab/automigrate:latest

Access the admin UI:

http://localhost:15000/admin

Stop container:

docker stop hugr

View logs:

docker logs -f hugr

2. Connect a DuckDB Database

Use DuckDB as an embedded analytical database with an auto-generated schema.

mutation AddDuckDBSource {
  core {
    insert_data_sources(data: {
      name: "analytics"
      type: "duckdb"
      description: "DuckDB analytics database"
      path: "/data/analytics.db"
      as_module: true
      self_defined: true
    }) {
      name
      type
    }
  }
}

# Load the data source
mutation LoadDuckDBSource {
  function {
    core {
      load_data_source(name: "analytics") {
        success
        message
      }
    }
  }
}

Execute this mutation in the admin UI at http://localhost:15000/admin
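
Once the source is loaded, its tables become queryable through the unified GraphQL API. The sketch below is illustrative only: the events table is hypothetical, and the exact query shape and argument names depend on the schema hugr generates from your database (with as_module: true, the source should appear under an analytics module).

query AnalyticsSample {
  analytics {
    # "events" is a hypothetical table; replace it with one from your database.
    # The limit argument name is an assumption; see the queries documentation.
    events(limit: 10) {
      id
      name
    }
  }
}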

Frequently Asked Questions

What is hugr?

hugr is an open-source Data Mesh platform and high-performance GraphQL backend for accessing distributed data sources. It provides a unified GraphQL API across diverse sources including databases (PostgreSQL, MySQL, DuckDB), file formats (Parquet, Iceberg, Delta Lake), and REST APIs.

hugr enables rapid API development, analytics & BI, geospatial processing, and serves as a universal data access layer for applications.

Learn more about hugr →

What is Data Mesh and how does hugr enable it?

Data Mesh is a decentralized approach to data architecture that treats data as a product, with domain-specific ownership. hugr enables Data Mesh by providing:

  • Modular schema definitions that can be reused across different sources
  • Federated access through a single GraphQL API
  • Domain-specific modeling and independent scaling
  • Decentralized data ownership while maintaining unified access

Learn more about Data Mesh architecture →

What data sources does hugr support?

hugr supports multiple data source types:

  • Relational Databases: DuckDB, PostgreSQL (with PostGIS, TimescaleDB, pgvector), MySQL
  • File Formats: Parquet, Apache Iceberg, Delta Lake, CSV, JSON
  • Spatial Formats: GeoParquet, GeoJSON, Shapefiles (GDAL compatible)
  • Services: REST APIs with authentication (HTTP Basic, ApiKey, OAuth2)
  • Storage: Local files and cloud object storage (S3-compatible)
  • Coming Soon: DuckLake - a data lake solution for managing large volumes of data with snapshot-based schema evolution
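
Attaching another source follows the same pattern as the DuckDB mutation shown in the quick setup above. The sketch below is illustrative: the "postgres" type value and the connection string format are assumptions, so check the data sources guide for the exact values.

mutation AddPostgresSource {
  core {
    insert_data_sources(data: {
      name: "crm"
      # The type identifier and connection string format are assumptions
      type: "postgres"
      description: "CRM PostgreSQL database"
      path: "postgres://user:password@db-host:5432/crm"
      as_module: true
    }) {
      name
      type
    }
  }
}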

Learn more about data sources →

How are data source schemas defined?

Data sources are described using GraphQL SDL (Schema Definition Language) with hugr-specific directives. Key directives include:

  • @table - Define database tables
  • @view - Define views with SQL expressions
  • @field_references - Define relationships between tables
  • @join - Define subquery fields in schema for data selection
  • @module - Organize schema into logical modules
  • @function - Define custom functions

Schema files are stored in catalogs and can be located in file systems, HTTP endpoints, or S3 buckets.
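
For orientation, here is a minimal sketch of what a data source schema might look like. The table and field names are made up, and the directive argument names are assumptions; consult the schema directives reference below for the exact signatures.

# Hypothetical schema; directive argument names are assumptions.
type customers @table(name: "customers") {
  id: Int!
  name: String!
}

type orders @table(name: "orders") {
  id: Int!
  total: Float!
  # Relationship to customers; argument names are illustrative
  customer_id: Int! @field_references(references_name: "customers", field: "id")
}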

Learn more about schema directives →

What query and mutation capabilities does hugr provide?

Queries:

  • Basic CRUD operations with filtering, sorting, and pagination
  • Complex aggregations (count, sum, avg, min, max) and bucket aggregations
  • Cross-source queries and relationships
  • Spatial joins and geospatial operations
  • Vector search for semantic similarity

Mutations:

  • Insert records with nested relations
  • Update multiple records with filters
  • Delete with conditional filters
  • Full transaction support within single requests
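
To give a feel for the query style, here is an illustrative sketch. The orders type is hypothetical, and the filter argument shape is an assumption; hugr generates the actual fields and arguments from your schema, so see the query and mutation guides below for specifics.

query RecentLargeOrders {
  # Hypothetical table; filter and limit argument shapes are assumptions
  orders(filter: { total: { gte: 100 } }, limit: 10) {
    id
    total
    customer {
      name
    }
  }
}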

Learn more about queries → | Learn more about mutations →

Does hugr support caching?

Yes, hugr provides a comprehensive two-level caching system:

  • L1 Cache (In-Memory): Fast local cache for quick access
  • L2 Cache (Distributed): Redis/Memcached for shared cache across cluster nodes

Caching is controlled via directives:

  • @cache - Enable caching with configurable TTL and tags
  • @no_cache - Disable caching for real-time data
  • @invalidate_cache - Force cache refresh

Automatic cache invalidation occurs on mutations based on tags.
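
As an illustration, the caching directive might be applied to a query field like this. The orders field is hypothetical, and the ttl and tags argument names are assumptions based on the description above; check the caching guide for the exact signature and where the directive may be placed.

query CachedOrders {
  # ttl (seconds) and tags argument names are assumptions; verify against the docs
  orders(limit: 100) @cache(ttl: 300, tags: ["orders"]) {
    id
    total
  }
}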

Learn more about caching →

What authentication methods does hugr support?

hugr supports multiple authentication methods:

  • API Keys: Static keys for service-to-service communication, with support for managed keys stored in database
  • OAuth2/JWT: Token-based authentication with standard JWT claims and custom claim mapping
  • OIDC: OpenID Connect for enterprise identity providers (Google, Auth0, Keycloak, Azure AD)
  • Anonymous: Unauthenticated access with limited permissions

Multiple methods can be enabled simultaneously, and hugr tries each in order.

Learn more about authentication →

How does access control work in hugr?

hugr uses role-based access control (RBAC) managed through the GraphQL API:

  • Roles: Define user roles in the roles table
  • Permissions: Configure field-level and type-level access in role_permissions table
  • Row-Level Security: Apply mandatory filters to restrict data access
  • Default Values: Auto-inject values in mutations (e.g., user_id, tenant_id)

Permissions support wildcards (*) for broad rules with specific exceptions. Access is open by default; add permission entries to restrict.

Learn more about access control →

Why does hugr use DuckDB?

DuckDB is a high-performance analytical database engine optimized for OLAP workloads. hugr uses DuckDB as its core query engine because:

  • Optimized for analytical queries and aggregations
  • Native support for multiple data formats (Parquet, CSV, JSON)
  • In-process execution with efficient memory usage
  • Excellent performance for large-scale data processing
  • Can attach external databases (PostgreSQL, MySQL) and query them together

Learn more about DuckDB integration →

Does hugr support geospatial data?

Yes, hugr has native support for geospatial operations:

  • Native Geometry scalar type for spatial fields
  • Support for PostGIS (PostgreSQL) and DuckDB spatial extension
  • Spatial file formats: GeoParquet, GeoJSON, Shapefiles
  • Spatial joins and aggregations across data sources
  • Distance-based queries and spatial relationships
  • H3 clustering for hierarchical spatial indexing

Learn more about spatial queries → | Learn more about H3 clustering →

How does hugr scale?

hugr is designed for enterprise-scale deployments:

  • Horizontal Scaling: Stateless nodes that can be added/removed dynamically
  • Cluster Mode: Multi-node operation with load balancing and fault tolerance
  • Caching: Two-level cache (in-memory + Redis/Memcached) reduces database load
  • Performance: Query optimization and pushdown to data sources
  • Kubernetes Ready: Helm charts for easy K8s deployment

Learn more about cluster mode → | Learn more about container deployment →

What output formats does hugr support?

hugr supports multiple output formats:

  • GraphQL JSON: Standard GraphQL response format
  • Arrow IPC: Efficient binary format for large datasets via the hugr multipart IPC protocol
  • Python Integration: Direct export to pandas DataFrame and GeoDataFrame
  • JQ Transformations: Server-side data transformation with custom JSON output

The Arrow IPC protocol enables efficient streaming of large datasets to analytics and ML pipelines.

Learn more about Arrow IPC → | Learn more about Python client → | Learn more about JQ transformations →

Powered by DuckDB

hugr leverages DuckDB - the blazing-fast in-process analytical database - as its core engine. This enables fast cross-source JOINs and aggregations directly in memory, combining data from PostgreSQL, S3 Parquet files, CSV, and geospatial formats in a single GraphQL query. With no extra network hop between the API layer and the query engine, and OLAP-optimized performance, DuckDB makes hugr a strong fit for analytical workloads and data mesh architectures.