
Unified API for Your Data, Wherever It Lives

One GraphQL layer for all your data.

Access databases, files, and APIs through a single, secure interface — with domain ownership and centralized governance built in.

The hugr Data Mesh platform and GraphQL backend

Hugr is an open-source Data Mesh platform and a high-performance GraphQL backend.

Designed for seamless access to distributed data sources, advanced analytics, and geospatial processing — Hugr powers rapid backend development for applications and BI tools. It provides a unified GraphQL API across all your data.

Key Benefits

Data Mesh–Ready

Build federated, domain-driven schemas without losing visibility or control.

Geospatial & Analytical Power

Perform spatial joins, aggregations, and OLAP queries — all in GraphQL.

Modern Data Stack Support

Natively integrates with Postgres, DuckDB, Parquet, Iceberg, Delta Lake, and REST APIs.

Cluster-Ready & Extensible

Scale with your workloads or embed the engine directly in your Go services.

Talk-to-Data

Coming soon! Leverage natural language queries to access and analyze your data effortlessly.

Secure by Design

Enforce fine-grained access policies with built-in authentication and role-based permissions.

Understanding Data Mesh

Data Mesh is a modern approach where teams own and publish their data as a product — just like they do with APIs or microservices.

Hugr enables this by giving every domain a flexible, secure, and unified way to expose their data using GraphQL.

Data Mesh visualization

Use Cases

1. Data Access Backend for Applications

hugr acts as a universal GraphQL layer over data sources:

  • Rapid API deployment over existing databases and files
  • Centralized schema and access control
  • Unified interfaces for apps and BI tools
  • Minimal manual integration
  • Ideal for data-first applications
Data Mesh Platform Architecture

2. Building Data Mesh Platforms

hugr is perfect for Data Mesh architecture:

  • Modular schema definitions
  • Federated access through a single API
  • Decentralized data ownership
  • Domain-specific modeling and scaling
  • Easy onboarding of teams and data sources
Analytics and MLOps Integration

3. Analytics, DataOps and MLOps Integration

hugr enables:

  • Support for OLAP and spatial analytics
  • Export to Arrow IPC and Python (pandas/GeoDataFrame)
  • Server-side jq transformations
  • Caching and scalability for heavy workloads
  • Integration of ETL/ELT and ML pipeline results
Vibe analytics

4. Vibe/Agentic Analytics

hugr MCP powers vibe and agentic analytics by providing:

  • Modular schema design for diverse data sources
  • Summaries of data objects and their fields, relationships, functions, modules, and data source descriptions, generated with LLMs to capture business context
  • Lazy schema introspection tools that automatically generate GraphQL queries from user requests
  • Support for models building complex queries over multiple data sources and chaining queries to fetch and aggregate data as needed
  • Support for models building jq transformations to process and filter data server-side before returning results to users

Quick Setup in 2 Minutes

1. Start Hugr in a Container

Make sure you have Docker installed on your machine. Read deployment guide →

Start hugr container:

docker run -d --name hugr -p 15000:15000 -v ./schemas:/schemas ghcr.io/hugr-lab/automigrate:latest

Access the admin UI:

http://localhost:15000/admin

Stop container:

docker stop hugr

View logs:

docker logs -f hugr

Connect DuckDB Database

Use DuckDB as an embedded analytical database with an auto-generated schema:

mutation AddDuckDBSource {
  core {
    insert_data_sources(data: {
      name: "analytics"
      type: "duckdb"
      description: "DuckDB analytics database"
      path: "/data/analytics.db"
      as_module: true
      self_defined: true
    }) {
      name
      type
    }
  }
}

# Load the data source
mutation LoadDuckDBSource {
  function {
    core {
      load_data_source(name: "analytics") {
        success
        message
      }
    }
  }
}

Execute this mutation in the admin UI at http://localhost:15000/admin

Frequently Asked Questions

What is hugr?

hugr is an open-source Data Mesh platform and high-performance GraphQL backend for accessing distributed data sources. It provides a unified GraphQL API across diverse sources including databases (PostgreSQL, MySQL, DuckDB, SQL Server), data lakes (DuckLake, Apache Iceberg), file formats (Parquet, Delta Lake), and REST APIs.

hugr enables rapid API development, analytics & BI, geospatial processing, and serves as a universal data access layer for applications.

Learn more about hugr →

What is Data Mesh and how does hugr enable it?

Data Mesh is a decentralized approach to data architecture that treats data as a product, with domain-specific ownership. hugr enables Data Mesh by providing:

  • Modular schema definitions that can be reused across different sources
  • Federated access through a single GraphQL API
  • Domain-specific modeling and independent scaling
  • Decentralized data ownership while maintaining unified access

Learn more about Data Mesh architecture →

Which data sources does hugr support?

hugr supports multiple data source types:

  • Relational Databases: DuckDB, PostgreSQL (with PostGIS, TimescaleDB, pgvector), MySQL, SQL Server / Azure SQL
  • Data Lakes: DuckLake (snapshot-based time-travel, schema versioning), Apache Iceberg (REST catalogs, AWS Glue, S3 Tables — with time-travel and self-describing schema)
  • File Formats: Parquet, Delta Lake, CSV, JSON
  • Spatial Formats: GeoParquet, GeoJSON, Shapefiles (GDAL compatible)
  • Services: REST APIs with authentication (HTTP Basic, ApiKey, OAuth2)
  • Storage: Local files and cloud object storage (S3-compatible)
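Following the insert_data_sources pattern shown in the quick-setup example above, attaching a PostgreSQL source might look like the sketch below (the type string and connection path format are assumptions; check the data sources guide for the exact values):

```graphql
mutation AddPostgresSource {
  core {
    insert_data_sources(data: {
      name: "crm"
      type: "postgres"
      description: "CRM PostgreSQL database"
      path: "postgres://hugr:secret@db-host:5432/crm"
      as_module: true
      self_defined: true
    }) {
      name
      type
    }
  }
}
```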

Learn more about data sources →

How are data source schemas defined?

Data sources are described using GraphQL SDL (Schema Definition Language) with hugr-specific directives. Key directives include:

  • @table - Define database tables
  • @view - Define views with SQL expressions
  • @field_references - Define relationships between tables
  • @join - Define subquery fields in schema for data selection
  • @module - Organize schema into logical modules
  • @function - Define custom functions

Schema files are stored in catalogs and can be located in file systems, HTTP endpoints, or S3 buckets.
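As an illustrative sketch (type, field, and directive-argument names here are hypothetical; see the schema directives reference for exact signatures), a table definition in hugr SDL might look like:

```graphql
# Hypothetical example — directive arguments may differ from hugr's actual syntax
type orders @table(name: "orders") {
  id: Int!
  customer_id: Int! @field_references(references_name: "customers", field: "id")
  total: Float!
  created_at: String!
}
```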

Learn more about schema directives →

What query and mutation capabilities does hugr offer?

Queries:

  • Basic CRUD operations with filtering, sorting, and pagination
  • Complex aggregations (count, sum, avg, min, max) and bucket aggregations
  • Cross-source queries and relationships
  • Spatial joins and geospatial operations
  • Vector search for semantic similarity
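For instance, a filtered, sorted, and paginated query might look like the following sketch (argument names such as filter, order_by, and limit are assumptions based on common GraphQL conventions; see the queries guide for the actual API):

```graphql
# Hypothetical query — field and argument names are illustrative
query RecentShippedOrders {
  orders(
    filter: { status: { eq: "shipped" } }
    order_by: [{ field: "created_at", direction: DESC }]
    limit: 10
  ) {
    id
    customer_id
    total
  }
}
```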

Mutations:

  • Insert records with nested relations
  • Update multiple records with filters
  • Delete with conditional filters
  • Full transaction support within single requests
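A filtered bulk update might look like this sketch (mutation and field names are assumptions; see the mutations guide for the generated API shape):

```graphql
# Hypothetical mutation — names are illustrative
mutation CancelPendingOrders {
  update_orders(
    filter: { status: { eq: "pending" } }
    data: { status: "cancelled" }
  ) {
    affected_rows
  }
}
```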

Learn more about queries → | Learn more about mutations →

Does hugr support caching?

Yes, hugr provides a comprehensive two-level caching system:

  • L1 Cache (In-Memory): Fast local cache for quick access
  • L2 Cache (Distributed): Redis/Memcached for shared cache across cluster nodes

Caching is controlled via directives:

  • @cache - Enable caching with configurable TTL and tags
  • @no_cache - Disable caching for real-time data
  • @invalidate_cache - Force cache refresh

Automatic cache invalidation occurs on mutations based on tags.
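Enabling caching on a type might look like the following sketch (the directive argument names and TTL format are assumptions; consult the caching reference for the real signature):

```graphql
# Hypothetical example — @cache argument names are illustrative
type daily_stats @table(name: "daily_stats") @cache(ttl: 300, tags: ["stats"]) {
  day: String!
  orders_count: Int!
  revenue: Float!
}
```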

Learn more about caching →

Which authentication methods does hugr support?

hugr supports multiple authentication methods:

  • API Keys: Static keys for service-to-service communication, with support for managed keys stored in database
  • OAuth2/JWT: Token-based authentication with standard JWT claims and custom claim mapping
  • OIDC: OpenID Connect for enterprise identity providers (Google, Auth0, Keycloak, Azure AD)
  • Anonymous: Unauthenticated access with limited permissions

Multiple methods can be enabled simultaneously, and hugr tries each in order.

Learn more about authentication →

How does access control work in hugr?

hugr uses role-based access control (RBAC) managed through GraphQL API:

  • Roles: Define user roles in the roles table
  • Permissions: Configure field-level and type-level access in role_permissions table
  • Row-Level Security: Apply mandatory filters to restrict data access
  • Default Values: Auto-inject values in mutations (e.g., user_id, tenant_id)

Permissions support wildcards (*) for broad rules with specific exceptions. Access is open by default; add permission entries to restrict.
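Since roles live in the roles table and are managed through the GraphQL API, creating one might look like the sketch below (the mutation and field names are assumptions modeled on the insert_data_sources example; see the access control guide):

```graphql
# Hypothetical mutation — names are illustrative
mutation AddAnalystRole {
  core {
    insert_roles(data: {
      name: "analyst"
      description: "Read-only access to analytics data"
    }) {
      name
    }
  }
}
```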

Learn more about access control →

Why is hugr built on DuckDB?

DuckDB is a high-performance analytical database engine optimized for OLAP workloads. hugr uses DuckDB as its core query engine because:

  • Optimized for analytical queries and aggregations
  • Native support for multiple data formats (Parquet, CSV, JSON)
  • In-process execution with efficient memory usage
  • Excellent performance for large-scale data processing
  • Can attach external databases (PostgreSQL, MySQL) and query them together

Learn more about DuckDB integration →

Does hugr support geospatial data?

Yes, hugr has native support for geospatial operations:

  • Native Geometry scalar type for spatial fields
  • Support for PostGIS (PostgreSQL) and DuckDB spatial extension
  • Spatial file formats: GeoParquet, GeoJSON, Shapefiles
  • Spatial joins and aggregations across data sources
  • Distance-based queries and spatial relationships
  • H3 clustering for hierarchical spatial indexing
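A spatial filter over a Geometry field might look like this sketch (the filter operators and GeoJSON-style geometry input shown here are assumptions; see the spatial queries guide for the supported predicates):

```graphql
# Hypothetical query — spatial filter syntax is illustrative
query StoresInArea {
  stores(
    filter: {
      geom: {
        intersects: {
          type: "Polygon"
          coordinates: [[[30.50, 50.40], [30.60, 50.40], [30.60, 50.50], [30.50, 50.50], [30.50, 50.40]]]
        }
      }
    }
  ) {
    id
    name
    geom
  }
}
```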

Learn more about spatial queries → | Learn more about H3 clustering →

How does hugr scale?

hugr is designed for enterprise-scale deployments:

  • Horizontal Scaling: Stateless nodes that can be added/removed dynamically
  • Cluster Mode: Multi-node operation with load balancing and fault tolerance
  • Caching: Two-level cache (in-memory + Redis/Memcached) reduces database load
  • Performance: Query optimization and pushdown to data sources
  • Kubernetes Ready: Helm charts for easy K8s deployment

Learn more about cluster mode → | Learn more about container deployment →

Which output formats does hugr support?

hugr supports multiple output formats:

  • GraphQL JSON: Standard GraphQL response format
  • Arrow IPC: Efficient binary format for large datasets via Hugr multipart IPC protocol
  • Python Integration: Direct export to pandas DataFrame and GeoDataFrame
  • JQ Transformations: Server-side data transformation with custom JSON output

The Arrow IPC protocol enables efficient streaming of large datasets to analytics and ML pipelines.

Learn more about Arrow IPC → | Learn more about Python client → | Learn more about JQ transformations →

DuckDB Logo

Powered by DuckDB

hugr leverages DuckDB, the blazing-fast in-process analytical database, as its core engine. This enables lightning-fast cross-source JOINs and aggregations directly in memory, combining data from PostgreSQL, S3 Parquet files, CSV, and geospatial formats in a single GraphQL query. With zero network latency and OLAP-optimized performance, DuckDB makes hugr the perfect choice for analytical workloads and data mesh architectures.