Unified API for Your Data, Wherever It Lives

Hugr is an open-source Data Mesh platform and high-performance GraphQL backend.

Designed for seamless access to distributed data sources, advanced analytics, and geospatial processing — Hugr powers rapid backend development for applications and BI tools. It provides a unified GraphQL API across all your data.

Key Benefits

Data Mesh–Ready

Build federated, domain-driven schemas without losing visibility or control.

Geospatial & Analytical Power

Perform spatial joins, aggregations, and OLAP queries — all in GraphQL.

Modern Data Stack Support

Natively integrates with Postgres, DuckDB, Parquet, Iceberg, Delta Lake, and REST APIs.

Cluster-Ready & Extensible

Scale with your workloads or embed the engine directly in your Go services.

Talk-to-data

Coming soon! Use natural language queries to access and analyze your data.

Secure by Design

Enforce fine-grained access policies with built-in authentication and role-based permissions.

Understanding Data Mesh

Data Mesh is a modern approach where teams own and publish their data as a product — just like they do with APIs or microservices.

Hugr enables this by giving every domain a flexible, secure, and unified way to expose its data using GraphQL.

Use Cases

1. Data Access Backend for Applications

hugr acts as a universal GraphQL layer over data sources:

Rapid API deployment over existing databases and files
Centralized schema and access control
Unified interfaces for apps and BI tools
Minimal manual integration
Ideal for data-first applications

2. Building Data Mesh Platforms

hugr is perfect for Data Mesh architecture:

Modular schema definitions
Federated access through a single API
Decentralized data ownership
Domain-specific modeling and scaling
Easy onboarding of teams and data sources

3. Analytics, DataOps and MLOps Integration

hugr enables:

Support for OLAP and spatial analytics
Export to Arrow IPC and Python (pandas/GeoDataFrame)
Server-side jq transformations
Caching and scalability for heavy workloads
Integration of ETL/ELT and ML pipeline results

4. Vibe/Agentic Analytics

hugr MCP supports vibe and agentic analytics by:

Organizing diverse data sources into a modular schema
Summarizing data objects, their fields, relationships, functions, modules, and data source descriptions with LLMs, so that models understand the business context
Providing lazy schema introspection tools that models use to generate GraphQL queries from user requests
Letting models build complex queries over multiple data sources and run chains of queries to fetch and aggregate data as needed
Letting models build jq transformations that process and filter data server-side before results are returned to users

Quick Setup in 2 Minutes

1. Start Hugr in Container

Make sure you have Docker installed on your machine. Read deployment guide →

Start hugr container:

docker run -d --name hugr -p 15000:15000 -v ./schemas:/schemas ghcr.io/hugr-lab/automigrate:latest

Access the admin UI:

http://localhost:15000/admin

Stop container:

docker stop hugr

View logs:

docker logs -f hugr

Connect DuckDB Database

Use DuckDB as an embedded analytical database with an auto-generated schema:

mutation AddDuckDBSource {
  core {
    insert_data_sources(data: {
      name: "analytics"
      type: "duckdb"
      description: "DuckDB analytics database"
      path: "/data/analytics.db"
      as_module: true
      self_defined: true
    }) {
      name
      type
    }
  }
}

# Load the data source
mutation LoadDuckDBSource {
  function {
    core {
      load_data_source(name: "analytics") {
        success
        message
      }
    }
  }
}

Execute both mutations in the admin UI at http://localhost:15000/admin
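The mutations above can also be sent over HTTP instead of through the admin UI. The following is a minimal Python sketch, not an official client. It assumes the GraphQL endpoint is served at http://localhost:15000/query (the path is an assumption and is not stated on this page) and that it accepts the usual GraphQL-over-HTTP JSON body. `build_request` and `run` are illustrative helper names.

```python
import json
import urllib.request

# ASSUMPTION: the endpoint path is not given above; adjust it to your deployment.
HUGR_URL = "http://localhost:15000/query"

# The second mutation from the quick-start, unchanged.
LOAD_SOURCE = """
mutation LoadDuckDBSource {
  function {
    core {
      load_data_source(name: "analytics") {
        success
        message
      }
    }
  }
}
"""


def build_request(query: str, url: str = HUGR_URL) -> urllib.request.Request:
    """Wrap a GraphQL document in a JSON POST request."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run(query: str, url: str = HUGR_URL) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_request(query, url), timeout=10) as resp:
        return json.load(resp)
```

With the container running, `run(LOAD_SOURCE)` would return the decoded response body. Keeping `build_request` separate makes the payload easy to inspect without a server.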

Frequently Asked Questions

What is hugr?

hugr is an open-source Data Mesh platform and high-performance GraphQL backend for accessing distributed data sources. It provides a unified GraphQL API across diverse sources including databases (PostgreSQL, MySQL, DuckDB, SQL Server), data lakes (DuckLake, Apache Iceberg), file formats (Parquet, Delta Lake), and REST APIs.

hugr enables rapid API development, analytics & BI, geospatial processing, and serves as a universal data access layer for applications.

Learn more about hugr →

What is Data Mesh, and how does hugr support it?

Data Mesh is a decentralized approach to data architecture that treats data as a product, with domain-specific ownership. hugr enables Data Mesh by providing:

Modular schema definitions that can be reused across different sources
Federated access through a single GraphQL API
Domain-specific modeling and independent scaling
Decentralized data ownership while maintaining unified access

Learn more about Data Mesh architecture →

Which data sources does hugr support?

hugr supports multiple data source types:

Relational Databases: DuckDB, PostgreSQL (with PostGIS, TimescaleDB, pgvector), MySQL, SQL Server / Azure SQL
Data Lakes: DuckLake (snapshot-based time-travel, schema versioning), Apache Iceberg (REST catalogs, AWS Glue, S3 Tables — with time-travel and self-describing schema)
File Formats: Parquet, Delta Lake, CSV, JSON
Spatial Formats: GeoParquet, GeoJSON, Shapefiles (GDAL compatible)
Services: REST APIs with authentication (HTTP Basic, ApiKey, OAuth2)
Storage: Local files and cloud object storage (S3-compatible)

Learn more about data sources →

How are data source schemas defined?

Data sources are described using GraphQL SDL (Schema Definition Language) with hugr-specific directives. Key directives include:

@table - Define database tables
@view - Define views with SQL expressions
@field_references - Define relationships between tables
@join - Define subquery fields in schema for data selection
@module - Organize schema into logical modules
@function - Define custom functions

Schema files are stored in catalogs and can be located in file systems, HTTP endpoints, or S3 buckets.

Learn more about schema directives →

Which GraphQL operations does hugr support?

Queries:

Basic CRUD operations with filtering, sorting, and pagination
Complex aggregations (count, sum, avg, min, max) and bucket aggregations
Cross-source queries and relationships
Spatial joins and geospatial operations
Vector search for semantic similarity

Mutations:

Insert records with nested relations
Update multiple records with filters
Delete with conditional filters
Full transaction support within single requests

Learn more about queries → | Learn more about mutations →

Does hugr support caching?

Yes, hugr provides a two-level caching system:

L1 Cache (In-Memory): Fast local cache for quick access
L2 Cache (Distributed): Redis/Memcached for shared cache across cluster nodes

Caching is controlled via directives:

@cache - Enable caching with configurable TTL and tags
@no_cache - Disable caching for real-time data
@invalidate_cache - Force cache refresh

Automatic cache invalidation occurs on mutations based on tags.
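To make the two-level lookup and tag-based invalidation concrete, here is a conceptual Python sketch. It is not hugr's implementation: a plain dict stands in for Redis/Memcached, TTLs are omitted, and `TwoLevelCache` and its methods are hypothetical names.

```python
from typing import Any, Callable, Dict, Iterable, Set


class TwoLevelCache:
    """L1 is a per-process dict; L2 stands in for a shared store such as Redis."""

    def __init__(self, shared: Dict[str, Any]) -> None:
        self.l1: Dict[str, Any] = {}
        self.l2 = shared                       # shared between nodes
        self.tags: Dict[str, Set[str]] = {}    # tag -> cache keys

    def get(self, key: str, load: Callable[[], Any],
            tags: Iterable[str] = ()) -> Any:
        if key in self.l1:                     # fastest path: local memory
            return self.l1[key]
        if key in self.l2:                     # shared hit: promote to L1
            self.l1[key] = self.l2[key]
            return self.l1[key]
        value = load()                         # miss: query the data source
        self.l1[key] = self.l2[key] = value
        for tag in tags:
            self.tags.setdefault(tag, set()).add(key)
        return value

    def invalidate(self, tag: str) -> None:
        """Drop every entry registered under a tag, e.g. after a mutation."""
        for key in self.tags.pop(tag, set()):
            self.l1.pop(key, None)
            self.l2.pop(key, None)
```

A second node built on the same shared dict is served from L2 without touching the data source. In this simplified version, `invalidate` clears only the calling node's L1 and the shared L2; a real cluster would also need to expire or notify the other nodes' L1 caches.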

Learn more about caching →

Which authentication methods does hugr support?

hugr supports multiple authentication methods:

API Keys: Static keys for service-to-service communication, with support for managed keys stored in the database
OAuth2/JWT: Token-based authentication with standard JWT claims and custom claim mapping
OIDC: OpenID Connect for enterprise identity providers (Google, Auth0, Keycloak, Azure AD)
Anonymous: Unauthenticated access with limited permissions

Multiple methods can be enabled simultaneously, and hugr tries each in order.
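The "tries each in order" behavior can be pictured as a chain of authenticators. This Python sketch is only an illustration under stated assumptions: the header names, the key store, and the role names are hypothetical, and real token handling would verify a JWT signature rather than check a prefix.

```python
from typing import Callable, Dict, List, Optional

Headers = Dict[str, str]
# Each method returns a role name when it recognizes the request, else None.
AuthMethod = Callable[[Headers], Optional[str]]


def api_key_auth(headers: Headers) -> Optional[str]:
    # Hypothetical header name and key store, for illustration only.
    keys = {"secret-service-key": "service"}
    return keys.get(headers.get("x-api-key", ""))


def bearer_auth(headers: Headers) -> Optional[str]:
    # Placeholder check: a real deployment verifies the JWT and maps its claims.
    token = headers.get("authorization", "")
    if token.startswith("Bearer ") and len(token) > len("Bearer "):
        return "user"
    return None


def anonymous_auth(headers: Headers) -> Optional[str]:
    return "anonymous"  # always matches, so it belongs last in the chain


def authenticate(headers: Headers, methods: List[AuthMethod]) -> Optional[str]:
    """Return the role from the first method that accepts the request."""
    for method in methods:
        role = method(headers)
        if role is not None:
            return role
    return None
```

Order matters in such a chain: a catch-all method like anonymous access has to come last, otherwise it would shadow every method after it.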

Learn more about authentication →

How does hugr control access to data?

hugr uses role-based access control (RBAC) managed through the GraphQL API:

Roles: Define user roles in the roles table
Permissions: Configure field-level and type-level access in the role_permissions table
Row-Level Security: Apply mandatory filters to restrict data access
Default Values: Auto-inject values in mutations (e.g., user_id, tenant_id)

Permissions support wildcards (*) for broad rules with specific exceptions. Access is open by default; add permission entries to restrict.
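One way to read "broad rules with specific exceptions" is that the most specific matching entry wins, and that a request with no matching entry is allowed. The Python sketch below encodes that reading. The precedence order and the `(type_name, field_name) -> disabled` shape are assumptions made for illustration, not the actual layout of role_permissions.

```python
from typing import Dict, Tuple

# (type_name, field_name) -> disabled?  "*" matches any name.
Rules = Dict[Tuple[str, str], bool]


def is_allowed(rules: Rules, type_name: str, field_name: str) -> bool:
    """Most specific matching rule wins; with no matching rule, access is open."""
    candidates = [
        (type_name, field_name),  # exact entry (the specific exception)
        (type_name, "*"),         # every field of one type
        ("*", field_name),        # one field name on every type
        ("*", "*"),               # broad rule for everything
    ]
    for key in candidates:
        if key in rules:
            return not rules[key]
    return True  # open by default


# Deny everything, re-open the orders type, but keep one field hidden.
analyst_rules: Rules = {
    ("*", "*"): True,
    ("orders", "*"): False,
    ("orders", "internal_note"): True,
}
```

With `analyst_rules`, `orders.total` is readable, while `orders.internal_note` and every field of other types stay blocked. An empty rule set allows everything.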

Learn more about access control →

What is DuckDB, and why does hugr use it?

DuckDB is a high-performance analytical database engine optimized for OLAP workloads. hugr uses DuckDB as its core query engine because:

Optimized for analytical queries and aggregations
Native support for multiple data formats (Parquet, CSV, JSON)
In-process execution with efficient memory usage
Excellent performance for large-scale data processing
Can attach external databases (PostgreSQL, MySQL) and query them together

Learn more about DuckDB integration →

Does hugr support geospatial data?

Yes, hugr has native support for geospatial operations:

Native Geometry scalar type for spatial fields
Support for PostGIS (PostgreSQL) and DuckDB spatial extension
Spatial file formats: GeoParquet, GeoJSON, Shapefiles
Spatial joins and aggregations across data sources
Distance-based queries and spatial relationships
H3 clustering for hierarchical spatial indexing

Learn more about spatial queries → | Learn more about H3 clustering →

How does hugr scale?

hugr is designed for enterprise-scale deployments:

Horizontal Scaling: Stateless nodes that can be added/removed dynamically
Cluster Mode: Multi-node operation with load balancing and fault tolerance
Caching: Two-level cache (in-memory + Redis/Memcached) reduces database load
Performance: Query optimization and pushdown to data sources
Kubernetes Ready: Helm charts for easy K8s deployment

Learn more about cluster mode → | Learn more about container deployment →

Which output formats does hugr support?

hugr supports multiple output formats:

GraphQL JSON: Standard GraphQL response format
Arrow IPC: Efficient binary format for large datasets, delivered via the hugr multipart IPC protocol
Python Integration: Direct export to pandas DataFrame and GeoDataFrame
JQ Transformations: Server-side data transformation with custom JSON output

The Arrow IPC protocol enables efficient streaming of large datasets to analytics and ML pipelines.

Learn more about Arrow IPC → | Learn more about Python client → | Learn more about JQ transformations →

Powered by DuckDB

hugr uses DuckDB, a fast in-process analytical database, as its core engine. DuckDB performs cross-source JOINs and aggregations in memory, combining data from PostgreSQL, Parquet files on S3, CSV, and geospatial formats in a single GraphQL query. Because the engine runs in-process, there is no network hop between hugr and its query engine, and DuckDB's OLAP-optimized execution makes hugr well suited to analytic workloads and Data Mesh architectures.
