Join Us
Join Our Community
Welcome to the hugr project! We are building an open-source Data Mesh platform and high-performance GraphQL backend, and we welcome contributions from developers around the world.
Whether you're interested in database internals, distributed systems, analytics, or GraphQL, there's a place for you in our community.
Technology Stack
The hugr ecosystem uses a diverse set of modern technologies. Here's what we work with:
Core Technologies
Go (Golang) - Primary Language
Go is the primary language for hugr development. We use it for:
- Query Engine: Core engine implementation at hugr-lab/query-engine
- Server: HTTP server and GraphQL API at hugr-lab/hugr
- Data Source Connectors: Native database drivers and integrations
- Cluster Management: Multi-node coordination and synchronization
What you need:
- Strong Go programming skills
- Understanding of concurrent programming and goroutines
- Experience with database connectivity and query optimization
- Knowledge of HTTP servers and API development
SQL - Deep Expertise Required
SQL is at the heart of hugr's data processing:
- Query Transformation: Converting GraphQL to optimized SQL
- Cross-source Joins: Complex query planning and execution
- Aggregations: Advanced analytical queries
- Performance Optimization: Index strategies, query plans, and execution optimization
What you need:
- Deep understanding of SQL (not just basic SELECT/INSERT)
- Experience with query optimization and execution plans
- Knowledge of database internals and indexing strategies
- Understanding of OLAP workloads and analytical queries
DuckDB and C++
DuckDB is our core analytical engine:
- Engine Integration: DuckDB embedded in Go via CGO
- Extensions: Building custom DuckDB extensions in C++
- Performance Tuning: Optimizing analytical queries
- Format Support: Working with Parquet, Arrow, and other analytical formats
What you need:
- C++ programming skills (for DuckDB extensions)
- Understanding of CGO for Go-C++ integration
- Knowledge of columnar databases and analytical workloads
- Experience with vectorized query execution (bonus)
Analytics and Machine Learning
Python - Analytics and ML
Python is the primary tool for data analytics and machine learning in the hugr ecosystem:
- Data Analysis: Working with hugr data using pandas and GeoPandas
- ML Pipelines: Building machine learning models on top of hugr data
- Client Library: Python client at hugr-lab/hugr-client
- Integration: Arrow IPC protocol for efficient data transfer
What you need:
- Python programming with pandas/GeoPandas experience
- Understanding of ML workflows and data pipelines
- Knowledge of Arrow format and data serialization
- Experience with data visualization and Jupyter notebooks
MCP Integration — Talk to Data
hugr includes a built-in Model Context Protocol (MCP) endpoint that enables AI assistants to explore and query data:
- Built-in MCP Server: The
/mcpendpoint provides discovery, schema, and data tools for LLM clients - Schema Summarization: The
hugr-tools summarizeCLI uses LLMs to generate human-readable descriptions for all schema objects - Semantic Search: Optional embedding-based search for finding relevant data objects by meaning
- Prompt Engineering: Creating effective prompts for data interaction tasks
- Client Integration: Compatible with Claude, Cursor, and other MCP-capable tools
What you need:
- Understanding of LLM capabilities and limitations
- Experience with prompt engineering and model APIs
- Knowledge of RAG (Retrieval-Augmented Generation) patterns
- Interest in natural language interfaces for data
- Understanding of data query patterns and SQL
Geospatial Analytics
Spatial Data Processing
hugr has native support for spatial analytics:
- PostGIS Integration: Working with spatial databases
- Spatial Types: Geometry, Geography, and spatial indexes
- Spatial Operations: Joins, aggregations, and transformations on spatial data
- H3 Clustering: Hexagonal hierarchical spatial indexing
What you need:
- Understanding of GIS concepts and spatial data types
- Experience with PostGIS or similar spatial databases
- Knowledge of spatial indexing and query optimization
- Interest in geospatial analytics and visualization
Data Engineering
DBT (Data Build Tool)
We use DBT for data transformation workflows:
- Data Modeling: Building analytical models on top of hugr
- Testing: Data quality checks and validation
- Documentation: Automated data catalog generation
- Orchestration: Integration with data pipelines
What you need:
- Experience with DBT and data modeling
- Understanding of dimensional modeling and data warehousing
- SQL proficiency for transformations
- Knowledge of data quality and testing practices
Big Data Integration
Spark Integration (Scala/Java)
For integration with Apache Spark:
- Spark Connectors: Building data source connectors for Spark
- Data Transfer: Efficient data exchange between hugr and Spark
- Distributed Processing: Leveraging Spark for large-scale transformations
- Integration Patterns: Best practices for Spark-hugr workflows
What you need:
- Scala or Java programming skills
- Experience with Apache Spark and distributed computing
- Understanding of data partitioning and parallel processing
- Knowledge of data formats (Parquet, Arrow, Iceberg)
How to Contribute
Getting Started
-
Explore the Repositories
- hugr-lab/hugr - Main server implementation
- hugr-lab/query-engine - Core Go engine
- hugr-lab/hugr-client - Python client library
- hugr-lab/docker - Docker images and Kubernetes charts
-
Check Open Issues
- Look for issues tagged with
good first issueorhelp wanted - Read existing discussions to understand current development priorities
- Ask questions if you're unsure about something
- Look for issues tagged with
-
Set Up Your Development Environment
- Install Go (latest stable version)
- Install DuckDB dependencies
- Set up your preferred IDE (VS Code, GoLand, etc.)
- Clone the repositories you want to work on
-
Start Contributing
- Fix bugs, add features, improve documentation
- Write tests for your changes
- Follow the project's coding standards
- Submit pull requests with clear descriptions
Areas Where We Need Help
- Core Engine Development: Query optimization, new data source connectors
- Documentation: Improving guides, adding examples, translations
- Testing: Unit tests, integration tests, performance benchmarks
- Client Libraries: Improving Python client, creating clients for other languages
- MCP Tools: Developing tools for LLM integration
- Examples: Real-world use cases and tutorials
- Infrastructure: Docker images, Kubernetes deployment, CI/CD
Community
Join our community:
- GitHub Discussions: Ask questions and share ideas
- Issues: Report bugs and request features
- Pull Requests: Submit your contributions
We value all contributions, whether it's code, documentation, bug reports, or community support. Every contribution helps make hugr better!
Hugr Ecosystem Repositories
The hugr project consists of multiple repositories, each serving a specific purpose in the ecosystem:
Core Repositories
hugr-lab/hugr
Language: Go | License: MIT
The main Hugr service repository containing the HTTP server implementation. This is the primary entry point for running hugr as a standalone service.
hugr-lab/query-engine
Language: Go | License: MIT
The core query engine implementation. This is a reusable Go package that powers hugr's GraphQL query processing, data source management, and query transformation logic. Can be embedded in custom applications.
Client Libraries
hugr-lab/hugr-client
Language: Python | License: MIT
Python client library for hugr. Provides a convenient interface for querying hugr via the Arrow IPC protocol, with support for pandas DataFrames and GeoPandas GeoDataFrames.
Infrastructure & Deployment
hugr-lab/docker
Language: Dockerfile | License: MIT
Docker images and Kubernetes charts for deploying hugr. Includes configurations for both single-node and multi-node cluster deployments with load balancing and caching.
Tools & Extensions
Built-in MCP Endpoint
The MCP server is now built into hugr itself (enabled via MCP_ENABLED=true). It provides discovery, schema introspection, and query execution tools for AI assistants. See the MCP documentation for details.
hugr-tools CLI
The hugr-tools binary (in hugr-lab/query-engine) provides utilities for working with hugr instances, including AI-powered schema summarization. Install via the install script or download from GitHub releases.
Examples & Learning
hugr-lab/examples
Language: Shell | License: MIT
Collection of example projects and tutorials demonstrating various hugr use cases, integrations, and best practices.
hugr-lab/osm_dbt
Language: Shell | License: MIT
OpenStreetMap universal DuckDB loader. Demonstrates how to work with geospatial data, DBT transformations, and spatial analytics in hugr.
Documentation
hugr-lab/hugr-lab.github.io
Language: CSS (Docusaurus) | License: MIT
This documentation website source code. Built with Docusaurus and contains all user guides, API references, and tutorials.
All repositories are open source and licensed under MIT, making them free to use for both commercial and non-commercial purposes. We welcome contributions to any of these repositories!
Contact
If you have questions or want to discuss contribution opportunities:
- Open a discussion on GitHub: hugr-lab/hugr/discussions
- Report issues: hugr-lab/hugr/issues
- Check our documentation: https://hugr-lab.github.io
We look forward to working with you!