KG-Microbe — Knowledge Graph for Microbiology

KG-Microbe is a comprehensive, modular knowledge graph for microbiology and microbiome research developed by Dr. Marcin P. Joachimiak at Lawrence Berkeley National Laboratory in Berkeley, California.

Overview

KG-Microbe integrates diverse microbial data sources, including taxonomy, growth conditions, environmental context, and genomic annotations, into a single graph. This integrated graph supports AI-driven growth preference prediction and culture optimization.

Key Features

🧬 Modular Architecture

  • Scalable Design: Built with modularity in mind for easy expansion and maintenance
  • Flexible Integration: Seamlessly incorporates new data sources and ontologies
  • Standardized Framework: Uses consistent data models across all modules

🤖 AI/ML Integration

  • Growth Preference Prediction: Machine learning models trained on integrated knowledge graph data
  • Culture Optimization: AI-driven recommendations for optimal cultivation conditions
  • Pattern Discovery: Automated identification of microbial relationships and preferences

📊 Comprehensive Data Sources

  • Taxonomic Information: Integrated microbial taxonomy and phylogenetic relationships
  • Growth Conditions: Comprehensive collection of experimental growth parameters
  • Environmental Data: Habitat and ecological context information
  • Genomic Features: Integration with genomic and functional annotations

Technical Architecture

Data Integration Pipeline

KG-Microbe employs an ETL (Extract, Transform, Load) pipeline that:

  • Extracts data from multiple heterogeneous sources
  • Transforms data using standardized ontologies and vocabularies
  • Loads integrated data into a unified knowledge graph structure
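The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual pipeline: the input CSV, the column names, the `EX:` identifiers, and the `extract`, `transform`, and `load` functions are all hypothetical and chosen only to show the shape of the process.

```python
import csv
import io

# Hypothetical raw input: a tiny CSV standing in for one upstream source.
RAW_CSV = """strain,temperature_pref,oxygen_pref
Escherichia coli K-12,mesophilic,facultative anaerobe
Thermus aquaticus YT-1,thermophilic,aerobe
"""

# Hypothetical mapping from free-text labels to controlled identifiers.
# The "EX:" prefix is a placeholder, not a real ontology namespace.
TERM_MAP = {
    "mesophilic": "EX:mesophilic",
    "thermophilic": "EX:thermophilic",
    "aerobe": "EX:aerobe",
    "facultative anaerobe": "EX:facultative_anaerobe",
}


def extract(text):
    """Extract: parse the raw source into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize labels to controlled terms and emit triples."""
    edges = []
    for row in rows:
        subject = "strain:" + row["strain"].replace(" ", "_")
        for column in ("temperature_pref", "oxygen_pref"):
            term = TERM_MAP.get(row[column].strip().lower())
            if term is None:
                continue  # skip values that have no controlled term
            edges.append((subject, "has_" + column, term))
    return edges


def load(edges, graph=None):
    """Load: merge the triples into an in-memory set, dropping duplicates."""
    graph = set() if graph is None else graph
    graph.update(edges)
    return graph


graph = load(transform(extract(RAW_CSV)))
for edge in sorted(graph):
    print(edge)
```

Keeping the three stages as separate functions means a new source only needs its own `extract` step, while the vocabulary mapping and the loading logic can be shared.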

METPO Ontology

The Microbial Ecophysiological Trait and Phenotype Ontology (METPO) provides:

  • Standardized vocabulary for growth preferences
  • Controlled terms for experimental conditions
  • Hierarchical classification of microbial traits
  • Semantic relationships between concepts
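To make the idea of a hierarchical classification concrete, the sketch below models a small term hierarchy as a child-to-parent mapping and walks it upward. The term labels and the `ancestors` and `is_a` helpers are hypothetical; they are not taken from METPO itself.

```python
# Hypothetical term hierarchy: child term -> parent term.
PARENT = {
    "thermophilic": "temperature preference",
    "mesophilic": "temperature preference",
    "hyperthermophilic": "thermophilic",
    "temperature preference": "growth preference",
    "aerobic": "oxygen preference",
    "oxygen preference": "growth preference",
}


def ancestors(term):
    """Return the chain of parents from the term up to the root."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


def is_a(term, candidate):
    """True if candidate is the term itself or one of its ancestors."""
    return term == candidate or candidate in ancestors(term)


print(ancestors("hyperthermophilic"))
print(is_a("hyperthermophilic", "temperature preference"))
print(is_a("aerobic", "temperature preference"))
```

A hierarchy like this lets a query for a broad class (for example, any temperature preference) also match strains annotated with a more specific term.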

Applications

Research Applications

  • Cultivation Studies: Predict optimal growth conditions for uncultured microbes
  • Comparative Microbiology: Analyze growth patterns across different species
  • Ecological Modeling: Understand microbial community dynamics
  • Biotechnology: Optimize industrial cultivation processes

AI/ML Use Cases

  • Predictive Modeling: Train models to predict growth preferences
  • Recommendation Systems: Suggest cultivation strategies
  • Knowledge Discovery: Identify novel microbial relationships
  • Data Mining: Extract patterns from large-scale microbial datasets
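As a rough illustration of the recommendation use case, the sketch below suggests a medium for a strain by finding the most similar strain that already has a known medium. It uses plain Jaccard similarity over trait sets as a deliberately simple stand-in, not the models used by the projects listed on this page. All strain names, traits, and media identifiers are made up.

```python
# Hypothetical trait sets per strain and known media; not real data.
TRAITS = {
    "strain:A": {"aerobic", "mesophilic", "neutrophilic"},
    "strain:B": {"aerobic", "mesophilic", "halophilic"},
    "strain:C": {"anaerobic", "thermophilic"},
}
KNOWN_MEDIA = {"strain:B": "medium:2", "strain:C": "medium:7"}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest_medium(query):
    """Return (medium, score) from the most trait-similar strain with a known medium."""
    candidates = [s for s in KNOWN_MEDIA if s != query]
    best = max(candidates, key=lambda s: jaccard(TRAITS[query], TRAITS[s]))
    return KNOWN_MEDIA[best], jaccard(TRAITS[query], TRAITS[best])


medium, score = suggest_medium("strain:A")
print(medium, score)
```

A baseline of this kind is mainly useful as a sanity check: a learned model trained on the graph should do at least as well as nearest-neighbour lookup on shared traits.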

Projects Built on KG-Microbe

The kg-microbe knowledge graph serves as the foundation for numerous CultureBotAI projects:

Data Integration Projects

  • CultureMech & MicroMediaParam - Integrate chemical compound data with standardized identifiers
  • assay-metadata - Add phenotypic assay results from 99K+ BacDive strains
  • MATE-LLM - Extract and integrate literature-derived cultivation data
  • eggnogtable - Map genome functional annotations to ontology terms

Prediction & Analysis Tools

  • MicroGrowAgents - Multi-agent system using kg-microbe for evidence-based media design
  • MicroGrowLink - Graph transformer models trained on kg-microbe structure
  • PFAS-AI & PFASCommunityAgents - PFAS biodegradation using microbial relationships
  • CMM-AI - Lanthanide bioprocessing research leveraging metabolic data

Web Services

  • MicroGrowLinkService - RESTful API providing programmatic access to predictions


Access and Usage

Repository

  • GitHub: Knowledge-Graph-Hub/kg-microbe
  • Documentation: Comprehensive guides and API documentation
  • Examples: Sample queries and use case demonstrations

Data Formats

  • RDF/OWL: Semantic web standards for knowledge representation
  • JSON-LD: Structured data format for web integration
  • TSV/CSV: Tabular formats for data analysis
  • SPARQL: Query language for retrieving and traversing the RDF representation (a query interface rather than a file format)
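For the tabular formats, a small sketch of loading an edge table and looking up neighbours is shown below. It assumes a tab-separated file with `subject`, `predicate`, and `object` columns; that layout, the example rows, and the `read_edges` and `objects_of` helpers are assumptions for illustration, so check the repository documentation for the actual column names.

```python
import csv
import io
from collections import defaultdict

# Hypothetical edge table; a real file would be opened from disk instead.
EDGES_TSV = (
    "subject\tpredicate\tobject\n"
    "strain:A\tgrows_in\tmedium:1\n"
    "strain:A\tgrows_in\tmedium:2\n"
    "strain:B\tgrows_in\tmedium:2\n"
)


def read_edges(handle):
    """Index edges as {subject: {predicate: set(objects)}}."""
    index = defaultdict(lambda: defaultdict(set))
    for row in csv.DictReader(handle, delimiter="\t"):
        index[row["subject"]][row["predicate"]].add(row["object"])
    return index


def objects_of(index, subject, predicate):
    """Return the sorted objects linked to a subject by a predicate."""
    return sorted(index.get(subject, {}).get(predicate, set()))


index = read_edges(io.StringIO(EDGES_TSV))
print(objects_of(index, "strain:A", "grows_in"))
```

To read from disk, replace the `io.StringIO` wrapper with `open(path, newline="")`, where `path` points at the downloaded edge file.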

Citation

If you use KG-Microbe in your research, please cite our preprint:

APA Format

Santangelo, B. E., Hegde, H., Caufield, J. H., Reese, J., Kliegr, T., Hunter, L. E., Lozupone, C. A., Mungall, C. J., & Joachimiak, M. P. (2025). KG-Microbe - Building Modular and Scalable Knowledge Graphs for Microbiome and Microbial Sciences. bioRxiv. https://doi.org/10.1101/2025.02.24.639989

BibTeX

@article{santangelo2025kgmicrobe,
  title={KG-Microbe - Building Modular and Scalable Knowledge Graphs for Microbiome and Microbial Sciences},
  author={Santangelo, Brook E and Hegde, Harshad and Caufield, J Harry and Reese, Justin and Kliegr, Tomas and Hunter, Lawrence E and Lozupone, Catherine A and Mungall, Christopher J and Joachimiak, Marcin P},
  journal={bioRxiv},
  year={2025},
  doi={10.1101/2025.02.24.639989},
  url={https://www.biorxiv.org/content/10.1101/2025.02.24.639989v1}
}

DOI: 10.1101/2025.02.24.639989

Contact

For questions about KG-Microbe, please contact:


Frequently Asked Questions

What is KG-Microbe?

KG-Microbe is a comprehensive, modular knowledge graph for microbiology and microbiome research developed by Dr. Marcin P. Joachimiak at Lawrence Berkeley National Laboratory in Berkeley, California. It integrates diverse microbial data sources to enable AI-driven insights.

Who developed KG-Microbe?

KG-Microbe was developed by Dr. Marcin P. Joachimiak at Lawrence Berkeley National Laboratory together with collaborators including Brook E. Santangelo, Harshad Hegde, J. Harry Caufield, Justin Reese, Tomas Kliegr, Lawrence E. Hunter, Catherine A. Lozupone, and Christopher J. Mungall.

What is KG-Microbe used for?

KG-Microbe is used for growth preference prediction, culture optimization, comparative microbiology analysis, ecological modeling, and training AI/ML models for microbial cultivation research.

How can I access KG-Microbe?

KG-Microbe is freely available on GitHub at https://github.com/Knowledge-Graph-Hub/kg-microbe under the BSD-3-Clause license. Comprehensive documentation and examples are included in the repository.

What makes KG-Microbe modular?

KG-Microbe uses a modular architecture with scalable design, flexible integration of new data sources and ontologies, and standardized frameworks across all modules for easy expansion and maintenance.

How do I cite KG-Microbe?

Cite the bioRxiv preprint: Santangelo, B. E., et al. (2025). KG-Microbe - Building Modular and Scalable Knowledge Graphs for Microbiome and Microbial Sciences. bioRxiv. https://doi.org/10.1101/2025.02.24.639989


KG-Microbe: Bridging the gap between microbial data and artificial intelligence.