fchinembiri bdc4d52f21 Update storage client with load_dataset and add comprehensive README 2026-04-23 22:29:19 +02:00

Sovereign MLOps Platform: GeoCrop LULC Portfolio

Welcome to the Sovereign MLOps Platform, a comprehensive self-hosted environment on K3s designed for end-to-end Land Use / Land Cover (LULC) crop-mapping in Zimbabwe.

This project showcases professional skills in MLOps, Cloud-Native Architecture, Geospatial Analysis, and GitOps.

🏗️ System Architecture

The platform runs on a self-hosted Kubernetes (K3s) cluster, with a focus on data sovereignty and scalability.

  • Source Control & CI/CD: Gitea (Self-hosted GitHub alternative)
  • Infrastructure as Code: Terraform (Managing K3s Namespaces & Quotas)
  • GitOps: ArgoCD (Automated deployment from Git to Cluster)
  • Experiment Tracking: MLflow (Model versioning & metrics)
  • Interactive Workspace: JupyterLab (Data science & training)
  • Spatial Database: Standalone PostgreSQL + PostGIS (Port 5433)
  • Object Storage: MinIO (S3-compatible storage for datasets, baselines, and models)
  • Frontend: React 19 + OpenLayers (Parallel loading of baselines and ML predictions)
  • Backend: FastAPI + Redis Queue (Job orchestration)
  • Visualization: TiTiler (Dynamic tile server for Cloud Optimized GeoTIFFs)
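
As an illustration of how the frontend could request tiles for a Cloud Optimized GeoTIFF held in MinIO, the sketch below builds an XYZ tile URL template for TiTiler. The TiTiler host and the bucket/object key are hypothetical. The `/cog/tiles/WebMercatorQuad/{z}/{x}/{y}` route matches recent TiTiler releases; older releases omit the tile matrix set segment, so check the API docs of the deployed version.

```python
from urllib.parse import urlencode


def titiler_tile_template(titiler_base: str, cog_url: str, tms: str = "WebMercatorQuad") -> str:
    """Return an XYZ tile URL template ({z}/{x}/{y} placeholders left intact)."""
    # TiTiler receives the location of the GeoTIFF through the `url` query parameter.
    query = urlencode({"url": cog_url})
    return f"{titiler_base.rstrip('/')}/cog/tiles/{tms}/{{z}}/{{x}}/{{y}}.png?{query}"


template = titiler_tile_template(
    "http://titiler.example.local",        # hypothetical TiTiler host
    "s3://geocrop-baselines/dw_2020.tif",  # hypothetical bucket and object key
)
print(template)
```

A template of this shape can be handed to an OpenLayers XYZ source, which substitutes the `{z}`, `{x}`, and `{y}` placeholders per tile.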

🗺️ UX Data Flow: Parallel Loading Strategy

To ensure a seamless user experience, the system implements a dual-loading strategy:

  1. Instant Context: Dynamic World (DW) TIFF baselines (2015-2025) are served immediately from MinIO via TiTiler while ML inference is still running.
  2. Asynchronous Inference: The ML worker processes heavy classification tasks in the background and overlays high-resolution predictions once complete.
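
The two steps above can be sketched with `asyncio`. This is a minimal simulation, not the platform's code: `fetch_baseline` and `run_inference` are hypothetical stand-ins (the real system uses TiTiler for baselines and a Redis-backed worker for inference), and the sleep durations only mimic "fast" versus "slow" work.

```python
import asyncio


async def fetch_baseline(events: list[str]) -> str:
    """Stand-in for loading an existing Dynamic World baseline layer."""
    await asyncio.sleep(0.01)  # fast: the raster already exists in object storage
    events.append("baseline shown")
    return "dw-baseline"


async def run_inference(events: list[str]) -> str:
    """Stand-in for the background ML classification job."""
    await asyncio.sleep(0.1)  # slow: heavy classification work
    events.append("prediction overlaid")
    return "ml-prediction"


async def load_map() -> list[str]:
    events: list[str] = []
    # Start inference first so it runs while the baseline is fetched and displayed.
    inference = asyncio.create_task(run_inference(events))
    await fetch_baseline(events)
    await inference
    return events


events = asyncio.run(load_map())
print(events)
```

The point of the design is that the user never waits on the slow task to see a map: the baseline arrives first, and the prediction is layered on when it finishes.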

🛠️ Training Workflow

Training runs in JupyterLab using a custom MinIOStorageClient, which loads datasets from MinIO object storage directly into memory.

Using the MinIO Storage Client

from training.storage_client import MinIOStorageClient

# Initialize client (uses environment variables automatically)
storage = MinIOStorageClient()

# List available training batches
batches = storage.list_files('geocrop-datasets')

# Load a batch directly into memory (No disk I/O)
df = storage.load_dataset('geocrop-datasets', 'batch_1.csv')

# Train your model and upload the artifact
# ... training code ...
storage.upload_file('model.pkl', 'geocrop-models', 'Zimbabwe_Ensemble_Model.pkl')
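
To show what "directly into memory" can look like, here is a minimal sketch of an in-memory CSV load. It is not the actual `MinIOStorageClient` implementation, which is not shown in this README. Assumptions: the environment variable names are invented; `load_csv_in_memory` is a hypothetical helper; the standard-library `csv` module stands in for whatever the real client uses (likely pandas, given the `df` name); and `FakeClient` with its sample rows exists only so the sketch runs without a MinIO server.

```python
import csv
import io
import os


def minio_settings() -> dict:
    """Read connection settings from the environment (variable names are assumed)."""
    return {
        "endpoint": os.environ.get("MINIO_ENDPOINT", "localhost:9000"),
        "access_key": os.environ.get("MINIO_ACCESS_KEY", ""),
        "secret_key": os.environ.get("MINIO_SECRET_KEY", ""),
    }


def load_csv_in_memory(client, bucket: str, key: str) -> list[dict]:
    """Fetch an object and parse it as CSV without writing to local disk.

    `client` is any object whose get_object(bucket, key) returns something
    with .read() -> bytes and .close(), as the MinIO Python SDK's response does.
    """
    response = client.get_object(bucket, key)
    try:
        payload = response.read()
    finally:
        response.close()
    return list(csv.DictReader(io.StringIO(payload.decode("utf-8"))))


class FakeClient:
    """In-memory stand-in for a storage client (illustration only)."""

    def __init__(self, objects: dict):
        self._objects = objects

    def get_object(self, bucket: str, key: str) -> io.BytesIO:
        return io.BytesIO(self._objects[(bucket, key)])


# Made-up sample data; the real batch schema is not documented here.
fake = FakeClient({("geocrop-datasets", "batch_1.csv"): b"ndvi,label\n0.71,maize\n0.32,fallow\n"})
rows = load_csv_in_memory(fake, "geocrop-datasets", "batch_1.csv")
print(rows)
```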

🚀 Deployment & GitOps

The platform follows a strict GitOps workflow:

  1. All changes are committed to the geocrop-platform repository on Gitea.
  2. Gitea Actions build and push containers to Docker Hub (frankchine).
  3. ArgoCD monitors the k8s/base directory and automatically synchronizes the cluster state.
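
Conceptually, the synchronization in step 3 compares the state declared in Git with the state running in the cluster and acts on the difference. The sketch below illustrates that idea only; it is not how ArgoCD is implemented, and `plan_sync`, the resource names, and the image tags are all hypothetical.

```python
def plan_sync(desired: dict[str, dict], live: dict[str, dict]) -> dict[str, list[str]]:
    """Compare manifests declared in Git (desired) with cluster state (live)."""
    return {
        # Declared in Git but missing from the cluster.
        "create": sorted(desired.keys() - live.keys()),
        # Present in both, but the spec has drifted.
        "update": sorted(k for k in desired.keys() & live.keys() if desired[k] != live[k]),
        # Running in the cluster but no longer declared in Git.
        "prune": sorted(live.keys() - desired.keys()),
    }


# Hypothetical resources for illustration.
desired = {
    "deployment/api": {"image": "example/api:v2"},
    "deployment/worker": {"image": "example/worker:v1"},
}
live = {
    "deployment/api": {"image": "example/api:v1"},
    "deployment/tiler": {"image": "example/tiler:v1"},
}

plan = plan_sync(desired, live)
print(plan)
```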

🖥️ Service Registry


Created and maintained by fchinembiri.