Sovereign MLOps Platform: GeoCrop LULC Portfolio
Welcome to the Sovereign MLOps Platform, a self-hosted environment on K3s for end-to-end Land Use / Land Cover (LULC) crop mapping in Zimbabwe.
This project showcases professional skills in MLOps, Cloud-Native Architecture, Geospatial Analysis, and GitOps.
🏗️ System Architecture
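The platform runs on a self-hosted Kubernetes (K3s) cluster, with a focus on data sovereignty and scalability. The frontend reaches the FastAPI backend through Nginx Ingress, the backend hands jobs to the ML worker through a Redis queue, and results are stored in MinIO and Postgres/PostGIS.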
graph TD
subgraph "Frontend & Entry"
WEB[React 19 Frontend]
ING[Nginx Ingress]
end
subgraph "Core Services (geocrop namespace)"
API[FastAPI Backend]
RQ[Redis Queue]
WORKER[ML Inference Worker]
TILER[TiTiler Dynamic Server]
end
subgraph "MLOps & Infra"
GITEA[Gitea Source Control]
ARGO[ArgoCD GitOps]
MLF[MLflow Tracking]
JUPYTER[JupyterLab Workspace]
end
subgraph "Storage & Data"
MINIO[(MinIO S3 Storage)]
POSTGIS[(Postgres + PostGIS)]
end
%% Flow
WEB --> ING
ING --> API
API --> RQ
RQ --> WORKER
WORKER --> MINIO
WORKER --> POSTGIS
TILER --> MINIO
WEB --> TILER
ARGO --> GITEA
ARGO --> ING
JUPYTER --> MINIO
MLF --> POSTGIS
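The API-to-worker handoff in the diagram above can be illustrated with a minimal sketch. It swaps Redis Queue for Python's standard-library `queue.Queue` so that it runs without any services. The names `submit_job`, `run_worker`, and `start_worker`, and the job fields, are hypothetical and not taken from the repository.

```python
import queue
import threading
import uuid
from typing import Dict, Optional

Job = Dict[str, object]


def submit_job(jobs: "queue.Queue[Optional[Job]]", aoi: str, year: int) -> str:
    """API side: enqueue a job and return its ID without waiting for the result."""
    job_id = uuid.uuid4().hex
    jobs.put({"id": job_id, "aoi": aoi, "year": year})
    return job_id


def run_worker(jobs: "queue.Queue[Optional[Job]]", results: Dict[str, str]) -> None:
    """Worker side: consume jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            break
        # Placeholder for inference. Per the diagram, the real worker would
        # write its outputs to MinIO and PostGIS instead of a dict.
        results[str(job["id"])] = f"processed {job['aoi']} for {job['year']}"


def start_worker(
    jobs: "queue.Queue[Optional[Job]]", results: Dict[str, str]
) -> threading.Thread:
    """Run the worker loop in a background thread (stand-in for a worker pod)."""
    worker = threading.Thread(target=run_worker, args=(jobs, results), daemon=True)
    worker.start()
    return worker
```

The point of the pattern is that `submit_job` returns as soon as the job is queued, so the API stays responsive while inference runs elsewhere.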
📊 System Data Flow (DFD)
How data moves from raw satellite imagery to final crop-type predictions:
graph LR
subgraph "External Sources"
DEA[Digital Earth Africa STAC]
end
subgraph "Storage (MinIO)"
DS[(/geocrop-datasets)]
BS[(/geocrop-baselines)]
MD[(/geocrop-models)]
RS[(/geocrop-results)]
end
subgraph "Processing"
TRAIN[Jupyter Training]
INFER[Inference Worker]
end
%% Data movement
DEA -- "Sentinel-2 Imagery" --> INFER
DS -- "CSV Batches" --> TRAIN
TRAIN -- "Trained Models" --> MD
MD -- "Model Load" --> INFER
BS -- "DW TIFFs" --> INFER
INFER -- "Classification COG" --> RS
RS -- "Map Tiles" --> WEB[Frontend Visualization]
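As a rough illustration of the last hop (results bucket to map tiles), the sketch below builds an XYZ tile URL template that a frontend could hand to a map library. The helper name `tile_url_template`, the TiTiler base URL, and the object key are hypothetical. The `/cog/tiles/WebMercatorQuad/{z}/{x}/{y}` path is an assumption that matches recent TiTiler releases and may differ for the deployed version. Reading `s3://` URLs from MinIO also assumes TiTiler is configured with the MinIO endpoint and credentials.

```python
from urllib.parse import urlencode


def tile_url_template(titiler_base: str, bucket: str, key: str) -> str:
    """Return an XYZ tile URL template for a COG stored in an S3-compatible bucket."""
    cog_url = f"s3://{bucket}/{key}"
    # urlencode percent-escapes the COG location so it is safe as a query value.
    query = urlencode({"url": cog_url})
    base = titiler_base.rstrip("/")
    # {z}/{x}/{y} stay as literal placeholders for the map library to fill in.
    return f"{base}/cog/tiles/WebMercatorQuad/{{z}}/{{x}}/{{y}}?{query}"
```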
🗺️ UX Data Flow: Parallel Loading Strategy
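To avoid leaving the user with an empty map while inference runs, the system loads two layers in parallel: the Dynamic World (DW) baseline tiles are fetched immediately through TiTiler, while the prediction job runs asynchronously and is overlaid once polling reports completion: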
sequenceDiagram
participant U as User (Frontend)
participant T as TiTiler (S3 Proxy)
participant A as FastAPI
participant W as ML Worker
participant M as MinIO
U->>A: Submit Job (AOI + Year)
A->>U: Job ID (Accepted)
par Instant Visual Context
U->>T: Fetch Baseline Tiles (DW)
T->>M: Stream Baseline COG
M->>T: Baseline COG bytes
T->>U: Render Baseline Map
and Asynchronous Prediction
A->>W: Enqueue Task
W->>M: Fetch Model & Data
W->>W: Run Inference & Post-processing
W->>M: Upload Prediction COG
loop Polling
U->>A: Get Status?
A-->>U: Processing...
end
W->>A: Job Complete
U->>A: Get Status?
A->>U: Prediction URL
U->>T: Fetch Prediction Tiles
T->>M: Stream Prediction COG
T->>U: Overlay High-Res Result
end
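The polling loop in the sequence above might look roughly like the following on the client side. This is a sketch under stated assumptions: the function `wait_for_prediction`, the `state` and `prediction_url` fields, and the state values are hypothetical, since the README does not show the API's response schema. The status fetcher is passed in as a callable so the loop can be exercised without a running backend.

```python
import time
from typing import Callable, Dict, Optional


def wait_for_prediction(
    get_status: Callable[[str], Dict[str, str]],
    job_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll a job until it finishes.

    Returns the prediction URL on success, or None on failure or timeout.
    """
    waited = 0.0
    while waited <= timeout_s:
        status = get_status(job_id)
        state = status.get("state")  # hypothetical field name
        if state == "complete":
            return status.get("prediction_url")  # hypothetical field name
        if state == "failed":
            return None
        # Still processing: wait before asking again.
        sleep(interval_s)
        waited += interval_s
    return None
```

In a real client, `get_status` would wrap an HTTP GET against the FastAPI status endpoint. A fixed interval keeps the sketch short, and exponential backoff would be a reasonable refinement for long-running jobs.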
🚀 Deployment & GitOps Pipeline
graph LR
DEV[Developer] -->|Push| GITEA[Gitea]
subgraph "CI/CD Pipeline"
GITEA -->|Trigger| GA[Gitea Actions]
GA -->|Build & Push| DH[Docker Hub: frankchine]
end
subgraph "GitOps Sync"
ARGO[ArgoCD] -->|Monitor| GITEA
DH -->|Image Pull| K3S[K3s Cluster]
ARGO -->|Apply Manifests| K3S
end
🛠️ Training Workflow
Training is performed in JupyterLab using a custom MinIOStorageClient, which reads datasets from MinIO object storage directly into memory and uploads trained model artifacts back to it.
Using the MinIO Storage Client
from training.storage_client import MinIOStorageClient
# Initialize client (uses environment variables automatically)
storage = MinIOStorageClient()
# List available training batches
batches = storage.list_files('geocrop-datasets')
# Load a batch directly into memory (No disk I/O)
df = storage.load_dataset('geocrop-datasets', 'batch_1.csv')
# Train your model and upload the artifact
# ... training code ...
storage.upload_file('model.pkl', 'geocrop-models', 'Zimbabwe_Ensemble_Model.pkl')
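For readers curious how the "no disk I/O" load could work, here is a minimal sketch. It is not the repository's MinIOStorageClient: the class `InMemoryDatasetLoader` and its method `load_rows` are hypothetical, and it parses CSV with the standard-library `csv` module rather than a DataFrame library. It assumes only that the wrapped client exposes `get_object(bucket, key)` returning a readable, closable response, as the `minio` Python client does.

```python
import csv
import io
from typing import Any, Dict, List


class InMemoryDatasetLoader:
    """Hypothetical sketch of loading a CSV object without touching disk."""

    def __init__(self, client: Any) -> None:
        # Any object with get_object(bucket, key) -> readable response works,
        # for example a configured minio.Minio instance.
        self._client = client

    def load_rows(self, bucket: str, key: str) -> List[Dict[str, str]]:
        response = self._client.get_object(bucket, key)
        try:
            payload = response.read()  # whole object is held in memory
        finally:
            response.close()
        # Decode and parse from an in-memory buffer instead of a temp file.
        text = io.StringIO(payload.decode("utf-8"))
        return list(csv.DictReader(text))
```

Reading the whole object into memory keeps the code simple but assumes each batch fits in RAM. Very large batches would need chunked reads.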
🖥️ Service Registry
- Portfolio Frontend: portfolio.techarvest.co.zw
- Source Control: git.techarvest.co.zw
- JupyterLab: lab.techarvest.co.zw
- MLflow: ml.techarvest.co.zw
- ArgoCD: cd.techarvest.co.zw
- MinIO Console: console.minio.portfolio.techarvest.co.zw
Created and maintained by fchinembiri.