# Sovereign MLOps Platform: GeoCrop LULC Portfolio

Welcome to the **Sovereign MLOps Platform**, a comprehensive self-hosted environment on K3s designed for end-to-end Land Use / Land Cover (LULC) crop-mapping in Zimbabwe. This project showcases professional skills in **MLOps, Cloud-Native Architecture, Geospatial Analysis, and GitOps**.

## 🏗️ System Architecture

The platform is built on a robust, self-hosted Kubernetes (K3s) cluster with a focus on data sovereignty and scalability.

```mermaid
graph TD
    subgraph "Frontend & Entry"
        WEB[React 19 Frontend]
        ING[Nginx Ingress]
    end

    subgraph "Core Services (geocrop namespace)"
        API[FastAPI Backend]
        RQ[Redis Queue]
        WORKER[ML Inference Worker]
        TILER[TiTiler Dynamic Server]
    end

    subgraph "MLOps & Infra"
        GITEA[Gitea Source Control]
        ARGO[ArgoCD GitOps]
        MLF[MLflow Tracking]
        JUPYTER[JupyterLab Workspace]
    end

    subgraph "Storage & Data"
        MINIO[(MinIO S3 Storage)]
        POSTGIS[(Postgres + PostGIS)]
    end

    %% Flow
    WEB --> ING
    ING --> API
    API --> RQ
    RQ --> WORKER
    WORKER --> MINIO
    WORKER --> POSTGIS
    TILER --> MINIO
    WEB --> TILER
    ARGO --> GITEA
    ARGO --> ING
    JUPYTER --> MINIO
    MLF --> POSTGIS
```

## 📊 System Data Flow (DFD)

How data moves from raw satellite imagery to final crop-type predictions:

```mermaid
graph LR
    subgraph "External Sources"
        DEA[Digital Earth Africa STAC]
    end

    subgraph "Storage (MinIO)"
        DS[(/geocrop-datasets)]
        BS[(/geocrop-baselines)]
        MD[(/geocrop-models)]
        RS[(/geocrop-results)]
    end

    subgraph "Processing"
        TRAIN[Jupyter Training]
        INFER[Inference Worker]
    end

    %% Data movement
    DEA -- "Sentinel-2 Imagery" --> INFER
    DS -- "CSV Batches" --> TRAIN
    TRAIN -- "Trained Models" --> MD
    MD -- "Model Load" --> INFER
    BS -- "DW TIFFs" --> INFER
    INFER -- "Classification COG" --> RS
    RS -- "Map Tiles" --> WEB[Frontend Visualization]
```

## 🗺️ UX Data Flow: Parallel Loading Strategy

To ensure a seamless user experience, the system implements a dual-loading strategy:

```mermaid
sequenceDiagram
    participant U as User (Frontend)
    participant T as TiTiler (S3 Proxy)
    participant A as FastAPI
    participant W as ML Worker
    participant M as MinIO

    U->>A: Submit Job (AOI + Year)
    A->>U: Job ID (Accepted)

    par Instant Visual Context
        U->>T: Fetch Baseline Tiles (DW)
        T->>M: Stream Baseline COG
        M->>T:
        T->>U: Render Baseline Map
    and Asynchronous Prediction
        A->>W: Enqueue Task
        W->>M: Fetch Model & Data
        W->>W: Run Inference & Post-processing
        W->>M: Upload Prediction COG
        loop Polling
            U->>A: Get Status?
            A-->>U: Processing...
        end
        W->>A: Job Complete
        U->>A: Get Status?
        A->>U: Prediction URL
        U->>T: Fetch Prediction Tiles
        T->>M: Stream Prediction COG
        T->>U: Overlay High-Res Result
    end
```

## 🚀 Deployment & GitOps Pipeline

```mermaid
graph LR
    DEV[Developer] -->|Push| GITEA[Gitea]

    subgraph "CI/CD Pipeline"
        GITEA -->|Trigger| GA[Gitea Actions]
        GA -->|Build & Push| DH[Docker Hub: frankchine]
    end

    subgraph "GitOps Sync"
        ARGO[ArgoCD] -->|Monitor| GITEA
        DH -->|Image Pull| K3S[K3s Cluster]
        ARGO -->|Apply Manifests| K3S
    end
```

## 🛠️ Training Workflow

Training is performed in **JupyterLab** using a custom `MinIOStorageClient` that bridges the gap between object storage and in-memory data processing.

### Using the MinIO Storage Client

```python
from training.storage_client import MinIOStorageClient

# Initialize client (uses environment variables automatically)
storage = MinIOStorageClient()

# List available training batches
batches = storage.list_files('geocrop-datasets')

# Load a batch directly into memory (No disk I/O)
df = storage.load_dataset('geocrop-datasets', 'batch_1.csv')

# Train your model and upload the artifact
# ... training code ...
storage.upload_file('model.pkl', 'geocrop-models', 'Zimbabwe_Ensemble_Model.pkl')
```

## 🖥️ Service Registry

- **Portfolio Frontend**: [portfolio.techarvest.co.zw](https://portfolio.techarvest.co.zw)
- **Source Control**: [git.techarvest.co.zw](https://git.techarvest.co.zw)
- **JupyterLab**: [lab.techarvest.co.zw](https://lab.techarvest.co.zw)
- **MLflow**: [ml.techarvest.co.zw](https://ml.techarvest.co.zw)
- **ArgoCD**: [cd.techarvest.co.zw](https://cd.techarvest.co.zw)
- **MinIO Console**: [console.minio.portfolio.techarvest.co.zw](https://console.minio.portfolio.techarvest.co.zw)

---

*Created and maintained by [fchinembiri](mailto:fchinembiri24@gmail.com).*
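
The polling loop in the UX data flow diagram above (the client repeatedly asks the API for job status until a prediction URL comes back) can be sketched as follows. This is a minimal illustration, not the platform's actual client: the function name `wait_for_prediction`, the response fields `state` and `prediction_url`, and the state values `complete` and `failed` are all hypothetical, since this README does not document the status API's schema. The status lookup is passed in as a callable so the sketch stays independent of any particular HTTP library or endpoint.

```python
import time
from typing import Callable, Optional


def wait_for_prediction(
    get_status: Callable[[str], dict],
    job_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll a job until it finishes.

    Returns the prediction URL, or None if the timeout elapses first.
    The "state" and "prediction_url" keys are assumed names, not the
    documented response format of the FastAPI backend.
    """
    waited = 0.0
    while waited <= timeout_s:
        status = get_status(job_id)
        state = status.get("state")
        if state == "complete":
            # Job finished: hand back the location of the prediction COG.
            return status.get("prediction_url")
        if state == "failed":
            raise RuntimeError(f"job {job_id} failed")
        # Still processing: wait before asking again.
        sleep(interval_s)
        waited += interval_s
    return None
```

In a real client, `get_status` would wrap an HTTP request to the backend's status endpoint. Because the baseline tiles are fetched in parallel, this loop can run in the background without blocking the initial map render.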