Sovereign MLOps Platform: LULC Crop-Mapping Portfolio
Overview
This document outlines the execution plan for restructuring the GeoCrop platform into a GitOps-driven, self-hosted MLOps platform on K3s. It replaces the full Supabase stack with a lightweight Postgres+PostGIS standalone container to conserve RAM while meeting all spatial querying requirements.
Phased Execution Strategy
Phase 1: Infrastructure Setup (The Foundation)
- Terraform (Namespaces & Quotas): Apply Terraform to configure the K3s namespace (geocrop) with explicit ResourceQuotas. We will apply 512MB limits to lightweight services (API, Web) but allocate 2GB to the ML Worker and Jupyter instances to prevent OOM errors.
- Database (Postgres + PostGIS): Deploy a standalone StatefulSet for PostGIS on port 5433 (db.techarvest.co.zw), fully isolated from other apps.
- MLOps Tools (MLflow & Jupyter):
  - Deploy MLflow (ml.techarvest.co.zw) backed by the new PostGIS DB and the existing MinIO artifact store.
  - Deploy a Jupyter Data Science workspace (lab.techarvest.co.zw) configured to pull datasets directly from the MinIO geocrop-datasets bucket, ensuring node-agnostic scheduling.
- GitOps Tools (Gitea & ArgoCD): Initialize Gitea (git.techarvest.co.zw) and ArgoCD (cd.techarvest.co.zw) to take over cluster management.
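The memory budget implied by the quota bullet can be checked with a little arithmetic. The sketch below is only illustrative: it assumes one replica per service, reads the plan's "512MB" and "2GB" as 512Mi and 2Gi, and uses hypothetical service keys and a hypothetical quota name. It covers only the four services with stated limits, so any other workload placed in the geocrop namespace would need additional headroom.

```python
# Per-service memory limits in MiB. The values come from the plan; the
# service keys and the Mi/Gi interpretation are assumptions for illustration.
MEMORY_LIMITS_MIB = {
    "api": 512,
    "web": 512,
    "ml-worker": 2048,
    "jupyter": 2048,
}


def total_memory_mib(limits):
    """Sum the per-service limits to get the minimum namespace budget."""
    return sum(limits.values())


def resource_quota(namespace, limits):
    """Build a ResourceQuota manifest (as a dict) capping total memory limits."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        # The quota name is hypothetical.
        "metadata": {"name": f"{namespace}-memory", "namespace": namespace},
        "spec": {"hard": {"limits.memory": f"{total_memory_mib(limits)}Mi"}},
    }


total = total_memory_mib(MEMORY_LIMITS_MIB)
quota = resource_quota("geocrop", MEMORY_LIMITS_MIB)
print(total, quota["spec"]["hard"]["limits.memory"])
```

Under these assumptions the four services alone need 5120Mi (5Gi) of memory limits, which is a useful lower bound when sizing the node.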
Phase 2: Frontend (React/Vite) Setup & Testing
- Zero-Downtime Requirement: The current live web page at portfolio.techarvest.co.zw MUST remain active and untouched during this transition as it is actively receiving traffic from job applications.
- Parallel Loading Strategy: Configure the new React frontend components to instantly fetch and render Dynamic World (DW) baselines (2015-2025) via the TiTiler service (tiles.portfolio.techarvest.co.zw) while awaiting ML inference.
- ArgoCD Deployment: Commit the new frontend manifests to the Gitea repository and sync via ArgoCD, carefully routing traffic to avoid disrupting the live welcome page.
- Verification: Test that the new frontend components successfully load and render TiTiler COGs instantly without backend dependency.
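The parallel loading idea is independent of the frontend framework, so it is sketched here in Python rather than React. This is a minimal illustration, not the actual component code: both fetch functions are hypothetical stand-ins that simulate latency with short sleeps instead of calling TiTiler or the API.

```python
import asyncio
from typing import List


async def fetch_baseline(year: int) -> str:
    # Hypothetical stand-in for a tile request to the TiTiler service.
    await asyncio.sleep(0.01)
    return f"dw-baseline-{year}"


async def run_inference(aoi_id: str) -> str:
    # Hypothetical stand-in for the queued ML job, which takes longer.
    await asyncio.sleep(0.05)
    return f"lulc-prediction-{aoi_id}"


async def load_map(aoi_id: str, year: int) -> List[str]:
    rendered: List[str] = []
    # Start inference in the background without waiting for it.
    inference = asyncio.create_task(run_inference(aoi_id))
    # The baseline is rendered as soon as it arrives.
    rendered.append(await fetch_baseline(year))
    # The prediction is overlaid only once the job has finished.
    rendered.append(await inference)
    return rendered


layers = asyncio.run(load_map("aoi-1", 2020))
print(layers)
```

The key design point is that the baseline request never awaits the inference task, so a slow or failed backend cannot delay the first render.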
Phase 3: Backend (API + ML Worker) Setup & CI/CD
- Gitea Actions (CI/CD): Implement .gitea/workflows/build-push.yaml to automatically build apps/worker/Dockerfile and apps/api/Dockerfile, and push them to Docker Hub (frankchine/geocrop-worker:latest, etc.).
- ArgoCD Deployment: Update backend Kubernetes manifests in the GitOps repo to pull from frankchine/.... Sync ArgoCD.
- Worker Tuning: Ensure the ML worker is correctly configured to use the standalone PostGIS database (if spatial logging is needed) and MinIO for models/results.
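The build-and-push step of the workflow amounts to two Docker commands per image. The sketch below only assembles those commands as argument lists, so it can be inspected without Docker installed. It assumes the build context is the directory containing the Dockerfile, which the plan does not state. Only the worker image is shown because it is the only tag the plan names in full; the API image would follow the same pattern with its own tag.

```python
import os
from typing import List


def build_and_push_commands(dockerfile: str, image: str) -> List[List[str]]:
    """Return the docker build and docker push invocations for one image."""
    # Assumption: the build context is the Dockerfile's own directory.
    context = os.path.dirname(dockerfile) or "."
    return [
        ["docker", "build", "-f", dockerfile, "-t", image, context],
        ["docker", "push", image],
    ]


commands = build_and_push_commands(
    "apps/worker/Dockerfile", "frankchine/geocrop-worker:latest"
)
for cmd in commands:
    print(" ".join(cmd))
```

In a real workflow each list could be passed to subprocess.run(cmd, check=True) after a registry login step.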
Phase 4: End-to-End System Testing
- Trigger Job: Submit an AOI via the React frontend.
- Verify Instant UX: Ensure the DW baseline renders immediately.
- Verify Inference: Monitor the Redis queue and ML Worker logs to ensure it pulls STAC data, runs the XGBoost/Ensemble model, and writes the output COG to MinIO.
- Verify Result Overlay: Ensure the frontend polls the API and seamlessly overlays the high-resolution LULC prediction once complete.
- Verify MLflow: Check ml.techarvest.co.zw to confirm the run metrics were logged successfully.
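The result-overlay check depends on the frontend polling the API until the job finishes. A minimal sketch of that loop follows, written in Python for testability. The status payload shape ("state", "result_url", "error"), the state names, and the example result location are all assumptions, since the plan does not describe the API's response format.

```python
import time
from typing import Callable, Optional


def poll_job(
    get_status: Callable[[], dict],
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll until the job is done; return the result location or None on timeout."""
    for _ in range(max_attempts):
        status = get_status()
        if status.get("state") == "done":
            return status.get("result_url")
        if status.get("state") == "failed":
            raise RuntimeError(status.get("error", "job failed"))
        sleep(interval_s)
    return None


# Simulated API responses; the result location is a hypothetical placeholder.
responses = iter([
    {"state": "queued"},
    {"state": "running"},
    {"state": "done", "result_url": "s3://hypothetical-bucket/result.tif"},
])
result = poll_job(lambda: next(responses), sleep=lambda s: None)
print(result)
```

Injecting get_status and sleep keeps the loop testable without a live API and makes the timeout behaviour (max_attempts times interval_s) explicit.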