# GeoCrop Portfolio App — End-State Checklist, Architecture, and Next Steps

*Last updated: 27 Feb 2026 (Africa/Harare)*

This document captures:

* What's **already built and verified** in your K3s cluster
* The **full end-state feature checklist** (public + admin)
* The **target architecture** and data flow
* The **next steps** (what to build next, in the order that won't get you stuck)
* Notes to make this **agent-friendly** (Roo / Minimax execution)

---

## 0) Current progress — what you have done so far (verified)

### 0.1 Cluster + networking

* **K3s cluster running** (1 control-plane + 2 workers)
* **NGINX Ingress Controller installed and running**
  * Ingress controller exposed on worker `vmi3045103`, public IP `167.86.68.48`
* **cert-manager installed**
* **Let's Encrypt prod ClusterIssuer created** (`letsencrypt-prod`) and Ready=True

### 0.2 DNS

A records pointing to `167.86.68.48`:

* `portfolio.techarvest.co.zw`
* `api.portfolio.techarvest.co.zw`
* `minio.portfolio.techarvest.co.zw`
* `console.minio.portfolio.techarvest.co.zw`

### 0.3 Namespace + core services (geocrop)

Namespace:

* `geocrop`

Running components:

* **Redis** (queue/broker)
* **MinIO** (S3 storage) with PVC (30Gi, local-path)
* Placeholder web + API behind Ingress
* TLS certificates for all subdomains (Ready=True)

### 0.4 Connectivity tests (verified)

* `portfolio.techarvest.co.zw` reachable over HTTPS
* `api.portfolio.techarvest.co.zw` reachable over HTTPS
* `console.minio.portfolio.techarvest.co.zw` loads correctly

### 0.5 What you added recently (major progress)

* Uploaded ML model artifact to **MinIO** (`geocrop-models` bucket)
* Implemented working **FastAPI backend** with JWT authentication
* Implemented **Python RQ worker** consuming the Redis queue
* Verified end-to-end async job submission + dummy inference response

### 0.6 Dynamic World Baseline Migration (Completed)

* Configured **rclone** with a Google Drive remote (`gdrive`)
* Successfully copied ~7.9 GiB of Dynamic World seasonal GeoTIFFs
  (132 files) from Google Drive to the server path:
  * `~/geocrop/data/dw_baselines`
* Installed `rio-cogeo`, `rasterio`, `pyproj`, and dependencies
* Converted all baseline GeoTIFFs to **Cloud Optimized GeoTIFFs (COGs)**:
  * Output directory: `~/geocrop/data/dw_cogs`

> This is a major milestone: your Dynamic World baselines are now local and converted to COG format, which is required for efficient tiling and MinIO-based serving.

> Note: Your earlier edits to `10-redis.yaml` and `20-minio.yaml` suffered some terminal echo corruption, but the K8s objects did apply and are running. We'll clean the manifests into a proper repo layout next.

---

## 1) End-state: what the app should have (complete checklist)

### 1.1 Public user experience

**Auth & access**

* Login for public users (best for a portfolio: **invite-only registration** or a "request access" flow)
* JWT auth (already planned)
* Clear "demo limits" messaging

**AOI selection**

* Leaflet map:
  * Place a marker OR draw a circle (center + radius)
  * Radius slider up to **5 km**
  * Optional polygon draw (but enforce max area / vertex count)
* Manual input:
  * Latitude/longitude center
  * Radius (meters / km)

**Parameters**

* Year chooser: **2015 → present**
* Season chooser:
  * Summer cropping only (Nov 1 → Apr 30) for now
* Model chooser:
  * RandomForest / XGBoost / LightGBM / CatBoost / Ensemble

**Job lifecycle UI**

* Submit job
* Loading/progress screen with stages:
  * Queued → Downloading imagery → Computing indices → Running model → Smoothing → Exporting GeoTIFF → Uploading → Done
* Results page:
  * Map viewer with layer toggles
  * Download links (GeoTIFF only)

**Map layers (toggles)**

* ✅ Refined crop/LULC map (final product) at **10 m**
* ✅ Dynamic World baseline toggle
  * Prefer the **Highest Confidence** composite (as you stated)
* ✅ True colour composite
* ✅ Indices toggles:
  * Peak NDVI
  * Peak EVI
  * Peak SAVI
  * (Optional later: NDMI, NDRE)

**Outputs**

* Download refined result as **GeoTIFF only**
* Optional downloads:
  * Baseline DW clipped to AOI
    (GeoTIFF)
  * True colour composite (GeoTIFF)
  * Indices rasters (GeoTIFF)

**Legend / key**

* On-map legend showing your refined classes (color-coded)
* Class list includes:
  * Your refined crop classes (from your image)
  * Plus non-crop landcover classes so it remains a full LULC map

### 1.2 Processing pipeline requirements

**Validation**

* AOI inside Zimbabwe only
* Radius ≤ 5 km
* Reject overly complex geometries

**Data sources**

* DEA STAC endpoint:
  * `https://explorer.digitalearth.africa/stac/search`
* Dynamic World baseline:
  * Your pre-exported DW GeoTIFFs per year/season (now in Google Drive; migrate to MinIO)

**Core computations**

* Pull imagery from DEA STAC for the selected year + summer season window
* Build feature stack:
  * True colour
  * Indices: NDVI, EVI, SAVI (+ optional NDRE/NDMI)
  * "Peak" index logic (seasonal maximum)
* Load the DW baseline for the same year/season, clip to AOI

**ML refinement**

* Take the baseline DW + EO features and run the selected ML model
* Refine crops into crop-specific classes
* Keep non-crop classes to output a full LULC map

**Neighborhood smoothing**

* Majority filter rule:
  * If a pixel is surrounded by a majority class, set it to that majority class
* Configurable kernel sizes: 3×3 / 5×5

**Export and storage**

* Export refined output as GeoTIFF (prefer **Cloud Optimized GeoTIFF**)
* Save to MinIO
* Provide **signed URLs** for downloads

### 1.3 Admin capabilities

* Admin login (role-based)
* Dataset uploads:
  * Upload training CSVs and/or labeled GeoTIFFs
  * Version datasets (v1, v2…)
* Retraining:
  * Trigger model retraining via a Kubernetes Job
  * Save trained models to MinIO (versioned)
  * Promote a model to "production default"
* Job monitoring:
  * See queued/running/failed jobs, timing, logs
* User management:
  * Invite/create/disable users
  * Per-user limits

### 1.4 Reliability + portfolio safety (high value)

**Compute control**

* Global concurrency cap (cluster-wide): e.g. **2 jobs running**
* Per-user daily limits: e.g.
  **3–5 jobs/day**
* Job timeouts: kill jobs running > 25 minutes

**Caching**

* Deterministic caching:
  * If (AOI + year + season + model) repeats → return the cached output

**Resilience**

* Queue-based async processing (RQ)
* Retry logic for STAC fetches
* Clean error reporting to the user

### 1.5 Security

* HTTPS everywhere (already done)
* JWT auth
* RBAC roles: admin vs user
* K8s Secrets for:
  * JWT secret
  * MinIO credentials
  * DB credentials
* MinIO should not be publicly writable
* Downloads are signed URLs only

### 1.6 Nice-to-have portfolio boosters

* Swipe/slider compare: refined vs DW baseline
* Confidence raster toggle (if the model outputs probabilities)
* Stats panel:
  * Area per class (ha)
* Metadata JSON (small but very useful even if downloads are "GeoTIFF only"):
  * job_id, timestamp, year/season, model version, AOI, CRS, pixel size

---

## 2) Recommendation: "best" login + limiting approach for a portfolio

Because this is a portfolio project on VPS resources:

**Best default**

* **Invite-only accounts** (you create accounts or send invites)
* Simple password login (JWT)
* Hard limits:
  * Global: 1–2 jobs running
  * Per user: 3 jobs/day

**Why invite-only is best for a portfolio**

* It prevents random abuse from your CV link
* It keeps your compute predictable
* It still demonstrates full auth + quota features

**Optional later**

* Public "Request Access" form (email + reason)
* Or Google OAuth (more work, not necessary for a portfolio)

---

## 3) Target architecture (final)

### 3.1 Components

* **Frontend**: React + Leaflet
  * Select AOI + params
  * Submit job
  * Poll status
  * Render map layers from tiles
  * Download GeoTIFF
* **API**: FastAPI
  * Auth (JWT)
  * Validate AOI + quotas
  * Create job records
  * Push jobs to the Redis queue
  * Generate signed URLs
* **Worker**: Python RQ worker
  * Pull job
  * Query DEA STAC
  * Compute features/indices
  * Load DW baseline
  * Run model inference
  * Neighborhood smoothing
  * Write outputs as COG GeoTIFF
  * Update job status
* **Redis**
  * Job queue
* **MinIO**
  * Baselines (DW)
  * Models
  * Results (COGs)
* **Database (recommended)**
  * Postgres (preferred) for:
    * users, roles
    * jobs, params
    * quota usage
    * model registry metadata
* **Tile server**
  * TiTiler or a rio-tiler-based service
  * Serves tiles from MinIO-hosted COGs

### 3.2 Buckets (MinIO)

* `geocrop-baselines` (DW GeoTIFF/COG)
* `geocrop-models` (pkl/onnx + metadata)
* `geocrop-results` (output COGs)
* `geocrop-datasets` (training data uploads)

### 3.3 Subdomains

* `portfolio.techarvest.co.zw` → frontend
* `api.portfolio.techarvest.co.zw` → FastAPI
* `tiles.portfolio.techarvest.co.zw` → TiTiler (recommended add)
* `minio.portfolio.techarvest.co.zw` → MinIO API (private)
* `console.minio.portfolio.techarvest.co.zw` → MinIO Console (admin-only)

---

## 4) What to build next (exact order)

### Phase A — Clean repo + manifests (so you stop fighting YAML)

1. Create a Git repo layout:
   * `geocrop/`
     * `k8s/`
       * `base/`
       * `prod/`
     * `api/`
     * `worker/`
     * `web/`
2. Move your current YAML into files with predictable names:
   * `k8s/base/00-namespace.yaml`
   * `k8s/base/10-redis.yaml`
   * `k8s/base/20-minio.yaml`
   * `k8s/base/30-api.yaml`
   * `k8s/base/40-worker.yaml`
   * `k8s/base/50-web.yaml`
   * `k8s/base/60-ingress.yaml`
3. Add `kubectl apply -k` using Kustomize later (optional).

### Phase B — Make the API real (replace hello-api)

4. Build FastAPI endpoints:
   * `POST /auth/register` (admin-only or invite)
   * `POST /auth/login`
   * `POST /jobs` (create job)
   * `GET /jobs/{job_id}` (status)
   * `GET /jobs/{job_id}/download` (signed URL)
   * `GET /models` (list available models)
5. Add quota + concurrency guards:
   * Global running jobs ≤ 2
   * Per-user jobs/day ≤ 3–5
6. Store job status:
   * Start with Redis
   * Upgrade to Postgres when stable

### Phase C — Worker: "real pipeline v1"

7. Implement DEA STAC search + download/clip for the AOI:
   * Sentinel-2 (`s2_l2a`) is likely the easiest first
   * Compute indices (NDVI, EVI, SAVI)
   * Compute peak indices (season max)
8.
   Load the DW baseline GeoTIFF for the year:
   * Step 1: upload DW GeoTIFFs from Google Drive to MinIO
   * Step 2: clip to AOI
9. Run model inference:
   * Load the model from MinIO
   * Apply it to the feature stack
   * Output the refined label raster
10. Neighborhood smoothing:
    * Majority filter 3×3 / 5×5 (configurable)
11. Export the result as GeoTIFF (prefer COG):
    * Write to temp
    * Upload to MinIO

### Phase D — Tiles + map UI

12. Deploy a TiTiler service and expose it at:
    * `tiles.portfolio...`
13. Frontend:
    * Leaflet selection + coordinate input
    * Submit job + poll
    * Add layers from tile URLs
    * Legend + downloads

### Phase E — Admin portal + retraining

14. Admin UI:
    * Dataset upload
    * Model list + promote
15. Retraining pipeline:
    * A Kubernetes Job that:
      * pulls the dataset from MinIO
      * trains models
      * saves the artifact to MinIO
      * registers the new model version

---

## 5) Important "you might forget" items (add now)

### 5.1 Model registry metadata

For each model artifact, store:

* model_name
* version
* training datasets used
* training timestamp
* expected feature list
* class mapping

### 5.2 Class mapping (must be consistent)

Create a single `classes.json` used by:

* training
* inference
* the frontend legend

### 5.3 Zimbabwe boundary validation

Use a Zimbabwe boundary polygon in the API/worker to validate the AOI.

* Best: store the boundary geometry as GeoJSON in the repo.

### 5.4 Deterministic job cache key

Hash:

* year
* season
* model_version
* center lat/lon
* radius

If it exists → return the cached result (huge compute saver).

### 5.5 Signed downloads

Never expose MinIO objects publicly.

* The API generates signed GET URLs that expire.

---

## 6) Open items to decide (tomorrow)

1. **Frontend framework**: React + Vite (recommended)
2. **Tile approach**: TiTiler vs pre-rendered PNGs (TiTiler looks much more professional)
3. **DB**: add Postgres now vs later (recommended soon for quotas + user management)
4. **Which DEA collections** to use for the first version:
   * Start with Sentinel-2 L2A (`s2_l2a`)
   * Later add a Landsat fallback
5.
   **Model input features**: exact feature vector and normalization rules

---

## 7) Roo/Minimax execution notes (so it doesn't get confused)

* Treat the current cluster as **production-like**
* All services live in the namespace: `geocrop`
* Ingress class: `nginx`
* ClusterIssuer: `letsencrypt-prod`
* Public IP of the ingress node: `167.86.68.48`
* Subdomains are already configured and reachable
* The next change should be swapping placeholder services for real deployments

---

## 8) Short summary

You already have the hard part done:

* K3s + ingress + TLS + DNS work
* MinIO + Redis work
* You proved async jobs can be queued and processed

Next is mostly **application engineering**:

* Replace the placeholder web/API with the real app
* Add job status + quotas
* Implement DEA STAC fetch + DW baseline clipping + ML inference
* Export COGs + tile server + map UI
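One concrete shape for the shared `classes.json` from section 5.2: a single file consumed by training, inference, and the frontend legend. The class names, codes, and colors below are placeholders; the real list comes from your refined class scheme:

```json
{
  "version": 1,
  "classes": [
    {"code": 0,  "name": "no_data", "color": "#000000", "crop": false},
    {"code": 1,  "name": "water",   "color": "#419bdf", "crop": false},
    {"code": 2,  "name": "trees",   "color": "#397d49", "crop": false},
    {"code": 3,  "name": "grass",   "color": "#88b053", "crop": false},
    {"code": 10, "name": "maize",   "color": "#e49635", "crop": true},
    {"code": 11, "name": "tobacco", "color": "#c4281b", "crop": true}
  ]
}
```

Keeping a `version` field lets a model artifact record exactly which class mapping it was trained against.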
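Likewise, the per-artifact registry metadata from section 5.1 can live as a small JSON document stored next to each model in `geocrop-models`. All values here are invented examples illustrating the listed fields:

```json
{
  "model_name": "randomforest",
  "version": "v3",
  "training_datasets": ["geocrop-datasets/labels-v2.csv"],
  "training_timestamp": "2026-02-27T10:00:00Z",
  "expected_features": ["peak_ndvi", "peak_evi", "peak_savi", "red", "green", "blue"],
  "class_mapping": "classes.json@1"
}
```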
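To make section 1.2's validation rules concrete (AOI inside Zimbabwe only, radius ≤ 5 km), here is a minimal sketch. The bounding box is a coarse approximation standing in for the real GeoJSON boundary polygon recommended in section 5.3, and the function name and bbox coordinates are illustrative assumptions, not the final implementation:

```python
# Sketch of AOI validation. A real version would replace the bounding-box
# pre-check with a point-in-polygon test against the Zimbabwe GeoJSON boundary.

MAX_RADIUS_M = 5_000  # the "radius <= 5 km" rule

# Approximate Zimbabwe bounding box (WGS84) -- coarse, illustrative values.
ZW_BBOX = {"min_lon": 25.2, "max_lon": 33.1, "min_lat": -22.5, "max_lat": -15.6}

def validate_aoi(lat: float, lon: float, radius_m: float) -> list[str]:
    """Return a list of validation errors; an empty list means the AOI passes."""
    errors = []
    if radius_m <= 0 or radius_m > MAX_RADIUS_M:
        errors.append(f"radius must be in (0, {MAX_RADIUS_M}] metres")
    inside = (ZW_BBOX["min_lat"] <= lat <= ZW_BBOX["max_lat"]
              and ZW_BBOX["min_lon"] <= lon <= ZW_BBOX["max_lon"])
    if not inside:
        errors.append("AOI centre is outside Zimbabwe")
    return errors
```

Returning a list of errors (rather than raising on the first failure) lets the API report every problem to the user in one response.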
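The deterministic cache key from section 5.4 is just a stable hash over the job parameters. A stdlib-only sketch; rounding the coordinates so float noise doesn't defeat the cache is an added assumption (the precision is a judgment call):

```python
import hashlib
import json

def job_cache_key(year: int, season: str, model_version: str,
                  lat: float, lon: float, radius_m: int) -> str:
    """Deterministic cache key for (AOI + year + season + model)."""
    payload = {
        "year": year,
        "season": season,
        "model_version": model_version,
        # Round to ~1 m precision so near-identical requests hash identically.
        "lat": round(lat, 5),
        "lon": round(lon, 5),
        "radius_m": radius_m,
    }
    # sort_keys + fixed separators make the serialization canonical.
    blob = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode()).hexdigest()
```

The key can double as the MinIO object prefix for results, so a cache hit is simply "does this object already exist".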
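Section 1.4's compute controls (global concurrency cap, per-user daily limit) reduce to two counters checked before a job is enqueued. This sketch keeps the counters in memory for clarity; in the real API they would live in Redis (so all replicas share state), and the class and limit values are illustrative:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class QuotaGuard:
    """Gatekeeper consulted before pushing a job onto the queue (in-memory sketch)."""
    max_global_running: int = 2   # cluster-wide cap
    max_user_per_day: int = 3     # per-user daily limit
    running: int = 0
    daily: dict = field(default_factory=dict)  # (user, date) -> jobs today

    def try_acquire(self, user: str) -> tuple[bool, str]:
        key = (user, date.today())
        if self.running >= self.max_global_running:
            return False, "cluster busy: try again shortly"
        if self.daily.get(key, 0) >= self.max_user_per_day:
            return False, "daily job limit reached"
        self.running += 1
        self.daily[key] = self.daily.get(key, 0) + 1
        return True, "ok"

    def release(self) -> None:
        """Call when a job finishes (success, failure, or timeout)."""
        self.running = max(0, self.running - 1)
```

Note that `release()` must run on every terminal state, including the 25-minute timeout kill, or the global slot leaks.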
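The neighborhood smoothing rule from section 1.2 (majority filter, configurable 3×3 / 5×5 kernel) can be sketched in pure Python. The real worker would run a vectorized NumPy or `scipy.ndimage` version over the label raster; treat this as the logic only:

```python
from collections import Counter

def majority_filter(grid: list[list[int]], kernel: int = 3) -> list[list[int]]:
    """Set each pixel to the strict-majority class of its kernel x kernel window.

    Pixels keep their original value when no class holds a strict majority.
    """
    assert kernel in (3, 5)
    h, w = len(grid), len(grid[0])
    r = kernel // 2
    out = [row[:] for row in grid]  # never modify the input raster in place
    for y in range(h):
        for x in range(w):
            # Count classes in the window, clipped at the raster edges.
            votes = Counter(
                grid[ny][nx]
                for ny in range(max(0, y - r), min(h, y + r + 1))
                for nx in range(max(0, x - r), min(w, x + r + 1))
            )
            cls, count = votes.most_common(1)[0]
            # Only flip the pixel if one class strictly dominates the window.
            if count > sum(votes.values()) / 2:
                out[y][x] = cls
    return out
```
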