# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## What This Project Does
GeoCrop is a crop-type classification platform for Zimbabwe. It:
1. Accepts an AOI (lat/lon + radius) and year via REST API
2. Queues an inference job via Redis/RQ
3. Worker fetches Sentinel-2 imagery from DEA STAC, computes 51 spectral features, loads a Dynamic World baseline, runs an ML model (XGBoost/LightGBM/CatBoost/Ensemble), and uploads COG results to MinIO
4. Results are served via TiTiler (tile server reading COGs directly from MinIO over S3)
## Build & Run Commands
```bash
# API
cd apps/api && pip install -r requirements.txt
uvicorn main:app --host 0.0.0.0 --port 8000
# Worker
cd apps/worker && pip install -r requirements.txt
python worker.py --worker # start RQ worker
python worker.py --test # syntax/import self-test only
# Web frontend (React + Vite + TypeScript)
cd apps/web && npm install
npm run dev # dev server (hot reload)
npm run build # production build → dist/
npm run lint # ESLint check
npm run preview # preview production build locally
# Training
cd training && python train.py --data /path/to/data.csv --out ./artifacts --variant Raw
# With MinIO upload:
MINIO_ENDPOINT=... MINIO_ACCESS_KEY=... MINIO_SECRET_KEY=... \
python train.py --data /path/to/data.csv --out ./artifacts --variant Raw --upload-minio
# Docker
docker build -t frankchine/geocrop-api:v1 apps/api/
docker build -t frankchine/geocrop-worker:v1 apps/worker/
```
## Kubernetes Deployment
All k8s manifests are in `k8s/` — numbered for apply order:
```bash
kubectl apply -f k8s/00-namespace.yaml
kubectl apply -f k8s/ # apply all in order
kubectl -n geocrop rollout restart deployment/geocrop-api
kubectl -n geocrop rollout restart deployment/geocrop-worker
```
Namespace: `geocrop`. Ingress class: `nginx`. ClusterIssuer: `letsencrypt-prod`.
Exposed hosts:
- `portfolio.techarvest.co.zw` → geocrop-web (nginx static)
- `api.portfolio.techarvest.co.zw` → geocrop-api:8000
- `tiles.portfolio.techarvest.co.zw` → geocrop-tiler:8000 (TiTiler)
- `minio.portfolio.techarvest.co.zw` → MinIO API
- `console.minio.portfolio.techarvest.co.zw` → MinIO Console
## Architecture
```
Web (React/Vite/OL) → API (FastAPI) → Redis Queue (geocrop_tasks) → Worker (RQ)
DEA STAC → feature_computation.py (51 features)
MinIO → dw_baseline.py (windowed read)
MinIO → inference.py (model load + predict)
→ postprocess.py (majority filter)
→ cog.py (write COG)
→ MinIO geocrop-results/
TiTiler reads COGs from MinIO via S3 protocol
```
Job status is written to Redis at `job:{job_id}:status` with 24h expiry.
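A minimal sketch of what such a status write might look like, assuming a redis-py style client that exposes `set(name, value, ex=seconds)`. The helper names (`status_key`, `write_job_status`) and the JSON payload fields are hypothetical, not taken from the codebase.

```python
import json

STATUS_TTL_SECONDS = 24 * 60 * 60  # 24h expiry, as described above


def status_key(job_id: str) -> str:
    """Build the Redis key that holds a job's status."""
    return f"job:{job_id}:status"


def write_job_status(client, job_id: str, stage: str, message: str = "") -> str:
    """Store a JSON status blob under the job key with a 24h TTL.

    `client` is any object with a redis-py style `set(name, value, ex=...)`.
    The payload fields are illustrative, not the project's actual schema.
    """
    key = status_key(job_id)
    payload = json.dumps({"stage": stage, "message": message})
    client.set(key, payload, ex=STATUS_TTL_SECONDS)
    return key
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub when Redis is not running.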
**Web frontend** (`apps/web/`): React 19 + TypeScript + Vite. Uses OpenLayers for the map (click-to-set-coordinates). Components: `Login`, `Welcome`, `JobForm`, `StatusMonitor`, `MapComponent`, `Admin`. State is in `App.tsx`; JWT token stored in `localStorage`.
**API user store**: Users are stored in an in-memory dict (`USERS` in `apps/api/main.py`) — lost on restart. Admin panel (`/admin/users`) manages users at runtime. Any user additions must be re-done after pod restarts unless the dict is seeded in code.
## Critical Non-Obvious Patterns
**Season window**: Sept 1 → May 31 of the following year. `year=2022` → 2022-09-01 to 2023-05-31. See `InferenceConfig.season_dates()` in `apps/worker/config.py`.
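The rule can be sketched as a standalone function. This is an illustrative stand-in, not the actual body of `InferenceConfig.season_dates()`.

```python
from datetime import date
from typing import Tuple


def season_dates(year: int) -> Tuple[date, date]:
    """Return (start, end) of the growing season labelled by `year`.

    The season starts on 1 September of `year` and ends on 31 May
    of the following year.
    """
    return date(year, 9, 1), date(year + 1, 5, 31)
```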
**AOI format**: `(lon, lat, radius_m)` — NOT `(lat, lon)`. Longitude first everywhere in `features.py`.
**Zimbabwe bounds**: Lon 25.2 to 33.1, Lat -22.5 to -15.6 (enforced in `worker.py` validation).
**Radius limit**: Max 5000m enforced in both API (`apps/api/main.py:90`) and worker validation.
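A sketch of the combined AOI checks, assuming the longitude range reads 25.2 to 33.1 and that validation failures surface as `ValueError`. The function name `validate_aoi` and the error messages are hypothetical. The real checks live in `worker.py` and `apps/api/main.py`.

```python
ZIM_LON = (25.2, 33.1)    # assumed longitude range for Zimbabwe
ZIM_LAT = (-22.5, -15.6)  # latitude range from the bounds above
MAX_RADIUS_M = 5000


def validate_aoi(aoi):
    """Validate an AOI given as (lon, lat, radius_m), longitude first."""
    lon, lat, radius_m = aoi
    if not ZIM_LON[0] <= lon <= ZIM_LON[1]:
        raise ValueError(f"longitude {lon} outside Zimbabwe bounds {ZIM_LON}")
    if not ZIM_LAT[0] <= lat <= ZIM_LAT[1]:
        raise ValueError(f"latitude {lat} outside Zimbabwe bounds {ZIM_LAT}")
    if not 0 < radius_m <= MAX_RADIUS_M:
        raise ValueError(f"radius {radius_m} m must be in (0, {MAX_RADIUS_M}]")
    return float(lon), float(lat), float(radius_m)
```

A swapped `(lat, lon)` pair for a point in Zimbabwe fails the longitude check, so a bounds check of this kind also catches the most common ordering mistake.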
**RQ queue name**: `geocrop_tasks`. Redis service: `redis.geocrop.svc.cluster.local`.
**API vs worker function name mismatch**: `apps/api/main.py` enqueues `'worker.run_inference'` but the worker only defines `run_job`. Any new worker entry point must be named `run_inference` (or the API call must be updated) for end-to-end jobs to work.
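One low-risk way to reconcile the two names is an alias in `worker.py`. The sketch below assumes `run_job` takes a single payload argument. Its real signature is not shown here, and the function body is a placeholder.

```python
def run_job(payload: dict) -> dict:
    """Placeholder for the existing worker entry point (body is illustrative)."""
    return {"status": "done", "job_id": payload.get("job_id")}


# RQ resolves the dotted path 'worker.run_inference' to a module attribute,
# so exporting an alias is enough for the API's enqueue call to find it.
run_inference = run_job
```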
**Smoothing kernel**: Must be odd — 3, 5, or 7 only (`postprocess.py`).
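To illustrate the kernel constraint, here is a pure-Python majority filter over a small label grid. It is a teaching sketch only: the real `postprocess.py` is not shown here and presumably operates on arrays. The tie-breaking rule (keep the current label) is an assumption.

```python
from collections import Counter

ALLOWED_KERNELS = (3, 5, 7)


def majority_filter(grid, kernel=3):
    """Replace each cell with the most common label in its kernel x kernel window.

    Windows are clipped at the grid edges. `grid` is a list of equal-length rows.
    """
    if kernel not in ALLOWED_KERNELS:
        raise ValueError(f"kernel must be one of {ALLOWED_KERNELS}, got {kernel}")
    half = kernel // 2
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            counts = Counter(window)
            best = max(counts.values())
            # Keep the current label unless another label is strictly more
            # common; otherwise take the first label that reaches the top count.
            if counts[grid[r][c]] < best:
                out[r][c] = max(counts, key=counts.get)
    return out
```

An odd kernel keeps the window centred on the pixel being relabelled, which is why even sizes are rejected.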
**Feature order**: `FEATURE_ORDER_V1` in `feature_computation.py` — exactly 51 scalar features. Order matters for model inference. Changing this breaks all existing models.
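A sketch of how a feature dictionary might be turned into a model-ready vector while guarding against ordering drift. `DEMO_ORDER` is a three-name stand-in for the real 51-entry `FEATURE_ORDER_V1`, and `to_feature_vector` is a hypothetical helper.

```python
DEMO_ORDER = ["ndvi_max", "ndvi_min", "ndvi_mean"]  # stand-in, not the real list


def to_feature_vector(features: dict, order: list) -> list:
    """Return feature values in the exact order the model was trained on.

    Raises if any expected feature is missing or an unexpected one is present,
    so a silent column shift can never reach the model.
    """
    missing = [name for name in order if name not in features]
    extra = sorted(set(features) - set(order))
    if missing or extra:
        raise ValueError(f"feature mismatch: missing={missing}, extra={extra}")
    return [float(features[name]) for name in order]
```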
## MinIO Buckets & Path Conventions
| Bucket | Purpose | Path pattern |
|--------|---------|-------------|
| `geocrop-models` | ML model `.pkl` files | ROOT — no subfolders |
| `geocrop-baselines` | Dynamic World COG tiles | `dw/zim/summer/<season>/<type>/DW_Zim_<Type>_<year>_<year+1>-<row>-<col>.tif` |
| `geocrop-results` | Output COGs | `results/<job_id>/<filename>` |
| `geocrop-datasets` | Training data CSVs | — |

**Model filenames** (stored at the ROOT of `geocrop-models`):
- `Zimbabwe_Ensemble_Raw_Model.pkl` — no scaler needed
- `Zimbabwe_XGBoost_Model.pkl`, `Zimbabwe_LightGBM_Model.pkl`, `Zimbabwe_RandomForest_Model.pkl` — require scaler
- `Zimbabwe_CatBoost_Raw_Model.pkl` — no scaler
**DW baseline tiles**: COGs are 65536×65536-pixel tiles. The worker MUST use windowed reads via a presigned URL and never download the full tile. Always transform the AOI bbox to the tile CRS before computing the window.
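The window arithmetic can be sketched in pure Python. This assumes a north-up tile with square pixels and a bbox that has already been transformed into the tile CRS. The function name `pixel_window` and its parameters are hypothetical. In practice a raster library would normally handle this step (for example rasterio's `transform_bounds` and `windows.from_bounds`).

```python
import math

TILE_SIZE = 65536  # pixels per side, as stated above


def pixel_window(bbox, origin_x, origin_y, pixel_size, tile_size=TILE_SIZE):
    """Convert a bbox in tile CRS to (col_off, row_off, width, height).

    `origin_x`/`origin_y` are the coordinates of the tile's top-left corner.
    The window is snapped outward to whole pixels and clipped to the tile.
    """
    minx, miny, maxx, maxy = bbox
    col_start = math.floor((minx - origin_x) / pixel_size)
    col_stop = math.ceil((maxx - origin_x) / pixel_size)
    # Rows count downward from the top edge, so maxy gives the first row.
    row_start = math.floor((origin_y - maxy) / pixel_size)
    row_stop = math.ceil((origin_y - miny) / pixel_size)
    col_start, col_stop = max(0, col_start), min(tile_size, col_stop)
    row_start, row_stop = max(0, row_start), min(tile_size, row_stop)
    if col_start >= col_stop or row_start >= row_stop:
        raise ValueError("AOI does not intersect this tile")
    return col_start, row_start, col_stop - col_start, row_stop - row_start
```

Skipping the CRS transform would feed degrees into arithmetic that expects tile units, which typically yields an empty or wildly wrong window.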
## Environment Variables
| Variable | Default | Notes |
|----------|---------|-------|
| `REDIS_HOST` | `redis.geocrop.svc.cluster.local` | Also supports `REDIS_URL` |
| `MINIO_ENDPOINT` | `minio.geocrop.svc.cluster.local:9000` | |
| `MINIO_ACCESS_KEY` | `minioadmin` | |
| `MINIO_SECRET_KEY` | `minioadmin123` | |
| `MINIO_SECURE` | `false` | |
| `GEOCROP_CACHE_DIR` | `/tmp/geocrop-cache` | |
| `SECRET_KEY` | (change in prod) | API JWT signing |

TiTiler uses `AWS_S3_ENDPOINT_URL=http://minio.geocrop.svc.cluster.local:9000`, `AWS_HTTPS=NO`, credentials from `geocrop-secrets` k8s secret.
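A sketch of reading these variables with the documented defaults. The `load_settings` helper, its key names, and the way `MINIO_SECURE` is parsed into a boolean are assumptions. `SECRET_KEY` is omitted because no concrete default is listed.

```python
import os


def load_settings(env=None) -> dict:
    """Read connection settings, falling back to the defaults in the table."""
    env = os.environ if env is None else env
    secure_raw = env.get("MINIO_SECURE", "false")
    return {
        # REDIS_URL, when set, is assumed to take precedence over REDIS_HOST.
        "redis_url": env.get("REDIS_URL"),
        "redis_host": env.get("REDIS_HOST", "redis.geocrop.svc.cluster.local"),
        "minio_endpoint": env.get(
            "MINIO_ENDPOINT", "minio.geocrop.svc.cluster.local:9000"
        ),
        "minio_access_key": env.get("MINIO_ACCESS_KEY", "minioadmin"),
        "minio_secret_key": env.get("MINIO_SECRET_KEY", "minioadmin123"),
        # Assumed truthy spellings; the real parser may accept others.
        "minio_secure": secure_raw.strip().lower() in ("1", "true", "yes"),
        "cache_dir": env.get("GEOCROP_CACHE_DIR", "/tmp/geocrop-cache"),
    }
```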
## Feature Engineering (must match training exactly)
Pipeline in `feature_computation.py`:
1. Compute indices: ndvi, ndre, evi, savi, ci_re, ndwi
2. Fill zeros linearly, then Savitzky-Golay smooth (window=5, polyorder=2)
3. Phenology metrics for ndvi/ndre/evi: max, min, mean, std, amplitude, auc, peak_timestep, max_slope_up, max_slope_down (27 features)
4. Harmonics for ndvi only: harmonic1_sin/cos, harmonic2_sin/cos (4 features)
5. Interactions: ndvi_ndre_peak_diff, canopy_density_contrast (2 features)
6. Window summaries (early=Oct-Dec, peak=Jan-Mar, late=Apr-Jun) for ndvi/ndwi/ndre × mean/max (18 features)

**Total: 51 features** (27 + 4 + 2 + 18). See `FEATURE_ORDER_V1` for the exact ordering.
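As an illustration of step 3, the nine phenology metrics for one smoothed index series could be computed as below. The exact definitions used in `feature_computation.py` are not shown here, so the choices in this sketch (population standard deviation, trapezoidal AUC with unit spacing, slopes as first differences, `max_slope_down` as the most negative difference) are assumptions.

```python
import math


def phenology_metrics(series):
    """Compute nine summary metrics for one index time series."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two timesteps")
    vmax, vmin = max(series), min(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    pairs = list(zip(series, series[1:]))
    slopes = [b - a for a, b in pairs]  # change per timestep
    return {
        "max": vmax,
        "min": vmin,
        "mean": mean,
        "std": std,
        "amplitude": vmax - vmin,
        "auc": sum((a + b) / 2 for a, b in pairs),  # trapezoid rule
        "peak_timestep": series.index(vmax),  # first timestep at the maximum
        "max_slope_up": max(slopes),
        "max_slope_down": min(slopes),
    }
```

Nine metrics for each of ndvi, ndre, and evi give the 27 phenology features counted above.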
Training junk columns dropped: `.geo`, `system:index`, `latitude`, `longitude`, `lat`, `lon`, `ID`, `parent_id`, `batch_id`, `is_syn`.
## DEA STAC
- Search endpoint: `https://explorer.digitalearth.africa/stac/search`
- Primary collection: `s2_l2a` (falls back to `s2_l2a_c1`, `sentinel-2-l2a`, `sentinel_2_l2a`)
- Required bands: red, green, blue, nir, nir08 (red-edge), swir16, swir22
- Cloud filter: `eo:cloud_cover < 30`
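A sketch of a search request body matching the settings above. It assumes the endpoint accepts the STAC API Query extension for the cloud-cover filter. The `build_stac_search` helper and its `limit` default are hypothetical.

```python
FALLBACK_COLLECTIONS = ["s2_l2a", "s2_l2a_c1", "sentinel-2-l2a", "sentinel_2_l2a"]


def build_stac_search(bbox, start, end, collection="s2_l2a", max_cloud=30, limit=100):
    """Build a JSON body for POST https://explorer.digitalearth.africa/stac/search.

    `bbox` is (min_lon, min_lat, max_lon, max_lat); `start` and `end` are
    ISO dates such as "2022-09-01".
    """
    return {
        "collections": [collection],
        "bbox": list(bbox),
        "datetime": f"{start}T00:00:00Z/{end}T23:59:59Z",
        # Query extension: keep only scenes with cloud cover below the limit.
        "query": {"eo:cloud_cover": {"lt": max_cloud}},
        "limit": limit,
    }
```

A caller could loop over `FALLBACK_COLLECTIONS` and stop at the first collection that returns items.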
## Worker Pipeline Stages
`fetch_stac → build_features → load_dw → infer → smooth → export_cog → upload → done`
When real DEA STAC data is unavailable, the worker falls back to synthetic features (seeded by year + coords) so the pipeline can still be tested end to end.
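One way to get reproducible synthetic features is to derive the random seed from the request itself. The hashing scheme and the `synthetic_features` helper below are assumptions. The worker's actual seeding logic is not shown here.

```python
import hashlib
import random


def synthetic_features(year, lon, lat, n_features=51):
    """Return deterministic pseudo-random features for a (year, lon, lat) request."""
    key = f"{year}:{lon:.5f}:{lat:.5f}".encode()
    # Use the first 8 bytes of the digest as an integer seed.
    seed = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_features)]
```

Repeating the same request yields the same vector, which keeps end-to-end test runs comparable.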
## Label Classes (V1 — temporary)
35 classes including Maize, Tobacco, Soyabean, etc. — defined as `CLASSES_V1` in `apps/worker/worker.py`. Extract dynamically from `model.classes_` when available; fall back to this list only if not present.
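The lookup-with-fallback described above might be written like this. `resolve_class_names` is a hypothetical helper, and `fallback` stands in for `CLASSES_V1`.

```python
def resolve_class_names(model, fallback):
    """Prefer the classes stored on the model; otherwise use the static list."""
    classes = getattr(model, "classes_", None)
    if classes is None or len(classes) == 0:
        return list(fallback)
    return list(classes)
```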
## Training Artifacts
`train.py --variant Raw` produces `artifacts/model_raw/`:
- `model.joblib` — VotingClassifier (soft) over RF + XGBoost + LightGBM + CatBoost
- `label_encoder.joblib` — sklearn LabelEncoder (maps string class → int)
- `selected_features.json` — feature subset chosen by scout RF (subset of FEATURE_ORDER_V1)
- `meta.json` — class names, n_features, config snapshot
- `metrics.json` — per-model accuracy/F1/classification report
`--variant Scaled` also emits `scaler.joblib`. Models uploaded to MinIO via `--upload-minio` go under `geocrop-models` at the ROOT (no subfolders).
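A quick sanity check that a training run produced every expected file could look like the sketch below. The `missing_artifacts` helper is hypothetical. The filenames come from the list above.

```python
from pathlib import Path

BASE_ARTIFACTS = (
    "model.joblib",
    "label_encoder.joblib",
    "selected_features.json",
    "meta.json",
    "metrics.json",
)


def missing_artifacts(artifact_dir, variant="Raw"):
    """Return the expected artifact filenames that are absent from `artifact_dir`."""
    expected = list(BASE_ARTIFACTS)
    if variant == "Scaled":
        expected.append("scaler.joblib")  # only the Scaled variant emits a scaler
    root = Path(artifact_dir)
    return [name for name in expected if not (root / name).is_file()]
```

An empty list means the directory is complete for that variant.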
## Plans & Docs
`plan/` contains detailed step-by-step implementation plans (01-05) and an SRS. Read these before making significant architectural changes. `ops/` contains MinIO upload scripts and storage setup docs.