# AGENTS.md - GeoCrop Intelligence & Patterns
This file provides foundational guidance for AI agents working within this repository. Adhere to these patterns to maintain system integrity.
## 🛠️ Project Stack
- **Frontend**: React 19 + TypeScript + Vite + OpenLayers (Leaflet fallback).
- **API**: FastAPI + Redis + RQ Job Queue.
- **Worker**: Python 3.11, rasterio, scikit-learn, XGBoost, LightGBM, CatBoost.
- **GitOps**: Gitea (Source) + Gitea Actions (CI) + ArgoCD (CD).
- **Storage**: MinIO (S3-compatible) + PostGIS (Metadata).
- **Observability**: MLflow (Experiments) + JupyterLab (Research).
## 🚀 Build & Dev Commands
### Frontend
```bash
cd apps/web
npm install
npm run dev # dev server on :5173
npm run lint # eslint
npm run build # runs tsc -b && vite build (typecheck before build)
```
### API
```bash
cd apps/api
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
```
### Worker
```bash
cd apps/worker
python worker.py --test # syntax/import check (default when no --worker flag)
python worker.py --worker # start RQ worker listening on geocrop_tasks queue
```
### Docker (Local Build)
`docker build -t frankchine/geocrop-web:latest apps/web/`
## 🧠 Critical Patterns (Non-Obvious)
### ⚠️ Build Order (Frontend)
`npm run lint` → `npm run build` (build includes `tsc -b`). Run lint before build to catch issues early.
### 🚫 Scoping Mandate
- **Kubernetes Only:** Focus exclusively on resources managed by Kubernetes. **NEVER** modify host-level Nginx, CloudPanel, or system services outside the cluster.
### 🗺️ Geospatial Conventions
- **AOI Format:** Always `(lon, lat, radius_m)`. (Longitude first!).
- **Season Window:** "Summer" = September 1 to May 31 of the following year.
- **Zimbabwe Bounds:** Lon 25.2 to 33.1, Lat -22.5 to -15.6.
- **Feature Order:** `FEATURE_ORDER_V1` (51 features) is strictly immutable.
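
The AOI ordering, bounds, and season window above are easy to get wrong, so a small guard can help. The following is a minimal sketch, not the repository's actual implementation: the names `AOI`, `validate_aoi`, and `summer_window` are hypothetical, and the positive-radius check is an assumption (the conventions above do not state a radius limit).

```python
from dataclasses import dataclass
from datetime import date

# Zimbabwe bounds as listed in the conventions above.
ZIM_LON = (25.2, 33.1)
ZIM_LAT = (-22.5, -15.6)


@dataclass(frozen=True)
class AOI:
    """Hypothetical container; field order mirrors (lon, lat, radius_m)."""
    lon: float
    lat: float
    radius_m: float


def validate_aoi(aoi: tuple[float, float, float]) -> AOI:
    """Unpack an AOI tuple as (lon, lat, radius_m) and check it against the bounds."""
    lon, lat, radius_m = aoi  # longitude first
    if not ZIM_LON[0] <= lon <= ZIM_LON[1]:
        raise ValueError(f"lon {lon} outside {ZIM_LON}; were lon and lat swapped?")
    if not ZIM_LAT[0] <= lat <= ZIM_LAT[1]:
        raise ValueError(f"lat {lat} outside {ZIM_LAT}")
    if radius_m <= 0:  # assumption: a radius must be positive
        raise ValueError("radius_m must be positive")
    return AOI(lon, lat, radius_m)


def summer_window(start_year: int) -> tuple[date, date]:
    """Return the "Summer" season: Sept 1 of start_year to May 31 of the next year."""
    return date(start_year, 9, 1), date(start_year + 1, 5, 31)
```

Because the two bound ranges do not overlap, a swapped `(lat, lon, radius_m)` tuple fails the longitude check immediately instead of silently selecting the wrong area.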
### 🔌 Connectivity
- **Redis Host:** `redis.geocrop.svc.cluster.local` (Port 6379).
- **MinIO Host:** `minio.geocrop.svc.cluster.local` (Port 9000).
- **Queue Name:** `geocrop_tasks`.
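
As an illustration of how these values fit together, here is a minimal sketch that assumes the `redis` and `rq` packages are installed. The helper names `redis_url` and `get_queue` are hypothetical, and the database index `0` is an assumption not stated above.

```python
# In-cluster service endpoints from the list above.
REDIS_HOST = "redis.geocrop.svc.cluster.local"
REDIS_PORT = 6379
MINIO_ENDPOINT = "minio.geocrop.svc.cluster.local:9000"
QUEUE_NAME = "geocrop_tasks"


def redis_url(db: int = 0) -> str:
    """Build a Redis URL; the db index defaults to 0 (an assumption)."""
    return f"redis://{REDIS_HOST}:{REDIS_PORT}/{db}"


def get_queue():
    """Return an RQ queue bound to the in-cluster Redis service."""
    # Imported lazily so the constants can be used without these packages.
    from redis import Redis
    from rq import Queue

    return Queue(QUEUE_NAME, connection=Redis(host=REDIS_HOST, port=REDIS_PORT))
```

These hostnames are Kubernetes service DNS names, so they resolve only from pods inside the cluster.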
### 📦 Storage Layout (MinIO)
- `geocrop-models/`: Serialized ML models (`.pkl`) and MLflow artifacts.
- `geocrop-baselines/`: Dynamic World COGs (`dw/zim/summer/...`).
- `geocrop-results/`: Output COGs (`results/<job_id>/...`).
- `geocrop-datasets/`: Training CSVs.
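
To keep bucket names and key prefixes consistent across services, they could be centralized in one place. The sketch below is illustrative only: `results_prefix` and `s3_uri` are hypothetical helpers, and the layout below `results/<job_id>/` is left open because it is not specified above.

```python
# Bucket names from the storage layout above.
BUCKET_MODELS = "geocrop-models"
BUCKET_BASELINES = "geocrop-baselines"
BUCKET_RESULTS = "geocrop-results"
BUCKET_DATASETS = "geocrop-datasets"


def results_prefix(job_id: str) -> str:
    """Return the key prefix for one job's outputs: results/<job_id>/."""
    if not job_id or "/" in job_id:
        raise ValueError("job_id must be a non-empty string without '/'")
    return f"results/{job_id}/"


def s3_uri(bucket: str, key: str) -> str:
    """Format an s3:// URI for logging or for tools that accept S3 URIs."""
    return f"s3://{bucket}/{key}"
```

For example, `s3_uri(BUCKET_RESULTS, results_prefix("abc123"))` yields `s3://geocrop-results/results/abc123/`.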
### 🗂️ Repo Structure
- `apps/web/`: React 19 + TypeScript + Vite + OpenLayers frontend.
- `apps/api/`: FastAPI backend (auth/JWT, job queue via RQ).
- `apps/worker/`: Python 3.11 worker. Entry: `worker.py` → `run_job()` → orchestrates STAC fetch → features → inference → COG export.
- `training/`: Jupyter-based training scripts (`MinIOStorageClient` for data access).
- `k8s/base/`: Kustomize manifests (ArgoCD target).
### 🚢 GitOps Workflow & Policies (MANDATORY)
- **CI**: Build images using **Kaniko** via `.gitea/workflows/build-push.yaml`.
- **Tagging**: CI uses Git SHA for deterministic image tagging.
- **CD**: CI updates `k8s/base/kustomization.yaml` with the new tag; ArgoCD auto-syncs.
- **Strict GitOps**:
  - ALL changes MUST be pushed to Gitea.
  - Deployments occur ONLY through the CI/CD pipeline via ArgoCD.
  - Direct manual modifications to K8s resources or running containers are FORBIDDEN.
  - No bypassing the GitOps flow.
- **Secrets**: Managed via Kubernetes Secrets (e.g., `geocrop-secrets`, `geocrop-db-secret`).