docs: update system documentation to reflect current MLOps/GitOps infrastructure
parent 8fd6c8d4e5
commit dba7d2bf99
777
AGENTS.md
|
|
@@ -1,714 +1,77 @@
|
|||
# AGENTS.md
|
||||
# AGENTS.md - GeoCrop Intelligence & Patterns
|
||||
|
||||
This file provides guidance to agents when working with code in this repository.
|
||||
This file provides foundational guidance for AI agents working within this repository. Adhere to these patterns to maintain system integrity.
|
||||
|
||||
## Project Stack
|
||||
- **API**: FastAPI + Redis + RQ job queue
|
||||
- **Worker**: Python 3.11, rasterio, scikit-learn, XGBoost, LightGBM, CatBoost
|
||||
- **Storage**: MinIO (S3-compatible) with signed URLs
|
||||
- **K8s**: Namespace `geocrop`, ingress class `nginx`, ClusterIssuer `letsencrypt-prod`
|
||||
## 🛠️ Project Stack
|
||||
- **Frontend**: React 19 + TypeScript + Vite + OpenLayers (Leaflet fallback).
|
||||
- **API**: FastAPI + Redis + RQ Job Queue.
|
||||
- **Worker**: Python 3.11, rasterio, scikit-learn, XGBoost, LightGBM, CatBoost.
|
||||
- **GitOps**: Gitea (Source) + Gitea Actions (CI) + ArgoCD (CD).
|
||||
- **Storage**: MinIO (S3-compatible) + PostGIS (Metadata).
|
||||
- **Observability**: MLflow (Experiments) + JupyterLab (Research).
|
||||
|
||||
## Build Commands
|
||||
## 🚀 Build & Dev Commands
|
||||
|
||||
### Frontend
|
||||
`cd apps/web && npm install && npm run dev`
|
||||
|
||||
### API
|
||||
```bash
|
||||
cd apps/api && pip install -r requirements.txt && uvicorn main:app --host 0.0.0.0 --port 8000
|
||||
```
|
||||
`cd apps/api && uvicorn main:app --host 0.0.0.0 --port 8000 --reload`
|
||||
|
||||
### Worker
|
||||
```bash
|
||||
cd apps/worker && pip install -r requirements.txt && python worker.py
|
||||
```
|
||||
|
||||
### Training
|
||||
```bash
|
||||
cd training && python train.py --data /path/to/data.csv --out ./artifacts --variant Scaled
|
||||
```
|
||||
|
||||
### Docker Build
|
||||
```bash
|
||||
docker build -t frankchine/geocrop-api:v1 apps/api/
|
||||
docker build -t frankchine/geocrop-worker:v1 apps/worker/
|
||||
```
|
||||
|
||||
## Critical Non-Obvious Patterns
|
||||
|
||||
### Season Window (Sept → May, NOT Nov-Apr)
|
||||
[`apps/worker/config.py:135-141`](apps/worker/config.py:135) - Use `InferenceConfig.season_dates(year, "summer")`, which returns Sept 1 of `year` to May 31 of the following year.
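The window above can be sketched as a standalone helper. This is a minimal illustration, not the repository's `InferenceConfig.season_dates`; the name `season_window` is hypothetical, and only the "summer" season described here is handled.

```python
from datetime import date


def season_window(year: int, season: str = "summer") -> tuple[date, date]:
    """Return (start, end) dates for the season that begins in `year`.

    Hypothetical stand-in for InferenceConfig.season_dates.
    """
    if season != "summer":
        raise ValueError(f"unsupported season: {season!r}")
    # Summer runs from 1 September of `year` to 31 May of the following year.
    return date(year, 9, 1), date(year + 1, 5, 31)


start, end = season_window(2022)
print(start.isoformat(), end.isoformat())
```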
|
||||
|
||||
### AOI Tuple Format (lon, lat, radius_m)
|
||||
[`apps/worker/features.py:80`](apps/worker/features.py:80) - AOI is `(lon, lat, radius_m)` NOT `(lat, lon, radius)`.
|
||||
|
||||
### Redis Service Name
|
||||
[`apps/api/main.py:18`](apps/api/main.py:18) - Use `redis.geocrop.svc.cluster.local` (Kubernetes DNS), NOT `localhost`.
|
||||
|
||||
### RQ Queue Name
|
||||
[`apps/api/main.py:20`](apps/api/main.py:20) - Queue name is `geocrop_tasks`.
|
||||
|
||||
### Job Timeout
|
||||
[`apps/api/main.py:96`](apps/api/main.py:96) - Job timeout is 25 minutes (`job_timeout='25m'`).
|
||||
|
||||
### Max Radius
|
||||
[`apps/api/main.py:90`](apps/api/main.py:90) - Radius cannot exceed 5.0 km.
|
||||
|
||||
### Zimbabwe Bounds (rough bbox)
|
||||
[`apps/worker/features.py:97-98`](apps/worker/features.py:97) - Lon: 25.2 to 33.1, Lat: -22.5 to -15.6.
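The AOI ordering, radius limit, and bounding box above can be checked together. The sketch below assumes a hypothetical `validate_aoi` helper and takes the radius in metres (5.0 km = 5000 m); it is not the validation code in `apps/api/main.py` or `apps/worker/features.py`.

```python
# Rough Zimbabwe bounding box and radius limit, as documented above.
ZIM_LON = (25.2, 33.1)
ZIM_LAT = (-22.5, -15.6)
MAX_RADIUS_M = 5000.0


def validate_aoi(aoi):
    """Validate an AOI given as (lon, lat, radius_m) and return it as floats."""
    lon, lat, radius_m = (float(v) for v in aoi)
    if not ZIM_LON[0] <= lon <= ZIM_LON[1]:
        # A common cause is passing (lat, lon, ...) instead of (lon, lat, ...).
        raise ValueError(f"lon {lon} outside {ZIM_LON}; is the tuple (lon, lat, radius_m)?")
    if not ZIM_LAT[0] <= lat <= ZIM_LAT[1]:
        raise ValueError(f"lat {lat} outside {ZIM_LAT}")
    if not 0 < radius_m <= MAX_RADIUS_M:
        raise ValueError(f"radius_m {radius_m} must be in (0, {MAX_RADIUS_M}]")
    return lon, lat, radius_m
```

With Harare-like coordinates, `validate_aoi((31.0, -17.8, 2000))` passes, while the swapped tuple `(-17.8, 31.0, 2000)` fails the longitude check.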
|
||||
|
||||
### Model Artifacts Expected
|
||||
[`apps/worker/inference.py:66-70`](apps/worker/inference.py:66) - `model.joblib`, `label_encoder.joblib`, `scaler.joblib` (optional), `selected_features.json`.
|
||||
|
||||
### DEA STAC Endpoint
|
||||
[`apps/worker/config.py:147-148`](apps/worker/config.py:147) - Use `https://explorer.digitalearth.africa/stac/search`.
|
||||
|
||||
### Feature Names
|
||||
[`apps/worker/features.py:221`](apps/worker/features.py:221) - Currently: `["ndvi_peak", "evi_peak", "savi_peak"]`.
|
||||
|
||||
### Majority Filter Kernel
|
||||
[`apps/worker/features.py:254`](apps/worker/features.py:254) - Must be odd (3, 5, 7).
|
||||
|
||||
### DW Baseline Filename Format
|
||||
[`Plan/srs.md:173`](Plan/srs.md:173) - `DW_Zim_HighestConf_YYYY_YYYY.tif`
|
||||
|
||||
### MinIO Buckets
|
||||
- `geocrop-models` - trained ML models
|
||||
- `geocrop-results` - output COGs
|
||||
- `geocrop-baselines` - DW baseline COGs
|
||||
- `geocrop-datasets` - training datasets
|
||||
|
||||
## Current Kubernetes Cluster State (as of 2026-02-27)
|
||||
|
||||
### Namespaces
|
||||
- `geocrop` - Main application namespace
|
||||
- `cert-manager` - Certificate management
|
||||
- `ingress-nginx` - Ingress controller
|
||||
- `kubernetes-dashboard` - Dashboard
|
||||
|
||||
### Deployments (geocrop namespace)
|
||||
| Deployment | Image | Status | Age |
|
||||
|------------|-------|--------|-----|
|
||||
| geocrop-api | frankchine/geocrop-api:v3 | Running (1/1) | 159m |
|
||||
| geocrop-worker | frankchine/geocrop-worker:v2 | Running (1/1) | 86m |
|
||||
| redis | redis:alpine | Running (1/1) | 25h |
|
||||
| minio | minio/minio | Running (1/1) | 25h |
|
||||
| hello-web | nginx | Running (1/1) | 25h |
|
||||
|
||||
### Services (geocrop namespace)
|
||||
| Service | Type | Cluster IP | Ports |
|
||||
|---------|------|------------|-------|
|
||||
| geocrop-api | ClusterIP | 10.43.7.69 | 8000/TCP |
|
||||
| geocrop-web | ClusterIP | 10.43.101.43 | 80/TCP |
|
||||
| redis | ClusterIP | 10.43.15.14 | 6379/TCP |
|
||||
| minio | ClusterIP | 10.43.71.8 | 9000/TCP, 9001/TCP |
|
||||
|
||||
### Ingress (geocrop namespace)
|
||||
| Ingress | Hosts | TLS | Backend |
|
||||
|---------|-------|-----|---------|
|
||||
| geocrop-web-api | portfolio.techarvest.co.zw, api.portfolio.techarvest.co.zw | geocrop-web-api-tls | geocrop-web:80, geocrop-api:8000 |
|
||||
| geocrop-minio | minio.portfolio.techarvest.co.zw, console.minio.portfolio.techarvest.co.zw | minio-api-tls, minio-console-tls | minio:9000, minio:9001 |
|
||||
|
||||
### Storage
|
||||
- MinIO PVC: 30Gi (local-path storage class), bound to pvc-44bf8a0f-cbc9-4336-aa54-edf1c4d0be86
|
||||
|
||||
### TLS Certificates
|
||||
- ClusterIssuer: letsencrypt-prod (cert-manager)
|
||||
- All TLS certificates are managed by cert-manager with automatic renewal
|
||||
|
||||
---
|
||||
|
||||
## STEP 0: Alignment Notes (Worker Implementation)
|
||||
|
||||
### Current Mock Behavior (apps/worker/*)
|
||||
|
||||
| File | Current State | Gap |
|
||||
|------|--------------|-----|
|
||||
| `features.py` | [`build_feature_stack_from_dea()`](apps/worker/features.py:193) returns placeholder zeros | **CRITICAL** - Need full DEA STAC loading + feature engineering |
|
||||
| `inference.py` | Model loading with expected bundle format | Need to adapt to ROOT bucket format |
|
||||
| `config.py` | [`MinIOStorage`](apps/worker/config.py:130) class exists | May need refinement for ROOT bucket access |
|
||||
| `worker.py` | Mock handler returning fake results | Need full staged pipeline |
|
||||
|
||||
### Training Pipeline Expectations (plan/original_training.py)
|
||||
|
||||
#### Feature Engineering (must match exactly):
|
||||
1. **Smoothing**: [`apply_smoothing()`](plan/original_training.py:69) - Savitzky-Golay (window=5, polyorder=2) + linear interpolation of zeros
|
||||
2. **Phenology**: [`extract_phenology()`](plan/original_training.py:101) - max, min, mean, std, amplitude, auc, peak_timestep, max_slope_up, max_slope_down
|
||||
3. **Harmonics**: [`add_harmonics()`](plan/original_training.py:141) - harmonic1_sin/cos, harmonic2_sin/cos
|
||||
4. **Windows**: [`add_interactions_and_windows()`](plan/original_training.py:177) - early/peak/late windows, interactions
|
||||
|
||||
#### Indices Computed:
|
||||
- ndvi, ndre, evi, savi, ci_re, ndwi
|
||||
|
||||
#### Junk Columns Dropped:
|
||||
```python
|
||||
['.geo', 'system:index', 'latitude', 'longitude', 'lat', 'lon', 'ID', 'parent_id', 'batch_id', 'is_syn']
|
||||
```
|
||||
|
||||
### Model Storage Convention (FINAL)
|
||||
|
||||
**Location**: ROOT of `geocrop-models` bucket (no subfolders)
|
||||
|
||||
**Exact Object Names**:
|
||||
```
|
||||
geocrop-models/
|
||||
├── Zimbabwe_XGBoost_Raw_Model.pkl
|
||||
├── Zimbabwe_XGBoost_Model.pkl
|
||||
├── Zimbabwe_RandomForest_Raw_Model.pkl
|
||||
├── Zimbabwe_RandomForest_Model.pkl
|
||||
├── Zimbabwe_LightGBM_Raw_Model.pkl
|
||||
├── Zimbabwe_LightGBM_Model.pkl
|
||||
├── Zimbabwe_Ensemble_Raw_Model.pkl
|
||||
└── Zimbabwe_CatBoost_Raw_Model.pkl
|
||||
```
|
||||
|
||||
**Model Selection Logic**:
|
||||
| Job "model" value | MinIO filename | Scaler needed? |
|
||||
|-------------------|---------------|----------------|
|
||||
| "Ensemble" | Zimbabwe_Ensemble_Raw_Model.pkl | No |
|
||||
| "Ensemble_Raw" | Zimbabwe_Ensemble_Raw_Model.pkl | No |
|
||||
| "Ensemble_Scaled" | Zimbabwe_Ensemble_Model.pkl | Yes |
|
||||
| "RandomForest" | Zimbabwe_RandomForest_Model.pkl | Yes |
|
||||
| "XGBoost" | Zimbabwe_XGBoost_Model.pkl | Yes |
|
||||
| "LightGBM" | Zimbabwe_LightGBM_Model.pkl | Yes |
|
||||
| "CatBoost" | Zimbabwe_CatBoost_Raw_Model.pkl | No |
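The table above can be expressed as a lookup. This is a sketch under the assumption that the scaler requirement follows from the filename (files containing `_Raw_` take unscaled features); `select_model` is a hypothetical name, not a function from the worker.

```python
# Job "model" value -> object name in the geocrop-models bucket (from the table above).
MODEL_FILES = {
    "Ensemble": "Zimbabwe_Ensemble_Raw_Model.pkl",
    "Ensemble_Raw": "Zimbabwe_Ensemble_Raw_Model.pkl",
    "Ensemble_Scaled": "Zimbabwe_Ensemble_Model.pkl",
    "RandomForest": "Zimbabwe_RandomForest_Model.pkl",
    "XGBoost": "Zimbabwe_XGBoost_Model.pkl",
    "LightGBM": "Zimbabwe_LightGBM_Model.pkl",
    "CatBoost": "Zimbabwe_CatBoost_Raw_Model.pkl",
}


def select_model(model: str) -> tuple[str, bool]:
    """Return (filename, needs_scaler) for a job's "model" value."""
    try:
        filename = MODEL_FILES[model]
    except KeyError:
        raise ValueError(f"unknown model {model!r}; expected one of {sorted(MODEL_FILES)}") from None
    # Assumption: "Raw" models were trained on unscaled features.
    needs_scaler = "_Raw_" not in filename
    return filename, needs_scaler
```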
|
||||
|
||||
**Label Encoder Handling**:
|
||||
- No separate `label_encoder.joblib` file exists
|
||||
- Labels encoded in model via `model.classes_` attribute
|
||||
- Default classes (if not available): `["cropland_rainfed", "cropland_irrigated", "tree_crop", "grassland", "shrubland", "urban", "water", "bare"]`
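A minimal sketch of the label handling described above, assuming the loaded model is any object that may expose a scikit-learn style `classes_` attribute. `resolve_class_names` is a hypothetical helper name.

```python
DEFAULT_CLASSES = [
    "cropland_rainfed", "cropland_irrigated", "tree_crop", "grassland",
    "shrubland", "urban", "water", "bare",
]


def resolve_class_names(model) -> list[str]:
    """Prefer the classes stored on the model; fall back to the defaults."""
    classes = getattr(model, "classes_", None)
    if classes is None or len(classes) == 0:
        return list(DEFAULT_CLASSES)
    return [str(c) for c in classes]
```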
|
||||
|
||||
### DEA STAC Configuration
|
||||
|
||||
| Setting | Value |
|
||||
|---------|-------|
|
||||
| STAC Root | `https://explorer.digitalearth.africa/stac` |
|
||||
| STAC Search | `https://explorer.digitalearth.africa/stac/search` |
|
||||
| Primary Collection | `s2_l2a` (Sentinel-2 L2A) |
|
||||
| Required Bands | red, green, blue, nir, nir08 (red-edge), swir16, swir22 |
|
||||
| Cloud Filter | eo:cloud_cover < 30% |
|
||||
| Season Window | Sep 1 → May 31 (year → year+1) |
|
||||
|
||||
### Dynamic World Baseline Layout
|
||||
|
||||
**Bucket**: `geocrop-baselines`
|
||||
|
||||
**Path Pattern**: `dw/zim/summer/<season>/<type>/DW_Zim_<Type>_<year>_<year+1>.tif`
|
||||
|
||||
**Tile Format**: COGs with 65536x65536 pixel tiles
|
||||
- Example: `DW_Zim_HighestConf_2021_2022-0000000000-0000000000.tif`
|
||||
|
||||
### Results Layout
|
||||
|
||||
**Bucket**: `geocrop-results`
|
||||
|
||||
**Path Pattern**: `results/<job_id>/<filename>`
|
||||
|
||||
**Output Files**:
|
||||
- `refined.tif` - Main classification result
|
||||
- `dw_baseline.tif` - Clipped DW baseline (if requested)
|
||||
- `truecolor.tif` - RGB composite (if requested)
|
||||
- `ndvi_peak.tif`, `evi_peak.tif`, `savi_peak.tif` - Index peaks (if requested)
|
||||
|
||||
### Job Payload Schema
|
||||
|
||||
```json
|
||||
{
|
||||
"job_id": "uuid",
|
||||
"user_id": "uuid",
|
||||
"lat": -17.8,
|
||||
"lon": 31.0,
|
||||
"radius_m": 2000,
|
||||
"year": 2022,
|
||||
"season": "summer",
|
||||
"model": "Ensemble",
|
||||
"smoothing_kernel": 5,
|
||||
"outputs": {
|
||||
"refined": true,
|
||||
"dw_baseline": false,
|
||||
"true_color": false,
|
||||
"indices": []
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Required Fields**: `job_id`, `lat`, `lon`, `radius_m`, `year`
|
||||
|
||||
**Defaults**:
|
||||
- `season`: "summer"
|
||||
- `model`: "Ensemble"
|
||||
- `smoothing_kernel`: 5
|
||||
- `outputs.refined`: true
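The required fields and defaults above could be applied as follows. This is a sketch only; `normalize_payload` is a hypothetical helper, and the real worker may validate payloads differently.

```python
import copy

REQUIRED_FIELDS = ("job_id", "lat", "lon", "radius_m", "year")
DEFAULTS = {
    "season": "summer",
    "model": "Ensemble",
    "smoothing_kernel": 5,
    "outputs": {"refined": True},
}


def normalize_payload(payload: dict) -> dict:
    """Check required fields and fill in the documented defaults."""
    missing = [key for key in REQUIRED_FIELDS if key not in payload]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    job = copy.deepcopy(DEFAULTS)
    for key, value in payload.items():
        if key == "outputs":
            # Merge so that unspecified outputs keep their defaults.
            job["outputs"].update(value)
        else:
            job[key] = value
    return job
```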
|
||||
|
||||
### Pipeline Stages
|
||||
|
||||
| Stage | Description |
|
||||
|-------|-------------|
|
||||
| `fetch_stac` | Query DEA STAC for Sentinel-2 scenes |
|
||||
| `build_features` | Load bands, compute indices, apply feature engineering |
|
||||
| `load_dw` | Load and clip Dynamic World baseline |
|
||||
| `infer` | Run ML model inference |
|
||||
| `smooth` | Apply majority filter post-processing |
|
||||
| `export_cog` | Write GeoTIFF as COG |
|
||||
| `upload` | Upload to MinIO |
|
||||
| `done` | Complete |
|
||||
|
||||
### Environment Variables
|
||||
|
||||
| Variable | Default | Description |
|
||||
|----------|---------|-------------|
|
||||
| `REDIS_HOST` | `redis.geocrop.svc.cluster.local` | Redis service |
|
||||
| `MINIO_ENDPOINT` | `minio.geocrop.svc.cluster.local:9000` | MinIO service |
|
||||
| `MINIO_ACCESS_KEY` | `minioadmin` | MinIO access key |
|
||||
| `MINIO_SECRET_KEY` | `minioadmin` | MinIO secret key |
|
||||
| `MINIO_SECURE` | `false` | Use HTTPS for MinIO |
|
||||
| `GEOCROP_CACHE_DIR` | `/tmp/geocrop-cache` | Local cache directory |
|
||||
|
||||
### Assumptions / TODOs
|
||||
|
||||
1. **EPSG**: Default to UTM Zone 36S (EPSG:32736) for Zimbabwe - compute dynamically from AOI center in production
|
||||
2. **Feature Names**: Training uses selected features from LightGBM importance - may vary per model
|
||||
3. **Label Encoder**: No separate file - extract from model or use defaults
|
||||
4. **Scaler**: Only for non-Raw models; Raw models use unscaled features
|
||||
5. **DW Tiles**: Must handle 2x2 tile mosaicking for full AOI coverage
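For item 1, the UTM zone can be derived from the AOI centre with the standard 6-degree zone formula. The sketch below uses a hypothetical `utm_epsg` helper and ignores the special-case zones around Norway and Svalbard, which do not affect Zimbabwe.

```python
import math


def utm_epsg(lon: float, lat: float) -> int:
    """Return the WGS 84 / UTM EPSG code for a point (lon, lat in degrees)."""
    zone = int(math.floor((lon + 180.0) / 6.0)) + 1
    zone = min(max(zone, 1), 60)  # clamp lon = 180 into zone 60
    # EPSG 326xx is the northern hemisphere, 327xx the southern.
    return (32600 if lat >= 0 else 32700) + zone


print(utm_epsg(31.0, -17.8))  # eastern Zimbabwe
print(utm_epsg(28.0, -20.0))  # western Zimbabwe falls in a different zone
```

Because the country straddles the 30°E zone boundary, a fixed EPSG:32736 default is only appropriate for AOIs east of that meridian.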
|
||||
|
||||
---
|
||||
|
||||
## Worker Contracts (STEP 1)
|
||||
|
||||
### Job Payload Contract
|
||||
|
||||
```python
|
||||
# Minimal required fields:
|
||||
{
|
||||
"job_id": "uuid",
|
||||
"lat": -17.8,
|
||||
"lon": 31.0,
|
||||
"radius_m": 2000, # max 5000m
|
||||
"year": 2022 # 2015-current
|
||||
}
|
||||
|
||||
# Full with all options:
|
||||
{
|
||||
"job_id": "uuid",
|
||||
"user_id": "uuid", # optional
|
||||
"lat": -17.8,
|
||||
"lon": 31.0,
|
||||
"radius_m": 2000,
|
||||
"year": 2022,
|
||||
"season": "summer", # default
|
||||
"model": "Ensemble", # or RandomForest, XGBoost, LightGBM, CatBoost
|
||||
"smoothing_kernel": 5, # 3, 5, or 7
|
||||
"outputs": {
|
||||
"refined": True,
|
||||
"dw_baseline": True,
|
||||
"true_color": True,
|
||||
"indices": ["ndvi_peak", "evi_peak", "savi_peak"]
|
||||
},
|
||||
"stac": {
|
||||
"cloud_cover_lt": 20,
|
||||
"max_items": 60
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Worker Stages
|
||||
|
||||
```
|
||||
fetch_stac → build_features → load_dw → infer → smooth → export_cog → upload → done
|
||||
```
|
||||
|
||||
### Default Class List (TEMPORARY V1)
|
||||
|
||||
Until class extraction is fully dynamic, use these classes (order matters if the model doesn't provide classes):
|
||||
|
||||
```python
|
||||
CLASSES_V1 = [
|
||||
"Avocado","Banana","Bare Surface","Blueberry","Built-Up","Cabbage","Chilli","Citrus","Cotton","Cowpea",
|
||||
"Finger Millet","Forest","Grassland","Groundnut","Macadamia","Maize","Pasture Legume","Pearl Millet",
|
||||
"Peas","Potato","Roundnut","Sesame","Shrubland","Sorghum","Soyabean","Sugarbean","Sugarcane","Sunflower",
|
||||
"Sunhem","Sweet Potato","Tea","Tobacco","Tomato","Water","Woodland"
|
||||
]
|
||||
```
|
||||
|
||||
Note: This is TEMPORARY - later we will extract class names dynamically from the trained model.
|
||||
|
||||
---
|
||||
|
||||
## STEP 2: Storage Adapter (MinIO)
|
||||
|
||||
### Environment Variables
|
||||
|
||||
| Variable | Default | Description |
|
||||
|----------|---------|-------------|
|
||||
| `MINIO_ENDPOINT` | `minio.geocrop.svc.cluster.local:9000` | MinIO service |
|
||||
| `MINIO_ACCESS_KEY` | `minioadmin` | MinIO access key |
|
||||
| `MINIO_SECRET_KEY` | `minioadmin123` | MinIO secret key |
|
||||
| `MINIO_SECURE` | `false` | Use HTTPS for MinIO |
|
||||
| `MINIO_REGION` | `us-east-1` | AWS region |
|
||||
| `MINIO_BUCKET_MODELS` | `geocrop-models` | Models bucket |
|
||||
| `MINIO_BUCKET_BASELINES` | `geocrop-baselines` | Baselines bucket |
|
||||
| `MINIO_BUCKET_RESULTS` | `geocrop-results` | Results bucket |
|
||||
|
||||
### Bucket/Key Conventions
|
||||
|
||||
- **Models**: ROOT of `geocrop-models` (no subfolders)
|
||||
- **DW Baselines**: `geocrop-baselines/dw/zim/summer/<season>/<type>/DW_Zim_<Type>_<year>_<year+1>.tif`
|
||||
- **Results**: `geocrop-results/results/<job_id>/<filename>`
|
||||
|
||||
### Model Filename Mapping
|
||||
|
||||
| Job model value | Primary filename | Fallback |
|
||||
|-----------------|-----------------|----------|
|
||||
| "Ensemble" | Zimbabwe_Ensemble_Model.pkl | Zimbabwe_Ensemble_Raw_Model.pkl |
|
||||
| "RandomForest" | Zimbabwe_RandomForest_Model.pkl | Zimbabwe_RandomForest_Raw_Model.pkl |
|
||||
| "XGBoost" | Zimbabwe_XGBoost_Model.pkl | Zimbabwe_XGBoost_Raw_Model.pkl |
|
||||
| "LightGBM" | Zimbabwe_LightGBM_Model.pkl | Zimbabwe_LightGBM_Raw_Model.pkl |
|
||||
| "CatBoost" | Zimbabwe_CatBoost_Model.pkl | Zimbabwe_CatBoost_Raw_Model.pkl |
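The primary/fallback resolution in the table above might look like the following sketch. `resolve_model_key` is a hypothetical name, and `existing_keys` stands in for the result of a bucket listing (for example, what `list_objects` would return).

```python
KNOWN_MODELS = ("Ensemble", "RandomForest", "XGBoost", "LightGBM", "CatBoost")


def model_candidates(model_name: str) -> list[str]:
    """Return [primary, fallback] object names for a job model value."""
    if model_name not in KNOWN_MODELS:
        raise ValueError(f"unknown model {model_name!r}")
    return [
        f"Zimbabwe_{model_name}_Model.pkl",      # primary (scaled variant)
        f"Zimbabwe_{model_name}_Raw_Model.pkl",  # fallback (raw variant)
    ]


def resolve_model_key(model_name: str, existing_keys) -> str:
    """Pick the first candidate that exists in the bucket listing."""
    available = set(existing_keys)
    for key in model_candidates(model_name):
        if key in available:
            return key
    raise FileNotFoundError(f"no model object found for {model_name!r}")
```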
|
||||
|
||||
### Methods
|
||||
|
||||
- `ping()` → `(bool, str)`: Check MinIO connectivity
|
||||
- `head_object(bucket, key)` → `dict|None`: Get object metadata
|
||||
- `list_objects(bucket, prefix)` → `list[str]`: List object keys
|
||||
- `download_file(bucket, key, dest_path)` → `Path`: Download file
|
||||
- `download_model_file(model_name, dest_dir)` → `Path`: Download model with fallback
|
||||
- `upload_file(bucket, key, local_path)` → `str`: Upload file, returns s3:// URI
|
||||
- `upload_result(job_id, local_path, filename)` → `(s3_uri, key)`: Upload result
|
||||
- `presign_get(bucket, key, expires)` → `str`: Generate presigned URL
|
||||
|
||||
---
|
||||
|
||||
## STEP 3: STAC Client (DEA)
|
||||
|
||||
### Environment Variables
|
||||
|
||||
| Variable | Default | Description |
|
||||
|----------|---------|-------------|
|
||||
| `DEA_STAC_ROOT` | `https://explorer.digitalearth.africa/stac` | STAC root URL |
|
||||
| `DEA_STAC_SEARCH` | `https://explorer.digitalearth.africa/stac/search` | STAC search URL |
|
||||
| `DEA_CLOUD_MAX` | `30` | Cloud cover filter (percent) |
|
||||
| `DEA_TIMEOUT_S` | `30` | Request timeout (seconds) |
|
||||
|
||||
### Collection Resolution
|
||||
|
||||
Preferred Sentinel-2 collection IDs (in order):
|
||||
1. `s2_l2a`
|
||||
2. `s2_l2a_c1`
|
||||
3. `sentinel-2-l2a`
|
||||
4. `sentinel_2_l2a`
|
||||
|
||||
If none of these is found, a `ValueError` listing the available collections is raised.
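The preference order above can be sketched as a pure function over the list of collection IDs. `pick_s2_collection` is a hypothetical name; the real client method may be structured differently.

```python
PREFERRED_S2_COLLECTIONS = ["s2_l2a", "s2_l2a_c1", "sentinel-2-l2a", "sentinel_2_l2a"]


def pick_s2_collection(available: list[str]) -> str:
    """Return the first preferred Sentinel-2 collection present in `available`."""
    present = set(available)
    for collection_id in PREFERRED_S2_COLLECTIONS:
        if collection_id in present:
            return collection_id
    raise ValueError(f"no Sentinel-2 L2A collection found; available: {sorted(present)}")
```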
|
||||
|
||||
### Methods
|
||||
|
||||
- `list_collections()` → `list[str]`: List available collections
|
||||
- `resolve_s2_collection()` → `str|None`: Resolve best S2 collection
|
||||
- `search_items(bbox, start_date, end_date)` → `list[pystac.Item]`: Search for items
|
||||
- `summarize_items(items)` → `dict`: Summarize search results without downloading
|
||||
|
||||
### summarize_items() Output Structure
|
||||
|
||||
```python
|
||||
{
|
||||
"count": int,
|
||||
"collection": str,
|
||||
"time_start": "ISO datetime",
|
||||
"time_end": "ISO datetime",
|
||||
"items": [
|
||||
{
|
||||
"id": str,
|
||||
"datetime": "ISO datetime",
|
||||
"bbox": [minx, miny, maxx, maxy],
|
||||
"cloud_cover": float|None,
|
||||
"assets": {
|
||||
"red": {"href": str, "type": str, "roles": list},
|
||||
...
|
||||
}
|
||||
}, ...
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
**Note**: stackstac loading is NOT implemented in this step. It will come in Step 4/5.
|
||||
|
||||
---
|
||||
|
||||
## STEP 4A: Feature Computation (Math)
|
||||
|
||||
### Features Produced
|
||||
|
||||
**Base indices (time-series):**
|
||||
- ndvi, ndre, evi, savi, ci_re, ndwi
|
||||
|
||||
**Smoothed time-series:**
|
||||
- For every index above, Savitzky-Golay smoothing (window=5, polyorder=2)
|
||||
- Suffix: *_smooth
|
||||
|
||||
**Phenology metrics (computed across time for NDVI, NDRE, EVI):**
|
||||
- _max, _min, _mean, _std, _amplitude, _auc, _peak_timestep, _max_slope_up, _max_slope_down
|
||||
|
||||
**Harmonic features (for NDVI only):**
|
||||
- ndvi_harmonic1_sin, ndvi_harmonic1_cos, ndvi_harmonic2_sin, ndvi_harmonic2_cos
|
||||
|
||||
**Interaction features:**
|
||||
- ndvi_ndre_peak_diff = ndvi_max - ndre_max
|
||||
- canopy_density_contrast = evi_mean / (ndvi_mean + 0.001)
|
||||
|
||||
### Smoothing Approach
|
||||
|
||||
1. **fill_zeros_linear**: Treats 0 as missing and linearly interpolates between the nearest non-zero neighbors
|
||||
2. **savgol_smooth_1d**: Uses scipy.signal.savgol_filter if available, falls back to simple moving average
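A dependency-free sketch of the zero-filling step is shown below. It is an illustration rather than the worker's implementation; in particular, holding the nearest known value at the series edges is an assumption.

```python
def fill_zeros_linear(y):
    """Treat zeros as missing and linearly interpolate between non-zero neighbours."""
    values = [float(v) for v in y]
    known = [i for i, v in enumerate(values) if v != 0.0]
    if not known:
        return values  # nothing to interpolate from
    out = values[:]
    for i, v in enumerate(values):
        if v != 0.0:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = values[right]   # leading gap: hold first known value (assumption)
        elif right is None:
            out[i] = values[left]    # trailing gap: hold last known value (assumption)
        else:
            weight = (i - left) / (right - left)
            out[i] = values[left] + weight * (values[right] - values[left])
    return out
```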
|
||||
|
||||
### Phenology Metrics Definitions
|
||||
|
||||
| Metric | Formula |
|
||||
|--------|---------|
|
||||
| max | np.max(y) |
|
||||
| min | np.min(y) |
|
||||
| mean | np.mean(y) |
|
||||
| std | np.std(y) |
|
||||
| amplitude | max - min |
|
||||
| auc | trapezoidal integral (dx=10 days) |
|
||||
| peak_timestep | argmax(y) |
|
||||
| max_slope_up | max(diff(y)) |
|
||||
| max_slope_down | min(diff(y)) |
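The definitions in the table can be written out without NumPy as a cross-check. This sketch assumes a regularly sampled series with a 10-day step and uses the population standard deviation, which is what `np.std` computes by default.

```python
import statistics


def phenology_metrics(y, dx: float = 10.0) -> dict:
    """Compute the nine phenology metrics for one index time series."""
    y = [float(v) for v in y]
    if len(y) < 2:
        raise ValueError("need at least two timesteps")
    diffs = [b - a for a, b in zip(y, y[1:])]
    y_max, y_min = max(y), min(y)
    return {
        "max": y_max,
        "min": y_min,
        "mean": statistics.fmean(y),
        "std": statistics.pstdev(y),
        "amplitude": y_max - y_min,
        # Trapezoidal rule with constant spacing dx (days).
        "auc": sum((a + b) / 2.0 * dx for a, b in zip(y, y[1:])),
        "peak_timestep": y.index(y_max),  # first occurrence, like argmax
        "max_slope_up": max(diffs),
        "max_slope_down": min(diffs),
    }
```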
|
||||
|
||||
### Harmonic Coefficient Definition
|
||||
|
||||
For normalized time t = 2*pi*k/N:
|
||||
- h1_sin = mean(y * sin(t))
|
||||
- h1_cos = mean(y * cos(t))
|
||||
- h2_sin = mean(y * sin(2t))
|
||||
- h2_cos = mean(y * cos(2t))
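A small standard-library sketch of these four coefficients follows. The function name and the returned key names are illustrative; `k` runs over the timestep indices 0..N-1.

```python
import math


def harmonic_coefficients(y) -> dict:
    """First- and second-order harmonic coefficients of a time series."""
    n = len(y)
    if n == 0:
        raise ValueError("empty series")
    t = [2.0 * math.pi * k / n for k in range(n)]  # normalized time

    def mean_product(fn, order):
        return sum(v * fn(order * a) for v, a in zip(y, t)) / n

    return {
        "harmonic1_sin": mean_product(math.sin, 1),
        "harmonic1_cos": mean_product(math.cos, 1),
        "harmonic2_sin": mean_product(math.sin, 2),
        "harmonic2_cos": mean_product(math.cos, 2),
    }
```

For a pure single-cycle sine wave, only `harmonic1_sin` is non-zero (it equals 0.5, the mean of sin²), which is a quick sanity check for an implementation.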
|
||||
|
||||
### Note
|
||||
Step 4B will add seasonal window summaries and final feature vector ordering.
|
||||
|
||||
---
|
||||
|
||||
## STEP 4B: Window Summaries + Feature Order
|
||||
|
||||
### Seasonal Window Features (18 features)
|
||||
|
||||
Season window is Oct–Jun, split into:
|
||||
- **Early**: Oct–Dec
|
||||
- **Peak**: Jan–Mar
|
||||
- **Late**: Apr–Jun
|
||||
|
||||
For each window, computed for NDVI, NDWI, NDRE:
|
||||
- `<index>_<window>_mean`
|
||||
- `<index>_<window>_max`
|
||||
|
||||
Total: 3 indices × 3 windows × 2 stats = **18 features**
|
||||
|
||||
### Feature Ordering (FEATURE_ORDER_V1)
|
||||
|
||||
51 scalar features in order:
|
||||
1. **Phenology metrics** (27): ndvi, ndre, evi (each with max, min, mean, std, amplitude, auc, peak_timestep, max_slope_up, max_slope_down)
|
||||
2. **Harmonics** (4): ndvi_harmonic1_sin/cos, ndvi_harmonic2_sin/cos
|
||||
3. **Interactions** (2): ndvi_ndre_peak_diff, canopy_density_contrast
|
||||
4. **Window summaries** (18): ndvi/ndwi/ndre × early/peak/late × mean/max
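The group sizes above can be checked by generating the names programmatically. The ordering within each group in this sketch is an assumption; the authoritative list is `FEATURE_ORDER_V1` in the worker code.

```python
PHENOLOGY_STATS = [
    "max", "min", "mean", "std", "amplitude", "auc",
    "peak_timestep", "max_slope_up", "max_slope_down",
]


def build_feature_order() -> list[str]:
    """Generate the 51 scalar feature names, group by group."""
    names = []
    for index in ("ndvi", "ndre", "evi"):                       # 3 x 9 = 27
        names += [f"{index}_{stat}" for stat in PHENOLOGY_STATS]
    names += [f"ndvi_harmonic{h}_{fn}" for h in (1, 2) for fn in ("sin", "cos")]  # 4
    names += ["ndvi_ndre_peak_diff", "canopy_density_contrast"]  # 2
    for index in ("ndvi", "ndwi", "ndre"):                      # 3 x 3 x 2 = 18
        for window in ("early", "peak", "late"):
            for stat in ("mean", "max"):
                names.append(f"{index}_{window}_{stat}")
    return names


assert len(build_feature_order()) == 51
```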
|
||||
|
||||
Note: Additional smoothed array features (*_smooth) are not in FEATURE_ORDER_V1 since they are arrays, not scalars.
|
||||
|
||||
### Window Splitting Logic
|
||||
- If `dates` provided: Use month membership (10,11,12 = early; 1,2,3 = peak; 4,5,6 = late)
|
||||
- Fallback: Positional split (first 9 steps = early, next 9 = peak, next 9 = late)
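The two splitting rules can be sketched as one function that returns timestep indices per window. `split_windows` is a hypothetical name; dropping timesteps that fall outside Oct–Jun is an assumption about how out-of-window dates are treated.

```python
from datetime import date

MONTH_TO_WINDOW = {
    10: "early", 11: "early", 12: "early",
    1: "peak", 2: "peak", 3: "peak",
    4: "late", 5: "late", 6: "late",
}


def split_windows(n_steps: int, dates=None) -> dict:
    """Return {"early": [...], "peak": [...], "late": [...]} of timestep indices."""
    windows = {"early": [], "peak": [], "late": []}
    if dates is not None:
        if len(dates) != n_steps:
            raise ValueError("dates must have one entry per timestep")
        for i, d in enumerate(dates):
            name = MONTH_TO_WINDOW.get(d.month)
            if name is not None:  # months outside Oct-Jun are skipped (assumption)
                windows[name].append(i)
        return windows
    # Positional fallback: first 9 steps early, next 9 peak, next 9 late.
    for i in range(min(n_steps, 27)):
        windows[("early", "peak", "late")[i // 9]].append(i)
    return windows
```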
|
||||
|
||||
---
|
||||
|
||||
## STEP 5: DW Baseline Loading
|
||||
|
||||
### DW Object Layout
|
||||
|
||||
**Bucket**: `geocrop-baselines`
|
||||
|
||||
**Prefix**: `dw/zim/summer/`
|
||||
|
||||
**Path Pattern**: `dw/zim/summer/<season>/<type>/DW_Zim_<Type>_<year>_<year+1>.tif`
|
||||
|
||||
**Tile Naming**: COGs with 65536x65536 pixel tiles
|
||||
- Example: `DW_Zim_HighestConf_2021_2022-0000000000-0000000000.tif`
|
||||
- Format: `DW_Zim_{Type}_{Year}_{Year+1}-{TileRow}-{TileCol}.tif`
|
||||
|
||||
### DW Types
|
||||
- `HighestConf` - Highest confidence class
|
||||
- `Agreement` - Class agreement across predictions
|
||||
- `Mode` - Most common class
|
||||
|
||||
### Windowed Reads
|
||||
|
||||
The worker MUST use windowed reads to avoid downloading entire huge COG tiles:
|
||||
|
||||
1. **Presigned URL**: Get temporary URL via `storage.presign_get(bucket, key, expires=3600)`
|
||||
2. **AOI Transform**: Convert AOI bbox from WGS84 to tile CRS using `rasterio.warp.transform_bounds`
|
||||
3. **Window Creation**: Use `rasterio.windows.from_bounds` to compute window from transformed bbox
|
||||
4. **Selective Read**: Call `src.read(window=window)` to read only the needed portion
|
||||
5. **Mosaic**: If multiple tiles needed, read each window and mosaic into single array
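The arithmetic behind step 3 can be illustrated without rasterio for the simple case of a north-up raster with square pixels. This sketch rounds the window outward to whole pixels, which is a design choice made here and not necessarily what the worker does; `pixel_window` and its parameters are hypothetical.

```python
import math


def pixel_window(bounds, origin_x: float, origin_y: float, pixel_size: float):
    """Map (left, bottom, right, top) in the tile CRS to (col_off, row_off, width, height).

    origin_x / origin_y are the coordinates of the raster's top-left corner.
    """
    left, bottom, right, top = bounds
    col_start = math.floor((left - origin_x) / pixel_size)
    col_stop = math.ceil((right - origin_x) / pixel_size)
    # Rows increase downward, so measure from the top edge.
    row_start = math.floor((origin_y - top) / pixel_size)
    row_stop = math.ceil((origin_y - bottom) / pixel_size)
    return col_start, row_start, col_stop - col_start, row_stop - row_start


print(pixel_window((100, 800, 250, 950), origin_x=0, origin_y=1000, pixel_size=10))
```

Only the pixels inside this window need to be requested from the COG, which is what keeps reads small relative to a 65536x65536 tile.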
|
||||
|
||||
### CRS Handling
|
||||
|
||||
- DW tiles may be in EPSG:3857 (Web Mercator) or UTM - do NOT assume
|
||||
- Always transform AOI bbox to tile CRS before computing window
|
||||
- Output profile uses tile's native CRS
|
||||
|
||||
### Error Handling
|
||||
|
||||
- If no matching tiles found: Raise `FileNotFoundError` with searched prefix
|
||||
- If window read fails: Retry 3x with exponential backoff
|
||||
- Nodata value: 0 (preserved from DW)
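The retry rule above could be implemented with a small wrapper. This sketch assumes "3x" means three attempts in total, catches `OSError` as a stand-in for I/O failures, and uses a 1-second base delay; all three are assumptions, and `read_with_retry` is a hypothetical name.

```python
import time


def read_with_retry(read_fn, max_retries: int = 3, base_delay_s: float = 1.0, sleep=time.sleep):
    """Call read_fn(), retrying on OSError with exponential backoff."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    for attempt in range(max_retries):
        try:
            return read_fn()
        except OSError:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
            # Delays grow as base, 2*base, 4*base, ...
            sleep(base_delay_s * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the helper easy to unit test without real waiting.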
|
||||
|
||||
### Primary Function
|
||||
|
||||
```python
|
||||
def load_dw_baseline_window(
|
||||
storage,
|
||||
year: int,
|
||||
aoi_bbox_wgs84: List[float], # [min_lon, min_lat, max_lon, max_lat]
|
||||
season: str = "summer",
|
||||
dw_type: str = "HighestConf",
|
||||
bucket: str = "geocrop-baselines",
|
||||
max_retries: int = 3,
|
||||
) -> Tuple[np.ndarray, dict]:
|
||||
"""Load DW baseline clipped to AOI window from MinIO.
|
||||
|
||||
Returns:
|
||||
dw_arr: uint8 or int16 raster clipped to AOI
|
||||
profile: rasterio profile for writing outputs aligned to this window
|
||||
"""
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Plan 02 - Step 1: TiTiler Deployment+Service
|
||||
|
||||
### Files Changed
|
||||
- Created: [`k8s/25-tiler.yaml`](k8s/25-tiler.yaml)
|
||||
- Created: Kubernetes Secret `geocrop-secrets` with MinIO credentials
|
||||
|
||||
### Commands Run
|
||||
```bash
|
||||
kubectl create secret generic geocrop-secrets -n geocrop --from-literal=minio-access-key=minioadmin --from-literal=minio-secret-key=minioadmin123
|
||||
kubectl -n geocrop apply -f k8s/25-tiler.yaml
|
||||
kubectl -n geocrop get deploy,svc | grep geocrop-tiler
|
||||
```
|
||||
|
||||
### Expected Output / Acceptance Criteria
|
||||
- `kubectl -n geocrop apply -f k8s/25-tiler.yaml` succeeds (syntax correct)
|
||||
- Creates Deployment `geocrop-tiler` with 2 replicas
|
||||
- Creates Service `geocrop-tiler` (ClusterIP on port 8000 → container port 80)
|
||||
- TiTiler container reads COGs from MinIO via S3
|
||||
- Pods are Running and Ready (1/1)
|
||||
|
||||
### Actual Output
|
||||
```
|
||||
deployment.apps/geocrop-tiler 2/2 2 2 2m
|
||||
service/geocrop-tiler ClusterIP 10.43.47.225 <none> 8000/TCP 2m
|
||||
```
|
||||
|
||||
### TiTiler Environment Variables
|
||||
| Variable | Value |
|
||||
|----------|-------|
|
||||
| AWS_ACCESS_KEY_ID | from secret geocrop-secrets |
|
||||
| AWS_SECRET_ACCESS_KEY | from secret geocrop-secrets |
|
||||
| AWS_REGION | us-east-1 |
|
||||
| AWS_S3_ENDPOINT_URL | http://minio.geocrop.svc.cluster.local:9000 |
|
||||
| AWS_HTTPS | NO |
|
||||
| TILED_READER | cog |
|
||||
|
||||
### Notes
|
||||
- Container listens on port 80 (not 8000) - service maps 8000 → 80
|
||||
- Health probe path `/healthz` on port 80
|
||||
- Secret `geocrop-secrets` created for MinIO credentials
|
||||
|
||||
### Next Step
|
||||
- Step 2: Add Ingress for TiTiler (with TLS)
|
||||
|
||||
---
|
||||
|
||||
## Plan 02 - Step 2: TiTiler Ingress
|
||||
|
||||
### Files Changed
|
||||
- Created: [`k8s/26-tiler-ingress.yaml`](k8s/26-tiler-ingress.yaml)
|
||||
|
||||
### Commands Run
|
||||
```bash
|
||||
kubectl -n geocrop apply -f k8s/26-tiler-ingress.yaml
|
||||
kubectl -n geocrop get ingress geocrop-tiler -o wide
|
||||
kubectl -n geocrop describe ingress geocrop-tiler
|
||||
```
|
||||
|
||||
### Expected Output / Acceptance Criteria
|
||||
- Ingress object created with host `tiles.portfolio.techarvest.co.zw`
|
||||
- The TLS certificate stays pending until the DNS A record points to the ingress IP
|
||||
|
||||
### Actual Output
|
||||
```
|
||||
NAME CLASS HOSTS ADDRESS PORTS AGE
|
||||
geocrop-tiler nginx tiles.portfolio.techarvest.co.zw 167.86.68.48 80, 443 30s
|
||||
```
|
||||
|
||||
### Ingress Details
|
||||
- Host: tiles.portfolio.techarvest.co.zw
|
||||
- Backend: geocrop-tiler:8000
|
||||
- TLS: geocrop-tiler-tls (cert-manager with letsencrypt-prod)
|
||||
- Annotations: nginx.ingress.kubernetes.io/proxy-body-size: "50m"
|
||||
|
||||
### DNS Requirement
|
||||
External DNS A record must point to ingress IP (167.86.68.48):
|
||||
- `tiles.portfolio.techarvest.co.zw` → `167.86.68.48`
|
||||
|
||||
---
|
||||
|
||||
## Plan 02 - Step 3: TiTiler Smoke Test
|
||||
|
||||
### Commands Run
|
||||
```bash
|
||||
kubectl -n geocrop port-forward svc/geocrop-tiler 8000:8000 &
|
||||
curl -sS http://127.0.0.1:8000/ | head
|
||||
curl -sS -o /dev/null -w "%{http_code}\n" http://127.0.0.1:8000/healthz
|
||||
```
|
||||
|
||||
### Test Results
|
||||
| Endpoint | Status | Notes |
|
||||
|----------|--------|-------|
|
||||
| `/` | 200 | Landing page JSON returned |
|
||||
| `/healthz` | 200 | Health check passes |
|
||||
| `/api` | 200 | OpenAPI docs available |
|
||||
|
||||
### Final Probe Path
|
||||
- **Confirmed**: `/healthz` on port 80 works correctly
|
||||
- No manifest changes needed
|
||||
|
||||
---
|
||||
|
||||
## Plan 02 - Step 4: MinIO S3 Access Test
|
||||
|
||||
### Commands Run
|
||||
```bash
|
||||
# With correct credentials (minioadmin/minioadmin123)
|
||||
curl -sS "http://127.0.0.1:8000/cog/info?url=s3://geocrop-baselines/dw/zim/summer/summer/highest/DW_Zim_HighestConf_2016_2017-0000000000-0000000000.tif"
|
||||
```
|
||||
|
||||
### Test Results
|
||||
| Test | Result | Notes |
|
||||
|------|--------|-------|
|
||||
| S3 Access | ❌ Failed | Error: "The AWS Access Key Id you provided does not exist in our records" |
|
||||
|
||||
### Issue Analysis
|
||||
- MinIO credentials used: `minioadmin` / `minioadmin123`
|
||||
- The root user is `minioadmin` with password `minioadmin123`
|
||||
- TiTiler pods have correct env vars set (verified via `kubectl exec`)
|
||||
- Issue may be: (1) bucket not created, (2) bucket path incorrect, or (3) network policy
|
||||
|
||||
### Environment Variables (Verified Working)
|
||||
| Variable | Value |
|
||||
|----------|-------|
|
||||
| AWS_ACCESS_KEY_ID | minioadmin |
|
||||
| AWS_SECRET_ACCESS_KEY | minioadmin123 |
|
||||
| AWS_S3_ENDPOINT_URL | http://minio.geocrop.svc.cluster.local:9000 |
|
||||
| AWS_HTTPS | NO |
|
||||
| AWS_REGION | us-east-1 |
|
||||
|
||||
### Next Step
|
||||
- Verify bucket exists in MinIO
|
||||
- Check bucket naming convention in MinIO console
|
||||
- Or upload test COG to verify S3 access
|
||||
`cd apps/worker && python worker.py --worker`
|
||||
|
||||
### Docker (Local Build)
|
||||
`docker build -t frankchine/geocrop-web:latest apps/web/`
|
||||
|
||||
## 🧠 Critical Patterns (Non-Obvious)
|
||||
|
||||
### 🚫 Scoping Mandate
|
||||
- **Kubernetes Only:** Focus exclusively on resources managed by Kubernetes. **NEVER** modify host-level Nginx, CloudPanel, or system services outside the cluster.
|
||||
|
||||
### 🗺️ Geospatial Conventions
|
||||
- **AOI Format:** Always `(lon, lat, radius_m)` (longitude first).
|
||||
- **Season Window:** "Summer" = Sept 1st to May 31st of the following year.
|
||||
- **Zimbabwe Bounds:** Lon 25.2–33.1, Lat -22.5 to -15.6.
|
||||
- **Feature Order:** `FEATURE_ORDER_V1` (51 features) is strictly immutable.
|
||||
|
||||
### 🔌 Connectivity
|
||||
- **Redis Host:** `redis.geocrop.svc.cluster.local` (Port 6379).
|
||||
- **MinIO Host:** `minio.geocrop.svc.cluster.local` (Port 9000).
|
||||
- **Queue Name:** `geocrop_tasks`.
|
||||
|
||||
### 📦 Storage Layout (MinIO)
|
||||
- `geocrop-models/`: Serialized ML models (`.pkl`) and MLflow artifacts.
|
||||
- `geocrop-baselines/`: Dynamic World COGs (`dw/zim/summer/...`).
|
||||
- `geocrop-results/`: Output COGs (`results/<job_id>/...`).
|
||||
- `geocrop-datasets/`: Training CSVs.
|
||||
|
||||
## 🚢 GitOps Workflow
|
||||
- **CI**: Build and Push via `.gitea/workflows/build-push.yaml`.
|
||||
- **CD**: ArgoCD tracks `k8s/base/` in the `geocrop-platform` application.
|
||||
- **Secrets**: Managed via Kubernetes Secrets (e.g., `geocrop-secrets`, `geocrop-db-secret`).
|
||||
|
||||
## 📊 Current Kubernetes State (geocrop namespace)
|
||||
|
||||
| Deployment | Role | Status |
|
||||
|------------|------|--------|
|
||||
| `geocrop-web` | React Frontend | Running (1/1) |
|
||||
| `geocrop-api` | FastAPI Backend | Running (1/1) |
|
||||
| `geocrop-worker` | Inference Engine | Running (1/1) |
|
||||
| `gitea` | Source Control | Running (1/1) |
|
||||
| `gitea-runner` | CI Runner (Actions) | Running (1/1) |
|
||||
| `mlflow` | Experiment Tracking | Running (1/1) |
|
||||
| `jupyter-lab` | Data Science IDE | Running (1/1) |
|
||||
| `geocrop-db` | PostGIS Database | Running (1/1) |
|
||||
| `redis` | Job Broker | Running (1/1) |
|
||||
| `minio` | S3 Storage | Running (1/1) |
|
||||
| `geocrop-tiler` | Dynamic Tile Server | Running (2/2) |
|
||||
|
||||
### 🌐 Endpoints
|
||||
- **Portfolio**: `portfolio.techarvest.co.zw`
|
||||
- **API Docs**: `api.portfolio.techarvest.co.zw/docs`
|
||||
- **Gitea**: `git.techarvest.co.zw`
|
||||
- **ArgoCD**: `cd.techarvest.co.zw`
|
||||
- **MLflow**: `ml.techarvest.co.zw`
|
||||
- **Jupyter**: `lab.techarvest.co.zw`
|
||||
- **Tiler**: `tiles.portfolio.techarvest.co.zw`
|
||||
|
|
|
|||
205
CLAUDE.md
|
|
@ -1,176 +1,65 @@
|
|||
# CLAUDE.md
|
||||
# CLAUDE.md - GeoCrop Engineering Guide
|
||||
|
||||
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
|
||||
GeoCrop is a production-grade, self-hosted ML platform for crop-type classification in Zimbabwe.
|
||||
|
||||
## What This Project Does
|
||||
|
||||
GeoCrop is a crop-type classification platform for Zimbabwe. It:
|
||||
1. Accepts an AOI (lat/lon + radius) and year via REST API
|
||||
2. Queues an inference job via Redis/RQ
|
||||
3. Worker fetches Sentinel-2 imagery from DEA STAC, computes 51 spectral features, loads a Dynamic World baseline, runs an ML model (XGBoost/LightGBM/CatBoost/Ensemble), and uploads COG results to MinIO
|
||||
4. Results are served via TiTiler (tile server reading COGs directly from MinIO over S3)
|
||||
|
||||
## Build & Run Commands
|
||||
## 🚀 Key Commands
|
||||
|
||||
```bash
|
||||
# API
|
||||
cd apps/api && pip install -r requirements.txt
|
||||
uvicorn main:app --host 0.0.0.0 --port 8000
|
||||
# Frontend (React 19 + TypeScript)
|
||||
cd apps/web && npm install && npm run dev
|
||||
|
||||
# Worker
|
||||
cd apps/worker && pip install -r requirements.txt
|
||||
python worker.py --worker # start RQ worker
|
||||
python worker.py --test # syntax/import self-test only
|
||||
# API (FastAPI)
|
||||
cd apps/api && uvicorn main:app --reload
|
||||
|
||||
# Web frontend (React + Vite + TypeScript)
|
||||
cd apps/web && npm install
|
||||
npm run dev # dev server (hot reload)
|
||||
npm run build # production build → dist/
|
||||
npm run lint # ESLint check
|
||||
npm run preview # preview production build locally
|
||||
# Worker (RQ)
|
||||
cd apps/worker && python worker.py --worker
|
||||
|
||||
# Training
|
||||
cd training && python train.py --data /path/to/data.csv --out ./artifacts --variant Raw
|
||||
# With MinIO upload:
|
||||
MINIO_ENDPOINT=... MINIO_ACCESS_KEY=... MINIO_SECRET_KEY=... \
|
||||
python train.py --data /path/to/data.csv --out ./artifacts --variant Raw --upload-minio
|
||||
|
||||
# Docker
|
||||
docker build -t frankchine/geocrop-api:v1 apps/api/
|
||||
docker build -t frankchine/geocrop-worker:v1 apps/worker/
|
||||
# Infrastructure (K8s)
|
||||
# Manifests are managed via ArgoCD. Pushing to 'main' triggers reconciliation.
|
||||
# Root manifests in k8s/base/
|
||||
```
|
||||
|
||||
## Kubernetes Deployment
|
||||
## 🚢 CI/CD & GitOps
|
||||
|
||||
All k8s manifests are in `k8s/` — numbered for apply order:
|
||||
- **Source Control**: Gitea (`git.techarvest.co.zw`).
|
||||
- **CI**: Gitea Actions (`.gitea/workflows/build-push.yaml`) builds and pushes images to Docker Hub.
|
||||
- **CD**: ArgoCD (`cd.techarvest.co.zw`) tracks `k8s/base/` and auto-syncs to the `geocrop` namespace.
|
||||
- **Git Repo**: `http://gitea.geocrop.svc.cluster.local:3000/fchinembiri/geocrop-platform.git`
|
||||
|
||||
```bash
|
||||
kubectl apply -f k8s/00-namespace.yaml
|
||||
kubectl apply -f k8s/ # apply all in order
|
||||
kubectl -n geocrop rollout restart deployment/geocrop-api
|
||||
kubectl -n geocrop rollout restart deployment/geocrop-worker
|
||||
```
|
||||
## 🌐 Endpoints
|
||||
|
||||
Namespace: `geocrop`. Ingress class: `nginx`. ClusterIssuer: `letsencrypt-prod`.
|
||||
- **Portfolio**: `portfolio.techarvest.co.zw`
|
||||
- **API**: `api.portfolio.techarvest.co.zw`
|
||||
- **Gitea**: `git.techarvest.co.zw`
|
||||
- **ArgoCD**: `cd.techarvest.co.zw`
|
||||
- **MLflow**: `ml.techarvest.co.zw`
|
||||
- **Jupyter**: `lab.techarvest.co.zw`
|
||||
- **Tiler**: `tiles.portfolio.techarvest.co.zw`
|
||||
- **MinIO**: `minio.portfolio.techarvest.co.zw`
|
||||
|
||||
Exposed hosts:
|
||||
- `portfolio.techarvest.co.zw` → geocrop-web (nginx static)
|
||||
- `api.portfolio.techarvest.co.zw` → geocrop-api:8000
|
||||
- `tiles.portfolio.techarvest.co.zw` → geocrop-tiler:8000 (TiTiler)
|
||||
- `minio.portfolio.techarvest.co.zw` → MinIO API
|
||||
- `console.minio.portfolio.techarvest.co.zw` → MinIO Console
|
||||
## 📐 Architecture & Patterns
|
||||
|
||||
## Architecture
|
||||
### Components
|
||||
- **Web**: React 19 + OpenLayers. UI for portfolio and interactive crop mapping.
|
||||
- **API**: FastAPI. Handles auth (JWT), job validation, and queueing.
|
||||
- **Worker**: RQ-based Python worker. Orchestrates STAC fetch → Feature extraction → Inference → smoothing → COG export.
|
||||
- **Tiler**: TiTiler. Serves tiles directly from MinIO COGs via S3 protocol.
|
||||
|
||||
```
|
||||
Web (React/Vite/OL) → API (FastAPI) → Redis Queue (geocrop_tasks) → Worker (RQ)
|
||||
↓
|
||||
DEA STAC → feature_computation.py (51 features)
|
||||
MinIO → dw_baseline.py (windowed read)
|
||||
MinIO → inference.py (model load + predict)
|
||||
→ postprocess.py (majority filter)
|
||||
→ cog.py (write COG)
|
||||
→ MinIO geocrop-results/
|
||||
↓
|
||||
TiTiler reads COGs from MinIO via S3 protocol
|
||||
```
|
||||
### Storage (MinIO)
|
||||
- `geocrop-models`: ML models and MLflow artifacts.
|
||||
- `geocrop-baselines`: Dynamic World COGs.
|
||||
- `geocrop-results`: Inference outputs (COGs).
|
||||
|
||||
Job status is written to Redis at `job:{job_id}:status` with 24h expiry.
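A minimal sketch of the status-key convention above, assuming the status is stored as a small JSON document. The payload shape, the helper names, and the `FakeRedis` stand-in are illustrative assumptions. Only the key pattern and the 24h expiry come from this guide. The `set(name, value, ex=...)` call mirrors the signature of redis-py's `Redis.set`.

```python
import json

STATUS_TTL_SECONDS = 24 * 60 * 60  # 24h expiry, as stated above


def status_key(job_id: str) -> str:
    """Build the Redis key that holds a job's status."""
    return f"job:{job_id}:status"


def write_status(client, job_id: str, stage: str, **extra) -> str:
    """Write a JSON status payload with a 24h TTL.

    `client` is any object exposing set(name, value, ex=seconds),
    for example a redis.Redis instance.
    """
    payload = json.dumps({"stage": stage, **extra})  # payload shape is an assumption
    client.set(status_key(job_id), payload, ex=STATUS_TTL_SECONDS)
    return payload


class FakeRedis:
    """In-memory stand-in so the sketch runs without a Redis server."""

    def __init__(self):
        self.store = {}

    def set(self, name, value, ex=None):
        self.store[name] = (value, ex)


client = FakeRedis()
write_status(client, "abc123", "infer")
```

Keeping the key builder in one function means the API and the worker cannot drift apart on the key format.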
|
||||
### Non-Obvious Constraints
|
||||
- **Kubernetes Only**: Only modify resources managed by K8s. **Avoid** host-level configs (Nginx/CloudPanel).
|
||||
- **AOI Format**: `(lon, lat, radius_m)` — Longitude first.
|
||||
- **Season Window**: Sept 1st to May 31st (Zimbabwe Summer).
|
||||
- **Feature Order**: `FEATURE_ORDER_V1` (51 features) is immutable.
|
||||
|
||||
**Web frontend** (`apps/web/`): React 19 + TypeScript + Vite. Uses OpenLayers for the map (click-to-set-coordinates). Components: `Login`, `Welcome`, `JobForm`, `StatusMonitor`, `MapComponent`, `Admin`. State is in `App.tsx`; JWT token stored in `localStorage`.
|
||||
## 📂 Repository Structure
|
||||
|
||||
**API user store**: Users are stored in an in-memory dict (`USERS` in `apps/api/main.py`) — lost on restart. Admin panel (`/admin/users`) manages users at runtime. Any user additions must be re-done after pod restarts unless the dict is seeded in code.
|
||||
|
||||
## Critical Non-Obvious Patterns
|
||||
|
||||
**Season window**: Sept 1 → May 31 of the following year. `year=2022` → 2022-09-01 to 2023-05-31. See `InferenceConfig.season_dates()` in `apps/worker/config.py`.
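The season rule can be sketched as a standalone function. This is a hypothetical helper for illustration, not a copy of `InferenceConfig.season_dates()`, whose actual signature may differ.

```python
from datetime import date


def season_dates(year: int) -> tuple[date, date]:
    """Return (start, end) of the summer season that begins in `year`."""
    # The season runs from 1 September of `year` to 31 May of the next year.
    return date(year, 9, 1), date(year + 1, 5, 31)


start, end = season_dates(2022)
print(start.isoformat(), end.isoformat())  # 2022-09-01 2023-05-31
```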
|
||||
|
||||
**AOI format**: `(lon, lat, radius_m)` — NOT `(lat, lon)`. Longitude first everywhere in `features.py`.
|
||||
|
||||
**Zimbabwe bounds**: Lon 25.2–33.1, Lat -22.5 to -15.6 (enforced in `worker.py` validation).
|
||||
|
||||
**Radius limit**: Max 5000m enforced in both API (`apps/api/main.py:90`) and worker validation.
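The AOI ordering, Zimbabwe bounds, and radius limit can be combined into one check. The sketch below uses the limits quoted above. The function name and error messages are assumptions and may not match the validation in `worker.py` or `main.py`.

```python
ZIM_LON = (25.2, 33.1)    # longitude bounds
ZIM_LAT = (-22.5, -15.6)  # latitude bounds
MAX_RADIUS_M = 5000


def validate_aoi(aoi):
    """Validate an AOI given as (lon, lat, radius_m), longitude first."""
    lon, lat, radius_m = aoi
    if not ZIM_LON[0] <= lon <= ZIM_LON[1]:
        raise ValueError(f"lon {lon} outside Zimbabwe bounds {ZIM_LON}")
    if not ZIM_LAT[0] <= lat <= ZIM_LAT[1]:
        raise ValueError(f"lat {lat} outside Zimbabwe bounds {ZIM_LAT}")
    if not 0 < radius_m <= MAX_RADIUS_M:
        raise ValueError(f"radius_m must be in (0, {MAX_RADIUS_M}], got {radius_m}")
    return lon, lat, radius_m


# Harare, roughly: longitude first, then latitude.
validate_aoi((31.05, -17.83, 2000))
```

A useful side effect is that a swapped `(lat, lon)` pair fails immediately, because a Zimbabwean latitude is never a valid Zimbabwean longitude.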
|
||||
|
||||
**RQ queue name**: `geocrop_tasks`. Redis service: `redis.geocrop.svc.cluster.local`.
|
||||
|
||||
**API vs worker function name mismatch**: `apps/api/main.py` enqueues `'worker.run_inference'` but the worker only defines `run_job`. Any new worker entry point must be named `run_inference` (or the API call must be updated) for end-to-end jobs to work.
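One low-risk way to resolve the mismatch is a module-level alias in the worker, so that both names point at the same function. This is only a sketch. The `run_job` body below is a placeholder and not the real worker logic.

```python
def run_job(payload: dict) -> dict:
    """Placeholder for the existing worker entry point."""
    return {"job_id": payload.get("job_id"), "status": "done"}


# RQ imports the enqueued function by its dotted path, so exposing the
# same callable under the name the API enqueues makes the job resolvable.
run_inference = run_job
```

The alternative is to change the string enqueued in `apps/api/main.py`. The alias has the advantage that jobs already sitting in the queue under the old name keep working.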
|
||||
|
||||
**Smoothing kernel**: Must be odd — 3, 5, or 7 only (`postprocess.py`).
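An odd kernel keeps the window centred on the pixel being smoothed. The following pure-Python sketch shows the check and a simple majority filter. It is illustrative only: the real `postprocess.py` may use array operations and a different edge policy (here the window is truncated at the borders, which is an assumption).

```python
from collections import Counter

ALLOWED_KERNELS = (3, 5, 7)


def validate_kernel(k: int) -> int:
    """Reject any kernel size other than 3, 5, or 7."""
    if k not in ALLOWED_KERNELS:
        raise ValueError(f"kernel must be one of {ALLOWED_KERNELS}, got {k}")
    return k


def majority_filter(grid, k=3):
    """Replace each cell with the most common class in its k x k window."""
    half = validate_kernel(k) // 2
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            # Collect the neighbourhood, clipped to the grid edges.
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            out[r][c] = Counter(window).most_common(1)[0][0]
    return out


smoothed = majority_filter([[1, 1, 1], [1, 2, 1], [1, 1, 1]], k=3)
```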
|
||||
|
||||
**Feature order**: `FEATURE_ORDER_V1` in `feature_computation.py` — exactly 51 scalar features. Order matters for model inference. Changing this breaks all existing models.
|
||||
|
||||
## MinIO Buckets & Path Conventions
|
||||
|
||||
| Bucket | Purpose | Path pattern |
|
||||
|--------|---------|-------------|
|
||||
| `geocrop-models` | ML model `.pkl` files | ROOT — no subfolders |
|
||||
| `geocrop-baselines` | Dynamic World COG tiles | `dw/zim/summer/<season>/<type>/DW_Zim_<Type>_<year>_<year+1>-<row>-<col>.tif` |
|
||||
| `geocrop-results` | Output COGs | `results/<job_id>/<filename>` |
|
||||
| `geocrop-datasets` | Training data CSVs | — |
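As a small illustration of the `geocrop-results` path pattern, the helper below builds an object key from a job ID and a filename. The function name and the example filename are hypothetical.

```python
RESULTS_BUCKET = "geocrop-results"


def result_object_key(job_id: str, filename: str) -> str:
    """Build the object key results/<job_id>/<filename>."""
    for label, value in (("job_id", job_id), ("filename", filename)):
        # Reject empty parts and embedded slashes so the layout stays flat.
        if not value or "/" in value:
            raise ValueError(f"invalid {label}: {value!r}")
    return f"results/{job_id}/{filename}"


key = result_object_key("abc123", "refined.tif")  # filename is illustrative
```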
|
||||
|
||||
**Model filenames** (ROOT of `geocrop-models`):
|
||||
- `Zimbabwe_Ensemble_Raw_Model.pkl` — no scaler needed
|
||||
- `Zimbabwe_XGBoost_Model.pkl`, `Zimbabwe_LightGBM_Model.pkl`, `Zimbabwe_RandomForest_Model.pkl` — require scaler
|
||||
- `Zimbabwe_CatBoost_Raw_Model.pkl` — no scaler
|
||||
|
||||
**DW baseline tiles**: COGs are 65536×65536 pixel tiles. Worker MUST use windowed reads via presigned URL — never download the full tile. Always transform AOI bbox to tile CRS before computing window.
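The window arithmetic behind a windowed read can be sketched in plain Python. It assumes a north-up tile with square pixels and bounds that have already been transformed into the tile CRS. The function name and parameters are hypothetical. In the worker itself, a helper such as `rasterio.windows.from_bounds` would normally do this step.

```python
import math


def window_from_bounds(bounds, origin_x, origin_y, pixel_size, tile_size=65536):
    """Convert (left, bottom, right, top) in the tile CRS to a pixel window.

    origin_x / origin_y are the coordinates of the tile's top-left corner.
    Returns (col_off, row_off, width, height), clipped to the tile.
    """
    left, bottom, right, top = bounds
    col_start = max(0, math.floor((left - origin_x) / pixel_size))
    col_stop = min(tile_size, math.ceil((right - origin_x) / pixel_size))
    # Rows count downward from the top edge of the tile.
    row_start = max(0, math.floor((origin_y - top) / pixel_size))
    row_stop = min(tile_size, math.ceil((origin_y - bottom) / pixel_size))
    if col_stop <= col_start or row_stop <= row_start:
        raise ValueError("AOI does not intersect this tile")
    return col_start, row_start, col_stop - col_start, row_stop - row_start


window = window_from_bounds((100, 800, 200, 900), origin_x=0, origin_y=1000, pixel_size=10)
```

Reading only this window keeps the transfer proportional to the AOI rather than to the 65536×65536 tile.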
|
||||
|
||||
## Environment Variables
|
||||
|
||||
| Variable | Default | Notes |
|
||||
|----------|---------|-------|
|
||||
| `REDIS_HOST` | `redis.geocrop.svc.cluster.local` | Also supports `REDIS_URL` |
|
||||
| `MINIO_ENDPOINT` | `minio.geocrop.svc.cluster.local:9000` | |
|
||||
| `MINIO_ACCESS_KEY` | `minioadmin` | |
|
||||
| `MINIO_SECRET_KEY` | `minioadmin123` | |
|
||||
| `MINIO_SECURE` | `false` | |
|
||||
| `GEOCROP_CACHE_DIR` | `/tmp/geocrop-cache` | |
|
||||
| `SECRET_KEY` | (change in prod) | API JWT signing |
|
||||
|
||||
TiTiler uses `AWS_S3_ENDPOINT_URL=http://minio.geocrop.svc.cluster.local:9000`, `AWS_HTTPS=NO`, credentials from `geocrop-secrets` k8s secret.
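The environment table above could be loaded along these lines. This is a sketch using the documented defaults. The `Settings` class, the `load_settings` name, and the set of strings treated as true for `MINIO_SECURE` are assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    redis_host: str
    minio_endpoint: str
    minio_access_key: str
    minio_secret_key: str
    minio_secure: bool
    cache_dir: str


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    secure_raw = env.get("MINIO_SECURE", "false")
    return Settings(
        redis_host=env.get("REDIS_HOST", "redis.geocrop.svc.cluster.local"),
        minio_endpoint=env.get("MINIO_ENDPOINT", "minio.geocrop.svc.cluster.local:9000"),
        minio_access_key=env.get("MINIO_ACCESS_KEY", "minioadmin"),
        minio_secret_key=env.get("MINIO_SECRET_KEY", "minioadmin123"),
        # Accepted truthy spellings are an assumption of this sketch.
        minio_secure=secure_raw.strip().lower() in ("1", "true", "yes"),
        cache_dir=env.get("GEOCROP_CACHE_DIR", "/tmp/geocrop-cache"),
    )


defaults = load_settings({})
```

Passing a plain dict instead of `os.environ` makes the loader easy to test without touching the process environment.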
|
||||
|
||||
## Feature Engineering (must match training exactly)
|
||||
|
||||
Pipeline in `feature_computation.py`:
|
||||
1. Compute indices: ndvi, ndre, evi, savi, ci_re, ndwi
|
||||
2. Fill zero values by linear interpolation, then apply Savitzky-Golay smoothing (window=5, polyorder=2)
|
||||
3. Phenology metrics for ndvi/ndre/evi: max, min, mean, std, amplitude, auc, peak_timestep, max_slope_up, max_slope_down (27 features)
|
||||
4. Harmonics for ndvi only: harmonic1_sin/cos, harmonic2_sin/cos (4 features)
|
||||
5. Interactions: ndvi_ndre_peak_diff, canopy_density_contrast (2 features)
|
||||
6. Window summaries (early=Oct–Dec, peak=Jan–Mar, late=Apr–Jun) for ndvi/ndwi/ndre × mean/max (18 features)
|
||||
|
||||
**Total: 51 features** — see `FEATURE_ORDER_V1` for exact ordering.
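The count of 51 can be checked by enumerating the groups listed above: 27 + 4 + 2 + 18. The naming scheme in this sketch is hypothetical and the resulting order is not the real one. `FEATURE_ORDER_V1` remains the only authoritative ordering.

```python
from itertools import product

PHENO_INDICES = ("ndvi", "ndre", "evi")
PHENO_METRICS = (
    "max", "min", "mean", "std", "amplitude", "auc",
    "peak_timestep", "max_slope_up", "max_slope_down",
)
HARMONICS = ("harmonic1_sin", "harmonic1_cos", "harmonic2_sin", "harmonic2_cos")
INTERACTIONS = ("ndvi_ndre_peak_diff", "canopy_density_contrast")
WINDOWS = ("early", "peak", "late")
WINDOW_INDICES = ("ndvi", "ndwi", "ndre")
WINDOW_STATS = ("mean", "max")


def build_feature_names():
    """Enumerate illustrative feature names, group by group."""
    names = [f"{i}_{m}" for i, m in product(PHENO_INDICES, PHENO_METRICS)]  # 3 x 9 = 27
    names += [f"ndvi_{h}" for h in HARMONICS]                               # 4
    names += list(INTERACTIONS)                                             # 2
    names += [
        f"{w}_{i}_{s}" for w, i, s in product(WINDOWS, WINDOW_INDICES, WINDOW_STATS)
    ]                                                                       # 3 x 3 x 2 = 18
    return names


feature_names = build_feature_names()
```

A length check like this is a cheap guard in tests against accidentally adding or dropping a feature.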
|
||||
|
||||
Training junk columns dropped: `.geo`, `system:index`, `latitude`, `longitude`, `lat`, `lon`, `ID`, `parent_id`, `batch_id`, `is_syn`.
|
||||
|
||||
## DEA STAC
|
||||
|
||||
- Search endpoint: `https://explorer.digitalearth.africa/stac/search`
|
||||
- Primary collection: `s2_l2a` (falls back to `s2_l2a_c1`, `sentinel-2-l2a`, `sentinel_2_l2a`)
|
||||
- Required bands: red, green, blue, nir, nir08 (red-edge), swir16, swir22
|
||||
- Cloud filter: `eo:cloud_cover < 30`
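The search parameters above map onto a STAC item-search request body roughly as follows. This sketch only builds the payload and sends nothing. It assumes the endpoint accepts the STAC Query extension for the cloud-cover filter, and the function name and arguments are illustrative.

```python
STAC_SEARCH_URL = "https://explorer.digitalearth.africa/stac/search"


def build_stac_search(bbox, start, end, collection="s2_l2a", max_cloud=30, limit=100):
    """Build a STAC search body.

    bbox is (lon_min, lat_min, lon_max, lat_max); start/end are ISO dates.
    """
    lon_min, lat_min, lon_max, lat_max = bbox
    return {
        "collections": [collection],
        "bbox": [lon_min, lat_min, lon_max, lat_max],
        "datetime": f"{start}T00:00:00Z/{end}T23:59:59Z",
        # Query extension: keep scenes with cloud cover below the threshold.
        "query": {"eo:cloud_cover": {"lt": max_cloud}},
        "limit": limit,
    }


payload = build_stac_search((31.0, -17.9, 31.1, -17.8), "2022-09-01", "2023-05-31")
```

To try the fallback collections, the same function can be called again with a different `collection` argument when a search returns no items.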
|
||||
|
||||
## Worker Pipeline Stages
|
||||
|
||||
`fetch_stac → build_features → load_dw → infer → smooth → export_cog → upload → done`
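The stage sequence can be kept as an ordered tuple so that status updates and progress reporting share one definition. The percentage calculation below is an illustrative assumption and not something the worker is known to report.

```python
STAGES = (
    "fetch_stac", "build_features", "load_dw", "infer",
    "smooth", "export_cog", "upload", "done",
)


def progress_percent(stage: str) -> int:
    """Map a stage name to a rough completion percentage."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    return round(100 * STAGES.index(stage) / (len(STAGES) - 1))


def next_stage(stage: str):
    """Return the stage after `stage`, or None once the job is done."""
    idx = STAGES.index(stage)
    return STAGES[idx + 1] if idx + 1 < len(STAGES) else None
```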
|
||||
|
||||
When real DEA STAC data is unavailable, worker falls back to synthetic features (seeded by year+coords) to allow end-to-end pipeline testing.
|
||||
|
||||
## Label Classes (V1 — temporary)
|
||||
|
||||
35 classes including Maize, Tobacco, Soyabean, etc. — defined as `CLASSES_V1` in `apps/worker/worker.py`. Extract dynamically from `model.classes_` when available; fall back to this list only if not present.
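The "model first, static list second" rule might look like this in code. `CLASSES_FALLBACK` is a shortened placeholder for `CLASSES_V1` (the full 35-entry list is not reproduced here), and the helper name is hypothetical.

```python
# Placeholder only: the real fallback is CLASSES_V1 in apps/worker/worker.py.
CLASSES_FALLBACK = ["Maize", "Tobacco", "Soyabean"]


def resolve_class_names(model, fallback=CLASSES_FALLBACK):
    """Prefer the labels stored on the model; otherwise use the static list."""
    classes = getattr(model, "classes_", None)
    if classes is None or len(classes) == 0:
        return list(fallback)
    return [str(c) for c in classes]


class _ModelWithClasses:
    classes_ = ["Cotton", "Maize"]


class _ModelWithoutClasses:
    pass


from_model = resolve_class_names(_ModelWithClasses())
from_fallback = resolve_class_names(_ModelWithoutClasses())
```

Using `len(...)` rather than truthiness avoids the ambiguity error that a NumPy array of class labels would raise in a boolean context.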
|
||||
|
||||
## Training Artifacts
|
||||
|
||||
`train.py --variant Raw` produces `artifacts/model_raw/`:
|
||||
- `model.joblib` — VotingClassifier (soft) over RF + XGBoost + LightGBM + CatBoost
|
||||
- `label_encoder.joblib` — sklearn LabelEncoder (maps string class → int)
|
||||
- `selected_features.json` — feature subset chosen by scout RF (subset of FEATURE_ORDER_V1)
|
||||
- `meta.json` — class names, n_features, config snapshot
|
||||
- `metrics.json` — per-model accuracy/F1/classification report
|
||||
|
||||
`--variant Scaled` also emits `scaler.joblib`. Models uploaded to MinIO via `--upload-minio` go under `geocrop-models` at the ROOT (no subfolders).
|
||||
|
||||
## Plans & Docs
|
||||
|
||||
`plan/` contains detailed step-by-step implementation plans (01–05) and an SRS. Read these before making significant architectural changes. `ops/` contains MinIO upload scripts and storage setup docs.
|
||||
- `apps/`: Source code for web, api, and worker.
|
||||
- `k8s/base/`: Kubernetes manifests (ArgoCD target).
|
||||
- `training/`: Model training scripts and research.
|
||||
- `plan/`: Architectural blueprints and restructuring reports.
|
||||
- `ops/`: Infrastructure scripts and data migration tools.
|
||||
|
|
|
|||
87
GEMINI.md
|
|
@ -1,73 +1,76 @@
|
|||
# GeoCrop - Crop-Type Classification Platform
|
||||
# GeoCrop - Sovereign MLOps Platform
|
||||
|
||||
GeoCrop is an ML-based platform designed for crop-type classification in Zimbabwe. It utilizes Sentinel-2 satellite imagery from Digital Earth Africa (DEA) STAC, computes advanced spectral and phenological features, and employs multiple ML models (XGBoost, LightGBM, CatBoost, and Soft-Voting Ensembles) to generate high-resolution classification maps.
|
||||
GeoCrop is a production-grade, self-hosted ML platform designed for crop-type classification in Zimbabwe. It utilizes Sentinel-2 satellite imagery (DEA STAC), computes 51 spectral/phenological features, and employs ensemble ML models to generate high-resolution Cloud Optimized GeoTIFFs (COGs).
|
||||
|
||||
## 🚀 Project Overview
|
||||
## 🚀 System Architecture
|
||||
|
||||
- **Architecture**: Distributed system with a FastAPI REST API, Redis/RQ job queue, and Python workers.
|
||||
- **Data Pipeline**:
|
||||
1. **DEA STAC**: Fetches Sentinel-2 L2A imagery.
|
||||
2. **Feature Engineering**: Computes 51 features (NDVI, NDRE, EVI, SAVI, CI_RE, NDWI) including phenology, harmonics, and seasonal window summaries.
|
||||
3. **Inference**: Loads models from MinIO, runs windowed predictions, and applies a majority filter.
|
||||
4. **Output**: Generates Cloud Optimized GeoTIFFs (COGs) stored in MinIO and served via TiTiler.
|
||||
- **Deployment**: Kubernetes (K3s) with automated SSL (cert-manager) and NGINX Ingress.
|
||||
The platform follows a **Sovereign MLOps** philosophy, hosting the entire lifecycle—from source control and experiment tracking to inference and GitOps—on a private K3s cluster.
|
||||
|
||||
- **Frontend**: React 19 + OpenLayers/Leaflet (Portfolio & App).
|
||||
- **Backend**: FastAPI REST API + Redis/RQ Job Queue.
|
||||
- **ML Engine**: Python Inference Workers + XGBoost/CatBoost/LightGBM Ensembles.
|
||||
- **Infrastructure**:
|
||||
- **GitOps**: ArgoCD (CD) + Gitea (Source Control & CI).
|
||||
- **Experiment Tracking**: MLflow (Postgres/MinIO backend).
|
||||
- **Development**: JupyterLab (integrated with MinIO).
|
||||
- **Storage**: MinIO (S3-compatible) for datasets, models, and results.
|
||||
- **Database**: Postgres + PostGIS for spatial metadata and app state.
|
||||
|
||||
## 🛠️ Building and Running
|
||||
|
||||
### Development
|
||||
```bash
|
||||
# Frontend Development
|
||||
cd apps/web && npm install && npm run dev
|
||||
|
||||
# API Development
|
||||
cd apps/api && pip install -r requirements.txt
|
||||
uvicorn main:app --host 0.0.0.0 --port 8000
|
||||
uvicorn main:app --reload
|
||||
|
||||
# Worker Development
|
||||
cd apps/worker && pip install -r requirements.txt
|
||||
python worker.py --worker
|
||||
|
||||
# Training Models
|
||||
cd training && pip install -r requirements.txt
|
||||
python train.py --data /path/to/data.csv --out ./artifacts --variant Raw
|
||||
```
|
||||
|
||||
### Docker
|
||||
```bash
|
||||
docker build -t frankchine/geocrop-api:v1 apps/api/
|
||||
docker build -t frankchine/geocrop-worker:v1 apps/worker/
|
||||
```
|
||||
### GitOps Workflow (CI/CD)
|
||||
1. **Push** code to Gitea (`git.techarvest.co.zw`).
|
||||
2. **CI**: Gitea Actions build and push Docker images to Docker Hub.
|
||||
3. **CD**: ArgoCD detects manifest changes or image updates and reconciles the cluster state.
|
||||
|
||||
### Kubernetes
|
||||
### Kubernetes Deployment
|
||||
```bash
|
||||
# Apply manifests in order
|
||||
kubectl apply -f k8s/00-namespace.yaml
|
||||
kubectl apply -f k8s/
|
||||
# Manual apply (if not using ArgoCD auto-sync)
|
||||
kubectl apply -k k8s/base/
|
||||
```
|
||||
|
||||
## 📐 Development Conventions
|
||||
|
||||
### Critical Patterns (Non-Obvious)
|
||||
- **AOI Format**: Always use `(lon, lat, radius_m)` tuple. Longitude comes first.
|
||||
- **Season Window**: Sept 1st to May 31st (Zimbabwe Summer Season). `year=2022` implies 2022-09-01 to 2023-05-31.
|
||||
- **Zimbabwe Bounds**: Lon 25.2–33.1, Lat -22.5 to -15.6.
|
||||
- **Feature Order**: `FEATURE_ORDER_V1` (51 features) is immutable; changing it breaks existing model compatibility.
|
||||
- **Redis Connection**: Use `redis.geocrop.svc.cluster.local` within the cluster.
|
||||
- **Queue**: Always use the `geocrop_tasks` queue.
|
||||
- **Kubernetes Only:** Focus exclusively on resources managed by Kubernetes (pods, services, ingresses, etc.). **NEVER** modify host-level Nginx configurations (`/etc/nginx/`), CloudPanel settings, or system services outside the cluster.
|
||||
- **AOI Format:** Always use `(lon, lat, radius_m)` tuple. Longitude comes first.
|
||||
- **Season Window:** Sept 1st to May 31st (Zimbabwe Summer Season). `year=2022` implies 2022-09-01 to 2023-05-31.
|
||||
- **Feature Order:** `FEATURE_ORDER_V1` (51 features) is immutable; changing it breaks model compatibility.
|
||||
- **Storage Contract:** Use `geocrop-results` for outputs and `geocrop-models` for serialized artifacts.
|
||||
|
||||
### Storage Layout (MinIO)
|
||||
- `geocrop-models`: ML model `.pkl` files in the root directory.
|
||||
- `geocrop-baselines`: Dynamic World COGs (`dw/zim/summer/...`).
|
||||
- `geocrop-results`: Output COGs (`results/<job_id>/...`).
|
||||
- `geocrop-datasets`: Training CSV files.
|
||||
- `geocrop-models/`: ML model `.pkl` files and MLflow artifacts.
|
||||
- `geocrop-baselines/`: Dynamic World COGs (`dw/zim/summer/...`).
|
||||
- `geocrop-results/`: Output COGs (`results/<job_id>/...`).
|
||||
- `geocrop-datasets/`: Training CSVs and ground-truth labels.
|
||||
|
||||
## 📂 Key Files
|
||||
- `apps/api/main.py`: REST API entry point and job dispatcher.
|
||||
- `apps/worker/worker.py`: Core orchestration logic for the inference pipeline.
|
||||
- `apps/worker/feature_computation.py`: Implementation of the 51 spectral features.
|
||||
- `training/train.py`: Script for training and exporting ML models to MinIO.
|
||||
- `CLAUDE.md`: Primary guide for Claude Code development patterns.
|
||||
- `AGENTS.md`: Technical stack details and current cluster state.
|
||||
- `apps/web/src/App.tsx`: Main React entry point with Portfolio/App view logic.
|
||||
- `apps/worker/worker.py`: Core orchestration of the inference pipeline.
|
||||
- `k8s/base/`: GitOps manifests for all services (ArgoCD tracking root).
|
||||
- `k8s/argocd-app.yaml`: ArgoCD Application definition for GeoCrop.
|
||||
- `.gitea/workflows/build-push.yaml`: CI pipeline for Docker builds.
|
||||
|
||||
## 🌐 Infrastructure
|
||||
## 🌐 Infrastructure (Endpoints)
|
||||
- **Frontend**: `portfolio.techarvest.co.zw`
|
||||
- **API**: `api.portfolio.techarvest.co.zw`
|
||||
- **Gitea**: `git.techarvest.co.zw`
|
||||
- **ArgoCD**: `cd.techarvest.co.zw`
|
||||
- **MLflow**: `ml.techarvest.co.zw`
|
||||
- **Jupyter**: `lab.techarvest.co.zw`
|
||||
- **Tiler**: `tiles.portfolio.techarvest.co.zw`
|
||||
- **MinIO**: `minio.portfolio.techarvest.co.zw`
|
||||
- **Frontend**: `portfolio.techarvest.co.zw`
|
||||
|
|
|
|||