AGENTS.md

This file provides guidance to agents when working with code in this repository.

Project Stack

  • API: FastAPI + Redis + RQ job queue
  • Worker: Python 3.11, rasterio, scikit-learn, XGBoost, LightGBM, CatBoost
  • Storage: MinIO (S3-compatible) with signed URLs
  • K8s: Namespace geocrop, ingress class nginx, ClusterIssuer letsencrypt-prod

Build Commands

API

cd apps/api && pip install -r requirements.txt && uvicorn main:app --host 0.0.0.0 --port 8000

Worker

cd apps/worker && pip install -r requirements.txt && python worker.py

Training

cd training && python train.py --data /path/to/data.csv --out ./artifacts --variant Scaled

Docker Build

docker build -t frankchine/geocrop-api:v1 apps/api/
docker build -t frankchine/geocrop-worker:v1 apps/worker/

Critical Non-Obvious Patterns

Season Window (Sept → May, NOT Nov-Apr)

apps/worker/config.py:135-141 - Use InferenceConfig.season_dates(year, "summer") which returns Sept 1 to May 31 of following year.
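
As a rough illustration of the window described above, the following standalone sketch returns the two boundary dates. It is not the actual `InferenceConfig.season_dates` implementation; the free-function form and the error for unknown seasons are assumptions.

```python
from datetime import date


def season_dates(year: int, season: str = "summer") -> tuple[date, date]:
    """Return (start, end) for the season that begins in `year`."""
    if season != "summer":
        # Only the summer season is described in this document.
        raise ValueError(f"unsupported season: {season!r}")
    # Sept 1 of `year` through May 31 of the following year.
    return date(year, 9, 1), date(year + 1, 5, 31)


start, end = season_dates(2022)
print(start.isoformat(), end.isoformat())
```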

AOI Tuple Format (lon, lat, radius_m)

apps/worker/features.py:80 - AOI is (lon, lat, radius_m) NOT (lat, lon, radius).
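
To make the tuple order concrete, here is a minimal sketch that turns an AOI into a WGS84 bounding box. The helper name `aoi_to_bbox` and the spherical approximation (about 111.32 km per degree of latitude) are assumptions for illustration, not code from `features.py`.

```python
import math

METERS_PER_DEGREE = 111_320.0  # approximate length of one degree of latitude


def aoi_to_bbox(aoi):
    """Convert (lon, lat, radius_m) to (min_lon, min_lat, max_lon, max_lat)."""
    lon, lat, radius_m = aoi  # order is lon first, then lat
    dlat = radius_m / METERS_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = radius_m / (METERS_PER_DEGREE * math.cos(math.radians(lat)))
    return (lon - dlon, lat - dlat, lon + dlon, lat + dlat)


print(aoi_to_bbox((31.0, -17.8, 2000)))
```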

Redis Service Name

apps/api/main.py:18 - Use redis.geocrop.svc.cluster.local (Kubernetes DNS), NOT localhost.

RQ Queue Name

apps/api/main.py:20 - Queue name is geocrop_tasks.

Job Timeout

apps/api/main.py:96 - Job timeout is 25 minutes (job_timeout='25m').

Max Radius

apps/api/main.py:90 - Radius cannot exceed 5.0 km.

Zimbabwe Bounds (rough bbox)

apps/worker/features.py:97-98 - Lon: 25.2 to 33.1, Lat: -22.5 to -15.6.
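
The radius limit and the rough bounding box can be combined into a single guard. This is a sketch under the assumption that validation raises `ValueError`; the function name `validate_aoi` is hypothetical.

```python
ZIM_LON_RANGE = (25.2, 33.1)
ZIM_LAT_RANGE = (-22.5, -15.6)
MAX_RADIUS_M = 5000.0  # 5.0 km


def validate_aoi(aoi):
    """Check a (lon, lat, radius_m) tuple against the documented limits."""
    lon, lat, radius_m = aoi
    if not ZIM_LON_RANGE[0] <= lon <= ZIM_LON_RANGE[1]:
        raise ValueError(f"lon {lon} outside Zimbabwe bbox {ZIM_LON_RANGE}")
    if not ZIM_LAT_RANGE[0] <= lat <= ZIM_LAT_RANGE[1]:
        raise ValueError(f"lat {lat} outside Zimbabwe bbox {ZIM_LAT_RANGE}")
    if not 0 < radius_m <= MAX_RADIUS_M:
        raise ValueError(f"radius_m {radius_m} must be in (0, {MAX_RADIUS_M}]")
    return aoi


validate_aoi((31.0, -17.8, 2000))
```

A side effect of the bounds check is that a swapped `(lat, lon, radius)` tuple such as `(-17.8, 31.0, 2000)` is rejected, because -17.8 is not a valid longitude for Zimbabwe.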

Model Artifacts Expected

apps/worker/inference.py:66-70 - model.joblib, label_encoder.joblib, scaler.joblib (optional), selected_features.json.

DEA STAC Endpoint

apps/worker/config.py:147-148 - Use https://explorer.digitalearth.africa/stac/search.

Feature Names

apps/worker/features.py:221 - Currently: ["ndvi_peak", "evi_peak", "savi_peak"].

Majority Filter Kernel

apps/worker/features.py:254 - Must be odd (3, 5, 7).
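
The kernel must be odd so that the window has a center pixel. A minimal pure-Python sketch of a majority filter follows; the truncated window at the raster edge and the tie-breaking rule (first value encountered wins) are assumptions, and the real implementation in `features.py` may differ.

```python
from collections import Counter


def majority_filter(grid, kernel=3):
    """Replace each cell with the most common value in its kernel window."""
    if kernel < 3 or kernel % 2 == 0:
        raise ValueError("kernel must be odd (3, 5, 7)")
    half = kernel // 2
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            # Window is clipped at the raster edge.
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            out[r][c] = Counter(window).most_common(1)[0][0]
    return out


noisy = [[1, 1, 1], [1, 2, 1], [1, 1, 1]]
print(majority_filter(noisy, kernel=3))
```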

DW Baseline Filename Format

Plan/srs.md:173 - DW_Zim_HighestConf_YYYY_YYYY.tif

MinIO Buckets

  • geocrop-models - trained ML models
  • geocrop-results - output COGs
  • geocrop-baselines - DW baseline COGs
  • geocrop-datasets - training datasets

Current Kubernetes Cluster State (as of 2026-02-27)

Namespaces

  • geocrop - Main application namespace
  • cert-manager - Certificate management
  • ingress-nginx - Ingress controller
  • kubernetes-dashboard - Dashboard

Deployments (geocrop namespace)

| Deployment | Image | Status | Age |
|---|---|---|---|
| geocrop-api | frankchine/geocrop-api:v3 | Running (1/1) | 159m |
| geocrop-worker | frankchine/geocrop-worker:v2 | Running (1/1) | 86m |
| redis | redis:alpine | Running (1/1) | 25h |
| minio | minio/minio | Running (1/1) | 25h |
| hello-web | nginx | Running (1/1) | 25h |

Services (geocrop namespace)

| Service | Type | Cluster IP | Ports |
|---|---|---|---|
| geocrop-api | ClusterIP | 10.43.7.69 | 8000/TCP |
| geocrop-web | ClusterIP | 10.43.101.43 | 80/TCP |
| redis | ClusterIP | 10.43.15.14 | 6379/TCP |
| minio | ClusterIP | 10.43.71.8 | 9000/TCP, 9001/TCP |

Ingress (geocrop namespace)

| Ingress | Hosts | TLS | Backend |
|---|---|---|---|
| geocrop-web-api | portfolio.techarvest.co.zw, api.portfolio.techarvest.co.zw | geocrop-web-api-tls | geocrop-web:80, geocrop-api:8000 |
| geocrop-minio | minio.portfolio.techarvest.co.zw, console.minio.portfolio.techarvest.co.zw | minio-api-tls, minio-console-tls | minio:9000, minio:9001 |

Storage

  • MinIO PVC: 30Gi (local-path storage class), bound to pvc-44bf8a0f-cbc9-4336-aa54-edf1c4d0be86

TLS Certificates

  • ClusterIssuer: letsencrypt-prod (cert-manager)
  • All TLS certificates are managed by cert-manager with automatic renewal

STEP 0: Alignment Notes (Worker Implementation)

Current Mock Behavior (apps/worker/*)

| File | Current State | Gap |
|---|---|---|
| features.py | build_feature_stack_from_dea() returns placeholder zeros | CRITICAL - Need full DEA STAC loading + feature engineering |
| inference.py | Model loading with expected bundle format | Need to adapt to ROOT bucket format |
| config.py | MinIOStorage class exists | May need refinement for ROOT bucket access |
| worker.py | Mock handler returning fake results | Need full staged pipeline |

Training Pipeline Expectations (plan/original_training.py)

Feature Engineering (must match exactly):

  1. Smoothing: apply_smoothing() - Savitzky-Golay (window=5, polyorder=2) + linear interpolation of zeros
  2. Phenology: extract_phenology() - max, min, mean, std, amplitude, auc, peak_timestep, max_slope_up, max_slope_down
  3. Harmonics: add_harmonics() - harmonic1_sin/cos, harmonic2_sin/cos
  4. Windows: add_interactions_and_windows() - early/peak/late windows, interactions

Indices Computed:

  • ndvi, ndre, evi, savi, ci_re, ndwi

Junk Columns Dropped:

['.geo', 'system:index', 'latitude', 'longitude', 'lat', 'lon', 'ID', 'parent_id', 'batch_id', 'is_syn']

Model Storage Convention (FINAL)

Location: ROOT of geocrop-models bucket (no subfolders)

Exact Object Names:

geocrop-models/
├── Zimbabwe_XGBoost_Raw_Model.pkl
├── Zimbabwe_XGBoost_Model.pkl
├── Zimbabwe_RandomForest_Raw_Model.pkl
├── Zimbabwe_RandomForest_Model.pkl
├── Zimbabwe_LightGBM_Raw_Model.pkl
├── Zimbabwe_LightGBM_Model.pkl
├── Zimbabwe_Ensemble_Raw_Model.pkl
└── Zimbabwe_CatBoost_Raw_Model.pkl

Model Selection Logic:

| Job "model" value | MinIO filename | Scaler needed? |
|---|---|---|
| "Ensemble" | Zimbabwe_Ensemble_Raw_Model.pkl | No |
| "Ensemble_Raw" | Zimbabwe_Ensemble_Raw_Model.pkl | No |
| "Ensemble_Scaled" | Zimbabwe_Ensemble_Model.pkl | Yes |
| "RandomForest" | Zimbabwe_RandomForest_Model.pkl | Yes |
| "XGBoost" | Zimbabwe_XGBoost_Model.pkl | Yes |
| "LightGBM" | Zimbabwe_LightGBM_Model.pkl | Yes |
| "CatBoost" | Zimbabwe_CatBoost_Raw_Model.pkl | No |

Label Encoder Handling:

  • No separate label_encoder.joblib file exists
  • Labels encoded in model via model.classes_ attribute
  • Default classes (if not available): ["cropland_rainfed", "cropland_irrigated", "tree_crop", "grassland", "shrubland", "urban", "water", "bare"]
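
The selection table and the label handling above can be sketched as two small lookups. The names `MODEL_TABLE`, `select_model`, and `resolve_classes` are hypothetical; only the filenames, scaler flags, and default class list come from this section.

```python
from types import SimpleNamespace

# Job "model" value -> (MinIO filename, scaler needed?)
MODEL_TABLE = {
    "Ensemble": ("Zimbabwe_Ensemble_Raw_Model.pkl", False),
    "Ensemble_Raw": ("Zimbabwe_Ensemble_Raw_Model.pkl", False),
    "Ensemble_Scaled": ("Zimbabwe_Ensemble_Model.pkl", True),
    "RandomForest": ("Zimbabwe_RandomForest_Model.pkl", True),
    "XGBoost": ("Zimbabwe_XGBoost_Model.pkl", True),
    "LightGBM": ("Zimbabwe_LightGBM_Model.pkl", True),
    "CatBoost": ("Zimbabwe_CatBoost_Raw_Model.pkl", False),
}

DEFAULT_CLASSES = [
    "cropland_rainfed", "cropland_irrigated", "tree_crop", "grassland",
    "shrubland", "urban", "water", "bare",
]


def select_model(name):
    """Return (filename, needs_scaler) for a job's "model" value."""
    try:
        return MODEL_TABLE[name]
    except KeyError:
        raise ValueError(f"unknown model {name!r}") from None


def resolve_classes(model):
    """Prefer model.classes_; fall back to the default class list."""
    classes = getattr(model, "classes_", None)
    if classes is None:
        return list(DEFAULT_CLASSES)
    return [str(c) for c in classes]


print(select_model("XGBoost"))
print(resolve_classes(SimpleNamespace(classes_=["Maize", "Water"])))
```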

DEA STAC Configuration

| Setting | Value |
|---|---|
| STAC Root | https://explorer.digitalearth.africa/stac |
| STAC Search | https://explorer.digitalearth.africa/stac/search |
| Primary Collection | s2_l2a (Sentinel-2 L2A) |
| Required Bands | red, green, blue, nir, nir08 (red-edge), swir16, swir22 |
| Cloud Filter | eo:cloud_cover < 30% |
| Season Window | Sep 1 → May 31 (year → year+1) |
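
As an illustration of how these settings might map onto a STAC search request, the sketch below builds a JSON body for a POST to the search URL. It assumes the endpoint accepts the STAC API `query` extension for the cloud-cover filter; the function name and the `limit` value are hypothetical.

```python
def build_search_body(bbox, start, end, collection="s2_l2a", cloud_lt=30, limit=100):
    """Build a STAC item-search body for one AOI and season window.

    bbox is [min_lon, min_lat, max_lon, max_lat]; start/end are YYYY-MM-DD.
    """
    return {
        "collections": [collection],
        "bbox": list(bbox),
        # Closed interval in RFC 3339 form, as used by STAC item search.
        "datetime": f"{start}T00:00:00Z/{end}T23:59:59Z",
        # Assumption: the server supports the `query` extension.
        "query": {"eo:cloud_cover": {"lt": cloud_lt}},
        "limit": limit,
    }


body = build_search_body([30.9, -17.9, 31.1, -17.7], "2022-09-01", "2023-05-31")
print(body["datetime"])
```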

Dynamic World Baseline Layout

Bucket: geocrop-baselines

Path Pattern: dw/zim/summer/<season>/<type>/DW_Zim_<Type>_<year>_<year+1>.tif

Tile Format: COGs with 65536x65536 pixel tiles

  • Example: DW_Zim_HighestConf_2021_2022-0000000000-0000000000.tif

Results Layout

Bucket: geocrop-results

Path Pattern: results/<job_id>/<filename>

Output Files:

  • refined.tif - Main classification result
  • dw_baseline.tif - Clipped DW baseline (if requested)
  • truecolor.tif - RGB composite (if requested)
  • ndvi_peak.tif, evi_peak.tif, savi_peak.tif - Index peaks (if requested)

Job Payload Schema

{
  "job_id": "uuid",
  "user_id": "uuid",
  "lat": -17.8,
  "lon": 31.0,
  "radius_m": 2000,
  "year": 2022,
  "season": "summer",
  "model": "Ensemble",
  "smoothing_kernel": 5,
  "outputs": {
    "refined": true,
    "dw_baseline": false,
    "true_color": false,
    "indices": []
  }
}

Required Fields: job_id, lat, lon, radius_m, year

Defaults:

  • season: "summer"
  • model: "Ensemble"
  • smoothing_kernel: 5
  • outputs.refined: true
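
A minimal sketch of how the required fields and defaults above could be applied to an incoming payload. The function name `normalize_payload` and the use of `ValueError` are assumptions, not the worker's actual code.

```python
REQUIRED_FIELDS = ("job_id", "lat", "lon", "radius_m", "year")
DEFAULTS = {"season": "summer", "model": "Ensemble", "smoothing_kernel": 5}


def normalize_payload(payload):
    """Validate required fields and fill in documented defaults."""
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    job = {**DEFAULTS, **payload}
    # outputs.refined defaults to true even when other outputs are given.
    job["outputs"] = {"refined": True, **payload.get("outputs", {})}
    return job


job = normalize_payload(
    {"job_id": "uuid", "lat": -17.8, "lon": 31.0, "radius_m": 2000, "year": 2022}
)
print(job["model"], job["outputs"])
```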

Pipeline Stages

| Stage | Description |
|---|---|
| fetch_stac | Query DEA STAC for Sentinel-2 scenes |
| build_features | Load bands, compute indices, apply feature engineering |
| load_dw | Load and clip Dynamic World baseline |
| infer | Run ML model inference |
| smooth | Apply majority filter post-processing |
| export_cog | Write GeoTIFF as COG |
| upload | Upload to MinIO |
| done | Complete |

Environment Variables

| Variable | Default | Description |
|---|---|---|
| REDIS_HOST | redis.geocrop.svc.cluster.local | Redis service |
| MINIO_ENDPOINT | minio.geocrop.svc.cluster.local:9000 | MinIO service |
| MINIO_ACCESS_KEY | minioadmin | MinIO access key |
| MINIO_SECRET_KEY | minioadmin | MinIO secret key |
| MINIO_SECURE | false | Use HTTPS for MinIO |
| GEOCROP_CACHE_DIR | /tmp/geocrop-cache | Local cache directory |

Assumptions / TODOs

  1. EPSG: Default to UTM Zone 36S (EPSG:32736) for Zimbabwe - compute dynamically from AOI center in production
  2. Feature Names: Training uses selected features from LightGBM importance - may vary per model
  3. Label Encoder: No separate file - extract from model or use defaults
  4. Scaler: Only for non-Raw models; Raw models use unscaled features
  5. DW Tiles: Must handle 2x2 tile mosaicking for full AOI coverage

Worker Contracts (STEP 1)

Job Payload Contract

# Minimal required fields:
{
  "job_id": "uuid",
  "lat": -17.8,
  "lon": 31.0,
  "radius_m": 2000,  # max 5000m
  "year": 2022        # 2015-current
}

# Full with all options:
{
  "job_id": "uuid",
  "user_id": "uuid",  # optional
  "lat": -17.8,
  "lon": 31.0,
  "radius_m": 2000,
  "year": 2022,
  "season": "summer",  # default
  "model": "Ensemble",  # or RandomForest, XGBoost, LightGBM, CatBoost
  "smoothing_kernel": 5,  # 3, 5, or 7
  "outputs": {
    "refined": True,
    "dw_baseline": True,
    "true_color": True,
    "indices": ["ndvi_peak", "evi_peak", "savi_peak"]
  },
  "stac": {
    "cloud_cover_lt": 20,
    "max_items": 60
  }
}

Worker Stages

fetch_stac → build_features → load_dw → infer → smooth → export_cog → upload → done
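
One way to picture the stage sequence is a small runner that calls one handler per stage and reports progress. This is a sketch only: the `handlers` mapping, the shared `state` dict, and the `on_stage` callback are hypothetical, not the structure of `worker.py`.

```python
STAGES = [
    "fetch_stac", "build_features", "load_dw",
    "infer", "smooth", "export_cog", "upload",
]


def run_pipeline(handlers, job, on_stage=print):
    """Run each stage handler in order, reporting the stage name first."""
    state = {"job": job}
    for stage in STAGES:
        on_stage(stage)
        # Each handler takes the running state and returns the updated state.
        state = handlers[stage](state)
    on_stage("done")
    return state


# Usage with no-op handlers, just to show the reported order.
noop_handlers = {stage: (lambda state: state) for stage in STAGES}
run_pipeline(noop_handlers, {"job_id": "uuid"})
```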

Default Class List (TEMPORARY V1)

Until class handling is fully dynamic, use these classes (order matters if the model doesn't provide classes):

CLASSES_V1 = [
    "Avocado","Banana","Bare Surface","Blueberry","Built-Up","Cabbage","Chilli","Citrus","Cotton","Cowpea",
    "Finger Millet","Forest","Grassland","Groundnut","Macadamia","Maize","Pasture Legume","Pearl Millet",
    "Peas","Potato","Roundnut","Sesame","Shrubland","Sorghum","Soyabean","Sugarbean","Sugarcane","Sunflower",
    "Sunhem","Sweet Potato","Tea","Tobacco","Tomato","Water","Woodland"
]

Note: This is TEMPORARY - later we will extract class names dynamically from the trained model.


STEP 2: Storage Adapter (MinIO)

Environment Variables

| Variable | Default | Description |
|---|---|---|
| MINIO_ENDPOINT | minio.geocrop.svc.cluster.local:9000 | MinIO service |
| MINIO_ACCESS_KEY | minioadmin | MinIO access key |
| MINIO_SECRET_KEY | minioadmin123 | MinIO secret key |
| MINIO_SECURE | false | Use HTTPS for MinIO |
| MINIO_REGION | us-east-1 | AWS region |
| MINIO_BUCKET_MODELS | geocrop-models | Models bucket |
| MINIO_BUCKET_BASELINES | geocrop-baselines | Baselines bucket |
| MINIO_BUCKET_RESULTS | geocrop-results | Results bucket |

Bucket/Key Conventions

  • Models: ROOT of geocrop-models (no subfolders)
  • DW Baselines: geocrop-baselines/dw/zim/summer/<season>/<type>/DW_Zim_<Type>_<year>_<year+1>.tif
  • Results: geocrop-results/results/<job_id>/<filename>
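
The key conventions above can be captured in two small builders. The function names are hypothetical, and the `<season>` and `<type>` directory names are passed in as-is because this section does not enumerate their allowed values.

```python
def dw_baseline_key(season_dir, type_dir, dw_type, year):
    """Object key inside geocrop-baselines for one DW baseline."""
    return (
        f"dw/zim/summer/{season_dir}/{type_dir}/"
        f"DW_Zim_{dw_type}_{year}_{year + 1}.tif"
    )


def result_key(job_id, filename):
    """Object key inside geocrop-results for one job output."""
    return f"results/{job_id}/{filename}"


print(dw_baseline_key("summer", "highest", "HighestConf", 2021))
print(result_key("1234", "refined.tif"))
```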

Model Filename Mapping

| Job model value | Primary filename | Fallback |
|---|---|---|
| "Ensemble" | Zimbabwe_Ensemble_Model.pkl | Zimbabwe_Ensemble_Raw_Model.pkl |
| "RandomForest" | Zimbabwe_RandomForest_Model.pkl | Zimbabwe_RandomForest_Raw_Model.pkl |
| "XGBoost" | Zimbabwe_XGBoost_Model.pkl | Zimbabwe_XGBoost_Raw_Model.pkl |
| "LightGBM" | Zimbabwe_LightGBM_Model.pkl | Zimbabwe_LightGBM_Raw_Model.pkl |
| "CatBoost" | Zimbabwe_CatBoost_Model.pkl | Zimbabwe_CatBoost_Raw_Model.pkl |

Methods

  • ping() → (bool, str): Check MinIO connectivity
  • head_object(bucket, key) → dict|None: Get object metadata
  • list_objects(bucket, prefix) → list[str]: List object keys
  • download_file(bucket, key, dest_path) → Path: Download file
  • download_model_file(model_name, dest_dir) → Path: Download model with fallback
  • upload_file(bucket, key, local_path) → str: Upload file, returns s3:// URI
  • upload_result(job_id, local_path, filename) → (s3_uri, key): Upload result
  • presign_get(bucket, key, expires) → str: Generate presigned URL
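
The primary/fallback behaviour behind download_model_file can be sketched as a pure key-resolution step. The helper name `resolve_model_key` is hypothetical, and `existing_keys` stands in for the result of `list_objects` on the models bucket.

```python
def resolve_model_key(model_name, existing_keys):
    """Pick the primary model filename if present, else the Raw fallback."""
    primary = f"Zimbabwe_{model_name}_Model.pkl"
    fallback = f"Zimbabwe_{model_name}_Raw_Model.pkl"
    for key in (primary, fallback):
        if key in existing_keys:
            return key
    raise FileNotFoundError(f"neither {primary} nor {fallback} found in bucket")


bucket_keys = {
    "Zimbabwe_XGBoost_Model.pkl",
    "Zimbabwe_XGBoost_Raw_Model.pkl",
    "Zimbabwe_CatBoost_Raw_Model.pkl",
}
print(resolve_model_key("XGBoost", bucket_keys))
print(resolve_model_key("CatBoost", bucket_keys))
```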

STEP 3: STAC Client (DEA)

Environment Variables

| Variable | Default | Description |
|---|---|---|
| DEA_STAC_ROOT | https://explorer.digitalearth.africa/stac | STAC root URL |
| DEA_STAC_SEARCH | https://explorer.digitalearth.africa/stac/search | STAC search URL |
| DEA_CLOUD_MAX | 30 | Cloud cover filter (percent) |
| DEA_TIMEOUT_S | 30 | Request timeout (seconds) |

Collection Resolution

Preferred Sentinel-2 collection IDs (in order):

  1. s2_l2a
  2. s2_l2a_c1
  3. sentinel-2-l2a
  4. sentinel_2_l2a

If none found, raises ValueError with available collections.
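
A minimal sketch of this resolution order, assuming the list of available collection IDs has already been fetched. It follows the raising behaviour described in the sentence above; the standalone function form is an assumption.

```python
PREFERRED_S2_COLLECTIONS = ["s2_l2a", "s2_l2a_c1", "sentinel-2-l2a", "sentinel_2_l2a"]


def resolve_s2_collection(available):
    """Return the first preferred Sentinel-2 collection that is available."""
    available_set = set(available)
    for collection_id in PREFERRED_S2_COLLECTIONS:
        if collection_id in available_set:
            return collection_id
    raise ValueError(
        f"no Sentinel-2 collection found; available: {sorted(available_set)}"
    )


print(resolve_s2_collection(["ls8_sr", "sentinel-2-l2a", "s2_l2a_c1"]))
```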

Methods

  • list_collections() → list[str]: List available collections
  • resolve_s2_collection() → str|None: Resolve best S2 collection
  • search_items(bbox, start_date, end_date) → list[pystac.Item]: Search for items
  • summarize_items(items) → dict: Summarize search results without downloading

summarize_items() Output Structure

{
    "count": int,
    "collection": str,
    "time_start": "ISO datetime",
    "time_end": "ISO datetime",
    "items": [
        {
            "id": str,
            "datetime": "ISO datetime",
            "bbox": [minx, miny, maxx, maxy],
            "cloud_cover": float|None,
            "assets": {
                "red": {"href": str, "type": str, "roles": list},
                ...
            }
        }, ...
    ]
}

Note: stackstac loading is NOT implemented in this step. It will come in Step 4/5.


STEP 4A: Feature Computation (Math)

Features Produced

Base indices (time-series):

  • ndvi, ndre, evi, savi, ci_re, ndwi

Smoothed time-series:

  • For every index above, Savitzky-Golay smoothing (window=5, polyorder=2)
  • Suffix: *_smooth

Phenology metrics (computed across time for NDVI, NDRE, EVI):

  • _max, _min, _mean, _std, _amplitude, _auc, _peak_timestep, _max_slope_up, _max_slope_down

Harmonic features (for NDVI only):

  • ndvi_harmonic1_sin, ndvi_harmonic1_cos, ndvi_harmonic2_sin, ndvi_harmonic2_cos

Interaction features:

  • ndvi_ndre_peak_diff = ndvi_max - ndre_max
  • canopy_density_contrast = evi_mean / (ndvi_mean + 0.001)

Smoothing Approach

  1. fill_zeros_linear: Treats 0 as missing and linearly interpolates between non-zero neighbors
  2. savgol_smooth_1d: Uses scipy.signal.savgol_filter if available, falls back to simple moving average
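
The two steps can be sketched in pure Python as below. This shows only the zero-filling and the moving-average fallback, not the Savitzky-Golay path; holding the nearest non-zero value at the series edges and shrinking the averaging window at the boundaries are assumptions.

```python
def fill_zeros_linear(y):
    """Treat 0 as missing and interpolate between non-zero neighbors."""
    y = [float(v) for v in y]
    known = [i for i, v in enumerate(y) if v != 0.0]
    if not known:
        return y  # nothing to interpolate from
    out = y[:]
    for i, v in enumerate(y):
        if v != 0.0:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = y[right]  # leading gap: hold first known value
        elif right is None:
            out[i] = y[left]  # trailing gap: hold last known value
        else:
            w = (i - left) / (right - left)
            out[i] = y[left] + w * (y[right] - y[left])
    return out


def moving_average(y, window=5):
    """Simple centered moving average used when SciPy is unavailable."""
    half = window // 2
    out = []
    for i in range(len(y)):
        chunk = y[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


print(moving_average(fill_zeros_linear([0.2, 0, 0, 0.8, 0.6]), window=5))
```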

Phenology Metrics Definitions

| Metric | Formula |
|---|---|
| max | np.max(y) |
| min | np.min(y) |
| mean | np.mean(y) |
| std | np.std(y) |
| amplitude | max - min |
| auc | trapezoidal integral (dx=10 days) |
| peak_timestep | argmax(y) |
| max_slope_up | max(diff(y)) |
| max_slope_down | min(diff(y)) |
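
For reference, the metric definitions above can be written out without NumPy as follows. This is a sketch for a single 1-D series with at least two timesteps; the function name is hypothetical, and the population standard deviation is used because that is what `np.std` computes by default.

```python
import statistics


def phenology_metrics(y, dx=10.0):
    """Compute the nine phenology metrics for one time series."""
    diffs = [b - a for a, b in zip(y, y[1:])]
    # Trapezoidal rule with a fixed spacing of dx days.
    auc = sum((a + b) / 2.0 * dx for a, b in zip(y, y[1:]))
    return {
        "max": max(y),
        "min": min(y),
        "mean": statistics.fmean(y),
        "std": statistics.pstdev(y),
        "amplitude": max(y) - min(y),
        "auc": auc,
        "peak_timestep": y.index(max(y)),  # first index of the maximum
        "max_slope_up": max(diffs),
        "max_slope_down": min(diffs),
    }


print(phenology_metrics([0.0, 1.0, 3.0, 2.0]))
```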

Harmonic Coefficient Definition

For normalized time t = 2πk/N:

  • h1_sin = mean(y * sin(t))
  • h1_cos = mean(y * cos(t))
  • h2_sin = mean(y * sin(2t))
  • h2_cos = mean(y * cos(2t))
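
The coefficient definitions above translate directly into code. In this sketch the timestep index k is assumed to run from 0 to N-1, which the text does not state explicitly; the function name is hypothetical.

```python
import math


def harmonic_coefficients(y):
    """Project a series onto the first two sine/cosine harmonics."""
    n = len(y)
    t = [2.0 * math.pi * k / n for k in range(n)]  # assumption: k = 0..N-1

    def mean(values):
        return sum(values) / n

    return {
        "h1_sin": mean([v * math.sin(a) for v, a in zip(y, t)]),
        "h1_cos": mean([v * math.cos(a) for v, a in zip(y, t)]),
        "h2_sin": mean([v * math.sin(2 * a) for v, a in zip(y, t)]),
        "h2_cos": mean([v * math.cos(2 * a) for v, a in zip(y, t)]),
    }


# A pure first-harmonic sine wave loads only on h1_sin.
series = [math.sin(2.0 * math.pi * k / 8) for k in range(8)]
print(harmonic_coefficients(series))
```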

Note

Step 4B will add seasonal window summaries and final feature vector ordering.


STEP 4B: Window Summaries + Feature Order

Seasonal Window Features (18 features)

Season window is Oct-Jun, split into:

  • Early: Oct-Dec
  • Peak: Jan-Mar
  • Late: Apr-Jun

For each window, computed for NDVI, NDWI, NDRE:

  • <index>_<window>_mean
  • <index>_<window>_max

Total: 3 indices × 3 windows × 2 stats = 18 features

Feature Ordering (FEATURE_ORDER_V1)

51 scalar features in order:

  1. Phenology metrics (27): ndvi, ndre, evi (each with max, min, mean, std, amplitude, auc, peak_timestep, max_slope_up, max_slope_down)
  2. Harmonics (4): ndvi_harmonic1_sin/cos, ndvi_harmonic2_sin/cos
  3. Interactions (2): ndvi_ndre_peak_diff, canopy_density_contrast
  4. Window summaries (18): ndvi/ndwi/ndre × early/peak/late × mean/max

Note: Additional smoothed array features (*_smooth) are not in FEATURE_ORDER_V1 since they are arrays, not scalars.
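
The count of 51 can be checked by generating the names group by group. The sketch below follows the group order listed above; the nesting order inside each group (index, then window, then statistic) is an assumption, and the authoritative order is whatever FEATURE_ORDER_V1 defines in code.

```python
PHENO_INDICES = ["ndvi", "ndre", "evi"]
PHENO_STATS = [
    "max", "min", "mean", "std", "amplitude", "auc",
    "peak_timestep", "max_slope_up", "max_slope_down",
]
HARMONICS = [
    "ndvi_harmonic1_sin", "ndvi_harmonic1_cos",
    "ndvi_harmonic2_sin", "ndvi_harmonic2_cos",
]
INTERACTIONS = ["ndvi_ndre_peak_diff", "canopy_density_contrast"]
WINDOW_INDICES = ["ndvi", "ndwi", "ndre"]
WINDOWS = ["early", "peak", "late"]
WINDOW_STATS = ["mean", "max"]


def build_feature_order():
    """Assemble scalar feature names in the documented group order."""
    names = [f"{idx}_{stat}" for idx in PHENO_INDICES for stat in PHENO_STATS]
    names += HARMONICS
    names += INTERACTIONS
    names += [
        f"{idx}_{win}_{stat}"
        for idx in WINDOW_INDICES
        for win in WINDOWS
        for stat in WINDOW_STATS
    ]
    return names


order = build_feature_order()
print(len(order))  # 27 + 4 + 2 + 18
```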

Window Splitting Logic

  • If dates provided: Use month membership (10,11,12 = early; 1,2,3 = peak; 4,5,6 = late)
  • Fallback: Positional split (first 9 steps = early, next 9 = peak, next 9 = late)
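
Both splitting modes can be sketched as a function that returns timestep indices per window. The function name is hypothetical, and the choice to leave timesteps from months outside Oct-Jun unassigned is an assumption.

```python
from datetime import date

MONTH_TO_WINDOW = {
    10: "early", 11: "early", 12: "early",
    1: "peak", 2: "peak", 3: "peak",
    4: "late", 5: "late", 6: "late",
}


def split_windows(n_steps, dates=None):
    """Return {window: [timestep indices]} by month or by position."""
    windows = {"early": [], "peak": [], "late": []}
    if dates is not None:
        for i, d in enumerate(dates):
            name = MONTH_TO_WINDOW.get(d.month)
            if name is not None:  # months outside Oct-Jun are skipped
                windows[name].append(i)
    else:
        # Positional fallback: three blocks of 9 steps each.
        for i in range(min(n_steps, 27)):
            windows[("early", "peak", "late")[i // 9]].append(i)
    return windows


print(split_windows(27)["peak"])
print(split_windows(3, [date(2022, 10, 5), date(2023, 1, 10), date(2023, 4, 1)]))
```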

STEP 5: DW Baseline Loading

DW Object Layout

Bucket: geocrop-baselines

Prefix: dw/zim/summer/

Path Pattern: dw/zim/summer/<season>/<type>/DW_Zim_<Type>_<year>_<year+1>.tif

Tile Naming: COGs with 65536x65536 pixel tiles

  • Example: DW_Zim_HighestConf_2021_2022-0000000000-0000000000.tif
  • Format: DW_Zim_{Type}_{Year}_{Year+1}-{TileRow}-{TileCol}.tif

DW Types

  • HighestConf - Highest confidence class
  • Agreement - Class agreement across predictions
  • Mode - Most common class

Windowed Reads

The worker MUST use windowed reads to avoid downloading entire huge COG tiles:

  1. Presigned URL: Get temporary URL via storage.presign_get(bucket, key, expires=3600)
  2. AOI Transform: Convert AOI bbox from WGS84 to tile CRS using rasterio.warp.transform_bounds
  3. Window Creation: Use rasterio.windows.from_bounds to compute window from transformed bbox
  4. Selective Read: Call src.read(window=window) to read only the needed portion
  5. Mosaic: If multiple tiles needed, read each window and mosaic into single array

CRS Handling

  • DW tiles may be in EPSG:3857 (Web Mercator) or UTM - do NOT assume
  • Always transform AOI bbox to tile CRS before computing window
  • Output profile uses tile's native CRS

Error Handling

  • If no matching tiles found: Raise FileNotFoundError with searched prefix
  • If window read fails: Retry 3x with exponential backoff
  • Nodata value: 0 (preserved from DW)
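
The retry rule can be sketched as a small wrapper around whatever callable performs the windowed read. Treating "3x" as three attempts in total, catching `OSError`, and the one-second base delay are assumptions; `read_fn` is a hypothetical zero-argument callable.

```python
import time


def read_with_retry(read_fn, max_retries=3, base_delay_s=1.0, sleep=time.sleep):
    """Call read_fn, retrying on OSError with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return read_fn()
        except OSError:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
            # Delays grow as base, 2*base, 4*base, ...
            sleep(base_delay_s * 2 ** attempt)


print(read_with_retry(lambda: "window data"))
```

Passing `sleep` as a parameter keeps the wrapper testable without real waiting.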

Primary Function

from typing import List, Tuple

import numpy as np

def load_dw_baseline_window(
    storage,
    year: int,
    # No default, so it must precede the defaulted parameters to be valid Python.
    aoi_bbox_wgs84: List[float],  # [min_lon, min_lat, max_lon, max_lat]
    season: str = "summer",
    dw_type: str = "HighestConf",
    bucket: str = "geocrop-baselines",
    max_retries: int = 3,
) -> Tuple[np.ndarray, dict]:
    """Load DW baseline clipped to AOI window from MinIO.

    Returns:
        dw_arr: uint8 or int16 raster clipped to AOI
        profile: rasterio profile for writing outputs aligned to this window
    """

Plan 02 - Step 1: TiTiler Deployment+Service

Files Changed

  • Created: k8s/25-tiler.yaml
  • Created: Kubernetes Secret geocrop-secrets with MinIO credentials

Commands Run

kubectl create secret generic geocrop-secrets -n geocrop --from-literal=minio-access-key=minioadmin --from-literal=minio-secret-key=minioadmin123
kubectl -n geocrop apply -f k8s/25-tiler.yaml
kubectl -n geocrop get deploy,svc | grep geocrop-tiler

Expected Output / Acceptance Criteria

  • kubectl -n geocrop apply -f k8s/25-tiler.yaml succeeds (syntax correct)
  • Creates Deployment geocrop-tiler with 2 replicas
  • Creates Service geocrop-tiler (ClusterIP on port 8000 → container port 80)
  • TiTiler container reads COGs from MinIO via S3
  • Pods are Running and Ready (1/1)

Actual Output

deployment.apps/geocrop-tiler    2/2     2            2           2m
service/geocrop-tiler   ClusterIP   10.43.47.225   <none>        8000/TCP            2m

TiTiler Environment Variables

| Variable | Value |
|---|---|
| AWS_ACCESS_KEY_ID | from secret geocrop-secrets |
| AWS_SECRET_ACCESS_KEY | from secret geocrop-secrets |
| AWS_REGION | us-east-1 |
| AWS_S3_ENDPOINT_URL | http://minio.geocrop.svc.cluster.local:9000 |
| AWS_HTTPS | NO |
| TILED_READER | cog |

Notes

  • Container listens on port 80 (not 8000) - service maps 8000 → 80
  • Health probe path /healthz on port 80
  • Secret geocrop-secrets created for MinIO credentials

Next Step

  • Step 2: Add Ingress for TiTiler (with TLS)

Plan 02 - Step 2: TiTiler Ingress

Files Changed

  • k8s/26-tiler-ingress.yaml

Commands Run

kubectl -n geocrop apply -f k8s/26-tiler-ingress.yaml
kubectl -n geocrop get ingress geocrop-tiler -o wide
kubectl -n geocrop describe ingress geocrop-tiler

Expected Output / Acceptance Criteria

  • Ingress object created with host tiles.portfolio.techarvest.co.zw
  • TLS certificate will be pending until DNS A record is pointed to ingress IP

Actual Output

NAME            CLASS   HOSTS                              ADDRESS        PORTS     AGE
geocrop-tiler   nginx   tiles.portfolio.techarvest.co.zw   167.86.68.48   80, 443   30s

Ingress Details

  • Host: tiles.portfolio.techarvest.co.zw
  • Backend: geocrop-tiler:8000
  • TLS: geocrop-tiler-tls (cert-manager with letsencrypt-prod)
  • Annotations: nginx.ingress.kubernetes.io/proxy-body-size: "50m"

DNS Requirement

External DNS A record must point to ingress IP (167.86.68.48):

  • tiles.portfolio.techarvest.co.zw → 167.86.68.48

Plan 02 - Step 3: TiTiler Smoke Test

Commands Run

kubectl -n geocrop port-forward svc/geocrop-tiler 8000:8000 &
curl -sS http://127.0.0.1:8000/ | head
curl -sS -o /dev/null -w "%{http_code}\n" http://127.0.0.1:8000/healthz

Test Results

| Endpoint | Status | Notes |
|---|---|---|
| / | 200 | Landing page JSON returned |
| /healthz | 200 | Health check passes |
| /api | 200 | OpenAPI docs available |

Final Probe Path

  • Confirmed: /healthz on port 80 works correctly
  • No manifest changes needed

Plan 02 - Step 4: MinIO S3 Access Test

Commands Run

# With correct credentials (minioadmin/minioadmin123)
curl -sS "http://127.0.0.1:8000/cog/info?url=s3://geocrop-baselines/dw/zim/summer/summer/highest/DW_Zim_HighestConf_2016_2017-0000000000-0000000000.tif"

Test Results

| Test | Result | Notes |
|---|---|---|
| S3 Access | Failed | Error: "The AWS Access Key Id you provided does not exist in our records" |

Issue Analysis

  • MinIO credentials used: minioadmin / minioadmin123
  • The root user is minioadmin with password minioadmin123
  • TiTiler pods have correct env vars set (verified via kubectl exec)
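  • The error text matches S3's InvalidAccessKeyId response, which means the service that received the request did not recognize the access key; a missing bucket or wrong path would normally surface as NoSuchBucket or NoSuchKey instead
  • Issue may be: (1) the request is not reaching MinIO (GDAL's /vsis3/ reader takes its endpoint from AWS_S3_ENDPOINT as host:port, so AWS_S3_ENDPOINT_URL may be ignored), (2) the credentials in geocrop-secrets do not match the MinIO root credentials, (3) bucket not created or bucket path incorrect, or (4) network policy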

Environment Variables (Verified Set in Pods)

| Variable | Value |
|---|---|
| AWS_ACCESS_KEY_ID | minioadmin |
| AWS_SECRET_ACCESS_KEY | minioadmin123 |
| AWS_S3_ENDPOINT_URL | http://minio.geocrop.svc.cluster.local:9000 |
| AWS_HTTPS | NO |
| AWS_REGION | us-east-1 |

Next Step

  • Verify bucket exists in MinIO
  • Check bucket naming convention in MinIO console
  • Or upload test COG to verify S3 access