# AGENTS.md
This file provides guidance to agents when working with code in this repository.
## Project Stack
- **API**: FastAPI + Redis + RQ job queue
- **Worker**: Python 3.11, rasterio, scikit-learn, XGBoost, LightGBM, CatBoost
- **Storage**: MinIO (S3-compatible) with signed URLs
- **K8s**: Namespace `geocrop`, ingress class `nginx`, ClusterIssuer `letsencrypt-prod`
## Build Commands
### API
```bash
cd apps/api && pip install -r requirements.txt && uvicorn main:app --host 0.0.0.0 --port 8000
```
### Worker
```bash
cd apps/worker && pip install -r requirements.txt && python worker.py
```
### Training
```bash
cd training && python train.py --data /path/to/data.csv --out ./artifacts --variant Scaled
```
### Docker Build
```bash
docker build -t frankchine/geocrop-api:v1 apps/api/
docker build -t frankchine/geocrop-worker:v1 apps/worker/
```
## Critical Non-Obvious Patterns
### Season Window (Sept → May, NOT Nov-Apr)
[`apps/worker/config.py:135-141`](apps/worker/config.py:135) - Use `InferenceConfig.season_dates(year, "summer")` which returns Sept 1 to May 31 of following year.
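As a minimal sketch of the window described above, assuming `year` is the year in which the season starts: the standalone function below is hypothetical and only mirrors the documented behaviour of `InferenceConfig.season_dates`; it is not the code in `config.py`.

```python
from datetime import date
from typing import Tuple


def season_dates(year: int, season: str = "summer") -> Tuple[date, date]:
    """Return (start, end) for the season that begins in `year`."""
    if season != "summer":
        # Only the summer season is described in this file.
        raise ValueError(f"unsupported season: {season!r}")
    # Sept 1 of the start year to May 31 of the following year.
    return date(year, 9, 1), date(year + 1, 5, 31)


start, end = season_dates(2021)
print(start.isoformat(), end.isoformat())  # 2021-09-01 2022-05-31
```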
### AOI Tuple Format (lon, lat, radius_m)
[`apps/worker/features.py:80`](apps/worker/features.py:80) - AOI is `(lon, lat, radius_m)` NOT `(lat, lon, radius)`.
### Redis Service Name
[`apps/api/main.py:18`](apps/api/main.py:18) - Use `redis.geocrop.svc.cluster.local` (Kubernetes DNS), NOT `localhost`.
### RQ Queue Name
[`apps/api/main.py:20`](apps/api/main.py:20) - Queue name is `geocrop_tasks`.
### Job Timeout
[`apps/api/main.py:96`](apps/api/main.py:96) - Job timeout is 25 minutes (`job_timeout='25m'`).
### Max Radius
[`apps/api/main.py:90`](apps/api/main.py:90) - Radius cannot exceed 5.0 km.
### Zimbabwe Bounds (rough bbox)
[`apps/worker/features.py:97-98`](apps/worker/features.py:97) - Lon: 25.2 to 33.1, Lat: -22.5 to -15.6.
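The AOI ordering, radius limit, and rough bounding box can be checked together. The sketch below is illustrative only: `validate_aoi` is a hypothetical helper, and the constants are copied from the values listed in this file.

```python
from typing import Tuple

# Rough Zimbabwe bounding box and radius limit, as listed in this file.
LON_MIN, LON_MAX = 25.2, 33.1
LAT_MIN, LAT_MAX = -22.5, -15.6
MAX_RADIUS_M = 5000.0


def validate_aoi(aoi: Tuple[float, float, float]) -> Tuple[float, float, float]:
    """Check an AOI given as (lon, lat, radius_m); raise ValueError if invalid."""
    lon, lat, radius_m = aoi
    if not (LON_MIN <= lon <= LON_MAX and LAT_MIN <= lat <= LAT_MAX):
        raise ValueError(f"AOI center ({lon}, {lat}) is outside the Zimbabwe bbox")
    if not (0 < radius_m <= MAX_RADIUS_M):
        raise ValueError(f"radius_m must be in (0, {MAX_RADIUS_M}], got {radius_m}")
    return lon, lat, radius_m
```

A swapped tuple such as `(-17.8, 31.0, 2000)` fails the bounds check, which makes the `(lat, lon)` mistake visible early.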
### Model Artifacts Expected
[`apps/worker/inference.py:66-70`](apps/worker/inference.py:66) - `model.joblib`, `label_encoder.joblib`, `scaler.joblib` (optional), `selected_features.json`.
### DEA STAC Endpoint
[`apps/worker/config.py:147-148`](apps/worker/config.py:147) - Use `https://explorer.digitalearth.africa/stac/search`.
### Feature Names
[`apps/worker/features.py:221`](apps/worker/features.py:221) - Currently: `["ndvi_peak", "evi_peak", "savi_peak"]`.
### Majority Filter Kernel
[`apps/worker/features.py:254`](apps/worker/features.py:254) - Must be odd (3, 5, 7).
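The kernel must be odd so that the window has a single center pixel. A minimal pure-Python sketch of a majority filter follows; it is not the implementation in `features.py`, and the edge handling (clipped windows) and tie-breaking (first class encountered) are assumptions of this example.

```python
from collections import Counter
from typing import List


def majority_filter(grid: List[List[int]], kernel: int = 3) -> List[List[int]]:
    """Replace each cell with the most common class in its kernel x kernel window."""
    if kernel < 3 or kernel % 2 == 0:
        raise ValueError("kernel must be an odd integer >= 3")
    half = kernel // 2
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            # Windows are clipped at the raster edge (assumption of this sketch).
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            out[r][c] = Counter(window).most_common(1)[0][0]
    return out
```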
### DW Baseline Filename Format
[`Plan/srs.md:173`](Plan/srs.md:173) - `DW_Zim_HighestConf_YYYY_YYYY.tif`
### MinIO Buckets
- `geocrop-models` - trained ML models
- `geocrop-results` - output COGs
- `geocrop-baselines` - DW baseline COGs
- `geocrop-datasets` - training datasets
## Current Kubernetes Cluster State (as of 2026-02-27)
### Namespaces
- `geocrop` - Main application namespace
- `cert-manager` - Certificate management
- `ingress-nginx` - Ingress controller
- `kubernetes-dashboard` - Dashboard
### Deployments (geocrop namespace)
| Deployment | Image | Status | Age |
|------------|-------|--------|-----|
| geocrop-api | frankchine/geocrop-api:v3 | Running (1/1) | 159m |
| geocrop-worker | frankchine/geocrop-worker:v2 | Running (1/1) | 86m |
| redis | redis:alpine | Running (1/1) | 25h |
| minio | minio/minio | Running (1/1) | 25h |
| hello-web | nginx | Running (1/1) | 25h |
### Services (geocrop namespace)
| Service | Type | Cluster IP | Ports |
|---------|------|------------|-------|
| geocrop-api | ClusterIP | 10.43.7.69 | 8000/TCP |
| geocrop-web | ClusterIP | 10.43.101.43 | 80/TCP |
| redis | ClusterIP | 10.43.15.14 | 6379/TCP |
| minio | ClusterIP | 10.43.71.8 | 9000/TCP, 9001/TCP |
### Ingress (geocrop namespace)
| Ingress | Hosts | TLS | Backend |
|---------|-------|-----|---------|
| geocrop-web-api | portfolio.techarvest.co.zw, api.portfolio.techarvest.co.zw | geocrop-web-api-tls | geocrop-web:80, geocrop-api:8000 |
| geocrop-minio | minio.portfolio.techarvest.co.zw, console.minio.portfolio.techarvest.co.zw | minio-api-tls, minio-console-tls | minio:9000, minio:9001 |
### Storage
- MinIO PVC: 30Gi (local-path storage class), bound to pvc-44bf8a0f-cbc9-4336-aa54-edf1c4d0be86
### TLS Certificates
- ClusterIssuer: letsencrypt-prod (cert-manager)
- All TLS certificates are managed by cert-manager with automatic renewal
---
## STEP 0: Alignment Notes (Worker Implementation)
### Current Mock Behavior (apps/worker/*)
| File | Current State | Gap |
|------|--------------|-----|
| `features.py` | [`build_feature_stack_from_dea()`](apps/worker/features.py:193) returns placeholder zeros | **CRITICAL** - Need full DEA STAC loading + feature engineering |
| `inference.py` | Model loading with expected bundle format | Need to adapt to ROOT bucket format |
| `config.py` | [`MinIOStorage`](apps/worker/config.py:130) class exists | May need refinement for ROOT bucket access |
| `worker.py` | Mock handler returning fake results | Need full staged pipeline |
### Training Pipeline Expectations (plan/original_training.py)
#### Feature Engineering (must match exactly):
1. **Smoothing**: [`apply_smoothing()`](plan/original_training.py:69) - Savitzky-Golay (window=5, polyorder=2) + linear interpolation of zeros
2. **Phenology**: [`extract_phenology()`](plan/original_training.py:101) - max, min, mean, std, amplitude, auc, peak_timestep, max_slope_up, max_slope_down
3. **Harmonics**: [`add_harmonics()`](plan/original_training.py:141) - harmonic1_sin/cos, harmonic2_sin/cos
4. **Windows**: [`add_interactions_and_windows()`](plan/original_training.py:177) - early/peak/late windows, interactions
#### Indices Computed:
- ndvi, ndre, evi, savi, ci_re, ndwi
#### Junk Columns Dropped:
```python
['.geo', 'system:index', 'latitude', 'longitude', 'lat', 'lon', 'ID', 'parent_id', 'batch_id', 'is_syn']
```
### Model Storage Convention (FINAL)
**Location**: ROOT of `geocrop-models` bucket (no subfolders)
**Exact Object Names**:
```
geocrop-models/
├── Zimbabwe_XGBoost_Raw_Model.pkl
├── Zimbabwe_XGBoost_Model.pkl
├── Zimbabwe_RandomForest_Raw_Model.pkl
├── Zimbabwe_RandomForest_Model.pkl
├── Zimbabwe_LightGBM_Raw_Model.pkl
├── Zimbabwe_LightGBM_Model.pkl
├── Zimbabwe_Ensemble_Raw_Model.pkl
└── Zimbabwe_CatBoost_Raw_Model.pkl
```
**Model Selection Logic**:
| Job "model" value | MinIO filename | Scaler needed? |
|-------------------|---------------|----------------|
| "Ensemble" | Zimbabwe_Ensemble_Raw_Model.pkl | No |
| "Ensemble_Raw" | Zimbabwe_Ensemble_Raw_Model.pkl | No |
| "Ensemble_Scaled" | Zimbabwe_Ensemble_Model.pkl | Yes |
| "RandomForest" | Zimbabwe_RandomForest_Model.pkl | Yes |
| "XGBoost" | Zimbabwe_XGBoost_Model.pkl | Yes |
| "LightGBM" | Zimbabwe_LightGBM_Model.pkl | Yes |
| "CatBoost" | Zimbabwe_CatBoost_Raw_Model.pkl | No |
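The selection table above can be expressed as a lookup. This is a sketch: `MODEL_TABLE` and `select_model` are hypothetical names, and the entries are copied verbatim from the table.

```python
from typing import Dict, Tuple

# (MinIO filename, scaler needed?) per job "model" value, copied from the table.
MODEL_TABLE: Dict[str, Tuple[str, bool]] = {
    "Ensemble": ("Zimbabwe_Ensemble_Raw_Model.pkl", False),
    "Ensemble_Raw": ("Zimbabwe_Ensemble_Raw_Model.pkl", False),
    "Ensemble_Scaled": ("Zimbabwe_Ensemble_Model.pkl", True),
    "RandomForest": ("Zimbabwe_RandomForest_Model.pkl", True),
    "XGBoost": ("Zimbabwe_XGBoost_Model.pkl", True),
    "LightGBM": ("Zimbabwe_LightGBM_Model.pkl", True),
    "CatBoost": ("Zimbabwe_CatBoost_Raw_Model.pkl", False),
}


def select_model(model: str = "Ensemble") -> Tuple[str, bool]:
    """Return (filename, scaler_needed) for a job's "model" value."""
    try:
        return MODEL_TABLE[model]
    except KeyError:
        raise ValueError(
            f"unknown model {model!r}; expected one of {sorted(MODEL_TABLE)}"
        ) from None
```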
**Label Encoder Handling**:
- No separate `label_encoder.joblib` file exists
- Labels encoded in model via `model.classes_` attribute
- Default classes (if not available): `["cropland_rainfed", "cropland_irrigated", "tree_crop", "grassland", "shrubland", "urban", "water", "bare"]`
### DEA STAC Configuration
| Setting | Value |
|---------|-------|
| STAC Root | `https://explorer.digitalearth.africa/stac` |
| STAC Search | `https://explorer.digitalearth.africa/stac/search` |
| Primary Collection | `s2_l2a` (Sentinel-2 L2A) |
| Required Bands | red, green, blue, nir, nir08 (red-edge), swir16, swir22 |
| Cloud Filter | eo:cloud_cover < 30% |
| Season Window | Sep 1 → May 31 (year → year+1) |
### Dynamic World Baseline Layout
**Bucket**: `geocrop-baselines`
**Path Pattern**: `dw/zim/summer/<season>/<type>/DW_Zim_<Type>_<year>_<year+1>.tif`
**Tile Format**: COGs with 65536x65536 pixel tiles
- Example: `DW_Zim_HighestConf_2021_2022-0000000000-0000000000.tif`
### Results Layout
**Bucket**: `geocrop-results`
**Path Pattern**: `results/<job_id>/<filename>`
**Output Files**:
- `refined.tif` - Main classification result
- `dw_baseline.tif` - Clipped DW baseline (if requested)
- `truecolor.tif` - RGB composite (if requested)
- `ndvi_peak.tif`, `evi_peak.tif`, `savi_peak.tif` - Index peaks (if requested)
### Job Payload Schema
```json
{
  "job_id": "uuid",
  "user_id": "uuid",
  "lat": -17.8,
  "lon": 31.0,
  "radius_m": 2000,
  "year": 2022,
  "season": "summer",
  "model": "Ensemble",
  "smoothing_kernel": 5,
  "outputs": {
    "refined": true,
    "dw_baseline": false,
    "true_color": false,
    "indices": []
  }
}
```
**Required Fields**: `job_id`, `lat`, `lon`, `radius_m`, `year`
**Defaults**:
- `season`: "summer"
- `model`: "Ensemble"
- `smoothing_kernel`: 5
- `outputs.refined`: true
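One way to apply the required-field check and the defaults is sketched below. `normalize_payload` is a hypothetical helper, not part of the worker, and merging `outputs` key by key is an assumption of this example.

```python
from copy import deepcopy
from typing import Any, Dict

REQUIRED_FIELDS = ("job_id", "lat", "lon", "radius_m", "year")
DEFAULTS: Dict[str, Any] = {
    "season": "summer",
    "model": "Ensemble",
    "smoothing_kernel": 5,
    "outputs": {"refined": True},
}


def normalize_payload(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Validate required fields and fill in the documented defaults."""
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    job = deepcopy(DEFAULTS)
    job.update({k: v for k, v in payload.items() if k != "outputs"})
    # Merge outputs key by key so a partial "outputs" dict keeps refined=True.
    job["outputs"].update(payload.get("outputs", {}))
    return job
```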
### Pipeline Stages
| Stage | Description |
|-------|-------------|
| `fetch_stac` | Query DEA STAC for Sentinel-2 scenes |
| `build_features` | Load bands, compute indices, apply feature engineering |
| `load_dw` | Load and clip Dynamic World baseline |
| `infer` | Run ML model inference |
| `smooth` | Apply majority filter post-processing |
| `export_cog` | Write GeoTIFF as COG |
| `upload` | Upload to MinIO |
| `done` | Complete |
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `REDIS_HOST` | `redis.geocrop.svc.cluster.local` | Redis service |
| `MINIO_ENDPOINT` | `minio.geocrop.svc.cluster.local:9000` | MinIO service |
| `MINIO_ACCESS_KEY` | `minioadmin` | MinIO access key |
| `MINIO_SECRET_KEY` | `minioadmin` | MinIO secret key |
| `MINIO_SECURE` | `false` | Use HTTPS for MinIO |
| `GEOCROP_CACHE_DIR` | `/tmp/geocrop-cache` | Local cache directory |
### Assumptions / TODOs
1. **EPSG**: Default to UTM Zone 36S (EPSG:32736) for Zimbabwe - compute dynamically from AOI center in production
2. **Feature Names**: Training uses selected features from LightGBM importance - may vary per model
3. **Label Encoder**: No separate file - extract from model or use defaults
4. **Scaler**: Only for non-Raw models; Raw models use unscaled features
5. **DW Tiles**: Must handle 2x2 tile mosaicking for full AOI coverage
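For item 1, the UTM EPSG code can be derived from the AOI center with the standard 6-degree zone formula. `utm_epsg_for` is a hypothetical helper name; the sketch ignores the special zones around Norway and Svalbard, which do not affect Zimbabwe.

```python
import math


def utm_epsg_for(lon: float, lat: float) -> int:
    """Return the WGS 84 / UTM EPSG code for a point (standard 6-degree zones)."""
    zone = int(math.floor((lon + 180.0) / 6.0)) + 1
    zone = min(max(zone, 1), 60)  # lon == 180 would otherwise give zone 61
    # 326xx is the northern hemisphere series, 327xx the southern one.
    return (32600 if lat >= 0 else 32700) + zone


print(utm_epsg_for(31.0, -17.8))  # 32736
print(utm_epsg_for(26.0, -19.0))  # 32735
```

Western Zimbabwe (west of 30°E) falls in zone 35S, which is why the fixed EPSG:32736 default is only an approximation there.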
---
## Worker Contracts (STEP 1)
### Job Payload Contract
```python
# Minimal required fields:
{
    "job_id": "uuid",
    "lat": -17.8,
    "lon": 31.0,
    "radius_m": 2000,  # max 5000m
    "year": 2022       # 2015-current
}
# Full with all options:
{
    "job_id": "uuid",
    "user_id": "uuid",       # optional
    "lat": -17.8,
    "lon": 31.0,
    "radius_m": 2000,
    "year": 2022,
    "season": "summer",      # default
    "model": "Ensemble",     # or RandomForest, XGBoost, LightGBM, CatBoost
    "smoothing_kernel": 5,   # 3, 5, or 7
    "outputs": {
        "refined": True,
        "dw_baseline": True,
        "true_color": True,
        "indices": ["ndvi_peak", "evi_peak", "savi_peak"]
    },
    "stac": {
        "cloud_cover_lt": 20,
        "max_items": 60
    }
}
```
### Worker Stages
```
fetch_stac → build_features → load_dw → infer → smooth → export_cog → upload → done
```
### Default Class List (TEMPORARY V1)
Until class extraction is fully dynamic, use these classes (order matters if the model doesn't provide classes):
```python
CLASSES_V1 = [
"Avocado","Banana","Bare Surface","Blueberry","Built-Up","Cabbage","Chilli","Citrus","Cotton","Cowpea",
"Finger Millet","Forest","Grassland","Groundnut","Macadamia","Maize","Pasture Legume","Pearl Millet",
"Peas","Potato","Roundnut","Sesame","Shrubland","Sorghum","Soyabean","Sugarbean","Sugarcane","Sunflower",
"Sunhem","Sweet Potato","Tea","Tobacco","Tomato","Water","Woodland"
]
```
Note: This is TEMPORARY - later we will extract class names dynamically from the trained model.
---
## STEP 2: Storage Adapter (MinIO)
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `MINIO_ENDPOINT` | `minio.geocrop.svc.cluster.local:9000` | MinIO service |
| `MINIO_ACCESS_KEY` | `minioadmin` | MinIO access key |
| `MINIO_SECRET_KEY` | `minioadmin123` | MinIO secret key |
| `MINIO_SECURE` | `false` | Use HTTPS for MinIO |
| `MINIO_REGION` | `us-east-1` | AWS region |
| `MINIO_BUCKET_MODELS` | `geocrop-models` | Models bucket |
| `MINIO_BUCKET_BASELINES` | `geocrop-baselines` | Baselines bucket |
| `MINIO_BUCKET_RESULTS` | `geocrop-results` | Results bucket |
### Bucket/Key Conventions
- **Models**: ROOT of `geocrop-models` (no subfolders)
- **DW Baselines**: `geocrop-baselines/dw/zim/summer/<season>/<type>/DW_Zim_<Type>_<year>_<year+1>.tif`
- **Results**: `geocrop-results/results/<job_id>/<filename>`
### Model Filename Mapping
| Job model value | Primary filename | Fallback |
|-----------------|-----------------|----------|
| "Ensemble" | Zimbabwe_Ensemble_Model.pkl | Zimbabwe_Ensemble_Raw_Model.pkl |
| "RandomForest" | Zimbabwe_RandomForest_Model.pkl | Zimbabwe_RandomForest_Raw_Model.pkl |
| "XGBoost" | Zimbabwe_XGBoost_Model.pkl | Zimbabwe_XGBoost_Raw_Model.pkl |
| "LightGBM" | Zimbabwe_LightGBM_Model.pkl | Zimbabwe_LightGBM_Raw_Model.pkl |
| "CatBoost" | Zimbabwe_CatBoost_Model.pkl | Zimbabwe_CatBoost_Raw_Model.pkl |
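The primary/fallback rule can be sketched as follows. `candidate_filenames` and `resolve_model_key` are hypothetical helpers; in practice `existing_keys` would presumably come from something like `list_objects(bucket, prefix)` on the models bucket.

```python
from typing import Iterable, Tuple

KNOWN_MODELS = ("Ensemble", "RandomForest", "XGBoost", "LightGBM", "CatBoost")


def candidate_filenames(model: str) -> Tuple[str, str]:
    """Return (primary, fallback) object names for a job model value."""
    if model not in KNOWN_MODELS:
        raise ValueError(f"unknown model {model!r}; expected one of {KNOWN_MODELS}")
    return (f"Zimbabwe_{model}_Model.pkl", f"Zimbabwe_{model}_Raw_Model.pkl")


def resolve_model_key(model: str, existing_keys: Iterable[str]) -> str:
    """Pick the first candidate that is present in the bucket listing."""
    existing = set(existing_keys)
    candidates = candidate_filenames(model)
    for name in candidates:
        if name in existing:
            return name
    raise FileNotFoundError(f"no model object for {model!r}; tried {candidates}")
```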
### Methods
- `ping()` → `(bool, str)`: Check MinIO connectivity
- `head_object(bucket, key)` → `dict|None`: Get object metadata
- `list_objects(bucket, prefix)` → `list[str]`: List object keys
- `download_file(bucket, key, dest_path)` → `Path`: Download file
- `download_model_file(model_name, dest_dir)` → `Path`: Download model with fallback
- `upload_file(bucket, key, local_path)` → `str`: Upload file, returns s3:// URI
- `upload_result(job_id, local_path, filename)` → `(s3_uri, key)`: Upload result
- `presign_get(bucket, key, expires)` → `str`: Generate presigned URL
---
## STEP 3: STAC Client (DEA)
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `DEA_STAC_ROOT` | `https://explorer.digitalearth.africa/stac` | STAC root URL |
| `DEA_STAC_SEARCH` | `https://explorer.digitalearth.africa/stac/search` | STAC search URL |
| `DEA_CLOUD_MAX` | `30` | Cloud cover filter (percent) |
| `DEA_TIMEOUT_S` | `30` | Request timeout (seconds) |
### Collection Resolution
Preferred Sentinel-2 collection IDs (in order):
1. `s2_l2a`
2. `s2_l2a_c1`
3. `sentinel-2-l2a`
4. `sentinel_2_l2a`
If none found, raises ValueError with available collections.
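The preference order can be sketched as a simple first-match search. This standalone function takes the list of available collection IDs as an argument, which is an assumption of the example; the client method listed below takes no arguments.

```python
from typing import Iterable

PREFERRED_S2_COLLECTIONS = ("s2_l2a", "s2_l2a_c1", "sentinel-2-l2a", "sentinel_2_l2a")


def resolve_s2_collection(available: Iterable[str]) -> str:
    """Return the first preferred Sentinel-2 collection ID that is available."""
    available = list(available)
    for collection_id in PREFERRED_S2_COLLECTIONS:
        if collection_id in available:
            return collection_id
    raise ValueError(f"no Sentinel-2 collection found; available: {sorted(available)}")
```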
### Methods
- `list_collections()` → `list[str]`: List available collections
- `resolve_s2_collection()` → `str|None`: Resolve best S2 collection
- `search_items(bbox, start_date, end_date)` → `list[pystac.Item]`: Search for items
- `summarize_items(items)` → `dict`: Summarize search results without downloading
### summarize_items() Output Structure
```python
{
    "count": int,
    "collection": str,
    "time_start": "ISO datetime",
    "time_end": "ISO datetime",
    "items": [
        {
            "id": str,
            "datetime": "ISO datetime",
            "bbox": [minx, miny, maxx, maxy],
            "cloud_cover": float|None,
            "assets": {
                "red": {"href": str, "type": str, "roles": list},
                ...
            }
        }, ...
    ]
}
```
**Note**: stackstac loading is NOT implemented in this step. It will come in Step 4/5.
---
## STEP 4A: Feature Computation (Math)
### Features Produced
**Base indices (time-series):**
- ndvi, ndre, evi, savi, ci_re, ndwi
**Smoothed time-series:**
- For every index above, Savitzky-Golay smoothing (window=5, polyorder=2)
- Suffix: *_smooth
**Phenology metrics (computed across time for NDVI, NDRE, EVI):**
- _max, _min, _mean, _std, _amplitude, _auc, _peak_timestep, _max_slope_up, _max_slope_down
**Harmonic features (for NDVI only):**
- ndvi_harmonic1_sin, ndvi_harmonic1_cos, ndvi_harmonic2_sin, ndvi_harmonic2_cos
**Interaction features:**
- ndvi_ndre_peak_diff = ndvi_max - ndre_max
- canopy_density_contrast = evi_mean / (ndvi_mean + 0.001)
### Smoothing Approach
1. **fill_zeros_linear**: Treats 0 as missing and linearly interpolates between non-zero neighbors
2. **savgol_smooth_1d**: Uses scipy.signal.savgol_filter if available, falls back to simple moving average
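The two steps above can be sketched in plain Python. This is not the worker's implementation: holding the nearest valid value for leading and trailing gaps, returning an all-zero series unchanged, and clipping the moving-average window at the edges are all assumptions of this example.

```python
from typing import List

try:
    from scipy.signal import savgol_filter
except ImportError:  # fall back to a moving average when SciPy is absent
    savgol_filter = None


def fill_zeros_linear(y: List[float]) -> List[float]:
    """Treat 0 as missing and interpolate linearly between non-zero neighbors."""
    known = [i for i, v in enumerate(y) if v != 0]
    if not known:
        return list(y)  # nothing to interpolate from
    out = list(y)
    for i, v in enumerate(y):
        if v != 0:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = y[right]  # leading gap: hold first valid value (assumption)
        elif right is None:
            out[i] = y[left]   # trailing gap: hold last valid value (assumption)
        else:
            frac = (i - left) / (right - left)
            out[i] = y[left] + frac * (y[right] - y[left])
    return out


def savgol_smooth_1d(y: List[float], window: int = 5, polyorder: int = 2) -> List[float]:
    """Savitzky-Golay smoothing, or a centered moving average as fallback."""
    if savgol_filter is not None and len(y) >= window:
        smoothed = savgol_filter(y, window_length=window, polyorder=polyorder)
        return [float(v) for v in smoothed]
    half = window // 2
    out = []
    for i in range(len(y)):
        chunk = y[max(0, i - half): i + half + 1]  # clipped at the edges
        out.append(sum(chunk) / len(chunk))
    return out
```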
### Phenology Metrics Definitions
| Metric | Formula |
|--------|---------|
| max | np.max(y) |
| min | np.min(y) |
| mean | np.mean(y) |
| std | np.std(y) |
| amplitude | max - min |
| auc | trapezoidal integral (dx=10 days) |
| peak_timestep | argmax(y) |
| max_slope_up | max(diff(y)) |
| max_slope_down | min(diff(y)) |
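The table translates directly into code. The sketch below uses only the standard library in place of NumPy, so `pstdev` stands in for `np.std` (both use the population formula by default) and `list.index` stands in for `argmax`; the function name is hypothetical and the series needs at least two time steps.

```python
from statistics import fmean, pstdev
from typing import Dict, List


def phenology_metrics(y: List[float], dx_days: float = 10.0) -> Dict[str, float]:
    """Compute the nine phenology metrics for one index time series."""
    diffs = [b - a for a, b in zip(y, y[1:])]
    return {
        "max": max(y),
        "min": min(y),
        "mean": fmean(y),
        "std": pstdev(y),  # population std, like np.std with ddof=0
        "amplitude": max(y) - min(y),
        # Trapezoidal rule with a fixed step of dx_days between observations.
        "auc": sum((a + b) / 2.0 * dx_days for a, b in zip(y, y[1:])),
        "peak_timestep": y.index(max(y)),  # first maximum, like np.argmax
        "max_slope_up": max(diffs),
        "max_slope_down": min(diffs),
    }
```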
### Harmonic Coefficient Definition
For normalized time t = 2*pi*k/N:
- h1_sin = mean(y * sin(t))
- h1_cos = mean(y * cos(t))
- h2_sin = mean(y * sin(2t))
- h2_cos = mean(y * cos(2t))
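A minimal sketch of these coefficients, assuming `k` runs from 0 to N-1 over the time steps (the range of `k` is not stated here) and using a hypothetical function name:

```python
import math
from typing import Dict, List


def harmonic_coefficients(y: List[float]) -> Dict[str, float]:
    """Project a time series onto the first two sine/cosine harmonics."""
    n = len(y)
    t = [2.0 * math.pi * k / n for k in range(n)]  # normalized time
    return {
        "h1_sin": sum(v * math.sin(a) for v, a in zip(y, t)) / n,
        "h1_cos": sum(v * math.cos(a) for v, a in zip(y, t)) / n,
        "h2_sin": sum(v * math.sin(2 * a) for v, a in zip(y, t)) / n,
        "h2_cos": sum(v * math.cos(2 * a) for v, a in zip(y, t)) / n,
    }
```

For a pure first-harmonic sine input, `h1_sin` comes out at 0.5 and the other three coefficients vanish, which is a quick sanity check on the projection.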
### Note
Step 4B will add seasonal window summaries and final feature vector ordering.
---
## STEP 4B: Window Summaries + Feature Order
### Seasonal Window Features (18 features)
Season window is Oct-Jun, split into:
- **Early**: Oct-Dec
- **Peak**: Jan-Mar
- **Late**: Apr-Jun
For each window, computed for NDVI, NDWI, NDRE:
- `<index>_<window>_mean`
- `<index>_<window>_max`
Total: 3 indices × 3 windows × 2 stats = **18 features**
### Feature Ordering (FEATURE_ORDER_V1)
51 scalar features in order:
1. **Phenology metrics** (27): ndvi, ndre, evi (each with max, min, mean, std, amplitude, auc, peak_timestep, max_slope_up, max_slope_down)
2. **Harmonics** (4): ndvi_harmonic1_sin/cos, ndvi_harmonic2_sin/cos
3. **Interactions** (2): ndvi_ndre_peak_diff, canopy_density_contrast
4. **Window summaries** (18): ndvi/ndwi/ndre × early/peak/late × mean/max
Note: Additional smoothed array features (*_smooth) are not in FEATURE_ORDER_V1 since they are arrays, not scalars.
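The 51 names can be generated rather than typed by hand. The group order follows the list above; the nesting order inside each group (index, then window, then statistic) is an assumption of this sketch and should be checked against the training code before relying on it.

```python
from typing import List

PHENO_INDICES = ("ndvi", "ndre", "evi")
PHENO_METRICS = (
    "max", "min", "mean", "std", "amplitude", "auc",
    "peak_timestep", "max_slope_up", "max_slope_down",
)
HARMONICS = (
    "ndvi_harmonic1_sin", "ndvi_harmonic1_cos",
    "ndvi_harmonic2_sin", "ndvi_harmonic2_cos",
)
INTERACTIONS = ("ndvi_ndre_peak_diff", "canopy_density_contrast")
WINDOW_INDICES = ("ndvi", "ndwi", "ndre")
WINDOWS = ("early", "peak", "late")
WINDOW_STATS = ("mean", "max")


def build_feature_order() -> List[str]:
    """Assemble the scalar feature names group by group."""
    names = [f"{idx}_{metric}" for idx in PHENO_INDICES for metric in PHENO_METRICS]
    names += list(HARMONICS)
    names += list(INTERACTIONS)
    names += [
        f"{idx}_{window}_{stat}"
        for idx in WINDOW_INDICES
        for window in WINDOWS
        for stat in WINDOW_STATS
    ]
    return names


print(len(build_feature_order()))  # 51
```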
### Window Splitting Logic
- If `dates` provided: Use month membership (10,11,12 = early; 1,2,3 = peak; 4,5,6 = late)
- Fallback: Positional split (first 9 steps = early, next 9 = peak, next 9 = late)
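Both splitting rules can be sketched as a function that returns time-step indices per window. `split_windows` is a hypothetical name, and dropping dates whose month falls outside Oct-Jun is an assumption of this example.

```python
from datetime import date
from typing import Dict, List, Optional, Sequence

MONTH_TO_WINDOW = {
    10: "early", 11: "early", 12: "early",
    1: "peak", 2: "peak", 3: "peak",
    4: "late", 5: "late", 6: "late",
}


def split_windows(
    n_steps: int,
    dates: Optional[Sequence[date]] = None,
    steps_per_window: int = 9,
) -> Dict[str, List[int]]:
    """Return the time-step indices that belong to each seasonal window."""
    windows: Dict[str, List[int]] = {"early": [], "peak": [], "late": []}
    if dates is not None:
        for i, d in enumerate(dates):
            name = MONTH_TO_WINDOW.get(d.month)
            if name is not None:  # months outside Oct-Jun are skipped (assumption)
                windows[name].append(i)
        return windows
    # Positional fallback: first 9 steps early, next 9 peak, next 9 late.
    order = ("early", "peak", "late")
    for i in range(min(n_steps, 3 * steps_per_window)):
        windows[order[i // steps_per_window]].append(i)
    return windows
```

The `<index>_<window>_mean` and `<index>_<window>_max` features then follow by taking the mean and max of the series over each index list.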
---
## STEP 5: DW Baseline Loading
### DW Object Layout
**Bucket**: `geocrop-baselines`
**Prefix**: `dw/zim/summer/`
**Path Pattern**: `dw/zim/summer/<season>/<type>/DW_Zim_<Type>_<year>_<year+1>.tif`
**Tile Naming**: COGs with 65536x65536 pixel tiles
- Example: `DW_Zim_HighestConf_2021_2022-0000000000-0000000000.tif`
- Format: `DW_Zim_{Type}_{Year}_{Year+1}-{TileRow}-{TileCol}.tif`
### DW Types
- `HighestConf` - Highest confidence class
- `Agreement` - Class agreement across predictions
- `Mode` - Most common class
### Windowed Reads
The worker MUST use windowed reads to avoid downloading entire huge COG tiles:
1. **Presigned URL**: Get temporary URL via `storage.presign_get(bucket, key, expires=3600)`
2. **AOI Transform**: Convert AOI bbox from WGS84 to tile CRS using `rasterio.warp.transform_bounds`
3. **Window Creation**: Use `rasterio.windows.from_bounds` to compute window from transformed bbox
4. **Selective Read**: Call `src.read(window=window)` to read only the needed portion
5. **Mosaic**: If multiple tiles needed, read each window and mosaic into single array
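Steps 2 to 4 can be sketched with rasterio for a single tile. `read_aoi_window` is a hypothetical helper that takes the presigned URL from step 1; mosaicking across tiles (step 5) is left out, and clipping the window to the tile extent is an addition of this example. rasterio is imported inside the function only so the snippet loads on machines without it.

```python
from typing import List, Tuple


def read_aoi_window(url: str, aoi_bbox_wgs84: List[float]) -> Tuple["object", dict]:
    """Read only the AOI portion of one COG tile.

    aoi_bbox_wgs84 is [min_lon, min_lat, max_lon, max_lat].
    Returns (array, profile) with the profile aligned to the window.
    """
    import rasterio
    from rasterio.warp import transform_bounds
    from rasterio.windows import Window, from_bounds

    with rasterio.open(url) as src:
        # Step 2: AOI bbox from WGS84 into the tile's own CRS.
        left, bottom, right, top = transform_bounds(
            "EPSG:4326", src.crs, *aoi_bbox_wgs84
        )
        # Step 3: pixel window for those bounds, clipped to the tile extent.
        window = from_bounds(left, bottom, right, top, transform=src.transform)
        window = window.intersection(Window(0, 0, src.width, src.height))
        window = window.round_offsets().round_lengths()
        # Step 4: read just that window from band 1.
        arr = src.read(1, window=window)
        profile = src.profile.copy()
        profile.update(
            height=arr.shape[0],
            width=arr.shape[1],
            transform=src.window_transform(window),
        )
    return arr, profile
```

If the AOI does not overlap the tile at all, `Window.intersection` raises a `WindowError`, which a caller could use to skip that tile.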
### CRS Handling
- DW tiles may be in EPSG:3857 (Web Mercator) or UTM - do NOT assume
- Always transform AOI bbox to tile CRS before computing window
- Output profile uses tile's native CRS
### Error Handling
- If no matching tiles found: Raise `FileNotFoundError` with searched prefix
- If window read fails: Retry 3x with exponential backoff
- Nodata value: 0 (preserved from DW)
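The retry rule can be sketched as a small wrapper. `retry_with_backoff` is a hypothetical helper; treating "3x" as three attempts in total, the 1 s base delay, and retrying only on `OSError` are assumptions of this example.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    max_retries: int = 3,
    base_delay_s: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying with delays of base, 2*base, 4*base, ... seconds."""
    if max_retries < 1:
        raise ValueError("max_retries must be >= 1")
    for attempt in range(max_retries):
        try:
            return fn()
        except retry_on:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay_s * 2 ** attempt)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.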
### Primary Function
```python
from typing import List, Tuple

import numpy as np


# Required parameters come first: Python rejects a non-default
# argument that follows a default one.
def load_dw_baseline_window(
    storage,
    year: int,
    aoi_bbox_wgs84: List[float],  # [min_lon, min_lat, max_lon, max_lat]
    season: str = "summer",
    dw_type: str = "HighestConf",
    bucket: str = "geocrop-baselines",
    max_retries: int = 3,
) -> Tuple[np.ndarray, dict]:
    """Load DW baseline clipped to AOI window from MinIO.

    Returns:
        dw_arr: uint8 or int16 raster clipped to AOI
        profile: rasterio profile for writing outputs aligned to this window
    """
```
---
## Plan 02 - Step 1: TiTiler Deployment+Service
### Files Changed
- Created: [`k8s/25-tiler.yaml`](k8s/25-tiler.yaml)
- Created: Kubernetes Secret `geocrop-secrets` with MinIO credentials
### Commands Run
```bash
kubectl create secret generic geocrop-secrets -n geocrop --from-literal=minio-access-key=minioadmin --from-literal=minio-secret-key=minioadmin123
kubectl -n geocrop apply -f k8s/25-tiler.yaml
kubectl -n geocrop get deploy,svc | grep geocrop-tiler
```
### Expected Output / Acceptance Criteria
- `kubectl -n geocrop apply -f k8s/25-tiler.yaml` succeeds (syntax correct)
- Creates Deployment `geocrop-tiler` with 2 replicas
- Creates Service `geocrop-tiler` (ClusterIP on port 8000 → container port 80)
- TiTiler container reads COGs from MinIO via S3
- Pods are Running and Ready (1/1)
### Actual Output
```
deployment.apps/geocrop-tiler 2/2 2 2 2m
service/geocrop-tiler ClusterIP 10.43.47.225 <none> 8000/TCP 2m
```
### TiTiler Environment Variables
| Variable | Value |
|----------|-------|
| AWS_ACCESS_KEY_ID | from secret geocrop-secrets |
| AWS_SECRET_ACCESS_KEY | from secret geocrop-secrets |
| AWS_REGION | us-east-1 |
| AWS_S3_ENDPOINT_URL | http://minio.geocrop.svc.cluster.local:9000 |
| AWS_HTTPS | NO |
| TILED_READER | cog |
### Notes
- Container listens on port 80 (not 8000) - service maps 8000 → 80
- Health probe path `/healthz` on port 80
- Secret `geocrop-secrets` created for MinIO credentials
### Next Step
- Step 2: Add Ingress for TiTiler (with TLS)
---
## Plan 02 - Step 2: TiTiler Ingress
### Files Changed
- Created: [`k8s/26-tiler-ingress.yaml`](k8s/26-tiler-ingress.yaml)
### Commands Run
```bash
kubectl -n geocrop apply -f k8s/26-tiler-ingress.yaml
kubectl -n geocrop get ingress geocrop-tiler -o wide
kubectl -n geocrop describe ingress geocrop-tiler
```
### Expected Output / Acceptance Criteria
- Ingress object created with host `tiles.portfolio.techarvest.co.zw`
- TLS certificate will be pending until DNS A record is pointed to ingress IP
### Actual Output
```
NAME CLASS HOSTS ADDRESS PORTS AGE
geocrop-tiler nginx tiles.portfolio.techarvest.co.zw 167.86.68.48 80, 443 30s
```
### Ingress Details
- Host: tiles.portfolio.techarvest.co.zw
- Backend: geocrop-tiler:8000
- TLS: geocrop-tiler-tls (cert-manager with letsencrypt-prod)
- Annotations: nginx.ingress.kubernetes.io/proxy-body-size: "50m"
### DNS Requirement
External DNS A record must point to ingress IP (167.86.68.48):
- `tiles.portfolio.techarvest.co.zw` → `167.86.68.48`
---
## Plan 02 - Step 3: TiTiler Smoke Test
### Commands Run
```bash
kubectl -n geocrop port-forward svc/geocrop-tiler 8000:8000 &
curl -sS http://127.0.0.1:8000/ | head
curl -sS -o /dev/null -w "%{http_code}\n" http://127.0.0.1:8000/healthz
```
### Test Results
| Endpoint | Status | Notes |
|----------|--------|-------|
| `/` | 200 | Landing page JSON returned |
| `/healthz` | 200 | Health check passes |
| `/api` | 200 | OpenAPI docs available |
### Final Probe Path
- **Confirmed**: `/healthz` on port 80 works correctly
- No manifest changes needed
---
## Plan 02 - Step 4: MinIO S3 Access Test
### Commands Run
```bash
# With correct credentials (minioadmin/minioadmin123)
curl -sS "http://127.0.0.1:8000/cog/info?url=s3://geocrop-baselines/dw/zim/summer/summer/highest/DW_Zim_HighestConf_2016_2017-0000000000-0000000000.tif"
```
### Test Results
| Test | Result | Notes |
|------|--------|-------|
| S3 Access | Failed | Error: "The AWS Access Key Id you provided does not exist in our records" |
### Issue Analysis
- MinIO credentials used: `minioadmin` / `minioadmin123`
- The root user is `minioadmin` with password `minioadmin123`
- TiTiler pods have correct env vars set (verified via `kubectl exec`)
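- Issue may be: (1) bucket not created, (2) bucket path incorrect, or (3) network policy. The error text itself refers to the access key, so a mismatch between the credentials in `geocrop-secrets` and those configured on the MinIO deployment should also be ruled out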
### Environment Variables (Verified Set in Pods)
| Variable | Value |
|----------|-------|
| AWS_ACCESS_KEY_ID | minioadmin |
| AWS_SECRET_ACCESS_KEY | minioadmin123 |
| AWS_S3_ENDPOINT_URL | http://minio.geocrop.svc.cluster.local:9000 |
| AWS_HTTPS | NO |
| AWS_REGION | us-east-1 |
### Next Step
- Verify bucket exists in MinIO
- Check bucket naming convention in MinIO console
- Or upload test COG to verify S3 access