Storage Security Notes

Overview

All MinIO buckets in the geocrop project are configured as private with no public access. Downloads require authenticated access through signed URLs generated by the API.

Why MinIO Stays Private

1. Data Sensitivity

  • Baseline COGs: Dynamic World data covering Zimbabwe contains land use information that should not be publicly exposed
  • Training Data: Contains labeled geospatial data that may have privacy considerations
  • Model Artifacts: Proprietary ML models should be protected
  • Inference Results: User-generated outputs should only be accessible to the respective users

2. Security Best Practices

  • Least Privilege: Only authenticated services and users can access storage
  • Defense in Depth: Multiple layers of security (network policies, authentication, bucket policies)
  • Audit Trail: All access can be logged through MinIO audit logs

Access Model

Internal Access (Within Kubernetes Cluster)

Services running inside the geocrop namespace can access MinIO using:

  • Endpoint: minio.geocrop.svc.cluster.local:9000
  • Credentials: Stored as Kubernetes secrets
  • Access: Service account / node IAM

External Access (Outside Kubernetes)

External clients (web frontend, API consumers) must use signed URLs:

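# Example: generate a signed URL inside the API
import os
from datetime import timedelta

from minio import Minio

client = Minio(
    "minio.geocrop.svc.cluster.local:9000",
    access_key=os.getenv("MINIO_ACCESS_KEY"),
    secret_key=os.getenv("MINIO_SECRET_KEY"),
)

# Generate a presigned URL (valid for 1 hour).
# minio-py expects `expires` as a timedelta, not a number of seconds.
# The URL embeds the endpoint host given above, so the client that
# downloads the object must be able to reach that host.
url = client.presigned_get_object(
    "geocrop-results",
    "jobs/job-123/result.tif",
    expires=timedelta(hours=1),
)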

Bucket Policies Applied

All buckets have anonymous access disabled:

mc anonymous set none geocrop-minio/geocrop-baselines
mc anonymous set none geocrop-minio/geocrop-datasets
mc anonymous set none geocrop-minio/geocrop-results
mc anonymous set none geocrop-minio/geocrop-models

Future: Signed URL Workflow

  1. User requests download via API (GET /api/v1/results/{job_id}/download)
  2. API validates user has permission to access the job
  3. API generates presigned URL with short expiration (15-60 minutes)
  4. User downloads directly from MinIO via the signed URL
  5. URL expires after the specified time
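
The steps above could be sketched as a small helper. This is a minimal illustration under stated assumptions, not the project's actual API code: the `Job` record, the `PermissionDenied` exception, and the `create_download_url` function are hypothetical names, and the ownership check stands in for whatever authorization the API really performs. The client only needs a `presigned_get_object` method with the minio-py signature, so a `minio.Minio` instance should fit.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Protocol

RESULTS_BUCKET = "geocrop-results"
MIN_TTL = timedelta(minutes=15)  # lower bound from step 3
MAX_TTL = timedelta(minutes=60)  # upper bound from step 3


class PresignClient(Protocol):
    """Anything exposing minio-py's presigned_get_object, e.g. minio.Minio."""

    def presigned_get_object(
        self, bucket_name: str, object_name: str, expires: timedelta
    ) -> str: ...


@dataclass(frozen=True)
class Job:
    """Hypothetical job record; the real schema may differ."""

    job_id: str
    owner_id: str
    result_key: str  # object key, e.g. "jobs/job-123/result.tif"


class PermissionDenied(Exception):
    """Raised when the requesting user may not access the job."""


def create_download_url(
    client: PresignClient,
    job: Job,
    user_id: str,
    ttl: timedelta = MIN_TTL,
) -> str:
    # Step 2: validate that the user may access this job (assumed: owner only).
    if job.owner_id != user_id:
        raise PermissionDenied(f"user {user_id!r} cannot access job {job.job_id!r}")

    # Step 3: keep the expiration inside the 15-60 minute window.
    if not MIN_TTL <= ttl <= MAX_TTL:
        raise ValueError("ttl must be between 15 and 60 minutes")

    # Steps 4-5: the caller hands this URL to the user; MinIO rejects it after ttl.
    return client.presigned_get_object(RESULTS_BUCKET, job.result_key, expires=ttl)
```

A handler for `GET /api/v1/results/{job_id}/download` would look up the job, call `create_download_url`, and map `PermissionDenied` to an HTTP 403 response.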

Network Policies

For additional security, Kubernetes NetworkPolicies should be configured to restrict which pods can communicate with MinIO. Recommended:

  • Allow only geocrop-api and geocrop-worker pods to access MinIO
  • Deny all other pods by default
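
As a rough sketch of these two rules, the script below builds a NetworkPolicy manifest as a Python dict and prints it as JSON, which `kubectl apply -f -` accepts. The policy name and the `app` labels (`minio`, `geocrop-api`, `geocrop-worker`) are assumptions; adjust them to the labels the deployments actually carry. Because the policy selects the MinIO pods and declares the `Ingress` policy type, any ingress not listed is denied for those pods.

```python
import json


def minio_ingress_policy(
    namespace: str = "geocrop",
    allowed_apps: tuple[str, ...] = ("geocrop-api", "geocrop-worker"),
    minio_port: int = 9000,
) -> dict:
    """Build a NetworkPolicy that only lets the listed apps reach MinIO."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            "name": "minio-allow-api-and-worker",  # hypothetical name
            "namespace": namespace,
        },
        "spec": {
            # Assumed label on the MinIO pods.
            "podSelector": {"matchLabels": {"app": "minio"}},
            # Selecting pods for Ingress denies all ingress not listed below.
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    # One peer per allowed app, matched in the same namespace.
                    "from": [
                        {"podSelector": {"matchLabels": {"app": app}}}
                        for app in allowed_apps
                    ],
                    "ports": [{"protocol": "TCP", "port": minio_port}],
                }
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(minio_ingress_policy(), indent=2))
```

Saved under a name of your choosing, the output can be piped straight into `kubectl apply -f -`. Enforcement depends on the cluster's network plugin supporting NetworkPolicy.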

Verification

To verify bucket policies:

mc anonymous get geocrop-minio/geocrop-baselines
# Expected: access permission reported as "private" (exact wording depends on the mc version)

mc anonymous list geocrop-minio/geocrop-baselines
# Expected: empty (no public access)
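
The `mc` checks can be complemented by probing a bucket without credentials. The sketch below uses only the Python standard library; the helper names are hypothetical, and it assumes MinIO answers anonymous requests to a private bucket with an HTTP 401 or 403, as S3-compatible servers normally do.

```python
import urllib.error
import urllib.request


def anonymous_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status of an unauthenticated GET to `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # urllib raises on 4xx/5xx; the status code is still what we want.
        return exc.code


def looks_private(status: int) -> bool:
    """Treat 401/403 as 'anonymous access refused'; anything else needs a look."""
    return status in (401, 403)
```

From a pod inside the cluster, `looks_private(anonymous_status("http://minio.geocrop.svc.cluster.local:9000/geocrop-baselines/"))` should return `True` for a private bucket. The `http://` scheme here is an assumption; use `https://` if TLS is enabled on the endpoint.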

Recommendations for Production

  1. Enable MinIO Audit Logs: Track all API access for compliance
  2. Use TLS: Ensure all MinIO communication uses TLS 1.2+
  3. Rotate Credentials: Regularly rotate MinIO root access keys
  4. Implement Bucket Quotas: Prevent any single bucket from consuming all storage
  5. Enable Versioning: Turn on versioning for critical buckets so that accidentally deleted or overwritten objects can be recovered

Date: 2026-02-28
Status: Documented