# Storage Security Notes

## Overview

All MinIO buckets in the geocrop project are configured as **private** with no public access. Downloads require authenticated access through signed URLs generated by the API.

## Why MinIO Stays Private

### 1. Data Sensitivity

- **Baseline COGs**: Dynamic World data covering Zimbabwe contains land use information that should not be publicly exposed
- **Training Data**: Contains labeled geospatial data that may carry privacy considerations
- **Model Artifacts**: Proprietary ML models should be protected
- **Inference Results**: User-generated outputs should be accessible only to the users who own them

### 2. Security Best Practices

- **Least Privilege**: Only authenticated services and users can access storage
- **Defense in Depth**: Multiple layers of security (network policies, authentication, bucket policies)
- **Audit Trail**: All access can be logged through MinIO audit logs

## Access Model

### Internal Access (Within Kubernetes Cluster)

Services running inside the `geocrop` namespace can access MinIO using:

- **Endpoint**: `minio.geocrop.svc.cluster.local:9000`
- **Credentials**: Stored as Kubernetes secrets
- **Access**: Service account / node IAM

### External Access (Outside Kubernetes)

External clients (web frontend, API consumers) must use **signed URLs**:

```python
# Example: generate a signed URL via the API
import os
from datetime import timedelta

from minio import Minio

client = Minio(
    "minio.geocrop.svc.cluster.local:9000",
    access_key=os.getenv("MINIO_ACCESS_KEY"),
    secret_key=os.getenv("MINIO_SECRET_KEY"),
)

# Generate a presigned URL (valid for 1 hour).
# minio-py expects `expires` as a timedelta, not a number of seconds.
url = client.presigned_get_object(
    "geocrop-results",
    "jobs/job-123/result.tif",
    expires=timedelta(hours=1),
)
```

Note that a presigned URL embeds the endpoint host the client was created with, so the API must sign against an endpoint that external clients can actually reach.

## Bucket Policies Applied

All buckets have anonymous access disabled:

```bash
mc anonymous set none geocrop-minio/geocrop-baselines
mc anonymous set none geocrop-minio/geocrop-datasets
mc anonymous set none geocrop-minio/geocrop-results
mc anonymous set none geocrop-minio/geocrop-models
```

## Future: Signed URL Workflow

1. **User requests download** via API (`GET /api/v1/results/{job_id}/download`)
2. **API validates** that the user has permission to access the job
3. **API generates** a presigned URL with a short expiration (15-60 minutes)
4. **User downloads** directly from MinIO via the signed URL
5. **URL expires** after the specified time

## Network Policies

For additional security, Kubernetes NetworkPolicies should be configured to restrict which pods can communicate with MinIO.

Recommended:

- Allow only `geocrop-api` and `geocrop-worker` pods to access MinIO
- Deny all other pods by default

## Verification

To verify bucket policies:

```bash
mc anonymous get geocrop-minio/geocrop-baselines
# Expected: "Policy not set" (meaning private)

mc anonymous list geocrop-minio/geocrop-baselines
# Expected: empty (no public access)
```

## Recommendations for Production

1. **Enable MinIO Audit Logs**: Track all API access for compliance
2. **Use TLS**: Ensure all MinIO communication uses TLS 1.2+
3. **Rotate Credentials**: Regularly rotate MinIO root access keys
4. **Implement Bucket Quotas**: Prevent any single bucket from consuming all storage
5. **Enable Versioning**: Turn on versioning for critical buckets to protect against accidental deletion

---

**Date**: 2026-02-28

**Status**: ✅ Documented
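
## Appendix: Signed URL Workflow Sketch

The signed URL workflow described earlier (validate permission, then issue a short-lived presigned URL) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual implementation: `build_download_url`, `PermissionDenied`, the `job_owners` mapping, and the injected `presign` callable are all hypothetical names. The bucket name and the `jobs/{job_id}/result.tif` layout are taken from the example in this document.

```python
from datetime import timedelta
from typing import Callable, Mapping

# Expiry window from the workflow description (15-60 minutes).
MIN_EXPIRY = timedelta(minutes=15)
MAX_EXPIRY = timedelta(minutes=60)

RESULTS_BUCKET = "geocrop-results"


class PermissionDenied(Exception):
    """Raised when a user requests a job they do not own (hypothetical)."""


def build_download_url(
    user_id: str,
    job_id: str,
    job_owners: Mapping[str, str],
    presign: Callable[[str, str, timedelta], str],
    expires: timedelta = MIN_EXPIRY,
) -> str:
    """Validate access to a job, then return a short-lived presigned URL.

    `job_owners` maps job IDs to owning user IDs; it stands in for whatever
    permission check the API really uses. `presign` is assumed to wrap a
    storage client call that returns a presigned GET URL.
    """
    # Step 2: the API validates that the user may access the job.
    if job_owners.get(job_id) != user_id:
        raise PermissionDenied(f"user {user_id!r} cannot access job {job_id!r}")

    # Step 3: enforce the short expiration window before signing.
    if not MIN_EXPIRY <= expires <= MAX_EXPIRY:
        raise ValueError("expires must be between 15 and 60 minutes")

    object_name = f"jobs/{job_id}/result.tif"
    return presign(RESULTS_BUCKET, object_name, expires)
```

With the MinIO Python client, `presign` could be a thin wrapper such as `lambda bucket, name, exp: client.presigned_get_object(bucket, name, expires=exp)`. Passing the signer in as a callable keeps the permission and expiry checks testable without a running MinIO instance.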