Retrieval-aware SAR ship detection for Indian coastal waters — Sentinel-1 GRD → CA-CFAR → YOLOv8
This pipeline ingests Sentinel-1 GRD IW VV scenes from the Copernicus Data Space Ecosystem, applies radiometric calibration and adaptive ship detection, and feeds candidate tiles into a YOLOv8 model for final classification. The three-stage design discards roughly 88–97 % of empty ocean before any neural network inference runs, cutting GPU cost by about an order of magnitude on 25 000 × 16 000 px SAR imagery.
Target area: Indian Exclusive Economic Zone (68–97.5 °E, 8–24 °N) — 2.37 million km².
┌─────────────────────────────────────────────────────────────────────┐
│ Stage 1 — Search & Download │
│ Copernicus CDSE OData v1 API → Sentinel-1 GRD IW VV │
│ OAuth2 / Keycloak ROPC auth · online-only filter · resume download │
└──────────────────────────┬──────────────────────────────────────────┘
│ raw .SAFE.zip archives
┌──────────────────────────▼──────────────────────────────────────────┐
│ Stage 2 — SAR Pre-Filter │
│ 2A DN → σ°_dB radiometric calibration (ESA MPC-0307 §6.5) │
│ 2B GSHHG water masking (high-res coastline, level-1 land) │
│ 2C CA-CFAR adaptive detection (integral image, O(1) per pixel) │
│ 2D 512×512 sliding-window tiles (50% overlap, min-hits filter) │
└──────────────────────────┬──────────────────────────────────────────┘
│ candidate tiles (~5–12% of scene)
┌──────────────────────────▼──────────────────────────────────────────┐
│ Stage 3 — YOLOv8 Inference │
│ 64-tile batch inference · hybrid scoring (YOLO + CFAR ratio) │
│ Geographic NMS · GeoJSON + CSV + annotated PNG output │
└─────────────────────────────────────────────────────────────────────┘
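Step 2A in the diagram can be illustrated with a minimal sketch. It assumes the standard Sentinel-1 relation σ° = DN² / A², where A is the sigmaNought calibration factor from the product's calibration annotation, already interpolated to pixel positions. The function name `dn_to_sigma0_db` and the `floor` clamp are hypothetical and may not match the repository's `preprocessor.py`.

```python
import numpy as np

def dn_to_sigma0_db(dn, sigma_nought_lut, floor=1e-10):
    """Convert GRD digital numbers to sigma-nought in decibels.

    dn               : 2-D array of detected amplitudes (DN)
    sigma_nought_lut : calibration factor A, broadcastable to dn's shape
                       (assumed already interpolated from the annotation LUT)
    floor            : hypothetical lower clamp that avoids log10(0)
                       on no-data pixels
    """
    dn = np.asarray(dn, dtype=np.float64)
    a = np.asarray(sigma_nought_lut, dtype=np.float64)
    sigma0 = dn ** 2 / a ** 2                      # linear sigma-nought
    return 10.0 * np.log10(np.maximum(sigma0, floor))

# Toy example with a constant calibration factor of 500
sample_dn = np.array([[0, 100], [500, 1000]])
sample_db = dn_to_sigma0_db(sample_dn, 500.0)
print(sample_db)
```

A pixel whose DN equals the calibration factor maps to 0 dB, and doubling the DN adds about 6 dB.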
- Zero-cloud dependency — runs fully offline once data is downloaded
- Resumable downloads — partial .zip.part files survive network drops
- CA-CFAR with integral image — O(1) per pixel, processes a 25k×16k scene in ~40 s on CPU
- GSHHG auto-download — coastline DB fetched automatically on first run
- Hybrid confidence score — 0.6 × yolo_conf + 0.4 × min(cfar_ratio/5, 1.0) merges CFAR and YOLO signal
- GeoJSON output — ship detections open directly in QGIS or Google Earth
- Streamlit dashboard — live map, ship gallery, tile browser, pipeline runner, structured logs
- 76-test suite — unit + end-to-end integration tests, no internet required
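The hybrid confidence score listed above is simple enough to write out directly. This is a sketch of the stated formula only. The function name `hybrid_score` is hypothetical, and `cfar_ratio` is assumed to be a non-negative ratio of target brightness to the local CFAR background.

```python
def hybrid_score(yolo_conf: float, cfar_ratio: float) -> float:
    """Blend YOLO confidence with a capped CFAR contrast term.

    The CFAR term saturates at 1.0 once cfar_ratio reaches 5, so a very
    bright target cannot contribute more than 0.4 to the final score.
    """
    cfar_term = min(cfar_ratio / 5.0, 1.0)
    return 0.6 * yolo_conf + 0.4 * cfar_term

print(hybrid_score(0.5, 2.5))   # moderate YOLO confidence, moderate contrast
print(hybrid_score(0.9, 10.0))  # CFAR term is capped at 1.0
```

With these weights, a detection that YOLO scores at 0.20 (the default threshold) can still reach 0.52 if the CFAR contrast is saturated.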
git clone https://github.com/NAVTEJJ/maritime-ship-detection.git
cd maritime-ship-detection
pip install -r requirements.txt
# For Stage 3 inference (optional):
pip install ultralytics huggingface-hub
python demo.py
Generates a synthetic 4096×4096 SAR scene with 10 planted ships, runs Stage 2 in full, and reports recall. Typical output:
Ships recovered : 10 / 10 (recall = 100 %)
CFAR hits : 1926
Candidate tiles : 195
Pipeline time : 33.2 s
# Set credentials (free Copernicus account required)
export CDSE_USERNAME="your-email@example.com"
export CDSE_PASSWORD="yourpassword"
# Search + download + pre-filter + inference
python main.py pipeline
# Or run stages individually
python main.py search # Stage 1 only
python main.py process # Stage 2 only
python main.py infer # Stage 3 only
python -m streamlit run app.py
# Navigate to http://localhost:8501
Performance across five Sentinel-1 IW GRD scenes over the Indian EEZ:
| Scene | Tiles | CFAR hits | Pre-filter savings | Run time |
|---|---|---|---|---|
| S1A 2026-04-26 | 152 | 3 841 | 94.2 % | 178 s |
| S1A 2026-04-17 | 248 | 5 102 | 90.5 % | 211 s |
| S1A 2026-04-18 #1 | 78 | 1 209 | 97.1 % | 162 s |
| S1A 2026-04-18 #2 | 312 | 6 714 | 88.3 % | 287 s |
| S1A 2026-04-18 #3 | 183 | 3 182 | 93.1 % | 195 s |
Average pre-filter efficiency: 92.6 % — 9 in 10 YOLOv8 forward passes eliminated before they happen.
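The average can be reproduced from the table with a few lines of arithmetic. The dictionary below copies the "Pre-filter savings" column, and the variable names are illustrative only.

```python
# Pre-filter savings (%) per scene, copied from the table above
savings = {
    "S1A 2026-04-26": 94.2,
    "S1A 2026-04-17": 90.5,
    "S1A 2026-04-18 #1": 97.1,
    "S1A 2026-04-18 #2": 88.3,
    "S1A 2026-04-18 #3": 93.1,
}

mean_savings = sum(savings.values()) / len(savings)
kept_fraction = 1.0 - mean_savings / 100.0   # share of tiles that reach YOLOv8

print(f"mean savings : {mean_savings:.1f} %")
print(f"tiles kept   : {kept_fraction:.1%}")
print(f"reduction    : {1.0 / kept_fraction:.1f}x fewer forward passes")
```

The per-scene values range from 88.3 % to 97.1 %, so the reduction in forward passes varies between roughly 8.5x and 34x depending on the scene.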
All tunables live in config/config.yaml:
| Key | Default | Effect |
|---|---|---|
| search.days_back | 7 | Archive search window (days) |
| search.max_results | 4 | Maximum scenes per run |
| cfar.pfa | 1e-6 | Probability of false alarm |
| cfar.guard_size | 4 | Guard window half-width (pixels) |
| cfar.train_size | 12 | Training window half-width (pixels) |
| tiling.tile_size | 512 | Tile size (must match model input) |
| tiling.stride | 256 | Sliding-window step (50 % overlap) |
| tiling.min_detections | 25 | Minimum CFAR hits to keep a tile |
| inference.confidence | 0.20 | YOLOv8 confidence threshold |
| inference.model_source | yolov8m.pt | Local .pt or HuggingFace repo ID |
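To show how the three cfar.* settings interact, here is a minimal CA-CFAR sketch built on a summed-area table. It makes several assumptions that may not match the repository's `cfar.py`: the input is linear power (not dB), the training band extends `train_size` pixels beyond the guard window, the threshold multiplier is the textbook value α = N(Pfa^(−1/N) − 1) for exponentially distributed clutter, and border pixels without a full window are simply left undetected.

```python
import numpy as np

def ca_cfar(power, pfa=1e-6, guard_size=4, train_size=12):
    """Cell-averaging CFAR on a 2-D linear-power image (illustrative sketch)."""
    power = np.asarray(power, dtype=np.float64)
    h, w = power.shape
    inner = guard_size                  # half-width of the guard window
    outer = guard_size + train_size     # half-width of the full window

    # Summed-area table with a leading row and column of zeros
    sat = np.zeros((h + 1, w + 1))
    sat[1:, 1:] = power.cumsum(axis=0).cumsum(axis=1)

    def box_sum(r):
        # Sum of the (2r+1) x (2r+1) box centred on every interior pixel,
        # obtained from four table lookups per pixel (O(1) each).
        y0, y1 = outer - r, h - outer - r
        x0, x1 = outer - r, w - outer - r
        d = 2 * r + 1
        return (sat[y0 + d:y1 + d, x0 + d:x1 + d] - sat[y0:y1, x0 + d:x1 + d]
                - sat[y0 + d:y1 + d, x0:x1] + sat[y0:y1, x0:x1])

    n_train = (2 * outer + 1) ** 2 - (2 * inner + 1) ** 2
    alpha = n_train * (pfa ** (-1.0 / n_train) - 1.0)
    clutter_mean = (box_sum(outer) - box_sum(inner)) / n_train

    detections = np.zeros((h, w), dtype=bool)
    interior = power[outer:h - outer, outer:w - outer]
    detections[outer:h - outer, outer:w - outer] = interior > alpha * clutter_mean
    return detections

# Flat background with one bright pixel standing in for a ship
scene = np.ones((128, 128))
scene[64, 64] = 1000.0
hits = ca_cfar(scene)
print(int(hits.sum()), bool(hits[64, 64]))
```

With the defaults the training band holds 33² − 9² = 1008 cells and α is about 13.9, so a pixel must exceed the local mean by roughly 11 dB to be flagged. Raising `pfa` lowers α and admits more candidates.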
maritime_ship_detection/
├── stage1_search/ # Copernicus API auth, query, download
│ ├── auth.py # OAuth2 / Keycloak ROPC token flow
│ ├── query.py # OData v1 scene search with spatial filter
│ └── downloader.py # Resumable chunked download + retry
├── stage2_filter/ # SAR processing chain
│ ├── preprocessor.py # DN → σ°_dB calibration
│ ├── water_mask.py # GSHHG land masking
│ ├── cfar.py # CA-CFAR via summed-area table
│ └── tiler.py # 512×512 sliding-window tiler
├── stage3_detect/ # YOLOv8 inference
│ ├── inference.py # Batch inference + hybrid scoring
│ └── results_writer.py # GeoJSON / CSV / annotated-PNG output
├── utils/
│ ├── logging_config.py # Structured JSON logging, rotating handler
│ ├── pipeline_logger.py # Per-run report generation
│ └── coastline_fetch.py # GSHHG auto-download
├── tests/
│ ├── unit/ # 62 unit tests (CFAR, tiler, auth, …)
│ └── integration/ # 14 end-to-end pipeline tests
├── config/config.yaml # All runtime parameters
├── app.py # Streamlit web dashboard
├── main.py # CLI entry point
├── demo.py # Offline synthetic-scene demo
└── requirements.txt
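The tiling step handled by stage2_filter/tiler.py can be sketched as follows, using the default tile size, stride and min-hits values from config/config.yaml. The function name `candidate_tiles` is hypothetical, and this simplified version ignores partial tiles at the right and bottom edges, which the real tiler may treat differently.

```python
import numpy as np

def candidate_tiles(detections, tile_size=512, stride=256, min_detections=25):
    """Return (row, col, hits) for every tile holding enough CFAR hits.

    detections : 2-D boolean CFAR mask
    Tiles are placed on a regular grid with the given stride; tiles that
    would extend past the image edge are skipped in this sketch.
    """
    h, w = detections.shape
    tiles = []
    for y in range(0, max(h - tile_size, 0) + 1, stride):
        for x in range(0, max(w - tile_size, 0) + 1, stride):
            n_hits = int(detections[y:y + tile_size, x:x + tile_size].sum())
            if n_hits >= min_detections:
                tiles.append((y, x, n_hits))
    return tiles

# A 1024 x 1024 mask with one 5 x 6 cluster of hits (30 pixels)
mask = np.zeros((1024, 1024), dtype=bool)
mask[600:605, 600:606] = True
kept = candidate_tiles(mask)
print(kept)
```

Because of the 50 % overlap, the single cluster is covered by four of the nine tiles in the grid, so a ship near a tile boundary still appears fully inside at least one tile.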
pip install pytest pytest-mock
pytest tests/ -v
The suite covers CFAR mathematics, tiling edge cases, water mask rasterisation, OAuth2 token refresh, OData query serialisation, calibration formula correctness, and a full end-to-end synthetic pipeline run, all without network or file-system access.
tests/integration/test_pipeline_e2e.py ............. 14 passed
tests/unit/test_auth.py ...... 6 passed
tests/unit/test_cfar.py .......... 10 passed
tests/unit/test_preprocessor.py ......... 9 passed
tests/unit/test_query.py .......... 10 passed
tests/unit/test_tiler.py .................... 20 passed
tests/unit/test_water_mask.py ...... 7 passed
──────────────────────────────────────────────────────────
76 passed, 1 warning in 21.37 s
Sentinel-1 GRD IW scenes are downloaded automatically by Stage 1 from the Copernicus Data Space Ecosystem. A free account is required.
Each scene is ~700 MB–1.5 GB. The data/downloads/ directory is excluded from version control.
The GSHHG coastline database is fetched automatically on first run (no account needed).
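For orientation, the scene search that precedes the download might build a catalogue URL along these lines. This is a sketch, not the repository's query.py: the helper name `build_search_url` is hypothetical, the filter expressions follow the CDSE OData conventions as I understand them and should be checked against the current CDSE documentation, and the online-only filter is omitted. The code only constructs the URL and makes no network call.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import quote, urlencode

CATALOGUE_URL = "https://catalogue.dataspace.copernicus.eu/odata/v1/Products"

def build_search_url(west, south, east, north,
                     days_back=7, max_results=4, now=None):
    """Build an OData query for recent Sentinel-1 IW GRD scenes in a bbox."""
    now = now or datetime.now(timezone.utc)
    start = (now - timedelta(days=days_back)).strftime("%Y-%m-%dT%H:%M:%S.000Z")
    # WKT ring in lon/lat order, closed by repeating the first vertex
    ring = (f"{west} {south},{east} {south},{east} {north},"
            f"{west} {north},{west} {south}")
    odata_filter = (
        "Collection/Name eq 'SENTINEL-1'"
        " and contains(Name,'IW_GRDH')"
        f" and OData.CSC.Intersects(area=geography'SRID=4326;POLYGON(({ring}))')"
        f" and ContentDate/Start gt {start}"
    )
    params = {
        "$filter": odata_filter,
        "$orderby": "ContentDate/Start desc",
        "$top": max_results,
    }
    # Keep "$" literal in the parameter names; percent-encode everything else
    return f"{CATALOGUE_URL}?{urlencode(params, quote_via=quote, safe='$')}"

# Bounding box of the target area, with a fixed clock for reproducibility
url = build_search_url(68, 8, 97.5, 24,
                       now=datetime(2026, 4, 26, tzinfo=timezone.utc))
print(url)
```

The `days_back` and `max_results` arguments mirror the search.days_back and search.max_results defaults in config/config.yaml.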
The pipeline expects a YOLOv8 .pt model trained on SAR ship imagery. Two options:
Option A — local file:
# config/config.yaml
inference:
  model_source: "models/ship_detector.pt"
Option B — HuggingFace Hub (auto-download):
inference:
  model_source: "keremberke/yolov8m-sar-ship-detection"
MIT — see LICENSE.
- Copernicus Data Space Ecosystem — Sentinel-1 satellite data
- GSHHG — Global coastline database (Wessel & Smith, 1996)
- Ultralytics YOLOv8 — Object detection framework
- ESA Sentinel-1 Mission Performance Centre — radiometric calibration specification (MPC-0307)