nnunet_serve¶
Context¶
nnUNet is a relatively flexible framework. However, it is not exactly what people would call "production ready". With nnunet_serve, we have developed a container that allows users to run nnUNet as an API or as a CLI tool while keeping a relatively stable pool of models.
Main features¶
- Single case inference from and to multiple formats (from: Nifti, DICOM; to: Nifti, DICOM-seg, RT-struct, fractional DICOM-seg)
- Batch inference using the aforementioned options (with background file writing to accelerate processing)
- Model cascading: multiple models can be concatenated with being stuck to strict folder structures
- Example 1: segment prostate → crop to prostate → detect prostate cancer
- Example 2: segment prostate zones → crop to prostate zones → use prostate zones as input → segment csPCa
- Example 3: segment liver → crop to liver → segment HCC → exclude HCC with 0% overlap with liver
- Integration with Orthanc: Orthanc is the most open-source DICOM-web server, making nnunet_serve a very reasonable and appealing infrastructure for research
- TotalSegmentator integration: TotalSegmentator is the largest suite of nnU-Net models for multiple CT and MRI tasks. We improve on their framework and greatly reduce inference times through refactoring and keeping series/inferences in memory
- API: unlike typical workflows for nnU-Net, which depend on CLI-based routines, we have developed an API which guarantees integration with web-based services
- Integration with both SNOMED-CT and EUCAIM ontologies: ontology integration allows the simple specification of DICOM-seg/RTstruct metadata, lifting the burden of generating custom files for specific structures
Installation¶
Installation requirements are handled by uv (https://github.com/ultralytics/uv). uv is a tool for managing Python packages and dependencies.
Requirements¶
uv- usinguvmakes this all very easy as it manages Python packages. The installation is handled lazily (i.e. at runtime)- CUDA-compatible GPU cards
Operational notes¶
- Strict GPU requirement: The server requires an NVIDIA GPU and
nvidia-smi. It waits for a GPU with at least the model’smin_memfree memory (wait_for_gpu()), using the maximummin_memacross models for multi-model requests. CPU-only systems are not supported. - CORS: No CORS middleware is configured by default. If you expose the API to browsers, configure CORS as appropriate for your deployment.
- Debug mode: Set environment variable
DEBUG=1to disable try/except around inference.
Developer Documentation¶
The codebase follows Google-style docstrings for all functions and classes. If you are a developer looking to extend nnunet_serve, you can find detailed documentation for all core modules in the src/nnunet_serve directory.
Citation¶
If you use this repository please cite the Zenodo repository as below.
APA
de Almeida, J. G., & Papanikolaou, N. (2026). josegcpa/nnunet_serve: v0.1.2 (v0.1.2). Zenodo. https://doi.org/10.5281/zenodo.17522203
BibTex
@software{de_almeida_2026_17522203,
author = {de Almeida, José Guilherme and
Papanikolaou, Nikolaos},
title = {josegcpa/nnunet\_serve: v0.1.2},
month = feb,
year = 2026,
publisher = {Zenodo},
version = {v0.1.2},
doi = {10.5281/zenodo.17522203},
url = {https://doi.org/10.5281/zenodo.17522203},
swhid = {swh:1:dir:af8aa6feda0eb9a33d98a4629a978bc289ad9537
;origin=https://doi.org/10.5281/zenodo.17522202;vi
sit=swh:1:snp:ff077fba54804103b26419786f5f4035a9ae
3fa6;anchor=swh:1:rel:75c79771ab9c7e121eae8b4e50f4
5fe396abe1dc;path=josegcpa-nnunet\_serve-c5a1f06
},
}