MitoEM (Instance Segmentation)
This tutorial reproduces 3D mitochondria instance segmentation on
the MitoEM dataset released by Wei et al. in 2020. The recipe lives
under tutorials/mitoEM/ with three dataset-specific entry points
(R.yaml for MitoEM-Rat, H.yaml for MitoEM-Human, HR.yaml for joint
training) sharing common.yaml.
The pipeline is multi-task: predict short-range affinity, long-range affinity (radius 5), and a skeleton-aware EDT head, then decode with a distance-watershed step. Evaluation uses Adapted Rand and Variation of Information.
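To make the two affinity heads concrete, here is a minimal NumPy sketch of how binary affinity targets can be derived from an instance-label volume, using a 1-voxel offset for the short-range head and a 5-voxel offset for the long-range head. The function name `affinity_targets` and the convention of storing each affinity at the later voxel of the pair are assumptions made for illustration; PyTC's own target generation may follow a different convention.

```python
import numpy as np


def affinity_targets(labels: np.ndarray, offsets) -> np.ndarray:
    """Binary affinity maps for a (z, y, x) instance-label volume.

    Channel c is 1 where a voxel and its neighbour at `offsets[c]` carry the
    same non-zero instance id, and 0 elsewhere (including near the border).
    Only non-negative offsets are handled in this sketch.
    """
    aff = np.zeros((len(offsets),) + labels.shape, dtype=np.float32)
    for c, (dz, dy, dx) in enumerate(offsets):
        nz, ny, nx = labels.shape
        src = labels[dz:, dy:, dx:]                  # voxel at position p
        dst = labels[: nz - dz, : ny - dy, : nx - dx]  # voxel at p - offset
        same = (src == dst) & (src > 0)
        aff[c, dz:, dy:, dx:] = same                 # stored at p (assumed convention)
    return aff


# Offsets in (z, y, x) order: short range (1 voxel) and long range (5 voxels).
SHORT_RANGE = [(0, 0, 1), (0, 1, 0), (1, 0, 0)]
LONG_RANGE = [(0, 0, 5), (0, 5, 0), (5, 0, 0)]
```

The long-range channels stay 0 across thin gaps between instances that a 1-voxel offset would only see at the boundary itself, which is one reason to predict both ranges.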
Goal
The pipeline pins the following setup (encoded in
tutorials/mitoEM/common.yaml and inherited by R.yaml /
H.yaml / HR.yaml):
- Input: [32, 256, 256] patches at native MitoEM resolution, 30 × 8 × 8 nm.
- Model: MedNeXt-M, kernel size 3, checkpoint_style: outside_block, three output heads:
  - aff_r1: 3-channel short-range affinity at offsets (0, 0, 1) / (0, 1, 0) / (1, 0, 0);
  - aff_r5: 3-channel long-range affinity at offsets (0, 0, 5) / (0, 5, 0) / (5, 0, 0);
  - sdt: 1-channel skeleton-aware EDT head.
- Loss: per-channel BCE on each affinity head plus a SmoothL1 (tanh: true) on the EDT head, balanced by uncertainty loss-balancing.
- Augmentation: aug_em_neuron_fast profile with rotations on all three axes.
- Optimization: warmup_cosine_lr profile, 200 epochs × 1000 steps, accumulate_grad_batches=4, precision=bf16-mixed.
- Inference: sliding window 32 × 256 × 256, sw_batch_size=1, 50 % overlap, bump blending, replicate-padding mode; head set to aff_r1 for the saved primary output.
- Decoder: decode_distance_watershed over the EDT channel (distance_channels=[6], distance_threshold=[0.5, 0], min_seed_size=100, min_instance_size=50).
- Metric: adapted_rand + voi.
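The setup only names the balancing scheme as "uncertainty", so the following is a sketch under an assumption: that it refers to the widely used homoscedastic-uncertainty weighting, where each task i has a learnable log-variance s_i and the total is the sum of exp(-s_i) · L_i + s_i. The function name and the example numbers are hypothetical, and PyTC's actual implementation may differ in its constants or parameterization.

```python
import math


def uncertainty_weighted_total(losses: dict, log_vars: dict):
    """Combine per-task losses as sum_i exp(-s_i) * L_i + s_i.

    `losses` maps task name -> raw loss value, `log_vars` maps task name ->
    s_i = log(sigma_i^2). In training, s_i would be a learnable parameter;
    plain floats are used here so the arithmetic is easy to follow.
    """
    weighted = {k: math.exp(-log_vars[k]) * v for k, v in losses.items()}
    total = sum(weighted.values()) + sum(log_vars[k] for k in losses)
    return total, weighted


# Hypothetical values for the three heads of this tutorial.
raw = {"aff_r1": 0.20, "aff_r5": 0.40, "sdt": 0.10}
s = {"aff_r1": 0.0, "aff_r5": math.log(2.0), "sdt": 0.0}
total, weighted = uncertainty_weighted_total(raw, s)
```

In this formulation a task with a larger s_i has its loss term scaled down, while the additive s_i penalty keeps the model from inflating every variance to shrink the total.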
1 - Get the data
The MitoEM dataset is publicly available at the project page and the MitoEM Challenge. On the lab cluster it is staged at:
/projects/weilab/dataset/mito/mitoEM/
  EM30-R/    # rat
    im_train.h5, mito_train-v2.h5
    im_val.h5, mito_val-v2.h5
    im_test.h5, mito_test-v2.h5
  EM30-H/    # human
    (same layout)
Each split is a 4096 × 4096 × {400|100|500} HDF5 stack (400 slices
for train, 100 for val, 500 for test) at 30 × 8 × 8 nm. The
train.data.root_path field in common.yaml points at this directory;
override it at the CLI if you stage data elsewhere.
The test labels for MitoEM challenge submission are not publicly
released; mito_test-v2.h5 here refers to the locally maintained
v2 labels for offline development.
2 - Run training
Pick the dataset variant and run:
conda activate pytc
# MitoEM-Rat
python scripts/main.py --config tutorials/mitoEM/R.yaml
# MitoEM-Human
python scripts/main.py --config tutorials/mitoEM/H.yaml
# Joint (rat + human in the same training run)
python scripts/main.py --config tutorials/mitoEM/HR.yaml
The config sets system.num_gpus: -1 and system.num_workers: -1,
so PyTC fans out across every visible GPU.
Training schedule:
- max_epochs=200, n_steps_per_epoch=1000 → 200 k optimizer steps total.
- accumulate_grad_batches=4 with batch_size=1 per GPU → effective batch size 4 × num_gpus.
- checkpoint.monitor=val_loss_total with mode=min, save_top_k=3.
- Image previews on the aff_r1 head every 10 epochs.
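The schedule numbers above follow from a few multiplications. The helper below is a small sketch that reproduces them for a given GPU count; `schedule_summary` is a hypothetical name, and the step count simply follows this tutorial's 200 × 1000 figure.

```python
def schedule_summary(max_epochs=200, steps_per_epoch=1000, batch_size=1,
                     accumulate=4, num_gpus=1, preview_every=10):
    """Derive headline schedule numbers from the training settings."""
    return {
        # Steps as counted in this tutorial: epochs x steps per epoch.
        "total_steps": max_epochs * steps_per_epoch,
        # Samples contributing to one weight update across all GPUs.
        "effective_batch": batch_size * accumulate * num_gpus,
        # Number of times image previews are logged over the run.
        "preview_events": max_epochs // preview_every,
    }


summary = schedule_summary(num_gpus=4)
```

With four GPUs this gives an effective batch size of 16, so settings tuned on a single GPU may need their learning rate revisited when the GPU count changes.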
Outputs land in
outputs/mitoem30{r,h,hr}_mednext_sdt_multitask/<timestamp>/.
Monitor with TensorBoard:
just tensorboard mitoem30r_mednext_sdt_multitask
3 - Inference, decoding, evaluation
Run the combined test mode:
python scripts/main.py --config tutorials/mitoEM/R.yaml \
--mode test \
--checkpoint outputs/mitoem30r_mednext_sdt_multitask/<timestamp>/checkpoints/last.ckpt
What happens, in order:
1. Inference. Sliding window 32 × 256 × 256, 50 % overlap, bump blending, padding_mode=replicate. The primary head aff_r1 is selected at save time, and per-channel sigmoid is applied. The raw 7-channel multi-head prediction is saved as test_im_prediction.h5.
2. Decoding. decode_distance_watershed runs on the EDT channel (channel 6), seeded at distance > 0.5, growing until the distance hits 0, with seeds < 100 voxels and instances < 50 voxels filtered out. The fast EDT path is enabled with edt_parallel=8.
3. Evaluation. Adapted Rand and Variation of Information against the test labels; written next to the segmentation.
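The decoding step can be sketched as a seeded, priority-flood watershed on the distance map. The code below is an unoptimized illustration in plain NumPy and is not PyTC's `decode_distance_watershed`; the function name `distance_watershed`, the face connectivity, and the tie-breaking order are assumptions. It mirrors the three stages described above: threshold and filter seeds, grow while the distance stays above the foreground threshold, then drop small instances.

```python
import heapq
from collections import deque

import numpy as np


def _neighbours(idx, shape):
    """Yield the face-connected neighbours of `idx` that lie inside `shape`."""
    for axis in range(len(shape)):
        for step in (-1, 1):
            n = list(idx)
            n[axis] += step
            if 0 <= n[axis] < shape[axis]:
                yield tuple(n)


def distance_watershed(dist, seed_thr=0.5, fg_thr=0.0,
                       min_seed_size=100, min_instance_size=50):
    """Seeded watershed on a distance map (illustrative, slow on large volumes)."""
    seg = np.zeros(dist.shape, dtype=np.int32)
    seen = np.zeros(dist.shape, dtype=bool)
    next_id = 1
    # 1) Seeds: connected components of dist > seed_thr, dropping small ones.
    for start in zip(*np.nonzero(dist > seed_thr)):
        if seen[start]:
            continue
        seen[start] = True
        comp, queue = [start], deque([start])
        while queue:
            for n in _neighbours(queue.popleft(), dist.shape):
                if not seen[n] and dist[n] > seed_thr:
                    seen[n] = True
                    comp.append(n)
                    queue.append(n)
        if len(comp) >= min_seed_size:
            for v in comp:
                seg[v] = next_id
            next_id += 1
    # 2) Grow: visit voxels from high to low distance while dist > fg_thr.
    heap = [(-dist[v], v) for v in zip(*np.nonzero(seg))]
    heapq.heapify(heap)
    while heap:
        _, v = heapq.heappop(heap)
        for n in _neighbours(v, dist.shape):
            if seg[n] == 0 and dist[n] > fg_thr:
                seg[n] = seg[v]
                heapq.heappush(heap, (-dist[n], n))
    # 3) Remove instances smaller than min_instance_size.
    ids, counts = np.unique(seg[seg > 0], return_counts=True)
    seg[np.isin(seg, ids[counts < min_instance_size])] = 0
    return seg
```

With the 7-channel prediction loaded into a hypothetical array `pred`, the tutorial's settings would correspond to `distance_watershed(pred[6], 0.5, 0.0, 100, 50)`. A pure-Python loop like this is only practical on small crops.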
To swap in the aff_r5 head as the primary inference output:
python scripts/main.py --config tutorials/mitoEM/R.yaml \
--mode test --checkpoint <ckpt> \
inference.model.head=aff_r5
4 - Submitting to the MitoEM Challenge
The Grand Challenge accepts segmentation HDF5 volumes. After --mode
test produces the segmentation under outputs/.../results_step=<N>/,
follow the formatting rules at
https://mitoem.grand-challenge.org/ and submit. Performance on the
challenge test split is only computable on the Grand Challenge website
because public ground truth is not released for that split.
Per-volume offline evaluation on the validation split (provided in
EM30-{R,H}/mito_val-v2.h5) uses the same adapted_rand + voi
metrics described above; just point test.data.test at the val
volumes.
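For reference, both metrics can be computed from the contingency table between a predicted segmentation and the ground truth. The sketch below uses one common formulation (ground-truth background ignored, VOI in bits, reported as split and merge terms); the function names are illustrative, and PyTC's `adapted_rand` and `voi` may differ in background handling or normalization. It assumes the ground truth contains at least one foreground voxel.

```python
import numpy as np


def _contingency(seg, gt):
    """Joint label distribution over voxels where the ground truth is non-zero."""
    mask = gt.ravel() > 0
    s, g = seg.ravel()[mask], gt.ravel()[mask]
    _, s_idx = np.unique(s, return_inverse=True)
    _, g_idx = np.unique(g, return_inverse=True)
    table = np.zeros((s_idx.max() + 1, g_idx.max() + 1), dtype=np.float64)
    np.add.at(table, (s_idx, g_idx), 1.0)  # rows: predicted ids, cols: gt ids
    return table / table.sum()


def adapted_rand_error(seg, gt):
    """1 - F-score of the Rand precision and recall (0 is a perfect match)."""
    p = _contingency(seg, gt)
    sum_p2 = (p ** 2).sum()
    precision = sum_p2 / (p.sum(axis=1) ** 2).sum()
    recall = sum_p2 / (p.sum(axis=0) ** 2).sum()
    return 1.0 - 2.0 * precision * recall / (precision + recall)


def variation_of_information(seg, gt):
    """Return (split, merge) = (H(seg | gt), H(gt | seg)) in bits."""
    p = _contingency(seg, gt)
    p_seg = p.sum(axis=1, keepdims=True)
    p_gt = p.sum(axis=0, keepdims=True)
    nz = p > 0
    merge = -(p[nz] * np.log2((p / p_seg)[nz])).sum()
    split = -(p[nz] * np.log2((p / p_gt)[nz])).sum()
    return split, merge
```

Both metrics are invariant to relabeling, so predicted instance ids do not need to match the ground-truth ids.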
5 - Reference behavior
A few sanity-check signals:
- Training loss has three components (aff_r1, aff_r5, sdt) and uncertainty-balanced weights. The train_loss_term_*_weighted scalars logged in TensorBoard are the most informative; uncertainty balancing typically pushes the aff_r5 term down faster than aff_r1 because the long-range task is harder.
- Validation loss is checked at every epoch boundary; the best-3 checkpoints by val_loss_total are kept.
- Inference on the 4096 × 4096 × 500 test volume is the dominant cost; expect roughly 1-2 hours on a single A100/H100 with sw_batch_size=1.
- Decoder threshold (distance_threshold[0]) is the primary knob for over- / under-segmentation. The default 0.5 is a reasonable starting point; lowering it (e.g. to 0.3) admits more voxels as seeds, which recovers small instances but can merge touching ones.
- An Adapted Rand error below ~0.05 (lower is better) on the validation split is in the ballpark of the published MitoEM-Rat baseline. The challenge uses AP-75 (average precision at IoU 0.75), which is computed by the Grand Challenge submission system.
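The inference cost quoted above can be related to the number of sliding windows. The sketch below counts windows for a 32 × 256 × 256 patch at 50 % overlap over a 500 × 4096 × 4096 (z, y, x) volume, assuming the last window along each axis is clamped to the volume edge; that tiling rule and the function names are assumptions, so the exact count in PyTC may differ slightly.

```python
import math


def window_starts(size, window, overlap=0.5):
    """Start indices along one axis; the final window is clamped to the edge."""
    stride = max(1, int(window * (1 - overlap)))
    n = max(1, math.ceil((size - window) / stride) + 1)
    last = max(size - window, 0)
    return [min(i * stride, last) for i in range(n)]


def count_windows(volume, window, overlap=0.5):
    """Total number of windows over all axes."""
    return math.prod(len(window_starts(s, w, overlap)) for s, w in zip(volume, window))


n_windows = count_windows((500, 4096, 4096), (32, 256, 256))
seconds_per_window = [hours * 3600 / n_windows for hours in (1, 2)]
```

Under these assumptions there are 31 window positions per axis, 29,791 windows in total, so a 1-2 hour run corresponds to roughly 0.12 to 0.24 s per window.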