Learning multi-site harmonization of magnetic resonance images without traveling human phantoms
Authors: Siyuan Liu, Pew-Thian Yap
Year: 2024
Venue: Communications Engineering (https://www.nature.com/articles/s44172-023-00140-w)
1. Classification
- Domain Category:
- MRI harmonization / domain adaptation. This paper introduces a deep learning framework (MURD) for multi-site MRI harmonization that does not require traveling human phantoms.
- FM Usage Type:
- Method / pre-processing. MURD is an image-space harmonization method that can be applied before downstream foundation models or classical pipelines; it is not itself a foundation model.
- Key Modalities:
- Structural MRI (T1- and T2-weighted brain images) from multiple sites and scanners.
2. Executive Summary
Large multi-site MRI studies (ABCD, ADNI, etc.) suffer from site-specific appearance differences due to vendor, scanner, and protocol variability, which introduce non-biological variance that can confound downstream analyses. Traditional retrospective harmonization methods either operate on global intensity distributions (e.g., ComBat) or require traveling human phantoms (the same subjects scanned at multiple sites) to supervise deep learning models, which is logistically impractical at scale.
This paper proposes MURD (multi-site unsupervised representation disentangler), a deep generative framework that learns to disentangle each image into a site-invariant anatomical content representation and a site-specific style representation (intensity/contrast). MURD uses content and style encoders plus a generator to synthesize images for any target site by recombining content with different site styles. Crucially, it learns these representations without paired multi-site scans of the same subject, relying instead on unpaired images and multi-domain image-to-image translation techniques. On >6,000 multi-site T1/T2 images, MURD generates harmonized images that match site-specific appearances while preserving anatomical details, improving cross-site consistency for downstream tasks and enabling retrospective harmonization of existing datasets.
3. Problem Setup and Motivation
- Scientific / practical problem:
- Harmonize MRI data across sites and scanners without requiring traveling human phantoms or exhaustive prospective protocol tuning.
- Reduce non-biological variability to improve reliability and power of downstream analyses (segmentation, registration, diagnosis, FM training).
- Why this is hard:
- Scanner / protocol variability: Different vendors, coils, pulse sequences, and parameter settings produce distinct intensity and contrast profiles.
- Limited paired data: Paired multi-site scans of the same subject are rare and expensive to obtain.
- Existing unsupervised methods: Pairwise translation methods trained on unpaired data (CycleGAN-style) require $N(N-1)$ directed mappings for $N$ sites and do not fully exploit the structure shared across all sites.
- Preserving anatomy: Harmonization must not alter anatomical structures or disease-related features while adjusting appearance.
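The scaling argument above can be made concrete with a few lines of arithmetic. The sketch below counts one directed mapping per ordered pair of sites, which is where $N(N-1)$ comes from; the function name `pairwise_mappings` is a hypothetical helper introduced here for illustration, not something from the paper.

```python
def pairwise_mappings(n_sites: int) -> int:
    """Number of directed site-to-site mappings when every ordered pair
    of sites gets its own translator (hypothetical helper)."""
    if n_sites < 1:
        raise ValueError("n_sites must be at least 1")
    return n_sites * (n_sites - 1)


# Compare the pairwise count with the single unified model described in the notes.
for n in (2, 5, 10, 20):
    print(f"{n:>2} sites: {pairwise_mappings(n):>3} pairwise mappings vs. 1 unified model")
```

The count grows quadratically with the number of sites, while the unified model stays at one network regardless of $N$.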
4. Data and Modalities
- Datasets used:
- More than 6,000 multi-site T1- and T2-weighted brain MRIs acquired across several scanners and protocols (details in the paper’s dataset section).
- Multi-site configuration with $N$ imaging sites; each site contributes unpaired T1/T2 images.
- Modalities:
- Structural MRI (T1w, T2w) volumes.
- No non-imaging modalities (e.g., clinical text, genetics) are used for harmonization.
- Preprocessing / representation:
- Standard MRI preprocessing pipeline (brain extraction, bias correction, intensity normalization) before training.
- Images are processed in a CNN-compatible grid (3D or 2D slices; see paper for specifics).
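As a rough illustration of the intensity-normalization step mentioned above, here is a minimal NumPy sketch. It assumes a brain mask is already available from brain extraction and uses z-scoring inside the mask; both the choice of z-scoring and the helper name `zscore_in_mask` are assumptions made for this example, since these notes do not state which normalization the paper uses.

```python
import numpy as np


def zscore_in_mask(volume, mask):
    """Z-score voxel intensities inside a brain mask; background is set to 0.

    Assumption: z-scoring stands in for whatever normalization the paper applies.
    """
    volume = np.asarray(volume, dtype=np.float64)
    mask = np.asarray(mask, dtype=bool)
    if volume.shape != mask.shape:
        raise ValueError("volume and mask must have the same shape")
    if not mask.any():
        raise ValueError("mask is empty")
    values = volume[mask]
    std = values.std()
    if std == 0:
        raise ValueError("constant intensities inside the mask")
    normalized = np.zeros_like(volume)
    normalized[mask] = (values - values.mean()) / std
    return normalized


# Synthetic 3D volume with a cubic "brain" region, for demonstration only.
rng = np.random.default_rng(0)
demo_volume = rng.uniform(50.0, 200.0, size=(8, 8, 8))
demo_mask = np.zeros((8, 8, 8), dtype=bool)
demo_mask[2:6, 2:6, 2:6] = True
demo_normalized = zscore_in_mask(demo_volume, demo_mask)
print(demo_normalized[demo_mask].mean(), demo_normalized[demo_mask].std())
```

Restricting the statistics to the mask keeps background voxels from dominating the mean and standard deviation.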
5. Model / Method
- Model Type:
- Multi-domain image-to-image translation model with disentangled representations:
- Content encoder: captures site-invariant anatomical features.
- Style encoder: captures site-specific appearance (intensity/contrast).
- Generator: recombines content and style to synthesize images for specific target sites.
- Key components and innovations:
- Representation disentanglement:
- Each image $x_i$ is mapped to a pair $(c_i, s_i)$, where $c_i$ is its content code (anatomy) and $s_i$ is its style code (the appearance of its acquisition site).
- Content is shared across sites; style is site-specific.
- Unified multi-site model:
- Single model handles all $N$ sites, rather than learning $N(N-1)$ pairwise mappings.
- Style codes are indexed by site; content is site-invariant.
- Losses:
- Reconstruction losses to ensure $G(c_i, s_i)$ approximates the original image $x_i$.
- Style-consistency and content-consistency losses to encourage disentanglement.
- Adversarial / perceptual losses to ensure realistic harmonized images for each site.
- Training setup:
- Unpaired multi-site images; no traveling phantoms needed.
- Training objective encourages the generator to synthesize images that match the target site’s style while preserving content.
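The encode, recombine, and check pattern described in this section can be illustrated with a deliberately simple toy. In the sketch below, the "content code" is the standardized image and the "style code" is its global mean and standard deviation. This is an assumption made purely for illustration: MURD uses learned neural encoders and a learned generator trained with adversarial and consistency losses, and a global statistic like this is exactly the kind of coarse adjustment a learned model is meant to go beyond. All function names are hypothetical.

```python
import numpy as np


def encode_content(image):
    """Toy content code: image standardized to zero mean and unit variance."""
    image = np.asarray(image, dtype=np.float64)
    std = image.std()
    if std == 0:
        raise ValueError("constant image has no content to encode")
    return (image - image.mean()) / std


def encode_style(image):
    """Toy style code: global (mean, std) of the image intensities."""
    image = np.asarray(image, dtype=np.float64)
    return np.array([image.mean(), image.std()])


def generate(content, style):
    """Toy generator: re-apply a style (mean, std) to a content code."""
    mean, std = style
    return content * std + mean


rng = np.random.default_rng(0)
image_a = 2.0 * rng.normal(size=(8, 8)) + 10.0  # subject 1, "site A" appearance
image_b = 0.5 * rng.normal(size=(8, 8)) + 3.0   # subject 2, "site B" (unpaired)

c_a, s_a = encode_content(image_a), encode_style(image_a)
s_b = encode_style(image_b)

# Reconstruction: G(c_a, s_a) should give back the original image.
recon_loss = np.abs(generate(c_a, s_a) - image_a).mean()

# Translation to site B: content should be preserved, style should match site B.
translated = generate(c_a, s_b)
content_loss = np.abs(encode_content(translated) - c_a).mean()
style_loss = np.abs(encode_style(translated) - s_b).mean()

print(recon_loss, content_loss, style_loss)
```

In this closed-form toy all three quantities are zero up to floating-point error by construction; in a learned model they are loss terms that training drives down alongside the adversarial objective.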
6. Multimodal / Integration Aspects
- Not multimodal:
- MURD operates purely on structural MRI (T1/T2). It does not ingest genetics, fMRI, or clinical text directly.
- Integration relevance:
- MURD is a pre-harmonization step for any downstream modality that depends on T1/T2-derived features (e.g., FreeSurfer IDPs, surface meshes).
- Harmonized T1/T2 volumes can feed into neuroimaging FMs (BrainLM, SwiFT, BrainMT, Brain Harmony) or standard pipelines, reducing site confounds before embedding extraction.
7. Experiments and Results
Main findings
- Effective harmonization:
- MURD can synthesize images with realistic site-specific appearances for arbitrary target sites while preserving anatomical content, as assessed qualitatively and quantitatively.
- Harmonized images show improved inter-site consistency in intensity distributions and image statistics.
- Improved downstream performance:
- Models trained on harmonized images generalize better across sites compared to models trained on raw, unharmonized data.
- Harmonization benefits tasks such as segmentation, registration, and intensity-based measurements.
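One way to put a number on inter-site consistency of intensity distributions is to compare quantiles of voxel intensities between sites. The sketch below averages absolute quantile differences, which approximates the 1-Wasserstein distance between the two distributions. The metric choice, the helper name `quantile_distance`, and the synthetic data are assumptions for illustration; these notes do not specify which statistics the paper reports, and the linear rescaling is only a stand-in for a harmonizer's output.

```python
import numpy as np


def quantile_distance(a, b, n_quantiles=99):
    """Mean absolute difference between matching intensity quantiles.

    Approximates the 1-Wasserstein distance between the two samples.
    """
    q = np.linspace(0.01, 0.99, n_quantiles)
    qa = np.quantile(np.ravel(a), q)
    qb = np.quantile(np.ravel(b), q)
    return float(np.mean(np.abs(qa - qb)))


# Synthetic voxel intensities for two sites (illustrative only).
rng = np.random.default_rng(0)
site_a = rng.normal(100.0, 15.0, size=5000)
site_b = rng.normal(130.0, 25.0, size=4000)

# Stand-in for a harmonizer: linearly map site B onto site A's mean and spread.
site_b_to_a = (site_b - site_b.mean()) / site_b.std() * site_a.std() + site_a.mean()

before = quantile_distance(site_a, site_b)
after = quantile_distance(site_a, site_b_to_a)
print(f"distance before: {before:.2f}, after: {after:.2f}")
```

A smaller distance after harmonization indicates closer intensity distributions, but it says nothing about whether anatomy was preserved, so it should be paired with a structural check.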
Comparisons
- Outperforms traditional statistical harmonization (e.g., intensity normalization, ComBat) in capturing fine-grained, region-specific differences.
- More scalable than pairwise GAN-based harmonization methods that require separate mappings for each site pair.
8. Strengths and Limitations
Strengths
- No traveling phantoms required:
- Uses unpaired multi-site images, making it applicable to existing large-scale studies.
- Unified multi-site model:
- Single network handles harmonization across many sites, avoiding $N(N-1)$ pairwise mappings.
- Disentangled representations:
- Clear separation between anatomical content and site-specific style.
- Retrospective applicability:
- Can be applied to already collected datasets for harmonization without new acquisitions.
Limitations
- Training complexity:
- Deep generative model with multiple loss terms; training can be compute-intensive and sensitive to hyperparameters.
- Modality specificity:
- Focused on T1/T2 structural MRI; extension to other sequences/modalities requires additional work.
- Validation scope:
- Requires careful evaluation to ensure no subtle anatomical distortions are introduced that could bias downstream analyses.
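A simple sanity check for anatomical preservation is to compare edge maps before and after harmonization: a pure contrast change rescales edges but leaves their locations intact, whereas a geometric distortion moves them. The 2D sketch below uses gradient-magnitude correlation as a proxy. The metric, the helper names, and the synthetic image are assumptions for illustration, not the paper's validation protocol, and a high score does not rule out subtle local changes.

```python
import numpy as np


def edge_map(image):
    """Gradient magnitude of a 2D image."""
    gy, gx = np.gradient(np.asarray(image, dtype=np.float64))
    return np.hypot(gx, gy)


def edge_correlation(original, harmonized):
    """Pearson correlation between the edge maps of two 2D images."""
    a = edge_map(original).ravel()
    b = edge_map(harmonized).ravel()
    return float(np.corrcoef(a, b)[0, 1])


# Synthetic "anatomy": a smooth Gaussian blob on a 64x64 grid.
yy, xx = np.mgrid[0:64, 0:64]
image = 100.0 * np.exp(-((xx - 32) ** 2 + (yy - 32) ** 2) / (2 * 10.0 ** 2))

contrast_only = 0.6 * image + 20.0            # appearance change, same anatomy
distorted = np.roll(contrast_only, 8, axis=1)  # same contrast, shifted structure

contrast_corr = edge_correlation(image, contrast_only)
distorted_corr = edge_correlation(image, distorted)
print(f"contrast only: {contrast_corr:.3f}, distorted: {distorted_corr:.3f}")
```

In practice such a proxy would complement, not replace, task-level checks such as comparing segmentations or volumetric measurements before and after harmonization.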
9. Context and Broader Impact
- Relation to other work:
- Extends unsupervised MRI harmonization methods (e.g., CycleGAN-based, information-bottleneck approaches) to a multi-site, unified framework.
- Fits into a growing literature on deep generative harmonization tools intended to replace or augment statistical methods like ComBat.
- Impact on large-scale studies:
- Enables multi-site studies (ABCD, ADNI, UKB-derived consortia) to retrospectively harmonize structural MRI without additional scanning.
- Can improve robustness and generalization of downstream ML models, including foundation models trained on harmonized images.
10. Key Takeaways
- MURD disentangles site-invariant anatomy and site-specific style to harmonize MRI across sites without paired data.
- A single multi-site model scales better than pairwise harmonization, reducing the number of mappings from $N(N-1)$ to 1.
- Retrospective harmonization becomes feasible at scale, enabling harmonized datasets for downstream neuroimaging FMs.
- Pre-harmonization is a critical step in integration pipelines, especially when combining sMRI-derived features with genetics and fMRI.
- Careful validation remains essential to ensure that harmonization does not remove or distort biologically meaningful variation.