BrainMT¶
Overview¶
- Type: Hybrid State Space + Transformer for fMRI
- Architecture: Mamba blocks + Multi-Head Self-Attention (Hybrid SSM-Transformer)
- Modality: Functional MRI (3D volumes or parcels)
- Primary use: Long-range temporal dependency modeling with computational efficiency
Purpose & Design Philosophy¶
BrainMT fuses bidirectional Mamba blocks (State Space Models with temporal-first scanning) with Transformer attention to model long-range fMRI dependencies more efficiently than pure transformers. The architecture targets multitask learning across fluid intelligence regression, sex classification, and harmonization tasks on UKB/HCP cohorts.
Key innovation: Mamba's near-linear (sub-quadratic) scaling in sequence length enables processing of longer temporal sequences (≥200 frames) without the quadratic memory cost of full self-attention.
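A minimal sketch of one such hybrid block, assuming PyTorch and the mamba-ssm package; the class name, layer arrangement, and default dimensions are illustrative, not the published implementation:

```python
# Illustrative hybrid SSM-Transformer block (not the official BrainMT code).
import torch
import torch.nn as nn
from mamba_ssm import Mamba  # fused CUDA kernels required


class HybridBlock(nn.Module):
    def __init__(self, dim: int = 512, n_heads: int = 8):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        # Bidirectionality approximated by a forward scan plus a scan over the
        # time-reversed sequence (a common simplification, not necessarily the
        # paper's exact scheme).
        self.mamba_fwd = Mamba(d_model=dim)
        self.mamba_bwd = Mamba(d_model=dim)
        self.norm2 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, dim) token sequence
        h = self.norm1(x)
        x = x + self.mamba_fwd(h) + self.mamba_bwd(h.flip(1)).flip(1)
        h = self.norm2(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        return x + attn_out
```

The intended division of labor: the Mamba scans carry temporal dependencies efficiently, while the attention layer mixes information globally across tokens.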
Architecture Highlights¶
- Hybrid blocks: Bidirectional Mamba (temporal scanning) + MHSA (global attention)
- Patch embedding: 3D Conv → flatten → linear projection (see the sketch after this list)
- Temporal modeling: Mamba handles sequence dependencies; attention captures global structure
- Multitask heads: Shared encoder → task-specific prediction heads
- Training: Requires fused CUDA kernels (Mamba-ssm library)
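A hedged sketch of the patch-embedding path (3D Conv → flatten → linear projection); the kernel size, channel count, and embedding dimension are placeholders rather than the published configuration:

```python
import torch
import torch.nn as nn


class PatchEmbed3D(nn.Module):
    """Illustrative 3D-conv patch embedding: volume → patch tokens → model dim."""

    def __init__(self, patch: int = 16, conv_dim: int = 96, embed_dim: int = 512):
        super().__init__()
        # Non-overlapping 3D patches via a strided convolution.
        self.conv = nn.Conv3d(1, conv_dim, kernel_size=patch, stride=patch)
        self.proj = nn.Linear(conv_dim, embed_dim)

    def forward(self, vol: torch.Tensor) -> torch.Tensor:
        # vol: (batch, 1, D, H, W) single fMRI frame
        feats = self.conv(vol)                     # (batch, conv_dim, d, h, w)
        tokens = feats.flatten(2).transpose(1, 2)  # (batch, n_patches, conv_dim)
        return self.proj(tokens)                   # (batch, n_patches, embed_dim)
```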
Integration Strategy¶
For Neuro-Omics KB¶
Embedding recipe: rsfmri_brainmt_segments_v1
- Extract embeddings from shared encoder (before task heads)
- Mean pool over sequence length → subject vector
- Project to 512-D for downstream fusion
- Residualize: age, sex, site, mean framewise displacement (FD); pooling, projection, and residualization are sketched after this list
- Metadata requirement: log sequence length (BrainMT performance depends on context length; ≥200 frames preferred)
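A minimal numpy/scikit-learn sketch of the pooling, projection, and residualization steps above; the projection matrix and confound layout are assumptions, and the canonical recipe is defined by rsfmri_brainmt_segments_v1:

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def pool_and_project(token_seq: np.ndarray, proj: np.ndarray) -> np.ndarray:
    """token_seq: (time, dim) encoder outputs for one subject.
    proj: (dim, 512) projection matrix (e.g., PCA loadings fit on the cohort)."""
    subject_vec = token_seq.mean(axis=0)  # mean pool over sequence length
    return subject_vec @ proj             # 512-D subject vector


def residualize(embeddings: np.ndarray, confounds: np.ndarray) -> np.ndarray:
    """Regress confounds (age, sex, site, mean FD) out of every embedding dim."""
    fitted = LinearRegression().fit(confounds, embeddings)
    return embeddings - fitted.predict(confounds)
```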
Fusion targets:
- Long-context gene-brain alignment: When temporal dynamics are critical (e.g., task fMRI)
- Developmental trajectories: Pediatric longitudinal fMRI with evolving patterns
- Multitask prediction: Joint cognitive + diagnostic tasks
For ARPA-H Brain-Omics Models¶
BrainMT demonstrates efficient long-context modeling for multimodal systems:
- Mamba architecture adaptable to other sequential modalities (EEG, longitudinal assessments)
- Hybrid SSM-Transformer paradigm balances efficiency vs. expressiveness
- Multitask framework aligns with Brain-Omics Model (BOM) joint training over diverse phenotypes
Embedding Extraction Workflow¶
# 1. Preprocess fMRI → 3D volumes or parcels (≥200 frames preferred)
# 2. Load pretrained BrainMT checkpoint
# 3. Forward through encoder (Mamba blocks + MHSA layers)
# 4. Extract pre-head embeddings (not task-specific outputs)
# 5. Pool to subject-level vector
# 6. Log: sequence_length, mamba_config, embedding_strategy_id
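A hedged end-to-end sketch of the outline above; the checkpoint path, the encoder attribute, and the mamba_config attribute are assumptions about a local wrapper, not the official BrainMT API:

```python
import json
import torch

# Assumed local checkpoint path and wrapper; adjust to the actual artifact.
model = torch.load("checkpoints/brainmt_pretrained.pt", map_location="cuda")
model.eval()

# Placeholder standing in for preprocessed volumes: (batch, time, 1, D, H, W).
fmri = torch.randn(1, 200, 1, 96, 96, 96, device="cuda")

with torch.no_grad():
    tokens = model.encoder(fmri)      # pre-head embeddings: (batch, time, dim)
    subject_vec = tokens.mean(dim=1)  # pool to a subject-level vector

metadata = {
    "sequence_length": int(fmri.shape[1]),
    "mamba_config": getattr(model, "mamba_config", None),
    "embedding_strategy_id": "rsfmri_brainmt_segments_v1",
}
with open("embedding_metadata.json", "w") as f:
    json.dump(metadata, f, indent=2)
```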
Strengths & Limitations¶
Strengths¶
- Efficient long-context: Mamba scales near-linearly with sequence length vs. quadratic full attention
- Multitask learning: Single encoder serves multiple downstream tasks
- Hybrid architecture: Balances local temporal patterns (Mamba) + global structure (attention)
- Benchmarked on UKB/HCP: Published results on fluid intelligence and sex classification
Limitations¶
- Heavy dependencies: Requires Mamba-ssm CUDA kernels (custom build)
- Training complexity: Hybrid architecture harder to debug than pure ViT
- Checkpoint availability: Fewer public pretrained weights vs. BrainLM
- Overkill for short sequences: <200 frames may not fully leverage Mamba's strengths
When to Use BrainMT¶
✅ Use when:
- Need long-context modeling (task fMRI, naturalistic viewing)
- Multitask setting with shared encoder across cognitive/diagnostic tasks
- Want efficiency gains over pure Transformer for ≥200-frame sequences
- Exploring SSM architectures for neuro-omics applications
⚠️ Defer until:
- BrainLM/Brain-JEPA baselines exhausted (per Nov 2025 integration plan)
- Engineering resources available for custom kernel setup
- Sufficient GPU memory for hybrid block training/inference
⚠️ Consider alternatives:
- BrainLM: Simpler baseline, more mature ecosystem
- Brain-JEPA: Faster inference, better for semantic consistency
- SwiFT: 4D volumes without explicit sequence modeling
- Brain Harmony: Multi-modal sMRI+fMRI fusion
Reference Materials¶
Knowledge Base Resources¶
Curated materials in this KB:
- Paper Summary (PDF Notes): BrainMT (2025)
- Code walkthrough: BrainMT walkthrough
- Model card (YAML): kb/model_cards/brainmt.yaml
- Paper card (YAML): kb/paper_cards/brainmt_2025.yaml
Integration recipes:
- Modality Features: fMRI
- Integration Strategy
- Design Patterns
Original Sources¶
Source code repositories:
- Local copy: external_repos/brainmt/
- Official GitHub: arunkumar-kannan/brainmt-fmri
Original paper:
- Title: "BrainMT: A Hybrid Mamba-Transformer Architecture for Modeling Long-Range Dependencies in Functional MRI Data"
- Authors: Kannan, Arunkumar; Lindquist, Martin A.; Caffo, Brian
- Published: Conference proceedings (SpringerLink), September 2025, pp. 150-160
- Link: SpringerLink
- PDF Notes: brainmt_2025.pdf
Next Steps in Our Pipeline¶
- Baseline comparisons: BrainMT vs. BrainLM on UKB cognitive tasks (same train/test splits)
- Sequence length ablation: Test performance vs. context length (100, 200, 400 frames)
- Gene-brain alignment: Evaluate whether long-context embeddings improve genetics CCA (see the sketch after this list)
- Developmental extension: Adapt to pediatric fMRI (Cha Hospital, ABCD)
- SSM exploration: Investigate Mamba-style architectures for EEG/EPhys modalities
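As a concrete form of the CCA check referenced above, a minimal scikit-learn sketch, assuming residualized 512-D BrainMT embeddings and a matrix of genetic features (e.g., expression PCs or PRS) for the same subjects; the function name and split scheme are ours:

```python
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.model_selection import train_test_split


def cca_alignment_score(brain_emb: np.ndarray, gene_feats: np.ndarray,
                        n_components: int = 10, seed: int = 0) -> float:
    """Mean canonical correlation on a held-out split (higher = better alignment).
    n_components must not exceed the smaller of the two feature dimensions."""
    Xb_tr, Xb_te, Xg_tr, Xg_te = train_test_split(
        brain_emb, gene_feats, test_size=0.2, random_state=seed)
    cca = CCA(n_components=n_components).fit(Xb_tr, Xg_tr)
    Ub, Ug = cca.transform(Xb_te, Xg_te)
    corrs = [np.corrcoef(Ub[:, k], Ug[:, k])[0, 1] for k in range(n_components)]
    return float(np.mean(corrs))
```

Comparing this score between BrainMT and BrainLM embeddings on the same subjects, splits, and confound residualization is one way to operationalize the comparison.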
Engineering Notes¶
- Capture masking ratio and sequence length in metadata for reproducibility
- Multitask heads are task-specific; extract shared encoder embeddings for fusion
- When exporting weights, ensure Mamba kernel version compatibility across systems (a version-recording sketch follows)
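One way to make the kernel-compatibility note actionable, assuming checkpoints are exported with torch.save; the helper name and metadata layout are ours, not part of BrainMT:

```python
# Hypothetical export helper: record library versions alongside the weights so
# the Mamba kernel build can be matched when the checkpoint is reloaded elsewhere.
from importlib.metadata import PackageNotFoundError, version

import torch


def export_with_env(state_dict: dict, path: str) -> None:
    try:
        mamba_version = version("mamba-ssm")
    except PackageNotFoundError:
        mamba_version = None
    torch.save(
        {
            "state_dict": state_dict,
            "env": {
                "torch": torch.__version__,
                "cuda": torch.version.cuda,
                "mamba_ssm": mamba_version,
            },
        },
        path,
    )
```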