Federated training matches centralized AI within 1 point — without moving patient data

Catherine Okafor, Liam Donnelly, Anja Petrova et al. | ~35s read | arXiv:2606.04471

Bottom line: five health systems jointly fine-tuned a clinical language model without exchanging any patient data, landing within 1 point of centralized training and beating the best single-institution model by 12 points.

This is the existence proof for high-stakes AI collaboration in regulated industries: the data never moves, only model updates do, and with differential privacy enabled, standard extraction attacks recovered no patient text. The costs are real: 3x training time, a slight accuracy loss, and serious cross-organization coordination. Regulators have not yet issued formal guidance.
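To make "only model updates move" concrete, here is a minimal sketch of one common approach: federated averaging with update clipping and Gaussian noise. It is an illustration under stated assumptions, not the paper's method. The summary does not say which algorithm, model, or privacy parameters the authors used. The toy linear model, the synthetic data, and every name below (local_update, federated_round, make_site, max_norm, noise_std) are hypothetical, and the noise here is not calibrated to any formal privacy budget.

```python
import math
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Train on one site's private data and return only the weight delta."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += 2 * err * xj / len(data)
        w = [wi - lr * g for wi, g in zip(w, grad)]
    # The raw records in `data` never leave this function.
    return [wn - wo for wn, wo in zip(w, weights)]


def clip(delta, max_norm):
    """Scale an update down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(d * d for d in delta))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [d * scale for d in delta]


def federated_round(weights, site_datasets, max_norm=1.0, noise_std=0.0, rng=random):
    """One round: each site computes a clipped delta; the server averages them."""
    clipped = [clip(local_update(weights, d), max_norm) for d in site_datasets]
    n = len(clipped)
    avg = [sum(c[j] for c in clipped) / n for j in range(len(weights))]
    # Assumption: Gaussian noise scaled by the clip bound, in the style of
    # differentially private averaging. Illustrative only, not a calibrated guarantee.
    noisy = [a + rng.gauss(0.0, noise_std * max_norm / n) for a in avg]
    return [w + d for w, d in zip(weights, noisy)]


def make_site(rng, n=20):
    """Hypothetical synthetic site data drawn from y = 2*x1 - 1*x2."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        data.append((x, 2.0 * x[0] - 1.0 * x[1]))
    return data


def train(rounds=200, noise_std=0.0, seed=0):
    """Simulate five sites jointly fitting one shared model."""
    rng = random.Random(seed)
    sites = [make_site(rng) for _ in range(5)]
    w = [0.0, 0.0]
    for _ in range(rounds):
        w = federated_round(w, sites, noise_std=noise_std, rng=rng)
    return w


if __name__ == "__main__":
    print("no noise:  ", train(noise_std=0.0))
    print("with noise:", train(noise_std=0.5))
```

In this sketch the server only ever sees clipped, averaged deltas. Raising noise_std trades accuracy for stronger protection against extraction, which is one plausible source of the accuracy cost described above.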

The pattern transfers directly to any sector where data-sharing agreements stall AI projects: banking, insurance, defense.

Recommended action: if pooling data with partners or subsidiaries is blocking an AI initiative, commission a federated-learning pilot on one cross-entity use case this half.