TDS Projects February 27, 2026

STARR Imaging upgrades: researchers benefit from faster, more reliable data access

By Lisa Tsering

This cutting-edge medical imaging data delivery platform will provide researchers with access to de-identified medical imaging data for analysis — without disrupting clinical systems.

Modern hospital environments are generating hundreds of terabytes of new imaging data per year. To put that data to work for research without compromising patient privacy, Stanford Health Care and Stanford School of Medicine developed STARR Imaging, a medical imaging data platform whose primary goal is to give researchers access to de-identified imaging data for analysis without disrupting clinical systems.

The platform, formerly known as STARR Radio, has recently undergone a comprehensive rebuild to enhance its performance, scalability, and reliability, explained Joe Mesterhazy, a principal developer within the Research Technology team.

Development on the rebuilt platform began in earnest in the spring of 2025, after it became clear that the existing system was no longer meeting the needs of researchers. The new platform went into production in early February 2026, and active development continues.

Key improvements include transitioning from a monolithic Java architecture to a modern Python-based microservices architecture, utilizing technologies such as Kubernetes for automated scaling, BigQuery for metadata storage and analytics, and DICOMweb for data ingestion. These changes have addressed long-standing limitations, including the previous system’s reliance on PostgreSQL, tight coupling with clinical systems, and inefficient resource utilization.
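For readers unfamiliar with DICOMweb, the sketch below shows the general shape of the standard's REST-style resource paths: a QIDO-RS study search and a WADO-RS study retrieval. It is only an illustration of the protocol, not the platform's ingestion code; the endpoint URL, the study UID, and the helper names are hypothetical.

```python
from urllib.parse import quote, urlencode


def qido_studies_url(base_url: str, **filters: str) -> str:
    """Build a QIDO-RS search URL for studies matching the given filters."""
    url = f"{base_url.rstrip('/')}/studies"
    if filters:
        url += "?" + urlencode(filters)
    return url


def wado_study_url(base_url: str, study_uid: str) -> str:
    """Build a WADO-RS URL that retrieves every instance in one study."""
    return f"{base_url.rstrip('/')}/studies/{quote(study_uid, safe='.')}"


# Hypothetical endpoint and study UID, for illustration only.
BASE = "https://dicomweb.example.org/dicom-web"
search_url = qido_studies_url(BASE, ModalitiesInStudy="CT", limit="10")
retrieve_url = wado_study_url(BASE, "1.2.840.113619.2.55.3")

print(search_url)
print(retrieve_url)
```

Because these are plain HTTP requests, a client can pull studies with ordinary web tooling rather than a dedicated DICOM network connection.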

One of the more tangible changes is a switch from traditional Intel-based server hardware to the ARM64 chip architecture (the same platform that powers most modern smartphones). ARM64 processors deliver better performance while consuming less energy, and the team anticipates the transition will reduce compute costs by 30-40 percent.

Combined with architectural improvements, the platform has achieved a roughly threefold increase in data throughput, meaning that a task like de-identifying 5,000 CT studies now takes a fraction of the time it once did.

Automated scaling means the system can also grow on demand, whether a researcher needs one imaging study or millions, without any manual intervention from the engineering team. “When the system is maxed out at 500 CPUs, we can de-identify data at around 35Gb/s, which is faster than our researchers can download it,” said Mesterhazy.
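To put the quoted figure in perspective, here is a back-of-the-envelope calculation. It assumes "35Gb/s" means gigabits per second, as written, and that throughput is spread evenly across the 500 CPUs; neither assumption is confirmed by the team, and the 1 TB workload is a hypothetical example.

```python
def per_cpu_gbps(total_gbps: float, cpus: int) -> float:
    """Average throughput per CPU, assuming an even split across CPUs."""
    return total_gbps / cpus


def seconds_to_process(terabytes: float, total_gbps: float) -> float:
    """Seconds to process a data volume (decimal terabytes) at a rate in gigabits/s."""
    bits = terabytes * 1e12 * 8  # 1 TB = 10^12 bytes, 8 bits per byte
    return bits / (total_gbps * 1e9)


rate = per_cpu_gbps(35, 500)         # about 0.07 Gb/s, or 70 Mb/s, per CPU
one_tb = seconds_to_process(1, 35)   # about 229 seconds, under 4 minutes

print(f"{rate * 1000:.0f} Mb/s per CPU")
print(f"{one_tb / 60:.1f} minutes per terabyte")
```

If the figure was meant as gigabytes per second, the per-terabyte time would be eight times shorter.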

But perhaps the most forward-looking aspect of STARR Imaging is how deeply artificial intelligence has been woven into the development process itself. The platform’s source code ships with comprehensive documentation including database guides formatted specifically for AI coding tools. This allows AI coding agents to develop an understanding of the platform's design, and the impact has been dramatic: development time for new features has fallen from months to weeks.

A clear example is the recent integration of echocardiography exams from the adult hospital, completed within a single two-week sprint, a turnaround the team says would have been unthinkable under the previous architecture. With AI compressing years of medical research into months, the speed at which researchers can access that data matters more than ever. 

[Image: Code showing the failure and fix]

AI’s role extends well beyond writing code. STARR Imaging maintains a library of over 1,000 medical image files used as control studies to prevent any accidental leakage of patient information.

Because AI agents have been trained to understand the rules and syntax of the de-identification process, they can automatically pinpoint issues and suggest fixes when new imaging formats are introduced, work that previously required significant manual effort. GitHub Copilot is also automatically assigned as a reviewer on every code change, providing an additional layer of quality control.
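One way to picture the control-study check described above is a regression test that scans de-identified metadata for identifiers known to be in a control study. The sketch below is a simplified illustration using plain dictionaries; the function name, field names, and sample values are all hypothetical, and the team's actual checks are not described in detail.

```python
from typing import Mapping


def find_leaks(deidentified: Mapping[str, str], known_identifiers: set[str]) -> list[str]:
    """Return the names of fields whose value still contains a known identifier."""
    leaks = []
    for field, value in deidentified.items():
        text = str(value).lower()
        # Case-insensitive substring match against every non-empty identifier.
        if any(ident.lower() in text for ident in known_identifiers if ident):
            leaks.append(field)
    return sorted(leaks)


# Hypothetical identifiers planted in a control study.
control_identifiers = {"DOE^JANE", "MRN1234567"}

# Hypothetical de-identification output: one clean, one with a leak.
clean = {"PatientName": "ANON-0001", "StudyDescription": "CT CHEST"}
leaky = {"PatientName": "ANON-0001", "StudyDescription": "CT CHEST for Doe^Jane"}

print(find_leaks(clean, control_identifiers))
print(find_leaks(leaky, control_identifiers))
```

A check like this fails loudly when a new imaging format carries an identifier in a field the de-identification rules do not yet cover.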

The team has also taken a thoughtful approach to managing AI behavior. When an AI agent goes off-track, it is tasked with updating its own instruction files to prevent the same misstep from recurring. It’s a self-correcting feedback loop that keeps the system improving over time.

As STARR Imaging continues to evolve, it stands as a model for how thoughtful engineering and AI-assisted development can accelerate research while keeping patient privacy at the center.

STARR (STanford medicine Research data Repository) is a single integrated data lake containing clinical data of different modalities, along with self-service tools. It is part of Stanford Research IT's suite of services used by researchers, participants, and clinicians to collect and combine data to make discoveries and to improve human health and wellness.

About Stanford Medicine

Stanford Medicine is an integrated academic health system comprising the Stanford School of Medicine and adult and pediatric health care delivery systems. Together, they harness the full potential of biomedicine through collaborative research, education and clinical care for patients. For more information, please visit med.stanford.edu.

Lisa Tsering is the Senior Internal Communications Specialist for TDS at Stanford Medicine.