MediProvider
Services
Adobe OCR
Tools
Electron, Bing Copilot, HTML/CSS/JS
Timeline
6 Months

Healthcare startups move fast, and their data almost never does.
When I was interning at ReAssist-Me, we needed to build out a clean, structured database of external medical providers (think outpatient services, hospice care, physical therapy, gynecology, and dozens of other specialties). The source material was exactly what you'd expect: dense, inconsistently formatted PDF documents from medical facilities, each listing 100+ providers with no standardization in sight.
Doing it by hand wasn't an option. Neither was paying for an API that didn't exist yet. So I built a tool that made the process actually manageable.
How It Worked
MediProvider was a semi-automated pipeline I designed to take those raw PDFs and turn them into clean, structured data the company could actually use.
First, the PDF got run through Adobe OCR to extract the raw text from whatever layout the facility had decided to use that day. From there, a predefined prompt got copied into Bing Copilot (this was the pre-AI-API era, so we were working with what was available), and Copilot would return a structured JSON response with the provider data parsed out.
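To illustrate the hand-off from Copilot to the app, here is a minimal sketch of how a pasted response could be checked before it is accepted. It is not the original MediProvider code: the field names (`name`, `specialty`, `phone`) and the function names are hypothetical, and it assumes the chat reply may wrap the JSON in a Markdown code fence.

```javascript
// Hypothetical schema: the real field list is not given in this write-up.
const REQUIRED_FIELDS = ["name", "specialty", "phone"];

// Three backticks, built at runtime so this snippet can sit inside a fenced block.
const FENCE = "`".repeat(3);
const FENCED_JSON = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "i");

// Pull the JSON payload out of pasted chat text, fenced or not.
function extractJson(pasted) {
  const fenced = pasted.match(FENCED_JSON);
  const candidate = fenced ? fenced[1] : pasted;
  return JSON.parse(candidate.trim());
}

// Return the usable provider entries plus a list of human-readable problems.
function parseProviderResponse(pasted) {
  let data;
  try {
    data = extractJson(pasted);
  } catch (err) {
    return { ok: false, errors: [`Not valid JSON: ${err.message}`], providers: [] };
  }

  const list = Array.isArray(data) ? data : [data];
  const errors = [];
  const providers = [];

  list.forEach((entry, i) => {
    const isObject = entry !== null && typeof entry === "object";
    const missing = REQUIRED_FIELDS.filter(
      (field) => !isObject || typeof entry[field] !== "string" || entry[field].trim() === ""
    );
    if (missing.length > 0) {
      errors.push(`Entry ${i}: missing ${missing.join(", ")}`);
    } else {
      providers.push(entry);
    }
  });

  return { ok: errors.length === 0, errors, providers };
}
```

Rejecting a malformed response at paste time means the prompt can be re-run for that one provider right away, rather than discovering the gap in the final export.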
That JSON got entered into MediProvider, and the application handled the rest: aggregating responses across all the providers in the document, normalizing the data, and outputting a clean spreadsheet alongside a final JSON file formatted for the company's internal database.
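The aggregate, normalize, and export step can be sketched roughly as follows. This is an illustration under stated assumptions, not the original implementation: the record shape (`name`, `specialty`, `phone`), the specific normalization rules, and the de-duplication by name and phone are all hypothetical choices made for the example.

```javascript
// Hypothetical record shape: { name, specialty, phone }.

// Assumed normalization rules: collapse whitespace, lowercase the specialty,
// and format 10-digit phone numbers as XXX-XXX-XXXX.
function normalizeProvider(p) {
  const digits = String(p.phone ?? "").replace(/\D/g, "");
  return {
    name: String(p.name ?? "").trim().replace(/\s+/g, " "),
    specialty: String(p.specialty ?? "").trim().toLowerCase(),
    phone:
      digits.length === 10
        ? `${digits.slice(0, 3)}-${digits.slice(3, 6)}-${digits.slice(6)}`
        : digits,
  };
}

// Merge every pasted response into one list, keeping the first copy of each
// provider (duplicates are detected by normalized name + phone, an assumption).
function aggregate(batches) {
  const seen = new Map();
  for (const batch of batches) {
    for (const raw of batch) {
      const p = normalizeProvider(raw);
      const key = `${p.name.toLowerCase()}|${p.phone}`;
      if (!seen.has(key)) seen.set(key, p);
    }
  }
  return [...seen.values()];
}

// Quote a CSV cell only when it contains a comma, quote, or line break.
function csvCell(value) {
  const s = String(value);
  return /[",\r\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

function toCsv(providers, columns = ["name", "specialty", "phone"]) {
  const rows = providers.map((p) => columns.map((c) => csvCell(p[c])).join(","));
  return [columns.join(","), ...rows].join("\r\n");
}

// Example: two pasted responses, one of which repeats a provider.
const batches = [
  [{ name: "  Lakeside   Hospice ", specialty: "Hospice Care", phone: "(555) 010-0199" }],
  [
    { name: "Lakeside Hospice", specialty: "hospice care", phone: "555.010.0199" },
    { name: "Smith, Jones PT", specialty: "Physical Therapy", phone: "5550100142" },
  ],
];

const providers = aggregate(batches);
const csv = toCsv(providers);                     // spreadsheet output
const json = JSON.stringify(providers, null, 2);  // database import file
```

Normalizing before de-duplicating matters here: the two "Lakeside Hospice" entries differ only in spacing, casing, and phone punctuation, so they collapse into one record only after cleanup.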
Repeat that for over 100 providers per facility. The tool made it repeatable, consistent, and a lot less soul-crushing than the alternative.
What It Solved
Eliminated manual data entry across 100+ providers per document
Standardized inconsistent PDF formats from multiple medical facilities
Produced clean spreadsheet and JSON output ready for direct database import
Built a repeatable pipeline for a process that previously had none
Context
This was built during my time as an Executive Assistant and intern at ReAssist-Me, a healthcare-adjacent startup. The company needed structured provider data to power their platform and had no clean way to get it. MediProvider was the practical solution I put together to bridge the gap between messy source documents and usable structured data, without waiting on infrastructure or budget that wasn't there yet.




