Let’s solve the very annoying process of getting patient charts

Yes, we are still faxing in 2026. Can Predoc help us?

Looking to hire the best talent in healthcare? Check out the OOP Talent Collective - where vetted candidates are looking for their next gig. Learn more here or check it out yourself.

Healthcare 101 Crash Course

Crash course the basics of US healthcare in a simple to understand and fun way. Understand who the different stakeholders are, how money flows, and trends shaping the industry.
Next Cohort:
7/13 - 7/24

Featured Jobs

Finance Associate - Spark Advisors

  • Spark Advisors helps seniors enroll in Medicare and understand their benefits by monitoring coverage, figuring out the right benefits, and dealing with insurance issues. They're hiring a finance associate.

Data Engineer - firsthand

  • firsthand is building technology and services to dramatically change the lives of those with serious mental illness who have fallen through the gaps in the safety net. They are hiring a data engineer to build first of its kind infrastructure to empower their peer-led care team.

Data Scientist - J2 Health

  • J2 Health brings together best in class data and purpose built software to enable healthcare organizations to optimize provider network performance. They're hiring a data scientist.

Looking for a job in health tech? Check out the other awesome healthcare jobs on the job board + give your preferences to get alerted to new postings.

TL;DR

Chart retrieval is when you pull a patient's full medical history from every doctor, hospital, lab, and pharmacy they've been to. Even in 2026, that is still mostly done by humans on the phone, on hold, and over fax. What comes back is often a large stack of PDFs, which you then throw bodies at to reconcile with the rest of the medical record.

Predoc believes AI can make that much easier. They query every digital network in the US, send AI agents + humans after the records that aren't on any network, and use their own AI juiced data structuring process to turn the resulting fax packets into curated data delivered how you want it.

We'll talk about when chart retrieval is needed, how Predoc works, and the pros/cons of their approach.

This is a sponsored post. You can read more about my rules/thoughts on sponsored posts here. If you're interested in having a sponsored post done, we have 2 slots left this year.

Company Name - Predoc

The company is called Predoc, which is the honorific every pre-med dropout should use going forward. Still better than being a postdoc (sorry).

Predoc was started by two doctors who have been chart chasing for a long time and two predocs who thought it could be better. Not often you see a four person founding team, at some point one gets cannibalized by the other three.

What Is Chart Retrieval And What's The Process?

Chart retrieval is exactly what it sounds like: trying to assemble a patient's full medical history from every doctor, hospital, lab, pharmacy, and imaging center they've ever been to. Wouldn't it be pretty cool if like…a doctor could see all your health data? Damn that would be so sick.

Today a few things happen to get your historical data:

  1. You as a patient sign some intake forms, tell the doc some of the places you've been seen, and give them permission to go chase down your records.
  2. The doc is probably connected to some health information exchanges and pharmacy networks. This allows them to pull whatever's available digitally. Works great for the providers who are on those networks. But the amount of data available through this route varies a lot by state and by EHR, and it's not the complete picture.
  3. There's usually a chart chasing team of humans to get the rest of the info. They call other doctors' offices, send fax authorization forms, sit on hold, follow up, and wonder if this is their life's calling. These teams can be quite large. Predoc was saying some of the clinic chains they talk to have 40-60 people who do this (hospitals have even larger teams, local economy isn't gonna support itself).

The data you get comes in all different formats across the information exchanges + thicc e-fax packets of PDFs from other providers. This requires teams to structure the data, deduplicate it, etc. so that the doc can get a sense of what the patient timeline looks like.

Or…realistically the PDF records are uploaded in bulk to the media tab of the patient's profile in the electronic medical record. It's the digital equivalent of stuffing your homework straight into your backpack instead of putting it in a binder (the kids who did that are in jail btw).

As you can imagine this is a pain, tends to miss data, and requires a lot of time/staff. Also like…what are we even doing here lol.

Example PDF that might get scanned in - look at that fine graininess

What Does Predoc Actually Do?

Predoc's pitch is to take the chart retrieval and data curation process off your hands. You plug into their system and they'll go get the records + structure it + make it ingestible in whatever format you want. If you want it sung to you like a bard, they can do that too.

Requests are usually initiated via API during patient intake or scheduling. Predoc gets a patient's name, DOB, gender, phone, ZIP, etc. and the time period you want the records from. You can also choose if you want data from specific sources only and turnaround time.
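To make that request concrete, here's a minimal sketch of what the payload could look like. To be clear, this is not Predoc's actual API: the class name, field names, and JSON shape are all assumptions made up for illustration, based only on the inputs listed above.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ChartRequest:
    """Hypothetical request body. Field names are illustrative, not Predoc's."""

    first_name: str
    last_name: str
    dob: date
    gender: str
    phone: str
    zip_code: str
    records_from: date  # start of the time period you want records for
    records_to: date    # end of that time period
    sources: List[str] = field(default_factory=list)  # empty list = any source
    turnaround_hours: Optional[int] = None             # None = default turnaround

    def to_json(self) -> str:
        """Serialize to JSON, with dates as ISO 8601 strings."""
        if self.records_from > self.records_to:
            raise ValueError("records_from must not be after records_to")
        payload = asdict(self)
        for key in ("dob", "records_from", "records_to"):
            payload[key] = payload[key].isoformat()
        return json.dumps(payload)
```

You'd build one of these at intake or scheduling time and send the JSON wherever the vendor's docs tell you to.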

Predoc starts with a digital query against all the major health information exchanges, EHR connections, pharmacy networks, and imaging networks. What comes back from that is a metric f***ton of unstructured, messy, patient health data.

Predoc's data normalization pipeline takes this to start building an encounters list. The end result is a "provider roadmap" of most of the places that the patient has been seen, cause we know your forgetful ass is missing stuff. This creates a supplemental list of docs and hospitals you now can hunt down to get missing records like imaging results, etc.
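The "supplemental list" idea boils down to a set difference: providers that show up in the encounters list minus the ones the patient remembered to mention. A rough sketch is below. The function name and the `provider` key are hypothetical, and matching on lowercased names is a big simplification (a real system would presumably match on something sturdier, like provider identifiers).

```python
def supplemental_providers(encounters, patient_reported):
    """Return providers seen in encounters that the patient did not report.

    encounters: list of dicts with a "provider" key (hypothetical shape).
    patient_reported: list of provider names the patient gave at intake.
    Order of first appearance is preserved and repeats are dropped.
    """
    known = {name.strip().lower() for name in patient_reported}
    seen = set()
    missing = []
    for encounter in encounters:
        provider = encounter["provider"].strip()
        key = provider.lower()
        if key not in known and key not in seen:
            seen.add(key)
            missing.append(provider)
    return missing
```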

That's basically when you start "The Chase" for your medical records in various places. It's like getting the seven Dragon Balls, and the wish is to know if the patient is still taking Lipitor. Predoc has a process for chasing down your data from different providers.

  • An AI agent researches the request and validates the provider/office.
  • A voice AI agent calls the office, finds the fax number, walks through "we have a patient Master Roshi DOB X who saw you in 2023, can you find them in your system, here's the HIPAA authorization form."
  • At the 4-hour mark, a human Predoc agent calls the office to follow up.
  • Further escalations at 12 and 24 hours to get the data.
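The escalation timeline above is basically a small lookup table. Here's one way it could be sketched, using the 4, 12, and 24 hour marks from the list. The step labels and function name are invented for illustration, and the post doesn't say what exactly happens at the 12 and 24 hour marks, so those are left as generic escalations.

```python
from datetime import timedelta

# Thresholds come from the timeline above. Step labels are illustrative.
ESCALATIONS = [
    (timedelta(hours=0), "ai_voice_call"),
    (timedelta(hours=4), "human_follow_up"),
    (timedelta(hours=12), "escalation_12h"),
    (timedelta(hours=24), "escalation_24h"),
]


def current_step(elapsed: timedelta, records_received: bool = False) -> str:
    """Return which chase step a request is in, given time since the first call."""
    if records_received:
        return "done"
    step = ESCALATIONS[0][1]
    for threshold, name in ESCALATIONS:
        if elapsed >= threshold:
            step = name  # keep the latest threshold that has been passed
    return step
```

Keeping the thresholds in one table means tuning the schedule per customer (or per stubborn front desk) is a data change, not a code change.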

The company told me that 30% of chart chasing is completed using AI without human escalation. Honestly that's more convincing than I am on a phone call, but I'm conflict avoidant.

At this point there's a lot of data. FHIR feeds from the HIEs, HTML from PBM vendors, scanned PDFs of fax packets with handwriting, CDA documents, imaging reports, and other things that give Brendan Keeler serotonin. Predoc has built an AI-powered pipeline to ingest and clean all these data to make them usable:

  • Segmentation - Segmentation splits the data or document stacks into the individual records inside them.
  • Indexing - Each segment gets classified into the right clinical category - labs go in labs, the path report gets labeled a path report, the path of least resistance.
  • Extraction - Vision-language models pull out the actual data points, handling text, tables, images, handwriting, and form checkboxes in the same document.
  • Normalization - That data is matched to standards like ICD-10, SNOMED-CT, LOINC, BROCKHAMPTON, RxNorm. "Atorvastatin 20mg" matches "Lipitor 20 mg PO daily".
  • Deduping - Across sources so the same lab result from three different facilities doesn't show up as three different things.
  • Validation - Human in the loop checks and scripts to check that the logic makes sense in the record (e.g. could a patient even be taking a med for something that doesn't have a diagnosis code?).
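To give a feel for the normalization and deduping steps, here's a toy version of each. Big assumptions: the brand-to-generic dict is a hand-written stand-in for a real RxNorm lookup, the regex only understands doses written in mg, and the record shape (`patient_id`, `loinc`, `collected_at`, `value`) is made up. Predoc's actual pipeline is not public and is certainly more involved than this.

```python
import re

# Tiny hand-written map for illustration only.
# A real pipeline would resolve names against RxNorm, not a dict.
BRAND_TO_GENERIC = {"lipitor": "atorvastatin"}

# Matches a drug name followed by a dose in mg, e.g. "Lipitor 20 mg".
MED_PATTERN = re.compile(r"([A-Za-z]+)\s*(\d+(?:\.\d+)?)\s*mg", re.IGNORECASE)


def normalize_med(text: str):
    """Return (generic_name, dose_mg) from a free-text med string, or None."""
    match = MED_PATTERN.search(text)
    if match is None:
        return None
    name = match.group(1).lower()
    return (BRAND_TO_GENERIC.get(name, name), float(match.group(2)))


def dedupe_labs(results):
    """Keep the first copy of each lab result, ignoring which source sent it."""
    seen = set()
    unique = []
    for result in results:
        key = (
            result["patient_id"],
            result["loinc"],
            result["collected_at"],
            result["value"],
        )
        if key not in seen:
            seen.add(key)
            unique.append(result)
    return unique
```

Under these assumptions, both "Atorvastatin 20mg" and "Lipitor 20 mg PO daily" normalize to `("atorvastatin", 20.0)`, and the same lab result arriving from three facilities collapses to one row.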

Predoc has a standalone web app where a customer can submit patients, track retrieval status, and review structured records and facesheets, no engineering required. Most enterprise customers are directly integrated via API and never see that UI at all. As God intended.

And with that, you have liquid patient data gold coming out of the chart retrieval refinery.

Business Model And Pricing

For Predoc, pricing is usage-based with two pillars. First, per-patient retrieval fees - if you see 5,000 new patients a month and want full retrievals on all of them, you pay a per-patient fee.

Second, per-page / per-document fees - if a company has its own retrieval team, Predoc charges only for indexing and structuring. Some of these records are big as hell, need that chart Ozempic.
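The arithmetic for a usage-based bill like this is simple enough to sketch. The post doesn't give any actual rates, so they're left as parameters here. The function and parameter names are hypothetical, and nothing below reflects Predoc's real pricing.

```python
def monthly_cost(new_patients=0, per_patient_fee=0.0, pages=0, per_page_fee=0.0):
    """Estimate a monthly bill from the two usage-based pillars.

    All fees are caller-supplied assumptions; none come from Predoc.
    """
    retrieval = new_patients * per_patient_fee  # full retrievals
    structuring = pages * per_page_fee          # indexing + structuring only
    return retrieval + structuring
```

For the example above, you'd call `monthly_cost(new_patients=5000, per_patient_fee=fee)` with whatever per-patient fee you were quoted.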

There's a few different entities that need a full scope of patient data regularly.

  • Specialty care, e.g. oncology. When a patient walks in with a cancer diagnosis, the oncologist needs to know exactly which prior treatments worked, which didn't, what current medications could conflict with chemo, what imaging has been done already, what genetic testing has been done, etc. This is not only to figure out the next treatment action, but also because they'll need that documentation for insurance to prove the patient needs it. This same logic and workflow basically applies to all specialty care that has high cost components.
  • Providers in value-based care arrangements - If you're an organization that gets paid based on patient outcomes, you need clean structured data on every patient. It helps flag who's at risk, close care gaps, and get the documentation needed for whatever version of risk adjustment we're now on.
  • Second-opinion services - If you're going anywhere where you're paying for a second opinion, the entire point of the appointment is to look at all of your past records, comment about how docs at the X institution aren't as good as you in a veiled critique, and then provide guidance. So…yeah they need your whole record.
  • Clinical research sites - If you want to enroll a patient in a clinical trial, you need to know basically everything about their health history to make sure they meet the requirements to be in the trial and don't have anything that might exclude them (e.g. have they tried a biologic drug in the past, have they ever had the slightest whiff of depression, etc.).

But really anyone that gets patient referrals and needs data coming in will need to gather charts.

Job Openings

Predoc is hiring for:

  • Senior Sales Development Representative (SDR)
  • Clinical Escalations Manager
  • Senior Manager, Clinical Strategy & Operations
  • Senior Account Executive

All remote (US). Check out their careers page for details.

Out-Of-Pocket Take

A few things I think are interesting about Predoc:

Doing one flow well - There's a meme going around about AI-native services companies. "Sell the work, not the tech" kinda shiii. Basically rather than be a software point solution, sell the entire end to end service and target headcount budget (large!) vs. technology budget (small!).

But in order to do this you have to own a process end-to-end and get good at all of the edge cases. Predoc's bet is that with a combination of agents, human escalations, know-how of data formats, etc., they can do it all. Finding each edge case that requires a human to get in the mix is a moat for them, because it continues to add to their logic around this one specific process.

Imagine a thousand small optimizations like this

Chart retrieval is a smart wedge to start with because:

1) it's complex and will always have new edge cases
2) getting the "last mile" of data actually matters for a lot of these use cases
3) it's a manual and expensive process with budget already allocated to it
4) It's embarrassing for us as an industry

Being just the data provider (good!) - Predoc has a strong opinion about being just the data retrieval and curation layer for customers to access the data. They don't want to build applications on the data they get, they want to live in the background and customers can pipe in that data wherever they need.

A pro of this is that they can integrate pretty easily into any customer's workflow, even if several different entities within that org use the data in different formats for different use cases. The oncologist using Aqua EHR wants spliced and indexed PDFs filed into the right folders. The VP of Data wants the curated structured data layer piped into their Snowflake instance for analytics. The CEO wants a dashboard view to monitor referral patterns.

Charts are getting more complex - Like me, charts are getting fatter over time. The combination of EHR upcoding incentives, billing documentation requirements, and the templated note epidemic means a full record today is dramatically longer than it was 10 years ago. A primary care visit note that was once one paragraph is now four pages of structured templates, copy-pasted history, and "the patient denies suicidality" boilerplate. Complex patients are getting increasingly more tests/interventions. Multiply that across 20 years of a patient's history and you get the famous Thousand-Page PDF (not an exaggeration).

Source: We'll literally measure things in "Moby Dick" before the metric system

When you start layering on transcript data, LLM generated notes, potentially real-time data feeds like wearables, etc., it just seems like the volume of clinical content that has to be structured is only going to increase. Predoc's bet is that your data digestion needs are only going to grow.

As with any company, these are some of the potential challenges a company like Predoc might face. Or "headwinds", as the intellectual class would say.

The downside of being just data curator (bad!) - Being just the data curation layer is great for customers, but the risk you run is that you could be swapped out under the hood relatively easily if someone else offers this kind of service. Or the EMRs/existing chart retrieval companies start offering a service like this themselves. Or the foundation models get so good that customers can just point Codex to a 5000 page PDF rawdog and it'll spit out the right info without needing to do the segmentation/extraction steps.

Predoc's bet is that the operational complexity of getting this right is harder than it looks + involves a lot of human orchestration under the hood to get the most complete record. And if it's a commodity service in the long run, then being the easiest to use and integrate across multiple workflows within an org + first mover advantage helps them.

What if interoperability actually happens - Lol. Lmao actually.

Anyway, there's a push right now towards national health data exchange. TEFCA, QHINs, and a bunch of acronyms I'm too extroverted to understand. In general though if it actually becomes easier to pipe data in from other providers, does the value prop for something like Predoc decrease?

The answer is probably that it'll get easier to do some of the things they've built expertise in. But there is always going to be some offline data gathering + data refinement work that needs to be done on the records to make them usable.

"Good enough" inertia - The question I always wonder with healthcare companies is "does this matter enough that the customer wants a good vendor?". Is a full patient record worth building a totally new chart retrieval flow or paying for a vendor to do this?

Predoc's answer is obviously yes and there are specific types of customers that need this to be correct. But there's nothing tougher than breaking through stagnation and apathy.

Conclusion

I used to work at a clinical trial company. If a patient wanted to join a trial we needed their full longitudinal medical history. A lot of human capital was used to call patients' previous doctors, ask them to fax over records, then we'd have to organize those PDFs into something usable. In many cases the doctors we called assumed we were scammers and would hang up, even if they legally had to give it to us.

This problem is the definition of unsexy, and being a good service provider here basically just means doing hand-to-hand combat with a bunch of different data exports, ops processes, and terrible edge cases. The durability comes from building one piece at a time: the digital onramp, the AI-agent long-tail chasing, the human escalations layer, and then the data curation process, etc. Every piece in isolation is doable, but stitching all of it together as a single managed service is kind of a nightmare.

We should all benefit from our providers having more complete medical histories of us. Hopefully companies like Predoc can help make that the norm.

Thinkboi out,

Nikhil aka. "Post doc clarity"

Twitter: @nikillinit

IG: @outofpockethealth

Other posts: outofpocket.health/posts

If you're enjoying the newsletter, do me a solid and shoot this over to a friend or healthcare slack channel and tell them to sign up. The line between unemployment and founder of a startup is traction and whether your parents believe you have a job.

Interlude - Apply to Ship It! And Healthcare 101!

Don’t forget: the application for SHIP IT, our healthcare software engineering conference, IS LIVE.

If you write or deeply work with code, have some experience working in healthcare, and want to hash out how everyone is building things…you should apply to this. It’s small, intimate, and you’ll learn a lot.

And if you feel like you really need to get up to speed on how healthcare works, then you should let me teach you at Healthcare 101 starting 7/13! 

This is for anyone hiring teams of non-healthcare people that need to get up to speed quickly (in 2 weeks) - we do group discounts too, hit up ya boy. You’ll even learn how to make memes.
