Hey author here! Appreciate the feedback! Agreed on importance of portability and durability.
I'm not trying to build this out or sell it as a tool to providers. Just wanted to demo what you could do with structured guidelines. I don't think there's any reason this would have to be unique to a practice or emr.
As sister comments mentioned, I think the ideal case here would be if the guideline institutions released the structured representations of the guidelines along with the PDF versions. They could use a tool to draft them that could export in both formats. Oncologists could use the PDFs still, and systems could lean into the structured data.
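To make "structured representation" concrete, here is a minimal sketch of what one exported decision node might look like. The field names and clinical content are invented for illustration and do not follow any official NCCN or HL7 schema.

    # Hypothetical structured form of a single guideline decision node.
    # Field names and clinical content are invented for illustration; they do
    # not follow any official NCCN or HL7 schema.
    guideline_node = {
        "id": "example-stage-ii-pathway",
        "question": "Is the patient a surgical candidate?",
        "source": {"document": "example-guideline.pdf", "page": 42, "version": "2.2024"},
        "options": [
            {"answer": "yes", "next": "resection-pathway"},
            {"answer": "no", "next": "definitive-chemoradiation-pathway"},
        ],
    }

    def next_node(node: dict, answer: str) -> str:
        """Follow one edge of the guideline graph for a given answer."""
        for option in node["options"]:
            if option["answer"] == answer:
                return option["next"]
        raise ValueError(f"answer {answer!r} is not covered by node {node['id']!r}")

    print(next_node(guideline_node, "yes"))  # -> resection-pathway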
The cancer reporting protocols from the College of American Pathologists are available in structured format (1). No major laboratory information system vendor implements them properly, and their implementation errors cause some not-insignificant problems with patient care (oncologists calling the lab asking for clarification, etc). This has pushed labs to make policies disallowing the use of those modules and individual pathologists reverting to their own non-portable templates in Word documents.
(1) https://www.cap.org/protocols-and-guidelines/electronic-canc...
The medical information systems vendors are right up there with health insurance companies in terms of their investment in ensuring patient deaths. Ensuring. With an E.
It doesn't look like the XML data is freely accessible.
If I could get access to this data as a random student on the internet, I'd love to create an open source tool that generates an interactive visualization.
> The medical information systems vendors are right up there with health insurance companies in terms of their investment in ensuring patient deaths. Ensuring. With an E.
Medical information system vendors only care about making a profit, not implementing actual solutions. The discrepancies between systems can lead to bad information, which can cost people their lives.
According to what killjoywashere said, the vendors do not want to implement these standards. So if CAP wants the standards to be relevant, they should release them for random people to implement.
How about fixing the format? Something that is obviously broken and resulting in patient deaths should really be considered a top priority. It's either malice or massive incompetence. If these protocols were open, there would definitely be volunteers willing to help fix it.
I think there are more options than malice or incompetence. My theory is difficulty.
There’s multiple countries with socialized medicine and no profit motive and it’s still not solved.
I think it’s just really complex with high negative consequences from a mistake. It takes lots of investment with good coordination to solve and there’s an “easy workaround” with pdfs that distributes liability to practitioners.
Healthcare suffers from strict regulatory requirements, underinvestment in organic IT capabilities, and huge integration challenges (system-to-system).
Layering any sort of data standard into that environment (and evolving it in a timely manner!) is nigh impossible without an external impetus forcing action (read: government payer mandate).
You seem to think that the default assumption is that fixing the format is easy/feasible, and I don't see why. Do you have domain knowledge pointing that way?
It's a truism in machine learning that curating and massaging your dataset is the most labor-intensive and error-prone part of any project. I don't see why that would stop being true in healthcare just because lives are on the line.
Incompetence at this level is intentional, it means someone doesn't think they'll see RoI from investing resources into improving it. Calling it malice is appropriate I feel.
Not actively malicious perhaps, but prioritising profits over lives is evil. Either you take care to make sure the systems you sell lead to the best possible outcomes, or you get out of the sector.
The company not existing at all might be worse though? I think it’s too easy to make blanket judgments like that from the outside, and it would be the job of regulation to counteract adverse incentives in the field.
I believe you have good intentions, but someone would need to build it out and sell it. And it requires lots of maintenance. It’s too boring for an open source community.
There’s a whole industry that attempts to do what you do and there’s a reason why protocols keep getting punted back to pdf.
I agree it would be great to release structured representations. But I don’t think there’s a standard for that representation, so it’s kind of tricky as who will develop and maintain the data standard.
I worked on a decision support protocol for Ebola and it was really hard to get code sets released in Excel. Not to mention the actual decision gates in a way that is computable.
I hope we make progress on this, but I think the incentives are off for the work to make the data structures necessary.
>Agreed on importance of portability and durability.
I think "importance" is understating it, because permanent consistency is practically the only reason we all (still) use PDFs in quite literally every professional environment as a lowest common denominator industrial standard.
PDFs will always render the same, whether on paper or a screen of any size connected to a computer of any configuration. PDFs will almost always open and work given Adobe Reader, which these days is simply embedded in Chrome.
PDFs will almost certainly Just Work(tm), and Just Working(tm) is a god damn virtue in the professional world because time is money and nobody wants to be embarrassed handing out unusable documents.
PDFs generally will look close enough to the original intent that they will almost always be usable, but will not always render the same. If nothing else, there are seemingly endless font issues.
In this day and age that seems increasingly like a solved problem to most end users, often a client-side issue or using a very old method of generating a PDF?
Modern PDF supports font embedding of various kinds (legality is left as an exercise to the PDF author) and supports 14 standard font faces which can be specified for compatibility, though more often document authors probably assume a system font is available or embed one.
There are still problems with the format as it foremost focuses on document display rather than document structure or intent, and accessibility support in documents is often rare to non-existent outside of government use cases or maybe Word and the like.
A lot of usability improvements come from clients that make an attempt to parse the PDF to make the format appear smarter. macOS Preview can figure out where columns begin and end for natural text selection, Acrobat routinely generates an accessible version of a document after opening it, including some table detection. Honestly creative interpretation of PDF documents is possibly one of the best use cases of AI that I’ve ever heard of.
While a lot about PDF has changed over the years the basic standard was created to optimize for printing. It’s as if we started with GIF and added support to build interactive websites from GIFs. At its core, a PDF is just a representation of shapes on a page, and we added metadata that would hopefully identify glyphs, accessible alternative content, and smarter text/line selection, but it can fall apart if the PDF author is careless, malicious or didn’t expect certain content. It probably inherits all the weirdness of Unicode and then some, for example.
Community oncologists have limited technology resources as compared to a national cancer center. If we can make their lives easier, it can only be a good thing.
That said, I like published documents like PDFs - systems usually make it hard to tell the June release from the September release.
You say this, but on the other hand, the author alleges that the places that use these custom tools achieve better outcomes. You didn't address this point one way or the other.
Do you think this is a completely fabricated non-explanation? It's not like the link says "the worst places use these custom tools."
Exactly. The PDFs work. They won't break. You can see all the information with your own eyes. You can send them by e-mail.
A wizard-type system hides most of the information from you, it might have bugs you aren't aware of, if you want to glance at an alternative path you can't, it's going to be locked into registered users, the system can go down.
I think much more intelligent computer systems are the future in health care, but I doubt the way to start is with yet another custom tool designed specifically for cancer guidelines and nothing else.
Totally valid concerns. If you have time, I would like to show you my solution to get your thoughts as I believe I have found ways to mitigate all of your concerns.
Currently I am using STCC (Schmitt-Thompson Clinical Content). I have sent you some of the PDFs we use for testing.
The author is proposing that the DAG representation be in addition to the PDF:
>The organizations drafting guidelines should release them in structured, machine-interpretable formats in addition to the downloadable PDFs.
My opinion: Ideally the PDF could be generated from the underlying DAG -- that would give you confidence that everything in the PDF has been captured in the DAG.
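As a toy sketch of that direction (the node format is invented, and this says nothing about how NCCN actually produces its documents): a single DAG as the source of truth, with the human-readable outline rendered from it rather than maintained separately.

    # Toy example of one source of truth: a DAG of decision nodes that both the
    # software and the rendered document are generated from. Node format invented.
    DAG = {
        "start": {"text": "Confirmed diagnosis?", "yes": "staging", "no": "workup"},
        "workup": {"text": "Complete the diagnostic workup", "yes": None, "no": None},
        "staging": {"text": "Stage the disease and select a pathway", "yes": None, "no": None},
    }

    def render_outline(dag: dict, root: str = "start") -> str:
        """Walk the DAG and emit a human-readable outline.
        A real pipeline would emit a PDF via a layout engine instead."""
        lines, seen, stack = [], set(), [(root, 0)]
        while stack:
            node_id, depth = stack.pop()
            if node_id is None or node_id in seen:
                continue
            seen.add(node_id)
            node = dag[node_id]
            lines.append("  " * depth + "- " + node["text"])
            for branch in ("yes", "no"):
                stack.append((node[branch], depth + 1))
        return "\n".join(lines)

    print(render_outline(DAG))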
The fundamental idea here is that doctors find it difficult to ensure that their recommendations are actually up-to-date with the latest clinical research.
Further, that by virtue of being at the centre of action in research, doctors in prestige medical centres have an advantage that could be available to all doctors. It's a pretty important point, sometimes referred to as the dissemination of knowledge problem.
Currently, this is best approached by publishing systematic reviews according to the Cochrane Criteria [0]. Such reviews are quite labour-intensive and done all too rarely, but are very valuable when done.
One aspect of such reviews, when done, is how often they discard published studies for reasons such as bias, incomplete datasets, and so forth.
The approach described by Geiger in the link is commendable for its intentions but the outcome will be faced with the same problem that manual systematic reviews face.
I wonder if the author considered including rules-based approaches (e.g. Cochrane guidelines) in addition to machine learning approaches?
[0] https://training.cochrane.org/handbook
NCCN guidelines and Cochrane Reviews serve complementary roles in medicine - NCCN provides practical, frequently updated cancer treatment algorithms based on both research and expert consensus, while Cochrane Reviews offer rigorous systematic analyses of research evidence across all medical fields with a stronger focus on randomized controlled trials. The NCCN guidelines tend to be more immediately applicable in clinical practice, while Cochrane Reviews provide a deeper analysis of the underlying evidence quality.
My main goal here was to show what you could do with any set of medical guidelines that was properly structured. You can choose any criteria you want.
> doctors find it difficult to ensure that their recommendations are actually up-to-date with the latest clinical research
Doctors care about this about as much as software engineers care about the latest computer science research. A few curious ones do. But the general attitude is that they already did tough years of school so they don't have to anymore.
Yeah, non-programmers seem to think everything is changing so quickly all the time yet here I am writing in a 40 year old language against UNIX APIs from the 70s ¯\_(ツ)_/¯
The OP will be pleased to know that they’re not the first person to think of this idea. Searching for “computable clinical guidelines” will unearth a wealth of academic literature on the subject. A reasonable starting point would be this paper [1]. Indeed people have been trying since the 70s, most notably with the famous MYCIN expert system. [2]
As people have alluded to and the history of MYCIN shows, there’s a lot more subtlety to the problem than appears on the surface, with a whole bunch of technical, psychological, sociological and economic factors interacting. This is why cancer guidelines are stuck in PDFs.
Still, none of that should inhibit exploration. After all, just because previous generations couldn’t solve a problem doesn’t mean that it can’t be solved.
[1] https://pmc.ncbi.nlm.nih.gov/articles/PMC10582221/
[2] https://www.forbes.com/sites/gilpress/2020/04/27/12-ai-miles...
But they don't work as well as other decisionmaking techniques... Random forests, linear models, neural nets, etc. are all decision making techniques at their core.
And decision trees perform poorly for complex systems where lots of data exists - ie. human health.
So why are we using a known-inferior technique simply because it's easier to write down in a PDF file, reason about in a meeting, or explain to someone?
Shouldn't we be using the most advanced mathematical models possible with the highest 'cure' probability, even if they're so complex no human can understand them?
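For what it's worth, the generic accuracy gap being described is easy to demonstrate on synthetic data with scikit-learn; this says nothing about real clinical datasets, only about the trade-off between a readable tree and an opaque ensemble.

    # Quick illustration on synthetic data: a single, readable decision tree vs.
    # an opaque ensemble. This says nothing about real clinical datasets; it only
    # demonstrates the generic accuracy/interpretability trade-off being discussed.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=2000, n_features=30, n_informative=10, random_state=0)

    tree = DecisionTreeClassifier(max_depth=4, random_state=0)          # small enough to print and read
    forest = RandomForestClassifier(n_estimators=200, random_state=0)   # 200 trees, not human-readable

    print("decision tree :", cross_val_score(tree, X, y, cv=5).mean())
    print("random forest :", cross_val_score(forest, X, y, cv=5).mean())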
In cancer there's an abundance of clinical trials with high quality data, but it is all very complex in terms of encoding what the clinical trial actually encoded.
Go to a clinical cancer conference and you will see the grim reality of 10,000s of people contributing to the knowledge discovery process with their cancer care. There is an inverse relationship between the number of people in a trial and the amount of risk that goes into that trial, but it is still a massive amount of data that needs to be codified into some sensible system, and it's hard enough for a person to do it.
> That’ll be wonderful to explain in court when they figure out it was just data smuggling or whatever other bias.
What do you mean by this? I'm not aware of any data smuggling that has ever happened in a clinical trial. The "bias" is that any research hypothesis comes from the fundamentally biased position of "I think the data is telling me this" but I've seen very little bias of truly bad hypotheses in cancer research like those that have dominated, say Alzheimer's research. Any research malfeasance should be prosecuted to the fullest, but I don't think cancer research has much of it. This was a huge scandal, but I don't think it pointed to much in the way of bad research in the end:
https://www.propublica.org/article/doctor-jose-baselga-cance...
but we have well established ways to deal with those... test/validation sets, n-fold validation, etc.
Even if there was some overfitting or data contamination that was undetected, the result would most probably still be better than a hand-made decision tree over the same data...
OK, until you can sue the AI, you need to find a doctor okay with putting their license behind saying “I have no idea how this shiny thing works”. There are indeed some that will, but not a consensus.
Hand-made decision trees are open to inspection, comprehension, and adaption. There is no way to adapt an opaque ML model to new findings / an experimental treatment except by producing a new model.
I find the web (HTML/CSS) the most open format for sharing. PDFs are hard to consume on smaller devices and much harder for machines to read. I am working on a feature at Jaunt.com to convert PDFs to HTML. It shows up as a reader mode icon. Please try it out and see if it is good enough. I personally think we need to do a much better job.
https://jaunt.com
As a cancer researcher myself, I'd point out that some branches of the decision trees in the NCCN guidelines are based on studies in which multiple options were not statistically significantly different, but all were better than the placebo. In those cases, the clinician is free to use other factors to decide which arm to take. A classic example of this is surgery vs radiation for prostate cancer. Both are roughly equally effective, but very different experiences.
Same reason why datasheets are still PDFs. It's a reliable, long lasting and portable format. And while it's kind of ridiculous that we are basically emulating paper, no other format fills that niche.
It's the niche HTML should be able to fill, since that was its original purpose, but isn't, since all focus over the last 20 or so years has been on everything else but making HTML a better format for information exchange.
Trivial things like bundling up a complex HTML document into a single file don't have standard solutions. Cookies stop working when you are dealing with file:// URLs, and a lot of other really basic stuff just doesn't work or doesn't exist. Instead you get offshoot formats like ePub that are mostly HTML, but not actually supported by most browsers.
> With properly structured data, machines should be able to interpret the guidelines. Charting systems could automatically suggest diagnostic tests for a patient. Alarm bells and "Are you sure?" modals could pop up when a course of treatment diverges from the guidelines. And when a doctor needs to review the guidelines, there should be a much faster and more natural way than finding PDFs.
I have implemented this computerized process twice at two different startups over the past decade.
I would not want the NCCN to do it.
The NCCN guidelines are not stuck in PDFs, they are stuck in the heads of doctors.
Once the NCCN guidelines get put into computerized rules, they start to be guided by those computerized rules, a second influence that takes them away from the fundamental science.
So while I totally agree that there should be systematization of the rules, it should be entirely secondary and subservient to the best frontier knowledge about cancer, which changes extremely frequently. Annually, after every ASCO (the major pan-cancer conference) and every disease-specific conference (e.g. the San Antonio breast cancer conference), and occasionally during the year when landmark clinical trials are published, doctors need to update their knowledge from the latest trials and their continuing medical education, an entire body of knowledge that is complementary to the edges of what the NCCN publishes.
Having spanned both computer science and medicine for my entire career, I trust doctors to be able to update their rules far faster than the programmers and databases.
Please do not get the NCCN guidelines stuck in spaghetti code that a few programmers understand, rather than open in PDFs with lots of links that anybody can go and chase after.
Edit: though give me a week digesting this article and I may change my mind. Maybe the NCCN should be standardizing clinical variables enough that the guidelines can trivially be turned into rules. That would require that the hypotheses a clinical trial tests fit into those rules, however, and that's why I need a week of digestion to see if it may even be possible...
Cool tool. From my experience the PDF was easy to traverse.
The hardest part for me was understanding that treatment options could differ (i.e. between the _top_ hospitals treating the cancer). And there were a few critical options to consider. NCCN paths were traditional, but there are in-between decisions to make, or alternative paths. ChatGPT was really helpful in that period. "2nd" opinions are important... but again, you ask the top 2 hospitals and they differ in opinion; any other hospital is typically in one of those camps.
It's so much worse than you could possibly imagine. I worked for a healthcare startup working on patient enrollment for clinical oncology trials. The challenges are amazing. Quite frankly it wouldn't matter if the data were in plaintext. The diagnostic codes vary between providers, the semantic understanding of the diagnostic information has different meanings between providers, electronic health records are a mess, things are written entirely in natural language rather than some kind of data structure. Anyone who's worked in healthcare software can tell you way more horror stories.
I do hope that LLMs can help straighten some of it out, but as anyone who's done healthcare software knows, the problems are not technical, they are quite human.
That being said one bright spot is we've (my colleagues, not me) made a huge step forward using category theory and Prolog to discover the provably optimal 3+3 clinical oncology dose escalation trial protocol[1]. David gave a great presentation on it at the Scryer Prolog meetup[2] in Vienna.
It's kind of amazing how in the dark ages we are with medicine. Even though this is the first EXECUTABLE/PROGRAMMABLE SPEC for a 3+3 cancer trial, he is still fighting to convince his medical colleagues and hospital administrators that this is the optimal trial because -- surprise -- they don't speak software (or statistics).
[1]: https://arxiv.org/abs/2402.08334
[2]: https://www.digitalaustria.gv.at/eng/insights/Digital-Austri...
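For readers unfamiliar with the design, here is a plain restatement of the textbook 3+3 cohort rules, for intuition only; this is not the formal specification from the linked paper.

    # Plain restatement of the conventional 3+3 cohort rules, for intuition only.
    # This is NOT the formal specification from the linked paper, just the
    # textbook logic: escalate on 0/3 DLTs, expand on 1/3, stop on 2 or more.
    def three_plus_three(dlts: int, cohort_size: int) -> str:
        """Next action after observing dose-limiting toxicities (DLTs) at a dose level."""
        if cohort_size == 3:
            if dlts == 0:
                return "escalate to the next dose level"
            if dlts == 1:
                return "enroll 3 more patients at this dose level"
            return "stop; the MTD is the previous dose level"
        if cohort_size == 6:
            if dlts <= 1:
                return "escalate to the next dose level"
            return "stop; the MTD is the previous dose level"
        raise ValueError("3+3 cohorts have 3 or 6 patients")

    print(three_plus_three(1, 3))  # -> enroll 3 more patients at this dose level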
Oh wow. No, that's heartbreaking. I'll have to read up on this. Reminds me of David explaining the interesting and somewhat surprisingly insensitive language the oncology literature uses towards folks going through this. It's there for historical reasons but slow to change.
It also shows how important getting dose escalation trials right is. The whole point is finding the balance point where the "cure is NOT worse than the disease". A bad dose can be worse than the cancer itself, and conducting the trials correctly is extremely important... and this really underscores the human cost. Truly heartbreaking :(
Software that gives treatment instructions may be a medical device requiring FDA approval. You may be breaking the law if you give it to a medical professional without such approval.
The real question is: why is everything stuck in PDFs, and the more important meta-question is: why don't PDFs support meta-data (they do, somewhat). So much of what we do is essentially machine-to-machine, but trapped in a format designed entirely for human-to-human (also lump in a bit of machine-to-human).
Adobe has had literally a third of a century to recognize this need and address it. I don't think they're paying attention :-/
PDFs can have arbitrary files embedded, like XML and JSON. It also supports a logical structure tree (which doesn’t need to correspond to the visual structure) which can carry arbitrary attributes (data) on its structure elements. And then there’s XML Forms. You can really have pretty much anything machine-processable you want in a PDF. One could argue that it is too flexible, because any design you can come up with that uses those features for a particular application is unlikely to be very interoperable.
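A small sketch of the embedding idea using pypdf's attachment API; the payload structure is invented, and whether any downstream system would ever look for such an attachment is another question.

    # Minimal sketch: attach a machine-readable JSON payload to a PDF with pypdf.
    # The payload structure is invented for illustration.
    import json
    from pypdf import PdfWriter

    writer = PdfWriter()
    writer.add_blank_page(width=612, height=792)  # stand-in for the rendered guideline pages

    payload = {"guideline": "example", "version": "2.2024", "nodes": []}
    writer.add_attachment("guideline.json", json.dumps(payload).encode("utf-8"))

    with open("guideline_with_data.pdf", "wb") as f:
        writer.write(f)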
I know it's not the same, but in many areas we have this "follow the arrows" system in many guidelines.
For some examples, see the EULAR guidelines with their flowcharts for treatments, and also the AO Surgery Reference with a graphical approach to selecting treatments based on fracture pattern, available materials and skill set.
I think that's a logical and necessary step to join medical reasoning and computer helpers, we need easier access to new information and more importantly to present clinical relevant facts from the literature in a way that helps actual patient care decision making.
I'm just not too sure we can have generic approaches to all specialties, but it’s nice seeing efforts in this area.
The real problem is that the guidelines are written for humans in the first place. Workarounds like this shouldn't be needed, to go from a machine friendly layout to a human friendly one is usually quite easy.
And from what he says a decision tree isn't really the right model in the first place. What about no tree, just a heap of records in a SQL database. You do a query on the known parameters, if the response comes back with only one item in the treatment column you follow it. If it comes back with multiple items you look at what would be needed to distinguish them and do the test(s).
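A minimal sketch of that "heap of records" idea with SQLite; the schema and the example rows are invented for illustration, not taken from any guideline.

    # Sketch of the "heap of records" idea: guideline rows keyed by the known
    # parameters and queried directly, instead of walking a tree by hand.
    # Schema and rows are invented for illustration.
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE guideline (histology TEXT, stage TEXT, biomarker TEXT, treatment TEXT)")
    con.executemany(
        "INSERT INTO guideline VALUES (?, ?, ?, ?)",
        [
            ("adenocarcinoma", "IV", "marker-positive", "targeted therapy"),
            ("adenocarcinoma", "IV", "marker-negative", "platinum doublet chemotherapy"),
            ("adenocarcinoma", "IV", "marker-negative", "immunotherapy"),
        ],
    )

    rows = con.execute(
        "SELECT DISTINCT treatment FROM guideline WHERE histology=? AND stage=? AND biomarker=?",
        ("adenocarcinoma", "IV", "marker-negative"),
    ).fetchall()

    if len(rows) == 1:
        print("follow:", rows[0][0])
    else:
        # Multiple candidates: decide which additional test would distinguish them.
        print("candidates:", [r[0] for r in rows])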
Excellent read. This consolidated and catalyzed my spurious thoughts around personal information management. The input is generally markdown/PDF, but over time it becomes largely useless for a single person. There would be value if it were passed through such a system over time.
I have to ask: did the author contact any medical professional when writing this article? Is this really something that needs to be fixed, and will his solution actually fix it?
It seems to me that ignoring the guideline is a physician decision, and when it is ignored (for good or for bad), it is not because the guidelines are not available in json.
I parsed some mind maps that were constructed with a tool and exported as pdfs (original sources were lost a long time ago) and I used python with tesseract for the text and opencv and it worked alright. I am curious why the author went with LLMs, but I guess with the mentioned amount of data it wasn't hard to recheck everything later.
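For anyone curious, the classical non-LLM route sketched above looks roughly like this (assuming the poppler and tesseract binaries plus the pdf2image, pytesseract, opencv-python and numpy packages; the filename is a placeholder).

    # Rough sketch of the classical (non-LLM) route: rasterize the PDF, clean up
    # each page image with OpenCV, then OCR the text with Tesseract.
    import cv2
    import numpy as np
    import pytesseract
    from pdf2image import convert_from_path

    pages = convert_from_path("mindmap.pdf", dpi=300)   # placeholder filename
    for i, page in enumerate(pages):
        gray = cv2.cvtColor(np.array(page), cv2.COLOR_RGB2GRAY)
        # Otsu binarization helps Tesseract with scanned or low-contrast pages.
        _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        text = pytesseract.image_to_string(binary)
        print(f"--- page {i + 1} ---\n{text}")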
Shouldn't the end goal be just to train an AI on all the PDFs and give the doctors an interface to plug in all the details and get a treatment plan generated by that AI?
Working on the data structure feels like an intermediate solution on the way to that AI, which is not really necessary. Or am I missing something?
Comically, I worked in this space and initially tried to get decision support working with data structures and code sets and such.
I ended up only really contributing version numbers to the PDFs. So at least people knew they had the latest and same versions. And that took a year, just to get versions added to guideline PDFs.
That is wild, one would think versioning is extremely important. They tend to just put the timestamp in the filename (sometimes), which I guess is better than nothing.
Getting it in the file name was kind of easy. But I meant adding it visually in the PDF guidance so readers could tell. Just numbers in the lower left corner. Or maybe right.
The guideline was available via url so the filename couldn’t change.
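For illustration, one way to stamp a version string onto every page of an existing PDF with reportlab and pypdf; this is just a sketch with placeholder filenames, not how the actual guideline PDFs were produced.

    # Sketch: stamp a version string into the lower-left corner of every page.
    # Just one way to do it (a reportlab overlay merged with pypdf); this is not
    # how the actual guideline PDFs were produced.
    import io
    from pypdf import PdfReader, PdfWriter
    from reportlab.pdfgen import canvas

    def stamp_version(src: str, dst: str, version: str) -> None:
        reader = PdfReader(src)
        writer = PdfWriter()
        for page in reader.pages:
            buf = io.BytesIO()
            w, h = float(page.mediabox.width), float(page.mediabox.height)
            overlay = canvas.Canvas(buf, pagesize=(w, h))
            overlay.setFont("Helvetica", 7)
            overlay.drawString(18, 18, version)  # lower-left corner
            overlay.save()
            buf.seek(0)
            page.merge_page(PdfReader(buf).pages[0])
            writer.add_page(page)
        with open(dst, "wb") as f:
            writer.write(f)

    stamp_version("guideline.pdf", "guideline_versioned.pdf", "v2.2024.1")  # placeholder names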
The author makes a great case for machine-interpretable standards but there is an enormous amount of work out there devoted to this, it’s been a topic of interest for decades. There’s so much in the field that a real problem is figuring out what solutions match the requirements of the various stakeholders, more than identifying the opportunities.
Because writers don't think about readers. PDF is one of the worst formats for science/technical info, and yet here we are.
I've dumped a lot of papers from arXiv because they were formatted as two-column, non-zoomable PDFs.
Forgive me if I'm mistaken, but isn't this exactly what the FHIR standard is meant to address? Not only does it enable global inter-health communication using standardized resources, but it's already adopted in several national health services, including (though not broadly) America's. Is this not simply a reimplementation, but without the broad iterations of HL7?
Right, it would make more sense to use HL7 FHIR (possibly along with CQL) as a starting point instead of reinventing the wheel. Talk to the CodeX accelerator about writing an Implementation Guide in this area. The PlanDefinition resource type should be a good fit for modeling cancer guidelines.
https://codex.hl7.org/
https://www.hl7.org/fhir/plandefinition.html
You would aim to use CQL expressions inside of a PlanDefinition, in my estimate. This is exactly what AHRQ's CDS Connect project (AHRQ is part of HHS) aims to create / has created. They publish freely accessible computable decision support artifacts here: https://cds.ahrq.gov/cdsconnect/repository
When they are fully computable, they are FHIR PlanDefinitions (+ other resources like Questionnaire, etc) and CQL.
Here's an example of a fully executable Alcohol Use Disorder Identification Test: https://cds.ahrq.gov/cdsconnect/artifact/alcohol-screening-u...
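To make that concrete, a stripped-down, hand-written fragment of what such an artifact looks like; real CDS Connect artifacts are far more complete, and the title and expression names here are invented.

    # Stripped-down, hand-written fragment of a FHIR PlanDefinition whose
    # applicability condition points at a named CQL expression. Real CDS Connect
    # artifacts are far more complete; the title and names here are invented.
    import json

    plan_definition = {
        "resourceType": "PlanDefinition",
        "status": "draft",
        "title": "Example screening recommendation",
        "library": ["Library/ExampleScreeningLogic"],  # where the CQL logic lives
        "action": [
            {
                "title": "Recommend screening",
                "condition": [
                    {
                        "kind": "applicability",
                        "expression": {
                            "language": "text/cql",
                            "expression": "MeetsInclusionCriteria",
                        },
                    }
                ],
            }
        ],
    }

    print(json.dumps(plan_definition, indent=2))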
There's so much other infrastructure around the EHR here to understand (and take advantage of). I think there's a big opportunity in proving that multimodal LLM can reliably generate these artifacts from other sources. It's not the LLM actually being a decision support tool itself (though that may well be promising), but rather the ability to generate standardized CDS artifacts in a highly scalable, repeatable way.
Happy to talk to anyone about any of these ideas - I started exactly where OP was.
I downloaded and opened a CDS artifact for osteoporosis from the link (a disease in my specialty). I need an API key to view what a "valueset" entails, so in practice I couldn't tell whether the recommendation aligns with clinical practice, nor does the CQL provided have any scientific references (even a textbook or a weak recommendation from a guideline would be sufficient; I don't think the algorithm should be the primary source of the knowledge).
I tried to see if HL7 was approachable for small teams, I personally became exhausted from reading it and trying to think how to implement a subset of it, I know it's "standard" but all this is kinda unapproachable.
GraphViz has some useful graph schema languages that could be reused for something like this. There's DOT, a delightful DSL, and some kind of JSON format as well. You can then generate a bunch of different output formats and it will lay out the nodes for you.
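For instance, using the graphviz Python bindings (node labels are placeholders; GraphViz handles the layout and can emit SVG, PNG, and other formats):

    # Tiny illustration of describing a guideline fragment as a graph and letting
    # GraphViz handle the layout. Node labels are placeholders.
    from graphviz import Digraph

    g = Digraph("guideline_fragment", format="svg")
    g.attr(rankdir="LR")

    g.node("dx", "Confirmed diagnosis")
    g.node("stage", "Staging workup")
    g.node("surgery", "Surgical candidate?", shape="diamond")
    g.node("resect", "Resection pathway")
    g.node("crt", "Definitive chemoradiation pathway")

    g.edge("dx", "stage")
    g.edge("stage", "surgery")
    g.edge("surgery", "resect", label="yes")
    g.edge("surgery", "crt", label="no")

    print(g.source)         # the generated DOT text
    # g.render("fragment")  # writes fragment.svg if the graphviz binaries are installed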
Of all the challenges with this, graph layout is beyond trivial. It does not rank as a problem or an intellectual challenge, and it isn't even that interesting.
The challenges are all about what goes in the nodes, how to define it, how to standardize it across different institutions, how to compare it to what was tested in two different clinical trials, etc. And if the computerized process goes into clinical practice, how is that node and its contents robustly defined so that a clinician sitting with a patient can instantly understand what is meant by its yes/no/multiple choice question in terms that have been used in recent years at the clinician's conferences.
Addressing the challenges of constructing the graph requires deep understanding of the terms, deep knowledge of how 10 different people from different cultural backgrounds and training locations interpret highly technical terms with evolving meanings, and deep knowledge of how people could misunderstand language or logic.
These guidelines codify evolving scientific knowledge where new conceptions of the disease get invented at every conference. It's all at the edge of science where every month and year we have new technology to understand more than we ever understood before, and we have new clinical trials that are testing new hypotheses at the edge of it.
Getting a nice visual layout is necessary, but in no way sufficient for what needs to be done to put this into practice.
Modularity is an excellent way of attacking complex problems. We can all play with algorithms that can carry on realistic conversations and create synthetic 3D movies, because people worked on problems like making transistors the size of 10 atoms, figuring out how processors can predict branches with 99% accuracy, giving neural nets self-attention, deploying inexpensive and ridiculously fast networks all over the planet, and a lot of other stuff.
For many of us, curing cancer may someday become more important than almost anything else a computer can help us to do. It's just there are so many building blocks to solving truly complex problems; we must respect all that.
Funny, I just had the thought the other day about how we as a society need to move past the PDF format, or even just update it to be editable in traditional document software. The fact that Google Docs will export as a PDF and not have it saved in the documents is proof it's gotten to a point of inefficiency, and that's just one example.
Gee, before talking about complex stuff like decision trees, how about we start with something really simple like not requiring a login to download the stupid PDF from NCCN?
PDFs are the opposite of machine-readable if you want to do anything other than render them as images on paper or a screen. They're only slightly more machine-readable than binary executables.
I hate, hate, hate, hate, hate the practice of using PDFs as a system of record. They are intended to be a print format for ensuring consistent typesetting and formatting. For that, I have no quarrel. But so much of the world economy is based on taking text, docx (XML), spreadsheets, or even CSV files, rendering them out as PDFs, and then emailing them around or storing them in databases. They've gone from being simply a view layer to infecting the model layer.
PDFs are a step better than passing around screenshots of text as images - when they don't literally consist of a single image, that is. But even for reasonably-well-behaved, mostly-text PDFs, finding things like "headers" and "sections" in the average case is dependent on a huge pile of heuristics about spacing and font size conventions. None of that semantic structure exists, it's just individual characters with X-Y coordinates. (My favorite thing to do with people starting to work with PDFs is to tell them that the files don't usually contain any whitespace characters, and then watch the horror slowly dawn as they contemplate the implications.) (And yes, I know that PDF/A theoretically exists, but it's not reliably used, and certainly won't exist on any file produced more than a couple years ago.)
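To see this firsthand, here is a small sketch with pdfminer.six that prints individual glyphs and their coordinates; any grouping into lines is something the parser reconstructs, not something stored in the file (the filename is a placeholder).

    # Peek at what a text-based PDF actually stores: positioned glyphs, not words,
    # sections, or whitespace. Any grouping into lines is reconstructed by the parser.
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTChar, LTTextContainer

    for page_layout in extract_pages("document.pdf", maxpages=1):  # placeholder filename
        for element in page_layout:
            if not isinstance(element, LTTextContainer):
                continue
            for line in element:
                if not isinstance(line, LTTextContainer):  # skip stray annotations
                    continue
                for obj in line:
                    if isinstance(obj, LTChar):
                        x0, y0, _, _ = obj.bbox
                        print(f"{obj.get_text()!r} at ({x0:.1f}, {y0:.1f})")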
Now, with multi-modal LLMs and OCR reaching near-human levels, we can finally... attempt to infer structured data back out from them. So many megawatt-hours wasted in undoing what was just done. Structure to unstructure to structure again. Why, why, why.
As for universality... I mean, sure, they're better than some proprietary format that can only be decrypted or parsed by one old rickety piece of software that has to run in Win95 compatibility mode. But they're not better than JSON or XML if the source of truth is structured, and they're not better than Markdown or - again - XML if the source is mostly text. And there are always warts that aren't fully supported depending on your viewer.
TLDR: The NCCN surely has a clean pretty database of these algorithms. They output these junky pdfs for free. Want cleaner "templates" data? Pay the toll please.
What we have here is a walled garden. Want the treatment algorithm? Here muck through this huge disaster of 999 page pdfs. Oh you want the underlying data? Well, well, it's going to cost you.
What we have here is not so much different than the paywalls of an academic journal. Some company running a core service to an altruistic industry and skimming a price. OP is just writing an algorithm to unskim it. And nobody can really use it without making the thing bulletproof lest a physician mistreat a cancer.
To my mind this is yet another unethical topic in healthcare. These clunky algorithms, if a physician uses them, slow the process and introduce a potential source of error, ultimately harming patients. Harming patients for increased revenue. The physicians writing and maintaining the guidelines look the other way given they get a paycheck off it, plus the prestige of it all, similar to some scenarios in medicine itself.
The natural thing to do is crack open the database and let algorithms utilize it. This whole thing of dumping data in an abstruse and machine-challenging format, then building a Rube Goldberg machine to reverse the transformation, is not right.
Anyway, I mention this because there seems to be a thought of "these PDFs are messy, let's clean them" without looking at what's really going on here.
PDFs suck in many ways but are durable and portable. If I work with two oncologists, I use the same pdf.
The author means well but his solution will likely be worse because only he will understand it. And there’s a million edge cases.
Can you expand on this?
That's basically medical information system vendors.
The fact that the US hasn't pushed open source EMRs through CMS is insane. It's literally the perfect problem for an open solution.
"Contact the CAP for more information about licensing and using the CAP electronic Cancer Protocols for cancer reporting at your institution."
This stinks of the same gate-keeping that places like NIST and ISO do, charging you for access to their "standards".
I didn't see anything in the screenshots presented that wouldn't be doable in a single HTML file containing the data, styles and scripts?
This is a countercultural idea but it fits so many use cases; it's a tragedy we don't do this more often. The two options are either PDF or SaaS.
Not just that, PDFs are one of the few formats where I'm willing to bet my own money that they'll still work in 10 or 20 years.
Even basic html has changed, layouts look different depending on many factors, and even the <blink>-ing doesn't work anymore.
Much easier for doctors to draft PDFs than graphs.
Oncology has a rapidly changing treatment landscape and it’s common for oncologists to be discussing the latest paper that has come out.
If you’re an oncologist and not keeping up with the literature you’re going to be out of date in your decisions in about 6 months from graduation.
The above is a high quality comment with worthy areas to study.
Additionally I would draw your attention to NCCN’s “Developer API”, which is not interesting technologically, but is interesting for how it reflects the IP landscape.
https://www.nccn.org/developer-api
Not a lot of high quality data exists for human health. Clinical guidelines for many diseases are built around surprisingly scant evidence many times.
> even if they're so complex no human can understand them?
That’ll be wonderful to explain in court when they figure out it was just data smuggling or whatever other bias.
The practice of real world medicine often interpolates between these data points.
As a side note, the German associations of oncology publish their guidelines here (HTML and SVG graphs): https://www.onkopedia.com/de/onkopedia/guidelines
Nothing undermines medicine quite so thoroughly as yet another astronaut trying to force it into a data structure.
Don't signed PDFs include a timestamp, however?
But the whole system is a mess. And the whole SMART guideline system is controlled by 2-3 gatekeepers who don't listen to any ideas other than their own.
Yeah, right. And then say "Die". /s
The guidelines shall be structured properly. It is not rocket science.