There is a long history of evaluating medical procedures using an RCT not using a placebo but using “treatment as usual” or other active interventions. It is quite possible to use an RCT with an active control to honestly evaluate if a new surgical intervention shows a benefit beyond treatment as usual. It just asks a slightly different, yet adequately useful question.
It is a pretty common misconception that every kind of intervention needs the type of RCT design meant for drugs! This is not the case.
There are also other ways to generate sounder causal evidence to approximate the true clinical benefit with good observational studies. When clinicians choose to do such research does the result show that older estimates were true and correct or do they often have to revise it down? I would love to know if surgeons have been doing a better job on this front than pharma! Now that would be compelling.
> There is a long history of evaluating medical procedures using an RCT not using a placebo but using “treatment as usual” or other active interventions.
Presumably if the control was not a placebo but an actual treatment, the fraction of experimental arms which demonstrated superiority would be even smaller, and so the implied validity of all the new surgeries coming in without RCT testing even worse than that review finds...
> There are also other ways to generate sounder causal evidence to approximate the true clinical benefit with good observational studies. When clinicians choose to do such research does the result show that older estimates were true and correct or do they often have to revise it down?
The problem with observational studies is not that they are observational per se. What matters is whether they were designed and analyzed in such a way that they have the capacity to answer a specific causal question. Unfortunately, the answer most of the time is no. The combination of design and analysis has historically been wrong, and such studies have a track record of being reversed by later RCTs. But this is not an indictment of every observational study design or its capacity in theory to get the right answer.
It turns out that many observational studies do have the capacity to produce results consistent with RCTs, but only if you know what you are doing. There is now a large body of work done by Hernan, Robins and collaborators over the last 20 years showing the way.
For example, this paper from 15+ years ago (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3731075/) laid the groundwork for building such a case. They show that if you account for immortal time bias in the Nurses' Health Study, and if the study measured the relevant information, the results could have been compatible with the answers from RCTs. It has since been developed into the target trial emulation framework (https://academic.oup.com/aje/article/183/8/758/1739860) and has been adopted by the FDA as a requirement for any purported real-world evidence claims.
Conceptually, this is a pretty powerful idea and in my humble opinion widely underutilized outside of epidemiology. That said it isn't a panacea. There are lots of papers jumping on the emulation hype bandwagon without doing the necessary validation and due diligence. I'm personally hopeful that we can start to move away from the era of "RCTs or nothing" to something more pluralistic without giving up rigorous standards.
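To make the immortal time bias mentioned above concrete, here is a minimal simulation sketch. It is not the analysis from the linked paper: the hazard, the sample size, and the function name `immortal_time_demo` are all assumptions chosen for illustration. The simulated treatment has no effect at all, yet the naive analysis makes it look protective, because people who eventually start treatment must have survived long enough to start it.

```python
import random

def immortal_time_demo(n=20000, hazard=0.1, seed=0):
    """Simulate a treatment with NO effect and compare two analyses."""
    rng = random.Random(seed)
    # [person_time, deaths] per group, one table per analysis
    naive = {"treated": [0.0, 0], "untreated": [0.0, 0]}
    split = {"treated": [0.0, 0], "untreated": [0.0, 0]}
    for _ in range(n):
        death = rng.expovariate(hazard)  # time of death, same hazard for everyone
        start = rng.expovariate(hazard)  # time at which treatment would begin
        if start < death:
            # Naive: ever-treated people count as treated from time zero,
            # although they could not have died before starting treatment.
            naive["treated"][0] += death
            naive["treated"][1] += 1
            # Time-split: the wait before treatment counts as untreated time.
            split["untreated"][0] += start
            split["treated"][0] += death - start
            split["treated"][1] += 1
        else:
            # Died before treatment could start: untreated in both analyses.
            for table in (naive, split):
                table["untreated"][0] += death
                table["untreated"][1] += 1

    def rate_ratio(table):
        treated_rate = table["treated"][1] / table["treated"][0]
        untreated_rate = table["untreated"][1] / table["untreated"][0]
        return treated_rate / untreated_rate

    return rate_ratio(naive), rate_ratio(split)

naive_rr, split_rr = immortal_time_demo()
print(f"naive death-rate ratio:      {naive_rr:.2f}")
print(f"time-split death-rate ratio: {split_rr:.2f}")
```

Under these assumptions the naive ratio should come out well below 1 (a spurious benefit), while the time-split ratio should sit close to 1, which is the true answer here. Real emulations have to handle much more than this one bias, so treat it as an illustration only.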
> It turns out that many observational studies do have the capacity to produce results consistent with RCTs, but only if you know what you are doing. There is now a large body of work done by Hernan, Robins and collaborators over the last 20 years showing the way.
As I've told Shpitser and others, I'll believe that works when you show me nontrivial, controversial, pre-registered predictions which are subsequently verified by RCTs, and not post hoc analyses where you already know the right answer and then - mirabile dictu! - your re-analysis gets the right answer.
As you say, it's been '20 years'. Where are they all?
Totally agree with that stance. A lot of theoretical progress in causal inference, but theory guided empirical work has been slow. Most of the theorists aren't scientists themselves and it shows in the types of papers they write and topics they pursue.
The RCT Duplicate initiative is one such effort, a pre-registered design to evaluate emulators against RCTs. Their results came in recently: https://jamanetwork.com/journals/jama/fullarticle/2804067. Clearly, emulatability is not guaranteed; it has to be tested for every new scenario. The FDA real-world evidence department is basically asking that people do more such work for any real-world evidence claims. I think that is promising.
If we lived without the FDA we, or anyway I, would subscribe to a service that did the same testing. But because the finances would be so much more difficult the number of drugs endorsed by that service would be comparatively tiny. It would be like living with the pharmaceuticals of fifty years ago.
Waiting to use drugs until the evidence is strong does not mean waiting for a purpose made clinical trial. You can also wait for people who are more experimental than you to take the drug and observe its safety and effectiveness in the population. So the number of drugs available to you with very strong evidence would very likely increase since so many more drugs can be developed and developed quicker without the FDA.
Who is going to be keeping track of these "people more experimental than I"? The manufacturer? Am I supposed to trust what he says? How is that supposed to work? In the world you envision I would read about some drug in the newspaper and call my service and ask what had happened to the people who had taken the drug up to that date. How is my service supposed to get that information? More to the point, how would I know to trust the sources the service is relying on? What a nightmare..
In the world I live in "Very strong evidence" always costs money. In this department, lots of money.
I think we should remember the central point of the post here. All of your issues apply even more to surgery than to pharmaceuticals because there is zero mandated testing for surgery and testing surgical procedures is far more difficult than testing pharmaceuticals. Yet, surgery works well. It is evidence based and improving over time. How do you explain this with your story of collapsing pharma quality in the absence of the FDA?
Science doesn't work by carefully deciding which authorities to trust - it works by having competing groups independently replicate key findings. To put it another way, we shouldn't turn to a trusted newspaper for these questions, we should turn to meta-analyses in Pubmed.
I guess most people will reflexively argue that only highly trained professionals can possibly understand medical literature, but that's bunk. An average non-expert could easily use this site https://c19early.org to see that there are solid RCTs showing metformin prevents long Covid and acetaminophen probably makes Covid worse. I informed my primary care physician about these academic research findings, not the other way around! It's dangerous to only depend on a trusted medical priesthood comparing notes with each other in a closed echo chamber. There's also wisdom in crowds. But only if the crowd is empowered to make decisions for itself.
It would be a solid investment to build better testing and meta-analysis infrastructure so the crowd can make more informed decisions. I recently posted a proposal:
I am not very sure surgery is a good counterfactual to drugs. Surgery is a sort of mechanical procedure where the outcomes are much more predictable by "human intuition" - In a sense it's much more similar to tinkering with wood than it is to chemically altering human bodies. A good surgeon will often have what is called "flair" - that means they can slightly innovate at the edges because it's somewhat easy to predict the impact of a small modification to the procedure. Then, that often gets published and slowly adopted by other surgeons - it reminds me a bit of a guild-like situation where artisans better their craft progressively.
But with drugs, you do not have "talented drug makers" who simply have a good flair for predicting the results of drug trials. It's not within the realm of "human intuition".
So that is a fundamental difference in how these things work. But there is also the fact that surgeons do face rigorous oversight in the form of malpractice suits. If you innovate too much as a surgeon and kill your patient, you are screwed. There is also direct feedback in this case - which you do not have with drugs. So this naturally keeps them in check. I am not sure how you can have the same thing for drugs - let them be tested - and then if they do nothing or kill people sue in a few years when you can evaluate the results?
I definitely agree with your description of how progress is made in surgery. But if you look at the history I think there are lots of cases where intuition among surgeons was wrong and the truth was very counterintuitive and took a long time and innovative tests to prove. In the drug realm I'm remembering Alex Telford's post on the early days of Janssen which did seem to have more flair and intuition. I think the very different regulatory environments and professional structures are exaggerating the actual difference between surgery and medicine here.
I also agree that malpractice is an important constraint on surgeons but I think this reinforces my point. We have an existence proof of decentralized mechanisms for consumer protection that can arise even when the product is difficult to evaluate and life-or-death. I think we can expect something similar to arise in pharmaceuticals in the absence of the FDA. There are challenges as you point out but they don't seem obviously harder than the challenges faced by surgery.
> I think the very different regulatory environments and professional structures are exaggerating the actual difference between surgery and medicine here.
that could be true, agreed.
> In the drug realm I'm remembering Alex Telford's post on the early days of Janssen which did seem to have more flair and intuition.
yeah so it's complicated, we have mostly switched from a thing called "phenotypic drug discovery" to "target based drug discovery". See this review: https://www.nature.com/articles/nrd.2017.111
But even when drug discovery was more "flair" based, the human intuition on what worked as a drug was not very good. Like, it's just fundamentally harder for the human mind to know what will happen if you chemically alter a receptor whose function we do not even precisely know than to know what will happen if you move a vein a bit further to the left. It's super complicated biochemistry vs mechanics. A car mechanic is pretty good at predicting what will happen to a car if they do X thing.
I totally agree we need to take bigger risks. This is from a doctor friend: "So all pharma talks about is “de-risking” but I wish regulatory reform to make phase I trials less expensive could be on the table.
Because I think there is no way of getting around the need for human testing
And I have patients who die all the time who would have wanted to be on a trial at the end but couldn’t be.
But I can’t write something saying “the people who regulate me are idiots and the work done by my lab and most of my institution is mostly BS” under my real name lol"
I am very open to the anti FDA thing btw or at least some big decrease in regulation. So this is not really an argument against ban on FDA per se, more that I am really not sure surgery is a compelling case
How would your system work, exactly? Here is one guess (feel free to correct it). A company's lab worker tells management they have developed a drug that cures heart disease in mice. The CEO calls a press conference and announces they have this new drug and will supply it to anyone with the cash. That the way your system would work? Have I made clear my objections to it?
My concern is not collapsing "pharma quality" but collapsing trust.
Previously I might have said "it will work like cars or computers. Companies will prototype and extensively test new products before releasing them. Third parties will test and review them to provide info for customers. Etc"
But now, I would look at surgery as a better counterfactual. Prescribing doctors trading information among themselves plays a bigger role.
Also, I would be happy for the FDA to continue testing drugs and providing data to customers. But they needn't ban them and let thousands die waiting to do that!
I think perhaps we have very different ideas about how humans work. It is my observation that humans make really serious errors all the time. (7 out of 10 people believe in angels. Would you be willing to stipulate that I could find hundreds of comparable examples?) It is obvious to me that many more thousands would die without banning because they would chase after some weird drug that their next door neighbor was enthusiastic about. You think not? I congratulate you on your social circle. But it is not mine.
* RCT of knee surgery--no significant benefits. https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2794027 Note the literature review: "The RCT conducted by Sihvonen et al found that arthroscopic partial meniscectomy is associated with a slightly increased risk of radiographic knee OA compared with exercise therapy. The study by Katz et al found a 5 times higher risk for total knee replacement (ie, the treatment for end-stage knee OA) after surgery vs exercise-based physical therapy. However, the trials by Berg et al, Herrlin et al, and Sonesson et al that compared surgery with exercise therapy found no clinically relevant difference between the 2 treatments for OA progression."
I.e., based on 6 RCTs here, knee surgery is either the same or worse than physical therapy.
But almost no drugs are like parachutes, and when a drug actually works at an undeniable level (e.g., Gleevec, penicillin), there is almost no delay arising from the FDA. (The NDA for Gleevec was filed on Feb. 27, 2001, and was approved by the FDA on May 10, 2001! https://www.accessdata.fda.gov/drugsatfda_docs/nda/2001/21-335_Gleevec_Approv.pdf).
Most drugs have very small effects (if any): in one review of several dozen cancer drugs, the median improvement to life expectancy was a mere 2.1 months, and worse, many cancer drugs don't have any measurable benefit to life expectancy at all. See https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5695531/
Given how toxic those drugs are, it is a scandal that patients and payors (Medicare, etc.) often pay $100k or more for drugs that barely provide a benefit, if any at all.
Historically, the main oversight of surgeon quality, when it was mostly an in-patient activity, was via hospital quality review and medical staff oversight. These can be frustratingly clunky but generally bend toward ensuring quality over time. I worry that the push toward more profit-motivated/cost-saving outpatient surgery will blunt these tools and generally lower quality, although the effect will be almost imperceptibly slow. In this sense, it makes me worry that surgery will go the way of the almost totally unregulated field of dentistry.
Oh my friend, obviously you're a young-un. Let me tell you an old-timer story.
Bypass surgeries, including triple bypass surgeries, used to be done left, right and center. Coronary arteries are blocked up? Let's open 'em up! It's self-evident that if you unclog them the patients will do better, right?
Yes, well... ahem. When they finally did the RCTs, they found... most bypass surgeries were of no benefit. Nada. Zilch. There have been several other RCTs that found sham surgeries and actual surgeries perform equally well in many instances.
This is why you see so few bypass surgeries nowadays.
The bypass study is unusual, and was probably done because cardiologists were competing for a slice of the economic pie and taking patients away from the surgeons.
This is not to say surgeries don't typically self-regulate without the FDA. They do, as with the repair of bone fractures or the reattachment of amputated limbs. But those are surgeries where the benefits are clear and relatively rapid. In instances where the benefit is uncertain (like mortality over months or years), you need RCTs to be sure.
Surgeons don't have free entry. It takes almost a decade of training to be a surgeon. They're incentivized to not fuck up because they can lose their licence and no one wants to do a career change in their 40s. Same principle applies to lawyers. To make the pharmaceutical industry comparable you would have to force every industry owner and major employee to go through the same process. Perhaps it's more effective to have a licence Raj for people rather than processes.
Without endorsing a specific position, consider:
1. The scientific feedback loop in surgery is in some important ways better than in drugs, in the sense that non-randomized evidence does a lot to improve the science. When surgery goes wrong, it often enough goes wrong in ways that can be reliably attributed to specific process sub-components, allowing iterative improvement. Causal mechanisms of drug side effects etc. are far harder to follow and mitigate at the level of a single case.
2. If you bucket dental procedures as surgeries, I think you would clearly observe large amounts of fraud, unnecessary work, and wrong practices. Is medical surgery simply different from dental? Why? In part it may be due to medical surgery's extremely selective training and high stakes; even within medicine it is an outlier and consumes high-quality human capital in a way that may be unwise to scale.
3. Most new surgeries are risky to patients and expensive, and so are restricted to cases where we expect large benefits on the core concerns. But the case for FDA-style regulation is, paradoxically, best for treatments which might have small margins of harm. If net effect is big and bad, we will figure it out before many people die (and can theoretically resolve costs via lawsuits and insurance). If net effect is small and bad, lacking a regulator with an RCT requirement, we will give it to everyone and fail to figure out the story.
4. Speaking of small harms: I haven't investigated this issue, but I do worry that iatrogenic harm due to e.g. general anaesthetics might not be properly quantified and accounted-for across surgery as a discipline.
5. For most of history, surgery was responsible for enormous iatrogenic harm and possibly a net-negative. Why did this reverse? What specific institutions are responsible? Does this imply that the answer on our FDA question should depend on those institutions or features of the treatment area?
All great points and great questions below, thank you for reading and commenting!
I agree with you on dental surgery but I guess I'd come back to "surgery isn't perfect but it's not obviously worse than pharma." There are lots of examples of unnecessary and iatrogenic prescriptions, e.g. opioids, which suggest that iatrogenic harm is not in general alleviated by the FDA.
I'm not so sure about your third point. I see what you're saying in that small net harms might not be able to get past the transaction costs of non-FDA consumer protection methods like the legal system. But this seems okay to me because for this sacrifice we get big benefits to the pace of pharmaceutical innovation and the price of drugs.
Your observation about historical surgery seems true to me and the question is interesting. I think that medicine had a similarly poor record in the premodern era, although there were some successes like willow bark. One thing I know about here is trepanning: basically, drilling a hole in the skull to release pressure. It seems like a classic case of brutal premodern surgery, but anthropologists find tons of trepanned skulls which have healed significantly around the edges, suggesting many years of life after the procedure, so there are arguably some surgical successes too.
I like the way you think about these issues, Maxwell. Thanks for engaging!
On the issues of small harms/small effects: I think the fundamental matter is less about transaction costs, and more about "does the requisite evidence ever get generated [without a regulatory backstop to force RCTs]"
I think many questions about small effects simply cannot be answered without large RCTs [or a totally revamped quasi-randomization infrastructure, which we just don't have right now]. The observational analysis paradigm simply breaks down when you want to find small effects.
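A rough sketch of why small effects demand large trials, using the standard normal-approximation sample size formula for comparing two proportions. The event rates below are made-up numbers for illustration, and `n_per_arm` is a hypothetical helper, not something from any cited study. It only covers sampling noise; confounding in observational data is a separate problem that more data does not fix.

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate participants needed per arm to detect p_control vs p_treated."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = z.inv_cdf(power)           # quantile for the desired power
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)

# Hypothetical adverse-event rates: a large harm vs a small harm.
big = n_per_arm(0.10, 0.20)    # harm doubles the event rate
small = n_per_arm(0.10, 0.11)  # harm adds one percentage point
print(f"large harm: about {big} per arm")
print(f"small harm: about {small} per arm")
```

Because the required sample size scales with one over the squared difference, shrinking the harm from ten percentage points to one pushes the requirement from roughly a couple of hundred per arm to well over ten thousand.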
Some academic departments are bought into evidence production from observational data. It lets them publish more papers; and grant proposals for observational studies are much cheaper to fund. But, often, the core questions remain unanswered, and it falls to the political economy to resolve them.
As an aside: an important second-order effect of regulatory RCTs is the removal of incentives for corrupting the scientific system. [e.g., look at what happened to our nutrition academics! https://www.npr.org/sections/thetwo-way/2016/09/13/493739074/50-years-ago-sugar-industry-quietly-paid-scientists-to-point-blame-at-fat ]. Regulatory RCTs are oddly pro-innovation if they can protect our epistemics and scientific human capital.
I think the question you need to answer is: what will replace the FDA as an enforcer of clinical trials quality, or as a generator of evidence?
One answer is to posit that we can develop better predictive tools with AI, which would let us move away from statistical measurement and toward cheaper yet more reliable ways of predicting drug effects.
Another answer is to posit that we are moving toward a world of highly effective medicines, so that the proportion of wonder drugs will be high enough compared to the duds that we can be less stringent. (The duds are, I think, so numerous that currently I do not think this is a good trade.)
Another answer would be to refocus your assault onto specific FDA sub-components, such as medical tests or medical devices, where the concerns I've mentioned are smaller. For example, I think the comparison of surgery vs medical devices is excellent and much more apt than surgery vs drugs
I agree with all your points, especially when it comes to tracking latent negative effects of drugs that may be prescribed for years. Maybe this example is far-fetched, but it reminded me of the case of social media. That industry was completely unregulated, and even after more than 15 years, it's an endless legal battle over whether the tech giants can be held responsible for the damage they caused.
Trying to steelman a case against this: maybe non-FDA-regulated surgery only works because there is no such thing as Big Surgery, whereas there is Big Pharma. That is, maybe the incentives and capabilities of megacorporations make profiting from fraudulent drugs mechanically easier and more tempting than profiting from fraudulent surgical procedures. So the issue isn't just the information problem, it's the presence of a Molochian entity willing and able to hyperexploit that problem.
How might we check whether this is a plausibly salient difference? Note that even if it is, one can imagine different solutions to it than just FDA regulation. But whether it's worth fleshing out those solutions depends on whether this is actually what's going on.
Leaving aside the degree to which "big pharma" is an artifact of the FDA, there kind of is a big surgery. Check out the market for hip replacement parts for the most obvious example, but all surgery uses more and more specialised tools, increasingly ultra-sophisticated robots and _lots_ of drugs. Not to mention 6 or 7 figure scanners.
I can also imagine a world in which we had even bigger surgery and this made surgery cheaper, faster and even safer.
Also, we have big mining and big energy and big auto and big food. It isn't obvious that big pharma causes more harm, but the FDA certainly does!
Aren't all these "medical devices" under the purview of FDA approval?
> Running an RCT on a surgical technique is therefore difficult. Standardizing treatment as much as in pharmaceutical trials is basically impossible. It also isn’t clear what a surgical placebo should be. Do you just put them under anesthetic for a few hours? Or do you cut people open and stitch them up without doing anything else? So surgical RCTs are rare and small when they happen.
But it is still worth discussing what we know from those RCTs when they *do* happen, one would think...
"Use of placebo controls in the evaluation of surgery: systematic review", Wartolowska et al 2014: https://www.bmj.com/content/348/bmj.g3253
>> In 39 out of 53 (74%) trials there was improvement in the placebo arm and in 27 (51%) trials the effect of placebo did not differ from that of surgery. In 26 (49%) trials, surgery was superior to placebo but the magnitude of the effect of the surgical intervention over that of the placebo was generally small.
Fair enough. I don't think these results are too surprising or negative about surgery since they are looking at minor surgeries, often with subjective outcomes, and low power.
"Most of the trials investigated minor and not directly life threatening conditions, such as severe obesity (n=7; 13%) or gastro-oesophageal reflux (n=6; 11%). The most common type of intervention was endoscopy, with 23 trials (43%) using this technique as a part of the investigated procedure. Thirteen trials (25%) used some exogenous material, implant, or tissue, and a further six used balloons. Most studies reported subjective outcomes such as pain (n=13; 25%), improvement in symptoms or function (n=17; 32%), or quality of life (n=8; 15%). Less than half of the trials (n=22; 42%) reported an objective primary outcome—that is, measures that did not depend on judgment of patients or assessors. The majority of trials were small; the number of randomised participants ranged between 10 and 298, with a median of 60."
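To put a rough number on the "low power" point: the review's median trial randomised 60 participants, i.e. about 30 per arm if split evenly. Below is a minimal sketch of what that implies, assuming a simple two-sided z-test on a standardized mean difference with equal arms. The function name and the effect sizes tried are illustrative assumptions, not figures from the review, and real trials often use other outcome types and tests.

```python
from math import sqrt
from statistics import NormalDist


def two_arm_power(effect_size: float, n_per_arm: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is a standardized mean difference (Cohen's d);
    both arms are assumed to have n_per_arm participants.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)          # critical value, ~1.96 for alpha=0.05
    shift = effect_size * sqrt(n_per_arm / 2)  # expected z-statistic under the alternative
    # Probability of landing in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical effect sizes: small, medium, large.
    for d in (0.2, 0.5, 0.8):
        print(f"d={d}: power ~ {two_arm_power(d, 30):.2f}")
```

Under these assumptions, a 60-person trial has roughly 12% power for a small effect (d=0.2), about 49% for a medium one (d=0.5), and about 87% for a large one (d=0.8). So a "no difference from placebo" result in a trial of that size is weak evidence against a small or moderate benefit.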
There is a long history of evaluating medical procedures using an RCT not using a placebo but using “treatment as usual” or other active interventions. It is quite possible to use an RCT with an active control to honestly evaluate if a new surgical intervention shows a benefit beyond treatment as usual. It just asks a slightly different, yet adequately useful question.
It is a pretty common strawman to mistakenly think that every kind of intervention needs to use the type of RCT design meant for drugs! This is not the case.
There are also other ways to generate sounder causal evidence to approximate the true clinical benefit with good observational studies. When clinicians choose to do such research does the result show that older estimates were true and correct or do they often have to revise it down? I would love to know if surgeons have been doing a better job on this front than pharma! Now that would be compelling.
> There is a long history of evaluating medical procedures using an RCT not using a placebo but using “treatment as usual” or other active interventions.
Presumably if the control was not a placebo but an actual treatment, the fraction of experimental arms which demonstrated superiority would be even smaller, and so the implied validity of all the new surgeries coming in without RCT testing even worse than that review finds...
> There are also other ways to generate sounder causal evidence to approximate the true clinical benefit with good observational studies. When clinicians choose to do such research does the result show that older estimates were true and correct or do they often have to revise it down?
Observational studies do poor jobs of predicting later randomized experiments: https://gwern.net/correlation (This is particularly apparent where the researchers have access to very large N & k and very large sets of covariates, like Facebook: https://gwern.net/doc/statistics/causality/2019-gordon.pdf )
> Observational studies do poor jobs of predicting later randomized experiments: https://gwern.net/correlation
The problem with observational studies is not that they are observational per se. It is important to identify if they were designed and analyzed in such a way that they have the capacity to answer a specific causal question or not. Unfortunately, the answer most of the time is no. The combination of (design + analysis) has historically been wrong and these have a track record of medical reversals with RCTs. But this is not an indictment of every observational study design or its capacity in theory to get the right answer.
It turns out that many observational studies do have the capacity to produce results consistent with RCTs, but only if you know what you are doing. There is now a large body of work done by Hernan, Robins and collaborators over the last 20 years showing the way.
For example, this paper from 15+ years ago (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3731075/) laid the groundwork for building such a case. The authors show that if you account for immortal time bias in the Nurses' Health Study, and if the study measured the relevant information, the results could have been compatible with answers from RCTs. The approach has since been developed into the target trial emulation framework (https://academic.oup.com/aje/article/183/8/758/1739860) and has been adopted by the FDA as a requirement for any real world evidence claims.
Conceptually, this is a pretty powerful idea and in my humble opinion widely underutilized outside of epidemiology. That said it isn't a panacea. There are lots of papers jumping on the emulation hype bandwagon without doing the necessary validation and due diligence. I'm personally hopeful that we can start to move away from the era of "RCTs or nothing" to something more pluralistic without giving up rigorous standards.
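For readers unfamiliar with immortal time bias, here is a toy simulation of the idea. It is a sketch under made-up assumptions (a constant death hazard, half the cohort scheduled to start a drug at a random time, and a drug with no effect at all), not a reproduction of the analysis in the linked papers. The naive comparison labels people "treated" for the whole follow-up even though they had to survive until treatment started; the person-time analysis counts pre-treatment time as untreated.

```python
import random


def simulate(n=100_000, hazard=0.1, follow_up=5.0, seed=0):
    """Return (naive risk ratio, person-time rate ratio) for a drug with NO true effect."""
    rng = random.Random(seed)
    people = {"treated": 0, "untreated": 0}
    naive_deaths = {"treated": 0, "untreated": 0}
    person_time = {"treated": 0.0, "untreated": 0.0}
    timed_deaths = {"treated": 0, "untreated": 0}

    for _ in range(n):
        death_time = rng.expovariate(hazard)   # same hazard for everyone
        scheduled = rng.random() < 0.5         # half are due to start the drug
        start = rng.uniform(0, follow_up) if scheduled else float("inf")
        end = min(death_time, follow_up)       # end of observation
        died = death_time < follow_up
        ever_treated = start < end             # must survive until start

        # Naive analysis: classify by "ever treated" from time zero.
        group = "treated" if ever_treated else "untreated"
        people[group] += 1
        naive_deaths[group] += died

        # Person-time analysis: time before treatment counts as untreated.
        person_time["untreated"] += min(start, end)
        if ever_treated:
            person_time["treated"] += end - start
        if died:
            timed_deaths[group] += 1           # state at the moment of death

    risk = {g: naive_deaths[g] / people[g] for g in people}
    rate = {g: timed_deaths[g] / person_time[g] for g in people}
    return risk["treated"] / risk["untreated"], rate["treated"] / rate["untreated"]


if __name__ == "__main__":
    naive_rr, rate_ratio = simulate()
    print(f"naive risk ratio:        {naive_rr:.2f}")
    print(f"person-time rate ratio:  {rate_ratio:.2f}")
```

In this setup the naive risk ratio comes out far below 1, making a useless drug look strongly protective, while the person-time rate ratio sits close to 1. The fix here is purely a matter of aligning the start of follow-up with treatment assignment, which is the kind of design error target trial emulation is meant to catch.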
> It turns out that many observational studies do have the capacity to produce results consistent with RCTs, but only if you know what you are doing. There is now a large body of work done by Hernan, Robins and collaborators over the last 20 years showing the way.
As I've told Shpitser and others, I'll believe that works when you show me nontrivial, controversial, pre-registered predictions which are subsequently verified by RCTs, and not post hoc analyses where you already know the right answer and then - mirabile dictu! - your re-analysis gets the right answer.
As you say, it's been '20 years'. Where are they all?
Totally agree with that stance. A lot of theoretical progress in causal inference, but theory guided empirical work has been slow. Most of the theorists aren't scientists themselves and it shows in the types of papers they write and topics they pursue.
The RCT Duplicate initiative is one such effort, a pre-registered design to evaluate emulators against RCTs. Their results came in recently: https://jamanetwork.com/journals/jama/fullarticle/2804067. Clearly emulatability is not guaranteed; it has to be tested for every new scenario. The FDA real world evidence department is basically asking that people do more such work for any real world evidence claims. I think that is promising.
If we lived without the FDA we, or anyway I, would subscribe to a service that did the same testing. But because the finances would be so much more difficult the number of drugs endorsed by that service would be comparatively tiny. It would be like living with the pharmaceuticals of fifty years ago.
Waiting to use drugs until the evidence is strong does not mean waiting for a purpose made clinical trial. You can also wait for people who are more experimental than you to take the drug and observe its safety and effectiveness in the population. So the number of drugs available to you with very strong evidence would very likely increase since so many more drugs can be developed and developed quicker without the FDA.
Who is going to be keeping track of these "people more experimental than I"? The manufacturer? Am I supposed to trust what he says? How is that supposed to work? In the world you envision I would read about some drug in the newspaper and call my service and ask what had happened to the people who had taken the drug up to that date. How is my service supposed to get that information? More to the point, how would I know to trust the sources the service is relying on? What a nightmare..
In the world I live in "Very strong evidence" always costs money. In this department, lots of money.
I think we should remember the central point of the post here. All of your issues apply even more to surgery than to pharmaceuticals because there is zero mandated testing for surgery and testing surgical procedures is far more difficult than testing pharmaceuticals. Yet, surgery works well. It is evidence based and improving over time. How do you explain this with your story of collapsing pharma quality in the absence of the FDA?
Science doesn't work by carefully deciding which authorities to trust - it works by having competing groups independently replicate key findings. To put it another way, we shouldn't turn to a trusted newspaper for these questions, we should turn to meta-analyses in Pubmed.
I guess most people will reflexively argue that only highly trained professionals can possibly understand medical literature, but that's bunk. An average non-expert could easily use this site https://c19early.org to see that there are solid RCTs showing metformin prevents long Covid and acetaminophen probably makes Covid worse. I informed my primary care physician about these academic research findings, not the other way around! It's dangerous to only depend on a trusted medical priesthood comparing notes with each other in a closed echo chamber. There's also wisdom in crowds. But only if the crowd is empowered to make decisions for itself.
It would be a solid investment to build better testing and meta-analysis infrastructure so the crowd can make more informed decisions. I recently posted a proposal:
https://open.substack.com/pub/cbuck/p/consumer-reports-for-medicines
I am not very sure surgery is a good counterfactual to drugs. Surgery is a sort of mechanical procedure where the outcomes are much more predictable by "human intuition" - In a sense it's much more similar to tinkering with wood than it is to chemically altering human bodies. A good surgeon will often have what is called "flair" - that means they can slightly innovate at the edges because it's somewhat easy to predict the impact of a small modification to the procedure. Then, that often gets published and slowly adopted by other surgeons - it reminds me a bit of a guild-like situation where artisans better their craft progressively.
But with drugs, you do not have "talented drug makers" that simply have a good flair for predicting results of drug trials. It's not within "human intuition" realm.
So that is a fundamental difference in how these things work. But there is also the fact that surgeons do face rigorous oversight in the form of malpractice suits. If you innovate too much as a surgeon and kill your patient, you are screwed. There is also direct feedback in this case, which you do not have with drugs. So this naturally keeps them in check. I am not sure how you can have the same thing for drugs: let them be tested on patients, and then, if they do nothing or kill people, sue in a few years once the results can be evaluated?
I definitely agree with your description of how progress is made in surgery. But if you look at the history I think there are lots of cases where intuition among surgeons was wrong and the truth was very counterintuitive and took a long time and innovative tests to prove. In the drug realm I'm remembering Alex Telford's post on the early days of Janssen which did seem to have more flair and intuition. I think the very different regulatory environments and professional structures are exaggerating the actual difference between surgery and medicine here.
I also agree that malpractice is an important constraint on surgeons but I think this reinforces my point. We have an existence proof of decentralized mechanisms for consumer protection that can arise even when the product is difficult to evaluate and life-or-death. I think we can expect something similar to arise in pharmaceuticals in the absence of the FDA. There are challenges as you point out but they don't seem obviously harder than the challenges faced by surgery.
Thank you for reading and commenting!!
> I think the very different regulatory environments and professional structures are exaggerating the actual difference between surgery and medicine here.
that could be true, agreed.
> In the drug realm I'm remembering Alex Telford's post on the early days of Janssen which did seem to have more flair and intuition.
yeah so it's complicated, we have mostly switched from a thing called "phenotypic drug discovery" to "target based drug discovery". See this review: https://www.nature.com/articles/nrd.2017.111
But even when drug discovery was more "flair" based, the human intuition on what worked as a drug was not very good. Like, it's just fundamentally harder for the human mind to know what will happen if you chemically alter a receptor we do not even precisely know the function of than what will happen if you move a vein a bit further to the left. It's super complicated biochemistry vs mechanics. A car mechanic is pretty good at predicting what will happen to a car if they do X thing.
I totally agree we need to take bigger risks. This is from a doctor friend: "So all pharma talks about is “de-risking” but I wish regulatory reform to make phase I trials less expensive could be on the table.
Because I think there is no way of getting around the need for human testing
And I have patients who die all the time who would have wanted to be on a trial at the end but couldn’t be.
But I can’t write something saying “the people who regulate me are idiots and the work done by my lab and most of my institution is mostly BS” under my real name lol"
I am very open to the anti FDA thing btw or at least some big decrease in regulation. So this is not really an argument against ban on FDA per se, more that I am really not sure surgery is a compelling case
How would your system work, exactly? Here is one guess (feel free to correct it). A company's lab worker tells management they have developed a drug that cures heart disease in mice. The CEO calls a press conference and announces they have this new drug and will supply it to anyone with the cash. That the way your system would work? Have I made clear my objections to it?
My concern is not collapsing "pharma quality" but collapsing trust.
Previously I might have said "it will work like cars or computers. Companies will prototype and extensively test new products before releasing them. Third parties will test and review them to provide info for customers. Etc"
But now, I would look at surgery as a better counterfactual. Prescribing doctors trading information among themselves plays a bigger role.
Also, I would be happy for the FDA to continue testing drugs and providing data to customers. But they needn't ban them and let thousands die waiting to do that!
I think perhaps we have very different ideas about how humans work. It is my observation that humans make really serious errors all the time. (7 out of 10 people believe in angels. Would you be willing to stipulate that I could find hundreds of comparable examples?) It is obvious to me that many more thousands would die without banning because they would chase after some weird drug that their next door neighbor was enthusiastic about. You think not? I congratulate you on your social circle. But it is not mine.
There are some RCTs of surgeries, and they often find that surgery doesn't work.
* RCT of lumbar fusion surgery. No significant benefits to fusion surgery. https://pubmed.ncbi.nlm.nih.gov/27074066/ Another RCT on lumbar fusion with no significant benefit: https://pubmed.ncbi.nlm.nih.gov/12973134/ See also https://www.sciencedirect.com/science/article/abs/pii/S1529943015017738
* RCT of knee surgery--no significant benefits. https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2794027 Note the literature review: "The RCT conducted by Sihvonen et al found that arthroscopic partial meniscectomy is associated with a slightly increased risk of radiographic knee OA compared with exercise therapy. The study by Katz et al found a 5 times higher risk for total knee replacement (ie, the treatment for end-stage knee OA) after surgery vs exercise-based physical therapy. However, the trials by Berg et al, Herrlin et al, and Sonesson et al that compared surgery with exercise therapy found no clinically relevant difference between the 2 treatments for OA progression."
I.e., based on 6 RCTs here, knee surgery is either the same or worse than physical therapy.
Separately, I agree with Michael Sklar's point: Surgery is often directly life-saving, in front of your very eyes. It can be like the classic BMJ article on parachutes: https://www.bmj.com/content/327/7429/1459?ijkey=ccd0367ae81fdafe32c828281327002c657ed802&keytype2=tf_ipsecsha
But almost no drugs are like parachutes, and when a drug actually works at an undeniable level (e.g., Gleevec, penicillin), there is almost no delay arising from the FDA. (The NDA for Gleevec was filed on Feb. 27, 2001, and was approved by the FDA on May 10, 2001! https://www.accessdata.fda.gov/drugsatfda_docs/nda/2001/21-335_Gleevec_Approv.pdf).
Drugs mostly have very small effects (if any) -- in one review of several dozen cancer drugs, the median improvement to life expectancy was a mere 2.1 months, and worse, many cancer drugs don't have any measurable benefit to life expectancy at all. See https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5695531/
Given how toxic those drugs are, it is a scandal that patients and payors (Medicare, etc.) often pay $100k or more for drugs that barely provide a benefit, if any at all.
The case of coronary artery bypass surgery is interesting too (as is the case of stents). Clear candidate for overoptimistic treatment effect ("M-error") that was corrected decades later. https://www.sensible-med.com/p/the-evidence-that-established-coronary?lli=1
Historically, the main oversight of surgeon quality, when it was mostly an in-patient activity, was via hospital quality review and medical staff oversight. These can be frustratingly clunky but generally bend toward ensuring quality over time. I worry that the push toward more profit-motivated/cost-saving outpatient surgery will blunt these tools and generally lower quality, although the effect will be almost imperceptibly slow. In this sense, it makes me worry that surgery will go the way of the almost totally unregulated field of dentistry.
Wasn't Akerlof's Nobel prize specifically for showing how important regulation is to the used car market?
Oh my friend, obviously you're a young-un. Let me tell you an old-timer story.
Bypass surgeries, including triple bypass surgeries, used to be done left, right and center. Coronary arteries are blocked up? Let's open 'em up! It's self-evident that if you unclog them the patients will do better, right?
Yes, well... ahem. When they finally did the RCTs, they found... most bypass surgeries were of no benefit. Nada. Zilch. There have been several other RCTs that found sham surgeries and actual surgeries perform equally well in many instances.
This is why you see so few bypass surgeries nowadays.
The bypass study is unusual, and was probably done because cardiologists were competing for a slice of the economic pie and taking patients away from the surgeons.
This is not to say surgeries never self-regulate without the FDA. They do, as with repair of bone fractures or reattachment of amputated limbs. But those are surgeries where the benefits are clear and relatively rapid. In instances where the benefit is uncertain (like mortality over months or years), you need RCTs to be sure.
Surgeons don't have free entry. It takes almost a decade of training to become a surgeon. They're incentivized not to fuck up because they can lose their licence, and no one wants to make a career change in their 40s. The same principle applies to lawyers. To make the pharmaceutical industry comparable you would have to force every industry owner and major employee to go through the same process. Perhaps it's more effective to have a licence Raj for people rather than processes.