The Single Clearest Metascience Reform
Faster internal reviews instead of external academic ones
Metascience is a young field whose subjects often take multiple decades to show results, so much is still unknown. Proposals like person-based grants, variance scoring, golden tickets, or FROs have varying degrees of intrigue and evidence, but generally the strongest case one can make is that it’s worth experimenting with these ideas to gather more information.
One proposal in metascience has clearer support: Use less external academic review in grantmaking decisions.
The evidence that external academic review improves decision-making over internal review is weak. We are certain, though, that external academic review is costly to implement and leads to costly strategic behavior on the part of applicants. So we can be pretty confident that we could get at least as good scientific outcomes at much lower cost by using only fast internal reviews.
Does Internal-Only Review Work at Least as Well as External Academic Review?
External academic review is definitely better than random at predicting some measures of scientific impact, like citations or publications. This isn’t a massive surprise or accomplishment, especially since those measures of scientific impact also depend on the opinions of external academics. So at least external review predicts its own opinions better than random.

Li and Agha find that external review scores correlate with impact even among groups of proposals that share lots of other characteristics, like the prestige of the institution the proposal comes from or the PI’s previous success at getting NIH grants or publishing highly cited papers.
Still, there is lots of noise in the correlation between scores and these impact measures, and likely even more noise in the unobserved correlation between scores and actual scientific impact. A similar study on the same data finds that the relationship is especially weak among the top 20 percent of proposals, where scores explain only around 1% of the variance in productivity among funded grants.
But is this any better than internal reviews of grant proposals? Before getting into the empirics, there are a couple of important observations that should influence your prior on the efficacy of internal vs external review.
First, I’ve studiously avoided using the phrase “peer review” to describe external academic reviews because both external and internal reviews are “peer” reviews. The program officers at the NIH or NSF hold PhDs in relevant fields and are expert scientists. They are peers too!
External academic reviewers are probably somewhat closer in expertise to the grants they review, since internal reviewers have to cover a wider range of grants, but you have to tell a pretty strained story for this small extra similarity in research focus to be a big advantage in evaluation.
Second, internal-only review was standard practice until the 1975 National Science Foundation special oversight hearings and Proxmire’s Golden Fleece Awards in the same year. There are obviously lots of confounding variables, but it’s far from clear that the switch to much more external review has gained us any advantage in the targeting of grants or the speed of scientific progress.
More empirically, lots of internal-only review programs at scientific funding agencies have had good results: for example, SGER, RAPID, and EAGER at the NSF. SGER proposals are only 2 pages long, and NSF program officers make internal-only decisions on whether to fund them, often in just a few weeks. SGER disbursed $284 million across 5,000+ awards, and 80% of the SGER winners who applied for further funding from the NSF succeeded, far higher than the NSF’s overall success rate of around 20%.
SGER was succeeded by RAPID and EAGER, which work in much the same way, with short proposals and internal-only review. These programs were able to scale up in response to the Covid pandemic, sending out tens of millions of dollars in grants months before the NIH could respond.
There are also the myriad successes of the ARPA model, where program managers personally seek out projects to fund, sometimes consulting external reviewers but always with the autonomy to fund at their own discretion. ARPA-E, a program focused on energy technology, used a two-tier system of external review plus program manager discretion. Goldstein and Kearney (2018) find that proposals funded because program managers overrode low average external review scores produced just as many patents and journal publications as the funded proposals with high scores.
There is some contrary evidence suggesting that external review can be better than internal discretion. Ginther and Heggeness look at an NIH research fellowship where early-career scientists are often funded “out of order” relative to their external review scores. They find that the scientists who were “reached” for and funded despite other applicants having higher external scores don’t get as many future NIH grants, publications, or citations as the scientists the program officers skipped over.
“Reached” winners get about 0.33 fewer subsequent research project grants (RPGs) from the NIH than the researchers who were skipped higher up in the peer review rankings; the average winner of this fellowship gets 2.8 RPGs, so that’s roughly a 12% reduction. Part of this effect may be explained by program officers optimizing for racial or gender diversity, or for applicants from different states or universities. Alternatively, it may be the result of program officers trying to fund the researchers for whom the funding would have the largest counterfactual impact rather than the researchers with the highest ex ante ability.
The Costs of External Review
If external review has an advantage over internal-only review, it is a small one. But how much do we pay for this small potential advantage?
There are direct financial costs to employing scientists’ labor to review NIH applications. The NIH uses about 30,000 unique external reviewers to produce 250,000 peer review reports for around 65,000 grant applications every year. Conservatively estimating that each report takes two hours to write, and using the average medical-scientist wage of $53 an hour, the cost comes out to $26.5 million every year.
Then, each scientist who writes a report participates in a two-day panel of a dozen or so scientists, where they present the reports they wrote and discuss the proposals. That’s two 8-hour days plus travel time for each of the 30,000 unique reviewers, and perhaps more if some reviewers sit on multiple panels, so that’s at least another $25 million. The NIH also pays for travel, hotel accommodations, and a $700 stipend for each participant, so this is at least another $21 million.
The direct costs of using external scientist labor might be $75 million a year. That’s about as much as 150 R01 grants. Even if the cost were double that, it would still be small compared to the amount of money the NIH disburses. If external peer review can make each dollar of grant money 0.25% more effective, then it’s probably worth paying this cost to employ it.
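To make the arithmetic explicit, here is a back-of-envelope sketch in Python that reproduces the rough figures above. The report, panel, and stipend numbers come straight from the estimates in the text; the ~$500,000 R01 size and the ~$30 billion in annual NIH grant disbursements are my assumptions for the comparison and break-even lines.

```python
# Back-of-envelope check on the direct costs of NIH external review.
# Report, panel, and stipend figures are from the text; the R01 size
# and the NIH grant budget are assumptions, not from the text.

REPORTS_PER_YEAR = 250_000   # external peer review reports per year
REVIEWERS = 30_000           # unique external reviewers per year
HOURS_PER_REPORT = 2         # conservative writing time per report
PANEL_HOURS = 16             # two 8-hour panel days per reviewer
WAGE = 53                    # average medical-scientist wage, $/hour
STIPEND = 700                # panel stipend per participant, $
R01_SIZE = 500_000           # rough size of one R01 grant, $ (assumption)
NIH_GRANTS = 30e9            # rough annual NIH grant spending, $ (assumption)

report_cost = REPORTS_PER_YEAR * HOURS_PER_REPORT * WAGE  # $26.5M
panel_cost = REVIEWERS * PANEL_HOURS * WAGE               # ~$25.4M
stipend_cost = REVIEWERS * STIPEND                        # $21.0M
total = report_cost + panel_cost + stipend_cost           # ~$72.9M

print(f"Report writing:  ${report_cost / 1e6:.1f}M")
print(f"Panel time:      ${panel_cost / 1e6:.1f}M")
print(f"Stipends:        ${stipend_cost / 1e6:.1f}M")
print(f"Total:           ${total / 1e6:.1f}M (~{total / R01_SIZE:.0f} R01 grants)")
print(f"Break-even gain: {total / NIH_GRANTS:.2%} per grant dollar")
```

The break-even line is the point of the exercise: against roughly $30 billion in annual grants, the whole external review apparatus pays for itself if it makes each grant dollar about a quarter of a percent more effective.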
Other costs of external review are larger. In the same way that a small increase in grant effectiveness can pay for the financial cost of peer review, a small decrease in quality could easily double or 10x the cost of external review. The consensus-driven decision-making of external review panels might create exactly this kind of mistargeting. Pierre Azoulay and Wesley Greenblatt find that risky R01 grants are renewed at markedly lower rates than less risky ones, and that this risk penalty is magnified for more novel areas of research and for young investigators. Some of this risk penalty would surely remain even under internal-only review, but previous experience with internal review programs like SGER and ARPA suggests that individual program officers are more willing to take risks than external review panels.
Perhaps the most significant cost of external review is time. It typically takes between 8 and 20 months after the submission deadline for investigators to be notified of their award. Much of this time is taken up by waiting for external academic reviews to be submitted and organizing a time when external reviewers can meet to discuss proposals. Internal-only review could easily have turnarounds that are ten times faster.
Other significant costs of scientific grant systems, like the arms race of massive upfront investments in proposal preparation that drains scientists’ time, wouldn’t be as affected by internal review. But grant review times of one month instead of twelve might lower the stakes enough to ameliorate some of this rent dissipation.
How to Implement Internal Review
For the National Science Foundation, expanding internal review is as easy as getting Sethuraman Panchanathan to require and push for it, or even getting one of the NSF division heads to push for it in their own programs. The NSF already has two internal-only review programs that have been running in some form for decades, and it could easily require its program officers to expand their use. The NSF has no Congressional mandate for external peer review.
For the NIH, it would be more complicated, probably requiring an amendment from Congress or else some complicated bureaucratic rule-making. Title 42, section 289a of the US Code states that “the review of applications made for grants, contracts, and cooperative agreements shall be conducted to the extent practical, in a manner consistent with the system for technical and scientific peer review applicable on November 20, 1985.” At that time, the NIH used external peer review, so this law essentially locked in something similar to that system.
It would be easier to expand internal review at other agencies like ARPA-H than to try to get it working within the NIH.
Most metascience proposals are about experimentation and information gathering. Those would be useful here too, and the rollout of internal review could be cleverly designed to produce controlled experimental comparisons with the old system. But there is already enough evidence to straightforwardly support transitioning much more funding to internal-only review.