Donald Trump’s new “Genesis Mission” initiative promises to use artificial intelligence to reinvent how science is done, in a bid to move the dial on the hardest challenges in areas like robotics, biotech and nuclear fusion.
It imagines a system in which AI designs experiments, executes them, learns from the results and continually proposes new lines of inquiry. The hope is that this will unlock dramatically higher productivity in federally funded research.
This vision fits a wider international trend, including in the UK: governments are investing heavily in AI for science, citing successes such as DeepMind’s AlphaFold, which predicts protein structures and is now woven into many areas of biology and drug discovery.
However, core lessons from the philosophy of science show why “automating discovery” is far harder – and riskier – than the rhetoric suggests.
The philosopher Karl Popper famously described science as a process of “bold conjectures and severe attempts at refuting [them]”. Discovery, in this view, begins when researchers encounter an anomaly – a phenomenon that existing theories cannot easily explain. They then propose new hypotheses that might resolve the puzzle. Philosophers call this “abduction”: inferring an explanation rather than merely extrapolating from previous data.
The large language models that underpin today’s AI systems mimic some patterns of abductive reasoning. But they do not possess the experience, know-how or situational understanding that human scientists draw on when reframing a problem or redefining what counts as an anomaly.
Machines excel at spotting regularities in existing data. Yet the most interesting scientific advances often occur when researchers notice what the data fails to capture – or decide that a previously ignored discrepancy is actually a clue to a new area that needs investigating.
Even once a new idea is on the table, scientists must decide which theories to pursue, refine and invest scarce resources in. These choices are guided not just by immediate empirical payoffs, but also by virtues such as coherence with other ideas, simplicity, explanatory depth or the ability to open up fertile new research programmes.
None of these can be reduced to fixed rules. Trying to reduce them to simpler but more measurable proxies may result in prioritising projects that yield short-term gains over speculative but potentially transformative lines of inquiry. There’s also a risk of ignoring hypotheses that challenge the status quo.
Justification is not just about data
Scientists assess competing theories using evidence, but philosophers have long noted that evidence alone rarely forces a single conclusion. Multiple, incompatible theories can often fit the same data, which means scientists must weigh the pros and cons of each theory, consider their underlying assumptions, and debate whether anomalies call for more data or a change of framework.
Fully automating this stage invites trouble, because algorithmic decision systems tend to hide their assumptions and compress messy tradeoffs into binary outputs: approve or deny, flag or ignore. The Dutch childcare-benefits scandal of 2021 showed how this can play out in public policy. A risk-scoring algorithm “hypothesised” and “evaluated” which families were engaging in fraud to claim benefits. It fed these “justified” conclusions into automated workflows that demanded repayment of benefits, and plunged many innocent families into financial ruin.
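To see how easily this happens, consider a deliberately simplified, hypothetical sketch in Python of such a risk-scoring system (it is illustrative only, not a reconstruction of the actual Dutch system): the proxy variables, weights and cut-off are all assumptions chosen by its designers and hidden from the people it judges, yet the only output anyone sees is a binary flag.

# Hypothetical risk-scoring sketch, not the real Dutch system: the proxy
# variables, weights and threshold below are all assumptions baked in by
# designers, yet the only output is a single flag.

def fraud_flag(household: dict) -> bool:
    weights = {
        "dual_nationality": 0.4,     # illustrative proxy variable; such proxies can encode bias
        "income_volatility": 0.35,
        "missing_paperwork": 0.25,
    }
    score = sum(weight * household.get(feature, 0.0) for feature, weight in weights.items())
    # A graded, uncertain score is compressed into a single approve/deny decision.
    return score > 0.5

# Two quite different households receive the same verdict, and nothing in the
# output explains which assumption drove it.
print(fraud_flag({"dual_nationality": 1.0, "missing_paperwork": 0.8}))   # True
print(fraud_flag({"income_volatility": 0.9, "missing_paperwork": 0.9}))  # True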
(Image caption: The same data can lead to multiple conclusions. Credit: NicoElNino)
Genesis proposes to bring similar forms of automation into scientific decision chains. For instance, this could let AI agents determine which results are credible, which experiments are redundant, and which lines of inquiry should be terminated. It all raises concerns that we may not know why an agent reached a certain conclusion, whether there is an underlying bias in its programming and whether anyone is actually scrutinising the process.
Science as organised persuasion
(Image caption: Galileo understood persuasion. Credit: Wikimedia, CC BY-SA)
Another lesson from the philosophy and history of science is that producing data is only half the story; scientists must also persuade one another that a claim is worth accepting. The Austrian philosopher Paul Feyerabend showed how even canonical figures such as Galileo strategically chose languages, audiences and rhetorical styles to advance new ideas.
This is not to imply that science is propaganda; the point is that knowledge becomes accepted through argument, critique and judgement by a scientist’s peers.
If AI systems begin to generate hypotheses, run experiments and even write papers with minimal human involvement, questions arise about who is actually taking responsibility for persuading the scientific community in a given field. Will journals, reviewers and funding bodies scrutinise arguments crafted by foundation models with the same scepticism they apply to human authors? Or will the aura of machine objectivity make it harder to challenge flawed methods and assumptions embedded deep in the pipeline?
Consider AlphaFold, often cited as proof that AI can “solve” major scientific problems. The system has indeed transformed structural biology (the study of the shapes of living molecules) by providing high-quality predictions for vast numbers of proteins. This has dramatically lowered the barrier to exploring how a protein’s structure affects how it works.
Yet careful evaluations emphasise that these outputs should be treated as “valuable hypotheses”: highly informative starting points that still require experimental validation.
Genesis-style proposals risk overgeneralising from such successes, forgetting that the most scientifically useful AI systems work precisely because they are embedded in human-directed research ecologies, not because they run laboratories on their own.
Protecting what makes science special
Scientific institutions emerged partly to wrest authority away from opaque traditions, priestly castes and charismatic healers, replacing appeals to enchantment with public standards of evidence, method and critique.
Yet there has always been a kind of romance to scientific practice: the stories of eureka moments, disputes over rival theories and the collective effort to make sense of a resistant world. That romance is not mere decoration; it reflects the human capacities – curiosity, courage, stubbornness, imagination – that drive inquiry forward.
Automating science in the way Genesis envisions risks narrowing that practice to what can be captured in datasets, loss functions and workflow graphs. A more responsible path would see AI as a set of powerful instruments that remain firmly embedded within human communities of inquiry: supporting, but never substituting for, the messy, argumentative and often unpredictable processes through which scientific knowledge is created, contested and ultimately trusted.
Akhil Bhardwaj does not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.