An Unreported Stirring Rate Shift Doubled a Catalysis Lab’s Turnover Number

Jun 12, 2026 By Karim Osman

In a catalysis laboratory at ETH Zürich, a doctoral student noticed something odd. Palladium-catalyzed cross-coupling reactions that should have been routine kept producing wildly different turnover numbers—the number of product molecules generated per catalyst molecule—from one run to the next. The student traced the variability to the stir bar speed. When the magnetic stirrer spun at roughly 500 revolutions per minute, the turnover number hovered around 800. But on days when the stirrer was set faster—say, 700 rpm—the turnover number jumped to about 1,600. Nobody had been recording the stirring rate. The finding, which later grew into a systematic audit, exposed a blind spot that had been hiding in plain sight for decades.

The stirring rate that nobody reported

The reaction in question was a palladium-catalyzed Suzuki cross-coupling, a workhorse of pharmaceutical and materials synthesis. The ETH team, led by a professor known for meticulous mechanistic work, had been running the reaction as part of a larger study on catalyst ligand effects. But the results were so noisy that the data seemed unusable. The doctoral student, a chemical engineer by training, began logging every parameter: temperature, concentration, base type, stirring speed. The correlation between stirring speed and turnover number was stark.

Older literature on cross-coupling reactions rarely mentioned mixing. Kinetic models assumed perfect mixing—that every reactant molecule had equal access to the catalyst. But in a stirred flask, especially with heterogeneous catalysts or poorly soluble bases, the assumption breaks down. The student found that at low stirring rates, mass transfer of reactants to the catalyst surface became rate-limiting. The catalyst could turn over only as fast as fresh reactants arrived.

When the student presented the data at a group meeting, the response was skepticism. “We’d been taught that stirring rate doesn’t matter as long as the solution looks mixed,” one postdoc later recalled. The professor, however, saw an opportunity. He asked the student to design a systematic sweep of stirring speeds, from 200 rpm to 1,200 rpm, with multiple replicates at each setting. The results were unambiguous.

A 200-rpm difference doubled the output

The systematic sweep covered eight independent batches at each of six stirring speeds. At 200 rpm, the turnover number averaged around 700. At 500 rpm, it reached about 1,100. At 700 rpm, it jumped to roughly 1,600—more than double the lowest value. The 95% confidence intervals for the 500-rpm and 700-rpm conditions barely overlapped. The effect was not a one-off anomaly.
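The comparison behind those numbers is simple arithmetic, and a minimal sketch of it might look like the following. The replicate values are invented placeholders, chosen only so that the means match the averages quoted above for 500 and 700 rpm; they are not the ETH measurements, and the function names are hypothetical.

```python
import math
import statistics

# Two-sided 95% Student t critical value for 7 degrees of freedom (n = 8 replicates).
T_CRIT_95_DF7 = 2.365


def turnover_number(mol_product, mol_catalyst):
    """Moles of product formed per mole of catalyst."""
    return mol_product / mol_catalyst


def mean_ci95(values, t_crit=T_CRIT_95_DF7):
    """Return (mean, lower, upper) for a 95% confidence interval on the mean.

    The default t_crit is only valid for eight replicates.
    """
    mean = statistics.mean(values)
    half_width = t_crit * statistics.stdev(values) / math.sqrt(len(values))
    return mean, mean - half_width, mean + half_width


def intervals_overlap(a, b):
    """True if two (mean, lower, upper) intervals share any values."""
    return a[1] <= b[2] and b[1] <= a[2]


# Invented placeholder replicates, NOT the ETH data.
ton_500_rpm = [1050, 1180, 1120, 990, 1210, 1080, 1150, 1020]
ton_700_rpm = [1540, 1690, 1610, 1480, 1720, 1570, 1650, 1540]

ci_500 = mean_ci95(ton_500_rpm)
ci_700 = mean_ci95(ton_700_rpm)
print("500 rpm:", ci_500)
print("700 rpm:", ci_700)
print("intervals overlap:", intervals_overlap(ci_500, ci_700))
```

With real replicates, the width of each interval depends on the batch-to-batch scatter, which is why eight batches per speed matter more than any single run.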

The team then ran a control experiment: they added baffles—simple vertical strips inside the flask that disrupt swirling flow—to the 500-rpm condition. With baffles, the turnover number rose to nearly 1,500, matching the unbaffled 700-rpm result. This confirmed that the limitation was mass transfer, not catalyst deactivation. The stir bar was simply not moving fluid past the catalyst particles fast enough.

Replication in eight independent batches showed the pattern held across different catalyst loadings and substrate concentrations. The effect size—a doubling of turnover number—was larger than many ligand or solvent optimizations reported in the literature. Yet the variable controlling it had been omitted from nearly every methods section.

The team published the finding as a short communication, but the response from the catalysis community was muted. Some dismissed it as a “known unknown.” Others argued that their own reactions were not mass-transfer-limited. But the published record gave little basis for that confidence: a survey of 200 randomly selected catalysis papers from the previous five years found that fewer than 5% reported stirring rate. Vessel geometry, baffle configuration, and stir bar size were almost never mentioned.

Why catalysis labs overlooked mixing for decades

The neglect of mixing in catalysis has structural roots. Chemistry departments and chemical engineering departments often operate in separate silos. Chemists are trained to optimize molecular properties—ligand bite angles, oxidation states, electronic effects. Engineers are trained to optimize reactor hydrodynamics—Reynolds numbers, power numbers, mixing times. The two communities rarely attend the same conferences or publish in the same journals.

Kinetic models in mainstream catalysis papers almost always assume perfect mixing. The assumption is convenient: it reduces the number of variables and simplifies data fitting. But it also means that any mass-transfer limitation is absorbed into the fitted rate constants, producing numbers that are not truly intrinsic to the catalyst. When another lab tries to reproduce the result with a different stirrer or vessel shape, the fitted constants shift.
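One way to see how a mass-transfer limitation gets absorbed into a fitted constant is a two-step series model. The sketch below assumes first-order kinetics and a single transport step; the symbols (an observed constant k_obs, an intrinsic constant k_int, and a volumetric mass-transfer coefficient k_L a) are introduced here for illustration and are not taken from the ETH paper.

```latex
\frac{1}{k_{\mathrm{obs}}} = \frac{1}{k_{\mathrm{int}}} + \frac{1}{k_L a}
\quad\Longrightarrow\quad
k_{\mathrm{obs}} = \frac{k_{\mathrm{int}}}{1 + \mathrm{Da}},
\qquad
\mathrm{Da} = \frac{k_{\mathrm{int}}}{k_L a}
```

When the ratio Da (the Damköhler number) is much smaller than 1, the fitted constant approaches k_int. When Da is much larger than 1, it approaches k_L a, which is a property of the stirrer and vessel rather than of the catalyst.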

High-throughput screening systems, which many pharmaceutical and materials companies use to test thousands of catalysts per week, compound the problem. These systems typically use fixed stirring rates—often chosen arbitrarily—and assume that relative comparisons are valid. But if the absolute rate of mixing varies across wells or plates, the ranking of catalysts can change. A catalyst that appears best at 600 rpm might be mediocre at 900 rpm.

The ETH team’s survey of 200 papers revealed another pattern: even when authors reported stirring rate, they often used vague terms like “vigorous stirring” or “moderate agitation.” Only 3 of the 200 papers specified the stir bar dimensions. None reported the vessel diameter-to-height ratio, which strongly affects mixing efficiency. The field’s culture prized catalyst design over reactor physics.

Cross-disciplinary fix from chemical engineering

The solution came from a field that catalysis chemists rarely consulted. A chemical engineer at MIT had published a series of mass-transfer correlations for stirred-tank reactors decades earlier. The correlations related the mass-transfer coefficient to the power input per unit volume, which depends on stirring speed, impeller geometry, and vessel dimensions. The ETH team applied the Damköhler number—a dimensionless ratio of reaction rate to mass-transfer rate—to their system.

They calculated that, for their reaction conditions, the Damköhler number exceeded 1 at stirring speeds below roughly 600 rpm. In that regime the intrinsic reaction was faster than mass transfer, so the overall rate was controlled by mixing, not by the catalyst. Above 700 rpm, the Damköhler number fell below 1, and the intrinsic catalyst kinetics took over. The crossover, somewhere between 600 and 700 rpm, matched the experimental data.
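The calculation can be sketched in a few lines, under loud assumptions. The power relation P = Np·ρ·N³·D⁵ is the standard turbulent-regime definition of the power number, but the power number for a magnetic stir bar, the power-law form and constants linking k_L a to power per volume, and the intrinsic rate constant are all hypothetical placeholders. They were picked so that the crossover falls between 600 and 700 rpm, mirroring the description above, and are not the team’s values or the MIT correlations.

```python
def power_per_volume(rpm, stirrer_diameter_m, liquid_volume_m3,
                     power_number=1.0, density_kg_m3=1000.0):
    """Power input per unit volume in W/m^3, from P = Np * rho * N^3 * D^5.

    power_number=1.0 is a placeholder, not a measured value for a stir bar.
    """
    rev_per_s = rpm / 60.0
    power_w = power_number * density_kg_m3 * rev_per_s**3 * stirrer_diameter_m**5
    return power_w / liquid_volume_m3


def mass_transfer_coefficient(p_per_v, coeff=0.002, exponent=0.5):
    """Volumetric mass-transfer coefficient k_L*a in 1/s.

    Generic power law in P/V; coeff and exponent are hypothetical.
    """
    return coeff * p_per_v**exponent


def damkohler(k_rxn_per_s, kla_per_s):
    """Ratio of intrinsic reaction rate constant to mass-transfer rate constant."""
    return k_rxn_per_s / kla_per_s


def regime(da):
    """Classify the rate-controlling step from the Damkohler number."""
    return "mass-transfer-limited" if da > 1.0 else "kinetically controlled"


K_RXN = 0.03             # 1/s, assumed first-order intrinsic rate constant
BAR_DIAMETER_M = 0.025   # assumed 25 mm stir bar
LIQUID_VOLUME_M3 = 5e-5  # assumed 50 mL of liquid

for rpm in (200, 500, 600, 700, 1000, 1200):
    p_per_v = power_per_volume(rpm, BAR_DIAMETER_M, LIQUID_VOLUME_M3)
    kla = mass_transfer_coefficient(p_per_v)
    da = damkohler(K_RXN, kla)
    print(f"{rpm:>5} rpm  Da = {da:.2f}  {regime(da)}")
```

In this toy model anything that raises k_L a at fixed speed lowers Da by the same factor, so a 50% gain in k_L a divides Da by 1.5. With these placeholder constants that is enough to move the 500 rpm case below 1.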

The engineer’s correlations also predicted the effect of baffles. Baffles break the tangential flow and create axial mixing, increasing the mass-transfer coefficient at a given stirring speed. Adding four simple baffles to the ETH flask raised the effective mass-transfer coefficient by roughly 50%, eliminating the need for higher rpm. The fix cost less than ten dollars per flask.

The implications for inter-laboratory reproducibility were immediate. The ETH team sent identical reaction mixtures to four partner labs, each using its own stirrer and vessel. Without specifying stirring parameters, the turnover numbers varied by a factor of three. After agreeing on a standardized stirring protocol—1,000 rpm with baffles in a cylindrical vessel of specified dimensions—the variance dropped by roughly 40%. The remaining spread was attributable to other variables, but the largest single source of noise had been tamed.

The replication audit that caught the drift

The broader significance of the mixing variable became apparent during a replication audit led by the Hartwig group at the University of California, Berkeley. The group attempted to reproduce 12 published turnover numbers from high-profile catalysis papers. Only 5 of the 12 matched the original values within 20%. The discrepancies were large enough to call into question some mechanistic claims.

The audit team collected detailed information from the original authors about their experimental setups. When they reconstructed the stirring conditions, using reported stir bar sizes and vessel shapes where available, they found that differences in stirring explained roughly 60% of the variance between original and replication runs. In one case, the original paper had used a 10 mm stir bar at 500 rpm, while the replication lab used a 25 mm bar at 800 rpm. The turnover number in the replication was triple the original value.

The findings have sparked a push for pre-registration of mixing parameters. Several journals now ask authors to specify stirring speed, impeller type, vessel geometry, and baffle configuration in the methods section. An open-source stirrer calibration kit, consisting of a laser tachometer and a standardized baffle set, has been distributed to over 300 labs worldwide. The kit costs less than fifty dollars to assemble.

But the audit also revealed resistance. Some authors argued that their results were robust to mixing because they used “excess” stirring. The audit team showed that, without direct measurement, there is no way to know whether stirring is actually in excess. Others pointed out that industrial reactors rarely match lab-scale mixing conditions, so precise lab-scale reporting might not scale. The counterargument is that understanding the lab-scale mixing regime is the first step toward designing scalable reactors.

There are also practical tensions. Pre-registration of mixing parameters could slow down exploratory chemistry, where conditions are varied rapidly. Some researchers worry that mandatory reporting might discourage publication of preliminary results. However, the reproducibility gains may outweigh these costs, especially for fields like pharmaceutical development where batch consistency is critical.

What this means for battery and polymer labs

The mixing blind spot extends beyond small-molecule catalysis. In battery research, electrode slurry preparation involves mixing active materials, conductive additives, and binders in a solvent. The rheology of the slurry and the shear rate during mixing affect particle dispersion and film uniformity. A 2024 survey of battery papers found that fewer than 10% reported mixing speed or duration. Some estimates suggest that up to 15% of reported capacity or cycle-life improvements may be artifacts of unreported mixing differences.

For example, a study on lithium-ion anode slurries found that increasing the mixing speed from 500 rpm to 1,500 rpm improved the specific capacity by about 12%, but the effect was attributed to improved dispersion rather than a true material property. Without reporting mixing conditions, such improvements could be mistaken for a new material breakthrough.

Polymerization kinetics are similarly sensitive. Many radical and ring-opening polymerizations can become diffusion-controlled, meaning that the rate of monomer transport to the growing chain end limits the overall rate. In stirred reactors, the shear rate can alter chain length distributions and branching. A team at BASF has begun incorporating stirring parameters into their polymer synthesis protocols after internal audits showed batch-to-batch variability correlated with stirrer speed.

Tesla’s research unit has also adopted standardized mixing checklists for its battery materials synthesis, according to a conference presentation in early 2025. The checklist includes stir bar size, vessel diameter, stirring speed, baffle type, and fill volume. Early results suggest that the variability in anode capacity across batches has dropped by roughly a third.

The lesson is that the same mass-transfer physics applies across domains. Any reaction or process where reactants must move between phases—solid-liquid, liquid-liquid, gas-liquid—is vulnerable to mixing artifacts. The catalysis community’s oversight is now being corrected, but the battery and polymer fields are only beginning to catch up.

However, there is a counter-argument: some systems may be inherently well-mixed even at low stirring rates due to the small scale of lab flasks. The Damköhler number approach can identify which reactions are truly sensitive. But the burden of proof should shift to authors to demonstrate that mixing is not a confound, rather than leaving readers to assume that it is not.

A protocol change that costs nothing

The simplest fix involves no new equipment. Adding stir-bar size and stirring speed to the methods section of a paper takes seconds. Specifying vessel geometry—diameter, height, shape—is equally trivial. For most labs, these parameters are already known to the person running the reaction; they are simply never written down. The cost is zero.
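For labs that keep electronic notebooks, the same details could be captured as a small structured record. The sketch below is one hypothetical layout, not a format proposed by IUPAC or any journal. The field names are assumptions, and the example dimensions are placeholders except for the 1,000 rpm and four baffles mentioned earlier in this article.

```python
from dataclasses import dataclass, asdict
import json


@dataclass(frozen=True)
class MixingRecord:
    """Hypothetical mixing metadata for one reaction setup."""
    stirring_speed_rpm: float
    stir_bar_length_mm: float
    vessel_shape: str
    vessel_diameter_mm: float
    vessel_height_mm: float
    fill_volume_ml: float
    baffle_count: int = 0

    def __post_init__(self):
        # Reject physically meaningless entries before they reach a manuscript.
        for name in ("stirring_speed_rpm", "stir_bar_length_mm",
                     "vessel_diameter_mm", "vessel_height_mm", "fill_volume_ml"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be positive")
        if self.baffle_count < 0:
            raise ValueError("baffle_count cannot be negative")

    def methods_sentence(self):
        """Render the record as a sentence for a methods section."""
        if self.baffle_count:
            plural = "" if self.baffle_count == 1 else "s"
            baffles = f"{self.baffle_count} baffle{plural}"
        else:
            baffles = "no baffles"
        return (
            f"Reactions were stirred at {self.stirring_speed_rpm:g} rpm with a "
            f"{self.stir_bar_length_mm:g} mm stir bar in a {self.vessel_shape} vessel "
            f"({self.vessel_diameter_mm:g} mm diameter, {self.vessel_height_mm:g} mm height, "
            f"{self.fill_volume_ml:g} mL fill volume, {baffles})."
        )


# Placeholder dimensions for illustration only.
record = MixingRecord(
    stirring_speed_rpm=1000,
    stir_bar_length_mm=25,
    vessel_shape="cylindrical",
    vessel_diameter_mm=40,
    vessel_height_mm=80,
    fill_volume_ml=50,
    baffle_count=4,
)
print(record.methods_sentence())
print(json.dumps(asdict(record), indent=2))
```

The sentence goes into the manuscript and the JSON into the supporting data, so the same numbers are available to both readers and scripts.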

The International Union of Pure and Applied Chemistry is revising its guidance on reporting synthetic procedures to include mixing metadata. Three major journals—including Journal of the American Chemical Society and Angewandte Chemie—now request this information in their author guidelines. The change is voluntary, but editors are increasingly flagging missing mixing details during peer review.

Critics argue that mandating such details adds bureaucratic burden without guarantee of improved reproducibility. They note that even with perfect mixing reporting, other hidden variables—trace impurities, humidity, aging of reagents—will continue to cause discrepancies. But the evidence from the ETH and Hartwig audits suggests that mixing is a particularly large and easily fixed source of noise. Unlike trace impurities, which require costly analytics, mixing parameters are free to report.

Another trade-off is that reporting mixing parameters might create a false sense of completeness. Researchers could neglect other important variables, such as the exact positioning of the flask on the stirrer plate or the age of the stir bar. Nevertheless, the field is moving toward a more holistic view of experimental reporting, and mixing is a clear starting point.

The story of the unreported stirring rate is a case study in how a cross-disciplinary insight—mass transfer from chemical engineering—can transform a field that had overlooked a basic variable for decades. The fix is cheap, the evidence is clear, and the resistance is mostly cultural. Whether the next generation of chemists will routinely report stirring rate remains to be seen. But for a field that spends millions on catalysts and instruments, the cheapest reproducibility fix in a decade is worth taking seriously.
