I work 10 hours per week at my University’s medical school consulting on statistics for residents and physicians. I’ve noticed far fewer requests for work, and I think AI is the reason. Let me explain.
Economics, Statistics, and Tukey
Tukey once said:
“The best thing about being a statistician is that you get to play in everyone’s backyard.”
I take the phrase to mean the following: That statistical methods can be used in fields from astronomy to microbiology means that if one masters statistics, then one can ostensibly solve problems (play) in these fields (backyards). Tukey said this phrase at a time when there was a gap between knowing one needs statistical expertise and having the time, means, or interest to develop that expertise.
Insofar as this gap exists today, it has existed for at least as long as statistics has been applied in the sciences and business. Astronomers and microbiologists do not have the time or means to become experts in both their own fields and statistics, even if they have the interest. They do however know they need statistical expertise (perhaps because peer reviewers tell them). Statisticians happily filled this gap. We called filling this gap “collaboration” (or perhaps “consultation” if there is an exchange of money), and we called those who specialize in the application of statistics (the playing), rather than the theory of statistics, “applied statisticians”. This is largely a good thing. You can make a decent living solving other people’s problems, and are therefore valuable precisely because you can solve other people’s problems. Astronomers and microbiologists can focus on their own expertise, saving time and effort. Everyone prospers due to specialization, much like in the economy.
As such, I think the metaphor Tukey made is cute though perhaps not quite suitable for this post. Rather than thinking of statisticians as playing, it is perhaps more useful for what follows to consider statisticians as a middleman between a product (statistical expertise) and buyers (people working on problems) 1.
AI As A Direct-To-Consumer Solution
Statisticians have been a middleman between statistical expertise and those in need of said expertise. To abuse the metaphor further, AI has become a direct-to-consumer solution for statistical expertise, thereby cutting out the middleman.
The main blocker for my clients is, as I’ve mentioned, time and ability to learn things like mathematics, coding, and the other things which are needed in applied work. Additionally, I reckon the interactions with an applied statistician are filled with friction for these clients who just want a damn p-value for their paper (think of all the times you’ve said “it depends” in order to practice intellectual humility, and think how that might come off to someone who sees statistics as a means to an end). Now along comes a solution which, for very cheap if not for free, offers you the product (statistical expertise, or a reasonable facsimile thereof) in a fraction of the time and for a fraction of the means. There is no longer a need to seek out the middleman; you can get the product directly.
For what it is worth, I reckon that a medical student sufficiently well versed in AI could use it to make a very good analysis. It might be as simple as pointing the AI at Frank Harrell’s RMS or BBR course notes, creating the right agent or skill, and then downloading R. The hard part in this example is understanding git, and AI can teach you that pretty easily. I myself have gotten very good at using AI to write software in Go at my job despite not being trained as a software engineer or knowing Go. The secret is the ability to give it the right information and the right amount of detail in the prompt. If I can do it in software engineering, a medical student can do it for regression.
The Cost of Cheap Goods
I can imagine two objections to AI as a Direct-To-Consumer solution to statistical expertise:
- first, that without statistical expertise, you cannot detect your own errors, and
- second, that by removing the friction from a human, we lose the primary safeguard for scientific integrity.
I don’t find either of these compelling. In the first case, these models are really good at applied work and will only get better. Mechanical minds neither tire nor forget, and while they do err, they do so far less often than I do. I’ve used AI to do a few analyses using the strategy outlined above, and I’ve needed to correct it far less than I would have needed to debug my own code. I’m relying more and more on AI and learning a lot from using it. The risk of an error will be smaller than that from a human in short order, if it isn’t already.
Second, scientific integrity is threatened more by academic incentive structures than by AI. AI did not cause people to work around the friction. At least since 2018, residents at the local teaching hospital have presented their research at “Resident Research Day”, and I can assure you I didn’t consult on all of those projects. AI is enabling circumvention of friction, but one should ask why circumvent the consult at all.
Further abusing the economic metaphor, the market has decided that the cost of being wrong in an analysis is lower than the cost of rigor, hence circumvention is the rational choice. The stakes for a resident’s publication are just not the same as those for a large-scale RCT for a new and potentially dangerous drug. At the risk of sounding glib, a lot of the papers published by residents are just not important papers. I think this is true of a lot of academic research, but that is a story for another time. If the paper is not very important, there is a low cost to being wrong, and the benefits of publishing are still obtained. I think people understand that, which makes the friction applied statisticians introduce hurt all the more, making AI look more attractive. I’m not saying this is acceptable, or good, but I do think this is the situation we find ourselves in.
An Economic Disruption; What Do You Do When There Are No More Backyards To Play In?
If the middleman can be cut out without much in the way of consequences, either because the analyses are good enough or because the incentive structures are such that publishing at all is as good as publishing something of high quality, what then?
On one hand, this is (again, to torture the economic metaphor) a disruption of the industry. There is less of a premium on being the kind of person who specializes in solving problems because problem solving has, partially, been democratized. For applied statisticians, this should be kind of alarming, and should be seen as taking money directly out of your pockets. That is, only if you see yourself as the middleman to statistical expertise and not something more than that.
Personally, I see this disruption as a sort of liberation. To make things personal for a moment, I have put a premium on external validation in my life; I needed other people to tell me I was “good”. I still do, I’m just a little better at not needing it. I think this has explained many of my career choices, choosing to work in a capacity in which I served people and was seen as valuable precisely because I could solve their problems. I think a shift away from emphasis on the methods will be good for me, and can force me to determine what it is that I find worthwhile to work on as opposed to working on other people’s work. If the machine can worry about the “how” then I can worry about the “why”.
Footnotes
Yes, exactly what we need, a boring and trite analogy. Not unlike economics itself ;) I joke, don’t kill me.