Finding research influences
There isn’t any expert consensus on many key questions related to AI safety. For example, estimates of when we’ll have transformative AI range from a few years to a century. There are also many wild opinions in the AI safety space. While some of these wild opinions seem justifiable, many people seem to exaggerate the risks from AI in an attempt to move policy-makers.
I think there are a few researchers who seem to have an unusual degree of conceptual clarity, though. A few names that come to mind are Buck Shlegeris, Holden Karnofsky and Jan Kulveit. While I don’t endorse all their views, they seem to raise good questions. For lack of a better word, you could call them my research influences.
I’ve never gone looking for new research influences. Every now and then, I just realise that I’ve been influenced by someone, perhaps after citing their work for the third time in a conversation. But say you want to look for research influences more deliberately. How might you proceed?
Knowing where to look
A reasonable first step is exploring new content.
A reasonable first substep, then, is to narrow down the search space. Identify the kinds of questions you care about. For example, I’m mostly interested in reading about AI control and LLM psychology right now, so I’ll ignore papers and blog posts on, say, singular learning theory.
Next, ask people you find sensible for reading recommendations. Better yet, ask if they have any research influences. This is one of those things that is infinitely easier to do in person. Sending cold emails to researchers usually works, but it’s relatively time-consuming. It’s much easier to bring up the topic over a coffee with people in your local community.
Asking “Which blog posts have had the largest influence on your research?” also proved a good way of rounding off conversations at EAG. This way, I got to know the other person better and explored new content at the same time.
Having a nose for bullshit
Once you’ve decided what to read, you want to scrutinise the argument of the text. This is a highly non-trivial task. I’m not going to try solving all of philosophy here, so I’ll just focus on heuristics for detecting bullshit in the context of AI safety.
First, beware any kind of extreme. Is the proposed idea radical? Radical ideas shouldn’t be dismissed offhand. However, the burden of proof is greater. Similarly, quickly screen the author’s background: is the author known to have radical opinions, or affiliated with an organisation pursuing an unusual agenda?
I find it especially troubling when authors promote radical opinions and are unwilling to engage in debate with the general public. This leads to echo chambers. Moreover, refusing to explain your ideas to laymen just seems uncool. Just as lecturers should take questions from students seriously, authors should take questions from the uninitiated seriously. It’s an act of charity.
Another helpful strategy is to listen to interviews with the author. It’s harder to lie in speech than in writing.[1] Of course, not everyone is as persuasive in speech as on the page. But if there’s a big discrepancy between your confidence in the author’s argument as presented in the text and as presented in an interview, that’s a warning sign. Moreover, a good interviewer will help expose the flaws in the interviewee’s reasoning. In an essay, the author has full control.
Observing the influence
Suppose you come across an author whose work makes sense but leaves you feeling “Sure, so what?”. I wouldn’t speak of a research influence here. A research influence changes the way you think. It’s not enough to just state true facts; their work needs to have some oomph.
When can we speak of a research influence, then?
One reliable proxy is the top idea in your mind. Do you have shower thoughts about their work? Also, do you find yourself coming back to their work after several months? In particular, when you revisit their points, do they still make as much sense? Big ideas need to be slept on, and you can only sleep so many times in a given week. Lastly, notice if you reference their ideas when chatting with others and, if so, in what way.
So finding research influences takes time, even if you take some of the shortcuts listed above. The process of finding research influences very much resembles the process of doing research. In fact, perhaps the two are indistinguishable.
Thanks to Miles Kodama for valuable discussions on this topic.
[1] A fact well known among those who have taken oral exams.