How To Interview a Data Scientist
Daniel Tunkelang
Presented at the O'Reilly Strata 2013 Conference
Video: https://www.youtube.com/watch?v=gUTuESHKbXI
Interviewing data scientists is hard. The tech press sporadically publishes “best” interview questions that are cringe-worthy.
At LinkedIn, we put a heavy emphasis on the ability to think through the problems we work on. For example, if someone claims expertise in machine learning, we ask them to apply it to one of our recommendation problems. And, when we test coding and algorithmic problem solving, we do it with real problems that we’ve faced in the course of our day jobs. In general, we try as hard as possible to make the interview process representative of actual work.
In this session, I’ll offer general principles and concrete examples of how to interview data scientists. I’ll also touch on the challenges of sourcing and closing top candidates.
11. Alternatives are at best a partial solution.
§ Only hiring people you’ve worked with doesn’t scale.
– And traps you in a locally optimal monoculture.
§ Interns are great! But they are a significant investment.
– Managing interns well is a productivity gamble.
– Most interns have at least a year of school left.
– Not all interns will make your bar. You won’t always make theirs.
§ Try before you buy: nice in theory.
– Adverse selection bias when other offers are permanent roles.
– Creates bureaucracy.
16. High-fructose corn syrup is 100% natural.
§ Working sessions are difficult to set up.
– No more natural than a final exam.
– High variance, and very difficult to calibrate performance.
§ Take-home assignments are great for the employer.
– But they are a significant investment for the candidate.
– Adverse selection bias if other companies don’t require them.
– Creates incentive to cheat if significant part of hiring process.
§ Previous work is like natural experiments.
– Always good to review a candidate’s previous work.
– But not always possible to find work with high predictive value.
27. Gotchas reduce the signal-to-noise ratio.
§ Avoid problems where success hinges on a single insight.
– Good interview problems offer lots of room for partial credit.
– Making a key insight often reflects experience, not intelligence.
§ Don’t test a candidate’s knowledge of a niche technique.
– Unless that niche technique is critical to job performance.
– And can’t be learned on the job as part of on-boarding.
§ Be a hard interviewer, but don’t be an asshole.
– An interview is not a stress-test to see where candidates break.
– Interviews communicate your values to the candidate.
29. Commit to binary interview outcomes.
§ Forced choice so interviewers don’t take easy way out.
– Just like having 4 choices instead of 5 on a rating scale.
– Encourages interviewers to take their role seriously.
§ Each team member is a critical filter.
– Two no’s or one strong no is a no.
– All weak yes’s is a no.
§ Short-circuit candidates early in the process.
– Resume and phone screening should be aggressive.
– Onsite interviews should have ~50% chance of leading to offers.
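The aggregation rules on this slide can be sketched as a small function. This is an illustration only, not a description of LinkedIn's actual tooling: the `Vote` enum, the `hire_decision` name, and the four-level scale are assumptions inferred from the slide's wording ("strong no", "weak yes"). The slide does not say what happens with a single plain "no" among otherwise positive votes; the sketch assumes the candidate still passes.

```python
from enum import Enum


class Vote(Enum):
    # Hypothetical four-level scale: no neutral "maybe" option.
    STRONG_NO = "strong no"
    NO = "no"
    WEAK_YES = "weak yes"
    STRONG_YES = "strong yes"


def hire_decision(votes):
    """Return True for an offer, False otherwise."""
    if not votes:
        return False  # no feedback means no offer
    no_count = sum(1 for v in votes if v in (Vote.NO, Vote.STRONG_NO))
    # Two no's or one strong no is a no.
    if Vote.STRONG_NO in votes or no_count >= 2:
        return False
    # All weak yeses is a no: someone has to be enthusiastic.
    if all(v is Vote.WEAK_YES for v in votes):
        return False
    # Assumption: a single plain "no" does not block an otherwise positive panel.
    return True
```

For example, `hire_decision([Vote.STRONG_YES, Vote.WEAK_YES, Vote.WEAK_YES])` returns `True`, while three weak yeses, or any panel containing a strong no, returns `False`.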
30. But what about Culture, Communication, Curiosity?
All are must-haves.
Every interview evaluates all three.
32. Three Principles
1. Keep it real.
– Avoid whiteboard coding. Filter with FizzBuzz.
– Use real-world algorithms questions.
– Ask candidates to design your products.
2. No gotchas.
– Gotchas reduce the signal-to-noise ratio.
3. Maybe = no.
– Bad hires suck. Be conservative.
– Trust your team.
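For readers unfamiliar with the FizzBuzz filter mentioned under the first principle: the slides do not say which variant is used, so the following is the commonly cited version, shown as a minimal sketch. The function name `fizzbuzz` is illustrative. The point of such a filter is that it screens for basic programming fluency, not cleverness.

```python
def fizzbuzz(n):
    """Classic FizzBuzz: label multiples of 3, of 5, and of both."""
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


if __name__ == "__main__":
    # Print the sequence for 1 through 15.
    for i in range(1, 16):
        print(fizzbuzz(i))
```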