About Me
I am a data scientist based in Vancouver, British Columbia.
Here is a list of things that I aspire to at work.
As data scientists, we should:
- think deeply about the ethics of our work. Not every data scientist can work on projects that will save the world, but we should carefully consider the implications of our work. Often harm is unintentional and unanticipated, so it is important to examine the impact of our products once they are live and people interact with them.
- work with integrity. The push for positive results in both industry and academia is difficult to resist. Let’s not tweak our methods until we get a p-value < 0.05.
- be curious about the data and the data-generating process. This is not as common as one might think. Some people are more interested in the algorithms, or the data pipelines. Nothing wrong with a cool neural network, but at the heart of our work are the questions we have and the data we use to answer them.
- talk to the domain experts, or to those who understand the business. Just to make sure our fancy models will actually be useful.
- carefully plan experiments, document results, ensure reproducibility. After all, it is called data science.
- write clean code and tests. Even if it is “just for research”. To make sure we are standing on a solid foundation.
- communicate our insights to non-specialists. We are lucky that a lot of people are interested in our work, and want to know how it might impact theirs.
- quantify the uncertainty of our results. A prediction is meaningless without knowledge of its uncertainty.
- use Python and avoid R at all costs. Nah, just messing with the R people!
In reality, hitting 100% of these goals is hard for me. But there is a virtuous cycle to it: when I clean up my code, I become more productive and have more time to plan my experiments. When I have meaningful conversations with domain experts, I better understand the implications of my work.