10 Technical Questions Asked in a Junior Data Analyst Interview
Demand for data analysts has grown over the past decade. I had no prior experience as a data analyst, meaning no related work experience, internships, or educational background, yet I was fortunate enough to land a job as one. The key is, first, to have an attractive CV that earns you an interview and, second, to handle the technical questions in that interview.
For the first part, you can browse my personal website for ideas on building a personal brand. This post covers the second part: the 10 technical questions I was asked in my junior data analyst interview.
10 Technical Questions
- What is your checklist when you receive a new quantitative dataset?
- Explain the difference between list and tuple in Python
- Explain the difference between set and dict in Python
- Explain the difference between variance and bias in statistics/machine learning
- Following on from question 4, how do you reduce the gap between variance and bias in machine learning?
- Briefly explain the job description of data analyst
- Briefly explain how data analysts create the actual value in a company
- Describe a data analysis project you have conducted or been involved in
- Follow-up questions on the limiting factors of that project (question 8)
- Propose at least three questions related to the role and the company
Scroll down to see how I answered them and got the position.
Details & Answer to 10 Technical Questions
1. Checklist for a new quantitative dataset
This question tests whether you have experience in cleaning quantitative data. A brief answer could look like this:
1. Check the file type and open it in a data analysis tool, e.g. Excel, R, Python…
2. Check the data type of each column. If there are any mistakes, redefine the data type, e.g. a column of currency amounts that is defined as a string.
3. View the whole dataset, or just its head and tail depending on its size, summarise it, and make graphs to detect any unreasonable outliers.
4. Find any missing values in each column and fill them in a way that suits the scenario, e.g. a missing air temperature can be filled with the average temperature.
5. Find inconsistent names or labels, such as misspellings and extra spaces, and correct them, e.g. NicolasCage, NicolasCave, Nicolas Cage.
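The checklist above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library; in practice a tool like pandas would be the more common choice. The dataset, column names and the `CANONICAL` mapping are all made up for the example.

```python
import csv
import io
import statistics

# Hypothetical raw data: names are inconsistent, one temperature is missing,
# and every value arrives as a string.
RAW = """name,temperature,price
NicolasCage,21.5,10.00
Nicolas Cage ,,12.50
NicolasCave,23.5,9.50
"""

# Step 1: open the file (here an in-memory CSV stands in for a real file).
rows = list(csv.DictReader(io.StringIO(RAW)))

# Step 2: redefine the data types (strings -> floats, empty string -> None).
for row in rows:
    row["price"] = float(row["price"])
    row["temperature"] = float(row["temperature"]) if row["temperature"] else None

# Step 3: summarise a numeric column to spot unreasonable values.
prices = [r["price"] for r in rows]
summary = {"min": min(prices), "max": max(prices), "mean": statistics.mean(prices)}

# Step 4: count missing temperatures, then fill them with the mean of the known ones.
known = [r["temperature"] for r in rows if r["temperature"] is not None]
missing_count = sum(r["temperature"] is None for r in rows)
mean_temp = statistics.mean(known)
for row in rows:
    if row["temperature"] is None:
        row["temperature"] = mean_temp

# Step 5: unify inconsistent labels with an explicit, hand-written mapping.
CANONICAL = {"nicolascage": "Nicolas Cage", "nicolascave": "Nicolas Cage"}
for row in rows:
    key = row["name"].replace(" ", "").lower()
    row["name"] = CANONICAL.get(key, row["name"].strip())
```

Filling with the mean is just one option; whether it is appropriate depends on the scenario, which is exactly the point an interviewer is likely to probe.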
2. Python: the difference between List and Tuple
I was asked this without any earlier question about which programming languages I was used to. Fortunately, I was taking an online course, the IBM Data Analyst Professional Certificate, which covered every Python question they asked.
The answer can be summed up in one key sentence:
A list is mutable and a tuple is immutable, which means a list can be modified after creation but a tuple can’t.
Of course, there are other differences, each with its own pros and cons. You can check online sources like Real Python.
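A short sketch of the difference, with made-up variable names and values:

```python
# A list is mutable: items can be added or replaced in place.
readings = [20.1, 21.4]
readings.append(22.0)
readings[0] = 19.9

# A tuple is immutable: item assignment raises TypeError.
point = (52.37, 4.90)
try:
    point[0] = 0.0
    tuple_changed = True
except TypeError:
    tuple_changed = False

# One practical consequence: a tuple of hashable items can be a dict key,
# while a list cannot.
label_by_point = {point: "station A"}
```

This is also one of the "other differences" worth mentioning in an interview: because tuples cannot change, they can be used as dictionary keys or set members.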
3. Python: the difference between set and dict
Again, this is taught in the IBM Data Analyst Professional Certificate online course.
A dict is like a set with key-value pairing: every key must be given a value. Neither a set nor a dict can hold two identical keys. A set is unordered, while a dict keeps insertion order (since Python 3.7).
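A minimal illustration, using hypothetical names and values:

```python
# A set stores unique items only; the duplicate "python" is dropped.
tags = {"python", "sql", "python"}

# A dict stores unique keys, each paired with a value.
skill_years = {"python": 3, "sql": 2}

# Assigning to an existing key overwrites its value instead of adding a second key.
skill_years["python"] = 4

# Membership tests work the same way on both (a dict checks its keys).
has_sql_tag = "sql" in tags
has_sql_skill = "sql" in skill_years
```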
4–5. Machine learning: variance and bias
First, you have to work out what they are getting at. Unless you have taken some courses or have some experience in machine learning, it will be hard to answer correctly. The graph below shows the difference explicitly.
The bias-variance trade-off is a classic topic in machine learning and is closely related to overfitting. My answer to question 4 was along these lines:
Bias: an indicator of how far the predictions are from the actual values within the training data; high bias means the model is too simple and underfits
Variance: an indicator of how much the predictions degrade on data outside the training data; high variance means the model is too sensitive to its training data and overfits
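For reference, the trade-off is often summarised by the standard textbook decomposition of expected squared error. This was not part of the interview, and the notation is an assumption introduced here: the data follow y = f(x) + ε with noise variance σ², and f̂ is the model fitted on a random training set.

```latex
\mathbb{E}\left[(y - \hat{f}(x))^2\right]
  = \underbrace{\left(\mathbb{E}[\hat{f}(x)] - f(x)\right)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\left[\left(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\right)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Total error is the sum of the three terms, so lowering bias by making the model more complex only helps as long as variance does not grow faster.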
Then, to reduce the gap: according to the graph above, the best model complexity is the one that achieves a relatively low total error, with both variance and bias reasonably low (neither is at its individual best, which is why it is called a trade-off). My approach to question 5 was:
Divide the whole dataset into training, validation, and test sets. Use the training set to fit the model and the validation set to tune it, then use the test set only to determine the final performance after training.
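The split itself can be sketched with the standard library. The function name `split_dataset` and the 60/20/20 proportions are assumptions chosen for illustration; libraries such as scikit-learn offer ready-made helpers (e.g. `train_test_split`) for the same job.

```python
import random

def split_dataset(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle rows and split them into training, validation and test sets."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains is held out for the final check
    return train, val, test

# Example with 100 dummy row ids standing in for real records.
train, val, test = split_dataset(range(100))
```

Shuffling before splitting matters when the rows are sorted, but for time-ordered data (such as a forecast project) a chronological split is usually the safer choice.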
6. Job description of data analyst
There is no right or wrong answer here, but it does require some know-how about data-driven industries. Conveniently, the first lecture of the IBM Data Analyst Professional Certificate online course introduces the role of the data analyst, which helped me answer this question without hesitation.
By detecting, analysing and explaining the hidden facts behind any kind of data, a data analyst enables the company to take corresponding actions and so create actual value.
7. Actual value created by data analyst
This question was the follow-up to question 6. It calls for “create” so I went for “creativity”.
Data analysts usually cooperate with other departments in the company. We make daily/weekly/monthly reports for the executive team. These reports provide crucial information for decision-making, avoiding mistakes and fixing problems.
8–9. A data analysis project you have done
I felt this question was the deciding factor in whether I got the job. You might wonder how you could have done data analysis projects without being a data analyst. In fact, it works the other way around: you must have carried out data analysis projects before landing a data analyst job. Projects not only prove that you are competent but also show your passion for data analysis. Of course, this applies to someone like me who had no related internship.
The project I presented was a machine learning application for real-time micro-climate forecasting. You can find some of the details on my GitHub.
Naturally, to verify that I had really done the project, detailed follow-up questions came next. I was asked about the limitations, the difficulties, the problems I ran into and my solutions to them.
The most important thing when answering is to respond to the recruiter's doubts with confidence. The point is not to prove that you can do things perfectly. It is to show that you understand things usually don't go smoothly in real life, and that you know how to cope with various situations properly and logically.
10. Three questions from you
Always prepare some questions before a job interview! If you have read this far, you probably have some interview experience or have already studied up on interviews. Still, for a data analyst role, here are the questions that I asked.
1. What kinds of programming tools will I use in this role, so that I can brush up on or improve my skills in them? (This shows your willingness to learn.)
2. What will be the daily work like? How will I interact with different departments in the company?
3. What training can I expect as a junior data analyst?
In conclusion, I would say nothing in this interview is difficult to answer, as long as you have genuinely learned how to be a data analyst, whether out of interest or because you want a high salary. Doing projects is absolutely essential. Courses and exercises make you feel like you're learning something, but that knowledge still needs to be internalised through projects, through the difficulties and pains you experience. Therefore, just do it!
In addition, data analysis courses are not strictly necessary, but they help you see the bigger picture and may cover knowledge you haven’t used in your past projects. It is beneficial to take such a course shortly before the interview. Mine covered all of the basic entry-level technical questions and made me feel confident and less nervous. However, take your memory into account: “Do not take them too early, or you will have only a fuzzy memory in your interview!”
Last but not least, some advice for those who have no related work experience in data analysis but have completed training in it: “Start building your profile!” It is the key to being invited to job interviews, which will be your runway.