Data Engineering Job Interviews Recap

I wake up and roll out of bed on a cold, cloudy Wednesday morning. The Sonic Ring sound from my phone reminds me that I scheduled my apartment to be cleaned early in the morning. I wash my face, and brush my teeth…actually, I just brushed my teeth, let me not lie to you, lol. I go to my little corner desk in my living room and open my work laptop. I connect it to the HDMI cord as it begins to load up. I hear a knock on the door. It’s the cleaning service.
I go to sit back down at my desk. It’s about 8:50am and I see a meeting invite from my manager, which was sent at 8:30. My manager is pretty cool and usually just Slacks me if anything comes up, so a random meeting from him at 8:30 is suspicious. I accept. And of course, if you’ve been laid off in the past 12 or so months, you know what this looks like. I took that meeting while the cleaning service was working, so it was a lil embarrassing.
So I’ve set the stage for the beginning of my sr. data engineer job search journey. After the layoff, I got mini-anxious wondering how long this thing was going to last. It took about 3 months, and I went through my share of rejections along the way: rejections of my application at the first stage, failures on a couple of tech rounds, and then rejections after a few final rounds.
I’m including all this so that if you’ve felt all these feelings before, you know I feel your pain. Firsthand knowledge. Just realize that you not getting the job isn’t necessarily a reflection on you or them.
Also, I just like writing stories, so this first set of paragraphs is that.
So the purpose of this post from here on is to give you some pointers and a sense of what you can expect on your job search for a data engineering position. Keep in mind, this was looking for sr. data engineering roles in the NYC-NJ-Philly areas. This might wind up being dated in the future, and who knows, AI might render this all obsolete so that this post means nothing. But if you are searching for a data engineering job, or if I’m searching for one again after some time, here are some notes:

RESUME
You definitely want to update your resume while you’re working, training, or learning new things, and especially after new projects. Obviously if you get laid off, you’ll be updating it anyway, but it’ll be a little easier to keep track of things.
1 resume vs job specific resumes
I’m not too sure what works best, but I’m leaning towards job specific. I feel like I got slightly better results when I wasn’t just submitting the same resume to each job. But it does take a little bit of time to make the updates based on the JD. Generally you’ll want to mention the programming language (Python), data pipeline tools (Spark), data warehousing solution (Snowflake), and cloud service(s) (AWS) you’ve worked with. If you know how much data you were managing or moving, that seems to also help.
Mention any diverse data sources you’ve worked with. This includes APIs, particular databases, NoSQL databases, and real-time/streaming data. I also included a section in my resume that just stated languages and tools and how many years of experience I’ve had with them.
I’ll say I’ve gotten the best responses from LinkedIn applications. I’ve tried Indeed, Dice, and others, but they don’t seem to have as many jobs, nor do employers seem to be as responsive on there.
INTERVIEWS
Initial Calls
Once you’ve submitted your resume to a few places, hopefully you start hearing back from some. It may take anywhere from same day, if submitted in the morning, to a week later for them to respond with an email or even a phone call. As far as no’s go, you may not get any response or you’ll get an email saying what amounts to “thanks but no thanks”.
The first thing that will happen is you’ll get a screening call. Most times you’ll get an email where they ask you to schedule 15–30 minutes to talk to someone from HR or a recruiter. Many times they don’t understand all the tech jargon, but they’ll be taking notes, as they’ll be passing this on to the hiring manager. No need to be nervous here, but I would say practice summarizing your career. It should line up with your resume, and be prepared to answer follow-ups.
Afterwards, they’ll tell you about the job and/or the company, and maybe the team. You can ask questions, but again, since they’re not on the actual hiring team, they might not be able to answer job specific things. They might also list tools or languages and ask you if you’ve used them before. Those should already be in the job description though, so you’ll already have an idea.
Tech challenge pre-screening
In some cases, the first task won’t be a call. They’ll send some email that says they’re interested in your job application and they’d like you to participate in a code screening. These are usually LeetCode-type questions. You’ll have 1–2 hours to complete a set number of them. I’d say you should prepare for this by doing practice problems and getting a good understanding of data structures and algorithms. There are a few to focus on:
- Lists
- Dictionaries
- Sets
- Strings (as lists)
- Sliding Window
- Shrinking Window
- Path Finding (DFS & BFS)
- Trie nodes (understand them, but you’ll create the structure yourself)
- Recursion
I’ll comment more on this later in the interview section, but you basically should expect to get tested similar to a software developer round. If you’ve read about FAANG (would it be MAANA now tho?), you’ll know about this already.
Also, from what I’ve seen, this will always be geared towards Python/Java/Scala, and not focus on SQL. I’ve only done like 4 of these, though, so I can’t guarantee it.
Generally, I’d say Path Finding and Recursion don’t come up a lot, since it’s data engineering, but I have seen problems that need them in some cases.
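To make the sliding window item from the list concrete, here’s a minimal sketch in Python. The problem (longest substring with no repeated characters) is a common practice question I’m using as an assumption, not one I’m claiming you’ll get, and the function name is made up for illustration. It also leans on two other items from the list: strings treated like lists, and a dictionary for lookups.

```python
def longest_unique_substring(s: str) -> int:
    """Return the length of the longest substring with no repeated characters."""
    last_seen = {}  # character -> index of its most recent occurrence
    start = 0       # left edge of the current window
    best = 0
    for end, ch in enumerate(s):
        # If ch is already inside the window, shrink the window from the left
        # so it starts just past the previous occurrence.
        if ch in last_seen and last_seen[ch] >= start:
            start = last_seen[ch] + 1
        last_seen[ch] = end
        best = max(best, end - start + 1)
    return best


print(longest_unique_substring("abcabcbb"))
```

The window only ever moves forward, so the whole thing is a single pass over the string. That’s the kind of thing you’d want to be able to say out loud if asked about it.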
1st Call
So this can vary depending on the place, but this will either be a meeting with the hiring manager or the tech round. If it’s the hiring manager, you’ll pretty much talk through your resume and summarize your career. You may get some behavioral questions, but you’re probably gonna get more tech/architecture questions such as:
- Tell me about your favorite project
- What’s the difference between a data warehousing solution and a relational database solution?
- What’s a columnar database?
- Do you have any experience in <programming language/tool>?
- How do you implement testing? (Some people don’t ask this, but if you mention testing and automating it, that can earn you points. Sometimes data teams don’t come from the software side of things and don’t know about test-driven development or automated testing packages/tools.)
- If a query is running slow, how would you solve it?
There are others, so don’t be surprised if these aren’t exactly the questions you get. I’ll also mention this in the list of pointers, but if you don’t know something, especially after answering previous things correctly, it’s OK to let them know you don’t know but you’ll try to answer. And try to keep in mind what you didn’t know or got wrong. That’s homework for you.
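On the testing question from the list, it can help to have a tiny example in your head of what an automated test for a pipeline step looks like. Here’s a rough sketch using Python’s built-in unittest module. The `clean_record` transform and its field names are hypothetical, just something small enough to test.

```python
import unittest


def clean_record(raw):
    """Hypothetical transform step: drop rows with no id, normalize the rest."""
    if raw.get("id") is None:
        return None
    return {
        "id": int(raw["id"]),
        "name": raw["name"].strip().title(),
        "amount": float(raw["amount"]),
    }


class CleanRecordTests(unittest.TestCase):
    def test_normalizes_fields(self):
        raw = {"id": "7", "name": "  ada lovelace ", "amount": "19.90"}
        expected = {"id": 7, "name": "Ada Lovelace", "amount": 19.9}
        self.assertEqual(clean_record(raw), expected)

    def test_drops_rows_without_id(self):
        self.assertIsNone(clean_record({"id": None, "name": "x", "amount": "1"}))
```

Saved in a file, tests like these can be run with `python -m unittest`, and that same command is what you’d hook into whatever automation your team uses.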
Tech Round
If you made it here, then the hiring manager enjoyed talking to you, or the HR screener thought you were reasonable enough to take on the tech round. This is where it gets a bit all over the place. There are multiple ways companies do this. I’ll do my best to organize this in a good way:
- Coding session
- - Live session
- - - LeetCode
- - - Data engineering related coding
- - On your own time, like the tech challenge pre-screening
- - - LeetCode
- Take-home solution
- Grilling session
Coding Session
The coding session will happen either live or, like the technical pre-screening, on your own time.
Live Session
This will be either LeetCode or they’ll have some problems for you that are more in line with data engineering day-to-day work. With LeetCode, there will be 2–5 questions, and it’ll most likely start off with your programming language of choice and end with you writing some SQL queries.
The other situation may be that you’ll be writing some code based on a problem they came up with. Since I’m Python-based, the questions I’ve seen have been fairly standard: get data from an API that may have multiple pages, turn the relevant data from the response into a dataframe, and answer some questions.
They won’t really care about syntax, as either the code will be executable (they might help you answer something) or they’re just trying to understand your overall thinking (in the case of PySpark).
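Here’s a rough sketch of the shape that “paginated API, pull out the relevant fields, answer some questions” exercise tends to take. Everything in it is an assumption for illustration: the fake pages, the `next_page` field, and the function names are made up so the example runs without a network call. In a real session you’d likely be calling an actual endpoint and loading the rows into a dataframe.

```python
from collections import Counter

# Hypothetical stand-in for a paginated API: each "response" has a page of
# records plus the number of the next page (None on the last page).
FAKE_PAGES = {
    1: {"results": [{"id": 1, "city": "NYC", "fare": 12.5},
                    {"id": 2, "city": "Philly", "fare": 8.0}],
        "next_page": 2},
    2: {"results": [{"id": 3, "city": "NYC", "fare": 20.0}],
        "next_page": None},
}


def fetch_page(page):
    """Pretend HTTP call; a real one would hit the API and parse the JSON."""
    return FAKE_PAGES[page]


def fetch_all_records(fetch=fetch_page, first_page=1):
    """Follow next_page pointers until the API reports no more pages."""
    records, page = [], first_page
    while page is not None:
        payload = fetch(page)
        # Keep only the fields needed to answer the questions.
        records.extend({"city": r["city"], "fare": r["fare"]}
                       for r in payload["results"])
        page = payload["next_page"]
    return records


rows = fetch_all_records()
# With pandas available, pd.DataFrame(rows) would be the usual next step.
rides_per_city = Counter(row["city"] for row in rows)
total_fare = sum(row["fare"] for row in rows)
print(rides_per_city, total_fare)
```

Passing the fetch function in as a parameter is a small design choice that makes the loop easy to test with fake data, which is also a nice thing to mention if testing comes up.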
Make sure you understand how to find the Big O of your solution or, if it’s SQL, know another way to implement your solution, as they might give a reason why your solution won’t work in their scenario.
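As a quick illustration of the Big O point, here are two ways to answer the same question (does a list contain a duplicate?) with different costs. The function names are made up for this sketch. Being able to write one and then describe the other is roughly what interviewers are after.

```python
def has_duplicates_quadratic(values):
    """Compare every pair: O(n^2) time, O(1) extra space."""
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            if values[i] == values[j]:
                return True
    return False


def has_duplicates_linear(values):
    """Single pass with a set: O(n) time on average, O(n) extra space."""
    seen = set()
    for v in values:
        if v in seen:
            return True
        seen.add(v)
    return False


print(has_duplicates_quadratic([3, 1, 4, 1]), has_duplicates_linear([3, 1, 4, 1]))
```

The second version trades memory for speed, and naming that tradeoff is usually worth more than the code itself.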
You’re not going to have a lot of time, so don’t stammer your way toward the optimal solution. If it’s LeetCode, you should already know a decently optimized solution. Otherwise, just be able to explain your choices, because you want to complete your exercise. Also don’t be afraid to talk through your solution, as this will let them know what you are thinking, and in case you can’t finish because you’re a slow developer like me, they might shoot you some bail. And if anything is unclear, ask questions.
Another pointer: as you have a limited amount of time, try not to get distracted by variable names or remembering cute code. If you have multiple ways to approach the problem, do your best to go with the most familiar way. You want to figure out the problem, what you need to do, and code.
Own Time
If it’s something to do on your own time, it’s most likely just going to be LeetCode. It may vary: in some systems you won’t be able to have anything else open, and you have to turn your camera on and share your screen so they can see you’re not cheating, while others don’t care at all. You’ll just write code and have 1–1.5 hrs to complete all the questions. I’d say the minimum to complete is 3, but to be safe, do all of them.
Take Home Solution
This will definitely be more data engineering related. They’ll give you some problem statement in either an email or a GitHub repo. You pretty much have to solve that problem based on your knowledge. They’ll give you a timeline of when they want a submission by, and then probably suggest how much actual time it should take you to complete. Probably 2–6 hours max.
They won’t really restrict you on what you can use, but I’ve wound up using Docker, Python, pandas, and Spark in some cases. I did overcomplicate this one time by including a Postgres database and using a Flask API, but I think that position was an anomaly, as I was unaware that they were looking for a junior developer. The next meeting, if they like your solution, will be a discussion on your code and decisions.
Make sure to keep track of your assumptions as well.
Grilling Session
In this type of interview, you’ll pretty much just be answering 21 questions. Some of those questions I mentioned in the hiring manager round? They surely will show up here. Here are more example questions:
- What’s the difference between <join 1> and <join 2>?
- What’s the difference between a union and a union all?
- Why would we use Spark?
- Design how you would solve this problem with <cloud suite of choice>
- What is a decorator and why would we use one?
- What does immutable mean and what data types are immutable?
- Why would we use a class for a data pipeline?
- How would we create a standard solution for our data extracts, regardless of source type?
- If the data science/BI team gives you a query that’s running inefficiently, what steps would you take?
While not rare as an interview itself, it is rare for this to replace a coding challenge. Just take some time to look these up and make sure you know them (maybe ask your LLM of choice). You’ve likely already used them, but because everything was in place when you got there (or someone else made the decisions), you never had to answer these questions or think about them.
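For two of the Python questions in the list (decorators and immutability), a small example is often easier to talk through than a definition. The sketch below is only an illustration: `timed` and `load_rows` are made-up names, and timing a pipeline step is just one plausible reason to reach for a decorator.

```python
import functools
import time


def timed(func):
    """Decorator: wrap func so each call records how long it took."""
    @functools.wraps(func)  # keep the original function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        wrapper.last_duration = time.perf_counter() - start
        return result
    wrapper.last_duration = None
    return wrapper


@timed
def load_rows(n):
    """Hypothetical pipeline step."""
    return [i * 2 for i in range(n)]


rows = load_rows(3)
print(rows, load_rows.last_duration)

# Immutability: tuples (like strings, ints, and frozensets) can't be changed
# in place, so item assignment raises TypeError. Lists, dicts, and sets can.
point = (1, 2)
try:
    point[0] = 99
    mutated = True
except TypeError:
    mutated = False
print(mutated)
```

The short version for the interview: a decorator adds behavior around a function without editing the function itself, and immutable objects can only be replaced, never modified.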
Full Loop/Virtual On-Site
At this point, you’ve talked to 2–4 people already: the HR person, the hiring manager, and then 1–2 people hosting the tech screening. Next, depending on the company, you’re going to be scheduled for a “full loop”/virtual on-site, or just the next interview. With the full loop (I have no idea why it’s called that), you’ll be scheduling 4–5 interviews of 30 minutes to 1 hour each. Either they’ll all be the same day or you’ll work with their point person to split them up on consecutive days. Even if it’s not a full loop, you’ll wind up with similar interviews. If it’s a full loop, you’re going to get 2–3 coding type of rounds, a behavioral round, and then either an architectural round or a product scope with coding round.
Coding Rounds
Pretty much the technical round, but again, lol. This will be SQL and/or the programming language you use. I’ve also seen a code review session for dbt, so be familiar with the efficiencies and best practices of the tool you have your expertise in.
Behavioral and cultural fit
You can think of this as a vibe check. They’ll ask you not so techy questions and ask you to “Tell me about a situation where…”. So make sure you don’t have any red flags, and try to paint yourself positively. If you had a situation where you weren’t able to fix something, but learned a lesson that you applied elsewhere, mention that.
I think some places online mention the STAR method. I think it’s useful and makes sense, but I’ve only consciously used it once. Here are some sample questions:
- Tell me about a time a project didn’t go as planned
- Tell me about a time when you had a disagreement with another developer
- Tell me about a time when you’ve had to work with a difficult person
Product Scope w/ Code
With this session, you’ll be given some type of app (ride sharing, social network, etc.), and you’re going to have to create a data model and supporting tables to answer queries. If you haven’t worked with data models before, I’ll say I usually answer with a star schema and have my fact table and my dimension tables. This is something I picked up more in my time working with QlikView and Qlik Sense. If you don’t have this experience, I’d say just practice doing this.
The code part of this session will be using your data model to answer questions that the interviewer asks. Don’t worry about making sure you get all the fields right in your data model, as you’ll be able to adjust the data model during the session.
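If you want something to practice against, here’s a minimal star schema sketch for a ride sharing app, built in an in-memory SQLite database through Python’s sqlite3 module so it runs anywhere. The table names, columns, and sample rows are all assumptions made up for illustration. A real session would have its own requirements and probably more dimensions (rider, date, vehicle, and so on).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes, one row per entity.
CREATE TABLE dim_driver (driver_id INTEGER PRIMARY KEY, driver_name TEXT);
CREATE TABLE dim_city   (city_id   INTEGER PRIMARY KEY, city_name   TEXT);

-- Fact table: one row per completed ride, with keys into each dimension.
CREATE TABLE fact_ride (
    ride_id   INTEGER PRIMARY KEY,
    driver_id INTEGER REFERENCES dim_driver(driver_id),
    city_id   INTEGER REFERENCES dim_city(city_id),
    fare      REAL
);

INSERT INTO dim_driver VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO dim_city   VALUES (1, 'NYC'), (2, 'Philly');
INSERT INTO fact_ride  VALUES (1, 1, 1, 10.0), (2, 1, 2, 15.0), (3, 2, 1, 20.0);
""")

# Example interviewer question: total fare per city, highest first.
revenue_by_city = conn.execute("""
    SELECT c.city_name, SUM(f.fare) AS total_fare
    FROM fact_ride AS f
    JOIN dim_city AS c ON c.city_id = f.city_id
    GROUP BY c.city_name
    ORDER BY total_fare DESC
""").fetchall()
print(revenue_by_city)
```

Most follow-up questions then turn into a join from the fact table to one or two dimensions plus a GROUP BY, which is why the star layout is a comfortable default.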
Architectural
This will be a bit more high level than product scope. Pretty much, the question is: how would you design a data pipeline solution in your cloud environment of choice? You’ll probably be able to diagram this out. Just know, you’re going to get a bunch of follow-up questions, even after you explain. Don’t be too worried if the answer seems super obvious; answer the question with that, and if they mean something else, they’ll ask it a different way. At the very least, you’ll show you know why you architected your solution the way you did.
FINAL POINTERS
So that’s pretty much the different types of interview rounds you’ll face. They may switch up the order sometimes, so just listen to the HR/recruiter. Here, I’ll just be listing some pointers and things that I did that helped me:
- Interview prep: I used Glassdoor whenever possible to get a feel for the interviews. Some people are gracious enough to describe the interview questions they got, from tech questions to behavioral. Just keep in mind, they can rotate or change them, and it might even be a different interviewer. So don’t memorize answers, but know the core concepts in solving them. Also use Glassdoor to get some internal opinions of the work environment.
- LeetCode prep: Sometimes teams treat this like a software development role, so you’ll get some LeetCode questions. You can argue whether or not they’re useful because they aren’t problems you’ll solve on the job, but they’re going to be asked anyway. So just get familiar with LeetCode. Some people have done hundreds or even thousands of LeetCode problems, especially those prepping to get a job at MAANA, but I don’t like the idea of coding with no end goal. I just made sure to familiarize myself with the main ones and made sure I remembered the main concepts, algorithms, and data structs.
- Post Interview Notes: Try to take notes. As mentioned earlier, sometimes you’ll run into problems you can’t solve or get questions you don’t know the answer to. Make sure to write those down. Just like in sports, it’s OK to get things wrong, but if you are in that situation again and you didn’t do your homework, that’s on you. Make sure to google the answer or ask AI to help you answer. Sometimes the interviewer will come across as pompous; don’t worry about it. Just take whatever answer they give you, because if you’re wrong they can’t wait to tell you what the answer is, like an elementary school kid that learned the ‘s’ in island is silent.
- Questions for interviewer: Try to have a couple of questions for your interviewer. You’ll want to ask questions that give you an idea of what the day to day looks like, who you’ll be interacting with, and what the culture is. Even though you’re hoping to land the job, you do want to at least be able to decide whether the place sounds appealing or not.
- Leveraging AI: I only use this for prep. Some people might use this while they’re doing their coding, I don’t know, but it’s not my style. It feels a bit dishonest, plus if it’s coming up with code you can’t explain, you’re not going to be able to answer follow-up questions. There are already resources for LeetCode, so I wouldn’t even have it walk you through that. Instead, I’ll ask it to give me some LeetCode easy-medium questions and some SQL easy-medium questions at random.
- Managing Feelings and Mental Well Being: For some of us, this journey of finding a new position will take a while. You’ll get a bunch of no’s. Sometimes it’ll be for positions that you really wanted or where you thought you killed the interview, and sometimes it’ll be for positions you weren’t going to accept anyway (Wait, I was doing you a favor, and you’re telling me no? lol). Just don’t be rash with your decisions. You don’t want to fire off an angry response on social media or in a reply email. But you also don’t want to go into a shell and think you’re no good at anything. Just continue to move forward in your job search.
- 1 minor annoyance: That Workday application. It’s the same system for multiple companies, but it’s like 5 pages of the same exact questions that are auto-filled from your resume. But then it does it wrong the same way every time and you have to make the same correction over and over and over.

Resources:
Leetcode help
Thank you to Navdeep Singh (YouTuber) for putting together a guide for LeetCode questions. I’ve used his videos to get explanations so I understand how to use different data structures and algorithms. Here is the Google Sheet with LeetCode problems organized by sheet: https://docs.google.com/spreadsheets/d/1A2PaQKcdwO_lwxz9bAnxXnIQayCouZP6d-ENrBz_NXc/edit#gid=0