I started my first job as a Data Engineer back in September. I wasn't exactly sure what the job entailed, but I was ready to learn new things. I'd spent the previous four years on back-end application development, and while I learned a lot, I wanted to shift away from primarily consuming and constructing APIs.
When I was interviewing for the position, I read up on DAGs, ETL pipelines, and common database patterns like the star schema. However, in my interview I was asked things like "How do you optimize this SQL query?" and "Assuming you don't have an automated load balancer in place, how do you decide which cluster to create a new database in?" They were interesting questions, so I decided to take the job. But it can be difficult to explain, even to other engineers, what the scope of my job is.
My team is responsible for database infrastructure. We make sure that database clusters aren't overloaded. We're concerned with internal data security. We manage database users and permissions, and field requests from different teams. We're the information-gathering arm for the Business Intelligence team; if they need a dump of data from a specific API every 24 hours, we write the ETL pipeline to do it and manage the data warehouse where all that information lives. My job is kind of a grab bag of things that need to get done but might not have an obvious owner.
I know I probably haven't done a great job of answering the question "What even is data engineering?" because I'm not exactly sure I know myself! I recently met up with Ali at a local Python meetup and we bonded over the fact that the discipline of data engineering seems utterly made up.
In this series, however, I'm going to introduce different tools that I use in my work through the lens of data engineering. If you have specific questions, or want to know about specific tools commonly used by data engineers, let me know in the comments. Data engineering is a fairly new field in software development, and the boundaries are still being drawn. Let's suss it out together!