I want to share with you some of the traits that I’ve found in especially good Data
Engineers. Not every Data Engineer will have all of these traits, but you will find several.

I can’t stress enough how important it is for a Data Engineer to have a strong programming
background. Data Engineers are commonly mid-level to senior in their careers. Those fresh out
of school usually have a Master’s degree or above in Computer Science with a focus on
distributed systems or data. That said, I have seen some especially bright junior engineers make
great contributions to the team.

This will sound odd given how much I talked about the importance of programming, but the
best Data Engineers are bored with just programming. That means that they’ve mastered or
nearly mastered programming as a discipline. Writing another enterprise system or small data
project doesn’t hold much interest for them.

As a result, they’ve started to cross-train into other fields. These could be related to
programming, like data science, or unrelated, like marketing or analysis.

Data Engineers are bored with creating small data systems, which aren’t as complex as Big Data
systems. They want to create bigger and more complex systems. The main driver for this is their
desire to create data products that can be used by everyone.

This desire to create data products comes out of a common love of data. You might have seen a
Software Engineer love coding or maybe even love a language. They are happiest when coding.
Data Engineers love coding and data. If they don’t love data, they at least have an interest in
it. I’ve found this distinguishes the great Data Engineers from the good ones.

They use this data because they are inherently curious about what is happening and why. They
form a hypothesis and then use their data to either prove or disprove it.

I don’t focus on what technologies a Data Engineer knows. I focus on their understanding of
systems and distributed systems. They obviously need to know some Big Data technologies and
APIs. However, learning APIs or another technology is much easier once you know the basic
architectural and design patterns of Big Data systems. A Data Engineer who has shown they can
learn some Big Data technologies is likely to have the ability to learn other technologies.

I see this all the time when I train a team that is already working with Big Data technologies.
They catch on to the concepts more quickly because of the similarities to the Big Data
technologies they already use. The team learns more from the training because they’re not
starting from scratch.