How to Become a Data Engineer

What is Data Engineering?

  • Generalist: If your dream is to work on a small team or at a startup, the generalist path is your best bet. Generalists know how to do a bit of everything and can set up architectures for a variety of tasks.
  • Pipeline-centric: If your dream company is an established organization with complex data needs, focusing on building pipelines could be your jam. Pipelines are often part of revenue-producing data projects.
  • Database-centric: Large companies with massive data volumes and legacy systems need engineers to build data warehouses. Legacy systems fed by multiple large data streams have to be converted into something workable for a fast, data-driven culture.
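
To make "pipeline" concrete, here is a minimal sketch of the extract, transform, load pattern that pipeline-centric work tends to revolve around. It uses only the Python standard library. The sample data, the column names, the `orders` table, and the function names are all invented for illustration; a real pipeline would read from production sources and would likely run on an orchestration tool rather than as a single script.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; the columns and values are made up for this sketch.
RAW_CSV = """order_id,customer,amount
1,alice,20.50
2,bob,
3,alice,4.50
"""


def extract(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Drop rows with a missing amount and convert strings to typed values."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["customer"], float(row["amount"]))
        )
    return cleaned


def load(rows, conn):
    """Write the cleaned rows into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(text, conn):
    """Run extract -> transform -> load, then return total spend per customer."""
    load(transform(extract(text)), conn)
    query = (
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(RAW_CSV, connection))
```

Keeping each stage as its own function is a design choice, not a requirement: it lets you test the cleaning rules without touching a database, which matters more as the data sources multiply.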

How to Become a Data Engineer

From a computer science/coding path

  1. Attend a boot camp specifically for engineering, or seek out open courseware if you don’t have a boot camp budget. Specific skills in building pipelines, architecting distributed systems and data stores, and combining data sources are all part of this. Tools such as Scala, Python, and Hadoop are essential, but the underlying concepts are more important.
  2. You can gain experience from entry-level IT positions or transition from data science to data engineering for a small company that doesn’t need scale quite yet. You can also build systems independently and document them through your online portfolio.
  3. Gain professional certifications. IBM, Microsoft, Google, and Cloudera, for example, all offer certifications specifically in data engineering. If you know your preferred organization works with a particular set of tools, that can help focus your certifications.
  4. Consider a graduate degree. Data engineering is highly technical, and just a certification may not be enough to help you stand out. There is a data engineering talent shortage, but companies seem willing to wait for the right candidate or train internally instead of hiring someone who doesn’t quite fit.
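
As a small taste of the "combining data sources" skill mentioned in step 1, the sketch below joins records from two hypothetical systems on a shared key. The field names (`customer_id`, `name`, `amount_cents`) and the function name are assumptions made up for this example; at real scale you would more likely do this in SQL or a distributed framework, but the underlying concept is the same.

```python
def combine_sources(crm_rows, billing_rows):
    """Attach each customer's billing total to their CRM record.

    Customers with no billing rows get a total of 0 (a left join).
    """
    totals = {}
    for row in billing_rows:
        key = row["customer_id"]
        totals[key] = totals.get(key, 0) + row["amount_cents"]

    return [
        {**crm, "total_cents": totals.get(crm["customer_id"], 0)}
        for crm in crm_rows
    ]


if __name__ == "__main__":
    # Hypothetical sample records from two separate systems.
    crm = [
        {"customer_id": 1, "name": "Ada"},
        {"customer_id": 2, "name": "Grace"},
    ]
    billing = [
        {"customer_id": 1, "amount_cents": 500},
        {"customer_id": 1, "amount_cents": 250},
    ]
    for record in combine_sources(crm, billing):
        print(record)
```

Amounts are stored as integer cents here to sidestep floating-point rounding, a habit worth picking up early when money is involved.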

From a noncoding path

  1. Check whether boot camps or sprints are available in your area. In my area, for example, the Nashville Software School frequently offers three-week “jumpstarts” to get you started in both development and data science.
  2. You’ll need at least an intermediate familiarity with advanced skills such as Python, Java/Scala, SQL/NoSQL, cloud platforms and computing, and architecture options like Hadoop. Explore open courseware to help get you there. Look into your local community college or university, too.
  3. You’ll want to get on board as quickly as possible with a specific project. Finding a passion project, such as this person’s journey with OkCupid, could help jumpstart those real-world skills and keep you interested for the long haul.
  4. Set up your GitHub profile and look for hackathons, volunteer opportunities, internships, local meetups, and anything else that can help you gain real-world experience, build your network, and stay interested.
  5. Begin exploring positions in data science or even data engineering at small companies or startups with simple needs. As you gain experience processing and building for business value, you can move on to larger companies with more complex data needs.

Fulfilling the Data Engineer Role



ODSC - Open Data Science

Our passion is bringing thousands of the best and brightest data scientists together under one roof for an incredible learning and networking experience.