A Day in the Life of a Data Engineer
The first time I considered becoming a data professional was in my last year of college, when I took a class in machine learning and big data. I saw potential impact and use cases for machine learning that simply had not been seen before. Even today, I still believe there are plenty of ways we can leverage the power of machine learning, such as natural language processing and computer vision. Just a few years ago, we didn’t even know some of these applications were possible. What I love about it is that you’re working on something that didn’t exist and then making it a reality.
Currently, I’m part of the DnA team at Vista, which is driven by the ambition to become one of the world’s most iconic data-driven companies. It’s great to be on a journey of transformation that brings exciting technical challenges and opportunities to build impactful data products.
Meaning behind words
As a data engineer, I enjoy solving complex problems and turning ideas and insights into scalable data products. The main parts of my job include defining use cases, facilitating workshops with decision-makers, and designing and implementing cloud-based data solutions and machine learning pipelines at scale.
One of my current challenges is working with transformers for natural language processing (NLP). In a nutshell, it’s the ability to understand the semantics behind text. Traditionally, we would process text as words, without understanding its meaning or what’s behind the content. Now, we have the technology to embed meaning into a machine and, ultimately, make a connection between different texts.
A typical use case of transformer-based NLP for an eCommerce website is what we call “Natural Search,” or natural language-based search, as opposed to keyword-based search. For instance, if a user types a complex sentence into the search bar, it can help us bring up results that better match the search because the machine has a deeper understanding of what they’re looking for. That makes the customer’s experience more enjoyable.
In my role, I see the impact of this process in my day-to-day work. For example, a restaurant owner can describe what they need, say “minimalist business cards for my vegan restaurant business.” Instead of making them select the business card product gallery and then filter on the restaurant industry and minimalist style, we can show relevant design templates for vegan restaurants and minimalist styles on the relevant product. All thanks to transformer-based models.
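To make the idea concrete, here is a minimal sketch of how natural search can rank results once texts have been turned into vectors. It is only an illustration under stated assumptions: the three-dimensional vectors, the template names, and the function names are all made up for the example, and in practice the vectors would come from a transformer model rather than being written by hand.

```python
import math

# Hypothetical, hand-written "embeddings" for three design templates.
# A real transformer model would produce vectors with hundreds of dimensions.
TEMPLATE_EMBEDDINGS = {
    "minimalist vegan restaurant business card": [0.9, 0.8, 0.1],
    "ornate steakhouse business card": [0.1, 0.7, 0.2],
    "colorful birthday party flyer": [0.2, 0.1, 0.9],
}

# Assumed embedding of the query
# "minimalist business cards for my vegan restaurant business".
QUERY_EMBEDDING = [0.85, 0.75, 0.15]


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_templates(query_embedding, template_embeddings):
    """Return template names sorted from most to least similar to the query."""
    scored = [
        (cosine_similarity(query_embedding, vector), name)
        for name, vector in template_embeddings.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored]


ranking = rank_templates(QUERY_EMBEDDING, TEMPLATE_EMBEDDINGS)
for name in ranking:
    print(name)
```

The point of the sketch is that ranking compares meaning vectors rather than matching keywords, so the user does not have to pick a gallery and apply filters by hand.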
Solving real-life issues
Aside from the usual projects I work on, one of the key topics around machine learning I’m passionate about is MLOps. MLOps is the process of taking a machine learning model to production to solve real-life problems. It considers what you have to build to actually have a machine learning model deployed at scale.
To put the NLP use case I mentioned earlier into production, we first need to create the infrastructure to continuously process large amounts of text in order to train a model that’s able to understand semantics. We then also need to build an API so the website can use the model, while continuously monitoring the model’s performance. That monitoring helps us ensure the model becomes more accurate over time. MLOps covers all the operations you need to put in place to have a machine learning model working.
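As a rough illustration of the monitoring step, the sketch below tracks how often recent search results were judged relevant and flags when that share drops too low. Everything here is an assumption made for the example: the class name, the window size, the threshold, and the idea of using a simple relevant/not-relevant signal are hypothetical and not a description of any particular production setup.

```python
from collections import deque


class RollingAccuracyMonitor:
    """Track the share of relevant results over the most recent requests."""

    def __init__(self, window_size, alert_threshold):
        # Only the latest `window_size` outcomes are kept; older ones drop off.
        self.outcomes = deque(maxlen=window_size)
        self.alert_threshold = alert_threshold

    def record(self, was_relevant):
        """Store whether the model's result for one request was relevant."""
        self.outcomes.append(bool(was_relevant))

    def accuracy(self):
        """Return the share of relevant results in the window, or None if empty."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_attention(self):
        """Return True when the rolling accuracy falls below the threshold."""
        current = self.accuracy()
        return current is not None and current < self.alert_threshold


# Hypothetical feedback stream: True means the user found the result relevant.
monitor = RollingAccuracyMonitor(window_size=4, alert_threshold=0.5)
for feedback in [True, True, False, False, False]:
    monitor.record(feedback)

print(monitor.accuracy(), monitor.needs_attention())
```

A signal like this could be one trigger for retraining or rolling back a model, which is the kind of operational loop MLOps is meant to cover.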
MLOps stems from the broader concept of DevOps, usually explained as a set of tools, concepts, and ways of working that help software teams reduce the time-to-production of new features. Ultimately, it aims to deliver faster increments of value to end-users and collect their feedback more often through experimentation.
Machine learning development is often still slower than software development: we can spend several sprints between a model’s inception and its first A/B test, and several more sprints testing new versions of that same model. MLOps is there to bridge that gap between software development and machine learning development. In Vista’s use cases, MLOps is an enabler for faster iteration, from inception to experimentation, to make sure we are building the right models for our customers.