Feature Store for Machine Learning

Data-driven decision-making has become key to success in every domain of today’s digital business, and Machine Learning plays a vital role in achieving it across industries. In the ML life cycle, Feature Engineering is one of the major and most critical stages for making any kind of decision.

In the book “Feature Store for Machine Learning”, the author divides this journey into three significant sections and covers each topic precisely and to the point. This helps every reader follow the flow of feature engineering and feature stores seamlessly.

Let me share my thoughts and key takeaways from each section of the book.

Section 1: An Overview of the Machine Learning Life Cycle: In this section, the author takes us through the basics of the ML life cycle in a practical way, with classical examples. He outlines the critical stages in ML and explains each one with crystal clarity: Dataset Selection, Data Exploration and Feature Engineering, Model Selection, and Monitoring. I’m sure these topics give the reader the right context for the ML life cycle.

Without further delay, the author moves on to Feature Engineering, which, as we know, is a really time-consuming stage of building an ML solution. On ML feature management specifically, he explains how the absence of feature management becomes a major impediment to getting ML models into production. He gives plenty of detail on features in production and their importance, the ways to bring them into production, the common problems with those approaches, and how the “Feature Store” comes to the rescue.

I recommend that readers read the “Batch model pipeline and Online-Transactional models” topic, since the author articulates these concepts clearly with neat sketches and provides a list of common problems, all of which are valid based on my own observation and experience. I’d like to highlight a few of them here: re-inventing the wheel, feature re-calculation, training vs. serving skew, model reproducibility, and a few others. The author also provides a detailed walk-through of how the feature store solves them; I would call this a must-read topic.
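
To make the training vs. serving skew problem concrete, here is a minimal sketch of my own; it is not taken from the book. The record fields, the feature, and the function names (avg_spend_training, avg_spend_serving, compute) are all assumptions for illustration. It shows the same feature implemented twice with slightly different logic, and then one shared definition that both pipelines look up by name, which is roughly the idea a feature store builds on.

```python
from datetime import date

# Hypothetical raw record; the field names are assumptions for illustration.
customer = {"total_spend": 1200.0, "signup": date(2022, 1, 1)}
today = date(2023, 1, 1)

def avg_spend_training(c, as_of):
    # Training pipeline: counts whole calendar months since signup.
    months = (as_of.year - c["signup"].year) * 12 + (as_of.month - c["signup"].month)
    return c["total_spend"] / max(months, 1)

def avg_spend_serving(c, as_of):
    # Separate serving re-implementation: approximates a month as 30 days.
    days = (as_of - c["signup"]).days
    return c["total_spend"] / max(days / 30, 1)

# Same customer, same date, two different feature values: this is the skew.
skew = abs(avg_spend_training(customer, today) - avg_spend_serving(customer, today))

# One shared definition, looked up by name from both pipelines, removes it.
FEATURES = {"avg_monthly_spend": avg_spend_training}

def compute(name, c, as_of):
    return FEATURES[name](c, as_of)

training_value = compute("avg_monthly_spend", customer, today)
serving_value = compute("avg_monthly_spend", customer, today)
```

The shared registry also addresses re-inventing the wheel and feature re-calculation in a small way, since a second team would reuse the registered definition instead of writing its own.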

Section 2: A Feature Store in Action: In Section 2, the author dives deep into the feature store by introducing us to “Feast”, the open-source feature management system for serving and managing ML features, in every detail. He provides the essential terminology and definitions we need when using it in real-world situations, along with excellent steps to install Feast and play around with its features, such as adding an entity and a FeatureView, generating training data, and loading features to the online store. These are all highly demanded features in “Feast”, and this is a topic to read and digest.
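
For readers who have not seen these steps before, the flow can be pictured with a toy, in-memory stand-in. This is my own sketch and deliberately not Feast’s API: the class ToyFeatureStore, its method names and signatures, and the driver_1 / trips_7d data are hypothetical. It only illustrates the ideas of an entity key, historical (point-in-time) lookup for training data, and loading the latest values into an online store for serving.

```python
from datetime import datetime

class ToyFeatureStore:
    """In-memory stand-in for a feature store; not Feast's API."""

    def __init__(self):
        self.offline = []  # rows of (entity_id, event_time, features)
        self.online = {}   # entity_id -> latest features

    def ingest(self, entity_id, event_time, features):
        # Append a timestamped feature row to the offline store.
        self.offline.append((entity_id, event_time, dict(features)))

    def historical_features(self, entity_id, as_of):
        # Point-in-time lookup: latest row at or before `as_of`.
        rows = [r for r in self.offline if r[0] == entity_id and r[1] <= as_of]
        if not rows:
            return None
        return max(rows, key=lambda r: r[1])[2]

    def materialize(self, end_time):
        # Copy each entity's latest values up to `end_time` into the online store.
        for entity_id in {r[0] for r in self.offline}:
            latest = self.historical_features(entity_id, end_time)
            if latest is not None:
                self.online[entity_id] = latest

    def online_features(self, entity_id):
        # Low-latency read used at inference time.
        return self.online.get(entity_id)

store = ToyFeatureStore()
store.ingest("driver_1", datetime(2023, 1, 1), {"trips_7d": 10})
store.ingest("driver_1", datetime(2023, 1, 8), {"trips_7d": 14})

# Training data for a label observed on 2023-01-05 sees only the 2023-01-01 value.
training_row = store.historical_features("driver_1", datetime(2023, 1, 5))

# Load the latest values into the online store, then read them for serving.
store.materialize(datetime(2023, 1, 10))
serving_row = store.online_features("driver_1")
```

The point-in-time lookup is what keeps future values from leaking into training data, while the online store always holds the freshest values for inference.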

The author clearly lays out the steps for creating “Feast” resources in the AWS environment and explores them from various angles along with the AWS components involved (S3, Redshift, Glue/Lake Formation, Airflow, SageMaker), together with major topics such as model training and inference and productionisation. All of these are very useful for those who are planning to adopt Feast in the AWS space; please note this point for your reference. After completing this section, we can understand the overall use of Amazon Managed Workflows for Apache Airflow for orchestration, and the productionisation of online and batch models with Feast. I believe others would feel the same way.

Section 3: Alternatives, Best Practices, and a Use Case: In this section, the author outlines the alternatives to Feast, such as SageMaker Feature Store, Google’s Vertex AI Feature Store, and the Hopsworks Feature Store, and covers ML best practices. This section really helps us understand Feast and compare it with other feature stores. You must try the use case provided in this book; I promise you will get hands-on experience of how to use feature stores.

Overall, “Feature Store for Machine Learning” will help ML Engineers and Data Scientists understand the importance of feature engineering and how to use feature stores effectively in projects.

All the best to the author. Overall, I give this book 4.0/5.0. The author’s special effort is certainly much appreciated.

-Shanthababu Pandian
Artificial Intelligence and Analytics | Cloud Data and ML Architect | Scrum Master
National and International Speaker | Blogger

Published by Shanthababu

I am Shanthababu Pandian. I have 17 years of IT experience and work in project manager roles and responsibilities.
