Feast is a standalone, open-source feature store that organizations use to store and serve features consistently for offline training and online inference.
Before discussing Feast, let's first look at features:
Features are the lifeblood of modern machine learning systems. No other activity in the machine learning life cycle has a higher return on investment than improving the data a model has access to. However, developing, managing, sharing, and serving features is also one of the most time-consuming tasks that ML teams face.
Feast management in the ML lifecycle
Feast sits between data engineering and ML engineering. On one side are data owners (data engineers, data scientists), who create data sets and data streams outside of Feast and ingest them into the system. On the other side are ML practitioners, who consume these features during training or serving.
Once Feast is deployed, getting features in and out of it is simple and can be done using its Python, Java, or Golang SDKs.
Beyond connecting feature producers to feature consumers, Feast also provides other conveniences for both parties, such as:
- Decentralised serving
- Unified feature serving API
- Consistent feature joins
- Project isolation
- Standardization of features
Typical challenges that ML teams face with features:
Features not being reused: Features representing the same business concepts are being redeveloped many times, when existing work from other teams could have been reused.
Feature definitions vary: Different teams define features differently, and it is difficult to access a feature’s documentation.
Hard to serve up-to-date features: Not all teams have the skills needed to combine streaming and batch-derived features and make them servable. Often, specialised infrastructure is needed for ingesting and serving features derived from streaming data. Teams are deterred from using real-time data as a result.
Inconsistency between training and serving: Models that serve predictions require the latest values, whereas training requires access to historical data. Inconsistencies arise when data is siloed into many independent systems requiring separate tooling.
Why Implement Feast’s Feature Store in Your Stack?
- An open-source solution
Answer your organization’s feature storing and serving needs with Feast’s customizable open-source feature store.
- The promise of consistency
Ensure consistency of feature values across training and serving environments, no matter the use case.
- An extension to your existing stack
Connect Feast to the infrastructure and tools you already leverage for transformation, storage, monitoring, and modeling.
- Stand-alone feature store
Build and manage your data pipelines with your existing tools, and trust Feast to help you store and serve feature values reliably.
Feast is based on four fundamental components:
Feast Core: The Core subsystem is responsible for managing the different components of Feast. For instance, Feast Core manages the execution of feature ingestion jobs from batch and streaming sources while also enabling the registration and management of entities, features, data stores, and other system resources.
Feast Store: Feast supports two fundamental types of stores: warehouses and serving. Feast Warehouse Stores are based on Google BigQuery and maintain all historical feature data. The warehouse can be queried for batch datasets which are then used for model training. Serving Stores are responsible for maintaining feature values for access in a production serving environment.
Feast Serving API: This API is responsible for the retrieval of feature values by models in production. It supports both HTTP and gRPC, which allows for low-latency, high-throughput retrieval.
Feast Client Libraries: Feast provides client libraries for different languages such as Java, Go, and Python, as well as a command-line module. The client libraries streamline developer interactions with the platform.
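To make the split between warehouse and serving stores more concrete, here is a minimal sketch in plain Python. It is not Feast's API: the class `ToyFeatureStore`, its methods, and the `driver_1` data are all hypothetical names chosen for illustration. It only shows the idea that one ingestion path can feed both a full history (for training) and a latest-value lookup (for serving).

```python
from datetime import datetime


class ToyFeatureStore:
    """Illustrative only; a hypothetical stand-in, not Feast's API."""

    def __init__(self):
        # "Warehouse": append-only history of every feature row, used for training.
        self.warehouse = []
        # "Serving store": latest value per (entity_id, feature), used for inference.
        self.serving = {}

    def ingest(self, entity_id, feature, value, event_time):
        # Every row is kept in the historical store.
        self.warehouse.append((entity_id, feature, value, event_time))
        key = (entity_id, feature)
        current = self.serving.get(key)
        # The serving store keeps only the most recent value,
        # even when rows arrive out of order.
        if current is None or event_time >= current[1]:
            self.serving[key] = (value, event_time)

    def get_online(self, entity_id, features):
        # Latest known value per requested feature, or None if never ingested.
        return {f: self.serving.get((entity_id, f), (None, None))[0] for f in features}

    def get_historical(self, feature):
        # All rows ever ingested for one feature.
        return [row for row in self.warehouse if row[1] == feature]


store = ToyFeatureStore()
store.ingest("driver_1", "trips_today", 3, datetime(2021, 1, 1, 9))
store.ingest("driver_1", "trips_today", 5, datetime(2021, 1, 1, 12))
store.ingest("driver_1", "trips_today", 4, datetime(2021, 1, 1, 10))  # late arrival

online = store.get_online("driver_1", ["trips_today", "rating"])
history = store.get_historical("trips_today")
print(online)
print(len(history))
```

Because both stores are filled by the same `ingest` call, training and serving read values that share one definition, which is the consistency property described above. A real deployment would replace the list and dictionary with a data warehouse and a low-latency key-value store.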
Feast is an open-source feature store for machine learning. It has several useful properties, including generating point-in-time correct feature sets, so that future feature values (a common source of error) do not leak into models during training, and supporting both streaming and batch data sources. However, it currently supports only timestamped structured data and may therefore not be suitable if your models work with unstructured data. Feast doesn’t require the deployment and management of dedicated infrastructure. Instead, it reuses existing infrastructure and spins up new resources when needed.
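The idea of a point-in-time correct feature set can be illustrated with a short sketch in plain Python. This is an assumption-laden illustration of the general technique, not how Feast implements it: the function `point_in_time_join`, the tuple layouts, and the sample driver data are all hypothetical. For each labelled event, the join picks the most recent feature value observed at or before the event's timestamp, so values from the future never reach the training set.

```python
from bisect import bisect_right
from datetime import datetime


def point_in_time_join(labels, feature_rows):
    """Attach to each (entity_id, label_time, label) the latest feature
    value whose timestamp is at or before label_time (hypothetical helper)."""
    # Group feature rows by entity and sort each group by timestamp.
    by_entity = {}
    for entity_id, ts, value in feature_rows:
        by_entity.setdefault(entity_id, []).append((ts, value))
    for rows in by_entity.values():
        rows.sort(key=lambda r: r[0])

    joined = []
    for entity_id, label_time, label in labels:
        rows = by_entity.get(entity_id, [])
        timestamps = [ts for ts, _ in rows]
        # Index of the first feature row strictly after label_time.
        idx = bisect_right(timestamps, label_time)
        # Use the row just before it; None if no value existed yet.
        value = rows[idx - 1][1] if idx > 0 else None
        joined.append((entity_id, label_time, value, label))
    return joined


# Made-up sample data: (entity_id, event timestamp, feature value).
feature_rows = [
    ("driver_1", datetime(2021, 1, 1), 0.50),
    ("driver_1", datetime(2021, 1, 3), 0.90),
    ("driver_2", datetime(2021, 1, 2), 0.70),
]
# Made-up labels: (entity_id, label timestamp, label).
labels = [
    ("driver_1", datetime(2021, 1, 2), 1),
    ("driver_1", datetime(2021, 1, 4), 0),
    ("driver_2", datetime(2021, 1, 1), 1),
]

training_rows = point_in_time_join(labels, feature_rows)
for row in training_rows:
    print(row)
```

In this sample, the label for `driver_1` on 2 January is paired with the value from 1 January rather than the later value from 3 January, and the label for `driver_2` gets no value because none existed yet at that time. A naive join on entity alone would have leaked the later values into training.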
For more details contact email@example.com