3. Confidential and Proprietary to Daugherty Business Solutions
Features
“A feature is an individual measurable property or characteristic of a phenomenon being observed… Feature data is used both as input to models during training and when models are served in production.”
Source: https://docs.feast.dev/user-guide/features
Key takeaways
• Features are not data
• Features enumerate information
• Not all features are equal
4. Feature Engineering
Feature engineering is the process of extracting features from raw data.
Feature Engineering Techniques
• Imputation
• Handling Outliers
• Binning
• Numerical Transform
• One-Hot Encoding
• Grouping
• Extraction
• Scaling
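Several of the techniques listed for feature engineering can be sketched in a few lines of plain Python. This is a minimal illustration, not a production recipe: the records, the column names, the bin edges, and the clipping range are all made up for the example.

```python
from statistics import mean, median, pstdev

# Hypothetical raw records; None marks a missing value.
raw = [
    {"age": 25, "income": 40_000.0, "plan": "basic"},
    {"age": None, "income": 52_000.0, "plan": "premium"},
    {"age": 47, "income": 1_000_000.0, "plan": "basic"},
    {"age": 33, "income": 61_000.0, "plan": "trial"},
]


def impute_median(values):
    """Imputation: replace missing values with the median of the observed ones."""
    fill = median(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def clip(values, low, high):
    """Handling outliers: cap every value to the range [low, high]."""
    return [min(max(v, low), high) for v in values]


def bin_age(age):
    """Binning: map a numeric age onto a coarse label (bin edges are arbitrary)."""
    if age < 30:
        return "under_30"
    if age < 45:
        return "30_to_44"
    return "45_plus"


def one_hot(values):
    """One-hot encoding: one 0/1 column per distinct category."""
    categories = sorted(set(values))
    return [{f"plan_{c}": int(v == c) for c in categories} for v in values]


def standardize(values):
    """Scaling: subtract the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


ages = impute_median([r["age"] for r in raw])
incomes = clip([r["income"] for r in raw], 0.0, 100_000.0)
plan_flags = one_hot([r["plan"] for r in raw])

features = [
    {"age_bin": bin_age(a), "income_scaled": s, **flags}
    for a, s, flags in zip(ages, standardize(incomes), plan_flags)
]
```

Each row in `features` now holds model-ready values derived from the raw record. Note that the clipping happens before scaling, so the single extreme income does not dominate the mean and standard deviation.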
5. Feature Challenges
• Feature Reuse Between Models
• Consistent Feature Definitions
• Latency / Recency
• Environmental Variation
• Unstable Dependencies
• Governance
• Versioning
6. Feature Store
[Architecture diagram. Components shown: API; Metadata / Model / Predictions; Offline Data Store; Online Data Store; Batch Engine; Stream Engine; Batch Prediction; Stream Prediction]
7. Feature Store Use Cases
• Retrieve Feature Metadata
• Retrieve Feature Values
• Remove Features
• Store Features
• Stream Store Features
• Stream Retrieve Features
• Feature Versioning
• Model Versioning
• Record Predictions
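To make the use cases above concrete, here is a toy in-memory sketch covering storing, retrieving, removing, and versioning features. The class and method names (`InMemoryFeatureStore`, `store_features`, and so on) are hypothetical and do not correspond to any real feature store product. The streaming, model versioning, and prediction use cases are left out.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureMetadata:
    name: str
    version: int
    description: str


@dataclass
class InMemoryFeatureStore:
    """Toy feature store kept in dictionaries (hypothetical, for illustration)."""

    metadata: dict = field(default_factory=dict)  # (name, version) -> FeatureMetadata
    values: dict = field(default_factory=dict)    # (name, version) -> {entity_id: value}

    def latest_version(self, name):
        """Return the highest stored version of a feature, or 0 if none exists."""
        return max((v for n, v in self.metadata if n == name), default=0)

    def store_features(self, name, description, values):
        """Store values as a new version, so earlier versions stay retrievable."""
        version = self.latest_version(name) + 1
        self.metadata[(name, version)] = FeatureMetadata(name, version, description)
        self.values[(name, version)] = dict(values)
        return version

    def retrieve_metadata(self, name, version=None):
        """Return metadata for the given version (latest if not specified)."""
        return self.metadata[(name, version or self.latest_version(name))]

    def retrieve_values(self, name, entity_ids, version=None):
        """Return {entity_id: value} for the requested entities (None if absent)."""
        stored = self.values[(name, version or self.latest_version(name))]
        return {e: stored.get(e) for e in entity_ids}

    def remove_features(self, name, version):
        """Delete one version of a feature together with its metadata."""
        del self.metadata[(name, version)]
        del self.values[(name, version)]


store = InMemoryFeatureStore()
v1 = store.store_features(
    "avg_order_value", "Mean order value, trailing 30 days", {"c1": 20.0}
)
v2 = store.store_features(
    "avg_order_value", "Mean order value, trailing 7 days", {"c1": 35.0, "c2": 12.5}
)
```

Writing every change as a new version is one possible design choice: a model trained against version 1 can keep reading version 1 while newer models move to version 2.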
8. Data Pipeline
• Data engineers interact with a feature store by creating data pipeline definitions.
• Data pipeline definitions combine:
– Data sources
– Business definitions
– Transformation rules
– Streaming/batch definitions
– Scheduling
• The feature store engines execute the data pipelines and store their output in the online and offline data stores.
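A data pipeline definition of the kind described above could be modeled as a small record that bundles the listed parts. In this sketch the field names, the `run_batch` helper, and the cron-style schedule string are assumptions made for illustration. It also assumes the common split in which the offline store keeps full history and the online store keeps only the latest value per entity.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class PipelineDefinition:
    name: str                           # feature produced by the pipeline
    source: str                         # data source, e.g. a table name
    business_definition: str            # plain-language meaning of the feature
    transform: Callable[[dict], float]  # transformation rule applied per row
    mode: str                           # "batch" or "stream"
    schedule: str                       # e.g. a cron expression for batch runs

    def __post_init__(self):
        if self.mode not in ("batch", "stream"):
            raise ValueError(f"unknown mode: {self.mode}")


def run_batch(definition, rows, offline_store, online_store):
    """Apply the transformation rule to each row and write to both stores."""
    for row in rows:
        value = definition.transform(row)
        # Offline store: append-only history (assumed use: building training sets).
        offline_store.append((definition.name, row["entity_id"], row["ts"], value))
        # Online store: latest value per entity (assumed use: low-latency serving).
        online_store[(definition.name, row["entity_id"])] = value


basket_size = PipelineDefinition(
    name="basket_size",
    source="orders",
    business_definition="Number of items in a customer's most recent order",
    transform=lambda row: float(len(row["items"])),
    mode="batch",
    schedule="0 2 * * *",
)

offline, online = [], {}
rows = [
    {"entity_id": "c1", "ts": 1, "items": ["a", "b"]},
    {"entity_id": "c1", "ts": 2, "items": ["a", "b", "c"]},
]
run_batch(basket_size, rows, offline, online)
```

After the run, `offline` holds both historical rows for customer `c1`, while `online` holds only the most recent basket size.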
9. Feature Registry
• Data scientists interact with the feature store through the Feature Registry.
• They can search for and browse feature definitions.
• They can register data science models as a class of data pipeline.
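The registry interactions described above (search, browse, and registering a model alongside features) might look like the following sketch. `FeatureRegistry`, `RegistryEntry`, and the example entries are hypothetical. A real registry would persist this metadata rather than hold it in memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistryEntry:
    name: str
    kind: str          # "feature" or "model"
    description: str
    tags: tuple = ()


class FeatureRegistry:
    """Hypothetical in-memory registry of feature and model definitions."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        """Add or overwrite an entry, keyed by its name."""
        self._entries[entry.name] = entry

    def browse(self, kind=None):
        """List entry names alphabetically, optionally filtered by kind."""
        return sorted(
            e.name for e in self._entries.values() if kind in (None, e.kind)
        )

    def search(self, term):
        """Case-insensitive substring match on name, description, and tags."""
        term = term.lower()
        return sorted(
            e.name
            for e in self._entries.values()
            if term in " ".join((e.name, e.description, *e.tags)).lower()
        )


registry = FeatureRegistry()
registry.register(
    RegistryEntry("basket_size", "feature", "Items in the most recent order")
)
registry.register(
    RegistryEntry("days_since_last_order", "feature", "Recency of purchase", ("recency",))
)
# A model is registered through the same interface as any other pipeline output.
registry.register(
    RegistryEntry("churn_model", "model", "Predicts customer churn", ("retention",))
)
```

Here `registry.search("order")` matches both feature definitions, and `registry.browse("model")` lists only the registered model.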
10. Versioning and Monitoring
• Feature stores can assist with versioning and monitoring data science applications.
• Predictions are recorded through the feature store API, including the source data, the model used, the version of that model, and the rendered prediction.
• Recorded predictions can be compared with actual outcomes to determine the accuracy of the models.
• Models and versions are tracked and can be used to determine the lift provided by a particular instance of a model.
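The monitoring loop described above can be sketched as a prediction log that is later joined with observed outcomes. The record fields mirror the items listed on the slide; the function name, the sample data, and the definition of lift as a simple difference in accuracy between two versions are assumptions for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PredictionRecord:
    entity_id: str
    source_data: dict  # feature values the model saw
    model: str
    version: int
    prediction: int


def accuracy_by_version(records, actuals):
    """Fraction of correct predictions per (model, version) where outcomes are known."""
    hits, totals = defaultdict(int), defaultdict(int)
    for r in records:
        if r.entity_id not in actuals:
            continue  # outcome not observed yet, so it cannot be scored
        key = (r.model, r.version)
        totals[key] += 1
        hits[key] += int(r.prediction == actuals[r.entity_id])
    return {key: hits[key] / totals[key] for key in totals}


log = [
    PredictionRecord("c1", {"basket_size": 3.0}, "churn", 1, 1),
    PredictionRecord("c2", {"basket_size": 1.0}, "churn", 1, 0),
    PredictionRecord("c1", {"basket_size": 3.0}, "churn", 2, 1),
    PredictionRecord("c2", {"basket_size": 1.0}, "churn", 2, 1),
    PredictionRecord("c3", {"basket_size": 5.0}, "churn", 2, 0),
]
actuals = {"c1": 1, "c2": 1}  # c3 has no observed outcome yet

accuracy = accuracy_by_version(log, actuals)
# Lift of version 2 over version 1, taken here as the difference in accuracy.
lift = accuracy[("churn", 2)] - accuracy[("churn", 1)]
```

Because every record carries the model name and version, the same log supports comparing any two versions without re-running either model.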
11. Feature Store Implementations
• Open Source
– Gojek/Google Feast
• Product Offerings
– Logical Clocks Hopsworks
– Scribble Enrich
• Presentations Only
– Uber Michelangelo
– Airbnb Zipline
– SurveyMonkey ML Feature Store
– Netflix Metaflow