- LinkedIn developed Feathr to help manage and distribute features used in its machine learning technologies.
- According to LinkedIn, the feature store is presently being used to track hundreds of features used by the social media giant.
- LinkedIn released the code of the feature store under an Apache 2.0 license in April, making the open source feature store available to the general public for the first time.
- By donating the feature store, LinkedIn is handing governance of the open source project to the Linux Foundation’s LF AI & Data foundation, a move that should attract more users and contributors.
LinkedIn announced today that Feathr, its open source feature store, has joined LF AI & Data, the Linux Foundation’s umbrella foundation for big data and AI initiatives.
Feathr is used to improve the accuracy and performance of ML systems
LinkedIn created Feathr to help manage and serve the features used in its machine learning applications. Rather than handling features manually in a separate data pipeline, the feature store automates and standardizes how features are defined and accessed during both the training and inference stages of machine learning.
LinkedIn developed Feathr to improve the consistency, accuracy, and performance of its machine learning systems. Users define features once in a shared feature namespace and can then retrieve them “by name” from within ML workflows. This lets many ML systems reuse the same features, increasing efficiency and accuracy.
Feature stores can also provide a more repeatable mechanism for transforming source data into features (a capability not every feature store offers), and they improve serving speed at the inference stage by centralizing feature storage and serving.
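To make the define-once, fetch-by-name idea concrete, here is a minimal Python sketch. It is not Feathr's API: the `Feature` and `FeatureRegistry` classes, the feature names, and the sample member record are all hypothetical and exist only to illustrate how a shared namespace lets training and serving code reuse the same feature definitions.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Mapping


@dataclass(frozen=True)
class Feature:
    """A named feature plus the transformation that derives it from source data."""
    name: str
    transform: Callable[[Mapping[str, Any]], float]


class FeatureRegistry:
    """Hypothetical shared namespace: each feature is defined once, then fetched by name."""

    def __init__(self) -> None:
        self._features: Dict[str, Feature] = {}

    def register(self, feature: Feature) -> None:
        # Reject duplicate names so a feature has exactly one definition.
        if feature.name in self._features:
            raise ValueError(f"feature {feature.name!r} is already defined")
        self._features[feature.name] = feature

    def get_features(self, names: List[str], record: Mapping[str, Any]) -> Dict[str, float]:
        # Apply the registered transformation for each requested feature name.
        return {name: self._features[name].transform(record) for name in names}


registry = FeatureRegistry()
registry.register(Feature(
    "connections_per_year",
    lambda r: r["connections"] / max(r["account_age_years"], 1),
))
registry.register(Feature(
    "has_profile_photo",
    lambda r: 1.0 if r["photo_url"] else 0.0,
))

# Made-up source record, for illustration only.
member = {"connections": 300, "account_age_years": 4, "photo_url": "https://example.com/p.jpg"}

# A training job and a serving path request features by name and get identical logic.
training_row = registry.get_features(["connections_per_year", "has_profile_photo"], member)
serving_row = registry.get_features(["connections_per_year"], member)
```

Because both callers go through the same registry, a change to a feature's transformation is picked up by training and inference together, which is the consistency benefit described above.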
Feathr’s adoption has grown since its initial introduction in 2017. According to LinkedIn, the feature store currently tracks hundreds of features used by the social media giant.
Hangfei Lin, a LinkedIn data infrastructure engineer, wrote in a blog post, “It has reduced the engineering time required for adding and experimenting with new features from weeks to days. It’s also performed up to 50% faster than the custom feature processing pipelines that it replaced.”
This April, LinkedIn published the Feathr code under an Apache 2.0 license, allowing the general public to access the open source feature store for the first time. Lin notes that the project “has achieved substantial popularity among the machine learning operations (MLOps) community” since then and is now being used by enterprises across numerous sectors.
The donation will draw more contributors to the project
By donating Feathr to the Linux Foundation’s LF AI & Data foundation, LinkedIn is putting the open source project under the foundation’s stewardship, which should help draw more users and contributors to the project.
Dr. Ibrahim Haddad, General Manager of LF AI & Data, said in a press statement, “We’re excited to welcome Feathr to LF AI & Data and for it to be part of our technical project portfolio (41 projects and growing) with over 17K developers. We aim to support Feathr to expand its user base, grow its community of developers, become a leader within its own category, and enable collaboration and integration opportunities with other projects. We look forward to the project’s continued growth and success as part of LF AI & Data.”
Microsoft, which owns LinkedIn, is also involved in the Feathr story. Lin says that LinkedIn developers collaborated with Microsoft Azure colleagues to ensure Feathr works well on Azure and integrates with other Azure products and projects.
Machine learning makes life easier for data scientists
According to an Azure blog post, Feathr now supports Apache Spark, Jupyter, Azure Blob Storage, HDFS, Snowflake, Databricks Delta Tables, and SQL Server. Microsoft also took part in open-sourcing Feathr back in April. The feature store is currently an LF AI & Data sandbox project. Check out its GitHub page for further details.