Open Sourcing Featuretools

by Max Kanter, CEO | September 27, 2017

I created Deep Feature Synthesis two years ago while I was a student at MIT. My intention from the very beginning was to one day share that technology with the world. That day has finally come, and Featuretools is now available for anyone to use for free.

Open-sourcing Featuretools will help fill a gap in the ecosystem for building end-to-end machine learning systems. Great tools such as Pandas enable data prep and ad hoc feature engineering, and tools such as scikit-learn handle the machine learning itself. But until now, there has been no structured process for converting raw data into machine-learning-ready data.

Featuretools easily integrates with popular libraries

Even though the importance of feature engineering has been acknowledged for years, Featuretools is the first open source library for performing automated feature engineering on relational and transactional datasets. With this release, we are not only making it easier than ever for newcomers to learn machine learning, but also increasing the productivity of data scientists tenfold.

Open-sourcing Featuretools has been a long time in the making. At first, the code base needed more time to mature. Over time, priorities shifted to establishing and growing Feature Labs. However, with successful deployments and more than two years of testing, we are ready to share our work with the community.

The code is now available under the 3-Clause BSD License on GitHub. In this initial release, you will find:

  • Deep Feature Synthesis – an automated feature engineering algorithm
  • Feature Primitives – reusable feature engineering functions
  • Entity Sets – abstractions for representing structured datasets
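
To make these three pieces concrete, here is a minimal sketch in plain Python. It does not use the Featuretools API: the toy tables, the column names, the feature naming, and the helper synthesize_features are all hypothetical, chosen only for illustration. It also covers just one aggregation step across a single parent-child relationship, whereas Deep Feature Synthesis stacks primitives across the relationships in an entity set.

```python
from statistics import mean

# A toy "entity set": two related tables held as lists of dicts.
# Hypothetical data for illustration; this is NOT the Featuretools API.
customers = [{"customer_id": 1}, {"customer_id": 2}]
transactions = [
    {"transaction_id": 10, "customer_id": 1, "amount": 20.0},
    {"transaction_id": 11, "customer_id": 1, "amount": 40.0},
    {"transaction_id": 12, "customer_id": 2, "amount": 5.0},
]

# "Feature primitives": small reusable functions over a list of values.
PRIMITIVES = {"MEAN": mean, "MAX": max, "COUNT": len}


def synthesize_features(parents, children, child_name, key, value):
    """Apply every primitive to each parent's child rows along one relationship.

    Assumes every parent has at least one child row, since mean() and max()
    raise an error on an empty list.
    """
    rows = []
    for parent in parents:
        # Collect the child values that belong to this parent.
        values = [c[value] for c in children if c[key] == parent[key]]
        features = dict(parent)
        for name, primitive in PRIMITIVES.items():
            features[f"{name}({child_name}.{value})"] = primitive(values)
        rows.append(features)
    return rows


feature_matrix = synthesize_features(
    customers, transactions, "transactions", "customer_id", "amount"
)
for row in feature_matrix:
    print(row)
```

Each customer row gains one new column per primitive, so adding a primitive to the dictionary adds a feature for every customer without any new per-feature code. That reuse is the idea behind the three components above, shown here at a much smaller scale.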

Featuretools has been a labor of love for everyone here at Feature Labs because it combines our decades of experience as data scientists. We’re all excited because this is just the beginning of drastically improving the process of feature engineering in order to create better machine learning models.

And we can’t wait to see what’s in store for the future!

