Research

The minds behind Feature Labs are at the forefront of data science research. You can watch our videos and see a sampling of our peer-reviewed academic research below.

Videos

Towards increasing data scientists’ productivity by 1000x
Label, Segment, Featurize: a reusable framework for prediction engineering
Trane: automatically formulating and solving thousands of prediction problems

Papers

Deep Feature Synthesis: Towards automating data science endeavors

Authors: James Max Kanter, Kalyan Veeramachaneni
Published in: IEEE International Conference on Data Science and Advanced Analytics 2015

In this paper, we develop the Data Science Machine, which is able to derive predictive models from raw data automatically. To achieve this automation, we first propose and develop the Deep Feature Synthesis algorithm for automatically generating features for relational datasets. The algorithm follows relationships in the data to a base field, and then sequentially applies mathematical functions along that path to create the final feature. Second, we implement a generalizable machine learning pipeline and tune it using a novel Gaussian Copula process-based approach. We entered the Data Science Machine in 3 data science competitions…
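As a rough illustration of the idea of following relationships to a base field and stacking functions along the path, here is a minimal sketch in plain Python. The tables, column names, and the `aggregate` helper are hypothetical and chosen for illustration; this is not the paper's implementation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy relational data: customers -> orders -> items.
orders = [
    {"order_id": 1, "customer_id": "a"},
    {"order_id": 2, "customer_id": "a"},
    {"order_id": 3, "customer_id": "b"},
]
items = [
    {"order_id": 1, "price": 10.0},
    {"order_id": 1, "price": 5.0},
    {"order_id": 2, "price": 20.0},
    {"order_id": 3, "price": 7.0},
]


def aggregate(child_rows, parent_key, field, func):
    """Group child rows by their parent key and apply func to the field values."""
    groups = defaultdict(list)
    for row in child_rows:
        groups[row[parent_key]].append(row[field])
    return {key: func(values) for key, values in groups.items()}


# Depth 1: follow items -> orders and aggregate the base field "price".
order_total = aggregate(items, "order_id", "price", sum)

# Attach the derived feature to each order so the next level can use it.
orders_with_total = [
    {**order, "SUM(items.price)": order_total.get(order["order_id"], 0.0)}
    for order in orders
]

# Depth 2: follow orders -> customers and aggregate the derived feature.
customer_feature = aggregate(
    orders_with_total, "customer_id", "SUM(items.price)", mean
)
print(customer_feature)
```

The resulting per-customer value can be read as MEAN(orders.SUM(items.price)): one function applied at each step of the relationship path.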

Machine learning 2.0: Engineering data driven AI products

Authors: Max Kanter, Benjamin Schreck, Kalyan Veeramachaneni

In this paper, we propose a paradigm shift from the current practice of creating machine learning models, which requires months-long discovery, exploration and “feasibility report” generation, followed by re-engineering for deployment, in favor of a rapid 8-week-long process of development, understanding, validation and deployment that can be executed by developers or subject matter experts (non-ML experts) using reusable APIs. It accomplishes what we call a “minimum viable data-driven model,” delivering a ready-to-use machine learning model for problems that haven’t been solved before using machine learning…

What would a data scientist ask? Automatically formulating and solving prediction problems

Authors: Benjamin Schreck, Kalyan Veeramachaneni
Published in: IEEE International Conference on Data Science and Advanced Analytics 2016

In this paper, we designed a formal language, called Trane, for describing prediction problems over relational datasets, and implemented a system that allows data scientists to specify problems in that language. We show that this language is able to describe several prediction problems, including ones on Kaggle, a data science competition website. We express 29 different Kaggle problems in this language. We designed an interpreter, which translates input from the user, specified in this language, into a series of transformation and aggregation operations…
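To make the interpreter idea concrete, the sketch below turns a small declarative problem description into a transformation step followed by an aggregation step. The dictionary format, the operation names, and the `interpret` function are assumptions made for illustration; they are not Trane's actual syntax or API.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]

# Hypothetical operation tables; the names are illustrative, not Trane syntax.
TRANSFORMS: Dict[str, Callable[[float], float]] = {
    "identity": lambda x: x,
    "absolute": abs,
}
AGGREGATIONS: Dict[str, Callable[[List[float]], float]] = {
    "sum": sum,
    "max": max,
    "count": len,
}


def interpret(problem: Dict[str, str], rows: List[Row]) -> float:
    """Translate a declarative problem spec into transform + aggregate steps."""
    transform = TRANSFORMS[problem["transform"]]
    aggregate = AGGREGATIONS[problem["aggregation"]]
    values = [transform(row[problem["column"]]) for row in rows]
    return aggregate(values)


# Spec for "total absolute transaction amount" of one entity's rows.
problem = {"column": "amount", "transform": "absolute", "aggregation": "sum"}
rows = [{"amount": -4.0}, {"amount": 10.0}, {"amount": -1.5}]
label = interpret(problem, rows)
print(label)
```

Swapping the entries in `problem` yields a different prediction target from the same data, which is the sense in which a small language can enumerate many problems.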

Label, Segment, Featurize: a cross domain framework for prediction engineering

Authors: James Max Kanter, Owen Gillespie, Kalyan Veeramachaneni
Published in: IEEE International Conference on Data Science and Advanced Analytics 2016

In this paper, we introduce “prediction engineering” as a formal step in the predictive modeling process. We define a generalizable 3-part framework, Label, Segment, Featurize (L-S-F), to address the growing demand for predictive models. The framework provides abstractions for data scientists to customize the process to unique prediction problems. We describe how to apply the L-S-F framework to characteristic problems in 2 domains and demonstrate an implementation over 5 unique prediction problems…
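One plausible reading of the three steps, sketched on a toy purchase log: split each customer's history at a cutoff, compute the label from what happens after it, and compute features only from what happened before it. The event schema, the cutoff, and all function bodies here are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Event = Tuple[int, float]  # (day, purchase amount); hypothetical schema


def segment(events: List[Event], cutoff_day: int) -> Tuple[List[Event], List[Event]]:
    """Split events into data usable for learning and data used for labeling."""
    past = [e for e in events if e[0] < cutoff_day]
    future = [e for e in events if e[0] >= cutoff_day]
    return past, future


def label(future: List[Event]) -> bool:
    """Assumed prediction problem: does the customer purchase again?"""
    return len(future) > 0


def featurize(past: List[Event]) -> Dict[str, float]:
    """Compute features only from events before the cutoff."""
    return {
        "n_purchases": len(past),
        "total_spent": sum(amount for _, amount in past),
    }


events = [(1, 20.0), (3, 5.0), (9, 12.5)]
past, future = segment(events, cutoff_day=7)
example = (featurize(past), label(future))
print(example)
```

Keeping the three functions separate means a different prediction problem can reuse `segment` and `featurize` and replace only `label`.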

Want to push data science automation forward?

Reach out to careers@featurelabs.com

Get in touch

Feature Labs is changing the way companies create new machine learning products and services. We make a web app and developer API to automate time-intensive and error-prone parts of the data science process such as feature engineering. Our customers love our products because they make machine learning easier to use.

Follow us on