Atreya Biswas

Audience level:
3:45 p.m.–5:15 p.m.

Machine Learning Pipeline using Luigi and Scikit-Learn


Using Luigi and Scikit-Learn to create a Machine Learning Pipeline that trains a model and serves predictions through a REST API


A Machine Learning Pipeline can be broadly thought of as a sequence of tasks: Data Ingestion, Data Cleaning, Feature Extraction, Model Training, Hyperparameter Optimization, Model Evaluation and Model Deployment. Luigi is Spotify's open-source Python framework for batch data processing; it handles dependency resolution, workflow management, visualisation, failure handling and monitoring. Scikit-Learn is one of the most widely used Machine Learning libraries in Python. We will demonstrate how Luigi and Scikit-Learn can be used to orchestrate these tasks into a cohesive Machine Learning Pipeline.

