Author(s): FS Stance

Originally published on Towards AI.

One of the main goals of my home lab is to gain a deeper understanding of Machine Learning Operations (MLOps) and how to productionalize AI workflows. Generally speaking, MLOps and productionalization deal with moving AI models from research into a real-life environment, with automation and the ability to handle errors gracefully.

In my previous articles, I set up a PostgreSQL server and an Airflow server. Together, these form the data foundation for obtaining the datasets that will be used to train AI models. Now we need to start populating our PostgreSQL database with data. We can use Airflow to orchestrate data pipelines so that up-to-date data is loaded into the database. Setting up this data foundation is typically the first step in the machine learning (ML) process after planning, as you need data to train your models.

A major reason behind my home lab setup is that I want to show that you can self-host the whole ML process with a couple of VMs and containers. Since I have been doing a lot of personal investing lately, let’s work with finance data. With finance data, you can analyze trends, correlate prices, and even try forecasting, which makes it useful across many scenarios.