12-07, 19:00–21:00 (UTC), Data Track
As data practitioners, we often rely on upstream data engineering teams to deliver the right data for training ML models at scale. Deploying those models as data applications for downstream business users is often constrained by one's web development experience. Using Snowpark, you can build end-to-end data pipelines and data applications from scratch in Python.
In this talk, you will learn to build a Streamlit data application that visualizes the ROI of different advertising spend budgets for an example organization.
Setup Environment: Use stages and tables to ingest and organize raw data from S3 into Snowflake.
Data Engineering: Leverage Snowpark for Python DataFrames to perform data transformations such as group by, aggregate, pivot, and join to prep the data for downstream applications.
Data Pipelines: Use Snowflake Tasks to turn your data pipeline code into operational pipelines with integrated monitoring.
Machine Learning: Prepare data and run ML training in Snowflake using Snowpark ML, then deploy the model as a Snowpark user-defined function (UDF).
Streamlit Application: Build an interactive application using Python (no web development experience required) to help visualize the ROI of different advertising spend budgets.
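The data engineering step above runs group-by, aggregate, pivot, and join transformations on Snowpark DataFrames inside Snowflake. As a rough, hypothetical sketch of that same logic in plain Python (all column names and numbers below are invented for illustration; the session itself uses the Snowpark DataFrame API):

```python
# Illustrative sketch only: mimics the group-by/pivot/join ROI logic that the
# talk implements with Snowpark DataFrames. All data and names are made up.
from collections import defaultdict

# Raw spend rows as they might arrive from S3: (year, channel, spend).
spend_rows = [
    (2022, "search", 100.0),
    (2022, "social", 50.0),
    (2023, "search", 120.0),
    (2023, "social", 80.0),
]

# Revenue per year, as if joined in from a separate revenue table.
revenue = {2022: 450.0, 2023: 700.0}

def pivot_spend(rows):
    """Group by year and pivot channels into columns
    (Snowpark analogue: df.group_by(...).pivot(...).agg(...))."""
    table = defaultdict(dict)
    for year, channel, spend in rows:
        table[year][channel] = table[year].get(channel, 0.0) + spend
    return dict(table)

def roi_by_year(spend_table, revenue):
    """Join pivoted spend with revenue and compute ROI = revenue / total spend."""
    return {
        year: revenue[year] / sum(channels.values())
        for year, channels in spend_table.items()
        if year in revenue
    }

print(roi_by_year(pivot_spend(spend_rows), revenue))  # → {2022: 3.0, 2023: 3.5}
```

In the real pipeline these transformations execute pushed down in Snowflake rather than in local Python; the sketch only shows the shape of the computation.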
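For the data pipelines step, Snowflake Tasks are defined in SQL and can be submitted from Snowpark via session.sql(). A hedged sketch of what such a task definition might look like, held as a Python string (the warehouse, task, schedule, and procedure names are all invented for this example):

```python
# Hypothetical Snowflake Task DDL for scheduling the pipeline; in a Snowpark
# session you would execute it with session.sql(task_ddl).collect().
# All identifiers below are placeholders, not from the session materials.
task_ddl = """
CREATE OR REPLACE TASK campaign_spend_pipeline
  WAREHOUSE = demo_wh
  SCHEDULE  = 'USING CRON 0 2 * * * UTC'
AS
  CALL run_spend_transformations();
"""
```

Once created, the task runs the transformation procedure on the cron schedule, and task history in Snowflake provides the integrated monitoring the agenda mentions.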
Previous knowledge expected
Vino is a Developer Advocate at Snowflake, focusing on data engineering and LLM workloads. She started as a software engineer at NetApp, working on data management applications for NetApp data centers when on-prem data centers were still a cool thing. She then hopped into the cloud and big data world, landing on the data teams at Nike and Apple. There she worked mainly on batch processing workloads as a data engineer, built custom NLP models as an ML engineer, and even touched on MLOps for model deployments. When she is not working with data, you can find her doing yoga or strolling through Golden Gate Park and Ocean Beach.