12-08, 16:30–17:00 (UTC), General Track
How many fish are in the ocean? To help answer this efficiently, we are modernizing fisheries operations by building user-friendly, customizable Prefect workflows that support interoperable and scalable sonar data processing. We share our story to inform others who are considering how to bring modern orchestration tools to users with little technical experience.
Acoustic fisheries surveys collect terabytes of sonar data that require custom processing to obtain estimates of how species are distributed in the ocean. We will describe our experience using the Prefect orchestration framework to create Echoflow, a user-friendly Python package that allows fisheries scientists and data managers without distributed computing experience to execute complex data workflows on a variety of platforms (local and cloud) by editing existing recipes. Echoflow uses Dask under the hood and benefits from storing sonar data in a cloud-native data format. We will share how we addressed challenges around logging, handling rules and data validation, retry mechanisms, and credential passing. We hope this will serve as a guide for others who are starting to streamline their workflows with Prefect.
Audience: Anyone with a Python background who is interested in how Prefect workflows can facilitate the distributed processing of large datasets.
No previous knowledge expected
I'm Soham, a Data Science Master's student at the University of Washington. With three years of diverse experience at Deloitte, I've delved into software engineering, data engineering, and application security. I'm deeply passionate about Data Engineering and always eager to embrace new technologies. Beyond the screen and code, I find solace in the great outdoors; hiking is not just an activity for me but a way to rejuvenate my spirit. And when it comes to mental exercises, who can resist the allure of a thrilling game of chess? Looking forward to connecting and exploring the vast horizons of technology and beyond.