PyData Global 2023

Patrick Hoefler

Patrick Hoefler is a member of the pandas core team and a Dask maintainer. He currently works at Coiled, where he focuses on Dask development and the integration of a logical query planning layer into Dask. He holds an MSc in Mathematics and is working towards an MSc in Software Engineering at the University of Oxford.

Sessions

12-06
16:30
30min
Arrow revolution in pandas and Dask
Matthew Rocklin, Patrick Hoefler

The pandas library for data manipulation and data analysis is the most widely used open-source data science software library. Dask is the natural extension for scaling pandas workloads to more than a single machine. The continuing integration and adoption of Apache Arrow is easing long-standing performance bottlenecks in both libraries.

Data Track