Peaceful Sharing while Training Models

Pınar Tözün

Deep learning training is an expensive process that makes extensive use of GPUs. However, not every model training job saturates the resources of a single GPU. This underutilization is exacerbated with each new GPU generation, which offers more hardware resources. In this talk, we will first investigate methods to share GPU resources across model training jobs by collocating these jobs on the same GPU to improve hardware utilization. Then, we will explore work-sharing opportunities in the data pipelines of model training, which further increase the benefits of collocated training.

Pınar Tözün is an Associate Professor at the IT University of Copenhagen (ITU). Before ITU, she was a research staff member at IBM Almaden Research Center. Prior to joining IBM, she received her PhD from EPFL. Her thesis received an ACM SIGMOD Jim Gray Doctoral Dissertation Award Honorable Mention in 2016. Her research focuses on resource-aware machine learning, performance characterization of data-intensive systems, and scalability and efficiency of data-intensive systems on modern hardware.