Supercomputing and Big Data

given by Prof. Dr.-Ing. Morris Riedel

The fast training of traditional machine learning models and innovative deep learning networks on ever-growing quantities of scientific and engineering datasets (aka ‘Big Data’) requires high-performance computing (HPC) on modern supercomputers. HPC technologies such as those developed within the European DEEP-EST project provide innovative approaches with respect to processing, memory, and modular supercomputing usage during training, testing, and validation. This workshop therefore focuses on parallel and scalable machine learning driven by HPC and will pave the way for participants to use parallel processing on supercomputers as a key enabler for a wide variety of machine learning and deep learning algorithms used today. Examples include scientific and engineering applications that leverage traditional machine learning techniques such as scalable feature engineering, density-based spatial clustering of applications with noise (DBSCAN), and support vector machines (SVMs) with kernel methods. These traditional machine learning applications will also be compared with innovative deep learning models built with Keras and TensorFlow, taking advantage of convolutional neural networks (CNNs) for image datasets as well as long short-term memory (LSTM) networks for sequence data. While working with these concrete models, participants will further learn the required aspects of statistical learning theory and how to avoid overfitting in applications using various regularization and cross-validation techniques. The agenda is as follows:
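To give a flavour of the clustering session, the sketch below is a minimal, sequential DBSCAN in pure Python. It is not the parallel HPC implementation used in the workshop, only an illustration of the algorithm's core idea: points with at least `min_pts` neighbours within radius `eps` are core points, and clusters grow by expanding from core points; everything unreachable is labelled noise (`-1`). All names here (`dbscan`, `eps`, `min_pts`) are illustrative assumptions, not workshop code.

```python
from math import dist  # Euclidean distance (Python 3.8+)

def dbscan(points, eps, min_pts):
    """Return a cluster label (0, 1, ...) per point, or -1 for noise.

    Minimal sequential sketch: O(n^2) neighbour search, no spatial index.
    min_pts counts the point itself, as in the original DBSCAN paper.
    """
    labels = [None] * len(points)  # None = not yet visited
    cluster = -1

    def neighbors(i):
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # noise for now; may become a border point later
            continue
        cluster += 1  # i is a core point: start a new cluster and expand it
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:  # only core points expand further
                queue.extend(j_neighbors)
    return labels
```

The parallel variants discussed in the workshop distribute exactly this neighbourhood search and cluster-expansion work across many nodes.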

10:00 – 11:30 HPC Introduction & Parallel and Scalable Clustering using DBSCAN

11:30 – 12:00 Coffee Break

12:00 – 13:30 Parallel and Scalable Classification using SVMs with Applications

13:30 – 14:30 Lunch

14:30 – 16:00 Deep Learning using CNNs driven by HPC & GPUs

16:00 – 16:30 Coffee Break

16:30 – 17:30 Deep Learning using LSTMs driven by HPC & GPUs
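As a small taste of the classification session, the following is a minimal sketch of a linear SVM trained with subgradient descent on the L2-regularized hinge loss. It is a toy stand-in for the scalable kernel SVMs covered in the workshop, and every name in it (`train_linear_svm`, `lam`, `lr`) is an illustrative assumption rather than workshop code.

```python
import random

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200, seed=0):
    """Fit a linear SVM by stochastic subgradient descent.

    Minimizes  lam * ||w||^2 + mean(max(0, 1 - y_i * (w.x_i + b))).
    X: list of feature tuples; y: labels in {-1, +1}.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    idx = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(idx)
        for i in idx:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            if margin < 1:  # inside the margin: hinge loss is active
                w = [wj - lr * (2 * lam * wj - y[i] * xj)
                     for wj, xj in zip(w, X[i])]
                b += lr * y[i]
            else:           # correctly classified with margin: only shrink w
                w = [wj - lr * 2 * lam * wj for wj in w]
    return w, b

def predict(w, b, x):
    """Sign of the decision function w.x + b."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1
```

Kernel methods, as treated in the workshop, replace the inner products above with kernel evaluations to obtain non-linear decision boundaries; parallelizing that training is what makes HPC relevant at scale.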