HPC Summer School 2018

Course list

Getting started with the UNIX Shell

30 July @ 9 am - Chernoff Hall 211

This class serves as an introduction to Linux, the Unix-like operating system that runs on almost all high-performance computing systems. It is intended for users who have little or no experience with Unix or Linux. The focus is on the common bash shell. We cover material that helps users develop an understanding of the Linux command-line environment, which is necessary for successful use of Unix.

Introduction to High-Performance Computing

30 July @ 1 pm - Chernoff Hall 211

This tutorial is intended as an introduction to using high-performance computing resources provided by organizations like Compute Canada. This session will cover the essentials of using a compute cluster, including running jobs, visualization, and software management.

Analysis pipelines with Python

31 July @ 9 am - Chernoff Hall 213

Python is perhaps the most versatile programming language in existence, and sees widespread use in every field of modern computing. This tutorial focuses on Python for high-performance computing applications, and will include topics on performance optimization, parallel programming, and pipelining. The afternoon session will focus on using Python to (easily) write and scale massively parallel data analysis pipelines across a cluster.

Shared-memory programming with OpenMP

31 July @ 9 am - Chernoff Hall 211

The “multi-core revolution” has affected virtually every computer, from large SMP machines in research centres and banks down to smartphones with dual-core processors. To exploit the enhanced capabilities of such systems as a programmer, it is necessary to learn the basic principles of shared-memory parallel programming, also termed “multi-threading”. The use of multi-threading has the potential to speed up virtually any application, even on a single-core system, due to greater responsiveness and more efficient use of modern CPUs and memory.

Distributed-memory programming with MPI

1 August @ 9 am - Chernoff Hall 211

MPI (Message Passing Interface) is widely used for programming parallel computers ranging from shared-memory servers to large clusters. This workshop is directed at current or prospective users of parallel computers who want to significantly improve the performance of their programs by “parallelizing” their code for a wide range of platforms.

Cloud computing and MapReduce

1 August @ 9 am - Chernoff Hall 213

In this seminar, you will learn how cloud computing makes it possible to analyze Big Data by providing elastic resource management that accommodates an application’s processing needs. Then, you will be introduced to the MapReduce programming paradigm, which allows developing parallelized programs using very simple syntax. Finally, you will develop a simple Hadoop application on Microsoft’s Azure cloud.

High-performance computing with R

2 August @ 9 am - Chernoff Hall 213

The R programming language has become the standard tool for data science, statistics, and bioinformatics. This course focuses on making your R code as fast as possible, including topics on performance optimization and parallelization. There will be a major emphasis on newer additions to the language, in particular, the “tidyverse” set of packages.

Programming GPUs with CUDA

2 August @ 9 am - Chernoff Hall 211

This is an introductory course covering programming and computing on GPUs (graphics processing units), which are an increasingly common presence in massively parallel computing architectures. The basics of GPU programming will be covered, and students will work through a number of hands-on examples. How to structure data and computations to make full use of the GPU will be discussed in detail. The course also covers some new features available on the GPUs installed on Graham and Cedar. Students should leave the course with the knowledge necessary to begin developing their own GPU applications.

Bioinformatics workflows

3 August @ 9 am - Chernoff Hall 213

Introduction to Bioinformatics for High-Throughput Sequencing: Rapid advances in high-throughput sequencing (HTS) and other ‘omics’ methods are transforming the biological sciences, but the pace of development of new technologies can make them difficult to follow. This ‘crash course’ introduces new and emerging sequencing technologies and provides hands-on training with major data types and analysis pipelines with a focus on whole-genome assembly, transcriptome (RNA) sequencing, and DNA metabarcoding.