Spark for Big Data

Overview

Instructors: Rick McMullen, Jian Tao

Time: Friday, March 08, 1:30PM-4:00PM

Location: SCC 102.B

Prerequisites: Python, HPRC cluster account

This class introduces the Spark big data computing environment and shows how to use it on HPRC clusters.

Course Materials

Agenda

The course agenda will be available soon.

  • What Spark is and what it is good for
  • Using Spark on the Ada cluster through the OpenOnDemand portal
  • Running Jupyter+Spark
  • Learning some Spark basics with Jupyter notebooks