Spark for Big Data

Overview

Instructors: Rick McMullen, Jian Tao

Time: Friday, March 8, 2019 1:30PM-4:00PM CT

Location: SCC 102.B

Prerequisites: Current HPRC account, Python

This class introduces the Spark big data computing environment and shows how to use it on HPRC clusters.

Course Materials

Agenda

The detailed course agenda will be available soon. Topics include:

  • What Spark is and what it is good for
  • Using Spark on the Ada cluster through the OpenOnDemand portal
  • Running Jupyter+Spark
  • Learning some Spark basics with Jupyter notebooks