Spark for Big Data


Instructors: Rick McMullen, Jian Tao

Time: Friday, March 08, 1:30PM-4:00PM

Location: SCC 102.B

Prerequisites: Python, HPRC cluster account

This class introduces the Spark big data computing environment and shows how to use it on HPRC clusters.

Course Materials


The course agenda will be available soon.

  • What Spark is and what it is good for
  • Using Spark on the Ada cluster through the OpenOnDemand portal
  • Running Jupyter+Spark
  • Learning some Spark basics with Jupyter notebooks