According to former Google CEO Eric Schmidt, “every two days now we create as much information as we did from the dawn of civilization up until 2003”. The massive amounts of data at our disposal, together with exponential advances in the computing technology that lets us process it, have ushered in the era of “big data”. Businesses, the healthcare industry, and governments alike are keen on leveraging big data to improve their products and services, discover trends in disease proliferation, improve personal health, and pursue other goals. Big data is everywhere.
In this class we will learn about the key technologies underlying the processing, storage, and analysis of big data. We will begin with a review of core systems and principles in the areas of databases and distributed systems. We will then read about and discuss the most prominent big data technologies, such as Google’s GFS, Spark, the RAD stack (Kafka, Storm, Druid, ZooKeeper), and many others.
Students must have taken a course on data structures and algorithms and a course on computer systems (e.g., computer architecture or operating systems). A background in distributed systems and databases is recommended but not required.