Syllabuses - PG

CS988 - Big Data Tools and Techniques

Credits: 10
Level: 5
Semester: 1
Availability: Possible elective
Prerequisites: N/A
Learning Activities Breakdown
Lectures: 10 hours | Labs: 10 hours
Homework / Private Study: 80 hours
Items of Assessment: 1
Assessment: Exam
Lecturer: William Bell

Aims and Objectives

The objectives of the module are to:

  • Understand the challenges of manipulating large data samples, including storage, access and processing.
  • Be familiar with relational and NoSQL databases.
  • Be familiar with tools that are used to perform distributed data processing.
  • Be able to create a data processing pipeline.

Learning Outcomes

On completion of the module, students should have gained:

  • Familiarisation with relational and NoSQL databases, schema designs, and associated constraints.
  • Familiarisation with distributed file systems and processing tools.
  • Ability to create a data processing pipeline.
  • Ability to create a dashboard to display processed data.

Syllabus

  • Overview of storage solutions, including relational, NoSQL, distributed file systems and cloud solutions.
  • Distributed data processing using current data analysis tools, pipeline approaches and dashboards.

Recommended Reading

This list is indicative only – the class lecturer may recommend alternative reading material. Please do not purchase any of the reading material listed below until you have confirmed with the class lecturer that it will be used for this class.

Learning Spark: Lightning-Fast Data Analytics. Damji, J. et al., 2nd edition, O’Reilly Media, Inc., 2020.

Last updated: 2025-08-19 12:11:46