|Contact||Lectures: 20 hours | Labs: 20 hours | Homework / Private Study: 160 hours|
|Assessment||One individual assignment worth 50%, and a final 2-hour examination worth 50%.|
|Lecturer||Dr Joseph El-Gemayel | Dr Willam Bell|
The aim of this module is to endow students with:
- an understanding of the new challenges posed by the advent of big data, as they relate to its modelling, storage, and access;
- an understanding of the key algorithms and techniques which are embodied in data analytics solutions;
- an exposure to a number of different big data technologies and techniques, to show how they can achieve efficiency and scalability while also addressing design trade-offs and their impacts.
After completing this module participants will be able to:
- understand the fundamentals of Python to enable the use of various big data technologies;
- understand how classical statistical techniques are applied in modern data analysis;
- understand the potential application of data analysis tools for various problems and appreciate their limitations;
- be familiar with a number of different cloud NoSQL systems and their design and implementation, showing how they can achieve efficiency and scalability while also addressing design trade-offs and their impacts;
- be familiar with the Map-Reduce programming paradigm.
- Introduction to Python;
- Quantitative methods for data analysis and knowledge extraction, including classification, clustering, association rules, Bayesian approaches, and decision trees;
- Overview of various NoSQL cloud storage systems, including document stores such as MongoDB, column stores such as Cassandra, and graph databases such as Neo4j;
- Distributed data processing with Hadoop and MapReduce.
* This list is indicative only – the class lecturer may recommend alternative reading material. Please do not purchase any of the reading material listed below until you have confirmed with the class lecturer that it will be used for this class.