Big Data Management Systems and Tools
Online learning: Live recorded sessions
1,899 Canadian Dollars (including fees)
Upon successfully completing all the requirements of the course, you will receive a digital certificate.
Big Data involves massive data volumes and diverse data types. Modern organizations need people who can implement the tools required to manage these huge data sets.
In this course, you will learn the technology of big data and build in-demand skills. You will also get hands-on experience using up-to-date database management systems and tools. Get in on the explosion of new NoSQL technologies and big data tools including Hadoop, Spark, and Cassandra.
Receive a certificate from the University of Waterloo
Upon successful completion of this program, you will receive a professional education certificate from the University of Waterloo.
Who Should Enrol
- Business associates, operations managers, project managers, and intelligence analysts.
- Finance, securities, and insurance professionals.
- Digital marketing and communication specialists.
- Professionals from every level or industry who work with analytics or data.
What You Will Learn
Module | Topics |
---|---|
Module 1: Reliable and Scalable Data-Intensive Applications | Scaling to Big Data: Explore the challenges and strategies for scaling applications to handle large volumes of data. Reactive Design Framework: Understand the reactive design principles for building responsive and robust data systems. High Reliability and Scalability Strategies: Discover methods to achieve high reliability, scalability, and resilience in data applications. |
Module 2: Relational Databases | Relational Model: Understand the fundamentals of the relational model and its use in database systems. SQL Fundamentals: Learn SQL for querying and managing relational databases effectively. Operational Data Stores and Data Warehouses: Explore how operational data stores and data warehouses are used in big data environments. |
Module 3: NoSQL | Key-Value Stores: Learn about key-value stores and their applications in NoSQL databases. Column-Oriented Databases: Explore column-oriented databases and their advantages for big data analytics. Document-Oriented Databases: Understand document-oriented databases for managing semi-structured data. Object, Graph, and Triple Stores: Discover the roles of object, graph, and triple stores in handling complex data relationships. MongoDB: Get practical experience with MongoDB, a popular NoSQL database. |
Module 4: Distributed Datastores | Distributed Filesystems: Learn the principles of distributed filesystems for managing large-scale data. Scalability Challenges: Understand the challenges of scaling relational databases in distributed environments. CAP Theorem: Explore the CAP theorem and its implications for designing distributed systems. Replication and Partitioning: Learn techniques for data replication and partitioning in distributed architectures. Hadoop Distributed Filesystem (HDFS): Understand the structure and operation of the Hadoop Distributed Filesystem. |
Module 5: Introduction to Spark | Setting Up Spark: Learn how to set up Apache Spark for distributed data processing. Spark Datasets and DataFrames: Explore how to use Spark Datasets and DataFrames for efficient data handling. |
Module 6: Analytics with Spark | Spark Actions: Understand different actions in Spark for executing data processing tasks. Spark Transformations: Learn about various transformations in Spark for data manipulation. |
Module 7: Spark Streaming | Streaming Data Concepts: Understand the unique characteristics of streaming data. Stream Processing: Learn the fundamentals of stream processing and its applications. Spark Stream Processing: Explore how to use Spark for real-time stream processing. |
Module 8: Data in Motion | Metadata Management: Learn about the importance of metadata in data management. Data Transfer Patterns: Understand different patterns of data transfer in big data environments. Data Serialization Formats: Discover formats for serializing and transmitting data efficiently. |
Module 9: Big Data Architecture | Lambda Architecture: Explore the principles of Lambda architecture for combining batch and real-time data processing. Lakehouses: Understand the concept of lakehouses and their role in big data architectures. Kafka and Real-Time Analytics: Learn how Kafka supports real-time data analytics. |
Module 10: Cloud Analytics | Cloud Platforms Overview: Explore cloud platforms such as Amazon, Azure, and Google for big data analytics. Cloud Analytics Tools: Learn about various tools and services offered by these cloud providers for managing big data. |
Module 11: MLOps | Introduction to MLOps: Understand the fundamentals of MLOps for machine learning model management. Continuous Deployment: Learn best practices for continuous deployment in machine learning workflows. Tools for MLOps: Explore different tools used in MLOps to streamline the model development lifecycle. |
Course Overview
Weekly webinars review key concepts and provide an opportunity for live interaction and Q&A with the instructors. The webinars are recorded, so if you are unable to attend, you can watch the recording at a time that is convenient for you.
By the end of the course, you should be able to:
- Understand the architecture of reliable big data systems.
- Describe how big data systems differ from traditional systems.
- Use several NoSQL database management systems.
- Address the many challenges of working with data at scale.
- Use tools such as MongoDB and Spark to process large datasets.