Course Content
KM-01: Overview of Artificial Intelligence
This module introduces learners to the fundamental concepts of Artificial Intelligence (AI) and its growing role in modern technology, business, and society. Learners will explore the evolution of AI, key definitions, and different types of artificial intelligence, as well as related fields such as machine learning, deep learning, neural networks, data science, automation, and robotics. The module also examines how AI is applied in real-world environments, including industries such as healthcare, finance, agriculture, manufacturing, and digital services. In addition, learners will understand the strategic advantages of AI in business, including automation, improved decision-making, and increased productivity. By the end of the module, learners will have a foundational understanding of AI technologies, their applications, and their impact on the Fourth Industrial Revolution (4IR). This knowledge prepares learners for further study and practical skills development within the Artificial Intelligence Software Developer qualification at NQF Level 4.
KM-02: Introduction to Mathematics and Statistics for Artificial Intelligence
This module introduces learners to the essential mathematical and statistical concepts required for understanding Artificial Intelligence, Machine Learning, Deep Learning, and Data Analytics. It provides foundational knowledge in areas such as basic mathematics, linear algebra, binary number systems, scientific notation, probability, and statistics. Learners will explore how mathematical principles are used to represent data, perform calculations, and analyze patterns in AI systems. The module also develops problem-solving skills through practical applications including coordinate systems, matrix operations, and probability models used in modern AI technologies.
KM-03: Analytical Thinking and Problem Solving
This module focuses on developing the learner’s ability to analyse problems logically and design structured solutions. Learners are introduced to analytical thinking techniques, critical thinking skills, and problem-solving methods used in artificial intelligence development. The module teaches how to break down complex problems, evaluate possible solutions, and apply structured reasoning when designing AI-based systems. By the end of the module, learners will understand how to approach real-world problems systematically and use analytical tools such as decision trees and critical thinking methods to support AI problem solving.
KM-04: Data, Databases and Data Visualisation
This module introduces learners to the fundamental concepts of data, database systems, and data visualisation, which are essential components in modern artificial intelligence and data-driven technologies. The module focuses on helping learners understand how data is collected, processed, analysed, stored, and transformed into meaningful insights for decision-making. Learners begin by exploring the value of data and the role of data analysis, including how reliable data sources are identified and how raw data is refined by handling missing values, correcting misalignments, and eliminating irrelevant information. The module also explains common flaws and limitations in data collection, such as bias, omission, and errors that may affect the quality and reliability of data. The module then moves into practical data handling using spreadsheets, where learners study techniques for analysing and presenting data. This includes creating reports, sorting and filtering datasets, using pivot tables and dashboards, importing data from files and databases, and visualising results using charts and analytical tools. Learners are also introduced to databases and Structured Query Language (SQL), which allow large volumes of data to be stored, managed, and retrieved efficiently. In addition, the module explores data mining techniques used to identify patterns and relationships within datasets. Finally, the module highlights the importance of data visualisation and data security, teaching learners how to present information clearly using AI-assisted tools while ensuring that sensitive information is protected from misuse or unauthorized access. Overall, this module equips learners with the knowledge required to manage data effectively, perform analysis, create meaningful visualisations, and maintain data integrity and security, which are critical skills for professionals working in artificial intelligence, data science, and software development environments.
KM-05: Computing Theory
computational thinking. Programming is the process of writing instructions that tell a computer how to perform tasks. These instructions are written using programming languages such as Python, Java, or C++. In this module learners will develop an understanding of how computers interpret instructions, how algorithms are used to solve problems, and how basic programming structures work. The module also introduces the core principles of software development and provides an entry-level understanding of Python programming. By the end of the module learners will understand how software systems are designed, how algorithms are created to solve problems, and how programming languages are used to build modern digital solutions including artificial intelligence systems. The module covers the following key topics: introduction to programming languages, introduction to algorithms, programming basics, solution development, and introduction to Python. These concepts provide the theoretical foundation needed before learners begin writing real programs in practical learning modules.
KM-06: Introduction to Artificial Intelligence, Machine Learning, Deep Learning
The main focus of the learning in this knowledge module is to build an understanding of the relationship between Artificial Intelligence, Machine Learning and Deep Learning, as well as the application of such systems to create a set of instructions to perform a programming task. Learners will explore how AI technologies are used across industries such as healthcare, finance, education, and automation. The module also introduces ethical considerations, responsible AI use, and the impact of AI on society and employment. By the end of this module, learners will understand how artificial intelligence systems work, the different types of AI technologies, and how these technologies are applied in modern software development environments.
KM-07: Artificial Intelligence Frameworks and Data Scraping
This module introduces learners to Artificial Intelligence frameworks and their role in developing intelligent systems. Learners will explore how frameworks such as TensorFlow, Keras, PyTorch and IBM Watson help developers design, train and deploy AI models efficiently. The module also introduces the concept of data scraping, explaining how AI technologies can be used to collect and extract information from websites. Learners will understand the tools, procedures, and legal considerations involved in web scraping and how this data can be used for analytics and decision-making. By the end of the module, learners will understand the structure of AI frameworks, their advantages, practical applications, and how AI techniques can be used to automate data extraction processes.
KM-08: Machine Learning
The main focus of this knowledge module is to build an understanding of the relationship between Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning, as well as the application of machine learning to create a set of instructions that can perform programming tasks. This module introduces learners to the types of machine learning models, machine learning algorithm classifications, common machine learning algorithms, and the machine learning workflow process used to develop intelligent systems. Learners will also explore how machine learning can support business decision-making and improve business performance. The module further explains how machine learning systems use data, features, and labels to identify patterns, make predictions, and automate tasks. By understanding these concepts, learners will gain the foundational knowledge required to work with machine learning technologies and apply them in real-world applications and business environments.
KM-09: Deep Learning (DL)
This module introduces learners to the concept of Deep Learning, an advanced area of Artificial Intelligence that builds on Machine Learning techniques to create intelligent systems capable of learning complex patterns from large datasets. The module focuses on understanding the relationship between Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) and how deep learning technologies are used to develop intelligent applications. Learners will explore how neural networks are structured and how they function, including the roles of input layers, hidden layers, and output layers in deep learning systems. The module also introduces different neural network architectures such as convolutional neural networks, recurrent neural networks, and recursive neural networks, which are widely used in fields such as computer vision, natural language processing, and speech recognition. In addition, the module covers activation functions used in deep learning models, including functions such as Sigmoid, Tanh, Softmax, and ReLU. Learners will also study how deep learning networks are built, trained, and tuned to improve performance. These concepts help developers design more accurate and efficient models for solving complex computational problems. The module further introduces advanced Python concepts for deep learning, including decorators, context managers, exception handling, and Python package management. These programming techniques are important for developing scalable deep learning applications. Finally, learners will explore TensorFlow and Keras, two of the most widely used frameworks for deep learning development. These tools allow developers to build, train, and deploy neural networks efficiently using modern machine learning libraries and APIs. 
By the end of this module, learners will understand the core concepts of deep learning, neural network architecture, advanced Python programming for AI development, and the use of TensorFlow and Keras to build deep learning models.
KM-10: Introduction to Governance, Legislation and Ethics
This module introduces learners to the principles of governance, legislation, ethics, workplace security, and business practices that influence organisations and employees. The module focuses on understanding how legal frameworks and ethical standards guide behaviour in the workplace and ensure accountability, transparency, and responsible decision-making. Learners will explore important workplace legislation such as the Labour Relations Act (LRA), the Protection of Personal Information Act (POPIA), and other regulatory frameworks that affect employees and employers. The module also introduces key ethical principles, including professional conduct, fairness, honesty, and accountability in professional environments. In addition, the module examines workplace security, performance management, business planning, and costing concepts that influence organisational efficiency and sustainability. By the end of the module, learners will understand how governance, ethics, legislation, and management practices contribute to a responsible and productive workplace environment.
KM-11: Fundamentals of Design Thinking and Innovation
This module introduces learners to the principles of design thinking, creativity, and innovation in the workplace. It focuses on solving problems using a human-centered approach, where user needs are prioritised through observation, empathy, and iterative development. Learners will explore key concepts such as design thinking methodology, creativity, innovation types, and application in real-world environments, including software development and business. The module also highlights how organisations use design thinking to improve products, processes, and services while fostering innovation. By the end of this module, learners will understand how to apply design thinking to solve complex problems and drive innovation effectively in the workplace.
KM-12: Fundamentals of Research and Information Analysis
This module focuses on developing an understanding of research principles, information gathering, and data analysis techniques. It equips learners with the ability to collect, evaluate, interpret, and apply information effectively in problem-solving and decision-making contexts.
Artificial Intelligence Software Developer

Lesson Overview

Before data can be analysed effectively, it must first be cleaned and prepared. Raw data collected from different sources often contains errors, missing values, duplicate entries, or inconsistent formatting. These issues can affect the accuracy of analysis and may lead to incorrect conclusions if they are not corrected.

Data cleaning and preparation is the process of identifying problems in a dataset and correcting them so that the data becomes reliable and ready for analysis. This step is one of the most important stages in data analysis because poor data quality can lead to misleading results.

In this lesson, learners will explore the importance of data cleaning, the most common data quality problems, and techniques used to prepare data before analysis.

1. What is Data Cleaning?

Data cleaning refers to the process of identifying and correcting errors in a dataset. The goal is to improve the accuracy, completeness, and consistency of the data.

Data cleaning may involve several tasks, including removing duplicate records, correcting incorrect entries, fixing inconsistent formatting, and addressing missing values.

For example, if a dataset contains customer information, the analyst must ensure that names, phone numbers, and addresses are entered consistently and correctly. If some entries contain mistakes or incomplete information, those issues must be corrected before the data can be analysed.

Data cleaning helps ensure that the dataset reflects the true information that the organization intends to analyse.

2. Importance of Data Cleaning

Data cleaning is essential because data analysis depends on accurate and reliable data. If the dataset contains errors, the results of the analysis may be incorrect or misleading.

Poor data quality can cause several problems. Organizations may make incorrect decisions, analysts may misinterpret trends, and reports may become unreliable.

High-quality data allows analysts to produce accurate insights and reliable reports. It also improves efficiency because analysts spend less time correcting errors during the analysis process.

For this reason, many data analysts spend a significant portion of their time preparing and cleaning data before performing any analysis.

3. Common Data Quality Issues

When working with datasets, analysts often encounter several common problems that affect data quality.

One common issue is missing data. This occurs when certain information is not recorded in the dataset. For example, a dataset containing employee information may include names and salaries but have missing age values for some employees. Missing data can occur due to human error, system failures, or incomplete data collection.
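
A quick way to see how much data is missing is to count the empty values in each field. The sketch below uses only the Python standard library. The employee records and the function name count_missing are made up for illustration, and it assumes a missing value is stored as None or as an empty string.

```python
# Hypothetical employee records; None marks a value that was never recorded.
employees = [
    {"name": "Thandi", "salary": 25000, "age": 31},
    {"name": "Sipho", "salary": 31000, "age": None},
    {"name": "Lerato", "salary": 28000, "age": 45},
    {"name": "Johan", "salary": None, "age": None},
]


def count_missing(records):
    """Return a dict mapping each field name to its number of missing values."""
    counts = {}
    for record in records:
        for field, value in record.items():
            # Assumption: None or an empty string means "not recorded".
            if value is None or value == "":
                counts[field] = counts.get(field, 0) + 1
            else:
                counts.setdefault(field, 0)
    return counts


missing = count_missing(employees)
print(missing)
```

A count like this helps the analyst decide which fields need attention before any analysis is done.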

Another issue is duplicate data. Duplicate records occur when the same information appears more than once in the dataset. For example, a customer record might be entered twice by mistake. Duplicate records can cause incorrect calculations, especially when totals or averages are calculated.
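
The effect of a duplicate on a total can be shown with a small example. This is a minimal sketch with invented customer records. It treats two records as duplicates only when every field matches exactly, which is an assumption; real datasets may need a looser rule, such as matching on a customer ID alone.

```python
# Hypothetical customer purchases; the first record was entered twice by mistake.
customers = [
    {"customer_id": 101, "name": "A. Naidoo", "purchase": 500.0},
    {"customer_id": 102, "name": "B. Mokoena", "purchase": 300.0},
    {"customer_id": 101, "name": "A. Naidoo", "purchase": 500.0},
]


def remove_duplicates(records):
    """Keep the first occurrence of each identical record, preserving order."""
    seen = set()
    unique = []
    for record in records:
        # Turn the record into a hashable key so it can be stored in a set.
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


total_before = sum(c["purchase"] for c in customers)
deduplicated = remove_duplicates(customers)
total_after = sum(c["purchase"] for c in deduplicated)

print("Total with duplicate:", total_before)
print("Total after removing duplicate:", total_after)
```

The total drops once the repeated record is removed, which shows how a single duplicate can distort a calculation.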

Inconsistent data formatting is another common problem. Data may be entered in different formats, making it difficult to analyse. For instance, dates might appear in multiple formats such as day-month-year or year-month-day. Standardizing the format helps maintain consistency.
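
Dates in mixed formats can be converted to one standard format with Python's datetime module. The sketch below assumes that only two formats occur in the data (day-month-year and year-month-day, both with four-digit years) and converts everything to year-month-day. The function name standardize_date and the sample values are illustrative.

```python
from datetime import datetime

# Assumption: these are the only two formats present in the dataset.
KNOWN_FORMATS = ["%d-%m-%Y", "%Y-%m-%d"]


def standardize_date(text):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            parsed = datetime.strptime(text.strip(), fmt)
            return parsed.strftime("%Y-%m-%d")
        except ValueError:
            # This format did not match; try the next one.
            continue
    return None


raw_dates = ["25-12-2023", "2024-01-15", " 07-03-2024 ", "not a date"]
clean_dates = [standardize_date(d) for d in raw_dates]
print(clean_dates)
```

Values that cannot be parsed are returned as None so that they can be reviewed by hand instead of being guessed.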

Incorrect data entries are also common. Sometimes data contains unrealistic or impossible values. For example, if a dataset records a person’s age as 300 years, the value is clearly incorrect and must be corrected or removed.
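
Unrealistic values can often be caught with a simple range check. In this sketch the limits of 0 and 120 years are assumed bounds chosen for illustration, not a fixed standard, and the records are invented.

```python
# Assumed plausible bounds for a person's age in years.
MIN_AGE = 0
MAX_AGE = 120

# Hypothetical records, two of which contain impossible ages.
people = [
    {"name": "Ayesha", "age": 34},
    {"name": "Pieter", "age": 300},
    {"name": "Zanele", "age": -5},
]


def split_by_valid_age(records):
    """Separate records with a plausible age from those that need review."""
    valid = []
    suspect = []
    for record in records:
        if MIN_AGE <= record["age"] <= MAX_AGE:
            valid.append(record)
        else:
            suspect.append(record)
    return valid, suspect


valid_people, suspect_people = split_by_valid_age(people)
print("Valid:", [p["name"] for p in valid_people])
print("Needs review:", [p["name"] for p in suspect_people])
```

Keeping the suspect records in a separate list, rather than deleting them straight away, lets the analyst decide whether each value should be corrected or removed.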

Finally, datasets sometimes contain irrelevant data that is not needed for the analysis being performed. Removing unnecessary information can simplify the dataset and improve efficiency.

4. Data Preparation

After cleaning the data, the next step is data preparation. Data preparation involves organizing the dataset so that it is ready for analysis.

During this stage, analysts may sort data, group related information together, convert data types, or create new variables that will help with analysis.

For example, a dataset may contain sales information recorded in different currencies. To analyse the data properly, the analyst may convert all the values into the same currency so that the information can be compared accurately.
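
The currency example can be sketched in Python as follows. The exchange rates are invented for illustration and are not real market rates, and the choice of South African rand (ZAR) as the common currency is also an assumption. The converted amount is stored as a new variable, amount_zar, so the original value is kept.

```python
# Illustrative rates only (value of 1 unit in ZAR); not real exchange rates.
RATES_TO_ZAR = {"ZAR": 1.0, "USD": 18.0, "EUR": 20.0}

# Hypothetical sales recorded in different currencies.
sales = [
    {"order": 1, "amount": 100.0, "currency": "USD"},
    {"order": 2, "amount": 50.0, "currency": "EUR"},
    {"order": 3, "amount": 2500.0, "currency": "ZAR"},
]


def add_amount_in_zar(records):
    """Return new records with an extra 'amount_zar' field added."""
    converted = []
    for record in records:
        rate = RATES_TO_ZAR[record["currency"]]
        new_record = dict(record)  # copy so the original data is unchanged
        new_record["amount_zar"] = round(record["amount"] * rate, 2)
        converted.append(new_record)
    return converted


prepared_sales = add_amount_in_zar(sales)
total_zar = sum(s["amount_zar"] for s in prepared_sales)
print(total_zar)
```

Once every amount is in the same currency, the values can be added together and compared directly.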

Data preparation ensures that the dataset is structured in a way that supports meaningful analysis.

5. Tools Used for Data Cleaning

Various tools can be used to clean and prepare data. Many organizations use spreadsheet software such as Microsoft Excel or Google Sheets to perform basic data cleaning tasks. These tools allow users to filter data, remove duplicates, correct entries, and standardize formats.

More advanced tools such as Python, the R programming language, and SQL databases are often used when working with very large datasets. These tools allow analysts to automate data cleaning tasks and process large volumes of data efficiently.

The choice of tool depends on the size of the dataset and the complexity of the analysis being performed.

6. Steps in Data Cleaning

The data cleaning process usually follows several key steps.

The first step is inspecting the data. Analysts carefully examine the dataset to identify any errors, missing values, or inconsistencies.

The second step is removing duplicate records. Duplicate entries are identified and removed so that the dataset contains only unique records.

The third step involves handling missing values. Analysts may choose to remove records with missing data, estimate the missing values, or replace them with appropriate information.

The fourth step is standardizing formats. This ensures that data is recorded consistently across the entire dataset.

The fifth step involves correcting errors. Incorrect values or unrealistic entries are fixed or removed.

The final step is validating the data. Analysts verify that the cleaned dataset is accurate and complete before proceeding with analysis.
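
Two of the options for handling missing values, removing records and estimating the missing value, can be compared in a short sketch, followed by a simple validation check. Using the average (mean) of the known values as the estimate is one possible choice assumed here; it is not the only method. The records and function names are illustrative.

```python
from statistics import mean

# Hypothetical records with one missing age.
records = [
    {"name": "Thandi", "age": 30},
    {"name": "Sipho", "age": None},
    {"name": "Lerato", "age": 40},
]


def drop_missing(rows, field):
    """Option 1: remove every record where the field is missing."""
    return [r for r in rows if r[field] is not None]


def fill_with_mean(rows, field):
    """Option 2: replace missing values with the mean of the known values."""
    known = [r[field] for r in rows if r[field] is not None]
    fill_value = mean(known)
    return [
        {**r, field: fill_value if r[field] is None else r[field]}
        for r in rows
    ]


def is_complete(rows, field):
    """Validation: True only if no record is missing the field."""
    return all(r[field] is not None for r in rows)


dropped = drop_missing(records, "age")
filled = fill_with_mean(records, "age")

print(len(dropped), "records remain after dropping")
print("Filled age for Sipho:", filled[1]["age"])
print("Complete after filling:", is_complete(filled, "age"))
```

Dropping records keeps only confirmed values but reduces the size of the dataset, while filling keeps every record but introduces an estimate. The better choice depends on how much data is missing and how the field will be used.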

Lesson Summary

Data cleaning and preparation are essential steps in the data analysis process. Raw data often contains errors, missing values, duplicate records, and inconsistent formatting that must be corrected before analysis can begin.

Data cleaning involves identifying these issues and correcting them to improve the quality and reliability of the dataset. Common problems include missing data, duplicate entries, inconsistent formats, and incorrect values.

Once the data has been cleaned, it must be prepared for analysis by organizing and structuring the dataset appropriately.

By ensuring that data is accurate and consistent, analysts can perform reliable analysis and support better decision-making within organizations.
