
Learner assessment techniques and algorithms used in a learning context

Ikram Gagaoua

Applied to learning, assessment gives an overview of a learner's level of proficiency or knowledge in a given skill. It is an essential step in learning: it gives both the learner and the trainer feedback on the skills already acquired and the progress still to be made. It also makes remediation of the learner's weaknesses or gaps more precise, and therefore more beneficial. With the evolution of e-learning, many assessment techniques and algorithms have appeared that are increasingly accurate and provide reliable and valid data. In this article, we present these new ways of evaluating a learner's level of proficiency in target skills using artificial intelligence algorithms, so that the approach can be industrialized at the scale of groups, organizations or schools.

Positioning, to assess the current state of knowledge

Positioning is the entry point into the learning path. It is generally carried out at the beginning of the school or academic year, or, in a corporate environment, to evaluate whether training is needed. The purpose of positioning is to guide the learner on a path towards a well-defined objective. It also helps to identify learners' weaknesses in order to offer them a personalized learning experience in which they do not need to revisit concepts that have already been acquired and consolidated. There are many ways to position a learner; two of them, self-positioning and computerized adaptive testing (CAT), are detailed below.

Self-positioning

Self-positioning is an assessment concept in which learners try to determine for themselves their gaps and acquired skills in a given subject. This allows them to take a step back from their learning paths and promotes responsibility and independence, two factors that encourage engagement in the learning process. However, it is important to note that self-positioning should be combined with another type of positioning to avoid biases that could distort the assessment. With impostor syndrome, learners tend to underestimate their skills; conversely, with the Dunning-Kruger effect (see Source1), they tend to overestimate them. In both cases the results are altered.

CAT, computerized adaptive test

Computerized Adaptive Testing (CAT) is a form of assessment that adapts to the individual. It is a computer-administered test in which the next question, or set of questions, offered to learners depends on the accuracy of their answers to previous questions. It is an iterative process. It starts from an initial estimate of proficiency, which is used to select the question best suited to the learner's estimated level at that point. The learner answers the question, and a new estimate of proficiency is calculated from the answer. These steps repeat until the estimate reaches the desired level of certainty.
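
To make this loop concrete, here is a minimal sketch in Python. It rests on several assumptions that are ours and not part of any particular CAT product: questions are described by a single difficulty value (a Rasch-type model), proficiency is estimated on a grid of candidate values with a standard normal prior, and the next question is simply the unused one whose difficulty is closest to the current estimate. The names run_cat, answer_fn and se_target are hypothetical.

```python
import math

def p_correct(theta, b):
    """Rasch-type probability that a learner of proficiency theta
    answers a question of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def run_cat(item_bank, answer_fn, se_target=0.6, max_items=20):
    """Run one adaptive test.

    item_bank: dict mapping a question id to its difficulty (assumed known).
    answer_fn: callable (question_id, difficulty) -> True/False, the learner's answer.
    Returns (proficiency estimate, uncertainty, list of question ids asked).
    """
    grid = [g / 10.0 for g in range(-40, 41)]        # candidate proficiencies, -4 to 4
    posterior = [math.exp(-0.5 * t * t) for t in grid]  # assumed standard normal prior
    remaining = dict(item_bank)
    asked = []
    theta, se = 0.0, float("inf")                    # initial estimate of proficiency
    while remaining and len(asked) < max_items:
        # 1. Pick the unused question closest to the current estimate.
        item = min(remaining, key=lambda k: abs(remaining[k] - theta))
        b = remaining.pop(item)
        # 2. The learner answers.
        correct = answer_fn(item, b)
        asked.append(item)
        # 3. Update the estimate with Bayes' rule.
        posterior = [
            w * (p_correct(t, b) if correct else 1.0 - p_correct(t, b))
            for w, t in zip(posterior, grid)
        ]
        total = sum(posterior)
        posterior = [w / total for w in posterior]
        theta = sum(w * t for w, t in zip(posterior, grid))
        se = math.sqrt(sum(w * (t - theta) ** 2 for w, t in zip(posterior, grid)))
        # 4. Stop once the estimate is certain enough.
        if se <= se_target:
            break
    return theta, se, asked

# Illustrative bank of 25 questions with difficulties from -3 to 3.
bank = {f"q{i}": -3 + 0.25 * i for i in range(25)}
# Simulated learner who succeeds on every question of difficulty 1.0 or less.
theta, se, asked = run_cat(bank, lambda item, b: b <= 1.0)
```

The stopping threshold and the selection rule are the two design choices that matter most here: a lower se_target gives a more certain estimate at the cost of a longer test.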

CAT overcomes the limits of classical linear assessment, which produces an overall score in the form of a ratio of correct answers. This score ignores many factors, such as the difficulty of each question or the learner's real proficiency in the subject matter. A standardized test of this kind therefore cannot determine the learner's true proficiency. CAT has several advantages over the traditional linear method. Learners with a high level of proficiency are not required to answer all the easy items of a test; they only answer the items that really give information about their skills. Conversely, learners with lower levels of proficiency are only asked easier questions adapted to their level. CAT therefore allows a more accurate measurement of learners' ability without wasting their time.

To better illustrate this, let's take the example of a non-adaptive assessment in which the same set of twenty questions is administered to two different learners. The first learner answers fourteen questions correctly: eight easy, four of medium difficulty and two difficult. The second learner also answers fourteen questions correctly, but four are easy, six medium and four difficult. These two individuals obtained the same final score, but was this test a true measure of their ability? With a CAT assessment, the difficulty of each question is taken into account when determining the final score, so two individuals who answered the same number of questions correctly will not necessarily have the same score.
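
The arithmetic behind this example can be sketched as follows. This is a deliberately simplified illustration: the weights of 1, 2 and 3 points per difficulty level are our own assumption, and a real CAT would normally derive its score from a statistical model of the questions rather than from fixed weights.

```python
# Hypothetical difficulty weights, chosen only for illustration.
WEIGHTS = {"easy": 1, "medium": 2, "hard": 3}

def raw_score(correct_counts):
    """Classical score: the number of correct answers, whatever their difficulty."""
    return sum(correct_counts.values())

def weighted_score(correct_counts, weights=WEIGHTS):
    """Score in which each correct answer counts according to its difficulty."""
    return sum(weights[level] * n for level, n in correct_counts.items())

# Correct answers per difficulty level, taken from the example above.
learner_1 = {"easy": 8, "medium": 4, "hard": 2}
learner_2 = {"easy": 4, "medium": 6, "hard": 4}

print(raw_score(learner_1), raw_score(learner_2))            # 14 14
print(weighted_score(learner_1), weighted_score(learner_2))  # 22 28
```

Both learners have a raw score of 14, but once difficulty is weighted the second learner comes out ahead (28 against 22).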

Assessment of acquired knowledge during the learning process

IRT

IRT (Item Response Theory) is an approach to assessment design and scoring that uses learners' correct and incorrect answers to test questions to model parameters such as the difficulty of a question, its discrimination (how sharply it separates stronger from weaker learners), the chance of guessing the answer, and the learner's ability. IRT is based on the principle that an individual's response to a question tells us about their underlying abilities and provides other relevant information. A classical evaluation gives a rate of correct and incorrect answers, and these results only offer a glimpse of the learner's performance on that particular set of questions. Because the rate depends on which questions were asked, it is difficult to assess any improvement by comparing two separate assessments of the same person.
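
One common way to combine these parameters is the three-parameter logistic model. The sketch below is a minimal illustration; the function and argument names are ours, and setting discrimination to 1 and guessing to 0 reduces it to the simpler one-parameter (Rasch) form.

```python
import math

def prob_correct(theta, difficulty, discrimination=1.0, guessing=0.0):
    """Three-parameter logistic model.

    theta: the learner's ability.
    difficulty: ability level at which the question becomes "50/50"
                (before accounting for guessing).
    discrimination: how steeply the probability rises around the difficulty.
    guessing: probability of a correct answer by chance alone.
    """
    logistic = 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))
    return guessing + (1.0 - guessing) * logistic

# A learner whose ability equals the question's difficulty:
print(prob_correct(0.0, 0.0))                 # 0.5
# Same question as a four-option multiple choice (assumed 25% guessing):
print(prob_correct(0.0, 0.0, guessing=0.25))  # 0.625
```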

With IRT, we can design comparable assessments and use them to compare a learner's performance over time, to deduce whether they are improving as they learn or whether they have retention difficulties. The trainer can therefore follow learners individually by tracking their improvement curve. An important advantage of this approach is that it provides not only an estimate of the skill level, but also the uncertainty attached to this estimate. This information can be used to determine how many questions need to be asked before the uncertainty falls below a desired threshold, which makes it possible to design shorter and more relevant assessments. Beyond learner assessment, IRT can also be used to assess individual pieces of content, and the learning path as a whole, by measuring the skill level of individuals before and after they interact with the content. This shows the impact of each piece of content and distinguishes those that need to be improved from those that are already effective.
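
As a rough illustration of how uncertainty translates into test length, the sketch below uses the Fisher information of a two-parameter logistic question: the standard error of the ability estimate is approximately one over the square root of the total information collected. The function names and the example values are assumptions made for this sketch.

```python
import math

def item_information(theta, a, b):
    """Fisher information of a 2PL question (discrimination a, difficulty b)
    for a learner of ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def standard_error(theta, items):
    """Approximate uncertainty of the estimate after a non-empty list of (a, b) questions."""
    total = sum(item_information(theta, a, b) for a, b in items)
    return 1.0 / math.sqrt(total)

def items_needed(theta, items, se_target):
    """Number of questions, taken in order, needed to bring the uncertainty
    down to se_target; None if the list is too short."""
    total = 0.0
    for count, (a, b) in enumerate(items, start=1):
        total += item_information(theta, a, b)
        if 1.0 / math.sqrt(total) <= se_target:
            return count
    return None

# Thirty identical, perfectly targeted questions (a = 1, b = theta = 0).
questions = [(1.0, 0.0)] * 30
print(items_needed(0.0, questions, se_target=0.5))  # 16
```

Each perfectly targeted question contributes 0.25 units of information here, so sixteen of them are needed to reach a standard error of 0.5. Questions that are too easy or too hard for the learner contribute less, which is why poorly targeted tests have to be longer.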

BKT

BKT (Bayesian Knowledge Tracing) is an algorithm that models an individual's learning as a Markov chain (see Source2), more precisely as a hidden Markov model in which the learner's mastery of a concept is the hidden state. It is based on the principle that an individual's knowledge is not fixed: as learners interact with a learning platform, their skill in a given concept improves. BKT thus models how an individual absorbs, or learns, a new concept.

Initially, we do not know whether the learner has mastered the concept: the learner is in an initial state described by a prior probability of mastery. The learner is then presented with a question, and four probabilities come into play: the probability of learning, which moves the learner from a state of ignorance to a state of knowledge; the probability of guessing the answer when the learner does not know the concept; the probability of error when the learner has mastered the concept; and finally the probability of forgetting which, unlike the learning probability, moves the learner from knowledge to ignorance (the standard form of the model assumes no forgetting and sets this probability to zero). As the sequence of questions continues, the estimated probability that the learner has mastered the concept is updated after each answer, which gives us a view of how the individual's learning evolves. This model is widely used to assess the effectiveness of e-learning platforms.
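
The update performed after each answer can be sketched as follows. The parameter values are illustrative assumptions; in practice they are fitted to interaction data, and the forgetting probability is kept at zero in the standard model.

```python
def bkt_update(p_known, correct, p_learn=0.1, p_guess=0.2, p_slip=0.1, p_forget=0.0):
    """Return the probability that the concept is mastered after one answer.

    p_known: probability of mastery before the question.
    correct: True if the learner answered correctly.
    p_learn, p_guess, p_slip, p_forget: learning, guessing, error and
    forgetting probabilities (illustrative default values).
    """
    # Step 1: revise the belief in light of the observed answer (Bayes' rule).
    if correct:
        evidence_known = p_known * (1.0 - p_slip)
        evidence_unknown = (1.0 - p_known) * p_guess
    else:
        evidence_known = p_known * p_slip
        evidence_unknown = (1.0 - p_known) * (1.0 - p_guess)
    posterior = evidence_known / (evidence_known + evidence_unknown)
    # Step 2: account for the chance of learning (or forgetting) before the next question.
    return posterior * (1.0 - p_forget) + (1.0 - posterior) * p_learn

# Trace a learner who starts with a 30% chance of mastery.
p = 0.3
for answer in [True, True, False, True]:
    p = bkt_update(p, answer)
```

A correct answer raises the estimate and an incorrect one lowers it, but never all the way to 0 or 1, because guesses and slips mean that a single answer is never conclusive.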

Elo Rating System

The Elo rating system is named after its creator, Arpad Elo, who developed it to rank chess players. It was adopted by the US Chess Federation in 1960 and by the international federation FIDE in 1970, and Elo described it in detail in his 1978 book The Rating of Chessplayers, Past and Present. Today it is used more broadly to rank players in many games and sports in which players or teams compete against each other. More recently, this rating system has been used in learning systems, not to rank learners against one another but to analyze the interaction between the learner and the content. Each interaction is treated as a match between the learner and a piece of content. Depending on the learner's success or failure, the learner's level of proficiency and the difficulty of the content are both updated: the more interaction data there is, the more precisely the proficiency of the different learners and the difficulty of the content are estimated. It is therefore possible to follow the learner's progress over time.
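
A minimal sketch of such an update is given below. It assumes a logistic expected-success curve on an arbitrary scale and a fixed step size k of 0.4; both are illustrative choices, and real systems often shrink the step as more data accumulates.

```python
import math

def expected_success(skill, difficulty):
    """Probability that the learner succeeds, given the current estimates."""
    return 1.0 / (1.0 + math.exp(-(skill - difficulty)))

def elo_update(skill, difficulty, correct, k=0.4):
    """Update both estimates after one interaction.

    Returns (new learner skill, new content difficulty).
    k is an assumed step size controlling how fast estimates move.
    """
    surprise = (1.0 if correct else 0.0) - expected_success(skill, difficulty)
    # A success raises the learner's skill and lowers the content's difficulty;
    # a failure does the opposite. Unexpected outcomes move estimates the most.
    return skill + k * surprise, difficulty - k * surprise

# Learner and content both start at 0; the learner answers correctly.
skill, difficulty = elo_update(0.0, 0.0, correct=True)
print(round(skill, 3), round(difficulty, 3))  # 0.2 -0.2
```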

In conclusion, knowledge assessment is a key step in the learning process, and this is even more true for intelligent learning systems: it makes it possible to collect a large amount of data about learners and content. This data can be used, in particular, to feed algorithms for personalizing learning, such as recommendation systems.


Source1: The Dunning-Kruger effect is a cognitive bias in which people with low ability in a skill overestimate that ability. It is related to the cognitive bias of illusory superiority and comes from the inability of individuals to recognize their lack of competence.

Source2: A Markov chain is a mathematical system that undergoes transitions from one state to another according to certain rules of probability.