A scoping review of webcam eye tracking in learning and education

The use of eye tracking in educational research has shown great potential in recent decades. There are various approaches to the usage of eye tracking technology in this area, including investigation of self-regulated learning from different types of learning environments. Nonetheless, the majority of published research studies have one tremendous limitation: using expensive remote or tower-based eye trackers to provide high-quality data in laboratory conditions. Now, new webcam eye trackers may offer an easily affordable approach allowing eye tracking measurements in the real environment, such as the investigation of learning behavior in online learning environments. The main aim of this scoping review is to explore the use of webcam eye tracking technology in the field of learning and education. We established three specific purposes: 1) to introduce educational topics being explored using webcam eye tracking, 2) to discuss the methodological aspects when exploring educational topics with webcam eye tracking, and 3) to investigate the eye tracking aspects used for the analysis. To do this, we analyzed 16 studies that used webcam eye tracking. The results of the scoping review show that 1) selected studies focus mainly on students’ behavior in online learning environments, such as engagement, lack of attention, cheating and others; 2) a wide range of studies aimed at the development of automatized detection tools; and 3) studies are mainly focused on extracting raw and event data features using them mostly for automatized detection purposes.


Introduction
The use of eye movement tracking has been on the rise in the field of education and learning in recent decades. Eye tracking technology is able to provide researchers with a wide range of information about the metacognitive and behavioral processes of learners (Antonietti et al., 2014). Eye tracking technology contributes to the development of educational processes and the individuals themselves. Eye tracking is used in a large number of sectors within education, e.g., processes in the classroom, teaching and learning in virtual reality, reading research (Rayner, 1998), and attention, perception, and language learning (Šmideková, 2018; Lai et al., 2013). Lai et al. (2013) also mention topics of metacognition and learning strategies.
Another broad area of educational research in which eye tracking and analysis of eye movement data can be used extensively is multimodal learning analytics (MLA). In general, MLA refers to the integration or the elicitation of different data from multiple sources (e.g., audio, video, eye tracking, biosensors and more; see Blikstein & Worsley, 2016 or Worsley, 2018) that can subsequently provide crucial insights regarding an individual's learning process from a variety of perspectives (Ochoa, 2017). Such analyses can provide assessments of students' knowledge, behavior, intentions, or even physiological characteristics (Blikstein & Worsley, 2016), which can help to increase the holistic understanding of an individual. Alemdag and Cagiltay (2018) explore in more depth the area of eye tracking research (particularly with remote and webcam eye tracking devices) in multimedia learning. This area is closely related to online learning environments, to learners' behavior with learning materials, and to their metacognitive and self-regulatory abilities, which are crucial for successful and effective learning from e-learning materials.
Self-regulated learning has become crucial in the last two decades with the expansion of learning in online learning environments, where students must demonstrate sufficient self-regulation skills, such as motivation, strategic planning, responsibility, and time management, to learn effectively (Panadero, 2017). Self-regulated learning is described as a learning process that is built on cognitive strategies, motivation, and metacognitive skills. At the same time, responsibility and autonomy are also necessary for successful self-regulated learning (Carneiro et al., 2011). Zimmerman (2000) describes self-regulated learning as a cyclical process that is composed of three phases. These phases include the preparatory phase, the performance phase, and the reflective phase. However, exploring self-regulated learning in online learning environments is relatively challenging. Researchers often focus on self-reports and questionnaires, which can be subjective (Dostálová et al., 2022). For this reason, the focus in research on self-regulated learning has begun to shift simultaneously to the use of new technologies and to the collection of psychophysiological data that can point to behavioral patterns in individuals' learning that were not previously apparent. The method of tracking eye movements is included among such technologies (e.g., Antonietti et al., 2014).
NICOL DOSTÁLOVÁ, LUKÁŠ PLCH
Tracking eye movements can contribute greatly to uncovering self-regulated processes during learning (e.g., Taub & Azevedo, 2018) in the sense of detecting what areas a student has looked at during the learning process, for how long, and possibly in what sequence. However, eye tracking research in the area of online learning environments presents major challenges in the form of lab-based data gathering. The eye tracking devices commonly used for research are most often based on the principle of pupil and corneal reflection and offer varying levels of sampling frequency, precision, and accuracy (Holmqvist et al., 2011). However, these devices can be relatively sensitive to the conditions in which the measurements are conducted. These in-lab eye tracking devices can also be relatively expensive (Semmelmann & Weigelt, 2017), and so the methodological aspects and the data collection are usually rather more limited and simplified, which may reduce the ecological validity of the research since the participant is not measured under natural conditions (Papoutsaki et al., 2016).
For such reasons, it is necessary to pay attention to the new technological possibilities regarding both the availability of eye tracking devices and data gathering in ecological settings (Wisiecka et al., 2022). A new approach can be provided by webcam eye tracking. Webcam eye tracking is based on the principle of face landmark detection and a machine learning approach for gaze position prediction (Wisiecka et al., 2022). Recent studies have sought to compare the accuracy and precision of the webcam eye tracking solution with commercial eye trackers, showing reliable results depending on, e.g., experimental stimuli (for detailed information about the eye tracking device parameters and experimental settings see Burton et al., 2014; Skovsgaard et al., 2011; Wisiecka et al., 2022). Nonetheless, Wisiecka et al. (2022) suggest that more replications are needed in this field to properly consider all aspects of the webcam eye tracking solution (e.g., research topic, procedure, experimental stimuli, and more).
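To make the terms accuracy and precision concrete, the following minimal sketch computes both from gaze samples recorded while a participant looks at a known target. It assumes the common definitions (accuracy as the mean offset from the target, precision as the root mean square of sample-to-sample distances) and works in screen pixels. The cited comparison studies may use other units, such as degrees of visual angle, or other formulas; the function names and sample values are illustrative only.

```python
import math

def accuracy_px(samples, target):
    """Mean Euclidean distance (pixels) between gaze samples and the target."""
    return sum(math.dist(s, target) for s in samples) / len(samples)

def precision_rms_px(samples):
    """Root mean square of distances between successive gaze samples (pixels)."""
    steps = [math.dist(a, b) for a, b in zip(samples, samples[1:])]
    return math.sqrt(sum(d * d for d in steps) / len(steps))

# Hypothetical gaze estimates (x, y) recorded while fixating a target at (100, 100).
gaze = [(103, 104), (103, 104), (106, 108)]
target = (100, 100)

print("accuracy (px):", accuracy_px(gaze, target))
print("precision RMS (px):", precision_rms_px(gaze))
```

Lower values are better for both measures: a small accuracy value means the estimates lie close to the true gaze target, and a small precision value means the estimates are stable from sample to sample.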
However, the main advantage of a webcam eye tracking device is that it can be used by nearly anyone, since only a device with a webcam is needed to conduct the research. This may enable a higher ecological validity of the measurements, which can be particularly useful, e.g., in research on self-regulated learning from online learning materials and in supporting multimodal learning analytics during learning processes. This research aims to investigate whether and how webcam eye tracking can be used in education and learning and thus contribute to the further use and development of this technological approach.
A SCOPING REVIEW OF WEBCAM EYE TRACKING IN LEARNING ...

Methods
This scoping review study aims to provide and analyze an insight into the current state of knowledge on the use of webcam eye tracking in the field of learning and education. To this end, a main research question and then two specific research questions were established as follows:
RQ1: How is webcam eye tracking technology used in the field of education and learning?
RQ1.a: What fields of education and learning are explored with webcam eye tracking technology?
RQ1.b: How is webcam eye tracking used from the methodological perspective and what aspects of eye tracking are considered in selected studies?
In order to answer the research questions, which are exploratory in nature, we chose to do a scoping review. We followed the methodology set out by Tricco et al. (2018) and Munn (2018).

Inclusion and exclusion criteria
Inclusion and exclusion criteria determine which documents will or will not be included in the search to meet the study objective set by the research questions. In our case, the inclusion criteria are eye tracking, eye movement, webcams, and school or university settings. For the summary of all inclusion and exclusion criteria, see Table 1. For the review, we concentrated on studies published between 2010 and 2023, and we decided to work with a broader scope, including not only studies of the "article" type but also "conference papers." This is mainly because webcam eye tracking is a relatively new technology that has developed more widely in the last two decades. Furthermore, conference proceedings papers are a common type of document in this field.

Data extraction
Data extraction was carried out according to the Joanna Briggs Institute scoping review methodology guidelines (11.2.7 Data Extraction, 2022) in order to extract data on the authors of the publication, year of publication, origin, research objectives, population studied, research methodology, and findings. See Table 1 for the extraction results.

Critical appraisal
This is an optional part of the scoping review, and was not conducted in our case.

Data synthesis
The data synthesis took the form of a descriptive qualitative content analysis, conducted by means of open coding. A simple frequency count was used for descriptive statistical analysis of quantitative data (Aromataris & Munn, 2020).
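As an illustration of what a simple frequency count over open codes can look like, the sketch below tallies the number of studies in which each code occurs. The study identifiers and code labels are hypothetical placeholders and do not reproduce the coding used in this review.

```python
from collections import Counter

# Hypothetical open codes assigned to studies during content analysis.
# These labels are illustrative placeholders, not the review's actual codes.
coded_studies = {
    "study_01": ["attention", "online learning"],
    "study_02": ["reading", "online learning"],
    "study_03": ["attention", "detection tool"],
}

def code_frequencies(coded):
    """Count in how many studies each code occurs (at most once per study)."""
    counts = Counter()
    for codes in coded.values():
        counts.update(set(codes))
    return counts

frequencies = code_frequencies(coded_studies)
for code, n in frequencies.most_common():
    print(f"{code}: {n} of {len(coded_studies)} studies")
```

Counting each code once per study keeps the totals interpretable as a number of studies rather than a number of coded passages.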

Data management and screening
Data were processed in the online software Rayyan (https://www.rayyan.ai/), where they were deduplicated and screened independently by both authors of the study. First, studies were screened by reading titles and abstracts and, in the second stage, by full-text analysis. The inclusion and exclusion criteria were the key measures for deciding whether to include or exclude a study. Any disagreements were resolved in online meetings via MS Teams.

Ethical considerations
This scoping review does not require ethical approval. All data are gathered from publicly available sources, either licensed or open access.

Data analysis
The analysis of the selected articles commenced by scrutinizing the research objectives, inquiries, methodologies, and study outcomes. We systematically extracted pertinent information from the articles, which encompassed metadata such as title, authors, DOI, and name of the journal or conference. We also extracted the aim of the study, sample, research method, and eye tracking approach.

Results
This scoping review presents a summary of 16 articles and conference papers that combine two key topics: the use of webcam eye tracking and the thematic areas of education and learning. We decided to divide the results into two main categories according to the focus of the specific research questions.
These two categories are: 1) the field of interest regarding education and learning, and 2) methodological and eye tracking aspects of selected studies.

Exploring educational horizons: Webcam eye trackers and learning domains
The main aim of this section is to present the possibilities for using webcam eye tracking in teaching and learning that have already been explored in various contexts.
From a general perspective, eye tracking technology is used to investigate cognitive functional processes. This has been addressed by Lin et al. (2022), who focused on testing the feasibility of a webcam eye tracking interface for commonly used cognitive tasks. They subsequently tested the webcam interface on Chinese reading tasks. Reading is a complex cognitive ability, and reading tasks were also applied by Guan et al. (2022), who used webcam eye tracking to explore the relationship between reading behavior and reading performance and pointed out the importance of low-cost eye tracking data gathering in natural conditions, especially in the era of digital and online learning. Cognitive processes related to reading comprehension and mind wandering during an online reading comprehension task are also discussed by Hutt et al. (2022). In the case of a specific reading disorder, learning how to read and the reading itself can be a very challenging task. Calabrich et al. (2021a, 2021b) explored cross-modal bindings and episodic memory in dyslexic and intact adult readers. The reader's attention is also closely related to working with the text and reading. A reading task for a group of neurodivergent students (ADHD, autism, and learning disability) was also used by Wong et al. (2023), who explored the use of webcam eye tracking to support these students during learning. Li et al. (2016) took a different approach to reading research using webcam eye tracking and concentrated on developing a system for detecting attention during reading in an e-learning environment. For this purpose, they adopted a multimodal approach, using information related to facial expression, eye tracking, and mouse dynamics.
Students' attention during learning from online learning materials is of great interest to Robal et al. (2018), Khan et al. (2022), and Madsen et al. (2021). Khan et al. (2022) reacted to the pandemic situation and the subsequent conversion of in-person lectures to an e-learning environment. They proposed an e-learning framework that would be able to determine students' attention levels during online learning sessions, in part to address the problem of a potential lack of self-regulation among students during online lessons. Robal et al. (2018) follow up on the issue of students not being able to adequately regulate their learning process in Massive Open Online Courses (MOOCs) and propose a tool that would be able to detect the loss of attention based on a face-capturing webcam eye tracker. Khosravi et al. (2022) also directed their attention to self-directed learning in the online environment, using both a webcam eye tracker and commercial eye tracking glasses to collect psychophysiological data (gaze position) and thereby potentially improve the quality of e-learning materials. Madsen et al. (2021) then used webcam eye tracking for an experiment addressing students' attention while watching online tutorial videos. Lack of attention and mind wandering during the learning process have also been investigated by Zhao et al. (2017), who focused on detecting mind wandering for MOOCs based on webcam gaze data. Furthermore, behavioral and emotional engagement and its automatized detection in the educational environment were explored by Alkabbany et al. (2023). The analysis of learners' general behavior in e-learning was the focus of Yi et al. (2015), who used machine learning to analyze eye movements captured by a webcam for the purpose of classifying learning patterns, which can lead to a better understanding of students' learning intentions and their overall behavior when studying in online environments.
A completely different area has been explored by Dilini et al. (2021), who have concentrated on the area of remote online exams and have developed eye-movement-based cheating detection for this purpose.
Based on the thematic analysis of the selected studies in the scoping review, it is apparent that the use of webcam eye tracking in the field of learning and education is relatively broad but focuses almost exclusively on the area of online or e-learning environments.With the help of simple and accessible webcam eye tracking technology, it is possible to observe learners' behavior in online learning environments (attention, concentration, engagement, etc.), and at the same time, eye movement data can be used to create classification and detection tools that could lead to the improvement of online learning environments.

Unraveling the gaze: Essential eye tracking aspects and methods
In this section, we focus on the types of webcam eye tracking devices employed in selected studies, the eye tracking metrics, and the way of working with the eye tracking outputs used in the research. Given the inconsistency of webcam eye tracking usage, we decided to proceed with this section according to the selected eye tracking device and then the aspects of eye tracking data analyzed.
The most used webcam eye tracking device in the selected studies was Webgazer.js (see Papoutsaki et al., 2016). A first group of studies relied on rather superficial eye tracking metrics for subsequent analyses. A webcam eye tracking system was used by Alkabbany et al. (2023), whose main goal was to develop an automated measurement of behavioral engagement in students. For this purpose, head position, eye gaze, and action units were considered. The eye gaze feature captured where the student was looking. The webcam system collected data every 2-3 seconds and yielded a total of 240 feature vectors (considering head pose, eye gaze, and action units), which were then processed using a support vector machine (SVM) classifier. Robal et al. (2018) worked on self-regulation in MOOCs and the detection of attention in the online learning environment. For these purposes, they used both a commercial eye tracker and Webgazer.js to track eye movements and tracking.js (see tracking.js, n.d.) for face tracking. Nonetheless, only the accuracy and reaction times were discussed for the detection development. As Dilini et al. (2021) focused on cheating detection, they used WebGazer.js eye tracking to gather a set of raw eye tracking data containing the estimated x and y positions and corresponding timestamps. These data were processed and divided into two main categories: "looking at the screen" and "looking outside of the screen." Khosravi et al. (2022) compared head-mounted eye tracking (Pupil Core eye tracking glasses) and webcam eye tracking (Webgazer.js). In this study, the recorded eye tracking data were visualized, and the visualizations from both devices were compared, showing similar accuracy.
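The division of raw gaze samples into "looking at the screen" and "looking outside of the screen" can be sketched as a simple bounds check. This is a minimal illustration and not the procedure of Dilini et al. (2021): it assumes gaze estimates in pixels relative to the top-left corner of the screen, and the tolerance margin and all names are hypothetical.

```python
from typing import NamedTuple

class GazeSample(NamedTuple):
    t_ms: float  # timestamp in milliseconds
    x: float     # estimated horizontal gaze position in pixels
    y: float     # estimated vertical gaze position in pixels

def label_sample(sample, width, height, margin=0.0):
    """Label one sample as on_screen or off_screen.

    margin (assumed parameter) tolerates estimates slightly past the screen edge,
    which can be useful because webcam gaze estimates are noisy.
    """
    inside_x = -margin <= sample.x <= width + margin
    inside_y = -margin <= sample.y <= height + margin
    return "on_screen" if inside_x and inside_y else "off_screen"

def off_screen_ratio(samples, width, height, margin=0.0):
    """Share of samples labelled off_screen (0.0 for an empty recording)."""
    if not samples:
        return 0.0
    off = sum(1 for s in samples if label_sample(s, width, height, margin) == "off_screen")
    return off / len(samples)

# Hypothetical recording on a 1920 x 1080 screen.
recording = [
    GazeSample(0, 960, 540),
    GazeSample(200, 1000, 560),
    GazeSample(400, 2300, 500),  # far to the right of the screen
    GazeSample(600, 900, 520),
]
print(off_screen_ratio(recording, width=1920, height=1080))
```

A detection tool would typically go further than a single ratio, for example by looking at how long consecutive off-screen episodes last, but that step depends on design decisions the source does not describe.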
A second, larger group of studies used Webgazer.js for various thematic purposes and drew on a broader set of eye tracking features for the subsequent analyses. For example, Hutt et al. (2022) used Webgazer.js and converted the raw gaze data into global gaze features (general eye movement data independent of the presented stimuli, e.g., number of gaze samples, number of unique gaze samples, and variance of gaze points) and local gaze features (dependent on predefined AOIs). The AOI approach was also chosen by Wong et al. (2023) to examine the validity and usability of webcam eye tracking as an aid tool for neurodivergent learners. Gaze data were recorded with Webgazer.js and processed in proportion to each AOI (each AOI corresponds to a paragraph on the stimuli). Calabrich et al. (2021b) used Webgazer.js operating at a sampling rate of 60 Hz (depending on the screen refresh rate) to investigate audiovisual learning in dyslexic adult readers and intact adults. An algorithm to detect fixations from the raw data was used for the analysis. The position of these fixations was related to the regions of interest (ROIs) generated. Webgazer.js was also used in the research of Khan et al. (2022), who first worked with raw data and then moved on to selecting individual eye tracking features that related to predefined areas of interest, primarily related to fixations (e.g., number, variance, duration, and ratio of fixations). These metrics were processed through machine learning (logistic regression, SVM, and polynomial regression) to create a framework to capture attention loss and engagement in e-learning environments. Guan et al. (2022) used Webgazer.js with a focus on analyzing reading performance. The eye tracking features selected were related to more detailed fixation parameters (e.g., frequency of fixation) or frequency of regressions on pages. These data were then statistically processed. Aiming for mind-wandering detection, Zhao et al. (2017) compared eye tracking data from a high-quality commercial eye tracker with a sampling rate of 30 Hz with Webgazer.js data with a sampling rate of 5 Hz. For these purposes, they selected 58 eye tracking features based on detailed parameters of fixations and saccades. Calabrich et al. (2021a) also focused on intact adult readers and tracked their eye movements while reading pseudowords using Webgazer.js. For this study, Webgazer.js was set to a frame rate of 60 Hz and focused on fixation location and consistency.
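The difference between global gaze features and AOI-based (local) features can be illustrated with a short sketch. It assumes gaze points as (x, y) pixel pairs and rectangular AOIs given as (left, top, right, bottom); the feature names, AOI layout, and data are hypothetical and do not reproduce the feature sets of the cited studies.

```python
from statistics import pvariance

def global_features(points):
    """Stimulus-independent features computed over all gaze points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return {
        "n_samples": len(points),
        "n_unique_samples": len(set(points)),
        "var_x": pvariance(xs) if xs else 0.0,
        "var_y": pvariance(ys) if ys else 0.0,
    }

def aoi_proportions(points, aois):
    """Share of gaze points falling into each rectangular AOI.

    aois maps a name to (left, top, right, bottom) in pixels. A point is
    counted for the first AOI that contains it, so AOIs should not overlap.
    """
    counts = {name: 0 for name in aois}
    for x, y in points:
        for name, (left, top, right, bottom) in aois.items():
            if left <= x <= right and top <= y <= bottom:
                counts[name] += 1
                break
    total = len(points)
    return {name: (c / total if total else 0.0) for name, c in counts.items()}

# Hypothetical gaze points and two paragraph-sized AOIs.
points = [(10, 10), (10, 10), (30, 50), (200, 300)]
aois = {"paragraph_1": (0, 0, 100, 100), "paragraph_2": (0, 101, 400, 400)}

print(global_features(points))
print(aoi_proportions(points, aois))
```

Global features can be computed without knowing what was on the screen, whereas the AOI proportions require the stimulus layout, which is why the two kinds of features are usually reported separately.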
Nonetheless, several studies used a different webcam eye tracking tool to perform the measurement. Rather superficial eye tracking aspects were considered by Behera et al. (2020), who based their research on webcam-recorded hand-over-face gestures, head and eye movements, and facial emotions. Focusing on the eye movements themselves, the researchers used the IntraFace tool (see De la Torre et al., 2015) to directly record the left and right eye gaze data. On the other hand, Li et al. (2016) investigated attention detection in an online learning environment using a multimodal approach that includes facial expression, mouse dynamics, and eye gaze patterns. Using an unspecified webcam, they analyzed eye blinks, fixations (fixation rate and duration), and saccades (saccade rate and duration). These features were further used for machine learning analysis. Yi et al. (2015) used unspecified webcam eye tracking and described eye-movement detection for the purposes of real-time learning evaluation.
Based on the eye tracking aspects used in the selected studies, it is evident that the most used webcam eye tracking framework is probably Webgazer.js, which the researchers used in different thematic contexts. If we focus on the selection and further analysis of the eye tracking data, we notice a relatively high diversity. In some cases, the authors focus their attention only on the superficial detection of faces and possible on-screen and off-screen gaze; in other cases, the authors work with raw data, which are processed into standard eye tracking event metrics, including fixations and saccades and their detailed characteristics (e.g., count or duration of fixations and saccades). In some cases, these feature metrics are also analyzed in the context of their location and delineated areas of interest (AOIs). Eye tracking metrics are further used for statistical analysis or alternatively analyzed at the level of several machine learning approaches.
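How raw gaze samples can be turned into fixation events is sketched below using a dispersion-threshold approach (often referred to as I-DT). This is one common technique and not necessarily the algorithm used in any of the selected studies. The threshold values, function names, and sample data are assumptions chosen for illustration.

```python
def detect_fixations(samples, max_dispersion=50.0, min_duration=100.0):
    """Dispersion-threshold fixation detection.

    samples: list of (t_ms, x, y) tuples sorted by time.
    max_dispersion: assumed limit in pixels for (max_x - min_x) + (max_y - min_y).
    min_duration: assumed minimum fixation duration in milliseconds.
    """
    def dispersion(window):
        xs = [s[1] for s in window]
        ys = [s[2] for s in window]
        return (max(xs) - min(xs)) + (max(ys) - min(ys))

    fixations = []
    start, n = 0, len(samples)
    while start < n:
        # Grow an initial window until it spans at least min_duration.
        end = start
        while end < n and samples[end][0] - samples[start][0] < min_duration:
            end += 1
        if end >= n:
            break  # remaining data are too short to form a fixation
        if dispersion(samples[start:end + 1]) <= max_dispersion:
            # Extend the window while the points stay close together.
            while end + 1 < n and dispersion(samples[start:end + 2]) <= max_dispersion:
                end += 1
            window = samples[start:end + 1]
            fixations.append({
                "start_ms": window[0][0],
                "duration_ms": window[-1][0] - window[0][0],
                "x": sum(s[1] for s in window) / len(window),
                "y": sum(s[2] for s in window) / len(window),
            })
            start = end + 1
        else:
            start += 1  # drop the first sample and try again
    return fixations

# Hypothetical recording: gaze rests near (100, 100), then jumps to (400, 300).
samples = [
    (0, 100, 100), (33, 102, 101), (66, 99, 98), (100, 101, 100), (133, 100, 102),
    (166, 400, 300), (200, 402, 299), (233, 399, 301), (266, 401, 300), (300, 400, 302),
]
fixations = detect_fixations(samples)
for f in fixations:
    print(f)
```

Counts and durations of the detected fixations can then serve as features for statistical analysis or machine learning, and each fixation centre can be tested against AOI boundaries. At low webcam sampling rates the thresholds need particular care, since only a few samples fall within a typical fixation.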

Conclusion
The present scoping review was devoted to the current state of knowledge in the area of webcam eye tracking in the field of education and learning.
In our study, we found that research in this area is still somewhat in its early stages and a large majority of the research is from a computer science background and focuses mainly on automatized detection systems that can be potentially used for education and learning in various learning environments.Nonetheless, there is a huge opportunity to expand research in the field of education, in terms of proper investigation of educational and learning processes, and both thematic focus and the actual processing of the gaze data (as an example of such an approach see Calabrich et al., 2021).
In the first section of our review, we concentrated on the possibilities and areas of using webcam eye tracking in the field of learning and education. Webcam eye tracking is mainly used in e-learning environments, often in the context of observing the learning process, various aspects of an individual's behavior, or enhancing the quality and functionality of such systems and environments. At the same time, webcam eye tracking measures can be applied to a variety of detection functions (e.g., proctoring).
From the methodological perspective, our results show that webcam eye tracking is developed primarily in the field of computer science in the form of designing detection tools, and only a few studies aimed to experimentally explore cognitive processes (e.g., reading patterns in neurodivergent students, see Wong et al., 2023). The actual use of webcam eye tracking in educational settings, in the sense of replacing conventional eye tracking, has been rather sporadic. The manner of working with eye tracking data varies widely, ranging from basic on- and off-screen gaze tracking to analyzing detailed metrics of saccades and fixations, also at the level of selected areas of interest. The choice of eye tracking metrics was determined by the main objectives of the research concerned.
Overall, this scoping review study provides a summary of current trends in the field of webcam eye tracking in the context of learning and education. Within such a context, the authors of the selected studies primarily focus on the area of learning in online learning environments and student behavior while working with them. Based on this, they also concentrate predominantly on the development of automated detection tools in the field of learner attention or engagement.
The results of our study may offer a new perspective and new challenges for educational researchers considering the use of eye tracking for investigative purposes. At the same time, however, more research is also needed on the quality of webcam measurements, even though the selected studies that focused on comparing commercial eye tracking with webcam eye tracking showed reliable outcomes. In any case, this information offers a direction for further research in this area that may lead to broadening and deepening the research.

Figure 1
PRISMA diagram showing the study selection process.

Table 1
Table summarizing the inclusion and exclusion criteria.

Table 2
Key concepts and search terms used for the literature search.
* OR learn* OR teach* OR study* OR student* OR instruct* OR pupil* OR school* OR universit* OR college*
Data extraction was carried out according to the Joanna Briggs Institute scoping review methodology guidelines.

Table 1
Resumé of extracted data