MODELING THE FOCUS OF ATTENTION AS A RANDOM PROCESS

This research investigates the feasibility of modeling visual attention (as represented through eye movements) as a stochastic process. A stochastic model of attention would provide a foundation for research involving probabilistic predictions of attention allocation which could be used in a variety of domains. The following hypotheses are examined as part of this research.
1. The visual trace of participants when asked to fixate on a single point can be modeled as a stochastic process (Supported).
2. The visual trace of participants will fluctuate when performing an additional cognitive task, but can still be modeled stochastically with additional parameter considerations (Supported).
In order to determine whether attention could be modeled as a stochastic process, eye-tracking data were collected and analyzed. The experiment contains only a single focal point with no distractors. The goal of this experiment is to determine how the eyes move when attention is singularly focused. This experiment does not attempt to determine how attention is captured or distracted, but rather to understand the foundational elements of attention that can be ascertained from the inherent movement of the eyes.
To determine whether the data could be modeled as a stochastic process, different tests are used to compare the empirical cumulative distribution function to the hypothesized theoretical distribution. It was hypothesized that the saccade occurrences follow a Poisson process, but only 46% of the 52 runs provided support that the data could be modeled as a Poisson process. There was no significant difference between the control and n-back runs. Overall, there is not enough evidence to support that the saccades follow a Poisson process. The Wiener process and random walk are hypothesized to relate to the gaze pattern or visual trace. For the Wiener process, the length of movement in the horizontal and vertical directions was assessed for normality. The hypothesis that the data followed a Wiener process was supported by 100% of the 53 runs in both the horizontal and vertical directions. Thus, this data was able to be modeled as a stochastic process, specifically a Wiener process, which supported hypothesis 1.
This analysis was extended to consider whether the distribution changed as the differences in position increased to two samples, three samples, four samples, five samples, and ten samples. As the number of time samples between eye position difference calculations increased, the results still strongly supported that the data followed a normal distribution. However, the variance proportions did not increase as expected by a Wiener process. This suggests that as the distance between time samples increases, at some point, the differences in position will no longer follow a Wiener process. The additional parameter assessment comparison for the Wiener process showed that in both the horizontal and vertical directions, the variances of the distributions differed significantly between the control and n-back runs. The variances for the n-back runs were consistently larger than those for the control runs. This provides support to hypothesis 2.
Additional analyses regarding whether the gaze path followed a random walk were executed. The distribution of the angle or direction of movement was analyzed. In comparison to a uniform distribution, when saccades were removed, 60% of the 53 runs failed to reject the null hypothesis. For the data with saccades, only 21% of the 53 runs failed to reject the null hypothesis. Thus, there was a significant difference in the distributions of the angle of eye movement between the data with and without saccades. When saccades were removed, there is more support for the gaze path following a random walk, compared to when saccades are included in the assessment.
This research and resulting conclusions are important in starting to explain the involuntary eye movements which occur when a participant is singularly focused. These results provide strong evidence that this underlying movement can be represented as a Wiener process. For any experiment considering eye movements, the inherent movement of the eyes should be considered in the analysis.


A black dot was displayed in the center of a screen, and the participants were asked to look at the dot for two minutes during two distinct types of trials. Each participant repeated each of the two types of trials three times (a total of 6 runs, or 12 minutes). In one type of trial, participants performed an additional task requiring cognitive processing, i.e., the n-back task. In the n-back task, a series of verbal stimuli (e.g., a series of numbers) is presented and the participant is asked to indicate when the currently presented stimulus is the same as the one presented n trials previously [6]. The other type of trial simply asked that participants focus on the black dot. Once the data were collected through the eye tracker software and C# program, they were analyzed to assess the hypothesis that eye movements, when directed to focus on a single point, can be modeled as a random process. The inherent properties of eye movements are examined without distractors or additional required tasks (e.g., visual search for a specified object within the scene).
This is important in determining how the eyes move involuntarily while singularly focused. The purpose of including a cognitive task is to determine its effect on the focus of visual attention.
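The n-back rule described above can be made concrete with a short sketch. This is only an illustration of the matching rule, not the software used in the experiment; the function name `n_back_targets` and the example digit stream are hypothetical.

```python
def n_back_targets(stimuli, n):
    """Return the positions at which the current stimulus equals the one
    presented n items earlier (the positions a participant should flag)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return [i for i in range(n, len(stimuli)) if stimuli[i] == stimuli[i - n]]


# Hypothetical stream of spoken digits (illustrative only).
digits = [3, 7, 3, 5, 5, 5, 1, 5]
print(n_back_targets(digits, 2))
print(n_back_targets(digits, 1))
```

Positions are zero-based, so the first possible target in a 2-back stream is the third stimulus.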
Throughout the literature, eye position is used as an indirect measurement of where attention is currently deployed in space. This eye position is then translated into two primary classes of eye movements, fixations and saccades. Fixations are a type of eye movement in which the gaze is held almost stationary and visual information is obtained [3]. Saccades are fast movements (typically defined as movements greater than 300°/sec) that redirect the eye to a new part of the surroundings [3]. In much of the literature, the emphasis of the experiments, analysis, and modeling is based on fixations since they represent where a person's attention is focused. In this research, the focus is on saccades since, in addition to playing an important role in redirecting the eye to a new position, saccades may provide further insight into the underlying nature of eye movements.
Orienting is the process of aligning attention with a source of sensory input or an internal semantic structure stored in memory [4]. Detection of a stimulus means that the stimulus has reached a level of the nervous system at which it is now possible for the subject to report its presence [4]. These search experiments address bottom-up guidance, in addition to the goal-directed search guided by top-down mechanisms, by examining the difficulty of detecting the target amongst different combinations of distractors.
Egeth and Yantis (1997) reviewed various experiments related to attentional control and attentional capture. Attentional control describes the extent to which deployment of attention is a result of an individual's deliberate state of attentional readiness (i.e., goal-directed control) or is captured by specific aspects of the image (i.e., stimulus-driven control) [10]. Their paper focuses on two types of external stimuli, feature singletons and abrupt visual onsets, to determine the extent to which each captures attention.
By exploring attentional control in addition to capture, the paper explores not only the effect of these stimuli on bottom-up guidance, but also the extent to which top-down guidance influences the ability of a specific stimulus to capture one's attention.
Feature singletons are stimuli that differ substantially from their background in one or more visual attributes (e.g., color, orientation) [10]. Egeth and Yantis comment on the amount of conflicting evidence in the literature regarding whether singletons capture attention (see Pashler (1988), Theeuwes (1991a), and Joseph and Optican (1996) for studies supporting that singletons capture attention; and Jonides and Yantis (1988), Theeuwes (1990), and Hillstrom and Yantis (1994) for examples of studies that conclude that singletons do not capture attention) [10]. In an attempt to reconcile the conflicting conclusions regarding feature singletons and attentional capture, Bacon and Egeth (1994) suggest that these results are the manifestations of two different attentional strategies adopted by participants [10]. Under some circumstances, subjects direct attention to the location exhibiting the largest local feature contrast as they search for the feature singleton [10]. In other stimulus conditions, subjects instead direct attention to the locations that match some task-defined visual feature (e.g., locations of blue items since the task is to identify a blue circle) [10]. This suggests that, depending on a participant's goals or intentions, the participant may be able to resist or minimize the effect of a feature singleton on attentional capture.
With abrupt visual onsets, there is a consensus that abrupt onsets capture attention. The extent to which attention is captured may depend on a number of variables. For example, Jonides (1981) showed that peripheral cues always drew attention (whether or not they were informative about the target location), while central cues only captured attention when they were informative [10]. Yantis and Jonides (1990) found that capture, which occurs in the absence of any relevant attentional set, is prevented when subjects are induced to focus attention on a different spatial location in advance of each trial [10]. This suggests that an individual can potentially resist or reduce the effects of abrupt visual onsets capturing attention, if attention is focused in advance.
Mack et al. (2002) describe a series of experiments that explore the power of specific stimuli to capture attention when an inattentional state is produced. One inattentional state utilized was the attentional blink. In order to investigate whether an attentional blink occurred, a rapid serial presentation of alphanumeric characters was presented to an observer [11]. The observer was asked to report two consecutively presented characters designated as targets. An attentional blink occurs when the observer fails to detect the second target. It is called an attentional blink because the failure is attributed to the demands placed on attentional processing imposed by the first target, which make the processes for detecting the second target temporarily unavailable [11]. The experiments produced evidence that a complex, familiar, and meaningful stimulus (i.e., one's own name) is able to capture attention and be perceived under a variety of conditions in which other stimuli are not. Mack et al. concluded that under conditions of inattention, it is the meaning of the stimulus which is ascertained or processed without attention that captures attention and subsequently brings that stimulus into awareness [11].
Posner et al. (1980) examined the relationship of orienting and detecting in the task of reporting the presence of a visual signal by reviewing experiments from the literature as well as executing their own experiments to validate or extend upon those conclusions [12]. The objective of one experiment was to compare detection latencies when a stimulus location was cued on each trial (mixed block) to a non-cued situation in which subjects prepared for one location for a block of trials (pure blocks) [12].
Each stimulus trial consisted of a visual warning signal, a stimulus (LED), the subject's response, feedback based on the response, and an inter-trial interval. For the mixed block trials, 80% of the trials contained a digit (1, 2, 3, or 4) as a warning signal [12]. The digit indicated the most probable location at which the subject could expect the stimulus to occur [12]. The remaining 20% included a warning signal of a plus sign, indicating that the four locations were equally likely [12]. For the pure blocks, the warning signal was always a plus sign followed by an equal (each of the four locations is equally probable) or unequal block condition [12]. In the unequal condition, participants were informed of the most likely stimulus location at the beginning of each unequal block [12]. When subjects were cued on each trial, they showed stronger expectancy effects than when a probable position was held constant for a block, indicating the active nature of the expectancy [12]. Knowledge regarding the location of the stimulus produced benefits when it was used actively (cued) but not when it was used to maintain a general set (blocked) [12].
Another experimental task described in Posner et al. (1980) involved a measure of reaction time to the onset of a stimulus LED. Again, warning signals of either a plus sign or a digit (1, 2, 3, or 4) were presented to indicate likely stimulus locations [12].
The plus sign indicated that each position was equally likely to contain the stimulus [12]. The digit indicated the most likely position during that block, while the next most likely position remained constant for three consecutive blocks [12]. Subjects were asked to remember these positions throughout the block and try to prepare accordingly [12]. Statistical analysis showed that for the most likely target positions, subjects exhibited significantly faster reaction times [12]. When the second most likely position was adjacent to the most likely position, the reaction time resembled that of the most likely target position [12]. However, when it was separated by a position from the most likely target position, the reaction time resembled that of the least likely position [12]. The results suggest that it is not possible for subjects to split their attentional mechanism between two positions separated in space when trying to detect a target [12]. Furthermore, the results indicate severe limits in the ability of subjects to assign attention to a secondary focus in addition to a primary focus [12].
The authors conclude that there is no evidence of an ability to divide attention [12].
The conclusions of Posner et al., regarding the ability to divide attention, dealt with an experiment that was solely based on visual stimuli. In this research, attention is again divided, but between a visual focus and a secondary aural task. One goal of this research is to determine how the model of attention changes when participants are asked to perform this secondary task. These changes in the focus of attention while performing a secondary task may provide insight into the extent of one's ability to effectively divide attention between a visual focus and a secondary aural task.

Attention Theories and Models
The models in the current literature focus primarily on attentional capture, visual search, and detection. Certain models rely solely on saliency-based, bottom-up attention [23,25,27,28,29], while others incorporate both bottom-up, image-based saliency cues and top-down, task-dependent cues [13,14,16,22,24]. Absent from the literature are models which focus on only top-down guidance, without search or distractors, which is the focus of this research. Since the participants are presented with a singular visual task (focus on the single black dot on the screen) with no other images on the screen, only top-down attention is utilized. Auditory stimuli are also presented during the n-back task, but the stimuli consist of a steady stream of numbers with no abrupt changes or onsets to draw attention away from the screen.
"For the last decade the attention literature has been embroiled in a debate over the nature of visual spatial attention that focuses on the 'thing' that attention selects;" many have attempted to determine whether attention selects regions of space (space-based attention) or specific objects within a region (object-based attention) [13]. The Contour Detector (CODE) Theory of Visual Attention (TVA), CTVA, integrates space-based and object-based approaches to attention by merging the CODE theory of perceptual grouping by proximity with TVA [13]. CTVA's representation of space and objects derives from the CODE theory of perceptual grouping [13]. CODE theory utilizes two representations of space, one which represents the locations of items and the other representing objects and groups of objects [13]. CODE assumes that the location of each item is represented by its own distribution, and the CODE surface is formed by bottom-up processes from the summation of those distributions [13]. By applying a threshold set by top-down processes to the CODE surface, items residing in the same above-threshold region are categorized into perceptual groups [13]. When applying the CODE theory to attention, the CODE surface can be viewed as distributions of item features where the height represents the probability of sampling those features [13]. CODE provides a sampling of visual features as an input to TVA, which then chooses among categorizations of perceptual inputs to select where attention will be allocated [13].
Pomplun et al. (2003) developed the Area Activation Model, which is a computational model that predicts the statistical distribution of saccadic endpoints in visual search tasks [15]. The distribution is based on the assumption that saccades in visual search tend to foveate the display areas that provide the maximum amount of task-relevant information for the subsequent fixation.
Navalpakkam and Itti (2005) describe another computational model for task-specific guidance of visual attention.
Their model is based on a biologically motivated architecture which emphasizes the aspects important to biological vision [16]. The model is divided into four important steps in guiding visual attention in real-world scenes. The model first determines the task-relevance of an entity. Then, the model biases attention for the low-level visual features indicative of the desired targets. The model uses these low-level features to recognize targets. Throughout the process, the model incrementally builds a visual map of the task-relevance at every scene location and stores this information within the memory of the model. Rutishauser and Koch (2007) developed a generative model that reproduces eye movements during a visual search task. It calculates the conditional probabilities that, given a specified target, observers fixate on or near an item sharing a specific feature with that target [22]. The model consists of three modules: (1) feature extraction and representation of the visual scene, (2) saccade planning, and (3) target detection [22].
The probabilities generated by the model are used to infer which visual features were biased by top-down attention [22].
Itti and Koch (2011) established a combined model of attentional selection and object recognition [24]. This model is based on a framework in which subjects selectively direct attention to objects within a scene using both bottom-up, image-based saliency cues and top-down, task-dependent cues [24]. The model uses a bottom-up feature extraction pathway to select informative image regions from an incoming visual scene [24]. A trained knowledge base hierarchically represents object classes and encodes each object within an object class [24]. This encoding includes both expected visual features and a set of critical points on these objects, which the model then uses for object recognition [24].
A contextual cueing model for attentional guidance based on global scene configuration is described in [25]. This model utilizes statistical correlations between global scene structure and object properties to facilitate search in complex scenes [25]. In addition, it provides estimates of the likelihood of finding an object in a given scene and its most likely position [25]. In Rimey and Brown (1990), the focus of the model is attention allocation [26]. They consider an augmented hidden Markov model in which allocation of visual processing is controlled preferentially within a scene, leaving open just the resources being managed [26].
As in the attention experiments, the models focus on aspects of attention such as visual search, detection of features, and attention allocation. The model considered in this research focuses on the inherent elements of eye movements when attention is singularly focused. The papers listed in the bibliography contain additional information about attentional theories and models.

Random Variables and Probability Distributions
A random variable, $X$, is a function which maps the sample space, $S$, into a subset of numbers on the real line [30]. This subset of numbers forms a new sample space $S_X$, which for a discrete random variable includes a finite or countably infinite set of points, $X(s_i) = x_i$ for $i = 1, 2, \ldots$ [30]. To characterize the properties of a random variable, a probability mass function (for discrete random variables) or a cumulative distribution function (CDF) can be used. These functions summarize the probabilities associated with the random variable assuming certain values and contain all the information available about a random variable before its value is determined by an experiment [31]. The probability mass function (PMF) is defined in [32] as:
$$p_X(x_i) = P[X = x_i]$$
For a discrete random variable, the probability mass function can be equivalently defined as the discrete probability density function (PDF) by:
$$f_X(x) = P[X = x]$$
For a continuous random variable, the probability density function contains the same information as that contained in a probability mass function for a discrete random variable. The probability density function is a function, $f(x)$, which possesses the following properties:
$$f(x) \ge 0, \qquad \int_{-\infty}^{\infty} f(x)\,dx = 1, \qquad \int_{a}^{b} f(x)\,dx = P[a \le X \le b]$$
where $a$ and $b$ are any two values of $x$ satisfying $a < b$ [33]. The cumulative distribution function represents the probability that $X$ takes on a value less than or equal to $x$, and is defined as:
$$F_X(x) = P[X \le x]$$
The cumulative distribution functions for discrete and continuous random variables are defined in terms of the probability mass or density functions as follows:
$$F_X(x) = \begin{cases} \sum_{x_i \le x} p_X(x_i) & \text{if } X \text{ is discrete} \\ \int_{-\infty}^{x} f_X(t)\,dt & \text{if } X \text{ is continuous} \end{cases}$$
Often when it is clear to which random variable the distribution function applies, the subscript is removed, and thus, $F_X(x)$ becomes $F(x)$. These two terms can be viewed as interchangeable throughout this paper.
This section describes a few of the most important distributions in order to illustrate some of the statistical concepts discussed. The first distribution, which is perhaps the easiest to understand and represent, is the uniform distribution. In the uniform distribution, within the range of possible outcomes, each outcome is equally likely to occur. For outcomes on an interval $[a, b]$, its distribution functions are defined as follows:
$$f(x) = \begin{cases} \frac{1}{b - a} & a \le x \le b \\ 0 & \text{otherwise} \end{cases} \qquad F(x) = \begin{cases} 0 & x < a \\ \frac{x - a}{b - a} & a \le x \le b \\ 1 & x > b \end{cases}$$
The normal distribution, or Gaussian distribution, is probably the most important distribution due to its wide applicability throughout a variety of domains. It is defined by two parameters, a location parameter, $\mu$, and a scale parameter, $\sigma$. The location parameter represents the center or mean of the 'bell-shaped' curve of the distribution, whereas the scale parameter represents the spread or standard deviation of the distribution. Thus, both parameters are important in defining the related distribution functions:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right), \qquad F(x) = \int_{-\infty}^{x} \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(t - \mu)^2}{2\sigma^2}\right) dt$$
Since the integral for $F(x)$ cannot be solved explicitly, the value for the cumulative distribution function for the normal curve is found from a table representing the standard normal distribution. In the standard normal distribution, $\mu = 0$ and $\sigma = 1$, resulting in the following distribution functions:
$$\phi(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}, \qquad \Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-t^2/2}\,dt$$
To obtain the value for the general normal distribution, the following transformation is used to determine which value of $\Phi(z)$ to look up in the table:
$$z = \frac{x - \mu}{\sigma}, \qquad F(x) = \Phi\left(\frac{x - \mu}{\sigma}\right)$$
Another distribution of relevance to this research is the Poisson distribution. A random variable, $X$, is a Poisson random variable with parameter $\lambda > 0$ if its probability mass function is defined as:
$$p_X(k) = \frac{e^{-\lambda} \lambda^{k}}{k!}, \quad k = 0, 1, 2, \ldots$$
The cumulative distribution function of a Poisson random variable is defined as:
$$F_X(k) = e^{-\lambda} \sum_{i=0}^{k} \frac{\lambda^{i}}{i!}$$
The expected value, $E[X]$, of a random variable is the average value of the outcomes of a large number of experimental trials [30]. For a discrete random variable, it is defined by the equation:
$$E[X] = \sum_{i} x_i\, p_X(x_i)$$
Equivalently, for a continuous random variable:
$$E[X] = \int_{-\infty}^{\infty} x\, f_X(x)\,dx$$
The variability of a random variable is measured by its variance, defined by:
$$\mathrm{var}(X) = E\left[(X - E[X])^2\right]$$
For a uniform random variable on $[a, b]$, the variance is equal to $(b - a)^2 / 12$. The variance for the normal distribution is again easily obtained from one of its parameters, with $\mathrm{var}(X) = \sigma^2$. One important characteristic of a Poisson random variable is that its variance is equivalent to its expected value; thus, $\mathrm{var}[X] = \lambda$.
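As a rough numerical illustration of two points made in this section, the following Python sketch evaluates the normal cumulative distribution function through the standardizing transformation and checks that a Poisson random variable has equal mean and variance. It is a minimal sketch, not part of the original analysis: the function names are hypothetical, the error function replaces the table lookup, and the Poisson sums are truncated at an assumed upper limit `k_max`.

```python
import math


def standard_normal_cdf(z):
    """Phi(z), written with the error function instead of a table lookup."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_cdf(x, mu, sigma):
    """F(x) for a normal distribution, via the transformation z = (x - mu) / sigma."""
    return standard_normal_cdf((x - mu) / sigma)


def poisson_pmf(k, lam):
    """P[X = k] = exp(-lam) * lam**k / k!, evaluated in log space for stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def poisson_cdf(k, lam):
    """P[X <= k] as a finite sum of PMF values."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))


def poisson_mean_and_variance(lam, k_max=100):
    """Approximate E[X] and var(X) by truncating the sums at k_max (an assumption)."""
    pmf = [poisson_pmf(k, lam) for k in range(k_max + 1)]
    mean = sum(k * p for k, p in enumerate(pmf))
    variance = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))
    return mean, variance


# x = 12 with mu = 10 and sigma = 2 standardizes to z = 1.
print(normal_cdf(12.0, mu=10.0, sigma=2.0), standard_normal_cdf(1.0))
# Both entries should be close to lam = 4.
print(poisson_mean_and_variance(4.0))
```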

Overview of Random Processes
Suppose a characteristic of a system is observed at discrete points in time, $0, 1, 2, \ldots$, and let $X_t$ be a random variable which represents the value of the system characteristic at time $t$. A discrete-time stochastic process or random process is simply a description of the relation between the random variables $X_0, X_1, X_2, \ldots$ [34]. A continuous-time stochastic process is a stochastic process in which the state of the system can be viewed at any time, not just at discrete instants in time [34]. Stochastic processes are distinguished by their state space, or the range of possible values for the random variables $X_t$, by their index set $T$, and by the dependence relations among the random variables $X_t$ [31].
Another way to think about a random process is to consider a conceptual random process generator. In this conceptual random process generator, the input consists of an infinite sequence of random variables along with their probabilistic description (e.g., probability mass function), and the output is an infinite sequence of numbers, or the random process [30]. In other words, a random process is an infinite sequence of random variables, with one random variable for each time instant, and each realization of the random process takes on a value represented by an infinite sequence of numbers [30].
A point process is a type of random process that models the occurrences of some phenomenon at the time epochs $\{t_i\}_{i \in I}$, where $I$ refers to an index set [35]. For a point process, there are four equivalent descriptions of the sample paths: (1) counting measures, (2) non-decreasing, integer-valued step functions, (3) sequences of points, and (4) sequences of intervals (see Figure 2-2) [35]. A temporal point process is composed of a time-series of binary events that occur in continuous time [35]. At each point in time, the point process can take on only one of two possible values indicating whether an event occurred [35]. As with any random signal, the data observed as part of a point process can be expressed in terms of multiple collections of random variables [36]. For example, let $T_1, T_2, \ldots$ be random variables describing the occurrence times of a point process. A realization of this point process is the event $T_1 = t_1, T_2 = t_2, \ldots$ for some collection of times $0 < t_1 < t_2 < \cdots$ [36].
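The equivalence of the four sample-path descriptions can be illustrated by converting between them. The sketch below is a minimal illustration under stated assumptions: the function names and the example event times are hypothetical, event times are assumed sorted and positive, and the first interval is measured from time zero.

```python
from bisect import bisect_right
from itertools import accumulate


def times_to_intervals(event_times):
    """Sequence of points -> sequence of intervals (first interval starts at 0)."""
    intervals = []
    previous = 0.0
    for t in event_times:
        intervals.append(t - previous)
        previous = t
    return intervals


def intervals_to_times(intervals):
    """Sequence of intervals -> sequence of points, by cumulative summation."""
    return list(accumulate(intervals))


def count_up_to(event_times, t):
    """Step-function view N(t): number of events in (0, t]."""
    return bisect_right(event_times, t)


def count_in(event_times, a, b):
    """Counting-measure view N((a, b]): number of events in the interval (a, b]."""
    return bisect_right(event_times, b) - bisect_right(event_times, a)


# Hypothetical event times in seconds (illustrative only).
events = [0.5, 1.25, 3.0]
print(times_to_intervals(events))
print(intervals_to_times(times_to_intervals(events)))
print(count_up_to(events, 1.25), count_in(events, 0.5, 3.0))
```

Each function reads the same realization in a different way, which is why any one of the four descriptions determines the other three.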
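Any point process can be completely characterized by its conditional intensity function, $\lambda(t \mid H_t)$, defined as:
$$\lambda(t \mid H_t) = \lim_{\Delta \to 0} \frac{\Pr\left(N(t, t + \Delta] = 1 \mid H_t\right)}{\Delta}$$
where $\Pr\left(N(t, t + \Delta] = 1 \mid H_t\right)$ is the instantaneous conditional probability of an event and $H_t$ is the history of events up to time $t$ [35]. A Poisson process of intensity, or rate, $\lambda > 0$ is an integer-valued stochastic process $\{X(t);\ t \ge 0\}$ for which the following properties hold: (1) for any time points $t_0 = 0 < t_1 < t_2 < \cdots < t_n$, the process increments
$$X(t_1) - X(t_0),\ X(t_2) - X(t_1),\ \ldots,\ X(t_n) - X(t_{n-1})$$
are independent random variables; (2) for $s \ge 0$ and $t > 0$, the random variable $X(s + t) - X(s)$ has the Poisson distribution:
$$\Pr\{X(s + t) - X(s) = k\} = \frac{(\lambda t)^{k} e^{-\lambda t}}{k!}, \quad k = 0, 1, \ldots$$
A Poisson point process $N((a, b])$ counts the number of events occurring in an interval $(a, b]$ [31]. A Poisson counting process, or more simply a Poisson process $X(t)$, counts the number of events occurring up to time $t$ [31]. The waiting time $W_n$ is the time at which the $n$th event has occurred. The waiting times follow the gamma distribution, with probability density function:
$$f_{W_n}(t) = \frac{\lambda^{n} t^{n-1}}{(n-1)!} e^{-\lambda t}, \quad n = 1, 2, \ldots, \ t \ge 0$$
In particular, $W_1$, the time to the first event, is exponentially distributed:
$$f_{W_1}(t) = \lambda e^{-\lambda t}, \quad t \ge 0$$
The sojourn times (i.e., the times between events), $S_0, S_1, \ldots, S_{n-1}$, are independent random variables, each having the exponential probability density function:
$$f_{S_k}(s) = \lambda e^{-\lambda s}, \quad s \ge 0$$
Additionally, events of a Poisson process occur randomly in time, the first at $W_1$ or $t_1$, the second at $t_2$, and so on, with the $n$th event occurring at $t_n$. The random variable $T_n$ denotes the time at which the $n$th event occurs, and the values $t_n$ of $T_n$ are called the points of occurrence [32]. These points, occurring randomly over a specified time period, follow a uniform distribution.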
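These properties suggest simple empirical checks: if events follow a Poisson process with a given rate, the times between events should average the reciprocal of the rate, and the counts in equal-length windows should have roughly equal mean and variance. The sketch below simulates such a process from exponential sojourn times. It is only an illustration with assumed values (rate, duration, seed) and hypothetical function names, and it is not the test procedure applied to the saccade data in this research.

```python
import random
import statistics


def simulate_poisson_process(rate, duration, rng):
    """Event times on (0, duration], built from independent exponential sojourn times."""
    times = []
    t = rng.expovariate(rate)
    while t <= duration:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def counts_per_unit_window(event_times, duration):
    """Number of events falling in each consecutive window of length one."""
    counts = [0] * int(duration)
    for t in event_times:
        counts[min(int(t), len(counts) - 1)] += 1
    return counts


rng = random.Random(0)            # fixed seed so the sketch is repeatable
rate, duration = 2.0, 5000.0      # assumed values for illustration
events = simulate_poisson_process(rate, duration, rng)

# Sojourn times: gaps between consecutive events (the first gap starts at 0).
gaps = [b - a for a, b in zip([0.0] + events[:-1], events)]
mean_gap = statistics.mean(gaps)

counts = counts_per_unit_window(events, duration)
mean_count = statistics.mean(counts)
var_count = statistics.pvariance(counts)

print(f"mean sojourn time: {mean_gap:.3f} (1/rate = {1 / rate:.3f})")
print(f"window counts: mean {mean_count:.3f}, variance {var_count:.3f}")
```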
Another type of stochastic process of interest is the Wiener process, which is the mathematical model for Brownian motion. Brownian motion describes the random motion of particles suspended in a fluid resulting from their collisions with the molecules of the fluid. A stochastic process $W(t)$, $-\infty < t < \infty$, is a Wiener process with parameter $\sigma^2$ if it satisfies the following properties as defined in [37]:
(1) $W(0) = 0$
(2) $W(t) - W(s)$ has a normal distribution with mean equal to zero and a variance equal to $\sigma^2 (t - s)$ for $s \le t$.

A random walk is an extension of the Wiener process for which the direction of change in position is also of importance. For example, define a sequence of independent, identically distributed discrete random variables by $\{X_k\}_{k=1}^{\infty}$. For each positive integer $n$, let $S_n$ denote the sum of the sequence of $X_k$'s up to $n$:

$$S_n = \sum_{k=1}^{n} X_k$$

The sequence $\{S_n\}_{n=1}^{\infty}$ is a random walk. In a simple random walk, the $X_k$'s follow a Bernoulli distribution in which:

$$\Pr(X_k = 1) = p, \quad \Pr(X_k = -1) = q$$

where $p + q = 1$. If $p = q = 0.5$, this is a simple symmetric random walk. The more general symmetric random walk is the process obtained by summing up independent random variables $X_k$, each uniformly distributed over a set of steps together with their inverses.

The random walk can also be studied in multiple dimensions. When considering a two-dimensional random walk, in which a step can be taken in any direction or angle, $\theta$, the direction is chosen randomly with probability density $\frac{1}{2\pi}$. Thus, the angle or direction of the random walk follows a uniform distribution.
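The simple symmetric random walk can be sketched in a few lines. This example is illustrative only: the function name `simple_random_walk`, the step count, and the seed are assumptions. It checks the standard result that, for $p = q = 0.5$, the position after $n$ steps has mean zero and variance $n$.

```python
import random

def simple_random_walk(n_steps, p, rng):
    """Return the positions S_1..S_n of a walk with steps +1 (prob. p) or -1 (prob. 1 - p)."""
    position = 0
    path = []
    for _ in range(n_steps):
        position += 1 if rng.random() < p else -1
        path.append(position)
    return path

rng = random.Random(1)
n_steps = 100
# Final positions S_n of 2000 independent simple symmetric walks.
finals = [simple_random_walk(n_steps, 0.5, rng)[-1] for _ in range(2000)]
mean_final = sum(finals) / len(finals)
var_final = sum((s - mean_final) ** 2 for s in finals) / (len(finals) - 1)
print(f"mean of S_n = {mean_final:.2f}, variance of S_n = {var_final:.1f} (theory: 0 and {n_steps})")
```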
If the $X_k$'s follow a normal distribution with mean equal to zero and variance equal to $\sigma^2$, the sequence $\{S_n\}_{n=1}^{\infty}$ is a random walk with Gaussian steps. When adding independent Gaussian random variables, the resulting sum is also normally distributed, with mean and variance equal to the respective sums. If $X$ is normally distributed with mean $\mu_X$ and variance $\sigma_X^2$, this can be represented as $X \sim N(\mu_X, \sigma_X^2)$. If, in addition, $Y \sim N(\mu_Y, \sigma_Y^2)$ is independent of $X$ and $Z = X + Y$, then $Z \sim N(\mu_X + \mu_Y, \sigma_X^2 + \sigma_Y^2)$. A random walk with Gaussian steps is therefore normally distributed with mean equal to zero and variance $\sigma_{S_n}^2 = \sum_{k=1}^{n} \sigma_k^2$, where $\sigma_k^2$ is the variance of $X_k$.

Each of the stochastic processes described in this section is considered in the data analysis and modeling of the eye tracker data. The Poisson process is considered as a representation of the saccade occurrences. The Wiener process and random walk are considered for the gaze pattern or eye movements that occur during the run.

List of References
[2] M. Corbetta and G. L. Shulman, "Control of goal-directed and stimulus-driven attention in the brain," Nature Reviews: Neuroscience, vol. 3, no. 3, pp. 201-215, 2002.
[3] M. F. Land, "Eye movements and the control of actions in everyday life," Progress in Retinal and Eye Research, vol. 25, no. 3, pp. 296-324, 2006.

Material
The required equipment for this experiment includes: This configuration was consistent for all test subjects.

Description of Experiment
In order to determine whether attention can be modeled as a stochastic process, eye-tracking data was collected and analyzed. The data included horizontal and vertical eye position within the experimental field and pupil size. For this study, a limited number of gaze paths (nine participants) were collected to support the data analysis. Participants were solicited from the URI student population. The study was performed in the Eye Lab in Gilbreth Hall and utilized the ISCAN software available.
A C-Sharp program displayed a single black dot in the center of a screen, and collected data from the eye tracker system. The participants were asked to look at the dot for two minutes, on two separate types of trials. Each participant repeated each of the two trial types three times (total of six runs per participant).
On one trial, participants were required to perform an additional task requiring cognitive processing, i.e., the n-back task. In the n-back task, a series of verbal stimuli (e.g., a series of numbers) was presented and the participant was asked to indicate when the currently presented stimulus was the same as the one presented n trials previously [1]. For this data collection, $n = 2$ (i.e., participants would indicate that the same stimulus was presented on the string "141" after hearing the second "1"). On the other trial, the participants were simply asked to focus on the black dot in the center of the screen. Data was collected from the eye tracker software through the C-Sharp program and analyzed to support assessment of the hypothesis that eye movements, when a participant is asked to focus on a single point, can be modeled as a random process (see Chapter 4 for more details on the modeling process). The purpose of including the additional cognitive task is to determine its effect on the focus of visual attention and the resulting gaze path.

Since the focus of the eye does not always start in a position perpendicular to the screen, the angle of movement must be calculated utilizing three-dimensional geometry. Figure 3-3 illustrates the three-dimensional geometry utilized to determine the overall amount of movement between eye position samples (in degrees). This angle is then used to calculate the angular velocity of the eye movement by dividing by the time between samples.
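One common way to carry out this kind of calculation is sketched below. It is not necessarily the geometry of Figure 3-3: the sketch assumes that the eye sits at a known distance along the normal through the screen center, that gaze positions are expressed in the same length units relative to that center, and that samples arrive at 240 Hz. The function names are hypothetical.

```python
import math

def gaze_angle_deg(p1, p2, distance):
    """Angle (degrees) between the lines of sight to two on-screen points.

    p1, p2   : (x, y) gaze positions relative to the screen center
    distance : eye-to-screen distance, in the same units as x and y (assumed known)
    """
    v1 = (p1[0], p1[1], distance)
    v2 = (p2[0], p2[1], distance)
    dot = sum(a * b for a, b in zip(v1, v2))
    cross = (
        v1[1] * v2[2] - v1[2] * v2[1],
        v1[2] * v2[0] - v1[0] * v2[2],
        v1[0] * v2[1] - v1[1] * v2[0],
    )
    cross_norm = math.sqrt(sum(c * c for c in cross))
    # atan2 of |v1 x v2| and v1 . v2 stays accurate for very small angles.
    return math.degrees(math.atan2(cross_norm, dot))

def angular_velocity_deg_per_s(p1, p2, distance, rate_hz=240.0):
    """Angular velocity between consecutive samples: angle divided by 1/rate_hz."""
    return gaze_angle_deg(p1, p2, distance) * rate_hz

# Hypothetical example: a 0.5 cm shift viewed from 60 cm.
print(angular_velocity_deg_per_s((0.0, 0.0), (0.5, 0.0), 60.0))
```

Working with the full three-dimensional vectors, rather than the on-screen distance alone, accounts for gaze positions that start away from the point directly in front of the eye.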

Experimental Procedure
The sequence of events for the experiment is detailed in the following procedure.
1. Start the eye tracker system.
2. Open and run the C-sharp program "SingleDotFocus".
3. Perform the eye calibration.
a. Ask the participant to sit in the chair facing the large projector screen and place his/her chin on the chin rest, and look straight ahead.
b. Select 'track active' and make the necessary adjustments to the eye tracker system to obtain a clear picture of the eye and pupil (as shown in Figure 3-1).
e. Ensure that the system is properly calibrated before proceeding to the experiment. If the system calibration is inaccurate, repeat the calibration.
4. Present the screen containing a single center dot and ask participants to focus on the dot for two minutes on two different types of trials.
a. On one trial, the participant will simply focus on the black dot.
b. On the other trial, a series of numbers will be presented to the participant and he/she will be asked to indicate when the currently presented number is the same as the number presented two trials previously (e.g., for the sequence of numbers "3-7-3", a report of same would occur after the second "3" is heard).

5. Repeat each type of trial three times (the order of trial types will be randomized), for a total of six runs per participant.
During the n-back trials, each time a participant signaled that the currently presented number was the same as the one presented two trials previously, it was manually documented. No associated time stamp was recorded. This additional cognitive task was included in order to determine its effect on the gaze pattern. The purpose of this experiment was not to assess a participant's speed or accuracy in identifying the repeated numbers, and thus, an actual performance assessment of the n-back task extended beyond the scope of this research.

Eye-tracker data analysis
Data collected for each participant was analyzed using MATLAB (see Appendix central position, blinks were removed (to avoid skewing the data unfairly), and the data was plotted without blinks. It is clear from the following plots that there is a slight difference in the pattern of drifts, but the overall shapes are similar. In addition, there is a reduction of noise in the data when the blinks were removed and the distance was determined using the central focus of the participant.
It was also important to consider the variation in the amount of noise between participants. Initially, a distance-from-center threshold was used to identify drifts. However, due to the differences in variation, thresholds were also considered in terms of the number of standard deviations for each participant.

In order to classify eye movements from the eye tracker data, various event detection algorithms were reviewed. i.e., when determining points belonging to a fixation, removing data surrounding blinks, and identifying the onset and offset of a saccade. Thus, this concept will be considered, but will not be a focal point in the following discussion.
The spatial criteria divide algorithms into categories based on their use of dispersion, velocity, or area-of-interest (AOI) information to detect events [1].
Dispersion-based algorithms typically identify samples as belonging to a fixation if the samples are located within a spatially limited region for a minimum period of time (i.e., the fixation duration) [2]. Dispersion-based algorithms utilize the fact that the distance between samples is different for fixations and saccades. These algorithms also take into account that a fixation has a minimum duration of 50-100 msec [3].
Velocity-based algorithms compute the angular velocity between each data point and use a velocity threshold to distinguish whether a point belongs to a fixation or a saccade [3]. These algorithms take advantage of the fact that the velocity of a fixation is different from that of a saccade [3]. Area-based algorithms identify points within given AOIs that represent the relevant visual targets [1]. Since in this task there is only a single focal point, area-based algorithms utilizing AOI information are irrelevant.
Thus, the remainder of the section will focus on dispersion and velocity-based methods.
Dispersion-based algorithms identify fixations as groups of consecutive points of regard (POR) within a particular dispersion or maximum separation [4]. There are a variety of methods that can be used to calculate dispersion. For example, the dispersion can be measured as the distance between each point in a fixation and every other point, or it can be calculated as the distance between successive points [4].

Dispersion-based algorithms typically focus on identifying fixations, whereas velocity-based algorithms are better suited to saccade identification. The Velocity-Threshold Identification (I-VT) algorithm computes the angular velocity between each pair of consecutive data points and uses a velocity threshold to distinguish whether points belong to a fixation or a saccade [3]. The angular velocity is calculated by dividing the angular displacement by the time between sample points. Thus, for consecutive data points at the 240-Hz sampling rate used in this experiment, the points are 1/240 second apart, and the angular velocity is calculated by dividing the angle by 1/240.
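The velocity-threshold idea can be sketched as follows. This is a simplified illustration rather than a full I-VT implementation: the function name `classify_ivt` is hypothetical, the input is assumed to be the angular displacement (in degrees) between consecutive samples, and the threshold is left as a parameter with an illustrative default.

```python
def classify_ivt(angles_deg, rate_hz=240.0, threshold_deg_per_s=300.0):
    """Label each inter-sample movement as 'saccade' or 'fixation'.

    angles_deg          : angular displacement between consecutive samples (degrees)
    rate_hz             : sampling rate; samples are 1/rate_hz seconds apart
    threshold_deg_per_s : velocity threshold (illustrative default)
    """
    labels = []
    for angle in angles_deg:
        velocity = angle * rate_hz  # same as dividing the angle by 1/rate_hz
        labels.append("saccade" if velocity > threshold_deg_per_s else "fixation")
    return labels

# Hypothetical displacements: 0.5 deg -> 120 deg/s, 2.0 deg -> 480 deg/s, 1.0 deg -> 240 deg/s.
print(classify_ivt([0.5, 2.0, 1.0]))
```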
There are a few areas of concern in implementing any of the event detection algorithms. Another issue in analyzing eye tracker data in order to detect saccades results from the noise within the data. Various methods exist for filtering the data, but filtering risks removing important pieces of information. Other methods suggest adaptive thresholds, which can take into account the variation between runs or participants by utilizing the standard deviation [6]. The final issue of concern, when attempting to identify saccades, is the sampling rate of the eye tracker system. Larsson (2010) compared event detection algorithm results for a variety of sampling frequencies ranging from 50 Hz to 1250 Hz [3]. Larsson concluded that a sampling frequency of at least 250 Hz is required to obtain reasonable results when computing angular velocity [3]. Thus, with the maximum available sampling frequency of 240 Hz for this data collection, accurately identifying saccades may be an issue.

Saccade Detection Results and Graphs
Before the data could be modeled, it was necessary to determine where and at what rate saccades were occurring. Based on the review of event detection algorithms, it was determined that a velocity-based algorithm would be used. A variety of thresholds and filtering methods were explored before selecting a threshold value of 300°/second. An adaptive threshold was also considered to account for participants with larger amounts of variation, but ultimately was not used in the saccade identification.
A simple algorithm which identified saccades based predominantly on the velocity threshold was developed. The points that exceeded the threshold were categorized as potential saccades. Further review assessed whether or not these high-velocity events occurred surrounding a blink. The dark-pupil-to-corneal reflection method used by the eye tracker is not able to accurately obtain the eye's position while it is opening (or closing) as part of a blink. Thus, if a potential saccade occurred within twenty samples before or after a blink (identified by a zero in the positional data), it was discarded from the set of potential saccades. The remaining potential saccades were then identified as saccades and used for the following analyses. The data was analyzed using MATLAB with the accompanying pseudocode. The number of saccades within each run was calculated and the times at which the saccades occurred were obtained. The number of data points for each run, both with and without saccades, is shown in Table 4-1. Note that blinks and physiologically impossible values were removed from the data before the total data counts were computed.
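The detection logic described above can be sketched as follows. This is an illustrative reconstruction in Python, not the MATLAB code used in this research. The function name `detect_saccades` is hypothetical, and the sketch assumes that a blink is any sample whose horizontal or vertical position is exactly zero and that `velocities[i]` is the angular velocity associated with sample `i`.

```python
def detect_saccades(velocities, xs, ys, threshold=300.0, window=20):
    """Return indices of samples classified as saccades.

    A sample is a potential saccade if its velocity exceeds `threshold`
    (degrees/second). Potential saccades within `window` samples before or
    after a blink are discarded. Blinks are assumed to appear as zeros in
    the positional data.
    """
    n = len(velocities)
    blink_indices = [i for i in range(n) if xs[i] == 0 or ys[i] == 0]

    # Mark every sample index that lies within the window around any blink.
    near_blink = set()
    for b in blink_indices:
        near_blink.update(range(max(0, b - window), min(n, b + window + 1)))

    return [i for i, v in enumerate(velocities) if v > threshold and i not in near_blink]

# Hypothetical data: three high-velocity samples, one of them close to a blink at index 55.
velocities = [0.0] * 100
velocities[10], velocities[50], velocities[80] = 400.0, 500.0, 350.0
xs = [1.0] * 100
ys = [1.0] * 100
xs[55] = ys[55] = 0.0
print(detect_saccades(velocities, xs, ys))  # indices 10 and 80 remain
```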
Although 20 samples before and after each blink were excluded during saccade identification, only ten samples before and after each blink were removed from the data. For saccade detection, it was important that the adjustment of the eye before and after a blink was not identified as a saccade. For blink removal, it was only necessary to remove the data points that might be part of the actual opening or closing of the eye. For the majority of the runs, removing blinks and the ten samples before and after blinks resulted in less than 10% of the data points being removed.

The following plots illustrate the saccade counts versus time for each of the participants. Each subplot shows either the three control runs or the three n-back runs of that participant. Participant 5 runs have been removed due to a data recording issue.

For participant 3, the saccade pattern between the control runs and the n-back runs did not differ much, nor did the quantity of saccades. Both types of runs varied between 1000 and 1600 saccades during the two-minute period. Participant 4 also showed similar patterns of saccades between the control runs and n-back runs. However, while the first and second runs resulted in 200 saccades or fewer for the control runs, and fewer than 300 saccades for the n-back runs, this number increased substantially to almost 1400 saccades for control run 3 and almost 2000 saccades for n-back run 3.

Overall, there was not a consistent pattern among the saccade counts across participants, or even within participants for the same type of run. Some participants exhibited more consistent patterns of saccades across all runs, whereas others differed considerably between the control and n-back runs. To further assess the distribution of saccades, histograms of saccade counts were generated using ten-second bins. The following figures illustrate a subset of these histograms.

Modeling and Simulation Results
This section considers the various stochastic models described in Section 2.5, and their applicability to different aspects of visual attention. The first process considered is a Poisson process, and how accurately it describes the saccade patterns of the participants. As described in Section 2.5, the events of a Poisson process, $N(t)$, occur randomly in time, and thus, follow a uniform distribution. In addition, $N(0) = 0$ and event occurrences in disjoint time intervals are independent. At the beginning of a run, no saccades have yet occurred, and thus the condition that $N(0) = 0$ is satisfied. Under the assumption that saccade occurrences are not influenced by past saccades, the requirement for independence of events between disjoint time intervals is also satisfied.
In order to test whether the times at which the events occur follow a uniform distribution, the Kolmogorov-Smirnov test was used. A Kolmogorov-Smirnov plot is a graph of the empirical cumulative distribution function compared to the theoretical cumulative distribution function. The empirical distribution was obtained from the event times at which the saccades occurred. For any $x$, the empirical distribution function, $F_n(x)$, represents the proportion of observations less than or equal to $x$ and is defined as:

$$F_n(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{x_i \le x\}$$

where $n$ = the total number of observations or events. The theoretical cumulative distribution function, $F(x)$, represents the hypothesized distribution that the empirical data is expected to follow. As such, $F(x) = P(X \le x)$, the probability of an observation value less than or equal to $x$. If the hypothesized model fails to account for some aspect of the event behavior, then the lack of fit will be reflected in the Kolmogorov-Smirnov plot as a significant deviation from the theoretical cumulative distribution function.
The Kolmogorov-Smirnov test statistic was then calculated to obtain a numerical estimate of lack of fit. This test statistic equals the largest absolute vertical difference between the empirical and theoretical cumulative distribution functions. This test is useful in determining whether a model accurately describes the structure of the data.
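The statistic can be computed directly from sorted event times. The sketch below is illustrative and is not the MATLAB routine used in this research: the function names are hypothetical, the theoretical distribution is assumed to be uniform over the run duration, and the p-value uses the large-sample (asymptotic) Kolmogorov distribution, which is only an approximation for small samples.

```python
import math

def ks_statistic_uniform(times, duration):
    """Largest vertical gap between the empirical CDF of `times` and the
    uniform CDF on [0, duration]."""
    xs = sorted(times)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = min(max(x / duration, 0.0), 1.0)
        # Check the gap just after and just before the jump at x.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d

def ks_pvalue_asymptotic(d, n, terms=100):
    """Approximate p-value from the asymptotic Kolmogorov distribution."""
    lam = math.sqrt(n) * d
    if lam < 0.2:  # the series converges slowly here; the p-value is essentially 1
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return max(0.0, min(1.0, 2.0 * total))

# Hypothetical saccade times (seconds) clustered at the start of a 120 s run.
clustered = [1.0, 2.0, 3.0, 4.0]
d = ks_statistic_uniform(clustered, 120.0)
print(d, ks_pvalue_asymptotic(d, len(clustered)))
```

A small p-value (below 0.05) rejects the null hypothesis that the event times are uniformly distributed over the run.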
For this test, the null hypothesis is that the empirical data follow the hypothesized distribution. Since a significance level of 0.05 is used, any p-value exceeding 0.05 fails to reject the null hypothesis, and is thus treated as evidence supporting that the empirical data follow the hypothesized distribution.
In Table 4-2, the Kolmogorov-Smirnov test is assessing whether or not the saccade occurrence times follow a uniform distribution, and thus, a Poisson process.
The cells highlighted in magenta represent those showing support for the saccade occurrences following a uniform distribution. Of the 52 runs, only 21% failed to reject the null hypothesis, indicating evidence towards the data following a Poisson process.
Thus, it is not likely that the saccades follow a Poisson process.

Table 4-2. Kolmogorov-Smirnov Test for a uniform distribution of saccade event times to indicate whether the saccades follow a Poisson process
The following series of plots show a subset of the Kolmogorov-Smirnov plots testing whether the saccade occurrence times follow a uniform distribution. One aspect of concern is the runs in which the p-value does not appear to correspond with the plot. For example, there are certain plots with small p-values in which it appears that the empirical distribution closely follows the theoretical distribution and others with the opposite issue. These discrepancies may suggest that the Kolmogorov-Smirnov test is not the ideal test for this data. The Kolmogorov-Smirnov test requires that the cumulative distribution function be pre-determined [7]. It is not accurate if the parameters of the cumulative distribution function are estimated from the data [7]. In this case, the parameters were estimated from the data, and this inaccuracy may be represented in the discrepancies exhibited between the graphical representations and the p-values.

Illustrates small p-value (reject that data comes from the theoretical distribution) with small discrepancy between empirical and theoretical distributions
Due to the numerous discrepancies between the p-values and the graphical representations, additional distribution tests were explored. The Lilliefors test is a goodness-of-fit test suitable when a fully specified null distribution is unknown and the parameters must be estimated from the data [7]. The Lilliefors test statistic is the same as the Kolmogorov-Smirnov test statistic, but the theoretical cumulative distribution function parameters are estimated from the data [7]. One limitation of the Lilliefors test is that it can only test whether the empirical data come from one of a few distribution families (the normal, exponential, and extreme value families in MATLAB's lillietest), as opposed to the Kolmogorov-Smirnov test, which allows a comparison against any fully specified distribution function [7]. Thus, instead of analyzing the distribution of the saccade event times against a uniform distribution, the distribution of the time between saccades was compared to an exponential distribution.
As noted previously, another indicator that a point process is Poisson is that its sojourn times (i.e., the times between events) follow an exponential distribution. The empirical cumulative distribution function was assessed against an exponential distribution at a significance level of 0.05 (95% confidence) using the lillietest function in MATLAB. An additional limitation of this function is that a dataset requires at least four sample points, and that p-values above 0.5 and below 0.001 are reported as 0.5 and 0.001, respectively. Data which encountered data collection issues or contained fewer than four data points (i.e., fewer than four saccades) are shown as N/A in the table of p-values (see Table 4-3).
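The idea behind the Lilliefors test for an exponential distribution can be sketched with a Monte Carlo approximation. This is not the MATLAB lillietest implementation: the function names, the number of simulations, and the seed are assumptions. Because the statistic does not change when the data are rescaled, the null distribution can be simulated from a unit-rate exponential.

```python
import math
import random

def ks_stat_exponential(samples):
    """KS distance between the sample and an exponential distribution whose
    mean is estimated from the same sample."""
    xs = sorted(samples)
    n = len(xs)
    mean = sum(xs) / n
    d = 0.0
    for i, x in enumerate(xs):
        cdf = 1.0 - math.exp(-x / mean)
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d

def lilliefors_exponential_pvalue(samples, n_sim=500, seed=0):
    """Monte Carlo p-value: the share of simulated exponential samples whose
    statistic is at least as large as the observed one."""
    rng = random.Random(seed)
    n = len(samples)
    observed = ks_stat_exponential(samples)
    exceed = 0
    for _ in range(n_sim):
        simulated = [rng.expovariate(1.0) for _ in range(n)]
        if ks_stat_exponential(simulated) >= observed:
            exceed += 1
    return (exceed + 1) / (n_sim + 1)

# Hypothetical sojourn times that are nearly constant, hence clearly not exponential.
regular = [1.0 + 0.0005 * i for i in range(200)]
print(lilliefors_exponential_pvalue(regular))
```

Unlike a table-based routine, this approach does not truncate p-values to a fixed range, although its resolution is limited by the number of simulations.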
Razali and Wah (2011) compared the power of four formal tests of normality, including the Kolmogorov-Smirnov and Lilliefors tests. Power comparisons were obtained via Monte Carlo simulation of sample data generated from alternative distributions [9]. Sample sizes ranged from 10 to 2000 [9]. For small sample sizes, the power of each test varied considerably, but at a sample size of 2000, all tests obtained power values exceeding 80% [9]. In this study, the number of samples exceeds 20,000 for each run. Thus, for the distribution tests utilized, the power of each test is more than sufficient, with values exceeding 99% [9].

Table 4-3. Lilliefors Test for an exponential distribution of saccade sojourn times to indicate whether the saccades follow a Poisson process
The p-values appeared to better correspond to the graphical representations in this case, not only for the exponential distribution comparison, but also when reviewing the uniform distribution comparison plots previously illustrated. The following subset of figures illustrates both those empirical distributions that follow the theoretical one and those that do not. For this test, 46% of the 52 runs (assuming the three runs with an insufficient number of saccades do not follow an exponential distribution) resulted in evidence which supports that the saccades follow a Poisson process. The overall conclusion is still that the saccade occurrences do not follow a Poisson process.

Table 4-4. Summary of Lilliefors Distribution Tests - Poisson Process
Another stochastic process considered is the Wiener process. In this section, the goal is to determine whether the gaze path, as defined by the eye movements made between sample times, follows a Wiener process. As defined in Section 2.5, a stochastic process $W(t)$, $-\infty < t < \infty$, is a Wiener process with parameter $\sigma^2$ if: (1) $W(0) = 0$, and (2) $W(t) - W(s)$ has a normal distribution with mean equal to zero and variance equal to $\sigma^2 (t - s)$.

Table 4-5. Summary of Lilliefors Distribution Tests for a Wiener process
For both the horizontal differences and vertical differences in movement between samples, 100% of the 53 runs failed to reject the null hypothesis. This provides strong evidence that the gaze path in each dimension can be modeled as Brownian motion.
Due to the limitations in minimum and maximum p-values, additional tests for normality were considered. The Jarque-Bera test assesses the hypothesis that the sample data follow a normal distribution with unknown mean and variance, against the alternative that the data do not come from a normal distribution. The test statistic for this test is:

$$JB = \frac{n}{6} \left( s^2 + \frac{(k - 3)^2}{4} \right)$$

where $n$ = sample size, $s$ = skewness of the sample data, and $k$ = kurtosis of the sample data [7]. The jbtest function in MATLAB was used. This function replaces the chi-square approximation of the test statistic with a more accurate algorithm that interpolates p-values from a table of quantiles [7]. The resulting Jarque-Bera test p-values for the difference in position, both with and without saccades, are shown in Table 4-6.
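The Jarque-Bera statistic is straightforward to compute from the sample moments, as the following sketch shows. The function names are hypothetical, and the p-value here uses the chi-square approximation with two degrees of freedom rather than the table interpolation performed by jbtest, so the two will not agree exactly.

```python
import math
import random

def jarque_bera(samples):
    """Jarque-Bera statistic: n/6 * (skewness^2 + (kurtosis - 3)^2 / 4)."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in samples) / n   # third central moment
    m4 = sum((x - mean) ** 4 for x in samples) / n   # fourth central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)

def jarque_bera_pvalue_chi2(jb):
    """Chi-square (2 degrees of freedom) approximation: P(X > jb) = exp(-jb / 2)."""
    return math.exp(-jb / 2.0)

# Hypothetical comparison: exponential data should be flagged as non-normal.
rng = random.Random(7)
skewed_sample = [rng.expovariate(1.0) for _ in range(2000)]
jb_skewed = jarque_bera(skewed_sample)
print(jb_skewed, jarque_bera_pvalue_chi2(jb_skewed))
```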

Table 4-6 (cont.) P-values for the Jarque-Bera Test for a normal distribution to indicate whether the gaze path follows a Wiener process
In terms of overall rejection or failure to reject the null hypothesis, the Jarque-Bera test led to the same conclusions as the Lilliefors test. Though there was more variation in the p-values, 100% of the 53 runs still failed to reject the null hypothesis, which provided support that the data could be modeled as a Wiener process. The following plots illustrate the comparisons between the empirical distribution and the theoretical distribution, a normal distribution with mean equal to zero and a standard deviation estimated from the data. The plots illustrate the differences in p-values between the Lilliefors and Jarque-Bera tests. Though the overall conclusions drawn were always the same, the Jarque-Bera test provided additional variation in the p-values that corresponded better with the graphical depiction and reflected the differences in the distribution functions. A summary of the Jarque-Bera test conclusions is shown in Table 4-7.

Table 4-7. Summary of Jarque-Bera Distribution Tests for a Wiener process. Note that the columns labeled removed or included refer to whether saccades were removed from the data before the analysis
This analysis was extended to consider whether the distribution changed as the differences in position were taken over two samples, three samples, four samples, five samples, and ten samples. For a Wiener process, $W(t) - W(s)$ should follow a normal distribution with mean equal to zero and variance equal to $\sigma^2 (t - s)$. Thus, as the distance between $s$ and $t$ increases, it is expected that the variance will increase accordingly. For instance, if $(t - s) = 3$, then the ratio of the variance in this three-sample case to the variance in the consecutive-sample case (i.e., $(t - s) = 1$) should be equal to three. The results are summarized in Table 4-8 and Table 4-9.
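The expected scaling of the variance can be checked on a simulated path, as in the sketch below. The data here are synthetic: the path is built from independent Gaussian steps (an assumption standing in for one positional coordinate sampled at 240 Hz for two minutes), and the function names, step size, and seed are illustrative.

```python
import random

def lag_differences(path, lag):
    """Differences in position between samples that are `lag` samples apart."""
    return [path[i + lag] - path[i] for i in range(len(path) - lag)]

def sample_variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

def variance_ratio(path, lag):
    """Variance of lag-sample differences divided by the variance of
    consecutive-sample differences; close to `lag` for a Wiener process."""
    return sample_variance(lag_differences(path, lag)) / sample_variance(lag_differences(path, 1))

# Synthetic Wiener-like path: 120 s at 240 Hz with Gaussian steps (sigma = 0.1).
rng = random.Random(3)
path = [0.0]
for _ in range(28800):
    path.append(path[-1] + rng.gauss(0.0, 0.1))

ratios = {lag: variance_ratio(path, lag) for lag in (2, 3, 4, 5, 10)}
for lag, ratio in ratios.items():
    print(f"lag {lag:2d}: variance ratio {ratio:.3f}")
```

Ratios that drift away from the lag value as the lag grows indicate that the increments are not behaving like those of a Wiener process at that time scale.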

Table 4-9. Summary of Jarque-Bera Distribution Tests (with saccades included) Comparison of different amounts of time between samples
As the number of time samples between eye positions increased in the calculation of horizontal and vertical differences, the results still strongly supported that the data follow a normal distribution. For the number of time samples up to four, more than 90% of the runs failed to reject the null hypothesis. This number dropped slightly to 86% for five samples between positional difference calculations, but still showed strong support that the data followed a normal distribution. At ten samples, a little more than 70% of the runs failed to reject the null hypothesis. However, the variance ratios did not increase as expected for a Wiener process, as seen in Table 4-10.

In the vertical dimension, there was more variation than in the horizontal dimension. For $(t - s) = 2$ or 3, the variance ratios were quite close to the expected values. However, as the distance between time samples increased up to ten samples, the smallest average ratio observed was 11.707 when it should have been equal to 10. This suggests that as the distance between samples increases, at some point, the differences in position will no longer follow a Wiener process.
Additional analysis of the parameter fits for the mean and standard deviation of the distributions was executed. A statistical summary of the results is provided in Appendix B for each type of run and for the analysis both with and without saccades.
For each comparison, the mean and variance for each pair of datasets were assessed. In order to compare the variances, the MATLAB function vartest2 was used. This function tests the null hypothesis that the variances of the two distributions are equal. This result was used in comparing the means using ttest2 in MATLAB. This function tests the null hypothesis that the means of the two datasets are equal, and an equal or unequal variance assumption must be specified. The results from the various comparisons are shown in Table 4-11.

All the tests resulted in a rejection of the null hypothesis that the variances were equal. For the means, when comparing with versus without saccades, all the tests failed to reject the null hypothesis that the means were equal. Thus, for the comparison of with versus without saccades, there was no difference in means but there was a difference in variances. The runs with saccades consistently resulted in larger variances than those without the saccades.

For the horizontal versus vertical differences in position, the control runs rejected the hypothesis that the means are equal, but the n-back runs failed to reject the null hypothesis. Hence, there was a difference in the variances between the horizontal and vertical dimensions, and between the means for the control runs.

For the comparison of means between the control and n-back runs, in both dimensions, there was a failure to reject the null hypothesis. There was a difference in variances between the control and n-back runs, but there was no difference in means. The variances for the n-back runs were consistently larger than those for the control runs.
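The two-step comparison can be sketched as follows. The code computes only the test statistics and degrees of freedom: the variance-ratio F statistic that underlies a two-sample variance test, and the unequal-variance (Welch) t statistic with the Satterthwaite degrees of freedom. Converting these to p-values requires the F and t distribution functions, which are not in the Python standard library, so that step is omitted. The function names and data are hypothetical.

```python
import math

def mean_and_variance(values):
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, variance

def variance_f_statistic(a, b):
    """F statistic (ratio of sample variances) and its degrees of freedom."""
    _, var_a = mean_and_variance(a)
    _, var_b = mean_and_variance(b)
    return var_a / var_b, (len(a) - 1, len(b) - 1)

def welch_t_statistic(a, b):
    """Unequal-variance t statistic and Satterthwaite degrees of freedom."""
    mean_a, var_a = mean_and_variance(a)
    mean_b, var_b = mean_and_variance(b)
    se_a, se_b = var_a / len(a), var_b / len(b)
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1))
    return t, df

# Hypothetical positional differences for two conditions.
control = [1.0, 2.0, 3.0, 4.0, 5.0]
n_back = [2.0, 4.0, 6.0, 8.0, 10.0]
print(variance_f_statistic(control, n_back))
print(welch_t_statistic(control, n_back))
```

When the variance test rejects equality, as reported here, the unequal-variance form of the t statistic is the appropriate choice for comparing the means.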
Additional analyses regarding whether the gaze path followed a random walk were executed. The distribution of the angle or direction of movement was analyzed.
In comparison to a uniform distribution, when saccades were removed, 60% of the 53 runs failed to reject the null hypothesis. This provides support that the angle of eye movement, in two dimensions, followed a uniform distribution. For the data with saccades, only 21% of the 53 runs failed to reject the null hypothesis. Thus, there was a significant difference in the distributions of the angle of eye movement between the data with and without saccades. When saccades were removed, there is more support for the gaze path following a random walk, compared to when saccades are included in the assessment.
Recall that $S_n$ represents the position of the eye at the end of $n$ movements. The distribution of the sequence $\{S_n\}$ was also examined. However, the resulting empirical distribution functions were varied, and without a consistent pattern. The distributions were first tested against a normal distribution and then a uniform distribution, but neither showed promising results. Thus, from this data, no conclusions can be drawn regarding the distribution of $S_n$.
For each of the distribution tests, simulation runs were also executed and assessed for validation. Multiple sets of 1000 random variables were generated in MATLAB for each of the distributions tested, i.e., uniform distribution, exponential distribution, and normal distribution. For the uniform distribution, each set of random variables was distributed over the same specified range, [0,120], to represent the two-minute test runs. For the exponential distribution, the mean parameter value was varied over 40 different data points based on the range of mean values in the collected data set.
The same approach was followed for determining the standard deviation values for the normal random variables generated. The mean value for the normal random variables was set at zero for all the simulation runs.
The tests used to assess the uniform, exponential, and normal distributions for the collected data were then used on the simulated data for comparison. For each test, the random variables corresponding to the same distribution as the test were assessed, as well as a set of random variables corresponding to a different distribution. First, the uniform theoretical distribution was tested against the set of uniform random variables and the set of exponential random variables using the Kolmogorov-Smirnov test.
Thus, the null hypothesis is that the data (i.e., simulated set of random variables) follow a uniform distribution. The results are summarized in Table 4-12. The cells highlighted in yellow represent that the correct conclusion was drawn, i.e., reject the null hypothesis that the exponential random variables follow a uniform distribution.
Ideally, all of the data points would fall within these cells.

                      Uniform random variables    Exponential random variables
Reject H0                         8                           200
Fail to Reject H0               192                             0

The exponential distribution was assessed using the Lilliefors test. For the exponential distribution, a set of exponential random variables and a set of uniform random variables were tested. The null hypothesis is that the data follow an exponential distribution, and the results are summarized in Table 4-13.

The last set of tests assessed how a simulated set of random variables (generated from a normal distribution or from an exponential distribution) compared to a normal distribution. The null hypothesis is that the data follow a normal distribution with mean equal to zero and a standard deviation estimated from the data. The results are summarized in Table 4-14 and Table 4-15.

In all the simulation results, the test never identified a set of random variables as following the specified distribution when in fact it was generated from a different distribution. For instance, of the 200 trials testing the set of exponential random variables against a normal distribution, all 200 trials rejected the null hypothesis that the data followed a normal distribution. This was consistent across all distribution tests. Thus, these tests generated no false positives. This is important because it assures that those which failed to reject the null hypothesis were actually variables generated from the same distribution being tested.
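A validation of this kind can be sketched by counting rejections over repeated simulated datasets. The example below is illustrative and does not reproduce the reported counts: it tests simulated uniform and exponential samples against a uniform distribution on [0, 120] using the Kolmogorov-Smirnov statistic and the large-sample 5% critical value of approximately 1.358 divided by the square root of n. The function names, the exponential mean of 20 seconds, and the seed are assumptions.

```python
import math
import random

def ks_uniform(samples, upper):
    """KS distance between the sample and the uniform CDF on [0, upper]."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = min(x / upper, 1.0)
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d

def rejection_counts(n_trials=200, n=1000, upper=120.0, exp_mean=20.0, seed=0):
    """Count how often the uniform null hypothesis is rejected for uniform
    and for exponential simulated data."""
    rng = random.Random(seed)
    critical = 1.358 / math.sqrt(n)  # approximate 5% critical value for large n
    counts = {"uniform": 0, "exponential": 0}
    for _ in range(n_trials):
        uniform_data = [rng.uniform(0.0, upper) for _ in range(n)]
        exponential_data = [rng.expovariate(1.0 / exp_mean) for _ in range(n)]
        if ks_uniform(uniform_data, upper) > critical:
            counts["uniform"] += 1       # rejected although truly uniform
        if ks_uniform(exponential_data, upper) > critical:
            counts["exponential"] += 1   # correctly rejected
    return counts

counts = rejection_counts()
print(counts)
```

Rejections of the truly uniform samples are expected at roughly the chosen significance level, whereas the exponential samples should be rejected in essentially every trial.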
On the other hand, there were instances in which a set of random variables (e.g., normal random variables) generated from the specified distribution rejected the null hypothesis that the data followed that specified distribution (e.g., the normal distribution). Thus, there are instances in which the tests may be too rigid or the data, though generated from the specified distribution, still results in test statistics exceeding the acceptable value. It appears that the Jarque-Bera test was more conservative than the Lilliefors test in failing to reject the null hypothesis when the data was indeed generated from a normal distribution. Thus, for the summary of results when testing for Brownian motion, the results from the Jarque-Bera test are used. Overall, the results from the simulated sets of data provide evidence that when a test failed to reject the null hypothesis, that data did indeed follow the specified distribution. [2] M. Nystrom and K. Holmqvist, "An adaptive algorithm for fixation, saccade, and glissade detection in eyetracking data," Behavior Research Methods, vol. 42, no. 1, pp. 188-204, 2010.

List of References
[3] L. Larsson, "Event detection in eye-tracking data," Unpublished Master's Thesis, Lund University Department of Electrical and Information Technology, Sweden, June 2010.
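The simulation check described above can be illustrated with a minimal sketch. It is written in Python with the standard library only, rather than with the MATLAB functions used for the analysis, and it is not the script that produced Table 4-12. The sample size of 500 per trial, the unit-rate exponential, the 5% significance level, and the asymptotic critical value are all assumptions made for illustration.

```python
import math
import random

def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of `sample` and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)

def count_rejections(draw, trials=200, n=500, seed=0):
    """Number of trials in which H0 (data are uniform on [0, 1]) is rejected."""
    rng = random.Random(seed)
    critical = 1.358 / math.sqrt(n)  # assumed: asymptotic 5% critical value
    rejected = 0
    for _ in range(trials):
        sample = [draw(rng) for _ in range(n)]
        if ks_statistic(sample, uniform_cdf) > critical:
            rejected += 1
    return rejected

# Same-distribution case: a few rejections are expected by chance.
uniform_rejections = count_rejections(lambda rng: rng.random())
# Different-distribution case: every trial should be rejected.
exponential_rejections = count_rejections(lambda rng: rng.expovariate(1.0), seed=1)

print("uniform data rejected:    ", uniform_rejections, "of 200")
print("exponential data rejected:", exponential_rejections, "of 200")
```

Under these assumptions, the exponential sets are rejected in every trial, while the uniform sets are rejected only occasionally, which mirrors the pattern of counts reported for the Kolmogorov-Smirnov test.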

CHAPTER 5

CONCLUSIONS
Eye-tracker data were collected from nine participants for a series of runs. Two types of runs were considered: control runs and n-back runs. In both types of runs, the visual task was simple: focus on a single dot on a screen for two minutes. The n-back runs extended the task to include an auditory task in which a stream of numbers was presented to the participant. While focusing on the dot and listening to the number stream, participants were asked to identify when the current number matched the number presented two steps previously. The goal of this research was to identify how the eyes move when a participant is simply focusing on a single location. Unlike most of the experiments in the literature, the goal was not to identify how distractors affected eye movements or how one performed in a visual search task, but rather to provide a foundational model for basic eye movement.
Three stochastic processes were considered in terms of modeling the focus of attention. The first process considered was a Poisson process, and how accurately it describes the saccade patterns of the participants. First, the saccade occurrence times were assessed using a Kolmogorov-Smirnov test. This resulted in 21% of the 52 runs supporting the hypothesis that the data followed a Poisson process. However, due to discrepancies between the graphical representation and the test statistics, additional distribution test statistics were considered. The Lilliefors test was then used to assess the times between saccades (i.e., the sojourn times). The sojourn times were compared to an exponential distribution; for this test, 46% of the 52 runs provided evidence that the saccades follow a Poisson process. There was not a substantial difference between the control runs and the n-back runs, with 53% of control runs and 38% of n-back runs failing to reject the null hypothesis. Overall, the evidence was not strong enough to support that the saccades followed a Poisson process.
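The sojourn-time test summarized above can be sketched as follows. This is an illustrative Python version that assumes a Monte Carlo approximation of the null distribution in place of the tabulated critical values of the Lilliefors test; the function names, the simulated sojourn times, and the number of replicates are hypothetical and are not taken from the analysis scripts.

```python
import math
import random

def ks_exponential(times):
    """KS distance between the sample and an exponential CDF whose mean
    is estimated from the same sample (the Lilliefors setting)."""
    xs = sorted(times)
    n = len(xs)
    mean = sum(xs) / n
    d = 0.0
    for i, x in enumerate(xs):
        f = 1.0 - math.exp(-x / mean)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def lilliefors_exponential(times, replicates=200, seed=0):
    """Return (statistic, Monte Carlo p-value) for H0: times are exponential."""
    rng = random.Random(seed)
    n = len(times)
    observed = ks_exponential(times)
    # The null distribution of the statistic does not depend on the true
    # mean, so unit-mean exponential samples are enough to simulate it.
    exceed = sum(
        ks_exponential([rng.expovariate(1.0) for _ in range(n)]) >= observed
        for _ in range(replicates)
    )
    return observed, (exceed + 1) / (replicates + 1)

rng = random.Random(42)
# Hypothetical sojourn times from a Poisson process (exponential gaps).
poisson_sojourns = [rng.expovariate(2.0) for _ in range(300)]
# Counterexample: nearly regular gaps, which are not exponential.
regular_sojourns = [rng.uniform(0.4, 0.6) for _ in range(300)]

d_poisson, p_poisson = lilliefors_exponential(poisson_sojourns)
d_regular, p_regular = lilliefors_exponential(regular_sojourns)
print(f"Poisson-like gaps: D = {d_poisson:.3f}, p = {p_poisson:.3f}")
print(f"regular gaps:      D = {d_regular:.3f}, p = {p_regular:.3f}")
```

Because the mean is estimated from the same data being tested, the standard Kolmogorov-Smirnov critical values would be too lenient here, which is why the sketch simulates the null distribution with the mean re-estimated in every replicate.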
One goal of this research was to determine the distribution of saccades. Different algorithms for saccade detection were considered as well as how noise affected the saccade detection. A saccade detection algorithm was implemented and histograms of saccade occurrences were analyzed. However, there was insufficient evidence to support that the saccade occurrences consistently follow one distribution across participants. Some participants exhibited similar patterns of saccades throughout while others varied considerably within the six runs. Thus, future work could extend the data collection and assessment to determine the distribution of the saccade occurrences.
Another stochastic process considered was the Wiener process. The goal was to determine whether the gaze path, as defined by the eye movements made between sample times, follows a Wiener process. For this process, the length of eye movements made between samples, $W(t) - W(s)$, follows a normal distribution with mean equal to zero and a variance of $\sigma^2 (t - s)$ for $s \le t$. For both the horizontal and vertical positions, $W(t) - W(s)$ was calculated for consecutive time samples, after removing blinks and saccades, and then tested for normality. For the consecutive time sample case, 100% of the 53 runs provided support to the hypothesis that the data followed a normal distribution. This provides strong evidence that the gaze path in each dimension follows Brownian motion.
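As an illustration of the normality check on positional differences, the following Python sketch computes the Jarque-Bera statistic for the increments of a simulated path. It assumes the asymptotic chi-square p-value, which may differ from the critical values used in the analysis software for small samples, and the step size and simulated data are hypothetical.

```python
import math
import random

def jarque_bera(values):
    """Jarque-Bera statistic and asymptotic p-value (chi-square, 2 d.o.f.)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # For 2 degrees of freedom the chi-square survival function is exp(-x/2).
    return jb, math.exp(-jb / 2.0)

rng = random.Random(7)
sigma = 0.05  # hypothetical step size per sample, not a value from the study

# Simulated one-dimensional gaze position: cumulative sum of normal steps.
position = [0.0]
for _ in range(2000):
    position.append(position[-1] + rng.gauss(0.0, sigma))

# Differences in position between consecutive samples.
increments = [b - a for a, b in zip(position, position[1:])]
# Counterexample: strongly skewed steps, which a Wiener process would not produce.
skewed = [rng.expovariate(1.0 / sigma) for _ in range(2000)]

jb_normal, p_normal = jarque_bera(increments)
jb_skewed, p_skewed = jarque_bera(skewed)
print(f"normal increments: JB = {jb_normal:.2f}, p = {p_normal:.3f}")
print(f"skewed increments: JB = {jb_skewed:.2f}, p = {p_skewed:.3g}")
```

The statistic depends only on the sample skewness and kurtosis, so it checks the shape of the distribution of the differences and not the value of the mean or variance.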
The analysis was extended to consider whether the distribution changed as the spacing between the positions being differenced increased to two samples, three samples, four samples, five samples, and ten samples. As the number of time samples between eye position difference calculations increased, the results still strongly supported that the data followed a normal distribution. For spacings of up to four samples, more than 90% of the runs failed to reject the null hypothesis. This number dropped slightly, to 86%, for five samples between positional difference calculations, but still showed strong support that the data followed a normal distribution. At ten samples, a little more than 70% of the runs failed to reject the null hypothesis. However, the variance proportions did not increase as expected for a Wiener process. This suggests that as the distance between samples increases, at some point, the differences in position will no longer follow a Wiener process.
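The variance behavior expected of a Wiener process can be made concrete with a short simulation. The sketch below is in Python, uses a hypothetical simulated path rather than the collected data, and compares the variance of differences taken k samples apart with the variance of consecutive differences. For a Wiener process this ratio should be close to k.

```python
import random

def lag_variance(path, lag):
    """Sample variance of the differences path[i + lag] - path[i]."""
    diffs = [path[i + lag] - path[i] for i in range(len(path) - lag)]
    mean = sum(diffs) / len(diffs)
    return sum((d - mean) ** 2 for d in diffs) / (len(diffs) - 1)

rng = random.Random(3)
# Hypothetical path: cumulative sum of independent standard normal steps.
path = [0.0]
for _ in range(20000):
    path.append(path[-1] + rng.gauss(0.0, 1.0))

base = lag_variance(path, 1)
ratios = {lag: lag_variance(path, lag) / base for lag in (1, 2, 3, 4, 5, 10)}
for lag, ratio in ratios.items():
    print(f"lag {lag:2d}: variance ratio = {ratio:.2f} (Wiener process predicts {lag})")
```

Applying the same ratio to the measured gaze data is one way to see the departure noted above: normality of the differences can hold at larger spacings even when the variance fails to grow in proportion to the spacing.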
Additional analysis of the parameter fits for the mean and standard deviation of the distributions was performed. All the tests resulted in a rejection of the null hypothesis that the variances were equal. For the comparison of data with versus without saccades, there was no difference in means, but the runs with saccades resulted in consistently larger variances than those without saccades. For the horizontal versus vertical differences in position, the control runs rejected the hypothesis that the means are equal, but the n-back runs failed to reject the null hypothesis. For the comparison between the control and n-back runs, there was a difference in variances but no difference in means. The variances for the n-back runs were consistently larger than those for the control runs.
Additional analyses were performed to assess whether the gaze path followed a random walk. The distribution of the angle or direction of movement was analyzed. In comparison to a uniform distribution, when saccades were removed, 60% of the 53 runs failed to reject the null hypothesis. This provides support that the angle of eye movement, in two dimensions, followed a uniform distribution. For the data with saccades, only 21% of the 53 runs failed to reject the null hypothesis. Thus, there was a significant difference in the distributions of the angle of eye movement between the data with and without saccades. When saccades were removed, there was more support for the gaze path following a random walk than when saccades were included in the assessment.
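The direction-of-movement check can be sketched in the same way. The Python example below computes the angle of each two-dimensional step and measures its Kolmogorov-Smirnov distance from a uniform distribution on the full circle. The simulated steps, the drift used as a counterexample, and the asymptotic 5% critical value are assumptions for illustration and do not reproduce the analysis scripts.

```python
import math
import random

def movement_angles(dx, dy):
    """Direction of each step, in radians, from its horizontal and vertical parts."""
    return [math.atan2(v, h) for h, v in zip(dx, dy)]

def ks_uniform_angle(angles):
    """KS distance between the angles and a uniform CDF on [-pi, pi]."""
    xs = sorted(angles)
    n = len(xs)
    d = 0.0
    for i, a in enumerate(xs):
        f = (a + math.pi) / (2.0 * math.pi)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

rng = random.Random(11)
n = 2000
# Isotropic steps: no preferred direction, as a two-dimensional random walk assumes.
iso = movement_angles([rng.gauss(0, 1) for _ in range(n)],
                      [rng.gauss(0, 1) for _ in range(n)])
# Biased steps: a rightward drift stands in for direction-biased movement.
biased = movement_angles([rng.gauss(1, 1) for _ in range(n)],
                         [rng.gauss(0, 1) for _ in range(n)])

critical = 1.358 / math.sqrt(n)  # assumed: asymptotic 5% critical value
d_iso = ks_uniform_angle(iso)
d_biased = ks_uniform_angle(biased)
print(f"isotropic steps: D = {d_iso:.3f} (critical value {critical:.3f})")
print(f"biased steps:    D = {d_biased:.3f} (critical value {critical:.3f})")
```

A directional bias of this kind shifts the empirical distribution of angles away from the uniform line, which is the type of departure that would lead to rejection when movements with a preferred direction are included.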
Overall, both hypotheses were supported. Focus of attention, as represented by eye movement data, was able to be modeled as a stochastic process. The strongest evidence resulted from modeling differences in the horizontal and vertical eye movement as Brownian motion or the Wiener process. For these tests, 100% of the runs failed to reject the null hypothesis that the data followed a normal distribution, and thus, supported the hypothesis that the data followed a Wiener process.
Additionally, the variance parameter estimates differed between the control and n-back runs in both dimensions.
This research and resulting conclusions are important in starting to explain the involuntary eye movements, which occur when a participant is singularly focused. These results provide strong evidence that this underlying movement can be
For the assessment of gaze path, much of the initial code as well as the methods used are the same as shown above. The first page is identical and thus, is not repeated here. The code starts after that first page has occurred, and thus labeling begins at page 2. For this page 2, compared to the above, there are only slight differences. Additional calculations were added to obtain values for horizontal, vertical, and overall differences in position between time samples. The other pages contain more extensive changes in the code and thus, are not summarized here.
% Eye Tracker Data - Saccade Distribution Tests (page 6 of 6)
%% Lilliefors test - Exponential distribution of sojourn times
% Test to determine whether the sojourn times (i.e., time between
% saccade occurrences) follow an exponential distribution as required
% by the points of a Poisson process
for sj = 1:length(run)
    [fval,xval] = ecdf(stimes.(run{sj,:}));
    thcdf = expcdf(xval,mean(stimes.(run{sj,:})));
    stairs(xval,fval,'b','LineWidth',2)
    hold on
    plot (

For the assessment of the random walk, much of the script was the same as that for the Brownian motion assessment. However, an additional variable needed to be calculated and tested. Again, page 1 from Figure A-1 remained the same. For page 2 in Figure A-2, the following lines were the only changes made. These were added to the initial for loop used for reading data into MATLAB and defining variables. The distribution of both these variables, equivalent to the angle of eye movement with saccades (ang_dir_nb) and without saccades (ang_dir_ns), were then tested using the Kolmogorov-Smirnov test as described in Figure
A statistical summary of the results is provided below for each type of run and for the analysis both with and without saccades.

Parameter Run Type Minimum Maximum Mean Median
Horizontal Mean