A Statistical Analysis of Students’ Attitudes and Achievement in Introductory Statistics Courses

The Millennial Generation has been phasing out of undergraduate classrooms since 2013 and is being replaced by the technologically savvy and visual learners of Generation Z. To help to increase our understanding of the learning needs and attitudes of this new population of students, a two-fold data collection design has been implemented in undergraduate statistics classes at the University of Rhode Island. In the first round of data collection during the spring 2016 semester, survey and grade data was collected from an introductory biostatistics class pertaining to 146 students. Results from the analysis including the use of longitudinal generalized linear mixed models, hierarchical linear models and regression trees indicated a relationship between time and student performance throughout the semester, as well as a relationship between students starting attitudes and their performance and a potential group structure in the class based on their attitudes. This first round of data collection and analysis lead to interesting results about students starting attitudes and the effect on their performance. To further explore these results and extend them to more than one course, a second round of data collection was completed during the spring 2017 semester. Principal component analysis in connection with regression analysis indicated a relationship between students starting attitudes and their course performance. Cluster analysis indicated a two group structure in starting attitudes of the students in each course, with each cluster showing different achievement and learning preferences.

.12 Biplot of the first two principal components for the pretest attitudes for STA308 with students plotted as points colored by their final grade as an A or A-(89.5 and up) or not. . . . . . . 64 .13 Biplot of the first two principal components for the pretest attitudes for STA409 with students plotted as points colored by their final grade as an A or A-(89.5 and up) or not. . . . . . . 65 .14 Cluster scatterplot for the k-means clustering of STA308 students' prestest attitudes plotted versus the first two principal components. is currently under review. The paper was co-authored by Dr. Natallia Katenka.

Abstract
The Millennial Generation is phasing out of undergraduate classes and being replaced by the technologically savvy and visual learners of Generation Z. To help to increase our understanding of the learning needs and attitudes of this new population of students, we collected survey and grade data in an introductory biostatistics course pertaining to 146 students at the University of Rhode Island. Our purpose was three-fold. First, to increase the amount of immediate feedback collected from students by implementing weekly quizzes. These quizzes were analyzed using longitudinal mean response profiles and generalized linear mixed models to discover a significant effect of time on the student performance, but not of grade incentives. Next, students attitudes towards statistics were analyzed to determine how the starting attitudes effected performance using hierarchical linear models to find a significant effect of starting affect and cognitive competence on students final grades. Finally, regression trees were utilized to identify groups of learners who increased their attitude throughout the semester dependent on their starting attitude and final grade. These results lead to practical implications for instructors as they plan the timing of their instruction within a course, as well as the importance of identifying students' confidence and feelings towards the subject at the start of the course and to hopefully minimize the impact of these negative attitudes on their students' performance.

INTRODUCTION 1.Motivation
The Millennial Generation (all persons born from the early 1980s to the mid-1990s) is phasing out of undergraduate courses and the next generation is replacing them. Penned Generation Z these students were born into a world with technology, where their phones contained the answers to nearly any question they could ask.
As described by , there are several consequences to this upbringing with instant gratification from technology. One result is the lowered attention span of eight seconds, which shows a sharp decrease from the Millennial Generation's time span of 12 seconds. Another result is an increased ability to understand visual imagery .
This shift in generations calls for an update to teaching methodology and techniques. This generation of students did not adapt to the introduction of technology like the Millennial Generation, but rather they grew up with applications like Snapchat, YouTube, Facebook and more available at a moments notice. When any question can be answered with a quick Google search or trip to Wikipedia, the idea of listening to lectures and reading from textbooks is not only unappealing, but very dissimilar to the normal way of learning for these students. This generation can get any answer they want in seconds, but their ability to validate and further interpret these answers may be absent . No longer do students look to books for answers, now the knowledge is held in their mobile devices. But how do we adapt teaching to this new generation? The first step to any unknown is to gather more information about these students and their needs.
Getting feedback from students about their understanding of ongoing topics and learning preferences can be incredibly difficult when many students are afraid to ask or answer questions in front of their peers and do not participate in office hours. This is especially true in large (>100 students) lecture sections, such as introductory statistics courses, which are filled with a diverse set of students from various mathematical backgrounds, many of whom have a low or neutral evaluation of the subject and little inclination to participate. In response to these challenges, an ongoing study to incorporate additional feedback strategies and garner more information about students attitudes and achievement has been implemented in undergraduate statistics courses here at the University of Rhode Island. In this study, an interactive feedback framework was implemented in the Spring 2016 Introduction to Biostatistics courses which included the use of weekly quizzes, an introductory survey and two attitude surveys.

Related Work
Many studies have evaluated the use of immediate feedback in large courses through the use of clickers. Typically, clickers (small electronic devices with educational software to collect student responses) are implemented in lectures or labs to allow for student to answer a small number of questions, either before, during or after instruction. According to , clickers have traditionally been used in many large lecture courses with the potential benefits of improved attendance, immediate feedback for students, the ability to revisit challenging topics and continuous assessment throughout the lecture and semester .
In a study on student perception of clickers, Vaterlaus et al. (2012) found a positive perception overall and a significant effect of clicker usage on student recall on exams (Vaterlaus et al., 2012).
In a randomized experiment in an introductory statistics course, McGowan and Gunderson (2010) studied the effect of clicker use on engagement and learning during lab sections. In this study, there was little evidence that clicker use increased students engagement; however there was an effect on student learning if the number of questions were low and well assimilated with the material. The researchers also studied the effect of external incentives on student clicker participation and discovered that students were much more likely to participate in clicker questions when given external motivation . This study did not however look at the actual responses or grades on the questions, but rather if students answered at least 1 or at least 50% of the questions.
Many of the studies used clickers during lecture periods, rather than in recitations after the weeks lectures. Also, they required the additional cost and hardware of using clicker software, whereas the use of smart phone technology using either the smart phone application or on-line website for Socrative in the spring 2016 course utilized a familiar device to students and did not add financial burden to the students (Soc, 2017).
In addition to the use of weekly quizzes, this study implemented surveys to measure students attitudes at the beginning and end of the course. Many instru- Effort . In a study of 47 students from a small liberal arts college,  had students complete the SATS-36 along side a short perception of statistics survey at the beginning and end of the semester.
They also observed a decrease in students attitudes over the course of the semester .
In a comparative review of these surveys,  explored the validity and reliability of the tools with published evidence of these measures . From their summary, the SATS-36 scores appeared to have the strongest construct validity based on unparceled CFA and internal consistency ratings based on Cronbach's Alpha, assuming the construct validity evidence for the SATS-28 can be applied . Several studies have documented the solid psychometric properties, including confirming the four factor structure of the SATS-28, including  and , however only two could be found for the SATS-36   . The six factor structure was confirmed in studies by Vanhoof (2011) and   (Vanhoof, 2011)  . Several authors have explored the relationship between students attitudes and course performance including Sorge and Schau (2002), Miller and Schau (2010) and Emmioglu (2011) (Sorge and Schau, 2002)    . Other researchers have used the SATS instruments to explore the differences in teach-ing environments and methods including , DeVaney (2010),  and    (DeVaney, 2010)   . Several studies also looked to explore the attitudes of students from different fields of study, such as  and Mathew and Aktan (2014)     .
This study utilized the SATS-36 to measure students' attitudes at the beginning and end of the course. The relationship between the attitude components and course performance is evaluated, as well as the use of regression trees to identify groups of students with a similar change in attitudes. The rest of the paper continues as follows: Section 1.2 describes the methods for data collection, design of experiment and data analysis. Next, the results are presented in Section 1.3. Finally, the main findings, limitations and practical recommendations are discussed in Section 1.4.

METHODS 1.2.1 Data Description
The data were collected for this work during the spring 2016 semester at the University of Rhode Island in an undergraduate introductory biostatistics course.
This course had a total enrollment of 171 students and two professors. There were six recitation sections and three teaching assistants for the students included in the analysis, the distribution of students between sections is in Week Topic Week 1-2 Definitions, Population vs. Sample, Types of Variables.
Week 3 Descriptive Statistics and Graphical Data Summaries. Basic Probability.
Week 4 Combinations and Permutations. Random Variable. Binomial Distribution.
Week 5 Normal Distribution. Empirical Rule. Normal Approximation to Binomial.
Week 6 Sampling Distribution. Central Limit Theorem.
Week 8 One-sample Hypothesis Test for Population Mean. Sample size calculation.
Week 9 Two Independent Sample Inferences for Difference in Population Means. Paired Test.
Week 10 One Sample Tests for Population Proportion. Midterm 2.
Week 11 Difference in Population Proportion. Chi-Square Tests.
Week 12 Introduction to ANOVA.
Week 13 Introduction to Correlation and Regression.
Week 14 Final Review.
Of the 171 students enrolled in the course, 146 students signed the Institutional Review Board (IRB) consent form to allow their data included in the survey. An additional nine students' data were removed due to too small of a sample size for one teaching assistant. Of these 137 students eligible to be included in the analysis, only 114 students completed the attitude surveys at both the beginning and end of the semester. All available data for the 137 students were included in the analysis of course performance throughout the semester, however only 114 students' data were included in the analysis of the attitude data. These students included 29 male and 85 female students, the majority of students were 19 years old and most students are from the College of Pharmacy. week's lecture material. There were three different grading schemes possible for the quizzes: graded personal (GP), graded competition (GC) and non-graded (NG).

Design of Experiment
The graded personal quizzes were individually graded based on the students' performance. The non-graded quizzes were used strictly for student feedback and attendance and the graded competition quizzes were graded with a bonus for the team who completed all questions correctly first. Each recitation section had each quiz scheme for three consecutive weeks with the order based on the recitation day.
The Monday recitations had a NG-GC-GP rotation, whereas the Tuesday recitations had a GC-GP-NG rotation and the Wednesday recitations had GP-NG-GC.
This rotation allowed each professors' section to have one of each rotation order and every TA to have two different rotations.

Data Analysis Tools Longitudinal Models
Mean response profiles were used to graphically and analytically display patterns of change in the mean quiz grades over time for each grading structure. This method is primarily used to address the null hypothesis of no group by time interaction effect, represented graphically by parallel response profiles between groups.
The null hypotheses of no time effect and no group effect can also be graphically shown by flat or overlapping lines respectively. This method can be utilized due to the balanced design of the study, with the timing of the repeated measures common to all subjects.
To model the students' quiz performance over time, piecewise quadratic generalized linear mixed models were utilized. The quiz grades were recorded as a count of correct responses out of six questions. This count variable can be mod-eled with a mixed effects log-linear regression model with a random intercept for each student. To incorporate the hierarchical structure of the course where groups of students are in the same professors' section and in smaller classes with teaching assistants, random effects for these grouping structures were modeled. The full hierarchical model to represent the quiz grade in terms of the grading scheme and time is: where Y ij is the number of quiz questions answered correctly for individual i at time j. The variables GP i and GC i refer to the quiz types graded personal and graded competition, with a reference of not graded. The variables T ime, T ime6 and T ime9 refer to the piecewise time variables cut before each of the first two exams at weeks seven and ten. The quadratic terms allow for the quiz grades to change in a non-linear trend between exams. There is a random intercept for each student and a random effect for professor and teaching assistant. Given b i , it is assumed that the The random intercepts are assumed to have a bivariate normal distribution, with a mean of zero and a 2x2 covariance matrix G .

Linear Models
Correlation analysis was utilized to explore the relationship between the attitude scores and each of the grade book items. Pearson correlations were calculated between each pretest and posttest component score and the quiz, exam, homework and final grades.
Next, hierarchical linear regression was utilized to model the relationship between the final course grade and starting attitude components. Once again, the hierarchical structure of the course needed to be modeled using random effects to account for the dependence between students in similar professors' and teaching assistants' sections. The full hierarchical linear model is: where Y i jk is the final grade for the i th student from the j th recitation nested in the k th professor's section. The final grade is predicted by each of the pretest attitude components. The term b k represents the random effect for professor and b jk is the random effect for the recitation section, resulting in a three-level model.
The error term is assumed to follow a normal distribution with a mean of zero and constant variance.

Regression Trees
Regression trees (also called decision trees) are a nonparametric method for segmenting the feature space based on a set of covariates. The algorithm to build the regression tree partitions the feature space to minimize least squares criterion and continues to create splits until the error can no longer be reduced. The resulting nodes are the means for each partition. The tree must then be pruned to avoid over fitting the data and reduce the variance of the final model.
Regression trees were used to model: where f (X) represents the change in each attitude component, c m is the constant mean change in attitude for each partition of final grade and starting attitude and is an indicator which equals one if the student is in partition m and zero otherwise. This method does not require the assumption of a linear relationship and has easy to interpret results .
The longitudinal analysis of the quiz grading schemes began with a visual representation of the mean response profiles over time using the software SAS for the null hypothesis of no group effect. The group by time effect also suggests to support the null hypothesis of no effect represented by the parallel slopes between many intervals.
The plot for the weekly homework suggests a significant effect for recitation day as the Monday recitation section had a higher mean for all weeks, except for a tie at week ten. The group by time effect for the quizzes is also not consistent throughout the weeks, as the trend over time appears similar between the groups.
The homework plot also indicates a change in performance over time, indicated by the differing slopes.
Also of note is the effect of the exams, which occurred during weeks seven and The hierarchical log-linear regression model was also run in SAS Studio 3.6 Enterprise Edition using the GLIMMIX procedure (SAS Institute Inc., ). The model was fit using maximum likelihood and approximated using adaptive Gaussian quadrature with ten quadrature points for each random effect during the evaluation of the integrals for the marginal likelihood .
The interaction terms were found to be non-significant and were removed from the model. The nested random effects for professor and teaching assistant were also found to be non-significant, based on covariance estimates not significantly different than zero, indicating no effect of clustering within the data. The random intercepts were non-zero and included in the model.
All effects for time and quadratic time were significant, except Time9. The fixed effects for the quiz types were not significant, nor were the interaction effects between quiz types and time. The over dispersion parameter indicated a good fit to the conditional distribution of the model, based on the Pearson Chi-Square value of 0.56 and the assumption of conditional normality of residuals was also met. These results support the mean response profiles graphical findings. There is a trend in quiz performance over time, influenced by the timing of the exams, however the quiz grading structure had no significant effect on student performance each week.

Survey of Attitudes
Before analyzing the results of the SATS-36 attitude survey, the internal consistency of the attitude components had to be investigated to explore the extent to which each parcel was actually measuring the same construct. Cronbach's coefficient alpha is one of the most frequently reported measures of internal consistency, however the assumptions can be difficult to meet and an alternative method, Omega, is available that assumes fewer assumptions and holds fewer restrictions on the dataset

Performance and Attitudes
The correlation analysis results between the attitude components and each gradebook item are presented in Table 1 indicate that students' with more positive feelings towards statistics and confidence in their own computational abilities at the start of the semester performed better in the course overall.

Change in Attitudes
After analyzing the quiz performance throughout the semester and the relationship between the course performance and students' attitudes, the next analysis of interest in this study is the change in attitudes throughout the semester. To begin, the summary statistics for each attitude during the pretest and posttest are in Table 1.10. The decrease in the attitudes throughout the semester is similar to other studies using the SATS-36 survey and has been hypothesized to be caused by an increase in students' understanding of what statistics is and the details involved in the subject throughout the semester (Schau and Emmioglu, 2012)  . To begin to the explore the students' change in attitude throughout the semester, regression trees were the chosen technique due to the resulting parti- interpret. For all other trees, the left most node displays the change in attitude for the students with lower final grades, and shows that students that did poorly in the course had a lower affect, cognitive competence, difficulty, interest and value than at the start of the course.
The partitions that showed the greatest average increase in each attitude component contained students that performed well in the course (at least a B average) and were in the lower partition for starting attitude. The other partitions contain groupings of students with higher starting attitudes that lowered or students who did okay in the course and showed very little change in their attitude. Each tree also shows that the largest partitions result in a group with little change in attitude throughout the semester. Many students whose attitude were higher than their peers at the start of the course showed a decrease throughout the semester.
There are many factors that could influence students' attitudes towards statistics during the semester, including the chosen covariates of starting attitude and course performance.

MAIN FINDINGS 1.4.1 Limitations
As with any observational or survey experimental design, there are several limitations to consider. First, due to the nature of the data and the use of human subjects, Institutional Review Board (IRB) consent was needed from the students.
This process reduced the potential sample size from 171 to 146 students. The students who did not consent to be in the study were not eligible to be included in any analysis and may have differing characteristics than those who did consent.
Another reduction to the sample size was the removal of nine students form a small Thursday recitation that had both a different day of the week and teaching assistant than the other students. This sample size was too small to be modeled for the day of week or TA effects.
Beyond these sample restrictions, another limitation was missing data in the form of non-response from students on one or both of the attitude surveys. There were 23 students that did not complete both surveys and were removed from the analysis of attitudes. These 23 students' information was included, whenever available, in the course performance analysis.
These reductions to the sample used for the analysis limits the amount of data available and leads to potential biases. The non-response bias, from both non-consent and missing surveys, leads to a potential missing subset of the class.
The students who were willing to not complete a survey required for their course homework could potentially share similar attitudes about the course that are now not available in this analysis. The students who did not want to consent could potentially share feelings of discontent with the course or their course performance.
Students who did consent and complete the surveys could also potentially share more positive feelings towards the subject and have better course outcomes.
Another potential bias with any survey is response bias. There is no way of knowing if students were completely truthful in their responses to the attitude surveys. The survey was designed to have positively and negatively worded questions to help reduce the tendency to answer the same way to every question which helps to check if students were paying attention to the survey. However, the possibility that students were answering how they felt the professor or their peers would want cannot be measured, but must be considered. Students may feel they should answer more positively if they wanted to align with their professor's wants or views.
Conversely, they may have answered more negatively than their true feelings to conform with other students. Similarly, there's always the chance that students were not taking the survey seriously and did not answer truthfully due to their desire to quickly complete the survey. There are many possible factors that could lead to different results in the survey responses and the nature of survey data needs to be taken into consideration when drawing conclusions from the results.

Practical Recommendations
From the analysis of student performance throughout the semester, the main conclusion was that time effects students performance, specifically the timing of exams and new concepts effects students' homework and quiz performance signifi-cantly. The lowest performance was when the new topic of inference was presented after the first exam for both homework and quizzes. The best quiz performance was the week before the exam and the best homework performances were Overall, this study shows that students' attitudes are an important measure in relation to student outcomes and motivation in a course. Not only do students who perform poorly in the course show low starting attitudes in certain components, they also leave with a negative change throughout the semester. The question now is, how can we increase students attitudes towards statistics? How can we help students who are struggling still see value in learning the subject?
To attempt to get closer to these answers, a second round of data collection has been collected in the Spring 2017 semester in all undergraduate courses at the University of Rhode Island. This new dataset includes a broader set of students, a more detailed collection of exit learning preferences and a more diverse set of professors. This new data will hopefully help us get a closer look into the learning preferences and teaching styles that effect the change in attitudes throughout the semester, as well as how different subsets of students view statistics. Sorge, C. and Schau, C. (2002). Impact of engineering students attitudes on achievement in statistics: A structural model.  Partitions show groups of students ranging from negative to neutral to positive changes in attitudes. Students who performed poorly in the course tended to leave with lower attitudes, while students who did well and started with lower attitudes than their peers left with a more positive outlook on statistics. Students who did well and started with higher attitudes, typically left the course with the same attitude towards statistics.

MANUSCRIPT 2 A Multivariate Analysis of Generation Z Students
This manuscript has been prepared with the intent to submit to the Journal of Statistics Education. The paper was co-authored by Dr. Natallia Katenka. to implement students' learning preferences into their lesson plans.

INTRODUCTION 2.1.1 Motivation
In 2013, the first incoming class of Generation Z students walked onto college campuses around the world. This generation, filled with all persons born since approximately 1995, has quietly entered the undergraduate educational landscape, with much less fanfare than the Millennial Generation before them. Now, all undergraduate courses are filled with predominantly Generation Z students. These students were born into a world where the Internet was a reality and growing up any question could be answered with a simple Google search. Most Generation Z students also grew up in a post-9/11 world hearing about various mass shootings all over the country. Information about these events was immediately available through their social media accounts and news websites adding to a sense of global connectivity and spread of information unfamiliar to previous generations (Seemiller and Grace, 2017

Background
Several studies have begun to characterize Generation Z students' learning preferences in higher education settings. These students tend to learn better from observation, such as through watching a video or demonstration of how to perform a certain action before attempting it themselves. Many Generation Z students will prefer to watch a YouTube video rather than reference a textbook or written media while learning something new . These students also value the applicability and practice of new skills very highly. They desire to apply what they are learning in a variety of settings, with internships being a very important learning opportunity for them (Seemiller and Grace, 2017).
An interesting development from their technology driven upbringing is a desire to work independently to find answers to their questions and work through their assignments. The individual nature of the Internet allows Generation Z students to take entire classes without interacting with peers or instructors, find resources for research papers without traveling to the library and complete many instructional activities without the aid of others (Seemiller and Grace, 2017). This leads to an interesting preference to work independently and utilize those around them as a resource, rather than a requirement.
Generation Z students are typically accustomed to instantaneous answers to their questions and have almost too many sources available to them with a simple Google search. It has been observed that students from this generation may lack the ability to parse through these results and critique their validity. This instant gratification is also important to Generation Z students as they have been found to have a decreased attention span from previous generations and are accustomed to being surrounded by visual imagery, multiple sources and almost too much information at any time .
Specifically relating to STEM education, Hora et al. (2017) completed a descriptive study of students study habits in real-world situations. Students in the sample were from biology, physics, earth science and mechanical engineering courses. Their study found that students' studying habits had several stages: cues, timing, resources, setting and method of study. They found that students' most common cue to study was the instructor mentioning an upcoming exam and that the timing of study was split between the sample, with several students studying for days leading to the exam, while others crammed the night before and some studied throughout the semester. The most common resources for studying were found to be the course website, google, the textbook and lecture notes while the least used were the human resources and cue cards. The setting of study seemed to vary depending on the assignment, as many students reported studying in both groups and alone. Finally, the method of study was most commonly a review of the notes and textbook, while the least used was reviewing homework and weekly quizzes (Hora and Oleson, 2017).
This research also involves looking at students attitudes towards statistics as a subject and tool to use in their future fields. In order to measure students attitudes, a latent construct, an appropriate instrument needed to be chosen.  .
This study utilized the SATS-36 to measure students' attitudes at the beginning and end of the course, along with original introductory and exit surveys about students' learning, teaching and collaboration preferences. Extending the qualitative analysis of the study habits, this paper also explores the relationship between the attitude components and course performance, as well as between the attitude components and learning preferences. The rest of the paper continues as follows: first, Section 2.2 describes the methods for data collection, design of experiment and data analysis. Next, the results are presented in Section 2.3. Finally, the main findings, limitations and practical recommendations are discussed in Section 2.4.

2.2 METHODS

2.2.1 Design of Experiment
This study includes data that were collected from all four introductory statistics courses. A question was added to the exit survey asking students about their collaborators throughout the semester: each student was asked to report the names of each student they worked with in this class, as well as what they worked on and how they met.
The surveys were the only additions made to each course. At the end of the semester, final course grades were requested from each professor for all of the consenting students in the study.

Data Description
Each course had two different professors and at least one teaching assistant.
Students in STA307 and STA308 had lecture three times a week and one weekly recitation class led by a teaching assistant, where practice problems were solved. STA409 students did not have recitation classes; however, their lecture classes were smaller than those in STA307 and STA308. Excel was used on several homework assignments for STA308, and SAS was the software chosen for the STA307 students to practice.
In order to be included in any analysis, students needed to sign the IRB consent form; to be included in any analysis beyond the descriptive plots and tables, students needed to have completed both the pretest and posttest SATS-36 surveys.
The total enrollment, consent totals and study participation totals can be found in the enrollment summary table. Each course had its own population of students with differing mathematical preparation and majors. STA307 can be characterized by predominantly sophomore students from the College of Pharmacy. STA308 also had predominantly sophomore students, and the most common major among students in the sample was Biology, followed by Pre-Med. STA409 had mostly junior and senior students, most of whom were engineering majors. Most students in STA307 had taken only one previous college mathematics or statistics course, while most STA308 students had taken between one and three such courses and STA409 students had taken between four and five. The survey samples for STA307 and STA308 were both about 70% female, while the STA409 sample was 57% male.
The summary statistics for each course's pretest and posttest SATS-36 surveys are in Table 2. To begin to explore the study habits of students in each class, visualizations and summary statistics were generated. In Figure 2.1, bar plots for each resource surveyed in the exit survey for STA307 are displayed. The most used resources were the online notes and practice exams. The professor and teaching assistant office hours were the least used resources, followed by the textbook. It appears that more students used email to contact their instructors than in-person meetings.
The plots for STA308 show a similar trend, although more students report using the textbook. STA409 also shows an overall similar trend; however, more students report using the textbook weekly, more students report emailing their instructor and fewer use the teaching assistant's office hours. The summary statistics for the rankings (1-8) of various learning activities from the exit survey are in Table 2.4. The survey questions asked students to rank the activities from least (1) to most (8) beneficial to their learning this semester.
From the table, STA307 students report recitation problems, exams and lecture notes as the most beneficial activities towards furthering their learning, while reading the textbook and using SAS were the least beneficial. For STA308, lecture notes and recitation problems were the most beneficial, while the exams and Excel were somewhere in the middle and reading the textbook was the least beneficial. For STA409, the course had no recitation sections, nor use of statistical software, so their most beneficial activity was taking and studying for the exams, followed by reading the textbook and finally the lecture notes.

Data Analysis Tools

Principal Component Analysis
To begin to analyze the relationship between the attitude components and course outcomes, principal component analysis (PCA) was used. Due to the highly correlated nature of the attitude components, as seen in the correlation plots, typical multiple regression analysis cannot be used on the raw attitude components simultaneously.
Past studies have dealt with this issue by fitting separate regression equations for each attitude component as a predictor of course performance; however, multivariate techniques such as principal component analysis exist to combat this issue. Principal component analysis is a multivariate technique that aims to reduce the dimensionality of a dataset while retaining as much of the original variation as possible. Once the components are determined, the number of components necessary is chosen by analyzing a scree plot of the variance explained. This is necessary, as one of the motivations for using PCA is to reduce the number of variables (Everitt and Hothorn, 2011).
Once the principal components are determined, they can also be used to plot the pretest attitudes in a lower dimension using a biplot. This plot allows the six attitude component vectors and individual student scores to be plotted in the dimensions of two of the principal components. These plots are helpful for viewing potential groups within the data that are not visible in the original multivariate dimensions of the data. They are also able to show the correlation between individual attitude components within each principal component.
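The PCA step can be sketched in Python (the study itself used R's prcomp); everything below — the data, the number of students, the two-factor structure — is made up for illustration, but the SVD-on-standardized-data computation mirrors what prcomp with scaling does.

```python
import numpy as np

def pca(X):
    """PCA via singular value decomposition of the standardized data,
    mirroring R's prcomp(x, scale. = TRUE). Returns the loadings, the
    proportion of variance explained per component and the scores."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize columns
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    var = s ** 2 / (len(X) - 1)        # variance of each component
    prop = var / var.sum()             # proportion of variance explained
    scores = Z @ Vt.T                  # each student's component scores
    return Vt.T, prop, scores

# Hypothetical data: 100 students, 6 attitude components driven by 2 latent factors
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 2))                     # latent factors
X = base @ rng.normal(size=(2, 6)) + 0.2 * rng.normal(size=(100, 6))
loadings, prop, scores = pca(X)
# A scree plot of `prop` would show an elbow after the second component,
# since only two latent factors generate the six observed variables
```

In the study's setting, `scores` plays the role of the per-student principal component values later fed into the regression, and `prop` is what the scree plot displays.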

Hierarchical Linear Model
Once the principal components were determined for the pretest attitude scores for each class, they were used as explanatory variables in a hierarchical multiple regression model to explain course performance. The aim of this model was to determine if the pretest attitude survey could be used, at the beginning of the semester, to identify students at risk of performing poorly in the course. A hierarchical model was necessary due to the multiple sections within each course, where the grades of students in the same section were not independent of the section. A separate model was built for each course, dependent on the course's principal components and section. The full model is

Y_ij = β_0 + β_1 PC_1,ij + ... + β_6 PC_6,ij + b_j + ε_ij,

where Y_ij is the i-th student's final numerical course grade from professor j's course as predicted by the principal component scores PC_m,ij, and b_j is the random effect for professor. The error term ε_ij is assumed to follow a normal distribution with a mean of zero and constant variance. A reduced model with the number of sufficient principal components, as determined by the scree plot, was used.
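To make the model concrete, here is a small Python sketch on simulated data. The study fits the professor term b_j as a random intercept; purely as a simplification, this sketch dummy-codes the professor as a fixed per-group intercept and solves by least squares, which recovers the same structure on noiseless data. All numbers are hypothetical.

```python
import numpy as np

# Simulated course: 120 students, 3 retained principal components, 2 professors
rng = np.random.default_rng(1)
n, p = 120, 3
pc = rng.normal(size=(n, p))          # hypothetical PC scores per student
prof = rng.integers(0, 2, size=n)     # professor/section indicator
beta = np.array([2.0, -1.0, 0.5])     # hypothetical component effects
b = np.array([85.0, 82.0])            # per-professor baseline grade
y = b[prof] + pc @ beta               # noiseless final grades, so the fit is exact

# Design matrix: one intercept column per professor plus the PC scores
D = np.column_stack([(prof == 0), (prof == 1), pc]).astype(float)
coef, *_ = np.linalg.lstsq(D, y, rcond=None)
# coef[:2] recovers the professor baselines; coef[2:] recovers beta
```

A true random-intercept fit (as with lmer in lme4) shrinks the per-professor baselines toward a common mean; with only a handful of professors per course, the fixed-dummy version above is just the simplest illustration of why the grouping term is needed.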

Cluster Analysis
After performing principal component analysis and plotting the biplots for each class, a group structure within each class was explored. Groups of students with similar starting attitudes are of interest in order to look for differences in performance and learning strategies between the groups. The groups of students were compared on their course performance, their attitudes at the end of the course and their study habits throughout the semester, to see if there is any relationship between the starting attitudes and their activities throughout the semester. The method utilized to uncover these groups was cluster analysis.
Cluster analysis attempts to uncover groups or clusters that are homogeneous within a dataset. There are several methods for performing cluster analysis. The method determined to be most suitable (based on the cohesion within the clusters) for this dataset was k-means clustering. K-means clustering attempts to partition the classes of students into k groups (G_1, G_2, ..., G_k), where G_l represents the group of n_l students in cluster l. There are several ways to define the clustering criterion, with the most common method involving choosing the partition that minimizes the within-group sum of squares (WGSS) over all q variables:

WGSS = Σ_{j=1}^{q} Σ_{l=1}^{k} Σ_{i∈G_l} (x_ij − x̄_j^(l))²,

where x̄_j^(l) is the mean of the students in group G_l on variable j (Everitt and Hothorn, 2011).
While this method sounds fairly straightforward, in practice it is impractical to search every possible partition of the individuals into k clusters. Instead, algorithms exist to search for improvements in a clustering criterion after some starting partitions are made. With k-means clustering, k has to be determined before running the algorithm. Choosing k can be done several ways, including by running k-means for several values of k and analyzing a scree plot of the WGSS.
The chosen k corresponds to the location of the "elbow" in the plot (Everitt and Hothorn, 2011).
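The k-means criterion and the elbow heuristic can be sketched directly; this is a plain numpy implementation on made-up two-dimensional data, not the factoextra/eclust workflow the study used.

```python
import numpy as np

def kmeans(X, k, iters=100):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of its points. Returns cluster labels and the
    within-group sum of squares (WGSS). Centroids are seeded deterministically
    by spreading the starting points along the first coordinate."""
    order = np.argsort(X[:, 0])
    centers = X[order[np.linspace(0, len(X) - 1, k).astype(int)]].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    wgss = sum(((X[labels == j] - centers[j]) ** 2).sum() for j in range(k))
    return labels, wgss

# Two well-separated synthetic "attitude" clusters: an elbow plot of WGSS
# for k = 1..6 would drop sharply at k = 2 and flatten afterwards
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.3, size=(40, 2)),
               rng.normal(3, 0.3, size=(40, 2))])
wgss_by_k = [kmeans(X, k)[1] for k in range(1, 7)]
```

Plotting `wgss_by_k` against k is exactly the scree-style plot described above: the large drop from k = 1 to k = 2, followed by small decreases, locates the elbow at k = 2.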

Canonical Correlation Analysis
Once cluster analysis was used to find groups of students with similar starting attitudes in each class, statistical tests were performed to compare learning characteristics between the groups. To further characterize the relationship between learning preferences and starting attitudes, canonical correlation analysis was used.
Canonical correlation analysis looks for relationships between two sets of variables, like multiple regression, but with multiple response variables. CCA attempts to quantify the association between two sets of variables, x^T = (x_1, x_2, ..., x_q1) and y^T = (y_1, y_2, ..., y_q2), as the largest correlation between two single variables u_1 and v_1, where u_1 is a linear combination of x_1, x_2, ..., x_q1 and v_1 is a linear combination of y_1, y_2, ..., y_q2. Often, one pair of variables (u_1, v_1) cannot sufficiently quantify the relationship and several pairs are necessary. The pairs (u_i, v_i) are chosen such that the u_i are mutually uncorrelated, as are the v_i; the correlation between u_i and v_i is R_i, where the correlation decreases as i increases; and each u_i is uncorrelated with all v_j except v_i (Everitt and Hothorn, 2011). Here, the two sets of variables are the changes in attitudes and the rankings of learning activities, used to see if there is a relationship between the change in attitudes and the way students preferred to learn in the class.
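A minimal numerical sketch of CCA follows, assuming simulated data in which a single shared latent trait links the two variable sets; the QR-plus-SVD route below is a standard way to obtain the canonical correlations R_i.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between two variable sets: center each set,
    orthonormalize its columns with QR, then take the singular values of
    the cross-product. The singular values are R_1 >= R_2 >= ..."""
    Qx, _ = np.linalg.qr(X - X.mean(axis=0))
    Qy, _ = np.linalg.qr(Y - Y.mean(axis=0))
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Hypothetical data: attitude changes (X) and activity rankings (Y) tied
# together by one shared latent trait per student
rng = np.random.default_rng(3)
latent = rng.normal(size=(200, 1))
X = latent @ rng.normal(size=(1, 4)) + 0.5 * rng.normal(size=(200, 4))
Y = latent @ rng.normal(size=(1, 3)) + 0.5 * rng.normal(size=(200, 3))
r = canonical_correlations(X, Y)
# r[0] is large (the shared trait); the later correlations are near zero
```

Because only one latent trait links the two sets here, only the first canonical pair carries a meaningful correlation, which mirrors how the paper interprets the first one or two variate pairs per course.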

Omega Internal Consistency
The internal consistency of the attitude component structure of the SATS-36 instrument must first be checked to ensure that each group of questions is measuring the intended construct. Typically, Cronbach's coefficient alpha is reported for internal consistency; however, the alternative measure, Omega, is reported here as it imposes fewer restrictions on the data.
The pretest and posttest Omega values for all attitude components for each course can be found in Table .9, showing both the point estimate and the bootstrap confidence interval computed using the MBESS package in R. The Omega values for all attitudes are within the acceptable range (above 0.70), except those for the Difficulty component.
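For a single-factor (congeneric) scale, Omega can be computed directly from factor loadings and uniquenesses. The loadings below are hypothetical stand-ins, not estimates from the SATS-36 data; in the study the point estimates and bootstrap intervals come from the MBESS package in R.

```python
import numpy as np

def omega_total(loadings, uniquenesses):
    """McDonald's omega for a one-factor scale:
    omega = (sum lambda)^2 / ((sum lambda)^2 + sum theta)."""
    lam = np.asarray(loadings, dtype=float)
    theta = np.asarray(uniquenesses, dtype=float)
    common = lam.sum() ** 2
    return common / (common + theta.sum())

# Hypothetical 4-item attitude component with standardized loadings
lams = [0.8, 0.7, 0.75, 0.65]
w = omega_total(lams, [1 - l ** 2 for l in lams])
# w is about 0.82 here, above the 0.70 acceptability threshold noted in the text
```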

Principal Component Analysis
The pretest attitude scores are of the most interest to this research due to their timing within the course. Learning more about students at the beginning of the course, especially something with the potential to predict student success, is very important. In order to use the pretest attitude scores in a regression model, the correlation amongst them needs to be taken into consideration. To remedy this problem, principal component analysis was applied to the pretest attitudes using the prcomp function in the stats package in R (R Core Team, 2013).
This function takes the scaled pretest attitudes and uses singular value decomposition to determine the value of the principal components.
PCA was applied to all three courses separately. The results from STA307 are included here, with significant deviations from the other courses included for comparison. From the screeplot in Figure 2.2, the number of principal components necessary to explain a sufficient amount of the original variation in the STA307 pretest attitude scores appears to be 3. These first 3 principal components explain 86.7% of the original variance. The coefficient values for these principal components are in Table 2. In the biplot, the tendency of students with higher grades to lie closer to the positive attitude vector directions is also clear in these dimensions. The biplots for STA308 and STA409 showed similar results for the attitude vectors; however, the separation between the grade ellipses is not as apparent in those classes.

Hierarchical Linear Model
Once the principal components were calculated and the number of components sufficient to explain the pretest attitude variation was chosen, they were implemented in the hierarchical linear regression model to predict students' final numerical grades. The regression was run in R using the lmer function in the lme4 package. The resulting model for STA307 did not meet the assumption of normality of the residuals, so the means and confidence intervals for the coefficients were bootstrapped by re-sampling the data. The regression models for STA308 and STA409 met the assumption of normality. The results for STA307 can be found in the regression results table.

Cluster Analysis
Principal component and regression analysis indicated that students' pretest attitude scores may have a significant relationship to their course performance. The next step in the analysis was to explore the grouping structure within the pretest attitudes to see if there are groups of students with similar attitudes. These groups were then compared on their average attitudes at the beginning and end of the semester, differences in final grades and learning activities throughout the semester.
The cluster analysis was performed in R using the eclust function in the factoextra package (Kassambara and Mundt, 2017). The chosen clustering method was k-means, with k selected by applying the clustering algorithm for k = 1, ..., 10 and plotting the total within-group sum of squares, as seen in Figure 2.5. The graph stops decreasing sharply at k = 2. The cluster analysis was then applied to the pretest attitudes with 2 clusters specified. The resulting clustering can be seen in Figure 2.6, plotted in the dimensions of the first two principal components. The clusters appear to have some overlap. Similar plots were made for STA308 and STA409, as both also showed k = 2 as the best solution based on the WGSS. The plot for STA409 shows greater separation between the clusters.
Once the students were clustered based on their pretest attitudes, differences between these groups were explored. The first measure examined was the pretest attitudes themselves, to see the composition of the clusters. The average change in attitude throughout the semester was also investigated. Between the clusters, STA308 showed cluster 2 having an increase in attitude throughout the semester. The posttest averages follow the same pattern, with STA307 and STA409 scores higher for cluster 1 and STA308 having higher posttest scores for cluster 2.

Figure 2.5. Cluster screeplot to determine choice of k for the k-means clustering of STA307 students based on their pretest attitudes. Based on the total within sum of squares, the best choice of k appears to be 2.

Figure 2.6. Cluster scatterplot for the k-means clustering of STA307 students' pretest attitudes plotted versus the first two principal components.
Next, a qualitative analysis of the clusters based on the grades, demographics, learning preferences and study habits indicated in the exit survey was conducted.
Selected results are in Table 2.8. The questions involving use of resources were grouped into two categories: Rare Use (< 3 times) or Frequent Use (≥ 3 times).

The bold numbers represent a significant result on either a Wilcoxon Rank Sum Test or a Chi-Square Test of Independence between groups at the 5% level, and italics represent a significant result at the 10% level. The distribution of gender between the clusters shows a greater proportion of females in cluster 1 for STA307 and STA409, while STA308 has a significantly larger proportion of females in cluster 2. No class shows a significant difference between the clusters in terms of using the textbook as a resource, and only STA307 shows a significant result for studying in groups. It is interesting to see how many more students used the textbook in STA409 than in the other courses. For the average ranks of the resources, STA308 showed cluster 2 placing a significantly higher rank on the recitation problems, STA409 showed a significant difference in the mean rank of the lecture notes as a benefit to learning and STA307 showed a difference in the valuation of the exams. Finally, the average final grades within each cluster were compared. The average final grade is higher in cluster 1 for all classes, which corresponds with the results from the principal component analysis; however, only the difference for STA307 was significant.
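As an illustration of the Chi-Square Test of Independence used in these comparisons, the sketch below computes the Pearson statistic for an invented 2x2 table of cluster membership versus Rare/Frequent resource use; the counts do not come from the study data.

```python
import numpy as np

def chi_square_stat(table):
    """Pearson chi-square statistic for a contingency table:
    sum over cells of (observed - expected)^2 / expected."""
    t = np.asarray(table, dtype=float)
    row = t.sum(axis=1, keepdims=True)
    col = t.sum(axis=0, keepdims=True)
    expected = row @ col / t.sum()      # expected counts under independence
    return ((t - expected) ** 2 / expected).sum()

# Invented counts: rows = cluster 1/2, columns = Rare/Frequent textbook use
stat = chi_square_stat([[10, 20], [20, 10]])
# With df = (2-1)(2-1) = 1, the 5% critical value is about 3.84, so a
# statistic this large would be flagged as significant
```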

Canonical Correlation Analysis
Canonical correlation analysis was used to explore the relationship between the change in attitudes and the learning activity ranks for each class. The process was implemented in R. The first two pairs of canonical variates were computed for STA307. The correlation between the first pair of variates is 0.27 and between the second pair is 0.10. An interpretation of this result is that a positive change in interest and cognitive competence is weakly correlated with a low ranking of SAS and the lecture notes, but a higher ranking of the textbook. The second variates show that a positive change in difficulty and value and a negative change in cognitive competence are very weakly correlated with a higher ranking of SAS and the textbook and a lower ranking of the exams. This could show that students who left with higher cognitive competence and interest than they entered with put a lower value on SAS and lecture notes, but learned independently from the textbook.
The results of CCA for STA308 indicate that a positive change in affect has a correlation of 0.16 with a higher ranking of Excel and the exams. This indicates that students who left with a more positive feeling towards the subject placed a high value on the Excel assignments and exams. The results for STA409 indicate that an increase in cognitive competence and a decrease in affect have a correlation of 0.31 with a higher ranking of the exams and a lower ranking of the textbook.

2.4 MAIN FINDINGS

2.4.1 Limitations
As with any survey design with human subjects, there are several biases present in the data. First, there is non-response bias from the students who did not consent or did not complete the surveys. There were 25, 78, and 49 students in STA307, STA308 and STA409, respectively, who did not sign the IRB consent form to have their data included in the research. Some of these students may have been absent from the classroom on the day the consent forms were distributed; others may have refused to participate. The sample size decreased again when looking at the number of students who completed both of the SATS-36 surveys: an additional 18, 94, and 5 students from each course were not included in the analysis. An explanation for the low involvement in STA308 is a lack of incentive for students to complete the surveys. Only one section offered an incentive to participate, biasing the results towards that section. STA307 and one section of STA409 included the surveys on a homework assignment, which helped encourage students to participate.
These reductions to the sample size may not have been random. The students who completed the surveys, whether of their own accord or because of the incentive, may well be from a different population of students than those who decided not to complete the surveys. The students who were absent from class when the consent forms were distributed may have a different relationship between their attitudes and course performance than those who were present.
In addition to the non-response bias, every survey has the potential for response bias. Students are self-reporting and answering a variety of questions. There is no way to know that the responses are entirely truthful, especially if students were rushing to complete the survey just to get the task done. To limit this, there are negatively worded questions on the SATS-36 to identify students answering every question the same way. This will not totally protect against response bias, due to the possibility that students answered how they thought they should, instead of how they actually felt. It is important to keep these biases in mind when interpreting the results of this paper. The responses being analyzed cannot be guaranteed to represent the entire course populations.

Practical Recommendations
Beyond the results discussed above, there are several practical implications of this work. The principal component analysis in conjunction with the regression analysis indicated a potential relationship between pretest attitude scores and course performance within two of the three classes. This is important because it shows that students' Value, Interest, Affect and Cognitive Competence at the start of the semester can affect their performance. Students with lower confidence in their technical skills and a lower perception of the subject tended not to perform as well throughout the semester. This can be an important finding, if confirmed with further studies, for developing a way to identify students at risk of performing poorly at the beginning of the semester and possibly implementing intervention methods.
The cluster analysis identified a potential two group structure to the class in terms of the pretest attitudes. Cluster one, containing students with higher average pretest attitudes, showed several interesting characteristics depending on the class.
In STA307, cluster one had a higher average final grade and was more likely to find the exams beneficial to their learning. In STA308, cluster one was more likely to find the recitation problems helpful and contained a smaller proportion of females. In STA409, cluster one was more likely to find the lectures beneficial to their learning. These findings are important because they indicate that there are learning style differences between each group of students and one group is performing better than the other.
In practice, professors should bear in mind that students rank the lectures and exams highly, and teaching assistants should consider the importance of the recitations to students' learning. It is also indicated that students in two of the classes do not view the textbook as a valuable resource. Other resources that students value in their learning process, regardless of cluster, include the online notes, practice exams and studying alone. The resources students do not utilize often include their TA's and professor's office hours and email to contact their instructors.
Knowing the study habits of the students is important for educators, especially if students are not utilizing a valuable resource. Potential recommendations here include evaluating the choice of textbook for each course, advertising office hours and promoting oneself as a resource for students, as well as requiring at least one office-hours visit during the semester. However, based on the characteristics of Generation Z students, their independent learning styles and technological skills may help explain the choices in resources.
Future work includes looking into student collaboration networks within each course to see how students work together and choose which peers to work with.
Other future work could consist of following up in future classes with similar pretest attitude surveys and exit surveys to see if the results are consistent, as well as asking additional questions about other resources and learning preferences. Adding an option for students to describe their own learning styles and give suggestions at the end of the semester could also be very beneficial to understanding the needs of the current generation of students.

Figure .10. Correlation plot for pretest and posttest attitude components for STA308.

Figure .11. Correlation plot for pretest and posttest attitude components for STA409.

Figure .12. Biplot of the first two principal components for the pretest attitudes for STA308, with students plotted as points colored by their final grade as an A or A- (89.5 and up) or not.

Figure .13. Biplot of the first two principal components for the pretest attitudes for STA409, with students plotted as points colored by their final grade as an A or A- (89.5 and up) or not.

Figure .14. Cluster scatterplot for the k-means clustering of STA308 students' pretest attitudes plotted versus the first two principal components.

Figure .15. Cluster scatterplot for the k-means clustering of STA409 students' pretest attitudes plotted versus the first two principal components.