<<

Exploring Correlation among Different Elements of Student Evaluation of Teaching

Wen Cheng Rani Vyas Ranjithsudarshan Gopalakrishnan Department of Civil Engineering Department of Industrial and Manufacturing Department of Industrial and Manufacturing California State Polytechnic University, Engineering Engineering Pomona California State Polytechnic University, California State Polytechnic Univerity, Pomona, USA Pomona \Pomona [email protected] Pomona, USA Pomona, USA [email protected] [email protected]

Edwars R. Clay Mankirat Singh Department of Civil Engineering Department of Civil Engineering California State Polytehcnic University, California State Polytechnic University, Pomona Pomona Pomona, USA Pomona, USA [email protected] [email protected]

Abstract— The Research to Practice Full Paper intends the faculty in doing reflective work to comprehend student to explore the possible correlations among the distinct elements requisites. of the student evaluations. The students have their own Keywords—Student Evaluation, feedback assessments, perception of the course and instructor. The individualism of Correlation analysis, Panel . students in a class reflects the need to change instructional elements of the course when the student expectation is not I. INTRODUCTION satisfied thoroughly. Students ought to have their mind flooded University education is for students to enhance their with the things which they want to learn from the instructor. critical thinking and develop skills for the professional world. They entail improving their skills through university education for the upcoming competition of the professional world. In Cambridge dictionary defines education as a service addition, some of the previous research indicates student occupation, with the university students being considered as a evaluations are a crucial source of data available to increase the consumer. Consumer perceived service quality is one of the student performance of the course. most crucial features for every university. In general, student As a result, the authors are dedicated to the analysis of evaluations are the most popular platform for students to the instructor-based questions used in the evaluations. The convey their satisfaction towards the courses being offered usually used in a feedback form is generalized and [1]. branded according to instructional requirements. The Evaluations can be gathered from either online or correlation of course structure of assignment in the mind of an offline feedback assessments [2]. The feedback assessment instructor is different than in the mind of a student. The data used include courses offered by a selected professor at the forms consist of a list of multiple-choice, categorical, or Engineering Department at University. The student evaluations descriptive questions related to teaching effectiveness. The cover instructor and course-based questions. This include, course effectiveness of the methods used to teach critical concepts in structure and organization, course objective-defining and course- a classroom by universities can also be analyzed through these meeting, knowledge demonstration, concept-explaining, assessments [3]. communication effectiveness, class discussion-stimulating, There is extensive research dedicated to student student question-answering, consultation availability, student evaluations. For example, Peterson et al. (2001) demonstrate a learning facilitation by assignments, concept-reflecting by positive impact on student performance due to the use of exams, timely, and constructive feedback on student work, and student evaluations [4]. Another research article compares overall rating of the instructor. The rating scale with five response categories are employed which from very good student evaluation and student learning through meta-analysis (1) to very poor (5). The panel data used contains of each which showed that Student learnings/performance is not question from all students enrolled during a semester. Layer in affected by a having highly ranked instructor. [5]. However, the panel data differs for a lecture course and a lab course. very few research articles have been centered on the To capture the reliable correlation amongst instructor- correlation analysis of different elements of student based questions, some of the more popular rank correlation evaluations with the intention to verify whether there is any are used including Spearman's ρ, Kendall's τ, Pearson’s association among distinct teaching performance areas. r, Wilcoxon T-test. Some of the teaching activities are found to To this end, this paper aims to address the correlation have a stronger connection with the method of structuring between different questions used in student evaluations. assignments, which tend to have isolated impacts on student learning. It is anticipated that the research findings would assist Multiple correlation analysis methods are used with the Panel

1 data gathered from student evaluation at University. The panel the class (e.g., speaks data consists of both different years of data and multiple clearly, writes legibly, uses students’ data for the same year. appropriate visual The contribution of this paper mainly resides in two aids, listens areas: First, for the statistics field, it provides additional effectively)? insights to the identification of the difference and similarity of 8 How well does the instructor stimulate various correlation statistics relying on the rarely used and/ or encourage 1.00 1.36 1.09 0.08 educational empirical data. Second, the common results found appropriate class in the different correlation tests would lead to a better discussions? understanding of the relationship of various teaching elements, 9 How effectively does the instructor answer 1.00 1.36 1.10 0.08 which can then be used by the university community for the student question? betterment in student learnings. 10 How available is the instructor for student 1.00 1.30 1.09 0.08 II. DATA consultation? The panel data has been gathered by one University 11 How well do the instructor’s 1.00 1.32 1.09 0.08 using a student evaluation questionnaire acquired from the assignment facilitate Department of engineering. The evaluations include the student learning? courses taught by one same instructor from Fall 2009 to 12 How well do the exams reflect the Spring 2019 with the exclusion of Fall 2015 due to instructor 1.00 1.32 1.07 0.07 concepts of the sabbatical leave during that quarter. The university has course? changed from a quarter system to a semester system in Fall 13 How effective is the 2018, but the questionnaire remains the same. One different instructor in giving questionnaire was used for each lecture, and lab courses, timely and 1.00 1.25 1.06 0.06 constructive feedback respectively. The panel data spans 10 years and includes on student work? winter, fall and spring terms (quarters or semesters). The total 14 How would you rate number of lecture courses are 51, and of lab courses are 17 this instructor during this period. compared to other 1.00 1.44 1.05 0.07 instructor in the university? a Questions 1~2 & 15~17 are excluded from the research analysis since those are not related with instructor performance. b Min and Max refer to Minimum and Maximum values of the data.

TABLE 1 SUMMARY OF STUDENT EVALUATIONS FROM TABLE 2 SUMMARY OF STUDENT EVALUATIONS FOR LECTURE COURSES LAB COURSES Question Question Standard Min. Max. Mean Question Question Standard Number Deviation Min. Max. Mean 3 How effectively does Number Deviation the instructor 1 How well prepared organize and for the lab does the 1.00 1.26 1.07 0.08 structure the course instructor appear? (e.g.: course 2 How well did the objectives, grading instructor define and 1.00 1.33 1.07 0.07 1.00 1.30 1.09 0.08 policy, textbook meet objective of the and/or reference lab? material, office 3 How well does the instructor arouse hours, number of 1.00 1.45 1.17 0.12 exams and schedule interest and transmit of exam times)? enthusiasm in lab? 4 How well did the 4 How helpful and instructor define and available is the 1.00 1.32 1.06 0.07 instructor for meet the objectives 1.00 1.30 1.07 0.08 of the course? consultation during 5 How well does the lab and office instructor hours? demonstrate 1.00 1.36 1.06 0.07 5 How would you rate this lab instructor as knowledge of the 1.00 1.40 1.12 0.11 subject? compared to other lab 6 How effective is the instructors? instructor in 6 How well did the lab explaining the 1.00 1.64 1.13 0.12 coordinate with and 1.04 1.38 1.21 0.12 concepts of the reinforce the lecture? 7 How clearly were the course? 1.04 1.59 1.26 0.13 7 How effectively does written? the instructor 1.00 1.96 1.17 0.16 a Questions 8~10 are excluded from the research analysis since those are not related to the instructor’s performance. b Min. and Max. refer to Minimum and Maximum values of the data communicate with B. Kendall correlation The students were asked to answer the questions The Kendall rank correlation ( ) is a using a score of 1 to 5, with 1 being “very good” and 5 being nonparametric measure of a coefficient. This evaluates the “poor”. Only instructor-based questions are selected from both degree of similarity between two sets of ranked objects. This data banks for this paper. Tables 1~2 show that Q3 to Q14 for coefficient depends upon the number of inversions of pair of lecture courses and Q1 to Q7 for lab courses are selected, objects which would be needed to transform one rank order which are instructor-based questions. The tables also show into the other. Kendall coefficient of correlation is obtained by other basic statistical measures of the panel data, including normalizing the symmetric difference such that it will take minimum, maximum, mean, and . values between -1 and +1 with -1 corresponding to the largest possible distance, and +1 corresponding to the smallest III. METHODOLOGIES possible distance [7]. The can be The correlation analysis is one of the statistical calculated using the following formula: methods to compare two variables and find a relationship between them. The method could be parametric or non- (2) parametric. The parametric method assumes a specific distribution of the population and is usually used for Where continuous and unranked data. Whereas for ranked data, non- parametric tests are used which uses data with nominal or ordinal scale. The range of resulted correlation coefficient is usually between -1 to +1. Here the sign of the coefficient shows a negative or a positive trend between the two C. Pearson correlation variables. This paper worked with the mean value of ranked data, for which ranked correlation methods can be used as Pearson's is a parametric correlation ( ) method used well as parametric tests such as Pearson’s T-test. The usual for continuous data. It measures the association between absolute threshold value of a correlation coefficient of 0.6 is variables of interest as it is based on the method of . used to determine whether two student evaluation elements are Pearson is defined as the covariance of the two variables to the highly correlated or not. product of their respective standard deviation [8]. The data Four popular correlation methods are selected for the consists of the average score of some students enrolled in each correlation analysis, which includes Spearman's ρ, Kendall's τ, course. Even though the score from an individual student is Pearson’s r, and Wilcoxon T-test. The details of each rank value, however, the average score is a continuous correlation tests are shown in the following sections in order. variable. This is why Pearson's correlation coefficient r can be calculated between -1 and 1 [3]. The correlation coefficient A. Spearman correlation can be calculated using the following expression: Spearman’s rank correlation ( ) is a nonparametric (3) technique for evaluating the degree of linear association or correlation between two independent variables. It operates on Where the rank of the data rather than the raw data value. It assesses how well is the relationship between two variables using a monotonic function. The advantage is that results are unaffected by the distribution of the population because the D. Wilcoxon correlation technique operates on the rank values. It is not required for the Wilcoxon T-test is also called a Mann-Whitney U data to be collected over a regular space interval. It can be test. It is another nonparametric test of the null hypothesis. used with a small sample size and it is easy to apply [6]. The Under this test, two independent variables are tested to check correlation coefficient can be calculated using the following whether they are from populations having the same equation: distribution [9]. For a population of one of the two variables, the number of lower ranks and higher ranks for the sample (1) rank value are calculated and these are called wins and ties, respectively. The sum of wins and ties is called U or W. The correlation coefficient can be calculated using the following Where equation: x′=rank of x (4) y′=rank of y (5) mx’= of rank(x) my’= means of rank(y) Where

a rank of the data of two variable

of data P value of the Wilcoxon T-test is the probability of the ranked data after the sample value of z critical is compared. [6]

IV. RESULTS The detailed correlation results containing correlation graphs, summary of correlation matrix, and some highest correlated questions are presented in the following subsections. A. Correlation graphs Results for correlation coefficients is computed through the R language for each correlation method. The graphs are generated through correlation matrix plots. The (b) correlation graphs show the coefficient values for each method individually using the lecture and the lab course dataset. Fig 2. Kendall’s Tou correlation graph for lecture (a) and lab courses (b)

Fig 1. Spearman’s rho correlation graph for lecture (a) and Lab courses (b).

(a)

(a)

(b)

Fig 3. Pearson’s r correlation graph for lecture (a) and lab courses (b)

(a) (b) This diagonal in the above graphs showing question numbers has the value 1 in each cell as self-correlation is always positive 1. The upper triangle of the matrix shows colored circled which different radius representing the magnitude of the correlation, and the lower triangle of the matrix gives the numerical values representing the correlation coefficient. The side scale represents the color gradient for a value ranging from -1 (warmest) to 1 (coolest). In all the graphs above we do not have a negative coefficient which suggests that each question is positively correlated with at least one question. According to graphs, it is evident that Wilcoxon T-test shows the lowest number of correlations for both the lecture and the lab courses. Lecture case shows that almost all the questions are somehow positively correlated when it comes to instructor-based questions. This means that if one of the

(b) questions has a greater rank response from students, then all the other questions will tend to show a similar increase in response.

B. Summary of Correlation Matrix

Fig 4 Wilcoxon’s p-value correlation Graph for the lecture (a) and the lab When the results of Spearman correlation are compared courses (b) with other methods, the Kendall correlation method shows a little drop in the correlation coefficient values, whereas Pearson’s correlation method shows an increase in the correlation coefficient. Wilcoxon T-test has the most variating results than other results. TABLE 3 SUMMARY FROM CORRELATION MATRIX FOR LECTURE COURSES Question Pearson Spearman Kendall Wilcoxon

Lecture

Q3 7 10 2 0

Q4 7 4 1 2

Q5 4 2 0 2

Q6 6 3 0 0

Q7 6 2 0 0 (a) Q8 6 5 0 2 When looking at similarities in the lab and lecture analysis, it is evident results from lab courses are more Q9 2 3 0 1 consistent than the lecture courses. It could be due to a small Q10 3 4 0 2 sample size of the lab course dataset when compared to the lecture course dataset. As the dataset sample size increase, the Q11 7 7 1 1 correlation coefficients show more conflicting results in each Q12 6 4 0 1 method. Both analyses of labs and lectures agree with the correlation results by Kendall and Pearson correlation method, Q13 5 4 0 1 even though one is non-parametric, and the other is parametric. Pearson follows the hypothesis of normal Q14 9 6 0 0 distribution in a variable population which is proven true due Lab to similar results. A realistic view of the correlated questions can be Q1 4 4 1 1 noted when looking at the questionnaire. These correlations Q2 4 4 1 0 show that Q3 and Q4 from the lecture course questionnaire can be used by the instructor. Looking at the similarities in Q3 3 3 1 0 them, both questions address the organization of course specified and followed by the instructor. It can be stated as, Q4 3 3 0 1 effective course objectives, grading policy, reference material, Q5 4 4 3 0 number of exams, and schedule helps the instructor to meet the defined objectives of the course. Q6 0 0 0 0 The Q4 and Q12 are also correlated according to Q7 0 0 0 0 Wilcoxon Method. One another correlation according to Spearman is between Q3 and Q11. It can be anticipated that, the structure of course leads to effective student learning The difference between the four methods is through assignments. significant in Table 4. The Highest number of correlations is For lab courses, Q2 and Q5 are highly correlated. shown in the methods Pearson and Spearman whereas Kendall According to the questions, the defined objectives of the lab and Wilcoxon only show few correlations for some of the by the instructor would be related to the comparison of questions. The two methods, Pearson and Spearman illustrate different lab instructor in students’ views. It can be stated that similar results for data received from lab courses. Q6 and Q7 when the instructor is compared with other instructor, students for the lab courses are not correlated to any of the other usually look at the lab objectives devised by the instructor. questions according to all the methods used. The Wilcoxon correlation suggests Q1 and Q4 as highly correlated. It shows that, the preparedness by the instructor for the class gives students a chance to have the consultation for a query with the instructor during lab and office hours.

C. Highest correlated questions for lecture and lab courses V. CONCLUSION & LIMITATIONS TABLE 4 The highly correlated questions in each method for We have analyzed the panel data of the Student Lectures and Lab. evaluations of lecture and lab courses taught by the same Evaluation Data Spearman Kendall Pearson Wilcoxon lectures Q3~Q11 Q3~Q4 Q3~Q4 Q4~Q12 instructor at university for around 10 years. The correlated lab Q2~Q5 Q2~Q5 Q2~Q5 Q1~Q4 questions give the relationship between the two performance areas involved in teaching the course. The positively Table 4 illustrates only highly correlated questions correlated questions indicate that teacher performing well in from each method for both lecture and lab course datasets. It one aspect tends to have great performance in other aspects. shows the irregularities between the four methods. Wilcoxon This is helpful for the instructor to enhance his teaching T-test results do not match the other three methods in both performance by improving the highly correlated areas. course datasets. Assuming by the similar results of other Even though the analysis reveals a better methods, it can be said that a null hypothesis is not true for the understanding of the correlation of various elements of student dataset, suggesting that there are similar values in multiple evaluation of teaching, there are some caveats for the results pairs of variable populations. Kendall’s and Pearson’s as shown in the present study. First, all student evaluations coefficients show similar results for both datasets, which were collected from one instructor only. More data from suggest a suitability of the methods with each other and for multiple instructors might show different correlation patterns two diverse populations. Spearman coefficient agrees with the among the different teaching elements. Second, only the two methods Kendall and Pearson when the lab dataset is average scores of the entire class for each question are used analyzed. for correlation analysis. The association evaluation based on individual student data could yield totally different results. REFERENCES [5] Uttl, Bob et al., Meta-analysis of faculty's teaching effectiveness: Student evaluation of teaching ratings and student learning are not [1] Athiyaman, Adee, “Linking student satisfaction and service quality related. Studies in education evaluation, 2017. perception: the case of university education”, European Journal marketing 31,7, 1997, ISSN: 0309-0566. [6] Gauthier, Thomas D., Detecting trends using spearman’s rank correlation coefficient, Environmental forensics, 2001. [2] Menges, Robert J. & Brinko, Kathleen T., Effects of Student Evaluation Feedback: A meta-analysis of higher education research, 1986. [7] Abdi, Herve, The Kendall rank correlation coefficient, 2007. [3] Donald, Janet G. & D. Brian Denison, Quality Assessment of University [8] Chok, Nian Shing, Pearson’s versus Spearman’s and Kendall’s Students, The Journal of Higher Education, vol. 72:4, pp. 478-502, correlation coefficients for continuous data, 2010. 2001, DOI: 10.1080/00221546.2001.11777109. [9] McKnight, Patrick E. & Julius Najab, Mann-whitney U test, wiley [4] Peterson, Marwin W. & Marne K. Einarson. What are Colleges Doing online library, 2010, DOI: 10.1002/9780470479216. about Student Assessment?, The Journal of Higher Education, vol. 72:6, [10] Seldin, P., Using student feedback to improve teaching. D. DeZure 2001, pp. 629-669, DOI: 10.1080/00221546.2001.11777120. (Ed.), To lmprove the Academy, Vol. 16, pp. 335-346, 1997.