A Survey of Faculty Opinions Concerning Student Evaluations of Teaching
Michael H. Birnbaum
California State University, Fullerton
Student evaluations of teaching were originally intended to help improve instruction, but they may be doing more harm than good.
Because retention, tenure, promotion, and merit salary raises are influenced by student evaluations, faculty members make changes in their courses that they believe will improve their evaluations. This article explores beliefs held by members of the faculty concerning how changes in grading standards and content of courses would affect student evaluations and student learning.
A request to complete a survey was sent by email to all members of the faculty of California State University, Fullerton (CSUF), and 208 completed it. The majority judged that student learning can be improved by increasing course content and by raising standards for grading. However, they also stated that these improvements would hurt their evaluations. The majority judged that the current system of tenure and promotion discourages raising standards, encourages lowering of standards, and promotes "watering down" of course content. Most said that ratings are hurt by changes that would improve learning, and that the use of student evaluations of teaching is harmful to the quality of education.
A survey was also conducted of 142 students. The majority of students gave highest ratings to courses with the least content and the lowest standards; thus, the faculty understands student opinion.
Debate on Validity of Student Evaluations
A recent issue of American Psychologist featured the controversy on validity and biases of student evaluations of teaching. Meta-analysis of studies concluded that less than one-sixth of the variance of evaluations is associated with educational performance. Some authors warned that ratings are so complicated that anyone using them for practical purposes must understand nonlinear, nonadditive, multidimensional modeling of confounded judgment data.
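To put the "one-sixth of the variance" figure in more familiar terms, it can be converted to a correlation. The short sketch below assumes that the variance share refers to a squared correlation (r squared) between evaluations and a measure of educational performance; the meta-analysis itself may have used a different statistic, so this is only an illustration.

```python
from math import sqrt

# Assumption: "variance associated with educational performance" is r squared.
variance_share = 1 / 6

# The corresponding correlation is the square root of the variance share.
max_abs_r = sqrt(variance_share)

print(f"|r| < {max_abs_r:.2f}")  # prints "|r| < 0.41"
```

Under that assumption, the correlation between evaluations and educational performance would be smaller than about 0.41 in absolute value.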
My field of research is human judgment. With the same methods used in student evaluations, I found that the number 9 is judged to be significantly "bigger" than 221. Since 9 < 221, we should be careful not to evaluate faculty by the same methods that lead to wrong conclusions.
Apart from the actual validity of student evaluations is a potentially more important question, namely, their perceived validity. Although some teachers are fired because of student evaluations, most figure out how to get better evaluations. Do their adjustments promote student learning? No, according to a survey of CSUF faculty.
Survey of Faculty
Within one month of an email invitation, 208 members of the faculty completed the survey. There were 76 faculty members with fewer than 12 years of experience (68 were untenured), 66 with 12 to 24 years, and 64 with more than 24 years.
Faculty Theories of Student Evaluations
If you were to RAISE standards for grades in your class, would it affect your student evaluations? Nearly two-thirds of those surveyed (65.4%* or 136) reported that higher standards would result in lower evaluations, and only 3.4% (7) thought the opposite would occur; the others stated no difference. (*Asterisks throughout this paper mark splits that are statistically significant.)
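The paper does not say which test lies behind the asterisks. One common choice for a split like 136 against 7 is an exact two-sided binomial (sign) test of the hypothesis that respondents who chose a direction were equally likely to choose either one. The sketch below makes that assumption, and it also assumes that "no difference" responses are set aside; the function name sign_test_p_value is hypothetical and is not taken from the survey.

```python
from math import comb


def sign_test_p_value(k: int, n: int) -> float:
    """Exact two-sided binomial test of H0: p = 0.5, for k of n responses on one side."""
    tail = min(k, n - k)
    # Probability, under H0, of a split at least this lopsided in one direction.
    one_sided = sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    # Double it for the two-sided test, capping at 1.
    return min(1.0, 2 * one_sided)


# Counts reported above: 136 expected lower evaluations, 7 expected higher.
# Assumption: the "no difference" responses are excluded from the test.
lower, higher = 136, 7
p = sign_test_p_value(lower, lower + higher)
print(f"two-sided p = {p:.3g}")
```

With counts this lopsided, the p-value is far below any conventional threshold, which is consistent with the asterisk on the 65.4% figure. The same function can be applied to the other splits reported in this paper wherever both counts are given.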
If you were to INCREASE the amount of CONTENT (material) in your classes, would it affect student evaluations?
About two-thirds (65.9%*) responded that increasing content would decrease student evaluations, against only 4.8% who stated the opposite. The theory proposed is that with less content, the student believes that the instructor was very successful in teaching the subject. Because students do not know what content should have been included in the course, they will not know that important material has been omitted until later, long after the evaluations are done.
Are student evaluations influenced by such variables as the teacher's personality, attractiveness, gender, race, dress, religion, ethnicity, sexual orientation, or disability status? In response to this question, only 16.8% responded that student ratings are "unbiased"; 52.4%* responded that students are biased in favor of certain groups; 26%* responded that students are biased against certain groups.
Theories of Student Learning
The questionnaire defined student learning as "knowledge of the subject matter, as might be measured by objective, standardized exams...the sum of knowledge and skills that the student retains from the class and will be able to use in the future..."
How would increasing the content covered in class and in assigned readings affect student learning? 45.2%* said that increasing content would increase student learning compared to 27.9% who thought the opposite.
How would raising standards for grading affect student learning? 57.2%* responded that raising standards would increase student learning against only 7.7% who indicated the opposite. The theory most often expressed was that students will work to achieve a certain grade. If less is required to pass, students ease off in their studies, so they learn and retain less.
The Incentive Structure and Student Evaluations
Does the current system of promotion and tenure give incentives to RAISE standards for grading? A surprisingly high 92.3%* stated "no" compared to only 5.8% who said "yes."
Does the current system of promotion and tenure encourage faculty to LOWER their standards? 70.2%* said "yes" against 28.8% who said "no."
Does the use of student evaluations encourage faculty to "WATER DOWN" content in their courses? 72.1%* said "yes" against 26.9% who said "no."
Thus, the majority opinion of the faculty is that the incentive system for tenure and promotion causes faculty to lower standards and water down courses, which most faculty members believe will decrease student learning. Apparently, the majority of faculty believe that the incentive system has the opposite effect of what a citizen in favor of quality education would support.
Trends over Time
Over the years, have you changed the amount of material presented in your classes? 48.6%* said that they now present less material against 14.9% who said that they present more material, and the rest indicated no change. Over the years, have you changed the standards required to get a passing grade in your classes? 32.2%* said that they now use lower standards against 7.2% who said that they now use higher standards.
Since the majority opinion is that reductions in content and standards are harmful to student learning, it seems sad that so many faculty concede having made changes that they believe reduced the quality of education.
Student Preparation and Competence of Graduates
The questionnaire asked, "Please assess the preparation of students who are now enrolled in your college or university, compared to previous years." The majority (67.3%* or 140) reported that students are not as well prepared now, compared to only 2.4% (5) who said the opposite.
When asked what percentage of lower division students possess the study skills one should expect of the top 1/3 of high school graduates, the median response was 40%, with 85 responses below 30% and 134 (64%*) less than or equal to 50%. Apparently, about two-thirds of the faculty think that half or more of our students do not qualify under the state's concept for admission.
One theory is that declining standards for new teachers are a cause of this problem. Based on data published each semester at CSUF, students who plan to be teachers have some of the highest grade point averages (GPAs) on the campus. When asked if students with the highest GPAs are indeed the best students, only 12.7%* thought these "future teachers" are our best students; about twice as many rated these students as below average on the campus, and 55.8% judged them average. When asked, "what percentage of undergraduates who want to be teachers do you think should become teachers," nearly two-thirds (63%*) of respondents said that less than half should become teachers.
When asked, "what percentage of graduates in your department possess the general education, specific skills, and knowledge base that should be required of a graduate...," the median response was 60%. Thus, the average faculty member believes that two out of every five of our graduates are not qualified to receive the degrees we confer upon them.
A Survey of Students
A sample of 142 lower division students evaluated 89 hypothetical classes, based on combinations of three variables: instructor's individual characteristics (personality), standards for grading in the course, and the amount of content. The students represented 29 different majors; there were also 26 with undeclared majors. I anticipated that this heterogeneous mix of students would hold a variety of different views of what would be the optimal class. However, to my surprise, the students were remarkably homogeneous in their evaluations of courses:
94.4% (134* of 142) gave higher evaluations to an "attractive, well-dressed, 36 year old female with a nice personality" than to a "62 year old male with a slight tremor (due to a previous stroke) who doesn't smile in class."
92.3% (131*) gave higher ratings to a class with "light" content (less than 100 pages to read in a semester, and nothing else to do outside of class) than to a course with "heavy" content (800 pages to read and homework assignments); only 9 gave highest ratings to courses with the most content. Only 16.9% (24) rated a "medium" level of content as better than the "light" level, although the "medium" course was described as having "300 pages of medium level reading" to do in the semester, and as a course that might require "some study" to master the material.
97.9% (139* of 142) gave higher ratings to a course with "very easy" standards than to a course with "very hard" standards. Only 14 (9.9%) students gave their highest ratings to a course with "medium-easy" or "medium-hard" standards.
The "very easy" standards course was described as follows: "This instructor gives most students As and Bs, even those who are struggling with the material or who have not been diligent in attendance and study. Only the most clueless student will get a C in this class. If a person has half a brain and attends some of the time, (they get) an A or a B." In the "Medium-easy" course most students get As and Bs. "Medium-hard" was a class with 30% As and Bs, 50% Cs, and 20% Ds and Fs. The "very hard" course assigned 7% As, 13% Bs, 40% Cs, 25% Ds, and 15% Fail.
Students gave the highest rating to the course in which the teacher is attractive, where the standards for grading are lowest, and where the content is least. Apparently, the majority of faculty are correct in their understanding of what students like.
Conclusion: Student Evaluations May Harm Education
According to the majority of faculty members, the incentive system (using student evaluations for promotion and tenure decisions) puts teachers in a conflict of interest between making changes that would improve student learning and making changes that would improve student evaluations.
An implicit assumption in the use of student evaluations is that the average student is more likely to be right than the professor. However, it is questionable whether a professor should redesign a course to suit anonymous comments by students who have not yet finished one class on the subject. It seems doubtful that students who have not yet taken the next course in a sequence can judge whether they were adequately prepared in the first course.
Many students are inaccurate in describing what the teacher said in class when they are motivated to be as accurate as possible (when taking exams); therefore, is it reasonable to assume that these same students are accurate when they give evaluative descriptions anonymously with no incentive to be accurate and no penalty for libel?
Our incentive system has produced a decline in standards that diminishes education. Students are motivated to get good grades, and faculty are motivated to get good evaluations. Unfortunately, both of these interests can be satisfied by reductions in content and grading standards, which diminish education. The finding that the average member of our faculty thinks that only 60% of our graduates have educations to match their degrees is a sign that our institution is in trouble. We should begin to study how our incentive system can be changed to align the interests of students, faculty, and the people of the state.