For the most part I am like most of my colleagues: there are about a hundred things — some less than glorious — that I would rather do than analyze my semesterly course evaluations. But for whatever reason, I feel compelled to do so, especially given that in recent years my course evaluations have been in decline. I guess that I take some pride in facing something that makes most in my profession uncomfortable: how our students rate our courses. In my eyes you gotta face the music, even if that music is a bit confusing, obtuse, and even a bit grating.
Evolution is a course that I have been teaching for a long time, although I took a significant hiatus from teaching it between 2011 and 2014. It’s an interesting course because it retains some features of my older teaching style (such as weekly in-class quizzes at the beginning of each class session) and some features from my newer ‘innovations’ in teaching (such as weekly Follow-Up Questions and cumulative exams designed to assess learning outcomes). When I returned to teaching this course in the Fall of 2014 I went big, asking students to complete a rather ambitious research-based Term Project that required them to consider how adaptive hypotheses are tested. Student reaction to this assignment was mixed, and for the past two semesters I have backed off from this requirement; now the course just requires a couple of simulation-based laboratory projects plus a midterm and final exam. So how has this course gone over with students, particularly those from this past semester?
As I indicated in my analysis of course evaluations from the last time I taught this course, I was pleasantly surprised by high overall ratings for this newest version of Evolution. So would this positive student impression hold for the Spring 2016 semester? If you had asked me to place a bet, I would have put my money on another strong course rating: this most recent group of Evolution students created a wonderful classroom environment that I consistently enjoyed facilitating, and they seemed to be enjoying the course.
Oh, man… impressions in the classroom sure can be wrong, or at least out of synch with the course evaluation process. For Spring 2016 my overall rating for Evolution was 3.56 out of 4.00 (89%), my worst rating ever for this course. This is not a disastrous rating, but it certainly is not the kind of rating that I like to get. And perhaps most disturbing was the fact that I expected to get a better rating based on how the course had proceeded. What was driving this lower-than-normal rating?
Well, the first thing that I always look for is whether or not this lower rating represented an overall impression of students as a whole or just the dissatisfaction of a few students. If one or two students really rate me poorly, that can lead to much lower-than-normal ratings, especially since my ratings are generally good. One way to get at how much a particular rating represented the consensus of students is to consider the variance of student ratings. The lower the variance of the student ratings, the more the average rating represents student consensus. For this semester of Evolution the variance of this 3.56 rating was 0.51. This is a higher variance than the average variance for my courses over the past nine years (0.44), indicating that there was a mild effect of a few dissatisfied students. Digging deeper into the ratings reveals more about the nature of this dissatisfaction.
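To make the mean-versus-variance reasoning concrete, here is a minimal Python sketch. The two sets of ratings are invented for illustration (they are not the actual evaluation data), and the sketch assumes a 1-to-4 scale and the sample variance with an n − 1 denominator, since the evaluation report does not say which formula it uses. Both hypothetical classes have the same average, but in one the average reflects broad consensus and in the other it is pulled down by a few very unhappy students:

```python
from statistics import mean, variance

# Hypothetical ratings on a 1-to-4 scale; invented for illustration only.
# Class A: everyone gives a 3 or a 4 (broad consensus).
consensus_class = [4] * 11 + [3] * 9
# Class B: most students give a 4, but four students rate much lower.
split_class = [4] * 16 + [3, 2, 1, 1]

def summarize(ratings):
    """Return (mean, sample variance) of a list of ratings."""
    return mean(ratings), variance(ratings)  # variance() divides by n - 1

for label, ratings in [("consensus", consensus_class), ("split", split_class)]:
    avg, var = summarize(ratings)
    print(f"{label}: mean = {avg:.2f}, variance = {var:.2f}")
```

Both invented classes average 3.55, yet the variance of the split class is several times larger than that of the consensus class. That is the pattern to look for when deciding whether a low average reflects the class as a whole or a handful of students.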
As usually is the case, this dissatisfaction is not really about how well I run the course. These ratings contextualize student satisfaction with my general running of the course:
- 4.00 out of 4, Variance = 0.00: The instructor knows the subject matter thoroughly.
- 4.00 out of 4, Variance = 0.00: The instructor was well prepared for class.
- 3.70 out of 4, Variance = 0.43: The instructor presented the subject matter clearly.
- 3.70 out of 4, Variance = 0.43: The instructor utilized the class time well.
- 3.73 out of 4, Variance = 0.25: The instructor promoted a constructive classroom climate.
- 3.90 out of 4, Variance = 0.20: The instructor made the goals of the course clear.
- 3.80 out of 4, Variance = 0.27: The instructor clearly informed students how they would be evaluated.
- 3.70 out of 4, Variance = 0.33: The instructor provided feedback in timely fashion.
This I kind of know about myself: I deliver to my students a well-organized, well-executed course. So if there’s any problem with the course, it is not so much about preparation and performance as it is about the design and execution of the course. These lower ratings contextualize the things that students don’t like about my course design:
- 3.13 out of 4, Variance = 0.79: The instructor stimulated my interest.
- 3.10 out of 4, Variance = 1.04: The quantity of assigned work was appropriate to goals of the course.
- 3.55 out of 4, Variance = 0.47: The instructor’s evaluation/grading of my work was fair.
- 3.40 out of 4, Variance = 0.78: The course improved my understanding of the subject matter.
- 2.65 out of 4, Variance = 1.29: I would recommend this course to another student.
- 3.13 out of 4, Variance = 1.31: I would recommend this instructor to another student.
What do these lower categories of rating mean? That is of course the question that I struggle with on a regular (semesterly) basis! I immediately gravitate to trying to interpret the “recommendation” ratings, which are consistently lower for me. I wish that students did want to recommend both my course and me as an instructor to their fellow students, but if they aren’t enthusiastic about making those recommendations I would love to know why. Is it because I don’t stimulate student interest? Perhaps this low rating makes it hard for students to recommend the course. Or perhaps it is hard to recommend a course that requires as much work as Evolution? Or maybe I grade unfairly, which makes it hard to recommend me. The only way to get any sense of what these numbers mean is to delve into the student comments.
Student comments on course evaluations are probably the most important feedback I get. And yet as a scientist I am acutely aware of the problems that comments present: they can be curated, selectively ignored, and interpreted in a great variety of ways. The nice thing about numbers is that they don’t lie; the problem with numbers, as outlined above, is that they might not lie but they sure can play coy. The comments are the only way to put these numerical ratings in context.
Let’s start with positive comments that back up the more positive components of my evaluations. Consistently students praised the structure of the course: its organization, availability through our course Learning Management System, course activities, and overall pace. A number of students suggested that I was an “excellent teacher” and “dedicated”, and the word “interesting” was applied to both me as a teacher and the course material. Such comments seem to fly in the face of the worst numerical ratings above, but really they don’t: they just reflect the diversity of student perspectives that contribute to these ratings. Apparently some students really would recommend me as a teacher, and do find the course interesting. You can see this fact in the variance of these ratings. Although my recommend ratings are relatively low, their variance is high. That means that there are some students who really wouldn’t recommend me but others who would recommend me highly. As I have discussed before, lower ratings are statistically bound to have higher variance, but the variance of these low ratings, combined with positive comments that seem to contradict these ratings, confirms that there’s big disagreement among students about whether or not to recommend me or the Evolution course. I appreciated the positive comments that see how hard I have worked to design and organize the course and that recognize the passion I bring to the classroom. I am clearly reaching a good number of my students, but what about those who are dissatisfied?
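The link between lower ratings and higher variance follows from the rating scale having a ceiling. For any set of values confined to an interval, the population variance can be no larger than (max − mean) × (mean − min), a result known as the Bhatia-Davis inequality. The sketch below assumes a 1-to-4 scale (the lowest possible rating is not stated above) and uses a hypothetical helper name; for the sample variance the bound is rescaled by n / (n − 1).

```python
def max_variance(mean_rating, low=1.0, high=4.0, n=None):
    """Upper bound on the variance of ratings with the given mean.

    Uses the Bhatia-Davis inequality: var <= (high - mean) * (mean - low).
    The 1-to-4 scale is an assumption. If n is given, the bound is
    rescaled for the sample variance (n - 1 denominator).
    """
    if not low <= mean_rating <= high:
        raise ValueError("mean must lie within the rating scale")
    bound = (high - mean_rating) * (mean_rating - low)
    if n is not None:
        bound *= n / (n - 1)
    return bound

# Averages taken from the Evolution ratings listed above.
for m in (4.00, 3.70, 3.13, 2.65):
    print(f"mean {m:.2f}: variance can be at most {max_variance(m):.2f}")
```

A perfect 4.00 average forces a variance of zero, and the ceiling on the variance rises as the average falls toward the midpoint of the scale. A high variance on a low rating is therefore only informative when it is read alongside the comments.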
As is often the case, a good number of students would prefer not to work in groups. Out of twenty student responses, five students specifically expressed a dislike of working in groups. That’s a decent rate of complaint, and one that should be taken seriously. So what to do about it? Working in groups is an essential skill, and it has been my experience that students really need to work on this skill. On most weeks, students are randomized into groups, which maximizes the chances that students bringing different skills to the group will come in contact throughout the semester. That kind of challenge each week can be stressful to some students, but I think that the very students who are stressed by group work most need to work collaboratively in order to develop this skill. And the big problem is none of these comments provide any substantial critique of the group work; basically students just express a preference not to work in groups. As is often the case, the students don’t really process the course rating process in light of any course goals, they just express their emotional reactions to what they experienced. Perhaps I need to make the purpose of my group activities more explicit in the future; at least this would empower students to say something like “I know the goal of the group activities is to [fill in the blank], but I don’t think that they serve this purpose very well because….”.
The other major comments on the course related to workload and grading; these kinds of comments almost always appear on my course evaluations, which tempts me to ignore them as simple whining about rigor. If the students want a less challenging course, that’s not a request I feel I should honor: the current level of rigor for my course is the product of years of teaching at Pratt, and I believe that lowering that rigor would turn my courses into a bit of a joke. I don’t deny that plenty of joke courses — ones whose rigor is below that of even a middle school class — exist at Pratt and elsewhere, but I have no intention of capitulating to student whining so that my course joins the comedy show. So does that mean that I should always just ignore student whining about workload and grading? Probably not: the question is whether there is a way to make things a bit less stressful on students without lowering rigor. And in the comments that I got for Evolution this semester are some clues as to how to strike that balance. What a number of students complained about was the inflexibility of my course: deadlines that were hard and fast, and assignments that students had to do in order to avoid getting zeros. That kind of inflexibility has been the hallmark of my teaching thus far, and based on student course evaluation comments I am considering becoming more flexible (see below for what I plan to change).
There are also comments that don’t directly relate to any of the rating categories that are worth taking a look at. In particular, I look to see how students are reacting to the various components of the course, as these are the things that I can tweak in future years to perhaps make the course more satisfactory to all my students. There weren’t too many comments of this nature this semester, but one really stuck with me:
My first reaction at reading this kind of comment is anger, because what the student is saying kind of flies in the face of how I actually serve students with disabilities. But it pays to read the comment carefully: the student is very much focused on how much I emphasize fairness. It is hard to tell if I actually disrespected the student when I was approached about a disability, or if the student’s feeling of disrespect went along with the discomfort he or she was feeling. Either way, it’s clear that I need to make my track record a bit more clear, because when students do what they are supposed to do — deliver a letter from Pratt’s Disability Resource Center to me — I have always honored the accommodations to which they were entitled. But perhaps my larger emphasis on fairness — which is aimed at students who have no claim to accommodation — is making it harder for students with disabilities to approach me.
I teach two sections of Evolution this Fall. It will be interesting to see if modifications made in response to the rating and comments outlined above produce any change in my course evaluations.
I want to start by saying that I have been really enthusiastic about teaching architecture students at Pratt. Although for years I rarely saw an architecture major in any of my courses, once I was asked to start regularly teaching the Ecology for Architects course, I immediately saw the potential in teaching this group of Pratt students. Pratt’s architecture students are a very talented group of students, and the work that they do is tremendously important to the future sustainability of our societies. In class they frequently lead insightful discussions, and they bring a very broad diversity of personal experience to the classroom.
Although I had been teaching Ecology for years, I took the opportunity to teach Ecology for Architects as an opportunity to introduce a variety of new teaching tools and techniques. I really upped the quality of my in-class activities, moving from activities that mostly asked students to discuss key questions in groups to in-class projects that required that students tackle challenges that taught new skills. I also added Reading Questions that required students to perform close reading of assigned material and weekly Follow-Up Questions that allowed students to test their understanding after each class meeting. Some of these approaches inspired wholesale changes, as I have adopted Follow-Up Questions in all the courses I currently teach.
That all said, teaching Ecology for Architects has been a tremendous challenge. It’s a given that a little bit of instructor-student tension is part of teaching anywhere, and perhaps especially as part of the general education component of students’ educations. If some students were not annoyed at me being too demanding, or grading them too harshly, or having expectations that were higher than they wanted to meet, I would get worried that I had gone soft. Teachers should challenge students, and that challenge is always going to produce some tension. But too much tension is problematic. If a few students are frustrated, that to me is to be expected. But when students are consistently frustrated, that kind of tension is a sign of a dysfunctional rather than functional relationship between instructor and student expectations. And boy, have my architecture students been consistently frustrated. And wow, do they express that frustration via course evaluations.
If there’s any question about the effect of my Ecology for Architects course evaluations on my overall “performance” on these metrics, check out this graph:
As the labeling makes pretty clear, my ratings have taken a big dive every time that I have taught Ecology for Architects (each time involving two of my three course sections being this course). I can’t deny that overall my ratings have taken a dip over the past couple of years — even during the one semester when I didn’t teach Ecology for Architects — but when you take out this one course there’s at least the possibility that my ratings are part of normal oscillations. But the effect of Ecology for Architects on my ratings is pretty stark, and nearly undeniable.
An even more granular way of visualizing the effects of Ecology for Architects can be seen in this figure:
Not only are five out of my six worst ratings coming from past Ecology for Architects course sections, but many of these are “off the trend” in terms of their variance: unlike most of my lower ratings, these low ratings are driven by a more consistent low estimation of my teaching quality rather than just a few dissatisfied students. Don’t misinterpret the meaning of these slightly-lower-than-trend variances: there is still a lot of disagreement among Ecology for Architects students about the quality of my course. It’s just that in this course — and amongst this student population — the quantity of dissatisfied students is a lot higher: I can’t chalk low ratings in this course up to “a few students that I did not serve well”. Overall the architects clearly want something different than what I offer. What changes might they suggest to make my course more appealing to them?
As you have probably gathered, my ratings for Spring 2016 from Ecology for Architects students were pretty low. In one section of the course I received a 3.19 out of 4.00 (80%), my worst rating ever for this course. In the other section of the course, I received a 3.36 out of 4.00 (84%). Below I will break down where and how students were dissatisfied with my course, but let me start by observing an interesting trend: for the third time in a row, I had two very different sections of Ecology for Architects, and for the third time in a row these very different sections rated me in the same relative manner. The first was a full section with students predominately performing very well: the median final grade in this section was an 82.1% (B-), and six students got grades in the “A” range. The second was a rag-tag group of around a dozen students, many of whom rarely came to class: the median final grade in this section was a 66.2% (D+) and four students failed the course. You want to guess who gave me the harshest rating? That’s right, the high-performing students!
Breaking down my ratings in different categories helps to explain this seemingly-counterintuitive pattern. As with Evolution, students generally think that I run an organized and informed course:
- 3.74 & 4.00 out of 4, Variance = 0.65 & 0.00: The instructor knows the subject matter thoroughly.
- 3.84 & 4.00 out of 4, Variance = 0.25 & 0.00: The instructor was well prepared for class.
- 3.63 & 3.67 out of 4, Variance = 0.36 & 0.27: The instructor presented the subject matter clearly.
- 3.53 & 3.33 out of 4, Variance = 0.71 & 1.07: The instructor utilized the class time well.
- 3.42 & 3.33 out of 4, Variance = 0.81 & 1.47: The instructor promoted a constructive classroom climate.
- 3.37 & 3.67 out of 4, Variance = 1.02 & 0.27: The instructor made the goals of the course clear.
- 3.47 & 3.33 out of 4, Variance = 0.82 & 1.47: The instructor clearly informed students how they would be evaluated.
- 3.05 & 3.00 out of 4, Variance = 1.16 & 1.60: The instructor provided feedback in timely fashion.
- 3.42 & 3.67 out of 4, Variance = 0.59 & 0.27: The course improved my understanding of the subject matter.
Clearly these are not as good ratings as I got in Evolution, so how can I claim that they basically tell the same story? Well, just look at the variances! It’s not that most students rate me poorly in any of these categories, it’s that a few students rate me way lower than their comrades. It is possible that in Ecology for Architects I just don’t perform as well on these metrics as I do for Evolution, and that’s what’s driving these lower ratings. But two facts really make that hard to accept as an explanation: 1) the fact that my methods and practices relevant to these rating criteria are virtually identical in both courses; and 2) the fact that it is a small minority of students driving these ratings down. I am always ready to hear that there are aspects of my teaching that need improving, but these ratings seem to me to be one of the best examples of students using course evaluations to take retribution on an instructor who has given them grades lower than they feel entitled to. As we will see below, there are some criteria by which students more consistently rate me lower. But when I see the kinds of variances associated with the slightly-low rating on the above criteria, it is hard not to feel a bit violated by the course evaluation process.
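One way to see how few students it takes to produce these numbers is to search for rating sets that reproduce a reported mean and variance. The sketch below is a reconstruction under stated assumptions, not the actual responses: it assumes six respondents in the second section (the reported means there are all multiples of 1/6, but the response count is not given above), a 1-to-4 scale, and sample variance with an n − 1 denominator. The function name is hypothetical.

```python
from itertools import combinations_with_replacement
from statistics import mean, variance

def consistent_rating_sets(reported_mean, reported_var, n=6, scale=(1, 2, 3, 4)):
    """List every multiset of n ratings that rounds to the reported values.

    n and scale are assumptions; the evaluation report gives neither.
    """
    matches = []
    for ratings in combinations_with_replacement(scale, n):
        # Accept a set if both statistics agree to two decimal places.
        if (abs(mean(ratings) - reported_mean) < 0.005
                and abs(variance(ratings) - reported_var) < 0.005):
            matches.append(ratings)
    return matches

# "Promoted a constructive classroom climate", second section: 3.33, variance 1.47
print(consistent_rating_sets(3.33, 1.47))  # [(1, 3, 4, 4, 4, 4)]
# "Presented the subject matter clearly", second section: 3.67, variance 0.27
print(consistent_rating_sets(3.67, 0.27))  # [(3, 3, 4, 4, 4, 4)]
```

Under these assumptions, the 3.33 average with a variance of 1.47 corresponds to four ratings of 4, one 3, and a single 1, so one student's rating accounts for most of the drop.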
The really low overall average of my ratings is driven by these criteria:
- 3.00 & 3.17 out of 4, Variance = 1.00 & 1.37: The instructor stimulated my interest.
- 1.79 & 1.83 out of 4, Variance = 0.95 & 1.37: The quantity of assigned work was appropriate to goals of the course.
- 2.32 & 2.50 out of 4, Variance = 0.89 & 1.50: The instructor’s evaluation/grading of my work was fair.
- 2.44 & 3.00 out of 4, Variance = 1.32 & 1.60: I would recommend this course to another student.
- 2.32 & 2.83 out of 4, Variance = 1.12 & 1.37: I would recommend this instructor to another student.
To me these numbers tell most of the story. Again there is not agreement among students — just look at the really high variances! — but the low average does speak for itself. Almost at the same level as for my Evolution course, students find me mildly (i.e. 75%) stimulating of their interest. I do think that this is something for me to look at: are there ways that I can make the course more interesting to the students while still being a course about ecology and/or evolution? I am sure I could make many aspects of the class more dynamic, so this is an area to consider.
But the rest seems to me to be about workload and grades, and it is hard to not feel very frustrated by these ratings. Are three-to-four popular science articles a week too much reading? Is it too much to ask students to work through some questions before and after class? Apparently so, because that’s the “quantity of work” that students are responding to. I now require much less work of my students than I have in the past. In my first version of Ecology for Architects, students had to write a term paper! But for whatever reason, this amount of work is considered too much.
And a significant number of students also feel that they were graded unfairly. This too is tough for me to swallow, because the “unfair” that they are perceiving probably does not result from comparing themselves to other members of the class. I use a rubric for all of my grading, and although I am sure that I am not infallible or perfectly objective, the entire structure of the way I grade, by a system and with complete transparency, makes it hard for me to be grossly unfair. So what’s the unfairness these students perceive?
Perhaps delving into the comments that students made can help make sense of these ratings. As with Evolution, there were plenty of students who made positive comments about my enthusiasm, organization, and structuring of the course. When it came to suggesting improvements, the message was clear: not so much work, not so high expectations, not so much strictness. The word “harsh” in relation to my grading system was used many times. Several students suggested that I did not understand the workload of architecture students. A lot of the students considered the readings to be excessive or unnecessary, and the homeworks that I assign to be not useful. I can’t deny that this sentiment was expressed across the board, including by students who rated me highly overall (students remain anonymous in this process, but are identified by an arbitrary number, so I can see how numerical ratings and comments correspond).
Something that has always vexed me about these low Ecology for Architects ratings is that I couldn’t figure out why the architecture students had such negative reactions to what is almost the same course structure as my Ecology and Evolution courses, which are taught to non-architects. Over these last three years I have come to understand the context in which I teach this course a little better, and I am not so vexed… just frustrated. Here are the factors that I think drive my difficulty with the architecture students:
- I think we have a personality conflict. There’s something about the way I present expectations, and communicate feedback, and interact in class, that’s much more frustrating to architecture students than art, design, and other majors I teach. Trust me, it took me a long time to accept that explanation because initially I had a really hard time imagining that architecture students were that different. But they are. Of course that’s a generalization, but course ratings emerge from the general impressions of a population of students. And the population of architecture students is different from that of other creative majors at Pratt.
- My architecture students care more about their grades. It’s probably not the case that my non-architecture students love my inflexible expectations, they just don’t care as much about the negative consequences of falling short of those expectations. And this makes sense: if you are getting a Bachelor of Fine Arts from Pratt, you may have aspirations to go on to a Master’s Degree program, but what grade you get in a science class won’t affect that at all: your portfolio buoys or sinks your graduate school prospects. But apparently GPA matters for architecture students, so it makes them mad that my class is hard to excel in.
- There’s an assumption that non-architecture courses should know their place. The comments about understanding the workload of architecture students really illuminate this facet of our disagreement. My idea is that this is a three-credit course, and that it should have a certain workload associated with it that’s commensurate with those credits (there actually are federal and state standards for minimum workload per credit, and I would be more worried that I fall below these minimums than I would that my course has an excessive workload!). My students’ idea is actually more ecological in nature: they think that I should be structuring my course to assure that they can spend most of their time working on their studio work. I can understand that wish, but that’s not really how course crediting works.
- My biggest issue is that I am more demanding than my colleagues. What makes for the perfect storm with Ecology for Architects is that the students have to take the course and therefore there are a myriad of sections taught each year by different instructors. I am pretty sure that my standard for the course is the toughest, and that’s why the students perceive my class as unfair. They might still think the workload of my class was too high if all sections of Ecology for Architects were delivered with the same structure and expectations, but I don’t think they would think that my grading was as unfair. It’s the comparison to other members of their cohort, who enjoy far less demanding sections of Ecology for Architects, that makes them feel that my grading and demands are unfair. And I am quite sympathetic to the students on this point: it is unfair that they have to do more work to get the same grade for the same required course. Talk about a problem of high variance: they should rightly complain that there is such variation in the way this course is taught. If I were a student I would resent that our department hasn’t come to some consensus on how rigorous this course should be.
I understand totally that I can’t really make a reliable interpretation of these course evaluations, especially given that they have frustrated me so. But to the best of my ability to make this analysis, the above analysis captures my problem. And this is not a problem that I can really fix alone unless I am willing to fundamentally change the nature of the way I teach. So when — towards the end of this last semester — I was “given” the “opportunity” to not teach this course in the future, I agreed. Poof! My problem is solved, as is the problem for architecture students: they no longer will have to take my version of Ecology for Architects. And if you don’t think that course evaluations make any difference, this is an example that should change your mind: although some of the same complaints filtered through other channels, it’s the course evaluations that really pushed me out.
Where do we go from here?
All this analysis would be useless if I wasn’t planning to do anything in response to course evaluations. As you can see above, I am as susceptible as any other instructor to explaining away certain aspects of my lower ratings. This is healthy I think, as not everything students dislike is an actual course or teacher flaw. But I am not ready to explain it all away. My approach to these evaluations is to take them seriously and look for the really big messages that I can respond to.
Unfortunately, I still think that the rating that I will receive for a given course is highly predictable on the first day of the course. That’s not to say that I can look into the eyes of my students and tell you whether they are going to give me good or bad course evaluations. As I have explained, I consistently have very poor intuition about the kind of rating I will get from a given class, even at the end of the course. I am almost backwards in my intuition, as frequently I get the most push-back from classes of students who overall did the best in my courses. What I mean is that I don’t control the mix of personalities and expectations embodied in the students who enroll in my courses, and from semester to semester they seem to vary a lot.
Ideally, students enroll in one of my courses because they want to take it. Although I don’t have data to back up this hunch, I have the sense that the classes that rate me the highest are composed of students who did their research and made a very deliberate choice to take my course. They might have looked at the Math & Science options and decided that courses in ecology and evolution best fit their interests. Or they might have spoken to other students whose judgment they trust and decided that I was a good fit (and let’s hope that the opposite is also happening, that students who didn’t like my course are dissuading other students like them from taking my course!). Or perhaps they read something about my courses on this site, including one of these course evaluation analyses. The point is that when students make informed choices, they are happier with my class and I am happier teaching them. What makes Ecology for Architects interesting is that student choice is more limited. They have to take that course in their second year, and the most they can do is choose their instructor (if that’s even possible). A lot of my problems with Ecology for Architects emerged because it appeared that students were deliberately avoiding signing up for my section of the course, presumably (and well backed-up by my course evaluations) because my version of this course is more rigorous than the alternatives.
What makes me confident in my hunch that my ratings are highest when students actively choose my course over alternatives? Well, to start with there are the lower ratings from the “captive audience” of Ecology for Architects who have to take my course. But I also have evidence, admittedly anecdotal, that when my courses fill up fast they tend to give me higher ratings and when they fill up late they tend to give me lower ratings. For me, having my courses fill up immediately is the gold standard: if my classes fill up immediately upon being offered for registration, that suggests that they are mostly filled with students who wanted to take these courses. Of course there are always other variables at play, with scheduling and student course needs being the most prominent ‘other reasons’ why a student might register for my course. But one thing is for sure: when a course is offered late, or just fills up late, the students who end up in that course are much more likely to have fallen into it. And my experience is that these late-arriving student groups often have the most negative reaction to what I offer in my courses. I have an interesting experiment in this principle at play for this coming Fall semester: my two Evolution courses were offered from the start of registration and filled up pretty quickly, so presumably they should have a higher proportion of students who actively chose my course over alternatives. In contrast, my Ecology course was added way after regular registration was over (because I made the switch away from Ecology for Architects), so it is more likely to be filled with students who just needed a course that fit with their schedule rather than their educational needs. We’ll see if that difference makes any difference in regards to course evaluations.
Although all students are required at this time to take two Math & Science courses — and some may only be made happy by having this requirement removed altogether — they have a wide variety of choices on how to fulfill this requirement. We’d all be better served if they made this choice carefully; one of the reasons I provide these analyses is so that prospective students can read more about my teaching style and decide if it is for them! The role of Academic Advising is also crucial here: although advisors shouldn’t be in the business of recommending or not recommending particular professors, they can allow students to better understand the style and structure of the different available courses and find the best fit for their needs. I think that this is happening, but perhaps it could happen even more effectively.
As I look back at my ratings history, I really don’t think that I have done a lot to change the way I teach. There’s always been the heavy use of the Learning Management System, there’s always been the expectation that students will do substantial work and do that work on time, and I have always brought about the same level of energy and enthusiasm to my courses. So what has changed to make my ratings decline? Two things seem most likely to have changed: the rigor/form of my grading, and the students who walk through my door. I have become a bit more rigorous with students, using more precise assessments and assessing student attainment of outcomes on these assessments more precisely. I also have turned to more traditional methods of assessing attainment of science learning outcomes — exams! — over the past few years. And although I don’t have any real measure of this change, it does feel like the kind of student walking through my door has changed. In the case of the architects I have taught over the past three years, the change is dramatic: there’s a very different culture among architecture students, and their expectations of my course and vision of their responsibilities to it are very different from my non-architecture students. In the case of my courses taken by the rest of the student body, the change is much more subtle. I am still getting classes composed of the kind of students who used to give me very high ratings, it is just that a lot more students who aren’t going to be satisfied with my teaching walk through my door at the beginning of each semester.
This chart, which shows the variance of my ratings over time, tells the story pretty well:
What’s the pattern of student disagreement about my course quality over time? Well, if you just try to overlay a linear pattern over the trend (see the black dashed line), it looks like students have — by and large, over time — had about the same amount of disagreement about the quality of my courses. But this line seems like a pretty horrible representation of the data! Contrast that with the non-linear polynomial trendline (see the green dashed line), which seems to capture the overall pattern of this noisy data. The story I would tell about this data is that when I first arrived at Pratt I was reaching a lot of students, but not some. Then I made adjustments — or something else changed, such as who selects to register for my courses — that lowered the amount of disagreement students had about the quality of my courses. Those adjustments helped me reach a “state of general agreement” from 2010 to 2013, a period where variance in my ratings was at a historic low. But since I started teaching Ecology for Architects in 2014, the variance of my ratings has gone way up. If I want to address this issue, I need to make another adjustment.
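For anyone who wants to run this kind of trendline comparison on their own ratings, here is a minimal sketch in plain Python. It is not the spreadsheet procedure used for the chart above: the function names `polyfit` and `r_squared` are hypothetical helpers written for this illustration, and the variance values are made-up placeholders, not my actual ratings data. It fits a straight line and a quadratic by least squares and compares how much of the variation each one accounts for.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients, lowest power first."""
    n = degree + 1
    # Normal equations (X^T X) c = X^T y, where X[i][j] = xs[i] ** j.
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("not enough distinct x values for this degree")
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs


def r_squared(xs, ys, coeffs):
    """Share of the variation in ys accounted for by the fitted polynomial."""
    mean_y = sum(ys) / len(ys)
    predicted = [sum(c * x ** p for p, c in enumerate(coeffs)) for x in xs]
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("ys are all identical; R^2 is undefined")
    return 1 - ss_res / ss_tot


# HYPOTHETICAL placeholder data: semester index (0, 1, 2, ...) and the
# variance of ratings in that semester. Using a small index rather than the
# calendar year keeps the powers of x small and the arithmetic stable.
semester = [0, 1, 2, 3, 4, 5, 6, 7, 8]
variance = [1.0, 0.8, 0.5, 0.3, 0.2, 0.3, 0.5, 0.8, 1.0]

linear = polyfit(semester, variance, 1)
quadratic = polyfit(semester, variance, 2)
print("linear R^2:   ", round(r_squared(semester, variance, linear), 3))
print("quadratic R^2:", round(r_squared(semester, variance, quadratic), 3))
```

One caution about reading the result: a higher-degree polynomial can never fit the same points worse than a lower-degree one, so a better R² for the curved line is not, by itself, evidence that the dip-and-rise pattern is real. With data this noisy the curve is best treated as a description of the pattern rather than a test of it.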
Although I feel pretty confident — as confident as I can be with limited samples — that there is a real “Ecology for Architects effect” driving my lower ratings, I also think that in the past few years other trends have driven my ratings below my past performance. And while I think this is because there’s been a slight-but-significant change in our student body, I don’t say that as a way of blowing off my responsibility to serve that shifting student body. With Ecology for Architects off my teaching plate for the time being, I have the ability to play around a little more with my course structures to see whether there are other things that I can do to improve how students feel about my course (without turning my courses into low-rigor facsimiles of a college course). What kind of playing around should I do?
I have several hypotheses, some not mutually exclusive, about why my ratings have declined over the past few years:
- In recent years, I have gotten away from asking students to integrate their creative work with their scientific inquiries. Ironically, this means that my courses require less overall work, but perhaps it is not the quantity but the quality of that work that students respond to. If students prefer to get into a creative project related to the science of the course, restoring term projects that center around creative work will improve student ratings of the course.
- Although the overall level of preparation of the students that enter my courses is still pretty high, I fear that there are more students arriving in my class who are not well-prepared for the kind of work I expect. Their lack of preparation can take the form of poor English-language skills, difficulty in writing, or minimal background in the kind of analytical thinking that a science course requires. If the reason I am getting more low ratings is that poorly-prepared students are expressing frustrations with my expectations, then the only way to improve these ratings would be to lower course rigor. Since this is not one of the changes that I plan to institute, if this ‘preparation problem’ is driving my lower ratings then I shouldn’t expect them to improve.
- Given how much students complain about them, perhaps some of my expectations are too high. Specifically, students have consistently railed against having strict deadlines for completing assignments: they suggest in their comments on course evaluations that they need more flexibility in when they complete their work for my class, and they don’t like that there are harsh penalties (i.e. zeros) for not completing work on time. This complaint has been particularly strong in Ecology for Architects, a course that admittedly has the highest expectations of timely work completion (even if the overall amount of work required in the course is much lower than a lot of highly-rated courses I have taught!). If my ratings problem relates to overly-high deadline expectations, lowering these expectations should raise these ratings. Perhaps some students just don’t want me at all. But others want a kinder, gentler version of me!
So this semester I plan to institute a number of changes in my courses that will help me test the above hypotheses. I am well aware of the fact that I can’t make scientific tests of these hypotheses, both because the predictions they generate are too general and because my before and after sample sizes are too small. So my “test” is more in the tradition of Scholarship of Teaching & Learning than science, which is still a worthwhile pursuit.
My first planned change is to re-institute a creative Term Project in my resurrected version of Ecology. This course has been shelved for a while, and with my departure from Ecology for Architects it lives again! What I plan to do is to use most of the good material from Ecology for Architects, including a lot of well-developed activities, but dramatically change the way that students are graded. I am going to retain the cumulative Final Exam, but replace the Midterm Exam with the Term Project. This will be my first experiment with a writing-intensive creative project, a model that I hope to use in all of my future General Education “core” courses (starting Fall 2017, all Pratt undergraduate sophomores in the art and design majors will be required to take one of these Math & Science courses… this will undoubtedly require new adjustments to the way I teach!).
The expectations of this Term Project, both in terms of the amount and quality of work, are high, so this will be a true test of the hypothesis that students would rather do more work on things related to their major than less work on assignments that don’t connect with their majors. Students will have to generate two pieces of written work, a Project Proposal and a Project Summary, which emerge from a multi-step process of drafting, feedback, and revision. We shall see whether this Term Project generates higher or lower ratings. If the ratings are higher, then it will appear that students just want courses that better connect with their majors. But if the ratings are lower, perhaps students just have declining ability (or willingness) to complete work in their general education courses. It will be interesting to compare the course evaluations for Ecology next semester with those for Evolution, as I plan on retaining the exam-heavy assessment program in Evolution.
My second planned change is to radically change the nature of deadlines and overall work expectations in my courses. Students want a less harsh grading system, one that produces less stress and anxiety. I have devised a system that should address their concerns. The question is whether students still will be able to achieve my learning outcomes once I release some of the past pressure that I have exerted on them.
In the past, I have basically used grades as a stick. I wanted students to do certain things on a certain timeline, and if they did not do these things when I wanted they got a zero. A lot of the mediocre-to-failing grades in my courses emerged from this feature of my courses: students couldn’t (or didn’t) keep up with my deadlines, and their grade suffered. I think that a lot of student frustration with my courses also emerged from the “harsh” and unforgiving nature of my grading system. My rationale for maintaining this strictness was pretty simple: my fear was that if students did not have a strong incentive to complete assignments, they wouldn’t do them at all and their learning (as measured by performance on assessments) would suffer. But in Ecology for Architects, a disturbing trend started to erode my faith in my own “you must compel students to keep up with work” tenet.
If assuring students complete all coursework on time is important to their performance on major assessments of their learning, then there should be a strong correlation between the coursework grade and grades on these assessments. Trends on the Midterm Exam that I have been giving in Ecology for Architects don’t show this correlation. Here’s the data for my two Spring 2016 sections of this course:
As you can see, when I plot each student as a data point represented by their Coursework and Midterm Exam grades, very few students fall onto the line representing a perfect correlation (there’s a Pearson correlation of only 0.37!). In fact, there are a lot of students who do far less well on their Midterm Exam than they did in completing their Coursework. And there are even some students who do pretty poorly on their Coursework and still perform well on their Midterm Exam. What does this all mean? Well, it clearly suggests that simply doing the work I require (which does generally increase the student’s Coursework grade) does not guarantee success on assessments of learning. That’s a bit depressing, because it suggests that the work I require is not in and of itself a pathway to learning. But I also don’t think that this means that the work I require is useless: it’s just not the only factor determining student success.
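To make the 0.37 figure a little more concrete: squaring a Pearson correlation gives the share of variation in one measure that is linearly associated with the other, and 0.37 squared is about 0.14, so roughly 14% of the variation in Midterm Exam grades lines up with Coursework grades. For readers who want to compute the same statistic for their own gradebook, here is a minimal sketch in plain Python. The function name `pearson_r` and the two short grade lists are hypothetical illustrations, not my actual section data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 items)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations, and sum of squared deviations for each variable.
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sum((x - mean_x) ** 2 for x in xs)
    dev_y = sum((y - mean_y) ** 2 for y in ys)
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return co_dev / sqrt(dev_x * dev_y)


# HYPOTHETICAL grades (percent) for a handful of students, one pair each.
coursework = [95, 88, 72, 100, 60, 85, 91, 78]
midterm = [70, 92, 81, 75, 68, 88, 64, 90]

r = pearson_r(coursework, midterm)
print("Pearson r:", round(r, 2))
print("r squared:", round(r * r, 2))
```

The coefficient only captures the straight-line relationship between the two grades, so it says nothing about why a given student lands far from the line.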
Although this certainly merits further exploration, here’s what I think best explains the graph above. Some of my students are very conscientious, doing all their work. But checking off the box to do the work is not the same as studying hard, or thinking hard about the material. Some are just doing the work to get the credit, but that crediting is not translating to learning. And then there are some students who don’t bother with a lot of my work but apparently still understand the material well enough to perform adequately on assessments.
If doing assigned work in my class is no guarantee of successful learning, why does this work count so much towards the student’s grade? That’s the first question that I am asking myself. This question could push me to a real extreme that the majority of students probably wouldn’t like too much better: if my work does not matter, then really only scores on final “summative” assessments should count towards the final grade. Given that these courses are for non-majors and that a lot of the work done in class is really important in ways that can’t be measured by a final exam, it wouldn’t make a lot of sense to just have these ‘high stakes’ assessments determine each student’s grade. But perhaps there are some assignments that I can still offer but not absolutely require, or at least not require on a deadline.
I think that a big flaw in my current assignment and grading system is that it is based on the idea that all work in my course is of equal importance: everything counts, and everything has a deadline. I am going to soften that stance, substantially, and see what happens. My new mantra is “I only hand out zeros for missing work that really matters”. So basically the zeros go away for all student actions that I don’t have to have happen, or at least don’t have to have happen at a particular time.
For all my courses this means that homework assignments that are basically a form of review become extra credit. Don’t want to do my Follow-Up Questions at the end of each class? That’s fine… you can do them any time during the semester for extra credit, and if you don’t do them there will be no direct penalty enforced. This doesn’t just mean that I am being flexible about deadlines on this assignment: effectively, with so much extra credit available, I am being flexible about all the Coursework, as missed classes or missed required assignments can be in part compensated for by doing work that used to be required.
You can perhaps imagine my worry about this change, or at least see the hypotheses that I am going to be able to test. If doing these formative assignments on time is not important — or if doing them at all is not important — then this change shouldn’t impact student performance on the summative exams. I am very interested to see if simply leaving students to their own devices on these assignments has any impact on their exam performance. If it doesn’t, all the complaining on course evaluations about me being unnecessarily harsh in my grading system will be validated. I am ready to be wrong… I will be happy if these changes make students happier without reducing what they can demonstrate that they have learned in my course.
What I will not be doing is making work that I consider essential optional. Some of my classes have Reading Questions that must be done before class and some have an in-class Quiz to begin each class. Neither of these traditions is going away, and the consequence of missing this work will still be a zero. That’s because having students do the reading before coming to class is critical to all the activities that we do in class. Similarly, being in class is crucial, and so in-class participation and activities are still required, and still result in a zero if you miss them. Because credit earned towards Coursework is fungible (it does not matter whether you earn points by doing the “mandatory” work or earn the same points via extra credit), the overall incentive to do the readings and come to class is still reduced. But I think most students don’t even know what the word “fungible” means, so I expect there to be a strong bias towards coming to class and doing the reading-related assignments.
In my Ecology course, I plan to go even further with my experiment in flexibility and reduced “big brother” grading tactics. I have decided to make the Term Project, which is a multi-stage assignment, have only “suggested deadlines”. Now of course any student who bothers to see what’s required in the development of this project will see that there’s not a lot of room to procrastinate on completing the Term Project, but under my new policy there will be no penalty for turning in any stage of the project late. That’s not to say that there will be any grace for failing to complete all of these stages (which together are worth 40% of the final grade), but students can figure out when they have time to complete each stage.
Of all my experiments for the coming semester, this is the most radical because it could lead to total disaster. Frequently students have complained that my strict deadlines are incompatible with their major responsibilities and unnecessarily infantilize students by forcing them to adhere to a particular workflow timeline. I can see this point: students don’t always have control over when major work is dumped on them without warning, and in the real world there won’t be anyone micro-managing the steps towards completing a major project. It’s a major life skill to manage your time so that things get done, and we will see how well my students do without me forcing them to complete particular steps at particular times.
Like a good scientist, I don’t know what will become of my experiments: I am just curious to see how much they affect student performance, and whether student experiences of my courses change enough to push my course ratings higher.