I read Michael Goldstein’s description of the assessment data coming out of Khan Academy. I listened to the CEO of Knewton, Jose Ferreira, talk about the assessment data coming out of his platform. I feel hollow. These platforms measure the time of day and the amount of time a student spends looking at a webpage. They measure hint tokens. What can they tell us about what a student does on that webpage though? What can they tell us about what a student knows and doesn’t know when she gets a problem wrong?
They can measure proficiency but only if the task is defined down to the cornmeal that a machine can easily digest — multiple-choice and objective-response items. Free response gives them indigestion. When it comes to assessment and the relationship between learners and machines, the learners are giving way more than they’re getting. They’re meeting the machines more than halfway.
If the NRC’s 21st-century competencies should be assessed at all (open question), I don’t have a lot of hope that a machine on its own can grade them. We would need humans. But that doesn’t mean co-presence is required. We may be able to use machines to eliminate the need for co-presence while not letting them dumb down our assessments.
One example: the Smarter Balanced Assessment Consortium is designing assessments for California’s implementation of the Common Core State Standards. Look at one. You have a large text entry field and the prompt to “explain.” A machine has no hope of grading that meaningfully but a human can. So the machine offers an expansive, permissive input field and just passes the data along unmolested to a human not co-present for grading. I can live with that. (Whether the human can live with grading just that assessment for forty hours a week is another open question.)
Another. Here’s an assessment for “intellectual openness” that may not be possible without machines. (Again, I only want machines to facilitate the non-copresence. I don’t want them anywhere near the grading itself.)
Imagine: a student takes a survey before the assessment that asks for her opinion on any number of inflammatory issues. Thumbs up or down, perhaps. A Likert scale, maybe. “Do you agree or disagree: the death penalty should be legal?” It doesn’t really matter.
The machine then pairs that student with a non-copresent student who holds the opposing view and is taking the assessment at the same time. They are linked to each other in a chat. They need to explain their own views and attempt to convince each other while maintaining an open stance towards an idea with which they disagree. The non-copresence is a blessing here. You engage someone with whom you have no meaningful history, so we control for past history. A human reads the transcript later and uses a rubric to determine the open-mindedness of each participant.
(The exact elements of that rubric are left as an exercise for the reader.)
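For what it's worth, the machine's share of that job is small. Here is a rough Python sketch of the pairing step only, under some loud assumptions: the survey is reduced to a single agree/disagree answer per issue, each student answers each issue at most once, and the list passed in contains only students who are online right now. The names `SurveyResponse` and `pair_opponents` are made up for illustration; none of this comes from a real platform, and the grading stays with the human.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class SurveyResponse:
    """One hypothetical survey answer: does this student agree with the issue statement?"""
    student_id: str
    issue: str
    agrees: bool


def pair_opponents(responses):
    """Greedily pair students who gave opposite answers on the same issue.

    Assumes `responses` holds only students taking the assessment right now,
    with at most one answer per student per issue. Each student lands in at
    most one pair. Returns a list of (student_a, student_b, issue) tuples.
    """
    waiting = {}    # (issue, agrees) -> queue of student ids still unpaired
    paired = set()  # students who already have a chat partner
    pairs = []
    for r in responses:
        if r.student_id in paired:
            continue
        opposite = waiting.get((r.issue, not r.agrees))
        # Skip anyone in the queue who was since paired on a different issue.
        while opposite and opposite[0] in paired:
            opposite.popleft()
        if opposite:
            partner = opposite.popleft()
            pairs.append((partner, r.student_id, r.issue))
            paired.update({partner, r.student_id})
        else:
            waiting.setdefault((r.issue, r.agrees), deque()).append(r.student_id)
    return pairs
```

Anyone left without a partner simply waits in the queue. Everything interesting (the chat, the transcript, the rubric) happens after this step and is read by a person.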