Comments from colleagues
I reached out to the learning scientists and assessment developers on my team and asked them which pedagogical principles should guide the design. They shared that “as long as it’s clear what they got wrong and what the right answers are,” we are good pedagogically. Then I showed them the three design versions.
While they found the User Selection clear, most of my colleagues were concerned about how hard it was to spot the answer keys: “if you only see green and checkmarks, it looks to me like there were no errors.” They even gave feedback in the user’s voice: “as a user, I would benefit more from a more explicit indication of my selection being wrong” (with an “x” as an indicator, for example). The bottom line: they didn’t mind having the incorrect options called out, and for now, Rule B was the preferred design.
Polling results on social media
After testing these designs with my friends on social media, I found more evidence of the demand for visual cues that tell keys from distractors. I ran these “tests” through the Poll feature on Instagram Stories. Since Instagram allows polling between only two options, I sent out three polls to compare Rules A & B, B & C, and A & C. Just FYI: polling between two options each time sounds like a limitation, but it actually makes it easier for users to make a decision based on the given information. (Feel free to dive deeper into the theory behind this: the “diagnostic feature-detection hypothesis,” Wixted & Mickes, 2014.)
In the polls, instead of asking my friends to tell me “what did the user select and what are the answer keys,” I asked them to compare the respective designs and vote for the one that was “clearer about both User Selection and Keys.”
I had a few reasons for doing that. First, during my initial “pilot dry run” on Instagram Stories, I noticed that the success rate of their replies was very low (also an obvious failure of my designs). Second, my friends mentioned that they would spend a long time trying to figure out the “right answer” to my question, even after I had specifically asked them to just go with their intuition. Finally, I thought it was not quite authentic to ask them what they thought this “hypothetical user” had chosen, because the question did not contain any real content.
Later, when comparing results from the three polls, I noticed that Rule C was rated the less preferable option in both polls that included it. It was the loser among the three initial designs.
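For anyone who wants to run a similar round of two-option polls, here is a minimal Python sketch of the bookkeeping: it lists every pairing of the designs and counts how many polls each design lost. The function names and the vote counts are hypothetical placeholders for illustration only; they are not the actual poll results.

```python
from itertools import combinations


def pairings(options):
    """Return every two-option poll needed to compare all options once."""
    return list(combinations(options, 2))


def count_losses(poll_results):
    """Count how many polls each option lost.

    poll_results maps a pair (a, b) to its vote counts (votes_a, votes_b).
    A tie counts as a loss for neither option.
    """
    losses = {}
    for (a, b), (votes_a, votes_b) in poll_results.items():
        losses.setdefault(a, 0)
        losses.setdefault(b, 0)
        if votes_a < votes_b:
            losses[a] += 1
        elif votes_b < votes_a:
            losses[b] += 1
    return losses


rules = ["A", "B", "C"]
polls = pairings(rules)  # the three polls: A vs B, A vs C, B vs C

# Placeholder vote counts, invented for this sketch (NOT real data).
example_results = {
    ("A", "B"): (5, 5),
    ("A", "C"): (6, 4),
    ("B", "C"): (7, 3),
}

losses = count_losses(example_results)

# An option that lost every poll it appeared in is the overall loser.
polls_per_option = len(rules) - 1
overall_losers = [r for r in rules if losses[r] == polls_per_option]
```

With three designs, each design appears in two polls, so a design that loses both of its polls is the clear candidate to drop.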