Rethinking True/False: Strategies for More Rigorous, AI-Resistant Exam Questions

Why We’re Rethinking Assessment Design

Designing online exams that accurately measure what students have learned has always been a challenge, and the rise of Generative AI has made that challenge even more complex.

While some instructors have considered returning to paper exams or proctored testing, the College of General Studies has taken a different approach—designing better questions. By emphasizing rigor, reasoning, and engagement, we can develop online exams that are both more resistant to AI misuse and more reflective of authentic student learning—all while maintaining our commitment to offering undergraduate coursework that meets the needs of all students, including CGS’s nontraditional student audience.

This work is based on a revision project between the CGS Online Design Team and Jon Vallano, Associate Professor of Psychology. The revision approaches described below illustrate how thoughtful redesign of exam questions can improve assessment validity, reduce AI shortcuts, and deepen student engagement with course content.

Please note: some of the question content has been changed to reflect a broader audience and to maintain the integrity of Professor Vallano’s exam questions.

Why True/False Questions Fall Short

True/False questions can be efficient to write and grade, but they often:

  • Test surface-level recall rather than comprehension.
  • Allow students (or AI tools) to reason their way to an answer through generalities or cultural familiarity.
  • Give instructors little insight into how students arrived at their answers.

Additionally, Generative AI tools are particularly adept at handling binary, low-context questions, making the True/False format highly susceptible to exploitation.


Revision #1: Utilizing Shaded Evidence from Core Resources

One option for revising True/False questions is to push students to select the strongest or most appropriate evidence from a list of plausible answers derived from core course resources. At CGS, we call this approach utilizing “shaded evidence.” It requires students to think critically about degrees of accuracy and strength of support rather than simply spotting a single correct response.

This method prompts deeper engagement because students must weigh multiple reasonable possibilities and evaluate their relative strength against each other - skills that mirror authentic disciplinary thinking and are far less compatible with quick AI lookups. Consider the following example:

Original Question

In Their Eyes Were Watching God, Janie chooses to engage in a relationship with Tea Cake because he offers her a stark contrast to the life she lived with Jodie.

  • True
  • False

Problems

The question can be answered through general knowledge or AI without a close reading of the text. It doesn’t reveal whether students understand why the statement is true or false.

Revised Question

In Their Eyes Were Watching God, Janie chooses to engage in a relationship with Tea Cake because he offers her a stark contrast to the life she lived with Jodie.

Choose one piece of evidence from Jodie and one from Tea Cake that best support this claim:

  • “You sho’ loves to tell me whut to do, but Ah can’t tell you nothin’ Ah see.” (Jodie, Ch. 6)
  • “You oughta have some sympathy ‘bout yo’self. You ain’t no young gal no mo’.” (Jodie, Ch. 8)
  • “Dis ain’t no business proposition... Dis is uh love game.” (Tea Cake, Ch. 13)
  • “Us goin’ tuh do all we can tuh keep our love together.” (Tea Cake, Ch. 13)

Why It Works

  • Students must demonstrate an understanding of deeper concepts (in this case: characterization and relationship dynamics) rather than simple recall.
  • They’re asked to analyze and evaluate multiple pieces of plausible evidence before choosing their answers, which promotes deeper engagement with the text.
  • Generative AI struggles to select the best evidence among several plausible options, reducing the usefulness of AI-based assistance in a timed assessment.

Revision #2: Checking Understanding Through Resource-Specific Content

Another effective revision strategy is to design questions that require students to engage directly with course materials rather than rely on general knowledge or reasoning. These resource-specific questions assess comprehension by prompting students to locate and interpret information from assigned readings, videos, or lecture materials.

This approach reinforces authentic learning behaviors; students must return to their resources, verify details, and think critically about key concepts. It also makes AI shortcuts less practical, since success depends on knowing what the course materials actually say, not on general subject familiarity.

The example below demonstrates how converting a True/False question into a resource-based multiple-choice item increases both rigor and validity.

Original Question

Physical and psychological threats, evaluation systems, and organizational problems have been shown to be the categories of stress most commonly encountered by police.

  • True
  • False

Problem

All of the listed categories sound reasonable - even to someone who hasn’t read the material - making it possible to guess the correct answer through a combination of logic, cultural familiarity with the topic, and/or AI assistance rather than direct engagement with the textbook.

Revised Question

According to your course resources, which of the following is NOT a category of stress encountered by police?

  • Physical and psychological threats
  • Jurisprudential anomalies
  • Evaluation systems
  • Organizational problems

Why It Works

  • Students must consult their textbook to find the correct answer, encouraging engagement with course materials.
  • The question format gives instructors clearer insight into student thinking and comprehension.
  • When entered into a Generative AI tool, the question typically produces vague or overly cautious responses - further reducing the advantage of automated support.

The Takeaway

Shifting from binary True/False statements to evidence-based and resource-linked questions strengthens validity, reduces AI effectiveness, and fosters authentic learning. These small design changes can have a major impact, especially when paired with a few technical Canvas adjustments outlined on the next page.