DMPS Assessment, Data, and Evaluation

Tips for Writing Test Items

Writing Items
  • Each item should be written to focus primarily on one standard.  Secondary standards are acceptable, but it should be clear to all readers which standard is the focal point of the item.
  • Each item should be written to clearly elicit the desired evidence of a student’s knowledge, skills, or abilities.
  • Items should be appropriate for students in terms of grade-level difficulty, cognitive complexity, and reading level. For non-reading items, the reading level should be approximately one grade level below the grade level of the test, except for specifically assessed terms or concepts. 
  • Items are expected to include concepts detailed in the standards of lower grades.
  • Items should provide clear and complete instructions to students.
  • Options should be arranged according to a logical order whenever possible (e.g., alphabetical, least to greatest value, greatest to least value, length of options).
Language Usage
  • Present all instructions and procedures using simple, clear, and easy-to-understand language.
  • Keep prompts and stimuli to the minimum required length.
  • Avoid sentences with multiple clauses.
  • Use a series of simpler, shorter sentences in place of longer, more complex sentences.
  • Use vocabulary and sentence structure that is at or below grade level for prompts and directions.
  • Use vocabulary and sentence structure for prompts and directions that is at grade level when assessing reading skills.
  • Use vocabulary and sentence structure that is at or below grade level when assessing skills other than reading.
  • Use common words instead of unusual or low-frequency words.
  • Do not use ambiguous words, idioms, or jargon unless they are defined or part of the knowledge being measured.
  • Avoid false cognates, also called false friends (words that look or sound alike in two languages but differ in meaning), such as “billion,” which means 1,000,000,000 in English, while the similar Spanish word “billón” means 1,000,000,000,000.
  • Do not use words, phrases, names, or terms that may be culturally insensitive or unfamiliar to people of a given culture.
  • Avoid irregularly spelled words.
  • Avoid proper names unless necessary.
  • When a fictional context is necessary (e.g., for a mathematics word problem), use a simple context that will be familiar to the widest possible range of students (such as objects and activities commonly encountered in school).
  • Present essential words or vocabulary in bold.
  • Make prompts as direct as possible and use an active voice.
  • Word prompts positively and avoid terms like “not” and “never.”
  • When a prompt references a specific section of a stimulus, include the relevant section with the prompt when possible.
  • Do not use extraneous verbiage in answer options.
  • Present answer options in the shortest form possible.
  • For multiple-choice items, make answer options approximately equal in length.
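
Two of the guidelines above (positive wording and answer options of similar length) lend themselves to a rough automated check. The following is a minimal sketch in Python, not a district tool: the function name check_item, the word list, and the 1.5 length ratio are all assumptions chosen for illustration. A check like this can only flag items for a second look; it does not replace human review.

```python
import re

# Hypothetical word list and threshold; both are assumptions, not part of the guidelines.
NEGATIVE_TERMS = {"not", "never"}
MAX_LENGTH_RATIO = 1.5  # the longest option may be at most 1.5x the shortest


def check_item(prompt, options):
    """Return a list of warnings for one multiple-choice item."""
    warnings = []

    # Flag negatively worded prompts (whole-word, case-insensitive match).
    words = set(re.findall(r"[a-z']+", prompt.lower()))
    found = sorted(words & NEGATIVE_TERMS)
    if found:
        warnings.append("prompt uses negative term(s): " + ", ".join(found))

    # Flag answer options whose character lengths differ widely.
    lengths = [len(option.strip()) for option in options]
    if lengths and min(lengths) > 0 and max(lengths) / min(lengths) > MAX_LENGTH_RATIO:
        warnings.append("answer options differ widely in length")

    return warnings
```

For example, calling check_item with the prompt "Which of these is not a mammal?" and the options "Dog", "Cat", and "A very large blue whale" would return both warnings, while a positively worded prompt with options of similar length would return an empty list.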

Fairness
To determine item fairness, ask:
  • Do the items measure any irrelevant knowledge or skill? If so, will some group(s) be more greatly affected than others? 
  • Will any aspect of the test materials anger, offend, upset, or otherwise distract test takers? If so, will some group(s) be more greatly affected than others?
  • Do the test materials treat all groups of people with respect? If not, will some group(s) be more greatly offended than others?  

If some group(s) will be more greatly affected than others, a potential fairness problem exists. The next step is to determine whether or not the potential problem is a real one. Has difficulty been confused with fairness? Has a guideline been overgeneralized? Is the material important for valid measurement of the standard?