Impact of different scoring algorithms applied to multiple-mark survey items on outcome assessment: an in-field study on health-related knowledge.


Keywords: multiple-mark items, multiple answer items, pick-n items


Introduction. Health-related knowledge is often assessed through multiple-choice tests. Among the available formats, researchers may opt for multiple-mark items, i.e. items with more than one correct answer. Although multiple-mark items have long been used in the academic setting (sometimes with scant or inconclusive results), little is known about the implementation of this format in research on in-field health education and promotion.

Methods. A study population of secondary school students completed a survey on nutrition-related knowledge, followed by a single-lecture intervention. Answers were scored by means of eight different scoring algorithms and analyzed from the perspective of classical test theory. The same survey was re-administered to a sample of the students in order to evaluate the short-term change in their knowledge.

Results. In all, 286 questionnaires were analyzed. Partial scoring algorithms displayed better psychometric characteristics than the dichotomous rule. A penalizing algorithm in which the proportion of marked distracters was subtracted from that of marked correct answers was the only one that highlighted a significant difference in performance between natives and immigrants, probably owing to its slightly better discriminatory ability. This algorithm was also associated with the largest effect size in the pre-/post-intervention score change.
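The contrast between the dichotomous rule and the penalizing partial rule described above can be illustrated with a minimal Python sketch. The function names, the use of option labels, and the decision not to floor negative scores at zero are assumptions made for illustration; the study's exact implementation of its eight algorithms is not given here.

```python
def dichotomous_score(marked, correct):
    """All-or-nothing rule: 1 only if the marked set matches the key exactly."""
    return 1.0 if set(marked) == set(correct) else 0.0


def penalized_partial_score(marked, correct, options):
    """Proportion of marked correct answers minus proportion of marked distracters.

    Assumption: the score is left unfloored, so it ranges from -1 to 1.
    """
    marked, correct, options = set(marked), set(correct), set(options)
    if not correct:
        raise ValueError("an item needs at least one correct answer")
    distracters = options - correct
    hit_rate = len(marked & correct) / len(correct)
    # An item with no distracters cannot be penalized.
    false_rate = len(marked & distracters) / len(distracters) if distracters else 0.0
    return hit_rate - false_rate


# Hypothetical item: five options, two of them correct.
options = {"A", "B", "C", "D", "E"}
key = {"A", "C"}
answer = {"A", "B"}  # one correct answer and one distracter marked

print(dichotomous_score(answer, key))
print(round(penalized_partial_score(answer, key, options), 3))
```

In this hypothetical item the respondent marks one of two correct answers (0.5) and one of three distracters (about 0.333). The dichotomous rule gives no credit, whereas the penalizing rule gives about 0.167, which shows how partial scoring can separate respondents that the all-or-nothing rule treats alike.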

Discussion. The choice of an appropriate rule for scoring multiple-mark items in research on health education and promotion should consider not only the psychometric properties of the individual algorithms but also the study aims and outcomes, since scoring rules differ in terms of bias, reliability, difficulty, sensitivity to guessing and discrimination.

