Journal of the NACAA
ISSN 2158-9429
Volume 13, Issue 2 - December, 2020


A Comparison of Multiple Choice and Likert Scale Type Extension Program Evaluations

Augustin, C.L., Dickinson Research Extension Center Director, North Dakota State University


Five soil testing educational workshops were held in North Dakota. Attendees were given the same evaluation before and after the workshop. Evaluations had four sections, each covering a different soil fertility category. Each category had five multiple choice questions and one Likert scale question, “What was your knowledge before/after this workshop?” A score of zero indicated little or no knowledge. A score of five indicated that the participant was very knowledgeable. Likert scale question scores were compared to the scores on the five multiple choice questions in the respective soil category. Comparison of means determined that no difference was observed, indicating that evaluating student knowledge with multiple choice questions or with Likert scale methods is equally effective.


Evaluation is a crucial piece of Cooperative Extension outreach. Evaluation data have been used to document program outcomes (Lamm et al., 2013), demonstrate program value for funding purposes (McClure et al., 2012), improve Extension programs, hold Extension accountable (Jayaratne, 2016), and determine the amount or type of education that occurred (Kirkpatrick, 1954).

Knowledge gained can be measured with a Likert scale (Likert, 1932) instrument or with a multiple choice test. A Likert-based instrument collects ordinal data by asking students to rate their knowledge before or after a learning experience (Likert, 1932).

Both evaluation methods have their benefits and flaws. Likert scores may not truly reflect knowledge because they capture the subject's perception rather than a direct measurement. Pre-workshop evaluations may inflate knowledge because a participant may not yet be aware of how much they do not know. This phenomenon can cause post-workshop evaluation scores to show little knowledge improvement when in fact knowledge did increase (Rockwell and Kohn, 1989).

Multiple choice questions may cause test anxiety and adversely affect the knowledge assessment by negatively influencing cognitive and emotional factors for test takers (Cassady and Johnson, 2002). Guessed answers that happen to be correct can inflate a student's measured knowledge, which reduces the effectiveness of an evaluation (La Barge, 2007).

The purpose of this project was to compare Likert scale type evaluations with multiple choice test type evaluations. This project asks the questions:

  1. Is there a difference in assessed knowledge between a Likert evaluation tool and a multiple choice test?
  2. Does perceived pre-workshop knowledge, as measured by a Likert scale evaluation, change depending on whether the evaluation is administered before or after the workshop?



In March and April of 2019, five soil testing clinics were held in North Dakota. Each clinic occurred in a different county. The purpose of the workshops was to teach participants about soil testing and fertilizer management. Each workshop lasted three to four hours. Soil testing clinics were open to the public but designed for crop producers and agronomists. Forty-two participants filled out the evaluations.

Workshop participants were given nearly identical evaluations immediately before and after the workshop. Evaluations had 20 multiple choice questions on soil science issues that students would learn about while attending the soil testing clinic. The multiple choice evaluation (test) was divided into four soil science categories: macronutrients, micronutrients, soil testing, and soil management. Each test category had five multiple choice questions and one Likert scale (Likert, 1932) question regarding the participant's knowledge before or after the workshop. A six-point Likert scale was used because it can provide more reliable results than a five-point scale (Chomeya, 2010). The Likert scale ranged from zero to five. A score of zero indicated little or no knowledge of the respective soil category, and a score of five indicated that the participant was very knowledgeable about the soil category. A Likert value of zero was used because it was possible for test scores to be zero.

This project compared the means of students' knowledge before and after the workshop by evaluation type (Likert and multiple choice) and by timing of the assessment tool (before the workshop and after the workshop). Comparison of means was completed by Student's t-test with Statistical Analysis Software Version 9.4 (2016). Likert questions were compared to their respective test category. The number of correctly answered multiple choice questions was used as the test score. Means of Likert scores were evaluated because the results had a normal distribution (Boone and Boone, 2012; Jamieson, 2004; Sullivan and Artino, 2013).
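The scoring step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the question order within the evaluation, the answer key, and the participant responses are all made up, and the function name `category_scores` is hypothetical rather than part of the project's actual tooling.

```python
# Hypothetical layout: four categories of five questions each, matching
# the 20-question evaluation described in the text. The question order
# is an assumption made for this sketch.
CATEGORIES = {
    "macronutrients": range(0, 5),
    "micronutrients": range(5, 10),
    "soil testing": range(10, 15),
    "soil management": range(15, 20),
}


def category_scores(responses, answer_key):
    """Count correct answers per category, giving a score from 0 to 5."""
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer_key must have the same length")
    return {
        name: sum(1 for i in idx if responses[i] == answer_key[i])
        for name, idx in CATEGORIES.items()
    }


# Made-up answer key and one made-up participant, for illustration only.
answer_key = list("ABCDA" "BCDAB" "CDABC" "DABCD")
responses = list("ABCDA" "BCDAA" "CDBBB" "AAAAA")

scores = category_scores(responses, answer_key)
for name, score in scores.items():
    print(f"{name}: {score} of 5 correct")
```

Because each category score falls on the same zero-to-five range as the Likert question for that category, the two values can be compared directly for each participant and category.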

Results and Discussion

Assessed knowledge before the soil testing clinic was affected by evaluation timing and instrument (Table 1). The multiple choice exam taken by students before the workshop (pre-pre-test) had the highest before-workshop knowledge score (1.95). The Likert knowledge assessment tool administered before the workshop (pre-pre-Likert) had a mean score of 1.58. The Likert evaluation tool that assessed pre-workshop knowledge but was administered after the workshop (post-pre-Likert) had a mean score of 1.86. The pre-pre-test score was significantly greater than the pre-pre-Likert score (p-value = 0.0048), and the difference between pre-pre-Likert and post-pre-Likert was also statistically significant (p-value = 0.0454). No difference was observed between the pre-pre-test and post-pre-Likert evaluations (p-value = 0.5071).
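As a rough check on these comparisons, a two-sample t statistic can be computed from the means, standard deviations, and counts reported in Table 1. The sketch below assumes a pooled-variance Student's t-test and uses a standard normal approximation for the p-value, which is reasonable only because the degrees of freedom are large. The source does not state whether the original analysis pooled variances, so this is not a reproduction of the author's exact procedure, and the function names are hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for a two-sample pooled-variance Student's t-test."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


def approx_two_sided_p(t):
    """Two-sided p-value from the standard normal (large-df approximation)."""
    return math.erfc(abs(t) / math.sqrt(2))


# (mean, standard deviation, number) as reported in Table 1.
rows = {
    "pre_pre_test": (1.95, 1.16, 168),
    "pre_pre_likert": (1.58, 1.18, 159),
    "post_pre_likert": (1.86, 1.22, 143),
}

comparisons = [
    ("pre_pre_test", "pre_pre_likert"),
    ("pre_pre_likert", "post_pre_likert"),
    ("pre_pre_test", "post_pre_likert"),
]

for a, b in comparisons:
    t, df = pooled_t(*rows[a], *rows[b])
    print(f"{a} vs {b}: t = {t:.2f}, df = {df}, p ~ {approx_two_sided_p(t):.4f}")
```

Under these assumptions the three comparisons fall on the same side of the 0.05 threshold as the reported p-values. The values will not match exactly, because the inputs are rounded summary statistics and the p-value uses a normal approximation.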

Table 1. Participants' knowledge before attending the workshop.

Evaluation             Mean   Standard Deviation   Number   p-value

Pre*-PreT-Test'        1.95   1.16                 168      0.0048
Pre*-PreT-Likert"      1.58   1.18                 159

Pre*-PreT-Likert"      1.58   1.18                 159      0.0454
Post**-PreT-Likert"    1.86   1.22                 143

Pre*-PreT-Test'        1.95   1.16                 168      0.5071
Post**-PreT-Likert"    1.86   1.22                 143

*Evaluation administered before the workshop.

**Evaluation administered after the workshop.

TEvaluation assessed knowledge before attending workshop.

'Multiple choice evaluation tool.

"Likert evaluation tool.

Post-workshop evaluation scores were affected by assessment tool type (Table 2). The mean score of the test that assessed knowledge after the workshop (post-test) was significantly higher (3.29) than the mean score of the Likert evaluation tool administered after the workshop to evaluate post-workshop knowledge (post-Likert, 2.60).

North Dakota State University Extension assesses knowledge as described by Kirkpatrick (1954). This evaluation was geared to evaluate learning and not behavior or results. To determine results and behavior, evaluations should occur sometime after the workshop. Adding a Likert question regarding implementation to the post-workshop evaluation tool used (Appendix B) could help determine Kirkpatrick's (1954) behavior or results. Koundinya et al. (2016) observed significantly higher follow-up evaluation response rates at two months than at ten months. However, better participant outcomes were observed at the ten-month follow-up evaluation interval.

Table 2. Participants' knowledge after attending the workshop.

Evaluation        Mean   Standard Deviation   Number   p-value
Post**-Test'      3.29   1.15                 160      <0.0001
Post**-Likert"    2.60   1.15                 140

**Evaluation administered after the workshop.

'Multiple choice evaluation tool.

"Likert evaluation tool.

Evaluation type and timing affected the measured knowledge gain (Table 3). The multiple choice evaluation tool indicated the largest amount of knowledge gained (1.34). In contrast, the Likert assessment tool administered after the workshop indicated the least amount of knowledge gained (0.74). It is possible that some workshop attendees guessed on the test, as some pre-test scores were higher than post-test scores. La Barge (2007) suggested pairing each test question with a “Yes, I know the answer” or “No, I am guessing” response to reduce the effect of guessing on test scores. However, scores improved significantly from pre-test to post-test, indicating that correct guessing during the pre-test was minimal (Table 3).

Newly learned knowledge can affect baseline results. A student may overestimate their knowledge on a Likert evaluation administered before the workshop. To reduce overestimation, Rockwell and Kohn (1989) suggest evaluating pre-workshop knowledge after the conclusion of a workshop and not before.

Participants in this project tended to underestimate their knowledge, as indicated by pre and post Likert evaluation scores that were lower than their respective pre-test or post-test scores (Tables 1 and 2). “Response-shift bias,” which complicates evaluations, can be reduced further by arranging the Likert evaluation tool so that participants rate their “knowledge after the workshop” before rating their “knowledge prior to the workshop” (Rockwell and Kohn, 1989). Using this ordering could have changed the results of this project.


Table 3. Differences of various pre and post assessment tools. Mean initial evaluation score was subtracted from final evaluation score.

Initial Evaluation     Final Evaluation    Difference of Means§
Pre*-PreT-Test'        Post**-Test'        1.34a
Pre*-PreT-Likert"      Post**-Likert"      1.02b
Post**-PreT-Likert"    Post**-Likert"      0.74c

*Evaluation administered before the workshop.

**Evaluation administered after the workshop.

TEvaluation assessed knowledge before attending workshop.

'Multiple choice evaluation tool.

"Likert evaluation tool.

§Different letters indicate statistical significance at the 0.05 level.
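The differences in Table 3 follow directly from the means reported in Tables 1 and 2: each is the final mean minus the initial mean, as the caption states. The short sketch below reproduces that arithmetic. The dictionary keys are labels chosen for this illustration, and the significance letters are not recomputed here.

```python
# Mean scores as reported in Tables 1 and 2.
means = {
    "pre_pre_test": 1.95,
    "pre_pre_likert": 1.58,
    "post_pre_likert": 1.86,
    "post_test": 3.29,
    "post_likert": 2.60,
}

# (initial evaluation, final evaluation) pairs compared in Table 3.
pairs = [
    ("pre_pre_test", "post_test"),
    ("pre_pre_likert", "post_likert"),
    ("post_pre_likert", "post_likert"),
]

# Knowledge gain is the final mean minus the initial mean, rounded to
# two decimals to match the precision of the reported means.
gains = {
    (initial, final): round(means[final] - means[initial], 2)
    for initial, final in pairs
}

for (initial, final), gain in gains.items():
    print(f"{initial} -> {final}: {gain:.2f}")
```

Seen this way, the Likert tool yields a smaller gain when prior knowledge is rated after the workshop because the post-pre-Likert baseline (1.86) is higher than the pre-pre-Likert baseline (1.58), while the final score (2.60) is the same in both cases.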



  1. Likert and multiple choice type evaluation tools are both effective methods of assessing the amount of learning that took place in a workshop.
  2. An assessment of previous knowledge with a Likert evaluation tool is affected by whether the tool is administered before or after an educational experience.
  3. Multiple choice test type evaluations can show greater improvement in learning and thereby showcase greater value of Cooperative Extension.


Literature Cited

Boone, Harry N. Jr.; Boone, Debra A. (2012, April). Analyzing Likert Data. Journal of Extension, 50(2). Retrieved April 18, 2019, from

Cassady, J. C., & Johnson, R. E. (2002). Cognitive Test Anxiety and Academic Performance. Contemporary Educational Psychology, 27(2), 270-295. doi:

Chomeya, R. (2010). Quality of Psychology Test Between Likert Scale 5 and 6 Points. Journal of Social Sciences, 6(3), 399-403. doi:10.3844/jssp.2010.399.403

Jamieson, Susan. (2004, November 25). Likert scales: how to (ab)use them. Medical Education, 38(12), 1217-1218.

Jayaratne, K.S.U. (2016, February). Tools for Formative Evaluation: Gathering the Information Necessary for Program Improvement. Journal of Extension, 54(1). Retrieved April 15, 2019, from

Kirkpatrick, D. L. (1954). Evaluating human relations programs for industrial foremen and supervisors. Madison, WI: University of Wisconsin.

Koundinya, V., Klink, J., Deming, P., Meyers, A., & Erb, K. (2016). How do mode and timing of follow-up surveys affect evaluation success? Journal of Extension, 54(1). Retrieved February 15, 2018, from

La Barge, Greg. (2007, December). Pre- and Post-Testing with More Impact. Journal of Extension, 45(6). Retrieved April 23, 2019, from

Lamm, A.J., Israel, G.D., & Diehl, D. (2013). A national perspective on the current evaluation activities in Extension. Journal of Extension [Online], 51(1), Article 1FEA1. Available at:;joe/2013february/a1.php

Likert, R. (1932). A technique for the measurement of attitudes. New York: Publisher not identified.

McClure, M.M., Furman, N.E., & Morgan, A.C. (2012). Program evaluation competencies of Extension professionals: Implications for continuing professional development. Journal of Agricultural Education, 53(4), 85-97. doi:10.5032/jae.2012.04085

Rockwell, S. K., & Kohn, H. (1989). Post-Then-Pre evaluation. Journal of Extension, 27(2). Retrieved February 15, 2018, from

Statistical Analysis Software (Version 9.4) [Computer software]. (2016). Retrieved April 12, 2019.

Steele, S. M. (1970). Program evaluation - A broader definition. Journal of Extension, 517. Retrieved March 1, 2018, from

Sullivan, G. M., & Artino, A. R. (2013). Analyzing and Interpreting Data From Likert Type Scales. Journal of Graduate Medical Education, 5(4), 541-542. doi:10.4300/jgme-5-4-18