Revitalising rubrics

Ryan Campbell, Jakarta Intercultural School, Indonesia
Christian Bokhove, University of Southampton, UK

Used summatively as an instrument for assessing student performance and formatively to support learner progression, and known above all for its sheer ubiquity, the humble rubric has attracted a great deal of criticism. These criticisms take aim at both major functions of rubrics, ranging from fears that students are incentivised to narrowly reproduce rubric features in their work to concerns about the inherent subjectivity of written performance descriptors. While many of these criticisms are well reasoned and worthy of consideration, recent reviews of the evidence, in particular Panadero and Jonsson (2020), suggest that some of the points made against rubrics lack empirical support and that the case against rubrics may not be nearly as strong as critics believe. In this article, we first look at the definition and purpose of rubrics and then evaluate the major arguments for and against their use. We argue that the criticisms have been overstated and that rubrics remain a useful tool, as long as practitioners are aware of some limitations pertaining to their usage. The article concludes with practical examples for practitioners of how, when and why to use rubrics.

What are rubrics?

Two of the more prominent researchers into rubrics, Panadero and Jonsson (2020), summarise the aims of rubrics as two-fold. Firstly, rubrics are assessment tools that help judge the overall quality of student work. Secondly, rubrics can inform next steps for learner progression. Three design elements achieve these aims: student performance criteria, levelled descriptions of performance and an overall strategy for scoring student work. While Panadero and Jonsson (2020) go on to distinguish between analytic and holistic rubrics, those three elements of criteria, descriptions and strategy remain the same for both types of rubric and are consistent with the elements described in more directly practitioner-facing literature, such as work by Andrade (2005) and Brookhart (2018). These points are summarised in Table 1.

 

Table 1: Defining rubrics through their design

|  | Analytic rubrics | Holistic rubrics |
| --- | --- | --- |
| Scoring | Generates scores for each criterion | Generates a single overall score |
| Purpose | Better for feedback, as criteria are considered one by one | Better for grading, as only one decision is needed |
| Design element: student performance criteria | In separate rows for each criterion | Integrated into performance levels |
| Design element: performance descriptions | ✔ | ✔ |
| Design element: scoring strategy | ✔ | ✔ |

The critical look: Criticisms of rubrics

Perhaps because of their ubiquity, rubrics have come under sustained criticism from educators, much of it along the lines of the criticisms that Panadero and Jonsson (2020) address. For example, Christodoulou (2017) argues against the effectiveness of both the summative and formative functions of rubrics, stating that absolute judgements of work quality based on rubrics are unreliable. She further argues that rubric usage can distort teaching by priming teachers and students to focus on reproducing the items identified within the rubric at a surface level, rather than skilfully or with full understanding; in turn, this could conceivably prevent more creative or original responses from receiving the recognition that they should. Focusing on the summative dimension, Sadler (2009) has argued that grading decisions based on rubric use can be distorted, which, if accurate, would remove one of the core reasons for using rubrics. However, summarising these sources too simplistically (i.e. rubrics are ‘bad’) often says more about fundamental debates over the purpose of schooling, low-stakes versus high-stakes assessment, and formative versus summative uses of such tools than about rubrics themselves. Furthermore, according to Panadero and Jonsson (2020, p. 16), most of these major criticisms can be characterised by a lack of supporting empirical evidence, including an overreliance on anecdote, a lack of clear data behind conclusions, overstated claims and, most damningly, misunderstandings of previous research.

In their comprehensive analysis, Panadero and Jonsson (2020) reviewed these criticisms of rubric use and found that many were not as evidence-based as they first appeared. The inclusion criteria for their review were that articles: (a) criticised rubrics; (b) were published in journals or books (regardless of review process); and (c) were written in English. From these publications, the researchers grouped the criticisms into five major themes:

  1. standardisation and narrowing of the curriculum
  2. criteria compliance
  3. simple implementations (i.e. using rubrics) don’t work
  4. limitations of criteria and analytical assessments
  5. assessment criteria are context-dependent.

Panadero and Jonsson (2020) then identified sub-themes within each of these major critical themes and explored the extent of their empirical basis. These are summarised, alongside the review findings for each category, in Table 2. Tellingly, Panadero and Jonsson could find little empirical support for any of the five major criticisms of rubrics, and the remaining criticisms largely disappear if rubrics are used formatively rather than summatively.

Table 2: A critical review of the arguments against the use of rubrics (based on Panadero and Jonsson, 2020)

Theme #1: Rubrics lead to (over)standardisation and curriculum narrowing

Related sub-themes:

  • Standardisation of assessment through rubrics
  • Rubrics narrow the curriculum
  • Rubric usage reduces the variability of scores

Response to #1: High-stakes assessment can lead to distortions such as a one-size-fits-all approach or teaching to the test, but this is a result of the high stakes, not a criticism of rubrics. If rubrics are used formatively, standardisation becomes less of an issue (Panadero and Jonsson, 2020, p. 8).

Theme #2: Rubrics lead to narrow instrumentalism and ‘criteria compliance’

Related sub-themes:

  • Steering effect of assessment criteria: students adopt an instrumental, ‘meet the bar’ approach
  • Some students are likely to strategically adjust their learning strategies to assessment demands, as long as high-stakes consequences are attached to assessment outcomes

Response to #2: The literature tends to show the opposite, with studies showing students using rubrics to understand expectations, self-regulate their own learning and improve task performance (Panadero and Jonsson, 2020, p. 9).

Theme #3: Rubrics are a simple intervention, and simple interventions don’t work

Related sub-themes:

  • Teachers need training
  • Rubrics are no substitute for good instruction and assessment

Response to #3: The authors note that empirical support for these criticisms is largely lacking, while there is some evidence for the reverse, i.e. that simple implementations can work, even with untrained teachers, and that students can use rubrics effectively on their own.

Theme #4: Rubric criteria are limited and, in turn, so are analytical assessments derived from them

Related sub-themes:

  • No precision in criteria
  • Analytical assessments of individual criteria are not valid
  • No list of criteria is complete
  • Some criteria are not possible to articulate

Response to #4: Most of these criticisms are based on evidence from Sadler (2009), which is mainly anecdotal rather than empirical. More empirical work, such as a study by Bloxham et al. (2011), highlighted how assessors tend to use holistic post-hoc judgements rather than criteria when marking.

Theme #5: Rubric assessment criteria are context-agnostic, while assessment criteria are context-dependent

Related sub-themes:

  • Assessment purpose
  • Formative and summative uses

Response to #5: Panadero and Jonsson (2020) argue that these criticisms lack empirical backing. This criticism reflects the different purposes of assessment and the need to design assessment tools carefully. For example, rubrics used for high-stakes national assessments will likely need to be ‘context-agnostic’; however, if they are to further students’ learning in a formative way, a more tailored, context-specific rubric would be suitable.

Although the researchers and Table 2 provide some detail on the nature and validity of the criticisms, we surmise that the five themes relate to three fundamental questions regarding rubrics. Firstly, with regard to the purpose of assessment and schooling (themes #1 and #2), Biesta (2008) discerns three interlocking dimensions of that purpose, namely qualification, socialisation and subjectification. There is a tension between curriculum, standardisation and functionality on the one hand, where rubrics are used summatively, and personal development and growth on the other, where rubrics are used formatively. Low- and high-stakes contexts play a role here as well. Secondly, in relation to the actual validity and reliability of rubrics (theme #4), this is very much a case of ‘horses for courses’. Research shows that rubrics can be a valid and reliable tool, but this effectiveness sits apart from the purpose of schooling and depends very much on the types of judgements that one wants to make; a blanket dismissal of rubrics as unreliable is therefore unwarranted. We would argue that rubrics can capture enough of the learning process to be useful, provided their limitations are always taken into account. Thirdly, if the argument is to do away with tools because they are not perfect, then we risk throwing the baby out with the bathwater, as perfection is unattainable. The question then becomes: how can rubrics be most useful to practitioners?

How can rubrics be useful?

The current weight of evidence suggests that rubrics are most useful when used formatively with students. Brookhart (2018) makes the point that, when used formatively, well-designed rubrics hold value for both students and teachers. Alongside the very sensible caveat that, if used only for summative purposes, point schemes or rating scales would be easier for teachers to use than rubrics, Brookhart (2018, p. 10) suggests that the value of rubrics lies in their ability to explicitly share learning outcomes at different levels of performance with students. It follows that a key design element must be the careful construction of those criteria, something we address in more detail below. Similarly, Panadero and Jonsson (2013), in an earlier review, suggest that formative use of rubrics can help students to regulate and monitor their own learning. This could happen in various ways: students might use the rubric to engage more deeply with feedback, to plan, to assess task progress and to make final checks before submitting. They also point to other potential benefits, including reduced anxiety and improved self-efficacy, all of which indirectly benefit student performance. Lastly, it is worth noting that, in contrast with Panadero and Jonsson’s (2020) findings on rubric criticisms, at least one substantial review of the rubric research highlights that several of the studies showing positive use of rubrics employed quite rigorous designs, including experimental and quasi-experimental studies (Brookhart and Chen, 2014).

Rubric design parameters

As positive effects of rubric usage have been noted in research using a wide variety of designs, it may well be that rubric design matters less than how rubrics are used (Panadero and Jonsson, 2013, p. 141). Nevertheless, there are some clear design pointers for the practitioner looking to use rubrics. For example, Brookhart’s (2018, p. 10) advice to ensure that rubrics have generative rather than surface-level criteria seems eminently sensible and (far) more likely to yield positive outcomes for task performance and richer learning conversations. She provides two specific examples, which we have summarised in Table 3.

 

Table 3: Substantive vs surface-level criteria in rubric design (adapted from Brookhart, 2018)

| Example | Surface-level criteria/task directions | Substantive/generative criteria to describe learning |
| --- | --- | --- |
| History | ‘Has three sources’ | ‘Uses a variety of relevant, credible sources’ |
| English language and literature | ‘Write five paragraphs’ | ‘Write a compelling thesis’ |

 

Concluding remarks

The evidence and arguments reviewed above demonstrate that the case against rubrics is not nearly as strong as it may appear at first glance. The utility of the rubric as a simple, easy-to-use formative tool for making the implicit explicit to students means that, far from being in a state of disrepair, the humble rubric still has plenty of mileage left in it. We conclude with some practitioner-orientated suggestions derived from the literature discussed above.

Practitioner implications

  • The backwards rubric: For formative use, when there is less need for standardisation across classes, create your own custom rubric, perhaps with student input. Experiment with using a comparative judgement engine such as No More Marking to identify high-quality work, and then create a ‘backwards rubric’ from an analysis of that work. (Derived from Panadero and Jonsson, 2020, p. 8.)
  • The disappearing rubric: Begin with a rubric that is generalisable across tasks and, as the school year progresses and students internalise the criteria, gradually remove parts of the rubric. (Derived from Panadero and Jonsson, 2020, p. 8.)
  • The evolving rubric: Begin with a rubric that is generalisable across tasks and, as the school year progresses and students internalise the criteria, replace the old progressions with more demanding ones. (Derived from Brookhart, 2018.)
  • The (semi) co-created rubric: Co-create the major rubric criteria with students. Creating an entire rubric with students is probably not worth the opportunity cost, especially as most of the benefits come from co-creating the criteria alone. This is best done from actual work samples. (Derived from Andrade and Heritage, 2017, p. 61.)

References

Andrade HG (2005) Teaching with rubrics: The good, the bad, and the ugly. College Teaching 53(1): 27–31. DOI: 10.3200/CTCH.53.1.27-31.

Andrade H and Heritage M (2017) Using Assessment to Enhance Learning, Achievement, and Academic Self-Regulation. New York: Routledge.

Biesta G (2008) Good education in an age of measurement: On the need to reconnect with the question of purpose in education. Educational Assessment, Evaluation and Accountability 21: 33–46.

Bloxham S, Boyd P and Orr S (2011) Mark my words: The role of assessment criteria in UK higher education grading practices. Studies in Higher Education 36(6): 655–670. DOI: 10.1080/03075071003777716.

Brookhart SM (2018) Appropriate criteria: Key to effective rubrics. Frontiers in Education 3: article 22. DOI: 10.3389/feduc.2018.00022.

Brookhart SM and Chen F (2014) The quality and effectiveness of descriptive rubrics. Educational Review 67(3): 1–26. DOI: 10.1080/00131911.2014.929565.

Christodoulou D (2017) Making Good Progress? The Future of Assessment for Learning, Kindle e-book ed. Oxford: Oxford University Press.

Panadero E and Jonsson A (2013) The use of scoring rubrics for formative assessment purposes revisited: A review. Educational Research Review 9: 129–144.

Panadero E and Jonsson A (2020) A critical review of the arguments against the use of rubrics. Educational Research Review 30: 100329.

Sadler DR (2009) Indeterminacy in the use of preset criteria for assessment and grading in higher education. Assessment and Evaluation in Higher Education 34(2): 159–179.

