Whether you are modifying an existing rubric, creating one from scratch, or using a rubric developed by another party, it is worth evaluating the rubric both before and after use to determine whether it is the most appropriate tool for the assessment task.
Questions to ask when evaluating a rubric include:
Does the rubric relate to the outcome(s) being measured?
The rubric should address the criteria of the outcome(s) being measured and should not include unrelated aspects.
Does it cover important criteria for student performance?
Is the rubric authentic? Does it reflect what was emphasized for the learning outcome and the assignment(s)?
Does the top end of the rubric reflect excellence?
Is acceptable work clearly defined? Does the high point on the scale truly represent excellent work? Does the scale clearly indicate an acceptable level of work? These levels should be based not on the number of students expected to reach them, but on current standards defined by the department, often taking into consideration the types of courses from which student work was collected (introductory or capstone courses).
Are the criteria and scales well-defined?
Is it clear what the scale for each criterion measures and how the levels differ from one another? Has it been tested with actual student products to ensure that all likely criteria are included? Is the basis for assigning scores at each scale point clear?
Is it clear exactly what needs to be present in a student product to obtain a score at each point on the scale? Is it possible to easily differentiate between scale points?
Can the rubric be applied consistently by different scorers?
Inter-rater reliability, sometimes called inter-rater agreement, refers to the degree to which different scorers agree on the level of achievement for any given aspect of a piece of student work. Inter-rater reliability depends on how well the criteria and scale points are defined. Working together in a norming session to develop shared understandings of the definitions, and adjusting the criteria, scales, and descriptors accordingly, will increase consistency.
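Inter-rater agreement can also be quantified. The Python sketch below is a minimal illustration, assuming two scorers have rated the same ten student products on a hypothetical 4-point rubric scale; it computes Cohen's kappa, one common chance-corrected agreement statistic. The scores, the scale, and the choice of statistic are illustrative assumptions, not prescribed by this guidance.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of items both raters scored identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement: chance that the raters agree given each rater's
    # marginal distribution of scores.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical scores from two raters applying the same 4-point rubric
# criterion to ten student essays (invented for illustration).
rater_a = [4, 3, 3, 2, 4, 1, 2, 3, 4, 2]
rater_b = [4, 3, 2, 2, 4, 1, 2, 3, 3, 2]

print(f"Cohen's kappa: {cohens_kappa(rater_a, rater_b):.2f}")  # 0.72
```

A kappa near 1 indicates strong agreement beyond what chance would produce; substantially lower values suggest the criteria or scale descriptors need sharpening in a norming session before further scoring.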