Tuesday, February 23, 2010

23 Feb 2010: Rater Biases

Generosity error (too easy, grade inflation)
Giving everybody a higher score, possibly due to incompetence on the part of the teacher (everyone gets high marks)

Severity error
Too hard, no perfect papers, always critical, grades harder, expectations are high for the group, possibly due to expertise of teacher/judge/critic (everybody gets poor marks)

Central tendency
rating everyone about average
sports, P.E.

Halo Effect
general impression of individual (positive or negative) influences an individual rating
judging merit based on your "older brother" or "older sister" - stereotypes?

Logical error
rating alike or different based on the belief that factors are related (e.g. studious and able)
overrate gifted students because they are gifted or underrate them because they're nerdy
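
The first three biases above (generosity, severity, central tendency) show up as patterns in a rater's score distribution, so a rough screen is possible. The sketch below is only an illustration: the function name, the 1-5 scale, and the cutoffs `shift` and `narrow` are assumptions made for this example, not established standards. Halo and logical errors cannot be detected this way.

```python
from statistics import mean, pstdev

def flag_rater_pattern(scores, scale_min=1, scale_max=5,
                       shift=0.25, narrow=0.15):
    """Return a rough label for one rater's scores on a fixed scale.

    shift and narrow are hypothetical cutoffs, expressed as fractions
    of the scale range. They are assumptions, not accepted standards.
    """
    span = scale_max - scale_min
    midpoint = (scale_min + scale_max) / 2
    avg = mean(scores)
    spread = pstdev(scores)

    # Average sits well above the middle of the scale.
    if avg - midpoint > shift * span:
        return "possible generosity error"
    # Average sits well below the middle of the scale.
    if midpoint - avg > shift * span:
        return "possible severity error"
    # Average is near the middle and scores barely vary.
    if spread < narrow * span:
        return "possible central tendency"
    return "no obvious pattern"

print(flag_rater_pattern([5, 5, 4, 5, 5, 4]))
print(flag_rater_pattern([1, 2, 1, 2, 1, 1]))
print(flag_rater_pattern([3, 3, 3, 3, 3, 3]))
print(flag_rater_pattern([1, 3, 5, 2, 4, 3]))
```

A flag is only a prompt to look closer: a high average may simply mean the group performed well.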

Effective Rating Review
- focus on educationally significant outcomes
- characteristics should be directly observable (how can you rate what you can't see?)
- clearly define key points on scale
- select the most appropriate type of instrument (checklists, rubrics, rating scales)
- if possible, rate all performance on one task before going on to the next
- when possible, rate performances without knowing the student's name
- if the assessment has significant impact, several ratings should be used (multiple raters on essays for ACT)
- example practice [7-22, 7-23]
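
The point about using several ratings for high-impact assessments can be sketched as follows. This is a minimal illustration, not a description of how any testing program actually combines scores: the function name, the data layout, and the `max_gap` tolerance are all assumptions.

```python
from statistics import mean

def combine_ratings(ratings_by_rater, max_gap=1):
    """Average each essay's scores and flag essays where raters disagree.

    ratings_by_rater maps a rater label to a list of scores, one per
    essay, in the same essay order. max_gap is a hypothetical tolerance:
    a larger gap between raters marks the essay for another look.
    """
    results = []
    # zip(*...) regroups the scores so each tuple holds one essay's ratings.
    for scores in zip(*ratings_by_rater.values()):
        gap = max(scores) - min(scores)
        results.append({
            "scores": scores,
            "average": mean(scores),
            "needs_review": gap > max_gap,
        })
    return results

ratings = {
    "rater_a": [4, 3, 5],
    "rater_b": [4, 2, 2],
}
for number, result in enumerate(combine_ratings(ratings), start=1):
    print(number, result["average"], result["needs_review"])
```

Here only the third essay is flagged, because its two ratings differ by more than the assumed tolerance.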

ALTERNATIVE ASSESSMENTS
Portfolio assessment:
* A purposeful collection of student work that exhibits the student's efforts, progress, or achievement in a given area
* Can be a maximum performance or typical performance assessment
* Assessment purpose can be formative or summative, but a portfolio is also considered an instructional activity

Should include
* Student participation in selecting content
* Criteria for selection
* Criteria for judging merit
* Evidence of self-reflection by student
* Eliminate biases by double-blind rating system

Potential Strengths:
- fosters self-evaluation skills
- communication/evidence/feedback (for parents/students)

Potential Weaknesses
- time consuming, tendency to busywork
- need guidance, scoring issues (freedom vs. standardization), comparability problems

Concept Maps [see 9-48]
Instructional purpose
- organize knowledge
- connect concepts
- build schema - Piaget (assimilation, accommodation)

Evaluative purpose

Potential Strengths
- learning tool for building understanding
- evidence of understanding

Potential Weaknesses
- time consuming, completeness issues
- difficult to interpret and score (self-presentation, explanation)

Components
- nodes (concepts)
- links (connections)
- link phrases
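
The three components above map naturally onto a small data structure. The sketch below is one possible representation, with the class and method names invented for illustration. Each link stores a source node, a link phrase, and a target node, and a simple check lists expected concepts the student left out (one way to look at the completeness issue noted earlier).

```python
from dataclasses import dataclass, field

@dataclass
class ConceptMap:
    """Hypothetical container for a student's concept map."""
    nodes: set = field(default_factory=set)
    links: list = field(default_factory=list)  # (source, link phrase, target)

    def add_link(self, source, phrase, target):
        # Adding a link also registers both concepts as nodes.
        self.nodes.update([source, target])
        self.links.append((source, phrase, target))

    def propositions(self):
        # Read each link as a sentence: node + link phrase + node.
        return [f"{s} {p} {t}" for s, p, t in self.links]

    def unlinked(self, expected):
        # Expected concepts that do not appear in any link.
        linked = {n for s, _, t in self.links for n in (s, t)}
        return set(expected) - linked

cmap = ConceptMap()
cmap.add_link("concept map", "is made of", "nodes")
cmap.add_link("links", "connect", "nodes")

print(cmap.propositions())
print(cmap.unlinked(["nodes", "links", "link phrases"]))
```

Reading the links back as sentences gives a rater something concrete to judge, though interpreting and scoring them still takes professional judgment.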

Are all learning outcomes measurable using objective paper and pencil tests? No! Non-cognitive outcomes (affective domain) are not.

How would you measure:
student engagement
work ethic
socially acceptable behavior
appreciation for...
social adjustment
character development?
state vs. trait (condition vs. chronic)
attitude, anxiety, interest, value, locus of control, academic self esteem / self concept

Our goal is to measure those things that are more stable
Don't waste time measuring things that are temporary

Triangulation

Observation
Purpose is to view student behavior in natural/typical setting as part of an assessment plan to measure learning outcomes that cannot be measured directly in another way.

Observation
structured vs. unstructured
- structured: looking for specific behaviors
- unstructured: seeing all, whatever there is to see

Unstructured observations are typically less useful as part of an assessment plan but can add insightful contextual information

Subjectivity vs. Professional Judgment
- subjectivity: bias, misinterpretation, partial or sketchy record
- professional judgment: planned, prolonged engagement, outcome-specific

Subjectivity is not a negative aspect of assessment if it is based on expert opinion
Assessment becomes sterile if we don't consider all of its aspects

Observational / Anecdotal Records
- determine what to observe (focus)
- important learning outcome

Multiple Observations (Prolonged engagement)
When Recording
- make the record ASAP
- limit record to brief single incidents
- keep factual and interpretive records separate
- record both positive and negative
- check for bias [see 319]

Peer appraisal
- useful when assessing personal-social development outcomes (e.g. leadership, concern, group work skills)
- Supplements teacher observations (guess-who technique)
