Manager-Employee Feedback and Development: Why is it so Hard?

Bracken ATD Feedback 2017

The current climate surrounding performance appraisals leans toward abandoning the administrative exercise we have all come to despise and replacing it with a feedback culture of continuous exchanges between manager and direct report. The solution is not new, so why has it not been implemented in more organizations?


Bracken & Timmreck (1999)

Bracken, D. W., & Timmreck, C. W. Guidelines for multisource feedback when used for decision making purposes. The Industrial/Organizational Psychologist, April 1999.

Guidelines for MultiSource Feedback
When Used for Decision Making

David W. Bracken
dwb assessments, inc.

Carol W. Timmreck
Shell Oil Company

These Guidelines are a product primarily of the MultiSource Feedback Forum membership, about half of which are SIOP members. Additional contributions were made by the following individuals: Allan Church, John Fleenor, Rod Freudenberg, Bob Jako, John Kasley, Vicki Pollman, Lynn Summers, and Alan Walker.

We now have a draft document which we would like to make available for further review and feedback. It is our intent to collect feedback via e-mail prior to the SIOP conference and to convene an informal meeting sometime during the conference for those who wish to further discuss this document. After we have collected this input, we plan to publish the document in some format (e.g., a pamphlet) for distribution later in the year. We wish to make it clear that this effort is not sponsored by SIOP nor are we seeking endorsement by SIOP.

Please send your comments to David Bracken at DWBRACKEN@aol.com by April 27 and at the same time indicate whether you would like to be notified of specifics of the ad hoc meeting during SIOP.

Introduction

Definition: For the purposes of these Guidelines, MultiSource Feedback (MSF) is questionnaire-based feedback to an individual regarding work-related behavior from coworkers (e.g., supervisor(s), subordinates, peers, team members, internal customers) and other individuals (e.g., external customers) who have had an opportunity to observe that behavior. Supervisor feedback is typically not anonymous since it is collected from one person, but feedback from other sources typically is. Participants (ratees) typically receive feedback results in the form of aggregated scores (e.g., mean scores), usually reported for each feedback source (peers, subordinates, etc.) and often including write-in comments.

Successful MSF Process Defined: For the purposes of these Guidelines, a “successful” MSF process is one which:

Creates and/or reinforces focused, sustained behavior change and/or skill development in a sufficient number of individuals so as to result in increased organizational effectiveness.

Note that “behavior change” and “skill development” can be in areas valued by the organization but not necessarily measured directly by the MSF instrument.

Purpose

These Guidelines are provided as recommended practices for implementing MSF for decision-making purposes in human resource systems (e.g., performance management, staffing, succession planning, compensation) which in turn will optimize the likelihood of success (as defined above). These Guidelines should be applied to MSF processes which are designed in anticipation of use for decision making even though initial administrations may not include that purpose. Application of these Guidelines to “development-only” MSF processes will also typically improve the likelihood of success.

MSF used for decision making typically has design considerations which are not found in many “development-only” processes. For example, MSF used for decision-making processes such as performance appraisal (and resulting outcomes) is conducted for whole segments of an employee population (e.g., supervisors, all exempt employees) on a schedule determined by organizational needs (e.g., annually). Such practices place significant demands on the system which most development-only processes do not.

A second point of differentiation of MSF used for decision making regards the issue of sustainability. (While MSF is occasionally used on a one-time basis for purposes such as downsizing, that practice is outside the scope of these Guidelines and may be harmful to other MSF processes.) Many decision making applications of MSF are repeated events such as those conducted annually. These processes are dynamic as participants (raters, ratees, management) have experiences which shape their behavior over time.

These Guidelines in total should not be used as “standards.” For example, it would be inappropriate to use the Guidelines to determine whether an MSF process is legally defensible. It is unlikely that any MSF process can simultaneously satisfy all of the objectives listed below. The Guidelines reflect practices and recommendations that have been shown to optimize the likelihood of achieving “success” (as defined above). The Guidelines should be used to guide decisions in the design of MSF processes with an acknowledgment of the ramifications of each decision.

In cases where a Guideline is of sufficient criticality as to be required, it is noted as [Essential]. Used at the beginning of a paragraph, it indicates criticality for all points that follow in that paragraph. When indicated after a sentence, it applies to only that point.

Guidelines

Objectives: These Guidelines are designed to support the following objectives. Objectives will be referenced in support of recommendations in the sections that follow.

Acceptance: The feedback should have characteristics that enhance acceptance by the ratees and their managers. Acceptance is a precursor to behavior change for ratees and to decision making for their managers.

Accuracy: The process should ensure that data are collected, processed and reported with no errors.

Actionability: The feedback (including write-ins) should be behaviorally based and within the ability of the ratee to address through behavior change and/or skill development.

Alignment: The content of the feedback and the process itself should be consistent with the organization’s strategies, goals, values, competencies and desired culture.

Anonymity: Guaranteeing and delivering real and perceived anonymity for the feedback providers (raters) is recommended as a means of maximizing honesty and candor. (Note that there are some legal opinions regarding the ability to guarantee anonymity in all circumstances.)

Census: The data collected should represent a census (vs. sample) of those persons who have the best opportunity to provide reliable feedback, consistent with rater selection policies (e.g., all direct reports, number of peers, etc.).

Clarity: Participants (raters and ratees) must fully understand their roles and how to correctly fulfill those expectations.

Communication: Communications regarding the purpose, methods and expected outcomes of the process form a “contract” with participants on behalf of the organization to fulfill the commitments made.

Confidentiality: A clear policy which states who may access and use feedback data is necessary. It is understood by all participants that data must be accessible to the data processor (internal or external) with clear requirements for data integrity and security. (A confidentiality policy which allows the ratee to keep the results to him/herself is not considered to be a requisite for MSF when used for decision making, and, in fact, may be a barrier to success.)

Consistency: All processes should be administered consistently for all participants (raters, ratees, management). Where procedures must differ, administrators must demonstrate that the inconsistencies do not have a systematic effect on the feedback results.

Cooperation: The task of providing feedback by raters should not be so onerous as to affect the quality (e.g., honesty) or quantity (e.g., response rates) of the observations.

Insight: The ratee should be provided with information of sufficient quality and specificity to ensure that resulting actions are aligned with and responsive to the observations of the feedback providers.

Ratee Accountability: Methods are used that maximize the likelihood that ratees will understand, accept and use their feedback in the manner intended by the organization.

Rater Accountability: Methods are used that maximize the likelihood that raters will fulfill the role of accurate, honest reporters of observed ratee behaviors, including providing assistance to the ratee in understanding the feedback, guiding action plans, and reinforcing desired behavior.

Relevance: The feedback should address behaviors/skills which occur within the work setting and are observable by others.

Reliability: The feedback instrument (questionnaire) should be designed to generate reliable, quantifiable data based on content design principles and supporting statistical documentation.

Timeliness: Methods should minimize problems caused by delays between observation and reporting, and between action and feedback by the ratee.

I. Preconditions

A. Commitment: A successful MSF process should gain the full support of the organization in the form of coordinators (e.g., training, workshops, administrators) and endorsement. This support includes participation by all levels of management, including the most senior executives. (Acceptance, Alignment)

B. Clarity of Purpose: [Essential] The purpose of the MSF process must be clearly and explicitly understood and communicated. For the purposes of these Guidelines, purpose must include a statement as to how the feedback is distributed, documented, and used. MSF processes used for decision-making purposes should clearly communicate and support methods to assist ratees in their development as well. (Communication, Consistency, Clarity)

C. Behavioral Model: [Essential] The content of the feedback instrument must be derived from a model (e.g., competencies, values, strategies). This model should be translated into behavioral terms. Models may differ based on level and/or job. There must be full management endorsement and acceptance at all levels as to the relevance and importance of the model(s). A model not developed specifically for the organization (i.e., “off-the-shelf”) must be reviewed for relevance and accompanied by a technical report documenting its measurement characteristics and (if relevant) the characteristics of any normative data provided. (Alignment, Reliability, Relevance, Acceptance, Actionability)

II. Instrument Development

A. Item construction: Items should be behaviorally based. [Essential] The feedback instrument should be designed (or reviewed) by survey professionals. (Reliability, Actionability, Relevance)

B. Content and forms: Items must be examined for opportunity to observe for various rater groups. [Essential] Where items are appropriate for one group but not another (e.g., external customers), separate forms containing only relevant content are recommended. (Reliability, Relevance)

C. Rating Scales: Rating scales should be designed to be consistent with the purpose of the feedback. When the purpose is decision making, the anchors and related training should encourage between-person (normative) comparisons (e.g., “In the top 5%”). The number of choices should allow for meaningful differentiation in performance between ratees and/or within ratees over time, usually 5 to 9 points. Scales should include an option for raters to indicate insufficient information to respond. [Essential] The use of multiple scales (e.g., importance, desired vs. observed behavior) should be carefully evaluated for their added value in light of rater overload, quality of information, and reporting complexity. (Reliability, Insight, Clarity, Cooperation)

D. Write-in Comments: Raters should be given the opportunity to provide additional feedback to ratees using write-in comments consistent with the purpose of the process. The process should not require write-in comments (e.g., to support extreme ratings). Raters should have a clear understanding as to how their comments are reported (e.g., verbatim vs. paraphrased) and receive training on how to write good comments consistent with the purpose of MSF. (Insight, Clarity, Cooperation, Actionability, Relevance, Alignment)

E. Pretesting: Prior to initial administration, the tools, policies, procedures, and communications should be pretested with representatives of the anticipated audience (i.e., raters, ratees, managers, administrators). While many methods are available to the practitioner for doing pretests (e.g., pilots, focus groups, interviews), the process(es) should allow for the collection of reactions, suggestions, and possible barriers to successful implementation. Pretests may also be used to collect initial normative data. The pretest should solicit input on:

  • Purpose

  • Policies and procedures

  • Perceptions of anonymity

  • Perceptions of confidentiality

  • Rater nomination

  • Rater honesty

  • Instructions

  • Instrument characteristics (e.g., clarity, observability, length)

  • Report format

  • Resources required for ratees (e.g., training, courses, coaching)

  • Appropriateness for use in decision making

  • Process communications (Supports all objectives)

F. Reliability: Data collected from a pilot and/or initial administration of the instrument must be analyzed to determine reliability. [Essential]

  • Rater: Indices of rater agreement should be analyzed both within and between rater perspective groups (e.g., subordinates, peers, customers). Insufficient within-group agreement may indicate the need to increase group sizes and/or use alternative groupings (e.g., peers vs. team members vs. internal customers). Where feasible, a test-retest reliability check is also desirable.

  • Interitem: When category (dimension) scores are used, the instrument should be analyzed to determine the cohesiveness of categories to justify the calculation of category scores. Appropriate analyses can include factor analysis and coefficient alpha. (Reliability)
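As an illustration of the interitem analysis named above, the following is a minimal Python sketch of coefficient alpha for a single category. The function name `coefficient_alpha` and the input layout (one row per completed questionnaire, one column per item in the category) are assumptions made for this example; the Guidelines do not prescribe a particular implementation or a cutoff value.

```python
from statistics import variance


def coefficient_alpha(responses):
    """Coefficient (Cronbach's) alpha for one category.

    responses: list of rows; each row holds one rater's scores on the
    items that make up the category (hypothetical layout, no missing data).
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two questionnaires")

    # Variance of each item across questionnaires.
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Variance of the summed category score across questionnaires.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("category totals do not vary; alpha is undefined")

    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Illustrative data only: four questionnaires, three items in the category.
sample = [[4, 5, 4], [2, 3, 3], [5, 5, 4], [3, 3, 2]]
alpha = coefficient_alpha(sample)
```

Higher values suggest that the items in a category vary together, which is the kind of evidence that would support reporting a category score. What counts as sufficient is a judgment for the survey professionals involved.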

G. Validation: Validity must be demonstrated. [Essential] The process of demonstrating validity will typically be an iterative process of collecting evidence over time.

  • Content Validity: Most instruments will be initially constructed to satisfy requirements for relevance and alignment, reflecting the organization’s strategies, goals, values, and/or competencies. Content should be reviewed repeatedly for relevance over time.

  • Criterion-Related Validity: As data are collected, the correlation of MSF results with other indicators of individual, group, and organization success should be examined (e.g., formal appraisals, sales, customer satisfaction/retention, promotions, turnover, organizational surveys). (Alignment, Acceptance, Reliability, Relevance)
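One simple way to examine the relationship described under criterion-related validity is a Pearson correlation between each ratee's MSF score and another indicator for the same ratee. The sketch below assumes paired lists in the same ratee order; the name `pearson_r` and the sample figures are hypothetical and serve only to show the calculation.

```python
from math import sqrt


def pearson_r(msf_scores, criterion_scores):
    """Pearson correlation between MSF scores and one other indicator.

    Both lists must be in the same ratee order (an assumption of this sketch).
    """
    if len(msf_scores) != len(criterion_scores) or len(msf_scores) < 2:
        raise ValueError("need paired values for at least two ratees")

    mean_x = sum(msf_scores) / len(msf_scores)
    mean_y = sum(criterion_scores) / len(criterion_scores)

    # Sum of cross-products and sums of squares around the means.
    cross = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(msf_scores, criterion_scores)
    )
    ss_x = sum((x - mean_x) ** 2 for x in msf_scores)
    ss_y = sum((y - mean_y) ** 2 for y in criterion_scores)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("one of the measures does not vary")

    return cross / sqrt(ss_x * ss_y)


# Illustrative data only: category means and appraisal ratings for five ratees.
msf_means = [3.2, 4.1, 3.8, 2.9, 4.5]
appraisal_ratings = [3, 4, 4, 2, 5]
r = pearson_r(msf_means, appraisal_ratings)
```

A single coefficient from a small group is weak evidence on its own, which is consistent with the point above that validity evidence accumulates over repeated administrations.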

III. Administration

A. Rater Nomination: The selection of raters is a key factor in the reliability, validity, and acceptance of the feedback results. Policies and procedures for selecting raters must be clearly communicated and applied consistently across the organization (see the following guidelines). [Essential]

Opportunity to observe is a key factor in deciding rater groups and the raters to be selected within a perspective group.

Rater groups that cannot be trained or monitored or have an insufficient opportunity to observe (e.g., external customers) may not be appropriate feedback providers.

Opportunity to observe will consider not only working relationships but also length of time; a requirement for minimum time for the work relationship (considering both time and amount of contact during that time) should be specified to ensure sufficient opportunity to observe behavior.

All direct administrative reports (where applicable) should be included as raters.

For other rater groups, enough raters should be selected to enhance the reliability of the feedback. This will typically suggest nominating at least the 4 to 6 raters per category who have had the best opportunity to observe ratee performance.

For nominations not determined by policy (e.g., all direct reports), the ratee will be the primary source in selecting raters. The nominations must have the concurrence of the ratee’s supervisor. [Essential]

The nomination process will include a method to identify cases where a rater is nominated an excessive number of times, potentially impacting the quality of the feedback. A policy should specify ways to handle these situations. (Reliability, Acceptance, Clarity, Cooperation, Census, Consistency)
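The check for excessive nominations could be as simple as counting how many ratees have nominated each rater. The following Python sketch assumes nominations are held as a mapping from ratee to nominated raters; the function name `overloaded_raters` and the limit passed to it are hypothetical, since the Guidelines leave the definition of "excessive" to policy.

```python
from collections import Counter


def overloaded_raters(nominations, max_forms):
    """Return raters nominated by more than max_forms ratees.

    nominations: dict mapping each ratee to the raters they nominated.
    max_forms: policy limit on questionnaires per rater (hypothetical value).
    """
    counts = Counter(
        rater
        for raters in nominations.values()
        for rater in set(raters)  # count each rater once per ratee
    )
    return {rater: n for rater, n in counts.items() if n > max_forms}


# Illustrative data only.
nominations = {
    "ratee_a": ["rater_1", "rater_2"],
    "ratee_b": ["rater_1", "rater_3"],
    "ratee_c": ["rater_1"],
}
flagged = overloaded_raters(nominations, max_forms=2)
```

What happens to a flagged rater (for example, asking some ratees to nominate someone else) is a policy matter and is deliberately left out of the sketch.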

B. Rater Training: Once nominated to provide feedback, raters should be trained as to how to perform their role. Training is necessary primarily for first time participants. Possible topics can include:

  • Purpose of the MSF process

  • How the feedback will be used

  • How raters were selected

  • How to complete the rating form

  • How to be a good observer/rater

  • How missing data will be defined and reported

  • How to write a good comment

  • How to avoid typical rating errors

  • How the feedback data will be processed

  • How the feedback will be reported to the ratees

  • How write-in comments will be reported

  • How to avoid invalid rating patterns

  • How to properly fulfill the role of rater

  • Expectations for the ratees

  • Timeline and next steps

While rater training can be delivered effectively through various media, methods which use face-to-face delivery are preferred. Providing written instructions alone does not suffice as rater training. (Clarity, Reliability, Consistency, Anonymity, Rater Accountability, Acceptance, Confidentiality, Communication, Actionability, Relevance)

C. Technology: Many technologies exist for the administration and data collection of MSF. The best technology will be partially dictated by the nature of the feedback instrument (i.e., length, branching, open-ended questions and comments) and organizational culture. Other issues to consider include:

Perceptions of anonymity for the raters can affect the honesty of feedback. In certain climates, technologies that are not perceived to guarantee anonymity (e.g., internally processed) may result in feedback with low reliability due to reduced honesty/candor and lower response rates. (Anonymity, Reliability, Cooperation)

Logistics, geographies and resources may require the use of multiple technologies. If this is necessary, the feedback should be systematically examined to detect any possible biases introduced by a technology (e.g., lower/higher scores, lower response rates, incomplete questionnaires, errors in responding). Some climates may show resistance to the use of certain technologies. (Consistency, Cooperation, Accuracy, Reliability, Timeliness)

Any technology must protect the data from access by unauthorized parties. [Essential] (Accuracy, Anonymity, Confidentiality)

D. Timing: The timing and frequency of administration may be dictated by the systems that require MSF data (e.g., performance appraisal, succession planning).

Annual administrations that are integrated with other HR systems help establish accountability for the use of MSF results in ways that are aligned with organization objectives and therefore are recommended. (Alignment, Timeliness, Ratee Accountability, Acceptance)

Long time intervals (e.g., 18 months, 2 years) between administrations can lead to problems of timeliness in regard to a) the time lapse between rater observation and reporting, b) the delay between behavior and feedback for the ratee, and c) the delay in receiving feedback (and reinforcement) for actual behavior change on the part of the ratee. (Timeliness, Reliability, Actionability)

Annual census (one time) administrations can place significant strain on the organization with possible negative effects on some objectives. Creative solutions should be explored (e.g., technologies, formats displaying multiple ratees). (Cooperation, Census, Clarity)

Staggering administrations throughout the year may create both real and perceived inconsistencies, affecting acceptance by ratees and perceptions of fairness for all participants. (Consistency, Acceptance)

IV. Data Processing

A. The role of the data processor is primarily to ensure total accuracy along with maintaining anonymity and confidentiality consistent with policy and communications. Other important considerations will be timeliness, cost effectiveness, and customer service. Data processing must be carefully tested and monitored to ensure 100% accuracy. [Essential] (Accuracy)

B. [Essential] The data (questionnaires and reports) must be totally secure from access by unauthorized personnel. In addition, policies and procedures must clearly state who may see the reported results and under what circumstances. Such policies and procedures should be clearly communicated and agreed to by management to prevent possible abuses. (Confidentiality, Anonymity, Consistency, Communication)

C. Data should be maintained according to policy and legal requirements consistent with those applied to other employee performance data (e.g., performance appraisals). (Consistency, Communication, Alignment)

D. Note that there are some legal opinions regarding the ability to “guarantee” anonymity in all circumstances. Communications regarding legal anonymity should incorporate local legal guidance. (Consistency, Communication)

V. Reporting

A. Report Generation:

[Essential] Reliability and anonymity both require the specification of minimum group size to report a score (item and category). This is never less than three (3) (except for self-scores, supervisor scores, and any other agreed-upon one-on-one relationship) and can be greater. (Reliability, Anonymity)

At minimum, reports should provide for each category and item an aggregate (e.g., mean) score, the number responding, and some indication of rater agreement (if score distributions are not provided). When available, trend scores (i.e., prior results) should also be included in the report. (Insight, Acceptance)

Internal normative comparisons (e.g., percentiles, comparison group scores) should be provided in the report, with care taken to ensure that the normative data are relevant, accurate, and up-to-date. If off-the-shelf instruments are used, internal norms should be generated. (Insight, Acceptance, Accuracy)

Write-in comments should be reported verbatim by rater group. (Insight, Acceptance, Alignment)
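The minimum group size and the aggregate statistics described in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not a prescribed report design: the names `summarize_item` and `ONE_ON_ONE_GROUPS` are hypothetical, `None` stands for a rater choosing the "insufficient information" option, and the standard deviation is used here as one possible indication of rater agreement.

```python
from statistics import mean, pstdev

MIN_GROUP_SIZE = 3  # the Guidelines state this is never less than three
ONE_ON_ONE_GROUPS = {"self", "supervisor"}  # exempt from the minimum


def summarize_item(ratings_by_group, min_group_size=MIN_GROUP_SIZE):
    """Aggregate one item's ratings per rater group.

    Groups below the minimum size are suppressed (value None) so that
    individual raters cannot be identified.
    """
    if min_group_size < 3:
        raise ValueError("minimum group size must be at least three")

    summary = {}
    for group, ratings in ratings_by_group.items():
        scores = [r for r in ratings if r is not None]  # drop "cannot rate"
        required = 1 if group in ONE_ON_ONE_GROUPS else min_group_size
        if len(scores) >= required:
            summary[group] = {
                "n": len(scores),
                "mean": mean(scores),
                "sd": pstdev(scores),  # spread as a rough agreement indicator
            }
        else:
            summary[group] = None  # suppressed: too few responses
    return summary


# Illustrative data only.
item_ratings = {
    "self": [4],
    "supervisor": [3],
    "peers": [4, 5, 3, None],
    "direct_reports": [5, 2],
}
report_rows = summarize_item(item_ratings)
```

In this example the direct report scores are withheld because only two raters responded, while the peer group is reported with three usable responses.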

B. Rater Reliability Checks:

Any data “cleansing” performed after processing should be clearly communicated to participants prior to administration. Possible useful methods can include identification of invalid rating patterns suggesting that a rater is not fulfilling his/her role as a quality feedback provider. Processes that arbitrarily remove data (e.g., “Olympic scoring”), resulting in reduced group sizes, are not appropriate. (Reliability, Census, Consistency, Communication, Rater Accountability, Acceptance)

In cases where a rater is found to be a provider of invalid feedback (e.g., ratings all of the same score), a policy should be followed for ways to handle these circumstances. Options can include automatic discarding of questionnaires with invalid ratings, or providing raters with the opportunity to modify their responses. (Reliability, Consistency, Clarity, Communication, Rater Accountability)

Online administration can be used to provide “real time” feedback to raters regarding their response patterns. (Reliability, Rater Accountability)
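The example of an invalid pattern given above (ratings all of the same score) can be detected with a very small check. The sketch below is a hypothetical illustration: the names `is_uniform_response` and `flag_invalid`, and the `min_items` threshold that avoids flagging very short questionnaires, are assumptions and not part of these Guidelines. `None` again stands for an item the rater marked as insufficient information.

```python
def is_uniform_response(ratings, min_items=5):
    """True when every answered item received the same score.

    min_items is a hypothetical threshold so that a handful of identical
    answers on a short form is not treated as an invalid pattern.
    """
    answered = [r for r in ratings if r is not None]
    return len(answered) >= min_items and len(set(answered)) == 1


def flag_invalid(questionnaires, min_items=5):
    """Return the ids of questionnaires showing a uniform rating pattern."""
    return [
        form_id
        for form_id, ratings in questionnaires.items()
        if is_uniform_response(ratings, min_items)
    ]


# Illustrative data only.
questionnaires = {
    "form_001": [3, 3, 3, 3, 3, 3, 3, 3],
    "form_002": [4, 5, 3, 4, None, 2, 4, 5],
    "form_003": [5, 5, None, None, None, None, None, None],
}
to_review = flag_invalid(questionnaires)
```

Whether a flagged questionnaire is discarded or returned to the rater for revision is the policy decision described above. In an online administration the same check could run before submission.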

C. Feedback reports are typically provided for each ratee. Copies may be provided to other sources (e.g., manager, HR) depending on policy, with clear communication of this policy to ratees. (Insight, Ratee Accountability, Communication, Consistency)

VI. Follow Through

A. Ratee Training: Ratees should be trained on how to read, interpret, and use their feedback. Best done in a workshop setting, ratee training could include:

  • How feedback can be used for behavior change

  • How to read a feedback report

  • How to identify priority behaviors for improvement

  • How to create an action plan

  • How to identify and access development resources

  • How to conduct a meeting with raters

  • How to conduct a meeting between ratee and manager

  • How the data will/should be used

  • Expectations for ratees

Other resources can be used to support ratee training, such as written guides, coaches, counselors, mentors, and help centers (online, telephone). (Ratee Accountability, Acceptance, Communication, Clarity, Alignment, Consistency, Actionability)

B. Using the results: The way the feedback is used will ultimately determine the success and sustainability of the MSF process. Two events are key to successful implementation and sustained engagement of feedback providers:

In a decision-making context, the ratee is required to share results with his/her manager. (Note that other individuals in the company may be given access to individual results by policy.) This sharing process will provide the manager with information necessary to fulfill his/her role as a representative of the company. A meeting to discuss results will facilitate the implementation of an action plan for the ratee. [Essential] (Ratee Accountability, Alignment, Acceptance, Insight, Consistency, Confidentiality)

Decision-making contexts typically include repeated administrations (e.g., yearly). Raters will continue to participate and provide honest feedback only to the extent that they see their effort rewarded through the resulting actions of the ratees. A key event to support this engagement is for the ratee to share results and action plans with the raters, particularly direct reports. Sharing results has additional benefits: allowing the ratee to gain further insight into the meaning of the feedback, facilitating rater conversation that enhances their understanding and workgroup alignment, and creating an ongoing dialogue with the raters throughout the year. Sharing results with raters, particularly direct reports, should be a clear expectation for ratees, with significant flexibility as to how results are presented. (Ratee Accountability, Rater Accountability, Insight, Alignment, Consistency, Acceptance)

Additional considerations:

Ratees must be provided with resources which will enable them to address the gaps (between desired behavior and actual behavior) identified in their feedback. [Essential] Resources might include internal and external training, job experiences, special assignments, community activities, and various media sources. Coaches can be very effective in aiding both data interpretation and action planning. It is important that such resources are not only available but easily accessible to those who desire them. (Acceptance, Actionability, Ratee Accountability)

The MSF process should integrate ongoing support between administrations, such as interim progress reviews, mini-feedback tools, communications, ongoing training, and mentor relationships. (Timeliness, Alignment, Communication)

VII. Integrating Results into Decision Making

Once steps have been taken to ensure that the feedback data are reliable and valid, it becomes equally critical to ensure that the data are used appropriately, accurately, and consistently.

Managers given access to MSF feedback for use in decision making should be trained on how to read, interpret, and use it.

Formulaic approaches that use mathematical calculations based on MSF scores as the sole determinants of decisions are not appropriate.

Policies and practices should be clearly defined and communicated regarding the use (and misuse) of MSF. [Essential] Violations of these policies should be monitored and remedied.

Processes that use MSF results must be scrutinized to ensure that results are not disclosed in a way that violates confidentiality policies. [Essential] (Consistency, Alignment, Clarity, Communication, Acceptance, Confidentiality)

VIII. Evaluation

Methods should be used to determine whether the MSF process is being implemented as prescribed and is having the desired results. Methods available to the user include:

  • Focus groups

  • Interviews

  • Audits

  • Surveys

  • Utilization of organization resources to address individual development

  • Process data (e.g., response rates, score trends)

  • Statistical analyses (e.g., rating patterns, adverse impact)

  • Related organization outcomes (Supports all objectives)
