This section is structured as follows:
What does this mean?
A judgement criterion specifies an aspect of the evaluated intervention that will allow its merits or success to be assessed. Whilst "judgement criterion" is the appropriate term, an acceptable alternative is "reasoned assessment criterion". The criterion is used to answer an evaluation question, and one or more judgement criteria are derived from each question.
What is the purpose?
How can a judgement criterion be clarified on the basis of a question?
All the evaluation questions relate to one or more judgement criteria, unless they are designed only to further knowledge or understanding about the intervention or its effects. The following is an example of a question:
Like most evaluative questions, it has two parts:
The judgement criteria develop and specify the second part of the question, for example:
The judgement criteria derive from the question, for instance in the case of the first criterion:
To be used in practice, each judgement criterion has to be accompanied by a target level and one or more indicator(s).
Be careful not to confuse concepts
On this site, the word criterion is used for three different concepts:
According to the EC, the value added of an evaluation is the formulation of value judgements on the basis of evidence and explicit judgement criteria. When dealing with organisations which are not familiar with evaluation, it may be wise not to use the word "judgement", which may induce resistance. An acceptable alternative is "assessment", or preferably "reasoned assessment".
What does this mean?
The concept of a 'target' is widely used in the context of public management for setting a verifiable objective or a level of performance to be achieved. In an evaluation context it is used in a much wider sense, since the evaluated intervention may have to be judged against targets that were not set in advance but are identified specifically for the evaluation, such as a benchmark, a success threshold or a comparable good practice.
What is the purpose?
How can they be determined?
- By reference to an objective defined in a verifiable way
The target may appear in one of the intervention's objectives, provided these have been established in a verifiable way. In this particular case, the same indicator helps to define the objective, to make the judgement criterion operational and to determine the target.
- In relation to comparable good practices outside the intervention
In this case, the target is established at the outset of the evaluation. It is not related to an objective or a performance framework existing prior to the evaluation.
The procedure is as follows:
- Compared to best practices identified within the intervention
The target can be found within the evaluated intervention itself during the synthesis phase, provided that specific practices can be considered good with regard to the judgement criteria under consideration.
In this case, the good practices serve as benchmarks against which to judge the others. Of course, it is advisable to check that the contextual conditions are close enough to allow for comparison.
When should they be determined?
- Earlier or later in the evaluation process
If the target is derived from a verifiable objective or a performance framework, then it can be determined at the very first stage of the evaluation process.
If the target is derived from an outside benchmark, then it should be identified during the early stages of the evaluation. However, the process may involve the gathering of secondary data with a view to specifying the benchmark, as well as a careful examination of comparability. This means that the target will not be completely defined in the first phase of the evaluation.
If the target is to be derived from the best practices discovered within the intervention by the evaluation team, it will be determined in the synthesis phase.
- After choosing the judgement criterion
Determining the target takes place in a three-step process:
Evaluation targets and others
When the evaluation question pertains to an intended result or impact, the target level is usually derived from a verifiable objective or borrowed from a performance assessment framework.
Performance monitoring may, however, be of little or no help in the case of evaluation questions relating to cross-cutting issues, sustainability factors, unintended effects, evolving needs and problems, coherence, etc.
What does this mean?
The evaluation team may use any kind of reliable data to assess whether an intervention has been successful or not in relation to a judgement criterion and a target.
Data may be collected in a structured way by using indicators. Indicators specify precisely which data are to be collected. An indicator may be quantitative or qualitative. In the latter case the scoring technique may be used.
Unstructured data are also collected during the evaluation, either incidentally, or because tools such as case studies are used. This kind of evidence may be sound enough to be a basis for conclusions, but it is not an indicator.
What is the purpose?
The main evaluation indicators are those related to judgement criteria; they specify the data needed to make a judgement based on those criteria.
An indicator can be constructed specifically for an evaluation (ad hoc indicator) and measured during a survey, for example. It may also be drawn from monitoring databases, a performance assessment framework, or statistical sources.
A qualitative indicator (or descriptor) takes the form of a statement that has to be verified during the data collection (e.g. parents' opinion is that their children have the possibility of attending a primary school class with a qualified and experienced teacher).
A quantitative indicator is based on a counting process (e.g. number of qualified and experienced teachers). The basic indicator directly results from the counting process. It may be used for computing more elaborate indicators (ratios, rates) such as cost per pupil or number of qualified and experienced teachers per 1,000 children of primary-school age.
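As an illustration, basic counting indicators can be combined into the more elaborate ratios and rates mentioned above. The figures below are purely hypothetical and serve only to show the arithmetic:

```python
# Hypothetical figures for illustration only: deriving elaborate
# indicators (ratios, rates) from basic counting indicators.
qualified_teachers = 240          # basic indicator: counting process
children_primary_age = 60_000     # basic indicator: counting process
total_cost = 3_000_000            # basic indicator: spending
pupils_enrolled = 48_000          # basic indicator: counting process

# Elaborate indicators computed from the basic ones
teachers_per_1000_children = qualified_teachers / children_primary_age * 1000
cost_per_pupil = total_cost / pupils_enrolled

print(f"Qualified teachers per 1,000 children: {teachers_per_1000_children:.1f}")
print(f"Cost per pupil: {cost_per_pupil:.2f}")
```

With these invented figures the derived indicators come out at 4.0 teachers per 1,000 children and a cost of 62.50 per pupil; in a real evaluation the basic counts would come from monitoring systems or surveys.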
Indicators may belong to different categories: inputs, outputs, results or impacts.
Evaluation indicators and others
When an evaluation question pertains to an intended result or impact, it is worth checking whether this result or impact has been subject to performance monitoring. In such cases, the evaluation team uses the corresponding indicators and data, which is a considerable help, especially if baseline data have been recorded.
Performance monitoring may, however, be of little or no help in the case of evaluation questions relating to cross-cutting issues, sustainability factors, unintended effects, evolving needs or problems, coherence, etc.
Quality of an indicator
An indicator measures or qualifies with precision the judgement criterion or variable under observation (construct validity). If necessary, several less precise indicators (proxies) may be used to enhance validity.
It provides straightforward information that is easy to communicate and is understood in the same way by the information supplier and the user.
It is precise, that is, associated with a definition containing no ambiguity.
It is sensitive, that is, it generates data which vary significantly when a change appears in what is being observed.
Performance indicators and targets are often expected to be SMART, i.e. Specific, Measurable, Attainable, Realistic and Timely. The quality of an evaluation indicator is assessed differently.
Indicators and effects: a warning!
The indicator used to evaluate an effect is not in itself a measurement or evidence of that effect. The indicator only informs on changes, which may either result from the intervention (effect) or from other causes.
The evaluation team always has to analyse or interpret the indicator in order to assess the effect.
Categories of indicators
- Indicators and the intervention cycle
Indicators are used throughout the intervention cycle: first to analyse the context, then for the choice and validation of the intervention strategy, afterwards for monitoring outputs and results, and finally for the evaluation.
Indicators and intervention design
Context indicators may be used to support the identification of the needs, problems and challenges which justify the intervention.
As far as possible, objectives and targets are defined in a measurable way by using indicators.
Indicators, monitoring and performance assessment
Monitoring systems and performance assessment frameworks also use indicators which derive from the diagram of expected effects (also called results chain).
Monitoring indicators primarily relate to inputs and outputs, while performance indicators primarily focus on intended results and impacts. The EC's Result Oriented Monitoring (ROM) relies less on indicators: it delivers systematic assessments of external aid projects in the form of ratings with regard to intended results and impacts.
Indicators and evaluation
Evaluation indicators are used to help answer specific evaluation questions. Depending on the question, they may relate to the needs, problems and challenges which justified the intervention, to the achievement of intended outputs, results and impacts, or to anything else.
- Global and specific indicators
Global or contextual indicators apply to an entire territory, population or group, without any distinction between those who have been reached by the intervention and those who have not. They are mainly taken from statistical data. This site offers help in looking for contextual indicators.
Specific indicators concern only a group or territory that has actually been reached. With specific indicators, changes among those affected by the intervention can be monitored. Most of these indicators are produced through surveys and management databases.
- Indicators and intervention logic
Input indicators provide information on financial, human, material, organisational or regulatory resources mobilised during the implementation of the intervention. Most input indicators are quantified on a regular basis by the management and monitoring systems (provided that they are operational).
Output indicators provide information on the operators' activity, especially on the products and services that they deliver and for which they are responsible. To put it simply, one could say that outputs correspond to what is bought with public money.
Result indicators provide information on the immediate effects of the intervention for its direct addressees. An effect is immediate if the operator notices it easily while he/she is in contact with an addressee. Because they are easily recognised by the operators, direct result indicators can be quantified exhaustively by the monitoring system.
Impact indicators provide information on the long-term direct and indirect consequences of the intervention.
A first category concerns the consequences that appear or last in the medium or long term for the direct beneficiaries.
A second category of impacts concerns people or actors that are not direct beneficiaries.
Impact indicators cannot be produced in general from management information. They require statistical data or surveys specially conducted during the evaluation process.
Indicators derived from scoring
What does this mean?
Scoring (or rating) produces figures that synthesise a set of qualitative data and/or opinions. Scoring is guided by a scoring grid (or scorecard) with varying degrees of detail.
From an evaluation point of view, both terms, scoring and rating, can be used.
What is the point?
Scoring allows the production of structured and comparable data on judgement criteria that do not lend themselves to a measurement using quantitative indicators.
How to construct a scoring grid
How to use the scoring grid
Scoring grids usually apply to projects or components of the intervention and allow for comparing these.
The evaluation team puts together all the data it has on the project or intervention to be assessed. It then chooses the level (or descriptor) in the scoring grid that corresponds best (or the least badly) to this information. The score results from this choice.
The more detailed the scoring grid, the less subjective the score will be, and the more comparable the scores allocated by two different evaluators.
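The mechanics of using a scoring grid can be sketched as follows. The grid levels and descriptors here are invented for illustration and do not come from an actual evaluation:

```python
# A minimal sketch of a scoring grid (all descriptors hypothetical).
# Each level pairs a score with a descriptor; the evaluation team
# chooses the descriptor that best fits the available evidence.
scoring_grid = {
    4: "All pupils of minority X have access to a qualified teacher",
    3: "Most pupils of minority X have access to a qualified teacher",
    2: "Some pupils of minority X have access to a qualified teacher",
    1: "Few or no pupils of minority X have access to a qualified teacher",
}

def score_project(chosen_descriptor: str) -> int:
    """Return the score whose descriptor matches the evaluator's choice."""
    for score, descriptor in scoring_grid.items():
        if descriptor == chosen_descriptor:
            return score
    raise ValueError("Descriptor not found in the scoring grid")

# The evaluator judged that the "Most pupils..." level fits best.
result = score_project(
    "Most pupils of minority X have access to a qualified teacher"
)
print(result)  # → 3
```

The same grid applied by different evaluators to different projects yields comparable scores, which is precisely what makes scoring useful for multi-project syntheses.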
From questions to judgement criteria
The judgement criterion (also called reasoned assessment criterion) specifies an aspect of the evaluated intervention that will allow its merits or worth to be assessed in order to answer the evaluation question. For instance:
The judgement criterion gives a clear indication of what is positive or negative, for example: "enhancing the expected effects" is preferable to "taking potential effects into account".
A more precise judgement criterion than the question
The question is drafted in a non-technical way with wording that is easily understood by all, even if it lacks precision.
The judgement criterion focuses the question on the most essential points for the judgement.
Yet the judgement criterion does not need to be totally precise. In the first example the term "satisfactory quality" can be specified elsewhere (at the indicator stage).
Not too many criteria
It is often possible to define many judgement criteria for the same question, but this would complicate the data collection and make the answer less clear.
In the example below, the question is treated with three judgement criteria (multicriteria approach):
A judgement criterion corresponding to the question
The judgement criterion should not betray the question. In the following example, two judgement criteria are considered for answering the same question:
The first judgement criterion is faithful to the question, while the second is less so, insofar as it concerns success in primary education whereas the question concerns only access to it. The question may have been badly worded, in which case it can be amended if there is still time.
Also specify the scope of the question
Most questions have a scope (what is judged) and a judgement criterion (the way of judging). In addition to the judgement criterion, it is therefore often desirable to specify the scope of the question, for example: "European aid granted over the past X years", "design of programme X", "the principle of decentralisation adopted to implement action X".
Also specify the type of cause-and-effect analysis
Some questions imply a cause-and-effect analysis prior to the judgement. It may therefore also be useful to specify the type of analysis required by means of terms such as "has European aid led to", "has it contributed to", "is it likely to".
From judgement criteria to indicators
An indicator describes in detail the information required to answer the question according to the judgement criterion chosen, for example:
Not too many indicators
It is possible to define many indicators for the same judgement criterion. Relying upon several indicators allows for cross-checking and strengthens the evidence base on which the question is answered. However, an excessive number of indicators involves a heavy data collection workload without necessarily improving the soundness of the answer to the question.
In the examples below three indicators are applied to a judgement criterion ("capacity of the primary school system to enrol pupils from ethnic minority X with satisfactory quality"):
Indicator corresponding to the judgement criterion
The indicator should not betray the judgement criterion. Two indicators are considered below:
The first indicator corresponds faithfully insofar as it describes an essential aspect of the judgement criterion. The second indicator is less faithful because it fails to reflect the concept of "satisfactory quality": its construct validity is poor.
An indicator must be defined without any ambiguity and understood in the same way by all the members of the evaluation team. For instance, in the above examples it is necessary to specify what a "qualified and experienced teacher" is. This can be done with reference to an existing definition, or else a definition can be formulated and refined until no ambiguity remains.
Indicators independent from the observation field
The same indicator should be able to serve to collect data in several contexts, for example:
In this case the same indicator is applied in both types of area and serves as a comparison, on the basis of which a judgement is formulated.
Quantitative and qualitative indicators
The following two examples present an alternative between a quantitative indicator and a qualitative indicator for treating the same judgement criterion:
An indicator is preferably associated with a target
The target indicates which comparison should be made in order to answer the question, for example: "In the areas where ethnic minority X concentrates, the indicator is at least as good as the national average".
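Such a target reduces to a simple comparison once the indicator has been measured in both observation fields. The enrolment rates below are hypothetical and serve only to illustrate the check:

```python
# Hypothetical enrolment rates, for illustration only.
rate_minority_areas = 0.87   # indicator where ethnic minority X concentrates
rate_national = 0.85         # same indicator for the entire country

# Target: the indicator in minority areas is at least as good
# as the national average.
target_met = rate_minority_areas >= rate_national
print("Target met" if target_met else "Target not met")
```

The judgement itself remains the evaluation team's task: the comparison only shows whether the target level is reached, not why.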
The target and the indicator are often specified interactively in successive steps. It is important not to digress from the judgement criterion during this process.
When the evaluation question pertains to an intended result or impact, the target is usually derived from a verifiable objective or borrowed from a performance assessment framework.
The indicator makes it possible to focus and structure data collection, but it serves no purpose if no data exist. To ensure the feasibility of an indicator, it is necessary to indicate the source of the information to be used, for example:
If no source is available or feasible, the indicator should be changed. If no feasible indicator can be found, dropping the question should be considered.
Example of a country evaluation
The judgement criterion is derived from the question in the following way:
The indicator is derived from the judgement criterion in the following way:
Example of a sector evaluation
The question refers to a family of evaluation criteria: effectiveness.
The judgement criterion is derived from the question in the following way:
The indicator derives from the judgement criterion in the following way: