Developing measurement instruments
Developing a new measurement instrument that fits your research question
- Clearly define the construct of the measurement instrument;
- Conduct a qualitative study to determine the relevance and comprehensiveness of the content of the measurement instrument (concept elicitation phase);
- Conduct a pilot study to assess the relevance, comprehensiveness and comprehensibility of the measurement instrument;
- Determine other measurement properties of the measurement instrument.
- Clear description of the construct and conceptual model
- Rationale for each item and its response options
- Results of pilot testing
- Final version of the measurement instrument
- Decide whether it is necessary to develop a new measurement instrument (i.e. based on a systematic review of the literature);
- Inspect the defined construct and its conceptual model by reading and adding comments;
- Evaluate each step in the development of the instrument with the executing researcher;
- Ensure that pilot tests are executed;
- Ensure that the measurement instrument is carefully evaluated.
- Define a clear construct and conceptual model;
- Choose an appropriate type of measurement instrument;
- Search for, select and formulate clear items, in close collaboration with patients;
- Choose the scoring system carefully, taking the conceptual framework into account (i.e. formative or reflective model);
- Perform pilot tests in a small group of the selected population and make sure to adapt the instrument wherever needed;
- Evaluate the instrument according to the ‘Evaluating instruments’ guidelines.
Research assistant: N.a.
When no instrument is available that measures the construct of interest, you may decide to develop a measurement instrument yourself. To do so, perform the following steps:
Step 1: Definition and elaboration of the construct intended to be measured
The first step in instrument development is conceptualization, which involves defining the construct and the variables to be measured. Use the International Classification of Functioning, Disability and Health (ICF) (WHO, 2011) or the model by Wilson and Cleary (1995) as a framework for your conceptual model. When the construct is not directly observable (a latent variable), the best choice is to develop a multi-item instrument (De Vet et al. 2011). When the observable items are consequences of (i.e. reflect) the construct, this is called a reflective model. When the observable items are determinants of the construct, this is called a formative model. When you are interested in a multidimensional construct, describe each dimension and its relation to the other dimensions.
Step 2: Choice of the type of measurement instrument (e.g. questionnaire/physical test)
Some constructs are inseparably tied to a particular measurement instrument, e.g. body temperature is measured with a thermometer, and a sphygmomanometer is usually used to assess blood pressure in clinical practice. The options are therefore limited in these cases, but in other situations more options exist. For example, physical functioning can be measured with a performance test, observations, an interview or a self-report questionnaire. A performance test for physical functioning provides information about what a person can do, while an interview or self-report questionnaire provides information about what a person perceives he/she can do. Consider what best fits your construct and research question, and describe the rationale for the chosen method.
Step 3: Qualitative research: selecting and formulating items
To get input for the content of the instrument and the formulation of the items of a multi-item questionnaire, you could examine existing instruments from the literature that measure a similar construct (e.g. in a different target population), and involve patients and other experts using qualitative techniques such as in-depth interviews and focus groups. In addition, you should pay careful attention to the formulation of response options and instructions, and to choosing an appropriate recall period (Van den Brink & Mellenbergh, 1998). Be aware that in the case of a formative model (i.e. items form/determine the construct), it is important to be complete here: include all determinants in order to adequately measure the intended construct.
Step 4: Scoring issues
The scoring algorithm of a multi-item instrument that is based on a reflective model can be determined by means of (confirmatory) factor analysis. For multi-item instruments based on a formative model, the scoring algorithm could be based on common sense or on predictive modelling techniques. Many multi-item questionnaires contain items with 5-point response scales, which are ordinal scales. The total score of the instrument is often considered to be an interval scale, which makes the instrument suitable for a wider range of statistical analyses. Several questions are important to answer:
How will you deal with missing values? When you are evaluating the measurement properties of an instrument, you should not impute data but only describe the amount of missing data. When you are using the instrument, you can consider multiple imputation (Eekhout et al., 2014).
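As an illustration of describing the amount of missing data rather than imputing it, the following is a minimal Python sketch. The data, the 1-5 scoring and the function names are hypothetical and serve only as an example; they are not part of any prescribed procedure.

```python
from typing import List, Optional

# Hypothetical data: one row per respondent, one column per item.
# None marks a missing answer; items are scored 1-5 (assumption).
responses: List[List[Optional[int]]] = [
    [4, 5, None, 3],
    [2, None, None, 1],
    [5, 4, 3, 4],
    [3, 3, 2, None],
]


def missing_per_item(rows: List[List[Optional[int]]]) -> List[float]:
    """Fraction of respondents with a missing answer, for each item."""
    n_items = len(rows[0])
    return [sum(row[i] is None for row in rows) / len(rows) for i in range(n_items)]


def missing_per_respondent(rows: List[List[Optional[int]]]) -> List[int]:
    """Number of missing answers for each respondent."""
    return [sum(value is None for value in row) for row in rows]


print(missing_per_item(responses))        # [0.0, 0.25, 0.5, 0.25]
print(missing_per_respondent(responses))  # [1, 2, 0, 1]
```

Items with a relatively high fraction of missing answers (the third item in this made-up example) may deserve a closer look, for instance at their wording or relevance.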
Step 5: Pilot study
Be aware that the first version of the instrument you develop will (probably) not be the final version. It is sensible to (regularly) test your instrument in small groups of people. A pilot test is intended to test the relevance, comprehensiveness, comprehensibility, acceptability and feasibility of your measurement instrument. You can do this by presenting the instrument to people from the target population and asking them to rate the relevance and clarity of the items (on a scale from 1 to 4) and to indicate whether important items are missing. Alternatively, you can interview them about the clarity and relevance of the items while they respond to the instrument (a so-called ‘think aloud’ interview) and subsequently ask whether important items are missing. Make sure that you adequately process the comments and input from the pilot study in order to improve the measurement instrument.
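One possible way to summarize ratings from a pilot test is sketched below in Python. The ratings are made up, and both the cutoff (a rating of 3 or higher counts as "relevant") and the minimum proportion of 0.75 are hypothetical choices for illustration, not recommended thresholds.

```python
from typing import Dict, List

# Hypothetical pilot data: relevance ratings (1-4) per item, one per respondent.
relevance_ratings: Dict[str, List[int]] = {
    "item_1": [4, 4, 3, 4, 3],
    "item_2": [2, 3, 1, 2, 2],
    "item_3": [3, 4, 4, 2, 4],
}


def proportion_rated_high(ratings: List[int], cutoff: int = 3) -> float:
    """Proportion of respondents whose rating is at or above the cutoff."""
    return sum(r >= cutoff for r in ratings) / len(ratings)


def flag_items(
    ratings_by_item: Dict[str, List[int]],
    min_proportion: float = 0.75,  # assumed threshold, choose your own
    cutoff: int = 3,               # assumed cutoff on the 1-4 scale
) -> List[str]:
    """Items for which too few respondents gave a rating at or above the cutoff."""
    return [
        item
        for item, ratings in ratings_by_item.items()
        if proportion_rated_high(ratings, cutoff) < min_proportion
    ]


print(flag_items(relevance_ratings))  # ['item_2']
```

Flagged items are candidates for discussion, rewording or removal; the same summary could be computed for clarity ratings. Comments from interviews should still be processed qualitatively.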
Step 6: Field-testing and validation
It is wise to perform a field-test among a larger population (than the pilot test respondents) in order to further develop and validate the instrument, including item reduction and gaining insight into the dimensionality of the items. This can be done by critically examining the item scores (in particular, looking for missing data), the distribution of scores (to check for possible ceiling effects), and whether items intended to cluster in a dimension indeed do so (via factor analyses). The quality of individual items could be tested by performing item response theory (IRT) analyses. To understand whether the test (procedure) can be improved, various reliability studies (e.g. inter- or intra-rater, test-retest, or inter-machine studies) could be performed, to understand how the test can be further standardized or restricted. In these studies, in addition to reliability, the measurement error can also be assessed. Further testing of the instrument could be done while starting to use it (e.g. in a trial).
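Checking the distribution of scores for ceiling effects can be sketched as follows in Python. The total scores and the score range (4 items scored 1-5, so totals from 4 to 20) are hypothetical; the function also reports the proportion at the lowest possible score, which is an addition to the check described above.

```python
from typing import List, Tuple


def floor_ceiling_proportions(
    total_scores: List[int], min_score: int, max_score: int
) -> Tuple[float, float]:
    """Proportion of respondents with the lowest and the highest possible total score."""
    n = len(total_scores)
    floor = sum(score == min_score for score in total_scores) / n
    ceiling = sum(score == max_score for score in total_scores) / n
    return floor, ceiling


# Hypothetical field-test totals for a 4-item instrument scored 1-5 per item.
totals = [20, 18, 20, 12, 20, 15, 4, 20, 19, 17]

floor, ceiling = floor_ceiling_proportions(totals, min_score=4, max_score=20)
print(floor, ceiling)  # 0.1 0.4
```

In this made-up sample, 40% of respondents reach the maximum score, which would suggest that the instrument cannot distinguish between respondents at the upper end of the scale.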
More instructions and information on validation can be found in the section on Evaluation of measurement instrument.