Developing measurement instruments (questionnaires)


  • Developing a new measurement instrument that fits your research question


  • Clearly define the construct of the measurement instrument;
  • Conduct a pilot study to test the measurement instrument;
  • Determine the validity of the measurement instrument.



  • Clear description of the construct
  • Old and new versions of items
  • Rationale for the chosen scoring methods
  • Results of pilot testing
  • Final version of the measurement instrument



Project leaders:
  • Decide whether it is necessary to develop a new measurement instrument;
  • Inspect the defined construct by reading and adding comments;
  • Evaluate each step in the development of the instrument with the executing researcher;
  • Ensure that pilot tests are executed;
  • Ensure that the measurement instrument is carefully evaluated.
Executing researcher:
  • Define a clear construct;
  • Choose an appropriate measurement method;
  • Search, select and formulate clear items;
  • Choose the scoring system carefully;
  • Perform pilot tests in a small group from the target population and adapt the instrument where needed;
  • Evaluate the instrument according to the ‘Evaluating instruments’ guidelines.
Research assistant: N.a.

How To

When no instrument is available that measures the construct of interest, you may decide to develop a measurement instrument yourself. To do so, perform the following steps:
Step 1: Definition and elaboration of the construct intended to be measured
The first step in instrument development is conceptualization, which involves defining the construct and the variables to be measured. Use the International Classification of Functioning, Disability and Health (ICF) (WHO, 2011) or the model by Wilson and Cleary (1995) as a framework for your conceptual model. When the construct is not directly observable (a latent variable), the best choice is to develop a multi-item instrument (De Vet et al. 2011). When the observable items are consequences of the construct (they reflect it), this is called a reflective model. When the observable items are determinants of the construct, this is called a formative model. When you are interested in a multidimensional construct, describe each dimension and its relation to the other dimensions.
Step 2: Choice of measurement method (e.g. questionnaire/physical test)
Some constructs are inseparably linked to one measurement instrument: body temperature is measured with a thermometer, and blood pressure is usually assessed with a sphygmomanometer in clinical practice. In these cases the options are limited, but in other situations more options exist. For example, physical functioning can be measured with a performance test, with observations, or with an interview or self-report questionnaire. A performance test for physical functioning provides information about what a person can do, whereas an interview or self-report questionnaire provides information about what a person perceives they can do.
Step 3: Selecting and formulating items
To get input for formulating items for a multi-item questionnaire, you can examine existing instruments from the literature that measure a similar construct (e.g. in a different target population) and talk to experts (both clinicians and patients) using in-depth interview techniques. In addition, pay careful attention to the formulation of response options and instructions, and to the choice of an appropriate recall period (Van den Brink & Mellenbergh, 1998).
Step 4: Scoring issues
Many multi-item questionnaires use 5-point response scales, so the individual items are ordinal. The total score of the instrument is often treated as an interval scale, which makes a wider range of statistical analyses possible. Several questions are important to answer:
  • How will you calculate (sub)scores? Options include summing the items, taking the mean of the item scores, or calculating Z-scores.
  • Are all items equally important, or will you use (implicit) weights? Note that when an instrument has 3 subscales with 5, 7, and 10 items respectively, the total score calculated as the mean of the subscale means generally differs from the total score calculated as the mean of all items, because items in shorter subscales then carry more weight.
  • How will you deal with missing values? If many values are missing (>5-10%), consider multiple imputation (Eekhout et al., 2014).
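The scoring questions above can be illustrated with a minimal Python sketch. It assumes a hypothetical instrument with three subscales of 5, 7, and 10 items, each scored 1 to 5, with None marking a missing answer. The subscale names, the example respondent, and the choice to average only the answered items are illustrative assumptions, not part of this guideline; with many missing values the guideline points to multiple imputation instead.

```python
from statistics import mean, pstdev

# Hypothetical layout: item positions belonging to each subscale (5, 7, 10 items).
SUBSCALES = {
    "A": range(0, 5),
    "B": range(5, 12),
    "C": range(12, 22),
}


def answered(responses, items):
    """Return the non-missing scores for the given item positions."""
    return [responses[i] for i in items if responses[i] is not None]


def subscale_means(responses):
    """Mean of the answered items within each subscale."""
    return {name: mean(answered(responses, items))
            for name, items in SUBSCALES.items()}


def total_mean_of_items(responses):
    """Every answered item counts equally."""
    return mean(answered(responses, range(len(responses))))


def total_mean_of_subscale_means(responses):
    """Every subscale counts equally, so items in short subscales weigh more."""
    return mean(list(subscale_means(responses).values()))


def missing_fraction(responses):
    """Share of items left unanswered by this respondent."""
    return sum(r is None for r in responses) / len(responses)


def z_scores(totals):
    """Standardize total scores across respondents (needs non-zero spread)."""
    mu, sd = mean(totals), pstdev(totals)
    return [(t - mu) / sd for t in totals]


# Illustrative respondent: high on A, low on B and C, one item missing in C.
respondent = [5] * 5 + [2] * 7 + [1] * 9 + [None]

print(subscale_means(respondent))
print(round(total_mean_of_items(respondent), 2))
print(round(total_mean_of_subscale_means(respondent), 2))
print(round(missing_fraction(respondent), 3))
```

For this respondent the two total scores differ (about 2.29 versus about 2.67) because the 5-item subscale gets the same weight as the 10-item subscale in the second calculation. Which of the two is appropriate depends on whether you consider the subscales or the individual items to be equally important.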
Step 5: Pilot study
Be aware that the first version of the instrument you develop will probably not be the final version. It is sensible to test your instrument regularly in small groups of people. A pilot test is intended to assess the comprehensibility, relevance, acceptability, and feasibility of your measurement instrument.
Step 6: Field-testing
See guideline Evaluation of measurement properties.


  • De Vet et al. 2011 Measurement in Medicine.
  • Streiner D.L., Norman G.R. 2008 Health measurement scales: a practical guide to their development and use. 4th ed. Oxford University Press.
  • Van den Brink W.P., Mellenbergh G.J. 1998 Testleer en testconstructie. Boom, Amsterdam.
  • Eekhout I., de Vet H.C., Twisk J.W.R., Brand J.P., de Boer M.R., Heymans M.W. 2014 Missing data in a multi-item instrument were best handled by multiple imputation at the item score level. J Clin Epidemiol. 67(3):335-342.


Audit questions

  1. Is the construct of the instrument clearly defined?
  2. Was the type of the measurement instrument correctly chosen?
  3. Has a pilot study been conducted?
  4. Was the measurement instrument validated?

V4.0: 10 Aug 2016: Revision guideline
V3.0: 19 June 2015: Revision format
V2.0: 27 May 2011: Guideline 1.1B-08 rewritten and divided into 3 guidelines: 1.1B-08 a, b and c