Reliability and measurement error of sensorimotor tests in patients with neck pain: a systematic review



Neck pain is one of the leading causes of years lived with disability, and approximately half of people with neck pain experience recurrent episodes. Deficits in the sensorimotor system can persist even after pain relief, which may contribute to the chronic course of neck pain in some patients. Evaluation of sensorimotor capacities in patients with neck pain is therefore important. No consensus exists on how sensorimotor capacities of the neck should be assessed in physiotherapy. The aims of this systematic review are: (a) to provide an overview of tests used in physiotherapy for assessment of sensorimotor capacities in patients with neck pain; and (b) to provide information about reliability and measurement error of these tests, to enable physiotherapists to select appropriate tests.


Medline, CINAHL, Embase and PsycINFO databases were searched for studies reporting data on the reliability and/or measurement error of sensorimotor tests in patients with neck pain. The results for reliability and measurement error were compared against the criteria for good measurement properties. The quality of evidence was assessed according to the modified GRADE method proposed by the COSMIN group.


A total of 206 tests for assessment of sensorimotor capacities of the neck were identified and categorized into 18 groups of tests. The included tests did not cover all aspects of the sensorimotor system; tests for the sensory and motor components were identified, but not for the central integration component. Furthermore, no data were found on reliability or measurement error for some tests that are used in practice, such as movement control tests, which apply to the motor component. Approximately half of the tests showed good reliability, and 12 were rated as having good (+) reliability. However, tests that evaluated complex movements, which are more difficult to standardize, were less reliable. Measurement error could not be evaluated because the minimal clinically important change was not available for all tests.


Overall, the quality of evidence is not yet high enough to enable clear recommendations about which tests to use to assess the sensorimotor capacities of the neck.


Neck pain is the second most common musculoskeletal problem [1]. It is one of the leading causes of years lived with disability worldwide and represents an increasing burden on healthcare systems [2,3,4]. The economic burden of neck pain, in terms of treatment costs, lost productivity and work-related problems is high [1]. The point prevalence of neck pain in different countries ranges from 2443.9 to 6151.2 cases per 100,000 population, with the highest values in western Europe [1, 5]. The mean percentages for one-year prevalences and lifetime prevalences of adults worldwide are 37.2% and 48.5%, respectively [6]. Although acute neck pain usually resolves within two months, approximately 50% of patients are not completely pain free one year after an episode of neck pain [7,8,9]. This illustrates the often chronic-episodic course of the condition, with patients experiencing persistent or recurrent episodes of neck pain [10].

Management of patients with neck pain is a major challenge in physiotherapy, mainly because these patients form a very heterogeneous group in terms of the nature of symptoms, symptom distribution, and underlying pain mechanisms [11]. As neck pain is a multidimensional condition, management should consider multiple factors (e.g. pain mechanisms, and psychological, biological, movement and work-related factors). Among the work-related factors, workload, work or study time, sustained postures or body positions during work and computer work are considered as risk factors for the development of neck pain [1, 12]. The different factors can interact, and their expression may be more or less dominant in each patient, thus influencing the clinical approach [1, 12].

Deficits of sensorimotor capacities (SC) may be one of the factors contributing to neck pain, in particular the persistence or recurrence of neck pain [13]. The sensorimotor system is defined as an integrated whole, comprising afferent and efferent information, with central integration and processing components necessary to provide functional joint stability [14]. It is thought to influence, among other things, joint position sense, activation of cervical flexor muscles and control of head-eye movement. The SC of the cervical spine are related to neck pain [15], and patients with neck pain often demonstrate reduced SC, e.g. reduced joint position sense [16,17,18], altered activation patterns of the cervical muscles [19,20,21], or disturbed head-eye movement control [22]. Furthermore, deficits in the sensorimotor system can persist even after pain relief. It is hypothesized that the persistence of these deficits may contribute to some patients experiencing recurrent episodes of neck pain [23,24,25], and the integration of sensorimotor training in the management of patients with neck pain has shown promising results [13]. Therefore, evaluation of SC in patients with neck pain is important [26]. Various tests to evaluate the sensorimotor system have been developed and are widely used in physiotherapy practice and research. However, the terminology used is often confusing, and there is no consensus on how SC of the neck should be assessed [14, 27]. Systematic reviews of tests for SC of the neck have investigated only a limited selection of tests assessing single aspects of SC, such as joint position sense [28] or muscle function [29,30,31]. A systematic review providing a comprehensive overview of all available tests to assess all different aspects of SC of the neck is lacking.

Given that many tests exist for assessment of SC of the neck, the challenge is to choose the most appropriate test for use in a specific situation. From a scientific perspective, knowledge about the quality of a test, i.e. measurement properties, is important when making this decision. The quality of a test depends on three criteria: reliability, validity and responsiveness [32].

This systematic review investigates the domain of reliability, i.e. the degree to which measurements are free from measurement error. This domain comprises three measurement properties: reliability (the proportion of the total variance in the measurements that is due to ‘true’ differences between patients), measurement error (the systematic and random error of a patient’s score that is not attributed to true changes in the construct to be measured), and internal consistency [32]. Internal consistency is usually investigated in self-reported multi-item questionnaires and is therefore not relevant for the single-item tests used to assess SC.

The aim of this systematic review is to include all tests assessing any aspect of SC of the neck. Because many different tests are described in the literature, this review focusses only on the reliability domain. When deciding which test to use, it would also be important to consider the different aspects of validity.

The concepts of reliability and measurement error are related, but focus on different purposes. Reliability focusses on the variability between patients or measurements and is influenced by the variation in the population where the test is used. On the other hand, measurement error is a relevant parameter for measurement of change over time, and it is not affected by population variability [33]. In clinical practice physiotherapists are interested in both concepts. The distinction between patients with and without deficits in the sensorimotor system (diagnostic purpose) is important. But measurement error is also an issue, as change over time, i.e. the evolution of the patient’s symptoms, is of interest.
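The distinction can be made concrete with one common variance-components formulation, given here as a sketch and not as the formulas used by the included studies. The symbols are assumptions introduced for illustration: \sigma^2_{p} is the variance between patients, \sigma^2_{o} the systematic variance between raters or occasions, and \sigma^2_{\text{res}} the residual random variance.

```latex
\mathrm{ICC}_{\text{agreement}}
  = \frac{\sigma^2_{p}}{\sigma^2_{p} + \sigma^2_{o} + \sigma^2_{\text{res}}},
\qquad
\mathrm{SEM}_{\text{agreement}}
  = \sqrt{\sigma^2_{o} + \sigma^2_{\text{res}}}
```

Because \sigma^2_{p} appears only in the ICC, the same test can appear more reliable in a heterogeneous sample than in a homogeneous one, whereas the standard error of measurement (SEM) is unaffected by the spread of the population.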

The aims of this systematic review are: (a) to provide an overview of tests used in physiotherapy to assess SC in patients with neck pain; and (b) to provide information about the reliability and measurement error of these tests, to enable physiotherapists to select appropriate tests.



A systematic review of studies investigating the reliability and measurement error of tests assessing SC of patients with neck pain in a physiotherapy setting.

Search strategy

CINAHL, Embase and PsycINFO were searched up to July 2020, and Medline up to May 2021. Blocks of search terms were developed for: (a) construct of interest (sensorimotor capacities), (b) population (patients with neck pain), (c) the sensitive PubMed filter developed by Terwee et al. [34] for the identification of studies about measurement properties of measurement instruments, and (d) the exclusion filter proposed by Terwee et al. [34] to exclude irrelevant studies. The two filters were adapted to the other databases, adopting the strategy used by Ammann-Reiffer et al. [35]. There was no language restriction. The reference lists of the systematic reviews retrieved were hand searched for further eligible studies. The detailed search strategy is shown in Additional file 1.

Selection process

Two reviewers (SE and either RH or MT) screened the titles and abstracts independently, based on the predefined inclusion and exclusion criteria listed in Fig. 1. Disagreements were discussed and, if necessary, a third reviewer (CB) made a decision regarding inclusion. Reviewers in the team were able to read English, German, Dutch, French, Danish and Norwegian, and no relevant papers were excluded because of language.

Full-text screening was performed independently by two researchers (SE and RH) using the same predefined criteria (Fig. 1). After each screening step (title/abstract and full text), in the case of any disagreement about inclusion, consensus was reached through discussion with a third reviewer (CB). The screening was carried out using Covidence systematic review software [36].

Data extraction

Data extraction was conducted by SE and RH using REDCap electronic data capture tools hosted at the University of Applied Sciences and Arts Western Switzerland (HES-SO) Valais [37]. The first five studies were checked by a third researcher (CB) to ensure the correct procedure. Data were extracted on study characteristics, reliability, and measurement error of the different tests. Two researchers (SE and RH) assessed methodological quality, applying the COSMIN risk of bias tool in the adapted version for clinician-reported or performance-based outcome measures [38]. Each criterion was rated on a four-point rating system (i.e. very good, adequate, doubtful, or inadequate). The lowest rating determined the overall rating of the study (worst-score-counts method). The detailed tables for risk of bias assessment are shown in Additional file 2 (reliability) and Additional file 3 (measurement error). A third researcher (CB) performed a check of the first studies. Data extraction and synthesis were conducted for all included studies, regardless of their methodological quality.
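The worst-score-counts rule described above can be sketched in a few lines of Python. This is an illustration only: the function name and the example item ratings are hypothetical, and only the four rating labels come from the text.

```python
# Ratings ordered from best to worst, as listed in the text.
RATING_ORDER = ["very good", "adequate", "doubtful", "inadequate"]


def overall_rating(item_ratings):
    """Return the worst rating among the items (worst-score-counts)."""
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    # The rating with the highest index in RATING_ORDER is the worst one.
    return max(item_ratings, key=RATING_ORDER.index)


# Hypothetical item ratings for a single study.
example_items = ["very good", "adequate", "doubtful", "very good"]
print(overall_rating(example_items))
```

A single doubtful item is enough to pull the whole study down to doubtful, which is why one poorly reported criterion can dominate the overall rating.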

Data synthesis and analysis

The intraclass correlation coefficients of studies that used the same device and similar instructions for the corresponding test were quantitatively pooled. When pooling was not feasible, the study results were qualitatively summarized by reporting the lowest and highest values. Because many tests were applied in several movement directions of the neck (left rotation, right rotation, etc.), directions whose reliability or measurement error values led to the same conclusion regarding the criteria for good measurement properties were summarized and reported as the lowest and highest value. Test directions with very different values (i.e. when the conclusion about the appropriateness of the reliability or the measurement error for this direction would be different from that for other directions) were reported separately.
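The text does not state which pooling method was used. As one possible illustration, the sketch below pools ICCs with a Fisher z-transformation and weights of n − 3, an approach borrowed from the meta-analysis of correlation coefficients and used here as a simplifying assumption. The function names and example values are hypothetical.

```python
import math


def pool_icc_fisher_z(iccs, sample_sizes):
    """Pool ICCs via Fisher z with weights n - 3 (assumed method)."""
    if len(iccs) != len(sample_sizes) or not iccs:
        raise ValueError("need one sample size per ICC")
    if any(n <= 3 for n in sample_sizes):
        raise ValueError("each study needs more than 3 participants")
    weights = [n - 3 for n in sample_sizes]
    # Transform each ICC to the z scale, average, then transform back.
    z_values = [math.atanh(icc) for icc in iccs]
    pooled_z = sum(w * z for w, z in zip(weights, z_values)) / sum(weights)
    return math.tanh(pooled_z)


def summarize_range(values):
    """Qualitative summary: lowest and highest reported value."""
    return min(values), max(values)


# Hypothetical ICCs and sample sizes from three studies of the same test.
print(round(pool_icc_fisher_z([0.72, 0.85, 0.80], [25, 40, 30]), 2))
print(summarize_range([0.72, 0.85, 0.80]))
```

The second function corresponds to the qualitative fallback described above, in which only the lowest and highest values are reported.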

The overall results for the reliability and/or measurement error of single studies or of summarized or pooled studies were compared against the criteria for good measurement properties. In the next step, the quality of evidence was graded according to the modified GRADE method proposed by the COSMIN group [38]. The quality of evidence was classified as high, moderate, low, or very low. The score was downgraded for risk of bias (minus one for serious, minus two for very serious and minus three for extremely serious risk of bias), inconsistency (minus one if more than one study per test was available and I² > 0.5), and imprecision (minus one if total sample size n = 50–100, minus two if total sample size n < 50). The score was not downgraded for indirectness, due to the restrictive inclusion criteria used in the current study [38]. Detailed tables of the quality of evidence criteria are shown in Additional file 4 (reliability) and Additional file 5 (measurement error).
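The downgrading rules above can be sketched as a small function. This is a sketch under stated assumptions and not the authors' implementation: it assumes grading starts at "high", that the rating cannot fall below "very low", and that a total sample size of exactly 100 counts as the 50–100 band. All names are hypothetical.

```python
LEVELS = ["high", "moderate", "low", "very low"]

# Steps to downgrade for risk of bias, as listed in the text.
RISK_OF_BIAS_STEPS = {
    "none": 0,
    "serious": 1,
    "very serious": 2,
    "extremely serious": 3,
}


def grade_quality(risk_of_bias, n_studies, i_squared, total_n):
    """Return the quality of evidence after downgrading."""
    steps = RISK_OF_BIAS_STEPS[risk_of_bias]
    # Inconsistency: only assessable when more than one study is available.
    if n_studies > 1 and i_squared is not None and i_squared > 0.5:
        steps += 1
    # Imprecision: based on the total sample size.
    if total_n < 50:
        steps += 2
    elif total_n <= 100:
        steps += 1
    # Assumption: start at "high" and never go below "very low".
    return LEVELS[min(steps, len(LEVELS) - 1)]


# Hypothetical example: one study, serious risk of bias, 60 participants.
print(grade_quality("serious", n_studies=1, i_squared=None, total_n=60))
```

Under these rules, a single small study with any serious risk of bias already ends at low or very low.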

Fig. 1 Criteria for inclusion or exclusion of studies


In total 11,704 studies were found using the search strategy in four databases (Medline, CINAHL, Embase, PsycINFO). First, 3741 duplicates were removed. The remaining 7963 studies were screened for title and abstract, and 7803 were excluded based on the predefined criteria. Of the 160 full-text studies, 118 were excluded. The reasons for exclusion are listed in Fig. 2.
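The study flow can be checked with simple arithmetic. The sketch below uses only the counts reported in the text; the variable names are hypothetical.

```python
# Counts reported in the text.
records_identified = 11704
duplicates_removed = 3741
excluded_title_abstract = 7803
excluded_full_text = 118

# Derived counts at each stage of the selection process.
screened = records_identified - duplicates_removed
full_text_assessed = screened - excluded_title_abstract
included = full_text_assessed - excluded_full_text

print(screened)            # 7963
print(full_text_assessed)  # 160
print(included)            # 42
```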

A final total of 42 studies, investigating a total of 206 tests for the assessment of SC in patients with neck pain, were included in the systematic review (Table 1).

Fig. 2 Flow chart

Table 1 Characteristics of the included studies

Tests were categorized into 18 different groups (e.g. tests for active range of motion in the different movement directions of flexion, extension, lateral flexion, and rotation with the help of different devices were grouped together as active range of motion tests). Based on the classification of Riemann & Lephart [14], tests for the sensory and the motor components of the sensorimotor system were identified, but no tests for the central integration component were found. Within the sensory component, tests in the subcomponents “tactile” and “conscious proprioceptive senses” were found. As this study did not search for tests assessing pain, the subcomponent “pain” does not contain a test. A list of all groups of tests is shown in Fig. 3.

Fig. 3 Sensorimotor system definition (according to Riemann & Lephart 2002 [14]) and the 18 groups of tests included in this systematic review (pink boxes)

According to the COSMIN criteria, the following 12 tests were rated as good: craniocervical flexion test (test-retest reliability), neck flexor muscle endurance test (inter-rater and test-retest reliability), neck extensor muscle endurance test (inter-rater and test-retest reliability), sternocleidomastoid muscle strength (test-retest reliability), maximal voluntary isometric contraction (test-retest reliability), isometric strength with the help of different devices (test-retest reliability), flexion-relaxation ratio (test-retest reliability), active range of motion test with the help of different devices (inter-rater and test-retest reliability), figure of eight test (inter-rater and test-retest reliability), zigzag test (inter-rater and test-retest reliability), smooth pursuit neck torsion test (test-retest reliability), and rod and frame test (test-retest reliability). An overview of the ratings of all tests is shown in Table 2. However, regarding reliability, the quality of evidence was rated as low or very low for all included studies. The reasons for downgrading are shown in Additional file 4.
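A minimal sketch of how a single reliability result might be rated. The threshold of 0.70 for the ICC (or weighted kappa) is taken from the published COSMIN criteria for good measurement properties and is not stated in this article; the function name and example values are hypothetical.

```python
from typing import Optional

# Assumption: COSMIN threshold for sufficient reliability.
COSMIN_RELIABILITY_THRESHOLD = 0.70


def rate_reliability(icc: Optional[float]) -> str:
    """Rate a reliability coefficient as '+', '-' or '?'."""
    if icc is None:
        # No ICC or weighted kappa reported: the rating is indeterminate.
        return "?"
    return "+" if icc >= COSMIN_RELIABILITY_THRESHOLD else "-"


# Hypothetical values for three tests.
for value in (0.85, 0.55, None):
    print(value, rate_reliability(value))
```

A "+" rating describes only the size of the coefficient; the quality of evidence behind it is graded separately.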

Table 2 Summary of findings

Regarding measurement error, the criteria for good measurement error were rated as unknown for all included tests, because the minimal clinically important change was not reported. The quality of evidence was rated very low to high (Table 2). Reasons for downgrading are shown in Additional file 5.
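For context, the COSMIN criteria for good measurement properties compare the smallest detectable change (SDC) or the limits of agreement with the minimal clinically important change (MIC). A commonly used expression for the SDC, given here as background and not taken from the included studies, is shown below together with the resulting rating rule.

```latex
\mathrm{SDC} = 1.96 \times \sqrt{2} \times \mathrm{SEM},
\qquad
\text{rating} =
\begin{cases}
+ & \text{if } \mathrm{SDC} < \mathrm{MIC} \\
- & \text{if } \mathrm{SDC} > \mathrm{MIC} \\
? & \text{if the MIC is not defined}
\end{cases}
```

Without an MIC the comparison cannot be made, which is why the rating remained unknown for all included tests.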


This systematic review included 42 studies evaluating 206 tests, with the aim of investigating the reliability and measurement error of tests for SC in patients with neck pain. The main findings are as follows. First, tests for the sensory and motor components of the sensorimotor system were found, but not for the central integration component; furthermore, no data were found on reliability or measurement error in patients with neck pain for some tests that are used in practice, such as the movement control tests, which would belong to the motor component. Second, approximately half of the tests, particularly tests that are easier to standardize with regard to test position or movement direction, showed good reliability. Finally, tests evaluating more complex movements, which are more difficult to standardize, were less reliable.

In general, all included muscle endurance tests had good (relative) reliability values according to the criteria for good measurement properties proposed by COSMIN, except for the scapula muscle endurance test in standing position. The execution of this test is much more complex and more difficult to standardize than that of the other tests. Furthermore, scapula movements, compensatory movements, muscle recruitment etc. are more difficult to assess than neck movements, where the movement directions follow the sagittal, frontal, or transverse plane in a more stable way. Similarly, regarding reliability of the isometric muscle strength tests, tests involving the judgement of movements or muscle recruitment around the scapula have lower values for reliability than tests for isometric activity of the head into flexion, extension, lateral flexion, or rotation. Again, this may be because scapula positions are more difficult to standardize, and isometric contractions of the scapula muscles are more difficult to assess regarding compensatory movements than isometric muscle activity of the muscles of the cervical spine.

The test of fast cervical rotations showed very low reliability, possibly due to the very complex characteristics of these movements, which make it difficult to standardize the test. The tests assessing active range of motion (AROM) of the cervical spine showed that rotation is more difficult to assess than the other movement directions. This is particularly evident when the rotation is assessed as a single movement (combined right and left values) and when AROM is assessed with the help of a smartphone. In the current analysis, the values were less reliable for Android phones than for iPhones (see Table 2). This could be due to differences in the study protocols. In the study that used an Android phone, it was only held against the head, whereas in the study assessing AROM with an iPhone, the device was fastened securely to the forehead with a rigid strap, which might produce more reliable results. The assessment of AROM with the help of a dynamometer or goniometer showed good test-retest reliability results, but poorer values for inter-rater reliability. Good values for inter-rater reliability are generally more difficult to achieve, because more sources of variation are included (e.g. different testers). Thus, the standardization of these types of tests is often a problem.

Using the example of the craniocervical flexion test (CCFT), this review shows that tests that require a substantial subjective rating (e.g. judgement of muscle recruitment or movement patterns) lead to lower reliability compared with more objective criteria (e.g. time). The current results are in line with a recent systematic review by Selistre and colleagues [30], investigating clinical tests for measuring strength or endurance of cervical muscles. They found moderate to good intra- and inter-rater reliability for the CCFT, cervical flexor endurance test, cervical extensor endurance test and cervical muscle strength assessed using a handheld dynamometer. The results of the current review are comparable for the CCFT, the cervical flexor endurance test and the cervical extensor endurance test. For the cervical muscle strength tests, the current review performed a more detailed analysis, e.g. Selistre et al. [30] described the cervical strength tests only with the handheld dynamometer and not with other devices. In the current study, the cervical strength tests with dynamometer showed good results for test-retest reliability, but poorer results for intra-rater reliability. The results of the current review are also comparable with those of a recent systematic review of the measurement properties of the CCFT [31]. The authors classified the inter-rater and the intra-rater reliability of the CCFT as positive and the level of evidence as moderate. The measurement error was classified as indeterminate and the level of evidence as unknown. The authors identified the same problems as found in the current review, such as low methodological quality of the included studies and missing data on minimal clinically important change.

The two recent systematic reviews on measurement properties of tests for the SC of the neck included studies with participants with and without neck pain [30, 31]. Both stated that studies on participants with neck pain were lacking, which is in line with the current results. The current review excluded several studies because the results for participants with neck pain were not reported separately but only together with those for people without neck pain. It was decided to include only studies with data for patients with neck pain, given our interest in the use of the tests in a clinical setting. Because the reliability of a test is influenced by the heterogeneity of the population in which the test is performed, it is important to know the reliability for a comparable population to that in which the test will be administered. It was also surprising that tests such as the CCFT, which is widely used in clinical practice, are so rarely investigated in patients with neck pain.

The major strength of this study is that it included all available tests for assessment of all aspects of sensorimotor control of the neck. However, the study also has a number of limitations. First, many tests were performed only on healthy participants or in a mixed group of participants with and without neck pain. Several studies were therefore excluded, including all studies assessing tests for movement control of the neck, as the authors did not report separate data for the patient group. Second, the quality of evidence was low to very low regarding reliability for all included studies. It was necessary to downgrade the level of evidence, mainly because of high risk of bias, inconsistency, and imprecision. In the assessment of risk of bias, the item “patient stability” was rated as doubtful in many cases. COSMIN recommends that patient stability should only be rated as very good if the study explicitly describes that the patients’ condition did not change between measurements. As this information was often missing, the current review had to rate the patient stability item as doubtful, even though the time interval between measurements was adequate. A further limitation of this review is that the included studies did not report data on interpretability and feasibility of the different tests, which would be important information for the recommendation of specific tests. Finally, this review did not assess aspects of validity, which would certainly also be important for the selection of appropriate tests.

Better studies are needed on reliability, measurement error and validity of tests in patients with neck pain, because the quality of evidence of the existing research is mainly low or very low, and the reliability of some tests (e.g. for movement control) was not evaluated in patients with neck pain at all.


Despite the large number of tests available, the quality of evidence is not yet high enough to conclusively inform clinicians which test to use to assess SC in patients with neck pain.

For clinical practice, this systematic review shows that tests with objective criteria and a thorough standardization should be chosen to ensure higher reliability.

Measurement error could not be evaluated because the minimal clinically important change was not available for all tests.

Availability of data and materials

All data generated or analysed during this study are included in this published article and its supplementary information files.



AROM: Active range of motion
CCFT: Craniocervical flexion test
CINAHL: Cumulative Index to Nursing and Allied Health Literature
COSMIN: Consensus-based Standards for the selection of health Measurement INstruments
Embase: Excerpta Medica dataBASE
GRADE: Grading of Recommendations Assessment, Development and Evaluation
ICC: Intra-class correlations
Medline: Medical Literature Analysis and Retrieval System Online


  1. Kazeminasab S, Nejadghaderi SA, Amiri P, Pourfathi H, Araj-Khodaei M, Sullman MJ, et al. Neck pain: global epidemiology, trends and risk factors. BMC Musculoskel Dis. 2022;23(26):1–13.

  2. Hoy D, March L, Woolf A, Blyth F, Brooks P, Smith E, et al. The global burden of neck pain: estimates from the global burden of disease 2010 study. Ann Rheumat Dis. 2014;73(7):1309–15.

  3. Vos T, Flaxman AD, Naghavi M, Lozano R, Michaud C, Ezzati M, et al. Years lived with disability (YLDs) for 1160 sequelae of 289 diseases and injuries 1990–2010: a systematic analysis for the global burden of Disease Study 2010. Lancet. 2012;380(9859):2163–96.

  4. Hurwitz EL, Randhawa K, Yu H, Côté P, Haldeman S. The Global Spine Care Initiative: a summary of the global burden of low back and neck pain studies. Eur Spine J. 2018;27(6):796–801.

  5. Safiri S, Kolahi A-A, Hoy D, Buchbinder R, Mansournia MA, Bettampadi D, et al. Global, regional, and national burden of neck pain in the general population, 1990–2017: systematic analysis of the Global Burden of Disease Study 2017. BMJ. 2020;368:m791.

  6. Fejer R, Kyvik KO, Hartvigsen J. The prevalence of neck pain in the world population: a systematic critical review of the literature. Eur Spine J. 2006;15(6):834–48.

  7. Vos CJ, Verhagen AP, Passchier J, Koes BW. Clinical course and prognostic factors in acute neck pain: an inception cohort study in general practice. Pain Med. 2008;9(5):572–80.

  8. Vasseljen O, Woodhouse A, Bjørngaard JH, Leivseth L. Natural course of acute neck and low back pain in the general population: the HUNT study. PAIN®. 2013;154(8):1237–44.

  9. Cohen SP. Epidemiology, diagnosis, and treatment of neck pain. Mayo Clin Proc. 2015;90(2):284–99.

  10. Hogg-Johnson S, van der Velde G, Carroll LJ, Holm LW, Cassidy JD, Guzman J, et al. The burden and determinants of neck pain in the general population: results of the Bone and Joint Decade 2000–2010 Task Force on Neck Pain and its Associated Disorders. J Manipul Physiol Therapeut. 2009;32(2):S46–S60.

  11. Childs MJD, Fritz JM, Piva SR, Whitman JM. Proposal of a classification system for patients with neck pain. J Orthopaed Sports Phys Ther. 2004;34(11):686–700.

  12. Rabey M, Beales D, Slater H, O’Sullivan P. Multidimensional pain profiles in four cases of chronic non-specific axial low back pain: an examination of the limitations of contemporary classification systems. Man Ther. 2015;20(1):138–47.

  13. Sremakaew M, Jull G, Treleaven J, Uthaikhup S. Effectiveness of adding rehabilitation of cervical related sensorimotor control to manual therapy and exercise for neck pain: a randomized controlled trial. Musculoskel Sci Pract. 2023;63:102690.

  14. Riemann BL, Lephart SM. The sensorimotor system, part I: the physiologic basis of functional joint stability. J Athlet Train. 2002;37(1):71.

  15. Beinert K, Preiss S, Huber M, Taube W. Cervical joint position sense in neck pain. Immediate effects of muscle vibration versus mental training interventions: a RCT. Eur J Phys Rehabil Med. 2015;51(6):825–32.

  16. Revel M, Andre-Deshays C, Minguet M. Cervicocephalic kinesthetic sensibility in patients with cervical pain. Arch Phys Med Rehabil. 1991;72(5):288–91.

  17. Kristjansson E, Dall’Alba P, Jull G. A study of five cervicocephalic relocation tests in three different subject groups. Clin Rehabil. 2003;17(7):768–74.

  18. Reddy RS, Meziat-Filho N, Ferreira AS, Tedla JS, Kandakurti PK, Kakaraparthi VN. Comparison of neck extensor muscle endurance and cervical proprioception between asymptomatic individuals and patients with chronic neck pain. J Bodyw Mov Ther. 2021;26:180–6.

  19. Falla D, Jull G, Edwards S, Koh K, Rainoldi A. Neuromuscular efficiency of the sternocleidomastoid and anterior scalene muscles in patients with chronic neck pain. Disabil Rehabil. 2004;26(12):712–7.

  20. Falla DL, Jull GA, Hodges PW. Patients with neck pain demonstrate reduced electromyographic activity of the deep cervical flexor muscles during performance of the craniocervical flexion test. Spine. 2004;29(19):2108–14.

  21. Jull G, Kristjansson E, Dall’Alba P. Impairment in the cervical flexors: a comparison of whiplash and insidious onset neck pain patients. Man Ther. 2004;9(2):89–94.

  22. Della Casa E, Affolter Helbling J, Meichtry A, Luomajoki H, Kool J. Head-eye movement control tests in patients with chronic neck pain; inter-observer reliability and discriminative validity. BMC Musculoskel Dis. 2014;15:16.

  23. Sterling M, Jull G, Wright A. The effect of musculoskeletal pain on motor activity and control. J Pain. 2001;2(3):135–45.

  24. Jull G, Trott P, Potter H, Zito G, Niere K, Shirley D, et al. A randomized controlled trial of exercise and manipulative therapy for cervicogenic headache. Spine. 2002;27(17):1835–43.

  25. Kristjansson E, Treleaven J. Sensorimotor function and dizziness in neck pain: implications for assessment and management. J Orthopaed Sports Phys Ther. 2009;39(5):364–77.

  26. Treleaven J. Sensorimotor disturbances in neck disorders affecting postural stability, head and eye movement control. Man Ther. 2008;13(1):2–11.

  27. de Zoete RM, Osmotherly PG, Rivett DA, Farrell SF, Snodgrass SJ. Sensorimotor control in individuals with idiopathic neck pain and healthy individuals: a systematic review and meta-analysis. Arch Phys Med Rehabil. 2017;98(6):1257–71.

  28. Michiels S, De Hertogh W, Truijen S, November D, Wuyts F, Van de Heyning P. The assessment of cervical sensory motor control: a systematic review focusing on measuring methods and their clinimetric characteristics. Gait Posture. 2013;38(1):1–7.

  29. de Koning CH, van den Heuvel SP, Staal JB, Smits-Engelsman B, Hendriks EJ. Clinimetric evaluation of methods to measure muscle functioning in patients with non-specific neck pain: a systematic review. BMC Musculoskel Dis. 2008;9(1):1–9.

  30. Selistre LFA, de Sousa Melo C, de Noronha MA. Reliability and validity of clinical tests for measuring strength or endurance of cervical muscles: a systematic review and meta-analysis. Arch Phys Med Rehabil. 2021;102(6):1210–27.

  31. Araujo FXd, Ferreira GE, Scholl Schell M, Castro MPd, Ribeiro DC, Silva MF. Measurement properties of the craniocervical flexion test: a systematic review. Phys Ther. 2020;100(7):1094–117.

  32. Mokkink LB, Terwee CB, Patrick DL, Alonso J, Stratford PW, Knol DL, et al. The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. J Clin Epidemiol. 2010;63(7):737–45.

  33. Scholtes VA, Terwee CB, Poolman RW. What makes a measurement instrument valid and reliable? Injury. 2011;42(3):236–40.

  34. Terwee CB, Jansma EP, Riphagen II, de Vet HC. Development of a methodological PubMed search filter for finding studies on measurement properties of measurement instruments. Qual Life Res. 2009;18(8):1115–23.

  35. Ammann-Reiffer C, Bastiaenen CH, de Bie RA, van Hedel HJ. Measurement properties of gait-related outcomes in youth with neuromuscular diagnoses: a systematic review. Phys Ther. 2014;94(8):1067–82.

  36. Covidence systematic review software. Melbourne, Australia.

  37. Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. A metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform. 2009;42(2):377–81.

  38. Mokkink LB, Boers M, van der Vleuten CPM, Bouter LM, Alonso J, Patrick DL, de Vet HCW, Terwee CB. COSMIN Risk of Bias tool to assess the quality of studies on reliability or measurement error of outcome measurement instruments: a Delphi study. BMC Med Res Methodol. 2020;20:293.

  39. Chiu TTW, Lo SK. Evaluation of cervical range of motion and isometric neck muscle strength: reliability and validity. Clin Rehabil. 2002;16(8):851–8.

  40. Cibulka MT, Herren J, Kilian A, Smith S, Mahmutovic F, Dolles C. The reliability of assessing sternocleidomastoid muscle length and strength in adults with and without mild neck pain. Physiother Theory Pract. 2017;33(4):323–30.

  41. Cleland JA, Childs JD, Fritz JM, Whitman JM. Interrater reliability of the history and physical examination in patients with mechanical neck pain. Arch Phys Med Rehabil. 2006;87(10):1388–95.

  42. De Pauw R, Van Looveren E, Lenoir D, Danneels L, Cagnie B. Reliability and discriminative validity of a screening tool for the assessment of neuromuscular control and movement control in patients with neck pain and healthy individuals. Disabil Rehabil. 2020.

  43. Dvir Z, Gal-Eshel N, Shamir B, Prushansky T, Pevzner E, Peretz C. Cervical motion in patients with chronic disorders of the cervical spine: a reproducibility study. Spine. 2006;31(13):E394–E9.

  44. Edmondston SJ, Wallumrød ME, MacLéid F, Kvamme LS, Joebges S, Brabham GC. Reliability of isometric muscle endurance tests in subjects with postural neck pain. J Manipulative Physiol Ther. 2008;31(5):348–54.

  45. Fletcher JP, Bandy WD. Intrarater reliability of CROM measurement of cervical spine active range of motion in persons with and without neck pain. J Orthop Sports Phys Ther. 2008;38(10):640–5.

  46. Ghorbani F, Kamyab M, Azadinia F. Smartphone applications as a suitable alternative to CROM device and inclinometers in assessing the cervical range of motion in patients with nonspecific neck pain. J Chiropr Med. 2020;19(1):38–48.

  47. Gonçalves C, Silva AG. Reliability, measurement error and construct validity of four proprioceptive tests in patients with chronic idiopathic neck pain. Musculoskelet Sci Pract. 2019;43:103–9.

  48. Grod JP, Diakow PR. Effect of neck pain on verticality perception: a cohort study. Arch Phys Med Rehabil. 2002;83(3):412–5.

  49. Hanney WJ, George SZ, Kolber MJ, Young I, Salamh PA, Cleland JA. Inter-rater reliability of select physical examination procedures in patients with neck pain. Physiother Theory Pract. 2014;30(5):345–52.

  50. Harris KD, Heer DM, Roy TC, Santos DM, Whitman JM, Wainner RS. Reliability of a measurement of neck flexor muscle endurance. Phys Ther. 2005;85(12):1349–55.

  51. Hoppenbrouwers M, Eckhardt MM, Verkerk K, Verhagen A. Reproducibility of the measurement of active and passive cervical range of motion. J Manipulative Physiol Ther. 2006;29(5):363–7.

  52. Hoving JL, Pool JJ, Van Mameren H, Devillé WJ, Assendelft WJ, De Vet HC, et al. Reproducibility of cervical range of motion in patients with neck pain. BMC Musculoskel Dis. 2005;6:1–8.

  53. Kristjansson E, Hardardottir L, Asmundardottir M, Gudmundsson K. A new clinical test for cervicocephalic kinesthetic sensibility: “the fly”. Arch Phys Med Rehabil. 2004;85(3):490–5.

  54. Kristjansson E, Oddsdottir GL. “The Fly”: a new clinical assessment and treatment method for deficits of movement control in the cervical spine: reliability and validity. Spine. 2010;35(23):E1298–E305.

  55. Kumbhare DA, Balsor B, Parkinson WL, Harding Bsckin P, Bedard M, Papaioannou A, et al. Measurement of cervical flexor endurance following whiplash. Disabil Rehabil. 2005;27(14):801–7.

  56. Law EYH, Chiu TT-W. Measurement of cervical range of motion (CROM) by electronic CROM goniometer: a test of reliability and validity. J Back Musculoskelet Rehabil. 2013;26(2):141–8.

  57. Lourenço AS, Lameiras C, Silva AG. Neck flexor and extensor muscle endurance in subclinical neck pain: intrarater reliability, standard error of measurement, minimal detectable change, and comparison with asymptomatic participants in a university student population. J Manipulative Physiol Ther. 2016;39(6):427–33.

  58. Majcen Rosker Z, Vodicar M, Kristjansson E. Inter-visit reliability of smooth pursuit neck torsion test in patients with chronic neck pain and healthy individuals. Diagnostics. 2021;11(5):752.

  59. Martins F, Bento A, Silva AG. Within-session and between-session reliability, construct validity, and comparison between individuals with and without neck pain of four neck muscle tests. PM&R. 2018;10(2):183–93.

  60. Murphy BA, Marshall PW, Taylor HH. The cervical flexion-relaxation ratio: reproducibility and comparison between chronic neck pain patients and controls. Spine. 2010;35(24):2103–8.

  61. O’Leary SP, Vicenzino BT, Jull GA. A new method of isometric dynamometry for the craniocervical flexor muscles. Phys Ther. 2005;85(6):556–64.

  62. Pearson I, Reichert A, De Serres SJ, Dumas J-P, Côté JN. Maximal voluntary isometric neck strength deficits in adults with whiplash-associated disorders and association with pain and fear of movement. J Orthop Sports Phys Ther. 2009;39(3):179–87.

  63. Peolsson A, Hamp C, Albinsson A-K, Engdahl S, Kvist J. Test position and reliability in measurements of dorsal neck muscle endurance. Adv Physiotherapy. 2007;9(4):181–9.

  64. Petersen C, Johnson R, Schuit D. Reliability of cervical range of motion using the OSI CA 6000 spine motion analyser on asymptomatic and symptomatic subjects. Man Ther. 2000;5(2):82–8.

  65. Piva SR, Erhard RE, Childs JD, Browder DA. Inter-tester reliability of passive intervertebral and active movements of the cervical spine. Man Ther. 2006;11(4):321–30.

  66. Pourahmadi MR, Bagheri R, Taghipour M, Takamjani IE, Sarrafzadeh J, Mohseni-Bandpei MA. A new iPhone application for measuring active craniocervical range of motion in patients with non-specific neck pain: a reliability and validity study. Spine J. 2018;18(3):447–57.

  67. Rheault W, Albright B, Byers C, Franta M, Johnson A, Skowronek M, et al. Intertester reliability of the cervical range of motion device. J Orthop Sports Phys Ther. 1992;15(3):147–50.

  68. Röijezon U, Djupsjöbacka M, Björklund M, Häger-Ross C, Grip H, Liebermann DG. Kinematics of fast cervical rotations in persons with chronic neck pain: a cross-sectional and reliability study. BMC Musculoskel Dis. 2010;11(1):1–10.

  69. Roren A, Mayoux-Benhamou M-A, Fayad F, Poiraudeau S, Lantz D, Revel M. Comparison of visual and ultrasound based techniques to measure head repositioning in healthy and neck-pain subjects. Man Ther. 2009;14(3):270–7.

  70. Schneider GM, Jull G, Thomas K, Smith A, Emery C, Faris P, et al. Intrarater and interrater reliability of select clinical tests in patients referred for diagnostic facet joint blocks in the cervical spine. Arch Phys Med Rehabil. 2013;94(8):1628–34.

  71. Sebastian D, Chovvath R, Malladi R. Cervical extensor endurance test: a reliability study. J Bodyw Mov Ther. 2015;19(2):213–6.

  72. Shahidi B, Johnson CL, Curran-Everett D, Maluf KS. Reliability and group differences in quantitative cervicothoracic measures among individuals with and without chronic neck pain. BMC Musculoskel Dis. 2012;13(1):1–11.

  73. Stenneberg MS, Busstra H, Eskes M, van Trijffel E, Cattrysse E, Scholten-Peeters GG, et al. Concurrent validity and interrater reliability of a new smartphone application to assess 3D active cervical range of motion in patients with neck pain. Musculoskelet Sci Pract. 2018;34:59–65.

  74. Sterling M, Jull G, Carlsson Y, Crommert L. Are cervical physical outcome measures influenced by the presence of symptomatology? Physiother Res Int. 2002;7(3):113–21.

  75. Uddin Z, MacDermid JC, Galea V, Gross AR, Pierrynowski MR. Reliability indices, limits of agreement, and construct validity of the current perception threshold test in mechanical neck disorder. Crit Rev Phys Rehabil Med. 2013;25(3–4).

  76. Vernon H, Aker P, Aramenko M, Battershill D, Alepin A, Penner T. Evaluation of neck muscle strength with a modified sphygmomanometer dynamometer: reliability and validity. J Manipulative Physiol Ther. 1992;15(6):343–9.

  77. Werner IM, Ernst MJ, Treleaven J, Crawford RJ. Intra and interrater reliability and clinical feasibility of a simple measure of cervical movement sense in patients with neck pain. BMC Musculoskel Dis. 2018;19:1–9.

  78. Williams MA, Williamson E, Gates S, Cooke MW. Reproducibility of the cervical range of motion (CROM) device for individuals with sub-acute whiplash associated disorders. Eur Spine J. 2012;21:872–8.

  79. Ylinen J, Salo P, Nykänen M, Kautiainen H, Häkkinen A. Decreased isometric neck strength in women with chronic neck pain and the repeatability of neck strength measurements. Arch Phys Med Rehabil. 2004;85(8):1303–8.

  80. Youdas JW, Carey JR, Garrett TR. Reliability of measurements of cervical spine range of motion—comparison of three methods. Phys Ther. 1991;71(2):98–104.


Acknowledgements

We sincerely thank Mathieu Tripod for his contribution to the screening of titles and abstracts.


Funding

This research did not receive any specific funding.

Author information

Authors and Affiliations



Contributions

SE and RH performed the systematic search, the screening of the articles and the data extraction. CB contributed to the screening and data extraction process. RH conducted the statistical analyses. SE and RH analysed the results and drafted the manuscript. LA critically reviewed the manuscript. RB substantially revised the manuscript. All authors revised the manuscript and approved the final version.

Corresponding author

Correspondence to Simone Elsig.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1. Search strategy for Medline.

Additional file 2. Risk of bias, reliability: items of the assessment of risk of bias for reliability.

Additional file 3. Risk of bias, measurement error: items of the assessment of risk of bias for measurement error.

Additional file 4. GRADE downgrading, reliability: reasons for downgrading.

Additional file 5. GRADE downgrading, measurement error: reasons for downgrading.

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. A copy of this licence is available from Creative Commons. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Elsig, S., Allet, L., Bastiaenen, C.G. et al. Reliability and measurement error of sensorimotor tests in patients with neck pain: a systematic review. Arch Physiother 13, 15 (2023).
