Genetics and pathogenesis of idiopathic scoliosis

Idiopathic scoliosis (IS), the most common spinal deformity, affects otherwise healthy children and adolescents during growth. The aetiology is still unknown, although genetic factors are believed to be important. The present review corroborates the understanding of IS as a complex disease with a polygenic background. Presumably IS can be due to a spectrum of genetic risk variants, ranging from very rare or even private to very common. The most promising candidate genes are highlighted.


Background
Idiopathic scoliosis (IS), the most common form of spinal deformity, affects otherwise healthy children and adolescents during growth. It usually presents as a rib hump visible at forward bending, together with unleveled shoulders and an asymmetrical waist. According to Cobb [1], the diagnosis is confirmed by a standing spinal radiograph showing a lateral curvature of the spine exceeding 10°. The aetiology is still unknown, although hereditary and genetic factors are believed to be important.
A major concern in IS is the absence of reliable means by which to predict risk of progression, leading to frequent follow-ups, radiographs, and potentially unnecessary brace treatments. A further understanding of the pathogenesis and genetics in IS might help in identifying at-risk individuals, leading to an earlier diagnosis and possibly better preventive and therapeutic options. The aim of the present review is to give an overview of current research in the area; the literature search strategy is outlined in Table 1.

Clinical presentation
The prevalence of IS is approximately 2-3% worldwide [2][3][4]. Most individuals have small curvatures, girls and boys being equally affected. Approximately 10% progress to a moderate or severe curve [3,4]. Among those with severe curves, the percentage of boys is less than 10% [5]. The clinical manifestation or phenotype of IS is highly variable: the apex of the major curve may be thoracic, thoracolumbar or lumbar and the convexity may be either left or right-sided, with compensatory curvatures above and below. The most common form is a right thoracic convexity with a compensatory left lumbar convexity. A left thoracic convexity is uncommon and more often associated with asymptomatic neural axis abnormalities [6].
A young age at onset, large curvature at presentation, a thoracic curve pattern, and skeletal immaturity increase the likelihood of progression [7,8]. Thoracic curves in the skeletally immature individual have the highest risk of progression, 58-100% [8][9][10]. When the individual stops growing, the risk of progression diminishes. At skeletal maturity, curves of less than 30° carry a very small risk of progression. In contrast, curves that reach 50° tend to continue to progress throughout adulthood, at a rate of approximately 1° per year [9].

Aetiology and pathogenesis
The pathogenesis of scoliosis is poorly understood. It is not unreasonable to believe that an existing deformity might produce an asymmetrical loading of the growing spine, which in turn would cause asymmetrical growth of the vertebrae. But how does it start? And why is it progressive in some but not in others? Biomechanical, neural, metabolic and hormonal changes have been reported in IS, but it is difficult to say whether these are primary or secondary to the deformity. Various theories based on these findings have been suggested; some of these are outlined below.
In 1959, Thillard [11] discovered that pinealectomised chickens developed scoliosis. This was repeated in bi-pedalised rats and a deficiency of melatonin was suggested to be causative of IS [12,13]. Further studies, however, showed that adolescent IS patients had normal melatonin levels [14], and pinealectomised monkeys did not develop scoliosis [15]. Instead, a melatonin-signaling pathway dysfunction affecting only certain cell types, notably osteoblasts, was suggested [16,17]. Calmodulin, a calcium-binding receptor protein, regulates contractile properties in platelets and muscles, and interacts with melatonin. Increased levels of calmodulin in platelets and an asymmetrical distribution of calmodulin in paraspinal muscles compared to healthy controls have been described in IS patients [18,19].
Dickson et al. [20] showed that vertebral bodies were wedged in the sagittal plane in IS patients, causing an apical lordosis in thoracic curvatures. They suggested that this lordosis, in a region that is normally kyphotic, created a rotation of the spine and, secondarily, a lateral spinal curvature. On MRI scans of IS patients it has been shown that the spinal cord is shorter in relation to the vertebral column [21], that there is an increased prevalence of cerebellar tonsillar ectopia [22], as well as an uncoordinated growth of the vertebral bodies in relation to the dorsal elements [23], compared to controls. This has led to theories postulating a relative anterior spinal overgrowth (RASO) or an uncoupled neuro-osseous growth as a cause of IS [24].
As previously described, the risk of curve progression in IS is related to skeletal immaturity. It has also been shown that girls with adolescent IS are taller [25][26][27] and have a higher growth velocity during puberty compared to healthy controls [28][29][30]. Subsequently, bone mineral density, growth, and sex hormones have been studied in the pathogenesis of IS. Cheung et al. [27] showed that adolescent girls with IS had lower bone mineral density than healthy controls, and a higher bone turnover rate. In the same cohort, Hung et al. [31] found that low bone mineral density in the femoral neck was associated with curve progression.
Gerdhem et al. [32] showed a decreased level of COMP, cartilage oligomeric matrix protein, in serum in IS patients compared to controls. COMP has previously been associated with growth velocity in juvenile rheumatoid arthritis [33]. In addition, raised levels of growth hormone (GH) and insulin-like growth factor 1 (IGF-1) have been associated with IS [34,35], as well as lower circulating levels of leptin, the "satiety" hormone [36]. Oestrogen levels have also been studied, but with inconclusive results [37].

Heredity and genetics
It has long been known that hereditary factors play a role in the aetiology of IS. Inheritance of scoliosis in five generations was described by Garland in 1934 [38].
In 1968, Wynne-Davies [39] and in 1973 Riseborough and Wynne-Davies [40] reported on the familial occurrence of IS in British and American cohorts. The proportion of study participants having a relative with IS was 27 and 26%, respectively. The prevalence of scoliosis among first-degree relatives was 7% and 16%, which is significantly higher than in the general population. Tang et al. [41] showed a sibling recurrence risk of scoliosis of 18% in a Chinese cohort of 415 female adolescent IS patients with Cobb > 20°. In a cohort of 1,463 individuals with IS, we found that among those treated with a brace or surgery for scoliosis, 53% reported one or more relatives with scoliosis, compared to 46% of the untreated, pointing towards a slightly higher risk of treatment in the presence of a family history of scoliosis [42].
In addition, several twin studies have reported a higher concordance of IS (meaning that both twins have the disorder) in monozygotic compared to dizygotic twin pairs, indicating a genetic influence [43][44][45]. We have analysed self-reported data on scoliosis in twins (n = 64,578) in the population-based Swedish Twin Registry and estimated the relative importance of genetic effects on the phenotypic variance (i.e. the heritability) to be 38% [46].
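As an illustration of how twin data can inform a heritability estimate, the classical Falconer approximation is given below. This is a textbook simplification shown only for orientation; it is an assumption here and not necessarily the method used to obtain the 38% estimate in [46]. The symbols r_MZ and r_DZ denote the correlations in liability within monozygotic and dizygotic twin pairs, respectively.

```latex
h^2 \approx 2\,\bigl(r_{\mathrm{MZ}} - r_{\mathrm{DZ}}\bigr)
```

The intuition is that monozygotic twins share essentially all of their segregating genetic variation and dizygotic twins on average half, so doubling the difference in twin correlations approximates the share of phenotypic variance attributable to additive genetic effects.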
As a consequence there has been a vast amount of genetic research on IS. A short description of different approaches in genetic research as well as a summary of the findings on IS are given below.

Genetic approaches
Sequencing allows us to determine the nucleotide sequence of a DNA strand, and thus potentially discover new mutations or genetic variants. Sequencing a whole genome, however, yields immense amounts of data and requires large amounts of downstream bioinformatic analysis. Severe phenotypes could be assumed to be due to mutations in the protein-coding rather than the noncoding areas of the genome. One option could then be to sequence only the protein-coding regions, the so-called exome, which constitutes approximately 1% of the genome, Fig. 1. Genotyping, in contrast to sequencing, depends on the knowledge of known variants, for example single-nucleotide variants (SNVs), with known positions in the genome, Fig. 2. An assay is set up to test for the specific variant(s), meaning that one tests which of the possible alleles or versions of the variation the individual has at that specific point. Compared to sequencing, this is a very efficient method of finding out if a certain known variation is associated with a disease. Genotyping can be used in both genome-wide and candidate-gene approaches.

Table 1. Literature search strategy
We searched PubMed using the following search terms: idiopathic AND 'scoliosis' AND 'etiology' OR 'heredity' OR 'genetics' OR 'pathogenesis'.
Selection criteria: The reference lists of articles and reviews identified by this search strategy were scrutinized and references were included when judged relevant.

In genome-wide approaches, millions of variants throughout the genome are assayed simultaneously. This approach is useful when one has no prior hypothesis of what region or gene might be involved in the disease. However, it is rather expensive and yields massive amounts of data, and the criteria for significance of the data are often quite stringent due to a need for multiple testing correction. If there is a solid hypothesis of what gene(s) might be involved in the disorder, one can instead elect to test variations in this specific area only, a so-called candidate-gene approach. The latter approach is more straightforward and allows for a more detailed analysis of a candidate gene, but it is highly dependent on the initial assumptions of the study design. It would also not be helpful for discovering completely new and previously unsuspected disease mechanisms.
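The effect of multiple testing correction on the significance threshold can be sketched with a Bonferroni correction, one common and conservative choice. The family-wise error rate of 0.05, the test counts (one million for a genome-wide scan, ten for a candidate-gene study) and the function name are illustrative assumptions, not values taken from the studies reviewed here.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Hypothetical genome-wide scan: about one million independent tests
genome_wide = bonferroni_threshold(0.05, 1_000_000)
# Hypothetical candidate-gene study: ten variants in one gene
candidate = bonferroni_threshold(0.05, 10)

print(f"genome-wide threshold:    {genome_wide:.1e}")  # 5.0e-08
print(f"candidate-gene threshold: {candidate:.1e}")    # 5.0e-03
```

Under these assumptions a genome-wide study needs a p-value five orders of magnitude smaller than a ten-variant candidate-gene study to declare the same variant significant.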
Genotyping is used in both association and linkage studies. In association studies one compares the frequency of specific versions/forms/alleles of genetic variants in cases and controls. Association studies can establish whether common known genetic variants are associated with a disorder, even if they only have a weak effect on the phenotype or low penetrance. The existence of a variant in an individual is usually not diagnostic for the disease, but rather indicates an (often subtly) increased disease susceptibility. Even if a specific variant increases the person's susceptibility to a disease by only 5%, this can be a very important modulator of disease risk in the population if the variant is common.
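The comparison of allele frequencies in cases and controls can be sketched as a simple allelic test on a 2x2 table. The example below is a minimal illustration, assuming hypothetical allele counts and an uncorrected Pearson chi-square test; published studies typically use more elaborate models (for example logistic regression with covariates), and the function name is introduced here only for illustration.

```python
import math


def allelic_association(case_risk, case_other, control_risk, control_other):
    """Return (odds ratio, chi-square statistic, p-value) for a 2x2 table of allele counts."""
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    # Odds of carrying the risk allele in cases relative to controls
    odds_ratio = (a * d) / (b * c)
    # Pearson chi-square for a 2x2 table, without continuity correction
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


# Hypothetical counts: 1,000 case alleles and 1,000 control alleles
odds_ratio, chi2, p_value = allelic_association(600, 400, 500, 500)
print(f"OR = {odds_ratio:.2f}, chi2 = {chi2:.2f}, p = {p_value:.2g}")
```

An odds ratio above 1 indicates that the allele is more frequent among cases; whether the resulting p-value counts as significant depends on how many variants were tested.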
Linkage studies, on the other hand, analyse the cosegregation of a phenotype and a mutation in families. Both large and small families can be used. DNA markers or SNVs are analysed either at a certain point of interest or genome-wide. It is then possible to link a region of the genome with the phenotype. The advantage of linkage studies is that one does not need to know what one is looking for in advance, and a study of multiple families could yield a linkage signal in common even if the disease-causing mutations underlying the linkage signal differed between families. A limitation is that a strong correlation between the phenotype and genotype is needed (a high penetrance), making linkage a more powerful approach for phenotypes of more classical Mendelian inheritance (e.g. recessive or dominant). This type of study can have diagnostic value for members of families carrying a rare, monogenic disease, but the relevance of such findings for the general population is unclear.
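Evidence for cosegregation is conventionally summarised as a LOD score, the base-10 logarithm of the likelihood ratio between linkage at a recombination fraction theta and no linkage (theta = 0.5). The sketch below covers only the simplest case, assuming phase-known, fully informative meioses and complete penetrance; the counts and function name are hypothetical, and real pedigree analysis relies on dedicated software.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """LOD score for phase-known meioses at recombination fraction theta."""
    non_recombinants = meioses - recombinants
    likelihood = theta ** recombinants * (1 - theta) ** non_recombinants
    if likelihood == 0:
        return float("-inf")  # the observed data are impossible at this theta
    # Null hypothesis of no linkage: theta = 0.5 for every meiosis
    return math.log10(likelihood) - meioses * math.log10(0.5)


# Hypothetical family data: 1 recombinant among 10 informative meioses
thetas = [i / 100 for i in range(0, 51)]
best_theta = max(thetas, key=lambda t: lod_score(1, 10, t))
best_lod = lod_score(1, 10, best_theta)
print(f"max LOD = {best_lod:.2f} at theta = {best_theta:.2f}")
```

A LOD score of 3 or more is the traditional threshold for declaring linkage, so a single small family such as this hypothetical one would not suffice on its own; LOD scores from several families can be summed at the same theta.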

Exome sequencing
Baschal et al. [47] sequenced the exomes of three affected individuals in a multigenerational family with dominant Mendelian inheritance of IS. They identified a rare missense variant in HSPG2, which codes for an extracellular matrix protein also known as perlecan. They further sequenced exons of HSPG2 in 100 independent IS patients and found 21 other potentially damaging variants in HSPG2. Buchan et al. [48] exome-sequenced a cohort of 91 individuals with severe IS and compared the results with 337 controls. Using a gene burden analysis, they found that variants within the fibrillin 1 and 2 [...] Haller et al. [49] exome-sequenced 391 severe adolescent IS cases and 843 controls. Using a pathway burden analysis, they found that variants in extracellular matrix genes, especially in musculoskeletal collagen, were significantly enriched in adolescent IS cases compared to controls. Individual genotyping in 919 cases revealed a highly significant association with COL11A2, which encodes a fibrillar collagen.

Genome-wide association studies (GWAS)
Several genome-wide association studies in adolescent IS have been reported. The most interesting findings are shown in Table 2.
In 2011, Takahashi et al. [50] performed a large GWAS in a Japanese population and found an association with a variant downstream of the LBX1 (ladybird homeobox 1) gene. This finding was later replicated in both Chinese Han and Caucasian populations, as well as by us in a Scandinavian population [51][52][53]. The function of LBX1 is largely unknown but it is expressed in dorsal spinal neurons and hindbrain, muscle precursor cells, and certain cardiac crest cells [54][55][56][57][58]. Fernandez-Jaen et al. [59] reported a clinical case involving scoliosis and myopathy due to a microduplication in the chromosomal region of 10q24.31 affecting exclusively LBX1. Recently, Guo et al. [60] found that the associated variant facilitates transcription of LBX1 and that overexpression of LBX1 causes body axis deformation in zebrafish.
Kou et al. [61] found an association of GPR126 (G protein-coupled receptor 126) with IS in a GWAS in populations of Japanese, Han Chinese and European ancestry. This finding has been replicated in a small Chinese candidate gene study [62]. A knockdown of GPR126 in zebrafish caused delayed ossification of the developing spine [61].
Through an extended GWAS and replication studies using independent Japanese and Chinese populations, the same group found an association of a variant in the BNC2 (basonuclin-2) gene, which encodes a zinc finger transcription factor. BNC2 overexpression induced body curvature in developing zebrafish in a gene-dosage-dependent manner [63].
In a GWAS of severe cases of adolescent IS in the Japanese and Chinese populations, Miyake et al. [64] found an association with the variant rs12946942 on chromosome 17q24.3 near the genes SOX9 and KCNJ2. Mutations within these genes are associated with campomelic dysplasia and Andersen-Tawil syndrome, respectively, both of which include a scoliotic phenotype in addition to other symptoms.
Ward et al. [65] identified 53 variants associated with curve progression of adolescent IS in a GWAS that is not yet published and validated them in a Caucasian cohort. They suggested that these variants could be useful for predicting progression of scoliosis. However, the association of these variants to progression of scoliosis has not been replicated in either a Japanese or a French-Canadian cohort [66,67].
By performing a GWAS of 3,102 individuals, Sharma et al. [68] identified a significant association between a locus distal to PAX1 and IS in female patients. PAX1 codes for Paired box 1, a transcription factor involved in spine development. The associated locus has shown enhancer activity in zebrafish somitic muscle and spinal cord.

Candidate gene association studies
Inspired by speculations on the pathogenesis of IS, various candidate genes related to bone metabolism, connective tissue, the melatonin-signaling pathway, growth and sex hormones have been investigated [69]. Most of these associations have not been replicated in later larger studies [69][70][71][72][73][74]. Recent candidate gene studies have shown an association between IL-17RC (interleukin 17 receptor C), TGFB1 (transforming growth factor beta 1), genes correlated with peak height velocity during puberty, DOT1L and C17orf67, and IS [75][76][77].

Linkage and inheritance models
Several genome-wide linkage studies have been performed on IS families [69]. Autosomal dominant, X-linked dominant and autosomal recessive models of inheritance have all been suggested and different [...] [79]. Edery et al. [80] suggested linkage to the regions 3q12.1 and 5q13.3 in a multigenerational family. In a follow-up using exome sequencing of three affected members of this family, a novel rare missense variant in POC5, which encodes a centriolar protein, was discovered. This variant caused spine deformity in a zebrafish model [81].

Other approaches
Fendri et al. [82] compared mRNA expression in primary osteoblasts from vertebrae in adolescent IS patients and healthy controls and found 145 genes differentially expressed in osteoblasts. The most significant changes in expression levels were observed in homeobox genes, as well as in ZIC2, FAM101A, COMP and PITX1. These genes interact in the biological pathways of bone development, particularly in the differentiation of skeletal elements and the structural integrity of the vertebrae [82]. Buchan et al. [83] reported rare copy number variations (CNVs) in a cohort of 143 IS patients. These genes have not previously been investigated in IS.

Drawbacks
Genetic studies are limited by their design. A genetic approach focused on finding common variants (GWAS) will not reveal rare variants. Linkage studies of a family, on the other hand, may identify disease-causing variants in that specific family, but these findings might not be applicable to the majority of patients. As the pathogenetic mechanisms of idiopathic scoliosis are not well known, a number of different genetic approaches have been used, and the various genetic findings reported reflect the chosen methods. However, rather than being contradictory, the results of well-conducted studies may be interpreted as small pieces of a large genetic puzzle. The limitations of the present review include the possibility that we have failed to analyse some studies on the subject and that the conclusion is a result of our understanding of the field rather than evidence.

Conclusion
The present review corroborates the understanding of IS as a complex disease with a polygenic background. IS can presumably be due to a spectrum of genetic risk variants, ranging from very rare or even private to very common in the general population. The risk effect of the variants could also range from quite severe to very mild and even undetectable in practice. The most promising common gene variant discovered so far is rs11190870 downstream of the LBX1 gene. This variant has been shown to increase disease susceptibility in several populations and a plausible effect mechanism has been presented [50][51][52][53]60]. An intronic variant within GPR126 and the intergenic variant rs12946942 have also been replicated in different populations; however, the precise functional effect of these variants has yet to be elucidated [61,64]. Two other common variants of interest have recently been found to be associated with IS, but have yet to be replicated (Table 2) [63,68]. New methodologies such as exome sequencing have made it possible to identify rare variants associated with idiopathic scoliosis [47][48][49]81]. The importance of these findings in the general idiopathic scoliosis population, however, remains to be seen. One future strategy for revealing the pathogenic mechanism underlying scoliosis might be to study families with monogenic IS in order to find a causative mutation. Such a finding would not explain the specific genetic background in the general IS population, but might reveal biological pathways that are important in all or most forms of IS. In addition, there are several genetic syndromes, of both known and unknown causes, in which scoliosis is part of the phenotype. Further studies of these disorders could add information on the pathogenic mechanism of scoliosis development. Yet another possibility is international collaboration, collecting even larger cohorts of IS patients. A large sample size would better enable us to find associations with rare variants.