OUP user menu

The identification of gene expression profiles associated with progression of human diabetic neuropathy

Junguk Hur, Kelli A. Sullivan, Manjusha Pande, Yu Hong, Anders A. F. Sima, Hosagrahar V. Jagadish, Matthias Kretzler, Eva L. Feldman
DOI: http://dx.doi.org/10.1093/brain/awr228 3222-3235 First published online: 16 September 2011


Diabetic neuropathy is a common complication of diabetes. While multiple pathways are implicated in the pathophysiology of diabetic neuropathy, there are no specific treatments and no means to predict diabetic neuropathy onset or progression. Here, we identify gene expression signatures related to diabetic neuropathy and develop computational classification models of diabetic neuropathy progression. Microarray experiments were performed on 50 samples of human sural nerves collected during a 52-week clinical trial. A series of bioinformatics analyses identified differentially expressed genes and their networks and biological pathways potentially responsible for the progression of diabetic neuropathy. We identified 532 differentially expressed genes between patient samples with progressing or non-progressing diabetic neuropathy, and found these were functionally enriched in pathways involving inflammatory responses and lipid metabolism. A literature-derived co-citation network of the differentially expressed genes revealed gene subnetworks centred on apolipoprotein E, jun, leptin, serpin peptidase inhibitor E type 1 and peroxisome proliferator-activated receptor gamma. The differentially expressed genes were used to classify a test set of patients with regard to diabetic neuropathy progression. Ridge regression models containing 14 differentially expressed genes correctly classified the progression status of 92% of patients (P < 0.001). To our knowledge, this is the first study to identify transcriptional changes associated with diabetic neuropathy progression in human sural nerve biopsies and describe their potential utility in classifying diabetic neuropathy. Our results identifying the unique gene signature of patients with progressive diabetic neuropathy will facilitate the development of new mechanism-based diagnostics and therapies.

  • biomarkers
  • diabetic neuropathy
  • classification model
  • sural nerve
  • gene expression


Twenty-five million Americans have diabetes and the incidence is increasing by 5% annually (http://www.diabetes.org). Peripheral neuropathy occurs in ∼60% of these patients (Edwards et al., 2008). An additional 57 million Americans have impaired glucose tolerance, or pre-diabetes, and up to 30% of these patients exhibit peripheral neuropathy at diagnosis. Diabetic neuropathy is characterized by progressive loss of peripheral nerve axons, resulting in decreased sensation, pain and eventually complete loss of sensation (Edwards et al., 2008). In patients with diabetic neuropathy, nerve damage precedes the development of noticeable symptoms and diabetic neuropathy is almost never detected until clinically obvious symptoms and signs appear and irreversible damage has already occurred. Despite over 20 years of intense research, there are no effective treatment options to prevent, slow or reverse the progression of diabetic neuropathy other than control of blood glucose levels, and this is not always achievable even in a vigilant patient.

While modest regeneration has been documented in diabetic neuropathy, this regeneration fails over time with resulting loss of peripheral nerve fibres (Kennedy and Zochodne, 2005); therefore, our goal is to identify individuals at high risk of developing diabetic neuropathy in order to develop strategies to prevent or slow disease onset and progression. A number of dysregulated metabolic pathways have been documented in both human patients and experimental models of diabetic neuropathy including changes in osmotic balance within neurons and nerve fibres (Stevens et al., 1993), polyol synthesis and accumulation (Chung et al., 2003), oxidative stress (Vincent et al., 2004), changes in nerve blood flow (Cameron and Cotter, 2001), altered insulin and protein kinase C signalling (Yamagishi et al., 2008; Obrosova, 2009), formation of advanced glycation end products (Stern et al., 2002) and cellular inflammation (Wang et al., 2006). In the mid 1990s, many of these pathways were addressed in clinical trials including antioxidant therapies (Ford et al., 2001), aldose reductase inhibitors [tolrestat (Sima et al., 1993), epalrestat (Goto et al., 1995) and sorbinil (Krans, 1992)] and acetyl-carnitine supplementation; however, none were approved in the USA. While some biochemical and symptomatic effects were noted, none of these therapies reversed decreases in nerve conduction velocity or loss of myelinated fibre density.

We are in possession of a unique repository of human nerve biopsies harvested as part of one of these clinical trials. In the initial study, whole sural nerve biopsies, blood chemistries, electrophysiology and sensory data were collected from participants in a large randomized placebo-controlled clinical trial testing acetyl-l-carnitine for diabetic neuropathy treatment (Sima et al., 2005; Wiggin et al., 2009). Acetyl-l-carnitine treatment was efficacious in alleviating pain symptoms; however, no improvement in sural nerve conduction velocities, amplitudes or myelinated fibre density was observed (Sima et al., 2005). Our initial demographic analyses revealed that after correcting for baseline diabetic neuropathy severity and clinical factors, such as gender, age, duration and types of diabetes, insulin treatment, acetyl-l-carnitine treatment, and haemoglobin A1c, only elevated serum triglycerides measured at study onset correlated with diabetic neuropathy progression (Wiggin et al., 2009).

The acetyl-l-carnitine trial took place between 1995 and 1996 and was the first clinical trial to use high-throughput transmission electron microscopy to examine regenerating nerve fibres. At that time, molecular biology tools that are commonly used today were not available. In the current study, we combine modern microarray and bioinformatics techniques to examine the remaining biopsy material and extend the basic biochemical and anatomical analyses that were performed initially. We report the first high-throughput genome-wide expression study of human sural nerve biopsies obtained from patients with diabetic neuropathy to identify genes and pathways altered over the course of diabetes. Gene expression profiles were examined in sural nerve samples from two groups of patients with either fast progressing or slow/non-progressing diabetic neuropathy by high-throughput Affymetrix gene expression microarrays. A series of bioinformatics tools analysed differential gene expression profiles between the two groups and revealed gene networks linked to the progression of diabetic neuropathy. Computational predictive models, based on the expression profiles of selected genes, correctly classified patients as exhibiting either progressing or non-progressing diabetic neuropathy.

It is highly unlikely that future studies of diabetic neuropathy in patients will include the collection of sural nerve biopsies; therefore, it is imperative to fully examine these biopsies using the most current and advanced tools available. To our knowledge, the data presented here provide the first correlative measures of human diabetic neuropathy progression based on gene expression profiles and may be used to explore new pathways underlying disease pathogenesis. The identification of genes encoding secreted proteins may further provide a unique starting point for targeted serum biomarker development. Our goal is to identify patients at risk for rapid diabetic neuropathy progression and to someday intervene prior to diabetic neuropathy development.

Materials and methods

Human sural nerve samples

Human sural nerve biopsies were obtained during a double-blind, placebo-controlled, 52-week clinical trial of acetyl-l-carnitine treatment of diabetic neuropathy (Sima et al., 2005; Wiggin et al., 2009). Patient screening was described by Greene et al. (1999) and consisted of a medical history and physical examination, a detailed neurological examination and measures of nerve conduction velocity, vibration perception threshold and heart rate variability measurements (Greene et al., 1999). For study inclusion, patients had to be between 18 and 70 years of age, diagnosed with either type 1 or type 2 diabetes mellitus for at least 2 years, glycosylated haemoglobin levels between 7.1% and 14.0% on stable anti-diabetic therapy for 3 months or more and 200% of their ideal body weight with good nutrition (Greene et al., 1999). All patients demonstrated clinically evident, mild to moderate diabetic neuropathy as defined by the presence of at least two neurological findings: (i) symptoms consistent with a distal symmetric polyneuropathy; (ii) signs consistent with a distal symmetric polyneuropathy; and (iii) abnormal nerve function tests including abnormal nerve conduction velocity in at least two nerves or abnormal vibration perception threshold or abnormal heart rate (Greene et al., 1999). Bilaterally detectable sural sensory nerve action potentials and vibration perception threshold in the great toes were required for study inclusion in order to exclude subjects with advanced nerve damage (Greene et al., 1999). Subjects were also excluded for known non-diabetic causes of peripheral neuropathy, unstable pre-proliferative or proliferative retinopathy, serious systemic disease including vasculitis or inflammatory disease, history of carcinoma in the last 5 years, serum creatinine >2.0 mg/dl, creatinine clearance <50 ml/min, proteinuria >3 g/day, or serum uric acid >10 mg/dl (American Diabetes Association, 1988; Greene et al., 1999).

Prior to treatment, whole sural nerve biopsies (5–6 cm, Week 0; denoted as the primary sample) were harvested from the distal calf ∼10 cm proximal to the lateral malleolus. The nerve sample was immediately divided into five 1-cm segments; three segments frozen in liquid nitrogen and two segments fixed in 0.1 M cacodylate buffered (pH 7.3) with 2.5% glutaraldehyde and shipped overnight to the University of Michigan Nerve Biopsy Laboratory. Following 52 weeks of treatment, measures of diabetic neuropathy were re-assessed and a second whole sural nerve biopsy from the contralateral leg was harvested (Week 52; denoted as the secondary sample). Frozen nerve segments not used in the initial study were stored at −80°C.

A blood sample was collected at the time of patient enrolment and the following measures were recorded: haemoglobin A1c, haematocrit, serum triglycerides and cholesterol and albumin. These samples were exhausted by the initial analysis and no further blood or urine samples were collected at study conclusion.

Our post hoc analysis classified the patient samples into two groups, progressors and non-progressors, based on myelinated fibre density. Patient samples in the progressor group lost ≥500 fibres/mm2, while patient samples in the non-progressor group lost ≤100 fibres/mm2 over 52 weeks (Wiggin et al., 2009). Primary and secondary biopsies from 36 patients (18 progressors and 18 non-progressors) were used in this study. The selection of patient samples from each group was adjusted for myelinated fibre density at trial onset, insulin treatment, gender and type of diabetes. The use of the human sural nerve samples was approved by the Institutional Review Board for Human Subject Research at the University of Michigan.

RNA preparation for microarray

Total RNA was isolated from a 1-cm segment of each sural nerve biopsy using a commercially available kit (RNeasy® Mini Kit; QIAGEN, Inc.), including an on-column deoxyribonuclease digestion and following the manufacturer's protocol. RNA quality and quantity were assessed by microfluid electrophoresis using an RNA 6000 Pico LabChip on a 2100 Bioanalyzer (Agilent Technologies, Inc.). Fifty samples (14 primary and 36 secondary) with a minimum RNA integrity number of 6.5 were used for microarray hybridization (Fig. 1).

Figure 1

Primary and secondary biopsy selection. Primary and secondary biopsies of 36 patients with diabetic neuropathy (DN) were included in this study. Samples with a minimum RNA integrity number (RIN) of 6.5 were used for microarray hybridization. Microarrays of 12 primary (five progressors and seven non-progressors) and 35 secondary (18 progressors and 17 non-progressors) samples were used for differential gene expression analysis and diabetic neuropathy progression modelling. btw = between; P = progressor; NP = non-progressor.

Affymetrix microarrays

Total RNA (75 ng) from each sample was amplified and biotin labelled using the Ovation Biotin-RNA Amplification and Labelling System (NuGEN Technologies, Inc.) according to the manufacturer's protocol. The University of Michigan Comprehensive Cancer Centre Affymetrix and Microarray Core Facility (University of Michigan) performed the amplification and hybridization using the Affymetrix GeneChip Human Genome U133 Plus 2.0 Array. Intensities of target hybridization to respective probe features were detected by laser scan of the array. Image files were generated by Affymetrix GeneChip software (MAS5).

Data analysis

Quality assessment and data preprocessing

The Affymetrix CEL files were analysed using a local version of the GenePattern genomic analysis platform from the Broad Institute (http://www.broadinstitute.org/cancer/software/genepattern) (Reich et al., 2006). The samples were Robust Multi-array Average normalized using the BrainArray Custom CDF HGU133Plus2_Hs_ENTREZG version 12 (http://brainarray.mhri.med.umich.edu) (Dai et al., 2005). Microarray quality was assessed using the probe-level modelling and quality metrics provided by the Affy package of BioConductor (Bolstad et al., 2003; Irizarry et al., 2003; Gautier et al., 2004). Three outlier arrays (two from the primary progressor group and one from the secondary non-progressor group) that did not cluster with other arrays in principal component analysis results were excluded from further analyses.

Identification of differentially expressed genes

Two independent analysis platforms were employed to identify differentially expressed genes between different biological groups (secondary biopsies; progressors and non-progressors): the GenePattern platform using the standard robust multi-array average-based probe set approach and ChipInspector (CI; version 2.1; Genomatix Software GmbH). The robust multi-array average approach averages normalized expression levels across all probes for the gene (probe set level analysis) whereas Genomatix ChipInspector calculates the change in each probe (probe level analysis) (Cohen et al., 2008). Genes were deemed as ‘differentially expressed genes’ using Cyber-T (Baldi and Long, 2001), based on a Bayesian regularized t-test, P < 0.05 in the robust multi-array average approach and a false discovery rate (FDR) <0.1% using ChipInspector (Cohen et al., 2008) with a minimum of four probes per transcript.

Functional enrichment analyses

The Database for Annotation, Visualization and Integrated Discovery (DAVID) (http://david.abcc.ncifcrf.gov) (Huang da et al., 2009a, b) and ConceptGen (http://conceptgen.ncibi.org) (Sartor et al., 2010) were used to identify over-represented biological functions and pathways among the differentially expressed genes.

Network analysis

A gene co-citation network of the differentially expressed genes was generated by Genomatix BiblioSphere (Genomatix Software GmbH) using a sentence level co-citation filter. This network allows visualization of the differentially expressed genes and their potential associations with each other identified in the literature. The topology of the network was further analysed using the Fast Greedy community-structure identification algorithm, implemented in the Cytoscape plug-in GLay (http://brainarray.mbni.med.umich.edu/sugang/glay) (Su et al., 2010) to identify coherent subnetworks. Identified subnetworks were subjected to functional enrichment analyses by DAVID to reveal over-represented biological functions within each subnetwork.

Computational modelling using gene expression profiles

Differentially expressed gene expression was evaluated for the ability to classify progressors versus non-progressors using ridge regression modelling (Shedden et al., 2008; Ju et al., 2009; Yoshihara et al., 2010). The gene expression profiles of the secondary samples, excluding those samples paired with primary samples, were used as the training set (13 progressors and 11 non-progressors). The expression profiles of the primary samples were used as the testing set (five progressors and seven non-progressors). Three different sets of differentially expressed genes were used in the initial models; Set 1 included all 532 differentially expressed genes, Set 2 contained 63 differentially expressed genes with a minimum fold-change of 1.5, and Set 3 included 10 differentially expressed genes with a minimum fold-change of 2. To identify a set of genes with the least number of genes but with high classification accuracy, the genes from Set 2 were added to Set 3 one at a time until the accuracy of the expanded set reached the maximum level. The statistical significance of each model was evaluated by repeating the classification procedures 1000 times by randomizing class labels.

Array data validation using real-time reverse transcriptase–polymerase chain reaction

The expression of eight differentially expressed genes identified by microarray was confirmed by real-time reverse transcriptase–polymerase chain reaction performed on 10 independent samples from each secondary group (progressor and non-progressor). Reverse transcription was performed using iScript™ cDNA Synthesis kit (Bio-Rad). Real-time polymerase chain reaction amplification and SYBR® green fluorescence detection were performed using iCycler iQ™ Real-time Detection System (Bio-Rad Laboratories). The fluorescence threshold value (CT) was calculated using iCycler iQ™ system software and the levels were normalized to an endogenous reference gene: TATA box binding protein. A Pearson correlation coefficient was calculated for each gene between the log2-transformed expression values as measured by microarray and the negative of the CT by reverse transcriptase–polymerase chain reaction (Gyorffy et al., 2009).


Sample information

Patient information regarding type and duration of diabetes, gender, body mass index, circulating lipids and measurement of neuropathy (the O’Brien score and myelinated fibre density) is provided in Table 1. The only significant difference between the progressor and non-progressor groups was the change in myelinated fibre density over 52 weeks. Eighty per cent of the study participants had type 2 diabetes and 61% were treated with insulin.

View this table:
Table 1

Patient characteristics (n = 36)

Diabetes type
    Type 1341
    Type 21514
Acetyl-l-carnitine treatment
Insulin treatment
Age (years)54.7 ± 12.952.2 ± 10.30.524
Diabetes duration (years)10.8 ± 7.212.0 ± 7.30.622
Body mass index (kg/cm2)30.0 ± 5.731.6 ± 10.30.568
Haemoglobin A1C (%)8.9 ± 1.69.2 ± 1.40.449
Triglyceride (mmol/l)1.8 ± 0.82.7 ± 1.90.088
Cholesterol (mmol/l)5.5 ± 0.85.5 ± 0.90.938
O’Brien score4179.7 ± 772.23854.5 ± 860.10.241
MFD base (fibres/mm2)5133.2 ± 1139.25132.8 ± 1450.80.999
MFD 52 Weeks (fibres/mm2)5256.8 ± 1200.04066.6 ± 1538.70.014*
MFD change (fibres/mm2)123.6 ± 209.2−1066.2 ± 391.13.84E-13***
  • Continuous variables are reported as mean ± standard deviation. P-values were calculated by two-sample t-test for continuous variables and Fisher’s exact test for categorical variables (*P < 0.05, ***P < 0.001).

  • MFD = myelinated fibre density.

Microarray quality assessment and identification of differentially expressed genes

Fifty samples [14 primary (seven progressors and seven non-progressors) and 36 secondary (18 progressors and 18 non-progressors) samples] met the RNA quality criteria and were hybridized to Affymetrix gene expression microarrays. Excluding two outliers and one mislabelled sample identified during the quality assessment process, 47 microarrays [12 primary (five progressors and seven non-progressors) and 35 secondary (18 progressors and 17 non-progressors)] were used in our analyses (Fig. 1).

The changes in gene expression described below represent changes between secondary biopsies from the progressor (n = 18) and non-progressor (n = 17) groups. In at least one sample, 14 885 genes were expressed above background. Among them, 558 genes had a Bayesian P < 0.05, while 4899 genes had a ChipInspector FDR < 0.1%. Only 532 genes deemed as differentially expressed genes by both methods were included for further analyses. Real-time reverse transcriptase–polymerase chain reaction demonstrated a positive correlation with the microarray data in all of the eight selected differentially expressed genes (Supplementary Table 1).

While acetyl-l-carnitine had no effect on myelinated fibre density or nerve conduction velocity, an assessment of potential effects on gene expression was performed. One hundred and forty-eight and 859 differentially expressed genes were identified between the acetyl-l-carnitine-treated and the non-treated samples in the non-progressing and progressing groups, respectively. Among these, 18 and 73 genes were also detected in the set of 532 differentially expressed genes identified between the progressing and non-progressing groups (∼3.4 and 13.7%, respectively). Four genes were common between these sets of genes; however, none exhibit consistent directional change, i.e. either up- or downregulation in both non-progressors and progressors (Supplementary Table 2). We conclude that it is unlikely that the initially identified 532 differentially expressed genes are significantly affected by acetyl-l-carnitine treatment.

Functional enrichment analyses

Functional enrichment analyses of the 532 differentially expressed genes were performed to identify over-represented biological functions using Gene Ontology terms and pathways. DAVID identified 31 and 168 over-represented biological functions among the up- and downregulated differentially expressed genes in the progressor group, respectively (DAVID P < 0.05). Table 2 lists a selected subset of the over-represented biological functions; the upregulated genes in progressors (i.e. downregulated in non-progressors) were enriched in ‘extracellular region’, ‘defence response’ and ‘inflammatory response’ (Table 3), while downregulated genes in progressors (i.e. upregulated genes in non-progressors) were enriched in energy metabolism-related functions such as ‘glucose metabolic process’, ‘peroxisome proliferator-activated receptor (PPAR) signalling pathway’ and ‘regulation of lipid metabolic process’ (Table 4).

View this table:
Table 2

Over-represented biological functions in differentially expressed genes

Biological functionGene countP-valueEnrichment fold
Upregulated in progressors
    Extracellular region586.86E-071.9
    Prostanoid metabolic process42.24E-0314.9
    Defence response185.99E-032.1
    Inflammatory response126.24E-032.6
    Regulation of axonogenesis58.34E-036.2
    Response to wounding151.75E-022.0
Downregulated in progressors
    Chemical homeostasis208.25E-063.3
    Glucose metabolic process111.21E-056.1
    Glycerolipid metabolic process112.01E-055.8
    PPAR signalling pathway87.04E-057.6
    Regulation of lipid metabolic process94.93E-056.8
    Response to insulin stimulus81.67E-046.8
  • DAVID identified 31 and 168 over-represented biological functions among the up- and downregulated differentially expressed genes (DAVID P < 0.05). The table lists a subset of over-represented biological functions regulated in the progressors.

View this table:
Table 3

Differentially expressed genes related to defence response and inflammatory response (upregulated genes in progressors)

Entrez IDSymbolDescriptionP-valueFold-change
Defence response
    136ADORA2BAdenosine A2b receptor0.01331.4
    2788GNG7Guanine nucleotide binding protein (G protein), gamma 70.03361.2
    7033TFF3Trefoil factor 3 (intestinal)0.01271.5
    23601CLEC5AC-type lectin domain family 5, member A0.02071.7
    57817HAMPHepcidin antimicrobial peptide0.00252.4
    81035COLEC12Collectin subfamily member 120.01521.2
Inflammatory response
    140ADORA3Adenosine A3 receptor0.03271.3
    624BDKRB2Bradykinin receptor B20.02731.2
    3075CFHComplement factor H0.00101.2
    4282MIFMacrophage migration inhibitory factor (glycosylation-inhibiting factor)0.02821.1
    4973OLR1Oxidized low density lipoprotein (lectin-like) receptor 10.00031.6
    7852CXCR4Chemokine (C-X-C motif) receptor 40.03881.3
    10344CCL26Chemokine (C-C motif) ligand 260.00043.2
    25824PRDX5Peroxiredoxin 50.02001.1
    53833IL20RBInterleukin 20 receptor beta0.01851.3
    57834CYP4F11Cytochrome P450, family 4, subfamily F, polypeptide 110.03721.4
    148022TICAM1Toll-like receptor adaptor molecule 10.03791.2
  • Upregulated genes in progressors were enriched in biological functions such as ‘extracellular region’, ‘defence response’ and ‘inflammatory response’.

  • The table lists 18 differentially expressed genes related to ‘defence response’, including 12 ‘inflammatory response’ differentially expressed genes.

View this table:
Table 4

Differentially expressed genes related to lipid metabolism and PPAR signalling pathway (downregulated genes in progressors)

Entrez IDSymbolDescriptionP-valueFold-change
Regulation of lipid metabolic process
    348APOEApolipoprotein E0.02660.8
    2180ACSL1Acyl-CoA synthetase long-chain family member 10.03790.8
    5140PDE3BPhosphodiesterase 3B, cGMP-inhibited0.01260.7
    5468PPARγPeroxisome proliferator-activated receptor gamma0.04260.7
    8660IRS2Insulin receptor substrate 20.01180.9
    9370ADIPOQAdiponectin, C1Q and collagen domain containing0.00940.7
    51085MLXIPLMLX interacting protein-like0.00830.7
    57104PNPLA2Patatin-like phospholipase domain containing 20.02060.8
    100129500LOC100129500Hypothetical LOC1001295000.04850.7
PPAR signalling pathway
    948CD36CD36 molecule (thrombospondin receptor)0.03730.7
    2180ACSL1Acyl-CoA synthetase long-chain family member 10.03790.8
    4199ME1Malic enzyme 1, NADP(+) dependent, cytosolic0.01810.8
    5105PCK1Phosphoenolpyruvate Carboxykinase 1 (soluble)0.04970.8
    5468PPARγPeroxisome proliferator-activated receptor gamma0.04260.7
    6319SCDStearoyl-CoA desaturase (delta-9-desaturase)0.02410.7
    9370ADIPOQAdiponectin, C1Q and collagen domain containing0.00940.7
  • Downregulated genes in progressors were over-represented with energy metabolism-related functions such as ‘glucose metabolic process’, ‘PPAR signalling pathway’ and ‘regulation of lipid metabolic process’.

  • The table lists differentially expressed genes related to ‘regulation of lipid metabolic process’ and ‘PPAR signalling pathway’.

ConceptGen identified similar sets of over-represented Gene Ontology terms and pathways, whose associations are visually represented (Fig. 2). ConceptGen also identified Medical Subject Headings terms highly associated with the differentially expressed genes. The downregulated differentially expressed genes were enriched with Medical Subject Headings terms such as lipids, fatty acids, triglycerides, cholesterol and insulin, suggesting decreased energy metabolism in rapidly progressing diabetic neuropathy.

Figure 2

A network of over-represented biological concepts identified by ConceptGen. The concepts (gene sets) over-represented in the upregulated genes (A) and downregulated genes (B) in progressors. The centre nodes in violet, titled as ‘GP-CI-Common-SP-SN…’ refer to the differentially expressed genes. DEGs = differentially expressed genes; MeSH = Medical Subject Headings.

Network analysis

A literature-derived gene network of the differentially expressed genes was created by BiblioSphere based on sentence level co-citations of differentially expressed genes to examine their potential relationships (Fig. 3). This network is composed of subnetworks centred on the five most connected genes: the transcription factor jun (JUN), leptin (LEP), serpin peptidase inhibitor E Type 1 (SERPINE1), apolipoprotein E (APOE) and peroxisome proliferator-activated receptor gamma (PPARγ). The complete network was further analysed by Fast Greedy algorithm (Clauset et al., 2004), implemented in the Cytoscape plug-in GLay (Su et al., 2010) (Supplementary Fig. 1), to cluster the genes into subgroups based on their network structure. Six clusters with a minimum of eight genes were identified. DAVID identified representative biological functions within each cluster: cell death and inflammatory response (Cluster 1), glucose and lipid metabolism (Cluster 2), cell projection and axonogenesis (Cluster 3), cellular homeostasis and cofactor metabolic process (Cluster 4), cytoskeletal protein binding (Cluster 5), and Wnt receptor signalling pathway (Cluster 6). A simplified network including only nodes with a minimum of five connections is illustrated in Fig. 4.

Figure 3

Gene co-citation network of differentially expressed genes generated by BiblioSphere. A literature-derived gene network of the differentially expressed genes was created by BiblioSphere using sentence level co-citations of differentially expressed genes. The network is composed of five subnetworks centred on the five most connected genes: JUN, PPARγ, LEP, SERPINE1 and APOE.

Figure 4

Gene co-citation network clustered by fast-greedy community structuring algorithm. The complete co-citation network of the differentially expressed genes was clustered based on network topology by Fast Greedy algorithm implemented in the Cytoscape GLay plug-in (Supplementary Fig. 1). The figure contains only the nodes with a minimum of five connections (edges) in the complete network. The size of node represents the number of edges. Nodes (genes) highlighted in red or yellow refer to the highly connected genes; red for the central genes in the initial BiblioSphere co-citation network (Fig. 3) and yellow for additionally identified highly connected genes in the GLay-identified clusters.

Computational modelling of diabetic neuropathy progression

Ridge regression models were developed to classify the status of patient samples as progressors/non-progressors based on the gene expression profiles (Fig. 5). Both models using differentially expressed gene Set 1 (532 genes) and Set 2 (63 genes) achieved a classification accuracy of 92% (11/12) (Supplementary Table 3). In order to create a model using the smallest set of differentially expressed genes but with the same accuracy, individual differentially expressed genes from Set 2 were sequentially added to Set 3 until the new set achieved the original classification accuracy (92%). Four models of 14 genes achieved an accuracy of 92% (Supplementary Table 4); each model includes the 10 base differentially expressed genes from Set 3 and four additional genes from Set 2 (Table 5). All four models demonstrated P-values <0.001 from the 1000 repeated classification procedures using randomly shuffled class labels. The positive predictive values of these models ranged from 83% to 100%, and the negative predictive values were from 86% to 100% (Supplementary Table 4).

Figure 5

Biopsy selection for ridge regression modelling of diabetic neuropathy. The secondary samples, excluding those samples with paired primary samples, were used as the training set (13 progressors and 11 non-progressors) for diabetic neuropathy classification modelling, and the primary samples were used as the testing set (five progressors and seven non-progressors). P = progressor; NP = non-progressor.

View this table:
Table 5

The gene content of the four computational models with 14 genes

Model#EntrezIDSymbolDescriptionCyber-T P-valueFold-changeDetectable biofluids
1469CST1Cystatin SN0.034910.0Saliva
10804GJB6Gap junction protein, beta 6, 30 kDa0.00115.5
10344CCL26Chemokine (C-C motif) ligand 260.00043.2Blood
10647SCGB1D2Secretoglobin, family 1D, member 20.00332.9Urine
57817HAMPHepcidin antimicrobial peptide0.00252.4Blood, urine
6036RNASE2Ribonuclease, RNase A family, 2 (liver, eosinophil-derived neurotoxin)0.00882.4Blood, urine
3860KRT13Keratin 130.03152.4Blood
4741NEFMNeurofilament, medium polypeptide0.00362.1
1412CRYBA2Crystallin, beta A20.00652.1
80763C12orf39Chromosome 12 open reading frame 390.00570.4
21381CRABP1Cellular retinoic acid binding protein 10.03611.8Blood, urine
111341SCRG1Stimulator of chondrogenesis 10.00961.7
14163933FAM43BFamily with sequence similarity 43, member B0.02681.7
410562OLFM4Olfactomedin 40.00131.6Urine
355825PECRPeroxisomal trans-2-enoyl-CoA reductase0.00430.7
151085MLXIPLMLX interacting protein like0.00830.7
23153918FAM164BFamily with sequence similarity 164, member B0.00520.6
33112HLA-DOBMajor histocompatibility complex, class II, DO beta0.03560.6Blood
123456605ERO1LBERO1-like beta (S. cerevisiae)0.00140.6
2225ABCD2ATP-binding cassette, subfamily D (ALD), member 20.00540.6
  • Each model demonstrating a classification accuracy of 92% includes 10 base differentially expressed genes from Set 3 and 4 combinations of 11 differentially expressed genes from Set 2. The Metabolomics analysis package (Biomarker Filter) from Ingenuity Pathway Analysis (http://www.ingenuity.com) revealed that nine of them are detectable in saliva, blood or urine.


Diabetic neuropathy is the most common diabetic complication, affecting up to 60% of patients and contributes significantly to pain, injury, poor wound healing and lower extremity amputation (Edwards et al., 2008). The pathogenesis of diabetic neuropathy is complex and includes hyperglycaemia-induced oxidative stress and deranged polyol metabolism, changes in nerve microvasculature, decreased growth factor support and dysregulated lipid metabolism (Edwards et al., 2008; Figueroa-Romero et al., 2008). Addressing these deficits alone or in combination has yet to result in effective diabetic neuropathy treatment, confirming that an increased understanding of the mechanisms underlying the onset and progression of diabetic neuropathy is of prime importance.

The current study takes an important first step towards this goal by identifying specific sets of genes whose expression accurately classifies patient samples with regard to diabetic neuropathy progression and by analysing their interactions within known cellular pathways. Identifying common elements in these complex networks will yield novel insights into disease pathogenesis, provide new therapeutic targets and identify potential diabetic neuropathy biomarkers. The genes identified in the current study confirm data gathered from experimental models of diabetes and provide a comprehensive picture of the expression of multiple targets in a single human tissue sample.

Our initial analyses of this data set classified the patient samples based on myelinated fibre density and found that two large groups emerged; those with a loss of myelinated fibre density ≥500 fibres/mm2 over 52 weeks (progressors) and those whose myelinated fibre density was relatively stable (myelinated fibre density loss ≤100 fibres/mm2 over 52 weeks, non-progressors) (Wiggin et al., 2009). We examined sural nerve biopsies from two groups of diabetic neuropathy patients (progressors and non-progressors) to discover differences in gene expression that could account for the differences in their clinical course. Gene expression profiling in damaged peripheral nerves by diabetic neuropathy or axotomy has been explored in experimental rodent models (Renaud et al., 2005; Bosse et al., 2006; Price et al., 2006; de Preux et al., 2007; Karamoysoyli et al., 2008) but has yet to be examined in humans. These studies indicate that changes in gene expression in injured peripheral nerves are similar to the over-represented biological functions (inflammation and energy metabolism) reported in the current study.

Multiple cells types are affected in diabetic complication-prone tissues. Peripheral nerves contain the cellular extensions (axons and dendrites) of both sensory and motor neurons and their ensheathing glia, the Schwann cells. Other components include fibroblasts, capillary endothelial cells and a complex extracellular matrix. The majority of messenger RNA isolated from any peripheral nerve will be that generated by Schwann cells. Schwann cells are a target of hyperglycaemia and diabetes results in Schwann cell damage in part due to altered axon integrity and defective growth factor signalling (Yu et al., 2008; McGuire et al., 2009). In addition, inflammatory pathways including advanced glycation end products/receptor for advanced glycation end products (AGE/RAGE) signalling in axons and Schwann cells are reported in experimental animals with diabetic neuropathy and contribute to nerve damage (Lukic et al., 2008).

The upregulated differentially expressed genes in progressors were enriched with ‘defence response’ and ‘inflammatory response’. Inflammation-associated molecules such as chemokines and cytokines are implicated in the development and progression of both diabetic nephropathy and diabetic neuropathy (Rivero et al., 2009). Among these inflammation genes, bradykinin receptor B2 (BDKRB2) is of particular interest. BDKRB2 regulates the expression of genes involved in progressive glomerulosclerosis such as transforming growth factor beta 1 (TGF-β1) and p53 (Kakoki et al., 2006). We recently reported that type 1 diabetic mice with dysregulated BDKRB2 developed enhanced nephropathy and diabetic neuropathy (Kakoki et al., 2010). Membrane-associated adenosine A3 receptor (ADORA3), is also implicated in the pathogenesis of diabetic nephropathy (Pawelczyk et al., 2005). Thus, the upregulation of cytokines, chemokines and genes such as DBKRB2 and ADORA3 in our study (Table 3) suggests enhanced inflammation and dysregulated defence responses, thus contributing to more substantial nerve damage in patients with progressive diabetic neuropathy. It is not yet clear if these cytokines and chemokines are expressed by the Schwann cells or by infiltrating macrophages (Sommer et al., 2005), which may interact with each other in injury and demyelinating diseases (Martini et al., 2008). Regardless, our data raise the intriguing idea that the inflammatory response should be further explored as a new therapeutic target for diabetic neuropathy.

The downregulated differentially expressed genes in the progressors were enriched with biological functions related to energy metabolism including ‘glucose metabolic process’ and ‘PPAR signalling pathway’ (Table 4). Among these differentially expressed genes, PPARγ, encoding a nuclear receptor for glitazone, plays a key role in regulating glucose and lipid metabolism (Duan et al., 2009). Agonists of PPARγ are effective in ameliorating diabetic neuropathy and nephropathy in animal models (Maeda et al., 2008; Yamagishi et al., 2008). Another key gene is APOE, encoding an apolipoprotein, which regulates the normal catabolism of triglycerides and cholesterol. A polymorphism of this gene is linked to the progression of diabetic nephropathy (Li et al., 2010). Decreased levels of PPARγ and APOE as well as other lipid metabolism-related differentially expressed genes correlates with the increased levels of serum triglycerides confirming our recent finding that altered lipid metabolism may play a role in the progression of diabetic neuropathy (Wiggin et al., 2009). Further experimental work is required to determine how altered lipid metabolism influences the progression of diabetic neuropathy.

We have demonstrated that increased glucose metabolism results in increased oxidative stress, mitochondrial dysfunction and cell death in both in vitro and in vivo models of diabetic neuropathy (Vincent et al., 2004; Russell et al., 2008). In the current study, genes involved in glucose metabolism are downregulated in progressors, which is counter-intuitive to what we and others have reported for sensory neurons. Yet, considering that the majority of sural nerve RNA originates from Schwann cells, these data do support what we and others have reported regarding Schwann cells i.e. that they are resistant to hyperglycaemia-induced cell death and exhibit an enhanced antioxidant capacity (Delaney et al., 2001; Vincent et al., 2009). Our data also imply that sensory neurons and Schwann cells likely employ different pathways when presented with metabolic stressors such as hyperglycaemia.

Although functional enrichment analyses identify over-represented biological functions, they do not reveal how these genes interact with each other. To obtain a global view of the network, we examined gene interaction networks based on literature-derived co-citation information (Figs 3 and 4). Although co-citation of two genes in a single sentence does not necessarily indicate there is a direct interaction, this process may reveal novel associations and lend new insights into function (Schmelzer et al., 2008). In the current study, the BiblioSphere co-citation network demonstrated potential interactions among differentially expressed genes and identified five major subnetworks centred on the following genes: PPARγ, APOE, SERPINE1, JUN and LEP.

The majority of the key genes identified in our network analyses are implicated in the pathogenesis of diabetes and diabetic complications (mainly diabetic nephropathy) (Supplementary Table 5). As discussed above, PPARγ and APOE are downregulated in progressors. Downregulation of either gene in adipocytes leads to a decrease in serum lipid uptake with subsequent hyperlipidaemia (Duan et al., 2009) and a predisposition towards developing diabetic neuropathy (Wiggin et al., 2009). A fibrinolysis regulating gene, SERPINE1 encodes plasminogen activator inhibitor 1 (PAI-1), whose elevated levels are associated with higher incidences of diabetes (Festa et al., 2002) and knocking out PAI-1 ameliorated diabetic nephropathy in mice (Nicholas et al., 2005). A recent study suggested leptin's therapeutic effect in a combinatorial treatment with insulin in type 1 diabetic mice (Wang et al., 2010). The cell cycle controlling JUN might be also involved in progression of diabetic neuropathy through its close interacting partners c-Jun N-terminal kinases (JNKs). JNKs are key signalling molecules linking inflammation and insulin resistance and are significantly activated in multiple tissues including the sural nerve of patients with type 1 and 2 diabetes (Yang and Trevillyan, 2008). Thus, the enriched biological functions and the networks of the differentially expressed genes reflect current theories with regard to dysregulation in diabetes and its complications, suggesting their expression changes may be related to the development of diabetic neuropathy.

To fully incorporate all of the co-citation connections among the differentially expressed genes, we applied the Fast Greedy algorithm, a community structure identification algorithm, to the entire co-citation network. Fast Greedy grouped LEP and PPARγ together within the context of glucose and lipid metabolism and JUN and SERPINE1 within the context of cell death and inflammation. Three other subnetworks were identified with noteworthy key genes: ‘cell projection and axonogenesis’ with nerve growth factor receptor (NGFR), ‘cellular homeostasis and inflammatory response’ with thioredoxin and ‘cytoskeletal protein binding’ with stathmin 1 (STMN1).

Nerve growth factor receptor exerts protection against nerve damage and the expression of nerve growth factor receptor protein in plasma correlates with diabetic neuropathy progression in diabetic rats (Chilton et al., 2004). Thioredoxin, which regulates cellular oxidative stress, is also implicated in diabetes. Thioredoxin's antioxidant activity is significantly inhibited by hyperglycaemia, suggesting its important role in vascular oxidative stress and inflammation in diabetes (Schulze et al., 2004). No direct implication of stathmin 1, a major regulator of microtubule dynamics, in diabetes is currently known. We and others have reported regenerative changes in response to diabetic neuropathy (Dyck et al., 1986; Sullivan et al., 2003). The expression of genes involved in axonal extension may reflect these changes and an individual's ability to recover from nerve damage.

Our next goal was to use observed differentially expressed gene expression to classify a separate subset of biopsies using Ridge regression modelling. Regression modelling using gene expression data has proven extremely useful in predicting the progression of cancer and diabetic nephropathy (Shedden et al., 2008; Ju et al., 2009). In the current study, gene expression profiles from secondary biopsy samples with known myelinated fibre density (progressors or non-progressors) were compared and used in training the models. The models were then used to classify the expression profiles of a set of primary biopsies for their progression endpoint 12 months later. The best classification models included 14 genes and correctly identified 11 out of 12 patients with respect to their identification as progressors versus non-progressors. This classification accuracy (92%) is much higher than our previous naïve Bayes-based classification model's accuracy (63%) using only physiological and demographic data of these patients (Wiggin et al., 2009). Our data demonstrate that the gene expression profile from sural nerve biopsies of patients with diabetic neuropathy achieve a higher prediction accuracy (92%) than the clinical parameters alone and are better predictors of diabetic neuropathy progression.

We hypothesize that the genes identified in our classification models (Table 5) represent products or ‘genetic biomarkers’ of the biological networks involved in diabetic neuropathy onset and progression. This idea is reinforced by the fact that several of the genes have known associations with diabetes or diabetic complications. We are particularly interested in CST1, whose expression was increased by 10-fold in progressors. CST1, encoding a cysteine protease inhibitor, was initially implicated in gastric and colorectal tumourigenesis (Choi et al., 2009; Yoneda et al., 2009). Another member of this protein family, cystatin C (CST3), has been identified as a prime predictor of diabetic nephropathy progression (Shimizu et al., 2003; Taglieri et al., 2009). Although the CST1 gene product has not been investigated in the context of diabetic complications, it is detectable in saliva, tears and urine (Choi et al., 2009). To date, there are no definitive biomarkers of diabetic neuropathy progression easily accessed from body fluids, and we speculate that CST1 could prove to be an easily measureable biomarker for diabetic neuropathy.

According to the Ingenuity Pathway Analysis Biomarker Filter (http://www.ingenuity.com), nine genes (six genes from the base set and three genes from the additional set) out of the 21 genes in the four models, are detectable in readily accessible human biofluids such as saliva, blood or urine (Table 5). Initial analyses of urine and blood chemistry were used to qualify patients for study inclusion during the clinical trials. No blood or urine was collected at study end; thus, we cannot evaluate the direct correlation of these candidate genes between the sural nerve and biofluids from the same patients. Sural nerve biopsy has been replaced by minimally invasive skin biopsy as a measure of innervation and it is unlikely that sural nerve biopsies and biofluids from new patients will ever be available; however, our novel results strongly suggest these biofluid detectable genes should be prioritized as potential diabetic neuropathy predictive biomarkers in future prospective studies.

No model is currently able to predict potential development of diabetic neuropathy. We do not know if our models would be useful in identifying patients who could potentially develop diabetic neuropathy since our gene expression-based models are optimized for differentiating patients with progressive diabetic neuropathy from those with non-progressive diabetic neuropathy. We do not expect our models to be used in a clinical setting in itself, especially since sural nerve biopsies are rarely performed; however, these models provide a starting point to identify potential biomarkers since many of the genes in our models are detectable in human biofluids. We expect that these potential biomarkers will be examined in a clinical context to assess their usefulness in identifying patients at risk of developing diabetic neuropathy and those whose disease course may be more severe.

In summary, we report for the first time differential gene expression of human sural nerves from patients with progressive and non-progressive diabetic neuropathy. Biological enrichment and network analyses identified several novel areas of biological importance, yielding new insight into disease pathogenesis and opening up new areas of potential investigation for the discovery of mechanism-based therapies. We have also reported for the first time the expression signatures of gene sets that accurately classify patients with regard to diabetic neuropathy progression. While translating gene expression to predictive biomarkers measurable in the clinic remains a challenge, we report several novel potential biomarker candidates for diabetic neuropathy. Evaluation of these biomarker candidates and refinement of patient classification models represents an exciting new direction in the management and treatment of diabetic neuropathy. Collectively, our results represent the first exploration of gene expression arrays from human sural nerves of patients with varying degrees of diabetic neuropathy and provide new insight into disease pathogenesis and biomarker identification.


National Institutes of Health (U54-DA021519 to H.V.J., M.K. and E.L.F.; R24-DK082841-01 to H.V.J., M.K. and E.L.F.; P30-DK081943 to M.K.; and RC1-NS068182-01 to E.L.F.); the Animal Models of Diabetic Complications Consortium (AMDCC U01-DK076160 to E.L.F); the American Diabetes Association (to K.A.S. and E.L.F.); National Institute of Diabetes and Digestive and Kidney Disease, the Michigan Diabetes Research and Training Centre (DK020572); the Program for Neurology Research and Discovery; the A. Alfred Taubman Medical Research Institute; a Rackham Graduate School Predoctoral Fellowship (to J.H.); and a Juvenile Diabetes Research Foundation Postdoctoral Fellowship (to J.H.).

Supplementary material

Supplementary material is available at Brain online.


The authors acknowledge the expertise of Ms Carey Backus and Ms Lisa McLean with regard to retrieving human sural nerve samples from the repository and Ms Ann Randolph for RNA extraction from the sural nerve samples. The authors also thank Dr Kerby Shedden at Department of Statistics, University of Michigan for providing the ridge regression modelling script.

 Database for Annotation, Visualization and Integrated Discovery


View Abstract