Identification and validation of potential common biomarkers for papillary thyroid carcinoma and Hashimoto’s thyroiditis … – Nature.com
Identify shared differential genes
Principal component analysis (PCA) of the expression matrices of GSE35570 (Fig. 2a) and GSE29315 (Fig. 2d) showed a clear separation between the disease and control samples in both datasets. In GSE35570, a total of 1572 differentially expressed genes (DEGs) were detected, comprising 824 up-regulated and 748 down-regulated genes (Fig. 2b). Similarly, 423 DEGs were identified in GSE29315, including 271 up-regulated and 152 down-regulated DEGs (Fig. 2e). The DEGs of each dataset are displayed as heatmaps (Fig. 2c,f). We then used Venn diagrams to identify the overlapping genes that changed in the same direction in both datasets, yielding 64 shared up-regulated genes (Fig. 2g) and 37 shared down-regulated genes (Fig. 2h).
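The same-direction overlap described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function name `shared_same_direction`, the placeholder gene names and the log2 fold-change values are all hypothetical, and each DEG table is assumed to be a mapping from gene symbol to log2 fold change.

```python
def shared_same_direction(degs_a, degs_b):
    """Split the genes common to two DEG tables by shared direction.

    Each argument maps a gene symbol to its log2 fold change
    (positive = up-regulated, negative = down-regulated).
    """
    common = degs_a.keys() & degs_b.keys()
    up = {g for g in common if degs_a[g] > 0 and degs_b[g] > 0}
    down = {g for g in common if degs_a[g] < 0 and degs_b[g] < 0}
    return up, down


# Hypothetical toy values, not taken from GSE35570 or GSE29315.
ptc_degs = {"GENE_A": 2.1, "GENE_B": -1.4, "GENE_C": 1.8, "GENE_D": -2.0}
ht_degs = {"GENE_A": 1.3, "GENE_B": -1.1, "GENE_C": -1.6, "GENE_E": 2.2}

up, down = shared_same_direction(ptc_degs, ht_degs)
print(sorted(up), sorted(down))
```

In this toy input, GENE_C is differentially expressed in both tables but in opposite directions, so it is excluded from both shared sets.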
Differential expression gene analysis, function enrichment analysis and pathway enrichment analysis. (a) The PCA plot of GSE35570. (b, c) The volcano plot and heatmap of DEGs in GSE35570. (d) The PCA plot of GSE29315. (e, f) The volcano plot and heatmap of DEGs in GSE29315. (g) Venn plot of the up-regulated DEGs. (h) Venn plot of the down-regulated DEGs. (i) The KEGG enrichment analyses of DEGs. (j) The GO enrichment analyses of DEGs.
To better understand the biological functions of the 101 shared DEGs, we performed GO and KEGG enrichment analyses using the clusterProfiler package in R. The GO analysis showed that these shared genes were mainly enriched in leukocyte mediated immunity, myeloid leukocyte activation, and antigen processing and presentation (Fig. 2j). The top five significantly enriched KEGG pathways were Tuberculosis, Phagosome, Viral myocarditis, Inflammatory bowel disease, and Th1 and Th2 cell differentiation (Fig. 2i). These results indicate that the shared DEGs are closely associated with immune function, and that the core genes act primarily in the activation of immune cells.
For the PPI analysis, we used the STRING online tool and visualized the results in Cytoscape (Supplementary Fig. S1a). The PPI network contained 68 nodes and 498 edges. The degree centrality (DC) of each node was calculated, with a median value of 11. Based on this, we identified 17 hub genes of the PPI network: TYROBP, ITGB2, STAT1, HLA-DRA, C1QB, MMP9, FCER1G, IL10RA, LCP2, LY86, CD53, CD14, CD163, HCK, MNDA, HLA-DPA1, and ALOX5AP. Subsequently, we used the MCODE plug-in to identify six modules (Supplementary Fig. S1b,c), which together included 29 common DEGs: LCP2, TYROBP, CD53, LY86, ITGB2, FCER1G, MNDA, C1QB, HCK, IL10RA, HLA-DRA, ALOX5AP, MT1G, MT1F, MT1E, MT1X, ISG15, IFIT3, PSMB9, GBP2, CD14, CD163, VSIG4, CAV1, TIMP1, S100A4, SDC2, FGFR2, and STAT1. The most important module comprised 12 genes (LCP2, TYROBP, CD53, LY86, ITGB2, FCER1G, MNDA, C1QB, HCK, IL10RA, HLA-DRA, ALOX5AP), which were further analyzed using the ClueGO plug-in in Cytoscape. This analysis showed that these genes function primarily in neutrophil activation during the immune response and in the activation of innate immunity (Supplementary Fig. S1d).
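As a rough illustration of the degree-based screening step, the sketch below counts the degree of each node in an undirected edge list and compares it with the median. The edge list and node names are hypothetical, and the selection rule used here (degree strictly above the median) is an assumption made for the example; the text reports the median DC but does not state the exact cutoff applied.

```python
from collections import Counter
from statistics import median


def degree_centrality(edges):
    """Count how many edges touch each node in an undirected edge list."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


# Hypothetical toy network, not the STRING network from the text.
edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3"), ("G4", "G5")]

degree = degree_centrality(edges)
median_degree = median(degree.values())

# Assumed rule for illustration only: keep nodes whose degree exceeds the median.
hubs = sorted(g for g, d in degree.items() if d > median_degree)
print(dict(degree), median_degree, hubs)
```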
We next analyzed a total of 26 genes from the six modules extracted by MCODE. To rank the importance of each gene, we applied the random forest (RF) algorithm to two datasets, GSE35570 (Fig. 3a) and GSE29315 (Fig. 3b), and took the top eight genes in each. A Venn diagram of the two top-eight lists (Fig. 3c) revealed three genes (CD53, FCER1G and TYROBP) shared between the two datasets. Notably, these three genes are also among the hub genes identified by DC in the PPI analysis and belong to the most significant MCODE module, and they showed promising diagnostic potential for HT and PTC.
To evaluate the diagnostic value of the common hub genes, we computed the cutoff value, sensitivity, specificity, AUC and 95% CI for each gene in the four datasets (Table 1). In the GSE35570 dataset (Fig. 3d), the AUC values were as follows: CD53 (AUC 0.71, 95% CI 0.61–0.82), FCER1G (AUC 0.81, 95% CI 0.73–0.89), and TYROBP (AUC 0.79, 95% CI 0.71–0.88). In the GSE29315 dataset (Fig. 3e), the AUC values were as follows: CD53 (AUC 1.00, 95% CI 1.00–1.00), FCER1G (AUC 1.00, 95% CI 1.00–1.00) and TYROBP (AUC 1.00, 95% CI 1.00–1.00). In the TCGA dataset (Fig. 3f), we validated the diagnostic value of the common hub genes for PTC. The AUC values were as follows: CD53 (AUC 0.71, 95% CI 0.61–0.82), FCER1G (AUC 0.74, 95% CI 0.64–0.89) and TYROBP (AUC 0.80, 95% CI 0.70–0.89). To further evaluate the diagnostic value of the common hub genes for PTC in HT, we computed the AUC and 95% CI for each gene using GSE138198. In the GSE138198 dataset (Fig. 3g), the AUC values were as follows: CD53 (AUC 0.83, 95% CI 0.57–1.00), FCER1G (AUC 0.92, 95% CI 0.72–1.00) and TYROBP (AUC 1.00, 95% CI 1.00–1.00).
We also compared expression between the two groups in each of the four datasets using box plots (Supplementary Fig. S2). The box plots revealed a marked difference in gene expression between the HT group and the control group in GSE29315, which explains why the AUC of all three hub genes in GSE29315 was 1.
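The link between group separation and an AUC of 1 can be made concrete with the rank interpretation of the AUC: it equals the probability that a randomly chosen case has a higher value than a randomly chosen control, with ties counted as one half. The sketch below assumes this pairwise definition and uses hypothetical expression values; it is not the software used to produce the figures above.

```python
def auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs in which the case scores
    higher than the control; tied pairs count as half a win."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical expression values for a single gene.
separated = auc([7.9, 8.4, 9.1], [5.2, 6.0, 6.3])    # no overlap between groups
overlapping = auc([6.1, 7.0, 8.2], [5.5, 6.5, 7.5])  # partial overlap
print(separated, overlapping)
```

When every case value exceeds every control value, all pairs are wins and the AUC is exactly 1, which is the situation the box plots describe for GSE29315. In the overlapping toy example, 6 of the 9 pairs are wins.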
Screening of hub genes and the diagnostic value of hub genes. (a) The rankings of gene importance in GSE35570. (b) The rankings of gene importance in GSE29315. (c) Venn plot of the top eight genes in GSE35570 and GSE29315. (d) Diagnostic value of hub genes in GSE35570. (e) Diagnostic value of hub genes in GSE29315. (f) Diagnostic value of hub genes in TCGA. (g) Diagnostic value of hub genes in GSE138198.
Using the GSE35570 dataset, we developed three diagnostic models for PTC based on the pivotal genes identified above. The ANN model (Fig. 4a) had 4 hidden units and a penalty of 0.0108, and was trained for 537 epochs. It achieved an AUC of 0.94 (95% CI 0.91–0.98) in the training set and 0.94 (95% CI 0.83–1.00) in the test set (Fig. 4b). The XGBoost model used mtry = 8, min_n = 6, max_depth = 3, learn_rate = 0.001, loss_reduction = 0.07 and sample_size = 0.97. It achieved an AUC of 0.84 (95% CI 0.75–0.93) in the training set and 0.62 (95% CI 0.42–0.83) in the test set (Supplementary Fig. S3a). The DT model used cost_complexity = 0.0003, tree_depth = 5 and min_n = 6. It achieved an AUC of 0.93 (95% CI 0.90–0.97) in the training set and 0.83 (95% CI 0.65–1.00) in the test set (Supplementary Fig. S3b). Supplementary Table S1 displays the predictive performance of the three machine learning models. The ANN model outperformed the other models, so we chose it for further analysis.
The TCGA dataset was used as an external validation dataset to assess the diagnostic performance of the ANN model for PTC, yielding an AUC of 0.77 (95% CI 0.66–0.87) (Fig. 4c). The GSE138198 dataset was used to evaluate the ANN model's diagnostic efficacy for PTC in HT, where the model reached an AUC of 1.00 (95% CI 1.00–1.00) (Fig. 4d).
To give clinicians a better understanding of variable contributions, we used the SHAP algorithm to interpret the ANN predictions. Figure 4e–g illustrates how the attributed importance of each feature changed as its value varied. CD53 had the largest impact on the output of the ANN model: it was initially positively associated with the risk of PTC and became negatively associated after a turning point of approximately 6. TYROBP and FCER1G showed a positive correlation with the occurrence of PTC.
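To show what a per-feature attribution means, the sketch below computes exact Shapley values for a three-feature model by averaging each feature's marginal contribution over all feature orderings. Several things here are assumptions made for illustration: `toy_model` is a hypothetical scoring function and not the ANN described above, the input and baseline values are invented, and "absent" features are replaced by a single baseline vector, whereas SHAP implementations typically average over a background dataset and use approximations.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline).

    Features are switched from their baseline value to their observed value
    one at a time, and the resulting change in the prediction is averaged
    over every possible ordering of the features.
    """
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]
            updated = predict(current)
            phi[i] += updated - previous
            previous = updated
    return [p / len(orders) for p in phi]


def toy_model(values):
    # Hypothetical score for illustration only; not the trained ANN.
    cd53, fcer1g, tyrobp = values
    return 0.5 * fcer1g + 0.3 * tyrobp - 0.1 * (cd53 - 6.0) ** 2


x = [7.0, 8.0, 9.0]         # hypothetical sample (CD53, FCER1G, TYROBP)
baseline = [6.0, 6.0, 6.0]  # hypothetical reference point
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the property that lets each value be read as one feature's share of the model output.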
ANN model construction and feature importance analysis. (a) The ANN was constructed based on the shared hub genes. (b) Diagnostic value of the ANN model in the GSE35570. (c) Diagnostic value of the ANN model in the TCGA. (d) Diagnostic value of the ANN model in the GSE138198. (e) A score calculated by SHAP was used for each input feature. (f, g) Distribution of the impact of each feature on the full model output estimated using the SHAP values.
We analyzed the protein expression of the hub genes using the HPA database (Supplementary Fig. S4). CD53 was highly expressed in both tumor and normal tissues, while FCER1G and TYROBP showed higher expression in tumors than in normal tissues. Furthermore, IF staining was performed to measure the expression of CD53, FCER1G, and TYROBP in our clinical samples, which included 10 HT-related PTC tissues and 6 NAT. The semi-quantitative IF results (Fig. 5) showed significantly higher fluorescence signal intensities for CD53, FCER1G, and TYROBP in the HT-related PTC group than in the NAT group (P < 0.05).
Microscopy scan of IF staining showing the distribution of CD53 (green), FCER1G (green), and TYROBP (green) in HT-related PTC tissues and normal tissues adjacent to the tumour (NAT), as well as the diagnostic value of CD53, FCER1G and TYROBP. MFI: Mean Fluorescence Intensity.
Considering the important roles of immune and inflammatory responses in the development of HT and PTC, we analyzed the differences in immune cell infiltration patterns between PTC, HT and normal samples using the CIBERSORT algorithm. In the GSE35570 dataset, 12 immune cell subgroups differed significantly between PTC and normal samples (Supplementary Fig. S5a). In the GSE29315 dataset, 5 immune cell subgroups differed significantly between HT and normal samples (Supplementary Fig. S5b). Among these, 4 common immune subpopulations were significantly higher in both PTC and HT samples than in normal samples: T cells CD8, T cells CD4 memory resting, macrophages M1 and mast cells resting. We also conducted Spearman correlation analysis between the hub genes and immune cells (Supplementary Fig. S5c,d). The results suggested that immune responses could contribute to the involvement of the hub genes in PTC and HT progression. IF staining was used to assess immune cell infiltration in 5 cases of PTC in HT tissues and 5 cases of NAT (Fig. 6). The expression levels of the CD4+ T-cell marker Cd4, the CD8+ T-cell marker Cd8, and the macrophage marker Cd86 were significantly higher in the PTC in HT group than in the NAT group. The IF staining results therefore partially verified the results of the immune infiltration analysis.
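Spearman correlation, as used above to relate hub gene expression to immune cell fractions, is the Pearson correlation computed on ranks. The sketch below is a minimal pure-Python version with average ranks for ties; the expression values and cell fractions are hypothetical and the helper names are chosen for this example only.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(ranks(a), ranks(b))


# Hypothetical values: expression of one gene and one immune cell fraction.
expression = [5.1, 6.3, 7.8, 8.0, 9.4]
cell_fraction = [0.02, 0.05, 0.04, 0.11, 0.30]
rho = spearman(expression, cell_fraction)
print(rho)
```

Because only ranks enter the calculation, any strictly increasing relationship gives a coefficient of 1 regardless of its shape, which suits cell fractions that are bounded and often skewed.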
Microscopy scan of IF staining showing the distribution of Cd4 (green), Cd8 (green), and Cd86 (green) in HT-related PTC tissues and normal tissues adjacent to the tumour (NAT). MFI: Mean Fluorescence Intensity.
For the three core genes selected by the RF algorithm, we searched the DGIdb database for potential drugs. Only FCER1G had relevant entries; no drugs were found for CD53 or TYROBP. Two potential drugs were predicted for FCER1G: benzylpenicilloyl polylysine and aspirin. Benzylpenicilloyl polylysine had the higher score of 29.49, while aspirin had a score of only 1.26. We hypothesise that benzylpenicilloyl polylysine and aspirin may be effective in the treatment of HT and PTC and may prevent HT carcinogenesis.