Resources

📥 Available Datasets for Download

The following datasets are available for download. Please cite our work when using them.

Group | Link | Species | Counts | Source | Size | Release
Physiological Sequences | ST-P1 | Homo sapiens | 53,519 | UniProt, NCBI_RefSeq | 46.7M | 2025.10.31
Frameshift Sequences | ST-S1 | Homo sapiens | 1,240,190 | ClinVar, 1000GP, DepMap, GDC, dbSNP | 887.3M | 2025.11.11
FS-control Sequences | ST-SC1 | Homo sapiens | 44,519 | ClinVar, 1000GP, DepMap, GDC, dbSNP | 24.7M | 2025.11.11

📊 Comparative Analysis

Once you have prediction results for both the frameshift (FS) sequences and their FS-control counterparts, click here to proceed with comparative analysis. Both sets of predictions must be generated with identical parameters; otherwise, the comparison will not be meaningful.
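The identical-parameters requirement can be checked before any comparison is run. The following is a minimal sketch, not part of SERtool: it assumes each prediction run's settings are available as a plain dictionary, and the function names and the parameter names in the usage note (`threshold`, `window`) are hypothetical.

```python
# Hypothetical pre-check: confirm the FS run and its control run used the
# same settings. The dictionary representation is an assumption.
MISSING = "<missing>"  # marker for a parameter set in only one of the two runs


def differing_parameters(fs_params, control_params):
    """Return {name: (fs_value, control_value)} for every parameter that differs."""
    diffs = {}
    for name in sorted(set(fs_params) | set(control_params)):
        fs_value = fs_params.get(name, MISSING)
        control_value = control_params.get(name, MISSING)
        if fs_value != control_value:
            diffs[name] = (fs_value, control_value)
    return diffs


def assert_comparable(fs_params, control_params):
    """Raise ValueError if the two runs were not generated with identical parameters."""
    diffs = differing_parameters(fs_params, control_params)
    if diffs:
        details = ", ".join(
            f"{name}: FS={fs!r} vs control={ctrl!r}"
            for name, (fs, ctrl) in diffs.items()
        )
        raise ValueError(f"Parameter mismatch, comparison is not meaningful: {details}")
```

Calling `assert_comparable({"threshold": 0.5, "window": 9}, {"threshold": 0.5, "window": 9})` returns silently, while changing any value in one of the two dictionaries raises a `ValueError` that names the mismatched parameter.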

🔗 Related Resources

Explore SERtool’s specialized modules. The datasets described above can be used within sequence-based predictive frameworks. We have developed two sequence prediction models aligned with our research group’s focus and will continue to expand this toolkit, with an emphasis on deciphering how frameshift mutations relate to molecular function and to disease.

Splicing Factor Prediction (Available)
Chaperone Prediction (Under examination)
Future Feature