The following datasets are available for download. Please cite our work when using them.
| Group | Link | Species | Counts | Source | Size | Release |
|---|---|---|---|---|---|---|
| Physiological Sequences | ST-P1 | Homo sapiens | 53,519 | UniProt, NCBI_RefSeq | 46.7M | 2025.10.31 |
| Frameshift Sequences | ST-S1 | Homo sapiens | 1,240,190 | ClinVar, 1000GP, DepMap, GDC, dbSNP | 887.3M | 2025.11.11 |
| FS-control Sequences | ST-SC1 | Homo sapiens | 44,519 | ClinVar, 1000GP, DepMap, GDC, dbSNP | 24.7M | 2025.11.11 |
Once you have prediction results for both the frameshift (FS) sequences and the FS-control sequences, click here to proceed with comparative analysis. Both sets of results must be generated with identical parameters; otherwise, the comparison will not be meaningful.
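
One way to check the identical-parameters requirement before starting a comparison is to diff the settings recorded for the two runs. The sketch below is a minimal illustration, not part of SERtool: it assumes each run's parameters were saved as a plain dictionary, and the function name `parameter_mismatches` and the parameter names `window` and `threshold` are hypothetical.

```python
from typing import Any

# Placeholder reported when a parameter exists in only one of the two runs.
MISSING = "<missing>"


def parameter_mismatches(
    fs_params: dict[str, Any], control_params: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Return {name: (fs_value, control_value)} for every parameter that differs."""
    mismatches = {}
    for name in sorted(set(fs_params) | set(control_params)):
        fs_value = fs_params.get(name, MISSING)
        control_value = control_params.get(name, MISSING)
        if fs_value != control_value:
            mismatches[name] = (fs_value, control_value)
    return mismatches


# Hypothetical parameter sets for the FS run and its control run.
fs_run = {"window": 30, "threshold": 0.5}
control_run = {"window": 30, "threshold": 0.7}

differences = parameter_mismatches(fs_run, control_run)
if differences:
    print("Parameters differ; the comparison would not be meaningful:", differences)
else:
    print("Parameters match; safe to compare.")
```

An empty result means the two runs share every recorded parameter; any entry in the result names a setting that should be reconciled before the comparison is run.
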
Explore SERtool’s specialized modules: The data described above can be used within sequence-based predictive frameworks. We have developed two sequence prediction models aligned with our research group’s focus and will continue to expand this toolkit, with an emphasis on deciphering the relationship between frameshift mutations and molecular function, as well as their roles in disease.