The Filtration of 2D Electrophoresis Data during Creation of a Learning Set for Prediction of the Value of the Isoelectric Point of Proteins


  • V.S. Skvortsov Institute of Biomedical Chemistry, 10 Pogodinskaya str., Moscow, 119121 Russia
  • A.V. Rybina Institute of Biomedical Chemistry, 10 Pogodinskaya str., Moscow, 119121 Russia



isoelectric point; 2D electrophoresis; data collection


We consider a number of simple filters, formulated from general considerations, that take into account both the peculiarities of 2D electrophoresis experiments and the results obtained in them. These filters can be used for automated dataset formation and for verifying the training of systems that predict protein isoelectric point values. They include: (i) filtering of obvious errors introduced during initial database formation; (ii) restriction to a known plausible range of values; (iii) selection of a single variant among multiple proteoforms; (iv) selection within a preset deviation of the electrophoretic shift, etc. For a dataset combining data from 8 maps of Homo sapiens, Mus musculus, and Rattus norvegicus, applying this set of filters improved the R² value of predictions from 0.44 to 0.67.
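The four filters listed above could be chained roughly as in the following minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the `Spot` fields, the pI range of 3 to 11, and the 20% tolerance are hypothetical. Treating the "electrophoretic shift deviation" as the relative difference between the gel-derived and sequence-derived molecular weight is an assumption, as is keeping, for each protein, the spot with the smallest such shift. The `r_squared` helper uses the standard coefficient of determination.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Sequence, Tuple


@dataclass(frozen=True)
class Spot:
    """One gel spot; all field names are hypothetical."""
    accession: str   # protein identifier
    pi_exp: float    # isoelectric point read from the gel
    mw_exp: float    # molecular weight read from the gel, kDa
    mw_theor: float  # molecular weight computed from the sequence, kDa


def mw_shift(spot: Spot) -> float:
    """Relative deviation of experimental from theoretical weight (assumed metric)."""
    return abs(spot.mw_exp - spot.mw_theor) / spot.mw_theor


def filter_spots(
    spots: Iterable[Spot],
    pi_range: Tuple[float, float] = (3.0, 11.0),  # assumed plausible pI range
    max_shift: float = 0.2,                       # assumed tolerance
) -> List[Spot]:
    best: Dict[str, Spot] = {}
    for s in spots:
        # (i) obvious errors: non-positive values cannot be real measurements
        if s.pi_exp <= 0 or s.mw_exp <= 0 or s.mw_theor <= 0:
            continue
        # (ii) keep only values inside the plausible pI range
        if not pi_range[0] <= s.pi_exp <= pi_range[1]:
            continue
        # (iv) reject spots whose shift exceeds the preset deviation
        if mw_shift(s) > max_shift:
            continue
        # (iii) one spot per protein: keep the one with the smallest shift
        current = best.get(s.accession)
        if current is None or mw_shift(s) < mw_shift(current):
            best[s.accession] = s
    return list(best.values())


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


spots = [
    Spot("P1", 5.5, 50.0, 48.0),   # passes, but a closer proteoform exists
    Spot("P1", 5.9, 70.0, 48.0),   # rejected by (iv): shift is about 46%
    Spot("P1", 5.6, 49.0, 48.0),   # kept as the single P1 variant
    Spot("P2", 14.2, 30.0, 30.0),  # rejected by (ii): pI out of range
    Spot("P3", -1.0, 20.0, 20.0),  # rejected by (i): negative pI
    Spot("P4", 7.0, 25.0, 24.0),   # kept
]
learning_set = filter_spots(spots)
```

Running the filters in a single pass keeps the order of rejection explicit, so each threshold can be tuned independently while `r_squared` is recomputed on the surviving spots.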





How to Cite

Skvortsov, V., & Rybina, A. (2022). The Filtration of 2D Electrophoresis Data during Creation of a Learning Set for Prediction of the Value of the Isoelectric Point of Proteins. Biomedical Chemistry: Research and Methods, 5(1), e00162.