Li Wang, Hao Zhang, Hao-Wu Chang, Qing-Ming Qin, Bo-Rui Zhang, Xue-Qing Li, Tian-Heng Zhao, Tian-Yue Zhang. GAEBic: A Novel Biclustering Analysis Method for miRNA-Targeted Gene Data Based on Graph Autoencoder[J]. Journal of Computer Science and Technology, 2021, 36(2): 299-309. DOI: 10.1007/s11390-021-0804-3
GAEBic: A Novel Biclustering Analysis Method for miRNA-Targeted Gene Data Based on Graph Autoencoder

Funds: This work was supported by the National Natural Science Foundation of China under Grant No. 62072210 and the Project of the Development and Reform Commission of Jilin Province of China under Grant No. 2019C053-6.
  • Author Bio:

    Li Wang is a Master's candidate at the College of Software, Jilin University, Changchun. His research interests include data mining, machine learning, bioinformatics, pattern recognition, image processing, and neural networks.

  • Corresponding author:

    Hao Zhang E-mail: zhangh@jlu.edu.cn

  • Received Date: July 13, 2020
  • Revised Date: March 04, 2021
  • Published Date: March 04, 2021
  • Unlike traditional clustering analysis, a biclustering algorithm works simultaneously on both dimensions of the data: samples (rows) and variables (columns). In recent years, biclustering methods have developed rapidly and have been widely applied in biological data analysis, text clustering, recommendation systems, and other fields. Traditional clustering algorithms do not adapt well to high-dimensional and/or large-scale data. At present, most biclustering algorithms are designed for differentially expressed big biological data, and there is little discussion of mining clusters in binary data such as miRNA-targeted gene data. Here, we propose GAEBic, a novel biclustering method for miRNA-targeted gene data based on a graph autoencoder. GAEBic applies a graph autoencoder to capture the similarity of sample sets or variable sets, and adopts a new irregular clustering strategy to mine biclusters with excellent generalization. Using the miRNA-targeted gene data of soybean, we benchmark several different types of biclustering algorithms and find that GAEBic performs better than Bimax, Bibit, and the Spectral Biclustering algorithm in terms of target gene enrichment. The method achieves comparable performance on the high-throughput miRNA data of soybean and can also be used for other species.
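
To make the two ideas in the abstract concrete, the following is a minimal sketch, not the GAEBic implementation. It shows (a) a bicluster as a row-and-column submatrix of a binary miRNA-by-gene matrix, and (b) one untrained forward pass of a generic graph autoencoder (a GCN-style encoder with an inner-product decoder). The toy matrix, the rule for connecting miRNAs that share at least two targets, the layer sizes, and all function names are assumptions made for illustration only.

```python
import numpy as np

def bicluster_density(matrix, rows, cols):
    """Fraction of 1s in the submatrix selected by the given rows and columns."""
    sub = matrix[np.ix_(rows, cols)]
    return float(sub.mean())

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 used by GCN-style encoders."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1)                 # always >= 1 because of self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gae_forward(adj, features, w0, w1):
    """One forward pass: two graph-convolution layers, then sigmoid(Z Z^T)."""
    a_norm = normalize_adjacency(adj)
    hidden = np.maximum(a_norm @ features @ w0, 0.0)   # ReLU
    z = a_norm @ hidden @ w1                           # node embeddings
    recon = 1.0 / (1.0 + np.exp(-(z @ z.T)))           # reconstructed adjacency
    return z, recon

# Hypothetical binary data: rows are miRNAs, columns are target genes.
targets = np.array([
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1],
])

# Assumed graph construction: link two miRNAs if they share >= 2 targets.
shared = targets @ targets.T
adj = (shared >= 2).astype(float)
np.fill_diagonal(adj, 0.0)

# Random, untrained weights; a real model would learn them by minimizing
# the reconstruction error between recon and adj.
rng = np.random.default_rng(0)
w0 = rng.normal(size=(targets.shape[1], 4))
w1 = rng.normal(size=(4, 2))
z, recon = gae_forward(adj, targets.astype(float), w0, w1)

# A perfect bicluster (all ones) versus a looser one.
density_full = bicluster_density(targets, [0, 1], [0, 1, 2])
density_mixed = bicluster_density(targets, [0, 1, 2], [0, 1, 2])
print(density_full, density_mixed)
```

In this sketch the embeddings `z` would, after training, place miRNAs with similar target sets close together, and a separate clustering step over rows and columns would then propose biclusters. How GAEBic actually performs that step is described in the paper itself, not here.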