Li GL, Wang J, Chen G. openGauss: An enterprise-grade open-source database system. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 39(5): 1007−1028 Sept. 2024. DOI: 10.1007/s11390-024-4302-2.

openGauss: An Enterprise-Grade Open-Source Database System

Funds: This work was supported by the National Key Research and Development Program of China under Grant No. 2023YFB4503600, the National Natural Science Foundation of China under Grant Nos. 61925205, 62232009, and 62102215, Zhongguancun Lab, Huawei, TAL Education, and Beijing National Research Center for Information Science and Technology (BNRist).
  • Author Bio:

    Guo-Liang Li is a professor at the Department of Computer Science and Technology, Tsinghua University, Beijing. He received the VLDB 2017 Early Research Contribution Award, the TCDE 2014 Early Career Award, and the SIGMOD 2024 Research Highlight Award. His research interests include database systems, machine learning for databases, and large-scale data cleaning and integration.

    Jiang Wang is a member of the openGauss Technical Committee and a database kernel architect of openGauss. His main research interests include distributed systems, database systems, parallel processing, database intelligence, and data security.

    Guo Chen is a deputy general manager of the China Mobile Information Technology Center, Beijing. He is a professor-level senior engineer and a recipient of the Special Government Allowance from the State Council. He is responsible for strategic planning, technological innovation, system construction, research and development, and Ability as a Service (AaaS) operations in the information technology field at China Mobile. Chen has led the establishment of cloud-native evolution technology architectures, roadmaps, and planning schemes for China Mobile's operational support systems. He has overseen the development and promotion of innovative capabilities in cloud computing, computing power networks, large models, AI applications, databases, blockchain, IPA, etc., providing IT momentum for the enterprise's digital and intelligent transformation.

  • Received Date: March 19, 2024
  • Accepted Date: August 29, 2024
  • We have built openGauss, an enterprise-grade open-source database system. openGauss has fulfilled its design goals of high performance, high availability, high security, and high intelligence. For high performance, it leverages NUMA (non-uniform memory access)-aware data access among multiple cores to enable efficient concurrent transaction processing, and symmetric multi-processing to make adaptive use of parallel processing resources. Moreover, memory-optimized tables (MOTs) are designed to keep everything in memory. For high availability, we propose a three-tier pooling architecture, comprising a distributed memory service (DMS) and a distributed storage service (DSS), that shares storage among the master and standby instances to achieve 99.99% availability. For high security, it is a fully encrypted database with safe storage, efficient complex querying, and tamper-proofing. For high intelligence, an AI-based optimizer in the kernel and a self-driving platform named DBMind are demonstrated to achieve better performance and greater user-friendliness. openGauss has served over 150 enterprises and institutions since its release in 2020. We share the lessons we learned from its development and operation, and from our customers.
