
LakeSoul Cloud-Native Lakehouse

Certified under the national information technology innovation program, LakeSoul provides leading, reliable data warehouse management capabilities for enterprise lakehouse data platforms.

Leading technical concept and architecture design

Traditional data architectures suffer from slow response times, high costs, an inability to unify real-time and batch data, and difficulty scaling. LakeSoul solves these problems with a complete lakehouse storage layer: it offers high-concurrency, high-throughput read and write capabilities and full warehouse management on the cloud, exposed to various computing engines through a common interface.

Efficient and extensible Catalog metadata service

LakeSoul stores Catalog metadata in a PostgreSQL database, improving metadata scalability and transaction concurrency.

Concurrent writes and ACID transactions

LakeSoul's concurrency control supports highly concurrent writes, automatically detecting and resolving conflicts to ensure data consistency.
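The idea behind automatic conflict detection can be sketched with a toy optimistic-concurrency example in Python (purely illustrative; the class and field names here are made up and not LakeSoul's actual implementation): each writer remembers the table version it read, and a commit is rejected and retried if another writer committed first.

```python
class ConflictError(Exception):
    """Raised when another writer committed since this writer last read."""


class Table:
    def __init__(self):
        self.version = 0
        self.data = {}

    def commit(self, read_version, updates):
        # Optimistic check: fail if the table moved past the version we read.
        if self.version != read_version:
            raise ConflictError(f"expected v{read_version}, table is at v{self.version}")
        self.data.update(updates)
        self.version += 1
        return self.version


t = Table()
v = t.version                          # both writers read version 0
t.commit(v, {"k": 1})                  # first writer wins; table moves to v1
try:
    t.commit(v, {"k": 2})              # second writer conflicts...
except ConflictError:
    t.commit(t.version, {"k": 2})      # ...and retries against the latest version
```

In a real system the conflict check and the commit happen atomically in the metadata store (here, a PostgreSQL transaction would play that role); the sketch only shows the detect-and-retry control flow.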

Incremental writes and Upsert updates

LakeSoul provides efficient Merge on Read and Upsert capabilities, improving the flexibility and performance of data ingestion.
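To illustrate the Merge on Read idea (a conceptual Python sketch, not LakeSoul's file format or API): upserts are appended cheaply as small delta batches, and rows are merged by primary key only when the table is read.

```python
class MorTable:
    """Toy merge-on-read table keyed by a primary key column 'id'."""

    def __init__(self, base_rows):
        self.base = base_rows      # initial batch
        self.deltas = []           # upsert batches appended over time

    def upsert(self, rows):
        # Write path is cheap: just append the new batch, no rewrite.
        self.deltas.append(rows)

    def read(self):
        # Read path merges base + deltas; later batches win per key.
        merged = {}
        for batch in [self.base] + self.deltas:
            for row in batch:
                merged[row["id"]] = row
        return sorted(merged.values(), key=lambda r: r["id"])


t = MorTable([{"id": 1, "v": "a"}, {"id": 2, "v": "b"}])
t.upsert([{"id": 2, "v": "b2"}, {"id": 3, "v": "c"}])
rows = t.read()   # id 2 is updated, id 3 is inserted
```

The trade-off shown here is the core of Merge on Read: writes avoid rewriting existing files, at the cost of a merge step at query time.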

Real-time Lakehouse

LakeSoul supports streaming and batch writes, row-level updates, and SQL operations. MVCC multi-version control enables snapshot reads and version rollback. Flink CDC is provided for efficient real-time ingestion into the lake.
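The MVCC behaviors named above can be sketched in a few lines of Python (illustrative only; LakeSoul tracks versions in its Catalog metadata, not in memory like this): each commit creates a new immutable version, readers pin a version number for snapshot reads, and rollback discards later versions.

```python
class VersionedTable:
    def __init__(self):
        self.versions = [{}]       # version 0 is the empty table

    def commit(self, updates):
        snap = dict(self.versions[-1])   # copy-on-write: new version per commit
        snap.update(updates)
        self.versions.append(snap)
        return len(self.versions) - 1

    def snapshot(self, version):
        # Readers see a fixed, immutable version, unaffected by later commits.
        return dict(self.versions[version])

    def rollback(self, version):
        # Discard every version after the given one.
        self.versions = self.versions[: version + 1]


t = VersionedTable()
v1 = t.commit({"k": "old"})
v2 = t.commit({"k": "new"})
assert t.snapshot(v1)["k"] == "old"   # snapshot read ignores the later commit
assert t.snapshot(v2)["k"] == "new"
t.rollback(v1)                        # version rollback drops v2
```

Because readers only touch a pinned version, snapshot reads never block writers, which is what allows streaming writes and analytical reads to coexist.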


Unified stream-batch table storage

LakeSoul interconnects with computing engines such as Spark, Flink, and Presto, and fully supports data intelligence workloads such as ETL, OLAP, and AI model training.

Rich application scenarios meet diverse business requirements and help unlock business value

Fast real-time data ingestion into the lake

Flink CDC enables real-time synchronization from the data source, eliminating T+1 imports and the need to deploy Kafka

Example: real-time online database ingestion and report analysis

With only minimal configuration, such as the online data source connection, a whole-database synchronization and real-time ingestion task can be started. New tables are detected automatically and table schema changes are synchronized without manual operation and maintenance. Online data is updated to the lakehouse in real time, and BI reports and dashboards connect seamlessly and refresh in real time, so key business indicators are always up to date to support business decisions.
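The automation described above, sensing new tables and propagating schema changes, can be illustrated with a toy Python sketch (purely conceptual; the real work is done by the Flink CDC sync job, and these dict-based schemas are an assumption for illustration): one sync pass compares the source's tables and columns against the lake's and applies the difference.

```python
def sync_metadata(source, lake):
    """Toy one-pass sync: create missing tables, apply additive schema changes."""
    for table, columns in source.items():
        if table not in lake:
            lake[table] = dict(columns)      # auto-sense a newly created table
        else:
            for col, typ in columns.items():
                if col not in lake[table]:
                    lake[table][col] = typ   # propagate a newly added column
    return lake


source = {
    "orders": {"id": "bigint", "amount": "decimal", "note": "varchar"},
    "users": {"id": "bigint"},
}
lake = {"orders": {"id": "bigint", "amount": "decimal"}}
sync_metadata(source, lake)   # 'users' is created; 'orders' gains 'note'
```

A production sync job would run this reconciliation continuously against CDC metadata events rather than as a single pass, but the no-human-in-the-loop principle is the same.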

Real-time Report Analysis

Based on the unified stream-batch update feature, data extraction, transformation, and development are completed through SQL, simplifying ETL and data analysis pipelines.

AI Application Landing

Large-scale DMPs, machine learning sample stores, and feature stores can be built on LakeSoul, connecting seamlessly to AI models and online inference to enable intelligent data applications.

Join the community and share data intelligence