One-Stop Data Intelligence Development Environment

LakeInsight provides a web-based, one-stop data intelligence development environment that unifies stream computing, batch computing, AI development, and multimodal data processing on a single platform. Data engineers, AI developers, and business analysts can complete the entire development lifecycle, from data ingestion to model deployment and metric publishing, within a single workspace.

Unified Development Experience

(1) Multi-Language Development Support

SQL-based data query and modeling development, compatible with Flink SQL and Spark SQL
Python, Java, and Scala support for data processing, AI training, and custom business logic
Jupyter Notebook for interactive exploratory analysis and AI model experimentation
One-stop service covering the full development lifecycle: development, testing, and deployment

(2) Unified Streaming, Batch, AI & Multimodal Platform

Manage Flink real-time streaming jobs, Spark batch jobs, and Python AI training tasks within a single workspace
Streaming jobs (Flink) for real-time CDC synchronization, real-time metric computation, and streaming feature engineering
Batch jobs (Spark) for large-scale offline modeling, historical data backfill, and periodic reports
Python tasks support PyTorch, Pandas, and other AI and data processing libraries, and read lakehouse data directly for model training
Multimodal data (video, audio, images, text) queried and processed alongside structured tables in the same IDE

(3) Online IDE Development Environment

CodeServer-based online editor with syntax highlighting, auto-completion, and syntax checking
Built-in Conda virtual environment management for project-level dependency isolation
Interactive SQL editor with real-time query result preview
Collaborative development supporting multiple users working on data modeling tasks simultaneously

Security & Multi-Tenancy

Enterprise single sign-on (SSO) integration
Development and production environment isolation to prevent development errors from affecting production data
Data domain partitioning with fine-grained read, write, and execute permission isolation
Role-based access control (RBAC) ensuring business and data security across workspaces
Custom roles with flexible module-level permission configuration
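
To make the read, write, and execute isolation per data domain more concrete, here is a minimal sketch of how a role-based check could be modeled. It is an illustration only, not LakeInsight's actual API: the names Permission, Role, and allows, and the example domains "sales" and "finance", are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Flag, auto


class Permission(Flag):
    """Read, write, and execute permissions (hypothetical model)."""
    NONE = 0
    READ = auto()
    WRITE = auto()
    EXECUTE = auto()


@dataclass
class Role:
    """Hypothetical custom role: maps a data domain name to its granted permissions."""
    name: str
    grants: dict = field(default_factory=dict)

    def allows(self, domain: str, needed: Permission) -> bool:
        # A domain that is not listed grants nothing (deny by default).
        granted = self.grants.get(domain, Permission.NONE)
        # Every requested permission bit must be present in the grant.
        return (granted & needed) == needed


# Example roles; the domain names are assumptions made for illustration.
analyst = Role("analyst", {"sales": Permission.READ})
engineer = Role(
    "engineer",
    {"sales": Permission.READ | Permission.WRITE | Permission.EXECUTE},
)

print(analyst.allows("sales", Permission.READ))
print(analyst.allows("sales", Permission.WRITE))
print(engineer.allows("sales", Permission.READ | Permission.EXECUTE))
```

Denying by default for unlisted domains is one way to keep data domains isolated: a role sees nothing in a domain until a grant is added explicitly.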

Task Publishing & Operations

One-click publishing of completed development tasks to production with configurable settings
Approval workflow: tasks must pass administrator review before online deployment
Real-time task status monitoring with start/stop control, log viewing, and anomaly alerts
Task version management with traceable history and rollback capability
Adjustable compute resources (CPU, memory) and cluster configuration
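
The approval gate and version rollback described above can be sketched as a small state model. This is a simplified illustration under stated assumptions, not the platform's implementation: the Task class, its fields, and the submit, approve, and rollback methods are hypothetical names, and the sketch assumes rollback is limited to versions that were previously approved.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    """Hypothetical task record with version history and an approval gate."""
    name: str
    versions: list = field(default_factory=list)  # every submitted definition
    approved: set = field(default_factory=set)    # indexes that passed review
    pending: Optional[int] = None                 # index awaiting review
    live: Optional[int] = None                    # index running in production

    def submit(self, definition: str) -> int:
        """Record a new version and queue it for review."""
        self.versions.append(definition)
        self.pending = len(self.versions) - 1
        return self.pending

    def approve(self, reviewer_is_admin: bool) -> None:
        """Promote the pending version to production after administrator review."""
        if self.pending is None:
            raise ValueError("nothing is awaiting review")
        if not reviewer_is_admin:
            raise PermissionError("only an administrator can approve")
        self.approved.add(self.pending)
        self.live, self.pending = self.pending, None

    def rollback(self, version: int) -> None:
        """Switch production back to an earlier, already approved version."""
        if version not in self.approved:
            raise ValueError("can only roll back to an approved version")
        self.live = version


task = Task("daily_report")
task.submit("SELECT 1")
task.approve(reviewer_is_admin=True)   # version 0 goes live
task.submit("SELECT 2")
task.approve(reviewer_is_admin=True)   # version 1 goes live
task.rollback(0)                       # production returns to version 0
print(task.live)
```

Keeping every submitted definition in the history, and only moving the live pointer, is what makes the history traceable and the rollback cheap in this sketch.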

Platform Management

Workspace management: isolated workspaces for different users and projects
Separate development and production cluster configurations for production stability
Resource monitoring: real-time monitoring and alerting for cluster resources and compute tasks
Task scheduling: Cron-based periodic scheduling for batch jobs, 24/7 continuous operation for streaming jobs
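
As an illustration of how a Cron expression decides when a batch job runs, the sketch below checks whether a five-field expression (minute, hour, day of month, month, day of week) matches a given time. It is a teaching aid, not the platform's scheduler. The function names field_matches and cron_matches are hypothetical, only `*`, single values, ranges, lists, and `/step` are handled, and all five fields must match, which is stricter than traditional cron when both day fields are restricted.

```python
from datetime import datetime


def field_matches(expr: str, value: int, low: int) -> bool:
    """Return True if one cron field (e.g. '*/15', '1-5', '0,30') matches value.

    low is the smallest legal value for the field (0 for minutes, 1 for days).
    """
    for part in expr.split(","):
        rng, _, step = part.partition("/")
        step_n = int(step) if step else 1
        if rng == "*":
            start, end = low, value          # wildcard: any legal value
        elif "-" in rng:
            a, b = rng.split("-")
            start, end = int(a), int(b)      # inclusive range
        else:
            start = end = int(rng)           # single value
        if start <= value <= end and (value - start) % step_n == 0:
            return True
    return False


def cron_matches(cron: str, when: datetime) -> bool:
    """Check a five-field cron expression against a timestamp (simplified)."""
    minute, hour, dom, month, dow = cron.split()
    # Cron counts Sunday as 0; datetime.weekday() counts Monday as 0.
    cron_dow = (when.weekday() + 1) % 7
    return (
        field_matches(minute, when.minute, 0)
        and field_matches(hour, when.hour, 0)
        and field_matches(dom, when.day, 1)
        and field_matches(month, when.month, 1)
        and field_matches(dow, cron_dow, 0)
    )


# "0 2 * * *" means every day at 02:00.
print(cron_matches("0 2 * * *", datetime(2024, 1, 15, 2, 0)))
# "*/15 * * * 1-5" means every 15 minutes, Monday to Friday.
print(cron_matches("*/15 * * * 1-5", datetime(2024, 1, 15, 9, 30)))
```

A streaming job needs no such expression because it runs continuously once started; the schedule only applies to periodic batch work.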
