Agentic AI Native Capabilities
LakeInsight builds an Agentic AI core layer on top of the lakehouse foundation. Three major engines (Semantic Agent, AI Coding Agent, and Intelligent Query Agent) deliver AI-native capabilities across the full chain, from automatic extraction of semantics from code to intelligent development and natural language querying, forming a self-reinforcing closed loop.
LakeInsight Semantic Agent
The Semantic Agent is LakeInsight's AI-native semantic infrastructure. It automatically extracts structured semantic information from business code, with no manual configuration:
(1) Automated Semantic Extraction Pipeline
- Code Parsing: Scans SQL and data processing code used in warehouse modeling, ETL, and metric computation, identifying table structures, field definitions, and computation logic
- Lineage Extraction: Automatically resolves field-level dependency relationships and transformation chains, building full-chain data lineage graphs
- Metric & Terminology Recognition: Extracts business metric calculation logic and naming conventions from code, forming a unified business glossary
- Knowledge Graph Construction: Organizes extracted semantic information into a structured lakehouse knowledge graph encompassing tables, fields, metrics, terms, and their relationships
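The lineage extraction step can be illustrated with a deliberately small sketch. It is not LakeInsight's parser: it assumes a single `INSERT INTO ... SELECT` statement with one source table, explicit `AS` aliases, and no commas inside expressions, and the table and column names are invented for the example. A production pipeline would rely on a full SQL parser rather than regular expressions.

```python
import re
from collections import defaultdict

# Hypothetical ETL statement: one target table, one source table, explicit aliases.
SQL = """
INSERT INTO dws.order_daily
SELECT order_date AS dt,
       SUM(amount) AS gmv,
       COUNT(order_id) AS order_cnt
FROM dwd.orders
GROUP BY order_date
"""


def extract_lineage(sql: str) -> dict[str, set[str]]:
    """Map each target field to the source fields it is derived from."""
    target = re.search(r"INSERT\s+INTO\s+([\w.]+)", sql, re.I).group(1)
    source = re.search(r"\bFROM\s+([\w.]+)", sql, re.I).group(1)
    select_list = re.search(r"SELECT\s+(.*?)\s+FROM\b", sql, re.I | re.S).group(1)

    lineage: dict[str, set[str]] = defaultdict(set)
    # Splitting on commas is only safe because no expression here contains one.
    for item in select_list.split(","):
        expr, alias = re.split(r"\s+AS\s+", item.strip(), flags=re.I)
        # An identifier followed by "(" is a function name, not a column.
        for col in re.findall(r"\b([A-Za-z_]\w*)\b(?!\s*\()", expr):
            lineage[f"{target}.{alias}"].add(f"{source}.{col}")
    return dict(lineage)


if __name__ == "__main__":
    for dst, srcs in sorted(extract_lineage(SQL).items()):
        print(dst, "<-", sorted(srcs))
```

Each resulting pair (target field, source field) is one edge of a field-level lineage graph; the same edges could be stored as relationships in the knowledge graph described above.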
(2) Self-Reinforcing Closed Loop
- New code generated by upper-layer Agents automatically flows back into the business code layer
- The Semantic Agent performs continuous incremental parsing, so the semantic layer is updated automatically as development progresses
- The entire semantic layer stays current without manual maintenance
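One common way to implement incremental parsing is to track a content digest per source file and re-parse only files whose digest has changed. The sketch below assumes that approach; the source does not say how LakeInsight detects changes, and the class name and file paths are hypothetical.

```python
import hashlib


class IncrementalIndex:
    """Tracks which source files changed since the last semantic parse."""

    def __init__(self) -> None:
        self._digests: dict[str, str] = {}

    def changed(self, files: dict[str, str]) -> list[str]:
        """Return paths whose content differs from the last recorded digest.

        `files` maps a path to its current text. Recorded digests are
        updated as a side effect, so a second call with the same input
        returns an empty list.
        """
        stale = []
        for path, content in files.items():
            digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
            if self._digests.get(path) != digest:
                self._digests[path] = digest
                stale.append(path)
        return sorted(stale)


if __name__ == "__main__":
    index = IncrementalIndex()
    repo = {"etl/orders.sql": "SELECT 1", "etl/users.sql": "SELECT 2"}
    print(index.changed(repo))   # first scan: every file is new
    repo["etl/orders.sql"] = "SELECT 1 AS one"
    print(index.changed(repo))   # only the edited file needs re-parsing
```

Code written back by an upper-layer Agent would appear as a new or modified path and be picked up by the next scan, which is what closes the loop.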
AI Agent Core Engines
LakeInsight provides three major AI Agents covering data querying, code development, and diagnostics:
(1) Intelligent Query Agent
- Natural language interaction that automatically converts user questions into precise SQL queries
- Retrieves field meanings and metric definitions via the Semantic Retrieval API, ensuring query results align with business definitions
- Automatically infers JOIN paths for cross-table queries and resolves time ranges and aggregation granularity for ambiguous queries
- Designed for business analysts, enabling self-service analytics without understanding underlying table structures
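JOIN path inference can be framed as a shortest-path search over a graph whose nodes are tables and whose edges are known join relationships. The following sketch uses breadth-first search under that assumption; the tables, join conditions, and function names are hypothetical and are not taken from LakeInsight.

```python
from collections import deque

# Hypothetical join relationships: (table_a, table_b) -> join condition.
JOIN_EDGES = {
    ("orders", "customers"): "orders.customer_id = customers.id",
    ("orders", "order_items"): "orders.id = order_items.order_id",
    ("order_items", "products"): "order_items.product_id = products.id",
}


def build_adjacency(edges):
    """Turn the edge map into an undirected adjacency list."""
    adj = {}
    for (a, b), cond in edges.items():
        adj.setdefault(a, []).append((b, cond))
        adj.setdefault(b, []).append((a, cond))
    return adj


def infer_join_path(start, goal, edges=JOIN_EDGES):
    """Return the shortest list of (table, condition) steps from start to goal.

    Returns [] when start == goal and None when no path exists.
    """
    adj = build_adjacency(edges)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, path = queue.popleft()
        if table == goal:
            return path
        for nxt, cond in adj.get(table, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(nxt, cond)]))
    return None


if __name__ == "__main__":
    sql = "FROM customers"
    for table, cond in infer_join_path("customers", "products"):
        sql += f"\nJOIN {table} ON {cond}"
    print(sql)
```

Breadth-first search returns a path with the fewest joins. A real agent would likely also weigh join cardinality and business semantics when several paths exist.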
(2) AI Coding Agent
- Ships with built-in development skills for LakeSoul, Flink, and Spark
- Generates data ingestion, modeling, and publishing code from natural language descriptions of development requirements
- Generated code automatically aligns with existing metric definitions and business terminology, avoiding redundant definitions
- Supports one-click publishing of validated code as production tasks
(3) Diagnostic & Repair Agent
- Real-time monitoring of task execution status with automatic error log parsing
- Intelligent root cause analysis combining lineage graphs and semantic information
- Provides fix recommendations or automatic code repair, reducing operational costs
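Lineage-assisted root cause analysis can be sketched as an upstream walk: starting from a failed task, follow failed upstream dependencies until reaching failures whose own inputs all succeeded. The example below is an assumption about how such an analysis might work, with hypothetical table names and status values; the source does not describe the agent's algorithm.

```python
# Hypothetical lineage: each table maps to the tables it reads from.
UPSTREAM = {
    "ads.sales_report": ["dws.order_daily"],
    "dws.order_daily": ["dwd.orders"],
    "dwd.orders": ["ods.orders_raw"],
    "ods.orders_raw": [],
}

# Hypothetical status of the task that produces each table.
STATUS = {
    "ads.sales_report": "FAILED",
    "dws.order_daily": "FAILED",
    "dwd.orders": "FAILED",
    "ods.orders_raw": "SUCCESS",
}


def root_causes(failed, upstream, status):
    """Return the most upstream failed nodes reachable from `failed`.

    A node is a root cause candidate when none of its direct upstream
    dependencies failed, so the failure cannot be explained by bad input.
    """
    roots, stack, seen = set(), [failed], {failed}
    while stack:
        node = stack.pop()
        failed_parents = [
            p for p in upstream.get(node, []) if status.get(p) == "FAILED"
        ]
        if not failed_parents:
            roots.add(node)
        for parent in failed_parents:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return sorted(roots)


if __name__ == "__main__":
    print(root_causes("ads.sales_report", UPSTREAM, STATUS))
```

In this toy data the report fails because `dwd.orders` failed, so error log parsing and repair suggestions would be focused on that task rather than on the downstream symptoms.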
MCP Server Standardized Interfaces
- Standards-based interfaces, exposed via the Model Context Protocol (MCP), covering the full pipeline
- Encapsulates lakehouse operations, code development, and task management capabilities as standardized tools callable by Agents
- Supports seamless integration with external AI assistants and ecosystem tools
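In MCP, a server advertises each tool with a `name`, a `description`, and a JSON Schema `inputSchema`, and clients invoke it with a JSON-RPC 2.0 `tools/call` request. The sketch below shows what one such tool and request might look like. The tool name `query_lineage` and its arguments are hypothetical; the source does not list the tools LakeInsight actually exposes.

```python
import json

# Hypothetical tool descriptor in the shape MCP uses for tool listings.
LINEAGE_TOOL = {
    "name": "query_lineage",  # assumed name, not from LakeInsight docs
    "description": "Return upstream and downstream dependencies of a field.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "table": {"type": "string"},
            "field": {"type": "string"},
        },
        "required": ["table", "field"],
    },
}


def build_call_request(tool, arguments, request_id):
    """Build a JSON-RPC 2.0 `tools/call` request for the given tool.

    Only checks that required arguments are present; it does not perform
    full JSON Schema validation.
    """
    required = tool["inputSchema"].get("required", [])
    missing = [key for key in required if key not in arguments]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool["name"], "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_call_request(
        LINEAGE_TOOL, {"table": "dws.order_daily", "field": "gmv"}, request_id=1
    )
    print(json.dumps(request, indent=2))
```

Because the request format is the same for every tool, an external AI assistant that speaks MCP can call lakehouse operations, code development, and task management tools without a custom client for each.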
Unified Semantic Retrieval API
As the platform's unified entry point for semantic information, the Unified Semantic Retrieval API provides upper-layer Agents and users with:
- Field Meaning Lookup: Returns business meaning, data type, and associated metrics for given table and field names
- Metric Definition Lookup: Retrieves metric computation logic, data sources, and downstream dependencies
- Lineage Query: Traces complete upstream and downstream dependency chains for fields
- Knowledge Graph Retrieval: Graph-based search of semantic relationships between tables, fields, metrics, and terms
All user roles (data engineers, business analysts, and AI model developers) access precise semantics through the unified retrieval interface, without needing to understand how the underlying knowledge graph is constructed.
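To make the lookups above concrete, here is a minimal in-memory sketch of a retrieval interface that supports field meaning lookup and lineage queries. The class `SemanticCatalog`, its method names, and the sample data are all hypothetical; the actual API surface is not specified in this document.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FieldInfo:
    meaning: str
    data_type: str
    metrics: list[str] = field(default_factory=list)


class SemanticCatalog:
    """Toy stand-in for a semantic retrieval service."""

    def __init__(self) -> None:
        self._fields: dict[str, FieldInfo] = {}
        self._down: dict[str, set[str]] = {}  # source -> direct consumers
        self._up: dict[str, set[str]] = {}    # target -> direct sources

    def add_field(self, table: str, name: str, info: FieldInfo) -> None:
        self._fields[f"{table}.{name}"] = info

    def add_lineage(self, src: str, dst: str) -> None:
        self._down.setdefault(src, set()).add(dst)
        self._up.setdefault(dst, set()).add(src)

    def field_meaning(self, table: str, name: str) -> Optional[FieldInfo]:
        """Field Meaning Lookup: None when the field is unknown."""
        return self._fields.get(f"{table}.{name}")

    def downstream(self, fq_field: str) -> list[str]:
        """Lineage Query: every field that depends on `fq_field`."""
        return self._walk(fq_field, self._down)

    def upstream(self, fq_field: str) -> list[str]:
        """Lineage Query: every field that `fq_field` depends on."""
        return self._walk(fq_field, self._up)

    @staticmethod
    def _walk(start: str, graph: dict[str, set[str]]) -> list[str]:
        # Iterative traversal that collects all transitively reachable nodes.
        seen: set[str] = set()
        stack = [start]
        while stack:
            for nxt in graph.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return sorted(seen)


if __name__ == "__main__":
    catalog = SemanticCatalog()
    catalog.add_field(
        "dws.order_daily", "gmv",
        FieldInfo("Gross merchandise value per day", "DECIMAL", ["GMV"]),
    )
    catalog.add_lineage("dwd.orders.amount", "dws.order_daily.gmv")
    catalog.add_lineage("dws.order_daily.gmv", "ads.sales_report.total_gmv")
    print(catalog.field_meaning("dws.order_daily", "gmv"))
    print(catalog.downstream("dwd.orders.amount"))
```

Callers work only with table and field names; the graph storage behind `_up` and `_down` stays hidden, which mirrors the point that users do not need to know how the knowledge graph is built.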