Processing Video Data in LakeInsight

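This demo covers:

1. Reading a raw video file, extracting key frames with Daft, and writing them to the LakeSoul multimodal lakehouse

2. Reading the video frames back from LakeSoul and displaying them

To see the demo code, open video_demo/video.ipynb in the LakeInsight demo environment.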

1. Read the video file and extract information with Daft

import logging
logging.disable(logging.CRITICAL)

Use daft.read_video_frames to extract the key frames from the video.

import daft
from daft import col, DataType
from daft.functions import encode_image

video_path = "/home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi"

# Set the Daft runner to Ray
daft.set_runner_ray(noop_if_initialized=True)

# Extract key frames at 640x480
df = daft.read_video_frames(
    path=video_path,
    image_height=480,
    image_width=640,
    is_key_frame=True,
    sample_interval_seconds=1.0,
)

# Record the source path and encode each frame as JPEG bytes
df = (
    df.with_column("video_path", daft.lit(video_path))
    .with_column("data", encode_image(col("data"), "JPEG"))
)

df.show()


| path (String) | frame_index (Int64) | frame_time (Float64) | frame_time_base (String) | frame_pts (Int64) | frame_dts (Int64) | frame_duration (Int64) | is_key_frame (Bool) | data (Binary) | video_path (String) |
|---|---|---|---|---|---|---|---|---|---|
| file:///home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi | 0 | 0 | 1001/30000 | 0 | 0 | 1 | true | b"ÿØÿàJFIF"... | /home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi |
| file:///home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi | 3 | 1.2012 | 1001/30000 | 36 | 36 | 1 | true | b"ÿØÿàJFIF"... | /home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi |
| file:///home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi | 5 | 2.002 | 1001/30000 | 60 | 60 | 1 | true | b"ÿØÿàJFIF"... | /home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi |
| file:///home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi | 8 | 3.2032 | 1001/30000 | 96 | 96 | 1 | true | b"ÿØÿàJFIF"... | /home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi |
| file:///home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi | 10 | 4.004 | 1001/30000 | 120 | 120 | 1 | true | b"ÿØÿàJFIF"... | /home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi |

(Showing first 5 of 5 rows)
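
The frame_time values in the output above are consistent with the usual container convention that a timestamp in seconds is frame_pts multiplied by frame_time_base. The sketch below checks this against the five rows shown, using only the Python standard library. The helper name frame_time_seconds is hypothetical and is not part of Daft.

```python
from fractions import Fraction


def frame_time_seconds(pts: int, time_base: str) -> float:
    """Convert a presentation timestamp to seconds using a 'num/den' time base."""
    return float(pts * Fraction(time_base))


# (frame_pts, frame_time) pairs copied from the output table
rows = [(0, 0.0), (36, 1.2012), (60, 2.002), (96, 3.2032), (120, 4.004)]

for pts, expected in rows:
    seconds = frame_time_seconds(pts, "1001/30000")
    # Allow for floating-point rounding when comparing with the displayed value
    assert abs(seconds - expected) < 1e-9
    print(pts, seconds)
```

Using Fraction keeps the multiplication exact until the final conversion to float, which avoids accumulating rounding error for large pts values.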

2. Write to the LakeSoul multimodal lakehouse

Use create_table to create a LakeSoul table, then write the frames to it.

from lakesoul.metadata import create_table
from lakesoul.ray import LakeSoulDatasink

# Derive the table schema from the Daft DataFrame
schema = df.schema().to_pyarrow_schema()
create_table(
    "video_frames_table",
    table_schema=schema,
    table_path="/tmp/lakesoul/video_frames_table",
)

# Write the frames to the table through a Ray dataset
ds = df.to_ray_dataset()
sink = LakeSoulDatasink("video_frames_table")
ds.write_datasink(sink)

3. Read the video data back from the lakehouse

Use ray.data.read_lakesoul() to read the LakeSoul table,
then convert the ray.data.Dataset to a Daft DataFrame
and decode the key-frame data column into images.

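import ray
import lakesoul.ray

# Read the LakeSoul table as a Ray dataset and convert it to a Daft DataFrame
df = ray.data.read_lakesoul("video_frames_table").to_daft()

from daft import col
from daft.functions import decode_image

# Decode the encoded bytes in "data" back into images
df = df.with_column("data", decode_image(col("data")))

df.show()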

| path (String) | frame_index (Int64) | frame_time (Float64) | frame_time_base (String) | frame_pts (Int64) | frame_dts (Int64) | frame_duration (Int64) | is_key_frame (Bool) | data (Image[RGB]) | video_path (String) |
|---|---|---|---|---|---|---|---|---|---|
| file:///home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi | 0 | 0 | 1001/30000 | 0 | 0 | 1 | true | &lt;Image&gt; | /home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi |
| file:///home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi | 3 | 1.2012 | 1001/30000 | 36 | 36 | 1 | true | &lt;Image&gt; | /home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi |
| file:///home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi | 5 | 2.002 | 1001/30000 | 60 | 60 | 1 | true | &lt;Image&gt; | /home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi |
| file:///home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi | 8 | 3.2032 | 1001/30000 | 96 | 96 | 1 | true | &lt;Image&gt; | /home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi |
| file:///home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi | 10 | 4.004 | 1001/30000 | 120 | 120 | 1 | true | &lt;Image&gt; | /home/maji/data/Projects/multi/video/data/UCF101_subset/train/Basketball/v_Basketball_g01_c01.avi |

(Showing first 5 of 5 rows)
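
The b"ÿØÿàJFIF" preview in the first output table is what a JPEG header looks like when its bytes are shown as Latin-1 text: ÿØ is the JPEG start-of-image marker (FF D8) and ÿà is the APP0 marker (FF E0) that precedes the JFIF identifier. As a minimal sketch, assuming the raw values of the data column are available as Python bytes objects (for example before decode_image is applied), a check like the following could flag values that are not JPEG. The helper names and the sample payload are hypothetical and are not part of Daft or LakeSoul.

```python
JPEG_PREFIX = b"\xff\xd8\xff"  # SOI marker (FF D8) plus the first byte of the next marker


def looks_like_jpeg(blob: bytes) -> bool:
    """Return True if the bytes start with the JPEG start-of-image marker."""
    return blob.startswith(JPEG_PREFIX)


def looks_like_jfif(blob: bytes) -> bool:
    """Return True if the bytes start with a JPEG/JFIF header."""
    # Layout: SOI (2 bytes), APP0 marker (2 bytes), segment length (2 bytes), then "JFIF\0"
    return blob[:4] == b"\xff\xd8\xff\xe0" and blob[6:11] == b"JFIF\x00"


# The preview characters map back to the header bytes under Latin-1
assert "ÿØÿà".encode("latin-1") == b"\xff\xd8\xff\xe0"

# Hypothetical minimal header, not real frame data
sample = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00" + bytes(9)
print(looks_like_jpeg(sample), looks_like_jfif(sample))

# An AVI container header is not a JPEG
print(looks_like_jpeg(b"RIFF\x00\x00\x00\x00AVI "))
```

A prefix check only inspects the header, so it cannot detect a truncated or corrupted image body. It is a cheap first filter, not a substitute for decoding.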