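Models

This solution provides two models: a general model and a landscape model. Both are based on MobileNetV3, with modifications to make them more efficient. The general model operates on a 256x256x3 (HWC) tensor and outputs a 256x256x1 tensor representing the segmentation mask. The landscape model is similar to the general model but operates on a 144x256x3 (HWC) tensor. It has fewer FLOPs than the general model and therefore runs faster. Note that MediaPipe Selfie Segmentation automatically resizes the input image to the required tensor dimensions before feeding it into the ML model.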
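To make the size difference concrete, the sketch below counts the input pixels of each model from the tensor shapes stated above. The ratio is only a rough proxy for the difference in FLOPs, not a measured benchmark, and the names `INPUT_SHAPES` and `pixel_count` are illustrative rather than part of MediaPipe.

```python
# Input tensor shapes (height, width, channels) as stated in the text.
INPUT_SHAPES = {
    "general": (256, 256, 3),
    "landscape": (144, 256, 3),
}


def pixel_count(shape):
    """Return the number of spatial positions (height * width) of an HWC shape."""
    height, width, _channels = shape
    return height * width


general_pixels = pixel_count(INPUT_SHAPES["general"])
landscape_pixels = pixel_count(INPUT_SHAPES["landscape"])

# Fraction of the general model's input that the landscape model processes.
ratio = landscape_pixels / general_pixels

print(general_pixels, landscape_pixels, ratio)
```

The landscape model sees a little over half as many input pixels, which is consistent with it being the faster of the two.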

Code implementation

import cv2
import mediapipe as mp
import numpy as np

mp_drawing = mp.solutions.drawing_utils
mp_selfie_segmentation = mp.solutions.selfie_segmentation

# Portrait matting on a still image:
IMAGE_FILES = []
# Background color. Another photo can be used instead,
# but it must be the same size as the original photo.
BG_COLOR = (0, 255, 0)
# bg_image = cv2.imread('6.jpg')
MASK_COLOR = (255, 255, 255)  # color of the mask image
file = '1.jpg'

with mp_selfie_segmentation.SelfieSegmentation(model_selection=0) as selfie_segmentation:
    image = cv2.imread(file)
    image_height, image_width, _ = image.shape
    # Convert the image to the RGB color space before processing.
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    results = selfie_segmentation.process(image)

    # Draw the segmentation on the background image.
    # To improve the segmentation around the boundary, consider applying
    # a bilateral filter to results.segmentation_mask.
    condition = np.stack((results.segmentation_mask,) * 3, axis=-1) > 0.1

    # To generate a solid white mask image instead, use:
    # fg_image = np.zeros(image.shape, dtype=np.uint8)
    # fg_image[:] = MASK_COLOR
    fg_image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
    bg_image = np.zeros(image.shape, dtype=np.uint8)
    bg_image[:] = BG_COLOR
    output_image = np.where(condition, fg_image, bg_image)

    cv2.imshow('output_image', output_image)
    cv2.waitKey(0)
    # cv2.imwrite('selfie0.png', output_image)

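First, import the required third-party libraries.

BG_COLOR is the color of the solid background. Because it is written into an image that OpenCV displays in BGR order, the tuple is read as (B, G, R).

MASK_COLOR is the color of the portrait mask, usually pure white.

model_selection=0 selects the model. The accepted values are 0 (general model) and 1 (landscape model).

Next, load the image to be segmented with cv2.imread and convert it to the RGB color space. The preprocessed image is then passed directly to selfie_segmentation.process(image), which performs the portrait segmentation.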

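The edge of the segmentation is controlled by the threshold in np.stack((results.segmentation_mask,) * 3, axis=-1) > 0.1. The smaller the final value, the more edge pixels are kept as foreground, so it is worth experimenting with.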
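The effect of the threshold can be illustrated without running the model. The sketch below uses a small hand-written array as a stand-in for results.segmentation_mask (an assumption for illustration only; real masks have the size of the input image), and the helper name `foreground_condition` is hypothetical.

```python
import numpy as np

# Synthetic stand-in for results.segmentation_mask: float values in [0, 1],
# high in the centre (person) and fading towards the border (background).
mask = np.array(
    [
        [0.0, 0.05, 0.05, 0.0],
        [0.05, 0.3, 0.6, 0.05],
        [0.05, 0.9, 0.95, 0.2],
        [0.0, 0.05, 0.15, 0.0],
    ],
    dtype=np.float32,
)


def foreground_condition(segmentation_mask, threshold):
    """Stack the single-channel mask to 3 channels and threshold it."""
    return np.stack((segmentation_mask,) * 3, axis=-1) > threshold


loose = foreground_condition(mask, 0.1)   # lower threshold: keeps more edge pixels
strict = foreground_condition(mask, 0.5)  # higher threshold: keeps only confident pixels

# Number of foreground pixels selected by each threshold.
print(int(loose[..., 0].sum()), int(strict[..., 0].sum()))
```

Every pixel selected by the stricter threshold is also selected by the looser one; lowering the threshold only adds low-confidence pixels near the boundary.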

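For fg_image, the image must be converted back to the BGR color space. np.where then merges the portrait region with the background image, and the matted result is displayed.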
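The compositing step can be seen in isolation on tiny arrays. This is a minimal sketch with made-up 2x2 "images"; the pixel values and the `person` array are illustrative assumptions, not output from the model.

```python
import numpy as np

# Tiny 2x2 "images" in BGR order, standing in for fg_image and bg_image.
fg_image = np.full((2, 2, 3), 200, dtype=np.uint8)
bg_image = np.zeros((2, 2, 3), dtype=np.uint8)
bg_image[:] = (0, 255, 0)  # solid green background

# Boolean map of where the person is, stacked to 3 channels so that it
# has the same shape as the images.
person = np.array([[True, False], [False, True]])
condition = np.stack((person,) * 3, axis=-1)

# Where condition is True take the foreground pixel, otherwise the background.
output_image = np.where(condition, fg_image, bg_image)

print(output_image[0, 0], output_image[0, 1])
```

Because np.where selects element by element, all three arrays must have the same shape (or be broadcastable to it), which is why the background has to match the size of the photo.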

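Sometimes we want to replace the background with another image instead of a solid color. The original code is modified as follows: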

import cv2
import mediapipe as mp
import numpy as np

mp_drawing = mp.solutions.drawing_utils
mp_selfie_segmentation = mp.solutions.selfie_segmentation

# Portrait matting on a still image:
IMAGE_FILES = []
# BG_COLOR = (0, 255, 0)
# Use another photo as the background.
# It must be the same size as the original photo.
bg_image = cv2.imread('6.jpg')
MASK_COLOR = (255, 255, 255)  # color of the mask image
file = '1.jpg'

with mp_selfie_segmentation.SelfieSegmentation(model_selection=0) as selfie_segmentation:
    image = cv2.imread(file)
    image_height, image_width, _ = image.shape
    # Convert the image to the RGB color space before processing.
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    results = selfie_segmentation.process(image)

    # Draw the segmentation on the background image.
    # To improve the segmentation around the boundary, consider applying
    # a bilateral filter to results.segmentation_mask.
    condition = np.stack((results.segmentation_mask,) * 3, axis=-1) > 0.1

    # To generate a solid white mask image instead, use:
    # fg_image = np.zeros(image.shape, dtype=np.uint8)
    # fg_image[:] = MASK_COLOR
    fg_image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
    # bg_image = np.zeros(image.shape, dtype=np.uint8)
    # bg_image[:] = BG_COLOR
    output_image = np.where(condition, fg_image, bg_image)

    cv2.imshow('output_image', output_image)
    cv2.waitKey(0)
    cv2.imwrite('selfie00.png', output_image)

Result of running the code

Real-time video segmentation

import cv2
import mediapipe as mp
import numpy as np

mp_drawing = mp.solutions.drawing_utils
mp_selfie_segmentation = mp.solutions.selfie_segmentation

BG_COLOR = (192, 192, 192)  # gray
cap = cv2.VideoCapture(0)
cv2.waitKey(2000)
# The background image must be the same size as the camera frames.
bg_image = cv2.imread('6.jpg')

with mp_selfie_segmentation.SelfieSegmentation(model_selection=1) as selfie_segmentation:
    while cap.isOpened():
        success, image = cap.read()
        if not success:
            print("Ignoring empty camera frame.")
            continue
        # shape is an attribute, not a method; read it only after a successful capture.
        print(image.shape)

        # Flip the frame horizontally for a selfie view and convert BGR to RGB.
        image = cv2.cvtColor(cv2.flip(image, 1), cv2.COLOR_BGR2RGB)
        # Mark the frame as read-only while it is being processed.
        image.flags.writeable = False
        results = selfie_segmentation.process(image)
        image.flags.writeable = True
        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

        condition = np.stack((results.segmentation_mask,) * 3, axis=-1) > 0.1
        # Fall back to a solid gray background if the image could not be loaded.
        if bg_image is None:
            bg_image = np.zeros(image.shape, dtype=np.uint8)
            bg_image[:] = BG_COLOR
        output_image = np.where(condition, image, bg_image)

        cv2.imshow('MediaPipe Selfie Segmentation', output_image)
        # Exit when Esc is pressed.
        if cv2.waitKey(5) & 0xFF == 27:
            break

cap.release()

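The code for real-time video segmentation has the same structure as the image version. The main thing to watch is that the camera delivers frames at a fixed size once it is opened, so the background image we load must have the same dimensions as the camera frames.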

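First we open the camera and load a background image. Each frame read from the camera is flipped and converted to the RGB color space, then passed to selfie_segmentation.process(image) for portrait segmentation. Finally, the segmented result is displayed in real time. The output video can of course also be saved.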

