Traditional Object Detection in Practice: SIFT/ORB + Matching

2024/7/10 0:38:55 Tags: object detection, computer vision, deep learning

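Contents

  • Traditional Object Detection in Practice: SIFT/ORB + Matching
    • 1. Introduction
    • 2. Prerequisites
    • 3. Project Structure
    • 4. Utility Functions (utils.py)
    • 5. Detecting on Test Images (test_xxxx.py)
      • 5.1 With an Image Pyramid (test_PG.py)
      • 5.2 Without an Image Pyramid (test_noPG.py)
      • 5.3 Results (may not be great; needs further tuning)
    • 6. Summary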

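1. Introduction

The detection target in this article is cracks. The approach extracts keypoint features from images and matches them against template features to decide where a crack is present:

  • 1) Extract keypoint features (keypoints and descriptors) from the template images.
  • 2) Run an image pyramid plus a sliding window over the test image, extract keypoint features inside each window, and check whether they match the template features to decide whether the window contains a crack.
  • 3) Apply non-maximum suppression (NMS) to merge the windows that contain cracks.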
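
The sliding-window part of step 2 can be sketched as follows. This is a minimal illustration, assuming a hypothetical 600x700 blank image in place of a real test image; the generator mirrors the one in utils.py, and the window and step sizes are the ones used later in the article.

```python
import numpy as np

def sliding_window(image, window_size, step_size):
    # Same idea as the generator in utils.py: yields (row, col, crop)
    for row in range(0, image.shape[0], step_size[0]):
        for col in range(0, image.shape[1], step_size[1]):
            yield (row, col, image[row:row + window_size[0], col:col + window_size[1]])

# Hypothetical 600x700 grayscale image standing in for a test image
image = np.zeros((600, 700), dtype=np.uint8)
window_size = (256, 256)
step_size = (128, 128)

all_windows = list(sliding_window(image, window_size, step_size))
# Crops at the bottom/right border are smaller than window_size;
# the detection scripts skip them with a shape check like this one
full_windows = [(r, c) for r, c, crop in all_windows if crop.shape == window_size]

print(len(all_windows), len(full_windows))
```

With these sizes the generator yields 30 crops, of which 12 are full 256x256 windows. The strip along the bottom and right edges that no full window covers is never examined, which is worth keeping in mind when a crack sits near the image border.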

2. Prerequisites

  1. A brief introduction to machine-vision features: HOG, SIFT, SURF, ORB, LBP, HAAR
  2. cv2 keypoint matching (BF, FLANN)
  3. cv2 keypoint feature extraction (SIFT, ORB, SURF)

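3. Project Structure

[Figure: project directory structure]

  • data is the data folder; it contains base_img (template images), result (prediction output), and test_img (images used for prediction).
  • The remaining .py files are described in detail below.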

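4. Utility Functions (utils.py)

import numpy as np
import cv2

def get_sift_op():
    # SIFT produces float32 descriptors (match them with cv2.NORM_L2)
    return cv2.SIFT_create()

def get_orb_op():
    # ORB produces binary uint8 descriptors (match them with cv2.NORM_HAMMING)
    return cv2.ORB_create()

def sliding_window(image, window_size, step_size):
    # window_size and step_size are (rows, cols); yields (row, col, crop).
    # Crops at the bottom/right border may be smaller than window_size.
    for row in range(0, image.shape[0], step_size[0]):
        for col in range(0, image.shape[1], step_size[1]):
            yield (row, col, image[row:row + window_size[0], col:col + window_size[1]])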

def overlapping_area(detection_1, detection_2, show = False):
    '''
    Overlap ratio of two detections; detection: (x, y, pred_prob, width, height, area).
    The `show` argument is unused.
    '''
    # Top-left and bottom-right corners of detection_1
    x1_tl = detection_1[0]
    y1_tl = detection_1[1]
    x1_br = detection_1[0] + detection_1[3]
    y1_br = detection_1[1] + detection_1[4]

    # Top-left and bottom-right corners of detection_2
    x2_tl = detection_2[0]
    y2_tl = detection_2[1]
    x2_br = detection_2[0] + detection_2[3]
    y2_br = detection_2[1] + detection_2[4]

    # Size of the overlapping region (0 if the boxes do not intersect)
    x_overlap = max(0, min(x1_br, x2_br) - max(x1_tl, x2_tl))
    y_overlap = max(0, min(y1_br, y2_br) - max(y1_tl, y2_tl))
    overlap_area = x_overlap * y_overlap
    area_1 = detection_1[3] * detection_1[4]
    area_2 = detection_2[3] * detection_2[4]

    # Overlap ratio, method 1: intersection over union (IoU)
    # total_area = area_1 + area_2 - overlap_area
    # return overlap_area / float(total_area)

    # Overlap ratio, method 2 (used here): intersection over the smaller box's area
    return overlap_area / float(min(area_1, area_2))


def nms(detections, threshold=0.5):
    '''
    Non-maximum suppression to reduce overlapping regions;
    detection: (x, y, pred_prob, width, height, area).
    Detections are visited from highest to lowest confidence, and one is kept
    only if its overlap with every already-kept detection is at most `threshold`.
    The area field is not used for ordering in this version.
    At most 50 detections are kept.
    '''
    if len(detections) == 0:
        return []
    # Sort by confidence score (index 2), highest first
    detections = sorted(detections, key=lambda d: d[2], reverse=True)
    # The highest-confidence detection is always kept
    new_detections = [detections[0]]
    for detection in detections[1:]:
        # Cap the number of kept detections
        if len(new_detections) >= 50:
            break
        overlapping_small = True
        # Too much overlap with any kept detection: discard this one
        for new_detection in new_detections:
            if overlapping_area(detection, new_detection) > threshold:
                overlapping_small = False
                break
        # Overlap was small against every kept detection: keep it
        if overlapping_small:
            new_detections.append(detection)
    return new_detections
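
The two overlap measures mentioned in overlapping_area behave quite differently, which affects what NMS keeps. The sketch below is a simplified, standalone comparison; the helper name overlap_ratios and the (x, y, width, height) box layout are choices made for this illustration, not part of utils.py.

```python
def overlap_ratios(box_a, box_b):
    """Boxes are (x, y, width, height). Returns (iou, overlap_over_smaller_area)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    x_overlap = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    y_overlap = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = x_overlap * y_overlap
    area_a, area_b = aw * ah, bw * bh
    iou = inter / (area_a + area_b - inter)   # method 1
    over_min = inter / min(area_a, area_b)    # method 2, used by utils.py
    return iou, over_min

# A small box fully inside a large one (typical across pyramid levels)
nested = overlap_ratios((0, 0, 400, 400), (100, 100, 100, 100))

# Two 256x256 windows that are diagonal neighbours at step 128
diagonal = overlap_ratios((0, 0, 256, 256), (128, 128, 256, 256))

print(nested, diagonal)
```

For the nested pair, IoU is only 0.0625 while the ratio over the smaller area is 1.0, so method 2 suppresses a small window that lies inside a larger one even at strict thresholds. For the diagonal neighbours, IoU is about 0.14 and method 2 gives 0.25, so with the threshold of 0.2 used in the test scripts method 2 suppresses the second window while IoU would keep it.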

5. Detecting on Test Images (test_xxxx.py)

5.1 With an Image Pyramid (test_PG.py)

import numpy as np
import os
import glob
from skimage.transform import pyramid_gaussian
import cv2
from utils import *

# ---------------------- Setup ----------------------- start
window_size = (256, 256)
step_size = (128, 128)
op = get_sift_op()

# Extract keypoints and descriptors from every template image
base_dataset_path = os.path.expanduser('./data/base_img')
base_dataset_pos_lists = glob.glob(os.path.join(base_dataset_path, '*.jpg'))
kps = []
des = []
for base_path in base_dataset_pos_lists:
    img = cv2.imread(base_path, cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, window_size)
    kp, de = op.detectAndCompute(img, None)
    if de is None:  # no keypoints found in this template
        continue
    kps.append(kp)
    des.append(de)
# ---------------------- Setup ----------------------- end


img_name = 'a.jpg'
# Note: section 3 lists this folder as test_img; adjust the path to your layout
test_image = cv2.imread("./data/test/" + img_name, cv2.IMREAD_GRAYSCALE)
scale = 0  # current pyramid level
detections = []
downscale = 1.25
# SIFT descriptors are float vectors, so match with NORM_L2
# (NORM_HAMMING is for binary descriptors such as ORB)
bf = cv2.BFMatcher(normType=cv2.NORM_L2, crossCheck=True)

# test_PG and test_noPG differ in the line below, i.e. whether an image pyramid is used.
for test_image_pyramid in pyramid_gaussian(test_image, downscale=downscale):
    # Stop once the pyramid level is smaller than the window
    if test_image_pyramid.shape[0] < window_size[0] or test_image_pyramid.shape[1] < window_size[1]:
        break
    for (row, col, sliding_image) in sliding_window(test_image_pyramid, window_size, step_size):
        # Skip partial windows at the image border
        if sliding_image.shape != window_size:
            continue
        # pyramid_gaussian returns floats in [0, 1]; convert back to uint8 for the detector
        sliding_image = np.uint8(sliding_image*255)
        kp1, de1 = op.detectAndCompute(sliding_image, None)
        if de1 is None:  # no keypoints in this window
            continue
        # Best match count over all templates; only counts >= 20 qualify
        max_num = 0
        for de2 in des:
            matches = bf.match(de1, de2)
            num = len(matches)
            if num >= 20:
                if max_num < num:
                    max_num = num
        if max_num != 0:
            # Map the window back to original-image coordinates
            (window_height, window_width) = window_size
            real_height = int(window_height*downscale**scale)
            real_width = int(window_width*downscale**scale)
            detections.append((int(col*downscale**scale), int(row*downscale**scale), max_num, real_width, real_height, real_height*real_width))
    scale+=1

# Reload the test image in color for drawing
test_image1 = cv2.imread("./data/test/" + img_name, cv2.IMREAD_COLOR)
test_image_detect = test_image1.copy()
# Draw every raw detection in blue (OpenCV colors are BGR)
for detection in detections:
    col = detection[0]
    row = detection[1]
    width = detection[3]
    height = detection[4]
    cv2.rectangle(test_image_detect, pt1=(col, row), pt2=(col+width, row+height), color=(255, 0, 0), thickness=4)

print('before NMS')
cv2.imwrite("./data/result/_"+ img_name, test_image_detect)

threshold = 0.2
detections_nms = nms(detections, threshold)

test_image_detect = test_image1.copy()
# Draw the detections that survive NMS in green
for detection in detections_nms:
    col = detection[0]
    row = detection[1]
    width = detection[3]
    height = detection[4]
    cv2.rectangle(test_image_detect, pt1=(col, row), pt2=(col+width, row+height), color=(0, 255, 0), thickness=4)

print('after NMS')
cv2.imwrite("./data/result/"+ img_name, test_image_detect)
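
The coordinate mapping inside the pyramid loop can be isolated into a small function to make it easier to check. This is a sketch of the same arithmetic; the function name to_original is hypothetical and does not appear in the scripts.

```python
def to_original(row, col, window_size, downscale, scale):
    """Map a window found at pyramid level `scale` back to the original image.

    Returns (x, y, width, height) in original-image pixels, using the same
    multiply-by-downscale**scale rule as test_PG.py.
    """
    factor = downscale ** scale
    window_height, window_width = window_size
    return (int(col * factor), int(row * factor),
            int(window_width * factor), int(window_height * factor))

# Level 0 is the original image, so nothing changes
level0 = to_original(128, 256, (256, 256), 1.25, 0)
# Two levels down, every coordinate grows by 1.25**2 = 1.5625
level2 = to_original(128, 256, (256, 256), 1.25, 2)

print(level0, level2)
```

A 256x256 window at level 2 therefore covers a 400x400 region of the original image, which is how a fixed-size window ends up detecting larger cracks on the coarser levels.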

5.2 Without an Image Pyramid (test_noPG.py)

import os
import glob
import cv2
from utils import *

# ---------------------- Setup ----------------------- start
window_size = (256, 256)
step_size = (128, 128)
op = get_sift_op()

# Extract keypoints and descriptors from every template image
base_dataset_path = os.path.expanduser('./data/base_img')
base_dataset_pos_lists = glob.glob(os.path.join(base_dataset_path, '*.jpg'))
kps = []
des = []
for base_path in base_dataset_pos_lists:
    img = cv2.imread(base_path, cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, window_size)
    kp, de = op.detectAndCompute(img, None)
    if de is None:  # no keypoints found in this template
        continue
    kps.append(kp)
    des.append(de)
# ---------------------- Setup ----------------------- end

img_name = 'a.jpg'
# Note: section 3 lists this folder as test_img; adjust the path to your layout
test_image = cv2.imread("./data/test/" + img_name, cv2.IMREAD_GRAYSCALE)
detections = []
# SIFT descriptors are float vectors, so match with NORM_L2
# (NORM_HAMMING is for binary descriptors such as ORB)
bf = cv2.BFMatcher(normType=cv2.NORM_L2, crossCheck=True)

# test_PG and test_noPG differ in the line below, i.e. whether an image pyramid is used.
for (row, col, sliding_image) in sliding_window(test_image, window_size, step_size):
    # Skip partial windows at the image border
    if sliding_image.shape != window_size:
        continue
    kp1, de1 = op.detectAndCompute(sliding_image, None)
    if de1 is None:  # no keypoints in this window
        continue
    # Best match count over all templates; only counts >= 20 qualify
    max_num = 0
    for de2 in des:
        matches = bf.match(de1, de2)
        num = len(matches)
        if num >= 20:
            if max_num < num:
                max_num = num
    if max_num != 0:
        detections.append((int(col), int(row), max_num, window_size[1], window_size[0], window_size[1]*window_size[0]))

# Reload the test image in color for drawing
test_image1 = cv2.imread("./data/test/" + img_name, cv2.IMREAD_COLOR)
test_image_detect = test_image1.copy()
# Draw every raw detection in blue (OpenCV colors are BGR)
for detection in detections:
    col = detection[0]
    row = detection[1]
    width = detection[3]
    height = detection[4]
    cv2.rectangle(test_image_detect, pt1=(col, row), pt2=(col+width, row+height), color=(255, 0, 0), thickness=4)

print('before NMS')
cv2.imwrite("./data/result/_"+ img_name, test_image_detect)

threshold = 0.2
detections_nms = nms(detections, threshold)

test_image_detect = test_image1.copy()
# Draw the detections that survive NMS in green
for detection in detections_nms:
    col = detection[0]
    row = detection[1]
    width = detection[3]
    height = detection[4]
    cv2.rectangle(test_image_detect, pt1=(col, row), pt2=(col+width, row+height), color=(0, 255, 0), thickness=4)

print('after NMS')
cv2.imwrite("./data/result/"+ img_name, test_image_detect)

5.3 Results (may not be great; needs further tuning)

[Figure: detection results]
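
One possible direction for tuning, not used in the scripts above, is to filter matches with Lowe's ratio test before counting them, so that ambiguous matches do not inflate the per-window score. The sketch below shows the idea in plain NumPy on synthetic descriptors; the function name ratio_test_matches, the 0.75 ratio, and the random data are all assumptions for illustration. With OpenCV the same filtering is usually done on the output of knnMatch with k=2 on a matcher created with crossCheck=False.

```python
import numpy as np

def ratio_test_matches(query, train, ratio=0.75):
    """Return (query_idx, train_idx) pairs that pass the ratio test.

    A match is kept only if its nearest neighbour is clearly closer than
    the second nearest. Requires at least two rows in `train`.
    """
    # Pairwise Euclidean distances, shape (len(query), len(train))
    dists = np.linalg.norm(query[:, None, :] - train[None, :, :], axis=2)
    order = np.argsort(dists, axis=1)
    best, second = order[:, 0], order[:, 1]
    rows = np.arange(len(query))
    keep = dists[rows, best] < ratio * dists[rows, second]
    return [(int(q), int(best[q])) for q in rows[keep]]

rng = np.random.default_rng(0)
# Synthetic float descriptors standing in for SIFT output
train = rng.random((40, 128)).astype(np.float32)
# First 10 query rows are slightly noisy copies of train rows; the rest are unrelated
query = np.vstack([
    train[:10] + rng.normal(0, 0.01, (10, 128)).astype(np.float32),
    rng.random((10, 128)).astype(np.float32),
])

good = ratio_test_matches(query, train)
print(len(good))
```

In this synthetic setup only the noisy copies are expected to survive the filter, while the unrelated rows are rejected because their two nearest neighbours are almost equally far away. Whether this helps on real crack images would need to be checked against the results shown above.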

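6. Summary

The detection target in this article is cracks: keypoint features are extracted from the images and then matched against template features to detect the target. The same approach can be extended to detecting other kinds of objects.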

