Artificial Intelligence


Introduction

In this article we will learn how to perform image processing in Python. The library used throughout is scikit-image (imported as skimage).

Basics

1. What is an image?

Image data is probably the most common kind of data after text. So how does a computer understand your selfie in front of the Eiffel Tower?

It represents the picture as a grid of small squares called pixels. Each pixel covers a tiny area and holds a value that encodes its color. The more pixels an image has, the more detail it can capture and the more memory it needs for storage.

Image processing mostly manipulates these individual pixels (and sometimes groups of pixels) so that computer vision algorithms can extract more information from them.


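2. Image basics with NumPy and skimage

In both Matplotlib and skimage, images are loaded as NumPy ndarrays.

    from skimage.io import imread  # pip install scikit-image

    image = imread("images/colorful_scenery.jpg")

    >>> type(image)
    numpy.ndarray

NumPy arrays bring flexibility, speed, and power, and image processing is no exception.

ndarrays make it easy to retrieve general details about an image, such as its dimensions:

    >>> image.shape
    (853, 1280, 3)
    >>> image.ndim
    3
    >>> image.size  # total number of array elements: 853 * 1280 * 3
    3275520

Our image is 853 pixels high and 1280 pixels wide. The third dimension holds the values of the RGB (red, green, blue) color channels. Color images are most commonly stored as 3D arrays like this one.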
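To see what these attributes report without needing the image file, here is a minimal sketch on a synthetic array. The name fake_image and its dimensions are made up for illustration.

```python
import numpy as np

# A hypothetical 4x6 RGB image filled with zeros (stand-in for a real photo).
fake_image = np.zeros((4, 6, 3), dtype=np.uint8)

height, width, channels = fake_image.shape

print(fake_image.shape)  # (rows, columns, channels)
print(fake_image.ndim)   # number of array dimensions
print(fake_image.size)   # total element count: height * width * channels

# The number of pixels is height * width, which is smaller than .size
# for a color image because .size also counts the channels.
n_pixels = height * width
```

Note that the shape is ordered (height, width, channels), so rows come before columns.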

You can retrieve individual pixel values with regular NumPy indexing. Below, we slice the image to pull out each of the three color channels:

    red = image[:, :, 0]
    compare(image, red, "Red Channel of the Image", cmap_type="Reds_r")

    green = image[:, :, 1]
    compare(image, green, "Green Channel of the Image", "Greens_r")

    blue = image[:, :, 2]
    compare(image, blue, "Blue Channel of the Image", "Blues_r")

Index 0 is the red channel, 1 is green, and 2 is blue. It is that simple.

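The code uses two helper functions, show and compare, which display a single image or two images side by side for comparison. Their definitions are not reproduced in this article, but we will use both extensively throughout the tutorial.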
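Since the helpers are not shown, here is one possible sketch of them built on Matplotlib. The signatures are an assumption inferred from how show and compare are called in this tutorial (title_filtered, cmap_type, axis, title_original); the original author's versions may differ.

```python
import matplotlib.pyplot as plt
import numpy as np


def show(image, cmap_type="gray", title=None, axis=False):
    """Display a single image; axes are hidden unless axis=True."""
    fig, ax = plt.subplots()
    ax.imshow(image, cmap=cmap_type)
    if title is not None:
        ax.set_title(title)
    if not axis:
        ax.axis("off")
    return fig


def compare(original, filtered, title_filtered="Filtered",
            cmap_type="gray", axis=False, title_original="Original"):
    """Display two images side by side for comparison."""
    fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(14, 6))
    ax1.imshow(original)
    ax1.set_title(title_original)
    ax2.imshow(filtered, cmap=cmap_type)
    ax2.set_title(title_filtered)
    if not axis:
        ax1.axis("off")
        ax2.axis("off")
    return fig


# Usage with a synthetic image (random noise stands in for a real photo).
rng = np.random.default_rng(0)
demo = rng.random((32, 32, 3))
fig = compare(demo, demo[:, :, 0], "Red channel", cmap_type="Reds_r")
```

The colormap only affects single-channel images; Matplotlib ignores it for RGB arrays.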

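By convention, the third dimension of the ndarray is used for the color channels, but this convention is not always followed. skimage usually provides a parameter to specify which axis holds the channels.

Images differ from the usual Matplotlib plots: their origin (0, 0) is at the top-left corner rather than the bottom-left.

    >>> show(image, axis=True)

When we draw an image in Matplotlib, the axes show the pixel positions, but we usually hide them.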
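The top-left origin can be checked directly on a small array. This is a sketch with a made-up 3x4 image.

```python
import numpy as np

# Hypothetical 3x4 grayscale image, all black.
img = np.zeros((3, 4), dtype=np.uint8)

# Index [0, 0] is the top-left pixel; the first index is the row (y),
# counted downward, and the second is the column (x), counted rightward.
img[0, 0] = 255    # top-left
img[-1, -1] = 128  # bottom-right

print(img)
```

Printing the array shows 255 in the first row and 128 in the last, matching how the image would be drawn.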

3. Common transformations

The most common transformation we will perform is converting a color image to grayscale. Many image processing algorithms require grayscale input, because color is often not the defining feature of a picture and the computer can still extract enough information without it.

    from skimage.color import rgb2gray

    image = imread("images/grayscale_example.jpg")

    # Convert image to grayscale
    gray = rgb2gray(image)

    compare(image, gray, "Grayscale Image")

    >>> gray.shape
    (853, 1280)

When an image is converted to grayscale it loses its third dimension, the color channels. Each cell of the array now holds a single intensity value. Note that rgb2gray returns floating-point values between 0 (black) and 1 (white), not uint8 integers; an 8-bit grayscale image stores the same information as integers from 0 to 255, giving 256 shades of gray.
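The conversion itself is a weighted sum of the three channels. The sketch below reproduces the idea in plain NumPy, using the luminance weights listed in the scikit-image documentation for rgb2gray; the function name to_gray and the sample pixels are made up for illustration.

```python
import numpy as np

# Luminance weights for R, G, B as given in the scikit-image rgb2gray docs.
WEIGHTS = np.array([0.2125, 0.7154, 0.0721])


def to_gray(rgb_uint8):
    """Convert an (H, W, 3) uint8 RGB array to float grayscale in [0, 1]."""
    rgb_float = rgb_uint8.astype(np.float64) / 255.0  # scale to [0, 1]
    return rgb_float @ WEIGHTS  # weighted sum over the channel axis


# Hypothetical 1x3 image: one pure red, one pure green, one white pixel.
pixels = np.array([[[255, 0, 0], [0, 255, 0], [255, 255, 255]]], dtype=np.uint8)
gray = to_gray(pixels)

print(gray.shape)  # the channel axis is gone
print(gray)
```

Green contributes most to perceived brightness, which is why the green pixel comes out much lighter than the red one.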

You can also manipulate images in any way you like with NumPy functions such as np.flipud and np.fliplr:

    import numpy as np

    kitten = imread("images/horizontal_flip.jpg")
    horizontal_flipped = np.fliplr(kitten)

    compare(kitten, horizontal_flipped, "Horizontally Flipped Image")

    ball = imread("images/upside_down.jpg")
    vertically_flipped = np.flipud(ball)

    compare(ball, vertically_flipped, "Vertically Flipped Image")
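What the two flips do is easiest to see on a tiny array. A minimal sketch with made-up values:

```python
import numpy as np

# Hypothetical 2x3 single-channel image.
img = np.array([[1, 2, 3],
                [4, 5, 6]])

flipped_lr = np.fliplr(img)  # mirror left-right: reverses the column order
flipped_ud = np.flipud(img)  # mirror up-down: reverses the row order

# The same results with plain slicing.
same_lr = img[:, ::-1]
same_ud = img[::-1, :]

print(flipped_lr)
print(flipped_ud)
```

Both functions also work on color images, since they only reorder the first two axes and leave the channel axis alone.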

The skimage.color module contains many other functions for working with the colors of an image.

4. Color channel histograms

Sometimes it helps to look at the intensity of each color channel to understand how the colors are distributed. We can do this by slicing out each channel and plotting its histogram. Here is a function that does so:

    import matplotlib.pyplot as plt

    def plot_with_hist_channel(image, channel):
        channels = ["red", "green", "blue"]
        channel_idx = channels.index(channel)
        color = channels[channel_idx]

        extracted_channel = image[:, :, channel_idx]

        fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(18, 6))

        ax1.imshow(image)
        ax1.axis("off")

        ax2.hist(extracted_channel.ravel(), bins=256, color=color)
        ax2.set_title(f"{channels[channel_idx]} histogram")

Apart from a few Matplotlib details, pay attention to the call to hist. After extracting the color channel, we flatten its array to 1D and pass it to hist.

The number of bins should be 256, one for each possible pixel value: 0 means the channel contributes no intensity and 255 means full intensity.
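The same counts can be computed without plotting. In this sketch, np.histogram is given an explicit range so that each of the 256 bins corresponds to exactly one pixel value; the random channel is a stand-in for a real image slice.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical 8-bit channel with random values (stand-in for image[:, :, 0]).
channel = rng.integers(0, 256, size=(50, 80), dtype=np.uint8)

# One bin per possible value: bin i counts the pixels whose value is i.
counts, bin_edges = np.histogram(channel.ravel(), bins=256, range=(0, 256))

print(counts.sum())  # equals the number of pixels in the channel
```

Without the explicit range, the bins would span only the values actually present in the image, so a bin would not always line up with a single intensity.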

Let's try it on the colorful scenery image:

    colorful_scenery = imread("images/colorful_scenery.jpg")

    plot_with_hist_channel(colorful_scenery, "red")

    >>> plot_with_hist_channel(colorful_scenery, "green")

    >>> plot_with_hist_channel(colorful_scenery, "blue")

You can also use a histogram to inspect the brightness of an image after converting it to grayscale:

    gray_color_scenery = rgb2gray(colorful_scenery)

    plt.hist(gray_color_scenery.ravel(), bins=256);

Most pixels have low values because the scenery image is fairly dark.

We will explore more applications of histograms in the following sections.

Filters

1. Manual thresholding

Now for something more interesting: filtering images. The first technique we will learn is thresholding. Let's load a sample image:

    stag = imread("images/binary_example.jpg")

    >>> show(stag)

Thresholding is widely used in image segmentation, object detection, and edge or contour extraction. Its main purpose is to separate the background of an image from the foreground.

Thresholding works best on high-contrast grayscale images:

    # Convert to grayscale
    stag_gray = rgb2gray(stag)

    >>> show(stag_gray)

We will start with basic manual thresholding and then move on to automatic thresholding.

First, let's look at the mean of all pixels in the grayscale image:

    >>> stag_gray.mean()
    0.20056262759859955

Note that the pixel values of this grayscale image lie between 0 and 1: rgb2gray rescales 8-bit input to floating point (dividing by 255, the maximum 8-bit value) before combining the channels.

The mean of about 0.2 gives us a first idea of the threshold we might want to use.

Now we use a threshold to build a mask. Pixels whose value is at or below the threshold become 0 (black) and pixels above it become 1 (white). In other words, we get a black-and-white binary image:

    # Set threshold
    threshold = 0.35

    # Binarize
    binary_image = stag_gray > threshold

    compare(stag, binary_image, "Binary image")

In this version the outline of the stag is much easier to make out. We can invert the mask so that the background becomes white:

    inverted_binary = stag_gray <= threshold

    >>> compare(stag, inverted_binary, "Binary image inverted")
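The comparison operators do all the work here. A minimal sketch on a made-up 2x3 grayscale array shows exactly which pixels end up in each mask:

```python
import numpy as np

# Hypothetical grayscale image with float values in [0, 1].
gray = np.array([[0.05, 0.20, 0.40],
                 [0.35, 0.80, 0.10]])

threshold = 0.35

binary = gray > threshold     # True (white) where brighter than the threshold
inverted = gray <= threshold  # the complementary mask

print(binary.astype(int))
print(inverted.astype(int))
```

A pixel exactly equal to the threshold (0.35 here) falls on the dark side of the first mask, so the two masks never overlap.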

2. Thresholding: global

While it can be fun to try different thresholds and watch their effect on the image, we usually perform thresholding with algorithms that are more robust than eyeballing.

There are many thresholding algorithms, so it can be hard to choose one. For this situation skimage provides the try_all_threshold function, which runs seven thresholding algorithms on a given grayscale image. Let's load an example and convert it:

    flower = imread("images/global_threshold_ex.jpg")
    flower_gray = rgb2gray(flower)

    compare(flower, flower_gray)

Let's see whether thresholding can bring out the features of the tulips:

    from skimage.filters import try_all_threshold

    fig, ax = try_all_threshold(flower_gray, figsize=(10, 8), verbose=False)

As you can see, some algorithms do better on this image while others do poorly. Otsu looks like one of the better ones, so we will continue with it.

At this point, I would like to draw your attention to the original tulip image:

    >>> show(flower)

The background of the image is uneven because too much light is coming in through the window behind. We can confirm this by plotting the histogram of the gray tulip image:

    >>> plt.hist(flower_gray.ravel(), bins=256);

As expected, most pixel values sit at the far (high) end of the histogram, which confirms that most of the image is bright.

Why does this matter? The performance of a thresholding algorithm changes with the brightness of the image. For this reason there are two broad types of thresholding algorithm:

Global: for photos with an even, uniform background
Local: for images with different brightness levels in different regions

The tulip image falls into the second category, because its right side is much brighter than the other half, making the background uneven. A global thresholding algorithm is a poor fit for it, which is why none of the algorithms in try_all_threshold gave a really good result.

We will return to the tulip example and local thresholding later. For now, let's load another image with much more even brightness and try automatic thresholding:

    spiral = imread("images/otsu_example.jpg")
    spiral_gray = rgb2gray(spiral)

    compare(spiral, spiral_gray)

We will use threshold_otsu, a common global thresholding algorithm in skimage:

    from skimage.filters import threshold_otsu

    # Find optimal threshold with threshold_otsu
    threshold = threshold_otsu(spiral_gray)

    # Binarize
    binary_spiral = spiral_gray > threshold

    compare(spiral, binary_spiral, "Binarized Image w. Otsu Thresholding")

That works much better!
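For intuition about what Otsu's method computes, here is a plain NumPy sketch: it tries every split of the histogram and keeps the one that maximizes the variance between the dark and bright classes. The function otsu_threshold and the synthetic two-cluster data are illustrative assumptions, not the code skimage runs.

```python
import numpy as np


def otsu_threshold(gray, nbins=256):
    """Return the threshold that maximizes the between-class variance."""
    counts, bin_edges = np.histogram(gray.ravel(), bins=nbins)
    centers = (bin_edges[:-1] + bin_edges[1:]) / 2

    # Number of pixels at or below each bin, and at or above each bin.
    weight_low = np.cumsum(counts)
    weight_high = np.cumsum(counts[::-1])[::-1]

    # Mean intensity of each of the two classes for every split point.
    mean_low = np.cumsum(counts * centers) / np.maximum(weight_low, 1)
    sum_high = np.cumsum((counts * centers)[::-1])[::-1]
    mean_high = sum_high / np.maximum(weight_high, 1)

    # Between-class variance when splitting between bin i and bin i + 1.
    variance = weight_low[:-1] * weight_high[1:] * (mean_low[:-1] - mean_high[1:]) ** 2
    return centers[np.argmax(variance)]


# Hypothetical image data: a dark cluster around 0.2 and a bright one around 0.8.
rng = np.random.default_rng(0)
gray = np.concatenate([rng.normal(0.2, 0.03, 5000), rng.normal(0.8, 0.03, 5000)])

t = otsu_threshold(gray)
print(t)  # lands in the gap between the two clusters
```

Because the choice depends only on the histogram, the method works well when the histogram has two clear peaks and poorly when lighting varies across the image.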

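3. Thresholding: local

Now we will use a local thresholding algorithm.

Instead of looking at the whole image, local algorithms look at pixel neighborhoods, which lets them account for uneven brightness in different regions. The common local algorithm in skimage is the threshold_local function:

    from skimage.filters import threshold_local

    local_thresh = threshold_local(flower_gray, block_size=3, offset=0.0002)

    binary_flower = flower_gray > local_thresh

    compare(flower, binary_flower, "Thresholded flower image")

You have to experiment with the offset parameter to find the result that suits your needs. offset is a constant subtracted from the weighted mean of the local pixel neighborhood. That neighborhood is set by the block_size parameter of threshold_local, an odd number giving the side length, in pixels, of the window examined around each point.

Having to tune both offset and block_size is clearly a drawback, but on unevenly lit images local thresholding usually gives better results than manual or global thresholding.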
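The difference between global and local thresholding can be reproduced on a synthetic image. This sketch assumes a made-up brightness ramp with two slightly brighter squares, and picks block_size=51, method="mean" and offset=-0.07 purely for this example.

```python
import numpy as np
from skimage.filters import threshold_local

# Hypothetical unevenly lit image: brightness ramps up from left to right,
# with a slightly brighter square "object" in each half.
ramp = np.tile(np.linspace(0.1, 0.8, 200), (100, 1))
image = ramp.copy()
image[40:60, 30:50] += 0.15    # object in the dark half
image[40:60, 150:170] += 0.15  # object in the bright half

# Global threshold: one value for the whole image.
global_mask = image > image.mean()

# Local threshold: each pixel is compared with the mean of its own
# 51x51 neighborhood; the negative offset raises that bar slightly.
local_thresh = threshold_local(image, block_size=51, method="mean", offset=-0.07)
local_mask = image > local_thresh

print(global_mask.sum(), local_mask.sum())
```

The global mask simply marks the bright half of the ramp and misses the object in the dark half, while the local mask picks out the two squares.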

Let's look at one more example:

    from skimage.filters import threshold_local

    handwriting = imread("images/chalk_writing.jpg")
    handwriting_gray = rgb2gray(handwriting)

    # Find optimal threshold using local
    # (block_size=3 is the default in recent versions; older ones require it)
    local_thresh = threshold_local(handwriting_gray, block_size=3, offset=0.0003)

    # Binarize
    binary_handwriting = handwriting_gray > local_thresh

    compare(handwriting, binary_handwriting, "Binarized image with local thresholding")

As you can see, after thresholding the handwriting on the blackboard is much more sharply defined.

4. Edge detection

Edge detection is useful in many ways, such as identifying objects, extracting features from them, and counting them.

We will start with the basic Sobel filter, which finds the edges of objects in a grayscale image. We will load a picture of coins and apply the Sobel filter to it:

    from skimage.filters import sobel

    coins = imread("images/coins_2.jpg")
    coins_gray = rgb2gray(coins)

    coins_edge = sobel(coins_gray)

    compare(coins, coins_edge, "Images of coins with edges detected")

sobel is straightforward: you just call it on a grayscale image to get output like the above. We will look at a more sophisticated edge detector in a later section.
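Under the hood, the Sobel filter estimates the intensity gradient with two small 3x3 kernels. The following NumPy sketch shows the idea on a made-up step image; it does not reproduce skimage's exact scaling of the result.

```python
import numpy as np

# 3x3 Sobel kernels: KX responds to horizontal intensity changes
# (vertical edges), KY to vertical changes (horizontal edges).
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
KY = KX.T


def sobel_magnitude(gray):
    """Gradient magnitude of a 2D float image using the Sobel kernels."""
    padded = np.pad(gray, 1, mode="edge")
    rows, cols = gray.shape
    gx = np.zeros((rows, cols))
    gy = np.zeros((rows, cols))
    # Correlate the image with each kernel by summing shifted copies.
    for dr in range(3):
        for dc in range(3):
            window = padded[dr:dr + rows, dc:dc + cols]
            gx += KX[dr, dc] * window
            gy += KY[dr, dc] * window
    return np.hypot(gx, gy)


# Hypothetical image: dark left half, bright right half (one vertical edge).
step = np.zeros((6, 8))
step[:, 4:] = 1.0
edges = sobel_magnitude(step)

print(edges)
```

The response is nonzero only in the two columns that straddle the step and zero in the flat regions, which is exactly what makes the output look like an outline.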

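5. Smoothing

Another image filtering technique is smoothing. Many images, like the chickens below, contain random noise that carries no useful information for ML and DL algorithms.

For example, the fine hairs around the chickens add noise to the image, which can pull an ML model's attention away from the main objects themselves. In such cases we use smoothing to blur the noise or edges and reduce contrast.

    chickens = imread("images/chickens.jpg")

    >>> show(chickens)

Gaussian smoothing is one of the most popular and most powerful smoothing techniques:

    from skimage.filters import gaussian

    # channel_axis=-1 marks the last axis as color channels
    # (scikit-image < 0.19 used multichannel=True instead)
    smoothed = gaussian(chickens, channel_axis=-1, sigma=2)

    compare(chickens, smoothed, "An image smoothed with Gaussian smoothing")

You can control the strength of the blur with the sigma parameter. If you are working with an RGB image, do not forget to mark the channel axis (channel_axis=-1, or multichannel=True in older versions), so that the color channels are not blurred into each other.

If the image has a very high resolution, the smoothing may not be visible to the naked eye, but it still takes effect.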
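The effect of sigma can be measured even when it is hard to see. This sketch blurs a made-up noisy grayscale image with two sigma values and compares how much variation survives.

```python
import numpy as np
from skimage.filters import gaussian

rng = np.random.default_rng(0)
# Hypothetical noisy grayscale image: flat mid-gray plus random noise.
noisy = 0.5 + 0.1 * rng.standard_normal((64, 64))

light_blur = gaussian(noisy, sigma=1)
heavy_blur = gaussian(noisy, sigma=3)

# A larger sigma averages over a wider neighborhood, so less noise survives.
print(noisy.std(), light_blur.std(), heavy_blur.std())
```

The standard deviation drops as sigma grows, which is the numerical counterpart of the image looking smoother.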

6. Contrast enhancement

Some kinds of images, such as medical scans, have low contrast that makes details hard to see, as shown below:

    xray = imread("images/xray.jpg")
    xray_gray = rgb2gray(xray)

    compare(xray, xray_gray)

In such cases we can use contrast enhancement to make the details clearer. There are two families of contrast enhancement algorithms:

Contrast stretching
Histogram equalization

In this article we will discuss histogram equalization, which comes in three types:

Standard histogram equalization
Adaptive histogram equalization
Contrast limited adaptive histogram equalization (CLAHE)

Histogram equalization spreads the most frequent intensity values across the available range, so that regions crowded into a narrow band of brightness are stretched apart and the overall contrast increases.

A simple measure of an image's contrast is the difference between its highest and lowest pixel values. It is a crude measure: here it is already at the 8-bit maximum of 255, even though most of the detail is hard to see.

    >>> xray.max() - xray.min()
    255

Now let's try standard histogram equalization from the exposure module:

    from skimage.exposure import equalize_hist

    enhanced = equalize_hist(xray_gray)

    >>> compare(xray, enhanced)

We can already see the details much more clearly.
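Standard histogram equalization boils down to mapping each pixel through the image's cumulative histogram. The sketch below shows that idea in NumPy on a made-up low-contrast image; the function name equalize is hypothetical and the details may differ from skimage's implementation.

```python
import numpy as np


def equalize(gray, nbins=256):
    """Histogram-equalize a float image in [0, 1] using its cumulative histogram."""
    counts, bin_edges = np.histogram(gray.ravel(), bins=nbins, range=(0.0, 1.0))
    cdf = np.cumsum(counts) / gray.size  # fraction of pixels at or below each bin
    centers = (bin_edges[:-1] + bin_edges[1:]) / 2
    # Map every pixel to the CDF value of its intensity.
    return np.interp(gray.ravel(), centers, cdf).reshape(gray.shape)


rng = np.random.default_rng(0)
# Hypothetical low-contrast image: all values squeezed into [0.4, 0.6].
low_contrast = 0.4 + 0.2 * rng.random((64, 64))
enhanced = equalize(low_contrast)

print(low_contrast.max() - low_contrast.min())  # narrow range before
print(enhanced.max() - enhanced.min())          # much wider range after
```

Intensities that many pixels share get pulled apart the most, because the cumulative histogram rises fastest there.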


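Next we will use CLAHE, which computes many histograms for different pixel neighborhoods of the image and so brings out more detail even in the darkest regions:

    from skimage.exposure import equalize_adapthist

    # Adjust clip_limit
    enhanced_adaptive = equalize_adapthist(xray_gray, clip_limit=0.4)

    compare(xray, enhanced_adaptive, "Image with contrast enhancement")

This one looks much better: it reveals detail in the background and a couple more of the ribs at the lower left that were missing before. You can adjust clip_limit to get more or less detail.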

7. Transformations

Images in a dataset may have inconsistent characteristics, such as different scales or misaligned rotations. ML and DL algorithms expect your images to have the same shape and size, so you need to know how to fix them.

Rotation

To rotate an image, use the rotate function from the transform module.

    from skimage.transform import rotate

    clock = imread("images/clock.jpg")

    clockwise = rotate(clock, angle=-60)

    compare(clock, clockwise, "Clockwise rotated image, use negative angles")

    anti_clockwise = rotate(clock, angle=33)

    compare(clock, anti_clockwise, "Anticlockwise rotated image, use positive angles")

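Rescaling

Another standard operation is rescaling an image.

We use the rescale function for this:

    butterflies = imread("images/butterflies.jpg")

    >>> butterflies.shape
    (720, 1280, 3)

    from skimage.transform import rescale

    # channel_axis=-1 marks the last axis as color channels
    # (scikit-image < 0.19 used multichannel=True instead)
    scaled_butterflies = rescale(butterflies, scale=3 / 4, channel_axis=-1)

    compare(
        butterflies,
        scaled_butterflies,
        "Butterflies scaled down by a factor of 3/4",
        axis=True,
    )

When an image has a high resolution, shrinking it too much can cause quality loss or jagged pixels that create unexpected edges or corners. To account for this effect, you can set anti_aliasing to True, which applies Gaussian smoothing before downscaling: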

https://gist.github.com/f7ae272b6eb1bce408189d8de2b71656

As before, the smoothing is not obvious at first glance, but it becomes apparent when you look at finer detail.
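The anti-aliasing effect can also be checked numerically. This sketch is a stand-in on synthetic data, not the code from the original gist: it shrinks a made-up high-frequency grayscale image with and without anti_aliasing and compares the results.

```python
import numpy as np
from skimage.transform import rescale

rng = np.random.default_rng(0)
# Hypothetical high-frequency grayscale image (pure random noise).
detailed = rng.random((64, 64))

# Shrink to a quarter of the size, without and with anti-aliasing.
plain = rescale(detailed, 0.25, anti_aliasing=False)
smooth = rescale(detailed, 0.25, anti_aliasing=True)

print(plain.shape, smooth.shape)
print(plain.std(), smooth.std())  # the anti-aliased result varies less
```

A grayscale array is used here so that no channel argument is needed; for an RGB image you would additionally mark the channel axis as in the butterflies example.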

Resizing

If you want the image to have a specific height and width instead of scaling it by a factor, use the resize function and pass output_shape:

    from skimage.transform import resize

    puppies = imread("images/puppies.jpg")

    # Also possible to set anti_aliasing
    puppies_600_800 = resize(puppies, output_shape=(600, 800))

    compare(puppies, puppies_600_800, "Puppies image resized 600x800 (height, width)")

Image restoration and enhancement

Images can become distorted, corrupted, or partially lost during file conversions, faulty downloads, and in many other situations.

In this part we will look at some image restoration techniques, starting with inpainting.

1. Inpainting

Inpainting algorithms can intelligently fill in gaps in an image. I could not find a damaged picture, so we will use this whale image and manually blank out a few regions:

    whale_image = imread("images/00206a224e68de.jpg")

    >>> show(whale_image)

    >>> whale_image.shape
    (428, 1916, 3)

The following function creates four blacked-out regions to simulate missing information in the image:

    import numpy as np

    def make_mask(image):
        """Create a mask to artificially defect the image."""
        mask = np.zeros(image.shape[:-1])

        # Make 4 masks
        mask[250:300, 1400:1600] = 1
        mask[50:100, 300:433] = 1
        mask[300:380, 1000:1200] = 1
        mask[200:270, 750:950] = 1

        return mask.astype(bool)

    # Create the mask
    mask = make_mask(whale_image)

    # Apply the defect mask on the whale_image
    image_defect = whale_image * ~mask[..., np.newaxis]

    compare(whale_image, image_defect, "Artificially damaged image of a whale")
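The expression that applies the mask relies on NumPy broadcasting. Here is a small sketch of the same step on a made-up 4x5 image, so the shapes are easy to follow:

```python
import numpy as np

# Hypothetical 4x5 RGB image where every value is 200.
image = np.full((4, 5, 3), 200, dtype=np.uint8)

# Boolean mask marking a 2x2 "damaged" region.
mask = np.zeros(image.shape[:-1], dtype=bool)
mask[1:3, 2:4] = True

# ~mask is True for pixels to keep; the new trailing axis lets the 2D mask
# broadcast across all three color channels.
keep = ~mask[..., np.newaxis]  # shape (4, 5, 1)
damaged = image * keep         # masked pixels become 0 (black)

print(keep.shape, damaged.shape)
```

Multiplying by a boolean array keeps the original dtype, so the damaged image is still uint8.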

We will use the inpaint_biharmonic function from the inpaint module to fill in the blanks, passing the mask we created:

    from skimage.restoration import inpaint

    # channel_axis=-1 replaces multichannel=True from scikit-image < 0.19
    restored_image = inpaint.inpaint_biharmonic(
        image=image_defect, mask=mask, channel_axis=-1
    )

    compare(
        image_defect,
        restored_image,
        "Restored image after defects",
        title_original="Faulty Image",
    )

As you can see, unless you have seen the faulty image it is hard to tell where the defects were.

Now, let's make some noise.


2. Noise

As mentioned earlier, noise plays an important role in image enhancement and restoration.

Sometimes you may want to add it to an image on purpose, as in the example below:

    from skimage.util import random_noise

    pup = imread("images/pup.jpg")

    noisy_pup = random_noise(pup)

    compare(pup, noisy_pup, "Noise puppy image")

We use the random_noise function to sprinkle random specks of color over the image. Note that by default random_noise adds Gaussian noise; the sparse white and black specks known as "salt and pepper" noise are produced by passing mode="s&p".
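The two noise modes behave quite differently, which a flat test image makes visible. This is a sketch on a made-up mid-gray image; amount=0.1 is an arbitrary choice for the example.

```python
import numpy as np
from skimage.util import random_noise

# Hypothetical flat mid-gray image, so any change is noise.
flat = np.full((100, 100), 0.5)

# Default mode: Gaussian noise, every pixel is shifted a little.
gaussian_noisy = random_noise(flat)

# Salt and pepper: roughly `amount` of the pixels are replaced by
# 1 (salt) or 0 (pepper); the rest are left untouched.
sp_noisy = random_noise(flat, mode="s&p", amount=0.1)

changed = sp_noisy != 0.5
print(changed.mean())       # fraction of replaced pixels
print(np.unique(sp_noisy))  # the distinct values present after s&p noise
```

Because salt-and-pepper noise touches only a small share of the pixels, filters that ignore outliers (such as a median filter) tend to remove it well.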

3. Denoising

In most cases, however, you want to remove noise from an image rather than add it. There are several types of denoising algorithm:

Total variation (TV) filter
Bilateral denoising
Wavelet denoising
Non-local means denoising

In this article we will only look at the first two. Let's try the TV filter first:

    from skimage.restoration import denoise_tv_chambolle

    # channel_axis=-1 replaces multichannel=True from scikit-image < 0.19
    denoised_pup_tv = denoise_tv_chambolle(noisy_pup, weight=0.2, channel_axis=-1)

    compare(
        noisy_pup,
        denoised_pup_tv,
        "Total Variation Filter denoising applied",
        title_original="Noisy pup",
    )

The higher the resolution of the image, the longer denoising takes. You can control the strength of the effect with the weight parameter.

Now let's try denoise_bilateral:

    from skimage.restoration import denoise_bilateral

    denoised_pup_bilateral = denoise_bilateral(noisy_pup, channel_axis=-1)

    compare(noisy_pup, denoised_pup_bilateral, "Bilateral denoising applied image")

It is not as effective as the TV filter here, as the comparison below shows:

    compare(
        denoised_pup_tv,
        denoised_pup_bilateral,
        "Bilateral filtering",
        title_original="TV filtering",
    )

4. Segmentation

Image segmentation is one of the most fundamental and most common topics in image processing. It is widely used in areas such as motion and object detection and image classification.

We have already seen one example of segmentation: thresholding an image to separate the foreground from the background.

In this section we will go further and learn, for example, how to divide an image into regions of similar pixels.

To get started with segmentation, we need to understand the concept of superpixels.

A single pixel represents only a tiny patch of color and is of little use once separated from the rest of the image. Segmentation algorithms therefore work with groups of pixels that share similar contrast, color, or brightness, called superpixels.

One algorithm that tries to find superpixels is Simple Linear Iterative Clustering (SLIC), which uses k-means clustering under the hood. Let's see how to use it on the coffee image that ships with skimage:

    from skimage import data

    coffee = data.coffee()

    >>> show(coffee)

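We will use the slic function from the segmentation module:

    from skimage.segmentation import slic

    segments = slic(coffee)

    >>> show(segments)

By default, slic looks for about 100 segments, or labels. To map them back onto the image we use the label2rgb function, which with kind="avg" paints each segment with its average color:

    from skimage.color import label2rgb

    final_image = label2rgb(segments, coffee, kind="avg")

    >>> show(final_image)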
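The averaging step can be written out by hand to see what happens to each segment. This NumPy sketch uses a made-up 2x4 image and a two-segment label array; it illustrates the idea rather than the library's implementation.

```python
import numpy as np

# Hypothetical 2x4 RGB image and a label array splitting it into two segments.
image = np.array([[[10, 0, 0], [30, 0, 0], [0, 100, 0], [0, 200, 0]],
                  [[20, 0, 0], [40, 0, 0], [0, 100, 0], [0, 200, 0]]], dtype=float)
labels = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1]])

averaged = np.zeros_like(image)
for label in np.unique(labels):
    region = labels == label
    # Replace every pixel of the segment with the segment's mean color.
    averaged[region] = image[region].mean(axis=0)

print(averaged[0, 0], averaged[0, 3])
```

Every pixel in the left segment ends up with the same reddish average, and every pixel in the right segment with the same greenish one.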

Let's wrap this operation in a function and try it with more segments:

    from skimage.color import label2rgb
    from skimage.segmentation import slic

    def segment(image, n_segments=100):
        # Obtain superpixels / segments
        # (use the image argument, not the global coffee variable)
        superpixels = slic(image, n_segments=n_segments)

        # Put the groups on top of the original image
        segmented_image = label2rgb(superpixels, image, kind="avg")

        return segmented_image

    # Find 500 segments
    coffee_segmented_2 = segment(coffee, n_segments=500)

    compare(coffee, coffee_segmented_2, "With 500 segments")

Segmentation makes it easier for computer vision algorithms to extract useful features from images.

5. Contours

Much of the information about an object lies in its shape. If we can detect the lines or outline of an object, we can extract valuable data from it.

Let's see how to find contours in practice using an image of dominoes.

    dominoes = imread("images/dominoes.jpg")

    >>> show(dominoes)

We will see whether the find_contours function in skimage can isolate the tiles and the dots. We will run it on a binary (black-and-white) image, so we threshold the image first.

    from skimage.measure import find_contours

    # Convert to grayscale
    dominoes_gray = rgb2gray(dominoes)

    # Find optimal threshold with threshold_otsu
    thresh = threshold_otsu(dominoes_gray)

    # Binarize
    dominoes_binary = dominoes_gray > thresh

    domino_contours = find_contours(dominoes_binary)

The result is a list of (n, 2) arrays holding the (row, column) coordinates of each contour:

    for contour in domino_contours[:5]:
        print(contour.shape)

    [OUT]:
    (371, 2)
    (376, 2)
    (4226, 2)
    (177, 2)
    (11, 2)
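The structure of the returned list is easier to inspect on a shape with a known outline. This sketch traces a made-up filled square and passes the level 0.5 explicitly.

```python
import numpy as np
from skimage.measure import find_contours

# Hypothetical binary image: a filled 10x10 square on a black background.
square = np.zeros((20, 20))
square[5:15, 5:15] = 1.0

# Trace the boundary where the value crosses 0.5.
contours = find_contours(square, 0.5)

for contour in contours:
    # Each contour is an (n, 2) array of (row, column) coordinates.
    print(contour.shape)
```

The coordinates are floating point because the contour runs between pixel centers, and the first and last points coincide when the contour is closed. Since the columns are ordered (row, column), plotting needs contour[:, 1] on the x-axis and contour[:, 0] on the y-axis.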

We will wrap the operation in a function called mark_contours:

    from skimage.color import rgb2gray
    from skimage.filters import threshold_otsu
    from skimage.measure import find_contours

    def mark_contours(image):
        """A function to find contours from an image"""
        gray_image = rgb2gray(image)

        # Find optimal threshold
        thresh = threshold_otsu(gray_image)

        # Mask
        binary_image = gray_image > thresh

        contours = find_contours(binary_image)

        return contours

To draw the contours on the image, we will create another function called plot_image_contours, which uses the function above:

    def plot_image_contours(image):
        fig, ax = plt.subplots()

        ax.imshow(image, cmap=plt.cm.gray)

        for contour in mark_contours(image):
            ax.plot(contour[:, 1], contour[:, 0], linewidth=2, color="red")

        ax.axis("off")

    >>> plot_image_contours(dominoes)

As we can see, most of the contours were detected successfully, but there are still some random squiggles in the center.

Let's denoise the dominoes image before passing it to the contour-finding function:

    # channel_axis=-1 replaces multichannel=True from scikit-image < 0.19
    dominoes_denoised = denoise_tv_chambolle(dominoes, channel_axis=-1)

    plot_image_contours(dominoes_denoised)

That's it! We have removed most of the noise that was producing spurious contours.

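Advanced operations

1. Edge detection

Earlier we used the Sobel algorithm to detect the edges of objects. Here we will use the Canny algorithm, which is more widely used because it generally gives more accurate, cleaner edges. As usual, the canny function requires a grayscale image.

This time we will use an image with more coins, so there are more edges to detect:

    coins_3 = imread("images/coins_3.jpg")

    # Convert to gray
    coins_3_gray = rgb2gray(coins_3)

    compare(coins_3, coins_3_gray)

To find the edges, we simply pass the image to the canny function:

    from skimage.feature import canny

    # Find edges with canny
    canny_edges = canny(coins_3_gray)

    compare(coins_3, canny_edges, "Edges detected with Canny algorithm")

The algorithm found the edges of almost all the coins, but the result is very noisy because the engravings on the coins were detected as well. We can lower canny's sensitivity by raising the sigma parameter:

    canny_edges_sigma_2 = canny(coins_3_gray, sigma=2.5)

    compare(coins_3, canny_edges_sigma_2, "Edges detected with less intensity")

As you can see, Canny now finds only the general outlines of the coins.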

2. Corner detection

Another important image processing technique is corner detection. In image classification, corners can serve as key features of an object.

To find corners we will use the Harris corner detection algorithm. Let's load a sample image and convert it to grayscale:

    windows = imread("images/windows.jpg")
    windows_gray = rgb2gray(windows)

    compare(windows, windows_gray)

We will use the corner_harris function to produce a measure image, a response map whose high values mark the regions where corners are likely.

    from skimage.feature import corner_harris

    measured_image = corner_harris(windows_gray)

    >>> show(measured_image)

Now we pass this measure image to the corner_peaks function, which returns the coordinates of the corners:

    from skimage.feature import corner_peaks

    corner_coords = corner_peaks(measured_image, min_distance=50)

    >>> len(corner_coords)
    79
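The two-step pipeline can be tried on a shape whose corners are known in advance. This sketch uses a small made-up white square, similar to the kind of toy example used in the scikit-image documentation; min_distance=1 and threshold_rel=0 are choices made for this tiny image.

```python
import numpy as np
from skimage.feature import corner_harris, corner_peaks

# Hypothetical image: a white square on a black background has four corners.
square = np.zeros((10, 10))
square[2:8, 2:8] = 1.0

response = corner_harris(square)  # high values where a corner is likely
coords = corner_peaks(response, min_distance=1, threshold_rel=0)

print(coords)  # one (row, column) pair per detected corner
```

On real photos a larger min_distance, as in the windows example, keeps the detector from reporting many peaks clustered around the same corner.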

The function found 79 corners. Let's wrap the operation in a function:

    def find_corner_coords(image, min_distance=50):
        # Convert to gray
        gray_image = rgb2gray(image)

        # Produce a measure image
        measure_image = corner_harris(gray_image)

        # Find coords
        coords = corner_peaks(measure_image, min_distance=min_distance)

        return coords

Now we will create another function that plots each corner using the coordinates produced by the function above:

    def show_image_cornered(image):
        # Find coords
        coords = find_corner_coords(image)

        # Plot them on top of the image
        plt.imshow(image, cmap="gray")
        plt.plot(coords[:, 1], coords[:, 0], "+b", markersize=15)
        plt.axis("off")

    show_image_cornered(windows)

Unfortunately, the algorithm did not work as expected. Instead of finding the corners of the windows, it placed the markers at the joints between the bricks.

These detections are noise and make the result useless. Let's denoise the image and pass it to the function again:

    # channel_axis=-1 replaces multichannel=True from scikit-image < 0.19
    windows_denoised = denoise_tv_chambolle(windows, channel_axis=-1, weight=0.3)

    show_image_cornered(windows_denoised)

Now this is much better! It found most of the window corners.

Conclusion

In a real computer vision problem you will not use all of these techniques at once. As you may have noticed, nothing we learned today is complicated, and each step takes a few lines of code at most. The tricky part is applying them to real problems in a way that actually improves the performance of your model.

Original: https://blog.csdn.net/m0_59485658/article/details/125953029
Author: 代码输入中…
Title: 人工智能——图像处理和Python深度学习的全教程（建议收藏）

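Related reading: A detailed look at the PyTorch DataLoader (pytorch dataloader详解)

Building your own dataloader is the first step of model training. This article introduces the PyTorch DataLoader and the classes related to it.

The DataLoader class has one required argument, dataset, so before building your own dataloader you first need to define your own Dataset class. Here is a rough overview of what the two classes do:

Dataset: the actual "data set". Once you tell it where the data is (at initialization), you can fetch data from it much as you would from an iterator. After subclassing it, you need to override __len__() and __getitem__().
DataLoader: the data loader. After setting a few parameters it loads data according to certain rules; for example, with batch_size set, it loads one batch_size worth of data at a time. It works like a generator.

Some readers may wonder: writing your own data loading tool does not seem very "hard", so why go to the trouble of subclassing PyTorch's classes and loading data by its rules? On this point, see the reference on the PyTorch dataloader; in summary:

When the dataset is large, loading it in a single process is slow.
Loading everything at once takes up a lot of memory (which is why the dataloader behaves like a generator and loads lazily).
Training usually needs some preprocessing or data augmentation first; PyTorch's dataloader already packages this, so you avoid reinventing the wheel.

I. Usage

Two steps:

Define your own Dataset class. Specifically: tell it where to read the data and resize the data to a uniform shape (think about why), then override __len__() and __getitem__(); in __getitem__, decide which data you want and return it.
Pass your Dataset instance to DataLoader, set the parameters you want, and you have your own dataloader.

Below we load the images in a directory together with their label: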

    import os

    import cv2
    import numpy as np
    from torch.utils.data.dataset import Dataset
    from torch.utils.data.dataloader import DataLoader

    img_dir = '/home/jyz/Downloads/classify_example/val/骏马/'
    anno_file = '/home/jyz/Downloads/classify_example/val/label.txt'


    class MyDataset(Dataset):
        def __init__(self, img_dir, anno_file, imgsz=(640, 640)):
            self.img_dir = img_dir
            self.anno_file = anno_file
            self.imgsz = imgsz
            self.img_namelst = os.listdir(self.img_dir)

        def __len__(self):
            return len(self.img_namelst)

        def __getitem__(self, idx):
            # Simplification: every image gets the label on the first line of the file
            with open(self.anno_file, 'r') as f:
                label = f.readline().strip()
            # Use self.img_dir rather than the global variable
            img = cv2.imread(os.path.join(self.img_dir, self.img_namelst[idx]))
            # Resize so that all samples share one shape and can be batched
            img = cv2.resize(img, self.imgsz)
            return img, label


    dataset = MyDataset(img_dir, anno_file)
    dataloader = DataLoader(dataset=dataset, batch_size=2)

    for img_batch, label_batch in dataloader:
        img_batch = img_batch.numpy()
        print(img_batch.shape)
        if img_batch.shape[0] == 2:
            # Two images in the batch: show them side by side
            img = np.hstack((img_batch[0], img_batch[1]))
        else:
            # Last batch may hold a single image
            img = np.squeeze(img_batch, axis=0)
        print(img.shape)
        cv2.imshow(label_batch[0], img)
        cv2.waitKey(0)

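The code above loads two images at a time; the result is shown below:

This also shows why the data should be resized to a uniform shape in the Dataset. When the dataloader loads data, it stacks one batch_size worth of samples into a single large tensor; if the shapes differ, they cannot be stacked. In the same way, these two images could not be shown side by side if their shapes differed.

II. Conclusion

To use PyTorch's dataloader, first build your own Dataset.
Building your own Dataset requires overriding __len__() and __getitem__().

Data: example data, extraction code: a1ds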
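The shape requirement can be illustrated with NumPy alone. In this sketch np.stack only stands in for the batching step, and the array sizes are made up; it is not PyTorch's actual code.

```python
import numpy as np

# Hypothetical "images" standing in for samples returned by __getitem__.
same_a = np.zeros((64, 64, 3), dtype=np.uint8)
same_b = np.zeros((64, 64, 3), dtype=np.uint8)
other = np.zeros((48, 64, 3), dtype=np.uint8)

# Equal shapes stack into one batch array with a new leading batch axis.
batch = np.stack([same_a, same_b])
print(batch.shape)

# Different shapes cannot be stacked.
try:
    np.stack([same_a, other])
    stacked_mismatch = True
except ValueError:
    stacked_mismatch = False

print(stacked_mismatch)
```

This is why the resize in __getitem__ happens before the sample is returned: by the time samples are combined into a batch, they must already agree in shape.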

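Original: https://blog.csdn.net/qq_34062683/article/details/126528869
Author: 惊瑟
Title: pytorch dataloader详解

Original articles are protected by copyright. When reposting, please cite the source: https://www.johngo689.com/325537/

Reposted articles are protected by the original author's copyright. When reposting, please credit the original author and source.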
