Image Similarity Computation and Retrieval with the CLIP Model


This post shows how to compute image similarity with the CLIP model.

1. Image Similarity with CLIP

Computing the similarity between two images with CLIP is a simple two-step process: first extract the features of both images, then compute their cosine similarity, cos(a, b) = (a · b) / (‖a‖ ‖b‖), which measures the angle between the two feature vectors and ranges from -1 to 1. Before starting, make sure the required packages are installed; setting up and using a virtual environment is recommended:

# Start by setting up a virtual environment
virtualenv venv-similarity
source venv-similarity/bin/activate

# Install required packages
pip install transformers Pillow torch
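Optionally, confirm that PyTorch can see a GPU. This is not required, since the code below falls back to CPU automatically, just more slowly:

python -c "import torch; print(torch.cuda.is_available())"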

Next, compute the image similarity:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os

import torch
import torch.nn as nn
from PIL import Image
from transformers import AutoProcessor, CLIPModel

# Proxy settings used in the original environment to reach the Hugging Face
# Hub; adjust or remove as needed.
os.environ["http_proxy"] = "http://127.0.0.1:21882"
os.environ["https_proxy"] = "http://127.0.0.1:21882"

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Either approach works: download the model from the Hub (caching it
# locally), or load it from a local directory.
processor = AutoProcessor.from_pretrained("openai/clip-vit-base-patch32", cache_dir="./CLIPModel")
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32", cache_dir="./CLIPModel").to(device)
# processor = AutoProcessor.from_pretrained("CLIP-Model")
# model = CLIPModel.from_pretrained("CLIP-Model").to(device)

# Extract features from image1
image1 = Image.open('img1.jpg')
with torch.no_grad():
    inputs1 = processor(images=image1, return_tensors="pt").to(device)
    image_features1 = model.get_image_features(**inputs1)

# Extract features from image2
image2 = Image.open('img2.jpg')
with torch.no_grad():
    inputs2 = processor(images=image2, return_tensors="pt").to(device)
    image_features2 = model.get_image_features(**inputs2)

# Compute their cosine similarity and convert it into a score between 0 and 1
cos = nn.CosineSimilarity(dim=0)
sim = cos(image_features1[0], image_features2[0]).item()
sim = (sim + 1) / 2
print('Similarity:', sim)
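The same computation can be written more compactly with torch.nn.functional.cosine_similarity, which also handles batches of feature vectors. A minimal sketch, reusing image_features1 and image_features2 from the code above:

import torch.nn.functional as F

# Cosine similarity along the feature dimension (dim=1); this works for a
# single image of shape (1, 512) as well as a batch of shape (N, 512).
sim = F.cosine_similarity(image_features1, image_features2, dim=1)

# Rescale from [-1, 1] to a [0, 1] score, as in the snippet above
score = (sim + 1) / 2
print('Similarity:', score.item())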

[Figure: the two similar images used in this example]

Running the example on two similar images yields an impressive similarity score of 95.5%.

2. Image Similarity Retrieval

Before evaluating its performance in more depth, let's look at CLIP's retrieval results on images from the validation set of the COCO dataset. The pipeline is as follows:

1. Iterate over the dataset and extract the features of every image.
2. Store the embeddings in a FAISS index.
3. Extract the features of the input image.
4. Retrieve the three most similar images (a sketch for mapping the returned indices back to file paths follows the code below).

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os

import faiss
import numpy as np
import torch
from PIL import Image
from transformers import AutoProcessor, CLIPModel

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Load the CLIP model and processor from a local directory
processor_clip = AutoProcessor.from_pretrained("CLIP-Model")
model_clip = CLIPModel.from_pretrained("CLIP-Model").to(device)

# Input image
source = 'laptop.jpg'
image = Image.open(source)

# Collect the filenames of all images in the dataset
images = []
for root, dirs, files in os.walk('./test_data/'):
    for file in files:
        if file.endswith('jpg'):
            images.append(os.path.join(root, file))
print(images)

# Normalize an embedding and add it to the index
def add_vector_to_index(embedding, index):
    vector = embedding.detach().cpu().numpy()
    vector = np.float32(vector)
    faiss.normalize_L2(vector)
    index.add(vector)

def extract_features_clip(image):
    with torch.no_grad():
        inputs = processor_clip(images=image, return_tensors="pt").to(device)
        image_features = model_clip.get_image_features(**inputs)
    return image_features

# Create a FAISS index (CLIP ViT-B/32 embeddings are 512-dimensional)
index_clip = faiss.IndexFlatL2(512)

for image_path in images:
    img = Image.open(image_path).convert('RGB')
    clip_features = extract_features_clip(img)
    add_vector_to_index(clip_features, index_clip)

faiss.write_index(index_clip, "clip.index")

# Extract and normalize the features of the input image
with torch.no_grad():
    inputs_clip = processor_clip(images=image, return_tensors="pt").to(device)
    image_features_clip = model_clip.get_image_features(**inputs_clip)

def normalizeL2(embeddings):
    vector = embeddings.detach().cpu().numpy()
    vector = np.float32(vector)
    faiss.normalize_L2(vector)
    return vector

image_features_clip = normalizeL2(image_features_clip)

# Search the index for the 3 nearest neighbors
index_clip = faiss.read_index("clip.index")
d_clip, i_clip = index_clip.search(image_features_clip, 3)
print(d_clip)
print(i_clip)
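FAISS returns the positions of the matched vectors in insertion order, so the results can be mapped back to file paths through the images list built above. Note that because every vector is L2-normalized, the squared L2 distance is monotonically related to cosine similarity (‖a − b‖² = 2 − 2 · a · b for unit vectors), so IndexFlatL2 ranks results exactly as a cosine index would. A minimal sketch, assuming d_clip, i_clip, and images from the previous snippet:

# i_clip has shape (1, 3): indices of the 3 nearest stored vectors.
# d_clip holds the corresponding squared L2 distances (smaller = more similar).
for rank, (dist, idx) in enumerate(zip(d_clip[0], i_clip[0]), start=1):
    print(f"#{rank}: {images[idx]} (squared L2 distance {dist:.4f})")

Alternatively, faiss.IndexFlatIP over the same normalized vectors would return cosine similarities directly, with larger scores meaning more similar.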

The results are as follows:

[Figure: top-3 retrieval results for the input image]


