Embedding Models


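SGLang supports embedding models by combining an efficient serving runtime with a flexible programming interface. This simplifies embedding workloads such as retrieval and semantic search, and the architecture is designed to use hardware resources more efficiently and reduce latency when embedding models are deployed.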

Important

Embedding models must be launched with the --is-embedding flag, and some may also require --trust-remote-code.

Example Launch Command

python3 -m sglang.launch_server \
  --model-path Alibaba-NLP/gme-Qwen2-VL-2B-Instruct \
  --is-embedding \
  --host 0.0.0.0 \
  --chat-template gme-qwen2-vl \
  --port 30000
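
The same command can be assembled programmatically when several embedding models have to be launched with different options. The following is a minimal sketch, not part of SGLang: `build_launch_command` is a hypothetical helper, and it uses only the flags that appear on this page.

```python
import shlex


def build_launch_command(model_path, chat_template=None, trust_remote_code=False,
                         host="0.0.0.0", port=30000):
    """Return the argv list for launching an SGLang embedding server.

    Hypothetical helper for illustration; only flags shown on this page are used.
    """
    cmd = [
        "python3", "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--is-embedding",            # required for every embedding model
        "--host", host,
        "--port", str(port),
    ]
    if chat_template is not None:    # only set when the model lists a chat template
        cmd += ["--chat-template", chat_template]
    if trust_remote_code:            # some models may require this flag
        cmd.append("--trust-remote-code")
    return cmd


gme_cmd = build_launch_command(
    "Alibaba-NLP/gme-Qwen2-VL-2B-Instruct", chat_template="gme-qwen2-vl"
)
print(shlex.join(gme_cmd))  # shell-quoted form, suitable for copy and paste
```

The returned list can be passed directly to `subprocess.Popen`, which avoids shell quoting problems.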

Example Client Request

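import requests

# Base URL of the server started with the launch command (port 30000)
url = "http://127.0.0.1:30000"

text_input = "Represent this image in embedding space."
image_path = "https://hugging-face.cn/datasets/liuhaotian/llava-bench-in-the-wild/resolve/main/images/023.jpg"

# Multimodal request: one text item and one image item
payload = {
    "model": "gme-qwen2-vl",
    "input": [
        {
            "text": text_input
        },
        {
            "image": image_path
        }
    ],
}

# Send the request to the embeddings endpoint and parse the JSON reply
response = requests.post(url + "/v1/embeddings", json=payload).json()

# Print the embedding vector of every entry in the response's "data" list
print("Embeddings:", [x.get("embedding") for x in response.get("data", [])])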
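
For retrieval and semantic search, embeddings are usually compared with cosine similarity. The sketch below shows that comparison in plain Python. The vectors are toy values made up for illustration, and the usage note assumes the server returns one embedding per input item in request order, which this page does not confirm.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors, made up for illustration; real embeddings are much longer
text_vec = [1.0, 0.0, 1.0]
image_vec = [1.0, 1.0, 0.0]
score = cosine_similarity(text_vec, image_vec)
print("similarity:", score)
```

With a real response, the two arguments would be something like `response["data"][0]["embedding"]` and `response["data"][1]["embedding"]`, under the ordering assumption stated above.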

Support Matrix

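Model Family (Embedding)	Example HuggingFace Identifier	Chat Template	Description
Llama/Mistral based (E5EmbeddingModel)	`intfloat/e5-mistral-7b-instruct`	N/A	Mistral/Llama-based embedding model fine-tuned for high-quality text embeddings (top-ranked on the MTEB benchmark).
GTE (QwenEmbeddingModel)	`Alibaba-NLP/gte-Qwen2-7B-instruct`	N/A	Alibaba's general text embedding model (7B), achieving state-of-the-art multilingual performance in English and Chinese.
GME (MultimodalEmbedModel)	`Alibaba-NLP/gme-Qwen2-VL-2B-Instruct`	`gme-qwen2-vl`	Multimodal embedding model (2B) based on Qwen2-VL that encodes image + text into a unified vector space for cross-modal retrieval.
CLIP (CLIPEmbeddingModel)	`openai/clip-vit-large-patch14-336`	N/A	OpenAI's CLIP model (ViT-L/14) for embedding images (and text) into a joint latent space; widely used for image similarity search.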
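
For the text-only models in the table, a request can carry plain strings instead of the text/image objects used in the multimodal example. The sketch below assumes the endpoint accepts a list of strings as `input`, following the OpenAI embeddings convention, and that the `model` field may hold the HuggingFace identifier. Both are assumptions, not confirmed on this page. `build_text_payload` and `extract_embeddings` are hypothetical helpers, and the sample reply is hand-written.

```python
def build_text_payload(model, texts):
    """Build an OpenAI-style embeddings request body for plain-text inputs."""
    if isinstance(texts, str):       # accept a single string as a convenience
        texts = [texts]
    return {"model": model, "input": list(texts)}


def extract_embeddings(response_json):
    """Pull the embedding vectors out of a parsed /v1/embeddings response."""
    return [item.get("embedding") for item in response_json.get("data", [])]


payload = build_text_payload(
    "intfloat/e5-mistral-7b-instruct", ["first passage", "second passage"]
)

# Hand-written stand-in for a server reply, used only to exercise the parser
fake_response = {"data": [{"embedding": [0.1, 0.2]}, {"embedding": [0.3, 0.4]}]}
vectors = extract_embeddings(fake_response)
```

Against a running server, `payload` would be sent with `requests.post(url + "/v1/embeddings", json=payload)`, as in the client example, and the parsed JSON passed to `extract_embeddings`.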