
FlagEmbedding: Enhance AI Retrieval with Advanced Embeddings

February 20, 2025
4 min read


Introduction

In the era of AI-driven search and retrieval, FlagEmbedding emerges as a powerful open-source project aimed at improving information retrieval and large language model (LLM) augmentation through advanced embeddings. This blog post will guide you through the features, implementation, and practical applications of FlagEmbedding, providing a deep dive into its components and functionalities. By the end of this article, you'll gain a comprehensive understanding of how FlagEmbedding enhances retrieval accuracy, improves ranking, and optimizes language model adaptability.

Key Features of FlagEmbedding

FlagEmbedding offers a suite of robust features tailored for diverse retrieval needs:

  • BGE M3-Embedding 🌍: Supports multi-lingual, multi-granular embeddings and enables both dense and sparse retrieval.
  • Visualized-BGE 🖼️: Fuses text and image embeddings for hybrid retrieval tasks.
  • LM-Cocktail 🍹: Blends fine-tuned and base models to improve adaptability in retrieval scenarios.
  • LLM Embedder 🤖: Optimized for knowledge retrieval, memory augmentation, and tool retrieval.
  • BGE Reranker 🔄: Re-ranks top-k results for enhanced accuracy.

Installation

Before diving into implementation, install FlagEmbedding via pip:

pip install -U FlagEmbedding

FlagEmbedding Model Initialization

To begin using FlagEmbedding, initialize the model as follows:

from FlagEmbedding import FlagAutoModel

model = FlagAutoModel.from_finetuned('BAAI/bge-base-en-v1.5',
                                      query_instruction_for_retrieval="Represent this sentence for searching relevant passages:",
                                      use_fp16=True)

Explanation

  • FlagAutoModel.from_finetuned loads a pre-trained BGE model optimized for retrieval tasks.
  • query_instruction_for_retrieval provides context for how the sentence should be represented for search.
  • use_fp16=True runs the model in half precision (FP16), reducing memory use and speeding up inference with minimal loss of accuracy.

Use Case

This is ideal for document retrieval systems, search engines, and LLM augmentation, where users need to match queries with relevant passages efficiently.
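Under the hood, the query instruction is simply prepended to each query (and not to passages) before encoding, so the model knows it is embedding a search query rather than a document. A minimal sketch of that behavior, using hypothetical helper names rather than the library's internal code:

```python
# Sketch (not the library's internals): BGE-style models prepend the
# retrieval instruction to queries -- but not to passages -- before encoding.
INSTRUCTION = "Represent this sentence for searching relevant passages:"

def prepare_query(query: str, instruction: str = INSTRUCTION) -> str:
    """Prepend the retrieval instruction to a query string."""
    return f"{instruction} {query}"

def prepare_passage(passage: str) -> str:
    """Passages are encoded as-is, with no instruction."""
    return passage

print(prepare_query("what is FlagEmbedding?"))
print(prepare_passage("FlagEmbedding is an open-source retrieval toolkit."))
```

Because only queries receive the instruction, the same model can embed an asymmetric query/passage pair into a shared vector space.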

Encoding Sentences with FlagEmbedding

Now, let's encode some sentences and generate their embeddings:

sentences_1 = ["I love NLP", "I love machine learning"]
sentences_2 = ["I love BGE", "I love text retrieval"]
embeddings_1 = model.encode(sentences_1)
embeddings_2 = model.encode(sentences_2)

Explanation

  • Sentence embeddings are numerical representations that capture semantic meaning.
  • model.encode(sentences) converts textual sentences into high-dimensional vector embeddings.

Computing Sentence Similarity

Once embeddings are generated, compute cosine similarity between sentences:

similarity = embeddings_1 @ embeddings_2.T
print(similarity)

Expected Output

[[0.6538745  0.7568528 ]
 [0.6559792  0.72265273]]

Explanation

  • Because BGE embeddings are L2-normalized, the dot product (@) between them equals cosine similarity.
  • Higher values indicate greater similarity between sentences.

Use Case

This technique is beneficial in recommendation systems, duplicate content detection, and contextual search engines.
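To make the retrieval use case concrete, here is a minimal top-k search sketch with mock NumPy vectors standing in for model.encode(...) output (the dimensions and data are illustrative):

```python
import numpy as np

# Mock embeddings standing in for model.encode(...) output.
rng = np.random.default_rng(0)
query_emb = rng.normal(size=(1, 768))
corpus_embs = rng.normal(size=(5, 768))

# Normalize rows to unit length (mirrors what the model does internally),
# so the dot product below is exactly cosine similarity.
query_emb /= np.linalg.norm(query_emb, axis=1, keepdims=True)
corpus_embs /= np.linalg.norm(corpus_embs, axis=1, keepdims=True)

scores = (query_emb @ corpus_embs.T).ravel()   # cosine similarity per passage
top_k = np.argsort(-scores)[:3]                # indices of the 3 best passages
print(top_k, scores[top_k])
```

In a real system the corpus embeddings would be precomputed once and stored, so each query costs only one encode plus a matrix product.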

AutoReranker: Enhancing Ranking Accuracy

FlagEmbedding provides an AutoReranker for improving search result ranking.

from FlagEmbedding import FlagAutoReranker

reranker = FlagAutoReranker.from_finetuned('BAAI/bge-reranker-large',
                                           query_max_length=256,
                                           passage_max_length=512,
                                           use_fp16=True,
                                           devices=['cuda:0'])

score = reranker.compute_score(['query', 'passage'])
print(score)

Explanation

  • FlagAutoReranker.from_finetuned loads a large reranker model.
  • query_max_length & passage_max_length control the input sizes.
  • use_fp16=True and devices=['cuda:0'] run the model in half precision on the GPU for faster scoring.

Expected Output

[-1.513671875]

This raw score is a logit: higher values indicate stronger query-passage relevance, and negative values are common.

Use Case

This is useful for search engines, chatbots, and knowledge bases, where ranking precision is crucial.
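A typical pipeline retrieves a cheap top-k with the embedding model and then rescores each (query, passage) pair with the reranker. The sketch below mimics that reordering step with a toy word-overlap scorer standing in for reranker.compute_score (all names and data here are illustrative, not the real cross-encoder):

```python
# Two-stage pipeline sketch: candidates come from an embedding retriever,
# then a reranker rescores each (query, passage) pair and reorders them.
query = "how to install FlagEmbedding"
candidates = [
    "FlagEmbedding supports multi-lingual retrieval.",
    "Install FlagEmbedding with: pip install -U FlagEmbedding",
    "BGE rerankers rescore top-k results.",
]

def mock_rerank_score(query: str, passage: str) -> float:
    """Toy stand-in scorer: counts shared words (a real reranker is a
    cross-encoder that reads query and passage jointly)."""
    q, p = set(query.lower().split()), set(passage.lower().split())
    return float(len(q & p))

reranked = sorted(candidates,
                  key=lambda p: mock_rerank_score(query, p),
                  reverse=True)
print(reranked[0])  # the installation passage ranks first
```

The reranker is slower per pair than a dot product, which is why it is applied only to the top-k candidates rather than the whole corpus.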

Normal Reranker: Standard Ranking Mechanism

For simpler ranking, a standard FlagReranker is available:

from FlagEmbedding import FlagReranker

reranker = FlagReranker('BAAI/bge-reranker-v2-m3',
                         query_max_length=256,
                         passage_max_length=512,
                         use_fp16=True,
                         devices=['cuda:0'])

score = reranker.compute_score(['query', 'passage'])
print(score)

Explanation

  • FlagReranker loads a specific cross-encoder checkpoint directly, whereas FlagAutoReranker infers the appropriate reranker class from the model name.

Expected Output

[-5.66015625]

Use Case

Suitable for e-commerce searches, FAQ retrieval, and support chatbots.
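Raw reranker scores are logits, which is why negative values appear in the output above. If you prefer scores in (0, 1), recent versions of compute_score accept a normalize flag that applies a sigmoid; the mapping itself is simply:

```python
import math

# Sigmoid maps a raw reranker logit into (0, 1), making scores easier
# to threshold and compare across queries.
def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

raw_score = -5.66015625        # the raw logit from the example above
print(sigmoid(raw_score))      # a value near 0: a low-relevance pair
```

A score near 0 after normalization means the pair is judged irrelevant; a score near 1 means highly relevant.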

LLM Reranker: Layer-wise Re-ranking

For advanced layer-wise ranking, use the LLM Reranker:

from FlagEmbedding import LayerWiseFlagLLMReranker

reranker = LayerWiseFlagLLMReranker('BAAI/bge-reranker-v2-minicpm-layerwise',
                                     query_max_length=256,
                                     passage_max_length=512,
                                     use_fp16=True,
                                     devices=['cuda:0'])

score = reranker.compute_score(['query', 'passage'], cutoff_layers=[28])
print(score)

Explanation

  • cutoff_layers selects which intermediate layer(s) of the model produce the relevance score, trading ranking quality against latency; a score is returned for each requested layer.

Expected Output

[-1.375]

Use Case

This is ideal for academic search engines, medical literature retrieval, and legal document ranking.

Conclusion

FlagEmbedding is a game-changer for AI-powered retrieval, offering flexible and powerful tools for embedding generation, reranking, and hybrid search. Key takeaways:

  • BGE embeddings power multi-lingual, dense, and sparse retrieval.
  • AutoReranker & Normal Reranker boost ranking accuracy.
  • Layer-wise reranking fine-tunes results for advanced use cases.

Whether you’re building a search engine, AI chatbot, or recommendation system, FlagEmbedding is a must-have tool.

Resources

  • FlagEmbedding GitHub
  • BAAI Models on Hugging Face
  • BERT for Text Retrieval
  • FlagEmbedding Experiment Notebook

---------------------------

Stay Updated: Follow Build Fast with AI pages for all the latest AI updates and resources.

Experts predict 2025 will be the defining year for Gen AI Implementation. Want to be ahead of the curve?

Join Build Fast with AI’s Gen AI Launch Pad 2025 - your accelerated path to mastering AI tools and building revolutionary applications.

---------------------------

Resources and Community

Join our community of 12,000+ AI enthusiasts and learn to build powerful AI applications! Whether you're a beginner or an experienced developer, this tutorial will help you understand and implement advanced retrieval in your projects.

  • Website: www.buildfastwithai.com
  • LinkedIn: linkedin.com/company/build-fast-with-ai/
  • Instagram: instagram.com/buildfastwithai/
  • Twitter: x.com/satvikps
  • Telegram: t.me/BuildFastWithAI