
Multilingual Chatbot Tutorial: RAG + SUTRA Integration Guide

June 16, 2025
4 min read

Imagine a chatbot that answers course or community questions in Hindi, Bengali, or Spanish, pulling accurate insights from PDFs or discussion threads. With Retrieval-Augmented Generation (RAG) and the multilingual SUTRA model by TWO Platforms, this is not a distant vision; it's achievable today.

In this tutorial, we'll break down the basics of RAG and walk you through building your own multilingual chatbot using the SUTRA model.

What is RAG (Retrieval-Augmented Generation)?

RAG is a framework introduced by Lewis et al. in 2020 to improve the performance of large language models (LLMs). It works by combining retrieval of external data with text generation, allowing models to stay up-to-date and minimize hallucinations.

📌 RAG Workflow:

  1. Retrieval: Finds relevant information from sources like PDFs or community threads.

  2. Generation: Uses the retrieved context to generate accurate responses.

RAG enables:

  • More factual and grounded responses

  • Dynamic access to domain-specific knowledge

  • Enhanced user experience in Q&A/chatbots
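In code, this two-stage flow boils down to something like the sketch below. The retrieve() and generate() helpers are hypothetical placeholders to illustrate the idea, not a specific library API; the later steps use LangChain components for the real thing.

# Minimal illustrative RAG loop (hypothetical helpers, not a real library API)
def retrieve(query, index, k=3):
    # 1. Retrieval: return the k chunks most similar to the query
    return index.search(query, k=k)

def generate(llm, query, chunks):
    # 2. Generation: answer the query using only the retrieved chunks as context
    context = "\n\n".join(chunks)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return llm(prompt)

def rag_answer(query, index, llm):
    return generate(llm, query, retrieve(query, index))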

What is the SUTRA Model?

SUTRA, developed by TWO Platforms, is a family of large multilingual language models (LMLMs) that supports over 50 languages, including Hindi, Tamil, Bengali, and Spanish.

  • Built with a dual-transformer architecture

  • Ideal for conversational agents, education, and community platforms

  • Especially suited for multilingual RAG applications

🔗 Why Combine RAG with SUTRA?

By combining RAG + SUTRA, you can:

  • Retrieve relevant data from course PDFs or threads

  • Generate answers in multiple languages

  • Support education, community support, and social learning platforms globally

🚀 Getting Started

Prerequisites

  • A Sutra API key

  • An OpenAI API key

  • Python libraries:

pip install -q langchain langchain_openai langchain-community faiss-cpu requests pypdf python-docx

🛠️ Step-by-Step: Build Your Multilingual RAG Chatbot

Step 1: Get Your API Keys

Sign up and grab your free API key here:

👉 https://www.two.ai/sutra/api

Step 2: Configure Environment

import os

# In Google Colab, read the keys from the Secrets panel
from google.colab import userdata
os.environ["SUTRA_API_KEY"] = userdata.get("SUTRA_API_KEY")
os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")

# Running locally instead? Skip the Colab lines above and set the keys directly:
# os.environ["SUTRA_API_KEY"] = "your-sutra-api-key"
# os.environ["OPENAI_API_KEY"] = "your-openai-api-key"

Step 3: Load PDF Documents

from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("/content/NIPS-2017-attention-is-all-you-need-Paper.pdf")
documents = loader.load()

print(f"Loaded {len(documents)} pages.")

Step 4: Split Documents into Chunks

from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=100
)

chunks = text_splitter.split_documents(documents)
print(f"Split into {len(chunks)} chunks.")

Step 5: Create Embeddings and Store in FAISS

from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(chunks, embeddings)
retriever = vectorstore.as_retriever()
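Before wiring up the chain, a quick similarity search is a handy sanity check that the index returns relevant chunks (the query string here is just an example):

# Fetch the two chunks most similar to a test query
hits = vectorstore.similarity_search("What is multi-head attention?", k=2)
for doc in hits:
    print(doc.page_content[:150], "\n---")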

Step 6: Set Up Conversational RAG Chain

from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain
from langchain_openai import ChatOpenAI

memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True
)

rag_chain = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(
        api_key=os.getenv("SUTRA_API_KEY"),
        base_url="https://api.two.ai/v2",
        model="sutra-v2",
        temperature=0.5
    ),
    retriever=retriever,
    memory=memory
)
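At this point you can already query the chain directly; the question below is just a smoke test (Step 7 adds the multilingual layer on top):

# One-off test of the conversational RAG chain
result = rag_chain.invoke({"question": "What problem does the Transformer architecture address?"})
print(result["answer"])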

Step 7: Handle Multilingual Queries

from langchain.schema import HumanMessage

def ask_question(question, language="English"):
    # First pass: let the RAG chain retrieve context and draft an answer
    rag_response = rag_chain.invoke({"question": question})
    context = rag_response["answer"]

    # Second pass: ask SUTRA to answer in the requested language, grounded in that context
    prompt = f"""
    You are a helpful assistant that answers questions about documents.
    Use the following context to answer the question:

    CONTEXT:
    {context}

    Please respond in {language}.

    Question: {question}
    """

    chat = ChatOpenAI(
        api_key=os.getenv("SUTRA_API_KEY"),
        base_url="https://api.two.ai/v2",
        model="sutra-v2",
        temperature=0.7
    )

    response = chat.invoke([HumanMessage(content=prompt)])
    return response.content
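Note the two-pass design: the RAG chain first retrieves context and drafts an answer, then a second SUTRA call rewrites that answer in the requested language. Keeping retrieval language-agnostic and handling language control in a separate prompt lets you switch the output language per request without rebuilding the chain.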

Step 8: Test Multilingual Queries

# Hindi
response_hi = ask_question("What is a transformer?", language="Hindi")
print("Hindi Response:\n", response_hi)

# Bengali
response_bn = ask_question("ট্রান্সফরমার কী?", language="Bengali")
print("Bengali Response:\n", response_bn)

💬 Example Output

🔹 Hindi Response:
Transformer एक न्यूरल नेटवर्क आर्किटेक्चर है जो पूरी तरह से अटेंशन मेकेनिज्म (ध्यान तंत्र) पर निर्भर करता है...
(English: A Transformer is a neural network architecture that relies entirely on attention mechanisms...)

🔹 Bengali Response:
Transformer একটি নিউরাল নেটওয়ার্ক আর্কিটেকচার যা সম্পূর্ণভাবে অ্যাটেনশন মেকানিজমের উপর ভিত্তি করে কাজ করে...
(English: A Transformer is a neural network architecture that works entirely on the basis of attention mechanisms...)

Share Your Work

Contribute your chatbot to the open-source community:

  • ✨ Submit to sutra-cookbook GitHub repo

  • 🚀 Share your notebook with your team or audience

Tips & Tricks

  • ✅ Multilingual Power: Use sutra-v2 to support 50+ languages

  • 📚 Optimal Chunks: chunk_size=1000 with chunk_overlap=100 is a solid starting point; tune it for your documents

  • 🌍 Community First: Star the repo and share feedback

Conclusion

Combining RAG with SUTRA empowers you to build intelligent, multilingual, document-aware chatbots—perfect for education, community discussions, and global learning.

🔗 Resources & Community

  1. Website: two.ai

  2. GitHub: sutra-cookbook

  3. Discord: Join the community

  4. Twitter: @sutra_dev

  5. LinkedIn: TWO Platforms
