QRAG replaces keyword Quran search with morphological analysis—understanding that the Arabic root ر-ح-م (mercy) appears 339 times in 12+ word forms. The system routes work to specialized models at different pipeline stages: Opus for deep research, k2p5 for production, and Minimax for Arabic TTS. Result: search that understands how Arabic form carries meaning.

When you search “mercy” in a typical Quran app, you get results. You also miss everything that matters.

You miss that the Arabic root ر-ح-م (r-ḥ-m) appears 339 times. You miss that it transforms into 12+ word forms—from the noun raḥma (mercy) to the adjective raḥīm (merciful) to the intensive raḥmān (most merciful). You miss that al-Baqarah opens with both al-Raḥmān and al-Raḥīm—not by accident, but by design.

Keyword search treats the Quran as a bag of words. QRAG treats it as what it actually is: a meticulously crafted Arabic text where form carries meaning.

This post shows you the architecture, the model selection strategy, and the multi-layered analysis pipeline. Working code, hard numbers, honest failure modes included.

The Core Problem: Search Without Understanding

Most Quran search engines are sophisticated keyword matchers. They tokenize your query, match against an inverted index, and return verses. They’re fast. They’re useful. They’re fundamentally shallow.

# What typical search gives you
def keyword_search(query: str) -> list[Ayah]:
    tokens = tokenize(query)
    return inverted_index.lookup(tokens)

This misses:

  • That the root ر-ح-م contains رَحْمَة، رَحِيم، رَحْمَان
  • That verb form II (تَفْعِيل) intensifies the meaning
  • That al-Raḥmān and al-Raḥīm are morphological variants with different implications
  • What centuries of scholars have said about these distinctions

QRAG solves this with three structural indices plus semantic search.

flowchart TD
    A[User Query<br/>English/French/etc] --> B[Translation Layer<br/>Claude → Arabic]
    B --> C{Query Engine}
    
    C --> D[Root Index<br/>1,800+ roots]
    C --> E[Lemma Index<br/>4,500+ lemmas]  
    C --> F[Word Index<br/>77,430 words]
    
    D --> G[Hybrid Ranker]
    E --> G
    F --> G
    G --> H[Vector DB<br/>Pinecone]
    H --> I[Response Generator<br/>Claude]
    
    style D fill:#e3f2fd
    style E fill:#e3f2fd
    style F fill:#e3f2fd
    style H fill:#fff3e0

The Three Structural Indices

| Index | Count  | Purpose                                        |
|-------|--------|------------------------------------------------|
| Root  | 1,800+ | Arabic roots (جذور) with all occurrences       |
| Lemma | 4,500+ | Dictionary forms (أُصول)                       |
| Word  | 77,430 | Every word with full morphological annotation  |

Each word carries detailed annotations:

{
  "location": "2:255:1",
  "arabic": "ٱللَّهُ",
  "root": "أله",
  "lemma": "ٱللَّه",
  "pos": "proper_noun",
  "case": "nominative"
}

Verb entries additionally carry verb_form (e.g. "I"), verb_tense (e.g. "perfect"), and derivation (e.g. "active_participle").

This enables queries no keyword matcher can handle:

  • “Show all verb forms of علم” → Returns Forms I, II, IV, V, VIII, X
  • “Compare رحم vs علم” → Frequency, lemma diversity, semantic fields
  • “What nouns derive from ضلل?” → Active participle, verbal noun, adjectives

Multi-Model Pipeline: Different Stages, Different Tools

Here’s the secret most AI systems miss: no single model does everything well.

The Single-Model Trap

Using one model for your entire pipeline is like using a single tool for carpentry. You’ll hammer everything, but the results will show it.

The same layer-based approach drives the Three-Layer AI Operations Pattern, where different models handle different phases of incident triage — the same principle applied to a different domain.

QRAG uses four specialized models:

| Stage          | Model            | Why                                      | Use Case                     |
|----------------|------------------|------------------------------------------|------------------------------|
| Deep Research  | Anthropic Opus 4 | Superior reasoning for pattern discovery | Complex linguistic analysis  |
| Production     | k2p5             | Arabic-optimized, excellent quality/cost | Standard queries             |
| Quick Analysis | Gemini 2.0 Flash | Speed when needed                        | Fast enrichment              |
| Speech         | Minimax          | Best Arabic TTS                          | Audio generation             |

Why This Matters

# Research: Use Opus for pattern discovery
research_agent = OpusAgent(
    system_prompt=RESEARCH_SYSTEM_PROMPT,
    tools=[analyze_query, discover_patterns, cross_reference_tafsirs]
)
# Result: Deep analysis, scholarly cross-references

# Production: Use k2p5 for response generation
orator = OratorSkill(
    model="k2p5",  # Default production
    pillars=[HOOK, BUILD, ARABIC_MOMENT, SCHOLARLY_DIALOGUE, LANDING]
)
# Result: Compelling narrative, fast response

# Content: Use Minimax for speech synthesis
audio = minimax.tts(
    text=oration_script,
    voice="scholar_male",
    language="ar"
)
# Result: Natural Arabic speech

The cost difference is significant:

  • Opus: ~$15/M tokens (deep analysis, used sparingly)
  • k2p5: ~$0.50/M tokens (production, high volume)
  • Minimax: ~$2/M characters (audio generation)

The 80/20 Rule

80% of queries can use k2p5. Reserve Opus for complex pattern discovery that requires deep reasoning. Your wallet will thank you.
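A minimal sketch of that routing, assuming a keyword heuristic and placeholder model identifiers; a production router might use a lightweight classifier instead:

```python
# Placeholder model names; the marker list is a crude heuristic
# standing in for a real complexity classifier.
COMPLEX_MARKERS = ("compare", "pattern", "across", "derive", "why")

def pick_model(query: str) -> str:
    """Escalate to the expensive model only for complex queries."""
    q = query.lower()
    if any(marker in q for marker in COMPLEX_MARKERS):
        return "opus-4"   # deep reasoning, ~30x the per-token cost
    return "k2p5"         # default production model

print(pick_model("What does رحم mean?"))                  # k2p5
print(pick_model("Compare رحم vs علم across verb forms"))  # opus-4
```

Even this crude gate keeps the expensive model out of the 80% of traffic that doesn't need it.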

The Four-Layer Analysis

When you ask “What is the significance of رحم?”, here’s what happens:

Layer 1: Root Analysis

result = engine.analyze_root("رحم")
# Returns:
# {
#   "occurrences": 339,
#   "lemmas": ["رَحْمَة", "رَحِيم", "رَحْمَان", ...],
#   "forms": {
#     "verb_form_I": ["يَرْحَمُ", "رَحِمَ"],
#     "noun_adjective": ["رَحِيم", "رَحْمَان"]
#   }
# }

Layer 2: Morphological Patterns

  • How does meaning shift between verb forms?
  • What nouns derive from this root?
  • What’s the semantic field?
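The first question — how occurrences distribute across verb forms — reduces to a tally over the annotated records. The sample rows below are illustrative, using field names from the word record shown earlier:

```python
from collections import Counter

# Illustrative occurrence records for the root علم; in QRAG these
# would come from the word index, not hand-written literals.
occurrences = [
    {"root": "علم", "verb_form": "I"},
    {"root": "علم", "verb_form": "II"},
    {"root": "علم", "verb_form": "I"},
    {"root": "علم", "verb_form": "V"},
]

def form_distribution(rows: list[dict]) -> Counter:
    """Tally verb forms, skipping non-verb entries."""
    return Counter(r["verb_form"] for r in rows if r.get("verb_form"))

print(form_distribution(occurrences))  # Counter({'I': 2, 'II': 1, 'V': 1})
```

A skewed distribution (say, mostly Form I with a few Form II) is itself a signal worth handing to the research model.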

Layer 3: Tafsir Cross-Reference

# Cross-reference multiple sources
tafsirs = {
    "zamakhshari": "Linguistic focus, balagha analysis",
    "razi": "Philosophical interpretation, theological implications",
    "ibn_kathir": "Traditional Sunni commentary"
}
# Present disagreements as dialogue, not contradictions
# Present disagreements as dialogue, not contradictions

Layer 4: Pattern Discovery

  • Thematic connections to other roots
  • Rhetorical devices used
  • Sound patterns that reinforce meaning
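Thematic connections can be approximated with simple co-occurrence counting over verse-level root sets. The mapping below is illustrative, not real corpus data:

```python
from collections import Counter

# Illustrative verse -> roots mapping; real data would be built
# from the word index.
verse_roots = {
    "1:3": {"رحم"},
    "2:37": {"توب", "رحم"},
    "6:12": {"كتب", "رحم", "نفس"},
}

def co_occurring(target: str) -> Counter:
    """Count roots that share a verse with the target root."""
    counts: Counter = Counter()
    for roots in verse_roots.values():
        if target in roots:
            counts.update(roots - {target})
    return counts

print(co_occurring("رحم"))
```

High co-occurrence counts become candidate "thematic neighbors" for the deep-research model to investigate.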

The Orator’s Craft: From Data to Revelation

Research is only half the story. The real magic is transforming dry analysis into something that makes people go “wow.”

The Orator skill applies five pillars inspired by classical Arabic rhetoric:

1. The Hook (الاستهلال)

Never begin with data. Begin with wonder.

❌ “The root ر-ح-م appears 339 times in the Quran.”

✅ “Of all the words Allah could have chosen to open His Book, He chose two derived from the same root: al-Raḥmān, al-Raḥīm. Not once—but twice. Why?”

2. The Build (البناء)

Layer understanding like an architect. Simple → Complex.

3. The Arabic Moment (اللحظة العربية)

Let the language itself teach—the sound, the root family, why this specific form.

4. The Scholarly Dialogue (الحوار العلمي)

Present scholars as characters in conversation across centuries:

“Al-Zamakhshari—that brilliant Mu’tazili grammarian from 12th-century Khwarezm—read this verse and saw…”

5. The Landing (الهبوط)

End with resonance, not summary. The listener should feel changed.

The Ruhaniyyah Pipeline: Content in Every Form

The system doesn’t just output text. It generates multiple content variants from a single research run:

flowchart LR
    A[Research] --> B[Orator]
    B --> C[Newsletter]
    B --> D[Blog]
    B --> E[Audio]
    B --> F[Video]
    
    style C fill:#e8f5e9
    style D fill:#e3f2fd
    style E fill:#fff3e0
    style F fill:#fce4ec

  • Newsletter: Full lecture style, comprehensive
  • Blog: SEO-optimized, scannable headings
  • Audio: TTS-ready script with Minimax voice synthesis
  • Video: Short-form script for TikTok/Shorts
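A minimal fan-out sketch, with hypothetical generator functions standing in for the model-backed formatters:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Research:
    topic: str
    findings: str

# Hypothetical variant generators; in the real pipeline each would
# call a model with a format-specific prompt.
def newsletter(r: Research) -> str:
    return f"This week: {r.topic}\n\n{r.findings}"

def short_video_script(r: Research) -> str:
    return f"HOOK: {r.topic}?\nPAYOFF: {r.findings[:80]}"

VARIANTS: dict[str, Callable[[Research], str]] = {
    "newsletter": newsletter,
    "video": short_video_script,
}

def fan_out(r: Research) -> dict[str, str]:
    """One research run -> every content variant."""
    return {name: gen(r) for name, gen in VARIANTS.items()}
```

The expensive research happens once; each variant is a cheap transformation of the same artifact.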

This is the “Ruhaniyyah” (spiritual) dimension—taking deep scholarship and making it accessible through multiple modalities.

What Doesn’t Work

Semantic Search Alone

Vector embeddings capture semantic similarity, but they miss morphological relationships. “Mercy” and “merciful” may sit close in embedding space, yet the embedding carries no signal that both trace back to a single root.

Fix: Hybrid search combining structural indices with vector similarity.
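A sketch of what such a hybrid score might look like; the weighting and function names are assumptions, not QRAG's actual ranker:

```python
def hybrid_score(vector_sim: float, same_root: bool,
                 alpha: float = 0.7) -> float:
    """Blend embedding similarity with a structural signal the
    embedding cannot see: shared Arabic root."""
    structural = 1.0 if same_root else 0.0
    return alpha * vector_sim + (1 - alpha) * structural

# A morphological cousin with middling embedding similarity can
# outrank an unrelated verse with higher raw similarity.
cousin = hybrid_score(0.62, same_root=True)      # ≈ 0.734
unrelated = hybrid_score(0.70, same_root=False)  # ≈ 0.49
```

Tuning alpha trades off semantic breadth against morphological precision; the structural term is what keeps رَحِيم and رَحْمَان ranked together.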

Single Tafsir Source

Relying on one commentary creates blind spots. Al-Zamakhshari (linguist) and al-Razi (theologian) see different things in the same verse.

Fix: Cross-reference multiple sources, present as dialogue.

Academic Output

Traditional academic writing about the Quran is accurate but inaccessible. It reads like a textbook, not a revelation.

Fix: Apply orator craft—hook, build, Arabic moment, scholarly dialogue, landing.

The Numbers

The 94% context reduction is context engineering in practice — semantic retrieval over a structured index instead of feeding the full corpus into every prompt.

| Feature                    | Count  |
|----------------------------|--------|
| Words analyzed             | 77,430 |
| Roots indexed              | 1,800+ |
| Lemmas tracked             | 4,500+ |
| Quran verses               | 6,236  |
| Tafsir sources             | 5+     |
| Root رحم occurrences       | 339    |
| Word forms from رحم        | 12+    |
| Context reduction (hybrid) | 94%    |

Try It Yourself

The morphological data comes from the Quranic Arabic Corpus, a linguistically annotated dataset covering all 77,430 words with POS tags, roots, and lemmas. Vector embeddings are stored in Pinecone for sub-second semantic retrieval.

# Clone and setup
git clone https://github.com/ameno-/qrag.git
cd qrag
uv sync

# Parse the corpus
uv run python main.py parse data/quran-morphology.txt

# Build the vector index
uv run python scripts/build_vector_db.py

# Query
uv run python test_queries.py
# Enter: "What is the significance of رحم?"

Or use the API:

curl -X POST http://localhost:38791/pipeline \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the significance of رحم?"}'

What’s Next

The goal isn’t just search—it’s building a system that helps people feel the depth of the Quranic text.

Coming soon:

  • Real-time Arabic TTS streaming with Minimax
  • Interactive verse exploration
  • Scholar voice synthesis with distinct character voices
  • Multi-language support (Urdu, Indonesian, etc.)

The intersection of morphological NLP, multi-model AI pipelines, and classical rhetorical traditions is where this project lives. It’s where linguistics meets revelation. The NomFeed architecture applied the same token-discipline approach to build a content ingestion system in 865K tokens; the architectural thinking carries across domains.


QRAG: Where Arabic morphology meets AI engineering.

Key Takeaways

  1. Morphological search beats keyword search by tracking 1,800+ Arabic roots and 4,500+ lemmas—not just matching words
  2. Multi-model pipeline: Opus for pattern discovery, k2p5 for production, Minimax for Arabic speech synthesis
  3. The Orator skill transforms dry analysis into compelling narratives using five pillars of classical Arabic rhetoric
  4. Cost optimization: semantic vector search reduces full-text load by 94% while improving relevance
  5. Content pipeline generates newsletter, blog, audio, and video variants from a single research run