QRAG replaces keyword Quran search with morphological analysis—understanding that the Arabic root ر-ح-م (mercy) appears 339 times in 12+ word forms. The system routes each pipeline stage to a specialized model: Opus for deep research, k2p5 for production queries, Gemini Flash for quick enrichment, and Minimax for Arabic TTS. Result: search that understands how Arabic form carries meaning.
When you search “mercy” in a typical Quran app, you get results. You also miss everything that matters.
You miss that the Arabic root ر-ح-م (raḥima) appears 339 times. You miss that it transforms into 12+ word forms—from the noun raḥma (mercy) to the adjective raḥīm (merciful) to the intensive raḥmān (most merciful). You miss that al-Baqarah opens with both al-Raḥmān and al-Raḥīm—not by accident, but by design.
Keyword search treats the Quran as a bag of words. QRAG treats it as what it actually is: a meticulously crafted Arabic text where form carries meaning.
This post shows you the architecture, the model selection strategy, and the multi-layered analysis pipeline. Working code, hard numbers, honest failure modes included.
The Core Problem: Search Without Understanding
Most Quran search engines are sophisticated keyword matchers. They tokenize your query, match against an inverted index, and return verses. They’re fast. They’re useful. They’re fundamentally shallow.
```python
# What typical search gives you
def keyword_search(query: str) -> list[Ayah]:
    tokens = tokenize(query)
    return inverted_index.lookup(tokens)
```

This misses:
- That the root ر-ح-م contains رَحْمَة، رَحِيم، رَحْمَان
- That verb form II (تَفْعِيل) intensifies the meaning
- That al-Raḥmān and al-Raḥīm are morphological variants with different implications
- What centuries of scholars have said about these distinctions
QRAG solves this with three structural indices plus semantic search.
Architecture: Three Indices + Vector Search
```mermaid
flowchart TD
    A[User Query<br/>English/French/etc] --> B[Translation Layer<br/>Claude → Arabic]
    B --> C{Query Engine}
    C --> D[Root Index<br/>1,800+ roots]
    C --> E[Lemma Index<br/>4,500+ lemmas]
    C --> F[Word Index<br/>77,430 words]
    D --> G[Hybrid Ranker]
    E --> G
    F --> G
    G --> H[Vector DB<br/>Pinecone]
    H --> I[Response Generator<br/>Claude]
    style D fill:#e3f2fd
    style E fill:#e3f2fd
    style F fill:#e3f2fd
    style H fill:#fff3e0
```
The Three Structural Indices
| Index | Count | Purpose |
|---|---|---|
| Root | 1,800+ | Arabic roots (جذور) with all occurrences |
| Lemma | 4,500+ | Dictionary forms (أُصول) |
| Word | 77,430 | Every word with full morphological annotation |
Each word carries detailed annotations:
```json
{
  "location": "2:255:1",
  "arabic": "ٱللَّهُ",
  "root": "أله",
  "lemma": "ٱللَّه",
  "pos": "proper_noun",
  "case": "nominative",
  "verb_form": "I",
  "verb_tense": "perfect",
  "derivation": "active_participle"
}
```

This enables queries no keyword matcher can handle:
- “Show all verb forms of علم” → Returns Forms I, II, IV, V, VIII, X
- “Compare رحم vs علم” → Frequency, lemma diversity, semantic fields
- “What nouns derive from ضلل?” → Active participle, verbal noun, adjectives
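A query like “show all verb forms of علم” reduces to a group-by over the word index. A minimal sketch, assuming annotation records shaped like the JSON example above (the `WORD_INDEX` sample here is hand-made for illustration, not the real 77,430-word corpus):

```python
from collections import defaultdict

# Hand-made sample records in the shape of the annotation example above.
WORD_INDEX = [
    {"arabic": "عَلِمَ",  "root": "علم", "pos": "verb", "verb_form": "I"},
    {"arabic": "عَلَّمَ",  "root": "علم", "pos": "verb", "verb_form": "II"},
    {"arabic": "أَعْلَمَ", "root": "علم", "pos": "verb", "verb_form": "IV"},
    {"arabic": "رَحِمَ",  "root": "رحم", "pos": "verb", "verb_form": "I"},
]

def verb_forms(root: str) -> dict[str, list[str]]:
    """Group every verb token of a root by its morphological form."""
    forms: dict[str, list[str]] = defaultdict(list)
    for word in WORD_INDEX:
        if word["root"] == root and word["pos"] == "verb":
            forms[word["verb_form"]].append(word["arabic"])
    return dict(forms)

print(verb_forms("علم"))  # forms I, II, IV from the sample
```

The same group-by, pointed at the full index, is what answers the derivation and comparison queries above.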
Multi-Model Pipeline: Different Stages, Different Tools
Here’s the secret most AI systems miss: no single model does everything well.
Using one model for your entire pipeline is like using a single tool for carpentry. You’ll hammer everything, but the results will show it.
This layer-based approach also drives the Three-Layer AI Operations Pattern, where different models handle different phases of incident triage — the same principle applied to a different domain.
QRAG uses four specialized models:
| Stage | Model | Why | Use Case |
|---|---|---|---|
| Deep Research | Anthropic Opus 4 | Superior reasoning for pattern discovery | Complex linguistic analysis |
| Production | k2p5 | Arabic-optimized, excellent quality/cost | Standard queries |
| Quick Analysis | Gemini 2.0 Flash | Speed when needed | Fast enrichment |
| Speech | Minimax | Best Arabic TTS | Audio generation |
Why This Matters
```python
# Research: Use Opus for pattern discovery
research_agent = OpusAgent(
    system_prompt=RESEARCH_SYSTEM_PROMPT,
    tools=[analyze_query, discover_patterns, cross_reference_tafsirs],
)
# Result: Deep analysis, scholarly cross-references
```
```python
# Production: Use k2p5 for response generation
orator = OratorSkill(
    model="k2p5",  # Default production
    pillars=[HOOK, BUILD, ARABIC_MOMENT, SCHOLARLY_DIALOGUE, LANDING],
)
# Result: Compelling narrative, fast response
```
```python
# Content: Use Minimax for speech synthesis
audio = minimax.tts(
    text=oration_script,
    voice="scholar_male",
    language="ar",
)
# Result: Natural Arabic speech
```

The cost difference is significant:
- Opus: ~$15/M tokens (deep analysis, used sparingly)
- k2p5: ~$0.50/M tokens (production, high volume)
- Minimax: ~$2/M characters (audio generation)
80% of queries can use k2p5. Reserve Opus for complex pattern discovery that requires deep reasoning. Your wallet will thank you.
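The routing itself can be a one-line gate per tier. A hedged sketch of that 80/20 split (the function name, flags, and tier strings here are placeholders for whatever classifier the pipeline actually uses):

```python
def route_model(query: str, needs_deep_research: bool, wants_audio: bool) -> str:
    """Pick the cheapest model tier that can handle the request.

    The tier names mirror the table above; the boolean flags stand in
    for a real complexity classifier.
    """
    if wants_audio:
        return "minimax-tts"  # ~$2/M characters, audio only
    if needs_deep_research:
        return "opus-4"       # ~$15/M tokens, used sparingly
    return "k2p5"             # ~$0.50/M tokens, default production path

print(route_model("significance of رحم", needs_deep_research=False, wants_audio=False))
```

The point is structural: the expensive model is reachable only through an explicit flag, so the default path stays cheap.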
The Four-Layer Analysis
When you ask “What is the significance of رحم?”, here’s what happens:
Layer 1: Root Analysis
```python
result = engine.analyze_root("رحم")
# Returns:
# {
#   "occurrences": 339,
#   "lemmas": ["رَحْمَة", "رَحِيم", "رَحْمَان", ...],
#   "forms": {
#     "verb_form_I": ["يَرْحَمُ", "رَحِمَ"],
#     "noun_adjective": ["رَحِيم", "رَحْمَان"]
#   }
# }
```

Layer 2: Morphological Patterns
- How does meaning shift between verb forms?
- What nouns derive from this root?
- What’s the semantic field?
Layer 3: Tafsir Cross-Reference
```python
# Cross-reference multiple sources
tafsirs = {
    "zamakhshari": "Linguistic focus, balagha analysis",
    "razi": "Philosophical interpretation, theological implications",
    "ibn_kathir": "Traditional Sunni commentary",
}
# Present disagreements as dialogue, not contradictions
```

Layer 4: Pattern Discovery
- Thematic connections to other roots
- Rhetorical devices used
- Sound patterns that reinforce meaning
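Thematic connections can be found mechanically by counting which roots share an ayah with the target. A minimal sketch, assuming per-ayah root sets drawn from the word index (the `ayat` sample below is hand-made, not real corpus output):

```python
from collections import Counter

# Hypothetical records: (ayah location, set of roots in that ayah).
# In QRAG these would come from the word index.
ayat = [
    ("1:1",  {"رحم", "أله"}),
    ("1:3",  {"رحم"}),
    ("2:37", {"رحم", "توب", "أله"}),
]

def cooccurring_roots(target: str, ayat: list[tuple[str, set[str]]]) -> Counter:
    """Count roots appearing in the same ayah as the target root."""
    counts: Counter = Counter()
    for _, roots in ayat:
        if target in roots:
            counts.update(roots - {target})
    return counts

print(cooccurring_roots("رحم", ayat))  # أله appears alongside رحم twice here
```

Ranking these counts against chance frequency is one way to surface the thematic pairings Layer 4 looks for.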
The Orator’s Craft: From Data to Revelation
Research is only half the story. The real magic is transforming dry analysis into something that makes people go “wow.”
The Orator skill applies five pillars inspired by classical Arabic rhetoric:
1. The Hook (الاستهلال)
Never begin with data. Begin with wonder.
❌ “The root ر-ح-م appears 339 times in the Quran.”
✅ “Of all the words Allah could have chosen to open His Book, He chose two derived from the same root: al-Raḥmān, al-Raḥīm. Not once—but twice. Why?”
2. The Build (البناء)
Layer understanding like an architect. Simple → Complex.
3. The Arabic Moment (اللحظة العربية)
Let the language itself teach—the sound, the root family, why this specific form.
4. The Scholarly Dialogue (الحوار العلمي)
Present scholars as characters in conversation across centuries:
“Al-Zamakhshari—that brilliant Mu’tazili grammarian from 12th-century Khwarezm—read this verse and saw…”
5. The Landing (الهبوط)
End with resonance, not summary. The listener should feel changed.
The Ruhaniyyah Pipeline: Content in Every Form
The system doesn’t just output text. It generates multiple content variants from a single research run:
```mermaid
flowchart LR
    A[Research] --> B[Orator]
    B --> C[Newsletter]
    B --> D[Blog]
    B --> E[Audio]
    B --> F[Video]
    style C fill:#e8f5e9
    style D fill:#e3f2fd
    style E fill:#fff3e0
    style F fill:#fce4ec
```
- Newsletter: Full lecture style, comprehensive
- Blog: SEO-optimized, scannable headings
- Audio: TTS-ready script with Minimax voice synthesis
- Video: Short-form script for TikTok/Shorts
This is the “Ruhaniyyah” (spiritual) dimension—taking deep scholarship and making it accessible through multiple modalities.
What Doesn’t Work
Semantic Search Alone
Vector embeddings capture similarity, but not morphological relationships. “Mercy” and “merciful” might sit close in embedding space, yet nothing in the embedding records that both derive from the same root.
Fix: Hybrid search combining structural indices with vector similarity.
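One common way to implement that fix is a weighted blend of an exact structural match score with cosine similarity. A sketch under assumed values (the `alpha` weight is illustrative, not a tuned QRAG parameter; both inputs are normalized to [0, 1]):

```python
def hybrid_score(structural: float, semantic: float, alpha: float = 0.6) -> float:
    """Blend exact morphological matches with embedding similarity.

    structural: 1.0 if the ayah hit the root/lemma/word indices, else 0.0
    semantic:   cosine similarity from the vector store
    alpha:      assumed blend weight, not a tuned value
    """
    return alpha * structural + (1 - alpha) * semantic

# An ayah containing the root رحم outranks a purely semantic
# neighbor with the same embedding similarity.
print(hybrid_score(structural=1.0, semantic=0.8))
print(hybrid_score(structural=0.0, semantic=0.8))
```

The structural signal acts as a tiebreaker: among semantically similar verses, those sharing the queried root rise to the top.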
Single Tafsir Source
Relying on one commentary creates blind spots. Al-Zamakhshari (linguist) and al-Razi (theologian) see different things in the same verse.
Fix: Cross-reference multiple sources, present as dialogue.
Academic Output
Traditional academic writing about the Quran is accurate but inaccessible. It reads like a textbook, not a revelation.
Fix: Apply orator craft—hook, build, Arabic moment, scholarly dialogue, landing.
The Numbers
The 94% context reduction is context engineering in practice — semantic retrieval over a structured index instead of feeding the full corpus into every prompt.
| Feature | Count |
|---|---|
| Words analyzed | 77,430 |
| Roots indexed | 1,800+ |
| Lemmas tracked | 4,500+ |
| Quran verses | 6,236 |
| Tafsir sources | 5+ |
| Root رحم occurrences | 339 |
| Word forms from رحم | 12+ |
| Context reduction (hybrid) | 94% |
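The reduction figure is just `1 - retrieved_tokens / full_corpus_tokens`. A back-of-envelope illustration (both token counts below are assumptions chosen to reproduce the quoted 94%, not measurements from QRAG):

```python
# Illustrative assumptions, not measured values.
full_corpus_tokens = 250_000  # rough subword count for the corpus + annotations
retrieved_tokens = 15_000     # hypothetical top-k ayat + tafsir snippets per query

reduction = 1 - retrieved_tokens / full_corpus_tokens
print(f"{reduction:.0%}")  # → 94%
```

The exact ratio depends on the tokenizer and the retrieval `k`; the structural point is that per-query context scales with the answer, not the corpus.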
Try It Yourself
The morphological data comes from the Quranic Arabic Corpus, a linguistically annotated dataset covering all 77,430 words with POS tags, roots, and lemmas. Vector embeddings are stored in Pinecone for sub-second semantic retrieval.
```bash
# Clone and setup
git clone https://github.com/ameno-/qrag.git
cd qrag
uv sync
```

```bash
# Parse the corpus
uv run python main.py parse data/quran-morphology.txt
```

```bash
# Build the vector index
uv run python scripts/build_vector_db.py
```

```bash
# Query
uv run python test_queries.py
# Enter: "What is the significance of رحم?"
```

Or use the API:

```bash
curl -X POST http://localhost:38791/pipeline \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the significance of رحم?"}'
```

What’s Next
The goal isn’t just search—it’s building a system that helps people feel the depth of the Quranic text.
Coming soon:
- Real-time Arabic TTS streaming with Minimax
- Interactive verse exploration
- Scholar voice synthesis with distinct character voices
- Multi-language support (Urdu, Indonesian, etc.)
The intersection of morphological NLP, multi-model AI pipelines, and classical rhetorical traditions is where this project lives. It’s where linguistics meets revelation. The NomFeed architecture used the same token-discipline approach to build a content ingestion system in 865K tokens — the same architectural thinking applies across domains.
QRAG: Where Arabic morphology meets AI engineering.
Key Takeaways
- Morphological search beats keyword search by tracking 1,800+ Arabic roots and 4,500+ lemmas—not just matching words
- Multi-model pipeline: Opus for pattern discovery, k2p5 for production, Minimax for Arabic speech synthesis
- The Orator skill transforms dry analysis into compelling narratives using five pillars of classical Arabic rhetoric
- Cost optimization: semantic vector search reduces full-text load by 94% while improving relevance
- Content pipeline generates newsletter, blog, audio, and video variants from a single research run