DeepSieve 深度淘漉

Information Sieving via LLM-as-a-Knowledge-Router

“Through countless siftings, though laborious, only when the wild sands are blown away, does gold appear.”
千淘万漉虽辛苦,吹尽狂沙始到金

Authors

Minghao Guo1, Qingcheng Zeng2, Xujiang Zhao3, Yanchi Liu3, Wenchao Yu3, Mengnan Du4, Haifeng Chen3, Wei Cheng3*

1Rutgers University    2Northwestern University    3NEC Laboratories America    4NJIT

*Corresponding Author: weicheng@nec-labs.com

News

  • 2025-07-29 — Uploaded the full corpus and released the DeepSieve preprint on arXiv.
  • 2025-07-25 — Released initial project page and demo diagrams.
  • 2025-07-12 — Completed core DeepSieve pipeline: Decompose → Route → Reflect → Fuse.

Overview

DeepSieve is a modular Retrieval-Augmented Generation (RAG) framework that enhances LLM-based reasoning across heterogeneous knowledge sources. It introduces a multi-stage information sieving pipeline, Decompose → Route → Reflect → Fuse, in which each subquery is dynamically routed to the most suitable (Tool, Corpus) pair. When retrieval fails, DeepSieve reroutes the subquery or replans, yielding robust final answers.
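The four stages above can be sketched as a single loop. This is a minimal illustration only: the function names, the retry budget, and the `None`-on-failure convention are assumptions for exposition, not the released implementation.

```python
# Minimal sketch of the DeepSieve sieving loop (hypothetical interfaces).
from typing import Callable, Optional

def deepsieve(query: str,
              decompose: Callable[[str], list],
              route: Callable[[str, int], tuple],
              retrieve: Callable[[str, tuple], Optional[str]],
              fuse: Callable[[list], str],
              max_attempts: int = 3) -> str:
    """Decompose -> Route -> Reflect -> Fuse over heterogeneous sources."""
    answers = []
    for sub in decompose(query):             # Decompose: query -> sub-questions
        answer = None
        for attempt in range(max_attempts):  # Reflect: reroute on failure
            source = route(sub, attempt)     # Route: pick a (Tool, Corpus) pair
            answer = retrieve(sub, source)
            if answer is not None:
                break
        answers.append(answer if answer is not None else "unknown")
    return fuse(answers)                     # Fuse: merge validated sub-answers
```

Passing the attempt index to `route` is what lets the Reflect step try an alternate source rather than repeating the failed one.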

System Architecture

DeepSieve architecture diagram

DeepSieve executes a multi-hop reasoning plan structured as a DAG, in which each subquery selects its optimal knowledge route. Reflexion enables recovery from retrieval failures, and Fusion aggregates the validated sub-answers into a final response.

Core Modules

  • Decomposition: Breaks the query into a DAG of sub-questions using an LLM planner.
  • Routing: Selects (Tool, Corpus) pairs for each subquestion based on profiles and history.
  • Reflexion: Re-evaluates failed queries by retrying alternate sources.
  • Fusion: Merges validated subanswers into a coherent final output.
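To make the Routing module concrete, here is a toy sketch of profile-based source selection. The actual system uses an LLM as the router; the word-overlap scoring, the source list, and all names below are stand-in assumptions.

```python
# Hypothetical profile-based routing: each (Tool, Corpus) pair carries a short
# textual profile, and a sub-question is matched to the best-fitting one.
# Word overlap stands in here for the LLM router described in the text.

SOURCES = [
    (("sql_tool", "employee_db"), "structured tables of employees and salaries"),
    (("dense_retriever", "wikipedia"), "encyclopedic facts about people places history"),
    (("code_search", "repo_docs"), "source code and api documentation"),
]

def route(subquestion: str) -> tuple:
    """Return the (Tool, Corpus) pair whose profile best overlaps the question."""
    words = set(subquestion.lower().split())
    best = max(SOURCES, key=lambda s: len(words & set(s[1].split())))
    return best[0]
```

A real router would also condition on routing history, as the module description notes; this sketch scores each sub-question independently.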

Results

DeepSieve achieves state-of-the-art performance on MuSiQue, 2Wiki, and HotpotQA under both DeepSeek-V3 and GPT-4o backbones.

Performance radar chart

Example

Query: What country is the birthplace of Erik Hort a part of?
Pure RAG: Incorrect or hallucinated answer.
DeepSieve Reasoning Chain:

  1. Q1: Where was Erik Hort born? → Montebello
  2. Q2: What state is Montebello in? → New York
  3. Q3: What country is New York in? → United States
Final Answer: United States
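The three-hop chain above can be traced against a toy fact table, with each hop's answer feeding the next sub-question. The entities come from the example; the lookup structure is purely illustrative.

```python
# Toy resolution of the Erik Hort example: each hop answers one sub-question
# against a small fact table, threading the answer into the next hop.
FACTS = {
    ("Erik Hort", "birthplace"): "Montebello",
    ("Montebello", "state"): "New York",
    ("New York", "country"): "United States",
}

def answer_chain(entity: str, relations: list) -> str:
    """Follow a chain of relations, feeding each answer into the next hop."""
    for rel in relations:
        entity = FACTS[(entity, rel)]
    return entity

print(answer_chain("Erik Hort", ["birthplace", "state", "country"]))
# -> United States
```

A single-shot retriever must find one document linking Erik Hort directly to a country; decomposition turns that into three easy lookups, which is why the chain succeeds where pure RAG hallucinates.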

Future Work

DeepSieve introduces a flexible backbone for modular RAG across heterogeneous knowledge sources. Looking forward, we envision several exciting directions:

  • 🔄 Personalized Routing: Allow users to upload their own corpora (e.g., SQL tables, OCR-ed PDFs, legal documents) with corresponding profile descriptions for dynamic tool selection.
  • ⚖️ Legal Domain Extensions: Inspired by early feedback, legal research is a promising application. Queries often span statutes, case law, and regulatory sources — a natural fit for source-aware subquestion routing.
  • 🧠 Law Agents: Build agentic assistants that plan, retrieve, and reflect across structured and unstructured legal content — enabling robust multi-hop legal reasoning.

We believe these extensions will push RAG systems toward personalized, domain-aware, and self-reflective intelligence.

Contact

For general inquiries about this work, feel free to contact any of the authors listed above. For technical issues or implementation questions, please open an issue on our GitHub repository.

You're also welcome to join the discussion on LinkedIn:
https://www.linkedin.com/posts/minghao-guo-181b4b168_rag-llm-multihopqa...