A pipelined architecture for distributed text query evaluation

Moffat, A, Webber, W, Zobel, J and Baeza-Yates, R 2007, 'A pipelined architecture for distributed text query evaluation', Information Retrieval, vol. 10, pp. 205-231.


Document type: Journal Article
Collection: Journal Articles

Title: A pipelined architecture for distributed text query evaluation
Author(s): Moffat, A; Webber, W; Zobel, J; Baeza-Yates, R
Year: 2007
Journal name: Information Retrieval
Volume number: 10
Start page: 205
End page: 231
Total pages: 27
Publisher: Springer
Abstract: Two principal query-evaluation methodologies have been described for cluster-based implementation of distributed information retrieval systems: document partitioning and term partitioning. In a document-partitioned system, each of the processors hosts a subset of the documents in the collection, and executes every query against its local sub-collection. In a term-partitioned system, each of the processors hosts a subset of the inverted lists that make up the index of the collection, and serves them to a central machine as they are required for query evaluation. In this paper we introduce a pipelined query-evaluation methodology, based on a term-partitioned index, in which partially evaluated queries are passed amongst the set of processors that host the query terms. This arrangement retains the disk read benefits of term partitioning, but more effectively shares the computational load. We compare the three methodologies experimentally, and show that term distribution is inefficient and scales poorly. The new pipelined approach offers efficient memory utilization and efficient use of disk accesses, but suffers from problems with load balancing between nodes. Until these problems are resolved, document partitioning remains the preferred method.
Subject: Information Retrieval and Web Search
Keyword(s): Information-Retrieval; Performance; Web
DOI: 10.1007/s10791-006-9014-4
Copyright notice: © Springer Science + Business Media, LLC 2007
ISSN: 1386-4564
Citation counts: cited 33 times in Thomson Reuters Web of Science; cited 63 times in Scopus
Access statistics: 230 abstract views
Created: Fri, 07 Jan 2011, 09:11:00 EST by Catalyst Administrator
© 2014 RMIT Research Repository