An effective and efficient truth discovery framework over data streams

Li, T, Gu, Y, Zhou, X, Ma, Q and Yu, G 2017, 'An effective and efficient truth discovery framework over data streams', in V. Markl, S. Orlando, B. Mitschang, P. Andritsos, K.-U. Sattler and S. Bress (eds.) Proceedings of the 20th International Conference on Extending Database Technology, Venice, Italy, 21-24 March 2017, pp. 180-191.


Document type: Conference Paper
Collection: Conference Papers

Title An effective and efficient truth discovery framework over data streams
Author(s) Li, T
Gu, Y
Zhou, X
Ma, Q
Yu, G
Year 2017
Conference name International Conference on Extending Database Technology (EDBT)
Conference location Venice, Italy
Conference dates 21-24 March 2017
Proceedings title Proceedings of the 20th International Conference on Extending Database Technology
Editor(s) V. Markl, S. Orlando, B. Mitschang, P. Andritsos, K.-U. Sattler and S. Bress
Publisher Springer
Place of publication Switzerland
Start page 180
End page 191
Total pages 12
Abstract Truth discovery, a validity assessment method for conflicting data from various sources, has been widely studied in the conventional database community. However, while existing methods for static scenarios involve time-consuming iterative processes, those for streams sacrifice considerable accuracy because they learn source weights incrementally. In this paper, we propose a novel framework for truth discovery over streams, which incorporates various iterative methods to estimate the source weights effectively and adaptively decides how often to recompute them. Specifically, we first capture the characteristics of source weight evolution, on which the framework is modelled. Then, we define the conditions on source weight evolution under which the unit and cumulative errors remain relatively small, and construct a probabilistic model that estimates the probability of meeting these conditions. Finally, we propose a novel scheme called adaptive source reliability assessment (ASRA), which converts an estimation problem into an optimization problem. We have conducted extensive experiments over real datasets to demonstrate the effectiveness and efficiency of our framework.
Subjects Pattern Recognition and Data Mining
Keyword(s) Truth discovery
Source reliability
Data quality
Data stream
Copyright notice © 2017 The Authors. Creative Commons Attribution Non Commercial No Derivatives 4.0 License
ISBN 9783893180738
Created: Mon, 06 Nov 2017, 08:54:00 EST by Catalyst Administrator