UFSSF - An Efficient Unsupervised Feature Selection for Streaming Features

Almusallam, N, Tari, Z, Chan, J and Al-Harthi, A 2018, 'UFSSF - An Efficient Unsupervised Feature Selection for Streaming Features', in Dinh Phung, Vincent S. Tseng, Geoffrey I. Webb, Bao Ho, Mohadeseh Ganji, Lida Rashidi (eds.) Proceedings of the 22nd Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2018) Part II, Melbourne, Australia, 3-6 June 2018, pp. 495-507.


Document type: Conference Paper
Collection: Conference Papers

Title UFSSF - An Efficient Unsupervised Feature Selection for Streaming Features
Author(s) Almusallam, N; Tari, Z; Chan, J; Al-Harthi, A
Year 2018
Conference name PAKDD 2018: Lecture Notes in Artificial Intelligence Volume 10938
Conference location Melbourne, Australia
Conference dates 3-6 June 2018
Proceedings title Proceedings of the 22nd Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2018) Part II
Editor(s) Dinh Phung, Vincent S. Tseng, Geoffrey I. Webb, Bao Ho, Mohadeseh Ganji, Lida Rashidi
Publisher Springer
Place of publication Cham, Switzerland
Start page 495
End page 507
Total pages 13
Abstract Streaming-feature applications pose challenges for feature selection. In such dynamic-feature applications: (a) features are generated sequentially and processed one by one upon arrival, while the number of instances (points) remains fixed; and (b) the complete feature space is not known in advance. Existing approaches require class labels as a guide to select the representative features. However, in real-world applications most data are not labeled and, moreover, manual labeling is costly. A new algorithm, called Unsupervised Feature Selection for Streaming Features (UFSSF), is proposed in this paper to select representative features in streaming-feature applications without the need to know the features or class labels in advance. UFSSF extends the k-means clustering algorithm to include linearly dependent similarity measures so as to decide incrementally whether to add a newly arrived feature to the existing set of representative features. Features that are not representative are discarded. Experimental results indicate that UFSSF achieves significantly better prediction accuracy and running time than the baseline approaches.
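The incremental decision described in the abstract can be illustrated with a minimal sketch. This is not the UFSSF algorithm from the paper: it assumes absolute Pearson correlation as a stand-in for the linear-dependence similarity measure, keeps a plain list of representatives instead of k-means clusters, and uses a hypothetical `threshold` parameter. The function names are likewise illustrative.

```python
import math
from typing import Iterable, List, Sequence


def abs_pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Absolute Pearson correlation, used here as an assumed stand-in
    for a linear-dependence similarity. Returns 0.0 if either vector
    is constant (the correlation is undefined in that case)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov) / math.sqrt(var_x * var_y)


def select_streaming_features(
    stream: Iterable[Sequence[float]], threshold: float = 0.9
) -> List[int]:
    """Process features one at a time and return the arrival indices
    of those kept as representatives.

    `threshold` is a hypothetical parameter: a new feature is kept only
    if its similarity to every current representative is below it.
    No class labels are used at any point.
    """
    representatives: List[List[float]] = []
    kept_indices: List[int] = []
    for index, feature in enumerate(stream):
        column = list(feature)
        # Similarity to the closest existing representative (0.0 if none yet).
        closest = max(
            (abs_pearson(column, rep) for rep in representatives),
            default=0.0,
        )
        if closest < threshold:
            # Not linearly explained by any representative: keep it.
            representatives.append(column)
            kept_indices.append(index)
        # Otherwise the feature is treated as redundant and discarded.
    return kept_indices
```

In this sketch, a feature that is a scaled or negated copy of an earlier representative is discarded, while a feature uncorrelated with all representatives is added. The number of instances stays fixed across features, matching condition (a) in the abstract.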
Subjects Distributed and Grid Systems
DOI - identifier 10.1007/978-3-319-93037-4_39
Copyright notice © Springer International Publishing AG, part of Springer Nature 2018
ISBN 9783319930367
Created: Thu, 06 Dec 2018, 10:39:00 EST by Catalyst Administrator