Influence-oriented community analysis in social networks

Wang, X 2018, Influence-oriented community analysis in social networks, Doctor of Philosophy (PhD), Science, RMIT University.

Document type: Thesis
Collection: Theses

Attached Files
Name Description MIMEType Size
Wang.pdf Thesis Click to show the corresponding preview/stream application/pdf; Bytes
Title Influence-oriented community analysis in social networks
Author(s) Wang, X
Year 2018
Abstract The emergence of online social networks has fundamentally changed the way people communicate with each other. Scholars have never ceased devoting their time and energy to the phenomenon since its emergence. Among researches around the social network, One line of study that draws a significant amount of attention recently is the discovery of communities, i.e., relatively densely connected sub-networks. Discovering such structures or communities provides insight into the relationship between individuals and composition of a social network. However, these studies mainly focus on the inner connection between individuals inside a community structure and neglect the external influence of a community as a whole.

Another line of study in the field of the social network is influence analysis which analyze the ability of individuals to convince other users to adopt a new product (or an innovative idea, a service, a political opinion, etc.) with word-of-mouth effect which propagates information through network structures that can trigger cascades of further adoptions. However, these studies mainly focus on the relationship between individuals and the information diffusion process and neglect the community structures in a social network. There is a lack of studies that analyze the social influence of communities, which is fundamentally important for understanding the relationship between network structures and the information diffusion among it and has many practical applications. For example, a company may try to find the most influential community to advertise their products; an organization may intend to initiate a campaign in hope to attract more diverse customers, i.e., maximizing the number of influenced communities instead of customers; an association may hope to minimize the influence of a malicious information spread by one of its opponents, so that the community consisted of its core customers would be affected the least.

To fill in this meaningful blank, in this thesis, we intend to analyze communities on the aspect of social influence and solve three research questions as follows. First, how to identify the communities with the dense intra-connections and the highest outer influence on the users outside the communities? Second, how to maximize both the spread and the diversity of the diffusion at the end of the information propagation by selecting a fixed number of influential users from a social network to spread the information. The higher diversity means more communities are influenced. Third, how to minimize the influence of a set of initial active nodes, which has been infected by a piece of malicious information, over a target community? The aim is to protect from this disinformation, by deleting a fixed number of edges in a social network.

To address the first research question, we propose a new metric to measure the likelihood of the community to attract the other users outside the community within the social network, i.e., the community's outer influence. There are lots of applications that need to rank the communities using their outer influence, e.g., Ads trending analytics, social opinion mining and news propagation pattern discovery by monitoring the influential communities. We refer to such problem as Most Influential Community Search. While the most influential community search problem in large social networks is essential in various applications, it is mostly ignored by the academic research community. In this work, we systematically investigate this problem.

Firstly, we propose a new community model, maximal kr-Clique community, which has desirable characters, i.e., society, cohesiveness, connectivity, and maximum. And then, we developed a novel tree-based index structure, denoted as C-Tree, to maintain the offline computed r-cliques. To efficiently search the most influential maximal kr-clique communities with the maximum outer influence, we developed four advanced index-based algorithms, which can improve the search performance of non-indexed solution by about 200 times. The efficiency and effectiveness of constructing index structure and evaluating the search algorithms have been verified using six real datasets including Facebook, Google+, Gowalla, Twitter, Youtube, and Amazon. A small case study shows the value of the most influential communities using DBLP data.

To solve the second research question, we investigate Diverse Influence Maximization (DIM) to efficiently find k nodes which, at the end of propagation process, can maximize the number of activated nodes and the diversity of the activated nodes. In this work, an evaluation metric has been proposed to balance the two objectives. To address the computational challenges, we develop two efficient algorithms and one advanced PSP-Tree index. The effectiveness and efficiency of our DIM solution are verified by the extensive experimental studies on five real-world social network datasets.

To address the last research question, we study the community-targeted influence minimization problem. Unlike previous influence minimization work, this study considers the influence minimization concerning a particular group of social network users, called targeted influence minimization. Thus, the objective is to protect a set of users, called target nodes, from malicious information originating from another group of users, called active nodes. This study also addresses two fundamental, but largely ignored, issues in different influence minimization problems: (i) the impact of a budget on the solution; (ii) robust sampling. To this end, two scenarios are investigated, namely unconstrained and constrained budget. Given an unconstrained budget, we provide an optimal solution; Given a constrained budget, we show the problem is NP-hard and develop a greedy algorithm with an (1 − 1/e)-approximation. More importantly, to solve the influence minimization problem in large, real-world social networks, we propose a robust sampling-based solution with a desirable theoretic bound. Extensive experiments using real social network datasets offer insight into the effectiveness and efficiency of the proposed solutions.
Degree Doctor of Philosophy (PhD)
Institution RMIT University
School, Department or Centre Science
Subjects Analysis of Algorithms and Complexity
Pattern Recognition and Data Mining
Information Engineering and Theory
Keyword(s) Big Data
Data Mining
Graph Mining
Influence Maximization
Influence Minimization
Community Detection
Influence Diversification
Optimization Theory
Knowledge Discovery
Version Filter Type
Access Statistics: 224 Abstract Views, 254 File Downloads  -  Detailed Statistics
Created: Fri, 06 Jul 2018, 14:00:14 EST by Denise Paciocco
© 2014 RMIT Research Repository • Powered by Fez SoftwareContact us