Data Mining: The Textbook



Yüklə 17,13 Mb.
səhifə385/423
tarix07.01.2024
ölçüsü17,13 Mb.
#211690
1   ...   381   382   383   384   385   386   387   388   ...   423
1-Data Mining tarjima

f (S ∪ {e}) − f (S) ≥ f (T ∪ {e}) − f (T )

(19.52)

Virtually all natural models for quantifying influence turn out to be submodular. Submod-ularity is algorithmically convenient because a very efficient greedy optimization algorithm exists for maximizing submodular functions, as long as f (S) can be evaluated for a given value of S. This algorithm starts by setting S = {} and incrementally adds nodes to S that increase the value of f (S) as much as possible. This procedure is repeated until the set S contains the required number of influencers k. The approximation level of this heuristic is based on a well-known classical result on optimization of submodular functions.


Lemma 19.6.1 The greedy algorithm for maximizing submodular functions provides a solu-




tion with an objective function value that is at least a fraction of the optimal value.


Here, e is the base of the natural logarithm.

Thus, these results show that it is possible to optimize f (S) effectively, as long as an appropriate submodular influence function f (S) can be defined for a given set of nodes S.


Two common approaches for defining the influence function f (S ) of a set of nodes S are the Linear Threshold Model and the Independent Cascade Model. Both these diffusion models were proposed in one of the earliest works on social influence analysis. The general operational assumption in these diffusion models is that nodes are either in an active or inactive state. Intuitively, an active node is one which has already been influenced by the set of desired behaviors. Once a node moves to an active state, it never deactivates. Depending on the model, an active node may trigger activation of neighboring nodes either for a single time, or over longer periods. Nodes are successively activated until no more nodes are activated in a given iteration. The value of f (S) is evaluated as the total number of activated nodes at termination.


19.6.1 Linear Threshold Model

In this model, the algorithm initially starts with an active set of seed nodes S and iteratively increases the number of active nodes based on the influence of neighboring active nodes. Active nodes are allowed to influence their neighbors over multiple iterations throughout the execution of the algorithm until no more nodes can be activated. The influence of



19.6. SOCIAL INFLUENCE ANALYSIS

657

neighboring nodes is quantified with the use of a linear function of the edge-specific weights bij . For each node i in the network G = (N, A), the following is assumed to be true:





bij 1

(19.53)

j:(i,j)∈A

Each node i is associated with a random threshold θi ∼ U [0, 1] which is fixed up front and stays constant over the course of the algorithm. The total influence I(i) of the active neighbors of node i on it, at a given time-instant, is computed as the sum of the weights bij of all active neighbors of i.




I(i) = bij (19.54)



Yüklə 17,13 Mb.

Dostları ilə paylaş:
1   ...   381   382   383   384   385   386   387   388   ...   423




Verilənlər bazası müəlliflik hüququ ilə müdafiə olunur ©azkurs.org 2024
rəhbərliyinə müraciət

gir | qeydiyyatdan keç
    Ana səhifə


yükləyin