Data Mining: The Textbook

Yüklə 17,13 Mb.

səhifə	365/423
tarix	07.01.2024
ölçüsü	17,13 Mb.
	#211690

1 ... 361 362 363 364 365 366 367 368 ... 423

1-Data Mining tarjima

Preferential attachment: In a growing network, the likelihood of a node receiving new edges increases with its degree. This is a natural consequence of the fact that highly connected individuals will typically find it easier to make new connections. If π(i) is the probability that a newly added node attaches itself to an existing node i in the network, then a model for the probability π(i) in terms of the degree of node i is as follows:

π(i) ∝ Degree(i)^α

(19.2)

The value of the parameter α is dependent on the domain from which the network is drawn, such as a biological network or social network. In many Web-centric domains, a scale-free assumption is used. This assumption states that α ≈ 1, and therefore the proportionality is linear. Such networks are referred to as scale-free networks. This model is also referred to as the Barabasi–Albert model. Many networks, such as the World Wide Web, social networks, and biological networks, are conjectured to be scale free, although the assumption is obviously intended to be an approximation. In fact, many properties of real networks are not completely consistent with the scale-free assumption.

Small world property: Most real networks are assumed to be “small world.” This means that the average path length between any pair of nodes is quite small. In fact, Milgram’s experiment in the sixties conjectured that the distance between any pair of nodes is about six. Typically, for a network containing n(t) nodes at time t, many models postulate that the average path lengths grow as log(n(t)). This is a small number, even for very large networks. Recent experiments have confirmed that the average path lengths of large-scale networks such as Internet chat networks are quite small. As discussed below, the dynamically varying diameters have been experimentally shown to be even more constricted than the (modeled) log(n(t)) growth rate would suggest.

Densification: Almost all real-world networks such as the Web and social networks add more nodes and edges over time than are deleted. The impact of adding new edges generally dominates the impact of adding new nodes. This implies that the graphs gradually densify over time, with the number of edges growing superlinearly with the number of nodes. If n(t) is the number of nodes in the network at time t, and e(t) is the number of edges, then the network exhibits the following densification power law:

e(t) ∝ n(t)^β

(19.3)

The exponent β is a value between 1 and 2. The value of β = 1 corresponds to a network where the average degree of the nodes is not aﬀected by the growth of the network. A value of β = 2 corresponds to a network in which the total number of

19.2. SOCIAL NETWORKS: PRELIMINARIES AND PROPERTIES

623

edges e(t) remains a constant fraction of the complete graph of n(t) nodes as n(t) increases.

Yüklə 17,13 Mb.

Dostları ilə paylaş:

1 ... 361 362 363 364 365 366 367 368 ... 423