Marketing Analytics: DataDriven Techniques with Microsoft Excel (2014)
Part XI. Internet and Social Marketing
Chapter 42: Networks
Chapter 43: The Mathematics Behind The Tipping Point
Chapter 44: Viral Marketing
Chapter 45: Text Mining
Chapter 42. Networks
You may be familiar with the movie The Social Network that describes the early days of Facebook. In general, a network consists of points (usually called nodes) that are connected by links (sometimes called arcs). You can easily associate (a huge!) network with Facebook in which the nodes are the members of Facebook and a link exists between two nodes if the people represented by the two nodes are friends. In this chapter you learn how marketing analysts describe networks and gain insight into how networks such as Facebook evolve. Applications of network theory to the spread of new products are also discussed. The chapter closes with a brief discussion of the wellknown Klout score, which purports to measure an individual's online influence.
Measuring the Importance of a Node
It is important to have a way to measure the importance of a node or link. For example, on the Internet, Google, Bing, and Amazon.com are clearly more important nodes than the author's blog (www.waynewinston.com). Marketers would love to have a measure of influence so that they can reach the most influential people and have these people spread the word about their products. This section discusses three metrics (assuming that each link in the network is bidirectional) that you can use to measure the importance of a node:
· Degree centrality
· Closeness centrality
· Betweenness centrality
Degree Centrality
For any node its degree centrality is simply the number of nodes connected to the given node by a link. Take a look at the network shown in Figure 42.1.
Figure 421: Example of a simple network
Table 42.1 shows the degree centrality of each node:
Table 42.1 Degree Centrality
Node 
Degree Centrality 
3 

Blake 
2 
Blair 
4 
Jessica 
3 
Helen 
1 
Lindsay 
3 
For example, Lindsay's degree centrality is 3 because there are links connecting Lindsay to Paris, Blair, and Jessica. Also Helen's degree centrality is 1 because the only link to Helen is Blair. Degree centrality indicates that Blair is most influential and is only slightly more important than Jessica and Lindsay. Referring to Figure 42.1, however, it appears that Blair is much more influential than say Lindsay. For example, removing Blair from the network would result in there being no path to Helen. You can soon see that betweenness centrality (see the section “Betweenness Centrality”) illuminates Blair's importance to the network. Therefore, the shortcoming of degree centrality is that it does not measure the extent to which a node's links help connect pairs of nodes.
Closeness Centrality
One way to look at the importance of a node is to assume a node is more important if the node is close to other nodes. To implement this idea, pick a given node (say Paris) and find the shortest path from Paris to each other node. Averaging those path lengths gives you an idea of how far Paris is from the rest of the network. Taking the reciprocal of the average path length (that is 1 / average path length) yields your measure of closeness centrality. This measure of closeness centrality is larger for nodes that tend to be closer to the other nodes in the network. Referring to the network in Figure 42.1, you can determine Paris' closeness centrality as follows:
· Shortest path from Paris to Blake has length 1.
· Shortest path from Paris to Blair (ParisBlakeBlair) has length 2.
· Shortest path from Paris to Jessica has length 1.
· Shortest path from Paris to Helen has length 3 (ParisLindsayBlairHelen).
· Shortest path from Paris to Lindsay is length 1.
· The average of the length of these shortest paths is: = 1.6 so Paris' closeness centrality is 1 / 1.6 = 5 / 8.
In a similar fashion you can find the closeness centrality of all nodes in Figure 42.1, as shown in Table 42.2.
Table 42.2 Closeness Centralities for Figure 42.1
Node 
Closeness Centrality 
5/8 = 0.625 

Blake 
5/8 = 0.625 
Blair 
5/6 = 0.833 
Jessica 
5/8 = 0.625 
Helen 
5/10 =0.50 
Lindsay 
5/8 = 0.625 
Closeness centrality indicates that Lindsay and Blake are almost as important as Blair. Note, however, the closeness centrality values of the nodes are close in value. This is typical, and changing even a large network by adding or deleting a few links can result in a large change in how the nodes rank for closeness centrality. Because a node with a large closeness centrality is close to the other nodes, a node with large closeness centrality is a good position to view what happens in the network. Closeness centrality does not measure the power of a node to influence how information flows through the network. To measure the power of a node to influence the flow of information through the network, you need the measure of betweenness centrality.
Betweenness Centrality
Suppose a marketer wants to spread knowledge of a new product throughout a network. She would then want to know which nodes have the most impact on the spread of information through the network. To illustrate how the concept of betweenness centrality measures a node's impact of information spread, focus on Blair. To compute Blair's betweenness centrality, follow these steps:
1. For each pair of nodes excluding Blair, find all shortest paths between the pair of nodes. For example, for the BlakeLindsay pair, two shortest paths exist (BlakeParisLindsay and BlakeBlairLindsay).
2. Determine the fraction of these shortest paths that include Blair. In this case ½ = 0.5 of the shortest paths that include Blair. Blair now earns 0.5 points toward betweenness centrality.
3. Summing Blair's points over the 10 pairs of nodes which exclude Blair yields Blair's betweenness centrality.
All the computations needed to compute Blair's and Paris' betweenness centrality are summarized in Table 42.3 and Table 42.4.
Table 42.3 Blair's Betweenness Centrality
Pair of Nodes 
Shortest Paths 
Points for Blair 
Paris Blake 
ParisBlake 
0/1 = 0 
Paris Jessica 
ParisJessica 
0/1 = 0 
Paris Helen 
2/2 = 1 

Paris Lindsay 
ParisLindsay 
0/1 = 0 
Blake Jessica 
BlakeBlairJessica 
1/1 = 1 
Blake Helen 
BlakeBlairHelen 
1/1 = 1 
Blake Lindsay 
BlakeParisLindsay and BlakeBlair Lindsay 
1/2 = 0.5 
Jessica Helen 
JessicaBlair Helen 
1/1 = 1 
Jessica Lindsay 
Jessica Lindsay 
0/1 = 0 
Helen Lindsay 
Helen BlairLindsay 
1/1 = 1 
Table 42.4 Paris' Betweenness Centrality
Pair of Nodes 
Shortest Paths 
Points for Paris 
Blake Blair 
BlakeBlair 
0/1 = 0 
Blake Jessica 
BlakeBlairJessica 
0/1 = 0 
Blake Helen 
BlakeBlairHelen 
0/1 = 0 
Blake Lindsay 
BlakeParisLindsay and BlakeBlairLindsay 
1/2 = 0.5 
Blair Jessica 
BlairJessica 
0/1 = 0 
Blair Helen 
BlairHelen 
0/1 = 0 
Blair Lindsay 
BlairLindsay 
0/1 = 0 
Jessica Helen 
JessicaBlairHelen 
0/1 = 0 
Jessica Lindsay 
JessicaLindsay 
0/1 = 0 
Helen Lindsay 
HelenBlairLindsay 
0/1 = 0 
Adding up the last column of this table, you can find that Blair's betweenness centrality is 5.5.
Table 42.4 shows the calculations needed to compute Paris' betweenness centrality.
Adding up the last column of this table, you can see that Paris' betweenness centrality measure is 0.5. This indicates that Paris is not important in passing information through the network. This is good because while in South Africa, Paris Hilton said, “I love Africa in general. South Africa and West Africa they are both great countries.” (http://www.foxnews.com/story/2008/03/25/parishiltonwestafricaisgreatcountry/)
Table 42.5 summarizes the betweenness centrality for each node referred to in Figure 42.1.
Table 42.5 Betweenness Centrality for Figure 42.1
Node 
Betweenness Centrality 
0.5 

Blake 
2/3 = 0.67 
Blair 
5.5 
Lindsay 
2/3 = 0.67 
Helen 
0 
Jessica 
2/3 = 0.67 
Table 42.5 makes it clear that Blair is the key to spreading information through the network.
Measuring the Importance of a Link
Analogous to betweenness centrality for a node, you can define link betweenness as a measure of a link's importance in the network. To illustrate the concept, determine (see Table 42.6) the link betweenness for the BlakeBlair link.
1. For each pair of nodes, find all shortest paths between the pair of nodes. For example, for the BlakeLindsay pair of nodes, two shortest paths of length 2 exist (BlakeParisLindsay and BlakeBlairLindsay).
2. Determine the fraction of these shortest paths that include the BlakeBlair link. In this case ½ = 0.5 of the shortest paths includes Blair. The BlakeLindsay pair of nodes now earns 0.5 points toward link betweenness for the BlakeBlair link.
3. Summing these points over all pairs of nodes yields the BlakeBlair link's link betweeness.
Table 42.6 Link Betweenness for BlakeBlair Link
Adding up the third column of Table 42.6 shows the link betweenness for the BlakeBlair link is 4.5.
Table 42.7 shows the computation of the link betweenness for the JessicaBlair link.
Table 42.7 Computation of Link Betweenness for JessicaBlair link
Adding up the numbers in the third column, you find that the JessicaBlair link has a link betweenness of 3.
Summarizing Network Structure
Because large networks are complex, you need simple metrics that can be used to summarize a network's structure. In Chapter 3, “Using Excel Functions to Summarize Marketing Data,” you learned how a large data set could be summarized by two numbers: the mean or median as a measure of typical value and the standard deviation as a measure of spread about the mean. In this section you learn how the structure of a large complex network can be summarized by two numbers:
· L = a measure of the average distance between network nodes.
· C = a local cluster coefficient, which measures the extent to which your friends are friends of one another.
Six Degrees of Separation
In 1967, Harvard sociology professor Stanly Milgram performed an interesting experiment. He gave 296 residents of Omaha, Nebraska, a letter and the name and address of a stockbroker living in a Boston suburb. The goal was to get the letter to the stockbroker with the minimum number of mailings, but the rule of the game was each time the letter was mailed the letter had to be mailed to a friend of the person mailing the letter. Two hundred and seventeen of the Omaha residents mailed the letter, and 64 made it to the stockbroker. Each time the letter mailed was referred to as a “hop.” On average it took 5.2 hops to get the letter to the stockbroker (never more than 10 hops!) and the median number of hops was 6, hence the phrase “six degrees of separation.”
Another example of six degrees of separation is the famous six degrees of the actor Kevin Bacon. Define a network in which the nodes are movie actors and actresses, and there is a link between two actors and/or actresses if they appeared in the same movie. Most actors can be linked to Kevin Bacon in six links or less. The website http://oracleofbacon.org/ enables you to find the path linking an actor or actress to Bacon. For example, as shown in Figure 42.2, Cate Blanchett can be linked to Kevin Bacon through John Goodman. You can then say Cate Blanchett's Bacon number is 2.
Figure 422: Cate Blanchett's Bacon number
Definition and Computation of L
For any network define L = average over all pairs of nodes of the length of the shortest between the pairs of nodes. Essentially a small value for L means that for a randomly chosen pair of nodes it is likely that there exists a fairly short path connecting the nodes. On the other hand, a large value of Lindicates that many pairs of nodes are not connected by short paths.
For the network in Figure 42.1, the computations for L are shown in Table 42.8.
Table 42.8 Computation of L for Figure 42.1
Node Pair 
Length of Shortest Path Between Node Pair 
BlakeBlair 
1 
BlakeJessica 
2 
BlakeHelen 
2 
BlakeLindsay 
2 
BlakeParis 
1 
BlairJessica 
1 
BlairHelen 
1 
BlairLindsay 
1 
BlairParis 
2 
JessicaHelen 
2 
JessicaLindsay 
2 
JessicaParis 
1 
HelenLindsay 
2 
HelenParis 
3 
LindsayParis 
1 
Adding the second column you get 24, so L = 24 / 15 = 1.6. For such a small network, the small value of L is not surprising. The film actor network amazingly has L = 3.65. Amazingly the Facebook friends network has L = 4.7 and for just U.S Facebook users L = 4.3!
The Local Cluster Coefficient
A network's local cluster coefficient is a number between 0 and 1 that measures the tendency of a person's friends to be friends of one another. Because most of your friends know each other, you would expect most social networks to have a relatively large cluster coefficient. Before defining a network's local cluster coefficient, you need a definition: A neighbor of node n is any node connected by a link to node n.
To define a network's local clustering coefficient C, you define for each node n a cluster coefficient C_{n}, which is the fraction of node n's pairs of neighbors that are linked to each other. Then C is obtained by averaging C_{n} over all nodes.
The following illustrates the determination of C for the network pictured in Figure 42.1:
· Paris has three pairs of friends (Blake and Lindsay, Blake and Jessica, and Lindsay and Jessica) and one of these pairs (Lindsay and Jessica) is linked, so C_{Paris} = .
· Blake has one pair of friends (Paris and Blair) and they are not linked, so C_{Blake} = 0.
· Blair has four friends (Blake, Jessica, Helen, and Lindsay), which results in six pairs of friends (Blake and Jessica, Blake and Helen, Blake and Lindsay, Jessica and Helen, Jessica and Lindsay, and Helen and Lindsay.) Of the six pairs of friends, only Jessica and Lindsay are linked, so C_{Blair} =.
· Jessica has three pairs of friends (Blair and Lindsay, Blair and Paris, and Lindsay and Paris). Blair and Lindsay and Lindsay and Paris are linked, so C_{Jessica} = .
· Helen has no pair of friends, so you can omit her when computing the average of the cluster coefficients for each node.
· Lindsay has three pairs of friends (Blair and Jessica, Blair and Paris, and Paris and Jessica). Paris and Jessica and Blair and Jessica are linked, so C_{Lindsay} = .
You now find that the following is true:
The film actor/actress network has C = 0.79.
The next two sections use your understanding of L and C to demonstrate how various reallife networks were created.
Random and Regular Networks
This section looks at random and regular networks and then discusses the seminal work of Steven Strogatz and Duncan Watts (then of Cornell) on small world networks (see “Collective Dynamics of ‘smallworld’ networks,” Nature, June 4, 1998).
Random Networks
Consider a network with n nodes. Then there are n * (n − 1) / 2 possible links in this network. You can form a random network by choosing a probability p for each possible link to be included in the network. For example, choose n = 10 and p = 0.25. Then you might obtain the network shown inFigure 42.3.
Figure 423: Example of random graph
In general a random network has a low C and low L. For example recall that for the film actor network L = 3.65 and C = 0.79. Strogatz and Watts used simulation to repeatedly generate a random network with the same number of nodes and links as the film actor network. They found L = 2.99 and C = 0.08. As this example illustrates, randomly generated networks typically have a small L and a small C. Because most social networks (like the film actor network) have low L and high C, it is not likely that a social network was generated by successive random generation of links.
Regular Networks
Another type of network often studied is a regular network. A network is regular if every node is linked to the same number of nodes. Figure 42.4 shows an example of a regular network on a circle in which each node is linked to four neighboring nodes.
Figure 424: Regular network
It is easy to compute L and C for the network in Figure 42.4 because the network looks the same when viewed from any node. Therefore, you can choose node 1 and compute L for the network by computing (as shown in Table 42.9) the average length of the shortest paths from node 1 to nodes 2–12. You can view the top node in the middle as node 1 and then the nodes are numbered clockwise.
Table 42.9 Lengths of Shortest Paths from Node 1
Node 
Length of shortest path from Node 1 
2 
1 (12) 
3 
1 (13) 
4 
2 (134) 
5 
2 (135) 
6 
3 (1356) 
7 
3 (1357) 
8 
3 (111108) 
9 
2 (1119) 
10 
2 (11110) 
11 
1 (111) 
12 
1 (112) 
Averaging the lengths of the paths in the second column, you can find L = 21 / 11. By the symmetry of the network, you can compute C as the fraction of pairs of node 1's friends that have links between them. Node 1 has six pairs of friends: 1112, 112, 113, 122, 123, and 23. Of these pairs 1112, 122, and 23 are linked, so C = 3 / 6 = 0.5. If you drew a regular network with more nodes (say 1,000) in which each node has four neighbors, then it is straightforward to show (see Problem 9) that C remains at and L = 125.4. In general for a large regular graph both L and C are large. This implies that social networks such as the movie star network cannot be represented by a regular graph. Thirtyone years after Milgram coined six degrees of separation, Strogatz and Watts figured out a reasonable explanation for the prevalence of social networks having a small L and large C.
It's a Small World After All!
Strogatz and Watts began with a regular network like the network shown in Figure 42.5. This network has 10 nodes and each node is linked to two neighboring nodes.
Figure 425: Regular network 10 nodes
Strogatz and Watt's brilliant insight was to define for each link a probability (call it PROB) that the link is deleted. Then the deleted link is replaced by a link joining a randomly chosen pair of nodes. Figure 42.6 shows an example of how the regular network might look after two links are deleted and then replaced.
Figure 426: New network after two links are replaced
In the original network node 10 was four links away from node 6. In Figure 42.6 the new link has reduced the distance from node 10 to node 6 to one link. Strogatz and Watts showed that even a small value of PROB can create a network with a much smaller L than the regular network and virtually the same C value as the regular network. Strogatz and Watts referred to the new arcs as weak ties. For most people the great majority of their friends live in their city of residence, but most have several “weaker” acquaintances in different cities. These weak ties provide a possible explanation for the creation of networks (like the movie star network) having a small L and large C.
In their wonderful book Networks Illustrated (Edwiser Scholastic Press, 2013) Princeton graduate student Chris Brinton and Princeton professor Mung Chiang report simulations that start with a 600node network having six links per node. Even if PROB is relatively small (say 0.1) L is reduced by 70 percent and C hardly changes.
The Rich Get Richer
Google, Facebook YouTube, Yahoo, and Baidu are the five mostvisited websites in the world. The Internet has evolved to the point in which several Internet sites handle lots of traffic, and many sites handle little traffic. Consider the Internet as a network with unidirectional links defined by hyperlinks. For example, your website may contain a link to Amazon.com but Amazon.com does not likely link to your website. Define a node for the Internet to simply be a URL. The indegree of a node in this network is the number of websites linking to the given node. More specifically, let x = in degree of a network URL and y(x) = the number of URLs having in degree x. Then the graph of y(x) versus x follows a Power Law where y = cx^{a}, where a is thought to be near 2. Amazingly if you graph the relationship between x and y(x), you get a graph like Figure 42.7, which follows the Power Law (so called because x is raised to a power).
Figure 427: Power Law for networks
Figure 42.7 illustrates the Power Law for the Internet viewed as a network. The many URLs with small x represent the many URLs with few URLs pointing to them. The URLs with large x represent the few URL's with many nodes pointing toward them.
The Power Law with a = 2 implies, for example:
· as many sites have 1,000,000 nodes pointing to them as have 500,000 nodes pointing to them.
· as many sites have 2,000,000 nodes pointing to them as have 1,000,000 nodes pointing to them.
Consider for any integers k > 0 and x > 0 the following ratio:
If a network follows a Power Law then this ratio is independent of x (see Exercise 11). Therefore, networks following the Power Law are also known as scalefree networks.
Neither random networks, regular networks, nor small world networks yield a Power Law. In 1976, D.J. Price of Yale University provided an elegant explanation of network evolution that is consistent with Power Laws in his article “A general theory of bibliometric and other cumulative advantage processes” (Journal of American Society of Information Sciences). The following steps detail this explanation:
1. Begin with a network (see Figure 42.8) having two nodes and one link.
2. At each step create a new node. The new node will be linked to an existing node and the probability that an existing node is selected is proportional to the number of links possessed by the existing node. In Figure 42.9, node 3 is added. Because nodes 1 and 2 each have one link, there is a 50percent chance that node 3 will be linked to either node 1 or 2. As shown in Figure 42.9, you can assume that the new link connects node 3 to node 2.
3. Now add another node (node 4.) Because node 2 has two links and nodes 1 and 3 have one link, there is a 50 percent chance that node 4 will link to node 2, a 25 percent chance node 4 will link to node 1, and a 25 percent chance node 4 will link to node 3. Now suppose that node 4 links to node 2. The resulting network is shown in Figure 42.10.
Figure 428: Rich Get Richer step 1
Figure 429: Rich Get Richer step 2
Figure 4210: Step 3: Rich Get Richer
If you now add node 5, there would still be a 3 / 6 = 50 percent chance that node 5 links to node 2.
Because nodes with more links are more likely to get the newest link, this view of network formulation is called the Rich Get Richer or the method of preferential attachment. Price showed that networks formed by the Rich Get Richer mechanism follow the Power Law.
The Rich Get Richer mechanism also provides a possible explanation for why hightech product markets are sometimes dominated by a single player (such as Microsoft Office or the Google Search engine.) The more people who use Office, the more attractive Office becomes to prospective customers seeking a productivity suite. Similarly, more people using a search engine leads to better performance, making it more likely that people will use the dominant search engine.
Klout Score
Anyone who uses social media, and especially a marketing analyst, could benefit from knowing how their posts, tweets and other contributions to the Internet move the opinions of others. The website Klout.com aids in this task by creating Klout scores based on Twitter, Facebook, Google+, Instagram, FourSquare, LinkedIn, and even YouTube which purport to measure how Internet content created by a person moves the opinions of other Internet users. For example, on a scale of 0–100 in April 2013, Barack Obama had a Klout Score of 99 and the author had a Klout score of 43. While nobody outside of Klout knows how an individual's Klout score is computed the following are believed to be true:
· Increasing the number of followers you have on Twitter, Facebook, or Instagram will (all other things equal) increase your Klout score.
· A key to your Klout scores is the likelihood that your activity will be acted upon. For example, increasing your Likes on Facebook or Instagram will increase your Klout score, and being retweeted more often will also increase your Klout score.
· The influence of your engaged audience affects your Klout score. For example, being retweeted by one person with a Klout score of 95 might be more important than being retweeted by 40 people each having a Klout score of 2.
· Sean Golliher (see www.seangolliher.com/2011/uncategorized/howireversedengineeredkloutscoretoanr2094/) cleverly attempted to “reverse engineer” the computation of Klout score. For 99 people Golliher attempted to predict their Klout scores using only each person's number of Twitter followers and each person's number of retweets. Golliher found the following simple equation explained 94 percent of the variation in Klout scores:
Summary
In this chapter you learned the following:
· A network consists of nodes and links connecting the nodes.
· The importance of a node can be evaluated by degree centrality, closeness centrality, or betweenness centrality.
· The importance of a link can be evaluated by link betweenness.
· The structure of a network can be characterized by L, a measure of the average distance between nodes and C, the local clustering coefficient that measures the tendency of a person's friends to be friends of one another.
· Random networks have a small L and small C.
· Regular networks have a large L and large C.
· Most social networks have a small value of L (probably caused by weak ties that are the essence of the StrogatzWatts model) and a large value of C.
· The Rich Get Richer theory explains how few nodes with lots of traffic came about on the Internet.
· If you let x = in degree of a network URL and y(x) = the number of URLs having in degree x, then the graph of y(x) versus x follows a Power Law where y = cx^{a}, where a is thought to be near 2.
· Klout score measures a person's online influence across a variety of channels.
Exercises
For Exercises 1–3 use the network shown in Figure 42.11.
1. For each node in the network, compute all three centrality measures.
Figure 4211: Network for exercises 1–3
2. Compute the link betweenness measure for nodes 1 and 2.
3. Compute L and C for the network in Figure 42.11.
4. Compute L and C for a regular network on a circle consisting of 12 nodes and 2 links per node. That is, node 1 links to nodes 12 and 2, and so on.
5. Consider the U.S. power grid network where nodes are generators, substations, and transformers and two nodes are linked if there is a transmission line joining them. Explain why you would expect this network to have a large L and small C.
6. The tributaries of the Mississippi River follow a Power Law. Can you explain which variable should go on each axis?
7. Zipf's Law states that the number of times a word appears in the English language follows a Power Law. Can you explain which variable should go on each axis?
8. How is the Pareto principle (8020 rule discussed in Chapter 1, “Slicing and Dicing Marketing Data with PivotTables”) related to the Power Law?
9. Consider a regular network on a circle with 1,000 nodes in which each node has links to its four neighbors. Show that L = 125.4
10. Explain why Twitter is a unidirectional network and Facebook is not.
11. Show that for any network following a Power Law is independent of x.