
Analyzing Social Networks with Facebook

  • 2017-06-20 00:00:00


This article is by Raghav Bali, Dipanjan Sarkar and Tushar Sharma, the authors of the book Learning Social Media Analytics with R. Having already gotten a good flavor of the various aspects of the most popular social micro-blogging platform, Twitter, we will now look more closely at the most popular social networking platform, Facebook. With more than 1.8 billion monthly active users, over 18 billion dollars in annual revenue and record-breaking acquisitions of popular products including Oculus, WhatsApp and Instagram, Facebook has truly become the core of the social media network today.


Before we put Facebook data under the microscope, let us briefly look at Facebook’s interesting origins! Like many popular products, businesses and organizations, Facebook had a humble beginning. It started in 2004 as Mark Zuckerberg’s brainchild and was initially known as “Thefacebook”, located at thefacebook.com and branded as an online social network connecting university and college students. While the network was open only to Harvard students in the beginning, it expanded within a month to include students from other popular universities. In 2005, the domain facebook.com was purchased and “Facebook” extended its membership to employees of companies and organizations for the first time. Finally, in 2006, Facebook was opened to everyone over 13 years of age with a valid email address. The following snapshot shows us how the look and feel of the Facebook platform has evolved over the years!

analyzing-social-networks-facebook-img-0

Facebook’s evolving look over time

While Facebook has a primary website, also known as a web application, it has also launched mobile applications for the major operating systems on handheld devices. In short, Facebook is not just a social network website but an entire platform including a huge social network of connected people and organizations through friends, followers and pages.

We will leverage Facebook’s social “Graph API” to access actual Facebook data and perform various analyses. Users, brands, businesses, news channels, media houses, retail stores and many more use Facebook actively on a daily basis for producing and consuming content. This generates a vast amount of data, and a substantial part of it is available to users through Facebook’s APIs. From a social media analytics perspective, this is really exciting: this treasure trove of data, combined with easy-to-access APIs and powerful open source libraries in R, gives us enormous potential and opportunity to extract valuable information by analyzing the data in various ways.

We will follow a structured path in this article and cover the following major topics sequentially to ensure that you do not get overwhelmed with too much content at once.

  • Accessing Facebook data
  • Analyzing your personal social network
  • Analyzing an English football social network
  • Analyzing English football clubs’ brand page engagements

We will use libraries like Rfacebook, igraph and ggplot2 to retrieve, analyze and visualize data from Facebook. All the following sections assume that you have a Facebook account, which is necessary to access data from the APIs and analyze it. In case you do not have an account, do not despair. You can use the data and code files for this article to follow along with the hands-on examples and gain a better understanding of the concepts of social network and engagement analysis.

Accessing Facebook data

You will find a lot of content in several books and on the web about various techniques to access and retrieve data from Facebook. There are several official ways of doing this which include using the Facebook Graph API either directly through low level HTTP based calls or indirectly through higher level abstract interfaces belonging to libraries like Rfacebook.

Some alternate ways of retrieving Facebook data are to use registered applications on Facebook like Netvizz, or the GetNet application built by Lada Adamic and used in her very popular “Social Network Analysis” course. Unfortunately, http://snacourse.com/getnet has not worked since Facebook completely changed its API access permissions and privacy settings.

Unofficial ways include techniques like web scraping and crawling to extract data. Do note though that Facebook considers this to be a violation of its terms and conditions of accessing data and you should try and avoid crawling Facebook for data especially if you plan to use it for commercial purposes. In this section, we will take a closer look at the Graph API and the Rfacebook package in R. The main focus will be on how you can extract data from Facebook using both of them.

Understanding the Graph API

To start using the Graph API, you need a Facebook account. You can access the API in various ways. One way is to create an application on Facebook by going to https://developers.facebook.com/apps/ and then create a long-lived OAuth access token using the fbOAuth(…) function from the Rfacebook package. This enables R to make calls to the Graph API, and you can also store this token on disk and load it for future use. An easier way is to create a short-lived token, which lets you access the API data for about two hours: go to the Facebook Graph API Explorer page at https://developers.facebook.com/tools/explorer and get a temporary access token from there. The following snapshot depicts how to get an access token for the Graph API from Facebook.

analyzing-social-networks-facebook-img-1

Facebook’s Graph API explorer
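The store-and-reload step described above can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive setup: the environment variable names FB_APP_ID and FB_APP_SECRET and the file location are hypothetical, and the OAuth call only runs in an interactive session with Rfacebook installed. Otherwise a placeholder string stands in for a short-lived token copied from the Graph API Explorer.

```r
# Sketch: obtain a Graph API token once, store it on disk, reload it later.
# FB_APP_ID / FB_APP_SECRET are hypothetical environment variable names.
app_id     <- Sys.getenv("FB_APP_ID")
app_secret <- Sys.getenv("FB_APP_SECRET")
token_file <- file.path(tempdir(), "fb_oauth")  # illustrative location

if (interactive() && nzchar(app_id) && nzchar(app_secret) &&
    requireNamespace("Rfacebook", quietly = TRUE)) {
  # Long-lived token: starts the OAuth handshake for your registered app
  fb_oauth <- Rfacebook::fbOAuth(app_id = app_id, app_secret = app_secret)
} else {
  # Placeholder for a short-lived token pasted from the Graph API Explorer
  fb_oauth <- "XXXXXX"
}

save(fb_oauth, file = token_file)  # store the token for future sessions
rm(fb_oauth)
load(token_file)                   # restores the object named fb_oauth
```

Either kind of token can then be passed as the token argument of the Rfacebook functions used later in this article.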

Clicking “Get User Access Token” in the above snapshot presents a list of checkboxes with various permissions which you might need for accessing data, including user data permissions, events, groups and pages, and other miscellaneous permissions. You can select the ones you need and click on the “Get Access Token” button in the prompt. This will generate a new access token in the field depicted in the above snapshot, and you can directly copy it and use it to retrieve data in R. Before going into that, we will take a closer look at the Graph API explorer, which allows you to access the API directly from your web browser and helps if you want to do some quick exploratory analysis. A part of it is depicted in the above snapshot. The current version of the API when writing this book is v2.8, which you can see in the snapshot beside the GET resource call. Interestingly, the Graph API is so named because Facebook by itself can be considered a huge social graph where all the information can be classified into the following three categories.

  • Nodes: These are basically users, pages, photos and so on. Nodes indicate a focal point of interest which is connected to other points.
  • Edges: These connect various nodes together forming the core social graph and these connections are based on various relations like friends, followers and so on.
  • Fields: These are specific attributes or properties about nodes, an example would be a user’s address, birthday, name and so on.
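To make the three categories concrete, here is a small illustrative sketch in base R. The data is entirely synthetic (it is not retrieved from Facebook) and the column names are assumptions chosen for this example: nodes are rows of one data frame, their fields are its columns, and edges are rows of a second data frame.

```r
# Toy social graph with synthetic data (not retrieved from Facebook)
nodes <- data.frame(
  id   = c("u1", "u2", "p1"),
  type = c("user", "user", "page"),          # a field of each node
  name = c("Alice", "Bob", "Example Page"),  # another field
  stringsAsFactors = FALSE
)

edges <- data.frame(
  from     = c("u1", "u1", "u2"),
  to       = c("u2", "p1", "p1"),
  relation = c("friend", "likes", "likes"),  # kind of connection
  stringsAsFactors = FALSE
)

# All edges that touch node u1
u1_edges <- edges[edges$from == "u1" | edges$to == "u1", ]

# Selected fields of node u1
u1_fields <- nodes[nodes$id == "u1", c("name", "type")]
```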

As we mentioned before, the API is HTTP based: you make HTTP GET requests to nodes or edges, and all requests are passed to graph.facebook.com to get data. Each node usually has a specific identifier and you can use it for querying information about a node as depicted in the following snippet.

GET graph.facebook.com
  /{node-id}

You can also use edge names in addition to the identifier to get information about the edges of the node. The following snippet depicts how you can do the same.

GET graph.facebook.com
  /{node-id}/{edge-name}
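The two request patterns above can be assembled in R with a small helper. This is only a sketch: graph_url is a hypothetical function written for this example (it is not part of Rfacebook), and the default version string simply mirrors the v2.8 mentioned earlier. An actual request would also need to carry your access token.

```r
# Hypothetical helper that builds a Graph API request URL
graph_url <- function(node_id, edge_name = NULL, version = "v2.8") {
  # c() silently drops a NULL edge_name, so both patterns share one code path
  parts <- c("https://graph.facebook.com", version, node_id, edge_name)
  paste(parts, collapse = "/")
}

node_request <- graph_url("me")           # /{node-id}
edge_request <- graph_url("me", "likes")  # /{node-id}/{edge-name}
```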

The following snapshot shows us how we can get information about our own profile.

analyzing-social-networks-facebook-img-2

Querying your details in the Graph API explorer

Now suppose I wanted to retrieve information about the Facebook page “Premier League”, which represents the top tier competition in English football, using its identifier, and also take a look at its liked pages. I can do so using the following request.

analyzing-social-networks-facebook-img-3

Querying information about a Facebook Page using the Graph API explorer

Thus, from the above figure, you can clearly see the node identifier, page name and likes for the page “Premier League”. It must be clear by now that all API responses are returned in the very popular JSON format, which is easy to parse and format as needed for analysis. Besides this, there also used to be another way of querying the social graph in Facebook, known as FQL or Facebook Query Language, an SQL-like interface for querying and retrieving data. Unfortunately, Facebook seems to have deprecated its use and hence covering it would be out of our present scope. Now that you have a firm grasp of the syntax of the Graph API and have also seen a few examples of how to retrieve data from Facebook, we will take a closer look at the Rfacebook package.

Understanding Rfacebook

Since we will be accessing and analyzing data from Facebook using R, it makes sense to have some robust mechanism to directly query Facebook and retrieve data instead of going to the browser every time like we did in the earlier section. Fortunately, there is an excellent package in R called Rfacebook which has been developed by Pablo Barberá. You can either install it from CRAN or get its most updated version from GitHub. The following snippet depicts how you can do the same. Remember you might need to install the devtools package if you don’t have it already, to download and install the latest version of the Rfacebook package from GitHub.

install.packages("Rfacebook") # install from CRAN
# install from GitHub
library(devtools)
install_github("pablobarbera/Rfacebook/Rfacebook") 

Once you install the package, you can load it using library(Rfacebook) and start using it to retrieve data from Facebook with the access token you generated earlier. The following snippet shows how you can access your own details as mentioned in the previous section, but this time by using R.

> token = 'XXXXXX' 
> me <- getUsers("me", token=token)
> me$name
[1] "Dipanjan Sarkar"
> me$id
[1] "1026544"

The beauty of this package is that you directly get the results in curated and neatly formatted data frames and you do not need to spend extra time trying to parse the raw JSON response objects from the Graph API. The package is well documented and has high level functions for accessing personal profile data on Facebook as well as page and group level data points. We will now take a quick look at Netvizz, a Facebook application which can also be used to extract data easily from Facebook.


Understanding Netvizz

The Netvizz application was developed by Bernhard Rieder. It is a tool which can be used to extract data from Facebook pages and groups, get statistics about links, and extract social networks from Facebook pages based on the liked pages of each connected page in the network. You can access Netvizz at https://apps.facebook.com/netvizz/ and, on registering the application on your profile, you will be able to see the following screen.

analyzing-social-networks-facebook-img-4

The Netvizz application interface

From the above app snapshot, you can see that there are various links based on the type of operation you want to execute to extract data. Feel free to play around with this tool and we will be using its “page like network” capability later on in one of our analyses in a future section.

Data Access Challenges

There are several challenges with regard to accessing data from Facebook. Some of the major issues and caveats are mentioned in the following points:

  • Facebook will keep evolving and updating its data access APIs and this can and will lead to changes and deprecation of older APIs and access patterns just like FQL was deprecated.
  • The scope of available data keeps changing over time as Facebook’s API and privacy settings evolve. For instance, we can no longer get details of all our friends from the API.
  • Libraries and tools built on top of the API tend to break with changes to Facebook’s APIs, and this has happened before with Rfacebook as well as Netvizz. Besides this, Lada Adamic’s GetNet application has stopped working permanently ever since Facebook changed the way apps are created and the permissions they require. You can get more information about it here: http://thepoliticsofsystems.net/2015/01/the-end-of-netvizz/

Thus, what is used in this book for data retrieval today might not work completely tomorrow if the APIs change, though it is expected to keep working for at least the next couple of years. However, so that nothing hinders you from analyzing Facebook data, we have provided the datasets we used in most of our analyses, except personal networks, so that you can still follow along with each example and use-case. Personal names have been anonymized wherever possible to protect privacy. Now that we have a good idea about Facebook’s Graph API and how to access data, let’s analyze some social networks!
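One way to guard your own scripts against such API changes is to fall back to a locally stored dataset whenever a live call fails. The following is a minimal sketch of that idea in base R. The helper fetch_or_load and the file name are hypothetical and not part of Rfacebook, and the stand-in functions below replace real API calls so that the example runs without network access.

```r
# Hypothetical helper: try a live fetch, fall back to a cached copy on error
fetch_or_load <- function(fetch_fn, cache_file) {
  tryCatch({
    data <- fetch_fn()
    saveRDS(data, cache_file)  # refresh the local copy on success
    data
  }, error = function(e) {
    if (!file.exists(cache_file)) {
      stop("API call failed and no cached copy exists: ", conditionMessage(e))
    }
    message("API call failed, using cached data: ", conditionMessage(e))
    readRDS(cache_file)
  })
}

cache_file <- tempfile(fileext = ".rds")

# First call succeeds and writes the cache (stand-in for a real API call)
first <- fetch_or_load(function() data.frame(id = 1:2), cache_file)

# Second call simulates a broken API and is served from the cache
second <- fetch_or_load(function() stop("API changed"), cache_file)
```

In practice, fetch_fn would wrap a call such as getUsers or getFriends, and cache_file would point at one of the provided data files.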

Analyzing your personal social network

As we mentioned before, Facebook by itself is a massive social graph, connecting billions of users, brands and organizations. Consider your own Facebook account if you have one. You will have several friends who are your immediate connections; they in turn will have their own sets of friends, including you, and you might be friends with some of those people as well, and so on. Thus you and your friends form the nodes of the network, and edges determine the connections. In this section we will analyze a small network of you and your immediate friends and also look at how we can extract and analyze some properties of the network. Before we jump into our analysis, we will start by loading the necessary packages, which are listed in the following snippet, and storing the Facebook Graph API access token in a variable.

library(Rfacebook)
library(gridExtra)
library(dplyr)
# get the Graph API access token
token = 'XXXXXXXXXXX'

You can refer to the file fb_personal_network_analysis.R for code snippets used in the examples depicted in this section.

Basic descriptive statistics

In this section, we will get some basic information and descriptive statistics about our personal social network on Facebook. To start with, let us look at some details of our own profile on Facebook using the following code.

# get my personal information
me <- getUsers("me", token=token, 
               private_info = TRUE)
View(me[c('name', 'id', 'gender', 'birthday')])

This shows us a few fields from the data frame containing our personal details retrieved from Facebook. We use the View function which basically invokes a spreadsheet-style data viewer on R objects like data frames.

analyzing-social-networks-facebook-img-5

Now, let us get information about our friends in our personal network. Do note that Facebook currently only lets you access information about those friends who have allowed access to the Graph API and hence you may not be able to get information pertaining to all friends in your friend list. We have anonymized their names below for privacy reasons.

# anonymized names (the full list is elided here)
anonymous_names <- c('Johnny Juel', 'Houston Tancredi',...,
                     'Julius Henrichs', 'Yong Sprayberry')
# getting friends information
friends <- getFriends(token, simplify=TRUE)
friends$name <- anonymous_names
# view top few rows
View(head(friends))

This gives us a peek at some people from our list of friends which we just retrieved from Facebook.

analyzing-social-networks-facebook-img-6
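If you do not want to type out a list of replacement names by hand, one alternative is to generate neutral pseudonyms programmatically. This is a sketch of a different approach from the hard-coded list used above: anonymize_names is a hypothetical helper, and the friends data frame below is synthetic stand-in data rather than output from getFriends.

```r
# Hypothetical helper: generate n neutral pseudonyms such as "Friend 001"
anonymize_names <- function(n, prefix = "Friend") {
  sprintf("%s %03d", prefix, seq_len(n))
}

# Synthetic stand-in for the data frame returned by getFriends()
friends <- data.frame(
  name = c("Real Name A", "Real Name B", "Real Name C"),
  id   = c("101", "102", "103"),
  stringsAsFactors = FALSE
)

# Overwrite the real names while leaving the identifiers untouched
friends$name <- anonymize_names(nrow(friends))
```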

Let’s now analyze some descriptive statistics based on personal information regarding our friends like where they are from, their gender and so on.

# get personal information
friends_info <- getUsers(friends$id, token, private_info = TRUE)

# get the gender of your friends
View(table(friends_info$gender))

This gives us the gender distribution of my friends. It looks like more male friends have authorized access to the Graph API in my network!

analyzing-social-networks-facebook-img-7

# get the location of your friends
View(table(friends_info$location))

This depicts the location of my friends (wherever available) in the following data frame.

analyzing-social-networks-facebook-img-8

# get relationship status of your friends
View(table(friends_info$relationship_status))

From the statistics in the following table, I can see that a lot of my friends have gotten married over the past couple of years. Boy, that does make me feel old!

analyzing-social-networks-facebook-img-9

Suppose I want to look at the relationship status of my friends grouped by gender. I can do so using the following snippet.

# get relationship status of friends grouped by gender
View(table(friends_info$relationship_status, friends_info$gender))

The following table gives us the desired results and you can see the distribution of friends by their gender and relationship status.

analyzing-social-networks-facebook-img-10
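If you do not have API access, you can still see how this cross-tabulation behaves on a small synthetic data frame. The values below are made up purely for illustration and only the column names mirror those of friends_info. The sketch also shows two optional extras: turning the table into a long data frame and computing proportions within each gender.

```r
# Synthetic stand-in for friends_info (values are made up for illustration)
friends_info <- data.frame(
  gender = c("male", "female", "male", "male", "female"),
  relationship_status = c("Married", "Single", "Single", "Married", "Married"),
  stringsAsFactors = FALSE
)

# Counts of relationship status (rows) by gender (columns)
status_by_gender <- table(friends_info$relationship_status,
                          friends_info$gender,
                          dnn = c("relationship_status", "gender"))

# Long format: one row per status/gender combination with its count
status_df <- as.data.frame(status_by_gender, responseName = "count")

# Share of each relationship status within each gender (columns sum to 1)
status_share <- prop.table(status_by_gender, margin = 2)
```

With real data, many profiles leave these fields empty, so it can be worth passing useNA = "ifany" to table() to see how many values are missing.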

Summary

In this article, we looked at how to access Facebook data through the Graph API, the Rfacebook package and Netvizz, and at the challenges involved in doing so. We then used R to compute some basic descriptive statistics of a personal social network.
