library(igraph)
library(readxl)
# Node and Edge Characteristics
Your network may have some edge characteristics: qualitative or quantitative information about the connections between nodes. Qualitative information denotes a type of connection between individuals in the network (romantic vs. friend, positive vs. negative, kinship vs. friend), and these categories tell us more about the kinds of relationships between the nodes. Quantitative information, such as frequency of communication, tells us how strong a connection is. This is termed the edge’s “weight”, indicating that there are substantively meaningful differences between levels of connection (for example, interacting once compared to 10 times).
Additionally, your network may have some information about the nodes. This could be categorical (e.g. demographics or other categories) or numeric (e.g. age). We can use it in our visualisation to portray information about the nodes.
All of this information can be attached to our edgelist. You can also do this on an adjacency matrix but it is not as straightforward. To learn the process, let’s stick with edgelists.
This dataset has a lot more information about these individuals and their relationships than the others we have used thus far. We have one Excel spreadsheet (not .csv this time) with two separate sheets, one called vertices and the other called edges. The vertices sheet has information about each individual node in the network, while the edges sheet contains both the edges that exist and information about them. Let’s read them in and take a look at them.
The code is a little more specific this time: we use the readxl package to read a particular sheet from our spreadsheet. We can read in each sheet using the following method.
vertices.df <- read_excel("C:/Users/trleppar/Downloads/node and vertex characteristics.xlsx", sheet = "vertices")
edges.df <- read_excel("C:/Users/trleppar/Downloads/node and vertex characteristics.xlsx", sheet = "edges")
Let’s take a look at these one at a time to get an idea of what information we have.
head(edges.df)
## # A tibble: 6 × 4
## from to freq affinity
## <chr> <chr> <dbl> <chr>
## 1 A B 2 pos
## 2 A C 1 neg
## 3 A D 1 pos
## 4 A E 1 neg
## 5 A F 3 neg
## 6 E F 2 pos
We have an edgelist connecting people A to F. For each edge we have the frequency of interaction (freq) and the affinity (i.e. whether the interactions are positive or negative).
head(vertices.df)
## # A tibble: 6 × 4
## name age role gender
## <chr> <dbl> <chr> <chr>
## 1 A 20 DJ F
## 2 B 25 MC M
## 3 C 21 DJ F
## 4 D 23 crew M
## 5 E 24 MC M
## 6 F 23 MC F
We have their name, their age and two categorical variables about them: their role (DJ, MC or crew) and their gender (M/F).
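If you do not have the spreadsheet to hand, the rows printed above can be typed in directly. This is only a sketch: the object names edges_head and vertices_head are made up for illustration, and they hold just the six rows shown by head(), so anything beyond those rows in the real sheets is missing.

```r
# Only the rows printed by head() above; the real sheets may contain more.
# edges_head and vertices_head are illustrative names, not from the tutorial.
edges_head <- data.frame(
  from     = c("A", "A", "A", "A", "A", "E"),
  to       = c("B", "C", "D", "E", "F", "F"),
  freq     = c(2, 1, 1, 1, 3, 2),
  affinity = c("pos", "neg", "pos", "neg", "neg", "pos")
)

vertices_head <- data.frame(
  name   = c("A", "B", "C", "D", "E", "F"),
  age    = c(20, 25, 21, 23, 24, 23),
  role   = c("DJ", "MC", "DJ", "crew", "MC", "MC"),
  gender = c("F", "M", "F", "M", "M", "F")
)

# Every name used in an edge should also appear in the vertex table
all(c(edges_head$from, edges_head$to) %in% vertices_head$name)
```

The final check matters because graph_from_data_frame() stops with an error when an edge refers to a vertex that is missing from the vertices table.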
Now we can create a network object in igraph using the familiar method - graph_from_data_frame(). However, we want this network to hold all the information possible: not just the edge information, but also the node-level information. To do this, we tell R that d = edges.df, the edgelist (a familiar step to you pros now!), and that the vertex characteristics are stored in the object we created earlier, vertices.df.
graph <- graph_from_data_frame(d = edges.df, vertices = vertices.df , directed = FALSE)
graph
## IGRAPH 3073d2c UN-- 7 7 --
## + attr: name (v/c), age (v/n), role (v/c), gender (v/c), freq (e/n),
## | affinity (e/c)
## + edges from 3073d2c (vertex names):
## [1] A--B A--C A--D A--E A--F E--F F--G
Nice work! You can see the vertex information is stored as v characteristics (name, age, role and gender). The edge characteristics are stored as e characteristics (freq and affinity). Now let’s visualise the network and see what we have.
plot(graph)
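The attributes listed in the printed summary can also be queried directly, which is a handy check before plotting. The sketch below assumes a toy graph g built from only the six rows shown by head() earlier (so it does not include every vertex or edge in the full data); the names g, edges_head and vertices_head are illustrative.

```r
library(igraph)

# Toy data: only the rows printed by head(); object names are illustrative
edges_head <- data.frame(
  from     = c("A", "A", "A", "A", "A", "E"),
  to       = c("B", "C", "D", "E", "F", "F"),
  freq     = c(2, 1, 1, 1, 3, 2),
  affinity = c("pos", "neg", "pos", "neg", "neg", "pos")
)
vertices_head <- data.frame(
  name   = c("A", "B", "C", "D", "E", "F"),
  age    = c(20, 25, 21, 23, 24, 23),
  role   = c("DJ", "MC", "DJ", "crew", "MC", "MC"),
  gender = c("F", "M", "F", "M", "M", "F")
)
g <- graph_from_data_frame(d = edges_head, vertices = vertices_head, directed = FALSE)

# Which attributes are attached to vertices and to edges?
vertex_attr_names(g)
edge_attr_names(g)

# Pull out individual attributes as plain vectors
V(g)$age
E(g)$affinity

# Quick tally of a categorical edge attribute
table(E(g)$affinity)
```

V() and E() return the vertex and edge sequences, and the $ operator reads one attribute from them. The same pattern is used in every plot() call that follows.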
Here comes the fun part. First, let’s start with the edge characteristics. Rapid fire, we can visualise these in many different ways.
We will create a few visuals to demonstrate the information about these edges.
Let’s start with the numeric information we have about the edges. First, we will change the width of the lines between nodes to reflect the frequency of interactions using the edge.width argument and the freq edge characteristic.
plot(graph, edge.width = E(graph)$freq)
Or, we can label the edges with the frequency to tell a similar story. We do this using the edge.label argument.
plot(graph, edge.label = E(graph)$freq)
What do these visuals tell you about their relationships compared to the first one?
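The two arguments can also be combined, and the widths can be rescaled when the raw counts are too close together to tell apart by eye. The sketch below is one possible rescaling, not the only one: the formula and the name width_scaled are assumptions for illustration, and the toy graph uses only the edges shown by head() earlier.

```r
library(igraph)

# Toy edgelist: only the rows printed by head(); names are illustrative
edges_head <- data.frame(
  from = c("A", "A", "A", "A", "A", "E"),
  to   = c("B", "C", "D", "E", "F", "F"),
  freq = c(2, 1, 1, 1, 3, 2)
)
g <- graph_from_data_frame(d = edges_head, directed = FALSE)

# Stretch the differences: the least frequent edge gets width 1,
# and each extra interaction adds 2 to the width
width_scaled <- 1 + 2 * (E(g)$freq - min(E(g)$freq))

# Width and label together, so the reader sees both the emphasis and the count
plot(g, edge.width = width_scaled, edge.label = E(g)$freq)
```

Keeping the raw freq as the label means the rescaling changes only how the edge is drawn, not the number that is reported.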
Now let’s use the categorical information to tell a slightly different story by showing the affinity between these individuals. First, we will change the line type to reflect the two levels. To do this, we use an ifelse() statement. It checks whether the affinity attribute of each edge is equal to “pos”, which gives a logical vector (TRUE or FALSE for each edge). Where the test is TRUE (“pos”) it returns “solid”; where it is FALSE (i.e. “neg”) it returns “dotdash”. We can then pass the result to the edge.lty argument to visualise it in the network.
#change the line type using edge.lty to match the affinity
type_affinity <- ifelse(E(graph)$affinity == "pos", "solid", "dotdash")
# Plot with the line types
plot(graph, edge.lty = type_affinity)
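To see what ifelse() is doing step by step, it helps to run it on a plain vector. This small sketch uses the six affinity values printed by head(edges.df); the name affinity_head is made up for illustration.

```r
# The six affinity values shown by head(edges.df)
affinity_head <- c("pos", "neg", "pos", "neg", "neg", "pos")

# Step 1: the comparison gives one TRUE/FALSE per element
is_pos <- affinity_head == "pos"
is_pos
# TRUE FALSE TRUE FALSE FALSE TRUE

# Step 2: ifelse() picks "solid" where TRUE and "dotdash" where FALSE
lty_head <- ifelse(is_pos, "solid", "dotdash")
lty_head
# "solid" "dotdash" "solid" "dotdash" "dotdash" "solid"
```

The result has one element per edge, in the same order as the edges, which is why it can be handed straight to edge.lty.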
Now, let’s combine a few approaches. We will use the same ifelse() statement but apply it to the colours of the edges. We will also label each edge with its affinity, alongside the line type.
affinity <- ifelse(E(graph)$affinity == "pos", "green", "red")
plot(graph, edge.color = affinity, edge.label = E(graph)$affinity, edge.lty = type_affinity)
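As the plot() calls get longer, one option is to store the styling on the graph itself. igraph’s plot() looks for edge attributes named color, lty, width and label when the matching edge.* argument is not supplied. The sketch below assumes a toy graph g built from only the edges shown by head() earlier; the object names are illustrative.

```r
library(igraph)

# Toy edgelist: only the rows printed by head(); names are illustrative
edges_head <- data.frame(
  from     = c("A", "A", "A", "A", "A", "E"),
  to       = c("B", "C", "D", "E", "F", "F"),
  freq     = c(2, 1, 1, 1, 3, 2),
  affinity = c("pos", "neg", "pos", "neg", "neg", "pos")
)
g <- graph_from_data_frame(d = edges_head, directed = FALSE)

# Store the visual choices as edge attributes
E(g)$color <- ifelse(E(g)$affinity == "pos", "green", "red")
E(g)$lty   <- ifelse(E(g)$affinity == "pos", "solid", "dotdash")
E(g)$width <- E(g)$freq
E(g)$label <- E(g)$affinity

# No edge.* arguments needed: plot() picks the attributes up
plot(g)
```

The trade-off is that the styling now travels with the graph object, so any argument passed explicitly to plot() is needed to override it.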
Now let’s turn to the rest of our data and explore the network’s vertex attributes.
We will start with the numerical characteristic of the vertices - their age. First, let’s change the labels to show their age.
plot(graph, vertex.label = V(graph)$age)
Now, let’s change the colours based on a threshold that we set using an ifelse() statement: vertices for people older than 22 are coloured red and the rest white.
over_22 <- ifelse(V(graph)$age > 22, "red", "white")
plot(graph, vertex.color = over_22, vertex.label.color = "black")
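A numeric attribute does not have to be cut into two groups. As a sketch, age could instead drive the size of each vertex through the vertex.size argument. The scaling formula and the name size_by_age are assumptions chosen for illustration, and the toy graph uses only the rows shown by head() earlier.

```r
library(igraph)

# Toy data: only the rows printed by head(); names are illustrative
edges_head <- data.frame(
  from = c("A", "A", "A", "A", "A", "E"),
  to   = c("B", "C", "D", "E", "F", "F")
)
vertices_head <- data.frame(
  name = c("A", "B", "C", "D", "E", "F"),
  age  = c(20, 25, 21, 23, 24, 23)
)
g <- graph_from_data_frame(d = edges_head, vertices = vertices_head, directed = FALSE)

# Youngest person gets size 10; each extra year adds 3
size_by_age <- 10 + 3 * (V(g)$age - min(V(g)$age))

plot(g, vertex.size = size_by_age, vertex.label = V(g)$age)
```

Labelling with the age as well keeps the exact value readable, since differences in size are hard to judge precisely.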
Next, let’s work with the categorical variables. First we can change the labels to show these, and then change the colours. See if you can follow the code chunks below and think about what these new networks tell us.
plot(graph, vertex.label = V(graph)$gender)
plot(graph, vertex.label = V(graph)$role)
gender <- ifelse(V(graph)$gender == "M", "orange", "blue")
plot(graph, vertex.color = gender, vertex.label.color = "white")
We have done a lot with ifelse() statements here. These are great for setting direct parameters or for working with dichotomous categories (i.e. the M/F one we have). However, we may want to create colours for a variable with more than two categories and then visualise it. We are going to use a different package, called dplyr, to create a vertex attribute that holds a colour for each level of a categorical variable (their role).
library(dplyr)
To do this, we will return to the original dataframe storing information about the vertex characteristics. Then, we will use the mutate() function to create a new variable that holds a colour for each role. Each step replaces one role with its colour and leaves the other values as they are. See if you can follow the logic and look at what we end up with.
vertices.df <- vertices.df %>%
  mutate(role_colour = ifelse(role == "DJ", "blue", role))
vertices.df <- vertices.df %>%
  mutate(role_colour = ifelse(role == "MC", "red", role_colour))
vertices.df <- vertices.df %>%
  mutate(role_colour = ifelse(role == "crew", "green", role_colour))
head(vertices.df)
## # A tibble: 6 × 5
## name age role gender role_colour
## <chr> <dbl> <chr> <chr> <chr>
## 1 A 20 DJ F blue
## 2 B 25 MC M red
## 3 C 21 DJ F blue
## 4 D 23 crew M green
## 5 E 24 MC M red
## 6 F 23 MC F red
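The chain of ifelse() calls works, but it grows by one step for every extra category. One alternative, shown here only as a sketch, is a named lookup vector in base R: the names are the roles and the values are the colours. The names role_palette and vertices_head are made up for illustration, and the data are just the six rows printed above.

```r
# Toy vertex table: only the rows printed by head(); names are illustrative
vertices_head <- data.frame(
  name = c("A", "B", "C", "D", "E", "F"),
  role = c("DJ", "MC", "DJ", "crew", "MC", "MC")
)

# One entry per category: name = role, value = colour
role_palette <- c(DJ = "blue", MC = "red", crew = "green")

# Index the palette by role to get one colour per row;
# unname() drops the role names carried over from the lookup
vertices_head$role_colour <- unname(role_palette[vertices_head$role])

vertices_head$role_colour
# "blue" "red" "blue" "green" "red" "red"
```

A role that is missing from the palette comes back as NA, which makes gaps easy to spot before plotting.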
Now, let’s recreate our network object following the above method.
graph <- graph_from_data_frame(d = edges.df, vertices = vertices.df , directed = FALSE)
graph
## IGRAPH 3105c9f UN-- 7 7 --
## + attr: name (v/c), age (v/n), role (v/c), gender (v/c), role_colour
## | (v/c), freq (e/n), affinity (e/c)
## + edges from 3105c9f (vertex names):
## [1] A--B A--C A--D A--E A--F E--F F--G
The network now has the new v characteristic that we created - role_colour.
Now we can visualise this network with the different colours for the roles all represented on the visual.
plot(graph, vertex.color = V(graph)$role_colour, vertex.label.color = "white")
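With three colours on the plot, a reader needs a key to know which colour is which role. A sketch using base R’s legend() function is below. It assumes a toy graph built from only the rows shown by head() earlier, and the names g and role_palette are illustrative.

```r
library(igraph)

# Toy data: only the rows printed by head(); names are illustrative
edges_head <- data.frame(
  from = c("A", "A", "A", "A", "A", "E"),
  to   = c("B", "C", "D", "E", "F", "F")
)
vertices_head <- data.frame(
  name = c("A", "B", "C", "D", "E", "F"),
  role = c("DJ", "MC", "DJ", "crew", "MC", "MC")
)
g <- graph_from_data_frame(d = edges_head, vertices = vertices_head, directed = FALSE)

# Same role-to-colour mapping as in the tutorial
role_palette <- c(DJ = "blue", MC = "red", crew = "green")
V(g)$role_colour <- unname(role_palette[V(g)$role])

plot(g, vertex.color = V(g)$role_colour, vertex.label.color = "white")

# Add a key to the plot that was just drawn
legend("topleft", legend = names(role_palette), fill = role_palette,
       title = "Role", bty = "n")
```

legend() draws on top of the current plot, so it has to be called after plot().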
GREAT WORK!