Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
537 views
in Technique[技术] by (71.8m points)

date - How can I calculate network measures separately for different periods using igraph in r?

This is my transaction data:

data:

id          from_id        to_id      amount    date_trx
<fctr>      <fctr>         <fctr>     <dbl>     <date>
0           7468           5695       700.0     2005-01-04
1           6213           9379       11832.0   2005-01-08
2           7517           8170       1000.0    2005-01-10
3           6143           9845       4276.0    2005-01-12
4           6254           9640       200.0     2005-01-14
5           6669           5815       200.0     2005-01-20
6           6934           8583       49752.0   2005-01-24
7           9240           8314       19961.0   2005-01-26
8           6374           8865       1000.0    2005-01-30
9           6143           6530       13.4      2005-01-31
...

I formed the network where the edges are formed between the nodes(accounts) from_id's and to_id's, and the weights of the edges determined by the amounts they transact. Then I calculated the network's measures such as degree centrality, betweenness centrality, closeness centrality etc.

That is:

relations <- data.frame(from = data$from_id, 
                        to = data$to_id)
network <- graph_from_data_frame(relations, directed = T)

E(network)$weight <- data$amount
V(network)$degree <- degree(network, normalized=TRUE)
V(network)$betweenness <- betweenness(network, normalized=TRUE)
V(network)$closeness <- closeness(network, normalized=TRUE)

But now I want to calculate these measures periodically. For example, I want to divide my data by weeks(starting from the very first transaction date) and calculate the network measures for each account for corresponding weeks.

data$week <- unsplit(tapply(data$date_trx, data$from_id, function(x) (as.numeric(x-min(data$trx_date)) %/% 7)+1),data$from_id)

select(data, from_id, to_id, date_trx, week, amount) %>% arrange(date_trx)

from_id       to_id      date_trx      week    amount
<fctr>        <fctr>     <date>        <dbl>   <dbl>
6644           6934       2005-01-01    1      700
6753           8456       2005-01-01    1      600
9242           9333       2005-01-01    1      1000
9843           9115       2005-01-01    1      900 
7075           6510       2005-01-02    1      400 
8685           7207       2005-01-02    1      1100   

...            ...        ...           ...    ...

9866           6697       2010-12-31    313    95.8
9866           5992       2010-12-31    313    139.1
9866           5797       2010-12-31    313    72.1
9866           9736       2010-12-31    313    278.9
9868           8644       2010-12-31    313    242.8
9869           8399       2010-12-31    313    372.2

As I divided my data into weekly periods, now I need to form networks of accounts for each week separately and so that I can calculate network measures for accounts in weekly periods. How can I do that for 313 weeks and at once?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

One possibility is splitting your data according to week, transform each week into an igraph object and then add the centralities and degree to all graphs at once, using lapply. My initial data.frame is named d (see below):

library(igraph)

head(d)
  from_id to_id weight   date_trx
1       D     I      8 1999-09-12
2       E     H     10 1999-10-20
3       A     G     10 1999-09-10
4       C     G     13 1999-04-15
5       E     J      9 1999-06-26
6       B     F     15 1999-04-30

First get the week:

d$week <- strftime(d$date_trx, format = "%V")

Now split by week:

dd <- split(d, d$week )

Transform each week into an igraph

dd <- lapply(dd, function(x) graph_from_data_frame(x, directed = T))

Write a function that does all the operations you want to carry out, and then apply it to each graph:

my.funct <- function(x) {
  V(x)$degree <- degree(x, normalized=TRUE)
  V(x)$betweenness <- betweenness(x, normalized=TRUE)
  V(x)$closeness <- closeness(x, normalized=TRUE)
  return(x)
}

dd <- lapply(dd, my.funct)

For instance, for the first week:

dd[[1]]
IGRAPH f515e52 DN-- 4 2 -- 
+ attr: name (v/c), degree (v/n), betweenness (v/n), closeness (v/n), weigth (e/n), date_trx
| (e/n), week (e/c)
+ edges from f515e52 (vertex names):
[1] B->F C->G



get.vertex.attribute(dd[[1]])
$name
[1] "B" "C" "F" "G"

$degree
[1] 0.3333333 0.3333333 0.3333333 0.3333333

$betweenness
[1] 0 0 0 0

$closeness
[1] 0.3333333 0.3333333 0.2500000 0.2500000



get.edge.attribute(dd[[1]])
$weight
[1] 9 7

$date_trx
[1] 10595 10601

$week
[1] "01" "01"

You can then retrieve all centralities and degree for all weeks:

ddd <- lapply(dd, function(x) igraph::as_data_frame(x, what = "vertices") )

# keep in mind that `split` names the objects in the list according to
# the value it used to split, therefore the name of the data.frames in
# the list is the name of the week.

library(dplyr)
ddd <- bind_rows(ddd, .id="week")

head(ddd)
  week name    degree betweenness closeness
1   01    E 1.4444444           0 0.2000000
2   01    D 1.5555556           0 0.1666667
3   01    B 0.7777778           0 0.2000000
4   01    A 1.0000000           0 0.2000000
5   01    C 0.7777778           0 0.1666667
6   01    F 1.0000000           0 0.1000000

In case, you can use this to merge back to the original edges list.

Data used in this example:

set.seed(123)
d <- data.frame(from_id = sample(LETTERS[1:5], 2000, replace = T),
                to_id = sample(LETTERS[6:10], 2000, replace = T),
                weight = rpois(2000, 10),
                date_trx = sample(seq(as.Date('1999/01/01'), as.Date('2000/01/01'), by="day"), 2000, replace = T))

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...