You are misusing the function read_edgelist. According to the documentation, each line it parses must be a string, whereas csv.reader parses the lines of the input file into lists of strings (for example, 202,237,1 -> ['202', '237', '1']). An AttributeError is therefore raised because read_edgelist tries to parse the lists produced by csv.reader as if they were strings.
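To make the type mismatch concrete, here is a minimal sketch using only the standard library. The sample data is hypothetical (the real test.csv is not shown), and the column name source is an assumption; only the fact that the header starts with target comes from the question.

```python
import csv
import io

# Hypothetical sample standing in for test.csv (the real file is not shown).
sample = "target,source,weight\n202,237,1\n237,305,2\n"

# csv.reader yields one list of strings per row.
rows = list(csv.reader(io.StringIO(sample)))

# Plain line iteration yields one string per row.
lines = sample.splitlines()

print(type(rows[1]).__name__, rows[1])
print(type(lines[1]).__name__, lines[1])

# An edge-list parser that calls string methods on each line can work
# with `lines`, but not with `rows`, because lists have no such methods.
print(hasattr(lines[1], "split"), hasattr(rows[1], "split"))
```

The last line shows why the error is an AttributeError: a list does not have the string methods the parser relies on.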
We can correctly parse the graph from the input file without using the csv module. However, we still need to deal with the first line (the header) of the input file, which should not be parsed. There are two methods. The first method skips the first line using next:
import networkx as nx

Data = open('test.csv', "r")
next(Data, None)  # skip the first line in the input file
Graphtype = nx.Graph()
G = nx.parse_edgelist(Data, delimiter=',', create_using=Graphtype,
                      nodetype=int, data=(('weight', float),))
The second method is a bit "hacky": since the first line starts with target, we mark the character t as the start of a comment in the input file.
Data = open('test.csv', "r")
Graphtype = nx.Graph()
G = nx.parse_edgelist(Data, comments='t', delimiter=',', create_using=Graphtype,
                      nodetype=int, data=(('weight', float),))
In both methods, we have to use parse_edgelist instead of read_edgelist because the input file uses \r for newlines. To use read_edgelist, the file needs to be opened in binary mode, in which lines are split only if the newlines are either \n or \r\n. Thus an input file with \r newlines cannot be split into lines, and so cannot be parsed correctly.
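The difference between binary and text mode can be checked without touching the file system. The sketch below assumes the file uses bare carriage returns (\r) as line endings and uses made-up edge data; io.BytesIO stands in for a file opened in binary mode, and io.TextIOWrapper with newline=None stands in for a file opened in text mode with universal newlines.

```python
import io

# Hypothetical file contents that use bare "\r" as the line terminator.
raw = b"target,source,weight\r202,237,1\r237,305,2\r"

# Binary mode: lines are split on b"\n" only, so everything is one "line".
binary_lines = io.BytesIO(raw).readlines()

# Text mode with universal newlines: "\r" is recognised and translated to "\n".
text_stream = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8", newline=None)
text_lines = text_stream.readlines()

print(len(binary_lines))  # number of lines seen in binary mode
print(len(text_lines))    # number of lines seen in text mode
print(text_lines)
```

This is why opening the file yourself in text mode and handing the lines to parse_edgelist works where a binary-mode read does not.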
Also, since you want to find the in-degrees and out-degrees, the graph should be created using DiGraph, not Graph.
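As a rough illustration of the directed variant, the sketch below passes a DiGraph to parse_edgelist and reads the degrees back. The edge lines are made up for the example and contain no header. Note that parse_edgelist treats the first column of each line as the start of the edge and the second as its end, so the column order in your file determines the direction.

```python
import networkx as nx

# Hypothetical edge lines in "u,v,weight" form (header already removed).
lines = ["202,237,1", "237,305,2", "202,305,3"]

# Each line "u,v,w" becomes a directed edge u -> v with attribute weight=w.
G = nx.parse_edgelist(lines, delimiter=',', create_using=nx.DiGraph(),
                      nodetype=int, data=(('weight', float),))

in_degrees = dict(G.in_degree())
out_degrees = dict(G.out_degree())

print(in_degrees)
print(out_degrees)
```

With an undirected Graph, in_degree and out_degree are not available, which is why DiGraph is needed here.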
Edit
The key point here is to skip the header in the input file. We can achieve this by first reading the input file into a pandas.DataFrame and then converting it to a graph.
import networkx as nx
import pandas as pd
df = pd.read_csv('test.csv')
Graphtype = nx.Graph()
G = nx.from_pandas_edgelist(df, edge_attr='weight', create_using=Graphtype)
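If you need in-degrees and out-degrees, the same pandas route can build a directed graph. The following is a sketch under assumptions: the column names source and weight are guesses (only target is known from the header), and the data is read from an in-memory string rather than from test.csv. With from_pandas_edgelist the direction follows the column names, not the column order.

```python
import io

import networkx as nx
import pandas as pd

# Hypothetical contents standing in for test.csv; column names other than
# "target" are assumptions.
sample = "target,source,weight\n202,237,1\n237,305,2\n"
df = pd.read_csv(io.StringIO(sample))

# Edges run from the "source" column to the "target" column.
G = nx.from_pandas_edgelist(df, source='source', target='target',
                            edge_attr='weight', create_using=nx.DiGraph())

print(list(G.edges(data=True)))
print(dict(G.in_degree()))
print(dict(G.out_degree()))
```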