simrank_similarity(G, source=None, target=None, importance_factor=0.9, max_iterations=100, tolerance=0.0001)¶
Returns the SimRank similarity of nodes in the graph
SimRank is a similarity metric that says “two objects are considered to be similar if they are referenced by similar objects.” 1.
The pseudo-code definition from the paper is:
def simrank(G, u, v): in_neighbors_u = G.predecessors(u) in_neighbors_v = G.predecessors(v) scale = C / (len(in_neighbors_u) * len(in_neighbors_v)) return scale * sum(simrank(G, w, x) for w, x in product(in_neighbors_u, in_neighbors_v))
Gis the graph,
uis the source,
vis the target, and
Cis a float decay or importance factor between 0 and 1.
The SimRank algorithm for determining node similarity is defined in 2.
G (NetworkX graph) – A NetworkX graph
source (node) – If this is specified, the returned dictionary maps each node
vin the graph to the similarity between
target (node) – If both
targetare specified, the similarity value between
targetis returned. If
targetis specified but
sourceis not, this argument is ignored.
importance_factor (float) – The relative importance of indirect neighbors with respect to direct neighbors.
max_iterations (integer) – Maximum number of iterations.
tolerance (float) – Error tolerance used to check convergence. When an iteration of the algorithm finds that no similarity value changes more than this amount, the algorithm halts.
similarity – If
None, this returns a dictionary of dictionaries, where keys are node pairs and value are similarity of the pair of nodes.
targetis, this returns a dictionary mapping node to the similarity of
sourceand that node.
None, this returns the similarity value for the given pair of nodes.
- Return type
dictionary or float
If the nodes of the graph are numbered from zero to n - 1, where n is the number of nodes in the graph, you can create a SimRank matrix from the return value of this function where the node numbers are the row and column indices of the matrix:
>>> import networkx as nx >>> from numpy import array >>> G = nx.cycle_graph(4) >>> sim = nx.simrank_similarity(G) >>> lol = [[sim[u][v] for v in sorted(sim[u])] for u in sorted(sim)] >>> sim_array = array(lol)