1 General knowledge

http://stackoverflow.com/questions/9471906/what-are-the-differences-between-community-detection-algorithms-in-igraph

https://www.r-bloggers.com/summary-of-community-detection-algorithms-in-igraph-0-6/

Community Detection in graphs - community detection algorithm implemented in igraph :

  1. edge-betweennes.community(w,-d) - hierarchical decomposition process, very slow

  2. walktrap.community (w,-d) - based on random walks, in my case generates to many to small clusters

  3. fastgreedy.community(w) - bottom-up, suffer from a resolution limit

  4. spinglass.community (w,d, not for unconnected graph) - approach from statistical physics, based on the so-called Potts model; It is not guaranteed that nodes in completely remote (or disconencted) parts of the networks have different spin states; in my case without set.seed(Random Number Generation) option generates different assignments

  5. infomap.community (w,d) - in my case, works only on undirected, simplified (loops and multiple relations are removed) LKN/CKN with 128 (2^7) iterations and set.seed()

  6. label.propagation.community(w) - very fast but yields different results based on the initial configuration (which is decided randomly)

  7. multivel.community(w) - cluster_louvain(g)

  8. leading.eigenvector.community (w) - top-down hierarchical approach

Pipeline:

1. Cluster LKN/CKN using Multi-Level algorithm (https://arxiv.org/abs/0803.0476)

2. After Multi-Level, subnetworks from big.0.0 of cluster size [2^10, max) cluster with spinglass algorithm according to the number of hub nodes and plot separately.

4.BE CAREFUL: spinglass algorithm works only on connected graphs. Each time before spinglass clustering check if subgraph disintegrate to weak components.

https://lists.nongnu.org/archive/html/igraph-help/2010-04/msg00076.html - layout repeatability in R and spinglass.community function?

“…it gives different results each time because it is starting from different random start-points each time.”

2 R packages

if (!require("igraph")) install.packages("igraph")
library(igraph)
if (!require("tictoc")) install.packages("tictoc")
require(tictoc)
if (!require("gtools")) install.packages("gtools")
library(gtools)
if (!require("reshape2")) install.packages("reshape2")
library(reshape2)
if (!require("gmp")) install.packages("gmp")
library(gmp)

tic()
set.seed(123456)

Warnings in igraph because of bug - fix

autocurve.edges <- function(graph, start=0.5) {
  el <- apply(get.edgelist(graph, names = FALSE), 1, paste, collapse = ":")
  ave(rep(NA, length(el)), el, FUN = function(x) {
    if (length(x) == 1) {
      return(0)
    } else {
      return(seq(-start, start, length = length(x)))
    }
  })
}
'%ni%' = Negate('%in%') 

3 Network ID

myname = your network.graphml

# https://string-db.org/cgi/network.pl?taskId=DkhvPMjSF0rr
# myname = 'STRING_PR1.graphml' # 
myname = 'STRING_PR1.graphml'
dir0 = getwd()
g = read_graph(paste0(myname), format = "graphml")

print(summary(g))
## IGRAPH 4856a4e DN-- 141 1069 -- 
## + attr: SUID (g/n), shared name (g/c), name (g/c), selected (g/l),
## | __Annotations (g/c), SUID (v/n), shared name (v/c), name (v/c),
## | selected (v/l), shortName (v/c), geneID (v/c), shortDescription
## | (v/c), MapManBin (v/c), id (v/c), SUID (e/n), shared name (e/c),
## | shared interaction (e/c), name (e/c), selected (e/l),
## | interaction (e/c), reactionType (e/c), shortName1 (e/c),
## | shortName2 (e/c), geneID1 (e/c), geneID2 (e/c)
## IGRAPH 4856a4e DN-- 141 1069 -- 
## + attr: SUID (g/n), shared name (g/c), name (g/c), selected (g/l),
## | __Annotations (g/c), SUID (v/n), shared name (v/c), name (v/c),
## | selected (v/l), shortName (v/c), geneID (v/c), shortDescription
## | (v/c), MapManBin (v/c), id (v/c), SUID (e/n), shared name (e/c),
## | shared interaction (e/c), name (e/c), selected (e/l),
## | interaction (e/c), reactionType (e/c), shortName1 (e/c),
## | shortName2 (e/c), geneID1 (e/c), geneID2 (e/c)
## + edges from 4856a4e (vertex names):
## [1] AT3G25070->AT3G07040 AT3G25070->AT3G20600 AT3G25070->AT1G72930
## [4] AT3G07040->AT1G72930 AT3G07040->AT1G64280 AT5G05730->AT1G25220
## + ... omitted several edges
list.vertex.attributes(g)
## [1] "SUID"             "shared name"      "name"            
## [4] "selected"         "shortName"        "geneID"          
## [7] "shortDescription" "MapManBin"        "id"
list.edge.attributes(g)
##  [1] "SUID"               "shared name"        "shared interaction"
##  [4] "name"               "selected"           "interaction"       
##  [7] "reactionType"       "shortName1"         "shortName2"        
## [10] "geneID1"            "geneID2"
myname = unlist(strsplit(myname, "[.]"))[1]
dir1 = paste(dir0,'/clusteringResults',sep = '')
ifelse(!dir.exists(dir1), dir.create(dir1), FALSE)
## [1] TRUE

Graph summary

A scale-free network is a network whose degree distribution follows a power law, at least asymptotically.

CHECK all the vertex and edge attributes! This is important at the end. The graphical parameters/attributes will not be saved, all others will. Also be careful which column is short NIB ZR biological name.

V(g)$names = V(g)$geneID
V(g)$nodeID = V(g)$names
V(g)$numVec = seq(1,length(V(g)$id),1)

# list.vertex.attributes(g)
# list.edge.attributes(g)

quantile(degree(g), c(0.5, 0.75, 0.90, 0.95, 0.975, 0.99, 0.995, 0.9987, 1.0))
##    50%    75%    90%    95%  97.5%    99%  99.5% 99.87%   100% 
##  13.00  20.00  29.00  42.00  45.00  50.80  64.00  84.72  92.00
which(degree(g) > 0.6*max(degree(g)))
## AT1G64280 
##        11
# just for drawing purposes
# V(g)$size = rep(40,length(V(g)$size))
# V(g)$size = sample(seq(20,40,1), vcount(g), replace = TRUE)

4 Multi-Level

http://stackoverflow.com/questions/20364939/community-detection-with-infomap-algorithm-producing-one-massive-module#20390769

Finding community structure by multi-level optimization of modularity

https://www.r-bloggers.com/summary-of-community-detection-algorithms-in-igraph-0-6/

https://arxiv.org/abs/0803.0476

fastgreedy.community merges pairs of communities iteratively, always choosing the pair that yields the maximum increase in the overall modularity. In multilevel.community, communities are not merged; instead of that, nodes are moved between communities such that each node makes a local decision that maximizes its own contribution to the modularity score. When this procedure gets stuck (i.e. none of the nodes change their membership), then all the communities are collapsed into single nodes and the process continues (that’s why it is multilevel).

On undirected, simplified network (LKN/CKN)

Finding community structure by multi-level optimization of modularity

Description: This function implements the multi-level modularity optimization algorithm for finding community structure, see references below. It is based on the modularity measure and a hierarchial approach.

Usage: cluster_louvain(graph, weights = NULL)

BONUS: add overall degree as vertex attribute

################################################################################
# The number of edges remains constant, 
#  an undirected edge is created for each directed one,
#  this version might create graphs with multiple edges.
gu = as.undirected(g, mode = "each")
# summary(gu)
gs = simplify(gu,
              remove.multiple = TRUE,
              remove.loops = TRUE,
              edge.attr.comb = "concat")
              # Concatenate the attributes, using the c function.
              # This results almost always a complex attribute.

# summary(gs)

V(gu)$numVec = seq(1,length(V(gu)$id),1)
V(gs)$numVec = seq(1,length(V(gs)$id),1)

V(gs)$degreeFullSimplified = degree(gs, loops = FALSE, 
                                    normalized = FALSE,  mode = "all")
V(g)$degreeFullSimplified = degree(gs, loops = FALSE, 
                                   normalized = FALSE,  mode = "all")

# tic()
cl = multilevel.community(gs)
# exectime <- toc()
is_hierarchical(cl)
## [1] FALSE
names(cl)
## [1] "membership"  "memberships" "modularity"  "names"       "vcount"     
## [6] "algorithm"
# cl$membership
# cl$names

https://en.wikipedia.org/wiki/Power_law http://stackoverflow.com/questions/21541240/goodness-of-fit-test-for-power-law-distribution-in-r https://cran.r-project.org/web/packages/poweRlaw/

For degree_distribution a numeric vector of the same length as the maximum degree plus one. The first element is the relative frequency zero degree vertices, the second vertices with degree one, etc.

dd = degree_distribution(gs, v = V(gs), loops = FALSE, 
                         normalized = FALSE,  mode = "all")
plot(sort(dd, decreasing = TRUE))

ddd = degree(gs, v = V(gs))
plot(sort(ddd, decreasing = TRUE))

max(ddd) + 1 == length(dd)
## [1] TRUE
plf = power.law.fit(dd, impelementation = "plfit")
print(plf)
## $continuous
## [1] TRUE
## 
## $alpha
## [1] 3.111234
## 
## $xmin
## [1] 0.02836879
## 
## $logLik
## [1] 48.21339
## 
## $KS.stat
## [1] 0.2391393
## 
## $KS.p
## [1] 0.2546955
plf$alpha
## [1] 3.111234
xxm = plf$xmin + 0.01

alpha = vector(length = 0)
xm = vector(length = 0)
powerl = vector(length = 0)
powerl3 = vector(length = 0)
for (i in seq(xxm, max(dd), 0.01)) {
  plf = power.law.fit(dd, impelementation = "plfit", xmin = i)
  alpha[length(alpha) + 1] = plf$alpha
  xm[length(xm) + 1] = i
  powerl[length(powerl) + 1] = (round(plf$alpha) == 2)
  powerl3[length(powerl3) + 1] = (round(plf$alpha) == 3)
}

# closest value
alpha[ which.min(abs(alpha - 2.00))]
## [1] 3.180394
xm[ which.min(abs(alpha - 2.00))]
## [1] 0.03836879
aalpha = alpha[alpha >= 2.0]
aalpha[ which.min(abs(aalpha - 2.00))]
## [1] 3.180394
xm[ which.min(abs(aalpha - 2.00))]
## [1] 0.03836879
aalpha = alpha[alpha <= 3]
aalpha[ which.min(abs(aalpha - 3))]
## numeric(0)
xm[ which.min(abs(aalpha - 3))]
## numeric(0)

5 multilevel communities

Calculate some topological network parameters (http://www.nature.com/nprot/journal/v7/n4/full/nprot.2012.004.html) like centralisation, clustering coeff., number of hub nodes (http://www.nature.com/articles/srep08665)

http://igraph.org/r/doc/centralize.html

For degree, closeness and betweenness the most centralized structure is some version of the star graph, in-star, out-star or undirected star.

For eigenvector centrality the most centralized structure is the graph with a single edge (and potentially many isolates).

Be CAREFUL: network with 3 nodes or less cannot be plotted!

t = (as.data.frame(table(cl$membership)))
t[,1] = as.numeric(t[,1])
t[,2] = as.numeric(t[,2])
d = t[with(t, order(-Freq, Var1)), ]
dim(d)[1] == length(unique(cl$membership))
## [1] TRUE
sum(d[,2]) == vcount(g)
## [1] TRUE
# singletons
sum((t[t[,2] < 2, ])[,2])
## [1] 0
# duos
sum((t[t[,2] == 2, ])[,2])
## [1] 0
# trios
sum((t[t[,2] == 3, ])[,2])
## [1] 0
# quartets
sum((t[t[,2] == 4, ])[,2])
## [1] 0
V(g)$membership <- cl$membership
V(gu)$membership <- cl$membership
V(gs)$membership <- cl$membership

hist(cl$membership, breaks = length(unique(cl$membership)))

toSmallInd = d[(d[,2] <= 2^2),1]

d = d[d[,2] > 2^2,] # 3 doesn't plot

dim(t)[1]
## [1] 3
cat('some clusterID size')
## some clusterID size
print(d)
##   Var1 Freq
## 2    2   64
## 3    3   45
## 1    1   32
mydf = matrix(NA, ncol = 36, nrow = dim(d)[1])
colnames(mydf) = c('cliID', 'vcountD', 'ecountD', 'connectedD',
                   'starD','-Log10qD', 'evD', 'hDegD',
                   'vcountS', 'ecountS', 'centralityS',
                   'starS','-Log10qS', 'evS', 'hDegS',
                   'vcountGSS', 'ecountGSS', 'centralityGSS',
                   'starGSS','-Log10qGSS', 'evGSS', 'hDegGSS',
                   'vShrinkage', 'eShrinkage',
                   'centr_degreeD', 'centr_cloD' , 'centr_betwD',
                   'centr_degreeS', 'centr_cloS' , 'centr_betwS',
                   'centr_degreeGSS', 'centr_cloGSS', 'centr_betwGSS',
                   '-Log10edge_densityD', 
                   '-Log10edge_densityS', 
                   '-Log10edge_densityGSS')

d[,1] = as.numeric(d[,1])
d[,2] = as.numeric(d[,2])

for (i in 1:dim(d)[1]) {
  j = as.numeric(d[i,1])
  a = which(cl$membership == j) # cl[[j]] ####  #### ####  #### ####  ####

  tmpNetD = induced.subgraph(graph = g, vids = a)
  # do not want loops and multiple edges i count
  tmpNetS = induced.subgraph(graph = gs, vids = a)

  gss = delete.vertices(tmpNetS, which(degree(tmpNetS) < 2))


  vcS = vcount(tmpNetS)
  ecS = ecount(tmpNetS)
  vcD = vcount(tmpNetD)
  ecD = ecount(tmpNetD)
  vcgss = vcount(gss)
  ecgss = ecount(gss)

  starS = max(centr_degree(tmpNetS, loops = FALSE, 
                           normalized = TRUE, mode = 'all')$centralization,
      centr_clo(tmpNetS, mode = 'all', normalized = TRUE)$centralization,
      centr_betw(tmpNetS, directed = FALSE, normalized = TRUE)$centralization)

  if (length(E(gss))) {
      starGSS = max(centr_degree(gss, loops = FALSE, 
                                 normalized = TRUE, mode = 'all')$centralization,
        centr_clo(gss, mode = 'all', normalized = TRUE)$centralization,
        centr_betw(gss, directed = FALSE, normalized = TRUE)$centralization)
    } else starGSS = -1

  (hubD  = length(which(degree(tmpNetD) >= 0.6*max(degree(tmpNetD)))))
  (hubS  = length(which(degree(tmpNetS) >= 0.6*max(degree(tmpNetS)))))
  # cat('is star if measure close to 1:', j, starS, hubS, '\n')
  (hubGSS  = length(which(degree(gss) >= 0.6*max(degree(gss)))))

  qS = 2*ecount(tmpNetS)/(vcS*(vcS - 1)) # == edge_density(tmpNetS)
  qD = 1*ecount(tmpNetD)/(vcD*(vcD - 1))
  qGSS = ifelse(vcgss > 2, 2*ecount(gss)/(vcgss*(vcgss - 1)), 0)

    mydf[i,1] = j
    mydf[i,2] = vcD
    mydf[i,3] = ecount(tmpNetD)
    mydf[i,4] = is.connected(tmpNetD)

    mydf[i,5] = ifelse((hubD) == 1, 1, 0)
    mydf[i,6] = -log10(qD)
    mydf[i,7] = ecount(tmpNetD)/vcD
    mydf[i,8] = hubD

    mydf[i,9] = vcS
    mydf[i,10] = ecount(tmpNetS)
    mydf[i,11] = starS

    mydf[i,12] = ifelse((hubS) == 1, 1, 0)
    mydf[i,13] = -log10(qS)
    mydf[i,14] = ecount(tmpNetS)/vcS
    mydf[i,15] = hubS

    mydf[i,16] = vcgss
    mydf[i,17] = ecount(gss)
    mydf[i,18] = starGSS

    mydf[i,19] = ifelse((hubGSS) == 1, 1, 0)
    mydf[i,20] = ifelse(qGSS, -log10(qGSS), -1)
    mydf[i,21] = ecount(gss)/vcgss
    mydf[i,22] = hubGSS

    mydf[i,23] = round(vcount(tmpNetD)/vcount(gss))
    mydf[i,24] = ifelse(ecount(gss) != 0, 
                        round(ecount(tmpNetD)/ecount(gss)), 
                        -1)

    mydf[i,25] = centr_degree(tmpNetD, loops = FALSE, 
                              normalized = TRUE, mode = 'all')$centralization
    mydf[i,26] = centr_clo(tmpNetD, mode = 'all', 
                           normalized = TRUE)$centralization
    mydf[i,27] = centr_betw(tmpNetD, directed = FALSE, 
                            normalized = TRUE)$centralization

    mydf[i,28] = centr_degree(tmpNetS, loops = FALSE, 
                              normalized = TRUE, mode = 'all')$centralization
    mydf[i,29] = centr_clo(tmpNetS, mode = 'all', 
                           normalized = TRUE)$centralization
    mydf[i,30] = centr_betw(tmpNetS, directed = FALSE, 
                            normalized = TRUE)$centralization

    mydf[i,31] = ifelse(ecount(gss) != 0,
                        centr_degree(gss, loops = FALSE, 
                                     normalized = TRUE, mode = 'all')$centralization, 
                        -1)
    mydf[i,32] = ifelse(ecount(gss) != 0,
                        centr_clo(gss, mode = 'all', 
                                  normalized = TRUE)$centralization, 
                        -1)
    mydf[i,33] = ifelse(ecount(gss) != 0,
                        centr_betw(gss, directed = FALSE, 
                                   normalized = TRUE)$centralization, 
                        -1)

    mydf[i,34] = -log10(edge_density(simplify(tmpNetD), loops = FALSE))
    mydf[i,35] = -log10(edge_density(simplify(tmpNetS), loops = FALSE))
    mydf[i,36] = ifelse(ecount(gss) != 0, 
                        -log10(edge_density(simplify(gss), loops = FALSE)), 
                        -1)


}

rownames(mydf) = mydf[,1]

write.table(mydf, paste0(dir1, '/', myname, '_mydf.txt'), 
            sep = '\t', row.names = FALSE, quote = FALSE)

mydf.se = (mydf[order(-mydf[,3], -mydf[,2]),] )
# print (mydf.se)
mydf.sv = (mydf[order(-mydf[,2], -mydf[,3]),] )
# print (mydf.sv)

dim(mydf)
## [1]  3 36

6 stars, densely connected, unconnected

indI = union(intersect(intersect(which(mydf[,11] > 0.60), 
                                 which(round(mydf[,14]) == 1)), 
                       which(mydf[,30] > 0.60)),
             which(mydf[,15] == 1))
myStars = as.vector(mydf[indI, 1])

denselyConnectedG = as.vector(mydf[as.vector(which(round(mydf[,11], digits = 2) == 0.00)),1])

unConnected = as.vector(mydf[as.vector(which(mydf[,4]  == 0)),1])

myStars = setdiff(myStars, unConnected)

denselyConnectedG = setdiff(denselyConnectedG, unConnected)

7 matrix of coord and clu

mat <- matrix(nrow = 0, ncol = 8)
colnames(mat) = c('nodeID', 
                  'x', 'y', 
                  'clu', 'origin', 
                  'degreeSimplifiedSuperCluster', 'degreeSimplifiedCluster', 
                  'isconnected')
head(mat)
##      nodeID x y clu origin degreeSimplifiedSuperCluster
##      degreeSimplifiedCluster isconnected
myClusters = 1

V(g)$x = vector(mode = "double", vcount(g))
V(g)$y = vector(mode = "double", vcount(g))

# list.vertex.attributes(g)
# list.edge.attributes(g)

8 mini

cntV = 0
Vid = unlist(lapply(toSmallInd, function(x) cl[[x]]))
cnt = length(Vid)

if (cnt) {
  mini = induced.subgraph(graph = g, vids = strtoi(match(Vid, cl$names)))
  mydeg = degree(mini, loops = FALSE, normalized = FALSE,  mode = "all")
  for (l in 1:vcount(mini)) {
    tmpvec = c()
    tmpvec = c(V(mini)$nodeID[l], 0, 0, 0, 
               paste0('mini','_',l), mydeg[l], mydeg[l], is.connected(mini))
    mat = rbind(mat,tmpvec)
  }

  myseq = paste0('mini','_',seq(1,vcount(mini),1))
  rownames(mat)[which(rownames(mat) == 'tmpvec')] = myseq
  print(cntV)
  print(dim(mat))
  # print(head(mat))
  print(tail(mat))
  print(myClusters)

}

9 small-moderate, big

mydf = mydf[which(mydf[,1] %ni% c(myStars, unConnected, denselyConnectedG)),]

big.0.0 = mydf[union((which((mydf[,2] >= 2^10))), 
                     (which(((mydf[,3] >= 2^11))))),]
dim(big.0.0)
## [1]  0 36
normal.0.0 = mydf[which(as.vector(mydf[,1]) %ni% as.vector(big.0.0[,1])),]
dim(normal.0.0)
## [1]  2 36

10 moderate

cntV = 0
tmplist = vector()

print(normal.0.0)
##   cliID vcountD ecountD connectedD starD  -Log10qD       evD hDegD vcountS
## 3     3      45     428          1     0 0.6652214  9.511111    10      45
## 1     1      32     345          1     0 0.4586926 10.781250    24      32
##   ecountS centralityS starS  -Log10qS       evS hDegS vcountGSS ecountGSS
## 3     428   0.6126174     0 0.3641914  9.511111    10        45       428
## 1     345   0.4472610     0 0.1576626 10.781250    24        30       343
##   centralityGSS starGSS -Log10qGSS     evGSS hDegGSS vShrinkage eShrinkage
## 3     0.6126174       0  0.3641914  9.511111      10          1          1
## 1     0.3365043       0  0.1031951 11.433333      30          1          1
##   centr_degreeD centr_cloD centr_betwD centr_degreeS centr_cloS
## 3     0.2732558  0.6126174  0.06315932     0.5465116  0.6126174
## 1     0.1623656  0.4472610  0.13908256     0.3247312  0.4472610
##   centr_betwS centr_degreeGSS centr_cloGSS centr_betwGSS
## 3  0.06315932       0.5465116    0.6126174    0.06315932
## 1  0.13908256       0.2266010    0.3365043    0.01351562
##   -Log10edge_densityD -Log10edge_densityS -Log10edge_densityGSS
## 3           0.6652214           0.3641914             0.3641914
## 1           0.4586926           0.1576626             0.1031951
if (!is.null(dim(normal.0.0))) {
  IND = as.vector(normal.0.0[,1])
} else {
  IND = normal.0.0[1]
}


if (length(IND)) {
  for (i in 1:length(IND)) {
      print(IND[i])

      ttmplist = vector()
      tmplist = c(tmplist, paste0((IND)[i],'_',seq(1,(length(cl[[IND[i]]])),1)))
      ttmplist = c(ttmplist, paste0((IND)[i],'_',seq(1,(length(cl[[IND[i]]])),1)))

      moderateNet = induced.subgraph(graph = g, vids = strtoi(match(cl[[IND[i]]], cl$names))) ####  ####  ####  ####
      moderateNetGS = induced.subgraph(graph = gs,vids = strtoi(match(cl[[IND[i]]], cl$names)))  ####  ####  ####  ####

      cat("i = ", i, "\n")
      cat("# vertices", vcount(moderateNet), "\n")
      cntV = cntV + vcount(moderateNet)
      cat("# edges", ecount(moderateNet), "\n")
      cat("#edges/#vertices", ecount(moderateNet)/vcount(moderateNet), "\n")
      cat('\n')

    if ((ecount(moderateNet) < 2^6) & (vcount(moderateNet) < 2^6)) {
      l2 = layout.fruchterman.reingold(moderateNet, niter = 2^13)
    } else {

      l1 = layout_on_grid(moderateNet, dim = 2)
      l2 = layout_with_kk(moderateNet, coords = l1, dim = 2,
                          maxiter = 999 * vcount(moderateNet),
                          epsilon = 0, kkconst = vcount(moderateNet),
                          minx = NULL, maxx = NULL,
                          miny = NULL, maxy = NULL,
                          minz = NULL,maxz = NULL)
      z = round(ecount(moderateNet)/vcount(moderateNet))
      l2 = l2*2*z
    }

      if (vcount(moderateNet) >= 2^2) {
        plot(0, type = "n",
             #ann=FALSE,
             axes = FALSE,
            xlim = extendrange(l2[,1]),
            ylim = extendrange(l2[,2]),
            xlab = paste0('moderateNet_', myClusters),
            ylab = paste0('clu: ',(IND)[i]))

        plot(moderateNet, layout = l2,
             edge.label = '',
             vertex.label = V(moderateNet)$shortName,
             vertex.color = 'gray60', ####  ####  ####  ####
             rescale = FALSE, add = TRUE,
             vertex.label.cex = 0.5,
             edge.arrow.size = 0.25,
             edge.arrow.width = 0.25,
             edge.lty = 'solid',
             edge.color = 'gray',
             edge.width = 0.25,
             # edge.arrow.mode = 0,
             edge.label.cex = 0.25,
             edge.curved = autocurve.edges(moderateNet))
      }
      ###

      mydeg = degree(moderateNet, loops = FALSE, 
                     normalized = FALSE,  mode = "all")
      for (l in 1:vcount(moderateNet)) {
        tmpvec = c()
        tmpvec = c(V(moderateNet)$nodeID[l], l2[l,1], l2[l,2], myClusters,
                   paste0('normal','_',ttmplist[l]), 
                   mydeg[l], mydeg[l], is.connected(moderateNet))
        mat = rbind(mat,tmpvec)
      }
      myClusters = myClusters + 1

  }
  myseq = paste0(rep('moderate'),'_', tmplist)
  rownames(mat)[which(rownames(mat) == 'tmpvec')] = myseq
  print(dim(mat))
  # print(head(mat))
  print(tail(mat))
  print(myClusters)
}
## [1] 3
## i =  1 
## # vertices 45 
## # edges 428 
## #edges/#vertices 9.511111

## [1] 1
## i =  2 
## # vertices 32 
## # edges 345 
## #edges/#vertices 10.78125

## [1] 77  8
##               nodeID      x                   y                  clu
## moderate_1_27 "AT4G31990" "82.3725686169967"  "108.83013589392"  "2"
## moderate_1_28 "AT5G19550" "75.1650814336672"  "121.665335142787" "2"
## moderate_1_29 "AT1G62800" "105.778630124115"  "92.8744728308272" "2"
## moderate_1_30 "AT5G11520" "97.3836402164135"  "106.740380484517" "2"
## moderate_1_31 "AT5G26920" "-53.4522133496858" "37.8593880243283" "2"
## moderate_1_32 "AT5G13320" "-48.6532235714614" "87.2168029981343" "2"
##               origin        degreeSimplifiedSuperCluster
## moderate_1_27 "normal_1_27" "18"                        
## moderate_1_28 "normal_1_28" "18"                        
## moderate_1_29 "normal_1_29" "18"                        
## moderate_1_30 "normal_1_30" "18"                        
## moderate_1_31 "normal_1_31" "1"                         
## moderate_1_32 "normal_1_32" "1"                         
##               degreeSimplifiedCluster isconnected
## moderate_1_27 "18"                    "TRUE"     
## moderate_1_28 "18"                    "TRUE"     
## moderate_1_29 "18"                    "TRUE"     
## moderate_1_30 "18"                    "TRUE"     
## moderate_1_31 "1"                     "TRUE"     
## moderate_1_32 "1"                     "TRUE"     
## [1] 3

11 stars

IND = myStars
cntV = 0
tmplist = vector()

if (length(IND)) {
  for (i in 1:length(IND)) {

    ttmplist = vector()
    tmplist = c(tmplist, paste0((IND)[i],'_',seq(1,(length(cl[[IND[i]]])),1)))
    ttmplist = c(ttmplist, paste0((IND)[i],'_',seq(1,(length(cl[[IND[i]]])),1)))

    starNet = induced.subgraph(graph = g, vids = strtoi(match(cl[[IND[i]]], cl$names)))  ####  ####  ####  ####

    cat("i = ", i, "\n")
    cat("# vertices", vcount(starNet), "\n")
    cntV = cntV + vcount(starNet)
    cat("# edges", ecount(starNet), "\n")
    cat("#edges/#vertices", ecount(starNet)/vcount(starNet), "\n")
    cat('\n')

    if ((ecount(starNet) < 2^6) & (vcount(starNet) < 2^6)) {
      l2 = layout.fruchterman.reingold(starNet, niter = 2^13)
    } else {

      l1 = layout_on_grid(starNet, dim = 2)
      l2 = layout_with_kk(starNet, coords = l1, dim = 2,
                          maxiter = 999 * vcount(starNet),
                          epsilon = 0, kkconst = vcount(starNet),
                          #weights = rep(100, length.out),
                          minx = NULL, maxx = NULL,
                          miny = NULL, maxy = NULL,
                          minz = NULL,maxz = NULL)
      z = round(ecount(starNet)/vcount(starNet))
      l2 = l2*2*z
    }

    if ((vcount(starNet) >= 2^2) & (ecount(starNet) <= 2^11)) {
      plot(0, type = "n", ann = TRUE, axes = FALSE,
           xlim = extendrange(l2[,1]),
           ylim = extendrange(l2[,2]),
           xlab = paste0('starNet_', myClusters),
           ylab = paste0('clu: ',(IND)[i]))

      plot(starNet, layout = l2,
           edge.label = '',
           vertex.label = V(starNet)$shortName,
           vertex.color = 'gray60', ####  ####  ####  ####
           rescale = FALSE, add = TRUE,
           vertex.label.cex = 0.5,
           edge.arrow.size = 0.25,
           edge.arrow.width = 0.25,
           edge.lty = 'solid',
           edge.color = 'gray',
           edge.width = 0.25,
           # edge.arrow.mode = 0,
           edge.label.cex = 0.25,
           edge.curved = autocurve.edges(starNet))
      }
      ###

      mydeg = degree(starNet, loops = FALSE, 
                     normalized = FALSE,  mode = "all")
      for (l in 1:vcount(starNet)) {
        tmpvec = c()
        tmpvec = c(V(starNet)$nodeID[l], l2[l,1], l2[l,2], myClusters,
                   paste0('starNet','_',ttmplist[l]), 
                   mydeg[l], mydeg[l], is.connected(starNet))
        mat = rbind(mat,tmpvec)
      }
      myClusters = myClusters + 1

  }
  myseq = paste0(rep('starNet'),'_', tmplist)
  rownames(mat)[which(rownames(mat) == 'tmpvec')] = myseq
  print(dim(mat))
  # print(head(mat))
  print(tail(mat))
  print(myClusters)
}
## i =  1 
## # vertices 64 
## # edges 211 
## #edges/#vertices 3.296875

## [1] 141   8
##              nodeID      x                  y                  clu
## starNet_2_59 "AT5G06839" "-2.0027303222501" "36.3652845433234" "3"
## starNet_2_60 "AT5G24780" "57.4171224319684" "11.8290178498089" "3"
## starNet_2_61 "AT4G17880" "52.3313115895203" "5.15420519294015" "3"
## starNet_2_62 "AT1G78780" "30.783553305754"  "43.0373541757408" "3"
## starNet_2_63 "AT1G08720" "5.6653193715662"  "44.6021783570855" "3"
## starNet_2_64 "AT4G26000" "48.536011846199"  "47.0850680452374" "3"
##              origin         degreeSimplifiedSuperCluster
## starNet_2_59 "starNet_2_59" "1"                         
## starNet_2_60 "starNet_2_60" "2"                         
## starNet_2_61 "starNet_2_61" "4"                         
## starNet_2_62 "starNet_2_62" "2"                         
## starNet_2_63 "starNet_2_63" "1"                         
## starNet_2_64 "starNet_2_64" "1"                         
##              degreeSimplifiedCluster isconnected
## starNet_2_59 "1"                     "TRUE"     
## starNet_2_60 "2"                     "TRUE"     
## starNet_2_61 "4"                     "TRUE"     
## starNet_2_62 "2"                     "TRUE"     
## starNet_2_63 "1"                     "TRUE"     
## starNet_2_64 "1"                     "TRUE"     
## [1] 4

12 densely Connected

IND = denselyConnectedG
cntV = 0
tmplist = vector()

if (length(IND)) {
  for (i in 1:length(IND)) {

    ttmplist = vector()
    tmplist = c(tmplist, paste0((IND)[i],'_',seq(1,(length(cl[[IND[i]]])),1)))
    ttmplist = c(ttmplist, paste0((IND)[i],'_',seq(1,(length(cl[[IND[i]]])),1)))

    denselyConnectedNet = induced.subgraph(graph = g, vids = strtoi(match(cl[[IND[i]]], cl$names)))  ####  ####  ####  ####

    cat("i = ", i, "\n")
    cat("# vertices", vcount(denselyConnectedNet), "\n")
    cntV = cntV + vcount(denselyConnectedNet)
    cat("# edges", ecount(denselyConnectedNet), "\n")
    cat("#edges/#vertices", 
        ecount(denselyConnectedNet)/vcount(denselyConnectedNet), 
        "\n")
    #     cat ('duplicated connect rows:', '\n')
    #     print (cbind(V(denselyConnectedNet)$nodeID[get.edgelist(denselyConnectedNet)[duplicated(get.edgelist(denselyConnectedNet)),1]],
    #                                          V(denselyConnectedNet)$nodeID[get.edgelist(denselyConnectedNet)[duplicated(get.edgelist(denselyConnectedNet)),2]]))
    cat('\n')

    if ((ecount(denselyConnectedNet) < 2^6) & (vcount(denselyConnectedNet) < 2^6)) {
      l2 = layout.fruchterman.reingold(denselyConnectedNet, niter = 2^13)
    } else {

      l1 = layout_on_grid(denselyConnectedNet, dim = 2)
      l2 = layout_with_kk(denselyConnectedNet, coords = l1, dim = 2,
                          maxiter = 999 * vcount(denselyConnectedNet),
                          epsilon = 0, kkconst = vcount(denselyConnectedNet),
                          minx = NULL, maxx = NULL,
                          miny = NULL, maxy = NULL,
                          minz = NULL,maxz = NULL)
      z = round(ecount(denselyConnectedNet)/vcount(denselyConnectedNet))
      l2 = l2*2*z
    }

    if ((vcount(denselyConnectedNet) >= 2^2) & (ecount(denselyConnectedNet) <= 2^11)) {
      plot(0, type = "n", ann = TRUE, axes = FALSE,
           xlim = extendrange(l2[,1]),
           ylim = extendrange(l2[,2]),
           xlab = paste0('denselyConnectedNet_', myClusters),
           ylab = paste0('clu: ',(IND)[i]))

      plot(denselyConnectedNet, layout = l2,
           edge.label = '',
           vertex.label = V(denselyConnectedNet)$shortName,
           vertex.color = 'gray60', ####  ####  ####  ####
           rescale = FALSE, add = TRUE,
           vertex.label.cex = 0.5,
           edge.arrow.size = 0.25,
           edge.arrow.width = 0.25,
           edge.lty = 'solid',
           edge.color = 'gray',
           edge.width = 0.25,
           # edge.arrow.mode = 0,
           edge.label.cex = 0.25,
           edge.curved = autocurve.edges(denselyConnectedNet))
    }
    ###

    mydeg = degree(denselyConnectedNet, loops = FALSE, 
                   normalized = FALSE,  mode = "all")
    for (l in 1:vcount(denselyConnectedNet)) {
      tmpvec = c()
      tmpvec = c(V(denselyConnectedNet)$nodeID[l], l2[l,1], l2[l,2], myClusters,
                 paste0('denselyConnectedNet','_',ttmplist[l]), 
                 mydeg[l], mydeg[l], is.connected(denselyConnectedNet))
      mat = rbind(mat,tmpvec)
    }
    myClusters = myClusters + 1

  }
  myseq = paste0(rep('denselyConnectedNet'),'_', tmplist)
  rownames(mat)[which(rownames(mat) == 'tmpvec')] = myseq
  print(dim(mat))
  # print(head(mat))
  print(tail(mat))
  print(myClusters)
}

13 unConnected

IND = unConnected
cntV = 0
tmplist = vector()

if (length(IND)) {
  for (i in 1:length(IND)) {

    tmplist = c(tmplist, paste0((IND)[i],'_',seq(1,(length(cl[[IND[i]]])),1)))
    print(tmplist)

    unConnectedNet = induced.subgraph(graph = g, vids = strtoi(match(cl[[IND[i]]], cl$names)))  ####  ####  ####  ####

    mydegSC = degree(unConnectedNet, loops = FALSE, normalized = FALSE,  mode = "all")
    mydegSC = cbind(strtoi(match(cl[[IND[i]]], cl$names)), mydegSC)

    graphs = decompose(unConnectedNet, mode = "weak")
    # lapply(graphs,function(x) cat('######## ######## unlist(vcount(x))', unlist(vcount(x)), '\n'))
    unConnectedNet = graphs[[which(unlist(lapply(graphs,function(x) vcount(x)))
                                   ==
                                     max(unlist(lapply(graphs,function(x) vcount(x)))))]]

    tempora = which(unlist(lapply(graphs,function(x) vcount(x))) !=  max(unlist(lapply(graphs,function(x) vcount(x)))))  ####  ####  ####  ####

    if ((length(graphs) - 1) != 1) {
      vids = as.vector(unlist(sapply(tempora, 
                                     function(x) match(V(graphs[[x]])$geneID, cl$names))))
      unConnectedNetLeftovers = induced.subgraph(graph = g,vids = strtoi(vids))
    } else {
      vids = match(V(graphs[[tempora]])$geneID, cl$names)
      unConnectedNetLeftovers = induced.subgraph(graph = g,vids = strtoi(vids))
    }

    cat("# vertices", vcount(unConnectedNetLeftovers), "\n")
    cntV = cntV + vcount(unConnectedNetLeftovers)
    cat("# edges", ecount(unConnectedNetLeftovers), "\n")
    cat("#edges/#vertices", 
        ecount(unConnectedNetLeftovers)/vcount(unConnectedNetLeftovers), 
        "\n")
    cat('\n')

    if ((ecount(unConnectedNetLeftovers) < 2^6) & (vcount(unConnectedNetLeftovers) < 2^6)) {
      l2 = layout.fruchterman.reingold(unConnectedNetLeftovers, niter = 2^13)
    } else {

      l1 = layout_on_grid(unConnectedNetLeftovers, dim = 2)
      l2 = layout_with_kk(unConnectedNetLeftovers, coords = l1, dim = 2,
                          maxiter = 999 * vcount(unConnectedNetLeftovers),
                          epsilon = 0, kkconst = vcount(unConnectedNetLeftovers),
                          #weights = rep(100, length.out),
                          minx = NULL, maxx = NULL,
                          miny = NULL, maxy = NULL,
                          minz = NULL,maxz = NULL)
      z = round(ecount(unConnectedNetLeftovers)/vcount(unConnectedNetLeftovers))
      l2 = l2*2*z
    }

    if ((vcount(unConnectedNetLeftovers) >= 2^2) & (ecount(unConnectedNetLeftovers) <= 2^11)) {
      plot(0, type = "n", ann = TRUE, axes = FALSE,
           xlim = extendrange(l2[,1]),
           ylim = extendrange(l2[,2]),
           xlab = paste0('unConnectedNetLeftovers_', myClusters),
           ylab = paste0('clu: ',(IND)[i]))

      plot(unConnectedNetLeftovers, layout = l2,
           edge.label = '',
           vertex.label = V(unConnectedNetLeftovers)$shortName,
           vertex.color = 'gray60', ####  ####  ####  ####
           rescale = FALSE, add = TRUE,
           vertex.label.cex = 0.5,
           edge.arrow.size = 0.25,
           edge.arrow.width = 0.25,
           edge.lty = 'solid',
           edge.color = 'gray',
           edge.width = 0.25,
           # edge.arrow.mode = 0,
           edge.label.cex = 0.25,
           edge.curved = autocurve.edges(unConnectedNetLeftovers))
    }
    ttmplist = vector()
    ttmplist = c(ttmplist, paste0((IND)[i],
                                  '_unConnectedNetLeftovers_',
                                  seq(1,vcount(unConnectedNetLeftovers),1)))
    print(ttmplist)
    mydeg = degree(unConnectedNetLeftovers, loops = FALSE, 
                   normalized = FALSE,  mode = "all")
    mydegSCucl = mydegSC[which(!is.na(match(mydegSC[,1], 
                                            V(unConnectedNetLeftovers)$numVec))),2]

    for (l in 1:vcount(unConnectedNetLeftovers)) {
      tmpvec = c()
      tmpvec = c(V(unConnectedNetLeftovers)$nodeID[l], l2[l,1], l2[l,2], myClusters,
                 paste0(ttmplist[l]), 
                 mydegSCucl[l], mydeg[l], 
                 is.connected(unConnectedNetLeftovers))
      mat = rbind(mat,tmpvec)
    }
    myClusters = myClusters + 1

    tmpNetS = as.undirected(unConnectedNet, mode = "each")
    tmpNetS = simplify(tmpNetS,
                       remove.multiple = TRUE,
                       remove.loops = TRUE,
                       edge.attr.comb = "concat")
    mysn = length(which(degree(tmpNetS) >= 0.6*max(degree(tmpNetS))))
    
    if (mysn < 2) {
      # mysn = 2
      # print('Problemos')
      
      cat('##########\n')
      mysubg = unConnectedNet
            cat('connected\t', is.connected(mysubg), '\n')
      cat("# vertices", vcount(mysubg), "\n")
      cat("# edges", ecount(mysubg), "\n")
      cat("#edges/#vertices", ecount(mysubg)/vcount(mysubg), "\n")
      q = ecount(mysubg)/(vcount(mysubg)*(vcount(mysubg) - 1))
      cat("#edges/#MAXedges", q, "\n")
    
      if ((ecount(mysubg) < 2^6) & (vcount(mysubg) < 2^6)) {
        l2 = layout.fruchterman.reingold(mysubg, niter = 2^13)
      } else {
    
        l1 = layout_on_grid(mysubg, dim = 2)
        l2 = layout_with_kk(mysubg, coords = l1, dim = 2,
                    maxiter = 999 * vcount(mysubg),
                    epsilon = 0, kkconst = vcount(mysubg),
                    #weights = rep(100, length.out),
                    minx = NULL, maxx = NULL,
                    miny = NULL, maxy = NULL,
                    minz = NULL,maxz = NULL)
        z = round(ecount(mysubg)/vcount(mysubg))
        l2 = l2*2*z
      }
    
      if ((vcount(mysubg) >= 2^2) & (ecount(mysubg) <= 2^11)) {
        plot(0, type = "n", ann = TRUE, axes = FALSE,
             xlim = extendrange(l2[,1]),
             ylim = extendrange(l2[,2]),
             xlab = paste0('origigi; clu: ', IND[i]),
             ylab = paste0('subclu: ',j))
    
        plot(mysubg,
             layout = l2,
             vertex.color = 'gray60', ####  ####  ####  ####
             edge.label = '',
             vertex.label = V(mysubg)$shortName,
             rescale = FALSE, add = TRUE,
             vertex.label.cex = 0.25,
             edge.arrow.size = 0.025,
             edge.arrow.width = 0.025,
             edge.lty = 'solid',
             edge.color = 'gray',
             edge.width = 0.025,
             # edge.arrow.mode = 0,
             edge.label.cex = 0.025,
             edge.curved = autocurve.edges(mysubg))
      }
        ttmplist = vector()
        ttmplist = c(ttmplist, paste0((IND)[i],
                          '_unConnectedNet_mysubg_', 
                          j, 
                          '_', 
                          seq(1,vcount(mysubg),1)))
        print(ttmplist)
        mydeg = degree(mysubg, loops = FALSE, 
                   normalized = FALSE,  mode = "all")
        mydegSCs = mydegSC[which(!is.na(match(mydegSC[,1], V(mysubg)$numVec))),2]
        for (l in 1:vcount(mysubg)) {
          tmpvec = c()
          tmpvec = c(V(mysubg)$nodeID[l], l2[l,1], l2[l,2], myClusters,
                 ttmplist[l], mydegSCs[l], mydeg[l], is.connected(mysubg))
          mat = rbind(mat,tmpvec)
        }
        myClusters = myClusters + 1
      
      
    } else {
          
        cat('mysn', mysn, '\n\n')
    
    
        ##########  ##########  ##########  ##########  ##########
    
    
        # set.seed(123456)
        clspin = spinglass.community(unConnectedNet, spins = mysn)
        cntV = cntV + vcount(unConnectedNet)
    
        cat("# spins", mysn, "\n")
    
        df = (as.data.frame(table(clspin$membership)))
        print((df))
        df[,1] = strtoi(df[,1])
        df[,2] = strtoi(df[,2])
    
        for (j in df[,1]) {
          cat('##########\n')
          mysubg = induced.subgraph(graph = unConnectedNet, vids = strtoi(match(clspin[[j]], clspin$names))) ####  ####  ####  ####
    
    
          cat('connected\t', is.connected(mysubg), '\n')
          cat("# vertices", vcount(mysubg), "\n")
          cat("# edges", ecount(mysubg), "\n")
          cat("#edges/#vertices", ecount(mysubg)/vcount(mysubg), "\n")
          q = ecount(mysubg)/(vcount(mysubg)*(vcount(mysubg) - 1))
          cat("#edges/#MAXedges", q, "\n")
    
          if ((ecount(mysubg) < 2^6) & (vcount(mysubg) < 2^6)) {
            l2 = layout.fruchterman.reingold(mysubg, niter = 2^13)
          } else {
    
            l1 = layout_on_grid(mysubg, dim = 2)
            l2 = layout_with_kk(mysubg, coords = l1, dim = 2,
                                maxiter = 999 * vcount(mysubg),
                                epsilon = 0, kkconst = vcount(mysubg),
                                #weights = rep(100, length.out),
                                minx = NULL, maxx = NULL,
                                miny = NULL, maxy = NULL,
                                minz = NULL,maxz = NULL)
            z = round(ecount(mysubg)/vcount(mysubg))
            l2 = l2*2*z
          }
    
          if ((vcount(mysubg) >= 2^2) & (ecount(mysubg) <= 2^11)) {
            plot(0, type = "n", ann = TRUE, axes = FALSE,
                 xlim = extendrange(l2[,1]),
                 ylim = extendrange(l2[,2]),
                 xlab = paste0('origigi; clu: ', IND[i]),
                 ylab = paste0('subclu: ',j))
    
            plot(mysubg,
                 layout = l2,
                 vertex.color = 'gray60', ####  ####  ####  ####
                 edge.label = '',
                 vertex.label = V(mysubg)$shortName,
                 rescale = FALSE, add = TRUE,
                 vertex.label.cex = 0.25,
                 edge.arrow.size = 0.025,
                 edge.arrow.width = 0.025,
                 edge.lty = 'solid',
                 edge.color = 'gray',
                 edge.width = 0.025,
                 # edge.arrow.mode = 0,
                 edge.label.cex = 0.025,
                 edge.curved = autocurve.edges(mysubg))
          }
            ttmplist = vector()
            ttmplist = c(ttmplist, paste0((IND)[i],
                                          '_unConnectedNet_mysubg_', 
                                          j, 
                                          '_', 
                                          seq(1,vcount(mysubg),1)))
            print(ttmplist)
            mydeg = degree(mysubg, loops = FALSE, 
                           normalized = FALSE,  mode = "all")
            mydegSCs = mydegSC[which(!is.na(match(mydegSC[,1], V(mysubg)$numVec))),2]
            for (l in 1:vcount(mysubg)) {
              tmpvec = c()
              tmpvec = c(V(mysubg)$nodeID[l], l2[l,1], l2[l,2], myClusters,
                         ttmplist[l], mydegSCs[l], mydeg[l], is.connected(mysubg))
              mat = rbind(mat,tmpvec)
            }
            myClusters = myClusters + 1
        }
            ###
    }


  }
  
    myseq = paste0(rep('unConnectedNet'),'_', tmplist)
    rownames(mat)[which(rownames(mat) == 'tmpvec')] = myseq
    # any(duplicated(mat[grep('unConnectedNet',rownames(mat)),1]))
    # bla = data.frame(mat[which(duplicated(mat[,1])),c(1,4)])
    # bla = bla[with(bla, order(bla[,1])), ]
    print(dim(mat))
    # print(head(mat))
    print(tail(mat))
    print(myClusters)
  
}
save.image(paste0(dir1, '/', myname, 'pt1.RData'))
# load(paste0(dir1, '/', myname, 'pt1.RData'))

14 Splits

#######
cntV = 0
tmplist = vector()
IND = as.vector(big.0.0[,1])

if (length(IND)) {
  for (i in 1:length(IND)) {
    
    tmplist = c(tmplist, paste0((IND)[i],'_',seq(1,(length(cl[[IND[i]]])),1)))
    
    cat('i = ', i, "\n")
    ind =  strtoi(match(cl[[IND[i]]], cl$names))  ####  ####  ####  ####
    entryGraph = induced.subgraph(graph = g,vids = ind) 

    cat("# vertices", vcount(entryGraph), "\n")
    cat("# edges", ecount(entryGraph), "\n")
    cat("#edges/#vertices", ecount(entryGraph)/vcount(entryGraph), "\n")
    
    mydegSC = degree(entryGraph, loops = FALSE, 
                     normalized = FALSE,  mode = "all")
    mydegSC = cbind(ind, mydegSC)
    
    mysn = mydf[which(mydf[,1] == IND[i]),15]
    if (mysn < 2) {
      mysn = 2
    }
    cat('mysn', mysn, '\n\n')
    
    ##########  ##########  ##########  ##########  ##########
    ##########  ##########  ##########  ##########  ##########
    
    
    # set.seed(123456)
    clspin = spinglass.community(entryGraph, spins = mysn)
    cntV = cntV + vcount(entryGraph)
    
    df = (as.data.frame(table(clspin$membership)))
    print((df))
    df[,1] = strtoi(df[,1])
    df[,2] = strtoi(df[,2])
    cat("# spins", dim(df)[1], "\n")
    
    for (j in df[,1]) {
      cat('##########\n')
      mysubg = induced.subgraph(graph = entryGraph, vids = strtoi(match(clspin[[j]], clspin$names))) ####  ####  ####  ####
      
      cat('connected\t', is.connected(mysubg), '\n')
      cat("# vertices", vcount(mysubg), "\n")
      cat("# edges", ecount(mysubg), "\n")
      cat("#edges/#vertices", ecount(mysubg)/vcount(mysubg), "\n")
      q = ecount(mysubg)/(vcount(mysubg)*(vcount(mysubg) - 1))
      cat("#edges/#MAXedges", q, "\n")
      
      if ((ecount(mysubg) < 2^6) & (vcount(mysubg) < 2^6)) {
        l2 = layout.fruchterman.reingold(mysubg, niter = 2^13)
      } else {
        
        l1 = layout_on_grid(mysubg, dim = 2)
        l2 = layout_with_kk(mysubg, coords = l1, dim = 2,
                            maxiter = 999 * vcount(mysubg),
                            epsilon = 0, kkconst = vcount(mysubg),
                            minx = NULL, maxx = NULL,
                            miny = NULL, maxy = NULL,
                            minz = NULL,maxz = NULL)
        z = round(ecount(mysubg)/vcount(mysubg))
        l2 = l2*2*z
      }
      
      if ((vcount(mysubg) >= 2^2) & (ecount(mysubg) <= 2^11)) {
        plot(0, type = "n", ann = TRUE, axes = FALSE,
             xlim = extendrange(l2[,1]),
             ylim = extendrange(l2[,2]),
             xlab = paste0('origigi; clu: ', IND[i]),
             ylab = paste0('subclu: ',j))

        plot(mysubg,
             layout = l2,
             vertex.color = 'gray60', ####  ####  ####  ####
             #edge.label=E(mysubg)$intType,
             edge.label = '',
             vertex.label = V(mysubg)$shortName, #$X12_shortName,
             rescale = FALSE, add = TRUE,
             vertex.label.cex =  0.25,
             edge.arrow.size = 0.025,
             edge.arrow.width = 0.025,
             edge.lty = 'solid',
             edge.color = 'gray',
             edge.width = 0.025,
             # edge.arrow.mode = 0,
             edge.label.cex = 0.025,
             edge.curved = autocurve.edges(mysubg))
      }
        ttmplist = vector()
        ttmplist = c(ttmplist, paste0((IND)[i],
                                      '_big_mysubg_', 
                                      j, 
                                      '_', 
                                      seq(1,vcount(mysubg),1)))
        mydeg = degree(mysubg, loops = FALSE, 
                       normalized = FALSE,  mode = "all")
        mydegSCs = mydegSC[which(!is.na(match(mydegSC[,1], V(mysubg)$numVec))),2]
        for (l in 1:vcount(mysubg)) {
          tmpvec = c()
          tmpvec = c(V(mysubg)$nodeID[l], l2[l,1], l2[l,2], myClusters, 
                     ttmplist[l], mydegSCs[l], mydeg[l], is.connected(mysubg))
          mat = rbind(mat,tmpvec)
        }
        myClusters = myClusters + 1
    }
        ###
  }
  ########  ########  ########  ########
}

myseq = paste0(rep('big'),'_', tmplist)
rownames(mat)[which(rownames(mat) == 'tmpvec')] = myseq
print(dim(mat))
## [1] 141   8
# print(head(mat))
print(tail(mat))
##              nodeID      x                  y                  clu
## starNet_2_59 "AT5G06839" "-2.0027303222501" "36.3652845433234" "3"
## starNet_2_60 "AT5G24780" "57.4171224319684" "11.8290178498089" "3"
## starNet_2_61 "AT4G17880" "52.3313115895203" "5.15420519294015" "3"
## starNet_2_62 "AT1G78780" "30.783553305754"  "43.0373541757408" "3"
## starNet_2_63 "AT1G08720" "5.6653193715662"  "44.6021783570855" "3"
## starNet_2_64 "AT4G26000" "48.536011846199"  "47.0850680452374" "3"
##              origin         degreeSimplifiedSuperCluster
## starNet_2_59 "starNet_2_59" "1"                         
## starNet_2_60 "starNet_2_60" "2"                         
## starNet_2_61 "starNet_2_61" "4"                         
## starNet_2_62 "starNet_2_62" "2"                         
## starNet_2_63 "starNet_2_63" "1"                         
## starNet_2_64 "starNet_2_64" "1"                         
##              degreeSimplifiedCluster isconnected
## starNet_2_59 "1"                     "TRUE"     
## starNet_2_60 "2"                     "TRUE"     
## starNet_2_61 "4"                     "TRUE"     
## starNet_2_62 "2"                     "TRUE"     
## starNet_2_63 "1"                     "TRUE"     
## starNet_2_64 "1"                     "TRUE"
print(myClusters - 1)
## [1] 3
save.image(paste0(dir1, '/', myname, 'pt2.RData'))
# load(paste0(dir1, '/', myname, 'pt2.RData'))
mydataframe = as.data.frame(mat)
mydataframe[,1] = (as.character(((mydataframe[,1]))))
mydataframe[,2] = as.numeric(as.character(((mydataframe[,2]))))
mydataframe[,3] = as.numeric(as.character((mydataframe[,3])))
mydataframe[,4] = strtoi(as.character(mydataframe[,4]))
mydataframe[,5] = (as.character(((mydataframe[,5]))))
mydataframe[,6] = strtoi(as.character(mydataframe[,6]))
mydataframe[,7] = strtoi(as.character(mydataframe[,7]))
mydataframe[,8] = (as.character(((mydataframe[,8]))))

head(mat)
##              nodeID      x                  y                  clu
## moderate_3_1 "AT2G43790" "37.558906756742"  "90.5822247476108" "1"
## moderate_3_2 "AT1G10210" "51.0221609032984" "85.2300432855909" "1"
## moderate_3_3 "AT4G01370" "50.047985863501"  "65.9582767575875" "1"
## moderate_3_4 "AT3G45640" "38.1158454034153" "71.6412588884805" "1"
## moderate_3_5 "AT2G40860" "62.8073272304976" "80.0294836157671" "1"
## moderate_3_6 "AT5G65700" "17.9049752911501" "5.50758382547282" "1"
##              origin       degreeSimplifiedSuperCluster
## moderate_3_1 "normal_3_1" "42"                        
## moderate_3_2 "normal_3_2" "41"                        
## moderate_3_3 "normal_3_3" "41"                        
## moderate_3_4 "normal_3_4" "41"                        
## moderate_3_5 "normal_3_5" "42"                        
## moderate_3_6 "normal_3_6" "13"                        
##              degreeSimplifiedCluster isconnected
## moderate_3_1 "42"                    "TRUE"     
## moderate_3_2 "41"                    "TRUE"     
## moderate_3_3 "41"                    "TRUE"     
## moderate_3_4 "41"                    "TRUE"     
## moderate_3_5 "42"                    "TRUE"     
## moderate_3_6 "13"                    "TRUE"
head(mydataframe)
##                 nodeID        x         y clu     origin
## moderate_3_1 AT2G43790 37.55891 90.582225   1 normal_3_1
## moderate_3_2 AT1G10210 51.02216 85.230043   1 normal_3_2
## moderate_3_3 AT4G01370 50.04799 65.958277   1 normal_3_3
## moderate_3_4 AT3G45640 38.11585 71.641259   1 normal_3_4
## moderate_3_5 AT2G40860 62.80733 80.029484   1 normal_3_5
## moderate_3_6 AT5G65700 17.90498  5.507584   1 normal_3_6
##              degreeSimplifiedSuperCluster degreeSimplifiedCluster
## moderate_3_1                           42                      42
## moderate_3_2                           41                      41
## moderate_3_3                           41                      41
## moderate_3_4                           41                      41
## moderate_3_5                           42                      42
## moderate_3_6                           13                      13
##              isconnected
## moderate_3_1        TRUE
## moderate_3_2        TRUE
## moderate_3_3        TRUE
## moderate_3_4        TRUE
## moderate_3_5        TRUE
## moderate_3_6        TRUE
typeof(mat[,2])
## [1] "character"
typeof(mydataframe[,2])
## [1] "double"
mat = mydataframe

15 Results

plot(mat[,4])

hist(strtoi(mat[,4]), breaks = myClusters)

(all(mat[,1] %in% V(g)$geneID))
## [1] TRUE
(all(V(g)$geneID %in% mat[,1]))
## [1] TRUE
length(intersect(V(g)$nodeID, mat[,1]))
## [1] 141
dim(mat)[1] == vcount(g)
## [1] TRUE
vcount(g) - dim(mat)[1]
## [1] 0
myind = match(mat[,1],V(g)$nodeID)
mynumvec = strtoi(V(g)$numVec[myind])

mat.t = (as.data.frame(cbind(mat,as.numeric(mynumvec)), stringsAsFactors = FALSE))
colnames(mat.t)[dim(mat.t)[2]] = 'mynumvec'
mat.t[,dim(mat.t)[2]] = as.numeric(mat.t[,dim(mat.t)[2]])
# head(mat.t)
# tail(mat.t)
mat.sorted = mat.t[with(mat.t, order(mynumvec)), ]
head(mat.sorted)
##                 nodeID          x          y clu      origin
## starNet_2_1  AT3G25070  3.1407355  -5.418174   3 starNet_2_1
## starNet_2_2  AT3G07040 -0.6724419   9.520218   3 starNet_2_2
## moderate_1_1 AT5G05730 51.5716029 -13.606843   2  normal_1_1
## moderate_1_2 AT1G25220 68.2973404 -14.680944   2  normal_1_2
## moderate_1_3 AT2G29690 36.1639439 -11.244073   2  normal_1_3
## moderate_1_4 AT3G55870 66.7743236   2.639625   2  normal_1_4
##              degreeSimplifiedSuperCluster degreeSimplifiedCluster
## starNet_2_1                             6                       6
## starNet_2_2                             9                       9
## moderate_1_1                           20                      20
## moderate_1_2                           20                      20
## moderate_1_3                           20                      20
## moderate_1_4                           20                      20
##              isconnected mynumvec
## starNet_2_1         TRUE        1
## starNet_2_2         TRUE        2
## moderate_1_1        TRUE        3
## moderate_1_2        TRUE        4
## moderate_1_3        TRUE        5
## moderate_1_4        TRUE        6
tail(mat.sorted)
##                  nodeID          x         y clu       origin
## moderate_3_45 AT4G28650 116.782610 28.752568   1  normal_3_45
## starNet_2_61  AT4G17880  52.331312  5.154205   3 starNet_2_61
## starNet_2_62  AT1G78780  30.783553 43.037354   3 starNet_2_62
## starNet_2_63  AT1G08720   5.665319 44.602178   3 starNet_2_63
## moderate_1_32 AT5G13320 -48.653224 87.216803   2  normal_1_32
## starNet_2_64  AT4G26000  48.536012 47.085068   3 starNet_2_64
##               degreeSimplifiedSuperCluster degreeSimplifiedCluster
## moderate_3_45                           11                      11
## starNet_2_61                             4                       4
## starNet_2_62                             2                       2
## starNet_2_63                             1                       1
## moderate_1_32                            1                       1
## starNet_2_64                             1                       1
##               isconnected mynumvec
## moderate_3_45        TRUE      136
## starNet_2_61         TRUE      137
## starNet_2_62         TRUE      138
## starNet_2_63         TRUE      139
## moderate_1_32        TRUE      140
## starNet_2_64         TRUE      141
write.table(mat.sorted, paste0(dir1, '/', myname, "_id_coord_clu.tsv"), 
            sep = "\t", quote = FALSE, row.names = TRUE, col.names = TRUE)
# all vertex attributes
# list.vertex.attributes(g)
v = list.vertex.attributes(g)
# for (i in v) {
#   print(head(get.vertex.attribute(g,i)))
# }
# all edge attributes
# list.edge.attributes(g)
e = list.edge.attributes(g)
# for (j in e) {
#   print(head(get.edge.attribute(g,j)))
# }


setdiff(V(g)$geneID, mat.sorted$nodeID)
## character(0)
length(mat.sorted$mynumvec)
## [1] 141
min(mat.sorted$mynumvec)
## [1] 1
max(mat.sorted$mynumvec)
## [1] 141
missing = setdiff(seq(1,max(mat.sorted$mynumvec),1), mat.sorted$mynumvec)


myind = match(V(g)$nodeID, mat.sorted[,1])
noInfo = which(is.na(myind))
if (length(noInfo)) {
  for (i in 1:length(noInfo)) {
    print(i)
    tmpvec = c(V(g)$nodeID[noInfo[i]], 0, 0, 0, '-', 0, 'FALSE', noInfo[i])
    mat.sorted = rbind(mat.sorted, tmpvec)
  }
}

mat.sorted$mynumvec = strtoi( mat.sorted$mynumvec)
mat.sorted = mat.sorted[with(mat.sorted , order(mynumvec)), ]
any(!(V(g)$nodeID %in% mat.sorted$nodeID))
## [1] FALSE
any(!(mat.sorted$nodeID %in% V(g)$nodeID))
## [1] FALSE
all(mat.sorted$mynumvec[noInfo] == noInfo)
## [1] TRUE
myind = myind[which(!(is.na(myind)))]
myind = match(V(g)$nodeID, mat.sorted[,1])


V(g)$superClu <- V(g)$membership
V(g)$membership <- (mat.sorted$clu)
V(g)$x <- as.numeric(mat.sorted$x)
V(g)$y <- as.numeric(mat.sorted$y)
V(g)$cluOrigin <- mat.sorted$origin
V(g)$degreeSimplifiedCluster <- mat.sorted$degreeSimplifiedCluster
V(g)$degreeSimplifiedSuperCluster <- mat.sorted$degreeSimplifiedSuperCluster
V(g)$isconnected <- mat.sorted$isconnected
V(g)$exists <- rep(1, vcount(g))
# list.vertex.attributes(g)

E(g)$cluA = V(g)$membership[match(E(g)$geneID1,  V(g)$geneID)]
E(g)$cluB = V(g)$membership[match(E(g)$geneID2,  V(g)$geneID)]
E(g)$superCluA = V(g)$superClu[match(E(g)$geneID1,  V(g)$geneID)]
E(g)$superCluB = V(g)$superClu[match(E(g)$geneID2,  V(g)$geneID)]
E(g)$cluOriginA = V(g)$cluOrigin[match(E(g)$geneID1,  V(g)$geneID)]
E(g)$cluOriginB = V(g)$cluOrigin[match(E(g)$geneID2,  V(g)$geneID)]
E(g)$matchClu = (E(g)$cluA == E(g)$cluB)
E(g)$degreeFullSimplifiedA = V(g)$degreeFullSimplified[match(E(g)$geneID1,  V(g)$geneID)]
E(g)$degreeFullSimplifiedB = V(g)$degreeFullSimplified[match(E(g)$geneID2,  V(g)$geneID)]
E(g)$degreeSimplifiedSuperClusterA = V(g)$degreeSimplifiedSuperCluster[match(E(g)$geneID1,  V(g)$geneID)]
E(g)$degreeSimplifiedSuperClusterB = V(g)$degreeSimplifiedSuperCluster[match(E(g)$geneID2,  V(g)$geneID)]
E(g)$degreeSimplifiedClusterA = V(g)$degreeSimplifiedCluster[match(E(g)$geneID1,  V(g)$geneID)]
E(g)$degreeSimplifiedClusterB = V(g)$degreeSimplifiedCluster[match(E(g)$geneID2,  V(g)$geneID)]
E(g)$exists <- rep(1, ecount(g))
# list.edge.attributes(g)


# first check if theese attributes exist
v = list.vertex.attributes(g)
e = list.edge.attributes(g)

df1 = matrix(NA, dim(mat.sorted)[1], length(v))

for (i in 1:length(v)) {
  df1[,i] = get.vertex.attribute(g,v[i])
}

myind = match(V(g)$nodeID, mat.sorted[,1])
myind = myind[!(is.na(myind))]


df1 = as.data.frame(df1, stringsAsFactors = FALSE)
colnames(df1) = v
# str(df1)


# which columns!!!???!!!
colNames = toupper(c('geneID', 'shortDescription', 'shortName', 'MapManBin', 
                     'numVec', 'superClu', 'membership', 'cluOrigin', 'x', 'y', 
                     'degreeFullSimplified', 
                     'degreeSimplifiedSuperCluster', 
                     'degreeSimplifiedCluster', 
                     'isconnected',
                     'exists'))


importantColsE = unlist(sapply(colNames, 
                               function(x) grep(paste("^",x,"$", sep = ""), 
                                                toupper((v)))))

df1 = df1[,importantColsE]

colnames(df1) = c('geneID',
                  'shortDescription',
                  'shortName',
                  'MapManBin',
                  'sortingOrder',
                  'superClusterID',
                  'clusterID',
                  'clusterOrigin',
                  'x',
                  'y',
                  'networkSimplifiedNodeDegree',
                  'superClusterSimplifiedNodeDegree',
                  'clusterSimplifiedNodeDegree',
                  'isConnected',
                  'expressed')

write.table(df1, paste0(dir1, '/', myname, "_NODES.tsv"),
            quote = FALSE,
            row.names = FALSE,
            sep = "\t")

df2 = matrix(NA, ecount(g), length(e))

for (i in 1:length(e)) {
  df2[,i] = get.edge.attribute(g,e[i])
}

df2 = as.data.frame(df2, stringsAsFactors = FALSE)
colnames(df2) = e
# str(df2)

# which columns!!!???!!!
colNames = toupper(c('geneID1', 'geneID2', 'reactionType', 
                     'cluA', 'cluB', 
                     'superCluA', 'superCluB', 
                     'cluOriginA', 'cluOriginB', 'matchClu',
                     'degreeFullSimplifiedA', 'degreeFullSimplifiedB' ,
                     'degreeSimplifiedSuperClusterA', 'degreeSimplifiedSuperClusterB', 
                     'degreeSimplifiedClusterA' ,'degreeSimplifiedClusterB',
                     'exists'))

importantColsE = unlist(sapply(colNames, 
                               function(x) grep(paste("^",x,"$", sep = ""), 
                                                toupper((e)))))

df2 = df2[,importantColsE]

colnames(df2) = c('geneID1',
                  'geneID2',
                  'reactionType',
                  'clusterID_geneID1',
                  'clusterID_geneID2',
                  'superClusterID_geneID1',
                  'superClusterID_geneID2',
                  'clusterOrigin_geneID1',
                  'clusterOrigin_geneID2',
                  'matchingClusters',
                  'networkSimplifiedNodeDegree_geneID1',
                  'networkSimplifiedNodeDegree_geneID2',
                  'superClusterSimplifiedNodeDegree_geneID1',
                  'superClusterSimplifiedNodeDegree_geneID2',
                  'clusterSimplifiedNodeDegree_geneID1',
                  'clusterSimplifiedNodeDegree_geneID2',
                  'exists')

write.table(df2, paste0(dir1, '/', myname, "_EDGES_all.tsv"), 
            quote = FALSE,
            row.names = FALSE,
            sep = "\t")

df3 = df2[which(df2$matchingClusters == TRUE),]

write.table(df3, paste0(dir1, '/', myname, "_EDGES_byClu.tsv"), 
            quote = FALSE,
            row.names = FALSE,
            col.names = TRUE,
            sep = "\t")
save.image(paste0(dir1, '/', myname,'_clu_and_coord.RData'))

exectime <- toc()
## 11.04 sec elapsed
print(exectime)
## $tic
## elapsed 
##     1.6 
## 
## $toc
## elapsed 
##   12.64 
## 
## $msg
## logical(0)

http://r.789695.n4.nabble.com/igraph-and-plotting-connected-components-td836537.html