R - Turn on all CPUs for all nodes on a cluster: snow/snowfall package
I am working on a cluster and using the snowfall package to establish a socket cluster on 5 nodes with 40 CPUs each, using the following command:
 > sfInit(parallel=TRUE, cpus = 200, type="SOCK", socketHosts=c("host1", "host2", "host3", "host4", "host5"));
 R Version:  R version 3.1.0 (2014-04-10)

 snowfall 1.84-6 initialized (using snow 0.3-13): parallel execution on 5 CPUs.

I am seeing a lower load on the slaves than expected when I check the cluster report, and I am disconcerted by the fact that it says "parallel execution on 5 CPUs" instead of "parallel execution on 200 CPUs". Is this merely an ambiguous reference to CPUs, or are the hosts running only 1 CPU each?
Edit: Here is an example of why this concerns me. If I use only the local machine and specify the max number of cores, I have:
 > sfInit(parallel=TRUE, type="SOCK", cpus = 40);

 snowfall 1.84-6 initialized (using snow 0.3-13): parallel execution on 40 CPUs.

I ran an identical job on the single-node, 40-CPU cluster and it took 1.4 minutes, while the 5-node, apparently 5-CPU cluster took 5.22 minutes. To me this confirms my suspicion that I am running in parallel on 5 nodes but only turning on 1 of the CPUs on each node.
My question is then: how do you turn on all CPUs for use across all available nodes?
Edit: @simong I used the underlying snow package's initialization, and we can see that only 5 nodes are being turned on:
 > cl <- makeSOCKcluster(names = c("host1", "host2", "host3", "host4", "host5"), count = 200)
 > clusterCall(cl, runif, 3)
 [[1]]
 [1] 0.9854311 0.5737885 0.8495582

 [[2]]
 [1] 0.7272693 0.3157248 0.6341732

 [[3]]
 [1] 0.26411931 0.36189866 0.05373248

 [[4]]
 [1] 0.3400387 0.7014877 0.6894910

 [[5]]
 [1] 0.2922941 0.6772769 0.7429913

 > stopCluster(cl)
 > cl <- makeSOCKcluster(names = rep("localhost", 40), count = 40)
 > clusterCall(cl, runif, 3)
 [[1]]
 [1] 0.6914666 0.7273244 0.8925275

 [[2]]
 [1] 0.3844729 0.7743824 0.5392220

 [[3]]
 [1] 0.2989990 0.7256851 0.6390770

 [[4]]
 [1] 0.07114831 0.74290601 0.57995908

 [[5]]
 [1] 0.4813375 0.2626619 0.5164171

 .
 .
 .

 [[39]]
 [1] 0.7912749 0.8831164 0.1374560

 [[40]]
 [1] 0.2738782 0.4100779 0.0310864

I think this shows the problem pretty clearly: the first call starts only 5 workers (one per host name), while the second starts 40. I tried this in desperation:
 > cl <- makeSOCKcluster(names = rep(c("host1", "host2", "host3", "host4", "host5"), each = 40), count = 200)

and predictably got:
 Error in socketConnection(port = port, server = TRUE, blocking = TRUE,  :
   all connections are in use
After thoroughly reading the snow documentation, I have come up with a (partial) solution.
I read that only 128 connections may be opened at once with R as distributed, and I have found this to be true. I can open 25 CPUs on each node, but the cluster will not start if I try to start 26 on each. Here is the proper structure of the host list that needs to be passed to makeCluster:
> library(snow);

> unixhost13 <- list(host = "host1");
> unixhost14 <- list(host = "host2");
> unixhost19 <- list(host = "host3");
> unixhost29 <- list(host = "host4");
> unixhost30 <- list(host = "host5");

> kcpus <- 25;
> hostlist <- c(rep(list(unixhost13), kcpus), rep(list(unixhost14), kcpus),
                rep(list(unixhost19), kcpus), rep(list(unixhost29), kcpus),
                rep(list(unixhost30), kcpus));
> cl <- makeCluster(hostlist, type = "SOCK")
> clusterCall(cl, runif, 3)
[[1]]
[1] 0.08430941 0.64479036 0.90402362

[[2]]
[1] 0.1821656 0.7689981 0.2001639

[[3]]
[1] 0.5917363 0.4461787 0.8000013

.
.
.

[[123]]
[1] 0.6495153 0.6533647 0.2636664

[[124]]
[1] 0.75175580 0.09854553 0.66568129

[[125]]
[1] 0.79336203 0.61924813 0.09473841

I found a reference saying that in order to get more connections, R needs to be rebuilt with NCONNECTIONS set higher (see here).
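
The 25-per-node ceiling is consistent with simple arithmetic: of R's 128 connections, three are pre-allocated (stdin, stdout, stderr), which leaves 125 for worker sockets, or 25 on each of 5 hosts. Below is a minimal sketch of a helper that builds the same kind of host list for any set of hosts and refuses a request that would exceed that budget. The name `build_host_list` and its parameters are hypothetical (they are not part of snow), and the sketch assumes the master session holds no other open connections.

```r
# Hypothetical helper (not part of snow): build the host list that
# makeCluster() expects, capped by R's connection limit.
build_host_list <- function(hosts, per_host = NULL,
                            max_connections = 128L, reserved = 3L) {
  # Connections left after stdin, stdout and stderr. Assumption: the
  # master session has no other connections open.
  usable <- max_connections - reserved
  cap <- usable %/% length(hosts)
  if (is.null(per_host)) per_host <- cap
  if (per_host > cap) {
    stop("requested ", per_host, " workers per host, but only ", cap,
         " fit within the connection limit")
  }
  # One list(host = ...) entry per worker, grouped by host.
  unlist(lapply(hosts, function(h) rep(list(list(host = h)), per_host)),
         recursive = FALSE)
}

hosts <- c("host1", "host2", "host3", "host4", "host5")
hostlist <- build_host_list(hosts)
length(hostlist)  # 125, i.e. 25 workers on each of the 5 hosts
```

The result can be passed to `makeCluster(hostlist, type = "SOCK")` as above. Once the cluster is up, something like `table(unlist(clusterCall(cl, function() Sys.info()[["nodename"]])))` should show how many workers actually started on each node.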