Creating a 2-Node Cluster Using MPICH2 on CentOS 5.5 x86
We will use 3 VMs
master 192.168.1.120
node1 192.168.1.121
node2 192.168.1.122
1. Editing /etc/hosts
Master
# vim /etc/hosts
127.0.0.1 localhost.localdomain localhost
192.168.1.120 master
192.168.1.121 node1
192.168.1.122 node2
# scp /etc/hosts node1:/etc/
# scp /etc/hosts node2:/etc/
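Before copying the file out, the three entries above can be staged and sanity-checked with a short loop. This is just an illustrative sketch; the /tmp path is hypothetical and stands in for /etc/hosts:

```shell
# Sketch: stage the cluster hosts block in a temp file (illustrative path),
# then confirm every hostname appears exactly once before pushing it out.
cat > /tmp/cluster-hosts <<'EOF'
192.168.1.120 master
192.168.1.121 node1
192.168.1.122 node2
EOF
for h in master node1 node2; do
  echo "$h: $(grep -cw "$h" /tmp/cluster-hosts) entry"
done
```

If any count is not 1, fix the file before running the scp commands above.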
2. Creating SSH Keys
Master
# ssh-keygen --> Accept all the defaults (press Enter at every prompt)
# cat /root/.ssh/*.pub > /root/.ssh/authorized_keys
# scp -r /root/.ssh/ node1:/root/
# scp -r /root/.ssh/ node2:/root/
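The key setup above can be rehearsed locally first. This throwaway run (in a temp directory standing in for /root/.ssh, so it touches nothing real) shows what the ssh-keygen and cat steps produce:

```shell
# Sketch: generate a throwaway RSA keypair non-interactively (-N "" sets an
# empty passphrase) and build authorized_keys from the public keys, exactly
# as in the steps above. The temp directory stands in for /root/.ssh.
D=$(mktemp -d)
ssh-keygen -t rsa -N "" -f "$D/id_rsa" -q
cat "$D"/*.pub > "$D/authorized_keys"
ls "$D"
```

Since the whole .ssh directory (private key included) is copied to every node, all three machines share one identity and can SSH to each other without a password.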
3. Install PDSH (Parallel Distributed Shell) to control more than one machine at once
Master
Install Rpmforge Repo
# rpm --import http://apt.sw.be/RPM-GPG-KEY.dag.txt --> If you get an error it means the key was already imported
# rpm -Uvh http://packages.sw.be/rpmforge-release/rpmforge-release-0.5.2-2.el5.rf.i386.rpm --> Or download it with wget and install it locally
# yum install pdsh --> For distributing commands to all nodes at once
# vim /etc/pdsh/machines
node1
node2
master
# pdsh -a uptime --> You should get results from all 3 machines
Note:- If any machine doesn't respond and you are sure all the above configuration is correct, try changing the order of the machines in /etc/pdsh/machines
4. Install a Time Server to Prevent Time Drift Between the Nodes
Install this time server on any other machine, for example the host running these VMs (in my case the host IP is 192.168.1.2)
# yum install ntp
# vim /etc/ntp.conf
#server 0.centos.pool.ntp.org
#server 1.centos.pool.ntp.org --> Comment them
#server 2.centos.pool.ntp.org
server 127.127.1.0 # local clock
fudge 127.127.1.0 stratum 10 --> Make sure that they are uncommented
# /etc/init.d/ntpd start
# chkconfig ntpd on
Master
# pdsh -a yum -y install ntp
# vim /etc/ntp.conf
server 192.168.1.2
#server 0.centos.pool.ntp.org
#server 1.centos.pool.ntp.org
#server 2.centos.pool.ntp.org --> Comment them
#server 127.127.1.0 # local clock
#fudge 127.127.1.0 stratum 10
# scp /etc/ntp.conf node1:/etc/
# scp /etc/ntp.conf node2:/etc/
# pdsh -a /etc/init.d/ntpd start
# pdsh -a chkconfig ntpd on
5. Sharing The /cluster Using NFS
Master
# yum install nfs-utils.i386
# vim /etc/exports
/cluster *(rw,sync,no_root_squash)
# mkdir /cluster
# /etc/init.d/portmap start
# /etc/init.d/nfs start
# chkconfig nfs on
# chkconfig portmap on
# pdsh -w node1,node2 mkdir /cluster
# pdsh -w node1,node2 yum -y install nfs-utils
# pdsh -w node1,node2 /etc/init.d/portmap start
# pdsh -w node1,node2 mount.nfs master:/cluster /cluster
# pdsh -w node1,node2 chkconfig nfs on
# pdsh -w node1,node2 chkconfig portmap on
Node1 Node2
# vim /etc/fstab
master:/cluster /cluster nfs defaults 0 0
Note:- If you reboot the VMs, or start them at any other time, make sure the master VM starts first
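Each fstab entry is six whitespace-separated fields (device, mount point, filesystem type, options, dump flag, fsck order). A minimal sketch that validates the entry above before you risk a reboot:

```shell
# Sketch: split the fstab line on whitespace and check it has 6 fields.
line="master:/cluster /cluster nfs defaults 0 0"
set -- $line
echo "fields: $#  device=$1  mountpoint=$2  type=$3"
[ $# -eq 6 ] && echo "entry looks well-formed"
```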
6. Creating mpiuser and Its SSH Keys
Master
# pdsh -a groupadd -g 1000 mpigroup
# pdsh -a useradd -u 1000 -g 1000 -d /cluster/mpiuser mpiuser
# pdsh -a yum -y install gcc gcc-c++.i386 compat-gcc-34-g77.i386
$ su - mpiuser
$ ssh-keygen --> Accept all the defaults (press Enter at every prompt)
$ cat ~/.ssh/*.pub > ~/.ssh/authorized_keys
7. Installing MPICH2
Master
# yum -y install patch
# cd /cluster && wget http://www.mcs.anl.gov/research/projects/mpich2/downloads/tarballs/1.4.1p1/mpich2-1.4.1p1.tar.gz
# chown mpiuser.mpigroup -R /cluster
# su mpiuser
$ cd /cluster && tar -xvzf mpich2-1.4.1p1.tar.gz
$ cd mpich2-1.4.1p1 && ./configure --prefix=/cluster/mpich2
$ make && make install
$ vim ~/.bash_profile --> Edit it as follows
PATH=$PATH:$HOME/bin:/cluster/mpich2/bin
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/cluster/mpich2/lib
export PATH LD_LIBRARY_PATH
$ source ~/.bash_profile
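A quick way to confirm the profile edit took effect is to check that the MPICH2 bin directory is really on PATH. This sketch appends the path itself (mirroring the .bash_profile line above) so it is self-contained:

```shell
# Simulate the .bash_profile edit, then verify /cluster/mpich2/bin is on PATH.
PATH="$PATH:$HOME/bin:/cluster/mpich2/bin"
case ":$PATH:" in
  *:/cluster/mpich2/bin:*) echo "mpich2 bin on PATH" ;;
  *)                       echo "mpich2 bin MISSING" ;;
esac
```

On the real cluster, `which mpicc` after sourcing the profile should print /cluster/mpich2/bin/mpicc.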
$ vim /cluster/mpiuser/hosts
node1
node2
$ mpiexec -f /cluster/mpiuser/hosts hostname --> the output should be as follows
node1
node2
Note:- Make sure the host key fingerprints of all nodes are saved in mpiuser's known_hosts (SSH to each node once so they get recorded)
$ mpiexec -n 1 -f /cluster/mpiuser/hosts /cluster/mpich2-1.4.1p1/examples/cpi --> Test execution with one node
$ mpiexec -n 2 -f /cluster/mpiuser/hosts /cluster/mpich2-1.4.1p1/examples/cpi --> Test execution with two nodes
Note :- Since we are using VMs on the same host you won't notice any improvement, and sometimes the runtime will increase instead of decrease, but in the real world you will be happy with the results.
Now Using The Cluster To Compile a File And Make Another Test
$ mpicc -o /cluster/mpich2-1.4.1p1/examples/icpi /cluster/mpich2-1.4.1p1/examples/icpi.c
$ mpiexec -f /cluster/mpiuser/hosts -n 1 /cluster/mpich2-1.4.1p1/examples/icpi --> Enter a number of intervals, say 1000000, then repeat the test with -n 2
8. Installing the Benchmark Tool "Linpack"
Master
$ cd && wget http://ftp.freebsd.org/pub/FreeBSD/ports/distfiles/gotoblas/GotoBLAS2-1.13_bsd.tar.gz
$ tar -xvzf GotoBLAS2-1.13_bsd.tar.gz --> An optimized BLAS library used by Linpack
$ cd GotoBLAS2
$ make TARGET=NEHALEM
$ cd && wget http://www.netlib.org/benchmark/hpl/hpl-2.0.tar.gz
$ tar -xvzf hpl-2.0.tar.gz && cd hpl-2.0
$ cp setup/Make.Linux_PII_FBLAS_gm .
$ vim Make.Linux_PII_FBLAS_gm --> Edit the following directives as follow
TOPdir = $(HOME)/hpl-2.0
LAdir = $(HOME)/GotoBLAS2
LAinc =
LAlib = $(LAdir)/libgoto2.a -lm -L/usr/lib/gcc/i386-redhat-linux/4.1.2 --> This is the path of gcc 4.1.2; make sure it actually exists
CCFLAGS = $(HPL_DEFS) -O3
LINKER = mpicc
$ make arch=Linux_PII_FBLAS_gm
$ mkdir -p /cluster/mpiuser/hpl/
$ cp Make.Linux_PII_FBLAS_gm /cluster/mpiuser/hpl/
Note:- I performed the last 2 steps to work around an error in the compilation process
9. Cluster Benchmarking
Master
$ cd /cluster/mpiuser/hpl-2.0/bin/Linux_PII_FBLAS_gm
$ cp HPL.dat HPL.dat.bak
To Determine The Size of The Problem
$ free -b --> To get the amount of free RAM in bytes; in my case 181088256. Apply it in the following formula, e.g. with the bc command
Note:- The free command should be executed on any node, not on the master
sqrt( 0.1 * 181088256 * 2 ) --> 2 is the number of nodes; the result is about 6018.1
$ vim HPL.dat --> Edit the following
6 device out (6=stdout,7=stderr,file)
1 # of problems sizes (N)
6000 Ns
1 # of NBs
100 NBs
0 PMAP process mapping (0=Row-,1=Column-major)
1 # of process grids (P x Q)
1 Ps
2 Qs
16.0 threshold
3 # of panel fact
0 1 2 PFACTs (0=left, 1=Crout, 2=Right)
2 # of recursive stopping criterium
2 4 NBMINs (>= 1)
1 # of panels in recursion
2 NDIVs
3 # of recursive panel fact.
0 1 2 RFACTs (0=left, 1=Crout, 2=Right)
1 # of broadcast
0 BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1 # of lookahead depth
0 DEPTHs (>=0)
2 SWAP (0=bin-exch,1=long,2=mix)
64 swapping threshold
0 L1 in (0=transposed,1=no-transposed) form
0 U in (0=transposed,1=no-transposed) form
1 Equilibration (0=no,1=yes)
8 memory alignment in double (> 0)
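In HPL.dat, P x Q must equal the number of MPI processes passed to mpiexec with -n; for a 2-process run the only possible grids are 1x2 and 2x1, and the file above uses P=1, Q=2. A small sketch that enumerates the valid grids for a given process count:

```shell
# Enumerate valid HPL process grids (P x Q = NP) for NP MPI processes:
# every divisor P of NP yields the grid P x (NP / P).
NP=2
for P in $(seq 1 $NP); do
  if [ $(( NP % P )) -eq 0 ]; then
    echo "P=$P Q=$(( NP / P ))"
  fi
done
```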
$ mpiexec -f /cluster/mpiuser/hosts -n 2 ./xhpl --> This will run many tests to benchmark performance
Note :- Tweak the HPL.dat configuration until you get maximum CPU utilization. After tweaking, these are the results I got in the top command on node1 and node2:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2136 mpiuser 25 0 1564m 1.5g 1188 R 100.2 76.6 1:00.74 xhpl