Sep 16, 2016
 

My notes for installing Son of Grid Engine (SGE) on a commodity cluster.


Intro

Grab the following RPM packages from the Son of Grid Engine download area at https://arc.liv.ac.uk/downloads/SGE/releases/8.1.9/ :

gridengine-8.1.9-1.el6.x86_64.rpm
gridengine-debuginfo-8.1.9-1.el6.x86_64.rpm
gridengine-devel-8.1.9-1.el6.noarch.rpm
gridengine-drmaa4ruby-8.1.9-1.el6.noarch.rpm
gridengine-execd-8.1.9-1.el6.x86_64.rpm
gridengine-guiinst-8.1.9-1.el6.noarch.rpm
gridengine-qmaster-8.1.9-1.el6.x86_64.rpm
gridengine-qmon-8.1.9-1.el6.x86_64.rpm

(version 8.1.9 at the time of writing).

For your convenience, the following one-liner should fetch these for you 🙂

cd /tmp; for i in gridengine-8.1.9-1.el6.x86_64.rpm gridengine-debuginfo-8.1.9-1.el6.x86_64.rpm gridengine-devel-8.1.9-1.el6.noarch.rpm gridengine-drmaa4ruby-8.1.9-1.el6.noarch.rpm gridengine-execd-8.1.9-1.el6.x86_64.rpm gridengine-guiinst-8.1.9-1.el6.noarch.rpm gridengine-qmaster-8.1.9-1.el6.x86_64.rpm gridengine-qmon-8.1.9-1.el6.x86_64.rpm; do wget https://arc.liv.ac.uk/downloads/SGE/releases/8.1.9/$i;done

Pick one server that will serve as the master node in your cluster, referred to later as qmaster.
For smaller clusters it can happily run on a small VM (say 2 vCPUs, 2 GB RAM), maximising your resource usage.
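
The rest of this guide refers to the master by the hostname qmaster, so make sure every node can resolve that name, for example with an /etc/hosts entry (the IP address below is made up, use your master's real one):

# run on every node; replace 10.10.80.10 with your qmaster's IP
echo "10.10.80.10   qmaster" >> /etc/hosts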

Install EPEL on all nodes

rpm -Uvh http://dl.fedoraproject.org/pub/epel/epel-release-latest-6.noarch.rpm

Install prerequisites on all nodes

yum install -y perl-Env.noarch perl-Exporter.noarch perl-File-BaseDir.noarch perl-Getopt-Long.noarch perl-libs perl-POSIX-strptime.x86_64 perl-XML-Simple.noarch jemalloc munge-libs hwloc lesstif csh ruby xorg-x11-fonts xterm java xorg-x11-fonts-ISO8859-1-100dpi xorg-x11-fonts-ISO8859-1-75dpi mailx

Install GridEngine packages on all nodes

cd /tmp/
yum localinstall gridengine-*

Install Qmaster

cd /opt/sge
./install_qmaster

Accepting the defaults should just work, though you might want to run it under a different user than root, so:

"Please enter a valid user name >> sgeadmin"

Make sure to add GridEngine to global environment:

cp /opt/sge/default/common/settings.sh /etc/profile.d/sge.sh
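
New logins will pick this up automatically; for your current shell, source it by hand and check that the environment is in place:

source /etc/profile.d/sge.sh
echo $SGE_ROOT    # should print /opt/sge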

NFS export SGE root to nodes in your cluster

vim /etc/exports

/opt/sge 10.10.80.0/255.255.255.0(rw,no_root_squash,sync,no_subtree_check,nohide)
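
After editing the exports file, re-export and make sure the NFS server is running and enabled at boot:

exportfs -ra
service nfs start
chkconfig nfs on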

and mount the share on the exec nodes

vim /etc/fstab

qmaster:/opt/sge    /opt/sge    nfs    tcp,intr,noatime    0    0
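
Then create the mount point (if it is not already there) and mount it:

mkdir -p /opt/sge
mount /opt/sge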

 

Installing exec nodes

cd /opt/sge
./install_execd
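
If install_execd complains that the host is not an administrative host, register it on the qmaster first and re-run the installer (execnode01 below is just an example hostname):

# run on the qmaster, once per exec node
qconf -ah execnode01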

Just go with the flow here. Once done, you should be able to see your exec nodes:

# qhost 
HOSTNAME                ARCH         NCPU NSOC NCOR NTHR  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
----------------------------------------------------------------------------------------------
global                  -               -    -    -    -     -       -       -       -       -
execnode01              lx-amd64        8    2    8    8  0.12   15.6G    5.2G   20.0G  104.9M
execnode02              lx-amd64        8    2    8    8  0.00   15.7G    1.3G   21.1G     0.0
execnode03              lx-amd64        8    2    8    8  0.00   15.7G    1.4G   21.1G   18.6M

That means you can start submitting jobs to your cluster, either interactively with qlogin or qrsh, or as batch jobs with qsub.
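
For example, a minimal batch job could look like this (the script name and its contents are only an illustration):

cat > hello.sh <<'EOF'
#!/bin/sh
# job name, working directory, and target queue
#$ -N hello
#$ -cwd
#$ -q all.q
echo "Hello from $(hostname)"
EOF
qsub hello.sh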

Adding queues (for FSL)

In most cases it’s enough to have the default queue called all.q.

This example will define new queues with different priorities (nice levels):

# change defaults for all.q
qconf -sq all.q |\
    sed -e 's/bin\/csh/bin\/sh/' |\
    sed -e 's/posix_compliant/unix_behavior/' |\
    sed -e 's/priority              0/priority 20/' >\
    /tmp/q.tmp
qconf -Mq /tmp/q.tmp

# add other queues
sed -e 's/all.q/verylong.q/' /tmp/q.tmp >\
   /tmp/verylong.q
qconf -Aq /tmp/verylong.q

sed -e 's/all.q/long.q/' /tmp/q.tmp |\
   sed -e 's/priority *20/priority 15/' >\
   /tmp/long.q
qconf -Aq /tmp/long.q

sed -e 's/all.q/short.q/' /tmp/q.tmp |\
   sed -e 's/priority *20/priority 10/' >\
   /tmp/short.q
qconf -Aq /tmp/short.q

sed -e 's/all.q/veryshort.q/' /tmp/q.tmp |\
   sed -e 's/priority *20/priority 5/' >\
   /tmp/veryshort.q
qconf -Aq /tmp/veryshort.q
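
You can confirm that all queues were registered with:

qconf -sql    # list cluster queues
qstat -g c    # per-queue summary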

Monitoring your cluster

 

Use the qmon GUI or the following commands:

# qstat -f

queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@execnode01               BIP   0/0/8          0.12     lx-amd64
---------------------------------------------------------------------------------
all.q@execnode02               BIP   0/0/8          0.00     lx-amd64
---------------------------------------------------------------------------------
all.q@execnode03               BIP   0/0/8          0.00     lx-amd64
---------------------------------------------------------------------------------
long.q@execnode01              BIP   0/0/8          0.12     lx-amd64
---------------------------------------------------------------------------------
long.q@execnode02              BIP   0/0/8          0.00     lx-amd64
---------------------------------------------------------------------------------
long.q@execnode03              BIP   0/0/8          0.00     lx-amd64
---------------------------------------------------------------------------------
short.q@execnode01             BIP   0/0/8          0.12     lx-amd64
---------------------------------------------------------------------------------
short.q@execnode02             BIP   0/0/8          0.00     lx-amd64
---------------------------------------------------------------------------------
short.q@execnode03             BIP   0/0/8          0.00     lx-amd64
---------------------------------------------------------------------------------
verylong.q@execnode01          BIP   0/0/8          0.12     lx-amd64
---------------------------------------------------------------------------------
verylong.q@execnode02          BIP   0/0/8          0.00     lx-amd64
---------------------------------------------------------------------------------
verylong.q@execnode03          BIP   0/0/8          0.00     lx-amd64
---------------------------------------------------------------------------------
veryshort.q@execnode01         BIP   0/0/8          0.12     lx-amd64
---------------------------------------------------------------------------------
veryshort.q@execnode02         BIP   0/0/8          0.00     lx-amd64
---------------------------------------------------------------------------------
veryshort.q@execnode03         BIP   0/0/8          0.00     lx-amd64

# qhost -q

HOSTNAME                ARCH         NCPU NSOC NCOR NTHR  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
----------------------------------------------------------------------------------------------
global                  -               -    -    -    -     -       -       -       -       -
execnode01              lx-amd64        8    2    8    8  0.12   15.6G    5.2G   20.0G  104.9M
   all.q                BIP   0/0/8         
   long.q               BIP   0/0/8         
   short.q              BIP   0/0/8         
   veryshort.q          BIP   0/0/8         
   verylong.q           BIP   0/0/8         
execnode02              lx-amd64        8    2    8    8  0.00   15.7G    1.3G   21.1G     0.0
   all.q                BIP   0/0/8         
   long.q               BIP   0/0/8         
   short.q              BIP   0/0/8         
   veryshort.q          BIP   0/0/8         
   verylong.q           BIP   0/0/8         
execnode03              lx-amd64        8    2    8    8  0.00   15.7G    1.4G   21.1G   18.6M
   all.q                BIP   0/0/8         
   long.q               BIP   0/0/8         
   short.q              BIP   0/0/8         
   veryshort.q          BIP   0/0/8         
   verylong.q           BIP   0/0/8         

Jun 22, 2016
 

Intro

Cassandra is a highly available (no SPOF), distributed database service for managing large amounts of structured data across many commodity servers.

Here is a quick recipe for starting the first Debian-based Cassandra server in your Cassandra cluster.

 

Modify Debian repositories

 

  • Modify the default apt sources and make sure you have contrib in your sources:
# vim /etc/apt/sources.list
deb http://http.debian.net/debian jessie main contrib non-free
deb-src http://http.debian.net/debian jessie main contrib non-free
deb http://http.debian.net/debian jessie-updates main contrib non-free
deb-src http://http.debian.net/debian jessie-updates main contrib non-free
deb http://security.debian.org/ jessie/updates main contrib non-free
deb-src http://security.debian.org/ jessie/updates main contrib non-free

 

  • Add Cassandra to sources:
# vim /etc/apt/sources.list.d/cassandra.list
deb http://www.apache.org/dist/cassandra/debian 37x main
deb-src http://www.apache.org/dist/cassandra/debian 37x main
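
The packages are signed, so also add the project's repository keys (assuming curl is installed; the URL is the one published by the Cassandra project):

curl https://www.apache.org/dist/cassandra/KEYS | sudo apt-key add -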

 

Install Oracle Java JDK

Visit the Oracle website and download the Oracle JDK tar.gz package:

http://www.oracle.com/technetwork/java/javase/downloads/index.html

 

Install Java and Cassandra

sudo -i
apt-get update && apt-get install -y java-package libgl1-mesa-glx libgtk2.0-0 libxxf86vm1
exit
make-jpkg jdk-8u91-linux-x64.tar.gz      # run make-jpkg as a regular user, not as root
sudo dpkg -i oracle-java8-jdk_8u91_amd64.deb
sudo update-alternatives --config java   # and choose the Oracle version
sudo apt-get install -y cassandra
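
A quick sanity check that the Oracle JDK is now the default:

java -version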

 

Is it running?

# systemctl status cassandra
cassandra.service - LSB: distributed storage system for structured data
Loaded: loaded (/etc/init.d/cassandra)
Active: active (running) since Wed 2016-06-22 13:23:54 UTC; 17min ago
CGroup: /system.slice/cassandra.service
└─17055 java -Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -XX:+Heap

Working with Cassandra

cqlsh
nodetool status
nodetool info
nodetool tpstats

Example output:

$ cqlsh

Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.7 | CQL spec 3.4.2 | Native protocol v4]
Use HELP for help.
cqlsh> 

$ nodetool status

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  127.0.0.1  105.98 KiB  256          100.0%            c1ad1f98-170a-4a29-a007-46fd1dda4506  rack1
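
As a quick smoke test you can create a throw-away keyspace and table straight from the shell (the names below are made up, and replication_factor 1 assumes a single-node cluster):

cqlsh -e "CREATE KEYSPACE IF NOT EXISTS demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};"
cqlsh -e "CREATE TABLE IF NOT EXISTS demo.kv (k text PRIMARY KEY, v text);"
cqlsh -e "INSERT INTO demo.kv (k, v) VALUES ('hello', 'world');"
cqlsh -e "SELECT * FROM demo.kv;"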

Easy!