NCSR DEMOKRITOS
Institute of Nuclear and Particle Physics
Patriarxou Grigoriou & Neapoleos
P.O. Box 60037
15310 Aghia Paraskevi
GREECE
tel: +30 2106503512
fax: +30 2106503529
email: info AT inp.demokritos.gr

ZEUS CLUSTER

Access through:

SSH (internal IP address)
VNC (internal IP address)

Storage: 22TB

/share/ifs

srmcp, lcg-cp @ zeus

source /opt/glite/etc/profile.d/grid-env.sh
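
The sourcing step above can be guarded in a login script. This is a minimal sketch, not the site's official setup: load_grid_env and check_grid_tools are hypothetical helper names, and the only path assumed is the one quoted above. It uses "." (the portable form of "source") so it also works in plain sh.

```shell
#!/bin/sh
# Hypothetical helper: source the gLite environment script if it is readable.
# The default path is the one quoted above; pass another path to override it.
load_grid_env() {
    grid_env_script=${1:-/opt/glite/etc/profile.d/grid-env.sh}
    if [ ! -r "$grid_env_script" ]; then
        echo "cannot read $grid_env_script" >&2
        return 1
    fi
    . "$grid_env_script"
}

# Hypothetical helper: report whether the grid copy tools are on the PATH.
# Returns non-zero if at least one of them is missing.
check_grid_tools() {
    grid_tools_missing=0
    for grid_tool in srmcp lcg-cp; do
        if command -v "$grid_tool" >/dev/null 2>&1; then
            echo "$grid_tool: found"
        else
            echo "$grid_tool: not found"
            grid_tools_missing=1
        fi
    done
    return "$grid_tools_missing"
}

# Typical use on zeus (not run here):
#   load_grid_env && check_grid_tools
```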

cmssw @ zeus

under the /share/apps directory

How to run VNC:

1. Connect to the cluster in text mode using ssh.
2. Start a vncserver process on the cluster by running
"vncserver" or "/usr/bin/vncserver".
3. The first time you run vncserver, it will ask you
to set a password. This VNC password can be different from
your login password. Once you have set the password, run "vncserver"
again and you should see output similar to this:

$ vncserver

New 'zeus.inp.demokritos.gr:2 (veera)' desktop is zeus.inp.demokritos.gr:2

Starting applications specified in /home/veera/.vnc/xstartup

Log file is /home/veera/.vnc/zeus.inp.demokritos.gr:2.log

4. vncserver gives you the display name, such as "zeus.inp.demokritos.gr:2".
5. You can change your VNC password by running "vncpasswd".
6. Run the vncviewer program on your PC.
Enter the display name, such as "zeus.inp.demokritos.gr:2",
and then enter the password. VNC will bring the desktop
environment of the cluster to your PC screen.
7. When you finish,

- close the VNC viewer window on your PC
- stop vncserver on the cluster by running
"vncserver -kill :<displaynumber>". For example,

$ vncserver -kill :2

Killing Xvnc process ID 30251
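
Some viewers and firewalls ask for a TCP port instead of a display name. By VNC convention, display :N listens on port 5900 + N. The sketch below does that conversion; vnc_port is a hypothetical helper name, not part of the VNC tools.

```shell
#!/bin/sh
# Convert a VNC display name such as "zeus.inp.demokritos.gr:2" or ":2"
# into the TCP port the VNC server listens on (5900 + display number).
vnc_port() {
    vnc_display=${1##*:}        # keep only the text after the last colon
    case $vnc_display in
        ''|*[!0-9]*)
            echo "not a display number: $1" >&2
            return 1
            ;;
    esac
    echo $((5900 + vnc_display))
}

vnc_port zeus.inp.demokritos.gr:2   # prints 5902
```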

By default, VNC creates a startup file ~/.vnc/xstartup
that launches a basic window manager program called twm.
If you find twm unattractive, you may wish to use another
window manager program instead.

Edit ~/.vnc/xstartup. It looks like this:

#!/bin/sh

xrdb $HOME/.Xresources
xsetroot -solid grey
xterm -geometry 80x24+10+10 -ls -title "$VNCDESKTOP Desktop" &
twm &

Change the last line to start your preferred window manager, for example:

/usr/bin/gnome-session &
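
The same edit can be scripted. This is a minimal sketch that assumes the file still contains the default "twm &" line; replace_wm is a hypothetical helper name, and the window manager command must not contain "|", "&" or backslashes.

```shell
#!/bin/sh
# Hypothetical helper: replace the "twm &" line in an xstartup file
# with another window manager command, e.g. /usr/bin/gnome-session.
replace_wm() {
    xstartup_file=$1
    wm_cmd=$2
    tmp_file=$(mktemp) || return 1
    # "\&" stops sed from treating "&" in the replacement as "the matched text".
    # Writing back with cat keeps the original file permissions (it must stay executable).
    sed "s|^twm &\$|$wm_cmd \\&|" "$xstartup_file" > "$tmp_file" &&
        cat "$tmp_file" > "$xstartup_file"
    replace_status=$?
    rm -f "$tmp_file"
    return "$replace_status"
}

# Typical use (not run here):
#   replace_wm "$HOME/.vnc/xstartup" /usr/bin/gnome-session
```

The change takes effect the next time vncserver is started, since xstartup is read at startup.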

Error:

- If you run vncserver and see this error message:

vncserver: couldn't find "xauth" on your PATH.

Make sure that you have /usr/X11R6/bin in your PATH environment variable by running this command:

$ echo $PATH

You should see something like this:

/opt/nmi/bin:/opt/nmi/sbin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/opt/ganglia/bin:/usr/java/jdk1.5.0/bin:/opt/maui/bin:/opt/torque/bin:/opt/torque/sbin:/opt/rocks/bin:/opt/rocks/sbin:/opt/torque/bin:/opt/mpich/gnu/bin:.:/home/veera/bin

If /usr/X11R6/bin is missing from the output, add it with this command:

$ PATH=$PATH:/usr/X11R6/bin
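
The check and the fix above can be combined so that the directory is appended only when it is missing. This is a sketch; path_has and path_append are hypothetical helper names.

```shell
#!/bin/sh
# Return success if directory $1 is already an entry in PATH.
path_has() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Append directory $1 to PATH unless it is already there.
path_append() {
    path_has "$1" || PATH="$PATH:$1"
    export PATH
}

path_append /usr/X11R6/bin
```

Calling path_append twice leaves PATH unchanged the second time, so the line is safe to keep in a shell startup file.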

Scheduler:

Sun Grid Engine (SGE)

How to use SGE
Submitting Batch Jobs to SGE
Batch jobs are submitted to SGE via scripts. Here is an example of a serial job script, sleep.sh. It basically executes the sleep command.
[sysadm1@frontend-0 sysadm1]$ cat sleep.sh
#!/bin/bash
#
#$ -cwd
#$ -j y
#$ -S /bin/bash
#
date
sleep 10
date

Entries which start with #$ will be treated as SGE options.
-cwd means to execute the job from the current working directory.
-j y means to merge the standard error stream into the standard output stream instead of having two separate error and output streams.
-S /bin/bash specifies the interpreting shell for this job to be the Bash shell.
To submit this serial job script, you should use the qsub command.
[sysadm1@frontend-0 sysadm1]$ qsub sleep.sh
your job 16 ("sleep.sh") has been submitted
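
Scripts that submit a job often need its ID afterwards, for example to pass it to qdel. The sketch below extracts it from the confirmation message, assuming the message format shown above; qsub_job_id is a hypothetical helper name, and other SGE versions may word the message differently.

```shell
#!/bin/sh
# Read qsub's confirmation message on stdin and print the numeric job ID.
# Assumes the format: your job 16 ("sleep.sh") has been submitted
# Prints nothing if no line matches.
qsub_job_id() {
    sed -n 's/^[Yy]our job \([0-9][0-9]*\) .*/\1/p'
}

echo 'your job 16 ("sleep.sh") has been submitted' | qsub_job_id   # prints 16

# Typical use on the cluster (not run here):
#   job_id=$(qsub sleep.sh | qsub_job_id)
```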

Next, we’ll submit a parallel job. First, let’s get and compile a test MPI program. As a non-root user, execute:
$ cd $HOME
$ mkdir test
$ cd test
$ cp /opt/mpi-tests/src/*.c .
$ cp /opt/mpi-tests/src/Makefile .
$ make
Now we’ll create an SGE submission script for mpi-ring. The program mpi-ring sends a 1 MB message in a ring between all the processes of an MPI job. Process 0 sends a 1 MB message to process 1, then process 1 sends a 1 MB message to process 2, and so on. Create a file named $HOME/test/mpi-ring.qsub and put the following in it:
#!/bin/bash
#
#$ -cwd
#$ -j y
#$ -S /bin/bash
#

/opt/openmpi/bin/mpirun -np $NSLOTS $HOME/test/mpi-ring
The command to submit an MPI parallel job script is similar to the one for a serial job script, but you need to add the option -pe orte N, where N is the number of processes that you want to allocate to the MPI program. Here’s an example of submitting a job that will use 2 processors:
$ qsub -pe orte 2 mpi-ring.qsub
When the job completes, the job’s output will be in the file mpi-ring.qsub.o*. Error messages pertaining to the job will be in mpi-ring.qsub.po*.
To run the job on more processors, just change the number supplied to the -pe orte flag. Here’s how to run the job on 16 processors:
$ qsub -pe orte 16 mpi-ring.qsub
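
A mistyped slot count is easy to catch before submission. The sketch below only builds and prints the command line after checking that N is a positive integer; mpi_qsub_cmd is a hypothetical helper name, and the parallel environment name orte is taken from the examples above.

```shell
#!/bin/sh
# Hypothetical wrapper: print the qsub command line for an MPI job
# after checking that the slot count is a positive integer.
# It only prints the command; nothing is submitted.
mpi_qsub_cmd() {
    mpi_slots=$1
    mpi_script=$2
    case $mpi_slots in
        ''|*[!0-9]*|0*)
            echo "slot count must be a positive integer: '$mpi_slots'" >&2
            return 1
            ;;
    esac
    if [ -z "$mpi_script" ]; then
        echo "usage: mpi_qsub_cmd SLOTS SCRIPT" >&2
        return 1
    fi
    echo "qsub -pe orte $mpi_slots $mpi_script"
}

mpi_qsub_cmd 16 mpi-ring.qsub   # prints: qsub -pe orte 16 mpi-ring.qsub
```
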
If you need to delete an already submitted job, you can use qdel with its job ID. Here’s an example of deleting a fluent job under SGE:
[sysadm1@frontend-0 sysadm1]$ qsub fluent.sh
your job 31 ("fluent.sh") has been submitted
$ qstat
job-ID  prior name       user         state submit/start at     queue      master  ja-task-ID
---------------------------------------------------------------------------------------------
31     0 fluent.sh  sysadm1      t     12/24/2003 01:10:28 comp-pvfs- MASTER
$ qdel 31
sysadm1 has registered the job 31 for deletion
$ qstat
$

Monitoring SGE Jobs
To monitor jobs under SGE, use the qstat command. When executed with no arguments, it displays a summarized list of jobs:
[sysadm1@frontend-0 sysadm1]$ qstat
job-ID  prior name       user         state submit/start at     queue      master  ja-task-ID
---------------------------------------------------------------------------------------------
20     0 sleep.sh   sysadm1      t     12/23/2003 23:22:09 frontend-0 MASTER
21     0 sleep.sh   sysadm1      t     12/23/2003 23:22:09 frontend-0 MASTER
22     0 sleep.sh   sysadm1      qw    12/23/2003 23:22:06
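
The summary can be filtered by job state. This is a sketch that assumes the column layout shown above, where job lines start with a numeric job ID and the state is the fifth field; count_jobs_in_state is a hypothetical helper name.

```shell
#!/bin/sh
# Read qstat output on stdin and print how many jobs are in state $1
# (for example "t" or "qw" as in the listing above).
# Header and separator lines are skipped because their first field is not a number.
count_jobs_in_state() {
    awk -v want="$1" '$1 ~ /^[0-9]+$/ && $5 == want { n++ } END { print n + 0 }'
}

# Typical use on the cluster (not run here):
#   qstat | count_jobs_in_state qw
```
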
Use qstat -f to display a more detailed list of jobs within SGE.
[sysadm1@frontend-0 sysadm1]$ qstat -f
queuename            qtype used/tot. load_avg arch      states
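-----------------------------------------------------------------------------
comp-pvfs-0-0.q      BIP   0/2       0.18     glinux
-----------------------------------------------------------------------------
comp-pvfs-0-1.q      BIP   0/2       0.00     glinux
-----------------------------------------------------------------------------
comp-pvfs-0-2.q      BIP   0/2       0.05     glinux
-----------------------------------------------------------------------------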
frontend-0.q         BIP   2/2       0.00     glinux
23     0 sleep.sh   sysadm1      t     12/23/2003 23:23:40 MASTER
24     0 sleep.sh   sysadm1      t     12/23/2003 23:23:40 MASTER

############################################################################
- PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
############################################################################
25     0 linpack.sh sysadm1      qw    12/23/2003 23:23:32

QMON (Graphical User Interface):

http://wikis.sun.com/display/gridengine62u3/Interacting+With+Sun+Grid+Engine+as+a+User