NetAdmin pre-alpha release

Well, I finally managed to get a release of NetAdmin out so that you can see what it can do and maybe suggest some improvements. It is a pre-alpha release, so features are only partly implemented, but it is usable. The link to the project is https://netadmin.doraz.ro. You need to authenticate with user demo and password d3m0netadmin (d3m0 with a zero) and then create your own account for NetAdmin. Your username will be the email address you fill in on the account creation form.

During development I realised that the design I chose isn’t very good. This version will be developed up to the point where it is usable, while a new design is developed and implemented. When the next version is released, the data will be migrated.

Maintenance

Sorry for the inconvenience: the site’s structure has changed a little. At first it looked OK with the Pages, but since you can’t post to pages I had to change it. So you will now see posts written a long time ago on the front page, and the tabs near the Home button are gone. Instead of pages there are now categories.

Thanks for understanding.

Install Sun (Oracle) Grid Engine

In this tutorial I will show you how to deploy a Sun Grid Engine infrastructure to transform your cluster into an HPC environment. The Sun Grid Engine version used is 6.2 update 5.

First we will take a short look at the Grid Engine architecture and the terms that must be understood before proceeding with the installation process.

Terms list

  • slot – the smallest unit of computational resources that can be reserved. Usually this is one core of a multicore processor or the entire processor.
  • queue – the place where jobs submitted for execution are stored. Every job has a priority that can be changed dynamically.
  • parallel environment – an addition to the queue mechanism that lets a user request more than one slot for a parallel or distributed job.
  • master host/daemon – this is the main host in a cluster. It is responsible for job management and resource allocation. The daemon process that runs on this machine does all the scheduling work. The host can be an ordinary PC because no jobs will run here.
  • execution host/daemon – these are the cluster worker nodes. They must be dedicated computers with as much CPU power and memory as needed.

Cluster architecture

A generic cluster has the following elements:

  • a storage system
  • a control unit
  • execution hosts
  • front end

The storage system is the place where all the data produced or processed by the cluster is kept. This includes the users’ home directories, global configuration files, repositories, etc. This system must be able to export its storage using SAN or NAS protocols.

The control unit is usually the host that runs the master daemon. This host is the gateway to the rest of the machines in the cluster.

Execution hosts are the actual computation workers. Here the numbers are crunched.

The cluster must have a front end where users can submit jobs and view their results. The front end may be a terminal or a graphical environment (qmon).

Software (Grid Engine) architecture

The master host and the execution daemon were discussed earlier.

ARCo – Accounting and Reporting Console – an interface for gathering statistical data from the cluster.

DRMAA – Distributed Resource Management Application API – an API that lets programs submit and control jobs directly, instead of writing scripts that run Sun Grid Engine commands and parse the results.

Shadow Master Host – this is a method to reduce cluster downtime. If the master host fails, the shadow master takes its place.

More information about how the system works is available at http://wikis.sun.com/display/gridengine62u5/How+the+System+Operates.

Install process

Prerequisites

Before we start the install process the following services must be properly configured:

1. name resolution service – SGE relies on good name resolution: all hosts must have a name and a static IP (static DHCP allocation is OK), and each IP must translate back into a name, so you must provide both forward and reverse name resolution.

2. users – SGE processes should run under an unprivileged user. The default user is sgeadmin. It must have the same uid:gid on all hosts that run SGE components. Cluster users must also exist (with the same uid:gid) on all hosts that run SGE components. The best way to do this is to configure an LDAP service for central user authentication. This part (LDAP) will not be covered in this tutorial.

3. shared home directories – the users’ home directories must be the same on each host that runs an SGE component. Usually users will submit jobs from their home dir (-cwd) and the execution hosts must be able to run the program from that place.

4. Sun Grid Engine 6.2u5 binaries: http://www.sun.com/software/sge/get_it.jsp.
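
Prerequisite 1 (forward and reverse name resolution) is easy to get wrong, so it may be worth checking it before running the installer. The following is only a minimal sketch, not part of SGE: the function name check_name_resolution is my own, and the resolver functions are passed in as parameters so the check can also be tried without a live DNS.

```python
import socket


def check_name_resolution(hostname,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Return (ok, detail) for a forward lookup followed by a reverse lookup.

    `forward` maps a name to an IP string, `reverse` maps an IP to a
    (name, aliases, addresses) tuple, like the socket functions do.
    """
    try:
        ip = forward(hostname)
        resolved_name = reverse(ip)[0]
    except OSError as exc:
        # socket.gaierror and socket.herror are both OSError subclasses
        return False, f"{hostname}: lookup failed ({exc})"
    if resolved_name.lower() != hostname.lower():
        return False, f"{hostname} -> {ip} -> {resolved_name} (mismatch)"
    return True, f"{hostname} -> {ip} -> {resolved_name}"
```

Calling check_name_resolution("node1.mygrid.net") on each host of the cluster should return True as the first element if both directions resolve to the same name.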

After the prerequisites are satisfied it is time to begin the installation process. The commands are from an Oracle Enterprise Linux 5.4 server, but with little effort they can be applied on other distributions. I will try not to use distribution-specific commands. Be careful with the firewall on the servers; I suggest deactivating it to avoid possible connection errors.

The cluster in this example will look like this:

Our domain will be mygrid.net and the servers from the diagram have the corresponding IPs. The FQDN for the nfs server will be nfs.mygrid.net.

Once the infrastructure is up and running (the servers have an operating system and the networking is done) we will need some form of automation to work on all hosts from one point. For this, the root ssh public key from the control.mygrid.net server has to be copied to all other servers to permit password-less authentication.

control# ssh-keygen -t rsa -b 4096
control# ssh-copy-id -i .ssh/id_rsa.pub root@node1.mygrid.net
control# ssh-copy-id -i .ssh/id_rsa.pub root@node2.mygrid.net
control# ssh-copy-id -i .ssh/id_rsa.pub root@nfs.mygrid.net

Because we don’t have a directory service to store our user account data, this job will be assigned to the control server. To replicate the data across the cluster we simply copy the /etc/passwd file over ssh to the rest of the servers.

The nfs server will store the user home directories. It will export the /mygrid/home/ directory to the 10.0.0.0/24 network:

nfs# vi /etc/exports
/mygrid/home/ 10.0.0.0/24(rw,sync,no_root_squash)

The rest of the servers will mount the nfs share at startup. For this the /etc/fstab file must be edited:

{control,node1,node2}# vi /etc/fstab
10.0.0.4:/mygrid/home /mygrid/home/ nfs defaults 0 0

When a user is added the home directory will look like /mygrid/home/username.

Make SGE administrative user

After the infrastructure is ready we have to create the group and user for the Sun Grid Engine system. The uid:gid will be 500:500 and the name of both user and group will be sgeadmin.

groupadd -g 500 sgeadmin
useradd -m -d /opt/sge-6.2u5/ -s /bin/bash -u 500  -g sgeadmin -c "SGE Admin User" sgeadmin

This command will also create the /opt/sge-6.2u5 directory with the appropriate rights.

Install the master (control) host

Now it is time to install the Sun Grid Engine master daemon on our control host. First unpack the binaries downloaded from the Sun website into the /opt/sge-6.2u5 directory. Go to that directory and execute ./inst_sge -m. This will guide you through the install process of the master daemon. Please answer the questions with the following information:

  • sge_qmaster port 49100
  • sge_execd port 49101
  • cell_name mygrid
  • spool type classic
  • create startup script yes
  • add execution hosts yes and enter “node1.mygrid.net node2.mygrid.net” without the quotes

After the installation is complete the sge_qmaster daemon will be started. You can check that with ps ax | grep -i sge_qmaster .

Now it is time to add our execution hosts to the system. Please make sure their hostnames resolve properly. An execution host must first be added as an administrative host, then installed.

To prepare our environment variables we first have to source the cell configuration file:

source /opt/sge-6.2u5/mygrid/common/settings.sh

and then tell the master daemon which hosts are our execution hosts:

qconf -ah node1.mygrid.net
qconf -ah node2.mygrid.net

qconf is the command line configuration tool. If you prefer a graphical one, use qmon (an X server must be installed on the control host). To be able to submit jobs from this host, it must be declared as a submit host:

qconf -as control.mygrid.net

Install the execution hosts

The execution hosts need the cell configuration files. Sharing them means any change made on the master is instantly propagated to the execution hosts. This is useful in case the master host fails and the shadow master announces itself as the new master daemon. To accomplish this we need to export the /opt/sge-6.2u5/mygrid directory via nfs.

So on the control host add the following line to the /etc/exports file:

control# vi /etc/exports
/opt/sge-6.2u5/mygrid 10.0.0.0/24(rw,sync,no_root_squash)

On the execution hosts the nfs export must be mounted in the same place. So we append a line to the /etc/fstab file to automatically mount the directory at boot time:

{node1,node2}# vi /etc/fstab
10.0.0.1:/opt/sge-6.2u5/mygrid /opt/sge-6.2u5/mygrid nfs defaults 0 0

The target directory must exist on the nodes.

To install the execution hosts, unpack the two archives in the /opt/sge-6.2u5/ directory. Source the configuration file (this tells the install program where the master is and on what port it is running):

source /opt/sge-6.2u5/mygrid/common/settings.sh

then run the ./inst_sge -x command from the /opt/sge-6.2u5/ directory. The installation process will guide you (most answers will be filled in automatically) and at the end you are prompted to choose whether or not to add the node (node1 and node2) to the all.q queue as an execution host.

Testing the system

Now the install process is almost complete. It is time to run some tests. The tests assume that the shared home directories are mounted on all hosts except nfs.mygrid.net, as described in this tutorial.

Step 1:

From the control host add a new user. Let the user name be sgeuser:

groupadd sgeusers
useradd -m -d /mygrid/home/sgeuser -g sgeusers sgeuser
passwd sgeuser

Step 2:

From the control host, copy the /etc/passwd and /etc/group files to the rest of the hosts. Use the ssh key authentication set up earlier.

On any host you should be able to run

 id sgeuser

and get full information about the user. You can set sgeuser’s password if you like (or copy the /etc/shadow file from the control host).

Step 3:

Log in from the control host as our new user, sgeuser. Source the cell configuration file located at /opt/sge-6.2u5/mygrid/common/settings.sh. The command

qstat -f

should show you that node1 and node2 are running well:

queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@node1.mygrid.net         BIP   0/0/8          0.03     lx24-amd64
---------------------------------------------------------------------------------
all.q@node2.mygrid.net         BIP   0/0/8          0.10     lx24-amd64

If it looks like this it is time to run some jobs. We will submit a dummy job from the provided examples.

qsub -q all.q -cwd /opt/sge-6.2u5/examples/jobs/sleeper.sh

To watch the queue progress:

watch "qstat -f"
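
If you would rather check the queue state from a script than by eye, the qstat -f listing can be parsed. This is only a rough sketch that assumes the column layout of the listing shown in Step 3 (queue instance, type, resv/used/tot., load, arch, optional states); parse_qstat_f is a helper name I made up, it is not an SGE tool.

```python
def parse_qstat_f(text):
    """Parse the queue-instance lines of `qstat -f` output into dictionaries."""
    queues = []
    for line in text.splitlines():
        fields = line.split()
        # Queue-instance lines look like:
        #   name@host qtype resv/used/tot load_avg arch [states]
        if len(fields) < 5 or "@" not in fields[0] or fields[2].count("/") != 2:
            continue  # header, separator or job line
        reserved, used, total = (int(n) for n in fields[2].split("/"))
        queues.append({
            "queue": fields[0],
            "qtype": fields[1],
            "reserved": reserved,
            "used": used,
            "total": total,
            "load_avg": fields[3],  # kept as text: it is not always a number
            "arch": fields[4],
            "states": fields[5] if len(fields) > 5 else "",
        })
    return queues


# Sample text copied from the listing in this tutorial.
sample = """\
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@node1.mygrid.net         BIP   0/0/8          0.03     lx24-amd64
---------------------------------------------------------------------------------
all.q@node2.mygrid.net         BIP   0/0/8          0.10     lx24-amd64
"""

for q in parse_qstat_f(sample):
    free = q["total"] - q["used"] - q["reserved"]
    print(q["queue"], "free slots:", free, "states:", q["states"] or "none")
```

In real use the text would come from running qstat -f (for example through subprocess) instead of the sample string.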

Good luck!

OCR for cards

You might wonder why you would need OCR for cards. Well, there is a simple answer: I haven’t seen it done yet, and it is useful if you need to register every game in a database. If you want to make an online game with real cards, you might need something like this.

At first sight, when I was told about the project, I had no idea about image recognition and it looked complicated, but after a while it started to take shape.

The project was about an online poker game, which must be recorded in a database to comply with legislation. For that it is not enough to display the cards as images to the online players; you need to know exactly which cards are on the table so you can decide who the winner is and check that the game rules are obeyed.

There are nine players in the game. Each is first dealt two cards, and later five more cards are added, but these are visible to everybody. With your own two cards and the visible ones you have to make a combination. The player with the highest five-card poker hand at showdown wins the pot. This is Texas hold’em.

The main problem of this project was card recognition. After talking with some friends (many thanks to Traian Zvirid) the following idea came out: let’s evaluate the card (sign and number) as the number of colored pixels out of the total pixels in a zone. Using these numbers I build a database and then compare the values computed from a new card against it. If there is a very close match I can tell what the card was.

Another problem was taking the pictures of the cards. I had to use an Axis 211 webcam, which has only 640×480 resolution and a not very good optical system: at short focus distances the fish-eye effect is strong. Besides this, there must be an efficient light source to light up the cards. The cards must be put in a fixed position … so there are some things to be put together.

Let’s start with the card recognition algorithm. To do this the card must be divided into 3 parts: the first contains the sign and number from the upper-left corner, the second the card body (with characters and signs) and the third the sign and number from the lower-right corner. The first and the third areas are the ones useful for image recognition. To get the coordinates of these regions you can proceed as follows:

1. for the first zone:

- start scanning column by column and compute the number of colored pixels out of the total pixels. If you have only white pixels (or almost only white pixels) it means that you haven’t hit the sign of the card yet (on my cards the sign was larger than the number).

- when you hit the sign, skip the colored area until it becomes white again

- keep those two pixel positions (where the colored area starts and where it ends)

2. for the third zone the algorithm is the same, but it starts from the right and goes left
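
The column scan described in the steps above could look roughly like this. It is a simplified sketch, not the code used in the project: the image is assumed to be already reduced to rows of 0 (white) and 1 (colored), and the names column_ratio and find_zone as well as the 0.05 threshold are my own choices for illustration.

```python
def column_ratio(image, x):
    """Fraction of colored pixels (value 1) in column x of a binary image."""
    return sum(row[x] for row in image) / len(image)


def find_zone(image, threshold=0.05, from_right=False):
    """Return (start, end) columns of the first colored run met while scanning.

    Scans left to right by default, right to left when from_right is True.
    Returns None when no column is colored above the threshold.
    """
    width = len(image[0])
    columns = range(width - 1, -1, -1) if from_right else range(width)
    start = end = None
    for x in columns:
        colored = column_ratio(image, x) > threshold
        if colored and start is None:
            start = x      # first column that hits the sign
        if colored:
            end = x        # keep extending while the column is still colored
        elif start is not None:
            break          # back to white: the zone is complete
    if start is None:
        return None
    return (min(start, end), max(start, end))


# Tiny made-up "card": one colored run near each side, white in between.
card = [
    [0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0],
]
left_zone = find_zone(card)                    # first zone, scanned from the left
right_zone = find_zone(card, from_right=True)  # third zone, scanned from the right
print(left_zone, right_zone)
```

The two tuples play the role of the pairs of x coordinates that delimit the first and third zones.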

After this you will compute the X1-4 coordinates just as in the following picture:

Now that we have delimited the symbol and number areas, they have to be split: for the recognition algorithm we need symbols and numbers separately. So this time we use the same algorithm described earlier, but we scan the image from the top to the middle for the left corner and from the bottom to the middle for the right corner. The image is scanned between the coordinates X1-X2 and X3-X4. We only go to the middle of the picture because it is not necessary to scan the whole image (performance reasons). After that we will compute the Y1-8 coordinates.

At this point we have all we need to start the evaluation algorithm:

- left corner: number (x1,y1,x2,y2), symbol (x1,y3,x2,y4)

- right corner: number (x3,y7,x4,y8), symbol (x3,y5,x4,y6)

So for every element in our picture we do:

- from top to bottom of the element we compute the number of colored pixels / number of total pixels
- the resulting array is compared with the values of all numbers/symbols stored in the database
- if a match is found (within a threshold) then that is the card. The match is computed as the arithmetic mean of the differences between the values, so that if we have a single totally out-of-order value it is distributed uniformly over the rest of the values and the error is cancelled. But if there are many out-of-order values then the whole mean is affected.
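
The evaluation steps above can be sketched as follows. This is an illustration under a few assumptions of mine, not the project’s code: elements are binary rows of 0/1, the "differences" are taken as absolute differences, stored and measured profiles have the same length, and the database contents, labels and the 0.1 threshold are invented for the example.

```python
def row_profile(element):
    """Fraction of colored pixels (value 1) in each row, from top to bottom."""
    return [sum(row) / len(row) for row in element]


def mean_difference(profile_a, profile_b):
    """Arithmetic mean of the absolute differences between two profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    total = sum(abs(a - b) for a, b in zip(profile_a, profile_b))
    return total / len(profile_a)


def best_match(profile, database, threshold=0.1):
    """Return the label of the closest stored profile, or None if none is close enough."""
    best_label, best_score = None, None
    for label, stored in database.items():
        if len(stored) != len(profile):
            continue  # this sketch only compares profiles of equal length
        score = mean_difference(profile, stored)
        if best_score is None or score < best_score:
            best_label, best_score = label, score
    if best_score is not None and best_score <= threshold:
        return best_label
    return None


# Hypothetical database: label -> stored row profile.
database = {
    "A": [0.2, 0.6, 1.0, 0.6],
    "7": [1.0, 0.4, 0.4, 0.2],
}

# A scanned element whose last row is slightly off from the stored "A".
element = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0],
]
print(best_match(row_profile(element), database))
```

Because the score is a mean, the single deviating row only shifts it a little, which is the error-cancelling effect described above; several deviating rows would push it past the threshold.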

NetAdmin

Why NetAdmin?

One of the major tasks of a system administrator is to maintain an up-to-date set of network topologies, in order to debug the network or to look up information such as the IP address of a piece of equipment or who is in charge of a specific network segment.

The usual way to do that is to store your configurations in text files or Excel spreadsheets. These are fine for storing info like the list of hosts on the network and their IP addresses. But now let’s suppose that you have to design a big network, or document one, with a couple of switches, dedicated routers and a lot of servers. You want to partition the network using VLANs, connect different sites of the company through tunnels, etc. In this case the Excel spreadsheet is no longer enough.

Now it is time to use a dedicated tool to do the job. NetAdmin can manage the usual network equipment like switches, routers and hosts. For each piece of equipment there is a set of attributes that makes your network documentation task easy. For example, if you define a switch then you can add ports to that switch, ports that have a number, a VLAN tag, a port mode (access or trunk) and finally a network that the port belongs to. There may also be a router connected to that network. The powerful part of the software will be generating the full network topology based on the information you added.

Usually when you work on a large network design you don’t work alone. Using NetAdmin you will be able to share your work with others and set the appropriate permissions.

NetAdmin work flow

The NetAdmin work flow consists of the following steps:

  • define network elements like switches, routers, hosts
  • define the networks
  • attach the network elements’ ports to the corresponding network
  • generate the network topology
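
To make the work flow a bit more concrete, here is a small sketch of how such data could be modelled. It is not NetAdmin’s actual implementation: the Port and Device classes, their fields and the generate_topology function are hypothetical names used only to illustrate the four steps.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Port:
    number: int
    network: str = ""      # name of the network the port is attached to
    vlan_tag: int = 0
    mode: str = "access"   # "access" or "trunk"


@dataclass
class Device:
    name: str
    kind: str              # "switch", "router" or "host"
    ports: list = field(default_factory=list)


def generate_topology(devices):
    """Group device names by the network their ports are attached to."""
    topology = defaultdict(set)
    for device in devices:
        for port in device.ports:
            if port.network:
                topology[port.network].add(device.name)
    return {network: sorted(names) for network, names in topology.items()}


# Steps 1-3: define the elements and attach their ports to networks.
sw1 = Device("sw1", "switch", [Port(1, "office", 10), Port(2, "servers", 20)])
r1 = Device("r1", "router", [Port(1, "office", 10)])
web = Device("web", "host", [Port(1, "servers", 20)])

# Step 4: generate the topology (here just a network -> devices mapping).
print(generate_topology([sw1, r1, web]))
```

A real topology generator would of course draw a diagram from this mapping; the sketch stops at the data that such a drawing needs.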

When will it be available?

The software will be released in beta version in about a month and it will be free of charge.

Update:

Due to school activities the development process is behind schedule.

Are you interested?

Leave a comment!

Nokia Remote Drive

For lack of progress in preparing for my master’s exam, I started fiddling with my phone and saw that it has a feature that is interesting at least as a networking idea: remote drive. This made me think of NFS or iSCSI, but when I saw that the URL has the form https://url I began to suspect something like DAV. Without further ado I dug around the net a little and indeed: Symbian 9.3 knows how to speak “DAV”. On the principle “you want DAV, you get DAV”, I sat down and cobbled together a DAV service on Slackware 13 with Apache 2.2.16.

I saw several setup variants on the internet, but I took the simplest one (I think). I created a folder reachable from the internet, enabled DAV on that folder in Apache, grabbed an SSL certificate, already had LDAP authentication set up for something else, put them all together and it worked, obviously not on the first try. If it had worked on the first try, something would surely have been wrong.

From what I tested, the phone does not really get along without an SSL connection, or at least it did not work for me. With SSL it is OK. The software seems a bit buggy to me, or maybe it was because of my data connection.

The configuration looks roughly like this:

DavLockDB "/var/lib/httpd/DavLock"

<Directory /var/www/default/dav/>
DAV On
Options Indexes MultiViews FollowSymLinks
Order allow,deny
Allow from all
<IfModule authnz_ldap_module>
AuthType basic
AuthName "Restricted area!"
AuthBasicProvider ldap
AuthLDAPBindDN cn=admin,dc=ro
AuthLDAPBindPassword password
AuthzLDAPAuthoritative on
AuthLDAPURL "ldap://localhost/dc=doraz,dc=ro?uid?"
Require ldap-group cn=www,o=groups,dc=doraz,dc=ro
</IfModule>
</Directory>

I found it a very useful service for making backups, and once uploaded the files can be accessed over HTTP from anywhere. I recommend using a wireless network for the data traffic if you have a large volume of information to transfer; otherwise a 3G network is just fine.