Created: 28 Feb 2005 Status: OPEN


-- RobertManteufel and CarstenPreuss - 28 Feb 2005

1 PROOF - The Parallel ROOT Facility

Introduction

To prepare for the parallel analysis of distributed datasets, a PROOF environment embedded in the local batch system has been set up at GSI. The Parallel ROOT Facility, PROOF, is an extension of the ROOT system. It enables physicists to analyse large sets of ROOT files in parallel on computer clusters.

Because of the growing amount of data in High Energy Physics, the ROOT developers at CERN decided to add parallel processing to ROOT.

PROOF allows a transparent and fast analysis of large sets of ROOT files (ROOT trees).

The goal of PROOF is not only to increase the available CPU power by using multiple hosts. It can also read and analyse in parallel one or more ROOT files stored on several hosts, so the I/O throughput grows with the number of hosts.

scaleability.png

PROOF is based on a three-tier architecture (client, master and slaves), as shown in the following picture:

PROOF_Cluster.png

At present the ROOT session cannot run on your local machine; it has to run on a batch-system-enabled host. The master server controls the slave servers, provides them with work packages, merges the results delivered by the slaves and returns the combined result to the client. The slaves can work on local files with the class TFile or on remote files with the class TNetFile. They ask the master for work packages, analyse them and return the results to the master, which then sends them new work packages. This package-oriented method provides load balancing and fault tolerance.

Because of the distributed architecture there are some differences between a ROOT session and a PROOF session. Instead of adding the analysis files to a TChain you have to use a TDSet. If you use libraries and packages you must set up a ROOT daemon on your localhost.

Proofmanual

Basic knowledge of PROOF

If you already know how to create library packages for PROOF, skip to the next part: Short explanation for creating an analysis script

- Packages:

A PAR package is a tar-gzip archive of a directory containing the shared libraries in source or binary form, together with the information on how to build and install them. The build/install information is provided in the PROOF-INF directory, which must be located in the topmost directory of the package. The file PROOF-INF/BUILD.sh contains shell commands to build a package from sources and must be executable. The file PROOF-INF/SETUP.C contains ROOT commands to activate the shared libraries.

Create a PAR package containing the generated TInfo and Event classes. Explanation:

In this example we create a PAR package containing the user-generated class TInfo (TInfo.h and TInfo.C, explained below) and the default ROOT class Event. For this we only need the following directory layout:

   libEvent/            (directory named by the user)
      PROOF-INF/     (has to be named so)
         SETUP.C      (has to be named so)

Explanation: Before you use this, create the shared libraries from the classes Event and TInfo (for example in a ROOT session: ".x TInfo.C++;").

Finally create the tar-gzipped archive: "tar -czf libEvent.par libEvent;"
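
The layout and the archive step above can be sketched as a short shell session. This is only a minimal sketch under assumptions, not the authors' own script: the library names libEvent.so and TInfo_C.so are assumed, the library path is left elided on purpose, and everything happens in a scratch directory so that nothing in your account is overwritten.

```shell
#!/bin/sh
# Sketch: build the libEvent PAR package in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# PROOF-INF must sit in the topmost directory of the package.
mkdir -p libEvent/PROOF-INF

# SETUP.C loads the shared libraries.
# The library names and the elided path are placeholders, not verified values.
cat > libEvent/PROOF-INF/SETUP.C <<'EOF'
Int_t SETUP()
{
   gSystem->Load("/...your_path.../libEvent.so");
   gSystem->Load("/...your_path.../TInfo_C.so");
   return 1;
}
EOF

# Create the tar-gzipped archive and list its contents as a check.
tar -czf libEvent.par libEvent
par_listing=$(tar -tzf libEvent.par)
echo "$par_listing"
```

The listing should show libEvent/PROOF-INF/SETUP.C; if it does not, the package will not be usable as described above.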

Before you activate the package in a PROOF-session, you have to upload it. Syntax: "gProof->UploadPackage("libEvent.par");"

A more detailed example can be found under Example PROOF-session.

Short explanation for creating an analysis script

4 steps:

- open ROOT-file

- make Selector-files (analysis files)

- edit header file

- edit source file

Open ROOT-file and create Selector files

- start ROOT Session

- open your file like:

TFile f("/u/dvgamma/MakeEvent4.0004/Event45000.root");

- create Selector-files:

T->MakeSelector("EventSelector");

In this example the ROOT-file contains a tree called "T" which is the tree to be analyzed.

You should receive a message like:

Info in <TTreePlayer::MakeClass>: Files: EventSelector.h and EventSelector.C generated from TTree T

Edit header file

- In the given example we only need branches for "fNtrack" and "event".

- Set branch addresses

- Add user defined objects and some data members

Here you don't need to set the branch address of fNtrack because "event" includes all the other branches!

User defined class TInfo

In principle the class TInfo is a dummy class designed for collecting analysis results from the slaves. The object keeps the data until it is collected in SlaveTerminate(). See the TInfo.h and TInfo.C code for more details.

A look into our example script: EventSelector.h

Edit the source file

First, a look at our example: EventSelector.C

Each function starts with a comment that explains it.
The analysis itself is implemented in
EventSelector::Process(Int_t entry)

In EventSelector::SlaveTerminate() we have to add our objects:

fOutput->Add(infoobj); and
fOutput->Add(firstevent);

EventSelector::Terminate()

First retrieve infoobj as a TObject from the outList. It has to be cast back to TInfo before it can be used as a TInfo object again.

Finally launch a PROOF-Session and start the analysis

Start the PROOF session via script or manually. Inside the session the following steps have to be carried out:

- Upload packages

- Enable packages

- Create TDSet

- Add file/files

- Start analysis

Use the prooflogin script (author: Carsten Preuss) to launch a PROOF session. See also: Example PROOF-session

Finally, an extract of a PROOF session:
root [0] 
Processing /u/dvgamma/proofstarter.C("proof://lxb051:1095")...
dvgamma@lxb051.gsi.de password:
PROOF set to parallel mode (9 slaves)
proof://lxb051:1095
root [1] gProof->UploadPackage("libEvent.par");
root [2] gProof->EnablePackage("libEvent.par");
(Int_t)1
(Int_t)1
(Int_t)1
(Int_t)1
(Int_t)1
(Int_t)1
(Int_t)1
(Int_t)1
(Int_t)1
(Int_t)1
root [3] TDSet *set = new TDSet("TTree","T");
root [4] set->Add("/u/dvgamma/MakeEvent4.0004/Event45000.root");
root [5] set->Process("/u/dvgamma/projects/Prooftest/EventSelector.C");
Remark: Only one ROOT file is analysed here. Of course you can add more than one, but the tree has to be the same in all of them (here: "T").
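
As an illustration of the remark above, the following shell sketch writes a small ROOT macro that adds several files with the same tree "T" to one TDSet. It is a sketch under assumptions: the macro name run_analysis.C and the second file name are hypothetical, and the macro expects that a PROOF session is already open. The ROOT calls are the ones used in the session extract.

```shell
#!/bin/sh
# Sketch: generate a ROOT macro that adds several files to one TDSet.
macro_dir=$(mktemp -d)
macro_file="$macro_dir/run_analysis.C"   # hypothetical file name

{
    echo '{'
    echo '   // Assumes a PROOF session is already open, so gProof is valid.'
    echo '   TDSet *set = new TDSet("TTree","T");'
    # The first path is from the session extract; the second one is hypothetical.
    for f in /u/dvgamma/MakeEvent4.0004/Event45000.root \
             /u/dvgamma/MakeEvent4.0004/Event45001.root
    do
        printf '   set->Add("%s");\n' "$f"
    done
    echo '   set->Process("/u/dvgamma/projects/Prooftest/EventSelector.C");'
    echo '}'
} > "$macro_file"

# Show the generated macro.
cat "$macro_file"
```

The generated file could then be executed inside the PROOF session with ".x"; every added file must contain the tree named in the TDSet constructor.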

PROOF-Setup

What do I need to work with PROOF?

To use the GSI PROOF environment you need a Linux account and you must be logged in to a batch-system-enabled host. After that, all you have to do is type prooflogin. This starts the prooflogin script, which is located in the folder /usr/local/bin

All config files needed by PROOF are built by the script itself. This includes the following files:

.rootrc

.rootauthrc

.proof.conf

proofd.sh

proofstarter.C

Normally the only one of these files that already exists on your host is .rootrc. This file holds many settings that control how ROOT looks and works. For PROOF it also provides the information needed for the authentication between the client and the PROOF master server. If you have this file on your host but it lacks the required statements, the script tells you which changes are necessary. These changes have to be made manually. If there is no .rootrc file in your $HOME directory, prooflogin builds a rudimentary file to work with and deletes it after the PROOF session.

.rootauthrc, .proof.conf and root.conf are configuration files which describe the structure of your PROOF session (hostnames, port numbers, authentication methods, etc.).

proofd.sh is a script which is sent to the LSF cluster and starts the proof and root daemons from there.

proofstarter.C is a C script which is executed during the (automatic) start of ROOT. This script can upload your libraries and packages, build a TDSet with your analysis files, and start ROOT and the PROOF session.

The cleanup.sh-script clears up the produced files and ends the started daemons so that there's nothing left to do manually.

Working with PROOF

To start a default PROOF session, execute the prooflogin script without any parameters. This starts 10 PROOF servers (9 slaves (sorry, workers ;-)) and 1 master) in the LSF cluster with a termination time of 12 hours (this means your session can run for at most 12 hours).

To customise the session you can use the following parameters:

-s slave-count This is the number of slaves you want to use, in a range between 5 and 15. The default is 10.

-t termination-time This is the maximum run time of your job, with an upper limit of 12 hours.

-f file1,... This is a list of files you want to analyse. The filenames must include the hostname and path in the form //usr/h1analysis.root for files on your local machine or lxb050.gsi.de://shared/users/h1analysis.root for files on remote hosts. The files must be separated by commas.

-tree treename This is the name of the tree you wish to analyse. It must be the same in all added ROOT files. If you specify files at the startup of prooflogin, you must also set the option -tree so that the script can build the TDSet.

-l filelist treename This is a file in which you have written the hostname, path and filename of each file you want to analyse. After the file, separated by a space, you must give the name of the tree you want to analyse.

-pac This is the package that you use with your analysis. The filename must include the hostname and path in the form //usr/package.par for packages on your local machine or lxb050.gsi.de://shared/users/package.par for packages on remote hosts.

-mol master on localhost This option starts the PROOF master server on your local machine, so you can have one master and 15 slaves. (This is one slave more than otherwise.)

-v root-version This is the ROOT version you would like to work with, for example old, pro, new, dev or 400-04. If this parameter is not given, the script uses the ROOT version the user has already initialised. If no ROOT version has been set up before, the default version (pro) is used.

-h, -? -help help Shows a short help text on how to use PROOF.
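
To illustrate how the -f and -tree options fit together, the following shell sketch assembles a prooflogin command line and only prints it (a dry run; nothing is started). It is a sketch under assumptions: the second file name and the tree name "T" are placeholders, and the helper join_by_comma is not part of prooflogin.

```shell
#!/bin/sh
# Sketch: assemble a prooflogin command line and print it (dry run).

# Join all arguments with commas, as the -f option expects.
join_by_comma() {
    ( IFS=,; printf '%s' "$*" )
}

# The first file name is from the text above; the second one is hypothetical.
file_list=$(join_by_comma \
    "lxb050.gsi.de://shared/users/h1analysis.root" \
    "lxb050.gsi.de://shared/users/h1analysis_2.root")

# Placeholder tree name; it must be the same in all listed files.
tree_name="T"

prooflogin_cmd="prooflogin -s 5 -f $file_list -tree $tree_name"
echo "$prooflogin_cmd"
```

Printing the command first makes it easy to check the comma-separated list before actually submitting jobs to the batch system.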

The prooflogin script scans the files it needs and informs you about any changes that have to be made. The script does not change your files itself; you have to do this manually. All files built by the script are deleted automatically afterwards.

This script has been tested with the following ROOT-versions:

ROOT 310-02
ROOT 400-03
ROOT 400-04
ROOT 400-06
ROOT 400-08
ROOT 401-02
ROOT 401-04
ROOT 402-00
ROOT 403-02

Example PROOF-session

For a better understanding of how things work, we provide an example PROOF session on this page. You can download all the files needed from this page and start the PROOF session on your own account/host. If you run into problems, you can compare your output with that shown in our example. To give you more practice with the PROOF environment, we decided to show a more advanced example.

Required files: (ROOT-version 400-04)

- the ROOT file called hitfile.root
- the analysis files called Anaproof.h and Anaproof.C
- the libraries: libTMytrackerhit.so and TCounter_C.so

After downloading the required files, build a package for your libraries. You need the following directory layout:

libTMytrackerhit/PROOF-INF/SETUP.C

The SETUP.C file contains:
Int_t SETUP()
{
   gSystem->Load("/...your_path.../libTMytrackerhit.so");
   gSystem->Load("/...your_path.../TCounter_C.so");
   return 1;
}
"libTMytrackerhit" is a directory name the user can choose freely. "PROOF-INF" and "SETUP.C" must have exactly these names.

Finally create the tar-gzipped archive:
tar -czf libTMytrackerhit.par libTMytrackerhit;
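
Before uploading, it can help to check that the archive really has PROOF-INF/SETUP.C directly below its top directory. The following shell sketch shows one possible check. The function name check_par is hypothetical, and the demonstration uses a stand-in SETUP.C in a scratch directory rather than the real libraries.

```shell
#!/bin/sh
# Sketch: check that a PAR archive contains <topdir>/PROOF-INF/SETUP.C.
check_par() {
    # List the archive and look for PROOF-INF/SETUP.C directly below the top directory.
    tar -tzf "$1" 2>/dev/null | grep -q '^[^/][^/]*/PROOF-INF/SETUP\.C$'
}

# Demonstration in a scratch directory with a stand-in SETUP.C.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir -p libTMytrackerhit/PROOF-INF
printf 'Int_t SETUP()\n{\n   return 1;\n}\n' > libTMytrackerhit/PROOF-INF/SETUP.C
tar -czf libTMytrackerhit.par libTMytrackerhit

if check_par libTMytrackerhit.par; then
    echo "libTMytrackerhit.par: layout looks fine"
else
    echo "libTMytrackerhit.par: PROOF-INF/SETUP.C not found"
fi
```

The check only looks at the layout; it does not verify that the libraries named in SETUP.C exist at the given paths.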

After downloading the files and creating the package, log in to a batch-system-enabled host.
Then start the PROOF-session by typing:
prooflogin -s 3 -mol -v 400-04

This means your PROOF session starts 3 slaves, the master is started on your localhost (usually an lxi host) and ROOT version 400-04 is used.

Now you get the following output (we hope;-):
Job <proofd> is not found in queue <proof>
Scanning for PROOF-slaves.
Job <proofd> is not found in queue <proof>
Job <proofd> is not found in queue <proof>
3 PROOF-slaves started.
Termination-time is set to the dafault of 12 hours!
UsrPwd-Status is valid
UsrPwd.Login is valid
UsrPwd.LoginPrompt is valid
--------------------------------------------------
GLOBUS enabled.
ROOT Version 400-04-new enabled, path /usr/local/pub/debian3.0/gcc295-04/rootmgr/400-04 !
/usr/local/pub/debian3.0/gcc295-04/rootmgr/400-04
Job <proofd> is not found
Job <238495> is submitted to queue <proof>.
Job <proofd> is not found
Job <238496> is submitted to queue <proof>.
Job <238497> is submitted to queue <proof>.
Job <proofd> is not found




  *******************************************
  *                                         *
  *        W E L C O M E  to  R O O T       *
  *                                         *
  *   Version   4.00/04    25 August 2004   *
  *                                         *
  *  You are welcome to visit our Web site  *
  *          http://root.cern.ch            *
  *                                         *
  *******************************************

FreeType Engine v2.1.3 used to render TrueType fonts.
Compiled for linux with thread support.

CINT/ROOT C/C++ Interpreter version 5.15.133, Apr 18 2004
Type ? for help. Commands must be C++ statements.
Enclose multiple statements between { }.
root [0]
Processing /u/manteufe/proof_temp/proofstarter.C("proof://lxb035:1095")...
manteufe@localhost password:

On the ROOT-shell you will be asked for your password.
After entering it correctly;-) you get this output on your screen:
PROOF set to parallel mode (3 slaves)
root [1]

You are now working on the ROOT command line. To get more information about your PROOF session, just type:
gProof->Print()


Connected to:             localhost (valid)
Port number:              1095
User:                     manteufe
Security context:         Method: 0 (UsrPwd) not reusable
Proofd protocol version:  9
Client protocol version:  3
Remote protocol version:  3
Log level:                0
*** Master server (parallel mode, 3 slaves):
Master host name:         localhost
Port number:              1095
User:                     manteufe
Protocol version:         3
Image name:               localhost
Working directory:        /u/manteufe/proof/master-localhost-1101412200-5728
Config directory:         /usr/local/pub/debian3.0/gcc295-04/rootmgr/400-04
Config file:              /u/manteufe/.proof.conf
Log level:                0
Number of slaves:         3
Number of active slaves:  3
Number of unique slaves:  3
Number of bad slaves:     0
Total MB's processed:     0.00
Total real time used (s): 0.000
Total CPU time used (s):  0.000

If you don't load up the libs automatically in your rootlogon.C script, you have to do it manually:

.L libTMytrackerhit.so;
.L TCounter_C.so;

Upload and enable your package by typing:
gProof->UploadPackage("/...your path.../libTMytrackerhit.par");
gProof->EnablePackage("libTMytrackerhit.par");


(Int_t)1
(Int_t)1
(Int_t)1
(Int_t)1

After this you can build your analysis-tree:
TDSet *set = new TDSet("TTree","hittree");
set->Add("/...your path.../hitfile.root");

Now it's time to start the analysis:
set->Process("/...your path.../Anaproof.C")

and you get a new, small window where you can see how far the analysis is and this output:

green.png

The future of PROOF, news and FAQ

The latest ROOT version tested is 403-02.

GLOBUS-support

We are currently working on enabling GLOBUS authentication for ROOT/PROOF.
This feature will be accessible through the new prooflogin parameter

-auth 0 (for UsrPwd, that is the standard) or 3 (for GLOBUS)

Once this feature is implemented, you will see it in the prooflogin help.

PROOF@GSI/FZK

To provide more computing power we are planning to extend our PROOF environment
to the PBS cluster of the FZK.

To use PROOF@GSI/FZK you only need a GLOBUS certificate.
To ensure a proper setup, it is recommended that you provide the names and
locations of your ROOT files to the prooflogin script.

This feature is automatically used if you have a valid GLOBUS certificate and ROOT-files
at the FZK.

How to get a GLOBUS certificate

Refer to digital certificate in this topic.

Links

Related Documents

Attachment            Size        Date                 Who
Anaproof.C            4 K         2005-04-12 - 19:39   UnknownUser
Anaproof.h            4 K         2005-04-12 - 19:39   UnknownUser
EventSelector.C       5 K         2005-03-01 - 17:22   UnknownUser
EventSelector.h       2 K         2005-03-01 - 17:22   UnknownUser
SETUP.C               142 bytes   2005-02-28 - 16:14   UnknownUser
TCounter.C            599 bytes   2005-02-28 - 15:48   UnknownUser
TCounter.h            670 bytes   2005-02-28 - 15:48   UnknownUser
TCounter_C.so         41 K        2005-02-28 - 15:42   UnknownUser
TInfo.C               498 bytes   2005-03-01 - 17:23   UnknownUser
TInfo.h               521 bytes   2005-03-01 - 17:23   UnknownUser
hitfile.root          240 K       2005-02-28 - 15:34   UnknownUser
libTMytrackerhit.so   53 K        2005-02-28 - 15:42   UnknownUser
Topic revision: 2015-08-10, IlonaNeis