Created: 28 Feb 2005
Status: OPEN
--
RobertManteufel and
CarstenPreuss - 28 Feb 2005
1 PROOF - The Parallel ROOT Facility
Introduction
To be prepared for the parallel analysis of distributed datasets, a PROOF environment,
embedded in the local batch system, has been set up at GSI.
The Parallel ROOT Facility, PROOF, is an extension of the ROOT system. It enables physicists
to analyse large sets of ROOT files in parallel on computer clusters.
Because of the growing amount of data in High Energy Physics, the ROOT developers at
CERN decided to put ROOT on a parallel basis.
PROOF allows a transparent and fast analysis of large sets of ROOT files (ROOT trees).
The goal of PROOF is not only to increase the available CPU power by using multiple hosts.
It also analyses, in parallel, one or more ROOT files which are stored on several hosts,
so the I/O speed increases with the number of hosts.
PROOF is based on a three-tier architecture (client, master server, slave servers), as shown in the following picture:
At present the ROOT session cannot run on your local machine; it has to run on a batch-system-enabled host.
The master server controls the slave servers: it provides them with work packages, combines the results delivered by the
slaves and returns the complete result to the client.
The slaves can work on local files with the class TFile or on remote files with the class TNetFile. They ask the
master for work packages, analyse them and return the results to the master, which then sends them new work packages.
This package-oriented method provides load balancing and is fault tolerant.
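The load-balancing effect of this pull model can be illustrated with a small simulation. The following is only a sketch in plain C++ and not PROOF code: the function name distribute_packets and the idea of a fixed "cost per package" for each slave are assumptions made for the illustration. Each slave asks for the next package as soon as it has finished the previous one, so faster hosts automatically end up processing more packages.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical simulation of a master handing out equal-sized work packages.
// cost_per_packet[i] is the time slave i needs for one package (must be > 0).
// Returns how many packages each slave has processed in the end.
inline std::vector<int> distribute_packets(const std::vector<int>& cost_per_packet,
                                           int total_packets) {
    assert(!cost_per_packet.empty());
    std::vector<int> done(cost_per_packet.size(), 0);

    // (time at which the slave is free again, slave index), earliest first.
    typedef std::pair<int, std::size_t> Entry;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry> > free_at;
    for (std::size_t i = 0; i < cost_per_packet.size(); ++i) {
        assert(cost_per_packet[i] > 0);
        free_at.push(Entry(0, i));
    }

    for (int p = 0; p < total_packets; ++p) {
        // The slave that becomes free first asks the master for the next package.
        Entry next = free_at.top();
        free_at.pop();
        std::size_t slave = next.second;
        ++done[slave];
        free_at.push(Entry(next.first + cost_per_packet[slave], slave));
    }
    return done;
}
```

With two slaves, one of them twice as slow as the other, the fast slave ends up with roughly twice as many packages. A package given to a slave that stops responding could be handed out again in the same way; that part is not simulated here.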
Because of the distributed architecture there are some differences between a ROOT session and a PROOF session.
Instead of adding the analysis files to a TChain you have to use a TDSet.
If you use libraries and packages you must set up a ROOT daemon on your localhost.
Proofmanual
Basic knowledge of PROOF
If you already know how to create library packages for PROOF, jump to the next part:
Short explanation for creating an analysis script
- Packages:
A PAR package is a tar-gzip archive of a directory containing the
shared libraries in source or binary form and the information on how
to build/install them. The build/install information is provided within the
PROOF-INF directory, which needs to be located in the topmost
directory of the package. The file PROOF-INF/BUILD.sh contains
shell commands to build a package from sources and must be
executable. The file PROOF-INF/SETUP.C contains ROOT commands to
activate the shared libraries.
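The layout rules above can be checked before uploading a package. The following sketch is plain C++ and not part of ROOT or prooflogin; the names ParCheck and check_par_listing are hypothetical. It takes the list of paths inside an archive (for example as printed by "tar -tzf libEvent.par") and checks that everything lives under one top-level directory and that PROOF-INF/SETUP.C sits directly below it. BUILD.sh is only reported, since a package with prebuilt libraries may not need it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Result of inspecting an archive listing (hypothetical helper type).
struct ParCheck {
    bool single_top_dir;  // all entries live under one top-level directory
    bool has_setup;       // <top>/PROOF-INF/SETUP.C is present
    bool has_build;       // <top>/PROOF-INF/BUILD.sh is present
};

// entries: relative paths as listed from the archive,
// e.g. "libEvent/PROOF-INF/SETUP.C".
inline ParCheck check_par_listing(const std::vector<std::string>& entries) {
    ParCheck r = {true, false, false};
    std::string top;
    for (std::size_t i = 0; i < entries.size(); ++i) {
        // First path component; substr(0, npos) yields the whole string.
        std::string first = entries[i].substr(0, entries[i].find('/'));
        if (top.empty()) {
            top = first;
        } else if (first != top) {
            r.single_top_dir = false;
        }
    }
    if (top.empty()) {  // empty listing: nothing to check
        r.single_top_dir = false;
        return r;
    }
    for (std::size_t i = 0; i < entries.size(); ++i) {
        if (entries[i] == top + "/PROOF-INF/SETUP.C") r.has_setup = true;
        if (entries[i] == top + "/PROOF-INF/BUILD.sh") r.has_build = true;
    }
    return r;
}
```

A listing that fails the has_setup check would typically mean that PROOF-INF was placed too deep in the directory tree or was misnamed.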
Create a PAR package containing the generated TCounter and Event classes.
Explanation:
In this example we create a PAR package containing the user-generated class TInfo
(TInfo.h and TInfo.C, for an explanation see below) and the default ROOT class Event.
For this we only need the following directory/file layout:
libEvent/ (directory named by the user)
  PROOF-INF/ (must have exactly this name)
    SETUP.C (must have exactly this name)
Explanation:
Before you use this, create the shared libraries from the classes Event and TInfo
(for example in a ROOT session: ".x TInfo.C++;").
Finally create the tar-gzipped archive:
"tar -czf libEvent.par libEvent;"
Before you activate the package in a PROOF-session, you have to
upload it.
Syntax:
"gProof->UploadPackage("libEvent.par");"
You will find a more detailed example under Example PROOF-session.
Short explanation for creating an analysis script
4 steps:
- open ROOT-file
- make Selector-files (analysis files)
- edit header file
- edit source file
Open ROOT-file and create Selector files
- start ROOT Session
- open your file like:
TFile f("/u/dvgamma/MakeEvent4.0004/Event45000.root");
- create Selector-files:
T->MakeSelector("EventSelector");
In this example the ROOT-file contains a tree called "T"
which is the tree to be analyzed.
You should receive a message like:
Info in TTreePlayer::MakeClass: Files: EventSelector.h and EventSelector.C generated from TTree T
- In the given example we only need branches for "fNtrack" and
"event".
- Set branch addresses
- Add user defined objects and some data members
Here you don't need to set the branch address of fNtrack
because "event" includes all the other branches!
User defined class TInfo
In principle the class TInfo is a dummy class designed for
collecting analysis results from the slaves.
The object keeps the data until it is picked up in SlaveTerminate().
See the TInfo.h and TInfo.C code for more details ...
A look into our example script:
EventSelector.h
Edit the source file
First a look in our actual example:
EventSelector.C
Each function is explained in a comment at its beginning.
The analysis is embedded in
EventSelector::Process(Int_t entry)
In EventSelector::SlaveTerminate() we have to add our objects:
fOutput->Add(infoobj); and
fOutput->Add(firstevent);
EventSelector::Terminate()
First you get your infoobj as a TObject from the outList. It has to be converted
back before it can be used again as a TInfo object.
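The "convert it back" step is an ordinary checked downcast. The sketch below shows the pattern in plain C++ without ROOT: Object, Info and OutputList are stand-ins for TObject, TInfo and the selector's output list, and the member entries is a hypothetical payload. In a real selector the lookup would go through the output list itself, with a cast to TInfo* afterwards.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for TObject: a polymorphic base class with a name.
struct Object {
    explicit Object(std::string n) : name(std::move(n)) {}
    virtual ~Object() {}
    std::string name;
};

// Stand-in for the user class TInfo; "entries" is a hypothetical payload.
struct Info : Object {
    Info(std::string n, long e) : Object(std::move(n)), entries(e) {}
    long entries;
};

// Stand-in for the output list that travels from the slaves to the client.
typedef std::vector<std::unique_ptr<Object> > OutputList;

// Look an object up by name; the list only knows the base class.
inline Object* find_object(const OutputList& out, const std::string& name) {
    for (std::size_t i = 0; i < out.size(); ++i) {
        if (out[i]->name == name) return out[i].get();
    }
    return nullptr;
}

// Lookup plus checked downcast. Returns nullptr if the object is missing
// or is not an Info, so the caller can test the result before using it.
inline Info* get_info(const OutputList& out, const std::string& name) {
    return dynamic_cast<Info*>(find_object(out, name));
}
```

Checking the returned pointer before use avoids a crash in Terminate() when a slave did not deliver the object.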
Finally launch a PROOF-Session and start the analysis
Start the PROOF-Session via script or manually.
Inside the session the following steps have to be done:
- Upload packages
- Enable packages
- Create TDSet
- Add file/files
- Start analysis
Use the prooflogin script (author: Carsten Preuss) to launch
a PROOF session.
See also: Example PROOF-session
Finally, an extract of a PROOF session
root [0]
Processing /u/dvgamma/proofstarter.C("proof://lxb051:1095")...
dvgamma@lxb051.gsi.de password:
PROOF set to parallel mode (9 slaves)
proof://lxb051:1095
root [1] gProof->UploadPackage("libEvent.par");
root [2] gProof->EnablePackage("libEvent.par");
(Int_t)1
(Int_t)1
(Int_t)1
(Int_t)1
(Int_t)1
(Int_t)1
(Int_t)1
(Int_t)1
(Int_t)1
(Int_t)1
root [3] TDSet *set = new TDSet("TTree","T");
root [4] set->Add("/u/dvgamma/MakeEvent4.0004/Event45000.root");
root [5] set->Process("/u/dvgamma/projects/Prooftest/EventSelector.C");
Remark:
Here only one ROOT file is analysed. Of course you can add
more than one, but the tree name has to be the same in all of them (here: "T").
PROOF-Setup
What do I need to work with PROOF?
To use the GSI PROOF environment you need a LINUX account and you must be logged into a batch-system-enabled host.
After this the only thing you have to do is to type prooflogin.
This starts the script prooflogin, which is located in the folder /usr/local/bin
All config files needed by PROOF are built by the script itself.
This includes the following files:
.rootrc
.rootauthrc
.proof.conf
proofd.sh
proofstarter.C
Normally the only one of these files that already exists on your host is .rootrc.
This file holds a lot of information about how your ROOT looks and works.
For PROOF it also provides the information needed for the authentication between the client and the PROOF master server.
If you have this file on your host but without the required statements, the script will tell you which changes are necessary.
These changes have to be made manually.
If there is no .rootrc file in your $HOME directory, prooflogin builds a rudimentary file to work with and deletes it
after the PROOF session.
.rootauthrc, .proof.conf and root.conf are configuration files which describe the structure
of your PROOF session (hostnames, port numbers, authentication methods, etc.).
proofd.sh is a script which is sent to the LSF cluster and from there starts the proof and root daemons.
proofstarter.C is a C-script which is executed during the (automatic) start of ROOT.
This script can upload your libraries and packages, build a
TDSet with your analysis-files, start ROOT and the PROOF-session.
The cleanup.sh-script clears up the produced files and ends the started daemons so that there's
nothing left to do manually.
Working with PROOF
To start a default PROOF session, execute the prooflogin script without any parameters. This starts 10 PROOF servers
(9 slaves (sorry, workers ;-)) and 1 master) in the LSF cluster with a termination time of 12 hours (this means
your session can run for at most 12 hours).
To set up a more personal session you can use the following parameters :
-s slave-count
This is the number of slaves you want to use, in a range between 5 and 15. The default is 10.
-t termination-time
This is the maximum possible run-length of your job, with an upper limit of 12 hours.
-f file1,...
This is a list of files you want to analyse. The filenames must include the hostname and path in the form :
//usr/h1analysis.root
for files on your local machine or
lxb050.gsi.de://shared/users/h1analysis.root
for files on remote hosts.
The files must be separated by a comma.
-tree treename
This is the name of the tree you wish to analyse. It must be the same in all added ROOT-files.
If you pass files at the startup of prooflogin you must also set the option -tree so that the script
can build the TDSet.
-l filelist treename
This is a file in which you have written the hostname, path and filenames of the files you want to analyse.
After the file, separated by a space, you must give the name of the tree you want to analyse.
-pac
This is the package that you use with your analysis. The filename must include the hostname and path in the form :
//usr/package.par
for packages on your local machine or
lxb050.gsi.de://shared/users/package.par
for packages on remote hosts.
-mol master on localhost
This option starts the PROOF-master-server on your local machine. So you can have one master and 15 slaves.
(This is one slave more than otherwise.)
-v root-version
This is the ROOT version you would like to work with. Possible values look like old, pro, new, dev or 400-04, etc.
If this parameter is not given, the script uses a ROOT version that the user has already initialized. If no ROOT version
has been started before, the default version (pro) is set up.
-h, -?, -help help
Shows a short help text on how to use prooflogin.
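The file format of the -f option and the rule that -f needs -tree can be made concrete with a small parser. This is a sketch in plain C++ and not the code of the prooflogin script; the names FileSpec, parse_file_spec, parse_file_list and options_consistent are hypothetical. The path is kept verbatim from the leading "//" on, because the text above does not say how the double slash is interpreted.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct FileSpec {
    std::string host;  // empty for files on the local machine
    std::string path;  // everything from the leading "//" on, kept verbatim
};

// Accepts "//path" (local) or "host://path" (remote).
inline bool parse_file_spec(const std::string& item, FileSpec& out) {
    std::string::size_type pos = item.find("//");
    if (pos == std::string::npos) return false;
    if (pos == 0) {
        out.host.clear();
    } else {
        // A remote entry needs a non-empty host followed by ':'.
        if (pos == 1 || item[pos - 1] != ':') return false;
        out.host = item.substr(0, pos - 1);
    }
    out.path = item.substr(pos);
    return out.path.size() > 2;  // something must follow the "//"
}

// Splits a comma-separated list; fails if any single entry is malformed.
inline bool parse_file_list(const std::string& list, std::vector<FileSpec>& out) {
    out.clear();
    std::string::size_type start = 0;
    for (;;) {
        std::string::size_type end = list.find(',', start);
        std::string item = (end == std::string::npos)
                               ? list.substr(start)
                               : list.substr(start, end - start);
        FileSpec spec;
        if (!parse_file_spec(item, spec)) return false;
        out.push_back(spec);
        if (end == std::string::npos) break;
        start = end + 1;
    }
    return true;
}

// Files given with -f require a tree name given with -tree.
inline bool options_consistent(const std::vector<FileSpec>& files,
                               const std::string& tree) {
    return files.empty() || !tree.empty();
}
```

The sketch does not trim spaces around the commas, so a list written with blanks would be rejected here.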
The prooflogin script scans the needed files and informs you about any changes to be made.
This script does not change your files itself. You'll have to do this manually.
All files that are built by this script are deleted automatically afterwards.
This script has been tested with the following ROOT-versions:
ROOT 310-02
ROOT 400-03
ROOT 400-04
ROOT 400-06
ROOT 400-08
ROOT 401-02
ROOT 401-04
ROOT 402-00
ROOT 403-02
Example PROOF-session
To help you understand how things work, we provide an example PROOF session on this site.
You can download all needed files from this site and start the PROOF session on your own account/host.
In case of problems you can compare your output with that shown in our example.
To give you more practice with the PROOF environment we decided to show a more advanced example.
Required files:
(ROOT-version 400-04)
- the ROOT file called
hitfile.root
- the analysis files called
Anaproof.h and
Anaproof.C
- the libraries:
libTMytrackerhit.so and
TCounter_C.so
After downloading the required files, build a package for your libraries.
Here you need the following directory/file layout:
libTMytrackerhit/PROOF-INF/SETUP.C
The SETUP.C file contains:
Int_t SETUP()
{
gSystem->Load("/...your_path.../libTMytrackerhit.so");
gSystem->Load("/...your_path.../TCounter_C.so");
return 1;
}
"libTMytrackerhit" is a directory the user can name as he likes.
"PROOF-INF" and "SETUP.C" have to be called so!!!
Finally create the tar-gzipped archive:
tar -czf libTMytrackerhit.par libTMytrackerhit;
After downloading the files and creating the package, login to a batch-system-enabled host.
Then start the PROOF-session by typing:
prooflogin -s 3 -mol -v 400-04
This means your PROOF-session starts 3 slaves, the master is started on your
localhost (mostly a lxi-host) and the used ROOT-version is 400-04.
Now you get the following output (we hope;-):
Job <proofd> is not found in queue <proof>
Scanning for PROOF-slaves.
Job <proofd> is not found in queue <proof>
Job <proofd> is not found in queue <proof>
3 PROOF-slaves started.
Termination-time is set to the dafault of 12 hours!
UsrPwd-Status is valid
UsrPwd.Login is valid
UsrPwd.LoginPrompt is valid
--------------------------------------------------
GLOBUS enabled.
ROOT Version 400-04-new enabled, path /usr/local/pub/debian3.0/gcc295-04/rootmgr/400-04 !
/usr/local/pub/debian3.0/gcc295-04/rootmgr/400-04
Job <proofd> is not found
Job <238495> is submitted to queue <proof>.
Job <proofd> is not found
Job <238496> is submitted to queue <proof>.
Job <238497> is submitted to queue <proof>.
Job <proofd> is not found
*******************************************
* *
* W E L C O M E to R O O T *
* *
* Version 4.00/04 25 August 2004 *
* *
* You are welcome to visit our Web site *
* http://root.cern.ch *
* *
*******************************************
FreeType Engine v2.1.3 used to render TrueType fonts.
Compiled for linux with thread support.
CINT/ROOT C/C++ Interpreter version 5.15.133, Apr 18 2004
Type ? for help. Commands must be C++ statements.
Enclose multiple statements between { }.
root [0]
Processing /u/manteufe/proof_temp/proofstarter.C("proof://lxb035:1095")...
manteufe@localhost password:
On the ROOT-shell you will be asked for your password.
After entering it correctly;-) you get this output on your screen:
PROOF set to parallel mode (3 slaves)
root [1]
You are now working on the ROOT command line. To get more information
about your PROOF session, just type:
gProof->Print()
Connected to: localhost (valid)
Port number: 1095
User: manteufe
Security context: Method: 0 (UsrPwd) not reusable
Proofd protocol version: 9
Client protocol version: 3
Remote protocol version: 3
Log level: 0
*** Master server (parallel mode, 3 slaves):
Master host name: localhost
Port number: 1095
User: manteufe
Protocol version: 3
Image name: localhost
Working directory: /u/manteufe/proof/master-localhost-1101412200-5728
Config directory: /usr/local/pub/debian3.0/gcc295-04/rootmgr/400-04
Config file: /u/manteufe/.proof.conf
Log level: 0
Number of slaves: 3
Number of active slaves: 3
Number of unique slaves: 3
Number of bad slaves: 0
Total MB's processed: 0.00
Total real time used (s): 0.000
Total CPU time used (s): 0.000
If you don't load up the libs automatically in your rootlogon.C script, you have to do it manually:
.L libTMytrackerhit.so;
.L TCounter_C.so;
Upload and enable your package by typing:
gProof->UploadPackage("/...your path.../libTMytrackerhit.par");
gProof->EnablePackage("libTMytrackerhit.par");
(Int_t)1
(Int_t)1
(Int_t)1
(Int_t)1
After this you can build your analysis-tree:
TDSet *set = new TDSet("TTree","hittree");
set->Add("/...your path.../hitfile.root");
Now it's time to start the analysis:
set->Process("/...your path.../Anaproof.C")
A small new window opens which shows the progress of the analysis, and you get this output:
The future of PROOF, news and FAQ
The latest ROOT version tested is 403-02.
GLOBUS-support
Currently we are trying to enable GLOBUS authentication for ROOT/PROOF.
This feature will be accessible through the new prooflogin parameter
-auth 0 (for UsrPwd, which is the standard) or 3 (for GLOBUS)
Once this feature is implemented you will see it in the prooflogin help.
PROOF@GSI/FZK
To provide more computing power we are planning to expand our PROOF environment
to the PBS cluster of the FZK.
To use PROOF@GSI/FZK you only need a GLOBUS certificate.
For a proper setup it is recommended that you pass the names and
locations of your ROOT files to the prooflogin script.
This feature is used automatically if you have a valid GLOBUS certificate and ROOT files
at the FZK.
How to get a GLOBUS certificate
Refer to digital certificate in this topic.
Links