[hpc-devel] slurm

Michael Shigorin mike at osdn.org.ua
Tue Sep 18 16:45:57 MSD 2007


On Tue, Sep 18, 2007 at 04:32:47PM +0400, Stanislav Ievlev wrote:
> slurm: what is it an analogue of, or a replacement for?

This is SLURM, the Simple Linux Utility for Resource Management. SLURM
is an open-source cluster resource management and job scheduling system
that strives to be simple, scalable, portable, fault-tolerant, and
interconnect agnostic. SLURM currently has been tested only under Linux.

(the full README and DISCLAIMER are in the attachments)

> Is it worth including in the distribution, and is that possible?

Debian main has it as slurm-llnl.

-- 
 ---- WBR, Michael Shigorin <mike at altlinux.ru>
  ------ Linux.Kiev http://www.linux.kiev.ua/
----------- next part -----------
This is SLURM, the Simple Linux Utility for Resource Management. SLURM
is an open-source cluster resource management and job scheduling system
that strives to be simple, scalable, portable, fault-tolerant, and
interconnect agnostic. SLURM currently has been tested only under Linux.

As a cluster resource manager, SLURM provides three key functions. First,
it allocates exclusive and/or non-exclusive access to resources
(compute nodes) to users for some duration of time so they can perform
work. Second, it provides a framework for starting, executing, and
monitoring work (normally a parallel job) on the set of allocated
nodes. Finally, it arbitrates conflicting requests for resources by
managing a queue of pending work.

SLURM is provided "as is" and with no warranty. This software is
distributed under the GNU General Public License; please see the files
COPYING, DISCLAIMER, and LICENSE.OpenSSL for details.

This README presents an introduction to compiling, installing, and
using SLURM.


SOURCE DISTRIBUTION HIERARCHY
-----------------------------

The top-level distribution directory contains this README as well as
other high-level documentation files, and the scripts used to configure
and build SLURM (see INSTALL). Subdirectories contain the source-code
for SLURM as well as a DejaGNU test suite and further documentation. A
quick description of the subdirectories of the SLURM distribution follows:

  src/        [ SLURM source ]
     SLURM source code is further organized into self-explanatory
     subdirectories such as src/api, src/slurmctld, etc.

  doc/        [ SLURM documentation ]
     The documentation directory contains papers, READMEs, and guides
     in LaTeX, HTML, and ASCII text. Manual pages for the SLURM
     commands and configuration files are also under the doc/ directory.

  etc/        [ SLURM configuration ] 
     The etc/ directory contains a sample config file, as well as
     some scripts useful for running SLURM.

  slurm/      [ SLURM include files ]
     This directory contains installed include files, such as slurm.h
     and slurm_errno.h, needed for compiling against the SLURM API.

  testsuite/  [ SLURM test suite ]
     The testsuite directory contains the framework for a set of 
     DejaGNU and "make check" type tests for SLURM components.
     There is also an extensive collection of Expect scripts.

  auxdir/     [ autotools directory ]
     Directory for autotools scripts and files used to configure and
     build SLURM.
  
  contribs/   [ helpful tools outside of SLURM proper ]
     Directory for anything outside of SLURM proper, such as an
     alternative API.  To build it, run
     make contrib/install-contrib.

COMPILING AND INSTALLING THE DISTRIBUTION
-----------------------------------------

Please see the INSTALL file for basic instructions. You will need a
working installation of OpenSSL.

SLURM does not use reserved ports to authenticate communication
between components. You will need to have at least one "auth"
plugin. Currently, only three authentication plugins are available:
"auth/none," "auth/authd," and "auth/munge." The "auth/none" plugin is
built and used by default, but either Brent Chun's authd or Chris
Dunlap's Munge should be installed to get properly authenticated
communications.  The configure script in the top-level directory of this
distribution will determine which authentication plugins may be built.
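
As an illustration of selecting a plugin, the fragment below is a minimal
sketch of the relevant slurm.conf line. It assumes Munge is installed and
that configure found it, so the "auth/munge" plugin was actually built;
check this against the annotated sample config before relying on it.

```
# slurm.conf fragment (sketch): use Munge instead of the default auth/none.
AuthType=auth/munge
```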


OpenSSL:
http://www.openssl.org

AUTHD:
http://www.theether.org/authd/

MUNGE:
http://www.llnl.gov/linux/munge/


CONFIGURATION
-------------

An annotated sample configuration file for SLURM is provided with this
distribution as etc/slurm.conf.example. Edit this config file to suit
your site and cluster, then copy it to `$sysconfdir/slurm.conf,' where
sysconfdir defaults to PREFIX/etc unless explicitly overridden in the
`configure' or `make' steps.
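
The default rule above can be sketched in a few lines of shell. The two
helper functions here, slurm_conf_path and install_slurm_conf, are
hypothetical names invented for this example and are not part of SLURM;
the prefixes shown are placeholders for whatever was passed to configure.

```shell
# Resolve where slurm.conf belongs: sysconfdir if given, else PREFIX/etc.
slurm_conf_path() {
    prefix="$1"
    sysconfdir="${2:-$prefix/etc}"
    printf '%s\n' "$sysconfdir/slurm.conf"
}

# Copy an edited example config into place, creating the directory if needed.
# Arguments: example file, PREFIX, optional sysconfdir override.
install_slurm_conf() {
    example="$1"
    target="$(slurm_conf_path "$2" "${3:-}")"
    mkdir -p "$(dirname "$target")" && cp "$example" "$target" && echo "$target"
}

slurm_conf_path /usr/local              # prints /usr/local/etc/slurm.conf
slurm_conf_path /usr/local /etc/slurm   # prints /etc/slurm/slurm.conf
```

From the top of the source tree, a call such as
`install_slurm_conf etc/slurm.conf.example /usr/local` would then copy the
edited sample into place, given write access to the target directory.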

Once the config file is installed in the proper location, you'll need
to create the keys for SLURM job credential creation and verification.
The following openssl commands should be used:

 > openssl genrsa -out /path/to/private/key 1024
 > openssl rsa -in /path/to/private/key -pubout -out /path/to/public/key

The private key and public key locations should be those specified by
JobCredentialPrivateKey and JobCredentialPublicCertificate in the SLURM
config file.
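
For reference, the matching slurm.conf entries might look like the sketch
below. The paths are the same placeholders used in the openssl commands
above, not defaults; substitute the real key locations for your site.

```
# slurm.conf fragment (sketch). Both paths are placeholders.
JobCredentialPrivateKey=/path/to/private/key
JobCredentialPublicCertificate=/path/to/public/key
```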


RUNNING SLURM
-------------

Once a valid configuration has been set up and installed, the SLURM
controller, slurmctld, should be started on the primary and backup
control machines, and the SLURM compute node daemon, slurmd, should be
started on each compute server.

The slurmd daemons need to run as root for production use, but may be
run as a user for testing purposes (obviously no jobs may be run as
any other user in that configuration). The SLURM controller, slurmctld,
needs to be run as the configured SlurmUser (see your config file).
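
The placement and privilege rules above can be summarized as a small shell
sketch. The function names and the role labels "controller" and "compute"
are invented for this example; the sketch only reports what should run and
does not start any daemon.

```shell
# Print the daemon a host should run, given its role in the cluster.
slurm_daemon_for_role() {
    case "$1" in
        controller) echo slurmctld ;;   # primary and backup control machines
        compute)    echo slurmd ;;      # every compute server
        *)          echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# slurmd can run jobs as other users only when started as root (uid 0);
# any other uid is suitable for testing only.
slurmd_mode_for_uid() {
    if [ "$1" -eq 0 ]; then echo production; else echo testing; fi
}

slurm_daemon_for_role controller   # prints slurmctld
slurmd_mode_for_uid "$(id -u)"     # mode for the current user
```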

Man pages are the best source of information about SLURM commands and
daemons. Please see: slurmctld(8), slurmd(8), scontrol(1), sinfo(1),
squeue(1), scancel(1), and srun(1).

Also, take a look at the Quickstart Guide to get acquainted with
running and managing jobs with SLURM: doc/html/quickstart_admin.html
or PREFIX/share/doc/quickstart_admin.html.


PROBLEMS
--------

If you experience problems compiling, installing, or running SLURM
please send e-mail to slurm-dev at lists.llnl.gov.

$Id: README 11977 2007-08-09 17:49:08Z da $
----------- next part -----------
Copyright (C) 2002-2006 The Regents of the University of California.
Produced at Lawrence Livermore National Laboratory, Hewlett-Packard, 
Linux NetworX, and other sites.

Written by:
Ernest Artiaga <ernest.artiaga at bsc.es>
Danny Auble <auble1 at llnl.gov>
Susanne Balle <susanne.balle at hp.com>
Daniel Christians <Daniel.Christians at hp.com>
Chris Dunlap <cdunlap at llnl.gov>
Joey Ekstrom <ekstrom1 at llnl.gov>
Jim Garlick <garlick at llnl.gov>
Mark Grondona <grondona1 at llnl.gov>
Christopher Holmes <cholmes at hp.com>
Takae Hatazaki <takao.hatazaki at hp.com>
Nathan Huff <nhuff at geekshanty.com>
David Jackson <jacksond at clusterresources.com>
Greg Johnson <gjohnson at lanl.gov>
Morris Jette <jette1 at llnl.gov>
Jason King <king49 at llnl.gov>
Chris Morrone <morrone2 at llnl.gov>
Brian O'Sullivan <bos at pathscale.com>
Daniel Palermo <dan.palermo at hp.com>
Dan Phung <phung4 at llnl.gov>
Andy Riebs <Andy.Riebs at hp.com>
Jeff Squyres <jsquyres at lam-mpi.org>
Keven Tew <tew1 at llnl.gov>
Jay Windley <jwindley at lnxi.com>

UCRL-CODE-226842.

This file is part of SLURM, a resource management program.
For details, see <http://www.llnl.gov/linux/slurm/>.

SLURM is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your option)
any later version.

SLURM is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more
details.

You should have received a copy of the GNU General Public License along
with SLURM; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301  USA.


OUR NOTICE AND TERMS AND CONDITIONS OF THE GNU GENERAL PUBLIC LICENSE

Our Preamble Notice

A. This notice is required to be provided under our contract with the U.S.
Department of Energy (DOE). This work was produced at the University 
of California, Lawrence Livermore National Laboratory under Contract 
No. W-7405-ENG-48 with the DOE.

B. Neither the United States Government nor the University of California 
nor any of their employees, makes any warranty, express or implied, or 
assumes any liability or responsibility for the accuracy, completeness, or 
usefulness of any information, apparatus, product, or process disclosed, or 
represents that its use would not infringe privately-owned rights.

C. Also, reference herein to any specific commercial products, process, or 
services by trade names, trademark, manufacturer or otherwise does not 
necessarily constitute or imply its endorsement, recommendation, or 
favoring by the United States Government or the University of California. 
The views and opinions of authors expressed herein do not necessarily 
state or reflect those of the United States Government or the University of 
California, and shall not be used for advertising or product endorsement 
purposes.

The precise terms and conditions for copying, distribution and modification 
are provided in the file named "COPYING" in this directory.

