ug4
JuQueen

Architecture

JuQueen currently provides 458,752 cores (448 Ki) with 1 GB of RAM per core (448 TB of RAM in total).

JuQueen consists of 28 racks: 1 rack = 32 nodeboards, 1 nodeboard = 32 compute nodes.

Each compute node (CN) has a 16-core IBM PowerPC A2 processor running at 1.6 GHz and 16 GB of DDR3 SDRAM, i.e. 1 GB per core.
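
The figures above can be cross-checked with a few lines of shell arithmetic. This is only a sketch: it assumes 32 compute nodes per nodeboard, which is the value that reproduces the stated core count, and the variable names are chosen for illustration.

```shell
# Machine layout as described above (variable names are illustrative).
racks=28
nodeboards_per_rack=32
nodes_per_nodeboard=32   # assumption: the value that yields 458,752 cores
cores_per_node=16
ram_per_core_gb=1

nodes=$((racks * nodeboards_per_rack * nodes_per_nodeboard))
total_cores=$((nodes * cores_per_node))
total_ram_tb=$((total_cores * ram_per_core_gb / 1024))

echo "compute nodes: $nodes"
echo "cores:         $total_cores"
echo "RAM (TB):      $total_ram_tb"
```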

Note
Every core can execute four processes/threads, which leaves 256 MB of RAM per process. However, since our applications are memory bound, this is NOT recommended.
Keep in mind that a single PowerPC A2 core is slower than a core of other clusters or of current laptop processors.

Creating an account

  1. Get the project ID (note: not the project number) from someone like Sebastian, Andreas, Martin or Arne.
  2. Go here and enter your email address.
  3. Open the link in the email.
  4. Enter the project ID.
  5. Enter your name and the other personal details.
  6. Generate an SSH key in a terminal with
    ssh-keygen -b 2048 -t rsa
    Do NOT leave the passphrase empty! Remember your passphrase!
  7. Upload id_rsa.pub. Check "no" at "will processed personal data...", leave the DN of the user certificate blank, and check none of the software packages, so that the licence part can be left out.
  8. Wait for the mail, print out the PDF, sign it (in 2 places) and hand it over to your professor.

Resetting your ssh keys / password

What to do if

  • You lost your id_rsa.pub because it was deleted, overwritten or similar
  • Your SSH keys are no longer valid because you moved to another computer
  • You forgot your passphrase

Solution:

  1. Generate an SSH key with the command above.
  2. Go here, follow the instructions and upload the id_rsa.pub.

Access to JuQueen's Login Nodes

JuQueen is reached via so-called front-end or login nodes, which are used for interactive access and for the submission of batch jobs.

These login nodes are reached via

ssh <user>@juqueen.fz-juelich.de

That is, there is only one generic hostname for login, juqueen.fz-juelich.de, from which a connection to either juqueen1 or juqueen2 is established automatically.

Note
The front-end nodes have an identical environment, but multiple sessions of one user may reside on different nodes, which must be taken into account when killing processes.
The login nodes run Linux, while the compute nodes (CNs) run a lightweight, Linux-like kernel called Compute Node Kernel (CNK). It is therefore necessary to cross-compile for JuQueen. A binary compiled for the compute nodes will not run on the login nodes and vice versa.

The SSH key of the machine you connect from must be uploaded before you can reach one of JuQueen's login nodes. Normally you have already done this during the application process.

See Logging on to JuQueen (also for X11 problems).

To connect to JuQueen from different machines, you may find it useful to define one of GCSC's machines (e.g. speedo, quadruped, ...) as a "springboard" to JuQueen's login nodes (so that you log in to this machine first and then to JuQueen); see SSH Hopping.
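
One possible way to set up such a springboard is OpenSSH's ProxyJump option (available in OpenSSH 7.3 and later). The sketch below only prints a client configuration stanza; the function name, the springboard hostname and the user name are placeholders for illustration, and the SSH Hopping page may describe a different setup.

```shell
# Print an OpenSSH client configuration stanza that reaches JuQueen through
# an intermediate host. Both arguments are placeholders chosen by the caller.
print_juqueen_ssh_config() {
    jump_host=$1    # hypothetical: full hostname of the springboard machine
    user=$2         # your JuQueen user name
    cat <<EOF
Host juqueen
    HostName juqueen.fz-juelich.de
    User $user
    ProxyJump $jump_host
EOF
}

# Example: review the stanza, then append it to ~/.ssh/config yourself.
print_juqueen_ssh_config springboard.example.org jdoe
```

With such a stanza in place, "ssh juqueen" would open the connection in one step. Note that with ProxyJump the key offered to JuQueen is the one on the machine where ssh is started, not one stored on the springboard.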


Working environment on JuQueen's login nodes

Some additional software is only available if the appropriate module has been loaded for the current session via the module command. A short introduction to the module concept and the module command can be found here.

Note
The module concept allows for dynamic modification of your shell environment but only for the current session. To make those changes persistent you might want to add your module load/swap commands to your shell configuration file (~/.bashrc etc.).

To compile ug4, make sure that the cmake module has been loaded. You can, for example, add

module load cmake

to your ~/.bashrc file or simply execute it manually.

The modules lapack and cmake are necessary for ug4. Together with some ug4 and ugsubmit settings, a useful example setup is to add the following lines to your ~/.bash_profile:

...
#ug4
export UG4_ROOT=~/ug4/trunk
source $UG4_ROOT/scripts/shell/ugbash
export UGSUBMIT_TYPE=Juqueen
export UGSUBMIT_EMAIL=your.name@gcsc.uni-frankfurt.de
#load some modules
module load cmake
module load lapack
# get a nice prompt with host [full path]
source $UG4_ROOT/scripts/shell/prompt
...
Warning
We suggest using bash; we had some strange errors with other shells (no job output, $WORK set wrong).
Note
For this to work, you need to check out ug4.
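
If the same startup file is also read by shells in which the module command is not defined, the module lines can be guarded. The following is a minimal sketch; the helper name load_modules is made up for illustration and is not part of ug4 or of the module system.

```shell
# Load the given modules, but only if the `module` command exists in this shell.
load_modules() {
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found; skipping: $*" >&2
        return 1
    fi
    for name in "$@"; do
        module load "$name" || echo "could not load module: $name" >&2
    done
}

# The two modules named above; a missing module command is not treated as fatal.
load_modules cmake lapack || true
```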

Configuration of ug4 for JuQueen

For JuQueen you have to cross-compile, which requires a specific toolchain file. Start CMake like this:

cmake -DSTATIC_BUILD=ON -DCMAKE_TOOLCHAIN_FILE=../cmake/toolchain/juqueen.cmake ..

Static builds are the right choice if you want to run very large jobs. For the preferred build type "static", see also Very large Jobs on JuQueen.
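
The configure step can be wrapped in a small helper that creates the build directory first. This is a sketch under one assumption: the build directory sits directly inside the ug4 source tree, so that the relative toolchain path from the command above resolves. The function name and the dry-run argument are illustrative.

```shell
# Configure a static, cross-compiled ug4 build in a separate build directory.
# Assumes the build directory is a direct subdirectory of the ug4 source tree.
configure_juqueen_build() {
    build_dir=$1
    runner=${2:-}    # pass "echo" as second argument to only print the command
    mkdir -p "$build_dir" || return 1
    (
        cd "$build_dir" || exit 1
        $runner cmake -DSTATIC_BUILD=ON \
            -DCMAKE_TOOLCHAIN_FILE=../cmake/toolchain/juqueen.cmake ..
    )
}

# Dry run in a throwaway directory: prints the CMake command without running it.
configure_juqueen_build "$(mktemp -d)/build" echo
```

For a real build, call the function without the second argument from inside the source tree, e.g. configure_juqueen_build build.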

You can check your executable by running the (standard unix) ldd command ("list dynamic dependencies") on it:

ldd ugshell

The answer should be "not a dynamic executable" for a completely static build!
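
The check can be scripted by searching the ldd output for that message. The sketch below relies on the exact wording quoted above; other ldd versions may phrase it differently, and the function name is illustrative.

```shell
# Classify the output of `ldd <binary>`, read from standard input.
classify_ldd_output() {
    if grep -q "not a dynamic executable"; then
        echo "static"
    else
        echo "dynamic"
    fi
}

# Typical use (stderr is merged in case the message is printed there):
#   ldd ugshell 2>&1 | classify_ldd_output

# Demonstration with canned ldd output:
printf '\tnot a dynamic executable\n' | classify_ldd_output
printf '\tlibm.so.6 => /lib64/libm.so.6 (0x0000)\n' | classify_ldd_output
```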

Debug builds: no special requirements for the pre-installed GCC 4.4.6 (as of October 2012).

For debugging a parallel application on JuQueen, see Debugging on JuQueen.


Working with ug4 on JuQueen

Basic Job Handling

You can use ugsubmit to run your jobs on JuQueen. Make sure to source ug4/trunk/scripts/shell/ugbash and to export the variables

export UGSUBMIT_TYPE=Juqueen
export UGSUBMIT_EMAIL=your@email.com

e.g. in ~/.bashrc (of course you have to replace 'your@email.com' with your real email address).

Interactive jobs

Not possible at the moment!

Batch jobs

Read ugsubmit - Job Scheduling on Clusters for further instructions on unified ugsubmit usage.

Warning
Make sure to execute all batch jobs on the $WORK partition of JuQueen. Access to $HOME from the compute nodes is not only slow but will likely cause other trouble, too.
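
A simple guard can catch a submission from the wrong place before it happens. This is a sketch only: the function name is made up, it compares path strings without resolving symbolic links, and it assumes $WORK is set by the login environment.

```shell
# Succeeds if directory $1 (default: current directory) lies inside $WORK.
is_under_work() {
    dir=${1:-$PWD}
    [ -n "${WORK:-}" ] || return 1
    work=${WORK%/}    # tolerate a trailing slash in $WORK
    case "$dir/" in
        "$work"/*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Example: check the current directory before submitting a job from it.
if is_under_work; then
    echo "OK: $PWD is on the \$WORK partition"
else
    echo "Warning: $PWD is not under \$WORK; change directory before submitting" >&2
fi
```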

For manual job handling (not recommended; please modify ugsubmit for special needs instead), see Manual Job Handling using LoadLeveler (not recommended).