Configuring a galaxy instance on a VM connected to a cluster

Configuring a cluster instance is similar to installing galaxy on a VM in the cloud, but it requires a few extra configuration steps to make it work.

Setting the cluster type

In host_vars/cluster_example_host/cluster_settings.yml, set your cluster type with galaxy_docker_cluster_type. Currently only sge is supported, but the ansible role can easily be extended to other cluster types. Open an issue or create a pull request on
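As a minimal sketch, the setting described above would look roughly like this in the cluster settings file under host_vars/cluster_example_host. Only the variable name and the value sge come from the text; the comment is illustrative.

```yaml
# Select the cluster type; sge is the only value currently supported.
galaxy_docker_cluster_type: sge
```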

Integrating with the user management system on the cluster

Docker containers can be run by root and by users in the docker group. Docker runs with root rights by default, and this carries over to the mounted filesystems. Files on a mounted filesystem are created with the UIDs used inside the docker container, and these probably do not match the users on the cluster.

Also, jobs are submitted with the name and UID of the main galaxy user. Your cluster may not accept jobs from an unknown user, and this user also needs access to the files on the cluster.

In host_vars/cluster_example_host/cluster_settings.yml there is a section for setting the UIDs and names to those of users that already exist on your cluster. This way you can use service accounts on your cluster to run galaxy.

In order to do this galaxy-launcher has to build a custom image with the right UIDs. You can set the base version of this image by using the bgruening_galaxy_stable_version variable.

Setting up ssh key pairs for galaxy users

galaxy-launcher uses three users to run the playbook:

* galaxy_docker_docker_user, which runs the docker container.
* galaxy_docker_web_user, which runs the web application and submits jobs to the cluster.
* galaxy_docker_database_user, which manipulates the database.

When the playbook is run on the VM as a user with sudo rights, the playbook will use sudo to switch between these users.

There is also an option to run without sudo. To do this, ssh key pairs need to be set up for all three users. The paths to the private keys can be set with galaxy_docker_web_user_private_key, galaxy_docker_docker_user_private_key and galaxy_docker_database_user_private_key.
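A sketch of what the three key settings might look like is given here. The variable names are the ones listed above; the paths are hypothetical placeholders, and the text does not say which settings file these variables belong in.

```yaml
# Hypothetical key locations; substitute the paths to your own private keys.
galaxy_docker_docker_user_private_key: "/home/deploy/.ssh/galaxy_docker_user_key"
galaxy_docker_web_user_private_key: "/home/deploy/.ssh/galaxy_web_user_key"
galaxy_docker_database_user_private_key: "/home/deploy/.ssh/galaxy_database_user_key"
```

Each private key needs its matching public key authorized for the corresponding user on the VM, as with any ssh key pair.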

Integrating with the filesystem

There are certain quirks to using the galaxy-stable image on a cluster. Internally, most paths point to either /export or /galaxy-central. These paths do not exist on the cluster.

In host_vars/cluster_example_host/cluster_settings.yml you can see the galaxy_docker_shared_cluster_directory variable. If it is set, this directory is automatically mounted into the galaxy docker container. In host_vars/cluster_example_host/galaxy_settings.yml you can see that paths are set relative to this directory or to galaxy_docker_export_location, which is itself set relative to galaxy_docker_shared_cluster_directory in docker_settings.yml.

By using this configuration the galaxy main process will look for its files on the cluster file system instead of using /export. This is essential for running jobs on a cluster.
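The relationship between the two directory variables described above can be sketched as follows. The variable names come from the text; the mount point /shared/galaxy and the export subdirectory name are assumptions chosen for illustration, so check the example host_vars files for the real values.

```yaml
# cluster_settings.yml
# Hypothetical directory on the shared cluster filesystem.
galaxy_docker_shared_cluster_directory: "/shared/galaxy"

# docker_settings.yml
# Export location defined relative to the shared directory, using
# Ansible's Jinja2 templating. The "export" subdirectory is an assumption.
galaxy_docker_export_location: "{{ galaxy_docker_shared_cluster_directory }}/export"
```

Defining one path in terms of the other means that changing the shared directory moves every dependent path with it.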

Further integration

The galaxy_docker_extra_volumes variable allows you to mount extra volumes to your container.

The galaxy_docker_extra_ports variable allows you to open extra ports on your container.

The galaxy_docker_custom_image_lines variable allows you to add extra lines to the Dockerfile before the custom image for your cluster is built. You can add lines such as RUN apt-get install sssd or ADD sssd.conf /etc/sssd/sssd.conf. If you use this functionality, you can add the files you need to files/HOSTNAME/docker_custom_image.
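As an illustration, the sssd example above might be configured as shown here. The text only says that the variable holds extra Dockerfile lines; representing them as a YAML list with one instruction per entry is an assumption, as are the apt-get update step and the -y flag, which are added so that the install runs non-interactively during the build.

```yaml
# Assumed format: a list of Dockerfile instructions, one per entry.
galaxy_docker_custom_image_lines:
  - "RUN apt-get update && apt-get install -y sssd"
  # sssd.conf is expected in files/HOSTNAME/docker_custom_image
  - "ADD sssd.conf /etc/sssd/sssd.conf"
```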