Configuring a galaxy instance on a vm connected to a cluster
Configuring a cluster instance is similar to installing galaxy on a vm in the cloud, but requires some tricks to make it work.
Setting the cluster type
In host_vars/cluster_example_host/cluster_settings.yml, set your cluster type with
galaxy_docker_cluster_type. Currently only
sge is supported, but the ansible role can easily be extended to other cluster types. Open an issue or create a pull request on
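A minimal sketch of this setting, assuming the example host layout used in this document. The value sge is the only one the text names as supported.

```yaml
# host_vars/cluster_example_host/cluster_settings.yml
# Selects the cluster backend; only sge is currently supported.
galaxy_docker_cluster_type: sge
```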
Integrating with the user management system on the cluster
Docker containers can be run by root and by users in the docker group. Docker has root rights by default, and this carries over to the mounted filesystems: files on a mounted filesystem are created with the UIDs used inside the docker container, and these probably do not match the users on the cluster.
Also, the job is submitted with the name and UID of the main galaxy user. Your cluster may not accept jobs from unknown users, and the user needs access to the files on the cluster.
In host_vars/cluster_example_host/cluster_settings.yml there is a section on how to set UIDs and names to match already existing users on your cluster. This way you can use service accounts on your cluster to run galaxy.
To do this, galaxy-docker-ansible has to build a custom image with the right UIDs. You can set the base version of this image by using the
Setting up ssh key pairs for galaxy users
Galaxy-docker-ansible uses three users to run the playbook:
galaxy_docker_docker_user, which runs the docker container.
galaxy_docker_web_user, which runs the web application and submits jobs to the cluster.
galaxy_docker_database_user, which manipulates the database.
When the playbook is run on the vm with a user that has sudo rights, the playbook will use sudo to switch users.
There is also an option to run without sudo. To do this, ssh key pairs need to be set up for all three users. The path to the private keys can be set by
Integrating with the filesystem
There are certain quirks to using the galaxy-stable image on a cluster. Internally, most paths point to locations inside the container, such as
/galaxy-central. These paths do not exist on the cluster.
In host_vars/cluster_example_host/cluster_settings.yml you can see the
galaxy_docker_shared_cluster_directory variable. If it is set, this directory is automatically mounted into the galaxy docker container.
In host_vars/cluster_example_host/galaxy_settings.yml you can see that paths are set relative to this directory or to
galaxy_docker_export_location, which is itself set relative to
With this configuration, the galaxy main process looks for its files on the cluster file system instead of using
/export. This is essential for running jobs on a cluster.
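As a rough sketch, setting the shared directory could look like the fragment below. The path /shared/galaxy is a hypothetical placeholder, not a value from the role. Use a directory that exists on your cluster file system.

```yaml
# host_vars/cluster_example_host/cluster_settings.yml
# Hypothetical path: replace with a directory on your cluster file system.
# When set, it is mounted into the galaxy docker container automatically.
galaxy_docker_shared_cluster_directory: /shared/galaxy
```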
The galaxy_docker_extra_volumes variable allows you to mount extra volumes to your container.
The galaxy_docker_extra_ports variable allows you to open extra ports on your container.
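The exact format of these two variables is not described here. The sketch below assumes they take lists of Docker-style mappings (host:container); check the example host_vars before relying on this. All paths and port numbers are hypothetical.

```yaml
# Assumption: both variables accept lists of Docker-style mappings.
galaxy_docker_extra_volumes:
  - "/data/references:/data/references"  # hypothetical host_path:container_path
galaxy_docker_extra_ports:
  - "8021:21"                            # hypothetical host_port:container_port
```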
The galaxy_docker_custom_image_lines variable allows you to add extra lines to the Dockerfile before the custom image for your cluster is built. You can add lines such as
RUN apt-get install sssd or
ADD sssd.conf /etc/sssd/sssd.conf. If you want to use this functionality, you can add files to
files/HOSTNAME/docker_custom_image.
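A possible way to write the two example lines, assuming the variable takes a list of Dockerfile instructions (the list format is an assumption, not confirmed here). The apt-get update step and the -y flag are additions so that the install can run non-interactively during the image build. The ADD line presumably expects sssd.conf to be present in files/HOSTNAME/docker_custom_image.

```yaml
# Assumption: one Dockerfile instruction per list item.
galaxy_docker_custom_image_lines:
  # update and -y are added here so the build does not wait for input
  - "RUN apt-get update && apt-get install -y sssd"
  - "ADD sssd.conf /etc/sssd/sssd.conf"
```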