Installation

Here you can find a set of containers that help create a data science core architecture, for example:

  • LDAP server
  • PostgreSQL server
  • NFS server (Not active)
  • Samba server (for integration with sequencer technology like Illumina)
  • Galaxy server
  • Zabbix server
  • Software server (i.e. a lot of pre-installed software)
  • Interactive Compute server (a place for users to log in)
  • Exploratory Analysis server (JupyterHub with JupyterLab)
  • SLURM grid configuration

There is a focus on bioinformatics, but the infrastructure can be used for other applications.

Base images

We use Alpine Linux for simple servers (very small footprint) and Ubuntu for larger images. Some containers may be derived from Debian images instead.

Dependencies

  • Python 2.7 (for ansible) and 3.5+
  • PyYAML
  • Docker
  • Ansible
  • docker-py (packaged as python-docker on Ubuntu)
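A quick way to see which of these dependencies are missing is a small shell check. The sketch below is not part of the repository; the function names missing_commands and missing_modules are hypothetical, and it assumes the command names docker, ansible-playbook and python3, and that PyYAML and docker-py are imported as yaml and docker. It only checks python3; the modules may also need to be importable by whichever interpreter Ansible uses.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is not on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
    return 0
}

# Hypothetical helper: print each Python module the interpreter cannot import.
# $1 is the interpreter (e.g. python3); the remaining arguments are module names.
missing_modules() {
    py="$1"
    shift
    for mod in "$@"; do
        "$py" -c "import $mod" >/dev/null 2>&1 || printf '%s\n' "$mod"
    done
    return 0
}

# Assumed names: PyYAML is imported as "yaml", docker-py as "docker".
missing_commands docker ansible-playbook python3
missing_modules python3 yaml docker
```

Each missing item is printed on its own line; no output means everything checked was found.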

Todo

Check Python version for ansible (conda...)

If you use the setup wizard (strongly recommended for a first install), you also need:

  • Flask
  • openssl and pyOpenSSL (if you need to generate keys)

(explain with conda)

Installation

This will install all your servers on the local machine. If you have a very large (big-iron) machine, this might be what you want. If you have a cluster, this is still a reasonable starting point, though you will have some work to do, especially on the security front.

  1. Use the wizard to configure the most complicated stuff: ./run_wizard.sh
  2. Create a virtual network called virtual_core: docker network create virtual_core. Make sure this network exists every time you start the system.
  3. Create a directory that will store all your docker volumes. This might need to be very big.
  4. cp etc/hosts.sample etc/hosts (you will want to edit this in the future)
  5. cd _instance/ansible; ansible-playbook --ask-pass -i ../../etc/hosts main.yml
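Because the virtual_core network from step 2 has to exist every time the system starts, it can help to make its creation idempotent. The following is a minimal sketch, not something shipped with this repository; the function name ensure_network is hypothetical. It relies only on docker network inspect and docker network create.

```shell
#!/bin/sh
# Hypothetical helper: create the named Docker network only if it is missing.
ensure_network() {
    name="$1"
    if docker network inspect "$name" >/dev/null 2>&1; then
        # inspect succeeded, so the network is already there
        echo "network $name already exists"
    else
        docker network create "$name"
    fi
}

# Example call (left commented out so that sourcing this file has no side effects):
# ensure_network virtual_core
```

Calling ensure_network virtual_core from whatever script starts the system is safe to repeat: an existing network is left alone, whereas running docker network create again would fail with an "already exists" error.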