How to Convert a Conda Environment into an Apptainer Image with Snakemake: Improving Performance on Software-Defined Storage Systems

Managing complex workflows and dependencies in scientific computing can be challenging. Conda helps by managing dependencies and environments, while Apptainer (formerly Singularity) allows for containerized execution, especially in HPC environments. Snakemake, a powerful workflow management system, can integrate these tools to ensure reproducibility and scalability. Additionally, using Apptainer can significantly improve small-file performance on software-defined storage systems like CephFS. This guide walks you through converting a Conda environment into an Apptainer image using Snakemake.

Prerequisites

  1. Conda: Make sure Conda is installed.
  2. Apptainer or Singularity: Install Apptainer by following the official guide.
  3. Snakemake: Ensure Snakemake is installed.

Step-by-Step Guide

1. Generate a Dockerfile with Snakemake

First, use Snakemake to generate a Dockerfile for your workflow:

snakemake --containerize > Dockerfile

This command creates a Dockerfile that includes all the necessary dependencies for your Snakemake workflow. Make sure the workflow itself does not print any text to standard output, because the redirect would write that text into the Dockerfile as well.

Update 2025-11-05: Use Snakemake 9.13.3 or later so that all environments are included.
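
Because the redirect captures everything written to standard output, a stray print statement ends up inside the Dockerfile. The following is a minimal sketch of a sanity check, assuming only that a valid Dockerfile must start with a FROM (or ARG) instruction once blank lines and comments are skipped. The helper name check_dockerfile is hypothetical and not part of Snakemake or Docker.

```shell
# check_dockerfile is a hypothetical helper, not part of Snakemake or Docker.
# It returns 0 if the first non-blank, non-comment line of the given file
# is a FROM or ARG instruction (uppercase, as is conventional), and 1 otherwise.
check_dockerfile() {
    file="${1:-Dockerfile}"
    # Drop blank lines and comment lines, then keep the first remaining line.
    first_line=$(grep -v -E '^[[:space:]]*(#|$)' "$file" | head -n 1)
    case "$first_line" in
        FROM\ *|ARG\ *)
            return 0
            ;;
        *)
            echo "Unexpected first instruction in $file: $first_line" >&2
            return 1
            ;;
    esac
}
```

After generating the file, running the check before the build step should produce no error message:

check_dockerfile Dockerfile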

2. Build a Docker container and convert it to Apptainer

Build the container from the directory that contains the Dockerfile:

docker build -t local/workflow:latest .

Convert this to Apptainer with:

sudo singularity build workflow.sif docker-daemon://local/workflow:latest

In the past it was possible to convert a Dockerfile to an Apptainer recipe file with the help of Spython. The benefit was that no sudo rights were needed. However, this does not work at the moment.
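
Depending on the system, the runtime may be installed as apptainer, as singularity, or as both. The sketch below picks whichever command is available; the helper name container_cli is hypothetical, and it assumes both commands accept the same build arguments shown above.

```shell
# container_cli is a hypothetical helper: it prints the name of the first
# container runtime found on PATH, preferring apptainer over singularity.
container_cli() {
    for candidate in apptainer singularity; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    echo "Neither apptainer nor singularity found on PATH" >&2
    return 1
}
```

The build command can then be written as follows. Note that sudo may use a different PATH than your own shell, so this is a convenience rather than a guarantee:

sudo "$(container_cli)" build workflow.sif docker-daemon://local/workflow:latest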

3. Add the container to the Snakefile

Tell Snakemake where to find the container. Add the following line to your Snakefile at the global level of the workflow, outside any rule:

containerized: "/path/to/container.sif"

Note that the directive is spelled containerized, not containerize.
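
A misspelled directive or a path that does not point at an existing image is easy to miss. The following sketch checks both, assuming the directive is written on one line with a local path in double quotes. The helper name check_container_path is hypothetical.

```shell
# check_container_path is a hypothetical helper. It extracts the path from a
# line of the form  containerized: "/some/path.sif"  in the given Snakefile,
# prints it, and returns 0 only if that file exists.
check_container_path() {
    snakefile="${1:-Snakefile}"
    # Capture whatever sits between the double quotes after "containerized:".
    sif=$(sed -n -E 's/^containerized:[[:space:]]*"([^"]*)".*/\1/p' "$snakefile" | head -n 1)
    if [ -z "$sif" ]; then
        echo "No containerized: directive found in $snakefile" >&2
        return 1
    fi
    if [ ! -f "$sif" ]; then
        # Quotes in the message make stray spaces in the path visible.
        echo "Image not found: '$sif'" >&2
        return 1
    fi
    echo "$sif"
}
```

Because the path is compared exactly as written, a stray space inside the quotes is reported as a missing image.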

4. Run Your Snakemake Workflow with Apptainer

Finally, run your Snakemake workflow using the generated Apptainer image. Ensure the necessary directories are bound to the Apptainer container using the APPTAINER_BIND environment variable:

export APPTAINER_BIND="/cvmfs/softdrive.nl"
snakemake -c1 --use-conda --use-singularity

This command runs the Snakemake workflow inside the Apptainer container and produces the workflow's outputs (a VCF file in this example). For Snakemake 8 and later, use:

snakemake -c1 --software-deployment-method conda apptainer
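
If the same workflow has to run on machines with different Snakemake versions, the choice between the two flag styles can be scripted. This is a minimal sketch, assuming the cut-off is major version 8 and that snakemake --version prints a plain version string such as 9.13.3. The helper name deployment_flags is hypothetical.

```shell
# deployment_flags is a hypothetical helper. Given a Snakemake version string
# (for example "9.13.3"), it prints the matching software deployment flags.
# Assumption: the newer flag style applies from major version 8 onwards.
deployment_flags() {
    version="$1"
    major="${version%%.*}"   # keep everything before the first dot
    if [ "$major" -ge 8 ]; then
        echo "--software-deployment-method conda apptainer"
    else
        echo "--use-conda --use-singularity"
    fi
}
```

The flags are left unquoted on purpose so that the shell splits them into separate arguments:

snakemake -c1 $(deployment_flags "$(snakemake --version)")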

Addressing Small File Performance Issues

Using Apptainer containers can help alleviate performance issues associated with small files on software-defined storage systems like CephFS. These systems often struggle with the overhead of managing numerous small files, leading to degraded performance. By containerizing your environment, you encapsulate your dependencies and binaries into a single image file. This reduces the number of small files accessed directly from the storage system, thus improving I/O performance and overall workflow efficiency.
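
To get a feel for how many files the single image replaces, you can count the files in the Conda environments that Snakemake created. This is a rough sketch, assuming the environments live in the default .snakemake/conda directory of the workflow. The helper name count_files is hypothetical.

```shell
# count_files is a hypothetical helper: it prints the number of regular files
# below the given directory, each of which the storage system must track.
count_files() {
    # tr strips the padding that some wc implementations add to the count.
    find "$1" -type f | wc -l | tr -d ' '
}
```

Comparing the result of the following command with the single workflow.sif file shows how many individual files no longer have to be read from the storage system:

count_files .snakemake/conda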

Conclusion

By following these steps, you can package Conda environments into Apptainer images using Snakemake, which not only ensures reproducibility and scalability but also addresses small-file performance issues on software-defined storage systems like CephFS. This workflow enhances your computational environment's portability and efficiency, especially in HPC settings. If you encounter any issues or have questions, feel free to reach out for assistance. Happy computing!