Outputs

When running the Nextflow workflow we provided a --outdir "my_results_directory" parameter. This outputs all results into the provided directory; note: if Nextflow can't find the directory it will create the entire filepath provided.
Additionally, files that don't change between runs (e.g. downloaded HMM models) are stored for long term use into the path specified by --outdir_download_cache.

Within the --outdirdirectory (show above as set to "my_results_directory") the workflow will create two directories ./socialgene_neo4j and ./socialgene_neo4j.

┣ 📂 socialgene_per_run
┗ 📂 socialgene_neo4j

┣ 📂 socialgene_per_run (contains files that are specific to the Nextflow run but not included in the database)
  ┣ 📂 antismash_results
  ┣ 📂 blastp_cache (DIAMOND database)
  ┣ 📂 hmm_cache (processed HMM files)
  ┣ 📂 mmseqs_databases (MMseqs2 cluster database(s))
  ┣ 📂 nonredundant_fasta (non-redundant, indexed protein fasta)
  ┣ 📂 pipeline_info (stats about the workflow run)

┣ 📂 socialgene_neo4j
  - contains all the files for import into the Neo4j Database, as well as the Neo4j Database
  - To run the database, the Neo4j Docker image is pointed directly at the `📂 socialgene_neo4j` directory, the structure is important (Neo4j looks for each of the directories). If you delete one of the subdirectories (`data`, `import`, `logs`, `plugins`) and don't recreate it the Neo4j Docker image will make it again, but as root user, and you will only be able to delete if you have sudo permissions.
  ┣ 📂 data (contains the Neo4j database)
  ┣ 📂 import (contains gzipped tsv files that were used to import data into Neo4j)
  ┣ 📂 logs (Neo4j runtime logs)
  ┣ 📂 plugins (Neo4j plugins (apoc, gds, etc.))
  ┣ 📄 import.report (contains report of all warnings and errors during database import creation)