Scsilhouette Modules
Nextflow modules for the scsilhouette silhouette-score QC branch of the
sc-nsforest-qc-nf workflow. These modules run inside the
ghcr.io/nih-nlm/scsilhouette:1.0 container and are orchestrated
by main.nf.
The scsilhouette package computes silhouette scores and produces
visualizations that integrate them with NSForest F-scores. For full details see the
scsilhouette repository and
scsilhouette documentation.
Execution order (runs in parallel with the NSForest branch):
1. compute_silhouette_process: silhouette scores + cluster summary
2. viz_summary_process: silhouette + F-score summary plot
3. viz_distribution_process: cluster size vs silhouette distribution
4. viz_dotplot_process: UMAP/embedding coloured by silhouette
Compute Silhouette Process
compute_silhouette_process
Source: modules/scsilhouette/compute_silhouette.nf
Compute Silhouette Module
Computes per-cell silhouette scores for each cluster using the specified embedding. Saves per-cell scores, per-cluster summary statistics, and an annotation JSON for downstream viz processes.
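To make the per-cell score concrete, here is a minimal pure-Python sketch of the standard silhouette definition, s = (b - a) / max(a, b), using Euclidean distance on embedding coordinates. It is an illustration only, not the scsilhouette implementation: the function name `silhouette_samples`, the choice of metric, and the toy coordinates are all assumptions.

```python
import math
from collections import defaultdict


def silhouette_samples(points, labels):
    """Return one silhouette score per point (Euclidean distance assumed)."""
    # Group point indices by cluster label.
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette scores need at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            # Common convention: a singleton cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Toy 2D embedding with two well-separated clusters (made-up values).
embedding = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
cluster_labels = ["A", "A", "B", "B"]
per_cell_scores = silhouette_samples(embedding, cluster_labels)
```

Scores close to 1 indicate cells that sit well inside their own cluster; scores near 0 or below indicate cells on or across a cluster boundary.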
Input:
h5ad: Path to adata_filtered.h5ad
Output:
{organ}_{first_author}_{journal}_{year}_{cluster_header_safe}_{embedding_safe}_{vid}_cluster_summary.csv
{organ}_{first_author}_{journal}_{year}_{cluster_header_safe}_{embedding_safe}_{vid}_annotation.json
Params referenced:
params.outdir
params.publish_mode
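The output file names of this module share a common prefix built from dataset metadata. The sketch below shows one way such a prefix could be assembled. The `make_safe` sanitizer is hypothetical (the actual rule behind `cluster_header_safe` and `embedding_safe` is not documented here), and the example values are placeholders.

```python
import re


def make_safe(value):
    """Hypothetical sanitizer: collapse runs of non-alphanumerics into '_'."""
    return re.sub(r"[^A-Za-z0-9]+", "_", value).strip("_")


def output_prefix(organ, first_author, journal, year, cluster_header, embedding, vid):
    """Join the metadata fields in the order used by the output file names."""
    return "_".join(
        [
            organ,
            first_author,
            journal,
            str(year),
            make_safe(cluster_header),  # -> {cluster_header_safe}
            make_safe(embedding),       # -> {embedding_safe}
            vid,
        ]
    )


# Placeholder metadata, for illustration only.
prefix = output_prefix("kidney", "Doe", "Nature", 2024, "cell type", "X_umap", "v1")
summary_name = f"{prefix}_cluster_summary.csv"
annotation_name = f"{prefix}_annotation.json"
```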
Compute Summary Stats Process
compute_summary_stats_process
Source: modules/scsilhouette/compute_summary_stats.nf
Compute Summary Statistics
Creates dataset-level summary statistics from cluster summaries. Computes median-of-medians and other aggregate metrics across all clusters.
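As a rough illustration of the median-of-medians idea, the sketch below reads a per-cluster summary and aggregates the per-cluster medians into dataset-level values. The column name `median_silhouette`, the function name, and the CSV contents are assumptions made for this example; the real cluster summary file may use different headers.

```python
import csv
import io
import statistics


def dataset_summary(cluster_rows, median_column="median_silhouette"):
    """Aggregate per-cluster median silhouette scores into dataset-level values."""
    medians = [float(row[median_column]) for row in cluster_rows]
    return {
        "n_clusters": len(medians),
        "median_of_medians": statistics.median(medians),
        "mean_of_medians": statistics.mean(medians),
        # Sample standard deviation is undefined for a single cluster.
        "std_of_medians": statistics.stdev(medians) if len(medians) > 1 else 0.0,
    }


# Made-up cluster summary standing in for {prefix}_cluster_summary.csv.
example_csv = io.StringIO(
    "cluster,median_silhouette\n"
    "A,0.10\n"
    "B,0.50\n"
    "C,0.30\n"
)
summary = dataset_summary(list(csv.DictReader(example_csv)))
```

Taking the median of the per-cluster medians gives every cluster equal weight regardless of its cell count, so a few very large clusters do not dominate the dataset-level figure.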
Input:
silhouette_scores: {prefix}_silhouette_scores.csv
cluster_summary: {prefix}_cluster_summary.csv
annotation: {prefix}_annotation.json
nsforest_results: {prefix}_results.csv (or NO_FILE sentinel)
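The NSForest results input is optional: when that branch produced nothing, a placeholder named NO_FILE is passed instead. A script consuming these inputs could detect the placeholder as sketched below. This assumes the sentinel is a staged file literally named `NO_FILE`; the helper `optional_input` is hypothetical.

```python
from pathlib import Path

NO_FILE = "NO_FILE"  # assumed name of the placeholder file


def optional_input(path_str):
    """Return a Path, or None when the NO_FILE placeholder was passed."""
    path = Path(path_str)
    return None if path.name == NO_FILE else path


# With a real results file the path is kept; with the sentinel it becomes None.
with_results = optional_input("example_results.csv")
without_results = optional_input("NO_FILE")
```

Downstream code can then skip the F-score columns when the returned value is `None`, rather than failing on a missing file.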
Output:
median/mean/std silhouette, quality tier counts, median/mean F-score, doi, collection_name, dataset_title, journal
Params referenced:
params.outdir
params.publish_mode
Viz 2D Projection Process
viz_2D_projection_process
Source: modules/scsilhouette/viz_2D_projection.nf
Viz 2D Projection Module
Generates an embedding scatter plot (UMAP/t-SNE/etc.) coloured by cluster identity, saved as both HTML (interactive) and SVG.
Input:
h5ad: Path to adata_filtered.h5ad
Output:
Params referenced:
params.outdir
params.publish_mode
Viz Distribution Process
viz_distribution_process
Source: modules/scsilhouette/viz_distribution.nf
Viz Distribution Module
Generates distribution plots of cluster cell counts (raw and log10) overlaid with mean/median silhouette scores per cluster.
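The raw and log10 cell counts behind such a plot can be derived from the cluster labels alone. The following is a small sketch of that step under stated assumptions: the function name and the returned field names are invented for illustration, and no plotting is shown.

```python
import math
from collections import Counter


def cluster_size_table(labels):
    """Count cells per cluster and add the log10 of each count."""
    counts = Counter(labels)
    return [
        {"cluster": cluster, "n_cells": n, "log10_n_cells": math.log10(n)}
        for cluster, n in sorted(counts.items())
    ]


# Made-up labels: one large and one small cluster.
labels = ["A"] * 100 + ["B"] * 10
size_table = cluster_size_table(labels)
```

The log10 scale keeps small clusters visible when cluster sizes span several orders of magnitude, which is why both versions are plotted.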
Input:
silhouette_scores: {prefix}_silhouette_scores.csv
cluster_summary: {prefix}_cluster_summary.csv
annotation: {prefix}_annotation.json
Output:
Params referenced:
params.outdir
params.publish_mode
Viz Summary Process
viz_summary_process
Source: modules/scsilhouette/viz_summary.nf
Viz Summary Module
Generates an interactive summary plot that combines silhouette scores with NSForest F-scores per cluster. Also writes a dataset-level summary CSV.
Input:
silhouette_scores: {prefix}_silhouette_scores.csv
cluster_summary: {prefix}_cluster_summary.csv
annotation: {prefix}_annotation.json
nsforest_results: {prefix}_results.csv (or NO_FILE sentinel)
Output:
Params referenced:
params.outdir
params.publish_mode