Define the models and baselines to be tested.

Model to be tested.

type: string
default: NaiveDrugMeanPredictor

Model to be tested. See the documentation for a list of pre-implemented models. Can be multiple models separated by ','.

Baselines to be tested.

type: string
default: NaiveMeanEffectsPredictor

Baselines to be tested. See the documentation for a list of available models. For baselines, randomization and robustness tests are not run. The NaiveMeanEffectsPredictor will always be included.

Define where the pipeline should find input data and save output data.

Run name for the pipeline. The subdirectory in results will be named like this.

type: string
default: my_run

You will need to set a run identifier for the pipeline. This is used to create a unique output directory for each run.

Name of the dataset. Pre-supplied datasets are CTRPv2, CTRPv1, CCLE, GDSC1, GDSC2, TOYv1, TOYv2.

type: string
default: CTRPv2

Name of the dataset used for the pipeline. This can be either one of the provided datasets ('GDSC1', 'GDSC2', 'CCLE', 'CTRPv1', 'CTRPv2', 'TOYv1', 'TOYv2'), in which case the dataset with the fitted curves is downloaded, or a custom dataset name pointing either to raw viability measurements for automatic curve fitting or to pre-fit data (see the no_refitting option; not recommended, because differences in fitting procedures can reduce comparability between datasets).

The output directory where the results will be saved. Default is results/

type: string
default: results

Email address for completion summary.

type: string
pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (~/.nextflow/config) then you don't need to specify this on the command line for every run.

Define the mode in which the pipeline will be run.

Run the pipeline in test mode LPO (Leave-random-Pairs-Out), LCO (Leave-Cell-line-Out), LTO (Leave-Tissue-Out), or LDO (Leave-Drug-Out).

type: string
default: LCO
pattern: ^((LPO|LCO|LTO|LDO)?,?)*(?<!,)$

Which tests to run (LPO=Leave-random-Pairs-Out, LCO=Leave-Cell-line-Out, LTO=Leave-Tissue-Out, LDO=Leave-Drug-Out). Can be a list of test runs e.g. 'LPO,LCO,LTO,LDO' to run all tests. Default is LCO.
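As a rough illustration of what the pattern above accepts, the following Python sketch checks a few candidate values before they are passed to the pipeline. The helper name `is_valid_test_mode` is hypothetical and not part of the pipeline; only the regular expression is taken from the parameter definition.

```python
import re

# Pattern copied verbatim from the parameter definition above.
TEST_MODE_PATTERN = re.compile(r"^((LPO|LCO|LTO|LDO)?,?)*(?<!,)$")


def is_valid_test_mode(value: str) -> bool:
    """Return True if `value` satisfies the documented test-mode pattern.

    Hypothetical helper for illustration only.
    """
    return TEST_MODE_PATTERN.match(value) is not None


# A single mode, the full list, a trailing comma, and an unknown code.
for candidate in ["LCO", "LPO,LCO,LTO,LDO", "LCO,", "LXO"]:
    print(f"{candidate!r}: {is_valid_test_mode(candidate)}")
```

Read literally, the pattern rejects a trailing comma through the negative lookbehind, but it also accepts an empty string, so an empty value is not caught by the pattern alone.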

Options for randomization.

Randomization mode for the pipeline.

type: string
default: None
pattern: ^(None|(?:SVR[CD]|SVC[CD])(,(?:SVR[CD]|SVC[CD]))*)$

Which randomization tests to run in addition to the normal run. Default is None, which means no randomization tests are run. Modes: SVCC, SVRC, SVCD, SVRD. Can be a list of randomization tests, e.g. 'SVCC,SVCD' to run two tests. SVCC (Single View Constant for Cell Lines): one experiment is done for every cell line view the model uses (e.g. gene expression, mutation, ...). For each experiment, one cell line view is held constant while the others are randomized. SVRC (Single View Random for Cell Lines): one experiment is done for every cell line view the model uses (e.g. gene expression, mutation, ...). For each experiment, one cell line view is randomized while the others are held constant. SVCD and SVRD are the corresponding modes for drug views.
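The four mode codes are easy to mistype, so it can help to check a value against the pattern above before starting a run. The sketch below is one way to do that in Python; `parse_randomization_modes` is a hypothetical helper, and only the regular expression comes from the parameter definition.

```python
import re

# Pattern copied verbatim from the parameter definition above.
RANDOMIZATION_PATTERN = re.compile(
    r"^(None|(?:SVR[CD]|SVC[CD])(,(?:SVR[CD]|SVC[CD]))*)$"
)


def parse_randomization_modes(value: str) -> list[str]:
    """Validate `value` and split it into individual mode codes.

    Hypothetical helper for illustration only. Returns an empty list
    for 'None' (no randomization tests).
    """
    if RANDOMIZATION_PATTERN.match(value) is None:
        raise ValueError(f"Invalid randomization mode: {value!r}")
    return [] if value == "None" else value.split(",")


print(parse_randomization_modes("None"))
print(parse_randomization_modes("SVCC,SVCD"))
```

A transposed code such as 'SCVC' does not match the pattern, so this helper raises a `ValueError` for it.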

Randomization type for the pipeline.

type: string

Type of randomization to use. Choose from "permutation" or "invariant". Default is "permutation".

Options for robustness.

Number of trials to run for the robustness test

type: integer

Number of trials to run for the robustness test. Default is 0, which means no robustness test is run. In the robustness test, the model is retrained several times with varying random seeds to assess how stable its performance is.

Options for data input.

Path to the data directory.

type: string
default: data

Path to the data directory. The downloaded data will be exported here. If you supply custom data, it goes here, too.

The name of the drug response measure to use.

type: string

Column of the response dataset in which the drug response is stored.

Datasets for cross-study prediction.

type: string
pattern: ^(?:|(?:GDSC[12]|CCLE|CTRPv[12]|TOYv[12])(,(?:GDSC[12]|CCLE|CTRPv[12]|TOYv[12]))*)$

List of datasets to use to evaluate predictions across studies. Can be a combination like 'CTRPv1,CCLE'. Default is empty string which means no cross-study datasets are used.

Link to the latest Zenodo version of the dataset.

type: string
default: https://zenodo.org/records/15533857/files/
pattern: ^https://zenodo.org/records/[0-9]+/files/$

Additional options for the pipeline.

False by default (i.e. curves are refit). By default, we use measures calculated with CurveCurator instead of the original measures reported by the authors for the available datasets, or invoke automatic fitting of custom raw viability data with CurveCurator. Set this flag to disable refitting.

type: boolean

By default, measures calculated by CurveCurator (by re-fitting the response curves, see 'measure' option for details) are used for available datasets, which allows better comparability between datasets. When providing a custom dataset (see 'dataset_name' option), we expect a csv-formatted file at <path_data>/<dataset_name>/<dataset_name>_raw.csv (also see 'path_data' option) containing the raw response data. We fit the curves by default with CurveCurator to provide fair comparison to our other available datasets. The fitted data will then be stored at <path_data>/<dataset_name>/<dataset_name>.csv. If you want to disable this option, set the flag.
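The file layout described above can be sketched as follows. This is only an illustration of the path convention; the function name `custom_dataset_paths` and the example values "data" and "my_dataset" are assumptions, not pipeline code.

```python
from pathlib import Path


def custom_dataset_paths(path_data: str, dataset_name: str) -> tuple[Path, Path]:
    """Return (raw_csv, fitted_csv) for a custom dataset.

    Hypothetical helper that follows the documented layout:
    <path_data>/<dataset_name>/<dataset_name>_raw.csv  (input, raw responses)
    <path_data>/<dataset_name>/<dataset_name>.csv      (output, fitted data)
    """
    dataset_dir = Path(path_data) / dataset_name
    raw_csv = dataset_dir / f"{dataset_name}_raw.csv"
    fitted_csv = dataset_dir / f"{dataset_name}.csv"
    return raw_csv, fitted_csv


# Example values are placeholders chosen for illustration.
raw_csv, fitted_csv = custom_dataset_paths("data", "my_dataset")
print(raw_csv.as_posix())
print(fitted_csv.as_posix())
```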

Optimization metric for the pipeline.

type: string

Optimization metric for the pipeline. All models will minimize (MSE, RMSE, MAE)/maximize (R^2, Pearson, Spearman, Kendall) this metric calculated on the validation set. Default is RMSE.
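The distinction between minimized and maximized metrics matters when comparing candidate configurations on the validation set. The sketch below shows the idea under the grouping stated above; `best_index` is a hypothetical helper, the scores are made-up illustration values, and the pipeline's own selection logic may differ.

```python
# Metric grouping as stated in the parameter description above.
MINIMIZE = {"MSE", "RMSE", "MAE"}
MAXIMIZE = {"R^2", "Pearson", "Spearman", "Kendall"}


def best_index(metric: str, validation_scores: list[float]) -> int:
    """Return the index of the best validation score for `metric`.

    Hypothetical helper: lower is better for error metrics,
    higher is better for correlation-type metrics and R^2.
    """
    if metric in MINIMIZE:
        return validation_scores.index(min(validation_scores))
    if metric in MAXIMIZE:
        return validation_scores.index(max(validation_scores))
    raise ValueError(f"Unknown optimization metric: {metric!r}")


# Made-up validation scores for three hypothetical configurations.
scores = [0.8, 0.5, 0.9]
print(best_index("RMSE", scores))     # picks the lowest score
print(best_index("Pearson", scores))  # picks the highest score
```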

Number of cross-validation splits.

type: integer
default: 10

Number of cross-validation splits. Default is 10.

Response transformation

type: string

Transformation to apply to the response variable. Possible values: None, standard, minmax, robust.

Model checkpoint directory

type: string
default: TEMPORARY

Directory to save model checkpoints.

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden
type: string
default: master

Base directory for Institutional configs.

hidden
type: string
default: https://raw.githubusercontent.com/nf-core/configs/master

If you're running offline, Nextflow will not be able to fetch the institutional config files from the internet. If you don't need them, then this is not a problem. If you do need them, you should download the files from the repo and tell Nextflow where to find them with this parameter.

Institutional config name.

hidden
type: string

Institutional config description.

hidden
type: string

Institutional config contact information.

hidden
type: string

Institutional config URL link.

hidden
type: string

Less common options for the pipeline, typically set in a config file.

Display version and exit.

hidden
type: boolean

Method used to save pipeline results to output directory.

hidden
type: string

The Nextflow publishDir option specifies which intermediate files should be saved to the output directory. This option tells the pipeline what method should be used to move these files. See Nextflow docs for details.

Email address for completion summary, only when pipeline fails.

hidden
type: string
pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

An email address to send a summary email to when the pipeline is completed - ONLY sent if the pipeline does not exit successfully.

Send plain-text email instead of HTML.

hidden
type: boolean

Do not use coloured log outputs.

hidden
type: boolean

Incoming hook URL for messaging service

hidden
type: string

Incoming hook URL for messaging service. Currently, MS Teams and Slack are supported.

Boolean whether to validate parameters against the schema at runtime

hidden
type: boolean
default: true

Base URL or local path to location of pipeline test dataset files

hidden
type: string
default: https://raw.githubusercontent.com/nf-core/test-datasets/

Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.

hidden
type: string