Introduction

This command-line utility uses the SOPHiA DDM API to automate the upload of raw sequencing data.

The CLI can be used to trigger the upload of a specific run and to download run outputs. Uploads can use an Illumina sample sheet file to define the sample details.

Setting up this functionality requires support from SOPHiA GENETICS' IT team and familiarity with command-line utilities. Please contact the support team at support@sophiagenetics.com.

This utility is a Java tool that is delivered separately; this user guide helps you get started with it. It has no graphical user interface and can only be used from the command line.

Requirements

The CLI tool requires either a Java Runtime Environment (the latest update of Oracle JRE 1.8 is recommended) or a Java Development Kit in order to start.

The wrapper script requires Python 3.8 or higher.

Prior to using the CLI tool, you should have logged in to the SOPHiA DDM application at least once to change your password from the auto-generated one sent to you via email to a password of your choice. This is a security layer implemented in SOPHiA DDM that prevents users from performing any actions with the auto-generated password.

Installation

Download the Python script sg-upload-v2-wrapper.py, which keeps you up to date with the latest bug fixes and security upgrades of the tool.

Run the following command. If the output matches the listing below, you have the latest version in the current directory.

$ python3 sg-upload-v2-wrapper.py --help

Sophia Genetics - Downloader v
Usage: Sophia Genetics upload-api client [-hvV] [COMMAND]
-h, --help Show this help message and exit.
-v, --verbose Talks a bit more
-V, --version Print version information and exit.
Commands:
login, li Login with your Sophia Genetics credentials
logout, lo Logs out the current user
login-iam
logout-iam
new, n New batch request analysis (run)
upload, up Uploads the last created batch analysis request (run)
manage, mg Manages batch analysis requests (runs)
status, s Get the status of one or many batch requests (runs)
file, f File management commands
export, e Export files from completed interpretations
tests, t Tests commands
version, cv Check version
patient Get patient information
order Manage patient orders
pipeline Pipeline management commands
sample Get sample information
userInfo User management commands

Enabling TLS 1.2 for Uploads with Java 11+

If you are using Java 11 or later, you may experience connection issues when executing upload commands. Java 11+ defaults to TLS 1.3, which does not support TLS renegotiation, a feature required by the upload process.

To resolve this, add the following option to your Upload command to explicitly enforce TLS 1.2:

-Djdk.tls.client.protocols=TLSv1.2

This will instruct Java to use TLS 1.2, ensuring successful and secure communication during the upload.
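If you prefer not to modify the upload command itself, the same option can be supplied through the standard JAVA_TOOL_OPTIONS environment variable, which the JVM reads at startup. This is a sketch, not part of the CLI's documented usage; adjust it to your shell:

```shell
# Make every JVM started from this shell session use TLS 1.2.
# JAVA_TOOL_OPTIONS is a standard JVM environment variable, so the
# upload command itself does not need to change.
export JAVA_TOOL_OPTIONS="-Djdk.tls.client.protocols=TLSv1.2"
```

After exporting the variable, run the upload command as usual; the JVM typically prints a "Picked up JAVA_TOOL_OPTIONS" notice at startup confirming the option was applied.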

NOTE

The terms Batch Analysis Request and run are used interchangeably in this document.


Sample Sheet upload workflow

The SOPHiA DDM™ CLI enables automated run upload using a Sample Sheet CSV file generated with Illumina® sequencers. The CLI supports both the new Illumina sample sheet v2 format (with [SOPHIA_DDM_Settings] and [SOPHIA_DDM_Data] sections) and the legacy format (with [SOPHIA_DDM_Data_v1] section) for backward compatibility. The CLI Sample Sheet upload workflow supports the upload of large runs in multiple batches, organized by hybrid capture groups or other custom grouping. It also enables specification of SOPHiA DDM™ Bundle Serial Numbers and SOPHiA DDM™ Pipeline IDs for each sample.

Option 1 - all samples from the same assay, to be analyzed using the same pipeline: in this case, you can exclude the Pipeline_ID column from the sample sheet.

$ python3 sg-upload-v2-wrapper.py new --folder /path/to/fastq/files --sampleSheet samplesheet.csv --pipeline=PIPID --upload

Option 2 - samples from different assays, to be analyzed using different pipelines: the sample sheet must then contain the Pipeline_ID column, filled in for each sample.

$ python3 sg-upload-v2-wrapper.py new --folder /path/to/fastq/files --ref Run_reference --sampleSheet samplesheet.csv --upload

See example: full format example.

The CLI uses the Illumina sample sheet v2 format (recommended). For CLI usage, only the [SOPHIA_DDM_Data] section is required. The [SOPHIA_DDM_Settings] section is optional and defaults to version 1 if omitted. The Illumina header sections ([Header], [BCLConvert_Data]) are also optional and only needed if you use the sample sheet with Illumina BCL converter tools.

1. Minimal format example

Download: sampleSheet_minimal.csv

[SOPHIA_DDM_Data]
Sample_ID,Capture_ID,Pipeline_ID
SG10000008,1,1234
SG10000009,1,1234
library01,2,1235
library02,2,1235
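To sanity-check a sheet before invoking the CLI, the [SOPHIA_DDM_Data] section can be read with Python's standard csv module. This is an illustrative sketch; read_ddm_data is not part of the CLI:

```python
import csv
import io

def read_ddm_data(samplesheet_text):
    """Extract rows from the [SOPHIA_DDM_Data] section of a sample sheet.

    Illustrative helper, not part of the CLI: returns one dict per sample
    row, keyed by the section's header columns.
    """
    lines = samplesheet_text.splitlines()
    try:
        start = next(i for i, line in enumerate(lines)
                     if line.split(",")[0].strip() == "[SOPHIA_DDM_Data]")
    except StopIteration:
        raise ValueError("missing [SOPHIA_DDM_Data] section")
    body = []
    for line in lines[start + 1:]:
        if line.startswith("["):   # next section begins
            break
        if line.strip(","):        # skip empty/padding lines
            body.append(line)
    return list(csv.DictReader(io.StringIO("\n".join(body))))

sheet = """[SOPHIA_DDM_Data]
Sample_ID,Capture_ID,Pipeline_ID
SG10000008,1,1234
library01,2,1235
"""
rows = read_ddm_data(sheet)
```

The same function also works on the full format, because parsing stops at the next bracketed section header.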

2. Full format example

Full format example (with optional Illumina sections for BCL converter compatibility):

Download: sampleSheet_full.csv

[Header],,,,
FileFormatVersion,2,,,
RunName,MyRun,,,
InstrumentPlatform,NextSeq1k2k,,,
InstrumentType,NextSeq2000,,,
,,,,
[Reads],,,,
Read1Cycles,125,,,
Read2Cycles,125,,,
Index1Cycles,8,,,
Index2Cycles,8,,,
,,,,
[BCLConvert_Settings],,,,
SoftwareVersion,x.y.z,,,
,,,,
[BCLConvert_Data],,,,
Lane,Sample_ID,index,index2,,
1,S01-TOO-12plex-P1-rep1,ATCCACTG,AGGTGCGT,,
1,S02-TOO-12plex-P1-rep2,GCTTGTCA,GAACATAC,,
,,,,
[SOPHIA_DDM_Settings],,,,
version,1,,,
,,,,
[SOPHIA_DDM_Data],,,,
Sample_ID,Capture_ID,Upload_Batch,Bundle_SN,Pipeline_ID,Patient_Ref,SIS_No,Order_ID,HPO_ID,Virtual_Panel_ID,Gene_Filter_ID,Patient_Lock,Disease_ID,Patient_First_Name,Patient_Last_Name,Patient_DOB,Patient_Gender,Test_Date,Date_Collected,Sample_Type_ID,Library_Type
SG10000008,1,1,BDS-1111111111-10,5,SDSD12,sis-744587603-44,Order1,1-2-3,123,32323,0,12-23-22,John,Doe,2000-02-03,M,2025-09-03,2025-09-01,308000,DNA
SG10000009,1,1,BDS-1111111112-11,6,DFDF111,sis-744587603-44,Order2,2-3-5,345,12344,1,334-44,John,Doe,2000-02-03,M,2025-09-03,2025-09-01,408000,RNA
library01,1,2,BDS-1111111113-12,6,DSFF111,sis-744587603-44,Order3,4-6-9,445,2444,1,34-54,Jane,Doe,2000-02-03,F,2025-08-03,2025-08-01,608000,-
library02,1,2,BDS-1111111114-13,6,DFDS11,-,Order4,1-8-7,2323,55666,0,34-65-64,Jane,Doe,2000-02-03,F,2025-08-03,2025-08-01,808000,-

Format Structure:

  • [SOPHIA_DDM_Settings]: Optional section supporting the following fields:
      • version: Sample sheet version. Defaults to 1 if omitted.
      • filetype: File type for the upload. Set to VCF for VCF-based uploads. If omitted, FASTQ files are assumed.
  • [SOPHIA_DDM_Data]: Main data section containing sample information with all required and optional columns. Required.
  • [Header] and [BCLConvert_Data]: Optional Illumina sections. Only needed if using the sample sheet with Illumina BCL converter tools. Not required for CLI usage.

Note: The CLI also supports the legacy v1 format (with [SOPHIA_DDM_Data_v1] section) for backward compatibility. For information about the legacy format and how to migrate from v1 to v2, see the Migration Guides section.

Sample Sheet Columns

Sample_ID: the ID of the sample. It needs to match the first part of the FASTQ file names (e.g. SAMPLEID in SAMPLEID_S01_L001_R1_001.fastq.gz)
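As an illustration, the naming convention above can be checked with a small regular expression. The pattern and helper are assumptions based on the example file name, not part of the CLI:

```python
import re

# Illumina-style FASTQ naming: <Sample_ID>_S<num>_L<lane>_R<read>_001.fastq.gz
FASTQ_RE = re.compile(r"^(?P<sample_id>.+?)_S\d+_L\d{3}_R[12]_001\.fastq\.gz$")

def sample_id_from_fastq(filename):
    """Illustrative helper: recover the Sample_ID prefix from a FASTQ name."""
    match = FASTQ_RE.match(filename)
    if not match:
        raise ValueError(f"unrecognized FASTQ name: {filename}")
    return match.group("sample_id")
```

Running it against the documented example recovers the sample ID used in the sheet.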

Capture_ID: sample capture group ID; an identical ID means the samples were captured together in the hybridization capture workflow. By default, run splitting is based on Capture_ID groups.

Upload_Batch: if the lab prefers multiple captures to be included in one upload (e.g. 2 captures per upload batch), this column can be used to define an upload batch ID; samples with the same ID will be uploaded together as one batch/run

Bundle_SN: for SOPHiA Bundle Solutions only - the Serial Number from the box containing reagents; replaces the former separate mapping file

SIS_No: the SIS or DIS number corresponding to the purchase order if sample has been processed as part of Integrated or Dispatch service. Use - (hyphen) if the sample is not part of SIS.

Pipeline_ID: the ID of the pipeline to be launched for the sample - specifying different ids allows mixing multiple panels in the same run / upload batch. (Can be retrieved using the pipeline -l command.)

Patient_Ref: the patient reference to be associated with the sample - defaults to Sample_ID (like when uploading via SOPHiA DDM UI)

Disease_ID: the disease IDs to be added to the patient. (Multiple disease IDs separated by "-" hyphen)

Order_ID: the order ID to be added for the sample

HPO_ID: the HPO IDs to be added for the order. (Multiple HPO IDs separated by "-" hyphen)

Virtual_Panel_ID: the Virtual Panel ID associated with the Order.

Gene_Filter_ID: the Gene Filter ID for the Order

Patient_Lock: sets the patient lock for the order. ("1" sets the lock, "0" leaves it unset)

Gene_List: a semicolon-separated list of gene names used to dynamically create a virtual gene panel for the order (e.g., BRCA1;TP53;MAP2K1). Only letters, digits, hyphens, and semicolons are accepted. Gene panel resolution is performed before run creation: if the gene list cannot be resolved (e.g. unknown gene names), the upload fails immediately and no orphaned run or samples are left in the system. The error message will indicate which sample's gene list could not be resolved.
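The character constraints above can be pre-checked before upload. This is an illustrative validator mirroring the documented rule, not the CLI's own resolution logic:

```python
import re

# Gene names separated by semicolons; each name may contain only
# letters, digits, and hyphens, as documented for Gene_List.
GENE_LIST_RE = re.compile(r"^[A-Za-z0-9-]+(;[A-Za-z0-9-]+)*$")

def is_valid_gene_list(value):
    """Illustrative format check for a Gene_List cell."""
    return bool(GENE_LIST_RE.match(value))
```

Note this only checks the format; whether each gene name resolves to a known gene is decided server-side before run creation.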

Patient_First_Name: The patient's first name, formatted as a string.

Patient_Last_Name: The patient's last name, formatted as a string.

Patient_DOB: The patient's date of birth, formatted as a string in the format yyyy-mm-dd (e.g., 1990-01-01).

Patient_Gender: The patient's gender, formatted as a string. Accepted values are "M" (male), "F" (female), or "U" (unknown).

Test_Date: The date the test was performed, formatted as a string in the format yyyy-mm-dd (e.g., 2025-06-13).

Date_Collected: The date the sample was collected, formatted as a string in the format yyyy-mm-dd (e.g., 2025-06-13).

Sample_Type_ID: The sample type ID of the sample. The mapping to be used is listed below:

| Sample Type ID | Sample Type Name               |
|----------------|--------------------------------|
| 108000         | PERIPHERAL_BLOOD               |
| 208000         | FRESH_TUMOR                    |
| 308000         | FFPE                           |
| 408000         | BIOPSY                         |
| 508000         | CELL_LINE                      |
| 1308000        | CFDNA                          |
| 608000         | CTDNA                          |
| 708000         | BUCCAL_SWAB                    |
| 808000         | NASOPHARYNGEAL_SWAB            |
| 908000         | SPUTUM                         |
| 1008000        | BRONCHOALVEOLAR_LAVAGE         |
| 1108000        | SALIVA                         |
| 1208000        | BONE_MARROW                    |
| 8000           | OTHER                          |
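For scripted validation of sample sheets, the table above translates directly into a lookup. This is an illustrative helper, not part of the CLI:

```python
# Sample_Type_ID to name mapping, as listed in the table above.
SAMPLE_TYPES = {
    108000: "PERIPHERAL_BLOOD",
    208000: "FRESH_TUMOR",
    308000: "FFPE",
    408000: "BIOPSY",
    508000: "CELL_LINE",
    1308000: "CFDNA",
    608000: "CTDNA",
    708000: "BUCCAL_SWAB",
    808000: "NASOPHARYNGEAL_SWAB",
    908000: "SPUTUM",
    1008000: "BRONCHOALVEOLAR_LAVAGE",
    1108000: "SALIVA",
    1208000: "BONE_MARROW",
    8000: "OTHER",
}

def sample_type_name(type_id):
    """Resolve a Sample_Type_ID (int or CSV string) to its name."""
    try:
        return SAMPLE_TYPES[int(type_id)]
    except (KeyError, ValueError):
        raise ValueError(f"unknown Sample_Type_ID: {type_id!r}")
```

Accepting both ints and CSV strings keeps the helper usable directly on cells read from the sample sheet.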

Library_Type: The library type of the sample, used to specify which libraries to include and how to group them for analysis. Either add a Library_Type column or use the suffixes below in the Sample_ID column. For none, use "-" as the column value. Allowed values are described below.

| Library_Type     | Sample ID Suffix         |
|------------------|--------------------------|
| DNA              | -D                       |
| DNA_WGS          | -DW                      |
| WGS              | -W                       |
| RNA              | -R                       |
| LIB1             | -lib1                    |
| LIB2             | -lib2                    |
| NORMAL           | -N                       |
| TUMOR            | -T                       |
| NONE             |                          |

Library type grouped DNA/RNA using column:

[SOPHIA_DDM_Data],,,,
Sample_ID,Library_Type,,
sample1,DNA,,
sample1,RNA,,
sample2,DNA,,
sample3,RNA,,

Library type grouped DNA/RNA using suffix

[SOPHIA_DDM_Data],,,,
Sample_ID,,,
sample1-D,,,
sample1-R,,,
sample2-D,,,
sample3-R,,,

Library type grouped Tumor/Normal using suffix:

[SOPHIA_DDM_Data],,,,
Sample_ID,,,
sample1-T,,,
sample1-N,,,
sample2-T,,,
sample3-T,,,

Only the Sample_ID column is mandatory in the sheet; all others are optional.
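The suffix table above can be expressed as a simple lookup. This is an illustrative sketch of the documented convention, not the CLI's implementation; longest suffixes are matched first so "-lib1" is not confused with a single-letter suffix:

```python
# Suffix-to-library-type mapping from the table above.
SUFFIX_TO_LIBRARY = {
    "-lib1": "LIB1",
    "-lib2": "LIB2",
    "-DW": "DNA_WGS",
    "-D": "DNA",
    "-W": "WGS",
    "-R": "RNA",
    "-N": "NORMAL",
    "-T": "TUMOR",
}

def library_type_from_sample_id(sample_id):
    """Infer the library type from a Sample_ID suffix (illustrative)."""
    # Check longer suffixes first so "-DW" wins over "-W", etc.
    for suffix in sorted(SUFFIX_TO_LIBRARY, key=len, reverse=True):
        if sample_id.endswith(suffix):
            return SUFFIX_TO_LIBRARY[suffix]
    return "NONE"
```

For example, "sample1-D" maps to DNA and an unsuffixed ID falls back to NONE.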

VCF Sample Sheet

For VCF-based uploads, set filetype,VCF in the [SOPHIA_DDM_Settings] section. The VCF sample sheet uses a dedicated column set instead of the FASTQ columns above.

VCF format example (tumor-normal pair):

[SOPHIA_DDM_Settings]
version,2
filetype,VCF
[SOPHIA_DDM_Data]
Sample_ID,Patient_Ref,Pipeline_ID,Library_Type,Group_ID,File_Name,Order_ID,Order_Date,Icd10_Info,Tumor_Site
ND-PATIENT-001,PT-2026-001,8579,NORMAL,GROUP_001,short_variants.vcf,ORD-001,2024-06-15,C34.10 Lung malignancy,lung
TD-PATIENT-001,PT-2026-001,8579,TUMOR,GROUP_001,short_variants.vcf,ORD-001,2024-06-15,C34.10 Lung malignancy,lung

VCF-specific columns:

File_Name: the VCF file name to associate with the sample row. VCF files must contain the standard header columns (#CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO).

Order_Date: the date the order was placed, in yyyy-MM-dd format. Requires Order_ID to be present.

Icd10_Info: ICD-10 diagnosis code and description (e.g., C34.10 Lung malignancy). Requires Order_ID to be present.

Tumor_Site: the anatomical site of the tumor (e.g., lung).

Group_ID: groups related sample rows (e.g., a tumor-normal pair) into a single analysis. When Group_ID is used, Patient_Ref is required for patient identification.

Tumor-normal pairs and Order_ID deduplication:

In a tumor-normal VCF analysis, the NORMAL and TUMOR rows typically share the same Order_ID. The CLI automatically deduplicates orders: when multiple rows in the same batch share the same Order_ID, the order is created only once. A console warning is displayed for each duplicate:

Order ID 'ORD-001' appears on multiple rows (e.g. tumor-normal pair) — order will be created only once.
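The deduplication described above can be sketched as follows. This is an illustration of the documented behaviour, not the CLI's actual code:

```python
def deduplicate_orders(rows):
    """Create each Order_ID only once per batch, warning on duplicates.

    Illustrative sketch: returns unique order IDs in first-seen order,
    plus one warning per duplicate row, mirroring the console message.
    """
    seen, warnings = [], []
    for row in rows:
        order_id = row.get("Order_ID")
        if not order_id or order_id == "-":
            continue  # no order attached to this row
        if order_id in seen:
            warnings.append(
                f"Order ID '{order_id}' appears on multiple rows "
                "(e.g. tumor-normal pair) - order will be created only once."
            )
        else:
            seen.append(order_id)
    return seen, warnings
```

For a tumor-normal pair sharing ORD-001, this yields a single order and a single warning.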
  • By default, all samples are considered a single run if neither Capture_ID nor Upload_Batch is present.
  • If Upload_Batch is present, it takes precedence over Capture_ID when splitting the samples into runs.
  • Bundle_SN can be provided here or via the "--bdsNumber" or "--bdsMappingFile" options.
  • By default, underscores (_) in the sample ID are converted to hyphens (-). To preserve underscores, pass the "--includeUnderscores" flag when creating the run.
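The batch-splitting rules above can be sketched in Python. This illustrates the documented precedence only (not the CLI's code); treating "-" as an empty placeholder is an assumption:

```python
from collections import defaultdict

def split_into_batches(rows):
    """Group sample rows into upload batches following the documented rules.

    Illustrative sketch: Upload_Batch takes precedence over Capture_ID;
    with neither present, all rows form a single run.
    """
    def batch_key(row):
        for column in ("Upload_Batch", "Capture_ID"):
            value = row.get(column)
            if value not in (None, "", "-"):   # "-" treated as absent (assumption)
                return (column, value)
        return ("run", "all")

    batches = defaultdict(list)
    for row in rows:
        batches[batch_key(row)].append(row["Sample_ID"])
    return dict(batches)
```

With only Capture_ID populated, samples sharing a capture group land in the same batch; adding an Upload_Batch column overrides that grouping.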

Note: sample numbers within a batch must be unique.

Upload size limits

There are technical limitations on the total run size that can be processed by SOPHiA DDM in one upload batch. Application-specific limitations are described in the corresponding product's Instructions for Use (IFU) document. Additionally, the upload size limits described below apply to all CLI uploads:

> Enhanced Exome Solutions: maximum 512 GiB per upload batch
> All other applications: maximum 420 GiB per upload batch

Attempting to upload a batch that exceeds these limits will result in an error. To keep upload batches within the accepted size limits, use the Sample Sheet upload workflow with the Capture_ID or Upload_Batch column so that the upload is performed in multiple batches.

Commands

The script is run by calling sg-upload-v2-wrapper.py followed by one of the commands below and its options. Each command has a --help option (or -h) which will display more information.

login-iam

(recommended)

Recommended authentication mechanism for logging in to the Sophia Genetics platform.

This command is used to log in with IAM/SSO. You will be redirected to the browser to log in; after a successful login you can close the tab.

After logging in, you will not need to log in again as long as you have used the CLI at least once within 90 days.

IAM/SSO is a new flow that lets you authenticate securely. It was introduced to raise the security standards of the application.

Users can now log in to multiple accounts using the --client-id option and create runs/uploads specifying the client ID. If no client ID is provided, the main account's client ID is used.

Typical usage

$ python3 sg-upload-v2-wrapper.py login-iam

Login successful

$ python3 sg-upload-v2-wrapper.py login-iam --client-id 12345

Login successful
You're logged to IAM with client id: 12345

$ python3 sg-upload-v2-wrapper.py login-iam --client-id 67890

Login successful
You're logged to IAM with client id: 67890

Options

| Option      | Alias | Mandatory | Description | Example |
|-------------|-------|-----------|-------------|---------|
| --force     | -f    | No        | Forces a re-login with IAM/SSO even if you are already logged in | login-iam --force |
| --headless  |       | No        | Use this when running on a system with no GUI access | login-iam --headless |
| --client-id |       | No        | If specified, logs in to the account related to this client ID; otherwise logs in to the main account. Allows logging in to multiple accounts and creating runs/uploads for a particular account. | login-iam --client-id 123456 |

logout-iam

This command logs out the currently logged-in IAM/SSO user. Any subsequent commands (other than login/login-iam) will fail.

Typical usage

$ python3 sg-upload-v2-wrapper.py logout-iam

You have logged out

login

(not recommended, will be deprecated, use login-iam)

Note: We are moving customers from grid-card-based authentication to a new authentication method. It is recommended to use the login-iam command instead of the login command; please refer to the login-iam section for more information. The login command will be phased out in the upcoming months.

Only one user can be logged in at a time. If another user logs in, any previous user will automatically be logged out.

Typical usage

$ python3 sg-upload-v2-wrapper.py login -u myemail@mycompany.org -p

Enter value for --password (Password): <enter your password>
Provide token for coordinate [8, D]: <enter your grid token>
Log in success

Options

| Option          | Alias | Mandatory | Description | Example |
|-----------------|-------|-----------|-------------|---------|
| --user          | -u    | Yes       | Your Sophia Genetics username | login --user myemail@my-company.org |
| --password      | -p    | *         | The command line will interactively ask for your password | login --user myemail@my-company.org -p |
| --password:env  | -pe   | *         | Password is extracted from an environment variable | login --user myemail@my-company.org -pe ENV_PASSWORD_VAR |
| --password:file | -pf   | *         | Password is read from a UTF-8 encoded text file | login --user myemail@my-company.org -pf /path_to/password_file |
| --help          | -h    | No        | Explain previous options | login -h |

*one of -p, -pe or -pf is mandatory

logout

The currently logged in user will be disconnected, and subsequent commands (other than login) will fail.

Typical usage

$ python3 sg-upload-v2-wrapper.py logout

User logged out

adegen (DEPRECATED: see the sample sheet workflow)

Generate ADE formatted JSON from FastQ files. This command provides a convenient way to create ADE format JSON files that can be used with the new command.

Typical usage

$ python3 sg-upload-v2-wrapper.py adegen --folder /path/to/fastq/files --ref MyRun123 --output output.json

Available pipelines:
1234: BRCA Analysis (MISEQ)
5678: Hereditary Cancer Solution (NEXTSEQ)
9012: Solid Tumor Solution (NOVASEQ)
Enter pipeline ID: 1234

Created JSON file: output.json

Options

| Option           | Alias | Mandatory | Description | Example |
|------------------|-------|-----------|-------------|---------|
| --folder         | -f    | Yes       | Path to the folder containing FastQ files | adegen --folder /path/to/fastq/files |
| --output         | -o    | No        | Output JSON file. If not specified, JSON is written to stdout | adegen --folder /path/to/files --output run.json |
| --pipeline       | -p    | No        | Pipeline ID (defaults to -1, which triggers interactive pipeline selection) | adegen --folder /path/to/files --pipeline 12345 |
| --sampletype     | -s    | No        | Sample Type ID (defaults to 108000) | adegen --folder /path/to/files --sampletype 108001 |
| --ref            | -r    | No        | Run name | adegen --folder /path/to/files --ref MyRun123 |
| --deep           | -d    | No        | Recurse through the folder when searching for FastQ files | adegen --folder /path/to/files --deep |
| --bdsNumber      |       | No        | Serial Number for all SOPHiA GENETICS bundle solutions. When provided, this number is applied to all samples in the run. | adegen --folder /path/to/files --bdsNumber BDS-123456 |
| --bdsMappingFile |       | No        | Path to a mapping file containing patient reference to Serial Number mappings | adegen --folder /path/to/files --bdsMappingFile /path/to/mapping.csv |
| --help           | -h    | No        | Explain previous options | adegen -h |

BDS Number Format

When using either the --bdsNumber or --bdsMappingFile option:

  • BDS numbers must start with the "BDS-" prefix
  • The prefix must be followed by a valid serial number
  • --bdsNumber and --bdsMappingFile cannot be used together

BDS Mapping File Format

When using --bdsMappingFile, the file should contain one mapping per line, consisting of the patient reference and the BDS number:

patient1,BDS-123456
patient2,BDS-789012
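A validator for this format might look like the following. It is illustrative only; the CLI performs its own validation, and the exact serial-number syntax after "BDS-" is an assumption based on the examples in this guide:

```python
import re

# "BDS-" prefix followed by a serial number; serials in this guide
# contain digits and hyphens (e.g. BDS-1111111111-10), hence [\w-]+.
BDS_RE = re.compile(r"^BDS-[\w-]+$")

def parse_bds_mapping(text):
    """Parse a --bdsMappingFile body: one 'patient_ref,BDS-number' per line.

    Illustrative sketch mirroring the documented format; raises ValueError
    with the offending line number on malformed input.
    """
    mapping = {}
    for n, line in enumerate(text.splitlines(), 1):
        if not line.strip():
            continue
        try:
            patient_ref, bds = (part.strip() for part in line.split(",", 1))
        except ValueError:
            raise ValueError(f"line {n}: expected 'patient_ref,BDS-number'")
        if not BDS_RE.match(bds):
            raise ValueError(f"line {n}: invalid BDS number {bds!r}")
        mapping[patient_ref] = bds
    return mapping
```

This catches both documented error conditions: a missing comma and a number lacking the "BDS-" prefix.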

Pipeline Selection

When not specifying a pipeline ID:

1. The system displays the available pipelines with IDs and sequencer codes
2. You are prompted to select a pipeline by entering its ID
3. The selection is validated against the available pipelines

Errors

If there are any issues with the input, an error message will explain what went wrong.


Example 1 - Invalid pipeline ID

Error: Invalid pipeline ID

Example 3 - Invalid BDS mapping file format

Error: Invalid format in mapping file. Each line should contain: patient_ref,BDS-number

Important Notes

1. This command requires the new Platform Services to be activated
2. The generated JSON file can be used as input for the new command
3. When no output file is specified, the JSON is printed to stdout

new

Create a new batch analysis request. This can be done in a few different ways, but the recommended workflow uses a sample sheet .csv file. See Sample Sheet.

If you prefer to use a JSON file, there are two formats: legacy and the more recent ADE. Python scripts are provided to generate both formats.

Typical usage

Using JSON file:

$ python3 sg-upload-v2-wrapper.py new --json /path_to/file.json

Run successfully created with id 200002747

Using folder with FastQ files:

$ python3 sg-upload-v2-wrapper.py new --folder /path/to/fastq/files --ref MyRun123

Available pipelines:
1234: BRCA Analysis (MISEQ)
5678: Hereditary Cancer Solution (NEXTSEQ)
9012: Solid Tumor Solution (NOVASEQ)
Enter pipeline ID: 1234

Run successfully created with id 200002748

Using client-id flag with upload:

$ python3 sg-upload-v2-wrapper.py new --folder /path/to/fastq --client-id 12345 --pipeline 1234 --upload --ref MyRun123

Run successfully created with id 200002749
Starting upload after analysis creation...
Upload ended in 123456ms

$ python3 sg-upload-v2-wrapper.py new --folder /path/to/fastq --client-id 67890 --pipeline 1234 --upload --ref MyRun456

Run successfully created with id 200002750
Starting upload after analysis creation...
Upload ended in 123456ms

Using BaseSpace project ID:

$ python3 sg-upload-v2-wrapper.py new --basespace-project 12345678 --pipeline 1234 --ref MyRun123

BaseSpace project integration:
  Project ID: 12345678
  Virtual folder: /tmp/basespace_project_12345678_xyz
  Note: Files will be processed via BaseSpace URLs during analysis
  FASTQ files discovered: 8

Available pipelines:
1234: BRCA Analysis (MISEQ)
5678: Hereditary Cancer Solution (NEXTSEQ)
Enter pipeline ID: 1234

Run successfully created with id 200002751

Using BaseSpace project ID with sample sheet:

$ python3 sg-upload-v2-wrapper.py new --basespace-project 12345678 --sampleSheet samplesheet.csv --ref MyRun123

BaseSpace project integration:
  Project ID: 12345678
  Virtual folder: /tmp/basespace_project_12345678_xyz
  Note: Files will be processed via BaseSpace URLs during analysis
  FASTQ files discovered: 8

Run successfully created with id 200002752

Options

| Option              | Alias | Mandatory | Description | Example |
|---------------------|-------|-----------|-------------|---------|
| --json              | -j    | *         | Path to your JSON file describing the new run | new --json /path_to/file.json |
| --folder            | -f    | *         | Path to the folder containing FastQ files | new --folder /path/to/fastq/files |
| --deep              | -d    | No        | Recurse through the folder when searching for FastQ files | new --folder /path/to/files --deep |
| --ref               | -r    | No        | Run name (required when using --folder) | new --folder /path/to/files --ref MyRun123 |
| --pipeline          | -p    | No        | Pipeline ID (defaults to -1, which triggers interactive pipeline selection) | new --folder /path/to/files --pipeline 12345 |
| --sampletype        | -s    | No        | Sample Type ID (defaults to 108000) | new --folder /path/to/files --sampletype 108001 |
| --legacy            |       | No        | Use this option if you provide the JSON in the legacy format | new --json /path_to/file.json --legacy |
| --client-id         |       | **        | If specified, creates the run on the account related to this client ID. Important notes: 1. If you create runs for two clients located on different data centers, the run ID can be the same for both; in that case the newest one overrides the oldest. To avoid this, run the "upload" command between each creation. 2. This flag is only relevant for Core Services (it does not work with the new Platform Services). 3. When specifying --client-id, either --upload or --sampleSheet must be supplied. | new --json /path_to/file.json --client-id 123456 |
| --force-platform    | -fp   | No        | If specified, creates the run with the new Platform Services | new --json /path_to/file.json -fp |
| --bdsNumber         |       | No        | Serial Number for all SOPHiA GENETICS bundle solutions. When provided, this number is applied to all samples in the run. Works only with the One command rules flow. | new --folder /path/to/fastq/files --pipeline 12345 --bdsNumber BDS-123456 |
| --bdsMappingFile    |       | No        | Path to a mapping file containing patient reference to Serial Number mappings, in CSV format with each line containing patient_ref,BDS-number. Works only with the One command rules flow. | new --folder /path/to/fastq/files --pipeline 12345 --bdsMappingFile /path/to/mapping.csv |
| --basespace-project |       | No        | BaseSpace project ID for file discovery and upload. When specified, files are processed via BaseSpace URLs (no upload needed). Can be used with --pipeline or --sampleSheet. Requires BaseSpace authentication (see basespace auth login). | new --basespace-project 12345678 --pipeline 1234 --ref MyRun123 |
| --upload            |       | **        | When set, automatically starts the upload process after successfully creating the analysis. Note: upload is automatically skipped when using --basespace-project since files are handled via BaseSpace URLs. | new --json /path_to/file.json --upload |
| --help              | -h    | No        | Explain previous options | new -h |

*one of --json or --folder is mandatory

**when specifying client-id one of --upload or --samplesheet is mandatory

BDS Number Format

When using either the --bdsNumber or --bdsMappingFile option:

  • BDS numbers must start with the "BDS-" prefix
  • The prefix must be followed by a valid serial number
  • --bdsNumber and --bdsMappingFile cannot be used together

BDS Mapping File Format

When using --bdsMappingFile, the file should contain one mapping per line:

patient1,BDS-123456
patient2,BDS-789012

BaseSpace Project Integration

When using --basespace-project:

  • The command creates a virtual folder that represents the BaseSpace project
  • Files are discovered from the BaseSpace project and processed via BaseSpace URLs during workflow execution
  • Upload is automatically skipped; BaseSpace files are handled via URLs, so no file upload is needed
  • BaseSpace authentication is required (run basespace auth login first)
  • Can be used with either --pipeline (pipeline-based import) or --sampleSheet (sample sheet-based import)
  • The BaseSpace project ID can be found using basespace project list

Note

Sample numbers within a run must be unique.

Errors

If the JSON file is incorrect, an error message will explain what went wrong (the message will not appear in red in the command-line output).


Example 1 - "germatic" does not exist; it is either germline or somatic

Bad request
The experimentType germatic passed in parameters does not exist

Example 2 - one of the files does not exist

Unprocessable request
The file /path_to/SG10000001_S1_L001_R1_001.fastq.gz passed in parameters does not exist

upload

Upload the files and execute the batch analysis requests created with the new command. If an upload was in progress, it will be resumed. If several batch analysis requests have been created, they will be uploaded sequentially.

Typical usage

$ python3 sg-upload-v2-wrapper.py upload

Getting upload run information
Found 3 runs to upload.
Uploading run n°
Upload ended in 123456ms
Uploading run n°
Upload ended in 789012ms
Uploading run n°
Upload ended in 3456789ms

Options

| Option          | Alias | Mandatory | Description | Example |
|-----------------|-------|-----------|-------------|---------|
| --id            | -i    | No        | Uploads or resumes a given batch analysis request | upload --id 200003035 |
| --dry-run       | -d    | No        | Verifies that everything is okay, but doesn't start the upload process | upload --dry-run |
| --port          | -p    | No        | The upload process opens a socket (default: 40530) on your computer to ensure that only one upload runs at a time. If this socket is already in use, you can specify another one. To run multiple uploads in parallel on the same computer, use --port 0. | upload --port 56789, or upload --port 0 to launch uploads in parallel |
| --show-progress | -sp   | No        | Show the upload rate while uploading | upload -i 200003035 --show-progress |
| --help          | -h    | No        | Explain previous options | upload -h |

WARNING

Do not upload any files containing nominative information or any other direct identifier related to a patient (e.g. patient's full name).


manage

This command returns the status of batch analysis requests and can remove or reset local files that may have been created by mistake. It never modifies data on the server side.

Typical usage

$ python3 sg-upload-v2-wrapper.py manage

Found 3 upload ready to be sent :
- Run with id '200003035' for client id '3' has 1 analysis
- Run with id '200003036' for client id '3' has 1 analysis
- Run with id '200003037' for client id '3' has 4 analysis
You can upload a specific run by using the 'upload' command with the '--id' option

Options

| Option   | Alias | Mandatory | Description | Example |
|----------|-------|-----------|-------------|---------|
| --status | -s    | No        | Returns status information for the current runs, if they have not been uploaded yet | manage --status |
| --delete | -d    | No        | Deletes local files of a given batch analysis request ID (run) | manage --delete 123456789 |
| --reset  | -r    | No        | Deletes local files of all batch analysis requests (runs) | manage --reset |
| --help   | -h    | No        | Explain previous options | manage -h |

status

Get the status of one or more batch analysis requests.

Typical usage

$ python3 sg-upload-v2-wrapper.py status --limit 3

200002934: Waiting for upload
200002933: Waiting for upload
200002932: Waiting for upload

Options

| Option          | Alias      | Mandatory | Description | Example |
|-----------------|------------|-----------|-------------|---------|
| --id            | -i         | *         | Get the status of one batch analysis request, given its identifier. The second line of the terminal output will be the status, for example "Waiting for upload". | status --id 200002934 |
| --limit         | -l         | *         | Get the status of many batch analysis requests, given a specified limit. Each status is prefixed with the entity identifier and a colon character. Records are ordered from the most recently created batch analysis request to the oldest. | status --limit 3 |
| --run-ref       | -run-ref   | *         | Get the status of the batch analysis with the specified run reference and sample ID. Returns the latest run matching this condition. | status --run-ref run1 --sample-id sample1 |
| --sample-id     | -sample-id | *         | Get the status of the batch analysis with the specified run reference and sample ID. Returns the latest run matching this condition. | status --run-ref run1 --sample-id sample1 |
| --pipeline-info | -pipeline  | No        | Lists the pipeline version for each sample in a run. To be used in combination with the --sample-id and --run-ref options. | status --run-ref run1 --sample-id sample1 --pipeline-info |
| --help          | -h         | No        | Explain previous options | status -h |

*one of --id, --limit, or (--run-ref and --sample-id) is mandatory


Status Responses

  • Waiting for upload
  • Upload in progress
  • Pipeline running
  • Finished
  • Status code unknown (####)
  • Error

patient

Create new patients, list existing ones, or manage patient diseases.

Typical usage

List patients

$ python3 sg-upload-v2-wrapper.py patient --list patient1,patient2

[
  {
    "medicalInformationId": 111111111,
    "personalInformationId": 1111123222,
    "userRef": "patient1"
  },
  {
    "medicalInformationId": 2222222222,
    "personalInformationId": 22222444444,
    "userRef": "patient2"
  }
]

Add diseases to a patient

$ python3 sg-upload-v2-wrapper.py patient --add-diseases --patient-ref PATIENT_REF --diseases 1,2,3

Diseases added successfully for patient PATIENT_REF

Options

Option Alias Mandatory Description Example
--list -l * Retrieve technical IDs of the specified patients.

When at least one patient does not exist in the system, an error message lists the patients that were not found:

Unable to find following patients in SGP: notFound1,notFound2
patient --list patient1,patient2
--create -c * Create the patients specified on the command line.
Only patients that do not already exist are created. It then prints the patients as with the --list option.
patient --create p3,p4,p5
--add-diseases * Add diseases to a specified patient using comma-separated disease IDs patient --add-diseases --patient-ref PATIENT_REF --diseases 1,2,3
--diseases ** Comma-separated list of disease IDs to add to the patient patient --add-diseases --patient-ref PATIENT_REF --diseases 1,2,3
--force-platform -fp If specified, will list patients using new Platform Services patient --list -fp patient1,patient2
--help -h Explain previous options patient -h

*one of --list, --create, or --add-diseases is mandatory; **required when using --add-diseases
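The `patient --list` JSON output can be consumed directly from a script. A minimal sketch (field names taken from the example output above) that maps each patient reference to its technical IDs:

```python
import json

# Sketch: map each patient userRef to its technical IDs from the JSON
# printed by `patient --list` (field names as in the example above).
def index_patients(raw_json):
    return {p["userRef"]: (p["medicalInformationId"], p["personalInformationId"])
            for p in json.loads(raw_json)}

listing = '''[
  {"medicalInformationId": 111111111, "personalInformationId": 1111123222, "userRef": "patient1"},
  {"medicalInformationId": 2222222222, "personalInformationId": 22222444444, "userRef": "patient2"}
]'''
ids = index_patients(listing)
print(ids["patient1"])  # → (111111111, 1111123222)
```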

order

Manage orders for patients with disease support. The system supports two modes:

  • GEN1: Diseases are managed via patient command (default behavior)
  • GEN2: Diseases are added to order table via pmi-svc

Note: This command is only available with Platform Services.

Typical usage

GEN1 Orders (Default)

Basic order

$ python3 sg-upload-v2-wrapper.py order --add --patient-ref patient1 --order-id ORDER123

Order 'ORDER123' added successfully for patient patient1

GEN1 order with phenotypes

$ python3 sg-upload-v2-wrapper.py order --add \
  --patient-ref PATIENT123 \
  --order-id ORDER123 \
  --virtual-panel VP789 \
  --phenotypes PHENO101,PHENO102,PHENO103 \
  --filter FILTER303

Order 'ORDER123' added successfully for patient PATIENT123

Explicitly specify GEN1 (optional)

$ python3 sg-upload-v2-wrapper.py order --add \
  --patient-ref PATIENT123 \
  --order-id ORDER123 \
  --order-type GEN1

Order 'ORDER123' added successfully for patient PATIENT123

GEN2 Orders with Diseases

GEN2 order with diseases (order-type must be explicitly set to GEN2)

$ python3 sg-upload-v2-wrapper.py order --add \
  --patient-ref PATIENT123 \
  --order-id ORDER123 \
  --order-type GEN2 \
  --disease-ids "100,200,300"

Order 'ORDER123' added successfully for patient PATIENT123

GEN2 order with all parameters

$ python3 sg-upload-v2-wrapper.py order --add \
  --patient-ref PATIENT123 \
  --order-id ORDER123 \
  --order-type GEN2 \
  --disease-ids "100,200" \
  --phenotypes "PHENO101" \
  --virtual-panel VP789 \
  --filter FILTER303 \
  --patient-lock

Order 'ORDER123' added successfully for patient PATIENT123

List orders for a patient

JSON format

$ python3 sg-upload-v2-wrapper.py order --list --patient-ref patient1

[ {
  "id" : 22,
  "patientId" : 627721022,
  "orderId" : "ORDER-1",
  "diseaseIds" : [],
  "orderType" : "GEN1",
  "createdAt" : "2025-03-10T13:58:12Z",
  "updatedAt" : "2025-03-10T13:58:12Z"
}, {
  "id" : 32,
  "patientId" : 627721022,
  "orderId" : "ORDER-2",
  "diseaseIds" : ["100", "200"],
  "orderType" : "GEN2",
  "createdAt" : "2025-03-10T14:02:07Z",
  "updatedAt" : "2025-03-10T14:02:07Z"
} ]

Flat table format

$ python3 sg-upload-v2-wrapper.py order --list --patient-ref patient1 --flat

Orders for patient patient1:
ID         Order ID             Created At                    Updated At                    Virtual Panel        Phenotypes                    Filter                    Lock       Diseases                        Type
---------- -------------------- ------------------------------ ------------------------------ -------------------- ------------------------------ ------------------------------ ---------- ------------------------------ ----------
22         ORDER-1             2025-03-10T13:58:12Z          2025-03-10T13:58:12Z          VP789               PHENO101,PHENO102,PHENO103   FILTER303               No         []                              GEN1
32         ORDER-2             2025-03-10T14:02:07Z          2025-03-10T14:02:07Z          VP789               []                             []                      No         [100, 200]                      GEN2

Note: The Diseases column shows comma-separated disease IDs when present, or empty brackets [] when no diseases are associated with the order.

Options

Option Alias Mandatory Description Example
--add -a * Add a new order for a specified patient order --add --patient-ref patient1 --order-id ORDER123
--list -l * List all orders for a specified patient order --list --patient-ref patient1
--patient-ref -p Yes The patient reference to add an order for or list orders from order --add --patient-ref patient1 --order-id ORDER123
--order-id -o ** The order ID to add (required when using --add) order --add --patient-ref patient1 --order-id ORDER123
--virtual-panel -vp No Virtual panel ID for the order order --add --patient-ref patient1 --order-id ORDER123 --virtual-panel VP789
--phenotypes -ph No Comma-separated list of phenotype IDs order --add --patient-ref patient1 --order-id ORDER123 --phenotypes PHENO101,PHENO102
--filter -f No Cascade filter ID for the order order --add --patient-ref patient1 --order-id ORDER123 --filter FILTER303
--patient-lock No Lock the patient to prevent modifications order --add --patient-ref patient1 --order-id ORDER123 --patient-lock
--disease-ids No Comma-separated list of disease IDs (only allowed with GEN2) order --add --patient-ref patient1 --order-id ORDER123 --order-type GEN2 --disease-ids "100,200,300"
--order-type No Order type (GEN1 or GEN2). Disease IDs only allowed with GEN2 order --add --patient-ref patient1 --order-id ORDER123 --order-type GEN2
--flat No Display orders in flat format instead of JSON (for list command) order --list --patient-ref patient1 --flat
--force-platform -fp No If specified, will manage orders using Platform Services order --list --patient-ref patient1 -fp
--help -h Explain previous options order -h

*one of --add or --list is mandatory; **required when using --add

Validation Rules

  • Order Type: Defaults to GEN1 if not specified
  • Disease IDs with GEN1: Disease IDs can only be provided when order type is explicitly set to GEN2
  • Missing Order ID: Throws exception if order ID is not provided when adding orders
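The validation rules above can be expressed as a small helper. This is an illustrative sketch of the rules, not the tool's actual implementation:

```python
# Sketch of the validation rules listed above (not the tool's own code):
# GEN1 default, disease IDs only with GEN2, and a mandatory order ID.
def validate_order(order_id, order_type=None, disease_ids=None):
    order_type = order_type or "GEN1"  # defaults to GEN1 if not specified
    if not order_id:
        raise ValueError("order ID is required when adding orders")
    if disease_ids and order_type != "GEN2":
        raise ValueError("disease IDs are only allowed with GEN2 orders")
    return {"orderId": order_id, "orderType": order_type,
            "diseaseIds": disease_ids or []}

print(validate_order("ORDER123", "GEN2", ["100", "200"]))
# → {'orderId': 'ORDER123', 'orderType': 'GEN2', 'diseaseIds': ['100', '200']}
```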

pipeline

Get all pipelines available to currently logged in user.

Typical usage

$ python3 sg-upload-v2-wrapper.py pipeline --list

[
  {
    "pipeline_id": 123,
    "pipeline_name": "Pipeline 123",
    "analysis_type": "BRCA",
    "analysis_type_id": 30078000,
    "kit": "Multiplicom_MASTR_assay",
    "sequencer_id": 123456
    "sequencer": "ILLUMINA_MiSeq",
    "experiment_type": "germline",
    "pairend": true
  },
  {
    "pipeline_id": 456,
    "pipeline_name": "Pipeline 456",
    "analysis_type": "HCS_v1_1",
    "analysis_type_id": 6003000,
    "kit": "IDT",
    "sequencer_id": 123456
    "sequencer": "ILLUMINA_MiSeq",
    "experiment_type": "germline",
    "pairend": true
  }
]
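When many pipelines are available, you may want to select one programmatically. A minimal sketch (field names taken from the example JSON above) that filters pipeline IDs by sequencer:

```python
import json

# Sketch: pick pipeline IDs for a given sequencer from the JSON printed
# by `pipeline --list` (field names as in the example output above).
def pipelines_for(raw_json, sequencer):
    return [p["pipeline_id"] for p in json.loads(raw_json)
            if p["sequencer"] == sequencer]

listing = '''[
  {"pipeline_id": 123, "sequencer": "ILLUMINA_MiSeq"},
  {"pipeline_id": 456, "sequencer": "ILLUMINA_NextSeq"}
]'''
print(pipelines_for(listing, "ILLUMINA_MiSeq"))  # → [123]
```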

Options

Option Alias Mandatory Description Example
--list -l Yes Retrieve the list of allowed pipelines to the logged in user pipeline --list
--force-platform -fp No If specified, will list pipelines using the new Platform Services. pipeline --list -fp
--file-out -o No The path where the pipeline list will be written. pipeline --list --file-out /test/example/pipelineOutput.txt
--help -h Explain previous options pipeline -h

sample

List all samples of a run or a specific sample.

Typical usage

View by Run ID

$ python3 sg-upload-v2-wrapper.py sample --run-id 1234567890

[
  {
    "id": 1111111111,
    "sgacltId": 10,
    "userRef": "dnoble",
    "analysisType": "BRCA_Tumor",
    "kit": "Multiplicom_MASTR_Plus",
    "sampleType": "Other",
    "sequencer": "ILLUMINA_MiSeq",
    "status": "Creation",
    "experimentType": "somatic",
    "isPairend": true,
    "isControl": false
  },
  {
    "id": 2222222222,
    "sgacltId": 10,
    "userRef": "dnoble2",
    "analysisType": "BRCA_Tumor",
    "kit": "Multiplicom_MASTR_Plus",
    "sampleType": "Other",
    "sequencer": "ILLUMINA_MiSeq",
    "status": "Creation",
    "experimentType": "somatic",
    "isPairend": true,
    "isControl": false
  }
]

View by Sample ID

$ python3 sg-upload-v2-wrapper.py sample --sample-id 1111111111

{
  "id": 1111111111,
  "sgacltId": 10,
  "userRef": "dnoble",
  "analysisType": "BRCA_Tumor",
  "kit": "Multiplicom_MASTR_Plus",
  "sampleType": "Other",
  "sequencer": "ILLUMINA_MiSeq",
  "status": "Creation",
  "experimentType": "somatic",
  "isPairend": true,
  "isControl": false
}

Options

Option Alias Mandatory Description Example
--run-id * Get all samples of the specified run sample --run-id 200002934
--sample-id * Get the description of the specified sample sample --sample-id 123456
--force-platform -fp If specified, will list samples using the new Platform Services. sample --run-id 200002934 -fp
--help -h Explain previous options sample -h

*one of --run-id or --sample-id is mandatory

file

List and download files from a batch request or an analysis.

Typical usage

List files by run ID

$ python3 sg-upload-v2-wrapper.py file --list --run-id 1234567890

[{"id":204177851,"name":"SG10000001_S1_L001_R1_001.fastq.gz","size":345676,"patient":"SG10000001","analysisId":300042016},
{"id":204177852,"name":"SG10000001_S1_L001_R2_001.fastq.gz","size":345682,"patient":"SG10000001","analysisId":300042016}]

List files by date (time in milliseconds, e.g. epoch Unix timestamp) and extension

$ python3 sg-upload-v2-wrapper.py file --list --date 1704063600000 --extension .bam

Download file by file ID to non-existent destination

$ python3 sg-upload-v2-wrapper.py file --download --file-id 1234567890 --file-out /tmp/test/test.fastq

Will copy file 1234567890 into /tmp/test/test.fastq
Will create the parent folder: /tmp/test
Have created the parent folder: /tmp/test
Your file has been downloaded and is available here: /tmp/test/test.fastq

Download file by file ID to current directory

$ python3 sg-upload-v2-wrapper.py file --download --file-id 1234567890 --file-out report.pdf

Will copy file 204183269 into report.pdf
Your file has been downloaded and is available here: report.pdf

Download a JSON report based on order ID and patient reference

$ python3 sg-upload-v2-wrapper.py file --download-reports --order-id testorder --patient-ref testpatient

Download output files of an analysis based on run reference and sample ID

$ python3 sg-upload-v2-wrapper.py file --download --run-ref "testRun2025"  --sample-id "mySampleName"

Options

Option Alias Mandatory Description Example
--list -l * List the files of a batch analysis request (run) or an analysis.

Requires at least --run-id or --analysis-id argument

By default results are in JSON format.

For files that have this information associated, the analysis ID and patient user reference are also provided.

File list for run 200002764
[{"id":123,"name":"file.fastq.gz","size":0}, {"id":123,"name":"file.fastq.gz","size":0,"analysisId":123,"patient":"dnoble"}]
file --list --run-id 200002934
--run-id Goes along with --list; this represents the batch analysis request ID file --list --run-id 200002934
--analysis-id Goes along with --list; this represents the analysis ID file --list --analysis-id 200002934
--flat Outputs the results in CSV format:
File list for run 200002764 id;name;size;analysisId;patient
204177851;SG10000001_S1_L001_R1_001.fastq.gz;0;;
204177852;SG10000001_S1_L001_R2_001.fastq.gz;0;;
file --list --run-id 200002934 --flat
--download -d * Downloads the file given by its --file-id argument in the --file-out file file --download --file-id 234565645 --file-out fastq.gz
--sample-id The sample ID of the analysis (for use with the --download option)
--run-ref The run reference name of the batch (for use with the --download option)
--file-id The id of the file to download (mandatory when using the --download option) file --download --file-id 234565645 --file-out fastq.gz
--file-out -o The path where the file will be downloaded (mandatory when using the --download option)
  • If you provide a full path that doesn't exist, the folders will be automatically created
  • If the name ends with ".gz", the file will be automatically gzipped
  • If not specified, the file will be downloaded to the current directory
file --download --file-id 234565645 --file-out /test/example/fastq.gz
--run-id -r Download all output files of the specified RUN.

Input files (fastq and bam files) will not be downloaded by default.
file --download --run-id 123456
--with-input-files Usable only with the --run-id option.

If set to true, will download all the files of the run, including fastq and bam files.
file --download --run-id 123456 --with-input-files
--folder-user-ref Usable only with the --run-id option.

If set to true, will download all the files of the run and create subfolders by patient/userRef instead of SOPHiA's internal analysis ID.
file --download --run-id 123456 --folder-user-ref
--skip-zip Usable only with the --download option.

If set to true, will download all the files of the run and skip zipping the folder.
file --download --run-id 123456 --skip-zip
--out The destination folder where to download files.

If used with the --run-id option, it will download the zip into the specified folder.
The zip name will be "runId-out.zip".
This option cannot be used with the --file-id and --file-out options.
file --download --run-id 123456 --out /my/custom/path
→ Download all files as /my/custom/path/123456-out.zip
--date List files uploaded since a specific date (time in milliseconds, e.g. epoch Unix timestamp). Important: This flag can only be used when the --extension flag is provided. file --list --date 1704063600000 --extension .bam
--extension List files with a specific extension. Important: This flag can only be used when the --date flag is provided. file --list --date 1704063600000 --extension .bam
--force-platform -fp If specified, will download files using the new Platform Services. file --list -fp
--help -h Explain previous options file -h

*one of --list or --download is mandatory
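The --date flag expects epoch milliseconds rather than a calendar date. A small sketch for computing that value; note that the timezone matters (the 1704063600000 used in the example above corresponds to 2024-01-01 midnight in a UTC+1 timezone, while midnight UTC gives a different value):

```python
from datetime import datetime, timezone

# Sketch: convert a calendar date to the epoch-millisecond value the
# --date flag expects. Defaults to UTC; pass another tzinfo if your
# dates are local.
def to_epoch_millis(year, month, day, tz=timezone.utc):
    return int(datetime(year, month, day, tzinfo=tz).timestamp() * 1000)

print(to_epoch_millis(2024, 1, 1))  # → 1704067200000
```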

Downloadable files

All fastq, bam, bai, fna, qual, sff, ab1, warnings, zip files, as well as the following:

  • full_variant_table.txt
  • ampCov_patient_merge.txt
  • combined_cov_stats.txt
  • combined_var_stats.txt
  • combined_hotspot_stats.txt
  • exon_coverage_stats.txt
  • exon_coverage_stats_v2.txt
  • exon_coverage_stats_v3.txt
  • hCoV2_detection.txt
  • hCoV2_detection_per_patient.txt
  • QA-report.pdf
  • QA-patient.pdf
  • CNV-Report.pdf
  • MSI-Report.pdf
  • Gene-Expression-Report.pdf
  • full_variant_table.vcf
  • sampleCheckId.vcf
  • ontarget-mapping-statistics-table.csv
  • pcr-duplicates-table.csv
  • read-counts-overview-table.csv
  • softclip-percentage-table.csv
  • target-region-coverage-table.csv
  • alignment-stats-RNA-table.csv
  • detected-fusions-table.csv
  • full_variant_seq.fa
  • fasta_sequences.fa

export

Export files from completed interpretations. This command allows you to list and download output files (e.g., JSON reports) from analyses that have been completed since a specified date.

Typical usage

List files from completed interpretations since a specific date

$ python3 sg-upload-v2-wrapper.py export --since-date 2025-01-01 --file-pattern "*.json" --list

Found 15 completed interpretations since 2025-01-01
Found 45 files matching pattern '*.json'
[ {
  "id" : 204178491,
  "name" : "report.json",
  "checksum" : "686031b1b1ea63c3fea69b0061237e3b",
  "length" : 61812
}, {
  "id" : 204175927,
  "name" : "variant_report.json",
  "checksum" : "57a2743b3e8f5155f90172fb1fbf79c2",
  "length" : 129583
} ]

Output Modes

The --list command supports two output modes to suit different use cases:

Concise Mode (--json-only)

Use the --json-only flag for less verbose output — clean JSON with only essential fields, no informational messages. Ideal for scripting and piping to other tools:

$ python3 sg-upload-v2-wrapper.py export --since-date 2025-01-01 --file-pattern "*ORDER-46#*.json" -fp --list --json-only

[ {
  "id" : 422532316,
  "name" : "report-json-analysis_400163833-interpretation_30856-#ORDER-46#-rev19549.json",
  "checksum" : "9eee3cc1fc5fce788064b4c26e800b27",
  "length" : 12585
}, {
  "id" : 422532283,
  "name" : "report-json-analysis_400163833-interpretation_30856-#ORDER-46#-rev19518.json",
  "checksum" : "8678f5dff1d00d266e0a0feba7a804e7",
  "length" : 12587
} ]

Output fields:

Field Description
id Unique file identifier for download
name Filename
checksum MD5 checksum of the file
length File size in bytes

Verbose Mode (--flat)

Use the --flat flag for detailed output — includes informational messages and full file attributes with timestamps, associations, and encryption status:

$ python3 sg-upload-v2-wrapper.py export --since-date 2025-01-01 --file-pattern "*ORDER-46#*.json" -fp --list --flat

Found 39 completed interpretations since 2025-01-01
Found 2 files matching pattern '*ORDER-46#*.json'
[ {
  "id" : 422532316,
  "fileAttributes" : {
    "name" : "report-json-analysis_400163833-interpretation_30856-#ORDER-46#-rev19549.json",
    "dataAttributes" : {
      "checksum" : "9eee3cc1fc5fce788064b4c26e800b27",
      "length" : 12585
    }
  },
  "createdAt" : "2025-12-11T01:47:07Z",
  "fileAssociations" : [ {
    "entityType" : "SGAANA",
    "entityId" : 400163833,
    "ioType" : "OUTPUT"
  } ],
  "downloadable" : true,
  "hasWrappedEncryption" : true
}, {
  "id" : 422532283,
  "fileAttributes" : {
    "name" : "report-json-analysis_400163833-interpretation_30856-#ORDER-46#-rev19518.json",
    "dataAttributes" : {
      "checksum" : "8678f5dff1d00d266e0a0feba7a804e7",
      "length" : 12587
    }
  },
  "createdAt" : "2025-12-10T17:15:41Z",
  "fileAssociations" : [ {
    "entityType" : "SGAANA",
    "entityId" : 400163833,
    "ioType" : "OUTPUT"
  } ],
  "downloadable" : true,
  "hasWrappedEncryption" : true
} ]

Additional fields in verbose mode:

Field Description
createdAt File creation timestamp
fileAssociations Entity type, ID, and I/O type
downloadable Whether the file can be downloaded
hasWrappedEncryption Whether the file uses wrapped encryption

Download files from completed interpretations

$ python3 sg-upload-v2-wrapper.py export --since-date 2025-01-01 --file-pattern "*.json" --download

Found 15 completed interpretations since 2025-01-01
Found 45 files matching pattern '*.json'
Downloading 45 files to export-out
Downloading: report.json
...
Download complete: 45 succeeded, 0 failed

Download to a specific output folder

$ python3 sg-upload-v2-wrapper.py export --since-date 2025-01-01 --file-pattern "*.json" --download --out /path/to/output

Found 15 completed interpretations since 2025-01-01
Created output directory: /path/to/output
Downloading 45 files to /path/to/output
...
Download complete: 45 succeeded, 0 failed

Options

Option Alias Mandatory Description Example
--since-date Yes* Filter completed interpretations since date (YYYY-MM-DD format) export --since-date 2025-01-01 --list
--file-pattern No Filename pattern filter with wildcards (defaults to *.json) export --since-date 2025-01-01 --file-pattern "*.pdf" --list
--list -l * List files matching criteria in JSON format export --since-date 2025-01-01 --list
--download -d * Download files matching criteria export --since-date 2025-01-01 --download
--out No Output folder for downloads (defaults to 'export-out') export --since-date 2025-01-01 --download --out /my/folder
--json-only -j No Concise mode: Output only JSON with essential fields (id, name, checksum, length), no informational messages. Ideal for scripting export --since-date 2025-01-01 --list --json-only
--flat No Verbose mode: Output full file attributes including timestamps, associations, and encryption status export --since-date 2025-01-01 --list --flat
--analysis-id ** Analysis ID for variant/CNV/QC export export --analysis-id 123456 --variant-output --filter myfilter
--variant-output ** Download variant CSV file export --analysis-id 123456 --variant-output --filter myfilter
--cnv-output ** Download CNV CSV file export --analysis-id 123456 --cnv-output --filter myfilter
--qc-output ** Download QC file export --analysis-id 123456 --qc-output
--filter No Filter for variant/CNV export export --analysis-id 123456 --variant-output --filter myfilter
--help -h Explain previous options export -h

*when using --since-date, one of --list or --download is mandatory; **for the existing export functionality (variant/CNV/QC), use --analysis-id with the appropriate output flag

How it works

The export command combines two service calls:

  1. Get completed interpretations: Queries all analyses with 'completed' status since the specified date
  2. Filter files by pattern: Queries the file service for files matching the filename pattern (e.g., *.json) for the retrieved analysis IDs

This allows you to efficiently export output files (such as JSON reports) from all completed interpretations without needing to know individual analysis IDs.

Notes

  • The --since-date option accepts dates in YYYY-MM-DD format
  • The --file-pattern option supports wildcards (e.g., *.json, *report*.pdf, *.vcf)
  • Files are downloaded to the current directory by default, or to the folder specified with --out
  • The command only retrieves files marked as 'downloadable' and 'output' type
  • Use --json-only for concise mode (clean JSON output suitable for scripting)
  • Use --flat for verbose mode (full file details with timestamps and associations)

userInfo

Display basic user information.

Typical usage

$ python3 sg-upload-v2-wrapper.py userInfo

{
  "userId": 405,
  "loginUsername": "dnoble",
  "clientId": 12
}

basespace

BaseSpace integration commands allow you to authenticate with Illumina BaseSpace, list projects, and automatically import sequencing data from BaseSpace projects into SOPHiA DDM.

Typical usage

$ python3 sg-upload-v2-wrapper.py basespace auth login

Authenticating to BaseSpace region: us
API Server: https://api.basespace.illumina.com
Enter your BaseSpace access token: <token>

✓ Successfully authenticated to BaseSpace!

basespace auth

BaseSpace authentication commands.

basespace auth login

Authenticate to BaseSpace. You need to obtain an access token from BaseSpace first.

Typical usage

$ python3 sg-upload-v2-wrapper.py basespace auth login

Authenticating to BaseSpace region: us
API Server: https://api.basespace.illumina.com
Please obtain an access token from BaseSpace:
1. Go to BaseSpace and navigate to your account settings
2. Create an API application or use an existing one
3. Generate an access token with appropriate scopes

Enter your BaseSpace access token: <token>
Validating token...
✓ Successfully authenticated to BaseSpace!

Options

Option Alias Mandatory Description Example
--region No BaseSpace region (us, euc1, aps1). Defaults to 'us' basespace auth login --region euc1
--api-server No BaseSpace API server URL basespace auth login --api-server https://api.euc1.sh.basespace.illumina.com
--token No BaseSpace access token. If not provided, will prompt for manual entry basespace auth login --token
--scope No Token scope (default: 'read write') basespace auth login --scope "read write"
--help -h No Explain previous options basespace auth login -h

Note: If you specify --region, the API server will be automatically set. If you specify --api-server, the region will be automatically detected. If neither is specified, defaults to US region.

basespace auth logout

Clear BaseSpace authentication.

Typical usage

$ python3 sg-upload-v2-wrapper.py basespace auth logout

✓ Successfully logged out from BaseSpace

basespace auth status

Show BaseSpace authentication status.

Typical usage

$ python3 sg-upload-v2-wrapper.py basespace auth status

Authenticated to BaseSpace (region: us)
Current region: us
Current API server: https://api.basespace.illumina.com

basespace project

BaseSpace project management commands.

basespace project list

List all BaseSpace projects for the authenticated user.

Typical usage

$ python3 sg-upload-v2-wrapper.py basespace project list

Listing BaseSpace projects (region: us)

✓ Found 3 projects:

Project: My Sequencing Run
  ID: 12345678
  Description: WGS run from 2025-01-15
  Created: 2025-01-15T10:30:00.0000000Z
  Owner: John Doe (98765432)

Project: Cancer Panel Run
  ID: 87654321
  Created: 2025-01-20T14:20:00.0000000Z
  Owner: Jane Smith (12345678)

Options

Option Alias Mandatory Description Example
--help -h No Explain previous options basespace project list -h

basespace project files

List files in a BaseSpace project.

Typical usage

$ python3 sg-upload-v2-wrapper.py basespace project files --project-id 12345678

Listing contents of BaseSpace project: 12345678

✓ Found 2 datasets:

Dataset: Sample1
  ID: dataset-001
  Description: Sample 1 sequencing data
  Created: 2025-01-15T10:35:00.0000000Z
  Files (4):
    - Sample1_S1_L001_R1_001.fastq.gz (1.2 GB)
    - Sample1_S1_L001_R2_001.fastq.gz (1.2 GB)
    - Sample1_S2_L001_R1_001.fastq.gz (1.1 GB)
    - Sample1_S2_L001_R2_001.fastq.gz (1.1 GB)

Options

Option Alias Mandatory Description Example
--project-id -p Yes BaseSpace project ID basespace project files -p 12345678
--show-all-files No Show all files, not just FASTQ files basespace project files -p 12345678 --show-all-files
--help -h No Explain previous options basespace project files -h

basespace auto-import

Automatically import all BaseSpace projects. This command will:

  • List all BaseSpace projects (optionally filtered by creation date)
  • Check each project for a sample sheet or FASTQ files
  • Create runs in SOPHiA DDM for each project
  • Track processed projects using lock files to avoid duplicate imports

Sample Sheet Support

The auto-import command supports two import methods:

  1. Sample Sheet Import (preferred): If a sample sheet is found in the project, it will be used to create the run. The sample sheet must follow the SOPHiA DDM format (see Sample Sheet section). The Pipeline_ID column in the sample sheet will be used if present.

  2. Pipeline-based Import: If no sample sheet is found, the command will use the specified pipeline ID to create the run. This requires the --pipeline option.

Typical usage

Import all projects with a specific pipeline:

$ python3 sg-upload-v2-wrapper.py basespace auto-import --pipeline 12345

BaseSpace Auto-Import
====================
Region: us

Listing BaseSpace projects...
Found 5 projects

Processing project: 12345678 (My Sequencing Run)
  Success: 12345678 → BS-12345678-20250115-143022
Processing project: 87654321 (Cancer Panel Run)
  Success: 87654321 → BS-87654321-20250115-143045

Summary:
  Processed: 2
  Skipped: 3
  Failed: 0

Import projects with sample sheets (no pipeline needed if sample sheets contain Pipeline_ID):

$ python3 sg-upload-v2-wrapper.py basespace auto-import

BaseSpace Auto-Import
====================
Region: us

Listing BaseSpace projects...
Found 3 projects

Processing project: 12345678 (My Sequencing Run)
  Found sample sheet: SampleSheet.csv
  Success: 12345678 → BS-12345678-20250115-143022

Dry-run to see what would be processed:

$ python3 sg-upload-v2-wrapper.py basespace auto-import --pipeline 12345 --dry-run

BaseSpace Auto-Import
====================
Region: us

Listing BaseSpace projects...
Found 5 projects

DRY RUN MODE - No projects will be imported
===========================================

[WOULD PROCESS] Project: 12345678 (My Sequencing Run)
  Created: 2025-01-15T10:30:00.0000000Z

[SKIP - Already processed] Project: 87654321 (Cancer Panel Run)
  Created: 2025-01-14T08:20:00.0000000Z
  Lock file: 2025-01-14_14:30:15|SUCCESS|BS-87654321-20250114-143015

Summary (DRY RUN):
  Would process: 4
  Would skip: 1
  Total: 5

Filter by date:

$ python3 sg-upload-v2-wrapper.py basespace auto-import --pipeline 12345 --from-date 2025-01-15T00:00:00Z

BaseSpace Auto-Import
====================
Region: us

Listing BaseSpace projects...
Found 10 projects
Filtering projects created on or after: 2025-01-15T00:00:00Z
After date filtering: 3 projects

Options

Option Alias Mandatory Description Example
--pipeline -p ** Pipeline ID. Required if no sample sheet is found in projects basespace auto-import --pipeline 12345
--sampletype -s No Sample Type ID (defaults to 8000) basespace auto-import --pipeline 12345 --sampletype 8000
--from-date No Only import projects created on or after this date. Format: 2025-05-09T22:11:20.0000000Z or 2025-05-09T22:11:20Z basespace auto-import --pipeline 12345 --from-date 2025-01-15T00:00:00Z
--dry-run No Simulate the import process without actually importing projects. Shows which projects would be processed basespace auto-import --pipeline 12345 --dry-run
--help -h No Explain previous options basespace auto-import -h

Pipeline ID requirement: The --pipeline option is required if:

  • No sample sheet is found in the project, OR
  • The sample sheet is found but doesn't contain a Pipeline_ID column

If a sample sheet with Pipeline_ID is found, the pipeline ID from the sample sheet will be used and --pipeline is not required.

Lock Files

The auto-import command uses lock files to track processed projects and avoid duplicate imports. Lock files are stored in:

~/.sophia/basespace/<region>/<project-id>.lock

Each lock file contains:

  • Timestamp of processing
  • Status (SUCCESS or FAILED)
  • Run reference (for successful imports)

Projects with existing lock files are automatically skipped. Failed imports create .lock.failed files.
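The skip decision amounts to checking for the lock file at the documented path. A minimal sketch (the base directory defaults to ~/.sophia/basespace as described above; the base parameter is added here only to make the sketch testable):

```python
from pathlib import Path

# Sketch of the lock-file check described above. The default base
# directory follows the documented layout:
#   ~/.sophia/basespace/<region>/<project-id>.lock
def already_processed(region, project_id, base=None):
    base = Path(base) if base else Path.home() / ".sophia" / "basespace"
    return (base / region / f"{project_id}.lock").exists()

# A project with no lock file would be imported; one with a lock file
# is skipped.
print(already_processed("us", "87654321"))
```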

Date Format

The --from-date option accepts ISO 8601 dates with or without fractional seconds:

  • 2025-05-09T22:11:20Z (without fractional seconds)
  • 2025-05-09T22:11:20.000Z (with milliseconds)
  • 2025-05-09T22:11:20.0000000Z (with seven fractional digits)
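If you generate --from-date values from a script, all three accepted variants can be parsed uniformly. A sketch that normalizes the trailing "Z" and over-long fractional seconds before handing the string to Python's parser (older Python versions accept neither):

```python
import re
from datetime import datetime

# Sketch: parse the --from-date formats listed above. fromisoformat()
# on older Python versions rejects a trailing "Z" and more than six
# fractional digits, so normalize both first.
def parse_from_date(s):
    s = s.replace("Z", "+00:00")
    s = re.sub(r"\.(\d{6})\d+", r".\1", s)  # trim to microsecond precision
    return datetime.fromisoformat(s)

for s in ("2025-05-09T22:11:20Z",
          "2025-05-09T22:11:20.000Z",
          "2025-05-09T22:11:20.0000000Z"):
    print(parse_from_date(s).isoformat())
```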

Behavior

  • Projects without FASTQ files are automatically skipped
  • Projects with sample sheets are processed using the sample sheet (preferred method)
  • Projects without sample sheets fall back to pipeline-based import (requires --pipeline)
  • Already processed projects (with lock files) are skipped
  • The command generates run references in the format: BS-{projectId}-{timestamp}
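The run reference format can be reproduced for your own bookkeeping. A sketch assuming the timestamp layout shown in the sample output above (e.g. BS-12345678-20250115-143022):

```python
from datetime import datetime

# Sketch: build a run reference in the BS-{projectId}-{timestamp}
# format described above; the YYYYmmdd-HHMMSS timestamp layout is
# inferred from the sample output.
def run_reference(project_id, now=None):
    now = now or datetime.now()
    return f"BS-{project_id}-{now.strftime('%Y%m%d-%H%M%S')}"

print(run_reference("12345678", datetime(2025, 1, 15, 14, 30, 22)))
# → BS-12345678-20250115-143022
```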

basespace status

Show BaseSpace connection and authentication status.

Typical usage

$ python3 sg-upload-v2-wrapper.py basespace status

BaseSpace Integration Status
============================

Authentication: Authenticated to BaseSpace (region: us)
Region: us
API Server: https://api.basespace.illumina.com

Testing connection...
✓ Token is valid and connection is working

Available commands:
  basespace auth login    - Authenticate to BaseSpace
  basespace auth logout   - Clear authentication
  basespace project list  - List all BaseSpace projects
  basespace project files -p <id> - List files in a BaseSpace project

Options

Option Alias Mandatory Description Example
--help -h No Explain previous options basespace status -h

One Command to Rule Them All

(New since 6.4.0)

The most efficient way to create and upload an analysis is the new command, which enables direct folder analysis with automatic uploading. You can either let the system guide you through selecting the appropriate pipeline or specify your choice directly.

Direct pipeline selection:

$ python3 sg-upload-v2-wrapper.py new --folder /path/to/fastq/files --ref MyRun123 --pipeline 1234 --upload

Run successfully created with id 200002747
Starting upload after analysis creation...
Upload ended in 123456ms

This single command does it all:

  • Automatically scans your FASTQ folder
  • Creates the analysis request (with either interactive or direct pipeline selection)
  • Immediately starts the upload

No need to manually create JSON files or to remember to initiate the upload after run creation; everything is handled in a single step. Ideal for both interactive use and automated scripts, making your workflow seamless and efficient.


Migration Guides

Sample Sheet v1 to v2 Migration

This guide explains how to migrate from v1 to v2. For information about the v2 format, see the Sample Sheet upload workflow section.

Key differences of the v1 format from v2:

  • Uses the [SOPHIA_DDM_Data_v1] section header instead of [SOPHIA_DDM_Data]
  • Does not support the [SOPHIA_DDM_Settings] section

How to Migrate from v1 to v2

To migrate an existing v1 sample sheet to v2 format:

  1. Change the section header: Replace [SOPHIA_DDM_Data_v1] with [SOPHIA_DDM_Data]

  2. Add the Settings section (optional but recommended): Add a [SOPHIA_DDM_Settings] section before the data section:

     [SOPHIA_DDM_Settings],,,,
     version,1,,,

  3. Keep all data columns unchanged: The column structure remains the same, so no changes are needed to your sample data rows

Example Migration:

Before (v1):

[SOPHIA_DDM_Data_v1],,,,
Sample_ID,Capture_ID,Bundle_SN,Pipeline_ID,Patient_Ref
SG10000008,1,BDS-1111111111-10,5,SDSD12

After (v2):

[SOPHIA_DDM_Settings],,,,
version,1,,,
[SOPHIA_DDM_Data],,,,
Sample_ID,Capture_ID,Bundle_SN,Pipeline_ID,Patient_Ref
SG10000008,1,BDS-1111111111-10,5,SDSD12
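The migration steps above are mechanical, so they can be scripted. A minimal sketch that applies both changes (header swap and settings section) to a v1 sample sheet read as text:

```python
# Sketch: migrate a v1 sample sheet to v2 as described in the steps
# above: swap the section header and prepend the settings section.
V2_SETTINGS = "[SOPHIA_DDM_Settings],,,,\nversion,1,,,\n"

def migrate_v1_to_v2(text):
    migrated = text.replace("[SOPHIA_DDM_Data_v1]", "[SOPHIA_DDM_Data]")
    return V2_SETTINGS + migrated

v1 = ("[SOPHIA_DDM_Data_v1],,,,\n"
      "Sample_ID,Capture_ID,Bundle_SN,Pipeline_ID,Patient_Ref\n"
      "SG10000008,1,BDS-1111111111-10,5,SDSD12\n")
print(migrate_v1_to_v2(v1))
```

Running this on the "Before (v1)" example reproduces the "After (v2)" sheet shown above.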

Documentation Version: 7.16.0-6.7.0 | Commit: 0fe5b5e7 | Built: 2026-04-01 08:05:23 UTC