Introduction
This command-line utility uses the SOPHiA DDM API to automate the upload of raw sequencing data.
The CLI can trigger the upload of a specific run and download the output of runs. The upload can be performed using an Illumina sample sheet file to define the sample details.
Setting up this functionality requires support from SOPHiA GENETICS' IT team and familiarity with command-line utilities. Please contact the support team at support@sophiagenetics.com.
This utility is a Java tool that is delivered separately; this user guide helps you get started with it. It has no graphical user interface and can only be used from the command line.
Requirements
The CLI tool requires either a Java Runtime Environment (the latest update of Oracle JRE 1.8 is recommended) or a Java Development Kit in order to start.
The wrapper script requires Python 3.8 or higher.
Prior to using the CLI tool, you should have logged in at least once in the SOPHiA DDM application to change your password from the auto-generated one sent to you via email to one of your own choosing. This is a security layer implemented in SOPHiA DDM that prevents users from performing any actions using the auto-generated password.
Installation
Download the Python script sg-upload-v2-wrapper.py, which keeps you up to date with the latest bug fixes and security upgrades of the tool.
Run the following command. If the output displays as below, you now have the latest version in the current directory.
$ python3 sg-upload-v2-wrapper.py --help
Sophia Genetics - Downloader v
Usage: Sophia Genetics upload-api client [-hvV] [COMMAND]
-h, --help Show this help message and exit.
-v, --verbose Talks a bit more
-V, --version Print version information and exit.
Commands:
login, li Login with your Sophia Genetics credentials
logout, lo Logs out the current user
login-iam
logout-iam
new, n New batch request analysis (run)
upload, up Uploads the last created batch analysis request (run)
manage, mg Manages batch analysis requests (runs)
status, s Get the status of one or many batch requests (runs)
file, f File management commands
export, e Export files from completed interpretations
tests, t Tests commands
version, cv Check version
patient Get patient information
order Manage patient orders
pipeline Pipeline management commands
sample Get sample information
userInfo User management commands
Enabling TLS 1.2 for Uploads with Java 11+
If you are using Java 11 or later, you may experience connection issues when executing upload commands. This is because Java 11+ defaults to TLS 1.3, which does not support TLS renegotiation, a feature required by the upload process.
To resolve this, add the following option to your Upload command to explicitly enforce TLS 1.2:
-Djdk.tls.client.protocols=TLSv1.2
This will instruct Java to use TLS 1.2, ensuring successful and secure communication during the upload.
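If the JVM is started indirectly (for example through the Python wrapper), one way to pass this flag is the standard JAVA_TOOL_OPTIONS environment variable, which HotSpot-based JVMs read at startup. This is a sketch under that assumption; your wrapper version may also expose a more direct way to pass JVM flags.

```shell
# Assumption: the wrapper launches a JVM; JAVA_TOOL_OPTIONS is picked up
# automatically by any HotSpot-based JVM at startup.
export JAVA_TOOL_OPTIONS="-Djdk.tls.client.protocols=TLSv1.2"
python3 sg-upload-v2-wrapper.py upload
```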
NOTE
The terms Batch Analysis Request and run are used interchangeably in this document.
Sample Sheet upload workflow
The SOPHiA DDM™ CLI enables automated run upload using a Sample Sheet CSV file generated with Illumina® sequencers. The CLI supports both the new Illumina sample sheet v2 format (with [SOPHIA_DDM_Settings] and [SOPHIA_DDM_Data] sections) and the legacy format (with [SOPHIA_DDM_Data_v1] section) for backward compatibility. The CLI Sample Sheet upload workflow supports the upload of large runs in multiple batches, organized by hybrid capture groups or other custom grouping. It also enables specification of SOPHiA DDM™ Bundle Serial Numbers and SOPHiA DDM™ Pipeline IDs for each sample.
Option 1 - all samples are from the same assay and will be analyzed using the same pipeline. In this case, you can exclude the Pipeline_ID column from the sample sheet and pass the pipeline on the command line:
$ python3 sg-upload-v2-wrapper.py new --folder /path/to/fastq/files --sampleSheet samplesheet.csv --pipeline=PIPID --upload
Option 2 - samples are from different assays and will be analyzed using different pipelines. The sample sheet then needs to contain the Pipeline_ID column, filled in for each sample.
$ python3 sg-upload-v2-wrapper.py new --folder /path/to/fastq/files --ref Run_reference --sampleSheet samplesheet.csv --upload
See the full format example below.
The CLI uses the Illumina sample sheet v2 format (recommended). For CLI usage, only the [SOPHIA_DDM_Data] section is required. The [SOPHIA_DDM_Settings] section is optional and defaults to version 1 if omitted. The Illumina header sections ([Header], [BCLConvert_Data]) are optional as well and only needed if you're using the sample sheet with Illumina BCL converter tools.
1. Minimal format example
Download: sampleSheet_minimal.csv
[SOPHIA_DDM_Data]
Sample_ID,Capture_ID,Pipeline_ID
SG10000008,1,1234
SG10000009,1,1234
library01,2,1235
library02,2,1235
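As an illustration of how the sectioned CSV layout above can be consumed, here is a minimal Python sketch that extracts the rows of the [SOPHIA_DDM_Data] section. It is not the CLI's actual parser; the section and column names come from the examples in this guide.

```python
import csv
from io import StringIO

def parse_ddm_section(text, section="[SOPHIA_DDM_Data]"):
    """Extract the rows of one [SECTION] of a sectioned sample sheet.

    Returns a list of dicts keyed by the section's header row.
    Illustrative only; not the CLI's own parser.
    """
    rows, in_section = [], False
    for line in text.splitlines():
        stripped = line.strip().strip(",")  # full-format sheets pad lines with commas
        if stripped.startswith("["):
            in_section = stripped == section
            continue
        if in_section and stripped:
            rows.append(stripped)
    return list(csv.DictReader(StringIO("\n".join(rows))))

sheet = """[SOPHIA_DDM_Data]
Sample_ID,Capture_ID,Pipeline_ID
SG10000008,1,1234
library01,2,1235
"""
samples = parse_ddm_section(sheet)
# e.g. samples[0]["Sample_ID"] == "SG10000008"
```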
2. Full format example
Full format example (with optional Illumina sections for BCL converter compatibility):
Download: sampleSheet_full.csv
[Header],,,,
FileFormatVersion,2,,,
RunName,MyRun,,,
InstrumentPlatform,NextSeq1k2k,,,
InstrumentType,NextSeq2000,,,
,,,,
[Reads],,,,
Read1Cycles,125,,,
Read2Cycles,125,,,
Index1Cycles,8,,,
Index2Cycles,8,,,
,,,,
[BCLConvert_Settings],,,,
SoftwareVersion,x.y.z,,,
,,,,
[BCLConvert_Data],,,,
Lane,Sample_ID,index,index2,,
1,S01-TOO-12plex-P1-rep1,ATCCACTG,AGGTGCGT,,
1,S02-TOO-12plex-P1-rep2,GCTTGTCA,GAACATAC,,
,,,,
[SOPHIA_DDM_Settings],,,,
version,1,,,
,,,,
[SOPHIA_DDM_Data],,,,
Sample_ID,Capture_ID,Upload_Batch,Bundle_SN,Pipeline_ID,Patient_Ref,SIS_No,Order_ID,HPO_ID,Virtual_Panel_ID,Gene_Filter_ID,Patient_Lock,Disease_ID,Patient_First_Name,Patient_Last_Name,Patient_DOB,Patient_Gender,Test_Date,Date_Collected,Sample_Type_ID,Library_Type
SG10000008,1,1,BDS-1111111111-10,5,SDSD12,sis-744587603-44,Order1,1-2-3,123,32323,0,12-23-22,John,Doe,2000-02-03,M,2025-09-03,2025-09-01,308000,DNA
SG10000009,1,1,BDS-1111111112-11,6,DFDF111,sis-744587603-44,Order2,2-3-5,345,12344,1,334-44,John,Doe,2000-02-03,M,2025-09-03,2025-09-01,408000,RNA
library01,1,2,BDS-1111111113-12,6,DSFF111,sis-744587603-44,Order3,4-6-9,445,2444,1,34-54,Jane,Doe,2000-02-03,F,2025-08-03,2025-08-01,608000,-
library02,1,2,BDS-1111111114-13,6,DFDS11,-,Order4,1-8-7,2323,55666,0,34-65-64,Jane,Doe,2000-02-03,F,2025-08-03,2025-08-01,808000,-
Format Structure:
[SOPHIA_DDM_Settings]: Optional section supporting the following fields:
- version: Sample sheet version. Defaults to 1 if omitted.
- filetype: File type for the upload. Set to VCF for VCF-based uploads. If omitted, FASTQ files are assumed.
[SOPHIA_DDM_Data]: Main data section containing sample information with all required and optional columns. Required.
[Header] and [BCLConvert_Data]: Optional Illumina sections. Only needed if using the sample sheet with Illumina BCL converter tools. Not required for CLI usage.
Note: The CLI also supports the legacy v1 format (with [SOPHIA_DDM_Data_v1] section) for backward compatibility. For information about the legacy format and how to migrate from v1 to v2, see the Migration Guides section.
Sample Sheet Columns
Sample_ID: the id of the sample - needs to match the first part of the fastq file names (e.g. SAMPLEID in SAMPLEID_S01_L001_R1_001.fastq.gz)
Capture_ID: sample capture group id, identical id means the samples have been captured together in the hybridization capture workflow. By default, run splitting will be based on Capture_ID groups.
Upload_Batch: if the lab prefers multiple captures per upload (e.g. 2 captures per upload batch), this column defines the upload batch ID; samples with the same ID will be uploaded together as one batch / run
Bundle_SN: for SOPHiA Bundle Solutions only - the Serial Number from the box containing reagents; replaces the previous separate mapping file
SIS_No: the SIS or DIS number corresponding to the purchase order if sample has been processed as part of Integrated or Dispatch service. Use - (hyphen) if the sample is not part of SIS.
Pipeline_ID: the ID of the pipeline to be launched for the sample - specifying different ids allows mixing multiple panels in the same run / upload batch. (Can be retrieved using the pipeline -l command.)
Patient_Ref: the patient reference to be associated with the sample - defaults to Sample_ID (like when uploading via SOPHiA DDM UI)
Disease_ID: the disease IDs to be added to the patient. (Multiple disease IDs separated by "-" hyphen)
Order_ID: the Order ID to be added for the sample
HPO_ID: the HPO IDs to be added for the order. (Multiple HPO IDs separated by "-" hyphen)
Virtual_Panel_ID: the Virtual Panel ID associated with the Order.
Gene_Filter_ID: the Gene Filter ID for the Order.
Patient_Lock: sets the patient lock for the order. ("1" sets the lock, "0" leaves it unset)
Gene_List: a semicolon-separated list of gene names used to dynamically create a virtual gene panel for the order (e.g., BRCA1;TP53;MAP2K1). Only letters, digits, hyphens, and semicolons are accepted. Gene panel resolution is performed before run creation; if the gene list cannot be resolved (e.g. unknown gene names), the upload fails immediately and no orphaned run or samples are left in the system. The error message will indicate which sample's gene list could not be resolved.
Patient_First_Name: The patient's first name, formatted as a string.
Patient_Last_Name: The patient's last name, formatted as a string.
Patient_DOB: The patient's date of birth, formatted as a string in the format yyyy-mm-dd (e.g., 1990-01-01).
Patient_Gender: The patient's gender, formatted as a string; accepted values are "M" (male), "F" (female), or "U" (unknown)
Test_Date: The date the test was performed, formatted as a string in the format yyyy-mm-dd (e.g., 2025-06-13).
Date_Collected: The date the sample was collected, formatted as a string in the format yyyy-mm-dd (e.g., 2025-06-13).
Sample_Type_ID: The sample type ID of the sample. The mapping to be used is listed below.
| Sample Type ID | Sample Type Name |
|----------------|--------------------------------|
| 108000 | PERIPHERAL_BLOOD |
| 208000 | FRESH_TUMOR |
| 308000 | FFPE |
| 408000 | BIOPSY |
| 508000 | CELL_LINE |
| 1308000 | CFDNA |
| 608000 | CTDNA |
| 708000 | BUCCAL_SWAB |
| 808000 | NASOPHARYNGEAL_SWAB |
| 908000 | SPUTUM |
| 1008000 | BRONCHOALVEOLAR_LAVAGE |
| 1108000 | SALIVA |
| 1208000 | BONE_MARROW |
| 8000 | OTHER |
Library_Type: The library type of the sample, used to specify which samples to include in a grouped analysis. To group the analysis, either add a Library_Type column or use the suffixes listed below in the Sample_ID column. For none, use "-" as the column value. Allowed values are described below.
| Library_Type | Sample ID Suffix |
|------------------|--------------------------|
| DNA | -D |
| DNA_WGS | -DW |
| WGS | -W |
| RNA | -R |
| LIB1 | -lib1 |
| LIB2 | -lib2 |
| NORMAL | -N |
| TUMOR | -T |
| NONE | |
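The suffix rules above can be sketched in Python as a longest-suffix-first lookup (so that -DW is not mistaken for -W). This is an illustrative mapping, not the CLI's implementation; the suffix case sensitivity shown in the table is assumed.

```python
# Suffix-to-library-type mapping taken from the table above.
SUFFIXES = {
    "-lib1": "LIB1", "-lib2": "LIB2",
    "-DW": "DNA_WGS",
    "-D": "DNA", "-W": "WGS", "-R": "RNA",
    "-N": "NORMAL", "-T": "TUMOR",
}

def library_type_from_sample_id(sample_id):
    """Infer the library type from a Sample_ID suffix (sketch, not the CLI's logic)."""
    # Check longer suffixes first so "-DW" is not matched as "-W".
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if sample_id.endswith(suffix):
            return SUFFIXES[suffix]
    return "NONE"
```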
Library type grouped DNA/RNA using column:
[SOPHIA_DDM_Data],,,,
Sample_ID,Library_Type,,
sample1,DNA,,
sample1,RNA,,
sample2,DNA,,
sample3,RNA,,
Library type grouped DNA/RNA using suffix:
[SOPHIA_DDM_Data],,,,
Sample_ID,,,
sample1-D,,,
sample1-R,,,
sample2-D,,,
sample3-R,,,
Library type grouped Tumor/Normal using suffix:
[SOPHIA_DDM_Data],,,,
Sample_ID,,,
sample1-T,,,
sample1-N,,,
sample2-T,,,
sample3-T,,,
Only the Sample_ID column is mandatory in the sheet; all other columns are optional.
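As a pre-flight check before creating a run, the format rules described above (Gene_List characters, yyyy-mm-dd dates, gender codes, Patient_Lock values) can be sketched in Python. This is an illustrative client-side validation, not the CLI's own.

```python
import re
from datetime import datetime

def validate_row(row):
    """Check a few of the sample sheet column formats described above.

    Returns a list of problems; an empty list means the checked fields
    look valid. Sketch of a client-side pre-check, not the CLI's validation.
    """
    problems = []
    # Gene_List: letters, digits, hyphens and semicolons only
    gene_list = row.get("Gene_List", "")
    if gene_list and not re.fullmatch(r"[A-Za-z0-9;-]+", gene_list):
        problems.append("Gene_List contains unsupported characters")
    # Dates must be yyyy-mm-dd
    for col in ("Patient_DOB", "Test_Date", "Date_Collected"):
        value = row.get(col)
        if value:
            try:
                datetime.strptime(value, "%Y-%m-%d")
            except ValueError:
                problems.append(f"{col} is not in yyyy-mm-dd format")
    # Gender: M, F or U
    if row.get("Patient_Gender") not in (None, "M", "F", "U"):
        problems.append("Patient_Gender must be M, F or U")
    # Patient_Lock: 0 or 1
    if row.get("Patient_Lock") not in (None, "0", "1"):
        problems.append("Patient_Lock must be 0 or 1")
    return problems
```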
VCF Sample Sheet
For VCF-based uploads, set filetype,VCF in the [SOPHIA_DDM_Settings] section. The VCF sample sheet uses a dedicated column set instead of the FASTQ columns above.
VCF format example (tumor-normal pair):
[SOPHIA_DDM_Settings]
version,2
filetype,VCF
[SOPHIA_DDM_Data]
Sample_ID,Patient_Ref,Pipeline_ID,Library_Type,Group_ID,File_Name,Order_ID,Order_Date,Icd10_Info,Tumor_Site
ND-PATIENT-001,PT-2026-001,8579,NORMAL,GROUP_001,short_variants.vcf,ORD-001,2024-06-15,C34.10 Lung malignancy,lung
TD-PATIENT-001,PT-2026-001,8579,TUMOR,GROUP_001,short_variants.vcf,ORD-001,2024-06-15,C34.10 Lung malignancy,lung
VCF-specific columns:
File_Name: the VCF file name to associate with the sample row. VCF files must contain the standard header columns (#CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO).
Order_Date: the date the order was placed, in yyyy-MM-dd format. Requires Order_ID to be present.
Icd10_Info: ICD-10 diagnosis code and description (e.g., C34.10 Lung malignancy). Requires Order_ID to be present.
Tumor_Site: the anatomical site of the tumor (e.g., lung).
Group_ID: groups related sample rows (e.g., a tumor-normal pair) into a single analysis. When Group_ID is used, Patient_Ref is required for patient identification.
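The header requirement mentioned under File_Name can be checked locally before upload. The sketch below assumes a tab-delimited VCF header line; it is an illustration, not part of the CLI.

```python
REQUIRED_VCF_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]

def has_standard_vcf_header(lines):
    """Return True if the first non-meta line is a standard VCF header row.

    `lines` is any iterable of text lines. Extra columns (FORMAT, sample
    columns) after the eight standard ones are allowed.
    """
    for line in lines:
        if line.startswith("##"):      # skip meta-information lines
            continue
        columns = line.rstrip("\n").split("\t")
        return columns[:8] == REQUIRED_VCF_COLUMNS
    return False
```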
Tumor-normal pairs and Order_ID deduplication:
In a tumor-normal VCF analysis, the NORMAL and TUMOR rows typically share the same Order_ID. The CLI automatically deduplicates orders: when multiple rows in the same batch share the same Order_ID, the order is created only once. A console warning is displayed for each duplicate:
Order ID 'ORD-001' appears on multiple rows (e.g. tumor-normal pair) — order will be created only once.
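The deduplication behavior can be sketched as a first-occurrence filter over the sample rows. This mirrors the description above but is not the CLI's code.

```python
def dedupe_orders(rows):
    """Return rows keeping each Order_ID once, warning on duplicates.

    Sketch of the deduplication behavior described above; rows with no
    Order_ID (or "-") are skipped entirely.
    """
    seen = set()
    unique = []
    for row in rows:
        order_id = row.get("Order_ID")
        if not order_id or order_id == "-":
            continue
        if order_id in seen:
            print(f"Order ID '{order_id}' appears on multiple rows "
                  "(e.g. tumor-normal pair) - order will be created only once.")
            continue
        seen.add(order_id)
        unique.append(row)
    return unique
```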
- By default, all samples are considered a single run if no Capture_ID or Upload_Batch is present.
- If Upload_Batch is present, it takes precedence over Capture_ID when splitting the samples into specific runs.
- Bundle_SN can be passed here or via the --bdsNumber or --bdsMappingFile options.
- By default, underscores (_) in the sample ID are converted to hyphens (-). To preserve underscores, pass the --includeUnderscores flag when creating the run.
Note: Sample numbers within a batch must be unique.
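The splitting rules in the notes above (Upload_Batch takes precedence over Capture_ID; with neither, all samples form one run) can be sketched as:

```python
from collections import defaultdict

def split_into_runs(rows):
    """Group sample rows into upload batches per the precedence rules above.

    Upload_Batch wins over Capture_ID; with neither column, all samples
    form a single run. Illustrative sketch, not the CLI's implementation.
    """
    if rows and "Upload_Batch" in rows[0]:
        key = "Upload_Batch"
    elif rows and "Capture_ID" in rows[0]:
        key = "Capture_ID"
    else:
        return {"run": list(rows)}
    runs = defaultdict(list)
    for row in rows:
        runs[row[key]].append(row)
    return dict(runs)
```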
Upload size limits
There are technical limitations on the total run size that can be processed by SOPHiA DDM in one upload batch. Please note that the application specific limitations are described in the corresponding product's Instruction for Use (IFU) document. Additionally, the upload size limits described below apply to all CLI uploads:
> Enhanced Exome Solutions: maximum 512 GiB per upload batch
> All other applications: maximum 420 GiB per upload batch
Attempting to upload a batch that exceeds the above limits will result in an error. To ensure upload batches remain within the accepted size limits, the Sample Sheet upload workflow can be used with the Capture_ID or Upload_Batch column so that the upload is performed in multiple batches.
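A batch's total size can be checked locally before uploading. The sketch below sums on-disk file sizes and compares them to the documented limits; the authoritative checks are performed server-side.

```python
import os

GIB = 1024 ** 3
DEFAULT_LIMIT = 420 * GIB   # all other applications
EXOME_LIMIT = 512 * GIB     # Enhanced Exome Solutions

def batch_within_limit(paths, limit=DEFAULT_LIMIT):
    """Return (ok, total_bytes) for a batch of file paths.

    Client-side pre-check sketch of the limits documented above.
    """
    total = sum(os.path.getsize(p) for p in paths)
    return total <= limit, total
```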
Commands
The script is run by calling sg-upload-v2-wrapper.py followed by one of the commands below and its options. Each command has a --help option (or -h) which will display more information.
login-iam
(recommended)
Recommended authentication mechanism for logging in to the SOPHiA GENETICS platform.
This command is used to log in with IAM/SSO. Your browser will open for login; after a successful login you can close the tab.
After logging in, you will not need to log in again as long as you have used the CLI at least once within 90 days.
IAM/SSO is a new flow that lets you authenticate in a secure way. It was introduced to increase the security standards of the application.
Users can log in to multiple accounts using the --client-id option and can create runs/uploads for a specific client ID. If no client ID is provided, the main account's client ID is used.
Typical usage
$ python3 sg-upload-v2-wrapper.py login-iam
Login successful
$ python3 sg-upload-v2-wrapper.py login-iam --client-id 12345
Login successful
You're logged to IAM with client id: 12345
$ python3 sg-upload-v2-wrapper.py login-iam --client-id 67890
Login successful
You're logged to IAM with client id: 67890
Options
| Option | Alias | Mandatory | Description | Example |
|---|---|---|---|---|
| --force | -f | No | Forces a re-login with IAM/SSO even if you are already logged in | login-iam --force |
| --headless | | No | Use this when running on a system with no GUI access | login-iam --headless |
| --client-id | | No | If specified, logs in to the account related to this client ID; otherwise logs in to the main account. Allows logging in to multiple accounts and creating runs/uploads for a particular account. | login-iam --client-id 123456 |
logout-iam
This command logs out the currently logged-in IAM/SSO user. Any subsequent commands (other than login/login-iam) will fail.
Typical usage
$ python3 sg-upload-v2-wrapper.py logout-iam
You have logged out
login
(not recommended, will be deprecated; use login-iam)
Note: We are moving customers from grid-card-based authentication to a new authentication method. It is recommended to use the login-iam command instead of the login command; please refer to the login-iam section above for more information. The login command will be phased out in the upcoming months.
Only one user can be logged in at a time. If another user logs in, any previous user will automatically be logged out.
Typical usage
$ python3 sg-upload-v2-wrapper.py login -u myemail@mycompany.org -p
Enter value for --password (Password): <enter your password>
Provide token for coordinate [8, D]: <enter your grid token>
Log in success
Options
| Option | Alias | Mandatory | Description | Example |
|---|---|---|---|---|
| --user | -u | Yes | Your Sophia Genetics username | login --user myemail@my-company.org |
| --password | -p | * | Command line will interactively ask for your password | login --user myemail@my-company.org -p |
| --password:env | -pe | * | Password is extracted from an environment variable | login --user myemail@my-company.org -pe ENV_PASSWORD_VAR |
| --password:file | -pf | * | Password is read from a UTF-8 encoded text file | login --user myemail@my-company.org -pf /path_to/password_file |
| --help | -h | No | Explain previous options | login -h |
*one of -p, -pe or -pf is mandatory
logout
The currently logged in user will be disconnected, and subsequent commands (other than login) will fail.
Typical usage
$ python3 sg-upload-v2-wrapper.py logout
User logged out
adegen (DEPRECATED: see the Sample Sheet upload workflow)
Generate ADE formatted JSON from FastQ files. This command provides a convenient way to create ADE format JSON files that can be used with the new command.
Typical usage
$ python3 sg-upload-v2-wrapper.py adegen --folder /path/to/fastq/files --ref MyRun123 --output output.json
Available pipelines:
1234: BRCA Analysis (MISEQ)
5678: Hereditary Cancer Solution (NEXTSEQ)
9012: Solid Tumor Solution (NOVASEQ)
Enter pipeline ID: 1234
Created JSON file: output.json
Options
| Option | Alias | Mandatory | Description | Example |
|---|---|---|---|---|
| --folder | -f | Yes | Path to folder containing FastQ files | adegen --folder /path/to/fastq/files |
| --output | -o | No | Output JSON file. If not specified, JSON will be written to stdout | adegen --folder /path/to/files --output run.json |
| --pipeline | -p | No | Pipeline ID (defaults to -1, which triggers interactive pipeline selection) | adegen --folder /path/to/files --pipeline 12345 |
| --sampletype | -s | No | Sample Type ID (defaults to 108000) | adegen --folder /path/to/files --sampletype 108001 |
| --ref | -r | No | Run name | adegen --folder /path/to/files --ref MyRun123 |
| --deep | -d | No | Recurse through the folder when searching for FastQ files | adegen --folder /path/to/files --deep |
| --bdsNumber | | No | Serial Number for all SOPHiA GENETICS bundle solutions. When provided, this number will be applied to all samples in the run. | adegen --folder /path/to/files --bdsNumber BDS-123456 |
| --bdsMappingFile | | No | Path to a mapping file containing patient reference to Serial Number mappings | adegen --folder /path/to/files --bdsMappingFile /path/to/mapping.csv |
| --help | -h | No | Explain previous options | adegen -h |
BDS Number Format
When using either the --bdsNumber or --bdsMappingFile option:
- BDS numbers must start with the "BDS-" prefix
- The prefix must be followed by a valid serial number
- --bdsNumber and --bdsMappingFile cannot be used together
BDS Mapping File Format
When using --bdsMappingFile, the file should contain one mapping per line containing the patient reference and the BDS number:
patient1,BDS-123456
patient2,BDS-789012
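The mapping file format can be parsed and validated with a few lines of Python. The sketch below mirrors the patient_ref,BDS-number rule and the BDS- prefix requirement, but is not the CLI's own parser.

```python
def load_bds_mapping(path):
    """Parse a patient_ref,BDS-number mapping file into a dict.

    Raises ValueError on malformed lines, mirroring the format rules above.
    Illustrative sketch, not the CLI's implementation.
    """
    mapping = {}
    with open(path, encoding="utf-8") as handle:
        for lineno, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            parts = line.split(",")
            if len(parts) != 2 or not parts[1].startswith("BDS-"):
                raise ValueError(
                    f"line {lineno}: each line should contain patient_ref,BDS-number")
            mapping[parts[0]] = parts[1]
    return mapping
```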
Pipeline Selection
When not specifying a pipeline ID:
1. The system displays available pipelines with IDs and sequencer codes
2. You will be prompted to select a pipeline by entering its ID
3. The selection is validated against the available pipelines
Errors
If there are any issues with the input, an error message will explain what went wrong.
Example 1 - Invalid pipeline ID
Error: Invalid pipeline ID
Example 3 - Invalid BDS mapping file format
Error: Invalid format in mapping file. Each line should contain: patient_ref,BDS-number
Important Notes
1. This command requires new Platform Services to be activated
2. The generated JSON file can be used as input for the new command
3. When no output file is specified, the JSON is printed to stdout
new
Create a new batch analysis request. This can be done in a few different ways, but the recommended workflow uses a sample sheet .csv file; see the Sample Sheet upload workflow section.
If you would prefer to use a JSON file, there are two formats: legacy and the more recent ADE. Python scripts are provided to generate both formats.
Typical usage
Using JSON file:
$ python3 sg-upload-v2-wrapper.py new --json /path_to/file.json
Run successfully created with id 200002747
Using folder with FastQ files:
$ python3 sg-upload-v2-wrapper.py new --folder /path/to/fastq/files --ref MyRun123
Available pipelines:
1234: BRCA Analysis (MISEQ)
5678: Hereditary Cancer Solution (NEXTSEQ)
9012: Solid Tumor Solution (NOVASEQ)
Enter pipeline ID: 1234
Run successfully created with id 200002748
Using client-id flag with upload:
$ python3 sg-upload-v2-wrapper.py new --folder /path/to/fastq --client-id 12345 --pipeline 1234 --upload --ref MyRun123
Run successfully created with id 200002749
Starting upload after analysis creation...
Upload ended in 123456ms
$ python3 sg-upload-v2-wrapper.py new --folder /path/to/fastq --client-id 67890 --pipeline 1234 --upload --ref MyRun456
Run successfully created with id 200002750
Starting upload after analysis creation...
Upload ended in 123456ms
Using BaseSpace project ID:
$ python3 sg-upload-v2-wrapper.py new --basespace-project 12345678 --pipeline 1234 --ref MyRun123
BaseSpace project integration:
Project ID: 12345678
Virtual folder: /tmp/basespace_project_12345678_xyz
Note: Files will be processed via BaseSpace URLs during analysis
FASTQ files discovered: 8
Available pipelines:
1234: BRCA Analysis (MISEQ)
5678: Hereditary Cancer Solution (NEXTSEQ)
Enter pipeline ID: 1234
Run successfully created with id 200002751
Using BaseSpace project ID with sample sheet:
$ python3 sg-upload-v2-wrapper.py new --basespace-project 12345678 --sampleSheet samplesheet.csv --ref MyRun123
BaseSpace project integration:
Project ID: 12345678
Virtual folder: /tmp/basespace_project_12345678_xyz
Note: Files will be processed via BaseSpace URLs during analysis
FASTQ files discovered: 8
Run successfully created with id 200002752
Options
| Option | Alias | Mandatory | Description | Example |
|---|---|---|---|---|
| --json | -j | * | Path to your JSON file describing the new run | new --json /path_to/file.json |
| --folder | -f | * | Path to folder containing FastQ files | new --folder /path/to/fastq/files |
| --deep | -d | No | Recurse through the folder when searching for FastQ files | new --folder /path/to/files --deep |
| --ref | -r | No | Run name (required when using --folder) | new --folder /path/to/files --ref MyRun123 |
| --pipeline | -p | No | Pipeline ID (defaults to -1, which triggers interactive pipeline selection) | new --folder /path/to/files --pipeline 12345 |
| --sampletype | -s | No | Sample Type ID (defaults to 108000) | new --folder /path/to/files --sampletype 108001 |
| --legacy | No | Use this option if you provide the json with the legacy format | new --json /path_to/file.json --legacy | |
| --client-id | | ** | If specified, creates the RUN on the account related to this client ID. Important notes: 1. If you create RUNs for two clients located on different data centers, the RUN ID can be the same for both; in that case the newest one overrides the oldest. To avoid this, run the "upload" command between each creation. 2. This flag is only relevant for Core Services (it does not work with new Platform Services). 3. When specifying client-id, either the --upload flag or --sampleSheet has to be supplied. | new --json /path_to/file.json --client-id 123456 |
| --force-platform | -fp | No | If specified, will create the RUN with new Platform Services. | new --json /path_to/file.json -fp |
| --bdsNumber | No | Serial Number for all SOPHiA GENETICS bundle solutions. When provided, this number will be applied to all samples in the run. Works only with One command rules flow. | new --folder /path/to/fastq/files --pipeline 12345 --bdsNumber BDS-123456 | |
| --bdsMappingFile | No | Path to a mapping file containing patient reference to Serial Number mappings. The file should be in CSV format with each line containing: patient_ref,BDS-number. Works only with One command rules flow. | new --folder /path/to/fastq/files --pipeline 12345 --bdsMappingFile /path/to/mapping.csv | |
| --basespace-project | | No | BaseSpace project ID for file discovery and upload. When specified, files are processed via BaseSpace URLs (no upload needed). Can be used with --pipeline or --sampleSheet. Requires BaseSpace authentication (see basespace auth login). | new --basespace-project 12345678 --pipeline 1234 --ref MyRun123 |
| --upload | | ** | When set, automatically starts the upload process after successfully creating the analysis. Note: upload is automatically skipped when using --basespace-project since files are handled via BaseSpace URLs. | new --json /path_to/file.json --upload |
| --help | -h | No | Explain previous options | new -h |
*one of --json or --folder is mandatory
**when specifying client-id, one of --upload or --sampleSheet is mandatory
BDS Number Format
When using either the --bdsNumber or --bdsMappingFile option:
- BDS numbers must start with the "BDS-" prefix
- The prefix must be followed by a valid serial number
- --bdsNumber and --bdsMappingFile cannot be used together
BDS Mapping File Format
When using --bdsMappingFile, the file should contain one mapping per line:
patient1,BDS-123456
patient2,BDS-789012
BaseSpace Project Integration
When using --basespace-project:
- The command creates a virtual folder that represents the BaseSpace project
- Files are discovered from the BaseSpace project and processed via BaseSpace URLs during workflow execution
- Upload is automatically skipped - BaseSpace files are handled via URLs, so no file upload is needed
- Requires BaseSpace authentication (run basespace auth login first)
- Can be used with either --pipeline (for pipeline-based import) or --sampleSheet (for sample sheet-based import)
- The BaseSpace project ID can be found using basespace project list
Note
Sample numbers within a run must be unique.
Errors
If the JSON file is incorrect, an error message will explain what went wrong (the message is not shown in red in the command-line output).
Example 1 - "germatic" does not exist; it is either germline or somatic
Bad request
The experimentType germatic passed in parameters does not exist
Example 2 - one of the files does not exist
Unprocessable request
The file /path_to/SG10000001_S1_L001_R1_001.fastq.gz passed in parameters does not exist
upload
Upload the files and execute the batch analysis requests created with the new command. If an upload was in
progress, it will be resumed. If several batch analysis requests have been created, they will be uploaded
sequentially.
Typical usage
$ python3 sg-upload-v2-wrapper.py upload
Getting upload run information
Found 3 runs to upload.
Uploading run n°
Upload ended in 123456ms
Uploading run n°
Upload ended in 789012ms
Uploading run n°
Upload ended in 3456789ms
Options
| Option | Alias | Mandatory | Description | Example |
|---|---|---|---|---|
| --id | -i | No | Uploads or resumes a given batch analysis request | upload --id 200003035 |
| --dry-run | -d | No | Verifies that everything is okay, but doesn't start the upload process | upload --dry-run |
| --port | -p | No | The upload process opens a socket (default: 40530) on your computer to ensure that only one upload process runs at a time. If this socket is already in use, you can specify another one. To run multiple uploads in parallel on the same computer, use --port 0. | upload --port 56789 or upload --port 0 |
| --help | -h | No | Explain previous options | upload -h |
| --show-progress | -sp | No | Show upload rate while uploading | upload -i 200003035 --show-progress |
WARNING
Do not upload any files containing nominative information or any other direct identifier related to a patient (e.g. patient's full name).
manage
This command returns the status of a batch analysis request, and can remove or reset all local files that could have been created by mistake. It never modifies data on the server side.
Typical usage
$ python3 sg-upload-v2-wrapper.py manage
Found 3 upload ready to be sent :
- Run with id '200003035' for client id '3' has 1 analysis
- Run with id '200003036' for client id '3' has 1 analysis
- Run with id '200003037' for client id '3' has 4 analysis
You can upload a specific run by using the 'upload' command with the '--id' option
Options
| Option | Alias | Mandatory | Description | Example |
|---|---|---|---|---|
| --status | -s | No | Returns current status information for runs that have not been uploaded yet | manage --status |
| --delete | -d | No | Deletes local files of a given batch analysis request ID (run) | manage --delete 123456789 |
| --reset | -r | No | Deletes local files of all batch analysis requests (runs) | manage --reset |
| --help | -h | No | Explain previous options | manage -h |
status
Get the status of one or more batch analysis requests.
Typical usage
$ python3 sg-upload-v2-wrapper.py status --limit 3
200002934: Waiting for upload
200002933: Waiting for upload
200002932: Waiting for upload
Options
| Option | Alias | Mandatory | Description | Example |
|---|---|---|---|---|
| --id | -i | * | Get the status of one batch analysis request, given a specified identifier. The second line of the terminal output will be the status, for example "Waiting for upload". | status --id 200002934 |
| --limit | -l | * | Get the status of many batch analysis requests, given a specified limit. Each status is prefixed with the entity identifier and a colon. Records are ordered from the most recently created batch analysis request to the oldest. | status --limit 3 |
| --run-ref | -run-ref | * | Get the status of the batch analysis with the specified run reference and sample ID. Returns the latest run matching this condition. | status --run-ref run1 --sample-id sample1 |
| --sample-id | -sample-id | * | Get the status of the batch analysis with the specified run reference and sample ID. Returns the latest run matching this condition. | status --run-ref run1 --sample-id sample1 |
| --help | -h | No | Explain previous options | status -h |
| --pipeline-info | -pipeline | No | Lists the pipeline version for each sample in a run. Use in combination with the --sample-id and --run-ref options. | status --run-ref run1 --sample-id sample1 --pipeline-info |
*one of --id, --limit, or (--run-ref and --sample-id) is mandatory
Status Responses
- Waiting for upload
- Upload in progress
- Pipeline running
- Finished
- Status code unknown (####)
- Error
patient
Create new patients, list existing ones, or manage patient diseases.
Typical usage
List patients
$ python3 sg-upload-v2-wrapper.py patient --list patient1,patient2
[
{
"medicalInformationId": 111111111,
"personalInformationId": 1111123222,
"userRef": "patient1"
},
{
"medicalInformationId": 2222222222,
"personalInformationId": 22222444444,
"userRef": "patient2"
}
]
Add diseases to a patient
$ python3 sg-upload-v2-wrapper.py patient --add-diseases --patient-ref PATIENT_REF --diseases 1,2,3
Diseases added successfully for patient PATIENT_REF
Options
| Option | Alias | Mandatory | Description | Example |
|---|---|---|---|---|
| --list | -l | * | Retrieve technical IDs of the specified patients. When at least one patient does not exist in the system, an error message lists the patients that were not found: Unable to find following patients in SGP: notFound1,notFound2 | patient --list patient1,patient2 |
| --create | -c | * | Create the patients specified on the command line. Only non-existing patients are created; the result is then printed as with the --list option. | patient --create p3,p4,p5 |
| --add-diseases | | * | Add diseases to a specified patient using comma-separated disease IDs | patient --add-diseases --patient-ref PATIENT_REF --diseases 1,2,3 |
| --diseases | | ** | Comma-separated list of disease IDs to add to the patient | patient --add-diseases --patient-ref PATIENT_REF --diseases 1,2,3 |
| --force-platform | -fp | No | If specified, lists patients using the new Platform Services | patient --list -fp patient1,patient2 |
| --help | -h | No | Explain previous options | patient -h |
*One of --list, --create, or --add-diseases is mandatory. **Required when using --add-diseases.
order
Manage orders for patients with disease support. The system supports two modes:
- GEN1: Diseases are managed via patient command (default behavior)
- GEN2: Diseases are added to order table via pmi-svc
Note: This command is only available with Platform Services.
Typical usage
GEN1 Orders (Default)
Basic order
$ python3 sg-upload-v2-wrapper.py order --add --patient-ref patient1 --order-id ORDER123
Order 'ORDER123' added successfully for patient patient1
GEN1 order with phenotypes
$ python3 sg-upload-v2-wrapper.py order --add \
--patient-ref PATIENT123 \
--order-id ORDER123 \
--virtual-panel VP789 \
--phenotypes PHENO101,PHENO102,PHENO103 \
--filter FILTER303
Order 'ORDER123' added successfully for patient PATIENT123
Explicitly specify GEN1 (optional)
$ python3 sg-upload-v2-wrapper.py order --add \
--patient-ref PATIENT123 \
--order-id ORDER123 \
--order-type GEN1
Order 'ORDER123' added successfully for patient PATIENT123
GEN2 Orders with Diseases
GEN2 order with diseases (order-type must be explicitly set to GEN2)
$ python3 sg-upload-v2-wrapper.py order --add \
--patient-ref PATIENT123 \
--order-id ORDER123 \
--order-type GEN2 \
--disease-ids "100,200,300"
Order 'ORDER123' added successfully for patient PATIENT123
GEN2 order with all parameters
$ python3 sg-upload-v2-wrapper.py order --add \
--patient-ref PATIENT123 \
--order-id ORDER123 \
--order-type GEN2 \
--disease-ids "100,200" \
--phenotypes "PHENO101" \
--virtual-panel VP789 \
--filter FILTER303 \
--patient-lock
Order 'ORDER123' added successfully for patient PATIENT123
List orders for a patient
JSON format
$ python3 sg-upload-v2-wrapper.py order --list --patient-ref patient1
[ {
"id" : 22,
"patientId" : 627721022,
"orderId" : "ORDER-1",
"diseaseIds" : [],
"orderType" : "GEN1",
"createdAt" : "2025-03-10T13:58:12Z",
"updatedAt" : "2025-03-10T13:58:12Z"
}, {
"id" : 32,
"patientId" : 627721022,
"orderId" : "ORDER-2",
"diseaseIds" : ["100", "200"],
"orderType" : "GEN2",
"createdAt" : "2025-03-10T14:02:07Z",
"updatedAt" : "2025-03-10T14:02:07Z"
} ]
Flat table format
$ python3 sg-upload-v2-wrapper.py order --list --patient-ref patient1 --flat
Orders for patient patient1:
ID Order ID Created At Updated At Virtual Panel Phenotypes Filter Lock Diseases Type
---------- -------------------- ------------------------------ ------------------------------ -------------------- ------------------------------ ------------------------------ ---------- ------------------------------ ----------
22 ORDER-1 2025-03-10T13:58:12Z 2025-03-10T13:58:12Z VP789 PHENO101,PHENO102,PHENO103 FILTER303 No [] GEN1
32 ORDER-2 2025-03-10T14:02:07Z 2025-03-10T14:02:07Z VP789 [] [] No [100, 200] GEN2
Note: The Diseases column shows comma-separated disease IDs when present, or empty brackets [] when no diseases are associated with the order.
Options
| Option | Alias | Mandatory | Description | Example |
|---|---|---|---|---|
| --add | -a | * | Add a new order for a specified patient | order --add --patient-ref patient1 --order-id ORDER123 |
| --list | -l | * | List all orders for a specified patient | order --list --patient-ref patient1 |
| --patient-ref | -p | Yes | The patient reference to add an order for or list orders from | order --add --patient-ref patient1 --order-id ORDER123 |
| --order-id | -o | ** | The order ID to add (required when using --add) | order --add --patient-ref patient1 --order-id ORDER123 |
| --virtual-panel | -vp | No | Virtual panel ID for the order | order --add --patient-ref patient1 --order-id ORDER123 --virtual-panel VP789 |
| --phenotypes | -ph | No | Comma-separated list of phenotype IDs | order --add --patient-ref patient1 --order-id ORDER123 --phenotypes PHENO101,PHENO102 |
| --filter | -f | No | Cascade filter ID for the order | order --add --patient-ref patient1 --order-id ORDER123 --filter FILTER303 |
| --patient-lock | | No | Lock the patient to prevent modifications | order --add --patient-ref patient1 --order-id ORDER123 --patient-lock |
| --disease-ids | | No | Comma-separated list of disease IDs (only allowed with GEN2) | order --add --patient-ref patient1 --order-id ORDER123 --order-type GEN2 --disease-ids "100,200,300" |
| --order-type | | No | Order type (GEN1 or GEN2). Disease IDs are only allowed with GEN2 | order --add --patient-ref patient1 --order-id ORDER123 --order-type GEN2 |
| --flat | | No | Display orders in flat format instead of JSON (for the list command) | order --list --patient-ref patient1 --flat |
| --force-platform | -fp | No | If specified, will manage orders using Platform Services | order --list --patient-ref patient1 -fp |
| --help | -h | No | Explain previous options | order -h |
*One of --add or --list is mandatory. **Required when using --add.
Validation Rules
- Order Type: Defaults to GEN1 if not specified
- Disease IDs: Disease IDs can only be provided when the order type is explicitly set to GEN2; providing them with a GEN1 order is rejected
- Missing Order ID: Throws exception if order ID is not provided when adding orders
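These rules can be mirrored client-side before invoking the CLI, so malformed requests fail fast. A sketch of the documented checks; the `validate_order` helper is hypothetical, not part of the tool:

```python
def validate_order(order_id, order_type=None, disease_ids=None):
    """Mirror the documented validation rules (illustrative only).

    - order type defaults to GEN1 when not specified
    - disease IDs are only allowed when order type is explicitly GEN2
    - a missing order ID is an error
    """
    if not order_id:
        raise ValueError("order ID is required when adding orders")
    order_type = order_type or "GEN1"
    if disease_ids and order_type != "GEN2":
        raise ValueError("disease IDs are only allowed with GEN2 orders")
    return order_type

print(validate_order("ORDER123"))                          # GEN1
print(validate_order("ORDER123", "GEN2", ["100", "200"]))  # GEN2
```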
pipeline
Get all pipelines available to the currently logged-in user.
Typical usage
$ python3 sg-upload-v2-wrapper.py pipeline --list
[
{
"pipeline_id": 123,
"pipeline_name": "Pipeline 123",
"analysis_type": "BRCA",
"analysis_type_id": 30078000,
"kit": "Multiplicom_MASTR_assay",
"sequencer_id": 123456,
"sequencer": "ILLUMINA_MiSeq",
"experiment_type": "germline",
"pairend": true
},
{
"pipeline_id": 456,
"pipeline_name": "Pipeline 456",
"analysis_type": "HCS_v1_1",
"analysis_type_id": 6003000,
"kit": "IDT",
"sequencer_id": 123456,
"sequencer": "ILLUMINA_MiSeq",
"experiment_type": "germline",
"pairend": true
}
]
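Scripts can filter this JSON to find a suitable pipeline ID for later commands. A sketch assuming the output shape shown above; the `pipelines_for_sequencer` helper is illustrative, not part of the tool:

```python
import json

def pipelines_for_sequencer(raw_json: str, sequencer: str):
    """Pick pipeline IDs matching a sequencer from `pipeline --list` output."""
    return [p["pipeline_id"] for p in json.loads(raw_json)
            if p["sequencer"] == sequencer]

# Trimmed to the fields this helper uses.
raw = '''[
  {"pipeline_id": 123, "sequencer": "ILLUMINA_MiSeq"},
  {"pipeline_id": 456, "sequencer": "ILLUMINA_NextSeq"}
]'''
print(pipelines_for_sequencer(raw, "ILLUMINA_MiSeq"))  # [123]
```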
Options
| Option | Alias | Mandatory | Description | Example |
|---|---|---|---|---|
| --list | -l | Yes | Retrieve the list of pipelines allowed for the logged-in user | pipeline --list |
| --force-platform | -fp | No | If specified, lists pipelines using the new Platform Services. | pipeline --list -fp |
| --file-out | -o | No | The path where the output file will be written. | pipeline --list --file-out /test/example/pipelineOutput.txt |
| --help | -h | No | Explain previous options | pipeline -h |
sample
List all samples of a run or a specific sample.
Typical usage
View by Run ID
$ python3 sg-upload-v2-wrapper.py sample --run-id 1234567890
[
{
"id": 1111111111,
"sgacltId": 10,
"userRef": "dnoble",
"analysisType": "BRCA_Tumor",
"kit": "Multiplicom_MASTR_Plus",
"sampleType": "Other",
"sequencer": "ILLUMINA_MiSeq",
"status": "Creation",
"experimentType": "somatic",
"isPairend": true,
"isControl": false
},
{
"id": 2222222222,
"sgacltId": 10,
"userRef": "dnoble2",
"analysisType": "BRCA_Tumor",
"kit": "Multiplicom_MASTR_Plus",
"sampleType": "Other",
"sequencer": "ILLUMINA_MiSeq",
"status": "Creation",
"experimentType": "somatic",
"isPairend": true,
"isControl": false
}
]
View by Sample ID
$ python3 sg-upload-v2-wrapper.py sample --sample-id 1111111111
{
"id": 1111111111,
"sgacltId": 10,
"userRef": "dnoble",
"analysisType": "BRCA_Tumor",
"kit": "Multiplicom_MASTR_Plus",
"sampleType": "Other",
"sequencer": "ILLUMINA_MiSeq",
"status": "Creation",
"experimentType": "somatic",
"isPairend": true,
"isControl": false
}
Options
| Option | Alias | Mandatory | Description | Example |
|---|---|---|---|---|
| --run-id | | * | Get all samples of the specified run | sample --run-id 200002934 |
| --sample-id | | * | Get the description of the specified sample | sample --sample-id 123456 |
| --force-platform | -fp | No | If specified, lists samples using the new Platform Services. | sample --run-id 200002934 -fp |
| --help | -h | No | Explain previous options | sample -h |
*One of --run-id or --sample-id is mandatory.
file
List and download files from a batch request or an analysis.
Typical usage
List files by run ID
$ python3 sg-upload-v2-wrapper.py file --list --run-id 1234567890
[{"id":204177851,"name":"SG10000001_S1_L001_R1_001.fastq.gz","size":345676,"patient":"SG10000001","analysisId":300042016},
{"id":204177852,"name":"SG10000001_S1_L001_R2_001.fastq.gz","size":345682,"patient":"SG10000001","analysisId":300042016}]
List files by date (time in milliseconds, e.g. a Unix epoch timestamp) and extension
$ python3 sg-upload-v2-wrapper.py file --list --date 1704063600000 --extension .bam
Download file by file ID to non-existent destination
$ python3 sg-upload-v2-wrapper.py file --download --file-id 1234567890 --file-out /tmp/test/test.fastq
Will copy file 1234567890 into /tmp/test/test.fastq
Will create the parent folder: /tmp/test
Have created the parent folder: /tmp/test
Your file has been downloaded and is available here: /tmp/test/test.fastq
Download file by file ID to current directory
$ python3 sg-upload-v2-wrapper.py file --download --file-id 1234567890 --file-out report.pdf
Will copy file 204183269 into report.pdf
Your file has been downloaded and is available here: report.pdf
Download a JSON report based on order ID and patient reference
$ python3 sg-upload-v2-wrapper.py file --download-reports --order-id testorder --patient-ref testpatient
Download output files of an analysis based on run reference and sample ID
$ python3 sg-upload-v2-wrapper.py file --download --run-ref "testRun2025" --sample-id "mySampleName"
Options
| Option | Alias | Mandatory | Description | Example |
|---|---|---|---|---|
| --list | -l | * | List the files of a batch analysis request (run) or an analysis. Requires at least the --run-id or --analysis-id argument. By default, results are in JSON format. For files that have the information associated, the analysis ID and patient user reference are also provided: File list for run 200002764 [{"id":123,"name":"file.fastq.gz","size":0}, {"id":123,"name":"file.fastq.gz","size":0,"analysisId":123,"patient":"dnoble"}] | file --list --run-id 200002934 |
| --run-id | | | Goes along with --list; this represents the batch analysis request ID | file --list --run-id 200002934 |
| --analysis-id | | | Goes along with --list; this represents the analysis ID | file --list --analysis-id 200002934 |
| --flat | | | Outputs the results in CSV format: File list for run 200002764 id;name;size;analysisId;patient | file --list --run-id 200002934 --flat |
| --download | -d | * | Downloads the file given by its --file-id argument to the --file-out path | file --download --file-id 234565645 --file-out fastq.gz |
| --sample-id | | | The sample ID of the analysis (for use with the --download option) | file --download --run-ref "testRun2025" --sample-id "mySampleName" |
| --run-ref | | | The run reference name of the batch (for use with the --download option) | file --download --run-ref "testRun2025" --sample-id "mySampleName" |
| --file-id | | | The ID of the file to download (mandatory when using the --download option) | file --download --file-id 234565645 --file-out fastq.gz |
| --file-out | -o | | The path where the file will be downloaded (mandatory when using the --download option) | file --download --file-id 234565645 --file-out /test/example/fastq.gz |
| --run-id | -r | | Download all output files of the specified run. Input files (fastq and bam files) are not downloaded by default. | file --download --run-id 123456 |
| --with-input-files | | | Usable only with the --run-id option. If set, downloads all files of the run, including fastq and bam files. | file --download --run-id 123456 --with-input-files |
| --folder-user-ref | | | Usable only with the --run-id option. If set, downloads all files of the run and creates subfolders by patient/userRef instead of Sophia's internal analysis ID. | file --download --run-id 123456 --folder-user-ref |
| --skip-zip | | | Usable only with the --download option. If set, downloads all files of the run and skips zipping the folder. | file --download --run-id 123456 --skip-zip |
| --out | | | The destination folder for downloaded files. If used with the --run-id option, the zip is downloaded into the specified folder and named "runId-out.zip". This option cannot be used with the --file-id and --file-out options. | file --download --run-id 123456 --out /my/custom/path (downloads all files as /my/custom/path/123456-out.zip) |
| --date | | | List files uploaded since a specific date (time in milliseconds, e.g. a Unix epoch timestamp). Important: this flag can only be used when the --extension flag is provided. | file --list --date 1704063600000 --extension .bam |
| --extension | | | List files with a specific extension. Important: this flag can only be used when the --date flag is provided. | file --list --date 1704063600000 --extension .bam |
| --force-platform | -fp | | If specified, downloads files using the new Platform Services. | file --list -fp |
| --help | -h | | Explain previous options | file -h |
*One of --list or --download is mandatory.
Downloadable files
All fastq, bam, bai, fna, qual, sff, ab1, warnings, zip files, as well as the following:
export
Export files from completed interpretations. This command allows you to list and download output files (e.g., JSON reports) from analyses that have been completed since a specified date.
Typical usage
List files from completed interpretations since a specific date
$ python3 sg-upload-v2-wrapper.py export --since-date 2025-01-01 --file-pattern "*.json" --list
Found 15 completed interpretations since 2025-01-01
Found 45 files matching pattern '*.json'
[ {
"id" : 204178491,
"name" : "report.json",
"checksum" : "686031b1b1ea63c3fea69b0061237e3b",
"length" : 61812
}, {
"id" : 204175927,
"name" : "variant_report.json",
"checksum" : "57a2743b3e8f5155f90172fb1fbf79c2",
"length" : 129583
} ]
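Since each listed file carries an MD5 `checksum`, a download can be verified after the fact. A minimal sketch; the `verify_checksum` helper is hypothetical, not part of the tool:

```python
import hashlib

def verify_checksum(path: str, expected_md5: str) -> bool:
    """Compare a downloaded file's MD5 against the listed checksum."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large fastq/bam files don't fill memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5
```

The `length` field can be checked the same way against `os.path.getsize(path)` before hashing.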
Output Modes
The --list command supports two output modes to suit different use cases:
Concise Mode (--json-only)
Use the --json-only flag for less verbose output — clean JSON with only essential fields, no informational messages. Ideal for scripting and piping to other tools:
$ python3 sg-upload-v2-wrapper.py export --since-date 2025-01-01 --file-pattern "*ORDER-46#*.json" -fp --list --json-only
[ {
"id" : 422532316,
"name" : "report-json-analysis_400163833-interpretation_30856-#ORDER-46#-rev19549.json",
"checksum" : "9eee3cc1fc5fce788064b4c26e800b27",
"length" : 12585
}, {
"id" : 422532283,
"name" : "report-json-analysis_400163833-interpretation_30856-#ORDER-46#-rev19518.json",
"checksum" : "8678f5dff1d00d266e0a0feba7a804e7",
"length" : 12587
} ]
Output fields:
| Field | Description |
|---|---|
| id | Unique file identifier for download |
| name | Filename |
| checksum | MD5 checksum of the file |
| length | File size in bytes |
Verbose Mode (--flat)
Use the --flat flag for detailed output — includes informational messages and full file attributes with timestamps, associations, and encryption status:
$ python3 sg-upload-v2-wrapper.py export --since-date 2025-01-01 --file-pattern "*ORDER-46#*.json" -fp --list --flat
Found 39 completed interpretations since 2025-01-01
Found 2 files matching pattern '*ORDER-46#*.json'
[ {
"id" : 422532316,
"fileAttributes" : {
"name" : "report-json-analysis_400163833-interpretation_30856-#ORDER-46#-rev19549.json",
"dataAttributes" : {
"checksum" : "9eee3cc1fc5fce788064b4c26e800b27",
"length" : 12585
}
},
"createdAt" : "2025-12-11T01:47:07Z",
"fileAssociations" : [ {
"entityType" : "SGAANA",
"entityId" : 400163833,
"ioType" : "OUTPUT"
} ],
"downloadable" : true,
"hasWrappedEncryption" : true
}, {
"id" : 422532283,
"fileAttributes" : {
"name" : "report-json-analysis_400163833-interpretation_30856-#ORDER-46#-rev19518.json",
"dataAttributes" : {
"checksum" : "8678f5dff1d00d266e0a0feba7a804e7",
"length" : 12587
}
},
"createdAt" : "2025-12-10T17:15:41Z",
"fileAssociations" : [ {
"entityType" : "SGAANA",
"entityId" : 400163833,
"ioType" : "OUTPUT"
} ],
"downloadable" : true,
"hasWrappedEncryption" : true
} ]
Additional fields in verbose mode:
| Field | Description |
|---|---|
| createdAt | File creation timestamp |
| fileAssociations | Entity type, ID, and I/O type |
| downloadable | Whether the file can be downloaded |
| hasWrappedEncryption | Whether the file uses wrapped encryption |
Download files from completed interpretations
$ python3 sg-upload-v2-wrapper.py export --since-date 2025-01-01 --file-pattern "*.json" --download
Found 15 completed interpretations since 2025-01-01
Found 45 files matching pattern '*.json'
Downloading 45 files to export-out
Downloading: report.json
...
Download complete: 45 succeeded, 0 failed
Download to a specific output folder
$ python3 sg-upload-v2-wrapper.py export --since-date 2025-01-01 --file-pattern "*.json" --download --out /path/to/output
Found 15 completed interpretations since 2025-01-01
Created output directory: /path/to/output
Downloading 45 files to /path/to/output
...
Download complete: 45 succeeded, 0 failed
Options
| Option | Alias | Mandatory | Description | Example |
|---|---|---|---|---|
| --since-date | | Yes* | Filter completed interpretations since a date (YYYY-MM-DD format) | export --since-date 2025-01-01 --list |
| --file-pattern | | No | Filename pattern filter with wildcards (defaults to *.json) | export --since-date 2025-01-01 --file-pattern "*.pdf" --list |
| --list | -l | * | List files matching criteria in JSON format | export --since-date 2025-01-01 --list |
| --download | -d | * | Download files matching criteria | export --since-date 2025-01-01 --download |
| --out | | No | Output folder for downloads (defaults to 'export-out') | export --since-date 2025-01-01 --download --out /my/folder |
| --json-only | -j | No | Concise mode: output only JSON with essential fields (id, name, checksum, length) and no informational messages. Ideal for scripting | export --since-date 2025-01-01 --list --json-only |
| --flat | | No | Verbose mode: output full file attributes including timestamps, associations, and encryption status | export --since-date 2025-01-01 --list --flat |
| --analysis-id | | ** | Analysis ID for variant/CNV/QC export | export --analysis-id 123456 --variant-output --filter myfilter |
| --variant-output | | ** | Download the variant CSV file | export --analysis-id 123456 --variant-output --filter myfilter |
| --cnv-output | | ** | Download the CNV CSV file | export --analysis-id 123456 --cnv-output --filter myfilter |
| --qc-output | | ** | Download the QC file | export --analysis-id 123456 --qc-output |
| --filter | | No | Filter for variant/CNV export | export --analysis-id 123456 --variant-output --filter myfilter |
| --help | -h | No | Explain previous options | export -h |
*When using --since-date, one of --list or --download is mandatory. **For the existing export functionality (variant/CNV/QC), use --analysis-id with the appropriate output flag.
How it works
The export command combines two service calls:
- Get completed interpretations: Queries all analyses with 'completed' status since the specified date
- Filter files by pattern: Queries the file service for files matching the filename pattern (e.g.,
*.json) for the retrieved analysis IDs
This allows you to efficiently export output files (such as JSON reports) from all completed interpretations without needing to know individual analysis IDs.
Notes
- The --since-date option accepts dates in YYYY-MM-DD format
- The --file-pattern option supports wildcards (e.g., *.json, *report*.pdf, *.vcf)
- Files are downloaded to the export-out folder by default, or to the folder specified with --out
- The command only retrieves files marked as 'downloadable' and of 'output' type
- Use --json-only for concise mode (clean JSON output suitable for scripting)
- Use --flat for verbose mode (full file details with timestamps and associations)
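The wildcard semantics of `--file-pattern` resemble shell-style globbing, so Python's `fnmatch` can preview which filenames a pattern would select. This is a local approximation only; matching identically to the server is an assumption here:

```python
from fnmatch import fnmatch

def match_files(names, pattern="*.json"):
    """Filter filenames with shell-style wildcards, mimicking --file-pattern."""
    return [n for n in names if fnmatch(n, pattern)]

files = ["report.json", "variant_report.json", "sample.bam"]
print(match_files(files, "*report*.json"))
```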
userInfo
Display basic user information.
Typical usage
$ python3 sg-upload-v2-wrapper.py userInfo
{
"userId": 405,
"loginUsername": "dnoble",
"clientId": 12
}
basespace
BaseSpace integration commands allow you to authenticate with Illumina BaseSpace, list projects, and automatically import sequencing data from BaseSpace projects into SOPHiA DDM.
Typical usage
$ python3 sg-upload-v2-wrapper.py basespace auth login
Authenticating to BaseSpace region: us
API Server: https://api.basespace.illumina.com
Enter your BaseSpace access token: <token>
✓ Successfully authenticated to BaseSpace!
basespace auth
BaseSpace authentication commands.
basespace auth login
Authenticate to BaseSpace. You need to obtain an access token from BaseSpace first.
Typical usage
$ python3 sg-upload-v2-wrapper.py basespace auth login
Authenticating to BaseSpace region: us
API Server: https://api.basespace.illumina.com
Please obtain an access token from BaseSpace:
1. Go to BaseSpace and navigate to your account settings
2. Create an API application or use an existing one
3. Generate an access token with appropriate scopes
Enter your BaseSpace access token: <token>
Validating token...
✓ Successfully authenticated to BaseSpace!
Options
| Option | Alias | Mandatory | Description | Example |
|---|---|---|---|---|
| --region | | No | BaseSpace region (us, euc1, aps1). Defaults to 'us' | basespace auth login --region euc1 |
| --api-server | | No | BaseSpace API server URL | basespace auth login --api-server https://api.euc1.sh.basespace.illumina.com |
| --token | | No | BaseSpace access token. If not provided, you will be prompted for manual entry | basespace auth login --token <token> |
| --scope | | No | Token scope (default: 'read write') | basespace auth login --scope "read write" |
| --help | -h | No | Explain previous options | basespace auth login -h |
Note: If you specify --region, the API server will be automatically set. If you specify --api-server, the region will be automatically detected. If neither is specified, defaults to US region.
basespace auth logout
Clear BaseSpace authentication.
Typical usage
$ python3 sg-upload-v2-wrapper.py basespace auth logout
✓ Successfully logged out from BaseSpace
basespace auth status
Show BaseSpace authentication status.
Typical usage
$ python3 sg-upload-v2-wrapper.py basespace auth status
Authenticated to BaseSpace (region: us)
Current region: us
Current API server: https://api.basespace.illumina.com
basespace project
BaseSpace project management commands.
basespace project list
List all BaseSpace projects for the authenticated user.
Typical usage
$ python3 sg-upload-v2-wrapper.py basespace project list
Listing BaseSpace projects (region: us)
✓ Found 3 projects:
Project: My Sequencing Run
ID: 12345678
Description: WGS run from 2025-01-15
Created: 2025-01-15T10:30:00.0000000Z
Owner: John Doe (98765432)
Project: Cancer Panel Run
ID: 87654321
Created: 2025-01-20T14:20:00.0000000Z
Owner: Jane Smith (12345678)
Options
| Option | Alias | Mandatory | Description | Example |
|---|---|---|---|---|
| --help | -h | No | Explain previous options | basespace project list -h |
basespace project files
List files in a BaseSpace project.
Typical usage
$ python3 sg-upload-v2-wrapper.py basespace project files --project-id 12345678
Listing contents of BaseSpace project: 12345678
✓ Found 2 datasets:
Dataset: Sample1
ID: dataset-001
Description: Sample 1 sequencing data
Created: 2025-01-15T10:35:00.0000000Z
Files (4):
- Sample1_S1_L001_R1_001.fastq.gz (1.2 GB)
- Sample1_S1_L001_R2_001.fastq.gz (1.2 GB)
- Sample1_S2_L001_R1_001.fastq.gz (1.1 GB)
- Sample1_S2_L001_R2_001.fastq.gz (1.1 GB)
Options
| Option | Alias | Mandatory | Description | Example |
|---|---|---|---|---|
| --project-id | -p | Yes | BaseSpace project ID | basespace project files -p 12345678 |
| --show-all-files | | No | Show all files, not just FASTQ files | basespace project files -p 12345678 --show-all-files |
| --help | -h | No | Explain previous options | basespace project files -h |
basespace auto-import
Automatically import all BaseSpace projects. This command will:
- List all BaseSpace projects (optionally filtered by creation date)
- Check each project for a sample sheet or FASTQ files
- Create runs in SOPHiA DDM for each project
- Track processed projects using lock files to avoid duplicate imports
Sample Sheet Support
The auto-import command supports two import methods:
1. Sample Sheet Import (preferred): If a sample sheet is found in the project, it is used to create the run. The sample sheet must follow the SOPHiA DDM format (see the Sample Sheet section). The Pipeline_ID column in the sample sheet is used if present.
2. Pipeline-based Import: If no sample sheet is found, the command uses the specified pipeline ID to create the run. This requires the --pipeline option.
Typical usage
Import all projects with a specific pipeline:
$ python3 sg-upload-v2-wrapper.py basespace auto-import --pipeline 12345
BaseSpace Auto-Import
====================
Region: us
Listing BaseSpace projects...
Found 5 projects
Processing project: 12345678 (My Sequencing Run)
Success: 12345678 → BS-12345678-20250115-143022
Processing project: 87654321 (Cancer Panel Run)
Success: 87654321 → BS-87654321-20250115-143045
Summary:
Processed: 2
Skipped: 3
Failed: 0
Import projects with sample sheets (no pipeline needed if sample sheets contain Pipeline_ID):
$ python3 sg-upload-v2-wrapper.py basespace auto-import
BaseSpace Auto-Import
====================
Region: us
Listing BaseSpace projects...
Found 3 projects
Processing project: 12345678 (My Sequencing Run)
Found sample sheet: SampleSheet.csv
Success: 12345678 → BS-12345678-20250115-143022
Dry-run to see what would be processed:
$ python3 sg-upload-v2-wrapper.py basespace auto-import --pipeline 12345 --dry-run
BaseSpace Auto-Import
====================
Region: us
Listing BaseSpace projects...
Found 5 projects
DRY RUN MODE - No projects will be imported
===========================================
[WOULD PROCESS] Project: 12345678 (My Sequencing Run)
Created: 2025-01-15T10:30:00.0000000Z
[SKIP - Already processed] Project: 87654321 (Cancer Panel Run)
Created: 2025-01-14T08:20:00.0000000Z
Lock file: 2025-01-14_14:30:15|SUCCESS|BS-87654321-20250114-143015
Summary (DRY RUN):
Would process: 4
Would skip: 1
Total: 5
Filter by date:
$ python3 sg-upload-v2-wrapper.py basespace auto-import --pipeline 12345 --from-date 2025-01-15T00:00:00Z
BaseSpace Auto-Import
====================
Region: us
Listing BaseSpace projects...
Found 10 projects
Filtering projects created on or after: 2025-01-15T00:00:00Z
After date filtering: 3 projects
Options
| Option | Alias | Mandatory | Description | Example |
|---|---|---|---|---|
| --pipeline | -p | ** | Pipeline ID. Required if no sample sheet is found in projects | basespace auto-import --pipeline 12345 |
| --sampletype | -s | No | Sample Type ID (defaults to 8000) | basespace auto-import --pipeline 12345 --sampletype 8000 |
| --from-date | | No | Only import projects created on or after this date. Format: 2025-05-09T22:11:20.0000000Z or 2025-05-09T22:11:20Z | basespace auto-import --pipeline 12345 --from-date 2025-01-15T00:00:00Z |
| --dry-run | | No | Simulate the import process without actually importing projects. Shows which projects would be processed | basespace auto-import --pipeline 12345 --dry-run |
| --help | -h | No | Explain previous options | basespace auto-import -h |
Pipeline ID requirement: The --pipeline option is required if:
- No sample sheet is found in the project, OR
- The sample sheet is found but doesn't contain a Pipeline_ID column
If a sample sheet with Pipeline_ID is found, the pipeline ID from the sample sheet will be used and --pipeline is not required.
Lock Files
The auto-import command uses lock files to track processed projects and avoid duplicate imports. Lock files are stored in:
~/.sophia/basespace/<region>/<project-id>.lock
Each lock file contains:
- Timestamp of processing
- Status (SUCCESS or FAILED)
- Run reference (for successful imports)
Projects with existing lock files are automatically skipped. Failed imports create .lock.failed files.
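The lock file location and entry format can be inspected programmatically, for example to audit which projects were imported. A sketch assuming the `timestamp|STATUS|run-reference` layout shown in the dry-run example; both helpers are illustrative, not part of the tool:

```python
from pathlib import Path

def lock_path(region: str, project_id: str) -> Path:
    """Where auto-import stores a project's lock file, per the docs."""
    return Path.home() / ".sophia" / "basespace" / region / f"{project_id}.lock"

def parse_lock(line: str) -> dict:
    """Split a successful lock entry: timestamp | status | run reference."""
    timestamp, status, run_ref = line.strip().split("|")
    return {"timestamp": timestamp, "status": status, "run_ref": run_ref}

entry = parse_lock("2025-01-14_14:30:15|SUCCESS|BS-87654321-20250114-143015")
print(entry["status"])  # SUCCESS
```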
Date Format
The --from-date option accepts ISO 8601 format dates with or without fractional seconds:
- 2025-05-09T22:11:20Z (without fractional seconds)
- 2025-05-09T22:11:20.000Z (with milliseconds)
- 2025-05-09T22:11:20.0000000Z (with microseconds)
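All three documented variants can be normalized with a small parser that trims the fractional-seconds part and the trailing 'Z'. An illustrative sketch (not the tool's own parsing logic):

```python
from datetime import datetime, timezone

def parse_from_date(value: str) -> datetime:
    """Parse the documented --from-date variants into a UTC datetime."""
    base = value.rstrip("Z")
    if "." in base:
        # Drop fractional seconds regardless of precision.
        base = base.split(".", 1)[0]
    return datetime.strptime(base, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)

for v in ("2025-05-09T22:11:20Z",
          "2025-05-09T22:11:20.000Z",
          "2025-05-09T22:11:20.0000000Z"):
    print(parse_from_date(v).isoformat())
```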
Behavior
- Projects without FASTQ files are automatically skipped
- Projects with sample sheets are processed using the sample sheet (preferred method)
- Projects without sample sheets fall back to pipeline-based import (requires --pipeline)
- Already processed projects (with lock files) are skipped
- The command generates run references in the format: BS-{projectId}-{timestamp}
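The run-reference shape can be reproduced when cross-referencing lock files with DDM runs. A sketch assuming the YYYYMMDD-HHMMSS timestamp layout inferred from the sample output above, which may differ in practice:

```python
from datetime import datetime

def run_reference(project_id: str, when: datetime) -> str:
    """Build a BS-{projectId}-{timestamp} run reference (assumed layout)."""
    return f"BS-{project_id}-{when.strftime('%Y%m%d-%H%M%S')}"

print(run_reference("12345678", datetime(2025, 1, 15, 14, 30, 22)))
# BS-12345678-20250115-143022
```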
basespace status
Show BaseSpace connection and authentication status.
Typical usage
$ python3 sg-upload-v2-wrapper.py basespace status
BaseSpace Integration Status
============================
Authentication: Authenticated to BaseSpace (region: us)
Region: us
API Server: https://api.basespace.illumina.com
Testing connection...
✓ Token is valid and connection is working
Available commands:
basespace auth login - Authenticate to BaseSpace
basespace auth logout - Clear authentication
basespace project list - List all BaseSpace projects
basespace project files -p <id> - List files in a BaseSpace project
Options
| Option | Alias | Mandatory | Description | Example |
|---|---|---|---|---|
| --help | -h | No | Explain previous options | basespace status -h |
One Command to Rule Them All
(New since 6.4.0)
The most efficient way to create and upload an analysis is by using the new command, which enables direct folder analysis with automatic uploading. You can either allow the system to guide you through selecting the appropriate pipeline or specify your choice directly.
Direct pipeline selection:
$ python3 sg-upload-v2-wrapper.py new --folder /path/to/fastq/files --ref MyRun123 --pipeline 1234 --upload
Run successfully created with id 200002747
Starting upload after analysis creation...
Upload ended in 123456ms
This single command does it all:
- Automatically scans your FastQ folder
- Creates the analysis request (with either interactive or direct pipeline selection)
- Immediately starts the upload
No need to manually create JSON files or remember to initiate the upload after run creation — everything is handled in one single step. Ideal for both interactive use and automated scripts, making your workflow seamless and efficient.
Migration Guides
Sample Sheet v1 to v2 Migration
This guide explains how to migrate from v1 to v2. For information about the v2 format, see the Sample Sheet upload workflow section.
The v1 format differs from v2 as follows:
- Uses the [SOPHIA_DDM_Data_v1] section header instead of [SOPHIA_DDM_Data]
- Does not support the [SOPHIA_DDM_Settings] section
How to Migrate from v1 to v2
To migrate an existing v1 sample sheet to v2 format:
1. Change the section header: Replace [SOPHIA_DDM_Data_v1] with [SOPHIA_DDM_Data]
2. Add the Settings section (optional but recommended): Add a [SOPHIA_DDM_Settings] section before the data section:
   [SOPHIA_DDM_Settings],,,,
   version,1,,,
3. Keep all data columns unchanged: The column structure remains the same, so no changes are needed to your sample data rows
Example Migration:
Before (v1):
[SOPHIA_DDM_Data_v1],,,,
Sample_ID,Capture_ID,Bundle_SN,Pipeline_ID,Patient_Ref
SG10000008,1,BDS-1111111111-10,5,SDSD12
After (v2):
[SOPHIA_DDM_Settings],,,,
version,1,,,
[SOPHIA_DDM_Data],,,,
Sample_ID,Capture_ID,Bundle_SN,Pipeline_ID,Patient_Ref
SG10000008,1,BDS-1111111111-10,5,SDSD12
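The migration steps above can be automated for a batch of sheets. A minimal sketch that assumes a simple CSV-style sheet like the example; `migrate_v1_to_v2` is an illustrative helper, not an official migration tool:

```python
def migrate_v1_to_v2(sheet: str) -> str:
    """Swap the v1 section header for the v2 one and prepend the
    optional settings section, leaving data rows untouched."""
    out = ["[SOPHIA_DDM_Settings],,,,", "version,1,,,"]
    for line in sheet.splitlines():
        if line.startswith("[SOPHIA_DDM_Data_v1]"):
            # Preserve any trailing padding commas after the header.
            out.append("[SOPHIA_DDM_Data]" + line[len("[SOPHIA_DDM_Data_v1]"):])
        else:
            out.append(line)
    return "\n".join(out)

v1 = """[SOPHIA_DDM_Data_v1],,,,
Sample_ID,Capture_ID,Bundle_SN,Pipeline_ID,Patient_Ref
SG10000008,1,BDS-1111111111-10,5,SDSD12"""
print(migrate_v1_to_v2(v1))
```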
Documentation Version: 7.16.0-6.7.0 | Commit: 0fe5b5e7 | Built: 2026-04-01 08:05:23 UTC