Legacy Generation Script
A Python script can be downloaded to generate this JSON input. The script is applied to a folder containing the fastq files to analyze. The pipeline to apply to those files must be described in a configuration file (for example, "pipeline.json"), which means that the same pipeline is applied to all samples of a run.
Usage
Move all the fastq files to analyze into a folder.
Then run the following command, which will generate the fastqFolder.json file.
python create-bar-input.py pipeline.json fastqFolder
Example pipeline
{
"sequencer": "ILLUMINA_MiSeq",
"pairend": true,
"analysis_type": "BRCA_Tumor",
"experiment_type": "somatic",
"kit": "Manufacturer_kit_code"
}
See the end of the page for more pipeline examples.
By default, the file is saved in the current working directory. This can be overridden with the optional --save-folder argument, e.g.:
python create-bar-input.py pipeline.json /my/fastq-folder --save-folder /some/path
Please note that the name of the output file cannot be changed: it is always the name of the FastQ folder with a .json extension. Any existing file with that name is overwritten without warning.
Both folder arguments can be absolute or relative. For example, this will process a "subfolder" of the current working directory and save the output file one level up:
python create-bar-input.py pipeline.json subfolder --save-folder ..
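The output location described above can be illustrated with a short sketch. This is not the code of create-bar-input.py; the function name `expected_output_path` is hypothetical, and it only assumes what the text states: the file is named after the FastQ folder, gets a .json extension, and lands in the save folder (the current working directory by default).

```python
import os
from pathlib import Path


def expected_output_path(fastq_folder: str, save_folder: str = ".") -> Path:
    """Return the path where the generated JSON file would be expected.

    Hypothetical helper for illustration; it mirrors the documented
    behaviour, not the script's actual implementation.
    """
    # abspath normalises relative paths and ".." without touching the disk.
    folder_name = os.path.basename(os.path.abspath(fastq_folder))
    return Path(os.path.abspath(save_folder)) / f"{folder_name}.json"


# "subfolder" of the current directory, saved one level up.
print(expected_output_path("subfolder", ".."))
```

Because the name is derived from the folder alone, checking whether this path already exists before running the script is one way to avoid an unintended overwrite.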
The script also supports an optional -v/--verbose flag which increases the output verbosity.
All files in the processed FastQ folder are assumed to follow this convention:
patient_mid_lane_r1/r2_ignored
"patient" must be the first part of the file name, and the last part is always ignored. The positions of "mid", "lane" and "r1/r2" can be interchanged, and "lane" is optional.
For example:
GN2804-12A3456-2-subset2_S7_L002_R1_001.fastq.gz
This example file would produce these values:
Patient: GN2804-12A3456-2-subset2
Mid: S7
Lane: L002
R#: R1
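The naming convention can be made concrete with a minimal parsing sketch. It is an illustration only, not the script's actual logic: the function name `parse_fastq_name` is hypothetical, and the patterns used to recognise the lane (`L` followed by digits) and the read (`R1` or `R2`) are assumptions inferred from the example file name.

```python
import re
from typing import Dict, Optional

# Assumed token shapes, inferred from the example file name above.
LANE_RE = re.compile(r"^L\d+$")
READ_RE = re.compile(r"^R[12]$", re.IGNORECASE)


def parse_fastq_name(file_name: str) -> Dict[str, Optional[str]]:
    """Split a file name on "_" and classify its parts (hypothetical helper)."""
    tokens = file_name.split("_")
    # At least: patient, mid, r1/r2 and the ignored last part.
    if len(tokens) < 4:
        raise ValueError(f"too few '_'-separated parts in {file_name!r}")

    fields: Dict[str, Optional[str]] = {
        "patient": tokens[0],  # always the first part
        "mid": None,
        "lane": None,          # optional
        "read": None,
    }
    # The last part is ignored; the middle parts may come in any order.
    for token in tokens[1:-1]:
        if READ_RE.match(token):
            fields["read"] = token.upper()
        elif LANE_RE.match(token):
            fields["lane"] = token
        else:
            # Any other token is treated as the MID (the last one wins).
            fields["mid"] = token

    if fields["mid"] is None or fields["read"] is None:
        raise ValueError(f"missing mid or r1/r2 part in {file_name!r}")
    return fields


print(parse_fastq_name("GN2804-12A3456-2-subset2_S7_L002_R1_001.fastq.gz"))
```

Note that under these assumptions a patient reference cannot contain an underscore, since the underscore is the separator.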
WARNING
Do not upload any files containing nominative information or any other direct identifier related to a patient (e.g. patient's full name).
Legacy Format Description
Restricted properties accept only specific values, which are provided by Sophia Genetics.
Level: root
| Property | Type | Restricted | Description |
|---|---|---|---|
| user_ref | string | | Name of the batch analysis request (run) in Sophia DDM. |
| sequencer | string | Yes | Code of the sequencer used for this batch analysis request. |
| pairend | boolean | | Whether the run consists of paired-end analyses. |
| analyses | array | | List of all the analyses to be done in this batch analysis request. |
Level: analyses
| Property | Type | Restricted | Description |
|---|---|---|---|
| analysis_type | string | Yes | The gene panel code as it appears in Sophia DDM. |
| user_ref | string | | Patient user reference; it will be created if not found in Sophia DDM. |
| mid | string | | MID of the sample. |
| experiment_type | string | Yes | "germline" or "somatic". |
| kit | string | Yes | Manufacturer code of the kit. |
| pairend | boolean | | Whether the analysis is paired-end. |
| files | array | | List of analysis files. |
Level: analyses - files
| Property | Type | Description |
|---|---|---|
| r1 | string | Path to the R1 file. |
| r2 | string | Path to the R2 file. |
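The three tables above can be turned into a basic sanity check on a request before it is submitted. The sketch below is an assumption-laden illustration, not an official validator: the function name `check_request` is hypothetical, only properties that are present are type-checked (the tables do not state which ones are mandatory), and the allowed values of restricted properties other than experiment_type are not verified because they are provided by Sophia Genetics.

```python
from typing import Any, Dict, List

# Property types taken from the tables: string -> str, boolean -> bool, array -> list.
ROOT_TYPES = {"user_ref": str, "sequencer": str, "pairend": bool, "analyses": list}
ANALYSIS_TYPES = {
    "analysis_type": str, "user_ref": str, "mid": str,
    "experiment_type": str, "kit": str, "pairend": bool, "files": list,
}
FILE_TYPES = {"r1": str, "r2": str}
EXPERIMENT_TYPES = {"germline", "somatic"}


def _type_errors(obj: Dict[str, Any], expected: Dict[str, type], where: str) -> List[str]:
    """Report properties that are present but have the wrong type."""
    return [
        f"{where}.{key}: expected {kind.__name__}"
        for key, kind in expected.items()
        if key in obj and not isinstance(obj[key], kind)
    ]


def check_request(request: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    errors = _type_errors(request, ROOT_TYPES, "root")
    analyses = request.get("analyses")
    if not isinstance(analyses, list):
        return errors
    # Each entry of "analyses" is assumed to be a JSON object (dict).
    for i, analysis in enumerate(analyses):
        where = f"analyses[{i}]"
        errors += _type_errors(analysis, ANALYSIS_TYPES, where)
        kind = analysis.get("experiment_type")
        if isinstance(kind, str) and kind not in EXPERIMENT_TYPES:
            errors.append(f"{where}.experiment_type: unexpected value {kind!r}")
        files = analysis.get("files")
        for j, entry in enumerate(files if isinstance(files, list) else []):
            errors += _type_errors(entry, FILE_TYPES, f"{where}.files[{j}]")
    return errors
```

A request loaded with `json.load` can be passed directly to `check_request`; any returned message points to the level and property concerned.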
Full Example
{
  "v": 1,
  "user_ref": "run_name_reference",
  "sequencer": "Sequencer_code",
  "pairend": true,
  "analyses": [
    {
      "analysis_type": "BRCA_Tumor",
      "user_ref": "Seq_01_sample_21",
      "mid": "S1",
      "experiment_type": "somatic",
      "kit": "Manufacturer_kit_code",
      "pairend": true,
      "files": [
        {
          "r1": "/path_to/SG10000001_S1_L001_R1_001.fastq.gz",
          "r2": "/path_to/SG10000001_S1_L001_R2_001.fastq.gz"
        }
      ]
    },
    {
      "analysis_type": "BRCA_Tumor",
      "user_ref": "Seq_01_sample_22",
      "mid": "S2",
      "experiment_type": "somatic",
      "kit": "Manufacturer_kit_code",
      "pairend": true,
      "files": [
        {
          "r1": "/path_to/SG10000001_S2_L001_R1_001.fastq.gz",
          "r2": "/path_to/SG10000001_S2_L001_R2_001.fastq.gz"
        }
      ]
    }
  ]
}
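To show how a single pipeline description relates to a request like the one above, here is a hedged sketch that assembles the same structure in Python. It is not the code of create-bar-input.py: the function `build_request` and the shape of the `samples` list are hypothetical, and the mapping of pipeline fields to the root and analyses levels is inferred from the example. The "v" value is copied from the example as-is, since the property tables do not describe it.

```python
import json
from typing import Any, Dict, List


def build_request(pipeline: Dict[str, Any], run_name: str,
                  samples: List[Dict[str, str]]) -> Dict[str, Any]:
    """Apply one pipeline to every sample (hypothetical helper)."""
    return {
        "v": 1,  # copied from the full example; not described in the tables
        "user_ref": run_name,
        "sequencer": pipeline["sequencer"],
        "pairend": pipeline["pairend"],
        "analyses": [
            {
                # Pipeline-level values are repeated for each sample.
                "analysis_type": pipeline["analysis_type"],
                "user_ref": sample["user_ref"],
                "mid": sample["mid"],
                "experiment_type": pipeline["experiment_type"],
                "kit": pipeline["kit"],
                "pairend": pipeline["pairend"],
                "files": [{"r1": sample["r1"], "r2": sample["r2"]}],
            }
            for sample in samples
        ],
    }


pipeline = {
    "sequencer": "Sequencer_code",
    "pairend": True,
    "analysis_type": "BRCA_Tumor",
    "experiment_type": "somatic",
    "kit": "Manufacturer_kit_code",
}
samples = [
    {"user_ref": "Seq_01_sample_21", "mid": "S1",
     "r1": "/path_to/SG10000001_S1_L001_R1_001.fastq.gz",
     "r2": "/path_to/SG10000001_S1_L001_R2_001.fastq.gz"},
]
print(json.dumps(build_request(pipeline, "run_name_reference", samples), indent=2))
```

The sketch makes the consequence of a single pipeline file visible: every entry in "analyses" carries the same analysis_type, experiment_type, kit and pairend values.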
Pipeline examples
ILLUMINA_MiSeq - BRCA
{
"sequencer": "ILLUMINA_MiSeq",
"analysis_type": "BRCA",
"experiment_type": "germline",
"kit": "Multiplicom_MASTR_assay",
"pairend": true
}
ILLUMINA_MiSeq - BRCA_Tumor
{
"sequencer": "ILLUMINA_MiSeq",
"analysis_type": "BRCA_Tumor",
"experiment_type": "germline",
"kit": "Multiplicom_MASTR_Plus",
"pairend": true
}
ILLUMINA_MiniSeq - STS_v1
{
"sequencer": "ILLUMINA_MiniSeq",
"analysis_type": "STS_v1",
"experiment_type": "somatic",
"kit": "IDT",
"pairend": true
}