FAQ
How do I check the version of the upload CLI?
To check the version of the script, run the following command:
python3 sg-upload-v2-wrapper.py version
The output will be similar to the following:
Version: 5.10-4.3.4
Debugging network issues with CLI and curl
This guide helps troubleshoot network issues by comparing the CLI's behavior with direct API calls made with curl, which can help identify the root cause of the problem.
SGP issues
Step 1: Run CLI
1. Ensure you have the latest version of the CLI and that you're logged in.
2. Run the following command to query a patient reference with logging enabled:
java -Dlog.to.stdout=true -Dlog.level=info -jar sg-upload-v2-latest.jar patient -l --patient-ref=MP24-1767
where MP24-1767 is the patient reference you want to query.
- Expected: The application should log its actions to the console. Look for any error messages that might indicate what's going wrong.
Step 2: Extract the JSESSIONID
1. From the console output, locate the JSESSIONID. This may appear in part of the HTTP response or within the application logs.
2. If the output is extensive, you can use a tool like grep to filter it:
# Example to filter output for JSESSIONID
java -Dlog.to.stdout=true -Dlog.level=info -jar sg-upload-v2-latest.jar patient -l --patient-ref=MP24-1767 | grep 'JSESSIONID'
- Note: Copy the JSESSIONID value accurately, without any extra spaces or characters.
Step 3: Use Curl to Make a Direct API Call
1. With the JSESSIONID in hand, use the following curl command to make a direct API request:
curl -L -H "Content-Type: application/json" -H "Accept: application/json" -H "Cookie: JSESSIONID=XXXXXXX" https://us.sophiagenetics.com/SGP/dg/patients/userref/MP24-1767
- Replace XXXXXXX with your actual JSESSIONID.
- Expected: The command should return a JSON response related to the patient reference. Ensure there are no HTTP errors like 401 or 403, which indicate authentication or permission issues.
Troubleshooting Tips:
- Ensure there are no network connectivity issues.
- If necessary, increase the log verbosity for more detailed output.
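Steps 2 and 3 can also be scripted. The following is a minimal sketch, assuming the session ID appears in the console output as `JSESSIONID=<value>`; the exact log format is not specified in this guide, so adjust the pattern to what your console shows. The helper names `extract_jsessionid` and `build_curl_command` and the sample log line are hypothetical and not part of the CLI.

```python
import re
import shlex

# Assumption: the ID appears as "JSESSIONID=<value>" and ends at ';', whitespace or a quote.
JSESSIONID_PATTERN = re.compile(r"JSESSIONID=([^;\s\"']+)")


def extract_jsessionid(log_text):
    """Return the first JSESSIONID value found in log_text, or None."""
    match = JSESSIONID_PATTERN.search(log_text)
    return match.group(1) if match else None


def build_curl_command(session_id, patient_ref,
                       base_url="https://us.sophiagenetics.com/SGP/dg/patients/userref"):
    """Build the curl command from Step 3 as a single shell-safe string."""
    args = [
        "curl", "-L",
        "-H", "Content-Type: application/json",
        "-H", "Accept: application/json",
        "-H", f"Cookie: JSESSIONID={session_id}",
        f"{base_url}/{patient_ref}",
    ]
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    # Hypothetical log line, used only to demonstrate the extraction.
    sample = "Set-Cookie: JSESSIONID=ABC123DEF456; Path=/SGP; HttpOnly"
    session_id = extract_jsessionid(sample)
    print(build_curl_command(session_id, "MP24-1767"))
```

Building the command through `shlex.quote` keeps the header values intact when the string is pasted into a shell, which avoids the stray spaces or characters mentioned in Step 2.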
Troubleshooting Connection Issues with the Upload Command (Java 11 and Above)
If you're using Java 11 or a newer version, you may encounter SSL/TLS connection issues when executing the Upload command in the CLI. This is because Java 11+ uses TLS 1.3 by default, which does not support TLS renegotiation—a feature required by the CLI's upload service.
Since the CLI application currently supports only TLS 1.2 for secure file transfers, you’ll need to explicitly configure the Java runtime to use TLS 1.2 for the upload operation.
Step 1: Verify Your Java Version
To check the Java version installed on your system:
For Windows/macOS/Linux: Open a terminal or command prompt and run:
java -version
Sample Output:
openjdk version "11.0.21" 2024-01-16
OpenJDK Runtime Environment (build 11.0.21+9)
OpenJDK 64-Bit Server VM (build 11.0.21+9, mixed mode)
If the version shown is Java 11 or higher, continue with the next step.
Step 2: Add TLS 1.2 Enforcement to the Upload Command
To resolve the TLS handshake issue, add the following system property to the Upload command:
-Djdk.tls.client.protocols=TLSv1.2
Example:
python3 sg-upload-v2-wrapper.py -Djdk.tls.client.protocols=TLSv1.2 upload -i 400025188
This forces Java to use TLS 1.2, which is compatible with the CLI's upload service. Once added, the Upload command should work without any SSL-related errors.
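The check in Step 1 can be automated. The sketch below parses the first line of `java -version` output and reports whether the TLS 1.2 property is needed. The function names `java_major_version` and `needs_tls12_flag` are hypothetical, and the parsing assumes the usual `version "<number>"` format shown in the sample output above.

```python
import re


def java_major_version(version_output):
    """Extract the Java major version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("could not find a Java version string")
    major = int(match.group(1))
    # Java 8 and earlier report themselves as "1.x" (for example "1.8.0_292").
    if major == 1 and match.group(2) is not None:
        major = int(match.group(2))
    return major


def needs_tls12_flag(version_output):
    """Return True when the TLS 1.2 system property should be added (Java 11+)."""
    return java_major_version(version_output) >= 11


sample = 'openjdk version "11.0.21" 2024-01-16'
if needs_tls12_flag(sample):
    print("Add -Djdk.tls.client.protocols=TLSv1.2 to the upload command")
```

Note that `java -version` writes to standard error, so when capturing it from Python, read the `stderr` attribute of the result of `subprocess.run(["java", "-version"], capture_output=True, text=True)`.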
How do I configure JVM memory and other JVM options?
The wrapper script (sg-upload-v2-wrapper.py) supports passing JVM options to control memory allocation, garbage collection, and other Java runtime settings. This is particularly useful when encountering OutOfMemoryError or when you need to optimize performance.
Method 1: Individual JVM Options
You can pass JVM options directly as separate arguments. Use the -X prefix for memory options and the -XX: prefix for advanced options:
python3 sg-upload-v2-wrapper.py pipeline --list -Xms2G -Xmx4G -XX:ActiveProcessorCount=1 -XX:ConcGCThreads=2 -XX:ParallelGCThreads=2
Method 2: Using --jvm-opts Flag
You can also pass all JVM options as a single quoted string using the --jvm-opts flag:
python3 sg-upload-v2-wrapper.py --jvm-opts "-Xms2G -Xmx4G -XX:ActiveProcessorCount=1 -XX:ConcGCThreads=2 -XX:ParallelGCThreads=2 -XX:+PrintFlagsFinal" pipeline --list
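To illustrate how the two methods relate, the sketch below splits a quoted `--jvm-opts` string into the individual options used in Method 1 and labels each by prefix. It uses Python's standard `shlex.split`; whether the wrapper parses the string this way internally is an assumption, and the helper names `split_jvm_opts` and `classify` are hypothetical.

```python
import shlex


def split_jvm_opts(jvm_opts):
    """Split a single quoted option string into individual JVM options."""
    return shlex.split(jvm_opts)


def classify(option):
    """Label an option by its prefix."""
    if option.startswith("-XX:"):
        return "advanced"
    if option.startswith("-X"):
        return "memory/non-standard"
    if option.startswith("-D"):
        return "system property"
    return "other"


for opt in split_jvm_opts("-Xms2G -Xmx4G -XX:ActiveProcessorCount=1 -XX:+PrintFlagsFinal"):
    print(f"{opt}: {classify(opt)}")
```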
How do I activate debug logs for adegen.py and sg-upload-v2-wrapper.py?
To activate debug logs for adegen.py and sg-upload-v2-wrapper.py, run them with the -v flag:
python3 sg-upload-v2-wrapper.py -v ...
python3 adegen.py -v ...
Example:
python3 sg-upload-v2-wrapper.py -v userInfo
python3 adegen.py -v -c --bdsNumber BDS-4528057838-51 folder -o myinput.json
File Download
This FAQ outlines how users can download data from the platform using specific CLI commands. The focus is on downloading large amounts of data without including unnecessary files.
General Information
What options are available for listing runs?
Users have the following option for listing runs:
- List N Recent Runs: List the N most recent runs.
Example:
python3 sg-upload-v2-wrapper.py status --limit n
where n is the number of recent runs to list.
What options are available for listing files?
- List Files by Run ID: List all files related to a specific run.
Example:
python3 sg-upload-v2-wrapper.py file --list --run-id 12345
- List Files by Date and Extension: List files based on specific criteria like date and extension.
Note: This command doesn't support filtering by a specific run ID, i.e. files from all runs will be listed.
Example:
python3 sg-upload-v2-wrapper.py file --list --date 1704063600000 --extension .bam
where 1704063600000 is the date as a Unix epoch timestamp in milliseconds and .bam is the file extension.
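The `--date` value can be computed from a calendar date. The following sketch converts a timezone-aware date to epoch milliseconds and back using only the Python standard library; the function names `to_epoch_millis` and `from_epoch_millis` are illustrative. The example value 1704063600000 corresponds to 2023-12-31 23:00:00 UTC.

```python
from datetime import datetime, timezone


def to_epoch_millis(dt):
    """Convert a timezone-aware datetime to a Unix epoch timestamp in milliseconds."""
    if dt.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid local-time ambiguity")
    return int(dt.timestamp() * 1000)


def from_epoch_millis(millis):
    """Convert epoch milliseconds back to a UTC datetime."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


print(to_epoch_millis(datetime(2023, 12, 31, 23, 0, tzinfo=timezone.utc)))  # 1704063600000
print(from_epoch_millis(1704063600000).isoformat())  # 2023-12-31T23:00:00+00:00
```

Requiring a timezone-aware datetime is a deliberate choice here: a naive datetime would be interpreted in the machine's local time zone and could shift the filter by several hours.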
What options are available for downloading files?
Users have 2 main options for downloading files:
- Single File Download: Download one file at a time using the file ID.
Example:
python3 sg-upload-v2-wrapper.py file --download --file-id 12345 --file-out my_file
- Batch File Download: Download all files related to a specific run.
Example:
python3 sg-upload-v2-wrapper.py file --download --run-id 12345
Can I filter files by date or extension?
Yes, the platform allows filtering files by specific criteria, such as date and extension. This can be helpful for narrowing down the files you need before downloading.
Are downloads done in parallel?
By default, downloads are not performed in parallel. However, users can parallelize their downloads using external tools such as GNU parallel.
How can I download specific files for a given period?
Here’s a general outline of the steps to follow:
1. List the Files You Need: Use filters to list files by date (a Unix epoch timestamp in milliseconds) and extension.
Example: List all BAM files uploaded since a specific date:
python3 sg-upload-v2-wrapper.py file --list --date 1704063600000 --extension .bam
2. Download the Files: Use the file IDs from the previous step to download the files.
Example: Download specific files by ID:
python3 sg-upload-v2-wrapper.py file --download --file-id 12345 --file-out my_file
3. Parallelize the Download: If you have many files to download, consider parallelizing the download process using external tools like GNU parallel.
Example: Parallelize the download process using GNU parallel:
```bash
cat > commands.txt << EOF
java -jar sg-upload-v2-latest.jar file --download --file-id 239566918 --file-out 239566918.bam
java -jar sg-upload-v2-latest.jar file --download --file-id 239566928 --file-out 239566928.bam
java -jar sg-upload-v2-latest.jar file --download --file-id 239566938 --file-out 239566938.bam
java -jar sg-upload-v2-latest.jar file --download --file-id 239566948 --file-out 239566948.bam
java -jar sg-upload-v2-latest.jar file --download --file-id 239566958 --file-out 239566958.bam
java -jar sg-upload-v2-latest.jar file --download --file-id 239566968 --file-out 239566968.bam
java -jar sg-upload-v2-latest.jar file --download --file-id 239566978 --file-out 239566978.bam
java -jar sg-upload-v2-latest.jar file --download --file-id 239566988 --file-out 239566988.bam
EOF
parallel -j 4 < commands.txt
```
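For longer lists, the command file can be generated rather than typed by hand. The sketch below builds one download command per file ID in the same form as the example above; the function name `build_download_commands` is hypothetical, and the IDs are the sample values from this guide.

```python
def build_download_commands(file_ids, extension=".bam", jar="sg-upload-v2-latest.jar"):
    """Return one download command per file ID, naming each output <file-id><extension>."""
    return [
        f"java -jar {jar} file --download --file-id {fid} --file-out {fid}{extension}"
        for fid in file_ids
    ]


if __name__ == "__main__":
    # Sample IDs; in practice, take them from the `file --list` output.
    ids = [239566918, 239566928, 239566938, 239566948]
    for command in build_download_commands(ids):
        print(command)
```

Saved under a name of your choice (for example `make_commands.py`, a hypothetical file name), its output can be redirected to commands.txt and passed to `parallel -j 4` as shown above.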
Running CLI under proxy
Instructions for Linux and macOS users
This guide outlines how to run the CLI under a proxy server.
Using the wrapper script:
python3 sg-upload-v2-wrapper.py -Dhttp.proxyHost="proxy.service" -Dhttp.proxyPort="3128" -Dhttps.proxyHost="proxy.service" -Dhttps.proxyPort="3128" pipeline --list
Or using the --jvm-opts flag:
python3 sg-upload-v2-wrapper.py --jvm-opts "-Dhttp.proxyHost=proxy.service -Dhttp.proxyPort=3128 -Dhttps.proxyHost=proxy.service -Dhttps.proxyPort=3128" pipeline --list
Direct Java call (alternative):
java -Dhttp.proxyHost="proxy.service" -Dhttp.proxyPort="3128" -Dhttps.proxyHost="proxy.service" -Dhttps.proxyPort="3128" -jar sg-upload-v2-latest.jar pipeline --list
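All three variants set the same four JVM system properties. As a small sketch, the helper below (hypothetical name `proxy_jvm_options`) builds them from one host and port, which helps keep the HTTP and HTTPS settings in sync. It assumes the same proxy serves both schemes, as in the examples above.

```python
def proxy_jvm_options(host, port):
    """Return the JVM system properties that route HTTP and HTTPS traffic through one proxy."""
    options = []
    for scheme in ("http", "https"):
        options.append(f"-D{scheme}.proxyHost={host}")
        options.append(f"-D{scheme}.proxyPort={port}")
    return options


opts = proxy_jvm_options("proxy.service", 3128)
print('python3 sg-upload-v2-wrapper.py --jvm-opts "' + " ".join(opts) + '" pipeline --list')
```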
Running CLI with GEN2 Platform Services
The latest CLI version supports GEN2 platform services. This feature is currently being rolled out to all users. If it has not yet been activated for your account, the following guide provides instructions on how to activate GEN2 platform services on demand for testing purposes. As previously notified, switching to GEN2 platform services is mandatory for all users and will be enforced soon. Please reach out to your account manager for further information.
This guide outlines how to run the CLI with GEN2 platform services.
Instructions
Add the -fp flag to CLI commands to run them with GEN2 platform services.
Examples:
ADEGEN script:
python3 adegen.py -c -p 12345 folder -o myinput.json -fp
List pipelines
java -jar sg-upload-v2-latest.jar pipeline --list -fp
List patients
java -jar sg-upload-v2-latest.jar patient --list -fp
Create new run
java -jar sg-upload-v2-latest.jar new -j myinput.json -fp
Download file
java -jar sg-upload-v2-latest.jar file --download --file-id 12345 --file-out my_file -fp
Managing Multiple SOPHIA IAM Account Sessions with CLI
Currently, our CLI supports only one active user session per Linux user profile. By default, the session data is stored in ~/.sophia/token.json and ~/.sg-upload-client/client_data.conf. When switching accounts, the session files are overwritten. To work around this, you can manually back up and restore session data for different accounts.
Example: Managing sessions for two IAM accounts (123 and 456)
Step 1: Login to IAM (Account 123)
- Execute the following command to log in:
python3 sg-upload-v2-wrapper.py login-iam
- You will see the following output:
Login successful
You're logged to IAM with client id: 123
- Back up the session files by moving them to a separate folder for Account 123:
mkdir -p ~/sophia_sessions/sophia_123
mv ~/.sophia/token.json ~/sophia_sessions/sophia_123/
mv ~/.sg-upload-client/client_data.conf ~/sophia_sessions/sophia_123/
Step 2: Login to IAM (Account 456)
- Log in to a different IAM account:
python3 sg-upload-v2-wrapper.py login-iam
- You will see the following output:
Login successful
You're logged to IAM with client id: 456
- Back up the session files by moving them to a separate folder for Account 456:
mkdir -p ~/sophia_sessions/sophia_456
mv ~/.sophia/token.json ~/sophia_sessions/sophia_456/
mv ~/.sg-upload-client/client_data.conf ~/sophia_sessions/sophia_456/
Step 3: Restore Session for Account 123
To restore your session for Account 123, copy the backed-up session files back to their original locations:
cp ~/sophia_sessions/sophia_123/token.json ~/.sophia/
cp ~/sophia_sessions/sophia_123/client_data.conf ~/.sg-upload-client/
Your session for Account 123 is now active again.
Step 4: Restore Session for Account 456
To switch back to Account 456, repeat the process with the backed-up files for Account 456:
cp ~/sophia_sessions/sophia_456/token.json ~/.sophia/
cp ~/sophia_sessions/sophia_456/client_data.conf ~/.sg-upload-client/
By following these steps, you can manage multiple IAM account sessions on the same machine by manually switching between session tokens and configuration files.
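The backup and restore steps can be wrapped in a small helper. The following is a minimal sketch using the file locations and folder layout described above; the function names `backup_session` and `restore_session` are hypothetical. Unlike Steps 1 and 2, which move the files, this sketch copies them, so the current session stays active after a backup.

```python
import shutil
from pathlib import Path

# Session file locations relative to the home directory, as listed above.
SESSION_FILES = (
    Path(".sophia") / "token.json",
    Path(".sg-upload-client") / "client_data.conf",
)


def backup_session(account, home=None):
    """Copy the current session files to ~/sophia_sessions/sophia_<account>/."""
    home = Path(home) if home else Path.home()
    target = home / "sophia_sessions" / f"sophia_{account}"
    target.mkdir(parents=True, exist_ok=True)
    for rel in SESSION_FILES:
        shutil.copy2(home / rel, target / rel.name)
    return target


def restore_session(account, home=None):
    """Copy the backed-up files for <account> back to their original locations."""
    home = Path(home) if home else Path.home()
    source = home / "sophia_sessions" / f"sophia_{account}"
    for rel in SESSION_FILES:
        (home / rel).parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source / rel.name, home / rel)
```

For example, call `backup_session("123")` right after logging in to Account 123, and `restore_session("123")` whenever you want to switch back to it. The optional `home` argument exists only so the helper can be tried against a scratch directory first.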