sfkit: A Web-Based Toolkit for Secure and Federated Genomic Analysis

Abstract

Advances in genomics increasingly depend on the ability to analyze large and diverse genomic data collections, which are often difficult to amass due to privacy concerns. Recent work has shown that multiple parties can jointly analyze their datasets while provably preserving the privacy of each party’s data using cryptographic techniques. However, these tools have been challenging to use in practice because of the complex setup and coordination required among the parties. We present sfkit, a secure and federated toolkit for collaborative genomic studies, which allows groups of collaborators to easily perform joint analyses of their datasets without compromising privacy. sfkit consists of a web server and a command-line interface, which together support a range of use cases, including both auto-configured and user-supplied computational environments. sfkit provides collaborative workflows for two essential tasks: genome-wide association studies (GWAS) and principal component analysis (PCA). We envision sfkit becoming a one-stop server for secure collaborative tools for a broad range of genomic analyses. sfkit is open-source and available at sfkit.org.

Publication
Nucleic Acids Research 2023
David Froelicher
Research Manager

My research interests include applied cryptography, distributed systems and genomic privacy.