Somali Chaterji's Corner in Cyberspace

I am working at the intersection of computational genomics and machine learning, on challenging and potentially ground-breaking biomedical problems.
Best paper award at ACM BCB '15 (Sep '15; Atlanta, GA)
NIH R01 project kickoff at Argonne National Lab (Jul '16; Argonne, IL)
With the Seed of Success acorn award given to Purdue investigators winning > $1M grants (Oct '16; W Lafayette, IN)
My first half marathon (Oct '15; W Lafayette, IN)
Finishing the Purdue half marathon (Oct '16; W Lafayette, IN)


Somali Chaterji (pronounced: SHOH-MAH-LEE CHA-ter-JEE) is a Visiting Assistant Professor at Purdue University, where she specializes in developing algorithms and statistical models for bioengineering and computational genomics.

Somali got her PhD in Biomedical Engineering from Purdue University, winning the Chorafas International Award (2010), College of Engineering Best Dissertation Award (2010), and the Future Faculty Fellowship Award (2009). She did her Post-doctoral Fellowship at the University of Texas at Austin in the Department of Biomedical Engineering, where her work was supported by an American Heart Association award. Dr. Chaterji is also a technology commercialization enthusiast and has been consulting for the IC2 Institute at the University of Texas at Austin since Spring 2014.

Research Interests:

My current areas of interest and expertise are:

  1. Data Science for Healthcare
  2. Synthetic Biology
  3. Systems Engineering
  4. Data-Driven Molecular Therapeutics
  5. High-Performance Computing for Healthcare

From my post-doctoral days, I have expertise in cellular mechanotransduction, nanolithographic fabrication, and vascular biomechanics, with translation of laboratory discoveries to clinical practice through collaboration with cardiologists at St. David's Healthcare in Austin. In my PhD work, my areas of study were cardiovascular engineering, vascular extracellular microenvironments, and cardiovascular devices. I have collaborations with both wet-lab experimentalists (UT Austin, U of Washington, UNC Medical School) and computer engineers (Purdue, U of Maryland, U of Chicago).

News:

  1. November 2017: We have a new paper in Theranostics (IF = 8.766) on microRNA-based precision therapeutics and one in Biomaterials (IF = 8.402) on biophysical aspects of cardiovascular cell-based nanosensing to facilitate cell engineering paradigms.

  2. August 2017: We have two papers accepted to Briefings in Bioinformatics, one to the Middleware conference, and one to Theranostics. See below under "Publications".

  3. June 2017: I gave a talk at the IFIP Working Group 10.4 workshop in Longmont, Colorado on the topic of "The Fault Tolerant Epigenome & its Data Profile". The working group comprises leading researchers and practitioners in the area of dependable computing and meets for two workshops each year. [ pdf ] [ WG web site ]

  4. March-June 2017: We have 3 papers accepted or under minor revision. They are on federation in bioinformatics cyber infrastructures (Briefings in Bioinformatics), genome editing (Theranostics journal), and parallelization of de novo genome assembly (ACM BCB). [ html ]

  5. January 2017: I am serving on the Organizing Committee and Program Committee of the 10th IEEE International Conference on Communication Systems and Networks (Comsnets) to be held Jan 3-7 in Bangalore, India.  [ html ]

  6. September 2016: A new graduate student Ashraf Mahgoub joins our team as a Graduate Research Assistant. He will be working on the NIH R01 project with Argonne. Ashraf comes to us with a Masters from Cairo University, Egypt. Here is a brief profile. [ html ]

  7. February 2016: Our NIH R01 proposal titled "Continued development and maintenance of the MG-RAST metagenomics pipeline" has just been funded for 5 years. This is joint between us at Purdue and Argonne National Lab (Folker Meyer). This came out of NIAID and the total budget is about $3.7M.

News Archive


Here are the most recent publications (while at Purdue) and some older ones, which are harder to find online. For the complete list, please visit the Google Scholar page.

  1. Ghoshal, A., Zhang, J., Roth, M., Xia, K. M., Grama, A., and Chaterji, S., "A Distributed Kernel SVM Algorithm for Predicting Non-Canonical MicroRNA Targets and Verified Using TCGA-derived Expression Data," Accepted to IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), pp. 1-23, 2017.

  2. Koo J., Zhang J., and Chaterji, S., "Tiresias: Deep Learning Approach to Decipher MicroRNA Regulatory Networks," Theranostics, vol. 8, issue 1, pp. 277--291, 2018. [ paper ]

  3. Thomas, T. E., Koo J., Chaterji, S., and Bagchi, S., "Minerva: A reinforcement learning-based technique for optimal scheduling and bottleneck detection in distributed factory operations," Accepted to appear at the 10th IEEE Conference on Communication Systems & Networks (COMSNETS), pp. 1-8, Jan 3-7, 2018, Bangalore, India. (Acceptance rate: 38/125 (30.4%)) [ paper ]

  4. Le, V., Lee, J., Chaterji, S., Spencer, A., Liu, Y-L., Kim, P., Yeh, H-C., Kim, D-H., Baker, A. B., "Syndecan-1 in mechanosensing of nanotopological cues in engineered materials," Biomaterials, volume 155, pp. 13-24, February 2018. [ paper ]

  5. Ashraf Mahgoub, Paul Wood, Sachandhan Ganesh, Subrata Mitra (Adobe Research), Wolfgang Gerlach (Argonne National Laboratory), Travis Harrison (Argonne National Laboratory), Folker Meyer (Argonne National Laboratory), Ananth Grama, Saurabh Bagchi, and Somali Chaterji, “Rafiki: A Middleware for Parameter Tuning of NoSQL Datastores for Dynamic Metagenomics Workloads,” Accepted to appear at the ACM/IFIP/USENIX Middleware Conference, pp. 1-13, Dec 11-15, 2017, Las Vegas, Nevada. (Acceptance rate: 20/85 = 23.5%) [ Paper ]

  6. Somali Chaterji, Eun Hyun Ahn, and Deok-Ho Kim, "CRISPR Genome Engineering for Human Pluripotent Stem Cell Research," Theranostics, vol. 7, pp. 1–25, 2017. [ Paper ]

  7. Folker Meyer, Saurabh Bagchi, Somali Chaterji, Wolfgang Gerlach, Ananth Grama, Travis Harrison, Tobias Paczian, Will Trimble, Andreas Wilke, “MG-RAST Version 4 — Lessons learned from a decade of low-budget ultra-high throughput metagenome analysis,” Accepted to appear in Oxford Briefings in Bioinformatics, pp. 1-12, acceptance date: August 2017. [ Abstract ]

  8. Somali Chaterji, Jinkyu Koo, Ninghui Li, Folker Meyer, Ananth Grama, and Saurabh Bagchi, “Federation in Genomics Pipelines: Techniques and Challenges,” Accepted to appear in Oxford Briefings in Bioinformatics, pp. 1-11, acceptance date: August 2017. [ Paper ]

  9. Kanak Mahadik, Christopher Wright, Milind Kulkarni, Saurabh Bagchi, and Somali Chaterji, “Scalable Genomic Assembly through Parallel de Bruijn Graph Construction for Multiple K-mers,” At the 8th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM BCB), pp. 1-7, Aug 20-23, 2017, Boston, MA. [ Paper ]

  10. Kim SG, Harwani M, Grama A, and Chaterji S. EP-DNN: A Deep Neural Network-Based Global Enhancer Prediction Algorithm. In Nature Scientific Reports, pp. 1-21, vol. 6, 2016. [ Paper ]

  11. Kim SG, Theera-Ampornpunt N, Fang C-H, Harwani M, Grama A, and Chaterji S. Opening up the blackbox: An interpretable deep neural network-based classifier for cell-type specific enhancer predictions. In the BMC Systems Biology journal, pp. 1-26, 10(2), article 54, 2016. [ Paper ]

  12. Mahadik K, Wright C, Zhang J, Kulkarni M, Bagchi S, and Chaterji S. SARVAVID: A Domain Specific Language for Developing Scalable Computational Genomics Applications. At the International Conference on Supercomputing (ICS), pp. 1-13, June 1-3, 2016, Istanbul, Turkey. (Acceptance rate: 32/183 = 17.5%) [ Paper ]

  13. Theera-Ampornpunt N, Kim SG, Ghoshal A, Bagchi S, Grama A, and Chaterji S. Computational and network cost of training distributed Support Vector Machines for large genomics data. At the 8th International IEEE Conference on Communication Systems and Networks (COMSNETS), pp. 1-8, January 5-9, 2016, Bangalore, India. (Acceptance rate: 39/143 = 27.3%) [ Paper ]

  14. Ghoshal A, Grama A, Bagchi S, and Chaterji S. An Ensemble SVM Model for the Accurate Prediction of Non-Canonical MicroRNA Targets. In Proceedings of the 6th ACM Conference on Bioinformatics, Computational Biology and Health Informatics (BCB), pp. 403-412, September 9-12, 2015, Atlanta, GA. (Acceptance rate: 48/141 = 34%) (Winner of the best paper award) [ Paper ] [ Presentation ] [ Recording of the presentation (in wav) ]

  15. Kim S, Ampornpunt N, Grama A, and Chaterji S. Interpretable Deep Neural Networks for Enhancer Prediction. At the 9th IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 242-249, Nov 9-12, 2015. (Acceptance rate: 68/346 = 19.7%) [ Paper ]

  16. Ghoshal A, Shankar R, Bagchi S, Grama A, and Chaterji S. MicroRNA target prediction using thermodynamic and sequence curves. Accepted for publication in BMC Genomics, notification: September 2015. [ Paper ]

  17. Purdue University, "Orion: A fine grained parallel implementation for genomic search," At: (January 2015).

  18. Mahadik K, Chaterji S, Zhou, B, Kulkarni, M, and Bagchi, S. Orion: Scaling Genomic Sequence Matching with Fine-Grained Parallelization. In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (Supercomputing), pp. 449-460, Nov 16-21, 2014. (Acceptance rate: 82/394 = 20.8%) [ Paper ]

  19. Chaterji S, Kim P, Choe SH, Ho DS, Tsui JH, Baker AB, Kim DH. Synergistic Effects of Matrix Nanotopography and Stiffness on Vascular Smooth Muscle Cell Function. Tissue Engineering Part A, 2014, 20(15-16): 2115-2126. [ Paper ]

  20. Vedantham K (+), Chaterji S (+), Kim SW, Park K. Development of a probucol‐releasing antithrombogenic drug eluting stent. Journal of Biomedical Materials Research Part B: Applied Biomaterials 100, no. 4 (2012): 1068-1077. (+ = Equal contribution) [ Paper ]

Google Scholar Page


I am fortunate to work with the following brilliant graduate students:

And the following stellar undergraduate students:



  1. How to predict new canonical and non-canonical microRNA targets?
  2. How do long non-coding RNAs act as enhancers of gene expression?
  3. How to build parallel algorithms for genomics applications?
  4. Check out others that are ongoing here:

Examples of questions that we are looking to answer are:

  1. How do you predict the function and targets of these non-coding regions using cutting-edge machine learning techniques?
  2. How to use graph-based methods to extend gene regulatory networks?
  3. How do you become a part of the $1000 genome? Is it for real? Can you be a part of this Analysis Revolution, analyzing omics data?
  4. Can you decide to tweak your genes using epigenetics?

If all of this excites you and you are looking to be a part of this exciting “omics” revolution, come join us!


Last updated: September 10, 2017