(Almost) new lab in town, and with it, a brand new course to offer.

Somali Chaterji (Sonata)

February 26


Wordle for my Lab ICAN, which expands to Innovatory for Cells and Neural Machines, housed in Purdue’s Ag and Biological Engineering department.


I am teaching an applied data science and engineering course that reflects the flavor of research in my lab, ICAN. The course is titled “Machine Learning & High-Performance Computing for Digital Ag & Biological Engineering; Part 1: Algorithms, resilient data lakes, & analytics at the edge”. It will start in March and will be offered as a one-credit stackable course, part of a Purdue initiative that lets students stack such courses into a custom data science curriculum.

Course webpage: schaterji.io/teaching; Flyer: http://bit.ly/chaterji-data-science; Video plug for the course: http://bit.ly/ABE591


Prof. Chaterji teaching Summer School at Purdue’s Ag. and Biological Engineering department [Summer 2019].


ABE 591 will have a top-down flavor: it starts from real-world data problems and then discusses the data science and data engineering algorithms and concepts that can solve them. The course will feature data analytics and foundational computer science (CS) concepts that power the artificial intelligence/machine learning (AI/ML) tools ubiquitous in today’s world. The course will unfold with some application domains in mind, focusing on a subset of the following (custom-picked based on the student cohort): digital agriculture, Internet-of-Things (IoT), metagenomics, synthetic biology, and machine systems.

Course Highlights: AI/ML tools power search engines, recommendation engines, and even your cellphones and Raspberry Pi-class devices; the right application of these tools can fuel AI-for-the-greater-good. ABE 591 is being developed with the curious citizen scientist in mind. Students are therefore expected to be willing to delve into data science and engineering principles and to go beyond the tools available through various software bundles and packages. The focus will be on algorithmic intuition and on applications to select research domains inspired by global challenges at the food-water-energy nexus. Many of these use cases will be inspired by projects in my innovatory that straddle Computer Science & Engineering on the one hand and Digital Agriculture and Health on the other.


ICAN: nexus of (living) Cells and Machines (sensors).


ICAN uses a variety of ML algorithms to identify patterns in data. Example applications include clustering data (unsupervised learning) into meaningful classes and then labeling these clusters, and finding patterns in data generated by Internet-of-Things (IoT) sensing devices (e.g., ground sensors in farms) or captured by high-throughput genomics sequencers (e.g., Oxford Nanopore sequencers).
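To make the cluster-then-label idea concrete, here is a minimal sketch in plain Python. It is not code from the lab or the course: the soil-moisture readings, the function names (`kmeans_1d`, `nearest`, `label_readings`), and the "dry"/"wet" labels are all hypothetical, and a simple one-dimensional k-means stands in for whichever clustering algorithm a real project would use.

```python
import random

# Hypothetical soil-moisture readings (volumetric fraction) from ground sensors.
readings = [0.11, 0.12, 0.10, 0.13, 0.34, 0.36, 0.33, 0.35]


def nearest(value, centroids):
    """Return the index of the centroid closest to value."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))


def kmeans_1d(values, k, iterations=20, seed=0):
    """Plain k-means on 1-D data; returns the k centroids in ascending order."""
    rng = random.Random(seed)
    centroids = rng.sample(list(values), k)  # start from k of the data points
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(v, centroids)].append(v)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    return sorted(centroids)


def label_readings(values, centroids, names):
    """Attach a human-chosen name to each reading via its nearest centroid."""
    return [(v, names[nearest(v, centroids)]) for v in values]


centroids = kmeans_1d(readings, k=2)
labeled = label_readings(readings, centroids, names=["dry", "wet"])
for value, name in labeled:
    print(value, name)
```

The clustering step never sees the names; a person (or a later supervised step) attaches them once the groups have emerged, which is the sense in which the clusters are labeled afterwards.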

In particular, the course will focus on imparting CS-y concepts and tools that will better equip students to pick the right kind of algorithm or platform to “learn” the patterns embedded in a data set, and to do so at scale. Here is a snapshot of topics:

Supercomputing clusters and multi-grained parallelization: Students will learn to use supercomputing clusters to scale out computation and analyze data faster. We will discuss examples of how distributed computation and multi-grained parallelization speed up computation, using sample data sets (real and synthetic). Data stories will depict how computing at the edge (also known as fog computing) can handle data processing tasks that demand sub-second response times.


From: https://www.networkworld.com/article/3224893/what-is-edge-computing-and-how-it-s-changing-the-network.html


Hardware: The course will also cover a range of hardware, from microcontrollers to graphics processing units (GPUs) and domain-specific accelerators, such as FPGAs.

On-device computation: Finally, we will discuss how TensorFlow and the Raspberry Pi work together, how to approximate algorithms so that they run on embedded devices, and real-world applications of these deployments.
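One common way to approximate a model so that it fits on an embedded device is quantization: storing weights as small integers instead of 32-bit floats. The sketch below illustrates symmetric 8-bit quantization in plain Python. It is only an illustration under assumed values: the weights and the function names (`quantize_int8`, `dequantize`) are hypothetical, and it does not use the TensorFlow API, although TensorFlow Lite offers post-training quantization along similar lines.

```python
def quantize_int8(weights):
    """Symmetric linear quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0  # float value of one integer step
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers."""
    return [q * scale for q in quantized]


# Hypothetical weights from one small layer of a model.
weights = [0.82, -0.41, 0.05, -1.27, 0.63]

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("quantized:", quantized)
print("scale:", scale)
print("largest rounding error:", max_error)
```

Each weight now needs one byte instead of four, at the cost of a rounding error of at most half a quantization step per weight. Whether that trade-off is acceptable depends on the model and the application.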