CPSC/AMTH/CBB 745 - Advanced Topics in Machine Learning & Data Mining - Spring 2017 Yale

CPSC/AMTH/CBB 745

Advanced Topics in Machine Learning & Data Mining

Spring 2017

Alex Cloninger ♦ Smita Krishnaswamy ♦ Guy Wolf
(guy.wolf@yale.edu)

Due to improvements in measurement and storage technologies, massive amounts of data are being collected in a variety of domains, including the biological/biomedical, social, economic, artistic, and cultural domains. This seminar will provide an overview of the advances made in the last decade in machine-learning and automatic data-mining approaches for dealing with modern data-analysis challenges. This year, the seminar will focus on a broad range of biomedical data analysis tasks, such as single-cell RNA sequencing, single-cell signaling, proteomic analysis, healthcare assessment, medical diagnosis, and treatment recommendations. The covered approaches will include, for example, deep learning, kernel methods, and dictionary learning.

The seminar will be based on student presentations and discussions of recent prominent publications from leading journals in machine learning (e.g., MLJ, JMLR, TPAMI, and DMKD) and computational biology (e.g., Science, Nature Biotechnology, Nature Methods, and BMC Bioinformatics), as well as from conferences (e.g., ICML, NIPS, SampTA, and SigKDD/SigMOD). Students' grades will be based on the quality of these presentations and discussions and on students' written summaries.

Class sessions: Wednesdays 2:45-5:00, AKW 307 (even though the course is officially slotted as 2:30-5:15).
The first class will be on Wednesday, January 25th.


Recommended papers:

A list of recommended papers for student presentations is provided here (link). Note that this list is not meant to be exhaustive, but rather to serve as a good starting point for exploring suitable topics. You are more than welcome to use papers that do not appear here, and feel free to consult with the course staff regarding any additional topics that you would like to present in class.


Schedule:

Date | Subject | Presented by | Notes
Jan 25 | Introduction to the course & to biomedical data analysis | Guy Wolf & Smita Krishnaswamy |
Feb 01 | Deep learning & its applications in biomedical data analysis | Alex Cloninger |
Feb 08 | The scattering transform (classification & feature extraction) | Guy Wolf |
Feb 15 | Visualizing data using embeddings | Laurens van der Maaten | First part of the session is joint with the applied math seminar: 2:45-3:45, AKW 000
 | Image classification with deep convolutional networks | Scott Stankey |
Feb 22 | Topological data analysis with applications in genetics | Derek Lo | This session will have two student talks
 | Deep recurrent neural networks & applications to biology | Lincoln Swaine-Moore |
Mar 01 | Methods for predicting patient risk and inferring counterfactuals | Derek Yu | This session will have two student talks
 | Analyzing biomedical data with multi-task learning | Kevin Ta |
Mar 08 | Quantum and quantum-inspired machine learning & data mining | Jacob Marks | This session will have two student talks
 | Predicting human behavior and disorders from fMRI data | Sreejan Kumar |
Mar 15 | Spring Break | -- |
Mar 22 | Spring Break | -- |
Mar 29 | Machine learning techniques applied to regulatory sequences | Lionel Jin | This session will have two student talks
 | Extracting protein-protein interactions from biomedical literature | Michael Menz |
Apr 05 | Genetic studies of disease: mapping, fine-mapping, and biological interpretation | Chris Cotsapas | This session will have one guest talk (Prof. Cotsapas) and one student talk
 | Supervised prediction of protein function based on deep learning | Hussein Mohsen |
Apr 12 | HCRF-based detection of epileptogenic cortical malformations | Kay Meelu | This session will have two student talks
 | Deep generative models & feature learning with applications in computational biology | Adam Erickson |
Apr 19 | MAGIC: diffusion-based imputation & denoising of single-cell data | David van Dijk | This session will have two short guest talks (Dr. Moon and Dr. van Dijk) and one student talk
 | PHATE: potential of heat-diffusion for affinity-based trajectory embedding | Kevin Moon |
 | Multimodal data fusion and analysis | Jennifer Laine |
Apr 26 | Word2vec vector embedding variations and biological applications | Benjamin Rosenbluth | This session will have two student talks
 | Reinforcement Learning: Theory and Application | Tyler Dohrn |