BIOEN 6900-003: Data Science for Bioengineers
- For postdoctoral, graduate, and advanced undergraduate students in Engineering, Sciences, and Medicine, and professionals in industry.
- Fall 2018, Mondays and Wednesdays 11:50am–1:10pm, LCB 115.
Prerequisites: Some programming experience and instructor approval.
Grading: 30% labs, 30% presentation, 30% class project, 10% class participation. Class attendance is required.
We will cover concepts in data science and machine learning, and their applications to the discovery of principles from biomedical data.
- Databases, from the Cancer Genome Atlas (TCGA) at the Genomic Data Commons (GDC) to the Utah Population Database (UPDB).
- Data types, from omics, imaging, and patient clinical information to biomedical samples and model organisms and systems.
- Algorithms, from the singular value decomposition (SVD) and principal component analysis (PCA) to multi-tensor decompositions, neural networks, and deep learning.
- Applications, from the Luria-Delbrück experiment to personalized cancer diagnostics, prognostics, and therapeutics.
- Proving mathematical theorems and programming symbolic computations.
- Designing algorithms and programming numerical computations.
- Working with databases and modeling biomedical data.
- In-class presentations of scientific journal articles and patents.
- Participation in guest lectures and seminars on campus and discussions of conference reports.
- End-of-class celebration.
- Fall 2018 Calendar
- Health, Wellness, and Counseling
- Student Code
Numerical Linear Algebra, Trefethen and Bau, III (1997).
August 27 and 29:
In-Class Work on Lab 1:
Code the SVD of synthetic data and its visualization. Test and debug your code.
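As one possible starting point for this lab, the sketch below composes a synthetic matrix from two known patterns plus noise, computes its SVD with NumPy, and prints a text bar chart of the fraction of the overall signal that each component captures. The function names, matrix sizes, noise level, and the sine and cosine patterns are illustrative assumptions, not part of the course materials.

```python
import numpy as np


def compose_synthetic(n_rows=50, n_cols=12, noise=0.01, seed=0):
    """Compose a synthetic matrix from two known column patterns plus noise.

    All sizes and patterns here are hypothetical choices for illustration.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 2.0 * np.pi, n_cols)
    patterns = np.vstack([np.sin(t), np.cos(t)])   # shape (2, n_cols)
    weights = rng.normal(size=(n_rows, 2))         # shape (n_rows, 2)
    weights[:, 0] *= 3.0                           # make the first pattern dominant
    return weights @ patterns + noise * rng.normal(size=(n_rows, n_cols))


def svd_summary(data):
    """Return the thin SVD and the fraction of squared signal per component."""
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    fractions = s**2 / np.sum(s**2)
    return u, s, vt, fractions


data = compose_synthetic()
u, s, vt, fractions = svd_summary(data)

# Test: multiplying the three factors back together should recover the data.
reconstruction = (u * s) @ vt

# Text visualization of the leading components.
for k, f in enumerate(fractions[:4]):
    print(f"component {k + 1}: {'#' * int(round(50 * f))} {f:.3f}")
```

Because the data are composed from two patterns, nearly all of the signal should fall in the first two components; a plotting library could replace the text bars.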
European Association for Signal Processing (EURASIP) Summer School on Tensors in Medicine (Leuven, Belgium, August 27–31, 2018).
Mathematical properties of the SVD
Composition and decomposition of synthetic data:
September 24 and 26:
In-Class Work on Lab 2:
Compute and visualize the SVD of your data. Test and debug your code. Interpret your data based upon its SVD. Use at least two different approaches for each of the following: preprocessing and sorting your data, and assessing the statistical significance of your interpretation.
National Cancer Institute (NCI) Joint Meeting of the Cancer Systems Biology Consortium and the Physical Sciences in Oncology Network (Bethesda, MD, September 25–28, 2018).
October 3, Wednesday, 11:50am–1:10pm, WEB 2230:
October 5, Friday, 2:00–3:00pm, WEB 3780, in lieu of Lab 3:
Scientific Computing and Imaging (SCI) Institute Distinguished Seminar
Kirk E. Jordan
October 8 and 10:
More examples of HOSVD of measured data:
Paper 8: A Tensor Higher-Order Singular Value Decomposition for Integrative Analysis of DNA Microarray Data from Different Studies, Omberg et al., Proc Natl Acad Sci USA (2007).
Paper 9: Characterizing the Evolution of Genetic Variance Using Genetic Covariance Tensors, Hines et al., Philos Trans R Soc Lond B Biol Sci (2009).
Paper 10: Integrative Analysis of Many Weighted Co-Expression Networks Using Tensor Computation, Li et al., PLoS Comput Biol (2011).
Paper 11: MultiFacTV: Module Detection from Higher-Order Time Series Biological Data, Li et al., BMC Genomics (2013).
Paper 12: Subgraph Augmented Nonnegative Tensor Factorization (SANTF) for Modeling Clinical Narrative Text, Luo et al., J Am Med Inform Assoc (2015).
Computation of the HOSVD:
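A minimal sketch of one common way to compute the HOSVD, assuming a dense NumPy array as the tensor: unfold the tensor along each mode, take the left singular vectors of each unfolding as that mode's factor matrix, and project the tensor onto those factors to obtain the core. The helper names `unfold`, `mode_multiply`, `hosvd`, and `reconstruct` are hypothetical, and the random test tensor stands in for real data.

```python
import numpy as np


def unfold(tensor, mode):
    """Matricize the tensor so that the given mode indexes the rows."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_multiply(tensor, matrix, mode):
    """Multiply the tensor by a matrix along one mode (the mode-n product)."""
    # tensordot contracts the matrix columns with the chosen tensor axis
    # and places the new axis first; move it back to its original position.
    result = np.tensordot(matrix, tensor, axes=(1, mode))
    return np.moveaxis(result, 0, mode)


def hosvd(tensor):
    """Return the core tensor and one orthonormal factor matrix per mode."""
    factors = []
    for mode in range(tensor.ndim):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u)
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_multiply(core, u.T, mode)
    return core, factors


def reconstruct(core, factors):
    """Multiply the core by every factor matrix to recover the tensor."""
    tensor = core
    for mode, u in enumerate(factors):
        tensor = mode_multiply(tensor, u, mode)
    return tensor


rng = np.random.default_rng(0)
tensor = rng.normal(size=(4, 5, 6))   # synthetic order-3 tensor
core, factors = hosvd(tensor)
rebuilt = reconstruct(core, factors)
```

Two useful checks when debugging are that `rebuilt` matches the input and that the rows of each unfolding of `core` are mutually orthogonal.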
"State of the project" presentations
"State of the project" presentations
Tensor SVD of measured data:
Tensor SVD of synthetic data:
From the SVD to PCA:
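The link between the two can be sketched numerically, assuming samples in rows and variables in columns: after centering each column, the right singular vectors of the data are the principal axes, and the squared singular values divided by the number of samples minus one are the variances along them. The synthetic data and all variable names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: 200 samples of 3 correlated variables (hypothetical mixing matrix).
mixing = np.array([[2.0, 0.0, 0.0],
                   [0.5, 1.0, 0.0],
                   [0.0, 0.3, 0.2]])
samples = rng.normal(size=(200, 3)) @ mixing

# PCA via the SVD of the column-centered data.
centered = samples - samples.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)
variances_from_svd = s**2 / (samples.shape[0] - 1)
scores = centered @ vt.T              # coordinates along the principal axes

# PCA via the eigendecomposition of the sample covariance matrix.
covariance = np.cov(samples, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(covariance)
eigenvalues = eigenvalues[::-1]       # eigh returns ascending order
eigenvectors = eigenvectors[:, ::-1]

print(variances_from_svd)
print(eigenvalues)
```

The two routes should agree up to the sign of each principal axis, which neither decomposition determines.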
Mathematical variations on the SVD and PCA:
The "perceptron," i.e., single-layer neural network, as a mathematical variation on the SVD:
Project update presentations
Happy Winter Break!