Description of Course


This is a graduate-level elective course that aims to provide an interface between astronomical data analysis problems and modern statistical methods. Modern astronomy and astrophysics are undergoing a revolution, with dramatic increases in both the volume and complexity of astronomical data. The last decade saw the emergence of many terabyte-level sky surveys across the electromagnetic spectrum; in the next decade, data volumes will enter the petabyte regime, with an ever stronger time-domain component. These new data sets represent quantum leaps in our capacity for new astronomical discoveries, but they also present significant challenges to the standard analysis tools normally employed in astronomy.



The goal of this course is to bridge the gap between modern large data surveys and the data analysis tools taught in standard graduate courses. The course will start with a brief review of the modern statistical framework relevant to large-scale data analysis, including probability and statistical distributions, and classical and Bayesian statistical inference. It will then cover the main topics of the course, data mining and machine learning: density estimation, clustering analysis, dimensionality reduction, regression and model fitting, classification, and time series analysis. Another key component of the course is an introduction to commonly used data mining and machine learning tools in Python-based packages, which will be used to solve data problems throughout the course.





Instructor and Contact Information

Prof. Xiaohui Fan 

Office: SO 340 

Phone: 626-7558 

Email: fan@as.arizona.edu

Office hours: Wednesday 11am – 12pm, SO 340, or by appointment


Course Format and Teaching Methods


The course will be given as a combination of instructor lectures, student-led seminars and guest lectures. After the first section of introductory material, students will lead the discussion and demonstration of most topics, and guest lectures will introduce important current and future big data projects in astronomy. The class will conclude with final projects in which students apply data mining and machine learning tools to their own research data.


Course Communications


http://sancerre.as.arizona.edu/~fan/Home/AST502.html

https://github.com/UA-ast502-2020/classnotebook





Required Texts or Readings


The main textbook is “Statistics, Data Mining, and Machine Learning in Astronomy”, by Ivezić, Connolly, VanderPlas and Gray, 2019, Princeton University Press.


The code and figures used in the book can be found on the astroML website: http://www.astroml.org/


Final Project

Students will be divided into several groups to work on a data mining and machine learning final project and present it to the entire class.


Grading Scale and Policies

Class Attendance: 33.3% 

Classroom Lecture: 33.3% 

Final Project: 33.3% 
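
As an illustration of the arithmetic only, the three equal weights above can be combined as in the minimal Python sketch below. The component names, the 0-100 score scale, and the `course_score` helper are assumptions made for this example and are not part of the official grading procedure; the weights are normalized by their sum because the listed values add up to 99.9%.

```python
# Weights taken from the grading scale above (in percent).
# The dictionary keys are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "attendance": 33.3,  # Class Attendance
    "lecture": 33.3,     # Classroom Lecture
    "project": 33.3,     # Final Project
}


def course_score(scores):
    """Return the weighted average of component scores.

    Assumes each component score is on a 0-100 scale (an assumption of
    this sketch). Weights are normalized by their sum, so the result is
    also on a 0-100 scale.
    """
    missing = sorted(set(WEIGHTS) - set(scores))
    if missing:
        raise ValueError(f"missing component scores: {missing}")
    total_weight = sum(WEIGHTS.values())
    weighted_sum = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return weighted_sum / total_weight


if __name__ == "__main__":
    example = {"attendance": 90, "lecture": 80, "project": 70}
    print(round(course_score(example), 1))
```

Because the three weights are equal, the result is simply the mean of the three component scores.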


Class Schedule


  1. 01/15 - Introduction

  2. 01/22 - Tools

  3. 01/27 - Statistics refresher

  4. 01/29 - No class

  5. 02/03 - Palmer: maximum likelihood estimate (4.2 - 4.5)

  6. 02/05 - Zeljko Ivezic: LSST overview (guest lecture)

  7. 02/10 - Chen: non-parametric modeling (4.8 - 4.9)

  8. 02/12 - RS and Lo: Bayesian parameter estimation (5.3, 5.6)

  9. 02/17 - Tang: Bayesian model selection (5.4, 5.7)

  10. 02/19 - Peter Behroozi: Big Data and the Universe Machine (guest lecture)

  11. 02/24 - Xu: MCMC (5.8)

  12. 02/26 - Pearce and Rodozenski: PCA (7.1 - 7.3)

  13. 03/02 - CK Chan: Big Data Challenges in EHT (guest lecture)

  14. 03/04 - No class

  15. 03/16 - Stephanie Juneau and Robert Nikutta: Science Platforms and Data Lab (guest lecture)

  16. 03/18 - Stephanie Juneau and Robert Nikutta: Science Platforms and Data Lab (guest lecture)

  17. 03/23 - Scott: Dimensionality: Manifold learning and ICA (7.5, 7.6)

  18. 03/25 - Woodrum: Regression: linear models (8.1 - 8.5)

  19. 03/30 - White: Regression: nonlinear models (8.7 - 8.10)

  20. 04/01 - Tom Matheson: ANTARES (guest lecture)

  21. 04/06 - Chamberlain: Classification: Generative (9.3)

  22. 04/08 - Fan and Hayati: Classification: SVM (9.5, 9.6)

  23. 04/13 - Liang: Classification: trees and forest (9.7)

  24. 04/15 - Purdy: Deep learning and neural networks (9.8)

  25. 04/20 - Jones: time series: basic models (10.1, 10.2)

  26. 04/22 - Wolfe: time series: periodic (10.3)

  27. 04/27 - Zhang: time series: localized and stochastic (10.4, 10.5)

  28. 04/29 - Ann Zabludoff: Frontier Science with LSST (guest lecture)

  29. 05/04 - project reports

  30. 05/06 - project reports


Course Objectives and Expected Learning Outcomes

1. Exhibit an expert-level facility to engage with the principal findings, common applications, current problems, fundamental techniques, and underlying theory of the astronomy discipline.

2. Demonstrate the advanced discipline skills and knowledge necessary to utilize the observational techniques, instrumentation, computational methods, and software applications used to investigate modern astrophysical phenomena and problems.

3. Develop expertise in communicating, translating and interpreting fundamental astronomical concepts and research results in oral and/or written formats.

4. Conduct independent research and/or gain mastery-level knowledge of a specific area of the discipline of astronomy.

5. Engage in the scholarly, ethical, and discipline-specific practices of the field at a professional level.

Absence and Class Participation Policy

The UA’s policy concerning Class Attendance, Participation, and Administrative Drops is available at: http://catalog.arizona.edu/policy/class-attendance-participation-and-administrative-drop

Absences for any sincerely held religious belief, observance or practice will be accommodated where reasonable; see http://policy.arizona.edu/human-resources/religious-accommodation-policy

Absences pre-approved by the UA Dean of Students (or Dean Designee) will be honored.  See:  https://deanofstudents.arizona.edu/absences 


Participating in the course and attending lectures and other course events are vital to the learning process. As such, attendance is required at all lectures and discussion section meetings. Students who miss class due to illness or emergency are required to bring documentation from their health-care provider or other relevant, professional third parties. Failure to submit third-party documentation will result in unexcused absences.


Classroom Behavior Policy

To foster a positive learning environment, students and instructors have a shared responsibility. We want a safe, welcoming, and inclusive environment where all of us feel comfortable with each other and where we can challenge ourselves to succeed. To that end, our focus is on the tasks at hand and not on extraneous activities (e.g., texting, chatting, reading a newspaper, making phone calls, web surfing, etc.).


Threatening Behavior Policy

The UA Threatening Behavior by Students Policy prohibits threats of physical harm to any member of the University community, including to oneself. See http://policy.arizona.edu/education-and-student-affairs/threatening-behavior-students.



Accessibility and Accommodations

Our goal in this classroom is that learning experiences be as accessible as possible. If you anticipate or experience physical or academic barriers based on disability, please let me know immediately so that we can discuss options. You are also welcome to contact the Disability Resource Center (520-621-3268) to establish reasonable accommodations. For additional information on the Disability Resource Center and reasonable accommodations, please visit http://drc.arizona.edu.

If you have reasonable accommodations, please plan to meet with me by appointment or during office hours to discuss accommodations and how my course requirements and activities may impact your ability to fully participate.

Please be aware that the accessible table and chairs in this room should remain available for students who find that standard classroom seating is not usable.



Students with Disabilities

If you anticipate barriers related to the format or requirements of this course, please meet with me so that we can discuss ways to ensure your full participation in the course. If you determine that disability-related accommodations are necessary, please register with Disability Resources (621-3268; https://drc.arizona.edu/) and notify me of your eligibility for reasonable accommodations. We can then plan how best to coordinate your accommodations.


Code of Academic Integrity

Students are encouraged to share intellectual views and discuss freely the principles and applications of course materials. However, graded work/exercises must be the product of independent effort unless otherwise instructed. Students are expected to adhere to the UA Code of Academic Integrity as described in the UA General Catalog. See: http://deanofstudents.arizona.edu/academic-integrity/students/academic-integrity.


UA Nondiscrimination and Anti-harassment Policy

The University is committed to creating and maintaining an environment free of discrimination; see http://policy.arizona.edu/human-resources/nondiscrimination-and-anti-harassment-policy



Subject to Change Statement

Information contained in the course syllabus, other than the grade and absence policy, may be subject to change with advance notice, as deemed appropriate by the instructor.