class: center, middle, inverse, title-slide

# Final Meetup
## DATA 606 - Statistics & Probability for Data Analytics
### Jason Bryer, Ph.D.
### May 7, 2025

---
# Final Exam

* Is now available on Brightspace.
* Due by midnight May 11th.
* You may use your book and course materials.
* We expect you to complete the exam on your own (i.e. do not discuss it with classmates, colleagues, significant others, ChatGPT, etc.).
* There are two parts:
    1. Part one has multiple choice and short answer questions.
    2. Part two has a small data set to analyze with R, followed by some interpretation questions.
* Put your answers in the Rmarkdown file and submit the PDF file.

**Please do not post your answers online!**

---
# Announcements

* You should join the New York Open Statistical Programming Meetup group: [https://nyhackr.org](https://nyhackr.org)
* They meet monthly, usually at NYU. [George Hagstrom](mailto:George.Hagstrom@cuny.edu) has been organizing a group of MSDS students meeting up there each month.
* Their annual conference will be August 26th and 27th: [https://nyhackr.org/events.html](https://nyhackr.org/events.html)
* The Joint Statistical Meetings will be August 2nd to 7th in Nashville: [https://ww2.amstat.org/meetings/jsm/2025/](https://ww2.amstat.org/meetings/jsm/2025/)
* useR! will be virtual August 1st and in person (Duke University) August 8th to 10th: [https://user2025.r-project.org](https://user2025.r-project.org)

---
# Propensity Score Analysis <img src="images/hex/psa.png" class="title-hex">

My statistical research interest is in propensity score methods. Propensity score analysis (PSA) is a quasi-experimental design used to estimate causal effects from observational studies.

Here are some resources for PSA:

* PSA [Github repository](https://github.com/jbryer/psa) includes slides and a Shiny application: https://github.com/jbryer/psa
* Early version of an [Intro to PSA](https://psa.bryer.org) book: https://psa.bryer.org
* Recording of a talk given in Fall 2023 for the NYC Meetup group: https://www.youtube.com/watch?v=JLV4mtFhRMM

.pull-left[
<img src='images/hex/psa.png' height='100' align='left' style='padding-right:20px'>
`multilevelPSA`<br/>
[Multilevel PSA](http://jason.bryer.org/multilevelPSA)

<br/><br/>

<img src='images/hex/TriMatch.png' height='100' align='left' style='padding-right:20px'>
`TriMatch`<br/>
[Matching with non-binary treatments](http://jason.bryer.org/TriMatch)
]

.pull-right[
<img src='images/hex/PSAboot.png' height='100' align='left' style='padding-right:20px'>
`PSAboot`<br/>
[Bootstrapping PSA](http://jason.bryer.org/PSAboot)

<br/><br/>

<img src='images/hex/PSAgraphics.png' height='100' align='left' style='padding-right:20px'>
`PSAgraphics`<br/>
[Graphical analysis of PSA](http://jason.bryer.org/PSAgraphics)
]

---
# R Packages

Here is a list of some other R related projects I have worked on:

<img src='images/hex/likert.png' height='70' align='center'> [`likert`](https://github.com/jbryer/likert) - Analysis and Visualization of Likert Based Items

<img src='images/hex/ShinyQDA.png' height='70' align='center'> [`ShinyQDA`](https://github.com/jbryer/ShinyQDA) - R Package and Shiny Application for the Analysis of Qualitative Data

<img src='images/hex/medley.png' height='70' align='center'> [`medley`](https://github.com/jbryer/medley) - Predictive modeling with missing data

<img src='images/hex/clav.png' height='70' align='center'> [`clav`](https://github.com/jbryer/clav) - Cluster Analysis Validation

<img src='images/hex/IRRsim.png' height='70' align='center'> [`IRRsim`](https://github.com/jbryer/IRRsim) - An R Package for Simulating Inter-Rater Reliability

<img src='images/hex/mldash.png' height='70' align='center'> [`mldash`](https://github.com/jbryer/mldash) - Machine Learning Dashboard

<img src='images/hex/FutureMapping.png' height='70' align='center'> [AmplifyApp](https://amplifyapp.org/en), [dashboard](https://amplifyapp.org/en), and [Future Mapping NYC](https://futuremapping.org)

---
# DAACS

[The Diagnostic Assessment and Achievement of College Skills](https://daacs.net) (DAACS) is a suite of technological and social supports to optimize student learning. DAACS provides personalized feedback about students’ strengths and weaknesses in terms of key academic and self-regulated learning skills, linking them to resources that help them be successful students. This is currently supported by a five-year $3.8 million grant received in 2021 from the Institute of Education Sciences to test its efficacy at three institutions.

Applications of Data Science:

* We use natural language processing and predictive models to machine score the essays.
* We had a student this semester work with us to explore whether we can detect AI generated essays specific to the DAACS writing prompt (answer: we)
* We use DAACS data to estimate "risk scores" for students failing so we can target them with resources to help them be successful.
* Related to this, we have developed a new R package for estimating predictive models with missing data, see [medley](https://github.com/jbryer/medley).

---
# Thank You

This has been a great semester. Please don't hesitate to reach out:

Email: [jason.bryer@cuny.edu](mailto:jason.bryer@cuny.edu)
Github: https://github.com/jbryer
Personal Website: https://bryer.org
LinkedIn: [jasonbryer](https://www.linkedin.com/in/jasonbryer/)
Mastodon: [@jbryer@vis.social](https://vis.social/@jbryer)

<br/>

You can download all course materials on [Github](https://github.com/jbryer/DATA606-2025-Spring). Click the [clone or download](https://github.com/jbryer/DATA606-2025-Spring/archive/master.zip) link to download a zip file.