Big Data is a hot topic in a variety of fields such as biology and finance. The challenge of understanding these data has led to the development of new statistical tools and to the emergence of new areas such as machine learning and bioinformatics. Many of these tools are based on traditional statistical thinking, but are often expressed with different terminology.
In this project, students are expected to choose one or several of the following topics and apply the methods to real datasets: logistic regression, classification and regression trees, neural networks, boosting, bagging, unsupervised learning, signal processing, random forests, data visualization, etc. Students are expected to demonstrate their understanding of the tools, apply them using software, and interpret the software's results.
2022 - 2023
Zibo Yu: Application of support vector machine and its model in judging whether the charging stations of electric cars are faulty
YuZhe Xia: Using PCA and CNN in Cat Breeds Recognition
Ruofan Mao: Image Classification based on Underlying Features
Zhuo Chen: Image Super-resolution Based on Convolutional Neural Network
Can He: Novel cluster-based machine learning approach of analyzing video games on the Steam platform
Siqi Wang: Data Reduction Methods and Clustering Algorithms in Case of World Happiness Index
2021 - 2022
2020 - 2021