The Future of Statistics and Data Science
Abstract
In recent years, job openings for data scientists have increased dramatically, driven by the exponential growth of data in modern industries and businesses. Many of the ideas and approaches used by data scientists, however, are rooted in statistical thinking and methodology. With greater exposure to computational thinking and methods, students of statistics or mathematics will be in a strong position to take advantage of the boom in data science openings. This two-week course aims to provide participants with an overview of methods for approaching data science problems. Participants will learn how to retrieve data from databases for exploration using MySQL and R/Python. Data exploration, pre-processing, and dimensionality reduction methods are then introduced to help students handle multivariate data for downstream analyses. After this, some useful models for classification and regression, including ensemble models, will be introduced. Finally, model deployment in web-based applications using Git and a Flask app will be demonstrated. This workshop is suitable for students with a background in statistics and mathematics and some programming experience in R or Python. It is jointly conducted by Dr. R.B. Fajriya Hakim from Universitas Islam Indonesia, Yogyakarta, and Dr. Tsung Fei Khang from Universiti Malaya, Kuala Lumpur.