Statistics: The Foundation of AI
Artificial intelligence may feel new, but its roots go back decades, and at the University of Minnesota, the School of Statistics is making sure students understand both the history and future of the field.
“Everything we do is foundational to AI,” says Galin Jones, director of the School of Statistics. “You cannot do AI without statistics.”
That message is coming through in the classroom and beyond. This summer, the school partnered with the Data Science and AI Hub (DSAI) to host high school camps in data science, the backbone of AI. The response was overwhelming: every session filled and drew a waitlist, including one designed specifically for students who identify as female.
Data Science and AI Hub
The DSAI is a University-wide nexus of multidisciplinary collaboration and information sharing. We provide communication, coordination, integration, and amplification of campus-wide data science activities. The DSAI also spotlights its current focus areas by providing education, funding, and research assistance. For 2023-24, our focus areas are foundational data science, and digital health and personalized healthcare delivery.
On campus, faculty are rethinking courses to give undergraduates and graduate students alike hands-on experience with AI tools. In STAT 3011, Introduction to Statistical Analysis, instructors Barbara Kuzmak and Yuyoung Park created a course chatbot to handle the kinds of routine questions that flood into large lecture courses.
At higher levels, courses like Statistical and Machine Learning (STAT 5052), Designing Experiments (STAT 5303), and the new Generative AI: Principles and Practices (STAT 8105) put students directly into the evolving conversation about AI’s potential.
Standard statistical reasoning. Simple statistical methods. Social/physical sciences. Mathematical reasoning behind facts in daily news. Basic computing environment.
Credit will not be granted if credit has been received for: ANSC 3011 or ESPM 3012
This is the first semester of the applied statistics and statistical machine learning sequence for majors seeking a BA or BS in statistics or data science, coupled with the course STAT 4052. The course delves into the foundational statistics supporting contemporary machine learning techniques. The emphasis lies on identifying problem types, selecting appropriate analytical methods, accurate result interpretation, and hands-on exposure to real-world data analysis. The curriculum builds upon traditional multivariate statistical analysis and unsupervised learning, extending to modern machine learning topics. Topics include clustering, dimension reduction, matrix completion, factor analysis, covariance analysis, and graphical models. Additionally, advanced data structures such as text and graph data are covered. The course prioritizes the fundamental statistical principles integral to machine learning, demonstrated through the analysis and interpretation of numerous datasets.
prereq: (STAT 3701 or STAT 3301) and (STAT 4101 or STAT 5101 or MATH 5651)
The course covers the foundations of modern machine learning methods, including regularization methods, discriminant analysis, neural networks, random forests, bagging, boosting, support vector machines, and clustering. It emphasizes model comparison using cross-validation and bootstrap methods.
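As a rough illustration of the cross-validation model comparison named in the course description above, here is a minimal Python sketch. It is not course material. The synthetic data, the two candidate models (a mean-only baseline and a least-squares line), and the helper names `k_fold_indices`, `fit_mean`, `fit_line`, and `cv_mse` are all assumptions made for this example.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_mean(xs, ys):
    """Baseline model: always predict the training mean of y."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Least-squares line y = a + b * x fitted to the training data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return lambda x: a + b * x


def cv_mse(fit, xs, ys, k=5, seed=0):
    """Average held-out mean squared error across k folds."""
    fold_errors = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        predict = fit(train_x, train_y)  # fit only on the training folds
        errs = [(ys[i] - predict(xs[i])) ** 2 for i in held_out]
        fold_errors.append(sum(errs) / len(errs))
    return sum(fold_errors) / k


# Synthetic data (an assumption for illustration): y = 2x + 1 plus noise.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [2 * x + 1 + rng.gauss(0, 1) for x in xs]

# Both models are scored on the same folds, so the comparison is like for like.
mse_mean = cv_mse(fit_mean, xs, ys)
mse_line = cv_mse(fit_line, xs, ys)
print(f"mean-only baseline CV MSE: {mse_mean:.2f}")
print(f"least-squares line CV MSE: {mse_line:.2f}")
```

Each model is fitted only on the training folds and scored on the fold it never saw, so the model with the lower averaged error is the one expected to predict new data better. On this synthetic data, that is the least-squares line.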
Preparing students to navigate an AI-driven future
But statistics can’t answer every question AI raises. As Jones explains, “There are topics, things like media literacy, that statisticians are not well-suited to address. We need the arts and humanities. If we just leave it up to the statisticians, we’ve missed the boat.”
That perspective underscores why the liberal arts are central to the future of AI. Technical expertise must work alongside fields like philosophy, journalism, psychology, and cultural studies to ask questions about ethics, creativity, and society.
The department is also creating space for dialogue. While some disciplines may fear AI’s role in the classroom, statistics faculty are encouraging students to experiment thoughtfully. They emphasize that AI should be used as an assistant, not a replacement, in everything from coding tasks to brainstorming research questions.
That balance matters, because AI is already shaping industries from healthcare to the arts. And Minnesota has been at the heart of the story: the birthplace of the first supercomputer, home to some of the largest academic data sets in the world, and now a leader in preparing students to navigate an AI-driven future.
“Kids are going to encounter AI in the world,” says Jones. “This is the place where they need to start getting that preparation.”
Why statistics?
It is estimated that in less than three years we will be producing over 1.7 megabytes of new information every single second for every human being on the planet. This new information, in conjunction with the vast amount of information that has already been collected, ranges from the mundane to the extraordinary.
Through the lens of the data scientist, this information can become something larger, and statisticians are at the forefront of this field.
What role do the liberal arts play in understanding and scrutinizing artificial intelligence?
Advances in AI are as much social, cultural, and ethical as they are technical. As these new technologies reshape industries and influence our daily lives, they also raise important questions about the relationship between people and technology. That's why we need the liberal arts. The liberal arts provide the context to grapple with these questions through a humanity-centered perspective.
Yes, this story was created with the assistance of artificial intelligence. (We had to!)