Michigan Tech Experts Model the Future of Data Science

Students in a tiered seating lecture hall.
Michigan Tech data science students practice their craft in and out of the classroom, learning to clean data and train the learning models that shape modern decision making across industries.

Data science is everywhere, a driving force behind modern decisions. When a streaming service suggests a movie, a bank sends a warning about unusual activity on an account, or a weather app predicts the rain, these are all examples of data science at work. If the internet creates data, data scientists are the ones who make that data useful.

Data science is the practice of using information to make better decisions. The mountains of data now available from every mouse click and purchase would be just a big pile of numbers without a way to organize and interpret them. That is the job of technology developed by Michigan Technological University data scientists, including Timothy Havens, William and Gloria Jackson professor of computing, and Sujan Kumar Roy, assistant teaching professor of computer science in Michigan Tech's College of Computing.

"What if I asked you how many times the word 'science' appears in an entire library's worth of books? That would take you forever to accomplish, and you would likely be inaccurate," said Havens. "That's a simple problem for data science tools."

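Havens' library example can be sketched in a few lines of Python. This is only a minimal illustration, not a tool used at Michigan Tech: the function name `count_word` and the two sample sentences are invented for the example, and a real library would be read from files rather than from a short list of strings.

```python
import re
from collections import Counter


def count_word(texts, word):
    """Count case-insensitive whole-word occurrences of `word` in an iterable of strings."""
    target = word.lower()
    total = 0
    for text in texts:
        # Split each text into lowercase runs of letters and apostrophes, so
        # "Science" and "science," both count while "scientists" does not.
        tokens = re.findall(r"[a-z']+", text.lower())
        total += Counter(tokens)[target]
    return total


# Hypothetical stand-in for "an entire library's worth of books".
library = [
    "Data science is the science of learning from data.",
    "Science fiction is not science, but scientists enjoy it.",
]

print(count_word(library, "science"))
```

The same loop works unchanged on millions of documents, which is the point of the quote: the counting is trivial for a machine, and the effort goes into gathering and preparing the text.
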
Sujan Kumar Roy stands in front of a large lecture hall while students take notes on laptops.
Sujan Kumar Roy teaches Huskies the ins and outs of data science through classroom lectures and hands-on practice.

Technologies like large language models (LLMs) and artificial intelligence (AI) don't just accelerate the process of gathering massive amounts of information. They also extract meaning, insight and predictions from those mountains of data.

"Every day, we create a lot of information — from shopping receipts and weather reports to health records and social media activity," said Roy. "On its own, that information doesn't mean much. Data science is about organizing it, looking for patterns, and using those patterns to understand what is happening and what might happen next."

Michigan Tech students like Felicia Huffman, a data science major with a double minor in business and statistics, see the data science field as not only their future, but the future.

"A few years ago, I explained it as a combination of statistics and computer science," said Huffman. "Now that LLMs and AI are more well-known, I tell people it's the science behind that. Data science is new and growing, and in 30 years it will be the new statistician."

Preparing Michigan Tech students for that future means developing their technical skills alongside practical, bigger-picture thinking from an ethical perspective. Huffman is gaining experience working with Evan Lucas, assistant professor of computer science, applying her skills to his underwater bioacoustics project. Her part of the project involves programming machine learning models to better identify the sounds of fish communicating with each other — a type of data that is challenging because it has few labeled acoustic events to work with.

Diana Shadibaeva sits at a round table with a laptop displaying code on the table near her while she talks to another person.
Diana Shadibaeva took her love of data science beyond the classroom by helping to co-found Tech's Machine Learning and Artificial Intelligence Club.

Diana Shadibaeva, another data science major and co-founder of Tech's new Machine Learning and Artificial Intelligence Club (MLAIC), works in the University's Laboratory of Medical Imaging and Informatics. As an undergraduate research assistant in the lab, she uses skills from her studies to approach medical problems. She's also seeking a Summer Undergraduate Research Fellowship (SURF) to continue gaining hands-on experience.

"It's pretty easy to get involved with various projects, enterprises, and on-campus jobs at Michigan Tech," said Shadibaeva. "We have a lot of opportunities for undergraduates as early as their first or second year."

Diving into Data

Taking advantage of data science research opportunities at Michigan Tech means getting involved in projects like Havens' work, which focuses on developing machine learning methods for sensing systems. His team is building algorithms that can automatically interpret large streams of data from sensors or imaging systems placed in complex environments. Environments like large lakes, oceans and atmospheric systems are increasingly and continually monitored by sensor networks. In this instance, the challenge isn't collecting data — it's parsing a rich body of data accurately.

The project draws on Havens' expertise in his dual roles as executive director of two Michigan Tech research institutes: the Great Lakes Research Center and the Institute of Computing and Cybersystems.

Tim Havens poses on the dock of the Great Lakes Research Center, in front of the RV Agassiz.
Tim Havens applies his skills in computing and data science to improve pattern detection and prediction in the data-noisy environment of the Great Lakes.

"Our contribution is to develop methods that allow researchers to query all the data from every sensor on the Great Lakes and extract reliable information from them, identifying anomalies like damaging high water levels, detecting patterns over time that could indicate climate change, and building predictive models that help stakeholders make better policy decisions that impact our shorelines," said Havens.

Roy's work also connects applied research with hands-on student projects through both graduate courses and his undergraduate capstone course. He recently supervised a capstone project on collision detection for unmanned aerial vehicles, also known as drones or UAVs. In his graduate machine learning and data mining courses, students conduct research on predictive modeling and deep learning.

"I provide technical mentorship, emphasize strong experimental design and encourage students to consider ethical and societal implications alongside performance," said Roy.

Debugging Data Science

In their time as teachers and mentors, Havens and Roy have enjoyed connecting with students as passionate about the applied and impactful nature of data science as they are. Michigan Tech students dive into data excited to resurface with skills that can solve real-world problems. Many come to the discipline because they see how machine learning and data-driven approaches are shaping technology, health care, environmental monitoring and industry. Their professors are excited about the work for similar reasons.

"When a new algorithm allows researchers to detect patterns in data that were previously invisible, it can change how we understand the world," said Havens. "Being able to contribute tools that enable those discoveries is incredibly rewarding."

"Overall, my daily work is a dynamic combination of technical problem-solving, curriculum development, and mentoring the next generation of data scientists."

Sujan Kumar Roy
assistant teaching professor of computer science

In Roy's capstone and graduate-level courses, students particularly enjoy building intelligent models, experimenting with new techniques and seeing their systems perform on real datasets.

"Seeing students grow into confident, thoughtful data scientists who balance innovation with responsibility is one of the most rewarding aspects of my work," said Roy. "They are increasingly interested in ethical considerations — asking thoughtful questions about fairness, transparency and responsible AI."

Those considerations are also at the forefront of Roy's mind. He and his students are equally motivated to improve safety, efficiency and quality of life. That is the passion behind Roy's research on health care and trustworthy, human-centered AI. He develops and evaluates data-driven systems that support decision-making while ensuring the fairness, interpretability and responsible use of AI.

"I believe it should not only produce accurate results but also be interpretable, reliable and socially responsible," said Roy. "Overall, my work integrates applied innovation, research and mentorship to advance practical and human-centered data science."

Students quickly learn that achieving that standard in their work requires more than building models.

"In textbooks, datasets are clean and well-structured," said Havens. "In real-world problems, the data is often incomplete, noisy, biased, or inconsistent."

Felicia Huffman sits in Michigan Tech's library working on her laptop.

"With the combination of excellent math skills and programming challenges that data science offers, I feel like the major was designed for me."

Felicia Huffman
senior data science major

Cleaning, organizing and understanding data, especially when it's messy or contains gaps, can take more time and effort than running the algorithms themselves. Huffman, who spends much of her academic time researching, programming and debugging algorithms, says debugging is the biggest time killer.

"I spend my time fixing bugs and improving the performance of my algorithms through additional data cleaning, adjusting parameters and collecting more data," said Huffman. "I ask a lot of questions, and I try my best to ensure I'm unbiased in my choices."

Shadibaeva was surprised by the amount of mathematics involved in cleaning data. "I didn't realize how math-heavy it would be going into it," she said. "The entire field is built on mathematical principles."

She isn't the only one challenged by walking a tightrope between theory and practice.

"Effective data scientists need a strong foundation in mathematics and statistics," said Havens. "They also need strong computational skills and an understanding of the domain they are working in. Learning to combine those elements is one of the key challenges students face."

"Ultimately, the goal of the data scientist is to push the boundaries of what we can learn from data, developing tools that help others do the same."

Tim Havens
executive director, Institute of Computing and Cybersystems

Whether data scientists are cleaning data, programming a learning model or testing their algorithms, high accuracy alone is not enough. They must choose appropriate metrics, avoid data leakage and ensure results are reliable and reproducible. When all is said and done, model performance must be balanced with interpretability, fairness, responsibility and transparency.
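
One common form of data leakage is letting information from the test set shape how the training data is prepared. The sketch below is an illustrative assumption rather than anything taken from a Michigan Tech course: the helper names and the toy `readings` list are hypothetical. It shows one way to avoid that leak by computing scaling statistics from the training split only, with a fixed random seed so the split is reproducible.

```python
import random
import statistics


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of `rows` with a fixed seed and split it into train and test lists."""
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_scaler(values):
    """Return the (mean, standard deviation) used to standardize values."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values) or 1.0  # guard against division by zero
    return mean, std


def apply_scaler(values, scaler):
    """Standardize values with previously fitted statistics."""
    mean, std = scaler
    return [(v - mean) / std for v in values]


# Hypothetical measurements standing in for a real dataset.
readings = [float(x) for x in range(1, 21)]

train, test = train_test_split(readings)
scaler = fit_scaler(train)                 # statistics come from the training split only
train_scaled = apply_scaler(train, scaler)
test_scaled = apply_scaler(test, scaler)   # test data reuses the training statistics
```

Fitting the scaler on all 20 readings before splitting would quietly pass test-set information into training, which can make reported accuracy look better than it is.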

"Ultimately, the most transformative challenge is learning to think critically — not just applying techniques, but questioning assumptions, validating findings rigorously and understanding the broader impact of their work," said Roy.

Data Science is Everywhere

The demand for critical thinking from data science professionals is strongly connected to the field's versatile real-world problem-solving applications, whether in health care analytics, intelligent systems, cybersecurity, business decision-making or emerging technologies like UAV navigation and smart infrastructure. Professionals in this field rarely work in isolation. They are most often collaborating to assist in analysis and decision-making across many disciplines.

"In many ways, it functions as an enabling technology, much like mathematics or statistics, because it provides tools that allow other disciplines to ask new types of questions," said Havens. "As more systems generate large datasets, the demand for data-driven analysis spreads across almost every domain."

Clinicians and researchers collaborate with data scientists to analyze patient data, improve diagnostics and support treatment decisions. Data scientists also support engineers and computer scientists in areas like AI, cybersecurity and intelligent systems. In social sciences and public policy, data is used to study education outcomes, public health trends and societal challenges. Scientists in this field are also increasingly called on by experts in environmental science, transportation, manufacturing and finance. In business and economics, they help organizations understand customer behavior, optimize operations, manage risk and guide strategic planning.

"Data science today is not only about building powerful models, but also about ensuring they are ethical, transparent, and beneficial to society."

Sujan Kumar Roy
assistant teaching professor of computer science

"Data science is inherently interdisciplinary, serving as a bridge between technical methods and domain expertise to solve complex, real-world problems," said Roy.

Beyond analysis, data science supports prediction, simulation and decision-making. It allows scientists to build models that test hypotheses, forecast outcomes and guide experiments more efficiently.

"In many ways, the field has become a foundational layer of modern research — accelerating innovation, improving accuracy, and enabling interdisciplinary collaboration across scientific domains," said Roy. "Researchers rely on data-driven methods to analyze complex patterns that would be impossible to interpret manually."

Data Science is the Future

Data science will only become increasingly embedded in critical decision-making across every sector of industry and research. As data generated and collected grows in volume and complexity, AI and learning models are growing with it, adapting to learn from different types of information and working in real time to advise smarter, faster decisions. The challenge isn't collecting data, but interpreting it effectively.

"The future will involve deeper integration between machine learning and scientific modeling," said Havens. "Rather than simply fitting models to data, we will increasingly see approaches that combine physical knowledge, domain expertise and data-driven learning."

Roy's vision of the field's future is hopeful, but tempered with the weight of responsibility that rests on the people shaping that future. The pervasiveness of data science technologies calls for a responsible, human-centered approach to intelligent systems.

"As systems become more powerful, ensuring fairness, transparency, robustness and privacy will be essential," said Roy. "The next phase will not only advance technical innovation but also establish higher standards for ethical deployment and societal trust."

For students like Huffman, the future of data science is one with nearly infinite possibilities.

"Data science will be similar to engineering eventually, with different degrees available for each sector. Every industry has data, which means every industry could greatly benefit from it," she said.

While her love of mathematics and coding drew Huffman to the field initially, her dedication to her discipline has only grown with her understanding of its widespread benefits for every industry.

"I've heard some companies complain that data scientists aren't statisticians, and aren't quite software engineers either, but the truth is we're both," said Huffman. "We're the only major that is truly developed with the skills for machine learning, which soon will be implemented in every field."

Michigan Technological University is an R1 public research university founded in 1885 in Houghton, and is home to nearly 7,500 students from more than 60 countries around the world. Consistently ranked among the best universities in the country for return on investment, Michigan's flagship technological university offers more than 185 undergraduate and graduate degree programs in science and technology, engineering, computing, forestry, business, health professions, humanities, mathematics, social sciences, and the arts. The rural campus is situated just miles from Lake Superior in Michigan's Upper Peninsula, offering year-round opportunities for outdoor adventure.
