From the beginning of time until 2003, humans generated five billion gigabytes of data. Today, we generate that much every two days. More than five billion people are texting, tweeting and browsing on mobile phones worldwide. Thirty billion pieces of content are shared on Facebook every month. People upload 48 hours of new video to YouTube every minute of every day. By 2020, Internet transactions (business to business and business to consumer) could reach 450 billion per day.*
Bamshad Mobasher and Robin Burke, professors in the College of Computing and Digital Media, talk about how their work is helping enterprises make sense of this world of big data.
Mobasher: Our work centers on data mining, predictive analytics, Web personalization and business intelligence—in other words, on wrangling huge amounts of data to find useful information or insights. The whole idea of data mining first came about because big retailers wanted to find patterns in their point-of-sale data. If you can see patterns, you can make predictions; if you can make predictions, you can act accordingly. Products can be grouped and promoted based on observed shoppers’ behavior. Of course, with the emergence of the Internet and mobile technologies came an explosion of data that could be sliced and diced.
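To make the point-of-sale idea concrete, here is a minimal sketch of one simple way to look for patterns in checkout data: counting how often pairs of products land in the same basket. The baskets, product names and the `pair_support` function are hypothetical, invented for illustration; they are not drawn from the professors' research.

```python
from collections import Counter
from itertools import combinations

# Hypothetical point-of-sale data: each set is one shopper's checkout.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"milk", "cereal"},
]


def pair_support(baskets):
    """Return the fraction of baskets that contain each product pair."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: n / total for pair, n in counts.items()}


support = pair_support(baskets)
# The pair bought together most often is a candidate for joint promotion.
top_pair = max(support, key=support.get)
print(top_pair, support[top_pair])
```

Pairs with high support are the kind of observed pattern a retailer could use to group or promote products together. Real systems add further measures and must scale to far larger data.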
Burke: One application of predictive analytics is recommender systems, which help people find content of interest in a vast universe of options. Over time, an intelligent system adapts to become more relevant, sending each person the right information at the right time. Everyone’s familiar with the use of recommender systems on websites like iTunes or Amazon: If you like Infinite Jest by David Foster Wallace, you might also like Gravity’s Rainbow by Thomas Pynchon. Well, recommender systems are just going to keep getting better—smarter and more intuitive.
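The "if you like this, you might also like that" behavior can be sketched with a basic item-to-item similarity score. This is an illustrative assumption about how such a recommendation might be computed, not a description of how iTunes or Amazon actually work; the readers, the third and fourth titles, and the function names are made up for the example.

```python
import math

# Hypothetical data: which readers liked which books.
likes = {
    "ana": {"Infinite Jest", "Gravity's Rainbow", "White Noise"},
    "ben": {"Infinite Jest", "Gravity's Rainbow"},
    "cho": {"Infinite Jest", "White Noise"},
    "dev": {"Infinite Jest", "Gravity's Rainbow"},
    "eli": {"Some Cookbook"},
}


def fans_by_item(likes):
    """Invert the data: map each book to the set of readers who liked it."""
    fans = {}
    for user, items in likes.items():
        for item in items:
            fans.setdefault(item, set()).add(user)
    return fans


def similar_items(target, likes):
    """Rank other books by cosine similarity of their fan sets to the target's."""
    fans = fans_by_item(likes)
    target_fans = fans[target]
    scores = {}
    for item, item_fans in fans.items():
        if item == target:
            continue
        overlap = len(target_fans & item_fans)
        if overlap:  # skip books with no readers in common
            scores[item] = overlap / math.sqrt(len(target_fans) * len(item_fans))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


recommendations = similar_items("Infinite Jest", likes)
print(recommendations)
```

Books that share many fans with the target rise to the top, while a book with no readers in common never appears in the list.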
Mobasher: Just 10 years ago, data on the Internet was one-directional: An organization would post content; users would download it. Now, of course, that’s changed, as everyone is uploading information—putting videos on YouTube, posting on Facebook and Twitter, reviewing books and music, and commenting on everything and everyone.
So, now the words “big data” mean more than a massive number of bytes; they also say something about the nature of today’s data. User-contributed data is unstructured and hard to manage. But if an organization could dig into that data, it would see not just the wisdom of the crowd, so to speak, but also the opinions and actions of individuals within very specific networks.
Burke: With those insights, a system could take into consideration the context of a query. For example, if you’re looking for a restaurant, the recommendation could reflect not just your preferences, but also your intent—one place for a date, a different place for a business meeting. Or say you’re going on a family vacation: A travel site would surmise that you’d like a hotel with a pool and would give those options a higher value in this particular instance.
The next generation of recommender systems will “get” context.
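One simple way to picture context-aware recommendation is to start from a user's general preference score and then raise the score of options that match the stated intent. The sketch below assumes that approach; the restaurants, scores, tags, `boost` value and `recommend` function are all hypothetical, and real systems may model context very differently.

```python
# Hypothetical restaurants: "base" is one user's general preference score,
# and "tags" lists the occasions each place suits.
restaurants = [
    {"name": "Candlelight Bistro", "base": 0.70, "tags": {"date"}},
    {"name": "Downtown Grill", "base": 0.75, "tags": {"business"}},
    {"name": "Noodle Bar", "base": 0.80, "tags": set()},
]


def recommend(restaurants, intent, boost=0.2):
    """Rank restaurants by preference, boosting those that fit the intent."""
    def score(restaurant):
        bonus = boost if intent in restaurant["tags"] else 0.0
        return restaurant["base"] + bonus

    ranked = sorted(restaurants, key=score, reverse=True)
    return [restaurant["name"] for restaurant in ranked]


print(recommend(restaurants, "date"))
print(recommend(restaurants, "business"))
```

With no matching intent, the ranking falls back to the user's general preferences; with an intent, the same user gets a different top choice for a date than for a business meeting.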
Mobasher: One source for context is the data in social networks, which would mean looking for insights in the connections between people. For example, one person might comment on a book on Amazon or Facebook; others will respond or add their own opinions. All these people have friends, and if we trace them on social sites, we’ve suddenly got a “six degrees of separation” network, not only of people but also of topics, events and resources. Data mining like this goes way beyond finding patterns in purchasing histories. Once you have defined a complex network, there are lots of different directions you could go with recommendations.
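The "six degrees of separation" idea can be illustrated by treating the network as a graph and counting the fewest hops between two nodes with a breadth-first search. This is a minimal sketch under that assumption; the names and the `friends` data are hypothetical, and the same approach would apply if the nodes were topics, events or resources rather than people.

```python
from collections import deque

# Hypothetical friendship network: each person maps to their direct friends.
friends = {
    "ana": {"ben"},
    "ben": {"ana", "cho"},
    "cho": {"ben", "dev"},
    "dev": {"cho"},
    "eli": set(),
}


def degrees_of_separation(graph, start, goal):
    """Return the fewest hops from start to goal, or None if unconnected."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, distance = queue.popleft()
        for friend in graph.get(person, ()):
            if friend == goal:
                return distance + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, distance + 1))
    return None


print(degrees_of_separation(friends, "ana", "dev"))
print(degrees_of_separation(friends, "ana", "eli"))
```

A recommender could use distances like these to weight opinions, for instance by giving more influence to people who are closer in the network.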
Burke: These are the kinds of problems we’re solving in our research. The companies we work with in our two centers, one for Data Mining and Predictive Analytics and one for Web Intelligence, are absolutely on the cutting edge: They provide resources, including funding, data, technology and even scholarships; we share our research and expertise. As soon as we come up with a new way of doing things—a new algorithm or a new application—it gets tested in the real world, right away.
It’s really not surprising that our graduates end up working in the best companies.
Mobasher: Sometimes, I have to stop our students from starting jobs before they finish their MS or PhD—there’s that much demand from companies like Google, Amazon, Netflix, Yahoo and Microsoft, as well as large companies in retail, consumer products, telecommunications and other industries. Also, we include undergraduates in our research as much as possible—a lot of them end up in our graduate programs, and that makes us very happy.
Bamshad Mobasher is a leading authority in Web mining, Web personalization and recommender systems. Robin Burke explores the application of artificial intelligence to social computing. Both are on the steering committee of the ACM (Association for Computing Machinery) Conference on Recommender Systems, a top international conference.
* Source: Wikibon Blog