In this course, students will solve complex data science problems by combining exploratory data visualization with analysis. Students will learn to deal with the Five V's of data: Volume (large data), Variety (complex heterogeneous data), Velocity (streaming data), Veracity (uncertainty in data), and Variability (variations in data flow, density, and complexity). Students will be able to select appropriate tools and visualizations to support problem solving in different application areas. The data sets and problems will be drawn mainly from the IEEE VAST Challenges, but also from the KDD Cup, Amazon, Netflix, GroupLens, MovieLens, Wiki releases, biology competitions, and others. We will tackle large-scale problems in crime, cyber security, health, social issues, communication, marketing, and similar areas. Data sources will be broad and include text, social media, audio, image, video, sensor, and communication collections representing real-world problems. Hands-on projects will be based on Python or R and various visualization libraries, both open source and commercial.
- Teacher: Georges Grinstein