Workshop
Sunday, October 5, 2014
Full day: 9:00am - 4:30pm
Room: Beacon 1
Big Data: Proven Methods You Need to Extract Big Value
Intended Audience: Managers, decision makers, practitioners, and professionals interested in a broad overview and introduction
Knowledge Level: All levels
Attendees will receive an electronic copy of the course notes and materials
Workshop Description
"Big Data" is everywhere. The topic is impacting every industry and institution. Big excitement about big data comes from the intersection of dramatic increases in computing power and data storage with growing streams of data coming from almost every person and process on Earth. The pressing question is how best to extract value from all this data: what should we do with it?
Working with big data effectively depends on understanding the sources of data and the issues in storing and analyzing it:
- Where does big data come from?
- How do you manage, store, and compute on big data?
- What qualifies as "big"?
This one-day workshop reviews major big data success stories that have transformed businesses and created new markets.
Barash will cover these revealing stories in order to illustrate the key concepts, tools, and value-proven applications driving the big data revolution.
"Big data" is an open buzzword - it could be defined as any amount of data you can't afford to handle - but the big, newfound value achieved by computing at scale is no fad.
What you will learn:
- Where does big data come from: Common sources of big data.
- What makes data big: Velocity, Variety, and Volume!
- How can we leverage it: Open tools and platforms for storing and analyzing big data.
- The new paradigm: Today's shift from hypothesis testing to a broad exploration for correlations is a revolutionary change in the way data is explored.
- Best practices for analyzing big data: Key methods in data science, predictive analytics, and text analytics for learning from data.
- Social Data: Finding key connections in webs of people and events.
- Applications of big data insights to business.
- Future directions in big data: bigger, bolder, and better.
Schedule
Workshop starts at 9:00am
First AM Break from 10:00 - 10:15am
Second AM Break from 11:15 - 11:30am
Lunch from 12:30 - 1:15pm
First PM Break from 2:00 - 2:15pm
Second PM Break from 3:15 - 3:30pm
Workshop ends at 4:30pm
Coffee breaks and lunch are included on both days.
Attendees receive a copy of the course materials book at the beginning of the workshop.
Instructor
Vladimir Barash, Senior Researcher, Graphika
Vladimir Barash is a Senior Researcher and Engineer at Graphika. He received his Ph.D. from Cornell University, where he studied Information Science and wrote his thesis on the flow of rumors and virally marketed products through social networks. At Graphika, Vladimir's research focuses mainly on the intersection of social media and large-scale social phenomena, ranging from online political activism in Russia to cross-cultural patterns of emoticon use on Twitter to leveraging social media for the prediction of emergency events.
In addition to his research duties, Vladimir has a decade's experience working with big data, from scientific computing (MATLAB, SciPy) to parallel processing technologies (Hadoop / Hive) to data storage and pipelining (Redis, MongoDB, MySQL) at the terabyte scale. At Graphika, Vladimir has co-designed and implemented systems that process tens of millions every six hours to deliver timely information on influencers and conversation leaders in online communities tailored to client interests. Vladimir is proficient in over a dozen programming languages and frameworks and has designed production-ready systems for every stage of big data analysis, from collection to client-facing presentation via web, spreadsheet, or graphic visualization.
Vladimir has been active in the Social Media Research Foundation (SMRF) and the NodeXL project, helping build a network analysis package that brings relational data analysis at scale to the fingertips of any interested user, without requiring specialized knowledge or technical training beyond familiarity with Microsoft Excel. NodeXL has enabled users in academia, industry and the general public to analyze tens of thousands of social networks, from networks of politicians voting on bills to networks of motorcycle enthusiasts working together. As part of his work with SMRF and the NodeXL team, Vladimir has contributed a chapter on Twitter analysis to Analyzing Social Media Networks with NodeXL: Insights from a Connected World.
Vladimir's work has received awards at the International Conference on Weblogs and Social Media and Bits on Our Minds. He has presented his research at academic and industrial campuses all over North America and Europe, including Xerox/PARC, Microsoft, Colgate University, Northeastern University, UMCP, and Oxford University (Oxford Internet Institute). He currently resides in Somerville, MA.