Analysis of Large Scale Social Networks (B-KUL-H0T26A)
Aims
The goal of this course is to provide students with deep knowledge of and insight into social network
analysis applied to large-scale data. As a result they will be able to design, implement and finalize an
analysis project on huge graph datasets, with special attention to the relevance of the proposed
solution for the requested application. To achieve this goal, the students will learn basic concepts
of network analysis and get acquainted with advanced analytical methodologies and network
visualizations. They will learn to model the available data in an appropriate manner for storing,
processing and querying network data. They will gain experience with different implementations and
several software tools and they will be able to review the features of these in the light of
performance and the requirements set by the application. They will be able to translate the needs of
specific applications into concepts and methods of social network analysis and present the findings
of the analysis. The students will have the opportunity to apply their acquired knowledge to real-world
datasets and to communicate their findings to their peers.
The course will focus strongly on the applicability, performance and scalability of the proposed
solutions in a big-data environment and will never lose sight of the requirements and
expectations of the real-world application.
Previous knowledge
Students taking this course must have successfully completed, or must simultaneously follow, the course Inductive Inference or an equivalent course.
Basics of Probability Distributions
Programming (preferably Java)
Elementary knowledge of linear algebra
The student should be able to analyze, summarize and interpret scientific publications
Is included in these courses of study
- Master handelsingenieur in de beleidsinformatica (Leuven) 120 ects.
- Master handelsingenieur in de beleidsinformatica (Leuven) (Minor: Data science) 120 ects.
- Master of Artificial Intelligence (Leuven) (Specialisation: Big Data Analytics (BDA)) 60 ects.
- Master of Artificial Intelligence (Leuven) (Specialisation: Engineering and Computer Science (ECS)) 60 ects.
- Master of Statistics and Data Science (on campus) (Leuven) (Statistics and Data Science for Biometrics) 120 ects.
- Master of Statistics and Data Science (on campus) (Leuven) (Statistics and Data Science for Business) 120 ects.
- Master of Statistics and Data Science (on campus) (Leuven) (Statistics and Data Science for Industry) 120 ects.
- Master of Statistics and Data Science (on campus) (Leuven) (Statistics and Data Science for Social, Behavioral and Educational Sciences) 120 ects.
- Master of Statistics and Data Science (on campus) (Leuven) (Theoretical Statistics and Data Science) 120 ects.
- Courses for Exchange Students Faculty of Engineering Science (Leuven)
- Master of Business and Information Systems Engineering (Leuven) 120 ects.
- Master of Business and Information Systems Engineering (Leuven) (Minor: Data Science) 120 ects.
Activities
2.5 ects. Analysis of Large Scale Social Networks: Lectures (B-KUL-H0T26a)
Content
The content of this course can be divided into three main parts: Fundamentals, Data, and Implementations and Tools. However, the course will not pass through these parts in a linear manner, but will deal with separate topics when most appropriate. For example, after basic concepts such as graph or matrix representation are introduced, tools like Pajek or Gephi are discussed. Another example is the advantage that graph databases like Neo4j have over traditional relational databases for graph traversals such as breadth-first search.
Part I: Fundamentals
• Basic Concepts: Undirected and directed networks, weighted networks, bipartite networks; nodes, links and their general properties
• Vector Space Model; matrix representation
• Graph Theory: vertex, edge
• Centrality measures such as degree, closeness, betweenness and PageRank
• Connectedness, Clustering Coefficient, Neighborhood
• Graph traversal schemes: Breadth-First Search and Depth-First Search, Shortest path
• Partitioning, Clustering and Community Detection
• Different Random Graph Models: Erdős-Rényi, Barabási-Albert, Watts-Strogatz
• Hubs, Preferential Attachment, Cumulative Advantage, small world networks
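As an illustration of a few Part I concepts, the following is a minimal sketch and not part of the official course material. It is written in Python for brevity (an assumption; the course itself lists Java as the preferred language). The function names and the toy edge list are hypothetical. It shows breadth-first search distances, normalized degree centrality and the local clustering coefficient on a small undirected network.

```python
from collections import deque
from itertools import combinations


def build_undirected(edges):
    """Build an adjacency-set representation from a list of node pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Hop distance from source to every reachable node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:  # first visit gives the shortest hop count
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def degree_centrality(adj):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adj)
    if n < 2:
        return {node: 0.0 for node in adj}
    return {node: len(neighbors) / (n - 1) for node, neighbors in adj.items()}


def local_clustering(adj, node):
    """Fraction of pairs of neighbors of node that are themselves linked."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))


# Hypothetical toy network: a triangle a-b-c with a tail c-d-e.
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
```

Calling `bfs_distances(build_undirected(EDGES), "a")` returns the number of hops from node `a` to every other node. The same adjacency structure is reused for the two node-level measures.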
Part II: Data
• Graph data representation in adjacency matrices, weighted matrices or as a set of pairs
• Additional data matrices: Degree matrix and Laplacian Matrix
• Graph Processing: Creation of sub-networks or reduction, Graph Concatenation, Hybrid links
• Graph Databases: e.g. Neo4j and Cypher query language
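The matrix representations listed in Part II can be sketched as follows. This is an illustrative example under stated assumptions, not course code: it uses plain Python lists instead of a numerical library, assumes an undirected and unweighted network, and the node list and pair set are hypothetical. It derives the adjacency matrix A from a set of pairs, then the degree matrix D and the Laplacian L = D - A.

```python
def adjacency_matrix(nodes, pairs):
    """Symmetric 0/1 adjacency matrix for an undirected network."""
    index = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)
    matrix = [[0] * n for _ in range(n)]
    for u, v in pairs:
        i, j = index[u], index[v]
        matrix[i][j] = 1
        matrix[j][i] = 1  # undirected: mirror every link
    return matrix


def degree_matrix(adjacency):
    """Diagonal matrix whose entry (i, i) is the degree of node i."""
    n = len(adjacency)
    return [
        [sum(adjacency[i]) if i == j else 0 for j in range(n)]
        for i in range(n)
    ]


def laplacian(adjacency):
    """Graph Laplacian L = D - A."""
    degrees = degree_matrix(adjacency)
    n = len(adjacency)
    return [
        [degrees[i][j] - adjacency[i][j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical example data: four nodes, four links.
NODES = ["a", "b", "c", "d"]
PAIRS = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]

A = adjacency_matrix(NODES, PAIRS)
L = laplacian(A)
```

Every row of `L` sums to zero, which is a quick sanity check on the construction. For large sparse networks a dense matrix of this kind becomes impractical, and the set-of-pairs form is usually the more economical storage.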
Part III: Implementations and Tools
• Visualizations: Energy or Spring Models: Kamada-Kawai, Force Atlas; Multi-Dimensional Scaling
• Algorithms: e.g. Dijkstra’s, A*; Approximations
• Performance of graph algorithms on large graph instances
• Time and Space Complexity of algorithms
• Advantages/Disadvantages of Parallelism and MapReduce
• Green-Marl as a Domain Specific Language for Graph Analysis
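To make the algorithm and complexity topics of Part III concrete, here is a minimal sketch of Dijkstra's shortest-path algorithm with a binary heap. It is an illustration only: the graph format (a dictionary mapping each node to its outgoing links and their weights), the function name and the example weights are assumptions, and link weights must be non-negative.

```python
import heapq


def dijkstra(graph, source):
    """Shortest weighted distance from source to every reachable node.

    graph maps node -> {neighbor: non-negative weight}.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


# Hypothetical weighted, directed example network.
WEIGHTED = {
    "a": {"b": 4, "c": 1},
    "b": {"d": 1},
    "c": {"b": 2, "d": 5},
    "d": {},
}
```

With a binary heap this variant runs in roughly O((V + E) log V) time and O(V + E) space, where V is the number of nodes and E the number of links. Costs of this order are what decide whether an algorithm remains feasible on large graph instances.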
Course material
The course material consists of a collection of selected papers and book chapters, complemented by the slides presented during the lectures.
Most topics are covered in this online book:
• Network Science by Albert-László Barabási (available at http://www.networksciencebook.com/)
Additional Recommended literature:
• Mark Newman. Networks: An Introduction. Oxford University Press, 2010.
• Stanley Wasserman and Katherine Faust. Social Network Analysis: Methods and Applications. Cambridge University Press, 1994.
Selected papers and chapters will be available in Toledo, with an indication of the relevant sections and paragraphs.
1 ects. Analysis of Large Scale Social Networks: Exercises (B-KUL-H0T27a)
Content
Exercise sessions on analysis of social networks.
Course material
Handouts
0.5 ects. Analysis of Large Scale Social Networks: Project (B-KUL-H0T28a)
Content
Practical assignment on analysis of social networks.
Course material
Handouts
Evaluation
Evaluation: Analysis of Large Scale Social Networks (B-KUL-H2T26a)
Explanation
Class participation and preparation: 10% of final grade
Project: 25% of final grade
Exam: 65% of final grade. The exam consists of two parts: one part with open-ended questions on a research paper, and
one written part on a PC with exercises combined with multiple-choice questions. The exam is closed-book.