Network-based Visual Analysis of Tabular Data
Zhicheng Liu, Shamkant Navathe, John Stasko

Tabular Data
Rows and columns
Rows are data cases; columns are attributes/dimensions
Attribute types
o Quantitative (numbers)
o Ordinal (e.g. small, medium, large)
o Nominal (names, categories)

Insight Discovery on Tabular Data: Example
NSF Grants
o Grant Title
o Amount
o Date
o Program Manager
o Awardee / Researcher Name
o Awardee Affiliation
o ...

Visualizing Tabular Data
[Rao and Card, 1994] [Tableau] [Spotfire]

Number of grants in each program

Amount by date

Amount by date & ProgMgr

Collaboration between institutions?

Relationship between ProgMgr and Researchers?

Quantitative attributes
o Nominal data as independent variables, or not handled
Patterns of distributions, correlations and outliers of numerical values
Nominal attributes
o Quantitative attributes are useful too
Entities with interesting roles, emergent global structures

Current State of the Art
Tabular data: Spotfire, Tableau, TableLens
Explicit network: GUESS, UCINet, SocialAction

Problems with this Partition
Analytical: network semantics are dynamic
Usability: modeling networks is tedious and requires programming skill
Counter-intuitive for exploratory analysis

NodeXL [Hansen et al., 2010]

Questions & Approaches
Goal: Network-based Visual Analysis of Tabular Data
1. Which conceptually meaningful operations are necessary to extract and transform tabular data into networks for exploratory analysis?
o Domain independent
o Generalized operations
o Expressive power
2. Given a set of operations, how can analysts be given easy access to them, and how can network modeling be coupled with exploratory analysis?
o Hide technical details
o Reduce articulatory distance
o Immediate visual feedback

Formal Framework
Tables: Relational model [Codd, 1969]
o Each row is uniquely identifiable
o The value in each cell is atomic: number, boolean, string, date
Networks: Weighted Simple Graphs
o Undirected
o At most one edge between any two nodes
o Edges are weighted
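The network half of this framework can be sketched as a small data structure. This is only an illustration, not the representation used by the authors: the class name `WeightedSimpleGraph` and its methods are hypothetical, and accumulating the weight when the same pair is added twice is an assumption made here to honor the "at most one edge" constraint.

```python
from collections import defaultdict


class WeightedSimpleGraph:
    """Undirected graph with at most one weighted edge per node pair."""

    def __init__(self):
        self.nodes = set()
        # frozenset({u, v}) -> weight; the unordered key makes edges undirected.
        self.weights = defaultdict(float)

    def add_edge(self, u, v, weight=1.0):
        if u == v:
            raise ValueError("simple graphs have no self-loops")
        self.nodes.update((u, v))
        # Assumption: a repeated pair adds to the existing edge's weight
        # instead of creating a parallel edge.
        self.weights[frozenset((u, v))] += weight

    def weight(self, u, v):
        return self.weights.get(frozenset((u, v)), 0.0)


g = WeightedSimpleGraph()
g.add_edge("A", "B")
g.add_edge("B", "A", 2.0)  # same undirected pair, so the weight accumulates
```

Keying edges by an unordered pair means direction cannot be represented at all, which matches the undirected restriction listed above.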

An Example
ID  LastNm  FirstNm  Type  Date     Size  Visitee          Loc
1   Dodd    Chris    VA    6/25/09  2018  POTUS
2   Smith   John     VA    6/26/09  237   Office Visitors
3   Smith   John     AL    6/26/09  144   Amanda Kepko
4   Hirani  Amyn     VA    6/30/09  184   Office Visitors
5   Keehan  Carol    VA    6/30/09  8     Kristin Sheehy
6   Keehan  Carol    VA    7/8/09   26    Daniella Leger

First-order Graph: Single Table
(Built from the six-row visitor table in "An Example".)
Schema: LastNm, FirstNm --[Type]-- Loc
One LastNm, FirstNm node per row, with the Type value as edge label:
[Type = VA] Dodd Chris
[Type = VA] Smith John
[Type = AL] Smith John
[Type = VA] Hirani Amyn
[Type = VA] Keehan Carol
[Type = VA] Keehan Carol
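A minimal sketch of first-order graph construction follows, assuming that every row yields one source node, one target node, and one labeled edge (which is how the duplicated names in the slide read). The function name `first_order_graph` is hypothetical. Because the Loc values are missing from the transcript, the sketch uses Visitee as the second node type instead.

```python
COLUMNS = ("ID", "LastNm", "FirstNm", "Type", "Date", "Size", "Visitee")
ROWS = [
    (1, "Dodd", "Chris", "VA", "6/25/09", 2018, "POTUS"),
    (2, "Smith", "John", "VA", "6/26/09", 237, "Office Visitors"),
    (3, "Smith", "John", "AL", "6/26/09", 144, "Amanda Kepko"),
    (4, "Hirani", "Amyn", "VA", "6/30/09", 184, "Office Visitors"),
    (5, "Keehan", "Carol", "VA", "6/30/09", 8, "Kristin Sheehy"),
    (6, "Keehan", "Carol", "VA", "7/8/09", 26, "Daniella Leger"),
]


def first_order_graph(rows, columns, source_cols, target_cols, label_col):
    """Return one (source, target, label) edge per table row."""
    index = {name: i for i, name in enumerate(columns)}
    edges = []
    for row in rows:
        row_id = row[index["ID"]]
        # Nodes carry the row ID, so equal labels from different rows stay
        # distinct until an aggregation step merges them.
        source = (row_id, tuple(row[index[c]] for c in source_cols))
        target = (row_id, tuple(row[index[c]] for c in target_cols))
        edges.append((source, target, row[index[label_col]]))
    return edges


edges = first_order_graph(
    ROWS, COLUMNS, ("LastNm", "FirstNm"), ("Visitee",), "Type"
)
for source, target, label in edges:
    print(f"[Type = {label}] {' '.join(source[1])} -- {target[1][0]}")
```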

Higher-order Graph: Transformations
o Aggregation
o Projection
o Edge Weighting
o Slicing 'n' Dicing

Aggregation: Entity Resolution
Original graph: Dodd, Chris; Smith, John; Smith, John; Hirani, Amyn; Keehan, Carol; Keehan, Carol
After aggregation: Dodd, Chris; Smith, John; Smith, John; Hirani, Amyn; Keehan, Carol
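Aggregation can be sketched as grouping row-level nodes by a key built from chosen columns. The function name `aggregate` is hypothetical. The choice of key is an assumption: grouping by name alone gives four nodes, while adding Type to the key gives five and keeps the two Smith, John rows apart, which is one possible reading of the node list shown after aggregation.

```python
from collections import defaultdict

ROWS = [
    {"ID": 1, "LastNm": "Dodd", "FirstNm": "Chris", "Type": "VA"},
    {"ID": 2, "LastNm": "Smith", "FirstNm": "John", "Type": "VA"},
    {"ID": 3, "LastNm": "Smith", "FirstNm": "John", "Type": "AL"},
    {"ID": 4, "LastNm": "Hirani", "FirstNm": "Amyn", "Type": "VA"},
    {"ID": 5, "LastNm": "Keehan", "FirstNm": "Carol", "Type": "VA"},
    {"ID": 6, "LastNm": "Keehan", "FirstNm": "Carol", "Type": "VA"},
]


def aggregate(rows, key_cols):
    """Map each distinct key to the IDs of the rows merged into that node."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        groups[key].append(row["ID"])
    return dict(groups)


by_name = aggregate(ROWS, ("LastNm", "FirstNm"))
by_name_and_type = aggregate(ROWS, ("LastNm", "FirstNm", "Type"))
```

Keeping the merged row IDs with each node makes it possible to trace an aggregated node back to the table rows it came from.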

Aggregation: Pivoting
[Figure: original graph and graph after aggregation. Person nodes: Dodd, Chris; Smith, John; Smith, John; Hirani, Amyn; Keehan, Carol. Type nodes: VA, AL. Labels: Type, Location.]

Projection
[Figure: graph before and after projection. Nodes: Dodd, Chris; Smith, John; Hirani, Amyn; Keehan, Carol.]
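A common form of projection turns a two-mode network into a one-mode network by linking two nodes of one type whenever they share a neighbor of the other type. The sketch below assumes that form; the function name `project` and the use of the visitor-to-visitee edges from the example table are choices made here, not taken from the slide.

```python
from collections import defaultdict
from itertools import combinations

# Aggregated visitor -- visitee edges from the example table.
EDGES = [
    ("Dodd, Chris", "POTUS"),
    ("Smith, John", "Office Visitors"),
    ("Smith, John", "Amanda Kepko"),
    ("Hirani, Amyn", "Office Visitors"),
    ("Keehan, Carol", "Kristin Sheehy"),
    ("Keehan, Carol", "Daniella Leger"),
]


def project(edges):
    """Link two visitors for every visitee they have in common."""
    visitors_of = defaultdict(set)
    for visitor, visitee in edges:
        visitors_of[visitee].add(visitor)
    projected = defaultdict(int)
    for visitors in visitors_of.values():
        # Sorting gives each undirected pair a single canonical key.
        for a, b in combinations(sorted(visitors), 2):
            projected[(a, b)] += 1  # weight = number of shared visitees
    return dict(projected)


one_mode = project(EDGES)
```

With this data the only shared visitee is Office Visitors, so the projection contains a single edge between Hirani, Amyn and Smith, John.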

Edge Weighting
(Built from the six-row visitor table in "An Example".)
Schema: LastNm, FirstNm -- Visitee
Edge weights taken from Size:
Dodd Chris -- POTUS: 2018
Smith John -- Office Visitors: 237
Smith John -- Amanda Kepko: 144
Hirani Amyn -- Office Visitors: 184
Keehan Carol -- Kristin Sheehy: 8
Keehan Carol -- Daniella Leger: 26
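The difference between counting rows and weighting by a quantitative column can be sketched as follows. The function name `weight_edges` and its `use_size` flag are hypothetical, and summing Size across rows that map to the same edge is an assumption (no two rows in the example share an edge, so the data cannot confirm it).

```python
from collections import defaultdict

# (visitor, visitee, Size) taken from the example table.
ROWS = [
    ("Dodd, Chris", "POTUS", 2018),
    ("Smith, John", "Office Visitors", 237),
    ("Smith, John", "Amanda Kepko", 144),
    ("Hirani, Amyn", "Office Visitors", 184),
    ("Keehan, Carol", "Kristin Sheehy", 8),
    ("Keehan, Carol", "Daniella Leger", 26),
]


def weight_edges(rows, use_size):
    """Weight each visitor -- visitee edge by row count or by summed Size."""
    weights = defaultdict(int)
    for visitor, visitee, size in rows:
        weights[(visitor, visitee)] += size if use_size else 1
    return dict(weights)


by_count = weight_edges(ROWS, use_size=False)
by_size = weight_edges(ROWS, use_size=True)
```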

Slice 'n' Dice
(Built from the six-row visitor table in "An Example".)
[Figure: the LastNm, FirstNm -- Visitee network divided into two sub-networks. Visitor nodes: Dodd Chris; Smith John; Smith John; Hirani Amyn; Keehan Carol; Keehan Carol. Visitee nodes: POTUS; Office Visitors; Amanda Kepko; Kristin Sheehy; Daniella Leger.]
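Slicing can be sketched as splitting the edge set into one sub-network per value of a chosen attribute. The transcript does not show which attribute the slide slices on, so the use of Type below is an assumption, as is the function name `slice_and_dice`.

```python
from collections import defaultdict

ROWS = [
    {"Visitor": "Dodd, Chris", "Type": "VA", "Visitee": "POTUS"},
    {"Visitor": "Smith, John", "Type": "VA", "Visitee": "Office Visitors"},
    {"Visitor": "Smith, John", "Type": "AL", "Visitee": "Amanda Kepko"},
    {"Visitor": "Hirani, Amyn", "Type": "VA", "Visitee": "Office Visitors"},
    {"Visitor": "Keehan, Carol", "Type": "VA", "Visitee": "Kristin Sheehy"},
    {"Visitor": "Keehan, Carol", "Type": "VA", "Visitee": "Daniella Leger"},
]


def slice_and_dice(rows, attribute):
    """Return one visitor -- visitee edge set per value of `attribute`."""
    slices = defaultdict(set)
    for row in rows:
        slices[row[attribute]].add((row["Visitor"], row["Visitee"]))
    return dict(slices)


by_type = slice_and_dice(ROWS, "Type")
```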

Expressive Power
o Proximity grouping
o Extending to directed one-mode network
o Limitations

Ploceus Interface Overview
o Data Management View
o Network Visualization View
o Network Schema View

Direct Manipulation Interface

Ploceus Demo

Related Work
o Centrifuge
o Orion

Multiple Tables
Grants (GID, Title, Program, Program Manager, Amount, Year):
GID 1: Data Mining of Digital Behavior
  Program: Statistics; Program Manager: Sylvia Spengler; Amount: 2241750; Year: 2001
GID 2: Real-time Capture, Management and Reconstruction of Spatio-Temporal Events
  Program: Information Technology Research; Program Manager: Maria Zemankova; Amount: 430000; Year: 2000
GID 3: Statistical Data Mining of Time-Dependent Data with Applications in Geoscience and Biology
  Program: ITR for National Priorities; Program Manager: Sylvia Spengler; Amount: 566644; Year: 2003
People:
PID  Name            Org
1    Padhraic Smyth  University of California Irvine
2    Sharad Mehrotra University of California Irvine
Person-grant links:
Person  Grant  Role
1       1      PI
2       1      copi
2       2      PI
1       3      PI

First-order Graph: Multiple Tables
(Built from the three tables in "Multiple Tables".)
Schema: (Title, Program, ProMgr, Amount, Year) -- (Grant, Role, Person) -- (PID, Name, Org)
Linking rows:
Grant  Role  Person
1      PI    1
1      copi  2
2      PI    2
3      PI    1
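A graph that spans several tables needs a join first. The sketch below follows the linking table from people to grants and connects each researcher to the program manager of each of their grants. The join path, the function name `researcher_manager_edges`, and the use of row counts as edge weights are all assumptions for illustration.

```python
from collections import Counter

# GID -> Program Manager (other grant columns omitted for brevity).
GRANTS = {1: "Sylvia Spengler", 2: "Maria Zemankova", 3: "Sylvia Spengler"}
# PID -> Name
PEOPLE = {1: "Padhraic Smyth", 2: "Sharad Mehrotra"}
# (Person, Grant, Role) rows of the linking table.
PERSON_GRANT = [(1, 1, "PI"), (2, 1, "copi"), (2, 2, "PI"), (1, 3, "PI")]


def researcher_manager_edges(person_grant, people, grants):
    """Join the three tables and count grants per researcher-manager pair."""
    edges = Counter()
    for person_id, grant_id, _role in person_grant:
        edges[(people[person_id], grants[grant_id])] += 1
    return dict(edges)


edges = researcher_manager_edges(PERSON_GRANT, PEOPLE, GRANTS)
```

Here a weight of 2 means the researcher holds two grants under that program manager. A different join path or weighting column would give the same edge a different meaning.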

Open Issues with Multiple Tables
(1) Join specification
(2) Interpretation of edge weights

Acknowledgments
o IIS-0915788
o VACCINE Center