A novel virtual node approach for interactive visual analytics of big datasets in parallel coordinates

by Mao Lin Huang, Tze-Haw Huang, Xuyun Zhang

Future Generation Computer Systems


Computer Networks and Communications / Hardware and Architecture / Software





Future Generation Computer Systems journal homepage: www.elsevier.com/locate/fgcs

A novel virtual node approach for interactive visual analytics of big datasets in parallel coordinates

Mao Lin Huang a,b,∗, Tze-Haw Huang b, Xuyun Zhang b

a School of Computer Software, Tianjin University, China, 92 Weijin Rd., Nankai District, Tianjin, China

b School of Software, FEIT, University of Technology, Sydney, P.O. Box 123, Broadway NSW 2007, Australia

Highlights

• Create ‘‘virtual nodes’’ that make a direct ‘‘mouse click’’ on data items possible in parallel coordinates visualization.

• Refine the classification of visual interactions into a four-layer model.

• The new approach can handle visualization of, and interaction with, extremely large datasets.

Article info

Article history:

Received 18 June 2013

Received in revised form 3 July 2014

Accepted 10 February 2015

Available online xxxx


Keywords:

Big data

Visual analytics

Parallel coordinates

Hierarchical clustering

Multidimensional data visualization

Data retrieval

Abstract

Big data is a collection of large and complex datasets that commonly appear in multidimensional and multivariate data formats. Gaining insight from such data has been recognized as a major challenge in modern computing and information sciences because of its massive volume and complexity (e.g. its multivariate format). Accordingly, there is an urgent need for new and effective techniques to deal with such huge datasets.

Parallel coordinates is a well-established geometrical system for visualizing multidimensional data that has been extensively studied for decades. A variety of associated interaction techniques are currently used with this geometrical system. However, none of these existing techniques can achieve the functions covered by the Select layer of Yi’s Seven-Layer Interaction Model. This is because it is theoretically impossible to select data items via a mouse-click (or mouse-rollover) operation over a particular visual poly-line (a visual object) that has no geometric region. In this paper, we present a novel technique that uses a set of virtual nodes to achieve, in practice, the Select interaction, which has hitherto proven so challenging in parallel coordinates visualization.

© 2015 Elsevier B.V. All rights reserved.

1. Introduction

Modern Information Visualization techniques, at their core, appear to have two main components: representation and interaction. The representation component is concerned with the mapping from data to advanced computer graphics and how to draw or render them on the display. On the other hand, the interaction component is concerned with the dialog between the user and the data stored on the system as the user explores the dataset for the purposes of uncovering insights. The interaction component’s roots lie in the area of Computer–Human Interaction (CHI).

∗ Corresponding author at: School of Computer Software Tianjin University, China 92 Weijin Rd., Nankai District, Tianjin, China.

E-mail address: Mao.Huang@uts.edu.au (M.L. Huang).

http://dx.doi.org/10.1016/j.future.2015.02.003

0167-739X/© 2015 Elsevier B.V. All rights reserved.

Although discussed as two separate components, representation and interaction clearly are not mutually exclusive. While an information visualization system takes the role of providing advanced GUIs for supporting Computer–Human Interaction (CHI), it is supposed to facilitate CHI in both directions, i.e. (1) the input from human to computer (or data), and (2) the output from computer (or data) to human. However, in the past few decades, researchers in the InfoVis community have paid more attention to the output part, which is concerned with the visual representation of output data, such as analysis results, so that users can better understand its contents, attributes and relational structures. They have not paid enough attention to the human input part, which covers instructing, monitoring and guidance in the whole area of direct data manipulation and analytical reasoning. Existing research on the visual human input part has mainly focused on low-level zooming and navigation operations and has not addressed the benefit of human involvement in the visual data manipulation and visual analytical reasoning processes.

Up until now, no standard visual interaction model has been widely recognized, and there have been many different definitions and interpretations of visual interaction in the visualization community in terms of its goals, objectives and functionalities. In 2007, J.S. Yi et al. [1] proposed a complete seven-layer visual interaction model based on the existing context of interaction techniques in visualization. These layers are defined as below:

1. Select: mark something as interesting.

2. Explore: show me something else.

3. Reconfigure: show me a different arrangement.

4. Encode: show me a different representation.

5. Abstract/Elaborate: show me more or less detail.

6. Filter: show me something conditionally.

7. Connect: show me related items.

This seven-layer model describes the existing visual interactions from the lower level to the higher level. The ‘select’ operation is used for highlighting and manipulating a particular data item through the visualization. The ‘explore’ operation, on the other hand, is used to find data items of interest to the user through visual navigation of the data source. Layers 3–5 are concerned with the design of views of the display (visualization) for better understanding and highlighting one or more portions (or patterns) of the visualization that currently interest the user. The last two layers use the ‘filtering’ mechanism to display (or visualize) only the interesting or related data items and remove the less interesting or unrelated data items from the visualization.
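As an illustration only, the seven layers and the grouping described above can be written down as a small data structure. The sketch below is in Python; the names `Interaction` and `role`, and the short role labels, are hypothetical and paraphrase the text. They are not taken from Yi et al. [1] or from any implementation by the authors.

```python
from enum import IntEnum


class Interaction(IntEnum):
    """The seven interaction layers, numbered as in the list above."""
    SELECT = 1              # mark something as interesting
    EXPLORE = 2             # show me something else
    RECONFIGURE = 3         # show me a different arrangement
    ENCODE = 4              # show me a different representation
    ABSTRACT_ELABORATE = 5  # show me more or less detail
    FILTER = 6              # show me something conditionally
    CONNECT = 7             # show me related items


def role(layer: Interaction) -> str:
    """Return the role the text assigns to a layer (labels are our paraphrase)."""
    if layer is Interaction.SELECT:
        return "highlight or manipulate a data item"
    if layer is Interaction.EXPLORE:
        return "navigate to data items of interest"
    if Interaction.RECONFIGURE <= layer <= Interaction.ABSTRACT_ELABORATE:
        return "design the view"
    # Layers 6 and 7: show only the interesting or related items.
    return "filter to interesting or related items"


if __name__ == "__main__":
    for layer in Interaction:
        print(layer.value, layer.name, "->", role(layer))
```

Using an ordered enumeration keeps the lower-to-higher ordering of the layers explicit, so that a range such as layers 3–5 can be tested with a simple comparison.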