A Linguistic Approach to Categorical Color Assignment for Data Visualization


When data categories have strong color associations, it is useful to use these semantically meaningful concept-color associations in data visualizations. In this paper, we explore how linguistic information about the terms defining the data can be used to generate semantically meaningful colors. To do this effectively, we need first to establish that a term has a strong semantic color association, then discover which color or colors express it. Using co-occurrence measures of color name frequencies from Google n-grams, we define a measure for colorability that describes how strongly associated a given term is to any of a set of basic color terms. We then show how this colorability score can be used with additional semantic analysis to rank and retrieve a representative color from Google Images. Alternatively, we use symbolic relationships defined by WordNet to select identity colors for categories such as countries or brands. To create visually distinct color palettes, we use k-means clustering to create visually distinct sets, iteratively reassigning terms with multiple basic color associations as needed. This can be additionally constrained to use colors only in a predefined palette. Supplementary Material

Vidya Setlur
Maureen C. Stone
Publication Date: 
Sunday, October 25, 2015
Publication Information: 
The IEEE Information Visualization Conference (Chicago, October 25-30, 2015)