Dataset description
This dataset contains information about the use of bar charts for corpus data presentation. It is based on a systematic review covering all papers (n = 1,183) published in five corpus-linguistic journals up to and including the year 2024 (International Journal of Corpus Linguistics, Corpus Linguistics and Linguistic Theory, Corpora, Research in Corpus Linguistics, and the International Journal of Learner Corpus Research). In total, the report pool includes n = 1,061 unique instances of bar charts, and the current dataset documents various properties of these diagrams, including information about their content (e.g. number and types of variables, number of levels per variable) and design (e.g. type of bar chart, scaling information, size, use of color). Further, information is provided about the frequency of use of different types of statistical graphs and statistical tables for each article (n = 1,183) in the report pool. (2025-07-31)
Abstract: Related publication
A recent survey of graph usage in corpus-based research has shown that the bar chart is the most widely used graph type for corpus data presentation. Motivated by this finding, the present paper offers a systematic review of bar chart usage in corpus-based research articles. It covers all papers (n = 1,183) published in five corpus-linguistic journals up to and including the year 2024 (International Journal of Corpus Linguistics, Corpus Linguistics and Linguistic Theory, Corpora, Research in Corpus Linguistics, and the International Journal of Learner Corpus Research). The aim of this survey is to arrive at a better understanding of the kinds of visualization tasks imposed on bar charts. We observe that they most commonly show percentages or absolute/normalized frequencies, and that they are often used for relatively complex visualization tasks involving multiple variables and subgroups. The survey is carried out against the backdrop of known limitations of this graph type and design recommendations found in data visualization guidebooks. Our critical examination of diagrams pays attention both to issues that compromise the ability of the viewer to accurately perceive patterns in the data and to minor issues that affect the efficiency of a display. These observations are then distilled into a set of concrete recommendations, which are grounded in current usage and the advice given in the data visualization literature. (2025-07-31)