Posted on February 1, 2023 in Blog Posts
Following last week’s reflection on how librarians can leverage digital content in in-person learning, we wanted to dig deeper into specific strategies instructional librarians can use to integrate library tech literacies into learning environments. We knew of no better resource to continue this discussion than The Data Literacy Cookbook (ACRL, 2022), edited by Kelly Getz and Meryl Brodsky. The Cookbook collects 65 different “recipes,” or adaptable lesson plans, covering all aspects of data literacy.
Below is a recipe from the collection, “Veggie Pizza: Choosing a Data Visualization Tool” by Rachel Starry (UC Riverside Library), licensed under CC BY-NC 4.0. This is an excellent lesson plan for guiding students through the process of selecting a “data viz” tool and a great introduction to several new tools to add to your own toolbox.
This recipe describes a module that can be used on its own or incorporated into the flexible approach to developing an Introduction to Data Visualization workshop described in this book in the chapter “Build Your Own Data Viz Pizza.” The assortment of veggies that top this pizza reflects the wide variety of data visualization (also known as data viz) tools that are available, and the goal of this module is to provide learners with resources for identifying which tools may be best suited to their particular needs.
This lesson equips learners with a set of questions to ask themselves as they prepare to select a data visualization tool for a particular project or course. Encouraging learners to pause and reflect on their needs and the kinds of data they will be working with helps to prevent frustration down the line caused by misalignment between their data, goals, and selected visualization tool.
Serves: 1 classroom of students
The primary audience for this lesson is upper-division undergraduates and graduate students who need to develop basic data visualization skills and awareness of available tools and best practices.

Cooking time: 30 minutes
Students and researchers are often faced with the challenge of selecting from a wide variety of data visualization tools. The learning outcomes for this module tie primarily into ACRL’s Framework for Information Literacy for Higher Education through the frame Research as Inquiry: the lesson emphasizes that learners must identify (through initial examination, and perhaps some data cleaning and transformation) the kind of data they have and their purposes for visualizing it, in order to select the data viz tool that will work best for their data and use case.
The primary preparation step for this module is to identify the range of data visualization tools and platforms that are accessible to users at your institution, including those for which enterprise software licenses are available, as well as open-source tools supported by the library or other units on campus. Develop at least a cursory working knowledge of the features, capabilities, and learning curves for each of those tools.
The following is an example list of veggies (data viz tools) you might choose from to create your own pizza:
This recipe consists of six steps or questions to walk participants through in order to identify which data visualization tools may be the best option for them to invest time in learning to use.
Question 1 introduces the concept of tidy data, or data stored in “long” versus “wide” format. Different tools require data to be organized in different ways, but tidy data—a concept popularized by Hadley Wickham—is a format that works well for most visualization tools. Other researchers, including Robert Kosara, have framed the concept of data organization for visualization using various terminology, such as “spreadsheet thinking” versus “database thinking.” Briefly discuss the concept of data formatting and either describe or demonstrate the process of pivoting data from wide to long format. If users frequently need to pivot their data before visualizing it, consider selecting a tool with that feature built in, such as Tableau or Datawrapper. Alternatively, you might recommend OpenRefine as a powerful tool for data transformation!
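The wide-to-long pivot described above can be sketched in a few lines of pandas (the data and column names here are invented for illustration):

```python
import pandas as pd

# Hypothetical "wide" table: one row per city, one column per year.
wide = pd.DataFrame({
    "city": ["Riverside", "Irvine"],
    "2021": [10, 12],
    "2022": [11, 15],
})

# Pivot to "long" (tidy) format: one row per city-year observation,
# which is the shape most visualization tools expect.
long = wide.melt(id_vars="city", var_name="year", value_name="count")
print(long)
```

The same transformation is what Tableau and Datawrapper perform with their built-in pivot features, and what OpenRefine’s transpose operations accomplish interactively.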
Question 2 reinforces the idea that exploratory visualizations require a different creation process than explanatory visualizations. For example, if users need to quickly reproduce a lot of simple exploratory charts, a programming solution such as R or Python is ideal. If users need to create charts that highlight particular patterns or data values and want to add annotations or other graphical elements to their charts, Plotly, Datawrapper, and Tableau Public are good options in addition to R or Python.
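To make the exploratory case concrete, here is a minimal sketch (with made-up data) of why a scripted approach shines when you need many quick, throwaway charts:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical dataset with a few numeric columns to explore.
df = pd.DataFrame({
    "temp": [12, 15, 14, 18, 21],
    "rainfall": [80, 60, 75, 40, 30],
    "visitors": [200, 260, 240, 310, 400],
})

# One quick histogram per column: the kind of exploratory chart
# that is tedious to rebuild by hand, one at a time, in a GUI tool.
for col in df.columns:
    fig, ax = plt.subplots()
    ax.hist(df[col])
    ax.set_title(f"Distribution of {col}")
    fig.savefig(f"explore_{col}.png")
    plt.close(fig)
```

Three files, three charts, one loop; adding a tenth column to the data costs nothing extra.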
Question 3 emphasizes the need to select chart types appropriate to the kinds of variables you want to visualize; checking which chart options are available in different tools or platforms can help narrow down your choice. Resources like the Data Visualisation Catalogue and From Data to Viz (see the Additional Resources section) provide typologies and flowcharts for selecting appropriate chart types and can be excellent examples to share during this discussion.
Question 4 encourages participants to weigh the pros and cons of proprietary versus open-source software when deciding which tool to invest their time in learning. Briefly discuss the importance of considering what options will be available to them in the long term: will particular licensed software be available should they leave their current institution? Does there appear to be an active community of support for a new open-source tool?
Question 5 prompts students to consider using advanced statistical software (Stata, SAS, MATLAB) or code-based solutions (R or Python) if they need to create complex charts or a series of visually consistent graphs, as opposed to browser-based tools such as Datawrapper or RAWGraphs, which enable you to create simple charts quickly and easily.
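The “visually consistent series” point can be illustrated with a short sketch (invented data): in a code-based tool, shared styling is defined once and every chart in the series inherits it automatically.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

# Shared styling defined once; every figure below inherits it,
# a consistency that is hard to guarantee chart-by-chart in a GUI.
plt.rcParams.update({
    "figure.figsize": (4, 3),
    "axes.titlesize": 11,
    "axes.grid": True,
})

series = {"A": [1, 3, 2], "B": [2, 2, 4], "C": [3, 1, 5]}
for name, values in series.items():
    fig, ax = plt.subplots()
    ax.plot(values, marker="o")
    ax.set_title(f"Series {name}")
    fig.savefig(f"series_{name}.png")
    plt.close(fig)
```

Changing the figure size or grid style for the whole series means editing one dictionary, not revisiting each chart.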
Question 6 helps participants think through the publication and sharing process they intend to use: will readers encounter their charts in static form, either in print or on a screen, or do they want to create interactive charts to be viewed online or offline? Some tools, such as Microsoft Excel and certain R or Python libraries, are best suited to static charts, while tools such as Plotly, Datawrapper, Google Charts, Tableau Public, and many JavaScript libraries (such as D3.js) are designed to create interactive charts.
It can be helpful to wrap up a module like this by sharing locally created library resources such as written tutorials or LibGuides for students to reference after the workshop. Also consider sharing a handout or quick reference sheet that lists all the tools and platforms discussed, with URLs to access each tool and its relevant help pages or community forums.
Additional Resources

Bokeh (Python package for interactive visualizations), https://docs.bokeh.org/en/latest/.
Data Visualisation Catalogue, https://datavizcatalogue.com.
Datawrapper, https://www.datawrapper.de.
D3: Data-Driven Documents, https://d3js.org.
From Data to Viz, https://www.data-to-viz.com.
ggplot2 (R package for visualizations), https://ggplot2.tidyverse.org.
Google Charts, https://developers.google.com/chart/.
Kosara, Robert. “Spreadsheet Thinking vs. Database Thinking.” EagerEyes (blog), April 24, 2016. https://eagereyes.org/basics/spreadsheet-thinking-vs-database-thinking.
Matplotlib (Python package for static, animated, and interactive visualizations), https://matplotlib.org.
OpenRefine, https://openrefine.org.
Plotly Chart Studio, https://chart-studio.plotly.com/create.
RAWGraphs, https://rawgraphs.io.
Tableau Public, https://public.tableau.com/en-us/s/.
Wickham, Hadley. “Tidy Data.” Journal of Statistical Software 59, no. 10 (2014): 1–23, https://doi.org/10.18637/jss.v059.i10.
Rachel Starry is the Digital Scholarship Librarian at the University of California, Riverside.
Choice and LibTech Insights gratefully acknowledge our launch sponsor, Dimensions, a part of Digital Science. Dimensions is the largest linked research database available and provides a unique view across the whole research ecosystem, from idea to impact.
Sign up for LibTech Insights (LTI) new post notifications and updates.
Interested in contributing to LTI? Send an email to Deb V. at Choice with your topic idea.