Three Pillars of Preparation
Before a single chart is drawn, three foundational questions need answers: What are you trying to communicate? What tools will serve your data and audience? And is your data ready? This guide addresses each in sequence.
Defining Clear Objectives
A visualization without a purpose is decoration. Every visual should connect to a specific agency goal—informing policy, engaging the public, or improving communication across departments.
Know Your Audience
A policymaker needs high-level summaries with clear takeaways. A technical analyst needs precision and detail. Developing audience personas—mapping out roles, knowledge levels, and expectations—ensures visuals land the way they’re intended to.
What Is Your Message?
Every visualization falls into one of four categories. Identifying yours early shapes every design decision that follows.
Message & Audience Workflow
With message type and audience identified, this workflow guides you from core question to the most effective visual format and framing.
Figure: Message & Audience Workflow — from key message to impactful framing.
Data Storytelling
The most effective visualizations tell a story. Framing data around cause-and-effect or trends over time helps viewers grasp meaning quickly—and act on it. Titles, labels, and annotations reinforce the message. Color and layout direct the eye.
Figure: A side-by-side comparison of passive vs. directed visual storytelling.
What Makes a Narrative Actionable?
An actionable narrative leaves little room for ambiguity. It guides the viewer toward a specific conclusion or decision by using a clear central question, visual hierarchy that supports the message, and annotations that explain rather than just label.
Integrating AI into Visualization Workflows
AI tools can be genuinely powerful—but their value depends entirely on the type of task. The framework below helps practitioners calibrate when and how to use them.
| Challenge Type | What It Looks Like | AI Suitability |
|---|---|---|
| Reasoning Challenges | Complex analytical thinking: designing measures, evaluating tradeoffs, interpreting interacting variables | Moderate–High: AI supports analysis; practitioner validates |
| Effort Challenges | High-volume, repetitive work: cleaning datasets, generating draft charts, standardizing fields | Very High: AI as primary automation engine |
| Coordination Challenges | Aligning across departments, tracking inputs, reconciling feedback | Moderate: AI documents and organizes; humans decide |
| Domain Expertise | Applying lived experience: stakeholder context, policy framing, visual judgment | Low–Moderate: reference support only; no substitute for judgment |
| Ambiguity Challenges | Visualization objective is unclear; the right question hasn’t been defined yet | Moderate: AI prototypes options; humans finalize direction |
| Judgment / Courage | Sensitive findings, equity outcomes, politically charged results | Very Low: advisory only; leadership must own these decisions |
The AI Fluency Map
Even when AI is well-suited to a task, outcomes depend on how effectively the practitioner engages it. These six competencies define what fluent AI use looks like in practice.
Prompt Design & Context Framing
Provide scope, constraints, and examples—not just a task request. Embed frameworks and specify tone, length, and format.
Technical Understanding
Know the difference between pattern prediction and fact retrieval. Recognize training cutoffs and the potential for hallucinations.
Workflow Design & Integration
Embed AI intentionally into day-to-day processes. Provide documents, scope, and defined asks—similar to onboarding a new team member.
Advanced Prompting Techniques
Use structured methods—few-shot examples, staged reasoning, multi-step instructions—to improve output quality and control.
Critical Evaluation & Verification
Check all outputs for accuracy, credibility, and fitness for purpose. Flag unsupported assertions; cross-reference key sources.
Managing Expertise “Flattening”
Prevent AI from producing technically correct but generic outputs. Ensure results reflect real tradeoffs, stakeholder context, and institutional realities.
Preparing Data for Visualization
The quality of a visualization depends on the quality of the data behind it. Preparation means making data accurate, well-structured, and aligned with the analysis objective—before any visual is built.
Selecting Your Tool
The right platform works with your data format, supports your publishing needs, meets accessibility requirements, and is something your team can maintain. Key considerations include data connectivity, ease of use, interactivity, and support for automated refresh in dashboards that update frequently.
Becoming Familiar with Your Data
Before cleaning anything, explore. Numeric summaries give a quick picture—but visual exploration reveals what numbers alone cannot: patterns, clusters, outliers, and nonlinear relationships.
Anscombe’s Quartet and the Datasaurus Dozen are classic demonstrations of this principle. Multiple datasets can share nearly identical descriptive statistics while looking completely different when visualized. Always pair numeric and visual exploration—they are complementary, not interchangeable.
Figure: Anscombe’s Quartet—four datasets with identical statistics, four very different patterns.
Figure: Datasaurus Dozen—further proof that numeric summaries can hide dramatic visual differences.
Cleaning the Data
Unclean data produces distorted insights—and no amount of design polish can fix a flawed dataset. Six principles guide effective cleaning:
Validity
Data meets predefined criteria. Traffic counts should be non-negative whole numbers, not “abc.”
Accuracy
Data reflects reality—not just a correctly formatted number, but the right number.
Completeness
No critical fields are blank. Missing context undermines the whole analysis.
Consistency
Data aligns logically. Zero cyclists but “highly congested” is a red flag.
Uniqueness
No duplicate records. Counting the same vehicle twice inflates every metric.
Uniformity
One unit of measurement throughout. Miles and kilometers cannot coexist.
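Several of these principles translate directly into automated checks. The sketch below, using pandas and an illustrative traffic-count table (the column names are hypothetical), flags validity, completeness, uniqueness, and consistency problems rather than silently fixing them:

```python
import pandas as pd

# Illustrative bicycle-count records; column names are hypothetical.
df = pd.DataFrame({
    "street":     ["Elm", "Elm", "Oak", "Oak"],
    "count":      [42, 42, -3, 17],
    "congestion": ["high", "high", "high", "low"],
})

# Validity: counts must be non-negative whole numbers.
invalid = df[df["count"] < 0]

# Completeness: no critical field may be blank.
missing = df[df[["street", "count"]].isna().any(axis=1)]

# Uniqueness: drop exact duplicate records.
deduped = df.drop_duplicates()

# Consistency: a zero-or-negative count paired with "high" congestion
# is a logical contradiction worth reviewing.
suspect = df[(df["count"] <= 0) & (df["congestion"] == "high")]
```

Running checks like these before any chart is drawn turns the six principles from a checklist into a repeatable gate.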
Define your method for handling outliers before you begin—not after you see results. Removing values only because they’re inconvenient introduces bias.
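One common pre-registered rule is the 1.5 × IQR fence. A minimal pandas sketch, using made-up daily counts, flags (rather than deletes) values outside that fence so the decision to exclude them remains explicit:

```python
import pandas as pd

# Rule chosen BEFORE inspecting results: flag values outside
# 1.5 * IQR as outliers; never delete them silently.
counts = pd.Series([12, 15, 14, 13, 16, 14, 95])  # illustrative daily counts

q1, q3 = counts.quantile(0.25), counts.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Flag, document, and decide — don't drop by default.
outliers = counts[(counts < lower) | (counts > upper)]
```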
Transforming Data for Visualization
Once data is clean, it needs to be structured so visualization tools can use it. Most platforms expect tabular data: one row per record, one value per cell, consistent column names, no totals mixed in.
Figure: Non-tabular vs. tabular data—the right structure makes a dataset ready for any visualization tool.
Getting to this structure typically requires one or more transformation methods. The examples below use bicycle traffic counts to keep the logic concrete.
Wide Format
Each variable gets its own column; each row is a unique entity. Ideal for comparing across variables in a single row.
Long Format
One column holds variable names; another holds values. Each row is a single observation. This is the format most visualization libraries and tools expect for charts like line graphs and grouped bars.
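The wide-to-long reshape is a one-liner in pandas via `melt`. The sketch below uses hypothetical monthly bicycle counts; the street and month names are illustrative:

```python
import pandas as pd

# Hypothetical wide format: one row per street, one column per month.
wide = pd.DataFrame({
    "street": ["Elm", "Oak"],
    "jan":    [310, 120],
    "feb":    [280, 150],
})

# Long format: one row per (street, month) observation — the shape
# most charting libraries expect for line graphs and grouped bars.
long = wide.melt(id_vars="street", var_name="month", value_name="count")
```

The reverse direction (long back to wide) is handled by `DataFrame.pivot`.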
Aggregation
Combine values using sum, mean, or count to surface broader trends. Instead of daily counts per street, show monthly averages.
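That daily-to-monthly rollup maps directly onto a pandas `groupby`. The data below is illustrative:

```python
import pandas as pd

# Hypothetical daily counts per street.
daily = pd.DataFrame({
    "street": ["Elm", "Elm", "Oak", "Oak"],
    "date": pd.to_datetime(["2024-01-01", "2024-01-02",
                            "2024-01-01", "2024-01-02"]),
    "count": [300, 320, 100, 140],
})

# Aggregation: roll daily counts up to a monthly average per street.
monthly = (
    daily
    .groupby(["street", daily["date"].dt.to_period("M")])["count"]
    .mean()
    .reset_index()
)
```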
Other Common Methods
Transposing swaps rows and columns—useful as an intermediate step when restructuring orientation.
Derived Metrics create new columns from existing data: percentage change, weekly totals, ratios.
Binning groups continuous values into categories (Low / Medium / High), simplifying distribution analysis.
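All three methods above have direct pandas equivalents. A short sketch on the same illustrative monthly counts (the bin edges and labels are assumptions, not prescriptions):

```python
import pandas as pd

monthly = pd.DataFrame(
    {"jan": [310, 120], "feb": [280, 150]},
    index=["Elm", "Oak"],
)

# Transposing: swap rows and columns as an intermediate step.
by_month = monthly.T

# Derived metric: percentage change from January to February.
monthly["pct_change"] = (monthly["feb"] - monthly["jan"]) / monthly["jan"] * 100

# Binning: group continuous counts into Low / Medium / High categories.
monthly["jan_level"] = pd.cut(
    monthly["jan"], bins=[0, 150, 250, 400],
    labels=["Low", "Medium", "High"],
)
```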