Essential Tips to Turn Your Data into Captivating Stories
DISCLAIMER: This article provides a summarized version and excerpts from the original PDF, FA_Storytelling_with_Data_Visualization_v1_.pdf, published by financealliance.io. The purpose of this article is to present a concise and relevant post on the topic of data storytelling.
Table of Contents
Chapter 1: Clean Your Data Before Storytelling
Understanding the importance of data cleaning and preparation for effective storytelling.
Chapter 2: The Three Key Principles of Data Storytelling
Master the essential principles that make data storytelling impactful and memorable.
Chapter 3: Choosing the Right Graph for Every Purpose 📊
Learn how to select the perfect visualization to communicate your message clearly.
Chapter 4: Using Color Effectively 🎨
Discover how to use color to enhance readability and highlight key insights.
Chapter 5: The 5 Steps to a Powerful Data Story
A step-by-step guide to crafting a compelling narrative with data.
Chapter 6: Advanced Data Visualization — Heat Maps 🌡️
Dive into heat maps and learn how to visualize complex data patterns.
Chapter 7: Advanced Data Visualization — Tree Maps 🌳
Explore tree maps and how they can help in displaying hierarchical data.
Chapter 8: Advanced Data Visualization — Sankey Diagrams 🔄
Master Sankey diagrams for visualizing data flow and relationships.
Chapter 9: Advanced Data Visualization — Choropleth Maps 🌍
Learn how to use choropleth maps to visualize geographic data and trends.
Conclusion:
Summing up the power of data storytelling and how to use it for maximum impact.
Why do we need data storytelling?
Chapter 1: Clean Your Data Before Storytelling
You can’t tell a sound story with broken data; the resulting story simply won’t make sense. So first make sure your data is clean and ready for meaningful analysis, and only then move on to storytelling.
Here are a few data cleaning techniques you can use:
1. Remove duplicates
Duplicated data entries are more common than you might think and tend to occur during data collection. This can lead to inconsistencies and errors in your analysis and visualizations. By removing duplicates, you can ensure that your data is accurate and consistent. Fix this issue in Excel by using the “Remove Duplicates” function to identify and remove duplicate entries.
2. Fill in missing values
Have you ever tried to solve a puzzle with missing pieces? It’s frustrating! The same goes for missing data in your dataset. Missing data can be a major problem for analysis and visualization: it can skew your results and make it difficult to draw accurate conclusions. Filling in missing values sensibly helps keep your analysis and visualizations based on a complete dataset. In Excel, you can use the “Fill Down” function to fill in missing values.
3. Correct inaccuracies
Inaccurate data can lead to incorrect insights and poor decisions. Taking time to correct errors ensures your data is reliable and trustworthy. Review your data for errors manually or with scripts. In Excel, you can use the “Find and Replace” function to correct inaccuracies.
4. Standardize data formats
Standardizing data formats ensures your data is compatible and easy to work with. Inconsistent formats can lead to errors in your analysis and visualizations. In Excel, you can use the “Text to Columns” function to standardize the data format of each column.
5. Remove irrelevant data
Irrelevant data can clutter your dataset and make it difficult to draw meaningful insights. Removing it allows you to focus on the most important information. In Excel, you can use the “Filter” function to isolate and remove irrelevant data.
6. Enforce schema
This one comes last, but it is the step most users miss: if your data is not structured according to a schema, many analytical and storytelling tools will struggle to understand it. Most tools expect a schema in which each column (field) has a fixed meaning and in which the data grows by adding rows rather than columns. With a fixed schema, tools can analyze the data and produce attractive, accurate results. Here is a loosely related video on how people go wrong with a bad schema before analysis and storytelling.
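The cleaning steps above are not tied to Excel. The following is a minimal sketch in plain Python, assuming a small hypothetical revenue table held as a list of dictionaries. The column names (`region`, `Jan`, `Feb`), the sample values, and the choice to fill gaps with `0.0` are all assumptions made for this illustration; the right fill strategy depends on your data.

```python
# Hypothetical "wide" table: one column per month, with a duplicate row,
# inconsistent region spelling, and a missing value.
raw_rows = [
    {"region": "North", "Jan": "100", "Feb": "120"},
    {"region": "North", "Jan": "100", "Feb": "120"},   # exact duplicate
    {"region": " south ", "Jan": "80", "Feb": ""},     # messy text, missing value
]


def clean(rows, fill_value=0.0):
    """Deduplicate, standardize formats, and fill missing values."""
    seen = set()
    cleaned = []
    for row in rows:
        # Standardize formats: trim and title-case text, convert numbers to float.
        region = row["region"].strip().title()
        values = {
            month: float(text) if text.strip() else fill_value
            for month, text in row.items()
            if month != "region"
        }
        key = (region, tuple(sorted(values.items())))
        if key in seen:          # remove duplicates
            continue
        seen.add(key)
        cleaned.append({"region": region, **values})
    return cleaned


def to_long(rows, id_field="region"):
    """Enforce a fixed schema: one row per (region, month, revenue)."""
    return [
        {id_field: row[id_field], "month": month, "revenue": value}
        for row in rows
        for month, value in row.items()
        if month != id_field
    ]


long_rows = to_long(clean(raw_rows))
for record in long_rows:
    print(record)
```

In the long format, a new month adds rows instead of a new column, so the schema (`region`, `month`, `revenue`) stays fixed as the data grows.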
Chapter 2: The Three Key Principles of Data Storytelling
Remember that data storytelling serves a goal, and most of the time that goal is to convince your audience: you want people to buy into the points you make. So think from the target audience’s perspective while you create the story. These three principles are the most basic ones to keep in mind:
1. Simplicity
If a chart is harder to read than the table behind it, it isn’t doing its job of delivering the message and supporting your narrative.
2. Accuracy
If the message is inaccurate, you will lose trust. The goal of storytelling is to inspire and convince your audience, and without trust you have no ground left to convince anyone.
3. Tailored
A personalized presentation or report requires you to tailor the colors, labels, annotations, and explanations to support your point. If you present a standard, or even dull, chart, you haven’t done the work of analyzing it for your audience.
Chapter 3: Choosing the Right Graph for Every Purpose
Different types of visuals work better for different types of data, and selecting the wrong one can cause confusion and misinterpretation.
For example, using a bar chart when a line chart is more appropriate can obscure trends and make it more difficult for viewers to see patterns.
Ultimately, selecting the right type of visual is key to ensuring your data is communicated effectively, and your insights are understood. By using visuals that are clear, concise, and engaging, you can make a more significant impact on your audience and drive better decision-making.
Here’s a quick guide (on major chart types) to help you choose:
Bar chart: Ideal for comparing categories or showcasing changes over time.
Line chart: Great for illustrating trends and time-series data.
Pie chart: Perfect for showing proportions or percentages of a whole.
Scatter plot: Best for displaying the relationship between two variables.
Heat map: Excellent for visualizing data density or concentrations.
Area chart: Useful for highlighting the magnitude of change over time and emphasizing trends.
Stacked bar chart: Effective for showing the composition of categories or the distribution of data across multiple groups.
Bubble chart: Ideal for representing three or more variables simultaneously, while showing the relationship and differences between them.
Waterfall chart: Excellent for visualizing the cumulative effect of sequentially introduced positive or negative values, typically used for understanding the incremental contribution of different factors to a final value.
Box and whisker plot: Ideal for displaying the distribution of data, highlighting outliers, and showcasing the central tendency and dispersion of a dataset.
Radar chart: Useful for comparing multiple quantitative variables, showcasing the performance or profile of different entities across various attributes.
Among all these visualization types, bar charts, line charts, and waterfall charts are the ones most commonly used for data storytelling. Still, choose a chart only once you have decided:
- What clean data do you have at hand?
- What purpose (message to convey) do you want to achieve with this story?
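The quick guide above can be treated as a simple lookup from purpose to chart type. Here is a small sketch of that idea; the purpose labels used as keys and the function name `suggest_chart` are hypothetical, chosen only for this example, and the mapping covers just part of the guide.

```python
# Purpose -> suggested chart, taken from the quick guide above.
# The purpose keys are hypothetical labels chosen for this sketch.
CHART_GUIDE = {
    "compare categories": "bar chart",
    "trend over time": "line chart",
    "share of a whole": "pie chart",
    "relationship between two variables": "scatter plot",
    "density or concentration": "heat map",
    "composition across groups": "stacked bar chart",
    "cumulative effect of changes": "waterfall chart",
    "distribution and outliers": "box and whisker plot",
}


def suggest_chart(purpose):
    """Return the chart the guide suggests, or None when nothing matches."""
    return CHART_GUIDE.get(purpose.strip().lower())


print(suggest_chart("Trend over time"))               # line chart
print(suggest_chart("Cumulative effect of changes"))  # waterfall chart
```

A lookup like this is only a starting point: the final choice still depends on the data you have and the message you want to convey.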
Chapter 4: Using Color Effectively
Color is an incredibly powerful tool in data storytelling, capable of evoking emotions, drawing attention, and communicating complex information quickly and effectively.
It is essential to choose colors that work well together and don’t clash. For example, if you want viewers to compare two data points, blue and orange create an appealing visual contrast, while red and pink tend to clash.
The psychology of color in data visualization — understand color’s emotional associations
Red: Often associated with excitement, passion, and danger, red can be a good choice for highlighting important information or drawing attention to key insights. However, overusing red can be overwhelming, so it’s best used sparingly.
Blue: Tending to evoke feelings of trustworthiness, stability, and calmness, blue is a versatile color that can work well in a variety of data visuals. It’s also easy on the eyes, making it a good choice for longer presentations or reports.
Green: Commonly linked with growth, prosperity, and balance, green is a good choice for financial data visualizations. It can also be calming, making it a good choice for visuals or presentations that require a lot of detail.
Yellow: Commonly tied to happiness, optimism, and energy, yellow is well suited to highlighting important data points.
Orange: Usually connected with enthusiasm, creativity, and warmth, orange is great for increasing energy or excitement. It’s also ideal for highlighting key insights.
Purple: Traditionally paired with luxury, creativity, and sophistication, purple gives the impression of elegance. However, it’s another color that can be overwhelming if used too much, so just make sure to use it sparingly.
Gray: Generally tied to neutrality, formality, and professionalism, gray can help create a balanced and professional-looking visualization. It can also help highlight other colors and create contrast.
So what is color’s function? In other words, when do we consider using color?
Here are 7 tips for using color:
1. Highlight key insights
Use color to draw attention to important data points or trends that you want your audience to focus on. This will help you make your message more impactful and memorable.
2. Contrast is key
Choose contrasting colors to differentiate between different categories or segments in your visualization. This will help ensure your audience can quickly and easily understand the information you’re presenting.
3. Consider accessibility
Keep in mind that not all viewers see colors the same way. Consider using color-blind-friendly palettes to ensure that everyone can access the information in your visualization.
4. Keep it simple
Limit the number of colors you use to avoid overwhelming your audience. Stick to a few key colors that work well together and complement your data.
5. Purposeful color choices
Be purposeful with your color choices. Consider the cultural and emotional associations of different colors when selecting them for your visualization.
6. Consistent color scheme
Use a consistent color scheme throughout your visualization to create a cohesive and professional look.
7. Create a visual hierarchy
Use color to create visual hierarchies that make it easy for viewers to understand the data at a glance. For example, you could use a bright color to highlight a key data point and a more muted color for supporting data.
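Several of these tips (highlight key insights, keep it simple, create a visual hierarchy) come down to one accent color for the point that matters and a muted color for everything else. The sketch below shows one way to build such a color list. The function name `highlight_colors` and both hex values are assumptions for illustration: the accent is the vermillion from the Okabe-Ito color-blind-friendly palette, and the gray is an arbitrary neutral.

```python
# Hex values are assumptions for this sketch: a vermillion accent from the
# Okabe-Ito color-blind-friendly palette and a neutral gray for context.
ACCENT = "#D55E00"
MUTED = "#BBBBBB"


def highlight_colors(labels, key_label):
    """Give the key data point the accent color and mute everything else."""
    return [ACCENT if label == key_label else MUTED for label in labels]


quarters = ["Q1", "Q2", "Q3", "Q4"]
colors = highlight_colors(quarters, key_label="Q3")
print(list(zip(quarters, colors)))
```

Many plotting libraries accept a per-bar list of colors like this one, so the same list can usually be passed straight to a bar chart.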
Chapter 5: The 5 Steps to a Powerful Data Story
Before practicing these 5 steps, put one thing first: the human element.
Remember the goal of data storytelling? To build a persuasive argument. Everything we learn and apply in data storytelling serves to make our presentation and argument more acceptable to humans: your audience. So if you apply storytelling techniques to make an emotional connection, your data story won’t be bad!
Here are the 5 steps:
- Setting the Stage: give brief, concise background and context for the problem.
- Presenting the Problem: solving an important problem is the common ground between you and your audience. NEVER FORGET IT!
- Showcasing the Data: this is where data in story format comes into play, and it is the theme of everything we discuss in this community!
- Highlight the Key Insights: keep it simple and focused. Ideally, mention no more than 1 key insight; if you have to mention more, keep it under 3. More items dilute your focus, and then they are no longer key insights.
- Make the Conclusion: you are the expert and more familiar with the topic you study and present, so make a conclusion or recommendation that lets your audience trust and rely on you. Otherwise you just present a problem and leave your audience hanging. Everyone likes to know they are living in a solid house made of stone, not straw.
Chapter 6: Advanced data visualization — Heat maps
Heat maps use color to represent data values in a matrix format. Thanks to their clustering nature, they’re great for identifying trends and patterns in large datasets, such as stock price fluctuations or correlations between various financial indicators.
The difficulty in creating heat maps lies in choosing the right color scheme and scaling. Returning to our earlier point: how do you make the pattern carry your message so that it resonates emotionally with the audience?
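To make the scaling question concrete, here is a minimal sketch that min-max scales every cell of a matrix onto a shared 0..1 range and then picks a color from a light-to-dark ramp. The ramp's hex values, the sample numbers, and the function names are all assumptions for illustration, and min-max scaling is only one of several possible choices.

```python
# A light-to-dark ramp; the hex values are placeholders for this sketch.
RAMP = ["#FFF5EB", "#FDBE85", "#FD8D3C", "#D94701"]


def scale(value, low, high):
    """Min-max scale a value into the range 0..1."""
    if high == low:
        return 0.0
    return (value - low) / (high - low)


def heat_colors(matrix, ramp=RAMP):
    """Map every cell of a 2D list to a ramp color using one shared scale."""
    flat = [v for row in matrix for v in row]
    low, high = min(flat), max(flat)
    last = len(ramp) - 1
    return [
        [ramp[min(last, int(scale(v, low, high) * len(ramp)))] for v in row]
        for row in matrix
    ]


# Hypothetical values, e.g. sales per product (rows) and quarter (columns).
matrix = [
    [10, 25, 40],
    [15, 30, 50],
    [5, 20, 45],
]
colors = heat_colors(matrix)
for row in colors:
    print(row)
```

Using one scale for the whole matrix keeps cells comparable with each other; scaling each row separately would instead emphasize patterns within a row, which tells a different story.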
Chapter 7: Advanced data visualization — Tree map
Tree maps are a space-filling visualization technique that uses nested rectangles to represent hierarchical data. The size of each block visually represents the weight of that “key”.
Take the finance domain, for example: tree maps are useful for displaying financial data with multiple levels, such as the breakdown of a company’s revenue by product category and subcategory. The difficulty in creating tree maps involves selecting appropriate nesting and color-coding schemes to ensure clarity.
The suggestion is to let SIZE play the major role in signaling what weighs more. Use a standout color only to draw the audience’s attention to specific items (blocks), and keep most blocks plain or low in contrast.
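The idea that size carries the weight can be sketched with the simplest tree map layout, often called "slice": a rectangle is cut into side-by-side blocks whose areas are proportional to their values. The function name, the category names, and the numbers below are hypothetical, and real tools often use more elaborate layouts.

```python
def slice_layout(items, x, y, width, height):
    """Split a rectangle into side-by-side blocks sized by value.

    items: list of (label, value) pairs with positive values.
    Returns a list of (label, x, y, width, height) tuples.
    """
    total = sum(value for _, value in items)
    blocks = []
    offset = x
    for label, value in items:
        block_width = width * value / total
        blocks.append((label, offset, y, block_width, height))
        offset += block_width
    return blocks


# Hypothetical revenue by product category.
revenue = [("Hardware", 50), ("Software", 30), ("Services", 20)]
blocks = slice_layout(revenue, x=0, y=0, width=100, height=60)
for label, bx, by, bw, bh in blocks:
    print(f"{label}: area={bw * bh:.0f}")
```

Nesting follows the same pattern: each block can be split again for its subcategories with a similar function that cuts along the other axis.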
Chapter 8: Advanced data visualization — Sankey
Sankey diagrams are flow diagrams that show the flow of data or resources between nodes, using the thickness of the connecting lines to represent the flow’s magnitude. They’re useful for visualizing financial flows, such as the movement of funds between accounts or the distribution of investments in a portfolio. The challenge in creating Sankey diagrams comes from managing the layout and line thicknesses to accurately represent the data.
Many financial apps offer a Sankey diagram as a dedicated cash flow visualization: for example, multiple income streams flowing into different expense categories, with the remainder shown as savings.
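The data behind such a cash flow diagram is just a list of links, each with a source, a target, and a value that sets the line thickness. The sketch below builds those links for a hypothetical budget and treats whatever income is left over as savings. All names and amounts are invented for illustration, and the intermediate "Budget" node is an assumption about how the flows are joined.

```python
# Hypothetical monthly amounts; names and numbers are made up for illustration.
income = {"Salary": 4000, "Freelance": 1000}
expenses = {"Rent": 1500, "Food": 600, "Transport": 300}


def cash_flow_links(income, expenses, hub="Budget", remainder="Savings"):
    """Build (source, target, value) links; leftover income becomes savings."""
    total_in = sum(income.values())
    total_out = sum(expenses.values())
    if total_out > total_in:
        raise ValueError("expenses exceed income; no remainder to save")
    links = [(source, hub, amount) for source, amount in income.items()]
    links += [(hub, target, amount) for target, amount in expenses.items()]
    if total_in > total_out:
        links.append((hub, remainder, total_in - total_out))
    return links


links = cash_flow_links(income, expenses)
for source, target, value in links:
    print(f"{source} -> {target}: {value}")
```

Because everything entering the hub also leaves it, the diagram stays balanced, which is what lets line thickness represent the data accurately.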
Chapter 9: Advanced data visualization — Choropleth maps
Choropleth maps use color gradients to represent data values across geographical regions, such as countries or states. They’re ideal for visualizing regional financial data like GDP growth, poverty rates, or market penetration. The difficulty in creating choropleth maps lies in selecting the right color scheme and ensuring the data is accurately represented geographically.
Besides choropleths, there are many other ways to represent data on a map, such as bubble maps and spike maps.
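Choosing a color scheme for a choropleth usually starts with sorting regions into a few classes. One common option is quantile classes, where each class holds roughly the same number of regions. The sketch below is a minimal version of that idea; the region names, growth figures, hex values, and function names are all hypothetical, and ties are simply broken by sort order.

```python
# Light-to-dark ramp with one color per class; hex values are placeholders.
BLUES = ["#DEEBF7", "#9ECAE1", "#3182BD"]


def quantile_classes(values_by_region, n_classes):
    """Rank regions by value and split the ranking into equal-count classes."""
    ranked = sorted(values_by_region, key=values_by_region.get)
    count = len(ranked)
    return {
        region: rank * n_classes // count
        for rank, region in enumerate(ranked)
    }


def choropleth_colors(values_by_region, ramp=BLUES):
    """Assign each region the ramp color of its quantile class."""
    classes = quantile_classes(values_by_region, len(ramp))
    return {region: ramp[index] for region, index in classes.items()}


# Hypothetical GDP growth rates in percent, keyed by made-up region names.
growth = {
    "Region A": 1.2, "Region B": 3.5, "Region C": 0.4,
    "Region D": 2.8, "Region E": 5.1, "Region F": 2.0,
}
region_colors = choropleth_colors(growth)
for region, color in region_colors.items():
    print(region, color)
```

Quantile classes spread the colors evenly across regions, but they can exaggerate small differences; equal-width classes are an alternative when the actual values matter more than the ranking.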
Conclusion
Data storytelling is an incredibly powerful skill that can benefit everyone. By mastering the fundamentals, we elevate our ability to communicate effectively, making it easier and faster to drive progress and inspire action.
Columns Ai, an AI-driven data storytelling tool, empowers anyone to interpret data effortlessly and craft compelling stories that resonate with others. Transform your data into meaningful narratives and share insights that make an impact.
Credit: Finance Alliance — Storytelling with Data Visualization. The author has edited and summarized key insights to enhance readability and value for the audience.