Do any of these sound familiar?

"I keep exporting spreadsheets and making dashboards, but I don’t really know what I’m looking at—or what to do with it."

"We just spent six months rolling out a new program, but only now are we asking: ‘Did it work?’ Shouldn’t we have thought about this from the start?"

"I have access to training completion rates, feedback scores, and learning hours… but I’m not sure what any of it actually tells me."

Data literacy is the ability to read, understand, create, and communicate data as information (source: Wikipedia).

Data literacy is regarded as one of the most valuable skills to have in today’s data-driven and AI-dominated world.

But being data-literate isn’t about being an expert or knowing everything—it’s about being able to ask the right questions, spot patterns, and avoid costly mistakes. Whether you’re making a business case for L&D, improving programs, or proving impact… better data skills will make you a more strategic partner. And is that not what we all want in L&D?

This newsletter edition covers the essentials of data literacy in the context of L&D, so that you can start to feel more confident having a conversation about data. It will also explain the basic concepts of data literacy in L&D to help you make data a more integral part of your day-to-day work, and ultimately enable you to harvest the real value of data by using it to make informed decisions, solve complex problems and effectively communicate insights.

And finally, I will highlight some of the most common data interpretation mistakes people make, so that you do not have to make these mistakes yourself.

Happy reading!

Peter

Data Essentials: What is Data? Data, information and insights explained…
What data do we need? L&D data should be easy to obtain, business performance data could be a challenge.
Basic Statistics (without the jargon): Get out your ‘statistics 101’ college book for some essential statistics… made easy!
Messing around with Visuals: Look carefully at charts and you will start to see very interesting things…

Data Essentials: What is Data?

Buckle up for a bit of conceptual and deep thinking. So best not to read this if you are in a hurry! But I promise, once you understand the real meaning of data, information and insights, you will benefit from it for the rest of your life. As a minimum, it will allow you to impress your (non-data-literate!) friends at parties!

There might be more data around you than you realize. According to Wikipedia, data are “a collection of discrete or continuous values that convey information, describing the quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted formally”. This is a very wide definition that basically includes everything we record and everything we ever recorded.

It’s key to realize that data goes beyond numbers and can refer to everything that we record to convey information. This means that even images, video, paintings and music are constructed from data.

Where data is an individual unit that by itself does not really say anything, information is considered a group of data that carries meaning. When you organize and structure data, it becomes information. A single note, number, pixel or letter does not mean a lot, but if you structure it in, let’s say, a song, an equation, an image or a word, it starts to have meaning. When data is processed into information, it becomes interpretable and gains significance.

When you analyze the information and try to find patterns, correlations and trends that you put in context, you gain insights. The Oxford dictionary explains ‘insights’ as having ‘an accurate and deep understanding’ of something, which still sounds somewhat conceptual and vague.

But the nice thing about data is that data-driven insights (if done properly) are more accurate, more consistent and can handle more complexity compared to insights that are not generated using data.

Two simple examples.

Recording completions of AI upskilling programs can be considered data. You can turn that data into information by calculating the total number of completions over a given period of time. This information can be analyzed by looking at the trend: is the number of completions going up or down? When you analyze and see a downward trend, you have your insight. Based on that insight, you can take the decision to increase your marketing efforts to make more employees aware of the amazing AI upskilling opportunities!
A more complex example would be to look at Sales and a ‘Digital Sales’ program intended to improve the sales numbers. In this example, both the program participation and the sales numbers are data. Participation data can be turned into information by calculating the % of Sales professionals who participated in the program. Sales data can be turned into information by calculating the total sales per month.

Then it becomes interesting…

You can turn the participation % information into insights by looking at the trend over time (again). This shows that the % of sales professionals who participate(d) in the program goes up! Great!

However, the trendline of the total sales per month remains flat!

The most valuable insight in this example comes from combining the sales numbers with the participation in the ‘Digital Sales’ program. Using both sets of information, you can do a correlation analysis, which here shows no relationship between participation and sales: the ‘Digital Sales’ program does NOT appear to be contributing to more sales!

This combined insight is a great example of determining the business impact of learning! And in this (rather negative) example, it should lead to a re-evaluation of the ‘Digital Sales’ program, as it looks like this program is not achieving any impact.
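
For those who like to see the mechanics, here is a minimal Python sketch of such a correlation analysis. The monthly figures are invented purely for illustration, and the helper name `pearson` is my own choice; a real analysis would use your actual participation and sales information.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up monthly figures (an assumption for this sketch, not real data)
participation_pct = [10, 20, 30, 40, 50, 60]    # % of sales staff who joined the program
monthly_sales = [100, 99, 101, 102, 99, 100]    # total sales per month, in thousands

r = pearson(participation_pct, monthly_sales)
print(f"correlation between participation and sales: {r:.2f}")
```

A value close to 0 means participation and sales do not move together, a value close to 1 means they rise together. Keep in mind that a number like this is a reason to re-evaluate the program, not final proof of anything.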

Structure turns data into information, analysis turns information into insights, and conclusions turn insights into action.
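
The same chain can be sketched in a few lines of Python, using the AI upskilling example. The records, field names and the very simple "first month versus last month" trend check are all assumptions made for this illustration.

```python
from collections import Counter

# Data: one record per completion (hypothetical records)
completions = [
    {"employee": "E01", "course": "AI Basics", "month": "2024-01"},
    {"employee": "E02", "course": "AI Basics", "month": "2024-01"},
    {"employee": "E03", "course": "AI Basics", "month": "2024-01"},
    {"employee": "E04", "course": "AI Basics", "month": "2024-02"},
    {"employee": "E05", "course": "AI Basics", "month": "2024-02"},
    {"employee": "E06", "course": "AI Basics", "month": "2024-03"},
]

# Information: structure the data into a total per month
per_month = Counter(record["month"] for record in completions)
months = sorted(per_month)
totals = [per_month[month] for month in months]

# Insight: analyze the information by comparing the last month with the first
trend = "down" if totals[-1] < totals[0] else "up or flat"

# Action: a conclusion turns the insight into a decision
action = "increase marketing efforts" if trend == "down" else "keep monitoring"

print(dict(zip(months, totals)), trend, action)
```

In practice you would look at more than two points to judge a trend, but the four steps stay the same.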

What data do we need… and have?

I personally believe that the more data you collect, the more potential insights you can get. And fortunately there are huge amounts of data out there. Here’s a list of essential data you need for Learning Analytics:

People Data

The first essential source of data we use in L&D is naturally your people data. This data typically comes from your core HR system and contains essential data for learning analytics:

Personal data like name, bank account, salary, home address and personal email
Work related data like your work location, your department, your function and your manager
Company related information like your company email, tenure, job grade and your type of employment

In most cases this data is easily accessible, as very often there is an interface between your learning system and your core HR system that allows you to bring over all relevant people data.

People data is typically recorded in a dimension table. A dimension table is a table that contains descriptive elements (also called attributes or dimensions). The people data contains elements that describe each of the employees.

Note of Caution: be very careful with sensitive personal data like gender, pay grade and things like sick leave. While such data can be useful for learning, you should always handle it with care and comply with all data privacy rules and regulations!

L&D Catalogue Data

L&D catalogue data is all the data related to the offerings you have made available to your employees, and is described in much detail in an earlier newsletter on L&D Catalogue Health! It typically sits in your LMS or LXP, and many of us also have third-party content providers that have their own catalogue.

I do want to note here that this is the main data collection that we as L&D control and own. It is our own responsibility (and opportunity!) to make sure we maximize the value of the catalogue by also making sure it contains all the data we need to create information, insights and take action.

Like the people data, catalogue data is ideally also recorded in a dimension table, this time one that describes each and every catalogue item!

Training (transcript) data

People data and catalogue data are typically recorded via data entry or an interface to another system. The real core of L&D data, however, is created through the process of registration and completion! This is where the people data and catalogue data come together: when an employee registers for a program, or completes one.

In technical terms, this is called a fact table, or transactional table as this type of data records facts or transactions.

The essence of the L&D data model is to bring people data and data on learning programs together in a fact table that records everything from registrations, interactions, completions, test results, certifications and more…
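
To show how the two dimension tables and the fact table fit together, here is a small Python sketch. All IDs, field names and values are hypothetical; your HR system and LMS will have their own.

```python
# Dimension table: one row per employee (hypothetical fields and values)
people = {
    "E01": {"department": "Sales", "location": "Amsterdam"},
    "E02": {"department": "IT", "location": "London"},
}

# Dimension table: one row per catalogue item (hypothetical fields and values)
catalogue = {
    "C10": {"title": "Digital Sales", "duration_hours": 8},
    "C20": {"title": "AI Basics", "duration_hours": 2},
}

# Fact table: one row per transaction, pointing at both dimensions by key
transcript = [
    {"employee_id": "E01", "course_id": "C10", "status": "completed"},
    {"employee_id": "E01", "course_id": "C20", "status": "registered"},
    {"employee_id": "E02", "course_id": "C20", "status": "completed"},
]

def enrich(fact):
    """Add the descriptive attributes of the employee and the course to a fact row."""
    return {**fact, **people[fact["employee_id"]], **catalogue[fact["course_id"]]}

rows = [enrich(fact) for fact in transcript]

# Example question: how many completed learning hours per department?
hours_by_department = {}
for row in rows:
    if row["status"] == "completed":
        department = row["department"]
        hours_by_department[department] = (
            hours_by_department.get(department, 0) + row["duration_hours"]
        )

print(hours_by_department)
```

The fact table itself stays slim: it only stores keys and what happened. Everything descriptive lives in the dimension tables and is looked up when you need it.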

Additional Data Sources

People data, catalogue data and transcript data are the foundations for all Learning Analytics efforts. You simply cannot do without them!

Still there are additional sources that are highly relevant and useful.

Evaluation Data: You might use evaluation or survey tools to establish the perceived impact of learning on people and organizational development. These could be simple happy sheets, but I hope they are much more than that! This is no place to go deep into surveys, but I do recommend having a look at Will Thalheimer’s book on Learner Surveys. If you follow his method and take a ‘design for data’ approach, you will end up with a goldmine of data!

Skills Data: Skills data is gaining popularity with all the upskilling initiatives and reports in the news. And skills data is a very valuable addition to your set of data sources. I personally love exploring what skills are covered by what items in your L&D catalogue. That would link skills with L&D assets. Equally interesting is looking at skills linked to people: what skills do people have and what skills do they need. That is when you can start looking at the skill gaps!

Business Performance Data: Last but not least! The data that we need to establish the impact of learning on business performance: Business Performance Data! This data can really be anything, very much depending on the type of organization you are working for: sales data, helpdesk data, production/manufacturing data, quality data, incident data, marketing data, supply chain data. Anything at all. This source of data is so large and diverse that I will probably spend a separate newsletter edition on it to explain how to get your hands on this data and how to link it to learning data!

Basic Statistics (without the jargon)

Statistics are essential when working with learning data. Understanding basic statistical terms can help you interpret insights and make informed decisions for your learning programs. It also helps you to better understand news and scientific articles. And throwing around a few of these terms at birthday parties makes you look really smart! Here are a few key concepts that L&D professionals should know:

1. Average (Mean)

The average, also known as the mean, is one of the most common statistical terms. It’s calculated by adding all values and dividing by the number of values. In the context of L&D, this could represent the average learning hours per employee or the average course completion rate.

For example, if you want to know how many hours, on average, your employees are spending on training, you would calculate the total hours spent by all employees and divide by the number of employees. This gives you an overall picture of training time.

Why it matters for L&D: The average helps you gauge overall performance and is used a lot in setting targets. It is also great when comparing groups that come in different sizes: comparing the total training hours of a large manufacturing department with a small IT team does not make sense. Comparing the average training hours does!

Be Careful! Averages can be deceiving. If I spent 100 hours on learning and you spent zero, our average per person is 50 hours. A nice number, but it hides that one of us did not learn at all!
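
The 100-versus-zero example is easy to check yourself. The sketch below uses only the numbers from the text; the variable names are my own.

```python
# Learning hours per person, as in the example above
hours = {"me": 100, "you": 0}

# The average: add all values and divide by the number of values
average = sum(hours.values()) / len(hours)

# What the average hides: who did not spend any time on learning?
without_learning = [name for name, spent in hours.items() if spent == 0]

print(average)            # 50.0
print(without_learning)   # ['you']
```

A good habit is to always look at the underlying values next to the average before you report it.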

2. Sample Size

The sample size refers to the number of individuals or observations you include in your analysis. In L&D, you might be analyzing how many employees participated in a particular training session or how many courses were completed.

For example, if you're measuring participant ratings of a learning module, your sample size would be the number of employees who rated that course. A larger sample size typically leads to more reliable results, while a small sample size might not accurately reflect trends or patterns.

Why it matters for L&D: A proper sample size ensures your data is representative and meaningful. If you analyze data from only a small group, the results might not be applicable to your entire workforce.

Be Careful! Never jump to conclusions with small sample sizes. At most you could say ‘the numbers suggest…’. Course evaluations and ratings are particularly prone to this trap, as response rates can be very low.
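
One practical way to protect yourself is to always report the number of responses next to the average, and to flag small samples. The sketch below does that for two invented courses. The cut-off of 30 responses is an arbitrary value I picked for this illustration, not an official rule.

```python
MIN_RESPONSES = 30  # hypothetical cut-off, chosen only for this illustration

# Invented ratings on a 1-5 scale
ratings = {
    "Course A": [5, 5],                 # only two responses
    "Course B": [4, 3, 5, 4, 4] * 8,    # forty responses
}

def summarize(scores, min_responses=MIN_RESPONSES):
    """Return the average rating together with a flag for small samples."""
    n = len(scores)
    return {
        "responses": n,
        "average": round(sum(scores) / n, 2),
        "small_sample": n < min_responses,
    }

summary = {course: summarize(scores) for course, scores in ratings.items()}

for course, s in summary.items():
    note = " (small sample: 'the numbers suggest...')" if s["small_sample"] else ""
    print(f"{course}: {s['average']} from {s['responses']} responses{note}")
```

Course A has the higher average, but with only two responses it tells you far less than the lower score of Course B.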

3. Correlation vs. Causation

Understanding the difference between correlation and causation is crucial when interpreting L&D data.

Correlation occurs when two variables are related, but one does not necessarily cause the other. For example, you might notice that employees who complete more training modules also tend to perform better at work. There is a relationship between training and performance, but that doesn't mean the training directly caused the improvement—other factors could be influencing performance.
Causation means that one event directly causes another. In L&D, this would be like saying, "Providing leadership training directly causes an improvement in leadership skills." Causation is harder to prove and requires more detailed analysis, often through controlled experiments or advanced statistical methods.

Why it matters for L&D: Misinterpreting correlation as causation can lead to faulty conclusions. For example, you might think that increasing learning hours will automatically lead to better performance, when other factors like learning quality or learner engagement may play a bigger role.

Be Careful! Mixing up causation and correlation is the number 1 crime in data analysis. Whether done by accident or on purpose, it pays to learn the difference.

Arguably the most used example of confusing correlation with causation is the relationship between ice cream sales and shark attacks. Both tend to increase during the summer months, leading some to mistakenly believe that eating ice cream causes shark attacks. In reality, both are correlated with a third factor: warmer weather.

4. Outliers

Outliers are data points that are significantly different from the rest of your data. For instance, if most learners take 5-10 hours for a course but one takes 50 hours, that would be considered an outlier.

Why it matters for L&D: Outliers can distort your analyses, especially averages. It’s important to identify and understand outliers because they could indicate something unusual (e.g., a learner needing additional support or technical difficulties with the course) or may be errors in the data. It’s recommended to always check your dataset for outliers!
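
Here is a minimal Python sketch of an outlier check, using the course duration example above. The individual values are invented, and the ‘1.5 times the interquartile range’ rule used here is one common convention that I am assuming for the illustration; other rules exist.

```python
from statistics import mean, quantiles

# Hours needed to finish a course: most learners take 5-10 hours, one takes 50
hours = [5, 6, 7, 8, 9, 10, 50]

# Quartiles split the sorted data into four equal parts
q1, _, q3 = quantiles(hours, n=4)
iqr = q3 - q1

# Assumed convention: anything beyond 1.5 * IQR from the quartiles is an outlier
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [h for h in hours if h < lower_fence or h > upper_fence]
typical = [h for h in hours if h not in outliers]

print(outliers)                  # [50]
print(round(mean(hours), 2))     # 13.57, average with the outlier
print(mean(typical))             # 7.5, average without the outlier
```

One single learner almost doubles the average. Whether you then exclude that learner or investigate what happened is a judgment call, but you can only make it once you have spotted the outlier.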

What’s the biggest challenge you face with learning analytics?

First, I'd like your input to make this newsletter as valuable as possible. Click the challenge that you'd most like future content to help you solve (click reply to this email to tell me anything more specific or that's not listed here).

Messing around with Visuals

Nobody wants to learn Excel… Right…??

Have a look at the chart below. It’s a simple chart showing the most in-demand analytics skills in an organization:

What you will notice immediately is that Excel Basics hardly appears in the chart, and that all other skills are massively more in demand. So you could decide to terminate all Excel training and focus on the top 4, right?

WRONG!

That would, I’m afraid, be the wrong decision.

Because if you carefully look at the vertical axis, you will see that it starts not at zero, but at 14. This is often a conscious decision to distort the numbers and lead people to a specific conclusion that you would not draw when seeing the full picture.

That full picture is the one below. You immediately spot that there is a much smaller difference between the top and the bottom. All because the vertical axis starts at zero, as it should! You would also not decide to terminate all your Excel training based on the picture below.
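
You can also put a number on how much a truncated axis exaggerates a difference. In the Python sketch below, the two demand scores are hypothetical (I do not reproduce the real chart values here); only the axis start of 14 comes from the chart discussed above.

```python
def visual_ratio(top, bottom, axis_start):
    """How many times taller the top bar is drawn compared to the bottom bar."""
    return (top - axis_start) / (bottom - axis_start)

# Hypothetical demand scores for the most in-demand skill and for Excel Basics
top_skill, excel_basics = 25, 15

truncated = visual_ratio(top_skill, excel_basics, axis_start=14)
honest = visual_ratio(top_skill, excel_basics, axis_start=0)

print(truncated)           # 11.0, the top bar looks eleven times taller
print(round(honest, 2))    # 1.67, in reality the score is less than twice as high
```

Same data, same bars, but a very different story depending on where the axis starts.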

Unfortunately, the world is full of misleading charts: the news, analyst reports, white papers and even scientific papers are filled with data visuals that at least lead the reader in a specific direction, and sometimes are simply misleading. I personally find this a huge concern in today’s world, where misinformation is all around us!

The more you are aware of these ‘misleadings’ the better you can spot them and see them for what they are.

That is why SLT Consulting will launch its first in a series of crash courses on data literacy. In this crash course you will learn the basics of L&D data, and you will practice with situations where you have to judge whether a certain conclusion is actually the right conclusion and, if not, be able to articulate why not.

Drop me a note or reply to this newsletter if you’re interested in participating in this crash course and we will get back to you as soon as the first sessions become available!

Thank you for joining me on this journey.

Remember, taking the first step is the hardest part, but I’ll be here to guide you along the way.

Let’s make data work for you.

Best,

Peter Meerman

L&D Data Literacy Unraveled