When bias hides inside your training data
How to spot it early, fix it quickly, and protect your customers and business
Defining data bias and how it happens
Have you ever used an AI-powered tool and had it tell you something that was waaaay off base? Not a hallucination per se. We’re not asking if ChatGPT told you it was beneficial to eat at least one rock a day. We’re talking about how LinkedIn thinks our founder would be a great biochemistry professor.
We have every faith in her… but it might take a while to get those credentials.
That’s a playful example of how skewed or biased data can produce results that are not accurate or helpful to customers. There are more problematic examples, such as facial recognition software that was trained mostly on white, male faces. It misidentified women of color most often, and facial recognition mismatches have led to false arrests.
Most companies are not working with stakes that high, but if you are trying to gain and retain customers, bad data in your AI could erode trust and lead to lost revenue or damage to your reputation. It’s not hard for this to happen. You may be using old data from before a marketing pivot, so the outputs are now focused on the wrong customers. Your data set may disproportionately favor power users. For a lot of companies, this shows up in tools used for everyday business purposes like hiring, marketing, and risk scoring. These types of bias can cause silent drift from your business goals.
Spotting bias
As a leader, it’s important to spot bias in your data and address it quickly before it causes you problems. Here’s how:
Compare who uses your product (and who you want to use your product) to who is represented in your data. Ensure they match as closely as possible.
Test your outputs with specific customer types, such as a newcomer, a power user (or frequent flier), and a person whose first language is not the language your company uses. Does your tool produce outcomes that are consistent and seem fair to everyone?
If customers interact with your tool (like a chatbot), review logs and seek out situations where the customer repeats prompts or expresses frustration. Are these always the same types of customers? That may indicate bias.
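If someone on your team is comfortable with a little Python, the first check above (who you serve versus who shows up in your data) can be roughed out in a few lines. This is only a sketch under assumptions: the segment labels, the sample counts, and the 10-point threshold are all made up for illustration, and your own segments and tolerance will differ.

```python
from collections import Counter


def representation_gaps(customer_segments, data_segments, threshold=0.10):
    """Find segments whose share of the data differs from their share of customers.

    Both arguments are lists with one segment label per person or record.
    Returns {segment: (customer_share, data_share)} for every segment where
    the two shares differ by more than `threshold` (an assumed tolerance).
    """
    if not customer_segments or not data_segments:
        raise ValueError("both lists need at least one entry")

    customer_counts = Counter(customer_segments)
    data_counts = Counter(data_segments)

    gaps = {}
    for segment in sorted(set(customer_counts) | set(data_counts)):
        customer_share = customer_counts[segment] / len(customer_segments)
        data_share = data_counts[segment] / len(data_segments)
        if abs(customer_share - data_share) > threshold:
            gaps[segment] = (round(customer_share, 2), round(data_share, 2))
    return gaps


# Hypothetical numbers: who you want to serve vs. who appears in your data.
customers = ["newcomer"] * 50 + ["power_user"] * 20 + ["non_native_speaker"] * 30
training_data = ["newcomer"] * 20 + ["power_user"] * 70 + ["non_native_speaker"] * 10

gaps = representation_gaps(customers, training_data)
for segment, (customer_share, data_share) in gaps.items():
    print(f"{segment}: {customer_share:.0%} of customers, {data_share:.0%} of data")
```

In this made-up example, power users are 20% of customers but 70% of the data, which is exactly the kind of mismatch worth investigating. A spreadsheet pivot table would answer the same question if code is not your thing.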
What you can do to address it, even if you are not a data scientist
You don’t need to be a data scientist or know how to code to avoid bias in your data. In many cases, there are things you can do to tweak or fine-tune the tools you are using.
For a tool like Intercom, you can enable a human handoff so the customer can bypass the bot entirely and go right to a human agent. As long as that agent does not also provide biased service, this can create an immediate solution and will likely build trust in your brand.
Implementing a CSAT or NPS survey for a specific service or your whole company can help you identify trends in satisfaction. If you notice specific groups are consistently expressing dissatisfaction, you may have bias and should investigate.
Tools that accept or deny things like payments (such as Stripe Radar) can be configured to flag anything high-stakes for manual review. This lets you or your team reduce bias by adding another layer of oversight in the areas where it is most needed.
Note trends in usage and check to ensure they are not unintentionally leaving specific groups out. Examples of this include appointment bots that only recognize 9–5 work hours on weekdays, or customer intake forms that don’t recognize PO boxes. If you can, look at website analytics and see if you can find these types of issues.
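For teams that export survey results, the CSAT idea above can be turned into a quick per-group comparison. The sketch below assumes a 1 to 5 score and hypothetical group labels; the function names, the minimum of five responses, and the one-point gap are illustrative choices, not a standard.

```python
from collections import defaultdict


def satisfaction_by_group(responses, min_responses=5):
    """Average the survey score for each customer group.

    `responses` is a list of (group, score) pairs. Groups with fewer than
    `min_responses` answers are skipped because their average is too noisy.
    """
    scores = defaultdict(list)
    for group, score in responses:
        scores[group].append(score)
    return {
        group: sum(values) / len(values)
        for group, values in scores.items()
        if len(values) >= min_responses
    }


def groups_to_investigate(responses, gap=1.0, min_responses=5):
    """List groups whose average trails the overall average by more than `gap`."""
    if not responses:
        return []
    overall = sum(score for _, score in responses) / len(responses)
    averages = satisfaction_by_group(responses, min_responses)
    return sorted(group for group, avg in averages.items() if overall - avg > gap)


# Hypothetical export: customers who book on weekdays vs. evenings and weekends.
survey = [("weekday_daytime", 5)] * 6 + [("evening_weekend", 2)] * 6

print(satisfaction_by_group(survey))
print(groups_to_investigate(survey))
```

A flagged group is a prompt to look closer, not proof of bias. Small samples and one-off incidents can drag an average down, so treat the result as a reason to read the underlying comments.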
How to know if you are getting it right
For starters, you will notice that you are getting the customers you are targeting, consistently, without a high instance of issues or complaints. When issues do arise, your complaint and satisfaction numbers are great metrics to check after a course correction to ensure you are on the right track. And once you are on the right course, those same metrics become leading indicators that something has gone wrong. Unbiased data is not just good from a legal or fairness perspective. It creates the least friction for your customers and helps your bottom line.
Looking for where to start? Check out this free 30-minute data bias worksheet that can help you test for bias and address it.