Most enterprises use only 12% of their data. Discover how AI transforms unused dark data into competitive advantages through real case studies like Envision Racing's championship-winning strategy.
For five years, the Envision Racing team had been collecting something that seemed valuable but felt worthless: thousands of hours of radio communications from over 100 Formula E races. More than 20 drivers speaking in code, using acronyms, discussing strategy in real-time—all broadcast on open frequencies that anyone could listen to.
The coded language made it nearly impossible to extract actionable intelligence. Understanding what rival drivers were saying about when to overtake or when to brake could provide a crucial competitive advantage. But human listeners needed 5-10 seconds just to process what they were hearing, far too slow for split-second racing decisions.
This mountain of audio recordings represented a perfect example of dark data: information that organizations collect and store during regular business activities but fail to use for other purposes. The racing team was paying to store thousands of hours of potentially valuable intelligence that provided zero competitive advantage.
Then something changed. By working with AI specialists, Envision Racing transformed this dark data into a competitive weapon. Using natural language processing and deep learning models, the AI could decode and analyze the communications in 1-2 seconds, providing real-time strategic insights. The result? First and third place finishes in New York.
Sylvain Filippi, Managing Director and CTO of the team, says motorsports are an excellent “...case study for how to use data because many companies have a lot of data … every mileage [sic] on the car generates a huge amount of data. What's interesting is that we have to learn to really structure the data … what companies will need to do more in the future is to use all of their data and use it really quickly.”
Your organization likely faces the same challenge with its own dark data, just on a different scale. But the opportunity is enormous. As Andi Gutmans, GM and VP for Data Cloud at Google, puts it, “2025 is the year where dark data lights up … AI and improved data systems will enable businesses to easily process and analyze all of this unstructured data in ways that will completely transform their ability to reason about and leverage their enterprise-wide data.”
Gartner defines dark data as "the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes." Think of it as business intelligence trapped in digital storage, which costs money to maintain and provides zero ROI.
This isn't a minor issue. Studies show that 60% of organizations report that half or more of their data is dark, and one-third report that 75% or more falls into this category. IBM finds that organizations typically use only 12% of the data they collect.
The modern workplace has created a perfect storm for dark data accumulation. Critical business information becomes trapped across different platforms—Slack conversations, Google Drive documents, email threads, CRM notes, and support tickets. Each platform creates its own information silo, making it nearly impossible for teams to access insights that could drive better business decisions.

Dark data typically falls into one of two categories: structured or unstructured.
While structured data sits neatly in rows and columns, unstructured data requires specialized technologies to extract meaningful insights. The challenge is that unstructured data often contains the richest business intelligence—customer sentiment, market trends, operational inefficiencies—but remains the most difficult to analyze using traditional methods.

Activating dark data can offer a potent competitive advantage, and AI now makes it economically feasible. Research indicates that AI can process previously unusable data at a fraction of traditional costs. For example, ChatGPT analysis costs approximately $0.003 per annotation, roughly 20 times cheaper than human analysis.
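To make the cost comparison concrete, here is a minimal back-of-the-envelope sketch. The two constants come from the figures quoted above; the backlog size of one million items and the function name `annotation_costs` are assumptions chosen purely for illustration, and because the 20x ratio is approximate, the output should be read as a rough estimate.

```python
# Figures quoted in the text: about $0.003 per AI annotation,
# and human analysis costing roughly 20 times as much.
AI_COST_PER_ANNOTATION = 0.003
HUMAN_COST_MULTIPLIER = 20  # approximate ratio, so results are estimates


def annotation_costs(num_items: int) -> dict:
    """Estimate AI vs. human annotation cost (USD) for a backlog of items."""
    ai_cost = num_items * AI_COST_PER_ANNOTATION
    human_cost = ai_cost * HUMAN_COST_MULTIPLIER
    return {
        "ai": round(ai_cost, 2),
        "human": round(human_cost, 2),
        "savings": round(human_cost - ai_cost, 2),
    }


if __name__ == "__main__":
    # Hypothetical backlog of one million stored records.
    print(annotation_costs(1_000_000))
```

Under these assumptions, the quoted per-annotation price implies a human cost of about $0.06 per item, so the gap widens linearly with the size of the backlog.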
BrandAlley, a UK flash-sale e-commerce platform, had customer behavior data sitting unused in its systems while it paid ongoing storage costs. The company used embedding techniques to analyze product relationships and customer preferences hidden in its transaction history and browsing patterns. By applying AI to surface these previously inaccessible insights, it achieved 77% higher conversion rates, a 68% increase in average order value, and 60% growth in revenue per user, turning a storage cost into significant revenue growth.
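The general idea of finding product relationships in transaction history can be illustrated with a small sketch. This is not BrandAlley's method, which the source does not describe in detail. It swaps learned embeddings for simple co-purchase count vectors compared with cosine similarity, and the basket data and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def cooccurrence_vectors(baskets):
    """Map each product to a sparse vector of co-purchase counts."""
    vectors = defaultdict(lambda: defaultdict(int))
    for basket in baskets:
        # Count each unordered pair of distinct products once per basket.
        for a, b in combinations(sorted(set(basket)), 2):
            vectors[a][b] += 1
            vectors[b][a] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(product, vectors, top_n=3):
    """Rank other products by similarity of their co-purchase profiles."""
    target = vectors.get(product, {})
    scores = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != product
    ]
    return sorted(scores, key=lambda s: s[1], reverse=True)[:top_n]


if __name__ == "__main__":
    # Hypothetical order history; each inner list is one basket.
    baskets = [
        ["boots", "scarf", "coat"],
        ["boots", "coat"],
        ["sandals", "sunglasses"],
        ["sandals", "sunglasses", "hat"],
        ["coat", "scarf"],
    ]
    print(most_similar("boots", cooccurrence_vectors(baskets)))
```

Two products score as similar here when they tend to be bought alongside the same other items, even if they rarely share a basket themselves. A production system would more likely use learned dense embeddings, but the ranking step would look much the same.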
Halliburton collected petabytes of historical data from hundreds of sensors on drilling equipment over more than 20 years, stored in various formats (text files, PDFs, and Excel spreadsheets) across different systems. This "really dark data" included electromagnetic rock scans and nuclear magnetic resonance readings, essentially MRIs for oil wells, which accumulated storage costs without providing operational value. Using machine learning to activate this previously unusable sensor data, the company built predictive maintenance systems that identify equipment failures before they occur, reducing downtime and maintenance costs that had previously run to millions in losses.
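As a rough illustration of how stored sensor history can feed early-warning logic, the sketch below flags readings that deviate sharply from a trailing window. It is a deliberately simple statistical stand-in, not Halliburton's system, which the source describes only as machine learning. The vibration values, window size, and threshold are all hypothetical.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that sit more than `threshold` standard
    deviations away from the mean of the preceding `window` readings."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            continue  # flat history: no scale to judge deviation against
        if abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Hypothetical vibration readings with one sudden spike.
    vibration = [0.50, 0.52, 0.49, 0.51, 0.50, 0.51, 0.49, 0.95, 0.50]
    print(flag_anomalies(vibration))
```

Real predictive maintenance models typically combine many sensors and learn failure signatures from labeled history, but the starting point is the same: historical readings have to be parsed into a consistent numeric series before any model can use them.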
Successfully activating dark data requires balancing significant opportunities with potential challenges. These considerations help ensure your initiative delivers value while avoiding common pitfalls.
When AI systems are trained on massive datasets, sensitive information can inadvertently become embedded in the resulting models, creating hidden vulnerabilities that standard security audits miss. The result is compliance exposure and a higher risk of data breaches.
North American Bancard, a payment processing company, faced this exact challenge when implementing AI analysis of its customer data. They solved it by implementing automated metadata flagging systems to identify and protect sensitive data before AI processing began. Their approach includes data sanitization workflows that preserve analytical value while protecting customer privacy. This demonstrates the importance of establishing security safeguards early rather than retrofitting them after implementation.
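A flag-then-sanitize step of the kind described above might look roughly like the following. This is a sketch under stated assumptions, not North American Bancard's implementation. The two regular expressions are illustrative and far from exhaustive, and the `sanitize` function and placeholder format are hypothetical.

```python
import re

# Illustrative patterns only; real deployments need broader, validated rules.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "card_number": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}


def sanitize(record: str):
    """Return (redacted_text, flags), where flags lists the detected types."""
    flags = []
    clean = record
    for label, pattern in PATTERNS.items():
        if pattern.search(clean):
            flags.append(label)
            # Replace the sensitive value with a typed placeholder so the
            # surrounding text keeps its analytical value.
            clean = pattern.sub(f"[{label.upper()}]", clean)
    return clean, flags


if __name__ == "__main__":
    note = "Refund to jane.doe@example.com for card 4111 1111 1111 1111 declined"
    print(sanitize(note))
```

Keeping the flags alongside the redacted text means downstream tools can see that a record once held sensitive data without ever seeing the data itself.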
Poor data quality is one of the biggest obstacles to successful dark data activation. Gartner predicts that at least 30% of generative AI projects will be abandoned after proof of concept by the end of 2025 due to poor data quality, inadequate risk controls, escalating costs, or unclear business value.
The challenge becomes cyclical: poor quality dark data leads to unreliable AI insights, which creates distrust in AI systems, leading to further neglect of data quality initiatives. Organizations must establish systematic approaches to data quality management, starting with analyzing current issues and identifying root causes, then implementing preventive measures.
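The first step, analyzing current issues, can start with a very small profiling pass. The sketch below counts missing required fields and exact duplicate records. The field names and sample records are hypothetical, and it assumes each record is a flat dictionary with hashable values.

```python
from collections import Counter


def profile_quality(records, required_fields):
    """Summarize missing required fields and exact duplicate records."""
    missing = Counter()
    for rec in records:
        for field in required_fields:
            value = rec.get(field)
            # Treat absent, None, and blank strings as missing.
            if value is None or str(value).strip() == "":
                missing[field] += 1
    # Records are duplicates when every key/value pair matches.
    seen = Counter(tuple(sorted(rec.items())) for rec in records)
    duplicates = sum(count - 1 for count in seen.values() if count > 1)
    return {
        "total": len(records),
        "missing": dict(missing),
        "duplicates": duplicates,
    }


if __name__ == "__main__":
    # Hypothetical CRM export.
    records = [
        {"id": 1, "email": "a@example.com", "notes": "call back"},
        {"id": 2, "email": "", "notes": None},
        {"id": 1, "email": "a@example.com", "notes": "call back"},
    ]
    print(profile_quality(records, ["email", "notes"]))
```

A report like this gives a baseline to track over time, which helps show whether preventive measures are actually reducing the issues they target.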
A significant risk in dark data activation often comes from organizational resistance rather than technical challenges. Teams generate valuable data on various platforms that other departments are either unaware of or unable to access. This silo problem requires intentional governance structures to overcome.
Common governance pitfalls include unclear data stewardship responsibilities scattered across teams without accountability, leading to gaps between AI development and compliance teams. Different departments often create conflicting policies and data access controls without coordination, creating compliance vulnerabilities that undermine dark data initiatives.

Dark data represents both an untapped resource and one of the most significant missed opportunities in modern enterprises. Organizations are paying substantial costs to store information that could drive competitive advantages, optimize operations, and generate new revenue streams. The case studies outlined above (Envision Racing, BrandAlley, and Halliburton) demonstrate how AI technologies can unlock tremendous value from seemingly unusable data sources.
AI tooling for dark data activation has matured rapidly and is now accessible to most enterprises. The primary barriers are organizational rather than technical.
