Breaking Down Data Silos

The better the data, the better the insight. The better the insight, the better the
results. In this issue, we explore how CFEs can help their organizations break down
data silos to improve business transparency.

After 9/11, the U.S. federal government created the Department of Homeland Security (DHS) to combine all or part of 22 different federal departments and agencies into a unified, more effective and integrated body, creating a strengthened U.S. security enterprise with a mission, stated in part, “to ensure a safe, secure, and prosperous Homeland.” (See dhs.gov.) The DHS founders recognized that access to multiple data sources provides better, more insightful information for more effective decision-making. Likewise, in cybersecurity, organizations and government agencies often collaborate in data-sharing consortiums, pooling intelligence across many stakeholders so that each participant has the situational awareness it needs to defend itself against cyber threats.

Ten years ago, the phrase “big data” was all the rage with software and technology companies. That term is no longer relevant because today’s organizations commonly deal with data of large volume, variety and velocity (aka big data, both unstructured and structured).

The U.S. Department of Justice (DOJ) 2023 compliance guidance document, “Evaluation of Corporate Compliance Programs,” asks prosecutors investigating a company to determine whether “compliance and controls personnel have sufficient direct and indirect access to relevant sources of data to allow for timely and effective monitoring and/or test of policies, controls and transactions?” (See “Evaluation of Corporate Compliance Programs,” DOJ, Criminal Division, updated March 2023, tinyurl.com/5n7an3xb.)

So, knowing what we do about the need for data sharing and access to multiple data sources, why do we still often see organizations miss critical fraud risks when the information needed to prevent and detect them was available? More often than not, it’s because different systems within an organization weren’t talking to each other. It’s time to break down these data silos once and for all.

The perils of data silos

The term “data silos” typically refers to isolated data repositories or systems within a company or organization that don’t communicate with each other. Data silos can arise for various reasons, including differences in technology, segregated systems, multiple data formats, and separate departments, business units and organizational structures. Legacy systems, prior acquisitions, a “data hoarding” culture or a lack of data integration tools can also lead to data silos.

Data silos waste time, resources and money, often resulting in inefficient processes, including duplication of efforts, manual data entry (which presents risks) and time-consuming analysis. Even worse, data silos can lead to inaccurate decisions when a holistic view of relevant information is lacking or important facts are missing. In fraud examinations, data silos present an even greater risk because investigators gather critical facts and make inferences based on the available information. If that information is limited in any way, the consequences can be seriously harmful and result in what I call the four negative “I’s” of data silos. (See Figure 1.)

Let’s reflect on the sources of information available to a CFE during an investigation, or when conducting a proactive fraud risk assessment and/or prevention and detection activities. Data gathering generally falls into three categories: (1) interviews, (2) unstructured data such as email communication and user documents, and (3) structured data such as accounting records (payments to vendors, sales from customers, and employee travel and entertainment, as examples) and other transactional tables. Thinking these data sources are mutually exclusive is a mistake and can result in “siloed” thinking. Our educational bias also plays a factor in which data sources we lean into, or at least are most comfortable with. (See “Avoiding bias in your fraud risk management program,” by Vincent Walden, Fraud Magazine, September/October 2019, tinyurl.com/mpn655va.)

In a nutshell, if your education and background are in the legal field, your investigative tendency will be to lean towards emails, user documents and language. After all, you probably didn’t go into law because you loved math. On the flip side, if you studied accounting or mathematics, you’ll tend to lean towards forensic accounting and structured data, such as enterprise resource planning/accounting systems, databases and the like. You didn’t go into accounting because you loved writing or literature. Finally, if your education and background are in criminology and law enforcement, you’ll have a tendency towards interviewing. Again, these are all generalizations, but today’s fraud examiners know that the key insights lie where all three categories of information work together. (See Figure 2.)

Strategies to break down data silos

At an organizational level, to break down data silos, you must also break down barriers across the organizational divisions within the company. One way to do this is to build a system that benefits all participants (or multiple key business functions). This can be challenging when trying to appease all stakeholders, especially the information technology (IT) team. IT may not see direct benefits from assembling all the data in a single location; in fact, doing so may put certain access controls and security policies at risk. Ironically, IT is the key player in this equation because it’s often the custodian of the data (the lifeblood of the organization), and its priority is to keep that data secure and running correctly. Convincing IT can be difficult and will often require senior leadership support from other departments, such as finance, marketing, sales, human resources (HR), operations, legal and compliance. Having a clear business case, return on investment, risk assessment and specific data requests will make all the difference in accelerating the integration efforts. In my experience, some strategies for helping make the business case for breaking down data silos include:
  • Looking at data integration platforms: Investing in data integration tools and software platforms with open “application programming interfaces” (APIs) that enable the import and extraction of data through the tool is a key step in bringing together multiple data sources.
  • Data governance: Working with your organization to establish data ownership, policies and standards when sharing data helps eliminate confusion and reassures IT and legal that data will be handled in line with corporate policy.
  • Cross-functional teams: I often see legal compliance working with internal audit, and pulling in IT, finance and HR to conduct their proactive fraud prevention and detection activities or support an investigation when needed. As mentioned, showing that an investment in data aggregation will benefit multiple business processes, not just yours, will make a key difference with management.
  • Data warehouse or data lake: As a result of that “big data” revolution a decade or more ago, many companies have since integrated a data warehouse, or “data lake,” that centralizes the data storage of key information repositories, including financial accounting data, contracts, sales activity, and even email and user documents, into a single location to provide faster access and data insight. Make sure you know about these data warehouses or data lakes in your organization and be a part of the discussions when IT is helping design or improve on these types of data aggregation initiatives.
  • Web/cloud services: Nowadays, many business processes are stored and managed in public and private cloud services, such as Microsoft Azure, Amazon Web Services (AWS), IBM Cloud and Google Cloud, among others. Storing data centrally and securely in the cloud helps consolidate data of all types and formats, bringing data ingestion, processing, transformation and storage together to facilitate data analysis.

With any data analytics initiative, you need to start by asking the right business questions. For fraud examiners wishing to conduct proactive fraud prevention and detection, the best place to start is your risk assessment. In an investigation, the starting points are the allegations. At this beginning phase, it’s a common trap to get caught up in the single data source where you think the relevant information lies. If you’re searching for high-risk payments, you might jump to the accounts payable subledger. If you’re responding to an investigation, you might first jump to the suspect’s email. As we’ve previously discussed, let’s not fall into that trap, and instead ask more holistic questions of the data. If the risk or accusation is around payments to third-party vendors, for example, you’ll want to consider multiple data sources including the vendor master data, the accounts payable subledger, invoices, payments and purchase order data. You’ll even want to consider non-accounting data such as the due diligence that was performed (or not performed) on the third party, and what email communications, if any, involve that vendor.
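
To make the vendor-payment example concrete, here is a minimal Python sketch of what asking a holistic question across several sources might look like. Everything in it is an assumption made for illustration: the field names, the sample records and the helper functions flag_payment and review_payments are hypothetical, and a real review would draw on the organization's actual systems and risk assessment. Email review is left out to keep the sketch short.

```python
# Hypothetical sample data standing in for three separate systems.
# None of these records, IDs or field names come from a real schema.
VENDOR_MASTER = {
    "V001": {"name": "Acme Supplies", "due_diligence_done": True},
    "V002": {"name": "Blue Harbor Consulting", "due_diligence_done": False},
}

PURCHASE_ORDERS = {
    "PO-100": {"vendor_id": "V001", "amount": 5000.00},
    "PO-101": {"vendor_id": "V002", "amount": 12000.00},
}

PAYMENTS = [
    {"payment_id": "P1", "vendor_id": "V001", "po_number": "PO-100", "amount": 5000.00},
    {"payment_id": "P2", "vendor_id": "V002", "po_number": "PO-101", "amount": 15000.00},
    {"payment_id": "P3", "vendor_id": "V003", "po_number": None, "amount": 9900.00},
]


def flag_payment(payment, vendors, purchase_orders):
    """Return the red flags found by cross-checking one payment."""
    flags = []

    # Check the payment against the vendor master and due diligence status.
    vendor = vendors.get(payment["vendor_id"])
    if vendor is None:
        flags.append("vendor not in vendor master")
    elif not vendor["due_diligence_done"]:
        flags.append("no due diligence on file")

    # Check the payment against purchase order data.
    po = purchase_orders.get(payment["po_number"])
    if po is None:
        flags.append("no matching purchase order")
    else:
        if po["vendor_id"] != payment["vendor_id"]:
            flags.append("purchase order belongs to another vendor")
        if payment["amount"] > po["amount"]:
            flags.append("payment exceeds purchase order amount")
    return flags


def review_payments(payments, vendors, purchase_orders):
    """Map each flagged payment ID to its list of red flags."""
    results = {}
    for payment in payments:
        flags = flag_payment(payment, vendors, purchase_orders)
        if flags:
            results[payment["payment_id"]] = flags
    return results


if __name__ == "__main__":
    flagged = review_payments(PAYMENTS, VENDOR_MASTER, PURCHASE_ORDERS)
    for payment_id, flags in flagged.items():
        print(payment_id, "->", "; ".join(flags))
```

None of these flags would surface from the accounts payable subledger alone. Each one appears only when the payment is compared with a second source, which is the practical argument for bringing the data together.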

Professor Jennifer Arlen, director of corporate compliance and enforcement at New York University’s School of Law, describes the importance of analyzing/integrating multiple data sources to effectively assess misconduct or unethical behavior, writing: “Companies can obtain the information needed to make these assessments through (1) internal reporting hotlines; (2) decision advisory hotlines; (3) well-designed surveys given months after training to assess employees’ reactions to scenarios implicating choices between compliance and profits; (4) exit interviews; (5) adoption of an analytic detection system that incorporates data from internal hotlines, HR complaints about unethical behavior (including sexual harassment), consumer complaints, and (6) carefully calibrated performance indicators that can raise red flags about potential misconduct. Advances in AI-assisted monitoring of performance and transaction data may also prove a boon to identifying ‘red flags’ or anomalies in data that may be predictive of suspicious conduct. Firms also should audit their systems for detecting and investigating misconduct to determine whether those systems are working well.” [See “The Compliance Function,” by Jennifer Arlen, The Oxford Handbook of Corporate Law and Governance (Jeffrey N. Gordon and Wolf-Georg Ringe, eds., Oxford University Press 2d ed., forthcoming) at tinyurl.com/mwta5v4f.]

Vendors, customers and employees

If you think about the ACFE Fraud Tree and the occupational fraud schemes it addresses, there are only three entities that can rip you off in the context of those schemes: vendors, customers (or distributors) and employees. (See “The Fraud Tree,” tinyurl.com/4c4vhs7u.) This doesn’t include external threat actors in the context of cyber threats, but when it comes to the business functions of the company, those are the only three types of entities we’re concerned with. As you know, they often work together or require one another (often without the other party knowing), touching multiple data sources to commit the malfeasance. A rogue employee may overcharge a customer or receive a kickback from a vendor. A rogue, or even fake, vendor may submit multiple bogus invoices to the company, where they’re approved by an employee. You get the idea. They interact via accounting systems, approval processes and email, to name a few.

Keeping a 360-degree view of customers, vendors and employees is paramount to an effective control environment for fraud prevention and detection. And breaking down the data silos to get there is something we should all be working towards. Keep innovating!

Vincent M. Walden, CFE, CPA, is the CEO of konaAI, an AI-driven anti-fraud and compliance technology company providing easy-to-use, cost-effective third-party payment and transaction analytics software around corruption, investigations, fraud prevention, internal audit and compliance monitoring. He welcomes your feedback and ideas. Contact Walden at [email protected].

This article was originally published in the January/February 2024 issue of Fraud Magazine.