Graphs, AI, and the Fight Against Hidden Networks
When it comes to financial crime, the old saying applies: defenders think in lists, attackers think in graphs. Paco Nathan, a veteran of AI and data science with over four decades in the field, has built his career around that insight. Today, his work demonstrates how graph analytics, entity resolution, and AI can uncover the hidden structures behind money laundering, tax evasion, and even human trafficking — while also opening opportunities to better understand customers and markets.
In a recent ODSC Ai X Podcast conversation, Paco Nathan explored the technologies and strategies reshaping investigations into “dark money” networks. His perspective draws not only from his current work at Senzing, where entity resolution powers large-scale investigations, but also from a career spanning Bell Labs, Databricks, and the early days of Hadoop and Spark.
You can listen to the full podcast on Spotify, Apple, and SoundCloud.
From Neural Nets to the Dark Web
Nathan’s journey started in the early 1980s, when he took the then-risky path of studying artificial intelligence and neural networks long before their mainstream adoption. He went on to work in machine learning, natural language processing, and large-scale data infrastructure, helping to pioneer cloud-native analytics and contributing to the rise of Spark at Databricks.
That background now informs his focus on financial crime. At Senzing, technology originally built to spot threats in casinos has evolved into one of the most widely used libraries for entity resolution: the process of determining whether different records actually refer to the same person or organization. “The same technologies used for finding the worst criminals in your network,” Nathan explains, “can also be used for finding the best customers in your network.”
Graphs as Weapons Against Complexity
Money laundering and tax evasion aren’t just about moving cash. They involve vast, distributed networks of shell companies, insider threats, fake invoices, and circular transactions. Human analysts can’t realistically trace thousands of interlinked transfers. Graph analytics, however, can.
Nathan points to high-profile cases like the Danske Bank scandal, where more than €200 billion in suspicious funds flowed through a single branch in Estonia. The key to unraveling such schemes wasn’t looking at single transactions in isolation, but examining the entire network structure. By modeling relationships as graphs, investigators could detect recurring patterns — such as companies acting simultaneously as suppliers and customers, or funds rapidly drained through layers of intermediaries.
This is where AI adds power. Combining entity resolution with probabilistic models and graph algorithms allows systems to recognize motifs of fraud that repeat across networks, even as criminals continually adapt their methods. As Nathan notes, “Defenders think in lists, attackers think in graphs, and as long as that holds, the attackers win.”
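The two structural patterns mentioned above, companies that act as both supplier and customer and funds that loop back through intermediaries, can be illustrated with a minimal sketch. It uses only the Python standard library, and the ledger, company names, and function names are all invented for illustration; it is not a description of how any real investigation or product works.

```python
# Hypothetical toy ledger of (payer, payee, amount). All names are invented.
TRANSFERS = [
    ("AlphaLtd", "BetaLLC", 100_000),
    ("BetaLLC", "GammaSA", 98_000),
    ("GammaSA", "AlphaLtd", 96_000),   # closes a three-party loop
    ("DeltaOU", "AlphaLtd", 5_000),
    ("AlphaLtd", "DeltaOU", 5_000),    # DeltaOU is both customer and supplier
    ("EpsilonAG", "BetaLLC", 12_000),
]

def build_graph(transfers):
    """Build adjacency sets for a directed payer -> payee graph."""
    graph = {}
    for payer, payee, _amount in transfers:
        graph.setdefault(payer, set()).add(payee)
        graph.setdefault(payee, set())  # keep payees that never pay anyone
    return graph

def two_way_pairs(graph):
    """Pairs of entities that pay each other directly."""
    return sorted(
        (a, b) for a in graph for b in graph[a]
        if a < b and a in graph[b]
    )

def find_cycles(graph, max_len=4):
    """Simple directed cycles with 3..max_len nodes, each reported once.

    A cycle is only explored from its alphabetically smallest node,
    which avoids reporting rotations of the same loop.
    """
    cycles = []

    def walk(start, node, path):
        for nxt in sorted(graph[node]):
            if nxt == start and len(path) >= 3:
                cycles.append(tuple(path))
            elif nxt > start and nxt not in path and len(path) < max_len:
                walk(start, nxt, path + [nxt])

    for start in sorted(graph):
        walk(start, start, [start])
    return cycles

graph = build_graph(TRANSFERS)
print(two_way_pairs(graph))  # [('AlphaLtd', 'DeltaOU')]
print(find_cycles(graph))    # [('AlphaLtd', 'BetaLLC', 'GammaSA')]
```

Neither pattern is visible when each transfer is reviewed on its own as a row in a list; both only appear once the transfers are treated as edges. A real system would also weigh amounts, timing, and resolved identities, which this sketch ignores.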
Why Entity Resolution Still Matters
Despite decades of progress in machine learning, entity resolution remains a foundational, and unsolved, problem. The challenge lies in merging fragmented, inconsistent, and multilingual data into coherent profiles without introducing false positives. “You don’t want to just go out and do a bunch of fuzzy string matching and say you’re done,” Nathan warns. The stakes are high: a wrong decision about identity can block a loan, deny a passport, or trigger a wrongful arrest.
Instead, robust systems must balance probabilistic evidence, audit trails, and millisecond-scale decisions across billions of records. And increasingly, the outputs of entity resolution form the nodes and edges that drive knowledge graphs — critical inputs for modern AI pipelines.
In practice, this means integrating taxonomies, vocabularies, and domain-specific thesauri into AI systems to constrain meaning and enforce consistency. As Nathan explains, “If agents are taking actions, what are they building on? What are you feeding into the LLM, and what kind of guardrails are you putting on the results?”
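To make the warning about fuzzy string matching concrete, here is a minimal sketch that combines several pieces of evidence before merging records and keeps an audit trail of why each merge happened. It is not Senzing's API or method: the records, the integer weights, the 0.8 name cutoff, and the merge threshold are all assumptions chosen for illustration, and the all-pairs comparison would not scale to billions of records.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical records; every field value is invented for illustration.
RECORDS = [
    {"id": "r1", "name": "Jon A. Smith", "dob": "1980-02-11", "phone": "555-0101"},
    {"id": "r2", "name": "John Smith", "dob": "1980-02-11", "phone": "555-0101"},
    {"id": "r3", "name": "John Smith", "dob": "1975-07-30", "phone": "555-0199"},
    {"id": "r4", "name": "Joan Smythe", "dob": "1991-12-05", "phone": "555-0142"},
]

def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def score_pair(a, b):
    """Return (points, reasons). Weights are assumed, not calibrated."""
    points, reasons = 0, []
    sim = name_similarity(a["name"], b["name"])
    if sim >= 0.8:
        points += 4
        reasons.append(f"name similarity {sim:.2f}")
    if a["dob"] == b["dob"]:
        points += 4
        reasons.append("same date of birth")
    if a["phone"] == b["phone"]:
        points += 3
        reasons.append("same phone")
    return points, reasons

def resolve(records, threshold=7):
    """Merge records whose combined evidence reaches the threshold."""
    parent = {r["id"]: r["id"] for r in records}
    audit = []  # one entry per merge: (left, right, points, reasons)

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(records, 2):
        points, reasons = score_pair(a, b)
        if points >= threshold:
            parent[find(b["id"])] = find(a["id"])
            audit.append((a["id"], b["id"], points, reasons))

    clusters = {}
    for r in records:
        clusters.setdefault(find(r["id"]), []).append(r["id"])
    return sorted(clusters.values()), audit

clusters, audit = resolve(RECORDS)
print(clusters)  # [['r1', 'r2'], ['r3'], ['r4']]
for left, right, points, reasons in audit:
    print(left, right, points, reasons)
```

Records r2 and r3 have identical names, so name matching alone would merge two different people. Requiring corroborating evidence keeps them apart, and the audit list records the reasons behind the one merge that does occur.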
Enter GraphRAG
One of the hottest approaches Nathan highlights is GraphRAG (graph-based retrieval-augmented generation). Traditional RAG systems use embeddings and vector search to pull relevant chunks of text into large language models. But they can struggle with recall, sometimes missing critical connections in passages that are not semantically close to the query.
GraphRAG tackles this by layering a knowledge graph over the text corpus. Parsing documents into entities and relationships, then running graph algorithms such as PageRank or other centrality measures, surfaces key concepts that embeddings alone might miss. This allows systems to answer questions with more accuracy and context. For instance, a system can link dementia risk studies to specific researchers and hospitals even when the terms don’t appear explicitly.
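The idea can be sketched in a few lines, under loud assumptions: entity extraction is taken as already done, a plain entity lookup stands in for vector search, and the chunk contents, entity names, and function names are all hypothetical. The sketch builds a co-occurrence graph over entities, ranks them with a basic PageRank power iteration, and expands retrieval by following graph edges.

```python
from itertools import combinations

# Hypothetical chunks mapped to the entities assumed to be extracted from them.
CHUNKS = {
    "c1": ["dementia risk", "Dr. Rivera"],
    "c2": ["Dr. Rivera", "Northside Hospital"],
    "c3": ["Northside Hospital", "sleep study"],
    "c4": ["crop yields", "soil sensors"],
}

def build_entity_graph(chunks):
    """Link two entities whenever they appear in the same chunk."""
    graph = {}
    for entities in chunks.values():
        for a, b in combinations(sorted(set(entities)), 2):
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph

def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank on an undirected graph.

    Assumes every node has at least one neighbor, which holds here
    because nodes are only created in linked pairs.
    """
    n = len(graph)
    rank = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        rank = {
            node: (1 - damping) / n
            + damping * sum(rank[nb] / len(graph[nb]) for nb in graph[node])
            for node in graph
        }
    return rank

def retrieve(query_entity, chunks, graph, hops=1):
    """Return chunk ids mentioning the query entity or entities within `hops`."""
    wanted = {query_entity}
    frontier = {query_entity}
    for _ in range(hops):
        frontier = {nb for node in frontier for nb in graph.get(node, ())} - wanted
        wanted |= frontier
    return sorted(cid for cid, ents in chunks.items() if wanted & set(ents))

graph = build_entity_graph(CHUNKS)
ranks = pagerank(graph)
print(sorted(ranks, key=ranks.get, reverse=True)[:2])
print(retrieve("dementia risk", CHUNKS, graph, hops=0))  # ['c1']
print(retrieve("dementia risk", CHUNKS, graph, hops=1))  # ['c1', 'c2']
```

With zero hops only the chunk that names the query entity comes back. With one hop the hospital chunk is included as well, even though it never mentions dementia, because the graph connects it through the shared researcher. The path from query to chunk is also an explicit trail that can be shown to a reviewer.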
For Nathan, GraphRAG exemplifies the future of hybrid AI: marrying the symbolic reasoning of graphs with the generative power of LLMs. It’s not just about better accuracy, but also about accountability and explainability. By tracing inferences through graph structures, investigators can show their work — something that courts and regulators increasingly demand.
Patterns of Fraud, Patterns of Opportunity
Perhaps the most intriguing insight Paco Nathan shares is that the same techniques used to trace hidden criminal networks can also unlock business value. Graph motifs that signal insider threats or laundering can, in another context, highlight loyal customer segments, product affinities, or emerging market opportunities.
This dual use underscores the postmodern complexity of finance: sometimes a bank’s most profitable customer and its most dangerous criminal are hard to tell apart. But it also underscores the power of data science. By learning to think in graphs rather than lists, organizations can both protect themselves from systemic risks and discover new pathways for growth.
The Role of Open Data
Open data plays a crucial role in this ecosystem. Initiatives like OpenSanctions, OpenOwnership, and OpenCorporates compile global datasets on politically exposed persons, sanctioned entities, and beneficial ownership structures. These sources allow investigators, journalists, and even smaller businesses to integrate risk intelligence into their workflows.
When combined with synthetic data simulators that model fraudulent behavior, such datasets give AI practitioners a sandbox for developing and testing new detection methods. As Paco Nathan emphasizes, open data doesn’t just help identify risks — it strengthens transparency and accountability across the financial system.
Toward Neuro-Symbolic AI
Looking ahead, Paco Nathan sees promise in neuro-symbolic AI: blending the probabilistic, data-driven strengths of LLMs with the structured reasoning of symbolic systems like graphs and ontologies. This hybrid approach, he argues, is essential for constraining context, reducing hallucinations, and ensuring trustworthy outputs in high-stakes domains.
“LLMs can do so much,” he says, “but really constraining the context is the game.” For fields like fraud detection, compliance, and cybersecurity, that means combining semantic layers, graph algorithms, and AI workflows in a way that balances scale, accuracy, and explainability.
Conclusion
The fight against dark money isn’t just about policing criminals — it’s about rethinking how organizations handle complexity. Paco Nathan’s work shows that by embracing entity resolution, graph analytics, and hybrid AI techniques like GraphRAG, practitioners can build systems that not only uncover hidden networks of crime, but also reveal hidden networks of opportunity.
As the volume and velocity of global transactions continue to grow, the organizations that learn to think in graphs — and not just lists — will be the ones prepared to defend against threats, comply with regulations, and seize new advantages in the age of AI.
