The Tangled Web of Cyber Data Mining: A Practitioner’s Perspective


As someone who has spearheaded hundreds of cyber data mining projects, I’ve navigated the complex maze of extracting personally identifiable information (PII) from unstructured data to provide notification lists to affected end clients, who are simultaneously facing unprecedented IT and business interruption issues. This process, crucial for notifying individuals about potential data breaches, is fraught with challenges whose complexity is often underestimated. Today, I want to shed light on the intricacies of this task and briefly share some of my work here at iCONECT, where we’ve been fine-tuning new solutions to these persistent hurdles.

1. The Daunting Manual Process

Data mining in cyber incidents is predominantly a manual affair, resembling a meticulous data extraction and data entry operation. Reviewers sift through source documents to identify PII and transfer this information into a separate database. The existing tools are not purpose-built for this task, so workflows are often more about accommodation than optimization. Imagine the potential for errors when each data point, among thousands or even millions of data subjects, carries the risk of a typo. The result? Inaccuracies that multiply with every record entered, undermining data normalization and producing a subpar final product.

2. Unyielding Regulatory Expectations

The regulatory landscape adds another layer of complexity. High fines and penalties loom over organizations that fail to notify effectively. Here, the balance tips heavily in favor of an individual’s right to know about their compromised data, outweighing any consideration of what a reasonable process can achieve. This expectation of perfection, especially under tight timelines, sets a standard that often feels unattainable given the limits of manual processing.

3. The Normalization Nightmare

Normalizing and deduplicating data is akin to assembling a jigsaw puzzle with missing pieces. Different source documents for a single data subject often contain only fragments of information – a name here, an SSN there, occasionally a bank account number or an address. Collating these bits into a coherent, singular profile of an individual is a Herculean task, compounded by the sparse and scattered nature of the information. 
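To make the jigsaw concrete, here is a minimal Python sketch of one common way to collate fragments: treat each extracted record as a node, link records that share a strong identifier, and merge each linked group into a single profile. This illustrates the general technique and is not a description of iCONECT’s software. The field names (`name`, `ssn`, `account`, `address`), the helper names, and the rule that a shared SSN or account number means the same person are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical field names; only these are treated as strong enough to link on.
LINK_FIELDS = ("ssn", "account")


def normalize(field, value):
    """Canonicalize a value so formatting differences do not block a match."""
    if field in LINK_FIELDS:
        return "".join(ch for ch in value if ch.isdigit())  # keep digits only
    return " ".join(value.lower().split())  # lowercase, collapse whitespace


def merge_fragments(fragments):
    """Group fragments that share an identifier, then merge each group."""
    parent = list(range(len(fragments)))  # union-find forest over fragments

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_seen = {}  # (field, normalized value) -> index of first fragment
    for idx, frag in enumerate(fragments):
        for field in LINK_FIELDS:
            raw = frag.get(field)
            value = normalize(field, raw) if raw else ""
            if not value:
                continue  # nothing usable to link on
            key = (field, value)
            if key in first_seen:
                parent[find(idx)] = find(first_seen[key])  # same person (assumed)
            else:
                first_seen[key] = idx

    groups = defaultdict(list)
    for idx, frag in enumerate(fragments):
        groups[find(idx)].append(frag)

    profiles = []
    for members in groups.values():
        profile = defaultdict(set)  # keep every distinct value per field
        for frag in members:
            for field, value in frag.items():
                if value:
                    profile[field].add(normalize(field, value))
        profiles.append(dict(profile))
    return profiles


# Made-up fragments, as if pulled from four different source documents.
fragments = [
    {"name": "Jane Doe", "ssn": "000-12-3456"},
    {"ssn": "000123456", "account": "0011-2233"},
    {"account": "00112233", "address": "1 Main St"},
    {"name": "John Roe", "ssn": "000-98-7654"},
]
profiles = merge_fragments(fragments)
```

In this made-up input, the first three fragments chain together through a shared SSN and then a shared account number, so they collapse into one profile even though no single document contains all four fields. A fragment that carries only a name has nothing to link on under this rule and stays separate, which is exactly where fuzzy matching or human review would have to come in.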

4. iCONECT’s Innovative Approach

At iCONECT, we’ve tackled these challenges head-on. Our focus has been twofold: refining the extraction process and enhancing the initial identification phase. We’ve developed multiple extraction pathways tailored to different user expertise levels and use cases. Our software provides comprehensive upfront analysis, offering a clear scope of the incident’s potential impact. Post-extraction, we’ve invested in robust normalization and deduplication processes, aiming to streamline what has traditionally been a labor-intensive and error-prone task. 

In Conclusion 

As we gear up for the upcoming Legal Week conference in NYC (January 29 – February 1), we at iCONECT are excited to showcase how our latest software developments are changing the field of cyber data mining. We have spent the last year developing this new software and are entering beta testing this month. Our goal is to turn a process traditionally marked by manual labor and high error potential into a more streamlined, accurate, and efficient operation. We look forward to demonstrating these advancements and contributing to a more secure digital landscape.


Emily Johnston

Emily Johnston oversees the iCONECT Incident Response Data Mining program and has extensive professional experience in all aspects of litigation, privacy, and discovery. She has more than 18 years’ experience as an attorney, including as a litigator specializing in eDiscovery, global privacy issues, and the oversight of document review. Most recently, Emily served as the head of the Global Cyber and Incident Response review and notification offering at Epiq Global. There, she developed an end-to-end cyber review process from the ground up and personally oversaw the management of hundreds of incident response cyber reviews. She was instrumental in developing the cyber offering for Epiq, leading both the review and notification processes in the US, India, EMEA, and Australia.

Emily has also served as in-house eDiscovery counsel at two Fortune 500 companies, including as Assistant General Counsel and Vice President at Bank of America, where she oversaw all phases of the discovery process. Before that, she was Counsel in the eDiscovery and Information Governance group at Fulbright & Jaworski LLP.