Tag Archives: social security numbers

For Your Eyes Only: Experts Explore Preventing Inadvertent Disclosures During Discovery

The Altep, kCura and Milyli webinar explored best practices for safeguarding information, as well as technological tools for redaction

There may be a number of “Scott’s” in Chicago, but there are fewer with a specific last name attached, and there is only one with that specific Social Security Number. This information – or a telephone number, or a fingerprint, or even the MAC address of a computer – can be used to identify and verify a person.

But of course, for as valuable as personally identifiable information (PII) may be for you, it’s just as valuable to a malicious actor looking to steal and utilize it for nefarious purposes. That’s why, when conducting discovery, protecting that information should be of the utmost importance for organizations, law firms, and discovery vendors.

Three of those legal technology companies joined together to put that security forth in a recent webinar called“How to Prevent the Disclosure of PII.” The webinar’s panel included Hunter McMahon, vice president of legal and consulting services, Altep; Scott Monaghan, technical project manager, Milyli; Aileen Tien, advice specialist, kCura; and Judy Torres, vice president of information services, Altep.

In order to prevent disclosure, the panelists asked one important question: What exactly is PII? “It really comes down to what information can identify you as an individual,” McMahon said. This includes information that can be categorized into different categories based on how specific and how personal it is , leading McMahon to notenote that data holders should examined PII to determine if it is sensitive, private, or restricted.

When examining PII in the system, it’s also important to examine what regulations and laws the PII falls under. This can include a number of different federal regulations, HIPAA/HITECH (health PII), GLBA (financial PII), Privacy Act (PII held by Federal Agencies), and COPPA (children’s PII). Forty-seven states also have their own information laws, including varying guidelines on breach notification, level of culpability, and more.

Once that information is known, said the panelists, those conducting discovery should turn to the next question: What are the processes in place to protect the data? “Documents that are in the midst of discovery are really an extension of your retention policy… so you have to think about that risk the same way,” McMahon noted.

Torres explained that the proper approach to take to PII is that it will always be in a document set, if it seems unlikely that PII exists in a system. For example, she said not to assume that because a data set concerns only documents accessed during work hours, it will not contain PII.

“Most people, when they’re working, are also working the same time as those people they need to send documents to,” Torres explained. In one case, looking at data from Enron’s collapse, the documents in the case contained 7500+ instances of employee PII, including that of employee’s spouses and children, as well as home addresses, credit card numbers, SSN, and dates of birth.

In order to combat this data lying in the system, it’s important to take a proactive approach, the panel said. “The approach is much like data security in that it’s not going to be perfect, but you can help reduce the risk,” McMahon added.

To protect it in review, those conducting discovery can limit access to documents with PII, limit the ability to print, and limit the ability to download native files. Likewise, teams can employ safeguards during review such as training review teams on classifications of PII, training reviewers on PII workflow, implementing a mechanism for redaction and redaction quality control, and establishing technology encryption.

And even if not using human review, abiding these protocols can be important, “I see such a trend of more cases using assisted review, so you’re not necessarily having human eyes on every document. So it makes sense to make our best effort to protect PII on documents that may not necessarily have human review,” Torres said.

Properly conducting redactions to make sure nothing is missed can be a pain for reviewers as well, but Tien walked the webcast’s viewers through an introduction of regular expressions (reg-ex), one of the most common technology tools for PII redaction. In short, reg-ex is a pattern searching language that allows one to construct a single search string to search for a pattern of characters, such as three numbers, or three letters.

For one example, Social Security Numbers have a very specific format: XXX-XX-XXXX. Reg-ex can be used to find all constructions of this type, using an input like the following: [0-9]{3} – [0-9]{2} – [0-9]{4}

“With practice, you’ll be able to pick this up like any foreign language,” Tien said.

See post Sneaky PII: What’s Hiding in Your Data?

Sneaky PII: What’s Hiding in Your Data?


It’s no secret that it’s important to remove personally identifiable information (PII) and other privileged information from case data before it’s produced in order to protect it from falling into the wrong hands. The amount of data to be reviewed prior to litigation continues to grow exponentially as more and more ESI enters into discovery requests, and with an increase in data to review, there’s a greater risk of accidentally disclosing PII. As the past few years have shown us, a breach of PII could have major consequences for a corporation or law firm. The problem with PII, however, is that many companies don’t sufficiently protect employee information within their own environments, and because there’s an increasing amount of overlap between employees’ work and personal lives, there are more opportunities for PII to creep up in unexpected places that are easily overlooked.

Think for a second about where you’d expect PII to show up. You’re probably thinking of HR records where employees’ Social Security numbers, addresses, and phone numbers are stored. PII is easy to spot when you’re checking in obvious places like HR files, but personal information can crop up in other places just as easily when data gets collected from a broad range of sources. If an employee has a payroll issue, they might email bank account information or Social Security numbers to the payroll department. Beyond company-related communications, they might even send scanned images of tax documents to their accountant or mortgage applications to buy a new home from their company email address rather than their personal email. If your case requires that you pull company emails between specified dates you might inadvertently collect this information. In addition to emails, employees might use the office scanner for personal documents that they then send from their personal emails – but if that file lives on the company server, it’s at risk of entering into discovery data. If there isn’t a sweep done for extraneous PII, these details will slip through the cracks and leak to opposing counsel. For this reason, it’s absolutely crucial to comb data not just for relevance and privilege, but also for PII.

It’s a slippery slope, not only because PII is ubiquitous and can easily hide in unexpected places, but there are many contributing factors that make it difficult to pin down and at the mercy of human error. While many individuals are sensitive to their own private information, the average person has low awareness of exactly what data constitutes PII and how it can be compromised, meaning they’re probably revealing their company’s and their own private information unknowingly. Even if employees are hyper-aware of sensitive data, PII differs from state to state, so definitions change constantly and new regulations are implemented frequently. What wasn’t sensitive last year might be sensitive this year, and all the information from last year is still sitting on your company servers.

PII laws are complicated and can widely vary depending on which state and country you’re in, so it’s important to have processes in place to help eliminate extraneous data. Arguing for proportionality to narrow the scope of the case will reduce the amount of unnecessary PII gathered, and making use of technology assisted review and the many e-discovery platforms that can quickly find specific data inputs will dramatically reduce the time it takes to comb through files for PII. There are also products that can assist in the identification and exclusion of PII hiding in your case data, including redaction automation tools. While there are many methods for securing PII, redaction is far and away the safest because it removes sensitive information completely. It cannot be recovered or uncoded, so it is really the best way to eliminate risk.

The sticky nature of PII means that security can’t be done on a case-by-case basis. It should be a part of company-wide best practices and have a well-vetted process in place to ensure data is properly protected, not just for individuals but the company as a whole. Implementing security policies and investing in redaction technologies can help you stay compliant and save time, resources, and your reputation.