DOJ's Evolving Use of Data Analytics in Pursuing PPP Fraud
In early September and again last week, Acting Assistant Attorney General Brian Rabbitt delivered public remarks highlighting the role of data analytics in DOJ's criminal enforcement efforts concerning fraud on the Paycheck Protection Program (PPP). The PPP is a key funding feature of the Coronavirus Aid, Relief, and Economic Security Act (CARES Act). By the time the PPP stopped accepting applications on August 8, banks and lenders had approved over 5.2 million loans totaling over $525 billion. By September, DOJ had already announced PPP-related fraud prosecutions against 57 individuals, often charging them in a criminal complaint rather than a grand jury indictment. In his various remarks, Rabbitt credited DOJ's rapid enforcement rollout to its public-private partnerships, its creation of a dedicated PPP enforcement team, and its "data-driven approach to investigation." Although Rabbitt did not detail DOJ's analytical methods, our review of the recent charging documents suggests three ways that DOJ may be using data analytics to identify potential fraud and support its prosecutions: (1) flagging redundant personally identifiable information (PII) across multiple PPP applications; (2) catching data discrepancies between PPP applications and other records in government databases; and (3) identifying the repeated use of identical supporting documents across applications.
First, DOJ appears to be looking for situations in which PII and other unique identifiers (such as applicants' names, Social Security numbers, bank account numbers, phone numbers, and addresses) appear in the PPP applications of multiple businesses, and even across submissions to multiple banks. Under the PPP, an individual may receive only one loan per tax ID, although individuals who own multiple businesses, each with its own tax ID, may apply for multiple loans. Nevertheless, duplicate information across multiple applications can indicate a pattern of fraud. For example, a prosecution in the Eastern District of Texas alleged that a defendant used his home address and personal email address in applications for multiple businesses claiming to have hundreds of employees. Another prosecution in the Southern District of Texas alleged that the defendants used the same names and phone number in multiple applications. And a criminal complaint filed in the Western District of New York alleged that multiple fraudulent applications were sent from the same IP address. Flagging redundant PII is a tried-and-true fraud investigation method for financial institutions, so it would not be surprising if DOJ is relying on this same method to identify potential PPP fraud. But from DOJ's statements and charging documents, it is not clear what portion of these data-flagging efforts is attributable to the Fraud Section and what portion to lenders' internal compliance departments.
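To make the first method concrete, the following is a minimal sketch of how redundant identifiers might be flagged across applications. It is our own illustration, not a description of DOJ's or any lender's actual tooling: the record layout, the field names, the sample values, and the function name flag_shared_identifiers are all hypothetical.

```python
from collections import defaultdict

# Hypothetical application records; every field name and value is an
# illustrative assumption, not data from any charging document.
applications = [
    {"tax_id": "11-0000001", "phone": "555-0100",
     "address": "12 Oak Lane", "ip": "203.0.113.7"},
    {"tax_id": "22-0000002", "phone": "555-0100",
     "address": "12 oak lane ", "ip": "203.0.113.7"},
    {"tax_id": "33-0000003", "phone": "555-0199",
     "address": "900 Main St", "ip": "198.51.100.4"},
]


def flag_shared_identifiers(apps, fields=("phone", "address", "ip")):
    """Map each (field, value) pair to the tax IDs that used it,
    keeping only values that appear under more than one tax ID."""
    seen = defaultdict(set)
    for app in apps:
        for field in fields:
            value = app.get(field)
            if value:
                # Light normalization so trivial formatting differences
                # (case, stray spaces) do not hide a match.
                seen[(field, value.strip().lower())].add(app["tax_id"])
    return {key: sorted(ids) for key, ids in seen.items() if len(ids) > 1}


flags = flag_shared_identifiers(applications)
for (field, value), tax_ids in sorted(flags.items()):
    print(f"{field}={value!r} shared by {tax_ids}")
```

A shared identifier is only a lead for further review. As noted above, one person may legitimately own several businesses with separate tax IDs, so a match of this kind does not by itself indicate fraud.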
Second, the prosecutions indicate that investigators are cross-referencing information in PPP loan applications with data that other federal and state agencies already maintain. For example, in the Texas matters noted above, the complaints allege that the number of employees reported in PPP applications did not match the records of the Texas Workforce Commission, which requires employees to be reported within 20 days of hiring. In other prosecutions, DOJ appears to have identified allegedly fraudulent documents by comparing the tax documents supporting PPP loan applications with tax documents on file with the IRS. For example, in a prosecution in the District of New Jersey, investigators allegedly determined that applicants applied for PPP loans using IRS Form 940 or Form 941 (periodic filings that businesses must submit to report employment taxes) even though the companies had not filed those forms in 2019 and 2020. Identifying these types of reporting discrepancies is also a well-established fraud detection method.
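The cross-referencing described above can be sketched in a few lines. This is a simplified illustration under our own assumptions: the data structures, the 25 percent tolerance, and the function name find_headcount_discrepancies are hypothetical, and nothing in the charging documents describes how investigators actually performed the comparison.

```python
# Hypothetical inputs: headcounts claimed on PPP applications, and
# headcounts on file with a state workforce agency, keyed by tax ID.
ppp_applications = [
    {"tax_id": "11-0000001", "employees": 250},
    {"tax_id": "22-0000002", "employees": 12},
    {"tax_id": "33-0000003", "employees": 40},
]
agency_headcounts = {"11-0000001": 3, "22-0000002": 11}


def find_headcount_discrepancies(apps, agency_records, tolerance=0.25):
    """Return (tax_id, claimed, on_file, reason) for applications whose
    claimed headcount has no agency record or exceeds the agency figure
    by more than `tolerance` (an assumed threshold)."""
    results = []
    for app in apps:
        claimed = app["employees"]
        on_file = agency_records.get(app["tax_id"])
        if on_file is None:
            results.append((app["tax_id"], claimed, None, "no agency record"))
        elif claimed > on_file * (1 + tolerance):
            results.append(
                (app["tax_id"], claimed, on_file, "claimed exceeds agency record")
            )
    return results


discrepancies = find_headcount_discrepancies(ppp_applications, agency_headcounts)
for row in discrepancies:
    print(row)
```

The tolerance allows for ordinary timing differences between filings. In practice, the appropriate threshold would depend on the agency's reporting rules and the period each record covers.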
Third, investigators have identified several instances in which nearly identical versions of supporting documents appeared in multiple PPP applications for different businesses. Charging documents in joint Ohio and Florida prosecutions allege that the defendants submitted the same bank statements with only minor changes to support multiple applications. Similarly, a criminal complaint in the Eastern District of Michigan alleges that a defendant was linked to multiple applications supported by nearly identical documentation, which was flagged by the lending bank's underwriting team. This complaint suggests that DOJ went beyond reviewing document-level metadata and compared the content of the supporting documentation to identify character-level redundancies in the text. It alleges that the monthly payroll cost figures the defendant reported on behalf of two different businesses share an unusually high degree of similarity. Identifying near-duplicate content across loan applications requires more sophisticated technological tools than those that merely flag redundant PII and business data discrepancies. The recent significant advances in document-review technology make it a likely part of DOJ's PPP-fraud enforcement effort.
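As a rough illustration of near-duplicate detection, the sketch below compares the extracted text of supporting documents pairwise using Python's standard difflib module. We do not know which tools DOJ or lenders actually use. The document text, the 0.9 threshold, and the function name near_duplicates are assumptions, and production document-review systems typically rely on more scalable techniques than pairwise comparison.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical extracted text from supporting documents (illustrative only).
documents = {
    "app_A_statement": "Opening balance 10,250.00 Payroll 42,118.37 Closing balance 8,901.12",
    "app_B_statement": "Opening balance 10,250.00 Payroll 42,118.39 Closing balance 8,901.12",
    "app_C_statement": "Quarterly summary for a landscaping business with seasonal staff",
}


def near_duplicates(docs, threshold=0.9):
    """Return (name_a, name_b, score) for every pair of documents whose
    character-level similarity ratio meets the assumed threshold."""
    pairs = []
    for (name_a, text_a), (name_b, text_b) in combinations(sorted(docs.items()), 2):
        # ratio() is 1.0 for identical strings and falls toward 0.0
        # as the two texts share fewer matching characters.
        score = SequenceMatcher(None, text_a, text_b).ratio()
        if score >= threshold:
            pairs.append((name_a, name_b, round(score, 3)))
    return pairs


matches = near_duplicates(documents)
for name_a, name_b, score in matches:
    print(f"{name_a} ~ {name_b}: similarity {score}")
```

Here the first two statements differ by a single digit and are flagged as a pair, while the third is not. A high score indicates only that two documents merit a closer look; legitimate templates and standard forms can also produce similar text.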
As Acting AAG Rabbitt acknowledged, DOJ has long relied on data analytics in its healthcare fraud and securities fraud investigations. But the PPP's unique focus has shifted where DOJ searches, and how it does so. DOJ's efforts to compare and match data across different systems and sources are similar to how other government agencies, particularly financial regulators, routinely use data analytics in their investigations. For example, regulators scrutinize data reported by financial institutions for compliance with Bank Secrecy Act/anti-money laundering (BSA/AML) rules through methods such as comparing banks' BSA filings with the transaction activities of their accountholders who present significant compliance risks. And as one of our colleagues described last year, the Securities and Exchange Commission's Division of Enforcement now has advanced tools to detect fraud and insider trading by analyzing suspicious trading patterns and relationships among multiple traders.
DOJ has a tradition of using analytics in its healthcare fraud investigations and prosecutions, but those efforts more frequently focus on statistical analysis. For example, investigators often rely on Medicare claims information to compare the utilization rates of medical services and procedures across providers, and prosecutors often ask judges reviewing search warrants and juries reviewing trial evidence to infer criminal intent from a defendant's supposedly excessive testing or treatment. DOJ's recent opioid enforcement efforts have used data analytics extensively to identify and prosecute healthcare providers allegedly involved in drug diversion, to identify higher prescription levels, and to identify higher rates of adverse outcomes. We are very familiar with DOJ's use of statistics and analytics in various healthcare contexts—and have publicly criticized its flawed approaches and disregard for core statistical concepts.
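For readers unfamiliar with this kind of utilization comparison, the following is a deliberately simplified sketch. The claims figures, the three-times-median threshold, and the function name utilization_outliers are our own hypothetical choices, not a description of any method DOJ has used.

```python
from statistics import median

# Hypothetical claims data: provider -> (procedures billed, patients seen).
claims = {
    "provider_A": (50, 1000),
    "provider_B": (60, 1000),
    "provider_C": (55, 1000),
    "provider_D": (400, 1000),
    "provider_E": (45, 900),
}


def utilization_outliers(claims_by_provider, multiple=3.0):
    """Return providers whose per-patient utilization rate exceeds
    `multiple` times the peer median, with the ratio to that median."""
    rates = {
        provider: billed / patients
        for provider, (billed, patients) in claims_by_provider.items()
        if patients > 0
    }
    peer_median = median(rates.values())
    return {
        provider: round(rate / peer_median, 2)
        for provider, rate in rates.items()
        if rate > multiple * peer_median
    }


outliers = utilization_outliers(claims)
print(outliers)
```

A comparison like this says nothing about why a provider's rate is high. Differences in patient mix, specialty, or referral patterns can all produce outliers, which is one reason inferences of intent drawn from such figures deserve scrutiny.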
For now, the PPP-related charging documents do not suggest that any of the pending prosecutions will involve the kind of questionable data analysis and faulty statistics that sometimes infect the government's approach to trial proof. However, we will continue to monitor these cases—and DOJ's investigative techniques—as they develop.
© Arnold & Porter Kaye Scholer LLP 2020 All Rights Reserved. This blog post is intended to be a general summary of the law and does not constitute legal advice. You should consult with counsel to determine applicable legal requirements in a specific fact situation.