If you are a regular reader of the Verizon Data Breach Investigations Report (DBIR), you may be wondering why we are publishing an out-of-cycle document. As a rule, we do not report on events as they unfold, but instead wait until all the data is received and analyzed, and then present the findings in our annual breakdown of cybercrime trends. However, we are living in unprecedented times, and events are unfolding at a rate that has not been seen before. We, here on the team, are receiving many questions from our readers and partners regarding the possible impact of COVID-19 on the data breach landscape. The extreme gravity and the sheer volume of the changes taking place both in industry and society as a whole have compelled us to address the matter in this article format rather than waiting until our next edition of the report to examine the challenges we face and their potential consequences.
Methodology
It is appropriate here to say a few words about how we arrived at the conclusions we are presenting and how much, or how little, data they are based on. To that end, we collected non-incident data from a couple of our DBIR contributors (many thanks to Recorded Future and KnowBe4) and obtained incident data in the form of 35 publicly disclosed incidents gathered for the Vocabulary for Event Recording and Incident Sharing (VERIS) Community Database (VCDB) project. For the period between March 1, 2020, and June 1, 2020, we added 474 data breach records to the issues list of the VCDB repository to be coded. From those, 36 incidents were identified as being related directly to the COVID-19 pandemic. All but one (which lacked sufficient details) were coded, and they make up the incident dataset for this analysis. We wish to be very clear when we say that unlike the DBIR, which is completely data driven, this article is not solely based on data, and the data we are utilizing is from a comparatively small dataset (necessary, given the newness of the event). We have added our own observations drawn from our collective years of experience and other anecdotal sources available to us. While we feel that the information you will see here is valid and supportable, we make no claim that it is entirely data derived.
What COVID-19 means for industry
The COVID-19 pandemic has changed temporarily, and possibly forever, the way we view and conduct business. Due in large part to the mass confusion it has engendered and the high number of unknowns regarding what is and is not safe, it has resulted in consumers migrating en masse to e-commerce and other online assets rather than risking exposure by physically entering business establishments. That migration has, in turn, forced organizations in most industries, including retail, food services and even healthcare, to meet that much higher demand by increasingly relying on their online presence and by adopting a myriad of software-as-a-service (SaaS) solutions, increased cloud-based storage, use of third-party vendors, etc., and doing so in a very rapid manner.
Any time there is change, there exists potential for confusion, omissions and mistakes. Cybercriminals are aware of this and will do their best to capitalize on any opportunities that are afforded them. It is important to note that we do not mean to imply that the SaaS solutions mentioned above, nor the cloud itself, are inherently less secure. Rather, the concern arises from the fact that due to the conditions the pandemic has created, most organizations are adopting them in a hurried fashion, and they are often forced to do so while relying on fewer resources in terms of both personnel and revenue. When one adds to that dangerous concoction of digital transformation the additional ingredient of large-scale remote work enablement, it can easily spell disaster.
Trends to look out for
Rather than new and innovative approaches to electronic crime, we expect an increase in illicit activity of the type we have already been seeing. Given that these tactics worked before COVID-19, and that one of the results of the pandemic is that we have unwittingly provided a larger attack surface for those attacks, there is not much need for attackers to dream up new strategies and tactics. We make no claims of clairvoyance, but we do have a degree of confidence in our expectations regarding the attack types we discuss throughout the rest of this article.
Increase in error
As we stated ad nauseam in the DBIR this year, one action that occurs all too frequently is human error. We define error as encompassing anything done (or left undone) incorrectly or inadvertently. Errors can range from omissions and programming errors to trips and spills. However, the errors that we have seen most often recently (and expect to increase) are Misconfiguration (e.g., forgetting to add proper data security controls to a new cloud storage bucket), Misdelivery (sending an electronic file or printed document to an unauthorized or incorrect party) and Publishing errors (granting access to a larger or different audience than intended). These error types are typically due to carelessness and/or hurry on the part of a system administrator or regular end user, depending on the error type. As mentioned above, one result of COVID-19 is that many organizations are operating with a reduced number of staff due to illness or furlough and/or with staff who have limitations due to their remote status. At the same time, these organizations are often experiencing unusually heavy workloads with a much higher reliance on new and unfamiliar solutions that need to be deployed quickly. Add in the distraction of sheltering in place with family members, including children, and it would be remarkable if errors did not increase dramatically.
Stolen credential-related hacking
The DBIR shows that over 80% of breaches within the hacking category are caused by stolen or brute-forced credentials. The majority of the time, these occur via web apps and/or the cloud. Since businesses are forced to lean on SaaS platforms more heavily now than at any time since the internet was invented by Al Gore, we expect this increased reliance to substantially widen the attack surface for bad actors looking for stolen and brute-forced creds. After all, there will be more places to use them than ever before.
If you have a very effective and comprehensive patching process in place in your organization, raise your hand. For the three of you who raised your hands, we congratulate you. For the rest of us, making sure that all corporate-owned assets are promptly and consistently patched may be more difficult in the current environment than it has been in the past. While our research indicates that exploited vulnerabilities account for a relatively small number of breaches, vulnerability exploitation is still the second most common hacking variety.
However, given the current circumstances in which a large number of employees are being encouraged (or mandated) to work from home, maintaining those newly external workstations for remote access suddenly becomes a much bigger deal. We can probably all agree that patch and asset management on workstations is already sufficiently difficult when they are all located internally. Securing those assets and preventing them from accessing the corporate network via your preferred zero-trust trickery while unpatched will prove to be very challenging, even to the most mature of organizations.
Ransomware likely to rise
Ransomware is typically not counted as a breach in the DBIR dataset since there is not usually a confirmed compromise of the confidentiality of the data. However, the COVID-19 breach dataset includes several incidents in which the ransomware group was confirmed to have taken a copy of the data before triggering encryption and then posted it (either partially or entirely) publicly on their website of choice. Of the nine malware incidents in the COVID-19 dataset, seven were confirmed breaches. In the full VCDB dataset of 474 incidents discussed in the Methodology section, there were 128 malware incidents, 34 of which were confirmed ransomware breaches.
Impact on the phishing landscape
In order to illicitly utilize those stolen credentials we talked about, the attacker must first be able to obtain them. While this can be achieved in a variety of ways, arguably the simplest and most effective method is to get them via social attacks such as phishing. An employee who is duped into clicking a malicious link or opening an infected attachment will provide the attacker initial access to the system.
Additionally, if the user relies on the same password for multiple portals, the attacker can reuse it to expand the attack. The surge in remote working due to the pandemic may increase the reliance on mobile phones and tablets. Research from last year’s report indicated that many users are more likely to click on a malicious link when using a mobile device than a desktop or laptop. We discuss phishing in more detail throughout this document.
Topic analysis
To better understand the infrastructure of phishing, we decided to look into the various indicators that were shared in the InfoSec community specifically relating to COVID-19-themed attacks. To provide a contrast, we collected similar non-COVID-19 data dating back to 2018.
When it comes to creating a lure for the phishing attacks, one of the main methods an attacker can use to create an air of legitimacy (aside from sending email from a trusted person) is leveraging domains that resemble a trusted source. While we do find these types of substitutions in the regular threat intel data we analyzed, we did not find many instances of them in the COVID-19-specific data. What we did find was the gratuitous use of specific terms in combination with “COVID” or “CORONAVIRUS,” such as “masks,” “test,” “quarantine” and “vaccine,” as seen in Figure 1.
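The kind of term matching described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used for the analysis: the term lists are limited to the examples quoted in the text, the function name `lure_terms_in` is hypothetical, and the sample domains are made up.

```python
# Term lists limited to the examples quoted in the text (an assumption;
# the real analysis may have used a longer list).
VIRUS_TERMS = ("covid", "coronavirus")
LURE_TERMS = ("masks", "test", "quarantine", "vaccine")


def lure_terms_in(domain: str) -> list[str]:
    """Return the lure terms found in a domain that also names the virus."""
    name = domain.lower()
    if not any(term in name for term in VIRUS_TERMS):
        return []
    return [term for term in LURE_TERMS if term in name]


# Made-up domains, for illustration only.
samples = [
    "covid-masks-shop.com",
    "coronavirusvaccine-info.com",
    "example.org",
    "covid19.net",
]
for domain in samples:
    print(domain, lure_terms_in(domain))
```

Plain substring matching is deliberately crude: a term such as "test" would also match inside "latest", so a real pipeline would likely tokenize the domain first.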
Another simple thing to look at concerning domains is the top-level domain (TLD). Even though certain TLDs are controlled, such as .gov and .edu, TLDs by themselves do not really convey any degree of trust. However, that does not necessarily mean that users are aware of that fact. We found that over 80% of the domains registered used the .com TLD in the COVID data, which was twice the rate found in previous threat intel data (41%) of malicious domains. Since .com domains usually have a cost associated with being registered, this suggests the actors here were very committed to leveraging the impression of legitimacy they would bring.
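A TLD comparison like this one comes down to counting the last label of each domain. The following is a minimal sketch, assuming a plain list of domain strings as input; the helper name `tld_share` and the sample domains are hypothetical and do not come from the dataset.

```python
from collections import Counter


def tld_share(domains):
    """Map each top-level domain to its fraction of the given domains."""
    tlds = [d.lower().rstrip(".").rsplit(".", 1)[-1] for d in domains]
    counts = Counter(tlds)
    total = len(tlds)
    return {tld: n / total for tld, n in counts.items()}


# Made-up domains, for illustration only.
sample_domains = [
    "covid-relief.com",
    "coronavirus-test.com",
    "quarantine-help.com",
    "vaccine-news.com",
    "masks-now.net",
]
for tld, share in sorted(tld_share(sample_domains).items()):
    print(f".{tld}: {share:.0%}")
```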
While domains provide what users see, computers translate these into the IP addresses where the actual malware or phishing page is hosted. While looking at the COVID-19 data, we found that there was a high level of centralization of hosting, meaning that singular IPs hosted a large number of the bad domains. In this case, the top 100 IPs hosted over 53% of all the domains; however, over 73% of IPs only hosted one domain. In comparison to non-COVID-19, the top 100 IPs only hosted 23% of the bad domains we collected. The good news is that we’re doing a good job as a community of staying on top of these bad domains, with the median time to discovery (difference between registration and showing up in the data) being approximately 12 days for the COVID-19 data versus 284 days for non-COVID-19.
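The three measurements in this paragraph (share of domains on the busiest IPs, share of IPs hosting a single domain, and median time to discovery) can be computed as sketched below. The input shapes, function names and sample values are assumptions made for illustration; the IP addresses come from a documentation-only range.

```python
from collections import Counter
from datetime import date
from statistics import median


def hosting_concentration(domain_to_ip, top_n):
    """Return (share of domains hosted on the top_n busiest IPs,
    share of IPs that host exactly one domain)."""
    per_ip = Counter(domain_to_ip.values())
    top_total = sum(n for _, n in per_ip.most_common(top_n))
    single = sum(1 for n in per_ip.values() if n == 1)
    return top_total / len(domain_to_ip), single / len(per_ip)


def median_days_to_discovery(records):
    """records: iterable of (registered, first_seen) date pairs."""
    return median((seen - registered).days for registered, seen in records)


# Made-up data, for illustration only.
domain_to_ip = {
    "a.com": "203.0.113.1",
    "b.com": "203.0.113.1",
    "c.com": "203.0.113.1",
    "d.com": "203.0.113.2",
    "e.com": "203.0.113.3",
}
top_share, single_share = hosting_concentration(domain_to_ip, top_n=1)
print(f"top IP hosts {top_share:.0%} of domains; "
      f"{single_share:.0%} of IPs host one domain")

records = [
    (date(2020, 3, 1), date(2020, 3, 13)),
    (date(2020, 3, 10), date(2020, 3, 15)),
    (date(2020, 4, 1), date(2020, 4, 21)),
]
print("median days to discovery:", median_days_to_discovery(records))
```

Both concentration figures can be high at once, as in the text: a few IPs host many domains while the long tail of IPs each host only one.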
Mind games
Clearly, COVID-19-related terms are showing up in threat indicators. However, how susceptible people are to them is still an open question. To try to provide an answer, we examined some simulated phishing data provided by a DBIR contributor. We compared emails that contained COVID-19-related terms (such as COVID, Corona, pandemic, Wuhan, SARS, etc.) to those emails that did not contain such references.
Figure 2 illustrates the results. It’s easy to see that they are quite similar for the most part; however, the phishing emails unrelated to COVID-19 tend to have a slightly lower click rate (with a median of 3.1%). The phishing emails that were related to COVID-19 had a somewhat higher median at 4.1% and showed more organizations having far higher click rates, even above 50% in some cases. A heightened emotional reaction (or amygdala hijacking, if you are into behavioral security like some of our contributors) is understandable when COVID-19-related terms are involved.
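A comparison of this kind could be set up roughly as follows. This is a sketch under assumptions: the term pattern covers only the terms named in the text (the original list ended in "etc."), the input is assumed to be one (subject, emails sent, clicks) tuple per simulated campaign, and the sample numbers are invented and do not reproduce the contributor's data.

```python
import re
from statistics import median

# Only the terms named in the text; the real list was longer.
COVID_PATTERN = re.compile(r"covid|corona|pandemic|wuhan|sars", re.IGNORECASE)


def is_covid_themed(subject):
    """True if the email subject contains any COVID-19-related term."""
    return COVID_PATTERN.search(subject) is not None


def median_click_rates(campaigns):
    """campaigns: iterable of (subject, emails_sent, clicks) tuples.
    Returns the median click rate for each non-empty group."""
    groups = {"covid": [], "other": []}
    for subject, sent, clicks in campaigns:
        if sent == 0:
            continue  # skip campaigns with nothing sent to avoid dividing by zero
        key = "covid" if is_covid_themed(subject) else "other"
        groups[key].append(clicks / sent)
    return {key: median(rates) for key, rates in groups.items() if rates}


# Invented campaigns, for illustration only.
campaigns = [
    ("COVID-19 benefits update", 100, 5),
    ("Corona virus policy change", 200, 6),
    ("Pandemic relief form", 100, 4),
    ("Password expiry notice", 100, 3),
    ("Invoice attached", 200, 4),
]
for group, rate in median_click_rates(campaigns).items():
    print(f"{group}: median click rate {rate:.1%}")
```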
Another DBIR contributor shared data around phishing simulations performed on approximately 16,000 people in late March (the early weeks of shelter-in-place for many states in the USA). They found that almost three times as many people not only clicked the phishing link, but also provided their credentials to the simulated login page, as in pre-COVID-19 tests late last year.
This is a staggering result, but we believe it is also understandable when you realize how much cognitive load all of us are being subjected to. In addition to the pain and suffering around us from the actual disease, many of us have to also juggle full-time workloads and homeschooling our children, all with very little warning. The bottom line here is that it is likely that phishing attacks, regardless of their lure, will be more successful than normal during this time period.
COVID-19 and the criminal underground
One place we thought we might see an impact of COVID-19 was in the communications in criminal marketplaces, the criminal underground chatter and security forums. As you can see in Figure 3 (above left), there was a clear uptick in mentions of some terms associated with COVID-19. This increase was especially marked from March into May.
However, when we zoom out on the same chart to include the term “card” (as in credit card) from the same forums and marketplaces (Figure 4, above right), we see that COVID-19-related terms represent merely a drop in the proverbial ocean. Of course, there is a lot that goes into this sort of view. Would criminals mention COVID-19 even if it was used in their phishing or domains? Are credit card-related threads posted more often to get more visibility for their “product”? It’s difficult to say, but the difference in occurrences can’t be ignored.
The takeaway here may be that no matter what events are taking place in society, they simply become grist for the mill for criminals who launch phishing attacks. As we write this article, the Black Lives Matter protests are front and center both in the U.S. and abroad. In the coming weeks and months, we anticipate that readers will begin to see phishing lures built around that theme.
Digging deeper into the data
For those of you who may be interested in a closer look at the VCDB data that we discussed in the Methodology section, we provide some additional details below. While there was not time to code the entire dataset of 474 records, we were able to tally up the VERIS actions of all of them, since we collect that info in the record when we add it to the list to be coded. Here are the breakouts in Figure 5:
Note that VERIS actions are not exclusive, so the total will not add up to 474. There are a number of records where multiple actions were present (most commonly, Hacking and Malware). Interestingly, we had a rare Environmental data breach in the time frame for this article (although it was not flagged as a COVID-19-related breach), which was caused by a tornado hitting a medical records service provider. Environmental data breach cases are worthy of mention because they are so rare: there are only four in the entire VCDB repository of 15,701 records (either coded or uncoded).
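To see why non-exclusive actions make the totals exceed the record count, consider the small tally below. The record layout (an "actions" list per record) and the sample records are assumptions for illustration and do not reflect the actual VCDB schema or data.

```python
from collections import Counter

# Made-up records; each lists the VERIS actions observed in one incident.
records = [
    {"id": 1, "actions": ["Hacking", "Malware"]},
    {"id": 2, "actions": ["Error"]},
    {"id": 3, "actions": ["Social", "Hacking"]},
    {"id": 4, "actions": ["Environmental"]},
]


def tally_actions(records):
    """Count how many records include each action (once per record)."""
    counts = Counter()
    for record in records:
        counts.update(set(record["actions"]))
    return counts


counts = tally_actions(records)
for action, n in counts.most_common():
    print(action, n)
print(sum(counts.values()), "action tags across", len(records), "records")
```

Because a record with two actions is counted under both, the per-action counts sum to more than the number of records.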
In contrast, those incidents that were identified as directly related to the COVID-19 pandemic are shown in Figure 6. You can review which ones we selected for this analysis in VCDB at https://github.com/vz-risk/VCDB/issues?q=label%3ACovid-19-Supplemental-2020.
Finally, Figure 7 shows the breakdown of the breaches into their respective patterns. With such a small sample size, the rankings are too close to call with sufficient statistical confidence. Nevertheless, the top three patterns do correlate well with the commentary in the beginning of this document.
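One way to see why rankings in a sample this small are too close to call is to put an interval around each proportion. The sketch below uses the Wilson score interval, which is one common choice and not necessarily the method used for this analysis. The sample size of 35 is the number of coded incidents; the pattern names and counts are hypothetical.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for the proportion successes/n."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


n = 35  # coded incidents in the COVID-19 dataset
pattern_counts = {"Pattern A": 10, "Pattern B": 8}  # hypothetical counts
for name, k in pattern_counts.items():
    low, high = wilson_interval(k, n)
    print(f"{name}: {k}/{n} = {k / n:.0%}, interval {low:.0%} to {high:.0%}")
```

With counts this close and a sample this small, the two intervals overlap heavily, so the observed ordering could easily be reversed in another sample.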
Final thoughts
As readers of the DBIR may be aware, in the 2020 report, we decided to align our findings with the Center for Internet Security (CIS) Critical Security Controls. We did this in an attempt to provide our readers and contributors with an easy-to-understand method of matching the threats we see to their ongoing security effort. For the threats that we focus on primarily in this document (Error, Phishing, Ransomware), we recommend that organizations take a closer look at the following CIS Controls:
- Critical Security Control 12: Boundary Defense
- Critical Security Control 9: Limitation and Control of Network Ports, Protocols and Services
- Critical Security Control 16: Account Monitoring and Control
At the end of the day, the “we are all in this together” mantra that we have heard so often during this crisis is not only an effective marketing slogan, but it also happens to be true. In some ways, we are navigating uncharted waters, but we do feel strongly that we can set a more productive course for where we are going by looking at where we have been.