Introduction

The Internet Archive recently suffered a significant cybersecurity incident involving a data breach and ongoing Distributed Denial-of-Service (DDoS) attacks. The breach exposed more than 31 million user records [3], raising concerns about data security and the organization’s vulnerability to cyber threats.

Description

A data breach at the Internet Archive became public on October 9, 2024 [5] [11], and was confirmed by the organization and by security researcher Troy Hunt [7]. A threat actor compromised the user authentication database and exfiltrated more than 31 million unique records, including email addresses [1] [4] [5] [6] [11], usernames [1] [3] [4] [7], and Bcrypt-hashed passwords [2] [3] [4] [5] [6] [9] [10] [11]. The stolen data was a 6.4 GB SQL file named ia_users.sql [2], which also contained password change timestamps. The most recent of those timestamps is September 28, 2024, which is likely when the database was stolen. Bleeping Computer first reported the breach [7] after a pop-up on the website alerted visitors with the message, “It just happened. See 31 million of you on HIBP!” Notably, 54% of the compromised records were already present in the Have I Been Pwned (HIBP) database from earlier breaches, enabling prompt notifications to affected users [11].
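The inference from the newest password change timestamp can be sketched as follows. No row in a dump can be newer than the moment the data was copied, so the latest timestamp gives a lower bound on when the theft happened, and on a busy site that bound is likely close to the real time. The record layout and every value below are invented for illustration and are not taken from the leaked file.

```python
from datetime import datetime, timezone

# Hypothetical rows: (username, time of last password change).
# These values are made up for the example.
records = [
    ("alice", datetime(2023, 3, 14, 9, 30, tzinfo=timezone.utc)),
    ("bob", datetime(2024, 1, 2, 17, 5, tzinfo=timezone.utc)),
    ("carol", datetime(2024, 6, 21, 8, 0, tzinfo=timezone.utc)),
]


def latest_change(rows):
    """Return the newest password-change timestamp found in the rows."""
    return max(changed_at for _, changed_at in rows)


# The newest timestamp is the earliest moment the copy could have been made.
earliest_possible_theft = latest_change(records)
print(earliest_possible_theft.date().isoformat())  # 2024-06-21
```

The same reasoning only yields a lower bound: a dump taken later, during a period in which no user happened to change a password, would show the same newest timestamp.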

Troy Hunt verified the authenticity of the data by contacting users listed in the database [10], including cybersecurity researcher Scott Helme [10], who confirmed that the Bcrypt-hashed password and timestamp matched his password manager’s records [10]. Although Hunt initiated a disclosure process with the Internet Archive on October 6 [10], he had not received a response by the time he planned to load the data into HIBP approximately ten days after the breach. Affected users will soon be able to check HIBP for potential exposure of their email addresses [10].
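The kind of check Helme performed amounts to re-hashing a known password and comparing the result with the hash found in the leaked record. The sketch below is an illustration only: it uses PBKDF2-HMAC-SHA256 from Python's standard library as a stand-in for bcrypt, which is not part of the standard library, and the iteration count, record layout, and sample password are all assumptions. Unlike this sketch, a real bcrypt hash string carries its own salt and cost factor.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for this sketch


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a slow, salted hash (PBKDF2 used here as a stand-in for bcrypt)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def matches_leak(candidate: str, salt: bytes, leaked_hash: bytes) -> bool:
    """Re-hash the candidate password and compare it to the leaked hash in constant time."""
    return hmac.compare_digest(hash_password(candidate, salt), leaked_hash)


# Hypothetical leaked record: a salt and the hash stored alongside it.
salt = os.urandom(16)
leaked_hash = hash_password("correct horse battery staple", salt)

print(matches_leak("correct horse battery staple", salt, leaked_hash))  # True
print(matches_leak("some other password", salt, leaked_hash))  # False
```

A match shows that the leaked record really corresponds to the password its owner had on file, which is why a confirmation from a user with a password manager is strong evidence that a dump is genuine.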

Alongside the data breach [2] [7], the Internet Archive has faced ongoing Distributed Denial-of-Service (DDoS) attacks that have intermittently disrupted its services [7] and temporarily taken the site [6], including the Wayback Machine [4] [5], offline. Founder Brewster Kahle confirmed the DDoS attack on October 8, 2024 [5], and later acknowledged the data breach [5]. The pro-Palestinian hacktivist group BlackMeta claimed responsibility for the DDoS attacks [8], citing political motives related to US foreign policy [9], and issued an antisemitic message regarding the Archive’s ownership [2]. Kahle reported that the organization is taking measures to recover [7], including strengthening its security and disabling a vulnerable JavaScript library that contributed to the breach [3].

Experts suggest that the attackers gained control over the back-end infrastructure [8], as evidenced by the website defacement and repeated outages [8]. The connection between the DDoS attacks and the data breach remains unclear [5]; one line of speculation is that server credentials were compromised through “information stealer” malware [1]. The extent of any additional data theft is still unknown.

The Internet Archive is also grappling with legal challenges [7], including a recent loss in a copyright lawsuit brought by book publishers [7], which could have significant financial implications for the organization. The Internet Archive is a nonprofit that advocates for a free and open internet, and it has no ties to the US government or Israel [9].

Separately, the Internet Archive has for over a decade inadvertently exposed the email addresses of users who upload content, despite claims of confidentiality [1]. When a file is uploaded, a publicly accessible metadata file is generated that includes the uploader’s email address [1]. Users have raised concerns about this issue for years, noting that the Archive does not warn uploaders that their email addresses will be exposed [1].

To mitigate risks from potential account leaks, users are advised to use a unique, random password for each account, which prevents credential stuffing attacks [1]. The passwords involved in this breach were hashed with a secure algorithm, which reduces the immediate risk for victims [1]. Using a unique username and email address for each online account is also recommended for added security [1].
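The advice on unique, random passwords can be illustrated with a minimal sketch built on Python's standard secrets module. The character set, the default length of 20, the 12-character minimum, and the site names are assumptions chosen for the example; in practice a password manager does this job.

```python
import secrets
import string

# Assumed character set: letters, digits, and ASCII punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 20) -> str:
    """Return a random password drawn from a cryptographically secure source."""
    if length < 12:
        raise ValueError("use at least 12 characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# One independent password per site (hypothetical site names), so that a
# password leaked from one site cannot be replayed against another.
sites = ["archive.example", "mail.example", "bank.example"]
vault = {site: generate_password() for site in sites}

for site, password in vault.items():
    print(site, len(password))
```

Because each password is generated independently, a credential stuffing attack that replays a password leaked from one site has nothing to reuse on the others.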

Conclusion

The recent cybersecurity incidents at the Internet Archive highlight the need for robust data protection measures and proactive cybersecurity strategies. The organization is taking steps to strengthen its security, but users must also remain vigilant by adopting strong password practices and monitoring their accounts for potential exposure. As the Internet Archive navigates these challenges, its experience underscores the importance of safeguarding digital information in an increasingly interconnected world.

References

[1] https://theintercept.com/2024/10/10/internet-archive-hack-breach-email-addresses/
[2] https://9to5mac.com/2024/10/10/internet-archive-data-breach-exposes-31m-users-under-ddos-attack/
[3] https://siliconangle.com/2024/10/10/internet-archive-experiences-outages-ddos-attacks-data-breach/
[4] https://lifehacker.com/tech/wayback-machine-hacked-affecting-31-million-records
[5] https://www.infosecurity-magazine.com/news/internet-archive-breach-31m/
[6] https://arstechnica.com/information-technology/2024/10/archive-org-a-repository-storing-the-entire-history-of-the-internet-has-been-hacked/
[7] https://www.wired.com/story/internet-archive-hacked/
[8] https://www.forbes.com/sites/daveywinder/2024/10/10/internet-hacked-wayback-machine-down-31-million-passwords-stolen/
[9] https://www.techradar.com/pro/internet-archive-hacked-millions-of-records-stolen-following-ddos-attack
[10] https://www.isss.org.uk/news/internet-archive-hacked-data-breach-impacts-31-million-users/
[11] https://beebom.com/internet-archive-data-breach-exposes-31-million-accounts/