How Ethereum Swarm Can Protect Nonprofits from DDoS Attacks

3 Jun 2024

A few days ago, someone launched a DDoS attack on the Internet Archive servers.

https://twitter.com/internetarchive/status/1795117949499445554?embedable=true

I know, I know, who the heck would stoop to such a despicable act?! The incident made me wonder whether it points to a systemic, conceptual flaw that made the attack possible in the first place.

Ethereum Swarm is a decentralized storage solution. When I first read about the concept in the Book of Swarm, the incentive system seemed a bit strange to me: we have to pay not only for storage but also for retrieval. The node serving the content receives compensation for the data traffic. It's like browsing the internet or scrolling through Facebook and constantly seeing a little counter in the top right corner of the screen showing how much we need to pay for the content. Wouldn't that be weird? Of course, anyone with a server at a cloud service provider knows very well that bandwidth does cost money. For example, on AWS, you currently have to pay $0.09 for every outgoing GB of data traffic.

On the Internet, almost every service is "push" oriented, meaning the service provider has some interest in us consuming their content. This is true even in cases where we think it is in our interest to consume the content. An example of this is when we use Facebook. Users obviously scroll through the Facebook feed for the content, but from a business perspective, the key is the advertisements appearing between the posts, as this is what generates profit.

If you're not paying for it, you become the product!

This is why we typically receive content for free on the Internet, and companies not only cover the infrastructure costs but also spend a lot of money on marketing to ensure we consume content on their platforms. This, of course, leads to a vicious cycle because users have become accustomed to free content. If a new player enters the market, they too are forced to offer free content to be competitive, leaving the service provider no other option but to monetize the content.

If you don’t want to be a product, pay for it! Since Ethereum Swarm is a decentralized system of peer nodes, there is no way to "hide" such infrastructure costs.

Ethereum Swarm is honest!

In the Swarm network, if you need any content, you have to pay for it. Even if the content itself is free, you still have to pay for the data transfer. This solution, however, brings new opportunities.

Ethereum Swarm uses forwarding Kademlia for content delivery. This means that if we request content from a node and that node doesn't have the content, it must fetch it from a neighboring node. In this case, the neighboring node must be paid for the given content. Due to this operation, it is beneficial for nodes to cache popular content. Such content can thus persist in the network even if no one is paying for its storage anymore.
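To make the forwarding and caching idea concrete, here is a minimal toy sketch in Python. It is not Swarm's implementation: the `Node` class, the `serve` and `connect` names, the 16-bit addresses, and the one-unit-per-chunk accounting are all assumptions made up for illustration, and real Swarm routing and accounting are considerably more involved.

```python
import hashlib


def address(label: str) -> int:
    """Toy 16-bit address derived from a label (Swarm's real address space is far larger)."""
    return int.from_bytes(hashlib.sha256(label.encode()).digest()[:2], "big")


class Node:
    """Hypothetical node: stores chunks, forwards requests, caches what it forwards."""

    def __init__(self, name: str):
        self.name = name
        self.addr = address(name)
        self.peers = []    # directly connected nodes
        self.store = {}    # chunks someone is paying this node to store
        self.cache = {}    # chunks kept opportunistically after forwarding
        self.earned = 0    # toy units received for serving chunks
        self.paid = 0      # toy units paid to neighbors for fetched chunks

    def serve(self, chunk_addr: int, visited=None):
        visited = set() if visited is None else visited
        visited.add(self.name)
        data = self.store.get(chunk_addr, self.cache.get(chunk_addr))
        if data is None:
            # Not held locally: ask unvisited peers, closest (by XOR distance) first.
            candidates = sorted(
                (p for p in self.peers if p.name not in visited),
                key=lambda p: p.addr ^ chunk_addr,
            )
            for peer in candidates:
                data = peer.serve(chunk_addr, visited)
                if data is not None:
                    self.paid += 1                 # the neighbor is paid for the chunk
                    self.cache[chunk_addr] = data  # keep a copy for future requests
                    break
        if data is not None:
            self.earned += 1  # whoever asked this node pays it for the delivery
        return data


def connect(x: Node, y: Node) -> None:
    """Create a bidirectional connection between two nodes."""
    x.peers.append(y)
    y.peers.append(x)
```

In a chain such as `a - b - c`, a request sent to `a` for a chunk stored only on `c` travels through `b`, and both `a` and `b` end up with a cached copy. If `c` later drops the chunk, a request arriving at `b` is still answered from its cache, which is the persistence effect described above.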

Ethereum Swarm is a perfect client-sponsored Content Delivery Network (CDN).

Thanks to its decentralization, this system can scale automatically with demand, and content providers are not forced to quietly absorb the infrastructure costs themselves.

At this point, we can return to the idea raised at the beginning of the article. For a nonprofit like the Internet Archive, entrusting content delivery to Ethereum Swarm would be an ideal solution. On one hand, there would be significant cost savings if data transfer costs were borne by the content consumers instead of the provider (for each consumer, it's just a few cents, while for the provider, with many consumers, it could be thousands of dollars). On the other hand, such a decentralized system is far harder to take down with a DDoS attack: individual nodes can be targeted, but attacking the entire network is very difficult.
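The cost asymmetry is easy to check with back-of-the-envelope arithmetic. The sketch below uses the $0.09 per GB AWS egress figure quoted earlier; the visitor count and download size are hypothetical numbers chosen only for illustration, and Swarm's actual retrieval prices are not modeled here.

```python
EGRESS_USD_PER_GB = 0.09  # AWS outgoing-traffic price quoted earlier in the article


def egress_cost_usd(visitors: int, gb_per_visitor: float,
                    usd_per_gb: float = EGRESS_USD_PER_GB) -> float:
    """Total data-transfer cost for a number of visitors downloading a given volume each."""
    return visitors * gb_per_visitor * usd_per_gb


# Hypothetical traffic: one million visitors, each downloading 0.1 GB.
provider_bill = egress_cost_usd(1_000_000, 0.1)  # paid by one party if the provider covers it
per_consumer = egress_cost_usd(1, 0.1)           # paid by each visitor if consumers cover it

print(f"provider pays about ${provider_bill:,.2f} in total")
print(f"each consumer pays about ${per_consumer:.3f}")
```

Under these assumed numbers the provider's bill lands in the thousands of dollars, while each individual consumer's share is under one cent.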

Given the above, an Internet Archive system based on Ethereum Swarm would look like this: crawlers continuously scan the web and upload the content to Swarm. The storage cost would be minimal because the goal is not long-term storage but merely to get the content into the cache. (It's important to keep a minimal storage cost to prevent spamming of the caches.) If someone needs archived content, they can obtain it from Swarm. If the content is no longer in the cache, it can be requested from the Internet Archive. This request could also carry a minimal fee, providing direct revenue for the Internet Archive. In such cases, the Internet Archive would retrieve the content from deep archive storage and re-upload it to Swarm, where it would become accessible again. In this architecture, the Internet Archive's only cost would be deep archive storage, which, for example, is about $1 per TB per month on AWS Glacier Deep Archive. Since Ethereum Swarm would handle the actual content delivery, users would bear the delivery cost.
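The retrieval flow described above can be sketched in a few lines of Python. Everything here is a hypothetical stand-in: `SwarmCache` is a plain dictionary pretending to be the network, `Archive` pretends to be the Internet Archive's deep storage, and the `restore_fee` value is an arbitrary placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class SwarmCache:
    """Stand-in for Swarm: content may disappear once nobody caches it."""
    chunks: dict = field(default_factory=dict)

    def get(self, url: str):
        return self.chunks.get(url)

    def upload(self, url: str, content: bytes) -> None:
        self.chunks[url] = content


@dataclass
class Archive:
    """Stand-in for the archive operator holding everything in deep storage."""
    deep_storage: dict
    swarm: SwarmCache
    restore_fee: float = 0.01  # hypothetical fee per restore request
    revenue: float = 0.0

    def restore(self, url: str) -> bytes:
        content = self.deep_storage[url]  # slow, cheap storage
        self.swarm.upload(url, content)   # make it retrievable from Swarm again
        self.revenue += self.restore_fee  # the requester pays the small fee
        return content


def fetch(url: str, swarm: SwarmCache, archive: Archive):
    """Try Swarm first; fall back to the archive only on a cache miss."""
    content = swarm.get(url)
    if content is not None:
        return content, "swarm"
    return archive.restore(url), "archive"
```

The first request for a page that has dropped out of the cache goes to the archive and triggers a re-upload; every later request is served from Swarm, so the archive is contacted only for cold content.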

The ideal solution for mapping content is to use Swarm feeds. Feeds allow mutable content to be stored under a given topic ID while making the history accessible as well. In this case, web URLs would serve as the topic IDs and older versions would be accessible through the feed history.
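As a rough illustration of the mapping, the toy model below derives a topic ID from a URL and keeps an append-only history per topic. It is an in-memory sketch, not the Swarm feed API: `ToyFeedStore` and its methods are hypothetical, SHA-256 merely stands in for whatever hash Swarm actually uses, and details of real feeds such as ownership and signing are left out.

```python
import hashlib


def topic_id(url: str) -> str:
    """Derive a fixed-size topic ID from a web URL (SHA-256 used as a stand-in)."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


class ToyFeedStore:
    """Append-only list of snapshot references per topic."""

    def __init__(self):
        self._feeds = {}

    def publish(self, url: str, snapshot_ref: str) -> int:
        """Append a new snapshot reference and return its index in the history."""
        feed = self._feeds.setdefault(topic_id(url), [])
        feed.append(snapshot_ref)
        return len(feed) - 1

    def latest(self, url: str):
        """Return the newest snapshot reference, or None if the URL was never archived."""
        feed = self._feeds.get(topic_id(url))
        return feed[-1] if feed else None

    def history(self, url: str) -> list:
        """Return all snapshot references for the URL, oldest first."""
        return list(self._feeds.get(topic_id(url), []))
```

A crawler would call `publish` each time it captures a page, a reader wanting the current version would call `latest`, and a reader browsing older captures would walk `history`.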

Of course, the Internet Archive is just one of the potential use cases. The main point is:

If we use Ethereum Swarm as a content delivery network, there is no longer a need to monetize content or hide the infrastructural costs.

For example, we can build a social network without ads and without the need to sell user data (I covered this in a full article), stream multimedia content at minimal cost, or operate large data storage for training artificial intelligence algorithms at minimal cost.

If you would like to learn more about how Ethereum Swarm works, read my article on the topic: What's the Difference Between IPFS and Ethereum Swarm?