The concept of decentralization in information technologies is not a new one. The internet, probably the single most influential technological innovation of the last 100 years, started out as a decentralized phenomenon. The pioneers of those early days used protocols to connect their computers with other machines around the world and built applications like email services and the World Wide Web, hosting the content on their own computers.
The Internet Before Decentralization
The internet is a human construction with its own languages, and these languages have rules and protocols that allow it to function properly. Before these languages were developed, computers were isolated machines with no way to communicate with each other. A structure of interconnections between computers, combined with these communication protocols, lets computers interact with each other.
This interconnected structure is called system architecture and it makes the internet possible. There are a number of different types of architecture but the two most prevalent are client-server and peer-to-peer networks. Of these two, the client-server model dominates the landscape and uses a language called Hypertext Transfer Protocol (HTTP) to communicate. Data is stored in centralized servers that are then accessed using location-based addresses utilizing HTTP.
This centralized server model and HTTP are very effective for certain tasks, such as loading websites and handling text and image files, which once made up the majority of internet traffic. Where speed, latency and throughput matter, centralization has proven to be a useful model.
Because of these strengths, HTTP has dominated the landscape. However, HTTP is not perfect. Specifically, it is not well suited to transferring large data files like audio and video, which is why peer-to-peer networks gained popularity. There is also the issue of server security. Consolidating data in this way means that the risk of data breaches and hacks is huge: all of the data for a general population is stored on a handful of servers under central control. If bad actors gain access to these servers, they can glean, manipulate and delete huge swaths of information.
Decentralization basically means that instead of all actions and operations passing through a single, central point of access, they are spread across a number of different nodes. Each of these nodes forms an independent part of the network and is involved in the storage of data and the protocols used to access and manipulate it.
The first decentralized protocols that the average person came into contact with were file-sharing services like Napster and BitTorrent. These platforms used peer-to-peer (p2p) networks to transfer data from the network of nodes to a user’s computer. Users on the network host content as independent nodes, so the files themselves do not need to live on central servers.
BitTorrent uses p2p and builds upon it, creating a way to download large files with limited bandwidth by sourcing small bits of data across the entire network and downloading them simultaneously, solving the issue of download speed often associated with a client-server model.
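The piece-based download described above can be sketched in a few lines of Python. This is a toy in-memory simulation rather than the BitTorrent wire protocol: the peers are plain dictionaries, and `PIECE_SIZE`, `split_into_pieces` and `download` are hypothetical names chosen for illustration. The one detail borrowed from BitTorrent is that every piece is checked against a SHA-1 hash published ahead of time.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

PIECE_SIZE = 4  # bytes; deliberately tiny (assumed value for the demo)


def split_into_pieces(data: bytes, size: int = PIECE_SIZE) -> list[bytes]:
    """Cut a file into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def download(piece_hashes: list[str], peers: list[dict[int, bytes]]) -> bytes:
    """Fetch every piece in parallel from any peer that holds a valid copy."""

    def fetch(index: int) -> bytes:
        for peer in peers:
            piece = peer.get(index)
            if piece is not None and hashlib.sha1(piece).hexdigest() == piece_hashes[index]:
                return piece
        raise LookupError(f"no peer has a valid copy of piece {index}")

    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map keeps results in index order, so the pieces join correctly
        return b"".join(pool.map(fetch, range(len(piece_hashes))))


original = b"decentralized file sharing"
pieces = split_into_pieces(original)
piece_hashes = [hashlib.sha1(p).hexdigest() for p in pieces]

# Each simulated peer holds only part of the file
peers = [
    {i: p for i, p in enumerate(pieces) if i % 2 == 0},
    {i: p for i, p in enumerate(pieces) if i % 2 == 1},
]
restored = download(piece_hashes, peers)
```

No single peer holds the whole file here, yet the downloader can rebuild it because the pieces are spread across the network and each one can be verified on arrival.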
How Does Decentralized Cloud Storage Work?
HTTP serves the client-server model of data access very well: data is stored in centralized servers and location-based addresses are used to quickly and efficiently access data. But what happens when there aren’t centralized servers and no single address where data is stored? With a different system architecture, this protocol is no longer suitable and a new language must be developed.
One new language is called IPFS, or the InterPlanetary File System, and is an open-source project developed by Protocol Labs. IPFS is a collaborative project with hundreds of developers around the world contributing to its development. With the attention it has gained there are hopes that it can become a new standard in a decentralized internet.
HTTP cannot function properly outside of a client-server model because it relies on location-based addresses to retrieve data, and on a decentralized network there is no single address for a file. IPFS solves this by using content-based addressing: files are found not by IP address and server location but by the data they contain.
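A minimal sketch of content-based addressing, assuming a bare SHA-256 hex digest as the address. Real IPFS uses structured content identifiers (CIDs) rather than a plain digest, and `ContentStore` is a hypothetical class invented for this example, not part of any IPFS library.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the key is derived from the bytes themselves."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Re-hashing lets the requester confirm the node returned the right content
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


# Two independent nodes that happen to store the same file
node_a, node_b = ContentStore(), ContentStore()
addr_a = node_a.put(b"hello, decentralized web")
addr_b = node_b.put(b"hello, decentralized web")
```

Both nodes arrive at the same address without coordinating, so a request for that address can be served by whichever node has the data, and the requester can check the answer by hashing it.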
While IPFS shares some traits with BitTorrent’s decentralized p2p protocol, it also differs in some fundamental ways. BitTorrent is strictly used for p2p file-sharing, while IPFS is intended to replace HTTP entirely. IPFS also practices deduplication, which eliminates redundancy on the network, frees up bandwidth and increases speed. IPFS uses hashing, the cryptographic method also used by the blockchain: files are broken into blocks, and each block is given a unique code derived from its contents. This overlap with blockchain technology makes IPFS well suited to integration with it.
This decentralized model has its own issues with privacy and data security, which are addressed by each of the projects utilizing this protocol. However, the hashing, encryption processes and decentralized architecture that are built into IPFS suggest that it will be more secure than the centralized models currently used. But what incentive is there for users to utilize their computers in order to provide the network with the storage it needs?
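The link between hashing and deduplication can be illustrated with a small block store. This is a sketch under simplifying assumptions: fixed-size blocks, SHA-256 identifiers and the hypothetical helpers `add_file` and `read_file`. It does not reproduce how IPFS actually chunks or links blocks.

```python
import hashlib

BLOCK_SIZE = 8  # bytes; unrealistically small so the demo stays readable (assumed)


def add_file(data: bytes, block_store: dict[str, bytes]) -> list[str]:
    """Split data into blocks, store each block once, return the ordered block ids."""
    block_ids = []
    for start in range(0, len(data), BLOCK_SIZE):
        block = data[start:start + BLOCK_SIZE]
        block_id = hashlib.sha256(block).hexdigest()
        block_store.setdefault(block_id, block)  # an identical block is stored only once
        block_ids.append(block_id)
    return block_ids


def read_file(block_ids: list[str], block_store: dict[str, bytes]) -> bytes:
    """Reassemble a file from its ordered list of block ids."""
    return b"".join(block_store[block_id] for block_id in block_ids)


store: dict[str, bytes] = {}
file_one = b"AAAAAAAABBBBBBBBCCCCCCCC"
file_two = b"AAAAAAAABBBBBBBBDDDDDDDD"
ids_one = add_file(file_one, store)
ids_two = add_file(file_two, store)
```

The two files contain six blocks between them, but the first two blocks are identical, so the store ends up holding only four.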
Decentralization Through Blockchain & IPFS
Since the blockchain is a decentralized protocol by design, it is not surprising that people have found ways to integrate these technologies. There are a number of different projects utilizing the blockchain for decentralized cloud storage, and here we will take a look at a few of the most exciting developments.
FileCoin: Utilizing Unused Storage Through Blockchain
This is a project that comes from Protocol Labs, the same group that developed IPFS, and provides a blockchain-based storage solution to the issue of incentivizing node participation on the IPFS network. The project was developed based on the fact that there are huge amounts of unused storage space across the world’s personal computers, and a way to utilize this idle storage could have profound implications.
Users are able to join the network and rent out the unused space on their hard drives, disks, or data centers. Within the FileCoin ecosystem, there are four different roles. The first is clients: the people paying for their data to be stored across the network. Then there are storage miners, who rent out their space to the clients. Retrieval miners act as intermediaries, shuttling data between storage and the clients in response to requests. Finally, there are full nodes that act as validators for the entire network.
It is only after the data is validated as being correct and transferred successfully that a storage miner is paid for their storage space. This validation is done using cryptographic means, utilizing the blockchain. Clients and storage miners are able to fine-tune their storage strategy to suit their needs as well as the needs of the network. With first-mover status, this is the project that could, with wide-scale implementation, create the ecosystem for a decentralized internet to function.
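The pay-only-after-validation idea can be sketched as follows. This is a deliberately simplified stand-in: the actual network relies on its own storage proofs recorded on the blockchain, whereas this example just compares a SHA-256 hash, and `StorageDeal`, `open_deal` and `settle` are hypothetical names that do not come from any FileCoin software.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class StorageDeal:
    client: str
    miner: str
    price: int
    commitment: str  # hash of the data the client handed over
    paid: bool = False


def open_deal(client: str, miner: str, price: int, data: bytes) -> StorageDeal:
    """Record what the miner has promised to store."""
    return StorageDeal(client, miner, price, hashlib.sha256(data).hexdigest())


def settle(deal: StorageDeal, returned: bytes, balances: dict[str, int]) -> bool:
    """Pay the miner once, and only if the returned data matches the commitment."""
    if deal.paid or hashlib.sha256(returned).hexdigest() != deal.commitment:
        return False
    balances[deal.client] -= deal.price
    balances[deal.miner] += deal.price
    deal.paid = True
    return True


balances = {"client": 100, "miner": 0}
deal = open_deal("client", "miner", 10, b"family photos")
rejected = settle(deal, b"corrupted bytes", balances)  # wrong data: no payment
accepted = settle(deal, b"family photos", balances)    # correct data: miner is paid
```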
Sia: Blockchain-Powered Cloud Network
This is another project which aims to replace the current giants of centralized cloud storage. It functions in a similar fashion to FileCoin, linking renters and storage providers on the network, but Sia differs from FileCoin in a few ways.
Sia has placed a high priority on competitive pricing. Out of the gate, Sia is 70% cheaper than centralized cloud storage services like Amazon, Dropbox and Google. Sia also engineered competitiveness into its model. Hosts show their geography, speed, latency and price, allowing renters to choose the host that best suits their needs for each transaction. According to Sia, this will put downward pressure on price while rewarding quality hosts.
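One way a renter might act on the advertised host details is to filter on latency and speed and then take the cheapest remaining host. The sketch below illustrates that idea only; the `Host` fields, the thresholds and `pick_host` are assumptions made for the example and are not Sia's actual host-scoring logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Host:
    name: str
    region: str
    latency_ms: float
    speed_mbps: float
    price_per_tb: float  # monthly price in an arbitrary unit (assumed)


def pick_host(hosts: list[Host], max_latency_ms: float, min_speed_mbps: float) -> Optional[Host]:
    """Return the cheapest host that meets the renter's latency and speed limits."""
    eligible = [
        h for h in hosts
        if h.latency_ms <= max_latency_ms and h.speed_mbps >= min_speed_mbps
    ]
    return min(eligible, key=lambda h: h.price_per_tb, default=None)


hosts = [
    Host("host-a", "eu", 40.0, 100.0, 2.5),
    Host("host-b", "us", 120.0, 500.0, 1.0),
    Host("host-c", "eu", 35.0, 80.0, 2.0),
]
choice = pick_host(hosts, max_latency_ms=50.0, min_speed_mbps=50.0)
```

Here the cheapest host overall is excluded for being too slow to respond, so the renter ends up with the cheapest host that still meets the quality bar, which is the kind of price pressure the model is meant to create.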
Sia also places an emphasis on security: low-cost data storage is worth nothing if it is not secure. One tool that Sia utilizes to reduce the loss of data is called Reed-Solomon redundancy. Each file is encoded into pieces spread across 30 different devices around the world, and only 10 of those devices need to be online at any given moment to access the data. Since it is extremely unlikely that more than 20 of these machines will be offline or compromised at the same time, the odds of data being lost are astronomically low.
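The "astronomically low" claim can be checked with some quick arithmetic. The sketch below assumes that every host is online independently with the same probability, which is an idealization, and the 90% uptime figure is a made-up input rather than a number published by Sia.

```python
from math import comb


def loss_probability(n: int, k: int, p_online: float) -> float:
    """Probability that fewer than k of n independent hosts are online."""
    return sum(
        comb(n, i) * p_online**i * (1 - p_online) ** (n - i)
        for i in range(k)
    )


# 30 hosts, any 10 of which are enough to rebuild the data
p_unavailable = loss_probability(n=30, k=10, p_online=0.9)

# For comparison: a single copy on a single host with the same uptime
p_single = loss_probability(n=1, k=1, p_online=0.9)

# Raw storage used per byte of user data
overhead = 30 / 10
```

Under these assumptions the data is unreachable only if 21 or more hosts are down at once, which is many orders of magnitude less likely than one host being down. The price is a threefold storage overhead.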
Sia also utilizes high-level encryption at several different points for each piece of data. Every separate piece of data has its own passcode and is encrypted on each individual machine. Renters have these passcodes and own all of their data; no third parties—not even the hosts—can access a renter’s data.
This encryption, together with smart contracts that ensure hosts and renters fulfill their ends of the deal and a Merkle tree proof of storage built on 64-byte segments, makes for a very secure cloud. Sia is an interesting project and, with its focus on competition and security, is definitely worth watching.
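A Merkle tree proof of storage can be sketched like this. The 64-byte segment size is taken from the text; everything else is an assumption for illustration: SHA-256 as the hash function, the one-byte leaf and node prefixes, duplicating the last node on odd-sized levels (a common simplification that production designs avoid), and the helper names `build_levels`, `make_proof` and `verify`.

```python
import hashlib

SEGMENT_SIZE = 64  # bytes per leaf, as mentioned in the text


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(data: bytes) -> list[list[bytes]]:
    """Return every level of the tree, leaves first and the root level last."""
    segments = [data[i:i + SEGMENT_SIZE] for i in range(0, len(data), SEGMENT_SIZE)] or [b""]
    level = [_h(b"\x00" + s) for s in segments]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # pad odd levels by repeating the last node
            levels[-1] = level
        level = [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def make_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect sibling hashes from leaf to root; the flag marks a left-hand sibling."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(segment: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from one segment and its proof."""
    node = _h(b"\x00" + segment)
    for sibling, sibling_is_left in proof:
        node = _h(b"\x01" + (sibling + node if sibling_is_left else node + sibling))
    return node == root


data = bytes(range(256)) + b"tail"  # five segments, the last one short
levels = build_levels(data)
root = levels[-1][0]                # the renter keeps only this value

challenge = 3                       # the segment the host is asked to prove
segment = data[challenge * SEGMENT_SIZE:(challenge + 1) * SEGMENT_SIZE]
proof = make_proof(levels, challenge)
ok = verify(segment, proof, root)
```

The renter only needs to remember the root hash. A host that has discarded or altered the challenged segment cannot produce a segment and proof that hash back to that root.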
Storj: A New Evolution Of Cloud Storage
Another interesting development utilizing the blockchain for decentralized cloud storage is Storj (pronounced storage) and its Tardigrade project. Like Sia, this project uses sharding as a means to ensure the security of the data stored on its network. Data files are split into a number of smaller pieces in standardized sizes of either 8 or 30 MB. This process not only increases security but also increases privacy while improving the overall functioning of the network.
Data is encrypted by clients before it is transferred to storage space on the drives of users, who are called Farmers. Each individual shard gets a hash, with identifying data stored on a distributed hash table on the blockchain. The distribution of shards ensures that no one Farmer ever controls a complete file, which improves data security.
Storj also uses the Reed-Solomon algorithm to protect against lost data due to node failure: this algorithm can recreate a lost file from as few as 50% of the shards. Proof of integrity is maintained by hourly audits that are performed on files using Merkle Trees. Farmers reply to the Merkle Tree query with an answer that can only come if all files are stored properly on their drives, and it is only then that they are paid.
This project comes from an experienced team with a history in the crypto industry going back to 2014 and offers the cheapest storage rental prices of any of these projects: prices start at $0.015 per GB per month. With a strong market presence and continued innovation Storj is an interesting prospect in the race for the adoption of this decentralized tech.
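The recovery property of Reed-Solomon coding can be shown with a small polynomial-based erasure code. This is a teaching sketch, not Storj's implementation: it works over the prime field of size 257 instead of the byte-sized field production libraries use, so shard symbols are kept as integers, and `encode`, `decode` and the 4-of-8 parameters are assumptions chosen to match the "50% of the shards" figure.

```python
P = 257  # smallest prime above 255, so every byte value is a field element


def _interpolate(points: list[tuple[int, int]], x: int) -> int:
    """Evaluate at x the unique polynomial through the given points, modulo P."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = num * (x - xm) % P
                den = den * (xj - xm) % P
        total = (total + yj * num * pow(den, -1, P)) % P
    return total


def encode(data: bytes, k: int, n: int) -> list[list[int]]:
    """Return n shards; shard x holds one symbol for each k-byte group of data."""
    padded = data + bytes(-len(data) % k)
    shards: list[list[int]] = [[] for _ in range(n)]
    for start in range(0, len(padded), k):
        points = list(enumerate(padded[start:start + k]))
        for x in range(n):
            shards[x].append(_interpolate(points, x))
    return shards


def decode(available: dict[int, list[int]], k: int, length: int) -> bytes:
    """Rebuild the original bytes from any k surviving shards."""
    if len(available) < k:
        raise ValueError("not enough shards to recover the data")
    chosen = list(available.items())[:k]
    out = bytearray()
    for group in range(len(chosen[0][1])):
        points = [(x, shard[group]) for x, shard in chosen]
        out.extend(_interpolate(points, i) for i in range(k))
    return bytes(out[:length])


data = b"shards survive node failure"
shards = encode(data, k=4, n=8)

# Simulate four of the eight nodes failing
survivors = {x: shards[x] for x in (1, 3, 4, 6)}
recovered = decode(survivors, k=4, length=len(data))
```

Any four of the eight shards are enough to rebuild the data, so half of the nodes can disappear without the file being lost.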
There is a strong push for a decentralized future. This technology allows for an egalitarian development of further innovations and helps bring the power back to the people. Beyond any philosophical arguments about the evils of centralized data control, there are some real-world examples of the way that a decentralized cloud can benefit people.
This is apparent in the way some nations have dealt with censorship and data manipulation. The consolidation of data gives governments an easy and nearly absolute way to control the information a population has access to. There have been many examples of state internet censorship around the world, with notable cases in China and Turkey. China has blocked many social media platforms and replaced them with its own, highly surveilled versions, while Turkey banned Wikipedia outright, claiming it was a threat to national security. These scenarios, along with the implications of hacking these massive servers, make for a strong case in favor of decentralization.
There are myriad reasons why a decentralized ecosystem is beneficial for all parties involved, excluding big tech firms that rake in cash for centralized server space and authoritarian regimes. Everything from security and cost to ideology and philosophy offers a valid argument for decentralizing data storage. The development of these new technologies brings us closer to taking the reins from monolithic corporations and developing a system that provides users with the freedom to grow and create in new and exciting ways.