[This article originally appeared in NODE Vol 01.]
Dat (https://datproject.org) is an open-source data distribution tool for easily duplicating and version-controlling different data sets. First released in 2013, Dat combines aspects of Git, BitTorrent, and cloud-storage sites like Dropbox to create an easy-to-use method for reliably sharing files without needing additional infrastructure.
HOW IT WORKS
Three important design features of Dat are in-place archiving, file versioning, and the use of a distributed network.
Most people use Dat by installing the official Dat application and running its commands in a terminal. Download links for pre-built Dat GUI packages, along with installation instructions, are available on the official website at https://datproject.org.
With Dat installed, let's walk through how a user interacts with it.
1) Alice has a folder of documents on her system that she wants to share with her friend Bob. She navigates to the directory via her terminal, and issues a `dat share` command to create a new Dat address that corresponds to her directory:
$ dat share
Created new dat
Sharing dat: 96 files
Behind the scenes, Alice's data is broken up into small pieces, hashed, and arranged into a data structure called a Merkle tree. The resulting data is then easily referenced by a Dat address that looks like this:
dat://7c46dafaa44098d5d439812ed6300036eaa85bbd9422ee677dbbdc722710a231
Dat addresses are hexadecimal representations of a public key that corresponds to the shared data. Any user with the address can clone the data repository, but only the user with the accompanying private key can make changes to it.
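To make the "pieces, hashes, Merkle tree" idea concrete, here is a minimal Python sketch of a generic Merkle tree. It is an illustration under stated assumptions, not Dat's actual implementation: the chunk size, the choice of BLAKE2b, the leaf/parent prefixes, and the function names `chunk`, `leaf_hash`, `parent_hash`, and `merkle_root` are all picked for this example. It also stops at the root hash and does not model the key pair behind a Dat address.

```python
import hashlib

CHUNK_SIZE = 4  # deliberately tiny so the example is easy to follow (assumption)


def chunk(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Split data into fixed-size pieces; the last piece may be shorter."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def leaf_hash(piece: bytes) -> bytes:
    # Leaves and parents get different prefixes so one cannot pass for the other.
    return hashlib.blake2b(b"\x00" + piece, digest_size=32).digest()


def parent_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.blake2b(b"\x01" + left + right, digest_size=32).digest()


def merkle_root(pieces: list) -> bytes:
    """Hash every piece, then hash pairs upward until a single root remains."""
    if not pieces:
        raise ValueError("need at least one piece")
    level = [leaf_hash(p) for p in pieces]
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                next_level.append(parent_hash(level[i], level[i + 1]))
            else:
                next_level.append(level[i])  # an unpaired node is carried up unchanged
        level = next_level
    return level[0]


original = merkle_root(chunk(b"hello, bob!"))
tampered = merkle_root(chunk(b"hello, bOb!"))
print(original.hex())
print(original != tampered)  # True: changing one byte changes the root
```

The useful property is that a single short root hash commits to every piece beneath it, so a downloader can check pieces as they arrive from peers it has no reason to trust.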
2) Alice then sends her newly-made Dat link over to Bob through any standard communication channel. Bob brings up his terminal and runs the following command to clone Alice's data into a directory of his choosing:
$ dat clone dat://7c46dafaa44098d5d439812ed6300036eaa85bbd9422ee677dbbdc722710a231 alices-data
Created new dat
Cloning: 96 files
Behind the scenes, Bob's Dat instance uses a combination of DNS and a distributed hash table (DHT) to find Alice, or any other peers to whom she has given the address for mirroring. Using two protocols, Hypercore and Hyperdrive, Dat then connects to the machines seeding the files and downloads them.
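Before any peer lookup can happen, a client has to pull the key out of the link Bob was given. The Python sketch below shows one way that step might look. The function name `parse_dat_address` is hypothetical, and treating the `dat://` prefix and a trailing slash as optional is an assumption made for this example; the only thing taken from the article is the shape of the address itself, 64 hexadecimal characters.

```python
import re

# 64 hex characters decode to 32 bytes, matching the address shown in this article.
# Making the "dat://" prefix and trailing slash optional is an assumption.
_ADDRESS = re.compile(r"^(?:dat://)?([0-9a-fA-F]{64})/?$")


def parse_dat_address(link: str) -> bytes:
    """Return the raw key bytes encoded in a Dat link, or raise ValueError."""
    match = _ADDRESS.match(link.strip())
    if match is None:
        raise ValueError(f"not a Dat address: {link!r}")
    return bytes.fromhex(match.group(1))


key = parse_dat_address(
    "dat://7c46dafaa44098d5d439812ed6300036eaa85bbd9422ee677dbbdc722710a231"
)
print(len(key))  # 32
```

Rejecting malformed links up front means a typo in a pasted address fails immediately instead of turning into a lookup that never finds any peers.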
3) Suppose Alice updates a file within her directory and wants to get that change reflected in her Dat. She can do this easily by rerunning the `dat share` command within her existing directory:
$ dat share
4) Now, Bob needs to update his local store as well. This can be done easily with the `dat sync` command:
$ dat sync
Dat is most similar to other decentralized file storing/sharing tools like IPFS and version control systems like Git.
Much like Git, Dat has built-in file versioning to track the history of changes made to data. Dats (the repositories) are created and updated much as Git repositories are, only without the reliance on a central server.
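One way to picture this kind of versioning is as an append-only history: every change adds an entry, nothing is overwritten, and any earlier state can be rebuilt by replaying the entries up to that point. The Python sketch below is a toy model of that idea only. The class `VersionedFolder` and its methods are hypothetical names for this example and do not reflect Dat's storage format or API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VersionedFolder:
    """Toy append-only history: each write adds an entry and bumps the version."""

    log: list = field(default_factory=list)  # list of (path, content) entries

    @property
    def version(self) -> int:
        return len(self.log)

    def write(self, path: str, content: bytes) -> int:
        """Record a change and return the new version number."""
        self.log.append((path, content))
        return self.version

    def checkout(self, version: Optional[int] = None) -> dict:
        """Rebuild the folder as it looked at `version` (default: latest)."""
        if version is None:
            version = self.version
        if not 0 <= version <= self.version:
            raise ValueError(f"unknown version: {version}")
        files = {}
        for path, content in self.log[:version]:
            files[path] = content  # later entries replace earlier ones
        return files


folder = VersionedFolder()
v1 = folder.write("notes.txt", b"first draft")
v2 = folder.write("notes.txt", b"second draft")
print(folder.checkout(v1)["notes.txt"])  # b'first draft'
print(folder.checkout()["notes.txt"])    # b'second draft'
```

Because old entries are never rewritten, a peer that already holds version 1 only needs the entries added since then to catch up.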
The concept of addressable, decentralized data sharing is where Dat and IPFS share some overlap. Each protocol allows a user to create a unique hash to identify data and share it with others to completely clone the data it corresponds to. Dat currently has the added benefit of native browser support with the Beaker browser, allowing the easy creation of Dat-based websites. Websites can be made and hosted in either a user's filesystem with the aid of the `dat` application, or via the Beaker browser itself.
Dat does have a few downsides worth mentioning. While addresses are touted as secure and non-guessable, they are relatively public and unprotected once found. While a bad actor on the network might not be able to target a specific user directly, there is nothing stopping someone from randomly guessing addresses to see if they correspond to data. Dat is targeted at the scientific/research community with an aim of sharing public data. If you want to share private data via Dat, it is advisable that you encrypt your data first.
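To put random guessing in perspective, the short calculation below sizes the address space, assuming every address is 64 hexadecimal characters like the one shown earlier. The attacker speed of one trillion guesses per second is a made-up figure chosen only to make the arithmetic concrete.

```python
HEX_DIGITS = 64                      # length of the address shown in this article
keyspace = 16 ** HEX_DIGITS          # number of possible 64-character hex strings

guesses_per_second = 10 ** 12        # hypothetical attacker speed (assumption)
seconds_per_year = 60 * 60 * 24 * 365
years_to_try_all = keyspace // (guesses_per_second * seconds_per_year)

print(keyspace == 2 ** 256)          # True
print(f"{years_to_try_all:.2e} years to enumerate every address")
```

Under these assumptions an exhaustive search would take on the order of 10^57 years, so blind guessing is a long shot. Encryption matters mainly because anyone who does obtain an address, by whatever route, can read the data behind it.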
Dat also does not have a global peer swarm. Peers do not come together to join a global network, and instead create smaller, focused networks around pieces of data. This could be considered beneficial as there is less overhead in terms of peer connections, but it also makes it difficult for applications to simply "connect into Dat," leaving applications to implement their own way of interfacing with Dat.
Further, it is important to note that if the original data owner goes offline or closes their client, their Dat address will not work if there are no other peers hosting it. However, once the data is cloned, it will remain on a user's machine even if the original creator disappears from the network. Services like Hashbase offer free and paid storage options for mirroring Dats to keep them online indefinitely.
So that's the 101 on the Dat Project. It's an innovative, easy-to-use tool for storing and sharing data, with many similarities to version control systems.
There are already some active projects using Dat, including http://sciencefair-app.com, a project for sharing scientific literature, and the previously mentioned Beaker browser, a browser for creating and browsing Dat-based websites.
BY MIKE DANK