We just launched the open source Qbix Platform 1.0. And we worked hard to make it among the most secure open source web platforms out there. However, we are not about to rest on our laurels. We’ve already begun developing Qbix Platform 2.0!
While we were building Qbix Platform 1.0 as a Web-based platform, we watched the revolution taking place in crypto-currencies and distributed systems, and we’ve had extensive conversations with teams and architects from the Solid project, the SAFE Network and Ripple, as well as researchers at MIT and NYU. We’ve also noticed that the Web Crypto API is now available in all major browsers, including mobile ones.
This presents an opportunity to reinvent everything, again. But in a backwards-compatible way. All sites and apps built with Qbix Platform 1.0 will work on 2.0, but will be far more secure and resilient. Yes, more secure than Equifax, or even Yahoo. All open source, available for everyone to use, out of the box. The trick? Going “serverless”.
What is needed
Until now, the Web had one major weakness: you had to trust the server. Trust it to store your data and not modify it without your permission. Trust it to respect privacy settings on your photos and content. Trust it to serve you the right data and executable scripts every time. And as we can see from the epic breaches at Yahoo, Equifax, etc. or Facebook’s privacy scandals, the trust doesn’t always pay off. While a few issues have been addressed, trusting a bunch of random servers on the internet with your data, identity and brand is dangerous, even if they are “in the cloud”. The more web services you use, the more it becomes an issue. This is even truer if you are a developer.
So we are in the process of architecting an entire end-to-end solution for Qbix Platform 2.0, one that will give people greater confidence that their information is secure and that they are in full control. It will also form the basis for the Intercoin Distributed Ledger Technology.
How Qbix Platform 2.0 will work
The next version of Qbix Platform is architected around several core principles:
- Client-Side Crypto: All decryption happens on the client side. Private session keys should never leave the device, and any keys that are stored on servers are themselves encrypted.
- Chunking: Break up large files or revision histories into smaller chunks connected by a Merkle Tree. This way, encryption and decryption can be vastly sped up via parallelization, and encryption may even be strengthened by certain techniques (see the chunking sketch after this list).
- Distributed Systems: Take advantage of the latest innovations in byzantine consensus to make the system less about the servers and more about the data.
- Seamless Social: People should be able to maintain accounts across communities, discover their friends, and interact with their posts in one seamless and secure experience.
- Backward Compatibility: Make sure that all the apps and plugins developed and hosted by communities can be seamlessly transitioned to the new architecture.
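To make the Chunking principle concrete, here is a minimal sketch, in TypeScript using the browser’s Web Crypto API, of splitting data into chunks, hashing the chunks in parallel and combining the hashes into a Merkle root. The chunk size and function names are our own illustrative choices, not part of the Qbix Platform API:

```typescript
// Hypothetical sketch of the Chunking principle: split data into fixed-size
// chunks, hash each chunk in parallel, and pair the hashes up into a Merkle root.
// CHUNK_SIZE and the function names are illustrative assumptions.

const CHUNK_SIZE = 256 * 1024; // 256 KiB per chunk (arbitrary choice)

async function sha256(data: Uint8Array): Promise<Uint8Array> {
  return new Uint8Array(await crypto.subtle.digest("SHA-256", data));
}

function splitIntoChunks(data: Uint8Array): Uint8Array[] {
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < data.length; offset += CHUNK_SIZE) {
    chunks.push(data.subarray(offset, offset + CHUNK_SIZE));
  }
  return chunks;
}

function concat(a: Uint8Array, b: Uint8Array): Uint8Array {
  const out = new Uint8Array(a.length + b.length);
  out.set(a, 0);
  out.set(b, a.length);
  return out;
}

async function merkleRoot(data: Uint8Array): Promise<Uint8Array> {
  // Leaf hashes can be computed concurrently; the same parallelism applies
  // when each chunk is encrypted or decrypted independently.
  let level = await Promise.all(splitIntoChunks(data).map(sha256));
  if (level.length === 0) return sha256(new Uint8Array(0)); // empty input
  while (level.length > 1) {
    const next: Promise<Uint8Array>[] = [];
    for (let i = 0; i < level.length; i += 2) {
      const right = level[i + 1] ?? level[i]; // duplicate the last node if odd count
      next.push(sha256(concat(level[i], right)));
    }
    level = await Promise.all(next);
  }
  return level[0];
}
```

Because each chunk’s hash (or ciphertext) depends only on that chunk, the work parallelizes cleanly, and any chunk can later be verified against the root without downloading the whole resource.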
Here are the actual details on how it will work. First, a few concepts:
- Domain: A domain is a set of servers that is accessible on the Internet (or a local network). Unlike the Web, domains can use DNS but can also use other routing systems.
- Access: Session tokens and other resource tokens are given out by a domain and signed by its secret key. This allows domains to reject requests with improperly signed access tokens quickly, and without doing any hard-disk access or lookups.
- Identity: This is defined as a set of (domain, userKey) pairs. Someone’s FULL identity is the union of all the pairs under their control. Each domain may know only a subset of your full identity. Domains can host one or more users, each of whom can be a person, an entire community or an app.
- External Users: Each domain can also have external user IDs to refer to other users who make guest accounts. For example, unlike OAuth, an app which wants access to your personal information would need to make a guest account on the domain where you store that information. In accordance with the Client-Side Crypto principle, you’d see a dialog similar to OAuth’s which says “such-and-such app (user) wants XYZ access to your streams ABC”.
- Streams: Streams are essentially blockchains published by users. They contain data and have rules for accessing and modifying it. They can be used for chats, chess games, collaborative editing, blogs, anything really.
- Pull vs Push: From time to time, someone may want to load an entire stream or resource. But often, they just want to be notified of updates and changes. The latter is referred to as “push”, and is implemented using webhooks, while routing can use DNS or even be peer-to-peer.
- Rules: Rules are functions written in deterministic JavaScript and executed in a VM. They govern reading, writing, inviting people, the rules of a game, or whatever else. Each change made by a user must be validated against the rules that are in effect at the time (see the rule sketch after this list).
- Validators: Servers that validate the entire history of a stream and sign claims stating either that they found a violation or that everything is fine.
- Governance: Rules provide an expressive and powerful way to implement governance, whether of a simple document or an entire organization. Everyone who can see the stream can see and execute the rules, and everyone who can see its history can see when rules were added and removed; each operation subject, of course, to the previous rules. They can then verify that no violations of the rules were committed in the entire history of the stream, which is very useful for crypto-currencies.
- Replication: A stream can be replicated on other servers and domains. This can be done either via pull or via incremental push. It should be done in a way that’s similar to (and compatible with) the Scuttlebutt protocol. Such replication can happen peer-to-peer and resist censorship, even on the level of an entire network.
- Uniqueness: A replicated stream may not be replicated exactly the same way across all computers, because its publishers may generate multiple valid histories and send different versions to different servers. The servers have to gossip with one another to discover such “forks” and enforce uniqueness.
- Watchers: A watcher is a server that participates in gossip with other watchers to make sure a stream has only one “head”. This forms the basis for Intercoin’s crypto-currency implementation, as watchers from various communities can keep an eye on the servers (or groups of servers running consensus) storing and reporting a particular transaction. For each transaction, the group of watchers can be found on a Kademlia tree using information derived from the transaction’s contents. This includes nonces supplied by the various parties to the transaction, so the set of watchers is essentially unpredictable in advance (making it harder to induce them to collude). Each watcher for a transaction needs to know all the other watchers and exactly how large the quorum will be, because after a certain point, forks can no longer invalidate a transaction.
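As a concrete illustration of Rules and Validators, here is a minimal sketch of a rule as a pure, deterministic function and a validator that replays a stream’s history against the rules in effect. It is written in TypeScript for readability rather than the deterministic JavaScript dialect the platform will actually run, and all types and names below (Operation, StreamState, onlyWritersMayPost) are our own illustrations:

```typescript
// Hypothetical sketch: a rule is a pure, deterministic function of the current
// stream state and a proposed operation -- no Date.now(), Math.random(), or I/O.
// All names below are illustrative, not the platform's actual API.

interface Operation {
  type: string;      // e.g. "post", "invite"
  byUserId: string;  // a (domain, userKey) pair, abbreviated here to a string
  payload?: unknown;
}

interface StreamState {
  writers: Set<string>; // users currently allowed to post
}

type Rule = (state: StreamState, op: Operation) => boolean;

// Example rule: only current writers may post to the stream.
const onlyWritersMayPost: Rule = (state, op) =>
  op.type !== "post" || state.writers.has(op.byUserId);

// Reducer that applies an accepted operation to the state.
function applyOp(state: StreamState, op: Operation): StreamState {
  if (op.type === "invite" && typeof op.payload === "string") {
    return { writers: new Set(state.writers).add(op.payload) };
  }
  return state;
}

// A validator replays the entire history, checking every operation against the
// rules in effect at that time, and reports the first violation it finds (if any).
function validate(
  history: Operation[],
  rules: Rule[],
  initial: StreamState
): { valid: boolean; violationAt?: number } {
  let state = initial;
  for (let i = 0; i < history.length; i++) {
    const op = history[i];
    if (!rules.every((rule) => rule(state, op))) {
      return { valid: false, violationAt: i };
    }
    state = applyOp(state, op);
  }
  return { valid: true };
}
```

A fuller version would also treat adding and removing rules as operations in the stream, each validated against the rules previously in effect, which is exactly what makes the Governance item above auditable.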
Now, the crypto architecture for identity:
- Keys: Cryptographic keys are used for encrypting, decrypting and signing information. They are either symmetric (all parties using the key know it) or asymmetric (a public-private key pair). The latter is susceptible to attacks by quantum computers, but NIST is currently in the process of standardizing several quantum-resistant alternatives.
- Session Keys: A device may host several users, each of whom has their own session keys for each domain. Session keys are always stored encrypted on the device. Cryptographic keys are derived from passcodes and used to decrypt the session keys (see the sketch after this list). The decrypted version may be stored and used while the user is actively using the device, and deleted after activity ceases for some timeout period.
- Passcodes: Users never store any passwords on the server. Instead, they use passcodes to decrypt local “Session Keys” stored on their device. The passcodes don’t have to be especially strong (e.g. they can be 4 digits) because the physical device is needed, and the device can apply rate limiting after a few failed attempts. Neither the passcodes nor the keys derived from them should ever be stored by the device.
- Biometrics: Today, solutions like TouchID and FaceID allow unlocking passcodes with biometrics, making access even easier. With something like VoiceID or GaitID, people would be able to seamlessly approach a door or screen and interact with it. The more biometrics together, the better; but this is up to the makers of operating systems.
- Native Sync: Users may choose to sync their session keys using the operating system’s keychain services, such as those provided by Apple’s iCloud. That way, the operating system would restore all their accounts on a new device. Convenient as this is, it is far more secure to choose not to do it.
- Keychain: A user’s keychain is a list consisting of all the session keys on all their devices, as well as “helper keys” to revoke or restore a device. All the domains the user has an account with will store this list, and subscribe to Push notifications about its changes. It is essentially a Replicated Stream, but it doesn’t need Uniqueness because when it comes to identity, “double-spending” is not a threat that needs to be eliminated.
- Revocations: When a device is compromised and needs to be revoked, this happens via a push update to all the domains. The update is signed by M of N keys in the keychain. Similarly, the keychain stream can add new devices. If you lose all your devices, you can still restore your identity: some keys can be “helper keys” held by friends who help you restore an identity after losing your phone. Together with a key derived from a passphrase you remember (or store with biometrics), you can bootstrap your control of your keychain and update your identity on all domains.
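To make the Session Keys and Passcodes items above concrete, here is a minimal sketch of the passcode-derivation step using the Web Crypto API available in browsers. It assumes, purely for illustration, that the session key is an ECDSA P-256 private key stored on the device wrapped with AES-GCM under a key derived from the passcode via PBKDF2; the actual formats and parameters are not yet specified:

```typescript
// Hypothetical sketch: derive a key from a short passcode with PBKDF2 and use it
// to unwrap (decrypt) the session key stored on the device. The storage format
// (salt, iv, wrappedKey), the iteration count and the choice of ECDSA P-256 for
// session keys are our own assumptions, not the platform's specification.

interface StoredSessionKey {
  salt: Uint8Array;        // random salt stored alongside the wrapped key
  iv: Uint8Array;          // AES-GCM initialization vector
  wrappedKey: Uint8Array;  // session private key, encrypted under the derived key
}

async function deriveWrappingKey(passcode: string, salt: Uint8Array): Promise<CryptoKey> {
  const material = await crypto.subtle.importKey(
    "raw", new TextEncoder().encode(passcode), "PBKDF2", false, ["deriveKey"]
  );
  return crypto.subtle.deriveKey(
    { name: "PBKDF2", salt, iterations: 310_000, hash: "SHA-256" },
    material,
    { name: "AES-GCM", length: 256 },
    false, // the derived key is never extractable and never stored
    ["encrypt", "decrypt"]
  );
}

// Decrypts the session key on the client; the result stays in memory only while
// the user is active, and neither the passcode nor the derived key is persisted.
async function unlockSessionKey(passcode: string, stored: StoredSessionKey): Promise<CryptoKey> {
  const wrappingKey = await deriveWrappingKey(passcode, stored.salt);
  const pkcs8 = await crypto.subtle.decrypt(
    { name: "AES-GCM", iv: stored.iv }, wrappingKey, stored.wrappedKey
  );
  // Import the private key for signing requests to domains (Client-Side Crypto).
  return crypto.subtle.importKey(
    "pkcs8", pkcs8, { name: "ECDSA", namedCurve: "P-256" }, false, ["sign"]
  );
}
```

The rate limiting after a few failed passcode attempts described above would sit in front of unlockSessionKey on the device.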
And finally, the crypto architecture for interacting with the servers:
- User Key: Each user on a domain has a User Key, which (as usual) is only used for decrypting things on a client. Multiple copies of this key are stored, each encrypted by a different Session Key, in a User KeyFile. A client device obtains the User Key by sending a signed Request with the Session Key’s Public Key, leading the domain’s servers to locate the copy of the User Key that only the corresponding Private Key can decrypt and send it to the client. Every time a user’s Keychain changes, their devices get the list of domains and report to each domain the new User KeyFile for that domain.
- Database: Information at rest is stored on the server in files and databases. The path of a file, or the primary key of a row, may consist of encrypted information. But sometimes the information is structured (i.e. items are related to each other by their content), in which case the file path or primary key may not be encrypted; the contents, however, should still be encrypted.
- Access Keys: Information transmitted to the server is already encrypted on the client with Access Keys. In accordance with the Chunking principle, it is often split into smaller chunks, and each chunk is encrypted and decrypted in parallel (see the sketch after this list). Access Keys represent the access of one or more domain users to a resource on the domain’s servers. Sometimes, multiple Access Keys are used in a particular order to encrypt (chunks of) a resource, and in the reverse order to decrypt it.
- Obtaining Access Keys: Clients obtain Access Keys with User Keys, similarly to how they obtain User Keys with Session Keys. An Access KeyFile is stored for each user, containing all the Access Keys for the various resources (or groups of resources) on the server that the user can load to their client. These can include files, streams, and so on. Whenever a User Key changes for a user on a particular domain, a new Access KeyFile must be generated by the client and sent to all the domains. In particular, this means that updating the Keychain may result in many updates to the various domains the user is a part of. These updates happen infrequently, but the devices must therefore store the previous keys in order to obtain the current KeyFiles from the servers and re-encrypt them.
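As an illustration of the Chunking principle applied to Access Keys, here is a minimal sketch of encrypting and decrypting a resource chunk by chunk with AES-GCM via the Web Crypto API. The chunk size, the EncryptedChunk record and the assumption that an Access Key is an AES-GCM CryptoKey are ours, not the platform’s wire format:

```typescript
// Hypothetical sketch: encrypt a resource under an Access Key chunk by chunk,
// so chunks can be processed in parallel and later decrypted independently.
// CHUNK_SIZE and the EncryptedChunk shape are illustrative assumptions.

const CHUNK_SIZE = 256 * 1024; // 256 KiB per chunk

interface EncryptedChunk {
  index: number;          // position of the chunk within the resource
  iv: Uint8Array;         // unique AES-GCM IV per chunk
  ciphertext: Uint8Array;
}

async function encryptChunked(accessKey: CryptoKey, data: Uint8Array): Promise<EncryptedChunk[]> {
  const jobs: Promise<EncryptedChunk>[] = [];
  for (let index = 0, offset = 0; offset < data.length; index++, offset += CHUNK_SIZE) {
    const chunk = data.subarray(offset, offset + CHUNK_SIZE);
    const iv = crypto.getRandomValues(new Uint8Array(12)); // never reuse an IV with the same key
    jobs.push(
      crypto.subtle
        .encrypt({ name: "AES-GCM", iv }, accessKey, chunk)
        .then((buf) => ({ index, iv, ciphertext: new Uint8Array(buf) }))
    );
  }
  return Promise.all(jobs); // all chunks are encrypted concurrently
}

async function decryptChunked(accessKey: CryptoKey, chunks: EncryptedChunk[]): Promise<Uint8Array> {
  const plain = await Promise.all(
    chunks
      .slice()
      .sort((a, b) => a.index - b.index)
      .map((c) =>
        crypto.subtle
          .decrypt({ name: "AES-GCM", iv: c.iv }, accessKey, c.ciphertext)
          .then((buf) => new Uint8Array(buf))
      )
  );
  // Reassemble the plaintext in order.
  const out = new Uint8Array(plain.reduce((n, p) => n + p.length, 0));
  let offset = 0;
  for (const p of plain) {
    out.set(p, offset);
    offset += p.length;
  }
  return out;
}
```

Layering several Access Keys, as described above, would amount to applying the encryption step once per key in the agreed order and the decryption step once per key in the reverse order.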
Although we have just begun to follow our roadmap towards realizing this vision, we have already spent many months finalizing the architecture. The resulting software will be open source, so anyone can build on top of it. Beyond that, we plan to develop a comprehensive set of specs and standards to let other web-based platforms upgrade their software and realize this vision in a compatible way.