Version: 0.1.0 (released 7th Feb 2016)
This version:
Latest published version:
Editors:
Contributors:
Repository:
This document is published as a working draft after preliminary discussion in the diaspora* user migration issue and the relevant Loomio discussion. The original spec draft was published in the diaspora* wiki.
Work on this specification should be done via issues and pull requests in the git repository. Comments can also be given in other ways, for example via The Federation mailing list or the earlier discussions linked above. However, issues and pull requests are the recommended way to participate.
This specification document should follow Semantic Versioning, with 1.0 released once the first version has been accepted by both the editor and a possible implementing platform.
Specification to deal with two common problems with decentralized social sites:
These two problems create a lack of identity security and a lack of continuity for users of these social networks.
The purpose of this specification is to provide means to protect the identity of users, not actual content. As such, content like posts, comments, likes, photos, or any other content type objects are not in scope of this specification.
NOTE! This specification assumes that servers use public and private keys to verify authorship of content. A platform that does not yet do so should adopt such key-based methods when implementing this specification.
Term | Explanation |
---|---|
User/identity | An object that is backed up or restored |
Server | A server that is home to the user/identity. Called ‘pods’ for example in diaspora*. |
Handle | A network-wide unique identifier for the identity, for example user@domain.tld or https://domain.tld/user . This could also be a GUID, but a human-friendly identifier should be preferred to allow a user-friendly restore process. |
The following use cases are related to and describe the flow of actions in this specification.
Servers which support this specification should publicize a JSON endpoint at .well-known/x-acc-backup-restore. The response from this endpoint should conform to the following schema:
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "id": "https://the-federation.info/specs/backup-restore#backup_server_discovery",
  "type": "object",
  "properties": {
    "allow_backups": {
      "type": "boolean"
    },
    "allow_new_backups": {
      "type": "boolean"
    }
  },
  "required": [
    "allow_backups"
  ]
}
Servers should query other known servers frequently (at least once per week) to refresh this information.
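As a rough illustration of how a querying server might consume the discovery document, the following Python sketch checks a response body against the schema above and decides when a cached result is due for a refresh. The function names are hypothetical, and the fallback for a missing `allow_new_backups` value is an assumption, since this specification does not define a default.

```python
import json
from datetime import datetime, timedelta, timezone

# "At least once per week" from the text above.
REFRESH_INTERVAL = timedelta(days=7)


def parse_discovery(body: str) -> dict:
    """Parse a discovery response and check it against the schema above."""
    doc = json.loads(body)
    if not isinstance(doc, dict):
        raise ValueError("discovery document must be a JSON object")
    if not isinstance(doc.get("allow_backups"), bool):
        raise ValueError("allow_backups is required and must be a boolean")
    allow_new = doc.get("allow_new_backups")
    if allow_new is not None and not isinstance(allow_new, bool):
        raise ValueError("allow_new_backups must be a boolean when present")
    # Assumption (not defined by the spec): when allow_new_backups is
    # missing, fall back to the value of allow_backups.
    if allow_new is None:
        allow_new = doc["allow_backups"]
    return {"allow_backups": doc["allow_backups"], "allow_new_backups": allow_new}


def needs_refresh(last_checked: datetime, now: datetime) -> bool:
    """Return True when the cached discovery data is a week old or more."""
    return now - last_checked >= REFRESH_INTERVAL
```

Fetching the document over HTTP is left out on purpose; any HTTP client could pass the response body to `parse_discovery`.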
TODO: To avoid an extra endpoint, should we use a version of NodeInfo instead?
The archive should, depending on the features offered by the server, contain the following data:
The archive should be in JSON format.
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "id": "https://the-federation.info/specs/backup-restore#archive_format",
  "type": "object",
  "properties": {
    "email": {
      "type": "string"
    },
    "content": {
      "type": "string"
    }
  },
  "required": [
    "email",
    "content"
  ]
}
TODO: Define full schema of content or place content keys in the first level.
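Until the content schema is defined, one possible reading of the archive format is sketched below in Python. It assumes that `content` carries a JSON-serialised object, which is only a guess because the schema merely requires a string. The function names and the example profile keys are hypothetical.

```python
import json


def build_archive(email: str, content: dict) -> str:
    """Return an unencrypted archive as a JSON string matching the schema above."""
    # Assumption: "content" is a JSON-serialised object. The schema only
    # says it is a string; its inner structure is still a TODO.
    return json.dumps({"email": email, "content": json.dumps(content, sort_keys=True)})


def read_archive(raw: str) -> dict:
    """Parse an archive and check the two required string fields."""
    archive = json.loads(raw)
    if not isinstance(archive, dict):
        raise ValueError("archive must be a JSON object")
    for key in ("email", "content"):
        if not isinstance(archive.get(key), str):
            raise ValueError(f"{key} is required and must be a string")
    return archive
```

For example, `build_archive("user@domain.tld", {"name": "Example"})` yields a string that `read_archive` accepts. Encryption of the archive is not shown here.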
The archive should be strongly encrypted using a passphrase given by the user.
TODO: Define encryption method.
Servers which send out backups should store the following extra information for users:
column | type | example |
---|---|---|
Backups opted out | boolean | false |
Backup server | string | sub.domain.tld |
Backup server receive route | string | /receive/backups |
Backup server fail count | integer | 0 |
A dedicated table or other storage should be available for storing backup information.
column | type | example |
---|---|---|
Backed up handle (unique key) | string | user@domain.tld |
Backup content | large text | (encrypted text content) |
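The receiving-side table above could be sketched with Python's built-in `sqlite3` module as follows. The table and function names are hypothetical, and replacing an older backup when a new one arrives for the same handle is an assumption; this specification only states that the handle is a unique key.

```python
import sqlite3
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (and if needed create) the storage for received backups."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS received_backups ("
        " handle TEXT PRIMARY KEY,"   # backed up handle (unique key)
        " content TEXT NOT NULL)"     # encrypted text content
    )
    return conn


def store_backup(conn: sqlite3.Connection, handle: str, content: str) -> None:
    """Store a backup, replacing any earlier one for the same handle."""
    # Assumption: only the most recent backup per handle is kept.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO received_backups (handle, content) VALUES (?, ?)",
            (handle, content),
        )


def load_backup(conn: sqlite3.Connection, handle: str) -> Optional[str]:
    """Return the stored backup content for a handle, or None if absent."""
    row = conn.execute(
        "SELECT content FROM received_backups WHERE handle = ?", (handle,)
    ).fetchone()
    return row[0] if row else None
```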
Settings should be available to allow advertising backup readiness to other servers.
Servers that have received backups should always allow restoring them, even if they stop allowing new backups to be received.
To protect against arbitrary storage of data and to validate backup ownership, the backup delivery needs to be signed with the user's private key. A receiving server should check the signature against the user's public key before storing the backup.
How delivered packages are verified can be left to individual implementations. However, cross-platform compatibility would benefit from using identical methods.
Platforms are free to restrict what platforms they deliver backups to, for example to ensure users are able to restore their identity to a place with a similar set of features available.
TODO: Give example of signing method.
TODO: Should the signing method be specified in the delivery schema?
TODO: Should backup servers advertise what signing methods they support? (if above)
Servers should aim to back up the identities of users at least once per week.
The delivery JSON message needs to conform to the following schema:
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "id": "https://the-federation.info/specs/backup-restore#delivery_package",
  "type": "object",
  "properties": {
    "handle": {
      "type": "string"
    },
    "backup": {
      "type": "string"
    }
  },
  "required": [
    "handle",
    "backup"
  ]
}
The backup value is signed using the user's private key so that the receiver can validate its content. Before signing, it is encrypted using a passphrase chosen by the user.
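The order of operations described here, encrypt first and sign second, can be sketched in Python as follows. Because the encryption and signing methods are still open TODOs in this specification, they appear only as hypothetical callables supplied by the caller; `build_delivery` is likewise an illustrative name.

```python
import json
from typing import Callable


def build_delivery(
    handle: str,
    archive: str,
    encrypt: Callable[[str], str],
    sign: Callable[[str], str],
) -> str:
    """Build a delivery message matching the schema above."""
    # Step 1: encrypt the archive with the user's passphrase.
    encrypted = encrypt(archive)
    # Step 2: sign the already encrypted payload with the user's private key.
    signed = sign(encrypted)
    return json.dumps({"handle": handle, "backup": signed})
```

Passing the two steps in as callables keeps the sketch independent of whichever concrete methods are eventually chosen.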
A successful delivery of a backup is indicated by a 200, 201 or 202 status code. The sending server should count all of these as a successful delivery.
Refusal to accept this backup should be indicated with a 403 status code.
Any other error code should be understood as a temporary problem with receiving the backup.
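A sending server might map response codes to outcomes roughly as in the Python sketch below. The names are hypothetical. Treating every status other than 200, 201, 202 and 403 as temporary, and the way the fail count is updated, are assumptions; this specification lists a fail count column but does not say how it changes.

```python
from enum import Enum


class DeliveryResult(Enum):
    SUCCESS = "success"
    REFUSED = "refused"
    TEMPORARY_FAILURE = "temporary_failure"


def classify_delivery(status_code: int) -> DeliveryResult:
    """Map an HTTP status code to a delivery outcome."""
    if status_code in (200, 201, 202):
        return DeliveryResult.SUCCESS
    if status_code == 403:
        return DeliveryResult.REFUSED
    # Assumption: anything else, including codes the text does not
    # mention, is handled as a temporary problem.
    return DeliveryResult.TEMPORARY_FAILURE


def next_fail_count(current: int, result: DeliveryResult) -> int:
    """Return the new value for the 'Backup server fail count' column."""
    # Assumption: reset on success, otherwise count one more failure.
    return 0 if result is DeliveryResult.SUCCESS else current + 1
```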
On a successful restore of user identity, a moved message should be sent out to all known servers.
The moved message is signed with the user's old private key and should contain the user's new public key.
If the recipient fails to understand the moved message because it lacks support for this specification (indicated by a non-2xx status code response), a retry should be scheduled. Since the user's old private key is not kept, the scheduled retry should store the fully prepared moved message to send. A server should retry moved messages for a minimum of 6 months. The time between retries can be lengthened over time.
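One way to lengthen the time between retries while still covering the 6 month minimum is an exponential backoff with a cap, sketched below in Python. The initial delay, growth factor and cap are illustrative assumptions, as is approximating 6 months as 182 days; none of these values come from this specification.

```python
from datetime import datetime, timedelta
from typing import List

# Assumption: "6 months" is approximated as 182 days.
MIN_RETRY_WINDOW = timedelta(days=182)


def retry_schedule(
    first_attempt: datetime,
    initial_delay: timedelta = timedelta(hours=1),
    factor: float = 2.0,
    max_delay: timedelta = timedelta(days=14),
) -> List[datetime]:
    """Return retry times whose gaps grow until capped at max_delay.

    The last retry falls at or after first_attempt + MIN_RETRY_WINDOW,
    so the whole minimum window is covered.
    """
    times = []
    delay = initial_delay
    current = first_attempt
    deadline = first_attempt + MIN_RETRY_WINDOW
    while current < deadline:
        current = current + delay
        times.append(current)
        delay = min(delay * factor, max_delay)
    return times
```

Each scheduled retry would be stored together with the already signed message body, since the message cannot be signed again later.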
The moved message JSON needs to conform to the following schema:
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "id": "https://the-federation.info/specs/backup-restore#moved_message",
  "type": "object",
  "properties": {
    "old_handle": {
      "type": "string"
    },
    "new_handle": {
      "type": "string"
    },
    "new_public_key": {
      "type": "string"
    }
  },
  "required": [
    "old_handle",
    "new_handle",
    "new_public_key"
  ]
}
A server receiving a moved message for an identity that exists locally, either as a local user or as a remote profile, should make the internal changes necessary to map the user to the new location and then discard the old public key and old handle.
The server receiving a moved message should ensure that all stored local and remote content now points to the new handle.
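The remapping described above might look like the following Python sketch, which uses plain dictionaries in place of a real database. The function name and both data structures are hypothetical, and the sketch assumes the caller has already verified the message signature against the stored old public key.

```python
import json
from typing import Dict


def apply_moved_message(
    raw: str,
    profiles: Dict[str, dict],
    content_authors: Dict[str, str],
) -> bool:
    """Apply a moved message; return True if a known identity was remapped.

    profiles maps handle -> profile data, content_authors maps
    content id -> author handle. Signature verification is assumed
    to have happened before this function is called.
    """
    msg = json.loads(raw)
    if not isinstance(msg, dict):
        raise ValueError("moved message must be a JSON object")
    for key in ("old_handle", "new_handle", "new_public_key"):
        if not isinstance(msg.get(key), str):
            raise ValueError(f"{key} is required and must be a string")
    old, new = msg["old_handle"], msg["new_handle"]
    if old not in profiles:
        return False  # identity is not known locally, nothing to remap
    # Move the profile to the new handle, dropping the old handle and key.
    profile = profiles.pop(old)
    profile["handle"] = new
    profile["public_key"] = msg["new_public_key"]
    profiles[new] = profile
    # Point all stored content at the new handle.
    for content_id, author in list(content_authors.items()):
        if author == old:
            content_authors[content_id] = new
    return True
```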
A user backup must be strongly encrypted so that it can be safely sent to other servers for storage.
A user backup must not be restorable without email confirmation. The dual protection of passphrase AND email confirmation is meant to prevent identity theft if the user's passphrase leaks (for example through a database leak on the sending server).
All signed backup deliveries must be verified against a recent public key of the sending profile.
Servers should not allow restoring an uploaded backup archive unless the user to be restored either already exists as a remote profile or can be found by fetching the remote profile. This protects against uploads of faked identity data for identities that have disappeared from the network.
This specification assumes that the public keys of all users to be backed up are available via commonly known discovery routes.
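The restore conditions listed in this section can be combined into a single gate, as in the Python sketch below. The `RestoreRequest` type and its field names are hypothetical; how each flag is established (decrypting with the passphrase, confirming the email, fetching the remote profile) is outside the sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RestoreRequest:
    handle: str
    passphrase_ok: bool         # archive was decrypted with the given passphrase
    email_confirmed: bool       # user confirmed via the email in the archive
    remote_profile_found: bool  # profile exists locally or could be fetched


def restore_blockers(request: RestoreRequest) -> List[str]:
    """Return the reasons a restore must be refused; empty means allowed."""
    blockers = []
    if not request.passphrase_ok:
        blockers.append("passphrase did not decrypt the backup")
    if not request.email_confirmed:
        blockers.append("email confirmation is missing")
    if not request.remote_profile_found:
        blockers.append("remote profile could not be found")
    return blockers
```

A restore would proceed only when `restore_blockers` returns an empty list, so a leaked passphrase alone is never sufficient.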
TODO: Should the spec take a position on where users can be discovered?
This work is licensed under a Creative Commons Attribution 4.0 International License.