Discreet Log #13: Metadata Resistant File Sharing
04 Aug 2021
The ability to send images, audio and other files is one of the most requested Cwtch features, and with the Beta now released the Cwtch team have recently started discussing how to support safe and secure metadata resistant file sharing.
This is by no means an easy problem. In this log entry I will discuss some open problems, provide some insight into the design work currently unfolding, and finish up with a sketch roadmap for getting these features into Cwtch.
Files as Deanonymization Vectors
I like to roughly split the threat that files pose into three distinct categories: inherent metadata leaks, storage related leaks, and security related leaks.
Inherent metadata leaks: At their most basic, the files themselves can contain metadata which reveals information about the user. This problem is pervasive enough that entire toolkits like mat2 are dedicated to the problem.
Storage related leaks: Files must be stored somewhere, either locally for peer-fetching, or on a third party storage system. Both of these approaches present risks to metadata, e.g. the size of the file, the times it is fetched/uploaded, how the operating system indexes and catalogues the file for consumption by other applications on the system etc.
Security related leaks: All files need to be handled, either directly by the application or by invoking a secondary process. In either case, security vulnerabilities associated with handling of the file risk exposing metadata. It is worth noting here that “security vulnerability” covers everything from “sophisticated heap corruption leading to arbitrary code execution” to “the library calls external web resources and doing so leaks the IP address”.
Any introduction of file handling in Cwtch needs to be developed with these risks in mind.
Additional Requirements for Metadata Resistant File Handling in Cwtch
Separate from explicit security concerns, there are also architectural challenges for supporting file sharing on Cwtch:
- For peer-to-peer connections both parties must be online in order to exchange data. Connections can never be considered “stable” and as such support for secure resumption is a must.
- For group connections all conversation is currently mediated through untrusted servers (which do not have the bandwidth to support handling files of arbitrary sizes). In order to properly support groups as a first class conversation type file handling must work even if the original sender is offline for an extended period of time.
Given these constraints it is clear that any solution which approximates a simple file server won’t scale well to groups and will amplify the UX issues inherent to all metadata resistant tools. It will instead be necessary to provide Cwtch users with the tools necessary to ensure that their files can be distributed to the conversations they participate in even when they are offline.
In practice this will mean borrowing heavily from lessons learned in torrenting software, by breaking files down into chunks that can later be combined in a way that is verifiable and allowing group members to become potential “seeders” or “rehosters” of shared files for others when the original source is offline.
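To make the chunking idea concrete, here is a minimal sketch in Python. It is not the Cwtch implementation: the chunk size, the use of SHA-256, the way the root hash is derived from the per-chunk hashes, and every function and class name below are assumptions chosen for illustration. It shows how a downloader can verify each chunk independently of who sent it, and how keeping track of verified chunks lets an interrupted transfer resume from any peer.

```python
from __future__ import annotations

import hashlib

CHUNK_SIZE = 4096  # hypothetical chunk size; the post does not specify one


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into fixed-size chunks (the last chunk may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_hashes(chunks: list[bytes]) -> list[bytes]:
    """Hash every chunk so each one can be checked on its own."""
    return [hashlib.sha256(c).digest() for c in chunks]


def root_hash(hashes: list[bytes]) -> str:
    """Assumed construction: SHA-256 over the concatenated per-chunk hashes."""
    return hashlib.sha256(b"".join(hashes)).hexdigest()


def verify_hash_list(hashes: list[bytes], expected_root: str) -> bool:
    """Check a hash list received from a peer against the trusted root hash."""
    return root_hash(hashes) == expected_root


class Download:
    """Tracks verified chunks so a transfer can resume after a disconnect."""

    def __init__(self, expected_hashes: list[bytes]):
        self.expected = expected_hashes
        self.chunks: dict[int, bytes] = {}

    def accept(self, index: int, chunk: bytes) -> bool:
        """Store a chunk only if its hash matches, whoever sent it."""
        if not 0 <= index < len(self.expected):
            return False
        if hashlib.sha256(chunk).digest() != self.expected[index]:
            return False
        self.chunks[index] = chunk
        return True

    def missing(self) -> list[int]:
        """Indices still needed; this is what a resumed session would request."""
        return [i for i in range(len(self.expected)) if i not in self.chunks]

    def assemble(self) -> bytes:
        """Join the verified chunks back into the original file."""
        if self.missing():
            raise ValueError("download incomplete")
        return b"".join(self.chunks[i] for i in range(len(self.expected)))
```

Because every chunk is checked against a hash that chains back to a single trusted value, a downloader in this sketch does not need to trust a rehoster: a tampered chunk is rejected by `accept`, and `missing()` tells it which chunks to ask the next available peer for.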
We expect the majority of this protocol to take place over individual peer-to-peer connections between group members, which minimizes metadata and leaves the limited-bandwidth, asynchronous group messages free for other applications. The one exception is the initial file-metadata message, which contains the information needed to securely reconstruct the file from its parts. We expect it to look something like the following, although the overlay has not yet been finalized.
{
o: ##,
d: filename_suggestion.ext,
h: master-hash,
n: nonce,
s: size_in_bytes,
t?: optional tracker/rehoster info,
}
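As a rough illustration of how a client might produce and consume such a message, the Python sketch below builds the fields shown above and validates them on receipt. Everything beyond the field names is an assumption: the overlay number is left as a caller-supplied parameter because the format above does not fix it, a plain SHA-256 of the file stands in for the master hash, the nonce is simply 16 random bytes, and the function names are hypothetical. The optional tracker field is omitted.

```python
from __future__ import annotations

import hashlib
import json
import secrets


def build_file_message(overlay_id: int, filename: str, data: bytes) -> str:
    """Serialize the file-metadata message for a file held in memory."""
    manifest = {
        "o": overlay_id,                        # overlay number (not finalized in the post)
        "d": filename,                          # only a suggestion to the receiver
        "h": hashlib.sha256(data).hexdigest(),  # assumed stand-in for the master hash
        "n": secrets.token_hex(16),             # assumed: random 128-bit nonce
        "s": len(data),                         # size in bytes
    }
    return json.dumps(manifest)


def safe_filename(suggestion: str, fallback: str = "download.bin") -> str:
    """Reduce an untrusted filename suggestion to a bare file name."""
    # Drop any directory components, whichever separator the sender used.
    name = suggestion.replace("\\", "/").rsplit("/", 1)[-1]
    # Remove control characters and other non-printable code points.
    name = "".join(ch for ch in name if ch.isprintable())
    if name in ("", ".", ".."):
        return fallback
    return name


def parse_file_message(raw: str, max_size: int) -> dict:
    """Validate a received message before any download is started."""
    msg = json.loads(raw)
    if not isinstance(msg, dict):
        raise ValueError("message is not an object")
    for key, expected_type in (("o", int), ("d", str), ("h", str), ("n", str), ("s", int)):
        if not isinstance(msg.get(key), expected_type):
            raise ValueError(f"missing or malformed field {key!r}")
    if not 0 <= msg["s"] <= max_size:
        raise ValueError("declared size out of range")
    msg["d"] = safe_filename(msg["d"])
    return msg
```

The receiving side treats every field as untrusted input: the filename is only ever a suggestion, so it is stripped of path components before anything is written to disk, and the declared size is checked against a local limit before a download is attempted.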
A Roadmap for Safe and Secure File Handling
We plan to introduce support for file handling in Cwtch over several beta releases, all likely behind the Experiment Gate. Dates are approximate release targets, and may change depending on other work, commitments and staffing resources.
- October 2021: Basic Support for File Sharing: The initial version will only support seeding and downloading files, i.e. there will be no support for inline image or audio type messages. Most of this work will be focused on the backend Cwtch protocol, allowing it to support file seeding, downloading and handling. All files will be downloaded to a standalone directory where they can be viewed by external applications. This phase will not attempt to strip metadata from any file type, and as such Cwtch will prompt before sending files, as well as before opening them with external programs.
- December 2021: Basic Support for Image Messages: With support for file handling in place, the first likely target will be images. The inherent metadata risks around images are well known and fairly easy to mitigate. Likewise, displaying images is well-supported in our chosen framework, Flutter, and we believe it can be done in a way that minimizes security risk.
- February 2022: Basic Support for Audio Messages: As with images, the metadata risk for audio is fairly well known. The main additional work is centered around recording the audio and the additional UI code needed to support that.
- Spring 2022 and beyond: Onboarding of additional file handling use cases, e.g. stickers, videos etc. We are very interested to hear your ideas regarding what you think Cwtch should support.
As ever, if you’d like to support Open Privacy’s efforts to develop Cwtch and bring open source metadata-resistant and privacy-first infrastructure to marginalized communities, please consider donating.