Interoperable Serendipity

I just published a blog post talking about interoperability and what it looks like in practice.

I focused on Solid, since that’s what I have worked with, but it’d be interesting to hear if it’s the same for remoteStorage and Fission.

Any thoughts?

Nice write up, thanks for sharing.

One example is the type index, which is a way to declare where your data is stored so that other apps can find it. However, it’s perfectly possible to make a Solid app that doesn’t declare its data in the type index. This means that other apps have to know the exact location where the data resides, or they won’t interoperate properly.

Typing data is not a focus of Fission’s webnative. We see that as an app-to-app concern that will vary widely. We want to enable personal data ownership and portability.

Being built on IPFS means that anyone can help keep the data online, too.

We’re fans of Cambria as an emerging solution. Here’s a talk we hosted:

We may have some helpers for common formats, e.g. “show me everything in the file system that is an image.” One can see that expanding to other file formats.
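A helper like that could be sketched roughly as follows. This is not webnative’s API: `files_of_kind` and the example paths are made up for illustration, and the only real call is `mimetypes.guess_type` from the Python standard library, which guesses a type from the file extension.

```python
import mimetypes


def files_of_kind(paths, kind):
    """Return the paths whose guessed MIME type belongs to `kind` (e.g. "image")."""
    matches = []
    for path in paths:
        mime, _encoding = mimetypes.guess_type(path)
        if mime is not None and mime.startswith(kind + "/"):
            matches.append(path)
    return matches


# Hypothetical listing of a user's file system.
listing = [
    "public/photos/cat.png",
    "public/photos/dog.jpg",
    "private/notes/todo.md",
    "private/music/mix.mp3",
]

images = files_of_kind(listing, "image")
audio = files_of_kind(listing, "audio")
```

Guessing from the extension is the crudest possible form of typing, but it needs no cooperation from the app that wrote the file.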

Basically I don’t think app developers are necessarily going to use RDF or design compatible schemas from the start.

But having the data available to the user means it’s possible, and various transforms can be done.

Yeah, I’m not “married” to RDF either, but I do think we need some kind of schema definition that can be reused across applications. I think that remoteStorage does something similar with JSON Schema, but I haven’t looked much into it.

I don’t know much about Fission, although I’m keen to learn more :). If you don’t have any typing system, what’s your take on interoperable serendipity? Meaning, two apps being interoperable without their developers being aware of each other’s existence.

I know it sounds utopian, but in Solid this is technically possible if two apps use the same vocabularies. In practice it doesn’t happen much, but it happens sometimes. And hopefully with more apps and the maturity of the ecosystem, it’ll become more common.

I think something like Cambria is a great fit for solving these issues, but I’m not sure how it would work in practice. I think it should be possible to harness the power of the community, instead of leaving everything in the developer’s hands. But this is somewhat unrealistic when you think about applications in the long tail and that’s why I think we need a solution that’s in the hands of users (non-technical users, as well).

Edit: When I say “in the hands of non-technical users”, something like this comes to mind: Web Scraper - Free Web Scraping - Chrome Web Store. Web scraping is one example of adversarial interoperability, and I think we could use this type of UI/UX pattern to auto-generate lenses.

I get the direction, my point previously wasn’t centered on RDF either – I literally don’t think app developers today think about designing schemas in this way, at all.

It’s driven by a desire to be compatible with something. For example, if they need to read Photoshop files or CSV files, they implement whatever libraries read and write those formats; they don’t design schemas a priori.

We don’t have enough personal data store centric apps to know, at this point.

I think that Tools for Thought are a space where interop is highly desired by the early users, and thus we have been exploring that space. Even there, I believe it will be about promoting a format like atJSON which can losslessly capture and convert between formats, as people a) are stuck on hand-typing Markdown and b) disregard front matter and other challenges of “extending” Markdown.

We hosted a small talk on next gen playlist formats that might bootstrap a new shareable primitive file format / JSON type.

So: my take is that interoperable serendipity starts with file types and formats, not with app schemas. Notes are potentially a broad enough category to bootstrap some schema convergence.

Something I was pondering in our latest Swap is, what’s the actual difference between a file and a schema?

I could choose a particular schema for my app and a particular format (let’s say Turtle), and make up a file extension; that would now be a file format (for example, some file formats are really .zip or .xml files under the hood). But isn’t that just a limited version of a schema? If, instead of talking about it in terms of files, I talk about it in terms of schemas, at least in Solid that gives me even more flexibility, because the data could live in different formats (N3, JSON-LD, etc.) and be stored across different documents (even different servers).

So I think in a way files are just a narrower definition of schemas (not exactly, but I hope you know what I mean). But maybe that’s the point. Too much freedom tends to turn into complexity, whereas limits make things simple and it’s easier to interoperate in practice. RSS and Email are protocols that haven’t changed in years, and even though that comes with its own challenges, that’s part of the reason why they’ve been so resilient and their client apps have truly become interoperable.

But then, if that’s true, relying on a specific usage of a schema (even if it doesn’t live up to its full potential) should be as good or better than a file format. I think something like that is happening with the fediverse, right? An app won’t be fully interoperable if developers only read the ActivityPub specification, but there are already many apps that are truly interoperable seeing how the ActivityPub specification is being used in practice.

One example is the type index, which is a way to declare where your data is stored so that other apps can find it. However, it’s perfectly possible to make a Solid app that doesn’t declare its data in the type index. This means that other apps have to know the exact location where the data resides, or they won’t interoperate properly.

I consider the ‘location’ part of the schema.

Even with lenses, I’d say the chances are very low — you still have to know about the other app in order to define the translations.

This is true at the moment. I hope it’s not naive of me to imagine a future where lenses become a normal part of the development process and we begin schema design by considering existing translations, especially for common use cases like todos or notes—it’s an easier sell if it allows developers to ultimately do what works for them and ‘not worry about schemas’.
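To make the idea concrete, here is a minimal sketch of a lens between two made-up todo schemas. It is not Cambria’s actual API: `Lens`, `rename`, `convert` and `compose` are hypothetical helpers, and both schemas are invented for the example. The point is only that a translation is a small, reversible description that either developer (or a third party) could write.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Lens:
    """A pair of functions translating a record in both directions."""
    forward: Callable[[dict], dict]
    backward: Callable[[dict], dict]


def rename(old, new):
    """Lens that renames one field, leaving every other field untouched."""
    def move(src, dst):
        def apply(record):
            out = dict(record)
            if src in out:
                out[dst] = out.pop(src)
            return out
        return apply
    return Lens(move(old, new), move(new, old))


def convert(field, to_target, to_source):
    """Lens that maps the value of one field with a pair of inverse functions."""
    def using(fn):
        def apply(record):
            out = dict(record)
            if field in out:
                out[field] = fn(out[field])
            return out
        return apply
    return Lens(using(to_target), using(to_source))


def compose(*lenses):
    """Chain lenses: forward runs left to right, backward right to left."""
    def forward(record):
        for lens in lenses:
            record = lens.forward(record)
        return record

    def backward(record):
        for lens in reversed(lenses):
            record = lens.backward(record)
        return record

    return Lens(forward, backward)


# App A stores {"title", "done"}; app B stores {"name", "status"}.
todo_a_to_b = compose(
    rename("title", "name"),
    rename("done", "status"),
    convert(
        "status",
        lambda done: "completed" if done else "open",
        lambda status: status == "completed",
    ),
)

task_a = {"title": "Write blog post", "done": True}
task_b = todo_a_to_b.forward(task_a)
round_trip = todo_a_to_b.backward(task_b)
```

Here the round trip is lossless because both schemas carry the same information. As soon as app B allows a third status, the backward direction has to pick a default, which is where real lenses get harder.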

One possibility would be to rely on crowd-sourced repositories of translations.

Geoffrey Litt and I might both be inspired by the idea of ‘converters’ in Jef Raskin’s writing—this quote might have been phrased as a ‘marketplace’ of ‘transformers’ where people buy, sell, create, and maintain these bridges between apps. If I remember which book this was in I’ll share it one day. Geoffrey also has an interest in web scrapers (Wildcard might be the best current example) so I’m sure he’ll have more to add to this topic at some point.

A practice of defining lenses/converters/transformers/shapes and centralized vocabularies may have all the tradeoffs you mention, yet can still be a good foundation for automating schema-matching, which could indirectly look like interoperable serendipity.
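One way to picture that automation, as a sketch only: treat the shared repository as a graph of converters and search it for a chain between two formats. Everything here is hypothetical (the `registry`, the format names, the converter functions); it is not how any existing project works, just an illustration of two apps becoming compatible through a format neither developer targeted directly.

```python
from collections import deque

# Hypothetical crowd-sourced registry: (source, target) -> converter function.
registry = {}


def register(source, target, converter):
    registry[(source, target)] = converter


def find_chain(source, target):
    """Breadth-first search for the shortest chain of formats, or None."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for src, dst in registry:
            if src == path[-1] and dst not in seen:
                seen.add(dst)
                queue.append(path + [dst])
    return None


def translate(record, source, target):
    """Apply each converter along the chain found by find_chain."""
    chain = find_chain(source, target)
    if chain is None:
        raise LookupError(f"no known translation from {source} to {target}")
    for src, dst in zip(chain, chain[1:]):
        record = registry[(src, dst)](record)
    return record


# App A's developer contributed a converter to a shared todo format;
# app B's developer contributed one from that format to their own.
register("app-a/todo", "shared/todo", lambda r: {"summary": r["title"]})
register("shared/todo", "app-b/task", lambda r: {"name": r["summary"]})

chain = find_chain("app-a/todo", "app-b/task")
task = translate({"title": "Buy milk"}, "app-a/todo", "app-b/task")
```

Neither developer knew about the other, yet app B can read app A’s data. Nothing was registered in the reverse direction, so `find_chain("app-b/task", "app-a/todo")` returns `None`.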

I think this highlights the progressive nature of interoperability. It’s probably useful to start with something known and achievable, limit scope, and focus solely on decoupling data from the apps. It might be easier to see solutions for interoperability in another phase of evolution.

To me this is all interchangeable: schema, file, extension, type, shape, protocol, format all encompass ‘the data and where it’s stored’, and I believe lenses should transform between all those parameters.

I might also share these specific discussions on interoperability from remoteStorage and Fission:

I guess that depends on the protocol, but in Solid it’s discouraged to hard-code the location (even though I know in practice most people do it, me included in my first app).

I suppose this is part of the trade-off between making things interoperable and making things easy for developers. The more flexible something is, the harder it is to implement (in fact, the new spec for interoperability in Solid is even more complicated than the type index, and the type index is probably going to be deprecated).

Actually the “holy grail” for Solid would be to just request resources by type, not caring about where they are stored. If I’m making a task manager, I would like to say “give me all the tasks this user has in their Pod”. I know that’s possible in theory with things like triple stores, but in practice not even SPARQL is supported in the spec, so the mechanisms to get data are quite rudimentary.
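As a toy illustration of what “request by type” means, assume the whole Pod were already loaded as triples, which is exactly the part that is not available in practice. The `vocab.example` terms, the document URLs and the `resources_of_type` helper are made up; only the `rdf:type` IRI is real.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical vocabulary terms, invented for this example.
TASK = "https://vocab.example/Task"
NOTE = "https://vocab.example/Note"
NAME = "https://vocab.example/name"

# Hypothetical Pod contents: document URL -> (subject, predicate, object) triples.
pod = {
    "https://alice.example/tasks/work.ttl": [
        ("https://alice.example/tasks/work.ttl#t1", RDF_TYPE, TASK),
        ("https://alice.example/tasks/work.ttl#t1", NAME, "File taxes"),
    ],
    "https://alice.example/notes/misc.ttl": [
        ("https://alice.example/notes/misc.ttl#n1", RDF_TYPE, NOTE),
        ("https://alice.example/notes/misc.ttl#t2", RDF_TYPE, TASK),
    ],
}


def resources_of_type(pod, type_iri):
    """Return the sorted subjects typed as `type_iri`, whatever document holds them."""
    found = set()
    for triples in pod.values():
        for subject, predicate, obj in triples:
            if predicate == RDF_TYPE and obj == type_iri:
                found.add(subject)
    return sorted(found)


tasks = resources_of_type(pod, TASK)
```

The task stored inside the notes document is found too, because the lookup never mentions a location.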

The type index at least has been easy for me to implement without adding a lot of complexity, and it makes interoperable serendipity a lot more likely. So I think it’s worth using.
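For anyone who hasn’t used it, the idea fits in a few lines. This is a simplified in-memory model, not the Solid API: the field names only loosely mirror the `solid:forClass` and `solid:instanceContainer` terms as I understand them, and the class IRI and container URLs are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeRegistration:
    for_class: str           # which kind of data this registration describes
    instance_container: str  # where instances of that class are stored


def declare(type_index, class_iri, container):
    """Record where an app stores a class of data, skipping exact duplicates."""
    registration = TypeRegistration(class_iri, container)
    if registration not in type_index:
        type_index.append(registration)


def locations_for(type_index, class_iri):
    """Return every declared container holding instances of `class_iri`."""
    return [r.instance_container for r in type_index if r.for_class == class_iri]


TASK = "https://vocab.example/Task"  # hypothetical class IRI

type_index = []
# App A declares where it keeps tasks when it first writes them.
declare(type_index, TASK, "https://alice.example/apps/a/tasks/")
# App B, written independently, asks the index instead of guessing a path.
found = locations_for(type_index, TASK)
```

If app A skips the `declare` step, app B gets an empty list back and is left guessing paths again.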

Yes, exactly. The concept of personal data stores is so nascent, we need to start there.

Network protocols are a different thing altogether, as they require a back and forth.

Data on disk – a file, something more structured / unique, etc. – only needs to communicate with one app at a time.

The app needs to be able to read the data (at some level of fidelity) and be able to write it, in a way that it can be consumed by others.

I’ve thought about this some more, and of course you’re right that “files” can be just particular types of schemas.

On the other hand, with apps, you are storing “program state” in JSON or whatever type of files. That’s the part that I think is super custom and thus not very shareable between apps.

There are trivial examples where perhaps 90% of a TODO app could be captured … but maybe that’s a vCal file format standard again.

So: I think program state / database is hard to generalize AND mostly, not what developers are used to.