I have deeply mixed feelings about #ActivityPub's adoption of JSON-LD, as someone who's spent way too long dealing with it while building #Fedify.
-
@hongminhee take this part with a grain of salt because my benchmarks for it are with dotNetRdf which is the slowest C# implementation i know of (hence my replacement library), but JSON-LD is slower than RSA validation, which is one of the pain points around authorized fetch scalability
wetdry.world/@kopper/114678924693500011
-
@hongminhee if i can give one piece of advice to devs who want to process JSON-LD: don't bother compacting. you already know the schema you output (or you're just passing through what the user gives and it doesn't matter to you), serialize directly to the compacted representation, and only run expansion on incoming data
expansion is the cheapest JSON-LD operation (since all other operations depend on it and run it internally anyhow), and this will get you all the compatibility benefits of JSON-LD with little downsides (beyond more annoying deserialization code, as you have to map the expanded representation to your internal structure which will likely be modeled after the compacted one)
-
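A minimal sketch of the advice above, assuming the incoming document has already been run through the expansion step of whatever JSON-LD processor you use. The `Note` class and the helper names are hypothetical, made up for illustration. Incoming data is read from the expanded form (full IRIs as keys, values wrapped in lists), and outgoing data is written straight into the compacted shape with no compaction pass.

```python
from dataclasses import dataclass
from typing import Any, Optional

AS = "https://www.w3.org/ns/activitystreams#"


@dataclass
class Note:  # hypothetical internal model, shaped like the compacted form
    id: str
    content: Optional[str]
    attributed_to: Optional[str]


def first(node: dict[str, Any], iri: str, key: str) -> Optional[str]:
    """Return the first @value or @id stored under an expanded property."""
    for entry in node.get(iri, []):
        if isinstance(entry, dict) and key in entry:
            return entry[key]
    return None


def note_from_expanded(expanded: list[dict[str, Any]]) -> Note:
    """Map already-expanded JSON-LD (a list of node objects) to a Note."""
    node = expanded[0]
    if AS + "Note" not in node.get("@type", []):
        raise ValueError("not an as:Note")
    return Note(
        id=node["@id"],
        content=first(node, AS + "content", "@value"),
        attributed_to=first(node, AS + "attributedTo", "@id"),
    )


def note_to_compacted(note: Note) -> dict[str, Any]:
    """Serialize directly to the compacted shape; no JSON-LD processor involved."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "id": note.id,
        "type": "Note",
        "content": note.content,
        "attributedTo": note.attributed_to,
    }
```

The "more annoying deserialization code" mentioned in the post is the `first(...)` lookups: every property has to be unwrapped from its list of value objects before it fits the internal model.
-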
@kopper@not-brain.d.on-t.work @hongminhee@hollo.social expansion is actually really annoying because the resulting JSON has annoyingly similar keys to look up in a hashmap, wasting cache lines and CPU time
-
@hongminhee i put this in a quote but people reading the thread may also be interested: json-ld compaction does not really save that much bandwidth over having all the namespaces explicitly written in property names if you're gzipping (and you are gzipping, right? this is json. make sure your nginx gzip_types includes ld+json and activity+json)
RE: not-brain.d.on-t.work/notes/aihftmbjpxdyb9k7
-
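One rough way to check the bandwidth claim above for yourself. This is a sketch on made-up data, not a benchmark: it builds the same list of note-like objects twice, once with short property names and once with the ActivityStreams namespace written out in every key, and compares the size gap before and after gzip. The object shape and the count of 50 are assumptions; real payloads will give different numbers.

```python
import gzip
import json

AS = "https://www.w3.org/ns/activitystreams#"


def make_notes(prefix: str, count: int = 50) -> bytes:
    """Build a JSON array of note-like objects whose keys carry `prefix`."""
    notes = [
        {
            prefix + "content": f"post number {i}",
            prefix + "published": f"2025-01-01T00:00:{i:02d}Z",
            prefix + "attributedTo": f"https://example.com/users/{i % 7}",
            prefix + "to": "https://www.w3.org/ns/activitystreams#Public",
        }
        for i in range(count)
    ]
    return json.dumps(notes).encode("utf-8")


short_raw = make_notes("")   # short names, as in compacted output
long_raw = make_notes(AS)    # namespace spelled out in every key
short_gz = gzip.compress(short_raw)
long_gz = gzip.compress(long_raw)

raw_gap = len(long_raw) - len(short_raw)
gz_gap = len(long_gz) - len(short_gz)
print(f"raw:  {len(short_raw)} vs {len(long_raw)} bytes (gap {raw_gap})")
print(f"gzip: {len(short_gz)} vs {len(long_gz)} bytes (gap {gz_gap})")
```

The repeated namespace prefix is exactly the kind of redundancy gzip removes, so the gap after compression should come out much smaller than the raw gap.
-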
@kopper @hongminhee As the person probably most responsible for making sure json-ld stayed in the spec (two reasons: because it was the only extensibility answer we had, and because we were trying hard to retain interoperability with the linked data people, which ultimately did not matter), I agree with you. I do ultimately regret not having a simpler solution than json-ld, especially because it greatly hurt our ability to sign messages, which has considerable effect on the ecosystem.
Mea culpa

I do think it's fixable. I'd be interested in joining a conversation about how to fix it.
-
I don't remember it that way.
We started the WG off with AS2 being based on JSON-LD, and I don't think we ever considered removing it.
I don't think it was a decision you made on your own. I'm not sure how you would, since you edited AP and not AS2 Core or Vocabulary.
-
I would be strongly opposed to any effort to remove JSON-LD from AS2. We use it for a lot of extensions. Every AP server uses the Security vocabulary for public keys.
-
@cwebber @kopper @hongminhee It would be a huge backwards-incompatible change for almost zero benefit. People would still make mistakes in their ActivityPub implementations (sorry, Minhee, but that's life on an open network). We'd need to adopt another mechanism for defining extensions, and guess what? People are going to make mistakes with that, too.
-
@cwebber @kopper @hongminhee The biggest downside to JSON-LD, it seems, is that it lets most developers treat AS2 as if it's plain old JSON. That was by design. People sometimes mess it up, but most JSON-LD parsers are pretty tolerant.
-
@evan @hongminhee @cwebber my argument is that json-ld is way more prone to mistakes. in iceshrimp.net, for example, we ship and preload several modified contexts in order to correct some mistakes on our end, and even then we encounter a lot of software that do not, for example, include the security context in their actors
if, as per my suggestion, property names were always written in expanded form, the only mistakes you could really make would be typos, and that would fail pretty loudly compared to the current status quo where most software accepts it and some software silently fails. how are those developers meant to even be aware that this is a problem?
-
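A toy illustration of the silent failure described above. `toy_expand` is not a JSON-LD processor and the two context dicts are tiny stand-ins invented for this sketch; the one behaviour it copies from real expansion is that a key with no term definition is dropped without any error. An actor that forgets the security context loses `publicKey` for strict consumers, while the same actor with fully written-out property names does not depend on any context at all.

```python
AS = "https://www.w3.org/ns/activitystreams#"
SEC = "https://w3id.org/security#"

# Toy stand-ins for the two contexts; the real ones define far more terms.
AS_CONTEXT = {"name": AS + "name"}
SEC_CONTEXT = {"publicKey": SEC + "publicKey"}


def toy_expand(doc: dict, contexts: list) -> dict:
    """Keep keys that map to an IRI; drop undefined terms silently."""
    terms: dict = {}
    for ctx in contexts:
        terms.update(ctx)
    expanded = {}
    for key, value in doc.items():
        if key in terms:
            expanded[terms[key]] = value
        elif "://" in key:  # toy check: already an absolute IRI
            expanded[key] = value
        # else: undefined term, dropped with no error raised
    return expanded


actor = {"name": "alice", "publicKey": {"id": "https://example.com/users/alice#main-key"}}

with_sec = toy_expand(actor, [AS_CONTEXT, SEC_CONTEXT])
without_sec = toy_expand(actor, [AS_CONTEXT])  # security context forgotten
explicit = toy_expand(
    {AS + "name": "alice", SEC + "publicKey": actor["publicKey"]},
    [],  # no context needed when names are written out in full
)

print(SEC + "publicKey" in with_sec)
print(SEC + "publicKey" in without_sec)
print(SEC + "publicKey" in explicit)
```

Software that reads the compacted document as plain JSON still sees `publicKey` in the second case, which is why the sender never notices the mistake.
-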
@gugurumbe @hongminhee @evan @cwebber
from my brief tests, compacting with no context (which is basically expanded json-ld, with very minor differences) compresses better, but standardizing on expanded ld would still be better than the status quo. yes backwards compatibility would be broken, but pretty much any other solution to this problem beyond not solving it would end up breaking it anyway
i'm still unsure about certain aspects of json-ld such as everything having the capability for multiple values, but without any context defined it's at least explicit and implementations can take that into account where it's actually helpful (sec:publicKey comes to mind) and ignore it where it isn't
(edit: ignore the last part, i just re-checked and compact-with-no-context collapses arrays with single values, expanded would be clearer here)
RE: not-brain.d.on-t.work/notes/aihftmbjpxdyb9k7
-
@gugurumbe @cwebber @kopper @hongminhee AS2 requires compacted JSON-LD.
-
There is no data format we can choose to eliminate programmer errors in online protocols. That's a quixotic aim.
-
@evan @cwebber @kopper @hongminhee maybe a compromise approach could be to specify a simpler “json-ld as it is used in practice”, similar to what HTML5 was, that remains backward compatible while simplifying the spec to the point that it is actually feasible to implement
-
@gugurumbe @kopper I don't think that's the model of ActivityPub. It's made to allow reading remote objects.
Most implementations pre-load or compile in the external contexts. I agree, it's a big performance hit to load context URLs at runtime.
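-
A sketch of what "pre-load or compile in" can look like, assuming a plain lookup table in front of whatever JSON-LD processor is in use. The context bodies here are abbreviated placeholders, not the real documents, and refusing unknown URLs outright is a policy choice made for this example (an implementation could fetch and cache them instead).

```python
# Contexts shipped with the server. Bodies are abbreviated placeholders;
# a real deployment would embed the full context documents.
PRELOADED_CONTEXTS = {
    "https://www.w3.org/ns/activitystreams": {
        "@context": {"as": "https://www.w3.org/ns/activitystreams#"}
    },
    "https://w3id.org/security/v1": {
        "@context": {"sec": "https://w3id.org/security#"}
    },
}


class UnknownContextError(LookupError):
    """Raised when a document references a context we did not ship."""


def load_context(url: str) -> dict:
    """Return a preloaded context; never touch the network at runtime."""
    try:
        return PRELOADED_CONTEXTS[url]
    except KeyError:
        raise UnknownContextError(url) from None
```

Hooking `load_context` into a processor depends on that library's document-loader interface, which is not shown here.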
-
@evan @kopper @hongminhee The problem is that signing json-ld is extremely hard, because effectively you have to turn to the RDF graph normalization algorithm, which has extremely expensive compute times. The lack of signatures means that when I boost people's posts, it takes down their instance, since effectively *every* distributed post on the network doesn't actually get accepted as-is; users dial back to check its contents.
Which, at that point, we might as well not distribute the contents at all when we post to inboxes! We could just publish with the object of the activity being the object's id uri
-
@kopper @hongminhee @evan Interesting... I guess it means you can't re-compact with a new outer context, but maybe that's fine
-
@cwebber @kopper @hongminhee I talk about this in my book. Unless the receiving user is online at the time the server receives the Announce, it's ridiculous to fetch the content immediately. Receiving servers should pause a random number of minutes and then fetch the content. It avoids the thundering herd problem.
-
@evan @cwebber @kopper @hongminhee I think that is a better algorithm than a brain dead exponential back off. Perhaps put the two together.
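-
A sketch of putting the two ideas together, under assumed numbers: a random pause before the first fetch of an announced object (skipped when the recipient is online), and exponential backoff with random jitter for retries after a failure. The function names, the 10-minute window, the 30-second base and the one-hour cap are all invented for illustration; the thread only says "a random number of minutes".

```python
import random
from typing import Optional


def first_fetch_delay(recipient_online: bool, max_minutes: float = 10.0,
                      rng: Optional[random.Random] = None) -> float:
    """Seconds to wait before fetching an announced object for the first time."""
    if recipient_online:
        return 0.0  # someone is waiting for the post, fetch right away
    rng = rng or random.Random()
    # Spread receivers across the window so the origin is not hit all at once.
    return rng.uniform(0.0, max_minutes * 60.0)


def retry_delay(attempt: int, base_seconds: float = 30.0,
                cap_seconds: float = 3600.0,
                rng: Optional[random.Random] = None) -> float:
    """Seconds to wait before retry number `attempt` (0-based) after a failure."""
    rng = rng or random.Random()
    # The ceiling doubles per attempt up to the cap; the actual wait is
    # drawn uniformly below it so retries do not line up either.
    ceiling = min(cap_seconds, base_seconds * (2 ** attempt))
    return rng.uniform(0.0, ceiling)
```

Plain exponential backoff without the random draw keeps every receiver on the same schedule, which is the herd effect the random pause is meant to avoid.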
-
@kopper It does not; if a malicious context redefines the security properties then the JSON-LD processor will understand the data differently than the unaware processor.
-
@patmikemid I call it trust, then verify. Usually caching the data with a ttl of a short number of minutes is enough.
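-
One possible reading of "trust, then verify" as code: keep the pushed copy of an object in a cache with a short time-to-live and serve it until it expires, at which point the caller re-fetches from the origin. The `TtlCache` class and the 5-minute default are assumptions made for this sketch.

```python
import time
from typing import Callable, Optional


class TtlCache:
    """Hold pushed objects for a short time before they must be re-fetched."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict[str, tuple[float, dict]] = {}

    def put(self, object_id: str, body: dict) -> None:
        """Store the pushed copy along with its expiry time."""
        self._entries[object_id] = (self._clock() + self._ttl, body)

    def get(self, object_id: str) -> Optional[dict]:
        """Return the cached copy, or None if it is missing or expired."""
        entry = self._entries.get(object_id)
        if entry is None:
            return None
        expires_at, body = entry
        if self._clock() >= expires_at:
            del self._entries[object_id]  # expired: caller should verify at origin
            return None
        return body
```

A `None` from `get` is the signal to fetch the object from its origin server and `put` the verified copy back.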