just like there’s a distinction between non-information resources and information resources, or between binary resources and text resources, maybe there should be a distinction between descriptor documents and content documents
-
trwnh@mastodon.social replied to trwnh@mastodon.social last edited by
what i’m thinking is that sidecar metadata can be stored 1:1 or 1:n — if it’s 1:1 you might as well embed it if you can, either as some kind of frontmatter or inline with rdfa. but having frontmatter means every single processor that touches your content needs to be aware of the existence of that frontmatter (and strip it). so frontmatter isn’t as portable as i would like. basically a document with frontmatter is no longer that content type; it is a new media type for each combination.
-
trwnh@mastodon.social replied to trwnh@mastodon.social last edited by
example: markdown is text/markdown but if you add frontmatter it is now something different. but there isn’t a standard type for this; instead, every application implements frontmatter parsing independently. there isn’t consensus on the delimiter or on the format. the definition of a new media type should include the delimiter and the format; for example, “delimit with three dashes and serialize frontmatter as yaml” or “delimit with three pluses and serialize frontmatter as toml”
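a minimal sketch of what "the delimiter implies the format" could look like, assuming only the two combinations named above. the table `FRONTMATTER_DELIMITERS` and the function `detect_frontmatter` are made up for illustration; they are not taken from any existing tool or registered media type.

```python
from typing import Optional

# hypothetical table: delimiter line -> frontmatter format
# (only the two combinations mentioned above; real tools vary)
FRONTMATTER_DELIMITERS = {
    "---": "yaml",
    "+++": "toml",
}


def detect_frontmatter(text: str) -> Optional[str]:
    """Return the assumed frontmatter format, or None if none is detected."""
    lines = text.splitlines()
    if not lines:
        return None
    opener = lines[0].strip()
    fmt = FRONTMATTER_DELIMITERS.get(opener)
    if fmt is None:
        return None
    # the header only counts if the same delimiter closes it later
    if any(line.strip() == opener for line in lines[1:]):
        return fmt
    return None
```

this is guesswork from the content alone: a plain markdown file that happens to start with a `---` thematic break and contains another one later would be misdetected, which is the kind of ambiguity a declared media type would avoid.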
-
alice@gts.void.dog replied to trwnh@mastodon.social last edited by
@trwnh it should be specified by markdown (variants) probably,,
-
trwnh@mastodon.social replied to alice@gts.void.dog last edited by
@alice im thinking more along the lines of like. what if yaml frontmatter + markdown content = application/markdown-content-with-yaml-frontmatter or whatever. and it had a registered extension .mdyaml or whatever.
i’d be mainly interested in html content and toml frontmatter but saying that this is .html and text/html is not accurate. i can’t directly serve such a file to a web browser as html; it needs to be processed/converted/whatever by an application first
-
alice@gts.void.dog replied to trwnh@mastodon.social last edited by
@trwnh if it needs so much pre processing is it a markdown template?
-
trwnh@mastodon.social replied to alice@gts.void.dog last edited by
@alice not really, it’s “content” plus some “header”; you *can* render it against some template or layout, but the main goal here is portability. i want to be able to know ahead-of-time that this text file is not just html/md/etc, it also has some junk i may or may not care about at the start. if i want just the content then i need to strip that junk first
-
trwnh@mastodon.social replied to trwnh@mastodon.social last edited by
earlier i said that html’s head-body split is not the same as the metadata-content split i am after; after some further thought, this isn’t really true. i think what i am trying to model here is a way to be able to detect and handle arbitrary header data, by unwrapping it to get at the body content. but i’m realizing that this body content may itself have its own nested headers and body…
-
trwnh@mastodon.social replied to trwnh@mastodon.social last edited by
more precisely there is a format to the header data and there is a format to the body content
an http request/response can be serialized as a text file which has http headers and http body, and then that http body can be of a certain content type like html, which itself has html headers and html body. the html body content is often also of type html
you can progressively wrap or unwrap “body content” with “header data” in different formats. i’m not sure how best to describe this…
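a rough sketch of the progressive unwrapping, assuming a serialized http response whose body is a full html document. `unwrap_http` and `unwrap_html` are hypothetical helpers, and the regex-based html split is deliberately naive (a real html parser would be more robust).

```python
import re
from typing import Dict, Tuple


def unwrap_http(raw: str) -> Tuple[Dict[str, str], str]:
    """Split a serialized http response into (headers, body) at the first blank line."""
    head, _, body = raw.partition("\r\n\r\n")
    headers = {}
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers, body


def unwrap_html(document: str) -> Tuple[str, str]:
    """Naively split an html document into the inner markup of (head, body)."""
    flags = re.S | re.I
    head = re.search(r"<head(?:\s[^>]*)?>(.*?)</head>", document, flags)
    body = re.search(r"<body(?:\s[^>]*)?>(.*?)</body>", document, flags)
    # a fragment with no <body> is treated as body content as-is
    return (head.group(1) if head else "", body.group(1) if body else document)


raw = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><head><title>hi</title></head><body><p>hello</p></body></html>"
)

# layer 1: http headers wrap the http body
http_headers, http_body = unwrap_http(raw)

# layer 2: the http body is itself html, with its own head and body
html_head, html_body = "", http_body
if http_headers.get("content-type", "").startswith("text/html"):
    html_head, html_body = unwrap_html(http_body)
```

each layer here uses a different header format and a different way of marking where the body starts, but the shape of the operation is the same: peel off the header, keep the body, and look at the declared type to decide whether to peel again.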
-
trwnh@mastodon.social replied to trwnh@mastodon.social last edited by
how can we generalize this header+content pattern, basically
i’m fairly sure you need to at least define header type, delimiter type, content type
example:
- header = toml
- delimiter = +++ to start, +++ to end
- content = html

is this enough to describe a canonical data format?
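one way to write that triple down as data, as a sketch only: `WrappedFormat` and `TOML_HTML` are hypothetical names, and the header is returned as raw text rather than parsed, so that the header format can be handled by whatever parser fits it.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class WrappedFormat:
    header_type: str   # e.g. "toml"
    start: str         # delimiter line that opens the header
    end: str           # delimiter line that closes the header
    content_type: str  # e.g. "text/html"

    def split(self, text: str) -> Tuple[Optional[str], str]:
        """Return (raw header, content); header is None if the delimiters are absent."""
        lines = text.split("\n")
        if lines[0].strip() != self.start:
            return None, text
        for i in range(1, len(lines)):
            if lines[i].strip() == self.end:
                header = "\n".join(lines[1:i])
                content = "\n".join(lines[i + 1:])
                return header, content
        return None, text  # opening delimiter was never closed


# the example above: toml header, +++ to start, +++ to end, html content
TOML_HTML = WrappedFormat("toml", "+++", "+++", "text/html")

header, content = TOML_HTML.split('+++\ntitle = "hello"\n+++\n<p>hi</p>\n')
```

a processor that only cares about the content can call `split` and discard the header without understanding toml at all, which is the portability property being asked for. whether three fields are really enough is still open; this sketch says nothing about character encoding or about headers that are not delimited by whole lines.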
-
trwnh@mastodon.social replied to trwnh@mastodon.social last edited by
side note: i wish there was a distinction between html content and a full html document… if you try to render html content in a browser and it isn’t a full html document, weird things might happen