Slightly better titles from fediverse topics
-
An update from last night brings some additional logic to the title generation of topics from the fediverse.
Previously, if a title was provided in the `name` property, that was used as the topic title. While that hasn't changed (and is the strongest signal for a topic title), not all fediverse content contains titles. Specifically, Mastodon posts do not require or even have a space to put a title in.
For those cases, we fall back to generating one based on the content. Until now, we literally grabbed the first 128 characters or so and added an ellipsis to the end.
While that worked okay as a stopgap, it meant that a lot of topics ended up with really long titles — not ideal.
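A minimal sketch of that stopgap in Python, for illustration only. The function name `truncate_title`, the whitespace collapsing, and the choice to skip the ellipsis on short content are assumptions, not the actual NodeBB code.

```python
def truncate_title(content: str, limit: int = 128) -> str:
    """Cut plain-text content at `limit` characters and append an ellipsis."""
    # Collapse newlines and repeated spaces so the title is a single line.
    text = " ".join(content.split())
    if len(text) <= limit:
        # Assumption: short content is used as-is, without an ellipsis.
        return text
    return text[:limit].rstrip() + "..."
```

With a 128-character cut, any post longer than a sentence or two produces a title that fills the whole limit, which is why so many topics ended up with long titles.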
The new logic tries to grab the first line of text (either the first `<p>` or line), and from there, the first sentence, using some naive regular expressions. While still not a proper alternative to... you know... specifying a title, it's better than nothing I suppose!
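Roughly, the idea could be sketched like this in Python. This is not the NodeBB implementation; the function names and the exact regular expressions are assumptions chosen to match the description above, and this version only treats a period as the end of a sentence.

```python
import re
from html import unescape


def first_line(content: str) -> str:
    """Return the text of the first <p> element, or else the first non-empty line."""
    match = re.search(r"<p(?:\s[^>]*)?>(.*?)</p>", content, flags=re.IGNORECASE | re.DOTALL)
    if match:
        text = match.group(1)
    else:
        lines = [line for line in content.splitlines() if line.strip()]
        text = lines[0] if lines else ""
    text = re.sub(r"<[^>]+>", "", text)  # naively strip any remaining tags
    return unescape(text).strip()


def first_sentence(line: str) -> str:
    """Return everything before the first period that is followed by whitespace or the end."""
    match = re.match(r"(.+?)\.(?:\s|$)", line)
    return match.group(1) if match else line


def generate_title(content: str) -> str:
    return first_sentence(first_line(content))
```

Requiring whitespace or end-of-line after the period keeps something like "Version 1.2 is out. Enjoy" from being cut at "Version 1".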
I wonder if other fediverse software implements title generation logic like this...
-
jupiter_rowland@hub.netzgemeinde.eu replied to julian@community.nodebb.org
@julian What Lemmy understands is this:
Title
@Community
Post body
It was added back in the day to make it possible for Mastodon users to start new threads in connected Lemmy communities.
#FediMeta #FediverseMeta #CWFediMeta #CWFediverseMeta
-
rimu@mastodon.nzoss.nz replied to julian@community.nodebb.org
@julian PieFed uses the first 150 chars of the first line, but stopping at the first '.' in the first line makes even more sense.
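A regex-free sketch of that rule in Python, assuming nothing about PieFed's real code: the function name `title_from_first_line` and the order of operations (cut at the first '.', then cap at 150 characters) are illustrative guesses.

```python
def title_from_first_line(body: str, limit: int = 150) -> str:
    """Use the first line, stop at the first '.', and cap the result at `limit` characters."""
    first = body.strip().split("\n", 1)[0].strip()
    dot = first.find(".")
    if dot != -1:
        first = first[:dot]  # drop the period and everything after it
    return first[:limit].strip()
```

Unlike a pattern that requires whitespace after the period, this cuts at any '.', so "v1.2 released" would become "v1".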
-
crazycells@community.nodebb.org replied to julian@community.nodebb.org
@julian thanks Julian! what about AI-backed title generation with character limitation?
for example, I put your post here: https://seo.ai/tools/ai-title-generator
and got this:
not bad I guess...
I love "topic sentences" and try to use them when I start a new topic, but unfortunately they are not commonly used by others.
-
julian@community.nodebb.org replied to crazycells@community.nodebb.org
@crazycells no, I will never use AI for this purpose.
Because the resulting content is in the title, it would be implicitly misattributed to the topic author, without their consent.
-
julian@community.nodebb.org replied to jupiter_rowland@hub.netzgemeinde.eu
@jupiter_rowland@hub.netzgemeinde.eu said in Slightly better titles from fediverse topics:
Title
@Community
Post body
Thanks, I hate it.
I should say, rather, that I get why it was done, and bonus points for just getting it done, but it reads so much like a "hack it until it works" methodology that I feel like we ought to be better than that by now.
-
julian@community.nodebb.org replied to rimu@mastodon.nzoss.nz
@rimu@mastodon.nzoss.nz said in Slightly better titles from fediverse topics:
I like your method of stopping at the first '.', that would yield better results more often.
Thanks, it worked decently until I remembered that there were additional punctuation marks besides the lowly period.
So I had to add in support for `?` and `!`, and update the logic to actually add those punctuation marks back in to the title... and yet there are more edge cases... some bot accounts post a title-esque first line along with a link, which needs to be teased out.
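A hedged Python sketch of those two refinements, not the actual NodeBB code. The names `URL_RE`, `SENTENCE_RE`, and `title_from_line` are made up for illustration, and keeping the period as well as `?` and `!` is an assumption.

```python
import re

URL_RE = re.compile(r"https?://\S+")
# A sentence ends at '.', '?' or '!' followed by whitespace or the end of the line.
SENTENCE_RE = re.compile(r"(.+?[.?!])(?:\s|$)")


def title_from_line(line: str) -> str:
    """Return the first sentence of `line`, keeping its terminator and dropping links."""
    # Remove links first so dots inside URLs are never mistaken for sentence ends.
    text = URL_RE.sub("", line).strip()
    match = SENTENCE_RE.match(text)
    # The terminator sits inside the capture group, so it stays in the title.
    return match.group(1) if match else text
```

For a bot-style post such as "Breaking: something happened https://example.org/story", the link is removed and the remaining headline is used whole, since it has no terminator.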
-
scott@authorship.studio replied to julian@community.nodebb.org
@julian
...not all fediverse content contains titles. Specifically, Mastodon posts do not require or even have a space to put a title in.
Hubzilla dealt with that issue by putting the title of the post in the post itself, so people on Mastodon and other platforms could see the title. So the title is transmitted in both the `title` field and the `body` field. It looks redundant, but I see why they did it that way, especially since Mastodon is so dominant in the space.
-
mikedev@fediversity.site replied to julian@community.nodebb.org
Best of luck. We gave up trying to force or generate a title back in 2010, because some posts are nothing but a photo or video. And as you noticed, titles are somewhat incompatible with the microblog side of the fediverse. If your own software requires a title, you're probably stuck with something like '[unknown title]' in cases where you can't extract words from the content.
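For software that must have a title, that last-resort case might look like the following Python sketch. The name `title_or_placeholder` and the tag and link stripping are assumptions for illustration; only the '[unknown title]' string comes from the post above.

```python
import re

PLACEHOLDER = "[unknown title]"


def title_or_placeholder(content: str) -> str:
    """Return the visible text of a post, or a placeholder if it contains no words."""
    text = re.sub(r"<[^>]+>", " ", content)    # drop tags such as <img> or <video>
    text = re.sub(r"https?://\S+", " ", text)  # bare links do not count as words
    text = " ".join(text.split())
    if not re.search(r"\w", text):
        return PLACEHOLDER
    return text
```

A post that is only `<p><img src="cat.jpg"></p>` has no word characters left after stripping, so it gets the placeholder.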
-
rimu@mastodon.nzoss.nz replied to julian@community.nodebb.org
@julian Ooo good point about adding the ? back on.
If you're interested in a non-regex solution, here's what I have - https://codeberg.org/rimu/pyfedi/src/branch/main/app/utils.py#L247