Peertube chat plugin: quick feedback on a live stress test

By John Livingston

 Introduction

Last Friday, the 30th of December, the “Au Poste” team (a French media outlet started by David Dufresne) organised a Peertube live test. It was an opportunity for Framasoft, Octopuce and me to test Peertube, as well as my chat plugin, “at scale”.

This is something I had been waiting for for a long time. As an independent developer, it is hard for me to gather hundreds of people to run stress tests. David Dufresne has a community of several hundred people who follow him on Twitch, most of whom, for political reasons, are trying to free themselves from the GAFAM. This full-scale test was long-awaited by many people in his community.

So I had several meetings with David and his team the week before to organise this live event. On his side, he also contacted Octopuce (his hosting provider) and Framasoft (which is behind the Peertube project).

Note: the plan was for me to be connected to the chat to answer questions, but it was so active that it was difficult to follow any of it. I ended up joining the live stream halfway through. We had many interesting discussions, on politics, tech, and more. The replay is available here: Peerturbateurs, unissez-vous! Test d’un live Peertube avec 400 viewers.

 Feedback

Octopuce has published a very thorough article (in French) which details how the servers handled the test: Peertube live load testing with Auposte (fr).

For people who don’t read French, in a nutshell: 450 viewers tried to connect to the live stream and the chat at the same time.
We had some issues with the chat plugin. Some I was expecting, others not. I will try to go through them in this article, and explain what happened and why I think it wasn’t working as well as it could. I think that some of the issues are a little more complex than the Octopuce article made them sound. This quickly written article is also an opportunity for me to explain what I have in mind to improve the situation.

 Technical details

We’re now going to look at what happened during this stress test. I will go through the technical details that could, IMHO, explain the observed metrics.

I think some problems were due to many little things that were more or less predictable. Together, they caused trouble.

 1. Architecture

The Peertube chat plugin is a bit special. The goal is to be able to install it from the Peertube interface, without any server configuration required, and without root rights.

This means I can’t:

  • Use official packages (Debian, ...),
  • Assume the necessary XMPP ports are open on the server,
  • Access a database. Peertube’s plugin system only gives basic access to the PostgreSQL database. Adding tables from there doesn’t look very clean.

This led me to make choices that are not optimal, and I am aware of it. I already have a very long TODO list with lots of possible improvements. It’s only a matter of time.

Part of this TODO list is already described in the article 2023 will be full of new features, available in multiple languages.

The rest is tracked on the GitHub issue tracker. These issues are also sorted in a GitHub project.

For the frontend, I use ConverseJS, which allows connecting to XMPP servers from a web browser, using JavaScript.

Let’s dig into the constraints I listed above and see what they imply.

 1.A AppImage

The chat plugin uses the Prosody XMPP server as a backend. I need to be able to use Prosody without installing anything on the server. No “root” access.

I created an AppImage of Prosody based on Debian stable packages. This AppImage contains Lua (language used for Prosody), Prosody itself, and a small launcher script used to start either prosody or prosodyctl (to avoid having two different AppImages).

The AppImage is created with appimage-builder, from this configuration file: appimage yml.

Then Peertube generates a configuration file for Prosody (from plugin parameters), and launches the AppImage. It’s not launched as a daemon, but as a child process of Peertube. This way if Peertube is stopped, we’re sure Prosody is too.
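
To make the "child process rather than daemon" idea concrete, here is a minimal sketch in Python. It is not the plugin's code (the plugin runs inside Peertube's NodeJS process); the function name `launch_child` and the cleanup strategy are assumptions made for illustration. It only shows the general pattern: the parent starts the server as a child and stops it when the parent itself exits.

```python
import atexit
import subprocess


def launch_child(command):
    """Start `command` as a child process and return (process, stop_function).

    `command` stands in for "the AppImage plus its arguments" (hypothetical).
    """
    child = subprocess.Popen(command)

    def stop_child():
        # Only act if the child is still running.
        if child.poll() is None:
            child.terminate()
            try:
                child.wait(timeout=5)
            except subprocess.TimeoutExpired:
                # The child ignored the polite request: force it.
                child.kill()
                child.wait()

    # Stop the child when the parent interpreter exits normally.
    atexit.register(stop_child)
    return child, stop_child
```

A caller would keep the returned `stop_child` function and also call it from its own shutdown path, since `atexit` handlers do not run if the parent is killed abruptly.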

 1.B Proxy

As ports on the machine could be closed by firewalls, we need to proxy everything through Peertube. It’s ugly, but it works out of the box.

How does it work?
I use Websocket or BOSH connections which go through 2 successive reverse proxies:

  • nginx in front of Peertube
  • Peertube itself

Peertube is written in NodeJS, so I use a NodeJS library which allows me to set up a reverse proxy. It’s not optimal, and I probably should rework it to improve performance.

Note: I plan to provide a way to have advanced configuration in the plugin, which would allow us to optimise it, especially by giving direct access to the XMPP server from the outside. It will require root access on the server though.

 1.C Database

No database: Prosody uses its internal storage module, which means everything is stored as files: user files, chat history, ...

 2. Virtual Hosts and XMPP Components

In the generated Prosody configuration we can find:

  • A Virtual host for connected Peertube users. If I have a Peertube account “john@instance.tld”, then I will have an XMPP user “john@instance.tld”.
  • A Virtual host for anonymous users. Users without an account can join the chat just by choosing a nickname. In the live test on the 30th, we had more than 400 people in this situation.
  • A “room” component, to handle groupchats.
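
As an illustration of the three entries listed above, here is a small Python sketch that renders a Prosody-style configuration. It is not the plugin's actual generator: the function name and the `anon.` and `room.` host prefixes are assumptions chosen for the example. The `VirtualHost`, `Component ... "muc"`, `authentication = "anonymous"` and `storage = "internal"` directives are standard Prosody configuration syntax.

```python
def render_prosody_config(peertube_host: str) -> str:
    """Render a minimal configuration with two virtual hosts and a MUC component.

    Host names derived from `peertube_host` are hypothetical.
    """
    anonymous_host = f"anon.{peertube_host}"  # assumption: naming scheme
    room_host = f"room.{peertube_host}"       # assumption: naming scheme
    lines = [
        "-- Everything is stored as files (no database).",
        'storage = "internal"',
        "",
        "-- Users who are logged in to Peertube.",
        f'VirtualHost "{peertube_host}"',
        "    -- authentication settings omitted in this sketch",
        "",
        "-- Users without an account, who only pick a nickname.",
        f'VirtualHost "{anonymous_host}"',
        '    authentication = "anonymous"',
        "",
        "-- Groupchats (one room per live).",
        f'Component "{room_host}" "muc"',
    ]
    return "\n".join(lines) + "\n"
```

Calling `render_prosody_config("instance.tld")` returns the text of the configuration, which a caller could then write to a file before starting the server.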

 3. Prosody specific modules

I use several kinds of Prosody modules. First, the plugin-specific ones, which have “livechat” in their name:
Chat plugin Prosody modules.

Here you’ll find:

  • A module to list rooms (to use in moderation tools)
  • A test module, to check Peertube and Prosody can communicate. This is used in a diagnostic tool included in the plugin.
  • Two vCard modules: one to get connected Peertube users’ avatars, and one to generate random avatars for anonymous users.

I also use existing modules:

  • mod_muc_http_default to verify rooms can be created and configure them
  • mod_auth_http to authenticate connected Peertube users

All these modules make requests to Peertube’s web API.

This is, IMHO, where some of the problems Octopuce had last week come from: because of missing features in Peertube’s plugin system, whenever Prosody needs to call Peertube, it has to go through nginx again. An issue has been opened; it needs to be discussed and implemented.
Note: it was implemented today, and will be available in the next Peertube version.

In the meantime, this can be configured by hand in advanced plugin settings, but it’s not very well documented.

 4. vCards

An issue that had already been identified is that we use vCards in ConverseJS. vCards are like “electronic business cards” for chat users. They’re also used to store avatars, and that’s where I get them from.

As soon as someone connects, ConverseJS fetches the vCard of every other connected user. And when a new person joins, everyone already connected asks for that person’s vCard.

For users logged in to Peertube, their Peertube avatar is fetched. In last week’s test, only 4 or 5 people had Peertube accounts and were in this case.

For anonymous users, a random avatar is chosen from a list of a few dozen and displayed.
The code handling this is here: mod_vcard_peertubelivechat (https://github.com/JohnXLivingston/peertube-plugin-livechat/blob/main/prosody-modules/mod_vcard_peertubelivechat/mod_vcard_peertubelivechat.lua).
I’m new to Lua, so this may not be the best code there is.

Last week we had 400 people joining around the same time. 400 people asking for 400 vcards. It was really sluggish, and on top of that people refreshed the page multiple times instead of waiting a bit for things to settle down.

In addition to this, each avatar is about 4.2 KB as a JPG. This JPG is base64-encoded before being sent via XMPP, which makes it about 6 KB per avatar.
400 * 400 * 6 KB = 960 MB, at once, in addition to the XMPP envelope and other vCard data.

This also doesn’t take into account the CPU and disk load on Prosody, which was at 100% for some reason. I think vCards are the culprit (as mentioned in the Octopuce article, Prosody is single-threaded).
This leads to timeouts, people refreshing the page even more, etc.
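
The avatar estimate above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes decimal kilobytes (4.2 KB = 4200 bytes), counts the base64 payload only, and the function names are made up for the example.

```python
def base64_size(raw_bytes: int) -> int:
    """Size in bytes of a padded base64 encoding of `raw_bytes` bytes."""
    # Base64 turns every group of 3 input bytes (rounded up) into 4 characters.
    return 4 * ((raw_bytes + 2) // 3)


def vcard_burst_bytes(users: int, avatar_bytes: int) -> int:
    """Avatar payload when each of `users` clients fetches `users` vCards."""
    return users * users * base64_size(avatar_bytes)


encoded = base64_size(4200)           # 5600 bytes, rounded up to about 6 KB in the text
total = vcard_burst_bytes(400, 4200)  # 896_000_000 bytes, i.e. 896 MB
print(encoded, total / 1_000_000)     # prints: 5600 896.0
```

The exact figure is a bit below the 960 MB quoted above because the text rounds each avatar up to 6 KB; either way, the order of magnitude is the same, and it grows with the square of the number of users.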

Computing avatars on the frontend, and no longer fetching vCards, is on my TODO list. This should help greatly. I am also considering disabling avatars completely and replacing them with a simple coloured nickname, which could use XEP-0392.
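
To give an idea of what a coloured nickname computed on the client could look like, here is a Python sketch of the hue-angle step of XEP-0392 as I understand it: hash the identifier with SHA-1, read the first two bytes as a little-endian 16-bit number, and map it onto a 360° circle. The XEP then converts that angle to a colour in the HSLuv space; the sketch below swaps in plain HSL from the standard library instead, so the colours will not match a real XEP-0392 implementation. Function names are illustrative.

```python
import colorsys
import hashlib


def hue_angle(identifier: str) -> float:
    """Hue angle in degrees, in [0, 360), derived from the identifier."""
    digest = hashlib.sha1(identifier.encode("utf-8")).digest()
    # First two bytes of the digest, read as a little-endian 16-bit integer.
    value = int.from_bytes(digest[:2], "little")
    return value / 65536 * 360


def nickname_color(nickname: str) -> str:
    """CSS hex colour for a nickname (plain HSL stand-in, not HSLuv)."""
    hue = hue_angle(nickname) / 360
    # Fixed lightness (0.5) and full saturation (1.0); only the hue varies.
    red, green, blue = colorsys.hls_to_rgb(hue, 0.5, 1.0)
    return "#{:02x}{:02x}{:02x}".format(
        round(red * 255), round(green * 255), round(blue * 255)
    )
```

The point of this approach is that every client computes the same colour for the same nickname without asking the server for anything, which removes the vCard requests entirely.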

 5. Chat history, pruning, ...

Until last Friday’s test, I didn’t think handling chat history would be an issue. This is why I hadn’t tried to optimise it, and I configured ConverseJS to fetch it when a user joins.

It turns out that, with 400 users at once, that’s X messages * 400 users to fetch, all of it stored in files (Prosody’s internal storage). It was bound to be an issue.

Another completely unexpected issue: I don’t prune messages in ConverseJS, so they all pile up and the browser starts struggling on each keystroke.

This again made people refresh the page even more during the live stream, causing multiple issues. This is something I will be looking into very quickly.
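
Pruning, in its simplest form, means keeping only the most recent messages in memory. The Python sketch below illustrates the idea with a fixed-size buffer; it is not ConverseJS code, and the class name and the limit of 100 messages are arbitrary choices for the example.

```python
from collections import deque


class MessageBuffer:
    """Keeps only the `max_messages` most recent chat messages."""

    def __init__(self, max_messages: int = 100):
        # A deque with maxlen silently drops the oldest item when full.
        self._messages = deque(maxlen=max_messages)

    def add(self, message: str) -> None:
        self._messages.append(message)

    def visible(self) -> list:
        """Messages currently kept, oldest first."""
        return list(self._messages)
```

With a buffer of 3, adding five messages `m0` to `m4` leaves `m2`, `m3` and `m4`: memory and rendering cost stay bounded however long the live lasts.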

 6. Network load caused by fetching ConverseJS

The way ConverseJS is currently integrated in the plugin means that:

  • ConverseJS is included in its entirety, even modules I don’t need
  • It goes through Nginx and then NodeJS to load files.

Every time someone opens the page or refreshes it, the browser downloads multiple MB through Nginx and NodeJS. This probably explains part of the huge bandwidth consumption observed by Octopuce.

I have already started cleaning all of this. If everything goes to plan, ConverseJS will be optimised and will load more efficiently in the next version of the plugin.

 7. Join/Part messages and activity

I have enabled the display of join/part messages in the chat (“Juliet joined the room”), as well as activity messages (“Max is writing...”). With 400 people joining and leaving multiple times, and with all the refreshes, this can also quickly add up and generate traffic (and make the JS sluggish again).

I will remove this. I may keep join/part messages for moderators only, but messages such as “Jack is writing” are of little interest and only generate hundreds of unnecessary XMPP messages.

 8. Changing nicknames

So that anybody without a Peertube account can see the live chat without even having to choose a nickname, I assign them a random one such as “Anonymous 123456”.

To be able to chat, they must first choose a nickname, and ConverseJS is going to display “Anonymous 123456 becomes Camille” each time. This again generates lots of XMPP traffic for this number of users and lots of JavaScript running for little gain.

I may end up hiding this type of message as well, at least when the original nickname is “Anonymous 123456”. I am also thinking about removing those who haven’t yet chosen a nickname from the list of participants in the room, and displaying only a counter, “X anonymous”, which can only reduce traffic and browser load.
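
The "X anonymous" counter could work roughly as in the Python sketch below. It assumes that default nicknames always follow the “Anonymous” plus digits pattern shown above; the function name and the exact pattern are assumptions for illustration, not the plugin's implementation.

```python
import re

# Assumption: default nicknames look like "Anonymous 123456".
ANONYMOUS_NICKNAME = re.compile(r"Anonymous \d+")


def summarize_participants(nicknames):
    """Split a participant list into (chosen nicknames, anonymous count)."""
    named = [n for n in nicknames if not ANONYMOUS_NICKNAME.fullmatch(n)]
    anonymous_count = len(nicknames) - len(named)
    return named, anonymous_count


named, anonymous_count = summarize_participants(
    ["Camille", "Anonymous 123456", "Juliet", "Anonymous 42"]
)
print(named, f"{anonymous_count} anonymous")
# prints: ['Camille', 'Juliet'] 2 anonymous
```

The participant list then only contains people who can actually talk, and its size no longer grows with the number of silent viewers.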

 In Conclusion

As I mentioned at the beginning of this article, this is just a first draft. I wrote down what came to mind while writing, and it’s possible I have forgotten things. You can find all the improvements mentioned in this article in this GitHub issue tracker.

There are still many possible improvements for this plugin. I only need to get back to it!

 Thanks

I want to thank the XMPP community. Many people working on Prosody, ConverseJS, XEPs, etc. are looking at my work and regularly provide help.
Thanks to all of you: it confirms the choice of technology I made when I started this project.

Many thanks also to David and his team, Octopuce, and Framasoft. It was a fun live event, and there are plenty of things to learn from it.