A few months back I wrote about reinventing wheels. Going down that course has been interesting and I hope to continue reinventing parts of the personal cloud stack. By personal cloud, I mean taking all of the services you have hosted elsewhere and pulling them in. This feeds into the IndieWeb movement as well.
A couple of years ago, I deployed my first colocated server with my friends. I got a pretty monstrous setup compared to my needs, but I figured it’d pay for itself over time and it has. One side effect of having all this space was that I could let my friends also have slices of my server. It was nice sharing those extra resources. Unfortunately, hosting my friends’ slices meant that doing anything to the root system or to how the system was organized was a bit tedious or even off limits.
In owning my own services, I want to restructure my server. I also want to have interesting routing between containers and keep every container down to the single-process ideal. To move to this world, I’ve had to ask my friends to give up their spots on my server. Everyone was really great about this, thanking me for hosting them for this long. I was worried people would complain and I’d have to be more forceful, but instead things were wonderful.
The next step I want to take after deploying my personal cloud will be to start replacing pieces, one by one, with my own custom code. The obvious first one will be the SMTP server, since I’ve already started implementing one in Rust. After that it may be something like my blog, or Redis, or any number of other parts of the cloud. The eventual goal is to have implemented a fair portion of all cloud services so that I can better understand them. I won’t be restricting myself to any one language. I will be pushing for a container per process, with linking between containers to share services.
Overall, I hope to learn a bunch and have some fun in the process. I recently picked up the domain http://ownstack.club and hope to have something up on it in the near future!
After languishing for a few years, Pulse got a burst of interest and development in 2014. Since I first heard of it, I’ve found the idea of a central message bus for the goings-on in Mozilla’s various systems rather intriguing, and I’m excited to have been able to grow it over the last year.
Pulse falls into that class of problem that is a result of, to borrow from a past Mozilla leader, our tendency to make our lives difficult, that is, to work in the open. Using RabbitMQ as a generic event stream is nothing special; Mozilla’s use of it as an open system is, I believe, completely unique.
Adapting a system intended for private networks into a public service always results in fascinating problems. Pulse has a decent permission-control system, but it’s not designed for self service. It is also very trusting of its users, who can easily overwhelm the system by just subscribing to streams and never consuming the messages.
The solution to both these problems was to design a management application: PulseGuardian. Via Persona, it handles account management, and it comes with a service that monitors Pulse’s queues. Since we presume users are not malicious, it sends a friendly warning when it notices a queue growing too large, but if ignored it will eventually kill the queue to save the system.
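The warn-then-kill policy described above can be sketched roughly as follows. This is a minimal illustration, not PulseGuardian’s actual code: the threshold values, the `Action` enum, and the `decide` function are all hypothetical names and numbers chosen for the example.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"      # queue is healthy, or the owner was already warned
    WARN = "warn"      # send the queue owner a friendly warning
    DELETE = "delete"  # kill the queue to protect the broker


# Hypothetical thresholds (number of unconsumed messages); assumptions only.
WARN_THRESHOLD = 2000
DELETE_THRESHOLD = 8000


def decide(queue_size, already_warned):
    """Pick what to do with a queue holding `queue_size` unconsumed messages."""
    if queue_size >= DELETE_THRESHOLD:
        # The warning was ignored (or the queue grew too fast): save the system.
        return Action.DELETE
    if queue_size >= WARN_THRESHOLD and not already_warned:
        # Users are presumed not to be malicious, so warn before acting.
        return Action.WARN
    return Action.NONE
```

A monitoring service would call something like `decide` for each queue on every polling pass and remember which owners have already been warned, so that a growing queue produces one warning rather than one per pass.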
If you build it, they will come, or so says some movie I’ve never seen, but in this case it appears to be true. TaskCluster has moved whole-hog over to Pulse for its messaging needs, and the devs wrote a really nice web app for inspecting live messages. MozReview is using it for code-review bots and autolanding commits. Autophone is exploring its use for providing Try support to non-BuildBot-based testing frameworks.
Another step for Pulse beyond the prototype phase is a proper library. The existing mozillapulse Python library works decently, aside from some annoying problems, but it suffers from a lack of extensibility. I’m also beginning to believe it should be based directly on a lower-level amqp or RabbitMQ-specific Python package rather than the strange, overly generic kombu messaging library, in part because of the apparent lack of confirm channels in kombu. We’re looking into taking ideas from TaskCluster’s Pulse usage in bug 1133602.
Here’s the first episode! I streamed it last Wednesday, and it was mostly concerned with bug 1090439, which is about making the print dialog and progress calls from the child process asynchronous.
Note that I struggled with some resolution issues in this episode. I’m working with Richard Milewski from the Air Mozilla team to make this better for the next episode. Sorry about that!
The Monday Project Meeting
I have blogged about clustering BHR hangs before. This post is dedicated to the add-on side of things.
In a way BHR could be seen as a distributed profiler: each user runs a local profiler that samples the stack of a thread only when that thread is hanging for over N milliseconds. Then the stacks are sent to our backend infrastructure through a Telemetry submission.
Imagine profiling Firefox for an hour. At the end of the hour you would like to determine the impact of one of your installed add-ons on the browsing experience. That is not simple at all: the add-on might have been in use for only a fraction of your session, say 20 seconds. In that fraction it might have slowed down the browser significantly, though. Since you are collecting hangs for the whole session, that signal might end up dominated by noise. This means that in most cases, add-on hangs are not going to look like a big deal once aggregated.
I aggregated the stacks and partitioned them into browser-specific and add-on-specific ones. For a given add-on Foo, I computed for every session the ratio of the number of Foo hangs to the number of hangs of both Firefox and Foo. Finally, I averaged those ratios. This final number gives an idea of the average proportion of hangs due to Foo that a user of that add-on can expect to find in their session.
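The ratio described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the session layout (a dict with `browser_hangs` and `addon_hangs` keys), the function name, and the choice to skip sessions with no hangs at all are mine, not the actual analysis code.

```python
from statistics import mean


def addon_hang_ratio(sessions, addon):
    """Average over sessions of addon_hangs / (addon_hangs + browser_hangs).

    Each session is assumed to be a dict such as
    {"browser_hangs": 9, "addon_hangs": {"Foo": 1}}.
    """
    ratios = []
    for session in sessions:
        addon_hangs = session["addon_hangs"].get(addon, 0)
        total = addon_hangs + session["browser_hangs"]
        if total == 0:
            # Assumption: the ratio is undefined for a session without
            # hangs, so such sessions are left out of the average.
            continue
        ratios.append(addon_hangs / total)
    return mean(ratios) if ratios else 0.0


# Two made-up sessions, both with the hypothetical add-on "Foo" installed.
example_sessions = [
    {"browser_hangs": 9, "addon_hangs": {"Foo": 1}},  # ratio 1/10
    {"browser_hangs": 3, "addon_hangs": {"Foo": 1}},  # ratio 1/4
]
example_ratio = addon_hang_ratio(example_sessions, "Foo")
```

Averaging the per-session ratios, rather than dividing total add-on hangs by total hangs, gives every session equal weight regardless of how long it ran.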
That’s not the whole story, though. One can imagine scenarios where an add-on triggers an asynchronous workload in the browser, such as garbage collection, which will not be attributed to the add-on itself. In an admittedly less common but still plausible scenario, an add-on could improve the performance of the browser and in doing so reduce the number of browser-specific hangs, which increases the ratio. In general I tend to see the final number as a lower bound, and even though it’s not precise, it can help identify bottlenecks.
Among the most popular add-ons, the ones with a high ratio are:
- LastPass (12%)
- Hola Better Internet (11%)
- Avira Antivirus (10%)
- NoScript (9%)
- Adblock (9%); note that this is not Adblock Plus
LastPass, for instance, has a ratio of hangs of about 12%, which means that if you had just LastPass installed and no other add-on, on average about 12% of the hangs you would experience would likely be due to LastPass. That’s a lot, and the main culprit seems to be the logic that scans for input fields when the document changes dynamically. That said, I am glad I can’t see any popular add-ons with a shockingly high ratio of hangs.
The numbers shouldn’t be used to mark an add-on as bad or good; they are based on a fallible heuristic and they are meant to give us the tools to prioritize which add-ons we should keep an eye on.
MariaDB Best Practices
Today is the first day of the mentoring program that I announced in my previous blog post.
The good news is that I was overwhelmed by the number of responses I received to the blog post. Within three days, 57 people sent me an email requesting to be a part of the program. This tells me there is a strong need for more guided programs like this. On the downside, it was very hard to select only four people from the group.
In the end, I selected five people to take part. They are from all over the world: India (2), Germany (1), and the USA (2).
I have assigned the first bugs, and this week the mentees should work on getting a build working and finding their way through the Firefox developer ecosystem.
Tagged: firefox, mozilla, planet-mozilla