In my potentially never-ending quest to get on top of the ever-growing email onslaught, I came across Tony Hsieh's Yesterbox method/manifesto. It's a deceptively simple but effective way to deal with your inbox: You only answer the emails from yesterday (plus the very few emails which require immediate attention). That way you get a chance to be on top of your email (as the number of emails from yesterday is finite) instead of being caught in an endless game of whack-a-mole. Plus people will get a guaranteed response from you in less than 48 hours - whereas in the past I often skipped more complex emails for days as I was constantly dealing with new incoming mail.
For a while I toyed around with different setups, until I settled on the following Gmail configuration, which works beautifully for me:
The left box shows your incoming email (which allows for quick scanning and identifying those pesky emails which require immediate attention), the top right box is your Yesterbox and thus the email list I focus on. And the lower right box shows emails I starred - typically I use this for important emails I need to keep an eye on (say, for example, when I am waiting for an answer to an email).
It's a simple but incredibly effective setup - here's how you set this up in your Gmail account:
- Activate the Gmail Labs feature "Multiple Inboxes" in Settings/Labs
- After activating Multiple Inboxes and reloading the Settings page in Gmail you will have a new section fittingly called "Multiple Inboxes". Here you add two inboxes with custom searches: One will be your Yesterbox with a search for "in:inbox older_than:24h", the other one will be your Starred inbox with a custom search for "is:starred". Set the extra panels to show on the right side and increase the number of mails to be displayed to 50 (or whatever works for you) and you're done.
- There is no step three! :)
Enjoy and let me know if this works for you (or if you have an even better setup).
If you’ve opened the Browser Console lately while running Nightly with e10s enabled, you might have noticed a warning message – “unsafe CPOW usage” – showing up periodically.
I wanted to talk a little bit about what that means, and what’s being done about it. Brad Lassey already wrote a bit about this, but I wanted to expand upon it (especially since one of my goals this quarter is to get a handle on unsafe CPOW usage in core browser code).
I also wanted to talk about sluggishness that some of our brave Nightly testers with e10s enabled have been experiencing, and where that sluggishness is coming from, and what can be done about it.

What is a CPOW?
“CPOW” stands for “Cross-process Object Wrapper”1, and is part of the glue that has allowed e10s to be enabled on Nightly without requiring a full re-write of the front-end code. It’s also part of the magic that’s allowing a good number of our most popular add-ons to continue working (albeit slowly).
Let me give you an example.
In single-process Firefox, easy and synchronous access to the DOM of web content was more or less assumed. For example, in browser code, one could do this from the scope of a browser window:

let doc = gBrowser.selectedBrowser.contentDocument;
let contentBody = doc.body;
Here contentBody corresponds to the <body> element of the document in the currently selected browser. In single-process Firefox, querying for and manipulating web content like this is quick and easy.
In multi-process Firefox, where content is processed and rendered in a completely separate process, how does something like this work? This is where CPOWs come in2.
With a CPOW, one can synchronously access and manipulate these items, just as if they were in the same process. We expose a CPOW for the content document in a remote browser with contentDocumentAsCPOW, so the above could be rewritten as:

let doc = gBrowser.selectedBrowser.contentDocumentAsCPOW;
let contentBody = doc.body;
I should point out that contentDocumentAsCPOW and contentWindowAsCPOW are exposed on <xul:browser> objects, and that we don’t make every accessor of a CPOW have the “AsCPOW” suffix. This is just our way of making sure that consumers of the contentWindow and contentDocument on the main process side know that they’re probably working with CPOWs3. contentBody.firstChild would also be a CPOW, since CPOWs can only beget more CPOWs.
So for the most part, with CPOWs, we can continue to query and manipulate the <body> of the document loaded in the current browser just like we used to. It’s like an invisible compatibility layer that hops us right over that process barrier.
Well, not really.
CPOWs are really a crutch to help add-ons and browser code exist in this multi-process world, but they’ve got some drawbacks. Most noticeably, there are performance drawbacks.

Why is my Nightly so sluggish with e10s enabled?
Have you been noticing sluggish performance on Nightly with e10s? Chances are this is caused by an add-on making use of CPOWs (either knowingly or unknowingly). Because CPOWs are used for synchronous reading and manipulation of objects in other processes, they send messages to other processes to do that work, and block the main process while they wait for a response. We call this “CPOW traffic”, and if you’re experiencing a sluggish Nightly, this is probably where the sluggishness is coming from.
Instead of using CPOWs, add-ons and browser code should be updated to use frame scripts sent over the message manager. Frame scripts cannot block the main process, and can be optimized to send only the bare minimum of information required to perform an action in content and return a result.
Add-ons built with the Add-on SDK should already be using “content scripts” to manipulate content, and therefore should inherit a bunch of fixes from the SDK as e10s gets closer to shipping. These add-ons should not require too many changes. Old-style add-ons, however, will need to be updated to use frame scripts unless they want to be super-sluggish and bog the browser down with CPOW traffic.

And what constitutes “unsafe CPOW usage”?
“unsafe” might be too strong a word. “unexpected” might be a better term. Brad Lassey laid this out in his blog post already, but I’ll quickly rehash it.
There are two main cases to consider when working with CPOWs:
- The content process is already blocked sending up a synchronous message to the parent process
- The content process is not blocked
The first case is what we consider “the good case”. The content process is in a known good state, and it’s primed to receive IPC traffic (since it’s otherwise just idling). The only bad part about this is the IPC traffic.
The second case is what we consider the bad case. This is when the parent is sending down CPOW messages to the child (by reading or manipulating objects in the content process) when the child process might be off processing other things. This case is far more likely than the first case to cause noticeable performance problems, as the main thread of the content process might be bogged down doing other things before it can handle the CPOW traffic – and the parent will be blocked waiting for the messages to be responded to!
There’s also a more speculative fear that the parent might send down CPOW traffic at a time when it’s “unsafe” to communicate with the content process. There are potentially times when it’s not safe to run JS code in the content process, but CPOW traffic requires both processes to execute JS. This is a concern that was expressed to me by someone over IRC, and I don’t exactly understand what the implications are – but if somebody wants to comment and let me know, I’ll happily update this post.
So, anyhow, to sum – unsafe CPOW usage is when CPOW traffic is initiated on the parent process side while the content process is not blocked. When this unsafe CPOW usage occurs, we log an “unsafe CPOW usage” message to the Browser Console, along with the script and line number where the CPOW traffic was initiated from.

Measuring
We need to measure and understand CPOW usage in Firefox, as well as in add-ons running in Firefox, and over time we need to reduce this CPOW usage. The priority should be on reducing unsafe CPOW usage in core browser code.
If there’s anything that working on the Australis project taught me, it’s that in order to change something, you need to know how to measure it first. That way, you can make sure your efforts are having an effect.
We now have a way of measuring the amount of time that Firefox code and add-ons spend processing CPOW messages. You can look at it yourself – just go to about:compartments.
It’s not the prettiest interface, but it’s a start. The second column is the time spent processing CPOW traffic; the higher the number, the more time has been spent on it. Naturally, we’ll be working to bring those numbers down over time.

A possible quick fix for a slow Nightly with e10s
As I mentioned, we also list add-ons in about:compartments, so if you’re experiencing a slow Nightly, check out about:compartments and see if there’s an add-on with a high number in the second column. Then, try disabling that add-on to see if your performance problem is reduced.
If so, great! Please file a bug on Bugzilla in this component for the add-on, mention the name of the add-on4, describe the performance problem, and mark it blocking e10s-addons if you can.
We’re hoping to automate this process by exposing some UI that informs the user when an add-on is causing too much CPOW traffic. This will be landing in a Nightly near you very soon.

PKE Meter, a CPOW Geiger Counter
Logging “unsafe CPOW usage” is all fine and dandy if you’re constantly looking at the Browser Console… but who is constantly looking at the Browser Console? Certainly not me.
Instead, I whipped up a quick and dirty add-on that plays a click, like a Geiger Counter, anytime “unsafe CPOW usage” is put into the Browser Console. This has already highlighted some places where we can reduce unsafe CPOW usage in core Firefox code – particularly:
- The Page Info dialog. This is probably the worst offender I’ve found so far – humongous unsafe CPOW traffic just by opening the dialog, and it’s really sluggish.
- Closing tabs. SessionStore synchronously communicates with the content process in order to read the tab state before the tab is closed.
- Back / forward gestures, at least on my MacBook.
- Typing into an editable HTML element after the Findbar has been opened.
If you’re interested in helping me find more, install this add-on5, and listen for clicks. At this point, I’m only interested in unsafe CPOW usage caused by core Firefox code, so you might want to disable any other add-ons that might try to synchronously communicate with content.
If you find an “unsafe CPOW usage” that’s not already blocking this bug, please file a new one! And cc me on it! I’m mconley at mozilla dot com.
I pronounce CPOW as “kah-POW”, although I’ve also heard people use “SEE-pow”. To each his or her own. ↩
I say probably, because in the single-process case, they’re not working with CPOWs – they’re accessing the objects directly as they used to. ↩
And say where to get it from, especially if it’s not on AMO. ↩
Bi-weekly meeting to talk about the state of Mozilla, the community, and its projects.
I'm currently working on annotating moz.build files with metadata that defines things like which bug component and code reviewers map to which files. It's going to enable a lot of awesomeness.
As part of this project, I'm implementing a new moz.build processing mode. Instead of reading moz.build files by traversing DIRS variables from previously-executed moz.build files, we're evaluating moz.build files according to filesystem topology. This has uncovered a few cases where a moz.build file errors out because of assumptions that no longer hold. For example, for directories that are only active on Windows, the moz.build file might assume that the build is always targeting Windows.
One such problem was with gfx/angle/src/libGLESv2/moz.build. This file contained code similar to the following:

if CONFIG['IS_WINDOWS']:
    SOURCES += ['foo.cpp']
...
SOURCES['foo.cpp'].flags += ['-DBAR']
This always ran without issue because this moz.build was only included if building for Windows. This assumption is of course invalid when in filesystem traversal mode.
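A sketch of the shape of the fix (my assumption, based on the snippet above, is that the unconditional flags line was the culprit): in filesystem traversal mode the file must evaluate cleanly on every platform, so any reference to a conditionally-added source has to live inside the same conditional:

```python
# moz.build sketch (CONFIG, SOURCES are provided by the moz.build sandbox)
if CONFIG['IS_WINDOWS']:
    SOURCES += ['foo.cpp']
    # Only touch SOURCES['foo.cpp'] here, where it has actually been added;
    # on non-Windows platforms this whole block is skipped and nothing errors.
    SOURCES['foo.cpp'].flags += ['-DBAR']
```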
Anyway, as part of updating this trouble file, I lost maybe an hour of productivity. Here's how.
The top of the trouble moz.build file has a comment:

# Please note this file is autogenerated from generate_mozbuild.py,
# so do not modify it directly
OK. So, I need to modify generate_mozbuild.py. First things first: I need to locate it:

$ hg locate generate_mozbuild.py
gfx/skia/generate_mozbuild.py
So I load up this file. I see a main(). I run the script in my shell and get an error. Weird. I look around gfx/skia and see a README_MOZILLA file. I open it. README_MOZILLA contains some instructions. They aren't very good. I hop in #gfx on IRC and ask around. They tell me to do a Subversion clone of Skia and to check out the commit referenced in README_MOZILLA. There is no repo URL in README_MOZILLA. I search Google. I find a Git URL. I notice that README_MOZILLA contains a SHA-1 commit, not a Subversion integer revision. I figure the Git repo is what was meant. I clone the Git repo. I attempt to run the generation script referenced by README_MOZILLA. It fails. I ask again in #gfx. They are baffled at first. I dig around the source code. I see a reference in Skia's upstream code to a path that doesn't exist. I tell the #gfx people. They tell me sub-repos are likely involved and to use gclient to clone the repo. I search for the proper Skia source code docs and type the necessary gclient commands. (Fortunately I've used gclient before, so this wasn't completely alien to me.)
I get the Skia clone in the proper state. I run the generation script and all works. But I don't see it writing the trouble moz.build file I set out to fix. I set some breakpoints. I run the code again. I'm baffled.
Suddenly it hits me: I've been poking around with gfx/skia which is separate from gfx/angle! I look around gfx/angle and see a README.mozilla file. I open it. It reveals the existence of the Git repo https://github.com/mozilla/angle. I open GitHub in my browser. I see a generate_mozbuild.py script.
I now realize there are multiple files named generate_mozbuild.py. Unfortunately, the one I care about - the ANGLE one - is not checked into mozilla-central, so my search for it with hg locate did not reveal its existence. Between trying to get the Skia code cloned and generating moz.build files, I probably lost an hour of work. All because the file I actually needed wasn't checked into mozilla-central, while a similarly named one was!
I assumed that the single generate_mozbuild.py I found under source control was the only file of that name and that it must be the file I was interested in.
Maybe I should have known to look at gfx/angle/README.mozilla first. Maybe I should have known that gfx/angle and gfx/skia are completely independent.
But I didn't. My ignorance cost me.
Had the contents of the separate ANGLE repository been checked into mozilla-central, I would have seen the multiple generate_mozbuild.py files and I would likely have found the correct one immediately. But they weren't and I lost an hour of my time.
And I'm not done. Now I have to figure out how the separate ANGLE repo integrates with mozilla-central. I'll have to figure out how to submit the patch I still need to write. The GitHub description of this repo says Talk to vlad, jgilbert, or kamidphish for more info. So now I have to bother them before I can submit my patch. Maybe I'll just submit a pull request and see what happens.
I'm convinced I wouldn't have encountered this problem if a monolithic repository were used. I would have found the separate generate_mozbuild.py file immediately. And, the change process would likely have been known to me since all the code was in a repository I already knew how to submit patches from.
Separate repos are just lots of pain. You can bet I'll link to this post when people propose splitting up mozilla-central into multiple repositories.
I got asked this:
Going to organize a series of open, and free, events covering WebGL / Web API […]
We ended up opting for an educational workshop format. Knowing you have experience with WebGL, I’d like to ask you if you would support us in setting up the materials […]
In the interest of helping more people that might be wanting to start a WebGL group in their town, I’m posting the answer I gave them:
I think you’re putting too much faith in me.
I first learnt maths and then OpenGL and then WebGL. I can’t possibly give you a step by step tutorial that mimics my learning process.
If you have no prior experience with WebGL, I suggest you either look for a (somewhat) local speaker and try to get them to give an introductory talk. Probably people that attend the event will be interested in WebGL already or will get interested after the talk.
Then just get someone from the audience excited about WebGL and have them give the next talk.
Or you can start by learning to use a library such as three.js and, once you become acquainted with its fundamentals, start digging into “pure WebGL” if you want, for example by writing your own custom shaders.
Or another thing you could do is get together a bunch of people interested in WebGL and follow along with the WebGL tutorials or the three.js examples, so people can discuss aloud what they understand and what they don’t, and help and learn from each other.
I hope this helps you find your way around this incredibly vast subject! Good luck and have fun!
Now you know how to do this. Go and organise events! EASY!
It’s actually not easy.
Suppose Alice and Bob live in a country with 50 states. Alice is currently in state a and Bob is currently in state b. They can communicate with one another and Alice wants to test if she is currently in the same state as Bob. If they are in the same state, Alice should learn that fact and otherwise she should learn nothing else about Bob’s location. Bob should learn nothing about Alice’s location.
They agree on the following scheme:
- They fix a group G of prime order p and generator g of G
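The remaining steps of the scheme aren't quoted here, so purely as an illustration, here is one standard way such a private equality test can be completed using ElGamal encryption in G - an assumption on my part, not necessarily the scheme the post describes. Alice encrypts g^a under her own key; Bob homomorphically turns that into a fresh encryption of (g^(a-b))^r for a random r; Alice decrypts and learns a = b exactly when the result is the identity element.

```python
import random

# Toy parameters (illustration only - real deployments use large groups):
# q = 2p + 1 is prime, and we work in the order-p subgroup of Z_q*.
p, q = 1019, 2039
g = 4  # a square mod q, so it generates the order-p subgroup

def inv(x):
    return pow(x, q - 2, q)  # modular inverse (q is prime)

def keygen():
    x = random.randrange(1, p)
    return x, pow(g, x, q)  # (Alice's private key, public key)

def encrypt(y, m):
    k = random.randrange(1, p)
    return pow(g, k, q), (m * pow(y, k, q)) % q

def decrypt(x, c):
    c1, c2 = c
    return (c2 * inv(pow(c1, x, q))) % q

def alice_query(y, a):
    return encrypt(y, pow(g, a, q))  # Enc(g^a) under Alice's key

def bob_respond(y, c, b):
    c1, c2 = c
    r = random.randrange(1, p)
    # Homomorphically: Enc(g^a / g^b)^r = Enc(g^(r(a-b))), then re-randomize.
    d1 = pow(c1, r, q)
    d2 = pow((c2 * inv(pow(g, b, q))) % q, r, q)
    e1, e2 = encrypt(y, 1)
    return (d1 * e1) % q, (d2 * e2) % q

def same_state(x, y, a, b):
    # Plaintext is the identity element exactly when a == b (mod p).
    return decrypt(x, bob_respond(y, alice_query(y, a), b)) == 1
```

Bob only ever sees a ciphertext, so he learns nothing about a; when a ≠ b, Alice decrypts g^(r(a-b)) for an r she doesn't know, so she learns nothing about b beyond inequality.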
Cryptographic problems. Gotta love ‘em.
A couple years ago Krupa filled up a whiteboard with boxes and arrows, diagramming what the AMO systems looked like. There was recently interest in reviving that diagram and seeing what the Marketplace systems would look like in the same style so I sat down and drew the diagrams below, one for the Marketplace and one for Payments.
Honestly, I appreciate the view, but I wince at first glance because of all the duplication. It's supposed to be "services from the perspective of a single service." Meaning, if the box is red, anything connected to it is what that box talks to. Since the webheads talk to nearly everything it made sense to put them in the middle, and the dotted lines simply connect duplicate services. I'm unsure whether that's intuitive though, or if it would be easier to understand if I simply had a single node for each service and drew lines all over the diagram. I might try that next time, unless someone gives me a different idea. :)
Lastly, this is the diagram that came out first when I was trying to draw the two above. It breaks the Marketplace down into layers, which I like because we frequently emphasize being API-driven, but I'm not sure the significant vertical alignment is clear unless you're already familiar with the project. I think finding a way to use color here would be helpful - maybe as a background for each "column."
Or maybe I'm being too hard on the diagrams. What would you change? Are there other areas you'd like to see drawn out or maybe this same area but through a different lens?
A few months back I wrote about reinventing wheels. Going down that course has been interesting and I hope to continue reinventing parts of the personal cloud stack. By personal cloud I mean taking all of the services you have hosted elsewhere and pulling them in under your own control. This feeds into the IndieWeb movement as well.
A couple years ago, I deployed my first colocated server with my friends. I got a pretty monstrous setup compared to my needs, but I figured it’d pay for itself over time, and it has. One side effect of having all this space was that I could let my friends have slices of my server. It was nice sharing those extra resources. Unfortunately, hosting my friends’ slices meant that doing anything to the root system, or to how the system was organized, was tedious or even off limits.
In owning my own services, I want to restructure my server. I also want interesting routing between containers, and to keep all the containers down to the single-process ideal. To move to this world I’ve had to ask my friends to give up their spots on my server. Everyone was really great about it, thanking me for hosting them this long. I was worried people would complain and I’d have to be more forceful, but instead things were wonderful.
The next step I want to take after deploying my personal cloud will be to start replacing pieces, one by one, with my own custom code. The obvious first one will be the SMTP server, since I’ve already started implementing one in Rust. After that it may be something like my blog, or Redis, or a number of other parts of the cloud. The eventual goal is to have implemented a fair portion of all cloud services so that I can better understand them. I won’t be restricting myself to any one language. I will be pushing for a container per process, with linking between containers to share services.
Overall, I hope to learn a bunch and have some fun in the process. I recently picked up the domain http://ownstack.club and hope to have something up on it in the near future!
After languishing for a few years, Pulse got a burst of interest and development in 2014. Since I first heard of it, I’ve found the idea of a central message bus for the goings-on in Mozilla’s various systems rather intriguing, and I’m excited to have been able to grow it over the last year.
Pulse falls into that class of problem that is a result of, to borrow from a past Mozilla leader, our tendency to make our lives difficult, that is, to work in the open. Using RabbitMQ as a generic event stream is nothing special; Mozilla’s use of it as an open system is, I believe, completely unique.
Adapting a system intended for private networks into a public service always results in fascinating problems. Pulse has a decent permission-control system, but it’s not designed for self service. It is also very trusting of its users, who can easily overwhelm the system by just subscribing to streams and never consuming the messages.
The solution to both these problems was to design a management application: PulseGuardian. Via Persona, it handles account management, and it comes with a service that monitors Pulse’s queues. Since we presume users are not malicious, it sends a friendly warning when it notices a queue growing too large, but if ignored it will eventually kill the queue to save the system.
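The warn-then-kill policy can be sketched roughly like this (the names, thresholds, and callbacks are made up for illustration; the real PulseGuardian watches RabbitMQ queue depths via its management API):

```python
# Hypothetical queue-monitor pass in the spirit of PulseGuardian.
# `queues` maps queue name -> current message count; `warned` tracks
# which owners have already received the friendly warning.
WARN_AT = 2_000   # start nagging the owner
KILL_AT = 8_000   # hard cap: delete the queue to save the broker

def check_queues(queues, warned, warn, delete):
    """Warn owners of growing queues once; delete queues past the hard cap."""
    for name, depth in queues.items():
        if depth >= KILL_AT:
            delete(name)  # warning was ignored; protect the system
        elif depth >= WARN_AT and name not in warned:
            warn(name)    # friendly nudge: start consuming your messages
            warned.add(name)
```

The key design point the post describes is the grace period: users are presumed well-intentioned, so a queue is only killed after its owner has had a chance to react to the warning.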
If you build it, they will come, or so says some movie I’ve never seen, but in this case it appears to be true. TaskCluster has moved whole-hog over to Pulse for its messaging needs, and the devs wrote a really nice web app for inspecting live messages. MozReview is using it for code-review bots and autolanding commits. Autophone is exploring its use for providing Try support to non-BuildBot-based testing frameworks.
Another step for Pulse beyond the prototype phase is a proper library. The existing mozillapulse Python library works decently, aside from some annoying problems, but it suffers from a lack of extensibility, and, I’m beginning to believe, should be based directly on a lower-level amqp or RabbitMQ-specific Python package and not the strange, overly generic kombu messaging library, in part because of the apparent lack of confirm channels in kombu. We’re looking into taking ideas from TaskCluster’s Pulse usage in bug 1133602.
Here’s the first episode! I streamed it last Wednesday, and it was mostly concerned with bug 1090439, which is about making the print dialog and progress calls from the child process asynchronous.
A note that I did struggle with some resolution issues in this episode. I’m working with Richard Milewski from the Air Mozilla team to make this better for the next episode. Sorry about that!
I have blogged about clustering BHR hangs before. This post is dedicated to the add-on side of things.
In a way BHR could be seen as a distributed profiler: each user runs a local profiler that samples the stack of a thread only when that thread is hanging for over N milliseconds. Then the stacks are sent to our backend infrastructure through a Telemetry submission.
Imagine profiling Firefox for an hour. At the end of the hour you would like to determine the impact of one of your installed add-ons on the browsing experience. That’s not simple at all: the add-on might have been in use for only a fraction of your session, say 20 seconds, yet in that fraction it might have slowed down the browser significantly. Since you are collecting hangs for the whole session, that signal might eventually be dominated by noise. This means that in most cases, add-on hangs are not going to look like a big deal once aggregated.
I aggregated the stacks and partitioned them into browser-specific and add-on-specific ones. For a given add-on Foo, I computed, for each session, the ratio of hangs of Foo over the number of hangs of both Firefox and Foo. Finally, I averaged those ratios. This final number gives an idea of the average proportion of hangs due to Foo that a user of that add-on can expect to find in their session.
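In code, that aggregation might look roughly like this (the session field names are mine, purely for illustration; actual BHR submissions are richer):

```python
def addon_hang_ratio(sessions, addon):
    """Average over sessions of: addon hangs / (browser hangs + addon hangs).

    Each session is a dict with a "browser_hangs" count and an
    "addon_hangs" mapping of add-on id -> hang count (hypothetical schema).
    """
    ratios = []
    for s in sessions:
        addon_hangs = s["addon_hangs"].get(addon, 0)
        total = s["browser_hangs"] + addon_hangs
        if total:
            ratios.append(addon_hangs / total)
    return sum(ratios) / len(ratios) if ratios else 0.0
```

For example, two sessions with ratios 2/10 and 1/10 average out to 0.15, i.e. 15% of hangs attributable to the add-on.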
That’s not the whole story though: one can imagine scenarios where an add-on triggers an asynchronous workload in the browser, like garbage collection, which will not be attributed to the add-on itself. In an admittedly less common, but still plausible, scenario, an add-on could improve the performance of the browser, reducing the number of browser-specific hangs while increasing the ratio. In general I tend to see the final number as a lower bound, and even though it’s not precise, it can help identify bottlenecks.
Among the most popular add-ons, the ones that have a high ratio are:
- Lastpass (12%)
- Hola Better Internet (11%)
- Avira Antivirus (10%)
- noscript (9%)
- Adblock (9%), note this is not Adblock Plus
LastPass, for instance, has a hang ratio of about 12%, which means that if you had just LastPass installed and no other add-on, on average about 12% of the hangs you would experience would likely be due to LastPass. That’s a lot, and the main culprit seems to be the logic that scans for input fields when the document changes dynamically. That said, I am glad I can’t see any popular add-ons with a shockingly high ratio of hangs.
The numbers shouldn’t be used to mark an add-on as bad or good; they are based on a fallible heuristic and they are meant to give us the tools to prioritize which add-ons we should keep an eye on.
Today starts the first day of the mentoring program that I announced in my previous blog post.
In good news, I was overwhelmed by the number of responses I received to the blog post. Within three days, 57 people sent me an email requesting to be a part of the program. This tells me there is a strong need for more guided programs like this. On the downside, it was very hard to select only four people from the group.
In the end, I selected five people to partake. They are from all over the world: India (2), Germany (1), and the USA (2).
I have assigned the first bugs, and work should proceed this week on getting builds working and finding their way through the Firefox developer ecosystem.