It is a sad day today. … Or let’s start from somewhere else. I grew up in Communist Czechoslovakia. I remember the moment: I was probably around seven years old, sitting on the floor of our living room, thinking about where to hide a tape of anti-Communist protest songs so that it wouldn’t be found by the secret police if we were honored with a house search. Yes, seven years old. Yes, it was shortly after Charter 77 and there was a lot of hysteria in the air; but also, yes, a couple of years later my father (who was a university professor) was falsely accused of raping some female students (fortunately, the police were so sloppy then that they made a mistake and provided him with the best alibi possible: he was being interrogated by them at the very time the rape was supposed to have happened; or perhaps it was not a mistake at all), so a mere house search was not that improbable.
I remember reading, a couple of years later, a poem by a famous nineteenth-century Czech poet, Karel Havlíček Borovský, written about the time when he was illegally arrested and deported by the Austrian police because of his anti-government journalism (yes, we have a long history of bad regimes here). This passage is about the moment when he was dragged out of bed by the police early in the morning (the translation is mine and very, very rough):

Ale Džok, můj černý buldog, ten je grobián, na habeas corpus tuze zvyklý — on je Angličan. Málem by byl chlap přestoupil jeden paragraf, již na slavný ouřad zpod postele uďál: Vrr! haf! haf! Hodil jsem mu tam pod postel říšský zákoník, dobře že jsem měl ten moudrý nápad, již ani nekvík. —
However, Jock, my black bulldog, he is a lout, too much used to habeas corpus — being an English dog. The fellow would almost have broken one paragraph of the law, for at the honorable officials, from under the bed, he went: Grrr! Woof! Woof! I threw the imperial code of law under the bed to him; it was a good thing I had that wise idea, for he didn’t make another sound. —
I asked my Dad (who was a lawyer) what that habeas corpus means, and when he explained it to me, my conclusion from this poem was that there is something awesome about the rule of law, and in particular something great about English (and by association American) law. Apparently it is not possible for a policeman to drag you out of bed without a reason, a luxury I was certain we were not blessed with.
Yet later I learned another standard of the free society (even more relevant to what I would like to talk about anyway). I have been told that this standard is well captured in the famous saying attributed to Voltaire:

Monsieur l’abbé, I detest what you write, but I would give my life to make it possible for you to continue to write.
Then the so-called Velvet Revolution of 1989 happened, and I found that reality is a little more complicated, but I think these rules of freedom of expression and respect for other people’s opinions stayed with me forever. So, I was terribly surprised, and frankly confused, when later I was reading de Tocqueville’s excellent book about democracy in America and came across this statement:

I know of no country in which there is so little independence of mind and real freedom of discussion as in America.
Isn’t he talking about the country which gave us the First Amendment, which gave us the whole concept of freedom of expression? Isn’t he talking about the country founded by dissenters? I thought there must be something wrong with this statement, or that I had misunderstood something in what he was saying. Yet later I was blessed with an opportunity to live and study for a couple of years in Boston, and I learned that the protection against the government attacking somebody for his expression is very much real, but that there is also a very high level of pressure to conform to the prevalent opinion of the community. And although everybody talks all the time about the value of diversity, very little of it is actually allowed.
So, in the last two weeks, there were these two stories.
World Vision, one of the largest Christian charity organizations in the world, decided that their employees wouldn’t be fired for living in a same-sex marriage sanctioned by their state and their denomination. They argued that as a non-denominational organization they didn’t want to overrule the policies of their employees’ denominations, not to mention state laws. I don’t know whether I agree with this argument, but it is obvious that the situation of non-denominational organizations is difficult, and whichever decision they make will be attacked by somebody. Of course, I don’t know what happened internally, but a couple of days later, after an unbelievable firestorm of criticism from evangelical circles, World Vision reversed its decision.
Second story. Shortly after Brendan Eich was named CEO of the Mozilla Corporation, somebody picked up an old case of his financial support for Proposition 8 (if I understand correctly, the issue at stake in that proposition was declaring marriage to be a union of one man and one woman; if you don’t know who Brendan Eich is, look at his wikipage). Even a couple of LGBT employees of Mozilla Corp. defended Brendan Eich on their blogs, claiming that there is no discrimination against them at Mozilla; on the contrary, conditions for LGBT people are well above the legal requirements and among the best in the industry. Also, nobody was able to answer the question some senior Mozilla developers asked: what do Brendan’s opinions have to do with his position as CEO of a company developing computer programs? And the whole story again ended the same way: the most extreme participants in the Kulturkampf won, and Mozilla lost, in my opinion, one of the most brilliant leaders in the industry.
What would de Tocqueville and Voltaire say?
Predicting time series can be very interesting, and not only for quants. Any server that logs metrics like the number of submissions or requests over time generates a so-called time series, or signal. An interesting time series I had the chance to play with some time ago is the one generated by Telemetry’s submissions. This is how it looks for the Nightly channel for the past 60 days:
It’s immediately evident to a human eye when and where there was a drop in submissions in the past couple of months (bugs!). An interesting problem is to automatically identify a drop in submissions as soon as it happens, while keeping the number of “false alarms” to a minimum. It might seem rather trivial at first, but given that the distribution is quite spread out, mostly because of daily oscillations, an outlier detection method based on the standard deviation is doomed to fail. Using the median absolute deviation is more robust, but still not good enough to avoid false positives.
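To make the point concrete, here is a minimal sketch of a median-absolute-deviation outlier test in pure Python; the 3.5 threshold and the 0.6745 scaling constant are conventional choices for the modified z-score, not values taken from the actual Telemetry analysis:

```python
def mad_outliers(series, threshold=3.5):
    """Flag points whose modified z-score exceeds the threshold.

    The modified z-score uses the median and the median absolute
    deviation (MAD) instead of the mean and standard deviation,
    which makes it robust to a few extreme values.
    """
    s = sorted(series)
    n = len(s)
    median = s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2
    deviations = sorted(abs(x - median) for x in series)
    mad = (deviations[n // 2] if n % 2
           else (deviations[n // 2 - 1] + deviations[n // 2]) / 2)
    if mad == 0:
        return []  # degenerate case: more than half the points identical
    # 0.6745 makes the MAD consistent with the standard deviation
    # for normally distributed data.
    return [i for i, x in enumerate(series)
            if abs(0.6745 * (x - median) / mad) > threshold]
```

Even this robust test struggles with the daily oscillations in the raw submission counts, which is what motivates the rank-based approach below.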
The periodic patterns might not be immediately visible from the raw data plot, but once we connect the dots the daily and weekly patterns appear in all their beauty:
The method I came up with to catch drops does the following:
- It retrieves the distributions of the 10 days preceding the current data point
- It performs a series of Mann-Whitney tests to compare the last 24h to the distributions of the previous 9 days
- If the distributions are statistically different for at least 5 days with the current daily one having a lower mean, then we have a drop
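A rough sketch of that procedure in pure Python (the real analysis would use something like scipy’s Mann-Whitney implementation on the actual submission distributions; the normal approximation, function names, and thresholds below are all illustrative):

```python
import math

def mann_whitney_p(a, b):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Ranks are averaged over ties; the variance is not tie-corrected,
    which is acceptable for a sketch on mostly continuous data.
    """
    combined = sorted((v, 0 if i < len(a) else 1)
                      for i, v in enumerate(list(a) + list(b)))
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        avg = (i + j + 1) / 2  # average of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg
        i = j
    r_a = sum(r for r, (_, grp) in zip(ranks, combined) if grp == 0)
    n1, n2 = len(a), len(b)
    u = r_a - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

def detect_drop(days, alpha=0.05, required=5):
    """`days` is a list of 10 daily distributions, oldest first.

    Report a drop when the latest day differs significantly from at
    least `required` of the previous 9 days and has a lower mean.
    """
    today = days[-1]
    mean_today = sum(today) / len(today)
    hits = sum(1 for past in days[:-1]
               if mann_whitney_p(today, past) < alpha
               and mean_today < sum(past) / len(past))
    return hits >= required
```

The rank-based test is what makes this robust to the daily oscillations: it compares whole distributions rather than single points.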
The algorithm requires a certain amount of history to make good predictions, which is why it detected the first drop on the left only after several days. As expected, though, it was able to detect the second drop without any false positives. Sudden drops are easy to detect with a robust outlier detection method, but slow drops, as we experienced in the past, can go unnoticed if you just look for outliers.
Another interesting approach is to use time series analysis to decompose the series into its seasonal (periodic), trend, and noise components. A simple classical decomposition by moving average yields the following series:
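For the curious, a hand-rolled version of the classical additive decomposition might look like this; R’s decompose() or Python’s statsmodels provide production implementations, and the period handling below is a simplified sketch:

```python
def classical_decompose(series, period):
    """Additive decomposition: series = trend + seasonal + residual.

    Trend: centered moving average over one full period.
    Seasonal: average detrended value for each position in the cycle,
    normalized to sum to zero.
    """
    n = len(series)
    half = period // 2
    trend = [None] * n
    for i in range(half, n - half):
        if period % 2 == 0:
            # centered MA for an even period: half-weight the endpoints
            window = (series[i - half] / 2
                      + sum(series[i - half + 1:i + half])
                      + series[i + half] / 2)
            trend[i] = window / period
        else:
            trend[i] = sum(series[i - half:i + half + 1]) / period
    detrended = [series[i] - trend[i] for i in range(n)
                 if trend[i] is not None]
    offsets = [i % period for i in range(n) if trend[i] is not None]
    seasonal_avg = []
    for k in range(period):
        vals = [d for d, o in zip(detrended, offsets) if o == k]
        seasonal_avg.append(sum(vals) / len(vals))
    mean_s = sum(seasonal_avg) / period
    seasonal_avg = [s - mean_s for s in seasonal_avg]  # sum to zero
    seasonal = [seasonal_avg[i % period] for i in range(n)]
    residual = [series[i] - trend[i] - seasonal[i]
                if trend[i] is not None else None
                for i in range(n)]
    return trend, seasonal, residual
```

Note that this handles only a single seasonality, which is exactly the limitation discussed next.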
This simple algorithm was able to remove most of the periodic pattern; the trend is still affected by the weekly signal and the drops, though. It turns out that newer methods are able to decompose time series with multiple periodic patterns, or seasonalities. One algorithm I particularly like is the so-called TBATS method, which is an advanced exponential smoothing model:
That’s pretty impressive! The TBATS algorithm was able to identify and remove the daily and weekly frequencies from our signal; what remains is basically the trend and some random noise. Now that we have such a clean signal, we could try to apply statistical quality control to our time series, i.e. use a set of rules to identify drops. The rules look at the historical mean and standard deviation of a series of datapoints and help judge whether a new set of points is experiencing a mean shift (drop) or not.
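In their simplest form, such control rules might look like the following sketch; the 3-sigma limit and the run-of-consecutive-points rule are the classic Shewhart/Western Electric choices, not necessarily what one would tune for this data:

```python
def control_chart_alarms(baseline, new_points, run_length=9):
    """Flag mean shifts in `new_points` against a `baseline` window.

    Rule 1: any single point more than 3 sigma below the mean.
    Rule 2: `run_length` consecutive points below the mean.
    Returns a list of (index, rule) tuples.
    """
    n = len(baseline)
    mean = sum(baseline) / n
    var = sum((x - mean) ** 2 for x in baseline) / (n - 1)
    sigma = var ** 0.5
    alarms = []
    below = 0
    for i, x in enumerate(new_points):
        if x < mean - 3 * sigma:
            alarms.append((i, "3-sigma"))
        below = below + 1 if x < mean else 0
        if below == run_length:
            alarms.append((i, "run"))
    return alarms
```

The run rule is what catches the slow drifts that a pure outlier test misses.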
Given a decomposition of a time series, we can also use it to predict future datapoints. This can be useful for a variety of reasons beyond detecting drops. To get an idea of how well we can predict future submissions, let’s take a clean subset of our data, from day 20 to day 40, and try to predict Telemetry’s submissions for the next 5 days while comparing the prediction to the actual data:
That’s pretty neat: we can immediately see that we have an outlier, and the prediction is very close to the actual data.
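TBATS itself is out of scope here, but even a trivial seasonal-naive forecast, which repeats the last observed cycle adjusted by the recent drift, conveys the idea of predicting future points from a decomposition (this is an illustrative stand-in, not the model used above):

```python
def seasonal_naive_forecast(series, period, horizon):
    """Forecast by repeating the last full season, shifted by the
    average per-step drift between the last two seasons."""
    last = series[-period:]
    prev = series[-2 * period:-period]
    drift = (sum(last) - sum(prev)) / period  # mean season-over-season change
    return [last[h % period] + drift * (1 + h // period)
            for h in range(horizon)]
```

Comparing such a naive baseline against a proper model is also a quick way to sanity-check that the model is earning its complexity.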
I wonder if there are other methods used to detect alterations to time series, so feel free to drop me a line if you happen to have a suggestion or a pointer.
Modern microarchitectures are incredibly complex. A modern x86 processor will be superscalar, internally decoding instructions into micro-ops to achieve that. Desktop processors will undoubtedly issue multiple instructions per cycle and employ register renaming, branch predictors, and so on. Minor changes (a misaligned instruction stream, a poor order of instructions, a bad instruction choice) could kill the ability to take advantage of these features. There are very few people who can accurately predict the performance of a given assembly stream (I myself wouldn't attempt it if the architecture can take advantage of ILP), and these people are disproportionately likely to be working on compiler optimizations. So unless you're knowledgeable enough about assembly to help work on a compiler, you probably shouldn't be hand-coding assembly to make code faster.
To give an example to elucidate this point (and the motivation for this blog post in the first place), I was given a link to an implementation of the N-queens problem in assembly. For various reasons, I decided to use this to start building a fine-grained performance measurement system. This system uses a high-resolution monotonic clock on Linux and runs the function 1000 times to warm up caches and counters, then runs the function 1000 more times, measuring each run independently and reporting the average runtime at the end. This is a single execution of the system; 20 executions of the system were used as the baseline for a t-test to determine statistical significance, as well as for visual estimation of the normality of the data. Since the runs showed roughly a constant 1-2 μs of noise, I ran all of my numbers on the 10-queens problem to better separate the data (total runtimes ended up being in the range of 200-300 μs at this level). When I say that some versions are faster, the p-values for individual measurements are on the order of 10^-20—meaning that there is a 1-in-100,000,000,000,000,000,000 chance that the observed speedups could be produced if the programs took the same amount of time to run.
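For illustration, the shape of such a harness can be sketched in Python, with time.perf_counter_ns standing in for the clock_gettime(CLOCK_MONOTONIC) calls the real C system would use (the warm-up and measurement counts mirror the description above; the t-test across executions is not reproduced here):

```python
import time

def measure(fn, warmup=1000, runs=1000):
    """Run `fn` `warmup` times to warm caches and branch predictors,
    then time `runs` executions individually with a high-resolution
    monotonic clock and return the average runtime in nanoseconds."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        fn()
        samples.append(time.perf_counter_ns() - start)
    return sum(samples) / len(samples)
```

Keeping each sample rather than only the total is what makes per-run significance testing possible afterwards.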
The initial assembly version of the program took about 288μs to run. The first C++ version I coded, originating from the same genesis algorithm that the author of the assembly version used, ran in 275μs. A recursive program beat out a hand-written assembly block of code... and when I manually converted the recursive program into a single loop, the runtime improved to 249μs. It wasn't until I got rid of all of the assembly in the original code that I could get the program to beat the derecursified code (at 244μs)—so it's not the vectorization that's causing the code to be slow. Intrigued, I started to analyze why the original assembly was so slow.
It turns out that there are three main things that I think cause the slow speed of the original code. The first one is the alignment of branch targets: the assembly code contains no directives to align basic blocks, whereas gcc happily emits these for some basic blocks. I mention this first as it is mere conjecture; I never made an attempt to measure the effect myself. The other two causes were directly measured by observing runtime changes as I slowly replaced the assembly with C++ code. When I replaced the use of push and pop instructions with a global static array, the runtime improved dramatically. This suggests that the alignment of the stack could be to blame (although the stack was still 8-byte aligned when I checked via gdb), which just goes to show how much alignment really does matter in code.
The final, and by far most dramatic, effect I saw involves the use of three assembly instructions: bsf (find the index of the lowest bit that is set), btc (clear a specific bit index), and shl (left shift). When I replaced the use of these instructions with a more complicated expression int bit = x & -x and x = x - bit, the program's speed improved dramatically. And the rationale for why the speed improved won't be found in latency tables, although those will tell you that bsf is not a 1-cycle operation. Rather, it's in minutiae that's not immediately obvious.
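These are the standard isolate-lowest-set-bit idioms; sketched in Python (the original code naturally operates on machine words in registers):

```python
def iterate_set_bits(x):
    """Yield each set bit of x as a power of two, lowest first.

    bit = x & -x isolates the lowest set bit (two's complement),
    and x -= bit clears it: the same idiom that replaced the
    bsf/btc instruction pair in the C++ version.
    """
    while x:
        bit = x & -x   # lowest set bit, e.g. 0b1011000 -> 0b0001000
        yield bit
        x -= bit       # clear that bit

# In an N-queens inner loop, this pattern enumerates the free columns
# of a bitmask of the form ~(cols | diag1 | diag2) & mask.
```

Working with the isolated bit directly also avoids ever converting to a bit index and back, which is part of why the shift went away too.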
The original program used the fact that bsf sets the zero flag if the input register is 0 as the condition for backtracking; the converted code just checked whether the value was 0 (using a simple test instruction). The compare and jump instructions are basically fused into a single operation in the processor. The bsf, in contrast, does not get this treatment; combined with the intrinsically higher latency of the instruction, it means that empty loops take a lot longer to do nothing. The use of an 8-bit shift value is also interesting, as there is a rather severe penalty for using 8-bit registers in Intel processors, as far as I can see.
Now, this isn't to say that the compiler will always produce the best code by itself. My final code wasn't above using x86 intrinsics for the vector instructions. Replacing the _mm_andnot_si128 intrinsic with an actual and-not on vectors caused gcc to use other, slower instructions instead of the vmovq to move the result out of the SSE registers for reasons I don't particularly want to track down. The use of the _mm_blend_epi16 and _mm_srli_si128 intrinsics can probably be replaced with __builtin_shuffle instead for more portability, but I was under the misapprehension that this was a clang-only intrinsic when I first played with the code so I never bothered to try that, and this code has passed out of my memory long enough that I don't want to try to mess with it now.
In short, compilers know things about optimizing for modern architectures that many general programmers don't. Compilers may have issues with autovectorization, but the existence of vector intrinsics allows you to force compilers to use vectorization while still giving them leeway to make decisions about instruction scheduling or code alignment, which are easy to screw up in hand-written assembly. Also, compilers are liable to get better in the future, whereas hand-written assembly code is unlikely to get faster over time. So only write assembly code if you really know what you're doing and you know you're better than the compiler.
I work at Mozilla, the non-profit organisation working to keep the open web, well, open and alive. I work here for a few reasons:
- We have a manifesto. Not a “company guideline” or “our vision” or “about us”. We mean business, but not in the sense of “what brings us the most money”.
- We have people, not employment figures. Amazing people, creative people, misfits and average people. From all over the globe, with all kinds of ideas and beliefs and backgrounds. And they work together. They clash, they disagree, they flood my inbox with CCs as if I should know the answer to how they can work on this. They all have different setups and ways to work.
- We empower people. We work with a lot of people who we don’t pay. We help them learn, we help them become speakers for a good cause, we help them communicate and we let them be our communicators in regions and languages and manners we have no idea about. We trust them. And it shows. Going to a Mozilla summit is like going to a concert or festival. You have a lot of fun, you have a lot of noise and boy do you get a lot of demands. People are hungry to do good, and are ravenous to learn all about it.
- We are a stepping stone. Quite a few people who I trained in public speaking and tech evangelism got better jobs immediately afterwards. I write more recommendation letters than ever before. And I see people getting a chance to move to another country and get a job they had previously only dreamed about.
- We are more than a company in Silicon Valley. We are world-wide; everybody has the right to work from home, and most people do. We trust you to use your time wisely and only ask you to show up for video meetings where we need to sync. This means we release much more than I have ever seen at any other company. Your output speaks for you, not how punctually you arrive at the office, how you look or where you are from.
- We value passion and personality – I can be a total pain in the lower backside. Other people drive me crazy. We don’t have to have the same ideas, instead we find a common ground and analyse what is good for the project as a whole. Then we do that together. There is no problem disagreeing with a director, a manager, or even a CEO. If you have a good point, it will be taken in and – after peer review – taken on. You can get away with a lot more than you could in other companies. And this isn’t about “yeah, let them rant – it makes them happy” – if you are professional and have a good point, you find an ear to listen to you.
- We disagree and discuss. The old saying “Arguing with an engineer is like wrestling with a pig in mud – you realise far too late it is enjoying it” is very much alive here. All discussions and disagreements are public. Personal disagreements are normally taken on head-on and in direct messages. Nobody is asked to agree with anything without having had their say. This delays things and makes things much more complex, but it also makes us who we are. A free, open product cannot be created behind closed walls. Open Source does not mean “code is on GitHub”. It is pure transparency and a messy process. But it makes for strong products that cannot be killed if one person leaves or gets bored. Open Source means the big red bus has no power. What is shared cannot be taken away, neither by organisational changes, nor by outside powers, nor by silly things like hardware failure.
- We work with the competition. I have no problem speaking to Google, Microsoft, Opera, Twitter, Facebook and whoever else I please. I speak at their events, and I share upcoming work on our part with them. I applaud and publicly promote the great things they do. We work on standards and best practices. These cannot be done in one place; they have to have peer review.
- We allow you to speak freely. There is no censorship, no “you have to use this channel to communicate”. The latter drives me crazy, as I have many a time had to react to things people say about our products on their personal blogs, or found amazing examples and code randomly on the web. People prefer to write on their own channels about products they built on company time rather than using an official channel. In other companies, that is an HR issue. Hell, I’ve had contracts that said that whatever code was written on company hardware belongs to the company. Not here. You can talk, but you should also be aware of the effects your communication has. Many times this means we have to help you out when you’ve miscommunicated. That is tough, but it also means we learn.
All of this is the messy craziness that is Mozilla. And that’s why I am here. It is not a 9-5 job, it is not an easy job. But damn is it rewarding and interesting.
When I started, I took a pay cut. I continuously get better offers from the outside. I had a six-hour interview with six people. Those were the best brainstorming sessions I had had in years. When I met volunteers on my way out and saw them giving their time to Mozilla with a smile that was contagious, I knew I was onto something good.
When I interviewed, nobody asked me about my personal or religious beliefs. This would be illegal – at least where I am from. I don’t have to agree with everyone I work with on a personal level. All I have to do is to allow you your freedom to be who you are and flag up when your personal views inconvenience or hurt others and are just not appropriate in a work situation.
So when you tell me that because I work for Mozilla I must share the ideas of people “above me” in the hierarchy, you don’t know me and you have no idea how Mozilla works. We are different, and we work differently. You are reducing something that thrives on communication, on helping one another, on thousands of personal voices, to something you understand: a hierarchical company with one person who is the embodiment of everything the company does. A figure like that exists – in a one-man startup or a movie superhero. It doesn’t work for a loosely connected and open construct like Mozilla.
I’ve had moments where I was ready to give up. I had some very painful months lately where all my work of the last years was questioned and I felt I had run out of things to excite me. Then I concentrated on the people who give their free time to us and talked to them. And I found the spark again.
I am here for all the people who spend time to keep the web open, to teach web literacy, to give people a voice where it would be hard for them to get heard. They may be my colleagues, they may be volunteers, they may be people in other companies with similar goals. This is bigger than me and bigger than you. I hope it stays, I hope it thrives and I hope people understand that what Mozilla did and does is incredibly important. Information wants to be out and free. The internet allows for this. We made it our passion to protect this internet and give you software that is yours to use – for free, without backdoors or asking you for your information upfront. If that’s not your thing, fine. But don’t give it up because you disagree with one person’s personal actions and beliefs. I don’t.
ElasticUtils is a Python library for building and executing Elasticsearch searches.
See the Quickstart for more details.

v0.9 released!
This is a big release, but there are some compromises in it that I'm not wildly excited about. Things like Elasticsearch 1.0 support didn't make the cut. I'm really sorry about that---we're working on it.
This release has a lot of changes in it. Roughly:
- dropped pyelasticsearch for elasticsearch-py (Thank you Honza!)
- fixed S.all() so it does what Django does, which should let you use an S in place of a QuerySet in some cases
- new FacetResult class (Thank you James!)
- S.facet() can take a size keyword
- cleaned up ESTestCase
- SearchResults now has facet data in the facets property
For the complete list of what's new, see What's new in Version 0.9.
Many thanks to everyone who helped out: Alexey Kotlyarov, David Lundgren, Honza Král, James Reynolds, Jannis Leidel, Juan Ignacio Catalano, Kevin Stone, Mathieu Pillard, Mihnea Dobrescu-Balaur, nearlyfreeapps, Ricky Cook, Rob Hudson, William Tisäter and Will Kahn-Greene.
We're going to be sprinting on ElasticUtils 0.10 at PyCon US in Montreal mid April. If you're interested, come find me!
If you have any questions, let us know! We hang out on #elasticutils on irc.mozilla.org.
For Firefox OS, the Gaia UI currently uses Travis CI to run a series of test jobs in parallel for each pull request. While Travis has a neat ember.js-based live-updating web UI, I usually find myself either staring at my build watching it go nowhere or forgetting about it entirely. The latter is usually what ends up happening, since we have a finite number of builders available, we have tons of developers, each build takes 5 jobs, and some of those jobs can take up to 35 minutes when they finally get a turn to run.
I recently noticed ThinkGeek had a bunch of Dream Cheeky USB LED notifiers on sale. They’re each a USB-controlled tri-color LED in a plastic case that acts as a nice diffuser. Linux’s “usbled” driver exposes separate red/green/blue files via sysfs that you can echo numbers into to control them. While the driver and USB protocol inherently support a range of 0-255, it seems like 0-63 or 0-64 is all the hardware honors. The color gamut isn’t amazing, but it is quite respectable, and they are bright enough to be useful in daylight. I made a node.js library at https://github.com/asutherland/gaudy-leds that can do some basic tricks and is available on npm as “gaudy-leds”. You can tell it to do things by running “gaudy-leds set red green blue purple”, etc. I added a bunch of commander sub-commands, so “gaudy-leds --help” should give a lot more details than the currently spartan readme.
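The sysfs interface really does boil down to writing small integers into three files. A minimal sketch in Python (gaudy-leds itself is node.js; the device directory varies per machine, so it is passed in as a parameter here):

```python
import os

def set_led_color(device_dir, red, green, blue):
    """Write intensity values into the usbled driver's sysfs files.

    `device_dir` is the device's sysfs directory, something like
    /sys/bus/usb/drivers/usbled/3-1:1.0 (it varies per machine).
    Values are clamped to the 0-63 range the hardware honors.
    """
    for name, value in (("red", red), ("green", green), ("blue", blue)):
        clamped = max(0, min(63, int(value)))
        with open(os.path.join(device_dir, name), "w") as f:
            f.write(str(clamped))
```

The shell equivalent is just echoing a number into each file, which is what makes these devices so pleasant to script against.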
I couldn’t find any existing tools/libraries to easily watch a Travis CI build and invoke commands like that (though I feel like they must exist) so I wrote https://github.com/asutherland/travis-build-watcher. While the eventual goal is to not have to manually activate it at all, right now I can point it at a Travis build or a github pull request and it will poll appropriately so it ends up at the latest build and updates the state of the LEDs each time it polls.
Relevant notes / context:
- There is a storied history of people hooking build/tree status up to LED lights and real traffic lights and stuff like that. I think if you use Jenkins you’re particularly in luck. This isn’t anything particularly new or novel, but the webmail notifiers are a great off-the-shelf solution. The last time I did something like this I used a phidgets LED64 in a rice paper lamp and the soldering was much more annoying than dealing with a mess of USB cables. Also, it could really only display one status at a time.
- There are obviously USB port scalability issues, but you can get a 24-port USB hub for ~$40 from Amazon/monoprice/etc. (They all seem to be made by the same manufacturer.) I coincidentally bought 24 of the notifiers after my initial success with 6, so I am really prepared for an explosion in test jobs!
- While I’m currently trying to keep things UNIXy with a bunch of small command-line tools operating together, I think I would like to have some kind of simple message-bus mechanism so that:
- mozilla-central mach builds can report status as they go
- webhooks / other async mechanisms can be used to improve efficiency and require less manual triggering/interaction on my end, so if I re-spin a build from the web UI I won’t need to re-trigger the script locally. Please let me know if you’re aware of existing solutions in this space; I didn’t find much and am planning to just use redis as glue for a bunch of small/independent pieces, plus a more daemonish node process for polling and interacting with the web/AMQP.
- There are efforts underway to overhaul the continuous integration mechanism used for Gaia. This should address delays in starting tests by being able to throw more resources at them as well as allow notification by whatever Mozilla Pulse’s successor is.
Then, Jeremy Keith, our unofficial rabble-rouser, excoriates the cognoscenti about a certain “lack of imagination.” Chris Wilson, finally at liberty to blog and tweet about his responsibilities as web platform guy for Google, responds conversationally.
The browser wars panel always delivers. Thank you, Brendan (“Dart? Good luck with that!”), Charles (who conducted a much-needed straw poll: “Who knows what vendor prefixing is?”, to which many hands went up, underscoring the fact that SxSW is really our favorite audience), Chris (“Do you ship VBScript?”), and John (“Chromeless — my favorite word.”).
The panel always coincides with my birthday. I won’t get mawkish, but I will say that there’s something interesting about growing up with web browsers professionally. When I was with Netscape, I talked a relentless amount of smack about IE and railed against closed-source stacks. That kind of talk is antiquated now, really. Flash fallback (for video) notwithstanding, there are open sourced stacks that confuse the web platform landscape. We talked about some of those during the panel, chiefly Dart (though SPDY and VP8 got some mention, along with Native Client). At some point, I found myself moderating a panel where browser vendors agree about the importance of DRM, and its inevitability on the web platform, at least as far as video goes. Times have changed. Have we all grown up? There used to be visceral auto-immune responses in some circles to any kind of mention of DRM whatsoever.
This time, SxSW was bigger than ever. Long lines. LOTS of long lines. And after-after-after parties for people that scorn sleep. Of course, I allowed myself some minor peccadilloes this year at SxSW. Like how I found myself on Snoop Dogg’s tour bus at 4a.m. one night, somewhere on the way to San Antonio. But that’s another kind of story. You’ll have to ask me about it in person.
Update: You can follow the H.264 conversation on the hacks blog as well, if only to be exposed to a different comment stream.
It is with immense pleasure that I will be presenting for the third time to the HTML5mtl user group this coming April 22nd. Let’s say I have a little soft spot for this group devoted to HTML in the greater Montreal region: I am one of its founders, along with Mathieu Chartier and Benoît Piette. During this evening I will talk about Firefox OS, my favorite subject, but also my main focus as an evangelist at Mozilla. To me, Firefox OS is HTML5 on steroids: it is an advancement of HTML, including the WebAPIs, that finally lets us compete with native applications. Here is the summary of my presentation:
HTML5 is a giant step in the right direction: it brings several features that developers needed to more easily create better web experiences. It also gave birth to an endless debate: native applications versus web applications! In this presentation, Frédéric Harper will show you how the open web can help you create quality mobile applications. You will learn more about technologies such as the WebAPIs, as well as the tools that will let you target a new market with Firefox OS and today’s web.
It is all the more a pleasure for me to join the group this month because I am particularly fond of my city, where I no longer present often enough. Not that I am complaining about the countries where I share my passion with other developers, but it is nice to be able to do it in French and to have the opportunity to network with people from home. So it’s a date: this coming April 22nd at 18:30 (doors open at 18:00; if you arrive late, I will make you sing!) at the Microsoft Montréal office, which I thank, located at 2000 McGill College, suite 450. Reserve your spot quickly! Looking forward to seeing you at HTML5mtl.
I’ve known about it for some time, and Eric has been writing and sharing what has been happening along the way. I’ve been wanting to say things, have felt so sad for him and his entire family, have wanted to find a way to just make it go away.
This is about Eric Meyer, and about the terrible situation with his daughter Rebecca, who has cancer – tumors in her head and it seems to be terminal.
Now they have had to tell Rebecca about the current state, outlined in the excruciating post The Truth.
As a father myself, as a human, as a living being, this is so extremely sad, so wrong. No one should ever have to go through something like this; neither parent nor child.

Trembling
My hands are trembling as I type this, I’m crying. And me shaking reminds me of the first time I met Eric. It was at SXSW in 2006, the first major international conference I attended. Eric was a role model for me, already famous in the Internet industry for his knowledge about CSS. And to me, he always seemed to have the time to help people out, to discuss, to support. A kind and good person.
And when I met him – just after a presentation he had given there – I walked up to him. Not to have a question for him about the talk, but just to let him know how much his knowledge and him sharing it had meant to me. And when I did, my voice wavered, my hands actually trembled a bit.
And it was silly, right? Although he was a great person, it was just about web development. But looking back at it now, I don’t think it was the starstruck thing of meeting an idol; it was more about me being nervous about my ability – or lack thereof – to express how much his work had meant to me, and me just really wanting him to know that what he was doing mattered so much.
And if it can ever be any sort of consolation, Eric, I want you to carry with you how much you and your actions have meant, both to me and many others.

Life
I am so very sad for you and your entire family. Heartbroken that Rebecca has to go through this. It feels so unfair, so wrong. And I feel we are so very helpless, and I can’t even begin to fathom what it is like for you.
I just want you to know that we are out here for you and your family. For whatever good we can do.
My thoughts are with you.
This post is partially meant as an extension to Deb's post on the same issue. Most of the content in this post comes from discussions with Deb, Sankha, Saurabh, and others: thanks, everyone!
So far I've participated in two MozSetup-style events: one at IIT Bombay (where I was a participant), and one at IIT Kharagpur (where I was a volunteer/mentor). One major issue at these events is setup. Getting participants to come with a working build system is rather nontrivial, and can be a turn-off in some cases. Plus, some participants are on Windows (and, on the other end of the spectrum, some are on Arch), which makes this harder to sort out. Internet access is not something to rely on either.
Besides that, build times are long, especially on systems like mine:
Well, at least the build didn't take 400 minutes :p pic.twitter.com/W6o4wEDWed
— Manish Goregaokar (@ManishEarth) August 24, 2013

At the IITB event, I had spent quite a bit of time getting a build ready. Fortunately I was able to create a couple of patches without testing, but that's certainly not the ideal way to go. Getting started takes up a huge chunk of time, and it's a bit overwhelming to have the participants learn and understand the entire process. It's far better to get them involved in writing code and let them figure out the details of setting it up at their leisure.
At the Kharagpur event, I had planned on having some lab machines with a full Nightly build on them so that the students could test and make their patches on this system. This might have worked out, but we didn't have time (or lab access) the day before to initialize this. In the end, we had one machine with a full build on it, and another machine that was built later during the event. I had planned to rsync the built objdirs across systems, but somehow that didn't work even though I'd kept everything in a username-agnostic location (/opt). This is something I'll look into later.
But it turns out there's an easier way to do things than running full builds on the spot. @Debloper had the interesting idea of using OpenStack for this; after some discussion, the plan was basically to have an OpenStack instance where we create a VM with a full build environment, and allow participants to fork this VM and do all their coding/testing there (via ssh -X). This requires some investment in maintaining an OpenStack instance, but overall it's a viable way to go. We can also allow participants to keep access to the instance for some time period, to make the transition to development on their own systems much easier.
As an alternative to this, I had the idea of using flash drives instead of VMs. One way to do this is to install a persistent Ubuntu system1 on a 16 GB flash drive, install the prerequisites, and build. This pen drive can then be booted into and used regardless of the user's system. It's persistent, too, so it can be used in the long term as well. It has the drawback of being a bit slower, though. Also, this drive can be quickly cloned via dd in preparation for an event. If a user wishes to install it baremetal, they can do so manually with dd and update-grub.
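The cloning step with dd is mechanically simple; here is a minimal sketch (the /dev/sdX and /dev/sdY names are placeholders for the real source and target drives — double-check them before writing, since dd will happily overwrite the wrong disk):

```shell
# Capture the prepared master flash drive into an image file:
#   dd if=/dev/sdX of=mozsetup.img bs=4M
# Then write that image to each new drive the night before the event:
#   dd if=mozsetup.img of=/dev/sdY bs=4M
# The same mechanics, demonstrated on ordinary files:
dd if=/dev/urandom of=master.img bs=1K count=64 2>/dev/null
dd if=master.img of=clone.img bs=4K 2>/dev/null
cmp master.img clone.img && echo "images identical"
```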
The other option is to make an Ubuntu live flash drive, but to customize it via squashfs and chroot and add the required packages along with a full build. Here, there won't be persistent storage, so anyone trying it out by booting into the flash drive will lose their work on reboot. However, this is easier to install baremetal since the standard installation process will work, and a baremetal install is faster, too. Again, the ISOs can be cloned.
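The squashfs/chroot customization described here follows the standard live-ISO remastering recipe; a rough outline is below, to be run as root (the ISO name, package list, and where to drop a pre-built mozilla-central tree are assumptions, not the exact steps used at an event):

```shell
# Unpack the live ISO and its compressed root filesystem:
mount -o loop ubuntu.iso /mnt
cp -a /mnt/. iso-build/
unsquashfs -d squashfs-root iso-build/casper/filesystem.squashfs

# Install build prerequisites inside the image (the chroot may need
# /etc/resolv.conf copied in for network access), and copy in a
# pre-built source tree, e.g. under squashfs-root/opt:
chroot squashfs-root apt-get update
chroot squashfs-root apt-get install -y mercurial build-essential

# Repack the filesystem and regenerate a bootable ISO:
rm iso-build/casper/filesystem.squashfs
mksquashfs squashfs-root iso-build/casper/filesystem.squashfs
genisoimage -o mozsetup.iso -b isolinux/isolinux.bin -c isolinux/boot.cat \
    -no-emul-boot -boot-load-size 4 -boot-info-table -J -r iso-build
```

Because the changes live in the squashfs, the resulting ISO installs baremetal through the stock Ubuntu installer, which is what makes this variant attractive despite the lack of persistence.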
If we want this to be scalable, we can eventually ask Mozilla to build these ISOs once every X days (or once every clobber) and put them up for download, much like Nightly builds. As far as I can tell, this won't create much extra strain on their resources. Then event organizers all over the world just have to burn the ISOs to some flash drives the night before, which is something very feasible.
The cherry on top of this method (Deb's awesome idea) is that these flash drives can double as swag. A mozilla-branded drive is something pretty cool, especially if it contains everything you need for contributing to Firefox from wherever you are. The details of this depend on budget and all, but ... it's an option :)
There will still be architecture issues and speed issues, but these can be solved with some work. Using an older Ubuntu version like Backtrack does is one way to make things faster, and we can always have a couple of AMD flash drives ready.
I hope we get to try this method out at a similar event (maybe the upcoming Kolkata one). There are a lot of avenues to explore here, and a lot of room for improvement, but overall it seems like a promising way to fix the setup issues at such events.
1. Or Fedora, but I haven't yet worked out the details for Fedora. I'll be trying this out when I have time.
Merge day, which occurs every six weeks, is when changes are uplifted to the next stable branch. Here's a picture from a talk John O'Duinn gave last year that shows an example of how changes move between branches1.
Picture from John O'Duinn's Release Engineering as a Force Multiplier talk at Releng 2013
For the release engineering team, each merge day we would update code in our buildbot configs to reflect changes that needed to be made after uplift. For instance, we often deprecate platforms or add new tests that only apply to certain branches. We used to have to specify the branches these applied to in our configs and update them every merge day, and it was tricky to get this right. Last fall, Steve Fink fixed a bug that allows us to base config decisions on the version of gecko that rides the trains. So each merge day we update the version of gecko in our configs on a per-branch basis, and then have code like this so that tests are only enabled for branches where the gecko version applies:
- Enable jittests for desktop where gecko is 31 or more
- Load jetpack where gecko is at least 21
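The gating pattern described above might look roughly like this (the dict, the version numbers for aurora/beta, and the helper are illustrative assumptions, not the actual buildbot configs):

```python
# Per-branch gecko versions, bumped once per branch on merge day.
BRANCH_GECKO_VERSION = {
    'mozilla-central': 31,
    'mozilla-aurora': 30,
    'mozilla-beta': 29,
}

def enabled(branch, minimum_gecko):
    """Enable a suite only on branches whose gecko version is new enough.

    As the gecko version rides the trains, the suite turns on for each
    branch automatically, with no per-merge-day config edits.
    """
    return BRANCH_GECKO_VERSION[branch] >= minimum_gecko

# Enable jittests where gecko is 31 or more:
jittest_branches = [b for b in BRANCH_GECKO_VERSION if enabled(b, 31)]
# Load jetpack where gecko is at least 21:
jetpack_branches = [b for b in BRANCH_GECKO_VERSION if enabled(b, 21)]
```

With numbers like these, jittests would run only on mozilla-central today, then appear on aurora and beta on subsequent merge days without further changes.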
To test these changes, you can set up your buildbot test master and run builder_list.py against it. The builder_list.py script outputs the list of build/test jobs (builders) that are enabled on your master. Then apply your patch to the master and diff the resulting builder files to ensure that the tests are enabled on the branches you want. As a side note, if you are enabling tests on platforms that live on different test masters, you'll have to configure your master as a mac, linux, and windows test master in turn and diff the builders for each platform. If you are enabling tests on trunk trees for the first time, your diff should not reveal any new builders on mozilla-aurora, mozilla-beta, or mozilla-release, just on mozilla-central, mozilla-inbound, and the associated twig branches.
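The before/after diff step looks roughly like this (the builder names below are made up for illustration; in practice both lists would come from running builder_list.py against your test master):

```shell
# Stand-ins for builder lists captured before and after applying a patch:
printf 'Linux mozilla-central build\nLinux mozilla-inbound build\n' > builders-before.txt
printf 'Linux mozilla-central build\nLinux mozilla-inbound build\nLinux mozilla-central jittest\n' > builders-after.txt
# New builders should show up only on the branches you intended:
diff builders-before.txt builders-after.txt | grep '^>'
```

Any `>` line naming mozilla-aurora, mozilla-beta, or mozilla-release in the real diff would be a red flag for a trunk-only change.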
I recently fixed a few bugs where there was a request to enable tests on trunk branches and have them ride the trains, so I thought I'd write about it in case others have to implement a similar request.
Train-related graffiti in Antwerp (Belgium), near Antwerpen-Centraal train station by ©vitalyzator, https://flic.kr/p/6tQ3H Creative Commons by-nc-sa 2.0
Further reading and notes
1 This applies to Firefox Desktop and mobile only, Firefox OS is on a different cadence and there are different branches involved
Release Management's rapid release calendar
Release Engineering Merge duty
Release Engineering Testing Techniques
(This begins a two-part series on upcoming changes in Firefox Sync, based on my presentation at RealWorldCrypto 2014. Part 1 is about problems we observed in the old system. Part 2 will be about the system which replaces it.)
In March of 2011, Sync made its debut in Firefox 4.0 (after spending a couple of years as the Weave add-on). Sync is the feature that lets you keep bookmarks, preferences, saved passwords, and other browser data synchronized between all your browsers and devices (home desktop, mobile phone, work computer, etc).
Our goal for Sync was to make it secure and easy to share your browser state among two or more devices. We wanted your data to be encrypted, so that only your own devices could read it. We weren’t satisfied with just encrypting during transmission to our servers (aka “data-in-flight”), or just encrypting it while it was sitting on the server’s hard drives (aka “data-at-rest”). We wanted proper end-to-end encryption, so that even if somebody broke into the servers, or broke SSL, your data would remain secure.
Proper end-to-end encryption typically requires manual key management: you would be responsible for copying a large randomly-generated encryption key (like cs4am-qaudy-u5rps-x/qca-hu63l-8gjkl-28tky-6whlt-fn0) from your first device to the others. You could make this easier by using a password instead, but that ease-of-use comes at a cost: short, easy-to-remember passwords aren’t very secure. If an attacker could guess your password, they could get your data.
We didn’t like that tradeoff, so we designed an end-to-end encryption system that didn’t use passwords. It worked by “pairing”, which means that every time you add a new device, you have to introduce it to one of your existing devices. For example, you could pair your home computer with your phone, and now both devices could see your Sync data. Then later, you’d pair your phone with your work computer, and now all three devices could synchronize together.
The introduction process worked by copying a short single-use “pairing code” from one device to the other. This code was fed into some crypto magic (the J-PAKE protocol), allowing the two devices to establish a temporary encrypted connection. Then everything necessary to access your account (including the random long-term data-encryption key) was copied through that secure connection to the new device.
The cool thing about pairing is that your data is safely protected by a strong encryption key, against everyone (even the Mozilla server that hosts it), and you don’t need to manage the key. You never even see it.
But, we learned that our pairing implementation had some problems. Some were shallow, others were deep, but the net result is that a lot of people were confused by Sync, and we didn’t get as many people using it as we’d hoped. This post is meant to capture some of the problems that we observed.

Idealized Design vs Actual Implementation
Back in those early days, four years ago now, I was picturing a sort of idealized Sync setup process. In this fantasy world, next to the rainbows and unicorns, the first machine would barely have a setup UI at all, maybe just a single button that said “Enable Sync”. When you turned it on, that device would create an encryption key, and start uploading ciphertext. Then, in the second device, the “Connect To Sync” button would initiate the pairing process. At no point would you need a password or even an account name.
But, for a variety of reasons, by the time we had a working deliverable, our setup page looked like this:
Some of the reasons were laudable: having an email address lets us notify users about problems with their account, and establishing a password enabled things like a “Delete My Account” feature to work. But part of the reason included historical leftovers and backward-compatibility with existing prototypes.
In this system, the email address identified the account, and the password was used in an HTTP Basic Auth header to enable read/write access to encrypted data on the server. The data itself was encrypted with a random key, which came to be known as the “recovery key”. The pairing process copied all three things to the new device.

Deceptive UI
The problem here was that users still had to pick a password. The account-creation screen gave the impression that this password was important, and did nothing to disabuse them of the notion that email+password would be sufficient to access their data later. But the data was encrypted with the (hidden) key, not the password. In fact, this password was never entered again: it was copied to the other devices by pairing, not by typing.
It didn’t help that Firefox Sync came out at about the same time as a number of other products with “Sync” in the name, all of which did use email+password as the complete credentials.
This wasn’t so bad when the user went to set up a second device: they’d encounter the unusual pairing screen, but could follow the instructions and still get the job done. It was most surprising and frustrating for folks who used Firefox Sync with only one device.

“Sync”, not “Backup”
We, too, were surprised that people were using Sync with only one device. After all, it’s obviously not a backup system: given how pairing works, it clearly provides no value unless you’ve got a second device to hold the encryption key when your first device breaks.
At least, it was obvious to me, living in that idealistic world with the rainbows and unicorns. But in the real world, you’d have to read the docs to discover the wondrous joys of “pairing”, and who in the real world ever reads docs?
It turns out that an awful lot of people just went ahead and set up Sync despite having only one device. For a while, the majority of Sync accounts had only one device connected.
And when one of these unlucky folks lost that device or deleted their profile, then wanted to recover their data on a new device, they’d get to the setup box:
and they’d say, “sure, I Have an Account”, and they’d be taken to the pairing-code box:
and then they’d get confused. Remember, for these users, this was the first time they’d ever heard of this “pairing” concept: they were expecting to find a place to type email+password, and instead got this weird code thing. They’d have no idea what those funny letters were, or what they were supposed to do with them. But they were desperate, so they’d keep looking, and would eventually find a way out: the subtle “I don’t have the device with me” link in the bottom left.
Now, this link was intended to be a fallback for the desktop-to-desktop pairing case, where you’re trying to sync two immobile desktop-bound computers together (making it hard to transcribe the pairing code), and involves an extra step: you have to extract the recovery key from the first machine and carry it to the second one. By “I don’t have the device with me”, we meant “another device exists, but it isn’t here right now”. It was never meant to be used very often.
This also provided a safety net: if you had magically known about the recovery key ahead of time, and wrote it down, you could recover your data without an existing device. But since pairing was supposed to be the dominant transfer mechanism, this wasn’t emphasized in the UI, and there were no instructions about this at account setup time.
So when you’ve just lost your phone, or your hard drive got reformatted, it’s not unreasonable to interpret “I don’t have the device with me” as something more like “that device is no longer with us”, as in, “It’s dead, Jim”.
Following the link gets them to the fallback dialog:
Which looked almost like what they were expecting: there’s a place for an account name (email), and for the password that they’ve diligently remembered. But now there’s this “Sync Key” field that they’ve never heard of. The instructions tell them to do something impossible (since “your other device” is broken). A lot of very frustrated people wound up here, and it didn’t provide for their needs in the slightest.
Finally, these poor desperate users would click on the only remaining ray of hope, the “I have lost my other device” link at the bottom. Adding insult to injury, this actually provides instructions to reset the account, regenerate the recovery key, and delete all the server-side data. If you understand pairing, it’s clear why deleting the data is the only remaining option (erase the ciphertext that you can no longer decrypt, and reload from a surviving device). But for most people who got to this point, seeing these instructions only caused even more confusion and anger:
(When reached through the “I have lost my other device” link, this dialog would highlight the “Change Recovery Key” button. This same dialog was reachable through Preferences/Sync, and is how you’d find your Recovery Key and record it for later use.)

User Confusion
The net result was that a lot of folks just couldn’t use Sync. You can hear the frustration in these quotes from SUMO, the Firefox support site, circa December 2013:
The upshot is that, while we built a multi-device synchronization system with excellent security properties and (ostensibly) no passwords to manage, a lot of people actually wanted a backup system, with an easy way to recover their data even if they’d only had a single device. And they wanted it to look just like the other systems they were familiar with, using email and password for access.

Lessons Learned
We’re moving away from pairing for a while: Firefox 29 will transition to Firefox Accounts (abbreviated “FxA”), in which each account is managed by an email address and a password. Sync will still provide end-to-end encryption, but accessed by a password instead of pairing. My next post will describe the new system in more detail.
But we want to bring back pairing some day. How can we do it better next time? Here are some lessons I’ve learned from the FF 4.0 Sync experience:
- do lots of user testing, early in the design phase
- especially if you’re trying to teach people something new
- pay attention to all the error paths
- if your application behaves differently than the mainstream, make it look different too
- observe how people use your product, figure out what would meet their expectations, and try to build it
- if you think their expectations are “wrong” (i.e. they don’t match your intentions), that’s ok, but now you have two jobs: implementation and education. Factor that into your development budget.
I still believe in building something new when it’s better than the status quo, even if it means you must educate your users. But I guess I appreciate the challenges more now than I did four years ago.
I didn’t see this article by LinkedIn when it was first posted in October last year. But it warms the heart to see a large company laying out how it is deploying things like CSP, HSTS and pinning, along with SSL deployment best practice, to make its users more secure. I hope that many more follow in their footsteps.
From looking at it, I can say that with the changes they have made, we're saving roughly 60-70% on our AWS bill.
If you see them, give them a big pat on the back, this is huge for Mozilla.
Here’s some of the projects that helped with this:
- Experiments with smaller pools of build
- Firefox builds are way cheaper now!
- EC2 spot instances experiments
- AWS, networks and burning trees
- More and faster CI for less on AWS
- Analyzed shared cache on try
- Cost-efficient continuous integration
- Linux and Android try builds, now up to twice as fast
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.