My talk was really more about the “network problem” than the “protocol problem”. Networks breed first- and second-mover winners and other path-dependent powers, until the next disruption. Users, or rather their data, get captured.
Privacy is only one concern among several, including how to realize economic value for many-yet-individually-weak users, not just for data-store/service owners or third parties. Can we do better with client-side and private-cloud tiers, zero-knowledge proofs and protocols, or other ideas?
In the end, I asked these four questions:
- Can a browser/OS “unionize its users” to gain bargaining power vs. net super-powers?
- To create a data commons with “API to me” and aggregated/clustered economics?
- Open the walled gardens to put users first?
- Still be usable and private-enough for most?
I think the answer is yes, but I’m not sure who will do this work. It is vitally important.
I may get to it, but not working at Mozilla. I’ve resigned as CEO and I’m leaving Mozilla to take a rest, take some trips with my family, look at problems from other angles, and see if the “network problem” has a solution that doesn’t require scaling up to hundreds of millions of users and winning their trust while somehow covering costs. That’s a rare, hard thing, which I’m proud to have done with Firefox at Mozilla.
I encourage all Mozillians to keep going. Firefox OS is even more daunting, and more important. Thanks indeed to all who have supported me, and to all my colleagues over the years, at Mozilla, in standards bodies, and at conferences around the world. I will be less visible online, but still around.
It is a sad day today. … Or let’s start from somewhere else. I grew up in Communist Czechoslovakia. I remember one moment, when I was probably around seven years old: I was sitting on the floor of our living room, thinking about where to hide a tape of anti-Communist protest songs so that the secret police wouldn’t find it if we were blessed with a house search. Yes, seven years old. Yes, it was shortly after Charter 77 and there was a lot of hysteria in the air, but a couple of years later my father (who was a university professor) was falsely accused of raping some female students (fortunately, the police were so sloppy then that they made a mistake and provided him with the best alibi possible … they were interrogating him at the time when the rape was supposed to have happened; or perhaps it was not a mistake at all), so a mere house search was not that improbable.
I remember reading, a couple of years later, a poem by a famous nineteenth-century Czech poet, Karel Havlíček Borovský, written about the time when he was illegally arrested and deported by the Austrian police because of his anti-government journalism (yes, we have a long history of bad regimes here). This part is about the moment when he was dragged out of bed by the police early in the morning (the translation is mine and very, very rough):
Ale Džok, můj černý buldog, ten je grobián, na habeas corpus tuze zvyklý — on je Angličan. Málem by byl chlap přestoupil jeden paragraf, již na slavný ouřad zpod postele uďál: Vrr! haf! haf! Hodil jsem mu tam pod postel říšský zákoník, dobře že jsem měl ten moudrý nápad, již ani nekvík. —
However, Jock, my black bulldog, he is a lout, he is too much used to the Habeas Corpus, being an English dog. He almost overstepped one rule of the law, because from under the bed he started on the honorable officers: Grrr! Woof! Woof! I threw the imperial code of law under the bed for him; that was really a smart idea, he hasn’t made a sound since.
I asked my Dad (who was a lawyer) what Habeas Corpus means, and when he explained it to me, my conclusion from this poem was that there is something awesome about the rule of law, and particularly that there is something great about English (and by association American) law. Apparently it is not possible for a policeman to drag you out of bed without a reason, a luxury which I was certain we were not blessed with.
Later still I learned another standard of a free society (even more relevant to what I would like to talk about anyway). I have been told that this standard is well expressed in the famous saying attributed to Voltaire:
Monsieur l’abbé, I detest what you write, but I would give my life to make it possible for you to continue to write.
Then the so-called Velvet Revolution of 1989 happened, and I found that reality is a little bit more complicated, but I think these rules of freedom of expression and respect for other people’s opinions have stayed with me ever since. So I was terribly surprised and frankly confused later on, when I was reading de Tocqueville’s excellent book Democracy in America, to find this statement:
I know of no country in which there is so little independence of mind and real freedom of discussion as in America.
Isn’t he talking about the country which gave us the First Amendment, which gave us the whole concept of freedom of expression? Isn’t he talking about the country founded by dissenters? I thought that there must be something wrong with this statement, or that I had misunderstood something in what he was saying. Yet later on I was blessed with an opportunity to live and study for a couple of years in Boston, and I learned that the protection against the government attacking somebody for his expression is very much real, but that there is also a very high level of pressure to conform to the prevalent opinion of the community. And although everybody talks all the time about the value of diversity, very little of it is actually allowed.
So, in the last two weeks I came across these two stories.
World Vision, one of the largest Christian charity organizations in the world, decided that its employees would not be fired for living in a same-sex marriage sanctioned by their state and their denomination. They argued for the decision on the grounds that they are a non-denominational organization and did not want to overrule the policies of their employees’ denominations, not to mention state laws. I don’t know whether I agree with this argument, but it is obvious that the situation of non-denominational organizations is difficult, and whichever decision they make will be attacked by somebody. I don’t know exactly what happened afterwards, but a couple of days later, after an unbelievable firestorm of criticism from evangelical circles, World Vision reversed its decision.
Second story. Shortly after Brendan Eich was named CEO of the Mozilla Corporation, somebody dug up the old matter of his financial support for Proposition 8 (if I understand correctly, the issue at stake in that proposition was defining marriage as a union of one man and one woman; if you don’t know who Brendan Eich is, look at his Wikipedia page). Even a couple of LGBT employees of Mozilla Corp. defended Brendan Eich on their blogs, saying that there is no discrimination against them at Mozilla and that, on the contrary, conditions for LGBT people there are well above the legal requirements and at the highest level in the industry. Also, nobody was able to answer the question raised by some senior Mozilla developers: what do Brendan’s opinions have to do with his position as CEO of a company developing computer programs? And the whole story ended the same way again: the most extreme participants in the Kulturkampf won, and Mozilla lost, in my opinion, one of the most brilliant leaders in the industry.
What would de Tocqueville and Voltaire say?
Predicting time series can be very interesting, and not only for quants. Any server that logs metrics like the number of submissions or requests over time generates a so-called time series or signal. An interesting time series I had the chance to play with some time ago is the one generated by Telemetry’s submissions. This is what it looks like for the Nightly channel for the past 60 days:
It’s immediately evident to a human eye when and where there was a drop in submissions in the past couple of months (bugs!). An interesting problem is to automatically identify a drop in submissions as soon as it happens while at the same time keeping the number of “false alarms” to a minimum. It might seem rather trivial at first, but given that the data is widely dispersed, mostly because of daily oscillations, an outlier detection method based on the standard deviation is doomed to fail. Using the median absolute deviation is more robust but still not good enough to avoid false positives.
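To make the comparison between the two outlier scores concrete, here is a minimal Python sketch. It is not the code behind the plots in this post; the function names and the toy data are assumptions made for illustration, and the 1.4826 factor is the usual constant that makes the median absolute deviation comparable to a standard deviation for normally distributed data.

```python
import statistics


def z_scores(values):
    """Classic z-score: distance from the mean in standard deviations."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def mad_scores(values):
    """Robust score: distance from the median in scaled MAD units."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0 for _ in values]
    return [(v - med) / (1.4826 * mad) for v in values]


# Hypothetical submission counts with one obvious drop at the end.
counts = [100, 102, 98, 101, 99, 103, 97, 100, 20]
print(z_scores(counts)[-1])    # the drop inflates the standard deviation itself
print(mad_scores(counts)[-1])  # the median-based score is far more extreme
```

On this toy data the drop pulls the mean and the standard deviation towards itself, so its z-score stays modest, while the MAD-based score flags it clearly. Neither score knows anything about the daily cycle, which is why both can still raise false alarms on strongly periodic data.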
The periodic patterns might not be immediately visible from the raw data plot, but once we connect the dots the daily and weekly patterns appear in all their beauty:
The method I came up with to catch drops does the following:
- It retrieves the distributions of the last 10 days from the current data point
- It performs a series of Mann-Whitney tests to compare the last 24h to the distributions of the previous 9 days
- If the distributions are statistically different for at least 5 days with the current daily one having a lower mean, then we have a drop
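The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the original implementation: the function names, the 0.05 significance level and the hand-rolled Mann-Whitney test (normal approximation with tie correction, no continuity correction) are all choices made here to keep the example self-contained. In practice a library routine such as SciPy's `mannwhitneyu` would be the natural choice.

```python
import math
from statistics import fmean


def mann_whitney_p(a, b):
    """Two-sided p-value of the Mann-Whitney U test (normal approximation)."""
    n1, n2 = len(a), len(b)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # average of ranks i+1 .. j for tied values
        ties = j - i
        tie_term += ties ** 3 - ties
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_a - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2))


def is_drop(previous_days, last_24h, alpha=0.05, min_days=5):
    """previous_days: 9 lists of hourly counts; last_24h: the current day."""
    differing = sum(
        1
        for day in previous_days
        if mann_whitney_p(last_24h, day) < alpha and fmean(last_24h) < fmean(day)
    )
    return differing >= min_days
```

Comparing whole daily distributions rather than single points is what makes the check tolerant of the daily oscillation: a normal day looks like the previous normal days, however much it varies within itself.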
The algorithm requires a certain amount of history to make good predictions, which is why it detected the first drop on the left only after several days. As expected, though, it was able to detect the second drop without any false positives. Sudden drops are easy to detect with a robust outlier detection method, but slow drops, as we experienced in the past, can go unnoticed if you just look for outliers.
Another interesting approach is to use time series analysis to decompose the series into its seasonal (periodic), trend and noise signals. A simple classical decomposition by moving average yields the following series:
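For readers who want to see what a classical additive decomposition does mechanically, here is a minimal pure-Python sketch. It is not the tool used to produce the plots; the function names are hypothetical, a single known period is assumed, and the series must span at least two full periods.

```python
from statistics import fmean


def centered_moving_average(series, period):
    """Trend estimate; None where the window does not fit."""
    n = len(series)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half : t + half + 1]
        if period % 2:  # odd period: plain average over the window
            trend[t] = fmean(window)
        else:  # even period: the two end points get half weight
            trend[t] = (sum(window) - 0.5 * (window[0] + window[-1])) / period
    return trend


def decompose(series, period):
    """Split series into trend + seasonal + noise (additive model)."""
    trend = centered_moving_average(series, period)
    detrended = [
        (t, x - m) for t, (x, m) in enumerate(zip(series, trend)) if m is not None
    ]
    # Average the detrended values at each position within the period.
    phase_means = [
        fmean(d for t, d in detrended if t % period == k) for k in range(period)
    ]
    offset = fmean(phase_means)  # center the seasonal pattern around zero
    pattern = [m - offset for m in phase_means]
    seasonal = [pattern[t % period] for t in range(len(series))]
    noise = [
        None if m is None else x - m - s
        for x, m, s in zip(series, trend, seasonal)
    ]
    return trend, seasonal, noise
```

Because only one period is modelled, any second periodicity (a weekly cycle on top of a daily one, say) ends up in the trend or the noise.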
This simple algorithm was able to remove most of the periodic pattern; the trend is now affected by the weekly signal and the drops. It turns out that newer methods are able to decompose time series with multiple periodic patterns, or seasonalities. One algorithm I particularly like is the so-called TBATS method, which is an advanced exponential smoothing model:
That’s pretty impressive! The TBATS algorithm was able to identify and remove the daily and weekly frequency from our signal; what remains is basically the trend and some random noise. Now that we have such a clean signal we could try to apply statistical quality control to our time series, i.e. use a set of rules to identify drops. The rules look at the historical mean of a series of datapoints and, based on the standard deviation, help judge whether a new set of points is experiencing a mean shift (drop) or not.
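As a rough idea of what such rules can look like, the sketch below checks two conditions in the style of classic control-chart rules: a single point far below the historical mean, or a long run of points all below it. The function name, the 3-sigma threshold and the run length of 8 are assumptions chosen for illustration, not the rules actually applied to the Telemetry data.

```python
from statistics import fmean, pstdev


def mean_shift_down(baseline, recent, run_length=8, sigmas=3.0):
    """Return True if `recent` looks like a downward mean shift vs `baseline`."""
    mu = fmean(baseline)
    sd = pstdev(baseline)
    # Rule A: any single recent point far below the baseline mean.
    if any(x < mu - sigmas * sd for x in recent):
        return True
    # Rule B: a long run of consecutive points below the baseline mean.
    run = 0
    for x in recent:
        run = run + 1 if x < mu else 0
        if run >= run_length:
            return True
    return False
```

Rule A catches sudden drops, while Rule B is the one that can catch slow drops, since each individual point may sit well within the normal range.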
Given a decomposition of a time series, we can also use it to predict future datapoints. This can be useful for a variety of reasons beyond detecting drops. To have an idea of how well we can predict future submissions let’s take a clean subset of our data, from day 20 to day 40, and let’s try to predict Telemetry’s submissions for the next 5 days while comparing it to the actual data:
That’s pretty neat: we can immediately see that we have an outlier, and the prediction is very close to the actual data.
I wonder if there are other methods used to detect alterations to time series so feel free to drop me a line with a pointer if you happen to have a suggestion.
Mozilla prides itself on being held to a different standard and, this past week, we didn’t live up to it. We know why people are hurt and angry, and they are right: it’s because we haven’t stayed true to ourselves.
We didn’t act like you’d expect Mozilla to act. We didn’t move fast enough to engage with people once the controversy started. We’re sorry. We must do better.
Brendan Eich has chosen to step down from his role as CEO. He’s made this decision for Mozilla and our community.
Mozilla believes both in equality and freedom of speech. Equality is necessary for meaningful speech. And you need free speech to fight for equality. Figuring out how to stand for both at the same time can be hard.
Our organizational culture reflects diversity and inclusiveness. We welcome contributions from everyone regardless of age, culture, ethnicity, gender, gender-identity, language, race, sexual orientation, geographical location and religious views. Mozilla supports equality for all.
We have employees with a wide diversity of views. Our culture of openness extends to encouraging staff and community to share their beliefs and opinions in public. This is meant to distinguish Mozilla from most organizations and hold us to a higher standard. But this time we failed to listen, to engage, and to be guided by our community.
While painful, the events of the last week show exactly why we need the web. So all of us can engage freely in the tough conversations we need to make the world better.
We need to put our focus back on protecting that Web. And doing so in a way that will make you proud to support Mozilla.
What’s next for Mozilla’s leadership is still being discussed. We want to be open about where we are in deciding the future of the organization and will have more information next week. However, our mission will always be to make the Web more open so that humanity is stronger, more inclusive and more just: that’s what it means to protect the open Web.
We will emerge from this with a renewed understanding and humility — our large, global, and diverse community is what makes Mozilla special, and what will help us fulfill our mission. We are stronger with you involved.
Thank you for sticking with us.
Mitchell Baker, Executive Chairwoman
Modern microarchitectures are incredibly complex. A modern x86 processor will be superscalar and use some form of compilation to microcode to do that. Desktop processors will undoubtedly have multiple instruction issues per cycle, forms of register renaming, branch predictors, etc. Minor changes (a misaligned instruction stream, a poor order of instructions, a bad instruction choice) could kill the ability to take advantage of these features. There are very few people who could accurately predict the performance of a given assembly stream (I myself wouldn't attempt it if the architecture can take advantage of ILP), and these people are disproportionately likely to be working on compiler optimizations. So unless you're knowledgeable enough about assembly to help work on a compiler, you probably shouldn't be hand-coding assembly to make code faster.
To give an example to elucidate this point (and the motivation for this blog post in the first place), I was given a link to an implementation of the N-queens problem in assembly. For various reasons, I decided to use this to start building a fine-grained performance measurement system. This system uses a high-resolution monotonic clock on Linux and runs the function 1000 times to warm up caches and counters and then runs the function 1000 more times, measuring each run independently and reporting the average runtime at the end. This is a single execution of the system; 20 executions of the system were used as the baseline for a t-test to determine statistical significance as well as visual estimation of normality of data. Since the runs observed about a constant 1-2 μs of noise, I ran all of my numbers on the 10-queens problem to better separate the data (total runtimes ended up being in the range of 200-300 μs at this level). When I say that some versions are faster, the p-values for individual measurements are on the order of 10⁻²⁰, meaning that there is a 1-in-100,000,000,000,000,000,000 chance that the observed speedups could be produced if the programs take the same amount of time to run.
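The shape of that measurement procedure can be sketched in a few lines of Python. This is only an illustration of the warm-up, measure and compare structure, not the actual harness (which timed native code); the function names are hypothetical, `time.perf_counter_ns` stands in for the Linux monotonic clock, and the Welch form of the t statistic is an assumption, since the text only says "t-test".

```python
import math
import time
from statistics import fmean, variance


def measure(func, warmup=1000, runs=1000):
    """One execution of the system: mean runtime of func() in microseconds."""
    for _ in range(warmup):  # warm up caches and counters, results discarded
        func()
    samples = []
    for _ in range(runs):  # time each run independently
        start = time.perf_counter_ns()
        func()
        samples.append(time.perf_counter_ns() - start)
    return fmean(samples) / 1000.0


def welch_t(a, b):
    """Welch's t statistic for two samples of per-execution averages."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (fmean(a) - fmean(b)) / se if se else 0.0
```

Calling `measure` 20 times for each version of the program gives two samples of 20 averages, which is what a t-test would then compare.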
The initial assembly version of the program took about 288μs to run. The first C++ version I coded, originating from the same genesis algorithm that the author of the assembly version used, ran in 275μs. A recursive program beat out a hand-written assembly block of code... and when I manually converted the recursive program into a single loop, the runtime improved to 249μs. It wasn't until I got rid of all of the assembly in the original code that I could get the program to beat the derecursified code (at 244μs)—so it's not the vectorization that's causing the code to be slow. Intrigued, I started to analyze why the original assembly was so slow.
It turns out that there are three main things that I think cause the slow speed of the original code. The first one is alignment of branches: the assembly code contains no instructions to align basic blocks on particular branches, whereas gcc happily emits these for some basic blocks. I mention this first as it is mere conjecture; I never made an attempt to measure the effects for myself. The other two causes are directly measured from observing runtime changes as I slowly replaced the assembly with C++ code. When I replaced the use of push and pop instructions with a global static array, the runtime improved dramatically. This suggests that the alignment of the stack could be to blame (although the stack was still 8-byte aligned when I checked via gdb), which just goes to show you how much alignment really does matter in code.
The final, and by far most dramatic, effect I saw involves the use of three assembly instructions: bsf (find the index of the lowest bit that is set), btc (clear a specific bit index), and shl (left shift). When I replaced the use of these instructions with a more complicated expression int bit = x & -x and x = x - bit, the program's speed improved dramatically. And the rationale for why the speed improved won't be found in latency tables, although those will tell you that bsf is not a 1-cycle operation. Rather, it's in minutiae that's not immediately obvious.
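To show the `x & -x` idiom in context, here is a small bitmask N-queens counter written in Python. It is not the code that was benchmarked (that was assembly and C++); the solver structure and names are assumptions for illustration, and only the two expressions `bit = avail & -avail` and `avail -= bit` correspond directly to the replacement described above.

```python
def count_queens(n):
    """Count N-queens solutions using one bitmask per attacked direction."""
    full = (1 << n) - 1  # n low bits set: one bit per column

    def place(cols, diag_left, diag_right):
        if cols == full:  # every column holds a queen
            return 1
        total = 0
        avail = full & ~(cols | diag_left | diag_right)
        while avail:
            bit = avail & -avail  # isolate the lowest set bit
            avail -= bit          # and clear it
            total += place(
                cols | bit,
                ((diag_left | bit) << 1) & full,
                (diag_right | bit) >> 1,
            )
        return total

    return place(0, 0, 0)


print(count_queens(8))  # 92
```

The loop condition is a plain test of `avail` against zero, which is the kind of compare-and-branch that the next paragraph contrasts with relying on the zero flag set by bsf.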
The original program used the fact that bsf sets the zero flag if the input register is 0 as the condition to do the backtracking; the converted code just checked if the value was 0 (using a simple test instruction). The compare and the jump instructions are basically converted into a single instruction in the processor. In contrast, the bsf does not get to do this; combined with the intrinsically higher latency of the instruction, it means that empty loops take a lot longer to do nothing. The use of an 8-bit shift value is also interesting, as there is a rather severe penalty for using 8-bit registers in Intel processors as far as I can see.
Now, this isn't to say that the compiler will always produce the best code by itself. My final code wasn't above using x86 intrinsics for the vector instructions. Replacing the _mm_andnot_si128 intrinsic with an actual and-not on vectors caused gcc to use other, slower instructions instead of the vmovq to move the result out of the SSE registers for reasons I don't particularly want to track down. The use of the _mm_blend_epi16 and _mm_srli_si128 intrinsics can probably be replaced with __builtin_shuffle instead for more portability, but I was under the misapprehension that this was a clang-only intrinsic when I first played with the code so I never bothered to try that, and this code has passed out of my memory long enough that I don't want to try to mess with it now.
In short, compilers know things about optimizing for modern architectures that many general programmers don't. Compilers may have issues with autovectorization, but the existence of vector intrinsics allows you to force compilers to use vectorization while still giving them leeway to make decisions about instruction scheduling or code alignment, which are easy to screw up in hand-written assembly. Also, compilers are liable to get better in the future, whereas hand-written assembly code is unlikely to get faster in the future. So only write assembly code if you really know what you're doing and you know you're better than the compiler.
I work at Mozilla. The non-profit organisation to keep the open web, well, open and alive. I work here for a few reasons:
- We have a manifesto. Not a “company guideline” or “our vision” or “about us”. We mean business, but not in the sense of “what brings us the most money”.
- We have people, not employment figures. Amazing people, creative people, misfits and average people. From all over the globe, with all kinds of ideas and beliefs and backgrounds. And they work together. They clash, they disagree, they flood my inbox with CCs as I should know the answer to how they can work on this. They all have different setups and ways to work.
- We empower people. We work with a lot of people who we don’t pay. We help them learn, we help them become speakers for a good cause, we help them communicate and we let them be our communicators in regions and languages and manners we have no idea about. We trust them. And it shows. Going to a Mozilla summit is like going to a concert or festival. You have a lot of fun, you have a lot of noise and boy do you get a lot of demands. People are hungry to do good, and are ravenous to learn all about it.
- We are a stepping stone. Quite a few people who I trained on public speaking and tech evangelism got better jobs immediately after that. I write more recommendation letters than ever before. And I see people getting a chance to move to another country and get a job they beforehand only dreamed about.
- We are more than a company in Silicon Valley. We are world-wide, everybody has the right to work from home and most people do. We trust you to use your time wisely and only ask you to show up for video meetings where we need to sync. This means we release much more than I have ever seen in any other company. Your output speaks for you, not how on time you arrive in the office, how you look or where you are from.
- We value passion and personality – I can be a total pain in the lower backside. Other people drive me crazy. We don’t have to have the same ideas, instead we find a common ground and analyse what is good for the project as a whole. Then we do that together. There is no problem disagreeing with a director, a manager, or even a CEO. If you have a good point, it will be taken in and – after peer review – taken on. You can get away with a lot more than you could in other companies. And this isn’t about “yeah, let them rant – it makes them happy” – if you are professional and have a good point, you find an ear to listen to you.
- We disagree and discuss. The old saying “Arguing with an engineer is like wrestling with a pig in mud – you realise far too late it is enjoying it” is very much alive here. All discussions and disagreements are public. Personal disagreements are normally taken on head-on and in direct messaging. Nobody is asked to agree with anything without having had their say. This delays things, this makes things much more complex, but it also makes us who we are. A free, open product can not be created behind closed walls. Open Source does not mean “code is on GitHub”. It is pure transparency and a messy process. But it makes for strong products that can not be killed if one person leaves or gets bored. Open Source means the big red bus has no power. What is shared can not get taken away, neither by organisational changes, nor by outside powers, nor by silly things like hardware failure.
- We work with the competition. – I have no problem speaking to Google, Microsoft, Opera, Twitter, Facebook and whoever else I please. I speak at their events, I share upcoming work on our part with them. I applaud and publicly promote the great things they do. We work on standards and best practices. These can not be done in one place. They have to have peer review.
- We allow you to speak freely. – There is no censorship, there is no “you have to use this channel to communicate”. The latter drives me crazy, as many a time I have to react to things people say about our products on their personal blogs or find amazing examples and code randomly on the web. People prefer to write on their own channels about products they built on company time rather than using an official channel. In other companies, that is an HR issue. Hell, I had contracts that said that whatever code is written on company hardware belongs to the company. Not here. You can talk and you should also be aware of the effects your communication has. Many times this means we have to help you out when you miscommunicated. That is tough, but it also means we learn.
All of this is the messy craziness that is Mozilla. And that’s why I am here. It is not a 9-5 job, it is not an easy job. But damn is it rewarding and interesting.
When I started, I took a pay cut. I continuously get better offers from the outside. I had a six-hour interview with six people. It was the best brainstorming I had done in years. When I met volunteers on my way out and saw them giving their time for Mozilla with a smile that was contagious, I knew I was onto something good.
When I interviewed, nobody asked me about my personal or religious beliefs. This would be illegal – at least where I am from. I don’t have to agree with everyone I work with on a personal level. All I have to do is to allow you your freedom to be who you are and flag up when your personal views inconvenience or hurt others and are just not appropriate in a work situation.
So when you tell me that because I work for Mozilla I share the ideas of other people “above me” in the hierarchy, you don’t know me and you have no idea how Mozilla works. We are different, and we work differently. You turn something that thrives on communication, on helping one another and on having thousands of personal voices into something you understand: a hierarchical company with one person who is the embodiment of everything the company does. A figure like that exists – it is a one-man startup or a movie superhero. It doesn’t work for a loosely connected and open construct like Mozilla.
I’ve had moments where I was ready to give up. I had some very painful months lately where all my work of the last years was questioned and I felt I had run out of things to excite me. Then I concentrated on the people who give their free time to us and talked to them. And I found the spark again.
I am here for all the people who spend time to keep the web open, to teach web literacy, to give people a voice where it would be hard for them to get heard. They may be my colleagues, they may be volunteers, they may be people in other companies with similar goals. This is bigger than me and bigger than you. I hope it stays, I hope it thrives and I hope people understand that what Mozilla did and does is incredibly important. Information wants to be out and free. The internet allows for this. We made it our passion to protect this internet and give you software that is yours to use – for free, without backdoors or asking you for your information upfront. If that’s not your thing, fine. But don’t give it up because you disagree with one person’s personal actions and beliefs. I don’t.