With the release of Firefox 52 to all users worldwide, we now have the final Windows XP-supported Firefox release out the door.
This isn’t to say that support is done. As I’ve mentioned before, Windows XP users will be transitioned to the ESR update channel where they’ll continue to receive security updates for the next year or so.
And I don’t expect this to be the end of me having to blog about weird clients that are inexplicably on Windows XP.
However, this does take care of one of the longest-standing data questions I’ve looked at on this blog and in my career at Mozilla. So I feel that it’s worth taking a moment to mark the occasion.
Windows XP is dead. Long live Windows XP.
This is the sumo weekly call
I think I first heard about the Zstandard compression algorithm at a Mercurial developer sprint in 2015. At one end of a large table a few people were uttering expletives out of sheer excitement. At developer gatherings, that's the universal signal for "something is awesome." Long story short, a Facebook engineer shared a link to the RealTime Data Compression blog operated by Yann Collet (then known as the author of LZ4 - a compression algorithm known for its insane speeds) and people were completely nerding out over the excellent articles and the data within showing the beginnings of a new general purpose lossless compression algorithm named Zstandard. It promised better-than-deflate/zlib compression ratios and performance on both compression and decompression. This being a Mercurial meeting, many of us were intrigued because zlib is used by Mercurial for various functionality (including on-disk storage and compression over the wire protocol) and zlib operations frequently appear as performance hot spots.
Before I continue, if you are interested in low-level performance and software optimization, I highly recommend perusing the RealTime Data Compression blog. There are some absolute nuggets of info in there.
Anyway, over the months, the news about Zstandard (zstd) kept getting better and more promising. As the 1.0 release neared, the Facebook engineers I interact with (Yann Collet - Zstandard's author - is now employed by Facebook) were absolutely ecstatic about Zstandard and its potential. I was toying around with pre-release versions and was absolutely blown away by the performance and features. I believed the hype.
Zstandard 1.0 was released on August 31, 2016. A few days later, I started the python-zstandard project to provide a fully-featured and Pythonic interface to the underlying zstd C API while not sacrificing safety or performance. The ulterior motive was to leverage those bindings in Mercurial so Zstandard could be a first class citizen in Mercurial, possibly replacing zlib as the default compression algorithm for all operations.
Fast forward six months and I've achieved many of those goals. python-zstandard has a nearly complete interface to the zstd C API. It even exposes some primitives not in the C API, such as batch compression operations that leverage multiple threads and use minimal memory allocations to facilitate insanely fast execution. (Expect a dedicated post on python-zstandard from me soon.)
Mercurial 4.1 ships with the python-zstandard bindings. Two Mercurial 4.1 peers talking to each other will exchange Zstandard compressed data instead of zlib. For a Firefox repository clone, transfer size is reduced from ~1184 MB (zlib level 6) to ~1052 MB (zstd level 3) in the default Mercurial configuration while using ~60% of the CPU that zlib required on the compressor end. When cloning from hg.mozilla.org, the pre-generated zstd clone bundle hosted on a CDN using maximum compression is ~707 MB - ~60% the size of zlib! And, work is ongoing for Mercurial to support Zstandard for on-disk storage, which should bring considerable performance wins over zlib for local operations.
I've learned a lot working on python-zstandard and integrating Zstandard into Mercurial. My primary takeaway is Zstandard is awesome.
In this post, I'm going to extol the virtues of Zstandard and provide reasons why I think you should use it.
Why Zstandard
The main objective of lossless compression is to spend one resource (CPU) so that you may reduce another (I/O). This trade-off is usually made because data - either at rest in storage or in motion over a network or even through a machine via software and memory - is a limiting factor for performance. So if compression is needed for your use case to mitigate I/O being the limiting resource and you can swap in a different compression algorithm that magically reduces both CPU and I/O requirements, that's pretty exciting. At scale, better and more efficient compression can translate to substantial cost savings in infrastructure. It can also lead to improved application performance, translating to better end-user engagement, sales, productivity, etc. This is why companies like Facebook (Zstandard), Google (brotli, snappy, zopfli), and Pied Piper (middle-out) invest in compression.
Today, the most widely used compression algorithm in the world is likely DEFLATE. And, software most often interacts with DEFLATE via what is likely the most widely used software library in the world, zlib.
Being at least 27 years old, DEFLATE is getting a bit long in the tooth. Computers are completely different today than they were in 1990. The Pentium microprocessor debuted in 1993. If memory serves (pun intended), it used PC66 DRAM, which had a transfer rate of 533 MB/s. For comparison, a modern NVMe M.2 SSD (like the Samsung 960 PRO) can read at 3000+ MB/s and write at 2000+ MB/s. In other words, persistent storage today is faster than the RAM from the era when DEFLATE was invented. And of course CPU and network speeds have increased as well. We also have completely different instruction sets on CPUs for well-designed algorithms and software to take advantage of. What I'm trying to say is the market is ripe for DEFLATE and zlib to be dethroned by algorithms and software that take into account the realities of modern computers.
(For the remainder of this post I'll use zlib as a stand-in for DEFLATE because it is simpler.)
Zstandard initially piqued my attention by promising better-than-zlib compression and performance in both the compression and decompression directions. That's impressive. But it isn't unique. Brotli achieves the same, for example. But what kept my attention was Zstandard's rich feature set, tuning abilities, and therefore versatility.
In the sections below, I'll describe some of the benefits of Zstandard in more detail.
Before I do, I need to throw in an obligatory disclaimer about data and numbers that I use. Benchmarking is hard. Benchmarks should not be trusted. There are so many variables that can influence performance and benchmarks. (A recent example that surprised me is the CPU frequency/power ramping properties of Xeon versus non-Xeon Intel CPUs. tl;dr a Xeon won't hit max CPU frequency if only a core or two is busy, meaning that any single or low-threaded benchmark is likely misleading on Xeons unless you change power settings to mitigate its conservative power ramping defaults. And if you change power settings, does that reflect real-life usage?)
Reporting useful and accurate performance numbers for compression is hard because there are so many variables to care about. For example:
- Every corpus is different. Text, JSON, C++, photos, numerical data, etc. all exhibit different properties when fed into compression and could cause compression ratios or speeds to vary significantly.
- Few large inputs versus many smaller inputs (some algorithms work better on large inputs; some libraries have high per-operation overhead).
- Memory allocation and use strategy. Performance can vary significantly depending on how a compression library allocates, manages, and uses memory. This can be an implementation specific detail as opposed to a core property of the compression algorithm.
All performance data was obtained on an i7-6700K running Ubuntu 16.10 (Linux 4.8.0) with a mostly stock config. Benchmarks were performed in memory to mitigate storage I/O or filesystem interference. Memory used is DDR4-2133 with a cycle time of 35 clocks.
While I'm pretty positive about Zstandard, it isn't perfect. There are corpora for which Zstandard performs worse than other algorithms, even ones I compare it directly to in this post. So, your mileage may vary. Please enlighten me with your counterexamples by leaving a comment.
With that (rather large) disclaimer out of the way, let's talk about what makes Zstandard awesome.
Flexibility for Speed Versus Size Trade-offs
Compression algorithms typically contain parameters to control how much work to do. You can choose to spend more CPU to (hopefully) achieve better compression or you can spend less CPU to sacrifice compression. (OK, fine, there are other factors like memory usage at play too. I'm simplifying.) This is commonly exposed to end-users as a compression level. (In reality there are often multiple parameters that can be tuned. But I'll just use level as a stand-in to represent the concept.)
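To make the level trade-off concrete, here is a minimal sketch using Python's standard-library zlib module (not Zstandard). The synthetic corpus and the `make_corpus` and `measure_levels` helpers are assumptions made for illustration; absolute numbers will vary with the machine and the input.

```python
import random
import time
import zlib


def make_corpus(n_words=200_000, seed=0):
    """Build deterministic, text-like bytes from a small vocabulary (synthetic data)."""
    rng = random.Random(seed)
    vocab = [b"compress", b"ratio", b"speed", b"level", b"window", b"match",
             b"literal", b"entropy", b"buffer", b"stream", b"frame", b"block"]
    return b" ".join(rng.choice(vocab) for _ in range(n_words))


def measure_levels(data, levels=(1, 6, 9)):
    """Return {level: (compressed_size, seconds)} for each zlib level."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        assert zlib.decompress(compressed) == data  # sanity check: lossless
        results[level] = (len(compressed), elapsed)
    return results


if __name__ == "__main__":
    corpus = make_corpus()
    for level, (size, secs) in measure_levels(corpus).items():
        ratio = len(corpus) / size
        print(f"level {level}: {size} bytes, ratio {ratio:.2f}, {secs * 1000:.1f} ms")
```

Higher levels should produce smaller output at the cost of more CPU time, which is the single knob most libraries expose.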
But even with adjustable compression levels, the performance of many compression algorithms and libraries tends to fall within a relatively narrow window. In other words, many compression algorithms focus on niche markets. For example, LZ4 is super fast but doesn't yield great compression ratios. LZMA yields terrific compression ratios but is extremely slow.
This can be visualized in the following chart showing results when compressing a mozilla-unified Mercurial bundle:
This chart plots the logarithmic compression speed in megabytes per second against achieved compression ratio. The further right a data point is, the better the compression and the smaller the output. The higher up a point is, the faster compression is.
The ideal compression algorithm lives in the top right, which means it compresses well and is fast. But the powers of mathematics push compression algorithms away from the top right.
On to the observations.
LZ4 is highly vertical, which means its compression ratios are limited in variance but it is extremely flexible in speed. So for this data, you might as well stick to a lower compression level because higher values don't buy you much.
Bzip2 is the opposite: a horizontal line. That means it is consistently the same speed while yielding different compression ratios. In other words, you might as well crank bzip2 up to maximum compression because it doesn't have a significant adverse impact on speed.
LZMA and zlib are more interesting because they exhibit more variance in both the compression ratio and speed dimensions. But let's be frank, they are still pretty narrow. LZMA looks pretty good from a shape perspective, but its top speed is just too slow - only ~26 MB/s!
This small window of flexibility means that you often have to choose a compression algorithm based on the speed versus size trade-off you are willing to make at that time. That choice often gets baked into software. And as time passes and your software or data gains popularity, changing the software to swap in or support a new compression algorithm becomes harder because of the cost and disruption it will cause. That's technical debt.
What we really want is a single compression algorithm that occupies lots of space in both dimensions of our chart - a curve that has high variance in both compression speed and ratio. Such an algorithm would allow you to make an easy decision choosing a compression algorithm without locking you into a narrow behavior profile. It would allow you to make a completely different size versus speed trade-off in the future by only adjusting a config knob or two in your application - no swapping of compression algorithms needed!
As you can guess, Zstandard fulfills this role. This can clearly be seen in the following chart (which also adds brotli for comparison).
The advantages of Zstandard (and brotli) are obvious. Zstandard's compression speeds go from ~338 MB/s at level 1 to ~2.6 MB/s at level 22 while covering compression ratios from 3.72 to 6.05. On one end, zstd level 1 is ~3.4x faster than zlib level 1 while achieving better compression than zlib level 9! That fastest speed is only 2x slower than LZ4 level 1. On the other end of the spectrum, zstd level 22 runs ~1 MB/s slower than LZMA at level 9 and produces a file that is only 2.3% larger.
It's worth noting that zstd's C API exposes several knobs for tweaking the compression algorithm. Each compression level maps to a pre-defined set of values for these knobs. It is possible to set these values beyond the ranges exposed by the default compression levels 1 through 22. I've done some basic experimentation with this and have made compression even faster (while sacrificing ratio, of course). This covers the gap between Zstandard and brotli on this end of the tuning curve.
The wide span of compression speeds and ratios is a game changer for compression. Unless you have special requirements such as lightning fast operations (which LZ4 can provide) or special corpora that Zstandard can't handle well, Zstandard is a very safe and flexible choice for general purpose compression.
Multi-threaded Compression
Zstd 1.1.3 contains a multi-threaded compression API that allows a compression operation to leverage multiple threads. The output from this API is compatible with the Zstandard frame format and doesn't require any special handling on the decompression side. In other words, a compressor can switch to the multi-threaded API and decompressors won't care.
This is a big deal for a few reasons. First, today's advancements in computer processors tend to yield more capacity from more cores, not from faster clocks and better cycle efficiency (although many cases do benefit greatly from modern instruction sets like AVX and therefore better cycle efficiency). Second, so many compression libraries are only single-threaded and require consumers to invent their own framing formats or storage models to facilitate multi-threading. (See Blosc for such a library.) Lack of a multi-threaded API in the compression library means trusting another piece of software or writing your own multi-threaded code.
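For a sense of what "writing your own multi-threaded code" looks like with a single-threaded library, here is a rough sketch using Python's standard-library gzip module. It is not how Zstandard works internally. It splits the input into chunks, compresses them on a thread pool, and relies on the gzip format allowing concatenated members. The `compress_chunks_parallel` name and the chunk size are assumptions for illustration.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor


def compress_chunks_parallel(data, chunk_size=1 << 20, threads=4, level=6):
    """Compress fixed-size chunks on a thread pool and concatenate the gzip members."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() preserves input order, so members are joined in the original order.
        members = list(pool.map(lambda c: gzip.compress(c, compresslevel=level), chunks))
    return b"".join(members)


if __name__ == "__main__":
    payload = b"the quick brown fox jumps over the lazy dog\n" * 200_000
    blob = compress_chunks_parallel(payload)
    # A stock gzip decompressor reads the concatenated members back as one stream.
    assert gzip.decompress(blob) == payload
    print(f"{len(payload)} bytes -> {len(blob)} bytes")
```

Each chunk is compressed independently, so matches cannot span chunk boundaries and the ratio suffers a little. The chunking, ordering, and error handling all become the caller's problem, which is the burden a library-level multi-threaded API removes.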
The following chart adds a plot of Zstandard multi-threaded compression with 4 threads.
The existing curve for Zstandard basically shifted straight up. Nice!
The ~338 MB/s speed for single-threaded compression on zstd level 1 increases to ~1,376 MB/s with 4 threads. That's ~4.06x faster. And, it is ~2.26x faster than the previous fastest entry, LZ4 at level 1! The output size only increased by ~4 MB or ~0.3% over single-threaded compression.
The scaling properties for multi-threaded compression on this input are terrific: all 4 cores are saturated and the output size barely changed.
Because Zstandard's multi-threaded compression API produces data compatible with any Zstandard decompressor, it can logically be considered an extension of compression levels. This means that the already extremely flexible speed vs ratio curve becomes even wider in the speed axis. Zstandard was already a justifiable choice with its extreme versatility. But when you throw in native multi-threaded compression API support, the flexibility for tuning compression performance is just absurd. With enough cores, you are likely to run into I/O limits long before you exhaust the CPU, at which point you can crank up the compression level and sacrifice as much CPU as you are willing to burn. That's a good position to be in.
Decompression Speed
Compression speed and ratios only tell half the story about a compression algorithm. Except for archiving scenarios where you write once and read rarely, you probably care about decompression performance.
Popular compression algorithms like zlib and bzip2 have less than stellar decompression speeds. On my i7-6700K, zlib decompression can deliver many decompressed data sets at the output end at 200+ MB/s. However, on the input/compressed end, it frequently fails to reach 100 MB/s or even 80 MB/s. This is significant because if your application is reading data over a 1 Gbps network or from a local disk (modern SSDs can read at several hundred MB/s or more), then your application has a CPU bottleneck at decoding the data - and that's before you actually do anything useful with the data in the application layer! (Remember: the idea behind compression is to spend CPU to mitigate an I/O bottleneck. So if compression makes you CPU bound, you've undermined the point of compression!) And if my Skylake CPU running at 4.0 GHz is CPU - not I/O - bound, a Xeon in a data center will be even slower and even more CPU bound (Xeons tend to run at much lower clock speeds - the laws of thermodynamics require that in order to run more cores in the package). In short, if you are using zlib for high throughput scenarios, there's a good chance it is a bottleneck and slowing down your application.
We again measure the speed of algorithms using a Firefox Mercurial bundle. The following charts plot decompression speed versus ratio for this file. The first chart measures decompression speed on the input end of the decompressor. The second measures speed at the output end.
Zstandard matches its great compression speed with great decompression speed. Zstandard can deliver decompressed output at 1000+ MB/s while consuming input at 200-275 MB/s. Furthermore, decompression speed is mostly independent of the compression level. (Although higher compression levels require more memory in the decompressor.) So, if you want to throw more CPU at re-compression later so data at rest takes less space, you can do that without sacrificing read performance. I haven't done the math, but there is probably a break-even point where having dedicated machines re-compress terabytes or petabytes of data at rest offsets the costs of those machines through reduced storage costs.
While Zstandard is not as fast decompressing as LZ4 (which can consume compressed input at 500+ MB/s), its performance is often ~4x faster than zlib. On many CPUs, this puts it well above 1 Gbps, which is often desirable to avoid a bottleneck at the network layer.
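The 1 Gbps comparison is simple unit arithmetic, sketched below. The helper names are hypothetical, the throughput figures are the input-side numbers quoted in this post, and the conversion uses decimal units while ignoring protocol overhead.

```python
def link_mb_per_s(gbps):
    """Convert a link speed in Gbps to MB/s (decimal units, no protocol overhead)."""
    return gbps * 1000 / 8


def decoder_keeps_up(input_mb_per_s, link_gbps=1.0):
    """True if the decompressor consumes compressed input as fast as the link delivers it."""
    return input_mb_per_s >= link_mb_per_s(link_gbps)


if __name__ == "__main__":
    # Input-side speeds quoted in this post for an i7-6700K.
    for name, speed in [("zlib (low)", 80), ("zlib (high)", 100),
                        ("zstd (low)", 200), ("zstd (high)", 275)]:
        verdict = "keeps up with" if decoder_keeps_up(speed) else "is slower than"
        print(f"{name}: {speed} MB/s {verdict} a 1 Gbps link "
              f"({link_mb_per_s(1.0):.0f} MB/s)")
```

A 1 Gbps link delivers 125 MB/s, so a decoder consuming 80-100 MB/s of compressed input becomes the bottleneck, while one consuming 200+ MB/s does not.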
It's also worth noting that while Zstandard and brotli were comparable on the compression half of this data, Zstandard has a clear advantage doing decompression.
Finally, you don't appear to pay a price for multi-threaded Zstandard compression on the decompression side (zstdmt in the chart).
Dictionary Support
The examples so far in this post have used a single 4,457 MB piece of input data to measure behavior. Large data can behave completely differently from small data. This is because so much of what compression algorithms do is find patterns that came before so incoming data can be referenced to old data instead of uniquely stored. And if data is small, there isn't much of it that came before to reference!
This is often why many small, independent chunks of input compress poorly compared to a single large chunk. This can be demonstrated by comparing the widely-used zip and tar archive formats. On the surface, both do the same thing: they are a container of files. But they employ compression at different phases. A zip file will zlib compress each entry independently. However, a tar file doesn't use compression internally. Instead, the tar file itself is fed into a compression algorithm and compressed as a whole.
A more extreme example of the differences between zip and tar is the files in the Firefox source checkout. On revision a08ec245fa24 of the Firefox Mercurial repository, a zip file of all files in version control is 430,446,549 bytes versus 322,916,403 bytes for a tar.gz file (1,177,430,383 bytes uncompressed spanning 180,912 files). Using Zstandard, compressing each file discretely at compression level 3 yields 391,387,299 bytes of compressed data versus 294,926,418 as a single stream (without the tar container). Same compression algorithm. Different application method. Drastically different results. That's the impact of input size on compression performance.
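The same effect can be reproduced on a small scale. The following sketch uses Python's standard-library zlib and synthetic JSON records (both are stand-ins, not the Firefox data) to compare compressing each input on its own against compressing everything as one stream. The helper names are hypothetical.

```python
import json
import zlib


def make_records(count=2000):
    """Generate small, similar JSON payloads (synthetic stand-ins for real records)."""
    return [
        json.dumps({"id": i, "user": f"user{i % 50}", "event": "page_view",
                    "path": f"/articles/{i % 97}", "ok": True}).encode()
        for i in range(count)
    ]


def compare_strategies(records, level=6):
    """Return (total size with each record compressed alone, size as a single stream)."""
    independent = sum(len(zlib.compress(r, level)) for r in records)
    single_stream = len(zlib.compress(b"\n".join(records), level))
    return independent, single_stream


if __name__ == "__main__":
    records = make_records()
    independent, single_stream = compare_strategies(records)
    raw = sum(len(r) for r in records)
    print(f"raw: {raw} bytes")
    print(f"each record alone: {independent} bytes")
    print(f"one stream: {single_stream} bytes")
```

With inputs this small, each independent compression has almost no history to reference, so the per-record total stays close to the raw size while the single stream shrinks dramatically.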
While the compression ratio and speed of a single large stream is often better than multiple smaller chunks, there are still use cases that either don't have enough data or prefer independent access to each piece of input (like Firefox's omni.ja file). So a robust compression algorithm should handle small inputs as well as it does large inputs.
Zstandard helps offset the inherent inefficiencies of small inputs by supporting dictionary compression. A dictionary is essentially data used to seed the compressor's state. If the compressor sees data that exists in the dictionary, it references the dictionary instead of storing new data in the compressed output stream. This results in smaller output sizes and better compression ratios. One drawback to this is the dictionary has to be used to decompress data, which means you need to figure out how to distribute the dictionary and ensure it remains in sync with all data producers and consumers. This isn't always trivial.
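The idea of seeding the compressor's state can be shown with zlib's preset dictionary support in the Python standard library. This is a sketch of the concept rather than of Zstandard's implementation: the dictionary here is hand-written instead of trained, and `compress_with_dict` and `decompress_with_dict` are hypothetical helper names.

```python
import json
import zlib


def compress_with_dict(data, dictionary, level=6):
    """Compress data with a preset dictionary seeding the compressor's window."""
    comp = zlib.compressobj(level=level, zdict=dictionary)
    return comp.compress(data) + comp.flush()


def decompress_with_dict(blob, dictionary):
    """Decompress; the same dictionary bytes must be supplied or decoding fails."""
    decomp = zlib.decompressobj(zdict=dictionary)
    return decomp.decompress(blob) + decomp.flush()


# A hand-written dictionary made of substrings expected in every record.
# (Zstandard would normally train this from samples; this is a simplification.)
DICTIONARY = (b'{"id": , "user": "user", "event": "page_view", '
              b'"path": "/articles/", "ok": true}')

record = json.dumps({"id": 42, "user": "user7", "event": "page_view",
                     "path": "/articles/13", "ok": True}).encode()

if __name__ == "__main__":
    plain = zlib.compress(record, 6)
    seeded = compress_with_dict(record, DICTIONARY)
    assert decompress_with_dict(seeded, DICTIONARY) == record
    print(f"raw: {len(record)}, no dictionary: {len(plain)}, "
          f"with dictionary: {len(seeded)}")
```

The record compresses much better when the compressor has already "seen" its keys and structure, but every consumer now needs the exact same dictionary bytes, which is the distribution problem described above.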
Dictionary compression only works if there is enough repeated data and patterns in the inputs that can be extracted to yield a useful dictionary. Examples of this include markup languages, source code, or pieces of similar data (such as JSON payloads from HTTP API requests or telemetry data), which often have many repeated keywords and patterns.
Dictionaries are typically produced by training them on existing data. Essentially, you feed a bunch of samples into an algorithm that spits out a meaningful and useful dictionary. The more coherency in the data that will be compressed, the better the dictionary and the better the compression ratios.
Dictionaries can have a significant effect on compression ratios and speed.
Let's go back to Firefox's omni.ja file. Compressing each file discretely at zstd level 12 yields 9,177,410 bytes of data. But if we produce a 131,072 byte dictionary by training it on all files within omni.ja, the total size of each file compressed discretely is 7,942,886 bytes. Including the dictionary, the total size is 8,073,958 bytes, 1,103,452 bytes smaller than non-dictionary compression! (The zlib-based omni.ja is 9,783,749 bytes.) So Zstandard plus dictionary compression would likely yield a meaningful ~1.5 MB size reduction to the omni.ja file. This would make the Firefox distribution smaller and may improve startup time (since many files inside omni.ja are accessed at startup), which would make a number of people very happy. (Of course, Firefox doesn't yet contain the zstd C library. And adding it just for this use case may not make sense. But Firefox does ship with the brotli library and brotli supports dictionary compression and has similar performance characteristics to Zstandard, so, uh, someone may want to look into transitioning omni.ja to not zlib.)
But the benefits of dictionary compression don't end at compression ratios: operations with dictionaries can be faster as well!
The following chart shows performance when compressing Mercurial changeset data (describes a Mercurial commit) for the Firefox repository. There are 382,530 discrete inputs spanning 221,429,458 bytes (mean: 579 bytes, median: 306 bytes). (Note: measurements were conducted in Python and therefore may introduce some overhead.)
Aside from zstd level 3 dictionary compression, Zstandard is faster than zlib level 6 across the board (I suspect this one-off is an oddity with the zstd compression parameters at this level and this corpus because zstd level 4 is faster than level 3, which is weird).
It's also worth noting that non-dictionary Zstandard compression has similar compression ratios to zlib. Again, this demonstrates the intrinsic difficulties of compressing small inputs.
But the real takeaway from this data is the speed difference with dictionary compression enabled. Dictionary decompression is 2.2-2.4x faster than non-dictionary decompression. An already respectable ~240 MB/s decompression speed (measured at the output end) becomes ~530 MB/s. Zlib level 6 was ~140 MB/s, so swapping in dictionary compression makes things ~3.8x faster. It takes ~1.5s of CPU time to zlib decompress this corpus. So if Mercurial can be taught to use Zstandard dictionary compression for changelog data, certain operations on this corpus will complete ~1.1s faster. That's significant.
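The time estimates follow from dividing the corpus size by the output-side throughput. The sketch below redoes that arithmetic with the figures quoted in this post. It assumes 1 MB means 2**20 bytes, which is a guess about the units used. With 10**6 bytes per MB the results come out a few percent higher.

```python
CORPUS_BYTES = 221_429_458  # total changeset data, as quoted in this post
BYTES_PER_MB = 2 ** 20      # assumption: binary megabytes


def seconds_to_decompress(total_bytes, output_mb_per_s, bytes_per_mb=BYTES_PER_MB):
    """Estimate CPU seconds from output-side decompression throughput."""
    return total_bytes / (output_mb_per_s * bytes_per_mb)


zlib_secs = seconds_to_decompress(CORPUS_BYTES, 140)       # roughly 1.5 s
zstd_dict_secs = seconds_to_decompress(CORPUS_BYTES, 530)  # roughly 0.4 s
saved_secs = zlib_secs - zstd_dict_secs                    # roughly 1.1 s
speedup = 530 / 140                                        # roughly 3.8x

if __name__ == "__main__":
    print(f"zlib level 6: {zlib_secs:.2f} s")
    print(f"zstd with dictionary: {zstd_dict_secs:.2f} s")
    print(f"saved: {saved_secs:.2f} s ({speedup:.1f}x faster)")
```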
It's worth stating that Zstandard isn't the only compression algorithm or library to support dictionary compression. Brotli and zlib do as well, for example. But, Zstandard's support for dictionary compression seems to be more polished than other libraries I've seen. It has multiple APIs for training dictionaries from sample data. (Brotli has none nor does brotli's documentation say how to generate dictionaries as far as I can tell.)
Dictionary compression is definitely an advanced feature, applicable only to certain use cases (lots of small, similar data). But there's no denying that if you can take advantage of dictionary compression, you may be rewarded with significant performance wins.
A Versatile C API
As part of writing python-zstandard, I've spent a lot of time interfacing with the zstd C API. And, as part of evaluating other compression libraries for use in Mercurial, I've been looking at C APIs for other libraries and the Python bindings to them. A takeaway from this is an appreciation for the quality of zstd's C API.
Many compression library APIs are either too simple or too complex. Zstandard's is in the Goldilocks zone. Aside from a few minor missing features, its C API was more than adequate in its 1.0 release.
What I really appreciate about the zstd C API is that it provides high, medium, and low-level APIs. From the highest level, you throw it pointers to input and output buffers and it does an operation. From the medium level, you use a reusable context holding state and other parameters and it does an operation. From the low-level, you are calling multiple functions and shuffling bytes around, maintaining your own state and potentially bypassing the Zstandard framing format in the process. The different levels give you almost total control over everything. This is critical for performance optimization and when writing bindings for higher-level languages that may have different expectations on the behavior of software. The performance I've achieved in python-zstandard just isn't (easily) possible with other compression libraries because of their lacking API design.
Oftentimes when interacting with a C library I think, "if only there were a function to let me do X, my life would be much easier." I rarely have this experience with Zstandard. The C API is well thought out, has almost all the features I want/need, and is pretty easy to use. While most won't notice this difference, it should be a significant advantage for Zstandard in the long run, as more bindings are written and more people have a high-quality experience with it because the C API allows them to.
Zstandard Isn't Perfect
I've been pretty positive about Zstandard so far in this post. In fear of sounding like a fanboy who is so blinded by admiration that he can't see faults and because nothing is perfect, I need to point out some negatives about Zstandard. (Aside: put little faith in the words uttered by someone who can't find a fault in something they praise.)
First, the framing format is a bit heavyweight in some scenarios. The frame header is at least 6 bytes. For input of 256-65791 bytes, recording the original source size and its checksum will result in a 12 byte frame. Zlib, by contrast, is only 6 bytes for this scenario. When storing tens of thousands of compressed records (this is a use case in Mercurial), the frame overhead can matter and this can make it difficult for compressed Zstandard data to be as small as zlib for very small inputs. (It's worth noting that zlib doesn't store the decompressed size in its header. There are pros and cons to this, which I'll discuss in my eventual post about python-zstandard and how it achieves optimal performance.) If the frame overhead matters to you, the zstd C API does expose a block API that operates at a level below the framing format, allowing you to roll your own framing protocol. I also filed a GitHub issue to make the 4 byte magic number optional, which would go a long way to cutting down on frame overhead.
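The 6 byte figure for zlib can be checked directly: the zlib container is a 2 byte header plus a 4 byte Adler-32 trailer wrapped around a raw DEFLATE stream. The sketch below measures that with Python's standard-library zlib and then shows how much of a small stored record the framing would consume. The helper names and the 300 byte payload size are hypothetical.

```python
import zlib


def zlib_wrapper_overhead(data, level=6):
    """Bytes the zlib container adds on top of the raw DEFLATE stream."""
    framed = zlib.compress(data, level)
    # Negative wbits asks for a raw DEFLATE stream with no header or trailer.
    raw = zlib.compressobj(level, zlib.DEFLATED, -zlib.MAX_WBITS)
    raw_bytes = raw.compress(data) + raw.flush()
    return len(framed) - len(raw_bytes)


def overhead_fraction(overhead_bytes, compressed_payload_bytes):
    """Share of the stored record taken up by framing."""
    return overhead_bytes / (overhead_bytes + compressed_payload_bytes)


if __name__ == "__main__":
    print("zlib wrapper overhead:", zlib_wrapper_overhead(b"hello world " * 30), "bytes")
    payload = 300  # hypothetical compressed payload size in bytes
    for name, overhead in [("zlib (6 bytes)", 6), ("zstd frame (12 bytes)", 12)]:
        print(f"{name}: {overhead_fraction(overhead, payload):.1%} of the stored record")
```

A few bytes per record looks negligible in isolation, but it is a fixed cost paid on every one of tens of thousands of small records.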
Second, the C API is not yet fully stabilized. There are a number of functions marked as experimental that aren't exported from the shared library and are only available via static linking. There's a ton of useful functionality in there, including low-level compression parameter adjustment, digested dictionaries (for reusing computed dictionaries across multiple contexts), and the multi-threaded compression API. python-zstandard makes heavy use of these experimental APIs. This requires bundling zstd with python-zstandard and statically linking with this known version because functionality could change at any time. This is a bit annoying, especially for distro packagers.
Third, the low-level compression parameters are under-documented. I think I understand what a lot of them do. But it isn't obvious when I should consider adjusting what. The default compression levels seem to work pretty well and map to reasonable compression parameters. But a few times I've noticed that tweaking things slightly can result in desirable improvements. I wish there were a guide of sorts to help you tune these parameters.
Fourth, dictionary compression is still a bit too complicated and hand-wavy for my liking. I can measure obvious benefits when using it largely out of the box with some corpora. But it isn't always a win and the cost for training dictionaries is too high to justify using it outside of scenarios where you are pretty sure it will be beneficial. When I do use it, I'm not sure which compression levels it works best with, how many samples need to be fed into the dictionary trainer, which training algorithm to use, etc. If that isn't enough, there is also the concept of content-only dictionaries where you use a fulltext as the dictionary. This can be useful for delta-encoding schemes (where compression effectively acts like a diff/delta generator instead of using something like Myers diff). If this topic interests you, there is a thread on the Mercurial developers list where Yann Collet and I discuss this.
Fifth, the patent rights grant. There is some wording in the PATENTS file in the Zstandard project that may... concern lawyers. While Zstandard is covered by the standard BSD 3-Clause license, that supplemental PATENTS file may scare some lawyers enough that you won't be able to use Zstandard. You may want to talk to a lawyer before using Zstandard, especially if you or your company likes initiating patent lawsuits against companies (or wishes to reserve that right - as many companies do), as that is the condition upon which the license terminates. Note that there is a long history between Facebook and consumers of its open source software regarding this language in the PATENTS file. Do a search for "React patent grant" to read more.
Sixth and finally, Zstandard is still relatively new. I can totally relate to holding off until something new and shiny proves itself. That being said, the Zstandard framing protocol has some escape hatches for future needs. And, the project proved during its pre-1.0 days that it knows how to handle backwards and future compatibility issues. And considering Facebook and others are using Zstandard in production, I wouldn't be too worried. I think the biggest risk is to people (like me) who are writing code against the experimental C APIs. But even then, the changes to the experimental APIs in the past several months have been minor. I'm not losing sleep over it.
That may seem like a long and concerning list. Most of the issues are relatively minor. The language in the PATENTS file may be a showstopper to some. From my perspective, the biggest thing Zstandard has going against it is its youth. But that will only improve with age. While I'm usually pretty conservative about adopting new technology (I've gotten burned enough times that I prefer the neophytes do the field testing for me), the upside to using Zstandard is potentially drastic performance and efficiency gains. And that can translate to success versus failure or millions of dollars in saved infrastructure costs and productivity gains. I'm willing to take my chances.
Conclusion
For the corpora I've thrown at it, Zstandard handily outperforms zlib in almost every dimension. And, it even manages to best other modern compression algorithms like brotli in many tests.
The underlying algorithm and techniques used by Zstandard are highly parameterized, lending themselves to a variety of use cases from embedded hardware to massive data crunching machines with hundreds of gigabytes of memory and dozens of CPU cores.
The C API is well-designed and facilitates high performance and adaptability to numerous use cases. It is batteries included, providing functions to train dictionaries and perform multi-threaded compression.
Zstandard is backed by Facebook and seems to have a healthy open source culture on GitHub. My interactions with Yann Collet have been positive and he seems to be a great project maintainer.
Zstandard is an exciting advancement for data compression and therefore for the entire computing field. As someone who has lived in the world of zlib for years, was a casual user of compression, and thought zlib was good enough for most use cases, I can attest that Zstandard is game changing. After being enlightened to all the advantages of Zstandard, I'll never casually use zlib again: it's just too slow and inflexible for the needs of modern computing. If you use compression, I highly recommend investigating Zstandard.
(I updated the post on 2017-03-08 to include a paragraph about the supplemental license in the PATENTS file.)
Today, the organization WikiLeaks published a compendium of information alleged to be documents from the U.S. Central Intelligence Agency (CIA) pertaining to tools and techniques to compromise the security of mobile phones, computers, and internet-connected devices. We released the following statement on these reports:
If the information released in today’s reports is accurate, then it proves the CIA is undermining the security of the internet – and so is WikiLeaks. We’ve said before that cybersecurity is a shared responsibility, and this is true in this example, regarding the disclosure of security vulnerabilities. It appears that neither the CIA nor WikiLeaks is living up to that standard – the CIA seems to be stockpiling vulnerabilities, and WikiLeaks seems to be using that trove for shock value rather than coordinating disclosure to the affected companies to give them a chance to fix it and protect users.
The government may have legitimate intelligence or law enforcement reasons for delaying disclosure of vulnerabilities (for example, to enable lawful hacking), but these same vulnerabilities can endanger the security of billions of people. These two interests must be balanced, and recent incidents demonstrate just how easily stockpiling vulnerabilities can go awry without proper policies and procedures in place.
Once governments become aware of a security vulnerability, they have a responsibility to consider how and when (not whether) to disclose the vulnerability to the affected company so they can fix the problem and protect users.
We have been advocating for broader, open conversations about disclosure of security vulnerabilities and although today’s disclosures are jarring, we hope this raises awareness of the severity of these issues and the urgency of collaborating on reforms.
Here is the presentation material for my talk entitled The Dark Arts of SSH. Please note this is a single HTML rendering that includes presenter’s notes.
Once a month web developers across the Mozilla community get together (in person and virtually) to share what cool stuff we've been working on. This...
David Bryant: Why WebAssembly is a game changer for the web — and a source of pride for Mozilla and Firefox
With today’s release of Firefox, we are the first browser to support WebAssembly. If you haven’t yet heard of WebAssembly, it’s an emerging standard inspired by our research to enable near-native performance for web applications.
WebAssembly is one of the biggest advances to the Web Platform over the past decade.
To get a quick understanding of WebAssembly, and to get an idea of how some companies are looking at using it, check out this video. You’ll hear from engineers at Mozilla, and partners such as Autodesk, Epic, and Unity.
https://medium.com/media/1858e816355bfa288aa7294e39278e67/href
It’s been a long, winding, and exciting road getting here.
The asm.js sub-language worked impressively well, and we knew the approach could work even better as a first-class web standard. So, using asm.js as a proof of concept, we set out to collaborate with other browser makers to establish such a standard that could run as part of browsers. Together with expert engineers across browser makers, we established consensus on WebAssembly. We expect support for it will soon start shipping in other browsers.
In some ways, WebAssembly changes what it means to be a web developer, as well as the fundamental abilities of the web. With WebAssembly and an accompanying set of tools, programs written in languages like C/C++ can be ported to the web so they run with near-native performance. We expect that, as WebAssembly continues to evolve, you’ll also be able to use it with programming languages often used for mobile apps, like Java, Swift, and C#.
If you’re interested in hearing more about the backstory of WebAssembly, check out this behind-the-scenes look.
https://medium.com/media/7f594db82cecacb4cffaac7932ae1ac9/href
WebAssembly is shipping today in Firefox on Windows, MacOS, Linux, and Android. We’re particularly excited about the potential on mobile — do all those apps really need to be native?
If you’d like to try out some applications that use WebAssembly, upgrade to Firefox 52, and check out this demo of Zen Garden by Epic. For your convenience, we’ve embedded a video of the demo below.
https://medium.com/media/9c771666d7a80886c78da81479420ee7/href
If you’re a developer interested in working with WebAssembly, check out WebAssembly documentation on MDN. You might also want to see this series of blog posts by Lin Clark that explain WebAssembly through some cool cartoons.
Here at Mozilla we’re focused on moving the web forward and on making Firefox the best browser, hands down. With WebAssembly shipping today and Project Quantum well underway, we’re more bullish about the web — and about Firefox — than ever.
Why WebAssembly is a game changer for the web — and a source of pride for Mozilla and Firefox was originally published in Mozilla Tech on Medium, where people are continuing the conversation by highlighting and responding to this story.
Over the last month we had a higher rate of commits, failures, and fixes. One large thing is that we turned on stylo specific tests and that was a slightly rocky road. Last month we suggested disabling tests after 2 weeks of seeing the failures. We ended up disabling many tests, but fixing many more.
In addition to disabling more tests, we implemented a set of Bugzilla whiteboard entries to track our progress:
* [stockwell fixed] – a fix went in (even if it partially fixed the problem)
  * in the last 2 months, we have 106
* [stockwell disabled] – we disabled the test in at least one config and no fix
  * in the last 2 months, we have 61
* [stockwell infra] – Infra issues are usually externally driven
  * in the last 2 months, we have 11
* [stockwell unknown] – this became less intermittent with no clear reason
  * in the last 2 months, we have 44
* [stockwell needswork] – bugs in progress
  * in the last 2 months, we have 24
We have also been tracking the orange factor and number of high frequency intermittents:

| Week starting: | Jan 02, 2017 | Jan 30, 2017 | Feb 27, 2017 |
| Orange Factor (OF) | 13.76 | 10.75 | 9.06 |
| # priority intermittents | 42 | 61 | 32 |
| OF – priority intermittents | 7.25 | 5.78 | 4.78 |
I added a new row here, tracking the Orange Factor assuming all of the high frequency intermittent bugs didn’t exist. This is what the long tail looks like and I am really excited to see that number going down over time. For me a healthy spot would be OF <5.0 and the long tail <3.0.
We also looked at the number of unique bugs and repeat bugs/week. Most bugs have a lifecycle of 2 weeks and 2/3 of the bugs we see in a given week were high frequency (HF) the week prior. For example this past week we had 32 HF bugs and 21 of them were from the previous week (11 were still HF 2 weeks prior).
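As a quick check of the figures above, the sketch below recomputes how much of the Orange Factor comes from the priority intermittents and what fraction of this week's high frequency bugs repeated from the previous week. The numbers are copied from this post; the variable and function names are made up for illustration.

```javascript
// Weekly figures copied from this post.
// longTail is the "OF – priority intermittents" row.
const weeks = [
  { start: "Jan 02, 2017", orangeFactor: 13.76, longTail: 7.25 },
  { start: "Jan 30, 2017", orangeFactor: 10.75, longTail: 5.78 },
  { start: "Feb 27, 2017", orangeFactor: 9.06, longTail: 4.78 },
];

// Targets mentioned in the post: OF < 5.0 and long tail < 3.0.
const TARGET_OF = 5.0;
const TARGET_LONG_TAIL = 3.0;

// Round to two decimals to hide floating point noise.
const round2 = (x) => Math.round(x * 100) / 100;

function summarize(week) {
  return {
    start: week.start,
    // Part of the Orange Factor attributable to priority intermittents.
    fromPriorityIntermittents: round2(week.orangeFactor - week.longTail),
    healthy: week.orangeFactor < TARGET_OF && week.longTail < TARGET_LONG_TAIL,
  };
}

const summaries = weeks.map(summarize);

// 21 of this week's 32 high frequency bugs were also HF the week before.
const repeatFraction = 21 / 32;

console.log(summaries);
console.log(repeatFraction.toFixed(2));
```

None of the three weeks meets both targets yet, which matches the post's framing of them as a goal rather than the current state.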
While it is nice to assume we should just disable all tests, we find that many developers are actively working on these issues and it shows that we have many more fixed bugs than disabled bugs. The main motivation for disabling tests is to reduce the confusion for developers on try and to reduce the work the sheriffs need to do. Taking this data into account we are looking to adjust our policy for disabling slightly:
- all high frequency bugs (>=30 times/week) will be triaged and expected to be resolved in 2 weeks, otherwise we will start the process of disabling the test that is causing the bug
- if a bug occurs >75 times/week, it will be triaged but expectations are that it will be resolved in 1 week, otherwise we will start the process of disabling the test that is causing the bug
- if a bug is reduced below a high frequency (< 30 times/week), we will be happy to make a note of that and keep an eye on it, but will not look at disabling the test.
The big change here is we will be more serious about disabling tests, specifically when a test is failing >= 75 times/week. We have had many tests failing at least 50% of the time for weeks, and these show up on almost all try pushes that run these tests. Developers should not be seeing failures like these. Since we are tracking fixed vs disabled, if we determine that we are disabling too much, we can revisit this policy next month.
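The thresholds in the policy can be written down as a small function. This is only a sketch of the rules as listed above, not a tool we run; the function name and return shape are hypothetical. The list says ">75" and the paragraph says ">= 75", so the treatment of exactly 75 failures here (following the list) is an assumption.

```javascript
// Thresholds taken from the policy list in this post (failures per week).
const HIGH_FREQUENCY = 30;
const VERY_HIGH_FREQUENCY = 75;

// Returns what triage would expect for a bug with the given weekly
// failure count. Name and return shape are hypothetical.
function triageExpectation(failuresPerWeek) {
  if (failuresPerWeek > VERY_HIGH_FREQUENCY) {
    // Assumption: strictly greater than 75, as written in the list.
    return { triage: true, weeksToResolve: 1 };
  }
  if (failuresPerWeek >= HIGH_FREQUENCY) {
    return { triage: true, weeksToResolve: 2 };
  }
  // Below high frequency: make a note and keep an eye on it, no disabling.
  return { triage: false, weeksToResolve: null };
}

console.log(triageExpectation(80));
console.log(triageExpectation(45));
console.log(triageExpectation(12));
```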
Outside of numbers and policy, our goal is to have a solid policy, process, and toolchain available for self triaging as the year goes on. We are refining the policy and process via manual triage. The toolchain is the other work we are doing. Here are some updates:
- adding BUG_COMPONENTS to all files in m-c (bug 1328351) – slow and steady progress, thanks for the reviews to date! We fell behind while getting SETA completed, but much of the heavy lifting is already done
- retrigger an existing job with additional debugging arguments (bug 1322433) – main discussion is done, figuring out small details, we have a prototype working with little work remaining. Next steps would be to implement the top 3 or 4 use cases.
- add a test-lint job to linux64/mochitest (bug 1323044) – no progress yet. This got put on the backburner as we worked on SETA and focused on triage, whiteboard tags, and BUG_COMPONENTS. We have landed code for using the ‘when’ clause for test jobs (bug 1342963) which is a small piece of this. Getting this initially working will move up in priority soon, and making this work on all harnesses/platforms will most likely be a Google Summer of Code project.
Are there items we should be working on or looking into? Please join our meetings.
Up until recently, anytime you pushed a patch series to MozReview, a single attachment would be created on the bug associated with the push.
That single attachment would link to the “parent” or “root” review request, which contains the folded diff of all commits.
We noticed a lot of MozReview users were (rightfully) confused about this mapping from Bugzilla to MozReview. It was not at all obvious that Ship It on the parent review request would cause the attachment on Bugzilla to be r+’d. Consequently, reviewers used a number of workarounds, including, but not limited to:
- Manually setting the r+ or r- flags in Bugzilla for the MozReview attachments
- Marking Ship It on the child review requests, and letting the reviewee take care of setting the reviewer flags in the commit message
- Just writing “r+” in a MozReview comment
Anyhow, this model wasn’t great, and caused a lot of confusion.
So it’s changed! Now, when you push to MozReview, there’s one attachment created for every commit in the push. That means that when different reviewers are set for different commits, that’s reflected in the Bugzilla attachments, and when those reviewers mark “Ship It” on a child commit, that’s also reflected in an r+ on the associated Bugzilla attachment!
I think this makes quite a bit more sense. Hopefully you do too!
I’m on vacation this week, but the show must go on! So I pre-recorded a shorter episode of The Joy of Coding last Friday.
I demo the tool, and then I explain how it works. After I finished the episode, I pushed the repository to GitHub, and you can check that out right here.
So I’ll see you next week with a full length episode! Take care!
Which, several times, I mistakenly refer to as the 15th episode, and not the 16th. Whoops.
Common (excluding Website bugs)-specific: (23)
- Fixed: 768207 – Make the cache checkbox default-on in the new calendar dialog
- Fixed: 1049591 – Fix lots of strict warnings
- Fixed: 1086573 – Lightning and Thunderbird disagree about timezone support in ics files
- Fixed: 1099592 – Make JS callers of ios.newChannel call ios.newChannel2 in calendar/
- Fixed: 1149423 – Add Windows timezone names to list of aliases
- Fixed: 1151011 – Calendar events show up on wrong day when printing
- Fixed: 1151440 – Choose a color not responsive when creating a New calendar in Lightning 4.0b1
- Fixed: 1153327 – Run compare-locales with merging for Lightning
- Fixed: 1156015 – Email scheduling fails for recipients with URN id
- Fixed: 1158036 – Support sendMailTo for URN type attendees
- Fixed: 1159447 – TEST-UNEXPECTED-FAIL | xpcshell-icaljs.ini:calendar/test/unit/test_extract.js
- Fixed: 1159638 – Getter fails in calender-migration-dialog on first run after installation
- Fixed: 1159682 – Provide a more appropriate “learn more” page on integrated Lightning firstrun
- Fixed: 1159698 – Opt-out dialog has a button for “disable”, but actually the addon is removed
- Fixed: 1160728 – Unbreak Lightning 4.0b4 beta builds
- Fixed: 1162300 – TEST-UNEXPECTED-FAIL | xpcshell-libical.ini:calendar/test/unit/test_alarm.js | xpcshell return code: 0
- Fixed: 1163306 – Re-enable libical tests and disable ical.js in nightly builds when binary compatibility is back
- Fixed: 1165002 – Lightning broken, tries to load libical backend although “calendar.icaljs” defaults to “true”
- Fixed: 1165315 – TEST-UNEXPECTED-FAIL | xpcshell-icaljs.ini:calendar/test/unit/test_bug759324.js | xpcshell return code: 1 | ###!!! ASSERTION: Deprecated, use NewChannelFromURI2 providing loadInfo arguments!
- Fixed: 1165497 – TEST-UNEXPECTED-FAIL | xpcshell-icaljs.ini:calendar/test/unit/test_alarmservice.js | xpcshell return code: -11
- Fixed: 1165726 – TEST-UNEXPECTED-FAIL | /builds/slave/test/build/tests/mozmill/testBasicFunctionality.js | testBasicFunctionality.js::testSmokeTest
- Fixed: 1165728 – TEST-UNEXPECTED-FAIL | xpcshell-icaljs.ini:calendar/test/unit/test_bug494140.js | xpcshell return code: -11
Sunbird will no longer be actively developed by the Calendar team.
- Fixed: 401779 – Integrate Lightning Into Thunderbird by Default and Ship Thunderbird with Lightning Enabled
- Fixed: 717292 – Spell check language setting for subject and body not synchronized, but temporarily appears so when changing language and depending on focus (confusing ux)
- Fixed: 914225 – Support hotfix add-on in Thunderbird
- Fixed: 1025547 – newmailaccount/jquery.tmpl.js, line 123: reference to undefined property def
- Fixed: 1088975 – Answering mail with sendername containing encoded special chars and comma creates two “To”-entries
- Fixed: 1101237 – Remove distribution directory during install
- Fixed: 1109178 – Thunderbird OAuth implementation does not work with Evernote
- Fixed: 1110166 – Port |Bug 1102219 – Rename String.prototype.contains to String.prototype.includes| to comm-central
- Fixed: 1113097 – Fix misuse of fixIterator
- Fixed: 1130854 – Package Lightning with Thunderbird
- Fixed: 1131997 – Adapt for Debugger Server code for changes in bug 1059308
- Fixed: 1135291 – Update chat log entries added to Gloda since bug 955292 to use relative paths
- Fixed: 1135588 – New conversations get indexed twice by gloda, leading to duplicate search results
- Fixed: 1138154 – Plugins default to “always activate” in Thunderbird
- Fixed: 1142879 – [meta] track Mozilla-central (Core) issues that we want to have fixed in TB38
- Fixed: 1146698 – Chat Messages added to logs just before shutdown may not be indexed by gloda
- Fixed: 1148330 – Font indicator doesn’t update when cursor is placed in text where core returns sans-serif (Windows). Serif and monospace don’t work (Linux).
- Fixed: 1148512 – TEST-UNEXPECTED-FAIL | mailnews/imap/test/unit/test_dod.js | xpcshell return code: 0||1 | streamMessages – [streamMessages : 94] false == true | application crashed [@ mozalloc_abort(char const * const)]
- Fixed: 1149059 – splitter in compose window can be resized down to completely obscure composition area
- Fixed: 1151206 – Using a theme hides minimize, maximize and close button in composer window [Mac]
- Fixed: 1151475 – Remove use of expression closures in mail/
- Fixed: 1152299 – [autoconfig] Cosmetic changes for WEB.DE config
- Fixed: 1152706 – Upgrade to Correspondents column (combined To/From column) too agressive
- Fixed: 1152796 – chrome://messenger/content/folderDisplay.js, line 697: TypeError: this._savedColumnStates.correspondentCol is undefined
- Fixed: 1152926 – New mail sound preview doesn’t work for default system sound on Mac OS X
- Fixed: 1154737 – Permafail: TEST-UNEXPECTED-FAIL | toolkit/components/telemetry/tests/unit/test_TelemetryPing.js | xpcshell return code: 0
- Fixed: 1154747 – TEST-UNEXPECTED-FAIL | /builds/slave/test/build/tests/mozmill/session-store/test-session-store.js | test-session-store.js::test_message_pane_height_persistence
- Fixed: 1156669 – Trash folder duplication while using IMAP with localized TB
- Fixed: 1157236 – In-content dialogs: Port bug 1043612, bug 1148923 and bug 1141031 to TB
- Fixed: 1157649 – TEST-UNEXPECTED-FAIL | dom/push/test/xpcshell/test_clearAll_successful.js (and most other push tests)
- Fixed: 1158824 – Port bug 138009 to fix packaging errors | Missing file(s): bin/defaults/autoconfig/platform.js
- Fixed: 1159448 – Thunderbird ignores proxy settings on POP3S protocol
- Fixed: 1159627 – resource:///modules/dbViewWrapper.js, line 560: SyntaxError: unreachable code after return statement
- Fixed: 1159630 – components/glautocomp.js, line 155: SyntaxError: unreachable code after return statement
- Fixed: 1159676 – mailnews/mime/jsmime/test/test_custom_headers.js | run_next_test 0 – TypeError: _gRunningTest is undefined at /builds/slave/test/build/tests/xpcshell/head.js:1435 (and other jsmime tests)
- Fixed: 1159688 – After switching/changing the window layout, dragging the splitter between threadpane and messagepane can create gray/grey area/space (misplaced notificationbox)
- Fixed: 1159815 – Take bug 1154791 “Inline spell checker loses red underlines after a backspace is used – take two” in Thunderbird 38
- Fixed: 1159817 – Take “Bug 1100966 – Inline spell checker loses red underlines after a backspace is used” in Thunderbird 38
- Fixed: 1159834 – Consider taking “Bug 756984 – Changing location in editor doesn’t preserve the font when returning to end of text/line” in Thunderbird 38
- Fixed: 1159923 – Take bug 1140105 “Can’t query for a specific font face when the selection is collapsed” in TB 38
- Fixed: 1160105 – Fix strict mode warnings in protovis-r2.6-modded.js
- Fixed: 1160106 – “Searching…” spinner at the bottom of gloda search results never goes away
- Fixed: 1160114 – Strict mode warnings on faceted search
- Fixed: 1160805 – Missing Windows and Linux nightly builds, build step set props: previous_buildid fails
- Fixed: 1161162 – “Join Chat” doesn’t focus the newly joined MUC
- Fixed: 1162396 – Take bug 1140617 “Pasting an image loses the composition style” in TB38
- Fixed: 1163086 – Take bug 967494 “changing spellcheck language in one composition window affects all open and new compositions” in TB38
- Fixed: 1163299 – “TypeError: getBrowser(…) is null” in contentAreaClick with Lightning installed and started in calendar view
- Fixed: 1163343 – Incorrectly formatted error message “sending failed”
- Fixed: 1164415 – Error in comment for imapEnterServerPasswordPrompt
- Fixed: 1164658 – TypeError: Cc[‘@mozilla.org/weave/service;1’] is undefined at resource://gre/modules/FxAccountsWebChannel.jsm:227
- Fixed: 1164707 – missing toolkit_perfmonitoring.xpt in aurora builds
- Fixed: 1165152 – Take bug 1154894 in TB 38 branch: Disable test_plugin_default_state.js so Thunderbird can ship with plugins disabled by default
- Fixed: 1165320 – TEST-UNEXPECTED-FAIL | /builds/slave/test/build/tests/mozmill/notification/test-notification.js
MailNews Core-specific: (30)
- Fixed: 610533 – crash [@ nsMsgDatabase::GetSearchResultsTable(char const*, int, nsIMdbTable**)] with virtual folder
- Fixed: 745664 – Rename Address book aaa to aaa_test, delete another address book bbb, and renamed address book aaa_test will lose its name and appear deleted after restart (dataloss! involving localized names)
- Fixed: 777770 – get rid of nsVoidArray from /mailnews
- Fixed: 786141 – Use nsIFile.exists() instead of stat to check the existence of the file
- Fixed: 1069790 – Email addresses with parenthesis are not pretty-printed anymore
- Fixed: 1072611 – Ctrl+P not working from Composition’s Print Preview window
- Fixed: 1099587 – Make JS callers of ios.newChannel call ios.newChannel2 in mail/ and mailnews/
- Fixed: 1130248 – |To: “firstname.lastname@example.org” <email@example.com>| becomes |”firstname.lastname@example.org”@example.com| when I compose mail to it
- Fixed: 1138220 – some headers are not not properly capitalized
- Fixed: 1141446 – Behaviour of malformed rfc2047 encoded From message header inconsistent
- Fixed: 1143569 – User-agent error when posting to NNTP due to RFC5536 violation of Tb (user-agent header is folded just after user-agent:, “user-agent:[CRLF][SP]Mozilla…”)
- Fixed: 1144693 – Disable libnotify usage on Linux by default for new-mail notifications (doesn’t always work after bug 858919)
- Fixed: 1149320 – fix compile warnings in mailnews/extensions/
- Fixed: 1150891 – Port package-manifest.in changes from Bug 1115495 – Part 2: PAC generator for browsing and system wide proxy
- Fixed: 1151782 – Inputting 29th Feb as a birthday in the addressbook contact replaces it with 1st Mar.
- Fixed: 1152364 – crash in Address Book via nsAbBSDirectory::GetChildNodes nsCOMArrayEnumerator::operator new(unsigned int, nsCOMArray_base const&)
- Fixed: 1152989 – Account Manager Extensions broken in Thunderbird 37/38
- Fixed: 1154521 – jsmime fails on long references header and e-mail gets sent and stored in Sent without headers
- Fixed: 1155491 – Support autoconfig and manual config of gmail IMAP OAuth2 authentication
- Fixed: 1155952 – Nesting level does not match indentation
- Fixed: 1156691 – GUI “Edit filters”: Conditions/actions (for specfic accounts) not visible
- Fixed: 1156777 – nsParseMailbox.cpp:505:55: error: ‘do_QueryObject’ was not declared in this scope
- Fixed: 1158501 – Port bug 1039866 (metro code removal) and bug 1085557 (addition of socorro symbol upload API)
- Fixed: 1158751 – Port NO_JS_MANIFEST changes | mozbuild.frontend.reader.SandboxValidationError: calendar/base/backend/icaljs/moz.build
- Fixed: 1159255 – Build error: MSVC_ENABLE_PGO = True is not permitted to be used in mailnews/intl/moz.build
- Fixed: 1159626 – chrome://messenger/content/accountUtils.js, line 455: SyntaxError: unreachable code after return statement
- Fixed: 1160647 – Port |Bug 1159972 – Remove the fallible version of PL_DHashTableInit()| to comm-central
- Fixed: 1163347 – Don’t require scope in ispdb config for OAuth2
- Fixed: 1165737 – Fix usage of NS_LITERAL_CSTRING in mailnews, port Bug 1155963 to comm-central
- Fixed: 1166842 – Re-enable binary extensions for comm-central
You might have noticed that I had no “Things I’ve Learned This Week” post last week. Sorry about that – by the end of the week, I looked at my Evernote of “lessons from the week”, and it was empty. I’m certain I’d learned stuff, but I just failed to write it down. So I guess the lesson I learned last week was, always write down what you learn.
How to make your mozilla-central Mercurial clone work faster
I like Mercurial. I also like Git, but recently, I’ve gotten pretty used to Mercurial.
One complaint I hear over and over (and I’m guilty of it myself sometimes), is that “Mercurial is slow”. I’ve even experienced that slowness during some of my Joy of Coding episodes.
This document did not exist when I first started working with Mercurial – back then, I was using mq or sometimes pbranch, and grumbling about how I missed Git.
But there is some gold in this document.
gps has been doing some killer work documenting best practices with Mercurial, and this document is one of the results of his labour.
watchman is a tool that some folks at Facebook wrote to monitor changes in a folder. hgwatchman is an extension for Mercurial that takes advantage of watchman for a repository, smartly precomputing a bunch of stuff when the folder changes so that when you fire a command like hg status, it takes a fraction of the time it’d take without hgwatchman. A fraction.
Here’s how I set hgwatchman up on my MacBook (though you should probably go by the Mercurial for Mozillians doc as the official reference):
- Install watchman with brew: brew install watchman
- Clone the hgwatchman extension to some folder that you can easily remember and build it:
    hg clone https://bitbucket.org/facebook/hgwatchman
    cd hgwatchman
    make local
- Add the following lines to my user .hgrc:
    [extensions]
    hgwatchman = cloned-in-dir/hgwatchman/hgwatchman
- Make sure the extension is properly installed by running: hg help extensions
- hgwatchman should be listed under “enabled extensions”. If it didn’t work, keep in mind that you want to target the hgwatchman directory
- And then in my mozilla-central .hg/hgrc:
    [watchman]
    mode = on
- Boom, you’re done!
Congratulations, hg should feel snappier now!
On Firefox Hello, we recently added the eslint linter to be run against the Hello code base. We started off with a minimal set of rules, just enough to get us something running. Now we’re working on enabling more rules.
Since we enabled it, I feel like I’m able to iterate faster on patches. For example, if just as I finish typing I see something like:
Now I think about it, I’m realising it has also helped reduce the number of review nits on my patches, because trivial formatting mistakes such as trailing white-space or missing semi-colons are caught automatically.
Talking about reviews, as we’re running eslint on the Hello code, we just have to apply the patch, and run our tests, and we automatically get eslint output:
Hopefully our patch authors will be running eslint before uploading the patch anyway, but this is an additional test, and a few less things that we need to look at during review which helps speed up that cycle as well.
I’ve also put together a global config file for eslint (see below) that I use outside of the Hello code, on the rest of the Firefox code base (and other projects). It is enough that, when I use it in my editor, it gives me a reasonable amount of information about bad syntax without complaining about everything.
I would definitely recommend giving it a try. My patches feel faster overall, and my test runs are for testing, not stupid-mistake catching!
Want more specific details about the setup and advantages? Read on…
You need to have eslint installed globally, or at least in your path. Other than that, just follow the installation instructions given on the SublimeLinter page.
One configuration change I did have to make to the global configuration:
- Select “Preferences” -> “Settings – More” -> “Syntax Specific – User”
- In the file that appears, set the configuration up as follows (or whatever suits you):
I’ve uploaded my global configuration to a gist; if it changes I’ll update it there. It isn’t intended to catch everything – there are too many inconsistencies across the code base for that to be sensible at the moment. However, it does at least allow general syntax issues to be highlighted for most files – which is obviously useful in itself.
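For readers who have not seen an eslint configuration before, here is a minimal sketch of what a small global config could look like, written as a JavaScript module in the older .eslintrc.js style. This is not the contents of the gist: the environments and the choice of rules are assumptions, picked to match the examples mentioned in this post (trailing white-space and missing semi-colons).

```javascript
// Hypothetical minimal eslint config, e.g. saved as .eslintrc.js.
// Severity levels: 0 = off, 1 = warn, 2 = error.
const eslintConfig = {
  env: {
    browser: true,
    es6: true,
  },
  rules: {
    // The two formatting nits mentioned in the post.
    "no-trailing-spaces": 2,
    semi: [2, "always"],
    // General correctness checks, as warnings only so the editor
    // does not complain about everything.
    "no-undef": 1,
    "no-unused-vars": 1,
  },
};

// Export the config when loaded as a CommonJS module.
if (typeof module !== "undefined") {
  module.exports = eslintConfig;
}
```

Keeping most rules at warning level is one way to get useful hints on an inconsistent code base without drowning in errors.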
I haven’t yet tried running it across the whole code base via eslint on the command line – there seems to be some sort of configuration issue that is messing it up and I’ve not tracked it down yet.
Firefox Hello’s Configuration
The configuration files for Hello can be found in the mozilla-central source. There are a few of these because we have both content and chrome code, and some of the content code is shared with a website that can be viewed by most browsers, and hence isn’t currently able to use all the es6 features, whereas the chrome code can. This is another thing that eslint is good for enforcing.
Our eslint configuration is evolving at the moment, as we enable more rules, which we’re tracking in this bug.
Last week I gave a talk at the Philly Tech Week 2015 Dev Day organized by the delightful people at technical.ly on some of the tricks/strategies we use in the Firefox OS Gaia Email app. Note that the credit for implementing most of these techniques goes to the owner of the Email app’s front-end, James Burke. Also, a special shout-out to Vivien for the initial DOM Worker patches for the email app.
I tried to avoid having slides that I would be reading aloud while the audience read them silently, so instead of slides to share, I have the talk script. Well, I also have the slides here, but there’s not much to them. The headings below are the content of the slides, except for the one time I inline some code. Note that the live presentation must have differed slightly, because I’m sure I’m much more witty and clever in person than this script would make it seem…
Cover Slide: Who!
Hi, my name is Andrew Sutherland. I work at Mozilla on the Firefox OS Email Application. I’m here to share some strategies we used to make our HTML5 app Seem faster and sometimes actually Be faster.
What’s A Firefox OS (Screenshot Slide)
Here are some screenshots. We’ve got the default home screen app, the clock app, and of course, the email app.
It’s an entirely client-side offline email application, supporting IMAP4, POP3, and ActiveSync. The goal, like all Firefox OS apps shipped with the phone, is to give native apps on other platforms a run for their money.
And that begins with starting up fast.
Fast Startup: The Problems
But that’s frequently easier said than done. Slow-loading websites are still very much a thing.
The good news for the email application is that a slow network isn’t one of its problems. It’s pre-loaded on the phone. And even if it wasn’t, because of the security implications of the TCP Web API and the difficulty of explaining this risk to users in a way they won’t just click through, any TCP-using app needs to be a cryptographically signed zip file approved by a marketplace. So we do load directly from flash.
It adds up in the form of event loop activity and competition with other threads and processes. With the exception of Promises which get their own micro-task queue fast-lane, the web execution model is the same as all other UI event loops; events get scheduled and then executed in the same order they are scheduled. Loading data from an asynchronous API like IndexedDB means that your read result gets in line behind everything else that’s scheduled. And in the case of the bulk of shipped Firefox OS devices, we only have a single processor core so the thread and process contention do come into play.
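The ordering described above can be observed with a few lines of JavaScript. This is a minimal sketch that uses setTimeout as a stand-in for ordinary scheduled events (an assumption for illustration; an IndexedDB result is queued by the browser, not by a timer). The function name demoOrdering is made up.

```javascript
function demoOrdering() {
  return new Promise((resolve) => {
    const order = [];
    // Ordinary tasks run in the order they were scheduled.
    setTimeout(() => order.push("task 1"), 0);
    setTimeout(() => order.push("task 2"), 0);
    // A promise reaction goes to the micro-task queue, which is drained
    // before the next scheduled task gets to run.
    Promise.resolve().then(() => order.push("promise"));
    // Plain synchronous code runs first of all.
    order.push("sync");
    // Scheduled last, so it runs after the two tasks above.
    setTimeout(() => resolve(order), 0);
  });
}

demoOrdering().then((order) => {
  console.log(order.join(" -> ")); // sync -> promise -> task 1 -> task 2
});
```

The promise callback jumps ahead of both timers even though it was registered after them, while the timers keep their scheduling order.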
So we try not to be naive.
Seeming Fast at Startup: The HTML Cache
If we’re going to optimize startup, it’s good to start with what the user sees. Once an account exists for the email app, at startup we display the default account’s inbox folder.
What is the least amount of work that we can do to show that? Cache a screenshot of the Inbox. The problem with that, of course, is that a static screenshot is indistinguishable from an unresponsive application.
Local Storage: Okay in small doses
We implement the HTML cache by storing the HTML in localStorage.
Important Disclaimer! LocalStorage is a bad API. It’s a bad API because it’s synchronous. You can read any value stored in it at any time, without waiting for a callback. Which means if the data is not in memory the browser needs to block its event loop or spin a nested event loop until the data has been read from disk. Browsers avoid this now by trying to preload the Entire contents of local storage for your origin into memory as soon as they know your page is being loaded. And then they keep that information, ALL of it, in memory until your page is gone.
So if you store a megabyte of data in local storage, that’s a megabyte of data that needs to be loaded in its entirety before you can use any of it, and that hangs around in scarce phone memory.
To really make the point: do not use local storage, at least not directly. Use a library like localForage that will use IndexedDB when available, and then fall back to Web SQL Database and local storage, in that order.
Now, having sufficiently warned you of the terrible evils of local storage, I can say with a sorta-clear conscience… there are upsides in this very specific case.
The synchronous nature of the API means that once we get our turn in the event loop we can act immediately. There’s no waiting around for an IndexedDB read result to get its turn on the event loop.
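A minimal sketch of such a cache, assuming the storage object is passed in so the same code works with `window.localStorage` or a stand-in. The key name `html_cache` and both function names are hypothetical, not the email app's actual code.

```javascript
// Hypothetical key name; the real app's key and format are not shown here.
const CACHE_KEY = 'html_cache';

// Save the markup of the current view so the next startup can reuse it.
function saveHtmlCache(storage, html) {
  try {
    storage.setItem(CACHE_KEY, html);
    return true;
  } catch (e) {
    // setItem can throw, for example when the storage quota is exceeded.
    return false;
  }
}

// Synchronously restore cached markup into a container, if any exists.
// Returns true when cached HTML was applied.
function restoreHtmlCache(storage, container) {
  const html = storage.getItem(CACHE_KEY);
  if (html === null) {
    return false;
  }
  container.innerHTML = html;
  return true;
}
```

In a page, the restore call would take `window.localStorage` and whichever element holds the cards. Because it is synchronous, the cached markup is in the DOM before the script finishes running.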
This matters because although the concept of loading is simple from a User Experience perspective, there’s no standard to back it up right now. Firefox OS’s UX desires are very straightforward. When you tap on an app, we zoom it in. Until the app is loaded we display the app’s icon in the center of the screen. Unfortunately the standards are still assuming that the content is right there in the HTML. That assumption works well for document-based web pages or server-powered web apps where the contents of the page are baked in. It works less well for client-only web apps where the content lives in a database and has to be dynamically retrieved.
The two events that exist are:
“DOMContentLoaded” fires when the document has been fully parsed and all scripts not tagged as “async” have run. If there were stylesheets referenced prior to the script tags, the script tags will wait for the stylesheet loads.
“load” fires when the document has been fully loaded; stylesheets, images, everything.
But neither of these has anything to do with the content in the page saying it’s actually done. This matters because these standards also say nothing about IndexedDB reads or the like. We tried to create a standards consensus around this, but it’s not there yet. So Firefox OS just uses the “load” event to decide an app or page has finished loading and it can stop showing your app icon. This largely avoids the dreaded “flash of unstyled content” problem, but it also means that your webpage or app needs to deal with this period of time by displaying a loading UI or just accepting a potentially awkward transient UI state.
(Trivial HTML slide)
<link rel="stylesheet" ...>
<script ...></script>
DOMContentLoaded!
This is the important summary of our index.html.
We reference our stylesheet first. It includes all of our styles. We never dynamically load stylesheets because that compels a style recalculation for all nodes and potentially a reflow. We would have to have an awful lot of style declarations before considering that.
Then we have our single script file. Because the stylesheet precedes the script, our script will not execute until the stylesheet has been loaded. Then our script runs and we synchronously insert our HTML from local storage. Then DOMContentLoaded can fire. At this point the layout engine has enough information to perform a style recalculation and determine what CSS-referenced image resources need to be loaded for buttons and icons, then those load, and then we’re good to be displayed as the “load” event can fire.
After that, we’re displaying an interactive-ish HTML document. You can scroll, you can press on buttons and the :active state will apply. So things seem real.
Being Fast: Lazy Loading and Optimized Layers
But now we need to try and get some logic in place as quickly as possible that will actually cash the checks that real-looking HTML UI is writing. And the key to that is only loading what you need when you need it, and trying to get it to load as quickly as possible.
There are many module loading and build optimizing tools out there, and most frameworks have a preferred or required way of handling this. We used the RequireJS family of Asynchronous Module Definition loaders, specifically the alameda loader and the r-dot-js optimizer.
One of the niceties of the loader plugin model is that we are able to express resource dependencies as well as code dependencies.
RequireJS Loader Plugins
var fooModule = require('./foo');
var htmlString = require('text!./foo.html');
var localizedDomNode = require('tmpl!./foo.html');
The standard CommonJS loader semantics used by node.js and io.js are what you see in the first line here. Load the module, return its exports.
But RequireJS loader plugins also allow us to do things like the second line where the exclamation point indicates that the load should occur using a loader plugin, which is itself a module that conforms to the loader plugin contract. In this case it’s saying load the file foo.html as raw text and return it as a string.
But, wait, there’s more! Loader plugins can do more than that. The third example uses a loader that loads the HTML file using the ‘text’ plugin under the hood, creates an HTML document fragment, and pre-localizes it using our localization library. And this works un-optimized in a browser, no compilation step needed, but it can also be optimized.
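As a rough sketch of what a ‘tmpl’-style plugin could look like: a RequireJS loader plugin is a module that exposes a `load(name, req, onload, config)` function. The factory below is an illustration, not the email app's plugin. `makeNode` and `localize` are hypothetical stand-ins for building the document fragment and for the localization library.

```javascript
// Builds a hypothetical 'tmpl'-style loader plugin. The two helpers are
// injected so the sketch stays self-contained:
//   makeNode(htmlString) -> node built from the markup
//   localize(node)       -> localizes the node in place
function makeTmplPlugin(makeNode, localize) {
  return {
    // RequireJS calls load() with the resource name that follows the '!',
    // a local require function, and a callback for the final value.
    load: function (name, req, onload, config) {
      // Reuse the 'text' plugin to fetch the raw file contents.
      req(['text!' + name], function (htmlString) {
        const node = makeNode(htmlString);
        localize(node);
        // Whatever is passed to onload is what require('tmpl!...') yields.
        onload(node);
      });
    },
  };
}
```

In an AMD setup the returned object would be handed to `define(...)` in a file named after the plugin, so that `tmpl!./foo.html` resolves to it.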
We then also run the optimizer against our other important cards like the “compose” card and the “message reader” card. We don’t do this for all cards because it can be hard to carve up the module dependency graph for optimization without starting to run into cases of overlap where many optimized files redundantly include files loaded by other optimized files.
Plus, we have another trick up our sleeve:
Seeming Fast: Preloading
Preloading. Our cards optionally know the other cards they can load. So once we display a card, we can kick off a preload of the cards that might potentially be displayed. For example, the message list card can trigger the compose card and the message reader card, so we can trigger a preload of both of those.
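The idea can be sketched as a small lookup table plus a guard against duplicate requests. The card names and the `NEXT_CARDS` map are invented for this example, and `loadCard` stands in for whatever asynchronous module loader is in use.

```javascript
// Hypothetical map from a card to the cards it can lead to.
const NEXT_CARDS = {
  message_list: ['compose', 'message_reader'],
  message_reader: ['compose'],
  compose: [],
};

// Track what has been requested so each card is only loaded once.
const requested = new Set();

// After showing a card, start loading the cards it might open next.
// Returns the names of the cards whose load was started by this call.
function preloadFrom(cardName, loadCard) {
  const started = [];
  for (const next of NEXT_CARDS[cardName] || []) {
    if (!requested.has(next)) {
      requested.add(next);
      loadCard(next);
      started.push(next);
    }
  }
  return started;
}
```

With an AMD loader, `loadCard` might be something like `(name) => require(['cards/' + name])`, where the `cards/` path is an assumption made for this sketch.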
But we don’t go overboard with preloading in the frontend because we still haven’t actually loaded the back-end that actually does all the emaily email stuff. The back-end is also chopped up into optimized layers along account type lines and online/offline needs, but the main optimized JS file still weighs in at something like 17 thousand lines of code with newlines retained.
So once our UI logic is loaded, it’s time to kick off loading the back-end. And in order to avoid impacting the responsiveness of the UI both while it loads and when we’re doing steady-state processing, we run it in a DOM Worker.
Being Responsive: Workers and SharedWorkers
DOM Workers are background JS threads that lack access to the page’s DOM, communicating with their owning page via message passing with postMessage. Normal workers are owned by a single page. SharedWorkers can be accessed via multiple pages from the same document origin.
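Since everything crosses the boundary as messages, the page side usually ends up with a small request/reply bridge. The following is a minimal sketch, not the email app's actual protocol: the `{ id, type, args }` and `{ id, result }` message shapes are made up, and `port` is anything with `postMessage()` and an `onmessage` property, such as a Worker.

```javascript
// Main-thread side of a tiny request/reply bridge over postMessage.
function createBridge(port) {
  let nextId = 1;
  const pending = new Map(); // request id -> callback awaiting the reply

  // Replies carry the id of the request they answer.
  port.onmessage = function (event) {
    const { id, result } = event.data;
    const callback = pending.get(id);
    if (callback) {
      pending.delete(id);
      callback(result);
    }
  };

  return {
    // Send a request to the worker; callback runs when its reply arrives.
    request(type, args, callback) {
      const id = nextId++;
      pending.set(id, callback);
      port.postMessage({ id, type, args });
    },
  };
}
```

The worker side would mirror this: read `id`, `type`, and `args` from each incoming message, do the work, and post back `{ id, result }`.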
By doing this, we stay out of the way of the main thread. This is getting less important as browser engines support Asynchronous Panning & Zooming or “APZ” with hardware-accelerated composition, tile-based rendering, and all that good stuff. (Some might even call it magic.)
When Firefox OS started, we didn’t have APZ, so any main-thread logic had the serious potential to result in janky scrolling and the impossibility of rendering at 60 frames per second. It’s a lot easier to get 60 frames-per-second now, but even asynchronous pan and zoom potentially has to wait on dispatching an event to the main thread to figure out if the user’s tap is going to be consumed by app logic and preventDefault called on it. APZ does this because it needs to know whether it should start scrolling or not.
And speaking of 60 frames-per-second…
Being Fast: Virtual List Widgets
…the heart of a mail application is the message list. The expected UX is to be able to fling your way through the entire list of what the email app knows about and see the messages there, just like you would on a native app.
This is admittedly one of the areas where native apps have it easier. There are usually list widgets that explicitly have a contract that says they request data on an as-needed basis. They potentially even include data bindings so you can just point them at a data-store.
But HTML doesn’t yet have a concept of instantiate-on-demand for the DOM, although it’s being discussed by Firefox layout engine developers. For app purposes, the DOM is a scene graph. An extremely capable scene graph that can handle huge documents, but there are footguns and it’s arguably better to err on the side of fewer DOM nodes.
So what the email app does is we create a scroll-region div and explicitly size it based on the number of messages in the mail folder we’re displaying. We create and render enough message summary nodes to cover the current screen, 3 screens worth of messages in the direction we’re scrolling, and then we also retain up to 3 screens worth in the direction we scrolled from. We also pre-fetch 2 more screens worth of messages from the database. These constants were arrived at experimentally on prototype devices.
We listen to “scroll” events and issue database requests and move DOM nodes around and update them as the user scrolls. For any potentially jarring or expensive transitions such as coordinate space changes from new messages being added above the current scroll position, we wait for scrolling to stop.
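The window arithmetic described above can be sketched as a pure function. This is a simplification under stated assumptions: every message row has the same fixed height, the behind-the-scroll buffer is always filled rather than merely retained, and the database pre-fetch is left out. The function and option names are made up for this example; only the 3-screen constants come from the text.

```javascript
// Buffer sizes from the text: 3 screens ahead, up to 3 retained behind.
const SCREENS_AHEAD = 3;
const SCREENS_BEHIND = 3;

// Returns the height to give the scroll region and the inclusive range
// of item indices that should have DOM nodes.
// direction: 1 (or 0) when scrolling down, -1 when scrolling up.
function computeRenderRange({ scrollTop, viewportHeight, itemHeight, totalCount, direction }) {
  const perScreen = Math.ceil(viewportHeight / itemHeight);
  const firstVisible = Math.floor(scrollTop / itemHeight);
  const lastVisible = Math.floor((scrollTop + viewportHeight - 1) / itemHeight);

  const scrollingDown = direction >= 0;
  const before = (scrollingDown ? SCREENS_BEHIND : SCREENS_AHEAD) * perScreen;
  const after = (scrollingDown ? SCREENS_AHEAD : SCREENS_BEHIND) * perScreen;

  return {
    totalHeight: totalCount * itemHeight,
    first: Math.max(0, firstVisible - before),
    last: Math.min(totalCount - 1, lastVisible + after),
  };
}

console.log(computeRenderRange({
  scrollTop: 5000, viewportHeight: 500, itemHeight: 50, totalCount: 1000, direction: 1,
}));
// prints { totalHeight: 50000, first: 70, last: 139 }
```

Sizing the scroll region from `totalHeight` keeps the scrollbar honest even though only a small window of nodes exists at any time.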
Nodes are absolutely positioned within the scroll area using their ‘top’ style, but translation transforms also work. We remove nodes from the DOM, then update their position and their state before re-appending them. We do this because the browser APZ logic tries to be clever and figure out how to create an efficient series of layers so that it can pre-paint as much of the DOM as possible in graphic buffers, AKA layers, that can be efficiently composited by the GPU. Its goal is that when the user is scrolling, or something is being animated, it can just move the layers around the screen or adjust their opacity or other transforms without having to ask the layout engine to re-render portions of the DOM.
When our message elements are added to the DOM with an already-initialized absolute position, the APZ logic lumps them together as something it can paint in a single layer along with the other elements in the scrolling region. But if we start moving them around while they’re still in the DOM, the layerization logic decides that they might want to independently move around more in the future and so each message item ends up in its own layer. This slows things down. But by removing them and re-adding them it sees them as new with static positions and decides that it can lump them all together in a single layer. Really, we could just create new DOM nodes, but we produce slightly less garbage this way and in the event there’s a bug, it’s nicer to mess up with 30 DOM nodes displayed incorrectly rather than 3 million.
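The remove, update, re-append sequence might look roughly like this. It is a sketch with a made-up function name and a fixed row height assumed. Whether it actually keeps items in a shared layer depends on the browser engine, so treat it as an illustration of the ordering rather than a guaranteed optimization.

```javascript
// Recycle an absolutely positioned item for a new index. The node is
// detached first so that, when it returns, the browser sees a "new"
// element that already has its final position.
function repositionItem(container, node, index, itemHeight, update) {
  container.removeChild(node);
  node.style.top = (index * itemHeight) + 'px';
  update(node); // refresh the displayed message while the node is detached
  container.appendChild(node);
}
```

The `update` callback is where the row's subject, sender, and similar fields would be rewritten for the message now assigned to it.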
But as neat as the layerization stuff is to know about on its own, I really mention it to underscore 2 suggestions:
1, Use a library when possible. Getting on and staying on APZ fast-paths is not trivial, especially across browser engines. So it’s a very good idea to use a library rather than rolling your own.
2, Use developer tools. APZ is tricky to reason about and even the developers who write the Async pan & zoom logic can be surprised by what happens in complex real-world situations. And there ARE developer tools available that help you avoid needing to reason about this. Firefox OS has easy on-device developer tools that can help diagnose what’s going on or at least help tell you whether you’re making things faster or slower:
– it’s got a frames-per-second overlay; you do need to scroll like mad to get the system to want to render 60 frames-per-second, but it makes it clear what the net result is
– it has paint flashing that overlays random colors every time it paints the DOM into a layer. If the screen is flashing like a discotheque or has a lot of smeared rainbows, you know something’s wrong because the APZ logic is not able to just reuse its layers.
– devtools can enable drawing cool colored borders around the layers APZ has created so you can see if layerization is doing something crazy
There are also fancier and more complicated tools in Firefox and other browsers like Google Chrome to let you see what got painted, what the layer tree looks like, et cetera.
And that’s my spiel.
The source code to Gaia can be found at https://github.com/mozilla-b2g/gaia
The email app in particular can be found at https://github.com/mozilla-b2g/gaia/tree/master/apps/email
(I also asked for questions here.)
It’s that time of year again: we have a new major release of Lightning on the horizon. About every 42 weeks, Thunderbird prepares for a major release, and we follow up with a matching major version. You may know these as Lightning 2.6 or 3.3.
In order to avoid disappointments, we do a series of beta releases before such a major release. This is where we need you. Please help out in making Lightning 4.0 a great success.
Time flies when you are preparing for releases, so we are already at Thunderbird 38.0b3 and Lightning 4.0b3. The final release will be on May 12th and there will be at least one more beta. Please download these betas and take a moment to go through all the actions you normally do on a daily basis. Create an event, accept an invitation, complete a task. You probably have your own workflow; these are of course just examples.
Here is how to get the builds. If you have found an issue, you can either leave a comment here or file a bug on bugzilla.
- Thunderbird Beta: https://www.mozilla.org/thunderbird/all-beta.html
  - You can directly select the beta version for your language on this page
- Lightning Beta: https://addons.mozilla.org/thunderbird/addon/lightning/
  - Scroll down to the “Developer Channel”, expand the box and click on the download button
You may wonder what is new. I’ve gone through the bugs fixed since 3.3 and found that most issues are backend fixes that won’t be very visible. We do however have a great new feature to save copies of invitations to your calendar. This helps in case you don’t care about replying to the invitation but would still like to see it in your calendar. We also have more general improvements in invitation compatibility, performance and stability and some slight visual enhancements. The full list of changes can be found on bugzilla.
Although it’s highly unlikely that severe problems will arise, you are encouraged to make a backup before switching to beta. If it comforts you, I am using beta builds for my production profile and I don’t recall there being a time where I lost events or had to start over.
If you have questions or have found a bug, feel free to leave a comment here.
I guess that mostly comes from the impression that the whole story is our government watching (over) us and the worst thing that can happen is incrimination. While that might threaten some things, most people do nothing that is really interesting enough for a government to go into attack mode over it (or so they believe, and very firmly so). And I even agree that most governments (including the US and EU countries) actually actively seek out what they call "terrorist activities" (even though they often stretch that term in crazy ways) and/or child abuse and similar topics that the vast majority of citizens agree are a bad thing and are not part of - and the vast majority of politicians and government workers believe they act in the best interest of their citizens when "obviously fighting that" via their different programs of privacy-undermining surveillance. That said, most people seem to be OK with their government collecting data about them as long as it's not used to incriminate them (and when that happens, it's too late to protest the practice anyhow).
A lot has been said about that since the "Snowden leaks", but I think the more obvious short-term and direct threat is in corporate surveillance, which has been swept under the rug in most discussions recently - to the joy of Facebook, Google and other major players in that area. I have also seen that when I depict some obvious scenarios resulting from that, people start to think about it much more promptly and realize the effect on their daily lives (even if those are minor issues compared to a government starting a manhunt against you with terror allegations or similar).
So what I start asking is:
- Are you OK with banks determining your credit conditions based on all your comments on Facebook and your Google searches? ("Your friends say you owe them money, and that you live beyond your means, this is gonna be difficult...")
- Are you OK with insurers changing your rates based on all that data? ("Oh, so you 'like' all those videos about dangerous sports and that deafening music, and you have some quite aggressive or even violent friends - so you see why we need to go a bit higher there, right?")
- Are you OK with prices for flights or products in online stores (Amazon etc.) being different depending on what other things you have done on the web? ("So, you already planned that vacation at that location, good, so we can give you a higher airfare as you can't back out now anyhow.")
- And, of course, envision ads in public or half-public locations being customized for whoever is in the area. ("You recently searched for engagement rings, so we'll show ads for them wherever you go." or "Hey, this is the third time today we sat down and a screen nearby shows Viagra ads." or "My dear daughter, why do we see ads for diapers everywhere we go?")
And, of course, they are true to a degree even now. Banks are already buying data from Facebook, probably including "private" messages, for determining credit scores; insurers base rates on anything they can find out about you; flight rates as well as prices for some Amazon and other web shop products vary based on what you searched before; and ads both on your screen and even in postal mail get tailored to a profile built on all kinds of your online behavior. My questions above just take all of those another step forward - but a pretty realistic one in my opinion.
I hope thinking about questions like that makes people realize they might actually want to evade some of that and in the end they actually have something to hide.
And then, of course, that a non-profit like Mozilla, which doesn't seek to maximize money, can believably be on their side and help them regain some privacy where they - now - want to.
The next major release of Thunderbird, version 38, is now in beta and available for testing. You may download Thunderbird 38.0b1 here.
This version of Thunderbird is the first that is mostly managed by volunteer community members rather than by Mozilla staff. We have many new features, including:
- Message filtering when a message is sent or archived
- File-per-message local storage available for new accounts (maildir)
- Contact search over multiple address books
- Internationalized domain names for RSS feeds
- Allow expanded columns in the folder pane for folder size and counts
Release notes are available here.
There are still a couple of features missing from this beta that we hope to ship in the final version of Thunderbird 38. Those are:
- Ship Lightning calendar addon with Thunderbird with an opt-out dialog
- Use OAuth authentication with Gmail IMAP accounts
The primary fear, it seems, is that knowledge that the largest open-source email client was still receiving regular updates would impel its userbase to agitate for increased funding and maintenance of the client to help forestall potential threats to the open nature of email as well as to innovate in the space of providing usable and private communication channels. Such funding, however, would be an unaffordable luxury and would only distract Mozilla from its central goal of building developer productivity tooling. Persistent rumors that Mozilla would be willing to fund Thunderbird were it renamed Firefox Email were finally addressed with the comment, "such a renaming would violate our current policy that all projects be named Persona."
In the spirit of Twitter I will keep this blog post down to 140 characters. Check out @mozcalendar for more frequent updates on the project.