Mozilla Nederland LogoDe Nederlandse

Abonneren op feed Mozilla planet
Planet Mozilla -
Bijgewerkt: 2 dagen 20 uur geleden

Cameron Kaiser: Unexpected FPR30 changes because 2020

vr, 11/12/2020 - 06:36
Well, there were more casualties from the Great Floodgap Power Supply Kablooey of 2020, and one notable additional casualty, because 2020, natch, was my trusty former daily driver Quad G5. Yes, this is the machine that among other tasks builds TenFourFox. The issue is recoverable and I think I can get everything back in order, but due to work and the extent of what appears gone wrong it won't happen before the planned FPR30 release date on December 15 (taking into account it takes about 30 hours to run a complete build cycle).

If you've read this blog for any length of time, you know how much I like to be punctual with releases to parallel mainstream Firefox. However, there have been no reported problems from the beta and there are no major security issues that must be patched immediately, so there's a simple workaround: on Monday night Pacific time the beta will simply become the release. If you're already using the beta, then just keep on using it. Since I was already intending to do a security-only release after FPR30 and I wasn't planning to issue a beta for it anyway, anything left over from FPR30 will get rolled into FPR30 SPR1 and this will give me enough cushion to get the G5 back in working order (or at least dust off the spare) for that release on or about January 26. I'm sure all of you will get over it by then. :)

Categorieën: Mozilla-nl planet

Niko Matsakis: Rotating the compiler team leads

vr, 11/12/2020 - 06:00

Since we created the Rust teams, I have been serving as lead of two teams: the compiler team and the language design team (I’ve also been a member of the core team, which has no lead). For those less familiar with Rust’s governance, the compiler team is focused on the maintenance and implementation of the compiler itself (and, more recently, the standard library). The language design team is focused on the design aspects. Over that time, all the Rust teams have grown and evolved, with the compiler team in particular being home to a number of really strong members.

Last October, I announced that pnkfelix was joining me as compiler team co-lead. Today, I am stepping back from my role as compiler team co-lead altogether. After taking nominations from the compiler team, pnkfelix and I are proud to announce that wesleywiser will replace me as compiler team co-lead. If you don’t know Wesley, there’ll be an announcement on Inside Rust where you can learn a bit more about what he has done, but let me just say I am pleased as punch that he agreed to serve as co-lead. He’s going to do a great job.

You’re not getting rid of me this easily

Stepping back as compiler team co-lead does not mean I plan to step away from the compiler. In fact, quite the opposite. I’m still quite enthusiastic about pushing forward on ongoing implementaton efforts like the work to implement RFC 2229, or the development on chalk and polonius. In fact, I am hopeful that stepping back as co-lead will create more time for these efforts, as well as time to focus on leadership of the language design team.

Rotation is key

I see these changes to compiler team co-leads as fitting into a larger trend, one that I believe is going to be increasingly important in Rust: rotation of leadership. To me, the “corest of the core” value of the Rust project is the importance of “learning from others” – or as I put it in my rust-latam talk from 20191, “a commitment to a CoC and a culture that emphasizes curiosity and deep research”. Part of learning from others has to be actively seeking out fresh leadership and promoting them into positions of authority.

But rotation has a cost too

Another core value of Rust is recognizing the inevitability of tradeoffs2. Rotating leadership is no exception: there is a lot of value in having the same people lead for a long time, as they accumulate all kinds of context and skills. But it also means that you are missing out on the fresh energy and ideas that other people can bring to the problem. I feel confident that Felix and Wesley will help to shape the compiler team in ways that I never would’ve thought to do.

Rotation with intention

The tradeoff between experience and enthusiasm makes it all the more important, in my opinion, to rotate leadership intentionally. I am reminded of Emily Dunham’s classic post on leaving a team3, and how it was aimed at normalizing the idea of “retirement” from a team as something you could actively choose to do, rather than just waiting until you are too burned out to continue.

Wesley, Felix, and I have discussed the idea of “staggered terms” as co-leads. The idea is that you serve as co-lead for two years, but we select one new co-lead per year, with the oldest co-lead stepping back. This way, at every point you have a mix of a new co-lead and someone who has already done it for one year and has some experience.

Lang and compiler need separate leadership

Beyond rotation, another reason I would like to step back from being co-lead of the compiler team is that I don’t really think it makes sense to have one person lead two teams. It’s too much work to do both jobs well, for one thing, but I also think it works to the detriment of the teams. I think the compiler and lang team will work better if they each have their own, separate “advocates”.

I’m actually very curious to work with pnkfelix and Wesley to talk about how the teams ought to coordinate, since I’ve always felt we could do a better job. I would like us to be actively coordinating how we are going to manage the implementation work at the same time as we do the design, to help avoid unbounded queues. I would also like us to be doing a better job getting feedback from the implementation and experimentation stage into the lang team.

You might think having me be the lead of both teams would enable coordination, but I think it can have the opposite effect. Having separate leads for compiler and lang means that those leads must actively communicate and avoids the problem of one person just holding things in their head without realizing other people don’t share that context.

Idea: Deliberate team structures that enable rotation

In terms of the compiler team structure, I think there is room for us to introduce “rotation” as a concept in other ways as well. Recently, I’ve been kicking around an idea for “compiler team officers”4, which would introduce a number of defined roles, each of which is setup in with staggered terms to allow for structured handoff. I don’t think the current proposal is quite right, but I think it’s going in an intriguing direction.

This proposal is trying to address the fact that a successful open source organization needs more than coders, but all too often we fail to recognize and honor that work. Having fixed terms is important because when someone is willing to do that work, they can easily wind up getting stuck being the only one doing it, and they do that until they burn out. The proposal also aims to enable more “part-time” leadership within the compiler team, by making “finer grained” duties that don’t require as much time to complete.

  1. Oh-so-subtle plug: I really quite liked that talk. 

  2. Though not always the tradeoffs you expect. Read the post. 

  3. If you haven’t read it, stop reading now and go do so. Then come back. Or don’t. Just read it already. 

  4. I am not sure that ‘officer’ is the right word here, but I’m not sure what the best replacement is. I want something that conveys respect and responsibility. 

Categorieën: Mozilla-nl planet

Martin Thompson: Next Level Version Negotiation

vr, 11/12/2020 - 01:00

The IAB EDM Program[1] met this morning. While the overall goal of the meeting, we ended up talking a lot a document I wrote a while back and how to design version negotiation in protocols.

This post provides a bit of background and shares some of what we learned today after what was quite a productive discussion.

Protocol Ossification #

The subject of protocol ossification has been something of a live discussion in the past several years. The community has come to the realization that it is effectively impossible to extend many Internet protocols without causing a distressing number of problems with existing deployments. It seems like no protocol is unaffected[2]. IP, TCP, TLS, and HTTP all have various issues that prevent extensions from working correctly.

A number of approaches have been tried. HTTP/2, which was developed early in this process, was deployed only for HTTPS. Even though a cleartext variant was defined, many implementations explicitly decided not to implement that, partly motivated by these concerns. QUIC doubles down on this by encrypting as much as possible.

TLS 1.3, which was delayed by about a year by related problems, doesn't have that option so it ultimately used trickery to avoid notice by problematic middleboxes: TLS 1.3 looks a lot like TLS 1.2 unless you are paying close attention.

One experiment that turned out to quite successful in revealing ossification in TLS was GREASE. David Benjamin and Adam Langley, who maintain the TLS stack used by Google[3] found that inserting random values into different extension points had something of a cleansing effect on the TLS ecosystem. Several TLS implementations were found to be intolerant of new extensions.

One observation out of the experiments with TLS was that protocol elements that routinely saw new values, like cipher suites, were less prone to failing when previously unknown values were encountered. Those that hadn't seen new values as often, like server name types or signature schemes, were more likely to show problems. This caused Adam Langley to advise that protocols "have one joint and keep it well oiled."

draft-iab-use-it-or-lose-it explores the problem space a little more thoroughly. The draft looks at a bunch of different protocols and finds that in general the observations hold. The central thesis is that for an extension point to be usable, it needs to be actively used.

Version Negotiation #

The subject of the discussion today was version negotiation. Of all the extension points available in protocols, the one that often sees the least use is version negotation. A version negotiation mechanism has to exist in the first version of a protocol, but it is never really tested until the second version is deployed.

No matter how carefully the scheme is designed[4], the experience with TLS shows that even a well-designed scheme can fail.

The insight for today, thanks largely to Tommy Pauly, was that the observation about extension points could be harnessed to make version negotiation work. Tommy observed that some protocols don't design in-protocol version negotiation schemes, but instead rely on the protocol at the next layer down. And these protocols have been more successful at avoid some of the pitfalls inherent to version negotiation.

At the next layer down the stack, the codepoints for the higher-layer protocol are just extension codepoints. They aren't exceptional for the lower layer and they probably get more use. Therefore, these extension points are less likely to end up being ossified when the time comes to rely on them.

Supporting Examples #

Tommy offered a few examples and we discussed several others.

IPv6 was originally intended to use the IP EtherType (0x0800) in 802.1, with routers looking at the IP version number to determine how to handle packets. That didn't work out[5]. What did work was assigning IPv6 its own EtherType (0x86dd). This supports the idea that a function that was already in use for other reasons[6] was better able to support the upgrade than the in-protocol mechanisms that were originally designed for that purpose.

HTTP/2 was floated as another potential example of this effect. Though the original reason for adding ALPN was performance - we wanted to ensure that we wouldn't have to do another round trip after the TLS handshake to do Upgrade exchange - the effect is that negotiation of HTTP relied on a mechanism that was well-tested and proven at the TLS layer[7].

We observed that ALPN doesn't work for the HTTP/2 to HTTP/3 upgrade as these protocols don't share a protocol. Here, we observed that we would likely end up relying on SVCB and the HTTPS DNS record.

Carsten Bormann also pointed at SenML, which deliberately provides no inherent version negotiation. I suggest that this is an excellent example of relying on lower-layer negotiation, in this case the content negotiation functions provided by underlying protocols like CoAP or HTTP.

It didn't come up at the time, but one of my favourite examples comes from the people building web services at Mozilla. They do not include version numbers in URLs or hostnames for their APIs and they don't put version numbers in request or response formats. The reasoning being that, should they need to roll a new version that is incompatible with the current one, they can always deploy to a new domain name. I always appreciated the pragmatism of that approach, though I still see lots of /v1/ in public HTTP API documentation.

These all seem to provide good support for the basic idea.

Counterexamples #

Any rule like this isn't worth anything without counterexamples. Understanding counterexamples helps us understand what conditions are necessary for the theory to hold.

SNMP, which was already mentioned in the draft as having successfully managed a version transition using an in-band mechanism, was a particularly interesting case study. Several observations were made, suggesting several inter-connected reasons for success. It was observed that there was no especially strong reason to prefer SNMPv3 over SNMPv2 (or SNMPv2c), a factor which resulted in both SNMP versions coexisting for years.

There was an interesting sidebar at this point. It was observed that SNMP doesn't have any strong need to avoid version downgrade attacks in the way that a protocol like TLS might. Other protocols might not tolerate such phlegmatic coexistence.

While SNMP clients do include probing code to determine what protocols were supported. However, as network management systems include provisioning information for devices, it is usually the case that protocol support for managed devices is stored alongside other configuration. Thus we concluded that SNMP - to the extent that it even needs version upgrades - was closest to the "shove it in the DNS" approach used for the upgrade to HTTP/3.

In Practice #

The lesson here is that planning for the next version doesn't mean designing a version negotiation mechanism. It's possible that a perfectly good mechanism already exists. If it does, it's almost certainly better than anything you might cook up.

This is particularly gratifying to me as I had already begun following the practice of SenML with other work. For instance, RFC 8188 provides no in-band negotiation of version or even cryptographic agility. Instead, it relies on the existing content-coding negotiation mechanisms as a means of enabling its own eventual replacement. This was somewhat controversial at the time, especially the cryptographic agility part, but in retrospect it seems to be a good choice.

It's also good to have a strong basis for rejecting profligate addition of extension points in protocols[8], but now it seems like we have firm reasons to avoid designing version negotiation mechanisms into every protocol.

Maybe version negotiation can now be put better into context. Version negotiation might only belong in protocols at the lowest levels of the stack[9]. For most protocols, which probably need to run over TLS for other reasons, ALPN and maybe SVCB can stand in for version negotiation, with the bonus that these are specifically designed to avoid adding latency. HTTP APIs can move to a different URL.

As this seems solid, I now have the task of writing a brief summary of this conclusion for the next revision of the "use it or lose it" draft. That might take some time as there are a few open issues that need some attention.

  1. Not electronic dance music sadly, it's about Evolvability, Deployability, & Maintainability of Internet protocols ↩︎

  2. UDP maybe. UDP is simple enough that it doesn't have features/bugs. Not to say that it is squeaky clean, it has plenty of baggage, with checksum issues, a reputation for being used for DoS, and issues with flow termination in NATs. ↩︎

  3. BoringSSL, which is now used by a few others, including Cloudflare and Apple. ↩︎

  4. Section 4.1 of RFC 6709 contains some great advice on how to design a version negotiation scheme, so that you can learn from experience. Though pay attention to the disclaimer in the last paragraph. ↩︎

  5. No one on the call was paying sufficient attention at the time, so we don't know precisely why. We intend to find out, of course. ↩︎

  6. At the time, there was still reasonable cause to think that IP wouldn't be the only network layer protocol, so other values were being used routinely. ↩︎

  7. You might rightly observe here that ALPN was brand new for HTTP/2, so the mechanism itself wasn't exactly proven. This is true, but there are mitigating factors. The negotiation method is exactly the same as many other TLS extensions. And we tested the mechanism thoroughly during HTTP/2 deployment as each new revision from the -04 draft onwards was deployed widely with a different ALPN string. By the time HTTP/2 shipped, ALPN was definitely solid. ↩︎

  8. There is probably enough material for a long post on why this is not a problem in JSON, but I'll just assert for now - without support - that there really is only one viable extension point in any JSON usage. ↩︎

  9. It doesn't seem like TLS or QUIC can avoid having version negotiation. ↩︎

Categorieën: Mozilla-nl planet

The Rust Programming Language Blog: Launching the Lock Poisoning Survey

vr, 11/12/2020 - 01:00

The Libs team is looking at how we can improve the std::sync module, by potentially splitting it up into new modules and making some changes to APIs along the way. One of those API changes we're looking at is non-poisoning implementations of Mutex and RwLock. To find the best path forward we're conducting a survey to get a clearer picture of how the standard locks are used out in the wild.

The survey is a Google Form. You can fill it out here.

What is this survey for?

The survey is intended to answer the following questions:

  • When is poisoning on Mutex and RwLock being used deliberately.
  • Whether Mutex and RwLock (and their guard types) appear in the public API of libraries.
  • How much friction there is switching from the poisoning Mutex and RwLock locks to non-poisoning ones (such as from antidote or parking_lot).

This information will then inform an RFC that will set out a path to non-poisoning locks in the standard library. It may also give us a starting point for looking at the tangentially related UnwindSafe and RefUnwindSafe traits for panic safety.

Who is this survey for?

If you write code that uses locks then this survey is for you. That includes the standard library's Mutex and RwLock as well as locks from, such as antidote, parking_lot, and tokio::sync.

So what is poisoning anyway?

Let's say you have an Account that can update its balance:

impl Account { pub fn update_balance(&mut self, change: i32) { self.balance += change; self.changes.push(change); } }

Let's also say we have the invariant that balance == changes.sum(). We'll call this the balance invariant. So at any point when interacting with an Account you can always depend on its balance being the sum of its changes, thanks to the balance invariant.

There's a point in our update_balance method where the balance invariant isn't maintained though:

impl Account { pub fn update_balance(&mut self, change: i32) { self.balance += change; // self.balance != self.changes.sum() self.changes.push(change); } }

That seems ok, because we're in the middle of a method with exclusive access to our Account and everything is back to good when we return. There isn't a Result or ? to be seen so we know there's no chance of an early return before the balance invariant is restored. Or so we think.

What if self.changes.push didn't return normally? What if it panicked instead without actually doing anything? Then we'd return from update_balance early without restoring the balance invariant. That seems ok too, because a panic will start unwinding the thread it was called from, leaving no trace of any data it owned behind. Ignoring the Drop trait, no data means no broken invariants. Problem solved, right?

What if our Account wasn't owned by that thread that panicked? What if it was shared with other threads as a Arc<Mutex<Account>>? Unwinding one thread isn't going to protect other threads that could still access the Account, and they're not going to know that it's now invalid.

This is where poisoning comes in. The Mutex and RwLock types in the standard library use a strategy that makes panics (and by extension the possibility for broken invariants) observable. The next consumer of the lock, such as another thread that didn't unwind, can decide at that point what to do about it. This is done by storing a switch in the lock itself that's flipped when a panic causes a thread to unwind through its guard. Once that switch is flipped the lock is considered poisoned, and the next attempt to acquire it will receive an error instead of a guard.

The standard approach for dealing with a poisoned lock is to propagate the panic to the current thread by unwrapping the error it returns:

let mut guard = shared.lock().unwrap();

That way nobody can ever observe the possibly violated balance invariant on our shared Account.

That sounds great! So why would we want to remove it?

What's wrong with lock poisoning?

There's nothing wrong with poisoning itself. It's an excellent pattern for dealing with failures that can leave behind unworkable state. The question we're really asking is whether it should be used by the standard locks, which are std::sync::Mutex and std::sync::RwLock. We're asking whether it's a standard lock's job to implement poisoning. Just to avoid any confusion, we'll distinguish the poisoning pattern from the API of the standard locks by calling the former poisoning and the latter lock poisoning. We're just talking about lock poisoning.

In the previous section we motivated poisoning as a way to protect us from possibly broken invariants. Lock poisoning isn't actually a tool for doing this in the way you might think. In general, a poisoned lock can't tell whether or not any invariants are actually broken. It assumes that a lock is shared, so is likely going to outlive any individual thread that can access it. It also assumes that if a panic leaves any data behind then it's more likely to be left in an unexpected state, because panics aren't part of normal control flow in Rust. Everything could be fine after a panic, but the standard lock can't guarantee it. Since there's no guarantee there's an escape hatch. We can always still get access to the state guarded by a poisoned lock:

let mut guard = shared.lock().unwrap_or_else(|err| err.into_inner());

All Rust code needs to remain free from any possible undefined behavior in the presence of panics, so ignoring panics is always safe. Rust doesn't try guarantee all safe code is free from logic bugs, so broken invariants that don't potentially lead to undefined behavior aren't strictly considered unsafe. Since ignoring lock poisoning is also always safe it doesn't really give you a dependable tool to protect state from panics. You can always ignore it.

So lock poisoning doesn't give you a tool for guaranteeing safety in the presence of panics. What it does give you is a way to propagate those panics to other threads. The machinery needed to do this adds costs to using the standard locks. There's an ergonomic cost in having to call .lock().unwrap(), and a runtime cost in having to actually track state for panics.

With the standard locks you pay those costs whether you need to or not. That's not typically how APIs in the standard library work. Instead, you compose costs together so you only pay for what you need. Should it be a standard lock's job to synchronize access and propagate panics? We're not so sure it is. If it's not then what should we do about it? That's where the survey comes in. We'd like to get a better idea of how you use locks and poisoning in your projects to help decide what to do about lock poisoning. You can fill it out here.

Categorieën: Mozilla-nl planet

Wladimir Palant: How anti-fingerprinting extensions tend to make fingerprinting easier

do, 10/12/2020 - 14:57

Do you have a privacy protection extension installed in your browser? There are so many around, and every security vendor is promoting their own. Typically, these will provide a feature called “anti-fingerprinting” or “fingerprint protection” which is supposed to make you less identifiable on the web. What you won’t notice: this feature is almost universally flawed, potentially allowing even better fingerprinting.

Pig disguised as a bird but still clearly recognizable<figcaption> Image credits: OpenClipart </figcaption>

I’ve seen a number of extensions misimplement this functionality, yet I rarely bother to write a report. The effort to fully explain the problem is considerable. On the other hand, it is obvious that for most vendors privacy protection is merely a check that they can put on their feature list. Quality does not matter because no user will be able to tell whether their solution actually worked. With minimal resources available, my issue report is unlikely to cause a meaningful action.

That’s why I decided to explain the issues in a blog post, a typical extension will have at least three out of four. Next time I run across a browser extension suffering from all the same flaws I can send them a link to this post. And maybe some vendors will resolve the issues then. Or, even better, not even make these mistakes in the first place.

Contents How fingerprinting works

When you browse the web, you aren’t merely interacting with the website you are visiting but also with numerous third parties. Many of these have a huge interest in recognizing you reliably across different websites, advertisers for example want to “personalize” your ads. The traditional approach is storing a cookie in your browser which contains your unique identifier. However, modern browsers have a highly recommendable setting to clear cookies at the end of the browsing session. There is private browsing mode where no cookies are stored permanently. Further technical restrictions for third-party cookies are expected to be implemented soon, and EU data protection rules also make storing cookies complicated to say the least.

So cookies are becoming increasingly unreliable. Fingerprinting is supposed to solve this issue by recognizing individual users without storing any data on their end. The idea is to look at data about user’s system that browsers make available anyway, for example display resolution. It doesn’t matter what the data is, it should be:

  • sufficiently stable, ideally stay unchanged for months
  • unique to a sufficiently small group of people

Note that no data point needs to identify a single person by itself. If each of them refer to a different group of people, with enough data points the intersection of all these groups will always be a single person.

How anti-fingerprinting is supposed to work

The goal of anti-fingerprinting is reducing the amount and quality of data that can be used for fingerprinting. For example, CSS used to allow recognizing websites that the user visited before – a design flaw that could be used for fingerprinting among other things. It took quite some time and effort, but eventually the browsers found a fix that wouldn’t break the web. Today this data point is no longer available to websites.

Other data points remain but have been defused considerably. For example, browsers provide websites with a user agent string so that these know e.g. which browser brand and version they are dealing with. Applications installed by the users used to extend this user agent string with their own identifiers. Eventually, browser vendors recognized how this could be misused for fingerprinting and decided to remove any third-party additions. Much of the other information originally available here has been removed as well, so that today any user agent string is usually common to a large group of people.

Barking the wrong tree

Browser vendors have already invested a considerable amount of work into anti-fingerprinting. However, they usually limited themselves to measures which wouldn’t break existing websites. And while things like display resolution (unlike window size) aren’t considered by too many websites, these were apparently numerous enough that browsers still give them user’s display resolution and the available space (typically display resolution without the taskbar).

Privacy protection extensions on the other hand aren’t showing as much concern. So they will typically do something like:

screen.width = 1280; screen.height = 1024;

There you go, the website will now see the same display resolution for everybody, right? Well, that’s unless the website does this:

delete screen.width; delete screen.height;

And suddenly screen.width and screen.height are restored to their original values. Fingerprinting can now use two data points instead of one: not merely the real display resolution but also the fake one. Even if that fake display resolution were extremely common, it would still make the fingerprint slightly more precise.

Is this magic? No, just how JavaScript prototypes work. See, these properties are not defined on the screen object itself, they are part of the object’s prototype. So that privacy extension added an override for prototype’s properties. With the override removed the original properties became visible again.

So is this the correct way to do it?

Object.defineProperty(Screen.prototype, "width", {value: 1280}); Object.defineProperty(Screen.prototype, "height", {value: 1024});

Much better. The website can no longer retrieve the original value easily. However, it can detect that the value has been manipulated by calling Object.getOwnPropertyDescriptor(Screen.prototype, "width"). Normally the resulting property descriptor would contain a getter, this one has a static value however. And the fact that a privacy extension is messing with the values is again a usable data point.

Let’s try it without changing the property descriptor:

Object.defineProperty(Screen.prototype, "width", {get: () => 1280}); Object.defineProperty(Screen.prototype, "height", {get: () => 1024});

Almost there. But now the website can call Object.getOwnPropertyDescriptor(Screen.prototype, "width").get.toString() to see the source code of our getter. Again a data point which could be used for fingerprinting. The source code needs to be hidden:

Object.defineProperty(Screen.prototype, "width", {get: (() => 1280).bind(null)}); Object.defineProperty(Screen.prototype, "height", {get: (() => 1024).bind(null)});

This bind() call makes sure the getter looks like a native function. Exactly what we needed.

Update (2020-12-14): Firefox allows content scripts to call exportFunction() which is a better way to do this. In particular, it doesn’t require injecting any code into web page context. Unfortunately, this functionality isn’t available in Chromium-based browsers. Thanks to kkapsner for pointing me towards this functionality.

Catching all those pesky frames

There is a complication here: a website doesn’t merely have one JavaScript execution context, it has one for each frame. So you have to make sure your content script runs in all these frames. And so browser extensions will typically specify "all_frames": true in their manifest. And that’s correct. But then the website does something like this:

var frame = document.createElement("iframe"); document.body.appendChild(frame); console.log(screen.width, frame.contentWindow.screen.width);

Why is this newly created frame still reporting the original display width? We are back at square one: the website again has two data points to work with instead of one.

The problem here: if frame location isn’t set, the default is to load the special page about:blank. When Chrome developers created their extension APIs originally they didn’t give extensions any way to run content scripts here. Luckily, this loophole has been closed by now, but the extension manifest has to set "match_about_blank": true as well.

Timing woes

As anti-fingerprinting functionality in browser extensions is rather invasive, it is prone to breaking websites. So it is important to let users disable this functionality on specific websites. This is why you will often see code like this in extension content scripts:

chrome.runtime.sendMessage("AmIEnabled", function(enabled) { if (enabled) init(); });

So rather than initializing all the anti-fingerprinting measures immediately, this content script first waits for the extension’s background page to tell it whether it is actually supposed to do anything. This gives the website the necessary time to store all the relevant values before these are changed. It could even come back later and check out the modified values as well – once again, two data points are better than one.

This is an important limitation of Chrome’s extension architecture which is sadly shared by all browsers today. It is possible to run a content script before any webpage scripts can run ("run_at": "document_start"). This will only be a static script however, not knowing any of the extension state. And requesting extension state takes time.

This might eventually get solved by dynamic content script support, a request originally created ten years ago. In the meantime however, it seems that the only viable solution is to initialize anti-fingerprinting immediately. If the extension later says “no, you are disabled” – well, then the content script will just have to undo all manipulations. But this approach makes sure that in the common scenario (functionality is enabled) websites won’t see two variants of the same data.

The art of faking

Let’s say that all the technical issues are solved. The mechanism for installing fake values works flawlessly. This still leaves a question: how does one choose the “right” fake value?

How about choosing a random value? My display resolution is 1661×3351, now fingerprint that! As funny as this is, fingerprinting doesn’t rely on data that makes sense. All it needs is data that is stable and sufficiently unique. And that display resolution is certainly extremely unique. Now one could come up with schemes to change this value regularly, but fact is: making users stand out isn’t the right way.

What you’d rather want is finding the largest group out there and joining it. My display resolution is 1920×1080 – just the common Full HD, nothing to see here! Want to know my available display space? I have my Windows taskbar at the bottom, just like everyone else. No, I didn’t resize it either. I’m just your average Joe.

The only trouble with this approach: the values have to be re-evaluated regularly. Two decades ago, 1024×768 was the most common display resolution and a good choice for anti-fingerprinting. Today, someone claiming to have this screen size would certainly stick out. Similarly, in my website logs visitors claiming to use Firefox 48 are noticeable: it might have been a common browser version some years ago, but today it’s usually bots merely pretending to be website visitors.

Categorieën: Mozilla-nl planet

Mike Taylor: Differences in cookie length (size?) restrictions

do, 10/12/2020 - 07:00

I was digging through some of the old http-state tests (which got ported into web-platform-tests, and which I’m rewriting to be more modern and, mostly work?) and noticed an interesting difference between Chrome and Firefox in disabled-chromium0020-test (no idea why it’s called disabled and not, in fact, disabled).

That test looks something like:

Set-Cookie: aaaaaaaaaaaaa....(repeating a's for seemingly forever)

But first, a little background on expected behavior so you can begin to care.

rfc6265 talks about cookie size limits like so:

At least 4096 bytes per cookie (as measured by the sum of the length of the cookie’s name, value, and attributes).

(It’s actually trying to say at most, which confuses me, but a lot of things confuse me on the daily.)

So in my re-written version of disabled-chromium0020-test I’ve got (just assume a function that consumes this object and does something useful):

{ // 7 + 4089 = 4096 cookie: `test=11${"a".repeat(4089)}`, expected: `test=11${"a".repeat(4089)}`, name: "Set cookie with large value ( = 4kb)", },

Firefox and Chrome are happy to set that cookie. Fantastic. So naturally we want to test a cookie with 4097 bytes and make sure the cookie gets ignored:

// 7 + 4091 = 4098 { cookie: `test=12${"a".repeat(4091)}`, expected: "", name: "Ignore cookie with large value ( > 4kb)", },

If you’re paying attention, and good at like, reading and math, you’ll notice that 4096 + 1 is not 4098. A+ work.

What I discovered, much in the same way that Columbus discovered Texas, is that a “cookie string” that is 4097 bytes long currently has different behaviors in Firefox and Chrome (and probably most browsers, TBQH). Firefox (sort of correctly, according to the current spec language, if you ignore attributes) will only consider the name length + the value length, while Chrome will consider the entire cookie string including name, =, value, and all the attributes when enforcing the limit.

I’m going to include the current implementations here, because it makes me look smart (and I’m trying to juice SEO):

Gecko (which sets kMaxBytesPerCookie to 4096):

bool CookieCommons::CheckNameAndValueSize(const CookieStruct& aCookieData) { // reject cookie if it's over the size limit, per RFC2109 return ( + aCookieData.value().Length()) <= kMaxBytesPerCookie; }

Chromium (which sets kMaxCookieSize to 4096):

ParsedCookie::ParsedCookie(const std::string& cookie_line) { if (cookie_line.size() > kMaxCookieSize) { DVLOG(1) << "Not parsing cookie, too large: " << cookie_line.size(); return; } ParseTokenValuePairs(cookie_line); if (!pairs_.empty()) SetupAttributes(); }

Neither are really following what the spec says, so until the spec tightens that down (it’s currently only a SHOULD-level requirement which is how you say pretty-please in a game of telephone from browser engineers to computers), I bumped the test up to 4098 bytes so both browsers consistently ignore the cookie.

(I spent about 30 seconds testing in Safari, and it seems to match Firefox at least for the 4097 “cookie string” limit, but who knows what it does with attributes.)

Categorieën: Mozilla-nl planet

Cameron Kaiser: Floodgap downtime fixed

wo, 09/12/2020 - 23:32
I assume some of you will have noticed that Floodgap was down for a couple of days -- though I wouldn't know, since it wasn't receiving E-mail during the downtime. Being 2020 the problem turned out to be a cavalcade of simultaneous major failures including the complete loss of the main network backbone's power supply. Thus is the occasional "joy" of running a home server room. It is now on a backup rerouted supply while new parts are ordered and all services including TenFourFox and should be back up and running. Note that there will be some reduced redundancy until I can effect definitive repairs but most users shouldn't be affected.
Categorieën: Mozilla-nl planet

Mozilla Privacy Blog: Mozilla teams up with Twitter, Automattic, and Vimeo to provide recommendations on EU content responsibility

wo, 09/12/2020 - 09:30

The European Commission will soon unveil its landmark Digital Services Act draft law, that will set out a vision for the future of online content responsibility in the EU. We’ve joined up with Twitter, Auttomattic, and Vimeo to provide recommendations on how the EU’s novel proposals can ensure a more thoughtful approach to addressing illegal and harmful content in the EU, in a way that tackles online harms while safeguarding smaller companies’ ability to compete.

As we note in our letter,

“The present conversation is too often framed through the prism of content removal alone, where success is judged solely in terms of ever-more content removal in ever-shorter periods of time.

Without question, illegal content – including terrorist content and child sexual abuse material – must be removed expeditiously. Yet by limiting policy options to a solely stay up-come down binary, we forgo promising alternatives that could better address the spread and impact of problematic content while safeguarding rights and the potential for smaller companies to compete.

Indeed, removing content cannot be the sole paradigm of Internet policy, particularly when concerned with the phenomenon of ‘legal-but-harmful’ content. Such an approach would benefit only the very largest companies in our industry.

We therefore encourage a content moderation discussion that emphasises the difference between illegal and harmful content and highlights the potential of interventions that address how content is surfaced and discovered. Included in this is how consumers are offered real choice in the curation of their online environment.”

We look forward to working with lawmakers in the EU to help bring this vision for a healthier internet to fruition in the upcoming Digital Services Act deliberations.

You can read the full letter to EU lawmakers here and more background on our engagement with the EU DSA here.

The post Mozilla teams up with Twitter, Automattic, and Vimeo to provide recommendations on EU content responsibility appeared first on Open Policy & Advocacy.

Categorieën: Mozilla-nl planet

Ryan Harter: Leading with Data - Cascading Metrics

wo, 09/12/2020 - 09:00

It's surprisingly hard to lead a company with data. There's a lot written about how to set good goals and how to avoid common pitfalls (like Surrogation) but I haven't seen much written about the practicalities of taking action on these metrics.

I spent most of this year working with …

Categorieën: Mozilla-nl planet

Daniel Stenberg: curl 7.74.0 with HSTS

wo, 09/12/2020 - 07:51

Welcome to another curl release, 56 days since the previous one.

Release presentation Numbers

the 196th release
1 change
56 days (total: 8,301)

107 bug fixes (total: 6,569)
167 commits (total: 26,484)
0 new public libcurl function (total: 85)
6 new curl_easy_setopt() option (total: 284)

1 new curl command line option (total: 235)
46 contributors, 22 new (total: 2,292)
22 authors, 8 new (total: 843)
3 security fixes (total: 98)
1,600 USD paid in Bug Bounties (total: 4,400 USD)


This time around we have no less than three vulnerabilities fixed and as shown above we’ve paid 1,600 USD in reward money this time, out of which the reporter of the CVE-2020-8286 issue got the new record amount 900 USD. The second one didn’t get any reward simply because it was not claimed. In this single release we doubled the number of vulnerabilities we’ve published this year!

The six announced CVEs during 2020 still means this has been a better year than each of the six previous years (2014-2019) and we have to go all the way back to 2013 to find a year with fewer CVEs reported.

I’m very happy and proud that we as an small independent open source project can reward these skilled security researchers like this. Much thanks to our generous sponsors of course.

CVE-2020-8284: trusting FTP PASV responses

When curl performs a passive FTP transfer, it first tries the EPSV command and if that is not supported, it falls back to using PASV. Passive mode is what curl uses by default.

A server response to a PASV command includes the (IPv4) address and port number for the client to connect back to in order to perform the actual data transfer.

This is how the FTP protocol is designed to work.

A malicious server can use the PASV response to trick curl into connecting back to a given IP address and port, and this way potentially make curl extract information about services that are otherwise private and not disclosed, for example doing port scanning and service banner extractions.

If curl operates on a URL provided by a user (which by all means is an unwise setup), a user can exploit that and pass in a URL to a malicious FTP server instance without needing any server breach to perform the attack.

There’s no really good solution or fix to this, as this is how FTP works, but starting in curl 7.74.0, curl will default to ignoring the IP address in the PASV response and instead just use the address it already uses for the control connection. In other words, we will enable the CURLOPT_FTP_SKIP_PASV_IP option by default! This will cause problems for some rare use cases (which then have to disable this), but we still think it’s worth doing.

CVE-2020-8285: FTP wildcard stack overflow

libcurl offers a wildcard matching functionality, which allows a callback (set with CURLOPT_CHUNK_BGN_FUNCTION) to return information back to libcurl on how to handle a specific entry in a directory when libcurl iterates over a list of all available entries.

When this callback returns CURL_CHUNK_BGN_FUNC_SKIP, to tell libcurl to not deal with that file, the internal function in libcurl then calls itself recursively to handle the next directory entry.

If there’s a sufficient amount of file entries and if the callback returns “skip” enough number of times, libcurl runs out of stack space. The exact amount will of course vary with platforms, compilers and other environmental factors.

The content of the remote directory is not kept on the stack, so it seems hard for the attacker to control exactly what data that overwrites the stack – however it remains a Denial-Of-Service vector as a malicious user who controls a server that a libcurl-using application works with under these premises can trigger a crash.

CVE-2020-8286: Inferior OCSP verification

libcurl offers “OCSP stapling” via the CURLOPT_SSL_VERIFYSTATUS option. When set, libcurl verifies the OCSP response that a server responds with as part of the TLS handshake. It then aborts the TLS negotiation if something is wrong with the response. The same feature can be enabled with --cert-status using the curl tool.

As part of the OCSP response verification, a client should verify that the response is indeed set out for the correct certificate. This step was not performed by libcurl when built or told to use OpenSSL as TLS backend.

This flaw would allow an attacker, who perhaps could have breached a TLS server, to provide a fraudulent OCSP response that would appear fine, instead of the real one. Like if the original certificate actually has been revoked.


There’s really only one “change” this time, and it is an experimental one which means you need to enable it explicitly in the build to get to try it out. We discourage people from using this in production until we no longer consider it experimental but we will of course appreciate feedback on it and help to perfect it.

The change in this release introduces no less than 6 new easy setopts for the library and one command line option: support HTTP Strict-Transport-Security, also known as HSTS. This is a system for HTTPS hosts to tell clients to attempt to contact them over insecure methods (ie clear text HTTP).

One entry-point to the libcurl options for HSTS is the CURLOPT_HSTS_CTRL man page.


Yet another release with over one hundred bug-fixes accounted for. I’ve selected a few interesting ones that I decided to highlight below.

enable alt-svc in the build by default

We landed the code and support for alt-svc: headers in early 2019 marked as “experimental”. We feel the time has come for this little baby to grow up and step out into the real world so we removed the labeling and we made sure the support is enabled by default in builds (you can still disable it if you want).

8 cmake fixes bring cmake closer to autotools level

In curl 7.73.0 we removed the “scary warning” from the cmake build that warned users that the cmake build setup might be inferior. The goal was to get more people to use it, and then by extension help out to fix it. The trick might have worked and we’ve gotten several improvements to the cmake build in this cycle. More over, we’ve gotten a whole slew of new bug reports on it as well so now we have a list of known cmake issues in the KNOWN_BUGS document, ready for interested contributors to dig into!

configure now uses pkg-config to find openSSL when cross-compiling

Just one of those tiny weird things. At some point in the past someone had trouble building OpenSSL cross-compiled when pkg-config was used so it got disabled. I don’t recall the details. This time someone had the reversed problem so now the configure script was fixed again to properly use pkg-config even when cross-compiling… is the new home

You know it.

curl: only warn not fail, if not finding the home dir

The curl tool attempts to find the user’s home dir, the user who invokes the command, in order to look for some files there. For example the .curlrc file. More importantly, when doing SSH related protocol it is somewhat important to find the file ~/.ssh/known_hosts. So important that the tool would abort if not found. Still, a command line can still work without that during various circumstances and in particular if -k is used so bailing out like that was nothing but wrong…

curl_easy_escape: limit output string length to 3 * max input

In general, libcurl enforces an internal string length limit that prevents any string to grow larger than 8MB. This is done to prevent mistakes or abuse. Due a mistake, the string length limit was enforced wrongly in the curl_easy_escape function which could make the limit a third of the intended size: 2.67 MB.

only set USE_RESOLVE_ON_IPS for Apple’s native resolver use

This define is set internally when the resolver function is used even when a plain IP address is given. On macOS for example, the resolver functions are used to do some conversions and thus this is necessary, while for other resolver libraries we avoid the resolver call when we can convert the IP number to binary internally more effectively.

By a mistake we had enabled this “call getaddrinfo() anyway”-logic even when curl was built to use c-ares on macOS.

fix memory leaks in GnuTLS backend

We used two functions to extract information from the server certificate that didn’t properly free the memory after use. We’ve filed subsequent bug reports in the GnuTLS project asking them to make the required steps much clearer in their documentation so that perhaps other projects can avoid the same mistake going forward.

libssh2: fix transport over HTTPS proxy

SFTP file transfers didn’t work correctly since previous fixes obviously weren’t thorough enough. This fix has been confirmed fine in use.

make curl –retry work for HTTP 408 responses too

Again. We made the --retry logic work for 408 once before, but for some inexplicable reasons the support for that was accidentally dropped when we introduced parallel transfer support in curl. Regression fixed!

use OPENSSL_init_ssl() with >= 1.1.0

Initializing the OpenSSL library the correct way is a task that sounds easy but always been a source for problems and misunderstandings and it has never been properly documented. It is a long and boring story that has been going on for a very long time. This time, we add yet another chapter to this novel when we start using this function call when OpenSSL 1.1.0 or later (or BoringSSL) is used in the build. Hopefully, this is one of the last chapters in this book.

“scheme-less URLs” not longer accept blank port number

curl operates on “URLs”, but as a special shortcut it also supports URLs without the scheme. For example just a plain host name. Such input isn’t at all by any standards an actual URL or URI; curl was made to handle such input to mimic how browsers work. curl “guesses” what scheme the given name is meant to have, and for most names it will go with HTTP.

Further, a URL can provide a specific port number using a colon and a port number following the host name, like “hostname:80” and the path then follows the port number: “hostname:80/path“. To complicate matters, the port number can be blank, and the path can start with more than one slash: “hostname://path“.

curl’s logic that determines if a given input string has a scheme present checks the first 40 bytes of the string for a :// sequence and if that is deemed not present, curl determines that this is a scheme-less host name.

This means [39-letter string]:// as input is treated as a URL with a scheme and a scheme that curl doesn’t know about and therefore is rejected as an input, while [40-letter string]:// is considered a host name with a blank port number field and a path that starts with double slash!

In 7.74.0 we remove that potentially confusing difference. If the URL is determined to not have a scheme, it will not be accepted if it also has a blank port number!

Categorieën: Mozilla-nl planet

Martin Thompson: Oblivious DoH

wo, 09/12/2020 - 01:00

Today we heard an announcement that Cloudflare, Apple, and Fastly are collaborating on a new technology for improving privacy of DNS queries using a technology they call Oblivious DoH (ODoH).

This is an exciting development. This posting examines the technology in more detail and looks at some of the challenges this will need to overcome before it can be deployed more widely.

How ODoH Provides Privacy for DNS Queries #

Oblivious DoH is a simple mixnet protocol for making DNS queries. It uses a proxy server to provide added privacy for query streams.

This looks something like:

digraph ODoH { graph [overlap=true, splines=line, nodesep=1.0, ordering=out]; node [shape=rectangle, fontname=" "]; edge [arrowhead=none]; { rank=same; Client->Proxy; Proxy->Resolver; } }

A common criticism of DNS over HTTPS (DoH) is that it provides DoH resolvers with lots of privacy-sensitive information[1]. Currently all DNS resolvers, including DoH resolvers, see the contents of queries and can link that to who is making those queries. DoH includes connection reuse, so resolvers can link requests from the same client using the connection.

In Oblivious DoH, a proxy aggregates queries from multiple clients so that the resolver is unable to link queries to individual clients. ODoH protects the IP address of the client, but it also prevents the resolver from linking queries from the same client together. Unlike an ordinary HTTP proxy, which handle TLS connections to servers[2], ODoH proxies handle queries that are individually encrypted.

ODoH prevents resolvers from assembling profiles on clients by collecting the queries they make, because resolvers see queries from a large number of clients all mixed together.

An ODoH proxy learns almost nothing from this process as ODoH uses HPKE to encrypt the both query and answer with keys chosen by the client and resolver.

The privacy benefits of ODoH can only be undone if both the proxy and resolver cooperate. ODoH therefore recommends that the two services be run independently, with the operator of each making a commitment to respecting privacy.

Costs #

The privacy advantages provided by the ODoH design come at a higher cost than DoH, where a client just queries the resolver directly:

  • The proxy adds a little latency as it needs to forward queries and responses.
  • HPKE encryption adds up to about 100 bytes to each query.
  • The client and resolver need to spend a little CPU time to add and remove the encryption.

Cloudflare's tests show that the overall effect of ODoH on performance is quite modest. These early tests even suggest some improvement for the slowest queries. If those performance gains can be kept as they scale up their deployment, that would be strong justification for deployment.

Why This Design #

A similar outcome might be achieved using a proxy that supports HTTP CONNECT. However, in order to avoid the resolver from learning which queries come from the same client, each query would have to use a new connection.

That gets pretty expensive. While you might be able to tricks to drive down latency like sending the TLS handshake with the HTTP CONNECT, it means that every request uses a separate TCP connection and a round trip to establish the connection[3].

It is also possible to use something like Tor, which provides superior privacy protection. Tor is a lot more expensive.

Using HPKE and a multiplexed protocol like HTTP/2 or HTTP/3 avoids per-query connection setup costs. However, the most important thing is that it involves only minimal additional latency to get the privacy benefits[4].

Key Management in DNS #

The proposal puts HPKE keys for the resolver in the DNS[5]. The idea is that clients can talk to the resolver directly to get these, then use that information to protect its queries. As the keys are DNS records, they can be retrieved from any DNS resolver, which is a potential advantage.

This also means that this ODoH design depends on DNSSEC. Many clients rely on their resolver to perform DNSSEC validation, which doesn't help here. So this makes it difficult to deploy something like this incrementally in clients.

A better option might be to offer the HPKE public key information in response to a direct HTTP request to the resolver. That would ensure that the key could be authenticated by the client using HTTPS and the Web PKI.

Trustworthiness of Proxies #

Both client and resolver will want to authenticate the proxy and only allow a trustworthy proxy. The protocol design means that the need for trust in the proxy is limited, but it isn't zero.

Clients need to trust that the proxy is hiding their IP address. A bad proxy could attach the client IP address to every query they forward. Clients will want some way of knowing that the proxy won't do this[6].

Resolvers will likely want to limit the number of proxies that they will accept requests from, because the aggregated queries from a proxy of any reasonable size will look a lot like a denial of service attack. Mixing all the queries together denies resolvers the ability to do per-client rate limiting, which is a valuable denial of service protection measure. Resolvers will need to apply much more generous rate limits for these proxies and trust that the proxies will take reasonable steps to ensure that individual clients are not able to generate abusive numbers of queries.

This means that proxies will need to be acceptable to both client and resolver. Early deployments will be able to rely on contracts and similar arrangements to guarantee this. However, if use of ODoH is to scale out to support large numbers of providers of both proxies and resolvers, it could be necessary to build systems for managing these relationships.

Proxying For Other Applications #

One obvious thing with this design is that it isn't unique to DNS queries. In fact, there are a large number of request-response exchanges that would benefit from the same privacy benefits that ODoH provides. For example, Google this week announced a trial of a similar technology for preloading content.

A generic design that enabled protection for HTTP queries of any sort would be ideal. My hope is that we can design that protocol.

Once you look to designing a more generic solution, there are a few extra things that might improve the design. Automatic discovery of HTTP endpoints that allow oblivious proxying is one potential enhancement. Servers could advertise both keys and the proxies they support so that clients can choose to use those proxies to mask their address. This might involve automated proxy selection or discovery and even systems for encoding agreements. There are lots of possibilities here.

Centralization #

One criticism regarding DoH deployments is that they encourage consolidation of DNS resolver services. For ODoH - at least in the short term - options for ODoH resolvers will be limited, which could push usage toward a small number of server operators in exchange for the privacy gains ODoH provides.

During initial roll-out, the number of proxy operators will be limited. Also, using a larger proxy means that your queries are mixed in with more queries from other people, providing marginally better privacy. That might provide some impetus to consolidate.

Deploying automated discovery systems for acceptable proxies might help mitigate the worst centralization effects, but it seems likely that this will not be a feature of early deployments.

In the end, it would be a mistake to cry "centralization" in response to early trial deployments of a technology, which are naturally limited in scope. Furthermore, it's hard to know what the long term impact on the ecosystem will be. We might never be able to separate the effect of existing trends toward consolidation from the effect of new technology.

Conclusion #

I like the model adopted here. The use of a proxy neatly addresses one of the biggest concerns with the rollout of DoH: the privacy risk of having a large provider being able to gather information about streams of queries that can be linked to your IP address.

ODoH breaks streams of queries into discrete transactions that are hard to assemble into activity profiles. At the same time, ODoH makes it hard to attribute queries to individuals as it hides the origin of queries.

My sense is that the benefits very much outweigh the performance costs, the protocol complexity, and the operational risks. ODoH is a pretty big privacy win for name resolution. The state of name resolution is pretty poor, with much of it still unprotected from snooping, interception, and poisoning. The deployment of DoH went some way to address that, but came with some drawbacks. Oblivious DoH takes the next logical step.

  1. This is something all current DNS resolvers get, but the complaint is about the scale at which this information is gathered. Some people are unhappy that network operators are unable to access this information, but I regard that as a feature. ↩︎

  2. OK, proxies do handle individual, unencrypted HTTP requests, but that capability is hardly ever used any more now that 90% of the web is HTTPS. ↩︎

  3. Using 0-RTT doesn't work here without some fiddly changes to TLS because the session ticket used for TLS allows the server to link connections together, which isn't what we need. ↩︎

  4. This also makes ODoH far more susceptible to traffic analysis, but it relies on volume and the relative similarity of DNS queries to help manage that risk. ↩︎

  5. The recursion here means that the designers of ODoH probably deserve a prize of some sort. ↩︎

  6. The willful IP blindness proposal goes into more detail on what might be required for this. ↩︎

Categorieën: Mozilla-nl planet

The Mozilla Blog: Why getting voting right is hard, Part I: Introduction and Requirements

di, 08/12/2020 - 19:24

Every two years around this time, the US has an election and the rest of the world marvels and asks itself one question: What the heck is going on with US elections? I’m not talking about US politics here but about the voting systems (machines, paper, etc.) that people use to vote, which are bafflingly complex. While it’s true that American voting is a chaotic patchwork of different systems scattered across jurisdictions, running efficient secure elections is a genuinely hard problem. This is often surprising to people who are used to other systems that demand precise accounting such as banking/ATMs or large scale databases, but the truth is that voting is fundamentally different and much harder.

In this series I’ll be going through a variety of different voting systems so you can see how this works in practice. This post provides a brief overview of the basic requirements for voting systems. We’ll go into more detail about the practical impact of these requirements as we examine each system.


To understand voting systems design, we first need to understand the requirements to which they are designed. These vary somewhat, but generally look something like the below.

Efficient Correct Tabulation

This requirement is basically trivial: collect the ballots and tally them up. The winner is the one with the most votes 1. You also need to do it at scale and within a reasonable period of time otherwise there’s not much point.

Verifiable Results

It’s not enough for the election just to produce the right result, it must also do so in a verifiable fashion. As voting researcher Dan Wallach is fond of saying, the purpose of elections is to convince the loser that they actually lost, and that means more than just trusting the election officials to count the votes correctly. Ideally, everyone in world would be able to check for themselves that the votes had been correctly tabulated (this is often called “public verifiability”), but in real-world systems it usually means that some set of election observers can personally observe parts of the process and hopefully be persuaded it was conducted correctly.

Secrecy of the Ballot

The next major requirement is what’s called “secrecy of the ballot”, i.e., ensuring that others can’t tell how you voted. Without ballot secrecy, people could be pressured to vote certain ways or face negative consequences for their votes. Ballot secrecy actually has two components (1) other people — including election officials — can’t tell how you voted and (2) you can’t prove to other people how you voted. The first component is needed to prevent wholesale retaliation and/or rewards and the second is needed to prevent retail vote buying. The actual level of ballot secrecy provided by systems varies. For instance, the UK system technically allows election officials to match ballots to the voter, but prevents it with procedural controls and vote by mail systems generally don’t do a great job of preventing you from proving how you voted, but in general most voting systems attempt to provide some level of ballot secrecy.2


Finally, we want voting systems to be accessible, both in the specific sense that we want people with disabilities to be able to vote and in the more general sense that we want it to be generally easy for people to vote. Because the voting-eligible population is so large and people’s situations are so varied, this often means that systems have to make accommodations, for instance for overseas or military voters or for people who speak different languages.

Limited Trust

As you’ve probably noticed, one common theme in these requirements is the desire to limit the amount of trust you place in any one entity or person. For instance, when I worked the polls in Santa Clara county elections, we would collect all the paper ballots and put them in tamper-evident bags before taking them back to election central for processing. This makes it harder for the person transporting the ballots to examine the ballots or substitute their own. For those who aren’t used to the way security people think, this often feels like saying that election officials aren’t trustworthy, but really what it’s saying is that elections are very high stakes events and critical systems like this should be designed with as few failure points as possible, and that includes preventing both outsider and insider threats, protecting even against authorized election workers themselves.

An Overconstrained Problem

Individually each of these requirements is fairly easy to meet, but the combination of them turns out to be extremely hard. For example if you publish everyone’s ballots then it’s (relatively) easy to ensure that the ballots were counted correctly, but you’ve just completely give up secrecy of the ballot.3 Conversely, if you just trust election officials to count all the votes, then it’s much easier to provide secrecy from everyone else. But these properties are both important, and hard to provide simultaneously. This tension is at the heart of why voting is so much more difficult than other superficially systems like banking. After all, your transactions aren’t secret from the bank. In general, what we find is that voting systems may not completely meet all the requirements but rather compromise on trying to do a good job on most/all of them.

Up Next: Hand-Counted Paper Ballots

In the next post, I’ll be covering what is probably the simplest common voting system: hand-counted paper ballots. This system actually isn’t that common in the US for reasons I’ll go into, but it’s widely used outside the US and provides a good introduction into some of the problems with running a real election.

  1. For the purpose of this series, we’ll mostly be assuming first past the post systems, which are the main systems in use in the US.
  2. Note that I’m talking here about systems designed for use by ordinary citizens. Legislative voting, judicial voting, etc. are qualitatively different: they usually have a much smaller number of voters and don’t try to preserve the secrecy of the ballot, so the problem is much simpler. 
  3. Thanks to Hovav Shacham for this example. 

The post Why getting voting right is hard, Part I: Introduction and Requirements appeared first on The Mozilla Blog.

Categorieën: Mozilla-nl planet

Hacks.Mozilla.Org: An update on MDN Web Docs’ localization strategy

di, 08/12/2020 - 17:20

In our previous post — MDN Web Docs evolves! Lowdown on the upcoming new platform — we talked about many aspects of the new MDN Web Docs platform that we’re launching on December 14th. In this post, we’ll look at one aspect in more detail — how we are handling localization going forward. We’ll talk about how our thinking has changed since our previous post, and detail our updated course of action.

Updated course of action

Based on thoughtful feedback from the community, we did some additional investigation and determined a stronger, clearer path forward.

First of all, we want to keep a clear focus on work leading up to the launch of our new platform, and making sure the overall system works smoothly. This means that upon launch, we still plan to display translations in all existing locales, but they will all initially be frozen — read-only, not editable.

We were considering automated translations as the main way forward. One key issue was that automated translations into European languages are seen as an acceptable solution, but automated translations into CJK languages are far from ideal — they have a very different structure to English and European languages, plus many Europeans are able to read English well enough to fall back on English documentation when required, whereas some CJK communities do not commonly read English so do not have that luxury.

Many folks we talked to said that automated translations wouldn’t be acceptable in their languages. Not only would they be substandard, but a lot of MDN Web Docs communities center around translating documents. If manual translations went away, those vibrant and highly involved communities would probably go away — something we certainly want to avoid!

We are therefore focusing on limited manual translations as our main way forward instead, looking to unfreeze a number of key locales as soon as possible after the new platform launch.

Limited manual translations

Rigorous testing has been done, and it looks like building translated content as part of the main build process is doable. We are separating locales into two tiers in order to determine which will be unfrozen and which will remain locked.

  • Tier 1 locales will be unfrozen and manually editable via pull requests. These locales are required to have at least one representative who will act as a community lead. The community members will be responsible for monitoring the localized pages, updating translations of key content once the English versions are changed, reviewing edits, etc. The community lead will additionally be in charge of making decisions related to that locale, and acting as a point of contact between the community and the MDN staff team.
  • Tier 2 locales will be frozen, and not accept pull requests, because they have no community to maintain them.

The Tier 1 locales we are starting with unfreezing are:

  • Simplified Chinese (zh-CN)
  • Traditional Chinese (zh-TW)
  • French (fr)
  • Japanese (ja)

If you wish for a Tier 2 locale to be unfrozen, then you need to come to us with a proposal, including evidence of an active team willing to be responsible for the work associated with that locale. If this is the case, then we can promote the locale to Tier 1, and you can start work.

We will monitor the activity on the Tier 1 locales. If a Tier 1 locale is not being maintained by its community, we shall demote it to Tier 2 after a certain period of time, and it will become frozen again.

We are looking at this new system as a reasonable compromise — providing a path for you the community to continue work on MDN translations providing the interest is there, while also ensuring that locale maintenance is viable, and content won’t get any further out of date. With most locales unmaintained, changes weren’t being reviewed effectively, and readers of those locales were often confused between using their preferred locale or English, their experience suffering as a result.

Review process

The review process will be quite simple.

  • The content for each Tier 1 locale will be kept in its own separate repo.
  • When a PR is made against that repo, the localization community will be pinged for a review.
  • When the content has been reviewed, an MDN admin will be pinged to merge the change. We should be able to set up the system so that this happens automatically.
  • There will also be some user-submitted content bugs filed at, as well as on the issue trackers for each locale repo. When triaged, the “sprints” issues will be assigned to the relevant localization team to fix, but the relevant localization team is responsible for triaging and resolving issues filed on their own repo.
Machine translations alongside manual translations

We previously talked about the potential involvement of machine translations to enhance the new localization process. We still have this in mind, but we are looking to keep the initial system simple, in order to make it achievable. The next step in Q1 2021 will be to start looking into how we could most effectively make use of machine translations. We’ll give you another update in mid-Q1, once we’ve made more progress.

The post An update on MDN Web Docs’ localization strategy appeared first on Mozilla Hacks - the Web developer blog.

Categorieën: Mozilla-nl planet

Mozilla Attack & Defense: Guest Blog Post: Good First Steps to Find Security Bugs in Fenix (Part 1)

di, 08/12/2020 - 16:17

This blog post is one of several guest blog posts, where we invite participants of our bug bounty program to write about bugs they’ve reported to us.

Fenix is a newly designed Firefox for Android that officially launched in August 2020. In Fenix, many components required to run as an Android app have been rebuilt from scratch, and various new features are being implemented as well. While they are re-implementing features, security bugs fixed in the past may be introduced again. If you care about the open web and you want to participate in the Client Bug Bounty Program of Mozilla, Fenix is a good target to start with.

Let’s take a look at two bugs I found in the firefox: scheme that is supported by Fenix.

Bugs Came Again with Deep Links

Fenix provides an interesting custom scheme URL firefox://open?url= that can open any specified URL in a new tab. On Android, a deep link is a link that takes you directly to a specific part of an app; and the firefox:open deep link is not intended to be called from web content, but its access was not restricted.

Web Content should not be able to link directly to a file:// URL (although a user can type or copy/paste such a link into the address bar.) While Firefox on the Desktop has long-implemented this fix, Fenix did not – I submitted Bug 1656747 that exploited this behavior and navigated to a local file from web content with the following hyperlink:

<a href="firefox://open?url=file:///sdcard/Download"> Go </a>

But actually, the same bug affected the older Firefox for Android (unofficially referred to as Fennec) and was filed three years ago Bug 1380950.

Likewise, security researcher Jun Kokatsu reported Bug 1447853, which was an <iframe> sandbox bypass in Firefox for iOS. He also abused the same type of deep link URL for bypassing the popup block brought by <iframe> sandbox.

<iframe src="data:text/html,<a href=firefox://open-url?url=> Go </a>" sandbox></iframe>

I found this attack scenario in a test file of Firefox for iOS and I re-tested it in Fenix. I submitted Bug 1656746 which is the same issue as what he found.


As you can see, retesting past attack scenarios can be a good starting point. We can find past vulnerabilities from the Mozilla Foundation Security Advisories. By examining histories accumulated over a decade, we can see what are considered security bugs and how they were resolved. These resources will be useful for retesting past bugs as well as finding attack vectors for newly introduced features.

Have a good bug hunt!

Categorieën: Mozilla-nl planet