Mozilla Nederland
The Dutch Mozilla community

Jet Villegas: Firefox Platform Rendering – Current Work

Mozilla planet - 15 hours 1 min ago

I’m often asked “what are you working on?” Here’s a snapshot of some of the things currently on my teams’ front burners:

I’m surely forgetting a few things, but that’s a quick snapshot for now. Do you have suggestions for what Platform Rendering features we should pick up next? Add your comments below…

Categories: Mozilla-nl planet

Manish Goregaokar: How Rust Achieves Thread Safety

Mozilla planet - 15 hours 14 min ago

In every talk I have given till now, the question “how does Rust achieve thread safety?” has invariably come up[1]. I usually just give an overview, but this post provides a more comprehensive explanation for those who are interested.

See also: Huon’s blog post on the same topic

In my previous post I touched a bit on the Copy trait. There are other such “marker” traits in the standard library, and the ones relevant to this discussion are Send and Sync. I recommend reading that post if you’re not familiar with Rust wrapper types like RefCell and Rc, since I’ll be using them as examples throughout this post; but the concepts explained here are largely independent.

For the purposes of this post, I’ll restrict thread safety to mean no data races or cross-thread dangling pointers. Rust doesn’t aim to solve race conditions. However, there are projects which utilize the type system to provide some form of extra safety, for example rust-sessions attempts to provide protocol safety using session types.

These traits are auto-implemented using a feature called “opt-in builtin traits”. So, for example, if all the fields of struct Bar are Sync, then Bar will be Sync too, unless we explicitly opt out using impl !Sync for Bar {}. Similarly, if any field of Bar is not Sync, Bar will not be Sync either, unless it explicitly opts in (unsafe impl Sync for Bar {}).
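
To make this concrete, here is a small sketch (not from the post: the Foo/Bar types and the assert_send helper are just illustrations, and the commented-out negative impl additionally needs the unstable opt-in builtin traits feature):

use std::rc::Rc;

// All fields are Send + Sync, so Foo is automatically Send + Sync.
struct Foo {
    data: Vec<u8>,
}

// Rc<u8> is neither Send nor Sync, so Bar is automatically neither.
struct Bar {
    shared: Rc<u8>,
}

// Explicit opt-in: an unsafe promise that we uphold the invariants ourselves.
// unsafe impl Send for Bar {}

// Explicit opt-out (needs the unstable opt-in builtin traits feature):
// impl !Sync for Foo {}

fn assert_send<T: Send>() {}

fn main() {
    assert_send::<Foo>();
    // assert_send::<Bar>(); // error: `Rc<u8>` cannot be sent between threads safely
}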

This means that, for example, a Sender for a Send type is itself Send, but a Sender for a non-Send type will not be Send. This pattern is quite powerful; it lets one use channels with non-threadsafe data in a single-threaded context without requiring a separate “single threaded” channel abstraction.

At the same time, structs like Rc and RefCell which contain Send/Sync fields have explicitly opted out of one or more of these because the invariants they rely on do not hold in threaded situations.

It’s actually possible to design your own library with comparable thread safety guarantees outside of the compiler — while these marker traits are specially treated by the compiler, the special treatment is not necessary for their working. Any two opt-in builtin traits could be used here.

Send and Sync have slightly differing meanings, but are very intertwined.

Send types can be moved between threads without an issue. It answers the question “if this variable were moved to another thread, would it still be valid for use?”. Most objects which completely own their contained data qualify here. Notably, Rc doesn’t (since it is shared ownership). Another exception is LocalKey, which does own its data but isn’t valid from other threads. Borrowed data does qualify to be Send, but in most cases it can’t be sent across threads due to a constraint that will be touched upon later.

Even though a type like RefCell uses non-atomic reference counting, it can be sent safely between threads because sending it is a transfer of ownership (a move). Sending a RefCell to another thread makes it unusable from the original thread, so this is fine.
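
A small illustrative sketch of this, using only the standard library:

use std::cell::RefCell;
use std::thread;

fn main() {
    let cell = RefCell::new(5);
    // The RefCell is moved into the closure; the original thread can no
    // longer touch it, so its non-atomic bookkeeping is never shared.
    let handle = thread::spawn(move || {
        *cell.borrow_mut() += 1;
        *cell.borrow()
    });
    println!("{}", handle.join().unwrap()); // prints 6
}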

Sync, on the other hand, is about synchronous access. It answers the question: “if multiple threads were all trying to access this data, would it be safe?”. Types like Mutex and other lock/atomic based types implement this, along with primitive types. Things containing pointers generally are not Sync.

Sync is sort of a crutch to Send; it helps make other types Send when sharing is involved. For example, &T and Arc<T> are only Send when the inner data is Sync (and additionally Send in the case of Arc<T>). In words, stuff that has shared/borrowed ownership can be sent to another thread if the shared/borrowed data is synchronous-safe.

RefCell, while Send, is not Sync because of the non-atomic reference counting.

Bringing it together, the gatekeeper for all this is thread::spawn(). It has the signature

pub fn spawn<F, T>(f: F) -> JoinHandle<T>
    where F: FnOnce() -> T,
          F: Send + 'static,
          T: Send + 'static

Admittedly, this is confusing/noisy, partially because it’s allowed to return a value, and also because it returns a handle from which we can block on a thread join. We can conjure a simpler spawn API for our needs though:

pub fn spawn<F>(f: F)
    where F: FnOnce(),
          F: Send + 'static

which can be called like:

let mut x = vec![1,2,3,4];

// `move` instructs the closure to move out of its environment
thread::spawn(move || {
    x.push(1);
});

// x is not accessible here since it was moved

In words, spawn() will take a callable (usually a closure) that will be called once, and contains data which is Send and 'static. Here, 'static just means that there is no borrowed data contained in the closure. This is the aforementioned constraint that prevents the sharing of borrowed data across threads; without it we would be able to send a borrowed pointer to a thread that could easily outlive the borrow, causing safety issues.

There’s a slight nuance here about the closures — closures can capture outer variables, but by default they do so by-reference (hence the need for the move keyword to force a by-value capture). They auto-implement Send and Sync depending on their capture clauses. For more on their internal representation, see huon’s post. In this case, x was captured by-move; i.e. as Vec<T> (instead of being similar to &Vec<T> or something), so the closure itself can be Send. Without the move keyword, the closure would not be 'static since it contains borrowed content.

Since the closure inherits the Send/Sync/'static-ness of its captured data, a closure capturing data of the correct type will satisfy the F: Send+'static bound.

Some examples of things that are allowed and not allowed by this function (for the type of x):

  • Vec<T>, Box<T> are allowed because they are Send and 'static
  • &T isn’t allowed because it’s not 'static. This is good: the spawned thread may outlive the borrow, so sending a borrowed pointer to it could lead to a use after free, or otherwise break aliasing rules.
  • Rc<T> isn’t Send, so it isn’t allowed. We could have some other Rc<T>s hanging around, and end up with a data race on the refcount.
  • Arc<Vec<u32>> is allowed (Vec<T> is Send and Sync if the inner type is); we can’t cause a safety violation here. Iterator invalidation requires mutation, and Arc<T> doesn’t provide this by default.
  • Arc<Cell<T>> isn’t allowed. Cell<T> provides copying-based internal mutability, and isn’t Sync (so the Arc<Cell<T>> isn’t Send). If this were allowed, we could have cases where larger structs are getting written to from different threads simultaneously resulting in some random mishmash of the two. In other words, a data race.
  • Arc<Mutex<T>> or Arc<RwLock<T>> are allowed. The inner types use threadsafe locks and provide lock-based internal mutability. They can guarantee that only one thread is writing to them at any point in time. For this reason, the mutexes are Sync for any Send inner T, and Sync types can be shared safely with wrappers like Arc. From the point of view of the inner type, it’s only being accessed by one thread at a time (slightly more complex in the case of RwLock), so it doesn’t need to know about the threads involved. There can’t be data races when Sync types like these are involved. (A minimal sketch of this pattern follows right after this list.)
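
A minimal sketch of the Arc<Mutex<T>> pattern (standard library only; the counter is just an illustration):

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Mutex<u32> is Sync, so Arc<Mutex<u32>> is Send and can be cloned
    // into as many threads as we like.
    let counter = Arc::new(Mutex::new(0u32));

    let handles: Vec<_> = (0..4).map(|_| {
        let counter = counter.clone();
        thread::spawn(move || {
            // The lock guarantees exclusive access, so there is no data race.
            *counter.lock().unwrap() += 1;
        })
    }).collect();

    for handle in handles {
        handle.join().unwrap();
    }

    println!("{}", *counter.lock().unwrap()); // prints 4
}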

As mentioned before, you can in fact create a Sender/Receiver pair of non-Send objects. This sounds a bit counterintuitive — shouldn’t we be only sending values which are Send? However, Sender<T> is only Send if T is Send; so even if we can use a Sender of a non-Send type, we cannot send it to another thread, so it cannot be used to violate thread safety.
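
A small illustrative sketch of that (standard library only):

use std::rc::Rc;
use std::sync::mpsc::channel;

fn main() {
    // A channel over a non-Send type is fine, as long as both ends stay
    // on the same thread.
    let (tx, rx) = channel();
    tx.send(Rc::new(42u32)).unwrap();
    println!("{}", *rx.recv().unwrap());

    // Moving `tx` into thread::spawn would not compile, because
    // Sender<Rc<u32>> is not Send.
}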

There is also a way to utilize the Send-ness of &T (which is not 'static) for some Sync T, namely thread::scoped. This function does not have the 'static bound, but it instead has an RAII guard which forces a join before the borrow ends. This allows for easy fork-join parallelism without necessarily needing a Mutex. Sadly, there are problems which crop up when this interacts with Rc cycles, so the API is currently unstable and will be redesigned. This is not a problem with the language design or the design of Send/Sync, rather it is a perfect storm of small design inconsistencies in the libraries.

Discuss: HN, Reddit

  1. So much that I added bonus slides about thread safety to the end of my deck, and of course I ended up using them at the talk I gave recently

Categories: Mozilla-nl planet

David Rajchenbach Teller: Re-dreaming Firefox (1): Firefox Agents

Mozilla planet - Fri, 29/05/2015 - 23:30

Gerv’s recent post on the Jeeves Test got me thinking of the Firefox of my dreams. So I decided to write down a few ideas on how I would like to experience the web. Today: Firefox Agents. Let me emphasise that the features described in this blog post do not exist.

Marcel uses Firefox every day, for quite a number of things.

  • He uses Firefox for fun, for watching videos and playing online games. For this purpose, he has installed a few tools for finding and downloading videos. Also, one of his main search engines is YouTube. Suggested movies? Sure, as long as they are fun.
  • He uses Firefox for social networks. He follows his friends, he searches on Facebook, or Twitter, or Google+. If anything looks fun, or useful, he’d like to be informed.
  • He uses Firefox for managing his bank accounts, his taxes, his health insurance. For this purpose, he has paranoid security settings – to avoid phishing, he can only browse to a few whitelisted websites – and no add-ons. He may be interested in getting information from these few websites, and in security updates, but that’s about it. Also, since Firefox handles all his passwords, it must itself be protected by a password.
  • He uses Firefox to read his Gmail account. And to read his other Gmail account. And he doesn’t want to leak privacy information by doing so on the same Firefox that he’s using for browsing.
  • Oh, and he may also be using Firefox for browsing websites that are sensitive for any kind of reason, whether he’s hunting for gifts for his close family, dating online, chatting with hackers, discussing politics, helping NGOs in sensitive parts of the globe, visiting BitTorrent trackers, consulting a physician through some online service, or, well, anything else that requires privacy. He’d like to perform such browsing with additional anonymity guarantees. This also means locking Firefox with a password.
  • Sometimes, his children or friends borrow his computer and use Firefox, too.

Of course, since Marcel brings his own device to (or from) work, that’s the same Firefox that he’s using for all of these tasks, and he’s probably even doing several of these tasks at the same time.

So, Marcel has a set of contradictory requirements, not to mention that each of his uses of Firefox needs to pass a distinct Jeeves Test. How do we keep him happy nevertheless?

Introducing Firefox Agents

In the rest of this post, I will be calling each of these uses of Firefox an Agent (if we ever implement this feature, it will, of course, be called Persona). Each Agent matches one way you use Firefox. While Firefox may be delivered with a predefined set of Agents, users can easily create new Agents. In the example, Marcel has his “Fun Agent”, his “Social Agent”, his “Work Agent”, etc.

Each Agent is unique:

  • Each Agent has its own icon on Marcel’s menu/desktop/tablet/phone and task list.
  • Each Agent has its own visual identity, to make sure that work-related stuff doesn’t end up accidentally in the Fun Agent.
  • Each Agent has its own set of preferences, bookmarks, remembered passwords, cookies, cache, and add-ons.
  • Each website may be connected to a given Agent, so that links received through Gmail or through Thunderbird, for instance, automatically open with the right Agent.

As a consequence, any technology that can come bundled with Firefox to, for instance, provide search suggestions or any other kind of website suggestions is tied to an Agent. For instance, Marcel’s browsing a dating site, or shopping for shoes, or engaging in religious activities will not be visible to any of his colleagues looking over his shoulder at his Work Agent, nor will it be tied to either of Marcel’s Gmail accounts. This greatly increases the chances of suggestion technologies passing the Jeeves Test.

Agents are also connected:

  • A menu in each Agent, as well as a keyboard shortcut, lets users quickly open/switch to other Agents.
  • When an Agent follows a link to a website that belongs to another Agent, the relevant Agent opens automatically.
  • Bookmarks may be pushed, on demand, from one Agent to another one.
  • Passwords may be pulled, on demand, from one Agent to another one.

How far are we from Agents?

Technologically speaking, Firefox Agents almost exist. Indeed, Firefox has supported Profiles forever, since way before Firefox 1.0. I generally have three instances of Firefox opened at the same time (four when I’m doing web development), and it works nicely.

With a few add-ons, you can get almost everything, although not entirely connected together:

  • Profilist helps a lot with switching between profiles, and the dev version adds distinct icons;
  • Firefox Themes implement distinct appearances;
  • there are add-ons implementing whitelist browsing;
  • there are add-ons implementing password-protected Firefox.

A few features are missing, but as you can see, the list is actually quite short:

  • Pushing/pulling passwords and bookmarks between Agents (although that’s a subset of what Firefox Accounts can do).
  • Attaching specific websites to specific Agents (although this doesn’t seem too difficult to implement).
  • Connecting this all together.

What now?

I would like to browse with this Firefox. Would you?


Categories: Mozilla-nl planet

Selena Deckelmann: Migrating to Taskcluster: work underway!

Mozilla planet - Fri, 29/05/2015 - 23:29

Mozilla’s build and test infrastructure has relied on Buildbot as the backbone of our systems for many years. Asking around, I heard that we started using Buildbot around 2008. The time has come for a change!

Many of the people working on migrating from Buildbot to Taskcluster gathered all together for the first time to talk about migration this morning. (A recording of the meeting is available)

The goal of this work is to shut down Buildbot, and to identify a timeline for doing so. Our first goal post is to eliminate the Buildbot Scheduler by moving build production entirely into TaskCluster, and scheduling tests in TaskCluster.

Today, most FirefoxOS builds and tests are in Taskcluster. Nearly everything else for Firefox is driven by Buildbot.

Our current tracker bug is ‘Buildbot -> TaskCluster transition‘. At a high level, the big projects underway are:

We have quite a few things to figure out in the Windows and Mac OS X realm where we’re interacting with hardware, and some work is left to be done to support Windows in AWS. We’re planning to get more clarity on the work that needs to be done there next week.

The bugs identified seem tantalizingly close to describing most of the issues that remain in porting our builds. The plan is to have a timeline documented for builds to be fully migrated over by Whistler! We are also working on migrating tests, but for now believe the Buildbot Bridge will help us get tests out of the Buildbot scheduler, even if we continue to need Buildbot masters for a while. An interesting idea was raised during the meeting about using runner to manage hardware instead of the masters; we’ll be exploring it further.

If you’re interested in learning more about TaskCluster and how to use it, Chris Cooper is running a training on Monday June 1 at 1:30pm PT.

Ping me on IRC, Twitter or email if you have questions!

Categories: Mozilla-nl planet

David Humphrey: Messing with MessageChannel

Mozilla planet - Fri, 29/05/2015 - 22:02

We're getting close to being able to ship a beta release of our work porting Brackets to the browser. I'll spend a bunch of time blogging about it when we do, and detail some of the interesting problems we solved along the way. Today I wanted to talk about a patch I wrote this week and what I learned in the process, specifically, using MessageChannel for cross-origin data sharing.

Brackets needs a POSIX filesystem, which is why we spent so much time on filer.js, which is exactly that. Filer stores filesystem nodes and data blocks in IndexedDB (or WebSQL on older browsers). Since this means that filesystem data is stored per-origin, and shared across tabs/windows, we have to be careful when building an app that lets a user write arbitrary HTML, CSS, and JavaScript that is then run in the page (did I mention we've built a complete web server and browser on top of filer.js, because it's awesome!).

Our situation isn't that unique: we want to allow potentially dangerous script from the user to get published using our web app, but we need isolation between the hosting web app and the code editor/"browser" that renders the user's content and works with the filesystem. We do this by isolating the hosting web app from the editor/browser portion using an iframe and separate origins.

Which leads me back to the problem of cross-origin data sharing and MessageChannel. We need access to the filesystem data in the hosting app, so that a logged in user can publish their code to a server. Since the hosted app and the editor iframe run on different origins, we have to somehow allow one to access the data in the other.

Our current solution (we're still testing, but so far it looks good) is to put the filesystem (i.e., IndexedDB database) in the hosting app, and use a MessageChannel to proxy calls to the filesystem from the editor iframe. This is fairly straightforward, since all filesystem operations were already async.

Before this week, I'd only read about MessageChannel, but never really played with it. I found it mostly easy to use, but with a few gotchas. At first glance it looks a lot like postMessage between windows. What's different is that you don't have to validate origins on every call. Instead, a MessageChannel exposes two MessagePort objects: one is held onto by the initiating script; the other is transferred to the remote script.

I think this initial "handshake" is one of the harder things to get your head around when you begin using this approach. To start using a MessageChannel, you first have to do a regular postMessage in order to get the second MessagePort over to the remote script. Furthermore, you need to do it using the often overlooked third argument to postMessage, which lets you include Transferable objects. These objects get transferred (i.e., their ownership switches to the remote execution context).

In code you're doing something like this:

/**
 * In the hosting app's js
 */
var channel = new MessageChannel();
var port = channel.port1;

...

// Wait until the iframe is loaded, via event or some postMessage
// setup, then post to the iframe, indicating that you're
// passing (i.e., transferring) the second port over which
// future communication will happen.
iframe.contentWindow.postMessage("here's your port...", "*", [channel.port2]);

// Now wire the "local" port so we can get events from the iframe
function onMessage(e) {
  var data = e.data;
  // do something with data passed by remote
}
port.addEventListener("message", onMessage, false);

// And, since we used addEventListener vs. onmessage, call start()
// see https://developer.mozilla.org/en-US/docs/Web/API/MessagePort/start
port.start();

...

// Send some data to the remote end.
var data = {...};
port.postMessage(data);

I'm using a window and iframe, but you could also use a worker (or your iframe could pass along to its worker, etc). On the other end, you do something like this:

/**
 * In the remote iframe's js
 */
var port;

// Let the remote side know we're ready to receive the port
parent.postMessage("send me the port, please", "*");

// Wait for a response, then wire the port for `message` events
function receivePort(e) {
  // The capture flag must match the addEventListener call below for removal to work
  removeEventListener("message", receivePort, true);
  if (e.data === "here's your port...") {
    port = e.ports[0];

    function onMessage(e) {
      var data = e.data;
      // do something with data passed by remote
    }
    port.addEventListener("message", onMessage, false);
    // Make sure you call start() if you use addEventListener
    port.start();
  }
}
addEventListener("message", receivePort, true);

...

// Send some data to the other end
var data = {...};
port.postMessage(data);

Simple, right? It's mostly that easy, but here's the fine print:

  • It works today in every modern browser except IE 9 and Firefox, where it's awaiting final review and behind a feature pref. I ended up using a slightly modified version of MessageChannel.js as a polyfill. (We need this to land in Mozilla!)
  • You have to be careful with event handling on the ports: using addEventListener requires an explicit call to start(), while onmessage doesn't. It's documented, but I know I wasted too much time on that one, so be warned.
  • You can safely pass all manner of data across the channel, except for things like functions, and you can use Transferables once again, for things that you want to ship wholesale across to the remote side.
  • Trying to transfer an ArrayBuffer via postMessage doesn't work right now in Blink

I was extremely pleased to find that I could adapt our filesystem in roughly a day to work across origins, without losing a ton of performance. I'd highly recommend looking at MessageChannels when you have a similar problem to solve.

Categories: Mozilla-nl planet

Gregory Szorc: Important Changes to MozReview

Mozilla planet - Fri, 29/05/2015 - 18:20

This was a busy week for MozReview! There are a number of changes people need to be aware of.

Support for Specifying Reviewers via Commit Messages

MozReview will now parse r?gps syntax out of commit messages to set reviewers for pushed commits.

Read the docs for more information, including why we are preferring r? to r=.

Since it landed, a number of workflow concerns have been reported. See the bug tree for bug 1142251 before filing a bug to help avoid duplicates.

Thank Dan Minor for the feature!

Review Attachment/Flag Per Commit

Since the beginning of MozReview, there was one Bugzilla attachment / review flag per commit series. This has changed to one attachment / review flag per commit.

Before, you needed to grant Ship It on the parent/root review request in order to r+ the MozReview review request. Now, you grant Ship It on individual commits and these turn into individual r+ on Bugzilla. To reinforce that reviewing the parent/root review request doesn't do anything meaningful any more, the Ship It button and checkbox have been removed from the parent/root review request.

The new model more closely maps to how code review in Bugzilla has worked at Mozilla for ages. And it is a better fit for the future workflows we're trying to enable.

We tried to run a one-time migration script to convert existing parent/root attachments/review flags to per-commit attachments/flags. However, there were issues. We will attempt again sometime next week. In the interim, in-flight review requests may enter an inconsistent state if they are updated. If a new push is performed, the old parent/root attachment/review flag may linger and per-commit attachments/flags will be created. This could be confusing. The workaround is to manually clear the r? flag from the parent/root attachment or wait for the migration script to run in a few days.

Mark Côté put in a lot of hard work to make this change happen.

r? Flags Cleared After Review

Before, submitting a review without granting Ship It wouldn't do anything to the r? flag: the r? flag would linger.

Now, submitting review without granting Ship It will clear the r? flag. We think the new default is better for the majority of users. However, we recognize it isn't always wanted. There is a bug open to provide a checkbox to keep the r? flag present.

Metadata Added to Changesets

If you update to the tip of the version-control-tools repository (you should do this every week or so to stay fresh - use mach mercurial-setup to do this automatically), metadata will automatically be added to commits when working with MozReview-enabled repositories.

Specifically, we insert a hidden, unique, random ID into every changeset. This ID can be used to map commits to each other. We don't use this ID yet. But we have plans.

A side-effect of this change is that you can no longer push to MozReview if you have uncommitted local changes. If this is annoying, use hg shelve and hg unshelve to create and undo temporary commits. If this is too annoying, complain by filing a bug and we can look into doing this automatically as part of pushing.

What's Next?

We're actively working on more workflow enhancements to make MozReview an even more compelling experience.

I'm building a web service to query file metadata from moz.build files. This will allow MozReview to automatically file bugs in proper components based on what files changed. Once code reviewer metadata is added to moz.build files, it will enable us to assign reviewers automatically as well. The end goal here is to lower the number of steps needed to turn changed code into a landing. This will be useful when we turn GitHub pull requests into MozReview review requests (random GitHub contributors shouldn't need to know who to flag for review, nor should they be required to file a bug if they write some code). Oh yeah, we're working on integrating GitHub pull requests.

Another area of focus is better commit tracking and partially landed series. I have preliminary patches for automatically adding review URL annotations to commit messages. This will enable people to easily go from commit (message) to MozReview, without having to jump through Bugzilla. This also enables better commit tracking. If you e.g. perform complicated history rewriting, the review URL annotation will enable the MozReview server to better map previously-submitted commits to existing review requests. Partially landed series will enable commits to land as soon as they are reviewed, without having to wait on the entire series. It's my strong belief that if a commit is granted review, it should land as soon as possible. This helps prevent bit rot of ready-to-land commits. Landings also make people feel better because you feel like you've accomplished something. Positive feedback loops are good.

Major work is also being done to overhaul the web UI. The commit series view (which is currently populated via XHR) will soon be generated on the server and served as part of the page. This should make pages load a bit faster. And, things should be prettier as well.

And, of course, work is being invested into Autoland. Support for submitting pushes to Try landed a few weeks ago. Having Autoland actually land reviewed commits is on the radar.

Exciting times are ahead. Please continue to provide feedback. If you see something, say something.

Categories: Mozilla-nl planet

Niko Matsakis: Virtual Structs Part 2: Classes strike back

Mozilla planet - Fri, 29/05/2015 - 17:52

This is the second post summarizing my current thoughts about ideas related to “virtual structs”. In the last post, I described how, when coding C++, I find myself missing Rust’s enum type. In this post, I want to turn it around. I’m going to describe why the class model can be great, and something that’s actually kind of missing from Rust. In the next post, I’ll talk about how I think we can get the best of both worlds for Rust. As in the first post, I’m focusing here primarily on the data layout side of the equation; I’ll discuss virtual dispatch afterwards.

(Very) brief recap

In the previous post, I described how one can setup a class hierarchy in C++ (or Java, Scala, etc) with a base class and one subclass for every variant:

class Error { ... };
class FileNotFound : public Error { ... };
class UnexpectedChar : public Error { ... };

This winds up being very similar to a Rust enum:

enum ErrorCode {
    FileNotFound,
    UnexpectedChar
}

However, there are some important differences. Chief among them is that the Rust enum has a size equal to the size of its largest variant, which means that Rust enums can be passed “by value” rather than using a box. This winds up being absolutely crucial to Rust: it’s what allows us to use Option<&T>, for example, as a zero-cost nullable pointer. It’s what allows us to make arrays of enums (rather than arrays of boxed enums). It’s what allows us to overwrite one enum value with another, e.g. to change from None to Some(_). And so forth.
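
These properties are easy to check directly; a small sketch (standard library only, not from the post):

use std::mem::size_of;

fn main() {
    // The null-pointer optimization: Option<&T> is no bigger than &T.
    assert_eq!(size_of::<Option<&u32>>(), size_of::<&u32>());

    // Enums live inline in arrays, no boxing required...
    let mut slots: [Option<u32>; 4] = [None; 4];

    // ...and one variant can be overwritten with another in place.
    slots[0] = Some(7);
    assert_eq!(slots[0], Some(7));
}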

Problem #1: Memory bloat

There are a lot of use cases, however, where having a size equal to the largest variant is actually a handicap. Consider, for example, the way the rustc compiler represents Rust types (this is actually a cleaned up and simplified version of the real thing).

The type Ty represents a rust type:

// 'tcx is the lifetime of the arena in which we allocate type information
type Ty<'tcx> = &'tcx TypeStructure<'tcx>;

As you can see, it is in fact a reference to a TypeStructure (this is called sty in the Rust compiler, which isn’t completely up to date with modern Rust conventions). The lifetime 'tcx here represents the lifetime of the arena in which we allocate all of our type information. So when you see a type like &'tcx, it represents interned information allocated in an arena. (As an aside, we added the arena back before we even had lifetimes at all, and used to use unsafe pointers here. The fact that we use proper lifetimes here is thanks to the awesome eddyb and his super duper safe-ty branch. What a guy.)

So, here is the first observation: in practice, we are already boxing all the instances of TypeStructure (you may recall that the fact that classes forced us to box was a downside before). We have to, because types are recursively structured. In this case, the ‘box’ is an arena allocation, but still the point remains that we always pass types by reference. And, moreover, once we create a Ty, it is immutable – we never switch a type from one variant to another.

The actual TypeStructure enum is defined something like this:

enum TypeStructure<'tcx> {
    Bool,                                      // bool
    Reference(Region, Mutability, Type<'tcx>), // &'x T, &'x mut T
    Struct(DefId, &'tcx Substs<'tcx>),         // Foo<..>
    Enum(DefId, &'tcx Substs<'tcx>),           // Foo<..>
    BareFn(&'tcx BareFnData<'tcx>),            // fn(..)
    ...
}

You can see that, in addition to the types themselves, we also intern a lot of the data in the variants themselves. For example, the BareFn variant takes a &'tcx BareFnData<'tcx>. The reason we do this is because otherwise the size of the TypeStructure type balloons very quickly. This is because some variants, like BareFn, have a lot of associated data (e.g., the ABI, the types of all the arguments, etc). In contrast, types like structs or references have relatively little associated data. Nonetheless, the size of the TypeStructure type is determined by the largest variant, so it doesn’t matter if all the variants are small but one: the enum is still large. To fix this, Huon spent quite a bit of time analyzing the size of each variant and introducing indirection and interning to bring it down.
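
The effect, and the fix via indirection, is easy to see in miniature with made-up variants (this sketch is illustrative, not the compiler's actual types):

enum Fat {
    Small(u8),
    Big([u64; 8]),        // one large variant inflates every Fat value
}

enum Slim {
    Small(u8),
    Big(Box<[u64; 8]>),   // boxing the large payload shrinks the enum again
}

fn main() {
    println!("Fat:  {} bytes", std::mem::size_of::<Fat>());   // at least 64
    println!("Slim: {} bytes", std::mem::size_of::<Slim>());  // roughly pointer-sized plus tag
}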

Consider what would have happened if we had used classes instead. In that case, the type structure might look like:

typedef TypeStructure *Ty;
class TypeStructure { .. };
class Bool : public TypeStructure { .. };
class Reference : public TypeStructure { .. };
class Struct : public TypeStructure { .. };
class Enum : public TypeStructure { .. };
class BareFn : public TypeStructure { .. };

In this case, whenever we allocated a Reference from the arena, we would allocate precisely the amount of memory that a Reference needs. Similarly, if we allocated a BareFn type, we’d use more memory for that particular instance, but it wouldn’t affect the other kinds of types. Nice.

Problem #2: Common fields

The definition for Ty that I gave in the previous section was actually somewhat simplified compared to what we really do in rustc. The actual definition looks more like:

// 'tcx is the lifetime of the arena in which we allocate type information
type Ty<'tcx> = &'tcx TypeData<'tcx>;

struct TypeData<'tcx> {
    id: u32,
    flags: u32,
    ...,
    structure: TypeStructure<'tcx>,
}

As you can see, Ty is in fact a reference not to a TypeStructure directly but to a struct wrapper, TypeData. This wrapper defines a few fields that are common to all types, such as a unique integer id and a set of flags. We could put those fields into the variants of TypeStructure, but it’d be repetitive, annoying, and inefficient.

Nonetheless, introducing this wrapper struct feels a bit indirect. If we are using classes, it would be natural for these fields to live on the base class:

typedef TypeStructure *Ty;

class TypeStructure {
    unsigned id;
    unsigned flags;
    ...
};

class Bool : public TypeStructure { .. };
class Reference : public TypeStructure { .. };
class Struct : public TypeStructure { .. };
class Enum : public TypeStructure { .. };
class BareFn : public TypeStructure { .. };

In fact, we could go further. There are many variants that share common bits of data. For example, structs and enums are both just a kind of nominal type (“named” type). Almost always, in fact, we wish to treat them the same. So we could refine the hierarchy a bit to reflect this:

class Nominal : public TypeStructure {
    DefId def_id;
    Substs substs;
};

class Struct : public Nominal { };
class Enum : public Nominal { };

Now code that wants to work uniformly on either a struct or enum could just take a Nominal*.

Note that while it’s relatively easy in Rust to handle the case where all variants have common fields, it’s a lot more awkward to handle a case like Struct or Enum, where only some of the variants have common fields.
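
For reference, the usual workaround today is to factor the shared fields into their own struct and nest a smaller enum inside it. A sketch along those lines, reusing the DefId/Substs types from the snippet above (illustrative only, not what rustc actually does):

// Shared data for the "nominal" variants only.
struct NominalData<'tcx> {
    def_id: DefId,
    substs: &'tcx Substs<'tcx>,
    kind: NominalKind,
}

enum NominalKind {
    Struct,
    Enum,
}

enum TypeStructure<'tcx> {
    Bool,
    Nominal(NominalData<'tcx>), // code that wants "struct or enum" takes a &NominalData
    ...
}

It works, but every match on the outer enum now has to go through the extra wrapper, which is exactly the awkwardness described above.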

Problem #3: Initialization of common fields

Rust differs from purely OO languages in that it does not have special constructors. An instance of a struct in Rust is constructed by supplying values for all of its fields. One great thing about this approach is that “partially initialized” struct instances are never exposed. However, the Rust approach has a downside, particularly when we consider code where you have lots of variants with common fields: there is no way to write a fn that initializes only the common fields.

C++ and Java take a different approach to initialization based on constructors. The idea of a constructor is that you first allocate the complete structure you are going to create, and then execute a routine which fills in the fields. This approach to constructors has a lot of problems – some of which I’ll detail below – and I would not advocate for adding it to Rust. However, it does make it convenient to separately abstract over the initialization of base class fields from subclass fields:

typedef TypeStructure *Ty;

class TypeStructure {
    unsigned id;
    unsigned flags;

    TypeStructure(unsigned id, unsigned flags)
        : id(id), flags(flags)
    { }
};

class Bool : public TypeStructure {
    Bool(unsigned id)
        : TypeStructure(id, 0) // bools have no flags
    { }
};

Here, the constructor for TypeStructure initializes the TypeStructure fields, and the Bool constructor initializes the Bool fields. Imagine we were to add a field to TypeStructure that is always 0, such as some sort of counter. We could do this without changing any of the subclasses:

class TypeStructure {
    unsigned id;
    unsigned flags;
    unsigned counter; // new

    TypeStructure(unsigned id, unsigned flags)
        : id(id), flags(flags), counter(0)
    { }
};

If you have a lot of variants, being able to extract the common initialization code into a function of some kind is pretty important.

Now, I promised a critique of constructors, so here we go. The biggest reason we do not have them in Rust is that constructors rely on exposing a partially initialized this pointer. This raises the question of what value the fields of that this pointer have before the constructor finishes: in C++, the answer is just undefined behavior. Java at least guarantees that everything is zeroed. But since Rust lacks the idea of a “universal null” – which is an important safety guarantee! – we don’t have such a convenient option. And there are other weird things to consider: what happens if you call a virtual function during the base type constructor, for example? (The answer here again varies by language.)

So, I don’t want to add OO-style constructors to Rust, but I do want some way to pull out the initialization code for common fields into a subroutine that can be shared and reused. This is tricky.

Problem #4: Refinement types

Related to the last point, Rust currently lacks a way to “refine” the type of an enum to indicate the set of variants that it might be. It would be great to be able to say not just “this is a TypeStructure”, but also things like “this is a TypeStructure that corresponds to some nominal type (i.e., a struct or an enum), though I don’t know precisely which kind”. As you’ve probably surmised, making each variant its own type – as you would in the classes approach – gives you a simple form of refinement types for free.

To see what I mean, consider the class hierarchy we built for TypeStructure:

typedef TypeStructure *Ty;
class TypeStructure { .. };
class Bool : public TypeStructure { .. };
class Reference : public TypeStructure { .. };
class Nominal : public TypeStructure { .. };
class Struct : public Nominal { .. };
class Enum : public Nominal { .. };
class BareFn : public TypeStructure { .. };

Now, I can pass around a TypeStructure* to indicate “any sort of type”, or a Nominal* to indicate “a struct or an enum”, or a BareFn* to mean “a bare fn type”, and so forth.

If we limit ourselves to single inheritance, that means one can construct an arbitrary tree of refinements. Certainly one can imagine wanting arbitrary refinements, though in my own investigations I have always found a tree to be sufficient. In C++ and Scala, of course, one can use multiple inheritance to create arbitrary refinements, and I think one can imagine doing something similar in Rust with traits.

As an aside, the right way to handle ‘datasort refinements’ has been a topic of discussion in Rust for some time; I’ve posted a different proposal in the past, and, somewhat amusingly, my very first post on this blog was on this topic as well. I personally find that building on a variant hierarchy, as above, is a very appealing solution to this problem, because it avoids introducing a “new concept” for refinements: it just leverages the same structure that is giving you common fields and letting you control layout.

Conclusion

So we’ve seen that there are also advantages to the approach of using subclasses to model variants. I showed this using the TypeStructure example, but there are lots of cases where this arises. In the compiler alone, I would say that the abstract syntax tree, the borrow checker’s LoanPath, the memory categorization cmt types, and probably a bunch of other cases would benefit from a more class-like approach. Servo developers have long been requesting something more class-like for use in the DOM. I feel quite confident that there are many other crates at large that could similarly benefit.

Interestingly, Rust can gain a lot of the benefits of the subclass approach—namely, common fields and refinement types—just by making enum variants into types. There have been proposals along these lines before, and I think that’s an important ingredient for the final plan.

Perhaps the biggest difference between the two approaches is the size of the “base type”. That is, in Rust’s current enum model, the base type (TypeStructure) is the size of the maximal variant. In the subclass model, the base class has an indeterminate size, and so must be referenced by pointer. Neither of these are an “expressiveness” distinction—we’ve seen that you can model anything in either approach. But it has a big effect on how easy it is to write code.

One interesting question is whether we can concisely state conditions in which one would prefer to have “precise variant sizes” (class-like) vs “largest variant” (enum). I think the “precise sizes” approach is better when the following apply:

  1. A recursive type (like a tree), which tends to force boxing anyhow. Examples: the AST or types in the compiler, DOM in servo, a GUI.
  2. Instances never change what variant they are.
  3. Potentially wide variance in the sizes of the variants.

The fact that this is really a kind of efficiency tuning is an important insight. Hopefully our final design can make it relatively easy to change between the ‘maximal size’ and the ‘unknown size’ variants, since it may not be obvious from the get go which is better.

Preview of the next post

The next post will describe a scheme in which we could wed together enums and structs, gaining the advantages of both. I don’t plan to touch virtual dispatch yet, but instead just keep focusing on concrete types.

Categories: Mozilla-nl planet

Mozilla signing vetted add-ons as thoughts turn to security - The Register

News collected via Google - Fri, 29/05/2015 - 15:33

Mozilla developer Jorge Villalobos claims the web king has begun signing vetted add-ons in a bid to improve security. The move means Mozilla-signed add-ons hosted on its servers will be maintained through automatic updates, while those lacking the ...

Categories: Mozilla-nl planet

Adam Munter: “The” Problem With “The” Perimeter

Mozilla planet - Fri, 29/05/2015 - 14:02

“It’s secure, we transmit it over SSL to a cluster behind a firewall in a restricted vlan.”
“But my PCI QSA said I had to do it this way to be compliant.”

This study by Gemalto discusses interesting survey results about perceptions of security perimeters: 64% of IT decision makers are looking to increase spend on perimeter security within the next 6 months, and a third of those polled believe that unauthorized users have access to their information assets. It also revealed that only 8% of data affected by breaches was protected by encryption.

The perimeter is dead, long live the perimeter! The Jericho Forum started discussing “de-perimeterization” in 2003. If you hung out with pentesters, you already knew the concept of ‘perimeter’ was built on shaky foundations. The growth of mobile, web APIs, and the Internet of Things has only served to drive the point home. Yet, there is an entire industry of VC-funded product companies and armies of consultants who are still operating from the mental model of there being “a perimeter.”[0]

In discussion about “the perimeter,” it’s not the concept of “perimeter” that is most problematic, it’s the word “the.”

There is not only “a” perimeter, there are many possible logical perimeters, depending on the viewpoint of the actor you are considering. There are an unquantifiable number of theoretical overlaid perimeters, shaped by each actor’s perspective, motivation, time, and resources; by what they can interact with or observe, and what each of those components can in turn interact with, including humans and their processes, automated data processing systems, authentication and authorization systems, and all the software, library, and hardware dependencies down to the firmware; by interactions between systems that might interpret the same data to mean different things; and by all execution paths, known and unknown.

The best CSOs know they are working on a problem that has no solution endpoint, and that thinking so isn’t even the right mindset or model. They know they are living in a world of resource scarcity and have a problem of potentially unlimited size, and start by asset classification, threat modeling[1] and inventorying. Without that it’s impossible to even have a rough idea of the shape and size of the problem. They know that their actual perimeter isn’t what’s drawn inside an arbitrary theoretical border in a diagram. It’s based on the attackable surface area seen by a potential attacker, the value of the resource to the attacker, and the many possible paths that could be taken to reach it in a way that is useful to the attacker, not some imaginary mental model of logical border control.

You’ve deployed anti-malware and anti-APT products, network and web app firewalls, endpoint protection and database encryption. Fully PCI compliant! All useful when applied with knowledge of what you’re protecting, how, from whom, and why. But if you don’t consider what you’re protecting and from whom as you design and build systems, not so useful. Imagine the following scenario: all of these perimeter protection technologies allow SSL traffic through port 443 to your webserver’s REST API listeners. The listening application has permission to access the encrypted database to read or modify data. And when the attacker finds a logic vulnerability that lets them access data which their user id should not be able to see, it looks like normal application traffic to your IDS/IPS and web app firewall. As requested, the application uses its credentials to retrieve decrypted data and present it to the user.

Footnotes

0. I’m already skeptical about the usefulness of studies that aggregate data in this way. N percent of respondents think that y% is the correct amount to spend on security technology categories A, B, C. Who cares? The increasing yoy numbers of attacks are the result of the distribution of knowledge during the time surveyed and in any event these numbers aggregate a huge variety of industries, business histories, risk tolerance, and other tastes and preferences.
1. Threat modeling doesn’t mean technical decomposition to identify possible attacks; that’s attack modeling, though the two are often confused, even in many books and articles. The “threat” is “customer data exposed to unauthorized individuals.” The business “risk” is “Data exposure would lead to headline risk (bad press) and loss of data worth approx $N dollars.” The technical risk is “Application was built using inline SQL queries and is vulnerable to SQL injection” and “Database is encrypted but the application’s credentials let it retrieve cleartext data” and probably a bunch of other things.


Filed under: infosec, web security Tagged: infosec, Mozilla, perimeter, webappsec
Categories: Mozilla-nl planet

Gregory Szorc: Faster Cloning from hg.mozilla.org With Server Provided Bundles

Mozilla planet - Fri, 29/05/2015 - 13:30

When you type hg clone, the Mercurial server will create a bundle from repository content at the time of the request and stream it to the client. (Git works essentially the same way.)

This approach usually just works. But there are some downsides, particularly with large repositories.

Creating bundles for large repositories is not cheap. For mozilla-central, Firefox's main repository, it takes ~280s of CPU time on my 2014 MacBook Pro to generate a bundle. Every time a client runs a hg clone https://hg.mozilla.org/mozilla-central, a server somewhere is spinning a CPU core generating ~1.1 GB of data. What's more, if another clone arrives at the same time, another process will perform the exact same work! When we talk about multiple minutes of CPU time per request, this extra work starts to add up.

Another problem with large repositories is interrupted downloads. If you suffer a connectivity blip during your clone command, you'll have to start from scratch. This potentially means re-transferring hundreds of megabytes from the server. It also means the server has to generate a new bundle, consuming even more CPU time. This is not good for the user or the server.

There have been multiple outages of hg.mozilla.org as a result of the service being flooded with clone requests to large repositories. Dozens of clients (most of them in Firefox or Firefox OS release automation) have cloned the same repository around the same time and overwhelmed network bandwidth in the data center or CPU cores on the Mercurial servers.

A common solution to this problem is to not use the clone command to receive initial repository data from the server. Instead, a static bundle file will be generated and made available to clients. Clients will call hg init to create an empty repository then will perform an hg unbundle to apply the contents of a pre-generated bundle file. They will then run hg pull to fetch new data that was created after the bundle was generated. (It's worth noting that Git's clone --reference option is similar.)

This is a good technical solution. Firefox and Firefox OS release automation have effectively implemented this. However, it is a lot of work: you have to build your own bundle generation and hosting infrastructure and you have to remember that every hg clone should probably be using bundles instead. It is extra complexity and complexity that must be undertaken by every client. If a client forgets, the consequences can be disastrous (clone flooding leading to service outage). Client-side opt-in is prone to lapses and doesn't scale.

As of today, we've deployed a more scalable, server-based solution to hg.mozilla.org.

hg.mozilla.org is now itself generating bundles for a handful of repositories, including mozilla-central, inbound, fx-team, and mozharness. These bundles are being uploaded to Amazon S3. And those bundles are being advertised by the server over Mercurial's wire protocol.

When you install the bundleclone Mercurial extension, hg clone is taught to look for bundles being advertised on the server. If a bundle is available, the bundle is downloaded, applied, and then the client does the equivalent of an hg pull to fetch all new data since when the bundle was generated. If a bundle exists, it is used transparently: no client side cooperation is needed beyond installing the bundleclone extension. If a bundle doesn't exist, it simply falls back to Mercurial's default behavior. This effectively shifts responsibility for doing efficient clones from clients to server operators, which means server operators don't need cooperation from clients to enact important service changes. Before, if clients weren't using bundles, we'd have to wait for clients to update their code. Now, we can see a repository is being cloned heavily and start generating bundles for it without having to wait for the client to deploy new code.

Furthermore, we've built primitive content negotiation into the process. The server doesn't simply advertise one bundle file: it advertises several bundle files. We offer gzip, bzip2, and stream bundles. gzip is what Mercurial uses by default. It works OK. bzip2 bundles are smaller, but they take longer to process. stream bundles are essentially tar archives of the .hg/store directory and are larger than gzip bundles, but insanely fast because there is very little CPU required to apply them. In addition, we advertise URLs for multiple S3 regions, currently us-west-2 (Oregon) and us-east-1 (Virginia). This enables clients to prefer the bundle most appropriate for them.

A benefit of serving bundles from S3 is that Firefox and Firefox OS release automation (the biggest consumers of hg.mozilla.org) live in Amazon EC2. They are able to fetch from S3 over a gigabit network. And, since we're transferring data within the same AWS region, there are no data transfer costs. Previously, we were transferring ~1.1 GB from a Mozilla data center to EC2 for each clone. This took up bandwidth in Mozilla's network and cost Mozilla money to send data thousands of miles away. And, we never came close to saturating a gigabit network (we do with stream bundles). Wins everywhere!

The full instructions detail how to use bundleclone. I recommend everyone at Mozilla install the extension because there should be no downside to doing it.

Once bundleclone is deployed to Firefox and Firefox OS release automation, we should hopefully never again see those machines bring down hg.mozilla.org due to a flood of clone requests. We should also see a drastic reduction in load to hg.mozilla.org. I'm optimistic bandwidth will decrease by over 50%!

It's worth noting that the functionality from the bundleclone extension is coming to vanilla Mercurial. The functionality (which was initially added by Mozilla's Mike Hommey) is part of Mercurial's bundle2 protocol, which is available, but isn't enabled by default yet. bundleclone is thus a temporary solution to bring us server stability and client improvements until modern Mercurial versions are deployed everywhere in a few months time.

Finally, I would like to credit Augie Fackler for the original idea for server-assisted bundle-based clones.

Categories: Mozilla-nl planet

Air Mozilla: World Wide Haxe Conference 2015

Mozilla planet - Fri, 29/05/2015 - 10:00

World Wide Haxe Conference 2015: talks in English about the Haxe programming language

Categories: Mozilla-nl planet

Mozilla Open Policy & Advocacy Blog: Copyright reform in the European Union

Mozilla planet - Fri, 29/05/2015 - 05:33

The European Union is considering broad reform of copyright regulations as part of a “Digital Single Market” reform agenda. Review of the current Copyright Directive, passed in 2001, began with a report by MEP Julia Reda. The European Parliament will vote on that report and a number of amendments this summer, and the process will continue with a legislative proposal from the European Commission in the autumn. Over the next few months we plan to add our voice to this debate; in some cases supporting existing ideas, in other cases raising new issues.

This post lays out some of the improvements we’d like to see in the EU’s copyright regime – to preserve and protect the Web, and to better advance the innovation and competition principles of the Mozilla Manifesto. Most of the objectives we identify are actively being discussed today as part of copyright reform. Our advocacy is intended to highlight these, and characterize positions on them. We also offer a proposed exception for interoperability to push the conversation in a slightly new direction. We believe an explicit exception for interoperability would directly advance the goal of promoting innovation and competition through copyright law.

Promoting innovation and competition

“The effectiveness of the Internet as a public resource depends upon interoperability (protocols, data formats, content), innovation and decentralized participation worldwide.” – Mozilla Manifesto Principle #6

Clarity, consistency, and new exceptions are needed to ensure that Europe’s new copyright system encourages innovation and competition instead of stifling it. If new and creative uses of copyrighted content can be shut down unconditionally, innovation suffers. If copyright is used to unduly restrict new businesses from adding value to existing data or software, competition suffers.

Open norm: Implement a new, general exception to copyright allowing actions which pass the 3-step test of the Berne Convention. That test says that any exception to copyright must be a special case, that it should not conflict with a normal exploitation of the work, and it should not unreasonably prejudice the legitimate interests of the author. The idea of an “open norm” is to capture a natural balance for innovation and competition, allowing the copyright holder to retain normal exclusionary rights but not exceptional restrictive capabilities with regards to potential future innovative or competing technologies.

Quotation: Expand existing protections for text quotations to all media and a wider range of uses. An exception of this type is fundamental not only for free expression and democratic dialogue, but also to promote innovation when the quoter is adding value through technology (such as a website which displays and combines excerpts of other pages to meet a new user need).

Interoperability: An exception for acts necessary to enable ongoing interoperability with an existing computer program, protocol, or data format. This would directly enable competition and technology innovation. Such interoperation is also necessary for full accessibility for the disabled (who are often not appropriately catered for in standard programs), and to allow citizens to benefit fully from other exceptions to copyright.

Not breaking the Internet

“The Internet is a global public resource that must remain open and accessible.” – Mozilla Manifesto Principle #2

The Internet has numerous technical and policy features which have combined, sometimes by happy coincidence, to make it what it is today. Clear legislation to preserve and protect these core capabilities would be a powerful assurance, and avoid creating chilling risk and uncertainty.

Hyperlinking: hyperlinking should not be considered any form of “communication to a public”. A recent Court of Justice of the EU ruling stated that hyperlinking was generally legal, as it does not consist of communication to a “new public.” A stronger and more common-sense rule would be a legislative determination that linking, in and of itself, does not constitute communicating the linked content to a public under copyright law. The acts of communicating and making content available are done by the person who placed the content on the target server, not by those making links to that content.

Robust protections for intermediaries: a requirement for due legal process before intermediaries are compelled to take down content. While it makes sense for content hosts to be asked to remove copyright-infringing material within their control, a mandatory requirement to do so should not be triggered by mere assertion, but only after appropriate legal process. The existing waiver of liability for intermediaries should thus be strengthened with an improved definition of “actual knowledge” that requires such process, and (relatedly) should allow minor, reasonable modifications to data (e.g. for network management) without loss of protection.

We look forward to working with European policymakers to build consensus on the best ways to protect and promote innovation and competition on the Internet.

Chris Riley
Gervase Markham
Jochai Ben-Avie

Categorieën: Mozilla-nl planet

The Servo Blog: This Week In Servo 33

Mozilla planet - to, 28/05/2015 - 22:30

In the past two weeks, we merged 73 pull requests.

We have a new member on our team. Please welcome Emily Dunham! Emily will be the DevOps engineer for both Servo and Rust. She has a post about her ideas regarding open infrastructure which is worth reading.

Josh discussed Servo and Rust at a programming talk show hosted by Alexander Putilin.

We have an impending upgrade of the SpiderMonkey JavaScript engine by Michael Wu. This moves us from a very old SpiderMonkey to a recent-ish one. Naturally, the team is quite excited about the prospect of getting rid of all the old bugs and getting shiny new ones in their place.

Notable additions

New contributors

Screenshots

Hebrew Wikipedia in servo-shell

This shows off the CSS direction property. RTL text still needs some work

Meetings

Minutes

  • We discussed forking or committing to maintaining the extensions we need in glutin. Glutin is trying to stay focused and “not become a toolkit”, but there are some changes we need in it for embedding. Currently we have some changes on our fork, but we’d prefer not to use tweaked forks of community-maintained dependencies, so we were exploring the possibilities.
  • Mike is back and working on more embedding!
  • There was some planning for the Rust-in-Gecko sessions at Whistler
  • RTL is coming!
Categorieën: Mozilla-nl planet

Good luck with the new mobile strategy, Mozilla - Computerworld

Nieuws verzameld via Google - to, 28/05/2015 - 20:48

Good luck with the new mobile strategy, Mozilla
Computerworld
Gold and others were reacting to a report last week by CNET.com that cited an internal memo from Mozilla CEO Chris Beard. In his missive, Beard laid out a new initiative for Firefox OS -- which previously has been pinned to ultra-inexpensive handsets ...

Categorieën: Mozilla-nl planet

Monica Chew: Advertising: a sustainable utopia?

Mozilla planet - to, 28/05/2015 - 19:59
Advertising generates $50 billion annually in the US alone, but how much of that figure reflects real value? Approximately ⅓ of click traffic is fraudulent, leading to $10 billion in wasted spending annually. Counting revenue due to fraud towards the value of advertising is like counting money spent on diabetes treatments as part of the GDP -- if those figures went to zero, it would reflect a healthier ecosystem, or healthier people in the diabetes case. For people making money on advertising, it is difficult to accept that a reduction in annual revenue can mean that things are better for everyone else.

Even when ads are displayed to real people, they often create little to no value for the ad creator. According to Google, half of ads are never viewable, not even for a second. In addition, adblocking usage grew by 70% last year, and 41% of people between 18-29 use an adblocker. The advertising industry responds to these trends by making ads increasingly distracting (requiring large amounts of resources and unsafe plugins to run), collecting increasingly large amounts of data, and creating more opportunities for abuse by government agencies and other malicious actors. As Mitchell Baker put it, do we want to live in a house or a fish bowl?

There has to be a better way. Why can’t a person buy and blank out all of the ad space on sites they visit at a deep discount, since targeting machinery would no longer be relevant? Why aren’t subscriptions available as bundle deals, like in streaming video? Solutions like these are hypothetical and will remain so as long as we maintain the fiction that the current advertising revenue model is a sustainable utopia.
Categorieën: Mozilla-nl planet

Air Mozilla: Participation at Mozilla

Mozilla planet - to, 28/05/2015 - 19:00

Participation at Mozilla The Participation Forum

Categorieën: Mozilla-nl planet

Joel Maher: the orange factor – no need to retrigger this week

Mozilla planet - to, 28/05/2015 - 18:15

Last week I did another round of re-triggering to hunt for root causes, and I found some!  This week I got an email from orange factor outlining the top 10 failures on the trees (as we do every week).

Unfortunately, as of this morning there is no work for me to do; maybe next week I can hunt.

Here is the breakdown of bugs:

  • Bug 1081925 – Intermittent browser_popup_blocker.js
    • investigated last week, test is disabled by a sheriff
  • Bug 1118277 – Intermittent browser_popup_blocker.js
    • investigated last week, test is disabled by a sheriff
  • Bug 1096302 – Intermittent test_collapse.html
    • test is fixed!  already landed
  • Bug 1121145 – Intermittent browser_panel_toggle.js
    • too old!  problem got worse on April 24th
  • Bug 1157948 – DMError: Non-zero return code for command
    • too old!  most likely a harness/infra issue
  • Bug 1166041 – Intermittent LeakSanitizer
    • patch is already on this bug
  • Bug 1165938 – Intermittent media-source
    • disabled the test already!
  • Bug 1149955 – Intermittent Win8-PGO test_shared_all.py
    • too old!
  • Bug 1160008 – Intermittent testVideoDiscovery
    • too old!
  • Bug 1137757 – Intermittent Linux debug mochitest-dt1 command timed out
    • harness/infra; the test chunk is taking too long, and the problem is being addressed with more chunks.

As you can see, there isn’t much to do here.  Maybe next week we will have some actions we can take.  Once I have investigated about 10 bugs, I will summarize them along with related dates, status, and so on.


Categorieën: Mozilla-nl planet

Air Mozilla: Reps weekly

Mozilla planet - to, 28/05/2015 - 17:00

Reps weekly Weekly Mozilla Reps call

Categorieën: Mozilla-nl planet

Selena Deckelmann: pushlog from last night, a brief look at Try

Mozilla planet - to, 28/05/2015 - 16:37

One of the mysterious and amazing parts of Mozilla’s Release Engineering infrastructure is the Try server, or just “Try”. This is how Firefox and FirefoxOS developers can push changes to Mozilla’s large build and test system, made up of about 3000 servers at any time. There are a couple of amazing things about this — one is that anyone can request access to push to try, not just core developers or Mozilla employees. It is a publicly-available system and tool. Second is that a remarkable amount of control is offered over which builds are produced and which tests are run. Each Try run could consume 300+ hours of machine time if every possible build and test option is selected.

This blog post is a brain dump from a few days of noodling and a quick review of the pushlog for Try, which shows exactly the options developers are choosing for their Try runs.

To use Try, you need to include a string of configuration that looks something like this in your topmost hg commit:

try: -b do -p emulator,emulator-jb,emulator-kk,linux32_gecko,linux64_gecko,macosx64_gecko,win32_gecko -u all -t none

That’s a recommended string for B2G developers from the Sheriff best practices wiki page. If you’re interested in how this works, the code for the try syntax parser itself is here.
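
To make the shape of that string concrete, here is a minimal Python sketch that splits a commit message of this form into its -b/-p/-u/-t parts. It is only an illustration of the syntax described above, not the actual parser linked to in the previous paragraph, and the option labels in the dictionary are made up for illustration.

    # Toy illustration of the try syntax described above; not the real parser.
    def parse_try_string(message):
        """Split the part of a commit message after 'try:' into its options."""
        flag_names = {"-b": "builds", "-p": "platforms", "-u": "unittests", "-t": "talos"}
        options = {name: [] for name in flag_names.values()}
        if "try:" not in message:
            return options
        current = None
        for token in message.split("try:", 1)[1].split():
            if token in flag_names:
                current = flag_names[token]
            elif current is not None:
                options[current].extend(token.split(","))
        return options

    print(parse_try_string("try: -b do -p linux64_gecko,win32_gecko -u all -t none"))
    # {'builds': ['do'], 'platforms': ['linux64_gecko', 'win32_gecko'],
    #  'unittests': ['all'], 'talos': ['none']}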

You can include a Try configuration string in an empty commit, or as the last part of an existing commit message. What most developers tell me they do is have an empty commit with the Try string in it, and they remove the extra commit before merging a patch. From all the feedback I’ve read and heard, I think that’s probably what we should document on the wiki page for Try, and maybe have a secondary page with “variants” for those that want to use more advanced tooling. The KISS rule seems to apply here.

If you’re a regular user of Try, you might have heard of the high scores tracker. What you might not know is that there is a JSON file behind this page and it contains quite a bit of history that’s used to generate that page. You can find it if you just replace ‘.html’ with ‘.json’.

Something about the 8-bit ambiance of this page made me think of “texts from last night”. But in reality, Try is most busy during typical Pacific Time working hours.

The high scores page also made me curious about the actual Try strings that people were using. I pulled them all out and had a look at what config options were most common; a rough sketch of that kind of tally appears after the list below.

Of the 1262 pushes documented today in that file:

  • 760 used ‘-b do’ meaning both Debug and Opt builds are made. I wonder whether this should just be the default, or whether we should have some clear recommendations about what developers should do here.
  • 366 used ‘-p all’ meaning builds on all 28 platforms, producing 28 binaries. Some people might intend this, but I wonder if some other default might be more helpful.
  • 456 used ‘-u all’ meaning that all the unit tests were run.
  • 1024 used ‘-t none’ reflecting the waning use of Talos tests.
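
A tally like the one above can be produced with a few lines of Python. The sketch below is only an illustration under assumptions: the URL is a placeholder, and the try_syntax field name is invented for the example, so the real high scores JSON may well be structured differently.

    import json
    from collections import Counter
    from urllib.request import urlopen

    # Placeholder URL and assumed field name; adjust both to match the real JSON file.
    HIGH_SCORES_JSON = "https://example.com/try-high-scores.json"

    def tally_try_options(url=HIGH_SCORES_JSON):
        """Count how often a few common options appear in recorded try strings."""
        pushes = json.load(urlopen(url))         # assumed to be a list of push records
        counts = Counter()
        for push in pushes:
            syntax = push.get("try_syntax", "")  # assumed field name
            for option in ("-b do", "-p all", "-u all", "-t none"):
                if option in syntax:
                    counts[option] += 1
        return counts

    if __name__ == "__main__":
        for option, count in tally_try_options().most_common():
            print(option, count)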

I’m still thinking about how to use this information. I have a few ideas:

  • Change the defaults for a minimal try run
  • Make some commonly-used aliases for things like “build on B2G platforms” or “tests that catch a lot of problems”
  • Create a dashboard that shows TopN try syntax strings
  • Update the parser to include more of the options documented on various wiki pages as environment variables

If you’re a regular user of Try, what would you like to see changed? Ping me in #releng, email or tweet your thoughts!

And some background on me: I’ve been working with the Release Engineering team since April 1, 2015, and most of that time so far has been spent on #buildduty, a topic I’m planning to write a few blog posts about. I’m also having a look at planning for the Task Cluster migration (away from BuildBot), as well as monitoring and developer environments for releng tooling. I’m also working on a zine to share at Whistler about what is going on when you push to Try. Finally, I stood up a bot for reporting alerts through AWS SNS and for filing #buildduty-related Bugzilla bugs.

My goal right now is to find ways of communicating about and sharing the important work that Release Engineering is doing. Some of that is creating tracker bugs and having meetings to spread knowledge. Some of that is documenting our infrastructure and drawing pictures of how our systems interact. A lot of it is listening and learning from the engineers who build and ship Firefox to the world.

Categorieën: Mozilla-nl planet

Armen Zambrano: mozci 0.7.0 - Less network fetches - great speed improvements!

Mozilla planet - to, 28/05/2015 - 16:01
This release is not large in scope but it has many performance improvements.
The main improvement is that we have reduced the number of network fetches by caching information where possible; the network cost was very high.
You can read more about it here: http://explique.me/cProfile
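As a rough sketch of the general idea (and not mozci’s actual code), caching a fetch by its URL means repeated requests for the same data can skip the network entirely:

    # Generic illustration of caching network fetches; not mozci's implementation.
    from functools import lru_cache
    from urllib.request import urlopen

    @lru_cache(maxsize=None)
    def fetch_text(url):
        """Fetch a URL once; later calls with the same URL reuse the cached body."""
        with urlopen(url) as response:
            return response.read().decode("utf-8")
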
Contributions
Thanks to @adusca, @parkouss, and @vaibhavmagarwal for their contributions to this release.
How to update
Run "pip install -U mozci" to update
Major highlights
  • Reduce drastically the number of requests by caching where possible
  • If a failed build has uploaded good files let's use them
  • Added support for retriggering and cancelling jobs
  • Retrigger a job once with a count of N instead of triggering individually N times
Minor improvements
  • Documentation updates
  • Add waffle.io badge
All changes
You can see all changes in here:
0.6.0...0.7.0
Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.
Categorieën: Mozilla-nl planet
