
Pocket’s state-by-state guide to the most popular articles in 2021

Mozilla Blog - Wed, 01/12/2021 - 15:43

We’re just going to say it: it feels a little bit weird to wrap up 2021 because this year feels like three years in one and an extension of 2020 simultaneously. At some point in the near future, 2020 and 2021 will be studied in history books. While we can’t predict what the history books will say, we can analyze what defined this year for us. 

We do just that in Pocket’s Best of 2021 — the most-saved, -read and -shared articles by Pocket readers, spanning culture, science, tech and more. 

As we analyzed the winning articles, we wondered what we might learn if we looked at the data state by state. 

Setting aside the top story worldwide for 2021, Adam Grant’s piece naming that ‘blah’ feeling we felt after 2020, the top story in all but five states was a guide to deleting all of your old online accounts. And most of the five locales that differ — D.C., Maine, New York, North Dakota and Montana — have that story as the second most-saved story. 

We saw a few patterns among top stories across several states. Americans weren’t just deleting old online accounts; they were also trying to strengthen their memory, pondering how the rich avoid income tax and wondering how to be wittier in conversation.

We might all have been languishing but we were also questioning if we could improve ourselves — or at least our bank accounts. Some things don’t change, even after two of the strangest years in modern history.

Check below to see the top two stories from your state.

Alabama

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Train Your Brain to Remember Anything You Learn With This Simple, 20-Minute Habit published on Inc

Alaska

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Delta Has Changed the Pandemic Endgame published on The Atlantic 

Arizona

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Train Your Brain to Remember Anything You Learn With This Simple, 20-Minute Habit published on Inc 

Arkansas

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. How to be witty and clever in conversation published on Quartz 

California

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. The Secret IRS Files: Trove of Never-Before-Seen Records Reveal How the Wealthiest Avoid Income Tax published on ProPublica

Colorado

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Train Your Brain to Remember Anything You Learn With This Simple, 20-Minute Habit published on Inc 

Connecticut

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Train Your Brain to Remember Anything You Learn With This Simple, 20-Minute Habit published on Inc 

Delaware

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Among the Insurrectionists published on The New Yorker 

District of Columbia 

  1. The Pandemic Has Erased Entire Categories of Friendship published on The Atlantic 
  2. Grief and Conspiracy 20 Years After 9/11  published on The Atlantic 

Florida

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. The Curious Case of Florida’s Pandemic Response published on The Atlantic 

Georgia

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. The Secret IRS Files: Trove of Never-Before-Seen Records Reveal How the Wealthiest Avoid Income Tax published on ProPublica

Hawaii

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. How to Practice published on The New Yorker 

Idaho

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. The Great Resignation Is Accelerating published on  The Atlantic 

Illinois

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. A Battle Between a Great City and a Great Lake published on The New York Times

Indiana

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Train Your Brain to Remember Anything You Learn With This Simple, 20-Minute Habit published on Inc

Iowa

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. The Secret History of the Shadow Campaign That Saved the 2020 Election published on TIME

Kansas

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. The Secret IRS Files: Trove of Never-Before-Seen Records Reveal How the Wealthiest Avoid Income Tax published on ProPublica

Kentucky

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. The Six Morning Routines that Will Make You Happier, Healthier and More Productive published on Scott H. Young

Louisiana

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. How to be witty and clever in conversation published on Quartz

Maine

  1. Delta Has Changed the Pandemic Endgame published on The Atlantic 
  2. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek

Maryland

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. The Secret IRS Files: Trove of Never-Before-Seen Records Reveal How the Wealthiest Avoid Income Tax published on ProPublica

Massachusetts

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. The Secret IRS Files: Trove of Never-Before-Seen Records Reveal How the Wealthiest Avoid Income Tax published on ProPublica

Michigan

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Train Your Brain to Remember Anything You Learn With This Simple, 20-Minute Habit published on Inc

Minnesota

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. The Secret IRS Files: Trove of Never-Before-Seen Records Reveal How the Wealthiest Avoid Income Tax published on ProPublica

Mississippi

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. How Fit Can You Get From Just Walking? published on GQ

Missouri

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. How to be witty and clever in conversation published on Quartz 

Montana

  1. Train Your Brain to Remember Anything You Learn With This Simple, 20-Minute Habit published on Inc
  2. Scientist Author Busts Myths About Exercise, Sitting And Sleep : Shots – Health News published on NPR

Nebraska

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Delta Has Changed the Pandemic Endgame published on The Atlantic 

Nevada

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Train Your Brain to Remember Anything You Learn With This Simple, 20-Minute Habit published on Inc

New Hampshire

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. The Six Morning Routines that Will Make You Happier, Healthier and More Productive published on Scott H. Young

New Jersey

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. The Secret IRS Files: Trove of Never-Before-Seen Records Reveal How the Wealthiest Avoid Income Tax published on ProPublica

New Mexico

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Train Your Brain to Remember Anything You Learn With This Simple, 20-Minute Habit published on Inc

New York

  1. Who Is the Bad Art Friend? published on The New York Times Magazine
  2. The Secret IRS Files: Trove of Never-Before-Seen Records Reveal How the Wealthiest Avoid Income Tax  published on ProPublica  

North Carolina

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Train Your Brain to Remember Anything You Learn With This Simple, 20-Minute Habit published on Inc

North Dakota

  1. Inside the Worst-Hit County in the Worst-Hit State in the Worst-Hit Country published on The New Yorker
  2. 5 Questions the Most Interesting People Will Always Ask in Conversations published on Inc

Ohio

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Train Your Brain to Remember Anything You Learn With This Simple, 20-Minute Habit published on Inc

Oklahoma

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. 5 Questions the Most Interesting People Will Always Ask in Conversations published on Inc

Oregon

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Delta Has Changed the Pandemic Endgame published on The Atlantic 

Pennsylvania

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Train Your Brain to Remember Anything You Learn With This Simple, 20-Minute Habit published on Inc

Rhode Island

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Among the Insurrectionists published on The New Yorker 

South Carolina

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. The Six Morning Routines that Will Make You Happier, Healthier and More Productive published on Scott H. Young

South Dakota

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Train Your Brain to Remember Anything You Learn With This Simple, 20-Minute Habit published on Inc

Tennessee

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. The Secret IRS Files: Trove of Never-Before-Seen Records Reveal How the Wealthiest Avoid Income Tax  published on ProPublica  

Texas

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. The Secret IRS Files: Trove of Never-Before-Seen Records Reveal How the Wealthiest Avoid Income Tax  published on ProPublica  

Utah

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. 5 Questions the Most Interesting People Will Always Ask in Conversations published on Inc

Vermont

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Among the Insurrectionists published on The New Yorker

Virginia

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. The Secret IRS Files: Trove of Never-Before-Seen Records Reveal How the Wealthiest Avoid Income Tax  published on ProPublica  

Washington

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Delta Has Changed the Pandemic Endgame published on The Atlantic 

West Virginia

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. The Six Morning Routines that Will Make You Happier, Healthier and More Productive published on Scott H. Young

Wisconsin

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Train Your Brain to Remember Anything You Learn With This Simple, 20-Minute Habit published on Inc

Wyoming

  1. How to Delete Your Old Online Accounts (and Why You Should) published on How To Geek
  2. Train Your Brain to Remember Anything You Learn With This Simple, 20-Minute Habit published on Inc

Learn more about Pocket’s Best of 2021:

The post Pocket’s state-by-state guide to the most popular articles in 2021 appeared first on The Mozilla Blog.


Celebrating Pocket’s Best of 2021

Mozilla Blog - Wed, 01/12/2021 - 15:30

Each December, Pocket celebrates the very best of the web — the must-read profiles, thought-provoking essays, and illuminating explainers that Pocket users saved and read the most over the past 12 months. Today, we’re delighted to bring you Pocket’s Best of 2021: more than a dozen collections spotlighting the year’s top articles across culture, technology, science, business, and more. 

We aren’t the only ones putting out Top 10 content lists or Year in Reviews, but we’d argue these lists are different from the rest — a cut above. Pocket readers are a discerning bunch: they gravitate to fascinating long reads that deeply immerse readers in a story or subject; explainers that demystify complex or poorly understood topics; big ideas that challenge us to think and maybe even act differently; and great advice for all facets of life. You’ll find must-read examples of all of these inside these eclectic Best of 2021 collections, from dozens of trustworthy and diverse publications.

The stories people save most to Pocket often provide a fascinating window into what’s occupying our collective attention each year. In 2019, the most-saved article on Pocket examined how modern economic precarity has turned millennials into the burnout generation. In 2020, the most-read article was a probing and prescient examination of how the Covid-19 pandemic might end.

This year, the No. 1 article in Pocket put a name to the chronic sense of ‘blah’ that so many of us felt in 2021 as the uncertainty of the pandemic wore on: languishing. (For months, heads would nod all over Zoom whenever this article came up in conversation.) To mark the end of the year, we asked Adam Grant, the organizational psychologist and bestselling author who wrote the piece, to curate a special Pocket Collection all about how to leave languishing behind in 2021 — and start flourishing in 2022 by breaking free from stagnation and rekindling your spark. 

What you’ll also find in this year’s Best Of package: A journey through some of 2021’s most memorable events and storylines, as told through 12 exemplary articles that Pocket users saved to help them make sense of it all. Plus, recommendations from this year’s top writers on the unforgettable stories they couldn’t stop reading, and a special collection from those of us at Pocket about 2021 lessons we won’t soon forget.

If you haven’t read these articles yet, save them to your Pocket and dig in over the holidays. While you’re at it, join the millions of people discovering the thought-provoking articles we curate in our daily newsletter and on the Firefox and Chrome new tab pages each and every day.

From all of us at Pocket, have a joyous and safe holiday season and a happy — and flourishing — new year.

Carolyn O’Hara is senior director of content discovery at Pocket. 

The post Celebrating Pocket’s Best of 2021 appeared first on The Mozilla Blog.


Mozilla Localization (L10N): L10n Report: October Edition

Mozilla planet - Fri, 15/10/2021 - 22:08
October 2021 Report

Please note some of the information provided in this report may be subject to change as we are sometimes sharing information about projects that are still in early stages and are not final yet. 

Welcome! New l10n-driver

Welcome eemeli, our new l10n-driver! He will be working on Fluent and Pontoon, and is part of our tech team along with Matjaž. We hope we can all connect soon so you can meet him.

New localizers

Katelem from Obolo locale. Welcome to localization at Mozilla!

Are you a locale leader and want us to include new members in our upcoming reports? Contact us!

New community/locales added

Obolo (ann) locale was added to Pontoon.

New content and projects What’s new or coming up in Firefox desktop

A new major release (MR2) is coming for Firefox desktop with Firefox 94. The deadline to translate content for this version, currently in Beta, is October 24.

While MR2 is not as content heavy as MR1, there are changes to very visible parts of the UI, like the onboarding for both new and existing users. Make sure to check out the latest edition of the Firefox L10n Newsletter for more details, and instructions on how to test.

What’s new or coming up in mobile

Focus for Android and iOS have gone through a new refresh! This was done as part of our ongoing MR2 work – which has also covered Firefox for Android and iOS. You can read about all of this here.

Many of you have been heavily involved in this work, and we thank you for making this MR2 launch across all mobile products such a successful release globally.

We are now starting our next iteration of MR2 releases. We are still currently working on scoping out the mobile work for l10n, so stay tuned.

One thing to note is that the l10n schedule dates for mobile should now be aligned across product operating systems: one l10n release cycle for all of android, and another release cycle for all of iOS. As always, Pontoon deadlines remain your source of truth for this.

What’s new or coming up in web projects Firefox Accounts

The Firefox Accounts team has been working on transitioning from Gettext to Fluent. They are in the middle of migrating server.po to auth.ftl, the component that handles the email feature. Unlike previous migrations, where the localized strings were not part of the plan, this time the team wanted to include them as much as possible. The initial attempt didn’t go as planned due to multiple technical issues. The new auth.ftl file made a brief appearance in Pontoon and is now disabled. They will give it another go after confirming that the identified issues have been addressed and tested.

Legal docs

All the legal docs are translated by our vendor. Some of you have reported translation errors or content that is out of sync with the English source. If you spot any issues (wrong terminology, typos, missing content, to name a few), you can file a bug. Generally we do not encourage localizers to provide translations because of the nature of the content. If the changes are minor, you can create a PR and ask for a peer review to confirm your change before it is merged. If the overall quality is bad, we will ask the vendor to change the translators.

Please note, the locale support for legal docs varies from product to product. Starting this year, the number of supported locales also has decreased to under 20. Some of the previously localized docs are no longer updated. This might be the reason you see your language out of sync with the English source.

Mozilla.org

Five more mobile specific pages were added since our last report. If you need to prioritize them, please give higher priority to the Focus, Index and Compare pages.

What’s new or coming up in SuMo

Lots of new stuff since our last update here in June. Here are some of the highlights:

  • We’re working on refreshing the onboarding experience in SUMO. The content preparation was mostly done in Q3, and the implementation is expected this quarter, before the end of the year.
  • Catch up on what’s new in our support platform by reading our release notes in Discourse. One highlight of the past quarter is that we integrated the Zendesk form for Mozilla VPN into SUMO. We don’t have the capability to detect subscribers at the moment, so everyone can file a ticket now, but we’re hoping to add that capability in the future.
  • Firefox Focus has joined our Play Store support efforts. Contributors should now be able to reply to Google Play Store reviews for Firefox Focus from Conversocial. We also created this guideline to help contributors compose replies for Firefox Focus reviews.
  • We welcomed 2 new team members in Q3. Joe, our Support Operations Manager, is now taking care of the premium customer support experience. And Abby, the new Content Manager, is our team’s latest addition; she will be working closely with Fabi and our KB contributors to improve our help content.

You’re always welcome to join our Matrix or the contributor forum to talk more about anything related to support!

What’s new or coming up in Pontoon

Submit your ideas and report bugs via GitHub

We have enabled GitHub Issues in the Pontoon repository and made it the new place for tracking bugs, enhancements and tasks for Pontoon development. At the same time, we have disabled the Pontoon Component in Bugzilla, and imported all open bugs into GitHub Issues. Old bugs are still accessible on their existing URLs. For reporting security vulnerabilities, we’ll use a newly created component in Bugzilla, which allows us to hide security problems from the public until they are resolved.

Using GitHub Issues will make it easier for the development team to resolve bugs via commit messages and put them on a Roadmap, which will also be moved to GitHub soon. We also hope GitHub Issues will make suggesting ideas and reporting issues easier for the users. Let us know if you run into any issues or have any questions!

More improvements to the notification system coming

As part of our H1 effort to better understand how notifications are being used, the following features have received most votes in a localizer survey:

  • Notifications for new strings should link to the group of strings added.
  • For translators and locale managers, get notifications when there are pending suggestions to review.
  • Add the ability to opt-out of specific notifications.

Thanks to eemeli, the first item was resolved back in August. The second feature has also been implemented, which means reviewers will receive weekly notifications about newly created unreviewed suggestions within the last week. Work on the last item – ability to opt-out of specific notification types – has started.

Newly published localizer facing documentation

We published two new posts in the Localization category on Discourse:

Events
  • Michal Stanke shared his experience as a volunteer in the open source community at the annual International Translation Day event hosted by WordPress! Way to go!
  • Want to showcase an event coming up that your community is participating in? Reach out to any l10n-driver and we’ll include that (see links to emails at the bottom of this report)
Useful Links

Questions? Want to get involved?
  • If you want to get involved, or have any question about l10n, reach out to:

Did you enjoy reading this report? Let us know how we can improve by reaching out to any one of the l10n-drivers listed above.


Niko Matsakis: Dyn async traits, part 6

Mozilla planet - Fri, 15/10/2021 - 21:57

A quick update to my last post: first, a better way to do what I was trying to do, and second, a sketch of the crate I’d like to see for experimental purposes.

An easier way to roll our own boxed dyn traits

In the previous post I covered how you could create vtables and pair them up with a data pointer to kind of “roll your own dyn”. After I published the post, though, dtolnay sent me this Rust playground link to show me a much better approach, one based on the erased-serde crate. The idea is that instead of making a “vtable struct” with a bunch of fn pointers, we create a “shadow trait” that reflects the contents of that vtable:

// erased trait:
trait ErasedAsyncIter {
    type Item;
    fn next<'me>(&'me mut self) -> Pin<Box<dyn Future<Output = Option<Self::Item>> + 'me>>;
}

Then the DynAsyncIter struct can just be a boxed form of this trait:

pub struct DynAsyncIter<'data, Item> {
    pointer: Box<dyn ErasedAsyncIter<Item = Item> + 'data>,
}

We define the “shim functions” by implementing ErasedAsyncIter for all T: AsyncIter:

impl<T> ErasedAsyncIter for T
where
    T: AsyncIter,
{
    type Item = T::Item;
    fn next<'me>(&'me mut self) -> Pin<Box<dyn Future<Output = Option<Self::Item>> + 'me>> {
        // This code allocates a box for the result
        // and coerces into a dyn:
        Box::pin(AsyncIter::next(self))
    }
}

And finally we can implement the AsyncIter trait for the dynamic type:

impl<'data, Item> AsyncIter for DynAsyncIter<'data, Item> {
    type Item = Item;
    type Next<'me>
    where
        Item: 'me,
        'data: 'me,
    = Pin<Box<dyn Future<Output = Option<Item>> + 'me>>;

    fn next(&mut self) -> Self::Next<'_> {
        self.pointer.next()
    }
}

Yay, it all works, and without any unsafe code!

What I’d like to see

This “convert to dyn” approach isn’t really specific to async (as erased-serde shows). I’d like to see a decorator that applies it to any trait. I imagine something like:

// Generates the `DynAsyncIter` type shown above:
#[derive_dyn(DynAsyncIter)]
trait AsyncIter {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

But this ought to work with any -> impl Trait return type, too, so long as Trait is dyn safe and implemented for Box<T>. So something like this:

// Generates the `DynSillyIterTools` type:
#[derive_dyn(DynSillyIterTools)]
trait SillyIterTools: Iterator {
    // Iterate over the iter in pairs of two items.
    fn pair_up(&mut self) -> impl Iterator<(Self::Item, Self::Item)>;
}

would generate an erased trait that returns a Box<dyn Iterator<(...)>>. Similarly, you could do a trick with taking any impl Foo and passing in a Box<dyn Foo>, so you can support impl Trait in argument position.

Even without impl trait, derive_dyn would create a more ergonomic dyn to play with.

I don’t really see this as a “long term solution”, but I would be interested to play with it.

Comments?

I’ve created a thread on internals if you’d like to comment on this post, or others in this series.


Niko Matsakis: Dyn async traits, part 5

Mozilla planet - Thu, 14/10/2021 - 19:46

If you’re willing to use nightly, you can already model async functions in traits by using GATs and impl Trait — this is what the Embassy async runtime does, and it’s also what the real-async-trait crate does. One shortcoming, though, is that your trait doesn’t support dynamic dispatch. In the previous posts of this series, I have been exploring some of the reasons for that limitation, and what kind of primitive capabilities need to be exposed in the language to overcome it. My thought was that we could try to stabilize those primitive capabilities with the plan of enabling experimentation. I am still in favor of this plan, but I realized something yesterday: using procedural macros, you can ALMOST do this experimentation today! Unfortunately, it doesn’t quite work owing to some relatively obscure rules in the Rust type system (perhaps some clever readers will find a workaround; that said, these are rules I have wanted to change for a while).

Just to be crystal clear: Nothing in this post is intended to describe an “ideal end state” for async functions in traits. I still want to get to the point where one can write async fn in a trait without any further annotation and have the trait be “fully capable” (support both static dispatch and dyn mode while adhering to the tenets of zero-cost abstractions [1]). But there are some significant questions there, and to find the best answers for those questions, we need to enable more exploration, which is the point of this post.

Code is on github

The code covered in this blog post has been prototyped and is available on github. See the caveat at the end of the post, though!

Design goal

To see what I mean, let’s return to my favorite trait, AsyncIter:

trait AsyncIter {
    type Item;

    async fn next(&mut self) -> Option<Self::Item>;
}

The post is going to lay out how we can transform a trait declaration like the one above into a series of declarations that achieve the following:

  • We can use it as a generic bound (fn foo<T: AsyncIter>()), in which case we get static dispatch, full auto trait support, and all the other goodies that normally come with generic bounds in Rust.
  • Given a T: AsyncIter, we can coerce it into some form of DynAsyncIter that uses virtual dispatch. In this case, the type doesn’t reveal the specific T or the specific types of the futures.
    • I wrote DynAsyncIter, and not dyn AsyncIter on purpose — we are going to create our own type that acts like a dyn type, but which manages the adaptations needed for async.
    • For simplicity, let’s assume we want to box the resulting futures. Part of the point of this design though is that it leaves room for us to generate whatever sort of wrapping types we want.

You could write the code I’m showing here by hand, but the better route would be to package it up as a kind of decorator (e.g., #[async_trait_v2] [2]).

The basics: trait with a GAT

The first step is to transform the trait to have a GAT and a regular fn, in the way that we’ve seen many times:

trait AsyncIter {
    type Item;

    type Next<'me>: Future<Output = Option<Self::Item>>
    where
        Self: 'me;

    fn next(&mut self) -> Self::Next<'_>;
}

Next: define a “DynAsyncIter” struct

The next step is to manage the virtual dispatch (dyn) version of the trait. To do this, we are going to “roll our own” object by creating a struct DynAsyncIter. This struct plays the role of a Box<dyn AsyncIter> trait object. Instances of the struct can be created by calling DynAsyncIter::from with some specific iterator type; the DynAsyncIter type implements the AsyncIter trait, so once you have one you can just call next as usual:

let the_iter: DynAsyncIter<u32> = DynAsyncIter::from(some_iterator);
process_items(&mut the_iter);

async fn sum_items(iter: &mut impl AsyncIter<Item = u32>) -> u32 {
    let mut s = 0;
    while let Some(v) = iter.next().await {
        s += v;
    }
    s
}

Struct definition

Let’s look at how this DynAsyncIter struct is defined. First, we are going to “roll our own” object by creating a struct DynAsyncIter. This struct is going to model a Box<dyn AsyncIter> trait object; it will have one generic parameter for every ordinary associated type declared in the trait (not including the GATs we introduced for async fn return types). The struct itself has two fields, the data pointer (a box, but in raw form) and a vtable. We don’t know the type of the underlying value, so we’ll use ErasedData for that:

type ErasedData = ();

pub struct DynAsyncIter<Item> {
    data: *mut ErasedData,
    vtable: &'static DynAsyncIterVtable<Item>,
}

For the vtable, we will make a struct that contains a fn for each of the methods in the trait. Unlike the builtin vtables, we will modify the return type of these functions to be a boxed future:

struct DynAsyncIterVtable<Item> {
    drop_fn: unsafe fn(*mut ErasedData),
    next_fn: unsafe fn(&mut *mut ErasedData) -> Box<dyn Future<Output = Option<Item>> + '_>,
}

Implementing the AsyncIter trait

Next, we can implement the AsyncIter trait for the DynAsyncIter type. For each of the new GATs we introduced, we simply use a boxed future type. For the method bodies, we extract the function pointer from the vtable and call it:

impl<Item> AsyncIter for DynAsyncIter<Item> {
    type Item = Item;
    type Next<'me> = Box<dyn Future<Output = Option<Item>> + 'me>;

    fn next(&mut self) -> Self::Next<'_> {
        let next_fn = self.vtable.next_fn;
        unsafe { next_fn(&mut self.data) }
    }
}

The unsafe keyword here is asserting that the safety conditions of next_fn are met. We’ll cover that in more detail later, but in short those conditions are:

  • The vtable corresponds to some erased type T: AsyncIter…
  • …and each instance of *mut ErasedData points to a valid Box<T> for that type.
Dropping the object

Speaking of Drop, we do need to implement that as well. It too will call through the vtable:

impl<Item> Drop for DynAsyncIter<Item> {
    fn drop(&mut self) {
        let drop_fn = self.vtable.drop_fn;
        unsafe {
            drop_fn(self.data);
        }
    }
}

We need to call through the vtable because we don’t know what kind of data we have, so we can’t know how to drop it correctly.

Creating an instance of DynAsyncIter

To create one of these DynAsyncIter objects, we can implement the From trait. This allocates a box, coerces it into a raw pointer, and then combines that with the vtable:

impl<Item, T> From<T> for DynAsyncIter<Item>
where
    T: AsyncIter<Item = Item>,
{
    fn from(value: T) -> DynAsyncIter<Item> {
        let boxed_value = Box::new(value);
        DynAsyncIter {
            data: Box::into_raw(boxed_value) as *mut (),
            vtable: dyn_async_iter_vtable::<T>(), // we'll cover this fn later
        }
    }
}

Creating the vtable shims

Now we come to the most interesting part: how do we create the vtable for one of these objects? Recall that our vtable was a struct like so:

struct DynAsyncIterVtable<Item> {
    drop_fn: unsafe fn(*mut ErasedData),
    next_fn: unsafe fn(&mut *mut ErasedData) -> Box<dyn Future<Output = Option<Item>> + '_>,
}

We are going to need to create the values for each of those fields. In an ordinary dyn, these would be pointers directly to the methods from the impl, but for us they are “wrapper functions” around the core trait functions. The role of these wrappers is to introduce some minor coercions, such as allocating a box for the resulting future, as well as to adapt from the “erased data” to the true type:

// Safety conditions:
//
// The `*mut ErasedData` is actually the raw form of a `Box<T>`
// that is valid for 'a.
unsafe fn next_wrapper<'a, T>(
    this: &'a mut *mut ErasedData,
) -> Box<dyn Future<Output = Option<T::Item>> + 'a>
where
    T: AsyncIter,
{
    let unerased_this: &mut Box<T> = unsafe { &mut *(this as *mut Box<T>) };
    let future: T::Next<'_> = <T as AsyncIter>::next(unerased_this);
    Box::new(future)
}

We’ll also need a “drop” wrapper:

// Safety conditions:
//
// The `*mut ErasedData` is actually the raw form of a `Box<T>`
// and this function is being given ownership of it.
unsafe fn drop_wrapper<T>(
    this: *mut ErasedData,
) where
    T: AsyncIter,
{
    let unerased_this = Box::from_raw(this as *mut T);
    drop(unerased_this); // Execute destructor as normal
}

Constructing the vtable

Now that we’ve defined the wrappers, we can construct the vtable itself. Recall that the From impl called a function dyn_async_iter_vtable::<T>. That function looks like this:

fn dyn_async_iter_vtable<T>() -> &'static DynAsyncIterVtable<T::Item>
where
    T: AsyncIter,
{
    const {
        &DynAsyncIterVtable {
            drop_fn: drop_wrapper::<T>,
            next_fn: next_wrapper::<T>,
        }
    }
}

This constructs a struct with the two function pointers: this struct only contains static data, so we are allowed to return a &'static reference to it.

Done!

And now the caveat, and a plea for help

Unfortunately, this setup doesn’t work quite how I described it. There are two problems:

  • const functions and expressions still have a lot of limitations, especially around generics like T, and I couldn’t get them to work;
  • Because of the rules introduced by RFC 1214, the &'static DynAsyncIterVtable<T::Item> type requires that T::Item: 'static, which may not be true here. This condition perhaps shouldn’t be necessary, but the compiler currently enforces it.

I wound up hacking something terrible that erased the T::Item type into usize and used Box::leak to get a &'static reference, just to prove out the concept. I’m almost embarrassed to show the code, but there it is.

Anyway, I know people have done some pretty clever tricks, so I’d be curious to know if I’m missing something and there is a way to build this vtable on Rust today. Regardless, it seems like extending const and a few other things to support this case is a relatively light lift, if we wanted to do that.

Conclusion

This blog post presented a way to implement the dyn dispatch ideas I’ve been talking about using only features that currently exist and are generally en route to stabilization. That’s exciting to me, because it means that we can start to do measurements and experimentation. For example, I would really like to know the performance impact of transitioning from async-trait to a scheme that uses a combination of static dispatch and boxed dynamic dispatch as described here. I would also like to explore whether there are other ways to wrap futures (e.g., with task-local allocators or other smart pointers) that might perform better. This would help inform what kind of capabilities we ultimately need.

Looking beyond async, I’m interested in tinkering with different models for dyn in general. As an obvious example, the “always boxed” version I implemented here has some runtime cost (an allocation!) and isn’t applicable in all environments, but it would be far more ergonomic. Trait objects would be Sized and would transparently work in far more contexts. We can also prototype different kinds of vtable adaptation.

  1. In the words of Bjarne Stroustrup, “What you don’t use, you don’t pay for. And further: What you do use, you couldn’t hand code any better.” 

  2. Egads, I need a snazzier name than that! 


Jan-Erik Rediger: Fenix Physical Device Testing

Mozilla planet - Thu, 14/10/2021 - 17:00

The Firefox for Android (Fenix) project runs extensive tests on every pull request and when merging code back into the main branch.

While many tests run within an isolated Java environment, Fenix also contains a multitude of UI tests. They allow testing the full application, interaction with the UI and other events. Running these requires the Android emulator running or a physical Android device connected. To run these tests in the CI environment the Fenix team relies on the Firebase test lab, a cloud-based testing service offering access to a range of physical and virtual devices to run Android applications on.

To speed up development, the automatically scheduled tests associated with a pull request are only run on virtual devices. These are quick to spin up, there is basically no upper limit of devices that can spawn on the cloud infrastructure and they usually produce the same result as running the test on a physical device.

But once in a while you encounter a bug that can only be reproduced reliably on a physical device. If you don't have access to such a device, what do you do? Or you know the bug happens on that one specific device type you don’t have?

You remember that the Firebase Test Lab offers physical devices as well and the Fenix repository is very well set up to run your test on these too if needed!

Here's how you change the CI configuration to do this.

NOTE: Do not land a Pull Request that switches CI from virtual to physical devices! Add the pr:do-not-land label and call out that the PR is only there for testing!

By default the Fenix CI runs tests using virtual devices on x86. That's faster when the host is also a x86(_64) system, but most physical devices use the Arm platform. So first we need to instruct it to run tests on Arm.

Which platform to test on is defined in taskcluster/ci/ui-test/kind.yml. Find the line where it downloads the target.apk produced in a previous step and change it from x86 to arm64-v8a:

run:
  commands:
-    - [wget, {artifact-reference: '<signing/public/build/x86/target.apk>'}, '-O', app.apk]
+    - [wget, {artifact-reference: '<signing/public/build/arm64-v8a/target.apk>'}, '-O', app.apk]

Then look for the line where it invokes the ui-test.sh and tell it to use arm64-v8a again:

run:
  commands:
-    - [automation/taskcluster/androidTest/ui-test.sh, x86, app.apk, android-test.apk, '-1']
+    - [automation/taskcluster/androidTest/ui-test.sh, arm64-v8a, app.apk, android-test.apk, '-1']

With the old CI configuration it will look for Firebase parameters in automation/taskcluster/androidTest/flank-x86.yml. Now that we switched the architecture it will pick up automation/taskcluster/androidTest/flank-arm64-v8a.yml instead.

In that file we can now pick the device we want to run on:

device:
-  - model: Pixel2
+  - model: dreamlte
     version: 28

You can get a list of available devices by running gcloud locally:

gcloud firebase test android models list

The value from the MODEL_ID column is what you use for the model parameter in flank-arm64-v8a.yml. dreamlte translates to a Samsung Galaxy S8, which is available on Android API version 28.

If you only want to run a subset of tests define the test-targets:

test-targets:
  - class org.mozilla.fenix.glean.BaselinePingTest

Specify an exact test class as above to run tests from just that class.

And that's all the configuration necessary. Save your changes, commit them, then push up your code and create a pull request. Once the decision task on your PR finishes you will find a ui-test-x86-debug job (yes, x86, we didn't rename the job). Its log file will have details on the test run and contain links to the test run summary. Follow them to get more details, including the logcat output and a video of the test run.

This explanation will eventually move into documentation for Mozilla's Android projects.
Thanks to Richard Pappalardo & Aaron Train for the help figuring out how to run tests on physical devices and early feedback on the post. Thanks to Will Lachance for feedback and corrections. Any further errors are mine alone.


Mozilla Attack & Defense: Implementing form filling and accessibility in the Firefox PDF viewer

Mozilla planet - Thu, 14/10/2021 - 16:23
Intro

Last year, during lockdown, many discovered the importance of PDF forms when having to deal remotely with administrations and large organizations like banks. Firefox supported displaying PDF forms, but it didn’t support filling them: users had to print them, fill them by hand, and scan them back to digital form. We decided it was time to reinvest in the PDF viewer (PDF.js) and support filling PDF forms within Firefox to make our users’ lives easier.

While we invested more time in the PDF viewer, we also went through the backlog of work and prioritized improving the accessibility of our PDF reader for users of assistive technologies. Below we’ll describe how we implemented the form support, improved accessibility, and made sure we had no regressions along the way.

Brief Summary of the PDF.js Architecture

Overview of the PDF.js Architecture
To understand how we added support for forms and tagged PDFs, it’s first important to understand some basics about how the PDF viewer (PDF.js) works in Firefox.

First, PDF.js will fetch and parse the document in a web worker. The parsed document will then generate drawing instructions. PDF.js sends them to the main thread and draws them on an HTML5 canvas element.

Besides the canvas, PDF.js potentially creates three more layers that are displayed on top of it. The first layer, the text layer, enables text selection and search. It contains span elements that are transparent and line up with the text drawn below them on the canvas. The other two layers are the Annotation/AcroForm layer and the XFA form layer. They support form filling and we will describe them in more detail below.

Filling Forms (AcroForms)

AcroForms are one of two types of forms that PDF supports, the most common type of form.

AcroForm structure

Within a PDF file, the form elements are stored in the annotation data. Annotations in PDF are separate elements from the main content of a document. They are often used for things like taking notes on a document or drawing on top of a document. AcroForm annotation elements support user input similar to HTML input e.g. text, check boxes, radio buttons.

AcroForm implementation

In PDF.js, we parse a PDF file and create the annotations in a web worker. Then, we send them out from the worker and render them in the main process using HTML elements inserted in a div (annotation layer). We render this annotation layer, composed of HTML elements, on top of the canvas layer.

The annotation layer works well for displaying the form elements in the browser, but it was not compatible with the way PDF.js supports printing. When printing a PDF, we draw its contents on a special printing canvas, insert it into the current document and send it to the printer. To support printing form elements with user input, we needed to draw them on the canvas.

By inspecting (with the help of the qpdf tool) the raw PDF data of forms saved using other tools, we discovered that we needed to save the appearance of a filled field by using some PDF drawing instructions, and that we could support both saving and printing with a common implementation.

To generate the field appearance, we needed to get the values entered by the user. We introduced an object called annotationStorage to store those values by using callback functions in the corresponding HTML elements. The annotationStorage is then passed to the worker when saving or printing, and the values for each annotation are used to create an appearance.
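
To make the idea concrete, here is a minimal sketch of such a storage object and of wiring it to a field in the annotation layer. It is only an illustration: the class, the method names and the data-annotation-id selector are simplified stand-ins, not the actual PDF.js API.

// Simplified sketch of the idea behind annotationStorage; the real
// PDF.js class has a richer API.
class SimpleAnnotationStorage {
  constructor() {
    this._values = new Map(); // annotation id -> user-entered value
  }
  setValue(annotationId, value) {
    this._values.set(annotationId, value);
  }
  getValue(annotationId, defaultValue) {
    return this._values.has(annotationId)
      ? this._values.get(annotationId)
      : defaultValue;
  }
  // Plain object that can be sent to the worker for saving or printing.
  serializable() {
    return Object.fromEntries(this._values);
  }
}

const storage = new SimpleAnnotationStorage();
// Hook up an HTML text field created in the annotation layer.
const input = document.querySelector('input[data-annotation-id="12R"]');
input.addEventListener("input", () => storage.setValue("12R", input.value));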

Example PDF.js Form Rendering

On top: a filled form in Firefox. On the bottom: the printed PDF opened in Evince.

Safely Executing JavaScript within PDFs

Thanks to our Telemetry, we discovered that many forms contain and use embedded JavaScript code (yes, that’s a thing!).

JavaScript in PDFs can be used for many things, but is most commonly used to validate data entered by the user or automatically calculate formulas. For example, in this PDF, tax calculations are performed automatically starting from user input. Since this feature is common and helpful to users, we set out to implement it in PDF.js.
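
As a rough illustration of what such embedded scripts look like, a calculation action attached to a total field can read other fields through the Acrobat JavaScript API (this.getField, event.value) and set its own value. The field names and the tax rate below are made up for the example.

// Illustrative calculation script for a hypothetical "total" field.
var price = Number(this.getField("price").value);
var quantity = Number(this.getField("quantity").value);
var taxRate = 0.2; // made-up rate for the example
event.value = price * quantity * (1 + taxRate);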

The alternatives

From the start of our JavaScript implementation, our main concern was security. We did not want PDF files to become a new vector for attacks. Embedded JS code must be executed when a PDF is loaded or on events generated by form elements (focus, input, …).

We investigated using the following:

  1. JS eval function
  2. JS engine compiled in WebAssembly with emscripten
  3. Firefox JS engine ComponentUtils.Sandbox

The first option, while simple, was immediately discarded since running untrusted code in eval is very unsafe.

Option two, using a JS engine compiled with WebAssembly, was a strong contender since it would work with the built-in Firefox PDF viewer and the version of PDF.js that can be used in regular websites. However, it would have been a large new attack surface to audit. It would have also considerably increased the size of PDF.js and it would have been slower.

The third option, sandboxes, is a feature exposed to privileged code in Firefox that allows JS execution in a special isolated environment. The sandbox is created with a null principal, which means that everything within the sandbox can only be accessed by it and can only access other things within the sandbox itself (and by privileged Firefox code).

Our final choice

We settled on using a ComponentUtils.Sandbox for the Firefox built-in viewer. ComponentUtils.Sandbox has been used for years now in WebExtensions, so this implementation is battle tested and very safe: executing a script from a PDF is at least as safe as executing one from a normal web page.

For the generic web viewer (where we can only use standard web APIs, so we know nothing about ComponentUtils.Sandbox) and the pdf.js test suite we used a WebAssembly version of QuickJS (see pdf.js.quickjs for details).

The implementation of the PDF sandbox in Firefox works as follows:

  • We collect all the fields and their properties (including the JS actions associated with them) and then clone them into the sandbox;
  • At build time, we generate a bundle with the JS code to implement the PDF JS API (totally different from the web API we are accustomed to!). We load it in the sandbox and then execute it with the data collected during the first step;
  • In the HTML representation of the fields we added callbacks to handle the events (focus, input, …). The callbacks simply dispatch them into the sandbox through an object containing the field identifier and linked parameters. We execute the corresponding JS actions in the sandbox using eval (it’s safe in this case: we’re in a sandbox). Then, we clone the result and dispatch it outside the sandbox to update the states in the HTML representations of the fields. A sketch of this round trip is shown after this list.
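
The sketch below shows the general shape of that round trip. It is only an illustration: dispatchToSandbox, handleSandboxEvent and the action table are invented names for the example and do not match the actual PDF.js or Firefox sandbox APIs.

// Outside the sandbox: turn an input event on the HTML field into a plain,
// structured-clone-friendly message and send it in.
const field = document.querySelector('input[data-annotation-id="12R"]');
field.addEventListener("input", () => {
  const result = dispatchToSandbox({
    id: "12R",          // field identifier
    name: "Keystroke",  // which action to run
    value: field.value, // linked parameters
  });
  // The (cloned) result coming back from the sandbox updates the HTML field.
  if (result && result.id === "12R") {
    field.value = result.value;
  }
});

// Inside the sandbox: the JS actions collected from the PDF, keyed by field id.
const fieldActions = {
  "12R": { Keystroke: "event.value = event.value.toUpperCase();" },
};

function handleSandboxEvent(event) {
  const action = (fieldActions[event.id] || {})[event.name];
  if (action) {
    eval(action); // acceptable here: the code cannot escape the sandbox
  }
  return { id: event.id, value: event.value };
}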

We decided not to implement the PDF APIs related to I/O (network, disk, …) to avoid any security concerns.

Yet Another Form Format: XFA

Our Telemetry also informed us that another type of PDF forms, XFA, was fairly common. This format has been removed from the official PDF specification, but many PDFs with XFA still exist and are viewed by our users so we decided to implement it as well.

The XFA format

The XFA format is very different from what is usually in PDF files. A normal PDF is typically a list of drawing commands with all layout statically defined by the PDF generator. However, XFA is much closer to HTML and has a more dynamic layout that the PDF viewer must generate. In reality XFA is a totally different format that was bolted on to PDF.

The XFA entry in a PDF contains multiple XML streams: the most important being the template and datasets. The template XML contains all the information required to render the form: it contains the UI elements (e.g. text fields, checkboxes, …) and containers (subform, draw, …) which can have static or dynamic layouts. The datasets XML contains all the data used by the form itself (e.g. text field content, checkbox state, …). All these data are bound into the template (before layout) to set the values of the different UI elements.

Example Template

<template xmlns="http://www.xfa.org/schema/xfa-template/3.6/">
  <subform>
    <pageSet name="ps">
      <pageArea name="page1" id="Page1">
        <contentArea x="7.62mm" y="30.48mm" w="200.66mm" h="226.06mm"/>
        <medium stock="default" short="215.9mm" long="279.4mm"/>
      </pageArea>
    </pageSet>
    <subform>
      <draw name="Text1" y="10mm" x="50mm" w="200mm" h="7mm">
        <font size="15pt" typeface="Helvetica"/>
        <value>
          <text>Hello XFA &amp; PDF.js world !</text>
        </value>
      </draw>
    </subform>
  </subform>
</template>

Output From Template

Rendering of XFA Document

The XFA implementation

In PDF.js we already had a pretty good XML parser to retrieve metadata about PDFs: it was a good start.

We decided to map every XML node to a JavaScript object, whose structure is used to validate the node (e.g. possible children and their different numbers). Once the XML is parsed and validated, the form data needs to be bound in the form template and some prototypes can be used with the help of SOM expressions (kind of XPath expressions).
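
Conceptually (the real PDF.js classes are more involved, and the names below are invented for the example), each template element becomes a JavaScript object that knows which children it accepts and how many of them are allowed, which is enough to validate the parsed tree:

// Minimal sketch: one JavaScript "node" object per XFA element,
// carrying the validation rules for its children.
class XFAObject {
  constructor(name, allowedChildren) {
    this.name = name;
    // child element name -> maximum number of occurrences (Infinity = any)
    this.allowedChildren = allowedChildren;
    this.children = [];
  }

  appendChild(child) {
    const max = this.allowedChildren[child.name];
    if (max === undefined) {
      throw new Error(`<${child.name}> is not allowed inside <${this.name}>`);
    }
    const count = this.children.filter((c) => c.name === child.name).length;
    if (count >= max) {
      throw new Error(`Too many <${child.name}> elements in <${this.name}>`);
    }
    this.children.push(child);
  }
}

// e.g. a <pageArea> may contain one <medium> and any number of <contentArea>s.
const pageArea = new XFAObject("pageArea", { medium: 1, contentArea: Infinity });
pageArea.appendChild(new XFAObject("contentArea", {}));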

The layout engine

In XFA, we can have different kinds of layouts and the final layout depends on the contents. We initially planned to piggyback on the Firefox layout engine, but we discovered that unfortunately we would need to lay everything out ourselves because XFA uses some layout features which don’t exist in Firefox. For example, when a container is overflowing the extra contents can be put in another container (often on a new page, but sometimes also in another subform). Moreover, some template elements don’t have any dimensions, which must be inferred based on their contents.

In the end we implemented a custom layout engine: we traverse the template tree from top to bottom and, following layout rules, check if an element fits into the available space. If it doesn’t, we flush all the elements laid out so far into the current content area, and we move to the next one.
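
In pseudocode, the traversal described above looks roughly like this (a deliberately simplified sketch, not the actual engine; measure and flush are placeholder helpers):

// Simplified sketch of the layout loop: lay elements out top to bottom,
// and move to the next content area (often a new page) on overflow.
function layOut(templateNodes, contentAreas) {
  let areaIndex = 0;
  let used = 0;     // vertical space consumed in the current content area
  let pending = []; // elements laid out but not yet flushed

  for (const node of templateNodes) {
    const height = measure(node); // may need font metrics when no size is given
    if (used + height > contentAreas[areaIndex].height) {
      flush(pending, contentAreas[areaIndex]); // commit what we have so far
      pending = [];
      areaIndex += 1; // continue in the next content area
      used = 0;
    }
    pending.push(node);
    used += height;
  }
  flush(pending, contentAreas[areaIndex]);
}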

During layout, we convert all the XML elements into JavaScript objects with a tree structure. Then, we send them to the main process to be converted into HTML elements and placed in the XFA layer.

The missing font problem

As mentioned above, the dimensions of some elements are not specified. We must compute them ourselves based on the font used in them. This is even more challenging because sometimes fonts are not embedded in the PDF file.

Not embedding fonts in a PDF is considered bad practice, but in reality many PDFs do not include some well-known fonts (e.g. the ones shipped by Acrobat or Windows: Arial, Calibri, …) as PDF creators simply expected them to be always available.

To have our output more closely match Adobe Acrobat, we decided to ship the Liberation fonts and glyph widths of well-known fonts. We used the widths to rescale the glyph drawing to have compatible font substitutions for all the well-known fonts.
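
The rescaling itself boils down to a per-glyph scale factor between the metrics of the missing font and those of the substituted Liberation font. The sketch below only illustrates the idea; the width tables we ship and the drawing code are more detailed.

// Scale each substituted glyph so it occupies the same advance width as the
// glyph of the missing font (widths expressed in 1000-unit font space).
function scaleFactors(missingFontWidths, liberationWidths) {
  const factors = {};
  for (const [glyph, width] of Object.entries(missingFontWidths)) {
    const substituteWidth = liberationWidths[glyph];
    factors[glyph] = substituteWidth ? width / substituteWidth : 1;
  }
  return factors;
}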

Comparing glyph rescaling

On the left: default font without glyph rescaling. On the right: Liberation font with glyph rescaling to emulate MyriadPro.

The result

In the end the result turned out quite good, for example, you can now open PDFs such as 5704 – APPLICATION FOR A FISH EXPORT LICENCE in Firefox 93!

Making PDFs accessible What is a Tagged PDF?

Early versions of PDFs were not a friendly format for accessibility tools such as screen readers. This was mainly because within a document, all text on a page is more or less absolutely positioned and there’s not a notion of a logical structure such as paragraphs, headings or sentences. There was also no way to provide a text description of images or figures. For example, some pseudo code for how a PDF may draw text:

showText("This", 0 /*x*/, 60 /*y*/);
showText("is", 0, 40);
showText("a", 0, 20);
showText("Heading!", 0, 0);

This would draw text as four separate lines, but a screen reader would have no idea that they were all part of one heading. To help with accessibility, later versions of the PDF specification introduced “Tagged PDF.” This allowed PDFs to create a logical structure that screen readers could then use. One can think of this as a similar concept to an HTML hierarchy of DOM nodes. Using the example above, one could add tags:

beginTag("heading 1");
showText("This", 0 /*x*/, 60 /*y*/);
showText("is", 0, 40);
showText("a", 0, 20);
showText("Heading!", 0, 0);
endTag("heading 1");

With the extra tag information, a screen reader knows that all of the lines are part of “heading 1” and can read it in a more natural fashion. The structure also allows screen readers to easily navigate to different parts of the document.

The above example is only about text, but tagged PDFs support many more features than this e.g. alt text for images, table data, lists, etc.

How we supported Tagged PDFs in PDF.js

For tagged PDFs we leveraged the existing “text layer” and the browsers built in HTML ARIA accessibility features. We can easily see this by a simple PDF example with one heading and one paragraph. First, we generate the logical structure and insert it into the canvas:

<canvas id="page1">
  <!-- This content is not visible, but available to screen readers -->
  <span role="heading" aria-level="1" aria-owns="heading_id"></span>
  <span aria-owns="some_paragraph"></span>
</canvas>

In the text layer that overlays the canvas:

<div id="text_layer">
  <span id="heading_id">Some Heading</span>
  <span id="some_paragraph">Hello world!</span>
</div>

A screen reader would then walk the DOM accessibility tree in the canvas and use the `aria-owns` attributes to find the text content for each node. For the above example, a screen reader would announce:

Heading Level 1 Some Heading
Hello World!

For those not familiar with screen readers, having this extra structure also makes navigating around the PDF much easier: you can jump from heading to heading and read paragraphs without unneeded pauses.

Ensure there are no regressions at scale, meet reftests

Reference Test Analyzer

Crawling for PDFs

Over the past few months, we have built a web crawler to retrieve PDFs from the web and, using a set of heuristics, collect statistics about them (e.g. are they XFA? What fonts are they using? What formats of images do they include?).

We have also used the crawler with its heuristics to retrieve PDFs of interest from the “stressful PDF corpus” published by the PDF association, which proved particularly interesting as they contained many corner cases we did not think could exist.

With the crawler, we were able to build a large corpus of Tagged PDFs (around 32000), PDFs using JS (around 1900), XFA PDFs (around 1200) which we could use for manual and automated testing. Kudos to our QA team for going through so many PDFs! They now know everything about asking for a fishing license in Canada, life skills!

Reftests for the win

We did not only use the corpus for manual QA, but also added some of those PDFs to our list of reftests (reference tests).

A reftest is a test consisting of a test file and a reference file. The test file uses the pdf.js rendering engine, while the reference file doesn’t (to make sure it is consistent and can’t be affected by changes in the patch the test is validating). The reference file is simply a screenshot of the rendering of a given PDF from the “master” branch of pdf.js.

The reftest process

When a developer submits a change to the PDF.js repo, we run the reftests and ensure the rendering of the test file is exactly the same as the reference screenshot. If there are differences, we ensure that the differences are improvements rather than regressions.

After accepting and merging a change, we regenerate the references.
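
At its core the comparison is a pixel-by-pixel check of two screenshots. A minimal sketch of that idea (not the actual pdf.js test harness) could look like this:

// Compare two same-sized RGBA snapshots pixel by pixel and report how many
// pixels differ, so a reviewer can judge whether a change is a regression.
function countDifferentPixels(testPixels, referencePixels) {
  let differences = 0;
  for (let i = 0; i < testPixels.length; i += 4) {
    if (
      testPixels[i] !== referencePixels[i] ||         // R
      testPixels[i + 1] !== referencePixels[i + 1] || // G
      testPixels[i + 2] !== referencePixels[i + 2] || // B
      testPixels[i + 3] !== referencePixels[i + 3]    // A
    ) {
      differences += 1;
    }
  }
  return differences;
}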

The reftest shortcomings

In some situations a test may have subtle differences in rendering compared to the reference due to, e.g., anti-aliasing. This introduces noise in the results, with “fake” regressions the developer and reviewer have to sift through. Sometimes, it is possible to miss real regressions because of the large number of differences to look at.

Another shortcoming of reftests is that they are often big: a single test covers a whole rendered page, so a regression in a reftest is not as easy to investigate as a failure of a unit test.

Despite these shortcomings, reftests are a very powerful regression prevention weapon in the pdf.js arsenal. The large number of reftests we have boosts our confidence when applying changes.

Conclusion

Support for AcroForms landed in Firefox v84. JavaScript execution in v88. Tagged PDFs in v89. XFA forms in v93 (tomorrow, October 5th, 2021!).

While all of these features have greatly improved form usability and accessibility, there are still more features we’d like to add. If you’re interested in helping, we’re always looking for more contributors and you can join us on Element or GitHub.

Categorieën: Mozilla-nl planet

Ludovic Hirlimann: My geeking plans for this summer

Thunderbird - do, 07/05/2015 - 10:39

During July I’ll be visiting family in Mongolia, but I also have a few very geeky things I want to do.

The first thing I want to do is plug in the RIPE Atlas probes I have. They are little devices that look like this:

Hello @ripe #Atlas !

They enable anybody with a RIPE Atlas or RIPE account to make measurements such as DNS queries and others. This helps make the global Internet better. I have three of these probes I’d like to install, which is useful because, last time I checked, Mongolia didn’t have any active probe, so these probes will also help the Internet become better in Mongolia. I’ll need to buy some network cables before leaving because finding them in Mongolia is going to be challenging. More on Atlas at https://atlas.ripe.net/.

The second thing I intend to do is map Mongolia a bit better through two projects. The first is related to Mozilla and maps GPS coordinates with Wi-Fi access points. Only a small part of the capital, Ulaanbaatar, is covered, as you can see at https://location.services.mozilla.com/map#11/47.8740/106.9485, and I want this coverage to grow, because having an open data source for this is important for the future. As mapping is my new thing, I’ll probably also edit OpenStreetMap in order to make the urban parts of Mongolia that I’ll visit far more usable on all the services that use OSM as a source of truth. There is already a project to map the capital city at http://hotosm.org/projects/mongolia_mapping_ulaanbaatar, but I believe OSM can serve more than just 50% of Mongolia’s population.

I was inspired to write this post by my son this morning; look what he is doing at 17 months:

Geeking on a Sun keyboard at 17 months
Categorieën: Mozilla-nl planet

Andrew Sutherland: Talk Script: Firefox OS Email Performance Strategies

Thunderbird - do, 30/04/2015 - 22:11

Last week I gave a talk at the Philly Tech Week 2015 Dev Day organized by the delightful people at technical.ly on some of the tricks/strategies we use in the Firefox OS Gaia Email app.  Note that the credit for implementing most of these techniques goes to the owner of the Email app’s front-end, James Burke.  Also, a special shout-out to Vivien for the initial DOM Worker patches for the email app.

I tried to avoid having slides that I would read aloud while the audience read them silently, so instead of slides to share, I have the talk script.  Well, I also have the slides here, but there’s not much to them.  The headings below are the content of the slides, except for the one time I inline some code.  Note that the live presentation must have differed slightly, because I’m sure I’m much more witty and clever in person than this script would make it seem…

Cover Slide: Who!

Hi, my name is Andrew Sutherland.  I work at Mozilla on the Firefox OS Email Application.  I’m here to share some strategies we used to make our HTML5 app Seem faster and sometimes actually Be faster.

What’s A Firefox OS (Screenshot Slide)

But first: What is a Firefox OS?  It’s a multiprocess Firefox Gecko engine on an Android Linux kernel, where all the apps, including the system UI, are implemented using HTML5, CSS, and JavaScript.  All the apps use some combination of standard web APIs and APIs that we hope to standardize in some form.

Firefox OS homescreen screenshot Firefox OS clock app screenshot Firefox OS email app screenshot

Here are some screenshots.  We’ve got the default home screen app, the clock app, and of course, the email app.

It’s an entirely client-side offline email application, supporting IMAP4, POP3, and ActiveSync.  The goal, like all Firefox OS apps shipped with the phone, is to give native apps on other platforms a run for their money.

And that begins with starting up fast.

Fast Startup: The Problems

But that’s frequently easier said than done.  Slow-loading websites are still very much a thing.

The good news for the email application is that a slow network isn’t one of its problems.  It’s pre-loaded on the phone.  And even if it wasn’t, because of the security implications of the TCP Web API and the difficulty of explaining this risk to users in a way they won’t just click through, any TCP-using app needs to be a cryptographically signed zip file approved by a marketplace.  So we do load directly from flash.

However, it’s not like flash on cellphones is equivalent to an infinitely fast, zero-latency network connection.  And even if it was, in a naive app you’d still try and load all of your HTML, CSS, and JavaScript at the same time because the HTML file would reference them all.  And that adds up.

It adds up in the form of event loop activity and competition with other threads and processes.  With the exception of Promises which get their own micro-task queue fast-lane, the web execution model is the same as all other UI event loops; events get scheduled and then executed in the same order they are scheduled.  Loading data from an asynchronous API like IndexedDB means that your read result gets in line behind everything else that’s scheduled.  And in the case of the bulk of shipped Firefox OS devices, we only have a single processor core so the thread and process contention do come into play.

So we try not to be naive.

Seeming Fast at Startup: The HTML Cache

If we’re going to optimize startup, it’s good to start with what the user sees.  Once an account exists for the email app, at startup we display the default account’s inbox folder.

What is the least amount of work that we can do to show that?  Cache a screenshot of the Inbox.  The problem with that, of course, is that a static screenshot is indistinguishable from an unresponsive application.

So we did the next best thing, (which is) we cache the actual HTML we display.  At startup we load a minimal HTML file, our concatenated CSS, and just enough Javascript to figure out if we should use the HTML cache and then actually use it if appropriate.  It’s not always appropriate, like if our application is being triggered to display a compose UI or from a new mail notification that wants to show a specific message or a different folder.  But this is a decision we can make synchronously so it doesn’t slow us down.

Local Storage: Okay in small doses

We implement this by storing the HTML in localStorage.

Important Disclaimer!  LocalStorage is a bad API.  It’s a bad API because it’s synchronous.  You can read any value stored in it at any time, without waiting for a callback.  Which means if the data is not in memory the browser needs to block its event loop or spin a nested event loop until the data has been read from disk.  Browsers avoid this now by trying to preload the Entire contents of local storage for your origin into memory as soon as they know your page is being loaded.  And then they keep that information, ALL of it, in memory until your page is gone.

So if you store a megabyte of data in local storage, that’s a megabyte of data that needs to be loaded in its entirety before you can use any of it, and that hangs around in scarce phone memory.

To really make the point: do not use local storage, at least not directly.  Use a library like localForage that will use IndexedDB when available, and then falls back to WebSQL and local storage, in that order.

Now, having sufficiently warned you of the terrible evils of local storage, I can say with a sorta-clear conscience… there are upsides in this very specific case.

The synchronous nature of the API means that once we get our turn in the event loop we can act immediately.  There’s no waiting around for an IndexedDB read result to get its turn on the event loop.

This matters because although the concept of loading is simple from a User Experience perspective, there’s no standard to back it up right now.  Firefox OS’s UX desires are very straightforward.  When you tap on an app, we zoom it in.  Until the app is loaded we display the app’s icon in the center of the screen.  Unfortunately the standards are still assuming that the content is right there in the HTML.  This works well for document-based web pages or server-powered web apps where the contents of the page are baked in.  They work less well for client-only web apps where the content lives in a database and has to be dynamically retrieved.

The two events that exist are:

“DOMContentLoaded” fires when the document has been fully parsed and all scripts not tagged as “async” have run.  If there were stylesheets referenced prior to the script tags, the script tags will wait for the stylesheets to load.

“load” fires when the document has been fully loaded; stylesheets, images, everything.

But none of these have anything to do with the content in the page saying it’s actually done.  This matters because these standards also say nothing about IndexedDB reads or the like.  We tried to create a standards consensus around this, but it’s not there yet.  So Firefox OS just uses the “load” event to decide an app or page has finished loading and it can stop showing your app icon.  This largely avoids the dreaded “flash of unstyled content” problem, but it also means that your webpage or app needs to deal with this period of time by displaying a loading UI or just accepting a potentially awkward transient UI state.

(Trivial HTML slide)

<link rel="stylesheet" ...>
<script ...></script>
DOMContentLoaded!

This is the important summary of our index.html.

We reference our stylesheet first.  It includes all of our styles.  We never dynamically load stylesheets because that compels a style recalculation for all nodes and potentially a reflow.  We would have to have an awful lot of style declarations before considering that.

Then we have our single script file.  Because the stylesheet precedes the script, our script will not execute until the stylesheet has been loaded.  Then our script runs and we synchronously insert our HTML from local storage.  Then DOMContentLoaded can fire.  At this point the layout engine has enough information to perform a style recalculation and determine what CSS-referenced image resources need to be loaded for buttons and icons, then those load, and then we’re good to be displayed as the “load” event can fire.

After that, we’re displaying an interactive-ish HTML document.  You can scroll, you can press on buttons and the :active state will apply.  So things seem real.
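As a minimal sketch of that restore path, using hypothetical element IDs and key names rather than the Gaia email app’s actual code (the real logic also handles cache versioning and the compose/notification entry points mentioned above):

// Runs from the single startup script, before the rest of the app is loaded.
var CACHE_KEY = "html_cache"; // assumed key name

function tryRestoreFromCache() {
  // localStorage is synchronous, so this completes before DOMContentLoaded fires.
  var cachedHtml = window.localStorage.getItem(CACHE_KEY);
  var startedForCompose = window.location.hash.indexOf("#compose") === 0; // illustrative check
  if (cachedHtml && !startedForCompose) {
    document.getElementById("cards").innerHTML = cachedHtml; // hypothetical container id
    return true;
  }
  return false; // fall through to the normal, slower startup path
}

function updateCache() {
  // Called later, once the inbox has been rendered with real data.
  window.localStorage.setItem(CACHE_KEY, document.getElementById("cards").innerHTML);
}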

Being Fast: Lazy Loading and Optimized Layers

But now we need to try and get some logic in place as quickly as possible that will actually cash the checks that real-looking HTML UI is writing.  And the key to that is only loading what you need when you need it, and trying to get it to load as quickly as possible.

There are many module loading and build optimizing tools out there, and most frameworks have a preferred or required way of handling this.  We used the RequireJS family of Asynchronous Module Definition loaders, specifically the alameda loader and the r-dot-js optimizer.

One of the niceties of the loader plugin model is that we are able to express resource dependencies as well as code dependencies.

RequireJS Loader Plugins

var fooModule = require('./foo');
var htmlString = require('text!./foo.html');
var localizedDomNode = require('tmpl!./foo.html');

The standard CommonJS loader semantics used by node.js and io.js are what you see on the first line here.  Load the module, return its exports.

But RequireJS loader plugins also allow us to do things like the second line where the exclamation point indicates that the load should occur using a loader plugin, which is itself a module that conforms to the loader plugin contract.  In this case it’s saying load the file foo.html as raw text and return it as a string.

But, wait, there’s more!  loader plugins can do more than that.  The third example uses a loader that loads the HTML file using the ‘text’ plugin under the hood, creates an HTML document fragment, and pre-localizes it using our localization library.  And this works un-optimized in a browser, no compilation step needed, but it can also be optimized.

So when our optimizer runs, it bundles up the core modules we use, plus, the modules for our “message list” card that displays the inbox.  And the message list card loads its HTML snippets using the template loader plugin.  The r-dot-js optimizer then locates these dependencies and the loader plugins also have optimizer logic that results in the HTML strings being inlined in the resulting optimized file.  So there’s just one single javascript file to load with no extra HTML file dependencies or other loads.
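For a rough idea of what that optimization step can look like, here is a hedged sketch of an r.js build profile with made-up module names; the real Gaia email build configuration is different and more elaborate:

({
  baseUrl: "js",
  name: "mail_app",                 // the core modules, pulled in via its dependency graph
  include: ["cards/message_list"],  // plus the inbox ("message list") card
  out: "mail_app-built.js",
  // Loader plugins such as text! and tmpl! contribute their own optimizer logic here,
  // so the HTML snippets end up inlined in the single output file.
})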

We then also run the optimizer against our other important cards like the “compose” card and the “message reader” card.  We don’t do this for all cards because it can be hard to carve up the module dependency graph for optimization without starting to run into cases of overlap where many optimized files redundantly include files loaded by other optimized files.

Plus, we have another trick up our sleeve:

Seeming Fast: Preloading

Preloading.  Our cards optionally know the other cards they can load.  So once we display a card, we can kick off a preload of the cards that might potentially be displayed.  For example, the message list card can trigger the compose card and the message reader card, so we can trigger a preload of both of those.
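A minimal sketch of that idea, with hypothetical card names and registry (the real card infrastructure is richer):

// Map each card to the cards it can navigate to.
var PRELOAD_TARGETS = {
  message_list: ["compose", "message_reader"]
};

function onCardDisplayed(cardName) {
  (PRELOAD_TARGETS[cardName] || []).forEach(function(next) {
    // Asynchronous AMD require: fetch and evaluate the module now,
    // so instantiating the card later is cheap.
    require(["cards/" + next]);
  });
}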

But we don’t go overboard with preloading in the frontend because we still haven’t actually loaded the back-end that actually does all the emaily email stuff.  The back-end is also chopped up into optimized layers along account type lines and online/offline needs, but the main optimized JS file still weighs in at something like 17 thousand lines of code with newlines retained.

So once our UI logic is loaded, it’s time to kick-off loading the back-end.  And in order to avoid impacting the responsiveness of the UI both while it loads and when we’re doing steady-state processing, we run it in a DOM Worker.

Being Responsive: Workers and SharedWorkers

DOM Workers are background JS threads that lack access to the page’s DOM, communicating with their owning page via message passing with postMessage.  Normal workers are owned by a single page.  SharedWorkers can be accessed via multiple pages from the same document origin.
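A minimal sketch of that split, shown as two cooperating files with hypothetical names and a single message type; the email app’s real front-end/back-end protocol is much richer:

// main thread (UI):
var backend = new Worker("js/mail-backend.js"); // hypothetical bundle name
backend.onmessage = function(event) {
  if (event.data.type === "folderContents") {
    renderMessageList(event.data.messages); // hypothetical UI helper
  }
};
backend.postMessage({ type: "loadFolder", folderId: "inbox", count: 30 });

// inside js/mail-backend.js (the worker):
self.onmessage = function(event) {
  if (event.data.type === "loadFolder") {
    // Talk to IndexedDB and the network here, off the main thread...
    self.postMessage({ type: "folderContents", messages: [] });
  }
};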

By doing this, we stay out of the way of the main thread.  This is getting less important as browser engines support Asynchronous Panning & Zooming or “APZ” with hardware-accelerated composition, tile-based rendering, and all that good stuff.  (Some might even call it magic.)

When Firefox OS started, we didn’t have APZ, so any main-thread logic had the serious potential to result in janky scrolling and the impossibility of rendering at 60 frames per second.  It’s a lot easier to get 60 frames-per-second now, but even asynchronous pan and zoom potentially has to wait on dispatching an event to the main thread to figure out if the user’s tap is going to be consumed by app logic and preventDefault called on it.  APZ does this because it needs to know whether it should start scrolling or not.

And speaking of 60 frames-per-second…

Being Fast: Virtual List Widgets

…the heart of a mail application is the message list.  The expected UX is to be able to fling your way through the entire list of what the email app knows about and see the messages there, just like you would on a native app.

This is admittedly one of the areas where native apps have it easier.  There are usually list widgets that explicitly have a contract that says they request data on an as-needed basis.  They potentially even include data bindings so you can just point them at a data-store.

But HTML doesn’t yet have a concept of instantiate-on-demand for the DOM, although it’s being discussed by Firefox layout engine developers.  For app purposes, the DOM is a scene graph.  An extremely capable scene graph that can handle huge documents, but there are footguns and it’s arguably better to err on the side of fewer DOM nodes.

So what the email app does is we create a scroll-region div and explicitly size it based on the number of messages in the mail folder we’re displaying.  We create and render enough message summary nodes to cover the current screen, 3 screens worth of messages in the direction we’re scrolling, and then we also retain up to 3 screens worth in the direction we scrolled from.  We also pre-fetch 2 more screens worth of messages from the database.  These constants were arrived at experimentally on prototype devices.

We listen to “scroll” events and issue database requests and move DOM nodes around and update them as the user scrolls.  For any potentially jarring or expensive transitions such as coordinate space changes from new messages being added above the current scroll position, we wait for scrolling to stop.

Nodes are absolutely positioned within the scroll area using their ‘top’ style but translation transforms also work.  We remove nodes from the DOM, then update their position and their state before re-appending them.  We do this because the browser APZ logic tries to be clever and figure out how to create an efficient series of layers so that it can pre-paint as much of the DOM as possible in graphic buffers, AKA layers, that can be efficiently composited by the GPU.  Its goal is that when the user is scrolling, or something is being animated, that it can just move the layers around the screen or adjust their opacity or other transforms without having to ask the layout engine to re-render portions of the DOM.

When our message elements are added to the DOM with an already-initialized absolute position, the APZ logic lumps them together as something it can paint in a single layer along with the other elements in the scrolling region.  But if we start moving them around while they’re still in the DOM, the layerization logic decides that they might want to independently move around more in the future and so each message item ends up in its own layer.  This slows things down.  But by removing them and re-adding them it sees them as new with static positions and decides that it can lump them all together in a single layer.  Really, we could just create new DOM nodes, but we produce slightly less garbage this way and in the event there’s a bug, it’s nicer to mess up with 30 DOM nodes displayed incorrectly rather than 3 million.
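A minimal sketch of the sizing and recycling logic, with made-up constants and helpers; the real widget also batches database pre-fetches and waits for scrolling to stop before applying jarring updates:

var ROW_HEIGHT = 60;     // px per message summary (assumed)
var SCREENS_AHEAD = 3;   // rendered ahead in the scroll direction
var SCREENS_BEHIND = 3;  // retained behind it

function updateVisibleRange(scrollContainer, scrollArea, folderMessageCount) {
  // Size the inner area explicitly so the scrollbar reflects the whole folder.
  scrollArea.style.height = (folderMessageCount * ROW_HEIGHT) + "px";

  var screenRows = Math.ceil(scrollContainer.clientHeight / ROW_HEIGHT);
  var firstVisible = Math.floor(scrollContainer.scrollTop / ROW_HEIGHT);
  var start = Math.max(0, firstVisible - SCREENS_BEHIND * screenRows);
  var end = Math.min(folderMessageCount, firstVisible + (1 + SCREENS_AHEAD) * screenRows);

  for (var i = start; i < end; i++) {
    var node = getOrCreateSummaryNode(i); // hypothetical pool of recycled, absolutely positioned nodes
    // Remove, reposition, then re-append so APZ keeps everything in a single layer.
    node.remove();
    node.style.top = (i * ROW_HEIGHT) + "px";
    scrollArea.appendChild(node);
  }
}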

But as neat as the layerization stuff is to know about on its own, I really mention it to underscore 2 suggestions:

1, Use a library when possible.  Getting on and staying on APZ fast-paths is not trivial, especially across browser engines.  So it’s a very good idea to use a library rather than rolling your own.

2, Use developer tools.  APZ is tricky to reason about and even the developers who write the Async pan & zoom logic can be surprised by what happens in complex real-world situations.  And there ARE developer tools available that help you avoid needing to reason about this.  Firefox OS has easy on-device developer tools that can help diagnose what’s going on or at least help tell you whether you’re making things faster or slower:

– it’s got a frames-per-second overlay; you do need to scroll like mad to get the system to want to render 60 frames-per-second, but it makes it clear what the net result is

– it has paint flashing that overlays random colors every time it paints the DOM into a layer.  If the screen is flashing like a discotheque or has a lot of smeared rainbows, you know something’s wrong because the APZ logic is not able to just reuse its layers.

– devtools can enable drawing cool colored borders around the layers APZ has created so you can see if layerization is doing something crazy

There are also fancier and more complicated tools in Firefox and other browsers like Google Chrome that let you see what got painted, what the layer tree looks like, et cetera.

And that’s my spiel.

Links

The source code to Gaia can be found at https://github.com/mozilla-b2g/gaia

The email app in particular can be found at https://github.com/mozilla-b2g/gaia/tree/master/apps/email

(I also asked for questions here.)

Categorieën: Mozilla-nl planet

Joshua Cranmer: Breaking news

Thunderbird - wo, 01/04/2015 - 09:00
It was brought to my attention recently by reputable sources that the recent announcement of increased usage in recent years produced an internal firestorm within Mozilla. Key figures raised alarm that some of the tech press had interpreted the blog post as a sign that Thunderbird was not, in fact, dead. As a result, they asked Thunderbird community members to make corrections to emphasize that Mozilla was trying to kill Thunderbird.

The primary fear, it seems, is that knowledge that the largest open-source email client was still receiving regular updates would impel its userbase to agitate for increased funding and maintenance of the client to help forestall potential threats to the open nature of email as well as to innovate in the space of providing usable and private communication channels. Such funding, however, would be an unaffordable luxury and would only distract Mozilla from its central goal of building developer productivity tooling. Persistent rumors that Mozilla would be willing to fund Thunderbird were it renamed Firefox Email were finally addressed with the comment, "such a renaming would violate our current policy that all projects be named Persona."

Categorieën: Mozilla-nl planet

Joshua Cranmer: Why email is hard, part 8: why email security failed

Thunderbird - di, 13/01/2015 - 05:38
This post is part 8 of an intermittent series exploring the difficulties of writing an email client. Part 1 describes a brief history of the infrastructure. Part 2 discusses internationalization. Part 3 discusses MIME. Part 4 discusses email addresses. Part 5 discusses the more general problem of email headers. Part 6 discusses how email security works in practice. Part 7 discusses the problem of trust. This part discusses why email security has largely failed.

At the end of the last part in this series, I posed the question, "Which email security protocol is most popular?" The answer to the question is actually neither S/MIME nor PGP, but a third protocol, DKIM. I haven't brought up DKIM until now because DKIM doesn't try to secure email in the same vein as S/MIME or PGP, but I still consider it relevant to discussing email security.

Unquestionably, DKIM is the only security protocol for email that can be considered successful. There are perhaps 4 billion active email addresses [1]. Of these, about 1-2 billion use DKIM. In contrast, S/MIME can count a few million users, and PGP at best a few hundred thousand. No other security protocols have really caught on past these three. Why did DKIM succeed where the others failed?

DKIM's success stems from its relatively narrow focus. It is nothing more than a cryptographic signature of the message body and a smattering of headers, and is itself stuck in the DKIM-Signature header. It is meant to be applied to messages only on outgoing servers and read and processed at the recipient mail server—it completely bypasses clients. That it bypasses clients allows it to solve the problem of key discovery and key management very easily (public keys are stored in DNS, which is already a key part of mail delivery), and its role in spam filtering is strong motivation to get it implemented quickly (it is 7 years old as of this writing). It's also simple: this one paragraph description is basically all you need to know [2].

The failure of S/MIME and PGP to see large deployment is certainly a large topic of discussion on myriads of cryptography enthusiast mailing lists, which often like to partake in propositions of new end-to-end email encryption paradigms, such as the recent DIME proposal. Quite frankly, all of these solutions suffer broadly from at least the same 5 fundamental weaknesses, and I think it unlikely that a protocol will come about that can fix these weaknesses well enough to become successful.

The first weakness, and one I've harped about many times already, is UI. Most email security UI is abysmal and generally at best usable only by enthusiasts. At least some of this is endemic to security: while it may seem obvious how to convey what an email signature or an encrypted email signifies, how do you convey the distinctions between sign-and-encrypt, encrypt-and-sign, or an S/MIME triple wrap? The Web of Trust model used by PGP (and many other proposals) is even worse, in that it inherently requires users to take other actions out-of-band of email to work properly.

Trust is the second weakness. Consider that, for all intents and purposes, the email address is the unique identifier on the Internet. By extension, that implies that a lot of services are ultimately predicated on the notion that the ability to receive and respond to an email is a sufficient means to identify an individual. However, the entire purpose of secure email, or at least of end-to-end encryption, is subtly based on the fact that other people in fact have access to your mailbox, thus destroying the most natural ways to build trust models on the Internet. The quest for anonymity or privacy also renders untenable many other plausible ways to establish trust (e.g., phone verification or government-issued ID cards).

Key discovery is another weakness, although it's arguably the easiest one to solve. If you try to keep discovery independent of trust, the problem of key discovery is merely picking a protocol to publish and another one to find keys. Some of these already exist: PGP key servers, for example, or using DANE to publish S/MIME or PGP keys.

Key management, on the other hand, is a more troubling weakness. S/MIME, for example, basically works without issue if you have a certificate, but managing to get an S/MIME certificate is a daunting task (necessitated, in part, by its trust model—see how these issues all intertwine?). This is also where it's easy to say that webmail is an unsolvable problem, but on further reflection, I'm not sure I agree with that statement anymore. One solution is just storing the private key with the webmail provider (you're trusting them as an email client, after all), but it's also not impossible to imagine using phones or flash drives as keystores. Other key management factors are more difficult to solve: people who lose their private keys or key rollover create thorny issues. There is also the difficulty of managing user expectations: if I forget my password to most sites (even my email provider), I can usually get it reset somehow, but when a private key is lost, the user is totally and completely out of luck.

Of course, there is one glaring and almost completely insurmountable problem. Encrypted email fundamentally precludes certain features that we have come to take for granted. The lesser known is server-side search and filtration. While there exist some mechanisms to do search on encrypted text, those mechanisms rely on the fact that you can manipulate the text to change the message, destroying the integrity feature of secure email. They also tend to be fairly expensive. It's easy to just say "who needs server-side stuff?", but the contingent of people who do email on smartphones would not be happy to have to pay the transfer rates to download all the messages in their folder just to find one little email, nor the energy costs of doing it on the phone. And those who have really large folders—Fastmail has a design point of 1,000,000 in a single folder—would still prefer to not have to transfer all their mail even on desktops.

The more well-known feature that would disappear is spam filtration. Consider that 90% of all email is spam, and if you think your spam folder is too slim for that to be true, it's because your spam folder only contains messages that your email provider wasn't sure were spam. The loss of server-side spam filtering would dramatically increase the cost of spam (a 10% reduction in efficiency would double the amount of server storage, per my calculations), and client-side spam filtering is quite literally too slow [3] and too costly (remember smartphones? Imagine having your email take 10 times as much energy and bandwidth) to be a tenable option. And privacy or anonymity tends to be an invitation to abuse (cf. Tor and Wikipedia). Proposed solutions to the spam problem are so common that there is a checklist containing most of the objections.

When you consider all of those weaknesses, it is easy to be pessimistic about the possibility of wide deployment of powerful email security solutions. The strongest future—all email is encrypted, including metadata—is probably impossible or at least woefully impractical. That said, if you weaken some of the assumptions (say, don't desire all or most traffic to be encrypted), then solutions seem possible if difficult.

This concludes my discussion of email security, at least until things change for the better. I don't have a topic for the next part in this series picked out (this part actually concludes the set I knew I wanted to discuss when I started), although OAuth and DMARC are two topics that have been bugging me enough recently to consider writing about. They also have the unfortunate side effect of being things likely to see changes in the near future, unlike most of the topics I've discussed so far. But rest assured that I will find more difficulties in the email infrastructure to write about before long!

[1] All of these numbers are crude estimates and are accurate to only an order of magnitude. To justify my choices: I assume 1 email address per Internet user (this overestimates the developing world and underestimates the developed world). The largest webmail providers have given numbers that claim to be 1 billion active accounts between them, and all of them use DKIM. S/MIME is guessed by assuming that any smartcard deployment supports S/MIME, and noting that the US Department of Defense and Estonia's digital ID project are both heavy users of such smartcards. PGP is estimated from the size of the strong set and old numbers on the reachable set from the core Web of Trust.
[2] Ever since last April, it's become impossible to mention DKIM without referring to DMARC, as a result of Yahoo's controversial DMARC policy. A proper discussion of DMARC (and why what Yahoo did was controversial) requires explaining the mail transmission architecture and spam, however, so I'll defer that to a later post. It's also possible that changes in this space could happen within the next year.
[3] According to a former GMail spam employee, if it takes you as long as three minutes to calculate reputation, the spammer wins.

Categorieën: Mozilla-nl planet

Joshua Cranmer: A unified history for comm-central

Thunderbird - za, 10/01/2015 - 18:55
Several years back, Ehsan and Jeff Muizelaar attempted to build a unified history of mozilla-central across the Mercurial era and the CVS era. Their result is now used in the gecko-dev repository. While being distracted on yet another side project, I thought that I might want to do the same for comm-central. It turns out that building a unified history for comm-central makes mozilla-central look easy: mozilla-central merely had one import from CVS. In contrast, comm-central imported twice from CVS (the calendar code came later), four times from mozilla-central (once with converted history), and imported twice from Instantbird's repository (once with converted history). Three of those conversions also involved moving paths. But I've worked through all of those issues to provide a nice snapshot of the repository [1]. And since I've been frustrated by failing to find good documentation on how this sort of process went for mozilla-central, I'll provide details on the process for comm-central.

The first step and probably the hardest is getting the CVS history in DVCS form (I use hg because I'm more comfortable with it, but there's effectively no difference between hg, git, or bzr here). There is a git version of mozilla's CVS tree available, but I've noticed after doing research that its last revision is about a month before the revision I need for Calendar's import. The documentation for how that repo was built is no longer on the web, although after I wrote this post we eventually found a copy on git.mozilla.org. I tried doing another conversion using hg convert to get CVS tags, but that rudely blew up in my face. For now, I've filed a bug on getting an official, branchy-and-tag-filled version of this repository, while using the current lack of history as a base. Calendar people will have to suffer missing a month of history.

CVS is famously hard to convert to more modern repositories, and, as I've done my research, Mozilla's CVS looks like it uses those features which make it difficult. In particular, both the calendar CVS import and the comm-central initial CVS import used a CVS tag HG_COMM_INITIAL_IMPORT. That tagging was done, on only a small portion of the tree, twice, about two months apart. Fortunately, mailnews code was never touched on CVS trunk after the import (there appears to be one commit on calendar after the tagging), so it is probably possible to salvage a repository-wide consistent tag.

The start of my script for conversion looks like this:

#!/bin/bash
set -e

WORKDIR=/tmp
HGCVS=$WORKDIR/mozilla-cvs-history
MC=/src/trunk/mozilla-central
CC=/src/trunk/comm-central
OUTPUT=$WORKDIR/full-c-c

# Bug 445146: m-c/editor/ui -> c-c/editor/ui
MC_EDITOR_IMPORT=d8064eff0a17372c50014ee305271af8e577a204
# Bug 669040: m-c/db/mork -> c-c/db/mork
MC_MORK_IMPORT=f2a50910befcf29eaa1a29dc088a8a33e64a609a
# Bug 1027241, bug 611752 m-c/security/manager/ssl/** -> c-c/mailnews/mime/src/*
MC_SMIME_IMPORT=e74c19c18f01a5340e00ecfbc44c774c9a71d11d

# Step 0: Grab the mozilla CVS history.
if [ ! -e $HGCVS ]; then
  hg clone git+https://github.com/jrmuizel/mozilla-cvs-history.git $HGCVS
fi

Since I don't want to include the changesets useless to comm-central history, I trimmed the history by using hg convert to eliminate changesets that don't change the necessary files. Most of the files are simple directory-wide changes, but S/MIME only moved a few files over, so it requires a more complex way to grab the file list. In addition, I also replaced the % in the usernames with @ that they are used to appearing in hg. The relevant code is here:

# Step 1: Trim mozilla CVS history to include only the files we are ultimately
# interested in.
cat >$WORKDIR/convert-filemap.txt <<EOF
# Revision e4f4569d451a
include directory/xpcom
include mail
include mailnews
include other-licenses/branding/thunderbird
include suite
# Revision 7c0bfdcda673
include calendar
include other-licenses/branding/sunbird
# Revision ee719a0502491fc663bda942dcfc52c0825938d3
include editor/ui
# Revision 52efa9789800829c6f0ee6a005f83ed45a250396
include db/mork/
include db/mdb/
EOF

# Add the S/MIME import files
hg -R $MC log -r "children($MC_SMIME_IMPORT)" \
  --template "{file_dels % 'include {file}\n'}" >>$WORKDIR/convert-filemap.txt

if [ ! -e $WORKDIR/convert-authormap.txt ]; then
  hg -R $HGCVS log --template "{email(author)}={sub('%', '@', email(author))}\n" \
    | sort -u > $WORKDIR/convert-authormap.txt
fi

cd $WORKDIR
hg convert $HGCVS $OUTPUT --filemap convert-filemap.txt -A convert-authormap.txt

That last command provides us the subset of the CVS history that we need for unified history. Strictly speaking, I should be pulling a specific revision, but I happen to know that there's no need to (we're cloning the only head) in this case. At this point, we now need to pull in the mozilla-central changes before we pull in comm-central. Order is key; hg convert will only apply the graft points when converting the child changeset (which it does but once), and it needs the parents to exist before it can do that. We also need to ensure that the mozilla-central graft point is included before continuing, so we do that, and then pull mozilla-central:

CC_CVS_BASE=$(hg log -R $HGCVS -r 'tip' --template '{node}')
CC_CVS_BASE=$(grep $CC_CVS_BASE $OUTPUT/.hg/shamap | cut -d' ' -f2)
MC_CVS_BASE=$(hg log -R $HGCVS -r 'gitnode(215f52d06f4260fdcca797eebd78266524ea3d2c)' --template '{node}')
MC_CVS_BASE=$(grep $MC_CVS_BASE $OUTPUT/.hg/shamap | cut -d' ' -f2)

# Okay, now we need to build the map of revisions.
cat >$WORKDIR/convert-revmap.txt <<EOF
e4f4569d451a5e0d12a6aa33ebd916f979dd8faa $CC_CVS_BASE # Thunderbird / Suite
7c0bfdcda6731e77303f3c47b01736aaa93d5534 d4b728dc9da418f8d5601ed6735e9a00ac963c4e, $CC_CVS_BASE # Calendar
9b2a99adc05e53cd4010de512f50118594756650 $MC_CVS_BASE # Mozilla graft point
ee719a0502491fc663bda942dcfc52c0825938d3 78b3d6c649f71eff41fe3f486c6cc4f4b899fd35, $MC_EDITOR_IMPORT # Editor
8cdfed92867f885fda98664395236b7829947a1d 4b5da7e5d0680c6617ec743109e6efc88ca413da, e4e612fcae9d0e5181a5543ed17f705a83a3de71 # Chat
EOF

# Next, import mozilla-central revisions
for rev in $MC_MORK_IMPORT $MC_EDITOR_IMPORT $MC_SMIME_IMPORT; do
  hg convert $MC $OUTPUT -r $rev --splicemap $WORKDIR/convert-revmap.txt \
    --filemap $WORKDIR/convert-filemap.txt
done

Some notes about all of the revision ids in the script. The splicemap requires the full 40-character SHA ids; anything less and the thing complains. I also need to specify the parents of the revisions that deleted the code for the mozilla-central import, so if you go hunting for those revisions and are surprised that they don't remove the code in question, that's why.

I mentioned complications about the merges earlier. The Mork and S/MIME import codes here moved files, so that what was db/mdb in mozilla-central became db/mork. There's no support for causing the generated splice to record these as a move, so I have to manually construct those renamings:

# We need to execute a few hg move commands due to renamings.
pushd $OUTPUT
hg update -r $(grep $MC_MORK_IMPORT .hg/shamap | cut -d' ' -f2)
(hg -R $MC log -r "children($MC_MORK_IMPORT)" \
  --template "{file_dels % 'hg mv {file} {sub(\"db/mdb\", \"db/mork\", file)}\n'}") | bash
hg commit -m 'Pseudo-changeset to move Mork files' -d '2011-08-06 17:25:21 +0200'
MC_MORK_IMPORT=$(hg log -r tip --template '{node}')

hg update -r $(grep $MC_SMIME_IMPORT .hg/shamap | cut -d' ' -f2)
(hg -R $MC log -r "children($MC_SMIME_IMPORT)" \
  --template "{file_dels % 'hg mv {file} {sub(\"security/manager/ssl\", \"mailnews/mime\", file)}\n'}") | bash
hg commit -m 'Pseudo-changeset to move S/MIME files' -d '2014-06-15 20:51:51 -0700'
MC_SMIME_IMPORT=$(hg log -r tip --template '{node}')
popd

# Echo the new move commands to the changeset conversion map.
cat >>$WORKDIR/convert-revmap.txt <<EOF
52efa9789800829c6f0ee6a005f83ed45a250396 abfd23d7c5042bc87502506c9f34c965fb9a09d1, $MC_MORK_IMPORT # Mork
50f5b5fc3f53c680dba4f237856e530e2097adfd 97253b3cca68f1c287eb5729647ba6f9a5dab08a, $MC_SMIME_IMPORT # S/MIME
EOF

Now that we have all of the graft points defined, and all of the external code ready, we can pull comm-central and do the conversion. That's not quite it, though—when we graft the S/MIME history to the original mozilla-central history, we have a small segment of abandoned converted history. A call to hg strip removes that.

# Now, import comm-central revisions that we need
hg convert $CC $OUTPUT --splicemap $WORKDIR/convert-revmap.txt
hg strip 2f69e0a3a05a

[1] I left out one of the graft points because I just didn't want to deal with it. I'll leave it as an exercise to the reader to figure out which one it was. Hint: it's the only one I didn't know about before I searched for the archive points [2].
[2] Since I wasn't sure I knew all of the graft points, I decided to try to comb through all of the changesets to figure out who imported code. It turns out that hg log -r 'adds("**")' narrows it down nicely (1667 changesets to look at instead of 17547), and using the {file_adds} template helps winnow it down more easily.

Categorieën: Mozilla-nl planet

Philipp Kewisch: Monitor all http(s) network requests using the Mozilla Platform

Thunderbird - do, 02/10/2014 - 16:38

In an xpcshell test, I recently needed a way to monitor all network requests and access both request and response data so I can save them for later use. This required a little bit of digging in Mozilla’s devtools code so I thought I’d write a short blog post about it.

This code will be used in a testcase that ensures that calendar providers in Lightning function properly. In the case of the CalDAV provider, we would need to access a real server for testing. We can’t just set up a few servers and use them for testing; it would end in an unreasonable amount of server maintenance. Given that non-local connections are not allowed when running the tests on the Mozilla build infrastructure, it wouldn’t work anyway. The solution is to create a fakeserver that is able to replay the requests in the same way. Instead of manually making the requests and figuring out how the server replies, we can use this code to quickly collect all the requests we need.

Without further delay, here is the code you have been waiting for:


/* This Source Code Form is subject to the terms of the Mozilla Public
 * License, v. 2.0. If a copy of the MPL was not distributed with this
 * file, You can obtain one at http://mozilla.org/MPL/2.0/. */

var allRequests = [];

/**
 * Add the following function as a request observer:
 *   Services.obs.addObserver(httpObserver, "http-on-examine-response", false);
 *
 * When done listening on requests:
 *   dump(allRequests.join("\n===\n"));            // print them
 *   dump(JSON.stringify(allRequests, null, " ")); // jsonify them
 */
function httpObserver(aSubject, aTopic, aData) {
  if (aSubject instanceof Components.interfaces.nsITraceableChannel) {
    let request = new TracedRequest(aSubject);
    request._next = aSubject.setNewListener(request);
    allRequests.push(request);
  }
}

/**
 * This is the object that represents a request/response and also collects the data for it.
 *
 * @param aSubject      The channel from the response observer.
 */
function TracedRequest(aSubject) {
  let httpchannel = aSubject.QueryInterface(Components.interfaces.nsIHttpChannel);
  let self = this;

  this.requestHeaders = Object.create(null);
  httpchannel.visitRequestHeaders({
    visitHeader: function(k, v) { self.requestHeaders[k] = v; }
  });

  this.responseHeaders = Object.create(null);
  httpchannel.visitResponseHeaders({
    visitHeader: function(k, v) { self.responseHeaders[k] = v; }
  });

  this.uri = aSubject.URI.spec;
  this.method = httpchannel.requestMethod;
  this.requestBody = readRequestBody(aSubject);
  this.responseStatus = httpchannel.responseStatus;
  this.responseStatusText = httpchannel.responseStatusText;

  this._chunks = [];
}

TracedRequest.prototype = {
  uri: null,
  method: null,
  requestBody: null,
  requestHeaders: null,
  responseStatus: null,
  responseStatusText: null,
  responseHeaders: null,
  responseBody: null,

  toJSON: function() {
    let j = Object.create(null);
    for (let m of Object.keys(this)) {
      if (typeof this[m] != "function" && m[0] != "_") {
        j[m] = this[m];
      }
    }
    return j;
  },

  onStartRequest: function(aRequest, aContext) {
    this._next.onStartRequest(aRequest, aContext);
  },

  onStopRequest: function(aRequest, aContext, aStatusCode) {
    this.responseBody = this._chunks.join("");
    this._chunks = null;
    this._next.onStopRequest(aRequest, aContext, aStatusCode);
    this._next = null;
  },

  onDataAvailable: function(aRequest, aContext, aStream, aOffset, aCount) {
    let binaryInputStream = Components.classes["@mozilla.org/binaryinputstream;1"]
                                      .createInstance(Components.interfaces.nsIBinaryInputStream);
    let storageStream = Components.classes["@mozilla.org/storagestream;1"]
                                  .createInstance(Components.interfaces.nsIStorageStream);
    let outStream = Components.classes["@mozilla.org/binaryoutputstream;1"]
                              .createInstance(Components.interfaces.nsIBinaryOutputStream);

    binaryInputStream.setInputStream(aStream);
    storageStream.init(8192, aCount, null);
    outStream.setOutputStream(storageStream.getOutputStream(0));

    let data = binaryInputStream.readBytes(aCount);
    this._chunks.push(data);

    outStream.writeBytes(data, aCount);
    this._next.onDataAvailable(aRequest, aContext, storageStream.newInputStream(0), aOffset, aCount);
  },

  toString: function() {
    let str = this.method + " " + this.uri;
    for (let hdr of Object.keys(this.requestHeaders)) {
      str += hdr + ": " + this.requestHeaders[hdr] + "\n";
    }
    if (this.requestBody) {
      str += "\r\n" + this.requestBody + "\n";
    }
    str += "\n" + this.responseStatus + " " + this.responseStatusText;
    if (this.responseBody) {
      str += "\r\n" + this.responseBody + "\n";
    }
    return str;
  }
};

// Taken from:
// http://hg.mozilla.org/mozilla-central/file/2399d1ae89e9/toolkit/devtools/webconsole/network-helper.js#l120
function readRequestBody(aRequest, aCharset="UTF-8") {
  let text = null;
  if (aRequest instanceof Ci.nsIUploadChannel) {
    let iStream = aRequest.uploadStream;

    let isSeekableStream = false;
    if (iStream instanceof Ci.nsISeekableStream) {
      isSeekableStream = true;
    }

    let prevOffset;
    if (isSeekableStream) {
      prevOffset = iStream.tell();
      iStream.seek(Ci.nsISeekableStream.NS_SEEK_SET, 0);
    }

    // Read data from the stream.
    try {
      let rawtext = NetUtil.readInputStreamToString(iStream, iStream.available());
      let conv = Components.classes["@mozilla.org/intl/scriptableunicodeconverter"]
                           .createInstance(Components.interfaces.nsIScriptableUnicodeConverter);
      conv.charset = aCharset;
      text = conv.ConvertToUnicode(rawtext);
    } catch (err) {
    }

    // Seek locks the file, so seek to the beginning only if necko hasn't
    // read it yet, since necko doesn't seek to 0 before reading (at least
    // not till bug 459384 is fixed).
    if (isSeekableStream && prevOffset == 0) {
      iStream.seek(Components.interfaces.nsISeekableStream.NS_SEEK_SET, 0);
    }
  }
  return text;
}

The file above is TracedRequest.js, hosted as a GitHub Gist.
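As a quick usage sketch in an xpcshell test, mirroring the comment at the top of the gist (the surrounding test scaffolding and imports are assumed):

// Register the observer before exercising the code under test.
Services.obs.addObserver(httpObserver, "http-on-examine-response", false);

// ...run the calendar/CalDAV operations whose traffic should be captured...

// Then dump the collected requests so they can be turned into fakeserver data.
dump(JSON.stringify(allRequests, null, " "));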

Categorieën: Mozilla-nl planet
