Mozilla Nederland - the Dutch Mozilla community

Planet Mozilla - http://planet.mozilla.org/

Mark Côté: Conduit Field Report, March 2017

Tue, 14/03/2017 - 17:44

For background on Conduit, please see the previous post and the Intent to Implement.

Autoland

We kicked off Conduit work in January, starting with the new Autoland service. Right now, much of the Autoland functionality is located in the MozReview Review Board extension: the permissions model, the rewriting of commit messages to reflect the reviewers, and the user interface. The only part that is currently logically separate is the “transplant service”, which actually takes commits from one repo (e.g. reviewboard-hg) and applies them to another (e.g. try, mozilla-central). Since the goal of Conduit is to decouple all the automation from code-review tools, we have to take everything that’s currently in Review Board and move it to new, separate services.

The original plan was to switch Review Board over to the new Autoland service when it was ready, stripping out all the old code from the MozReview extension. This would mean little change for MozReview users (basically just a new, separate UI), but would get people using the new service right away. After Autoland, we’d work on the push-to-review side, hooking that up to Review Board, and then extend both systems to interface with BMO. This strategy of incrementally replacing pieces of MozReview seemed like the best way to find bugs as we went along, rather than a massive switchover all at once.

However, progress was a bit slower than we anticipated, largely due to the fact that so many things were new about this project (see below). We want Autoland to be fully hooked up to BMO by the end of June, and integrating the new system with both Review Board and BMO as we went along seemed increasingly like a bad idea. Instead, we decided to put BMO integration first, and then follow with Review Board later (if indeed we still want to use Review Board as our rich-code-review solution).

This presented us with a problem: if we wouldn’t be hooking the new Autoland service up to Review Board, then we’d have to wait until the push service was also finished before we hooked them both up to BMO. Not wanting to turn everything on at once, we pondered how we could still launch new services as they were completed.

Moving to the other side of the pipeline

The answer is to table our work on Autoland for now and switch to the push service, which is the entrance to the commit pipeline. Building this piece first means that users will be able to push commits to BMO for review. Even though they would not be able to Autoland them right away, we could get feedback and make the service as easy to use as possible. Think of it as a replacement for bzexport.

Thanks to our new Scrum process (see also below), this priority adjustment was not very painful. We’ve been shipping Autoland code each week, so, while it doesn’t do much yet, we’re not abandoning any work in progress or leaving patches half finished. Plus, since this new service is also being started from scratch (although involving lots of code reuse from what’s currently in MozReview), we can apply the lessons we learned from the last couple months, so we should be moving pretty quickly.

Newness

As I mentioned above, although the essence of Conduit work right now is decoupling existing functionality from Review Board, it involves a lot of new stuff. Only recently did we realize exactly how much new stuff there was to get used to!

New team members

We welcomed Israel Madueme to our team in January and threw him right into the thick of things. He’s adapted tremendously well and started contributing immediately. Of course a new team member means new team dynamics, but he already feels like one of us.

Just recently, we’ve stolen dkl from the BMO team, where he’s been working since joining Mozilla 6 years ago. I’m excited to have a long-time A-Teamer join the Conduit team.

A new process

At the moment we have five developers working on the new Conduit services. This is more people on a single project than we’re usually able to pull together, so we needed a process to make sure we’re working to our collective potential. Luckily one of us is a certified ScrumMaster. I’ve never actually experienced Scrum-style development before, but we decided to give it a try.

I’ll have a lot more to say about this in the future, as we’re only just hitting our stride now, but it has felt really good to be working with solid organizational principles. We’re spending more time in meetings than usual, but it’s paying off with a new level of focus and productivity.

A new architecture

Working within Review Board was pretty painful, and the MozReview development environment, while amazing in its breadth and coverage, was slow and too heavily focussed on lengthy end-to-end tests. Our new design follows more of a microservice-based approach. The Autoland verification system (which checks users’ permissions and ensures that commits have been properly reviewed) is a separate service, as are the UI and the transplant service (as noted above, this last part was actually one of the few pieces of MozReview that was already decoupled, so we’re one step ahead there). Similarly, on the other side of the pipeline, the commit index is a separate service, and the review service may eventually be split up as well.

We’re not yet going whole-hog on microservices—we don’t plan, for starters at least, to have more than 4 or 5 separate services—but we’re already benefitting from being able to work on features in parallel and preventing runaway complexity. The book Building Microservices has been instrumental to our new design, and it also pinpointed exactly why we had difficulties in our previous approach.

New operations

As the A-Team is now under Laura Thomson, we’re taking advantage of our new, closer relationship to CloudOps to try a new deployment and operations approach. This has freed us of some of the constraints of working in the data centre while letting us take advantage of a proven toolchain and process.

New technologies

We’re using Python 3.5 (and probably 3.6 at some point) for our new services, which I believe is a first for an A-Team project. It’s new for much of the team, but they’ve quickly adapted, and we’re now insulated against the 2020 deadline for Python 2, as well as benefitting from the niceties of Python 3 like better Unicode support.

We also used a few technologies for the Autoland service that are new to most of the team: React and Tornado. While the team found it interesting to learn them, in retrospect using them now was probably a case of premature optimization. Both added complexity that was unnecessary right now. React’s URL routing was difficult to get working in a way that seamlessly supported a local, Docker-based development environment and a production deployment scenario, and Tornado’s asynchronous nature led to extra complexity in automated tests. Although they are both fine technologies and provide scalable solutions for complex apps, the individual Conduit services are currently too small to really benefit.

We’ve learned from this, so we’re going to use Flask as the back end for the push services (commit index and review-request generator), for now at least, and, if we need a UI, we’ll probably use a relatively simple template approach with JavaScript just for enhancements.
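To make that concrete, here is a minimal sketch of the kind of small Flask service this implies. The endpoints and payloads are invented for illustration; they are not the actual Conduit API.

from flask import Flask, jsonify, request

app = Flask(__name__)
COMMITS = {}  # toy in-memory stand-in for a real commit index

@app.route("/commits", methods=["POST"])
def record_commit():
    # Store a pushed commit so it can be looked up later.
    data = request.get_json(force=True)
    COMMITS[data["sha"]] = data
    return jsonify({"recorded": data["sha"]}), 201

@app.route("/commits/<sha>")
def get_commit(sha):
    if sha not in COMMITS:
        return jsonify({"error": "unknown commit"}), 404
    return jsonify(COMMITS[sha])

if __name__ == "__main__":
    app.run(port=8000)

A service this small stays easy to test in isolation, which is much of the appeal over the Tornado-based approach described above.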

Next

In my next post, I’m going to discuss our approach to the push services and more on what we’ve learned from MozReview.


Air Mozilla: Martes Mozilleros, 14 Mar 2017

Tue, 14/03/2017 - 17:00

Martes Mozilleros: bi-weekly meeting to talk (in Spanish) about the state of Mozilla, the community and its projects.


Christian Heilmann: Want to learn more about using the command line? Remy helps!

Tue, 14/03/2017 - 14:29

This is an unashamed plug for Remy Sharp’s terminal training course, command line for non-techies. Go over there and have a look at what he’s lined up for a very affordable price. In a series of videos, he explains all the ins and outs of the terminal and the commands that can make you much more effective in your day-to-day job.

Working the command line ebook

I’ve read the ebook of the same course and have to say that I learned quite a few things but – more importantly – remembered a lot I had forgotten. Through using these commands over and over, a lot has become muscle memory, but it is tough to explain what I am doing. Remy did a great job making the dark command line magic more understandable and less daunting. Here is what the course covers:

Course material

“Just open the terminal”
  • Just open the terminal (03:22)
  • Why use a terminal? (03:23)
  • Navigating directories (07:71)
  • Navigation shortcuts (01:06)
Install all the things
  • Running applications (05:47)
  • brew install fun (07:46)
  • gem install (06:32)
  • npm install --global (09:44)
  • Which is best? (02:13)
Tools of the Terminal Trade
  • Connecting programs (08:25)
  • echo & cat (01:34)
  • grep “searching” (06:22)
  • head tail less (10:24)
  • sort | uniq (07:58)
How (not) to shoot yourself in the foot
  • Delete all the things (07:42)
  • Super user does…sudo (07:50)
  • Permissions: mode & owner (11:16)
  • Kill kill kill! (12:21)
  • Health checking (12:54)
Making the shell your own
  • Owning your terminal (09:19)
  • Fish ~> (10:18)
  • Themes (01:51)
  • zsh (zed shell) (10:11)
  • zsh plugins: z st… (08:26)
  • Aliases (05:43)
  • Alias++ → functions (08:15)
Furthering your command line
  • Piping workflow (08:14)
  • Setting environment values (03:04)
  • Default environment variable values (01:46)
  • Terminal editors (06:41)
  • wget and cURL (09:53)
  • ngrok for tunnelling (06:38)
  • json command for data massage (07:51)
  • awk for splitting output into columns (04:11)
  • xargs (for when pipes won’t do) (02:15)
  • …fun bonus-bonus video (04:13)

I am not getting anything for this, except for making sure that someone as lovely and dedicated as Remy may reach more people with his materials. So, take a peek.


The Mozilla Blog: A Public-Private Partnership for Gigabit Innovation and Internet Health

Tue, 14/03/2017 - 12:18
Mozilla, the National Science Foundation and U.S. Ignite announce $300,000 in grants for gigabit internet projects in Eugene, OR and Lafayette, LA

 

By Chris Lawrence, VP, Leadership Network

At Mozilla, we believe in a networked approach — leveraging the power of diverse people, pooled expertise and shared values.

This was the approach we took nearly 15 years ago when we first launched Firefox. Our open-source browser was — and is — built by a global network of engineers, designers and open web advocates.

This is also the approach Mozilla takes when working toward its greater mission: keeping the internet healthy. We can’t build a healthy internet — one that cherishes freedom, openness and inclusion — alone. To keep the internet a global public resource, we need a network of individuals and organizations and institutions.

One such partnership is Mozilla’s ongoing collaboration with the National Science Foundation (NSF) and U.S. Ignite. We’re currently offering a $2 million prize for projects that decentralize the web. And together in 2014, we launched the Gigabit Community Fund. We committed to supporting promising projects in gigabit-enabled U.S. cities — projects that use connectivity at 250 times normal speeds to make learning more engaging, equitable and impactful.

Today, we’re adding two new cities to the Gigabit Community Fund: Eugene, OR and Lafayette, LA.

 

Beginning in May 2017, we’re providing a total of $300,000 in grants to projects in both new cities. Applications for grants will open in early summer 2017; applicants can be individuals, nonprofits and for-profits.

We’ll support educators, technologists and community activists in Eugene and Lafayette who are building and beta-testing the emerging technologies that are shaping the web. We’ll fuel projects that leverage gigabit networks to make learning more inclusive and engaging through VR field trips, ultra-high definition classroom collaboration, and real-time cross-city robot battles. (These are all real examples from the existing Mozilla gigabit cities of Austin, Chattanooga and Kansas City.)

We’re also investing in the local communities on the ground in Eugene and Lafayette — and in the makers, technologists, and educators who are passionate about local innovation. Mozilla will bring its Mozilla Network approach to both cities, hosting local events and strengthening connections between individuals, schools, nonprofits, museums, and other organizations.

Video: Learn how the Mozilla Gigabit Community Fund supports innovative local projects across the U.S.

Why Eugene and Lafayette? Mozilla Community Gigabit Fund cities are selected based on a range of criteria, including a widely deployed high-speed fiber network; a developing conversation about digital literacy, access, and innovation; a critical mass of community anchor organizations, including arts and educational organizations; an evolving entrepreneurial community; and opportunities to engage K-12 school systems.

We’re excited to fuel innovation in the communities of Eugene and Lafayette  — and to continue our networked approach with NSF, U.S. Ignite and others, in service of a healthier internet.

 

The post A Public-Private Partnership for Gigabit Innovation and Internet Health appeared first on The Mozilla Blog.


This Week In Rust: This Week in Rust 173

Tue, 14/03/2017 - 05:00

Hello and welcome to another issue of This Week in Rust! Rust is a systems language pursuing the trifecta: safety, concurrency, and speed. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

News & Blog Posts

Crate of the Week

This week's crate of the week is µtest, a testing framework for embedded software. Thanks to nasa42 for the suggestion.

Submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from Rust Core

142 pull requests were merged in the last week.

New Contributors
  • CrazyMerlyn
  • Fabjan Sukalia
  • Gibson Fahnestock
  • Joel Gallant
  • Jonas Bushart
  • Joshua Horwitz
  • madseagames
  • Paul Daniel Faria
  • Petr Hosek
  • Tobias Schottdorf
Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now. This week's FCPs are:

New RFCs

Style RFCs

Style RFCs are part of the process for deciding on style guidelines for the Rust community and defaults for Rustfmt. The process is similar to the RFC process, but we try to reach rough consensus on issues (including a final comment period) before progressing to PRs. Just like the RFC process, all users are welcome to comment and submit RFCs. If you want to help decide what Rust code should look like, come get involved!

PRs in final comment period:

Issues in final comment period:

Other significant issues:

Upcoming Events

If you are running a Rust event please add it to the calendar to get it mentioned here. Email the Rust Community Team for access.

Rust Jobs

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

In #rustlang, None is always an Option<_>.

llogiq on Twitter.

Thanks to Johan Sigfrids for the suggestion.

Submit your quotes for next week!

This Week in Rust is edited by: nasa42, llogiq, and brson.


Mitchell Baker: The “Worldview” of Mozilla

Mon, 13/03/2017 - 21:59

There is a set of topics that are important to Mozilla and to what we stand for in the world — healthy communities, global communities, multiculturalism, diversity, tolerance, inclusion, empathy, collaboration, technology for shared good and social benefit. I spoke about them at the Mozilla All Hands in December; if you want to (re)listen to the talk, you can find it here. The sections where I talk about these things are at the beginning, and also starting at about the 14:30 minute mark.

These topics are a key aspect of Mozilla’s worldview.  However, we have not set them out officially as part of who we are, what we stand for and how we describe ourselves publicly.   I’m feeling a deep need to do so.

My goal is to develop a small set of principles about these aspects of Mozilla’s worldview. We have clear principles establishing that Mozilla stands for topics such as security and free and open source software (principles 4 and 7 of the Manifesto). Similarly clear principles about topics such as global communities and multiculturalism will serve us well as we go forward. They will also give us guidance as to the scope and public voice of Mozilla, spanning official communications from Mozilla to the unofficial ways each of us describes Mozilla.

Currently, I’m working on a first draft of the principles. We are working quickly, as quickly as we can while having rich discussions and community-wide participation. If you would like to be involved and can potentially spend some hours reviewing and providing input, please sign up here. Jascha and Jane are supporting me in managing this important project.
I’ll provide updates as we go forward.


Chris Cooper: Shameless self (release) promotion: Firefox 53.0b1 from TaskCluster

Mon, 13/03/2017 - 20:17

You may recall two short months ago when we moved Linux and Android nightlies from buildbot to TaskCluster. Due to the train model, this put us (release engineering) on a clock: either we’d be ready to release a beta version of Firefox 53 for Linux and Android using release promotion in TaskCluster, or we’d need to hold back our work for at least the next cycle, causing uplift headaches galore.

I’m happy to report that we were able to successfully release Firefox 53.0b1 for Linux and Android from TaskCluster last week. This is impressive for 3 reasons:

  1. Mac and Windows builds were still promoted from buildbot, so we were able to seamlessly integrate the artifacts of two different continuous integration (CI) platforms.
  2. The process whereby nightly builds are generated has always been different from how we generate release builds. Firefox 53.0b1 represents the first time a beta build was generated using the same taskgraph we use for a nightly, thereby reducing the delta between CI builds and release builds. More work to be done here, for sure.
  3. Nobody noticed. With all the changes under the hood, this may be the most impressive achievement of all.

A round of thanks to Aki, Johan, Kim, and Mihai who worked hard to get the pieces in place for Android, and a special shout-out to Rail who handled the Linux beta while also dealing with the uplift requirements for ESR52. Of course, thanks to everyone else who has helped with the migration thus far. All of that foundational work is starting to pay off.

Much more to do, but I look forward to updating you about Mac and Windows progress soon.


Air Mozilla: Mozilla Weekly Project Meeting, 13 Mar 2017

Mon, 13/03/2017 - 19:00

Mozilla Weekly Project Meeting: The Monday Project Meeting


Mozilla Addons Blog: WebExtensions in Firefox 54

Mon, 13/03/2017 - 18:00

Firefox 54 landed in Developer Edition this week, so we have another update on WebExtensions for you. In addition to new APIs to help more developers port over to WebExtensions, we also announced a new Office Hours Support schedule where developers can get more personalized help with the transition.

New APIs

A new API for creating sidebars was implemented. This allows you to place a local HTML file inside the sidebar. The API is similar to the one in Opera. If you specify the sidebar_action manifest key, Firefox will create a sidebar.

To allow keyboard commands to be sent to the sidebar, a new _execute_sidebar_action command was added to the commands API, which allows you to trigger the showing of the sidebar.

The ability to override the about:newtab page with pages inside your extension was added to the chrome_url_overrides field in the manifest. Check out the example that uses the topSites API to show the top sites you visit.

The privacy API gives you the ability to flip certain Firefox preferences related to privacy. Although the preferences in Chrome and Firefox aren’t a direct mapping, we’ve mapped the Firefox preferences that make sense to the APIs. Currently implemented are: networkPredictionEnabled, webRTCIPHandlingPolicy and hyperlinkAuditingEnabled.

The protocol_handler API lets you easily map protocols to actions in your extension. For example: we use irccloud at Mozilla, so we can map ircs:// links to irccloud by adding this into an extension:

"protocol_handlers": [ { "protocol": "ircs", "name": "IRC Mozilla Extension", "uriTemplate": "https://irccloud.mozilla.com/#!/%s" } ]

When a user clicks on an IRC link, it shows the application selector with the IRC Mozilla Extension visible.

This release also marks the landing of the first set of devtools APIs. Quite a few APIs landed, including: inspectedWindow.reload(), inspectedWindow.eval(), inspectedWindow.tabId, network.onNavigated, and panels.create().

Here’s an example of the Redux DevTools extension running on Firefox.

Backwards incompatible changes

The webRequest API will now require that you’ve requested the appropriate hosts’ permissions before allowing you to perform webRequest operations on a URL. This will be a backwards-incompatible change for any extension which used webRequest but did not request the host permission.

Deletes in storage.sync are now encrypted. This would be a breaking change for any extensions using storage.sync on Developer Edition.

API Changes

Some key events were completed in some popular APIs:

  • webRequest.onBeforeRequest is initiated before a server-side redirect is about to occur, and webRequest.onAuthRequired is fired when an authentication failure occurs. These allow you to catch authentication requests from servers, such as proxy authentication.
  • webNavigation.onCreatedNavigationTarget event has been completed. This is fired when a new window or tab is created to be navigated to.
  • runtime.onMessageExternal event has been implemented. This is fired when a message is sent from another extension.

Other notable bugs include:

Android

Notably in Firefox 54, basic tabs API support was landed for Android. The API support focuses on the parts of the API that make sense for Android, so some tab methods and events are deliberately not implemented.

This is an important API in its own right, but other key APIs did not have good support without this. By landing this, Android WebExtensions got much better webNavigation and webRequest support. This gives us a clear path to getting ad blockers, the most common extension type on Android.

Contributors

A big thank you to our contributors Rob Wu, Tomislav Jovanovic and Tushar Saini who helped out with this release.

The post WebExtensions in Firefox 54 appeared first on Mozilla Add-ons Blog.


Tim Guan-tin Chien: Ben’s Story of Firefox OS

Mon, 13/03/2017 - 15:50

Like my good old colleague Ben Francis, I too have a lot to say about Firefox OS.

It’s been a little over a year since my team and I moved away from Firefox OS and the ill-fated Connected Devices group. Over the course of the last year, each time I thought about the Firefox OS experience, I arrived at different conclusions and complicated, sometimes emotional, beliefs. I didn’t feel I was ready to take a snapshot of these thoughts and publish them permanently, so I didn’t write about it here. Frankly, I don’t even think I am ready now.

In his post, Ben has pointed out many of the turning points of the project. I agree with many of his portrayals, most importantly the loss of direction between being a product (which must ship fast and deliver whatever partners/consumers wanted and were used to) and a research project (which involves engineering endeavors that answer questions asked in the original announcement). I, however, have not figured out what could have been done instead (which Ben proposed nicely in his post).

Lastly, a final point: to you, this might as well be another story in the volatile tech industry, but to me, the cost to people was felt enormously whenever a change was announced during the “slow death” of Firefox OS.

People move on and recover, including me (and I fortunately was not hit nearly the hardest). I can only extend my best wishes to those I fought the good fight with.


Gregory Szorc: from __past__ import bytes_literals

Mon, 13/03/2017 - 10:55

Last year, I simultaneously committed one of the ugliest and impressive hacks of my programming career. I haven't had time to write about it. Until now.

In summary, the hack is a source-transforming module loader for Python. It can be used by Python 3 to import a Python 2 source file while translating certain primitives to their Python 3 equivalents. It is kind of like 2to3 except it executes at run-time during import. The main goal of the hack was to facilitate porting Mercurial to Python 3 while deferring having to make the most invasive - and therefore most annoying - elements of the port in the canonical source code representation.

For the technically curious, it works as follows.

The hg Python executable registers a custom meta path finder instance. This entity is invoked during import statements to try to find the module being imported. It tells a later phase of the import mechanism how to load that module from wherever it is (usually a .py or .pyc file on disk) to a Python module object. The custom finder only responds to requests for modules known to be managed by the Mercurial project. For these modules, it tells the next stage of the import mechanism to invoke a custom SourceLoader instance.

Here's where the real magic is: when the custom loader is invoked, it tokenizes the Python source code using the tokenize module, iterates over the token stream, finds specific patterns, and rewrites them to something more appropriate. It then untokenizes back to Python source code and falls back to the built-in loader, which does the heavy lifting of compiling the source to Python code objects. So, we have Python 2 source files on disk that magically get transformed to be Python 3 compatible when they are loaded by Python 3. Oh, and there is no performance penalty for the token transformation on subsequent loads, because the transformed bytecode is cached in the .pyc file (using a custom header so we know it was transformed and can be invalidated when the transformation logic changes).
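To illustrate, here is a minimal sketch of the same mechanism in Python 3 terms. This is not Mercurial's actual code: the package prefix is hypothetical, and it compiles directly rather than deferring to the built-in loader.

import importlib.machinery
import io
import sys
import tokenize

class BytesLiteralLoader(importlib.machinery.SourceFileLoader):
    """Rewrite unprefixed string literals to bytes literals before compiling."""

    def source_to_code(self, data, path, *, _optimize=-1):
        tokens = []
        for tok in tokenize.tokenize(io.BytesIO(data).readline):
            # Only plain '...' / "..." literals; prefixed ones (b'', u'', r'')
            # are left alone in this simplified sketch.
            if tok.type == tokenize.STRING and tok.string[0] in "'\"":
                tok = tok._replace(string="b" + tok.string)
            tokens.append(tok)
        # The real implementation also tags the cached .pyc so it can be
        # invalidated when the transformation logic changes; omitted here.
        return compile(tokenize.untokenize(tokens), path, "exec", dont_inherit=True)

class BytesLiteralFinder:
    """Meta path finder that applies the loader above to one package only."""

    def __init__(self, prefix):
        self.prefix = prefix  # e.g. "mypkg." (hypothetical package name)

    def find_spec(self, fullname, path=None, target=None):
        if not (fullname + ".").startswith(self.prefix):
            return None  # not ours; let the normal finders handle it
        spec = importlib.machinery.PathFinder.find_spec(fullname, path)
        if spec and isinstance(spec.loader, importlib.machinery.SourceFileLoader):
            spec.loader = BytesLiteralLoader(spec.loader.name, spec.loader.path)
        return spec

sys.meta_path.insert(0, BytesLiteralFinder("mypkg."))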

At the time I wrote it, the token stream manipulation converted most string literals ('') to bytes literals (b''). In other words, it restored the Python 2 behavior of string literals being bytes and not unicode. We jokingly call it from __past__ import bytes_literals (a play on Python 2's from __future__ import unicode_literals special syntax which changes string literals from Python 2's str/bytes type to unicode to match Python 3's behavior).

Since I implemented the first version, others have implemented:

As one can expect, when I tweeted a link to this commit, many Python developers (including a few CPython core developers) expressed a mix of intrigue and horror. But mostly horror.

I fully concede that what I did here is a gross hack. And, it is the intention of the Mercurial project to undo this hack and perform a proper port once Python 3 support in Mercurial is more mature. But, I want to lay out my defense on why I did this and why the Mercurial project is tolerant of this ugly hack.

Individuals within the Mercurial project have wanted to port to Python 3 for years. Until recently, it hasn't been a project priority because a port was too much work for too little end-user gain. And, on the technical front, a port was just not practical until Python 3.5. (Two main blockers were no u'' literals - restored in Python 3.3 - and no % formatting for b'' literals - restored in 3.5. And as I understand it, senior members of the Mercurial project had to lobby Python maintainers pretty hard to get features like % formatting of b'' literals restored to Python 3.)

Anyway, after a number of failed attempts to initiate the Python 3 port over the years, the Mercurial project started making some positive steps towards Python 3 compatibility, such as switching to absolute imports and addressing syntax issues that allowed modules to be parsed into an AST and even compiled and loadable. These may seem like small steps, but for a larger project, it was a lot of work.

The porting effort hit a large wall when it came time to actually make the AST-valid Python code run on Python 3. Specifically, we had a strings problem.

When you write software that exchanges data between machines - sometimes machines running different operating systems or having different encodings - and there is an expectation that things work the same and data roundtrips accordingly, trying to force text encodings is essentially impossible and inevitably breaks something or someone. It is much easier for Mercurial to operate bytes first and only take text encoding into consideration when absolutely necessary (such as when emitting bytes to the terminal in the wanted encoding or when emitting JSON). That's not to say Mercurial ignores the existence of encodings. Far from it: Mercurial does attempt to normalize some data to Unicode. But it often does so with a special Python type that internally stores the raw byte sequence of the source so that a consumer can choose to operate at the bytes or Unicode level.
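In Python 3 terms, a toy sketch of such a dual-view type (hypothetical, not Mercurial's actual class) could look like this:

class EncodedStr(str):
    """A str that remembers the raw byte sequence it was decoded from."""

    def __new__(cls, raw, encoding="utf-8"):
        self = super().__new__(cls, raw.decode(encoding, "replace"))
        self.raw = raw  # consumers can drop back to the exact source bytes
        return self

name = EncodedStr(b"caf\xc3\xa9")
assert name == "café"              # operate at the Unicode level...
assert name.raw == b"caf\xc3\xa9"  # ...or at the bytes level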

Anyway, this means that practically every string variable in Mercurial is a bytes type (or something that acts like a bytes type). And since string literals in Python 3 are the str type (which represents Unicode), that would mean having to prefix almost every '' string literal in Mercurial with b'' in order to placate Python 3. Having to update every occurrence of simple primitives that could be statically transformed automatically felt like busy work. We wanted to spend time on the meaningful parts of the Python 3 port so we could find interesting problems and challenges, not toil with mechanical conversions that add little to no short-term value while simultaneously increasing cognitive dissonance and quite possibly increasing the odds of introducing a bug in Python 2. In other words, why should humans do the work that machines can do for us? Thus, the source-transforming module importer was born.

While I concede what Mercurial did is a giant hack, I maintain it was the correct thing to do. It has allowed the Python 3 port to move forward without being blocked on the more tedious and invasive transformations that could introduce subtle bugs (including performance regressions) in Python 2. Perfect is the enemy of good. People time is valuable. The source-transforming module importer allowed us to unblock an important project without sinking a lot of people time into it. I'd make that trade-off again.

While I won't encourage others to take this approach to porting to Python 3, if you want to, Mercurial's source is available under a GPL license and the custom module importer could be adapted to any project with minimal modifications. If someone does extract it as reusable code, please leave a comment and I'll update the post to link to it.


The Servo Blog: These Weeks In Servo 94

Mon, 13/03/2017 - 01:30

In the last two weeks, we landed 185 PRs in the Servo organization’s repositories.

Planning and Status

Our overall roadmap is available online, including the overall plans for 2017 and Q1. Please check it out and provide feedback!

This week’s status updates are here.

Notable Additions
  • samgiles and rabisg added the Origin header to fetch requests.
  • hiikezoe made CSS animations be processed by Stylo.
  • Manishearth supported SVG presentation attributes in Stylo.
  • ferjm allowed redirects to occur after a CORS preflight fetch request.
  • sendilkumarn corrected the behaviour of loading user scripts in iframes.
  • pcwalton improved the performance of layout queries and requestAnimationFrame.
  • emilio removed unnecessary heap allocation for some CSS parsers.
  • mephisto41 implemented gradient border support in WebRender.
  • jdm avoided some panics triggered by image elements initiating multiple requests.
  • MortimerGoro improved the Android integration and lifecycle hooks.
  • nox removed the last uses of serde_codegen.
  • ferjm avoided a deadlock triggered by the Document.elementsFromPoint API.
  • fitzgen improved the rust-bindgen support for complex template parameter usages.
  • gw implemented page zoom support in WebRender.
  • dpyro added support for the nosniff algorithm in the Fetch implementation.
  • KiChjang implemented the :lang pseudoclass.
New Contributors

Interested in helping build a web browser? Take a look at our curated list of issues that are good for new contributors!


Mike Hommey: When the memory allocator works against you

Sun, 12/03/2017 - 02:47

Cloning mozilla-central with git-cinnabar requires a lot of memory. Actually too much memory to fit in a 32-bits address space.

I hadn’t optimized for memory use in the first place. For instance, git-cinnabar keeps sha-1s in memory as hex values (40 bytes) rather than raw values (20 bytes). When I wrote the initial prototype, it didn’t matter that much, and while close(ish) to the tipping point, it didn’t require more than 2GB of memory at the time.
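The 40-byte versus 20-byte difference is easy to see (illustrative snippet, not git-cinnabar code):

import binascii

hex_sha = "a94a8fe5ccb19ba61c4c0873d391e987982fbbd3"  # a sha-1 as 40 bytes of hex text
raw_sha = binascii.unhexlify(hex_sha)                 # the same sha-1 as 20 raw bytes
assert len(hex_sha) == 40 and len(raw_sha) == 20
assert binascii.hexlify(raw_sha) == hex_sha.encode()  # round-trips losslessly

Multiply that by the number of sha-1s a mozilla-central clone has to track, and the representation choice adds up.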

Time passed, and mozilla-central grew. I suspect the recent addition of several thousands of commits and files has made things worse.

In order to come up with a plan to make things better (short or longer term), I needed data. So I added some basic memory resource tracking, and collected data while cloning mozilla-central.

I must admit, I was not ready for what I witnessed. Follow me for a tale of frustrations (plural).

I was expecting things to have gotten worse on the master branch (which I used for the data collection) because I am in the middle of some refactoring and did many changes that I was suspecting might have affected memory usage. I wasn’t, however, expecting to see the clone command using 10GB(!) memory at peak usage across all processes.

(Note, those memory sizes are RSS, minus “shared”)

It also was taking an unexpectedly long time, but then, I hadn’t cloned a large repository like mozilla-central from scratch in a while, so I wasn’t sure if it was just related to its recent growth in size or otherwise. So I collected data on 0.4.0 as well.

Less time spent, less memory usage… ok. There’s definitely something wrong on master. But wait a minute, that slope from ~2GB to ~4GB on the git-remote-hg process doesn’t actually make any kind of sense. I mean, I’d understand it if it were starting and finishing with the “Import manifest” phase, but it starts in the middle of it, and ends long before it finishes. WTH?

First things first, since RSS can be a variety of things, I checked /proc/$pid/smaps and confirmed that most of it was, indeed, the heap.
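That check amounts to something like the following sketch, which sums the Rss of the [heap] mapping (glibc's secondary arenas live in anonymous mappings and would need similar treatment):

def heap_rss_kb(pid):
    """Sum the Rss (in kB) of the [heap] mappings of a process."""
    total, in_heap = 0, False
    with open("/proc/%d/smaps" % pid) as f:
        for line in f:
            fields = line.split()
            if "-" in fields[0]:  # a mapping header line, not an attribute
                in_heap = line.rstrip().endswith("[heap]")
            elif in_heap and fields[0] == "Rss:":
                total += int(fields[1])
    return total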

That’s the point where you reach for Google, type something like “python memory profile” and find various tools. One from the results that I remembered having used in the past is guppy’s heapy.

Armed with pdb, I broke execution in the middle of the slope, and tried to get memory stats with heapy. SIGSEGV. Ouch.

Let’s try something else. I reached out to objgraph and pympler. SIGSEGV. Ouch again.

Tried working around the crashes for a while (too long a while, retrospectively; hindsight is 20/20), and was somehow successful at avoiding them by peeking at a smaller set of objects. But whatever I did, despite being attached to a process that had 2.6GB RSS, I wasn’t able to find more than 1.3GB of data. This wasn’t adding up.

It surely didn’t help that getting to that point took close to an hour each time. Retrospectively, I wish I had investigated using something like Checkpoint/Restore in Userspace.

Anyways, after a while, I decided that I really wanted to try to see the whole picture, not smaller peaks here and there that might be missing something. So I resolved myself to look at the SIGSEGV I was getting when using pympler, collecting a core dump when it happened.

Guess what? The Debian python-dbg package does not contain the debug symbols for the python package. The core dump was useless.

Since I was expecting I’d have to fix something in python, I just downloaded its source and built it. Ran the command again, waited, and finally got a backtrace. First Google hit for the crashing function? The exact (unfixed) crash reported on the python bug tracker. No patch.

Crashing code is doing:

((f)->f_builtins != (f)->f_tstate->interp->builtins)

And (f)->f_tstate is NULL. Classic NULL deref.

Added a guard (assessing it wouldn’t break anything). Ran the command again. Waited. Again. SIGSEGV.

Facedesk. Another crash on the same line. Did I really use the patched python? Yes. But this time (f)->f_tstate->interp is NULL. Sigh.

Same player, shoot again.

Finally, no crash… but still stuck on only 1.3GB accounted for. Ok, I know not all python memory profiling tools are entirely reliable, let’s try heapy again. SIGSEGV. Sigh. No debug info on the heapy module, where the crash happens. Sigh. Rebuild the module with debug info, try again. The backtrace looks like heapy is recursing a lot. Look at %rsp, compare with the address space from /proc/$pid/maps. Confirmed. A stack overflow. Let’s do ugly things and increase the stack size in brutal ways.
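For reference, one blunt way to raise the stack limit from inside a Python process looks like this (a sketch; the actual change may well have been made via ulimit or in the module itself):

import resource

# Raise the soft stack limit to the hard maximum so deep C recursion
# (like heapy's) has more room to breathe.
soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))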

Woohoo! Now heapy tells me there’s even less memory used than the 1.3GB I found so far. Like, half less. Yeah, right.

I’m not clear on how I got there, but that’s when I found gdb-heap, a tool from Red Hat’s David Malcolm, and the associated talk “Dude, where’s my RAM?” A deep dive into how Python uses memory (slides).

With a gdb attached, I would finally be able to rip python’s guts out and find where all the memory went. Or so I thought. The gdb-heap tool only found about 600MB. About as much as heapy did, for that matter, but it could be coincidental. Oh. Kay.

I don’t remember exactly what went through my mind then, but, since I was attached to a running process with gdb, I typed the following on the gdb prompt:

gdb> call malloc_stats()

And that’s when the truth was finally unveiled: the memory allocator was just acting up the whole time. The output was something like:

Arena 0:
system bytes     = some number above (but close to) 2GB
in use bytes     = some number above (but close to) 600MB

Yes, the glibc allocator was telling me it had allocated 600MB of memory, but was holding onto 2GB. I must have found a really bad allocation pattern that causes massive fragmentation.

One thing that David Malcolm’s talk taught me, though, is that python uses its own allocator for small sizes, so the glibc allocator doesn’t know about them. And, roughly, taking the difference between RSS and what glibc said it was holding, and adding it to the “in use bytes” glibc reported, somehow matches the 1.3GB I had found so far.

So it was time to see how those things evolved in time, during the entire clone process. I grabbed some new data, tracking the evolution of “system bytes” and “in use bytes”.
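Those numbers can be sampled from inside the process with a glibc-specific ctypes sketch like this one. Note that the classic mallinfo fields are C ints and wrap above 2GB (glibc 2.33+ added mallinfo2 for that reason), so treat the values as approximate.

import ctypes

class MallInfo(ctypes.Structure):
    _fields_ = [(name, ctypes.c_int) for name in (
        "arena", "ordblks", "smblks", "hblks", "hblkhd",
        "usmblks", "fsmblks", "uordblks", "fordblks", "keepcost")]

libc = ctypes.CDLL("libc.so.6")
libc.mallinfo.restype = MallInfo

def malloc_snapshot():
    mi = libc.mallinfo()
    system_bytes = mi.arena + mi.hblkhd   # main heap plus mmap'd blocks
    in_use_bytes = mi.uordblks + mi.hblkhd
    return system_bytes, in_use_bytes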

There are two things of note on this data:

  • There is a relatively large gap between what the glibc allocator says it has gotten from the system, and the RSS (minus “shared”) size, that I’m expecting corresponds to the small allocations that python handles itself.
  • Actual memory use is going down during the “Import manifests” phase, contrary to what the evolution of RSS suggests.

In fact, the latter is exactly how git-cinnabar is supposed to work: It reads changesets and manifests chunks, and holds onto them while importing files. Then it throws away those manifests and changesets chunks one by one while it imports them. There is, however, some extra bookkeeping that requires some additional memory, but it’s expected to be less memory consuming than keeping all the changesets and manifests chunks in memory.

At this point, I thought a possible explanation is that since both python and glibc are mmap()ing their own arenas, they might be intertwined in a way that makes things not go well with the allocation pattern happening during the “Import manifest” phase (which, in fact, allocates and frees increasingly large buffers for each manifest, as manifests grow in size in the mozilla-central history).

To put the theory to work, I patched the python interpreter again, making it use malloc() instead of mmap() for its arenas.

“Aha!” I thought. That definitely looks much better. Less gap between what glibc says it requested from the system and the RSS size. And, more importantly, no runaway increase of memory usage in the middle of nowhere.

I was preparing myself to write a post about how mixing allocators could have unintended consequences. As a comparison point, I went ahead and ran another test, with the python allocator entirely disabled, this time.

Heh. It turns out glibc was acting up all alone. So much for my (plausible) theory. (I still think mixing allocators can have unintended consequences.)

(Note, however, that the reason why the python allocator exists is valid: without it, the overall clone took almost 10 more minutes)

And since I had been getting all this data with 0.4.0, I gathered new data without the python allocator with the master branch.

This paints a rather different picture than the original data on that branch, with much less memory use regression than one would think. In fact, there isn’t much difference, except for the spike at the end, which got worse, and some of the noise during the “Import manifests” phase that got bigger, implying larger amounts of temporary memory used. The latter may contribute to the allocation patterns that throw glibc’s memory allocator off.

It turns out tracking memory usage in python 2.7 is rather painful, and not all the tools paint a complete picture of it. I hear python 3.x is somewhat better in that regard, and I hope it’s true, but at the moment, I’m stuck with 2.7. The most reliable tool I’ve used here, it turns out, is pympler. Or rebuilding the python interpreter without its allocator, and asking the system allocator what is allocated.

With all this data, I now have some defined problems to tackle, some easy (the spike at the end of the clone), and some less easy (working around glibc allocator’s behavior). I have a few hunches as to what kind of allocations are causing the runaway increase of RSS. Coincidentally, I’m half-way through a refactor of the code dealing with manifests, and it should help dealing with the issue.

But that will be the subject of a subsequent post.
