Mozilla Nederland
The Dutch Mozilla community

Subscribe to the Mozilla planet feed
Planet Mozilla - http://planet.mozilla.org/
Updated: 11 hours 14 min ago

Daniel Stenberg: curl security audit

Wed, 23/11/2016 - 11:10

“the overall impression of the state of security and robustness
of the cURL library was positive.”

I asked for, and we were granted a security audit of curl from the Mozilla Secure Open Source program a while ago. This was done by Mozilla getting a 3rd party company involved to do the job and footing the bill for it. The auditing company is called Cure53.

I applied for the security audit because I feel that we’ve had some security-related issues lately and I’ve had the feeling that we might be missing something, so it would be really good to get some experts’ eyes on the code. Also, as curl is one of the most used software components in the world, a serious problem in curl could have a serious impact on tools, devices and applications everywhere. We don’t want that to happen.

Scans and tests and all

We run static analyzers on the code frequently, with a zero-warnings tolerance. The daily clang-analyzer scan hasn’t found a problem in a long time, and the Coverity scan that runs every few weeks occasionally finds something suspicious, but we always fix those immediately.

We have thousands of tests and unit tests that we run non-stop on the code on multiple platforms running multiple build combinations. We also use valgrind when running tests to verify memory use and check for potential memory leaks.

Secrecy

The audit itself, the report and the work on fixing the issues were all done on closed mailing lists without revealing to the world what was really going on. All as our fine security process describes.

There are several downsides to fixing things secretly. One of the primary ones is that we get far fewer eyes on the fixes, and there aren’t that many people involved when discussing solutions or approaches to the issues at hand. Another is that our test infrastructure is built for, and runs only, public code, so the code can’t really be fully tested until it is merged into the public git repository.

The report

We got the report on September 23, 2016 and it certainly gave us a lot of work.

The audit report has now been made public and is a very interesting work if you’re into security, C code and curl hacking. I find the report very clear, well written and it spells out each problem very accurately and even shows proof of concept code snippets and exploit examples to drive the points home.

Quoted from the report intro:

As for the approach, the test was rooted in the public availability of the source code belonging to the cURL software and the investigation involved five testers of the Cure53 team. The tool was tested over the course of twenty days in August and September of 2016 and main efforts were focused on examining cURL 7.50.1. and later versions of cURL. It has to be noted that rather than employ fuzzing or similar approaches to validate the robustness of the build of the application and library, the latter goal was pursued through a classic source code audit. Sources covering authentication, various protocols, and, partly, SSL/TLS, were analyzed in considerable detail. A rationale behind this type of scoping pointed to these parts of the cURL tool that were most likely to be prone and exposed to real-life attack scenarios. Rounding up the methodology of the classic code audit, Cure53 benefited from certain tools, which included ASAN targeted with detecting memory errors, as well as Helgrind, which was tasked with pinpointing synchronization errors with the threading model.

They identified no less than twenty-three (23) potential problems in the code, out of which nine were deemed security vulnerabilities. But I’d also like to emphasize that they also said this:

At the same time, the overall impression of the state of security and robustness of the cURL library was positive.

Resolving problems

In the curl security team we decided to downgrade one of the 9 vulnerabilities to a “plain bug”, since the required attack scenario was very complicated and the risk deemed small, and we merged two of the issues and treated them as a single one. That left us with 7 security vulnerabilities. Whoa, that’s a lot. The largest number we’ve ever fixed in a single release before was 4.

I consider handling security issues in the project to be one of my most important tasks; pretty much all other jobs are down-prioritized in comparison. So with a large queue of security work, a lot of bug fixing and work on features basically had to halt.

You can get a fairly detailed description of our work on fixing the issues in the fix and validation log. The report, the log and the advisories we’ve already posted should cover enough details about these problems and associated fixes that I don’t feel a need to write about them much further.

More problems

Just because we got our hands full with an audit report doesn’t mean that the world stops, right? While working on the issues one by one to have them fixed, we also received an additional 4 security issues to add to the set, reported by three independent individuals.

All these issues gave me a really busy period, and it felt great when we finally shipped 7.51.0 and announced all eleven fixes to the world, and I could get a short period of relief until the next tsunami hits.

Categories: Mozilla-nl planet

Nicholas Nethercote: How to speed up the Rust compiler

Wed, 23/11/2016 - 06:08

Rust is a great language, and Mozilla plans to use it extensively in Firefox. However, the Rust compiler (rustc) is quite slow and compile times are a pain point for many Rust users. Recently I’ve been working on improving that. This post covers how I’ve done this, and should be of interest to anybody else who wants to help speed up the Rust compiler. Although I’ve done all this work on Linux it should be mostly applicable to other platforms as well.

Getting the code

The first step is to get the rustc code. First, I fork the main Rust repository on GitHub. Then I make two local clones: a base clone that I won’t modify, which serves as a stable comparison point (rust0), and a second clone where I make my modifications (rust1). I use commands something like this:

user=nnethercote
for r in rust0 rust1 ; do
  cd ~/moz
  git clone https://github.com/$user/rust $r
  cd $r
  git remote add upstream https://github.com/rust-lang/rust
  git remote set-url origin git@github.com:$user/rust
done

Building the Rust compiler

Within the two repositories, I first configure:

./configure --enable-optimize --enable-debuginfo

I configure with optimizations enabled because that matches release versions of rustc. And I configure with debug info enabled so that I get good information from profilers.

[Update: I now add --enable-llvm-release-debuginfo which builds the LLVM back-end with debug info too.]

Then I build:

RUSTFLAGS='' make -j8

[Update: I previously had -Ccodegen-units=8 in RUSTFLAGS because it speeds up compile times. But Lars Bergstrom informed me that it can slow down the resulting program significantly. I measured and he was right — the resulting rustc was about 5–10% slower. So I’ve stopped using it now.]

That does a full build, which does the following:

  • Downloads a stage0 compiler, which will be used to build the stage1 local compiler.
  • Builds LLVM, which will become part of the local compilers.
  • Builds the stage1 compiler with the stage0 compiler.
  • Builds the stage2 compiler with the stage1 compiler.

It can be mind-bending to grok all the stages, especially with regard to how libraries work. (One notable example: the stage1 compiler uses the system allocator, but the stage2 compiler uses jemalloc.) I’ve found that the stage1 and stage2 compilers have similar performance. Therefore, I mostly measure the stage1 compiler because it’s much faster to build just the stage1 compiler, which I do with the following command.

RUSTFLAGS='' make -j8 rustc-stage1

Building the compiler takes a while, which isn’t surprising. What is more surprising is that rebuilding the compiler after a small change also takes a while. That’s because a lot of code gets recompiled after any change. There are two reasons for this.

  • Rust’s unit of compilation is the crate. Each crate can consist of multiple files. If you modify a crate, the whole crate must be rebuilt. This isn’t surprising.
  • rustc’s dependency checking is very coarse. If you modify a crate, every other crate that depends on it will also be rebuilt, no matter how trivial the modification. This surprised me greatly. For example, any modification to the parser (which is in a crate called libsyntax) causes multiple other crates to be recompiled, a process which takes 6 minutes on my fast desktop machine. Almost any change to the compiler will result in a rebuild that takes at least 2 or 3 minutes.

Incremental compilation should greatly improve the dependency situation, but it’s still in an experimental state and I haven’t tried it yet.

To run all the tests I do this (after a full build):

ulimit -c 0 && make check

The checking aborts if you don’t do the ulimit, because the tests produce lots of core files and it doesn’t want to swamp your disk.

The build system is complex, with lots of options. This command gives a nice overview of some common invocations:

make tips

Basic profiling

The next step is to do some basic profiling. I like to be careful about which rustc I am invoking at any time, especially if there’s a system-wide version installed, so I avoid relying on PATH and instead define some environment variables like this:

export RUSTC01="$HOME/moz/rust0/x86_64-unknown-linux-gnu/stage1/bin/rustc"
export RUSTC02="$HOME/moz/rust0/x86_64-unknown-linux-gnu/stage2/bin/rustc"
export RUSTC11="$HOME/moz/rust1/x86_64-unknown-linux-gnu/stage1/bin/rustc"
export RUSTC12="$HOME/moz/rust1/x86_64-unknown-linux-gnu/stage2/bin/rustc"

In the examples that follow I will use $RUSTC01 as the version of rustc that I invoke.

rustc has the ability to produce some basic stats about the time and memory used by each compiler pass. It is enabled with the -Ztime-passes flag. If you are invoking rustc directly you’d do it like this:

$RUSTC01 -Ztime-passes a.rs

If you are building with Cargo you can instead do this:

RUSTC=$RUSTC01 cargo rustc -- -Ztime-passes

The RUSTC= part tells Cargo you want to use a non-default rustc, and the part after the -- is the set of flags that will be passed to rustc when it builds the final crate. (A bit weird, but useful.)

Here is some sample output from -Ztime-passes:

time: 0.056; rss: 49MB parsing
time: 0.000; rss: 49MB recursion limit
time: 0.000; rss: 49MB crate injection
time: 0.000; rss: 49MB plugin loading
time: 0.000; rss: 49MB plugin registration
time: 0.103; rss: 87MB expansion
time: 0.000; rss: 87MB maybe building test harness
time: 0.002; rss: 87MB maybe creating a macro crate
time: 0.000; rss: 87MB checking for inline asm in case the target doesn't support it
time: 0.005; rss: 87MB complete gated feature checking
time: 0.008; rss: 87MB early lint checks
time: 0.003; rss: 87MB AST validation
time: 0.026; rss: 90MB name resolution
time: 0.019; rss: 103MB lowering ast -> hir
time: 0.004; rss: 105MB indexing hir
time: 0.003; rss: 105MB attribute checking
time: 0.003; rss: 105MB language item collection
time: 0.004; rss: 105MB lifetime resolution
time: 0.000; rss: 105MB looking for entry point
time: 0.000; rss: 105MB looking for plugin registrar
time: 0.015; rss: 109MB region resolution
time: 0.002; rss: 109MB loop checking
time: 0.002; rss: 109MB static item recursion checking
time: 0.060; rss: 109MB compute_incremental_hashes_map
time: 0.000; rss: 109MB load_dep_graph
time: 0.021; rss: 109MB type collecting
time: 0.000; rss: 109MB variance inference
time: 0.038; rss: 113MB coherence checking
time: 0.126; rss: 114MB wf checking
time: 0.219; rss: 118MB item-types checking
time: 1.158; rss: 125MB item-bodies checking
time: 0.000; rss: 125MB drop-impl checking
time: 0.092; rss: 127MB const checking
time: 0.015; rss: 127MB privacy checking
time: 0.002; rss: 127MB stability index
time: 0.011; rss: 127MB intrinsic checking
time: 0.007; rss: 127MB effect checking
time: 0.027; rss: 127MB match checking
time: 0.014; rss: 127MB liveness checking
time: 0.082; rss: 127MB rvalue checking
time: 0.145; rss: 161MB MIR dump
  time: 0.015; rss: 161MB SimplifyCfg
  time: 0.033; rss: 161MB QualifyAndPromoteConstants
  time: 0.034; rss: 161MB TypeckMir
  time: 0.001; rss: 161MB SimplifyBranches
  time: 0.006; rss: 161MB SimplifyCfg
time: 0.089; rss: 161MB MIR passes
time: 0.202; rss: 161MB borrow checking
time: 0.005; rss: 161MB reachability checking
time: 0.012; rss: 161MB death checking
time: 0.014; rss: 162MB stability checking
time: 0.000; rss: 162MB unused lib feature checking
time: 0.101; rss: 162MB lint checking
time: 0.000; rss: 162MB resolving dependency formats
  time: 0.001; rss: 162MB NoLandingPads
  time: 0.007; rss: 162MB SimplifyCfg
  time: 0.017; rss: 162MB EraseRegions
  time: 0.004; rss: 162MB AddCallGuards
  time: 0.126; rss: 164MB ElaborateDrops
  time: 0.001; rss: 164MB NoLandingPads
  time: 0.012; rss: 164MB SimplifyCfg
  time: 0.008; rss: 164MB InstCombine
  time: 0.003; rss: 164MB Deaggregator
  time: 0.001; rss: 164MB CopyPropagation
  time: 0.003; rss: 164MB AddCallGuards
  time: 0.001; rss: 164MB PreTrans
time: 0.182; rss: 164MB Prepare MIR codegen passes
time: 0.081; rss: 167MB write metadata
time: 0.590; rss: 177MB translation item collection
time: 0.034; rss: 180MB codegen unit partitioning
time: 0.032; rss: 300MB internalize symbols
time: 3.491; rss: 300MB translation
time: 0.000; rss: 300MB assert dep graph
time: 0.000; rss: 300MB serialize dep graph
  time: 0.216; rss: 292MB llvm function passes [0]
  time: 0.103; rss: 292MB llvm module passes [0]
  time: 4.497; rss: 308MB codegen passes [0]
  time: 0.004; rss: 308MB codegen passes [0]
time: 5.185; rss: 308MB LLVM passes
time: 0.000; rss: 308MB serialize work products
time: 0.257; rss: 297MB linking

As far as I can tell, the indented passes are sub-passes, and the parent pass is the first non-indented pass afterwards.

More serious profiling

The -Ztime-passes flag gives a good overview, but you really need a profiling tool that gives finer-grained information to get far. I’ve done most of my profiling with two Valgrind tools, Cachegrind and DHAT. I invoke Cachegrind like this:

valgrind \
  --tool=cachegrind --cache-sim=no --branch-sim=yes \
  --cachegrind-out-file=$OUTFILE $RUSTC01 ...

where $OUTFILE specifies an output filename. I find the instruction counts measured by Cachegrind to be highly useful; the branch simulation results are occasionally useful, and the cache simulation results are almost never useful.

The Cachegrind output looks like this:

--------------------------------------------------------------------------------
            Ir
--------------------------------------------------------------------------------
22,153,170,953  PROGRAM TOTALS

--------------------------------------------------------------------------------
            Ir  file:function
--------------------------------------------------------------------------------
   923,519,467  /build/glibc-GKVZIf/glibc-2.23/malloc/malloc.c:_int_malloc
   879,700,120  /home/njn/moz/rust0/src/rt/miniz.c:tdefl_compress
   629,196,933  /build/glibc-GKVZIf/glibc-2.23/malloc/malloc.c:_int_free
   394,687,991  ???:???
   379,869,259  /home/njn/moz/rust0/src/libserialize/leb128.rs:serialize::leb128::read_unsigned_leb128
   376,921,973  /build/glibc-GKVZIf/glibc-2.23/malloc/malloc.c:malloc
   263,083,755  /build/glibc-GKVZIf/glibc-2.23/string/::/sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S:__memcpy_avx_unaligned
   257,219,281  /home/njn/moz/rust0/src/libserialize/opaque.rs:<serialize::opaque::Decoder<'a> as serialize::serialize::Decoder>::read_usize
   217,838,379  /build/glibc-GKVZIf/glibc-2.23/malloc/malloc.c:free
   217,006,132  /home/njn/moz/rust0/src/librustc_back/sha2.rs:rustc_back::sha2::Engine256State::process_block
   211,098,567  ???:llvm::SelectionDAG::Combine(llvm::CombineLevel, llvm::AAResults&, llvm::CodeGenOpt::Level)
   185,630,213  /home/njn/moz/rust0/src/libcore/hash/sip.rs:<rustc_incremental::calculate_svh::hasher::IchHasher as core::hash::Hasher>::write
   171,360,754  /home/njn/moz/rust0/src/librustc_data_structures/fnv.rs:<rustc::ty::subst::Substs<'tcx> as core::hash::Hash>::hash
   150,026,054  ???:llvm::SelectionDAGISel::SelectCodeCommon(llvm::SDNode*, unsigned char const*, unsigned int)

Here “Ir” is short for “I-cache reads”, which corresponds to the number of instructions executed. Cachegrind also gives line-by-line annotations of the source code.

The Cachegrind results indicate that malloc and free are usually the two hottest functions in the compiler. So I also use DHAT, which is a malloc profiler that tells you exactly where all your malloc calls are coming from.  I invoke DHAT like this:

/home/njn/grind/ws3/vg-in-place \
  --tool=exp-dhat --show-top-n=1000 --num-callers=4 \
  --sort-by=tot-blocks-allocd $RUSTC01 ... 2> $OUTFILE

I sometimes also use --sort-by=tot-bytes-allocd. DHAT’s output looks like this:

==16425== -------------------- 1 of 1000 --------------------
==16425== max-live:    30,240 in 378 blocks
==16425== tot-alloc:   20,866,160 in 260,827 blocks (avg size 80.00)
==16425== deaths:      260,827, at avg age 113,438 (0.00% of prog lifetime)
==16425== acc-ratios:  0.74 rd, 1.00 wr  (15,498,021 b-read, 20,866,160 b-written)
==16425==    at 0x4C2BFA6: malloc (vg_replace_malloc.c:299)
==16425==    by 0x5AD392B: <syntax::ptr::P<T> as serialize::serialize::Decodable>::decode (heap.rs:59)
==16425==    by 0x5AD4456: <core::iter::Map<I, F> as core::iter::iterator::Iterator>::next (serialize.rs:201)
==16425==    by 0x5AE2A52: rustc_metadata::decoder::<impl rustc_metadata::cstore::CrateMetadata>::get_attributes (vec.rs:1556)
==16425==
==16425== -------------------- 2 of 1000 --------------------
==16425== max-live:    1,360 in 17 blocks
==16425== tot-alloc:   10,378,160 in 129,727 blocks (avg size 80.00)
==16425== deaths:      129,727, at avg age 11,622 (0.00% of prog lifetime)
==16425== acc-ratios:  0.47 rd, 0.92 wr  (4,929,626 b-read, 9,599,798 b-written)
==16425==    at 0x4C2BFA6: malloc (vg_replace_malloc.c:299)
==16425==    by 0x881136A: <syntax::ptr::P<T> as core::clone::Clone>::clone (heap.rs:59)
==16425==    by 0x88233A7: syntax::ext::tt::macro_parser::parse (vec.rs:1105)
==16425==    by 0x8812E66: syntax::tokenstream::TokenTree::parse (tokenstream.rs:230)

The “deaths” value here indicates the total number of calls to malloc for each call stack, which is usually the metric of most interest. The “acc-ratios” value can also be interesting, especially if the “rd” value is 0.00, because that indicates the allocated blocks are never read. (See below for examples of problems that I found this way.)

For both profilers I also pipe $OUTFILE through eddyb’s rustfilt.sh script which demangles ugly Rust symbols like this:

_$LT$serialize..opaque..Decoder$LT$$u27$a$GT$$u20$as$u20$serialize..serialize..Decoder$GT$::read_usize::h87863ec7f9234810

to something much nicer, like this:

<serialize::opaque::Decoder<'a> as serialize::serialize::Decoder>::read_usize

[Update: native support for Rust demangling recently landed in Valgrind’s repo. I use a trunk version of Valgrind so I no longer need to use rustfilt.sh in combination with Valgrind.]

For programs that use Cargo, sometimes it’s useful to know the exact rustc invocations that Cargo uses. Find out with either of these commands:

RUSTC=$RUSTC01 cargo build -v
RUSTC=$RUSTC01 cargo rustc -v

I also have done a decent amount of ad hoc println profiling, where I insert println! calls in hot parts of the code and then I use a script to post-process them. This can be very useful when I want to know exactly how many times particular code paths are hit.
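
To make the idea concrete, here is a minimal sketch of that kind of ad hoc instrumentation (the function and the PROFILE label are hypothetical, not actual rustc code); the emitted lines can then be counted with a trivial post-processing step such as sort | uniq -c | sort -rn:

// Hypothetical hot function instrumented by hand; not actual rustc code.
fn intern_symbol(name: &str) -> usize {
    // One easily greppable line per call; a post-processing script then
    // shows exactly how often each code path (and input) is hit.
    println!("PROFILE intern_symbol {}", name);
    name.len() // placeholder for the real work
}

fn main() {
    for name in &["foo", "bar", "foo", "foo"] {
        intern_symbol(name);
    }
}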

I’ve also tried perf. It works, but I’ve never established much of a rapport with it. YMMV. In general, any profiler that works with C or C++ code should also work with Rust code.

Finding suitable benchmarks

Once you know how you’re going to profile you need some good workloads. You could use the compiler itself, but it’s big and complicated and reasoning about the various stages can be confusing, so I have avoided that myself.

Instead, I have focused entirely on rustc-benchmarks, a pre-existing rustc benchmark suite. It contains 13 benchmarks of various sizes. It has been used to track rustc’s performance at perf.rust-lang.org for some time, but it wasn’t easy to use locally until I wrote a script for that purpose. I invoke it something like this:

./compare.py \
  /home/njn/moz/rust0/x86_64-unknown-linux-gnu/stage1/bin/rustc \
  /home/njn/moz/rust1/x86_64-unknown-linux-gnu/stage1/bin/rustc

It compares the two given compilers, doing debug builds, on the benchmarks. See the next section for example output. If you want to run a subset of the benchmarks you can specify them as additional arguments.

Each benchmark in rustc-benchmarks has a makefile with three targets. See the README for details on these targets, which can be helpful.

Wins

Here are the results if I compare the following two versions of rustc with compare.py.

  • The commit just before my first commit (on September 12).
  • A commit from October 13.
futures-rs-test  5.028s vs  4.433s --> 1.134x faster (variance: 1.020x, 1.030x)
helloworld       0.283s vs  0.235s --> 1.202x faster (variance: 1.012x, 1.025x)
html5ever-2016-  6.293s vs  5.652s --> 1.113x faster (variance: 1.011x, 1.008x)
hyper.0.5.0      6.182s vs  5.039s --> 1.227x faster (variance: 1.002x, 1.018x)
inflate-0.1.0    5.168s vs  4.935s --> 1.047x faster (variance: 1.001x, 1.002x)
issue-32062-equ  0.457s vs  0.347s --> 1.316x faster (variance: 1.010x, 1.007x)
issue-32278-big  2.046s vs  1.706s --> 1.199x faster (variance: 1.003x, 1.007x)
jld-day15-parse  1.793s vs  1.538s --> 1.166x faster (variance: 1.059x, 1.020x)
piston-image-0. 13.871s vs 11.885s --> 1.167x faster (variance: 1.005x, 1.005x)
regex.0.1.30     2.937s vs  2.516s --> 1.167x faster (variance: 1.010x, 1.002x)
rust-encoding-0  2.414s vs  2.078s --> 1.162x faster (variance: 1.006x, 1.005x)
syntex-0.42.2   36.526s vs 32.373s --> 1.128x faster (variance: 1.003x, 1.004x)
syntex-0.42.2-i 21.500s vs 17.916s --> 1.200x faster (variance: 1.007x, 1.013x)

Not all of the improvement is due to my changes, but I have managed a few nice wins, including the following.

#36592: There is an arena allocator called TypedArena. rustc creates many of these, mostly short-lived. On creation, each arena would allocate a 4096 byte chunk, in preparation for the first arena allocation request. But DHAT’s output showed me that the vast majority of arenas never received such a request! So I made TypedArena lazy — the first chunk is now only allocated when necessary. This reduced the number of calls to malloc greatly, which sped up compilation of several rustc-benchmarks by 2–6%.
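
The shape of that change is roughly the following sketch (a simplified, hypothetical stand-in, not the real TypedArena code, which handles typed slots, chunk growth and Drop): the first chunk is only allocated on the first allocation request.

// Simplified sketch of the "lazy first chunk" idea; not the real TypedArena.
struct LazyArena {
    chunk: Option<Vec<u8>>, // no 4096-byte allocation until first use
}

impl LazyArena {
    fn new() -> LazyArena {
        LazyArena { chunk: None } // creating an arena no longer calls malloc
    }

    fn alloc(&mut self, bytes: &[u8]) {
        // Only the first real allocation request pays for the chunk.
        let chunk = self.chunk.get_or_insert_with(|| Vec::with_capacity(4096));
        chunk.extend_from_slice(bytes);
    }
}

fn main() {
    let mut arena = LazyArena::new(); // cheap, even if it is never used
    arena.alloc(b"hello");
}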

#36734: This one was similar. Rust’s HashMap implementation is lazy — it doesn’t allocate any memory for elements until the first one is inserted. This is a good thing because it’s surprisingly common in large programs to create HashMaps that are never used. However, Rust’s HashSet implementation (which is just a layer on top of the HashMap) didn’t have this property, and guess what? rustc also creates large numbers of HashSets that are never used. (Again, DHAT’s output made this obvious.) So I fixed that, which sped up compilation of several rustc-benchmarks by 1–4%. Even better, because this change is to Rust’s stdlib, rather than rustc itself, it will speed up any program that creates HashSets without using them.

#36917: This one involved avoiding some useless data structure manipulation when a particular table was empty. Again, DHAT pointed out a table that was created but never read, which was the clue I needed to identify this improvement. This sped up two benchmarks by 16% and a couple of others by 3–5%.

#37064: This one changed a hot function in serialization code to return a Cow<str> instead of a String, which avoided a lot of allocations.
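
The pattern is easy to show in isolation. Here is a minimal, hypothetical example (not the actual rustc function) of returning Cow<str> so that the common case borrows the input and only the rare case pays for an allocation:

use std::borrow::Cow;

// Return a borrowed &str when no transformation is needed, and only build
// an owned String on the path that actually has to modify the input.
fn normalize(input: &str) -> Cow<'_, str> {
    if input.contains('\\') {
        Cow::Owned(input.replace('\\', "/")) // allocates only on this path
    } else {
        Cow::Borrowed(input) // common case: no heap allocation at all
    }
}

fn main() {
    assert!(matches!(normalize("no escapes here"), Cow::Borrowed(_)));
    assert!(matches!(normalize("a\\b"), Cow::Owned(_)));
}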

Future work

Profiles indicate that the following parts of the compiler account for a lot of its runtime.

  • malloc and free are still the two hottest functions in most benchmarks. Avoiding heap allocations can be a win.
  • Compression is used for crate metadata and LLVM bitcode. (This shows up in profiles under a function called tdefl_compress.)  There is an issue open about this.
  • Hash table operations are hot. A lot of this comes from the interning of various values during type checking; see the CtxtInterners type for details. (A simplified sketch of what interning involves follows this list.)
  • Crate metadata decoding is also costly.
  • LLVM execution is a big chunk, especially when doing optimized builds. So far I have treated LLVM as a black box and haven’t tried to change it, at least partly because I don’t know how to build it with debug info, which is necessary to get source files and line numbers in profiles. [Update: there is a new --enable-llvm-release-debuginfo configure option that causes LLVM to be built with debug info.]
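
Returning to the hash table point above: to illustrate why interning leans so heavily on hashing, here is a deliberately simplified interner sketch (the Ty type is a hypothetical stand-in; the real CtxtInterners interns arena-allocated types and is considerably more sophisticated). Every intern call hashes the whole value, whether or not it has been seen before, which is why the hash function itself matters so much.

use std::collections::HashMap;

// Hypothetical, simplified interner; not rustc's actual CtxtInterners code.
#[derive(Clone, PartialEq, Eq, Hash)]
struct Ty(String); // stand-in for a real type representation

#[derive(Default)]
struct Interner {
    map: HashMap<Ty, usize>,
    values: Vec<Ty>,
}

impl Interner {
    fn intern(&mut self, ty: Ty) -> usize {
        if let Some(&idx) = self.map.get(&ty) {
            return idx; // already interned: a hash lookup, no new storage
        }
        let idx = self.values.len();
        self.values.push(ty.clone());
        self.map.insert(ty, idx);
        idx
    }
}

fn main() {
    let mut interner = Interner::default();
    let a = interner.intern(Ty("u32".to_string()));
    let b = interner.intern(Ty("u32".to_string()));
    assert_eq!(a, b); // the second call still hashes the whole value
}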

A lot of programs have broadly similar profiles, but occasionally you get an odd one that stresses a different part of the compiler. For example, in rustc-benchmarks, inflate-0.1.0 is dominated by operations involving the (delightfully named) ObligationsForest (see #36993), and html5ever-2016-08-25 is dominated by what I think is macro processing. So it’s worth profiling the compiler on new codebases.

Caveat lector

I’m still a newcomer to Rust development. Although I’ve had lots of help on the #rustc IRC channel — big thanks to eddyb and simulacrum in particular — there may be things I am doing wrong or sub-optimally. Nonetheless, I hope this is a useful starting point for newcomers who want to speed up the Rust compiler.

Categories: Mozilla-nl planet

Nicholas Nethercote: How to speed up the Rust compiler some more

Wed, 23/11/2016 - 06:00

I recently wrote about some work I’ve done to speed up the Rust compiler. Since then I’ve done some more.

Heap allocations

My last post mentioned how heap allocations were frequent within rustc. This led to some wild speculation in some venues: about whether Rust should use heap allocation at all, about whether garbage collection was necessary, and so on. I have two comments about this.

The first is that rustc is a compiler, and like most compilers its execution is dominated by complex traversals of large tree structures: ASTs, IRs, etc. Tree structures typically require heap allocation. In particular, a lot of these tree structures contain Vec, HashMap and HashSet fields, all of which unavoidably use heap allocation.

The second is that although some heap allocation is unavoidable, the amount that rustc was doing was excessive. It is clear that nobody had made a concerted effort to minimize heap allocations, or at least not for a long time. With the help of DHAT I’ve been able to greatly reduce the amount done. Any time you throw a new profiler at any large codebase that hasn’t been heavily optimized, there’s a good chance you’ll be able to make some sizeable improvements.

Tools

As in my previous post, I focused almost entirely on the benchmarks present in rustc-benchmarks to guide my efforts.

Once again I mostly used Cachegrind and DHAT to profile these benchmarks. I also used Massif (plus the excellent massif-visualizer) to profile peak memory usage on one workload (see below).

I have also recently started using perf a bit more. Huon Wilson told me to add the --call-graph=dwarf flag to my perf record invocations. That improves things significantly, though I still find perf puzzling and frustrating at times, even after reading Brendan Gregg’s thorough examples page.

Wins

#37229: rustc spends a lot of time doing hash table lookups, enough that the cost of hashing is significant. In this PR I changed the hash function used by rustc (FNV) to one inspired by the hash function used within Firefox. The new function is faster because it can process an entire word at a time, rather than one byte at a time. The new hash function is slightly worse in terms of the number of collisions but the change sped up compilation of most workloads by 3–6%.
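
To give a feel for the difference, here is a rough sketch of a word-at-a-time hasher in the spirit of that change (an illustrative toy, not the exact function that landed in rustc):

use std::hash::Hasher;

// Toy word-at-a-time hasher; not the exact implementation that landed.
struct WordHasher {
    hash: u64,
}

impl Hasher for WordHasher {
    fn write(&mut self, bytes: &[u8]) {
        // Process 8 bytes per iteration instead of 1, trading a few extra
        // collisions for much less per-byte work.
        for chunk in bytes.chunks(8) {
            let mut word = [0u8; 8];
            word[..chunk.len()].copy_from_slice(chunk);
            let word = u64::from_le_bytes(word);
            self.hash = (self.hash.rotate_left(5) ^ word).wrapping_mul(0x517cc1b727220a95);
        }
    }

    fn finish(&self) -> u64 {
        self.hash
    }
}

fn main() {
    let mut h = WordHasher { hash: 0 };
    h.write(b"some key to hash");
    println!("{:x}", h.finish());
}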

#37161 & #37318 & #37373: These three PRs removed a lot of unnecessary heap allocations (mostly due to clone()) in the macro parser. They sped up the compilation of html5ever by 20%, 7% and 2%, respectively.

#37267 & #37298: rustc uses “deflate” compression in a couple of places: crate metadata and LLVM bitcode. In the metadata case rustc was often compressing metadata and then throwing away the result! (Thanks to eddyb for diagnosing this and explaining to me how to fix it.) Avoiding this unnecessary work sped up compilation of syntex-incr by 6% and several others by 1–2%.

For the LLVM bitcode case I tweaked the deflate settings so that it ran almost twice as fast while the compression ratio was only slightly worse. This sped up compilation of syntex-incr by another 8% and a couple of others by 1–2%. It’s possible that switching to a different compression algorithm would help some more, though that would be a much larger change.

#37108: This PR avoided interning values of the Substs type in cases where it was easy to tell that the value had previously been interned. It sped up compilation of several benchmarks by 1–4%.

#37705: This PR avoided some unnecessary calls to mk_ty. It sped up compilation of one benchmark by 5% and a few others by 1–2%.

#37083: This PR inlined various methods involved with uleb128 decoding, which is used when reading crate metadata. This sped up compilation of several benchmarks by 1%.

#36973: This PR fixed things so that a data structure that is only required for incremental compilation is not touched during non-incremental compilation. This sped up compilation of several non-incremental benchmarks by 1%.

#37445 & #37764: One unusual workload was found to make rustc consume excessive amounts of memory (4.5 GiB!) which made it OOM on some machines. I used Massif to identify ways to improve this.

The first PR reduced the size of the Expr enum by shrinking the outsized InlineAsm variant, which reduced peak memory usage by 9%. The second PR removed scope_auxiliary, a data structure used only during MIR dumping (and of marginal utility even then). This reduced peak memory usage by another 10%.
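
The general technique of shrinking an outsized, rarely-used variant, typically by boxing its payload so the whole enum gets smaller, is easy to demonstrate in isolation. The types below are hypothetical stand-ins, not the real Expr or InlineAsm:

use std::mem::size_of;

// Sketch of the "shrink the outsized variant" idea; hypothetical types only.
#[allow(dead_code)]
struct BigPayload([u64; 16]); // imagine a rarely-used, 128-byte variant

#[allow(dead_code)]
enum ExprInline {
    Small(u32),
    Asm(BigPayload), // every ExprInline pays for the largest variant
}

#[allow(dead_code)]
enum ExprBoxed {
    Small(u32),
    Asm(Box<BigPayload>), // indirection keeps the common case small
}

fn main() {
    println!("inline: {} bytes", size_of::<ExprInline>()); // ~136 bytes
    println!("boxed:  {} bytes", size_of::<ExprBoxed>());  // ~16 bytes
}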

I have filed a PR to add a cut-down version of this workload to rustc-benchmarks.

#36993: This PR did some manual inlining and restructuring of several ObligationForest functions. It sped up compilation of inflate by 2%.

#37642: This PR reduced some excessive indirection present in the representation of HIR. It sped up compilation of some benchmarks by 1, 2 and 4%.

#37427: rustc uses Blake2b hashing to determine when a function’s code has changed. This PR reduced the number of bytes to be hashed by (a) avoiding hashing filenames twice for each span, and (b) pre-uleb128-encoding 32-bit and 64-bit integers, which are usually small. This sped up compilation of syntex-incr by 2%.
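
For reference, ULEB128 encodes an integer seven bits at a time, so small values occupy a single byte rather than four or eight. A minimal encoder looks like this (a generic sketch of the format, not the rustc implementation):

// Generic ULEB128 encoder: low 7 bits per byte, high bit set while more follow.
fn write_uleb128(buf: &mut Vec<u8>, mut value: u64) {
    loop {
        let mut byte = (value & 0x7f) as u8;
        value >>= 7;
        if value != 0 {
            byte |= 0x80; // more bytes follow
        }
        buf.push(byte);
        if value == 0 {
            return;
        }
    }
}

fn main() {
    let mut buf = Vec::new();
    write_uleb128(&mut buf, 3);   // 1 byte instead of 8
    write_uleb128(&mut buf, 300); // 2 bytes
    println!("{:?}", buf); // [3, 172, 2]
}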

Future work

As the compiler front-end gets faster, the proportion of rustc execution time spent in the LLVM back-end increases. For some benchmarks it now exceeds 50% (when doing debug builds), though there is still plenty of variation. Thanks to mrhota there is a new --enable-llvm-release-debuginfo configure option that (unsurprisingly) enables debuginfo within LLVM. This means that profilers can now give filenames and line numbers for LLVM. I’ve looked at a few places in LLVM that show up high in profiles, though I haven’t yet managed to make any useful changes to it.

Another interesting new development is pnkfelix’s -Zprint-type-sizes option, which should land soon, and will potentially be useful for any program written in Rust, not just rustc. This option will make it trivial to see how each type is laid out in memory, which will make it easy to see how types can be rearranged to be smaller. Up until now this has been a painful and imprecise exercise.

Finally, I want to encourage anyone else with the slightest knack and/or enthusiasm for optimizing code to take a look at rustc. You might be thinking that there isn’t that much low-hanging fruit left, but I am confident there is. It’s getting harder for me to find things to improve, but I am just one person with a particular background and a few preferred profiling tools. I am confident that other people with different backgrounds and tools will find plenty of stuff that I cannot. Rust compile speed still isn’t great, even after these improvements, but the more people who pitch in, the faster it will improve. Don’t be shy, and if you have any questions please contact me or ask in the #rust or #rustc IRC channels.

Categories: Mozilla-nl planet

Mitchell Baker: Expanding Mozilla’s Boards

Wed, 23/11/2016 - 02:15

This post was originally published on the Mozilla Blog.

Watch the presentation of Mozilla’s Boards expansion on AirMozilla

In a post earlier this month, I mentioned the importance of building a network of people who can help us identify and recruit potential Board level contributors and senior advisors. We are also currently working to expand both the Mozilla Foundation and Mozilla Corporation Boards.

The role of a Mozilla Board member

I’ve written a few posts about the role of the Board of Directors at Mozilla.

At Mozilla, we invite our Board members to be more involved with management, employees and volunteers than is generally the case. It’s not that common for Board members to have unstructured contacts with individuals or even sometimes the management team. The conventional thinking is that these types of relationships make it hard for the CEO to do his or her job. We feel differently. We have open flows of information in multiple channels. Part of building the world we want is to have built transparency and shared understandings.

We also prefer a reasonably extended “get to know each other” period for our Board members. Sometimes I hear people speak poorly of extended process, but I feel it’s very important for Mozilla.  Mozilla is an unusual organization. We’re a technology powerhouse with a broad Internet openness and empowerment mission at its core. We feel like a product organization to those from the nonprofit world; we feel like a non-profit organization to those from the Internet industry.

It’s important that our Board members understand the full breadth of Mozilla’s mission. It’s important that Mozilla Foundation Board members understand why we build consumer products, why it happens in the subsidiary and why they cannot micro-manage this work. It is equally important that Mozilla Corporation Board members understand why we engage in the open Internet activities of the Mozilla Foundation and why we seek to develop complementary programs and shared goals.

I want all our Board members to understand that “empowering people” encompasses “user communities” but is much broader for Mozilla. Mozilla should be a resource for the set of people who care about the open Internet. We want people to look to Mozilla because we are such an excellent resource for openness online, not because we hope to “leverage our community” to do something that benefits us.

These sorts of distinctions can be rather abstract in practice, so knowing someone well enough to be comfortable about them takes a while. We have a couple of ways of doing this. First, we have extensive discussions with a wide range of people. Board candidates will meet the existing Board members, members of the management team, individual contributors and volunteers. We’ve been piloting ways to work with potential Board candidates in some way. We’ve done that with Cathy Davidson, Ronaldo Lemos, Katharina Borchert and Karim Lakhani. We’re not sure we’ll be able to do it with everyone, and we don’t see it as a requirement. We do see this as a good way to get to know how someone thinks and works within the framework of the Mozilla mission. It helps us feel comfortable including someone at this senior level of stewardship.

What does a Mozilla Board member look like

Job descriptions often get long and wordy. We have those too but, for the search of new Board members, we’ve tried something else this time: a visual role description.

Board member job description for Mozilla Foundation

Board member job description for Mozilla Corporation

Here is a short explanation of how to read these visuals:

  • The horizontal lines speak to things that every Board member should have. For instance, to be a Board member, you have to care about the mission and you have to have some cultural sense of Mozilla, etc. They are a set of things that are important for each and every candidate. In addition, there is a set of things that are important for the Board as a whole. For instance, we could put international experience in there or whether the candidate is a public spokesperson. We want some of that but it is not necessary that every Board member has that.
  • In the vertical green columns, we have the particular skills and expertise that we are looking for at this point.
  • We would expect the horizontal lines not to change too much over time and the vertical lines to change depending on who joins the Board and who leaves.

I invite you to look at these documents and provide input on them. If you have candidates that you believe would be good Board members, send them to the boarddevelopment@mozilla.com mailing list. We will use real discretion with the names you send us.

We’ll also be designing a process for how to broaden participation in the process beyond other Board members. We want to take advantage of the awareness and the cluefulness of the organization. That will be part of a future update.

Update August 2, 2016

Both the Mozilla Foundation and Mozilla Corporation Board Candidate Profiles have been updated. Cultural Fit has been updated to ‘Values Match’.

MoCo Board Candidate Profile

MoFo Board Candidate Profile

 

Categories: Mozilla-nl planet

Air Mozilla: Mozilla IOT Meetup Nov. 22, 2016

Tue, 22/11/2016 - 20:00

Mozilla IOT Meetup Nov. 22, 2016: a monthly meetup in London for IOT and connected devices enthusiasts.

Categories: Mozilla-nl planet

Matjaž Horvat: Set up your own Pontoon instance in 5 minutes

Tue, 22/11/2016 - 14:23

Heroku is a cloud Platform-as-a-Service (PaaS) that we have been using at Pontoon for over a year and a half now. Despite some specifics that are not particularly suitable for our use case, it has proved to be a very reliable and easy-to-use deployment model.

Thanks to the amazing Jarek, you can now freely deploy your own Pontoon instance in just a few simple steps, without leaving the web browser and with very little configuration.

To start the setup, click a Deploy to Heroku button in your fork, upstream repository or this blog post. You will need to log in to Heroku or create an account first.

Deploy to Heroku

Next, you’ll be presented with the configuration page. All settings are optional, so you can simply scroll to the bottom of the page and click Deploy.

Still, I suggest you set the App Name for an easy-to-remember URL, and the Admin email & password, which are required for logging in (instead of Firefox Accounts, a custom Heroku deployment uses a conventional login form).

When setup completes, you’re ready to View your personal Pontoon instance in your browser or Manage App in the Heroku Dashboard.

This method is also pretty convenient to quickly test or demonstrate any Pontoon improvements you might want to provide – without setting up the development environment locally. Simply click Deploy to Heroku from the README file in your fork after you have pushed the changes.

If you’re searching for inspiration on what to hack on, we have some ideas!

Categories: Mozilla-nl planet

Mozilla Addons Blog: webextensions-examples and Hacktoberfest

Tue, 22/11/2016 - 12:44

Hacktoberfest is an event organized by DigitalOcean in partnership with GitHub. It encourages contributions to open source projects during the month of October. This year the webextensions-examples project participated.

“webextensions-examples” is a collection of simple but complete and installable WebExtensions, that demonstrate how to use the APIs and provide a starting point for people writing their own WebExtensions.

We had a great response: contributions from 8 new volunteers in October. Contributions included 4 brand-new complete examples:

So thanks to DigitalOcean, to the add-ons team for helping me review PRs, and most of all, to our new contributors:

Categories: Mozilla-nl planet

Mozilla Release Management Team: Planned 50.1.0 release

Tue, 22/11/2016 - 09:30

Firefox releases (based on the train release model) go live every 6 weeks. However, this December, instead of pushing a new mainline release, we plan to push a dot release, 50.1.0, in mid-December. This will be a limited-scope release and will only include fixes for recent severe regressions, crashes and security issues. The motivation behind this dot release is to minimize code churn and limit unexpected disruption to release end-users during December due to a Mozilla internal event and the holiday season.

50.1.0 release is planned to be pushed to release end-users on December 13th 2016. This dot release will be supported until January 24th, 2017 after which Fx51 release is planned to go live.

The ESR release schedule remains unchanged. ESR45.6.0 is planned to go live on December 13th 2016 and ESR45.7.0 will go live on January 24th 2017.

Cheers, Your friendly Mozilla Release Management Team!

Categories: Mozilla-nl planet

This Week In Rust: This Week in Rust 157

Tue, 22/11/2016 - 06:00

Hello and welcome to another issue of This Week in Rust! Rust is a systems language pursuing the trifecta: safety, concurrency, and speed. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

Blog Posts

News & Project Updates

Other Weeklies from Rust Community

Crate of the Week

This week's Crate of the Week is cargo-benchcmp. cargo-benchcmp generates nice before-after summaries for benchmarks.

Thanks to bluss for this week's suggestion. Submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available; visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from Rust Core

105 pull requests were merged in the last week.

New Contributors
  • Brett Cooley
  • John Downey
  • jsen-
  • Robert Vally
  • Steve Smith
Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now. This week's FCPs are:

New RFCs

Style RFCs

Style RFCs are part of the process for deciding on style guidelines for the Rust community and defaults for Rustfmt. The process is similar to the RFC process, but we try to reach rough consensus on issues (including a final comment period) before progressing to PRs. Just like the RFC process, all users are welcome to comment and submit RFCs. If you want to help decide what Rust code should look like, come get involved!

PRs:

Ready for PR:

Final comment period:

Other notable issues:

Upcoming Events

If you are running a Rust event please add it to the calendar to get it mentioned here. Email the Rust Community Team for access.

fn work(on: RustProject) -> Money

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

Rust iterators are the best thing since [Bread]

Yaniel on Rust (lang) Matrix channel.

Thanks to Elahn for the suggestion.

Submit your quotes for next week!

This Week in Rust is edited by: nasa42, llogiq, and brson.

Categories: Mozilla-nl planet

Air Mozilla: Mozilla Weekly Project Meeting, 21 Nov 2016

Mon, 21/11/2016 - 20:00

Mozilla Weekly Project Meeting: the Monday Project Meeting.

Categories: Mozilla-nl planet

Tarek Ziadé: Smoke testing Swagger-based Web Services

Mon, 21/11/2016 - 15:11

Swagger has come a long way. The project got renamed ("Open API") and it seems to have a vibrant community.

If you are not familiar with it, it’s a specification for describing your HTTP endpoints (spec here) that has been around for a few years now ~ and it seems to be getting really mature at this point.

I was surprised to find out how many tools are available now. The dedicated page has a serious list of tools.

There's even a Flask-based framework used by Zalando to build microservices.

Using Swagger makes a lot of sense for improving the discoverability and documentation of JSON web services. But in my experience, with these kinds of specs it’s always the same issue: unless they provide a real advantage for developers, they are not maintained and are eventually removed from projects.

So that's what I am experimenting on right now.

One use case that interests me the most is to see whether we can automate part of the testing we're doing at Mozilla on our web services by using Swagger specs.

That's why I've started to introduce Swagger on a handful of projects, so we can experiment on tools around that spec.

One project I am experimenting on is called Smwogger, a silly contraction of Swagger and Smoke (I am a specialist in stupid project names.)

The plan is to see if we can fully automate smoke tests against our APIs. That is, a very simple test scenario against a deployment, to make sure everything looks OK.

In order to do this, I have added an extension to the spec, called x-smoke-test, where developers can describe a simple scenario to test the API with a couple of assertions. There are a couple of tools like that already, but I wanted to see whether we could have one that could be 100% based on the spec file and not require any extra coding.

Since every endpoint has an operation identifier, it's easy enough to describe it and have a script (==Smwogger) that plays it.

Here's my first shot at it... The project is at https://github.com/tarekziade/smwogger and below is an extract from its README

Running Smwogger

To add a smoke test for you API, add an x-smoke-test section in your YAML or JSON file, describing your smoke test scenario.

Then, you can run the test by pointing the Swagger spec URL (or path to a file):

$ bin/smwogger smwogger/tests/shavar.yaml
Scanning spec... OK
This is project 'Shavar Service'
Mozilla's implementation of the Safe Browsing protocol
Version 0.7.0

Running Scenario
1:getHeartbeat... OK
2:getDownloads... OK
3:getDownloads... OK

If you need to get details about the requests and responses sent, you can use the -v option:

$ bin/smwogger -v smwogger/tests/shavar.yaml
Scanning spec... OK
This is project 'Shavar Service'
Mozilla's implementation of the Safe Browsing protocol
Version 0.7.0

Running Scenario
1:getHeartbeat...
GET https://shavar.somwehere.com/__heartbeat__
>>>
HTTP/1.1 200 OK
Content-Type: text/plain; charset=UTF-8
Date: Mon, 21 Nov 2016 14:03:19 GMT
Content-Length: 2
Connection: keep-alive

OK
<<<
OK
2:getDownloads...
POST https://shavar.somwehere.com/downloads
Content-Length: 30

moztestpub-track-digest256;a:1
>>>
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Date: Mon, 21 Nov 2016 14:03:23 GMT
Content-Length: 118
Connection: keep-alive

n:3600
i:moztestpub-track-digest256
ad:1
u:tracking-protection.somwehere.com/moztestpub-track-digest256/1469223014

<<<
OK
3:getDownloads...
POST https://shavar.somwehere.com/downloads
Content-Length: 35

moztestpub-trackwhite-digest256;a:1
>>>
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Date: Mon, 21 Nov 2016 14:03:23 GMT
Content-Length: 128
Connection: keep-alive

n:3600
i:moztestpub-trackwhite-digest256
ad:1
u:tracking-protection.somwehere.com/moztestpub-trackwhite-digest256/1469551567

<<<
OK

Scenario

A scenario is described by providing a sequence of operations to perform, given their operationId.

For each operation, you can make some assertions on the response by providing values for the status code and some headers.

Example in YAML

x-smoke-test:
  scenario:
    - getSomething:
        response:
          status: 200
          headers:
            Content-Type: application/json
    - getSomethingElse:
        response:
          status: 200
    - getSomething:
        response:
          status: 200

If a response does not match, an assertion error will be raised.

Posting data

When you are posting data, you can provide the request body content in the operation under the request key.

Example in YAML

x-smoke-test:
  scenario:
    - postSomething:
        request:
          body: This is the body I am sending.
        response:
          status: 200

Replacing Path variables

If some of your paths are using template variables, as defined by the swagger spec, you can use the path option:

x-smoke-test:
  scenario:
    - postSomething:
        request:
          body: This is the body I am sending.
          path:
            var1: ok
            var2: blah
        response:
          status: 200

You can also define global path values that will be looked up when formatting paths. In that case, variables have to be defined in a top-level path section:

x-smoke-test:
  path:
    var1: ok
  scenario:
    - postSomething:
        request:
          body: This is the body I am sending.
          path:
            var2: blah
        response:
          status: 200

Variables

You can extract values from responses, in order to reuse them in subsequent operations, whether it's to replace variables in path templates or to create a body.

For example, if getSomething returns a JSON dict with a "foo" value, you can extract it by declaring it in a vars section inside the response key:

x-smoke-test:
  path:
    var1: ok
  scenario:
    - getSomething:
        request:
          body: This is the body I am sending.
          path:
            var2: blah
        response:
          status: 200
          vars:
            foo:
              query: foo
              default: baz

Smwogger will use the query value to know where to look in the response body and extract the value. If the value is not found and default is provided, the variable will take that value.

Once the variable is set, it will be reused by Smwogger for subsequent operations, to replace variables in path templates.

The path formatting is done automatically. Smwogger will look first at variables defined in operations, then at the path sections.

Conclusion

None for now. This is an ongoing experiment. But happy to get your feedback on github!

Categories: Mozilla-nl planet

Nick Desaulniers: Static and Dynamic Libraries

Mon, 21/11/2016 - 08:55

This is the second post in a series on memory segmentation. It covers working with static and dynamic libraries in Linux and OSX. Make sure to check out the first on object files and symbols.

Let’s say we wanted to reuse some of the code from our previous project in our next one. We could continue to copy around object files, but let’s say we have a bunch and it’s hard to keep track of all of them. Let’s combine multiple object files into an archive or static library. Similar to a more conventional zip file or “compressed archive,” our static library will be an uncompressed archive.

We can use the ar command to create and manipulate a static archive.

$ clang -c x.c y.c
$ ar -rv libhello.a x.o y.o

The -r flag will create the archive named libhello.a and add the files x.o and y.o to its index. I like to add the -v flag for verbose output. Then we can use the familiar nm tool I introduced in the previous post to examine the content of the archives and their symbols.

$ file libhello.a
libhello.a: current ar archive random library
$ nm libhello.a
libhello.a(x.o):
                 U _puts
0000000000000000 T _x

libhello.a(y.o):
                 U _puts
0000000000000000 T _y

Some other useful flags for ar are -d to delete an object file, ex. ar -d libhello.a y.o and -u to update existing members of the archive when their source and object files are updated. Not only can we run nm on our archive, otool and objdump both work.

Now that we have our static library, we can statically link it to our program and see the resulting symbols. The .a suffix is typical on both OSX and Linux for archive files.

$ clang main.o libhello.a
$ nm a.out
0000000100000f30 T _main
                 U _puts
0000000100000f50 T _x
0000000100000f70 T _y

Our compiler understands how to index into archive files and pull out the functions it needs to combine into the final executable. If we use a static library to statically link all functions required, we can have one binary with no dependencies. This can make deployment of binaries simple, but also greatly increase their size. Upgrading large binaries incrementally becomes more costly in terms of space.

While static libraries allowed us to reuse source code, static linkage does not allow us to reuse memory for executable code between different processes. I really want to put off talking about memory benefits until the next post, but know that the solution to this problem lies in “dynamic libraries.”

While having a single binary file keeps things simple, it can really hamper memory sharing and incremental relinking. For example, if you have multiple executables that are all built with the same static library, unless your OS is really smart about copy-on-write page sharing, you’re likely loading multiple copies of the same exact code into memory! What a waste! Also, when you want to rebuild or update your binary, you spend time performing relocation again and again with static libraries. What if we could set aside object files that we could share amongst multiple instances of the same or even different processes, and perform relocation at runtime?

The solution is known as dynamic libraries. If static libraries and static linkage were Atari controllers, dynamic libraries and dynamic linkage are Steel Battalion controllers. We’ll show how to work with them in the rest of this post, but I’ll prove how memory is saved in a later post.

Let’s say we want to create a shared version of libhello. Dynamic libraries typically have different suffixes per OS since each OS has its preferred object file format. On Linux the .so suffix is common, .dylib on OSX, and .dll on Windows.

$ clang -shared -fpic x.c y.c -o libhello.dylib
$ file libhello.dylib
libhello.dylib: Mach-O 64-bit dynamically linked shared library x86_64
$ nm libhello.dylib
                 U _puts
0000000000000f50 T _x
0000000000000f70 T _y

The -shared flag tells the linker to create a special file called a shared library. The -fpic option converts absolute addresses to relative addresses, which allows for different processes to load the library at different virtual addresses and share memory.

Now that we have our shared library, let’s dynamically link it into our executable.

$ clang main.c libhello.dylib
$ ./a.out
x
y

The dynamic linker essentially produces an incomplete binary. You can verify with nm. At runtime, we’ll delay start up to perform some memory mapping early on in the process start (performed by the dynamic linker) and pay slight costs for trampolining into position independent code.

Let’s say we want to know what dynamic libraries a binary is using. You can either query the executable (most executable object file formats contain a header that the dynamic linker will parse to find the libraries to pull in) or observe the executable while running it. Because each major OS has its own object file format, they each have their own tools for these two checks. Note that statically linked libraries won’t show up here, since their object code has already been linked in, and thus we’re not able to differentiate between object code that came from our first-party code vs third-party static libraries.

On OSX, we can use otool -L <bin> to check which .dylibs will get pulled in.

$ otool -L a.out
a.out:
	libhello.dylib (compatibility version 0.0.0, current version 0.0.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)

So we can see that a.out depends on libhello.dylib (and expects to find it in the same directory as a.out). It also depends on a shared library called libSystem.B.dylib. If you run otool -L on libSystem itself, you’ll see it depends on a bunch of other libraries, including a C runtime, malloc implementation, pthreads implementation, and more. If you want to find the final resting place of where a symbol is defined without digging with nm and otool, you can fire up your trusty debugger and ask it.

$ lldb a.out
...
(lldb) image lookup -r -s puts
...
        Summary: libsystem_c.dylib`puts
        Address: libsystem_c.dylib[0x0000000000085c30] (libsystem_c.dylib.__TEXT.__stubs + 3216)

You’ll see a lot of output since puts is treated as a regex. You’re looking for the Summary line that has an address and is not a symbol stub. You can then check your work with otool and nm.

If we want to observe the dynamic linker in action on OSX, we can use dtruss:

$ sudo dtruss ./a.out
...
stat64("libhello.dylib\0", 0x7FFF50CEAC68, 0x1)         = 0 0
open("libhello.dylib\0", 0x0, 0x0)                      = 3 0
...
mmap(0x10EF27000, 0x1000, 0x5, 0x12, 0x3, 0x0)          = 0x10EF27000 0
mmap(0x10EF28000, 0x1000, 0x3, 0x12, 0x3, 0x1000)       = 0x10EF28000 0
mmap(0x10EF29000, 0xC0, 0x1, 0x12, 0x3, 0x2000)         = 0x10EF29000 0
...
close(0x3)                                              = 0 0
...

On Linux, we can simply use ldd or readelf -d to query an executable for a list of its dynamic libraries.

$ clang -shared -fpic x.c y.c -o libhello.so
$ clang main.c libhello.so
$ ldd a.out
	linux-vdso.so.1 =>  (0x00007fff95d43000)
	libhello.so => not found
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fcc98c5f000)
	/lib64/ld-linux-x86-64.so.2 (0x0000555993852000)
$ readelf -d a.out

Dynamic section at offset 0xe18 contains 25 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libhello.so]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
...

We can then use strace to observe the dynamic linker in action on Linux:

$ LD_LIBRARY_PATH=. strace ./a.out
...
open("./libhello.so", O_RDONLY|O_CLOEXEC) = 3
...
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260\5\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=8216, ...}) = 0
close(3)                                = 0
...

What’s this LD_LIBRARY_PATH thing? That’s shell syntax for setting an environment variable just for the duration of that command (as opposed to exporting it so it stays set for multiple commands). Unlike OSX’s dynamic linker, which was happy to look in the cwd for libhello.dylib, on Linux we must add the cwd to the search path if the dynamic library we want to link in is not in the standard search path.

But what is the standard search path? Well, there’s another environment variable we can set to see it, LD_DEBUG. For example, on Linux:

$ LD_DEBUG=libs LD_LIBRARY_PATH=. ./a.out
     15828: find library=libhello.so [0]; searching
     15828:  search path=./tls/x86_64:./tls:./x86_64:.  (LD_LIBRARY_PATH)
     15828:   trying file=./tls/x86_64/libhello.so
     15828:   trying file=./tls/libhello.so
     15828:   trying file=./x86_64/libhello.so
     15828:   trying file=./libhello.so
     15828:
     15828: find library=libc.so.6 [0]; searching
     15828:  search path=./tls/x86_64:./tls:./x86_64:.  (LD_LIBRARY_PATH)
     15828:   trying file=./tls/x86_64/libc.so.6
     15828:   trying file=./tls/libc.so.6
     15828:   trying file=./x86_64/libc.so.6
     15828:   trying file=./libc.so.6
     15828:  search cache=/etc/ld.so.cache
     15828:   trying file=/lib/x86_64-linux-gnu/libc.so.6
     15828: calling init: /lib/x86_64-linux-gnu/libc.so.6
     15828: calling init: ./libhello.so
     15828: initialize program: ./a.out
     15828: transferring control: ./a.out
x
y
     15828: calling fini: ./a.out [0]
     15828: calling fini: ./libhello.so [0]

LD_DEBUG is pretty useful. Try:

$ LD_DEBUG=help ./a.out
Valid options for the LD_DEBUG environment variable are:

  libs        display library search paths
  reloc       display relocation processing
  files       display progress for input file
  symbols     display symbol table processing
  bindings    display information about symbol binding
  versions    display version dependencies
  scopes      display scope information
  all         all previous options combined
  statistics  display relocation statistics
  unused      determined unused DSOs
  help        display this help message and exit

To direct the debugging output into a file instead
of standard output a filename can be specified using
the LD_DEBUG_OUTPUT environment variable.

For some cool stuff, I recommend checking out LD_DEBUG=symbols and LD_DEBUG=statistics.

Going back to LD_LIBRARY_PATH, usually libraries you create and want to reuse between projects go into /usr/local/lib and the headers into /usr/local/include. I think of the convention as:

$ tree -L 2 /usr/
/usr
├── bin      # system installed binaries like nm, gcc
├── include  # system installed headers like stdio.h
├── lib      # system installed libraries, both static and dynamic
└── local
    ├── bin      # user installed binaries like rustc
    ├── include  # user installed headers
    └── lib      # user installed libraries

Unfortunately, it’s a loose convention that has broken down over the years, and things end up scattered all over the place. You can also run into dependency and versioning issues (which I don’t want to get into here) by installing libraries there instead of keeping them in-tree or out-of-tree of a project’s source code. Just know that when you see a library like libc.so.6, the numeric suffix is a version number used to distinguish incompatible major revisions of the library. For more information, you should read Michael Kerrisk’s excellent book The Linux Programming Interface. This post is based on his chapters 41 & 42 (but with more info on tooling and OSX).

If we were to place our libhello.so into /usr/local/lib (on Linux you then need to run sudo ldconfig) and move x.h and y.h into /usr/local/include, then we could compile with:

$ clang main.c -lhello

Note that rather than give a full path to our library, we can use the -l flag followed by the name of our library with the lib prefix and .so suffix removed.
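
A sketch of that whole flow on Linux, using the libhello.so we built earlier (this assumes a typical distro where /usr/local/lib is already on the compiler’s and dynamic linker’s default search paths):

$ sudo cp libhello.so /usr/local/lib
$ sudo cp x.h y.h /usr/local/include
$ sudo ldconfig
$ clang main.c -lhello
$ ./a.out
x
y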

When working with shared libraries and external code, there are three flags I use pretty often:

* -l<libname to link, no lib prefix or file extension; ex: -lnanomsg to link libnanomsg.so>
* -L <path to search for the lib, if in a non standard directory>
* -I <path to the headers for that library, if in a non standard directory>
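
For example, if libhello were installed under some non-standard prefix, a hypothetical invocation might look like this (the /opt/hello paths are made up for illustration):

$ clang main.c -I/opt/hello/include -L/opt/hello/lib -lhello

Note that -L only affects link time; at runtime the dynamic linker still has to find the library through the mechanisms above (LD_LIBRARY_PATH, ldconfig’s cache, or an rpath baked into the binary).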

For finding the specific flags needed to compile and link against a given library, a tool called pkg-config can help. I’ve had less-than-stellar experiences with it, since it puts the onus on the library author to maintain the .pc files and on the user to have them installed in the place pkg-config looks. When they do exist and are installed properly, the tool works well:

$ sudo apt-get install libpng12-dev
$ pkg-config --libs --cflags libpng12
-I/usr/include/libpng12  -lpng12
$ clang program.c `!!`

Using another neat environment variable, we can hook into the dynamic linking process and inject our own shared libraries to be linked instead of the expected ones. Let’s say libgood.so and libmalicious.so both define a function with the same symbol name and signature. We can get a binary that was linked against libgood.so to call libmalicious.so’s version instead:

$ ./a.out
hello from libgood
$ LD_PRELOAD=./libmalicious.so ./a.out
hello from libmalicious
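
For concreteness, here’s one hypothetical way that setup could be put together (the file names and the greet function are my assumption, not from the original example):

$ cat libgood.c
#include <stdio.h>
void greet(void) { puts("hello from libgood"); }
$ cat libmalicious.c
#include <stdio.h>
void greet(void) { puts("hello from libmalicious"); }
$ cat prog.c
void greet(void);
int main(void) { greet(); return 0; }
$ clang -shared -fpic libgood.c -o libgood.so
$ clang -shared -fpic libmalicious.c -o libmalicious.so
$ clang prog.c libgood.so -o a.out
$ LD_LIBRARY_PATH=. ./a.out
hello from libgood
$ LD_LIBRARY_PATH=. LD_PRELOAD=./libmalicious.so ./a.out
hello from libmalicious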

By manually invoking the dynamic linker from our code, we can even man-in-the-middle library calls (call our hooked function first, then invoke the original target), as in the sketch below. We’ll see more of this in the next post on using the dynamic linker.
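
Here’s a hedged sketch of that interposition idea, building on the hypothetical greet example above: an LD_PRELOAD’ed library can use dlsym(RTLD_NEXT, ...) to find the definition it is shadowing and wrap it rather than replace it outright.

$ cat hook.c
/* _GNU_SOURCE is needed for RTLD_NEXT on glibc. */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

/* Same name and signature as the function we're intercepting. */
void greet(void) {
  /* Look up the next definition of greet in the search order,
   * i.e. the one that would have been called without this hook. */
  void (*real_greet)(void) = (void (*)(void))dlsym(RTLD_NEXT, "greet");
  puts("hook: before");
  if (real_greet)
    real_greet();
  puts("hook: after");
}
$ clang -shared -fpic hook.c -o libhook.so -ldl
$ LD_LIBRARY_PATH=. LD_PRELOAD=./libhook.so ./a.out
hook: before
hello from libgood
hook: after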

As you can guess, readjusting the search paths for dynamic libraries is a security concern, since it lets good and bad folks alike change the expected execution paths. Guarding against the use of these env vars becomes a rabbit hole that gets pretty tricky to escape without the heavy-handed use of statically linked dependencies.

In the previous post, I alluded to undefined symbols like puts. puts is part of libc, which is probably the most shared dynamic library on most computing devices, since almost every program makes use of the C runtime. (I think of a “runtime” as implicit code that runs in your program that you didn’t necessarily write yourself. Usually a runtime is provided as a library that gets implicitly linked into your executable when you compile.) You can statically link against libc with the -static flag, on Linux at least (OSX makes this difficult; “Apple does not support statically linked binaries on Mac OS X”).
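
A quick sketch of what that looks like on Linux (using a throwaway hello.c of my own; output abbreviated and distro-dependent):

$ cat hello.c
#include <stdio.h>
int main(void) { puts("hello"); return 0; }
$ clang -static hello.c -o hello
$ ldd hello
	not a dynamic executable
$ file hello
hello: ELF 64-bit LSB executable, x86-64, ... statically linked ...
$ ./hello
hello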

I’m not sure what the benefit would be of mixing static and dynamic linking, but when the linker resolves a -l flag it searches each directory in its library search path for a shared version of the library first, and if only a static version is found, that gets linked in instead.

There’s also an interesting form of library called a “virtual dynamic shared object” on Linux (the linux-vdso.so.1 in the ldd output above). I haven’t covered memory mapping yet, but know that it exists, that libc uses it behind the scenes, and that you can read more about it via man 7 vdso.

One thing I find interesting, and don’t fully understand how to recreate, is that glibc on Linux is somehow also executable (I sketch one possible way to approximate this after the output below):

$ /lib/x86_64-linux-gnu/libc.so.6
GNU C Library (Ubuntu GLIBC 2.24-3ubuntu1) stable release version 2.24, by Roland McGrath et al.
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 6.2.0 20161005.
Available extensions:
	crypt add-on version 2.1 by Michael Glad and others
	GNU Libidn by Simon Josefsson
	Native POSIX Threads Library by Ulrich Drepper et al
	BIND-8.2.3-T5B
libc ABIs: UNIQUE IFUNC
For bug reporting instructions, please see:
<https://bugs.launchpad.net/ubuntu/+source/glibc/+bugs>.
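
I haven’t verified every detail of how glibc itself does this, but a rough, hedged sketch of one way to approximate an “executable shared library” on x86-64 Linux with glibc (file and symbol names here are mine) is to give the library a .interp section naming the dynamic linker and point the ELF entry point at a function that never returns:

$ cat exelib.c
#include <unistd.h>

/* Executables name their interpreter (the dynamic linker) in a .interp
 * section; shared libraries normally lack one, so we supply it ourselves.
 * This path is the usual glibc location on x86-64 Linux. */
const char my_interp[] __attribute__((section(".interp"))) =
    "/lib64/ld-linux-x86-64.so.2";

void entry(void) {
  static const char msg[] = "this library is also a program\n";
  write(STDOUT_FILENO, msg, sizeof(msg) - 1);
  /* We were jumped to directly rather than called, so exit; don't return. */
  _exit(0);
}
$ clang -shared -fpic exelib.c -Wl,-e,entry -o libexelib.so
$ ./libexelib.so
this library is also a program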

Also, note that linking against third-party code has licensing implications (of course), of particular interest when the code is GPL or LGPL. Here is a good overview, which I’d (non-lawyerly) summarize as: statically linking against LGPL code imposes extra obligations (at minimum, letting users relink against their own build of the library), while any form of linkage against GPL code generally requires your code to be GPL’d as well.

Ok, that was a lot. In the previous post, we covered Object Files and Symbols. In this post we covered hacking around with static and dynamic linkage. In the next post, I hope to talk about manually invoking the dynamic linker at runtime.

Categorieën: Mozilla-nl planet

Jack Moffitt: Servo Interview on The Changelog

ma, 21/11/2016 - 01:00

The Changelog has just published an episode about Servo. It covers the motivations and goals of the project, some aspects of Servo performance and its use of the Rust language, and even has a bit about our wonderful community. If you're curious about why Servo exists, how we plan to ship it to real users, or what it was like to use Rust before it was stable, I recommend giving it a listen.

Categorieën: Mozilla-nl planet

Tantek Çelik: Happy Third Birthday to the Homebrew Website Club!

za, 19/11/2016 - 23:05

Three years ago (2013-324) we held the first Homebrew Website Club meetup at Mozilla San Francisco.

Participants in the first Homebrew Website Club meetup in San Francisco, California

In the tradition of the Homebrew Computer Club, I wrote up the Homebrew Website Club Newsletter Volume 1 Issue 1 which has been largely replaced by Kevin Marks's excellent live-tweeting and summary postings after each San Francisco meetup.

Since then, Homebrew Website Clubs have sprung up in over a dozen cities worldwide and continue regularly (fortnightly or monthly) in nine cities across four time zones and three countries, six of which started in 2016!

New Homebrew Website Club cities this year, along with their start dates and a subsequent photo from one of their meetups this year:

We have also seen a surge in cities with folks interested in starting up a Homebrew Website Club. Many existing cities started with just two people and grew slowly and steadily over time. All it takes is two individuals, committed to supporting each other in a fortnightly (or monthly) gathering, to share what they have done recently on their personal websites and what they aspire to create next.

Find your city on this wiki page and add yourself! Then hop in the #indieweb chat channel and say hi!

Pick a venue, talk about all things independent web, take a fun photo like the Nürnberg animated GIF above, or like this recent one in San Francisco, and post it on the wiki page for the event.

Homebrew Website Club San Francisco meetup participants

Remember to keep it fun as well as productive. Even if all you do is get together and finish writing a blog post and posting it on your indieweb site, that’s a good thing. Especially these days, the more people we can encourage to write authentic content and publish on their own sites, the better.

Previously: Congrats 2 years of HWC!

Categorieën: Mozilla-nl planet

Support.Mozilla.Org: SUMO Show & Tell: A Small Fox on a Big River

za, 19/11/2016 - 11:11

Hey there, SUMO Nation!

It is with great joy that I present to you a guest post by one of the most involved people I’ve met in my (still relatively short) time at SUMO. Seburo has been a core contributor in the community from his first day on board, and I can only hope we get to enjoy his presence among us in the future. He is one of those people whose photo could be put under “Mozillian” in the encyclopedias of this world… But, let’s not dawdle and move on to his words…

This is the outline of a presentation given at the London All-Hands as part of a session where the SUMO contributors had the opportunity to talk about the work that they have been involved in.

Towards the end of 2015, I noticed that we were getting an increasing number of requests on the SUMO Support Forum asking how users could put Firefox for Android (Fennec, the small fox of this tale) on their Amazon Kindle Fire tablet (the big river). At first it was just one or two questions, but the more I saw, the more I realised that there were some key facts driving them:

  • We know that users like using Firefox for Android on their mobile devices.  It enables them to use Firefox as their user agent on the web when they are away from their laptop or desktop.
  • We know that people like using Android, possibly the world's most popular operating system.  People recognise and take comfort from the little green Android logo when they see it alongside a device they wish to purchase, and they appreciate the ease of use and depth of support at a lower price point than competitors.
  • With a little research, it became clear why people have the Kindle Fire tablet – the price point.  In the UK at the time it was retailing for £60, almost half the price of a comparable device from the big-name brand leader.

What was confusing and confounding users was that, having purchased a device at a great price that uses an operating system they know, they could not find Firefox for Android in the Amazon app store.  SUMO does not support such a configuration, but with the number of questions coming through, I realised that there must be something we could do.

I started helping some users sideload Firefox for Android onto their devices, and through answering questions I soon found myself using a fairly standard text that users found solved their problem.  As I refined it, it made sense for this to be included within the SUMO Knowledge Base, the user-facing guide to using Mozilla software. But before I could do this, there was one key issue for which I needed the help of the truly amazing Firefox Mobile team…which version of Firefox for Android to use.

Whilst Firefox and Firefox Beta are seen as the best product versions, I was advised that they would only get updates through Google Play – not ideal if the user is on the Amazon variant of Android.  I was advised that Aurora was the version to use, as it would get important security updates, and I was pointed in the direction of the site it could be downloaded from.  In addition to this, the Firefox Mobile team helped shape some of the language I was going to use and helped check my draft article (I did say that they are amazing…!).

The article was uploaded to SUMO for approval and subsequently went live for users to see. The article has proven popular with users and I understand it has been picked up by some of the SUMO L10n teams, broadening its reach (it is even linked to from an MDN article).

Whilst there is no change to our support of the Kindle platform (the article carries a “health warning” to that effect), I think of this work as a great example of how several different teams, both staff and contributors, can come together to find a user-focussed solution.

Thank you for sharing your story, Seburo. An inspiring tale of initiating on a positive change that affects many users and makes Firefox more available as a result.

Do you have a story you would like to share with us? Let us know in the comments!

Categorieën: Mozilla-nl planet

Robert O'Callahan: Overcoming Stereotypes One Parent At A Time

za, 19/11/2016 - 10:19

I just got back from a children's sports club dinner where I hardly knew anyone and apparently I was seated with the other social leftovers. It turned out the woman next to me was very nice and we had a long conversation. She was excited to hear that I do computer science and software development, and mentioned that her daughter is starting university next year and strongly considering CS. I gave my standard pitch about why CS is a wonderful career path --- hope I didn't lay it on too thick. The daughter apparently is interested in computers and good at maths, and her teachers think she has a "logical mind", so that all sounded promising and I said so. But then the mother started talking about how that "logical mind" wasn't really a girly thing and asking whether the daughter might be better doing something softer like design. I pushed back and asked her not to make assumptions about what women and men might enjoy or be capable of, and mentioned a few of the women I've known who are extremely capable at hard-core CS. I pointed out that while CS isn't for everyone and I think people should try to find work they're passionate about, the demand and rewards are often greater for people in more technical roles.

This isn't the first time I've encountered mothers to a greater or lesser extent steering their daughters away from more technical roles. I've done a fair number of talks in high schools promoting CS careers, but at least for girls maybe targeting their parents somehow would also be worth doing.

I'll send this family some links to Playcanvas and other programming resources and hope that they, plus my sales pitch, will make a difference. One of the difficulties here is that you never know or find out whether what you did matters.

Categorieën: Mozilla-nl planet