Mozilla Nederland Logo De Nederlandse Mozilla gemeenschap


  • Warning: date_timezone_set() expects parameter 1 to be DateTime, boolean given in format_date() (regel 2014 van /home/mzilla2013/domains/
  • Warning: date_format() expects parameter 1 to be DateTime, boolean given in format_date() (regel 2024 van /home/mzilla2013/domains/
  • Warning: date_timezone_set() expects parameter 1 to be DateTime, boolean given in format_date() (regel 2014 van /home/mzilla2013/domains/
  • Warning: date_format() expects parameter 1 to be DateTime, boolean given in format_date() (regel 2024 van /home/mzilla2013/domains/
  • Warning: date_timezone_set() expects parameter 1 to be DateTime, boolean given in format_date() (regel 2014 van /home/mzilla2013/domains/
  • Warning: date_format() expects parameter 1 to be DateTime, boolean given in format_date() (regel 2024 van /home/mzilla2013/domains/
  • Warning: date_timezone_set() expects parameter 1 to be DateTime, boolean given in format_date() (regel 2014 van /home/mzilla2013/domains/
  • Warning: date_format() expects parameter 1 to be DateTime, boolean given in format_date() (regel 2024 van /home/mzilla2013/domains/
  • Warning: date_timezone_set() expects parameter 1 to be DateTime, boolean given in format_date() (regel 2014 van /home/mzilla2013/domains/
  • Warning: date_format() expects parameter 1 to be DateTime, boolean given in format_date() (regel 2024 van /home/mzilla2013/domains/
  • Warning: date_timezone_set() expects parameter 1 to be DateTime, boolean given in format_date() (regel 2014 van /home/mzilla2013/domains/
  • Warning: date_format() expects parameter 1 to be DateTime, boolean given in format_date() (regel 2024 van /home/mzilla2013/domains/
  • Warning: date_timezone_set() expects parameter 1 to be DateTime, boolean given in format_date() (regel 2014 van /home/mzilla2013/domains/
  • Warning: date_format() expects parameter 1 to be DateTime, boolean given in format_date() (regel 2024 van /home/mzilla2013/domains/
  • Warning: date_timezone_set() expects parameter 1 to be DateTime, boolean given in format_date() (regel 2014 van /home/mzilla2013/domains/
  • Warning: date_format() expects parameter 1 to be DateTime, boolean given in format_date() (regel 2024 van /home/mzilla2013/domains/
  • Warning: date_timezone_set() expects parameter 1 to be DateTime, boolean given in format_date() (regel 2014 van /home/mzilla2013/domains/
  • Warning: date_format() expects parameter 1 to be DateTime, boolean given in format_date() (regel 2024 van /home/mzilla2013/domains/
  • Warning: date_timezone_set() expects parameter 1 to be DateTime, boolean given in format_date() (regel 2014 van /home/mzilla2013/domains/
  • Warning: date_format() expects parameter 1 to be DateTime, boolean given in format_date() (regel 2024 van /home/mzilla2013/domains/

Andrew Halberstadt: Part 2: How to deal with IFFY requirements

Mozilla planet - vr, 11/04/2014 - 15:24
My last post was basically a very long winded way of saying, "we have a problem". It kind of did a little dance around "why is there a problem" and "how do we fix it", but I want to explore these two questions in a bit more detail. Specifically, I want to return to the two case studies and explore why our test harnesses don't work and why mozharness does work even though both have IFFY (in flux for years) requirements. Then I will explore how to use the lessons learned to improve our general test harness design. ### DRY is not everything I talked a lot about the [DRY principle][1] in the last article. Basically the conclusion about it was that it is very useful, but that we tend to fixate on it to the point where we ignore other equally useful principles. Having reached this conclusion, I did a quick internet search and found [an article][2] by Joel Abrahamsson arguing the exact same point (albeit much more succinctly than me). Through his article I found out about the [SOLID principles][3] of object oriented design (have I been living under a rock?). They are all very useful guidelines, but there are two that immediately made me think of our test harnesses in a bad way. The first is the [single responsibility principle][4] (which I was delighted to find is meant to mitigate requirement changes) and the second is the [open/closed principle][5]. The single responsibility principle states that a class should only be responsible for one thing, and responsibility for that thing should not be shared with other classes. What is a responsibility? A responsibility is defined as a *reason to change*. To use the wikipedia example, a class that prints a block of text can undergo two changes. The content of the text can change, or the format of the text can change. These are two different responsibilities that should be split out into different classes. The open/closed principle states that software should be open for extension, but closed for modification. In other words, it should be possible to change the behaviour of the software only by adding new code without needing to modify any existing code. A popular way of implementing this is through abstract base classes. Here the interface is closed for modification, and each new implementation is an extension of that. Our test harnesses fail miserably at both of these principles. Instead of having several classes each with a well defined responsibility, we have a single class responsible for everything. Instead of being able to add some functionality without worrying about breaking something else, you have to take great pains that your change won't affect some other platform you don't even care about! Mozharness on the other hand, while not perfect, does a much better job at both principles. The concept of actions makes it easy to extend functionality without modifying existing code. Just add a new action to the list! The core library is also much better separated by responsibility. There is a clear separation between general script, build, and testing related functionality. ### Inheritance is evil This is probably old news to many people, but this is something that I'm just starting to figure out on my own. I like Zed Shaw's [analogy from *Learn Python the Hard Way*][6] the best. Instead of butchering it, here it is in its entirety. > In the fairy tales about heroes defeating evil villains there's always a dark forest of some kind. > It could be a cave, a forest, another planet, just some place that everyone knows the hero > shouldn't go. Of course, shortly after the villain is introduced you find out, yes, the hero has > to go to that stupid forest to kill the bad guy. It seems the hero just keeps getting into > situations that require him to risk his life in this evil forest. > > You rarely read fairy tales about the heroes who are smart enough to just avoid the whole situation > entirely. You never hear a hero say, "Wait a minute, if I leave to make my fortunes on the high seas > leaving Buttercup behind I could die and then she'd have to marry some ugly prince named Humperdink. > Humperdink! I think I'll stay here and start a Farm Boy for Rent business." If he did that there'd > be no fire swamp, dying, reanimation, sword fights, giants, or any kind of story really. Because of > this, the forest in these stories seems to exist like a black hole that drags the hero in no matter > what they do. > > In object-oriented programming, Inheritance is the evil forest. Experienced programmers know to > avoid this evil because they know that deep inside the Dark Forest Inheritance is the Evil Queen > Multiple Inheritance. She likes to eat software and programmers with her massive complexity teeth, > chewing on the flesh of the fallen. But the forest is so powerful and so tempting that nearly every > programmer has to go into it, and try to make it out alive with the Evil Queen's head before they > can call themselves real programmers. You just can't resist the Inheritance Forest's pull, so you go > in. After the adventure you learn to just stay out of that stupid forest and bring an army if you > are ever forced to go in again. > > This is basically a funny way to say that I'm going to teach you something you should avoid called > Inheritance. Programmers who are currently in the forest battling the Queen will probably tell you > that you have to go in. They say this because they need your help since what they've created is > probably too much for them to handle. But you should always remember this: > > Most of the uses of inheritance can be simplified or replaced with composition, and multiple > inheritance should be avoided at all costs. I had never heard the (apparently popular) term "composition over inheritance". Basically, unless you really really mean it, always go for "X has a Y" instead of "X is a Y". Never do "X is a Y" for the sole purpose of avoiding code duplication. This is exactly the mistake we made in our test harnesses. The Android and B2G runners just inherited everything from the desktop runner, but oops, turns out all three are actually quite different from one another. Mozharness, while again not perfect, does a better job at avoiding inheritance. While it makes heavy use of the mixin pattern (which, yes, is still inheritance) at least it promotes separation of concerns more than classic inheritance. ### Practical Lessons So this is all well and great, but how can we apply all of this to our automation code base? A smarter way to approach our test harness design would have been to have most of the shared code between the three platforms in a single (relatively) bare-bones runner that *has a* target environment (e.g desktop Firefox, Fennec or B2G in this case). In this model there is no inheritance, and no code duplication. It is easy to extend without modifying (just add a new target environment) and there are clear and distinct responsibilities between managing tests/results and actually launching them. In fact this how the gaia team implemented their [marionette-js-runner][7]. I'm not sure if that pattern is common to node's [mocha runner][8] or something of their design, but I like it. I'd also like our test harnesses to employ mozharness' concept of actions. Each action could be an atomic as possible setup step. For example, setting preferences in the profile is a single action. Setting environment is another. Parsing a manifest could be a third. Each target environment would consist of a list of actions that are run in a particular order. If code needs to be shared, simply add the corresponding action to whichever targets need it. If not, just don't include the action in the list for targets that don't need it. My dream end state here is that there is no distinction between test runners and mozharness scripts. They are both trying to do the same thing (perform setup, launch some code, collect results) so why bother wrapping one around the other? The test harness should just *be* a mozharness script and vice versa. This would bring actions into test harnesses, and allow mozharness scripts to live in-tree. ### Conclusion Is it possible to avoid code duplication with a project that has IFFY requirements? I think yes. But I still maintain it is exceptionally hard. It wasn't until after it was too late and I had a lot of time to think about it that I realized the mistakes we made. And even with what I know now, I don't think I would have fared much better given the urgency and time constraints we were under. Though next time, I think I'll at least have a better chance. [1]: [2]: [3]: [4]: [5]: [6]: [7]: [8]:
Categorieën: Mozilla-nl planet

Henrik Skupin: Firefox Automation report – week 9/10 2014

Mozilla planet - vr, 11/04/2014 - 11:28

In this post you can find an overview about the work happened in the Firefox Automation team during week 9 and 10. I for myself was a week on vacation. A bit of relaxing before the work on the TPS test framework should get started.


In preparation to run Mozmill tests for Firefox Metro in our Mozmill-CI system, Andreea has started to get support for Metro builds and appropriate tests included.

With the help from Henrik we got Mozmill 2.0.6 released. It contains a helpful fix for waitForPageLoad(), which let you know about the page being loaded and its status in case of a failure. This might help us to nail down the intermittent failures when loading remote and even local pages. But the most important part of this release is indeed the support of mozcrash. Even that we cannot have a full support yet due to missing symbol files for daily builds on, we can at least show that a crash was happening during a testrun, and let the user know about the local minidump files.

Individual Updates

For more granular updates of each individual team member please visit our weekly team etherpad for week 9 and week 10.

Meeting Details

If you are interested in further details and discussions you might also want to have a look at the meeting agenda, the video recording, and notes from the Firefox Automation meetings of week 9 and week 10.

Categorieën: Mozilla-nl planet

Gervase Markham: 21st Century Nesting

Mozilla planet - vr, 11/04/2014 - 10:42

Our neighbours have acquired a 21st century bird’s nest:

Not only is it behind a satellite dish but, if you look closely, large parts of it are constructed from the wire ties that the builders (who are still working on our estate) use for tying layers of bricks together. We believe it belongs to a couple of magpies, and it contains six (low-tech) eggs.

I have no idea what effect this has on their reception…

Categorieën: Mozilla-nl planet

Chris Double: Preventing heartbleed bugs with safe programming languages

Mozilla planet - vr, 11/04/2014 - 06:00

The Heartbleed bug in OpenSSL has resulted in a fair amount of damage across the internet. The bug itself was quite simple and is a textbook case for why programming in unsafe languages like C can be problematic.

As an experiment to see if a safer systems programming language could have prevented the bug I tried rewriting the problematic function in the ATS programming language. I’ve written about ATS as a safer C before. This gives a real world testcase for it. I used the latest version of ATS, called ATS2.

ATS compiles to C code. The function interfaces it generates can exactly match existing C functions and be callable from C. I used this feature to replace the dtls1_process_heartbeat and tls1_process_heartbeat functions in OpnSSL with ATS versions. These two functions are the ones that were patched to correct the heartbleed bug.

The approach I took was to follow something similar to that outlined by John Skaller on the ATS mailing list:

ATS on the other hand is basically C with a better type system. You can write very low level C like code without a lot of the scary dependent typing stuff and then you will have code like C, that will crash if you make mistakes. If you use the high level typing stuff coding is a lot more work and requires more thinking, but you get much stronger assurances of program correctness, stronger than you can get in Ocaml or even Haskell, and you can even hope for *better* performance than C by elision of run time checks otherwise considered mandatory, due to proof of correctness from the type system. Expect over 50% of your code to be such proofs in critical software and probably 90% of your brain power to go into constructing them rather than just implementing the algorithm. It's a paradigm shift.

I started with wrapping the C code directly and calling that from ATS. From there I rewrote the C code into unsafe ATS. Once that worked I added types to find errors.

I’ve put the modified OpenSSl code in a github fork. The two branches there, ats and ats_safe, represent the latter two stages in implementing the functions in ATS.

I’ll give a quick overview of the different paths I took then go into some detail about how I used ATS to find the errors.

Wrapping C code

I’ve written about this approach before. ATS allows embedding C directly so the first start was to embed the dtls1_process_heartbeat C code in an ATS file, call that from an ATS function and expose that ATS function as the real dtls1_process_heartbeat. The code for this is in my first attempt of d1_both.dats.

Unsafe ATS

The second stage was to write the functions using ATS but unsafely. This code is a direct translation of the C code with no additional typechecking via ATS features. It uses usafe ATS code. The rewritten d1_both.dats contains this version.

The code is quite ugly but compiles and matches the C version. When installed on a test system it shows the heartbleed bug still. It uses all the pointer arithmetic and hard coded offsets as the C code. Here’s a snippet of one branch of the function:

val buffer = OPENSSL_malloc(1 + 2 + $UN.cast2int(payload) + padding) val bp = buffer val () = $UN.ptr0_set<uchar> (bp, TLS1_HB_RESPONSE) val bp = ptr0_succ<uchar> (bp) val bp = s2n (payload, bp) val () = unsafe_memcpy (bp, pl, payload) val bp = ptr_add (bp, payload) val () = RAND_pseudo_bytes (bp, padding) val r = dtls1_write_bytes (s, TLS1_RT_HEARTBEAT, buffer, 3 + $UN.cast2int(payload) + padding) val () = if r >=0 && ptr_isnot_null (get_msg_callback (s)) then call_msg_callback (get_msg_callback (s), 1, get_version (s), TLS1_RT_HEARTBEAT, buffer, $UN.cast2uint (3 + $UN.cast2int(payload) + padding), s, get_msg_callback_arg (s)) val () = OPENSSL_free (buffer)

It should be pretty easy to follow this comparing the code to the C version.

Safer ATS

The third stage was adding types to the unsafe ATS version to check that the pointer arithmetic is correct and no bounds errors occur. This version of d1_both.dats fails to compile if certain bounds checks aren’t asserted. If the assertloc at line 123, line 178 or line 193 is removed then a constraint error is produced. This error is effectively preventing the heartbleed bug.

Testable Vesion

The last stage I did was to implement the fix to the tls1_process_heartbeat function and factor out some of the helper routines so it could be shared. This is in the ats_safe branch which is where the newer changes are happening. This version removes the assertloc usage and changes to graceful failure so it could be tested on a live site.

I tested this version of OpenSSL and heartbleed test programs fail to dump memory.

The approach to safety

The tls_process_heartbeat function obtains a pointer to data provided by the sender and the amount of data sent from one of the OpenSSL internal structures. It expects the data to be in the following format:

byte = hbtype ushort = payload length byte[n] = bytes of length 'payload length' byte[16]= padding

The existing C code makes the mistake of trusting the ‘payload length’ the sender supplies and passes that to a memcpy. If the actual length of the data is less than the ‘payload length’ then random data from memory gets sent in the response.

In ATS pointers can be manipulated at will but they can’t be dereferenced or used unless there is a view in scope that proves what is located at that memory address. By passing around views, and subsets of views, it becomes possible to check that ATS code doesn’t access memory it shouldn’t. Views become like capabilities. You hand them out when you want code to have the capability to do things with the memory safely and take it back when it’s done.


To model what the C code does I created an ATS view that represents the layout of the data in memory:

dataview record_data_v (addr, int) = | {l:agz} {n:nat | n > 16 + 2 + 1} make_record_data_v (l, n) of (ptr l, size_t n) | record_data_v_fail (null, 0) of ()

A ‘view’ is like a standard ML datatype but exists at type checking time only. It is erased in the final version of the program so has no runtime overhead. This view has two constructors. The first is for data held at a memory address l of length n. The length is constrained to be greater than 16 + 2 + 1 which is the size of the ‘byte’, ‘ushort’ and ‘padding’ mentioned previously. By putting the constraint here we immediately force anyone creating this view to check the length they pass in. The second constructor, record_data_v_fail, is for the case of an invalid record buffer.

The function that creates this view looks like:

implement get_record (s) = val len = get_record_length (s) val data = get_record_data (s) in if len > 16 + 2 + 1 then (make_record_data_v (data, len) | data, len) else (record_data_v_fail () | null_ptr1 (), i2sz 0) end

Here the len and data are obtained from the SSL structure. The length is checked and the view is created and returned along with the pointer to the data and the length. If the length check is removed there is a compile error due to the constraint we placed earlier on make_record_data_v. Calling code looks like:

val (pf_data | p_data, data_len) = get_record (s)

p_data is a pointer. data_len is an unsigned value and pf_data is our view. In my code the pf_ suffix denotes a proof of some sort (in this case the view) and p_ denotes a pointer.

Proof functions

In ATS we can’t do anything at all with the p_data pointer other than increment, decrement and pass it around. To dereference it we must obtain a view proving what is at that memory address. To get speciailized views specific for the data we want I created some proof functions that convert the record_data_v view to views that provide access to memory. These are the proof functions:

(* These proof functions extract proofs out of the record_data_v to allow access to the data stored in the record. The constants for the size of the padding, payload buffer, etc are checked within the proofs so that functions that manipulate memory are checked that they remain within the correct bounds and use the appropriate pointer values *) prfun extract_data_proof {l:agz} {n:nat} (pf: record_data_v (l, n)): (array_v (byte, l, n), array_v (byte, l, n) -<lin,prf> record_data_v (l,n)) prfun extract_hbtype_proof {l:agz} {n:nat} (pf: record_data_v (l, n)): (byte @ l, byte @ l -<lin,prf> record_data_v (l,n)) prfun extract_payload_length_proof {l:agz} {n:nat} (pf: record_data_v (l, n)): (array_v (byte, l+1, 2), array_v (byte, l+1, 2) -<lin,prf> record_data_v (l,n)) prfun extract_payload_data_proof {l:agz} {n:nat} (pf: record_data_v (l, n)): (array_v (byte, l+1+2, n-16-2-1), array_v (byte, l+1+2, n-16-2-1) -<lin,prf> record_data_v (l,n)) prfun extract_padding_proof {l:agz} {n:nat} {n2:nat | n2 <= n - 16 - 2 - 1} (pf: record_data_v (l, n), payload_length: size_t n2): (array_v (byte, l + n2 + 1 + 2, 16), array_v (byte, l + n2 + 1 + 2, 16) -<lin, prf> record_data_v (l, n))

Proof functions are run at type checking time. They manipulate proofs. Let’s breakdown what the extract_hbtype_proof function does:

prfun extract_hbtype_proof {l:agz} {n:nat} (pf: record_data_v (l, n)): (byte @ l, byte @ l -<lin,prf> record_data_v (l,n))

This function takes a single argument, pf, that is a record_data_v instance for an address l and length n. This proof argument is consumed. Once it is called it cannot be accessed again (it is a linear proof). The function returns two things. The first is a proof byte @ l which says “there is a byte stored at address l”. The second is a linear proof function that takes the first proof we returned, consumes it so it can’t be reused, and returns the original proof we passed in as an argument.

This is a fairly common idiom in ATS. What it does is takes a proof, destroys it, returns a new one and provides a way of destroying the new one and bring back the old one. Here’s how the function is used:

prval (pf, pff) = extract_hbtype_proof (pf_data) val hbtype = $UN.cast2int (!p_data) prval pf_data = pff (pf)

prval is a declaration of a proof variable. pf is my idiomatic name for a proof and pff is what I use for proof functions that destroy proofs and return the original.

The !p_data is similar to *p_data in C. It dereferences what is held at the pointer. When this happens in ATS it searches for a proof that we can access some memory at p_data. The pf proof we obtained says we have a byte @ p_data so we get a byte out of it.

A more complicated proof function is:

prfun extract_payload_length_proof {l:agz} {n:nat} (pf: record_data_v (l, n)): (array_v (byte, l+1, 2), array_v (byte, l+1, 2) -<lin,prf> record_data_v (l,n))

The array_v view repesents a contigous array of memory. The three arguments it takes are the type of data stored in the array, the address of the beginning, and the number of elements. So this function consume the record_data_v and produces a proof saying there is a two element array of bytes held at the 1st byte offset from the original memory address held by the record view. Someone with access to this proof cannot access the entire memory buffer held by the SSL record. It can only get the 2 bytes from the 1st offset.

Safe memcpy

One more complicated view:

prfun extract_payload_data_proof {l:agz} {n:nat} (pf: record_data_v (l, n)): (array_v (byte, l+1+2, n-16-2-1), array_v (byte, l+1+2, n-16-2-1) -<lin,prf> record_data_v (l,n))

This returns a proof for an array of bytes starting at the 3rd byte of the record buffer. Its length is equal to the length of the record buffer less the size of the payload, and first two data items. It’s used during the memcpy call like so:

prval (pf_dst, pff_dst) = extract_payload_data_proof (pf_response) prval (pf_src, pff_src) = extract_payload_data_proof (pf_data) val () = safe_memcpy (pf_dst, pf_src add_ptr1_bsz (p_buffer, i2sz 3), add_ptr1_bsz (p_data, i2sz 3), payload_length) prval pf_response = pff_dst(pf_dst) prval pf_data = pff_src(pf_src)

By having a proof that provides access to only the payload data area we can be sure that the memcpy can not copy memory outside those bounds. Even though the code does manual pointer arithmetic (the add_ptr1_bszfunction) this is safe. An attempt to use a pointer outside the range of the proof results in a compile error.

The same concept is used when setting the padding to random bytes:

prval (pf, pff) = extract_padding_proof (pf_response, payload_length) val () = RAND_pseudo_bytes (pf | add_ptr_bsz (p_buffer, payload_length + 1 + 2), padding) prval pf_response = pff(pf)a Runtime checks

The code does runtime checks that constrain the bounds of various length variables:

if payload_length > 0 then if data_len >= payload_length + padding + 1 + 2 then ... ...

Without those checks a compile error is produced. The original heartbeat flaw was the absence of similar runtime checks. The code as structured can’t suffer from that flaw and still be compiled.


This code can be built and tested. First step is to install ATS2:

$ tar xvf ATS2-Postiats-0.0.7.tgz $ cd ATS2-Postiats-0.0.7 $ ./configure $ make $ export PATSHOME=`pwd` $ export PATH=$PATH:$PATSHOME/bin

Then compile the openssl code with my ATS additions:

$ git clone $ cd openssl $ git checkout -b ats_safe origin/ats_safe $ ./config $ make $ make test

Try changing some of the code, modifying the constraints tests etc, to get an idea of what it is doing.

For testing in a VM, I installed Ubuntu, setup an nginx instance serving an HTTPS site and did something like:

$ git clone $ cd openssl $ git diff 5219d3dd350cc74498dd49daef5e6ee8c34d9857 >~/safe.patch $ cd .. $ apt-get source openssl $ cd openssl-1.0.1e/ $ patch -p1 <~/safe.patch ...might need to fix merge conflicts here... $ fakeroot debian/rules build binary $ cd .. $ sudo dpkg -i libssl1.0.0_1.0.1e-3ubuntu1.2_amd64.deb \ libssl-dev_1.0.1e-3ubuntu1.2_amd64.deb $ sudo /etc/init.d/nginx restart

You can then use a heartbleed tester on the HTTPS server and it should fail.


I think the approach of converting unsafe C code piecemeal worked quite well in this instance. Being able to combine existing C code and ATS makes this much easier. I only concentrated on detecting this particular programming error but it would be possible to use other ATS features to detect memory leaks, abstraction violations and other things. It’s possible to get very specific in defining safe interfaces at a cost of complexity in code.

Although I’ve used ATS for production code this is my first time using ATS2. I may have missed idioms and library functions to make things easier so try not to judge the verbosity or difficulty of the code based on this experiement. The ATS community is helpful in picking up the language. My approach to doing this was basically add types then work through the compiler errors fixing each one until it builds.

One immediate question becomes “How do you trust your proof”. The record_data_v view and the proof functions that manipulate it define the level of checking that occurs. If they are wrong then the constraints checked by the program will be wrong. It comes down to having a trusted kernel of code (in this case the proof and view) and users of that kernel can then be trusted to be correct. Incorrect use of the kernel is what results in the stronger safety. From an auditing perspective it’s easier to check the small trusted kernel and then know the compiler will make sure pointer manipulations are correct.

The ATS specific additions are in the following files:

Categorieën: Mozilla-nl planet

Anthony Hughes: Google RMA, or how I finally got to use Firefox OS on Wind Mobile

Mozilla planet - do, 10/04/2014 - 22:28

The last 24 hours have really been quite an adventure in debugging. It all started last week when I decided to order a Nexus 5 from Google. It arrived yesterday, on time, and I couldn’t wait to get home to unbox it. Soon after unboxing my new Nexus 5 I would discover something was not well.

After setting up my Google account and syncing all my data I usually like to try out the camera. This did not go very well. I was immediately presented with a “Camera could not connect” error. After rebooting a couple times the error continued to persist.

I then went to the internet to research my problem and got the usual advice: clear the cache, force quit any unnecessary apps, or do a factory reset. Try as I might, all of these efforts would fail. I actually tried a factory reset three times and that’s where things got weirder.

On the third factory reset I decided to opt out of syncing my data and just try the camera with a completely stock install. However, this time the camera icon was completely missing. It was absent from my home screen and the app drawer. It was absent from the Gallery app. The only way I was able to get the Camera app to launch was to select the camera button on my lock screen.

Now that I finally got to the Camera app I noticed it had defaulted to the front camera, so naturally I tried to switch to the rear. However when I tried this, the icon to switch cameras was completely missing. I tried some third party camera apps but they would just crash on startup.

After a couple hours jumping through these hoops between factory resets I was about to give in. I gave it one last ditch effort and flashed the phone using Google’s stock Android 4.4 APK. It took me about another hour between getting my environment set up and getting the image flashed to the phone. However the result was the same: missing camera icons and crashing all over the place.

It was now past 1am, I had been at this for hours. I finally gave in and called up Google. They promptly sent me an RMA tag and I shipped the phone back to them this morning for a full refund. And so began the next day of my adventure.

I was now at a point where I had to decide what I wanted to do. Was I going to order another Nexus 5 and trust that one would be fine or would I save myself the hassle and just dig out an old Android phone I had lying around?

I remembered that I still had a Nexus S which was perfectly fine, albeit getting a bit slow. After a bit of research on MDN I decided to try flashing the Nexus S to use B2G. I had never successfully flashed any phone to B2G before and I thought yesterday’s events might have been pushing toward this moment.

I followed the documentation, checked out the source code, sat through the lengthy config and build process (this took about 2 hours), and pushed the bits to my phone. I then swapped in my SIM card and crossed my fingers. It worked! It seemed like magic, but it worked. I can again do all the things I want to: make phone calls, take pictures, check email, and tweet to my hearts content; all on a phone powered by the web.

I have to say the process was fairly painless (apart from the hours spent troubleshooting the Nexus 5). The only problem I encountered was a small hiccup in the process. Fortunately, I was able to work around this pretty easily thanks to Bugzilla. I can’t help but recognize my success was largely due to the excellent documentation provided by Mozilla and the work of developers, testers, and contributors alike who came before me.

I’ve found the process to be pretty rewarding. I built B2G, which I’ve never succeeded at before; I flashed my phone, which I’ve never succeeded at before; and I feel like I learned something new today.

I’ve been waiting a long time to be able to test B2G 1.4 on Wind Mobile, and now I can. Sure I’m sleep deprived, and sure it’s not an “official” Firefox OS phone, but that does not diminish the victory for me; not one bit.


Categorieën: Mozilla-nl planet