
2019-08-04 Flashing the Librem5 from a 32-bit Machine

Part of what has been keeping me so busy this last year, besides my job, is that every other weekend some friends and I have been getting together to work on the Librem5 devkit. One of them has posted several videos of this on their channel, though I had yet to write any blog posts on the matter before this one.

For most of our hack sessions we had been using one of their devices to flash the phone, due to the module loading error I had been experiencing on my Novena board. Since I'd finally fixed that issue, I was excited to begin a session with a freshly-flashed device ready when they arrived, rather than having us all wait while the latest image was flashed. Alas, it was not to be; after some initial set-up I ran into file loading and flash tool issues that required hacking the tool in order to get it working on my 32-bit device.

Setting up the Tools

I began my attempt at flashing by reading the guide written by the folks at Purism (the company behind the Librem5). The instructions began by noting that the uuu program from NXP Semiconductors' mfgtools was a requirement, and they provided a package for the Debian distribution; I was on Gentoo, however, and did not find a pre-built package, so I instead cloned the Librem5 fork and compiled it myself. Since I had compiled the package locally but not installed it, I then had to add the binaries to my PATH environment variable by running export PATH=${PATH}:/path/to/cloned/repo/mfgtools/uuu.

Next up, I cloned the librem5-devkit-tools scripts to help download and flash the image. When I tried to flash the image with ./scripts/librem5-devkit-flash-image, however, I got a rude ModuleNotFoundError: No module named 'jenkins' error. After checking the Portage repository for this module and not finding it, I decided to create a virtual environment with virtualenv, install the module via pip, and then re-run the command, but it still failed with the same error! Despite which's assurance that it was using the correct python binary, invoking the virtual environment's copy directly resulted in a different, and more perplexing, error:

Traceback (most recent call last):
  File "./scripts/librem5-devkit-flash-image", line 7, in <module>
    import jenkins
  File "/home/frostsnow/software/librem5-devkit-tools/python2test/lib/python3.6/site-packages/jenkins.py", line 9, in <module>
    lookup3 = cdll.LoadLibrary(os.path.join(get_python_lib(), "lookup3.so"))
  File "/usr/lib/python3.6/ctypes/__init__.py", line 426, in LoadLibrary
    return self._dlltype(name)
  File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/frostsnow/software/librem5-devkit-tools/python2test/lib/python3.6/site-packages/lookup3.so: cannot open shared object file: No such file or directory

Huh? After some debugging, I eventually concluded that the issue must have been in the lookup module, so I installed it via pip install lookup, and... I got the same error! Now, I don't recall how I found it, but eventually I noticed a python-jenkins module (rather than simply jenkins). What could have been the difference between the two? I didn't know, but I tried installing it anyways, after which the script failed due to another missing dependency. Progress? I then attempted to satisfy the chain of dependencies that cropped up, adding tqdm and PyYAML, after which point the script finally ran, though I still had to invoke the python binary with a full path. Python packaging truly is notoriously frustrating; what a pain!

Initial Flashing Attempts

With the tools finally set up, my hope of being able to flash the device was rekindled, only to be promptly dashed. After downloading and checksumming the image, the script threw a fatal error: Error: fail open file: >/mnt/librem5/2019_05_05/devkit.img. Failed to open the file? Why?! Unfortunately, the message gave no reason; undaunted, I used my knowledge of Linux systems programming to make it more useful by including the errno.h and string.h header files and appending the return value of strerror(errno) to the string. The next run then gave me the following, slightly more useful, error: Error: fail open file '>/mnt/librem5/2019_07_21/devkit.img': Value too large for defined data type. That is the message for EOVERFLOW: the file, at 3.2GB, was too large for the program because it had been compiled with 32-bit file offsets, but that's hardly a reasonable deterrent since there exist compatibility options for large files on 32-bit systems. Recalling similar experiences in my later college years, and with a bit of searching, I added large-file support by adding O_LARGEFILE to the relevant open(2) call and sprinkling the #define _FILE_OFFSET_BITS 64 feature test macro (see feature_test_macros(7)) into some header files. With this I was able to open the file, only to run into an mmap() error.
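For illustration, here is a minimal sketch of those two changes together; it is not the actual patch. The helper name open_for_reading() is hypothetical and does not exist in uuu. The sketch relies only on _FILE_OFFSET_BITS, which open(2) describes as the preferred way to get large-file support on 32-bit systems, so it does not need O_LARGEFILE at all.

```cpp
// Must come before any system header so that off_t and open() use 64-bit
// file offsets even in a 32-bit build (see feature_test_macros(7)).
#define _FILE_OFFSET_BITS 64

#include <cassert>
#include <cerrno>
#include <cstring>
#include <string>

#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Fails at compile time if large-file support did not take effect.
static_assert(sizeof(off_t) >= 8, "off_t must be 64-bit to handle files over 2GiB");

// Hypothetical helper (not part of uuu): opens `path` read-only. On failure
// it returns -1 and fills `err` with a message that includes the errno text.
int open_for_reading(const std::string& path, std::string& err)
{
    assert(!path.empty());

    int fd = open(path.c_str(), O_RDONLY);
    if (fd < 0) {
        int saved = errno;  // save errno before another call can overwrite it
        err = "fail open file '" + path + "': " + std::strerror(saved);
    }
    return fd;
}
```

Saving errno into a local before building the string matters because the string operations are free to make library calls of their own that may change it.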

As before, I began by appending the return value of strerror(errno) to the string, but the resulting error message was far more concerning: Cannot allocate memory. I had 4GB of RAM, and naively tried activating some swap space to see if that would fix the error. It didn't. Then I realized: a process on a 32-bit system only has a 4GB virtual address space, and typically 1GB of that is reserved for the kernel, leaving only 3GB for the process itself, which must also hold its code, heap, stack, and shared libraries. No matter what I tried, there would be no way to directly map a 3.2GB file into that space. This wouldn't be a simple compatibility fix; it'd require a significant change to the code. There were two things that I could do at this point: either give up and use one of my 64-bit desktop machines from upstairs, or hack the tool to make it do what I wanted. Most people would have given up and used a 64-bit machine, but I'd come so far, and, since part of the funds for the Novena went toward developing the same graphics driver that ended up being used in the Librem5, it seemed only right that the Novena should be able to flash it. Truly, there was only one option for me: hack it.

Hacking the Flasher

Before madly diving into my task, I decided to take a step back and become familiar with the flashing tool. This was when I learned that the tool was made by NXP Semiconductors, and that its upstream repo could be found on GitHub. From here, the first thing that I did was to check for documentation, and, much to my delight, there existed both a Wiki and a downloadable PDF version of the Wiki. Though a fair amount of it was over my head, I learned that the tool interacts with the target device based on a series of commands passed to it either via a script or interactively. Taking a look at the Librem5 flashing script, I saw where the commands were located, and, after running the script again, deduced that the command flash -raw2sparse all {image} was the cause of my problems. I wasn't able to glean any more information easily from the documentation, so, before diving into the code, I decided to check the forks of the project to see if anyone else might have implemented 32-bit support, but I didn't find anyone who appeared to have done so. While I didn't find an easy way out, finding that command gave me a good idea of what I'd be looking for, and I began to dive into the code.

My first object of interest was the FileBuffer class in libuuu/buffer.h which had failed to mmap() the large file; I thought that perhaps I'd be able to modify its methods to work in 32-bit mode. Looking at its interface showed methods such as data(), which returned the pointer to the mapped memory, size(), which returned the size of the mapped area, and & operator[], which appeared to overload the indexing operator in order to return a reference to the memory addressed at the specified location. Since any user of the interface could call data() followed by size() and justly expect the entire file region to be mapped into memory, and there wouldn't be an easy way (that I knew of) to intercept a faulty dereference and transparently load in another chunk of a partially-mapped large file, there didn't seem to be any way of making generic, non-breaking changes to the class. I then decided to check out how the buffer was being used so I could figure out what kinds of breaking changes I'd need to make.

Using my knowledge of mfgtools' command-based structure, I searched the sources and found an aptly-named FBFlashCmd::flash_raw2sparse() method. Of immediate interest to me was that the function declaration included a parameter of type shared_ptr<FileBuffer> pdata, a smart pointer to the buffer object I'd just looked at; with any luck this meant that I wouldn't have to modify calling structures and pass extra parameters in order to implement my hack. Digging into the implementation I learned that the code appeared to revolve around something called a SparseFile; "blocks" were taken from the FileBuffer argument and pushed one at a time into the SparseFile via its push_one_block() method until it returned non-zero, at which point flash() was called and the SparseFile appeared to be re-initialized. So, if copying to the SparseFile was being done in blocks then the hack should be easy enough: I'd just need to change how blocks were retrieved from the FileBuffer object! I took a quick look at the SparseFile's push_one_block() method and it appeared to be doing some kind of data deduplication for consecutive blocks, but I didn't take the time to convince myself of my hypothesis; I was ready to try and hack it.
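The control flow described above can be sketched with a toy model. Everything here is a stand-in: ToySparseFile and flash_in_blocks() are hypothetical names, the real SparseFile does considerably more, and the sketch assumes that a block rejected with a non-zero return has to be pushed again after the flush, which I have not verified against the real code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for uuu's SparseFile: holds at most `max_blocks` blocks.
struct ToySparseFile {
    std::size_t max_blocks;
    std::vector<std::vector<unsigned char>> blocks;

    // Returns 0 if the block was stored, non-zero if the container is full.
    int push_one_block(const unsigned char* data, std::size_t len)
    {
        if (blocks.size() >= max_blocks)
            return -1;
        blocks.emplace_back(data, data + len);
        return 0;
    }
};

// Pushes `image` block by block; whenever the sparse file fills up, its
// contents are "flashed" (appended to `flashed`) and it is re-initialized.
// Returns the number of flash operations performed.
std::size_t flash_in_blocks(const std::vector<unsigned char>& image,
                            std::size_t block_size, std::size_t max_blocks,
                            std::vector<unsigned char>& flashed)
{
    assert(block_size > 0 && max_blocks > 0);

    ToySparseFile sf{max_blocks, {}};
    std::size_t flashes = 0;

    // Stand-in for flash(): send everything queued so far, then reset.
    auto flush = [&] {
        for (const auto& b : sf.blocks)
            flashed.insert(flashed.end(), b.begin(), b.end());
        sf.blocks.clear();
        ++flashes;
    };

    for (std::size_t off = 0; off < image.size(); off += block_size) {
        std::size_t len = std::min(block_size, image.size() - off);
        if (sf.push_one_block(image.data() + off, len) != 0) {
            flush();
            int rc = sf.push_one_block(image.data() + off, len);
            assert(rc == 0);  // an empty container always has room
            (void)rc;
        }
    }
    if (!sf.blocks.empty())
        flush();  // send whatever is left over
    return flashes;
}
```

The point of the model is that the loop only ever looks at one block of the input at a time, so the source of those blocks can be swapped out without touching the rest of the logic.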

The first step was to make the buffer object not fail when opening the file; I had the code call the mmap() function as usual and check for failure, but, if errno was set to ENOMEM, the buffer instance would set the buffer size, set the buffer pointer itself to MAP_FAILED, close the file descriptor, and return success. This would probably cause a segmentation fault in any code that wasn't expecting it, so any usage of the modified tool would have to be restricted to certain commands, which I could easily do for my own purposes. Turning my attention to the FBFlashCmd::flash_raw2sparse() method, I had the for loop compare the buffer instance's m_Mapbuffer member to MAP_FAILED and, if they differed, use the regular functionality, or, if they matched, read in one block, er, wait, no, there was no file descriptor to read from. I went back to the buffer object, added the file path as a member, then went back to the command and had it open and close a file descriptor for the buffer's file path member before and after the for loop, respectively. Now, back within the for loop, I had the workaround code malloc() an actual buffer whose size was the method's block_size argument, then read() the data from the file descriptor into the actual buffer before passing it to the SparseFile's push_one_block() method (and, finally, freeing the actual buffer, of course). That... should have done it, I hoped.
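A self-contained sketch of the same idea, map the file when possible and otherwise fall back to reading one block at a time, might look like the following. It is not the patch itself: for_each_block(), read_full(), and the force_read parameter (there only so the fallback path can be exercised on a 64-bit machine) are all hypothetical, and the sketch reuses one buffer instead of calling malloc() on every iteration.

```cpp
#define _FILE_OFFSET_BITS 64  // 64-bit off_t on 32-bit builds

#include <algorithm>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

using BlockFn = std::function<void(const unsigned char*, std::size_t)>;

// read() may return fewer bytes than requested, so keep reading until the
// block is full, EOF is reached, or a real error occurs.
static ssize_t read_full(int fd, unsigned char* buf, std::size_t want)
{
    std::size_t got = 0;
    while (got < want) {
        ssize_t n = read(fd, buf + got, want - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;  // interrupted by a signal: retry
            return -1;
        }
        if (n == 0)
            break;  // EOF
        got += static_cast<std::size_t>(n);
    }
    return static_cast<ssize_t>(got);
}

// Calls on_block() for each block_size piece of the file, in order.
// Returns 0 on success and -1 on error.
int for_each_block(const std::string& path, std::size_t block_size,
                   const BlockFn& on_block, bool force_read = false)
{
    assert(block_size > 0);

    int fd = open(path.c_str(), O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }

    // Try to map the whole file, unless its size cannot even be expressed
    // as a size_t (possible on 32-bit with a 64-bit off_t).
    void* map = MAP_FAILED;
    if (!force_read && st.st_size > 0 &&
        static_cast<unsigned long long>(st.st_size) <= SIZE_MAX) {
        map = mmap(nullptr, static_cast<std::size_t>(st.st_size), PROT_READ,
                   MAP_PRIVATE, fd, 0);
        if (map == MAP_FAILED && errno != ENOMEM) {
            close(fd);
            return -1;  // only ENOMEM triggers the fallback
        }
    }

    if (map != MAP_FAILED) {
        // Regular path: hand out slices of the mapping.
        const unsigned char* p = static_cast<const unsigned char*>(map);
        std::size_t size = static_cast<std::size_t>(st.st_size);
        for (std::size_t off = 0; off < size; off += block_size)
            on_block(p + off, std::min(block_size, size - off));
        munmap(map, size);
        close(fd);
        return 0;
    }

    // Fallback path: one reusable buffer, filled block by block.
    std::vector<unsigned char> buf(block_size);
    int rc = 0;
    for (;;) {
        ssize_t n = read_full(fd, buf.data(), block_size);
        if (n < 0) {
            rc = -1;
            break;
        }
        if (n == 0)
            break;
        on_block(buf.data(), static_cast<std::size_t>(n));
        if (static_cast<std::size_t>(n) < block_size)
            break;  // short final block: EOF reached
    }
    close(fd);
    return rc;
}
```

Looping inside read_full() is the detail most easily missed in a quick hack: a single read() is allowed to return a partial block, which would otherwise shift every block after it.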

After taking one last look over the code I decided to try my luck. Hopefully I wouldn't brick anything. I compiled the tool, then ran Purism's update script. After a short while it got to the flashing command. The progress meter read "0%". Then "1%". Then "0%". Argh! It went back to "1%", then "0%" again. "Perhaps it just needs to go from '0%' to '1%' a hundred times", David suggested.

Figure: The flasher teasingly switches between "0%" and "1%" complete.

We waited while the status bar indicator oscillated. Then, after a few minutes, the program ended with a "Success!" message. ...was it really a success? I switched the phone into eMMC boot mode, turned it on and... it worked. I was finally able to flash the Librem5 from my Novena!

Final Thoughts

I'm really glad that Purism chose the mfgtools utility for their flasher as, being Free Software, I was able to modify it to suit my own needs. As of this writing, I have not tried to submit my patches upstream as I do not consider them to be of adequate quality for inclusion, but I have posted them on my GitHub should anyone be inclined to use them at their own risk. It might be possible to harden the code a little bit by adding some assert() calls to the FileBuffer's methods, but I'm not inclined to spend any more time on this right now. It's time to see if I can make any meaningful forward progress on the Librem5.

