An avian carrier's blog – Linux Atom feed

GNU/Linux operating system
  1. ROSE 2011: some afterthoughts (2011-05-01)

    Every year since 2003, Alexis Polti and I have run a course named "ROSE" (Robotique et Systèmes Embarqués, Robotics and Embedded Systems) for future engineers at Télécom ParisTech. During this 120-hour curriculum, students have to design and build embedded systems, including designing their own electronic boards and programming them. Classical lectures are limited to the minimum (real-time operating systems, signal integrity), and students must learn all the other topics by themselves while the two teachers offer lots of assistance (we are physically with the students most of the time to answer their questions).

    As every year, the 2011 edition introduced some changes (hopefully for the better) that I now want to analyze.

    Afterthoughts

    Git vs. Hg

    Until last year, we used Mercurial as our revision control system because we thought it was simpler for the students than Git, even though both teachers used Git themselves. This year, we decided to try Git with the Gitolite backend tool that we already used for research projects. The outcome was unexpectedly successful: every project used lots of branches for its development, merging and rebasing at will.

    The presence of Clément Moussu, a student who had previously done an internship at Gostai where Git is used intensively (they even use "git notes", which almost no one knows about), has been a tremendous help, and has been acknowledged during the debriefing session by other students. He and three other students explained Git to the others, and spoke about the best practices right from the beginning. So we plan to keep using Git as our preferred revision control system.

    Linux-based boards

    For the first time, we allowed some projects to use Linux-based boards in addition to the micro-controller boards they had to design. The mix of those Linux-based boards (one Armadeus APF27, one BeagleBoard xM and one Gumstix Overo FE COM) allowed them to use high-level languages (Python), libraries (OpenCV, 0MQ) and cloud-based processing capabilities (Google App Engine) very easily. We plan to keep this possibility as well, but we need to ensure that every project still builds additional micro-controller based boards, as we want our students to really know how to design a board from scratch.

    Best programming practices

    This is something we did not do: ensure that our students know the best programming practices. Next year, we plan to do a live-coding session where we will collectively try to write the best possible code. The teacher will write, compile and run the code as suggested by the students, and explain how the code may be improved and what needs to be done to guarantee reliability and ease of maintenance. Tricky exercises will be proposed, to ensure that students know when volatile needs to be used, and when it is not needed. Also, lockless algorithms, depending on the underlying hardware, will be used whenever possible. The effect of inlining some functions (and when not to inline and let the compiler work it out) will be studied intensively, and methods to avoid any code duplication will be taught.

    Some students naturally know how to write good code, but some don't and just write code that works but is unmaintainable. Instead of having them fix the code afterward, we will make sure that they write proper code from day one. So next year, the students will learn this skill at the beginning of the class rather than along its course.

    The projects

    If you are curious to see what has been done, here are links to the various projects done by the students in 2011 during this 2.5 month course:

    • Casper: a talking and listening robot shaped like an elephant trunk
    • Copterix: a helicopter with eight engines
    • IRL: a nightclub laser that displays your tweets and lets you control via Twitter the color of the club as well as other equipment such as the smoke machine
    • MB Led: a very nice set of blocks letting you play games by moving them around
    • Rosewheel: a Segway clone, remotely controlled using an Android phone
    • TSV Safe Express: control a model railroad layout using cheap components (unfortunately, the web site is almost empty)
  2. Android/Linux: a desperate attempt at creating buzz (2011-03-18)

    In a misinformed article about Android libc and the Linux kernel, Florian Mueller seems to attempt to create buzz about an alleged Linux kernel copyright infringement. Even if Linux kernel copyright holders decided to complain (which is not the case as far as I know), the article is full of mistakes and approximations.

    In view of all of that, I think the only viable option will be for Google to recognize its error with Bionic and to replace it as soon as possible with glibc (GNU C library). That library is licensed under the LGPL ("Lesser GPL"), which has the effect that applications can access the Linux kernel without necessarily being subjected to copyleft if certain criteria are fulfilled.

    The GPLv2 licence used in the Linux kernel does not allow reusing parts of the covered software and distributing them under the LGPL license.

    Using glibc is the industry-standard approach, and it is the approach used by those in the open source world who are trying to "play by the rules."

    Come on, many embedded Linux systems use the more compact µClibc. Aren't they playing by the rules?

    In fact, Google's decision to forego glibc is one of the reasons Android is considered a Linux fork rather than a true Linux implementation.

    Wrong. Several distributions use other C libraries, including eglibc, which is a fork of glibc. Is Debian a Linux fork because it is now using eglibc instead of glibc? Certainly not!

    Android is sometimes considered a Linux fork because some of the features it needs, such as waking up the device when a call arrives, have disturbed quite a lot of device driver code, and the mainline Linux maintainers do not feel comfortable including those changes in the default kernel.

    Florian Mueller, while trying to make some noise, completely dismisses the fact that Google wrote its own libc (based on a preexisting one using a BSD license) to get better performance. Accessing kernel structures directly instead of trying to build a set of insulation layers allows them to use the Linux kernel more efficiently. As stated on the GNU project web site, "The GNU C library is primarily designed to be a portable and high performance C library. It follows all relevant standards (ISO C 99, POSIX.1c, POSIX.1j, POSIX.1d, Unix98, Single Unix Specification)." This is just not needed in a restricted capability device where you do not necessarily need to implement or even comply with all the existing standards.

    Even if copyright holders chose to complain, the solution proposed by Florian Mueller is completely misguided, unrealistic and useless. Even if I somewhat did it by writing this post, please do not feed the troll.

  3. Using IPv6 by default with wget (2007-10-31)

    I was surprised to see that wget chose to use IPv4 over IPv6 when downloading a file. It looks like it is on purpose (I would call it a bad design choice). You can tell wget to prefer IPv6 over IPv4 by putting the following line

    prefer-family = IPv6

    in either /etc/wgetrc (system wide) or $HOME/.wgetrc (user settings).
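
    The effect of this setting can be pictured as a reordering of the addresses returned by the resolver. The sketch below is not how wget is implemented; prefer_family is a hypothetical helper that applies the same idea to tuples in the format returned by Python's socket.getaddrinfo, using hand-written sample entries with documentation addresses.

```python
import socket

def prefer_family(addrinfos, family=socket.AF_INET6):
    """Return the entries with the preferred address family first.

    sorted() is stable, so the resolver's own ordering is preserved
    inside each group.
    """
    return sorted(addrinfos, key=lambda info: info[0] != family)

# Sample entries shaped like getaddrinfo() results (assumed values).
sample = [
    (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.1", 80)),
    (socket.AF_INET6, socket.SOCK_STREAM, 6, "", ("2001:db8::1", 80, 0, 0)),
    (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.2", 80)),
]
ordered = prefer_family(sample)
```

    A client that tries the addresses in this order connects over IPv6 when possible and still falls back to IPv4 otherwise.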

  4. Strange keyboard problem (2007-10-19)

    For about a week, I have noticed that I have been making a lot of typos in some commands I use frequently. For example, I became unable to type correctly

    cd /usr/src/linux

    which always resulted in

    cd /usr/src:linux

    (incidentally, when typing the above strings, I had to fix the first one and the second one came naturally buggy)

    On a French keyboard (AZERTY layout), / is obtained by pressing shift and : simultaneously. I first thought that my laptop keyboard was malfunctioning. But it happened on my home computer as well. I then thought I had become unable to properly release the C key before pressing the shift one, but no, I think I found a real bug somewhere: this problem occurs only when a key in the lower left part of the keyboard (near the shift key, namely one of the WXCV letters on my keyboard) is rapidly followed by a shift.

    Let’s make a test: while running an X11 server, press the C key, keep it pressed so that you turn the auto-repeat mode on, then press shift (without releasing the key). You should, at least under Linux with Xorg, see something like:

    ccccccccccccccCCCCCCCCCCCCCCCC...

    But what I get is:

    cccccccccccccccccccccc...

    The shift key is ignored. Note that it works fine with the right shift key though.

    For a fast touch typist (as I otherwise luckily am), this is rather unfortunate; the combination of one of those wxcv letters followed by a slash happens to me at least fifty times a day, often much more than that. Since I cannot reproduce that on the Linux console, I will for the moment put the blame on my X server.

  5. Will Gentoo be the last OS without IPv6 automatic tunnels? (2007-01-29)

    Tomorrow, Windows Vista will be available in stores. According to press reviews, this operating system will have IPv6 enabled by default with support for automatic Teredo tunnels when native IPv6 is not available.

    Teredo tunnels allow a computer plugged into an IPv4-only network to efficiently talk with computers using IPv6 addresses. IPv6 proponents such as myself are pleased with this move: while I don't like Microsoft at all, I am happy to see them embrace IPv6 and give this protocol the chance it deserves.

    However, I don't use Windows on my laptop (or anywhere else, for that matter); I use the Gentoo Linux free operating system. When my laptop is plugged into my home or work networks, it gets automatic IPv6 connectivity. However, when I am traveling, I usually use IPv4-only networks; an automatic tunnel would really be useful to reach my home computers, some of them being IPv6 only.

    Fortunately, there exists an excellent automatic tunneling program for Linux and FreeBSD called Miredo. This program is already included in Debian GNU/Linux and FreeBSD.

    Arne Mejlholm packaged Miredo for Gentoo back in February 2005 after Daniel Webert suggested it. I submitted an updated version in June 2006. However, it has never been integrated into Gentoo's portage system and my question about the next step to take (if any) never got answered.

    As I am tired of chatting with myself on the Gentoo ticket tracking system, I will not submit a new version of the Miredo package that is likely to be ignored as well. I hope Gentoo developers will handle ticket 77603, even if only to tell what is wrong with it.

    Edit (2010-11-24): it took more than five years, but at last Miredo is now included in Gentoo.

  6. Factor: a stack-based programming language (2007-01-18)

    As you may already know, I'm a big fan of stack-based languages such as Forth, functional languages such as Haskell and reflexive languages such as Smalltalk. You can imagine how happy I was when I discovered Factor a few days ago: it combines all those aspects.

    Today, a friend sent me someone's email signature and asked me if I could decipher it:

    01101001001000000110011001110101011000110110101101100101011001000010
    00000111100101101111011101010111001000100000011011010110111101101101
    00001101000010100110000101101110011001000010000001101110011011110111
    01110010000001100110011011110111001000100000011110010110111101110101
    

    As any programmer on the Earth would have, I immediately assumed that those were ASCII codes printed in binary format. I had a Factor interactive shell opened in one of my windows, so I cut and pasted the whole string (surrounded by quotes) and entered:

    8 group [ 2 base> ] map >string print
    

    and the cleartext version appeared instantly. All in all, it took me around 20 seconds to uncover the original text using Factor.

    How does this work? Factor is a stack-based language, meaning that data are put onto a stack and words (equivalent of functions in other languages) use the data on the stack and put results there. Factor is a (dynamically) typed language: complex data can be pushed onto the stack, while untyped languages such as Forth can only push numbers there.

    Writing the string pushes it on the stack. Using

    8 group
    

    takes the string on the top of the stack, considers it as a sequence (a succession of characters), groups its elements by eight and returns an array of strings of length 8. At this point, there is only one element on the stack: an array of eight-character-long strings.

    Then

    [ 2 base> ]
    

    pushes another element on the top of the stack: a quotation (the equivalent of a lambda expression in functional languages), which is a block containing code. The base> word consumes two elements from the stack, a string S and a number B, and pushes back a number which is the value represented by S in base B. For example, the expression

    "01101001" 2 base>
    

    will leave the value 105 on the stack, as 01101001 in binary represents 105 in decimal.

    map
    

    takes a sequence and a quotation. It represents the classical map operation in functional languages: it applies the quotation to every element of the sequence and gathers the results in a new sequence. As a consequence, each eight-character-long string gets transformed into the number it represents. At the end, we end up with a single element on the stack, which is an array containing all the ASCII codes of the sentence.

    >string
    

    transforms a sequence of ASCII codes into the corresponding string. Then

    print
    

    is similar to C's puts and prints the string on the standard output while recognizing special sequences such as \r and \n.
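
    For readers who do not have a Factor listener at hand, the same pipeline can be sketched in Python. This is only an illustrative transliteration, not part of the original session; the function name decode_binary_ascii is made up for this example, and each comment names the Factor word that the line mirrors.

```python
def decode_binary_ascii(bits):
    """Decode a string of '0'/'1' characters, eight per character."""
    # "8 group": split the string into chunks of eight characters.
    groups = [bits[i:i + 8] for i in range(0, len(bits), 8)]
    # "[ 2 base> ] map": parse every chunk as a base-2 number.
    codes = [int(group, 2) for group in groups]
    # ">string": turn the sequence of character codes into a string.
    return "".join(chr(code) for code in codes)

# "print": write the result to the standard output.
print(decode_binary_ascii("0100100001101001"))  # prints "Hi"
```

    The Python version needs explicit intermediate names where Factor simply leaves each result on the stack for the next word.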

    The content of the decoded text itself is not important; my point is that Factor is very compact and its stack-oriented nature helps in writing concise and clear programs. For example, here is one of the many possible implementations of the reverse functionality: it takes a string from the stack and leaves its ASCII representation in binary on the stack.

    [ 2 >base 8 CHAR: 0 pad-head ] [ append ] map-reduce
    

    I'm sure that you're thrilled to know that "Hello, world!" encodes as

    0100100001100101011011000110110001101111001011000010
    0000011101110110111101110010011011000110010000100001
    

    (edited on 2009-06-13 to use pad-head and map-reduce)
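
    As a point of comparison, here is a rough Python counterpart of the encoder. It is a sketch rather than a faithful port: encode_binary_ascii is a name invented for this example, and the reduction starts from an empty string so that empty input is accepted as well.

```python
from functools import reduce

def encode_binary_ascii(text):
    """Return the 8-bit binary representation of every character, concatenated."""
    # "2 >base 8 CHAR: 0 pad-head": binary digits, left-padded with zeros to width 8.
    chunks = [format(ord(char), "b").rjust(8, "0") for char in text]
    # "[ append ] map-reduce": concatenate all the chunks.
    return reduce(lambda acc, chunk: acc + chunk, chunks, "")

encoded = encode_binary_ascii("Hello, world!")
```

    Here encoded holds the same 104 binary digits as shown above for "Hello, world!".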

  7. To peer review or to not peer review? (2006-12-26)

    As an experienced programmer, I participate in many Free Software projects when time permits. I am committed to a few projects, and I frequently submit patches to random projects that I happen to bump into. I also understand the dynamics of free software: when a bug stands in my way, I often fix it myself rather than waiting for another contributor (who may have her own priorities and agenda) to fix it. Same when I badly need a feature.

    In this post, I will compare the submission process of two changes I made to free software recently:

    • a new watchdog driver for the Linux kernel;
    • a fix for a critical flaw in SIP message handling in the Asterisk telephony system.

    Linux device driver

    I first posted my new device driver code as a patch (a difference between the actual Linux source code and the modified one) on the linux-kernel mailing-list. Shortly after that, some people publicly answered my mail and offered remarks and criticisms about my changes. Most of the advice was well targeted and I modified my patch accordingly. Some of the remarks were a bit off because people commenting on the code hadn’t read the device datasheet and were confused by some names used therein and mirrored into the driver; I explained the situation and why I would not act upon those remarks. One point about a possible concurrent access was discussed and resolved after a few technical exchanges. I then posted a modified patch for everyone to comment on. This latter patch was then acked (i.e., blessed) by a major developer.

    Various parts of the Linux kernel are maintained by different people. The device I was addressing was a watchdog (a piece of hardware that forcibly reboots your computer if the operating system fails to say “I’m still alive” on a regular basis), so the watchdog subsystem maintainer took responsibility and integrated it into his own development tree, so that people willing to test this new driver could do so easily. After some time, as the new driver had shown no visible disturbance of the rest of the kernel, it was pulled by Linus Torvalds into the main Linux kernel tree and was released as part of Linux 2.6.19.

    Note that when the watchdog subsystem maintainer integrated my new driver into his tree, he was already quite confident that the driver was clean as it had been carefully read and commented on by several other developers. The integration within his tree rather than into the main Linux kernel ensured that all the watchdog drivers can play nicely together.

    Asterisk flaw in the SIP engine

    Free Telecom is the second most important ADSL provider in France. They provide a triple-play service over ADSL: IP, telephony and television. The telephony service can be accessed either using an analog phone connected to their ADSL modem or using a SIP connection to their server. On the server side, Free Telecom chose to use a solution by Cirpack, made from boxes able to handle several thousands of simultaneous SIP sessions.

    When the Cirpack server was upgraded at the beginning of December, all Asterisk boxes using Free Telecom as their SIP provider immediately stopped working: the voice was not going through anymore. This problem was reported on a forum by an Asterisk user a few hours after the upgrade and promptly analyzed by a Cirpack engineer: it appeared to be a flaw in Asterisk SIP handling. The engineer rolled back the Free Telecom server to the previous revision and sent me a mail with the description of the problem. Why me? Because we know each other as we studied together, and he knew I was using Asterisk to connect to the Free Telecom SIP server and that I was likely to quickly investigate and fix the problem.

    A few hours later, I produced two short fixes for Asterisk and was able to test them against a Cirpack server running the new firmware. Everything went fine and the problem was fixed. I posted the patches to the Asterisk bug tracking system and, less than four hours later, added full debugging information with and without the patches at the request of a manager so that it was clear what the problem was and how the patch fixed it.

    I also sent several mails to the Asterisk developers mailing-list to underline the importance of the flaw. As long as the flaw is not fixed, any upgrade made by a VoIP provider may break all its Asterisk clients without any easy workaround. To describe the flaw briefly, an unpatched Asterisk doesn’t understand perfectly valid SIP headers and interprets them in a totally wrong way, causing the subsequent traffic to be sent to the wrong place.

    Asterisk 1.4.0 was released 19 days after I explained this critical flaw and posted the patches to correct it. Not only were the patches not included in the release, but as far as I can tell no peer review has occurred on the patches. The only request made by a manager was that some developers, who have not yet answered, test the patch.

    Also, at some point, this very same manager added a relationship between this problem and another one without any comment to explain this alleged relationship. As far as I can tell, the two bugs are totally unrelated and I fail to see any relationship between them except that they address two problems in SIP message processing, although one is about SIP headers syntax and the other one about the SIP engine internal state machine.

    At this point, it is worth noting that I do not feel bad about Asterisk because my patches were not included in the latest release; what I criticize here is what I consider a lack of feedback on user-contributed fixes and a lack of interaction between developers.

    Comparing the two processes

    Proposed changes to the Linux kernel are posted on a public mailing-list as plain-text, where anyone is free to comment on them. The plain-text format makes it easy to intersperse the relevant code portion with the comments. One or several structured discussions follow, each one addressing one aspect of the proposed patch. New versions of the patch may then be proposed and discussed until the patch is finally blessed (acked) by one or more fellow developers. Note that this process happens in an email client, without any compilation taking place at this stage. Technical flaws may be found by code reading and discussion rather than by testing whether the code seems to trigger a bug or not. Also, if the code would benefit from extra documentation, such documentation will be requested publicly by other developers.

    Proposed changes to Asterisk are posted onto the Asterisk bug tracking system maintained by Digium (the original authors and the current maintainers of Asterisk). A disclaimer also needs to be filled in by contributors, as Digium wants to be able to make a proprietary version of Asterisk, while others may only distribute it as GPL software. I have the impression that the patches are not peer reviewed: the use of a bug tracking system doesn’t ease such a code review process, compared to a mailing-list as in the Linux kernel patches case. I am also under the impression that patches are tested rather than being read first. If enough developers report that the patch hasn’t visibly broken their system, the patch may eventually be integrated.

    Also, parts of Asterisk sometimes undergo major rewrites without any attempt to explain what has been changed exactly. For the Linux kernel, this would be unacceptable: a series of incremental patches would have to be submitted on the mailing-list, with a step-by-step justification of why things need to be changed. When incremental patches are not doable, because changes depend on each other, separate patches that need to be applied at the same time will still be required so that individual changes are reviewable by other developers.

    As you may have guessed at this stage, I much prefer the Linux kernel way of doing it. The peer review system exposes proposed changes to several pairs of hackers' eyes. The patches and the subsequent discussions also teach potential contributors what they need to send and how they need to present it. This iterative process not only generates better code but also shows good practices to other programmers.

    I would really like other large software projects, such as Asterisk, to adopt it to improve code quality and the interaction between developers.

  8. Linux kernel driver for the Winbond 83697HF/HG watchdog (2006-10-26)

    My device driver for the watchdog embedded in the Winbond 83697HF/HG SuperIO controller has been integrated into the forthcoming Linux 2.6.19 kernel. If you want to use it on a Dedibox dedicated server, you have to:

    • activate the option CONFIG_W83697HF_WDT in your kernel configuration file
    • load the module at boot time with parameter wdt_io=0x4e; creating /etc/modules.d/wdt with a single line options w83697hf_wdt wdt_io=0x4e and running update-modules should work on most Linux distributions
    • install a watchdog signaling program such as watchdog (sys-apps/watchdog in Gentoo portage tree) and run it at boot time

    Then if your server gets stuck, whatever the cause, it will reboot automatically.
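
    To give an idea of what such a signaling program does, here is a minimal Python sketch. It assumes that the driver exposes the usual Linux /dev/watchdog character device and that any write to it resets the countdown; feed_watchdog and its parameters are names invented for this illustration, and a real deployment would rather rely on a dedicated daemon such as the watchdog package mentioned above.

```python
import time

def feed_watchdog(device, beats, interval, sleep=time.sleep):
    """Write one byte to the watchdog device `beats` times.

    `interval` (in seconds) must be shorter than the driver's timeout,
    otherwise the hardware reboots the machine between two writes.
    """
    for _ in range(beats):
        device.write(b"\0")  # any write tells the driver we are still alive
        device.flush()
        sleep(interval)

# Possible use as root (assumed device path and interval):
#
#     with open("/dev/watchdog", "wb", buffering=0) as device:
#         while True:
#             feed_watchdog(device, beats=1, interval=10)
```

    If the process or the whole system hangs, the writes stop and the watchdog hardware takes over once its timeout expires.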

  9. rforth1 optimizations (2006-10-24)

    I worked a lot on rforth1 lately, a Forth compiler targeting the PIC 18f family of microcontrollers. I have added many new optimizations in order to generate smaller and more efficient code.

    Let's take an example. The Forth code below cycles through the 8 possible states of 3 leds connected to ports B5, B6 and B7 of a PIC:

    \ Define three words led0, led1 and led2 designating the leds
    
    LATB 5 bit led0
    LATB 6 bit led1
    LATB 7 bit led2
    
    \ Use timer 0 to wait for 100ms (with a 40MHz crystal)
    
    : tmr0-init ( -- ) $84 T0CON c! ;    \ Enable timer, 16 bits, prescaler = 32
    : 100ms ( -- ) -31250 TMR0L ! TMR0IF bit-clr begin TMR0IF bit-set? until ;
    
    \ Move leds -- when led0 goes to 0, switch led1. When led1 goes to 0, do
    \ the same thing with led2
    
    : leds-init ( -- ) 0 LATB c! $1F TRISB c! ;   \ B5, B6 and B7 are outputs
    : switch-led2 ( -- ) led2 bit-toggle ;
    : switch-led1 ( -- ) led1 bit-toggle led1 bit-clr? if switch-led2 then ;
    : switch-led0 ( -- ) led0 bit-toggle led0 bit-clr? if switch-led1 then ;
    
    \ Loop indefinitely with a pause between each led change
    
    : mainloop ( -- ) begin switch-led0 100ms again ;
    
    \ Main program: initialize the timer and the leds then run the main loop
    
    : main ( -- ) tmr0-init leds-init mainloop ;
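
    The constants $84 and -31250 used for the timer can be checked with a bit of arithmetic. The Python sketch below is only a worked example: it assumes the usual PIC18 arrangement in which the timer is clocked by the instruction clock (a quarter of the oscillator frequency) and in which the three low bits of T0CON select a prescaler of 2, 4, 8, ... 256. All variable names are made up for this illustration.

```python
OSCILLATOR_HZ = 40_000_000   # "40MHz crystal" from the Forth comment
T0CON = 0x84                 # value stored by tmr0-init
DELAY_MS = 100

timer_enabled = bool(T0CON & 0x80)        # bit 7: timer on
sixteen_bit_mode = not (T0CON & 0x40)     # bit 6 clear: 16-bit timer
prescaler = 2 ** ((T0CON & 0x07) + 1)     # low bits 100 -> 1:32

instruction_hz = OSCILLATOR_HZ // 4       # assumed Fosc/4 instruction clock
timer_hz = instruction_hz // prescaler    # 312500 increments per second
ticks = timer_hz * DELAY_MS // 1000       # 31250 increments in 100 ms

# The timer counts up and raises TMR0IF when it overflows, so it is
# preloaded with -ticks, i.e. 65536 - ticks as a 16-bit value.
preload = -ticks & 0xFFFF
```

    With these assumptions, ticks comes out as 31250, which matches the -31250 stored into TMR0L by the 100ms word.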
    

    Here is the assembly code with the default compiler switches: (in order to keep it relatively short, I've omitted the declaration of constants such as LATB, which are included automatically, as well as the assembly file header)

    ; main: defined at example.fs:26
    main
            call tmr0_init
            call leds_init
    
    ; mainloop: defined at example.fs:22
    mainloop
            call switch_led0
            call _100ms
            bra mainloop
    
    ; switch-led0: defined at example.fs:18
    switch_led0
            btg LATB,5,0
            btfsc LATB,5,0
            return
    
    ; switch-led1: defined at example.fs:17
    switch_led1
            btg LATB,6,0
            btfsc LATB,6,0
            return
    
    ; switch-led2: defined at example.fs:16
    switch_led2
            btg LATB,7,0
            return
    
    ; tmr0-init: defined at example.fs:9
    tmr0_init
            movlw 0x84
            movwf T0CON,0
            return
    
    ; 100ms: defined at example.fs:10
    _100ms
            movlw LOW(-31250)
            movwf TMR0L,0
            movlw HIGH(-31250)
            movwf (TMR0L+1),0
            bcf INTCON,2,0
    _lbl___197
            btfsc INTCON,2,0
            return
            bra _lbl___197
    
    ; leds-init: defined at example.fs:15
    leds_init
            clrf LATB,0
            movlw 0x1f
            movwf TRISB,0
            return
    END
    

    The assembly code is almost a one-to-one mapping of the Forth code. However, you may notice that the compiler chose to reorder the various parts so that fall-through can be used between Forth words. For example, switch-led0 potentially falls through into switch-led1 because of the btfsc (test one bit and skip the next instruction [return in this case] if the bit is clear).

    However, here we have not used a nice feature of rforth1 which is the automatic inlining of words if the generated code is either smaller or more efficient. With the automatic inlining turned on, we now get:

    ; main: defined at example.fs:26
    main
            movlw 0x84
            movwf T0CON,0
            clrf LATB,0
            movlw 0x1f
            movwf TRISB,0
    _lbl___219
            btg LATB,5,0
            btfsc LATB,5,0
            bra _lbl___220
            btg LATB,6,0
            btfss LATB,6,0
            btg LATB,7,0
    _lbl___220
            movlw LOW(-31250)
            movwf TMR0L,0
            movlw HIGH(-31250)
            movwf (TMR0L+1),0
            bcf INTCON,2,0
    _lbl___222
            btfsc INTCON,2,0
            bra _lbl___219
            bra _lbl___222
    END
    

    Isn't that nice? You can identify the various parts of the code: between main and _lbl___219, you get the timer and ports initialization. Between _lbl___219 and _lbl___220 is the whole logic of led switching. Between _lbl___220 and _lbl___222, the timer is reset in order to wait for 100ms, and the last three lines loop until the timer fires and then go back to the led switching logic.
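
    The led switching logic is in fact a three-bit ripple counter: each led is toggled, and the next one is only touched when the current one has just gone back to 0. The following Python simulation is a sketch written for this explanation (the step function and the list representation are not part of rforth1); it shows that eight successive calls visit the eight possible states.

```python
def step(leds):
    """Simulate one call to switch-led0 on a [led0, led1, led2] list of bits."""
    for index in range(len(leds)):
        leds[index] ^= 1      # bit-toggle
        if leds[index]:       # the bit is now set: no carry, stop here
            break
    return leds

leds = [0, 0, 0]              # state after leds-init clears LATB
states = []
for _ in range(8):
    states.append(tuple(leds))
    step(leds)
```

    After the eighth call, the three leds are off again and the cycle restarts.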

    If you want to try rforth1, get it here, it is free and distributed under the GNU General Public Licence version 2. At this time, it has no documentation at all but comes with several examples that you can use as a template. And people who can understand French can read this tutorial written by one of the rforth1 users.

  10. Tor, plausible deniability, watchdog and seizures (2006-09-11)

    It looks like the German police have recently seized some servers running the TOR anonymity program because the TOR network seems to have been used to anonymously access child pornography. While of course nobody can publicly stand up against such an action, these seizures may compromise the privacy of server owners.

    TOR can be configured in several ways:

    • as a client only, it will transmit encrypted requests to an entry point in the TOR network

    • as an exit point: encrypted packets are decrypted and sent to the targeted server and answers are reencrypted and reinjected into the TOR network to be delivered to the original requestor

    • as a middle-man server: encrypted packets are resent to another TOR server; each server in the middle only sees encrypted packets, and doesn't know where they will be directed once they have reached the immediate next node in the network

    This technique is called Onion Routing, because the sender builds the packets by encrypting them successively for the last, next-to-last, ... relay in the chain. It also provides a response block that will be used to send the answer back and thus establish a stateful TCP connection (a circuit).
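
    The layering itself can be illustrated with a toy Python model. Everything in it is an assumption made for the sake of the example: the XOR keystream "cipher" is not secure and has nothing to do with the cryptography or packet format that TOR really uses, the relay names and keys are invented, and the response block is not modeled. It only shows that each relay can remove one layer and learn nothing but the next hop.

```python
import hashlib

def toy_cipher(key, data):
    """XOR data with a SHA-256 based keystream (toy cipher, NOT secure)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(4, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def wrap(message, route, destination):
    """Build the onion: encrypt for the last relay first, then work backwards."""
    packet, next_hop = message, destination
    for name, key in reversed(route):
        header = len(next_hop).to_bytes(1, "big") + next_hop
        packet = toy_cipher(key, header + packet)
        next_hop = name
    return packet

def peel(packet, key):
    """Remove one layer: return (next hop, inner packet)."""
    plain = toy_cipher(key, packet)
    size = plain[0]
    return plain[1:1 + size], plain[1 + size:]

route = [(b"relay-a", b"key-a"), (b"relay-b", b"key-b"), (b"exit-c", b"key-c")]
onion = wrap(b"GET / HTTP/1.0", route, b"example.org")

hops = []
packet = onion
for _, key in route:
    next_hop, packet = peel(packet, key)
    hops.append(next_hop)
```

    In this model, relay-a only learns that the packet must go to relay-b, relay-b only learns about exit-c, and only the exit point sees the destination and the original request.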

    The TOR servers that have been seized by the German police were probably exit points. Even if by default a TOR server keeps no log of the packets it transmits, it may be possible (e.g., if the access provider keeps extensive traces) to go to the previous hop in the chain, meaning that more servers will be seized and searched. Since the length of the chain is freely chosen by the emitting TOR node, examination of a node is the only solution for the police to know whether the suspected computer is the request originator or only acted as a relay.

    The problem is that a machine acting as a TOR server may well host private data, totally unrelated to this investigation. It can also host hidden TOR services. Those services are only accessible from within the TOR network. Their real location is unknown, even to other TOR operators (even if some researchers claim that they are able to get partial information by warming up the server CPU and measuring the induced clock jitter).

    By searching a seized server, the police may find a hidden service, be it legal or not, thus compromising the anonymity of such a hidden node. What is the best way to avoid that? How can you still hide your hidden TOR services even if the police get your server and if you are obliged by law to reveal your encryption keys? In general, how can you keep your data private? I can think of a few solutions that, combined together, should make it possible to better protect one's privacy:

    • use an encrypted filesystem with plausible deniability, such as FreeBSD GBDE or David McNab's PhoneBook (probably unmaintained): with such a filesystem, you can get many different encrypted volumes whose number and capacity are unknown to the observers; you may reveal some or all of the keys, they have no way to tell

    • use encrypted swap: what good is it to use an encrypted filesystem if some service traces can be found in your previously used swap partition?

    • use a watchdog program that reboots your computer whenever an IP address (your nearest router) stops answering pings: it is easy to imagine that forensics experts may want, in some cases, not to pull the plug on your machine; as more and more dedicated servers are powered from small power outlets, it is easy to take one away without interrupting its power by switching it to a battery; of course, a laptop could play the role of the gateway as well, but that is in my opinion less likely; anyway, if your router doesn't work, your machine probably isn't very useful, and by rebooting it you automatically unmount the encrypted filesystems as well
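
    The last item could be sketched as follows in Python. This is an illustration under several assumptions, not a hardened tool: it relies on a Linux-style ping command accepting -c and -W, the function and parameter names are invented, and tolerating a few consecutive failures before acting (max_failures) is my own addition to avoid rebooting on a single lost packet.

```python
import subprocess
import time

def gateway_alive(address):
    """Send a single ping and report whether an answer came back."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

def watch(address, on_lost, max_failures=3, interval=5.0,
          is_alive=gateway_alive, sleep=time.sleep):
    """Call on_lost() once the gateway misses max_failures pings in a row."""
    failures = 0
    while failures < max_failures:
        failures = 0 if is_alive(address) else failures + 1
        if failures < max_failures:
            sleep(interval)
    on_lost()

# Possible use as root, with a hypothetical gateway address:
#
#     watch("192.0.2.1", on_lost=lambda: subprocess.run(["reboot"]))
```

    Keeping the reaction in a separate on_lost callback makes it possible to try the loop safely before wiring it to an actual reboot.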

    Those actions should let you host legal hidden services with less risk that they are discovered by side-effect of an unrelated police operation. At system boot time, your TOR-enabled server would start using a default relay-only configuration. When you log in and mount the filesystems, you can then restart it with the relay and hidden service configuration.

    Note that I do not recommend that you use those techniques to hide illegal activities or that you fail to comply with law enforcement agencies when you have to do so, just as the authors of GBDE or PhoneBook do not condone such activities. I am only trying to show how one can protect TOR hidden services or private data if those services or data are not the goal of the current investigation.