Janne In Osaka: December 2011

Friday, December 30, 2011

Badger Problems?

The title of this recent paper in PLoS One says it all, really: Effectiveness of Biosecurity Measures in Preventing Badger Visits to Farm Buildings. What is a paper on badger control doing in a highly respected science publication?

For one thing, it shows us an important fact about science: Science is a method. It is not about what you study, but about how you study it. Preventing badger incursions is legitimate science when you go about finding out in the proper way.

What is that proper way? There are many descriptions out there, focusing on philosophical or practical aspects. But the gist really is that you find answers to questions without getting fooled: fooled by bad data, fooled by badly posed questions or fooled by your own biases and expectations. It boils down to making sure that your results are real, not an artifact of noisy data or influenced by what you wish were true.

This paper is a well-done bit of science, and it also addresses a real problem. Badgers visit farms, presumably to find stuff to eat, and when they do they can infect the cattle with a version of tuberculosis (bovine TB). Badgers and cows don't come into direct contact very often, so the disease probably spreads indirectly, through badger faeces and urine. Stopping badger visits seems like a good idea.

But farms are big and badgers are small. Stopping them altogether is difficult and expensive. It would be great if you could use simple ways to just keep badgers out of the cattle feed storage, where the risk of spreading infection is greatest. But you also want to know what happens if you do that. Say that you stop badgers from getting into the feed but the badgers simply spend more time where the cows sleep instead, then you may have gained nothing.

The authors set up surveillance cameras on a number of farms. For a whole year they simply recorded the number of badger visits to various areas on the farms. Then they picked farms with badger visits (not all of them had a badger problem) and divided them into three groups: one got badger barriers around the cattle feed area, one around the sleeping area and one got both. Then they spent another year recording (and analysing all those hours of recording must have been painful for some poor graduate student) and analysed the results.

What did they find? All barriers were effective in reducing badger visits, and barriers around one area reduced, not increased, visits to other areas as well. Barriers around both areas were most effective, but the feed area barriers were almost as effective.

Think of what they did to get to that result. They decided at the outset what question to ask and what would be an acceptable kind of answer. They first just recorded visits so that they'd know the baseline of visits, and they spent a whole year doing so, since the badger frequency likely changes over the seasons. Then they changed things systematically (adding barriers to one area, the other or both, covering all possibilities) and recorded for another year.

And they were careful to set precise criteria for the recording and analysis beforehand (what does and does not count as a badger visit, what to do with the data if a camera doesn't work part of the night, if a barrier was down, that sort of thing) so that their own expectations wouldn't influence the analysis. They were careful to use statistics to find out how likely it was that the changes they saw were real and not just due to chance.

Whether you're looking for badgers or bosons, it's this painstaking attention to removing errors, uncertainty and biases that makes it science.

Thursday, December 29, 2011

While I was otherwise preoccupied…

We have been busy with family issues lately, but the fast-paced, exciting world that is Japanese politics never stops. Very briefly, the Noda cabinet is trying to push through a consumption tax increase; as a result of that as well as a number of other issues, a number of lower house DPJ members have quit the party in protest.

Corey Wallace argues that without a functioning LDP, the DPJ is losing its reason to exist and is in real danger of splitting up. It's difficult for me to argue with that. The DPJ started as the not-LDP party, and has no ideological compass to fall back on now that the LDP is defeated. The greater question is whether a more viable set of parties will ever emerge to take the place of the twin trainwrecks that currently vie to not-really-rule Japan. Right now I am not optimistic.

But this is the New Year holidays. Forget about politics for a while. I am going to spend my time enjoying the company of my wife, with food and drink and lazy strolls in the sunshine. I suggest you do the same¹.

--

#1 With your own wife of course; mine is already occupied. Or your own husband. Or significant other in general. A beloved pet is nice too. Just don't take your goldfish for a walk; they don't appreciate it.

Monday, December 26, 2011

Guile II

I've started learning Scheme, and my system of choice is Guile, a version of Scheme designed to be used as a scripting language for Unix-like operating systems. I'm using the older 1.8 version that is available in Ubuntu right now. The new, much improved 2.0 version will be available in the next Ubuntu version but for now 1.8 is plenty for me to learn the basics.

I have been programming in one way or another for all of my adult life. My experience ranges from embedded systems to very large machines and in a variety of languages. I have never really used Scheme, though, and never tried to work in a largely functional programming language before. The code and all the explanations here are the work of a beginner trying to learn about Scheme programming, and they're bound to contain a lot of bad coding style, suboptimal solutions, misconceptions and outright errors.

So why do I do this? The most effective way to learn something is perhaps to explain it to others. It forces you to make everything explicit, and will expose any doubts or misunderstandings you may have. Now, I don't expect anyone to actually follow these posts in any detail; you, collectively, are the imaginary audience I need for me to write this.

Guile is Scheme. What does Scheme look like? Here's a bit of Scheme code:

;; create a list of values from low to high, inclusive
(define (range low high)
    (if (> low high)
      '()
      (cons low (range (+ low 1) high))))

This defines a function called range that will give you a list of numbers, from low to high:

>(range 1 5)

=> (1 2 3 4 5)

This is a pretty useful function, and not surprisingly a variant (named iota for obscure reasons) is available in a Guile library already, along with a more powerful function called list-tabulate that lets you use any kind of list-generating function.
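A minimal sketch of those library versions, assuming the SRFI-1 list library that ships with Guile. The argument order for iota (count first, then an optional start value) follows SRFI-1; the name range-iota and the squares example are made up here for illustration.

```scheme
;; iota and list-tabulate are part of the SRFI-1 list library
(use-modules (srfi srfi-1))

;; (iota count start): count numbers, beginning at start
(iota 5 1)
;; => (1 2 3 4 5)

;; list-tabulate calls the given function with 0, 1, ... count-1
(list-tabulate 5 (lambda (i) (* i i)))
;; => (0 1 4 9 16)

;; range written in terms of iota (hypothetical helper);
;; the max keeps the count at zero when low is larger than high
(define (range-iota low high)
  (iota (max 0 (+ 1 (- high low))) low))
```

Note that iota wants a count rather than an upper limit, which is why the helper converts between the two.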

First, note that there's plenty of parentheses. Lots of them, in fact. Both programs and data use the same notation, and it's quite easy to treat a program as data or the other way around. On the other hand, all those parentheses get quite confusing, so a good editor that shows you where each pair belongs is really helpful if you're going to write Scheme code. I use Vim; other people like Emacs.

Scheme, like LISP, uses prefix notation. That is, first we give the operation, then the data to operate on. So if we want to add two numbers we'd do (+ 2 2), and the expression would be replaced with 4¹. The if-statement comparison shows the same order: if (> low high) would be if (low>high) in many other languages.

Simple expressions like numbers or variables can stand by themselves. Any operator or function has to be the first element in a list, followed by its parameters. When the function returns, the whole thing is replaced by the result. Makes sense, I think, as that's the only way to know what arguments belong to an operator. The range call in the last line, (range (+ low 1) high), has two parameters, where the variable high simply gets replaced with its value, while (+ low 1) is an expression with operator + and two arguments for + to add.

The fundamental way to loop in Scheme is by recursion. We wrap the code we want to loop over in a function². We repeat the body of this function by calling the function again at the end. So we call range with two parameters low and high. As long as low is not larger than high, we make a list pair with low as the first element, and the result of calling range with the next value of low and the same high as the rest. Once low has passed high we return back up and build the list on the way. Proper lists end with the empty list value, so we return '() as the final element ("'" is a quote, so the list is not evaluated).

This sounds inefficient perhaps, but it is not. When the recursive call happens at the end this becomes just as fast and efficient as a regular loop. Scheme has other ways to create loops, such as the standard do form and Guile's while, but conceptually they are all defined in terms of tail recursion like this.

This is what a non-recursive version could look like, using a while loop:

;; list range function, iterative version
(define (range-iter low high)
  (let ((acc '()))
    (while (<= low high)
      (set! acc (cons high acc))
      (set! high (1- high)))
    acc))

The let statement creates new local variables. We need an accumulator to store our list, so we start by setting it to the empty list. Then, while low is no larger than high, we add the current high value to the front of acc with cons, store that new list in acc again, and decrement high. Starting from the empty list means that a call with low larger than high returns '(), just like the recursive version. The final expression is acc, which gets substituted with the list it contains and becomes the return value of the function.

To me this is perhaps easier to understand, but it's not as elegant as the first version above, and it was more difficult for me to get right.
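For comparison, here is a sketch of the same iteration using do, which unlike while is part of standard Scheme. The name range-do is made up for this example. Each loop variable gets an initial value and a step expression, and all steps are computed from the old values before any variable is updated, so (cons hi acc) still sees the hi of the current round.

```scheme
;; list range function, iterative version using do
(define (range-do low high)
  (do ((hi high (- hi 1))          ; count down from high
       (acc '() (cons hi acc)))    ; push the current hi onto the list
      ((< hi low) acc)))           ; stop once hi drops below low; return acc

(range-do 1 5)
;; => (1 2 3 4 5)
```

There is no set! here: the loop variables are rebound on every round instead of being modified in place.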

Now, I said that when the recursive call happens at the end, Scheme can optimize it so it's just as efficient as a regular loop. When the recursive call happens last there is nothing left to do at that level. Scheme doesn't have to keep track of each level, and can return directly to the top at the end. But if you look at the first version, the range is not, in fact, the last statement; cons is. The first version is not properly tail-recursive in other words. Something like

> (range 1 50000)

will fail with a (stack-overflow) error. Scheme has to keep track of each level so it can do the cons at the end. We have to rearrange things so that the cons happens on the way down, not when going back up. We assemble the next part of the list and send it along to the next iteration with an extra parameter. We also define a wrapper function without the extra value just to make it neat — remember, functions are cheap.

;; list range function, properly tail-recursive
(define (inner-range low high acc)
  (if (> low high)
    acc
    (inner-range low (- high 1) (cons high acc))))

(define (range low high)
    (inner-range low high '()))

This works as expected. It turns out though, that the iterative while version is actually slightly but consistently faster than the recursive one for calls up to about two million elements. At that point both versions start slowing down (memory allocation and management start to become a real problem), but the recursive one slows down more. The standard defines its loop forms in terms of recursion (while itself is a Guile addition rather than standard Scheme), but I guess in practice it's implemented and optimized separately in the interpreter for efficiency.

Note that this is for the ageing 1.8 version of Guile specifically; the newer 2.0 version of Guile or other Scheme implementations may well have faster recursion. And in practice, a 10% difference in speed isn't very important compared to readability and code clarity. If speed really is critical, you're not coding in a dynamic language anyway.
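One more variation, as a sketch: a named let keeps the helper loop inside the function, so the accumulator needs no separate wrapper function. The names range-named and loop are my own choices for this example; named let itself is standard Scheme.

```scheme
;; list range function, tail-recursive with a named let
(define (range-named low high)
  (let loop ((hi high) (acc '()))   ; loop is a local function of hi and acc
    (if (> low hi)
        acc
        (loop (- hi 1) (cons hi acc)))))  ; tail call: nothing left to do after it

(range-named 1 5)
;; => (1 2 3 4 5)
```

Since the call to loop is the last thing that happens, a long range such as (range-named 1 50000) should not run into the stack limit.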

--

#1 We normally use infix notation, where we put the operator between the values: 2 + 2. Why prefix notation? There's a number of reasons, but one is that you're not limited to two values; you can do (+ 1 2 3 4 5 6 7) and get 28 in one go. Also, it makes all operators and functions behave the same, which I guess is sort of important for the kind of person that keeps their canned goods sorted and arranges the family toothbrushes by color and size.

Oh, and is there a postfix notation too? Oh, yes indeed there is, and it's surprisingly useful and easy to work with. I don't care much for prefix notation, but postfix is my favourite way of doing calculations; I guess it mimics the way we do arithmetic in our heads already.

#2 Functions are really cheap and easy in Scheme. Defining lots of them at every turn seems to be quite normal, and using more but smaller functions seems to be encouraged. It's called "functional programming" for a reason.

Friday, December 23, 2011

Guile

With an end-of-year project sprint, the Japanese language test and family health issues, I've been feeling a little drained lately. I'd normally relax and recharge by studying Japanese, by doing photography and by writing on this blog. But I want to get away from studying for a while after the JLPT; I don't have the uninterrupted blocks of time needed for camera walks and processing the results; and most blogging feels uncomfortably close to work right now, aggravating my stress rather than reducing it.

So, I need a new interest. Something I can do wherever and whenever, and in small snatches of time. Something I know I enjoy. Something like learning a new programming language¹.

A student programming a robot. Good "programming" illustrations are hard to find.

I've enjoyed programming since before high school, I've worked as a programmer and my undergraduate subject was computer science. All you need is a computer. Most languages have lots of good documentation and tutorials online, and you can bring a book for when a laptop is too much. You can read and practice in short increments whenever you have a bit of time. And programming is just like doing puzzles or word games: you give your mind a challenge, occupy your thoughts for a while and emerge refreshed and happy.

But what language? There has been an explosion of new languages, or new interest in old languages, recently. Haskell, Erlang, OCaml, JavaScript, Dylan, Scala, Go, Groovy, Kawa, Dart, Clojure… They all promise to be hip, webby, cloudy and concurrent, and they all claim to be the Next Big Thing in computing. Hard to choose. Even harder to choose right. Ideally you'd want to know what a language is not good at, but that kind of comparison is rather hard to find. And the Next Big Thing more often than not turns into a Has Been overnight. Popularity is not a good way to choose.

Better to pick something that teaches fundamentals. One thing many, though not all, of the languages above have in common is that they are functional to some degree. Big, established languages like Python and Ruby also support functional programming. So let's go back to basics, I figure. Let's learn Scheme, and specifically Guile.

I know about functional programming of course. And I use functional constructs in Python and other languages from time to time. But I've never tried to really live a functional life as it were, and I don't have an intuitive sense of how to do it right. By learning and using Scheme I will hopefully not just gain a new language but also enrich my programming abilities in general.

Scheme is a LISP-like language, but very minimalist². On one hand, it makes Scheme an easy language to implement, and it has made it a fairly popular scripting language. On the other hand, the base Scheme specification is so spartan it is very difficult to use. Fortunately, like LISP it has excellent support for extending the language with new syntax, so all implementations are fairly full-featured in practice. Small but extensible is the best of both worlds in a way, but as each implementation does things in different ways, Scheme programs aren't very portable between systems. Even seemingly basic things like looping are actually add-ons to the base language.

Guile is meant to be an embedded scripting language for Linux and other Unix-like systems. That is, it's made to be easily connected to low-level code written in C. You can write the main code in the low-level language and use Guile as a simple scripting layer. Or you can make Guile the main part, and use C to extend Guile with fast application-specific functions. Or you can skip the low-level code altogether and just use Guile as a development language in its own right.

As it is easy to connect it to lower-level code there are pretty good bindings to many Linux system libraries; there's support for threading and parallelization; and like most Scheme variants it has a very good numeric stack with built-in, seamless support for exact values (using fractions rather than reals), very large values, complex numbers and more. It is a full-fledged development system, not a toy language.
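A few expressions to illustrate that numeric stack, as a rough sketch. These use only standard Scheme procedures; the particular values are picked just for illustration, and how inexact and complex results are printed may differ between Guile versions, so no output is shown for those.

```scheme
;; exact fractions: no rounding error creeps in
(/ 1 3)
;; => 1/3
(+ 1/3 2/3)
;; => 1

;; convert to a floating-point approximation only when asked
(exact->inexact 1/3)

;; integers grow as large as they need to
(expt 2 100)
;; => 1267650600228229401496703205376

;; complex numbers: build 3+4i and take its magnitude
(magnitude (make-rectangular 3 4))
```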

Guile version 2.0 was released this year with performance improvements, support for compilation, new, better libraries and much more, but for now I will use version 1.8, as it is the current standard version in Ubuntu. The extra features of Guile 2 do not matter while I'm still trying to learn. I thought I'd post about my progress here from time to time.

--

#1 I know I just lost about 90% of you reading this. It's OK; we're all different and so are our interests. Just skip these Guile posts if you're not interested.

#2 Scheme people sometimes brag that the entire Scheme specification has fewer pages than the introduction to the Common LISP standard.

Tuesday, December 13, 2011

Not Exactly The…

…single best week of our lives, and it's only Tuesday. While we wait for the rest of the week to improve, here's a few recent Osaka shots as a pick-me-up.

Young office worker does a bit of paperwork on the road as it were. Crysta shopping mall, Nagahori.

Yodogawa river between Nakanoshima and Umeda.

Hostess on a second-floor hall in Kitashinchi, Umeda.

JR has a new, huge station in Umeda, with malls, exhibition centers and several public plazas. Beautiful, well-designed place.

Platform entrance, JR Umeda station.

View from JR Umeda station. This, by the way, is my new desktop background.

All pictures were taken with a 1970's Minolta SRT-101, using Kodak's new Portra 400 negative film. It takes more time to process and scan, but other than that (a big "other than", to be sure) I find few reasons to recommend a lower-end DSLR over this set-up.

Friday, December 9, 2011

It's All Text

I dislike Google's new version of Google Reader and I'm not alone. The basic problem with their redesign was that button bars and other things took up a lot of vertical space, leaving less than half of the vertical space for the actual text I want to read. They've since re-redesigned it to reclaim some of that lost space but you still lose about a third of the scarce space. I've since turned to another news reader that — while infuriating in other ways — at least gives me all available vertical space for the text.

Gmail has been redesigned too, and it's just as bad. When you reply to a message, less than 40% of the vertical space is available to, you know, actually write your message. Look at this screenshot:

Gmail, while replying to an email. Note how the actual writing area is just a small window in the lower half of the screen.

So why am I not annoyed? Why am I not abandoning Gmail? Shaking my fist impotently against a cold, uncaring sky? Because It's All Text. This is an add-on for Mozilla Firefox that lets you edit any web page input using your own favourite text editor.

Look at that screenshot again. See the yellow-green circle? There's a small button there that says "Edit". Click on it (or choose It's All Text in the right-click menu) and your editor will open with the current contents of this text input area. Write away, in full screen, in your favourite tool. You save the text and the plugin automagically inserts it right into the web page, as if you had typed it in yourself.

This is wonderful. I think you can use almost any text editor you want as long as it can save as normal text (no word-processor, obviously). I use Gvim, but you can use Gedit, Notepad, TextMate, Emacs or whatever you want. Works on Linux, Windows and OS X (though there are apparently a few extra steps for OS X). It's not just the convenience of editing in a real editor. You can save drafts or copies of what you write, and if you leave the editor open until the text is posted your text will be safe even if the website manages to lose your input. If that happens you just reopen the input page and paste your text into the field; no more rewriting your entire text.

The one place it doesn't work right now is on Google+. A few other web sites manage to lose the "edit" button in the corner, but It's All Text is still available and working through the right-click menu. But on most websites it works just beautifully.

Wednesday, December 7, 2011

Sweden

At long last, pictures and notes from our vacation in Sweden this September. Still stuck writing for work, so I'll mostly post images with captions and short fragments. Less to write, less to read. Everybody wins!

Stockholm city hall. The Nobel Prize award dinner is held here every December.

We flew with Finnair from Osaka to Stockholm via Helsinki. It's the most convenient route by far, and a good choice for any destination in northern Europe. On the way to Helsinki I saw Princess Toyotomi. It's a recent movie set in Osaka, based around the idea that an heir of the Toyotomi clan escaped the sacking of Osaka castle some 400 years ago, and the descendants are still secretly living in Osaka as royalty of a secret "Osaka nation".

It's a charming adventure movie. There are some fun characters and an implausible but creative story line. It sags a bit toward the end though; I guess they didn't know how to tie it all up. The main pleasure for me was the city itself; large parts of it are set right around where we live: at the Karahori market street, Osaka castle, the City hall and so on. Fun movie, so see it if you get the chance. Anyway…

Skandia, a movie house in Stockholm, still with the classical logo.

Vasagatan, Stockholm.

We stayed at a hotel right next to the train station. Very convenient, what with all the bags and stuff we brought. First night we ate at a Belgian place along Vasagatan called Duvel Café. Pretty good food and a huge selection of Belgian beers seem to make the place very popular with younger professionals after work. I kept hearing snippets about cloud storage, Android programming and the next new shiny Apple gadget from the people around us. The people at the next table over were enduring a dinner sales pitch about some cross-platform persistent communications library. In a different life, one where I never went to Japan, I might have ended up as one of the regulars here.

I finally got my plankstek. Steak, sauce and mashed potatoes on a wooden board, finished in a hot oven. This restaurant had such huge portions — that, and I'm no longer used to eating so much — that I had to leave almost half of the food.

Sushi in Stockholm. Specifically, sushi in a cheap Thai restaurant next to the station. Hey, it's all Asian, right?

I wanted to try ramen while in Stockholm. The place I wanted to go to was quite expensive; worse, it was only open for dinner. Ramen is cheap lunch food to me, so we gave up on that and had a nigiri-sushi lunch set at Kungshallen food court instead. Not bad but not great; the rice was a little undercooked and the flavours weren't very exciting. The miso soup was good, though, and they managed to serve it together with the sushi, unlike the sushi place we tried in Paris last year.

Norstedts is one of the largest publishers in Sweden, and one of the oldest. Their head office "palace" on Riddarholmen island is arguably one of the best addresses in Sweden.

We're leaving Stockholm by train. As are these uniformed, supposedly hard-to-see gentlemen, apparently catching up on paperwork while waiting for a connecting train. There is no way to avoid it, is there? I bet Genghis Khan would have conquered the rest of the world as well, but after taking Asia he was held up by piles of paperwork that just couldn't wait…

On our way to Borlänge and my parents we stopped by Uppsala to visit friends. The town (an old university city rival to my own Lund) has been blessed with a new concert hall. It's a popular place, apparently, with a good lunch restaurant on street level (and again with huge portions). It's also nicely photogenic. Here you can see the cathedral, the castle and the university library reflected in the top floor corridors.

Borlänge is an industrial town with a large steel mill and a paper mill, located right in the middle of some of the most beautiful woodland area of the country.

One benefit of small towns like this is, nature is close everywhere you go (OK, so it's not always a benefit, exactly - see "TBE"). The Dalälven river goes right through town, and there's lots of small ponds and inlets like this along the banks. This spot along the river is a leisurely ten minutes by foot from the town center.

Another benefit of small, rural towns: plenty of space. This back yard with a barn or storage shed in the background is right in town, a few blocks away from the main road and close to the pond above. You would have to be very wealthy — and very ostentatious — indeed to have a place like this in, say, Osaka city.

It was good to be back in Sweden for a few days. I got to hear my own language for a change, and spend some time in my own culture. But it did feel a little strange for much of the time. I guess culture and language (slang, catchphrases, current events) are beginning to diverge enough from my own remembered experience that it starts to be noticeable and becomes just a little disorienting. It's like hearing a new version of an old favourite tune that you haven't heard in years. Not worse than you remember, just a little different.

More pictures in the Flickr set.

Sunday, December 4, 2011

JLPT

It's the first Sunday in December. Yes, boys and girls, it's time for 日本語能力試験 (nihongo nōryoku shiken), or the "Japanese language proficiency test".

I'm taking level 1. Like I did last year. And the year before. And the year before that. And I expect to fail this year too. I do suspect I'll be decently close; if I am, I may try it for real next summer and study for the test over spring.

I don't really need the test right now, but it would be good to have it done and over with. It's kind of like Pokemon — you want to collect all the levels. And there's a proposal to let a passing JLPT grade improve your chances to get a working visa here; if that proposal is enacted the test becomes fairly important.

Thursday, December 1, 2011

Wikipedia — Donate!

Wikipedia is, in a way, the most important site I know. For my specific line of research I of course lean heavily on research papers and books. For most anything else, including fields close to mine, Wikipedia is usually my first stop¹ for anything I want to know more about. It is of course unreliable and superficial — and traditional encyclopedias are even more so — but it is a great "map of the land" and launch pad for more reliable, detailed information elsewhere.

There is possibly no other single resource on earth that is as influential and as important, whether your interest is in the advancement of natural science or the definitive lineup of characters in One Piece. Wikipedia is run by a non-profit organization and is currently running a donation drive. You could do worse than go over there and donate a few yen, euro, dollars or crowns. After all, you know that you're using Wikipedia far beyond what you pay anyway, right?

--

#1 …and my second, as I click an interesting-sounding link; then my third as I follow another one; then fourth… And suddenly I realize I've spent the last six hours on the site and am now learning all about medieval timekeeping mechanisms or something. The site is not so much saving me time as much as redirecting it.