Saturday, November 12, 2011

Data collection and analysis app, anyone?

Here's a question for people doing data analysis and programming. I'm looking for a tool, and I don't know whether it exists:

I often find myself collecting data over time: temperature data, my weight, baking results, lots of stuff like that. I want to be able to add data quickly and simply on a daily or hourly basis, and do my own exploratory analysis and visualisation. Normally I use a spreadsheet, or hack together a small script to deal with the data, but neither is very convenient.

  • A spreadsheet lets you enter data as it comes in, but both data entry and analysis are clumsy and rudimentary, and you soon hit the wall in what you can do with it. Try to write a spreadsheet that correlates your data with the day of the week, for instance.

  • Octave, R and tools like that are very powerful. But they're not really geared for this kind of simple daily data entry and presentation. They're really about analysing fixed data sets and don't do interactive data collection very well.

  • One-off tools in Ruby or Python will do what I want of course, and in practice it takes less effort than doing this kind of interactive thing in Octave and the like. But it feels like I'm reinventing the wheel every single time.
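To make the day-of-the-week example concrete, here is a minimal sketch of the kind of one-off script I mean, using only the Python standard library. The sample readings and the function name mean_by_weekday are made up for illustration; real data would come from wherever you log it.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Fixed English names so the output does not depend on the system locale.
# date.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical sample readings as (ISO date, value) pairs.
readings = [
    ("2011-11-07", 71.2),
    ("2011-11-08", 71.0),
    ("2011-11-12", 71.9),
    ("2011-11-14", 71.4),
]

def mean_by_weekday(rows):
    """Return {weekday name: mean value} for (ISO date string, number) pairs."""
    groups = defaultdict(list)
    for day, value in rows:
        name = WEEKDAYS[date.fromisoformat(day).weekday()]
        groups[name].append(value)
    return {name: mean(values) for name, values in groups.items()}

if __name__ == "__main__":
    for name, average in mean_by_weekday(readings).items():
        print(f"{name}: {average:.2f}")
```

It is only a dozen lines of real work, which is exactly the point: easy to write once, tedious to rewrite for every new kind of data.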

I'm really looking for a tool somewhere between a completely open-ended scripting environment and a restrictive tool like a spreadsheet; Octave or R, but geared towards interactive, daily data collection rather than extensive analysis of fixed data sets.

Is there such a thing?

If not, it may be time to start thinking about creating it. A spreadsheet-like, but more task-specific, frontend, with a good way to enter new data and a real language to do your data analysis. Bonus for being able to generate a matching data entry component for Android phones (can't sideload apps on iPhone).
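As a rough sketch of what the data entry half could look like underneath such a frontend, assuming a plain CSV file as storage: the function names add_entry and load_entries and the three-column layout are my own placeholders, not an existing library.

```python
import csv
from datetime import datetime
from pathlib import Path

def add_entry(path, value, note="", when=None):
    """Append one timestamped reading to a CSV log, creating the file if needed."""
    when = when or datetime.now()
    path = Path(path)
    is_new = not path.exists()
    with path.open("a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            # Write the header row only when the log is first created.
            writer.writerow(["timestamp", "value", "note"])
        writer.writerow([when.isoformat(timespec="seconds"), value, note])

def load_entries(path):
    """Read the log back as a list of (datetime, float, note) tuples."""
    with Path(path).open(newline="") as f:
        return [
            (datetime.fromisoformat(row["timestamp"]),
             float(row["value"]),
             row["note"])
            for row in csv.DictReader(f)
        ]
```

Something like add_entry("weight.csv", 71.2) would then be the whole daily ritual, and load_entries hands the data to whatever analysis code you like. A task-specific frontend would mostly be a nicer way of calling the first function.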

I'm crossposting this to Google+, and you can of course also reach me by email.

5 comments:

Anonymous said...

Jan,

Check out pachube.com. It may be what you are looking for.

Scott

Mike said...

I have never hit a wall using Excel + pivot table + VBA.

Janne Morén said...

Anon, Pachube looks cool, but it's not addressing the problem I'm facing. I want a way to simply create task-specific ways of entering and recording data. Once I have it, the analysis should fall out naturally.

Mike, those specific tools don't run on Linux; however, a similar spreadsheet-based setup is turning out to be insufficient for me.

After thinking more about this, what I'm really looking for, I think, is a library for Ruby or Python that streamlines the creation of task-specific data entry and analysis, and lets me reuse both analysis methods and data when appropriate.

Douglas Kretzmann said...

I have not in fact used it, since my needs are so simple that LibreOffice meets them all: but Needlebase might be helpful. It's free for limited private use.

See its lead developer use it on school district statistics, here.

Jonas said...

I've been thinking about the same thing. I'd like to start recording more data - in particular spending, work habits, ideas etc. (so a mix of numerical and textual data) - but haven't gotten further than considering the most suitable tool... I think I'm also somewhere between spreadsheet and Python.

Needlebase looks interesting, but perhaps too focused on scraping of public data as opposed to input of personal data?

For a while I was using KeepTrack, a very simple data logging app for Android: http://www.zagalaga.com/index.html
It only tracks specific types of data, plots them and exports them as XML. Probably not sufficient for you, but if you decide to roll your own system, it might work as the Android front-end.