Date: Apr 8, 2013 7:05 PM
Author: kj
Subject: Label-based addressing




Let me start with a hypothetical (but representative) use-case.
Suppose I have data stratified over 5 brands of beer, 8 yearly
quarters, 4 age groups, and 50 states. Furthermore, let's say that
for each possible combination of beer brand, yearly quarter, age
group, and state, I have computed a heterogeneous k-vector of
statistics (e.g. total consumption, consumption per capita, etc.).
Therefore, all told, I'm talking about a four-dimensional
(5 x 8 x 4 x 50) array of k-vectors.

I would like to store all this data in an object that would allow
me to address subsets of it using *labels* instead of numeric
indices. By "labels" I mean the names of the "dimensions" (in this
case "brand", "quarter", "age", "state"), the names of the possible
values for each dimension (e.g. for state, I'd have "AK", "AL",
"AR", ..., "WI", "WY"), and the components of the vector of data
associated with each combination of factors (in this case these
would include "total consumption", "consumption per capita", etc.).

For example, supposing that the variable D held such a data structure,
label-based addressing would allow me to use something like

E = D.extract(state="AK", age="30-45")

to store in E the subset of the data in D corresponding to the
specified values. (The value in E, by the way, would be a similar
data structure, but it would have a different shape, since two of
its dimensions would now have the minimum extent of 1.)

One can imagine many elaborations of this theme, but I think the
above gives a flavor of what I have in mind.

It is my understanding that MATLAB does not have this functionality.
(But please correct me if I'm wrong!) Therefore I'd have to
implement it myself.

My naive idea is to define an object (class, that is) that internally
stores one or more n-dimensional "boxes" for the data (as standard
MATLAB arrays), as well as hash tables to map between labels and
numeric indices. Methods like "extract" above would convert its
arguments to numeric addresses, and would use these numeric addresses
to fetch the data from the internal data boxes.
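
To make the idea concrete, here is a rough sketch of the lookup step,
written as a plain script rather than a class. Everything in it is an
assumption for illustration only: the variable names (dataBox, dimIndex,
ageIndex), the choice k = 2, the random placeholder data, and the age
group labels other than "30-45". It uses containers.Map for the hash
tables and handles just the "age" dimension; the other dimensions would
each get a map of their own in the same way.

```matlab
% Hypothetical sketch: one 5 x 8 x 4 x 50 x k data box plus label maps.
k = 2;                                  % assumed number of statistics
dataBox = rand(5, 8, 4, 50, k);         % placeholder data, not real values

% Map each dimension name to its axis number in dataBox.
dimNames = {'brand', 'quarter', 'age', 'state', 'stat'};
dimIndex = containers.Map(dimNames, 1:numel(dimNames));

% Map each age-group label to its position along the age axis.
ageLabels = {'18-29', '30-45', '46-65', '65+'};   % made-up labels
ageIndex  = containers.Map(ageLabels, 1:numel(ageLabels));

% Start from "keep everything" and narrow the requested dimension.
subs = repmat({':'}, 1, numel(dimNames));
subs{dimIndex('age')} = ageIndex('30-45');

% Fetch the subset; the age dimension now has extent 1.
E = dataBox(subs{:});
disp(size(E))
```

In a real class, I imagine "extract" would loop over its name/value
arguments, fill in subs as above, and return a new object that wraps the
resulting array together with the correspondingly trimmed label maps.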

I am fairly new to MATLAB, so I'd appreciate your comments on the
above, and any words of wisdom you may give me on how best to do
this.

In particular, are there any existing packages I could use as
*models* for this sort of project?

Thanks in advance!

kj