Date: Apr 8, 2013 7:05 PM
Author: kj
Subject: Label-based addressing

Let me start with a hypothetical (but representative) use-case.

Suppose I have data stratified over 5 brands of beer, 8 yearly
quarters, 4 age groups, and 50 states. Furthermore, let's say that
for each possible combination of beer brand, yearly quarter, age
group, and state, I have computed a heterogeneous k-vector of
statistics (e.g. total consumption, consumption per capita, etc.).
Therefore, all told, I'm talking about a four-dimensional
(5 x 8 x 4 x 50) array of k-vectors.
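
To make the layout concrete, here is a minimal sketch. It assumes
every statistic is numeric, so that the k statistics can sit along a
fifth array dimension; a truly heterogeneous vector would need a cell
or struct array instead. The value k = 3 and the random data are
placeholders.

```matlab
% Sketch only: k and the random values are placeholders, not real data.
k = 3;                            % assumed number of statistics per cell
D = rand(5, 8, 4, 50, k);         % brand x quarter x age x state x statistic

% Plain numeric addressing: the k-vector for brand 2, quarter 7,
% age group 3, state 10, returned as a k-by-1 column.
v = squeeze(D(2, 7, 3, 10, :));
disp(size(D))
disp(size(v))
```

This is exactly the kind of numeric addressing I would like to hide
behind labels.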

I would like to store all this data in an object that would allow
me to address subsets of it using *labels* instead of numeric
indices. By "labels" I mean the names of the "dimensions" (in this
case "brand", "quarter", "age", "state"), the names of the possible
values for each dimension (e.g. for state, I'd have "AK", "AL",
"AR", ..., "WI", "WY"), and the names of the components of the
vector of data associated with each combination of factors (in this
case these would include "total consumption", "consumption per
capita", etc.).

For example, supposing that the variable D held such a data structure,
label-based addressing would allow me to use something like

    E = D.extract('state', 'AK', 'age', '30-45')

to store in E the subset of the data in D corresponding to the
specified values. (The value in E, by the way, would be a similar
data structure, but it would have a different shape, since two of
its dimensions now have the minimum extent of 1.)
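
In terms of a plain numeric array, the result I expect looks roughly
like the sketch below. The age brackets, the shortened three-state
list, k = 3, and the random data are all assumptions made up for
illustration, and the label lookup is done by hand with strcmp.

```matlab
% Illustrative labels only; the age brackets and state list are assumptions.
ageLabels   = {'18-29', '30-45', '46-64', '65+'};
stateLabels = {'AK', 'AL', 'AR'};            % shortened from 50 for brevity
k = 3;                                       % assumed number of statistics
D = rand(5, 8, numel(ageLabels), numel(stateLabels), k);

% Translate each label into its numeric position along its dimension.
iAge   = find(strcmp(ageLabels, '30-45'));
iState = find(strcmp(stateLabels, 'AK'));

% Keep everything along brand, quarter, and statistic; fix age and state.
E = D(:, :, iAge, iState, :);
disp(size(E))                                % 5 8 1 1 3
```

The age and state dimensions are still there in E, but each has
collapsed to a single entry.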

One can imagine many elaborations of this theme, but I think the
above gives a flavor of what I have in mind.

It is my understanding that MATLAB does not have this functionality.
(But please correct me if I'm wrong!) Therefore I'd have to
implement it myself.

My naive idea is to define an object (class, that is) that internally
stores one or more n-dimensional "boxes" for the data (as standard
MATLAB arrays), as well as hash tables to map between labels and
numeric indices. Methods like "extract" above would convert their
arguments to numeric addresses, and would use these numeric addresses
to fetch the data from the internal data boxes.
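
As a rough sketch of this idea, written as a plain script rather than
a class: containers.Map serves as the hash table, and the dimension
names, labels, sizes, and the name/value query format are all
placeholders I made up for illustration.

```matlab
% Sketch only: every name, label, and size here is an illustrative assumption.
dimNames = {'brand', 'quarter', 'age', 'state'};
labels   = { {'b1', 'b2'}, ...
             {'2012Q1', '2012Q2', '2012Q3'}, ...
             {'18-29', '30-45', '46+'}, ...
             {'AK', 'AL', 'AR'} };
k   = 2;                                  % assumed number of statistics
box = rand(2, 3, 3, 3, k);                % the internal data "box"

% One hash table per dimension, mapping label -> numeric index.
maps = cell(1, numel(dimNames));
for d = 1:numel(dimNames)
    n = numel(labels{d});
    maps{d} = containers.Map(labels{d}, num2cell(1:n));
end

% A query given as name/value pairs, as "extract" might receive it.
query = {'state', 'AK', 'age', '30-45'};

% Start from "take everything" and narrow the dimensions that were named.
idx = repmat({':'}, 1, ndims(box));
for q = 1:2:numel(query)
    d = find(strcmp(dimNames, query{q})); % which dimension is being addressed
    m = maps{d};
    idx{d} = m(query{q + 1});             % label -> numeric index
end

E = box(idx{:});                          % fetch from the internal box
disp(size(E))                             % 2 3 1 1 2
```

A real class would presumably wrap the box, the maps, and the lookup
loop behind a method, and would also need to handle unknown labels.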

I am fairly new to MATLAB, so I'd appreciate your comments on the
above, and any words of wisdom you may give me on how best to do
this.

In particular, are there any existing packages I could use as
*models* for this sort of project?

Thanks in advance!

kj