kj


Label-based addressing
Posted:
Apr 8, 2013 7:05 PM


Let me start with a hypothetical (but representative) use case. Suppose I have data stratified over 5 brands of beer, 8 yearly quarters, 4 age groups, and 50 states. Furthermore, let's say that for each possible combination of beer brand, yearly quarter, age group, and state, I have computed a heterogeneous k-vector of statistics (e.g. total consumption, consumption per capita, etc.). All told, then, I'm talking about a four-dimensional (5 x 8 x 4 x 50) array of k-vectors.
I would like to store all this data in an object that would allow me to address subsets of it using *labels* instead of numeric indices. By "labels" I mean the names of the "dimensions" (in this case "brand", "quarter", "age", "state"), the names of the possible values for each dimension (e.g. for state, I'd have "AK", "AL", "AR", ..., "WI", "WY"), and the names of the components of the vector of data associated with each combination of factors (in this case these would include "total consumption", "consumption per capita", etc.).
For example, supposing that the variable D held such a data structure, label-based addressing would allow me to use something like
E = D.extract(state="AK", age="30-45")
to store in E the subset of the data in D corresponding to the specified values. (The value in E, by the way, would be a similar data structure, but it would have different shape, since two of its dimensions now have the minimum depth of 1.)
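For comparison, here is roughly what the same extraction looks like with a plain numeric array. This is only a sketch under assumptions of my own: I store the k statistics along a fifth dimension with k = 3, fill the array with random numbers, and pretend that "AK" is state number 1 and that the requested age group is group number 2.

```matlab
% Assumed layout: brand x quarter x age x state x statistic, with k = 3.
k = 3;
D = rand(5, 8, 4, 50, k);   % random stand-in for the real statistics

% Hypothetical positions of the two requested labels.
ageIdx   = 2;
stateIdx = 1;

% Plain numeric indexing; the two fixed dimensions stay as singletons.
E = D(:, :, ageIdx, stateIdx, :);

disp(size(E))   % the size vector is [5 8 1 1 3]
```

The drawback is that I have to remember both the order of the dimensions and the position of every label, which is exactly what I would like the object to do for me.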
One can imagine many elaborations of this theme, but I think the above gives a flavor of what I have in mind.
It is my understanding that MATLAB does not have this functionality. (But please correct me if I'm wrong!) Therefore I'd have to implement it myself.
My naive idea is to define an object (class, that is) that internally stores one or more n-dimensional "boxes" for the data (as standard MATLAB arrays), as well as hash tables to map between labels and numeric indices. Methods like "extract" above would convert its arguments to numeric addresses, and would use these numeric addresses to fetch the data from the internal data boxes.
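To make the idea concrete, here is a rough script-level sketch of the lookup step, using containers.Map objects as the hash tables. It is not wrapped in a class yet, and everything named here is an assumption for illustration: the placeholder labels produced by makeLabels, the age-group labels, the fifth "stat" dimension with k = 3, and the random data.

```matlab
% Sketch only: label values and k = 3 are placeholders, not real data.
makeLabels = @(prefix, n) arrayfun(@(i) sprintf('%s%d', prefix, i), 1:n, ...
                                   'UniformOutput', false);

dimNames = {'brand', 'quarter', 'age', 'state', 'stat'};
labels = {makeLabels('brand', 5), makeLabels('Q', 8), ...
          {'18-29', '30-45', '46-64', '65+'}, makeLabels('state', 50), ...
          {'total', 'percapita', 'other'}};
labels{4}{1} = 'AK';            % pretend the first state is Alaska

box = rand(5, 8, 4, 50, 3);     % the internal data "box"

% One hash table for dimension names, plus one per dimension for its labels.
dimIndex   = containers.Map(dimNames, 1:numel(dimNames));
labelIndex = cell(1, numel(dimNames));
for d = 1:numel(dimNames)
    labelIndex{d} = containers.Map(labels{d}, 1:numel(labels{d}));
end

% What an "extract" method might do with name/value pairs.
query = {'state', 'AK', 'age', '30-45'};
subs  = repmat({':'}, 1, ndims(box));   % default: keep every dimension whole
for q = 1:2:numel(query)
    d = dimIndex(query{q});             % dimension name -> dimension number
    m = labelIndex{d};
    subs{d} = m(query{q + 1});          % label -> numeric index
end
E = box(subs{:});                       % same as box(:, :, 2, 1, :)
```

A real class would also have to carry the label tables over to the extracted result, so that E can itself be addressed by label; this sketch only returns the raw numeric array.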
I am fairly new to MATLAB, so I'd appreciate your comments on the above, and any words of wisdom you may give me on how best to do this.
In particular, are there any existing packages I could use as *models* for this sort of project?
Thanks in advance!
kj

