On Sat, 23 Mar 2013 11:22:57 -0700, Dieter von Holten wrote:
> lets assume i have a set of lists (4 letter words...), like below:
> (1) abcd, aabb, abda, dbdb, bbca, afah, ...
> when i swap two letters, like a<->b, i get another set of lists:
> (2) bacd, bbaa, badb, dada, aacb, bfbh, ...
> or, by swapping a<->c in (1):
> (3) cbad, ccbb, cbdc, dbdb, bbac, cfch, ...
> the lists look different, but they share the same structure
> (4) 1234, 1122, 1241, 4242, 2231, 1516, ...
...[snip other examples]...
> the letters have no (implicit) value, they are just different symbols.
>
> any ideas ? what shall i google?

Or, this problem might be easier in the average case. I didn't understand examples 5 and 6 (snipped for brevity), but suppose the word lists are related only by reordering of words and a simple one-to-one substitution of letters. If so, testing for isomorphism can probably be done quickly by canonicalizing each word list and comparing the results.

E.g., replace the most common character by 1, the next most common by 2, etc. When a tie occurs, break it by finding which of the tied characters is most common as the first character, or if that ties, most common as the 2nd character, or most common in the next combination of columns to be considered, etc. As characters are named in this way, remove them from future counting.

This approach serves to canonicalize examples 1-4, but presumably there are examples it doesn't work for (for instance, lists in which two characters tie on every count considered).
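
A rough Python sketch of this idea follows. It is a simplified reading of the heuristic rather than a definitive implementation: each character gets a signature made of its total count followed by its count in each column, characters are numbered in descending signature order, and the relabeled words are then sorted. The step of removing already-named characters from later counts is left out, and combinations of columns are not tried. The names canonicalize and unambiguous are made up for this illustration. When two characters share a signature the tie is not really resolved, so the function reports that and the result should not be trusted as canonical.

```python
from collections import Counter


def canonicalize(words):
    """Relabel characters by frequency rank and sort the relabeled words.

    Returns (canon, unambiguous). `canon` is a sorted tuple of words, each
    word a tuple of integer labels. `unambiguous` is False when two
    characters tied on every count, in which case `canon` may depend on
    the input labeling and is not a reliable canonical form.
    """
    if not words:
        return (), True

    width = max(len(w) for w in words)
    # Total occurrences of each character over the whole list.
    total = Counter(ch for w in words for ch in w)
    # Occurrences of each character in column 0, column 1, ...
    by_column = [Counter(w[i] for w in words if i < len(w))
                 for i in range(width)]

    def signature(ch):
        # Total count first, then per-column counts as tie-breakers.
        return (total[ch],) + tuple(col[ch] for col in by_column)

    ranked = sorted(total, key=signature, reverse=True)
    signatures = [signature(ch) for ch in ranked]
    unambiguous = len(set(signatures)) == len(signatures)

    label = {ch: i + 1 for i, ch in enumerate(ranked)}
    canon = tuple(sorted(tuple(label[ch] for ch in w) for w in words))
    return canon, unambiguous


if __name__ == "__main__":
    list1 = ["abcd", "aabb", "abda", "dbdb", "bbca", "afah"]
    list2 = ["bacd", "bbaa", "badb", "dada", "aacb", "bfbh"]  # a<->b
    c1, ok1 = canonicalize(list1)
    c2, ok2 = canonicalize(list2)
    print(ok1 and ok2 and c1 == c2)
```

Two lists are then treated as isomorphic when both results are flagged unambiguous and the canonical tuples are equal. If the flag is False, the comparison is inconclusive and something stronger (e.g. trying the possible orderings of the tied characters) would be needed.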