


Re: how to extract the structure....
Posted: Mar 24, 2013 2:24 AM


On Sat, 23 Mar 2013 11:22:57 -0700, Dieter von Holten wrote:
> lets assume i have a set of lists (4 letter words...), like below:
> (1) abcd, aabb, abda, dbdb, bbca, afah, ...
> when i swap two letters, like a<>b, i get another set of lists:
> (2) bacd, bbaa, badb, dada, aacb, bfbh, ...
> or, by swapping a<>c in (1):
> (3) cbad, ccbb, cbdc, dbdb, bbac, cfch, ...
> the lists look different, but they share the same structure
> (4) 1234, 1122, 1241, 4242, 2231, 1516, ...
...[snip other examples]...
> the letters have no (implicit) value, they are just different symbols.
>
> any ideas ? what shall i google?

It might be that this problem can be put into correspondence with the graph isomorphism problem, whose computational complexity is not known and is an important open question. See <http://en.wikipedia.org/wiki/Graph_isomorphism_problem>.

Or, this problem might be easier in average cases. I didn't understand examples 5 and 6 (snipped for brevity), but suppose the word lists are related by word-reordering and simple alphabetic substitution. If so, testing for isomorphism can probably be done quickly by canonicalizing each word list. E.g., replace the most common character by 1, the next most common by 2, etc. When a tie occurs, break it by finding which of the tied characters is most common as a first character, or if that ties, most common as a 2nd character, or most common in the next combination of columns to be considered, etc. As characters are named in this way, remove them from future counting. This approach serves to canonicalize examples 1-4, but presumably there are examples it doesn't work for.

 jiw
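
The canonicalization idea above might be sketched in Python roughly as follows. This is a simplified reading of the post, not a definitive implementation. It ranks each symbol by its total count, then by its count in column 1, column 2, and so on. It does not try combinations of columns or re-count after each symbol is named. The function names (symbol_profile, canonical_form, same_structure) are made up for illustration. Sorting the renamed words at the end is an added assumption, so that word order does not matter. If two symbols still tie after all columns are compared, the sketch raises an error instead of guessing.

```python
def symbol_profile(words, symbol):
    """Return (total count, count in column 0, count in column 1, ...)."""
    width = max((len(w) for w in words), default=0)
    total = sum(w.count(symbol) for w in words)
    per_column = tuple(
        sum(1 for w in words if len(w) > col and w[col] == symbol)
        for col in range(width)
    )
    return (total,) + per_column


def canonical_form(words):
    """Rename symbols to 1, 2, 3, ... by descending profile, then sort the words.

    Raises ValueError when two symbols have identical profiles, because
    this simplified tie-breaking cannot tell them apart.
    """
    symbols = set("".join(words))
    profiles = {s: symbol_profile(words, s) for s in symbols}
    if len(set(profiles.values())) < len(symbols):
        raise ValueError("unresolved tie between symbols")
    # Most common symbol first; ties broken column by column.
    ranked = sorted(symbols, key=lambda s: profiles[s], reverse=True)
    label = {s: i + 1 for i, s in enumerate(ranked)}
    # Sorting the renamed words makes the result independent of word order.
    return tuple(sorted(tuple(label[ch] for ch in w) for w in words))


def same_structure(list_a, list_b):
    """True when both word lists reduce to the same canonical form."""
    return canonical_form(list_a) == canonical_form(list_b)


if __name__ == "__main__":
    list1 = ["abcd", "aabb", "abda", "dbdb", "bbca", "afah"]
    list2 = ["bacd", "bbaa", "badb", "dada", "aacb", "bfbh"]
    print(same_structure(list1, list2))
```

On the six words shown for lists (1) through (4), no two symbols share a profile, so the sketch gives all four lists the same canonical form. A list such as "ab", "ba" is an example it does not handle: both symbols have the same counts in every column, and the function raises ValueError.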



