> Posted to both sci.math.num-analysis and sci.math.symbolic based on a
> suggestion in sci.math. I will follow both groups if you'd like to
> remove the crosspost.
>
>
> I'm working on bit of code that does some data processing. I'd like
> some input on developing the algorithm. This will effectively condense
> a large set of data values to a smaller one.
>
> The original data is the result of sampling a waveform. The processed
> data should still represent that waveform, of course.
>
> Input: 10,000 integer values.
> Output: 1023 integer values.
>
> There's nothing that can be done to change the input or output
> requirements.
>
> My current method is a bit brute force, using a moving window that
> sometimes averages 10 raw values, sometimes 9, to derive the results.
>
> I'm not overly happy with that, so if anyone has some more elegant or
> efficient suggestions I'd be quite interested.
>
> I had looked at data compression algorithms. I think those are really
> overkill for what I need, plus some problems. My situation is not a
> round-trip one. That is, it's not the case that I need to compress,
> store or transmit, and then uncompress later.
>
> The 1023 data points are the final format for the receiving device. So
> I have to end up with that number of data points representing the
> entire sampled waveform. The instrument can only provide certain
> numbers of points.
>
> It would be convenient if I could get the raw data as 1023 points, but
> that's not an option. In the range we want to work in, I can get 1000
> or 10,000. So we go up a range and average the results.
>
>
>
> Brian
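For reference, the 9-or-10 window described in the quoted post might look something like the sketch below. This is only a guess at the method, not Brian's actual code: it assumes the windows do not overlap, that their boundaries are spread evenly over the input, and that results are rounded back to integers. The name `condense` is made up for illustration.

```python
def condense(samples, n_out=1023):
    """Average consecutive, non-overlapping windows spread evenly over the input."""
    n_in = len(samples)
    out = []
    for k in range(n_out):
        # Integer boundaries: window lengths differ by at most one sample.
        start = k * n_in // n_out
        stop = (k + 1) * n_in // n_out
        window = samples[start:stop]
        # Round the mean so the output stays integer-valued.
        out.append(round(sum(window) / len(window)))
    return out
```

With 10,000 inputs and 1023 outputs, every window holds 9 or 10 samples and every input sample is used exactly once.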
What else do you know about the data? You have suggested one extreme of a form of low-pass filtering. How about just taking 1023 averages of 9 values each and discarding the rest, to avoid switching between 9 and 10 values per average? Explaining why this is a good or bad idea may help in understanding the data. Another extreme is to average 9 batches of 1023 adjacent data points and toss the rest. Or maybe treat the rest as a partial batch of 1023 and then figure out how to combine it. Is there any reason for 1023 beyond some sort of caprice?
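To make the two extremes concrete, here is a rough sketch. The function names are made up, and the results are left as floats rather than rounded to integers. Both versions use 9 × 1023 = 9207 samples and drop the last 793.

```python
def blocks_of_nine(samples, n_out=1023, width=9):
    """First extreme: n_out fixed-width block averages; trailing samples are dropped."""
    used = n_out * width  # 9207 of the 10,000 samples
    return [sum(samples[i:i + width]) / width for i in range(0, used, width)]


def fold_batches(samples, n_out=1023, n_batches=9):
    """Other extreme: average point i across n_batches consecutive batches of n_out samples."""
    return [
        sum(samples[b * n_out + i] for b in range(n_batches)) / n_batches
        for i in range(n_out)
    ]
```

The first version only represents the leading 92% or so of the record. The second only makes sense if the waveform more or less repeats every 1023 samples. Which of these objections matters (if either) depends on what the data actually looks like.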
Is a data point more like its immediate neighbour, or does it have more in common with the data point a month later, or 1023 points later, or something else?
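One rough way to put a number on that question is to correlate the series with a shifted copy of itself at a few lags. The sketch below is an illustration only: `lag_correlation` is a made-up helper, it assumes `lag` is at least 1 and smaller than the record length, and it will divide by zero on a constant signal.

```python
import math


def lag_correlation(samples, lag):
    """Pearson correlation between the series and a copy of itself shifted by `lag` samples."""
    a = samples[:-lag]  # requires lag >= 1
    b = samples[lag:]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)
```

Comparing `lag_correlation(samples, 1)` with `lag_correlation(samples, 1023)` gives a first hint: a value near 1 at lag 1 suggests neighbouring points are similar, so averaging adjacent samples loses little, while a value near 1 at lag 1023 would point towards a repeating structure.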