Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
787 views
in Technique[技术] by (71.8m points)

scala - Calculating the Moving Average of a List

This weekend I decided to try my hand at some Scala and Clojure. I'm proficient with object oriented programming, and so Scala was easy to pick up as a language, but wanted to try out functional programming. This is where it got hard.

I just can't seem to get my head into a mode of writing functions. As an expert functional programmer, how do you approach a problem?

Given a list of values and a defined period of summation, how would you generate a new list of the simple moving average of the list?

For example: Given the list values (2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0), and the period 4, the function should return: (0.0, 0.0, 0.0, 4.75, 5.0, 6.0, 7.25, 8.0, 8.25, 6.5)

After spending a day mulling it over, the best I could come up with in Scala was this:

def simpleMovingAverage(values: List[Double], period: Int): List[Double] = {
  (for (i <- 1 to values.length)
    yield
    if (i < period) 0.00
    else values.slice(i - period, i).reduceLeft(_ + _) / period).toList
}

I know this is horribly inefficient, I'd much rather do something like:

where n < period: ma(n) = 0
where n = period: ma(n) = sum(value(1) to value(n)) / period
where n > period: man(n) = ma(n -1) - (value(n-period) / period) + (value(n) / period)

Now that would be easily done in a imperative style, but I can't for the life of me work out how to express that functionally.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Interesting problem. I can think of many solutions, with varying degrees of efficiency. Having to add stuff repeatedly isn't really a performance problem, but let's assume it is. Also, the zeroes at the beginning can be prepended later, so let's not worry about producing them. If the algorithm provides them naturally, fine; if not, we correct it later.

Starting with Scala 2.8, the following would give the result for n >= period by using sliding to get a sliding window of the List:

def simpleMovingAverage(values: List[Double], period: Int): List[Double] =
  List.fill(period - 1)(0.0) ::: (values sliding period map (_.sum) map (_ / period))

Nevertheless, although this is rather elegant, it doesn't have the best performance possible, because it doesn't take advantage of already computed additions. So, speaking of them, how can we get them?

Let's say we write this:

values sliding 2 map sum

We have a list of the sum of each two pairs. Let's try to use this result to compute the moving average of 4 elements. The above formula made the following computation:

from d1, d2, d3, d4, d5, d6, ...
to (d1+d2), (d2+d3), (d3+d4), (d4+d5), (d5+d6), ...

So if we take each element and add it to the second next element, we get the moving average for 4 elements:

(d1+d2)+(d3+d4), (d2+d3)+(d4+d5), (d3+d4)+(d5+d6), ...

We may do it like this:

res zip (res drop 2) map Function.tupled(_+_)

We could then compute the moving average for 8 elements, and so on. Well, there is a well known algorithm to compute things that follow such pattern. It's most known for its use on computing the power of a number. It goes like this:

def power(n: Int, e: Int): Int = e match {
  case 0 => 1
  case 1 => n
  case 2 => n * n
  case odd if odd % 2 == 1 => power(n, (odd - 1)) * n
  case even => power(power(n, even / 2), 2)
}

So, let's apply it here:

def movingSum(values: List[Double], period: Int): List[Double] = period match {
  case 0 => throw new IllegalArgumentException
  case 1 => values
  case 2 => values sliding 2 map (_.sum)
  case odd if odd % 2 == 1 => 
    values zip movingSum(values drop 1, (odd - 1)) map Function.tupled(_+_)
  case even =>
    val half = even / 2
    val partialResult = movingSum(values, half)
    partialResult zip (partialResult drop half) map Function.tupled(_+_)
}

So, here's the logic. Period 0 is invalid, period 1 is equal to the input, period 2 is sliding window of size 2. If greater than that, it may be even or odd.

If odd, we add each element to the movingSum of the next (odd - 1) elements. For example, if 3, we add each element to the movingSum of the next 2 elements.

If even, we compute the movingSum for n / 2, then add each element to the one n / 2 steps afterwards.

With that definition, we can then go back to the problem and do this:

def simpleMovingAverage(values: List[Double], period: Int): List[Double] =
  List.fill(period - 1)(0.0) ::: (movingSum(values, period) map (_ / period))

There's a slight inefficiency with regards to the use of :::, but it's O(period), not O(values.size). It can be made more efficient with a tail recursive function. And, of course, the definition of "sliding" I provided is horrendous performance-wise, but there will be a much better definition of it on Scala 2.8. Note that we can't make an efficient sliding method on a List, but we can do it on an Iterable.

Having said all that, I'd go with the very first definition, and optimize only if a critical path analysis pinpointed this as a big deal.

To conclude, let's consider how I went about the problem. We have a moving average problem. A moving average is the sum of a moving "window" on a list, divided by the size of that window. So, first, I try to get a sliding window, sum everything on it, and then divide by the size.

The next problem was to avoid repetition of already computed additions. In this case, I went to the smallest addition possible, and tried to figure out how to compute bigger sums reusing such results.

Finally, let's try to solve the problem the way you figured it, by adding and subtracting from the previous result. Getting the first average is easy:

 def movingAverage(values: List[Double], period: Int): List[Double] = {
   val first = (values take period).sum / period

Now we make two lists. First, the list of elements to be subtracted. Next, the list of elements to be added:

   val subtract = values map (_ / period)
   val add = subtract drop period

We can add these two lists by using zip. This method will only produce as many elements as the smaller list has, which avoids the problem of subtract being bigger than necessary:

   val addAndSubtract = add zip subtract map Function.tupled(_ - _)

We finish by composing the result with a fold:

   val res = (addAndSubtract.foldLeft(first :: List.fill(period - 1)(0.0)) { 
     (acc, add) => (add + acc.head) :: acc 
   }).reverse

which is the answer to be returned. The whole function looks like this:

 def movingAverage(values: List[Double], period: Int): List[Double] = {
   val first = (values take period).sum / period
   val subtract = values map (_ / period)
   val add = subtract drop period
   val addAndSubtract = add zip subtract map Function.tupled(_ - _)
   val res = (addAndSubtract.foldLeft(first :: List.fill(period - 1)(0.0)) { 
     (acc, add) => (add + acc.head) :: acc 
   }).reverse
   res
 }

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...