Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

java - Is it possible to create nested RDDs in Apache Spark?

I am trying to implement the K-nearest neighbor algorithm in Spark. I was wondering whether it is possible to work with nested RDDs, which would make my life a lot easier. Consider the following code snippet.

public static void main (String[] args){
//blah blah code
JavaRDD<Double> temp1 = testData.map(
    new Function<Vector,Double>(){
        public Double call(final Vector z) throws Exception{
            JavaRDD<Double> temp2 = trainData.map(
                    new Function<Vector, Double>() {
                        public Double call(Vector vector) throws Exception {
                            return (double) vector.length();
                        }
                    }
            );
            return (double)z.length();
        }    
    }
);
}

Currently I am getting an error with this nested setup (I can post the full log here). Is it allowed in the first place? Thanks

1 Answer

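No, it is not possible. The items of an RDD must be serializable, and an RDD cannot be meaningfully serialized and used on the executors: it is only a handle to a distributed dataset, and its transformations and actions can only be invoked from the driver, not from inside a function passed to another transformation (such as the inner trainData.map call in the question). This makes sense: otherwise a whole RDD might have to be transferred over the network, which is a problem if it contains a lot of data. And if it does not contain a lot of data, you can and should use an array or a similar local collection instead.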
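The suggestion to use an array can be sketched in plain Java, without Spark, roughly as follows. This is only an illustration under assumptions: the class name LocalTrainSetSketch, the helper methods, and the sample points are hypothetical, and a java.util stream stands in for testData.map. The point is that the inner collection is an ordinary local list rather than a second RDD.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: the training set is a small local list, not an RDD.
class LocalTrainSetSketch {

    // Euclidean distance between two points of equal dimension.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Distance from one test point to its nearest training point.
    static double nearestDistance(double[] testPoint, List<double[]> train) {
        double best = Double.POSITIVE_INFINITY;
        for (double[] t : train) {
            best = Math.min(best, distance(testPoint, t));
        }
        return best;
    }

    public static void main(String[] args) {
        // Assumed sample data, small enough to keep in memory.
        List<double[]> train = Arrays.asList(new double[]{0, 0}, new double[]{3, 4});
        List<double[]> test = Arrays.asList(new double[]{0, 2}, new double[]{3, 3});

        // The outer map plays the role of testData.map; the inner loop
        // reads the local list instead of calling trainData.map.
        List<Double> nearest = test.stream()
                .map(z -> nearestDistance(z, train))
                .collect(Collectors.toList());
        System.out.println(nearest);
    }
}
```

In a real Spark job, a small local collection like this is typically wrapped with JavaSparkContext.broadcast and read through value() inside the function passed to map, so that each executor receives one copy.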

However, I don't know how you are implementing K-nearest neighbor, so be careful: if you do something like calculating the distance between each pair of points, this does not scale with the dataset size, because it is O(n^2).
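To make the quadratic growth concrete, here is a minimal sketch that only counts how many distance evaluations a brute-force all-pairs pass would perform. The class and method names are hypothetical, and no actual distances are computed.

```java
// Hypothetical sketch: counts the pair evaluations of a brute-force pass.
class PairwiseCostSketch {

    // Number of unordered pairs (i, j) with i < j among n points,
    // counted by the same nested loop a naive implementation would run.
    static long countPairs(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                count++; // one distance computation would happen here
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Doubling n roughly quadruples the work.
        for (int n : new int[]{1000, 2000, 4000}) {
            System.out.println(n + " points -> " + countPairs(n) + " pairs");
        }
    }
}
```

The count equals n(n-1)/2, so going from 1000 to 2000 points raises it from 499500 to 1999000 evaluations.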

