Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
265 views
in Technique[技术] by (71.8m points)

java - Mahout : To read a custom input file

I was playing with Mahout and found that the FileDataModel accepts data in the format

     userId,itemId,pref(long,long,Double).

I have some data which is of the format

     String,long,double 

What is the best/easiest method to work with this dataset on Mahout?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

One way to do this is by creating an extension of FileDataModel. You'll need to override the readUserIDFromString(String value) method to use some kind of resolver do the conversion. You can use one of the implementations of IDMigrator, as Sean suggests.

For example, assuming you have an initialized MemoryIDMigrator, you could do this:

@Override
protected long readUserIDFromString(String stringID) {
    long result = memoryIDMigrator.toLongID(stringID); 
    memoryIDMigrator.storeMapping(result, stringID);
    return result;
}

This way you could use memoryIDMigrator to do the reverse mapping, too. If you don't need that, you can just hash it the way it's done in their implementation (it's in AbstractIDMigrator).


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

2.1m questions

2.1m answers

60 comments

56.9k users

...