Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Sorting text lines from hard drive files by partly loading to memory | Java

My task is to sort a file that is too large to fit in memory. The file contains text lines. What I have done so far:

  1. Read the original file in parts (each of the allowed size).
  2. Sorted each part.
  3. Saved each sorted part to a temp file.
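
For reference, the three steps above could look roughly like the sketch below. This is not the asker's actual code: the class name `RunSplitter`, the method `splitIntoSortedRuns`, and the `maxLines` limit are made up for illustration, and the sketch assumes the memory limit can be approximated by a line count and that natural `String` ordering is the desired sort order.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: splits a large text file into sorted temp files ("runs").
class RunSplitter {

    // Reads at most maxLines lines at a time, sorts them in memory,
    // and writes each sorted chunk to its own temp file in tempDir.
    static List<Path> splitIntoSortedRuns(Path input, Path tempDir, int maxLines)
            throws IOException {
        List<Path> runs = new ArrayList<>();
        List<String> chunk = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                chunk.add(line);
                if (chunk.size() >= maxLines) {
                    runs.add(writeSortedRun(chunk, tempDir));
                    chunk.clear(); // free the memory before reading the next part
                }
            }
        }
        if (!chunk.isEmpty()) {
            runs.add(writeSortedRun(chunk, tempDir)); // last, possibly shorter, part
        }
        return runs;
    }

    // Sorts one chunk (natural String order, an assumption) and saves it.
    private static Path writeSortedRun(List<String> chunk, Path tempDir) throws IOException {
        Collections.sort(chunk);
        Path run = Files.createTempFile(tempDir, "run-", ".txt");
        Files.write(run, chunk, StandardCharsets.UTF_8);
        return run;
    }
}
```

A call such as `RunSplitter.splitIntoSortedRuns(bigFile, tempDir, 100_000)` would return the list of temp files, each sorted on its own. A real limit would more likely be based on bytes than on a line count.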

As I understand it, the next thing I should do is:

  1. Read the first line of each temp file.
  2. Compare those lines with each other (using a local variable to hold them temporarily, though I am not sure whether that will stay below the size limit).
  3. Write the smallest line (the result of the comparison) to the final file.
  4. Remove the line I just wrote from its temporary file.
  5. Repeat steps 1-4 until all lines are sorted and "transferred" from the temp files to the final file.

I am most unsure about step 4. Is there a class that can look for a value and then erase the line with that value (at that point I won't even know which file the line came from)? I suspect this is not the proper way to reach my goal at all, but I need to remove the lines that are already sorted, and I can't hold the files' data in memory.


1 Answer

Do you need to do this in Java (I'm assuming so from the tag)? Memory-wise, it isn't going to be an efficient way to do it. The simplest option, in my opinion, is to use the sort command and sort the file directly at the OS level.

This article is a guide to using sort: https://www.geeksforgeeks.org/sort-command-linuxunix-examples/

sort is available on Windows as well as Unix/Linux and can handle huge files.
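
If it does have to be Java, note that step 4 from the question doesn't need any deleting: each temp file is already sorted, so it is enough to keep one open reader per file and only ever move forward. Below is a minimal sketch of such a merge using a `PriorityQueue`. The names `RunMerger`, `Head`, and `merge` are made up for illustration, and it assumes the temp files were sorted in natural `String` order. Only one line per temp file is held in memory at a time, and each queue entry remembers which reader its line came from.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper: merges already-sorted temp files into one sorted file.
class RunMerger {

    // Pairs the current line of one temp file with the reader it came from.
    private static final class Head implements Comparable<Head> {
        final String line;
        final BufferedReader source;

        Head(String line, BufferedReader source) {
            this.line = line;
            this.source = source;
        }

        @Override
        public int compareTo(Head other) {
            return line.compareTo(other.line); // natural String order (assumption)
        }
    }

    static void merge(List<Path> sortedRuns, Path output) throws IOException {
        PriorityQueue<Head> heap = new PriorityQueue<>();
        List<BufferedReader> readers = new ArrayList<>();
        try (BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            // Step 1: read the first line of each temp file.
            for (Path run : sortedRuns) {
                BufferedReader reader = Files.newBufferedReader(run, StandardCharsets.UTF_8);
                readers.add(reader);
                String first = reader.readLine();
                if (first != null) {
                    heap.add(new Head(first, reader));
                }
            }
            // Steps 2-5: write the smallest line, then advance only the reader
            // it came from. Advancing the reader replaces "removing" the line.
            while (!heap.isEmpty()) {
                Head smallest = heap.poll();
                out.write(smallest.line);
                out.newLine();
                String next = smallest.source.readLine();
                if (next != null) {
                    heap.add(new Head(next, smallest.source));
                }
            }
        } finally {
            for (BufferedReader reader : readers) {
                reader.close();
            }
        }
    }
}
```

Calling `RunMerger.merge(tempFiles, finalFile)` writes the merged result, after which the temp files can simply be deleted as a whole. With a very large number of temp files, the number of open readers could become a limit of its own, so merging in several passes may be needed.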

