At Dimitre Novatchev's request, I am posting this as a new question, because some parts of the old question have changed.
(Link to the old question: Merging two different XML log files (trace and messages) using date and timestamp?)
I need to merge two XML log files (up to 700 MB). One log file contains a trace with position updates. The other log file contains the received messages. There can be several received messages in a row without a position update in between, and vice versa.
Both logs have timestamps including milliseconds (123 in this example):
- The trace log uses <date> (e.g. 14.7.2012 11:08:07.123)
- The message log uses a unix timestamp in milliseconds, <timeStamp> (e.g. 1342264087123)
There are also other <timeStamp> elements included in the message log, but only the one within the path messageList/Message/originator/originatorPosition/timeStamp is relevant.
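To make the conversion concrete, here is a minimal Python sketch (not the XSLT I am asking for) of how the two formats relate. It assumes both logs use UTC, which is what the example pair above suggests; the names `date_to_ms` and `ms_to_date` are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: both logs are in UTC. If the trace log uses local time,
# an offset would have to be applied here.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
DATE_FORMAT = "%d.%m.%Y %H:%M:%S.%f"  # day and month may be unpadded


def date_to_ms(text):
    """Convert a trace <date> such as '14.7.2012 11:08:07.123' to unix milliseconds."""
    moment = datetime.strptime(text.strip(), DATE_FORMAT).replace(tzinfo=timezone.utc)
    return (moment - EPOCH) // timedelta(milliseconds=1)


def ms_to_date(ms):
    """Convert unix milliseconds to the trace format, without zero-padding day or month."""
    moment = EPOCH + timedelta(milliseconds=ms)
    clock = moment.strftime("%H:%M:%S")
    millis = moment.microsecond // 1000
    return f"{moment.day}.{moment.month}.{moment.year} {clock}.{millis:03d}"
```

With these two helpers, either format can be used as the sort key; comparing the integer milliseconds is the simpler choice.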
The following structures are slightly simplified, as additional content like "acceleration" etc. is left out. This additional content just needs to be copied together with the rest of the messages/items.
The structure of the position trace looks like:
<itemList>
<item>
<date>14.7.2012 12:13:05.123</date>
<FilteredPosition>
<Latitude>51.12235</Latitude>
<Longitude>9.347214</Longitude>
</FilteredPosition>
</item>
<item>
<date>14.7.2012 12:13:07.456</date>
<FilteredPosition>
<Latitude>51.12235</Latitude>
<Longitude>9.347214</Longitude>
</FilteredPosition>
</item>
</itemList>
The structure of the message log looks like this:
<messageList>
<Message>
<messageId>1234</messageId>
<originator>
<originatorPosition>
<nodeId>2345</nodeId>
<timeStamp>1342264087061</timeStamp>
</originatorPosition>
<senderPosition>
<nodeId>2345</nodeId>
<timeStamp>1342264087234</timeStamp>
</senderPosition>
<medium></medium>
</originator>
<MessagePayload>
<generationTime>
<timeStamp>1342264087</timeStamp>
<milliSec>42</milliSec>
</generationTime>
</MessagePayload>
</Message>
<Message>
<messageId>1234</messageId>
<originator>
<originatorPosition>
<nodeId>2345</nodeId>
<timeStamp>1342264088064</timeStamp>
</originatorPosition>
<senderPosition>
<nodeId>2345</nodeId>
<timeStamp>1342264088254</timeStamp>
</senderPosition>
<medium></medium>
</originator>
<MessagePayload>
<generationTime>
<timeStamp>1342264088</timeStamp>
<milliSec>42</milliSec>
</generationTime>
</MessagePayload>
</Message>
</messageList>
When merging, the timestamps need to be read and compared, which means converting between the <date> format ("14.7.2012 11:08:07.123") and the unix timestamp, including milliseconds. All positions and messages should then be written out in chronological order.
The position data can be copied as it is. Each message, however, should be placed inside <item> tags, a <date> tag should be added (based on the message's unix time with milliseconds), and the <Message> tag should be replaced by <m:Message type="received">. The items are placed within the root <itemList>, just as in the position trace.
A result could look like this:
<itemList>
<item>
<date>14.7.2012 12:13:05.123</date>
<FilteredPosition>
<Latitude>51.12235</Latitude>
<Longitude>9.347214</Longitude>
</FilteredPosition>
</item>
<item>
<date>14.7.2012 12:13:07.061</date>
<m:Message type="received">
<messageId>1234</messageId>
<originator>
<originatorPosition>
<nodeId>2345</nodeId>
<timeStamp>1342264087061</timeStamp>
</originatorPosition>
<senderPosition>
<nodeId>2345</nodeId>
<timeStamp>1342264087234</timeStamp>
</senderPosition>
<medium></medium>
</originator>
<MessagePayload>
<generationTime>
<timeStamp>1342264087</timeStamp>
<milliSec>42</milliSec>
</generationTime>
</MessagePayload>
</m:Message>
</item>
<item>
<date>14.7.2012 12:13:07.456</date>
<FilteredPosition>
<Latitude>51.12235</Latitude>
<Longitude>9.347214</Longitude>
</FilteredPosition>
</item>
<item>
<date>14.7.2012 12:13:08.064</date>
<m:Message type="received">
<messageId>1234</messageId>
<originator>
<originatorPosition>
<nodeId>2345</nodeId>
<timeStamp>1342264088064</timeStamp>
</originatorPosition>
<senderPosition>
<nodeId>2345</nodeId>
<timeStamp>1342264088254</timeStamp>
</senderPosition>
<medium></medium>
</originator>
<MessagePayload>
<generationTime>
<timeStamp>1342264088</timeStamp>
<milliSec>42</milliSec>
</generationTime>
</MessagePayload>
</m:Message>
</item>
</itemList>
The position log file also contains some <item> elements that have neither a timestamp nor a <FilteredPosition>. These items can be ignored and do not need to be copied.
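To spell out the rules above, here is a rough Python sketch of the merge logic I have in mind (again, not the XSLT I am looking for). It assumes that each log is already sorted by time, that both use UTC, and that the trees fit in memory, which will not hold for 700 MB files. The namespace URI `urn:example:messages` for the `m:` prefix and all function names are placeholders I made up.

```python
import heapq
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)  # assumption: logs are in UTC
DATE_FORMAT = "%d.%m.%Y %H:%M:%S.%f"
M_NS = "urn:example:messages"  # hypothetical URI for the m: prefix
ET.register_namespace("m", M_NS)


def date_to_ms(text):
    moment = datetime.strptime(text.strip(), DATE_FORMAT).replace(tzinfo=timezone.utc)
    return (moment - EPOCH) // timedelta(milliseconds=1)


def ms_to_date(ms):
    moment = EPOCH + timedelta(milliseconds=ms)
    clock = moment.strftime("%H:%M:%S")
    return f"{moment.day}.{moment.month}.{moment.year} {clock}.{moment.microsecond // 1000:03d}"


def position_items(item_list):
    """Yield (milliseconds, item) for trace items; skip items lacking a date or position."""
    for item in item_list.findall("item"):
        date = item.findtext("date")
        if date is None or item.find("FilteredPosition") is None:
            continue
        yield date_to_ms(date), item


def message_items(message_list):
    """Yield (milliseconds, item) with each Message wrapped as <item><date/><m:Message/></item>."""
    for message in message_list.findall("Message"):
        # Only this timeStamp is relevant; the others are copied untouched.
        ms = int(message.findtext("originator/originatorPosition/timeStamp"))
        message.tag = f"{{{M_NS}}}Message"
        message.set("type", "received")
        item = ET.Element("item")
        ET.SubElement(item, "date").text = ms_to_date(ms)
        item.append(message)
        yield ms, item


def merge_logs(trace_root, message_root):
    """Merge two time-sorted logs into one <itemList>, ordered by timestamp."""
    merged = ET.Element("itemList")
    streams = heapq.merge(
        position_items(trace_root),
        message_items(message_root),
        key=lambda pair: pair[0],  # compare on milliseconds only
    )
    for _, item in streams:
        merged.append(item)
    return merged
```

Calling `merge_logs(ET.parse(trace_path).getroot(), ET.parse(message_path).getroot())` with two hypothetical file paths would return the merged tree, which `ET.tostring` can serialize. For the real file sizes, a streaming reader such as `ET.iterparse` would probably have to replace the in-memory parse.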
I'd appreciate any help with the XSLT code, as I'm quite new to this topic... :-/