If you work with the bytes of a wav file directly, you can use the same strategy in any programming language. For this example I'll assume the two source files have the same bitrate/numchannels (i.e. the same sample rate, bits per sample and channel count) and are the same length/size.
(If not, you can probably edit them before starting the merge.)
First look over the wav specification. I found a good one on a Stanford course website:
Common header lengths are 44 or 46 bytes.
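To check which header length you actually have, something like the following sketch could help. It assumes the simple layout where the 'fmt ' chunk is immediately followed by the 'data' chunk (which is what gives you 44 or 46 bytes). Files with extra chunks in between would need a proper chunk walk. The function name `read_wav_header` and the dictionary keys are just made up for illustration.

```python
import struct

def read_wav_header(raw: bytes) -> dict:
    """Parse a simple PCM wav header.

    Assumption: the 'fmt ' chunk starts at offset 12 and the 'data'
    chunk follows it directly (the 44/46-byte header case).
    """
    if raw[0:4] != b"RIFF" or raw[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    if raw[12:16] != b"fmt ":
        raise ValueError("expected 'fmt ' chunk at offset 12")

    # All multi-byte fields in a RIFF file are little-endian ("<").
    fmt_size = struct.unpack_from("<I", raw, 16)[0]   # 16 or 18 for plain PCM
    (audio_format, num_channels, sample_rate,
     byte_rate, block_align, bits_per_sample) = struct.unpack_from("<HHIIHH", raw, 20)

    data_id_offset = 20 + fmt_size                    # 36 or 38
    if raw[data_id_offset:data_id_offset + 4] != b"data":
        raise ValueError("'data' chunk does not directly follow 'fmt '")
    data_size = struct.unpack_from("<I", raw, data_id_offset + 4)[0]

    return {
        "num_channels": num_channels,
        "sample_rate": sample_rate,
        "bits_per_sample": bits_per_sample,
        "data_size": data_size,
        "header_size": data_id_offset + 8,            # 44 or 46
    }
```

You would call it on the raw bytes of the file, e.g. `read_wav_header(open("a.wav", "rb").read())` (the file name is a placeholder), and compare the results for both files before merging.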
If you want to concatenate two files (i.e. play one wav then the other in a single file):
- find out what format your wav files are
- chop off the first 44/46 bytes, which are the headers; the remainder of each file is the data
- create a new file and stick one of the headers in that
  new wav file = {header} = {44/46} bytes long
- add the two data parts from the original files
  new wav file = {header + data1 + data2} = {44/46 + size(data1) + size(data2)} bytes long
- modify your header in two places to reflect the new file's length
a. modify bytes 4+4 (i.e. the 4 bytes starting at offset 4).
The new value should be the size of the new wav file in bytes minus 8, i.e. {44/46 + size(data1) + size(data2)} - 8, stored as a 4-byte little-endian integer.
b. modify bytes 40+4 or 42+4 (the 4 bytes starting at offset 40 or 42, depending on whether you have a 44-byte or a 46-byte header).
The new value should be the size of the audio data only, not of the whole file, i.e. {size(data1) + size(data2)}, again stored as a 4-byte little-endian integer.
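Here is a rough sketch of the concatenation steps in Python, working on the raw bytes of the two files. It assumes both files have the same header size and format, and that nothing follows the audio data in either file. `concat_wav_bytes` and its `header_size` parameter are names I made up. Note that the field just before the data (offset 40 or 42) is set to the size of the audio data only, which is what the usual wav layout expects there.

```python
import struct

def concat_wav_bytes(wav1: bytes, wav2: bytes, header_size: int = 44) -> bytes:
    """Return a new wav (as bytes) that plays wav1 followed by wav2.

    Assumption: both inputs share the same format and the same
    header_size (44 or 46), and the rest of each file is audio data.
    """
    header = bytearray(wav1[:header_size])   # reuse the first file's header
    data1 = wav1[header_size:]
    data2 = wav2[header_size:]
    data_size = len(data1) + len(data2)

    # Offset 4: total file size minus 8, little-endian 32-bit.
    struct.pack_into("<I", header, 4, header_size + data_size - 8)
    # Last 4 header bytes (offset 40 or 42): size of the audio data only.
    struct.pack_into("<I", header, header_size - 4, data_size)

    return bytes(header) + data1 + data2
```

To use it on real files you would read both with `open(path, "rb").read()` and write the result with `open(out_path, "wb").write(...)`, where the paths are whatever your files are called.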
If you want to instead merge or mix the two files (so that they both play at the same time):
- you won't have to edit the header if both files are the same length.
- starting at byte 44/46 you will have to edit each sample to be the value in data1 + the value in data2.
so for example if your sample size (BitsPerSample in the header, not the sample rate) was 8 bits you would modify 1 byte, and if it was 16 bits you would modify 2 bytes.
the rest of the file is just samples of 1/2 bytes, each storing an int value representing the waveform of the sound at that time.
a. For each of the remaining samples in the file, grab the 1/2-byte value and get the int value from both files, data1 and data2.
b. add the 1/2-byte integers together,
convert the result back to bytes and use that value in your output file.
c. You normally have to divide that number by 2 to get an average value that fits back in the original 1/2-byte sample block. I was getting distortion when I tried it in objc (probably related to signed or unsigned ints: 8-bit wav samples are unsigned, 16-bit samples are signed little-endian) and just skipped the division part, since it is only likely to be a problem if you are merging very loud sounds together.
i.e. when data1 + data2 is larger than what 1/2 bytes can hold, the sound will clip. There was a discussion about the clipping issue here and you may want to try one of those clipping techniques.
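As a rough sketch of the mixing loop, here is how it could look in Python for the 16-bit case. It assumes the data is 16-bit signed little-endian PCM (8-bit wav samples are unsigned and would need different handling), and that you have already stripped the headers. The name `mix_pcm16` and the `average` flag are my own. Clamping to the 16-bit range is just one simple way of dealing with overflow, not necessarily the best-sounding one.

```python
import struct

def mix_pcm16(data1: bytes, data2: bytes, average: bool = False) -> bytes:
    """Mix two blocks of 16-bit signed little-endian PCM samples.

    Assumption: both blocks use the same sample rate and channel count.
    Only the samples present in both blocks are mixed.
    """
    n = min(len(data1), len(data2)) // 2          # number of 2-byte samples
    a = struct.unpack(f"<{n}h", data1[:2 * n])    # "h" = signed 16-bit int
    b = struct.unpack(f"<{n}h", data2[:2 * n])

    mixed = []
    for x, y in zip(a, b):
        s = x + y                                 # plain Python int, no overflow yet
        if average:
            s //= 2                               # halve so the sum always fits
        s = max(-32768, min(32767, s))            # clamp to the 16-bit range
        mixed.append(s)

    return struct.pack(f"<{n}h", *mixed)
```

With `average=False` quiet sounds keep their original volume and only very loud passages get clamped. With `average=True` nothing ever clips but the result is quieter.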