Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compressed audio (continuation of #36) #37

Open
wants to merge 9 commits into
base: main
Choose a base branch
from
Open

Conversation

sampsyo
Copy link
Member

@sampsyo sampsyo commented Oct 26, 2016

This is @jksinton's work on making the FFmpeg backend work on data from a stream instead of from disk. Addresses #34 and #35.

I changed around the top-level interface a bit. There's a new audioread.decode function that works on streams instead of filenames. The included demo script, decode.py, can now work on data from stdin. So this works, for example:

$ python decode.py < foo.mp3

and produces a WAV file in out.wav. (Eventually, it would be nice to write that to stdout too to demonstrate that we're actually streaming!)

@sampsyo sampsyo mentioned this pull request Oct 26, 2016
self.daemon = True

def run(self):
self.fh.write(self.audio.read())
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This currently reads the entire input stream into memory and then writes into FFmpeg. We should do this in a more streaming way by reading a block from self.audio and writing it to self.fh, one block at a time, in a loop.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add block size as an argument similar to QueueReaderThread and read a block at a time. Perhaps in a while True: loop.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. The loop should probably exit when the file has no more data to read (i.e., when read returns an empty string).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just added these changes to the most recent commit.

class WriterThread(threading.Thread):
"""A thread that writes data to a filehandle
"""
def __init__(self, fh, audio=None):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameter names here could be a little more descriptive. Both parameters are actually file-like objects—so maybe we should just call them outfile and infile or something?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps writefile and readfile?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call; that's better!

@sampsyo
Copy link
Member Author

sampsyo commented Oct 27, 2016

Cool! This is looking great.

Do you think we should ship this with just one backend enabled (and document that fact), or should we endeavor to add the streaming mode to more backends? (I don't have a strong feeling either way.)

If the consensus is to merge, then all that's left is documentation.

@jksinton
Copy link
Collaborator

I was going to suggest working on other backends, just so that there's parity in backend support for decoding a file and compressed audio. I only have Linux and Mac OS X environments for testing. And so far, I've only been testing with Linux/Python 2.7.

@sampsyo
Copy link
Member Author

sampsyo commented Oct 27, 2016

That sounds good. Some backends will probably more challenging than others to adapt—for example, the Core Audio backend will require fiddling with callbacks to read data from the file.

@strayge
Copy link

strayge commented Nov 14, 2016

Processing audio from pipe doesn't work for me with python3 on Ubuntu 16.04.1

$ python3 decode.py < test_cbr80_mono.mp3
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/home/user/Desktop/folder/audioread/ffdec.py", line 93, in run
    data = self.readfile.read(self.blocksize)
  File "/usr/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfe in position 94: invalid start byte

With python2 it works, but print out wrong length:

$ python2 decode.py < test_cbr80_mono.mp3
Input file: 1 channels at 44100 Hz; 0.0 seconds.
Backend: ffdec

With filename as input both python's versions works correct (same file and ffmpeg as backend):

$ python2 decode.py test_cbr80_mono.mp3
Input file: 1 channels at 44100 Hz; 10.0 seconds.
Backend: ffdec
$ python3 decode.py test_cbr80_mono.mp3
Input file: 1 channels at 44100 Hz; 10.0 seconds.
Backend: ffdec

 
Also with python2 and pipe as input I've got different output file size than with filename mode:
Pipe-input mode:
10.006 secs, 882570 bytes
Filename-input mode (python2 and python3):
10.000 secs, 882044 bytes

@sampsyo
Copy link
Member Author

sampsyo commented Nov 14, 2016

Good point; it looks like we'll need to read from stdin in binary mode instead of text mode. If you're interested, any chance you could give that a shot? This SO answer might help: http://stackoverflow.com/a/32282458

@strayge
Copy link

strayge commented Nov 14, 2016

Just adding .buffer to stdout hasn't fixed problem:

Traceback (most recent call last):
  File "decode.py", line 65, in <module>
    decode()
  File "decode.py", line 38, in decode
    f = audioread.decode(sys.stdin)
  File "/home/user/Desktop/folder/audioread/__init__.py", line 126, in decode
    return ffdec.FFmpegAudioFile(audio=audio)
  File "/home/user/Desktop/folder/audioread/ffdec.py", line 195, in __init__
    self.stdout_reader = QueueReaderThread(self.proc.stdout.buffer, block_size)
AttributeError: '_io.BufferedReader' object has no attribute 'buffer

@strayge
Copy link

strayge commented Nov 14, 2016

After changing

data = self.readfile.read(self.blocksize)

to

try:
    data = self.readfile.buffer.read(self.blocksize)
except AttributeError:
    data = self.readfile.read(self.blocksize)

in ffdec.py it works with python2 & python3.
Both versions generates file with length = 10.006 secs and size = 882570 bytes.

@sampsyo
Copy link
Member Author

sampsyo commented Nov 14, 2016

Cool! Thanks for sorting this out.

I think the right fix here is to use an if PY2: check in decode.py to get a bytes stream from stdin. Then we should document the behavior in the main module: the input stream needs to be a bytes stream, not a text stream.

@jksinton
Copy link
Collaborator

Does this stem from decode.py calling audioread.decode(sys.stdin)?

@strayge
Copy link

strayge commented Nov 14, 2016

Does this stem from decode.py calling audioread.decode(sys.stdin)?

Yes
 

I think the right fix here is to use an if PY2: check in decode.py to get a bytes stream from stdin. Then we should document the behavior in the main module: the input stream needs to be a bytes stream, not a text stream.

I didn't found PY2 variable (just ugly if sys.version_info[0] > 2), and docs says:

Inevitably you will have code that has to choose what to do based on what version of Python is running. The best way to do this is with feature detection of whether the version of Python you’re running under supports what you need.

so, it can be rewritten as:

if 'buffer' in dir(self.readfile):
    data = self.readfile.buffer.read(self.blocksize)
else:
    data = self.readfile.read(self.blocksize)

@strayge
Copy link

strayge commented Nov 15, 2016

Also reproduced bug with different sizes. It seems like ffmpeg bug:

$ ffmpeg -i test_cbr80_mono.mp3 -f s16le - > out.wav
  ...
[mp3 @ 0x1b1a400] Skipping 0 bytes of junk at 495.
Input #0, mp3, from 'test_cbr80_mono.mp3':
  Metadata:
    ...
  Duration: 00:00:10.03, start: 0.025057, bitrate: 80 kb/s
    Stream #0:0: Audio: mp3, 44100 Hz, mono, s16p, 80 kb/s
    Metadata:
      encoder         : LAME3.99r
Output #0, s16le, to 'pipe:':
  Metadata:
    ...
    Stream #0:0: Audio: pcm_s16le, 44100 Hz, mono, s16, 705 kb/s
    Metadata:
      encoder         : Lavc56.60.100 pcm_s16le
Stream mapping:
  Stream #0:0 -> #0:0 (mp3 (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
size=     861kB time=00:00:10.00 bitrate= 705.6kbits/s    
video:0kB audio:861kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%
$ cat test_cbr80_mono.mp3 | ffmpeg -i - -f s16le - > out.wav
  ...
[mp3 @ 0x17fa400] invalid concatenated file detected - using bitrate for duration
[mp3 @ 0x17fa400] Skipping 0 bytes of junk at 495.
Input #0, mp3, from 'pipe:':
  Metadata:
    ...
  Duration: N/A, start: 0.025057, bitrate: 80 kb/s
    Stream #0:0: Audio: mp3, 44100 Hz, mono, s16p, 80 kb/s
    Metadata:
      encoder         : LAME3.99r
Output #0, s16le, to 'pipe:':
  Metadata:
    ...
    Stream #0:0: Audio: pcm_s16le, 44100 Hz, mono, s16, 705 kb/s
    Metadata:
      encoder         : Lavc56.60.100 pcm_s16le
Stream mapping:
  Stream #0:0 -> #0:0 (mp3 (native) -> pcm_s16le (native))
size=     862kB time=00:00:10.00 bitrate= 705.6kbits/s    
video:0kB audio:862kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%

@sampsyo
Copy link
Member Author

sampsyo commented Nov 15, 2016

Yeah, seems like ffmpeg might work slightly differently depending on where its data comes from.

Thanks for looking into this!

@jksinton
Copy link
Collaborator

jksinton commented Nov 15, 2016

Adding this to decode.py, accounts for the correct sys.stdin input:

        if filename:
            f = audioread.audio_open(filename)
        elif sys.version_info >= (3, 0):
            f = audioread.decode(sys.stdin.buffer)
        else:
            f = audioread.decode(sys.stdin)

I tried catching the UnicodeDecodeError in decode.py with an additional except UnicodeDecodeError: block, but it wasn't working. I suspect this is because the error is generated in ffdec.py. Perhaps we can catch the UnicodeDecodeError in ffdec.py, provide a more pertinent error message, and raise an error that can be caught, like the DecodeError.

@strayge
Copy link

strayge commented Nov 15, 2016

Perhaps we can catch the UnicodeDecodeError in ffdec.py, provide a more pertinent error message, and raise an error that can be caught, like the DecodeError.

As I wrote above

if 'buffer' in dir(self.readfile):
    data = self.readfile.buffer.read(self.blocksize)
else:
    data = self.readfile.read(self.blocksize)

is more pythonic ways than if sys.version_info >= (3, 0)

Also if we adding this check to ffdec.py it works with both python's versions and allowed to call audioread without decode.py

@sampsyo
Copy link
Member Author

sampsyo commented Nov 15, 2016

I don't have a strong opinion about feature detection (if 'buffer' in dir(self.readfile)) vs. version detection (if sys.version_info >= (3, 0)). Either one seems fine from here.

But I do think this check should go in decode.py. Specifically, I would like to avoid requiring each backend to check whether the stream it was passed was a bytes stream or a text stream and, if the latter, if it's the special kind of text stream that happens to have an underlying buffer. Instead, I think documentation is the way to go: we should just make it as clear as possible what should be provided.

@carlthome
Copy link
Contributor

Status?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants