Update: My fuzz targets have been merged into the VLC mainline tree in commits 74e7bd2 and b83e9f2.
Introduction to my GSoC project here
The majority of VLC's parsing code has been successfully set up to be fuzzed by:
The demux fuzz target, which creates an input stream from the fuzzed input provided by libFuzzer,
probes and loads an appropriate demux module, and demultiplexes the input into its output
elementary streams, minimally handling all the ES callbacks and calling various demux_Control queries to
increase code coverage.
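To make the shape of such a target concrete, here is a minimal sketch of the same pipeline (stream over the fuzzed bytes, probe, demux loop, ES callback). Everything in it is a hypothetical toy: mem_stream, es_out, toy_probe, toy_demux and the "TOY" container format are made up for illustration and are not the VLC API, and the demux_Control calls are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy types and functions, NOT the VLC API. */

/* Read-only stream over the buffer handed in by the fuzzer. */
typedef struct {
    const uint8_t *data;
    size_t size;
    size_t pos;
} mem_stream;

/* Callback the toy demuxer invokes for every elementary-stream block. */
typedef struct {
    void (*send)(void *opaque, uint8_t es_id,
                 const uint8_t *payload, size_t len);
    void *opaque;
} es_out;

static unsigned g_blocks_seen;

/* Minimal ES handling: count the block and drop it. */
static void count_block(void *opaque, uint8_t es_id,
                        const uint8_t *payload, size_t len)
{
    (void)opaque; (void)es_id; (void)payload; (void)len;
    g_blocks_seen++;
}

/* Probe: accept only inputs that start with the made-up magic "TOY". */
static int toy_probe(const mem_stream *s)
{
    return s->size >= 3 && memcmp(s->data, "TOY", 3) == 0;
}

/* Demux one record laid out as [es_id][len][payload...].
 * Returns 1 if a block was emitted, 0 at end of input or on truncation. */
static int toy_demux(mem_stream *s, const es_out *out)
{
    if (s->size - s->pos < 2)
        return 0;
    uint8_t es_id = s->data[s->pos];
    size_t len = s->data[s->pos + 1];
    if (s->size - s->pos - 2 < len)
        return 0;
    out->send(out->opaque, es_id, s->data + s->pos + 2, len);
    s->pos += 2 + len;
    return 1;
}

/* libFuzzer entry point: wrap the input, probe, then demux until done. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    mem_stream s = { data, size, 0 };
    es_out out = { count_block, NULL };

    if (!toy_probe(&s))
        return 0;
    s.pos = 3; /* skip the magic */
    while (toy_demux(&s, &out))
        ;
    assert(s.pos <= s.size);
    return 0;
}
```

The real target has the same outline, only with VLC's stream, demux and ES output objects in place of the toy ones.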
I have been making a few specific improvements to the demux fuzz target
while also working on the decoder target, which isn't done yet.
One general improvement that I have made for all fuzz targets is
abstracting away and initializing the parent libvlc_instance_t object in
LLVMFuzzerInitialize, which improves performance because the instance
is no longer created and destroyed on every run of LLVMFuzzerTestOneInput.
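The pattern described above looks roughly like the sketch below. LLVMFuzzerInitialize and LLVMFuzzerTestOneInput are the real libFuzzer entry points; fuzz_context, g_ctx and the counters are hypothetical stand-ins for the libvlc_instance_t object, so that the example stays self-contained and does not guess at the actual VLC setup code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an expensive, reusable object
 * such as libvlc_instance_t. */
typedef struct {
    unsigned runs; /* how many inputs have used this context */
} fuzz_context;

static fuzz_context *g_ctx;
static unsigned g_init_calls;

/* Called once by libFuzzer before any input is executed. */
int LLVMFuzzerInitialize(int *argc, char ***argv)
{
    (void)argc; (void)argv;
    g_ctx = calloc(1, sizeof(*g_ctx));
    if (g_ctx == NULL)
        abort();
    g_init_calls++;
    return 0;
}

/* Called for every input; reuses the shared context instead of
 * creating and destroying one per run. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    (void)data; (void)size;
    assert(g_ctx != NULL);
    g_ctx->runs++;
    return 0;
}
```

The context is deliberately never freed: it lives for the whole fuzzing process, which is what removes the per-run setup cost.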
So, I have been working on writing a fuzz target for the demux API for the past
two weeks, and I have reached a point where it has good code coverage
and has already found a few shallow bugs.
Here are some of the stack traces libFuzzer spat out; all of them were fixed
rather promptly.
Introduction to this blog post here
The Architecture
I met with the VLC developers and my mentor at the VideoLabs office
in Paris, and after a few meetings and discussions we had a pretty good
idea of how we could most appropriately fuzz test libVLC and the VLC core.
In VLC, except the core, everything is a module.
There are over 200 modules in VLC, alongside libVLCCore and libVLC.
The main module categories that take an input are:
Access
Access-demuxer
Demuxer
Packetizer
Decoder
Video filter
Software bugs and vulnerabilities can be difficult to detect and slow to find
even when actively searched for by developers and users who usually look for
superficial functional and visual bugs.
In large software projects, especially those written in mid-level languages like C/C++,
security bugs and vulnerabilities can often be used to compromise the whole system,
mainly because memory management is left to the programmers of each project.
One alternative to human QA testing is to use automated software testing techniques
like fuzzing, where random, invalid or unexpected data is provided as input to a computer
program.
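As a rough illustration of the idea, the sketch below throws random byte strings at a tiny parser and counts how many are accepted. Both parse_header and fuzz_rounds are hypothetical names invented for this example; a coverage-guided fuzzer such as libFuzzer is far smarter about choosing inputs than this purely random loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical parser under test: returns 0 if the header is
 * accepted, -1 if it is rejected. */
static int parse_header(const uint8_t *buf, size_t len)
{
    if (len < 4)
        return -1;
    if (buf[0] != 0x7F)          /* made-up magic byte */
        return -1;
    size_t payload = buf[1];     /* declared payload length */
    if (payload > len - 2)       /* reject lengths past the buffer */
        return -1;
    return 0;
}

/* Feed `rounds` random inputs to the parser and return how many
 * were accepted. A bug would show up as a crash or sanitizer report. */
static unsigned fuzz_rounds(unsigned seed, unsigned rounds)
{
    uint8_t buf[64];
    unsigned accepted = 0;

    srand(seed);
    for (unsigned i = 0; i < rounds; i++) {
        size_t len = (size_t)(rand() % (int)sizeof buf);
        for (size_t j = 0; j < len; j++)
            buf[j] = (uint8_t)(rand() & 0xFF);
        if (parse_header(buf, len) == 0)
            accepted++;
    }
    return accepted;
}
```

Building a loop like this with AddressSanitizer enabled is what turns silent memory errors into visible crashes.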