This is the second attempt to build a 'quick and dirty' solution to reduce noise in an audio recording and to trim moments of silence within the recording.
This effort builds on the feedback I received on my first try, and it uses the following open source Python packages:
About this set of examples (and what you need to do with it)
This set of examples includes the best experiments I was able to generate so far.
This time, I tried the well-known MFCC (mel-frequency cepstral coefficients) technique, but it is very fragile, and I would not rely on it to work in real-world scenarios.
All results went through silence trimming (which was also improved a bit).
Please go over the samples of the different techniques, and send me your feedback -- which technique works best in your opinion, which one you would love to include in the product, etc.
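For reference, here is a minimal sketch of what energy-based silence trimming can look like. It is not the code that produced these samples; the function name `trim_silence`, the 20 ms frame size, and the -40 dB threshold are all assumptions chosen for illustration. It splits the signal into short frames, measures the RMS level of each frame, and keeps only the frames above the threshold.

```python
import numpy as np

def trim_silence(samples, sample_rate, frame_ms=20, threshold_db=-40.0):
    """Drop frames whose RMS level is at or below threshold_db.

    Hypothetical helper for illustration. Assumes `samples` is a mono
    signal scaled to the range [-1.0, 1.0], so levels are in dB
    relative to full scale.
    """
    samples = np.asarray(samples, dtype=np.float64)
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    n_frames = len(samples) // frame_len
    if n_frames == 0:
        # Shorter than one frame: nothing to analyze, return unchanged.
        return samples
    # A trailing partial frame (if any) is discarded to keep the sketch simple.
    frames = samples[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    # Clamp RMS to avoid log10(0) on frames of pure digital silence.
    level_db = 20 * np.log10(np.maximum(rms, 1e-10))
    keep = level_db > threshold_db
    return frames[keep].reshape(-1)

# Example: one second of tone, one second of silence, one second of tone.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = 0.5 * np.sin(2 * np.pi * 440 * t)
recording = np.concatenate([tone, np.zeros(sample_rate), tone])
trimmed = trim_silence(recording, sample_rate)
```

A fixed threshold like this is the fragile part: in a noisy outdoor recording the "silent" moments are not quiet, so the threshold would likely need to be set relative to the measured noise floor.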
A few things to keep in mind
Some of the techniques I'm showing here (the MFCC ones) assume that the recording has a human voice in it. This might not be true in real life. If we choose these techniques, we should test them in different scenarios, with and without a human voice.
Most of the well-known noise reduction techniques are very sensitive and of little use in outdoor scenarios (e.g. the street), which are probably SpeakApp's most common scenarios.
The fields of noise reduction and speech recognition are still under active research, with almost no breakthroughs in recent years.
What's next
Send your feedback
Asaf will get the code and implement it on the server
Test:
We might need a toggle in the app that will allow us to hear incoming messages with and without noise reduction.
We need to test the noise reduction feature with different headsets and different iOS devices.