author | matthewsotoudeh <matthewsot@outlook.com> | 2016-09-05 18:22:12 -0700 |
---|---|---|
committer | matthewsotoudeh <matthewsot@outlook.com> | 2016-09-05 18:22:12 -0700 |
commit | 97ef843a14cf81ad7d21735e49a54b715258a684 (patch) | |
tree | c60c329de760ac06e67962f50f0509ac8316d5be | |
parent | 8194d4790500f796c04fe192844dd5c5dc66e900 (diff) |
updated the README with usage details
-rw-r--r-- | MFCCDotNet/MFCCDotNet/MFCC.fs | 3 | ||||
-rw-r--r-- | README.md | 5 |
2 files changed, 6 insertions, 2 deletions
diff --git a/MFCCDotNet/MFCCDotNet/MFCC.fs b/MFCCDotNet/MFCCDotNet/MFCC.fs
index 782e795..b739165 100644
--- a/MFCCDotNet/MFCCDotNet/MFCC.fs
+++ b/MFCCDotNet/MFCCDotNet/MFCC.fs
@@ -79,8 +79,7 @@ module public MFCC =
         let filters = compute_filterbank(num_filters, Array.length abs_output, 48000.0);
         let mel_output =
             filters |>
-            Seq.take(num_features) |>
             Seq.map(fun filter -> log10(apply_and_sum_filter(abs_output, filter))) |>
             Seq.toArray;
 
-        dct(mel_output);
\ No newline at end of file
+        dct(mel_output) |> Array.take(num_features);
\ No newline at end of file
diff --git a/README.md b/README.md
--- a/README.md
+++ b/README.md
@@ -1,10 +1,14 @@
 # mfcc-dotnet
 An FSharp/.NET library for MFCC audio feature extraction.
 
+# Usage
+Just call ``MFCC.compute(samples, num_filters, num_features)`` where ``samples`` is a ``double[]`` containing the raw amplitude samples, ``num_filters`` is the number of mel filters to apply, and ``num_features`` is the number of resulting features to take.
+
 # Implementation
 This MFCC implementation is partially based off of the guide at [Practical Cryptography](http://practicalcryptography.com/miscellaneous/machine-learning/guide-mel-frequency-cepstral-coefficients-mfccs/#computing-the-mel-filterbank).
 
 The specific steps it takes are:
+
 1. Compute a Hamming window using MathDotNet and apply it to the samples
 2. Perform an in-place FFT on the windowed samples using MathDotNet, then take the magnitude of each of the FFT outputs
 3. Compute the MFCC filterbank (or read from cached version)
@@ -14,6 +18,7 @@ The specific steps it takes are:
 
 # Performance
 I have tried to keep optimization in mind, but no promises are made regarding the speed or efficiency of the code.
+Anecdotally, I have successfully used this library in a real-time, GMM-based C# speaker recognition system so it should work.
 
 Some optimizations include:
 - The filterbank is cached after it's first created
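
Note on the MFCC.fs hunk: the change moves the `num_features` truncation from before the filters are applied to after the DCT. The sketch below illustrates why the order matters. It is not the library's code: the `dct` here is a hypothetical naive, unnormalised DCT-II standing in for the library's `dct`, and the log energies are made-up values.

```fsharp
// Hypothetical stand-in for the library's dct: a naive, unnormalised DCT-II.
let dct (input: float[]) : float[] =
    let n = Array.length input
    Array.init n (fun k ->
        input
        |> Array.mapi (fun i x ->
            x * cos (System.Math.PI / float n * (float i + 0.5) * float k))
        |> Array.sum)

// Made-up log filterbank energies for 8 mel filters (illustrative only).
let logEnergies = [| 2.0; 1.5; 1.0; 0.8; 0.6; 0.5; 0.4; 0.3 |]
let numFeatures = 4

// Old order: keep only the first numFeatures filter outputs, then transform.
let truncateFirst = logEnergies |> Array.take numFeatures |> dct

// New order: transform every filter output, then keep the leading coefficients.
let truncateLast = dct logEnergies |> Array.take numFeatures

printfn "truncate, then DCT: %A" truncateFirst
printfn "DCT, then truncate: %A" truncateLast
```

Both arrays have `numFeatures` entries, but in the old order the upper mel filters never contribute to the result, whereas in the new order every filter feeds into each retained coefficient.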