author matthewsotoudeh <matthewsot@outlook.com> 2016-09-05 18:22:12 -0700
committer matthewsotoudeh <matthewsot@outlook.com> 2016-09-05 18:22:12 -0700
commit 97ef843a14cf81ad7d21735e49a54b715258a684
tree c60c329de760ac06e67962f50f0509ac8316d5be
parent 8194d4790500f796c04fe192844dd5c5dc66e900
updated the README with usage details
-rw-r--r-- MFCCDotNet/MFCCDotNet/MFCC.fs | 3
-rw-r--r-- README.md | 5
2 files changed, 6 insertions, 2 deletions
diff --git a/MFCCDotNet/MFCCDotNet/MFCC.fs b/MFCCDotNet/MFCCDotNet/MFCC.fs
index 782e795..b739165 100644
--- a/MFCCDotNet/MFCCDotNet/MFCC.fs
+++ b/MFCCDotNet/MFCCDotNet/MFCC.fs
@@ -79,8 +79,7 @@ module public MFCC =
let filters = compute_filterbank(num_filters, Array.length abs_output, 48000.0);
let mel_output = filters |>
- Seq.take(num_features) |>
Seq.map(fun filter -> log10(apply_and_sum_filter(abs_output, filter))) |>
Seq.toArray;
- dct(mel_output);
\ No newline at end of file
+ dct(mel_output) |> Array.take(num_features);
\ No newline at end of file
diff --git a/README.md b/README.md
index 43843dd..66333b4 100644
--- a/README.md
+++ b/README.md
@@ -1,10 +1,14 @@
# mfcc-dotnet
An FSharp/.NET library for MFCC audio feature extraction.
+# Usage
+Just call ``MFCC.compute(samples, num_filters, num_features)`` where ``samples`` is a ``double[]`` containing the raw amplitude samples, ``num_filters`` is the number of mel filters to apply, and ``num_features`` is the number of resulting features to take.
+
# Implementation
This MFCC implementation is partially based off of the guide at [Practical Cryptography](http://practicalcryptography.com/miscellaneous/machine-learning/guide-mel-frequency-cepstral-coefficients-mfccs/#computing-the-mel-filterbank).
The specific steps it takes are:
+
1. Compute a Hamming window using MathDotNet and apply it to the samples
2. Perform an in-place FFT on the windowed samples using MathDotNet, then take the magnitude of each of the FFT outputs
3. Compute the MFCC filterbank (or read from cached version)
@@ -14,6 +18,7 @@ The specific steps it takes are:
# Performance
I have tried to keep optimization in mind, but no promises are made regarding the speed or efficiency of the code.
+Anecdotally, I have successfully used this library in a real-time, GMM-based C# speaker recognition system so it should work.
Some optimizations include:
- The filterbank is cached after it's first created
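
Note on the MFCC.fs hunk in this commit: besides the README update, it moves the truncation to `num_features` from before the DCT (keeping only the first `num_features` filters) to after it (keeping the first `num_features` DCT outputs). The sketch below is only meant to illustrate that the two orderings give different results. The `dct2` function is a hypothetical stand-in, since the library's own `dct` is not shown in this diff and may be scaled differently, and the input values are made up.

```fsharp
// Hypothetical stand-in for the library's `dct` (not shown in the diff):
// a plain, unnormalized DCT-II.
let dct2 (xs: float[]) : float[] =
    let n = Array.length xs
    Array.init n (fun k ->
        xs
        |> Array.mapi (fun i x ->
            x * cos (System.Math.PI / float n * (float i + 0.5) * float k))
        |> Array.sum)

// Made-up log filterbank energies, one value per mel filter (6 filters).
let logEnergies = [| 1.0; 2.0; 4.0; 3.0; 2.5; 0.5 |]
let numFeatures = 3

// Old ordering: discard all but the first numFeatures filter outputs, then transform.
let before = logEnergies |> Array.take numFeatures |> dct2

// New ordering: transform every filter output, then keep the leading coefficients.
let after = logEnergies |> dct2 |> Array.take numFeatures

printfn "before: %A" before
printfn "after:  %A" after
```

Both arrays have `numFeatures` entries, but in the old ordering the higher mel filters never contribute to the result, whereas in the new ordering every filter contributes to each retained coefficient.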