Commit b4b07aae authored by Adam Blank

Initial commit
# Instructions
In today's assignment, we will write a program to autotune `.wav` audio files. At a high level, autotune works by correcting original notes, which are unlikely to be exactly in tune, to the closest "in tune" note. We define which sound frequencies are "in tune" according to musical scales.
# Acknowledgements
Thanks to Adam's friend [Khue](https://thekhue.bandcamp.com) and Khue's friend [Ellen](https://open.spotify.com/artist/6WzPJBZy7pAs0265yHLejf) for the music samples!
## Steps
1. Write the `generate_scales` function. A scale is a sequence of notes, such as _"do, re, mi, ..."_. Given the starting note and the size of the steps between each note, we can generate the entire scale. For example, if `note = "D"` and the steps are `[2, 2, 1, 2]`, the scale would be `[D, E, Fs, G, A]` and the returned frequencies would be `[146.83, 164.81, 185.0, 196.0, 220.0]`. __Note:__ if a step goes off the end of the note list, make sure it wraps around to the beginning. (A sketch of one possible implementation appears after this list.)
2. Write the `find_peak` function. In this function, we are looking for the frequency with the largest _absolute_ amplitude in the array `wav`. Store the index of this amplitude in `wav` as `best_i`. To calculate the corresponding frequency, use the following formula: `freq = best_i * (samplerate // 2) / (len(fft_cut) - 1)`. (See the sketch after this list.)
3. Write the `find_closest_note` function. Given the peak frequency from a chunk of the original soundwave, we want to find the closest frequency that corresponds to a note in our scales. In other words, we'd like to find a frequency `f*` in `scale_comb` such that `|peak - f*| <= |peak - f|` for every other frequency `f` in `scale_comb`. (See the sketch after this list.)
4. Autotune something!

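Here is a minimal sketch of one way step 1's `generate_scales` could look, using the `NOTES`, `FREQUENCIES`, and `MAJOR` lists defined in the starter file. Treat it as a starting point under those assumptions, not the reference solution.

```python
def generate_scales(note):
    # Start at the given note, then follow the whole-/half-step pattern,
    # wrapping around the end of the 12-note list.
    i = NOTES.index(note)
    freqs = [FREQUENCIES[i]]
    for step in MAJOR:
        i = (i + step) % len(NOTES)
        freqs.append(FREQUENCIES[i])
    return freqs
```
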
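For step 2, a possible sketch of `find_peak`. Since the chunk's FFT is what gets passed in, `len(wav)` here plays the role of `len(fft_cut)` in the formula above; `samplerate` and `np` come from the starter file.

```python
def find_peak(wav):
    # Index of the bin with the largest absolute amplitude.
    best_i = int(np.argmax(np.abs(wav)))
    # Convert the bin index to a frequency in Hz.
    return best_i * ((samplerate // 2) / (len(wav) - 1))
```
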
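For step 3, a sketch of `find_closest_note` that assumes `scale_comb` is the NumPy array of scale frequencies built in the starter file:

```python
def find_closest_note(peak):
    # Return the scale frequency with the smallest absolute distance to the peak.
    return scale_comb[np.argmin(np.abs(scale_comb - peak))]
```
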
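Once the three functions are filled in, set `filename` at the top of the script to the `.wav` file you want to process and run the script; it writes the result next to the input as `<name>_tuned.wav`.
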
Congratulations on finishing the last assignment! Good luck on your Caltech journey and I hope to see you in the Fall!
import numpy as np
from scipy.fftpack import rfft
from scipy.signal import resample
import soundfile as sf
block_size = 4096 // 2
filename = 'ellen.wav'
scale = "D"
type = "Major"
# NOTES are the letter notes
NOTES = ["C", "Cs", "D", "Ds", "E", "F", "Fs", "G", "Gs", "A", "As", "B"]
# FREQUENCIES are the corresponding frequencies that produce each of the letter notes. For example, C corresponds to 130.81 Hz.
FREQUENCIES = [130.81, 138.59, 146.83, 155.56, 164.81, 174.61, 185.0, 196.0, 207.65, 220.0, 233.08, 246.94]
# MAJOR is the step pattern of a major scale: how many half-steps (positions in NOTES) to move between consecutive notes.
# Maj: W - W - H - W - W - W - H
MAJOR = [2, 2, 1, 2, 2, 2, 1]
# Generate a scale starting at the given note
def generate_scales(note):
    # TODO: return the list of frequencies for the scale starting at `note`.
    return []
# Broaden the range of the scales...
scales = [np.array(generate_scales(scale))]
for i in range(1, 4):
    scales.append(scales[0] * float(2**i))
    scales.append(scales[0] / float(2**i))
scale_comb = np.concatenate(scales)
# samplerate is in Hz
data, samplerate = sf.read(filename)
# Determine the number of channels; sf.read returns a 1-D array for mono audio.
try:
    ch = len(data[0,])
except:
    ch = 1
# If the file is stereo, mix it down to mono by averaging the two channels.
if ch != 1:
    L = data[:, 0]
    R = data[:, 1]
    n = len(data)
    data = L / 2.0 + R / 2.0
pieces = len(data) // block_size
chunk = len(data) // pieces
data_adj = []
# Find the fundamental (peak) frequency of a chunk's FFT.
def find_peak(wav):
    best_i = 0
    # TODO: set best_i to the index of the largest absolute amplitude in `wav`.
    return best_i * ((samplerate // 2) / (len(fft_cut) - 1))
# Find the frequency in scale_comb that is closest to the given peak frequency.
def find_closest_note(peak):
    # TODO: return the frequency in `scale_comb` closest to `peak`.
    return 0.0
# To autotune the input soundwave, we first cut the data into chunks so we can autotune each piece individually.
# Then, for each chunk, we Fourier transform the soundwave data into frequency data (using scipy.fftpack.rfft, an implementation of the fast Fourier transform).
# Given the frequencies, we can find the peak frequency, which (roughly) corresponds to the note being sung.
# To make the singer sound "better", we correct the original sound toward the closest note on our scale. We only do this if the closest note is indeed close (i.e. the correction factor is close to 1).
print('Autotuning...')
for i in range(pieces):
    data_cut = data[i * chunk : (i + 1) * chunk]
    fft_cut = rfft(data_cut, chunk)
    peak = find_peak(fft_cut)
    factor = 1.0
    if peak:
        factor = find_closest_note(peak) / peak
    # Only correct notes that are already reasonably close to an in-tune note.
    if factor < 0.8 or factor > 1.2:
        factor = 1.0
    # Resample the chunk to shift its pitch by the correction factor.
    data_adj.append(resample(data_cut, int(chunk / factor)))
print('Exporting...')
autotuned = np.concatenate(data_adj)
sf.write(filename[:-4] + '_tuned.wav', autotuned, samplerate, 'PCM_16')
print('Complete!')