
The idea of the project is to give voice instructions to interact with our PC or Raspberry Pi using the Whisper voice-to-text model.
We will speak a command, Whisper will transcribe it to text, and the text will then be analyzed to run the appropriate action, which can range from executing a program to putting voltage on the Raspberry Pi pins.
I am going to use an old Raspberry Pi 2 and a USB microphone, together with Whisper, the voice-to-text model recently released by OpenAI. At the end of the article you can read a little more about Whisper.
Everything is programmed in Python.
I leave you a demonstration of how it works in this video, controlling the PC by voice.
Assembly
To use it with the PC, we only need a microphone.
If you are going to set it up on the Raspberry Pi, you will need a USB microphone, because the jack it has is output only.
Since the general purpose of the tool is voice recognition, I find it very useful to integrate it into the operation of other devices.
You need:
- USB microphone
- Raspberry Pi with an operating system (Raspbian, for example)
- Electronics (LED, wires, 480 ohm resistor and breadboard)
We connect the LED to pin 17 (GPIO 17 in the Broadcom numbering that gpiozero uses), which is the one we will turn on and off for this experiment.
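As a quick sanity check of the wiring, the current through the LED can be estimated with Ohm's law. This is only a rough sketch: the 2.0 V forward voltage is an assumption for a typical red LED (check the datasheet of yours), and the function name is made up for the example.

```python
# Rough series-resistor check for the LED on GPIO 17.
GPIO_HIGH_VOLTS = 3.3      # Raspberry Pi GPIO logic level
LED_FORWARD_VOLTS = 2.0    # ASSUMPTION: typical red LED, check your datasheet
RESISTOR_OHMS = 480        # value from the parts list


def led_current_ma(supply_v, forward_v, resistor_ohms):
    """Ohm's law across the resistor: I = (Vsupply - Vf) / R, in milliamps."""
    return (supply_v - forward_v) / resistor_ohms * 1000


current = led_current_ma(GPIO_HIGH_VOLTS, LED_FORWARD_VOLTS, RESISTOR_OHMS)
print(f"Approximate LED current: {current:.1f} mA")
```

Under those assumptions the result is a few milliamps, which is a gentle load for a GPIO pin and enough for a visible glow.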



Code development
It is divided into three parts. First, the audio recording, for which I took the code from geeksforgeeks because I did not know those libraries. Second, the conversion of the audio to text with Whisper. Third, the handling of the text and the response on the Raspberry Pi.
In the test example I am only going to interact with an LED, making it light up or blink, but we could develop the script to fit our needs.
I know this is a Raspberry Pi 2 and that it will be much slower than a Raspberry Pi 4, but for testing it is fine.
Before you get it working, you will need to install the following:
    # Install Whisper
    pip install git+https://github.com/openai/whisper.git
    sudo apt update && sudo apt install ffmpeg

    # Needed for the audio recording to work
    python3 -m pip install sounddevice --user
    pip install git+https://github.com/WarrenWeckesser/wavio.git
    # scipy is imported by the script to write audio0.wav
    python3 -m pip install scipy --user

    # If you are going to install it on the Raspberry Pi:
    # grant permission to use the GPIO
    sudo apt install python3-gpiozero
    sudo usermod -aG gpio <username>
Full code

    #!/usr/bin/env python3
    import whisper
    import time
    from gpiozero import LED
    import sounddevice as sd
    from scipy.io.wavfile import write
    import wavio as wv


    def main():
        inicio = time.time()
        record_audio()
        # Load the smallest Whisper model and transcribe the recording
        model = whisper.load_model("tiny")
        result = model.transcribe("audio1.wav")
        words = result["text"].split()
        for word in words:
            # Lowercase and strip commas and periods before comparing
            word = word.replace(',', '').replace('.', '').lower()
            # Membership test: `word == 'a' or 'b'` would always be true
            if word in ('enciende', 'encender'):
                encender()
                break
            if word in ('parpadea', 'parpadear'):
                parpadear()
                break
        fin = time.time()
        print(fin - inicio)


    def encender():
        LED(17).on()


    def parpadear():
        light = LED(17)
        while True:
            light.on()
            time.sleep(1)
            light.off()
            time.sleep(1)


    def record_audio():
        # Sampling frequency
        freq = 44100
        # Recording duration in seconds
        duration = 5
        # Start the recorder with the given duration and sample frequency
        recording = sd.rec(int(duration * freq), samplerate=freq, channels=2)
        # Wait until the recording has finished
        sd.wait()
        # Convert the NumPy array to an audio file with the given sampling frequency
        write("audio0.wav", freq, recording)
        # Convert the NumPy array to the audio file that Whisper will read
        wv.write("audio1.wav", recording, freq, sampwidth=2)


    main()

    # Grant permission to use the GPIO
    # sudo apt install python3-gpiozero
    # sudo usermod -aG gpio <username>
    # Install Whisper
    # pip install git+https://github.com/openai/whisper.git
    # sudo apt update && sudo apt install ffmpeg
I have not been able to test it yet because I have neither a microSD card for the Raspberry Pi nor a USB microphone to connect, but as soon as I do I will fix any errors that may have slipped in.
Step-by-step explanation of the code

    #!/usr/bin/env python3

The shebang tells the system which language we have programmed in and which interpreter to use when the script is run directly. Although it seems unimportant, leaving it out causes errors on many occasions.
Imported libraries

    import whisper
    import time
    from gpiozero import LED
    import sounddevice as sd
    from scipy.io.wavfile import write
    import wavio as wv

whisper to work with the model; time, because I use it to measure how long the script takes to run; gpiozero to work with the GPIO pins of the Raspberry Pi; and sounddevice, scipy and wavio to record the audio.
Functions
I have written four functions:
- main()
- encender()
- parpadear()
- record_audio()
encender() ("turn on") simply puts voltage on pin 17 of the Raspberry Pi, where we have connected an LED for testing.

    def encender():
        LED(17).on()
parpadear() ("blink") is like encender(), but it makes the LED blink by switching it on and off inside a loop.

    def parpadear():
        light = LED(17)
        while True:
            light.on()
            time.sleep(1)   # sleep lives in the time module imported above
            light.off()
            time.sleep(1)
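Because parpadear() loops forever, the elapsed-time print at the end of main() is never reached once blinking starts. A bounded variant could look like the sketch below. It is not the author's code: blink(), its parameters and FakeLED are hypothetical names, and the fake object only records calls so the example runs without a Raspberry Pi.

```python
import time


def blink(light, times=3, interval=1.0, sleep=time.sleep):
    """Toggle `light` on and off `times` times, then return."""
    for _ in range(times):
        light.on()
        sleep(interval)
        light.off()
        sleep(interval)


class FakeLED:
    """Hypothetical stand-in that records calls instead of driving a pin."""

    def __init__(self):
        self.states = []

    def on(self):
        self.states.append("on")

    def off(self):
        self.states.append("off")


fake = FakeLED()
# Skip the real waiting so the demo finishes instantly
blink(fake, times=2, interval=0, sleep=lambda seconds: None)
print(fake.states)  # ['on', 'off', 'on', 'off']
```

On the Raspberry Pi the same function should accept a gpiozero LED(17) in place of the fake, since it only relies on the on() and off() methods.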
With record_audio() we record the audio file.

    def record_audio():
        # Sampling frequency
        freq = 44100
        # Recording duration in seconds
        duration = 5
        # Start the recorder with the given duration and sample frequency
        recording = sd.rec(int(duration * freq), samplerate=freq, channels=2)
        # Wait until the recording has finished
        sd.wait()
        # Convert the NumPy array to an audio file with the given sampling frequency
        write("audio0.wav", freq, recording)
        # Convert the NumPy array to the audio file that Whisper will read
        wv.write("audio1.wav", recording, freq, sampwidth=2)
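As far as I know, Whisper resamples whatever it receives to 16 kHz mono before transcribing, so a smaller mono file should be enough for short commands. The sketch below shows how float samples could be written as a 16-bit mono WAV using only the standard library. save_wav() and the file name are made up for the example, and the sine tone stands in for a real recording.

```python
import math
import struct
import wave


def save_wav(path, samples, rate=16000):
    """Write float samples in [-1.0, 1.0] as a mono 16-bit PCM WAV file."""
    frames = bytearray()
    for sample in samples:
        clipped = max(-1.0, min(1.0, sample))
        # "<h" packs one little-endian signed 16-bit integer
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)    # mono
        wav_file.setsampwidth(2)    # 2 bytes = 16 bits per sample
        wav_file.setframerate(rate)
        wav_file.writeframes(bytes(frames))


# One second of a 440 Hz tone as stand-in data for a real recording
rate = 16000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
save_wav("tone_demo.wav", tone, rate)

with wave.open("tone_demo.wav", "rb") as check:
    print(check.getnchannels(), check.getframerate(), check.getnframes())
```

With sounddevice, the equivalent change would be to record with samplerate=16000 and channels=1, which also suits USB microphones that only expose one channel.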
main is the main function. Note that the only thing we have outside the functions is the call to main() at the end of the script. That way, on startup, it imports the libraries and then calls the function.

    def main():
        inicio = time.time()
        record_audio()
        model = whisper.load_model("tiny")
        result = model.transcribe("audio1.wav")
        words = result["text"].split()
        for word in words:
            word = word.replace(',', '').replace('.', '').lower()
            # Membership test: `word == 'a' or 'b'` would always be true
            if word in ('enciende', 'encender'):
                encender()
                break
            if word in ('parpadea', 'parpadear'):
                parpadear()
                break
        fin = time.time()
        print(fin - inicio)
We store the time at which the function starts running and then call the recording function, which saves our instructions to an audio file (a .wav here) that we will later convert to text.

    inicio = time.time()
    record_audio()
Once we have the audio, we call Whisper and tell it which model we want to use. There are 5 available, and we will use tiny: although it is the least accurate, it is the fastest, and the audio will be simple, only 3 or 4 words.

    model = whisper.load_model("tiny")
    result = model.transcribe("audio1.wav")
With this we convert the audio to text and store it in a variable. Let us modify it a little.
We turn the result into a list containing each word of the audio.

    words = result["text"].split()
Now everything is ready to interact with our device. We only have to create the conditions we want.
If the audio contains word X, do Y. Since we have the words in a list, adding conditions is easy.

    for word in words:
        word = word.replace(',', '').replace('.', '').lower()
        # Membership test: `word == 'a' or 'b'` would always be true
        if word in ('enciende', 'encender'):
            encender()
            break
        if word in ('parpadea', 'parpadear'):
            parpadear()
            break
The line

    word = word.replace(',', '').replace('.', '').lower()

converts the words from the audio to lowercase and removes commas and periods. This way we avoid errors in the comparisons.
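Transcriptions can also contain other marks, such as exclamation or question marks, and the text may begin with a space. A slightly more general cleanup could be sketched as follows. normalize() is a hypothetical helper rather than part of Whisper, and the sample sentence is invented.

```python
import string

# Punctuation to strip; the inverted Spanish marks are added by hand
PUNCTUATION = string.punctuation + "¡¿"
STRIP_TABLE = str.maketrans("", "", PUNCTUATION)


def normalize(text):
    """Lowercase the text, drop punctuation, and split it into words."""
    return text.lower().translate(STRIP_TABLE).split()


print(normalize(" ¡Enciende la luz, por favor!"))
# ['enciende', 'la', 'luz', 'por', 'favor']
```

Cleaning the whole sentence once, before splitting, keeps the comparison loop free of string handling.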
In each if, when the condition is met because the audio contains one of the words we have chosen, it calls the function that does what we want.
This is where we tell it to activate the pin that lights the LED or makes it blink, to run some code, or to shut down the computer.
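When the list of commands grows, a lookup table can replace the chain of if statements. It also sidesteps a common Python pitfall: an expression such as `word == 'enciende' or 'encender'` is always truthy, because the non-empty string on the right counts as true. The sketch below uses placeholder handlers that return text instead of touching the GPIO; turn_on(), blink(), COMMANDS and dispatch() are illustrative names.

```python
def turn_on():
    """Placeholder handler; on the Pi this would switch the LED on."""
    return "LED on"


def blink():
    """Placeholder handler; on the Pi this would blink the LED."""
    return "LED blinking"


# Each spoken variant maps to the function that handles it
COMMANDS = {
    "enciende": turn_on,
    "encender": turn_on,
    "parpadea": blink,
    "parpadear": blink,
}


def dispatch(words, commands=COMMANDS):
    """Run the handler for the first known word; return None if none match."""
    for word in words:
        handler = commands.get(word)
        if handler is not None:
            return handler()
    return None


print(dispatch(["por", "favor", "enciende", "la", "luz"]))  # LED on
print(dispatch(["hola"]))                                   # None
```

Adding a new voice command then means adding one entry to the dictionary, with no change to the loop.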
All of this is the basic idea. From here you can develop the project and improve it however you like. Everyone can find a different use for it.
Things we can do with this setup
These are ideas that occur to me for getting more out of this setup. Once the skeleton is assembled, we can use it to trigger by voice anything that comes to mind: we can activate a relay that starts a motor, or launch a script that runs some code, sends an email, or whatever else.
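Launching a script from a voice command could be sketched with the standard subprocess module, as below. The word "saluda" and the command it runs are purely hypothetical; in practice the list would hold the path to your own script.

```python
import subprocess
import sys

# HYPOTHETICAL mapping from a spoken word to the command line to run.
# sys.executable is the Python interpreter running this script.
ACTIONS = {
    "saluda": [sys.executable, "-c", "print('hola')"],
}


def run_action(word, actions=ACTIONS):
    """Run the command mapped to `word` and return its output, or None."""
    command = actions.get(word)
    if command is None:
        return None
    done = subprocess.run(command, capture_output=True, text=True, check=True)
    return done.stdout.strip()


print(run_action("saluda"))  # hola
```

Keeping the commands as fixed lists, instead of building them from the transcribed text, means a misheard word can only fail to match and cannot run something unexpected.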
What is Whisper
Whisper is a speech recognition model that works multilingually with a large number of languages and can also translate into English. It is what we know as a voice-to-text tool, released by the OpenAI team, the creators of DALL-E.