The idea of the project is to give interactive voice commands from our PC or our Raspberry Pi using the Whisper speech-to-text model.
We will speak a command that is recorded, converted to text with Whisper, and then parsed to carry out the appropriate action, which can range from running a program to applying voltage to the Raspberry Pi pins.
I am going to use an old Raspberry Pi 2 and a USB microphone, together with Whisper, the speech-to-text model recently released by OpenAI. At the end of the article you can read a little more about Whisper.
Everything is programmed in Python.
I leave you a demonstration of how it works in this video, controlling the PC by voice.
Assembly
To use it with a PC, we only need a microphone.
If you are going to set it up on the Raspberry Pi, you will need a USB microphone, because the jack on the board is output-only.
Since the general purpose of the tool is voice recognition, I find it very useful for integrating into the operation of other devices.
You will need:
- USB microphone
- Raspberry Pi with an operating system (Raspbian, for example)
- Electronics (LED, wires, 480 ohm resistor and breadboard)
We connect the LED to pin 17 (GPIO17, the numbering that gpiozero uses), which is the pin we will switch on and off in this experiment.
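As a rough check of the resistor value, Ohm's law gives the current through the LED. The sketch below assumes a 3.3 V GPIO output and a forward voltage of about 2.0 V for a red LED; the forward voltage is an assumption rather than a value measured on this build, so check the datasheet of your own LED. The function name `led_current_ma` is hypothetical.

```python
def led_current_ma(supply_v: float, forward_v: float, resistor_ohm: float) -> float:
    """Return the LED current in milliamps from Ohm's law: I = (Vs - Vf) / R."""
    if resistor_ohm <= 0:
        raise ValueError("resistor_ohm must be positive")
    return (supply_v - forward_v) / resistor_ohm * 1000


# Assumed values: 3.3 V GPIO level and ~2.0 V forward voltage (red LED).
current = led_current_ma(supply_v=3.3, forward_v=2.0, resistor_ohm=480)
print(f"{current:.2f} mA")  # prints "2.71 mA"
```

With these assumed figures the LED draws only a few milliamps, which is a modest load for a single GPIO pin.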
Code development
It is divided into three parts. First, the audio recording, which I took from GeeksforGeeks code because I am not familiar with those libraries. Second, the conversion of the audio to text with Whisper. Third, the processing of that text and the response on the Raspberry Pi.
In the test example I only interact with an LED, turning it on or making it blink, but we could extend the script to suit our needs.
I know this is a Raspberry Pi 2 and it will be much slower than a Raspberry Pi 4, but for testing it is fine.
Before you can get it working, you need to install the following:
    # Install Whisper
    pip install git+https://github.com/openai/whisper.git
    sudo apt update && sudo apt install ffmpeg

    # Needed for the audio recording
    python3 -m pip install sounddevice --user
    pip install git+https://github.com/WarrenWeckesser/wavio.git
    # The script also imports scipy (scipy.io.wavfile)
    python3 -m pip install scipy --user

    # If you are installing it on the Raspberry Pi,
    # grant permission to use the GPIO
    sudo apt install python3-gpiozero
    sudo usermod -aG gpio <username>
The full code

    #!/usr/bin/env python3
    import whisper
    import time
    from gpiozero import LED
    import sounddevice as sd
    from scipy.io.wavfile import write
    import wavio as wv


    def main():
        inicio = time.time()
        record_audio()
        model = whisper.load_model("tiny")
        result = model.transcribe("audio1.wav")
        words = result["text"].split()

        for word in words:
            # Strip commas and periods and lowercase the word
            word = word.replace(',', '').replace('.', '').lower()
            # Spanish voice commands: "enciende"/"encender" = turn on
            if word in ('enciende', 'encender'):
                encender()
                break
            # "parpadea"/"parpadear" = blink
            if word in ('parpadea', 'parpadear'):
                parpadear()
                break

        fin = time.time()
        print(fin - inicio)


    def encender():
        LED(17).on()


    def parpadear():
        light = LED(17)
        # Loops forever; stop the script with Ctrl+C
        while True:
            light.on()
            time.sleep(1)
            light.off()
            time.sleep(1)


    def record_audio():
        # Sampling frequency
        freq = 44100
        # Recording duration
        duration = 5
        # Start recorder with the given values
        # of duration and sample frequency
        recording = sd.rec(int(duration * freq),
                           samplerate=freq, channels=2)
        # Record audio for the given number of seconds
        sd.wait()
        # This will convert the NumPy array to an audio
        # file with the given sampling frequency
        write("audio0.wav", freq, recording)
        # Convert the NumPy array to audio file
        wv.write("audio1.wav", recording, freq, sampwidth=2)


    main()

    # Grant permission to use the GPIO
    # sudo apt install python3-gpiozero
    # sudo usermod -aG gpio <username>

    # Install Whisper
    # pip install git+https://github.com/openai/whisper.git
    # sudo apt update && sudo apt install ffmpeg
I have not been able to test it on the board because I do not have a microSD card for the Raspberry Pi, or a USB speaker to connect, but as soon as I test it I will correct any errors that may easily have slipped in.
Step-by-step explanation of the code
    #!/usr/bin/env python3

The shebang tells the system which language the script is written in and which interpreter to run it with. Although it may seem unimportant, leaving it out often causes errors when the script is executed directly.
Importing the libraries

    import whisper
    import time
    from gpiozero import LED
    import sounddevice as sd
    from scipy.io.wavfile import write
    import wavio as wv

whisper to work with the model; time, because I use it to measure how long the script takes to run; gpiozero to work with the GPIO pins of the Raspberry Pi; and sounddevice, scipy and wavio to record the audio.
Functions
I have created 4 functions:
- main()
- encender()
- parpadear()
- record_audio()
encender() ("turn on") simply applies voltage to pin 17 of the Raspberry Pi, where we have connected the LED for this test.
    def encender():
        LED(17).on()
parpadear() ("blink") is like encender(), but it makes the LED blink by turning it on and off inside a loop.
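    def parpadear():
        light = LED(17)
        # Loops forever; stop the script with Ctrl+C
        while True:
            light.on()
            time.sleep(1)  # sleep lives in the time module imported above
            light.off()
            time.sleep(1)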
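Because the loop above never returns, nothing after the call can run. A minimal sketch of a bounded variant follows. The `blink` helper and its `times` and `interval` parameters are hypothetical, and the on/off actions are passed in as plain callables so the example runs without a Raspberry Pi.

```python
import time
from typing import Callable


def blink(turn_on: Callable[[], None], turn_off: Callable[[], None],
          times: int = 3, interval: float = 1.0) -> None:
    """Switch the output on and off `times` times, then return."""
    for _ in range(times):
        turn_on()
        time.sleep(interval)
        turn_off()
        time.sleep(interval)


# Stand-in callables that record what would happen to the LED.
events = []
blink(lambda: events.append("on"), lambda: events.append("off"),
      times=2, interval=0.01)
print(events)  # prints ['on', 'off', 'on', 'off']
```

On the Raspberry Pi, the `on` and `off` methods of a gpiozero `LED` object could be passed in place of the stand-in lambdas.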
With record_audio() we record the audio file.
    def record_audio():
        # Sampling frequency
        freq = 44100
        # Recording duration
        duration = 5
        # Start recorder with the given values
        # of duration and sample frequency
        recording = sd.rec(int(duration * freq),
                           samplerate=freq, channels=2)
        # Record audio for the given number of seconds
        sd.wait()
        # This will convert the NumPy array to an audio
        # file with the given sampling frequency
        write("audio0.wav", freq, recording)
        # Convert the NumPy array to audio file
        wv.write("audio1.wav", recording, freq, sampwidth=2)
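To make the recording parameters concrete, the sketch below works out how many frames and bytes a 5 second, 44100 Hz, two-channel, 16-bit recording contains. It writes silence with the standard-library `wave` module instead of sounddevice and wavio, purely so that it runs without a microphone; the file name `silence.wav` is hypothetical.

```python
import os
import tempfile
import wave

FREQ = 44100       # samples per second, as in record_audio()
DURATION = 5       # seconds
CHANNELS = 2
SAMPLE_WIDTH = 2   # bytes per sample, matching sampwidth=2

frames = FREQ * DURATION
payload_bytes = frames * CHANNELS * SAMPLE_WIDTH

# Write silence (all-zero samples) as a stand-in for real audio.
path = os.path.join(tempfile.mkdtemp(), "silence.wav")
with wave.open(path, "wb") as wav_file:
    wav_file.setnchannels(CHANNELS)
    wav_file.setsampwidth(SAMPLE_WIDTH)
    wav_file.setframerate(FREQ)
    wav_file.writeframes(bytes(payload_bytes))

# Read the header back to confirm the frame count.
with wave.open(path, "rb") as wav_file:
    stored_frames = wav_file.getnframes()

print(frames, payload_bytes, stored_frames)  # prints 220500 882000 220500
```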
main() is the main function. Note that the only statement we have outside a function is the call to main() at the end of the script. This way, at startup the script imports the libraries and then makes the function call.
    def main():
        inicio = time.time()
        record_audio()
        model = whisper.load_model("tiny")
        result = model.transcribe("audio1.wav")
        words = result["text"].split()

        for word in words:
            word = word.replace(',', '').replace('.', '').lower()
            # Membership test: a bare `or 'encender'` would always be true
            if word in ('enciende', 'encender'):
                encender()
                break
            if word in ('parpadea', 'parpadear'):
                parpadear()
                break

        fin = time.time()
        print(fin - inicio)
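A common Python convention for that single call at the end of the script is to guard it, so the functions can be imported from another file without starting a recording. This is a minimal sketch of the convention, not something the script above does; the body of `main()` here is a stand-in.

```python
def main() -> str:
    """Stand-in for the real main(); returns a message instead of recording audio."""
    return "running voice control"


# Runs only when the file is executed directly, not when it is imported.
if __name__ == "__main__":
    print(main())
```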
We save the time at which execution starts and then call the record_audio function, which records our spoken instructions to an audio file (a .wav in this script) that we will later convert to text.
    inicio = time.time()
    record_audio()
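The same start/stop measurement can be wrapped in a small helper. The sketch below uses `time.perf_counter()`, a standard-library clock intended for measuring intervals; the `timed` helper is hypothetical, and `time.sleep` stands in for the recording and transcription work.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


# time.sleep is only a placeholder for the real work being measured.
result, elapsed = timed(time.sleep, 0.05)
print(f"took {elapsed:.2f} s")
```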
Once we have the audio, Whisper is called and we tell it which model we want to use. There were 5 sizes available at the time of writing (tiny, base, small, medium and large), and we will use tiny. It is the least accurate, but it is the fastest, and the audio will be simple, only 3 or 4 words.
    model = whisper.load_model("tiny")
    result = model.transcribe("audio1.wav")
With this we have the audio converted to text and stored in a variable. Let's modify it a little, splitting it into a list of words.
    words = result["text"].split()
And everything is ready to interact with our device. Now we just have to create the conditions we want.
If the audio contains word X, do Y. Since we have the words in a list, it is very easy to add more conditions.
    for word in words:
        word = word.replace(',', '').replace('.', '').lower()
        # Membership test: a bare `or 'encender'` would always be true
        if word in ('enciende', 'encender'):
            encender()
            break
        if word in ('parpadea', 'parpadear'):
            parpadear()
            break
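As the number of commands grows, the "word X, do Y" idea can also be expressed as a lookup table. The sketch below is one possible arrangement, not the article's method: the `COMMANDS` dictionary and the `dispatch` function are hypothetical, and the two actions return strings instead of driving the LED so the example runs on any machine.

```python
from typing import Callable, Dict, Optional


def encender() -> str:
    return "LED on"          # placeholder for LED(17).on()


def parpadear() -> str:
    return "LED blinking"    # placeholder for the blink loop


# Several spoken forms can point at the same action.
COMMANDS: Dict[str, Callable[[], str]] = {
    "enciende": encender,
    "encender": encender,
    "parpadea": parpadear,
    "parpadear": parpadear,
}


def dispatch(text: str) -> Optional[str]:
    """Run the action for the first known command word in text, if any."""
    for word in text.split():
        word = word.replace(",", "").replace(".", "").lower()
        action = COMMANDS.get(word)
        if action is not None:
            return action()
    return None


print(dispatch("Por favor, enciende la luz."))  # prints "LED on"
```

Adding a new voice command then only requires a new dictionary entry.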
The line

    word = word.replace(',', '').replace('.', '').lower()

converts each word from the audio to lowercase and removes commas and periods. This avoids errors in the comparisons.
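A transcription may also contain other punctuation, such as question or exclamation marks. A slightly more general sketch is shown below, assuming that stripping all ASCII punctuation plus the Spanish opening marks is acceptable for short commands; the `normalize` helper is hypothetical.

```python
import string

# ASCII punctuation plus the Spanish opening question and exclamation marks.
PUNCTUATION = string.punctuation + "¿¡"
_STRIP_TABLE = str.maketrans("", "", PUNCTUATION)


def normalize(word: str) -> str:
    """Lowercase a word and drop any punctuation attached to it."""
    return word.translate(_STRIP_TABLE).lower()


print([normalize(w) for w in "¡Enciende, por favor!".split()])
# prints ['enciende', 'por', 'favor']
```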
In each of the if conditions, when one of the words we have chosen is found, the script calls the function that does what we want.
This is where we tell it to activate the pin that lights the LED or makes it blink. It could just as well run some other code, or shut down the computer.
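One pitfall to watch for when writing these conditions in Python: an expression of the form `word == 'enciende' or 'encender'` does not compare the word against both strings. The sketch below contrasts it with a membership test; both function names are hypothetical.

```python
def buggy_match(word: str) -> bool:
    # Parsed as (word == "enciende") or "encender"; the non-empty string
    # "encender" is truthy, so the condition holds for every word.
    return bool(word == "enciende" or "encender")


def correct_match(word: str) -> bool:
    # Membership test against both accepted forms.
    return word in ("enciende", "encender")


print(buggy_match("hola"), correct_match("hola"), correct_match("encender"))
# prints True False True
```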
All of this is the basic idea. From here you can develop the project and improve it as you wish. Everyone can find a different use for it.
Things we can do with this setup
These are some ideas that came to mind for taking advantage of this setup. Once the skeleton is assembled, we can use it to trigger by voice anything we can think of: we can activate a relay that starts a motor, or we can launch a script that sends a message, an email, or anything else.
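As an illustration of launching another script from a recognized command, the sketch below uses the standard-library `subprocess` module. The `run_script` helper and the `NOTIFY` command are hypothetical; a one-line Python program stands in for whatever real script would send the message.

```python
import subprocess
import sys


def run_script(args: list) -> str:
    """Run a command and return its standard output as text."""
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return completed.stdout.strip()


# Hypothetical command: a one-line Python program stands in for a real script.
NOTIFY = [sys.executable, "-c", "print('message sent')"]

output = run_script(NOTIFY)
print(output)  # prints "message sent"
```

With `check=True`, a failing script raises `subprocess.CalledProcessError` instead of passing silently, which makes a broken voice command easier to notice.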
What is Whisper
Whisper is a speech recognition model that works in many languages and can also translate into English. It is what we know as speech-to-text, but in this case it is open source, released by the OpenAI team, the creators of DALL-E.