Voice control on PC and Raspberry Pi with Whisper

The idea of the project is to give voice commands to interact with our PC or our Raspberry Pi using the speech-to-text model Whisper.

We give a command that Whisper transcribes to text. The text is then analyzed to execute the matching order, which can range from running a program to applying voltage to a Raspberry Pi pin.

I am going to use an old Raspberry Pi 2, a USB microphone and the speech-to-text model recently released by OpenAI, Whisper. At the end of the article you can read a little more about Whisper.

Everything is programmed in Python.

I leave you a demonstration of how it works in this video, controlling the PC by voice.

Assembly

To use it with the PC, we only need a microphone.

If you are going to set it up on the Raspberry Pi, you will need a USB microphone, because the jack it has is output only.

The main purpose of the tool is voice recognition, so I find it very useful to integrate it into the operation of other devices.

We need:

  • USB microphone
  • Raspberry Pi with an operating system (Raspbian, for example)
  • Electronics (LED, wires, 480 ohm resistor and breadboard)

We connect the LED to pin 17 (GPIO 17 in the numbering gpiozero uses), which is the pin we will switch on and off in this example.
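As a rough check of the resistor value, the LED current follows from Ohm's law. This is only a sketch: it assumes the 3.3 V level of a Raspberry Pi GPIO pin and a forward voltage of about 2.0 V, which is typical of a red LED but is an assumption here, so check the datasheet of your own LED.

```python
# Rough LED current estimate using Ohm's law: I = (V_supply - V_forward) / R
GPIO_VOLTAGE = 3.3      # volts, logic level of a Raspberry Pi GPIO pin
LED_FORWARD_V = 2.0     # volts, ASSUMED typical value for a red LED
RESISTOR_OHMS = 480     # ohms, the resistor from the parts list


def led_current_ma(supply_v, forward_v, resistor_ohms):
    """Return the LED current in milliamps."""
    return (supply_v - forward_v) / resistor_ohms * 1000


current = led_current_ma(GPIO_VOLTAGE, LED_FORWARD_V, RESISTOR_OHMS)
print(f"LED current: {current:.1f} mA")
```

Under these assumptions the result is a few milliamps, a gentle load for a GPIO pin and still enough to see the LED light up.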

Code development

It is divided into three parts. The first is the audio recording, for which I took the code from geeksforgeeks, because I don't know those libraries. The second is the conversion of the audio to text with Whisper. The third is the processing of that text and the response on the Raspberry Pi.

In the test example I only interact with an LED, making it light up or blink, but we could extend the script to adapt it to our needs.

I know this is a Raspberry Pi 2 and it will be much slower than a Raspberry Pi 4, but it is fine for testing.

Before making it work, you will need to install the following

# Install Whisper
pip install git+https://github.com/openai/whisper.git
sudo apt update && sudo apt install ffmpeg

# Needed for the audio recording to work
python3 -m pip install sounddevice --user
python3 -m pip install scipy --user
pip install git+https://github.com/WarrenWeckesser/wavio.git

# If you are going to install it on the Raspberry Pi:
# grant permission to use the GPIO
sudo apt install python3-gpiozero
sudo usermod -aG gpio <username>
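Before running the script it can help to confirm that the packages were actually installed. A minimal sketch using only the standard library; the module names are the ones the script imports, and `missing_modules` is a helper name made up for this example.

```python
# Report which of the modules imported by the script can be found.
from importlib.util import find_spec

# Top-level module names taken from the script's import lines
REQUIRED_MODULES = ["whisper", "sounddevice", "scipy", "wavio", "gpiozero"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are installed.")
```

On a PC without GPIO pins, gpiozero will usually show up as missing, which is expected if you only want to test the transcription part.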

Full code

#!/usr/bin/env python3
import whisper
import time
from gpiozero import LED
import sounddevice as sd
from scipy.io.wavfile import write
import wavio as wv


def main():
    inicio = time.time()
    record_audio()

    # Load the smallest Whisper model and transcribe the recording
    model = whisper.load_model("tiny")
    result = model.transcribe("audio1.wav")
    words = result["text"].split()

    for word in words:
        # Lowercase and strip commas and periods before comparing
        word = word.replace(',', '').replace('.', '').lower()
        if word in ('enciende', 'encender'):    # Spanish for "turn on"
            encender()
            break
        if word in ('parpadea', 'parpadear'):   # Spanish for "blink"
            parpadear()
            break
    fin = time.time()
    # Elapsed seconds (never reached after parpadear(), which loops forever)
    print(fin - inicio)


def encender():
    # Drive GPIO 17 high to light the LED.
    # Note: gpiozero releases the pin when the script ends,
    # so the LED only stays lit while the script is running.
    LED(17).on()


def parpadear():
    light = LED(17)
    while True:
        light.on()
        time.sleep(1)
        light.off()
        time.sleep(1)


def record_audio():
    # Sampling frequency
    freq = 44100
    # Recording duration
    duration = 5
    # Start recorder with the given values
    # of duration and sample frequency
    recording = sd.rec(int(duration * freq),
                       samplerate=freq, channels=2)
    # Record audio for the given number of seconds
    sd.wait()
    # This will convert the NumPy array to an audio
    # file with the given sampling frequency
    write("audio0.wav", freq, recording)
    # Convert the NumPy array to audio file
    wv.write("audio1.wav", recording, freq, sampwidth=2)


main()



I have not been able to test it because I don't have a microSD for the Raspberry Pi, nor a USB speaker to connect, but as soon as I test it I will fix any error that may easily have slipped in.

Step-by-step explanation of the code

#!/usr/bin/env python3

The shebang tells the system which language the script is written in and which interpreter to use. It may look trivial, but leaving it out causes errors on many occasions.

Imported libraries

import whisper
import time
from gpiozero import LED
import sounddevice as sd
from scipy.io.wavfile import write
import wavio as wv

whisper, to work with the model;

time, because I use it to measure how long the script takes to run; gpiozero, to work with the Raspberry Pi's GPIO pins; and sounddevice, scipy and wavio, to record the audio.

Functions

I have created 4 functions:

  • main()
  • encender()
  • parpadear()
  • record_audio()

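encender() simply applies voltage to pin 17 of the Raspberry Pi, where in this case we have connected the LED for testing.

def encender():
    # gpiozero releases the pin when the script ends,
    # so the LED only stays lit while the script is running.
    LED(17).on()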

parpadear() is the same as encender(), but it makes the LED blink by turning it on and off inside a loop.

def parpadear():
    light = LED(17)
    while True:
        light.on()
        time.sleep(1)   # sleep lives in the time module imported above
        light.off()
        time.sleep(1)
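Because parpadear() loops forever, main() never gets past it. One possible variation is to blink a fixed number of times and then return. The sketch below is not the author's code: `blink` and `FakeLED` are names invented for this example, and `FakeLED` is only a stand-in so that it runs without a Raspberry Pi. On real hardware you would pass `LED(17)` from gpiozero instead.

```python
import time


class FakeLED:
    """Hypothetical stand-in for gpiozero.LED that records state changes."""

    def __init__(self):
        self.history = []

    def on(self):
        self.history.append("on")

    def off(self):
        self.history.append("off")


def blink(light, times=3, interval=1.0):
    """Turn `light` on and off `times` times, then return."""
    for _ in range(times):
        light.on()
        time.sleep(interval)
        light.off()
        time.sleep(interval)


led = FakeLED()
blink(led, times=2, interval=0.01)
print(led.history)
```

gpiozero's LED class also has its own blink() method, which may be the simpler choice once you are working on the real board.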

With record_audio() we record the audio file

def record_audio():
    # Sampling frequency
    freq = 44100
    # Recording duration
    duration = 5
    # Start recorder with the given values
    # of duration and sample frequency
    recording = sd.rec(int(duration * freq),
                       samplerate=freq, channels=2)
    # Record audio for the given number of seconds
    sd.wait()
    # This will convert the NumPy array to an audio
    # file with the given sampling frequency
    write("audio0.wav", freq, recording)
    # Convert the NumPy array to audio file
    wv.write("audio1.wav", recording, freq, sampwidth=2)
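To get a feel for the numbers in record_audio(), this sketch writes five seconds of silence with the same parameters using Python's standard `wave` module and reads the file back. It does not record anything and is not the recording method used above. The file name `silence_demo.wav` is just a placeholder.

```python
import os
import tempfile
import wave

FREQ = 44100        # samples per second, as in record_audio()
DURATION = 5        # seconds
CHANNELS = 2        # stereo
SAMPLE_WIDTH = 2    # bytes per sample (16-bit), matching sampwidth=2

n_frames = DURATION * FREQ                            # frames per channel
silence = bytes(n_frames * CHANNELS * SAMPLE_WIDTH)   # all-zero PCM data

# Placeholder file in the system's temporary directory
path = os.path.join(tempfile.gettempdir(), "silence_demo.wav")
with wave.open(path, "wb") as wav_file:
    wav_file.setnchannels(CHANNELS)
    wav_file.setsampwidth(SAMPLE_WIDTH)
    wav_file.setframerate(FREQ)
    wav_file.writeframes(silence)

with wave.open(path, "rb") as wav_file:
    frames_read = wav_file.getnframes()
    seconds = frames_read / wav_file.getframerate()

print(frames_read, "frames,", seconds, "seconds,", len(silence), "bytes of audio")
```

A five-second stereo recording at 44100 Hz is therefore a bit under 1 MB. For short voice commands a single channel would halve that.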

main is the main function. Note that the only thing we have outside the functions is the call to main() at the end of the script. That way, on startup, the script imports the libraries and then makes the function call.

def main():
    inicio = time.time()
    record_audio()

    model = whisper.load_model("tiny")
    result = model.transcribe("audio1.wav")
    words = result["text"].split()

    for word in words:
        word = word.replace(',', '').replace('.', '').lower()
        # "word == 'a' or 'b'" is always true, so test membership instead
        if word in ('enciende', 'encender'):
            encender()
            break
        if word in ('parpadea', 'parpadear'):
            parpadear()
            break
    fin = time.time()
    print(fin - inicio)

We save the time at which the function starts running and then call record_audio(), which records our command to an audio file (.wav here; .mp3 and other formats would also work) that we will then convert to text.

    inicio = time.time()
    record_audio()
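On a slow board it is useful to know which step takes the time, not just the total. A possible way to do that is a small context manager around each step. In this sketch `timed` is a helper invented for the example and the `time.sleep` calls are placeholders for the real work.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the elapsed seconds of the wrapped block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}
with timed("record", timings):
    time.sleep(0.05)   # placeholder for record_audio()
with timed("transcribe", timings):
    time.sleep(0.02)   # placeholder for model.transcribe(...)

for label, seconds in timings.items():
    print(f"{label}: {seconds:.3f} s")
```

The recording always takes the configured five seconds, so the interesting figures are the model loading and the transcription.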

  

Once we have the audio, Whisper is called and we tell it which model we want to use. There are 5 available and we will use tiny. It is the least accurate, but it is the fastest, and the audio will be simple, only three or four words.

    model = whisper.load_model("tiny")
    result = model.transcribe("audio1.wav")
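To make the trade-off between the five sizes concrete, this sketch picks the largest model that fits a given budget. The parameter counts are approximate figures as I recall them from the Whisper README, so treat them as assumptions. `largest_model_within` is a helper invented for this example.

```python
# Approximate sizes in millions of parameters (ASSUMED from the Whisper README)
MODEL_PARAMS_M = {
    "tiny": 39,
    "base": 74,
    "small": 244,
    "medium": 769,
    "large": 1550,
}


def largest_model_within(budget_m):
    """Return the biggest model whose parameter count fits the budget, or None."""
    candidates = [name for name, size in MODEL_PARAMS_M.items() if size <= budget_m]
    return max(candidates, key=MODEL_PARAMS_M.get) if candidates else None


print(largest_model_within(100))   # a small budget, e.g. for an old Raspberry Pi
print(largest_model_within(10))    # nothing fits
```

The returned name is the kind of string that whisper.load_model() takes, as in the call with "tiny" above.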

  

With this we have the audio converted to text and stored in a variable. Let's process it a little.

We convert the result into a list with each of the words in the audio

    words = result["text"].split()

  

And everything is ready to interact with our device. Now we just have to create the conditions we want.

If the audio contains word X, do Y. Since we have the words in a list, it is very easy to add conditions

    for word in words:
        word = word.replace(',', '').replace('.', '').lower()
        # "word == 'a' or 'b'" is always true, so test membership instead
        if word in ('enciende', 'encender'):
            encender()
            break
        if word in ('parpadea', 'parpadear'):
            parpadear()
            break
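As the number of commands grows, a chain of if statements gets long. A common alternative is a dictionary that maps each keyword to the function to run. This also sidesteps a classic Python pitfall: `word == 'a' or 'b'` is always true, because the non-empty string `'b'` counts as true on its own. In this sketch the two actions only return a text so that it runs without hardware, and `COMMANDS` and `dispatch` are names chosen for the example.

```python
def encender():
    return "LED on"        # placeholder for LED(17).on()


def parpadear():
    return "LED blinking"  # placeholder for the blink loop


# Each spoken keyword maps to the function that should run
COMMANDS = {
    "enciende": encender,
    "encender": encender,
    "parpadea": parpadear,
    "parpadear": parpadear,
}


def dispatch(words):
    """Run the action for the first known keyword; return its result or None."""
    for word in words:
        action = COMMANDS.get(word)
        if action is not None:
            return action()
    return None


print(dispatch(["por", "favor", "enciende", "la", "luz"]))
print(dispatch(["hola"]))
```

Adding a new voice command then only means adding one entry to the dictionary.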

  

The line

        word = word.replace(',', '').replace('.', '').lower()

I use it to convert the words in the audio to lowercase and to remove commas and periods. That way we avoid errors in the comparison.
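A transcription can also contain other marks, such as exclamation or question marks. A slightly more general sketch removes all of them in one pass with `str.translate`. The set of characters to strip is an assumption and `normalize` is a helper name made up for this example.

```python
import string

# ASCII punctuation plus the Spanish opening marks
PUNCTUATION = string.punctuation + "¡¿"
STRIP_TABLE = str.maketrans("", "", PUNCTUATION)


def normalize(text):
    """Lowercase the text, drop punctuation and split it into words."""
    return text.lower().translate(STRIP_TABLE).split()


print(normalize("¡Enciende la luz, por favor!"))
```

The resulting list can be compared against the keywords exactly as before.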

In each if, when the condition of finding one of the words we chose is met, it calls the function that does what we want.

This is where we tell it to activate the pin that lights the LED or makes it blink. It could just as well run some other code or shut down the computer.

All of this is the basic idea. From here you can take the project and improve it as you like. Each person can find a different use for it.

Things we can do with this setup

These are the ideas that come to mind to take advantage of this setup. Once the skeleton is in place, we can use it to activate by voice anything we can think of. We can switch a relay that starts a motor, or launch a script that runs a program, sends an email or whatever.
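Launching another script from a recognized word could look roughly like this. It is only a sketch: the keyword "saluda" and the command it runs are hypothetical, and `run_script` is a helper invented for the example.

```python
import subprocess
import sys


def run_script(args):
    """Run a command and return its standard output as text."""
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return completed.stdout.strip()


# HYPOTHETICAL mapping from a spoken keyword to a command line
SCRIPTS = {
    "saluda": [sys.executable, "-c", "print('hello from another script')"],
}

word = "saluda"   # pretend this came from the transcription
if word in SCRIPTS:
    output = run_script(SCRIPTS[word])
    print(output)
```

The command is passed as a list and only looked up from a fixed table, so the transcribed text itself never reaches a shell.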

What is Whisper

Whisper is a speech recognition model. It is multilingual, working with a large number of languages, and it can also translate into English. It is what we know as a speech-to-text tool, but this one is open source, released by the OpenAI team, the creators of DALL-E.
