Voice control on PC and Raspberry Pi with Whisper


The idea of this project is to give voice instructions to interact with our PC or Raspberry Pi, using the Whisper voice-to-text model.

We will speak a command that is recorded, converted to text with Whisper, and then analyzed to carry out the appropriate action, which can range from running a program to applying voltage to the Raspberry Pi pins.

I will use an old Raspberry Pi 2, a USB microphone, and the voice-to-text model recently released by OpenAI, Whisper. At the end of the article you can read a bit more about Whisper.

Everything is programmed in Python.

I leave you a demonstration of how it works in this video, controlling the PC by voice.

Assembly

To use it on the PC, we only need a microphone.

If you are going to set it up on the Raspberry Pi, you will need a USB microphone, because the jack it has is output only.
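Before recording, it can help to confirm that the system actually sees a capture device. This is a minimal sketch, assuming the device list has the shape returned by `sounddevice.query_devices()` (dicts with `name` and `max_input_channels` keys). The helper name `input_devices` and the sample data are made up for illustration.

```python
def input_devices(devices):
    """Return (index, name) for every device that can capture audio."""
    return [
        (index, dev["name"])
        for index, dev in enumerate(devices)
        if dev["max_input_channels"] > 0
    ]


# Hypothetical sample data; on real hardware you would pass
# list(sounddevice.query_devices()) instead.
sample = [
    {"name": "bcm2835 Headphones", "max_input_channels": 0},
    {"name": "USB PnP Sound Device", "max_input_channels": 1},
]
print(input_devices(sample))  # [(1, 'USB PnP Sound Device')]
```

If the list comes back empty on the Raspberry Pi, the USB microphone is not being detected and recording will fail.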

Requirements:

Since the general purpose of the tool is voice recognition, I find it very useful to integrate it into the operation of other devices.

  • USB microphone
  • Raspberry Pi with an operating system (for example, Raspbian)
  • Electronics (LED, wires, 480 ohm resistor and breadboard)

We connect the LED to pin 17, which is the one we will turn on and off in this experiment.
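As a quick sanity check on the resistor value, Ohm's law gives the current the LED will draw. This is a sketch under assumptions: the Raspberry Pi GPIO pins output 3.3 V, and the LED forward voltage of 2.0 V is my assumption for a typical red LED (check your LED's datasheet). The function name `led_current_ma` is illustrative.

```python
GPIO_VOLTAGE = 3.3  # Raspberry Pi GPIO pins drive 3.3 V


def led_current_ma(resistor_ohm, forward_voltage=2.0, supply=GPIO_VOLTAGE):
    """Current through the LED in milliamps, by Ohm's law."""
    # forward_voltage=2.0 is an assumed value for a red LED
    return (supply - forward_voltage) / resistor_ohm * 1000


print(round(led_current_ma(480), 2))  # 2.71
```

With those assumptions the 480 ohm resistor keeps the current under 3 mA, which is low but enough to see the LED light up.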

Code development

It is divided into three parts. First, the audio recording, for which I took the code from geeksforgeeks because I am not familiar with those libraries. Second, the conversion of the audio to text with Whisper. Third, the processing of the text and the response on the Raspberry Pi.

In the test example I will only interact with an LED, making it light up or blink, but we could develop the script to fit our needs.

I know this is a Raspberry Pi 2 and that it will be much slower than a Raspberry Pi 4, but it is fine for testing.

Before you get it running, you will need to install the following

# Install Whisper
pip install git+https://github.com/openai/whisper.git
sudo apt update && sudo apt install ffmpeg

# Needed for audio recording
python3 -m pip install sounddevice --user
python3 -m pip install scipy --user
pip install git+https://github.com/WarrenWeckesser/wavio.git

# If you are installing on the Raspberry Pi:
# grant permission to use the GPIO
sudo apt install python3-gpiozero
sudo usermod -aG gpio <username>

Full code

#!/usr/bin/env python3
import whisper
import time
from gpiozero import LED
import sounddevice as sd
from scipy.io.wavfile import write
import wavio as wv


def main ():
    inicio = time.time()
    record_audio ()

    # "tiny" is the smallest and fastest Whisper model
    model = whisper.load_model("tiny")
    result = model.transcribe("audio1.wav")
    words = result["text"].split()

    for word in words:
        # Lowercase and strip commas and periods before comparing
        word = word.replace(',', '').replace('.', '').lower()
        if word in ('enciende', 'encender'):
            encender()
            break
        if word in ('parpadea', 'parpadear'):
            parpadear()
            break
    fin = time.time()
    print(fin-inicio)

def encender ():
    # Drive pin 17 high to light the LED
    LED(17).on()

def parpadear ():
    # Blink the LED on pin 17 once per second, forever
    light = LED(17)
    while True:
        light.on()
        time.sleep(1)
        light.off()
        time.sleep(1)

def record_audio ():
    # Sampling frequency
    freq = 44100
    # Recording duration
    duration = 5
    # Start recorder with the given values
    # of duration and sample frequency
    recording = sd.rec(int(duration * freq),
                    samplerate=freq, channels=2)
    # Record audio for the given number of seconds
    sd.wait()
    # This will convert the NumPy array to an audio
    # file with the given sampling frequency
    write("audio0.wav", freq, recording)
    # Convert the NumPy array to audio file
    wv.write("audio1.wav", recording, freq, sampwidth=2)

main ()


# Grant permission to use the GPIO
# sudo apt install python3-gpiozero
# sudo usermod -aG gpio <username>

# Install Whisper
# pip install git+https://github.com/openai/whisper.git
# sudo apt update && sudo apt install ffmpeg

I have not been able to test it yet because I have neither a microSD card for the Raspberry Pi nor a USB microphone to connect, but as soon as I try it I will fix any errors, which could easily have slipped in.

Step-by-step explanation of the code

#!/usr/bin/env python3

The shebang tells the device which language we have programmed in and which interpreter to use. Although it may seem unimportant, leaving it out often causes errors.

Imported libraries

import whisper
import time
from gpiozero import LED
import sounddevice as sd
from scipy.io.wavfile import write
import wavio as wv

whisper, to work with the model;

time, because I use it to measure how long the script takes to run; gpiozero, to work with the GPIO pins of the Raspberry Pi; and sounddevice, scipy and wavio, to record the audio.

Functions

I have created four functions:

  • main()
  • encender()
  • parpadear()
  • record_audio()

encender() simply applies voltage to pin 17 of the Raspberry Pi, where we have connected the LED for testing.

def encender ():
    LED(17).on()

parpadear() is like encender(), but it makes the LED blink by turning it on and off in a loop.

def parpadear ():
    light = LED(17)
    while True:
        light.on()
        time.sleep(1)
        light.off()
        time.sleep(1)
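Because the loop in parpadear() never ends, the script never gets back to the timing lines in main() once blinking starts. Below is a sketch of a bounded variant. The name `blink_times` and its parameters are hypothetical, and the `FakeLED` class only stands in for `gpiozero.LED` so the example runs without hardware.

```python
import time


def blink_times(light, times=3, interval=1.0, sleep=time.sleep):
    """Turn `light` on and off `times` times, then return."""
    for _ in range(times):
        light.on()
        sleep(interval)
        light.off()
        sleep(interval)


class FakeLED:
    """Stand-in for gpiozero.LED that records calls instead of driving a pin."""

    def __init__(self):
        self.calls = []

    def on(self):
        self.calls.append("on")

    def off(self):
        self.calls.append("off")


led = FakeLED()
blink_times(led, times=2, interval=0)
print(led.calls)  # ['on', 'off', 'on', 'off']
```

On the Raspberry Pi you would pass `LED(17)` instead of `FakeLED()`. gpiozero's `LED` also has a `blink()` method with `on_time`, `off_time` and `n` arguments that covers the same need.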

With record_audio() we record the audio file

def record_audio ():
    # Sampling frequency
    freq = 44100
    # Recording duration
    duration = 5
    # Start recorder with the given values
    # of duration and sample frequency
    recording = sd.rec(int(duration * freq),
                    samplerate=freq, channels=2)
    # Record audio for the given number of seconds
    sd.wait()
    # This will convert the NumPy array to an audio
    # file with the given sampling frequency
    write("audio0.wav", freq, recording)
    # Convert the NumPy array to audio file
    wv.write("audio1.wav", recording, freq, sampwidth=2)
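To verify what record_audio() actually wrote, the standard library `wave` module can read the header of a PCM file such as audio1.wav. In this sketch the helper `wav_info` and the file name `check.wav` are illustrative, and the example writes its own second of silence so it runs without a microphone.

```python
import os
import tempfile
import wave


def wav_info(path):
    """Return (channels, sample_rate, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        duration = wav.getnframes() / rate
    return channels, rate, duration


# Write one second of 16-bit stereo silence so there is a file to inspect.
path = os.path.join(tempfile.gettempdir(), "check.wav")
with wave.open(path, "wb") as out:
    out.setnchannels(2)
    out.setsampwidth(2)
    out.setframerate(44100)
    out.writeframes(b"\x00" * (44100 * 2 * 2))

print(wav_info(path))  # (2, 44100, 1.0)
```

Calling `wav_info("audio1.wav")` after a recording should report 2 channels, 44100 Hz and about 5 seconds, matching the values set in record_audio().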

main() is the main function. Note that the only thing we have outside the functions is the call to main() at the end of the script. This way, on startup it imports the libraries and then calls the function.

def main ():
    inicio = time.time()
    record_audio ()

    model = whisper.load_model("tiny")
    result = model.transcribe("audio1.wav")
    words = result["text"].split()

    for word in words:
        word = word.replace(',', '').replace('.', '').lower()
        if word in ('enciende', 'encender'):
            encender()
            break
        if word in ('parpadea', 'parpadear'):
            parpadear()
            break
    fin = time.time()
    print(fin-inicio)

We store the time at which the function starts running and then call record_audio(), which records our instruction to an audio file (a .wav here) that we will later convert to text.

    inicio = time.time()
    record_audio ()

  

Once we have the audio, we call Whisper and tell it which model we want to use. There are 5 sizes; we will use tiny, even though it is the least accurate, because it is the fastest and the audio will be simple, only 3 or 4 words.

    model = whisper.load_model("tiny")
    result = model.transcribe("audio1.wav")
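Since the commands are short and always in one language, it may help to tell Whisper the language up front rather than let it detect it. This sketch assumes Spanish commands, as in the script. `language` and `fp16` are decoding options that `transcribe()` accepts, while the wrapper `transcribe_command` and the `FakeModel` stand-in (used so the example runs without downloading a model) are hypothetical.

```python
def transcribe_command(model, path, language="es"):
    """Transcribe a short command and return its text, stripped."""
    # fp16=False avoids the half-precision warning on CPU-only machines
    # such as the Raspberry Pi.
    result = model.transcribe(path, language=language, fp16=False)
    return result["text"].strip()


class FakeModel:
    """Stand-in for the object returned by whisper.load_model()."""

    def transcribe(self, path, **options):
        self.options = options
        return {"text": " Enciende la luz."}


model = FakeModel()
print(transcribe_command(model, "audio1.wav"))  # Enciende la luz.
```

With the real library you would pass `whisper.load_model("tiny")` as `model`.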

  

With this we convert the audio to text and store it in a variable. Let's transform it a little.

We turn the result into a list containing each word of the audio

    words = result["text"].split()

  

And everything is ready to interact with our device. Now we only need to create the conditions we want.

If the audio contains word X, do Y. Since we have the words in a list, it is easy to add conditions

    for word in words:
        word = word.replace(',', '').replace('.', '').lower()
        if word in ('enciende', 'encender'):
            encender()
            break
        if word in ('parpadea', 'parpadear'):
            parpadear()
            break
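One detail worth checking in conditions like these: in Python, `word == 'enciende' or 'encender'` is parsed as `(word == 'enciende') or 'encender'`, and a non-empty string is always truthy, so such a test passes for any word. A membership test or a lookup avoids that. The sketch below uses a lookup table; the names `COMMANDS` and `find_command` are illustrative, not part of the original script.

```python
# Hypothetical table mapping each spoken word to the action it triggers.
COMMANDS = {
    "enciende": "encender",
    "encender": "encender",
    "parpadea": "parpadear",
    "parpadear": "parpadear",
}


def find_command(words):
    """Return the action name for the first known word, or None."""
    for word in words:
        action = COMMANDS.get(word)
        if action is not None:
            return action
    return None


print(bool("hola" == "enciende" or "encender"))  # True, the pitfall
print(find_command(["por", "favor", "parpadea"]))  # parpadear
print(find_command(["hola"]))  # None
```

Adding a new voice command then only means adding an entry to the table.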

  

The line

         
        word = word.replace(',', '').replace('.', '').lower()


  

I use it to convert the words from the audio to lowercase and to remove commas and periods. This way we avoid errors in the comparisons
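Whisper may also emit other punctuation, such as question or exclamation marks, which the two replace() calls do not cover. Here is a sketch of a more general cleanup. `normalize` is a hypothetical helper, and including the Spanish opening marks `¿¡` is my assumption about what Spanish transcripts may contain.

```python
import string

# Characters to delete: ASCII punctuation plus the Spanish opening marks.
_PUNCTUATION = str.maketrans("", "", string.punctuation + "¿¡")


def normalize(word):
    """Lowercase a word and strip punctuation from it."""
    return word.translate(_PUNCTUATION).lower()


print([normalize(w) for w in "¡Enciende, la luz!".split()])
# ['enciende', 'la', 'luz']
```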

In each if, when the condition of finding one of the words we have chosen is met, it calls the function that does what we want.

This is where we tell it to activate a pin that turns on an LED or makes it blink, or to run some code, or to shut down the computer.

All of this is the basic idea. From here you can develop the project and improve it as you wish. Everyone can find different uses for it.

What we can do with this setup

These are the ideas that come to mind for taking advantage of this setup. Once the skeleton is assembled, we can use it to trigger by voice anything we can think of: we can activate a relay that starts a motor, or launch a script that sends a message, an email or whatever.
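On a PC, an action can simply be a program launched with `subprocess`. This sketch assumes a hypothetical table `PC_ACTIONS` mapping spoken words to command lines. The single entry only runs the current Python interpreter so the example is harmless; real entries (a relay script, a mail sender) would be your own.

```python
import subprocess
import sys

# Hypothetical mapping from a spoken word to the command line to run.
PC_ACTIONS = {
    "saluda": [sys.executable, "-c", "print('hola')"],
}


def run_action(word):
    """Run the command for `word`; return its output, or None if unknown."""
    command = PC_ACTIONS.get(word)
    if command is None:
        return None
    done = subprocess.run(command, capture_output=True, text=True, check=True)
    return done.stdout.strip()


print(run_action("saluda"))  # hola
print(run_action("otra"))    # None
```

Passing the command as a list rather than a single shell string keeps transcribed text from ever being interpreted by a shell.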

What is Whisper

Whisper is a speech recognition model. It is multilingual, works with a large number of languages, and can translate into English. It is what we know as a speech-to-text tool, released by the OpenAI team, the creators of Dall-e.
