2024.12.02
MediaTek Genio 130 with ChatGPT Spark - A solution based on MTK Genio 130 combined with ChatGPT features
With the explosive growth of artificial intelligence (AI) in 2022-2023, we have entered the AI era. In transportation, industry, finance, manufacturing, healthcare, and other fields, AI is widely used to solve problems and accelerate development. As AI enters daily life, a variety of AI tools and products are appearing on all kinds of smart devices.
ChatGPT is a well-known and widely used generative natural-language model, developed by OpenAI and launched in 2022. Users interact with it in natural human language, and it can also handle text, audio, images, and other multimedia input. Built on deep learning, it answers user queries in a way that is close to how a real person would.
Such advanced AI technology is widely used in various fields and scenarios. In the IoT space, MediaTek's Genio 130 is a single-chip solution that integrates an Arm Cortex-M33 MCU, Wi-Fi 6 and Bluetooth 5.2 connectivity subsystems, a power management unit (PMU), and an optional audio DSP. Combined with the OpenAI APIs, it enables a new generation of intelligent, connected AI devices for a range of IoT scenarios.
Figure 2: MediaTek Genio 130 block diagram
This article introduces the Genio 130 solution combined with ChatGPT features in three parts:
Genio 130 Environment & SDK Setup
OpenAI API Import & Behavioral Design
Demonstration of practical operation
Genio 130 Environment & SDK Setup
Figure 3: MediaTek Genio 130 EVK (image source: AcSip)
After setting up a Linux development environment (e.g., a VM running Ubuntu 20.04 LTS) and importing the Genio 130 SDK, you can start implementing the OpenAI features.
For more information on how to set up a Genio 130 development environment, build a project, and flash the project binary to the Genio 130 EVK, please refer to the blog post: MediaTek Genio 130/130A Quick Start (1)
Before integrating the OpenAI API, the following functions must be in place to meet its requirements. The Genio 130 SDK already provides some of them.
Audio data capture from microphone: Captures microphone audio.
Audio playback: Used to play OpenAI response content.
HTTP Client: Sends and receives network packets between the Genio 130 and the OpenAI server.
OpenAI API Import & Behavioral Design
The OpenAI developer documentation describes the various OpenAI APIs, each of which can be called from the Genio 130 with an HTTP request. The following is an example of an HTTP request using the Chat Completions API:
curl https://api.openai.com/v1/chat/completions \
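For readers who want to see the shape of a complete request, here is a minimal Python sketch that builds, but does not send, a Chat Completions POST. It is an illustration based on the public OpenAI API, not the firmware code used on the Genio 130. The function name `build_chat_request`, the placeholder key, and the default model name `gpt-4o-mini` are assumptions made for this example.

```python
import json
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"


def build_chat_request(api_key: str, user_text: str,
                       model: str = "gpt-4o-mini") -> urllib.request.Request:
    """Build (but do not send) a Chat Completions POST request.

    The model name is an assumption for this sketch; use whichever
    model your OpenAI account is allowed to call.
    """
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # "sk-PLACEHOLDER" is a dummy value, not a real API key.
    req = build_chat_request("sk-PLACEHOLDER", "Hello, please introduce yourself")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

On the device, the same URL, headers, and JSON body would be handed to the SDK's HTTP client instead of `urllib`.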
Note that using the OpenAI APIs requires registering an account with OpenAI and obtaining an OpenAI API key (fees apply).
For details, please refer to: OpenAI Platform
On the Genio 130, the design uses the button (SW2) on the EVK to trigger the microphone to capture audio. The audio is sent to the OpenAI server in an HTTP request, the server processes it and returns an audio response, and the audio playback function then plays the result on the speaker.
Figure 4: MediaTek Genio 130 EVK
Demonstration of practical operation
Next is the Genio 130 ChatGPT demonstration. Simply connect a speaker to the Genio 130 EVK, then connect the EVK to a power supply. The Genio 130 EVK quickly completes initialization and waits for the user's next action.
Figure 5: MediaTek Genio 130 EVK
Next, connect the Genio 130 EVK to a known Wi-Fi AP using the following series of Wi-Fi CLI commands. The AP profile can also be stored in the NVDM of the Genio 130 EVK, in which case it is applied automatically to the Wi-Fi connection after boot.
$ wifi init
$ wifi config set ssid 0 SSID
$ wifi config set sec 0 7 6
$ wifi config set psk 0 PASSWORD
$ wifi config set reload
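If the same AP settings need to be applied to several boards, the command sequence above can be generated from a script and pasted into the serial console. The sketch below is a convenience helper only. The function name `wifi_cli_commands` is hypothetical, and the literal values `0` and `7 6` are copied verbatim from the commands in this article without interpreting them.

```python
def wifi_cli_commands(ssid: str, password: str) -> list[str]:
    """Render the Wi-Fi CLI sequence shown in the article for one AP.

    The fixed arguments ("0" and "7 6") are taken as-is from the article;
    this helper only substitutes the SSID and the password.
    """
    return [
        "wifi init",
        f"wifi config set ssid 0 {ssid}",
        "wifi config set sec 0 7 6",
        f"wifi config set psk 0 {password}",
        "wifi config set reload",
    ]


if __name__ == "__main__":
    # Example values only; replace with the real AP name and passphrase.
    for line in wifi_cli_commands("MyAP", "MyPassphrase"):
        print(line)
```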
Next, start the ChatGPT service with the implemented ChatGPT CLI command:
$ chatgpt_start
Once that's done, we can press the SW2 button and ask a question using natural language: Hello, please introduce yourself.
The question is then processed by a chain of OpenAI API calls, audio/transcriptions --> chat/completions --> audio/speech, which completes one round of "conversation". The following device log shows the exchange:
[249093]<633>[common][I][openAI_chatGPT_task][1289]send audio data complete!
recv data_size:38, { "text": "Hello, please introduce yourself" }
[249637]<634>[common][I][openAI_chatGPT_task][1294]httpclient_post https://api.openai.com/v1/audio/transcriptions success ! req: Hello, please introduce yourself
[249639]<635>[common][I][openAI_chatGPT_task][1335]send chat request !
[249645]<636>[common][I][openAI_chatGPT_task][1351]send chat request complete!
recv data_size:757, { "id": "chatcmpl-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx", "object": "chat.completion", "created": 1724683334, "model": "gpt-4o-mini-2024-07-18", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "Hello! I'm a chatbot assistant designed to answer questions, provide information, and help with a variety of needs. Whether it's learning something new, looking for advice, writing a text, or anything in between, I can help. If you have any questions or needs, feel free to let me know!", "refusal": null }, "logprobs": null, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 13, "completion_tokens": 74, "total_tokens": 87 }, "system_fingerprint": "fp_507c9469a1" }
[251591]<637>[common][I][openAI_chatGPT_task][1355]httpclient_post https://api.openai.com/v1/chat/completions success ! req txt: Hello! I'm a chatbot assistant designed to answer questions, provide information, and help with a variety of needs. Whether it's learning something new, looking for advice, writing a text, or anything in between, I can help. If you have any questions or needs, feel free to let me know!
[251594]<638>[common][I][openAI_chatGPT_task][1397]send text!
[251601]<639>[common][I][openAI_chatGPT_task][1413]send text complete!
mp3_codec_start_play,829
[MP3 Codec]Open codec
[MP3 Codec]: mp3_decode_buffer 0x1067c0c8 (len 41000), mp3_codec_internal_handle 0x1057a1f0 (size 220), handle 0x1057a1f0
[MP3 Codec]mp3_codec_task_main create
[MP3 Codec Demo] first write data 4095.mp3_codec_start_play,848
[MP3 Codec Demo] play +
[MP3 Codec] mp3_codec_play_internal ++
[MP3 Codec] mp3_codec_play_internal --
[MP3 Codec Demo] play -
recv data done:total size:340800, this block:14400
[260847]<649>[common][I][openAI_chatGPT_task][1434]httpclient_post https://api.openai.com/v1/audio/speech success !
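The two JSON payloads in this log carry the text that moves between the stages: the transcription response holds the recognized question in `text`, and the chat response holds the answer in `choices[0].message.content`. The Python sketch below shows one way to pull those fields out. The sample strings are abbreviated copies of the logged responses, and the function names are hypothetical. On the device this parsing would be done in the firmware's own JSON handling.

```python
import json

# Abbreviated copies of the payloads seen in the log (content shortened).
TRANSCRIPTION_JSON = '{ "text": "Hello, please introduce yourself" }'
CHAT_JSON = """{ "object": "chat.completion", "model": "gpt-4o-mini-2024-07-18",
  "choices": [ { "index": 0,
                 "message": { "role": "assistant",
                              "content": "Hello! I'm a chatbot assistant.",
                              "refusal": null },
                 "finish_reason": "stop" } ],
  "usage": { "prompt_tokens": 13, "completion_tokens": 74, "total_tokens": 87 } }"""


def transcription_text(raw: str) -> str:
    """Return the recognized speech from an audio/transcriptions response."""
    return json.loads(raw)["text"]


def assistant_reply(raw: str) -> str:
    """Return the assistant's answer from a chat/completions response."""
    payload = json.loads(raw)
    return payload["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(transcription_text(TRANSCRIPTION_JSON))
    print(assistant_reply(CHAT_JSON))
```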
Another demonstration, this time asking: Calculate 952 plus 33 and divide by 2. Is there a decimal point? What is the decimal point?
[9323894]<699>[common][I][openAI_chatGPT_task][1289]send audio data complete!
recv data_size:76, { "text": "Is there a decimal point in calculating 952 plus 33 and dividing by 2? What is the decimal point?" }
[9324831]<700>[common][I][openAI_chatGPT_task][1294]httpclient_post https://api.openai.com/v1/audio/transcriptions success ! req: Is there a decimal point in calculating 952 plus 33 and dividing by 2? What is the decimal point?
[9324833]<701>[common][I][openAI_chatGPT_task][1335]send chat request !
[9324840]<702>[common][I][openAI_chatGPT_task][1351]send chat request complete!
recv data_size:707, { "id": "chatcmpl-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx", "object": "chat.completion", "created": 1724692409, "model": "gpt-4o-mini-2024-07-18", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "First, let's calculate \\( 952 + 33 \\). \n\\[\n952 + 33 = 985\n\\]\n\nNext, divide this result by 2:\n\n\\[\n\\frac{985}{2} = 492.5\n\\]\n\nTherefore, the result has a decimal point, and the decimal point is **0.5**.", "refusal": null }, "logprobs": null, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 27, "completion_tokens": 77, "total_tokens": 104 }, "system_fingerprint": "fp_f3db212e1c" }
[9326680]<703>[common][I][openAI_chatGPT_task][1355]httpclient_post https://api.openai.com/v1/chat/completions success ! req txt: First, let's calculate \\( 952 + 33 \\). \n\\[\n952 + 33 = 985\n\\]\n\nNext, divide this result by 2:\n\n\\[\n\\frac{985}{2} = 492.5\n\\]\n\nTherefore, the result has a decimal point, and the decimal point is **0.5**.
[9326683]<704>[common][I][openAI_chatGPT_task][1397]send text!
[9326690]<705>[common][I][openAI_chatGPT_task][1413]send text complete!
mp3_codec_start_play,829
[MP3 Codec]Open codec
[MP3 Codec]: mp3_decode_buffer 0x1067c0c8 (len 41000), mp3_codec_internal_handle 0x1057a1f0 (size 220), handle 0x1057a1f0
[MP3 Codec]mp3_codec_task_main create
[MP3 Codec Demo] first write data 4095.mp3_codec_start_play,848
[MP3 Codec Demo] play +
[MP3 Codec] mp3_codec_play_internal ++
[MP3 Codec] mp3_codec_play_internal --
[MP3 Codec Demo] play -
recv data done:total size:296640, this block:2880
[9330700]<715>[common][I][openAI_chatGPT_task][1434]httpclient_post https://api.openai.com/v1/audio/speech success !
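Both runs follow the same chain of three calls visible in the logs: audio/transcriptions, then chat/completions, then audio/speech. The Python sketch below models that chain with the HTTP layer injected as a callable, so the control flow can be exercised without network access. `run_conversation`, `Transport`, and `FakeTransport` are hypothetical names for illustration, and the fake responses are stand-ins, not real API output.

```python
from typing import Callable

BASE = "https://api.openai.com/v1"

# transport(endpoint_url, payload) -> response body.
# A stand-in for the device's HTTP client; hypothetical for this sketch.
Transport = Callable[[str, object], object]


def run_conversation(recorded_audio: bytes, transport: Transport) -> object:
    """Chain the three API calls in the order seen in the device log."""
    question = transport(f"{BASE}/audio/transcriptions", recorded_audio)  # speech -> text
    answer = transport(f"{BASE}/chat/completions", question)              # text -> reply text
    speech = transport(f"{BASE}/audio/speech", answer)                    # reply text -> audio
    return speech


class FakeTransport:
    """Offline test double that records which endpoints were called."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def __call__(self, endpoint: str, payload: object) -> object:
        self.calls.append(endpoint)
        if endpoint.endswith("/audio/transcriptions"):
            return "Hello, please introduce yourself"
        if endpoint.endswith("/chat/completions"):
            return "Hello! I'm a chatbot assistant."
        if endpoint.endswith("/audio/speech"):
            return b"fake-mp3-bytes"
        raise ValueError(f"unexpected endpoint: {endpoint}")


if __name__ == "__main__":
    transport = FakeTransport()
    audio_out = run_conversation(b"\x00\x01", transport)
    print(transport.calls)
    print(audio_out)
```

Keeping the transport separate from the chaining logic is a design choice for this sketch. It lets the sequence be checked on a PC before it is wired to the real HTTP client and the MP3 playback path.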
References:
MediaTek Genio 130 (MT7931/MT7933)
Datasheet: https://d86o2zu8ugzlg.cloudfront.net/mediatek-craft/documents/MT7933CT_Datasheet.pdF
Genio 130A (MT7933) EVK User Guide: https://mediatek-marketing.files.svdcdn.com/production/documents/EK-AI7933CLD_User-Guide-Ver.E.pdf?dm=1684470662
OpenAI
►Core technical advantages
MediaTek's Genio 130 (MT7931/MT7933) is built on an Arm Cortex-M33 processor running at up to 300MHz, with up to 8MB of built-in UHS PSRAM, to provide high-performance computing. It also offers WiFi 6 and BT 5.2 wireless connectivity with dual-band (2.4GHz and 5GHz) support. In addition, the MT7933 version of the Genio 130 has a built-in HiFi4 DSP, 3 ADCs, and 2 DAC channels that provide voice activity detection and trigger-word functions, making it suitable for developing IoT devices that support voice assistant cloud services.
►Solution specifications
MediaTek Genio 130 series (MT7931/MT7933) with:
• Arm Cortex-M33 processor, 300MHz clock
• Embedded 1MB SRAM and 8MB UHS (Ultra High Speed) PSRAM
• WiFi 6 and dual-band IEEE 802.11 a/b/g/n/ac/ax 2.4G/5G connectivity subsystem
• Bluetooth 5.2 connectivity subsystem
• Audio Cadence® Tensilica® HiFi4 DSP@600MHz (Note 1)
• Hardware cryptographic engine (AES/DES/3DES/SHA/ECC/TRNG)
• Power management unit
• USB 2.0 OTG support (Note 1)
• Abundant peripheral interfaces such as USB, SDIO, SPI master/slave, I2C, I2S, UART, AUXADC, PWM and more
• Up to 46 GPIOs
• FreeRTOS and Arduino development SDKs and multiple sample projects to shorten development time
Note 1: The HiFi4 DSP and USB 2.0 are functions supported by the MT7933.