Local AI on the mobile NPU: what it actually does and how far it goes

  • The NPU in a mobile SoC is a specialized neural-network accelerator that sits alongside the CPU and GPU and delivers more performance per watt on AI workloads.
  • Local AI reduces latency and improves privacy by processing data on the device, but it is limited by RAM, heat, battery and the size of the models it can handle.
  • Manufacturers are building ever more powerful NPUs into phones, PCs and cars, but most apps do not use them fully, so the CPU and GPU still do much of the work.
  • The near future points to a hybrid model: part of the AI runs locally on the NPU and part runs in the cloud, balancing speed, model quality and power consumption.


The idea of having a powerful AI model running directly on your phone, without the cloud, sounds great... until you actually try it. If you have a Galaxy S24 Ultra, download models such as Qwen 3.5 4B and run them with apps like PocketPal, Offgrid or ChatterUI, you run into a less appealing reality: 4 tokens per second, what feels like an eternity before the first token appears, a phone that gets very hot, and the feeling that your super SoC is nowhere near pushing its NPU the way the marketing promised.

At the same time, the industry keeps talking about NPUs, local AI, Copilot PCs, the Apple Neural Engine and so on. Manufacturers have been putting AI accelerators into their SoCs for years, in both phones and laptops, assuring us they are the future of personal computing. The problem is that with so many acronyms and promises it is easy to get lost: what does a phone's NPU actually do? Why does the CPU sometimes seem to perform better? When does it make sense to use cloud-based AI, and when is it worth relying on local AI?

What exactly is the NPU in a mobile SoC, and what role does it play in local AI?

In a modern smartphone, what we call the “processor” is really an SoC (System on a Chip). On the same piece of silicon you will find the CPU, the GPU, the ISP, the modem, security units… and, for some years now, an NPU or neural engine dedicated to AI. It does not replace the CPU or the GPU: it complements them for one particular kind of work.

The NPU (Neural Processing Unit) is a hardware block designed to run neural networks at high speed: thousands of multiply-and-add operations in parallel, on low-precision data (INT8, FP16, even INT4), with memory placed very close by to avoid wasting time moving weights and activations. It cannot "do a bit of everything" like a CPU, but what it can do, it does very efficiently.

That specialization is a good fit for almost everything we think of as AI today: computer vision, speech recognition, image classification, translation, language modeling and, in general, any modern neural network. Instead of overloading the CPU or waking the GPU for every AI task, the system sends that work to the NPU, which does it with less power and less heat.

In fact, most of the big manufacturers describe their NPUs in those terms. Qualcomm talks about more performance per watt on AI workloads; Huawei markets it as the key to doing more in less time without draining the battery; Apple describes it as a GPU-like engine for accelerating matrix multiplication and convolution; AMD and Intel integrate it into their CPUs to offload low-power AI tasks, while Samsung stresses that its NPU is optimized for parallel matrix operations and continuous learning from collected data.

NPUs: neither new nor exclusive to phones

It may look as if NPUs appeared out of nowhere now that generative AI is everywhere, but the truth is we have been living with them for nearly a decade without noticing. In 2017, Apple launched the iPhone X with Face ID and Animoji thanks to its A11 Bionic chip, which already had a dedicated "neural engine", although few people paid attention to the name at the time.

Since then, Apple has scaled the Apple Neural Engine generation after generation. The iPhone X's ANE delivered around 0.6 TOPS (trillions of operations per second) in FP16. Today the A17 Pro in the iPhone 15 Pro reaches about 35 TOPS, and the M4 chip for iPad and Mac goes up to roughly 38 TOPS. In other words, in a few years we have gone from a "token" neural engine to one capable of running models we used to see only in data centers.

Google has done something similar on its side with the TPU (Tensor Processing Unit): first in its data centers, with large chips for training neural networks, and then in Pixel phones with the Google Tensor family (Pixel 6, 7, 8…). There it integrates a TPU/NPU into the SoC to drive camera, voice and generative AI features on the device itself.

In the PC world, Intel and AMD have had to step up their game. Intel includes NPUs in its Core Ultra (Meteor Lake) processors, at roughly 8-12 TOPS, while AMD launched Ryzen AI in its Ryzen 7040 laptop processors, with up to 10 TOPS, and later advertised up to 39 TOPS on the short-lived Ryzen 8000 desktop processors (a figure that, in AMD's own specifications, appears to combine CPU, GPU and NPU, with the NPU itself rated at about 16 TOPS). The idea is the same: bring AI to the edge and rely less on the cloud for everything.

How an NPU works: why it is ideal for AI… and terrible at everything else

If we open up the chip in our minds, the NPU looks more like a matrix-multiplication factory than a classic CPU. Instead of a handful of very flexible cores, it has tens of thousands of simple ALUs arranged in a matrix or grid, able to perform multiply-accumulate (MAC) operations simultaneously, usually at low precision.
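To connect the number of MAC units to the TOPS figures quoted throughout, here is a minimal Python sketch of the usual peak-throughput estimate. It assumes each MAC counts as two operations (one multiply, one add) and that every unit is busy on every cycle. The unit count and clock speed are hypothetical, chosen only for illustration, and do not describe any specific chip.

```python
def peak_tops(mac_units: int, clock_hz: float) -> float:
    """Theoretical peak throughput in tera-operations per second.

    Assumes every MAC unit completes one multiply and one add
    (two operations) on every clock cycle, which real workloads
    rarely sustain.
    """
    return 2 * mac_units * clock_hz / 1e12


# Hypothetical NPU: 16,384 MAC units clocked at 1.2 GHz.
print(f"{peak_tops(16_384, 1.2e9):.1f} TOPS")
```

Vendor figures are normally quoted at the lowest supported precision, so a TOPS number on its own says little about sustained performance on a particular model.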


The trick is to arrange these units as a kind of systolic array. Data enters on one side and passes from cell to cell, and each cell does its small piece of work before handing the result to the next. This reduces accesses to main memory and maximizes utilization of the MAC units, which is exactly what a neural network needs during inference.
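The data flow described above can be sketched in a few lines of Python. This is a simplified timing model of an output-stationary systolic array, not a description of any vendor's hardware: each grid cell (i, j) owns one accumulator, and the operand pair for step s is assumed to reach that cell on cycle s + i + j, which mimics the skewed wavefront of data moving through the grid. The function name and layout are illustrative.

```python
def systolic_matmul(a, b):
    """Multiply a (n x k) by b (k x m) on a simulated n x m grid of MAC cells.

    Returns the product and the number of cycles the wavefront needs
    to cross the whole grid.
    """
    n, k, m = len(a), len(b), len(b[0])
    acc = [[0] * m for _ in range(n)]      # one accumulator per cell
    cycles = n + m + k - 2                 # last operand pair reaches the far corner
    for t in range(cycles):
        for i in range(n):
            for j in range(m):
                step = t - i - j           # which operand pair arrives at (i, j) now
                if 0 <= step < k:
                    acc[i][j] += a[i][step] * b[step][j]   # one MAC
    return acc, cycles


product, cycles = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product, cycles)
```

Each operand is fetched from memory once and then reused as it passes through the cells, which is where the savings in main-memory traffic come from.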

To achieve this efficiency, the NPU gives up many of the features that make a CPU or GPU expensive: it has no complex branch-prediction logic, no elaborate cache hierarchy, and no support for a full general-purpose instruction set. Its ISA is usually tiny: DMA for moving data, dot products, convolutions, activation functions and little else.

It also plays with numeric precision. While a traditional CPU or GPU is comfortable with 32-bit or 64-bit floating point, the NPU usually works in INT8, FP16 or even INT4. For a trained neural network, that level of precision is enough to give very good results, and it allows many more operations per cycle at much lower energy per operation.
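As an illustration of what dropping to INT8 means in practice, the sketch below applies symmetric per-tensor quantization to a short list of weights and then restores them. It is a minimal example under simplifying assumptions (a single scale for the whole tensor, no calibration data); production toolchains use more elaborate schemes, and the function names here are made up for the example.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the integer codes."""
    return [v * scale for v in q]


weights = [0.5, -1.0, 0.25, 0.0, 0.8]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, f"max error = {max_error:.5f}")
```

Each weight now occupies one byte instead of four, and the rounding error stays within half a quantization step, which trained networks usually tolerate well.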

CPU, GPU, NPU and TPU: who does what in AI

The CPU remains the "general-purpose brain": it runs the operating system, schedules tasks and executes control logic. It can run small models, but if you ask it to handle a large network or sustain continuous text generation, it becomes a bottleneck in both latency and power consumption.

The GPU is the workhorse of deep learning. Its graphics workload (many identical operations over large vectors) maps very well onto training and running neural networks. Today's GPUs also include dedicated tensor cores which, in effect, behave like small NPUs inside the GPU itself.

The NPU, on the other hand, is designed for AI inference only. It is no good for games, user interfaces or compiling code, but it is ideal for running vision, speech or language networks with an energy efficiency that a GPU cannot match in a phone or an ultralight laptop.

Google's TPUs are a close relative: ASICs focused on tensor operations to accelerate AI models, mainly in its data centers. The Edge TPU on the Coral Dev Board, for example, delivers about 4 TOPS at just a couple of watts. It is well suited to cameras and IoT devices that need real-time computer vision without overheating or drawing too much power.

In short, the ideal combination in a modern device is: CPU for general logic, GPU for graphics and flexible parallel computing, and NPU/TPU for neural networks. Each one does its own job and, when the software is well written, the system distributes the work quite intelligently.

Cloud AI vs. local AI: speed, privacy and cost

Until recently, almost everything we associated with "powerful AI" happened in the cloud: ChatGPT, Gemini, Stable Diffusion, advanced assistants... Phones acted only as a dumb terminal that sent data and received a response processed on a server full of GPUs or TPUs.

This design has an obvious advantage: you can use huge models without worrying about the power of the end user's device. A cheap low-end device and a top flagship get the same result, because the heavy lifting is done by the data center on dedicated hardware.

But it also has serious drawbacks. Latency depends entirely on connectivity: if you have poor coverage, are on a plane, or are in a town with unreliable ADSL, many features stop being "magic" and become completely useless. On top of that, every request means sending data to third parties and trusting that it will be handled properly.


Local AI plays exactly the opposite game: you bring the model to the device and run inference on the device's CPU, GPU or NPU. This removes network latency, enables offline AI and, most importantly, means your data does not have to leave the phone, laptop or car unless you want it to.

However, local AI is limited by what the hardware can handle: RAM, VRAM, thermal headroom, battery. A model with 70 billion parameters does not fit comfortably on a phone today; we have to use reduced, quantized and heavily optimized versions if we want something responsive and sustainable.
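A quick back-of-the-envelope calculation shows why. The Python sketch below estimates the memory needed for the weights alone, assuming every parameter is stored at the same bit width. It ignores the KV cache, activations and runtime overhead, so real requirements are higher. The helper name is illustrative.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in decimal gigabytes."""
    return params_billion * bits_per_weight / 8


for name, params, bits in [
    ("70B at FP16", 70, 16),
    ("70B at 4-bit", 70, 4),
    ("4B at 4-bit", 4, 4),
]:
    print(f"{name}: about {weights_gb(params, bits):.0f} GB")
```

Even at 4 bits per weight, a 70-billion-parameter model needs roughly 35 GB for its weights, far more than the RAM of current phones, while a 4-billion-parameter model at the same precision needs about 2 GB.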

Mobile NPUs: from the camera to the assistant, including local LLMs

In the smartphone world, NPUs have been working quietly for years on everything related to mobile photography and video, face recognition, voice and translation. Manufacturers have kept adding features on top of that.

In Apple's ecosystem, the Neural Engine handles Face ID, face and object detection in the gallery, dictation, live translation, text recognition in photos, AR and many other features we take for granted. With the A16 and A17 families and the M3/M4 family, Apple is starting to take steps so that Siri and other generative AI features run on the device itself without relying so heavily on the cloud, using those 30-40 TOPS of the Neural Engine.

Google, with its Tensor G2 and G3, does something similar on the Pixel. The Pixel 8, with its integrated TPU, can run reduced versions of models such as PaLM 2 or Gemini Nano on-device for tasks such as translation, reading web pages aloud, local summaries, smoother voice typing, or camera tricks like Best Take and Audio Magic Eraser, all on a chip that works without a constant need to send data to Google's servers.

Qualcomm, for its part, has used Hexagon NPU engines in the Snapdragon line for several generations. The Snapdragon 8 Gen 3 has an NPU that is 98% faster than the Gen 2's and can run LLMs of up to 10 billion parameters on the phone itself, with public demos of Stable Diffusion generating images at high speed and Llama 2 or Llama 3 running completely offline.

MediaTek is not far behind with its APUs (AI Processing Units) in the Dimensity line, with its sixth-generation APU enabling features such as real-time AI image reframing on phones like the Oppo Find X8, and signaling that this NPU technology will also reach TVs, IoT and even cars.

What is happening with NPUs in PCs and cars

In the PC arena, Microsoft has launched the "AI PC" category, relying on NPUs integrated into Intel, AMD and Qualcomm SoCs. Intel Core Ultra (Meteor Lake) includes an NPU of roughly 8-12 TOPS to accelerate Windows 11 features such as background blur, artificial eye contact, noise reduction and, in the future, parts of Copilot.

AMD introduced Ryzen AI in the Ryzen 7040 series for laptops, as well as Ryzen 8000 series desktop parts advertised at up to 39 TOPS of AI throughput. Although that desktop line has since been revised, the message is clear: the PC of the future will always have a dedicated AI block, just as it has had an integrated GPU for many years.

In the automotive industry, things get even more serious. Tesla has two generations of Full Self-Driving hardware with dual NPUs: HW3 delivered around 144 TOPS and HW4 is estimated at around 200-250 TOPS, all to process in real time the signals from multiple cameras and sensors and to run the neural networks that make driving decisions in milliseconds.

NVIDIA, with its Drive Thor platform, goes a step further: a single chip can reach 1,000 TOPS, or 2,000 TOPS with two linked together. It is designed to combine autonomous driving and in-cabin AI (voice assistants, driver monitoring, entertainment, etc.). The philosophy is the same: the more real-time AI you want to put in a car, the more sense a dedicated accelerator makes.

Beyond autonomous cars, NPUs dominate in security cameras, drones and robots: devices such as the Hailo-8 (26 TOPS at low wattage), Intel's Myriad and Google's Edge TPU enable computer vision at the edge without overloading networks or data centers.

Local AI on a "real" phone: PocketPal, MNN Chat and others


Beyond the features the manufacturer decides on, more and more users want to run their own language models locally on their phone, without going through ChatGPT, Gemini or similar apps. This is where apps such as PocketPal, Offgrid, ChatterUI or MNN Chat come in.

PocketPal is one of the most accessible options. It lets you download open-source models (Llama, Gemma, Phi, Qwen, Mistral…) in compact formats such as GGUF and run them directly on your phone, offline, with full privacy: prompts and responses never leave the device. All you need is a modern Android or iOS phone, some 6-8 GB of RAM and several gigabytes of free storage for the models.

In practice, models with between 1B and 4B parameters (such as Qwen2.5-1.5B, Llama 3.2 3B or Qwen3-4B-Instruct) run reasonably well on mid-range phones. Even so, typical performance tends to fall between 5 and 20 tokens per second with heavy quantization, which is still very far from what a server with a professional GPU can achieve.

For best performance, on iPhone it is advisable to use Metal and increase the number of GPU layers; on Android, some apps are beginning to take advantage of Vulkan, the GPU and, on rare occasions, the NPU through NNAPI. Even so, in most of these solutions the real load still falls on the CPU and GPU, and the NPU remains underused because the software layer is not yet mature.

The case of MNN Chat is illustrative: it is one of the fastest apps that many users have tried on the S24 Ultra, but at the cost of using heavily quantized models, with some loss of quality, and without it being clear whether it fully exploits the Snapdragon's NPU or "only" optimizes the CPU/GPU path.

Why your S24 Ultra does not get 100% out of its NPU with Qwen 3.5 4B

Although on paper the SoC in the S24 Ultra or S25 Ultra can handle models of up to 10 billion parameters and more than 40 TOPS of AI compute, when you load an LLM such as Qwen 3.5 4B in a typical app, the same thing tends to happen: it starts fast, then heats up, performance drops, and it settles well below what you expected.

The main reason is that, in most third-party apps, the model runs on the CPU or GPU using generic libraries (BLAS, Vulkan, Metal) without direct, fine-grained access to the SoC's NPU. On mobile devices the NPU is usually exposed through APIs such as NNAPI on Android or Core ML on iOS, but not all local LLM frameworks integrate well with them, and vendor support varies.

The result is that a simple test, such as the one Nexa AI showed with a high-end Galaxy generating text continuously, makes the behavior clear: when everything depends on the CPU, tokens per second are very high at first. But within a few minutes the temperature rises, the system lowers clock frequencies to avoid exceeding the thermal limit, and performance drops to a much lower but stable level.

When the workload really does move to the NPU, the profile changes: you do not see a spectacular burst at the start, but you do see token generation that is much flatter and more stable over time, with lower temperatures and less impact on battery life. The problem, for now, is finding a local LLM app that talks to that NPU easily.

On top of that, there are other physical limits that software cannot fix: the amount of available RAM, the SoC's memory bandwidth and the size of the model itself. On mobile devices, the "comfort zone" for an LLM tends to be quantized models of around 3-4 GB. Beyond that, load times, power consumption and throttling almost always increase.
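The role of memory bandwidth can be made concrete with a common rule of thumb: when generating text one token at a time, the device has to read essentially all of the model's weights for every token, so bandwidth divided by model size gives a rough ceiling on tokens per second. The sketch below applies that rule. The 60 GB/s figure is a hypothetical value chosen for illustration, not a measured specification of any phone, and real throughput will be lower once compute, KV-cache reads and throttling are included.

```python
def max_tokens_per_second(model_gb: float, bandwidth_gb_s: float) -> float:
    """Upper bound on decode speed if every token reads all weights once."""
    return bandwidth_gb_s / model_gb


# Hypothetical phone with 60 GB/s of usable memory bandwidth.
for size_gb in (1.0, 2.5, 3.5):
    bound = max_tokens_per_second(size_gb, 60.0)
    print(f"{size_gb} GB model: at most about {bound:.0f} tokens/s")
```

Under these assumptions, a 3-4 GB model cannot exceed the high teens in tokens per second no matter how many TOPS the NPU offers, which is consistent with the single- and low-double-digit speeds users report.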

So, although the marketing for chips such as the Snapdragon 8 Gen 3 or 8 Gen 4 talks about "10B LLMs on device", in practice the user experience with sizeable open models remains fragile, especially if the app was not designed from the ground up to get the most out of the NPU using the manufacturer's official SDKs.

Advantages and disadvantages of local AI on mobile


Running AI locally on mobile devices is very appealing. First, privacy. If the model is on the phone and no calls go out to external servers, everything you tell it stays there. This matters most for sensitive uses (personal notes, medical data, internal company documents, etc.).

Latency also works in your favor: you do not depend on the network, so a text summary, a quick translation or a small reasoning task arrives as fast as the hardware allows, wherever you are. Even on a subway with no signal or on a trip with no data, you still have a working assistant.

In addition, at scale, offloading work from the cloud reduces costs. Having millions of users send every query to a fleet of paid-for GPUs is not the same as moving some of those requests to NPUs that the user already paid for when buying the phone. That is why companies such as Qualcomm, MediaTek and Apple are pushing on-device AI so hard.

The price is paid elsewhere. Battery and temperature suffer if you run overly heavy models, the quality of small models does not yet reach the level of GPT-4 or Gemini Ultra, and the experience can be inconsistent while the software is still in its early stages: crashes, models that fail to load, frustratingly long times to first token…

That is why many brands are betting on a hybrid model. Light, fast, reactive tasks (basic translation, text correction, certain photo edits and shortcuts) are handled directly on the phone, while more complex requests or those that need more processing power are sent to the cloud. This gives a smooth, private experience without giving up the capabilities of more powerful hardware when needed.
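The routing decision behind such a hybrid setup can be sketched as a small policy function. Everything in this Python example is hypothetical: the task names, the 2,048-token threshold and the rule that sensitive or offline requests always stay on the device are illustrative assumptions, not how any particular vendor implements it.

```python
from dataclasses import dataclass

# Hypothetical set of light tasks that a small on-device model handles well.
LOCAL_TASKS = {"translate", "fix_text", "photo_edit", "shortcut"}
# Hypothetical prompt-size limit for the on-device model.
MAX_LOCAL_PROMPT_TOKENS = 2048


@dataclass
class Request:
    task: str
    prompt_tokens: int
    sensitive: bool = False   # e.g. medical notes, internal documents
    online: bool = True       # whether a network connection is available


def route(req: Request) -> str:
    """Return "local" or "cloud" for a request under the assumed policy."""
    if not req.online or req.sensitive:
        return "local"        # no network, or data must not leave the device
    if req.task in LOCAL_TASKS and req.prompt_tokens <= MAX_LOCAL_PROMPT_TOKENS:
        return "local"        # light task that fits the small model
    return "cloud"            # heavier work goes to the data center


print(route(Request("translate", 120)))
print(route(Request("long_report", 12_000)))
```

A real system would also weigh battery level, device temperature and user preferences, but the basic shape stays the same: default to the device when it is good enough, and escalate to the cloud only when the task demands it.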

Ultimately, the NPU's role is to make all of this viable: without a highly efficient AI core in the SoC, local AI would be an occasional luxury that drains the battery in minutes. With a mature NPU and good software, it becomes a seamless feature running in the background on your phone, computer or car while you simply notice that everything responds quickly and intelligently.

Looking at this picture, the feeling is clear: AI no longer lives only in the cloud or on the servers of the big tech companies; it is arriving directly in your pocket and on your desk. The NPU in a mobile SoC is not just for show: it is the silent engine that makes local AI fast, useful and private, although we still need progress in software and the ecosystem before anyone can get the most out of it without struggling or settling for 4 tokens per second.

