Splitwise: Efficient Generative LLM Inference Using Phase Splitting

A study of optimizing LLM inference by splitting prompt computation and token generation onto separate machines to improve throughput, cost, and power efficiency.

1. Introduction

Large language models (LLMs) have transformed AI applications, but they face major deployment challenges because of their heavy compute and resource demands. The rapid adoption of LLMs across domains has driven unprecedented demand for GPUs, leading to a worldwide GPU shortage and power constraints in datacenters.

2. Background and Motivation

2.1 LLM Inference Characteristics

LLM inference consists of two distinct phases with different resource requirements:

  • Prompt computation phase: compute-intensive processing of all input tokens in parallel
  • Token generation phase: sequential generation of output tokens, bound by memory bandwidth

2.2 Hardware Limitations

GPU Specification Comparison (H100 relative to A100)

3.43× compute
1.64× memory bandwidth
2.16× cost
1.75× power
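
To make the mismatch concrete, the short sketch below divides the H100's performance gains by its cost and power increases. It uses only the four ratios listed above; the dictionary and function names are illustrative and not taken from the paper.

```python
from __future__ import annotations

# H100-to-A100 ratios quoted above (A100 is the 1.0 baseline).
H100_VS_A100 = {
    "compute": 3.43,
    "memory_bandwidth": 1.64,
    "cost": 2.16,
    "power": 1.75,
}


def relative_efficiency(ratios: dict[str, float]) -> dict[str, float]:
    """Return the H100's gains per unit of cost and power, relative to the A100."""
    return {
        "compute_per_cost": ratios["compute"] / ratios["cost"],
        "bandwidth_per_cost": ratios["memory_bandwidth"] / ratios["cost"],
        "compute_per_watt": ratios["compute"] / ratios["power"],
        "bandwidth_per_watt": ratios["memory_bandwidth"] / ratios["power"],
    }


if __name__ == "__main__":
    for name, value in relative_efficiency(H100_VS_A100).items():
        print(f"{name}: {value:.2f}x")
```

Values above 1 favor the H100 and values below 1 favor the A100. Under these ratios, compute per unit of cost improves (about 1.59×) while memory bandwidth per unit of cost drops (about 0.76×), which suggests the newer GPU pays off mainly for the compute-bound prompt phase.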

3. Splitwise Design

3.1 Phase Splitting Architecture

Splitwise proposes running the two inference phases on separate hardware platforms:

  • Prompt machines: high-end GPUs (H100) for compute-intensive prompt processing
  • Token machines: cost-effective GPUs (A100) for memory-bound token generation
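
The split described above can be sketched as a small router that sends each request's prompt phase to one machine pool and its token phase to another. This is a minimal illustration, not the paper's scheduler: the `Request` and `PhaseSplitRouter` classes, the machine names, and the round-robin policy are all assumptions made for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Request:
    request_id: int
    prompt_tokens: int
    output_tokens: int
    # Ordered record of where each phase ran (illustrative only).
    trace: list[str] = field(default_factory=list)


class PhaseSplitRouter:
    """Hypothetical router: prompt phase on one pool, token phase on another."""

    def __init__(self, prompt_machines: list[str], token_machines: list[str]):
        if not prompt_machines or not token_machines:
            raise ValueError("both pools need at least one machine")
        # Round-robin selection is an assumption, not the paper's policy.
        self._prompt_pool = cycle(prompt_machines)
        self._token_pool = cycle(token_machines)

    def serve(self, request: Request) -> Request:
        prompt_machine = next(self._prompt_pool)
        token_machine = next(self._token_pool)
        # Phase 1: process the whole prompt on a prompt machine.
        request.trace.append(f"prompt:{prompt_machine}")
        # Hand the intermediate state over to the token machine.
        request.trace.append(f"transfer:{prompt_machine}->{token_machine}")
        # Phase 2: generate output tokens on the token machine.
        request.trace.append(f"tokens:{token_machine}")
        return request


if __name__ == "__main__":
    router = PhaseSplitRouter(["h100-0", "h100-1"], ["a100-0"])
    print(router.serve(Request(1, prompt_tokens=512, output_tokens=64)).trace)
```

A real deployment would replace the round-robin choice with load-aware scheduling; the point here is only that each request touches two machines with a state handoff in between.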

3.2 Resource Management

The system uses optimized networking libraries and fast interconnects to transfer state efficiently between the two phases. The mathematical basis is a simple model of inference latency:

$L_{total} = L_{prompt} + n \times L_{token}$

where $n$ is the number of output tokens, $L_{prompt}$ is the prompt computation latency, and $L_{token}$ is the latency of generating each token.
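
The latency model can be evaluated directly. The sketch below is a plain transcription of the formula; the function name and the example latencies are hypothetical and chosen only to show how the token term dominates for long outputs.

```python
def total_latency(prompt_latency_s: float, token_latency_s: float,
                  num_output_tokens: int) -> float:
    """Evaluate L_total = L_prompt + n * L_token (all latencies in seconds)."""
    if num_output_tokens < 0:
        raise ValueError("num_output_tokens must be non-negative")
    return prompt_latency_s + num_output_tokens * token_latency_s


if __name__ == "__main__":
    # Hypothetical numbers: 0.2 s for the prompt, 30 ms per generated token.
    for n in (1, 100, 1000):
        print(n, round(total_latency(0.2, 0.03, n), 3))
```

With these illustrative inputs, a 100-token response takes about 3.2 s, of which only 0.2 s is prompt computation.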

4. Experimental Results

4.1 Performance Evaluation

Splitwise achieves significant improvements over conventional designs:

  • Up to 1.4× higher throughput than homogeneous clusters
  • 20% lower cost for equivalent performance
  • Up to 2.35× higher throughput under the same power and cost budgets

4.2 Cost and Power Analysis

The heterogeneous cluster design makes better use of resources, particularly in the token generation phase, which does not need the advanced compute capabilities of the latest GPUs.

5. Technical Analysis Framework

Core Insight

Splitwise challenges the industry's one-size-fits-all approach to GPU deployment. The research exposes a fundamental flaw in current LLM serving architectures: they treat inference as a single process even though it consists of two distinct computational patterns. This insight is as important as the original transformer paper's observations about attention mechanisms.

Logical Flow

The argument proceeds with mathematical precision: (1) characterize the two-phase nature of LLM inference, (2) demonstrate the hardware mismatch through the A100/H100 analysis, (3) propose phase splitting as a surgical solution, and (4) validate the results experimentally. This progression resembles the approach of seminal systems papers such as the one on Google's Borg cluster management system.

Strengths & Flaws

Strengths: The 2.35× throughput improvement under fixed cost and power budgets is transformative, comparable to the leap delivered by NVIDIA's tensor cores. The cost reduction addresses a major barrier to enterprise LLM adoption.

Flaws: The approach introduces network latency between the phases, creating a new potential bottleneck. As with early microservices architectures, the complexity of managing distributed state may outweigh the benefits for small deployments.
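
One way to reason about this flaw is to add a transfer term to the latency model from Section 3.2. The sketch below assumes $L_{total} = L_{prompt} + L_{transfer} + n \times L_{token}$ with $L_{transfer}$ equal to state size divided by link bandwidth, and it treats the transfer as fully serialized, which is an upper bound. The extension, the function names, and the example numbers are assumptions, not figures from the paper.

```python
def transfer_latency(state_bytes: float, link_bytes_per_s: float) -> float:
    """Time to move the intermediate state over the link (assumed model)."""
    if link_bytes_per_s <= 0:
        raise ValueError("link_bytes_per_s must be positive")
    return state_bytes / link_bytes_per_s


def transfer_overhead_fraction(prompt_latency_s: float, token_latency_s: float,
                               num_output_tokens: int,
                               transfer_latency_s: float) -> float:
    """Share of end-to-end latency spent on the phase handoff."""
    base = prompt_latency_s + num_output_tokens * token_latency_s
    total = base + transfer_latency_s
    return transfer_latency_s / total


if __name__ == "__main__":
    # Hypothetical: 1 GB of state over a 25 GB/s link.
    t = transfer_latency(1e9, 25e9)
    for n in (1, 10, 100):
        share = transfer_overhead_fraction(0.2, 0.03, n, t)
        print(f"n={n}: transfer share = {share:.1%}")
```

Under this model the handoff matters most for short responses, where there are few generated tokens to amortize it over.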

Actionable Insights

Cloud providers should implement phase splitting in their LLM offerings immediately. Enterprises building GPU clusters must adopt this heterogeneous approach or face a 20-40% cost penalty. The research suggests that we are entering an era of specialized AI hardware, much like the CPU/GPU split of the 2000s.

6. Future Applications and Directions

The phase-splitting concept extends beyond current LLMs to emerging architectures:

  • Multimodal models: split processing across encoders for different modalities
  • Mixture of experts: fast routing between specialized expert hardware
  • Edge deployment: splitting work between edge devices and cloud resources
  • Specialized hardware: custom ASICs for the token generation phase

7. References

  1. Vaswani, A., et al. "Attention Is All You Need." NeurIPS 2017.
  2. Brown, T., et al. "Language Models are Few-Shot Learners." NeurIPS 2020.
  3. NVIDIA Corporation. "NVIDIA H100 Tensor Core GPU Architecture." 2022.
  4. Verma, A., et al. "Large-scale cluster management at Google with Borg." EuroSys 2015.
  5. GPU Cloud Pricing. "AWS EC2 Instance Pricing." Accessed 2024.