Tutorial

Classification of Machine Learning Algorithms: Linear Regression, Classification and Clustering

Machine learning has a great deal in common with mathematical optimization, which supplies its methods, theory and application domains.

A machine learning task is formulated as a "minimization problem": a loss function is minimized over a given set of examples (the training set). The loss function expresses the gap between the values predicted by the model being trained and the expected values for each example.

The ultimate goal is to teach the model to predict correctly on instances that are not present in the training set.
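The idea of minimizing a loss over a training set can be sketched in a few lines. The example below uses mean squared error, which is one common loss and is an assumption here, since the text does not name a specific one. The toy model `predict`, its single weight `w`, the candidate weights and the data are all hypothetical.

```python
# Minimal sketch: a loss function measures the gap between predicted
# and expected values. Model, weights and data are hypothetical.

def predict(w, x):
    """Toy model: the output is proportional to the input."""
    return w * x

def mean_squared_error(w, examples):
    """Average squared difference between predictions and expected values."""
    return sum((predict(w, x) - y) ** 2 for x, y in examples) / len(examples)

training_set = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # (input, expected output)
held_out_set = [(4.0, 8.0), (5.0, 10.0)]              # instances not used for training

# "Training" here is a crude search: keep the candidate weight
# with the lowest loss on the training set.
candidates = [0.5, 1.0, 1.5, 2.0, 2.5]
best_w = min(candidates, key=lambda w: mean_squared_error(w, training_set))

train_loss = mean_squared_error(best_w, training_set)
held_out_loss = mean_squared_error(best_w, held_out_set)
print(best_w, train_loss, held_out_loss)
```

The loss on `held_out_set` is the quantity that reflects the real goal, since those instances were never used to choose the weight.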

One criterion for distinguishing the different families of algorithms is the type of output expected from a given machine learning system.

Among the main categories we find:

  • Classification: the inputs are divided into two or more classes, and the learning system must produce a model able to assign one or more of the available classes to an input. Tasks of this kind are typically handled with supervised learning techniques.

    An example of classification is assigning one or more labels to an image based on the objects or subjects it contains;

  • Regression: conceptually similar to classification, with the difference that the output has a continuous rather than a discrete domain. It is typically handled with supervised learning.

    An example of regression is estimating the depth of a scene from its representation as a color image.

    In this case the domain of the output is virtually infinite and is not limited to a fixed discrete set of possibilities;

  • Clustering: a set of data is divided into groups which, unlike in classification, are not known a priori. The very nature of problems in this category typically makes them unsupervised learning tasks.

Simple linear regression model

Linear regression is a widely used model for estimating real values such as:

  • the cost of houses,
  • the number of calls,
  • total sales per person,

based on input features such as:

  • floor area in square metres,
  • whether the person holds a current account,
  • the person's education.

In linear regression, the relationship between the independent variables and the dependent variable is captured by a line that best represents how the variables are related.

This best-fit line is known as the regression line and is represented by a linear equation of the form Y = a * X + b.

The formula is based on interpolating the data so that two or more features are associated with one another. When you give the algorithm an input feature, the regression returns the other feature.
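As a minimal sketch of how a line Y = a * X + b can be fitted, the example below uses ordinary least squares. That choice is an assumption, because the text does not say which fitting criterion is used. The function name `fit_line` and the house data are hypothetical.

```python
# Minimal sketch of simple linear regression: fit Y = a * X + b
# by ordinary least squares. The data points are hypothetical.

def fit_line(xs, ys):
    """Return slope a and intercept b that minimize the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    a = covariance / variance
    b = mean_y - a * mean_x
    return a, b

# Hypothetical data: house size (square metres) and price (thousands).
sizes = [50.0, 70.0, 90.0, 110.0]
prices = [150.0, 210.0, 270.0, 330.0]

a, b = fit_line(sizes, prices)

# Given one feature (size), the fitted line returns the other (price).
predicted_price = a * 100.0 + b
print(a, b, predicted_price)
```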

Multiple linear regression model

When we have more than one independent variable, we speak of multiple linear regression and assume a model like the following:


y = b0 + b1 * x1 + b2 * x2 + … + bn * xn

  • y is the response variable, i.e. the result predicted by the model;
  • b0 is the intercept, i.e. the value of y when all the xi are equal to 0;
  • the first coefficient b1 is the coefficient of x1;
  • likewise, bn is the coefficient of xn;
  • x1, x2, …, xn are the independent variables of the model.
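The equation and the terms listed above translate directly into code. This is only an illustrative sketch: the function name `predict` and the coefficient and feature values are hypothetical.

```python
# Minimal sketch of y = b0 + b1 * x1 + ... + bn * xn.
# All numeric values are hypothetical.

def predict(b0, coefficients, features):
    """Intercept plus the sum of each coefficient times its variable."""
    return b0 + sum(b * x for b, x in zip(coefficients, features))

b0 = 1.0
coefficients = [2.0, 0.5, -1.0]   # b1, b2, b3
features = [3.0, 4.0, 2.0]        # x1, x2, x3

y = predict(b0, coefficients, features)            # 1 + 6 + 2 - 2
y_at_zero = predict(b0, coefficients, [0.0] * 3)   # all xi = 0 leaves only b0
print(y, y_at_zero)
```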

In essence, the equation describes the relationship between a continuous dependent variable (y) and two or more independent variables (x1, x2, x3, …).

For example, suppose we want to estimate the CO2 emissions of a car (the dependent variable y) from its engine power, number of cylinders and fuel consumption. These three factors are the independent variables x1, x2 and x3. The constants bi are real numbers and are called the estimated regression coefficients of the model. y is the continuous dependent variable: being the sum of b0, b1 * x1, b2 * x2 and so on, y is a real number.
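The CO2 example can be sketched as a small least-squares fit. Everything specific here is an assumption made for illustration: the car data, the coefficients used to generate the emissions, the helper names `solve` and `fit_multiple`, and the choice of solving the normal equations by Gauss-Jordan elimination. In practice a numerical library would normally do this step.

```python
# Minimal sketch: estimate b0..b3 for CO2 = b0 + b1*power + b2*cylinders + b3*consumption.
# Data and "true" coefficients are hypothetical.

def solve(matrix, vector):
    """Solve a square linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(vector)
    aug = [row[:] + [v] for row, v in zip(matrix, vector)]   # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_multiple(rows, targets):
    """Least-squares coefficients [b0, b1, ..., bn] from the normal equations."""
    design = [[1.0] + list(row) for row in rows]   # leading 1 carries the intercept b0
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical cars: (engine power in hundreds of hp, cylinders, fuel consumption).
cars = [(1.0, 4.0, 5.0), (1.5, 4.0, 7.0), (2.0, 6.0, 6.0),
        (1.2, 3.0, 4.0), (2.5, 8.0, 9.0), (0.8, 3.0, 6.0)]

# Emissions generated from made-up coefficients so the fit can be checked.
true_b = [20.0, 15.0, 5.0, 10.0]
co2 = [true_b[0] + true_b[1] * p + true_b[2] * c + true_b[3] * f for p, c, f in cars]

coefficients = fit_multiple(cars, co2)
print(coefficients)
```

Because the emissions were generated without noise, the fit should recover the made-up coefficients up to rounding error. Real measurements would not fit exactly.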

Multiple regression analysis is a method for identifying the effect that the independent variables have on the dependent variable.

Understanding how the dependent variable changes as the independent variables change lets us predict the effects or impact of changes in real situations.

Using multiple linear regression it is possible to understand how blood pressure changes as body mass index changes, taking into account factors such as age, sex and so on, and therefore to hypothesize what could happen.

With multiple regression we can also obtain estimates of price trends, such as the future price of oil or gold.

Finally, multiple linear regression is attracting great interest in machine learning and artificial intelligence, because it yields well-performing learning models even when there is a large number of records to analyze.

Logistic Regression Model

Logistic regression is a statistical tool for modeling a binomial outcome with one or more explanatory variables.

It is generally used for binary problems, where there are only two classes, for example Yes or No, 0 or 1, male or female, and so on.

In this way it is possible to describe the data and explain the relationship between a binary dependent variable and one or more nominal or ordinal independent variables.

The result is determined through a logistic function, which estimates a probability and then assigns the class (positive or negative) closest to the probability value obtained.

We can consider logistic regression a classification method belonging to the family of supervised learning algorithms.

Using statistical methods, logistic regression produces a result that represents the probability that a given input value belongs to a given class.

In binomial logistic regression problems, the probability that the output belongs to one class is P, while the probability that it belongs to the other class is 1 - P (where P is a number between 0 and 1, because it expresses a probability).

Binomial logistic regression works well in all cases where the variable we are trying to predict is binary, that is, it can take only two values: the value 1, which represents the positive class, or the value 0, which represents the negative class.

Examples of problems that can be solved with logistic regression are:

  • whether an e-mail is spam or not;
  • whether an online purchase is fraudulent or not, by evaluating the conditions of the purchase;
  • whether a patient has a fracture, by evaluating their X-rays.

With logistic regression we can carry out predictive analysis, measuring the relationship between what we want to predict (the dependent variable) and one or more independent variables, i.e. the features. The probability is estimated through a logistic function.

The probabilities are then converted into binary values: to turn the estimate into a concrete prediction, the result is assigned to whichever class it is closer to.

For example, if the logistic function returns 0.85, the input is treated as belonging to the positive class and is assigned to class 1. Conversely, if it returns a value such as 0.4, or more generally any value below 0.5, the input is assigned to the negative class, class 0.
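The step from probability to class can be sketched as a simple threshold. The function name `to_class` is hypothetical, and assigning a probability of exactly 0.5 to class 1 is an assumption, since the text does not cover that boundary case.

```python
# Minimal sketch: turn a predicted probability into a binary class.

def to_class(probability, threshold=0.5):
    """Return 1 (positive class) at or above the threshold, otherwise 0 (negative class)."""
    return 1 if probability >= threshold else 0

print(to_class(0.85))  # positive class
print(to_class(0.4))   # negative class
```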

Logistic regression uses a logistic function to evaluate the classification of the input values.

The logistic function, also called the sigmoid, is a curve that can take any real number and map it to a value between 0 and 1, without ever reaching either extreme. The function is:

f(x) = 1 / (1 + e^-(b0 + b1 * x))

where:

  • e: the base of the natural logarithms (Euler's number, or the Excel function EXP());
  • b0 + b1 * x: the actual numeric value that you want to transform.
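A minimal sketch of the logistic (sigmoid) function follows, using only Python's standard `math` module. The coefficient values `b0` and `b1` and the sample inputs are hypothetical.

```python
# Minimal sketch of the logistic (sigmoid) function.
import math

def sigmoid(value):
    """Map any real number to a value strictly between 0 and 1."""
    # Note: math.exp overflows for very large negative inputs (around -710 and below);
    # this sketch assumes moderate values.
    return 1.0 / (1.0 + math.exp(-value))

b0, b1 = -1.0, 2.0   # hypothetical coefficients

for x in (-3.0, 0.0, 0.5, 3.0):
    value = b0 + b1 * x            # the number to transform
    print(x, sigmoid(value))
```

With these coefficients the input x = 0.5 gives b0 + b1 * x = 0, which the sigmoid maps to exactly 0.5, the midpoint of the curve.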

Representation used for logistic regression

Logistic regression uses an equation as its representation, much like linear regression.

The input values (x) are combined linearly using weights or coefficient values to predict an output value (y). A key difference from linear regression is that the output being modeled is a binary value (0 or 1) rather than a numeric value.

Here is an example of a logistic regression equation:

y = e^(b0 + b1 * x) / (1 + e^(b0 + b1 * x))

Where:

  • y is the dependent variable, i.e. the predicted value;
  • b0 is the bias or intercept term;
  • b1 is the coefficient for the single input value (x).

Each column in the input data has an associated b coefficient (a constant real value) that must be learned from the training data.

The actual representation of the model that you would store in memory or in a file is the set of coefficients in the equation (the beta or b values).
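To illustrate that the stored model is nothing more than its coefficients, the sketch below keeps one b per input column plus b0, saves them as JSON and reloads them. The dictionary layout, the coefficient values, the function name `predict_probability` and the use of JSON are all assumptions made for illustration.

```python
# Minimal sketch: a logistic regression model stored as its coefficients.
import json
import math

# Hypothetical learned coefficients: b0 plus one b per input column.
model = {"b0": -4.0, "b": [0.8, 1.5]}

def predict_probability(model, row):
    """Apply y = e^z / (1 + e^z) with z = b0 + b1 * x1 + b2 * x2 + ..."""
    z = model["b0"] + sum(b * x for b, x in zip(model["b"], row))
    return math.exp(z) / (1.0 + math.exp(z))

# Saving and restoring the model only involves the coefficients.
saved = json.dumps(model)
restored = json.loads(saved)

row = [2.0, 1.0]   # one hypothetical input row with two columns
p_before = predict_probability(model, row)
p_after = predict_probability(restored, row)
print(p_before, p_after)
```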

Logistic regression predicts probabilities (technical interlude)

Logistic regression models the probability of the default class.

For example, suppose we are modeling people's sex as male or female from their height. The first class could be male, and the logistic regression model could be written as the probability of being male given a person's height, or more formally:

P(sex = male | height)

Written another way, we are modeling the probability that an input (X) belongs to the default class (Y = 1), which we can write as:

P(X) = P(Y = 1 | X)

The predicted probability must be converted into a binary value (0 or 1) in order to actually make a class prediction.

Logistic regression is a linear method, but the predictions are transformed using the logistic function. The consequence is that we can no longer read the predictions as a linear combination of the inputs, as we can with linear regression. Continuing from above, the model can be stated as:

p(X) = e^(b0 + b1 * X) / (1 + e^(b0 + b1 * X))

We can now rearrange the equation as follows. To invert it, we solve for the exponential term and then remove the e by taking the natural logarithm (ln) of both sides.

ln(p(X) / (1 - p(X))) = b0 + b1 * X

This shows that the calculation on the right-hand side is linear again (just as in linear regression), while the left-hand side is the natural logarithm of the odds of the default class.

The odds are calculated as the probability of the event divided by the probability of no event, e.g. 0.8 / (1 - 0.8), which gives odds of 4. So we can write:

ln(odds) = b0 + b1 * X

Because the odds are log-transformed, we call this left-hand side the log-odds or logit.

We can move the exponent back to the right-hand side and write it as:

odds = e^(b0 + b1 * X)

All of this helps us understand that the model is still a linear combination of the inputs, but that this linear combination relates to the log-odds of the default class.
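The relationship between probability, odds and log-odds can be checked numerically. This is a small sketch using the 0.8 example from the text; the function names and the values of `b0`, `b1` and `x` are hypothetical.

```python
# Minimal sketch: probability, odds and log-odds for a logistic model.
import math

def odds(p):
    """Probability of the event divided by the probability of no event."""
    return p / (1.0 - p)

def log_odds(p):
    """Natural logarithm of the odds."""
    return math.log(odds(p))

def probability_from_log_odds(z):
    """The logistic function: p = e^z / (1 + e^z)."""
    return math.exp(z) / (1.0 + math.exp(z))

print(odds(0.8))   # approximately 4, up to floating-point rounding

b0, b1, x = -1.0, 0.5, 3.0        # hypothetical coefficients and input
z = b0 + b1 * x                   # the linear combination
p = probability_from_log_odds(z)  # predicted probability of the default class
recovered = log_odds(p)           # taking ln(odds) gives the linear part back
print(z, p, recovered)
```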

Learning the logistic regression model

The coefficients (beta or b values) of the logistic regression algorithm are estimated during the training phase. To do this, we use maximum-likelihood estimation.

Maximum-likelihood estimation is a learning procedure used by several machine learning algorithms. The best coefficients result in a model that predicts a value very close to 1 (e.g. male) for the default class and a value very close to 0 (e.g. female) for the other class. The intuition behind maximum likelihood for logistic regression is that a search procedure looks for the coefficient values (beta or b values) that minimize the error between the probabilities predicted by the model and those in the data (e.g. a probability of 1 if the data point belongs to the primary class).

We use a minimization algorithm to find the best coefficient values for the training data. In practice this is often done with an efficient numerical optimization algorithm.
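As a rough sketch of this training step, the example below minimizes the negative log-likelihood (log loss) with plain gradient descent. Gradient descent is a stand-in chosen for brevity, not the specific optimizer the text has in mind, and the height data, labels, learning rate and step count are all hypothetical.

```python
# Minimal sketch: estimate b0 and b1 by minimizing the negative log-likelihood
# with plain gradient descent. Data and hyperparameters are hypothetical.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(b0, b1, xs, ys):
    """Average negative log-likelihood of the labels under the model."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(xs)

def fit(xs, ys, learning_rate=0.01, steps=5000):
    """Repeatedly step both coefficients against the gradient of the log loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [sigmoid(b0 + b1 * x) - y for x, y in zip(xs, ys)]
        b0 -= learning_rate * sum(errors) / n
        b1 -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
    return b0, b1

# Hypothetical heights in cm, centred by subtracting 170; 1 = male, 0 = female.
# The classes overlap on purpose, so the best coefficients stay finite.
heights = [-15.0, -10.0, -6.0, -2.0, 1.0, 3.0, 4.0, 8.0, 12.0, 16.0]
labels = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

initial_loss = log_loss(0.0, 0.0, heights, labels)
b0, b1 = fit(heights, labels)
final_loss = log_loss(b0, b1, heights, labels)
print(b0, b1, initial_loss, final_loss)
```

Minimizing the log loss is equivalent to maximizing the likelihood of the training labels, which is why the fitted model pushes predictions toward 1 for the default class and toward 0 for the other.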

Ercole Palmeri

