
Cluster Analysis

10/08/2024


What is cluster analysis?

Cluster analysis is a multivariate technique used in data analysis and machine learning to group variables or observations that are strongly related. It partitions a set of objects so that objects or data points in the same cluster (or segment) are more similar to each other than to objects in other clusters.

The goal of cluster analysis is for objects within a cluster to be as similar as possible, while objects from different clusters are as different as possible. It is a form of exploratory data analysis used in many fields, including marketing, biology, the social sciences, and machine learning.
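To make the goal concrete, the sketch below compares the average distance between points in the same cluster with the average distance between points in different clusters. The function name `mean_intra_inter_distance`, the use of Euclidean distance, and the toy data are assumptions chosen for illustration; they are not taken from the article.

```python
import numpy as np

def mean_intra_inter_distance(points, labels):
    """Return (mean within-cluster distance, mean between-cluster distance)."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between all points.
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    # True where two points carry the same cluster label.
    same = labels[:, None] == labels[None, :]
    # Count each unordered pair once and skip self-pairs.
    upper = np.triu(np.ones_like(same, dtype=bool), k=1)
    intra = dist[same & upper].mean()
    inter = dist[~same & upper].mean()
    return intra, inter

# Hypothetical toy data: two tight pairs that lie far apart.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
labels = [0, 0, 1, 1]
intra, inter = mean_intra_inter_distance(points, labels)
print(f"within: {intra:.2f}, between: {inter:.2f}")
```

A good clustering gives a small within-cluster value relative to the between-cluster value, as it does for this toy labelling.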

Types of cluster analysis

We distinguish between the following types of cluster analysis:

Hierarchical cluster analysis

Builds a tree-like structure (a dendrogram) that shows how objects are grouped into clusters at different levels of similarity. There are two main approaches:

Agglomerative (bottom-up): Starts with each object as its own cluster, then merges the closest clusters step by step until all objects are in a single cluster.
Divisive (top-down): Starts with all objects in one large cluster, then splits it step by step until each cluster contains a single object.

Example: Hierarchical cluster analysis can be used to group species in biological research based on genetic similarity.
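The agglomerative approach can be sketched with SciPy's `scipy.cluster.hierarchy` module. The toy data, the choice of average linkage, and the decision to cut the tree into two clusters are assumptions made for this illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy data (an assumption for illustration): two well-separated groups in 2D.
points = np.array([
    [0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
    [5.0, 5.0], [5.1, 5.2], [5.2, 5.1],
])

# Agglomerative clustering: every point starts as its own cluster and the
# two closest clusters are merged at each step. "average" linkage measures
# the distance between clusters as the mean pairwise distance of members.
merges = linkage(points, method="average", metric="euclidean")

# Each row of `merges` records one merge: the two cluster ids joined,
# the distance at which they were joined, and the size of the new cluster.
# Cutting the tree into at most two clusters gives a flat labelling.
labels = fcluster(merges, t=2, criterion="maxclust")
print(labels)
```

The `merges` array is the numeric form of the dendrogram; passing it to `scipy.cluster.hierarchy.dendrogram` would draw the tree if a plotting library is available.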

K-medoids

Similar to K-means, but uses actual data points (medoids) as cluster centers instead of the mean position of each cluster's points. This can make K-medoids more robust to outliers.

Example: Using K-medoids to group geographic areas by demographic data in order to find representative locations for a marketing campaign.
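The idea can be sketched as a simple alternating procedure: assign each point to its nearest medoid, then move each medoid to the cluster member with the smallest total distance to the other members. This is a minimal illustration, not the full PAM algorithm; the function name `k_medoids`, the caller-supplied `initial_medoids` (used here for reproducibility), and the one-dimensional toy data are all assumptions.

```python
import numpy as np

def k_medoids(points, initial_medoids, n_iter=100):
    """Alternating k-medoids; returns (medoid indices, cluster labels)."""
    points = np.asarray(points, dtype=float)
    # Precompute all pairwise Euclidean distances.
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    medoids = np.asarray(initial_medoids, dtype=int)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest medoid.
        labels = dist[:, medoids].argmin(axis=1)
        new_medoids = medoids.copy()
        for j in range(len(medoids)):
            members = np.flatnonzero(labels == j)
            if members.size == 0:
                continue  # keep the old medoid if a cluster is empty
            # Update step: pick the member with the smallest total
            # distance to the other members of its cluster.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[j] = members[within.argmin()]
        if np.array_equal(new_medoids, medoids):
            break  # medoids stopped changing, so the result is stable
        medoids = new_medoids
    labels = dist[:, medoids].argmin(axis=1)
    return medoids, labels

# Hypothetical data: two groups, the second one containing an outlier (20).
points = [[1], [2], [3], [10], [11], [12], [20]]
medoids, labels = k_medoids(points, initial_medoids=[0, 1])
print(medoids, labels)
```

On this data the second cluster's medoid is the point with value 11, whereas the mean of that cluster (13.25) is pulled toward the outlier and does not coincide with any observation.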

DBSCAN (Density-Based Spatial Clustering of Applications with Noise)

A density-based clustering method that identifies clusters as regions with a high density of data points, separated by regions of low density. DBSCAN is particularly useful for finding clusters of arbitrary shape, and it labels points in low-density regions as noise (outliers) rather than forcing them into a cluster.

Example: Using DBSCAN to identify clusters of stars in astronomical data based on their position in space.
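A short sketch using scikit-learn's `DBSCAN` shows how dense groups become clusters while an isolated point is flagged as noise. The toy coordinates and the parameter values `eps=0.5` and `min_samples=3` are assumptions picked to suit this small data set; real data would need its own tuning.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Toy data (an assumption): two dense groups and one isolated point.
points = np.array([
    [0.0, 0.0], [0.0, 0.3], [0.3, 0.0], [0.3, 0.3],  # dense group A
    [5.0, 5.0], [5.0, 5.3], [5.3, 5.0], [5.3, 5.3],  # dense group B
    [20.0, 20.0],                                    # isolated point
])

# eps is the neighbourhood radius; min_samples is the number of points
# (including the point itself) needed within eps to form a dense region.
model = DBSCAN(eps=0.5, min_samples=3)
labels = model.fit_predict(points)

# scikit-learn marks noise points with the label -1.
print(labels)
```

Note that the number of clusters is not specified in advance; it follows from the density parameters.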

Gaussian Mixture Models (GMM)

A probabilistic approach to cluster analysis that assumes the data are a mixture of several normal distributions (Gaussians), where each Gaussian represents one cluster. GMM is more flexible than K-means because each component has its own covariance, so it can model clusters of different shapes and sizes, and it gives every point a probability of belonging to each cluster.

Example: Using GMM to analyze customer behavior in an online store in order to identify customer segments with different purchasing patterns.
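The following sketch fits a two-component mixture with scikit-learn's `GaussianMixture`. The synthetic data (one round group and one elongated group), the number of components, and the random seeds are assumptions made for illustration only.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic data (an assumption): a tight round group and a wider,
# horizontally stretched group.
group_a = rng.normal(loc=[0.0, 0.0], scale=[0.5, 0.5], size=(200, 2))
group_b = rng.normal(loc=[8.0, 8.0], scale=[2.0, 0.5], size=(200, 2))
points = np.vstack([group_a, group_b])

# "full" covariance lets each component take its own elliptical shape.
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(points)

labels = gmm.predict(points)        # hard assignment: most likely component
probs = gmm.predict_proba(points)   # soft assignment: one probability per component
print(gmm.means_)
```

Unlike K-means, the fitted model exposes `probs`, so points that lie between groups can be reported with their uncertainty instead of a single hard label.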

Steps in cluster analysis

