WOW COULD IT BE BETTER?
Start your journey today
Introduction
This website takes you on a tour of the major characters in Warcraft. For each character, we access the Wowpedia page through the Wowpedia API to collect the character's gender, faction, race and status (whether the character is alive in the lore of the game), and we extract their in-game quotes from the quotes section of the page. Additionally, we scrape user comments from the Wowhead character pages using BeautifulSoup, keeping the timestamp associated with each comment. The resulting raw and pre-processed JSON and TXT data, together with some pickled results, comprises 1739 files (~50MB) and can be downloaded using the button above. Almost everything on this website is interactive in some way, shape, or form, so please remember to play around. You can use the navbar at the top to auto-scroll to the different sections of the page.
In the Warcraft universe (primarily in World of Warcraft), the plot revolves around two competing factions, the Horde and the Alliance, which are in constant conflict with each other. From time to time, outside parties disturb this usual back and forth, e.g. when Arthas defected from the Alliance and became the Lich King, whom both factions tried to put an end to (a pattern repeated when Sylvanas Windrunner leaves the Horde and joins the Jailer in Shadowlands by shattering the Lich King's helm). Some of this strife can be captured by creating a network in which each character is assigned a faction attribute. Neutral characters are mainly bosses in the game, or characters that aid both the Horde and the Alliance in their fight against a common enemy.
We analyse the major characters by grouping them by different attributes so that we can compare the characteristics of these groups, e.g. how verbal the Alliance characters are compared to the Horde characters, and how the sentiment differs between groups. To do this, we begin by creating a network in which each major character is a node, and an edge arises when one character's Wowpedia page links to another's Wowpedia page. This results in a network of 261 characters and 4009 edges connecting them, as seen in the Warcraft network visualization below.
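The network construction can be sketched roughly as follows. This is a minimal illustration under assumptions, not the code behind this site: `page_links` is a hypothetical dictionary mapping each character to the page titles its Wowpedia page links to, and links to non-character pages and self-links are assumed to be dropped.

```python
def build_link_graph(page_links, characters):
    """Map each character to the set of major characters its page links to."""
    known = set(characters)
    graph = {name: set() for name in characters}
    for source, targets in page_links.items():
        if source not in known:
            continue
        for target in targets:
            # Keep only links between major characters and drop self-links.
            if target in known and target != source:
                graph[source].add(target)
    return graph


def degrees(graph):
    """Return (in_degree, out_degree) dictionaries for a directed link graph."""
    out_deg = {node: len(targets) for node, targets in graph.items()}
    in_deg = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            in_deg[target] += 1
    return in_deg, out_deg


# Hypothetical toy input, not the real scraped links.
page_links = {
    "Thrall": ["Jaina Proudmoore", "Orgrimmar", "Thrall"],
    "Jaina Proudmoore": ["Thrall", "Anduin Wrynn"],
    "Anduin Wrynn": ["Jaina Proudmoore"],
}
characters = list(page_links)
graph = build_link_graph(page_links, characters)
in_deg, out_deg = degrees(graph)
```

Storing the targets in a set means that several links from one page to the same character count as a single edge, which is one possible design choice among others.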
Network Visualization
The nodes are colored according to their faction (red for Horde, blue for Alliance and gray for others, such as bosses in the game). Additionally, edges are colored according to the factions of the two nodes they connect: gray for links from any node to a neutral node, blue between two Alliance nodes, red between two Horde nodes and green between a Horde and an Alliance node. (Interactive plot; use the reset button in the top right corner of the plot in case you get lost.)
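The edge coloring rule can be written as a small function. This is only a sketch: the function name `edge_color` and the faction labels are hypothetical, and we assume here that an edge is gray whenever either endpoint is not Horde or Alliance.

```python
def edge_color(source_faction, target_faction):
    """Pick an edge color from the factions of the two connected nodes."""
    pair = {source_faction, target_faction}
    # Assumption: any endpoint outside Horde/Alliance makes the edge gray.
    if not pair <= {"Horde", "Alliance"}:
        return "gray"
    if pair == {"Alliance"}:
        return "blue"
    if pair == {"Horde"}:
        return "red"
    # The only remaining case is one Horde and one Alliance endpoint.
    return "green"


colors = [
    edge_color("Horde", "Alliance"),
    edge_color("Alliance", "Alliance"),
    edge_color("Horde", "Horde"),
    edge_color("Horde", "Neutral"),
]
```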
We can see from the network that some characters are very well connected. Taking a closer look, the most connected Alliance characters include the current and late kings of the Alliance capital "Stormwind" (Anduin and Varian Wrynn). Among the Horde characters we find the biggest node in the network, Thrall, who has been warchief of the Horde, whose capital is "Orgrimmar", and who is probably one of the best-known characters in Warcraft. Finally, among the neutral characters, Sylvanas Windrunner is one of the most connected; she is one of the newest boss characters and was previously part of the Horde.
Top 5 most connected
Alliance | in, out | Horde | in, out | Neutral | in, out |
---|---|---|---|---|---|
Jaina Proudmoore | 73, 72 | Thrall | 103, 84 | Sylvanas Windrunner | 79, 66 |
Anduin Wrynn | 68, 63 | Lor'themar Theron | 37, 47 | Deathwing | 69, 53 |
Varian Wrynn | 60, 66 | Baine Bloodhoof | 44, 39 | Lich King | 76, 44 |
Khadgar | 65, 55 | Vol'jin | 42, 38 | Garrosh Hellscream | 63, 40 |
Malfurion Stormrage | 54, 46 | Varok Saurfang | 39, 36 | Arthas Menethil | 56, 41 |
To get a quick overview of the attributes in the network, a distribution is plotted for each of the four node attributes. From this attribute plot we see that gender, faction and status each divide neatly into a few groups, whereas the race distribution contains many different races that occur only a few times, apart from the main character races such as human, orc and blood elf.
Attribute Distribution
Character Connections
In order to investigate the characteristics of the connections between the major characters of Warcraft, you can use the degree distribution plots, along with stats for in-, out- and total degree, and different centrality measures.
Below, the out- and in-degree distributions of the Warcraft network are shown along with the degree distribution of a random Erdős-Rényi network, which has the same number of nodes as the Warcraft network and a connection probability calculated from the average out- and in-degree respectively. From this it seems that the Warcraft network is not entirely random! Additionally, a scatter plot of in- vs. out-degree for each character in the network is shown.
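For reference, the random baseline could be generated along these lines. This is a sketch, not the site's own code: we assume the usual choice p = ⟨k⟩ / (N − 1) for the connection probability, and since every edge contributes one out-degree and one in-degree, the average in- and out-degree coincide, so a single p is used. The helper `random_directed_graph` is a hypothetical name.

```python
import random
from collections import Counter

N_NODES = 261   # number of characters in the Warcraft network
N_EDGES = 4009  # number of edges in the Warcraft network

mean_degree = N_EDGES / N_NODES   # average in-degree = average out-degree
p = mean_degree / (N_NODES - 1)   # assumed connection probability


def random_directed_graph(n, p, seed=None):
    """Sample a directed Erdős-Rényi graph as a list of (source, target) edges."""
    rng = random.Random(seed)
    return [
        (u, v)
        for u in range(n)
        for v in range(n)
        if u != v and rng.random() < p
    ]


edges = random_directed_graph(N_NODES, p, seed=42)
in_degree = Counter(target for _, target in edges)
# Number of nodes having each in-degree value, including nodes with degree 0.
in_degree_distribution = Counter(in_degree.get(node, 0) for node in range(N_NODES))
```

The in-degrees of such a graph are binomially distributed around the mean, so a long tail with hubs like Thrall is what sets the real network apart from this baseline.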
To differentiate between the connectedness within the different groups (Alliance vs. Horde vs. Neutral, or male vs. female characters), we created a subgraph for each group and calculated some basic degree stats, namely min, max, mean, median and mode of the in-, out- and total degree of these subgraphs, as displayed in the table below. One thing to note is that, for example, the Horde characters are less connected to each other than the Alliance and Neutral characters (the mean and median degrees are lower for the Horde characters). Looking at male vs. female characters, the female characters seem to link to each other much less than the male characters do! This could be related to the fact that there are far fewer female than male characters in the network (see the attribute distribution plot).
Degree Stats
Full Network | min | max | mean | median | mode |
---|---|---|---|---|---|
In-degree | 0 | 103 | 15.36 | 10.0 | 6 |
Out-degree | 0 | 84 | 15.36 | 12.0 | 6 |
Total-degree | 1 | 197 | 30.72 | 21.0 | 13 |
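Stats like the ones in the table above can be computed with the standard library. The snippet below is an illustrative sketch on hypothetical toy edges; the helper names are our own for this example, and picking the smallest value when several modes tie is an assumption.

```python
from statistics import mean, median, multimode


def subgraph_degrees(edges, members):
    """In-, out- and total degree counting only edges inside the member group."""
    members = set(members)
    in_deg = dict.fromkeys(members, 0)
    out_deg = dict.fromkeys(members, 0)
    for source, target in edges:
        if source in members and target in members:
            out_deg[source] += 1
            in_deg[target] += 1
    total = {node: in_deg[node] + out_deg[node] for node in members}
    return in_deg, out_deg, total


def degree_stats(values):
    """Summarise a collection of degrees as min, max, mean, median and mode."""
    values = list(values)
    return {
        "min": min(values),
        "max": max(values),
        "mean": round(mean(values), 2),
        "median": median(values),
        "mode": min(multimode(values)),  # assumption: smallest value wins ties
    }


# Hypothetical toy edges; "d" is outside the group and is ignored.
edges = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d"), ("d", "a")]
in_deg, out_deg, total = subgraph_degrees(edges, ["a", "b", "c"])
out_stats = degree_stats(out_deg.values())
```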
We can dive even further into comparing the connectedness of characters by inspecting different centrality measures over subgraph partitions of the network, as displayed in the table below. Comparing the factions, the Horde and Alliance characters clearly have higher centrality values within their subnetworks than the characters without a faction, which indicates that the Horde and the Alliance form more tight-knit communities than the characters without a faction.
Centrality Measures
Faction | Horde | Alliance | Neutral |
---|---|---|---|
Degree centrality | 0.134007 | 0.179356 | 0.096958 |
Betweenness centrality | 0.007062 | 0.008802 | 0.004222 |
Eigenvector centrality | 0.047472 | 0.067552 | 0.034995 |
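As an illustration of the first row of the table, here is a sketch of a mean degree centrality for one faction subgraph. We assume that the reported values are averages over the nodes of each subgraph, and that degree centrality is the total degree divided by n − 1 (the same normalisation that networkx's `degree_centrality` uses); the function name is hypothetical.

```python
def mean_degree_centrality(edges, members):
    """Average degree centrality over the subgraph induced by `members`."""
    members = set(members)
    n = len(members)
    if n < 2:
        return 0.0
    degree = dict.fromkeys(members, 0)
    for source, target in edges:
        # Count only edges with both endpoints inside the group.
        if source in members and target in members and source != target:
            degree[source] += 1
            degree[target] += 1
    # Normalise each total degree (in + out) by the n - 1 possible neighbours.
    return sum(d / (n - 1) for d in degree.values()) / n


# Hypothetical toy edges, not the real network.
edges = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d")]
centrality = mean_degree_centrality(edges, ["a", "b", "c"])
```

Because in- and out-links are both counted, the value can exceed 1 for small, densely linked groups, as it does for the toy data here.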
In addition to partitioning by node attributes, it is possible to use graph partitioning algorithms. One of these is the Louvain algorithm, which splits the Warcraft network into seven communities. The top three characters of each community, ranked by node degree, are:
Community 1: | Khadgar, Illidan Stormrage, Velen |
Community 2: | Deathwing, Sargeras, Yogg-Saron |
Community 3: | Sylvanas Windrunner, Lich King, Varian Wrynn |
Community 4: | Malfurion Stormrage, Tyrande Whisperwind, Alexstrasza |
Community 5: | Thrall, Ner'zhul, Orgrim Doomhammer |
Community 6: | Anzu, Terokk, Talon King Ikiss |
Community 7: | Jaina Proudmoore, Anduin Wrynn, Garrosh Hellscream |
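Given a partition (from Louvain or any other algorithm), a ranking like the one above can be produced as sketched below. The node names, community ids and degrees are hypothetical toy values, and `top_by_degree` is a name made up for this example; the Louvain step itself is not shown.

```python
from collections import defaultdict


def top_by_degree(partition, degree, k=3):
    """Return the k highest-degree nodes of every community."""
    groups = defaultdict(list)
    for node, community in partition.items():
        groups[community].append(node)
    return {
        # Sort by descending degree, breaking ties alphabetically.
        community: sorted(nodes, key=lambda node: (-degree[node], node))[:k]
        for community, nodes in groups.items()
    }


# Hypothetical partition (node -> community id) and node degrees.
partition = {"a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "f": 1}
degree = {"a": 5, "b": 9, "c": 7, "d": 1, "e": 2, "f": 4}
top_three = top_by_degree(partition, degree)
```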
From the top characters in each community, it seems that the Louvain algorithm partitions the network reasonably. The characters in community 1 are strongly connected through the lore that unfolded in the Legion expansion. Community 2 consists of end-game bosses. Sylvanas, the Lich King and Varian of community 3 have all been faction leaders. Each character in community 4 has some kind of nature aspect. Community 5 consists of important orcs. Community 6 contains the race "Arakkoa". However, community 7 seems a bit random with respect to lore, race, faction etc. Below is an image of the most connected character of each community.
WordClouds
One way of finding the words that differentiate characters or groups of characters from each other is to create wordclouds from TF-IDF scores. This has been done for different groupings of the Warcraft characters, which can be explored using the dropdown menus below. You can see which words describe your favorite group of characters, whether it be Horde or Alliance, and even which words other players use to describe said characters!
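To make the idea concrete, here is a small TF-IDF sketch where each group of characters is treated as one document. The exact variant used for the wordclouds is not stated here, so the plain tf × log(N / df) form below is an assumption, and the token lists are hypothetical toy data.

```python
import math
from collections import Counter


def tf_idf(group_tokens):
    """Score every term of every group; each group counts as one document."""
    n_groups = len(group_tokens)
    doc_freq = Counter()
    for tokens in group_tokens.values():
        doc_freq.update(set(tokens))  # count each term once per group
    scores = {}
    for group, tokens in group_tokens.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[group] = {
            term: (count / total) * math.log(n_groups / doc_freq[term])
            for term, count in counts.items()
        }
    return scores


# Hypothetical toy tokens, not the real quotes or comments.
group_tokens = {
    "Horde": ["clan", "troll", "kill", "clan"],
    "Alliance": ["naaru", "kill", "naaru", "light"],
}
scores = tf_idf(group_tokens)
top_horde_word = max(scores["Horde"], key=scores["Horde"].get)
```

A word such as "kill" that occurs in every group gets a score of zero, which is why the wordclouds emphasise words that are specific to a group rather than words that are merely frequent.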
Some nice wordcloud comparisons can be made, e.g. between dead and alive characters from Wowhead. For the living characters, the words mainly come from users describing how to solo boss fights ("pet", "kill", "phase", "killed" etc.); these words are still present for the deceased characters, but are far less important. When comparing the Wowhead wordclouds for the Alliance and the Horde with their Wowpedia counterparts, one thing to notice is that most words are very different, with only a few common words appearing across the two sources. For the Alliance, "naaru", "skybreaker" and "bloodmyst" are mentioned in both wordclouds. For the Horde we don't really see any common words, although some new and very fitting words like "clan" and "troll" appear. Another interesting case is community 6, where the Wowhead and Wowpedia wordclouds match each other nearly perfectly, and the words also describe characteristics of the underlying community: the "Arakkoa" race, for which "Skettis" is the capital, "Ikiss" is a king and "Sethekk" is one of the factions.
What can you uncover when you compare the different wordclouds?
Sentiment Analysis
We would now like to produce a one-dimensional sentiment score for the Wowhead comments and the character quotes from Wowpedia. Sentiment scores are computed with both VADER and the BERT analyser from flairNLP. We reduce the sentiment scores of the individual texts (Wowhead comments or wiki page quotes) to a single character-level score by taking the mean of the scores of all comments/quotes that belong to the respective character.
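The reduction to a character-level score is just a grouped mean. The sketch below assumes the per-text scores have already been computed by one of the analysers; the input pairs and the function name `character_sentiment` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def character_sentiment(scored_texts):
    """Average the per-text sentiment scores for each character."""
    by_character = defaultdict(list)
    for character, score in scored_texts:
        by_character[character].append(score)
    return {character: mean(scores) for character, scores in by_character.items()}


# Hypothetical (character, score) pairs, one per comment or quote.
scored_texts = [
    ("Thrall", 0.5),
    ("Thrall", -0.1),
    ("Jaina Proudmoore", 0.3),
]
sentiment = character_sentiment(scored_texts)
```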
Doing this for all characters, first using only the Wowhead comment text and then using only the wiki page quote text, we can produce two histogram plots. To add perspective, the sentiment score distribution can be plotted for certain groupings of characters (e.g. by faction or by community) in order to reveal any differences between such groups. This can be explored in the interactive plots below.
Playing around with the different options and observing the plots reveals a few things about the sentiment scores. For instance, the mean of the BERT scores is below 0 for Wowhead comments and above 0 for wiki quotes. The VADER sentiment values are in general a bit more centered around 0, with flatter tails. Additionally, VADER and BERT sometimes disagree on whether a character group has mostly negative or positive sentiment; e.g. for the Anzu community, BERT classifies the Wowhead comments negatively and the character quotes positively, while VADER does the opposite.
Extracting the Wowhead comments also gave us some metadata, including a timestamp of when each comment was created. From this, we have done a time-series analysis of the sentiment scores of the Wowhead user comments, trying to reveal any trends in the sentiment related to characters over time.
As with the sentiment distribution plots, we provide the ability to separate characters by certain groupings (e.g. faction or community). The mean comment sentiment is then calculated for the specified resampling option (e.g. monthly, quarterly or yearly).
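The resampling step can be sketched with the standard library as follows. This is an illustration on hypothetical toy comments, and the period labels and function names are our own choices for the example. Each period gets both the mean sentiment and the comment count.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def period_key(timestamp, freq):
    """Label a timestamp with the month, quarter or year it falls in."""
    if freq == "monthly":
        return f"{timestamp.year}-{timestamp.month:02d}"
    if freq == "quarterly":
        return f"{timestamp.year}-Q{(timestamp.month - 1) // 3 + 1}"
    if freq == "yearly":
        return str(timestamp.year)
    raise ValueError(f"unknown resampling option: {freq}")


def resample_mean(comments, freq):
    """Return {period: (mean sentiment, comment count)} in chronological order."""
    buckets = defaultdict(list)
    for timestamp, score in comments:
        buckets[period_key(timestamp, freq)].append(score)
    return {
        period: (mean(scores), len(scores))
        for period, scores in sorted(buckets.items())
    }


# Hypothetical (timestamp, sentiment score) pairs.
comments = [
    (datetime(2016, 1, 5), 0.4),
    (datetime(2016, 2, 10), -0.2),
    (datetime(2016, 5, 1), 0.6),
    (datetime(2017, 1, 1), 0.0),
]
quarterly = resample_mean(comments, "quarterly")
yearly = resample_mean(comments, "yearly")
```

Keeping the count next to the mean makes it easy to spot periods where a volatile average rests on only a handful of comments.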
Playing around with the options and observing the plot, some trends start to appear. For instance, sentiment values seem to be more volatile after about 2016 (to a more significant degree for certain groupings). Switching the metric to show comment counts provides some extra context, as it reveals that not many comments were made in this period. In fact, the number of comments made in each period is, unsurprisingly, strongly correlated with the number of active players in those periods, if one were to look up this data.
noCopyright
We do not condone the actions of Blizzard Entertainment in any way, shape or form, nor do we encourage the continued support of Blizzard services.