Rubric Used to Evaluate Vikmione Quality

We aim for the criteria & scoring descriptors to be discrimination-free within the intended scope. This means that the scoring should be independent of creative choices such as plot elements, characterizations, word counts, or the presence of triggers, explicit scenes, violence, etc. Each work should be able to obtain a full score or a zero score based solely on the execution of the work, regardless of any creative choice.

The rubric’s scope is vikmione, meaning:

The rubric may feel unfair to stories not matching the above 4 criteria. This is on purpose, because such stories are outside the scope of this rubric.

We aim for a rubric that is reproducible, defendable and (hopefully) discrimination-free within the intended scope. We use 3 main categories, each split into several axes, each scored 0-10:

A) General Storytelling (Is this a good story?)

B) Harry Potter Framework (Is this Harry Potter?)

C) Ship dynamics (Is this a good ship?)

Specification of individual axes

1) Consistence of Characters

Name: Consistence of Characters

Main issue: Do characters behave the same as their canonical counterparts would under the given circumstances? NB: this means the circumstances within the story itself. Logic and causality between canonical traits, circumstances and outcomes determine whether something is penalized or not. Are darker actions due to world/plot/tone, or due to a change in character personality? Only the latter is penalized.

Details: We limit the issue to the following list of canonical characters (we only use those for which a reasonable character description/development is known from canon beyond a few stereotypes; Viktor Krum is missing from the list as he has a separate category):

Consider choices under pressure, dialogue, initiatives/plans, responses and emotions. Are the characters in the story actual duplicates of their canonical counterparts, or are they different characters bearing the same name?

NB: Characters may act against their core traits (see the list above) if the circumstances & inner conflicts make this believable. This is not penalized. But changing their core traits by rewriting the character is heavily penalized.

NB: It is very important to state that extreme plot circumstances & strong deviations from canonical situations & themes are NOT penalized. What matters is whether the character deals with these situations (conflicts, motivations, etc.) in the same way as the canonical character would, in agreement with the above personality traits (or against them, if worked out properly and defendable). Hence, causal reasoning is critical. Do not just compare deviations from traits, actions & outcomes, but consider whether these originate logically from the circumstances and a canonical character or not. Logic and causality are key to whether something is penalized or not. Are darker actions due to world/plot/tone, or due to a change in character personality? Only the latter is penalized. The key question is whether the circumstances push hard enough to break the canon traits believably.

Canon-consistent extreme behaviour is perfectly possible if it follows logically from the alternate circumstances. But the further you push it, the further the extrapolation goes, and one must acknowledge that this comes with a large uncertainty margin. This means some (small) penalty could be justified.

Motivation: The story should be an actual Harry Potter fanfiction, meaning that the characters should feel the same as their canonical counterparts.

Descriptors: Score the category on a scale of 0-10, with an integer number.

NB: If many characters behave just a little OOC, the score is much higher than if a single character behaves significantly OOC. Also, IC/OOC impact on plot is punished heavily. So many small sloppy errors in dialogues spread across many characters can still score 7-9 (depending on severity) as long as the characters still feel like their canonical counterparts, but one character making a single crucial decision against their canonical counterpart makes the score immediately drop to 4-5 (or lower, depending on severity).

Hard boundary: If the story intentionally implements major rewrites of even one of the listed canonical characters, this is hard-cut by assigning score 0. For example, Harry is consistently sadistic or power-hungry throughout the entire story, or Hermione is consistently opportunistic instead of defending justice. These are the so-called dark-Harry or dark-Hermione tropes (the same holds for the other listed characters). Note that the boundary is subtle. Harry can be in Slytherin because he forgot to ask the hat for Gryffindor and still be canon-Harry. Or he can join the Death Eaters to save/protect his friends. If executed correctly, this can still score 10. But Harry being in Slytherin or joining the Death Eaters because he is hungry for power is a hard 0. This is the difference between an intrinsic major character rewrite (penalized with 0 points) and a believable stress-test for the canonical character (depending on the execution, this could be 10 points).

Again: causal reasoning is critical. Do not just compare deviations from traits, actions & outcomes, but consider whether these originate logically from the circumstances and a canonical character or not. Logic and causality are key to whether something is penalized or not. Are darker actions due to world/plot/tone, or due to a change in character personality? Only the latter is penalized. The key question is whether the circumstances push hard enough to break the canon traits believably. NB: to trigger the hard boundary, the contradiction/rewrite must be clear, persistent, and structural, not a one-off line, temporary deception, or local interpretation. NB: use only the narrative body text. Do NOT use or infer information from: Tags, Author Notes, Summary, Chapter Titles, or other metadata of any kind. If such elements are present in the text, ignore them entirely.

2) Psychological credibility

Name: Psychological credibility

Main issue: Are emotions and motivations realistic? Do characters behave the same way as a human being would?

Details: Here, we focus not on the match between characters & their canonical counterparts, but on the match between characters and known psychology. Do the characters feel human or stereotypical/cartoonish? Focus on issues like:

Irrational or flawed behaviour is not penalized if it is clearly motivated and consistent with the character’s psychology. This is to differentiate between author’s intent (OK) and author’s error (penalty).

Motivation: General characteristic of good writing. Characters should be well rounded and make sense.

Descriptors: Score the category on a scale of 0-10, with an integer number.

3) Romance

Name: Romance.

Main issue: How believably are the romantic relations worked out? The main focus is on the main ship in the story, but other romances must be judged as well. Judge both romance/attraction/chemistry and long-term compatibility.

Details: This category has some overlap with psychological credibility, but is also distinctly different. Psychological credibility is about whether a character’s emotions are human/believable. Emotions are a vital part of romance, but not all emotions in a story deal with romance and romance is about more than emotions. Romance is also about whether it is clear why characters like each other, and whether the relationship works or not (both are fine, as long as it is well portrayed). Psychological Realism focusses on individuals. This category focusses on (romantic) dynamics between people. There is a difference. Focus on issues like:

NB: A strong emotional impact of the romance is weighted heavily for a believable romance. Very Important! NB: What matters here is whether the romance feels human, realistic and believable, not necessarily whether the romantic dynamics incorporate all core traits of a character. We have a separate rubric axis for that aspect (Originality).

Motivation: The scope is vikmiones, so there must be a category dealing specifically with romantic realism.

Descriptors: Score the category on a scale of 0-10, with an integer number.

4) Narrative Quality

Name: Narrative Quality

Main Issue: How well is the story told? The category focusses purely on (plot) content. For writing/style quality we have a separate category.

Details: The main focus is on the content and the plot of the story. Not emotions, not romance, not canon-consistency, but the content of scenes and what happens in them. And we must focus on objective issues, meaning that any concrete plot choice should not affect the score, only its execution. We, therefore, focus on the following issues:

Motivation: Each story has a plot and the quality of the plot must be separately addressed.

Descriptors: Score the category on a scale of 0-10, with an integer number.

5) Style Quality

Name: Style Quality

Main Issue: How well-written is the story? This category focusses purely on writing style and skill. For content, we have Narrative Quality.

Details: This is about how well everything is written down. Not about content, characters, emotion, but about how well the story handles words and language. We focus on:

Motivation: Writing style is an integral part of any story, so it deserves a separate category to judge.

Descriptors: Score the category on a scale of 0-10, with an integer number.

6) Theme Quality

Name: Theme Quality

Main issue: How well does the story convey its primary message to the reader?

Details: It is very important to separate this category from Narrative Quality and Psychological realism. Strong emotions and traumas can be a powerful tool to convey a message to the reader, but they are not the only tool. Consistency of story arcs and consistent use of the plot are just as powerful. As such, Psychological Realism treats whether emotions of the characters are realistic. Thematic depth treats whether the emotions of the reader (and reader and character are very different!) are well-utilized for conveying the primary message(s). Narrative Quality treats the consistency of the content and plot itself. Thematic Depth focuses on the message behind the plot and content. In order to convey a message powerfully, the reader must both experience strong emotions, and the plot and content must be constructed in a way that the message is conveyed without ambiguities or lack of clarity. The chosen message itself is irrelevant (we want the rubric to be discrimination free), we only measure how well the message is conveyed to and remembered by the reader. So, we ask:

NB: The theme or primary message may itself be ambiguous by the author’s intent. For example: love is messy. This is OK. What matters is whether the transfer process to the reader is ambiguous or clear. Bad transfer is penalized. Also, themes and messages do not need to be explicitly stated. Implicit themes and/or message(s) can be just as strong. Again, all that matters is how well the message is transferred to the reader.

Motivation: Themes are a vital part of any story. They determine what the story communicates beyond its events and therefore must be assessed separately. However, instead of using a traditional literary framework, where thematic quality is measured by the richness of possible interpretations, we adopt a didactic framework. In interpretation-based analysis, meaning can become highly subjective, as different readers may arrive at different conclusions without a clear objective anchor. This makes consistent and fair rubric scoring difficult.

Therefore, this rubric evaluates thematic quality based on how effectively the story communicates its message to the reader. A theme or message is considered strong if it is clearly conveyed, emotionally reinforced, and consistently supported by the plot and narrative structure. This approach allows for objective assessment using principles from didactics, particularly the alignment between content and the reader’s emotional response.

A consequence of this choice is that stories relying heavily on interpretation puzzles, open-ended conclusions, or unresolved ambiguity will score significantly lower in this category, as they do not prioritize clear message transfer. NB: Complex or nuanced messages are not penalized, provided they are communicated clearly and consistently to the reader.

This limitation is acceptable within the scope of this rubric, which focuses on Hermione/Viktor endgame stories. Such stories inherently aim toward resolution rather than open-ended interpretation, making a communication-based evaluation of theme both appropriate and meaningful.

Note that if we considered thematic depth as well, we would reward stories with deep philosophical messages over stories with simple, clear messages. That is discrimination we do not want. Therefore, we focus solely on message transfer. And works exist in the real world that have a crystal-clear message that leaves you shaken: Max Havelaar by Multatuli, The Neverending Story, Ender’s Game. So the bar for a score of 10 is not unrealistic. It is high, but not impossible.

Descriptors: Score the category on a scale of 0-10, with an integer number.

7) Worldbuilding Quality

Name: Worldbuilding Quality

Main Issue: How well is worldbuilding executed? Both in terms of writing skills (scenes) and in terms of creative power (expansion of the wizard world).

Details: Worldbuilding has several aspects in Harry Potter fanfiction. The magic system, with its rules, limitations and possibilities, is part of worldbuilding, as are the physical locations of the story. Writing skills measure how well you can describe an environment and/or explain a magic system to the reader, and also whether the characters respond to their rules and environment (good) or whether these just form a static décor (bad). Creative power measures expansion of the wizarding world. Again, it must be discrimination-free, so any expansion and/or new design of the wizarding world is fine. The more, the better. However, depth and relevance of worldbuilding are prioritized over quantity. A small number of well-developed elements is preferred over a large number of shallow ones. Quantity acts as a multiplier on quality, not a substitute for it. But all creative production power is rewarded.

There is one limitation: a direct contradiction of canonical material is heavily penalized. For example, adding new spells is OK, the more the better. But adding a counter-curse to Avada Kedavra is terrible, as canon explicitly states that this does not exist. Adding more rooms to the Department of Mysteries is fine, but removing the Hall of Prophecy is terrible. Hence, creative expansion is only rewarded when it is well-executed, consistent with canon, and relevant to the story.

Also, inventing cheap magic just to circumvent plot challenges is penalized. For example, letting Harry hide himself from Death Eaters with a new cloaking spell is bad worldbuilding. Cloaking spells are canonically complex magic, so Harry cannot just cast them without proper training prior to the story. So, this would be worldbuilding as a cheap plot fix. Moreover, canon explicitly states that cloaking spells do not work perfectly (Harry’s cloak from his father does), so inventing a spell that does not respect that contradicts canon. Worldbuilding and inventing new magic should be about learning more about the wizarding world and how complex existing spells work (like Fidelius, Animagus, etc.), not about cheap fixes. So, we focus on:

Motivation: Worldbuilding, both environments and magic systems, is an integral and vital part of any fantasy story, including a vikmione, which is a Harry Potter fantasy story. So, we need a separate category to judge its quality, both in terms of creative production (any addition that is not a contradiction of canon or a cheap shortcut is valued) and in terms of writing skills. The emphasis on worldbuilding is very strong, but this is a design choice. We want the story to feel magical, because it is Harry Potter, not just a story with a Harry Potter décor. And this does not have to penalize character-driven, Hogwarts-centred stories. Even such stories could include some unknown rooms or passages in Hogwarts, or some new spells or new phenomena. Deeper exploration of canonical elements also counts as expansion when it adds functional depth. And these aspects do not necessarily need to take up a large chunk of words. But yes, stories that do not include such aspects are heavily penalized by this axis on purpose. That is the scope limitation this rubric is designed to impose.

Descriptors: Score the category on a scale of 0-10, with an integer number.

NB: 8 instead of 10 means that the overall worldbuilding in the story is extremely good, but the story makes 1 or 2 of the above mistakes at some points in the story. Only one or two from the above list and only at a few points in the story.

8) Viktor Krum

Name: Viktor Krum

Main Issue: How well is Viktor Krum rounded out as a character? Contrary to the iconic characters discussed in Consistence of Characters, canon does not give us much of a baseline. As such, an extrapolation of his character is necessary. So instead of just comparing Viktor Krum to his canon baseline, the quality of the extrapolation is assessed here.

Details: Canon only gives us a few basic properties, like:

Now, any extrapolation of Viktor Krum is fine. Again, we do not want to discriminate, so the more Viktor is turned into a well-rounded character, the better, regardless of which creative choices are made. But directly contradicting canon (mainly the above properties) is heavily penalized. We are looking for an extrapolation of the canonical character, not a rewrite. Furthermore, extrapolation choices (regardless of what they are) must be committed to throughout the entire story. Viktor Krum should be a consistent personality (he can have contradicting personality traits, but his traits and skills cannot vary across the story). Note that this is very different from categories 2 (Psychological Realism) and 3 (Romance). Psychological realism measures whether all characters (including Viktor) act in agreement with their personality traits and act as human beings. Hence, it measures how a character behaves with respect to their baseline. This category measures how extensive and well-rounded the baseline designed for Viktor Krum is. Romance measures the dynamic between (mainly) Viktor and Hermione, again with respect to certain baselines for Hermione and Viktor. Here, by contrast, we assess the design of that baseline. More specifically, we are looking for design elements like:

NB: having many character traits does not necessarily make a good character. The more the better only applies as long as it leads to a coherent, consistent and overall well-rounded character.

Motivation: As stated before, Viktor Krum is canonically a shallow character, so we cannot assess him the same way as canonical core characters, on psychological realism and consistency with canon. He needs extrapolation. So instead, we judge how well the extrapolation is done. The emphasis on Viktor Krum extrapolation may seem fairly strong, but it can be justified. The scope of the rubric is vikmione, not a Hermione-boyfriend named Viktor. And we do not prescribe how Viktor should be extrapolated, only that it must be done properly. So it is defendable.

Descriptors: Score the category on a scale of 0-10, with an integer number.

9) Originality

Name: Originality

Main Issue: How fresh, new or original do the romantic dynamics feel for the main pairing in the story? This is different from the Romance axis, which measures whether the dynamics are realistic. There is a distinct difference. The basic question is: why do we need this specific pairing? Can we observe this special dynamic only with this pairing, or also somewhere else?

Details: NB: this category is NOT meant to be ship-specific, but it is hard to describe the details without specific examples. What we are after is how well the romantic dynamics & interaction reflect the unique personality traits of both partners, and whether the story embraces what makes the pairing unique and deals with this in a fresh and original way (very good), or whether the story largely ignores what makes the persons and relationship unique and only applies standard romcom tropes (heavily penalized). Standard romance tropes and romcom clichés can be observed everywhere. You do not need the specific (Harry Potter) pairing in this story to enjoy those. The story should let you experience a dynamic that cannot be observed in any other pairing. For example:

NB: we are not after romantic realism here. We are after how well the story utilizes what makes a pairing unique, and incorporates this properly in the relationship dynamic, so that the story is worth reading. An original angle within the ship is rewarded, but only slightly (see descriptors: 8 vs 10). Most importantly, the story must utilize the specific ship to its full potential. That is what we are after.

Motivation: There are countless romcom stories and many Harry Potter pairings. We want to know if this story adds something meaningful to that landscape or not.

Descriptors: Score the category on a scale of 0-10, with an integer number.

Guideline: A stellar execution where the romantic dynamic includes all (or most) core personality traits (see details) & heavy topics (imbalances, toxicity, maturity, etc.) of both partners, but that does not add new or fresh layers/angles to the pairing, is rewarded 8/10. So, 8 points are for a quality execution, and 2 more for adding something new on top of that execution (10/10). New additions to the dynamic that ignore significant core-trait potential are not rewarded with the 2 extra bonus points, as this is considered a cheap shortcut. This would probably give a score of 4-5.

NB: A dynamic that is emotionally strong and believable but would still work largely unchanged with a generic protagonist (because it is mainly based on trauma, for example) should score high on Romance but low on this category (Originality).

10) Integration

Name: Integration

Main Issue: How does the story connect plot, main romance and worldbuilding? Are they separate? Do plot & romance exist side by side against a magical décor? Or do relationship dynamics, plot & worldbuilding form a structurally interdependent system where a change in one of them leads to (massive) responses in the others?

Details: Note that this category is very different from anything we had before. Worldbuilding measures the intrinsic quality of that aspect. Romance and Originality measure the intrinsic ship quality. Narrative Quality measures the intrinsic plot quality. But each of these 3 aspects can score very high on its own intrinsic quality while having little to do with the others, in which case the story scores low on integration. On the other hand, a story can also have a simple ship, worldbuilding and plot while the 3 aspects continuously interact. So, here we focus on the following questions:

Picture worldbuilding (both magic system and environment), relationship dynamics (of the primary ship in the story) and the main plot as the corners of a triangle. Between the corners you can draw 6 lines of causality (one in each direction along each side), each describing how a change in one aspect can lead to changes in another. The more the three aspects actually talk and respond to each other, the better. Integration is weak if any of the three elements (plot, relationship, worldbuilding) can be removed or altered without significantly affecting the others.

Motivation: Each relationship comes with its own main challenge/narrative engine. For example (the list is non-exhaustive): Harry/Hermione: how do you make room for the ship (Ron & Ginny)? Ron/Hermione: Can they accept each other for who they are? Snape/Hermione: how do you deal with power/age imbalances and toxicity? Draco/Hermione: Can Draco change and Hermione forgive? Viktor/Hermione: we know that they like each other. So, what happens with their relationship? This axis is meant to specifically assess how well the main challenge of the ship is addressed in the story. It is, and has to be, vikmione-specific. But it has to be discrimination-free. So, we cannot, for example, state that Viktor should join the Horcrux hunt, or that the hunt even has to be discussed. Maybe the story has a setting where this is not relevant. The important part is: if there is a Horcrux hunt and Hermione is together with Viktor, what does that mean for the ship? Hence, for vikmione, integration is the correct abstraction for measuring how well the main challenge of the relationship is handled.

Descriptors: Score the category on a scale of 0-10, with an integer number.

Combination of individual axes into a total score

Within each main category, scores are combined as a single unweighted average. This allows trade-offs in quality between, for example, style and theme.

However, when combining the scores from A, B and C we must be much more careful. For example, a rewritten Hermione provides a way too easy solution in ship dynamics. Another problem is that the rubric favours overengineered stories. Here, overengineered means that the story does everything textbook-right but still feels emotionally shallow to the reader. Organic impression must be rewarded as well as system design. For this reason, we introduce some auxiliary parameters, and a judge severity parameter alpha, where alpha=1 means a very strict judge and alpha=0 means a very forgiving one.

D = avg(Romance,Theme Quality)

E = avg(Romance,Psychological Credibility,Style Quality)

F = avg(Integration,Originality,Worldbuilding,Narrative Quality)

D contains the only two axes in the rubric that strongly depend on emotional impact on the reader. We take an average, so that missing out on a single axis will not compromise D too much. Romance strongly depends on how well the reader can feel along with the romance; Theme strongly depends on how the emotions of the reader amplify message transfer. We use D to ensure that overengineered stories do not get a free pass in the total score. E and F are not directly incorporated, but serve as a proxy check on whether the total score is fairly accumulated. E solely deals with character emotions and whether the reader can access those. F solely deals with story design. Hence, the two scores are almost orthogonal, meaning that they can be used well to check for rubric biases.
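The definitions of D, E and F above can be sketched in Python as follows. This is only an illustration: the dictionary keys, the function names and the example scores are hypothetical and not part of the rubric itself.

```python
# Axis names are hypothetical dictionary keys chosen for this sketch.
D_AXES = ("Romance", "Theme Quality")
E_AXES = ("Romance", "Psychological Credibility", "Style Quality")
F_AXES = ("Integration", "Originality", "Worldbuilding Quality", "Narrative Quality")


def avg(values):
    """Unweighted arithmetic mean, always returned as a float."""
    values = list(values)
    return sum(values) / len(values)


def auxiliary_scores(axes):
    """Return the auxiliary parameters D, E and F for one story.

    `axes` maps an axis name to its integer score (0-10).
    """
    return {
        "D": avg(axes[name] for name in D_AXES),
        "E": avg(axes[name] for name in E_AXES),
        "F": avg(axes[name] for name in F_AXES),
    }


# Made-up axis scores, for illustration only.
example_axes = {
    "Romance": 8,
    "Theme Quality": 6,
    "Psychological Credibility": 7,
    "Style Quality": 9,
    "Integration": 6,
    "Originality": 5,
    "Worldbuilding Quality": 7,
    "Narrative Quality": 8,
}
aux = auxiliary_scores(example_axes)
print(aux)  # {'D': 7.0, 'E': 8.0, 'F': 6.5}
```

In this made-up example E (8.0) is clearly higher than F (6.5), which would suggest a story that is emotionally accessible but weaker in story design.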

Now, what we must do is incorporate D in the total in such a way that stories with low theme but very high A and/or low romance but very high overall C cannot slip through too easily.

As such, we calculate the total score as:

T = (1-alpha) × avg(A,B,C) + alpha × min(A,B,C,D)

We incorporate D in the min, but not the average, so that axes are not counted double. This means a forgiving judge will look past overengineering, but a strict judge will not. Note that E and F are not incorporated in the total score T at all. They just serve as proxy scores to compare T with and see if T makes sense.

Next, we measure the score drift with respect to the judge as:

S = - dT/d alpha = avg(A,B,C) - min(A,B,C,D)

A low drift then means that quality is intrinsically earned, as the average and minimum are close together. A large drift means that part of the quality comes from trade-offs between the categories and/or is bought with overengineering. Hence, Total and Drift tell you what you need to know about a story without having to understand the intrinsic properties and subtleties of the rubric.
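As a minimal sketch, the total score T and the drift S defined above can be computed as shown below. The function name total_and_drift and the example values are assumptions made for illustration; A, B and C are taken to be the already-averaged category scores, and D the auxiliary parameter.

```python
def total_and_drift(a, b, c, d, alpha=0.5):
    """Return (T, S) for category scores A, B, C and auxiliary score D.

    T = (1 - alpha) * avg(A, B, C) + alpha * min(A, B, C, D)
    S = avg(A, B, C) - min(A, B, C, D)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 (forgiving) and 1 (strict)")
    mean_abc = (a + b + c) / 3      # D is deliberately left out of the average
    weakest = min(a, b, c, d)       # D only enters through the minimum
    total = (1 - alpha) * mean_abc + alpha * weakest
    drift = mean_abc - weakest
    return total, drift


# Made-up scores: a story that is strong on average but has a weak D.
uneven = total_and_drift(8.0, 7.0, 6.0, 5.0, alpha=0.5)
print(uneven)    # (6.0, 2.0)

# Made-up scores: a story with the same average and no weak spot.
balanced = total_and_drift(7.0, 7.0, 7.0, 7.0, alpha=0.5)
print(balanced)  # (7.0, 0.0)
```

Both hypothetical stories have avg(A,B,C) = 7, but the uneven one loses a full point at alpha=0.5 and shows a drift of 2, while the balanced one keeps its score and has zero drift.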

Note that this system does accept trade-offs within categories, but less so between categories. This is defendable, as the categories are not randomly chosen, but consist of thematically coherent collections of rubric axes. As such, the system is much easier to defend than arguing over weight functions between the different rubric axes, etc. You can defend the thematic aggregation, and you accept trade-offs within a category but not between categories.

As such, the rubric score will do exactly what it needs to do: it is defendable, reproducible and discrimination-free within the scope, but it punishes stories outside of the scope:

As such, the combination of total score & drift rewards intrinsic quality in a reproducible, defendable and discrimination-free way, and penalizes stories that are outside the scope and/or rely on trade-offs.

Unless the specific purpose requires otherwise, we recommend using alpha=0.5 (this is what we used). This is a somewhat arbitrary choice, but an understandable one. A fully strict judge does not accept any trade-offs in quality and, therefore, does not reward peak performance in a single category at all. This feels too strict. A fully forgiving judge does not punish overengineering and accepts trade-offs in quality too easily. alpha=0.5 then feels like a nice middle ground.
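Why alpha=0.5 acts as a middle ground can be seen by rewriting T in terms of the drift S. The short derivation below uses only the definitions of T and S given earlier; the LaTeX notation (amsmath) is a presentation choice for this sketch.

```latex
\begin{align*}
T(\alpha) &= (1-\alpha)\,\operatorname{avg}(A,B,C) + \alpha\,\min(A,B,C,D) \\
          &= \operatorname{avg}(A,B,C) - \alpha\,\bigl[\operatorname{avg}(A,B,C) - \min(A,B,C,D)\bigr] \\
          &= \operatorname{avg}(A,B,C) - \alpha\,S, \\
T(0.5)    &= \tfrac{1}{2}\,\bigl[\operatorname{avg}(A,B,C) + \min(A,B,C,D)\bigr].
\end{align*}
```

So T falls linearly with alpha at rate S, and at alpha=0.5 the total is the midpoint between the category average and the weakest of A, B, C and D.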

The rubric is usable outside of the vikmione scope as well (but within the Harry Potter fandom). Categories A and B can stay as they are. But C needs adjustment. Originality & Romance can stay as they are, as they apply to any ship and ships are even present in Harry Potter stories that have no focus on a main ship. Although it is probably unfair to base 1/3 of the total on ships in a story that is not primarily about ships. But that is a scope issue, not a problem with the rubric.

However, the Viktor Krum axis and the Integration axis as main evaluation tools are vikmione-specific. Judging other (ship-)stories with these axes is unfair, as the challenges and circumstances are completely different, so these aspects are no longer the main focus. So, we must shift these two axes to aspects vital to those stories.

Viktor Krum should be replaced with an axis judging extrapolation of the relevant character, if this is an OC or an underused canonical character, like Draco Malfoy. However, this does require a complete redesign of the axis. A Viktor extrapolation should consider aspects like fame, culture, skills & flaws, etc. But a Draco extrapolation should focus more on, for example, the impact of Draco growing up in a Death Eater family, and on how to integrate his Slytherin traits in a fully rounded character without assigning him unrealistic extra skills. Canon, for example, strongly suggests that his exam results are similar to Harry’s, so he cannot be academically brilliant. For characters with a solid canonical basis like Ron, Harry or Snape, the axis should simply be dropped (giving C 3 axes instead of 4).

Integration should be replaced with an axis measuring the specific challenge of the relevant ship(s). Integration between worldbuilding, plot and ship is less important for Dramione, for example, but the enemies-to-lovers arc is. Hence, we need an axis to measure this. This would be the same concept (integration), but between relationship dynamics & emotional development, not relationship/plot/worldbuilding. The same holds for Ron. For Harry, one would measure integration between relationship dynamics and social structures (how do the Weasleys react to harmony and how does this affect Harry and Hermione’s social networks in the magical world). For Snape, one would measure something involving power imbalances & toxicity.