How we score every song
Every lyric forged or evaluated in SongForgeAI is measured across 12 metrics in three tiers. Scoring is deliberately hard. A 50 is average. A 90+ is historically rare territory.
Craft (25%)
Can this person write? Mechanics, structure, rhyme, and word choice.
Expression (40%)
Does it say something worth hearing? Specificity, originality, truth, and voice.
Impact (35%)
Will anyone remember it tomorrow? Transcendence, arc, stickiness, and genre fit.
The 12 Metrics
1. Prosody & Musicality
Meter, stress patterns, consonant and vowel clusters, intentional silence, and breath points. Does the lyric feel good in the mouth?
Natural rhythmic flow that a singer can inhabit without fighting the phrasing. Stressed syllables land on strong beats.
2. Structural Architecture
Song shape, arc, verse progression, chorus return, and bridge revelation. Does the structure serve the story?
Each section has a clear job. Verses build, choruses resolve, the bridge shifts perspective. Nothing feels arbitrary.
3. Rhyme Intelligence
Rhyme as craft servant: internal rhyme, slant rhyme, strategic non-rhyme. Does the rhyme scheme feel intentional rather than forced?
Rhymes land with purpose. A mix of perfect, slant, and internal rhyme that never bends meaning to satisfy a sound.
4. Economy of Language
Every word earning its place. No filler, no padding, no lines that exist only to set up a rhyme.
You cannot remove a word without losing something. Every syllable carries weight or music.
5. Lyrical Specificity
Concrete imagery, sensory detail, proper nouns, time anchors. The opposite of abstract generalities.
The song lives in a real place with real objects. "Tangerines and someone else's smile" instead of "memories of you."
6. Imagery Originality
Fresh metaphors, defamiliarized objects, governing images that haven't been written to death.
Images that surprise on first read and deepen on second. No shattered hearts, no oceans of tears, no wings of freedom.
7. Emotional Truth
The ring-test: does it feel true? Earned emotion, unforced vulnerability, no borrowed sentiment.
The emotion arrives through specificity and honesty, not through telling the listener what to feel.
8. Voice & POV Integrity
Narrator consistency, perspective clarity, and a credible speaker. Does this sound like one person talking?
A distinct human presence. Word choices, diction, and references that belong to one coherent narrator.
9. The Transcendent Line
The unrepeatable line. Not necessarily the cleverest; the truest. The line someone would quote.
At least one line that stops a listener cold. The kind of line people screenshot and share.
10. Emotional Arc
Does the song move from state A to state B? Revelation, release, recalibration. Not just emotion, but emotional motion.
The listener ends the song in a different place than they started. Something shifted.
11. Memorability
The one-hour test: could you recall this 60 minutes after hearing it once? Hooks, refrains, and sticky phrases.
Lines that lodge in the brain involuntarily. A chorus you catch yourself humming without trying.
12. Genre Authenticity
Does this honor its genre while extending it? Genre fluency without genre cliché.
A country song that sounds like country but doesn't sound like every country song. Respect and surprise.
Why scores are hard to game
We built anti-inflation rules into the scoring system so that a high score actually means something.
Gravity Rule
The default is 50, not 80. Every point above average must be earned with specific evidence from the lyrics.
Burden of Proof
Scores above 80 require the scorer to cite specific lines and explain why they justify the number.
Antagonist Ceiling
A dedicated adversarial voice tries to lower every score. If it finds a real weakness, the score drops.
Historical Context
Scores are anchored against the best lyrics ever written. A 90+ means the song stands alongside recognized classics.
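As a rough illustration, the Gravity Rule, Burden of Proof, and Antagonist Ceiling could be modeled along these lines. This is a minimal sketch, not SongForgeAI's actual implementation: the names `MetricScore`, `check_burden_of_proof`, and `apply_antagonist` are hypothetical, and the size of the antagonist's deduction is an assumed parameter, since the rules above do not say how far a score drops.

```python
from dataclasses import dataclass, field, replace

BASELINE = 50            # Gravity Rule: every metric starts at average
EVIDENCE_THRESHOLD = 80  # Burden of Proof: scores above this need citations


@dataclass(frozen=True)
class MetricScore:
    """One metric's score plus the evidence offered for it (hypothetical shape)."""
    value: int = BASELINE
    cited_lines: list[str] = field(default_factory=list)
    rationale: str = ""


def check_burden_of_proof(score: MetricScore) -> bool:
    """Return True if the score carries the evidence its level requires."""
    if score.value <= EVIDENCE_THRESHOLD:
        return True
    # Above the threshold, the scorer must cite lines and explain them.
    return bool(score.cited_lines) and bool(score.rationale.strip())


def apply_antagonist(score: MetricScore, penalty: int) -> MetricScore:
    """Lower a score after the adversarial pass finds a real weakness.

    `penalty` is an assumed input; the source does not specify its size.
    """
    if penalty < 0:
        raise ValueError("penalty must be non-negative")
    return replace(score, value=max(0, score.value - penalty))


unsupported = MetricScore(value=85)
supported = MetricScore(
    value=85,
    cited_lines=["Tangerines and someone else's smile"],
    rationale="Concrete, sensory, and unexpected.",
)
print(check_burden_of_proof(unsupported))  # False
print(check_burden_of_proof(supported))    # True
```

What happens to a score that fails the check (rejection, a cap, or a re-score) is left open here because the rules above do not specify it.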
Grade Scale
How the composite score works
Each metric scores 0-100. The composite is a weighted average across the three tiers: Craft 25%, Expression 40%, Impact 35%.
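A sketch of how that weighted average could be computed follows. The tier weights (25/40/35) and the grouping of four metrics per tier come from the tier descriptions above. Equal weighting of the metrics inside each tier, the metric key names, and the function name `composite_score` are assumptions made for illustration.

```python
from statistics import mean

# Tier weights as stated on this page.
TIER_WEIGHTS = {"craft": 0.25, "expression": 0.40, "impact": 0.35}

# Metrics 1-4, 5-8, and 9-12, grouped by tier. Key names are hypothetical.
TIER_METRICS = {
    "craft": ["prosody", "structure", "rhyme", "economy"],
    "expression": ["specificity", "imagery", "emotional_truth", "voice"],
    "impact": ["transcendent_line", "emotional_arc", "memorability", "genre"],
}


def composite_score(scores: dict[str, float]) -> float:
    """Weighted average of tier averages, rounded to one decimal place.

    Assumes each metric counts equally within its tier.
    """
    for name, value in scores.items():
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be in 0-100, got {value}")
    total = 0.0
    for tier, metrics in TIER_METRICS.items():
        tier_average = mean(scores[m] for m in metrics)
        total += TIER_WEIGHTS[tier] * tier_average
    return round(total, 1)


# Example: Craft metrics at 60, Expression at 70, Impact at 80.
example = {}
for tier, level in (("craft", 60), ("expression", 70), ("impact", 80)):
    for metric in TIER_METRICS[tier]:
        example[metric] = level

print(composite_score(example))  # 71.0
```

In the example, 0.25 × 60 + 0.40 × 70 + 0.35 × 80 = 15 + 28 + 28 = 71. Because Expression carries the largest weight, a ten-point gain there moves the composite more than the same gain in Craft.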
See it in action
Every song you forge or evaluate gets a full 12-metric breakdown with reasoning per metric.