Page 137 - Weiss, Jernej, ur./ed. 2025. Glasbena interpretacija: med umetniškim in znanstvenim┊Music Interpretation: Between the Artistic and the Scientific. Koper/Ljubljana: Založba Univerze na Primorskem in Festival Ljubljana. Studia musicologica Labacensia, 8
AI and Musical Interpretation
train a neural network (the LLM) to predict the next word in a given sequence of words, no matter if that sequence is long or short, in German or in English or in any other language, whether it's a tweet or a mathematical formula, a poem or a snippet of code. All of those are sequences that we will find in the training data.16
To achieve this, words are “translated” into word vectors, i.e., numbers that allow for those statistical operations to be undertaken more smoothly.17 This means the result of an LLM operation is what is statistically most likely to be convincing on the basis of the model’s extensive database.
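By way of illustration, the two ideas just described (word vectors and statistically likely next words) can be sketched in a few lines of code. This is a deliberately toy example: the two-dimensional vectors and the "context" are invented numbers, whereas real models learn vectors with hundreds or thousands of dimensions from their training data.

```python
import math

# Invented toy word vectors; real LLMs learn high-dimensional
# vectors automatically during training.
vectors = {
    "cat":   [0.9, 0.1],
    "dog":   [0.8, 0.2],
    "piano": [0.1, 0.9],
}

def dot(a, b):
    """Similarity score between two vectors."""
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Pretend this vector summarises the preceding words ("the furry ...").
context = [0.85, 0.15]
words = list(vectors)
probs = softmax([dot(context, vectors[w]) for w in words])
for word, p in zip(words, probs):
    print(f"{word}: {p:.2f}")
```

Because "cat" and "dog" point in nearly the same direction as the context vector, they receive most of the probability; "piano" does not. Scoring candidates against the context and picking from the resulting distribution is, in miniature, what "statistically most likely" means above.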
The LLM also has certain pre-programmed rules (or “parameters”), such as when to use a noun, a verb or a participle (or perhaps certain chords, or voice-leading rules), or which words are more likely to appear in a certain thematic context (beyond the purely grammatical and syntactical context). There is also “instruction tuning”, which teaches the system what human operators are likely to expect as a response to a given input. Finally, there is learning through human feedback: the algorithm constantly interacts with human operators and changes its rules on the basis of the responses it gets from them.18 This means that those rules are not static; they can be changed by the system itself over time on the basis of what it learns from an expanded data set. This is why, after a while, even the algorithm’s developers no longer know in detail what set of rules the algorithm is applying.
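The feedback mechanism can be caricatured in a few lines. This is not the actual reinforcement-learning procedure used for real LLMs; it is only a minimal sketch, with invented names and numbers, of the basic idea that repeated human judgements gradually shift the system's internal preferences.

```python
# Two candidate response styles with equal initial scores
# (names and values are invented for illustration).
scores = {"formal reply": 0.5, "casual reply": 0.5}

def record_feedback(choice, liked, lr=0.1):
    """Nudge the chosen response's score up or down by a small step."""
    scores[choice] += lr if liked else -lr

# Human operators repeatedly prefer the formal reply...
for _ in range(3):
    record_feedback("formal reply", liked=True)
    record_feedback("casual reply", liked=False)

print(scores)  # the system's "rules" have shifted towards formality
```

After a few rounds of feedback the scores diverge, and no single rule was ever written down by a developer; the preference emerged from the accumulated responses, which is the sense in which the rules are "not static".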
The questions or requests put to an LLM are called “prompts”, and their “design” is crucial to ensure that the response fully covers what the prompter expects it to engage with. Good prompt design is a central aspect of the successful use of generative AI, and it is often advisable to vary prompts several times to see how the responses change (at a basic level this may be comparable to the use of search engines, where a similar principle applies).
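The advice to vary prompts systematically can itself be automated. In the sketch below, `generate` is a stand-in for whatever model call one actually uses (an API client or a local model); here it is a trivial placeholder so that the comparison loop runs at all. The prompt wordings are invented examples.

```python
def generate(prompt):
    """Placeholder for a real model call (e.g. an API client)."""
    return f"[model response to: {prompt!r}]"

# A base prompt and some variants that narrow the request.
base = "Explain voice leading"
variants = [
    base,
    base + " in two sentences",
    base + " for a first-year harmony student, with one example",
]

# Run each variant and compare the responses side by side.
for prompt in variants:
    print(generate(prompt))
```

Comparing the outputs of such variants side by side makes it easier to see which formulation draws out the response one was hoping for, which is exactly the trial-and-error principle described above.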
It is important to note that the algorithm’s goal is to create a text that
appears as “human” as possible, that is, a convincing answer that passes the
16 Andreas Stöffelbauer, “How Large Language Models Work. From Zero to ChatGPT,” Data Science at Microsoft, October 24, 2023, https://medium.com/data-science-at-microsoft/how-large-language-models-work-91c362f5b78f. This article also discusses the AI’s analysis of pictures and music.
17 Timothy B. Lee, “A Jargon-Free Explanation of How Large Language Models Work,” Ars Technica, July 31, 2023, https://arstechnica.com/science/2023/07/a-jargon-free-explanation-of-how-ai-large-language-models-work/.
18 See: Stöffelbauer, “How Large Language Models Work.”