Page 137 - Weiss, Jernej, ur./ed. 2025. Glasbena interpretacija: med umetniškim in znanstvenim┊Music Interpretation: Between the Artistic and the Scientific. Koper/Ljubljana: Založba Univerze na Primorskem in Festival Ljubljana. Studia musicologica Labacensia, 8

AI and Musical Interpretation
train a neural network (the LLM) to predict the next word in a given sequence of words, no matter if that sequence is long or short, in German or in English or in any other language, whether it's a tweet or a mathematical formula, a poem or a snippet of code. All of those are sequences that we will find in the training data.16
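The next-word prediction described above can be sketched in highly simplified form. The following toy model merely counts which word follows which in a tiny corpus; a real LLM learns such statistics with a neural network over billions of sequences, but the underlying idea of "predict the statistically most likely next word" is the same:

```python
from collections import Counter, defaultdict

# Count, for each word, which words follow it in a tiny "training corpus".
corpus = "the cat sat on the mat and the cat slept".split()
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the word that most often followed `word` in the corpus."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" follows "the" twice, "mat" only once
```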
To achieve this, words are "translated" into word vectors, i.e., numbers that allow for those statistical operations to be undertaken more smoothly.17 This means the result of an LLM operation is what is statistically most likely to be convincing on the basis of the model's extensive database.
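The notion of a word vector can be illustrated with a small sketch. The three-dimensional vectors below are invented for illustration only; real models learn vectors with hundreds of dimensions from data, but the principle that related words end up close together geometrically is the same:

```python
import math

# Hypothetical, hand-picked 3-dimensional word vectors (illustration only;
# real embeddings are learned from data, not chosen by hand).
vectors = {
    "piano":  [0.9, 0.8, 0.1],
    "violin": [0.8, 0.9, 0.2],
    "tweet":  [0.1, 0.2, 0.9],
}

def cosine(a, b):
    """Cosine similarity of two vectors: closer to 1.0 = more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = lambda v: math.sqrt(sum(x * x for x in v))
    return dot / (norm(a) * norm(b))

# Semantically related words lie closer together in the vector space:
print(cosine(vectors["piano"], vectors["violin"])
      > cosine(vectors["piano"], vectors["tweet"]))  # True
```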
The LLM also has certain pre-programmed rules (or "parameters"), such as when to use a noun, a verb or a participle (or perhaps certain chords, or voice-leading rules), or which words are more likely to appear in a certain thematic context (beyond the purely grammatical and syntactical context). There is also "instruction tuning", which teaches the system what human operators are likely to expect as a response to a given input. Finally, there is learning through human feedback – the algorithm constantly interacts with human operators and changes its rules on the basis of the responses it gets from them.18 This means that those rules are not static – they can be changed by the system itself over time on the basis of what it learns from an expanded data set. This is why, after a while, even the algorithm's developers no longer know in detail what set of rules the algorithm is applying.
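The feedback loop just described can be caricatured in a few lines. In this toy version the "rules" are simply numeric scores that a thumbs-up or thumbs-down nudges; in a real system, human feedback adjusts millions of neural-network weights through reinforcement learning, but the point that the rules change with use rather than being fixed is the same:

```python
# Toy illustration of learning from human feedback: each candidate
# response carries a score that approval raises and disapproval lowers.
# (Real systems adjust neural-network weights, not a lookup table.)
scores = {"formal answer": 0.5, "casual answer": 0.5}

def feedback(response, approved, rate=0.1):
    """Nudge a response's score up or down based on human feedback."""
    scores[response] += rate if approved else -rate

feedback("casual answer", approved=True)
feedback("formal answer", approved=False)

# After feedback, the system prefers the approved style:
print(max(scores, key=scores.get))  # "casual answer"
```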
The questions or requests put to an LLM are called "prompts", and their "design" is crucial to ensure that the response fully covers what the prompter expects it to engage with. Good prompt design is a central aspect of the successful use of generative AI, and it is often advisable to vary prompts several times to see how the responses change (at a basic level this may be comparable to the use of search engines, where a similar principle applies).
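The advice to vary a prompt can be sketched programmatically. Here `ask_model` is a hypothetical placeholder, not a real API; in practice one would call an actual LLM service and compare the responses side by side:

```python
# Sketch of systematic prompt variation. `ask_model` stands in for a call
# to any LLM API; here it only echoes the prompt so the sketch runs.
def ask_model(prompt):
    return f"[model response to: {prompt}]"  # placeholder, not a real model

base = "Explain voice-leading rules"
variants = [
    base,
    base + " in two sentences",
    base + " for a first-year harmony student",
]

# Run each variant and inspect how the responses differ:
for prompt in variants:
    print(prompt, "->", ask_model(prompt))
```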
                 It is important to note that the algorithm’s goal is to create a text that
            appears as “human” as possible, that is, a convincing answer that passes the
16   Andreas Stöffelbauer, "How Large Language Models Work. From Zero to ChatGPT," Data Science at Microsoft, October 24, 2023, https://medium.com/data-science-at-microsoft/how-large-language-models-work-91c362f5b78f. This article also discusses the AI's analysis of pictures and music.
17   Timothy B. Lee, "A Jargon-Free Explanation of How Large Language Models Work," Ars Technica, July 31, 2023, https://arstechnica.com/science/2023/07/a-jargon-free-explanation-of-how-ai-large-language-models-work/.
            18   See: Stöffelbauer, “How Large Language Models Work.”

