Page 136 - Weiss, Jernej, ur./ed. 2025. Glasbena interpretacija: med umetniškim in znanstvenim┊Music Interpretation: Between the Artistic and the Scientific. Koper/Ljubljana: Založba Univerze na Primorskem in Festival Ljubljana. Studia musicologica Labacensia, 8

Sinatra “adaptations”. These include: “Say Something” by A Great Big World (2013);¹² “Still D.R.E.” by Dr. Dre (1999; here the unsuspecting listener can hear “Sinatra” rapping);¹³ “Bad Habits” by Ed Sheeran (2021; in this adaptation “Sinatra” curiously stays off-pitch much of the time);¹⁴ and others. The last two examples are perhaps less convincing than the Nirvana clone as Sinatra here sings in a style that he would probably never have adopted in real life, even if his voice may be imitated in a passable way. Not all AI-imitated voices are fully convincing though; the YouTube channel “Elvis Presley AI” presents a range of AI Elvis clones which often do not manage to reproduce the fullness of the singer’s rich baritone voice, as exemplified by “Baby Got Back”.¹⁵

How Large Language Models Work
The breakthrough in the development of generative AI that anyone can use to create texts, pictures and music was the development of “Large Language Models” (LLMs), which can cope with enormous amounts of data. However, it has to be stressed that LLMs are not databases but algorithms that create texts, pictures, music and other media on the basis of examples in a database they have access to, coupled with a set of rules they have been programmed to apply to them. The larger the amount of data an LLM has access to, the more convincing its output will be. We will look here at the creation of texts, because this is where the principle was developed and it is somewhat easier to understand. Later we will move on to the creation of music through generative AI.
When creating a text, an LLM checks a vast amount of data to find out which word usually follows the previous one in a chain of words (or follows a previous phrase, sentence, or paragraph, all within the topical context of what has been requested of it), adds that word, and repeats the process all over again. The broader the contextual information, the more convincing the result will appear. Thus the task is to
12 “Frank Sinatra – Say Something (AI Cover),” AI Music World, uploaded on 13 September 2023, YouTube video, 2:40, https://www.youtube.com/watch?v=6RgvBAoent4.
13 “Frank Sinatra – Still D.R.E. (AI Cover),” AI Music World, uploaded on 2 September 2023, YouTube video, 3:38, https://www.youtube.com/watch?v=VrsNClZ8F38.
14 “Frank Sinatra – Bad Habits (AI Cover),” AI Music World, uploaded on 20 August 2023, YouTube video, 4:11, https://www.youtube.com/watch?v=jbbyGF1t5Eg.
15 “Baby Got Back – Elvis Presley,” Elvis Presley AI, uploaded on 5 June 2023, YouTube video, 0:52, https://www.youtube.com/watch?v=_uy2NQWal-4.
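
The word-by-word procedure described in this section (find the word that usually follows the previous one, add it, repeat) can be illustrated with a deliberately tiny sketch. This is an assumption-laden toy, not how an actual LLM is built: real systems rely on trained neural networks and far longer contexts, whereas the code below merely counts which word follows which in a single invented sentence. The function names `train_bigrams` and `generate` and the sample `corpus` are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        followers[current][following] += 1
    return followers


def generate(followers, start, max_words=8):
    """Repeatedly append the most frequent follower of the last word."""
    output = [start]
    for _ in range(max_words - 1):
        candidates = followers.get(output[-1])
        if not candidates:
            break  # the last word was never followed by anything
        # most_common(1) returns a list with one (word, count) pair
        output.append(candidates.most_common(1)[0][0])
    return " ".join(output)


# Invented toy corpus (an assumption for illustration only)
corpus = "the singer sings the song and the singer sings the blues"
model = train_bigrams(corpus)
print(generate(model, "singer", max_words=4))  # prints: singer sings the singer
```

Because this toy looks only one word back, it soon starts to circle ("singer sings the singer"), which gives a small-scale impression of why broader contextual information makes the result more convincing.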
