Volkmar Klien: Total Optimization and Defiance – About artificial intelligence and musical composition

Photo: Johannes Novohradsky

The first AI x Music Festival, organized by Ars Electronica and the European Commission as part of the STARTS initiative, is dedicated to the encounter between human creativity and technical perfection. From September 6 to 8, 2019, Ars Electronica will be gathering musicians, composers, cultural historians, technologists, scientists and AI developers from all over the world in Linz to discuss the interaction between humans and machines through concerts and performances, conferences, workshops and exhibitions.

Automated data collection and processing with machine learning is about to introduce fundamental changes into societal forms of communication. The widespread application of these new technologies leads to tectonic changes in the structure of human societies, which have already progressed quite far.

While it may not seem obvious at first glance, music, and more particularly music composition, is a very suitable field for dealing with these very shifts and reflecting upon them. Music as a form of communication, as the most primordial form of social media, influences the human communities where it unfolds in multiple ways and, at most times and for most people, in an unthematic way, i.e. without being reflected in its operation. It is precisely because music is rarely in intellectual focus and can touch the listeners’ hearts without language that it is so efficacious. Music creates virtual and hybrid worlds for us to inhabit. In contrast to visual media, it is unrestrained by boundaries and intrudes on us physically: synchronizing, arranging, influencing, motivating.

The musical avantgardes of post-war Europe were dealing intensively, both in theory and practice, with these aspects of music against the backdrop of their relatively new role in mass media. In this respect, this tradition makes for a rich background for questioning new AI-based communication technologies. For new technical possibilities cannot simply be regarded as neutral extensions of the previous tool box. Any tool will always change the world of its users. And with the possibilities at our disposal to interact with our environment, our perception of the world as our space of possibilities is changing.

The Avantgardes – economical-technical vanguard and the traditions of artistic avantgardes

The compositional avantgardes in the fields of electronic and computer music were, until very recently, acting in close contact and constant exchange with those of technology. Composers were deeply involved in the development of electronic musical instruments and computer music programs from the very beginning.

In the meantime, the worlds seem to have separated. The area where most research approaches in AI and music come from, Music Information Retrieval (MIR), appears to be a project of the computer sciences, a branch of information retrieval, rather than an artistically motivated quest for new compositional possibilities.

One reason for that could surely be that the concept of an artistic vanguard may appear slightly anachronistic these days. Generally speaking, the age of artistic promises of salvation seems to be over, and it is the proponents of data capitalism who are now being cast in the role of futuristic preachers. In that regard, today’s technology prophets are the true successors of the Futurists.

The fact that the manifestos[1] of the 20th century avantgardes are just as full of “disruptive” rhetoric as today’s TED talks and product presentations is one of the parallels between the tech sermon and the avantgarde manifesto. While artistic avantgardes definitely regarded themselves as “disruptive,” they still differ fundamentally from those of data business; for they never saw themselves as “optimized” according to defined procedures. As examples of disruption without intended optimization, Tristan Tzara’s “Manifest Dada 1918,”[2] Mladen Stilinovic’s “In praise of laziness,”[3] H.C. Artmann’s “Der poetische Akt”[4] or Pauline Oliveros’ “The Poetics of Environmental Sound”[5] could be mentioned.

Music & Math, Structure & Order

European music history and theory have been characterized from the very beginning by a closeness to mathematics. Numerological reminiscences, strict harmony, hopes for the revelation of transcendental structures of order in and through music permeate various textbooks of composition. In honor of this occasion we should mention Johannes Kepler’s *Harmonices Mundi*,[6] published in Linz in 1619, although it is not a music textbook per se.

Regardless of this, hardly any thoroughly formalized compositional practice exists. Even the much-quoted serial works of the 20th century are small in number, and the formal methods applied within them vary tremendously from work to work. There can be no talk of a single, established form of serial composition.

Still, based on serial approaches, we can spot some of the aspects that distinguish the current projects emerging from AI research from the traditional approaches in the field of formalized or automatically generated (generative) music. In very broad terms, a transition can be observed from rule-based to data-based approaches. In serialism, at least some attempt was made to establish the most rigid relation of the note pitches and durations to the structure of the compositional whole by a defined set of rules, thereby providing it with a substantiation. In the approaches that are founded in Big Data, the structure of the generated signal is the result of databases from which correlations between the single datasets, for example compositions, can be inferred. For Big Data-informed approaches in automatic generation, interpretation and sorting of music, the focus is not on explicit rules for composition, arranging, theory of harmony or interpretation, but on the data of musical practice. This practice of human music making is to be thought of in a very broad way. Its data traces reach from written scores and audio recordings of musical works to the data of their reception, and to all imaginable sorts of data that can be brought into context by its collectors (position data, surf behavior, status messages, health data, etc.). Music has never confined itself to a set area defined as musical (the score, mere numeric relations or singing), but was always embedded in the social and political whole of human existence. Particularly with regard to music and AI, it is essential to keep this embedding of music in the diversity of societal and individual execution in mind.

Music making? Composing? Machines?

From the composer’s perspective, very concrete questions are to be raised about the fundamental basics and the concept of art with regard to quite a few AI-based music projects of the global data economy. However, the eventual appearance of machine-based autonomous musicians in composition, improvisation and interpretation will shed new light on old questions, prompting us to rethink what was already implicitly accepted. For the question of whether a robot, i.e. a piece of software, can make music or compose also implies the question of what it actually means to make music and compose. To quickly anticipate one point: From the viewpoint of art, composition certainly does not confine itself to the production of new music pieces along established and defined expectations about what music is supposed to be and the role it is supposed to play.

Music making?

In this context, reflecting on music is less about pieces of music, or works, but rather a matter of pondering about music making, for music ultimately can only exist as human activity.

Music making (and therefore composing) means forming communities and designing complex, highly dynamic societal structures of human interaction. In the very beginning, music probably amounted to groups of humans singing together, all at the same time and place. Along with the differentiation of musical media techniques far beyond human singing, from bone flutes and drums to mechanical musical instruments to the present digital music-making forms based on loudspeakers and networks, the result is a richness of possibilities to participate in music-making communities that can be selected relatively freely. Headphones and networks allow us to feed ourselves our choice of soundtrack.

Music today has a broad range of roles available, which it can perform according to the specifications and needs of the recipients. Few of them are close to art or mathematics. Music is listened to for recreation or distraction, it serves as a portable habitat and is also used – apart from innumerable other fields of application – as self-medication and horizon putty. The right kind of music performs miracles over the weekends in the clubs around town in propping up the young employees of banks and industries for another week of high performance at the office. It is here that the possibilities and promises of the automatic generation of musical signals and optimized delivery to consumers can most easily be foreseen. All data that correlate or can be correlated with the consumers’ music can be included, ranging from consciously chosen playlists and concert visits to reading lists of books, motion patterns and accompanying habits of consumption.


The question of what it means to compose or create art is relatively easy to answer at first, if “creating pieces of music” is deemed a sufficient answer. But this would also mean ignoring the obvious impossibility of finding an ultimate, all-encompassing answer to this question. For all conceptual difficulties in the question of “composition” are simply packaged into “music pieces” and are shifted away from the defined problem area. “Music pieces” can be, for the sake of focusing on AI-based music generation, very well defined as signals used by humans as music in a purely pragmatic sense. For the development of algorithms to automatically generate music along established patterns, such a pragmatic definition would be entirely sufficient. From an artistic point of view, this approach remains of course problematic, since the really interesting things tend to happen where majority appeal in everyday musical practice is not exactly the norm.

Therefore, the issue of defining or narrowing down the process to be automated (artistic creation) will already raise fundamental problems. Asking, “What would qualify artistic activity as such?” is totally different from asking how things, or signals conforming with established notions of artworks, could be automatically produced. Art is not a matter of product development within set boundaries, but reflection, politics, and action in the free field.


Music making once meant (and in some cases still does mean) gathering in a space for collective activity. Even when somebody sang or played on his/her own, the produced sound was a direct product of bodily acts and only audible for those within earshot, i.e. immediate proximity. Music notation, requiring quite an amount of skill and knowledge both in recording and reception, was the first person-independent medium to transmit music in time and space. Transmission and recording of sound expanded the size of the music making communities enormously in time and space.

Human activity leaves audible traces in music, which can be given longevity by media technology, becoming repeatable and portable. The music-making community as such, of course, will not resurge again in technical playback, but a convincing sonic image can be read from the record disk or sound file, even if it will always remain a sort of shadow existence. In these realms of the shades, musicians soon were able to leave sonic traces with the aid of synthesis that no human or mechanical activity could have ever borne. Sampling then allowed us to overlay and construct several music shadow worlds to create new sonic realms. Seen this way, an automatically generated music piece is not only a new sound object, a new sequence of certain pitches in time, but the emulation of traces of communal human activity that has never happened in that form. A topic that with regard to social media – with its fake news, bots and nudging techniques – has already gained broad attention and even broader application.

Hybrid communities, substitutes, asymmetries

In music we find developments similar to those discovered in other forms of human communication: a transition from in situ, via in print, on air and online communities toward hybrid and substitute communities. These new possibilities in designing the music-making community result in completely new forms of music making, that – like the technical developments themselves – have to be mostly thought of as embedded within the amenities of data capitalism. The ideal appears to be the delivery of music optimized by the aid of AI, which is totally directed towards the individual consumption needs of the customer. Involving previous habits of listening, seeing and consumption, physiological data (heart and breathing frequency, sexual activities, menstruation cycles, etc.) and information pertaining to the general social and psychological situation can be gathered in order to choose the ideal musical soundtrack, to modify it accordingly or even to generate it anew. By way of dynamic feedback with millions of user behavior profiles (on turning the volume up or even down, will body movement or heart frequency synchronize with the playback?), the technology of playback can be optimized further without the need of conscious verbal feedback by the listeners. Symbolic layers thus appear in the shadow and from the observation of unreflected practices.

Music as an art form shaping communities is always a hierarchical construct with specific interpersonal relations. Listeners engage with each other and the common rhythm, whether dancing or not, and are able to celebrate both a loss of control and a sense of unification with the whole. Media technologies such as notation, amplification and forms of telepresence (from radio to streaming) expand the reach of the common rhythm and harmonies. Music automatically generated by AI technology promises an automatic, “individualized” remote control of music-making communities from the outside.

Music making always happens from the first-person perspective. Music is created in participation, otherwise it stays sound, or in the worst case, noise (as in the case of the bass from the neighbor’s party). Music making (active or passive) means a lack of distance, whereas observation and data collection are the opposite. AI-based measurement techniques see music always only from an external perspective, while the listeners, i.e. customers, of AI-based, automated playback remain in the first-person perspective, an ideally undetached participation. The recipients experience participation in a community, but are actually surrounded by a media feed. The effects, motivations and functionalities of this global playback and substitution machine are left consciously in the dark. Embedded in surveillance capitalism and financialization of all human relations, music can now more than ever sing paeans to the new rule, not only underscoring shifts in power, but even contributing to them, if those exposed to it perceive in it an expression of the purest of hearts, a testimony of free, individual artistic needs. Music has always had aspects of distant masturbation (somebody plays guitar on a stage and a group of people feel enchanted/cozy), but now whole new possibilities of cynical music are looming on the horizon.

With the measurement of all activity, there is a fundamental asymmetry in the view of things. The customers are running alone in a forest and have to put up with the worm’s-eye view, while everybody’s motion patterns are recorded somewhere in the control center, where the positions of the trees are also dynamically adapted and underscored by haunting chords. The asymmetries in information flow and design authority in hybrid digital worlds are also a central topic in the context of AI and music.

The problematic part, therefore, is not the substitutional character of the playback worlds per se (in a certain sense music, theater and film always had this element), but the inherent shifts of power structures. At the transition to a total clientele, industrially optimized music will strengthen the individual belief in immediate, personal and free experience. It is thus acting as a lubricant for economic and political shifts. For music is potent. It is not without reason that it can traditionally be found not only in the proximity of rite and mathematics, but also close to narcotics and sexual intercourse. The question of whether an AI is “making music” is therefore similar to the one of whether a sex robot is having sex or not. For the interacting human on the opposite side in each case, this can be answered with yes, but the whole affair seems rather multilayered in nature.

Musical total optimization

AI technologies with their data collections can be used as a wind tunnel of sorts: musical projects are tested in the stream of the recipients so that they can be developed and optimized further. The real-time observation of the music-making community in their habitat facilitates the construction of an ideal ergonomic musical signal that adapts dynamically to the conditions; bodily and culturally shaped to fit the conduct of everyday life, non-thematic and hence all the more informative.[7] Such optimized music can be dynamically tailored to the respective situation and person. The methods of individualization, however, concern only the surface that is shown to, or feigned for, the customers. The algorithm of the machine performing this part-individualization of a single one of millions of transmission channels is configured to be as universal and general as possible.

This musical optimization follows the business models of the data economy, often with the aim of tying up the customers’ attention for as long as possible, since music with its power over humans seems particularly well-suited for the subtle influencing of purchase decisions. The intended role of this dynamically generated music appears to be one of a soundtrack to the simultaneously generated parallel world from the transmission with all its asymmetries included. The more “individualized” the playback, the less it will be perceived as a form of conformity and can thus support this process even more efficiently. For within this hierarchy, the chimera of individuality only exists via the playback channel to the consumer. This kind of individualization acts in fact as a great normalizer, revealing itself as a lethal enemy of the individual act.

Those of us who view music also as an art form won’t be able to avoid seeing this form of optimization and individualization critically. For before there can be optimization, we must clarify what has to be optimized, and how. The suspicion seems obvious that activities with exact predefined aims would rather be services than artistic work. But the claim that art should be something higher, something more precious or abstracted, should be avoided; rather we should state that it is something entirely different. Art is always called upon to engage with its conditions and cannot settle for the implementation of specified requirements.

Challenges to the Avantgarde

All this by no means implies that computer-aided approaches for algorithmization and automatization of compositional activities are artistically uninteresting in principle. It’s rather about coping as a composer with these possibilities on a fundamental level, before integrating them as magical tools into everyday activity otherwise unchanged. With that, there is no contest between “real” art and AI to be heralded. It is instead about making them more fruitful for each other, precisely by confronting technical-economical optimization strategies with artistic, hence political, aims. Art always also means defiance and self-asserted independence from established role ascriptions and (musical) conventions. The compositional research work of musical avantgardes cannot limit itself to the expansion of sonic space or musical material. Its aim has to be to develop a constantly evolving understanding of musical, social possibilities within the equally ever-changing techno-political environments. Machines, whether made of soft- or hardware, are – even if the marketing departments of the companies keep harping on terms like “self-learning” and “autonomous” – made, constructed and operated by humans with specific interests. This is why a discussion of these technologies can only be meaningful in a wider economical-political context and it is consequently not enough to review the output of AI composers on a purely aesthetic level (e.g., “are the chord progressions convincing?”).

Composing of and composing in magical worlds

With the growing complexity of musical instruments and compositional tools, on the one hand the musical possibilities are expanding, but on the other hand so are the number and power of preliminary decisions that are being made in their conception and delivered with these tools. Even the contemporary software for music production that is not yet advertised with AI buzzwords has reached a level of complexity that is hard to master. To keep these programs operable, the view is directed, and users are guided along designed interfaces. Virtual theater props and stage elements from the metaphorical repository of “music” are digitally shifted to keep the view on the intended surface.

Every interaction with software always implies interactions with the engineers who dynamically provide an interface, a playground, and whose assumptions and preconceptions about compositional work characterize the possibilities of the tools in a fundamental way. Within the explicit and implicit templates, the customer can certainly be creative. They have it nice and easy, as long as they perceive music the same way the “industry” does. This is not just the case with software-based implements, but is part of the idea of tool and instrument manufacturing. The piano, Western music theory turned into an apparatus, is characterized by preconceptions about what music should be and solidifies these notions for further generations. Compositions for piano are qua definitionem almost always in equal temperament and forgo vibrato and glissandi. Software-based instruments lack the physical restrictions of the piano (and of all other traditional instruments), but, if commercially successful and widespread, shape the conceptual spaces of the artists who grow up and work with them at least as much. We are using and trusting machines every day, without an understanding of their functionalities. (Semi-)automatic composition programs are about to introduce this return to magical worlds also into artists’ ateliers and studios. One hopes, swipes, iterates and chooses from suggestions that have appeared in inexplicable ways. We interact with “superordinate” or “basal” secret rules, which we hardly dare to claim influence over. We act supported by and actually in the service of algorithms and rules of the companies who provided us with these tools. Following some of these orders – neither made, nor understood, but invariably already preauthorized by us – seems like a quasi-mystical practice in a data-capitalistic context.

The poetic act in the mass of data

The sheer masses of recordings, those traces of human music making, require automated ploughing through the digital shadows of a sonic chaos, which often was originally created as an expression of the purest of hearts. The poetic act and the musical moment may easily appear in this context, as if they were mere instances of formal classes in the sea of data references, as if the concrete act were deemed secondary to the abstract order. The artwork finds itself at the position of a dataset, classified on the basis of its role in societal practices.

Art, however, happens open-ended, and those pursuing it won’t know beforehand what it is that they are doing. The concrete sound, the concrete act in the totality of its relations is not describable in formulae, nor is it repeatable. This remains valid even though the potential of every sound nowadays to be measured, classified and repeated ad libitum lays the ground for the possibility of informing storage media automatically in a similar fashion.

Unberechenbarkeit + Sinnfreiheit

Art is the place where the implicit aspects of every-day life can be parenthesized consciously, cropping and thematizing them. The artistic act as a jump out of self-evidence remains beyond formalization and is – carried by individual anarchism – at the same time an act against the stereotyping and the vanishing of the concrete, the individual in accepted and implicitly supported ontologies and hierarchies. It can open spaces of possibilities outside established practices, which are nowadays often already molded in rules executable and controllable by machines. ‘Sinnfreiheit’ is a wordplay based on the homonymy of the words for meaning and perception (Sinn) in German as well as the strange relationship between freedom to create meaning and the freedom from (i.e. lack of) defined meaning.

This liberation from the imperatives of personal and societal habits and norms has always come at the price of confronting fundamental absurdities, where the included freedom to create (new) sense on one’s own may appear as a cold comfort at first. Every allowance of doubt is at the same time an attack on authorities. Unberechenbarkeit, which translates literally to ‘incomputability’, denotes not only unpredictability, but also waywardness, erratic and rogue behavior. In a world of total capture, everything remaining or wanting to be unforeseeable, hence incalculable, becomes problematic (“does not compute…”).

Regarding the possibilities and demands of AI-aided (self-)optimization, an aesthetic and artistic self-positing that does not merely consist of choosing one of 10 different surfaces provided for the individualization of the big One, becomes more important than ever. Art and music are not only part of the good life, but a method to formulate and answer our questions about the good life, time and time again. Questions about how and in which company we’d like to make music are eminently political.

Contemporary musical reality is in a fundamental way characterized by technology. The localization of most AI research in, and its synchronization with, the game rules of big capitalism may complicate shared points of access between AI and compositional avantgardes, and it is not easy to imagine big data, the basic nutrient of every AI, independently from big business, yet a mutual discourse is both important and promising.

That data and correlations regarding music are now available to a hitherto unimaginable extent opens up possibilities for completely new, previously unheard musical practices and means in no way that artists from now on will see the zenith of all artistic realization in the apex of the Gaussian bell curve of standard distribution. Nor is it an obligation to abuse the data and tools of machine learning for the cynical “nudging” of the customers towards a certain consumption or voting behavior. A data-rich approach to musical composition instead offers possibilities for a well-founded critique of existing (artistic) practice with all its implicit preconceptions. People surround themselves with and live in technology, construct realities with constantly renewing strata of synthesized sensory stimuli. AI in the hands of artists opens up a rich field of possibilities for playful questioning and redistribution of authority in the music making community between individual, abstraction and algorithm. Here, AI appears as a protean tool for production and conceptual background for the reflection of novel artistic activity. The discourse of composers with AI cannot exhaust itself in proposals for further optimization of the optimization strategies. It has to deal with the conceptions of music and society underlying the technologies under development.

AI-based and AI-informed compositional techniques and tools therefore won’t have their big appearance as ersatz-artist machines, as uber-composers, interpreters and recipients, but will be introduced by artists, companies and official agencies – in currently unknown ways, on various levels, in wildly diverse roles and at very different places – into human music-making communities, thereby expanding and questioning them in many ways. What should the employment of “fully-autonomous” ersatz music machines in production and reception be good for anyway? There is nothing there that could be meaningfully outsourced to automata; because music making (which always includes listening to music) constitutes potentially, if not a sufficient justification, then at least a good excuse for human existence.

[1] Asholt and Fähnders, Manifeste und Proklamationen der europäischen Avantgarde.

[2] Tzara, Manifest Dada 1918.

[3] Stilinović, In Praise of Laziness.

[4] Artmann, Acht-Punkte-Proklamation des poetischen Actes.

[5] In: Oliveros, Software for the People, Collected Writings 1963-80.

[6] Kepler, Johannes. Harmonices Mundi Libri V. Linz: Johann Planck, 1619.

[7] I suppose that in these optimization loops the suppository or the cavity dowel will emerge as ideal musical forms.

Volkmar Klien spent his childhood and adolescence in Vienna, fascinated by musical life in that city, its glorious traditions and antiquated rituals. Today, inspired by this background, he seeks to expand the possibilities of composing, playing and hearing music far beyond classic concert situations. His interest in the multilayered connections between the various modes of human perception and the roles that they assume in the communal creation of reality has led him to the most multifarious realms of audible (and sometimes inaudible) art. Mariendom was the setting of his “Relative Realities” installation during the 2016 Ars Electronica Festival. Volkmar Klien is professor of composition at Anton Bruckner Private University in Linz.
