Imagining the Non-Present: Thought Experiments on Rich Temporality in Sound [Peer-Review Version]

This article is adapted from a virtual presentation at the Imagining the Non-Present conference in September 2021.

When I read the conference brief, I was particularly struck by the sentence, “We seek to understand music’s relationship to that mental capacity which allows one to see, dream or fantasize about events not directly accessible in one’s immediate or close surroundings, either spatially or temporally.” 1 That resonated with me for a wide variety of reasons, one of the chief ones being a particular project of mine. I’m going to start by mentioning this project briefly and then I’m not going to refer to it again. I know that’s slightly unorthodox, but please bear with me. The project, which I’ve been working on since 2015, is called Aisteach: Historical Documents of the Irish Avant-Garde. I’m Irish, and if you go to aisteach.org, you’ll find a website that purports to be that of the avant-garde archive of Ireland, the Aisteach Foundation, which is the custodian of the history of Irish avant-garde art and music production.

It seems to be a completely legitimate website. There are numerous different articles and resources. For example, straight away you see an entry about Pádraig Mac Giolla Mhuire, who invented drone composition way before La Monte Young got around to it. You can read about Sister Anselme O’Ceallaigh, who was a nun who lived in an enclosed convent in Galway and liked to do these long-form organ improvisations. Or you can read about the Kilbride & Malone Duo, which is considered one of the progenitors of noise improv. When you click on “About Aisteach,” you’re taken to a page that talks about how the Aisteach Foundation was founded in 1974 by a composer and poet in Ireland. And then there’s a little disclaimer that reveals to you that the entire website is completely fictional—that this is a fictional idea of what Irish avant-garde music and art might have been like if the country hadn’t been colonised, if the country hadn’t been extremely poor, and if the country hadn’t been in the repressive grip of the Catholic Church. None of these figures ever existed. They were dreamt up by me and a team of collaborators. And the project is ongoing—there are still new people being added to the roster. Aisteach is a thought experiment about what might have been.

But ultimately, it’s also a thought experiment about how we want the future to be. We wrote the type of history that we would want to have as our future. So, for example, it has a lot of female-identifying artists. It has a lot of queer people doing drag nights with Kurt Schwitters’s poetry back in the day in Dublin. This seems like the most obvious project for me to talk about in this sort of presentation, because it is very much about imagining the non-present. That’s literally what we did. But as I reflected on the topic, I thought, no, I want to talk about projects that I’m doing right now, because those are concerned with trying to imagine what the present even is—how to hold it in your hand, how to grasp it. Now that I’ve briefly mentioned Aisteach, you can go and check it out, and in a parallel universe a version of you has just experienced a presentation where all I did was talk about Aisteach. But in this version of the multiverse—or metaverse (!)—I’m going to move on and talk about something different.

The thing I’d like to talk about is that right now, more than at any point in history, we’re not even sure what present we’re involved in. I’m speaking over Zoom, and I don’t know what the latency is, where you’re located, you know? So I’m not sure whether you’re hearing my words as I’m speaking them or however many milliseconds later, or whether I’m frozen and you’re just waiting politely for the network to release a packet of data so that you can hear what I’m saying. On top of that, I would guess that quite a few of us are probably already “friends” on Facebook or some other social media platform; even if we’ve never met or don’t even know we’re “friends,” we’re probably connected in some way. Most likely, though, the version of that social media platform that we open up is completely bespoke to each of us. And even if we were all friends and we were all to open up the same social media platform, what we would see would be completely different.

This is really important to me, because we say that we inhabit a common present, but in fact we’re all in miniature filter bubbles. We talk about large filter bubbles of left versus right, but in fact we’re all in micro filter bubbles that overlap in different ways depending on the communities that we’re participants in. And that’s crucial to the way we think about making music now. The way I think about music is that it’s a way for me to try to be present—literally to try and be here in the present, to try and pay attention to it. It’s the place where I’m the least lazy and least inattentive to the present, if that makes sense. And that’s central to what I do. One theorist that I often turn to when I’m thinking about how to articulate the position I’m coming from is Raymond Williams, the Welsh critic and writer. Williams talked about a concept he called structures of feeling. He described how “we have indeed to find other terms for the undeniable experience of the present: not only the temporal present, the realization of this and this instant, but the specificity of present being, the inalienably physical, within which we may indeed discern and acknowledge institutions, formations, positions, but not always as fixed products”; it is for this experience that he proposes the term “structures of feeling.” 2

My reading of this is that he’s trying to talk about what it feels like to be alive right now, in roughly the time frame that we’re in. What are the things that we can’t even articulate yet, which are part of the experience of being alive? We might not even be able to articulate them linguistically. They just come to us as these feelings, this sense of what life is like right now. And if I think about what life is like right now, if I try to think of the structures of feeling of the current moment, I come back to surveillance capitalism. That’s the all-pervading sort of mushroom cloud that we’re all existing in and through right now. Shoshana Zuboff wrote an amazing book about it, The Age of Surveillance Capitalism. She defines surveillance capitalism as “1. A new economic order that claims human experience is free raw material for hidden commercial practices of extraction, prediction and sales. 2. A parasitic economic logic in which the production of goods and services is subordinated to a new global architecture of behavioural modification. 3. A rogue mutation of capitalism marked by concentrations of wealth, knowledge and power unprecedented in human history. 4. The foundational framework of a surveillance economy.” 3

So we’re all here on Zoom today. There’s a very clear data trail that a third party could easily trace to see how we are all converged on Zoom at this moment. Right now, Zoom is extracting vast amounts of data from everything that’s happening in this call: who switched their microphone on, who’s touching up their appearance, who’s using a blurred background, who’s contributing to the chat. Zoom will have all that data afterwards. And what technology is powering all the advanced Zoom features? AI, or more specifically, machine learning. For the last few years, this has been something that I’ve been trying to think through, because I feel that this is the environment that we’re living in right now. We’re living under surveillance capitalism, but it is powered, it is facilitated, and it functions because of machine learning—in every single thing that we do. And I want to make it very clear that it doesn’t matter whether any of us ever writes a line of code: we are participating in machine learning because it’s running on all our devices all the time, and we’re contributing data points to train these algorithms on. So it’s something that we need to think about in the same way that we think about oxygen or plumbing. It’s just part of the environment that we live in right now.

Therefore, when I think about the present, I’m trying to think about how the present is defined through machine learning; how the different presents are manifested to different people through machine learning.

Later in this presentation I’ll talk about three different projects I’ve done using machine learning, but before I get into those, I want to show you a patent application that Spotify made in 2018 (figure 1). I pay very close attention to patents that are lodged by Spotify, Google, Facebook, and companies like that, because it gives me a sense of the structures of feeling. Such companies lodge an awful lot of patents, some of which are never even realised. If they think they might be able to do something, they lodge a patent. So patents give you an idea of where that cutting-edge technology is, or where the companies think it might be.


Figure 1. Patent for “Identification of taste attributes from an audio signal.” Source: United States Patent and Trademark Office, www.uspto.gov.

The title of the patent Spotify applied for is “Identification of taste attributes from an audio signal.” 4  According to the abstract, the patent describes how “a system, method and computer product are provided for processing audio signals. An audio signal of a voice and background noise is input, and speech recognition is performed to retrieve speech content of the voice.” So far, okay, this just sounds like Alexa or Siri. The patent continues, “there is retrieval of content metadata corresponding to the speech content”—here’s where it gets juicy—“and environmental metadata corresponding to the background noise. There is a determination of preferences for media content corresponding to the content metadata and the environmental metadata, and an output is provided corresponding to the preferences.” 5


Figure 2. Diagram from patent for “Identification of taste attributes from an audio signal.” Source: United States Patent and Trademark Office, www.uspto.gov.

What’s particularly enlightening for me is one of the diagrams included in the patent (figure 2). At the top of the diagram you can see “audio input,” which is you opening up Spotify and saying, “Hey, play me a party song,” or “Play me some smooth jazz,” or whatever your preference is. In this way, your request goes into the system. Over on the left, you can see that Spotify retrieves the “content.” It retrieves your speech. It wants to know what you are saying, what your exact words are. It normalises the content by removing duplicated words, removing filler words, parsing and formatting the input. So if you say “Ah, erm, ah, oh, Spotify, I don’t know, I suppose play me some country and western,” Spotify is able to sort that out into “They want to listen to country and western.”
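
As a rough illustration (not Spotify’s actual code, just a sketch with a made-up filler-word list and function name), that normalisation step might look something like this in Python:

```python
import re

# Sketch of the "normalisation" step described in the patent: strip filler
# words and immediately repeated words, then tidy the formatting.
# The filler list and function are illustrative only.
FILLERS = {"ah", "erm", "oh", "um", "uh", "eh"}

def normalise(utterance: str) -> str:
    words = re.findall(r"[a-z']+", utterance.lower())
    cleaned = []
    for word in words:
        if word in FILLERS:
            continue                        # drop filler words
        if cleaned and cleaned[-1] == word:
            continue                        # drop immediately duplicated words
        cleaned.append(word)
    return " ".join(cleaned)

print(normalise("Ah, erm, ah, oh, Spotify, I don't know, I suppose play me some country and western"))
# -> "spotify i don't know i suppose play me some country and western"
```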

But Spotify then does an awful lot of analysis using machine learning on that request. You can see in the central column that it retrieves the content metadata. It figures out your emotional state. It figures out your gender and age. It listens to your accent. And from that it is able to infer where you’re from. It is able to infer class. It is able to infer a huge amount of information just from the grain of your voice.

It also retrieves environmental metadata. Are you outside, or are you on a bus when you make this request? Is it 2 a.m. on the Tube, and are you evidently going home to party and that’s why you want to listen to Dua Lipa? Or are you alone in a park and it’s dawn and that’s why you want to listen to some Bach? Are you in a social environment? Can Spotify hear other voices or does it think you’re alone? Does it think you’re in a small group or does it think you’re at a party? It takes all that information and then it combines that with every piece of music you’ve ever listened to on Spotify, and then it determines what it should serve you, what track it should recommend. So even if you never write a line of code …
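
Even if you never write a line of code, it can still help to see the shape of the pipeline the diagram describes. The following is a speculative Python sketch: every function, field and value in it is hypothetical, stubbed out so that the example runs, and none of it is Spotify’s actual implementation.

```python
from dataclasses import dataclass

@dataclass
class ContentMetadata:
    emotion: str       # inferred from the grain of the voice
    age_bracket: str
    accent: str        # from which region, even class, might be inferred

@dataclass
class EnvironmentalMetadata:
    setting: str       # "night bus", "park at dawn", "home" ...
    other_voices: int  # 0 = alone; more = small group or party

# Stubbed "inference" steps so the sketch runs; a real system would use
# trained models here.
def speech_to_text(audio) -> str:
    return "play me a party song"

def infer_content_metadata(audio) -> ContentMetadata:
    return ContentMetadata(emotion="excited", age_bracket="18-25", accent="London")

def infer_environmental_metadata(audio) -> EnvironmentalMetadata:
    return EnvironmentalMetadata(setting="night bus", other_voices=3)

def recommend(audio, listening_history: list[str]) -> str:
    request = speech_to_text(audio)
    content = infer_content_metadata(audio)
    environment = infer_environmental_metadata(audio)
    # Combine the request, the voice-derived metadata, the environment and the
    # user's entire listening history, then decide what to serve.
    if environment.other_voices > 1 or "party" in request:
        return "Dua Lipa"   # evidently going home to party
    if environment.setting == "park at dawn" and environment.other_voices == 0:
        return "Bach"       # alone in a park at dawn
    return listening_history[-1]

print(recommend(audio=None, listening_history=["Bach", "Dua Lipa"]))  # -> "Dua Lipa"
```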

All my students use streaming services to listen to music. I ask them all the time, how do you listen to music? What platforms? Do you even buy music? Do you buy CDs? Most of my young students just listen to music on streaming platforms, which means that every single thing they do with music is training these machine learning algorithms and contributing to all this data. And due to the way that they listen to music, all the music they will be served is completely informed by machine learning. This is why I think it’s really important to talk about this.


Figure 3. Two classes of AI.

When we talk about AI, we can break the technology down broadly into two classes—symbolic/GOFAI and machine learning (see figure 3). I’m using the term machine learning, and I know some of you may be very familiar with this, but I’m going to give a brief overview for those who might be new to it. In the news, the term you are most likely to hear used is AI, but machine learning is actually a subset of the broader field of artificial intelligence. Therefore, I’m going very briefly to talk about symbolic AI—what we call good old-fashioned AI (GOFAI)—versus machine learning. Mostly when I talk to my students about AI and ask them to explain what they think AI is, they describe machine learning without understanding that it’s a subset of AI. Even specialists often use the terms interchangeably, although among themselves they more commonly talk about ML, machine learning.

The older type of artificial intelligence is called symbolic because it emanated from the idea that human reasoning could be represented in symbols—you could literally map it out, symbol by symbol. And then you could code a program where, with every single line, you tried to think of every single permutation of something that could happen. So there was no learning happening. The machine was simply given a program that a human had coded, and the machine followed that program—it always followed the exact steps that it was given.
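
A toy example of what that looks like in code (my own illustration, not any historical system): every branch below was written in advance by a human, and the program can never respond outside those branches.

```python
# A minimal illustration of the symbolic/GOFAI approach: the programmer tries
# to anticipate every case in advance, and the machine only ever follows those
# hand-written rules. Nothing is learned.
def pick_genre(request: str) -> str:
    request = request.lower()
    if "party" in request or "dance" in request:
        return "pop"
    if "relax" in request or "calm" in request:
        return "ambient"
    if "study" in request:
        return "baroque"
    return "ask again"   # anything unanticipated falls through

print(pick_genre("play me something to relax to"))  # -> "ambient"
```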

An example of that would be the chatbot ELIZA, which Joseph Weizenbaum wrote in 1964–66. You can still find iterations of ELIZA online. ELIZA was supposed to be a therapist that you told your problems to. But because Weizenbaum wrote every single line of code, there was never a response that you couldn’t predict. It always just followed the code. Whereas now if we talk to Alexa or Siri, we find that such chatbots do not function in the same way. Apple has not written every single response for Siri—sure, some of the responses have been scripted, but a lot of the time, Siri is learning from interactions with you, and giving new, unscripted responses as a result of those interactions. The difference is that the system is trained, and is continually being retrained, on a corpus of data, or a data set. It needs huge amounts of data to do the training, and we often don’t know why and how it makes the decisions it makes, because in a neural network structure, the input is fed in and there are layers of neurons passing on information, just like there are in our eyes. For example, there are layers of neurons in my eyes that take the photons that come in, process that information, and then tell me I’m looking at a picture of people on Zoom. Neural networks do the same thing, processing input through layers of code. But in the same way that I might see a black blob on my road at night and think it’s a badger only to find out it’s a plastic bag, even neural networks give unpredictable results.
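
By contrast, here is a minimal sketch of the layered idea behind a neural network. The weights below are random stand-ins; in a trained system they would have been learned from data rather than written by a programmer, which is exactly why the results can surprise us.

```python
import numpy as np

rng = np.random.default_rng(0)
# Random stand-ins for weights that a real network would have learned from data.
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)   # layer 1: 4 inputs  -> 8 neurons
W2, b2 = rng.normal(size=(8, 2)), np.zeros(2)   # layer 2: 8 neurons -> 2 outputs

def forward(x: np.ndarray) -> np.ndarray:
    hidden = np.maximum(0, x @ W1 + b1)           # a layer of neurons passing information on
    logits = hidden @ W2 + b2
    return np.exp(logits) / np.exp(logits).sum()  # probabilities over the two classes

# e.g. probabilities for "badger" vs "plastic bag" given some input features
print(forward(np.array([0.2, 0.9, 0.1, 0.4])))
```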


Figure 4. My classes of AI.

When I deal with AI in my work, I can break down my conceptualisation of it into two classes (see figure 4); because I’m a composer rather than a coder, the distinction I make is between speculative and real AI. Some of the projects I do function like science-fiction short stories to think through aspects of AI and tell stories about it, and other projects use machine-learning-generated material in the project itself. IS IT COOL TO TRY HARD NOW? and THE SITE OF AN INVESTIGATION are examples of the more speculative projects that deal with thinking through AI, but don’t use machine learning in the generation of the pieces themselves. However, what I’m going to focus on now are three of the “real” projects that I’ve done: ULTRACHUNK, A Late Anthology of Early Music, and Text Score Dataset 1.0. All these projects very much do engage with machine learning technologies.

The first project I’ll talk about is ULTRACHUNK from 2018, which was a collaboration with the Turkish artist and technologist Memo Akten. Memo did a PhD in machine learning at Goldsmiths and teaches at University of California San Diego now. Memo and I were both artists in residence at Somerset House Studios in London. They paired us up because, in conversations, we both said we were very interested in machine learning, which led to ULTRACHUNK.

We made it as follows: I spent a year recording videos of myself improvising. It was a very weird year, which was sort of training for the pandemic, in that every day I had to sit with my laptop and improvise on camera to make training data for Memo. Memo then took that training data and made an architecture of six different neural networks called GRANNMA—because all coders love a cheesy acronym for their systems—Granular Neural Net for Music and Audio. And he trained these neural nets on all the material that I created. So in performance we are now able to use these neural nets to generate new material live, as I’m performing.

Video example 1. Excerpt from a performance of ULTRACHUNK, a collaboration between Jennifer Walshe and Memo Akten, at Somerset House Studios, London, 2018.

There are a few things I should say about the clip from ULTRACHUNK in video example 1. One is that I’m there, performing live—my vocals are being fed into the system. Therefore, Memo has to be at the side of the stage with an incredibly fast computer. This will not work on a Mac laptop—you have to have sort of a Bitcoin-mining-rig-type thing by the side of the stage, with ultra-powerful graphics cards, because the system is generating video and audio live. It’s not sampling it; it’s not processing it. It’s literally drawing in real time approximately twenty frames per second and also about forty-four thousand audio samples per second. But the only thing that ULTRACHUNK has ever seen—that’s the best analogy I can use—is video of me. It doesn’t know any other way to express itself. The only way that it can make outputs is from what it’s seen, which is the dataset that it was trained on, which was videos of me. When you look at it live in performance, the bar at the bottom of the screen is a representation of how closely ULTRACHUNK is listening, depicted as colours representing temperature, from green to red. So ULTRACHUNK can listen really, really closely (represented by red in the bar), and then much less closely (represented by green), exactly like a human improviser does. Along the top is a barcode, because for every single grain that ULTRACHUNK navigates—grain of sound or grain of video—it decides which path it will take through a hundred-dimensional hypersphere. The temperature bar, then, is a representation of how it’s dealing with that hypersphere. And all this is happening every single split second.
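
For readers who want to relate this to more conventional machine learning vocabulary, the sketch below illustrates temperature-controlled sampling in general terms. It is not Memo Akten’s GRANNMA code, just the generic idea: at low temperature a model clings to its most probable next grain, and at high temperature it ranges much more freely.

```python
import numpy as np

# Generic illustration of "temperature" in generative sampling; not GRANNMA's
# actual code. Lower temperature = tighter, more predictable choices;
# higher temperature = looser, more surprising ones.
def sample(logits: np.ndarray, temperature: float, rng=np.random.default_rng()) -> int:
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())   # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))

logits = np.array([2.0, 1.0, 0.2, -1.0])        # scores for four candidate grains
print([sample(logits, 0.1) for _ in range(5)])  # low temperature: almost always grain 0
print([sample(logits, 2.0) for _ in range(5)])  # high temperature: much more varied
```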

It’s sort of terrifying when I perform with ULTRACHUNK live, because first of all, you’re worried that the computer is going to melt or explode in the corner because you’re running it at the absolute limit of what it can do in terms of computation. And second, I just don’t know what’s going to happen. I don’t know how ULTRACHUNK is going to react. But to be honest, I’ve had that feeling many times playing with humans on stage. So it’s not something that’s outside the realm of my experience. The context that I put it into is that it’s a type of interspecies communication, and I genuinely feel that way.


Figure 5. Interspecies communication.

Figure 5 shows an image of the robotic Aibo dog that Sony produced. I went to an exhibition about AI at the Barbican in London, where one was on display. And naturally you start playing with the robot. You pet it like you would pet a real dog. You interact with it in that way. You still have the same instinct as a human to treat it as if it were a dog, and scratch its back and scratch its ears.

For me, in trying to think through some of these things, two of the most useful theorists have been Donna Haraway—naturally—who discusses technology in “A Cyborg Manifesto” 6 and, in The Companion Species Manifesto, 7 what it means to live and work alongside another type of being and to communicate across species; and Kate Darling at MIT, who in her recent book The New Breed 8 talks about different forms of ML that were modelled after animals.

The other point of reference, of course, that I’m thinking about is the uncanny valley, defined by Masahiro Mori in 1970: the idea that we feel greater affinity the closer something gets to looking like a human, until it becomes uncannily similar, at which point we experience revulsion.


Figure 6. The uncanny valley.

You can see at the bottom left of the graph in figure 6 that if you have an industrial robot, it doesn’t look anything like a human, and therefore we don’t feel a great affinity with it. It’s something that seems quite bizarre to us. So the closer the thing gets to looking exactly like a human, the more affinity we feel. But there is a valley where things get really, really bad just before the thing looks exactly like a human being. Take a toy robot, for example: I can cuddle the Aibo; it really looks like a little dog, even though it’s a robot. But if the Aibo were encased in really terrible taxidermy, it would probably fall into that uncanny valley, and I would feel quite unnerved. I wouldn’t understand whether it was a dog or a robot or something else.

For a long time, if you Googled “uncanny valley” you would get a picture of Hiroshi Ishiguro, who’s one of my favourite all-time inspirations. He had a robot made that looked exactly like him, which is called the Geminoid—he’s pictured with the robot on the right in figure 6. Hiroshi—and I’m sure anybody who works at a university can relate to this—used to send the robot to meetings instead of him and have his grad students operate it with joysticks behind the scenes, which is surely genius. I think Hiroshi is actually one of the greatest performance artists of the twentieth and twenty-first centuries, though we would not call him a performance artist because he’s a roboticist and he’s in a robotics department. But, as a human, he is actually functioning artistically, performatively, by building a robot that looks like him and seeing how people react to it. And in doing so he’s done a sort of reverse Dorian Gray, where, as he gets older, I see him trying to stay looking like the robot that he’s built himself.

All these things are in the project ULTRACHUNK, and through it I was able to grapple with them directly—to feel those structures of feeling of the present that Williams talks about.

Okay, so ULTRACHUNK, which came out in 2018, is one approach I’ve taken to working with machine learning. I was really happy that I could collaborate with Memo on it, because I needed to be able to collaborate in that way to do that project. It threw up many very interesting questions about identity, about planning in improvisation, about listening, how people respond, how close you are to your collaborators, and how far away you are from them in different moments, and moment-to-moment in an improvisation. But the project I’d like to talk about next, A Late Anthology of Early Music Vol. 1: Ancient to Renaissance, 9 comes at it from a completely different perspective. There is a fantastic duo called Dadabots—CJ Carr and Zack Zukowski—that has been doing machine learning projects for a while. Back in 2018, Carr and Zukowski were launching different projects they had done—for example, they tried to make a new album of Beatles songs by training networks on the Beatles. I wrote to them and said, hey, your work seems really interesting. And they wrote back and said, oh, let’s do a collaboration. I had a lot of vocal a cappella recordings that were separate from the ULTRACHUNK project—just voice, not video—from a different project that I’d done. I sent them those recordings and they sent me back 841 files: outputs from the different generations of training their network on those recordings.

For a while I wasn’t sure exactly what I was going to do with these recordings because they gave me everything from the entire training session. They gave me the terrible initial outputs all the way through to the successful ones, which by the end of the process sounded similar to my own voice. And that was very interesting to me because normally when we hear about a machine learning project, all we hear is the end result. We see the endpoint of the forty or one thousand generations of training that got it to really work. We don’t normally hear all the steps taken by the network learning how to listen and how to approximate something else.

In 2018 I was listening to these 841 files a lot. I was testing them; often if I was doing a gig, like a free-improv show, I would use some of them and just sort of slip them in, to see how the audience reacted. But I knew I didn’t want just to play with them. I wanted to have a clear concept that went with them. Then one day I was looking at some textbooks that I used for teaching when I did my DMA at Northwestern University in Illinois. Teaching the undergrads the history of Western music was the classic duty of the doctoral student, and these were the anthologies that we had to teach from.


Figure 7. The Norton Anthologies.

We had to cover the whole of the anthology on the left in figure 7—the Norton Anthology of Western Music, Volume 1: Medieval, Renaissance, Baroque  10—in just ten weeks, going from the “beginning” of time, from ancient Greece, through to the end of the Renaissance period. It was an incredibly fast whistle-stop tour; the students found it unbelievably difficult because it was the music that they were least familiar with. Even if they’d grown up listening to a lot of classical music, music from the mediaeval and Renaissance periods is not used as commonly even in advertising or film soundtracks as music from the Classical and Romantic periods is. So when I was teaching, we were given a very clear party line. The line was, ignore ancient Greece because it’s too complicated. Instead we’ll start with plainchant, because plainchant evolves. This plan has a beautiful logic to it: we take plainchant, and we turn parts of it into organum, and then we take parts from the organum and we turn them into motets, and this produces a nice clear narrative that the students can get their heads around.

For me, that experience of trying to find a logical through-line mimicked the way I listened to all the 841 output files that Carr and Zukowski sent me, which were generated using their version of SampleRNN, a neural network architecture quite commonly used by musicians. These 841 files almost followed the same through-line that my music history textbooks did. We moved from something very simple that was monophonic and used long notes, to something very dense and ornamented. And so I decided to use that to create a new history of early music. I wanted to completely rewrite the history of early Western music by mapping key pieces from the canon onto my 841 files in chronological order, using these early canonical pieces as a sort of filter to listen to the evolution of the machine learning, and using machine learning as a way to listen to the evolution of these pieces.
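
As a sketch of that mapping (with illustrative dates and filenames rather than the actual files), the alignment of canonical pieces to training generations could be expressed like this:

```python
# Hedged sketch of the mapping idea: order the canonical pieces by date, order
# the 841 network outputs by training generation, and pair early with early,
# late with late. Filenames and dates are illustrative only.
canon = [
    ("Puer natus est nobis (plainchant)", 900),
    ("Palestrina, Missa Papae Marcelli: Agnus Dei I", 1562),
    ("Dowland, Flow, my tears", 1600),
]
outputs = [f"generation_{i:03d}.wav" for i in range(841)]  # earliest to latest

def align(canon, outputs):
    """Pair each piece with the output whose relative position in training
    matches the piece's relative position in the historical timeline."""
    first, last = canon[0][1], canon[-1][1]
    pairs = []
    for title, year in canon:
        position = (year - first) / (last - first)   # 0.0 .. 1.0
        index = round(position * (len(outputs) - 1))
        pairs.append((title, outputs[index]))
    return pairs

for title, filename in align(canon, outputs):
    print(f"{title} -> {filename}")
```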

I’m going to skip my version of the Epitaph of Seikilos, simply because I want to show you how the more Central European Western tradition evolves. What I also liked was that the number of training epochs mirrored the number of human generations that took place in this historical period. So the very early pieces on the album all use very early training files, and then the late pieces use very late ones. As an Irish person who has spent a very long time looking at anthologies of music, which very rarely contain pieces by Irish people—and, for example, as mentioned above, having had to teach for a great deal of my early career from anthologies that never included pieces by women or people of colour—to be able to go back and sort of rewrite an imaginative parallel world history felt very satisfying. For me, it’s not simply a joke or making fun of the original anthologies. It’s quite a serious project to rethink what an anthology is, what a canon is. Let’s listen to a couple of short examples from some of the different pieces. The first piece comes from very early in the album, at a point where the network was singing a lot of long notes. All the pieces that I picked for A Late Anthology of Early Music were pieces that are commonly included in the Norton or Stolba anthologies or any of the classic textbooks that music history students use.

Audio example 1. “Anonymous: Gregorian Chant, Mass for Christmas Day, Introit: Puer natus est nobis,” from A Late Anthology of Early Music Vol. 1: Ancient to Renaissance, by Jennifer Walshe (2020).

So that’s my new rethinking of what Gregorian chant sounds like through this filter. If we skip on to a much later piece from the anthology, John Dowland’s “Flow, my tears,” you will hear a new version of the song, again through the filter, and with much more complex files involved.

Audio example 2. “John Dowland: Flow, my tears, Air” from A Late Anthology of Early Music Vol. 1: Ancient to Renaissance, by Jennifer Walshe (2020).

The last example I’ll play is from Palestrina. The result to me was quite bizarre because I had spent a lot of time listening to the material, and when you’re working through all these different files, you can hear the network literally learning to listen. You can hear how that evolution happens, how the intelligence is being built in real time, because in the outputs you hear it sort of snag on things, like a child developing a really annoying habit that they finally grow out of. At one point I could hear these very high whistles in the files, which stayed for several hundred files, then disappeared for a very long time, and then popped back in again. But this is why artists are interested in machine learning—because you never know what’s going to happen. This was a very unexpected result because when I fed everything through, I got something that actually sounded quite emotional, that to me had a new idea, a different version of the sort of spirituality that I think the original Palestrina piece was reaching for.

Audio example 3. “Giovanni da Palestrina: Missa Papae Marcelli, Agnus Dei I” from A Late Anthology of Early Music Vol. 1: Ancient to Renaissance, by Jennifer Walshe (2020).

To return to what I was talking about at the beginning of this presentation, when I hear these sounds, to me they are sounds that are acoustically full of artefacts. In the same way, the video of ULTRACHUNK is full of artefacts—you can see, for example, that at one point the system has given me lots and lots of extra teeth, which underscores the uncanny valley feeling.

So when I think of the time that we’re currently living in, here in spring 2021, the artefact-laden aesthetic of these works feels very now. I think that as the networks get more sophisticated—and we’re contributing to many networks all the time—their outputs will get cleaner, and this artefact-laden quality will come to feel the way the photocopy does. It’s like people making photocopies of images: now you can just scan them and make pristine prints.

This is what I’m interested in, because the present is so fleeting. It’s different for different people in different places. And slightly different versions of all of it are being served to us by the networks in the same way that the hypersphere within GRANNMA is making decisions on where the network is going to go next.

The very last thing I’ll mention is that I recently launched a project at Darmstadt called Text Score Dataset 1.0, which is a purely text-based machine learning project. The project took three years to come to fruition because it involved making an insanely huge dataset of every text score that we could get our hands on. We began by collecting as many text scores as we possibly could, from Fluxus onwards, but the collection was only the beginning of the process. Every score had to be transcribed, formatted, converted into simple text; metadata tags needed to be added. Then we worked with different collaborators to help develop networks that could train on this dataset, in order to write new text scores that humans could perform. Text Score Dataset 1.0 is the first generation of those. A downloadable booklet contains an essay on the project and some of the generated text scores. It’s a project that’s best witnessed when you can spend some time with it, or, better still, perform one of the scores, and have the uncanny experience of performing something a neural network wrote! For me, that is what’s crucial about all my work with AI—how we come to terms with new technologies, new experiences, new ways of being alive. My interest has led me constantly to learn new things, and to develop relationships and collaborations with a wide variety of people working in machine learning. It’s taught me a huge amount, not just about music, or art, or intelligence, or consciousness, but about what’s at stake for us as humans as AI begins to change the way we live.
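
For readers curious about the nuts and bolts, the sketch below imagines what a single entry in such a dataset might look like once a score has been transcribed, converted to plain text and tagged with metadata. The field names and the example score are my own invention, not the project’s actual schema.

```python
import json
from dataclasses import dataclass, asdict, field

# Hypothetical structure for one entry in a text-score dataset; the fields,
# tags and example score are illustrative only, not Text Score Dataset 1.0's
# actual schema.
@dataclass
class TextScoreEntry:
    title: str
    composer: str
    year: int
    text: str                                        # the score itself, as plain text
    tags: list[str] = field(default_factory=list)    # metadata for filtering and training

entry = TextScoreEntry(
    title="Untitled Event Score",
    composer="Anonymous",
    year=1962,
    text="Walk into the room. Listen until the room has finished.",
    tags=["fluxus", "event score", "instruction", "listening"],
)

# Entries like this can be serialised, one per line, for training a text model.
print(json.dumps(asdict(entry)))
```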

Cite as

Walshe, Jennifer. “Other Presents.” In Imagining the Non-Present, edited by Carlo Diaz. SONUS Series. Ghent: Orpheus Instituut, 2025. https://sonus.orpheusinstituut.be/publication/publication/imagining-the-non-present/walshe-other-presents.

Footnotes

  • 1 “Call for Papers: Imagining the Non-Present,” Orpheus Instituut, Ghent.
  • 2 Raymond Williams, Marxism and Literature (Oxford: Oxford University Press, 1977), 128.
  • 3 Shoshana Zuboff, The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power (London: Profile Books, 2019), [ix].
  • 4 Stéphane Huland, Identification of taste attributes from an audio signal, US Patent 10,891,948 B2, filed 21 February 2018, and issued 12 January 2021.
  • 5 Huland, Identification of taste attributes.
  • 6 Donna J. Haraway, “A Cyborg Manifesto: Science, Technology, and Socialist-Feminism in the Late Twentieth Century,” chap. 8 in Simians, Cyborgs, and Women: The Reinvention of Nature (New York: Routledge, 1991).
  • 7 Donna J. Haraway, The Companion Species Manifesto: Dogs, People, and Significant Otherness (Chicago: Prickly Paradigm Press, 2003).
  • 8 Kate Darling, The New Breed: How to Think About Robots (London: Allen Lane, 2021).
  • 9 Jennifer Walshe, A Late Anthology of Early Music Vol. 1: Ancient to Renaissance, Jennifer Walshe, 2020, music streaming platforms.
  • 10 Claude V. Palisca, ed., Medieval, Renaissance, Baroque, vol. 1, Norton Anthology of Western Music, 2nd ed. (New York: W. W. Norton, 1988).

Colophon

Date
29 April 2025
Review status
Double-blind peer review