The Big Ars Electronica AI Glossary

KI Glossar_2000x1000_vogphoto, Neural Network Training, Ars Electronica Futurelab, photo:


AI – Artificial Intelligence

The goal of artificial intelligence (AI) is to recreate human perception and human decision-making processes using machines and computer processes. The AI we have today is very good at handling certain limited tasks. AI is used in many fields: for medical diagnostics, recognizing text and characters, picture recognition and generation, language recognition and generation, automated translations, navigation systems, autocorrection and suggested corrections for searches. It supports us in many where enormous amounts of data are too complex for the human brain to process.

Chinese AI News Anchor, Xinhua: In November 2018 China‘s state news agency, Xinhua, presented its first AI newsreader. Together with Sogou, a Chinese search engine, Xinhua developed these readers with the help of machine learning. It is meant to simulate the voice, facial expressions, and gestures of a human newsreader.

Machine Learning

Machine learning is an application of AI. Machine learning uses algorithms (computing processes based on a certain repetitive schema) to teach computer programs how to continue learning on their own and make decisions in unfamiliar situations. In other words, these programs can change themselves. Usually an algorithm is provided with training models: a data set with different situations along with the decisions that are to be considered correct.

Machine learning is already a big part of our everyday life: programs give us personalized recommendations for products, or translate texts automatically. Banks use machine learning to predict loan defaults, identify applications that have a risk of defaulting, and evaluate credit histories. The field of medical diagnostics is also increasingly relying on automatic data evaluation.

Learning to See: Gloomy Sunday, Memo Akten (TR): Learning to See ist eine fortlaufende Reihe von Arbeiten, die modernste Algorithmen des Machine Learning verwenden, um darüber zu reflektieren, wie wir die Welt verstehen. Was Menschen sehen ist eine Rekonstruktion, die auf unseren Erwartungen und früheren Überzeugungen basiert. Learning to See ist ein künstliches neuronales Netzwerk, das lose am menschlichen visuellen Kortex (der Hirnrinde) inspiriert ist, blickt durch Kameras und versucht zudem, das Gesehene zu verstehen. Natürlich kann es nur sehen, was es bereits weiß. Genau wie wir.

Subcategories of machine learning

Supervised Learning

In supervised learning, the algorithm (a computing process based on a certain repetitive schema) is provided with fully labelled training data. Labelled training data means that every set of training data comes with the corresponding answers. This means the algorithm already has clear instructions for interpreting the data. Afterwards, the algorithm tests different scenarios itself and changes its parameters until the program performs satisfactorily with the training data in the test.

Supervised learning uses a concept called backpropagation. Backpropagation is a learning algorithm that helps the network learn: errors in the decision-making process are fed back into the network. Then the weights are changed in order to achieve a better result. Backpropagation also analyzes which weight has what percentage of an error. For optimizing supervised learning, gradient descent is also used: a process for solving an optimization task based on the principle of movement in the direction of the steepest downward slope.

Unsupervised Learning

If not enough labelled training data are available, unsupervised learning is used. In this case, the algorithm is not provided with any labelled training data. Instead, the algorithm has to find its own way to organize and classify the data. The machine tries to recognize patterns in the input data that deviate from unstructured “noise.” Different algorithms are used here: clustering, which groups similar data points together; anomaly recognition, an algorithm that finds the divergent data in a data set; and association, which can recognize relationships and dependencies between data.

Semi-Supervised Learning

In cases where the identification of massive amounts of training data would be too expensive, a mix of supervised and unsupervised learning can be used. One current example of semi-supervised learning is that of generative adversarial networks (GAN). GAN is a special architecture of neural networks in which two networks counter one another and learn from each other.

Reinforcement Learning

Like human learning, reinforcement learning is based on the idea that rewards can encourage certain actions over others. Here, the actions are observed by an algorithm and categorized as good or bad. With time, the network learns to avoid the bad actions and reinforce the good ones. The longer the network trains, the better its strategies become for achieving the desired goal.

Lyrebird: Das kanadische Start-up Lyrebird hat eine Software entwickelt, die in wenigen Minuten lernt, beliebige Stimmen digital nachzuahmen. Dazu wurde ein neuronales Netzwerk mit Sprechproben Tausender Menschen trainiert, wodurch es gelernt hat, die charakteristischen Merkmale einer Stimme zu erkennen. Lyrebird erstellt für die Stimme jeder neuen Person einen individuellen Schlüssel, der es erlaubt, neue Sätze mit dem digitalen Stimmavatar zu bilden.

Deep Learning

Deep learning is a type of machine learning that uses artificial neural networks and works with several layers and complex amounts of data. The more layers there are, the better the neural network can develop itself.

So deep learning teaches machines to learn: the machine is put in the position of improving its own abilities independently by extracting and classifying patterns from available data and information. The insights it gains can be connected in turn to data in a different context. Ultimately, the machine becomes capable of making decisions based on these connections.

This type of algorithm arrangement is well-suited to processing large amounts of data, but it also requires high computing power and a significantly longer training period than traditional machine learning approaches.

Moral Machine, MIT Media Lab, Scalable Cooperation (US): Intelligente Maschinen werden zunehmend eingesetzt, um komplexe, vom Menschen durchgeführte Tätigkeiten zu unterstützen oder gar ganz zu übernehmen. Intelligenten Maschinen wie autonomen Fahrzeugen wird dabei ein hoher Grad an Selbstständigkeit zugestanden, der dazu führen kann, dass diese moralische Entscheidungen treffen müssen. Daher ist nicht nur ein besseres Verständnis dafür erforderlich, wie Menschen solche Entscheidungen treffen, sondern auch, wie Menschen den Entscheidungsprozess von intelligenten Maschinen beurteilen. Die Moral Machine versucht anhand von moralischen Dilemmas im Straßenverkehr zu untersuchen, wie Menschen zu Entscheidungen stehen, die von autonomen intelligenten Maschinen getroffen werden. Am Ende der Umfrage können die Ergebnisse mit denen der anderen TeilnehmerInnen verglichen werden.

Artificial Neural Network

Neural networks are a collection of individual information processing units (neurons), which are arranged in a layered network architecture. Artificial neural networks are a method of machine learning: computing systems made up of artificial neurons and arranged into levels. In the simplest case, information starts in the input layer and flows through one or several hidden intermediate layers to an output layer. The output of one neuron is also the input of the next one. This typical dataflow is also referred to as forward propagation: the input values that are entered run forwards through the network and create the output values.

Weights, or the intensity of information flow, are also important elements of a neural network. The result of the weight is often managed by a so-called activation function before it is forwarded on to the neurons in the next layer. During the training process the weights are adapted in such a way that the end result meets the specifications as precisely as possible. There is a wide variety of artificial neural networks that are used for different purposes.

Neural Network Training, Ars Electronica Futurelab, photo:

Neural Network Training, Ars Electronica Futurelab: Dieses Convolutional Neural Network (CNN) hat mit vielen Trainingsbeispielen gelernt, bestimmte Objekte zu erkennen. Das Netzwerk besteht aus mehreren aufeinanderfolgenden Schichten, die während der Trainingsphase gelernt haben, unterschiedliche Eigenschaften in einem Bild zu erkennen und diese Information in Folge an die nächste Schicht weiterzugeben. Während die ersten Schichten eher primitive Eigenschaften wie gerade Linien, Farben und Kurven erkennen, spezialisieren sich die nachfolgenden Schichten auf komplexere Formen. Das hier verwendete Netzwerk VGG16 ist eines der bekanntesten Modelle dieser Art.

Elements of an artificial neural network


Artificial neurons are the smallest parts of an artificial neural network. Each one consists of a mathematical function, a formula that computes one or several inputs into an output, making them so-called information processing units. In an artificial neural network, every layer contains several neurons that pass on information to each other.

Hidden Layer

The hidden layers of a neural network contain definitors: their function is to recognize smaller details and patterns as a whole. So a neuron in a hidden layer is activated when it discovers a pattern that has been programmed into it. These layers are called “hidden” because their in- and outputs are not displayed anywhere—they function only as mediators between the input and output layers.

Activation Function

The activation function in an artificial neural network decides for each neuron in the network whether information from the previous layer gets put through and if so, how much. It makes it possible for a neural network to learn the value limits of a decision function. There is a wide range of activation functions, including Sigmoid, ReLU (Rectified Linear Unit), TanH or ELU (Exponential Linear Unit).

Weights and Biases

As it travels between the layers, a piece of information is overlaid with weights and biases in order to determine whether the signal exceeds certain value limits and whether it is important enough to be passed on to the next layer. These weights and biases change during the learning process, while the network corrects its own mistakes.

NORAA – Machinic Doodles, Jessica In (UK/AU): Wie erkennen wir Objekte, wenn wir sie mit Strichen und Linien skizzieren? Nach welchen Regeln zeichnen wir in einer bestimmten Reihenfolge von einem Punkt zum anderen? Und kann man einer Maschine beibringen, das Zeichnen selbst zu erlernen, anstatt ihr explizite Anweisungen zu geben? Welche Erkenntnisse bringt das für den menschlichen Zeichenprozess? Machinic Doodles ist eine interaktive Spielinstallation, die die Zusammenarbeit zwischen einem Menschen und einem Roboter namens NORAA untersucht, einer künstlichen Intelligenz, die lernt, zu zeichnen. Dabei wird untersucht, wie wir Menschen Ideen mit Strichen in einer Zeichnung zum Ausdruck bringen und wie eine Maschine über ein künstliches neuronales Netzwerk das Zeichnen erlernen kann.

Forms of Networks

CNN – Convolutional Neural Network

Convolutional neural networks (CNN) are primarily used for processing images. Their name comes from the mathematical procedure of convolution. In CNN, the individual neurons are arranged in such a way that they compute overlapping areas. Mathematically, this process can be interpreted as discrete folding – hence the description “convolutional.”

A CNN consists of many layers of artificial neurons where not all layers of neurons are fully connected to each other. The first layer of neurons takes pixels from sample images. Then the next layers learn the concept of how the pixels form an edge. Finally this layer passes its knowledge on to other layers and combines this knowledge about edges to learn the concept of a surface. This process of layering knowledge continues until the final layer, where the algorithms of the neural network can recognize specific characteristics and finally specific faces. Convolutional Filters take on most of the work of simplifying the image information and reducing it to certain characteristics.

What a Ghost Dreams Of, h.o (INT): Was ist ein „Geist“? In der Regel wird er als innere „Seele“ und mysteriöse äußere Erscheinung verstanden. What a Ghost Dreams Of setzt sich mit einem neuen „Geist“ unserer Zeit auseinander: die digitale Überwachung in unserer Gesellschaft. Hier in der Main Gallery werden die BesucherInnen, wenn sie eintreten, von einem großen „Auge“ betrachtet. Alle, die vorbeigehen, werden per Computer Vision direkt in einen „Geist“ eingespeist, der neue digitale Gesichter von Menschen erzeugt, die es in der realen Welt nicht gibt. Was projizieren wir Menschen in das digitale Gegenüber, das wir mit KI erzeugen? Sie lernt unsere Welt ohne Vorkenntnisse kennen und generiert Daten, die es nie gab. Was sind die Auswirkungen der Verwendung von KI bei der Erstellung von Kunstwerken? Wer besitzt das Urheberrecht? Und wovon träumt die KI, der „Geist“, und was bedeutet das für uns als Menschen?

GAN – Generative Adversarial Network

The basic idea of a generative adversarial network (GAN) is to have two networks countering each other. The generative network’s job is to outsmart the discriminator network by creating better and better fakes. To start with, there is a small amount of identified training data that are used to generate a new data set – in the generator. These generated data are then sent to another network – the so-called discriminator whose job is to identify whether the data are fake or part of the training data set. As the two networks continue communicating with each other in this way, the generative network learns to create better fakes and the second network learns to better distinguish these fakes from genuine training data. GANs are used to generate artificial images.

GauGAN, NVIDIA – Taesung Park, Ming-Yu Liu, Ting-Chun Wang, Jun-Yan Zhu: GauGAN ist eine Anwendung, die mit wenigen Pinselstrichen in der Lage ist, komplexe Landschaften zu errechnen. Man zeichnet simple Farbblöcke und wählt aus, welche Texturen diesen Farbflächen zugewiesen werden sollen: Sand, Meer, Berge, Wald, Pflanzen oder Gestein sind nur ein paar der Möglichkeiten. Wie der Name bereits impliziert, nutzt GauGAN Generative Adversarial Networks, um die Skizzen mit fotorealistischen Landschaftsmerkmalen auszustatten. GauGAN kann diese Bilder danach sogar in den Malstilen von verschiedenen Künstler Innen ausgeben.

cGAN – Conditional Generative Adversarial Network (pix2pix)

A conditional generative adversarial network (cGAN) is an expanded GAN that has access not only to the output images but also to the input images. The generator network gets the additional job of creating bad fakes as well as good ones so that the discriminator network can learn better. cGANs are the basis for pix2pix applications where users can create a drawing and the network will generate the drawing as a cat picture or in the style of Vincent van Gogh.

ShadowGAN, Ars Electronica Futurelab, photo: Ars Electronica / Martin Hieslmair

ShadowGAN, Ars Electronica Futurelab (AT): Das ShadowGAN erkennt die Silhouetten von Menschen und füllt diese dann zum Beispiel mit errechneten Bildern von Bergen aus. In einem der Beispiele wurde die KI mit unzähligen Bildern von Bergen und Berglandschaften trainiert, es kennt also nur natürliche Landschaften und Berge. So interpretiert es quasi in allem, was es sieht, Berge, selbst in der menschlichen Gestalt. Im Hintergrund wird ein Conditional Generative Adversarial Network (cGAN) benutzt.


An autoencoder is a form of unsupervised learning where the program learns to reduce an input and reproduce it in such a way that the output is as close as possible to the input. This process is particularly relevant in cases where the characteristics of data have to be reduced in order to mitigate their complexity without losing the important information. In the language of machine learning, these characteristics are called dimensions.

Latent Space

Latent space is a variation on the autoencoder. Based on two examples, latent space can calculate all possibilities that exist between these two examples and visualize them. Starting from a picture of a person with thick glasses and one without glasses, all possible variations in between can be created.

MegaPixels / Adam Harvey (US), Jules LaPlace (US), photo: Ars Electronica / Martin Hieslmair

MegaPixels, Adam Harvey (US), Jules LaPlace (US): MegaPixels ist ein unabhängiges Kunst- und Forschungsprojekt, das die Herkunft und die individuellen Auswirkungen von Datensätzen zur Bilderkennung sowie damit verbunden die Ethik in Bezug auf die Privatsphäre und ihre Rolle bei der Erweiterung biometrischer Überwachungstechnologien untersucht. Das Projekt zielt darauf ab, eine kritische Perspektive auf Machine Learning-Bilddatensätze zu bieten, die von akademischen und industriell finanzierten Think-Tanks für KI übersehen werden könnten. Jeder auf dieser Website präsentierte Datensatz wird einer gründlichen Überprüfung seiner Bilder, Absichten und Finanzierungsquellen unterzogen.

Systems with memory

RNN – Recurrent Neural Network

Recurrent neural networks (RNN) are networks equipped with an internal memory that can recognize patterns in data sequences. They use the concept of backpropagation: the results are fed back into the network over and over until the network learns from them and can continue developing them. A variant is often used in which two RNNs are used in parallel, with one of the two counting back from the output. RNN is used in text processing, genome research, handwriting recognition, language recognition or for organizing large numerical series produced by sensors or stock markets, for example.

LSTM – Long Short-Term Memory

A long short-term memory is essentially a somewhat more complex RNN. Here the errors are fed back in not only through the input layer, but also at several filter points that decide whether the information should be passed on. These filter points consist of a specialized memory cell that can remember or forget information in a targeted way based on importance. This mechanism ensures that the network only remembers information for as long as it is needed.

NLP – Natural Language Processing

Natural language processing (NLP) is an AI method for recognizing language and its rules of grammar and usage by identifying patterns in large data sets. NLP is increasingly used in social networks for analyzing feelings about certain products for the marketing sector.

Λόγος—On the Origins of Words, photo:

GPT-2: Λόγος – Vom Ursprung der Worte: OpenAI, eine Non-Profit-Organisation, die sich mit der Erforschung von künstlicher Intelligenz beschäftigt, hat ein groß angelegtes KI-Sprachmodell entwickelt, das fähig ist, ganze zusammenhängende Textparagrafen zu erschaffen: GPT-2. Es kann einfach strukturierte Texte verstehen, automatisch übersetzen, schlichte Fragen beantworten und Inhalte zusammenfassen, ohne speziell für eine dieser Aufgaben trainiert worden zu sein. Aufgrund der hohen Qualität der sprachlichen Fähigkeiten von GPT-2 warnt das OpenAI-Team davor, dass das Modell für unlautere Zwecke missbraucht werden könnte. Es betont, dass BenutzerInnen dem Wahrheitsgehalt von Inhalten im Internet gegenüber noch skeptischer werden müssen und PolitikerInnen sich verstärkt um Regulative gegen den Missbrauch solcher Technologien kümmern sollten.

The Vector Space Word2Vec

A vector space consists of individual data points over which the vectors can be spatially represented. The angle between two vectors makes it possible to determine similarities or relationships between the data points. The larger the angle, the less similarity there is, and the smaller the angle, the more similarity there is. Above all, however, every word vector also represents – at least to some extent – the semantic meaning of the word in context. This means that pre-trained Word2Vec models can replace “mathematical formulas with words”: PARIS – FRANCE + AUSTRIA = VIENNA. This means that here, for example, the vector for the word PARIS minus the vector for the word FRANCE plus the vector for the word AUSTRIA really does equal the vector for VIENNA. Not exactly, of course, but the vector for VIENNA is the vector closest to the result and can therefore be regarded as the correct result.

, ,