A few months ago http://1000memories.com published an insightful article titled “How many photos have ever been taken?”. The article collected and presented various statistics related to digital and analog photography, such as camera ownership and the number of photographs taken per year. Amidst all this digital image influx, the article points out, certain pictures are special to us – like those we keep in the shoebox at home.
The technical paradox
The relationship between technology and life has been a well-discussed topic. Many of the technological artefacts we use today produce such huge amounts of data that we need to seek help from the technology itself to consume its output. We try to build more intelligent objects and networks to edit, organise and filter the noise out of this data, making it easier and simpler to understand.
Going back to digital photographs, there are myriad ways in which we create, edit, share and store them. Photographs are probably the most popular form of social media, but we also click and keep photographs that are personal to us. We print them out and frame them, or keep them in that shoebox at home. What has changed since humanity started clicking? We don’t have to look far back to see how it evolved: from a hired photographer crafting your family portrait, to you pointing and clicking the Kodak moments and rushing to the photo studio to develop the film roll, to clicking that funny snap of your cat shredding toilet paper and sharing it on your favourite cat-meme social network!
These steps themselves have not evolved much since the analog days of photography. Some steps are definitely new, but they mostly replace an analogous process from the old days, such as film development. The result is the same still imagery, but the medium has evolved from chemicals capturing and holding visual information into digital bits [mostly Red, Green and Blue values] with some metadata embedded in the digital picture file. This metadata usually contains the technical details of the equipment used to take the photograph and the details of the photograph itself. In the case of a geo-tagged photo, it can contain the latitude and longitude of the place where the photograph was taken.
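As a concrete illustration of that location metadata: geo-tags in an image file typically store latitude and longitude as degree/minute/second values plus a hemisphere reference. A minimal sketch of converting such a triple to decimal degrees (the sample coordinates are invented for illustration; reading the raw fields from an actual file would require an image library):

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert a degrees/minutes/seconds triple, as commonly stored
    in photo geo-tags, to decimal degrees. `ref` is 'N', 'S', 'E' or 'W';
    southern and western hemispheres become negative values."""
    decimal = degrees + minutes / 60.0 + seconds / 3600.0
    return -decimal if ref in ("S", "W") else decimal

# Hypothetical geo-tag of a photo taken somewhere around Copenhagen
lat = dms_to_decimal(55, 40, 34.0, "N")
lon = dms_to_decimal(12, 34, 6.0, "E")
print(round(lat, 4), round(lon, 4))  # 55.6761 12.5683
```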
But can it contain symbolic information? Can the technology involved identify or understand what a photograph means to a person?
If technology could understand this, the implications could be big. Not only could it completely transform many of the steps involved in the process, it could also bring about new ways of capturing and using photographs. The simplicity and ease of sharing photographs on various social networks have changed the way we plan, shoot and share. With the advent of digital photography, the cost involved in capturing a photograph went down dramatically; and with cameras having ever more storage capacity, most of us continue to click in large numbers, hoping that some of the shots might turn out to be worthy of posting on our walls. From being the record of a special moment, a photograph has come to mean a lot more as the process and equipment became cheaper and easier to use.
In order to explore how a machine can find the semantic meaning of a photograph, we might need to understand how people see photographs. How do we arrange them and tell a story? What do we look for when we want to find a photograph? These questions may seem quite obvious. But since machine intelligence today understands mainly, if not only, quantitative data about the world that surrounds it, we need to investigate the questions above and identify the quantifiable aspects of the symbols that a photograph might contain.
R G B et cetera
Structurally, a [digital] photograph is composed of pixels, each of which is specified by Red, Green and Blue numerical values. This information can be augmented to bring out more detail and make the image richer. What, beyond RGB, can each pixel contain that is relevant to the symbolic meaning of a photograph? And even if photographs can contain such extra information, how could a machine make sense of it?
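To make the structure concrete, here is a toy sketch of a pixel grid where each pixel carries its RGB triple plus one augmented field beyond RGB (all values and field names are invented for illustration):

```python
# A 2x2 "image" where each pixel is an RGB triple augmented with an
# extra field -- here, an illustrative year attached to that region.
image = [
    [{"rgb": (200, 180, 150), "year": 1994},
     {"rgb": (90, 90, 95),    "year": 1994}],
    [{"rgb": (30, 120, 60),   "year": 2010},
     {"rgb": (210, 210, 200), "year": 2010}],
]

def channel_average(img, channel):
    """Average one RGB channel (0=R, 1=G, 2=B) over every pixel."""
    values = [px["rgb"][channel] for row in img for px in row]
    return sum(values) / len(values)

print(channel_average(image, 0))  # average red value: 132.5
```

The point of the sketch is only that a machine can readily compute over the numerical RGB part; the augmented field is where any symbolic information would have to live.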
Human intelligence is the result of a multitude of interactions that happen inside our brain. An individual neurone does not contain or process a piece of information in a way that makes sense on its own. However, each neurone is extensively connected to others, forming a very large network whose interactions result in consciousness and intelligence. Kevin Kelly, in his book What Technology Wants, compares the planet-spanning electronic membrane to the human brain in complexity. We share millions of photographs on this electronic network each year. The complex networked information and anecdotes associated with photographs on the web might well exceed the visual information contained in the photographs themselves. This information is of two types: curated information, and the connections that build the network itself. Tags, names, URLs, location data and other metadata fall into the former category, and are largely provided by humans themselves. The text that appears along with a picture in a blog, the comments beneath the picture, the tone of those comments, how many people have liked or linked to a certain picture, the information about those people, what other pictures they liked and so on fall into the second type, where the information is gathered by examining the nodes of the network itself.
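A minimal sketch of how the first, curated type of information could be mined for meaning: linking photographs that share human-provided tags. All photo names and tags below are invented for illustration.

```python
# Toy corpus: photo -> set of human-curated tags (all invented).
photos = {
    "beach.jpg":   {"family", "summer", "sea"},
    "cat.jpg":     {"cat", "funny"},
    "holiday.jpg": {"family", "sea", "boat"},
}

def related(corpus, name):
    """Rank other photos by how many curated tags they share
    with `name`, dropping photos that share none."""
    tags = corpus[name]
    scores = {other: len(tags & t)
              for other, t in corpus.items() if other != name}
    return sorted((o for o, s in scores.items() if s),
                  key=lambda o: -scores[o])

print(related(photos, "beach.jpg"))  # ['holiday.jpg']
```

The second, network-derived type would work similarly but over likes, links and commenters rather than hand-written tags, making the graph far larger and noisier.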
Just like a human brain makes sense of the world around it through countless networked interactions, can machines [or the digital network as an entity] make sense of a photograph by collecting and correlating information within the image as well as in the way it is networked and shared? A number of existing technologies and algorithms could, taken together, deal with such a task. However, there are certain infrastructural challenges as well. One of the main complications is the availability of the data itself: not all social networking platforms have an open data policy, and privacy concerns may be one of the reasons.
Living Images, a project Wan-Ting Liao and I did at the Copenhagen Institute of Interaction Design for Intel, tried to explore certain aspects of the ideas discussed above. It explored the idea of adding time as a factor, alongside RGB, for the pixels in a photograph. Regions of a photograph age just as we age, changing in appearance as the people, objects or spaces they depict age. We explored the aesthetics of such an image and how it imparts the meaning of the original image, scenarios where such a photograph has an impact, and the feeling of navigating through the changes that time has brought to the photograph’s subject.
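The core idea, a region of a photograph whose appearance is indexed by time, could be sketched like this. This is a toy model written for this essay, not the actual project code, and all timestamps and colours are invented:

```python
import bisect

class LivingRegion:
    """A region of a photograph with appearances captured over time.
    Stores (year, rgb) snapshots and returns the most recent
    appearance at or before a requested moment."""

    def __init__(self, snapshots):
        self.snapshots = sorted(snapshots)  # [(year, (r, g, b)), ...]

    def appearance_at(self, year):
        years = [y for y, _ in self.snapshots]
        i = bisect.bisect_right(years, year) - 1
        if i < 0:
            raise ValueError("no appearance recorded yet")
        return self.snapshots[i][1]

# A wall in a family photo, repainted over the years (invented values).
wall = LivingRegion([(1994, (180, 180, 170)),
                     (2005, (140, 60, 60)),
                     (2012, (240, 240, 235))])
print(wall.appearance_at(2008))  # the 2005 repaint: (140, 60, 60)
```

Navigating a photograph through time, as in the project, would amount to asking every such region for its appearance at the chosen moment and compositing the results.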
Our explorations were aimed at provoking a discussion rather than designing an artefact. My topic for the final project has since changed direction based on the thoughts presented here. They could be summarised as follows:
- What do we see in a photograph, and why are some more special to us? Memories and tokens? The people, objects, space or moment?
- Why do we share certain photographs? An alternate online personality, preserving memory, documentation, history, explanation or annotation?
- Can we distill the answers to the questions above, at least to a certain degree, so that our digital companions can make sense of the photographs that we click, keep and share?
- If they can, what new insightful services or digital artefacts can be designed, and how can present services or devices be modified?