ZENGENTI HACK DAY
One of our digital team objectives for this year is to do more with data, to collect, share and use it in order to better understand our audiences, their behaviour and their needs. Online, Google Analytics provides us with a huge amount of information on our website visitors, and we are only just beginning to scratch the surface of this powerful tool. But for physical visitors, once they come through our doors their behaviour in our buildings largely remains a mystery. We have automatic people counters that tell us the volume of physical visits, but we don’t know how many of these visitors make their way up to the top floor, how long they stay, and how they spend their time. On a basic level, we would like to know which of our temporary exhibitions on the upper floors drive most traffic, but what further insight could we get from more data?
We provide self-complete visitor surveys via iPads in the front hall of our museums, and we can manually watch and record behaviour – but are there opportunities for automated processing and sensors to start collecting this information in a way which we can use and without infringing on people’s privacy? What are the variables that we could monitor?
We like to collaborate, and welcome the opportunity to work with technical people to try things out, so we took up the invitation to join the yearly “Lockdown” hack day at Zengenti – a 2 day event where staff form teams to work on non-work related problems. This gave us a good chance to try out some potential solutions for in-gallery sensors. Armed with Raspberry Pis, webcams, an array of open source tech (and the obligatory beer), the challenge was to come up with a system that can glean useful data about museum visitors at a low cost and using fairly standard infrastructure.
Atti Munir – Zengenti
Dan Badham – Zengenti
Joe Collins – Zengenti
Ant Doyle – Zengenti
Nic Kilby – Zengenti
Kyle Roberts – Zengenti
Mark Pajak – Bristol Museum
- Can we build a prototype sensor that can give us useful data on visitor behaviour in our galleries?
- What are the variables that we would like to know?
- Can AI automate the processing of data to provide us with useful insights?
- Given GDPR, what are the privacy considerations?
- Is it possible to build a compliant and secure system that provides us with useful data without breaching privacy rights of our visitors?
The Microsoft Azure Face API is an online “cognitive service” that is capable of detecting and comparing human faces, and returning an image analysis containing data on age, gender, facial features and emotion. This could potentially give us a “happy-o-meter” for an exhibition or something that told us the distribution of ages over time or across different spaces. This sort of information would be useful for evaluating exhibition displays, or when improving how we use internal spaces for the public.
Face detection: finding faces within an image.
Face verification: providing a likelihood that the same face appears in 2 images.
Clearly, there are positive and negative ramifications of this technology as highlighted by Facebook’s use of facial recognition to automatically tag photos, which has raised privacy concerns. The automated one-to-many ‘matching’ of real-time images of people with a curated ‘watchlist’ of facial images is possible with the same technology, but this is not what we are trying to do – we just want anonymised information that cannot be related back to any specific person. Whilst hack days are about experimentation and the scope is fairly open to build a rough prototype – we should spend time reviewing how regulations such as GDPR affect this technology because by nature it is a risky area even for purposes of research.
How are museums currently using facial recognition?
- Cooper Hewitt Smithsonian Design Museum have used it to create artistic installations using computer analysis of the emotional state of visitors to an exhibit.
GDPR and the collecting and processing of personal data
The General Data Protection Regulation (GDPR) focuses on the collection of personal data and how it is stored or processed in some way. It defines the various players as data controllers, data processors and data subjects, giving more rights to subjects about how their personal data is used. The concerns and risks around protecting personal data mean more stringent measures need to be taken when storing or processing it, with some categories of data, including biometric data, considered to be sensitive and so subject to extra scrutiny.
Personal data could be any data that could be used to uniquely identify a person, including name, email address, location, IP address etc., but also photographs containing identifiable faces, and therefore video.
Following GDPR guidelines we have already reviewed how we obtain consent when taking photographs of visitors, either individually or as part of an event. Potentially any system that records or photographs people via webcams will be subject to the same policy – meaning we’d need to get consent – this could cause practical problems for deploying such a system, but the subtleties of precisely how we collect, store and process images are important, particularly when we might be calling upon cloud based services for the image analysis.
In our hypothesised solution, we will be hooking up a webcam to take snapshots of exhibition visitors which will then be presented to the image analysis engine. Since images are considered personal data, we would be classed as data controllers, and anything we do with those images as data processing, even if we are not storing the images locally or in the cloud.
Furthermore – the returned analysis of the images would be classed as biometric data under GDPR and as such we would need explicit consent from visitors to the processing of their images for this specific purpose – non consented biometric processing is not allowed.
We therefore need to be particularly careful in anything we do that might involve images of faces even if we are only converting them to anonymised demographic data without any possibility to trace the data to an individual. The problem also occurs if we want to track the same person across several places – we need to be able to identify the same face in 2 images.
This means that whilst our project may identify the potential of currently available technology to give us useful data – we can’t deploy it in a live environment without consent. Still – we could run an experimental area in the museum where we ask for consent for visitors to be filmed for research purposes, as part of an exhibition. We’d need to assess whether the benefits of the research outweigh the effort of gaining consent.
This raises the question of where security cameras fall under this jurisdiction….time for a quick diversion:
As CCTV involves storing images that can be used to identify people, this comes under GDPR’s definition of personal data and as such we are required to have signage in place to inform people that we are using it, and why – the images can only be captured for this limited and specific purpose (before we start thinking we can hack into the CCTV system for some test data)
Live streaming and photography at events
When we take photographs at events we put up signs saying that we are taking photographs. However, whilst UK law allows you to take photos in a public place, passive consent may not be acceptable under GDPR when collecting data via image recognition technology.
Gallery interactive displays
Some of our exhibition installations involve live streaming – we installed a CCTV camera in front of a greenscreen as part of our Early Man exhibition in order to superimpose visitors in front of a crowd of prehistoric football supporters from the film. The images are not stored but they are processed on the fly – although it is fairly obvious what the interactive exhibit is doing, should we be asking consent before the visitor approaches the camera, or displaying a privacy notice explaining how we are processing the images?
Background image © Aardman animations
Any solution that involves hooking up webcams to a network or the internet comes with a risk. For the purposes of this hack day we are going to be using a Raspberry Pi connected to a webcam and using this to analyse the images. If this was to be implemented in the museum we’d need to assess the risk of the data from the devices being intercepted.
Authentication and encryption:
Authentication – restrict data to authorised users – user name and password (i.e. consent given)
Encryption – encoding of the data stream so that even if an unauthenticated user accesses the stream, they can’t read it without decrypting it. E.g. using SSL/TLS.
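As a minimal sketch of how the two measures combine, a client asking a camera unit for a snapshot could refuse any unencrypted address and attach credentials to the request. This assumes the unit serves images over HTTPS and accepts HTTP Basic authentication; the address, user name and password below are hypothetical, and this is not the code we wrote on the day.

```python
import base64
import urllib.request


def build_snapshot_request(url, username, password):
    """Build an authenticated request for one snapshot, refusing plain HTTP."""
    if not url.lower().startswith("https://"):
        # Encryption: without TLS the image and the credentials travel in clear text.
        raise ValueError("snapshot URL must use https://")
    # Authentication: HTTP Basic credentials, acceptable only because the link is encrypted.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical camera unit address and credentials, for illustration only.
request = build_snapshot_request("https://camera-1.example/snapshot", "museum", "s3cret")
# urllib.request.urlopen(request) would then fetch the image over the encrypted link.
```

Basic authentication is used here only because it is the simplest scheme to show; a real deployment would want to review the options properly.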
Furthermore – if we are sending personal data for analysis by a service running online, the geographic location of where this processing takes place is important.
“For GDPR purposes, Microsoft is a data processor for the following Cognitive Services, which align to privacy commitments for other Azure services”
Minimum viable product: Connecting the camera server, the face analyser, the monitoring dashboard and the visualisation.
Despite the above practical considerations – the team have cracked on with assembling various parts of the solution – using a webcam linked to a Raspberry Pi to send images to the Azure Face API for analysis. Following on from that, some nifty data visualisation tools and monitoring dashboard software can help users manage a number of devices and aggregate data from them.
There are some architectural decisions to make around where the various components sit and whether image processing is done locally, on the Pi, or on a virtual server, which could be hosted locally or in the cloud. The low processing power of the Pi could limit our options for local image analysis, but sending the images for remote processing raises privacy considerations.
Step 1: Camera server
After much head scratching we had an application that could be launched on a PC or Linux machine and accessed over HTTP to retrieve a shot from any connected webcam – this is the first part of the puzzle sorted.
By the second day we had a series of webcam devices – Raspberry Pi, Windows PC stick and various laptops – all providing pictures from their webcams via HTTP requests over wifi. So far so good – the next step is how to analyse these multiple images from multiple devices.
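The idea of a camera server can be sketched with nothing more than Python's standard library. This is an illustration of the approach rather than the application in our repository: the `/snapshot` path, the port and the `fake_capture` function are all assumptions, and a real unit would replace `fake_capture` with a call that grabs a frame from the attached webcam.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(capture_jpeg):
    """Create a request handler that serves one fresh snapshot per GET /snapshot."""

    class SnapshotHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/snapshot":
                self.send_error(404, "Unknown path")
                return
            image = capture_jpeg()  # bytes of a single frame
            self.send_response(200)
            self.send_header("Content-Type", "image/jpeg")
            self.send_header("Content-Length", str(len(image)))
            self.end_headers()
            self.wfile.write(image)

        def log_message(self, format, *args):
            pass  # keep the console quiet; requests are not logged

    return SnapshotHandler


def fake_capture():
    """Stand-in for a real webcam grab; returns placeholder bytes, not a real image."""
    return b"\xff\xd8placeholder\xff\xd9"


def serve(port=8080, capture_jpeg=fake_capture):
    """Run the camera server until interrupted (hypothetical default port)."""
    HTTPServer(("0.0.0.0", port), make_handler(capture_jpeg)).serve_forever()
```

Calling `serve()` on each device would let any machine on the same network request `http://<device-address>:8080/snapshot`, which is the pattern that allowed us to mix Pis, PC sticks and laptops.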
Step 2: Face analyser.
Because the Azure Face API is a chargeable service, we don’t want to waste money by analysing images that don’t contain faces – so we implemented an open source script to first check for any faces. If an image passes the face test – we can then send it for analysis.
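The gating logic itself is simple, and a rough sketch is shown here. The detector is passed in as a function so that any local, open source face detector could be plugged in; the `toy_detector` below is a placeholder that just counts a marker in fake image bytes, and the file names are made up.

```python
def filter_for_analysis(images, count_faces):
    """Split images into those worth sending to the paid API and those to skip.

    `count_faces` is any local detector returning the number of faces in an image.
    """
    to_send, skipped = [], []
    for name, image in images:
        if count_faces(image) > 0:
            to_send.append(name)
        else:
            skipped.append(name)
    return to_send, skipped


def toy_detector(image):
    """Placeholder detector: counts b'face' markers in fake image bytes."""
    return image.count(b"face")


# Fake snapshots for illustration only.
snapshots = [
    ("hall.jpg", b"wall face floor face"),
    ("stairs.jpg", b"wall floor"),
    ("gallery.jpg", b"face"),
]
to_send, skipped = filter_for_analysis(snapshots, toy_detector)
print(to_send)   # ['hall.jpg', 'gallery.jpg']
print(skipped)   # ['stairs.jpg']
```

Only the images in `to_send` would go on to the chargeable service, so an empty gallery costs nothing to monitor.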
The detailed analysis that is returned in JSON format includes data on age, gender, hair colour and even emotional state of the faces in the picture.
Our first readings are pretty much on point with regards to age when we tested ourselves through our laptop webcams. And seeing the structure of the returned data gives us what we need to start thinking about the potential for visualising this data.
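To give a feel for working with the returned data, here is a small sketch that reduces an analysis to the fields we care about. The sample follows the general shape of the response as we understood it (a list of faces, each with `faceId`, `faceRectangle` and `faceAttributes`), but the values are invented for illustration and the exact fields should be checked against Microsoft's documentation.

```python
import json

# Invented sample data in the general shape of a face analysis response.
SAMPLE_RESPONSE = """
[
  {
    "faceId": "00000000-0000-0000-0000-000000000001",
    "faceRectangle": {"top": 120, "left": 200, "width": 90, "height": 90},
    "faceAttributes": {
      "age": 34.0,
      "gender": "female",
      "emotion": {"anger": 0.0, "contempt": 0.01, "disgust": 0.0, "fear": 0.0,
                  "happiness": 0.92, "neutral": 0.06, "sadness": 0.0, "surprise": 0.01}
    }
  },
  {
    "faceId": "00000000-0000-0000-0000-000000000002",
    "faceRectangle": {"top": 80, "left": 420, "width": 70, "height": 70},
    "faceAttributes": {
      "age": 61.0,
      "gender": "male",
      "emotion": {"anger": 0.0, "contempt": 0.0, "disgust": 0.0, "fear": 0.0,
                  "happiness": 0.1, "neutral": 0.85, "sadness": 0.05, "surprise": 0.0}
    }
  }
]
"""


def summarise_faces(response_text):
    """Keep only age, gender and the strongest emotion for each detected face."""
    summary = []
    for face in json.loads(response_text):
        attributes = face["faceAttributes"]
        emotion = attributes["emotion"]
        strongest = max(emotion, key=emotion.get)  # emotion with the highest score
        summary.append({
            "age": attributes["age"],
            "gender": attributes["gender"],
            "emotion": strongest,
        })
    return summary


for row in summarise_faces(SAMPLE_RESPONSE):
    print(row)
```

The summary deliberately drops the `faceId` and rectangle, so what is kept is closer to the anonymised demographic data we are after.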
We were intrigued by the faceid code – does this ID relate to an individual person (which would imply the creation of a GDPR-risky person database somewhere), or simply the face within the image, and if we snapped the same people at different intervals, would they count as different people? It turns out the faceid just relates to the face in an individual image, and does not relate to tracking an individual over time – so this looks good as far as GDPR is concerned, but also limits our ability to deduce how many unique visitors we have in a space if we are taking snapshots at regular intervals.
We had originally envisaged that facial analysis of a series of images from webcams could give us metrics on headcount and dwell time. As the technology we are using requires still images captured from a webcam – we would need to take photos at regular intervals to get the figures for a day.
Taking a closer look at the “emotion” JSON data reveals a range of emotional states, which when aggregated over time could give us some interesting results and raise more questions – are visitors happier on certain days of the week? Or in some galleries? Is it possible to track the emotion of individuals, albeit anonymously, during their museum experience?
In order to answer this we’d need to save these readings in a database, with each recorded against a location and time of day – the number of potential variables is creeping up.
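A minimal sketch of what such a store might look like, using an in-memory SQLite database. The table layout, the gallery names and the happiness scores are all invented for illustration; we did not build this during the hack.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE readings (
           taken_at  TEXT NOT NULL,   -- timestamp of the snapshot
           location  TEXT NOT NULL,   -- which camera unit or gallery
           happiness REAL NOT NULL    -- score between 0 and 1 for one face
       )"""
)

# Invented readings: no identifiers, just a score per face per snapshot.
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        ("2018-06-02 11:00:00", "Front Hall", 0.80),         # a Saturday
        ("2018-06-02 11:05:00", "Front Hall", 0.60),
        ("2018-06-02 11:00:00", "Top Floor Gallery", 0.40),
        ("2018-06-04 11:00:00", "Front Hall", 0.30),         # a Monday
    ],
)

# Average happiness per location and day of week (0 = Sunday ... 6 = Saturday).
averages = conn.execute(
    """SELECT location,
              strftime('%w', taken_at) AS weekday,
              ROUND(AVG(happiness), 2) AS avg_happiness,
              COUNT(*) AS face_readings
       FROM readings
       GROUP BY location, weekday
       ORDER BY location, weekday"""
).fetchall()

for row in averages:
    print(row)
```

Even this toy version shows how quickly the questions multiply: every extra column (gallery, hour, exhibition) is another way to slice the same readings.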
We would also need to do some rigorous testing that the machine readings were reliable – which raises the question about how the Face API is calibrated in the first place…but as this is just an experiment our priority is connecting the various components – fine tuning of the solution is beyond the scope of this hack.
Step 3: Data exporter
Prometheus is the software we are using to record data over time and provide a means to query the data and make it available to incoming requests from a monitoring server. We identified the following variables that we would like to track – both to monitor uptime of each unit and also to give us useful metrics.
- CPU gauge
- Memory gauge
- Disk Space gauge
- Uptime (seconds) counter
- Coeus_up (0/1) gauge
- Exporter_up (0/1) gauge
- Face count
- current_faces (count) gauge
- Face_id (id)
- Total_faces (count) summary
Nice to have
- Gender (0/1) gauge
- Age buckets <18, 18 to 65, >65 histogram
- Dwell duration
- Dwell_duration_seconds gauge
- Coeus_up (0/1) gauge
- Exporter_up (0/1) gauge
- API queries
- API_calls (count) gauge
- API_request_time (seconds) gauge
- API queries
- Exporter_scrape_duration_seconds gauge
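To show what an exporter hands over when Prometheus scrapes it, here is a rough sketch that writes two of the gauges above in the Prometheus text format. It is written by hand to stay self-contained; in practice a Prometheus client library would normally generate this output. The `unit` label, the device name and the lower-cased metric names are our assumptions for the example.

```python
def escape_label(value):
    """Escape backslash, double quote and newline, as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(unit, current_faces, coeus_up):
    """Render two gauges for one camera unit in the Prometheus text format."""
    label = f'unit="{escape_label(unit)}"'
    lines = [
        "# HELP current_faces Faces detected in the most recent snapshot.",
        "# TYPE current_faces gauge",
        f"current_faces{{{label}}} {current_faces}",
        "# HELP coeus_up Whether the camera application is running (1) or not (0).",
        "# TYPE coeus_up gauge",
        f"coeus_up{{{label}}} {coeus_up}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical unit name and readings.
print(render_metrics("pi-front-hall", current_faces=3, coeus_up=1))
```

Prometheus adds the timestamp itself at scrape time, which is why the exporter only needs to report the current value of each gauge.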
Step 4: Data dashboard
Every data point carries a timestamp and so this data can be plotted along an axis of time and displayed on a dashboard to give a real time overview of the current situation.
Step 5: Data visualisation
Using D3 we can overlay a graphic representing each face/datapoint back onto the camera feed. In our prototype mock up each face is represented by a shape giving an indication of their position within the frame. Upon this we could add colour or icons illustrating any of the available data from the facial analysis.
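The positioning step can be sketched as a small calculation: take the rectangle reported for a face and express its centre and size as percentages of the frame, so the overlay scales with whatever size the feed is displayed at. The sketch is in Python for consistency with a server-side pipeline; the rectangle values and the 640 by 480 frame size are assumptions for the example, and the drawing itself would still be done in D3.

```python
def to_overlay_marker(face_rectangle, frame_width, frame_height):
    """Convert a pixel rectangle into frame-relative percentages for an overlay."""
    centre_x = face_rectangle["left"] + face_rectangle["width"] / 2
    centre_y = face_rectangle["top"] + face_rectangle["height"] / 2
    return {
        "x_percent": round(100 * centre_x / frame_width, 1),
        "y_percent": round(100 * centre_y / frame_height, 1),
        "size_percent": round(100 * face_rectangle["width"] / frame_width, 1),
    }


# Invented rectangle on an assumed 640x480 frame.
marker = to_overlay_marker({"top": 120, "left": 200, "width": 80, "height": 80}, 640, 480)
print(marker)  # {'x_percent': 37.5, 'y_percent': 33.3, 'size_percent': 12.5}
```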
Github: Everything we did is openly available on this code repository: https://github.com/blackradley/coeus
Slack: we used this for collaboration during the project – great for chat and sharing documents and links, and breakout threads for specific conversations. This became the hive of the project.
Prometheus: monitoring remote hardware
Grafana: open source dashboard software
Azure: image recognition
Codepen: a code playground
D3: visualization library
Our aim was to get all the bits of the solution working together into a minimum viable product – to get readings from the webcam into a dashboard. With multiple devices and operating systems there could be many different approaches to this in terms of deployment methods, network considerations and options for where to host the image processing technology. We also wanted a scalable solution that could be deployed to several webcam units.
Just getting the various pieces of the puzzle working would most likely take up the whole time as we sprinted towards our MVP. As we started getting the data back it was starting to become clear that the analysis of the data would present its own problems, not just for reliability, but how to structure it and what the possibilities are – how to glean a useful insight from the almost endless tranches of timestamped data points that the system could potentially generate, and the associated testing, configuring and calibrating that the finished solution would need.
Whilst the Azure Face API will merrily and endlessly convert webcam screenshots of museum visitors to data points – the problem we face is what to make of this. Could this system count individuals over time, and not just within a picture? It seems that to do this you need an idea of how to identify an individual amongst several screen shots using biometric data, and so this would require a biometric database to be constructed somewhere to tell you if the face is new, or a repeat visitor – not something we would really want to explore given the sensitive nature of this data.
So this leaves us with data that does not resolve to the unique number of people in a space over time, but the number of people in a single moment, which when plotted over time is something like an average – and so our dashboard would feature “the average emotional state over time” or “the average gender”, as the same individual could be snapped in different emotional states.
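A toy calculation makes the limitation concrete. The two scenarios below are invented: in one, a single visitor lingers for three minutes; in the other, three different visitors each pass through for a minute. With a snapshot every 60 seconds the camera sees exactly the same thing in both cases.

```python
def snapshot_counts(visits, snapshot_times):
    """Count how many visitors are present at each snapshot time.

    `visits` is a list of (arrive, leave) times in seconds.
    """
    return [
        sum(1 for arrive, leave in visits if arrive <= t < leave)
        for t in snapshot_times
    ]


snapshot_times = [0, 60, 120]                      # one snapshot a minute
one_lingerer = [(0, 180)]                          # 1 unique visitor
three_passers = [(0, 60), (60, 120), (120, 180)]   # 3 unique visitors

print(snapshot_counts(one_lingerer, snapshot_times))   # [1, 1, 1]
print(snapshot_counts(three_passers, snapshot_times))  # [1, 1, 1]
```

Both series average one face per snapshot, so without a way of recognising the same face twice the data can describe occupancy but not unique visitors or dwell time.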
As ever with analytical systems the learning point here is to decide exactly on what to measure and how to analyse the data before choosing the technology – which is why hackathons are so great because the end product is not business critical and our prototype has given us some food for thought.
With GDPR presenting a barrier for experimenting with the Face API, I wonder whether we might have some fun pointing it at our museum collections to analyse the emotional states of the subjects of our paintings instead?
Thanks to Zengenti for creating / hosting the event: https://www.zengenti.com/en-gb/blog
Git repo for the project: https://github.com/blackradley/coeus