All posts by Mark Pajak

CV19 – Digital Battle Plans

Background

Bristol Culture receives an average of 2.5 million yearly visits to its websites (not including social media). Additionally, we have different demographics specific to each social media channel, which reflect the nature of the content and how users interact with the platform features offered.

Since March 13th, visits to bristolmuseums.org.uk have fallen sharply, from a baseline of 4,000/day to under 1,000/day as of 6th April. This unprecedented change reflects a large-scale shift in user behaviour which we need to understand: presumably, people are no longer searching for information about visiting the museum in person because of enforced social distancing measures. It remains to be seen how patterns of online behaviour will change in the coming weeks; however, it appears we have a new baseline which more closely matches our other websites, which are more about museum objects and subject matter than physical exhibitions and events.

You can explore this graph interactively using the following link:

https://datastudio.google.com/reporting/196MwOHX1WOhtwDQbx62qP0ntT7sLO9mb

Before CV struck

The top 10 most visited pages in January on bristolmuseums.org.uk feature our venue homepages, specific exhibitions and our events listings.

online stats January 2020

During Lockdown

From mid-March to April, our blog pages, online stories and collections pages feature in the top 10 most visited.

online stats March 16th-April 9th

Digital Content Strategy

Internally, we have been developing a digital content strategy to help us develop and publish content in a more systematic way. The effect of CV-19 is that we have had to fast-track this process to deal with a large demand for new online content. The challenge we face is how to remain true to our longer-term digital aims whilst tackling the expectation to do more digitally. In practice, we have had to rapidly transform to a new way of working with colleagues, collaborating remotely, and develop a new fast-track system for developing and signing off digital content. This has required the team to work differently, both internally, distributing tasks between us, and externally across departments, so that our content development workflow is more transparent.

Pre-quarantine online audiences

Online we follow our social media principles: https://www.labs.bristolmuseums.org.uk/social-media-principles/

A key principle of our audience development plan is to understand and improve relationships with our audiences (physical and digital). This involves avoiding the idea that everything is for ‘everyone’, and instead recognising that different activities suit different audiences. We seek to use data from a range of sources (rather than assumptions) to underpin decisions about how to meet the needs and wants of our audiences.

Quarantine online audiences

Since the implementation of strict quarantine measures by the Government on Tuesday 24th March, audiences’ needs have changed:

  • Families at home with school-age children (4 – 18) who are now home-schooling during term-time.
  • Retired people with access to computers/smart-phones who may be isolated and exploring online content for the first time.
  • People of all ages in high-risk groups advised not to leave their homes for at least the next 12 weeks.
  • People quarantining who may be lonely/anxious/angry/bored/curious or looking for opportunities to self-educate. 
  • Possible new international audiences under quarantine restrictions.

See this list created anonymously by digital/museum folk: https://docs.google.com/document/d/1MwE3OsljV8noouDopXJ2B3MFXZZvrVSZR8jSrDomf5M/edit

What should our online offer provide?

https://www.bristolmuseums.org.uk/blog/a-dose-of-culture-from-home/

Whilst our plummeting overall visitor numbers tell us one story, we now have data showing a baseline of people who visit our web pages regularly, and this audience needs consideration: potentially a new audience with new needs, but also a core group of digitally engaged visitors who are seeking content in one form or another.

Some things we need to be thinking about when it comes to our digital content:

  • What audiences are we trying to reach and what platforms are they using? 
  • What reach are we aiming for and what are other museums doing – we don’t necessarily want to publish content that is already out there. What’s our USP? 
  • What can we realistically do, and do well with limited resources?
  • What format will any resources take and where will they ‘live’? 
  • What’s our content schedule – will we be able to keep producing this stuff if we’ve grown an audience for it once we’re open again? When will we review this content and retire if/when it’s out of date?
  • We need to be thinking about doing things well, or not doing them at all – social media platforms have ways of working out what good content is, and will penalise us if we keep posting things that get low engagement. A vicious cycle.
  • We want to engage with a relevant conversation, rather than simply broadcast or repurpose what we have (though in practice we may only have resource to repurpose content)

Submitting ideas/requests for digital content during Quarantine period

We are already familiar with using Trello to manage business processes, so we quickly created a new board for content suggestions. This ‘Trello-ised’ what had been developing organically for some time, but mainly in the minds of the digital and marketing teams.

Content development process in Trello

STEP 1: An idea for a new piece of digital output is suggested, written up and emailed to the digital team, then added to the Digital Content Requests Trello board.

STEP 2: The suggestion is broken down and augmented with the information detailed below, added as fields to the Trello card.

STEP 3: The list of suggestions is circulated amongst staff on the sign-off panel for comments.

STEP 4: The card is either progressed into the To Do list, or moved back to the “more info needed / see comments” list.

The following information is required in order to move a digital content suggestion forward:

Description: Top level description about what the proposal is

Content: What form does the content take? Do we already have the digital assets required or do we need to develop or repurpose and create new content? What guidelines are available around the formats needed?

Resource: What staff are required to develop the content, who has access to upload and publish it?

Audiences: Which online audiences is this for and what is their user need?

Primary platform: Where will the content live, and for how long? 

Amplification: How will it be shared?

Success: What is the desired impact / behaviour / outcome?
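As a sketch of how this checklist could work in practice, the helper below (hypothetical field names, not our actual Trello setup) flags which required fields a suggestion is still missing before it goes to the sign-off panel:

```python
# A minimal sketch (hypothetical field names) of the checklist used to decide
# whether a digital content suggestion is ready for the sign-off panel.
REQUIRED_FIELDS = [
    "description", "content", "resource", "audiences",
    "primary_platform", "amplification", "success",
]

def missing_fields(card: dict) -> list:
    """Return the required fields a content-request card is still missing."""
    return [f for f in REQUIRED_FIELDS if not card.get(f)]

suggestion = {
    "description": "Weekly 'object of the week' blog post",
    "content": "Repurposed collection photography plus a 300-word story",
    "audiences": "Digitally engaged regulars",
}
# missing_fields(suggestion) lists the questions still needing answers
```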

Opportunities 

Experimentation

New and emerging content types: The lockdown period could be an opportunity to try a range of different approaches without worrying too much about their place in the long term strategy.

Online events programme

Now that digital is the only option, we need to look at opportunities for live-streaming events. With no physical audience, how do we build a digital audience large enough to know about, and be interested in, live streams if we go down that route? Related to the above are online family and adult workshops: a lot of this is happening now, but are they working, and how long will people stay interested?

Collaborating with Bristol Cultural organisations

With other cultural organisations in Bristol facing similar situations, we’ll be looking to collaborate on exploring:

  • What is the online cultural landscape of Bristol?
  • Collaborative cultural response to Corona
  • A curated, city wide approach
  • Working with digital producers on user research questions
  • Similar to the Culture ‘Flash Sale’
  • Scheduled content in May

Arts Council England business plan

Projects in our Arts Council England business plan are at risk of not being delivered – can digital offer a way to do them differently?

Service / Museum topical issues

How can we create an online audience to move forward our decolonisation and climate change discussions?

Family digital engagement  

We’ll be working with the public programming team to develop content for a family audience.

Examples of museum services with online content responding well to the quarantine situation:

a) they have a clear message about the coronavirus situation

b) they have adjusted their landing pages to point visitors to online content.

Examples of museums with good online content generally

A recent Guardian article by Adrian Searle lists museums to visit digitally: https://www.theguardian.com/artanddesign/2020/mar/25/the-best-online-art-galleries-adrian-searle

Fundraising

The Development Team typically manages around £12,800 in donations per month through ‘individual giving’ which goes to our charity, Bristol Museums Development Trust. This is from a variety of income streams including donation boxes, contactless kiosks, Welcome Desks and donations on exhibition tickets. Closure of our venues means this valuable income stream is lost. To mitigate this, we need to integrate fundraising ‘asks’ into our online offers. For example, when we promote our online exhibitions, ask for a donation and link back to our online donation page. 

The Development Team will work with the Digital and Marketing teams to understand plans and opportunities for digital content and scope out where and how to place fundraising messages across our platforms. We will work together to weave fundraising messages into the promotion of our online offers, across social media, as well as embed ‘asks’ within our website. 

Next Steps:

Clearly, there will be long-lasting effects from the pandemic, and they’ll sweep through our statistics and data dashboards for some time. However, working collaboratively across teams, responding to change and using data to improve online services are our digital raison d’être – so we’ll treat this as a new channel for 2020 onwards rather than just a temporary fix.

snapshot of digital stats before the pandemic

SENSORS IN MUSEUMS

ZENGENTI HACK DAY

Background

One of our digital team objectives for this year is to do more with data, to collect, share and use it in order to better understand our audiences, their behaviour and their needs. Online, Google analytics provides us with a huge amount of information on our website visitors, and we are only just beginning to scratch the surface of this powerful tool.  But for physical visitors, once they come through our doors their behaviour in our buildings largely remains a mystery. We have automatic people counters that tell us the volume of physical visits, but we don’t know how many of these visitors make their way up to the top floor, how long they stay, and how they spend their time. On a basic level, we would like to know which of our temporary exhibitions on the upper floors drive most traffic, but what further insight could we get from more data? 

We provide self-complete visitor surveys via iPads in the front hall of our museums, and we can manually watch and record behaviour – but are there opportunities for automated processing and sensors to start collecting this information in a way we can use, without infringing on people’s privacy? What are the variables that we could monitor?

Hack Time!

We like to collaborate and welcome the opportunity to work with technical people to try things out, so we jumped at the invitation to join the yearly “Lockdown” hack day at Zengenti – a two-day event where staff form teams to work on non-work-related problems. This gave us a good chance to try out some potential solutions for in-gallery sensors. Armed with Raspberry Pis, webcams, an array of open source tech (and the obligatory beer), the challenge was to come up with a system that can glean useful data about museum visitors at low cost, using fairly standard infrastructure.

Team: 

Atti Munir – Zengenti 

Dan Badham – Zengenti

Joe Collins – Zengenti 

Ant Doyle – Zengenti 

Nic Kilby – Zengenti

Kyle Roberts – Zengenti

Mark Pajak – Bristol Museum

Mission: 

  • Can we build a prototype sensor that can give us useful data on visitor behaviour in our galleries?
  • What are the variables that we would like to know?
  • Can AI automate the processing of data to provide us with useful insights?
  • Given GDPR, what are the privacy considerations?
  • Is it possible to build a compliant and secure system that provides us with useful data without breaching privacy rights of our visitors?

Face API

The Microsoft Azure Face API is an online “cognitive service” that is capable of detecting and comparing human faces, and returning an image analysis containing data on age, gender, facial features and emotion. This could potentially give us a “happy-o-meter” for an exhibition or something that told us the distribution of ages over time or across different spaces. This sort of information would be useful for evaluating exhibition displays, or when improving how we use internal spaces for the public.

Face detection: finding faces within an image.

Face verification: providing a likelihood that the same face appears in two images.
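As a sketch of how a detection call is put together (endpoint and key are placeholders, and the actual POST of the JPEG bytes can be made with any HTTP client), the Face API detect endpoint takes the image as raw bytes with a subscription-key header and a list of requested attributes:

```python
# Sketch of the request sent to the Azure Face API "detect" endpoint,
# asking for the age/gender/emotion attributes discussed above.
# "YOUR-REGION" and "YOUR-KEY" are placeholders, not real credentials.
def build_detect_request(endpoint: str, subscription_key: str):
    """Build the URL, query params and headers for a Face API detect call."""
    url = endpoint.rstrip("/") + "/face/v1.0/detect"
    params = {"returnFaceAttributes": "age,gender,emotion"}
    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key,
        "Content-Type": "application/octet-stream",  # raw JPEG bytes in the body
    }
    return url, params, headers

url, params, headers = build_detect_request(
    "https://YOUR-REGION.api.cognitive.microsoft.com", "YOUR-KEY"
)
```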

Clearly, there are positive and negative ramifications of this technology, as highlighted by Facebook’s use of facial recognition to automatically tag photos, which has raised privacy concerns. The automated one-to-many ‘matching’ of real-time images of people against a curated ‘watchlist’ of facial images is possible with the same technology, but this is not what we are trying to do – we just want anonymised information that cannot be related back to any specific person. Whilst hack days are about experimentation and the scope to build a rough prototype is fairly open, we should spend time reviewing how regulations such as GDPR affect this technology, because by nature it is a risky area even for research purposes.

How are museums currently using facial recognition?

  • Cooper Hewitt Smithsonian Design Museum have used it to create artistic installations using computer analysis of the emotional state of visitors to an exhibit.

GDPR and the collecting and processing of personal data

The General Data Protection Regulation (GDPR) focuses on the collection of personal data and how it is stored or processed. It defines the various players as data controllers, data processors and data subjects, giving more rights to subjects over how their personal data is used. The concerns and risks around protecting personal data mean more stringent measures need to be taken when storing or processing it, with some categories, including biometric data, considered sensitive and so subject to extra scrutiny.

Personal data is any data that could be used to uniquely identify a person, including name, email address, location, IP address and so on, but also photographs containing identifiable faces, and therefore video.

Following GDPR guidelines, we have already reviewed how we obtain consent when taking photographs of visitors, either individually or as part of an event. Potentially, any system that records or photographs people via webcams will be subject to the same policy, meaning we’d need to get consent. This could cause practical problems for deploying such a system, but the subtleties of precisely how we collect, store and process images are important, particularly when we might be calling upon cloud-based services for the image analysis.

In our hypothesised solution, we will be hooking up a webcam to take snapshots of exhibition visitors which will then be presented to the image analysis engine. Since images are considered personal data, we would be classed as data controllers, and anything we do with those images as data processing, even if we are not storing the images locally or in the cloud.

Furthermore – the returned analysis of the images would be classed as biometric data under GDPR and as such we would need explicit consent from visitors to the processing of their images for this specific purpose – non consented biometric processing is not allowed.

We therefore need to be particularly careful in anything we do that might involve images of faces even if we are only converting them to anonymised demographic data without any possibility to trace the data to an individual. The problem also occurs if we want to track the same person across several places – we need to be able to identify the same face in 2 images. 

This means that whilst our project may identify the potential of currently available technology to give us useful data – we can’t deploy it in a live environment without consent. Still – we could run an experimental area in the museum where we ask for consent for visitors to be filmed for research purposes, as part of an exhibition. We’d need to assess whether the benefits of the research outweigh the effort of gaining consent.

This raises the question of where security cameras fall under these regulations… time for a quick diversion:

CCTV Cameras

As CCTV involves storing images that can be used to identify people, it comes under GDPR’s definition of personal data, and as such we are required to have signage in place informing people that we are using it and why. The images can only be captured for this limited and specific purpose – so no, we can’t hack into the CCTV system for some test data.

Live streaming and photography at events

When we take photographs at events we put up signs saying so. However, whilst UK law allows you to take photos in a public place, passive consent may not be sufficient under GDPR when collecting data via image recognition technology.

Gallery interactive displays

Some of our exhibition installations involve live streaming – we installed a CCTV camera in front of a green screen as part of our Early Man exhibition in order to superimpose visitors onto a crowd of prehistoric football supporters from the film. The images are not stored but they are processed on the fly. Although it is fairly obvious what the interactive exhibit is doing, should we be asking consent before the visitor approaches the camera, or displaying a privacy notice explaining how we are processing the images?

Background image © Aardman animations

Security

Any solution that involves hooking up webcams to a network or the internet comes with risk. For the purposes of this hack day we are using a Raspberry Pi connected to a webcam to analyse the images. If this were to be implemented in the museum we’d need to assess the risk of the devices being intercepted.

Authentication and encryption:

Authentication – restrict data to authorised users, e.g. with a username and password (i.e. consent given).

Encryption – encoding of the data stream so that even if an unauthenticated user accesses it, they can’t read it without decrypting, e.g. using SSL/TLS.
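To make the authentication step concrete, here is a minimal sketch (illustrative credentials, and in practice this would sit behind SSL/TLS so the header isn’t readable on the wire) of validating an HTTP Basic Auth header before serving any camera data:

```python
# Sketch: check an "Authorization: Basic ..." header against known credentials
# before serving camera data. Credentials here are purely illustrative.
import base64

def authorised(auth_header: str, username: str, password: str) -> bool:
    """Return True only if the header carries the expected user:password pair."""
    if not auth_header or not auth_header.startswith("Basic "):
        return False
    try:
        decoded = base64.b64decode(auth_header[6:]).decode("utf-8")
    except Exception:
        return False
    return decoded == f"{username}:{password}"

token = base64.b64encode(b"curator:secret").decode("ascii")
# authorised("Basic " + token, "curator", "secret") -> True
```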

Furthermore –  if we are sending personal data for analysis by a service running online, the geographic location of where this processing takes place is important.

“For GDPR purposes, Microsoft is a data processor for the following Cognitive Services, which align to privacy commitments for other Azure services”

Minimum viable product: Connecting the camera server, the face analyser, the monitoring dashboard and the visualisation. 

Despite the above practical considerations, the team cracked on with assembling the various parts of the solution, using a webcam linked to a Raspberry Pi to send images to the Azure Face API for analysis. Following on from that, some nifty data visualisation tools and monitoring dashboard software can help users manage a number of devices and aggregate data from them.

There are some architectural decisions to make around where the various components sit and whether image processing is done locally, on the Pi, or on a virtual server, which could be hosted locally or in the cloud. The low processing power of the Pi could limit our options for local image analysis, but sending the images for remote processing raises privacy considerations.

Step 1: Camera server

After much head scratching we had an application, launchable on Windows or Linux, that could be accessed over HTTP to retrieve a shot from any connected webcam – the first part of the puzzle sorted.

By the second day we had a series of webcam devices – Raspberry Pi, Windows PC stick and various laptops – all providing pictures from their webcams via HTTP requests over wifi. So far so good; the next step was how to analyse these multiple images from multiple devices.
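As an illustration of the camera-server idea (not the exact app we built on the day), a snapshot endpoint can be sketched with Python’s standard library; the webcam capture itself is stubbed out, with the handler serving whatever JPEG bytes are currently in LATEST_FRAME:

```python
# Sketch of a snapshot server: an HTTP endpoint returning the most recent
# webcam frame as a JPEG. Real frame capture (e.g. with OpenCV) would update
# LATEST_FRAME in a background loop; here it is just a stub.
from http.server import BaseHTTPRequestHandler, HTTPServer

LATEST_FRAME = b""  # would hold real JPEG bytes from the webcam

def jpeg_headers(body: bytes):
    """Headers for serving a single JPEG snapshot."""
    return [("Content-Type", "image/jpeg"),
            ("Content-Length", str(len(body)))]

class SnapshotHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Serve the latest frame to any GET request, e.g. from the analyser.
        self.send_response(200)
        for name, value in jpeg_headers(LATEST_FRAME):
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(LATEST_FRAME)

# To run the (blocking) server on port 8080:
# HTTPServer(("0.0.0.0", 8080), SnapshotHandler).serve_forever()
```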

Step 2: Face analyser.

Because the Azure Face API is a chargeable service, we don’t want to waste money analysing images that don’t contain faces, so we implemented an open source script to first check for any faces. If an image passes the face test, we can then send it for analysis.

The detailed analysis that is returned in JSON format includes data on age, gender, hair colour and even emotional state of the faces in the picture.

Our first readings are pretty much on point with regards to age when we tested ourselves through our laptop webcams. And seeing the structure of the returned data gives us what we need to start thinking about the potential for visualising this data.
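The returned structure looks roughly like the sample below (values are illustrative, trimmed to the attributes we cared about); a small helper can then pick out the strongest emotion for each detected face:

```python
# Illustrative (not real) Face API response, trimmed to the attributes we
# cared about, plus a helper that picks the strongest emotion per face.
sample_response = [
    {
        "faceId": "c5c24a82-6845-4031-9d5d-978df9175426",
        "faceAttributes": {
            "age": 34.0,
            "gender": "male",
            "emotion": {
                "happiness": 0.82, "neutral": 0.15, "surprise": 0.02,
                "sadness": 0.01, "anger": 0.0,
            },
        },
    }
]

def dominant_emotion(face: dict) -> str:
    """Return the emotion with the highest confidence score for one face."""
    emotions = face["faceAttributes"]["emotion"]
    return max(emotions, key=emotions.get)

# [dominant_emotion(f) for f in sample_response] -> ["happiness"]
```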

We were intrigued by the faceId code: does this ID relate to an individual person (which would imply the creation of a GDPR-risky person database somewhere), or simply to the face within the image – and if we snapped the same people at different intervals, would they count as different people? It turns out the faceId just relates to the face in an individual image and does not allow tracking of an individual over time. This looks good as far as GDPR is concerned, but it also limits our ability to deduce how many unique visitors we have in a space if we are taking snapshots at regular intervals.

We had originally envisaged that facial analysis of a series of webcam images could give us metrics on headcount and dwell time. As the technology we are using requires still images captured from a webcam, we would need to take photos at regular intervals to get figures for a day.

Taking a closer look at the “emotion” JSON data reveals a range of emotional states, which when aggregated over time could give us some interesting results and raise more questions – are visitors happier on certain days of the week? Or in some galleries? Is it possible to track the emotion of individuals, albeit anonymously, during their museum experience?

In order to answer this we’d need to save these readings in a database, with each reading recorded against a location and time of day – the number of potential variables is creeping up.
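The aggregation step itself is simple once readings carry a gallery and a timestamp; a sketch with illustrative data, averaging a happiness score per gallery per day of the week:

```python
# Sketch of the aggregation step: average a "happiness" score per
# (gallery, weekday) bucket from timestamped readings. Data is illustrative.
from collections import defaultdict
from statistics import mean

readings = [
    {"gallery": "Egypt", "weekday": "Sat", "happiness": 0.8},
    {"gallery": "Egypt", "weekday": "Sat", "happiness": 0.6},
    {"gallery": "Dinosaurs", "weekday": "Mon", "happiness": 0.4},
]

def average_happiness(rows):
    """Group readings by (gallery, weekday) and average the happiness scores."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[(row["gallery"], row["weekday"])].append(row["happiness"])
    return {key: mean(vals) for key, vals in buckets.items()}

# average_happiness(readings)[("Egypt", "Sat")] is the mean of 0.8 and 0.6
```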

We would also need to do some rigorous testing to confirm the machine readings were reliable, which raises the question of how the Face API is calibrated in the first place. But as this is just an experiment, our priority is connecting the various components; fine-tuning the solution is beyond the scope of this hack.

Step 3: Data exporter 

Prometheus is the software we are using to record data over time and provide a means to query the data and make it available to incoming requests from a monitoring server. We identified the following variables that we would like to track – both to monitor uptime of each unit and also to give us useful metrics.

Essential

  • CPU gauge
  • Memory gauge
  • Disk Space gauge
  • Uptime
    • Uptime (seconds) counter
  • Services
    • Coeus_up (0/1) gauge
    • Exporter_up (0/1) gauge
  • Face count
    • current_faces (count) gauge
    • Face_id (id)
    • Total_faces (count) summary

Nice to have

  • Gender
    • Gender (male/female, 0/1) gauge
  • Age
    • Age buckets (<18, 18–65, >65) histogram
  • Dwell duration
    • Dwell_duration_seconds gauge
  • Services
    • Coeus_up (0/1) gauge
    • Exporter_up (0/1) gauge
  • Coeus
    • API_calls (count) gauge
    • API_request_time (seconds) gauge
  • Exporter
    • Exporter_scrape_duration_seconds gauge
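Prometheus scrapes these values as plain text in its exposition format; as a sketch (metric names illustrative, and a real exporter would use a client library), rendering a couple of the gauges above looks like this:

```python
# Sketch of the Prometheus text exposition format that an exporter serves
# for the monitoring server to scrape. Metric names are illustrative; a real
# exporter would typically use the official prometheus_client library.
def render_metrics(metrics: dict) -> str:
    """Render {name: (value, help_text)} gauges as Prometheus text format."""
    lines = []
    for name, (value, help_text) in metrics.items():
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"

text = render_metrics({
    "current_faces": (3, "Faces detected in the latest snapshot"),
    "coeus_up": (1, "Whether the analyser service is reachable"),
})
```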

Step 4: Data dashboard

Every data point carries a timestamp and so this data can be plotted along an axis of time and displayed on a dashboard to give a real time overview of the current situation.

Step 5: Data visualisation 

Using D3 we can overlay a graphic representing each face/datapoint back onto the camera feed. In our prototype mock-up each face is represented by a shape giving an indication of its position within the frame. Upon this we could add colour or icons illustrating any of the available data from the facial analysis.

Tools

GitHub: everything we did is openly available in this code repository: https://github.com/blackradley/coeus

Slack: we used this for collaboration during the project – great for chat and sharing documents and links, and breakout threads for specific conversations. This became the hive of the project.

Prometheus: monitoring remote hardware

Grafana: open source dashboard software

Azure: image recognition

Codepen: a code playground

D3: visualization library

Final remarks

Our aim was to get all the bits of the solution working together into a minimum viable product – to get readings from the webcam into a dashboard. With multiple devices and operating systems there could be many different approaches to this in terms of deployment methods, network considerations and options for where to host the image processing technology. We also wanted a scalable solution that could be deployed to several webcam units.

Just getting the various pieces of the puzzle working would most likely take up the whole time as we sprinted towards our MVP. As we started getting the data back it was starting to become clear that the analysis of the data would present its own problems, not just for reliability, but how to structure it and what the possibilities are – how to glean a useful insight from the almost endless tranches of timestamped data points that the system could potentially generate, and the associated testing, configuring and calibrating that the finished solution would need.

Whilst the Azure Face API will merrily and endlessly convert webcam screenshots of museum visitors to data points – the problem we face is what to make of this. Could this system count individuals over time, and not just within a picture? It seems that to do this you need an idea of how to identify an individual amongst several screen shots using biometric data, and so this would require a biometric database to be constructed somewhere to tell you if the face is new, or a repeat visitor – not something we would really want to explore given the sensitive nature of this data.

So this leaves us with data that does not resolve to the unique number of people in a space over time, but the number of people at a single moment, which when plotted over time gives something like an average – so our dashboard would feature “the average emotional state over time” or “the average gender”, since the same individual could be snapped in different emotional states.

As ever with analytical systems the learning point here is to decide exactly on what to measure and how to analyse the data before choosing the technology – which is why hackathons are so great because the end product is not business critical and our prototype has given us some food for thought.

With GDPR presenting a barrier for experimenting with the Face API, I wonder whether we might have some fun pointing it at our museum collections to analyse the emotional states of the subjects of our paintings instead?

Acknowledgements:

Thanks to Zengenti for creating / hosting the event: https://www.zengenti.com/en-gb/blog

References:

Git repo for the project: https://github.com/blackradley/coeus

https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/781745/Facial_Recognition_Briefing_BFEG_February_2019.pdf
https://iapp.org/news/a/how-should-we-regulate-facial-recognition-technology/
https://thenextweb.com/contributors/2018/10/29/heres-how-face-recognition-tech-can-be-gdpr-compliant/
https://ico.org.uk/for-organisations/in-your-sector/business/general-data-protection-regulation-gdpr-faqs-for-small-retailers/

Can neural networks help us reinterpret history?

Background

Bristol City Council publishes many types of raw data to be transparent about the information it holds, and to encourage positive projects based on this data by any citizen or organisation.

One of the most recent datasets to be published by Bristol Museums is thousands of images from the British Empire and Commonwealth (BEC) collection. You can see a curated selection of these images online in “Empire through the Lens”.

At a hackathon hosted by Bristol’s Open Data team with support from the Jean Golding Institute, attendees were encouraged to make use of this new dataset. Our team formed around an idea of using image style transfer, a process of transforming the artistic style of one image based on another using Convolutional Neural Networks.

In layman’s terms this method breaks down images into ‘content’ components and ‘style’ components, then combines them.
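This decomposition can be made a little more concrete. The classic neural style transfer objective (Gatys et al.), which approaches like this build on, optimises a generated image $x$ to match the content of image $c$ and the style of image $s$ by minimising a weighted sum of two losses:

```latex
% Total objective: balance content fidelity against style similarity
\mathcal{L}(x) = \alpha \, \mathcal{L}_{\text{content}}(x, c) + \beta \, \mathcal{L}_{\text{style}}(x, s)

% Content loss: CNN feature maps F^l at a chosen layer l should match
\mathcal{L}_{\text{content}}(x, c) = \tfrac{1}{2} \sum_{i,j} \bigl( F^{l}_{ij}(x) - F^{l}_{ij}(c) \bigr)^2

% Style loss: Gram matrices G^l_{ij} = \sum_k F^l_{ik} F^l_{jk} (feature
% correlations) should match across a set of layers, with N_l filters of
% size M_l and layer weights w_l
\mathcal{L}_{\text{style}}(x, s) = \sum_{l} \frac{w_l}{4 N_l^2 M_l^2} \sum_{i,j} \bigl( G^{l}_{ij}(x) - G^{l}_{ij}(s) \bigr)^2
```

Intuitively, the feature maps capture the ‘content’ and the feature correlations (Gram matrices) capture the ‘style’, which is why the two can be recombined from different source images.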

We hypothesised there would be value in restyling images from the dataset to draw out themes of Bristol’s economic and cultural history when it comes to Empire and Commonwealth.

The team

  • Dave Rowe – Development Technical Lead for Bristol City Council and Open Data enthusiast
  • Junfan Huang – MSc Mathematics of Cybersecurity student at the University of Bristol
  • Mark Pajak – Head of Digital at Bristol City Council Culture Team & Bristol Museums
  • Rob Griffiths – Bristol resident and Artificial Intelligence Consultant for BJSS in the South West

Aim

To assess the potential of Style Transfer as a technique for bringing attention back to historical images and exploring aspects of their modern relevance.

Method

Natalie Thurlby from the Jean Golding Institute introduced us to a method of style transfer using Lucid, a set of open source tools for working with neural networks. You can view the full Colab notebook we used here.

To start with, we hand-selected images from the collection we thought it would be interesting to transform. We tried to pair each ‘content’ image with ‘style’ images that might draw parallels with Bristol.

Dockside Cranes


A railway steam crane lowers a train engine onto a bogie on the dockside at Kilindini harbour, Mombasa, Kenya.

When we saw this image it immediately made us think of the docks at Bristol harbourside, by M Shed.

The SS Harmonides which transported the train [likely from Liverpool actually] to Kenya is just visible, docked further along the harbour.

In addition to the images, the dataset has keywords and descriptions which provide a useful way to search and filter.

[‘railway’, ‘steam’, ‘crane’, ‘lower’, ‘train’, ‘engine’, ‘bogie’, ‘dockside’, ‘Kilindini’, ‘harbour’, ‘Mombasa’, ‘Kenya’]
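These keyword lists make simple filtering straightforward; as a sketch (with illustrative records, not the real dataset schema), selecting images whose keywords mention any harbour-related terms might look like this:

```python
# Sketch of keyword-based filtering over the image records.
# Records and field names here are illustrative, not the dataset's schema.
records = [
    {"id": "BEC-001", "keywords": ["railway", "crane", "dockside", "Mombasa", "Kenya"]},
    {"id": "BEC-002", "keywords": ["carnival", "dance", "Nevis"]},
]

def filter_by_keywords(rows, wanted):
    """Return records whose keyword list shares any term with `wanted`."""
    wanted = {w.lower() for w in wanted}
    return [r for r in rows if wanted & {k.lower() for k in r["keywords"]}]

# filter_by_keywords(records, ["harbour", "dockside"]) keeps the Mombasa record
```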


We liked this painting by Mark Buck called the Cranes of Bristol Harbour. It says online that Mark studied for a degree in illustration at Bower Ashton Art College in Bristol, not too far from this place.

This image was created by feeding the previous two images into the style transfer engine.


We drew an obvious parallel here between these two sets of cranes in ports around the world. The Bristol cranes are from the 1950s, but the Kenya photo was taken much earlier, in the 1920s. It would be interesting to look more deeply at the cargo flows between these two ports during the 19th century.

Cliftonwood Palace


This is a view of the Victoria Memorial, Kolkata, India in 1921.

It was commissioned by Lord Curzon to commemorate the death of Queen Victoria.

We were struck by the grandeur and formality of the photo.

Key words: [‘Victoria’, ‘Memorial’, ‘Kolkata’, ‘India’, ‘1921’] – see “Topic Modelling” below


A photo of the colourful Victorian terraces of Cliftonwood from the river, which have their own sense of formality.

The architectural significance of these buildings in their locales, and the link to Queen Victoria, are small parallels.

It’s funny how the system seemingly tries to reconstruct the grand building using these houses as colourful building blocks, but it ends up making it look like a shanty town.

This image was created by machine intelligence by taking an historical photograph and applying a style gleaned from a Bristol cityscape.

Caribbean Carnival


Carnival dancers on Nevis, the island in the Caribbean Sea, in 1965.

Two men perform a carnival dance outdoors, accompanied by a musical band. Both dancers wear crowns adorned with peacock feathers and costumes made from ribbons and scarves.

Key words: [‘perform’, ‘carnival’, ‘dance’, ‘outdoors’, ‘accompany’, ‘musical’, ‘dancer’, ‘crown’, ‘adorn’, ‘peacock’, ‘feather’, ‘costume’, ‘ribbon’, ‘scarf’, ‘Nevis’]

St Pauls Carnival is an annual African-Caribbean carnival held, usually on the first Saturday of July, in St Pauls, Bristol.

We selected this picture to see how the system would handle the colourful feathers and sequined outfits.

The resulting image (below) was somewhat abstract but we agreed was transformed by the vibrant colours and patterns of movement.

Festival colours reimagine an historical photograph using machine intelligence – but is this a valid interpretation of the past or an abstract and meaningless picture?


Service Design

We set out potential benefits of our service:

  • A hosted online service to make the process more efficient
  • Advice and tips on how to calibrate and get the best results from Style Transfer
  • Ability to process images in bulk
  • Interactive ways of browsing the dataset
  • Communication tools for publishing and sharing results
  • Interfaces for public engagement with the tool – a Twitter conversational bot

On the first day we started putting together ideas for how a web service might be used to take source images from the Open Data Platform and automate the style transfer process.

This caused us to think about potential users of the system and what debate might be sparked from the resulting images.

Proposition Design

A key requirement for all users would be the ability to explore and see the photographs in their original digitised form, with the available descriptions and other metadata. Those particularly interested in exploring the underlying data would appreciate having search and filter facilities that made use of fields such as location, date, and descriptions.

We would also need a simple way of choosing a set of photographs, without getting in the way of being able to continue to discover other photos. A bit like in an online shopping scenario where you add items to a basket.

The users could then choose a style to apply to their chosen photos. This would be a selection of Bristol artworks, or iconic scenes. For those wanting to apply their own style (artists, for example) we would give an option to upload their own artwork and images.

Depending on processing power, we know that such an online service might not be able to apply style transforms quickly enough for people to wait. If the waiting time exceeded a couple of minutes, the results could be delivered by email instead.
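That decision could be made up front from a rough estimate of processing time. A minimal sketch (the threshold and per-image estimate here are illustrative assumptions, not measured figures):

```python
# Sketch: choose synchronous or email delivery based on estimated wait time.
# The threshold and per-image cost are assumptions for illustration.
WAIT_THRESHOLD_SECONDS = 120  # "a couple of minutes"

def estimate_seconds(num_images, seconds_per_image=45):
    """Rough estimate of total style-transfer time for a batch."""
    return num_images * seconds_per_image

def delivery_mode(num_images):
    if estimate_seconds(num_images) <= WAIT_THRESHOLD_SECONDS:
        return "inline"  # show results on the page while the user waits
    return "email"       # queue the job and email a link when it finishes
```

A single image would be shown inline, while a bulk job would quietly drop into a queue and notify the user later.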

Components

Spin-off products: Topic Modelling

We even successfully built a crucial component of our future service. The metadata surrounding the images includes both keywords and descriptive text. Junfan developed a script that analysed the metadata to provide a better understanding of the range of keywords that could be used to interrogate the images. This could potentially be used in the application to enable browsing by subject.

We wanted to generate a list of keywords from the long-form text captions that accompanied the images. This would allow us to classify pictures using their descriptions. Users could then select topics and retrieve the pictures they want.

Here in topic 2, our model has grouped bridge, street, river, house, gardens and similar words together.

Python is the language of choice for this particular application.
Topic modelling reveals patterns of keyword abundance amongst the captions.
Keywords extracted from the captions can help us build an interface to allow filtering on a theme.
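The first step of any approach like this is pulling candidate keywords out of the captions. Our actual script used topic modelling, but the underlying idea can be sketched with nothing more than the standard library (the stopword list here is a tiny illustrative one):

```python
# Simplified sketch of caption keyword extraction. The real script used topic
# modelling; this just tallies non-stopword tokens with the standard library.
import re
from collections import Counter

STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "at", "to", "with", "is"}

def extract_keywords(captions, top_n=5):
    """Return the most frequent non-stopword tokens across all captions."""
    counts = Counter()
    for caption in captions:
        for token in re.findall(r"[a-z]+", caption.lower()):
            if token not in STOPWORDS:
                counts[token] += 1
    return [word for word, _ in counts.most_common(top_n)]

captions = [
    "A bridge over the river near the gardens",
    "Houses on the street by the river",
]
```

Running this over the two sample captions surfaces “river” as the dominant shared term, which is exactly the kind of grouping the topic model produced for topic 2.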

Reflections

After generating many examples we came together to discuss some of the ethical and legal implications of this technique.

We were particularly mindful of the fact that any discussion of Empire and Commonwealth should be treated with sensitivity. For each image, it’s challenging both to appreciate fully the context and not to project novelty or inappropriate meaning onto it.

We wondered whether this form of style transfer with heritage images was an interesting technique for people who have something to say and want an eye-catching way of communicating, but not a technique that should be used lightly – particularly with this dataset.

We often found ourselves coming back to discussions of media rights and intellectual property. None of us have a legal background but we were aware that, while we wanted to acknowledge where we had borrowed other people’s work to perform this experiment, we were generating new works of art – and it was unclear where the ownership lay.

Does this have potential?

We thought, on balance, yes this was an interesting technique for both artistic historians and artists interested in history.

We imagined their needs using the following user personas:

  • Artistic Historians: ‘I want to explore the stories behind these images and bring them to life in a contemporary way for my audience.’
  • Artists interested in history: ‘I want a creative tool to provide inspiration and see what my own personal, artistic style would look like applied to heritage images’.

We spent time scoping ways we could turn our work so far into a service to support these user groups.

References & Links

  • The repo for our application: https://github.com/xihajun/Art-vs-History-Open-Data-Hackathon-Code
  • Open data platform: https://opendata.bristol.gov.uk/pages/homepage/
  • Bristol Archives (British Empire and Commonwealth Collection): https://www.bristolmuseums.org.uk/bristol-archives/whats-at/our-collections/

Acknowledgements

Thanks to Bristol Open for co-ordinating the Hackathon.

Thanks to Lucid contributors for developing the Style Transfer code.

Thanks to the following artists for source artwork:

Mark Buck: https://www.painters-online.co.uk/artist/markbuck

Ellie Pajak

https://www.etsy.com/shop/PapierBeau?section_id=21122286

Testing museum gallery mobile interpretation

smartify logo

Over the next few weeks we are running user testing of SMARTIFY at M Shed. This app provides visitors with extra information about museum objects using image recognition to trigger the content on a mobile device.

To install the free app use this link: https://smartify.org/

If you have used the app at M Shed, please could you take a few moments to complete the following survey: https://www.surveymonkey.co.uk/r/ZVTVPW9

If you would like to help further, please get in touch with our volunteer co-ordinator: https://www.bristolmuseums.org.uk/jobs-volunteering/

 

How to nail it in Team Digital by turning it off.

This post is about my recent week of reducing screen time to a minimum. I was seeking a fresh approach after losing the plot deep in some troublesome code, being overloaded by an email avalanche and pestered by projects going stale. In other words: have you tried turning it off? (and not on again!)

STEP 1: TURN OFF PC

Guys this is what a computer looks like when it is off

Kinda feels better already. No more spinning cogs, no more broken code, brain starting to think in more creative ways, mind generally feeling lighter. A trip to the stationery cupboard to stock up on Post-its and sticky things; on the way I speak to a colleague whom I wouldn't usually encounter and gain an insight into the user-facing end of a project I am currently working on (I make a mental note of that).

STEP 2: RECAP ON AGILE METHODS

Agile Service Delivery concept
a great diagram about agile processes by Jamie Arnold

(admittedly you do need to turn the computer back on from here onwards, but you get the idea!)

The team here have just completed Scrum training and we are tasked with scratching our heads over how to translate this to our own working practices. I was particularly inspired by this diagram and blog by Jamie Arnold from GDS explaining how to run projects in an agile way. I am especially prone to wanting to see things in diagrams, and this tends to be suppressed by too much screen time 🙁

“a picture paints a thousand words.”

Also for projects that are stalled or for whatever reason on the backburner – a recap (or even retrospective creation) on the vision and goals can help you remember why they were once on the agenda in the first place, or if they still should be.

STEP 3: FOCUS ON USER NEEDS

It is actually much easier to concentrate on user needs with the computers switched off. Particularly in the museum, where immediately outside the office are a tonne of visitors getting on with their lives, interacting with our products and services, for better or worse. Since several of our projects involve large-scale transformation of museum technology, mapping out how the user need is achieved from the range of possible technologies is useful. This post on mapping out the value chain explains one method.

Mapping the value chain for donation technology

Whilst the resulting spider-web can be intimidating, it certainly helped identify some key dependencies like power and wifi (often overlooked in museum projects but then causing serious headaches down the line) as well as where extra resource would be needed in developing new services and designs that don't yet come ‘off the shelf’.

STEP 4: DISCOVERING PRODUCT DISCOVERY

There is almost always one, or more like three, of our projects in the discovery phase at any one time, and this video from Teresa Torres on product discovery explains how to take the focus away from features and think more about outcomes, but also how to join the two in a methodical way – testing many solutions at once to analyse different ways of doing things.

We are a small multidisciplinary team, by which I mean we each need to take on several disciplines at once: user research, data analysis, coding, system admin, content editing, online shop order fulfilment (yes, you heard that right) and so on. However, it is always interesting to hear from those who can concentrate on a single line of work. With resources stretched we can waste time going down the wrong route, but we can and do collaborate with others to experiment on new solutions. Our ongoing “student as producer” projects with the University of Bristol have been a great way for us to get insights in this way at low risk whilst helping to upskill a new generation.

STEP 5: GAMIFY THE PROBLEM

Some of the hardest problems are those involving potential conflict between internal teams. These are easier to ignore than fix, and therefore won't get fixed by business as usual; they just linger and manifest, continuing to cause frustration.

Matt Locke explained it elegantly in MCG's Museums+Tech 2018: the collaborative museum. And this got me thinking about how to attempt to align project teams that run on totally different rhythms and technologies. Last week I probably would have tried to build something in Excel or web-based tech that visualised resources over time, but no, not this week. This week I decided to use ducks!

Shooting ducks on a pinboard turned out to be a much easier way to negotiate resources and was quicker to prototype than any amount of coffee and coding (it's also much easier to support 😉). It was also clear that Google Sheets or project charts weren't going to cut it for this particular combination of teams, because each had its own way of doing things.

The challenge was to see how many weeks in a year would be available after a team had been booked for known projects. The gap analysis can be done at a glance – we can now discuss the blocks of free time for potential projects and barter for ducks, which is more fun than email crossfire. The problem has become a physical puzzle where the negative space (illustrated by red dots) is much more apparent than it was by cross-referencing data squares against calendars. It's also taken the underlying agendas across departments out of the picture and helped us all focus on the problem by playing the same game – helping to synchronise our internal rhythms.

REMARKS

It may have come as a surprise for colleagues to see their digital people switch off and reach for analogue tools, kick back with a pen and paper and start sketching or shooting ducks, but to be honest it's been one of the most productive weeks in recent times, and we have new ideas about old problems.

Yes, many bugs still linger in the code, but rather than hunting every last one to extinction, with the benefit of a wider awareness of the needs of our users and teams, maybe we just switch things off and concentrate on building what people actually want?


Integrating Shopify with Google Sheets (magic tricks made to look easy)

In Team Digital we like to make things look easy, and in doing so we hope to make life easier for people. A recent challenge has been how to recreate the Top sales by product analysis from the Shopify web application in Google Sheets, so we can see how the top 10 selling products compare month by month. Creating a monthly breakdown of product sales had up until now been a manual task of choosing from a date picker, exporting data, copying to Google Sheets, and so on.

Having already had some success pushing and pulling data to Google Sheets using Google Apps Script and our Culture Data platform, we decided to automate the process. The goal was to simplify the procedure of getting the sales analysis into Google Sheets, making it as easy as possible for the user – all they should need to do is select the month they wish to import.

We have developed a set of scripts for extracting data using the Shopify API, but needed to decide how to get the data into Google Sheets. Whilst there is a library for pushing data from a Node application into a worksheet, our trials found it to be slow and prone to issues where the sheet did not have enough rows, among other unforeseen circumstances. Instead, we performed our monthly analysis on the Node server and saved this to a local database. We then built an API for that database that could be queried by shop and by month.

The next step, using Google Apps Script, was to query the API and pull in a month's worth of data, then save this to a new sheet named after the month. This could then be added as a macro so that it was accessible from the toolbar – a familiar place for the user, at their command.

As the data is required on a monthly basis, we need to schedule the server side analysis to save a new batch of data after each month – something we can easily achieve with a cron job. The diagram below shows roughly how the prototype works from the server side and google sheets side. Interestingly, the figures don’t completely match up to the in-application analysis by Shopify, so we have some error checking to do, however we now have the power to enhance the default analysis with our own calculations, for example incorporating the cost of goods into the equation to work out the overall profitability of each product line.
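The server-side analysis itself boils down to grouping line items by month and totalling quantities per product. Our actual implementation runs in Node, but the aggregation step can be sketched in Python (the order and line-item shapes below are simplified assumptions, not Shopify's exact API schema):

```python
# Sketch: monthly "top sales by product" analysis. Order and line-item
# shapes are simplified assumptions, not Shopify's exact API schema.
from collections import defaultdict

def top_products(orders, month, top_n=10):
    """Total quantity sold per product for orders created in `month` (YYYY-MM)."""
    totals = defaultdict(int)
    for order in orders:
        if order["created_at"].startswith(month):
            for item in order["line_items"]:
                totals[item["sku"]] += item["quantity"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

orders = [
    {"created_at": "2019-03-02", "line_items": [{"sku": "MUG-01", "quantity": 3}]},
    {"created_at": "2019-03-15", "line_items": [{"sku": "MUG-01", "quantity": 1},
                                                {"sku": "TOTE-02", "quantity": 2}]},
    {"created_at": "2019-04-01", "line_items": [{"sku": "TOTE-02", "quantity": 5}]},
]
```

Running this once per month from the scheduled job gives exactly the per-month, per-shop slice that the Google Sheets side pulls in.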

 

 

Preserving the digital

From physical to digital to…?

At Bristol Culture we aim to collect, preserve and create access to our collections for use by present and future generations. We are increasingly dealing with digital assets amongst these collections – from photographs of our objects, to scans of the historical and unique maps and plans of Bristol, to born-digital creations such as 3D scans of our Pliosaurus fossil. We are also collecting new digital creations in the form of video artwork.

Photo credit Neil McCoubrey

One day we won’t be able to open these books because they are too fragile – digital will be the only way we can access this unique record of Bristol’s history, so digital helps us preserve the physical and provides access. Inside are original plans of Bristol’s most historic and well-known buildings, including the Bristol Hippodrome, which require careful unfolding and digital stitching to reproduce the image of the full drawing inside.

Plans of the Hippodrome, 1912. © Bristol Culture

With new technology comes new opportunities to explore our specimens and this often means having to work with new file types and new applications to view them.  

This 3D scan of our Pliosaurus jaw allows us to gain new insights into the behavior and biology of this long-extinct marine reptile.

Horizon © Thompson & Craighead

This digital collage by Thompson & Craighead features streaming images from webcams in the 25 time zones of the world. The work comes with a Mac mini and a USB drive in an archive box and can be projected or shown on a 42″ monitor. Bristol Museum is developing its artist film and video collection and now holds 22 videos by artists including Mariele Neudecker, Wood and Harrison, Ben Rivers, Walid Raad and Emily Jacir, ranging from documentary to structural film, performance, web-based film and video and animation, in digital, video and analogue film formats, with accompanying installations.

What could go wrong?

So digital assets are helping us conserve our archives, explore our collections and experience new forms of art, but how do we look after those assets for future generations?

It might seem like we don’t need to worry about that now, but as time goes by there is constant technological change: hardware becomes unusable or non-existent, software changes, and the very 1s and 0s that make up our digital assets can deteriorate by a process known as bit rot. Additionally, just as is the case for physical artefacts, the information we know about them, including provenance and rights, can become dissociated. What’s more, the digital assets can and must multiply, move and adapt to new situations, new storage facilities and new methods of presentation. Digital preservation is the combination of procedures, technology and policy that we can use to prevent these risks from rendering our digital repository obsolete. We are currently in the process of upskilling staff and reviewing how we do things so that we can be sure our digital assets are safe and accessible.

Achieving standards

It is clear we need to develop and improve our strategy for dealing with these potential problems, and that this strategy should underpin all digital activity that produces output we wish to preserve and keep. To address this, staff at Bristol Archives, alongside Team Digital and Collections, got together to write a digital preservation policy and roadmap to ensure that preserved digital content can be located, rendered (opened) and trusted well into the future.

Our approach to digital preservation is informed by guidance from national organisations and professional bodies including The National Archives, the Archives & Records Association, the Museums Association, the Collections Trust, the Digital Preservation Coalition, the Government Digital Service and the British Library. We will aim to conform to the Open Archival Information System (OAIS) reference model for digital preservation (ISO 14721:2012). We will also measure progress against the National Digital Stewardship Alliance (NDSA) levels of digital preservation.

A safe digital repository

We use EMu for our digital asset management and collections management systems. Any multimedia uploaded to EMu is automatically given a checksum, and this is stored in the database record for that asset. What this means is that if for any reason that file should change or deteriorate (which is unlikely, but the whole point of digital preservation is to have a mechanism to detect if this should happen) the new checksum won’t match the old one and so we can identify a changed file.

Due to the size of the repository, which is currently approaching 10TB, it would not be practical to do this manually, so we use a scheduled script to pass through each record and generate a new checksum to compare with the original. The trick here is to make sure that the whole repo gets scanned in time for the next backup period, because otherwise any missing or degraded files would become the backup and therefore obscure the original. We also need a working relationship with our IT providers and an agreed procedure to rescue any lost files if this happens.
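The core of that scheduled script is simple: re-hash each file and compare against the stored value. A minimal sketch (the folder walk and the stored-checksum lookup below are placeholders for the EMu database, and MD5 is used purely for illustration):

```python
# Sketch: scheduled checksum verification over a repository folder.
# The stored-checksum dict stands in for the EMu database records.
import hashlib
from pathlib import Path

def file_checksum(path, chunk_size=1 << 20):
    """MD5 of a file, read in chunks so large assets don't exhaust memory."""
    md5 = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()

def find_changed(repo_root, stored):
    """Yield filenames whose current checksum no longer matches the stored one."""
    for path in Path(repo_root).rglob("*"):
        if path.is_file():
            current = file_checksum(path)
            if stored.get(path.name) not in (None, current):
                yield path.name
```

Any name this generator yields is a file that has silently changed since ingest, which is exactly the event the backup-window timing above is designed to catch.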

With all this in place, we know that what goes in can come back out in the same state – so far so good. But what we can't control is the constant change in technology for rendering files – how do we know that the files we are archiving now will be readable in the future? The answer is that we don't, unless we can migrate from out-of-date file types to new ones. A quick analysis of all records tagged as ‘video’ shows the following diversity of file types:

(See the stats for images and audio here).  The majority are mpeg or avi, but there is a tail end of various files which may be less common and we’ll need to consider if these should remain in this format or if we need to arrange for them to be converted to a new video format.
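Producing that kind of tally is a one-liner once you have the filenames. As a sketch (the list below stands in for an export from the collections database):

```python
# Sketch: tally file extensions across asset records to gauge format
# diversity. The filename list stands in for a collections-database export.
from collections import Counter

def extension_counts(filenames):
    """Count lower-cased file extensions, ignoring names without one."""
    return Counter(name.rsplit(".", 1)[-1].lower()
                   for name in filenames if "." in name)

videos = ["talk.mpeg", "opening.avi", "scan.mov", "loop.avi"]
```

Sorting the resulting counts immediately shows the dominant formats and the long tail of rarities that are candidates for migration.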

Our plan is to make gradual improvements in our documentation and systems in line with the NDSA to achieve level 2 by 2022:

 

The following dashboard gives an idea of where we are currently in terms of file types and the rate of growth:

Herding digital sheep

It's all very well having digital preservation systems in place, but the staff culture and working practices must also change and integrate with them.

The digitisation process can involve lots of stages and create many files

In theory, all digital assets should line up and enter the digital repository in an orderly and systematic manner. However, we all know that in practice things aren’t so straightforward.

Staff involved in digitisation and quality control need the freedom to work with files in the applications and hardware they are used to, without being hindered by rules and convoluted ingestion processes. They should be allowed to work in a messy (to outsiders) environment, at least until the assets are finalised. There are also many other environmental factors that affect working practices, including rights issues, time pressures from exhibition development, and the skills and tools available to get the job done. By layering new limitations on top in the name of digital preservation, we risk designing a system that won't be adopted, as illustrated in the following tweet by @steube:

So we'll need to think carefully about how we implement any new procedures that may increase the workload of staff. Ideally, we'll reduce the time staff spend moving files around by using designated folders for multimedia ingestion – these would be visible to the digital repository and act as “dropbox” areas which automatically get scanned, with any files automatically uploaded and then deleted. For this process to work, we'll need to name files carefully so that once uploaded they can be digitally associated with the corresponding catalogue records created as part of any inventory project. Having a 24-hour ingestion routine would solve many of the complaints we hear from staff about waiting for files to upload to the system.
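The filename-to-record association at the heart of that routine could work something like this (the naming convention and catalogue-number format below are assumptions for illustration, not our agreed standard):

```python
# Sketch: pair files from a "dropbox" ingestion folder with catalogue
# records by filename. The naming convention (e.g. 'K1234-01.tif', where
# 'K1234' is the catalogue number) is an assumed one, not an agreed standard.
import re

def match_to_catalogue(filenames, catalogue_numbers):
    """Split incoming files into (matched-to-record, needs-human-attention)."""
    matched, unmatched = {}, []
    known = set(catalogue_numbers)
    for name in filenames:
        m = re.match(r"([A-Z]\d+)", name)
        if m and m.group(1) in known:
            matched[name] = m.group(1)
        else:
            unmatched.append(name)  # anomaly: queue for human clean-up
    return matched, unmatched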

 

Automation can help but will need a human element to clean up any anomalies

 

Digital services

Providing user-friendly, online services is a principle we strive for at Bristol Culture – and access to our digital repository for researchers, commercial companies and the public is something we need to address.

We want to be able to recreate the experience of browsing an old photo album using gallery technology. This interactive uses the Turn JS open source software to simulate page turning on a touchscreen featuring in Empire Through the Lens at Bristol Museum.

Visitors to the search room at Bristol Archives have access to the online catalogue as well as knowledgeable staff to help them access the digital material. This system relies on having structured data in the catalogue, and scripts which can extract the data and multimedia and package them up for the page turning application.

But we receive enquiries and requests from people all over the world, in some cases from different time zones which makes communication difficult. We are planning to improve the online catalogue to allow better access to the digital repository, and to link this up to systems for requesting digital replicas. There are so many potential uses and users of the material that we’ll need to undertake user research into how we should best make it available and in what form.

 

Going digital with our Exhibition Scheduling Timeline

 

 

developing a digital timeline for scheduling exhibitions

BACKGROUND

Having a visual representation of upcoming exhibitions, works, and major events is important in the exhibition planning process. Rather than relying on spotting clashing dates in lists of data, having a horizontal timeline spread out visually allows for faster cross-checking and helps teams collaboratively decide how to plan for exhibition installs and derigs.

 

Until recently we had a system that used Excel to plan out this timeline; by merging cells and colouring horizontally it was possible to manually construct a timeline. Apart from the pure joy that comes from printing anything from Excel, this method had a number of limitations.

  • When dates changed the whole thing needed to be rejigged
  • Everyone who received a printed copy at meetings stuck it to the wall, so date changes were hard to communicate.
  • We needed to see the timeline over different scales – short term and long term – which meant using two separate Excel tabs, hence duplication of effort.
  • We were unable to apply any permissions
  • The data was not interoperable with other systems

TIMELINE SOFTWARE (vis.js)

Thanks to Almende B.V. there is an open source timeline code library available at visjs.org/docs/timeline, so this offered a neat solution to the manual task of recasting the timeline with creative Excel skills each time. We already had a database of exhibition dates following our digital signage project, so this was the perfect opportunity to reuse that data – it should be the most up-to-date version of planned events, as it is what we display to the public in our venues.

IMPLEMENTATION

The digital timeline was implemented using MEAN stack technology and combines data feeds from a variety of sources. In addition to bringing in data for agreed exhibitions, we wanted a flexible way to add installations, derigs, and other notes and so a new database on the node server combines these dates with exhibitions data. We can assign permissions to different user groups using some open source authentication libraries and this means we can now release the timeline for staff not involved in exhibitions, but also let various teams add and edit their own specific timeline data.
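The vis.js Timeline consumes a flat list of items with `id`, `group`, `content`, `start` and `end` fields. The merging step on our Node server can be sketched in Python (the exhibition and block field names below are assumptions, not our exact schema):

```python
# Sketch: merge exhibition records and manually added blocks into the item
# format the vis.js Timeline expects. Input field names are assumptions.
def to_timeline_items(exhibitions, manual_blocks):
    items = []
    for i, ex in enumerate(exhibitions):
        items.append({
            "id": f"ex-{i}",
            "group": ex["gallery"],   # one horizontal track per gallery
            "content": ex["title"],
            "start": ex["start"],
            "end": ex["end"],
        })
    for i, block in enumerate(manual_blocks):
        items.append({
            "id": f"blk-{i}",
            "group": block["gallery"],
            "content": block["kind"],  # "install", "derig" or "provisional date"
            "start": block["start"],
            "end": block["end"],
        })
    return items
```

Serving this list as JSON lets the browser-side vis.js code render each gallery as its own track with exhibitions and install/derig blocks side by side.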

The great thing about vis is the ease of manipulating the timeline: users are able to zoom in and out, and move backwards and forwards in time, using mouse, arrow keys or touch/pinch gestures.

 

Zoomed out view for the bigger picture

Zoomed in for the detail…

EMU INTEGRATION

The management of information surrounding object conservation, loans and movements is fundamental to successful exhibition development and installation. As such we maintain a record of exhibition dates in EMu, our collections management software. The EMu events module is used to record when exhibitions take place, and also holds the object list where curators select and deselect objects for exhibition. Using the EMu API we are able to extract a structured list of exhibitions information for publishing to the digital timeline.

HOW OUR TIMELINE WORKS

Each gallery or public space has its own horizontal track where exhibitions are published as blocks. These are grouped into our 5 museums and archives buildings and can be selected/deselected from the timeline to cross-reference each. Once logged in, a user is able to manually add new blocks to the timeline, pre-set to “install”, “derig” or “provisional date”. Once a block is added, our exhibitions team are able to add notes that are accessible on clicking the block. It is also possible to reorder and adjust dates by clicking and dragging.

IMPACT

The timeline now means everyone has access to an up-to-date picture of upcoming exhibition installations, so no one is out of date. The timeline is on a public platform and is mobile accessible, so staff can access it on the move, in galleries or at home. Less time is spent on creative Excel manipulation and more on spotting errors. It has also made scheduling meetings more dynamic, allowing better cross-referencing and moving to different positions in time. An unexpected effect is that we are spotting more uses for the solution and are currently investigating using it for booking rooms and resources. There are some really neat things we can do, such as importing a data feed from the timeline back into our MS Outlook calendars (“oooooh!”). The addition of the thumbnail pictures used to advertise exhibitions has been a favourite feature among staff and really helps give an instant impression of current events, since it reinforces the exhibition branding which people are already familiar with.

ISSUES

It is far from perfect! Several iterations were needed to develop the drag and drop feature for adding events. Also, we are reaching diminishing returns in terms of performance – with more and more data available to plot, the web app is performing slowly and could do with further optimisation to improve speed. And due to our IT infrastructure, many staff use Internet Explorer; whilst the timeline works OK, many features are broken on this browser without changes to compatibility and caching settings in IE.

WHAT’S NEXT

Hopefully optimisation will improve performance and then it is full steam ahead with developing our resource booking system using the same framework.

 

 

How we did it: automating the retail order forms using Shopify.

*explicit content warning* this post makes reference to APIs.

THE PROBLEM: Having set ourselves the challenge of improving the buying process, our task in Team Digital was to figure out where we can do things more efficiently and smartly. Thanks to our implementation of Shopify, we have no shortage of data on sales to help with this. However, the process of gathering the information required to place an order for more stock is time consuming – retail staff need to manually copy and paste machine-like product codes, look up supplier details and compile fresh order forms each time, all the while taking attention away from what really matters, i.e. which products are currently selling, and which are not.

In a nutshell, the problem can be addressed by creating a specific view of our shop data – one that combines the cost of goods with the inventory quantity (amount of stock left), factors in a specific period of time, and can be combined with supplier information so we know who to order each top-selling product from, without having to look anything up. We were keen to get into the world of Shopify development, and thanks to the handy Shopify developer programme documentation and API help it was fairly painless to get a prototype up and running.

SETTING UP: We first had to understand the difference between public and private apps with Shopify. A private app can be hard-coded to speak to a specific shop, whereas a public app needs to be able to authenticate on the fly against any shop. On that basis we felt a private app was the way to go, at least until we know it works!

Following this, and armed with the various passwords and keys needed to programmatically interact with our store, the next step was to develop a query to give us the data we needed, then to automate the process and present the results in a meaningful way. By default Shopify provides its data as JSON, which is nice, if you are a computer.
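To illustrate what that JSON looks like and why a program copes with it better than a human, here is a simplified, assumed shape of a product payload (not the exact fields the API returns):

```javascript
// A simplified, assumed shape of a Shopify-style product payload –
// illustrative only, not the exact structure the API returns.
const payload = JSON.stringify({
  products: [
    {
      id: 1,
      title: 'Museum Guidebook',
      vendor: 'Acme Books',
      product_type: 'Books',
      variants: [{ id: 11, sku: 'GB-001', price: '5.00', inventory_quantity: 12 }],
    },
  ],
});

// Two lines of code turn the raw text into something a program can work with.
const products = JSON.parse(payload).products;
const vendorOfFirst = products[0].vendor;
```

As raw text it is hard to scan, but once parsed each field (vendor, SKU, stock level) is directly addressable.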

TECHNICAL DETAILS: We set up a cron job on an AWS virtual machine running Node and MongoDB, using the MEAN stack framework plus some open source libraries to integrate with Google Sheets and, notably, to handle asynchronous processes in a tidy way. If you’d like to explore the code, it’s all here. In addition to the scheduled tasks, we also built an AngularJS web client which allows staff to run reports manually and change some settings.

Which translates as: In order to process the data automatically, we needed a database and computer setup that would allow us to talk to Shopify and Google Docs, and to run at a set time each day without human intervention.
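The "run at a set time each day without human intervention" part can be sketched in plain Node – the real system used a cron job on the server, so `scheduleDaily` and the chosen hour here are illustrative only:

```javascript
// Sketch: work out how long to wait until the next daily run at a given hour.
// (The production setup used a cron job; this shows the same idea in plain Node.)
function msUntilNextRun(now, hour) {
  const next = new Date(now);
  next.setHours(hour, 0, 0, 0);
  if (next <= now) next.setDate(next.getDate() + 1); // today's slot already passed
  return next - now;
}

// Run the task at the given hour, then re-arm for the following day.
function scheduleDaily(hour, task) {
  setTimeout(() => {
    task();
    scheduleDaily(hour, task);
  }, msUntilNextRun(new Date(), hour));
}
```

A cron expression like `0 7 * * *` expresses the same schedule in one line, which is why cron won out in practice.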

The way that Shopify works means we couldn’t develop a single query to do the job in one go, as you might in SQL (a traditional database language). There are also limits on how many times you can query the store. What emerged from our testing was a series of steps – an algorithm doing multiple data extractions and recombinations – which I’ll attempt to describe here. P.S. do shout if there is an easier way to do this ;).

STEP 1: Get a list of all products in the store. We’ll need these to know which supplier each product comes from, and the product types might help in further analysis.

STEP 2: Combine results of step one with the cost of goods. This information lives in a separate app and needs to be imported from a csv file. We’ll need this when we come to build our supplier order form.

STEP 3: Get a list of all orders within a certain period. This bit is the crucial factor in understanding what is currently selling. Whilst we do this, we’ll add in the data from the steps above so we can generate a table with all the information we need to make an order.

STEP 4: Count how many sales of each product type have taken place. This converts our list of individual transactions into a list of products with a count of sales. This uses the MongoDB aggregation pipeline and is what turns our raw data into something more meaningful. It looks a bit like this (just so you know):
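A pipeline of roughly this shape would do the grouping – the field names (`line_items`, `variant_id`, `quantity`) are assumed here rather than copied from our actual code – with a plain-JS equivalent alongside to show what the aggregation computes:

```javascript
// Assumed shape of the aggregation: unwind each order's line items,
// then group by product variant, summing the quantities sold.
const pipeline = [
  { $unwind: '$line_items' },
  { $group: {
      _id: '$line_items.variant_id',
      amount_sold: { $sum: '$line_items.quantity' },
  } },
];

// The same grouping in plain JS, run against in-memory data for illustration.
function countSales(orders) {
  const totals = {};
  for (const order of orders) {
    for (const item of order.line_items) {
      totals[item.variant_id] = (totals[item.variant_id] || 0) + item.quantity;
    }
  }
  return totals;
}
```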

STEP 5: Add the data to a Google Sheet. What luck – there is some open source code we can use to hook our Shopify data up to Google. A few steps are needed for the Google Sheet to talk to our data – we basically have our server act as a Google user and share editing access with him, or her. And since we are beginning to personify this system, we are calling it ‘Stockify’, the latest member of Team Digital – however Zak prefers the lofty moniker Dave.

The result is a table of top-selling products in the last x days, with x being a variable we can control. The whole process takes quite a few minutes, especially if x > 60, due to limitations of each integration – you can only add a new line to a Google Sheet once per second, and there are over 500 lines. The great thing about our app is that he/she doesn’t mind working at night, early in the morning, at weekends or at other times when retail managers probably shouldn’t be looking at sales stats, but probably are. With Stockify/Dave scheduled for 7am each morning, we know that when staff look at the data to do the ordering it will be an up-to-date assessment of the last 60 days’ worth of sales.

We now have the following columns in our Google Sheet. Some come directly from their corresponding Shopify table, whereas others are calculated on the fly to give us a unique view of our data, and one we can gain new insights from.

  • product_type: (from the product table)
  • variant_id: (one product can have many variants)
  • price: (from the product table)
  • cost_of_goods: (imported from a csv)
  • order_cost: (cost_of_goods * amount sold)
  • sales_value: (price * amount sold)
  • name: (from the product table)
  • amount sold: (transaction table compared to product table / time)
  • inventory_quantity: (from the product table)
  • order_status: (if inventory_quantity < amount sold /time)
  • barcode: (from the product table)
  • sku: (from the product table)
  • vendor: (from the product table)
  • date_report_run: (so we know if the scheduled task failed)
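The calculated columns in the list above can be sketched as one small function – field names follow the column list, while `buildRow` itself is a hypothetical helper, not our actual code:

```javascript
// Build one report row from a product variant and its sales count for the period.
// order_cost and sales_value are derived; order_status mirrors the
// "inventory_quantity < amount sold" rule from the column list.
function buildRow(variant, amountSold) {
  return {
    variant_id: variant.variant_id,
    amount_sold: amountSold,
    order_cost: variant.cost_of_goods * amountSold,
    sales_value: variant.price * amountSold,
    order_status: variant.inventory_quantity < amountSold ? 're-order' : '',
  };
}
```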

TEST, ITERATE, REFINE: For the first few iterations it failed some basic sense checking – not enough data was coming through. This turned out to be because we were running queries faster than the Shopify API would supply the data, so transactions were missing. We fixed this with some loopy code, and now we are tweaking the period of time we wish to analyse – too short and we miss some important items; for example, if a popular book hasn’t sold in the last x days it might not be picked up in the sales report. We also need to factor in things like half term, Christmas and other festivals such as Chinese New Year, which Stockify/Dave can’t predict. Yet.
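That "loopy code" boils down to paging through results with a pause between requests, so the API's rate limit is never outrun and no transactions get dropped. The page size, delay and `fetchPage` function below are assumptions standing in for the real API call:

```javascript
// Fetch every page of results, pausing between requests so we stay
// under the API rate limit and don't silently lose transactions.
async function fetchAll(fetchPage, pageSize = 250, delayMs = 500) {
  const all = [];
  for (let page = 1; ; page++) {
    const batch = await fetchPage(page, pageSize);
    all.push(...batch);
    if (batch.length < pageSize) break; // a short page means we've reached the end
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return all;
}
```

The key change from our first attempt is that each request waits for the previous one to finish before firing, instead of all running at once.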

AUTOMATIC ORDER FORMS: To help staff compile the order form we used our latest Google-Sheet-fu – a combination of pick lists, named ranges and the QUERY function to look up all products tagged with a status of “Re-order”.

A list of suppliers appears on the order form template:

and then this formula looks up the products for the chosen supplier and populates the order table:

=QUERY(INDIRECT("last_60_days"&"!"&"11:685"),"select G where M='"&$B2&"' and J='re-order'")

The trick is for our app to check whether the inventory quantity is less than the amount sold in the last x days, in which case the product goes on the order form.

NEXT STEPS: Oh, we’re not done yet! With each step into automation we take, another possibility appears on the horizon… here are some questions we’ll be asking our system in the coming weeks:

  • How many products have not sold in the last x days?
  • If the product type is books, can we order more when the inventory quantity goes below a certain threshold?
  • Even if a particular product has not sold in the last 60 days, can we flag its product type anyway so it gets added to our automatic order form?
  • While we are at it, do we need to look up supplier email addresses each time – can’t we just have them appear by magic?

…furthermore, we need to integrate this data with our CRM… it looks like we will be busy for a while longer.

Digital Curating Internship – an update

By David Wright (Digital Curating Intern, Bristol Culture)

Both Macauley Bridgman and I are now into week six of our internship as Digital Curating Assistants here at Bristol Culture (Bristol Museums). At this stage we have taken part in a wide array of projects which have provided us with invaluable experience as History and Heritage students (a discipline that combines the study of history with its digital interpretation) at the University of the West of England. We have now been on several different tours of the museum, both front of house and behind the scenes – most notably our store tour with Head of Collections Ray Barnett, which introduced us to issues facing curators nationwide, such as conservation techniques, museum pests, and the different methods of both utilising and presenting objects from across the museum’s collection.

pic from stores

In addition, we were invited to a presentation by the International Training Programme, in which Bristol Museums is a partner alongside the British Museum. Presentations were given by Ntombovuyo Tywakadi, Collections Assistant at Ditsong Museum (South Africa), followed by Wanghuan Shi, Project Co-ordinator at Art Exhibitions China, and Ana Sverko, Research Associate at the Institute of Art History (Croatia). All three visitors discussed their roles within their respective institutions and provided us with a unique insight into curating around the world. We found these presentations both insightful and thought provoking as we entered a Q&A centred on the restrictions and limitations of historical presentation in different nations.

Alongside these experiences we have also taken on multiple projects for various departments around the museum as part of our cross-disciplinary approach to digital curating.

Our first project involved working with Natural Sciences Collections Officer Bonnie Griffin to photograph, catalogue and conserve natural history specimens in the store. This was a privileged assignment, and perhaps the one we have found most enjoyable: the first-hand curating experience and close access to both highly experienced staff and noteworthy artefacts were inspiring in relation to our future careers.

David Wright
David Wright – Digital Curating Intern

Following on from this, we undertook a project assigned by Lisa Graves, Curator for World Cultures, to digitise the outdated card index system for India. The digital outcome will hopefully see use in an exhibition next year celebrating the seventieth anniversary of Indian independence during the UK-India Year of Culture. At times we found this work somewhat tedious and frustrating; however, upon completion we have come to recognise the immense significance of digitising museum records, both for preserving information for future generations and for the increased potential such records offer for future use and accessibility.

We have now fully immersed ourselves in our main Bristol Parks project, which aims to explore processes by which the museum’s collections can be recorded and presented through geo-location technology. For the purposes of this project we have limited our exploration to well-known local parks, namely Clifton and Durdham Downs, with the aim of creating a comprehensive catalogue of records geo-referenced to precise sites within the area. With the proliferation of online mapping tools, this is an important time for the museum to analyse how it records object provenance, and having mappable collections makes them suitable for inclusion in a variety of new and exciting platforms – watch this space! As part of this we have established standardised procedures for object georeferencing which can be replicated for future ventures and areas. Our previous projects for other departments provided the foundation for us to explore and critically analyse current processes and experiment with new ways to create links between objects within the museum’s collections.

id cards

As the saying goes, “time flies when you are having fun”, and this has certainly been true of our experience to date. We are now in our final two weeks here at the museum and our focus is firmly on completing our Bristol Parks project.