The Food Standards Agency (FSA) is an independent Government department with a statutory objective to protect public health and consumers' interests in relation to food. Part of this remit is to protect people with mild to severe allergies by making food bought or served in public safe for them. An estimated 2 million UK residents have a diagnosed food allergy or intolerance: 1–2% of the adult population and 5–8% of children. An allergic reaction can be serious or even fatal, and there is no known cure for food allergies, so the only way to manage the condition is strict avoidance.
Businesses are responsible for labelling food correctly and for ensuring that restaurants, takeaway services and packaged food give consumers the information they need about allergens. This does not always happen, and advice and regulations sometimes need updating.
Problem to solve
The FSA hypothesised that social media could be used to track issues around allergen declarations at scale. The Pivigo team were therefore asked to analyse social media mentions of food allergens across the UK.
Specifically, they were asked to answer the following questions:
• There are 14 official allergens that need to be disclosed according to EU law. Are these official allergens (or others) being discussed on social media?
• How are people talking about allergens: are they making enquiries to restaurants, reporting mislabelling, or reporting adverse reactions after an event?
• Are there geographical differences in reporting of allergens?
The team started with around 180,000 records from Twitter, YouTube comments, news, forums and blogs, dated between 2016 and 2018. These included the text itself and metadata such as dates, geolocation and sentiment.
First, the data were cleaned and prepared by removing duplicates (such as retweets), spurious locations, and undesired characters such as emojis. The data were then labelled by counting relevant mentions of allergens, and the team developed a dictionary of synonyms on which they based their Natural Language Processing analysis.
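The cleaning and labelling steps described here can be illustrated with a minimal Python sketch. It is not the team's actual pipeline: the synonym dictionary, the regular expressions and the function names (`clean`, `deduplicate`, `label_allergens`) are all assumptions made for illustration.

```python
import re

# Hypothetical synonym dictionary (an assumption, not the team's real one):
# canonical allergen name -> surface forms that count as a mention.
ALLERGEN_SYNONYMS = {
    "milk": {"milk", "dairy", "lactose"},
    "peanuts": {"peanut", "peanuts"},
    "coconut": {"coconut"},
}

# Leading retweet marker such as "RT @user: ".
RETWEET_PREFIX = re.compile(r"^RT @\w+:\s*")
# Any character that is not an ASCII word character, whitespace or
# common punctuation; this covers emojis and similar symbols.
NON_TEXT = re.compile(r"[^\w\s.,!?'@#-]", flags=re.ASCII)


def clean(text):
    """Strip the retweet marker and unwanted characters, then normalise case and spacing."""
    text = RETWEET_PREFIX.sub("", text)
    text = NON_TEXT.sub(" ", text)
    return " ".join(text.lower().split())


def deduplicate(texts):
    """Return cleaned texts in their original order, dropping repeats such as retweets."""
    seen = set()
    unique = []
    for raw in texts:
        cleaned = clean(raw)
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            unique.append(cleaned)
    return unique


def label_allergens(cleaned_text):
    """Return the canonical allergens whose synonyms appear as whole words."""
    tokens = set(re.findall(r"[a-z]+", cleaned_text))
    return {name for name, forms in ALLERGEN_SYNONYMS.items() if tokens & forms}


posts = [
    "RT @bob: No dairy warning on this 🍦!",
    "No dairy warning on this 🍦!",
    "Peanut and milk in one bar",
]
for post in deduplicate(posts):
    print(post, "->", sorted(label_allergens(post)))
```

Matching on whole-word tokens rather than substrings is a deliberate choice in this sketch, so that a word like "peanut" is not counted as a generic "nut" mention unless the dictionary says so.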
The dashboard is adaptable to several data sources, can take on more in the future, and makes insights accessible to specified users via a web browser interface.
Some of the more specific and interesting findings from the analysis were:
• Milk (dairy) and nut allergies were the most commonly reported, with the most mentions on Twitter, followed by other forums. A co-occurrence matrix showed that some allergen pairs were very common (e.g. milk and cereal).
• Coconut was among the top 10 mentions, but is not one of the core 14 allergens.
• Mention of allergens across the UK (by local authority) followed a similar pattern irrespective of the allergen itself.
• By type of mention, food labelling was the most common topic across positive, neutral and negative mentions alike. Very few enquiries are made via social media.
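The mention counts and the co-occurrence matrix referred to in these findings can be sketched in a few lines of Python. This is an illustration only, assuming each post has already been labelled with a set of allergens; the function names and the sample data are hypothetical and do not reproduce the project's results.

```python
from collections import Counter
from itertools import combinations


def mention_counts(labelled_posts):
    """Count how many posts mention each allergen."""
    return Counter(a for allergens in labelled_posts for a in allergens)


def cooccurrence_counts(labelled_posts):
    """Count how many posts mention each unordered pair of allergens together."""
    counts = Counter()
    for allergens in labelled_posts:
        # Sorting gives each pair one canonical key, e.g. ("cereal", "milk").
        for pair in combinations(sorted(allergens), 2):
            counts[pair] += 1
    return counts


# Made-up labelled posts, purely for illustration.
labelled = [
    {"milk", "cereal"},
    {"milk", "cereal", "nuts"},
    {"coconut"},
]

print(mention_counts(labelled).most_common(3))
print(cooccurrence_counts(labelled).most_common(3))
```

Storing pairs in a `Counter` keyed by sorted tuples is equivalent to the upper triangle of a symmetric co-occurrence matrix, and stays compact when most allergen pairs never appear together.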
Insight into these mentions is very valuable to the FSA. The dashboard is regularly updated with new data, allowing users to monitor the volume of discussion around specific allergens, search relevant social media posts for emerging issues, and understand the current conversation around allergies. The project has also highlighted the insights that analysis of social media data can provide, and has led to further research using these techniques in different areas.
“The Pivigo team were a motivated and talented group of data scientists who delivered outputs above