Large Personal Lifelogs

What is the Quantified Self?

Our body is constantly sending out signals that, if listened to carefully, allow us to better understand the state of our personal well-being. For example, feeling weak and tired after a long sleep can be seen as an indication that the quality of the sleep was low. It might also reveal evidence regarding our personal fitness level or our mental state. Aware of the importance of these body signals, medical doctors rely heavily on them when they have to make a diagnosis. Adopting these methods, a growing number of people have now started to constantly measure the fitness level of their bodies, using a variety of equipment to collect and store the data. These individuals can be considered part of the Quantified Self (QS) movement, which uses instruments to record numerical data on all aspects of our lives: inputs (food consumed, surrounding air quality), states (mood, arousal, blood oxygen levels), and performance (mental, physical). The data is acquired through technology: wearable sensors, mobile apps, software interfaces, and online communities.

The technology research and advisory company Gartner predicts that by 2017, the Quantified Self movement will evolve into a new trend with around 80% of consumers collecting or tracking personal data. Moreover, they predict that by 2020, the analysis of consumer data collected from wearable devices will be the foundation for up to five percent of sales from the Global 1000 companies. Given these predictions, it comes as no surprise that more and more companies are trying to enter the market with novel wearable devices. In fact, a multitude of devices, services and apps are now available that track almost everything we do.



Memory is the process by which information is encoded, stored, and retrieved. Given enough stimuli and rehearsal, humans can remember information for many years and recall that information whenever required. However, not all stimuli are strong enough to generate a memory that can be recalled easily. One of the most valuable and effective ways to reduce the impact of Age-related Memory Impairment on everyday functioning is to use external memory aids. Lifelogging technologies can contribute significantly to the realisation of these external memory aids. Lifelogging is the process of automatically, passively and digitally recording aspects of our life experiences. This includes visual lifelogging, where lifeloggers wear head-mounted or chest-mounted cameras that capture personal activities as images or video. Despite its relative novelty, visual lifelogging is gaining popularity because of projects like the Microsoft SenseCam.

The SenseCam is a small, lightweight wearable device that automatically captures a wearer’s every moment as a series of images and sensor readings. It has been shown recently that such images and other data can be periodically reviewed to recall and strengthen an individual’s memory. Normally, the SenseCam captures one image every 30 seconds and collects about 4,000 images in a typical day. It can also be set up to take images more frequently by detecting sudden changes in the wearer’s environment, such as significant changes in light level, motion and ambient temperature. The SenseCam therefore generates a very large amount of data in a single typical day.
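To make the capture figures above concrete, here is a rough back-of-the-envelope sketch. The 30-second interval and the figure of about 4,000 images per day come from the text; the 16 hours of daily wear is purely an assumption for illustration, as is reading the remainder as sensor-triggered captures.

```python
SECONDS_PER_HOUR = 3600


def timed_captures(hours_worn: float, interval_s: float = 30.0) -> int:
    """Images taken by the fixed timer alone during `hours_worn` hours of wear."""
    return int(hours_worn * SECONDS_PER_HOUR // interval_s)


HOURS_WORN = 16      # assumption: the device is worn for the waking day only
DAILY_TOTAL = 4000   # approximate daily figure quoted in the text

timer_images = timed_captures(HOURS_WORN)
# Under the assumption above, whatever is left over would have to come from
# the sensor-triggered captures (light, motion, temperature changes).
sensor_images = DAILY_TOTAL - timer_images

print(f"timer-triggered images: {timer_images}")
print(f"remaining images (sensor-triggered, under this assumption): {sensor_images}")
```

With a different wearing time the split changes, so the numbers should be read as an order-of-magnitude illustration rather than a measurement.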


Big Lifelogging Data

“Big Data” applications are generally characterised by four elements: volume, variety, velocity and veracity. In this section we examine how lifelogging does, or does not, conform to those four characteristics, because there are certain advantages which “Big Data” technologies could bring to lifelogging applications.


Lifelogging is essentially about generating and capturing data, whether it comes from sensors, our information accesses, our communications, and so on. One characteristic which makes lifelogging a big data application, and which poses both challenges and opportunities for data analytics, is the variety in the data sources.

Primary data includes sources such as physiological data from wearable sensors (heart rate, respiration rate, galvanic skin response, etc.), movement data from wearable accelerometers, location data, nearby Bluetooth devices, WiFi networks and signal strengths, temperature sensors, communication activities, data activities, environmental context, and images or video from wearable cameras. This list does not even take into account the secondary data that can be derived from the primary lifelog data through semantic analysis. All these data sources are tremendously varied. In lifelogging, they are merged to form a holistic personal lifelog in which the variety across data sources is normalised and eliminated.
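One way to picture this normalisation is to map every source onto a common timestamped record and then interleave the streams by time. The sketch below is only an illustration under stated assumptions: the `LifelogEvent` schema, the two input formats and all function names are hypothetical, not taken from any particular lifelogging system.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from heapq import merge
from typing import Any, Iterable, Iterator


@dataclass(frozen=True)
class LifelogEvent:
    """Hypothetical common schema shared by every source."""
    timestamp: datetime          # always timezone-aware
    source: str                  # e.g. "heart_rate", "camera"
    payload: dict[str, Any]      # source-specific values


def from_heart_rate(rows: Iterable[tuple[float, int]]) -> Iterator[LifelogEvent]:
    """Rows are (unix_seconds, beats_per_minute)."""
    for unix_s, bpm in rows:
        when = datetime.fromtimestamp(unix_s, tz=timezone.utc)
        yield LifelogEvent(when, "heart_rate", {"bpm": bpm})


def from_camera(rows: Iterable[tuple[str, str]]) -> Iterator[LifelogEvent]:
    """Rows are (ISO 8601 time string with UTC offset, image file name)."""
    for iso_time, file_name in rows:
        yield LifelogEvent(datetime.fromisoformat(iso_time), "camera", {"image": file_name})


def merge_streams(*streams: Iterable[LifelogEvent]) -> Iterator[LifelogEvent]:
    """Interleave streams by timestamp; each stream must already be time-ordered."""
    return merge(*streams, key=lambda event: event.timestamp)


# Made-up sample data for illustration only.
start = datetime(2015, 3, 1, 9, 0, tzinfo=timezone.utc).timestamp()
heart_rows = [(start, 72), (start + 60, 75)]
camera_rows = [("2015-03-01T09:00:30+00:00", "img_0001.jpg")]

timeline = list(merge_streams(from_heart_rate(heart_rows), from_camera(camera_rows)))
for event in timeline:
    print(event.timestamp.isoformat(), event.source, event.payload)
```

Once every source shares the same record shape, downstream analysis no longer needs to know which device produced an event. The original format differences are absorbed by the small per-source adapter functions.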

The velocity of data refers to the subtle shifting changes in patterns within a data source or stream. This is not yet much of an issue for lifelog data, because most lifelogging analysis and processing is not done in applications which require a pattern or a change to be identified in real time. This is one of the directions for future work: real-time pattern analysis could potentially be employed for healthcare monitoring and real-time interventions.

Lifelogging generates continuous streams of data on a per-person basis. However, despite the potential for real-time interactions, most of the lifelogging applications we have seen to date do not yet operate in real time. So while lifelogging does not yet have huge volume, the volume of data is constantly increasing as more and more people lifelog. For a single individual, the data volumes can be large when considered as a Personal Information Management challenge, but in terms of big-data analysis, the data volumes for a single individual are small. If we consider the lifelogs of many people, thousands or perhaps millions, all centrally stored by a service provider, then data analytics over such huge archives becomes a real big-data challenge in terms of volume.

Finally, veracity refers to the accuracy of data and to it sometimes being imprecise and uncertain. In the case of lifelogging, much of our lifelog data is derived from sensors which may be troublesome, or have issues of calibration and sensor drift, as described in Byrne and Diamond (2006). Hence, we can see that lifelogging does have issues of data veracity which must be addressed. Semantically, such data may not be valuable without additional processing. In applications of wireless sensor networks in environmental monitoring, trust and reputation frameworks have been developed to handle issues of data accuracy, for example RFSN (Reputation-based Framework for High Integrity Sensor Networks) by Ganeriwal et al. (2008). Similarly, data quality is a major issue in enterprise information processing.

Byrne, R. and Diamond, D. (2006). Chemo/bio-sensor networks. Nature Materials, 5(6):421-424.

Ganeriwal, S., Balzano, L. K., and Srivastava, M. B. (2008). Reputation based framework for high integrity sensor networks. ACM Transactions on Sensor Networks, 4(3):1-37.
