Personal photograph collections ontology development through thematic tags

Purpose - The number and variety of photos have grown considerably, since photos can now be taken anytime, anywhere and spontaneously. Searching for a particular photo file has become a boring, repetitive and tedious activity. It has therefore become imperative to apply an ontology that relates the characteristics of the user profile to the narrative, spatial, temporal and other types of information in the collected photos.
Design/methodology/approach - This article presents the development of a personal photograph collections ontology (MyOntoPhotos) that specialises in documenting the metadata of the topics that end-users most prefer to capture with their devices. An extensive survey of 650 participants was conducted using an online questionnaire composed of semi-closed questions, following the Likert scale and the scale category grading.
Findings - The ontology was based on the results of an extensive survey that aimed to identify thematic areas of interest, in addition to the spatial and temporal information covered by similar earlier efforts. Notably, the survey results showed that the majority of the respondents selected 22 thematic tags.
Originality/value - Based on the research findings, an innovative concept for a mobile application is presented, which focuses on enhancing the organization and retrieval of end-users' photo collections.


I. INTRODUCTION
The number and variety of photographs have increased significantly [1], [2]. Users can create photos anytime and anywhere, without prior planning [3], capturing a vast variety of everyday life events [4]. Managing digital photographs is not an easy task: conscious effort is required to organise and manage them, and thus to preserve them and locate them when necessary [5]. Photos hold memories of events and have the power to take us back in time and remind us of what we did, so they are of high emotional value [6].
However, searching for a particular photo among a vast volume of digital files is a dull, repetitious and laborious activity [7], mainly because a text retrieval query requires some knowledge of the semantics of the photograph. For this reason, the present work provides evidence that labelling photos with the appropriate thematic tag significantly improves the retrieval rate and makes retrieval easier. Retrieval then aims for the highest possible accuracy and recall, which has proven to be a challenge [8]. Ontologies make this possible. According to [9], an ontology is an explicit specification of a conceptualization. An ontology also defines the concepts and indicates how they are interlinked, imposing a specific structure on the field of study [10]. Ontologies can represent a particular area of interest, promoting and facilitating interoperability between information systems [1], the interpretation of queries, and the formulation and utilization of information [11]. Exploiting the positive features of ontologies (interoperability, and the capture and organization of knowledge) is very important [12], [13].
In this article, a framework for organizing personal photos is proposed, through the use of an ontology (MyOntoPhotos). The ontology includes thematic, spatial and temporal tags. These tags were selected through a survey conducted on an extensive, random sample of end-users. The ontology will be part of a photo organizing application that will allow users to refine the ranking order of the tags as they use it.
The rest of the article is organized as follows. Section II describes similar initiatives and related work. Section III presents the methodology followed for forming the ontology. Section IV presents the most popular tags of the ontology, as selected by users through the survey. Section V provides conclusions on the most important findings and lessons learned, and identifies the limitations of the research. Finally, the questionnaire is presented in the Appendix.

II. LITERATURE REVIEW
To begin with, the author of [14] establishes categories of image properties based on user behavior, by analyzing the words and phrases that viewers employ to describe images. According to [15], there is an interest in the detection of still images, user images and metadata in order to assess the breadth and significance of the semantic gap [16]. The semantic gap is the "lack of coincidence between the information that one can extract from the visual data and the interpretation that the same data have for a user in a given situation" [17, p.1]. In recent decades, several semantic gaps have made it difficult for users to search for the photos they want [18], [19].
Flickr allows users to upload images for online storage and to annotate them with titles, descriptions, or labels [20]. Flickr tags (date, location, and owner) are mainly assigned by the user who uploads the image, which has several benefits [21], but without allowing tags to be correlated within the same query to retrieve the requested photos, and without automatic photo organization. The authors of [22] referred to Instagram tags as guides to the main subjects, events, locations, ideas or emotions. In Picasa, the organization of photos is limited to creating albums as photo collections, without support for automatic event tracing [23].
Moreover, EXIF (Exchangeable Image File Format) allows geographic coordinates to be described using GPS tags. Photogeo also makes an important contribution through the use of new algorithms. In detail, these algorithms allow the user to annotate photos with basic metadata (who, the location where the photo was recorded, and the date and time of downloading) [24]. In PhotoMap [2], annotation is performed automatically using the spatial, temporal and social context of a photograph [7].
In terms of the organization of personal digital images, research has mainly focused on interface design [25], spatial indexing [26], data display [27], the time at which photographs were taken [28] and facilitating the exchange of photographs. Furthermore, the ContextPhoto ontology [1] provides concepts for describing the spatial and temporal context of a photo (where, when), along with Semantic Web Rule Language (SWRL) rules for deriving the social context of a photograph (who was nearby).
An essential part of photo organization and retrieval is identifying the topics that end-users are interested in or impressed by [24]. So far, a considerable body of research has been carried out in the above-mentioned domains, but none of it has focused on the exhaustive depiction and use of topics as the central entry point for search and retrieval functions, as proposed in this research.

III. METHODOLOGY
In the present study, we conducted a survey that describes and measures the degree of correlation between two variables: picture-taking behavior and the subjects that are mainly depicted. According to [42], correlation analysis applies a statistical test to determine whether the two variables change together consistently.
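As a rough illustration of the kind of correlation check described above, the following sketch computes a Pearson coefficient between two numeric variables. The study does not state which coefficient was used, so Pearson is an assumption here, and the variable names and sample answers are invented for illustration only; they are not survey data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, left unnormalised because the factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical Likert-style answers (1-5), NOT taken from the survey.
photo_frequency = [1, 2, 3, 4, 5, 5]
nature_interest = [1, 2, 2, 4, 4, 5]
r = pearson(photo_frequency, nature_interest)
```

A value of `r` close to 1 or -1 would indicate that the two variables change together consistently, while a value near 0 would indicate no linear relationship.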
The questionnaire used in the survey is provided in the Appendix and was the most appropriate tool for collecting the input data needed to build the proposed ontology [42]. The content of the questionnaire was based on previous research activities [5], [29], [37], [43], but had to be adapted to the Greek context and to new technological developments and requirements.
The questionnaire comprises 19 questions, divided into two parts. The first part (questions 1 to 13) covers the participants' demographics and photography preferences. The second part (questions 14 to 19) measures the topics that participants prefer to capture most often, through a set of visually aided questions.
The survey was conducted from November 2016 to February 2017 through an online questionnaire on a random sample of participants. The survey was promoted mainly via social networks. The number of respondents (650) was large enough to ensure satisfactory representation of different sub-groups in terms of gender, age, and level of education. The sample is considered able to provide useful information for creating the ontology. Participants could contact the researchers by e-mail if they needed any further clarification. The protection of personal data, the anonymity of the participants and the use of their responses solely for research purposes were emphasised.
Finally, it is worth noting that, after a thorough study of the literature, contemporary visual elements were introduced into the questionnaire. More specifically, hashtags (#) were used to present topics (e.g. #Parents / #Children), based on terms from social media networks (e.g. Facebook, Instagram, Twitter), and each topic was illustrated with related images, helping participants to respond more quickly and accurately.

IV. RESULTS AND DISCUSSION
After a thorough study of the responses, the following conclusions were drawn about the topic tags. First, the number of selected tags was 22. The tags were organized into eight broader topic areas (categories): #place, #friends, #occasion, #selfies, #family, #domestic animals, #leisure time and #personal items. In addition, based on the results of the survey (see question 15), the topic #Various Objects was used for cases where participants wished to fill in other photographable topics that were not mentioned or included in the previous tags.
Most of the participants (95%) would not spend more than one hour per week organizing their captured photos. This indicates that an application that considerably decreases the time spent organizing photos is essential. It was also observed that 80% of the participants did not provide extra tags beyond those already included in the survey (questions 14a-e).
Also, #place and #time are the topics that users prefer to use when accessing (searching) their photos. In more detail, users are interested in #place visited, #place of living, #place of working, #place of taking photos, #gps, #year, #season, #month, #date and #day of taking. Remarkably, the survey results showed that the use of the 22 topic tags, and thus the number of photos taken in each category, is not significantly affected by factors such as gender, age and educational profile.
The majority of the participants (72%) believe that a set of five topics is sufficient for tagging their photos. The topic tags that would be chosen most frequently, and hence the subjects of greatest interest, are: #Nature, #Best friends, #Social occasion, #Historical monuments, #City, #Museums / #Buildings, #Selfies, #Brothers, #Classmates, #Wedding / #Baptism, #Dog, #Hobby, #Parents / #Children. These topics were selected on the basis of the survey responses and, in conjunction with the literature review, form the basis for creating the entities and relationships of the MyOntoPhotos Personal Photo Ontology.
The ontology development followed the guidelines described in "Ontology Development 101", which was introduced by the creators of Protégé 2000, Ontolingua and Chimaera. Specifically, an iterative design process that helps developers to create an ontology [44] was applied. The two most important concepts for an ontology-based system in the field of photography are accuracy and recall when retrieving user-based results [15]. All possible combinations of topic tag variations, as shown in the graph and the ontology design, are based on these factors.
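To make the kind of structure such an ontology encodes more concrete, the sketch below represents a few statements as weighted subject-predicate-object triples. It is a minimal illustration, not the MyOntoPhotos .owl file: the identifiers (`Person`, `captures`, `isInterestedIn`, the `ex` prefix) are hypothetical names modelled on the relations reported in this study, and how the actual ontology stores the percentages is not stated, so the weight is simply kept as a plain attribute. The four percentages are those quoted in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Triple:
    """One subject-predicate-object statement, optionally weighted."""
    subject: str
    predicate: str
    obj: str
    share: Optional[float] = None  # share of respondents, where reported


# Hypothetical identifiers; percentages are the ones quoted in the text.
TRIPLES = [
    Triple("Person", "captures", "Nature", 0.82),
    Triple("Person", "captures", "BestFriends", 0.75),
    Triple("Person", "isInterestedIn", "PlaceVisited", 0.53),
    Triple("Person", "isInterestedIn", "Year", 0.55),
]


def objects_of(triples, predicate):
    """Return the objects linked by `predicate`, most popular first."""
    matching = [t for t in triples if t.predicate == predicate]
    matching.sort(key=lambda t: t.share or 0.0, reverse=True)
    return [t.obj for t in matching]


def to_turtle_like(triples, prefix="ex"):
    """Render the triples as simple Turtle-like lines (illustrative only)."""
    return [
        f"{prefix}:{t.subject} {prefix}:{t.predicate} {prefix}:{t.obj} ."
        for t in triples
    ]
```

Calling `objects_of(TRIPLES, "captures")` lists the captured topics in order of popularity, which is the ordering a retrieval interface could offer first.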
More specifically, the researchers wanted to depict the preferred topics (subjects) that the respondents "capture", together with the percentage of interest in each topic (e.g. #nature 82%, #best friends 75%). At the same time, a second relation, "is interested in", links the #person (the respondent) to #place and #time as retrieval tags. The ontology also shows the percentages of preference for #place and #time (e.g. #place visited 53%, #year 55%). The resulting ontology is presented in Figures 1 and 2 and in the .owl archive.

V. CONCLUSIONS
This paper highlighted the importance of using an ontology to organize knowledge, specifically on the subject of personal photographic collections. As already mentioned, organizing personal photos is a laborious and boring process that tends to be avoided, with the result that a large part of the photos are never found again because they are lost in the large volume of the collection. This paper proposes organizing personal photos through an application that uses the MyOntoPhotos Personal Photo Ontology, which mainly includes the topic areas that users are interested in and photograph, along with place and time tags, ranked by popularity according to the survey results. The ranking order will then be personalized on the basis of the user's own interests. An initialization phase, in the form of a game in which photos are presented and the user selects the ones they find interesting, would be a good first step for enhancing the training of the application and providing the user with the best-ranked tags as soon as possible.
Preferences differ by gender, age and level of education, but the differences are not significant. Most respondents chose to use up to five thematic #tags: #Nature, #Best Friends, #Social Occasion, #Historical Monuments, #City. In essence, the proposed application, based on the ontology created from the thorough literature review and the questionnaire responses, will "learn" users' photographic interests and remove the choices of lesser interest, emphasizing the #tags that they most commonly assign to their photos. Users will also be able to organize their personal photos at the time most convenient to them. Ultimately, each user profile will be modeled on the #tags topics chosen, so that photos can be organized and retrieved easily and quickly.
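The personalised ranking described above could be sketched as follows. This is only one possible scheme under stated assumptions, not the design of the proposed application: the class name `TagRanker`, the `prior_weight` parameter and the additive scoring rule are all hypothetical. The survey popularity acts as a starting score, and each time the user assigns a tag its score grows, so frequently used tags rise and rarely used ones drop out of the top five.

```python
from collections import Counter

# Survey shares quoted in the text; the remaining tags would come from
# the full survey results, which are not reproduced here.
SURVEY_PRIOR = {"#nature": 0.82, "#best friends": 0.75}


class TagRanker:
    """Ranks tags by survey popularity plus the user's own tag usage."""

    def __init__(self, prior, prior_weight=5.0):
        self.prior = dict(prior)
        # Hypothetical: how many "virtual uses" full survey popularity is worth.
        self.prior_weight = prior_weight
        self.uses = Counter()

    def record_use(self, tag):
        """Register that the user assigned `tag` to a photo."""
        self.uses[tag] += 1

    def score(self, tag):
        return self.prior.get(tag, 0.0) * self.prior_weight + self.uses[tag]

    def top(self, k=5):
        """Return the k best tags; ties are broken alphabetically."""
        tags = set(self.prior) | set(self.uses)
        return sorted(tags, key=lambda t: (-self.score(t), t))[:k]


ranker = TagRanker(SURVEY_PRIOR)
initial = ranker.top()          # survey order before any personal use
for _ in range(5):
    ranker.record_use("#dog")   # the user keeps tagging photos of a dog
personalised = ranker.top()
```

The default of five tags follows the survey finding that most participants consider five topics sufficient; the initialization game mentioned earlier could feed `record_use` with the photos a user selects.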