Designing Authority Data Properties Based on Microdata Method and Study of Web Search Engines’ Reaction to Them

Document Type: Research Paper

Authors

1 Department of Knowledge and Information Science, Faculty of Psychology and Education, Allameh Tabatabaei University, Tehran, Iran

2 Department of Industrial Management, Faculty of Management and Accounting, Allameh Tabatabaei University, Tehran, Iran

3 Central Library and Documentation Center, Allameh Tabatabaei University, Tehran, Iran

4 Department of Information Science, Faculty of Education and Psychology, Alzahra University, Tehran, Iran

Abstract

Objective: The purpose of this research was to study how search engines respond to authority data properties embedded in schema.org-based metadata written in the Microdata syntax.
Methods: An experimental method was used. The research population comprised 400 records of authority metadata based on the Microdata method from the digital library of Allameh Tabataba'i University. The experimental group consisted of 200 metadata records: 100 records with authority data extensions embedded in schema.org-based metadata in the Microdata syntax and 100 similar records in the JSON-LD syntax (50 name authority samples and 50 subject authority samples). The control group consisted of 200 records: 100 book description records in the Microdata syntax and 100 similar records in the JSON-LD syntax. The records were published on an independent website, www.Aghadeh.ir, and submitted to Google, Bing, Yahoo, and Yandex, the search engines behind the design of the schema.org standard. The indexing and retrieval of the metadata records of the control and experimental groups were then evaluated by searching the selected search engines, using a researcher-made checklist as the data-gathering tool.
Results: The search engines were able to index and retrieve all of the metadata records, including the values of the added authority data extensions. This held equally for the name authority records and the subject authority records.
Conclusions: Because the search engines retrieve the values of the variant properties in the experimental group's records in addition to the authorized values of the name and subject terms, such records provide a suitable basis for more comprehensive retrieval and for improved authority control in Web search tools.
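
To illustrate what the same authority record looks like in the two syntaxes compared in the study, the sketch below builds a name authority record as JSON-LD and as Microdata. It is a minimal illustration under stated assumptions, not the study's actual records: the abstract does not name the authority data extension properties, so the standard schema.org property `alternateName` stands in for the variant forms, and the function names and the sample heading are hypothetical.

```python
import json
from html import escape


def authority_jsonld(authorized, variants):
    """Serialize a name authority record as a JSON-LD string.

    Assumption: schema.org `alternateName` stands in for the study's
    (unnamed) variant-form extension properties.
    """
    record = {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": authorized,                # authorized form of the name
        "alternateName": list(variants),   # variant forms of the name
    }
    return json.dumps(record, ensure_ascii=False, indent=2)


def authority_microdata(authorized, variants):
    """Serialize the same record as an HTML fragment in the Microdata syntax."""
    lines = [
        '<div itemscope itemtype="https://schema.org/Person">',
        f'  <span itemprop="name">{escape(authorized)}</span>',
    ]
    for variant in variants:
        lines.append(f'  <span itemprop="alternateName">{escape(variant)}</span>')
    lines.append("</div>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative heading only; not taken from the study's data set.
    print(authority_jsonld("Twain, Mark", ["Clemens, Samuel Langhorne"]))
    print(authority_microdata("Twain, Mark", ["Clemens, Samuel Langhorne"]))
```

In both outputs the authorized form and each variant form appear as separate property values, which is what would let a search engine match a query on either form.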
