A Review on the Detection of Missing Content Queries in FAQ Retrieval Systems

Edwin Thuma; Moemedi Lefoane; Gontlafetse Mosweunyane

doi:10.24297/ijct.v16i2.5996

Authors

Edwin Thuma, Computer Science Department, University of Botswana, Gaborone
Moemedi Lefoane, Computer Science Department, University of Botswana, Gaborone
Gontlafetse Mosweunyane, Computer Science Department, University of Botswana, Gaborone

Keywords:

Frequently Asked Questions, Missing Content Queries

Abstract

When developing an automated FAQ retrieval system, the information supplier constructs question candidates in advance using their own knowledge, then answers them to create the question-answer pairs used by the system. However, these question-answer pairs will not always satisfy users' information needs. When no question-answer pair is relevant to a user's query, the user may submit various query reformulations, browse long results lists, and abandon the search before their information need has been satisfied. Such users may never return to the system because it failed to return question-answer pairs relevant to their query. To alleviate this, modern automated FAQ retrieval systems use a Missing Content Query (MCQ) detection subsystem to detect queries that have no relevant question-answer pair. In this article, we review the different approaches proposed in the literature for detecting MCQs. In particular, we provide a comprehensive review of the systems that deployed the binary classification approach, the thresholding approach and the hybrid approach to MCQ detection. Moreover, we describe the strengths and weaknesses of each approach.
