Applying Natural Language Processing Techniques to Generate Open Data Web APIs Documentation

  • Conference paper
Web Engineering (ICWE 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12128)

Abstract

The globalisation of information access has led to continuous growth in the data available online, especially through open data portals. However, data in current open data portals is difficult to understand and access. One reason for this difficulty is the lack of suitable mechanisms for extracting and learning valuable information from existing open data, such as Web Application Programming Interfaces (APIs) with proper documentation. In most cases, the documentation of open data Web APIs is rudimentary, hard to follow, and sometimes incomplete or even inaccurate. To address these data management problems, this paper proposes an approach for automatically generating Web API documentation that is both machine-readable and human-readable. Our approach applies natural language processing techniques to create OpenAPI documentation. Better comprehension of the APIs makes the data easier to access and thus promotes its reuse. The feasibility of our approach is demonstrated through a case study that shows and compares the benefits of applying our OpenAPI documentation process to an open data Web API.
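
As a rough illustration of the idea in the abstract, the sketch below builds a minimal OpenAPI 3.0 description for a tabular open dataset and derives a readable description for each query parameter by tokenising the column name. It is a simplified sketch under stated assumptions, not the authors' implementation: the function names `tokenize` and `build_openapi`, the resource name, and the column names are hypothetical, and plain regular expressions stand in for the natural language processing techniques the paper applies.

```python
from __future__ import annotations

import json
import re


def tokenize(name: str) -> list[str]:
    """Split a column name into lowercase tokens (hypothetical helper)."""
    tokens: list[str] = []
    # First split on underscores, hyphens and whitespace...
    for part in re.split(r"[_\-\s]+", name):
        # ...then on camelCase boundaries, acronyms and digit runs.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [token.lower() for token in tokens]


def build_openapi(title: str, resource: str, columns: list[str]) -> dict:
    """Return a minimal OpenAPI 3.0 document with one GET path.

    Each dataset column becomes an optional query parameter whose
    description is generated from the tokenised column name.
    """
    parameters = [
        {
            "name": column,
            "in": "query",
            "required": False,
            "description": "Filter by " + " ".join(tokenize(column)),
            "schema": {"type": "string"},
        }
        for column in columns
    ]
    return {
        "openapi": "3.0.0",
        "info": {"title": title, "version": "1.0.0"},
        "paths": {
            f"/{resource}": {
                "get": {
                    "summary": f"Query the {resource} dataset",
                    "parameters": parameters,
                    "responses": {"200": {"description": "Matching records"}},
                }
            }
        },
    }


if __name__ == "__main__":
    # Column names below are made up for illustration only.
    spec = build_openapi("Employment API", "employment", ["regionName", "unemployment_rate"])
    print(json.dumps(spec, indent=2))
```

The resulting dictionary can be serialised to JSON and loaded by OpenAPI tooling, which is what makes the documentation readable by both machines and people.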

Notes

  1. https://github.com/cgmora12/AG.

  2. https://docs.ckan.org/en/latest/api/index.html.

  3. https://www.data.gov/.

  4. https://data.gov.uk/.

  5. https://datos.gob.es/es/apidata.

  6. https://en.wikipedia.org/.

  7. A value between 0 and 1 indicating the degree of confidence that the algorithm had when it performed the disambiguation of the term [21].

  8. The process of splitting a stream of text into more basic units such as words, phrases or tokens (elements with an identified meaning).

  9. http://wake.dlsi.ua.es/EmploymentAPI/docs/.

  10. https://wake.dlsi.ua.es/EmploymentAPI/docs/complete.html.

  11. https://www.json.org/.

  12. https://github.com/cgmora12/AG.

  13. https://github.com/cgmora12/NL4OpenAPI.

References

  1. Abelló Gamazo, A., Ayala Martínez, C.P., Farré Tost, C., Gómez Seoane, C., Oriol Hilari, M., Romero Moral, Ó.: A data-driven approach to improve the process of data-intensive API creation and evolution. In: Proceedings of the Forum and Doctoral Consortium Papers Presented at the 29th International Conference on Advanced Information Systems Engineering, CAiSE 2017, pp. 1–8 (2017)

  2. Alharbi, N., Gotoh, Y.: Natural language descriptions for human activities in video streams. In: Proceedings of the 10th International Conference on Natural Language Generation, pp. 85–94 (2017)

  3. Alonso, J.M., Ramos-Soto, A., Castiello, C., Mencar, C.: Explainable AI beer style classifier. In: The SICSA Reasoning, Learning and Explainability Workshop 2018 (2018)

  4. Atzeni, P., Merialdo, P., Mecca, G.: Data-intensive web sites: design and maintenance. World Wide Web 4(1), 21–47 (2001)

  5. Aysolmaz, B., Leopold, H., Reijers, H.A., Demirörs, O.: A semi-automated approach for generating natural language requirements documents based on business process models. Inf. Softw. Technol. 93, 14–29 (2018)

  6. Bateman, J., Zoch, M.: Natural Language Generation. Oxford University Press, Oxford (2003)

  7. Braun, N., Goudbeek, M., Krahmer, E.: The Multilingual Affective Soccer Corpus (MASC): compiling a biased parallel corpus on soccer reportage in English, German, Dutch. In: Proceedings of the 9th International Natural Language Generation conference, pp. 74–78 (2016)

  8. Braunschweig, K., Eberius, J., Thiele, M., Lehner, W.: The state of open data - limits of current open data platforms. In: Proceedings of the 21st World Wide Web Conference 2012, Web Science Track at WWW 2012 (2012)

  9. Cao, H., Falleri, J.-R., Blanc, X.: Automated generation of REST API specification from plain HTML documentation. In: Maximilien, M., Vallecillo, A., Wang, J., Oriol, M. (eds.) ICSOC 2017. LNCS, vol. 10601, pp. 453–461. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-69035-3_32

  10. Cole, R. (ed.): Survey of the State of the Art in Human Language Technology. Cambridge University Press, New York (1997)

  11. Daga, E., Panziera, L., Pedrinaci, C.: A BASILar approach for building Web APIs on top of SPARQL endpoints. In: Proceedings of the 3rd Workshop on Services and Applications over Linked APIs and Data, vol. 1359, pp. 22–32 (2015)

  12. Danielsen, P.J., Jeffrey, A.: Validation and interactivity of Web API documentation. In: IEEE 20th International Conference on Web Services, pp. 523–530 (2013)

  13. De Renzis, A., Garriga, M., Flores, A., Cechich, A., Mateos, C., Zunino, A.: A domain independent readability metric for web service descriptions. Comput. Stand. Interfaces 50, 124–141 (2017)

  14. Eciolaza, L., Pereira-Fariña, M., Trivino, G.: Automatic linguistic reporting in driving simulation environments. Appl. Soft Comput. 13(9), 3956–3967 (2013)

  15. Ed-douibi, H., Cánovas Izquierdo, J.L., Cabot, J.: Example-driven web API specification discovery. In: Anjorin, A., Espinoza, H. (eds.) ECMFA 2017. LNCS, vol. 10376, pp. 267–284. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-61482-3_16

  16. Ed-douibi, H., Cánovas Izquierdo, J.L., Cabot, J.: OpenAPItoUML: a tool to generate UML models from OpenAPI definitions. In: Mikkonen, T., Klamma, R., Hernández, J. (eds.) ICWE 2018. LNCS, vol. 10845, pp. 487–491. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-91662-0_41

  17. Fellbaum, C.: WordNet: An Electronic Lexical Database (Language, Speech, and Communication). MIT Press, Cambridge (1998)

  18. Hancock, B., Lee, H., Yu, C.: Generating titles for web tables. In: The World Wide Web Conference, pp. 638–647 (2019)

  19. Hardy, H., Vlachos, A.: Guided neural language generation for abstractive summarization using abstract meaning representation. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 768–773 (2018)

  20. Huang, C., Zaiane, O., Trabelsi, A., Dziri, N.: Automatic dialogue generation with expressed emotions. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 2, pp. 49–54 (2018)

  21. Iacobacci, I.: Neural-grounded semantic representations and word sense disambiguation: a mutually beneficial relationship, Ph.D. thesis (2018)

  22. Janssen, M., Charalabidis, Y., Zuiderwijk, A.: Benefits, adoption barriers and myths of open data and open government. Inf. Syst. Manag. 29(4), 258–268 (2012)

  23. Keim, D.A.: Information visualization and visual data mining. IEEE Trans. Vis. Comput. Graph. 8(1), 1–8 (2002)

  24. Kopecký, J., Vitvar, T., Pedrinaci, C., Maleshkova, M.: RESTful services with lightweight machine-readable descriptions and semantic annotations. In: Wilde, E., Pautasso, C. (eds.) REST: From Research to Practice, chap. 22, pp. 473–506. Springer, New York (2011). https://doi.org/10.1007/978-1-4419-8303-9_22

  25. Van der Lee, C., Krahmer, E., Wubben, S.: PASS: a Dutch data-to-text system for soccer, targeted towards specific audiences. In: Proceedings of the 10th International Conference on Natural Language Generation, pp. 95–104 (2017)

  26. Lu, Y., Li, G., Zhao, Z., Wen, L., Jin, Z.: Learning to infer API mappings from API documents. In: Li, G., Ge, Y., Zhang, Z., Jin, Z., Blumenstein, M. (eds.) KSEM 2017. LNCS (LNAI), vol. 10412, pp. 237–248. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63558-3_20

  27. Macdonald, I., Siddharthan, A.: Summarising news stories for children. In: Proceedings of the 9th International Natural Language Generation Conference, pp. 1–10 (2016)

  28. Maleshkova, M., Pedrinaci, C., Domingue, J.: Investigating web APIs on the World Wide Web. In: 2010 8th IEEE European Conference on Web Services, pp. 107–114 (2010)

  29. Moreno, L., Aponte, J., Sridhara, G., Marcus, A., Pollock, L., Vijay-Shanker, K.: Automatic generation of natural language summaries for Java classes. In: 21st International Conference on Program Comprehension, pp. 23–32 (2013)

  30. Moro, A., Raganato, A., Navigli, R.: Entity linking meets word sense disambiguation: a unified approach. Trans. Assoc. Comput. Linguist. 2, 231–244 (2014)

  31. Navigli, R., Ponzetto, S.P.: BabelNet: the automatic construction, evaluation and application of a wide-coverage multilingual semantic network. Artif. Intell. 193, 217–250 (2012)

  32. Pandita, R., Xiao, X., Zhong, H., Xie, T., Oney, S., Paradkar, A.: Inferring method specifications from natural language API descriptions. In: Proceedings of the 34th International Conference on Software Engineering, pp. 815–825 (2012)

  33. Ramos-Soto, A., Janeiro, J., Alonso, J.M., Bugarin, A., Berea-Cabaleiro, D.: Using fuzzy sets in a data-to-text system for business service intelligence. In: Kacprzyk, J., Szmidt, E., Zadrożny, S., Atanassov, K.T., Krawczak, M. (eds.) IWIFSGN/EUSFLAT -2017. AISC, vol. 643, pp. 220–231. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-66827-7_20

  34. Robillard, M.P., DeLine, R.: A field study of API learning obstacles. Empirical Softw. Eng. 16(6), 703–732 (2011)

  35. Rodríguez, R., Espinosa, R., Bianchini, D., Garrigós, I., Mazón, J.-N., Zubcoff, J.J.: Extracting models from web API documentation. In: Grossniklaus, M., Wimmer, M. (eds.) ICWE 2012. LNCS, vol. 7703, pp. 134–145. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35623-0_14

  36. Suter, P., Wittern, E.: Inferring web API descriptions from usage data. In: 3rd IEEE Workshop on Hot Topics in Web Systems and Technologies, pp. 7–12 (2015)

  37. Trivino, G., Sanchez, A., Montemayor, A.S., Pantrigo, J.J., Cabido, R., Pardo, E.G.: Linguistic description of traffic in a roundabout. In: International Conference on Fuzzy Systems, pp. 1–8 (2010)

  38. Uddin, G., Robillard, M.P.: How API documentation fails. IEEE Softw. 32(4), 68–75 (2015)

  39. Vicente, M.E., Barros, C., Agulló, F., Peregrino, F.S., Lloret, E.: La generación de lenguaje natural: análisis del estado actual. Computación y Sistemas 19(4), 721–756 (2015)

Acknowledgments

This work has been partially funded by the following projects: TIN2016-78103-C2-2-R, PROMETEU/2018/089, RTI2018-094653-B-C22, RTI2018-094649-B-I00, TIN2017-90773-REDT and COST Action CA18231. Furthermore, the author César González-Mora holds a predoctoral training contract with the Generalitat Valenciana and the European Social Fund under grant ACIF/2019/044.

Author information

Corresponding author

Correspondence to César González-Mora.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

González-Mora, C., Barros, C., Garrigós, I., Zubcoff, J., Lloret, E., Mazón, J.-N. (2020). Applying Natural Language Processing Techniques to Generate Open Data Web APIs Documentation. In: Bielikova, M., Mikkonen, T., Pautasso, C. (eds) Web Engineering. ICWE 2020. Lecture Notes in Computer Science, vol 12128. Springer, Cham. https://doi.org/10.1007/978-3-030-50578-3_28

  • DOI: https://doi.org/10.1007/978-3-030-50578-3_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-50577-6

  • Online ISBN: 978-3-030-50578-3

  • eBook Packages: Computer Science, Computer Science (R0)
