
Semantic Facettation in Pharmaceutical Collections Using Deep Learning for Active Substance Contextualization

  • Conference paper
Digital Libraries: Data, Information, and Knowledge for Digital Lives (ICADL 2017)

Abstract

Alternative access paths to literature beyond mere keyword or bibliographic search are a major success factor in today’s digital libraries. Especially in the sciences, users need complex knowledge spaces and facettations in which entities such as chemical substances, genes, or mathematical formulae play a central role. However, even for clear-cut entities, the requirements in terms of contextualized similarities or rankings may differ strongly. In this paper, we show how deep learning techniques applied to scientific corpora lead to a strongly contextualized description of entities. As an application case, we take pharmaceutical entities in the form of small molecules and demonstrate how their learned contexts and profiles reflect their actual use as well as possible new uses, e.g., for drug design or repurposing. As our evaluation shows, the results are quite comparable to expensive, manually maintained classifications in the field. Since our techniques rely only on deep embeddings of textual documents, our methodology promises to generalize to other use cases, too.
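
To make the idea of a contextualized entity description more concrete, the following is a minimal sketch of how entities could be compared once each one has an embedding vector learned from text. It is not the authors' implementation: the function names, the entity labels, and the three-dimensional toy vectors are all hypothetical and chosen only for illustration, and cosine similarity is assumed here as the comparison measure.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(query, embeddings, k=2):
    """Return the k entities most similar to `query`, best first."""
    scores = [
        (name, cosine(embeddings[query], vec))
        for name, vec in embeddings.items()
        if name != query
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


# Hypothetical toy embeddings (made-up values, not learned from any corpus).
toy_embeddings = {
    "substance_a": [0.9, 0.1, 0.0],
    "substance_b": [0.8, 0.2, 0.1],
    "substance_c": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    for name, score in nearest("substance_a", toy_embeddings):
        print(f"{name}: {score:.3f}")
```

In such a setup, entities whose vectors point in similar directions would be grouped into the same facet; with real embeddings, the vectors would come from a model trained on the document collection rather than being written by hand.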

Notes

  1. https://www.nlm.nih.gov/mesh/intro_trees.html

  2. https://www.whocc.no/atc_ddd_index/

  3. http://www.ahfsdruginformation.com/ahfs-pharmacologic-therapeutic-classification/

  4. https://www.drugbank.ca/

  5. https://deeplearning4j.org/

  6. https://www.ncbi.nlm.nih.gov/pubmed/

  7. https://www.drugbank.ca/

  8. https://meshb.nlm.nih.gov/search

  9. https://lucene.apache.org/

  10. https://deeplearning4j.org/word2vec

  11. http://algo.uni-konstanz.de/software/mdsj/

  12. http://commons.apache.org/proper/commons-math/

Author information

Corresponding author

Correspondence to Janus Wawrzinek.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Wawrzinek, J., Balke, WT. (2017). Semantic Facettation in Pharmaceutical Collections Using Deep Learning for Active Substance Contextualization. In: Choemprayong, S., Crestani, F., Cunningham, S. (eds) Digital Libraries: Data, Information, and Knowledge for Digital Lives. ICADL 2017. Lecture Notes in Computer Science, vol 10647. Springer, Cham. https://doi.org/10.1007/978-3-319-70232-2_4

  • DOI: https://doi.org/10.1007/978-3-319-70232-2_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-70231-5

  • Online ISBN: 978-3-319-70232-2

  • eBook Packages: Computer Science (R0)
