Projects

Current and Old Project Links (some should work)

CiteSeerx

MathSeer

PrivaSeer

COVIDSeer

ChemxSeer

BBookX

Visual Cortex on Silicon

CSSeers

SeerSuite

SimSeerX

CollabSeer

RefSeer

YouSeer

ArchSeer

EthnicSeer

Current Research Projects

The current focus of my research is cyberinfrastructure, information retrieval, knowledge extraction and management, and data mining for both public and private data, small and big, with a particular interest in scholarly big data. Our application domain has primarily been the Web and Internet, with a focus on academic, scientific, and government information, data, and documents. I am also interested in automated methods for developing and designing cyberinfrastructure (also known as e-science) for academic research and related areas. This has led to research in various aspects of social networks and how they facilitate information access. Other interests are knowledge aggregation and architectures.

My recent research and scholarly interests are listed below:

  • Design and creation of specialty or vertical search engines, cyberinfrastructure, digital libraries, and focused crawlers.
    • Open source infrastructure and toolkits for search engines and digital libraries: SeerSuite.
      • SeerSuite code is now available on GitHub; you can build your own CiteSeerx-like Seer or just use the special extraction modules.
    • YouSeer is a complete and powerful open source search engine available on SourceForge that integrates the open source crawler Heritrix with the open source indexer Solr/Lucene.
    • Next Generation CiteSeer, CiteSeerx, built from SeerSuite, now with a new look and author name disambiguation.
    • Automated interactive textbook building tool, BBookX.
    • Specialty search engines such as:
      • PrivaSeer is a search engine for web privacy policies.
      • COVIDSeer searches Covid papers.
      • CSSeers was an expert recommendation search engine.
      • RefSeer was a citation recommendation system.
      • A collaboration search tool, CollabSeer, covered over 400,000 collaborators in CiteSeerX.
      • TableSeer was a table search engine integrated into ChemxSeer and CiteSeerX.
      • GrantSeer allowed program managers to search their grant portfolios.
      • SeerSeer was based on the CiteSeerX database and allowed search of experts.
      • EthnicSeer is a name ethnicity classifier based on name ethnicity as defined in Wikipedia.
      • AckSeer was an early acknowledgement indexing search engine.
  • A cyberinfrastructure search engine and data portal built on SeerSuite for environmental chemistry: ChemxSeer.
    • This project focused on searching for chemical formulae, table search, figure search and data search for chemistry.
  • Automated methods in systems research and cyberinfrastructure - information and knowledge extraction, data mining, web services. Please see some of my recent papers.
    • As an example, please see the work, published in PNAS, on automated acknowledgement indexing and on who and what gets acknowledged, which yields insights into scientific and social trends.
  • Social network analysis for enhanced search and understanding trends in science and discovering e-communities.

Past research and scholarly areas which are still of interest:

  • Recent work on deep learning with recurrent neural networks and sequence processing. Please see new work.
  • Computational models of e-commerce, most recently game markets (letter in Science).
  • How do we measure and characterize the web, what's there and what is changing?


Brief Descriptions of Some New and Old Projects:

  • ChemxSeer was a search engine focused on the development of a cyberinfrastructure portal for environmental kinetic chemistry integrating chemistry specific search with data repositories and analysis tools.
  • Next Generation CiteSeer, CiteSeerx, has focused on the future of the CiteSeer search engine and digital library.
  • CiteSeer.IST was the Penn State home of the academic search engine and digital library CiteSeer and has been replaced by CiteSeerx.
  • ArchSeer was a proto-search engine for archaeology, primarily focused on map search.
  • eBizSearch was a CiteSeer-like niche search engine and digital library for business schools. eBizSearch was a predecessor to SmealSearch and was also a CiteSeer-like niche search engine for finding and indexing documents about e-business and e-commerce. All of these were rolled into BizSeer.
  • Acknowledgement search was part of the old CiteSeer.IST project and has now been replaced by AckSeer.
  • Mobile social networking using mobile phones, MobiSNA, was one of the first to use social network ranking of videos.
  • BotSeer was a specialty search engine devoted to harvesting and providing search functionality for web site robots.txt files and related information and software.
  • Inquirus was once a popular content-based metasearch engine.
  • Inquirus2 was a preference-based metasearch engine.

I have an interest in machine learning, pattern recognition, text mining, information extraction and retrieval, and artificial intelligence, especially as applied to the topics above. I like to use existing methods and develop new methods for novel automated applications in handling and extracting knowledge and information from massive data sets, temporal data, multimedia, etc. My past work has focused on the role of memory in learning and theoretical models of knowledge representation and capture, learning and intelligent multiagent systems, and applications of intelligent systems to computing and computer systems, finance, and signal processing.