Reconstructing the house from the ad: Structured prediction on real estate classifieds

This repository contains the dependency parsing code and instructions for obtaining the dataset presented in the paper "Reconstructing the house from the ad: Structured prediction on real estate classifieds" (EACL 2017).

The dataset includes 2,318 manually annotated property advertisements from a real estate company.

If you use part of the code or the dataset, please cite:

@InProceedings{E17-2044,
  author = 	"Bekoulis, Giannis
		and Deleu, Johannes
		and Demeester, Thomas
		and Develder, Chris",
  title = 	"Reconstructing the house from the ad: Structured prediction on real estate classifieds",
  booktitle = 	"Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers",
  year = 	"2017",
  publisher = 	"Association for Computational Linguistics",
  pages = 	"274--279",
  location = 	"Valencia, Spain",
  url = 	"http://aclweb.org/anthology/E17-2044"
}

and

@article{BEKOULIS2018100,
title = "An attentive neural architecture for joint segmentation and parsing and its application to real estate ads",
journal = "Expert Systems with Applications",
volume = "102",
pages = "100 - 112",
year = "2018",
issn = "0957-4174",
doi = "https://doi.org/10.1016/j.eswa.2018.02.031",
url = "http://www.sciencedirect.com/science/article/pii/S0957417418301192",
author = "Giannis Bekoulis and Johannes Deleu and Thomas Demeester and Chris Develder"
}

Pre-requisites

The code is written for Python 2.7. Some of the Python packages needed to run these files are listed below; they are best installed using pip.

scikit-learn (machine learning)
pandas (data manipulation)
pandas_confusion (performance measures)
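
Before running the code, it can help to check which of these packages are importable in the current environment. The following is a minimal sketch and not part of the repository: the helper name `missing_packages` is hypothetical, and the import names `sklearn`, `pandas`, and `pandas_confusion` are assumed to correspond to the packages listed above.

```python
import importlib


def missing_packages(module_names):
    """Return the module names that cannot be imported in this environment."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            # The module is absent (or one of its own imports is missing).
            missing.append(name)
    return missing
```

Calling `missing_packages(["sklearn", "pandas", "pandas_confusion"])` returns the subset of names that still needs to be installed with pip; an empty list means all three imports succeeded.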

Dependency parser

The repository contains the four dependency parsing models that we have developed: Threshold, Edmond, Structured Prediction via the Matrix-Tree Theorem (MTT), and Transition-based. To run them, execute run_script.py, which serves as the main entry point.
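
To give an idea of what the Matrix-Tree Theorem provides, the sketch below computes the sum, over all dependency trees rooted at a given node, of the product of their edge weights. This quantity is the determinant of the graph Laplacian with the root row and column removed. It is an illustrative example only, not the code in matrix_tree_theorem.py: the function names, the `weights[h][m]` layout (weight of the edge from head `h` to modifier `m`), and the use of exact fractions in pure Python are all assumptions made for this example.

```python
from fractions import Fraction


def laplacian_minor(weights, root=0):
    """Build the in-degree Laplacian with the root row and column removed."""
    n = len(weights)
    nodes = [i for i in range(n) if i != root]
    minor = []
    for h in nodes:
        row = []
        for m in nodes:
            if h == m:
                # Diagonal: total weight of all edges entering node m.
                row.append(sum(Fraction(weights[k][m])
                               for k in range(n) if k != m))
            else:
                # Off-diagonal: negated weight of the edge h -> m.
                row.append(-Fraction(weights[h][m]))
        minor.append(row)
    return minor


def determinant(matrix):
    """Exact determinant via Gaussian elimination on Fraction entries."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def tree_partition_function(weights, root=0):
    """Sum over all spanning trees rooted at `root` of their edge-weight products."""
    return determinant(laplacian_minor(weights, root))
```

With all weights set to 1, the result is simply the number of rooted spanning trees. In a globally trained model, the same determinant acts as the normalization constant over all candidate trees.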

Dataset

To obtain the anonymized dataset, fill in and sign this form, then send it via email to giannis.bekoulis@gmail.com. Follow the instructions and we will get back to you as soon as possible with information about how to download the anonymized dataset.

