Bayesian Non-Exhaustive Classification A Case Study: Online Name Disambiguation using Temporal Record Streams

dc.contributor.author: Zhang, Baichuan
dc.contributor.author: Dundar, Murat
dc.contributor.author: Al Hasan, Mohammad
dc.contributor.department: Department of Computer and Information Science, School of Science
dc.date.accessioned: 2017-08-23T19:29:00Z
dc.date.available: 2017-08-23T19:29:00Z
dc.date.issued: 2016-10
dc.description.abstract: The name entity disambiguation task aims to partition the records of multiple real-life persons so that each partition contains records pertaining to a unique person. Most existing solutions for this task operate in batch mode, where all records to be disambiguated are initially available to the algorithm. However, more realistic settings require that name disambiguation be performed in an online fashion, and that the method be able to identify records of new ambiguous entities that have no preexisting records. In this work, we propose a Bayesian non-exhaustive classification framework for solving the online name disambiguation task. Our proposed method uses a Dirichlet process prior with a Normal × Normal × Inverse Wishart data model, which enables identification of new ambiguous entities who have no records in the training data. For online classification, we use a one-sweep Gibbs sampler, which is very efficient and effective. As a case study, we consider bibliographic data in a temporal stream format and disambiguate authors by partitioning their papers into homogeneous groups. Our experimental results demonstrate that the proposed method outperforms existing methods on the online name disambiguation task.
dc.eprint.version: Author's manuscript
dc.identifier.citation: Zhang, B., Dundar, M., & Al Hasan, M. (2016, October). Bayesian non-exhaustive classification a case study: Online name disambiguation using temporal record streams. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management (pp. 1341-1350). ACM. https://doi.org/10.1145/2983323.2983714
dc.identifier.uri: https://hdl.handle.net/1805/13893
dc.language.iso: en
dc.publisher: ACM
dc.relation.isversionof: 10.1145/2983323.2983714
dc.relation.journal: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
dc.rights: Publisher Policy
dc.source: Author
dc.subject: bayesian non-exhaustive classification
dc.subject: online name disambiguation
dc.subject: temporal record stream
dc.title: Bayesian Non-Exhaustive Classification A Case Study: Online Name Disambiguation using Temporal Record Streams
dc.type: Article
Files
Original bundle
Name: Zhang_2017_bayesian.pdf
Size: 218.36 KB
Format: Adobe Portable Document Format