Named Entity Disambiguation for Maritime-related Data Retrieved from Heterogenous Sources

Abstract

The article concerns the integration and disambiguation of data related to the maritime domain. It describes a system that collects data about several kinds of maritime-related entities (vessels, vessel types, ports, companies, etc.) from different internet sources, merges it, and feeds it into a single database. This process is not trivial, however, and several challenges must be addressed to carry it out successfully. Firstly, different sources may refer to the same entity in different ways, for example by using different text strings. Additionally, some of these references may be ambiguous, i.e. a reference may potentially point to more than one entity. To enable efficient analysis of data coming from different sources, such ambiguities must be resolved automatically in a preprocessing step, before the data is loaded into the database and used in further computations. The aim of the disambiguation process is to assign an artificial, unique identifier to each entity and then, where possible, automatically attach that identifier to every data item related to the entity. The article discusses the methods developed for resolving such ambiguities and presents their evaluation.

Authors and Affiliations

Jacek Małyszko, Witold Abramowicz, Milena Stróżyna

  • EP ID EP193364
  • DOI 10.12716/1001.10.03.12

How To Cite

Jacek Małyszko, Witold Abramowicz, Milena Stróżyna (2016). Named Entity Disambiguation for Maritime-related Data Retrieved from Heterogenous Sources. TransNav, the International Journal on Marine Navigation and Safety of Sea Transportation, 10(3), 465-477. https://europub.co.uk/articles/-A-193364