From Monolingual Multiword Expression Discovery to Multilingual Concept Enrichment: an Ontology-based approach
Conference proceedings contribution
Publication date:
2022
Abstract:
In this paper, we present a methodology for the semantic enrichment of cultural heritage (CH) data, based on the use of ontologies and Linked Data. The proposed method aims at developing domain-specific resources enriched with multilingual conceptual information, starting from monolingual RDF data. In particular, our approach begins with a Multiword Expression (MWE) discovery process to select a starting list of domain-specific candidate mentions. Subsequently, we perform a concept discovery phase in order to link these mentions to closely matching DBpedia concepts through the use of two similarity measures. The semantic information related to these concepts is used to further filter the candidates and obtain representative mention-concept pairs by reweighting the automatically computed scores using a graph representation.
We test our methodology on biographical information about authors extracted from the Europeana Data Collection. The final result is a resource of semantically enriched data, containing a list of domain-specific keywords and MWEs together with the DBpedia concepts they strongly match, and the multilingual labels representing these specific concepts.
CRIS type:
4.1 Conference proceedings contribution
Keywords:
MWE discovery · Concept Discovery · Ontology
Authors:
Nolano, Gennaro
Book title:
Proceedings of the International Conference EUROPHRAS 2022