Fachinformationsdienst Linguistik launches third project phase
The Fachinformationsdienst Linguistik (FID Linguistik) is a project to strengthen research infrastructures in linguistics, especially with regard to the accessibility and findability of subject-specific information and resources. It is funded by the German Research Foundation’s programme “Specialised Information Services” and is one of 40 individual projects, each devoted to a particular academic field. Like most FID projects, the FID Linguistik is hosted at an academic library, the University Library Johann Christian Senckenberg at Goethe University Frankfurt, where it builds on the expertise and experience gained in curating a Special Subject Collection for general linguistics since 1950. The same institution hosts five other FIDs, which creates synergies within the library as well as a strong representation within the network of FIDs. As the FID Linguistik launched its third iteration in late 2023 (scheduled to run until 2026), this report outlines the current state of the project and the task areas planned for the current project phase.
The core task of all FIDs is to provide infrastructural services to Germany-based researchers and research projects. Many of these services can also be accessed or used by international researchers. The exception is licensed products such as online corpora and linguistic databases, which may only be offered to Germany-based institutions under the funding regulations of the German Research Foundation. Through web interfaces, all FIDs offer tools to search and locate relevant information resources such as print and e-publications, databases, or repositories, and they actively engage in subject-specific networking and exchange. The FIDs also contribute to the development of library and research data infrastructures, usually in collaboration with their academic partners. Furthermore, many FIDs engage in the digitisation of print publications and the administration of open-access publication solutions for their project partners.
The FID Linguistik provides most of its services through the project website, which offers facilities for searching multiple sources for publications, resources, and meta-information. In addition to the library catalogues, the search tool enables browsing of manually curated bibliographies, open-access repositories, and collections of web resources, spanning around 3 million entries. The FID Linguistik cooperates closely with the Bibliography of Linguistic Literature (BLL), a separate project also hosted at the University Library. The BLL is a manually curated and annotated bibliography of general, German, English, and Romance linguistics that has been continuously developed since 1971 and contains over 500,000 entries. While the BLL is available as a separate, licensed product, its underlying database and thesaurus are reused for the development of the FID Linguistik infrastructure. One of the recurring task areas establishes links between publications and their underlying research data through the continuous development of a controlled vocabulary in which databases, dictionaries, corpora, and tools are recorded as authority records. With entries indexed and identified through the Integrated Authority File (GND), it will be possible to reverse-search and cross-search between datasets, publications, and citing literature. This provides users with enhanced search facilities for linguistic literature and resources beyond a title and index search.
A main focus of the FID Linguistik has been the development of Linked Open Data (LOD) tools in collaboration with its partners. The standardised terminology for the categorisation and indexing of literature led to the creation of an ontology that will be further enriched with information from other resources such as Wikidata by using semantic equivalence relations. The resulting knowledge graph allows the exploration of relationships between data, publications, corpora, and other resources in the database through semantic web technologies. A desired outcome of the work is an LOD-based search function in which the index search is supported by ontology recommendations and references. The ontology will also constitute the terminological foundation of a full-text search on open-access publications based on Named Entity Linking. The code and training data of this new service will be released as open source for general reuse.
Apart from the aforementioned services that are globally accessible, the FID Linguistik is tasked by the German Research Foundation with managing certain services on behalf of the linguistics community in Germany. Print and online publications can be acquired upon user request (in accordance with the acquisition policies), and several electronic resources are offered through supra-regional, university-wide, and individual licences for Germany-based users registered with the FID. In 2023, licences for Linguistic Minorities in Europe Online (LME) and the Oxford Research Encyclopedia of Linguistics were added to the portfolio. In addition, established journals with an institutional affiliation to the German research landscape can apply to be hosted through the open-access e-publishing service of the FID Linguistik and the University Library, which is based on instances of Open Journal Systems.
Apart from the above-mentioned task areas, the next steps for the project put an emphasis on outreach, networking activities, and increasing the visibility of resources in and on endangered languages. The latter also involves including links to initiatives for endangered languages and curating resources for researchers working in these settings. With an improved index, research data, materials, and publications should become more visible and usable for researchers working on endangered and minoritised languages. Furthermore, the ontology will be extended by directly linking its concepts to additional external ontologies and knowledge bases such as the META-SHARE Ontology, Wikidata, and the GND.
To achieve these goals, the FID Linguistik is actively looking for user feedback and external input from experts on any (endangered) language. With the goal of collecting and enriching descriptive metadata on language archives and language data sets, all suggestions and links to credible, scientific websites that are not yet included on the website are very welcome. Furthermore, we are happy to receive feedback on the metadata tags assigned to each linguistic data set in the reference list, since not all resources have descriptive metadata and we cannot investigate the content of every resource ourselves. We therefore encourage researchers and community stakeholders to reach out, in order to increase the visibility and (re)use of the resources for their languages and research outputs.
Further information can be found on the project website. The team is happy to receive feedback and to hear from potential collaboration partners via info@linguistik.de.