CorpusSense

Research

Two decades of computational linguistics research — from sentiment analysis to AI-powered corpus understanding.

A lineage of innovation

Each tool builds on the last — refining approaches to computational linguistics and expanding what automated analysis can achieve.

2014
Sentitext

Sentiment analysis of Spanish texts

2017
Lingmotif

Multi-language sentiment analysis tool

2021
DisParSA

Discourse parser for sentiment analysis

2024
CorpusSense

AI-powered multi-modal corpus analysis platform

Three-year research plan

Funded by Spain's Ministerio de Ciencia, Innovación y Universidades (PID2023-152406NB-I00). A systematic approach to building comprehensive linguistic analysis infrastructure.

Year 1
Foundation & Corpus Assembly

Compile 10 thematic corpora across 23 languages. Develop annotation schemas, metadata standards, and the initial processing pipeline.

Year 2
AI Integration & Analysis

Integrate LLM-based analysis (Qwen 2.5) and implement BERTopic topic modeling, embedding-based semantic search, and the 15-aspect insight extraction system.

Year 3
Validation & Dissemination

User studies, platform optimization, academic publications, and technology transfer to industry. Open access for the research community.

Selected publications

2024
CorpusSense: A web-based platform for AI-powered corpus analysis
Moreno-Ortiz, A. J., Language Resources and Evaluation Conference (LREC-COLING)
2021
DisParSA: A corpus-based approach to discourse parsing for sentiment analysis
Moreno-Ortiz, A. J. et al., Procesamiento del Lenguaje Natural (SEPLN)
2019
Lingmotif 2: Integrating text analytics and visualisation
Moreno-Ortiz, A. J., Journal of Research Design and Statistics in Linguistics and Communication Science
2017
Lingmotif: Sentiment analysis for the digital humanities
Moreno-Ortiz, A. J., Proceedings of the Software Demonstrations of EACL

Expected impact

Advancing the intersection of computational linguistics, NLP, and AI — with practical applications across academia and industry.

01 Open-access corpus resources for the global research community
02 Democratizing AI-powered text analysis for non-technical researchers
03 Technology transfer to media monitoring, brand analysis, and compliance sectors
04 Advancing multilingual NLP methodologies across 23 languages