Text mining is the process of automatically extracting knowledge from unstructured natural language documents. It aims to support users in dealing with large amounts of textual information. Examples of specific text mining tasks are entity detection, summarization, and opinion mining. Because natural language is complex and ambiguous, the analysis is broken down into individual processing steps, which draw on techniques from machine learning, natural language processing, and semantic computing.
The goal of this project is to enrich the text mining pipelines developed at KeaText for processing legal documents. Specifically, the analysis is to be extended with a topic segmentation module tailored to the legal domain and the application's requirements. Automatic topic segmentation, also known as text tiling, divides a document into individual parts, each representing a distinct theme. Topic segmentation is known to improve several information retrieval and text analysis tasks. The project comprises the following tasks: (1) a survey of the existing research literature to identify suitable methods and tools; (2) the design of a new topic segmentation algorithm specifically for legal documents; and (3) the implementation and evaluation of this algorithm based on the General Architecture for Text Engineering (GATE) framework.
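To make the idea of topic segmentation concrete, the following is a minimal sketch in Python of a lexical-cohesion segmenter in the spirit of text tiling. It is not the algorithm designed in this project and does not use GATE. The function names, the stopword list, the window size, and the threshold are all illustrative assumptions. The sketch compares the vocabulary on either side of each sentence gap and places a boundary where the cosine similarity falls below a fixed threshold, which is a simplification of the depth scoring used in published text tiling methods.

```python
import math
import re
from collections import Counter

# Illustrative stopword list (an assumption); a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "on", "is", "are", "by", "shall",
             "has", "over", "each", "in", "to", "and"}


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", sentence.lower())
            if t not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def segment(sentences, window=2, threshold=0.1):
    """Return indices i such that a topic boundary falls before sentences[i].

    For each gap between sentences, the `window` sentences on the left are
    compared with the `window` sentences on the right; a boundary is placed
    where their similarity is below `threshold`.
    """
    bags = [Counter(tokenize(s)) for s in sentences]
    boundaries = []
    for gap in range(1, len(sentences)):
        left = sum(bags[max(0, gap - window):gap], Counter())
        right = sum(bags[gap:gap + window], Counter())
        if cosine(left, right) < threshold:
            boundaries.append(gap)
    return boundaries


sample = [
    "The tenant shall pay rent monthly.",
    "Rent is due on the first day of each month.",
    "The court has jurisdiction over disputes.",
    "Disputes are resolved by the court.",
]
print(segment(sample))  # → [2]
```

In this toy example the boundary falls before the third sentence, where the vocabulary shifts from rent to dispute resolution. A segmenter for real legal documents would likely need domain-specific cues, such as clause numbering or headings, beyond word overlap alone.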
Faculty Supervisor: Dr. Rene Witte
Student: Nona Naderi
Partner: KeaText
Discipline: Computer science
Sector: Information and communications technologies
University: Concordia University
Program: Accelerate