Natural Language Understanding and Generation - QC-134

Preferred Disciplines: Machine learning and natural language processing, Master's or PhD
Company: Thales Research and Technology (TRT) Canada
Project Length: 4-12 months or more (1-2 units or more)
Desired start date: May 2018 or as soon as possible
Location: Quebec City or Montreal; remote internships are also possible.
No. of Positions: 1
Preferences: Language: no preference

About Company:

Established in 1972, Thales Canada is a leading electronic solutions provider for the Transportation, Defence & Security, and Aerospace sectors. Thales Canada employs staff across main sites located in Montreal, Quebec City, Ottawa, Toronto and Vancouver. The role of Thales Research and Technology (TRT) is to extend the influence of Thales within the scientific and technical communities, offering a platform for innovation and knowledge sharing, and attracting talented science graduates to build on our acquired expertise. Thales needs to obtain increasingly sophisticated technologies, particularly in detection, analysis and decision-making technology, in order to design and develop critical information systems. 

Project Description:

NLP techniques have been used and tested for several years in different environments and for different applications and domains. The performance of a Natural Language Understanding (NLU) toolbox depends closely on the quality of the text and on the specific knowledge domain. Social media content typically uses short sentences with simple grammar and tends to include specific jargon and abbreviations. Grammatical rules are not always respected and spelling errors are common. These characteristics are also common in human-generated military intelligence and tactical reports. The first objective of the project is to automatically generate "human-like" reports, in both French and English, from structured data. The second objective is to create a knowledge extraction toolbox for use with both social media reports and military intelligence reports, again in both French and English.

Background and required skills

Research Objectives/Sub-Objectives:

  • Information extraction from unstructured data
  • Automated understanding of natural human languages in a specific knowledge domain (for both French and English)
  • Automated generation of natural language (syntactically and semantically correct, and relevant) in a specific knowledge domain (for both French and English)

Methodology:

  • To be determined

Expertise and Skills Needed:

  • Machine learning, with a focus on Natural Language Processing (NLP), including Natural Language Understanding (NLU) and Natural Language Generation (NLG)
  • Knowledge of Python, Java and experience with NLP toolkits
  • Source Control (Git)
  • Development and testing under Windows and Linux environments
  • Graph Databases (OrientDB, Neo4j, or similar)
  • Semantic web technologies (Ontology)
  • Knowledge of or previous studies in linguistics will be a plus
  • Knowledge of both French and English will be a plus

 

For more information or to apply to this applied research position, please:

  1. Check your eligibility and find more information about open projects.
  2. Interested students need approval from their supervisor and should send their CV to Ingrid Saba at isaba(a)mitacs.ca, along with a link to their supervisor’s university webpage.
Program: