Collaboration with industry partners

Cognitive Claims AI


NSERC Engage project with IMC Business Architecture (in progress, 2017)

IMC is developing a tool for the Property & Casualty (P&C) insurance industry. P&C companies are seeking ways to minimize their cost of claims (currently running at some 60% of premiums) as well as to increase their rate of fraud detection, which they estimate currently catches only about 10% of actual fraud.

IMC believes it is possible to improve this performance by using available data to predict the behaviour of the claimant and treat them appropriately. The most useful data would be the customer's statement of claim; however, as this takes the form of a telephone interview, it is not readily usable for algorithmic modelling. IMC is therefore seeking a way of turning the content of the statement of claim into data usable for predictive modelling of the claim outcome. IMC has identified a number of tools for converting natural-language data into numerical scores, but the prediction problems under consideration are non-standard and so require the development of novel approaches to data modelling. The opportunity to collaborate with Dr. Pralat in researching the potential for such algorithms will allow IMC to determine whether this tool represents a viable opportunity.
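
As a purely illustrative sketch of the kind of pipeline involved (not IMC's actual tool), transcribed claim statements could be converted into numerical features with a standard text vectorizer and fed to a classifier whose predicted probability serves as the claim score; the sample statements, labels, and model choice below are all assumptions.

```python
# Illustrative sketch only: transcribed claim statements -> numerical
# features -> claim-outcome score. Sample texts and labels are
# hypothetical; scikit-learn's TF-IDF vectorizer stands in for the
# natural-language scoring tools mentioned above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical transcribed statements of claim and outcomes
# (1 = flagged for further review, 0 = routine settlement).
statements = [
    "rear-ended at a red light, minor bumper damage",
    "house fire started in the kitchen overnight",
    "laptop stolen from parked car, no witnesses",
    "water damage from a burst pipe in the basement",
]
labels = [0, 1, 1, 0]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(statements, labels)

# Score a new claim: the predicted probability acts as the numerical
# score used downstream in the claim-outcome model.
print(model.predict_proba(["phone stolen from unlocked car"])[0])
```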

Modelling of Homophily and Social Influence via Random Graphs


NSERC Engage project with Alcatel-Lucent (completed, 2016-17)

The proliferation of cellular usage has given rise to massive amounts of data that, through data mining and analytics, promise to reveal a wealth of information on how agents interact with one another and affect one another's preferences. For example, cellular devices frequently communicate with cell towers, from which agent locations, and hence agent activity profiles, are readily available. The company aims to understand the interconnections between agent profiles and, in particular, how these profiles co-evolve over time.

It is through the lens of social learning that we propose to model and derive value from agent profiles. The first step is to understand the social environments of the agents, which are both shaped by the agents and, in turn, influence the agents to adopt new behaviours. Any relevant theory of social learning must therefore account for at least two interrelated factors: network change as a result of agent attributes, and attribute updating as a result of network position. Two leading hypotheses in this area are that network ties are formed and deleted based on similarity or difference in agent attributes (homophily), and that certain attributes are likely to diffuse through existing network ties (social influence). This project aims to determine whether homophily and social influence are good models of networks described by agent location data, and then use the resulting models to develop scalable analytics algorithms.
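
For intuition, the following is a minimal co-evolution sketch under our own assumptions (not the project's final model): with equal probability, either a discordant tie is rewired toward a same-attribute agent (homophily) or an agent adopts the majority attribute of its neighbours (social influence).

```python
# Minimal co-evolution sketch: binary-attribute agents on a random graph.
# All parameters and update rules here are illustrative assumptions.
import random

n, p, steps = 50, 0.1, 1000
random.seed(0)
attr = [random.randint(0, 1) for _ in range(n)]
edges = {(i, j) for i in range(n) for j in range(i + 1, n) if random.random() < p}

for _ in range(steps):
    if edges and random.random() < 0.5:   # homophily: rewire a discordant tie
        i, j = random.choice(sorted(edges))
        if attr[i] != attr[j]:
            cands = [v for v in range(n) if v != i and attr[v] == attr[i]]
            if cands:
                k = random.choice(cands)
                edges.discard((i, j))
                edges.add((min(i, k), max(i, k)))
    else:                                 # influence: adopt neighbourhood majority
        i = random.randrange(n)
        nbrs = [b if a == i else a for a, b in edges if i in (a, b)]
        if nbrs:
            attr[i] = 1 if 2 * sum(attr[v] for v in nbrs) >= len(nbrs) else 0

# Fraction of edges joining same-attribute agents (rises under both forces).
same = sum(attr[a] == attr[b] for a, b in edges)
print(f"assortative edge fraction: {same / len(edges):.2f}")
```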

Hypergraph Theory and Applications


Project with The Tutte Institute for Mathematics and Computing (TIMC) (completed, 2015-16)

Myriad problems can be described in hypergraph terms; however, the theory and tools are not sufficiently developed to allow most problems to be tackled directly within this context. Hypergraphs are of particular interest in the field of knowledge discovery, where most problems currently modelled as graphs would be more accurately modelled as hypergraphs. Researchers in the knowledge discovery field are particularly interested in generalizing the concepts of modularity and diffusion to hypergraphs. Such generalizations require a firm theoretical base on which to build. Unfortunately, although hypergraphs were formally defined in the 1960s (and various realizations of hypergraphs were studied long before that), the general formal theory is not as mature as the applications of interest to the TIMC require. The TIMC wishes to encourage the development of this formal theory, in conjunction with the development of concrete applications.
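
As a toy illustration of the objects involved (not the TIMC's formalism), a hypergraph can be stored simply as a list of hyperedges, and a naive diffusion step can average each vertex's mass over its incident hyperedges; the graph, seed vertex, and laziness parameter below are arbitrary.

```python
# Toy hypergraph diffusion sketch (illustrative only). A hypergraph is a
# list of hyperedges (vertex sets); one diffusion step mixes each vertex's
# mass with the mean mass of the hyperedges containing it.
edges = [{0, 1, 2}, {1, 3}, {2, 3, 4, 5}]   # hypothetical hypergraph
mass = {v: 0.0 for e in edges for v in e}
mass[0] = 1.0                                # seed vertex

def diffuse(mass, edges, alpha=0.5):
    """One lazy diffusion step: keep alpha of the mass in place and move
    the rest toward the average over the hyperedges incident to each vertex."""
    new = {}
    for v in mass:
        incident = [e for e in edges if v in e]
        pulled = sum(sum(mass[u] for u in e) / len(e) for e in incident)
        new[v] = alpha * mass[v] + (1 - alpha) * pulled / len(incident)
    return new

for _ in range(3):
    mass = diffuse(mass, edges)
print({v: round(m, 3) for v, m in sorted(mass.items())})
```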

Relationship Mapping Analytics for Fundraising and Sales Prospect Research


NSERC Engage project with Charter Press Ltd. (completed, 2015-16)

Third Sector Publishing (TSP) has been successful in selling CharityCAN subscriptions to fundraising organizations across Canada. This success is the result of incorporating a large volume of data from different sources that prospect researchers find useful as they attempt to identify potential donors for their organizations. For example, the Canadian data that TSP licenses from Blackbaud, Inc. includes over 7.3 million donation records: records of which individuals, foundations, and companies made donations to which organizations.

In addition to licensed data, there is an abundance of publicly available data that will be useful to CharityCAN subscribers. TSP will be able to extract this data from websites through automated extraction processes. For example, most law firms in Canada publish freely accessible biographies of their lawyers, and TSP will be able to add these biographies to the growing volume of useful data already found on the CharityCAN platform.

The challenge for CharityCAN is connecting this growing number of data points. Relationship mapping refers to the identification of relationships among individuals, and it becomes particularly useful when it can also predict the strength or weakness of each relationship. CharityCAN requires sophisticated machine learning algorithms and data mining tools that identify relationships among individuals, private-sector companies, and non-profit institutions, and then predict the strength (or lack thereof) of these relationships. With the resulting (complex) networks in hand, hybrid clustering methods could be used to extract groups of users that are potentially of interest to the subscribing institution or company (for example, for personalized and targeted solicitation).
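
A deliberately simple sketch of the first step, under hypothetical data: relationship strength between two people is taken to be the number of organizations they share (co-donations, shared board seats, and so on), which already yields a weighted network that clustering methods can operate on.

```python
# Illustrative sketch: infer relationship strength from shared affiliations.
# All names and records are hypothetical; strength here is simply the
# number of organizations two people have in common.
from collections import defaultdict
from itertools import combinations

affiliations = {                  # person -> known organizations (hypothetical)
    "A. Smith": {"United Way", "Hospital Fdn", "Firm LLP"},
    "B. Jones": {"United Way", "Firm LLP"},
    "C. Lee":   {"Hospital Fdn"},
    "D. Patel": {"Art Gallery"},
}

strength = defaultdict(int)
for p, q in combinations(affiliations, 2):
    strength[(p, q)] = len(affiliations[p] & affiliations[q])

# Rank candidate ties by predicted strength; pairs with strength 0 are
# treated as having no detected relationship.
ranked = sorted((w, pair) for pair, w in strength.items() if w > 0)
for w, pair in reversed(ranked):
    print(pair, "shared organizations:", w)
```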

Web Visitor Engagement Measurement and Maximization


Ontario Centres of Excellence Talent Edge Fellowship Program with The Globe and Mail (completed, 2014-15)

A key measure of how well a news website is doing in providing content, as well as of how attractive it is to advertisers, is how engaged its visitors are with the site. News websites need to maximize visitor engagement; however, they do not currently have an accurate way to measure it. Ideally, they could measure a visitor's time spent looking at the website, but the web analytics software available in the marketplace falls short in its ability to do this accurately: it always misses the last page of a visit, and it includes time that it should not (for example, when a visitor has physically walked away from the computer). The Globe and Mail is seeking a machine learning and big data solution to help them accurately measure engagement, and then optimize for it. They need tools that help them optimize the selection of articles promoted on their section homepages at any given time, as well as their ordering, so that engagement is maximized.
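
One common remedy, sketched here under our own assumptions rather than as The Globe and Mail's production solution, is client-side heartbeat events: the page pings every few seconds while the visitor is active, so the last page of a visit is no longer lost, and long gaps between pings can be capped so idle time is not counted.

```python
# Sketch: estimate engaged time from heartbeat pings instead of
# page-to-page deltas. Pings fire every ~5 s while the tab is active
# (hypothetical data); gaps longer than IDLE_CAP seconds are treated
# as the reader having walked away.
IDLE_CAP = 30  # seconds; assumed threshold, tuned in practice

def engaged_seconds(ping_times):
    """Sum inter-ping gaps, capping any gap at IDLE_CAP so idle time
    (reader away from the screen) is not counted as engagement."""
    total = 0.0
    for prev, curr in zip(ping_times, ping_times[1:]):
        total += min(curr - prev, IDLE_CAP)
    return total

# Hypothetical visit: steady reading, a long idle gap, then a short return.
pings = [0, 5, 10, 15, 20, 140, 145, 150]
print(engaged_seconds(pings))   # 60.0: the 120 s idle gap contributes only 30 s
```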

Utilizing big data for a business-to-business matching and recommendation system


NSERC Engage project with ComLinked Corp. (completed, 2014-15)

The social media industry is experiencing tremendous growth and innovation, with new business models being developed, especially in the B2C space. With the success of social media platforms such as Facebook, Twitter, and LinkedIn, the commercial segment has been looking to consolidate the main features and functionalities of these B2C platforms and apply them to solve real-life B2B problems. ComLinked is an online B2B platform where companies across all industries can create online business profiles and, in addition to their basic company information, list specific details such as their founding year, their products and services, and their customers' industries. Based on these elements, the platform uses matching algorithms to recommend companies for other companies to connect with. ComLinked Corp. is seeking to collaborate with the academic community to develop its core set of algorithms using machine learning and big data solutions.
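
As a hedged illustration of what such a matching algorithm might look like (the profiles, weights, and similarity measure below are our assumptions, not ComLinked's algorithms), company pairs can be scored by the overlap of their listed products and customer industries.

```python
# Illustrative matching sketch: score company pairs by Jaccard overlap of
# their listed products/services and customer industries. Profiles and
# blend weights are hypothetical.
def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

profiles = {
    "AcmeParts":  {"products": {"fasteners", "gaskets"}, "industries": {"automotive"}},
    "BoltWorks":  {"products": {"fasteners"},            "industries": {"automotive", "aerospace"}},
    "DataViz Co": {"products": {"dashboards"},           "industries": {"finance"}},
}

def match_score(p, q, w_products=0.6, w_industries=0.4):   # assumed weights
    return (w_products * jaccard(p["products"], q["products"])
            + w_industries * jaccard(p["industries"], q["industries"]))

# Recommend the best partner for AcmeParts.
others = [(name, match_score(profiles["AcmeParts"], prof))
          for name, prof in profiles.items() if name != "AcmeParts"]
print(max(others, key=lambda t: t[1]))
```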

A self-organizing dynamic network model increasing the efficiency of outdoor digital billboards


NSERC Engage project with KEEN Projection Media Ltd. (KPM) (completed, 2014)

KPM is developing a business model for infrastructure development and management (Coop Billboard Network - CBN - www.coopbn.com) with the goal of creating an optimal working platform that consolidates multiple LED outdoor billboards (of various designs, ages, models, suppliers, locations, etc.) under one umbrella, similar to what Expedia does for the hotel business. The company is looking for a dynamic system that assigns user requests to specific billboards and optimizes the network in a self-organizing manner. Modelling should play an important role in this system, since the system is expected to predict future requests and available time slots based on the history of the process as well as current trends. The system should have some artificial intelligence built in, not only to predict these events but also to self-correct the network's behaviour in order to increase the efficiency and global performance of the network.
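
A minimal sketch of the assignment core, with hypothetical boards, slots, and prices (not KPM's system): each incoming request is greedily placed in the cheapest billboard slot that is still free, and unmet requests are exactly the events the predictive component would try to anticipate.

```python
# Greedy assignment sketch (illustrative; not KPM's system): match incoming
# ad requests to billboard time slots, preferring the cheapest feasible slot.
slots = {                      # board -> set of free hourly slots (hypothetical)
    "board_A": {9, 10, 11},
    "board_B": {10, 11, 12},
}
price = {"board_A": 40.0, "board_B": 25.0}   # assumed price per slot

def assign(request_hours):
    """Assign each requested hour to the cheapest board with that hour free."""
    plan = []
    for hour in request_hours:
        options = [b for b, free in slots.items() if hour in free]
        if not options:
            plan.append((hour, None))   # unmet demand: a prediction target
            continue
        best = min(options, key=price.get)
        slots[best].discard(hour)
        plan.append((hour, best))
    return plan

print(assign([10, 10, 10]))   # the third request at 10:00 cannot be placed
```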

Exploiting Big Data for Customized Online News Recommendation System


NSERC Engage project with The Globe and Mail (completed, 2014)

The news industry is undergoing major changes, with competition from non-traditional, international competitors negatively impacting both readership levels (pageviews) and the ad revenue associated with each pageview. The Globe and Mail is seeking a machine learning and big data solution to help them come out on top in this period of change. A system that offers personalized content recommendations to each user would help greatly. However, because their content library, akin to a product catalog at a retailer, changes dramatically every minute with the arrival of fresh news articles, traditional recommender systems would have a very hard time recommending fresh articles well. Traditional systems also fail to account for popularity as a function of how heavily a piece of content was promoted, or for the business consideration of the revenue driven by a piece of content. This project will combine big data and advanced algorithms to account for these considerations while driving personalized content recommendations.
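
A sketch of one possible scoring rule under assumed weights and article fields (not The Globe and Mail's method): personal affinity, promotion-adjusted popularity, and revenue per view are blended and then discounted by an exponential recency decay so fresh articles surface quickly.

```python
# Illustrative scoring sketch; weights, decay rate, and article fields are
# assumptions for illustration only.
import time

W_AFFINITY, W_POPULARITY, W_REVENUE = 0.5, 0.3, 0.2   # assumed blend weights
HALF_LIFE_HOURS = 6.0   # assumption: recency value halves every 6 hours

def score(article, user_affinity, now=None):
    """Blend affinity, promotion-adjusted popularity, and revenue, then
    discount by recency so fresh articles rank higher."""
    now = now or time.time()
    age_h = (now - article["published"]) / 3600
    recency = 0.5 ** (age_h / HALF_LIFE_HOURS)
    # Normalize clicks by promotion impressions so heavily promoted pieces
    # are not mistaken for intrinsically popular ones.
    popularity = article["clicks"] / max(article["promo_impressions"], 1)
    base = (W_AFFINITY * user_affinity.get(article["topic"], 0.0)
            + W_POPULARITY * popularity
            + W_REVENUE * article["revenue_per_view"])
    return base * recency

article = {"topic": "business", "published": time.time() - 2 * 3600,
           "clicks": 800, "promo_impressions": 20000, "revenue_per_view": 0.4}
print(round(score(article, {"business": 0.9}), 3))   # ~0.43
```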

Personalized Mobile Recommender System


NSERC Engage project with BlackBerry (completed, 2013-14)

We are developing a series of recommendation algorithms to enhance the mobile user experience. The algorithms will use mobile user behavioural data and application content to determine the most relevant applications to recommend to end users. The system will be developed on the leading-edge big data platform Apache Hadoop, and the algorithms will need to be distributed across hundreds of computing nodes and scale to millions of users and items. The algorithms we design will be benchmarked against industry-standard algorithms for performance and scalability.
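
The item-to-item co-occurrence core of such a system can be sketched as follows (app names and usage logs are hypothetical); each per-user step is independent, which is what makes the computation straightforward to distribute across Hadoop nodes as map and reduce phases.

```python
# Sketch of an item-to-item co-occurrence recommender (illustrative only).
# The per-user pair counts ("map" phase) are independent and their sums
# ("reduce" phase) parallelize naturally across cluster nodes.
from collections import Counter
from itertools import combinations

usage = {                                # user -> apps installed/used
    "u1": {"maps", "email", "music"},
    "u2": {"maps", "email"},
    "u3": {"music", "email", "news"},
}

cooc = Counter()
for apps in usage.values():              # "map": emit per-user app pairs
    for a, b in combinations(sorted(apps), 2):
        cooc[(a, b)] += 1                # "reduce": sum counts per pair

def recommend(app, k=2):
    """Rank apps most often co-used with `app`."""
    scores = Counter()
    for (a, b), c in cooc.items():
        if app == a: scores[b] += c
        if app == b: scores[a] += c
    return scores.most_common(k)

print(recommend("maps"))                 # [('email', 2), ('music', 1)]
```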

Intelligent Rating System


NSERC Engage project with Mako (completed, 2012-13)

We are developing a series of formulas and algorithms to power a new online artificial-intelligence rating system. The core of the platform is the use of advanced statistical and technological indicators to determine the rank of the reviewed subject; its most valuable feature is the ability to assess the quality and merit of each individual review based on many interconnected variables.
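
A hedged sketch of the idea with invented signals and weights: each review's score is weighted by a crude merit estimate combining reviewer history, text length, and helpfulness votes, so low-effort reviews contribute less to the overall rating.

```python
# Illustrative quality-weighted rating sketch. The merit signals and blend
# weights below are hypothetical assumptions, not Mako's formulas.
def review_weight(r):
    """Combine interconnected quality signals into a single weight."""
    history = min(r["reviewer_reviews"], 50) / 50      # experienced reviewers
    detail = min(len(r["text"]), 500) / 500            # substantive text
    helpful = r["helpful_votes"] / (r["total_votes"] or 1)
    return 0.4 * history + 0.3 * detail + 0.3 * helpful

def weighted_rating(reviews):
    weights = [review_weight(r) for r in reviews]
    return sum(w * r["stars"] for w, r in zip(weights, reviews)) / sum(weights)

reviews = [
    {"stars": 5, "reviewer_reviews": 1,  "text": "great",   "helpful_votes": 0, "total_votes": 0},
    {"stars": 2, "reviewer_reviews": 40, "text": "x" * 400, "helpful_votes": 9, "total_votes": 10},
]
print(round(weighted_rating(reviews), 2))   # the detailed review dominates
```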

Dynamic clustering and prediction of taxi service demand


NSERC Engage project with Winston (completed, 2012)

The Winston mobile phone application transforms the archaic end-to-end taxi experience. By leveraging mobile technology and working with established, professional limousine service providers, the company is able to connect users to car service in a way that makes sense today. Although they have a large amount of potentially important and relevant data, they have no tools to use it to improve their system's efficiency. The goal of this project is to use the aggregated data to improve demand prediction. By better predicting where and when demand is likely to occur using historical data, it should be possible to better position drivers in order to minimize passenger wait time and maximize coverage. The algorithm should automatically adapt and improve as more and more data are aggregated.
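
For illustration only (the coordinates, hot spots, and hours below are synthetic), historical pickups can be clustered into demand zones with k-means and counted per zone and hour, giving a simple demand estimate for staging idle drivers.

```python
# Sketch with synthetic data throughout: cluster historical pickup
# coordinates into demand zones with k-means, then estimate demand per
# (zone, hour) from historical counts.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Hypothetical pickups around two hot spots (lat, lon), with hour-of-day.
downtown = np.column_stack([rng.normal(43.65, .01, 200), rng.normal(-79.38, .01, 200)])
airport  = np.column_stack([rng.normal(43.68, .01, 100), rng.normal(-79.61, .01, 100)])
coords = np.vstack([downtown, airport])
hours = rng.choice([8, 18], size=300)

zones = KMeans(n_clusters=2, n_init=10, random_state=0).fit(coords)

# Historical demand table: observed pickups per (zone, hour).
demand = {}
for z, h in zip(zones.labels_, hours):
    demand[(z, h)] = demand.get((z, h), 0) + 1

def predict(lat, lon, hour):
    """Look up expected pickups for the zone containing (lat, lon)."""
    z = zones.predict([[lat, lon]])[0]
    return demand.get((z, hour), 0)

print(predict(43.65, -79.38, 18))   # expected pickups downtown at 18:00
```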