Machine Learning made more meaningful through Graph Analytics
In this information age, data is the greatest asset. A successful digital transformation rests on the ability to master data to the fullest extent, despite its ever-expanding volume and complexity.
Our world is deeply connected, but the connections within data are often not obvious: information tends to be siloed and may represent only a partial picture of the real world. On top of this, traditional analytics looks at individual entities and can only model their properties in isolation.
Representing your data as a network graph enables you to capture context. Modelling the rich, relationship-driven structure of multiple data sources allows you to reveal the full picture of your domain. At CSIRO’s Data61, using new machine learning techniques and graph analytics, we can now uncover patterns and insights in data that were previously hidden from view.
If you imagine what you are connected to, you might think of your friends, your family, and places you regularly visit. In business, this could be your competitors, suppliers, where you operate, who you sell to and who you want to sell to.
In many cases, these data points come from multiple sources, each of which tells only part of the story on its own. Being able to connect all these data points and articulate the relationships between them gives a richness to the story that enables us to see the whole picture, and ultimately make better decisions.
Many technologies currently capture data in table format, providing a two-dimensional representation of information that can be enriched by connecting rows and tables together. However, these static relationships become far easier to see and understand when they are brought to life by a graph structure.
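To make the move from tables to a graph concrete, here is a minimal sketch in Python. The company names, relation names and the helper functions build_graph and two_hop are hypothetical, invented purely for illustration; the sketch only assumes that each table row can be read as a (source, relation, target) triple.

```python
from collections import defaultdict

# Hypothetical rows from two separate tables, read as (source, relation, target).
supply_rows = [("Acme", "supplies", "OurCo"), ("Bolt", "supplies", "OurCo")]
sales_rows = [("OurCo", "sells_to", "Shopwise"), ("Acme", "sells_to", "Shopwise")]

def build_graph(*tables):
    """Merge rows from any number of tables into one adjacency map."""
    graph = defaultdict(set)
    for table in tables:
        for source, relation, target in table:
            graph[source].add((relation, target))
            graph.setdefault(target, set())  # make sure every entity is a node
    return graph

def two_hop(graph, start):
    """Entities reached by following two relationships in a row from start."""
    return {far for _, near in graph[start] for _, far in graph[near]}

graph = build_graph(supply_rows, sales_rows)
indirect = two_hop(graph, "Bolt")  # Bolt -> OurCo -> Shopwise
```

Neither table on its own links Bolt to Shopwise; the connection only appears once the rows are joined into a single graph.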
What are the benefits of graph analytics?
“Organisations can benefit from graph analytics whenever they have complex connected data and a predictive problem,” says Alex Collins, Group Leader of Graph Analytics at CSIRO’s Data61.
“Graphs, or networks, can provide a new way of looking at connected data, through both visualisations and analytics. There is tremendous value to be unlocked by combining data sources for analysis, but this can be tricky when the data is very different. Graphs are flexible enough to handle this, and now we can perform analytics on these complex systems as well.”
According to Collins, the problem with traditional analytics is that it doesn’t consider the context of the data, treating individual people as rows in a table. “For many organisations, the way customers interact with the business or a product is more complex, and graph analytics can use this complexity to help solve these problems.”
What can we learn from graph analytics?
Having access to the bigger picture presents a new challenge: which data in an analytical graph is important for informing your next move? The three main problems we can solve are:
- Link prediction
- Community detection
- Graph classification
Link prediction is the process of inferring missing relationships, or finding hidden ones, between entities within the data. This method can provide answers to questions that arise when user relationships are hidden from us during data collection. Alternatively, given a snapshot of the network at the current time, you can predict how its structure will evolve in the future through the insertion or deletion of links. Link prediction is something we encounter every day, as it typically underpins recommender systems. For example, in an online social network, we can use link prediction to suggest new friends to members. Another example is product recommendation for a content provider or an e-commerce website.
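As an illustration of the friend-suggestion example, the sketch below scores every pair of people who are not yet connected by how many friends they share (the Jaccard coefficient, a simple neighbourhood heuristic). This is not the method used by Data61 or StellarGraph, which the article does not describe; the friendship data and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical friendship graph, stored as an undirected adjacency map.
friends = {
    "ana": {"ben", "cat"},
    "ben": {"ana", "cat", "dan"},
    "cat": {"ana", "ben", "dan"},
    "dan": {"ben", "cat", "eve"},
    "eve": {"dan"},
}

def jaccard(graph, u, v):
    """Share of the two nodes' combined neighbours that they have in common."""
    union = graph[u] | graph[v]
    return len(graph[u] & graph[v]) / len(union) if union else 0.0

def suggest_links(graph, top_n=3):
    """Rank currently unconnected pairs by Jaccard score, highest first."""
    candidates = [
        (jaccard(graph, u, v), u, v)
        for u, v in combinations(sorted(graph), 2)
        if v not in graph[u]  # only score pairs with no existing link
    ]
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    return candidates[:top_n]

suggestions = suggest_links(friends)
best_score, person_a, person_b = suggestions[0]
```

Here ana and dan come out on top because they share two friends (ben and cat) but are not yet connected. Learned approaches replace this hand-written score with one trained from data.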
Community detection can identify communities or clusters of nodes based on the graph’s structure, similarity of attributes, or both. There are many applications for community detection. One example is segmenting the users of a social network into communities based on their hobbies, without having to explicitly ask each user whether they are interested in a given topic. Instead of users having to answer topical questions such as ‘Do you like fiction or nonfiction?’, the algorithm can infer the answers automatically from data about the people each individual frequently interacts with. Such segmentation can be used to deliver targeted ads based on common traits of community members.
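One common way to detect communities from structure alone is to look for the grouping with the highest modularity, a score that rewards groups with more internal links than chance would suggest. The sketch below does this by brute force over every two-way split, which is only feasible for tiny graphs; it is an illustration under that assumption, not the approach used in StellarGraph, and the interaction data is hypothetical.

```python
from itertools import combinations

# Hypothetical interaction graph: two tight friend groups joined by one tie.
edges = [
    ("a", "b"), ("a", "c"), ("b", "c"),   # first group
    ("d", "e"), ("d", "f"), ("e", "f"),   # second group
    ("c", "d"),                           # single bridge between them
]

def modularity(edges, communities):
    """Newman modularity: within-group edge share minus its expected value."""
    m = len(edges)
    score = 0.0
    for group in communities:
        internal = sum(1 for u, v in edges if u in group and v in group)
        degree = sum((u in group) + (v in group) for u, v in edges)
        score += internal / m - (degree / (2 * m)) ** 2
    return score

def best_two_way_split(edges):
    """Try every split of the nodes into two non-empty groups; keep the best."""
    nodes = sorted({n for edge in edges for n in edge})
    first, rest = nodes[0], nodes[1:]
    best_score, best_split = float("-inf"), None
    # Pin the first node to one side so mirror-image splits are not tried twice.
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            left = {first, *extra}
            right = set(nodes) - left
            score = modularity(edges, [left, right])
            if score > best_score:
                best_score, best_split = score, (left, right)
    return best_score, best_split

score, (group_1, group_2) = best_two_way_split(edges)
```

The best split separates the two triangles at the bridge. Practical algorithms search for high-modularity groupings heuristically rather than exhaustively, and can also take node attributes into account.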
Graph classification assigns whole graphs to categories based on a particular similarity or trend in the data. This can be used to improve matching accuracy, provide a clearer visualisation of the similarity between segmented users, reveal hidden connections, and flag areas of interest that need your attention. For example, using an analytical graph to represent a chemical compound, it could be possible to predict whether or not a compound is cancer-inhibiting.
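The idea of labelling a whole graph can be sketched with a toy classifier: each graph is summarised by a few structural features, and an unseen graph takes the label of the most similar training graph. The "chain" and "ring" labels, the edge lists and the function names are hypothetical and carry no chemical meaning; a real compound classifier would use atom and bond attributes and a learned model.

```python
from collections import Counter
from math import dist

def degree_features(edges):
    """Describe a graph by the share of its nodes with degree 1, 2, and 3 or more."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    buckets = Counter(min(d, 3) for d in degree.values())
    return (buckets[1] / n, buckets[2] / n, buckets[3] / n)

# Hypothetical labelled training graphs, given as edge lists.
training = [
    ([(0, 1), (1, 2), (2, 3)], "chain"),
    ([(0, 1), (1, 2), (2, 3), (3, 4)], "chain"),
    ([(0, 1), (1, 2), (2, 0)], "ring"),
    ([(0, 1), (1, 2), (2, 3), (3, 0)], "ring"),
]

def classify(edges, training):
    """Give a graph the label of its nearest training graph in feature space."""
    target = degree_features(edges)
    _, label = min((dist(target, degree_features(g)), lbl) for g, lbl in training)
    return label

# Two unseen graphs: a six-node path, and a six-node cycle with one chord.
path_graph = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
chorded_cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3)]

path_label = classify(path_graph, training)
cycle_label = classify(chorded_cycle, training)
```

The path has two end nodes of degree 1, so it lands closest to the chain examples, while the chorded cycle has none and lands closest to the rings.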
How Data61 is using graph analytics
“We’re currently working with Australia’s law enforcement agencies, where the data is especially complex, varied and connected,” says Collins. “We also have some upcoming projects applying graph analytics to cybersecurity and genetic research.
“We’re looking at problems around entity resolution and using predictive analytics in areas such as fraud and anti-money laundering, while we’re also working on a project about HR analytics and understanding attrition rates in businesses.”
Data61’s StellarGraph uses cutting-edge engineering and data science to reveal in-depth and contextualised insights from complex data patterns. Designed by an expert team of engineers, data scientists, researchers, DevOps specialists, product managers and user experience planners, StellarGraph is one of Australia’s leading insights analysis platforms.
Through StellarGraph, we’ve been able to predict which genes are associated with Alzheimer’s disease, detect potential oxycodone abusers, and map the spread of negative social media activity. Combining three core tools (an open source library, a distributed platform and a visualisation network), StellarGraph allows you to uncover insights with the next generation of data science technology.
To learn more about StellarGraph and what it can do for you, click here. To read the full Machine Learning Through Graphs paper, which continues with ‘Node classification example using a Graph Convolutional Network’, click here.
Article adapted from Dr. Pantelis Elinas’s Knowing Your Neighbours: Machine Learning on Graphs.