Nhung Nguyen, UX Designer at CSIRO’s Data61, on making graph machine learning a user-friendly experience.  

As a User Experience designer at CSIRO’s Data61, I analyse all aspects of a user’s interaction with a company, its services, its products, and its online platforms. What drives us is designing meaningful products that are intuitive, meet user needs, and delight. 

I collaborate daily with people who build machine learning algorithms for large-scale graph networks, and we’re currently designing a visualisation tool people can use to surface the hidden insights from these models and make data-informed decisions in often high-risk scenarios. 

For others like me who may be equally amazed by this rapidly emerging technology and just as drawn to its budding relationship with UX – you’ve come to the right place.  

In this article, I’ll explore what machine learning on graphs is and how we built an end-to-end story to define users, their relationships, and their different needs. I’ll also share three key challenges I faced in designing visualisation solutions for graph machine learning, and how returning to UX design 101 resulted in an accessible, purposeful outcome. 

Building the scenario 

I quickly discovered that once you learn about graphs, you’ll start to see them everywhere. Many real-world datasets can be naturally represented as networks or graphs, with nodes representing entities and links representing relationships or interactions between them. 

My work represented in a graph.
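As a minimal illustration of that representation (the handles and relationships here are invented for the example), nodes and links map directly onto a graph data structure, here built with the networkx library:

```python
import networkx as nx

# A tiny hypothetical social graph: nodes are users (entities),
# directed edges are "retweeted" interactions between them.
g = nx.DiGraph()
g.add_edge("alice", "bob")    # alice retweeted bob
g.add_edge("bob", "carol")
g.add_edge("alice", "carol")

print(g.number_of_nodes())       # 3 entities
print(list(g.successors("alice")))  # the users alice has retweeted
```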

As for managing the vast, ever-increasing amount of connected data being generated in this data-driven age, this is when machine learning comes into play.  

Machine learning on graphs can draw powerful insights from connected data, like predicting relationships between entities that are missing from the data or resolving unknown entities across different networks. 
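Predicting missing relationships is commonly framed as link prediction. As an illustration only (not the approach our team uses), one simple baseline scores unconnected node pairs by how much their neighbourhoods overlap:

```python
import networkx as nx

# Toy graph with invented nodes; "a" and "d" are not directly connected.
g = nx.Graph([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])

# Jaccard coefficient: shared neighbours / all neighbours.
# Higher scores suggest a link is more likely to be missing from the data.
candidates = [("a", "d"), ("b", "d")]
scores = {(u, v): p for u, v, p in nx.jaccard_coefficient(g, candidates)}
```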

My job as a UX Designer is to question what problems our users can solve with graph machine learning.  

Graphs have proven to be well-suited to crime investigation, and the potential of using machine learning on graphs for investigative analytics in law enforcement is immense. The challenge is that understandable security restrictions in this domain make access to data impossible. 

In the absence of data, it’s difficult to develop a tool that can analyse or visualise it. When products are developed technology-first, they risk failing to deliver what the user actually needs. It can also be difficult for teams to establish a meaningful direction, ending up with something like this: “Let’s cater for all types of tasks, data and scenarios…!” 

In mapping our product, we ended up with a long list of features on a roadmap; however, we needed something to connect these features in a meaningful way, to demonstrate the value of the product, and to help us prioritise. We needed to build a scenario. 

We found an open-source Twitter dataset containing 100k users and 20 million tweets.  

A small portion of the dataset had users labelled as ‘hateful’ or ‘normal’ (not hateful). (Read ‘Characterizing and Detecting Hateful Users on Twitter’ for more detail about how these labels were created, and an explanation of labelled versus non-labelled data for supervised and unsupervised machine learning approaches). 

The scenario: could machine learning on graphs be used to predict ‘hateful’ users in online social networks? 

We used the profiles already labelled as hateful or normal as training data for the machine learning models before running our algorithms on the dataset.  

The results were promising, demonstrating that the model could indeed predict hateful Twitter users. In addition, I had user research from the law enforcement domain outlining two distinct user types: the data scientist, and the intelligence analyst.  

I also had a mapping of how they work together and the problems they encounter when working with data. 

Combined, this information gave me the base ingredients to build a scenario around the features.
 

| The first challenge | 

Characters ‘borrowed’ from Scott McCloud’s Google Chrome comic

Ian and Dan 

The first player in my scenario was Ian, an intelligence analyst, whose job is to watch social media and other public data sources to flag hateful behaviours and assess risks. This manual process means scrolling through a long list of feeds, every day, looking for hateful language or hashtags. 

Manual labelling is labour-intensive. Say it takes just five minutes to manually check whether a user is hateful or not hateful, then labelling 100K users (equivalent to our Twitter dataset) would take just shy of 12 months for one person working 24/7. Additionally, new research shows there’s a correlation between retweets and the spread of online extremism. 
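A quick back-of-envelope check of that estimate:

```python
# Manual labelling estimate: 5 minutes per user, 100k users.
users = 100_000
minutes_per_user = 5

total_minutes = users * minutes_per_user
days = total_minutes / 60 / 24    # one person working round the clock
months = days / 30.4              # average month length

# Roughly 347 days non-stop, i.e. just shy of 12 months.
```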

So, I paired Ian with Dan, a data scientist who specialises in machine learning on graphs. In my scenario, Dan plans to use an algorithm that draws on the retweet network to infer whether each user in the remaining, unlabelled portion of the data is likely to be hateful. This will produce a shortlist of suspects for Ian to investigate further, which he can filter by, for example, characteristics of the data or the confidence of the label. 
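Dan’s actual algorithm isn’t described here, so purely as an illustration of the idea, here is a naive label-spreading sketch over a toy retweet network (handles, edges and seed labels are all invented): seed labels from the reviewed portion propagate to unlabelled neighbours, and each user ends up with a score Ian could treat as label confidence.

```python
import networkx as nx

# Toy retweet network with hypothetical handles; edges mean "retweeted".
g = nx.Graph()
g.add_edges_from([
    ("u1", "u2"), ("u2", "u3"),   # cluster around labelled-hateful u1
    ("u4", "u5"), ("u5", "u6"),   # cluster around labelled-normal u4
    ("u3", "u4"),                 # a bridge between the two clusters
])

# Seed labels from the manually reviewed data: 1 = hateful, 0 = normal.
seeds = {"u1": 1.0, "u4": 0.0}

# Naive label spreading: unlabelled nodes repeatedly take the mean of
# their neighbours' scores; seed labels stay fixed.
scores = {n: seeds.get(n, 0.5) for n in g}
for _ in range(50):
    for n in g:
        if n not in seeds:
            scores[n] = sum(scores[m] for m in g[n]) / g.degree(n)

# Scores near 1 suggest "hateful"; sorting gives Ian a ranked shortlist.
shortlist = sorted(scores.items(), key=lambda kv: -kv[1])
```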

The story is starting to come together – there’s a context, user types, the problems they’re trying to solve, insight on how graph and machine learning might be useful, an idea of how the users might approach the problem, and the intended outcome. 

When it came to user testing our tools, this storyboard acted as a context snapshot, which was especially effective for participants unfamiliar with machine learning processes and terminology. 
 

| The second challenge | 

Untangling the hairball 

The side-effect of working on something cutting-edge is that there is little, or nothing, to copy or iterate from. 

An example of this is a basic graph interaction: expanding the ‘neighbourhood’ of a node (or entity). As the name suggests, this is when we look at which other entities connect to the chosen entity in a graph. 
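In code, neighbourhood expansion amounts to pulling in everything within k hops of the chosen entity. A minimal sketch, using networkx’s bundled karate-club graph as a stand-in dataset:

```python
import networkx as nx

def expand_neighbourhood(graph, node, hops=1):
    """Return the subgraph of all nodes within `hops` of `node`."""
    reachable = nx.single_source_shortest_path_length(graph, node, cutoff=hops)
    return graph.subgraph(reachable)

g = nx.karate_club_graph()
one_hop = expand_neighbourhood(g, 0, hops=1)
two_hop = expand_neighbourhood(g, 0, hops=2)
# Each extra hop can balloon the view, which is the "hairball" problem.
```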

It sounds simple enough until there are hundreds, thousands or millions of entities. Unsurprisingly, you soon end up with an impossibly complicated graph.  

That’s why we need intuitive, meaningful ways to interact with graphs, so users can discover rich insights that can enhance successful decision making.
