The future of science is data
What does the future of science look like? And what role does data play in it? This, and much more, is what CSIRO’s Chief Scientist, Dr Cathy Foley, explored in her keynote session at this year’s D61+ LIVE.
Cathy delivered the closing keynote on Day 1. In her session, she looked at how scientific discovery has changed over time and why more data-driven science is critical for future impact.
Every solution starts with a problem, and Australia as a country has its fair share. We are in the middle of a major drought, fish are dying in the Murray–Darling Basin, the Great Barrier Reef’s biodiversity is under threat, we are a nation highly dependent on coal to deliver its energy, and we face a range of health issues, just to name a few.
So, CSIRO’s purpose, as Australia’s science agency, has never been more important – to solve some of these great challenges using innovative science and technology. But solving problems is more than just writing a research paper or publishing an article in Nature. It’s about doing work that leads to products and tangible solutions.
How do we get to that point? According to Cathy, what we need are challenges that are clear, defined and deliverable. Taking the UN’s Sustainable Development Goals and the new Australian National Outlook into consideration, CSIRO has boiled down the problems into six challenges that we are helping the nation to overcome and turn to Australia’s unique advantage:
- Sustainable energy and resources
- A secure Australia and region (including defence, cyber security and biosecurity)
- Future industries to create jobs
- Resilient and valuable environments
- Food security and quality
- Health and well-being
“In order to deliver on any of these challenges, new future industries and engineered products and solutions are needed, that take the science, technology and knowledge we have and turn it into practical outcomes,” said Cathy in her keynote address.
And it all starts with data. According to The Economist, “data is the new oil” – the world’s most valuable resource. Nowadays we have an enormous amount of information and the supercomputer power to help us store it. So why is it that we still can’t find solutions to these critical issues?
Making scientific breakthroughs through data requires integrating massive datasets, deploying sensors, and building automation that feeds real-time decisions and actions.
For example, imagine if we could wire the whole Murray–Darling Basin and measure what is happening there. If we could create a digital model of it, then we’d be able to make decisions based on real data: should we flush it, should we implement water restrictions, or should we collect the fish? There is a whole range of possibilities if we have the data available.
A similar concept is being tackled by CSIRO Energy in partnership with the Australian Energy Market Operator. During a panel on Day 2 at D61+ LIVE, Energy Director Dr Tim Finnigan discussed how we are building a digital twin of the entire Australian energy system, including electricity, gas, batteries, hydro, wind farms and the financial settlement market that sits behind it. By doing this, we will be able to capture data from a system that contains millions of nodes, and measure and model the entire network more accurately. That will help us reduce overconsumption of energy and work out how best to develop the network to meet Australia’s needs in the future.
So, what do we need to do to make this data useful? Cathy lays out the process:
“We need to make sure that data is discoverable, that it has relationships, and that we can establish the stakeholders – who are the owners, domain specialists, the users and the consumers.
“We need to make sure that it’s optimised by categorising the datasets, so we can communicate the value into an infrastructure.
“We need to have data that maps into workflows, which can be labelled in order to establish a process to take that data and use it.
“We need to make sure the data is consumable, so we have foundations for delivery to non-domain consumers.
“We need to make sure that workflows have high impact, so data can be published, peer-reviewed, and workflows can be easily mapped.
“And lastly, we need to make sure that the data has provenance, so that we know where it comes from, who owns it, and it can be trusted.”
This information will change the way researchers do their future work. Instead of doing experiments in the laboratory, we’ll be doing them on a computer first. Then, once we’re in the lab, the process will be mostly automated rather than manual.
We’re going to be able to visualise data in a way which allows us to make the most of it.
We’ll have an ecosystem that fosters open innovation, shares data and cooperates around common goals and challenges – this is going to be vital for future science breakthroughs. A new form of tokenisation would allow us to trace the origin of a piece of IP or a contribution.
Open access to publications and papers would change the peer-review process and would set an expectation of repeatability and statistical significance. And finally, because data is open, scientific integrity will be at the core.
Once we have all that, artificial intelligence and machine learning will help us use these data platforms to learn. Feature extraction would allow us to spot different patterns in the information, which we can turn into models to predict the future.
“So future science is going to be more data-driven; it’s going to be more augmented in the way we interface with devices, systems and information; we’re going to collaborate and communicate with machines in ways which will allow us to be at our best; we’re going to have more complex tasks which are automated; we’re going to have an explosion of tools, and we’re also going to have quantum properties that allow us to do better modelling. And, finally, the whole scientific publication process is about to change.”
But we shouldn’t forget the human element – human collaboration and curiosity will be essential to our future success.
“Keep being curious,” Cathy urged the audience. “What is one topic or activity that you’re curious about today? Think about it. If we want to have alternate thinking and come up with creative solutions, we really must push ourselves.”
Taking this curiosity and research and turning it into solutions will require setting up an ecosystem that can provide the following: services that help us get through the valley of death; a continual innovation platform; a way for talent to move across the sector; research platforms; data platforms; and access to secure high-power computing.
“If we can do that, we’ll have a connected system with the scale we need to deliver on these big problems,” concluded Cathy. “An ecosystem that will lift the productivity and be able to make the most of the data and computing access, and I think with that, we’ll have an accelerated innovation.”