The ocean is in constant flux. Tides and currents swoosh and swirl in mercurial cycles, cascading nutrients and dispersing life through the vast waters. This dynamism makes the ocean so fascinating, but it also makes it difficult to study. Now scientists are finding new ways to explore the ocean’s lifecycle by leveraging advances in storage and computing to process data out at sea. The approach has the potential to transform oceanographic science.
The secret lives of plankton
The Marcus G. Langseth is a 3,834-ton research vessel (R/V). It’s bigger than an NHL hockey rink, houses 12 laboratory spaces, and is outfitted with some of the most advanced scientific equipment in the world.
In July 2022, the ship embarked on an 11-day scientific journey to study the ocean’s trophic web. On board the Langseth were its 20-person crew, 20 scientists from Oregon State University, and a one-of-a-kind underwater imaging device that looks like a cross between a bobsled, a Bauhaus-style sideboard, and a diesel generator.
The In Situ Ichthyoplankton Imaging System (ISIIS)—the bobsled gizmo—was developed by an interdisciplinary team of scientists, engineers, and software developers. Since there’s no Amazon or Walmart for oceanographic tools, scientists often have to build their own.
ISIIS was invented to capture real-time images of marine zooplankton. Plankton are (mostly) tiny organisms that drift through the ocean, carried and dispersed by tides and currents. There are trillions of them in the ocean. So many that their nighttime venture to feed near the water’s surface is considered the largest animal migration on Earth; it can even be seen from space.
Plankton are essential for the well-being of all creatures on Earth. Smaller phytoplankton (plant plankton) produce vast amounts of oxygen, more than half of what we breathe. And together with zooplankton, the organisms the ISIIS primarily images, they form the base of the ocean’s complex food web.
Yet despite the plankton’s importance for the ocean’s ecosystem, how environmental drivers impact them is still poorly understood; something the ISIIS team is trying to change.
From nets to bytes
Among the scientists aboard the R/V Langseth as it left the whale-watching coastal town of Newport, Ore., were two of the brains behind ISIIS’s newest iteration: Bob Cowen, the associate vice president for research and operations and director at Oregon State University’s Hatfield Marine Science Center, and Moritz Schmid, a research associate in Cowen’s lab who most people address as “Mo.”
Schmid is an oceanographer and ecologist with a love for machine learning. He had spent months aboard Canadian Coast Guard icebreakers, exploring the fine-scale distribution of plankton in the Arctic seas. He is as familiar with advanced computing as he is with the rigors of sea exploration, where shifts start at 3 a.m., days are spent in grueling weather, rubber boots and bib pants, maintaining instruments, and hauling nets.
“People started researching plankton over a hundred years ago using nets,” said Schmid. “Like fishing nets, these plankton nets let you grab a sample that you then take to a microscope to count how many individuals of the different species are present.”
But this method comes with many limitations. For one, it yields a tiny sample with little spatial resolution and little information about its surroundings or origin. The process is also very time consuming, and even worse, it can physically harm larger, fragile plankton like jellyfish.
Underwater imaging systems, like the ISIIS, are increasingly used in plankton research as they allow scientists to study plankton in their habitat, with very high spatial resolution and no physical contact.
As the ISIIS gets towed behind a ship, its two state-of-the-art cameras image the water using shadowgraph imaging. In this technique, light is projected from a pinhole, and any object that passes through it casts a shadow. To create a very high-resolution image capable of capturing the shadows of microscopic plankton, ISIIS uses digital line scan cameras. Unlike ordinary cameras that capture an image in its entirety, these cameras scan the water line by line 36,000 times per second.
“Over a day’s recording, the ship crosses about 100 miles, so we actually end up with a 100-mile-long image,” explained Schmid. “It’s one gigantic picture but for handling and processing we store it as a 20 frame-per-second video file.”
The ISIIS’ cameras collect 10GB of video every minute. With 140 hours of high-resolution video collected on the Langseth, it’s easy to see why underwater imaging is a big data challenge.
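To put those figures in perspective, here is a back-of-the-envelope sketch of the total video volume. It assumes the quoted 10GB-per-minute rate held for every one of the 140 recorded hours and that a terabyte is counted as 1,000 gigabytes; the article states neither, so treat the result as a rough upper estimate rather than a measured total. The function name is made up for illustration.

```python
# Figures quoted in the article
GB_PER_MINUTE = 10     # video data rate of the ISIIS cameras
HOURS_RECORDED = 140   # hours of video collected on the Langseth


def total_video_gb(gb_per_minute, hours):
    """Total gigabytes recorded, assuming a constant data rate (an assumption)."""
    return gb_per_minute * 60 * hours


total_gb = total_video_gb(GB_PER_MINUTE, HOURS_RECORDED)
total_tb = total_gb / 1000  # decimal terabytes (assumed convention)

print(f"{total_gb:,} GB, or about {total_tb:.0f} TB")
```

Under those assumptions, the cruise would have produced on the order of tens of terabytes of raw video.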
Meet the IT guy
Christopher M. Sullivan is a computational scientist at the Center for Quantitative Life Sciences at Oregon State University. He has helped build and design the university’s multimillion-dollar, multi-petabyte computational research infrastructure. He’s a fast talker, abreast of any new technology, and is adamant about getting (or building) the best and most interoperable architecture to answer pressing scientific questions.
“We’re not just button pushers,” said Sullivan about his role in building IT infrastructure. “First and foremost, I’m a scientist, and the button-pushing we do is about enabling science.” Sullivan’s work also encompasses about 40 publications and 9,000 citations.
Sullivan has been involved with the plankton project for more than a decade. “This is one of the best projects,” he said. “It leverages all of the artificial intelligence technologies out there, pushes every technology to its limit, and, in the end, I get to help solve climate change.”
The big data challenge of ISIIS isn’t just about capturing terabytes upon terabytes of data at sea. Having data is great, but generating insights is better.
After returning to shore, the ISIIS video data goes through a machine learning pipeline. First, AI was taught to recognize plankton in the images of the water. Then, it was taught how to segment plankton according to biological classification using a deep neural network. And the last step was adding complex data about geospatial location and other parameters to be made available to researchers worldwide.
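The three stages described above can be sketched in a few lines of Python. This is a toy illustration, not the team's pipeline: the detection step is a simple brightness threshold with flood fill (chosen because shadowgraph images show dark shapes on a bright background), the classifier is a size-based placeholder standing in for the deep neural network, and every name, threshold, and coordinate below is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    bbox: tuple   # (row_min, col_min, row_max, col_max), inclusive
    label: str    # class assigned by the (placeholder) classifier
    lat: float    # position metadata attached in the last step
    lon: float


def find_objects(frame, threshold=128):
    """Step 1: find dark blobs (shadows) using a 4-connected flood fill."""
    rows, cols = len(frame), len(frame[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or frame[r][c] >= threshold:
                continue
            stack = [(r, c)]
            seen[r][c] = True
            r0, c0, r1, c1 = r, c, r, c
            while stack:
                y, x = stack.pop()
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and frame[ny][nx] < threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            boxes.append((r0, c0, r1, c1))
    return boxes


def classify(bbox):
    """Step 2: placeholder for the neural network; labels by bounding-box area."""
    r0, c0, r1, c1 = bbox
    area = (r1 - r0 + 1) * (c1 - c0 + 1)
    return "large_object" if area >= 4 else "small_object"


def process_frame(frame, lat, lon):
    """Step 3: attach position metadata to every classified detection."""
    return [Detection(b, classify(b), lat, lon) for b in find_objects(frame)]


# Tiny made-up grayscale frame: 255 is bright water, low values are shadows.
frame = [
    [255, 255, 255, 255, 255],
    [255,  10,  10, 255, 255],
    [255,  10,  10, 255,  20],
    [255, 255, 255, 255, 255],
]
detections = process_frame(frame, lat=44.6, lon=-124.1)  # illustrative coordinates
for d in detections:
    print(d)
```

In a real system each stage would be far heavier, but the shape is the same: locate objects, label them, then enrich each record so it can be shared and queried.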
“A week-long trip to sea could return over a billion images of individual plankton,” said Sullivan.
It’s an elegant and dazzling example of the power of AI. But crunching this much data is a very complex and compute-demanding operation. To get the needed results, the project leveraged the National Science Foundation’s XSEDE infrastructure (now ACCESS), which taps into the major national supercomputing centers such as the San Diego Supercomputer Center and the Pittsburgh Supercomputing Center.
While these supercomputers helped deliver faster insights, there was a constant gap between when the research took place and when scientists could explore its results. But Sullivan, Schmid, and Cowen had an itch to find a way to see this data in real time, out at sea. It wasn’t just about reducing the lag between research, analyses, and results; it was about doing science differently. “At-sea processing of complex data has the potential to transform oceanographic science,” said Schmid.
A million dollars’ worth of data
Sullivan, who is used to working with high performance computing, was faced with a new challenge. All the brawny gear and GPUs that could gorge on the AI pipeline and pop out images rely on a lot of electricity. Yet science ships don’t house supercomputers or offer megawatts. Processing data at the edge has very different requirements. “There was nobody out there who could really solve our problem meaningfully,” said Sullivan.
His statement held until 2021.
In 2021, Western Digital introduced the Ultrastar Edge-MR, an edge server purpose-built for extreme conditions. Sullivan immediately saw its potential. The server came with 60 terabytes of blazing fast NVMe flash storage that could ingest the 8K video files, alongside 40 CPU cores and a GPU for crunching AI tasks at the edge. But the cherry on top was that it was also a rugged device.
Designed for military-grade use, the edge server comes in a desert tan box that offers internal suspension to protect the server from shock, vibration, and electromagnetic interference. It is also certified water and dust resistant. Features necessary when leaving the pristine and predictable data center for untamed seas.
“This [ruggedization] is really important,” said Sullivan. “All our devices are ruggedized; every machine that I put out there has to be encased.”
For Sullivan, any new equipment must first prove itself under his painstaking microscope.
“You need to understand that with the costs of the boat, machines, equipment, and labor, we come out with about $1,000,000 to run that experiment for 10 days,” said Sullivan. “I needed to partner with a company who was serious about making sure that if I put that data onto that device I was bringing back my million dollars’ worth of data.”
Data scientist on board
After rigorous testing, Sullivan was convinced. It wasn’t just the ability to easily run a choice of enterprise-grade operating systems like Red Hat or Ubuntu, the server’s 100Gb Ethernet networking capabilities, or RAID flexibility. It was about how easy it was to configure it all.
“It really did appear that Western Digital had given that a lot of thought,” said Sullivan. “It shows that this is a fabulous piece of technology because I can put it in front of a computer scientist and quickly watch it come to life.”
One of the computer scientists who could turn on the server and make it do its magic was Dominic Daprano, an undergraduate GPU computational researcher supervised by Sullivan and Schmid.
Daprano had worked on the AI algorithm with Sullivan. By this time, the AI had learned to identify around 170 different types of plankton. “This is a very large mathematical algorithm with a very large convolutional neural net,” said Sullivan. “There are billions of plankton we’re trying to process per run.”
But since no one knew what would happen when they’d take the setup from the lab into the real world, Daprano was asked to join the research cruise, bringing a computer scientist into the mesh of oceanographic tasks.
On board the Langseth, the setup was complete. The ISIIS was connected by a multi-mile-long fiber optic cable to the computer controlling the oceanographic system, and a second fiber optic cable pushed the video data onto the Western Digital server.
On the first night out, at around 9 p.m., the crew lowered the ISIIS into the water via a crane, lit by the ship’s floodlamps. Everyone on the night shift huddled into the control room next to Daprano, eyes glued to the computer screens. As each image flew across the screen, a chorus of “oohs” and “aahs” broke out.
“It was amazing to see all this diversity,” said Schmid. “This system is unique because we travel fast and ingest a large volume of water (180 L s⁻¹), to allow us to sample organisms that are representatively rare. Look at this amazing mosaic of species!”
For Cowen and Schmid, knowing what is happening in the waters around the boat in near real time is a game-changer.
“In the vast ocean of currents, everything is dynamic,” Cowen said. “If you come back to the same area an hour later, things are going to be different. So, you want the best snapshot, and you want to be able to react. Adaptive sampling means we can recognize points of interest in real-time and then change the course of the ship in accordance. The more targeted your sampler can be, the more effective it is. This is a huge step forward.”
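One way to picture the adaptive sampling Cowen describes is a monitor that watches how many organisms are detected in each time interval and raises a flag when the count jumps well above the recent average. The sketch below is a hypothetical illustration, not the team's method: the class name, the window size, the threefold trigger, and the sample counts are all invented for the example.

```python
from collections import deque


class HotspotMonitor:
    """Flags an interval whose count far exceeds the recent rolling average."""

    def __init__(self, window=5, factor=3.0):
        self.history = deque(maxlen=window)  # counts from the most recent intervals
        self.factor = factor                 # how large a jump counts as a hotspot

    def update(self, count):
        """Record a new count; return True if it is a point of interest."""
        baseline = sum(self.history) / len(self.history) if self.history else None
        self.history.append(count)
        return baseline is not None and baseline > 0 and count >= self.factor * baseline


monitor = HotspotMonitor(window=3, factor=3.0)
counts = [10, 12, 11, 40, 13]  # made-up detections per interval
flags = [monitor.update(c) for c in counts]
print(flags)
```

In this made-up series only the fourth interval stands out against the preceding ones, which is the kind of signal that could prompt a change of course.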
As science pushes technology forward, technology becomes its enabler. An enabler that can open the door to exploring novel scientific questions. “It’s not about you or me. It’s about monitoring the oceans for climate change and understanding how our planet is operating right now,” said Sullivan.
“Technology has to be able to change the way we’re doing science.”
Funding for the ISIIS project and ship time is provided by the National Science Foundation (Award: OCE-2125407).
Photos of the Pacific Ocean, Moritz (Mo) S. Schmid, ISIIS controller, Ultrastar server, the R/V Langseth science systems control room computers, and team of scientists lowering the ISIIS, courtesy of Elizabeth Lafferty.
Photo of the R/V Langseth courtesy of the Office of Marine Operations, LDEO.
Photo of the ISIIS courtesy of Moritz S. Schmid.