Unleashing the Power of RF Data

Enabling Analytics with HawkEye 360's Commercial Satellite Constellation
July 24th, 2023
Kaitlin Zimmerman

Daily, millions of devices emit radio frequency (RF) energy across the electromagnetic spectrum for vital functions such as communication and navigation. HawkEye 360’s satellites can passively collect this energy and provide accurate geolocations, creating a unique look at the activity around the world.

Space-based RF geolocation makes it possible to monitor RF activity anywhere on the globe. Space-based RF collectors are not limited by weather and can survey the planet daily, making them a powerful platform for monitoring patterns of life around the world and identifying anomalies that might warrant further investigation. Historically, this capability has been reserved for military and intelligence use cases, putting it out of reach for countries and commercial companies without their own space-based RF geolocation capabilities. HawkEye 360 is bringing unprecedented access to RF data and analytics through our commercial satellite constellation, enabling users to leverage this unique dataset to address some of humanity’s most challenging problems, from global security to environmental concerns.

RF signals are detected and geolocated using the HawkEye 360 constellation of satellites. The above image represents the global emissions over one month.

HawkEye 360’s Satellite Constellation

HawkEye 360 was founded in 2015 to become the first provider of commercial space-based RF data and analytics. The company launched its initial cluster of three Pathfinder satellites in December 2018 and has since expanded to a robust constellation of 21 satellites. Each satellite in the constellation is equipped with sensors to detect a wide range of RF radar and communications signals across air, land, sea, and space, providing increased situational awareness of an array of activities, including border patrol, military activities, illegal fishing, illegal mining, and indicators of RF interference. Once the RF data is downlinked to Earth, HawkEye 360 uses sophisticated algorithms to geolocate the RF emitters detected in a collection and to provide important signal attributes about them. Over the last four years, HawkEye 360 has geolocated more than 200 million RF signals.

The vast amount of data being collected by HawkEye 360’s satellites daily has led to an increasing need for advanced analytics that make finding anomalous and potentially nefarious activity easier. A large portion of RF emissions detected from space belongs to vessel operators who are relying on their devices to provide accurate navigation and communication in remote areas, such as the ocean. Our customers are interested in identifying those devices that are engaged in illicit activity such as illegal fishing or illegal mining. To find these few devices among millions of geolocations, HawkEye 360 has been expanding its analytic capabilities through advanced algorithms that leverage big data analytics, artificial intelligence, and machine learning (AI/ML) to automatically sift through data points for outliers.
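To make the idea of automatically sifting data points for outliers concrete, here is a minimal sketch using a modified z-score based on the median absolute deviation (MAD). This is a generic statistical technique chosen for illustration, not a description of HawkEye 360's algorithms. The function name mad_outliers, the threshold of 3.5, and the revisit_hours feature are all assumptions made for this example.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds threshold."""
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        # At least half the values equal the median, so the score is undefined.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Hypothetical feature: hours between successive detections of one emitter.
revisit_hours = [24.1, 23.8, 24.3, 24.0, 23.9, 24.2, 71.5, 24.1]
print(mad_outliers(revisit_hours))  # index of the unusually long interval
```

A median-based score is used here because a single extreme value barely moves the median, whereas it can inflate a mean and standard deviation enough to hide itself.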

Enter the Analytics

Like many companies with abundant data and technical expertise, HawkEye 360's data science team began its effort to enhance our analytic capabilities with a journey to break down the barriers of data and knowledge silos. To truly embrace an analytics-first culture, where we are constantly evaluating new ways we can increase the value of our data through new analytic algorithms, we had to reimagine our cloud architecture. HawkEye 360 has developed a scalable data storage architecture that allows us to gather data from our constellation, combine it with external sources, and provide access to all our production, experimental, and legacy data processing results. This simplifies data discovery and speeds up algorithm development.

Data engineering has become as important to our business as the satellites themselves. Breaking down our data silos and creating an architecture that makes sharing results easy has allowed us to greatly increase our velocity in experimentation. Our data scientists and RF algorithm experts are now working closely with analysts to identify tedious aspects of RF data analysis and build first-of-their-kind analytics that simplify these workflows. This has allowed us to accelerate our software development lifecycle across the board, from building algorithms for geolocating new devices to training and deploying machine learning models.

Today, we are applying these analytics to test new capabilities that will soon be offered to our customers. These include the ability to uniquely identify an RF emitter based on its raw waveform data and the ability to use our unique satellite cluster formation to provide the speed and direction of geolocated emitters. Bringing it all together, our users will soon be able to receive an alert when a known nefarious emitter of interest is geolocated, receive data on not only where the emitter is located but also where it is most likely headed, and then associate that geolocation with imagery or AIS data sources. All automatically. Behind the scenes, this is all done through a combination of mathematical analytic techniques and AI/ML applied to our historical data living in the central data storage architecture. The result is that a workflow that may once have required hours to complete now completes in seconds.
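As a rough illustration of what "speed and direction" mean for a geolocated emitter, the sketch below estimates both from two timestamped position fixes using the haversine distance and the initial great-circle bearing. This is deliberately a simpler, swapped-in technique: the capability described above derives these values from the satellite cluster formation, and its actual method is not given in this article. The function name speed_and_course and the example coordinates are assumptions.

```python
import math
from datetime import datetime, timezone

EARTH_RADIUS_KM = 6371.0  # mean Earth radius
KM_PER_NAUTICAL_MILE = 1.852


def speed_and_course(lat1, lon1, t1, lat2, lon2, t2):
    """Return (speed in knots, initial bearing in degrees) between two fixes."""
    hours = (t2 - t1).total_seconds() / 3600.0
    if hours <= 0:
        raise ValueError("second fix must be later than the first")
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    # Haversine great-circle distance between the two fixes.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    dist_km = 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
    # Initial bearing, measured clockwise from true north.
    y = math.sin(dlmb) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb))
    bearing = (math.degrees(math.atan2(y, x)) + 360.0) % 360.0
    return dist_km / KM_PER_NAUTICAL_MILE / hours, bearing


# Hypothetical fixes: one degree of longitude along the equator in six hours.
t_first = datetime(2023, 7, 1, 0, 0, tzinfo=timezone.utc)
t_second = datetime(2023, 7, 1, 6, 0, tzinfo=timezone.utc)
speed_kn, course_deg = speed_and_course(0.0, 0.0, t_first, 0.0, 1.0, t_second)
print(f"{speed_kn:.1f} kn on course {course_deg:.0f} degrees")
```

A two-fix estimate like this needs repeated geolocations of the same emitter and inherits the position error of each fix, which is one reason a method that yields speed and direction from a single collection would be attractive.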

HawkEye 360's RF geolocation and maritime analytics allow organizations to pinpoint where vessels experience gaps in AIS reporting while continuing to collect important signal information to assess relevant activities during that dark period.
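The gap analysis described in the caption above can be sketched in a few lines. This is an illustrative example only, assuming a vessel's AIS report times and its RF detection times are already available as Python datetime objects. The function names, the six-hour threshold, and the sample timestamps are hypothetical and do not come from HawkEye 360's products.

```python
from datetime import datetime, timedelta


def find_ais_gaps(timestamps, max_gap=timedelta(hours=6)):
    """Return (start, end) pairs where consecutive AIS reports are more than max_gap apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]


def detections_in_gaps(gaps, rf_times):
    """Return RF detection times that fall strictly inside any AIS gap."""
    return [t for t in rf_times if any(start < t < end for start, end in gaps)]


# Hypothetical AIS report times for one vessel, with a twelve-hour silence.
ais_times = [
    datetime(2023, 7, 1, 0, 0),
    datetime(2023, 7, 1, 2, 0),
    datetime(2023, 7, 1, 14, 0),
    datetime(2023, 7, 1, 16, 0),
]
# Hypothetical RF detections attributed to the same vessel.
rf_times = [datetime(2023, 7, 1, 1, 0), datetime(2023, 7, 1, 8, 0)]

gaps = find_ais_gaps(ais_times)
print(gaps)
print(detections_in_gaps(gaps, rf_times))
```

In practice the hard part is attributing an RF detection to a specific vessel while its AIS is silent; the sketch assumes that association has already been made.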

Lessons Learned for Enabling Analytics

In the beginning, our data science team discovered that it was relatively easy to take an open-source AI/ML model and retrain it to build a prototype. However, the real challenge lay in transitioning that prototype into a fully functional AI/ML model for production. For every powerful AI/ML model out there, an entire ecosystem must be built around it to feed data to the model, process the results, generate the data products, alert on potential drift or model bias, monitor performance, and handle data provenance, auditing, logging, and much more. Getting a model to production required collaboration across a multitude of skill sets and disciplines beyond just data science. Below is a short list of some of the key lessons we had to learn that we hope will help others build the next generation of analytics for commercial space data.
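One piece of that ecosystem, alerting on potential drift, can be illustrated with the Population Stability Index (PSI), a common way to compare the distribution of a feature at training time with its distribution in production. This is a generic sketch, not HawkEye 360's monitoring stack. The function name, the bin count, and the small floor used for empty bins are assumptions.

```python
import math


def population_stability_index(reference, current, bins=10):
    """Compare two samples of one feature; larger values mean more drift."""
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread")
    width = (hi - lo) / bins

    def proportions(sample):
        counts = [0] * bins
        for v in sample:
            # Values outside the reference range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(sample), 1e-6) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


# Hypothetical feature values at training time and in production.
training = [float(v) for v in range(100)]
production = [v + 50.0 for v in training]
print(population_stability_index(training, training))
print(population_stability_index(training, production))
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift worth investigating, but any alerting threshold should be tuned to the feature and the model in question.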

Before you can do good data science, you need to have good data. Machine learning models have an annoying way of exposing all the biases or errors that may be hidden within your dataset. For any large dataset, with millions of data points, there will be some amount of noise. Plan to invest time upfront in validating the goodness of your data and be humble about what discoveries may lie ahead.

AI/ML and analytics at scale require a multidisciplinary team. Our scientists required support from the platform engineering teams, data engineering teams, and security teams to build a robust development environment for experimenting. These same teams also helped build the scaffolding that eventually enabled us to adopt industry best practices such as MLOps to enable rapid retraining and deployment of production models. You will need more software engineers than you probably think.

A robust data storage architecture will accelerate development. Data silos slowed down our software development lifecycle across the company. Designing an architecture that enabled easier access to all our data and analytics was fundamental to building models faster, accelerating our testing of code, and identifying operations issues quickly. This data storage architecture proved to be fundamental to our ground software designs.

Be open to new ideas and pivoting when it makes sense. At the beginning of a project, our scientists would take a best guess at what the solution might be and explore from there. More often than not, it took quite a few iterations of trying new ideas before we discovered the one that showed results.

Keep the analysts in the loop. When working with space-based RF data, analysts are some of the heaviest users and thus gave us the highest quality feedback. Who are your heaviest users? Get them involved early. Our analysts pointed out flaws in early results that we would never have noticed otherwise. Without their involvement, we would not have been able to validate many of our results.

HawkEye 360's AI analytics allow our analysts to identify and track "dark vessels" through their unique signal characteristics, drastically improving analyst workflow by eliminating the time-consuming, tedious tasks associated with RF datasets.

Conclusion

The journey of enabling analytics and harnessing the power of AI/ML models has taught us valuable lessons about the importance of building a comprehensive ecosystem around these technologies. AI/ML models have the potential to revolutionize various domains of analytics, but do not forget they need to rely on a robust infrastructure and supporting processes to function effectively and ethically.

An AI/ML model can only be as good as the data it is trained on, and therefore, a strong emphasis must be placed on data quality, diversity, and representativeness. Building a data pipeline that can continuously feed relevant and reliable data to the model is essential, as is having an accessible data storage architecture that minimizes or removes data silos.

Enabling analytics and maximizing the potential of AI/ML models necessitate the development of a comprehensive ecosystem. This ecosystem should encompass data pipelines, processing mechanisms, alert systems, performance monitoring, data provenance, auditing, and ethical considerations. By embracing these lessons learned, we can build robust and responsible AI/ML systems that drive meaningful insights, innovation, and positive impact across various domains.

Here at HawkEye 360, we are leveraging AI/ML to build increasingly advanced analytics to support our customers and their RF analysis. We are excited about how this field will continue to make space-based RF data easier to use and apply to a broader range of use cases in support of solving some of humanity’s most challenging problems.

Learn more about HawkEye 360 at he360.com

Kaitlin "Kate" Zimmerman is an accomplished data science expert currently serving as the Chief of Data Science and Analytics at HawkEye 360. In this role, she leads the development of HawkEye 360's data science strategy and manages a team of scientists responsible for building AI/ML models and enhancing analytics capabilities. With a focus on harnessing the power of Artificial Intelligence (AI), Zimmerman aims to provide customers with increased value from HawkEye 360's RF data. Before joining HawkEye 360, Zimmerman worked at Amazon Web Services (AWS). Her educational background includes a graduate degree from the United States Air Force Academy and the U.S. Air Force Institute of Technology. Zimmerman continues to serve her country as a USAF Reservist, concentrating on prototyping AI/ML payloads for the US Space Force.