In episode 5 of “This Is How We Do It,” Peter Havens from Cortex product marketing and Isaac Krzywanowski, staff security engineer at Palo Alto Networks, discuss data pipelining, operational assurance and the importance of monitoring the health of data sources. Join their conversation as they dig into the complexity behind data ingestion and how it is managed in the Palo Alto Networks security operations center (SOC).
In an era where digital landscapes are under siege, data – where it’s from and how it’s used – emerges as the unsung hero of cybersecurity. Isaac is entrusted with the responsibility of data pipelining: "Data is the lifeblood of security and detection engineering." Within his purview lies the critical task of data ingestion and operational assurance, which are pivotal to arming the SOC with timely and relevant information to counter potential threats.
The process of data ingestion is no trivial matter. It includes the intricate orchestration of collecting data from multiple sources, ensuring its relevance to the SOC's needs, and then mapping it onto the Cortex Data Model (XDM). Isaac explains the significance of this data model, as it "standardizes the data, allowing analysts to search through multiple tables effortlessly, using a unified framework."
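To make the idea of a unified data model concrete, here is a minimal Python sketch of normalizing records from two differently shaped sources into one common schema. The source names, the vendor field names and the `NormalizedEvent` fields are all hypothetical and chosen for illustration; they are not actual XDM fields or the mapping Palo Alto Networks uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalizedEvent:
    """A tiny, hypothetical unified schema (not the real XDM)."""
    source: str
    timestamp: str
    user: str
    action: str


# Hypothetical mappings: unified field name -> vendor-specific field name.
FIELD_MAPS = {
    "firewall": {"timestamp": "ts", "user": "src_user", "action": "act"},
    "git_audit": {"timestamp": "created_at", "user": "actor", "action": "event"},
}


def normalize(source: str, raw: dict) -> NormalizedEvent:
    """Map one raw vendor record onto the unified schema."""
    mapping = FIELD_MAPS[source]
    fields = {
        unified: str(raw.get(vendor_key, ""))
        for unified, vendor_key in mapping.items()
    }
    return NormalizedEvent(source=source, **fields)


def search_by_user(events: list, user: str) -> list:
    """One query works across every source once records share a schema."""
    return [event for event in events if event.user == user]
```

Once every record has the same shape, a single lookup such as `search_by_user(events, "alice")` covers firewall and repository audit data alike, which is the benefit Isaac describes.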
The palette of data sources at Palo Alto Networks is expansive and diverse. It includes firewall data, endpoint detection and response (EDR) data from Cortex XDR, cloud platform data from Google, Amazon and Azure, and even unconventional sources, such as source code repositories (like GitHub and GitLab). Including source code repositories proves insightful because their audit logs let the SOC monitor user access and spot potential security breaches.
With an astounding influx of 50 to 75 terabytes of data each day, the complexity of data collection in the Palo Alto Networks SOC should not be underestimated. This torrent of data arrives in myriad formats from different vendors and partners, requiring careful consideration. XSIAM helps facilitate the collection of data regardless of where it's from. Isaac underscores the importance of selecting pivotal data sources, which comprise security data, identity-related information, access logs and the “crown jewels” of company-sensitive data that is vital to any organization:
“It’s really important because you never know when a data source might drop for whatever reason, and you need to know about that as fast as possible. That's critical because the entire SOC is waiting for that log, so operational assurance is really important. Making sure that data is coming in, and coming in within a reasonable amount of time, is super important. First of all, we have an out-of-the-box table that basically holds what we call ‘data freshness.’ This is really helpful in determining whether your flow of data is starting to lag behind, or whether it's dropped altogether.”
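The freshness idea in the quote can be illustrated with a short Python sketch that classifies each source as fresh, lagging or dropped based on when its last event arrived. This is an assumption-laden illustration: it does not query the out-of-the-box XSIAM table, and the function names and the 15-minute and 1-hour thresholds are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def classify_freshness(last_seen, now, lag_after, drop_after):
    """Return 'fresh', 'lagging' or 'dropped' for a single data source."""
    if last_seen is None:
        # No event has ever been recorded for this source.
        return "dropped"
    age = now - last_seen
    if age >= drop_after:
        return "dropped"
    if age >= lag_after:
        return "lagging"
    return "fresh"


def check_sources(
    last_seen_by_source,
    now,
    lag_after=timedelta(minutes=15),  # hypothetical threshold
    drop_after=timedelta(hours=1),    # hypothetical threshold
):
    """Classify every source given the timestamp of its latest event."""
    return {
        name: classify_freshness(last_seen, now, lag_after, drop_after)
        for name, last_seen in last_seen_by_source.items()
    }


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(check_sources({"firewall": now - timedelta(minutes=2)}, now))
```

In practice the thresholds would likely differ per source, since a busy firewall and a quiet audit log have very different normal gaps between events.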
Our XSIAM platform serves as the backbone of data management, streamlining the process of data collection through API integrations, XDR collectors and broker virtual machines (VMs). Through these mechanisms, XSIAM not only simplifies data gathering but also standardizes it. Isaac says that this makes data accessible and analyzable across the board.
A large part of the data ingestion process is the critical practice of operational assurance. It verifies that incoming data streams are reliable, that the system's technical features are not being bypassed or left vulnerable, and that required procedures are being followed. Operational assurance has been strengthened through real-time alerts. By integrating with the Slack communication platform, Isaac and his team receive instantaneous notifications about potential data ingestion issues. This real-time collaboration minimizes mean time to response (MTTR), fortifying the defense against potential data interruptions.
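As a rough illustration of the alerting step, the Python sketch below builds a notification and posts it to a Slack incoming webhook, which accepts a JSON body with a `text` field. The interview does not say how the team's Slack integration is built, so the webhook approach, the function names and the message wording are assumptions, and the webhook URL must be supplied by the reader.

```python
import json
import urllib.request


def build_alert(source: str, status: str, minutes_since_last_event: int) -> dict:
    """Build the JSON payload for a data ingestion alert (hypothetical wording)."""
    return {
        "text": (
            f":warning: Data source `{source}` is {status}: "
            f"last event received {minutes_since_last_event} min ago."
        )
    }


def send_alert(webhook_url: str, payload: dict) -> int:
    """POST the payload to a Slack incoming webhook and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Keeping message construction separate from delivery makes the alert text easy to test without sending anything to a real channel.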
In cybersecurity, data mastery emerges as a fundamental pillar of protection. Isaac Krzywanowski's insights provide a window into the nuances of data pipelining and operational assurance at Palo Alto Networks. With data as the bedrock of detection, response and AI-driven analysis, it's evident that the processes discussed above are not just technical undertakings, but essential safeguards against digital threats.
Watch their full interview on our Cortex YouTube channel.